Methods of Nonlinear Analysis: Applications to Differential Equations (Birkhauser Advanced Texts Basler Lehrbucher)


Author: Pavel Drabek | Jaroslav Milota


Birkhäuser Advanced Texts

Edited by
Herbert Amann, Zürich University
Steven G. Krantz, Washington University, St. Louis
Shrawan Kumar, University of North Carolina at Chapel Hill
Jan Nekovář, Université Pierre et Marie Curie, Paris

Pavel Drábek
Jaroslav Milota

Methods of Nonlinear Analysis
Applications to Differential Equations

Birkhäuser
Basel · Boston · Berlin

Authors:

Pavel Drábek
Department of Mathematics
Faculty of Applied Sciences
University of West Bohemia in Pilsen
Univerzitní 8
Plzeň
Czech Republic

Jaroslav Milota
Department of Mathematical Analysis
Faculty of Mathematics and Physics
Charles University in Prague
Sokolovská 83
Praha
Czech Republic

Bibliographic information published by Die Deutsche Bibliothek. Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet.

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained.

Birkhäuser Verlag AG, Basel · Boston · Berlin
Basel, Switzerland
Part of Springer Science+Business Media
Printed on acid-free paper produced of chlorine-free pulp (TCF)
Printed in Germany

www.birkhauser.ch

Dedicated to the memory of Svatopluk Fučík

Contents

Preface

1 Preliminaries
1.1 Elements of Linear Algebra
1.2 Normed Linear Spaces

2 Properties of Linear and Nonlinear Operators
2.1 Linear Operators
2.2 Compact Operators
2.3 Contraction Principle

3 Abstract Integral and Differential Calculus
3.1 Integration of Vector Functions
3.2 Differential Calculus in Normed Linear Spaces
3.2A Newton Method

4 Local Properties of Differentiable Mappings
4.1 Inverse Function Theorem
4.2 Implicit Function Theorem
4.3 Local Structure of Differentiable Maps, Bifurcations
4.3A Differentiable Manifolds, Tangent Spaces and Vector Fields
4.3B Differential Forms
4.3C Integration on Manifolds
4.3D Brouwer Degree

5 Topological and Monotonicity Methods
5.1 Brouwer and Schauder Fixed Point Theorems
5.1A Fixed Point Theorems for Noncompact Operators
5.2 Topological Degree
5.2A Global Bifurcation Theorem
5.2B Topological Degree for Generalized Monotone Operators
5.3 Theory of Monotone Operators
5.3A Browder and Leray–Lions Theorem
5.4 Supersolutions, Subsolutions, Monotone Iterations
5.4A Minorant Principle and Krein–Rutman Theorem
5.4B Supersolutions, Subsolutions and Topological Degree

6 Variational Methods
6.1 Local Extrema
6.2 Global Extrema
6.2A Ritz Method
6.2B Supersolutions, Subsolutions and Global Extrema
6.3 Relative Extrema and Lagrange Multipliers
6.3A Contractible Sets
6.3B Krasnoselski Potential Bifurcation Theorem
6.4 Mountain Pass Theorem
6.4A Pseudogradient Vector Fields in Banach Spaces
6.4B Lusternik–Schnirelmann Method
6.5 Saddle Point Theorem
6.5A Linking Theorem

7 Boundary Value Problems for Partial Differential Equations
7.1 Classical Solution, Functional Setting
7.2 Classical Solution, Applications
7.3 Weak Solutions, Functional Setting
7.4 Weak Solutions, Application of Fixed Point Theorems
7.5 Weak Solutions, Application of Degree Theory
7.5A Application of the Degree of Generalized Monotone Operators
7.6 Weak Solutions, Application of Theory of Monotone Operators
7.6A Application of Leray–Lions Theorem
7.7 Weak Solutions, Application of Variational Methods
7.7A Application of the Saddle Point Theorem

Summary of Methods
Typical Applications
Comparison of Bifurcation Results
List of Symbols
Index
Bibliography

Preface

Motto: Real world problems are in essence nonlinear. Hence methods of nonlinear analysis became important tools of modern mathematical modeling.

There are many books and monographs devoted to methods of nonlinear analysis and their applications. Typically, such a book is either dedicated to a particular topic and treats details which are difficult for a student to understand, or it deals with an application to complicated nonlinear partial differential equations in which a lot of technicalities are involved. In both cases it is very difficult for a student to get oriented in this kind of material and to pick up the ideas underlying the main tools for treating the problems in question.

The purpose of this book is to describe the basic methods of nonlinear analysis and to illustrate them on simple examples. Our aim is to motivate each method considered, to explain it in a general form but in the simplest possible abstract framework, and finally, to show its application (typically to boundary value problems for elementary ordinary or partial differential equations). To keep the text free of technical details and make it accessible also to beginners, we did not formulate some key assertions and illustrative examples in the most general form.

The exposition of the book is at two levels, visually differentiated by different font sizes. The basic material is contained in the body of the seven chapters. The more advanced material is contained in appendices to a number of sections and is presented in a smaller font size. The basic material is independent of the advanced material, is self-contained, and can be read by students new to the subject. It should prepare an undergraduate student in mathematics to read scientific papers in nonlinear analysis and to understand applications of the methods presented to more complex problems.

Each chapter contains a number of exercises that should provoke the reader's creativity and help develop his or her own style of approaching problems. However, the exercises play an additional role. They carry some of the technical material that was omitted in simplifying some of the basic proofs.
They are thus an organic part of the exposition for graduate students who already have experience with the methods of nonlinear analysis and are interested in generalizations.

We have organized the material in this book as follows. In Chapters 1–3, we introduce some necessary notions and basic assertions from linear algebra (Section 1.1) and linear functional analysis (Sections 1.2–2.2), and we also present some preliminaries concerning the Contraction Principle and differential and integral calculus in normed linear spaces (Sections 2.3–3.2). In this part of the text we give proofs of the results which are closely related to the nonlinear part of the book. On the other hand, several very important statements of linear functional analysis are left without proofs.

In Chapter 4, local properties of differentiable mappings are treated. In particular, it includes topics such as the Inverse Function Theorem, the Implicit Function Theorem together with the Rank Theorem and the notion of the differentiable manifold. Results such as the Lyapunov–Schmidt Reduction and the Morse Theorem are used to prove the Local Bifurcation Theorem of Crandall and Rabinowitz.

Chapter 5 is devoted to the topological and monotonicity methods of nonlinear analysis. We focus on the Brouwer and Schauder Fixed Point Theorems, the Sard Theorem and the analytic approach to the degree of a mapping, monotone operators and the method of monotone iterations based on the notions of super- and sub-solutions.

In Chapter 6, basic variational methods are presented. We start with local and global extrema and then continue with the method of Lagrange Multipliers with applications to eigenvalue problems (Courant–Fischer and Courant–Weinstein Principles), the Mountain Pass Theorem and the Saddle Point Theorem.

Abstract results from Chapters 4–6 are accompanied by examples of various boundary value problems for ordinary differential equations. Since these applications are spread over a large number of pages, we add a brief account of examples of boundary value problems for both ordinary and partial differential equations together with the methods used at the end of the book. The reader will also find there a short guide on the bifurcation results presented in the book.

Chapter 7 deals with several applications of the preceding methods to boundary value problems for elementary nonlinear partial differential equations. We present and discuss the notions of classical and weak solutions and try to minimize the technical difficulties connected with the formulation of the problems.

All this material represents a self-contained introduction to the methods of nonlinear analysis with simple applications to elementary differential equations.

More advanced material is presented in appendices which are attached to a number of sections. In particular, Appendix 3.2A explores an abstract Newton Method as an application of the Contraction Principle and differential calculus in Banach spaces. Appendices 4.3A to 4.3D are devoted to the analysis on manifolds (vector fields, differential forms and integration on manifolds). The main results presented in these appendices are an abstract version of the Stokes Theorem (and its applications) and the construction of the Brouwer degree by means of differential forms. Some fixed point theorems for noncompact operators are presented in Appendix 5.1A. As an application of the Leray–Schauder degree theory we consider global bifurcation theorems in Appendix 5.2A while Appendix 5.2B is devoted to the generalization of the Leray–Schauder degree to generalized monotone mappings. Appendix 5.3A deals with the generalization of the theory of monotone operators to a more general functional setting and to operators which are monotone only in the principal part. In Appendix 5.4A we give the proof of the famous Krein–Rutman Theorem which itself falls within the linear theory but plays an essential role in the study of nonlinear problems. Appendix 5.4B illustrates the connection between the method of supersolutions and subsolutions and the topological degree. The Ritz Method is presented in Appendix 6.2A as an application of an abstract variational principle. Appendix 6.2B illustrates the connection between the method of supersolutions and subsolutions and the existence of global extrema. Appendix 6.3A has an auxiliary character for establishing the "potential" bifurcation theorem in Appendix 6.3B. In Appendix 6.4A we generalize the so-called Deformation Lemma and present the Mountain Pass Theorem in a more general setting. The generalization of the Lagrange Multiplier Method is carried out in Appendix 6.4B. Appendix 6.5A is dedicated to the generalization of the Saddle Point Theorem. Appendices 7.5A, 7.6A and 7.7A are devoted to applications of the degree of generalized monotone mappings, the Leray–Lions Theorem and the Saddle Point Theorem, respectively, to boundary value problems for elementary partial differential equations.
This more advanced part contains several generalizations of the methods presented in the basic part and the beginner in the subject who is reading the book can skip it. On the other hand, there are (a few) places in appendices where we have to refer to the basic text and the notions we refer to are contained in the forthcoming chapters or sections. This, however, corresponds to our “philosophy of two levels” of the text and, in our opinion, does not impair the “smoothness” of reading. In order to make the text self-contained, we decided to comment on several notions and statements in footnotes. To place the material from the footnotes in the text could disturb a more advanced reader and make the exposition more complicated. In order to emphasize the role of the statements in our exposition we identify them as Theorem, Proposition, Lemma or Corollary. However, the reader should be aware of the fact that this by no means expresses the importance of the statement within the whole of mathematics. So, several times, we call important theorems Propositions, Lemmas or Corollaries. Although the book should primarily serve as a textbook for students on the graduate level, it can be a suitable source for scientists and engineers who have need of modern methods of nonlinear analysis.

At this point we would like to include a few words about our good friend, colleague and mentor Svatopluk Fučík to whom we dedicate this book. His work in the field of nonlinear analysis is well recognized and although he died in 1979 at the age of 34, he ranks among the most important and gifted Czech mathematicians of the 20th century.

We would like to thank Marie Benediktová and Jiří Benedikt for an excellent typesetting of this book in LaTeX 2ε, excellent figures and illustrations as well as for their valuable comments which improved the quality of the text. Our thanks belong also to Eva Fašangová, Gabriela Holubová, Eva Kaspříková and Petr Stehlík for their careful reading of the manuscript and useful comments which have decreased the number of our mistakes and made this text more readable. Our special thanks belong to Jiří Jarník for correction of our English, Ralph Chill and Herbert Leinfelder for their improvements of the text and methodological advice.

Both authors appreciate the support of the research projects of the Ministry of Education, Youth and Sports of the Czech Republic MSM 4977751301 and MSM 0021620839.

Plzeň–Praha, November 2006

Pavel Drábek
Jaroslav Milota

Chapter 1

Preliminaries

1.1 Elements of Linear Algebra

This section is rather brief since we suppose that the reader already has some knowledge of linear algebra. Therefore, it should be viewed mainly as a source of concepts and notation which will be frequently used in the sequel. There are plenty of textbooks on this subject. As we are interested in applications to analysis, we recommend the classical book by Halmos [64] for more detailed information.

A decisive part of analysis concerns the study of various sets of numbers such as $\mathbb{R}$, $\mathbb{C}$, $\mathbb{R}^M$, ..., sets of functions (continuous, integrable, differentiable), and mappings between them. These sets usually allow some algebraic operations, mainly addition and multiplication by scalars. We will denote the set of scalars by $\mathbb{K}$ and usually have in mind either the set of real numbers $\mathbb{R}$ or that of complex numbers $\mathbb{C}$.

Definition 1.1.1. A set $X$ on which two operations (addition and multiplication by scalars) are defined is called a linear space over a field $\mathbb{K}$ if the following conditions are fulfilled:
(1) $X$ with respect to the operation $x, y \in X \mapsto x + y \in X$ forms a commutative group with a unit element denoted by $o$ and the inverse element to $x \in X$ denoted by $-x$.
(2) The operation $a \in \mathbb{K}$, $x \in X \mapsto ax \in X$ satisfies
    (a) $a(bx) = (ab)x$, $a, b \in \mathbb{K}$, $x \in X$,
    (b) $1x = x$, $x \in X$, where $1$ is the multiplicative unit of the field $\mathbb{K}$.
(3) For the two operations the distributive laws hold, i.e., for $a, b \in \mathbb{K}$, $x, y \in X$, we have
    (a) $(a + b)x = ax + bx$,
    (b) $a(x + y) = ax + ay$.

If $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, then $X$ is called a real or complex linear space, respectively. If a subset $Y \subset X$ itself is a linear space with respect to the operations induced by those defined on $X$, then $Y$ is said to be a (linear) subspace of $X$.

In the rest of this section the character $X$ always denotes a linear space over $\mathbb{K}$. If $\mathbb{K}$ is not specified, then it always means that a definition or a statement holds for an arbitrary field $\mathbb{K}$. For $x_1, \dots, x_n \in X$, $a_1, \dots, a_n \in \mathbb{K}$ the sum $\sum_{i=1}^{n} a_i x_i$ is well defined and determines an element $x \in X$ which is called a linear combination of $x_1, \dots, x_n$ (with coefficients $a_1, \dots, a_n$). Notice that only finite linear combinations are defined since infinite sums cannot be defined without any topology on $X$.

If $A$ is a subset of $X$, then the set of all linear combinations of elements of $A$ is denoted by $\operatorname{Lin} A$ and is called the span of $A$. A span is always a subspace of $X$. We can ask whether $x \in \operatorname{Lin}\{x_1, \dots, x_n\}$ can be expressed in a unique way as a linear combination of $x_1, \dots, x_n$. This uniqueness holds if and only if $x_1, \dots, x_n$ are linearly independent, i.e., the condition
\[ \sum_{i=1}^{n} a_i x_i = o \iff a_1 = \dots = a_n = 0 \]
is satisfied. More generally, we have the following definition.

Definition 1.1.2. A set $A \subset X$ is said to be linearly independent if every finite subset of $A$ is linearly independent. A set $A \subset X$ is called a basis¹ of $X$ provided $A$ is linearly independent and $\operatorname{Lin} A = X$.

Theorem 1.1.3. Every linear space $X \neq \{o\}$ has a basis. If $A$, $B$ are two bases of $X$, then there is a bijective (i.e., injective and surjective) mapping from $A$ onto $B$.

We will give the proof of the existence part since it contains a very important method which is frequently used. To see the idea of the proof, notice that a basis is a linearly independent set which is maximal in the sense that, by adding an element, it will cease to be linearly independent. The question why such a maximal set has to exist concerns generally mathematical philosophy. There are several equivalent statements of set theory which guarantee this existence result. As the most useful we have found the following one.²

Theorem 1.1.4 (Zorn's Lemma). Let $(A, \prec)$ be an ordered set in which every chain has the lowest upper bound. Then for any $a \in A$ there is a maximal $m \in A$ such that $a \prec m$.³

¹ It is sometimes called a Hamel basis in order to emphasize the distinction from a Schauder or orthonormal basis in a Banach space or a Hilbert space, respectively (see Section 1.2).
² It can be viewed also as an axiom of set theory.
³ A binary relation $\prec$ on $A \times A$ is called an ordering if (1) $x \prec x$ for all $x \in A$, (2) if $x \prec y$ and $y \prec x$, then $x = y$, (3) if $x \prec y$ and $y \prec z$, then $x \prec z$. If $(A, \prec)$ is an ordered set, then $B \subset A$ is called a chain if for every $x, y \in B$ we have either $x \prec y$ or $y \prec x$. An element $b \in A$ is called the lowest upper bound of a subset $B \subset A$ ($b = \sup B$) if (1) $a \in B \implies a \prec b$; (2) if $a \prec c$ for all $a \in B$, then $b \prec c$. Similarly, we call $d \in A$ the greatest lower bound of a subset $B \subset A$ ($d = \inf B$) if (1) $a \in B \implies d \prec a$; (2) if $c \prec a$ for all $a \in B$, then $c \prec d$. An element $m \in A$ is called a maximal element of $A$ if $m \prec x$ for an $x \in A$ implies $m = x$.

We now return to the proof of Theorem 1.1.3.

Proof of Theorem 1.1.3. Let $\mathcal{A}$ be a collection of all linearly independent subsets of $X$ and define $A \prec B$ for $A, B \in \mathcal{A}$ if $A$ is a subset of $B$. Choose $A \in \mathcal{A}$ ($\mathcal{A} \neq \emptyset$ since $X \neq \{o\}$) and let $M$ be a maximal element of $\mathcal{A}$, $A \subset M$, whose existence is guaranteed by Zorn's Lemma (if $\mathcal{B}$ is a chain in $\mathcal{A}$, then $\sup \mathcal{B} = \bigcup_{B \in \mathcal{B}} B$). Then
\[ \operatorname{Lin} M = X. \]
Indeed, if there were an $x \in X \setminus \operatorname{Lin} M$, then $M \cup \{x\}$ would be linearly independent, which contradicts the maximality of $M$. The proof of the latter part of Theorem 1.1.3 is more involved (the construction of an injection of $A$ into $B$ is also based on the application of Zorn's Lemma) and it is omitted.

Definition 1.1.5. Let $X$ be a linear space and let $A$ be a basis of $X$. Then the cardinality of $A$ is called the dimension of $X$.

Example 1.1.6.
(i) Assume that $A$ is a basis of a linear space $X$. Then there is a set $\Gamma$ (the so-called index set) and a bijection $\gamma \in \Gamma \mapsto e_\gamma \in A$ onto $A$. We will also say that $\{e_\gamma\}_{\gamma \in \Gamma}$ is a basis of $X$. For any $x \in X$ there is a finite subset $K \subset \Gamma$ and scalars $\{\alpha_\gamma\}_{\gamma \in K}$ such that
\[ x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma. \]
These scalars are uniquely determined and will be called the coordinates of $x$ with respect to the basis $\{e_\gamma\}$.
(ii) The space $\mathbb{R}^M$ of real $M$-tuples with the usual operations is a real linear space and the elements
\[ e_k = (0, \dots, 0, 1, 0, \dots, 0), \quad k = 1, \dots, M \]
($1$ is at the $k$th position), form a basis of $\mathbb{R}^M$. It will be called the standard basis of $\mathbb{R}^M$. If $x = (x_1, \dots, x_M) \in \mathbb{R}^M$, then $x_1, \dots, x_M$ are the coordinates of $x$ with respect to the standard basis.

(iii) Similarly, $\mathbb{C}^M$ is the space of complex $M$-tuples and the set of elements $e_1, \dots, e_M$ defined as above is the standard basis of $\mathbb{C}^M$. More generally, if $X$ is a real linear space and $iX$ is defined by $iX := \{ix : x \in X\}$ where $i^2 = -1$, then $X_{\mathbb{C}} := X + iX$ ($= \{x + iy : x, y \in X\}$) is the complexification of $X$. The equality $x + iy = o$ holds if and only if $x = y = o$. The operations in $X_{\mathbb{C}}$ are defined as follows:
\[ (x_1 + iy_1) + (x_2 + iy_2) := (x_1 + x_2) + i(y_1 + y_2), \quad x_1, x_2, y_1, y_2 \in X, \]
\[ (a + ib)(x + iy) := (ax - by) + i(bx + ay), \quad a, b \in \mathbb{R},\ x, y \in X. \]
It is easy to verify that $X_{\mathbb{C}}$ is a complex linear space.
(iv) Let $P$ be the family of all polynomials of one variable with real or complex coefficients. Then $P$ is respectively a real or complex linear space and the polynomials $P_k(z) = z^k$, $k = 0, 1, \dots$, form a basis of $P$.
(v) The space $C[0, 1]$ of all real (complex) continuous functions on the interval $[0, 1]$ is a real (complex) linear space. According to Theorem 1.1.3, $C[0, 1]$ has a basis but it is uncountable (this is not so easy to prove). We will not distinguish among different infinite cardinals and refer to spaces like $P$ and $C[0, 1]$ as infinite dimensional spaces and use (incorrectly) the symbol $\dim = \infty$.
(vi) We can consider $\mathbb{R}$ as a linear space over the field $\mathbb{Q}$ of rational numbers. For example, the elements $1$ and $\sqrt{2}$ are linearly independent in this case. In this case a basis is uncountable, and serves as a tool for the constructions of "pathological" examples in analysis, like a noncontinuous (or, equivalently, non-measurable) solution $f$ of the functional equation
\[ f(x + y) = f(x) + f(y), \quad x, y \in \mathbb{R}. \]
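The complexification rules in part (iii) of the example above can be tried out numerically. The following is only a minimal sketch, assuming $X = \mathbb{R}^2$ with integer entries so that every comparison is exact; the names `c_add`, `c_scale` and `to_complex` are illustrative and not taken from the book. It stores $x + iy$ as the pair $(x, y)$ and checks that the two defining rules agree with ordinary complex arithmetic in $\mathbb{C}^2$.

```python
from typing import List, Tuple

Vec = List[int]            # an element of X = R^M (integer entries keep the check exact)
CVec = Tuple[Vec, Vec]     # the pair (x, y) stands for x + i*y in the complexification

def c_add(u: CVec, v: CVec) -> CVec:
    """(x1 + i*y1) + (x2 + i*y2) := (x1 + x2) + i*(y1 + y2)."""
    (x1, y1), (x2, y2) = u, v
    return ([p + q for p, q in zip(x1, x2)], [p + q for p, q in zip(y1, y2)])

def c_scale(a: int, b: int, u: CVec) -> CVec:
    """(a + i*b)(x + i*y) := (a*x - b*y) + i*(b*x + a*y)."""
    x, y = u
    return ([a * p - b * q for p, q in zip(x, y)],
            [b * p + a * q for p, q in zip(x, y)])

def to_complex(u: CVec) -> List[complex]:
    """Identify x + i*y with the corresponding vector in C^M."""
    x, y = u
    return [complex(p, q) for p, q in zip(x, y)]

u: CVec = ([1, 2], [3, -1])    # (1, 2) + i*(3, -1)
v: CVec = ([0, 5], [-2, 4])    # (0, 5) + i*(-2, 4)

# The pair-based rules should agree with componentwise complex arithmetic.
sum_ok = to_complex(c_add(u, v)) == [s + t for s, t in zip(to_complex(u), to_complex(v))]
scale_ok = to_complex(c_scale(2, 3, u)) == [complex(2, 3) * s for s in to_complex(u)]
print(sum_ok, scale_ok)
```

Storing the real and imaginary parts as a pair mirrors the definition $X_{\mathbb{C}} = X + iX$ directly, so the same two functions would work for any real space whose elements can be added and scaled componentwise.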

Remark 1.1.7. In the sequel we will use the symbol $\infty$ to warn the reader that the statement in question is true only in linear spaces of finite dimension.

Next we state a corollary of Theorem 1.1.3.

Corollary 1.1.8. Let $X$ be a linear space and let $Y$ be a subspace of $X$. Then there exists a subspace $Z$ of $X$ with the following properties:
(i) for every $x \in X$ there are unique $y \in Y$, $z \in Z$ such that $x = y + z$;
(ii) $Y \cap Z = \{o\}$.

Notation. $X = Y \oplus Z$ and $X$ is called the direct sum of $Y$, $Z$, and $Z$ a direct complement of $Y$ in $X$.

Proof. Let $A$ be a basis of $Y$ and
\[ \mathcal{A} = \{B \text{ linearly independent subset of } X : A \subset B\}. \]
By Zorn's Lemma, $\mathcal{A}$ has a maximal element $\tilde{A}$. Put $C = \tilde{A} \setminus A$ (the set complement). It is easy to see that $Z := \operatorname{Lin} C$ satisfies both (i) and (ii).

Notice that the elements $y \in Y$, $z \in Z$ are uniquely determined by $x$ in (i). If $\{o\} \neq Y$ and $Y \neq X$, then $Z$ is not uniquely determined. A simple example can be given in $\mathbb{R}^2$ and the reader is invited to provide one!

Definition 1.1.9. Let $X$ and $Y$ be linear spaces over the same field $\mathbb{K}$. A mapping $A : X \to Y$ is said to be a linear operator if it possesses the following properties:
(1) $A(x + y) = Ax + Ay$ for all $x, y \in X$;
(2) $A(\alpha x) = \alpha Ax$ for every $\alpha \in \mathbb{K}$, $x \in X$.
The collection of all linear operators from $X$ into $Y$ is denoted by $L(X, Y)$. We will use the simpler notation $L(X)$ if $X = Y$.

Remark 1.1.10.
(i) A linear operator $A \in L(X, Y)$ is uniquely determined by its values on the elements of a basis $\{e_\gamma\}_{\gamma \in \Gamma}$ of $X$. Indeed, let $f_\gamma := Ae_\gamma$, $\gamma \in \Gamma$, and
\[ x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma, \quad K \subset \Gamma \text{ finite}. \]
If $A$ is linear, then $Ax$ has to be equal to $\sum_{\gamma \in K} \alpha_\gamma f_\gamma$. On the other hand, if $\{f_\gamma\}_{\gamma \in \Gamma}$ are given, then
\[ Ax := \sum_{\gamma \in K} \alpha_\gamma f_\gamma \quad \text{for} \quad x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma \]
satisfies (1) and (2) from Definition 1.1.9.
(ii) Assume that both $X$ and $Y$ are finite dimensional spaces and $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$ are bases of $X$ and $Y$, respectively. If $A \in L(X, Y)$, then
\[ Ae_j = \sum_{i=1}^{N} a_{ij} f_i, \quad j = 1, \dots, M, \tag{1.1.1} \]
for some scalars $a_{ij}$. These scalars form a matrix
\[ \mathbf{A} = (a_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}} \]
($N$ rows and $M$ columns; the $j$th column consists of the coordinates of $Ae_j$). This matrix $\mathbf{A}$ is called the matrix representation of the linear operator $A$ with respect to the bases $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$. On the other hand, if $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$ are bases of $X$ and $Y$, respectively, and $\mathbf{A}$ is an $N \times M$ matrix, then the formula (1.1.1) determines a linear operator $A \in L(X, Y)$.
(iii) If $A, B \in L(X, Y)$ have matrix representations $\mathbf{A}$ and $\mathbf{B}$ (with respect to the same bases), then
\[ \mathbf{A} + \mathbf{B} := (a_{ij} + b_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}} \]
is the matrix representation of $A + B : x \mapsto Ax + Bx$. Similarly,
\[ \alpha \mathbf{A} := (\alpha a_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}} \]
is the matrix representation of $\alpha A : x \mapsto \alpha Ax$. It is obvious that $L(X, Y)$ is a linear space (over the same scalar field $\mathbb{K}$) under these definitions of $A + B$, $\alpha A$. This is true without any restrictions on the dimensions of $X$ and $Y$.
(iv) If $X$, $Y$, $Z$ are linear spaces over the same scalar field $\mathbb{K}$ and $A \in L(X, Y)$, $B \in L(Y, Z)$, then
\[ BA : x \mapsto B(Ax), \quad x \in X, \]
is a linear operator from $X$ into $Z$. Moreover, if $X$, $Y$, $Z$ are finite dimensional spaces and $\mathbf{A} = (a_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}}$, $\mathbf{B} = (b_{ki})_{\substack{k=1,\dots,P \\ i=1,\dots,N}}$ are matrix representations of $A$ and $B$, respectively, then
\[ \mathbf{B}\mathbf{A} := \Bigl( \sum_{i=1}^{N} b_{ki} a_{ij} \Bigr)_{\substack{k=1,\dots,P \\ j=1,\dots,M}} \]
is the matrix representation of $BA$. This product of operators is non-commutative in general, even in the case $X = Y = Z$.
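Parts (ii) and (iv) of the remark above can be illustrated with a small computation. This is a sketch under stated assumptions: the spaces are $\mathbb{R}^3$ and $\mathbb{R}^2$ with their standard bases, and the operators `op_a` and `op_b` are hypothetical examples chosen only for illustration. The matrix of an operator is assembled column by column from the images of the basis vectors, and the matrix of the composition is compared with the matrix product.

```python
from typing import Callable, List

Vector = List[int]
Matrix = List[List[int]]   # a matrix is stored as a list of rows

def standard_basis(m: int) -> List[Vector]:
    """The vectors e_1, ..., e_m of R^m."""
    return [[1 if i == k else 0 for i in range(m)] for k in range(m)]

def matrix_of(op: Callable[[Vector], Vector], m: int) -> Matrix:
    """Matrix of `op` in the standard bases: column j holds the coordinates of op(e_j)."""
    columns = [op(e) for e in standard_basis(m)]
    n = len(columns[0])
    return [[columns[j][i] for j in range(m)] for i in range(n)]

def matmul(b: Matrix, a: Matrix) -> Matrix:
    """Entry (k, j) of the product BA is sum_i b[k][i] * a[i][j]."""
    return [[sum(b[k][i] * a[i][j] for i in range(len(a))) for j in range(len(a[0]))]
            for k in range(len(b))]

# Hypothetical linear operators A : R^3 -> R^2 and B : R^2 -> R^2.
def op_a(x: Vector) -> Vector:
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]

def op_b(y: Vector) -> Vector:
    return [y[0] - y[1], 4 * y[1]]

mat_a = matrix_of(op_a, 3)                         # 2 x 3 matrix of A
mat_b = matrix_of(op_b, 2)                         # 2 x 2 matrix of B
mat_ba = matrix_of(lambda x: op_b(op_a(x)), 3)     # matrix of the composition BA

print(mat_a)
print(mat_ba == matmul(mat_b, mat_a))
```

Reading the matrix off the images of the basis vectors is exactly formula (1.1.1); the final comparison is the finite dimensional statement of part (iv) for this one pair of operators, not a proof of it.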

For $A \in L(X, Y)$ we denote by
\[ \operatorname{Ker} A := \{x \in X : Ax = o\} \]
the kernel of $A$, and by
\[ \operatorname{Im} A := \{Ax : x \in X\} \]
the image of $A$. Evidently, $\operatorname{Ker} A$ and $\operatorname{Im} A$ are linear subspaces of $X$ and $Y$, respectively.

Definition 1.1.11. A linear operator $A \in L(X, Y)$ is said to be
(1) injective if $\operatorname{Ker} A = \{o\}$,
(2) surjective if $\operatorname{Im} A = Y$,
(3) an isomorphism if $A$ is both injective and surjective.

Remark 1.1.12.
(i) If $A \in L(X, Y)$ is injective and $e_1, \dots, e_n$ are linearly independent elements of $X$, then $Ae_1, \dots, Ae_n$ are linearly independent elements of $Y$. Further, $A \in L(X, Y)$ is an isomorphism if and only if $\{Ae_\gamma\}_{\gamma \in \Gamma}$ is a basis of $Y$ whenever $\{e_\gamma\}_{\gamma \in \Gamma}$ is a basis of $X$. In other words: linear spaces $X$, $Y$ (over the same scalar field $\mathbb{K}$) have the same dimension if and only if there is an isomorphism $A \in L(X, Y)$.
(ii) Assume that $A \in L(X, Y)$ is an isomorphism and put $A^{-1}y = x$ where $y = Ax$. Then $A^{-1} \in L(Y, X)$ and
\[ AA^{-1} = I_Y, \quad A^{-1}A = I_X \]
where $I_X$ and $I_Y$ denote the identity maps on $X$ and $Y$, respectively. $A^{-1}$ is called the inverse of $A$. If $X = Y$ and $\mathbf{A}$ is a matrix representation of $A$, then $A^{-1}$ has the inverse matrix $\mathbf{A}^{-1}$ as the representation in the same bases.
(iii) (Transformation of coordinates in a finite dimensional space) Let $E = \{e_1, \dots, e_M\}$ and $F = \{f_1, \dots, f_M\}$ be two bases of a linear space $X$. There are two questions:
    (a) What is the relation between the coordinates of a given $x \in X$ with respect to these bases?
    (b) Let $A \in L(X)$ have matrix representations $\mathbf{A}_E$ and $\mathbf{A}_F$ with respect to these bases. What is the relation between $\mathbf{A}_E$ and $\mathbf{A}_F$?
The answer to the first question is easy: Put
\[ Te_j = f_j, \quad j = 1, \dots, M, \]
and extend $T$ to a linear operator on $X$. Then $T$ is an isomorphism. Denote by $\mathbf{T} = (t_{ij})_{i,j=1,\dots,M}$ its matrix representation with respect to the basis $E$, i.e.,
\[ Te_j = \sum_{i=1}^{M} t_{ij} e_i, \quad j = 1, \dots, M. \]
For $x = \sum_{j=1}^{M} \eta_j f_j$ we have
\[ x = \sum_{j=1}^{M} \eta_j \sum_{i=1}^{M} t_{ij} e_i = \sum_{i=1}^{M} \Bigl( \sum_{j=1}^{M} t_{ij} \eta_j \Bigr) e_i. \]
This means that the column vector $\xi = (\xi_1, \dots, \xi_M)^{\mathsf{T}}$ of the coordinates of $x$ in the basis $E$ is given by
\[ \xi = \mathbf{T}\eta \quad \text{where} \quad \xi_i = \sum_{j=1}^{M} t_{ij} \eta_j. \]
The second question can be answered by the same method but a certain caution in computation is desirable. Write
\[ Af_j = A \Bigl( \sum_{k=1}^{M} t_{kj} e_k \Bigr) = \sum_{k,i=1}^{M} t_{kj} a^{(E)}_{ik} e_i = \sum_{k=1}^{M} a^{(F)}_{kj} Te_k = \sum_{k,i=1}^{M} a^{(F)}_{kj} t_{ik} e_i. \]
This equality can be expressed in matrix notation as $\mathbf{A}_E \mathbf{T} = \mathbf{T} \mathbf{A}_F$. Since the matrix $\mathbf{T}$ has an inverse, we get
\[ \mathbf{A}_F = \mathbf{T}^{-1} \mathbf{A}_E \mathbf{T}. \tag{1.1.2} \]
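The change of coordinates and formula (1.1.2) can be checked on a small instance. The sketch below assumes $X = \mathbb{R}^2$, takes $E$ to be the standard basis and picks a hypothetical second basis $f_1 = e_1$, $f_2 = e_1 + e_2$; the matrices `T`, `T_inv` and `A_E` are illustrative choices, not data from the book. The inverse is supplied by hand and verified rather than computed.

```python
from typing import List

Matrix = List[List[int]]   # list of rows

def matmul(p: Matrix, q: Matrix) -> Matrix:
    """Ordinary matrix product of p and q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def matvec(p: Matrix, v: List[int]) -> List[int]:
    """Product of the matrix p with the column vector v."""
    return [sum(p[i][k] * v[k] for k in range(len(v))) for i in range(len(p))]

# Column j of T holds the coordinates of f_j = T e_j in the basis E.
T: Matrix = [[1, 1],
             [0, 1]]          # f_1 = e_1, f_2 = e_1 + e_2 (hypothetical choice)
T_inv: Matrix = [[1, -1],
                 [0, 1]]      # inverse supplied by hand and checked below
A_E: Matrix = [[2, 0],
               [1, 3]]        # hypothetical matrix of an operator A in the basis E

identity_ok = matmul(T, T_inv) == [[1, 0], [0, 1]]

# Formula (1.1.2): A_F = T^{-1} A_E T.
A_F = matmul(matmul(T_inv, A_E), T)

# Coordinates: if x has coordinates eta in F, its coordinates in E are xi = T eta.
eta = [2, 5]                  # x = 2 f_1 + 5 f_2
xi = matvec(T, eta)           # x = xi_1 e_1 + xi_2 e_2

print(identity_ok, A_F, xi)
print(matmul(A_E, T) == matmul(T, A_F))
```

The last line tests the intermediate identity $\mathbf{A}_E \mathbf{T} = \mathbf{T} \mathbf{A}_F$ from the derivation, which avoids relying on the inverse a second time.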

Example 1.1.13.
(i) Let $X = Y \oplus Z$. Define
\[ Px = y \quad \text{where} \quad x = y + z, \quad y \in Y, \quad z \in Z. \]
Then $P$ is the so-called projection of $X$ onto $Y$ and has the following properties:
    (a) $P^2 := PP = P$,
    (b) $\operatorname{Ker} P = Z$.
It is easy to see that the properties (a), (b) determine uniquely the projection $P$ and hence also the decomposition $X = Y \oplus Z$ ($Y = \operatorname{Im} P$).
(ii) Let $Y$ be a subspace of $X$. For $x \in X$ put
\[ [x] := x + Y = \{x + y : y \in Y\}. \]
If $x, y \in X$, then either $[x] = [y]$ ($\Leftrightarrow x - y \in Y$) or $[x] \cap [y] = \emptyset$. Define
\[ [x] + [y] := [x + y], \quad \alpha[x] := [\alpha x] \quad \text{for } x, y \in X,\ \alpha \in \mathbb{K}. \]
These operations are well defined and endow the set $X_{|Y} := \{[x] : x \in X\}$ with the structure of a linear space. The space $X_{|Y}$ is called a factor space or simply a $Y$-factor. Put
\[ \kappa : x \mapsto [x], \quad x \in X. \]
Then $\kappa$ (the so-called canonical embedding of $X$ onto $X_{|Y}$) is a linear, surjective operator from $X$ onto $X_{|Y}$, and $\operatorname{Ker} \kappa = Y$. If $x = y + z$ where $y \in Y$, $z \in Z$ and $X = Y \oplus Z$, then the mapping $j : [x] \mapsto z$ is an isomorphism of $X_{|Y}$ onto $Z$. In particular, $X_{|Y}$ and $Z$ have the same dimension. The dimension of $X_{|Y}$ is sometimes called the codimension of $Y$ ($\operatorname{codim} Y$) and
\[ \dim X = \dim Y + \operatorname{codim} Y. \tag{1.1.3} \]
Warning. If $X$ is an infinite dimensional space, then the sum on the right-hand side is the sum of infinite cardinal numbers!

Proposition 1.1.14. Let $A \in L(X, Y)$ and let $\kappa$ be the canonical embedding of $X$ onto $X_{|\operatorname{Ker} A}$. If $\hat{A}[x] := Ax$, then $\hat{A}$ is injective and the diagram in Figure 1.1.1 is commutative, i.e., $A = \hat{A}\kappa$.

Figure 1.1.1. Commutative diagram with $\kappa : X \to X_{|\operatorname{Ker} A}$, $\hat{A} : X_{|\operatorname{Ker} A} \to Y$ and $A : X \to Y$.

Proof. The assertion is obvious but do not forget to prove that $\hat{A}$ is well defined.

Corollary 1.1.15. Let $A \in L(X, Y)$. Then
\[ \dim X = \dim \operatorname{Ker} A + \dim \operatorname{Im} A. \tag{1.1.4} \]
In particular, if $X = Y$ and $\dim X < \infty$, then $A \in L(X, Y)$ is injective if and only if it is surjective.

Proof. We have
\[ \operatorname{codim} \operatorname{Ker} A = \dim X_{|\operatorname{Ker} A} = \dim \operatorname{Im} \hat{A} = \dim \operatorname{Im} A \]
since $\hat{A}$ is an isomorphism of $X_{|\operatorname{Ker} A}$ onto its image. Equality (1.1.4) follows immediately from (1.1.3). If $A$ is injective, then $\dim X = \dim \operatorname{Im} A$, and this implies (only in the case of $X$ and $Y$ having the same finite dimension) that $Y = \operatorname{Im} A$. If $\operatorname{Im} A = Y$, then (finite dimensions!) $\dim \operatorname{Ker} A = 0$, i.e., $A$ is injective.

Example 1.1.16. Let X be the space of bounded (real) sequences l^∞(N) and define the right-shift S_R : x = (x_1, . . . ) ↦ (0, x_1, x_2, . . . ) and the left-shift S_L : x = (x_1, . . . ) ↦ (x_2, x_3, . . . ). Then S_R is injective but not surjective and S_L is surjective but not injective. Moreover, S_L S_R x = x for every x ∈ X. What is S_R S_L?
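The two shifts of Example 1.1.16 are easy to experiment with. The following is a minimal sketch, assuming that a sequence is encoded as a Python function n ↦ x_n on the indices n = 1, 2, . . . ; the names shift_right, shift_left and head are hypothetical helpers introduced only for this illustration.

```python
# A sequence (x_1, x_2, ...) is modelled as a function n -> x_n for n = 1, 2, ...
# (an encoding assumed for this illustration only).

def shift_right(x):
    """S_R: (x_1, x_2, ...) -> (0, x_1, x_2, ...)."""
    return lambda n: 0 if n == 1 else x(n - 1)

def shift_left(x):
    """S_L: (x_1, x_2, ...) -> (x_2, x_3, ...)."""
    return lambda n: x(n + 1)

def head(x, count=6):
    """First `count` terms of the sequence, for inspection."""
    return [x(n) for n in range(1, count + 1)]

x = lambda n: 1 / n                              # the bounded sequence 1, 1/2, 1/3, ...

left_after_right = shift_left(shift_right(x))    # S_L S_R x
right_after_left = shift_right(shift_left(x))    # S_R S_L x

print(head(x))
print(head(left_after_right))
print(head(right_after_left))
```

Comparing the three printed lists shows that S_L S_R reproduces x, and it suggests what S_R S_L does to the first term; only finitely many terms are inspected, so this is an experiment and not a proof.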

The following special case of linear operators plays an important role both in the theory of linear spaces and in applications.

Definition 1.1.17. Let X be a linear space over a field. A linear operator from X into its scalar field is called a linear form. The linear space of all linear forms on X is called the (algebraic) dual space of X and is denoted by X^#.

Example 1.1.18. (i) Let {e_1, . . . , e_M} be a basis of X, i.e., for every x there is a unique M-tuple of scalars (ξ_1, . . . , ξ_M) (coordinates of x) such that x = Σ_{i=1}^{M} ξ_i e_i. The mapping

\[ e^i : x \mapsto \xi_i \]

is a linear form (the ith coordinate form). It is straightforward to show that e^1, . . . , e^M are linearly independent and

\[ \operatorname{Lin}\{e^1, \dots, e^M\} = X^{\#}, \]

i.e., {e^1, . . . , e^M} is a basis of X^# (the so-called dual basis of X^#, dual to {e_1, . . . , e_M}).


(ii) If f ∈ X^# \ {o}, then codim Ker f = 1. To see this choose x_0 ∈ X such that f(x_0) = 1. Then x = (x − f(x)x_0) + f(x)x_0 ∈ Ker f ⊕ Lin{x_0}. On the other hand, if Y is a subspace of X of codimension 1,⁴ then

\[ X = Y \oplus \operatorname{Lin}\{x_0\} \qquad \text{for an } x_0 \in X. \]

For x = y + αx_0, y ∈ Y, we put f(x) = α. Then

\[ f \in X^{\#} \setminus \{o\} \qquad \text{and} \qquad \operatorname{Ker} f = Y. \]

Moreover, if f, g ∈ X^# are such that Ker f = Ker g, then there is a scalar α for which f = αg. This fact has the following generalization, which will be used in Section 6.3, more precisely in the proof of Theorem 6.3.2.

Proposition 1.1.19. Let f_1, . . . , f_n, g be linear forms on X. Then

\[ g \in \operatorname{Lin}\{f_1, \dots, f_n\} \qquad \text{if and only if} \qquad \bigcap_{i=1}^{n} \operatorname{Ker} f_i \subset \operatorname{Ker} g. \]

Proof. The “only if” part is obvious. For the “if” part notice that the assertion g ∈ Lin{f_1, . . . , f_n} can be interpreted as the existence of a linear form λ ∈ (𝕂^n)^# (𝕂^n denotes the space of n-tuples of scalars) such that

\[ g = \lambda \circ F \qquad \text{where} \qquad F(x) := (f_1(x), \dots, f_n(x)). \tag{1.1.5} \]

Let 𝕂^n = Im F ⊕ Y (Corollary 1.1.8). If α = β + γ, β = F(x), γ ∈ Y, then the mapping λ(α) = g(x) is a well-defined linear form (by assumption). This means that (1.1.5) holds.

Definition 1.1.20. Let A ∈ L(X, Y) and g ∈ Y^#. Then the linear form f(x) := g(Ax) is denoted by f = A^# g and A^# is called the adjoint operator to A.

Remark 1.1.21. (i) A^# ∈ L(Y^#, X^#). (ii) If A has a matrix representation A = (a_{ij})_{i=1,...,N; j=1,...,M} with respect to bases E = {x_1, . . . , x_M} in X and F = {y_1, . . . , y_N} in Y, then the adjoint operator A^# has the representation A^# = (a_{ji})_{j=1,...,M; i=1,...,N} (i.e., A^# is the transpose of A) with respect to the dual bases.

⁴ Such a subspace is often called a hyperplane.


Warning. We will encounter different adjoint operators in the next section and the adjoint A^∗ with respect to a scalar product will have a different representation in a complex space!

Now we turn our attention to a system of linear equations

\[ \sum_{j=1}^{M} a_{ij} x_j = b_i, \qquad i = 1, \dots, N. \tag{1.1.6} \]

This system can be written in a more “compact” form, namely as

\[ Ax = b \tag{1.1.7} \]

where A is a matrix representation of the linear operator A from X into Y. By choosing fixed bases E = {e_1, . . . , e_M} in X and F = {f_1, . . . , f_N} in Y (also Y = R^N or C^N), A is defined by its matrix representation A = (a_{ij})_{i=1,...,N; j=1,...,M} with respect to these bases. In order to formulate results on solvability of (1.1.6) (or, equivalently, of (1.1.7)) the following notation will be useful.

Notation. If U is a subset of X (not necessarily a subspace of X), then

\[ U^{\perp} = \{ f \in X^{\#} : x \in U \Rightarrow f(x) = 0 \}. \]

Similarly,

\[ W_{\perp} = \{ x \in X : f \in W \Rightarrow f(x) = 0 \} \qquad \text{for } W \subset X^{\#}. \]

Proposition 1.1.22. (i) (U^⊥)_⊥ = Lin U for every U ⊂ X. (ii) If dim X < ∞, then (W_⊥)^⊥ = Lin W for all W ⊂ X^#.

Proof. We include the proof because it contains a construction which should be compared with an analogous one in Section 2.1 (see Proposition 2.1.27 and its proof). (i) We can assume U to be a subspace of X since U^⊥ = (Lin U)^⊥. The inclusion U ⊂ (U^⊥)_⊥ is obvious. To prove the reverse let us suppose by contradiction that there is an element x_0 ∈ (U^⊥)_⊥ \ U. By the method of proof of Theorem 1.1.3, a subspace Y of X can be found such that

\[ X = \operatorname{Lin}\{x_0\} \oplus Y \qquad \text{and} \qquad U \subset Y. \]

According to Example 1.1.18(ii) there exists f ∈ X^# with Ker f = Y. In particular, f ∈ U^⊥ and f(x_0) ≠ 0, which contradicts the choice of x_0. (ii) This part follows from (i) by replacing X by X^#. To repeat the proof we need that (X^#)^# could be identified with X. We note that this is possible because dim X < ∞.


The main idea – separation of a point from a subspace by a linear form (i.e., by a hyperplane) – can be substantially generalized.

Definition 1.1.23. A subset C of a (real or complex) linear space X is called convex if for every x, y ∈ C, t ∈ [0, 1], the point tx + (1 − t)y belongs to C.

Proposition 1.1.24. Let X be a real linear space, ∅ ≠ C a convex subset of X with a nonempty algebraic interior

\[ C^{0} := \{ a \in C : \forall y \in X \ \exists t_0 > 0 \text{ such that } a + ty \in C \text{ for all } t \in [0, t_0) \}. \]

Let x_0 ∈ X \ C. Then there is f ∈ X^# such that

\[ f(x) \le f(x_0) \qquad \text{for all } x \in C. \]

Proof. It needs a special tool for the treatment of convex sets and a considerably more sophisticated extension procedure,⁵ and, therefore, it is omitted. See, e.g., Rockafellar [109, § 11] where the interested reader can find also applications to convex optimization, and also Corollary 2.1.18.

Theorem 1.1.25. For A ∈ L(X, Y) we have
(i) Im A = (Ker A^#)_⊥,
(ii) Im A^# = (Ker A)^⊥.
(iii) If, moreover, dim X = dim Y < ∞, then

\[ \dim \operatorname{Ker} A = \dim \operatorname{Ker} A^{\#}. \tag{1.1.8} \]

Proof. (i) It is straightforward to prove both the inclusions which lead to the equality (Im A)^⊥ = Ker A^#. The result follows then from Proposition 1.1.22(i). (ii) Let Y = Im A ⊕ Z (Corollary 1.1.8). For f ∈ (Ker A)^⊥ and y = Ax + z, z ∈ Z, put g(y) = f(x). This definition does not depend on a concrete choice of x since f ∈ (Ker A)^⊥. This proves that f = A^# g and hence the inclusion (Ker A)^⊥ ⊂ Im A^# holds. The reverse inclusion is trivial. (iii) Observe first that (X|U)^# is isomorphic to U^⊥ for any subspace U of X, namely, Φ(F)(x) := F([x]), F ∈ (X|U)^#, is the desired isomorphism. If dim X < ∞, then X|U is isomorphic to (X|U)^# (both spaces have the same dimension) and, therefore, X|U is isomorphic to U^⊥. Now, we apply this observation to U = Ker A. We recall that Im A is isomorphic to X|Ker A (Proposition 1.1.14) and therefore to (Ker A)^⊥. By (ii), Im A is isomorphic to Im A^#. The equality (1.1.8) follows from Corollary 1.1.15.

⁵ See Corollary 2.1.18 for a similar process.


Remark 1.1.26. (i) Note that Theorem 1.1.25(i) is an existence result for the equation (1.1.6) (or (1.1.7)) because it can be reformulated as follows: The equation (1.1.6) has a solution for b = (b_1, . . . , b_N) if and only if

\[ \sum_{i=1}^{N} b_i f_i = 0 \]

for all solutions f = (f_1, . . . , f_N) of the adjoint homogeneous equation (with the coefficients a_{ij} of (1.1.6), i.e., the system given by the transposed matrix)

\[ \sum_{i=1}^{N} a_{ij} f_i = 0, \qquad j = 1, \dots, M. \]

In particular, we have also the alternative result: Either the equation (1.1.6) has a solution for all right-hand sides or⁶ the adjoint homogeneous equation has a nontrivial solution. Theorem 1.1.25(ii) can be reformulated similarly.

(ii) If A is a matrix representation of A ∈ L(X, Y) (X and Y are finite dimensional spaces), then dim Im A is equal to the number of linearly independent columns of A and is called the rank of A. If X = Y, then A is a square matrix of the type M × M (M = dim X), and it is called a regular matrix provided M = rank A. Equivalently, A is a regular matrix if and only if its determinant det A does not vanish. By the proof of Theorem 1.1.25(iii), dim Im A = dim Im A^#. In particular, this means that the rank of A is equal to the rank of its transpose. The reader is asked to find more matrix formulations of the previous results.

We often do calculations with a matrix representation instead of the operator itself. Since there are plenty of representations of the same operator it would be convenient to work with the simplest possible form. To examine this problem we start with some notions.

Definition 1.1.27. Let X be a complex linear space and A ∈ L(X). A complex number λ is called an eigenvalue of A if there is x ≠ o such that Ax = λx. Such an element x is called an eigenvector of A (associated with the eigenvalue λ). The set of all eigenvalues of A is called the spectrum of A and is denoted by σ(A).

⁶ The conjunction “or” has exclusive character here. This alternative result is sometimes called a Fredholm alternative since I. Fredholm proved such a result for linear integral equations. See also Section 2.2.
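The solvability criterion of Remark 1.1.26(i) can be tried out on a small singular system. The sketch below assumes exact rational arithmetic and decides solvability by the standard rank test rank A = rank [A | b], which is a swapped-in check and not the argument used in the text; the helpers rank, solvable and pairing, as well as the sample matrix, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0                                    # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def solvable(a, b):
    """Ax = b has a solution iff rank A == rank [A | b]."""
    augmented = [row + [bi] for row, bi in zip(a, b)]
    return rank(a) == rank(augmented)

def pairing(b, f):
    """The sum of b_i * f_i appearing in the solvability condition."""
    return sum(bi * fi for bi, fi in zip(b, f))

A = [[1, 2],
     [2, 4]]          # singular: the second row is twice the first
f = [2, -1]           # spans the solutions of the adjoint homogeneous system

for b in ([1, 2], [1, 0]):
    print(b, "pairing with f:", pairing(b, f), "solvable:", solvable(A, b))
```

For this matrix the adjoint homogeneous system has the one-dimensional solution space spanned by f, so checking the single pairing with f is enough; in general one pairing per basis vector of that solution space is needed.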


Warning. In infinite dimensions the spectrum of a linear operator can contain also other points than the eigenvalues and is defined in a different way (see page 56)!

Remark 1.1.28. It is obvious that the following statements are equivalent in a finite dimensional complex space X:

\[ \lambda \in \sigma(A) \iff \operatorname{Ker}(\lambda I - A) \neq \{o\} \iff \det(\lambda I - A) = 0 \iff \operatorname{rank}(\lambda I - A) < \dim X \]

where, in the determinant and in the rank, A stands for a matrix representation of the operator A. Since P(z) := det(zI − A) is a polynomial (the so-called characteristic polynomial) of degree M = dim X, the problem of finding σ(A) is equivalent to solving an algebraic equation (the so-called characteristic equation of A)

\[ P(z) = 0. \tag{1.1.9} \]

According to the Fundamental Theorem of Algebra (see Theorem 4.3.111) there exists at least one solution of (1.1.9) in C. The reason for considering complex spaces here is the fact that (1.1.9) need not have a real solution. It is an easy consequence of the Fundamental Theorem of Algebra that the polynomial P can be written in the form

\[ P(\lambda) = (\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_k)^{m_k} \tag{1.1.10} \]

where σ(A) = {λ_1, . . . , λ_k} (λ_1, . . . , λ_k are different) and m_1 + · · · + m_k = dim X. The positive integer m_i is called the multiplicity of the eigenvalue λ_i.

Definition 1.1.29. Let A ∈ L(X).
(1) A subspace Y ⊂ X is said to be A-invariant if A(Y) ⊂ Y.
(2) An A-invariant subspace Y ⊂ X is said to reduce A if there is a decomposition X = Y ⊕ Z where Z is also A-invariant.

From now on till the end of this section we consider exclusively finite dimensional spaces.

Example 1.1.30. (i) Let X = Y ⊕ Z where both Y and Z are A-invariant. If {e_1, . . . , e_m} is a basis of Y and {e_{m+1}, . . . , e_M} is a basis of Z, then the matrix representation A of A with respect to {e_1, . . . , e_M} has a block form

\[ A = \begin{pmatrix} A_Y & O \\ O & A_Z \end{pmatrix} \]

where A_Y and A_Z are representations of restrictions of A to Y and Z, respectively.


(ii) Assume that there is a basis {e_1, . . . , e_M} of X consisting of eigenvectors of A ∈ L(X) and Ae_i = λ_i e_i, i = 1, . . . , M (λ_1, . . . , λ_M are not necessarily distinct). Then the matrix representation of A with respect to this basis is the diagonal matrix

\[ \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_M \end{pmatrix}. \]

(iii) The matrix

\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]

is a representation of a linear operator A ∈ L(C²) which has no one-dimensional reducing subspace. Hence A has no diagonal representation.
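The claim in Example 1.1.30(iii) can be checked by a short direct computation. The following worked lines are a sketch of one possible verification, written with the standard basis e_1, e_2 of C²; they are not taken from the text.

```latex
\[
  \det(zI - A) = \det\begin{pmatrix} z - 1 & -1 \\ 0 & z - 1 \end{pmatrix} = (z - 1)^2,
  \qquad\text{so}\qquad \sigma(A) = \{1\}.
\]
\[
  (I - A)\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
  = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
  = \begin{pmatrix} -x_2 \\ 0 \end{pmatrix},
  \qquad\text{so}\qquad \operatorname{Ker}(I - A) = \operatorname{Lin}\{e_1\}.
\]
% A one-dimensional A-invariant subspace is spanned by an eigenvector, hence
% Lin{e_1} is the only one. A decomposition C^2 = Y (+) Z into A-invariant lines
% would require two different such subspaces, which do not exist.
```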

Because of the last example we have to improve our previous idea: Choose λ ∈ σ(A) and denote

\[ N_k := \operatorname{Ker}(\lambda I - A)^k. \]

It is obvious that N_k ⊂ N_{k+1} and they cannot be all distinct. If N_k = N_{k+1}, then N_i = N_k for all i > k. Denote by n(λ) the least such k and set

\[ N(\lambda) := \bigcup_{j=1}^{n(\lambda)} N_j = N_{n(\lambda)}, \qquad R(\lambda) := \operatorname{Im}(\lambda I - A)^{n(\lambda)}. \]
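The chain N_1 ⊂ N_2 ⊂ · · · and the index n(λ) can be computed for a concrete matrix. The sketch below assumes exact rational arithmetic and uses dim N_k = M − rank (λI − A)^k, which is Corollary 1.1.15 applied to (λI − A)^k; the helper names and the sample matrix are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def kernel_dimensions(a, lam):
    """[dim N_1, ..., dim N_n] where n is the first index with N_n = N_{n+1}.

    Because the N_k are nested, equal dimensions mean equal subspaces.
    The eigenvalue lam is assumed to belong to the spectrum of a.
    """
    size = len(a)
    shifted = [[(lam if i == j else 0) - a[i][j] for j in range(size)]
               for i in range(size)]                  # lam*I - A
    dims, power = [], shifted
    while True:
        dims.append(size - rank(power))
        if len(dims) >= 2 and dims[-1] == dims[-2]:
            return dims[:-1]
        power = matmul(power, shifted)

A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]

for lam in (2, 3):
    dims = kernel_dimensions(A, lam)
    print("lambda =", lam, "dims of N_k:", dims, "n(lambda) =", len(dims))
```

For λ = 2 the chain grows once before it stabilises, so n(2) = 2 and dim N(2) = 2, while for λ = 3 it is stationary from the start.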


Lemma 1.1.31. Let A ∈ L(X) and λ ∈ σ(A).
(i) Both N(λ) and R(λ) are A-invariant subspaces and the decomposition

\[ X = N(\lambda) \oplus R(\lambda) \tag{1.1.11} \]

holds.
(ii) Denote by A|_N and A|_R the restrictions of A to N(λ) and R(λ), respectively. Then σ(A|_N) = {λ}, σ(A|_R) = σ(A) \ {λ}. Moreover, the dimension of N(λ) is equal to the multiplicity of the eigenvalue λ.
(iii) If σ(A) = {λ_1, . . . , λ_k}, then

\[ X = N(\lambda_1) \oplus \cdots \oplus N(\lambda_k). \tag{1.1.12} \]

Proof. (i) Since R(λ) ∩ N(λ) = {o} (by the definition of n(λ)) and dim X = dim N(λ) + dim R(λ) (Corollary 1.1.15), we deduce the decomposition (1.1.11). If y = (λI − A)^{n(λ)} x ∈ R(λ), then

\[ Ay = -(\lambda I - A)y + \lambda y = -(\lambda I - A)^{n(\lambda)}(\lambda I - A)x + \lambda y \in R(\lambda). \]

The A-invariance of N(λ) is also clear.

(ii) Obviously, σ(A|_N) ⊂ σ(A). Let µ ∈ σ(A) \ {λ} and let x be a corresponding eigenvector. By (1.1.11) we have x = y + z where y ∈ N(λ), z ∈ R(λ). Further,

\[ o = (\mu I - A)x = (\lambda I - A)y + (\mu - \lambda)y + (\mu I - A)z. \]

By virtue of the uniqueness of the decomposition we have (λI − A)y = (λ − µ)y. This implies that

\[ o = (\lambda I - A)^{n(\lambda)} y = (\lambda - \mu)(\lambda I - A)^{n(\lambda)-1} y, \qquad \text{i.e.,} \quad y \in \operatorname{Ker}(\lambda I - A)^{n(\lambda)-1}. \]

By repeating this procedure we get y ∈ Ker (λI − A) and, therefore, (λ − µ)y = o, i.e.,

\[ y = o \qquad \text{and} \qquad x = z \in R(\lambda). \]

This shows that µ ∉ σ(A|_N) and µ ∈ σ(A|_R). Since N(λ) ∩ R(λ) = {o} the eigenvalue λ does not belong to σ(A|_R). The matrix representation A of A with respect to the basis formed by joining the bases of N(λ) and R(λ) has the block form

\[ A = \begin{pmatrix} A_N & O \\ O & A_R \end{pmatrix}. \]

It follows that det(zI − A) = det(zI − A_N) det(zI − A_R) and hence the characteristic polynomial of A_N is P_N(z) = (z − λ)^{m(λ)} where m(λ) is the multiplicity of the eigenvalue λ of A. Therefore dim N(λ) = m(λ).

(iii) This follows by induction with respect to the eigenvalues of A.

For a polynomial P(z) = a_n z^n + · · · + a_1 z + a_0 and A ∈ L(X) we put P(A) = a_n A^n + · · · + a_1 A + a_0 I.

Corollary 1.1.32 (Hamilton–Cayley). Let A ∈ L(X) and let P be the characteristic polynomial of A. Then P(A) = O.


Proof. Assume that P has the form (1.1.10) and x = x_1 + · · · + x_k is the decomposition given by (1.1.12). Since m_k ≥ n(λ_k), we have (A − λ_k I)^{m_k} x_k = o and hence

\[ (A - \lambda_k I)^{m_k} x = \sum_{j=1}^{k-1} (A - \lambda_k I)^{m_k} x_j + o. \]

The result follows by induction.
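For 2 × 2 matrices the Hamilton–Cayley identity can be checked by hand or by a few lines of code. The sketch below assumes integer entries and uses the explicit form P(z) = z² − (tr A) z + det A of det(zI − A) in dimension two; the function names and the sample matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def char_poly_at_matrix(a):
    """P(A) = A^2 - (tr A) A + (det A) I for a 2 x 2 matrix A."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    square = matmul(a, a)
    identity = [[1, 0], [0, 1]]
    return [[square[i][j] - trace * a[i][j] + det * identity[i][j]
             for j in range(2)] for i in range(2)]

samples = [
    [[1, 1], [0, 1]],     # the non-diagonalizable matrix of Example 1.1.30(iii)
    [[0, -1], [1, 0]],    # a rotation, no real eigenvalues
    [[2, -3], [4, 5]],
]

for a in samples:
    print(a, "->", char_poly_at_matrix(a))
```

Each sample yields the zero matrix, including the rotation whose eigenvalues are not real; the identity itself involves only the entries of the matrix.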

It remains to compute the representation of the restriction of λ_i I − A to N(λ_i). Notice that this restriction is nilpotent.⁷

Lemma 1.1.33. Let B ∈ L(X) be a nilpotent operator of order n. Then for any x ∈ X \ Ker B^{n−1} the elements x, Bx, . . . , B^{n−1}x are linearly independent and the subspace Y = Lin{x, Bx, . . . , B^{n−1}x} reduces B. The restriction B|_Y of B to Y has the representation

\[ \begin{pmatrix} 0 & 1 & \cdots & 0 \\ 0 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix} \]

with respect to the basis {B^{n−1}x, . . . , x}. There exists a B-invariant direct complement of Y and the restriction of B to such a complement is nilpotent of order ≤ n.

Proof. It is easy to see the linear independence of the elements x, . . . , B^{n−1}x. Indeed, if

\[ \sum_{j=0}^{n-1} \alpha_j B^j x = o, \]

then, by applying B^{n−1}, we get

\[ \alpha_0 B^{n-1} x = o, \qquad \text{i.e.,} \quad \alpha_0 = 0. \]

Repetition shows that α_j = 0 for all j = 0, 1, . . . , n − 1. The form of representation of B|_Y is obvious. The existence of an invariant direct complement of Y can be proved by induction with respect to the order of nilpotency. We omit details and refer to, e.g., Halmos [64, § 57].

We are now ready to summarize all information to obtain the following fundamental result.


Theorem 1.1.34 (Jordan Canonical Form). Let X be a complex linear space of finite dimension and let A ∈ L(X). Assume that σ(A) = {λ_1, . . . , λ_k}. Then there exists a basis F of X in which A has the canonical block representation

\[ A^{F} = \begin{pmatrix} A_1^{(1)} & & & & O \\ & \ddots & & & \\ & & A_1^{(l_1)} & & \\ & & & \ddots & \\ O & & & & A_k^{(l_k)} \end{pmatrix} \]

where the block matrices (the so-called Jordan cells) have the form

\[ A_j^{(i)} = \underbrace{\begin{pmatrix} \lambda_j & 1 & 0 & \cdots & 0 \\ 0 & \lambda_j & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ \vdots & & & \lambda_j & 1 \\ 0 & \cdots & \cdots & 0 & \lambda_j \end{pmatrix}}_{l_j \text{ columns}}, \qquad i = 1, \dots, l_j, \quad j = 1, \dots, k. \tag{1.1.13} \]

Remark 1.1.35. (i) We can also interpret Theorem 1.1.34 as follows. Let A^E be the representation of A with respect to the basis E. By Remark 1.1.12(iii), there is a regular transformation matrix T such that (1.1.2) holds. The canonical matrix A^F may be viewed as a representation of a B ∈ L(X) with respect to the basis E. Denote by T a linear operator represented in the basis E by the matrix T. Then one has

\[ B = T^{-1} A T. \tag{1.1.14} \]

(ii) Assume that A ∈ L(X) where X is a real linear space. The problem in the application of Theorem 1.1.34 lies in the fact that the spectrum σ(A) ∩ R is not sufficient to guarantee the decomposition (1.1.12). This obstacle can be overcome by the complexification X_C of X. Namely, A is extendable to X_C by the formula A_C(x + iy) = Ax + iAy. If λ = α + iβ, β ≠ 0, is an eigenvalue of A_C with an eigenvector u + iv, then u and v are linearly independent in X and the complex conjugate λ̄ is also an eigenvalue of A_C and u − iv is the corresponding eigenvector. Moreover, both λ and λ̄ have the same multiplicity. Rearranging the A_C-canonical basis by joining its parts which correspond to λ and λ̄ we obtain a basis of the real space X in which the representation of A has blocks of the form

\[ \begin{pmatrix}
\alpha & \beta & 1 & 0 & \cdots & 0 & 0 \\
-\beta & \alpha & 0 & 1 & \cdots & 0 & 0 \\
0 & 0 & \alpha & \beta & \ddots & \vdots & \vdots \\
0 & 0 & -\beta & \alpha & \ddots & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & 0 & 1 \\
0 & 0 & 0 & 0 & \cdots & \alpha & \beta \\
0 & 0 & 0 & 0 & \cdots & -\beta & \alpha
\end{pmatrix}. \]

We omit simple computations which confirm these statements and leave them to the reader.

⁷ An operator B ∈ L(X) is said to be nilpotent if there is an n ∈ N such that B^n = O. The least such integer n is called the order of nilpotency.

The simple canonical form is convenient for solving a system of linear differential equations with real constant coefficients. Such a system can be written in the form

\[ \dot{x} := \frac{dx}{dt} = Ax, \qquad A \in L(X). \tag{1.1.15} \]

If X = R^M and A = (a_{ij}) is the representation of A with respect to the standard basis e_1, . . . , e_M, then (1.1.15) is an abstract formulation of the system

\[ \dot{x}_i(t) = \sum_{j=1}^{M} a_{ij} x_j(t), \quad i = 1, \dots, M, \qquad \text{where } x(t) = \sum_{i=1}^{M} x_i(t) e_i. \tag{1.1.16} \]

In order to find a solution, it is convenient to transform (1.1.16) into a canonical form. If T ∈ L(X) is invertible, then x = Ty is a solution of (1.1.15) if and only if y solves the equation

\[ \dot{y} = By, \qquad \text{where } By := T^{-1} A T y. \]

Theorem 1.1.34 says that T can be chosen in such a way that the representation of B with respect to the standard basis is the Jordan Canonical Form of A. Having this form it is easy to solve (1.1.16) (see Exercise 1.1.41).

Qualitative properties of solutions of (1.1.15) are often more interesting than an involved formula for solutions. Therefore it would be convenient to generalize the exponential function solving ẋ = ax in R to L(X). Similarly to the one-dimensional case we put

\[ e^{tA} x := \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n x \]

provided the series is convergent in L(X). We postpone the question of convergence of this series (see Exercise 2.1.34) and give instead an equivalent definition of a function f(A) for A ∈ L(X) without any use of infinite series.


First we will define f(B) for B ∈ L(C^M) which has a representation in the form

\[ B = \begin{pmatrix} \lambda & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}. \tag{1.1.17} \]

Assume that f is a polynomial P : z ↦ a_0 z^n + · · · + a_n. Obviously, we define P(B) = a_0 B^n + · · · + a_n I. It will be convenient to rewrite P(B) in a form which is more adequate for generalization. Since

\[ P(z) = \sum_{j=0}^{n} \frac{P^{(j)}(\lambda)}{j!} (z - \lambda)^j, \]

we can write

\[ P(z) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (z - \lambda)^j + (z - \lambda)^M R(z) \]

where R is a polynomial, possibly equal to 0. Since z ↦ (z − λ)^M is the characteristic polynomial of B, we have (B − λI)^M = O (by Corollary 1.1.32). This means that

\[ P(B) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (B - \lambda I)^j. \tag{1.1.18} \]

This shows that we may define

\[ f(B) := \sum_{j=0}^{M-1} \frac{f^{(j)}(\lambda)}{j!} (B - \lambda I)^j \tag{1.1.19} \]

for a function f holomorphic on a neighborhood (depending on f) of σ(B) = {λ}.⁸ We denote by H(σ(B)) the collection of such functions. It is easy to check that the formula (fg)(B) = f(B)g(B) = g(B)f(B) holds for f, g ∈ H(σ(B)). In particular, for w ∈ C \ {λ} and r_w(z) = (w − z)^{−1} we get

\[ r_w(B) = (wI - B)^{-1} = \sum_{j=0}^{M-1} \frac{(B - \lambda I)^j}{(w - \lambda)^{j+1}}. \tag{1.1.20} \]

⁸ A weaker assumption on f would be also sufficient but we do not try to obtain an unduly general definition. See also Lemma 1.1.37 below.
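Formula (1.1.19) can be compared numerically with the power series for the exponential. The sketch below assumes f(z) = e^{tz}, for which f^{(j)}(λ) = t^j e^{tλ}, a real eigenvalue and floating-point arithmetic; the series is simply truncated after a fixed number of terms, and all function names and parameter values are hypothetical choices for illustration.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def jordan_block(lam, size):
    """Matrix of the form (1.1.17): lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else (1.0 if j == i + 1 else 0.0)
             for j in range(size)] for i in range(size)]

def exp_by_formula(lam, size, t):
    """e^{tB} via (1.1.19) with f(z) = exp(t z): sum of t^j e^{t lam}/j! (B - lam I)^j."""
    shift = jordan_block(0.0, size)              # B - lam*I
    result = [[0.0] * size for _ in range(size)]
    power = identity(size)                       # (B - lam*I)^j, starting with j = 0
    for j in range(size):
        coeff = t ** j * math.exp(t * lam) / math.factorial(j)
        result = [[r + coeff * p for r, p in zip(res_row, pow_row)]
                  for res_row, pow_row in zip(result, power)]
        power = matmul(power, shift)
    return result

def exp_by_series(b, t, terms=40):
    """Truncated series: sum over n < terms of t^n B^n / n!."""
    size = len(b)
    result = [[0.0] * size for _ in range(size)]
    term = identity(size)                        # t^n B^n / n!, starting with n = 0
    for n in range(terms):
        result = [[r + v for r, v in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
        term = [[t * v / (n + 1) for v in row] for row in matmul(term, b)]
    return result

lam, size, t = -0.5, 3, 2.0
by_formula = exp_by_formula(lam, size, t)
by_series = exp_by_series(jordan_block(lam, size), t)
max_gap = max(abs(p - q)
              for row_p, row_q in zip(by_formula, by_series)
              for p, q in zip(row_p, row_q))
print("largest entrywise difference:", max_gap)
```

The finite sum has only M terms because B − λI is nilpotent, whereas the series needs many terms for comparable accuracy; agreement of the two up to rounding is consistent with the definition, but it does not settle the convergence question that the text postpones.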


Remark 1.1.36. The following assertion yields another equivalent definition of f(B) which can be used also in a general Banach space for a linear continuous operator B : X → X (see Section 1.2 for the notions of the Banach space and the continuous linear operator). Also Theorem 1.1.38 holds in this more general setting (Dunford Functional Calculus, see Proposition 3.1.14 or Dunford & Schwartz [44]).

Lemma 1.1.37. Let γ be a positively oriented Jordan curve, σ(B) ⊂ int γ, and let f be a holomorphic function on a neighborhood of int γ. Then

\[ f(B)x = \frac{1}{2\pi i} \int_{\gamma} f(w)(wI - B)^{-1} x \, dw, \qquad x \in X.^{9} \]

Proof. By (1.1.20) we have

\[ \frac{1}{2\pi i} \int_{\gamma} f(w)(wI - B)^{-1} x \, dw = \sum_{j=0}^{M-1} \left[ \frac{1}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - \lambda)^{j+1}} \, dw \right] (B - \lambda I)^j x. \]

The result follows now from the Cauchy Integral Formula.¹⁰

Let A ∈ L(C^M) have the canonical form (1.1.17), i.e., A = TBT^{−1}. Then we define f(A) by (1.1.19) replacing B by A. Notice that f(A) = Tf(B)T^{−1}. We can proceed in the same way for a general A ∈ L(X) using the decomposition (1.1.12). This leads to the following theorem.

Theorem 1.1.38 (Functional Calculus). Let X be a complex linear space and let A ∈ L(X). Then there exists a unique linear operator Φ : H(σ(A)) → L(X) with the following properties:
(i) Φ(fg) = Φ(f)Φ(g) = Φ(g)Φ(f) for f, g ∈ H(σ(A));
(ii) if P(z) = Σ_{j=0}^{n} a_j z^j, then Φ(P) = Σ_{j=0}^{n} a_j A^j;
(iii) if f(z) = 1/(w − z) for w ∉ σ(A), then Φ(f) = (wI − A)^{−1}.

⁹ Since the integrand is a function w ∈ γ ↦ C^{M×M} (in a matrix representation), the integral is an M × M-tuple of standard curve integrals.

¹⁰ We recall the following result from the theory of functions of a complex variable: If f and γ are as in Lemma 1.1.37, then

\[ f^{(j)}(z) = \frac{j!}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - z)^{j+1}} \, dw \]

holds for z ∈ int γ and j ∈ N ∪ {0}.


Remark 1.1.39. (i) A mapping Φ(f) can be computed either by Lemma 1.1.37 which is valid also for a general A, or by the formula

\[ f(A)x = \sum_{l=1}^{k} \sum_{j=0}^{m(\lambda_l)-1} \frac{f^{(j)}(\lambda_l)}{j!} (A - \lambda_l I)^j \pi_l x \]

where σ(A) = {λ_1, . . . , λ_k} and π_l is the projection onto N(λ_l) defined by the decomposition (1.1.12). We note that these projections are also functions of A, namely π_l = χ_l(A) where

\[ \chi_l(z) = \begin{cases} 1, & z \in B(\lambda_l; \delta), \\ 0, & z \notin B(\lambda_l; \delta) \end{cases} \]

and δ > 0 is small enough so that σ(A) ∩ B(λ_l; δ) = {λ_l}.

(ii) If X is a real linear space of finite dimension and A ∈ L(X), then we can construct a functional calculus for X_C and A_C (see Remark 1.1.35(ii)).

(iii) We deduced a functional calculus from Theorem 1.1.34. The opposite way is also possible, namely to use functional calculus for finding the canonical form. An important role is played by projections π_l giving the decomposition (1.1.12). The interested reader can find more details, e.g., in Dunford & Schwartz [44, Section VII, 1].

Exercise 1.1.40. Show that

\[ \operatorname{sgn} \det A = (-1)^p \qquad \text{where} \quad p = \sum_{\lambda \in \sigma(A),\ \lambda < 0} m(\lambda) \]

for a matrix representation of A with real entries. (The sum over the empty set is defined to be zero.)

Hint. Notice that det A = λ_1^{m_1} · · · λ_k^{m_k}.

Exercise 1.1.41. Show that the formula (1.1.19) yields a matrix representation of e^{tB} in the form

\[ \begin{pmatrix}
e^{t\lambda} & t e^{t\lambda} & \cdots & \frac{t^{M-1}}{(M-1)!} e^{t\lambda} \\
0 & e^{t\lambda} & \cdots & \frac{t^{M-2}}{(M-2)!} e^{t\lambda} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & e^{t\lambda}
\end{pmatrix} \]

whenever B has the representation (1.1.17) with respect to the same basis.

Exercise 1.1.42. Use the formula in Remark 1.1.39(i) to estimate ‖e^{tA}x‖_{C^M} for large positive t and large negative t in dependence on σ(A).

Hint. Suppose that α < Re λ < β for all λ ∈ σ(A). Show that there are constants c_1, c_2 such that

\[ \| e^{tA} x \| \le \begin{cases} c_1 e^{\beta t} \|x\| & \text{for } t \ge 0, \\ c_2 e^{\alpha t} \|x\| & \text{for } t \le 0 \end{cases} \qquad \text{for all } x \in \mathbb{C}^M. \]

In particular, if β < 0, then all solutions of (1.1.15) tend to zero as t → +∞ (asymptotic stability).

Exercise 1.1.43. Let A ∈ L(C^M) have a regular matrix representation. Show that (i) all matrix representations of A are regular; (ii) there exists B ∈ L(C^M) such that e^B = A. Is this B unique? How is σ(B) related to σ(A)?

1.2 Normed Linear Spaces

In the preface we mentioned that our main attention is focused on the properties of nonlinear mappings defined on various spaces of functions. Besides the linear structure studied in the previous section such spaces also have a topological structure. We will assume that the two structures are joined together (like multiplication and addition are joined together by the distributive law in the notion of the field). A natural requirement of continuity of linear operations leads to the notion of a linear topological space. These spaces are often too general for purposes of nonlinear analysis. For example, basic notions and results of differential calculus in such spaces are not straightforward generalizations of the corresponding notions for functions of several variables and they frequently need profound ideas. Because of that we restrict our interest to cases when a topological structure is given by a metric, especially by a norm. Before starting with this concept we briefly introduce the main topological notions. For more information, the interested reader can consult books like Dugundji [43], Kelley [75].

A set X with a collection T of its subsets is called a topological space if T possesses the following properties:
(1) ∅, X ∈ T;
(2) an intersection of a finite number of sets of T belongs to T;
(3) a union of any subcollection of T belongs to T.
Elements of T are called open sets. A subset U ⊂ X is called a neighborhood of a point x ∈ X if there is an open set G ⊂ X such that x ∈ G ⊂ U.

An important special case of a topological space is the so-called metric space. This is a set X with a real function (metric) ϱ : X × X → [0, ∞) for which
(1) ϱ(x, y) = 0 ⇐⇒ x = y,
(2) x, y ∈ X =⇒ ϱ(x, y) = ϱ(y, x) (the so-called symmetry of the metric),
(3) x, y, z ∈ X =⇒ ϱ(x, z) ≤ ϱ(x, y) + ϱ(y, z) (the so-called triangle inequality).
If ϱ is a metric on X, then B(x; r) := {y ∈ X : ϱ(x, y) < r} is called the open ball centered at x ∈ X with the radius r > 0. Open sets in a metric space are defined as subsets G ⊂ X which have the following property: for every x ∈ G there is δ > 0 such that B(x; δ) ⊂ G. It is easy to prove that a metric space with this definition of open sets is also a topological space.

For the following notions and results see, e.g., Dieudonné [35]. A subset F of a topological space X is called a closed set if X \ F is open. If A ⊂ X, then the intersection of all closed sets containing A is called the closure of A and is denoted by Ā, i.e.,

\[ \overline{A} = \bigcap_{A \subset F,\ F \text{ is closed}} F. \]

A dual notion is the interior (int A) of A:

\[ \operatorname{int} A = \bigcup_{G \subset A,\ G \text{ is open}} G. \]

The boundary ∂A of A is defined by

\[ \partial A := \overline{A} \cap \overline{X \setminus A}. \]

A subset A of X is said to be dense if Ā = X. It is almost obvious that in a metric space X we have the following equivalences:
(i) x ∈ Ā ⇐⇒ ∃{x_n}_{n=1}^∞ ⊂ A : lim_{n→∞} ϱ(x_n, x) = 0,¹¹
(ii) x ∈ int A ⇐⇒ ∃δ > 0 : B(x; δ) ⊂ A.
A metric space X is said to be separable if there is a countable dense subset of X.

¹¹ We also say that the sequence {x_n}_{n=1}^∞ is convergent to x and write lim_{n→∞} x_n = x or, more simply, x_n → x. The notion of a convergent sequence can be introduced also in topological spaces: {x_n}_{n=1}^∞ is convergent to x if for every neighborhood U of x there is an index n_0 ∈ N such that x_n ∈ U for each n ≥ n_0. Warning. In a topological space there need not be enough convergent sequences in order to describe a closure, etc.! See, e.g., J. von Neumann’s example in Dunford & Schwartz [44, Chapter V, 7, Example 38].


If X, Y are topological spaces and f : X → Y, then f is said to be continuous on X provided f^{−1}(G) is open in X whenever G is an open set in Y. If f is injective and surjective, and f, f^{−1} are both continuous, then f is called a homeomorphism of X onto Y. It is also possible to define continuity at a point a ∈ X with help of the notion of a neighborhood: f is continuous at a if f^{−1}(U) is a neighborhood of a whenever U is a neighborhood of f(a). A mapping f is continuous on X if and only if it is continuous at every point of X. The following equivalence holds in metric spaces X and Y:

\[ f : X \to Y \text{ is continuous at } a \in X \iff (x_n \to a \implies f(x_n) \to f(a)). \]

A very important notion is that of compactness: A topological space X is said to be compact if for every open covering {G_γ}_{γ∈Γ} of X (i.e., X = ⋃_{γ∈Γ} G_γ) there is a finite subset K ⊂ Γ such that

\[ X = \bigcup_{\gamma \in K} G_\gamma \]

(a finite subcovering). Any subset A of a topological space X is itself a topological space with the collection of open sets {(G ∩ A) : G open in X}. A subset A of a topological space X is said to be compact in X if A is a compact topological space in this induced topology. Further, A ⊂ X is said to be relatively compact provided Ā is compact. In metric spaces we have the following characterization.

Proposition 1.2.1. Let X be a metric space. Then A ⊂ X is relatively compact if and only if for any sequence {x_n}_{n=1}^∞ ⊂ A there is a convergent subsequence.¹²

Beside this proposition, the importance of compactness in analysis is obvious from the next result which will be discussed more deeply in Section 6.2.

Proposition 1.2.2. Let X be either a compact topological space or a sequentially compact topological space and let f be a continuous real function on X. Then there exist a maximal and a minimal value of f, i.e., there are x_1, x_2 ∈ X such that

\[ f(x_1) \le f(x) \le f(x_2) \qquad \text{for all } x \in X. \]

¹² A topological space X is said to be sequentially compact if for any sequence {x_n}_{n=1}^∞ ⊂ X there is a subsequence {x_{n_k}}_{k=1}^∞ which is convergent to a point x ∈ X. Warning. These two notions of compactness are different in topological spaces. To be more precise: There is a compact topological space which is not sequentially compact and there is a sequentially compact topological space which is not compact!


To find a criterion for compactness in a particular space need not be an easy task. To formulate a general result we need one more notion the significance of which goes far beyond our present considerations. A sequence {x_n}_{n=1}^∞ of elements of a metric space X is called a Cauchy sequence if for every ε > 0 there is n_0 ∈ N such that

\[ \varrho(x_m, x_n) < \varepsilon \qquad \text{for all } m, n \ge n_0. \]

A metric space X is said to be complete if any Cauchy sequence in X is convergent (to an element of X). We will encounter complete spaces “almost everywhere” in the subsequent text.

Proposition 1.2.3. Let X be a complete metric space. Then A ⊂ X is relatively compact if and only if for every ε > 0 there is a finite set K ⊂ X (the so-called finite ε-net for A) such that

\[ \forall a \in A \ \exists x \in K : \quad \varrho(a, x) < \varepsilon. \]

In other words, A ⊂ ⋃_{x∈K} B(x; ε).
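A finite ε-net in the sense of Proposition 1.2.3 is easy to write down for a simple set. The sketch below assumes X = R² with the Euclidean metric and A the unit square [0, 1]²; a grid with spacing ε is taken as the net K, because every point of the square lies within ε√2/2 < ε of a grid point. The net property is then spot-checked on random sample points, which is an experiment and not a proof; all names and parameter values are hypothetical.

```python
import math
import random

def epsilon_net(eps):
    """Grid points with spacing eps whose cells cover the unit square [0, 1]^2."""
    steps = math.ceil(1 / eps)
    return [(i * eps, j * eps) for i in range(steps + 1) for j in range(steps + 1)]

def distance_to_net(point, net):
    """Euclidean distance from `point` to the nearest element of the net."""
    return min(math.dist(point, k) for k in net)

eps = 0.25
net = epsilon_net(eps)

random.seed(0)                                   # fixed seed for reproducibility
sample = [(random.random(), random.random()) for _ in range(1000)]
worst = max(distance_to_net(p, net) for p in sample)

print("net size:", len(net))
print("largest distance from a sample point to the net:", worst)
```

The number of grid points grows like ε^{-2} as ε decreases, but it stays finite for every ε > 0, which is what the proposition requires.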

Proposition 1.2.4. Let X be a complete metric space and let f : [α, β) → X. If f is uniformly continuous on [α, β),¹³ then there exists lim_{x→β−} f(x) ∈ X. In particular, if β < ∞, then f can be continuously extended to [α, β].

¹³ I.e., ∀ε > 0 ∃δ > 0 ∀x, y ∈ [α, β) : |x − y| < δ =⇒ ϱ(f(x), f(y)) < ε.

Definition 1.2.5. A topological space X is called a connected space provided it is not possible to find two disjoint nonempty open sets G_1, G_2 such that X = G_1 ∪ G_2. For a ∈ X put

\[ C(a) := \bigcup \{ A \subset X : a \in A \text{ and } A \text{ is connected} \}. \]

Then C(a) is a connected set and it is called the component of the point a. If a, b ∈ X, a ≠ b, then either C(a) = C(b) or C(a) ∩ C(b) = ∅.

Proposition 1.2.6. Let X be a connected space, let f : X → Y be continuous. Then f(X) is a connected subset of Y. In particular, if γ : [0, 1] → Y is continuous, A ⊂ Y, and γ(0) ∈ A, γ(1) ∉ A, then there exists t_0 ∈ [0, 1] such that γ(t_0) ∈ ∂A.

Proposition 1.2.7. Let X be a normed linear space and let G be an open subset of X. Then G is connected if and only if for any two points a, b ∈ G there exists a continuous mapping γ : [0, 1] → G such that γ(0) = a, γ(1) = b. In particular, γ can be chosen piecewise linear.

Now we are ready to start with the main subject of this section.

Definition 1.2.8. Let X be a real or complex linear space. A function ‖·‖_X : X → R is called a norm on X if it has the following properties:


Chapter 1. Preliminaries

(1) $\|x\|_X = 0 \iff x = o$,
(2) $\|\alpha x\|_X = |\alpha| \, \|x\|_X$ for $\alpha \in \mathbb{R}$ or $\mathbb{C}$ and $x \in X$,
(3) $\|x + y\|_X \le \|x\|_X + \|y\|_X$ for $x, y \in X$ (the so-called triangle inequality).

If a linear space $X$ is endowed with a norm, then $X$ is called a normed linear space. In the sequel we will drop the index of the norm whenever there is no danger of confusion. It is obvious that $\varrho(x, y) := \|x - y\|$ is a metric on $X$. Therefore all metric notions and results are transmitted to normed linear spaces. If a normed linear space is complete in this metric, then it is called a Banach space.

Any metric space can be embedded as a dense set into a complete metric space. For a normed linear space $X$ we get a slightly stronger result: There exists a Banach space $\tilde{X}$ (the so-called completion of $X$) and a linear injection $L : X \to \tilde{X}$ such that $\operatorname{Im} L$ is a dense subset of $\tilde{X}$ and

\[ \|x\|_X = \|L(x)\|_{\tilde{X}} \quad \text{for all } x \in X. \]

Example 1.2.9. Let $X$ be an $M$-dimensional real linear space. Choose a basis $f_1, \dots, f_M$ of $X$ and let $e_1, \dots, e_M$ be the standard basis of $\mathbb{R}^M$. For $x = \sum_{i=1}^{M} x_i f_i \in X$ put $\varphi(x) = \sum_{i=1}^{M} x_i e_i$. Then $\varphi$ is an isomorphism of $X$ onto $\mathbb{R}^M$. Moreover,

\[ \|(x_1, \dots, x_M)\|_1 := \sum_{i=1}^{M} |x_i|, \qquad \|(x_1, \dots, x_M)\|_\infty := \max_{i=1,\dots,M} |x_i|, \qquad \|(x_1, \dots, x_M)\|_2 := \Bigl( \sum_{i=1}^{M} |x_i|^2 \Bigr)^{\frac{1}{2}} \]

are norms on $\mathbb{R}^M$ (for indices $1$, $\infty$ it is obvious, the triangle inequality for index $2$ needs some effort, see also Proposition 1.2.30 below). These norms can be transmitted to $X$ with help of $\varphi$, i.e.,

\[ \|x\|_\alpha := \|\varphi(x)\|_\alpha, \quad \alpha = 1, 2, \infty. \]

Similar results are true also for a complex linear space $X$ when $\mathbb{C}^M$ is used instead of $\mathbb{R}^M$. The space $X$ is a Banach space with respect to any of the above norms.
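The three norms of Example 1.2.9 are easy to experiment with numerically. The following is only a minimal sketch in Python (the function names `norm_1`, `norm_2`, `norm_inf` and the sample vectors are our own choices, not notation from the text); it checks the triangle inequality and the elementary comparison $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le M \|x\|_\infty$ on one example, which of course proves nothing in general.

```python
import math

def norm_1(x):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def norm_2(x):
    """Euclidean norm."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def norm_inf(x):
    """Largest absolute value of a coordinate."""
    return max(abs(t) for t in x)

# Hypothetical sample vectors in R^3 (illustration only).
x = [3.0, -4.0, 0.0]
y = [1.0, 2.0, -2.0]
s = [a + b for a, b in zip(x, y)]

# Triangle inequality for each norm (small tolerance for rounding).
for norm in (norm_1, norm_2, norm_inf):
    assert norm(s) <= norm(x) + norm(y) + 1e-12

# Elementary comparison between the three norms, with M = dim.
M = len(x)
assert norm_inf(x) <= norm_2(x) <= norm_1(x) <= M * norm_inf(x)
```

Such comparisons are a concrete instance of the equivalence of norms on a finite dimensional space discussed below.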

The classical Bolzano–Weierstrass result on the compactness of a closed bounded interval in $\mathbb{R}$ has the following generalization: Let $X$ be a finite dimensional space endowed with an $\alpha$-norm ($\alpha = 1, 2, \infty$). Then $A \subset X$ is relatively compact if and only if it is bounded (i.e., there is a constant $c$ such that $\|x\|_\alpha \le c$ for every $x \in A$).

1.2. Normed Linear Spaces


We note that this result is true for any norm on $X$ (see Corollary 1.2.11(i) below).

Proposition 1.2.10. Let $X$ and $Y$ be normed linear spaces and let $A$ be a linear operator from $X$ into $Y$. Then the following statements are equivalent:
(i) $A$ is continuous on $X$;
(ii) $A$ is continuous at $o \in X$;
(iii) there is a constant $c$ such that the inequality

\[ \|Ax\|_Y \le c \|x\|_X \quad \text{is valid for all } x \in X. \tag{1.2.1} \]

Proof. The easy proof is left to the reader as an exercise.

We denote the collection of all continuous linear operators from $X$ into $Y$ by $\mathcal{L}(X, Y)$ and the least possible constant $c$ in (1.2.1) by $\|A\|_{\mathcal{L}(X,Y)}$. This quantity has all properties of a norm on the linear space $\mathcal{L}(X, Y)$. We will always consider this norm (the so-called operator norm) on $\mathcal{L}(X, Y)$. If $X = Y$, we will use the shorter notation $\mathcal{L}(X)$ instead of $\mathcal{L}(X, X)$.

We now return to Example 1.2.9. It is obvious that there are positive constants $c_1$, $c_2$ such that

\[ c_1 \|x\|_1 \le \|x\|_2 \le c_2 \|x\|_1 \quad \text{for all } x \in \mathbb{R}^M \ (\mathbb{C}^M). \]

(Here, e.g., $c_1 = \frac{1}{M}$, $c_2 = \sqrt{M}$.) Such constants exist also for the norms $\|\cdot\|_1$, $\|\cdot\|_\infty$. More generally, two norms on a linear space $X$ are called equivalent if they satisfy such inequalities. In other words, two norms $\|\cdot\|_\alpha$, $\|\cdot\|_\beta$ on a linear space $X$ are equivalent if the identity map from $(X, \|\cdot\|_\alpha)$ into $(X, \|\cdot\|_\beta)$ is continuous together with its inverse, i.e., it is an isomorphism.14

Corollary 1.2.11.
(i) Any two norms on a finite dimensional linear space $X$ are equivalent. In particular, $X$ is a Banach space.
(ii) Let $X$, $Y$ be normed linear spaces and $\dim X < \infty$. Then $\mathcal{L}(X, Y) = L(X, Y)$, i.e., any linear operator from $X$ into $Y$ is continuous.

Proof. (i) Let $\varphi$ be as in Example 1.2.9 and consider $\mathbb{R}^M$ (or $\mathbb{C}^M$) equipped with the $\|\cdot\|_1$-norm. Then for $x = (x_1, \dots, x_M) \in \mathbb{R}^M$ we have

\[ \|\varphi^{-1}(x)\|_X = \Bigl\| \sum_{i=1}^{M} x_i f_i \Bigr\|_X \le \sum_{i=1}^{M} |x_i| \, \|f_i\|_X \le c \sum_{i=1}^{M} |x_i| = c \|x\|_1, \]

i.e., $\varphi^{-1}$ is continuous. Observe that for proving continuity of $\varphi$ it is sufficient to show that

\[ \inf \{ \|\varphi^{-1}(x)\|_X : \|x\|_1 = 1 \} > 0. \]

14 Unlike the "algebraic" isomorphism from Definition 1.1.11, here it is understood in the "topological" sense. In general, $A \in \mathcal{L}(X, Y)$ is an isomorphism if $A$ is injective, surjective and $A^{-1} \in \mathcal{L}(Y, X)$.

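But this is true since the set $\{x \in \mathbb{R}^M : \|x\|_1 = 1\}$ is compact and $\varphi^{-1}$ is continuous. Now let $\|\cdot\|$, $\|\cdot\|^\sim$ be two norms on $X$, $\dim X = M$, and let $\iota$ be the identity map from $X$ onto $\tilde{X}$ ($= X$ with the norm $\|\cdot\|^\sim$). The result follows from the commutativity of the diagram in Figure 1.2.1. Since $\mathbb{R}^M$ and $\mathbb{C}^M$ are complete spaces with respect to the $\|\cdot\|_1$-norm (the classical Bolzano–Cauchy condition) and $\{u_n\}_{n=1}^{\infty} \subset X$ is a Cauchy sequence if and only if $\{\varphi(u_n)\}_{n=1}^{\infty}$ is a Cauchy sequence, $X$ is a Banach space.

[Figure 1.2.1: commutative diagram with the maps $\iota : X \to \tilde{X}$, $\varphi : X \to \mathbb{R}^M$ ($\mathbb{C}^M$) and $\tilde{\varphi}^{-1} : \mathbb{R}^M$ ($\mathbb{C}^M$) $\to \tilde{X}$.]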

(ii) It is sufficient to prove continuity with respect to the 1-norm on $X$. For $x = \sum_{i=1}^{M} x_i f_i \in X$ we have

\[ \|Ax\|_Y \le \sum_{i=1}^{M} |x_i| \, \|A f_i\|_Y \le c \sum_{i=1}^{M} |x_i| = c \|x\|_1. \]

Example 1.2.12 (spaces of continuous functions). Let $T$ be a compact topological space. Then any continuous real (complex) function $f$ is bounded on $T$ (Proposition 1.2.2), and

\[ \|f\|_T = \sup \{ |f(x)| : x \in T \} \]

is a norm on the space $C(T)$ of all such functions. Convergence of a sequence in this norm is the uniform convergence on $T$. It follows that $C(T)$ is a Banach space.

If $T$ is not compact, then a continuous function on $T$ need not be bounded. To get a topology on a family of continuous functions on $T$ we can either restrict our attention to the space $BC(T)$ of all bounded, continuous functions on $T$ or assume certain properties of $T$ which are weaker than compactness (the reader can consider $\mathbb{R}^M$ as a model of such $T$). As a result we wish to obtain a topology on $C(T)$ in which convergence of a sequence is equivalent to the locally uniform convergence. This can be done as follows. Let a topological space $T$ be a countable union of open, relatively compact subsets $T_n$.15 We leave to the reader to verify that the sum

\[ \varrho(f, g) := \sum_{n=1}^{\infty} \frac{1}{2^n} \, \frac{\|f - g\|_n}{1 + \|f - g\|_n} \tag{1.2.2} \]

where

\[ \|f - g\|_n := \sup \{ |f(x) - g(x)| : x \in T_n \}, \]

defines a metric on $C(T)$ and the convergence of a sequence in this metric is actually the locally uniform convergence, i.e., uniform convergence on any compact subset of $T$. Since $\varrho$ is bounded it cannot be induced by any norm. Even more is true, namely, there is no norm on $C(T)$ which generates the same system of open sets as the metric $\varrho$ does (provided $T$ itself is not compact).

15 A basic example is $\mathbb{R}^M$ or $\mathbb{C}^M$. Another example is the set $\mathbb{N}$ of natural numbers with the discrete metric: $d(m, n) = 1$ if $m \ne n$ and $d(m, m) = 0$.

We now state two fundamental results concerning spaces of continuous functions. To formulate the first we need the concept of equicontinuity: A family $F \subset C(T)$ is said to be equicontinuous if for all $x \in T$ and $\varepsilon > 0$ there is a neighborhood $U$ of $x$ such that

\[ y \in U, \ f \in F \implies |f(y) - f(x)| < \varepsilon. \]

Theorem 1.2.13 (Arzelà–Ascoli). Let $T$ be a topological space which is a union of a sequence of open, relatively compact subsets. Then $F \subset C(T)$ is relatively compact in the $\varrho$-metric if and only if the following two conditions are satisfied:
(i) $F$ is equicontinuous;
(ii) for each $x \in T$ the set $\{f(x) : f \in F\}$ is bounded in $\mathbb{R}$ (or $\mathbb{C}$).16

Proof. We omit the proof and refer, e.g., to Dugundji [43, Section XII, 6] or Kelley [75, Chapter 7, Theorem 17] where more general results are proved.

Since a continuous function can be very strange (e.g., nowhere differentiable) it is often desirable to have an approximation procedure. The first result of this type was the famous Weierstrass Theorem on uniform approximation by polynomials. One of the characteristic features of this approximation consists in the fact that the product of two continuous functions is a continuous function and the same is true for polynomials. In algebraic terms: Both sets are not only linear spaces but also algebras.17 The following generalization of the Weierstrass Theorem is due to M.H. Stone.

Theorem 1.2.14 (Stone–Weierstrass). Let $T$ satisfy the assumption of Theorem 1.2.13 and let $C_{\mathbb{R}}(T)$ be an algebra of all real continuous functions on $T$. Let $A \subset C_{\mathbb{R}}(T)$ be a subalgebra which contains constant functions and separates points of $T$ (i.e., for any $x, y \in T$, $x \ne y$, there is $f \in A$ such that $f(x) \ne f(y)$). Then $A$ is dense in $C_{\mathbb{R}}(T)$ with respect to the $\varrho$-metric.

16 If $T$ is compact and (i) holds, then the assumption (ii) is equivalent to the boundedness of $F$ in $C(T)$.

17 A linear space $X$ with a binary operation (product) which is associative and distributive with respect to linear operations is called an algebra. Further, if $X$ is a normed linear space and for the product the inequality $\|x \cdot y\| \le \|x\| \, \|y\|$ holds for every $x, y \in X$, then $X$ is called a normed algebra and, in the case that $X$ is complete, a Banach algebra.

Proof. The proof can be found, e.g., in Dugundji [43, XIII, 3] or Kelley [75, Chapter 7, Exercise T].

We note that Theorem 1.2.14 can be easily extended to the space of complex continuous functions. In this case, $A$ is assumed to possess the following additional property: If $f \in A$, then also $\bar{f} \in A$.18

The reader can ask why certain additional properties are needed for compactness in infinite dimensional spaces like $C(T)$ in contrast to finite dimensional spaces. The following theorem explains not only this situation but also the technical difficulties which one meets in the calculus of variations (see Chapter 6).

Proposition 1.2.15 (F. Riesz). Let $X$ be a normed linear space. Then the closed unit ball $B(o; 1) := \{x \in X : \|x\| \le 1\}$ is compact (in the norm topology) if and only if $X$ has finite dimension.

Proof. Sufficiency is obvious (see Example 1.2.9 and Corollary 1.2.11(i)). It remains to prove necessity. We proceed by contradiction. Assume that $\dim X = \infty$. Choose $0 < \varepsilon < 1$ and suppose that we have $x_1, \dots, x_n \in B(o; 1)$ such that

\[ \|x_i - x_j\| > 1 - \varepsilon \quad \text{for all } 1 \le i < j \le n. \]

We shall show that we can find another element $x_{n+1} \in B(o; 1)$ such that $\{x_1, \dots, x_{n+1}\}$ has the same property. Since $X_n := \operatorname{Lin}\{x_1, \dots, x_n\} \ne X$ there is $y \in X \setminus X_n$. Denote $d := \inf \{\|y - x\| : x \in X_n\}$. Observe that $d > 0$ since $X_n$ is a closed subspace.19 By the definition of the greatest lower bound, there exists $\tilde{x} \in X_n$ such that $d \le \|y - \tilde{x}\| < d(1 + \varepsilon)$. For $x_{n+1} := \frac{y - \tilde{x}}{\|y - \tilde{x}\|} \in B(o; 1)$ and $x \in X_n$ we get

\[ \|x_{n+1} - x\| = \frac{1}{\|y - \tilde{x}\|} \bigl\| y - (\tilde{x} + \|y - \tilde{x}\| x) \bigr\| \ge \frac{1}{d(1 + \varepsilon)} \, d > 1 - \varepsilon. \]

Thus an infinite sequence $\{x_n\}_{n=1}^{\infty} \subset B(o; 1)$ with no convergent subsequence has been constructed, which contradicts compactness of $B(o; 1)$.

18 If $z = x + iy$, $x, y \in \mathbb{R}$, then its complex conjugate $\bar{z}$ is defined by $\bar{z} := x - iy$.

19 Every finite dimensional subspace $Y \subset X$ is complete, and therefore closed in $X$.

Example 1.2.16 (spaces of integrable functions). Let $\Omega$ be a Lebesgue measurable subset of $\mathbb{R}^M$ and let $dx$ denote the Lebesgue measure in $\mathbb{R}^M$.

For $p \in [1, \infty)$ we denote

\[ \mathcal{L}^p(\Omega) := \Bigl\{ f : \Omega \to \mathbb{R} \ (\text{or } \mathbb{C}) : f \text{ is measurable and } |f|_{L^p(\Omega)} := \Bigl( \int_\Omega |f(x)|^p \, dx \Bigr)^{\frac{1}{p}} < \infty \Bigr\}. \tag{1.2.3} \]

The Minkowski inequality

\[ |f + g|_{L^p(\Omega)} \le |f|_{L^p(\Omega)} + |g|_{L^p(\Omega)} \tag{1.2.4} \]

implies that $\mathcal{L}^p(\Omega)$ is a linear space. Observe that $|\cdot|_{L^p(\Omega)}$ is not a norm since $|f|_{L^p(\Omega)} = 0$ implies only $f = o$ almost everywhere (abbreviation: a.e.) in $\Omega$. Put $N(\Omega) = \{f : \Omega \to \mathbb{C} : f = o \text{ a.e. in } \Omega\}$. Then $N$ is a linear subspace of $\mathcal{L}^p$ and the factor space $L^p(\Omega) := \mathcal{L}^p(\Omega)|_N$ is a normed linear space with the norm

\[ \|[f]\|_{L^p(\Omega)} = |f|_{L^p(\Omega)} \quad \text{for any } f \in [f].20 \]

For the sake of simplicity we will use the notation $f$ instead of the superfluous $[f]$ for an element of $L^p(\Omega)$ and will call it simply a function. It is also convenient to introduce the space $L^\infty(\Omega)$ of all (classes of) essentially bounded measurable functions. We recall that $f$ is said to be essentially bounded on $\Omega$ if there is a constant $c$ such that $|f(x)| \le c$ for a.e. $x$ in $\Omega$. The least possible $c$ is denoted by $\|f\|_{L^\infty(\Omega)}$. Again $\|\cdot\|_{L^\infty(\Omega)}$ is a norm on $L^\infty(\Omega)$.

We mention another important inequality, the so-called Hölder inequality: If $1 \le p \le \infty$ and $p'$ is the conjugate exponent ($\frac{1}{p} + \frac{1}{p'} = 1$ where $\frac{1}{\infty}$ is here defined to be $0$) and $f \in L^p(\Omega)$, $g \in L^{p'}(\Omega)$, then $fg \in L^1(\Omega)$ and

\[ \|fg\|_1 \le \|f\|_p \, \|g\|_{p'}. \tag{1.2.5} \]
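For step functions the integrals in (1.2.5) reduce to finite sums, so the Hölder inequality can be checked directly. The sketch below is only an illustration under our own assumptions: $\Omega = (0, 1)$ is split into cells of equal width `h`, a step function is stored as the list of its values on the cells, and the helper names `lp_norm_step` and `conjugate_exponent` are hypothetical.

```python
def lp_norm_step(values, h, p):
    """L^p norm of a step function taking values[i] on the i-th cell of width h."""
    if p == float("inf"):
        # Every cell has positive measure, so the essential supremum is the maximum.
        return max(abs(v) for v in values)
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

def conjugate_exponent(p):
    """Return p' with 1/p + 1/p' = 1, using the convention 1/inf = 0."""
    if p == 1:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)

# Two step functions on (0, 1) with 4 cells of width 1/4 (illustrative values).
h = 0.25
f = [1.0, -2.0, 0.5, 3.0]
g = [2.0, 1.0, -4.0, 0.5]
fg = [a * b for a, b in zip(f, g)]

for p in (1, 1.5, 2, 3, float("inf")):
    q = conjugate_exponent(p)
    lhs = lp_norm_step(fg, h, 1)                         # ||fg||_1
    rhs = lp_norm_step(f, h, p) * lp_norm_step(g, h, q)  # ||f||_p ||g||_p'
    assert lhs <= rhs + 1e-12
```

The endpoint cases $p = 1$ and $p = \infty$ are included to illustrate the convention $\frac{1}{\infty} = 0$.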

Proposition 1.2.17. $L^p(\Omega)$ is a Banach space for any $1 \le p \le \infty$.

Proof. We give the proof for $p = 1$ (some small modifications are needed for $1 < p < \infty$, while the proof for $p = \infty$ is similar to the one of completeness of $C(T)$, cf. Example 1.2.12). Let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $L^1(\Omega)$. Then for any $k \in \mathbb{N}$ there is $n_k \in \mathbb{N}$ such that $\|f_n - f_{n_k}\|_1 < \frac{1}{2^k}$ for all $n \ge n_k$. We can assume that the sequence $\{n_k\}_{k=1}^{\infty}$ is strictly increasing. Put $g_p = \sum_{k=1}^{p} |f_{n_{k+1}} - f_{n_k}|$. Since

\[ \int_\Omega g_p(x) \, dx \le \sum_{k=1}^{p} \int_\Omega |f_{n_{k+1}}(x) - f_{n_k}(x)| \, dx \le \sum_{k=1}^{p} \frac{1}{2^k}, \]

the Monotone Convergence Theorem21 gives that $g = \lim_{n \to \infty} g_n$ has a finite integral over $\Omega$ and therefore $g$ is finite a.e. in $\Omega$. This means that $\sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)|$ is a.e. convergent, and therefore $f(x) := \lim_{k \to \infty} f_{n_k}(x)$ exists a.e. in $\Omega$. By the Fatou Lemma22 we have

\[ \int_\Omega |f(x) - f_{n_k}(x)| \, dx \le \liminf_{l \to \infty} \int_\Omega |f_{n_l}(x) - f_{n_k}(x)| \, dx \le \frac{1}{2^{k-1}}. \]

In particular,

\[ f \in L^1(\Omega) \quad \text{and} \quad \lim_{k \to \infty} \|f_{n_k} - f\|_1 = 0. \]

The rest of the proof is easy. Indeed, a Cauchy sequence which has a convergent subsequence is itself convergent.

20 For the sake of simplicity we will use in the sequel the notation $\|\cdot\|_p$ instead of $\|\cdot\|_{L^p(\Omega)}$.

21 This theorem reads as follows: Let $\{g_n\}_{n=1}^{\infty}$ be an increasing sequence of nonnegative measurable functions on $\Omega$ and let $g = \lim_{n \to \infty} g_n$. Then

\[ \lim_{n \to \infty} \int_\Omega g_n(x) \, dx = \int_\Omega g(x) \, dx. \]

22 The Fatou Lemma reads: Let $\{h_n\}_{n=1}^{\infty}$ be a sequence of measurable functions which are uniformly bounded below by an $h \in L^1(\Omega)$. Then

\[ \int_\Omega \liminf_{n \to \infty} h_n(x) \, dx \le \liminf_{n \to \infty} \int_\Omega h_n(x) \, dx. \]

The statement holds for $\limsup$ with the reverse inequality for a sequence bounded above by an integrable function. Put here $h_l = |f_{n_l} - f_{n_k}|$.

Remark 1.2.18. The proof shows that the following statement is true: If $\{f_n\}_{n=1}^{\infty}$ is convergent to $f$ in the $L^p$-norm, then there is a subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ which converges to $f$ a.e., and there is $g \in L^p(\Omega)$, $g \ge 0$, such that

\[ |f_{n_k}(x)| \le g(x) \quad \text{for a.e. } x \in \Omega. \]

Warning. The whole sequence need not be a.e. convergent! To see this arrange the characteristic functions of the intervals $\bigl[ \frac{k-1}{2^n}, \frac{k}{2^n} \bigr]$, $k = 1, \dots, 2^n$, $n \in \mathbb{N}$, into a sequence.
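The warning about the characteristic functions of dyadic intervals can be made concrete. The following Python sketch is an illustration under our own conventions (the functions `dyadic_interval`, `indicator`, `l1_norm` and the sample point are hypothetical names and choices): on each level $n$ the $L^1(0,1)$ norms equal $2^{-n}$ and tend to zero, yet at a fixed point the values $0$ and $1$ both occur on every level, so the sequence of values does not converge there.

```python
from fractions import Fraction

def dyadic_interval(n, k):
    """Closed interval [(k-1)/2^n, k/2^n] for k = 1, ..., 2^n."""
    return Fraction(k - 1, 2 ** n), Fraction(k, 2 ** n)

def indicator(n, k, x):
    """Value at x of the characteristic function of the (n, k) dyadic interval."""
    a, b = dyadic_interval(n, k)
    return 1 if a <= x <= b else 0

def l1_norm(n, k):
    """L^1(0, 1) norm of the characteristic function, i.e., the interval length."""
    a, b = dyadic_interval(n, k)
    return b - a

x = Fraction(3, 10)  # sample point that is not a dyadic rational
for n in range(1, 7):
    values = [indicator(n, k, x) for k in range(1, 2 ** n + 1)]
    # On every level the value 1 is attained exactly once at x ...
    assert values.count(1) == 1
    # ... and the value 0 at all other indices, so the values at x oscillate,
    assert values.count(0) == 2 ** n - 1
    # while the L^1 norms 2^(-n) tend to zero.
    assert l1_norm(n, 1) == Fraction(1, 2 ** n)
```

Exact rational arithmetic is used only to avoid rounding questions at interval endpoints.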


Approximations of integrable functions by more regular functions, like continuous or differentiable ones, are often desirable.

Proposition 1.2.19 (Density Theorem). For any $p \in [1, \infty)$ the subset $C(\Omega) \cap L^p(\Omega)$ is dense in $L^p(\Omega)$.

Proof. It is based on the application of the Luzin Theorem.23 See also Proposition 1.2.21 below.

We now show another type of approximation which is more constructive and therefore often more convenient in applications. If $f$, $g$ are measurable functions on $\mathbb{R}^M$, then we define their convolution $f * g$ as

\[ (f * g)(x) := \int_{\mathbb{R}^M} f(x - y) g(y) \, dy \quad \text{for all } x \in \mathbb{R}^M \tag{1.2.6} \]

for which the integral exists. We note that the properties of the convolution follow from the Fubini Theorem provided measurability of the function $(x, y) \mapsto f(x - y) g(y)$ is established. For details see, e.g., Folland [52], Gripenberg, Londen & Staffans [62, Chapters 2–4], and also Example 2.1.28. The following assertion is a basic result on convolutions.

Proposition 1.2.20. Let $f \in L^1(\mathbb{R}^M)$.
(i) If $g \in L^p(\mathbb{R}^M)$, $1 \le p \le \infty$, then $f * g \in L^p(\mathbb{R}^M)$ and $\|f * g\|_p \le \|f\|_1 \|g\|_p$.
(ii) If $g \in L^\infty(\mathbb{R}^M)$, then $f * g$ is bounded and uniformly continuous on $\mathbb{R}^M$.
(iii) If $g \in L^p(\mathbb{R}^M)$ and $\frac{\partial g}{\partial x_i} \in L^p(\mathbb{R}^M)$, then $\frac{\partial}{\partial x_i}(f * g) = f * \frac{\partial g}{\partial x_i}$ a.e. in $\mathbb{R}^M$.
(iv) If $\varphi$ is a nonnegative, measurable function with $\int_{\mathbb{R}^M} \varphi(x) \, dx = 1$ (the so-called mollifier) and $\varphi_n(x) := n^M \varphi(nx)$, then $\varphi_n * g$ converge to $g$ in the $L^p$-norm for any $g \in L^p(\mathbb{R}^M)$, $1 \le p < \infty$.

If $T$ is a topological space and $f : T \to \mathbb{R}$ ($\mathbb{C}$), then the support of $f$ (abbreviation $\operatorname{supp} f$) is the set $\overline{\{x \in T : f(x) \ne 0\}}$. If $\Omega \subset \mathbb{R}^M$ is an open set, then $\mathcal{D}(\Omega)$ denotes the set of all infinitely differentiable functions on $\Omega$ (i.e., their derivatives of arbitrary order are continuous in $\Omega$) which have compact support lying in $\Omega$.

23 Roughly speaking, the Luzin Theorem says that a bounded measurable function is continuous with respect to sets, measures of which are arbitrarily close to the measure of $\Omega$ provided the latter is finite. For a more general formulation and the proof of the Luzin Theorem the reader can consult, e.g., Rudin [113, § 2.23].

We show that $\mathcal{D}(\Omega)$ contains enough functions. Put

\[ \omega(x) = \begin{cases} e^{-\frac{1}{1 - x^2}}, & |x| < 1, \\ 0, & |x| \ge 1. \end{cases} \]

It is a matter of simple calculation to prove that $\omega \in \mathcal{D}(\mathbb{R})$. If $a \in \Omega$, then $B(a; \delta) \subset \Omega$ for a $\delta > 0$ small enough and the function $\varphi(y) := \omega\bigl( \frac{2}{\delta} \|y - a\|_{\mathbb{R}^M} \bigr)$ belongs to $\mathcal{D}(\Omega)$. However, much more is true.

Proposition 1.2.21. Let $\Omega$ be an open set in $\mathbb{R}^M$ and let $p \in [1, \infty)$. Then $\mathcal{D}(\Omega)$ is dense in $L^p(\Omega)$.

Proof. The just defined function $\varphi$ multiplied by an appropriate constant satisfies the assumptions of Proposition 1.2.20(iv). There is a strictly increasing sequence of compact subsets $C_m$ of $\Omega$ such that

\[ \bigcup_{m=1}^{\infty} C_m = \Omega. \]

Extend $f \in L^p(\Omega)$ by zero outside $\Omega$ and put $f_m = \chi_m f$ where $\chi_m$ is the characteristic function of the set $C_m$. Then $f_m \to f$ in the $L^p$ norm. By Proposition 1.2.20, $\varphi_n * f_m \in \mathcal{D}(\Omega)$ for $n \ge n_m$ and

\[ \|\varphi_n * f_m - f\|_p \le \|\varphi_n * (f_m - f)\|_p + \|\varphi_n * f - f\|_p \le \|f_m - f\|_p + \|\varphi_n * f - f\|_p. \]

The result follows from Proposition 1.2.20(iv).

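Remark 1.2.22. If $\operatorname{meas} \Omega < \infty$ and $1 \le \tilde{p} < p \le \infty$, then, by the Hölder inequality,

\[ \|f\|_{\tilde{p}} \le (\operatorname{meas} \Omega)^{\frac{1}{\tilde{p}} - \frac{1}{p}} \|f\|_p, \quad f \in L^p(\Omega). \tag{1.2.7} \]

This means that the identity map of $L^p(\Omega)$ into $L^{\tilde{p}}(\Omega)$ is continuous. We will denote this fact by $L^p(\Omega) \subset L^{\tilde{p}}(\Omega)$ and say that $L^p(\Omega)$ is continuously embedded into $L^{\tilde{p}}(\Omega)$.

Warning. Simple examples show that this is not true if $\operatorname{meas} \Omega = \infty$!

The following assertion is an analogue of Theorem 1.2.13.

Proposition 1.2.23 (A.N. Kolmogorov). Let $\Omega$ be an open set in $\mathbb{R}^M$. Then $\mathcal{M} \subset L^p(\Omega)$, $p \in [1, \infty)$, is relatively compact if and only if the following conditions are satisfied:
(i) $\mathcal{M}$ is bounded in $L^p(\Omega)$,
(ii) $\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall f \in \mathcal{M}$: $\int_\Omega |f(x + y) - f(x)|^p \, dx < \varepsilon$ for all $\|y\|_{\mathbb{R}^M} < \delta$,24
(iii) $\forall \varepsilon > 0 \ \exists \eta > 0 \ \forall f \in \mathcal{M}$: $\int_{\{x \in \Omega : \|x\|_{\mathbb{R}^M} \ge \eta\}} |f(x)|^p \, dx < \varepsilon$.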

Proof. For the proof based on Proposition 1.2.3 see Yosida [135, Chapter 10, § 1].

Remark 1.2.24. All results from 1.2.16–1.2.23 also hold in spaces of sequences

\[ l^p := \Bigl\{ x = \{x_n\}_{n=1}^{\infty} : \|x\|_p := \Bigl( \sum_{n=1}^{\infty} |x_n|^p \Bigr)^{\frac{1}{p}} < \infty \Bigr\} \]

which can be regarded as $L^p(\mathbb{N})$ equipped with the counting measure $\mu$ ($\mu(A) = \operatorname{card} A$).

Example 1.2.25 (spaces of differentiable functions). We can consider either classical derivatives (defined as limits of relative differences) or weak derivatives. We start with the former case. Let $\alpha = (\alpha_1, \dots, \alpha_M)$ be a multiindex, i.e., $\alpha_i \in \mathbb{N} \cup \{0\}$, $i = 1, \dots, M$, and $|\alpha| := \alpha_1 + \dots + \alpha_M$. For a function $f$ on an open set $\Omega \subset \mathbb{R}^M$ we put

\[ D^\alpha f(x) := \frac{\partial^{|\alpha|} f(x)}{\partial x_1^{\alpha_1} \cdots \partial x_M^{\alpha_M}} \]

and say that $f \in C^n(\Omega)$ if $D^\alpha f$ are continuous for all multiindices $\alpha$ for which $|\alpha| \le n$. We can use the metric $\varrho$ given by (1.2.2) to define $\varrho_\alpha(f, g) := \varrho(D^\alpha f, D^\alpha g)$ for a multiindex $\alpha$ and set

\[ \varrho_n(f, g) := \sum_{|\alpha| \le n} \varrho_\alpha(f, g). \]

Then $\varrho_n$ is a metric on $C^n(\Omega)$ and the convergence in this metric is the locally uniform convergence of all derivatives $D^\alpha$, $0 \le |\alpha| \le n$ ($D^o f = f$).

Another possibility is to consider only such functions $f \in C^n(\Omega)$ for which $D^\alpha f$ is bounded in $\Omega$ for all $0 \le |\alpha| \le n$. We denote the collection of such functions by $C^n(\overline{\Omega})$25 and put

\[ \|f\|_{C^n(\overline{\Omega})} := \sum_{|\alpha| \le n} \sup_{x \in \Omega} |D^\alpha f(x)|. \]

This is a norm, $C^n(\overline{\Omega})$ is a Banach space, and the convergence of a sequence $\{f_k\}_{k=1}^{\infty} \subset C^n(\overline{\Omega})$ to $f$ in this norm means that

\[ D^\alpha f_k \rightrightarrows D^\alpha f \quad \text{uniformly on } \Omega \quad \text{for all } |\alpha| \le n. \]

24 If $x + y \notin \Omega$, then we set $f(x + y) := 0$.

25 In connection with this notation observe that for a relatively compact set $\Omega$ all derivatives $D^\alpha f$, $|\alpha| \le n - 1$, are uniformly continuous, and therefore continuously extendable to $\overline{\Omega}$.

It is sometimes convenient to have a finer scale of spaces of differentiable functions. We can achieve that by introducing the Hölder continuous functions: A function $f : \Omega \to \mathbb{R}$ (or $\mathbb{C}$) is called $\gamma$-Hölder continuous ($0 < \gamma \le 1$) if there is a constant $c$ such that the inequality

\[ |f(x) - f(y)| \le c \|x - y\|^\gamma \]

holds for all $x, y \in \Omega$.26 The quantity

\[ \|f\|_{C^{0,\gamma}(\overline{\Omega})} := \sup_{x \in \Omega} |f(x)| + \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|f(x) - f(y)|}{\|x - y\|^\gamma} \]

is a norm on the space $C^{0,\gamma}(\overline{\Omega})$ of $\gamma$-Hölder continuous, bounded functions on $\Omega$. The space $C^{n,\gamma}(\overline{\Omega})$ is defined similarly. We note that $C^{n,\gamma}(\overline{\Omega})$ is a Banach space with respect to the above norm (cf. Exercise 7.1.4).

Now we turn our attention to weak derivatives on an open set $\Omega \subset \mathbb{R}^M$. Let $f \in L^1_{\mathrm{loc}}(\Omega)$ (this means that $f \in L^1(K)$ for every compact subset $K \subset \Omega$), and let $\alpha$ be a multiindex. A function $g \in L^1_{\mathrm{loc}}(\Omega)$ is called an $\alpha$-weak derivative of $f$ if

\[ \int_\Omega f(x) D^\alpha \varphi(x) \, dx = (-1)^{|\alpha|} \int_\Omega g(x) \varphi(x) \, dx \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \tag{1.2.8} \]

We will denote $g = D^\alpha_w f$ and omit $w$ when there is no danger of ambiguity.

Warning. Even in the one-dimensional case the ordinary derivative existing almost everywhere need not be the weak derivative! For example, the Heaviside function

\[ H(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases} \quad \text{satisfies} \quad H'(x) = 0 \quad \text{for } x \in \mathbb{R} \setminus \{0\} \]

but the weak derivative does not exist. The distributional derivative of $H$27 is the Dirac measure.

We note that an absolutely continuous function $f$ on an interval $I \subset \mathbb{R}$ has a derivative a.e. (the Lebesgue Theorem, see Rudin [113]), and

\[ f(x) = f(a) + \int_a^x f'(y) \, dy \quad \text{for } a, x \in I. \]

This implies that $D_w f = f'$. The situation in higher dimensions is not so simple since there are several non-equivalent definitions of absolutely continuous functions.

Having a definition of weak derivatives we can define Sobolev spaces $W^{k,p}(\Omega)$ for an open set $\Omega \subset \mathbb{R}^M$ as follows:

\[ W^{k,p}(\Omega) := \{ f \in L^p(\Omega) : \text{derivatives } D^\alpha_w f \text{ exist and belong to } L^p(\Omega) \text{ for all } |\alpha| \le k \} \]

with the norm

\[ \|f\|_{W^{k,p}(\Omega)} := \sum_{|\alpha| \le k} \|D^\alpha_w f\|_p.28 \tag{1.2.9} \]

Similarly to the definition of $L^p$ spaces, classes of functions are considered here. Since $L^p(\Omega)$ is a Banach space, $W^{k,p}(\Omega)$ is a Banach space, too.

As we will see later in this book, Sobolev spaces play an important role in the study of boundary value problems. For this purpose the following assertions are important.

Theorem 1.2.26 (Sobolev Embedding Theorem). Let $k \in \mathbb{N}$ and let $p \in [1, \infty)$.
(i) If $k < \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset L^{p^*}(\mathbb{R}^N)$ for $\frac{1}{p^*} = \frac{1}{p} - \frac{k}{N}$.29
(ii) If $k = \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset L^r(\mathbb{R}^N)$ for all $r \in [p, \infty)$ and $W^{k,p}(\mathbb{R}^N) \subset L^r_{\mathrm{loc}}(\mathbb{R}^N)$ for all $r \ge 1$.
(iii) If $k > \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset C^{0,\gamma}(\overline{\mathbb{R}^N})$ for all $0 \le \gamma < k - \frac{N}{p}$.30

26 If $\gamma = 1$, then it is more common to say that $f$ is a Lipschitz continuous function. We note that the inequality is satisfied for a $\gamma > 1$ only if $f$ is a constant function (cf. Exercise 7.1.6).

27 A linear form $\Phi$ on the linear space $\mathcal{D}(\Omega)$ is called a distribution (this notion is due to L. Schwartz) if it has the following continuity property: If $\varphi_n \in \mathcal{D}(\Omega)$ have their supports in the same compact set $K \subset \Omega$ and $D^\alpha \varphi_n \rightrightarrows D^\alpha \varphi$ uniformly on $\Omega$ for all multiindices $\alpha$, then $\Phi(\varphi_n) \to \Phi(\varphi)$. Note that any $f \in L^1_{\mathrm{loc}}(\Omega)$ (and even a regular Borel measure on $\Omega$, see, e.g., Rudin [113]) yields a distribution $\Phi_f$ by the formula $\Phi_f(\varphi) = \int_\Omega f(x) \varphi(x) \, dx$ for any $\varphi \in \mathcal{D}(\Omega)$. The distributional derivative $D^\alpha$ of a distribution $\Phi$ is defined as $D^\alpha \Phi(\varphi) := (-1)^{|\alpha|} \Phi(D^\alpha \varphi)$, $\varphi \in \mathcal{D}(\Omega)$. It is easy to prove that $D^\alpha \Phi$ is again a distribution, and an $\alpha$-weak derivative of $f \in L^1_{\mathrm{loc}}(\Omega)$ is actually equal to the distributional derivative $D^\alpha \Phi_f$. As the Heaviside function shows, the converse is not true.

28 Similarly as for the Lebesgue norm we will use in the sequel the notation $\|\cdot\|_{k,p}$ instead of $\|\cdot\|_{W^{k,p}(\Omega)}$ for the Sobolev norm.

29 The exponent $p^* := \frac{pN}{N - kp}$ is sometimes called the critical Sobolev exponent.

30 This means that any function $f \in W^{k,p}(\mathbb{R}^N)$ can be changed on a set of measure zero in such a way that the new function $\tilde{f}$ is $\gamma$-Hölder continuous and $\|\tilde{f}\|_{C^{0,\gamma}(\overline{\mathbb{R}^N})} \le c \|f\|_{W^{k,p}(\mathbb{R}^N)}$. The symbol $\overline{\mathbb{R}^N}$ means that functions from $C^{0,\gamma}(\overline{\mathbb{R}^N})$ are bounded and uniformly $\gamma$-Hölder continuous on the whole $\mathbb{R}^N$.

Proof. Proofs of these statements are quite involved and also have a long history. The interested reader can consult, e.g., Adams [2], Kufner, John & Fučík [82], Maz'ja [93], Stein [123, Chapters V, VI]. For a readable account of Sobolev spaces we recommend Evans [48, Chapter 5]. Spaces with fractional derivatives which extend the class of Sobolev spaces can be also defined, e.g., Triebel [128], [129].

Remark 1.2.27. The situation for an open set $\Omega$ with a nonempty boundary $\partial\Omega$ (in particular, for a bounded $\Omega$) is even more complicated because some techniques from harmonic analysis, like Fourier transform, are not available. One possibility is to extend $f \in W^{k,p}(\Omega)$ to a function $\tilde{f} \in W^{k,p}(\mathbb{R}^N)$. This is possible if the boundary $\partial\Omega$ possesses certain smoothness properties. To explain this more precisely we would need some facts about manifolds (see Section 4.3 and Appendix 4.3A). So we omit details and just state that Theorem 1.2.26 is true provided $\partial\Omega$ is locally Lipschitz (see Section 7.3 for details).

Theorem 1.2.28 (Rellich–Kondrachov). Let $\Omega$ be a bounded open set in $\mathbb{R}^N$ with a locally Lipschitz boundary, $k \in \mathbb{N}$, $p \in [1, \infty)$.
(i) Let $k < \frac{N}{p}$ and $q \in [1, p^*)$ where

\[ p^* := \frac{pN}{N - kp}. \tag{1.2.10} \]

Then the embedding $W^{k,p}(\Omega)$ into $L^q(\Omega)$ is compact.31
(ii) If $k = \frac{N}{p}$, then $W^{k,p}(\Omega) \subset\subset L^q(\Omega)$ for all $q \in [1, \infty)$.
(iii) If $0 \le \gamma < k - \frac{N}{p}$, then $W^{k,p}(\Omega) \subset\subset C^{0,\gamma}(\overline{\Omega})$.

Proof. For the proof see references given above.
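The exponent $p^*$ from (1.2.10) is a simple rational expression in $N$, $k$, $p$, and it may help to evaluate it in a few concrete cases. The helper below is a hypothetical sketch (the name `critical_sobolev_exponent` is ours); it uses exact fractions and refuses the cases $kp \ge N$, where the formula does not apply.

```python
from fractions import Fraction

def critical_sobolev_exponent(N, k, p):
    """Return p* = pN / (N - kp) in the case k < N/p.

    Raises ValueError when k*p >= N, where formula (1.2.10) is not used.
    """
    N, k, p = Fraction(N), Fraction(k), Fraction(p)
    if k * p >= N:
        raise ValueError("p* is only defined for k < N/p")
    return p * N / (N - k * p)

# W^{1,2} in dimension N = 3.
p_star = critical_sobolev_exponent(3, 1, 2)
print(p_star)  # 6

# Consistency with 1/p* = 1/p - k/N from Theorem 1.2.26(i).
assert 1 / p_star == Fraction(1, 2) - Fraction(1, 3)
```

For instance, with $N = 3$, $k = 1$, $p = 2$ the compact embedding of Theorem 1.2.28(i) is available for every $q \in [1, 6)$.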

Now, we turn our attention to abstract spaces. Proposition 1.2.15 has pointed out the difference between finite dimensional spaces and (infinite dimensional) function spaces. Another difference between the finite and infinite dimension lies in the notion of a basis. It can be shown that any algebraic basis in an infinite dimensional Banach space $X$ has to be uncountable, and therefore the representation of a point by its coordinates can hardly be of any use. This observation leads to the necessity of expressing an element of $X$ by an infinite sum. A sequence $\{e_n\}_{n=1}^{\infty} \subset X$ is called a Schauder basis of $X$ if for each $x \in X$ there is a (uniquely determined) sequence $\{\xi_n\}_{n=1}^{\infty}$ of numbers (real or complex according to whether $X$ is real or complex) such that

\[ x = \sum_{n=1}^{\infty} \xi_n e_n. \tag{1.2.11} \]

31 We will use the notation $\subset\subset$ for compact embeddings. An embedding of $X$ into $Y$ is compact if a ball in $X$ is relatively compact in $Y$.

There are several imperfections in this definition. Namely, there are separable32 Banach spaces which do not possess a Schauder basis. Moreover, the convergence of the sum in (1.2.11) can be understood in several non-equivalent meanings. These problems do not appear in a special class of spaces with an additional structure which is connected with the norm and allows measuring angles.

Definition 1.2.29. Let $X$ be a real (or complex) linear space. A mapping $(\cdot, \cdot)_X : X \times X \to \mathbb{R}$ (or $\mathbb{C}$) is called a scalar product on $X$ if the following conditions are satisfied:
(1) for any $y \in X$ the mapping $x \mapsto (x, y)_X$ is linear;
(2) $(x, y)_X = (y, x)_X$ for all $x, y \in X$ in the real case and $(x, y)_X = \overline{(y, x)_X}$ in the complex case;
(3) $(x, x)_X \ge 0$ for every $x \in X$ and $(x, x)_X = 0$ if and only if $x = o$.

Proposition 1.2.30. Let $(\cdot, \cdot)$ be a scalar product on a linear space $X$. Then
(i) the so-called Schwarz inequality

\[ |(x, y)|^2 \le (x, x)(y, y) \quad \text{holds for all } x, y \in X; \tag{1.2.12} \]

(ii) the mapping $\|\cdot\| : x \mapsto [(x, x)]^{\frac{1}{2}}$ is a norm on $X$.

Proof. Assertion (i). For $x, y \in X$ there exists $c \in \mathbb{C}$, $|c| = 1$, such that for $\hat{y} = cy$ we have $(x, \hat{y}) \in \mathbb{R}$. Hence it suffices to prove (1.2.12) for the real space $X$. For any $\alpha \in \mathbb{R}$ we have

\[ 0 \le (x + \alpha y, x + \alpha y) = (x, x) + 2\alpha (x, y) + |\alpha|^2 (y, y), \]

i.e., the discriminant $4|(x, y)|^2 - 4(x, x)(y, y)$ is nonpositive. Hence (1.2.12) follows. In assertion (ii), only the triangle inequality has to be checked. For $x, y \in X$ we get33

\[ \|x + y\|^2 = (x + y, x + y) = (x, x) + 2 \operatorname{Re}(x, y) + (y, y) \le \|x\|^2 + 2|(x, y)| + \|y\|^2 \]

and the Schwarz inequality completes the proof.

If $X$ is a linear space with a scalar product we will always consider the norm on $X$ induced by this scalar product. If $X$ is complete with respect to this norm, then $X$ is called a Hilbert space and will be usually denoted by $H$. We note that if $X$ is not complete there exists a Hilbert space $\tilde{H}$ which is a completion of $X$.

32 If a space $X$ has a Schauder basis, then $X$ is separable. This is not a serious drawback since most function spaces used in analysis are separable.

33 Notice here a typical procedure with the norm induced by a scalar product, namely using the second power of the norm in calculation.

Example 1.2.31.
(i) $\mathbb{R}^M$ with the scalar product

\[ (x, y) = \sum_{i=1}^{M} \xi_i \eta_i, \quad x = \sum_{i=1}^{M} \xi_i e_i, \quad y = \sum_{i=1}^{M} \eta_i e_i, \]

($e_1, \dots, e_M$ the standard basis) is a Hilbert space. Similarly, $\mathbb{C}^M$ is a Hilbert space with respect to the scalar product $(x, y) = \sum_{i=1}^{M} \xi_i \overline{\eta_i}$.

(ii) The norm on $L^2(\Omega)$ given by (1.2.3) is induced by the scalar product

\[ (f, g)_{L^2(\Omega)} = \int_\Omega f(x) \overline{g(x)} \, dx \tag{1.2.13} \]

(in the complex case). Similarly, for $p = 2$ the norm (1.2.9) is equivalent to the norm induced by the scalar product

\[ (f, g)_{W^{k,2}(\Omega)} = \sum_{|\alpha| \le k} (D^\alpha f, D^\alpha g)_{L^2(\Omega)}. \]

(iii) The "sup norm" on $BC(\Omega)$ is not induced by any scalar product. This can be seen from the parallelogram identity

\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2, \quad x, y \in X \tag{1.2.14} \]

which is valid only in such a space $X$ the norm of which is induced by a scalar product. Indeed, if a norm satisfies (1.2.14), then (in the real case)

\[ (x, y) = \frac{1}{4} \bigl( \|x + y\|^2 - \|x - y\|^2 \bigr) \tag{1.2.15} \]

(polarization identity) has all properties of a scalar product, and the induced norm coincides with $\|\cdot\|$. It is not difficult to show that the "sup norm" does not satisfy (1.2.14). Even more is true, namely, the "sup norm" is not equivalent to any norm on $BC(\Omega)$ induced by a scalar product. Since $C[0, 1] \subset L^2(0, 1)$, the scalar product (1.2.13) is also a scalar product on $C[0, 1]$. But the space $C[0, 1]$ is not complete in the $L^2$-norm and, therefore, the $L^2$-norm on $C[0, 1]$ cannot be equivalent to the "sup norm"; only the inequality $\|f\|_{L^2(0,1)} \le \|f\|_{C[0,1]}$ holds. Observe that $L^2(0, 1)$ is a completion of $C[0, 1]$ with respect to the integral norm given by (1.2.3).

The most useful concept in spaces with a scalar product is the following one.

Definition 1.2.32. Let $X$ be a linear space with a scalar product $(\cdot, \cdot)$.
(1) Subsets $A, B \subset X$ are said to be orthogonal (and denoted by $A \perp B$) if $(a, b) = 0$ for every $a \in A$, $b \in B$.
(2) A system $\{x_\gamma\}_{\gamma \in \Gamma} \subset X$ is said to be orthonormal if

\[ (x_\gamma, x_{\tilde{\gamma}}) = \begin{cases} 0, & \gamma \ne \tilde{\gamma}, \\ 1, & \gamma = \tilde{\gamma}. \end{cases} \]

(3) A sequence $\{e_n\}_{n=1}^{\infty} \subset X$ is called an orthonormal basis of $X$ if $\{e_n\}_{n=1}^{\infty}$ is both an orthonormal system and a Schauder basis of $X$.

Suppose that $x_1, \dots, x_n$ are linearly independent elements of a space $X$ with a scalar product $(\cdot, \cdot)$. Put $e_1 = \frac{x_1}{\|x_1\|}$ and if orthonormal elements $e_1, \dots, e_k$ ($k < n$) are constructed in such a way that $\operatorname{Lin}\{x_1, \dots, x_k\} = \operatorname{Lin}\{e_1, \dots, e_k\}$, then define

\[ y_{k+1} = x_{k+1} - \sum_{j=1}^{k} (x_{k+1}, e_j) e_j, \qquad e_{k+1} = \frac{y_{k+1}}{\|y_{k+1}\|}. \]

It is obvious that

\[ (e_j, e_{k+1}) = 0, \quad j = 1, \dots, k, \qquad \|e_{k+1}\| = 1, \]

and $\operatorname{Lin}\{x_1, \dots, x_{k+1}\} = \operatorname{Lin}\{e_1, \dots, e_{k+1}\}$. This procedure is called the Schmidt orthogonalization. For any $x \in Y := \operatorname{Lin}\{x_1, \dots, x_n\}$ we have $x = \sum_{k=1}^{n} \alpha_k e_k$. Taking the scalar product with $e_j$, we get

\[ (x, e_j) = \sum_{k=1}^{n} \alpha_k (e_k, e_j) = \alpha_j, \]

and also

\[ \|x\|^2 = \Bigl( \sum_{j=1}^{n} (x, e_j) e_j, \sum_{k=1}^{n} (x, e_k) e_k \Bigr) = \sum_{k=1}^{n} |(x, e_k)|^2. \]
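The Schmidt orthogonalization translates directly into a short program. The following is a minimal sketch for $\mathbb{R}^M$ with the standard scalar product (the function names `dot` and `schmidt_orthogonalization`, the tolerance $10^{-12}$ and the sample vectors are our own assumptions); it follows the recursion $y_{k+1} = x_{k+1} - \sum_j (x_{k+1}, e_j) e_j$, $e_{k+1} = y_{k+1} / \|y_{k+1}\|$ literally and makes no attempt at numerical robustness.

```python
import math

def dot(x, y):
    """Standard scalar product on R^M."""
    return sum(a * b for a, b in zip(x, y))

def schmidt_orthogonalization(vectors):
    """Orthonormalize linearly independent vectors of R^M in the given order."""
    basis = []
    for x in vectors:
        # y_{k+1} = x_{k+1} - sum_j (x_{k+1}, e_j) e_j
        y = list(x)
        for e in basis:
            c = dot(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm_y = math.sqrt(dot(y, y))
        if norm_y < 1e-12:
            raise ValueError("vectors are (numerically) linearly dependent")
        # e_{k+1} = y_{k+1} / ||y_{k+1}||
        basis.append([yi / norm_y for yi in y])
    return basis

e = schmidt_orthogonalization([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Orthonormality: (e_i, e_j) = 1 if i = j and 0 otherwise.
for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(e[i], e[j]) - expected) < 1e-12

# ||x||^2 = sum_k |(x, e_k)|^2 for x in Lin{e_1, e_2, e_3} = R^3.
x = [2.0, -1.0, 5.0]
assert abs(dot(x, x) - sum(dot(x, ek) ** 2 for ek in e)) < 1e-9
```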


Assume now that $X \neq Y$ and let us look for an approximation of a $y \in X \setminus Y$ by an element $x = \sum_{j=1}^{n} \alpha_j e_j$:
$$
\begin{aligned}
\|y - x\|^2 &= \Bigl( y - \sum_{j=1}^{n} \alpha_j e_j,\ y - \sum_{j=1}^{n} \alpha_j e_j \Bigr) \\
&= \|y\|^2 - \sum_{j=1}^{n} \overline{\alpha_j}\,(y, e_j) - \sum_{j=1}^{n} \alpha_j\, \overline{(y, e_j)} + \sum_{j=1}^{n} |\alpha_j|^2 \\
&= \|y\|^2 + \sum_{j=1}^{n} |\alpha_j - (y, e_j)|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 \\
&\ge \|y\|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 = \Bigl\| y - \sum_{j=1}^{n} (y, e_j)\, e_j \Bigr\|^2.
\end{aligned}
\tag{1.2.16}
$$
Two consequences follow from this inequality. First, the best approximation of $y \in X$ by an element of $Y$ is
$$P_n y := \sum_{j=1}^{n} (y, e_j)\, e_j.^{34}$$
Observe also that $(y - P_n y) \perp Y$. Second,
$$\sum_{j=1}^{n} |(y, e_j)|^2 \le \|y\|^2 \quad \text{for all } y \in X.$$
Since $n$ is arbitrary (in an infinite dimensional space) we have obtained the so-called Bessel inequality: If $\{e_n\}_{n=1}^{\infty}$ is an orthonormal system in $X$, then
$$\sum_{n=1}^{\infty} |(y, e_n)|^2 \le \|y\|^2 \quad \text{for all } y \in X. \tag{1.2.17}$$
In particular, the sum $\sum_{j=1}^{\infty} |(y, e_j)|^2$ is always convergent.

$^{34}$ We note that this result, namely the linearity of the operator $P_n$ of the best approximation, is typical for spaces with scalar products. In a general normed linear space $X$ and a finite dimensional subspace $Y$ the best approximation of an arbitrary $x \in X$ by elements of $Y$ exists (by a compactness argument) but a special property of the norm is needed for the uniqueness of the best approximation. Linearity of the best approximation on all subspaces of dimension 2 implies that the norm is induced by a scalar product. More details can be found in the monograph of Singer [120].
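As a small numerical illustration of (1.2.16) and of the Bessel inequality, the sketch below works in $\mathbb{R}^3$ with a two-element orthonormal system. The space, the system `e`, the vector `y` and all helper names are assumptions made only for this example.

```python
import math

def inner(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def combine(coeffs, system):
    """Return sum_j coeffs[j] * system[j]."""
    n = len(system[0])
    return [sum(c * e[i] for c, e in zip(coeffs, system)) for i in range(n)]

def dist(u, v):
    """Euclidean distance ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Orthonormal system spanning a plane Y in R^3 (illustrative choice).
s = 1.0 / math.sqrt(2.0)
e = [[s, s, 0.0], [s, -s, 0.0]]
y = [1.0, 2.0, 3.0]

fourier = [inner(y, ej) for ej in e]       # alpha_j = (y, e_j)
Py = combine(fourier, e)                   # P_n y, the best approximation in Y
residual = [a - b for a, b in zip(y, Py)]  # y - P_n y, orthogonal to Y

bessel_lhs = sum(c * c for c in fourier)   # sum_j |(y, e_j)|^2
bessel_rhs = inner(y, y)                   # ||y||^2

# Any other choice of coefficients gives a strictly larger error.
other = combine([fourier[0] + 0.3, fourier[1] - 0.1], e)
```

Comparing `dist(y, Py)` with `dist(y, other)` shows the minimizing property of the coefficients $(y, e_j)$, and `bessel_lhs <= bessel_rhs` is the finite form of (1.2.17).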


Proposition 1.2.33. Let $X$ be a linear space with a scalar product, let $X$ be separable.$^{35}$ Then there exists an orthonormal basis in $X$.

Proof. Let $\{x_1, x_2, x_3, \dots\}$ be a dense set in $X$. Put
$$Y_n = \operatorname{Lin}\{x_1, \dots, x_n\}, \qquad Y = \bigcup_{n=1}^{\infty} Y_n.$$
Then $\overline{Y} = X$. By omitting linearly dependent elements we can assume that $\dim Y_n = n$. According to the Schmidt orthogonalization there exists an orthonormal sequence $\{e_n\}_{n=1}^{\infty}$ such that $Y_n = \operatorname{Lin}\{e_1, \dots, e_n\}$. Let $x \in X$ and let $\{y_n\}_{n=1}^{\infty}$ be a sequence such that $y_n \in Y_n$ and $\lim_{n\to\infty} y_n = x$ (the density of $Y$ in $X$). By the inequality (1.2.16),
$$\|x - y_n\| \ge \Bigl\| x - \sum_{j=1}^{n} (x, e_j)\, e_j \Bigr\|.$$
This means that
$$x = \sum_{j=1}^{\infty} (x, e_j)\, e_j.$$
To prove uniqueness, suppose that $x = \sum_{j=1}^{\infty} \alpha_j e_j$. Since the scalar product is continuous, we have
$$(x, e_k) = \lim_{n\to\infty} \Bigl( \sum_{j=1}^{n} \alpha_j e_j,\ e_k \Bigr) = \alpha_k.$$

In order to obtain some useful properties which guarantee that an orthonormal sequence is a basis we need to use completeness. We start with a general approximation result.

Theorem 1.2.34. Let $H$ be a Hilbert space and let $C$ be a closed convex subset of $H$. Then for any $x \in H$ there exists a unique $y \in C$ such that
$$\|x - y\| = \inf\{\|x - z\| : z \in C\}. \tag{1.2.18}$$
This best approximation $y$ is characterized by the following property: $y \in C$ and
$$\operatorname{Re}(x - y, y - z) \ge 0 \quad \text{for all } z \in C \tag{1.2.19}$$
(see Figure 1.2.2$^{36}$).

$^{35}$ The assumption on separability is redundant. Without separability an orthonormal basis $\{e_\gamma\}_{\gamma\in\Gamma}$ still exists but $\Gamma$ is uncountable. Moreover, if $x \in X$, then $(x, e_\gamma) = 0$ for all but countably many $\gamma$.

$^{36}$ For $A \subset H$ we denote $A^{\perp} := \{x \in H : a \in A \implies (a, x) = 0\}$.


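[Figure 1.2.2. The best approximation $y$ of $x$ in the closed convex set $C$: the vectors $x - y$ and $y - z$ for $z \in C$, the origin $o$, and the sets $\{x - y\}^{\perp}$ and $y + \{x - y\}^{\perp}$.]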

Proof. Step 1 (Existence). Denote the right-hand side in (1.2.18) by $d$. If $d = 0$, then $x \in C$ ($C$ is closed) and $y = x$. Suppose that $d > 0$. Then there are $\{z_n\}_{n=1}^{\infty} \subset C$ such that
$$d \le \|x - z_n\| < d + \frac{1}{n}.$$
By (1.2.14) we get
$$
\begin{aligned}
\|z_n - z_m\|^2 &= \|x - z_m - (x - z_n)\|^2 \\
&= 2\bigl(\|x - z_m\|^2 + \|x - z_n\|^2\bigr) - 4 \Bigl\| x - \frac{z_n + z_m}{2} \Bigr\|^2 \\
&< 2\Bigl(d + \frac{1}{n}\Bigr)^2 + 2\Bigl(d + \frac{1}{m}\Bigr)^2 - 4d^2
\end{aligned}
$$
(notice that $\frac{z_n + z_m}{2} \in C$ since $C$ is convex). This implies that $\{z_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and therefore it is convergent to a $y \in C$. Obviously, $\|x - y\| = d$.

Step 2 (Uniqueness). Assume that $\|x - y\| = \|x - \tilde y\| = d$ for $y, \tilde y \in C$. Using (1.2.14) as above we get $y = \tilde y$.

Step 3 (Characterization). Let $y$ be the best approximation of $x$ and let $z \in C$. Then $z_t := tz + (1 - t)y \in C$ for $t \in (0, 1)$ ($C$ is convex) and
$$\|x - z_t\|^2 = \|x - y + t(y - z)\|^2 = \|x - y\|^2 + t^2 \|y - z\|^2 + 2t \operatorname{Re}(x - y, y - z) \ge \|x - y\|^2,$$
i.e.,
$$t\|y - z\|^2 + 2 \operatorname{Re}(x - y, y - z) \ge 0,$$
and taking the limit for $t \to 0+$, the inequality (1.2.19) follows. If (1.2.19) is satisfied, then
$$\|x - z\|^2 = \|x - y + y - z\|^2 = \|x - y\|^2 + \|y - z\|^2 + 2 \operatorname{Re}(x - y, y - z) \ge \|x - y\|^2,$$
and therefore $y$ is the best approximation of $x$.
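The characterization (1.2.19) can be tested numerically in a simple case. The sketch below assumes $H = \mathbb{R}^3$ with the Euclidean scalar product and takes $C$ to be the unit cube $[0,1]^3$, for which the best approximation is obtained by clamping each coordinate; the function `project_box`, the point `x` and the random sampling are illustrative choices and do not come from the text.

```python
import random

def inner(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project_box(x, lo=0.0, hi=1.0):
    """Nearest point of the box [lo, hi]^n to x (coordinatewise clamping)."""
    return [min(max(xi, lo), hi) for xi in x]

x = [1.7, -0.4, 0.3]
y = project_box(x)  # candidate best approximation of x in C = [0, 1]^3

# Sample points z in C and record the smallest value of (x - y, y - z);
# by (1.2.19) it should never be negative.
rng = random.Random(0)
worst = float("inf")
for _ in range(1000):
    z = [rng.random() for _ in x]
    lhs = inner([a - b for a, b in zip(x, y)], [b - c for b, c in zip(y, z)])
    worst = min(worst, lhs)
```

A finite sample of course proves nothing; it only illustrates that the inequality in (1.2.19) holds at every tested point of $C$.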

Corollary 1.2.35. Let $H$ be a Hilbert space and $M$ a closed linear subspace of $H$, $M \neq H$, $M \neq \{o\}$. Then there exists a unique subspace $M^{\perp}$ (the so-called orthogonal complement of $M$) such that
$$H = M \oplus M^{\perp}, \qquad M \perp M^{\perp}.$$
Moreover, if $P$ denotes the projection to $M$ given by this direct sum$^{37}$ (the so-called orthogonal projection), then $P \in L(H)$, $\|P\|_{L(H)} = 1$ and
$$(Px, y) = (x, Py) \quad \text{for all } x, y \in H.$$

Proof. A closed linear subspace $M$ is a closed convex set. Denote by $Px \in M$ the best approximation of $x \in H$ in $M$. Choose $w \in M$ and put $z = Px - w \in M$. By (1.2.19) we get $\operatorname{Re}(x - Px, w) \ge 0$ and also $\operatorname{Re}(x - Px, iw) \ge 0$ (by taking $z = Px - iw$), i.e., both the real and the imaginary part of $(x - Px, w)$ are nonnegative. Since also $-w \in M$, we finally have
$$(x - Px, w) = 0 \quad \text{for all } w \in M. \tag{1.2.20}$$
It is easy to see that (1.2.20) is a characterization of $Px$. By using (1.2.20) for $\alpha x$, $x_1 + x_2$, we see that $P$ is a linear operator. Since $P^2 = P$, $P$ is a projection onto $M$. The identity (1.2.20) also shows that
$$\operatorname{Ker} P = M^{\perp} := \{y \in H : x \in M \implies (x, y) = 0\}.$$
By the orthogonality of $Px$ and $x - Px$ we have
$$\|x\|^2 = \|Px\|^2 + \|x - Px\|^2 \ge \|Px\|^2, \quad \text{i.e.,} \quad \|P\|_{L(H)} \le 1.$$
Since $Px = x$ for $x \in M$, $\|P\|_{L(H)} = 1$. By (1.2.20) we get
$$(x, Py) = (x - Px + Px, Py) = (Px, Py) = (Px, Py - y + y) = (Px, y).$$

$^{37}$ Cf. Example 1.1.13(i).

Corollary 1.2.36. Let $H$ be a Hilbert space and let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal sequence in $H$. Then the following statements are equivalent:
(i) $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis;
(ii) if $(x, e_n) = 0$ for all $n$, then $x = o$;
(iii) $\operatorname{Lin}\{e_1, e_2, \dots\}$ is dense in $H$;
(iv) the Parseval equality
$$\|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2 \quad \text{is valid for all } x \in H. \tag{1.2.21}$$

Proof. The implication (i)⇒(ii) is obvious and follows from the definition of the orthonormal basis.

The implication (ii)⇒(iii): Denote $Y = \operatorname{Lin}\{e_1, e_2, \dots\}$. Assume that $Y$ is not dense, i.e., $\overline{Y} \neq H$. By Corollary 1.2.35 there exists $x \in \overline{Y}^{\perp} \setminus \{o\}$. In particular, $(x, e_n) = 0$ for all $n$, a contradiction.

The implication (iii)⇒(iv): The proof of Proposition 1.2.33 shows that the sequence
$$s_n := \sum_{k=1}^{n} (x, e_k)\, e_k$$
converges to $x$ for all $x \in H$. Moreover, $s_n \perp (x - s_n)$, and hence
$$\|x\|^2 = \|s_n\|^2 + \|x - s_n\|^2 = \sum_{k=1}^{n} |(x, e_k)|^2 + \|x - s_n\|^2.$$
By taking the limit, the Parseval equality follows.

The implication (iv)⇒(i): Let $x \in H$ be arbitrary. For $s_n$ defined as above and $m > n$ we have
$$\|s_m - s_n\|^2 = \sum_{k=n+1}^{m} |(x, e_k)|^2.$$
Since the series in (1.2.21) is convergent, the sequence $\{s_n\}_{n=1}^{\infty}$ is Cauchy, and therefore it is convergent to a $y \in H$ since $H$ is complete. Moreover, $(y, e_n) = (x, e_n)$, and by the Parseval equality
$$\|x - y\|^2 = \sum_{n=1}^{\infty} |(x - y, e_n)|^2 = 0.$$

Remark 1.2.37. Let $H$ be a Hilbert space and $\{e_n\}_{n=1}^{\infty}$ an orthonormal basis in $H$. The proof of the last implication shows that for an arbitrary sequence $\{\alpha_n\}_{n=1}^{\infty} \subset \mathbb{R}$ (or $\mathbb{C}$ depending on whether $H$ is a real or complex space) for which $\sum_{n=1}^{\infty} |\alpha_n|^2$ is convergent, the series $\sum_{n=1}^{\infty} \alpha_n e_n$ is convergent in $H$ to an $x \in H$ and $(x, e_n) = \alpha_n$. Moreover, the operator
$$U : x \in H \mapsto \{(x, e_n)\}_{n=1}^{\infty} \in l^2(\mathbb{N})\,^{38}$$
is a unitary operator (i.e., $(Ux, Uy)_{l^2(\mathbb{N})} = (x, y)$, $x, y \in H$) which is surjective. It implies also that all infinite dimensional separable Hilbert spaces over the same field of scalars are unitarily equivalent. This statement is known as the Riesz–Fischer Theorem. Having this result we can ask why not restrict our attention only to a single abstract separable Hilbert space. The reason is that in a special function space like $W^{k,2}(\Omega)$ one has more ways of computation since its elements are functions.

Example 1.2.38.
(i) The space $L^2(-\pi, \pi)$ is a Hilbert space. It is separable since continuous $2\pi$-periodic functions are dense in $L^2(-\pi, \pi)$ and any such function can be approximated by trigonometric polynomials of the type $\sum_{k=-n}^{n} a_k e^{ikt}$ (either the classical Weierstrass Approximation Theorem or Theorem 1.2.14). It is easy to see that
$$e_n : t \mapsto \frac{1}{\sqrt{2\pi}}\, e^{int}, \quad t \in (-\pi, \pi),\ n \in \mathbb{Z},$$
form an orthonormal system in $L^2(-\pi, \pi)$. By Corollary 1.2.36(iii) it is also an orthonormal basis.$^{39}$
(ii) Functions
$$H_n(t)\, e^{-\frac{t^2}{2}} \quad \text{where} \quad H_n(t) = (-1)^n e^{t^2} \frac{d^n}{dt^n}\, e^{-t^2}$$
(the so-called Hermite polynomials) form (after normalization) an orthonormal basis in $L^2(\mathbb{R})$. For the proof and relevant results in harmonic analysis we recommend the classical book Kaczmarz & Steinhaus [70]. We note that there are many different orthonormal bases in $L^2$-spaces. We will present one general method of their construction in Theorem 2.2.16.

$^{38}$ $l^2(\mathbb{N})$ is the space of all (generally complex) sequences $x = \{\xi_n\}_{n=1}^{\infty}$ such that $\sum_{n=1}^{\infty} |\xi_n|^2$ is convergent. The scalar product on $l^2(\mathbb{N})$ is given by $(x, y)_{l^2(\mathbb{N})} = \sum_{n=1}^{\infty} \xi_n \overline{\eta_n}$ for $x = \{\xi_n\}_{n=1}^{\infty}$, $y = \{\eta_n\}_{n=1}^{\infty}$ (see also Remark 1.2.24).

$^{39}$ Here this means that
$$f(t) = \sum_{n=-\infty}^{+\infty} \hat f(n)\, e^{int} \quad \text{where} \quad \hat f(n) = \frac{1}{\sqrt{2\pi}}\, (f, e_n)_{L^2(-\pi,\pi)} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, e^{-int}\, dt$$
and the series is convergent in the $L^2$-norm for arbitrary $f \in L^2(-\pi, \pi)$. It is worth noting that the series is actually a.e. convergent to $f$ but this by no means follows from the norm convergence. This result is due to L. Carleson and it is one of the most difficult and profound results in analysis.

Proposition 1.2.39. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Then a bounded set $M \subset H$ is relatively compact if and only if for any $\varepsilon > 0$ there is $k \in \mathbb{N}$ such that
$$\sum_{n=k}^{\infty} |(x, e_n)|^2 < \varepsilon \quad \text{for all } x \in M.$$

Proof. The statement follows from Proposition 1.2.3.

Theorem 1.2.40 (Riesz Representation Theorem). Let $H$ be a Hilbert space and let $F$ be a continuous linear form on $H$. Then there is a unique $f \in H$ such that
$$F(x) = (x, f) \quad \text{for all } x \in H.$$
Moreover, $\|F\| = \|f\|$ where $\|F\| = \|F\|_{L(H,\mathbb{R})}$ or $\|F\| = \|F\|_{L(H,\mathbb{C})}$ depending on whether $H$ is a real or a complex space.

Proof. Suppose that $H$ is a complex Hilbert space. If $F = o$, then $f = o$. Suppose that $F \neq o$. The idea of constructing $f$ is that $f$ has to be orthogonal to $\operatorname{Ker} F$ which is a closed subspace of $H$. By Corollary 1.2.35, $H = \operatorname{Ker} F \oplus (\operatorname{Ker} F)^{\perp}$. Take $x_0 \in (\operatorname{Ker} F)^{\perp}$, $\|x_0\| = 1$, and put $f = \alpha x_0$ where $\alpha$ will be determined later. Let $x = y + \beta x_0$, $y \in \operatorname{Ker} F$, $\beta \in \mathbb{C}$ be arbitrary. Then
$$(x, f) = \beta \overline{\alpha}, \qquad F(x) = \beta F(x_0).$$
Choose now $\alpha = \overline{F(x_0)}$. If there is another $g \in H$ such that $F(x) = (x, g)$, $x \in H$, then $0 = (x, f - g)$ for all $x \in H$, in particular, for $x = f - g$. Therefore $f = g$. By the Schwarz inequality (1.2.12) we obtain
$$|F(x)| = |(x, f)| \le \|x\| \|f\|, \quad \text{i.e.,} \quad \|F\| \le \|f\|.$$
Since $F(f) = \|f\|^2$, we have $\|F\| \ge \|f\|$. This shows that $\|F\| = \|f\|$.

The following variant of the Riesz Representation Theorem is often used in the functional analysis approach to differential equations (see, e.g., Evans [48]).

Proposition 1.2.41 (Lax–Milgram). Let $H$ be a complex Hilbert space and let $B : H \times H \to \mathbb{C}$ be a mapping with the following properties:
(i) The mapping $x \mapsto B(x, y)$ is linear for any $y \in H$.
(ii) $B(x, \alpha_1 y_1 + \alpha_2 y_2) = \overline{\alpha_1} B(x, y_1) + \overline{\alpha_2} B(x, y_2)$ for every $x, y_1, y_2 \in H$, $\alpha_1, \alpha_2 \in \mathbb{C}$.
(iii) There is a constant $c$ such that $|B(x, y)| \le c \|x\| \|y\|$ for every $x, y \in H$.
Then there is $A \in L(H)$, $\|A\|_{L(H)} \le c$, such that
$$B(x, y) = (x, Ay), \quad x, y \in H.$$
Moreover,
(iv) if there is a positive constant $d$ such that
$$B(x, x) \ge d \|x\|^2 \quad \text{for each } x \in H,$$
then $A$ is invertible, $A^{-1} \in L(H)$ and $\|A^{-1}\|_{L(H)} \le \frac{1}{d}$.

Proof. The existence of $A$ follows from (i), (iii) and the Riesz Representation Theorem. The property (ii) yields the linearity of $A$. Since
$$\|Ay\|^2 = (Ay, Ay) = B(Ay, y) \le c \|Ay\| \|y\|,$$
we have $\|Ay\| \le c \|y\|$, i.e., $A \in L(H)$ and $\|A\|_{L(H)} \le c$. The property (iv) means that
$$d \|y\|^2 \le B(y, y) = (y, Ay) \le \|y\| \|Ay\|, \quad \text{i.e.,} \quad \|Ay\| \ge d \|y\| \quad \text{for all } y \in H. \tag{1.2.22}$$
In particular, $A$ is injective. Moreover, $\operatorname{Im} A$ is a closed subspace of $H$. Indeed, let $Ay_n \to z \in \overline{\operatorname{Im} A}$. By (1.2.22), $\{y_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and hence it is convergent to a $y \in H$. By continuity of $A$, $Ay = z$, i.e., $z \in \operatorname{Im} A$. In fact, $\operatorname{Im} A = H$. Indeed, if $w \in (\operatorname{Im} A)^{\perp}$, then
$$d \|w\|^2 \le B(w, w) = (w, Aw) = 0 \quad \text{and} \quad w = o.$$
So $\operatorname{Dom}(A^{-1}) = \operatorname{Im} A = H$ and (1.2.22) implies that $\|A^{-1}\|_{L(H)} \le \frac{1}{d}$.
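A finite dimensional sketch may help to see what Proposition 1.2.41 says. We assume here, only for illustration, that $H$ is the real space $\mathbb{R}^2$ (so no conjugation is needed) and that $B(x, y) = (x, Ay)$ with a hypothetical non-symmetric matrix `A`; for this choice $B(x, x) = 2\|x\|^2$, so that $d = 2$. None of the names below come from the text.

```python
import math

# Illustrative, non-symmetric matrix; (x, A x) = 2 ||x||^2 for every x in R^2.
A = [[2.0, 1.0], [-1.0, 2.0]]

def inner(u, v):
    """Euclidean scalar product on R^2."""
    return sum(a * b for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [inner(row, v) for row in M]

def B(x, y):
    """Bilinear form B(x, y) = (x, A y)."""
    return inner(x, apply(A, y))

def solve2(M, f):
    """Solve the 2x2 linear system M y = f by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(f[0] * M[1][1] - M[0][1] * f[1]) / det,
            (M[0][0] * f[1] - f[0] * M[1][0]) / det]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(inner(v, v))

d = 2.0            # coercivity constant of B for this particular A
f = [3.0, -1.0]    # represents the form x -> (x, f)
y = solve2(A, f)   # the unique y with B(x, y) = (x, f) for all x
```

The estimate $\|A^{-1}\| \le 1/d$ then appears as `norm(y) <= norm(f) / d`, i.e., the solution depends continuously on the right-hand side with a constant controlled by the coercivity constant.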

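Exercise 1.2.42. Let $\{F_\alpha\}_{\alpha\in A}$ be a system of closed subsets of a compact space $M$. Prove the finite intersection property:
$$\text{If } \bigcap_{\alpha\in K} F_\alpha \neq \emptyset \text{ for any finite } K \subset A, \text{ then } \bigcap_{\alpha\in A} F_\alpha \neq \emptyset.$$
(This property characterizes compact spaces.)
Hint. Suppose not. Then $\{M \setminus F_\alpha\}_{\alpha\in A}$ is an open covering of $M$.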


Exercise 1.2.43. Prove that $F \subset C[a, b]$ is relatively compact if and only if $F$ is bounded in $C[a, b]$ and the following equicontinuity condition is satisfied:
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall f \in F : \quad x, y \in [a, b],\ |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon.$$
Hint. Use Proposition 1.2.3. Obviously, this statement is also a special case of Theorem 1.2.13.

Exercise 1.2.44. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Define
$$f(x) = \begin{cases} n & \text{if } x = e_n, \\ n(1 - 2\|x - e_n\|) & \text{if } \|x - e_n\| < \frac{1}{2}, \\ 0 & \text{otherwise.} \end{cases}$$
Show that $f$ is a well-defined continuous functional on $H$ which is not bounded on the closed unit ball.

Exercise 1.2.45. Let $\emptyset \neq M \subset X$ be a subset of a normed linear space $X$. For $x \in X$ set
$$\operatorname{dist}(x, M) = \inf\{\|x - y\| : y \in M\}.$$
Prove that for any $x_1, x_2 \in X$ we have
$$|\operatorname{dist}(x_1, M) - \operatorname{dist}(x_2, M)| \le \|x_1 - x_2\|.$$
Hint. Assume $\operatorname{dist}(x_1, M) \ge \operatorname{dist}(x_2, M)$. For any $\varepsilon > 0$ there exists $x_\varepsilon \in M$ such that $\|x_2 - x_\varepsilon\| < \operatorname{dist}(x_2, M) + \varepsilon$. Use the triangle inequality for $\|x_1 - x_\varepsilon\|$.

Exercise 1.2.46.$^{40}$ Let $\Omega$ be a bounded open set in $\mathbb{R}^M$. For $p \in [1, \infty)$ and $k \in \mathbb{N}$ define $W_0^{k,p}(\Omega)$ to be the closure of $D(\Omega)$ with respect to the $W^{k,p}(\Omega)$-norm (1.2.9).
(i) Prove that $W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega)$ and $W_0^{k,p}(\Omega)$ need not be dense in $W^{k,p}(\Omega)$ (compare it with the statement of Theorem 1.2.28(iii); see also the Trace Theorem (Theorem 7.3.1)).
(ii) Prove the Poincaré inequality: There exists a constant $c_p$ such that for all $u \in W_0^{1,p}(\Omega)$ the inequality
$$\int_\Omega |u(x)|^p \, dx \le c_p \int_\Omega \|\nabla u(x)\|^p \, dx \ ^{41}$$
holds.
Hint. It suffices to prove the assertion for $u \in D(\Omega)$. Consider first $\Omega = (0, 1)$ and use the Mean Value Theorem. Then suppose (without loss of generality) $\Omega \subset \tilde\Omega := (0, d) \times \mathbb{R}^{M-1}$ and notice that $D(\Omega) \subset D(\tilde\Omega)$.
(iii) Use the Poincaré inequality to prove that
$$|u|_{W_0^{1,p}(\Omega)} = \Bigl( \int_\Omega \|\nabla u(x)\|^p \, dx \Bigr)^{\frac{1}{p}}$$
is an equivalent norm on $W_0^{1,p}(\Omega)$ with the norm
$$\|u\|_{W_0^{1,p}(\Omega)} = \Bigl( \int_\Omega |u(x)|^p \, dx \Bigr)^{\frac{1}{p}} + \Bigl( \int_\Omega \|\nabla u(x)\|^p \, dx \Bigr)^{\frac{1}{p}}.$$

Exercise 1.2.47. Let $u \in W^{1,p}(0, 1)$, $1 \le p < \infty$. Prove that functions
$$u^+(x) := \max\{u(x), 0\}, \qquad u^-(x) := \max\{-u(x), 0\}$$
also belong to $W^{1,p}(0, 1)$. We remark that the corresponding result is false for $W^{k,p}(0, 1)$, $k \ge 2$.

$^{40}$ Supplement to Example 1.2.25.

$^{41}$ Finding the smallest possible value of the constant $c_p$ is a much more difficult problem. See also Exercise 6.3.19 and Example 7.4.4. Here $\nabla u(x) = \bigl( \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_M} \bigr)$, where $\frac{\partial}{\partial x_i}$, $i = 1, \dots, M$, are weak derivatives (see (1.2.8)), is the gradient of $u$.

Chapter 2

Properties of Linear and Nonlinear Operators

2.1 Linear Operators

In this section we point out some fundamental properties of linear operators in Banach spaces. The key assertions presented are the Uniform Boundedness Principle, the Banach–Steinhaus Theorem, the Open Mapping Theorem, the Hahn–Banach Theorem, the Separation Theorem, the Eberlein–Smulyan Theorem and the Banach Theorem. We recall that the collection of all continuous linear operators from a normed linear space $X$ into a normed linear space $Y$ is denoted by $L(X, Y)$, and $L(X, Y)$ is a normed linear space with the norm
$$\|A\|_{L(X,Y)} = \sup\{\|Ax\|_Y : \|x\|_X \le 1\}.$$

Proposition 2.1.1. Let $Y$ be a Banach space. Then $L(X, Y)$ is a Banach space, too. In particular, the space $X^*$ of all linear continuous forms on $X$ is complete.

Proof. Let $\{A_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $L(X, Y)$. Then for any $\varepsilon > 0$ there is $n_0 \in \mathbb{N}$ such that for all $n, m \ge n_0$ and $x \in X$,
$$\|A_n x - A_m x\| \le \|A_n - A_m\| \|x\| \le \varepsilon \|x\|.$$
Since $Y$ is complete, the sequence $\{A_n x\}_{n=1}^{\infty}$ is convergent to a point in $Y$ that can be denoted by $Ax$. Obviously $A$ is a linear operator from $X$ into $Y$ and
$$\|Ax - A_m x\| = \lim_{n\to\infty} \|A_n x - A_m x\| \le \varepsilon \|x\|, \quad m \ge n_0,\ x \in X.$$
This implies (Proposition 1.2.10) that $A \in L(X, Y)$ and $\|A - A_m\| \to 0$.

The importance of this result can be seen from the following statement.


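Proposition 2.1.2. Let $X$ be a Banach space and $A \in L(X)$. If $\|A\| < 1$, then the operator $I - A$ is continuously invertible and
$$(I - A)^{-1} = \sum_{n=0}^{\infty} A^n$$
where the sum is convergent in the $L(X)$-norm.

Proof. First we prove the convergence. Let $\varepsilon > 0$ be arbitrary. Put
$$S_k = \sum_{n=0}^{k} A^n.$$
Then$^{1}$
$$\|S_l - S_k\| = \Bigl\| \sum_{n=k+1}^{l} A^n \Bigr\| \le \sum_{n=k+1}^{l} \|A^n\| \le \sum_{n=k+1}^{l} \|A\|^n < \varepsilon \quad \text{for } l > k$$
provided $k$ is sufficiently large. By Proposition 2.1.1, the limit of $S_k$ exists in the $L(X)$-norm. Denote $B := \lim_{k\to\infty} S_k = \sum_{n=0}^{\infty} A^n$. We have
$$(I - A)B = \lim_{k\to\infty} (I - A) \sum_{n=0}^{k} A^n = \lim_{k\to\infty} \Bigl( \sum_{n=0}^{k} A^n - \sum_{n=1}^{k+1} A^n \Bigr) = \lim_{k\to\infty} (I - A^{k+1}) = I$$
since $\lim_{n\to\infty} A^n = O$. Similarly,
$$B(I - A) = I, \quad \text{i.e.,} \quad B = (I - A)^{-1}.$$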
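The series in Proposition 2.1.2 can be summed numerically in a finite dimensional case. The sketch below assumes $X = \mathbb{R}^2$ with the maximum norm, for which the induced operator norm of a matrix is its maximal absolute row sum; the matrix `A`, the number of terms and the function names are illustrative assumptions.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(A, terms=200):
    """Partial sum S_k = sum_{n=0}^{k} A^n, k = terms, approximating (I - A)^{-1}."""
    n = len(A)
    S = identity(n)       # starts with the term A^0 = I
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, A)  # next power of A
        S = [[S[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return S

# Illustrative matrix: maximal absolute row sum is 0.5, so ||A|| < 1.
A = [[0.2, 0.3], [0.1, 0.4]]
B = neumann_inverse(A)
I_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)]
             for i in range(2)]
check = matmul(I_minus_A, B)  # should be close to the identity matrix
```

Because the tail of the series is dominated by a geometric series with ratio $\|A\|$, the number of terms needed for a given accuracy grows only logarithmically in the accuracy.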

If $X$ is a complex Banach space and $A \in L(X)$, we denote
$$\rho(A) := \{\lambda \in \mathbb{C} : \lambda I - A \text{ is continuously invertible in } L(X)\}$$
(the so-called resolvent set of $A$) and
$$\sigma(A) := \mathbb{C} \setminus \rho(A)$$
(the so-called spectrum of $A$).$^{2}$ The operator-valued function
$$\lambda \mapsto (\lambda I - A)^{-1}, \quad \lambda \in \rho(A),$$
is called the resolvent of $A$.

$^{1}$ If $A \in L(X, Y)$, $B \in L(Y, Z)$, then $BA \in L(X, Z)$ and $\|BA\|_{L(X,Z)} \le \|B\|_{L(Y,Z)} \|A\|_{L(X,Y)}$.

$^{2}$ The reason for considering only complex spaces consists in the fact that $\sigma(A) \neq \emptyset$ for all $A \in L(X)$ in this case. This will be proved later in this section (see the discussion following Example 2.1.20).


Corollary 2.1.3. Let $X$ be a complex Banach space and $A \in L(X)$. Then $\rho(A)$ is an open set and $\{\lambda : |\lambda| > \|A\|\} \subset \rho(A)$.

Proof. If $|\lambda| > \|A\|$, then
$$\lambda I - A = \lambda \Bigl( I - \frac{A}{\lambda} \Bigr) \quad \text{and} \quad \Bigl( I - \frac{A}{\lambda} \Bigr)^{-1} \in L(X)$$
according to Proposition 2.1.2. Hence we have$^{3}$
$$(\lambda I - A)^{-1} = \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}}.$$
Similarly, if $\lambda_0 \in \rho(A)$, then
$$\lambda I - A = (\lambda - \lambda_0) I + (\lambda_0 I - A) = (\lambda_0 I - A)\bigl[ I - (\lambda_0 - \lambda)(\lambda_0 I - A)^{-1} \bigr].$$
For a parameter $\lambda$ such that $\|(\lambda_0 - \lambda)(\lambda_0 I - A)^{-1}\| < 1$, the inverse operator $B = [I - (\lambda_0 - \lambda)(\lambda_0 I - A)^{-1}]^{-1}$ exists and $(\lambda I - A)^{-1} = B(\lambda_0 I - A)^{-1}$.

The next theorem together with Theorems 2.1.8 and 2.1.13 is one of the most significant results in linear functional analysis. For the proofs the interested reader can consult textbooks on functional analysis, e.g., Conway [28], Dunford & Schwartz [44], Rudin [112], Yosida [135].

Theorem 2.1.4 (Uniform Boundedness Principle). Let $X$ be a Banach space and $Y$ a normed linear space. If $\{A_\gamma\}_{\gamma\in\Gamma} \subset L(X, Y)$ is such that the sets $\{\|A_\gamma x\|_Y : \gamma \in \Gamma\}$ are bounded for all $x \in X$, then $\{\|A_\gamma\|_{L(X,Y)} : \gamma \in \Gamma\}$ is also bounded.

This result is the quintessence of several results on approximation of functions in classical analysis and can be used for "modern" proofs of such results. The following example is typical.

Example 2.1.5. There exists a periodic continuous function the Fourier series of which is divergent at zero.$^{4}$ To see this we recall that the $n$th partial sum of the Fourier series of a function $f$ at 0 is given by
$$s_n(f)(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(0 - t) f(t) \, dt \quad \text{where} \quad D_n(t) = \frac{\sin\bigl(n + \frac{1}{2}\bigr)t}{\sin\frac{t}{2}}, \quad 0 < |t| < \pi$$
(the $n$th Dirichlet kernel). Since $\sigma_n : f \mapsto s_n(f)(0)$ are continuous linear forms on the space $C[-\pi, \pi]$, the sequence of their norms $\bigl\{\|\sigma_n\|_{L(C[-\pi,\pi],\mathbb{R})}\bigr\}_{n=1}^{\infty}$ should be bounded provided $\sigma_n(f)$ is convergent for all $f \in C[-\pi, \pi]$ (Theorem 2.1.4). One can calculate that
$$\|\sigma_n\| = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt,$$
and a careful estimate shows that $\|\sigma_n\|$ is like $\log n$ for large $n$.

$^{3}$ This series actually converges for $\lambda$ such that $|\lambda| > r(A) := \sup\{|\mu| : \mu \in \sigma(A)\}$ but its proof is more involved. The quantity $r(A)$ is called the spectral radius of $A$.

$^{4}$ Even divergent at uncountably many points but always of measure zero. The set of such "bad" functions is dense in $C[-\pi, \pi]$.
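The growth of the norms $\|\sigma_n\|$ in Example 2.1.5 can be observed numerically. The following sketch approximates $\frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\,dt$ by a midpoint rule; the quadrature, the number of cells and the function names are our assumptions and only give approximate values.

```python
import math

def dirichlet(n, t):
    """D_n(t) = sin((n + 1/2) t) / sin(t / 2), for 0 < |t| < pi."""
    return math.sin((n + 0.5) * t) / math.sin(0.5 * t)

def functional_norm(n, cells=20000):
    """Midpoint-rule estimate of (1 / (2 pi)) * integral of |D_n| over (-pi, pi)."""
    h = 2.0 * math.pi / cells
    total = 0.0
    for i in range(cells):
        # For an even number of cells no midpoint coincides with t = 0.
        t = -math.pi + (i + 0.5) * h
        total += abs(dirichlet(n, t))
    return total * h / (2.0 * math.pi)

# Estimated norms for a few values of n (illustrative selection).
norms = {n: functional_norm(n) for n in (1, 4, 16, 64)}
```

The values in `norms` increase with $n$, slowly, as one expects from logarithmic growth; this is a numerical observation and not a replacement for the estimate mentioned in the example.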

As indicated in the previous example, Theorem 2.1.4 is essentially an approximation result. This is clearer from its next variant.

Corollary 2.1.6 (Banach–Steinhaus). Let $X$ and $Y$ be Banach spaces and let $\{A_n\}_{n=1}^{\infty} \subset L(X, Y)$. Then the limits $\lim_{n\to\infty} A_n x$ exist for every $x \in X$ if and only if the following conditions are satisfied:
(i) There is a dense set $M \subset X$ such that $\lim_{n\to\infty} A_n x$ exists for each $x \in M$.
(ii) The sequence of norms $\{\|A_n\|\}_{n=1}^{\infty}$ is bounded.
Moreover, under these conditions
$$Ax := \lim_{n\to\infty} A_n x$$
exists for all $x \in X$ and $A \in L(X, Y)$.$^{5}$

The following proposition is also often useful.

Proposition 2.1.7. Let $X$ be a Banach space and $Y$ a normed linear space. If $B : X \times X \to Y$ is a bilinear operator (i.e., linear in both variables) and
(i) for every $y \in X$ the mapping $x \mapsto B(x, y)$ belongs to $L(X, Y)$;
(ii) for every $x \in X$ the mapping $y \mapsto B(x, y)$ belongs to $L(X, Y)$,
then there exists a constant $c$ such that
$$\|B(x, y)\|_Y \le c \|x\|_X \|y\|_X, \quad x, y \in X.$$
In particular, if $x_n \to x$, $y_n \to y$, then $B(x_n, y_n) \to B(x, y)$.

Proof. Denote $B_y : x \mapsto B(x, y)$. By (i), $B_y \in L(X, Y)$ for all $y \in X$, $\|y\| \le 1$. By (ii), $\|B_y(x)\| \le c(x)$. The Uniform Boundedness Principle implies the existence of a constant $c$ such that
$$\sup_{\|x\| \le 1}\ \sup_{\|y\| \le 1} \|B(x, y)\| \le c.$$

Theorem 2.1.8 (Open Mapping Theorem). Let $X$, $Y$ be Banach spaces, let $A \in L(X, Y)$ and let $A$ have a closed range $\operatorname{Im} A$. Then for any open set $G \subset X$ its image $A(G)$ is an open set in $\operatorname{Im} A$. In particular, if $A$ is, in addition, injective and surjective, then $A^{-1} \in L(Y, X)$.

$^{5}$ This type of convergence is the so-called convergence in the strong operator topology. It is weaker than the norm convergence.


When applied to linear equations $Ax = y$, Theorem 2.1.8 says that the continuous dependence of a solution on the right-hand side is a consequence of the existence and uniqueness result. Such continuous dependence is important for any reasonable numerical approximation. Theorem 2.1.8 can be also used in a "negative" sense:

Example 2.1.9. Denote by
$$\hat f(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, e^{-int} \, dt$$
the $n$th Fourier coefficient of $f \in L^1(-\pi, \pi)$. Since $\hat f(n) \to 0$ for $|n| \to \infty$ for all trigonometric polynomials which are dense in $L^1(-\pi, \pi)$, we have
$$\hat f(n) \to 0 \quad \text{for all } f \in L^1(-\pi, \pi)$$
(the so-called Riemann–Lebesgue Lemma). In other words, $A : f \mapsto \hat f(\cdot)$ is a continuous linear operator from $L^1(-\pi, \pi)$ into
$$c_0(\mathbb{Z}) := \Bigl\{ \{a_n\}_{n\in\mathbb{Z}} : \lim_{|n|\to\infty} |a_n| = 0 \Bigr\}, \qquad \|\{a_n\}\|_{c_0(\mathbb{Z})} = \sup_{n} |a_n|.$$
Applications of Fourier series to various problems in analysis (like convolution equations, differential equations, ...) would be much easier if $A$ were a mapping onto $c_0(\mathbb{Z})$. Theorem 2.1.8 shows that this cannot be true for then $A^{-1}$ would be bounded, i.e.,
$$\|f\|_{L^1(-\pi,\pi)} \le c \sup_{n} |\hat f(n)| \quad \text{for all } f \in L^1(-\pi, \pi).$$
If $\{D_k\}_{k=1}^{\infty}$ is the sequence of Dirichlet kernels (Example 2.1.5), then
$$\hat D_k(n) = \begin{cases} 1, & |n| \le k, \\ 0, & |n| > k, \end{cases} \quad \text{and} \quad \|D_k\|_{L^1(-\pi,\pi)} \sim \log k,$$
a contradiction.

Theorem 2.1.8 also yields a sufficient condition for a linear operator to be continuous. To formulate it we need the notion of a closed operator: Let $X$, $Y$ be normed linear spaces. A linear operator $A : \operatorname{Dom} A \subset X \to Y$ is said to be closed if
$$\{x_n\}_{n=1}^{\infty} \subset \operatorname{Dom} A, \quad x_n \to x, \quad Ax_n \to y$$
implies that
$$x \in \operatorname{Dom} A \quad \text{and} \quad Ax = y.$$
Equivalently, $A$ is a closed operator if and only if the graph of $A$, i.e., $G(A) := \{(x, Ax) : x \in \operatorname{Dom} A\}$, is a closed linear subspace of $X \times Y$.

Corollary 2.1.10 (Closed Graph Theorem). Let $X$, $Y$ be Banach spaces and let $A$ be a closed operator from $\operatorname{Dom} A = X$ into $Y$. Then $A$ is continuous.

Proof. If $G(A)$ denotes the graph of $A$, then put $T(x, Ax) = x$. By Theorem 2.1.8, $T^{-1}$ is continuous, and therefore $A = \pi_2 \circ T^{-1}$ is continuous as well ($\pi_2$ is the projection of $X \times Y$ onto the second component $Y$).

Example 2.1.11. Many differential operators are either closed or have closed extensions. If they are viewed as operators from $X$ into $X$, then they are only densely defined. A very simple example: $X = C[0, 1]$, $Ax = \dot x$,
$$\operatorname{Dom} A = \{x \in X : \dot x(t) \text{ exists for all } t \in [0, 1] \text{ and } \dot x \in X\}.$$
A well-known classical result says that $A$ is a closed operator. But $A$ is not continuous. For $x_n(t) = t^n$ we have $\|x_n\| = 1$, $\|\dot x_n\| = n$.

Example 2.1.12. Let $X$ be a Banach space and $M$ a linear subspace of $X$. Let $N$ be an (algebraic) complement of $M$ and let $P$ be the corresponding projection onto $M$. Then $P$ is continuous if and only if both $M$ and $N$ are closed. The sufficiency part follows from the Closed Graph Theorem and from an observation that $P$ is closed whenever $M$ and $N$ are closed subspaces. The necessity part is obvious since
$$M = \operatorname{Ker}(I - P), \qquad N = \operatorname{Ker} P.$$
This statement should be compared with the Hilbert space case (Corollary 1.2.35). An important special case is $\operatorname{codim} M < \infty$. By definition, this means that an algebraic direct complement $N$ has a finite dimension ($\operatorname{codim} M := \dim N$) and therefore $N$ is closed (Corollary 1.2.11(i)). If $M$ is closed as well, then any projection onto $M$ is continuous. We postpone the case of $\dim M < \infty$ to Remark 2.1.19. We note that if $X$ is a Banach space such that there exists a continuous projection $P$, $\|P\|_{L(X)} \le 1$, onto every closed subspace of $X$, then $X$ has an equivalent norm induced by the scalar product on $X$ (see Kakutani [71]).
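The unboundedness of the differentiation operator from Example 2.1.11 is easy to observe numerically. In the sketch below the norm of $C[0,1]$ is approximated by sampling on a uniform grid that contains both endpoints, which is exact for the monomials $t^n$ and their derivatives since their maxima are attained at $t = 1$; the grid size and all names are assumptions of this illustration.

```python
def sup_norm(f, samples=2001):
    """Approximate the C[0, 1] norm by sampling on a uniform grid with both endpoints."""
    return max(abs(f(i / (samples - 1))) for i in range(samples))

def monomial(n):
    """x_n(t) = t^n."""
    return lambda t: t ** n

def monomial_derivative(n):
    """Derivative of x_n, i.e., t -> n t^(n-1)."""
    return lambda t: n * t ** (n - 1)

# Triples (n, ||x_n||, ||dx_n/dt||): the first norm stays 1, the second grows like n.
growth = [(n, sup_norm(monomial(n)), sup_norm(monomial_derivative(n)))
          for n in (1, 5, 25, 125)]
```

No constant $c$ can satisfy $\|\dot x_n\| \le c \|x_n\|$ for all entries of such a table as $n$ grows, which is exactly the failure of continuity stated in the example.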


Now we turn our attention to the dual space $X^*$ of all continuous linear forms on a normed linear space $X$. In Section 1.1 we have seen the importance of linear forms. Namely, they allowed us to define an algebraic adjoint operator $A^{\#}$ and formulate Theorem 1.1.25. The dual space $X^*$ is even more important for a normed linear space $X$ since another topology can be introduced on $X$ with the help of $X^*$ which in a certain sense has better properties (Theorem 2.1.25 below). Surprisingly, the following basic result does not need any topology.

Theorem 2.1.13 (Hahn–Banach). Let $X$ be a real linear space and let $Y$ be a linear subspace of $X$. Assume that $f$ is a linear form on $Y$ which is dominated by a sublinear functional $p$.$^{6}$ Then there exists $F \in X^{\#}$ such that
(i) $F(y) = f(y)$ for all $y \in Y$ (extension);
(ii) $F(x) \le p(x)$ for all $x \in X$ (dominance).

Proof. The proof is based on an extension of $f$ to a subspace whose dimension is larger by 1 and such that this extension is dominated by the same $p$, and the use of Zorn's Lemma as an inductive argument, similarly as in the proof of Theorem 1.1.3.

Remark 2.1.14. If $X$ is a complex linear space, then we need $p$ to satisfy a stronger condition than (2) in footnote 6, namely
(2′) $p(\alpha x) = |\alpha| p(x)$, $\alpha \in \mathbb{C}$, $x \in X$.
In this case $p$ is called a semi-norm.$^{7}$ The dominance also has to be stronger: $|f(x)| \le p(x)$. The extension result follows from Theorem 2.1.13 by considering $\operatorname{Re} f$ and $\operatorname{Im} f$ and observing that $\operatorname{Re} f(ix) = -\operatorname{Im} f(x)$.

Corollary 2.1.15. Let $X$ be a normed linear space and let $Y$ be a linear subspace of $X$ (not necessarily closed). If $f \in Y^*$, then there exists $F \in X^*$ such that
(i) $F(y) = f(y)$ for $y \in Y$;
(ii) $\|F\|_{X^*} = \|f\|_{Y^*}$.

Proof. Put $p(x) = \|f\| \|x\|$, $x \in X$, and apply Theorem 2.1.13 or Remark 2.1.14, respectively.

Corollary 2.1.16 (Dual Characterization of the Norm). Let $X$ be a normed linear space. Then
$$\|x\|_X = \max\{|f(x)| : f \in X^* \text{ with } \|f\|_{X^*} \le 1\}. \tag{2.1.1}$$

$^{6}$ A mapping $p : X \to \mathbb{R}$ is called sublinear if
(1) $p(x + y) \le p(x) + p(y)$ for any $x, y \in X$;
(2) $p(\alpha x) = \alpha p(x)$ for any $x \in X$ and $\alpha \ge 0$.

$^{7}$ The difference between a norm and a semi-norm is that a semi-norm need not satisfy the condition: $p(x) = 0 \implies x = o$.


Proof. Put $g_0(\alpha x) = \alpha \|x\|$, $\alpha \in \mathbb{R}$ (or $\alpha \in \mathbb{C}$). Then $g_0$ is a continuous linear form on $\operatorname{Lin}\{x\}$ and its norm is 1 (provided $x \neq o$). Let $f_0$ be its extension from Corollary 2.1.15. Then
$$f_0(x) = \|x\|, \quad \|f_0\| = 1, \quad \text{i.e.,} \quad \|x\| \le \sup\{|f(x)| : f \in X^* \text{ with } \|f\| \le 1\}.$$
The converse inequality follows from the definition of $\|f\|$.

Remark 2.1.17.
(i) If $X$ is a Hilbert space, then the equality (2.1.1) can be obtained immediately from the Riesz Representation Theorem (Theorem 1.2.40). This theorem can be often used in Hilbert spaces instead of the Hahn–Banach Theorem.
(ii) A slightly weaker form of (2.1.1) is often used: If $f(x) = 0$ for all $f \in X^*$, then $x = o$. The equivalent assertion reads as follows: $X^*$ separates points of $X$.

Corollary 2.1.18 (Separation Theorem). Let $X$ be a normed linear space and let $C$ be a nonempty, closed, convex set. If $x_0 \notin C$, then there exists $F \in X^*$ such that
$$\sup\{\operatorname{Re} F(x) : x \in C\} < \operatorname{Re} F(x_0). \tag{2.1.2}$$

Proof. It is sufficient to give the proof for a real space $X$ and under the additional assumption $o \in C$. In particular, this assumption means that $x_0 \neq o$. We wish to extend the form $f$ defined on $\operatorname{Lin}\{x_0\}$ by $f(\alpha x_0) = \alpha$, $\alpha \in \mathbb{R}$. To do that we need a suitable dominating functional. Since $d := \operatorname{dist}(x_0, C) > 0$, there exists a convex neighborhood of $C$ which does not contain $x_0$, e.g.,
$$K = \Bigl\{ x + y : x \in C,\ \|y\| < \frac{d}{2} \Bigr\}.$$
Put
$$p_K(z) := \inf\Bigl\{ \alpha > 0 : \frac{z}{\alpha} \in K \Bigr\} \quad \text{for } z \in X.^{8}$$
It is a matter of simple calculation to show that $p_K$ is sublinear, $p_K(x_0) > 1$, and $p_K(z) \le 1$ for $z \in K$. Let $F$ be an extension of $f$ given by Theorem 2.1.13. Since $o \in C$, we have
$$F(\pm y) \le p_K(\pm y) \le 1 \quad \text{for } \|y\| < \frac{d}{2}.$$
This shows that
$$\|F\| \le \frac{2}{d}, \quad \text{i.e.,} \quad F \in X^*.$$

$^{8}$ $p_K$ is the so-called Minkowski functional of the convex set $K$.


The inequality (2.1.2) follows from domination: namely, we have
$$F(x) + F(y) \le p_K(x + y) \le 1 \quad \text{for } x \in C \text{ and all } \|y\| < \frac{d}{2},$$
i.e.,
$$F(x) \le 1 - \sup\Bigl\{ F(y) : \|y\| < \frac{d}{2} \Bigr\} < 1 = F(x_0).$$

Remark 2.1.19. If $C$ from Corollary 2.1.18 is a closed linear subspace of $X$ and $F \in X^*$ satisfies (2.1.2), then $F(x) = 0$ for all $x \in C$. Notice that $F(x_0) = 1$ for $F$ which has been constructed in the proof. This observation yields the existence of a continuous projection onto a finite dimensional subspace $Y$ of $X$. Namely, suppose that $\{y_1, \dots, y_n\}$ is a basis of $Y$, and denote by $Y_k$ the span of $y_1, \dots, y_{k-1}, y_{k+1}, \dots, y_n$. Then $Y_k$ is a closed linear subspace of $X$ and $y_k \notin Y_k$. Let $F_k \in X^*$ be such that
$$F_k(y_j) = \begin{cases} 1, & j = k, \\ 0, & j \neq k, \end{cases} \qquad j = 1, \dots, n.$$
Then
$$Px = \sum_{k=1}^{n} F_k(x)\, y_k$$
is a continuous projection onto $Y$.

Warning. It is not true that every projection onto $Y$ is continuous even if $\dim Y = 1$ but the construction (i.e., the construction of a noncontinuous linear form) is not obvious!

Example 2.1.20.
(i) By Corollary 1.2.11(ii), $(\mathbb{R}^M)^* = (\mathbb{R}^M)^{\#}$. This means that $(\mathbb{R}^M)^*$ can be identified with $\mathbb{R}^M$.
(ii) Let $K$ be a compact subset of $\mathbb{R}^M$. Then for any $F \in [C(K)]^*$ there exists a unique complex Borel measure $\mu$ on $K$ such that
$$F(f) = \int_K f(x) \, d\mu(x) \quad \text{for every } f \in C(K),$$
and $\|F\|_{[C(K)]^*} = |\mu|(K)$ where $|\mu|$ is the total variation of $\mu$. A similar statement holds under a more general assumption on $K$; for details and the corresponding notions see Dunford & Schwartz [44, Section IV, 6] or Rudin [113, Chapter 6] and, especially, Bourbaki [14]. In the last book the integration theory is developed on the basis of this representation theorem.
(iii) Let $\Omega$ be an open subset of $\mathbb{R}^M$ and let $p \in [1, \infty)$. Then the dual space $[L^p(\Omega)]^*$ can be identified with $L^{p'}(\Omega)$ ($p'$ is the conjugate exponent, i.e., $\frac{1}{p} + \frac{1}{p'} = 1$) in the following sense. For any $F \in [L^p(\Omega)]^*$ there exists a unique $\varphi \in L^{p'}(\Omega)$ such that
$$F(f) = \int_\Omega f(x) \varphi(x) \, dx \quad \text{for every } f \in L^p(\Omega).$$
Moreover, $\|F\|_{[L^p(\Omega)]^*} = \|\varphi\|_{L^{p'}(\Omega)}$. Details can be found in books cited above.
Warning. The dual space $[L^\infty(\Omega)]^*$ is much larger than $L^1(\Omega)$!
(iv) The dual spaces to Sobolev spaces $W^{k,p}(\mathbb{R}^M)$ can be identified with special subspaces of tempered distributions for example via the Fourier transform. We omit details since their description is beyond the scope of this book.

The reader can ask why we are so interested in continuous linear forms. One of the reasons is the following. Suppose that $\varphi$ is a vector-valued function (i.e., a mapping from $\mathbb{R}$ or $\mathbb{C}$ into a normed linear space $X$). For any $f \in X^*$ the composition $f \circ \varphi$ is a real or complex function of a real or complex variable and therefore results of classical analysis can be applied to $f \circ \varphi$. To be more specific, consider the resolvent (see page 56) of $A \in L(X)$
$$R(\lambda)x := (\lambda I - A)^{-1} x, \quad \lambda \in \rho(A),$$
which is an $X$-valued function for every $x \in X$. Then for any $F \in X^*$, the complex function
$$\varphi(\lambda) = F[(\lambda I - A)^{-1} x]$$
is holomorphic in $\rho(A)$. For $|\lambda| > \|A\|$ we also have
$$|\varphi(\lambda)| \le \|F\|_{X^*} \|(\lambda I - A)^{-1}\|_{L(X)} \|x\|_X = \|F\| \|x\| \Bigl\| \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}} \Bigr\| \le \|F\| \|x\| \sum_{n=0}^{\infty} \frac{\|A\|^n}{|\lambda|^{n+1}},$$
and so
$$\lim_{|\lambda|\to\infty} |\varphi(\lambda)| = 0.$$
If $\rho(A) = \mathbb{C}$, $\varphi$ would be identically zero (by the Liouville Theorem from the complex functions theory). Since this should be true for all $F \in X^*$, we get $(\lambda I - A)^{-1} x = o$ for all $x \in X$, a contradiction. Therefore, the spectrum $\sigma(A)$ is nonempty for each $A \in L(X)$. This is a generalization of the existence of an eigenvalue of a linear operator in a finite dimensional space and therefore also a generalization of the Fundamental Theorem of Algebra (cf. page 15). It is worth mentioning that the Jordan Canonical Form (Theorem 1.1.34) is based on this result.


Warning. It is not true that every $A \in L(X)$, $\dim X = \infty$, has an eigenvalue! A simple example is $X = C[0, 1]$, $Ax(t) := t\,x(t)$.

Our main reason for considering dual spaces comes from an attempt to find a weaker topology on a normed linear space in which bounded sets would be relatively compact. The importance of this fact will become clear in Chapter 6. We also ask the reader to return to Proposition 1.2.2 for motivation.

Definition 2.1.21. Let $\{x_n\}_{n=1}^{\infty}$ be a sequence of elements in a normed linear space $X$. We say that $\{x_n\}_{n=1}^{\infty}$ converges weakly to $x \in X$ (notation $x_n \rightharpoonup x$ or $\operatorname{w-lim}_{n\to\infty} x_n = x$) if
\[ \lim_{n\to\infty} f(x_n) = f(x) \quad \text{for every } f \in X^*. \]

Proposition 2.1.22.
(i) (uniqueness) If $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$, then $x = y$.
(ii) If $\lim_{n\to\infty} \|x_n - x\| = 0$, then $x_n \rightharpoonup x$.$^{9}$
(iii) A weakly convergent sequence is bounded. Moreover, if $x_n \rightharpoonup x$, then $\|x\| \le \liminf_{n\to\infty} \|x_n\|$.
(iv) If $X$ is a uniformly convex Banach space,$^{10}$ $x_n \rightharpoonup x$ and $\|x_n\| \to \|x\|$, then $\{x_n\}_{n=1}^{\infty}$ converges to $x$ in the norm topology.

$^{9}$ Warning. The converse statement is not true in general (see Exercise 2.1.37)!

$^{10}$ A Banach space $X$ is said to be uniformly convex if for every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ x, y \in X,\ \|x\| = \|y\| = 1,\ \|x - y\| \ge \varepsilon \implies 1 - \Bigl\|\frac{x+y}{2}\Bigr\| \ge \delta. \]
Every uniformly convex space is reflexive, see Yosida [135, Chapter V, 2]. Hilbert spaces, $L^p(\Omega)$-spaces and $W^{1,p}(\Omega)$-spaces ($1 < p < \infty$) are uniformly convex (for a Hilbert space this follows from the parallelogram identity (1.2.14), for the other two cases see, e.g., Adams [2, Corollary 2.29 and Theorem 3.5]).

Proof. Assertion (i) follows immediately from Remark 2.1.17(ii) since in this case $f(x) = f(y)$ for every $f \in X^*$. Assertion (ii) is obvious.

Assertion (iii) is basically a consequence of Theorem 2.1.4, but certain preliminaries are needed: Since $X^*$ is a normed linear space, its dual $X^{**} := (X^*)^*$ is defined. Put
\[ \kappa(x) : f \mapsto f(x), \quad f \in X^*. \]
Then $\kappa$ (the so-called canonical embedding) is a linear continuous operator from $X$ into $X^{**}$, and
\[ \|\kappa(x)\|_{X^{**}} = \sup_{\|f\|_{X^*} \le 1} |f(x)| = \|x\|_X \]
(Corollary 2.1.16).$^{11}$ Since the space $X^*$ is always complete (Proposition 2.1.1), Theorem 2.1.4 can be applied to the sequence $\{\kappa(x_n)\}_{n=1}^{\infty}$. This shows that $\{x_n\}_{n=1}^{\infty}$ is bounded. If $x_n \rightharpoonup x$, we choose $f \in X^*$ such that $\|f\| = 1$ and $f(x) = \|x\|$ (Corollary 2.1.16). Then
\[ \|x\| = f(x) = \lim_{n\to\infty} f(x_n) \le \liminf_{n\to\infty} \|x_n\|. \]

Assertion (iv) is obvious for $x = o$. If $x \ne o$, then we may assume that also $x_n \ne o$ and put $y := \frac{x}{\|x\|}$ and $y_n := \frac{x_n}{\|x_n\|}$. Since $x_n \rightharpoonup x$ and $\|x_n\| \to \|x\|$, we have
\[ f(y_n) = \frac{1}{\|x_n\|} f(x_n) \to \frac{1}{\|x\|} f(x) = f(y) \quad \text{for any } f \in X^*, \]
i.e., $y_n \rightharpoonup y$. If we prove that $\|y_n - y\| \to 0$, then
\[ \|x_n - x\| = \bigl\| y_n \|x_n\| - y \|x\| \bigr\| \le \|x_n\|\,\|y_n - y\| + \|y\|\,\bigl|\|x_n\| - \|x\|\bigr| \to 0 \]
due to the assumption $\|x_n\| \to \|x\|$. To prove $y_n \to y$ we proceed by contradiction using the uniform convexity of $X$. Suppose that there is $\varepsilon > 0$ such that $\|y_n - y\| \ge \varepsilon$ for infinitely many $n$. Passing to this subsequence, we have $\|y_n + y\| \le 2(1 - \delta)$ by the uniform convexity of $X$. Let us choose $f_0 \in X^*$, $\|f_0\| = 1$, $f_0(y) = \|y\| = 1$ (see Corollary 2.1.16). Then
\[ 2(1 - \delta) \ge \limsup_{n\to\infty} \|y_n + y\| \ge \limsup_{n\to\infty} f_0(y_n + y) = 2 f_0(y) = 2, \]
a contradiction.

$^{11}$ It is not generally true that $\kappa$ is surjective. A Banach space $X$ is said to be reflexive if $\kappa$ is surjective. Every Hilbert space and the spaces $L^p(\Omega)$, $1 < p < \infty$, are reflexive (the Riesz Representation Theorem and Example 2.1.20(iii)). The spaces $L^1(\Omega)$, $L^\infty(\Omega)$ and $C(\Omega)$ are not reflexive.

Remark 2.1.23. The weak convergence is the convergence in the weak topology. It is convenient to define this topology by systems of neighborhoods of points. We say that $U \subset X$ is a weak neighborhood of a point $x \in X$ if there are $f_1, \dots, f_n \in X^*$ such that
\[ \{y \in X : |f_i(y) - f_i(x)| < 1 \text{ for } i = 1, \dots, n\} \subset U. \]
A subset $G \subset X$ is weakly open (i.e., open in the weak topology) provided it is a weak neighborhood of each of its points. It is easy to see that a weakly open set is also open in the norm topology. The converse is generally true only in finite dimensional spaces.
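The gap between weak and norm convergence (the warning attached to Proposition 2.1.22(ii)) can be made concrete in a small computation. The sketch below is an illustration under assumptions of our own: the space is $\ell^2$ truncated to finitely many coordinates, the functional is represented by the hypothetical vector `f` with entries $1/k$, and all names are ours.

```python
import math

def unit_vector(n, dim):
    """n-th standard unit vector (1-based index) truncated to `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def inner(x, y):
    """Real scalar product of two finite sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

DIM = 1000
f = [1.0 / k for k in range(1, DIM + 1)]   # a fixed square-summable sequence (truncated)

indices = (1, 10, 100, 1000)
pairings = [inner(unit_vector(n, DIM), f) for n in indices]   # equals 1/n, tends to 0
norms = [norm(unit_vector(n, DIM)) for n in indices]          # stays equal to 1
```

The pairings $(e_n, f) = 1/n$ tend to zero while $\|e_n\| = 1$ for every $n$, so the weak limit $o$ satisfies the strict inequality $\|o\| < \liminf \|e_n\|$ in assertion (iii). A single functional is of course no proof of weak convergence; it only shows the mechanism.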


As we have mentioned, our aim is to find compact sets in the weak topology.

Remark 2.1.24. The weak topology in an infinite dimensional space is not metrizable. Therefore two concepts of compactness, namely the sequential and the covering one (see footnote 12 on page 26), are in principle different. It is surprising that they coincide for weak topologies in Banach spaces. This very deep result is known as the Eberlein–Smulyan Theorem (see Dunford & Schwartz [44, Chapter 5]).

Theorem 2.1.25 (Eberlein–Smulyan). Let $X$ be a reflexive space. Then any bounded sequence contains a weakly convergent subsequence.

Proof. We present a simple proof for the case that $X$ is a Hilbert space. A proof for an arbitrary reflexive space can be found, e.g., in Dunford & Schwartz [44], Fabian et al. [49], Yosida [135].

Let $\{x_n\}_{n=1}^{\infty} \subset X$ be a bounded sequence, and put $Y = \overline{\operatorname{Lin}\{x_1, x_2, \dots\}}$ (the closure is taken in the norm topology). Since the sequence of scalar products $\{(x_1, x_n)\}_{n=1}^{\infty}$ is a bounded sequence of numbers (real or complex), there is a subsequence, say $\{x_n^{(1)}\}_{n=1}^{\infty}$, such that $\{(x_1, x_n^{(1)})\}_{n=1}^{\infty}$ converges. For the same reason there is a subsequence $\{x_n^{(2)}\}_{n=1}^{\infty}$ of $\{x_n^{(1)}\}_{n=1}^{\infty}$ such that $\{(x_2, x_n^{(2)})\}_{n=1}^{\infty}$ converges, etc. Put $y_k = x_k^{(k)}$ (the diagonal choice). Then $\lim_{k\to\infty} (x_j, y_k)$ exists for all $j \in \mathbb{N}$, and therefore $\lim_{k\to\infty} (x, y_k)$ exists for each $x \in \operatorname{Lin}\{x_1, x_2, \dots\}$.

Since the sequence of linear forms $f_k : x \mapsto (x, y_k)$ is bounded in $Y^*$, the Banach–Steinhaus Theorem (Corollary 2.1.6) implies the existence of $f \in Y^*$ such that
\[ \lim_{k\to\infty} f_k(x) = f(x) \quad \text{for all } x \in Y. \]
Let $P$ be the orthogonal projection onto $Y$. Put
\[ g(x) = f(Px) \quad \text{for } x \in X. \]
Then $g \in X^*$ and by the Riesz Representation Theorem there is $y \in X$ such that
\[ g(x) = (x, y) \quad \text{for } x \in X. \]
Moreover, since $y_k \in Y$,
\[ \lim_{k\to\infty} (x, y_k) = \lim_{k\to\infty} (Px, y_k) = f(Px) = (x, y) \quad \text{for all } x \in X. \]
This means that $y_k \rightharpoonup y$.


Remark 2.1.26. Weak convergence in a dual space $X^*$ is more confusing since two approaches can be used. We say that a sequence $\{f_n\}_{n=1}^{\infty} \subset X^*$

(i) converges weakly to $f \in X^*$ (notation $f_n \rightharpoonup f$ or $\operatorname{w-lim}_{n\to\infty} f_n = f$) if
\[ \lim_{n\to\infty} F(f_n) = F(f) \quad \text{for every } F \in X^{**}; \]

(ii) converges weak star to $f \in X^*$ (notation $f_n \overset{*}{\rightharpoonup} f$ or $\operatorname{w^*-lim}_{n\to\infty} f_n = f$) if
\[ \lim_{n\to\infty} f_n(x) = f(x) \quad \text{for every } x \in X. \]

Criteria for weak convergence in $L^p$-spaces can be found, e.g., in Dunford & Schwartz [44, Chapter IV, 8].

The weak convergence in $X^*$ has obviously the same properties as that in $X$. Because of the continuous embedding $\kappa : X \to X^{**}$ (see the proof of Proposition 2.1.22(iii)) the w-convergence implies the w$^*$-convergence. The converse is true if $X$ is a reflexive space, i.e., $\kappa(X) = X^{**}$. Since the w$^*$-topology is generally weaker than the w-topology, there can exist more w$^*$-compact sets than w-compact ones. In fact, the following result (the Alaoglu–Bourbaki Theorem, see Conway [28], Dunford & Schwartz [44], Fabian et al. [49]) holds:

If $X$ is a normed linear space, then any closed ball in $X^*$ is w$^*$-compact. If, moreover, $X$ is separable, then the ball is also sequentially w$^*$-compact.

For example, this theorem can be applied to balls in $L^p(\Omega)$, $1 < p \le \infty$.

In the rest of this section we will examine adjoint operators. Suppose that $X$ and $Y$ are normed linear spaces and $A \in L(X, Y)$. If $g \in Y^*$, then
\[ A^* g := g \circ A \in X^*. \]
The operator $A^* : Y^* \to X^*$ is obviously linear, and it is also continuous since
\[ |A^* g(x)| = |g(Ax)| \le \|g\|_{Y^*} \|Ax\|_Y \le \|g\|_{Y^*} \|A\|_{L(X,Y)} \|x\|_X. \]
If $H_1$, $H_2$ are Hilbert spaces and $A \in L(H_1, H_2)$ we have another approach to the definition of an adjoint operator, namely the one based on the Riesz Representation Theorem: For $y \in H_2$ the mapping $f : x \mapsto (Ax, y)_{H_2}$ is a continuous linear form on $H_1$, and hence there is $z \in H_1$ for which $f(x) = (x, z)_{H_1}$. This $z$ is uniquely determined by $y$, and we denote for a moment $z = A^+ y$, i.e.,
\[ (Ax, y)_{H_2} = (x, A^+ y)_{H_1}. \]


There is a very slight difference between $A^*$ and $A^+$, e.g., $(\alpha A)^* = \alpha A^*$ and $(\alpha A)^+ = \bar{\alpha} A^+$ (see also Example 2.1.28 below). So we will use the same notation, namely $A^*$, for both concepts.

Symmetric matrices have certain special properties (e.g., their canonical forms are diagonal). The same can be expected for their generalization in the Hilbert space setting which is defined as follows: An operator $A \in L(H)$ is said to be self-adjoint if $A = A^*$, i.e.,
\[ (Ax, y) = (x, Ay) \quad \text{for all } x, y \in H. \]

In order to generalize Theorem 1.1.25 to continuous linear operators on infinite dimensional normed linear spaces we will use the same notation but with a slightly different meaning:

If $M \subset X$, then $M^{\perp} := \{f \in X^* : x \in M \Rightarrow f(x) = 0\}$.
If $N \subset X^*$, then $N_{\perp} := \{x \in X : f \in N \Rightarrow f(x) = 0\}$.

We invite the reader to compare this symbol with that for orthogonal complements in Hilbert spaces.

Proposition 2.1.27. Let $X$, $Y$ be normed linear spaces and let $A \in L(X, Y)$. Then
(i) if $x_n \rightharpoonup x$, then $Ax_n \rightharpoonup Ax$;
(ii) if $A$ is, moreover, continuously invertible, then $A^*$ is also continuously invertible and $(A^*)^{-1} = (A^{-1})^*$;
(iii) $\operatorname{Ker} A = (\operatorname{Im} A^*)_{\perp}$;
(iv) $\overline{\operatorname{Im} A} = (\operatorname{Ker} A^*)_{\perp}$.

Proof. (i) It is easy with the use of $A^*$.
(ii) It is sufficient to show that $(A^{-1})^* A^* = I_{Y^*}$ and $A^* (A^{-1})^* = I_{X^*}$. This follows from the more general result $(AB)^* = B^* A^*$ which is easily verified.
(iii) The inclusion $\subset$ is obvious from the definition; for the converse inclusion $\supset$ it is sufficient to use the fact that $Y^*$ separates the points of $Y$.
(iv) It is easy to see that $(\operatorname{Im} A)^{\perp} = \operatorname{Ker} A^*$. To get (iv) it suffices to prove that $(M^{\perp})_{\perp} = \overline{\operatorname{Lin} M}$ for $M \subset X$. If $x_0$ belonged to $(M^{\perp})_{\perp} \setminus \overline{\operatorname{Lin} M}$, $x_0$ would be separated from $\overline{\operatorname{Lin} M}$ by a linear form $f \in X^*$ (Corollary 2.1.18). Since $\overline{\operatorname{Lin} M}$ is a subspace of $X$, this separating $f$ would be in $(\overline{\operatorname{Lin} M})^{\perp} = M^{\perp}$. Therefore $f(x_0) = 0$, and a contradiction is obtained. The converse inclusion $\overline{\operatorname{Lin} M} \subset (M^{\perp})_{\perp}$ is obvious.
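The Hilbert space adjoint introduced before Proposition 2.1.27 can be tried out in finite dimensions, where it is the conjugate transpose. The following sketch assumes $H_1 = \mathbb{C}^2$ and $H_2 = \mathbb{C}^3$ with the scalar product linear in the first argument; the matrix, the vectors, the scalar `alpha` and all function names are hypothetical choices made for illustration.

```python
def conj_transpose(A):
    """Conjugate transpose of a complex matrix given as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def inner(x, y):
    """Scalar product on C^n, linear in the first and conjugate linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

A = [[1 + 2j, 0.5j], [3 + 0j, -1j], [2 - 1j, 4 + 0j]]   # maps C^2 into C^3
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, 1 + 1j, -2 + 0j]
alpha = 2 - 3j

lhs = inner(apply(A, x), y)                     # (Ax, y) in C^3
rhs = inner(x, apply(conj_transpose(A), y))     # (x, A^+ y) in C^2

alpha_A = [[alpha * a for a in row] for row in A]
adj_of_scaled = conj_transpose(alpha_A)         # (alpha A)^+
scaled_adj = [[alpha.conjugate() * a for a in row] for row in conj_transpose(A)]
```

Up to rounding, `lhs` equals `rhs`, and `adj_of_scaled` coincides with `scaled_adj`, i.e., the scalar comes out of the Hilbert space adjoint with a complex conjugate, in contrast to the Banach space adjoint.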


Notice that the statement (iv) is not a sufficient condition for solvability of the equation $Ax = y$ since only the closure of $\operatorname{Im} A$ is characterized. There are many operators the range of which is not closed. A simple example is
\[ Ax(t) = \int_0^t x(s)\,ds \]
considered either in $C[0, 1]$ or in $L^2(0, 1)$. It is not an easy task to decide whether an operator has a closed range or not. The following statement is useful in applications.

If $X$, $Y$ are Banach spaces and $A \in L(X, Y)$ is injective, then $\operatorname{Im} A$ is closed if and only if there is a positive constant $c$ such that
\[ \|Ax\| \ge c\|x\| \quad \text{for all } x \in X. \]
Sufficiency is easy; the necessity part follows from the Open Mapping Theorem.

There is an important subclass of operators with a closed range, namely the so-called Fredholm operators. An operator $A \in L(X)$ is said to be Fredholm if
\[ \dim \operatorname{Ker} A < \infty, \quad \operatorname{Im} A \text{ is closed}, \quad \text{and} \quad \operatorname{codim} \operatorname{Im} A < \infty \]
(i.e., the dimension of any direct complement of $\operatorname{Im} A$ is finite). We note that $\operatorname{codim} \operatorname{Im} A = \dim \operatorname{Ker} A^*$ (this is basically Proposition 2.1.27(iv)). We define
\[ \operatorname{ind} A := \dim \operatorname{Ker} A - \dim \operatorname{Ker} A^* \]
and call it the index of the Fredholm operator. A special class of Fredholm operators will be examined in the next section.

We have not yet introduced any sufficiently broad family of continuous linear operators. The next example fills this gap.

Example 2.1.28 (Integral operators). Let $\Omega$ and $\tilde{\Omega}$ be open subsets of $\mathbb{R}^M$ and $\mathbb{R}^{\tilde{M}}$, respectively. Assume that $k : \tilde{\Omega} \times \Omega \to \mathbb{C}$ is a measurable function for which there are constants $c_1$, $c_2$ such that
\[ \int_{\Omega} |k(t, s)|\,ds \le c_1 \ \text{ for a.a. } t \in \tilde{\Omega}, \qquad \int_{\tilde{\Omega}} |k(t, s)|\,dt \le c_2 \ \text{ for a.a. } s \in \Omega. \]
Then the operator $A$ defined by
\[ Ax(t) = \int_{\Omega} k(t, s) x(s)\,ds \tag{2.1.3} \]
is a linear bounded operator from $L^p(\Omega)$ into $L^p(\tilde{\Omega})$ for $1 \le p \le \infty$.$^{12}$

$^{12}$ Conditions on the kernel $k$ which guarantee that $A \in L(L^p(\Omega), L^r(\tilde{\Omega}))$ can be found in Dunford & Schwartz [44, Chapter VI, 11A].


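To prove this assertion we have to show that $Ax(t)$ exists for a.a. $t \in \tilde{\Omega}$, is measurable on $\tilde{\Omega}$ and belongs to $L^p(\tilde{\Omega})$.$^{13}$ For $1 \le p < \infty$, by the Hölder inequality, we get for $\frac{1}{p'} = 1 - \frac{1}{p}$:
\[ |Ax(t)| \le \int_{\Omega} |k(t, s)|^{\frac{1}{p'}} |k(t, s)|^{\frac{1}{p}} |x(s)|\,ds \le c_1^{\frac{1}{p'}} \left( \int_{\Omega} |k(t, s)|\,|x(s)|^p\,ds \right)^{\frac{1}{p}}. \]
Set
\[ \varphi(t) := \left( \int_{\Omega} |k(t, s)|\,|x(s)|^p\,ds \right)^{\frac{1}{p}}. \]

$^{13}$ The case $p = \infty$ is left to the reader.

Since the measurable function $(t, s) \mapsto |k(t, s)|\,|x(s)|^p$ can be approximated by step functions (consider first $\tilde{\Omega} \times \Omega$ bounded), the function $t \mapsto \varphi(t)$ is measurable on $\tilde{\Omega}$. The Fubini Theorem yields
\[ \int_{\tilde{\Omega}} |\varphi(t)|^p\,dt = \int_{\tilde{\Omega}} \left( \int_{\Omega} |k(t, s)|\,|x(s)|^p\,ds \right) dt = \int_{\Omega} \left( \int_{\tilde{\Omega}} |k(t, s)|\,dt \right) |x(s)|^p\,ds \le c_2 \|x\|_{L^p(\Omega)}^p. \]
In particular, $\varphi$ is finite a.e. Since $t \mapsto Ax(t)$ is measurable on $\tilde{\Omega}$ (by the same argument as above), we also have
\[ \|Ax\|_{L^p(\tilde{\Omega})} = \left( \int_{\tilde{\Omega}} |Ax(t)|^p\,dt \right)^{\frac{1}{p}} \le c_1^{\frac{1}{p'}} c_2^{\frac{1}{p}} \|x\|_{L^p(\Omega)}. \]
The Fubini Theorem also yields (we identify $g \in [L^p(\tilde{\Omega})]^*$, $1 \le p < \infty$, with a function from $L^{p'}(\tilde{\Omega})$, see Example 2.1.20(iii))
\[ \int_{\tilde{\Omega}} Ax(t) g(t)\,dt = \int_{\tilde{\Omega}} \left( \int_{\Omega} k(t, s) x(s)\,ds \right) g(t)\,dt = \int_{\Omega} \left( \int_{\tilde{\Omega}} k(t, s) g(t)\,dt \right) x(s)\,ds = \int_{\Omega} (A^* g)(s) x(s)\,ds, \]
i.e.,
\[ A^* g : s \mapsto \int_{\tilde{\Omega}} k(t, s) g(t)\,dt, \quad g \in L^{p'}(\tilde{\Omega}). \]
We note that the adjoint operator to $A$ for $p = 2$ in the sense of the Riesz Representation Theorem is of the form
\[ A^* g(s) = \int_{\tilde{\Omega}} \overline{k(t, s)}\,g(t)\,dt. \]
In particular, $A$ is self-adjoint if $\Omega = \tilde{\Omega}$ and $k(t, s) = \overline{k(s, t)}$. We will continue the study of integral operators in the next section (Example 2.2.5).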
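The norm estimate $\|Ax\|_{L^p} \le c_1^{1/p'} c_2^{1/p} \|x\|_{L^p}$ from Example 2.1.28 has a discrete analogue in which the integrals become finite sums (counting measure), the kernel becomes a matrix, $c_1$ is the largest absolute row sum and $c_2$ the largest absolute column sum. The sketch below checks this analogue for one hypothetical kernel and one hypothetical vector; it illustrates the inequality and proves nothing.

```python
def p_norm(v, p):
    """Discrete L^p norm with respect to the counting measure."""
    return sum(abs(a) ** p for a in v) ** (1.0 / p)

def apply(K, x):
    """Discrete integral operator: (Kx)(t) = sum over s of k(t, s) x(s)."""
    return [sum(a * b for a, b in zip(row, x)) for row in K]

def schur_bound(K, p):
    """c1^(1/p') * c2^(1/p), with c1 the largest row sum and c2 the largest column sum."""
    c1 = max(sum(abs(a) for a in row) for row in K)
    c2 = max(sum(abs(row[j]) for row in K) for j in range(len(K[0])))
    return c1 ** (1.0 - 1.0 / p) * c2 ** (1.0 / p)

# Hypothetical kernel k(t, s) = 1 / (1 + |t - s|) on a 4 x 6 grid.
K = [[1.0 / (1 + abs(t - s)) for s in range(6)] for t in range(4)]
x = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25]

exponents = (1.0, 1.5, 2.0, 4.0)
ratios = {p: p_norm(apply(K, x), p) / p_norm(x, p) for p in exponents}
bounds = {p: schur_bound(K, p) for p in exponents}
```

For each exponent in `exponents` the quotient `ratios[p]` stays below `bounds[p]`, as the estimate predicts.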


In Example 2.1.11 we have mentioned that differential operators on a function space are not continuous and are only densely defined. Therefore we wish to extend the notion of the adjoint operator to this case. Assume that $A$ is a linear operator defined on a dense subspace $\operatorname{Dom} A$ of $X$ with values in $Y$. Put
\[ D^* = \{g \in Y^* : \text{the linear form } x \in \operatorname{Dom} A \mapsto g(Ax) \text{ has a continuous extension } f \text{ to the whole of } X\}. \]
Obviously, $D^*$ is a linear subspace of $Y^*$ containing $o$ and the extension $f$ is uniquely determined by $g$. We denote
\[ A^* g := f, \quad \operatorname{Dom}(A^*) = D^* \]
and call $A^*$ the adjoint operator to $A$.

Example 2.1.29. The simplest differential operator is defined by $Ax(t) = \dot{x}(t)$. This relation can be considered in various function spaces and also with different domains. If we are interested in its adjoint we should have a good representation of the dual space. This leads to an observation that spaces of integrable functions would be more convenient than spaces of continuous functions. Therefore let $X = L^p(0, 1)$, $1 \le p < \infty$, and $\operatorname{Dom} A = C^1[0, 1]$. Consider $A : \operatorname{Dom} A \subset X \to X$. We wish to compute $A^*$. Assume $g \in \operatorname{Dom}(A^*) \subset L^{p'}(0, 1)$ and $A^* g = f$, i.e.,
\[ g(Ax) = \int_0^1 \dot{x}(t) g(t)\,dt = \int_0^1 x(t) f(t)\,dt = A^* g(x) \quad \text{for all } x \in \operatorname{Dom} A. \]
In particular, for
\[ x \in V = \{x \in \operatorname{Dom} A : x(1) = 0\} \quad \text{and} \quad F(t) = \int_0^t f(s)\,ds, \]
the integration by parts$^{14}$ yields
\[ \int_0^1 x(t) f(t)\,dt = x(t) F(t)\Big|_0^1 - \int_0^1 \dot{x}(t) F(t)\,dt = -\int_0^1 \dot{x}(t) F(t)\,dt. \]

$^{14}$ If you are not familiar with integration by parts for the Lebesgue integral (notice that $f \in L^{p'}(0, 1) \subset L^1(0, 1)$), you can approximate $f$ by a continuous function to get a standard situation for integration by parts.

Since the restriction $A|_V$ of $A$ to $V$ has a dense range in $L^p(0, 1)$ ($\operatorname{Im} A|_V = C[0, 1]$), we have $F + g = o$ in $L^{p'}(0, 1)$. This means that $g$ can be changed on a set of measure zero to have $g$ absolutely continuous and
\[ \dot{g} = -f \in L^{p'}(0, 1), \quad \text{i.e.,} \quad g \in W^{1,p'}(0, 1). \]
Moreover, $g(0) = -F(0) = 0$. Taking $F(t) = -\int_t^1 f(s)\,ds$ we see that also $g(1) = 0$. This proves that
\[ \operatorname{Dom}(A^*) \subset \{g \in W^{1,p'}(0, 1) : g(0) = g(1) = 0\} = W_0^{1,p'}(0, 1)\,^{15} \quad \text{and} \quad A^* g = -\dot{g}. \]
Integration by parts yields also the converse inclusion, i.e.,
\[ \operatorname{Dom}(A^*) = W_0^{1,p'}(0, 1). \]
Notice that $\operatorname{Im} A$ is dense in $L^p(0, 1)$ but not closed, while $A^*$ is injective and
\[ \operatorname{Im} A^* = \left\{ f \in L^{p'}(0, 1) : \int_0^1 f(t)\,dt = 0 \right\} \]
is closed but not dense in $L^{p'}(0, 1)$. Notice also that $\rho(A) = \rho(A^*) = \emptyset$ and any $\lambda \in \mathbb{C}$ is an eigenvalue of $A$. To the contrary, $A^*$ has no eigenvalues.

$^{15}$ The last equality should be proved. A deeper insight into these Sobolev spaces will be given in Chapter 7, cf. also Exercise 1.2.46.

A more general result (due to S. Banach) is stated in the following proposition (see, e.g., Yosida [135]).

Proposition 2.1.30. Let $X$, $Y$ be Banach spaces and let $A$ be a closed densely defined linear operator from $X$ into $Y$. Then $\operatorname{Im} A$ is closed if and only if $\operatorname{Im} A^*$ is closed. Moreover,
\[ \operatorname{Im} A = (\operatorname{Ker} A^*)_{\perp} \quad \text{and} \quad \operatorname{Ker} A = (\operatorname{Im} A^*)_{\perp}. \]

Nevertheless, notice that $A$ is not closed in our example. Proposition 2.1.30 can be applied to $A^*$ ($A^*$ is always closed); $\operatorname{Dom}(A^{**}) = W^{1,p}(0, 1)$, $A^{**} x = \dot{x}$.$^{16}$ This simple example shows how the domain of a (linear) noncontinuous operator affects its properties.

$^{16}$ Notice that $A^{**}$ is an extension of $A$ and, moreover, the graph of $A^{**}$ is the closure of the graph of $A$ (it is also said that $A^{**}$ is the closure of $A$).

Example 2.1.31. Put
\[ Ax = -\ddot{x} \quad \text{with} \quad \operatorname{Dom} A = \{x \in C^2(a, b) : x(a) = x(b) = 0\}. \]
If the equation $Ax = \lambda x$


has a nonzero solution $w$ ($\in \operatorname{Dom} A$), then $\lambda$ is called an eigenvalue and $w$ a corresponding eigenfunction of $A$. Simple calculation shows that $\frac{k^2 \pi^2}{(b-a)^2}$ are all eigenvalues of $A$,$^{17}$ and $\sin \frac{k\pi}{b-a}(t - a)$ are the corresponding eigenfunctions. Consider now the boundary value problem
\[ \begin{cases} -\ddot{x}(t) = \lambda x(t) + f(t), & t \in (a, b), \\ x(a) = x(b) = 0. \end{cases} \tag{2.1.4} \]

$^{17}$ The minus sign in the definition of $A$ is conventional; it is introduced to obtain positive eigenvalues.

Let $\varphi_1$, $\varphi_2$ be a fundamental system for the differential equation $-\ddot{x} - \lambda x = 0$. The Variation of Constants Formula shows that
\[ x(t) = c_1 \varphi_1(t) + c_2 \varphi_2(t) + \int_a^t \frac{\varphi_1(s)\varphi_2(t) - \varphi_1(t)\varphi_2(s)}{W(s)} f(s)\,ds \tag{2.1.5} \]
is a solution to $-\ddot{x} - \lambda x = f$. Here $W$ is the Wronski determinant of $\varphi_1$, $\varphi_2$ (notice that for this equation we always can choose $\varphi_1$, $\varphi_2$ such that $W \equiv 1$). We wish to find constants $c_1$, $c_2$ such that $x$ given by (2.1.5) satisfies the boundary conditions $x(a) = x(b) = 0$. The number $\lambda$ is not an eigenvalue if and only if
\[ \det \begin{pmatrix} \varphi_1(a) & \varphi_2(a) \\ \varphi_1(b) & \varphi_2(b) \end{pmatrix} \ne 0. \]
In this case the formula (2.1.5) shows that for any $f \in C[a, b]$ the problem (2.1.4) has a unique solution in $\operatorname{Dom} A$,$^{18}$ which is called a classical solution. This means that $\lambda \in \rho(A)$.

$^{18}$ If $f \in L^p(a, b)$, then it is possible to show that the function $x = x(t)$ given by (2.1.5) belongs to $W^{2,2}(a, b)$, $x(a) = x(b) = 0$, and the equation in (2.1.4) is satisfied a.e. in $(a, b)$. Such a solution is called a strong solution.

Suppose now that $\lambda$ is an eigenvalue. Then we can take $\varphi_1$ as a corresponding eigenfunction and get
\[ x(a) = c_2 \varphi_2(a), \quad \text{i.e.,} \quad c_2 = 0 \]
($\varphi_2(a) \ne 0$ since $\varphi_1$, $\varphi_2$ are linearly independent), and
\[ x(b) = \varphi_2(b) \int_a^b \varphi_1(s) f(s)\,ds = 0, \quad \text{i.e.,} \quad \int_a^b \varphi_1(s) f(s)\,ds = 0 \tag{2.1.6} \]
since $\varphi_2(b) \ne 0$ (by the same argument as above). Notice that (2.1.6) is also a necessary condition for solvability of (2.1.4). We will return to this example in the next section (see Example 2.2.17).

Example 2.1.32. Linear differential operators of the second order with nonconstant coefficients are more complicated. To simplify our exposition we consider a differential expression
\[ Lx := p_0 \ddot{x} + p_1 \dot{x} + p_2 x \]


where $\ddot{p}_0$, $\dot{p}_1$, $p_2$ are continuous functions on a closed bounded interval $[a, b]$ and $p_0 < 0$ on this interval (the so-called regular case). Let $X = L^p(a, b)$, $1 \le p < \infty$, and
\[ D = \{x \in W^{2,p}(a, b) : x(a) = x(b) = 0\}. \]
Put
\[ Ax = Lx, \quad x \in D = \operatorname{Dom} A, \]
and consider $A : \operatorname{Dom} A \subset X \to X$. A solution of $Ax = f$ is therefore a strong solution of
\[ \begin{cases} Lx(t) = f(t), & t \in (a, b), \\ x(a) = x(b) = 0. \end{cases} \]
It can be proved that $A$ is injective provided $p_2 > 0$ in $[a, b]$. (Assume by contradiction that $\operatorname{Ker} A \ne \{o\}$ and show that there is $x_0 \in \operatorname{Ker} A$ which has a negative minimum at an interior point $c \in (a, b)$. Deduce that $Lx_0(c) < 0$.) The Variation of Constants Formula shows that the operator $A$ is also surjective and $A^{-1}$ is an integral operator
\[ A^{-1} f(t) = \int_a^b G(t, s) f(s)\,ds \tag{2.1.7} \]
where $G$ is the so-called Green function of $L$. The Green function is nonnegative on $[a, b] \times [a, b]$ and satisfies the estimates from Example 2.1.28. Therefore $A^{-1} \in L(X)$.

In order to calculate the adjoint $A^*$ it is convenient to consider the so-called formal adjoint expression to $L$, i.e.,
\[ My = (p_0 y)^{\cdot\cdot} - (p_1 y)^{\cdot} + p_2 y, \]
which is obtained by integrating by parts in the integral $\int_a^b Lx(t) y(t)\,dt$ and omitting the boundary terms. Put
\[ By = My \quad \text{for } y \in D = \operatorname{Dom} B. \]
The same integration as above shows that $B \subset A^*$. The proof of the equality $A^* = B$ needs a more careful calculation. The interested reader can consult the books Coddington & Levinson [27, Chapter 9], Edmunds & Evans [46] or Dunford & Schwartz [45], in particular Chapter XIII, for details and also for more complicated singular cases which are important in applications, e.g., in Quantum Mechanics (the Schrödinger equation).


Exercise 2.1.33. Let $X$, $Y$ be Banach spaces. If $A \in L(X, Y)$ has a continuous inverse $A^{-1} \in L(Y, X)$ and $B \in L(X, Y)$ is such that
\[ \|B - A\| < \frac{1}{\|A^{-1}\|}, \]
then $B$ is also continuously invertible and
\[ \|B^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|B - A\|}, \qquad \|B^{-1} - A^{-1}\| \le \frac{\|A^{-1}\|^2}{1 - \|A^{-1}\|\,\|B - A\|}\,\|B - A\|. \]
Hint. Examine the proof of Corollary 2.1.3 and write $A^{-1} B = A^{-1}(B - A) + I$.

Exercise 2.1.34. Show that
\[ e^{tA} = \sum_{n=0}^{\infty} \frac{t^n A^n}{n!} \]
is well defined for all $t \in \mathbb{R}$, $A \in L(X)$, provided $X$ is a Banach space, and, moreover, the vector function $\varphi : t \mapsto e^{tA} x_0$ solves the differential equation $\dot{x}(t) = Ax(t)$ and satisfies the initial condition $\varphi(0) = x_0$. (See also the end of Section 1.1, in particular Exercise 1.1.41.)

Exercise 2.1.35. Let $K$ be a continuous real function on $[a, b] \times [a, b]$ and let $h \in C[a, b]$ be fixed. Let
\[ M = \max_{(t,\tau) \in [a,b] \times [a,b]} |K(t, \tau)| \]
and let $\lambda \in \mathbb{R}$ be such that
\[ |\lambda| < \frac{1}{M(b - a)}. \]
Prove that the integral equation
\[ x(t) = \lambda \int_a^b K(t, \tau) x(\tau)\,d\tau + h(t) \]
has a unique solution $x \in C[a, b]$.

Exercise 2.1.36. Let $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ be sequences in a Hilbert space $H$ such that $x_n \rightharpoonup x$, $y_n \to y$. Then $(x_n, y_n) \to (x, y)$.
Hint. Use Proposition 2.1.22(iii).

Exercise 2.1.37. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal sequence in a Hilbert space. Show that $e_n \rightharpoonup o$.
Hint. Use the Bessel inequality (1.2.17).

Exercise 2.1.38. Prove assertion (iv) of Proposition 2.1.22 for a Hilbert space $X$.
Hint. Use the relation between the scalar product and the norm in $X$.

Exercise 2.1.39. Show that a convex set (in particular a subspace) of a normed linear space is weakly closed if and only if it is closed in the norm topology.
Hint. Suppose by contradiction that $C$ is a norm-closed convex set which is not weakly closed. Then there is $x_0 \in \overline{C}^{\,w} \setminus C$. Use the Separation Theorem (Corollary 2.1.18) to obtain a contradiction.

Exercise 2.1.40. Prove that actually $\|A^*\|_{L(Y^*, X^*)} = \|A\|_{L(X,Y)}$.
Hint. The inequality $\|A^*\| \le \|A\|$ follows from the calculation after Remark 2.1.26. For the converse inequality use the dual characterization of the norm $\|Ax\|$.
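The exponential series of Exercise 2.1.34 can be explored numerically before attempting the proof. The sketch below is an illustration for one hypothetical $2 \times 2$ matrix, for which $A^2 = -I$ and hence $e^{tA} = (\cos t) I + (\sin t) A$ in closed form; the function names and the number of terms are our own choices.

```python
import math

def mat_mul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_tA(A, t, terms=40):
    """Partial sum of sum_{n>=0} t^n A^n / n! with `terms` summands."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # n = 0 summand
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[t * v / k for v in row] for row in term]   # now t^k A^k / k!
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total

A = [[0.0, 1.0], [-1.0, 0.0]]     # hypothetical example with A^2 = -I
t = 0.7
E = exp_tA(A, t)
expected = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
```

The entries of `E` agree with `expected` up to rounding, and `exp_tA(A, 0.0)` returns the identity, which matches the initial condition $\varphi(0) = x_0$.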

2.2 Compact Operators

In this section we present a class of continuous linear operators the properties of which are closely related to the properties of finite dimensional linear operators. The key assertions presented concern the Riesz–Schauder Theory and the Hilbert–Schmidt Theorem.

Definition 2.2.1. Let $X$ and $Y$ be normed linear spaces. A linear operator $A \in L(X, Y)$ is called a compact operator if the image of a ball in $X$ is relatively compact in $Y$. The set of all compact operators from $X$ into $Y$ is denoted by $\mathscr{C}(X, Y)$.

Remark 2.2.2.
(i) Every compact linear operator is continuous.
(ii) The compactness condition is mostly used in the following equivalent form: For any bounded sequence $\{x_n\}_{n=1}^{\infty} \subset X$ there is a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ such that $Ax_{n_k}$ converge in the norm topology of $Y$.
(iii) Replacing the norm topology in $Y$ by the weak topology, a weakly compact operator can be defined. If either $X$ or $Y$ is reflexive, then any $A \in L(X, Y)$ is weakly compact. This follows from the Eberlein–Smulyan Theorem (Remark 2.1.24) and the observation that $A \in L(X, Y)$ maps a weakly convergent sequence into a weakly convergent one (cf. Proposition 2.1.27(i)).


Example 2.2.3.
(i) If $A \in L(X, Y)$ and $\dim \operatorname{Im} A < \infty$ (the so-called operator of finite rank), then $A \in \mathscr{C}(X, Y)$.
(ii) Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Put $Ae_n = \lambda_n e_n$ and extend $A$ by linearity to the dense set $D := \operatorname{Lin}\{e_1, \dots\}$ in $H$. The operator $A$ is bounded on $D$ (and therefore it can be uniquely extended to a continuous operator on $H$) if and only if $\{\lambda_n\}_{n=1}^{\infty}$ is a bounded sequence. In addition,
\[ \|A\| = \sup_n |\lambda_n|. \]
This follows immediately from the identity
\[ \|Ax\|^2 = \sum_{n=1}^{\infty} |\lambda_n|^2 |(x, e_n)|^2 \quad \text{for every } x \in H. \]
Moreover, $A$ is a compact operator on $H$ if and only if
\[ \lim_{n\to\infty} \lambda_n = 0. \]
This is an easy consequence of Proposition 1.2.39.
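One way to get a feeling for the condition $\lambda_n \to 0$ is to cut the diagonal operator off after $N$ terms. Writing $A_N$ (our notation, not the book's) for the finite rank operator with $A_N e_n = \lambda_n e_n$ for $n \le N$ and $A_N e_n = o$ otherwise, the norm formula above applied to $A - A_N$ gives $\|A - A_N\| = \sup_{n > N} |\lambda_n|$. The sketch below evaluates this tail supremum for two hypothetical sequences, with the supremum truncated at a finite index as a stand-in for the infinite one.

```python
def tail_sup(lam, N, n_max=10_000):
    """Largest |lam(n)| for N < n <= n_max: a finite stand-in for ||A - A_N||."""
    return max(abs(lam(n)) for n in range(N + 1, n_max + 1))

def decaying(n):
    """lambda_n = 1/n, which tends to zero."""
    return 1.0 / n

def constant(n):
    """lambda_n = 1, which does not tend to zero (A is the identity)."""
    return 1.0

cutoffs = (1, 10, 100)
errors_decaying = [tail_sup(decaying, N) for N in cutoffs]   # 1/(N+1), shrinking
errors_constant = [tail_sup(constant, N) for N in cutoffs]   # stuck at 1
```

For $\lambda_n = 1/n$ the finite rank truncations approach $A$ in norm, whereas for $\lambda_n = 1$ the distance never drops below $1$, in line with the fact that the identity on an infinite dimensional space is not compact.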

Proposition 2.2.4. Let $X$, $Y$ and $Z$ be normed linear spaces. Then
(i) if $A \in \mathscr{C}(X, Y)$, $B \in L(Y, Z)$, then $BA \in \mathscr{C}(X, Z)$;
(ii) if $A \in \mathscr{C}(Y, Z)$, $B \in L(X, Y)$, then $AB \in \mathscr{C}(X, Z)$;
(iii) if $A \in \mathscr{C}(X, Y)$ and a sequence $\{x_n\}_{n=1}^{\infty} \subset X$ converges weakly to $x \in X$, then $\lim_{n\to\infty} \|Ax_n - Ax\| = 0$;
(iv) assume that $Y$ is a Banach space and a sequence $\{A_n\}_{n=1}^{\infty} \subset \mathscr{C}(X, Y)$ converges to $A \in L(X, Y)$ in the norm operator topology. Then $A \in \mathscr{C}(X, Y)$.

Proof. The assertions (i) and (ii) are obvious. To prove (iii) assume by contradiction that there is a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ such that
\[ \|Ax_{n_k} - Ax\| \ge c > 0. \]
The sequence $\{x_n\}_{n=1}^{\infty}$ is bounded (Proposition 2.1.22(iii)), and hence there exist a subsequence $\{x_{n_{k_l}}\}_{l=1}^{\infty}$ and $y \in Y$ such that
\[ \|Ax_{n_{k_l}} - y\| \to 0. \]
Since
\[ f(Ax_n) = A^* f(x_n) \to A^* f(x) = f(Ax) \quad \text{for every } f \in Y^*, \]
we have $y = Ax$, and hence a contradiction.


(iv) Let $B(o; 1)$ be the unit ball. By Proposition 1.2.3 it suffices to show that for any $\varepsilon > 0$ there is a finite $\varepsilon$-net of $A(B(o; 1))$. We choose $n$ such that $\|A_n - A\| < \frac{\varepsilon}{2}$, and a finite $\frac{\varepsilon}{2}$-net for $A_n(B(o; 1))$. By the triangle inequality, this is the desired $\varepsilon$-net for $A(B(o; 1))$.

Example 2.2.5.
(i) Let $k$ be a continuous function on the Cartesian product $[a, b] \times [a, b]$. Then the operator
\[ Ax : t \in [a, b] \mapsto \int_a^b k(t, s) x(s)\,ds \]
is compact as an operator from $C[a, b]$ into itself.$^{19}$ We give two proofs of this assertion.

$^{19}$ This is true under more general assumptions, e.g., if the interval $[a, b]$ is replaced by a compact topological space $K$, $\mu$ is a Borel measure on $K$ and $A$ is defined by $Ax(t) = \int_K k(t, s) x(s)\,d\mu(s)$.

The first is based on the use of the Arzelà–Ascoli Theorem (Theorem 1.2.13). Its assumptions are satisfied for $\mathcal{F} = A(B(o; 1))$ where $B(o; 1)$ is the unit ball in $C[a, b]$. The equicontinuity of $\mathcal{F}$ follows from the uniform continuity of $k$ on $[a, b] \times [a, b]$.

The second proof uses Proposition 2.2.4(iv). Put
\[ \mathcal{A} = \operatorname{Lin}\{(t, s) \mapsto x(t) y(s) : x, y \in C[a, b]\}. \]
It is easy to see that $\mathcal{A}$ is a subalgebra of $C([a, b] \times [a, b])$ which satisfies the assumptions of the real or complex Stone–Weierstrass Theorem (Theorem 1.2.14). Hence there are functions $k_n \in \mathcal{A}$, i.e., finite sums
\[ k_n(t, s) = \sum_{i=1}^{m_n} q_{n,i}(t)\,r_{n,i}(s), \quad q_{n,i}, r_{n,i} \in C[a, b], \]
such that $k_n(t, s) \rightrightarrows k(t, s)$ uniformly in $[a, b] \times [a, b]$. In particular, this means that the operators
\[ A_n x : t \mapsto \sum_{i=1}^{m_n} q_{n,i}(t) \int_a^b r_{n,i}(s) x(s)\,ds \]
converge in the operator norm to $A$. Since $\operatorname{Im} A_n \subset \operatorname{Lin}\{q_{n,1}, \dots, q_{n,m_n}\}$, all $A_n$ are compact and, therefore, $A$ is compact.

(ii) Let $\Omega$ be a measurable subset of $\mathbb{R}^M$ and let $k \in L^2(\Omega \times \Omega)$. Then the operator
\[ Ax(t) = \int_{\Omega} k(t, s) x(s)\,ds \]
(the so-called Hilbert–Schmidt operator) is compact as an operator from $L^2(\Omega)$ into itself. We present again two proofs of this statement. The first will be a typical Hilbert space proof, the second will use the reflexivity of $L^2(\Omega)$ and we will show how it could be used to get compactness of an integral operator on $L^p(\Omega)$.


The first proof is based on the following observation: Let $\{e_k\}_{k=1}^{\infty}$, $\{f_k\}_{k=1}^{\infty}$ be two orthonormal bases in a separable Hilbert space $H$. Let $B \in L(H)$. By the Parseval equality we have
\[ n^2(B) := \sum_{k,n=1}^{\infty} |(Be_k, f_n)|^2 = \sum_{k=1}^{\infty} \|Be_k\|^2 = \sum_{n=1}^{\infty} \|B^* f_n\|^2 \le \infty. \]
This shows that the quantity $n(B)$ depends only on $B$ and not on the particular choice of bases. Moreover, if $n(B) < \infty$, then $B \in \mathscr{C}(H)$. To see this take $n_\varepsilon \in \mathbb{N}$ such that $\sum_{n=n_\varepsilon+1}^{\infty} \|B^* f_n\|^2 < \varepsilon$ and define
\[ B_\varepsilon x = \sum_{n=1}^{n_\varepsilon} (Bx, f_n) f_n. \]
Then $\dim \operatorname{Im} B_\varepsilon < \infty$ and
\[ \|B_\varepsilon x - Bx\|^2 = \sum_{n=n_\varepsilon+1}^{\infty} |(Bx, f_n)|^2 \le \|x\|^2 \sum_{n=n_\varepsilon+1}^{\infty} \|B^* f_n\|^2 \le \varepsilon \|x\|^2. \]
The compactness of $B$ follows from Proposition 2.2.4(iv).

In order to apply this statement to the Hilbert–Schmidt operator choose an orthonormal basis $\{e_n\}_{n=1}^{\infty}$ in $L^2(\Omega)$ and notice that
\[ \varphi_{m,n}(t, s) := e_m(t)\,\overline{e_n(s)} \]
is an orthonormal set in $L^2(\Omega \times \Omega)$. Notice that $\{\varphi_{m,n}\}_{m,n=1}^{\infty}$ is an orthonormal basis (use Corollary 1.2.36). Since
\[ (Ae_n, e_m)_{L^2(\Omega)} = (k, \varphi_{m,n})_{L^2(\Omega \times \Omega)}, \]
the finiteness of $n(A)$ follows from the Bessel inequality.

Now we give the second proof. Let $\{x_n\}_{n=1}^{\infty}$ be a bounded sequence in $L^2(\Omega)$. Since $L^2(\Omega)$ as a Hilbert space is reflexive, there is a subsequence (denote it again by $\{x_n\}_{n=1}^{\infty}$) which is weakly convergent to an $x$ in $L^2(\Omega)$. In particular,
\[ \int_{\Omega} k(t, s) x_n(s)\,ds \to \int_{\Omega} k(t, s) x(s)\,ds \quad \text{for a.a. } t \in \Omega \]
(the Fubini Theorem shows that $k(t, \cdot) \in L^2(\Omega)$ for a.a. $t \in \Omega$). Since
\[ |Ax_n(t) - Ax(t)| \le \int_{\Omega} |k(t, s)|\,|x_n(s) - x(s)|\,ds \le \|x_n - x\|_{L^2(\Omega)} \left( \int_{\Omega} |k(t, s)|^2\,ds \right)^{\frac{1}{2}} \le c \left( \int_{\Omega} |k(t, s)|^2\,ds \right)^{\frac{1}{2}}, \]
the Lebesgue Dominated Convergence Theorem yields
\[ \|Ax_n - Ax\|_{L^2(\Omega)} \to 0. \]
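The basis independence of $n(B)$ used in the first proof is easy to observe in finite dimensions, where $n^2(B)$ is the sum of the squares of all matrix entries. The following sketch assumes $H = \mathbb{R}^2$ and compares the standard basis with a rotated one; the matrix, the angle and the function names are hypothetical.

```python
import math

def apply(B, x):
    """Matrix-vector product."""
    return [sum(b * v for b, v in zip(row, x)) for row in B]

def hs_square(B, basis):
    """Sum of ||B e||^2 over the vectors e of an orthonormal basis."""
    return sum(sum(c * c for c in apply(B, e)) for e in basis)

B = [[1.0, 2.0], [-0.5, 3.0]]                      # hypothetical operator on R^2
theta = 0.9                                        # arbitrary rotation angle
standard = [[1.0, 0.0], [0.0, 1.0]]
rotated = [[math.cos(theta), math.sin(theta)],
           [-math.sin(theta), math.cos(theta)]]    # another orthonormal basis

n2_standard = hs_square(B, standard)
n2_rotated = hs_square(B, rotated)
entry_sum = sum(b * b for row in B for b in row)   # 1 + 4 + 0.25 + 9
```

All three numbers coincide up to rounding. For the integral operator of Example 2.2.5(ii) the Bessel inequality plays the same role and bounds $n^2(A)$ by $\|k\|^2_{L^2(\Omega \times \Omega)}$.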


Proposition 2.2.6. Let $H$ be a Hilbert space and $A \in L(H)$. Then $A$ is a compact operator if and only if there is a sequence $\{A_n\}_{n=1}^{\infty} \subset L(H)$ of operators of finite rank which converges to $A$ in the operator norm topology.

Proof. Because of Proposition 2.2.4 only the necessity part is left to be proved. Let $B(o; 1)$ be the unit ball in $H$. Since $\overline{A(B(o; 1))}$ is compact, it is a separable metric space, and therefore $Y = \overline{\operatorname{Lin} A(B(o; 1))}$ is a separable Hilbert space. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in $Y$. Put
\[ A_n x = \sum_{k=1}^{n} (Ax, e_k) e_k. \]
Then $A_n$ have finite rank and
\[ \|A_n x - Ax\|^2 = \sum_{k=n+1}^{\infty} |(Ax, e_k)|^2 < \varepsilon \quad \text{for every } x \in B(o; 1) \]
provided $n$ is sufficiently large (Proposition 1.2.39).

Remark 2.2.7. The proof of the preceding proposition indicates that the result holds also in a Banach space $X$ with a Schauder basis $\{e_n\}_{n=1}^{\infty}$ (see page 40). The famous conjecture of S. Banach was that any separable Banach space has a Schauder basis. The first counterexample was constructed by P. Enflo. He found a compact operator in a separable Banach space which cannot be approximated by operators of finite rank. We notice that separable Banach spaces of functions like $C(\Omega)$, $L^p(\Omega)$, $W^{k,p}(\Omega)$ ($1 \le p < \infty$) have a Schauder basis.

One of our goals in this section is to generalize the Fredholm alternative (see footnote 6 on page 14). As we have seen in Section 1.1 the notion of the adjoint operator is very important.

Proposition 2.2.8 (Schauder). Let $X$, $Y$ be Banach spaces and assume that $A \in L(X, Y)$. Then $A$ is compact if and only if $A^*$ is compact.

Proof. Step 1 (the "only if" part). Suppose that $A \in \mathscr{C}(X, Y)$ and $\{g_n\}_{n=1}^{\infty} \subset Y^*$, $\|g_n\|_{Y^*} \le 1$. It is easy to verify the assumptions of the Arzelà–Ascoli Theorem (Theorem 1.2.13) for the sequence of functions
\[ g_n : K := \overline{A(B(o; 1))} \to \mathbb{R} \quad (\text{or } \mathbb{C}) \]
($B(o; 1)$ is the unit ball in $X$). By this theorem there is a subsequence $\{g_{n_k}\}_{k=1}^{\infty}$ which is uniformly convergent on $K$. Since
\[ |A^* g_{n_k}(x) - A^* g_{n_l}(x)| \le \sup_{y \in K} |g_{n_k}(y) - g_{n_l}(y)| \quad \text{for each } x \in B(o; 1) \]
and $X^*$ is complete, the sequence $\{A^* g_{n_k}\}_{k=1}^{\infty}$ is convergent in $X^*$.


Step 2 (the “if” part). Assume now that A∗ ∈ C (Y ∗ , X ∗ ). We embed X into X ∗∗ and Y into Y ∗∗ with help of the canonical isometrical embeddings κX and κY (see the proof of Proposition 2.1.22(iii)). Since A∗ is compact, A∗∗ is compact by the ﬁrst part of the proof. It suﬃces to show that κY (Ax) = A∗∗ κX (x)

for x ∈ X

and we leave that to the reader. If A ∈ C (X, Y ), then the equation Ax = y

(2.2.1)

is scarcely ever well posed20 as follows from the ﬁrst part of the next theorem. This is the reason why we are interested rather in equations of the type x − Ax = y.

(2.2.2)

Theorem 2.2.9 (Riesz–Schauder Theory). Let X be a Banach space and $A \in \mathcal{C}(X)$. Then
(i) if Im A is closed, then dim Im A < ∞;
(ii) dim Ker (I − A) < ∞;
(iii) Im (I − A) is closed;
(iv) (the Fredholm alternative) Im (I − A) = X if and only if Ker (I − A) = {o};
(v) dim Ker (I − A) = dim Ker (I* − A*).

Proof. (i) If Y = Im A is closed, then A : X → Y is an open mapping (Theorem 2.1.8). This means that a certain ball B(o; δ) in Y is contained in the relatively compact set A(B(o; 1)), i.e., B(o; δ) itself is relatively compact. By Proposition 1.2.15, dim Y < ∞.

(ii) For the rest of the proof we put $T := I - A$ and $Y := \operatorname{Ker} T$. Then the restriction of A to the Banach space Y maps Y onto Y. By (i), dim Y < ∞.

(iii) Because of (ii) there exists a continuous projection P of X onto Y (Remark 2.1.19). Denote $Z := \operatorname{Ker} P$, i.e., $X = Y \oplus Z$

20 An equation (2.2.1) is said to be well-posed if A is injective and $A^{-1}$ is continuous. If A is an integral operator, then (2.2.1) is called an integral equation of the first kind. The equation (2.2.2) is called an integral equation of the second kind. The research of these equations carried out by I. Fredholm is supposed to be one of the starting points in the development of functional analysis.


and both Y and Z are Banach spaces. Since T is injective on Z, Im T is closed provided there is a positive constant c such that
\[ \|Tz\| \ge c\|z\| \quad \text{for each } z \in Z, \]
see page 70. Suppose by contradiction that such c does not exist, i.e., there are $z_n \in Z$ such that
\[ \|z_n\| = 1 \quad \text{and} \quad \|Tz_n\| < \frac{1}{n}\|z_n\|. \]
Then one can find a subsequence $\{z_{n_k}\}_{k=1}^{\infty}$ for which $Az_{n_k}$ converges to a y. Since $Tz_{n_k} \to o$, we have $\lim_{k\to\infty} z_{n_k} = y \in Z$. This means that
\[ Ty = o, \quad \text{i.e.,} \quad y \in Y \cap Z, \quad \text{and thus} \quad y = o. \]
This is a contradiction since $z_{n_k} \to y$ implies that $\|y\| = 1$.

(iv) We will prove the necessity part by way of contradiction. Put $Y_k := \operatorname{Ker} T^k$. Then $Y_1 \subsetneq Y_2 \subsetneq \cdots \subsetneq Y_k \subsetneq \cdots$ since for $x_1 \in \operatorname{Ker}(I - A)$, $x_1 \ne o$, there is $x_2$ such that $x_1 = Tx_2$, i.e., $x_2 \in Y_2 \setminus Y_1$, etc. It follows from the construction in the proof of Proposition 1.2.15 that there are $y_k \in Y_k$, $\|y_k\| = 1$, such that $\operatorname{dist}(y_{k+1}, Y_k) \ge \frac{1}{2}$. For k > l we have
\[ \|Ay_k - Ay_l\| = \|y_k - (y_l - Ty_l + Ty_k)\| \ge \operatorname{dist}(y_k, Y_{k-1}) \ge \frac{1}{2}, \]
because $y_l - Ty_l + Ty_k \in Y_{k-1}$. This means that there is no convergent subsequence of $\{Ay_k\}_{k=1}^{\infty}$, a contradiction.

The sufficiency part is now easy: It follows from Proposition 2.2.8 and the previous part (iii) that Im T* is closed. Assume that Ker T = {o}. By Proposition 2.1.27(iii), $\operatorname{Im} T^* = (\operatorname{Ker} T)^{\perp} = X^*$. According to the first part of this proof, Ker T* = {o} and, again by (iii) and Proposition 2.1.27(iv), $\operatorname{Im} T = (\operatorname{Ker} T^*)_{\perp} = X$.

(v) As in the proof of (iii), $X = Y \oplus Z$ and the corresponding projection P of X onto Y is continuous. It can be shown that a direct complement W of Im T in X is isomorphic to Ker T*.$^{21}$ This means that dim W = dim Ker T* < ∞.

21 This is clear for X being a Hilbert space, since Im T is closed and the orthogonal complement $(\operatorname{Im} T)^{\perp}$ is equal to Ker T* (Proposition 2.1.27(iv)). In a general Banach space we can use the factor space $X|_{\operatorname{Im} T}$, which is algebraically isomorphic to a direct complement W of Im T, and for $g \in (X|_{\operatorname{Im} T})^*$ put $f(x) = g([x])$. It remains to show that the correspondence $g \mapsto f$ is an (isometric) isomorphism onto $(\operatorname{Im} T)^{\perp} = \operatorname{Ker} T^*$.


Denote
\[ \dim \operatorname{Ker} T = n \quad \text{and} \quad \dim \operatorname{Ker} T^* = n^*. \]
We shall prove that $n = n^*$. Assume that $n > n^*$. In particular, this means that there is a surjective linear operator $\Phi \in \mathcal{L}(Y, W)$. Such Φ cannot be injective (see Corollary 1.1.15), i.e., there is $x_0 \in Y$, $x_0 \ne o$, for which $\Phi(x_0) = o$. Put now $B := A + \Phi P$. Since $P \in \mathcal{C}(X)$, we have $B \in \mathcal{C}(X)$ and
\[ Bx_0 = Ax_0 + o = x_0, \quad \text{i.e.,} \quad \operatorname{Ker}(I - B) \ne \{o\}. \]
By the Fredholm alternative (iv), $\operatorname{Im}(I - B) \ne X$. But
\[ (I - B)(Z) = \operatorname{Im} T \quad \text{and} \quad (I - B)(Y) = \Phi(Y) = W, \]
i.e., $\operatorname{Im}(I - B) = \operatorname{Im} T + W = X$, a contradiction. This proves the inequality $n \le n^*$. By interchanging T and T* we similarly obtain $n^* \le n$.$^{22}$

Remark 2.2.10. The proof of the following statement is similar to that of Lemma 1.1.31(i). If $A \in \mathcal{C}(X)$ and $1 \in \sigma(A)$, then there is $k \in \mathbb{N}$ such that
\[ X = \operatorname{Ker}(I - A)^k \oplus \operatorname{Im}(I - A)^k. \]
Moreover, both the spaces on the right-hand side are A-invariant, and $\dim \operatorname{Ker}(I - A)^k < \infty$.$^{23}$

Remark 2.2.11. Theorem 2.2.9 can be generalized to operators $A \in \mathcal{L}(X)$ for which there is $k \in \mathbb{N}$ such that $A^k \in \mathcal{C}(X)$. Another way of generalization is connected with perturbations of Fredholm operators. Notice that the statement (v) of Theorem 2.2.9 says that I − A is a Fredholm operator of index zero provided $A \in \mathcal{C}(X)$. The following theorem states the stability of index.

Theorem 2.2.12. Let X, Y be Banach spaces and let $A \in \mathcal{L}(X, Y)$ be a Fredholm operator. Then
(i) if $B \in \mathcal{C}(X, Y)$, then A + B is Fredholm and
\[ \operatorname{ind} A = \operatorname{ind}(A + B); \tag{2.2.3} \]
(ii) the set of Fredholm operators in $\mathcal{L}(X, Y)$ is an open subset of $\mathcal{L}(X, Y)$; furthermore, ind is a continuous function on this open set.

22 We recommend to the reader to do that carefully to see that no reflexivity of X is needed.
23 This dimension is called the multiplicity of the eigenvalue 1.


Proof. The proofs and further results can be found, e.g., in Kato [73, § IV.5.].

Corollary 2.2.13. Let X be a complex Banach space and let $A \in \mathcal{C}(X)$. Then
(i) $\sigma(A) \setminus \{0\}$ is a countable set of eigenvalues of finite multiplicity;
(ii) if dim X = ∞, then $0 \in \sigma(A)$, and if λ is an accumulation point of σ(A), then λ = 0.

Proof. (i) If $\lambda \ne 0$, then $\lambda I - A = \lambda\left(I - \frac{A}{\lambda}\right)$ and Theorem 2.2.9 can be applied. In particular, if such λ belongs to σ(A), then λ is an eigenvalue of finite multiplicity. It remains to show that for any r > 0 the set $\{\lambda \in \sigma(A) : |\lambda| > r\}$ is finite. Assume by way of contradiction that this set contains a sequence of mutually different points $\{\lambda_n\}_{n=1}^{\infty}$ and let $x_n$ be the corresponding nonzero eigenvectors. Put $W_n = \operatorname{Lin}\{x_1, \dots, x_n\}$. It is easy to see by induction that $x_1, \dots, x_n$ are linearly independent. So we can find $y_{n+1} \in W_{n+1}$ such that
\[ \|y_{n+1}\| = 1 \quad \text{and} \quad \operatorname{dist}(y_{n+1}, W_n) \ge \frac{1}{2}. \]
Now for k > l we have
\[ \|Ay_k - Ay_l\| = \|\lambda_k y_k - [(\lambda_k I - A)y_k - (\lambda_l I - A)y_l + \lambda_l y_l]\| \ge |\lambda_k| \operatorname{dist}(y_k, W_{k-1}) \ge \frac{r}{2}, \]
since the vector in the square brackets belongs to $W_{k-1}$, and this contradicts the compactness of A.

(ii) The statement on accumulation points follows immediately from the proof of (i). To see that 0 is a point of σ(A) provided dim X = ∞ it is sufficient to realize that σ(A) cannot be a finite set of nonzero numbers $\lambda_1, \dots, \lambda_n$. Indeed, with help of Remark 2.2.10 we get
\[ X = \operatorname{Ker}(\lambda_1 I - A)^{k_1} \oplus \cdots \oplus \operatorname{Ker}(\lambda_n I - A)^{k_n} \oplus V \tag{2.2.4} \]
where V is a nontrivial closed A-invariant subspace of X. Therefore the spectrum $\sigma(A|_V)$ of the restriction $A|_V$ of A to V is a subset of σ(A). Since $\sigma(A|_V) \ne \emptyset$ (see the discussion following Example 2.1.20), we have $\{\lambda_1, \dots, \lambda_n\} \ne \sigma(A)$.

Example 2.2.14. Consider
\[ Ax(t) := \int_0^t x(s)\,ds \quad \text{on the space } L^2(0, 1). \]
This is a special class of operators which have been examined in Example 2.2.5(ii):
\[ k(t, s) = \begin{cases} 1 & \text{for } 0 \le s \le t \le 1, \\ 0 & \text{for } 0 \le t < s \le 1. \end{cases} \]
Therefore $A \in \mathcal{C}(L^2(0, 1))$.


If $\lambda \ne 0$ were an eigenvalue of A with an eigenfunction x, then
\[ x(t) = \frac{1}{\lambda}\int_0^t x(s)\,ds, \]
i.e., x is absolutely continuous and
\[ \dot{x} = \frac{1}{\lambda}x, \quad x(0) = 0. \]
This implies that x = o in [0, 1]. Since σ(A) cannot be empty, σ(A) = {0}, and 0 is no eigenvalue of A. We notice that the same statement (with a more complicated proof) is valid for any Volterra integral operator
\[ Ax(t) = \int_0^t k(t - s)x(s)\,ds, \quad x \in L^2(0, 1), \]
provided, e.g., $k \in L^2(0, 1)$. See also Example 2.3.7.
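The statement σ(A) = {0} in Example 2.2.14 can be made tangible by a small computation. The following Python sketch is only an illustration under stated assumptions, not a proof: it applies $Ax(t) = \int_0^t x(s)\,ds$ repeatedly to the single function $x \equiv 1$, represents polynomials by exact coefficient lists, and measures size in the sup norm on [0, 1] (which dominates the $L^2(0,1)$ norm). The helper names `volterra` and `sup_norm_on_unit_interval` are hypothetical.

```python
from fractions import Fraction


def volterra(coeffs):
    """Apply (Ax)(t) = integral_0^t x(s) ds to a polynomial.

    coeffs[j] is the coefficient of t**j; the result has one more entry.
    """
    return [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]


def sup_norm_on_unit_interval(coeffs):
    # For nonnegative coefficients the maximum over [0, 1] is attained at t = 1.
    return sum(coeffs)


x = [Fraction(1)]            # the constant function x(t) = 1
norms = []
for n in range(1, 11):
    x = volterra(x)          # now x(t) = t**n / n!
    norms.append(sup_norm_on_unit_interval(x))

# n-th roots of the norms of A^n applied to the constant function 1
roots = [float(v) ** (1.0 / n) for n, v in enumerate(norms, start=1)]
```

The norms equal 1/n!, so their n-th roots decrease towards zero. This is faster than any geometric rate, which is the behaviour one expects from an operator whose spectrum reduces to {0}.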

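Corollary 2.2.13 can be significantly strengthened in the case that X is a Hilbert space and A is a compact, self-adjoint operator. To see this we need some technicalities.

Proposition 2.2.15. Let H be a Hilbert space and A a self-adjoint continuous operator on H. Then
(i) $\|A\| = \sup_{\|x\|=1} |(Ax, x)|$;
(ii) $m := \inf_{\|x\|=1} (Ax, x)$ and $M := \sup_{\|x\|=1} (Ax, x)$ belong to the spectrum of A;
(iii) $\|A\| = \sup\{|\lambda| : \lambda \in \sigma(A)\}$;
(iv) $\sigma(A) \subset \mathbb{R}$;
(v) if $Ax = \lambda x$, $Ay = \mu y$, $\lambda \ne \mu$, then (x, y) = 0.

Proof. (i) Denote the right-hand side by α. Obviously $\alpha \le \|A\|$. To prove the converse inequality take $o \ne x \in H$, y = Ax. Then for any t > 0, using (1.2.14), we have
\[
\begin{aligned}
\|Ax\|^2 &= \left(A(tx), \tfrac{1}{t}Ax\right) = \left(A(tx), \tfrac{1}{t}y\right) \\
&= \frac{1}{4}\left[\left(A\left(tx + \tfrac{1}{t}y\right), tx + \tfrac{1}{t}y\right) - \left(A\left(tx - \tfrac{1}{t}y\right), tx - \tfrac{1}{t}y\right)\right] \\
&\le \frac{\alpha}{4}\left[\left\|tx + \tfrac{1}{t}y\right\|^2 + \left\|tx - \tfrac{1}{t}y\right\|^2\right] = \frac{\alpha}{2}\left[t^2\|x\|^2 + \frac{1}{t^2}\|y\|^2\right].
\end{aligned}
\]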


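Now we choose t such that
\[ t^2\|x\|^2 + \frac{1}{t^2}\|y\|^2 = 2\|x\|\|y\|, \quad \text{i.e.,} \quad \left(t\|x\| - \frac{1}{t}\|y\|\right)^2 = 0. \]
Hence
\[ t = \left(\frac{\|y\|}{\|x\|}\right)^{1/2} \quad \text{and} \quad \|Ax\|^2 \le \alpha\|x\|\|y\| \]
follows. Since y = Ax, this is the inequality $\|Ax\| \le \alpha\|x\|$.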

(ii) By taking $A + \|A\|I$ instead of A, we can assume that $0 \le m \le M = \|A\|$ (the last equality follows from (i)). Let $\{x_n\}_{n=1}^{\infty}$ be a sequence such that
\[ \|x_n\| = 1 \quad \text{and} \quad \lim_{n\to\infty} (Ax_n, x_n) = M. \]
Then
\[
\begin{aligned}
\limsup_{n\to\infty} \|Ax_n - Mx_n\|^2 &= \limsup_{n\to\infty} \left[(Ax_n, Ax_n) - 2M(Ax_n, x_n) + M^2\right] \\
&\le \limsup_{n\to\infty} \left[2M^2 - 2M(Ax_n, x_n)\right] = 0.
\end{aligned}
\]
If $M \in \rho(A)$, then there is a constant c > 0 such that $\|Ax - Mx\| \ge c\|x\|$. The previous calculation shows that this cannot be true. The assertion on m is obtained by replacing A by −A.

(iii) This is a consequence of (i) and (ii) and Corollary 2.1.3.

(iv) Let $\lambda = \alpha + i\beta$, $\beta \ne 0$. A simple calculation yields that
\[ \|\lambda x - Ax\|^2 \ge |\beta|^2\|x\|^2 \quad \text{for every } x \in H. \]
This inequality shows that both
\[ \lambda I - A \quad \text{and} \quad \bar{\lambda}I - A^* = \bar{\lambda}I - A \]
are injective and Im (λI − A) is closed. By Proposition 2.1.27(iv) and Corollary 1.2.35,
\[ \operatorname{Im}(\lambda I - A) = [\operatorname{Ker}(\lambda I - A)^*]^{\perp} = [\operatorname{Ker}(\bar{\lambda}I - A)]^{\perp} = H. \]
Therefore $\lambda \in \rho(A)$.

(v) We have
\[ \lambda(x, y) = (Ax, y) = (x, Ay) = (x, \mu y) = \mu(x, y) \]
(by (iv), $\mu \in \mathbb{R}$). Since $\lambda \ne \mu$, we conclude that (x, y) = 0.


Theorem 2.2.16 (Hilbert–Schmidt). Let H be a separable Hilbert space and A a self-adjoint compact operator. Then there exists an orthonormal basis $\{e_n\}_{n=1}^{\infty}$ where $e_n$ are the eigenvectors of A. If
\[ Ae_n = \lambda_n e_n \quad \text{and} \quad x = \sum_{n=1}^{\infty} (x, e_n)e_n, \]
then
\[ Ax = \sum_{n=1}^{\infty} \lambda_n (x, e_n)e_n. \]

Proof. Let $\{\lambda_n\}_{n=1}^{\infty}$ be the sequence of all nonzero and pairwise distinct eigenvalues of A. Choose an orthonormal basis $e_1^{(k)}, \dots, e_{n_k}^{(k)}$ of $N_k := \operatorname{Ker}(\lambda_k I - A)$. Remember that $N_k \perp N_{k+1}$ (Proposition 2.2.15(v)). Let us align the collection
\[ \bigcup_k \{e_1^{(k)}, \dots, e_{n_k}^{(k)}\} \]
into a sequence $\{e_1, e_2, \dots\}$. This sequence is an orthonormal basis of $H_1 := \overline{\operatorname{Lin}\{e_1, e_2, \dots\}}$. If $H_1 = H$, the proof is complete. Assume therefore that $H \ne H_1$. The orthogonal complement $H_1^{\perp}$ is A-invariant. This means that the restriction $B := A|_{H_1^{\perp}}$ is a self-adjoint operator on the Hilbert space $H_1^{\perp}$. Since $\sigma(B) \subset \sigma(A)$, σ(B) cannot contain any nonzero number (Corollary 2.2.13(i)). As $\sigma(B) \ne \emptyset$, we have σ(B) = {0} and, by Proposition 2.2.15(iii),
\[ B = O \quad \text{on } H_1^{\perp}. \]
Hence 0 is an eigenvalue of B as well as of A. By adding an orthonormal basis of $H_1^{\perp}$ to $\{e_1, e_2, \dots\}$ we obtain an orthonormal basis of H.

Example 2.2.17.$^{24}$ We have found that the inverse operator to$^{25}$
\[ Ax = -(p\dot{x})^{\cdot} + qx, \quad x \in \operatorname{Dom} A = \{x \in W^{2,2}(a, b) : x(a) = x(b) = 0\}, \]
exists provided $\dot{p}, q \in C[a, b]$ and p, q > 0 on [a, b]. Moreover, $A^{-1}$ is an integral operator
\[ A^{-1}f(t) = \int_a^b G(t, s)f(s)\,ds \]

24 A continuation of Example 2.1.32.
25 This operator is called a Sturm–Liouville operator.


where G is the Green function of the differential expression. From the construction of G it follows that $G \in C([a, b] \times [a, b])$, in particular, $G \in L^2((a, b) \times (a, b))$, and G is a real symmetric function (G(t, s) = G(s, t)), see, e.g., Walter [131]. By Example 2.2.5(ii), $A^{-1}$ is a compact, self-adjoint$^{26}$ operator in the real space $L^2(a, b)$ and Theorem 2.2.16 can be applied to obtain an orthonormal basis of $L^2(a, b)$ formed by the eigenfunctions $\{e_n\}_{n=1}^{\infty}$ of $A^{-1}$, i.e., by the eigenfunctions of A. Since
\[ (Ax, x)_{L^2(a,b)} = \int_a^b \left[p(t)\dot{x}^2(t) + q(t)|x(t)|^2\right] dt > 0 \quad \text{for all } x \in \operatorname{Dom} A,\ x \ne o, \]
all eigenvalues are positive. If λ is an eigenvalue of A (equivalently, $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$), then
\[ \dim \operatorname{Ker}(\lambda I - A) = 1 \]
since the equation
\[ -(p\dot{x})^{\cdot} + (q - \lambda)x = 0 \]
cannot have two linearly independent solutions satisfying the initial condition x(a) = 0. Let the eigenvalues $\lambda_n$ of A be arranged into a sequence so that
\[ 0 < \lambda_1 < \lambda_2 < \cdots. \]
From the properties of compact operators (Corollary 2.2.13) it follows that $\lambda_n \to \infty$. It is sometimes important to know how quickly $\lambda_n$ tend to infinity. A simple estimate can be obtained with help of the quantity $n(A^{-1})$ (Example 2.2.5(ii)), namely
\[ \sum_{n=1}^{\infty} \frac{1}{\lambda_n^2} = n^2(A^{-1}) < \infty. \]
However, this result is far from being optimal. We remark here that a variational approach to an eigenvalue problem for compact, self-adjoint operators will be briefly described in Section 6.3.

Consider now the equation
\[ Ax = \lambda x + f \tag{2.2.5} \]
or, equivalently (cf. Exercise 2.2.23),
\[ \sum_{n=1}^{\infty} (\lambda_n - \lambda)(x, e_n)e_n = \sum_{n=1}^{\infty} (f, e_n)e_n, \quad \text{i.e.,} \quad (\lambda_n - \lambda)(x, e_n) = (f, e_n) \quad \text{for } n \in \mathbb{N}. \]
If λ is no eigenvalue of A, then $\inf_n |\lambda_n - \lambda| > 0$ (since $\lambda_n \to \infty$) and
\[ x = \sum_{n=1}^{\infty} \frac{(f, e_n)}{\lambda_n - \lambda}e_n \]

26 We restrict our attention to a special differential operator A in contrast to the general operator from Example 2.1.32 in order to get a self-adjoint inverse $A^{-1}$.


is a unique solution of (2.2.5). (Notice that this series is convergent.) If $\lambda = \lambda_n$, then the condition $(f, e_n) = 0$ is a necessary and sufficient condition for solvability of (2.2.5) (see also Example 2.1.31).

If we examined singular differential operators, e.g., on the interval [0, ∞), we would meet with many difficulties arising for example from the fact that $A^{-1}$ is not compact and, therefore, its spectrum is more complicated. The interested reader can consult the book Dunford & Schwartz [45].

Remark 2.2.18. The Hilbert–Schmidt Theorem allows us to introduce a functional calculus for compact, self-adjoint operators similarly as it has been done for matrices in Theorem 1.1.38: Let A be a compact, self-adjoint operator on a Hilbert space H. Then there exists a unique mapping $\Phi : C(\sigma(A)) \to \mathcal{L}(H)$$^{27}$ with the following properties:
(i) Φ is an algebra homomorphism;
(ii) Φ is a continuous mapping from C(σ(A)) into $\mathcal{L}(H)$ with the operator topology;
(iii) if $P(x) = \sum_{k=0}^{m} a_k x^k$, then $\Phi(P) = \sum_{k=0}^{m} a_k A^k$;
(iv) if $w \notin \sigma(A)$ and $f(x) = \frac{1}{w - x}$, then $\Phi(f) = (wI - A)^{-1}$;
(v) $\sigma(\Phi(f)) = f(\sigma(A))$ for every $f \in C(\sigma(A))$.
If $Ax = \sum_{n=1}^{\infty} \lambda_n (x, e_n)e_n$, then it is easy to verify properties (i)–(v) for
\[ \Phi(f)x := \sum_{n=1}^{\infty} f(\lambda_n)(x, e_n)e_n. \]
We omit the proof of uniqueness. It is worth mentioning that we can introduce a functional calculus for a linear operator A which has a compact, self-adjoint resolvent $(\lambda_0 I - A)^{-1}$. We leave this easy construction to the interested reader. Example 2.2.17 shows a class of such operators.

Exercise 2.2.19. Suppose that $A \in \mathcal{L}(X, Y)$ maps a weakly convergent sequence into a strongly convergent one. Prove that A is compact provided X is reflexive.

Exercise 2.2.20. Prove the assertion from Remark 2.2.10 and the decomposition (2.2.4).

27 If $\sigma(A) = \{0\} \cup \{\lambda_n\}_{n=1}^{\infty}$, then $f \in C(\sigma(A))$ if and only if $\lim_{n\to\infty} f(\lambda_n) = f(0)$.


Exercise 2.2.21. Consider a special case of the Sturm–Liouville operator $Ax = -\ddot{x}$ in the space $L^2(0, \pi)$ with the boundary conditions
(i) $x(0) = x(\pi) = 0$ (Dirichlet boundary conditions),
(ii) $\dot{x}(0) = \dot{x}(\pi) = 0$ (Neumann boundary conditions),
(iii) $\alpha_0 x(0) + \beta_0 \dot{x}(0) = 0$, $\alpha_1 x(\pi) + \beta_1 \dot{x}(\pi) = 0$ (mixed or Newton–Robin boundary conditions),
(iv) $x(0) = x(\pi)$, $\dot{x}(0) = \dot{x}(\pi)$ (periodic conditions).
Find Green functions, eigenvalues and eigenfunctions. What follows from the Hilbert–Schmidt Theorem? Compare this result with that of Example 1.2.38(i).

Exercise 2.2.22. Define $e^{tA}$ for the operator A from Exercise 2.2.21 (see Remark 2.2.18). Take $x \in \operatorname{Dom} A$ and show that the function
\[ u(t, \xi) := \left(e^{tA}x\right)(\xi), \quad t \ge 0, \]
is a solution to the heat equation
\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial \xi^2} \]
satisfying the initial condition $u(0, \cdot) = x(\cdot)$ and the boundary conditions given by $u(t, \cdot) \in \operatorname{Dom} A$. Do not forget to define the notion of a solution.

Exercise 2.2.23. Let A be as in Example 2.2.17. Prove that
\[ \operatorname{Dom} A = \left\{x = \sum_{n=1}^{\infty} (x, e_n)e_n : \sum_{n=1}^{\infty} |\lambda_n|^2 |(x, e_n)|^2 < \infty\right\} \]
and
\[ Ax = \sum_{n=1}^{\infty} \lambda_n (x, e_n)e_n. \]

2.3 Contraction Principle

The previous four sections have been devoted to some basic facts in the linear theory. It is now time to start with nonlinear problems, especially with the solution of the nonlinear equation
\[ f(x) = a \quad \text{for } f : X \to X. \tag{2.3.1} \]
The basic assertions in this section are fixed point theorems for contractive and non-expansive mappings. If X is a linear space, (2.3.1) is equivalent to the equation
\[ F(x) := a - f(x) + x = x. \]


The solution of this equation is called a fixed point of F. In the case that
\[ f(x) = x - Ax \quad (F(x) = Ax + a) \]
where $A \in \mathcal{L}(X)$, we succeeded in solving this equation in Section 2.1 (cf. Proposition 2.1.2) by applying the iteration process
\[ x_0 = a, \quad x_n = a + Ax_{n-1} \quad \text{provided } \|A\| < 1. \]
This idea can be easily generalized to the following result which is often attributed to S. Banach.

Theorem 2.3.1 (Contraction Principle). Let M be a complete metric space with a metric $\varrho$ and let F : M → M be a contraction, i.e., there is $q \in [0, 1)$ such that
\[ \varrho(F(x), F(y)) \le q\,\varrho(x, y) \quad \text{for every } x, y \in M. \]
Then there exists a unique fixed point $\tilde{x}$ of F in M. Moreover, if
\[ x_0 \in M, \quad x_n = F(x_{n-1}), \]
then the sequence $\{x_n\}_{n=1}^{\infty}$ converges to $\tilde{x}$ and the estimates
\[ \varrho(x_n, \tilde{x}) \le \frac{q^n}{1 - q}\varrho(x_1, x_0) \quad \text{(a priori estimate)}, \tag{2.3.2} \]
\[ \varrho(x_n, \tilde{x}) \le \frac{q}{1 - q}\varrho(x_n, x_{n-1}) \quad \text{(a posteriori estimate)} \tag{2.3.3} \]
hold.

Proof. We prove that $\{x_n\}_{n=1}^{\infty}$ is a Cauchy sequence. Indeed, for m > n we have
\[
\begin{aligned}
\varrho(x_m, x_n) &\le \varrho(x_m, x_{m-1}) + \cdots + \varrho(x_{n+1}, x_n) \\
&= \varrho(F(x_{m-1}), F(x_{m-2})) + \cdots + \varrho(F(x_n), F(x_{n-1})) \\
&\le q\left[\varrho(x_{m-1}, x_{m-2}) + \cdots + \varrho(x_n, x_{n-1})\right] \\
&\le (q^{m-1} + \cdots + q^n)\varrho(x_1, x_0) \le \frac{q^n}{1 - q}\varrho(x_1, x_0).
\end{aligned}
\]
Since q < 1, the right-hand side is arbitrarily small for sufficiently large n. The Cauchy sequence $\{x_n\}_{n=1}^{\infty}$ has a limit $\tilde{x}$ in the complete space M, and for this limit the estimate (2.3.2) holds. Being a contraction, F is a continuous mapping, and therefore
\[ \tilde{x} = \lim_{n\to\infty} x_n = \lim_{n\to\infty} F(x_{n-1}) = F\left(\lim_{n\to\infty} x_{n-1}\right) = F(\tilde{x}). \]
Uniqueness of a fixed point is even easier: If $\tilde{x} = F(\tilde{x})$, $\tilde{y} = F(\tilde{y})$, then
\[ \varrho(\tilde{x}, \tilde{y}) = \varrho(F(\tilde{x}), F(\tilde{y})) \le q\,\varrho(\tilde{x}, \tilde{y}), \quad \text{i.e.,} \quad \varrho(\tilde{x}, \tilde{y}) = 0 \quad (q < 1). \]
The a posteriori estimate also follows from the above estimate of $\varrho(x_m, x_n)$.
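The iteration and both error estimates of Theorem 2.3.1 are easy to try out numerically. The following Python sketch is a minimal illustration under assumptions that are not part of the text: it takes M = [0, 1] with the usual distance and F = cos, which maps [0, 1] into itself and is a contraction there with q = sin 1 (by the mean value theorem). The helper names `iterate`, `a_priori` and `a_posteriori` are hypothetical, and the 200th iterate is used as a numerical stand-in for the fixed point.

```python
import math


def iterate(F, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_n = F(x_{n-1})."""
    xs = [x0]
    for _ in range(steps):
        xs.append(F(xs[-1]))
    return xs


q = math.sin(1.0)                  # contraction constant of cos on [0, 1]
xs = iterate(math.cos, 1.0, 200)
x_fix = xs[-1]                     # numerical proxy for the fixed point


def a_priori(n):
    # right-hand side of (2.3.2)
    return q ** n / (1 - q) * abs(xs[1] - xs[0])


def a_posteriori(n):
    # right-hand side of (2.3.3)
    return q / (1 - q) * abs(xs[n] - xs[n - 1])


errors = {n: abs(xs[n] - x_fix) for n in (1, 5, 10, 20)}
```

Comparing `errors[n]` with `a_posteriori(n)` and `a_priori(n)` shows the expected ordering: the true error is bounded by the a posteriori bound, which in turn does not exceed the a priori bound.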


The fixed point of F, the existence of which has just been established, often depends on a parameter. The following result is useful in investigating this dependence.

Corollary 2.3.2. Let M be a complete metric space and A a topological space. Assume that F : A × M → M possesses the following properties:
(i) There is $q \in [0, 1)$ such that
\[ \varrho(F(a, x), F(a, y)) \le q\,\varrho(x, y) \quad \text{for all } a \in A \text{ and } x, y \in M. \]
(ii) For every $x \in M$ the mapping $a \mapsto F(a, x)$ is continuous on A.
Then for each $a \in A$ there is a unique $\varphi(a) := \tilde{x}$ such that $F(a, \tilde{x}) = \tilde{x}$. Moreover, φ is continuous on A.

Proof. The existence of φ follows directly from Theorem 2.3.1. The estimates
\[
\begin{aligned}
\varrho(\varphi(a), \varphi(b)) &= \varrho(F(a, \varphi(a)), F(b, \varphi(b))) \\
&\le \varrho(F(a, \varphi(a)), F(b, \varphi(a))) + \varrho(F(b, \varphi(a)), F(b, \varphi(b))) \\
&\le \varrho(F(a, \varphi(a)), F(b, \varphi(a))) + q\,\varrho(\varphi(a), \varphi(b))
\end{aligned}
\]
yield
\[ \varrho(\varphi(a), \varphi(b)) \le \frac{1}{1 - q}\varrho(F(a, \varphi(a)), F(b, \varphi(a))), \]
and the continuity of φ follows.
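The key inequality in the proof of Corollary 2.3.2 can be checked on a toy example. The following Python sketch assumes, purely for illustration, the family F(a, x) = a + 0.5 cos x on M = ℝ with parameter a ∈ ℝ, so that q = 0.5; the function names `phi` and `lipschitz_bound_holds` and the tolerances are hypothetical choices, not part of the text.

```python
import math

Q = 0.5  # contraction constant of x -> a + Q*cos(x), uniform in a


def F(a, x):
    return a + Q * math.cos(x)


def phi(a, tol=1e-12, max_steps=1000):
    """Fixed point of x -> F(a, x), computed by simple iteration."""
    x = 0.0
    for _ in range(max_steps):
        x_new = F(a, x)
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed point iteration did not converge")


def lipschitz_bound_holds(a, b):
    """Check rho(phi(a), phi(b)) <= rho(F(a, phi(a)), F(b, phi(a))) / (1 - q)."""
    lhs = abs(phi(a) - phi(b))
    rhs = abs(F(a, phi(a)) - F(b, phi(a))) / (1 - Q)
    return lhs <= rhs + 1e-10
```

For this family the right-hand side equals |a − b| / (1 − q), so the fixed point depends on the parameter in a Lipschitz way with constant 2.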

Remark 2.3.3. Notice that φ is Lipschitz continuous provided $a \mapsto F(a, x)$ is Lipschitz continuous uniformly with respect to x (and, of course, A is a metric space).

There is an enormous number of applications of the Contraction Principle. The proof of the existence theorem for the initial value problem for ordinary differential equations belongs to standard applications. However, the historical development went in the opposite direction. The following theorem had been proved (by iteration) about thirty years before the Contraction Principle was formulated in its full generality. Another application will be given in Section 4.1.

Theorem 2.3.4 (Picard). Let G be an open set in $\mathbb{R} \times \mathbb{R}^N$ and let $f : (t, x_1, \dots, x_N) \in G \to \mathbb{R}^N$ be continuous and locally Lipschitz continuous with respect to the x-variables, i.e., for every $(s, y) \in G$ there exist δ > 0, $\hat{\delta} > 0$, L > 0 such that
\[ \|f(t, x^1) - f(t, x^2)\| \le L\|x^1 - x^2\| \quad \text{whenever } |t - s| < \delta,\ \|x^i - y\| < \hat{\delta},\ i = 1, 2. \]
Then for any $(t_0, \xi_0) \in G$ there exists δ > 0 such that the equation
\[ \dot{x} = f(t, x) \tag{2.3.4} \]


has a unique solution on the interval $(t_0 - \delta, t_0 + \delta)$ satisfying the initial condition
\[ x(t_0) = \xi_0. \tag{2.3.5} \]

Proof. First we rewrite the initial value problem (2.3.4), (2.3.5) into an equivalent fixed point problem for an integral operator F defined by$^{28}$
\[ F(x) : t \mapsto \xi_0 + \int_{t_0}^{t} f(s, x(s))\,ds, \quad t \in (t_0 - \delta, t_0 + \delta). \tag{2.3.6} \]

This equivalence is easy to establish (by integration and by differentiation with respect to t). Therefore we wish to solve the equation F(x) = x in a complete metric space M. We choose M to be a closed subset of the Banach space $C[t_0 - \delta, t_0 + \delta]$ for a certain small δ > 0. We need two properties of F and M, namely that F maps M into M and F is a contraction on M. Choose first $\delta_1$, $\hat{\delta}_1$ such that
\[ R_1 := [t_0 - \delta_1, t_0 + \delta_1] \times \{x \in \mathbb{R}^N : \|x - \xi_0\| \le \hat{\delta}_1\} \subset G. \]
This set $R_1$ is compact, and therefore f is bounded and uniformly Lipschitz continuous on it, i.e., there are constants K, L such that
\[ \|f(s, x)\| \le K, \quad \|f(s, x) - f(s, y)\| \le L\|x - y\| \quad \text{for } (s, x), (s, y) \in R_1. \]
Put
\[ M = \{x \in C[t_0 - \delta, t_0 + \delta] : \|x(t) - \xi_0\| \le \hat{\delta}_1 \ \forall t \in [t_0 - \delta, t_0 + \delta]\} \]
for a $\delta \le \delta_1$. Then
\[ \sup_{t \in I_\delta} \|F(x)(t) - \xi_0\| \le \delta K, \quad \sup_{t \in I_\delta} \|F(x)(t) - F(y)(t)\| \le \delta L \sup_{t \in I_\delta} \|x(t) - y(t)\| \]
where $I_\delta := [t_0 - \delta, t_0 + \delta]$. If we choose δ so small that $\delta K \le \hat{\delta}_1$ and $\delta L \le \frac{1}{2}$, then F maps M into itself (the first condition) and is a contraction with $q = \frac{1}{2}$ (the second condition). By the Contraction Principle, F has a unique fixed point y in M and this is a solution of (2.3.4), (2.3.5) on the interval $(t_0 - \delta, t_0 + \delta)$. If $\tilde{x}$ is a solution of (2.3.4), (2.3.5) on the interval $(t_0 - \delta, t_0 + \delta)$, then $\tilde{x} \in M$ (prove it!), i.e., $y = \tilde{x}$, and the uniqueness follows.
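The iterates $x_n = F(x_{n-1})$ of the integral operator (2.3.6) can be computed exactly when f is a polynomial. The following Python sketch is an illustration under assumptions not taken from the text: it uses the scalar problem $\dot{x} = 1 + x^2$, x(0) = 0 (whose solution is tan t), starts from the constant function $x_0 \equiv \xi_0 = 0$, and stores polynomials as lists of exact rational coefficients. The helper names are hypothetical.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = power of t)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_integrate(p):
    """Coefficients of the integral of p from 0 to t."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]


def picard_step(x):
    """One application of F(x)(t) = 0 + integral_0^t (1 + x(s)**2) ds."""
    integrand = poly_mul(x, x)
    integrand[0] += 1
    return poly_integrate(integrand)


x = [Fraction(0)]            # x_0(t) = xi_0 = 0
iterates = [x]
for _ in range(3):
    x = picard_step(x)
    iterates.append(x)
```

The third iterate is $t + t^3/3 + 2t^5/15 + t^7/63$; its coefficients up to $t^5$ agree with the Taylor expansion of tan t, as one would expect from iterations converging to the solution near $t_0 = 0$.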

28 If $t < t_0$, then we define $\int_{t_0}^{t} f(s, x(s))\,ds = -\int_{t}^{t_0} f(s, x(s))\,ds$, and $\int_{t_0}^{t_0} f(s, x(s))\,ds = 0$.


Remark 2.3.5. The mapping F defined by (2.3.6) depends actually not only on x but also on $t_0$, $\xi_0$. By taking smaller δ we can prove that F is also Lipschitz continuous with respect to the initial conditions, and Corollary 2.3.2 together with Remark 2.3.3 yields that the solution $x(t; t_0, \xi_0)$ of (2.3.4), (2.3.5) is also Lipschitz continuous with respect to the initial conditions.

Remark 2.3.6. If we apply Theorem 2.3.4 (i.e., the Contraction Principle) to a system of linear differential equations
\[ \dot{x} = A(t)x + g(t) \]
with a continuous matrix A and a continuous vector function g on an interval (a, b), we need an extra effort to prove that a solution exists on the whole interval (a, b). Namely, Theorem 2.3.4 gives only local existence, and in the continuation process (take $(t_0 + \delta, x(t_0 + \delta))$ as a new initial condition) there is no a priori evidence that δ could not be smaller and smaller.$^{29}$ It is therefore sometimes more convenient not to refer to the Contraction Principle but to prove the convergence of iterations directly. The following example demonstrates this approach.

Example 2.3.7. Let k be a bounded measurable function on the set $\mathcal{M} = \{(s, t) \in \mathbb{R}^2 : 0 \le s \le t \le 1\}$. Then for any $f \in L^1(0, 1)$ and $\lambda \ne 0$ there is a unique solution to the integral equation
\[ x(t) - \lambda\int_0^t k(t, s)x(s)\,ds = f(t). \tag{2.3.7} \]
To prove this assertion, denote
\[ Ax(t) = \int_0^t k(t, s)x(s)\,ds. \]
Then $A \in \mathcal{L}(L^1(0, 1))$ (Example 2.1.28). Put
\[ x_0 = f, \quad x_n = f + \lambda Ax_{n-1}. \]
Due to the completeness of $L^1(0, 1)$ the sequence $\{x_n\}_{n=1}^{\infty}$ is convergent in $L^1(0, 1)$ provided the sum $\sum_{n=1}^{\infty} \|x_n - x_{n-1}\|_{L^1(0,1)}$ is convergent. We have
\[ x_n - x_{n-1} = \lambda^n A^n f \quad \text{and} \quad A^n f(t) = \int_0^t k_n(t, s)f(s)\,ds \]
where
\[ k_1 = k \quad \text{and} \quad k_n(t, s) = \int_s^t k_{n-1}(t, \sigma)k(\sigma, s)\,d\sigma \]
(check this relation).

29 This situation does not occur for the equation $\dot{x} = f(t, x)$ provided, e.g., that $G = \mathbb{R}^{N+1}$ and there exists L > 0 such that for all $(t, x), (t, y) \in \mathbb{R}^{N+1}$ the inequality $\|f(t, x) - f(t, y)\| \le L\|x - y\|$ holds.


It is easy to prove by induction that
\[ |k_n(t, s)| \le \|k\|^n_{L^\infty(\mathcal{M})}\frac{(t - s)^{n-1}}{(n - 1)!}, \quad (t, s) \in \mathcal{M}, \]
and hence
\[
\begin{aligned}
\|x_n - x_{n-1}\|_{L^1(0,1)} &\le |\lambda|^n \int_0^1 \left|\int_0^t k_n(t, s)f(s)\,ds\right| dt \\
&\le |\lambda|^n \int_0^1 |f(s)| \left(\int_s^1 |k_n(t, s)|\,dt\right) ds \le \frac{|\lambda|^n \|k\|^n}{n!}\|f\|_{L^1(0,1)}.
\end{aligned}
\]
Since the series $\sum_{n=1}^{\infty} \frac{a^n}{n!}$ is convergent for any $a \in \mathbb{R}$, the limit $\lim_{n\to\infty} x_n = \tilde{x} \in L^1(0, 1)$ exists and $\tilde{x}$ is a solution to (2.3.7). In fact $\tilde{x}$ is a unique solution (see Exercise 2.3.18). Moreover, $\tilde{x}$ depends continuously on f, which means that σ(A) = {0}.$^{30}$ This result holds also for $k \in C(\mathcal{M})$ in the space C[0, 1]. The proof is the same.

Example 2.3.8. Find sufficient conditions for the existence of a classical solution (cf. Example 2.1.31) of the boundary value problem
\[ \begin{cases} \ddot{x}(t) = \varepsilon f(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{2.3.8} \]
Theorem 2.3.4 suggests the assumption that f is continuous with respect to t and Lipschitz continuous with respect to the x-variable on a certain rectangle [0, 1] × [−r, r]. Denote a Lipschitz constant on this interval by L(r). We wish to rewrite the problem (2.3.8) as a fixed point problem. To reach this goal suppose that we have a solution y and denote $g(t) := \varepsilon f(t, y(t))$. Then y solves also the equation $\ddot{y} = g$ and satisfies y(0) = y(1) = 0. It is easy to see that this problem has exactly one solution which is given by
\[ y(t) = \int_0^1 G(t, s)g(s)\,ds := \int_0^t (t - 1)s\,g(s)\,ds + \int_t^1 t(s - 1)g(s)\,ds \]
(G is the Green function, see Example 2.1.32). Therefore, we are looking for a continuous function x which solves the integral equation
\[ x(t) = \varepsilon\int_0^1 G(t, s)f(s, x(s))\,ds. \tag{2.3.9} \]

30 Actually, we have proved that $\mathbb{C} \setminus \{0\} \subset \rho(A)$, i.e., $\sigma(A) \subset \{0\}$. Since $A \in \mathcal{L}(L^1(0, 1))$, $\sigma(A) \ne \emptyset$, we have σ(A) = {0}.


Denote
\[ F(\varepsilon, x) := \varepsilon\int_0^1 G(t, s)f(s, x(s))\,ds. \]
We can solve (2.3.9) by applying the Contraction Principle in
\[ M := \{x \in C[0, 1] : \|x\| \le r\} \quad \text{for an appropriate choice of } r. \]
For $x \in M$ we have
\[ |f(s, x(s))| \le |f(s, 0)| + |f(s, x(s)) - f(s, 0)| \le K + L(r)r \]
where K > 0 is a constant such that $|f(s, 0)| \le K$, $s \in [0, 1]$, and
\[ \|F(\varepsilon, x)\| \le \frac{|\varepsilon|}{8}(K + L(r)r). \]
This estimate shows that F maps M into itself if
\[ q := \frac{|\varepsilon|}{8}L(r) < 1 \quad \text{and} \quad r \ge \frac{|\varepsilon|K}{8}\,\frac{1}{1 - q}. \]
Then F is also a contraction on M with the constant q. We can conclude that for a given r there is $\varepsilon_0 > 0$ such that for $|\varepsilon| \le \varepsilon_0$ both the above conditions$^{31}$ are satisfied and (2.3.9) has a solution.

Now we have to show that a continuous solution x of (2.3.9) is actually a classical solution of the boundary value problem (2.3.8). Since we know the explicit form of the Green function G, it is obvious that x(0) = x(1) = 0 and it is also easy to differentiate twice the right-hand side of (2.3.9) (taking into account that x is continuous).

We remark that we have not used all properties of the integral operator with the kernel G. In particular, such an operator is compact (Example 2.2.5(i)) and this property has not been used. This property will be significant in Section 5.1.

The a posteriori estimate (2.3.3) shows that the convergence of iterations may be rather slow. It can be sometimes desirable to have faster convergence at the expense of more restrictive assumptions. The classical Newton Method for solving an equation f(x) = 0, $f : \mathbb{R} \to \mathbb{R}$, is illustrated in Figure 2.3.1. In order to generalize this method we need the notion of a derivative of f : X → X. This will be the main subject of the next chapter.

31 Notice that for a fixed ε these conditions are antagonistic, namely the first requires small r and the other large r. This situation is typical in applications of the Contraction Principle.


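[Figure 2.3.1. The Newton Method: the tangent line $y = f'(x_n)(x - x_n) + f(x_n)$ to the graph $y = f(x)$ at $x_n$ meets the x-axis at $x_{n+1}$; $\tilde{x}$ denotes the root.]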
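The tangent construction behind the classical Newton Method reads $x_{n+1} = x_n - f(x_n)/f'(x_n)$. The following Python sketch is a minimal scalar illustration; the test function $f(x) = x^2 - 2$, the starting point and the stopping rule are assumptions chosen for the example, and `newton` is a hypothetical helper name.

```python
def newton(f, df, x0, tol=1e-12, max_steps=50):
    """Scalar Newton iteration x_{n+1} = x_n - f(x_n) / f'(x_n).

    Returns the last iterate and the list of all iterates.
    """
    x = x0
    history = [x]
    for _ in range(max_steps):
        x_next = x - f(x) / df(x)   # zero of the tangent line at x
        history.append(x_next)
        if abs(x_next - x) <= tol:
            return x_next, history
        x = x_next
    raise RuntimeError("Newton iteration did not converge")


root, history = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=2.0)
```

In this example only a handful of steps are needed, in contrast with the linear rate guaranteed by the a posteriori estimate (2.3.3) for plain fixed point iterations.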

There are many generalizations of the Contraction Principle. One of them concerns the assumption q < 1. A mapping F : M → M is called non-expansive if
\[ \varrho(F(x), F(y)) \le \varrho(x, y) \quad \text{for all } x, y \in M. \]
A simple example F(x) = x + 1, $x \in \mathbb{R}$, shows that F may have no fixed point. This can be caused by the fact that F does not map any bounded set into itself. However, there are non-expansive mappings which map the unit ball into itself and do not possess any fixed point either. See the following example or Exercise 2.3.17.

Example 2.3.9 (Beals). Let M be the space of all sequences with zero limit with the sup norm (this space is usually denoted by $c_0$) and let
\[ F(x) = (1, x_1, x_2, \dots) \quad \text{for } x = (x_1, x_2, \dots) \in M. \]
Then F is a non-expansive map of the unit ball into itself without any fixed point.

This example indicates that some special properties of the space are needed. We formulate the following assertion in a Hilbert space and use the Hilbert structure essentially in its proof. The statement is true also in uniformly convex spaces but the proof is more involved (see, e.g., Goebel [60]). Let us note an interesting fact that the validity of Proposition 2.3.10 in a reflexive Banach space is an open problem.

Proposition 2.3.10 (Browder). Let $\mathcal{M}$ be a bounded closed and convex set in a Hilbert space H. Let F be a non-expansive mapping from $\mathcal{M}$ into itself. Then there is a fixed point of F in $\mathcal{M}$. Moreover, if
\[ x_0 \in \mathcal{M}, \quad x_n = F(x_{n-1}) \quad \text{and} \quad y_n = \frac{1}{n}\sum_{k=0}^{n-1} x_k, \]
then the sequence $\{y_n\}_{n=1}^{\infty}$ is weakly convergent to a fixed point.
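The role of the averages $y_n$ in Proposition 2.3.10 can be seen in a very small example. The following Python sketch assumes, for illustration only, that H = ℝ², the set is the closed unit disc and F is the rotation by 90 degrees, which is an isometry (hence non-expansive) with the origin as its only fixed point. In finite dimension weak and strong convergence coincide. The function name `cesaro_means` is hypothetical.

```python
import math


def F(p):
    """Rotation of the plane by 90 degrees; an isometry of the unit disc."""
    x, y = p
    return (-y, x)


def cesaro_means(x0, n_max):
    """Return x_{n_max} and the list [y_1, ..., y_{n_max}],
    where y_n = (x_0 + ... + x_{n-1}) / n and x_k = F(x_{k-1})."""
    means = []
    sx = sy = 0.0
    p = x0
    for n in range(1, n_max + 1):
        sx += p[0]
        sy += p[1]
        means.append((sx / n, sy / n))
        p = F(p)
    return p, means


last_iterate, means = cesaro_means((1.0, 0.0), 1001)
```

The iterates $x_n$ keep cycling on the unit circle and do not converge, whereas the averages $y_n$ tend to the fixed point at the origin.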


Proof. The existence result is not difficult to prove.$^{32}$ So we will prove a more interesting result which yields also a numerical method for finding a fixed point. The proof consists of four steps, the last one is crucial and has a variational character.

Step 1. Since $\mathcal{M}$ is bounded, closed and convex, $y_n \in \mathcal{M}$ and there is a subsequence $\{y_{n_k}\}_{k=1}^{\infty}$ weakly convergent to an $\tilde{x} \in \mathcal{M}$ (Theorem 2.1.25 and Exercise 2.1.39). Fix such a weakly convergent subsequence $\{y_{n_k}\}_{k=1}^{\infty}$ and its weak limit $\tilde{x} \in \mathcal{M}$.

Step 2. We have $\lim_{n\to\infty} \|F(y_n) - y_n\| = 0$. Indeed,
\[
\begin{aligned}
\|F^k(x_0) - F(y_n) + F(y_n) - y_n\|^2 &= \|F^k(x_0) - F(y_n)\|^2 + \|F(y_n) - y_n\|^2 \\
&\quad + 2\operatorname{Re}(F^k(x_0) - F(y_n), F(y_n) - y_n)
\end{aligned}
\]
where $F^k(x_0) = F(F^{k-1}(x_0))$. Summing up this equality from k = 0 to k = n − 1 and dividing by n we get
\[
\begin{aligned}
\frac{1}{n}\sum_{k=0}^{n-1} \|F^k(x_0) - y_n\|^2 &= \frac{1}{n}\sum_{k=0}^{n-1} \|F^k(x_0) - F(y_n)\|^2 + \|F(y_n) - y_n\|^2 \\
&\quad + 2\operatorname{Re}(y_n - F(y_n), F(y_n) - y_n) \\
&= \frac{1}{n}\sum_{k=0}^{n-1} \|F^k(x_0) - F(y_n)\|^2 - \|F(y_n) - y_n\|^2.
\end{aligned}
\]
Since F is non-expansive, we conclude from this equality that
\[
\begin{aligned}
\|F(y_n) - y_n\|^2 &\le \frac{1}{n}\sum_{k=1}^{n-1} \|F^{k-1}(x_0) - y_n\|^2 + \frac{1}{n}\|x_0 - F(y_n)\|^2 - \frac{1}{n}\sum_{k=0}^{n-1} \|F^k(x_0) - y_n\|^2 \\
&= \frac{1}{n}\|x_0 - F(y_n)\|^2 - \frac{1}{n}\|F^{n-1}(x_0) - y_n\|^2 \to 0
\end{aligned}
\]
(all sequences belong to $\mathcal{M}$, and hence they are bounded).

Step 3. The element $\tilde{x}$ is a fixed point of F. To see this, observe that the inequality
\[
\begin{aligned}
(z - F(z) - (y_{n_k} - F(y_{n_k})), z - y_{n_k}) &= (z - y_{n_k}, z - y_{n_k}) - (F(z) - F(y_{n_k}), z - y_{n_k}) \\
&\ge \|z - y_{n_k}\|^2 - \|z - y_{n_k}\|^2 = 0
\end{aligned}
\]

32 It is possible to assume that $o \in \mathcal{M}$. For any $t \in (0, 1)$ the mapping $F_t(x) := tF(x)$ is a contraction. Letting t → 1 we obtain a sequence $\{x_n\}_{n=1}^{\infty} \subset \mathcal{M}$ for which $x_n - F(x_n) \to o$. Therefore it is sufficient to show that $(I - F)(\mathcal{M})$ is closed. This needs a trick which is typical for monotone operators (Section 5.3). Notice that I − F is monotone provided F is non-expansive.


holds for any $z \in \mathcal{M}$. By Exercise 2.1.36 and Step 2, the limit of the left-hand side is $(z - F(z), z - \tilde{x})$, i.e., the inequality
\[ (z - F(z), z - \tilde{x}) \ge 0 \tag{2.3.10} \]
is also true. Now take $t \in (0, 1)$ and put
\[ z = (1 - t)\tilde{x} + tF(\tilde{x}) \quad (z \in \mathcal{M}). \]
For t → 0, the inequality (2.3.10) divided by t yields $\|\tilde{x} - F(\tilde{x})\|^2 \le 0$.

Step 4. If x is a fixed point of F, then
\[ \|x_n - x\|^2 = \|F(x_{n-1}) - F(x)\|^2 \le \|x_{n-1} - x\|^2 \]
and, therefore, the limit $\varphi(x) := \lim_{n\to\infty} \|x_n - x\|^2$ exists. By Step 3, $\tilde{x}$ is also a fixed point, and we get
\[ \varphi(\tilde{x}) \le \|\tilde{x} - x_k\|^2 = \|\tilde{x} - v\|^2 + \|v - x_k\|^2 + 2\operatorname{Re}(\tilde{x} - v, v - x_k) \quad \text{for any } v \in H. \]
Summing up from k = 0 to k = n − 1 and dividing by n we arrive at
\[ \varphi(\tilde{x}) \le \|\tilde{x} - v\|^2 + \frac{1}{n}\sum_{k=0}^{n-1} \|v - x_k\|^2 + 2\operatorname{Re}(\tilde{x} - v, v - y_n). \tag{2.3.11} \]
Let v be a weak limit of a subsequence $\{y_{n_l}\}_{l=1}^{\infty} \subset \{y_n\}_{n=1}^{\infty}$, possibly different from $\{y_{n_k}\}_{k=1}^{\infty}$. Then v is a fixed point of F by virtue of the previous steps. Set $n = n_k$ and take the limit for k → ∞ in (2.3.11). Since $y_{n_k}$ converges weakly to $\tilde{x}$, the last term tends to $-2\|\tilde{x} - v\|^2$ and we finally obtain$^{33}$
\[ \varphi(\tilde{x}) \le \varphi(v) - \|\tilde{x} - v\|^2. \]
Interchanging the roles of $\tilde{x}$ and v gives $\varphi(v) \le \varphi(\tilde{x}) - \|\tilde{x} - v\|^2$, and $v = \tilde{x}$ follows. In particular, the limit of any weakly convergent subsequence of $\{y_n\}_{n=1}^{\infty}$ coincides with $\tilde{x}$, and therefore the whole sequence $\{y_n\}_{n=1}^{\infty}$ weakly converges to $\tilde{x}$.

Remark 2.3.11. We have mentioned in footnote 32 on page 99 that I − F is a monotone operator whenever F is non-expansive. The converse statement is not true even in $\mathbb{R}$. Consider, e.g., F(x) = −2x. Proposition 2.3.10 should be compared with Theorem 5.3.4.

In the following three exercises we briefly show other modifications of the Contraction Principle.

33 Observe that $\lim_{l\to\infty} \frac{1}{n_l}\sum_{j=0}^{n_l - 1} \|x - x_j\|^2 = \lim_{n\to\infty} \|x - x_n\|^2$.


Exercise 2.3.12. If $M$ is a complete metric space, $F : M \to M$ and there is a function $V : M \to \mathbb{R}^+ \cup \{+\infty\}$ such that

$$V(F(x)) + \varrho(x, F(x)) \le V(x), \quad x \in M, \tag{2.3.12}$$

then for arbitrary $x_0 \in M$, $x_n = F(x_{n-1})$, the sequence $\{x_n\}_{n=1}^{\infty}$ is convergent in $M$ to an $\tilde{x}$. Moreover, if the graph of $F$ is closed in $M \times M$, then $F(\tilde{x}) = \tilde{x}$.

Hint. Show that $\{V(x_n)\}_{n=1}^{\infty}$ is a decreasing sequence; this implies that $\{x_n\}_{n=1}^{\infty}$ is a Cauchy sequence.

Remark 2.3.13. The condition (2.3.12) is suitable for a vector-valued mapping $F$ and plays an important role in game theory. For details see, e.g., Aubin & Ekeland [10, Chapter VI].

Exercise 2.3.14. Let $M$ be a complete metric space and let $F : M \to M$. If there is $n \in \mathbb{N}$ such that $F^n$ is a contraction, then $F$ has a unique fixed point in $M$.

Hint. Let $\tilde{x}$ be a fixed point of $G := F^n$,

$$\tilde{x} = \lim_{k\to\infty} G^k(x_0).$$

Estimate $\varrho(F(G^k(x_0)), G^k(x_0))$. It is possible to show that $\tilde{x} = \lim_{k\to\infty} F^k(x_0)$.

Remark 2.3.15. The power $n \in \mathbb{N}$ need not be the same for all $x, y \in M$, i.e., if there is $q \in [0, 1)$ and for every $x \in M$ there exists $n(x) \in \mathbb{N}$ such that

$$\varrho(F^{n(x)}(x), F^{n(y)}(y)) \le q\,\varrho(x, y) \quad \text{for all } y \in M,$$

then $F$ also has a unique fixed point (Sehgal [119]). The proof is similar to the previous one.

Exercise 2.3.16 (Edelstein). Let $M$ be a compact metric space and let $F : M \to M$ satisfy the condition

$$\varrho(F(x), F(y)) < \varrho(x, y) \quad \text{for all } x, y \in M,\ x \ne y.$$

Then $F$ has a unique fixed point in $M$.

Hint. Only existence has to be proved: By compactness there is a convergent subsequence $F^{n_k}(x_0) \to \tilde{x}$. Now show that the sequence $\alpha_n := \varrho(F^n(x_0), F^{n+1}(x_0))$ is decreasing and

$$\lim_{n\to\infty} \alpha_n = \varrho(\tilde{x}, F(\tilde{x})) = \varrho(F(\tilde{x}), F^2(\tilde{x})), \quad \text{i.e.,} \quad F(\tilde{x}) = \tilde{x}.$$
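The situation of Exercise 2.3.14 can be illustrated numerically. The following is only a sketch under an assumption not taken from the text: we choose $M = \mathbb{R}$ and $F = \cos$, which is not a contraction on $\mathbb{R}$ (since $|F'|$ attains the value 1), while $G = F^2$ satisfies $|G'(x)| = |\sin(\cos x)\,\sin x| \le \sin 1 < 1$.

```python
import math

def iterate(F, x0, steps):
    """Return F^steps(x0), i.e. F applied `steps` times to x0."""
    x = x0
    for _ in range(steps):
        x = F(x)
    return x

F = math.cos                      # |F'(x)| = |sin x| reaches 1: no contraction constant q < 1 on R
G = lambda x: F(F(x))             # G = F^2, with |G'(x)| = |sin(cos x) sin x| <= sin(1) < 1

q = math.sin(1.0)                 # a contraction constant for G
x_tilde = iterate(G, 3.0, 200)    # fixed point of G obtained by successive approximations
x_from_F = iterate(F, 3.0, 400)   # the iterates of F itself approach the same point

# Sample check that G shrinks distances at least by the factor q.
samples = [-4.0 + 0.37 * i for i in range(22)]
contracts = all(
    abs(G(x) - G(y)) <= q * abs(x - y) + 1e-15
    for x in samples for y in samples
)
```

In this sketch the point `x_tilde` is a fixed point of $F$ as well, in agreement with the statement of the exercise.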


Exercise 2.3.17. Let

$$K = \{x \in C[0, 1] : 0 \le x(t) \le 1,\ x(0) = 0,\ x(1) = 1\}, \qquad F : x(t) \mapsto t\,x(t).$$

Then $F(K) \subset K$, $F$ is non-expansive and there is no fixed point of $F$ in $K$! Prove these facts and explain their relation to Proposition 2.3.10.

Exercise 2.3.18. Let $x \in L^1(0, 1)$ be a solution of

$$x(t) = \lambda \int_0^t k(t, s)\,x(s)\,ds$$

where $\lambda$ and $k$ are as in Example 2.3.7. Prove that $x = 0$ a.e. in $(0, 1)$.

Hint. First show that $x \in L^\infty(0, 1)$. From the equation we have

$$\|x\|_{L^\infty(0,t)} \le |\lambda|\, t\, \|k\|_{L^\infty(M)} \|x\|_{L^\infty(0,t)}, \quad t \in (0, 1).$$

Now deduce $x = 0$ a.e. in $(0, 1)$.

Exercise 2.3.19. Prove Corollary 2.1.3 using Theorem 2.3.1.

Exercise 2.3.20. Let $f \in C[0, 1]$. Prove that there exists $\varepsilon_0 > 0$ such that for any $\varepsilon \in [0, \varepsilon_0]$ the boundary value problem

$$\ddot{x}(t) - x(t) + \varepsilon \arctan x(t) = f(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0,$$

has a unique solution $x \in C^2[0, 1]$.

Exercise 2.3.21. Let $K$ be a continuous real function on $[a, b] \times [a, b] \times \mathbb{R}$ and assume there exists a constant $N > 0$ such that for any $t, \tau \in [a, b]$, $z_1, z_2 \in \mathbb{R}$, we have

$$|K(t, \tau, z_1) - K(t, \tau, z_2)| \le N |z_1 - z_2|.$$

Let $h \in C[a, b]$ be fixed and let $\lambda \in \mathbb{R}$ be such that

$$|\lambda| < \frac{1}{N(b - a)}.$$

Prove that the integral equation

$$x(t) = \lambda \int_a^b K(t, \tau, x(\tau))\,d\tau + h(t)$$

has a unique solution $x \in C[a, b]$.

Exercise 2.3.22. Let $A : (a, b) \to \mathbb{R}^{M \times M}$ be a continuous matrix-valued function and let $\alpha \in (a, b)$, $\xi \in \mathbb{R}^M$.

(i) Modify the procedure from Example 2.3.7 to prove that the initial value problem

$$\dot{x}(t) = A(t)x(t), \qquad x(\alpha) = \xi,$$

has a unique solution which is defined on $(a, b)$.

(ii) Prove that the equation

$$\dot{x}(t) = A(t)x(t) \tag{2.3.13}$$

has $M$ linearly independent solutions $\varphi_1, \dots, \varphi_M$ on the interval $(a, b)$ and any solution of (2.3.13) is a linear combination of $\varphi_1, \dots, \varphi_M$. The matrix $\Phi = (\varphi_i^j)_{i,j=1,\dots,M}$ is called a fundamental matrix of (2.3.13).

(iii) Let $A$ be continuous on $\mathbb{R}$ and $T$-periodic ($T > 0$). Denote $C = \Phi(T)$ where $\Phi$ is a fundamental matrix, $\Phi(0) = I$. Suppose that $B$ is a solution of the equation $e^{TB} = C$ (see Exercise 1.1.42). Prove that $Q(t) := \Phi(t)e^{-tB}$ is regular for all $t \in \mathbb{R}$ and $T$-periodic. Moreover, $x$ is a solution to (2.3.13) if and only if $y(t) := Q^{-1}(t)x(t)$ is a solution of the equation $\dot{y} = By$ which has constant coefficients. Find a condition in terms of $\sigma(C)$ for the existence of a nontrivial $kT$-periodic solution to (2.3.13) ($k \in \mathbb{N}$).

(iv) Let $f : \mathbb{R} \to \mathbb{R}^M$ be a continuous and $T$-periodic mapping. Is there any relation between the existence of a nontrivial $T$-periodic solution to (2.3.13) and the existence of a $T$-periodic solution to the equation

$$\dot{x}(t) = A(t)x(t) + f(t)?$$

Hint. Use the Variation of Constant Formula and (iii).
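The successive approximations behind Exercise 2.3.21 can be sketched numerically. The sketch below rests on assumptions that are not part of the exercise: the kernel `K`, the function `h`, the value of `lam` and the trapezoidal discretization of the integral are all hypothetical choices made for illustration. Since the trapezoidal weights sum to $b - a$, the discretized operator is again a contraction with constant $|\lambda| N (b - a)$.

```python
import math

def solve_integral_equation(K, h, lam, a, b, m=100, sweeps=30):
    """Successive approximations for x(t) = lam * int_a^b K(t, tau, x(tau)) dtau + h(t).

    The integral is replaced by the composite trapezoidal rule on m subintervals.
    Returns the grid, the final iterate and the sup-norm distances between
    consecutive iterates.
    """
    step = (b - a) / m
    grid = [a + i * step for i in range(m + 1)]
    weights = [step] * (m + 1)
    weights[0] = weights[-1] = step / 2          # trapezoidal weights, sum = b - a
    x = [h(t) for t in grid]                     # starting point x_0 = h
    gaps = []
    for _ in range(sweeps):
        new = [
            lam * sum(w * K(t, tau, z) for w, tau, z in zip(weights, grid, x)) + h(t)
            for t in grid
        ]
        gaps.append(max(abs(p - q) for p, q in zip(new, x)))
        x = new
    return grid, x, gaps

# Hypothetical data: K is Lipschitz in z with N = 1, and |lam| < 1 / (N (b - a)) = 1.
K = lambda t, tau, z: math.sin(z) * math.cos(t - tau)
h = lambda t: t
grid, x, gaps = solve_integral_equation(K, h, lam=0.5, a=0.0, b=1.0)
```

The list `gaps` records $\|x_{k+1} - x_k\|$ in the maximum norm; each entry is at most $|\lambda| N (b - a) = 1/2$ times the previous one, which is the estimate used in the Contraction Principle.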

Chapter 3

Abstract Integral and Differential Calculus

3.1 Integration of Vector Functions

This short section is devoted to the integration of mappings which take values in a Banach space $X$. We will consider two types of domains of such mappings: either compact intervals or measurable spaces. For scalar functions the former case leads to the Riemann integral and the latter to the Lebesgue integral with respect to a measure.

Definition 3.1.1. Let $f : [a, b] \to X$. Let there exist $x \in X$ with the following property: For every $\varepsilon > 0$ there is $\delta > 0$ such that for all divisions $D := \{a = t_0 < \dots < t_n = b\}$ for which $|D| := \max_{i=1,\dots,n} (t_i - t_{i-1}) < \delta$ and for all choices $\tau_i \in [t_{i-1}, t_i]$, $i = 1, \dots, n$, the inequality

$$\left\| \sum_{i=1}^{n} f(\tau_i)(t_i - t_{i-1}) - x \right\|_X < \varepsilon \tag{3.1.1}$$

is satisfied. Then $x$ is called the Riemann integral of $f$ over $[a, b]$ and it is denoted by $\int_a^b f(t)\,dt$.

A basic existence theorem is a straightforward generalization of the classical (Riemann's) result.

Theorem 3.1.2 (Graves). Let $X$ be a Banach space and let $f : [a, b] \to X$ be continuous. Then the Riemann integral $\int_a^b f(t)\,dt$ exists.


Proof. Since $f$ is continuous on the compact interval $[a, b]$, $f$ is uniformly continuous on it. Take an equidistant division $D_n = \{a = t_0^n < \dots < t_n^n = b\}$ of the interval $[a, b]$, i.e.,

$$t_i^n = a + \frac{i}{n}(b - a), \quad i = 1, \dots, n, \qquad \text{and} \qquad |D_n| = \frac{b - a}{n}.$$

Then

$$s_n = \sum_{i=1}^{n} f(t_i^n)(t_i^n - t_{i-1}^n)$$

forms a Cauchy sequence (by the uniform continuity of $f$). Let $x = \lim_{n\to\infty} s_n$. It is easy to see, again by the uniform continuity of $f$, that condition (3.1.1) is satisfied whenever $|D|$ is sufficiently small.

Since the Riemann integral is linear and the estimate

$$\left\| \int_a^b f(t)\,dt \right\|_X \le \int_a^b \|f(t)\|_X\,dt \le \|f\|_{C([a,b],X)}(b - a)\,^1 \tag{3.1.2}$$

holds for each $f \in C([a, b], X)$, the integral is a linear continuous operator. Its commutativity with linear operators is important.

Proposition 3.1.3. Let $X$, $Y$ be Banach spaces and let $f : [a, b] \to X$ be Riemann integrable.

(i) If $A \in L(X, Y)$, then $Af$ is also integrable and

$$A \int_a^b f(t)\,dt = \int_a^b Af(t)\,dt \quad \text{holds.} \tag{3.1.3}$$

(ii) If $A : X \to Y$ is a linear closed operator and $Af$ is Riemann integrable, then

$$\int_a^b f(t)\,dt \in \operatorname{Dom} A$$

and (3.1.3) is true.

Proof. The verification of (i) is straightforward. For (ii) we choose a sequence $\{s_n\}_{n=1}^{\infty}$ of Riemann sums such that

$$\lim_{n\to\infty} s_n = \int_a^b f(t)\,dt \quad \text{and} \quad \lim_{n\to\infty} As_n = \int_a^b Af(t)\,dt.$$

The statement follows from the definition of a closed operator.
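The Riemann sums $s_n$ over equidistant divisions from the proof of Theorem 3.1.2 are easy to evaluate numerically. The following sketch assumes, purely for illustration, that $X = \mathbb{R}^2$ with the Euclidean norm and $f(t) = (\cos t, \sin t)$ on $[0, 1]$; neither choice comes from the text. It also checks the estimate (3.1.2), where $\|f\|_{C([0,1],X)} = 1$.

```python
import math

def riemann_sum(f, a, b, n):
    """Sum of f(t_i)(t_i - t_{i-1}) over the equidistant division t_i = a + i(b-a)/n."""
    dim = len(f(a))
    total = [0.0] * dim
    width = (b - a) / n
    for i in range(1, n + 1):
        value = f(a + i * width)
        total = [s + v * width for s, v in zip(total, value)]
    return total

def norm(x):
    """Euclidean norm on X = R^2 (the space X is an assumption of this sketch)."""
    return math.sqrt(sum(c * c for c in x))

f = lambda t: (math.cos(t), math.sin(t))          # continuous f : [0, 1] -> R^2
exact = (math.sin(1.0), 1.0 - math.cos(1.0))      # componentwise integral over [0, 1]

# Distance of s_n from the integral for increasingly fine divisions.
errors = [norm([s - e for s, e in zip(riemann_sum(f, 0.0, 1.0, n), exact)])
          for n in (10, 100, 1000)]
s = riemann_sum(f, 0.0, 1.0, 1000)
```

The entries of `errors` decrease as $|D_n| = 1/n$ shrinks, and `norm(s)` stays below $\|f\|_{C([0,1],X)}(b - a) = 1$, as (3.1.2) predicts.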

With the help of this generalization of the Riemann integral we can also prove a basic result on the existence and uniqueness of a solution of a differential equation in a Banach space.

1 Here $\|f\|_{C([a,b],X)} = \max_{t\in[a,b]} \|f(t)\|_X$.

Assume that $f : I \times G \to X$ where $I$ is an open interval in $\mathbb{R}$ and $G$ is an open subset of a Banach space $X$. By a solution of a differential equation

$$\dot{x} = f(t, x) \tag{3.1.4}$$

we mean a mapping $x : J \to X$ where $J$ is an open interval, $J \subset I$, such that $x(J) \subset G$ and for every $t \in J$ the limit

$$\dot{x}(t) := \lim_{\tau\to 0} \frac{x(t + \tau) - x(t)}{\tau}$$

exists and $\dot{x}(t) = f(t, x(t))$.

Theorem 3.1.4. Let $I$ be an open interval in $\mathbb{R}$ and let $G$ be an open subset of a Banach space $X$. Assume that $f : I \times G \to X$ is continuous and locally satisfies the Lipschitz condition with respect to the second variable, i.e., for every $s \in I$, $y \in G$ there are $\delta > 0$, $\hat{\delta} > 0$, $L > 0$ such that

$$\|f(t, x_1) - f(t, x_2)\| \le L\|x_1 - x_2\|$$

whenever $|t - s| < \delta$ and $\|x_i - y\| < \hat{\delta}$, $i = 1, 2$. Then for each $t_0 \in I$, $x_0 \in G$ there exists $h > 0$ such that the equation (3.1.4) has a unique solution on the interval $J = (t_0 - h, t_0 + h)$ which satisfies the initial condition

$$x(t_0) = x_0. \tag{3.1.5}$$

The proof of this theorem is based on the use of the Contraction Principle for the equivalent integral equation (see also the proof of Theorem 2.3.4)

$$x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\,ds,\,^2 \quad t \in J, \tag{3.1.6}$$

where the integral is the Riemann integral. The equivalence of (3.1.4), (3.1.5) and (3.1.6) is established in the following lemma.

Lemma 3.1.5. Suppose that $f$ is continuous on $I \times G$ and $(t_0, x_0) \in I \times G$. Then a continuous function $x : J \to G$ is a solution of (3.1.4) on an interval $J \subset I$ and satisfies the condition (3.1.5) if and only if $t_0 \in J$ and $x$ solves on $J$ the integral equation (3.1.6).

2 Recall that $\int_{t_0}^{t} g(s)\,ds = -\int_{t}^{t_0} g(s)\,ds$ for $t < t_0$ (see footnote 28 on page 94).


Proof. Step 1. Assume first that $x$ is a solution of (3.1.4). Then $x$ as well as the mapping $t \in J \mapsto f(t, x(t))$ are continuous on $J$. Choose $\tau \in J$ and integrate both sides of (3.1.4) over the interval $[t_0, \tau]$ (or $[\tau, t_0]$). Notice that both sides are Riemann integrable (Theorem 3.1.2). Moreover,

$$\varphi\left(\int_{t_0}^{\tau} \dot{x}(t)\,dt\right) = \int_{t_0}^{\tau} \varphi(\dot{x}(t))\,dt = \int_{t_0}^{\tau} \frac{d}{dt}\varphi(x(t))\,dt = \varphi(x(\tau)) - \varphi(x_0)$$

for all $\varphi \in X^*$ (the last equality follows from the so-called Basic Theorem of Calculus). By the Hahn–Banach Theorem, in particular Remark 2.1.17(ii), we have

$$\int_{t_0}^{\tau} \dot{x}(t)\,dt = x(\tau) - x_0,$$

i.e., $x$ satisfies (3.1.6).

Step 2. Suppose now that $x : J \to X$ is a continuous solution of (3.1.6). Then $x$ satisfies (3.1.5) and it remains to check that

$$\frac{d}{dt}\int_{t_0}^{t} f(s, x(s))\,ds = f(t, x(t)).$$

This can be done by copying the proof for the scalar case.

Proof of Theorem 3.1.4. Choose $\delta > 0$, $\hat{\delta} > 0$ small enough and $K > 0$ such that

$$\|f(s, x_1)\| \le K, \qquad \|f(s, x_1) - f(s, x_2)\| \le L\|x_1 - x_2\|$$

for $s \in [t_0 - \delta, t_0 + \delta]$, $\|x_i - x_0\| \le \hat{\delta}$, $i = 1, 2$. Let $0 < h \le \min\left\{\frac{\hat{\delta}}{K}, \frac{1}{2L}, \delta\right\}$ and

$$M_h = \{x \in C([t_0 - h, t_0 + h], X) : \|x(s) - x_0\| \le \hat{\delta} \text{ for } s \in [t_0 - h, t_0 + h]\}.$$

Then $M_h$ is a complete metric space (with respect to the induced metric) and the operator

$$F(x) : t \mapsto x_0 + \int_{t_0}^{t} f(s, x(s))\,ds, \quad t \in [t_0 - h, t_0 + h],$$

is well defined on $M_h$, $F(M_h) \subset M_h$ (by (3.1.2)), and

$$\|F(x_1) - F(x_2)\| = \sup_{t\in[t_0-h,t_0+h]} \left\| \int_{t_0}^{t} [f(s, x_1(s)) - f(s, x_2(s))]\,ds \right\| \le Lh\|x_1 - x_2\| \le \frac{1}{2}\|x_1 - x_2\| \quad \text{for } x_1, x_2 \in M_h.$$

By the Contraction Principle (Theorem 2.3.1), there is a unique $x \in M_h$ such that $F(x) = x$. Using Lemma 3.1.5 we conclude that $x$ is a solution of (3.1.4), (3.1.5) on the interval $J = (t_0 - h, t_0 + h)$.

Let $y$ be another solution of the same problem on the interval $J$. Then $y \in M_k$ for a $k \le h$. Because of the uniqueness in the Contraction Principle, $y(t) = x(t)$ for $t \in [t_0 - k, t_0 + k]$. Taking $(t_0 \pm k, x(t_0 \pm k))$ as new initial conditions we can extend the equality $y(t) = x(t)$ to a larger closed interval $[t_0 - \tilde{k}, t_0 + \tilde{k}]$, i.e., $y \in M_{\tilde{k}}$. This argument shows that $y(t) = x(t)$ for $t \in J$.
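The operator $F$ from the proof of Theorem 3.1.4 can be iterated numerically. The sketch below is not the authors' construction: it assumes the scalar right-hand side $f(t, x) = x$ (so $L = 1$ and $h = 1/(2L) = 0.5$), the initial data $t_0 = 0$, $x_0 = 1$, and a cumulative trapezoidal rule in place of the Riemann integral.

```python
import math

def picard_step(f, x, grid, t0_index, x0):
    """One application of F(x)(t) = x0 + int_{t0}^t f(s, x(s)) ds on a grid.

    The integral is approximated by the cumulative trapezoidal rule
    (a discretization chosen for this sketch).
    """
    values = [f(s, xs) for s, xs in zip(grid, x)]
    new = [x0] * len(grid)
    for i in range(t0_index + 1, len(grid)):          # integrate to the right of t0
        new[i] = new[i - 1] + 0.5 * (grid[i] - grid[i - 1]) * (values[i] + values[i - 1])
    for i in range(t0_index - 1, -1, -1):             # integrate to the left of t0
        new[i] = new[i + 1] - 0.5 * (grid[i + 1] - grid[i]) * (values[i] + values[i + 1])
    return new

# Hypothetical data: f(t, x) = x has Lipschitz constant L = 1, so h = 1/(2L) = 0.5.
f = lambda t, x: x
t0, x0, h, m = 0.0, 1.0, 0.5, 200
grid = [t0 - h + i * (2 * h) / (2 * m) for i in range(2 * m + 1)]
x = [x0] * len(grid)                                  # constant starting function
gaps = []
for _ in range(30):
    new = picard_step(f, x, grid, m, x0)
    gaps.append(max(abs(p - q) for p, q in zip(new, x)))
    x = new
error = max(abs(xi - math.exp(t)) for xi, t in zip(x, grid))
```

Consecutive entries of `gaps` shrink at least by the factor $Lh = 1/2$, and the iterates approach $e^t$, the solution of $\dot{x} = x$, $x(0) = 1$, up to the discretization error.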

Corollary 3.1.6. Let $f$ satisfy the assumptions of Theorem 3.1.4 where $I = (\alpha, \infty)$, $G = X$. If, moreover, $f$ is bounded on $I \times X$, then for each $t_0 \in I$ and $x_0 \in X$, (3.1.4) has a unique solution satisfying the initial condition (3.1.5) which is defined on the whole interval $I$.

Proof. The problem (3.1.4), (3.1.5) has a solution $x_\gamma$ on an interval $(\beta, \gamma) \subset I$. Such an interval exists due to Theorem 3.1.4 and the solution $x_\gamma$ is unique on this interval by a similar argument as in the proof of uniqueness. Denote

$$\tilde{\gamma} = \sup\{\gamma > \beta : \text{there is a solution } x_\gamma \text{ on } (\beta, \gamma)\}.$$

If $\gamma_1 < \gamma_2$ and $x_{\gamma_i}$ is a corresponding solution on $(\beta, \gamma_i)$, $i = 1, 2$, then

$$x_{\gamma_1}(t) = x_{\gamma_2}(t) \quad \text{for } t \in (\beta, \gamma_1)$$

(by uniqueness). This allows us to define the solution $x = x(t)$ on the entire interval $(\beta, \tilde{\gamma})$. Since

$$\|x(t) - x(s)\| \le \left\| \int_s^t f(\sigma, x(\sigma))\,d\sigma \right\| \le |t - s| \sup_{(\tau,y)\in I\times X} \|f(\tau, y)\|$$

for any $\beta < s < t < \tilde{\gamma}$, the solution $x$ is uniformly continuous on $(\beta, \tilde{\gamma})$ and, therefore, continuously extendable at $\tilde{\gamma}$ provided $\tilde{\gamma} < \infty$ (see Proposition 1.2.4). The local Theorem 3.1.4 allows us to continue $x$ as a solution beyond the value of $\tilde{\gamma}$, a contradiction. Hence $\tilde{\gamma} = \infty$. Similarly, we prove $\inf \beta = \alpha$.

Remark 3.1.7. Under the assumptions of Corollary 3.1.6 the solution $x$ depends continuously on the initial data. In order to formulate this result we denote by $x = x(\cdot\,; t_0, x_0)$ the solution of (3.1.4), (3.1.5) on the interval $I$. The continuous dependence now reads as follows: For any compact interval $J \subset I$, $t_0 \in J$, and any $\varepsilon > 0$ there is $\delta > 0$ such that

$$\|x(t; t_0, x_0) - x(t; t_1, x_1)\| < \varepsilon \quad \text{for all } t \in J$$

provided $t_1 \in J$, $|t_0 - t_1| < \delta$ and $\|x_0 - x_1\| < \delta$. Cf. Remark 2.3.5.

Remark 3.1.8. Another existence theorem for the scalar differential equation (3.1.4) (i.e., $X = \mathbb{R}^N$) is based on the continuity of $f$ only (cf. Proposition 5.1.13).


Warning. A generalization to an infinite dimensional space does not hold, e.g.,

$$f(x) = \left( |x_n|^{1/2} + \frac{1}{n + 1} \right)_{n=1}^{\infty} \quad \text{for } x = (x_1, x_2, \dots) \in c_0$$

where $c_0$ is the space of sequences which converge to zero. As a norm on $c_0$ the "sup norm" is taken. It is not difficult to see that the equation $\dot{x} = f(x)$ has no solution satisfying the initial condition $x(0) = o$!

We now turn to the integration of vector functions defined on a measurable space $(M, \Sigma, \mu)$ where $\Sigma$ is a $\sigma$-algebra of subsets of $M$ and $\mu$ is a (positive) measure defined on $\Sigma$. A generalization of the abstract Lebesgue integral^3 can be done in two different ways: either by integrating $\varphi \circ f$ over $M$ for all $\varphi \in X^*$ or by approximating $f$ by step functions for which the integral is naturally defined, and then passing to the limit. The former approach leads to a weak integral and the latter one to the so-called Bochner integral. Since an existence theorem for the weak integral (i.e., the existence of $x \in X$ such that $\varphi(x) = \int_M \varphi(f)\,d\mu$ for all $\varphi \in X^*$) is complicated, we only briefly describe the less general Bochner integral.

Definition 3.1.9. Let $(M, \Sigma, \mu)$ be a measurable space and let $X$ be a Banach space.

(1) A function $s : M \to X$ is called a step function if there are pairwise disjoint sets $M_1, \dots, M_n$ in $\Sigma$ with $\mu(M_k) < \infty$, $k = 1, \dots, n$, such that $s$ is constant (say, equal to $x_k$) on each $M_k$ and

$$s(t) = o \quad \text{for } t \in M \setminus \bigcup_{k=1}^{n} M_k.$$

The integral of $s$ is defined by

$$\int_M s\,d\mu = \sum_{k=1}^{n} x_k\,\mu(M_k).$$

(2) A function $f : M \to X$ is said to be strongly measurable if there is a sequence $\{s_m\}_{m=1}^{\infty}$ of step functions such that

$$\lim_{m\to\infty} s_m(t) = f(t) \quad \text{exists for } \mu\text{-a.a. } t \in M.$$

(3) A strongly measurable function $f : M \to X$ is said to be Bochner integrable if there is a sequence $\{s_m\}_{m=1}^{\infty}$ of step functions which converges to $f$ $\mu$-a.e. and

$$\lim_{m\to\infty} \int_M \|f - s_m\|_X\,d\mu = 0. \tag{3.1.7}$$

In this case we put

$$\int_M f\,d\mu = \lim_{m\to\infty} \int_M s_m\,d\mu. \tag{3.1.8}$$

3 The reader who is not acquainted with measure theory and the abstract Lebesgue integral can assume that $M$ is an open subset of $\mathbb{R}^N$, $\mu$ is a Lebesgue measure and $\Sigma$ is a collection of all Lebesgue measurable subsets of $M$.

Remark 3.1.10. In order to show that this definition is correct we need to prove that the norm of a strongly measurable function is a measurable function (this is obvious) and, therefore, the condition (3.1.7) makes sense. From (3.1.7) it also immediately follows that the limit in (3.1.8) does not depend on any special choice of $\{s_n\}_{n=1}^{\infty}$.

The following statement offers a very useful criterion for Bochner integrability.

Proposition 3.1.11 (Bochner). Let $X$ be a Banach space and let $(M, \Sigma, \mu)$ be a measurable space. A strongly measurable vector function $f : M \to X$ is Bochner integrable if and only if the norm $\|f\|_X$ is Lebesgue integrable. Moreover,

$$\left\| \int_M f\,d\mu \right\|_X \le \int_M \|f\|_X\,d\mu. \tag{3.1.9}$$

Proof. Step 1. Let $f$ be Bochner integrable and let $\{s_n\}_{n=1}^{\infty}$ be a sequence of step functions from the definition. Then $\left\{\int_M \|s_n\|\,d\mu\right\}_{n=1}^{\infty}$ is a Cauchy sequence (by (3.1.7)), in particular, its limit, say $\alpha \in \mathbb{R}$, exists. Then

$$\left\| \int_M f\,d\mu \right\| = \lim_{n\to\infty} \left\| \int_M s_n\,d\mu \right\| \le \lim_{n\to\infty} \int_M \|s_n\|\,d\mu = \alpha.$$

It is easy to see that $\alpha$ does not depend on any special choice of $\{s_n\}_{n=1}^{\infty}$ from the definition and, moreover,

$$\int_M \|f\|\,d\mu \le \int_M \|f - s_n\|\,d\mu + \int_M \|s_n\|\,d\mu, \quad \text{i.e.,} \quad \int_M \|f\|\,d\mu \le \alpha.$$

Step 2. Suppose now that $\|f\|$ is Lebesgue integrable and that $\{s_n\}_{n=1}^{\infty}$ is a sequence of step functions from the definition of strong measurability. Put

$$\sigma_n(t) = \begin{cases} s_n(t) & \text{if } \|s_n(t)\| \le \left(1 + \frac{1}{n}\right)\|f(t)\|, \\ o & \text{otherwise.} \end{cases}$$

Then $\sigma_n \to f$ $\mu$-a.e. and, by the Lebesgue Dominated Convergence Theorem,

$$\int_M \|\sigma_n - f\|\,d\mu \to 0 \quad \text{since} \quad \|\sigma_n(t) - f(t)\| \le \left(2 + \frac{1}{n}\right)\|f(t)\|.$$

It follows from the independence of $\alpha$ of the special choice of approximating step functions that $\alpha \le \int_M \|f\|\,d\mu$. This proves the inequality (3.1.9).
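The integral of a step function from Definition 3.1.9 and the inequality (3.1.9) can be checked on a small example. This sketch assumes, for illustration only, that $M = [0, 1)$ with the Lebesgue measure, $X = \mathbb{R}^2$ with the Euclidean norm, and $f(t) = (t, t^2)$; the step functions $s_m$ take the value of $f$ at the left endpoint of each of $2^m$ dyadic pieces.

```python
import math

def integrate_step(values, measures):
    """Integral of a step function: the sum of x_k * mu(M_k) over the pieces M_k."""
    dim = len(values[0])
    return [sum(x[d] * mu for x, mu in zip(values, measures)) for d in range(dim)]

def norm(x):
    """Euclidean norm on X = R^2 (an assumption of this sketch)."""
    return math.sqrt(sum(c * c for c in x))

f = lambda t: (t, t * t)            # hypothetical f : [0, 1) -> R^2

def step_approximation(m):
    """Step function s_m equal to f(left endpoint) on each of the 2^m dyadic pieces."""
    pieces = 2 ** m
    measures = [1.0 / pieces] * pieces                 # Lebesgue measure of each piece
    values = [f(k / pieces) for k in range(pieces)]
    return values, measures

integrals = [integrate_step(*step_approximation(m)) for m in (2, 6, 12)]
values, measures = step_approximation(12)
lhs = norm(integrate_step(values, measures))                      # || int s dmu ||
rhs = sum(norm(x) * mu for x, mu in zip(values, measures))        # int ||s|| dmu
```

The vectors in `integrals` approach $(1/2, 1/3)$, the componentwise integral of $f$, and `lhs <= rhs` reflects (3.1.9) at the level of step functions, where it is just the triangle inequality.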


Proposition 3.1.12. Let $X$, $Y$ be Banach spaces, let $(M, \Sigma, \mu)$ be a measurable space and let $f : M \to X$ be Bochner integrable.

(i) If $A \in L(X, Y)$, then $Af$ is also Bochner integrable and

$$\int_M Af\,d\mu = A \int_M f\,d\mu. \tag{3.1.10}$$

(ii) If $A$ is a closed linear operator from $X$ into $Y$ and $Af$ is Bochner integrable, then

$$\int_M f\,d\mu \in \operatorname{Dom} A$$

and (3.1.10) holds.

Proof. The proof of statement (i) is straightforward. To prove (ii) let

$$Z = \{(x, Ax) : x \in \operatorname{Dom} A\}$$

be equipped with the graph norm $\|(x, Ax)\|_Z := \|x\|_X + \|Ax\|_Y$. Since $A$ is closed and $X$, $Y$ are Banach spaces, $Z$ is a Banach space as well. The crucial point of the proof is to show that

$$g(t) := (f(t), Af(t)), \quad t \in M,$$

is strongly measurable. Once this is achieved^4, the rest of the proof is easy: By Proposition 3.1.11, $g$ is Bochner integrable and

$$\int_M g\,d\mu = \left( \int_M f\,d\mu,\ \int_M Af\,d\mu \right) \in Z$$

(since $g$ maps into $Z$, its integral has to belong to $Z$, too). In particular, $\int_M f\,d\mu \in \operatorname{Dom} A$ and (3.1.10) holds.

4 We sketch the proof of this result: Let $\varphi \in Z^*$. According to the Hahn–Banach Theorem there is an extension $\Phi = (\Phi_1, \Phi_2)$ of $\varphi$ to $(X \times Y)^*$. Since $f$ and $Af$ are strongly measurable, we conclude that $t \mapsto \varphi(g(t)) = \Phi_1(f(t)) + \Phi_2(Af(t))$ is measurable. It can be also shown that there is $N \subset M$, $\mu(M \setminus N) = 0$ such that $g(N)$ is separable. The result now follows from the Pettis Theorem (see, e.g., Dunford & Schwartz [44, Chapter III, 6], Yosida [135]): A function $g : M \to Z$ (Banach space) is strongly measurable if and only if the following two conditions are satisfied:

(i) For every $\varphi \in Z^*$ the function $t \mapsto \varphi(g(t))$ is a measurable function.
(ii) There is $N \subset M$ such that $\mu(M \setminus N) = 0$ and $g(N)$ is a separable subspace of $Z$.


Remark 3.1.13. If $f : M \to X$ is a Bochner integrable function and $\varphi \in X^*$, then, by the previous proposition, $\varphi(f) : M \to \mathbb{R}$ (or $\mathbb{C}$) is integrable (in this case the Bochner and the Lebesgue integrals coincide). This shows that the Bochner integral is a restriction of any notion of a weak integral.

We now return to the functional calculus given for matrices (see Theorem 1.1.38). Let $B \in L(X)$ and let $H(\sigma(B))$ be a collection of holomorphic functions on a neighborhood of $\sigma(B)$ (this neighborhood can depend on a function). If $f \in H(\sigma(B))$, then there exists a positively oriented Jordan curve $\gamma$ such that $\sigma(B) \subset \operatorname{int} \gamma$ and $f$ is holomorphic on a neighborhood of $\operatorname{int} \gamma$. Hence the integral

$$f(B)x := \frac{1}{2\pi i} \int_\gamma f(w)(wI - B)^{-1}x\,dw, \quad x \in X,$$

exists. Its properties are collected in the following assertions.

Proposition 3.1.14 (Dunford Functional Calculus). Let $X$ be a complex Banach space and let $B \in L(X)$. There exists a unique linear mapping $\Phi : H(\sigma(B)) \to L(X)$ with the following properties:

(i) $\Phi(fg) = \Phi(f)\Phi(g) = \Phi(g)\Phi(f)$ for $f, g \in H(\sigma(B))$;
(ii) if $P(w) = \sum_{j=0}^{n} a_j w^j$, then $P(B) = \sum_{j=0}^{n} a_j B^j$;
(iii) if $f(w) = \frac{1}{\lambda - w}$ for $w \ne \lambda$ and $\lambda \notin \sigma(B)$, then $f(B) = (\lambda I - B)^{-1}$;
(iv) if $\{f_n\}_{n=1}^{\infty} \subset H(\sigma(B))$, $f_n \rightrightarrows f$ on a neighborhood of $\operatorname{int} \gamma$, then we have $f_n(B) \to f(B)$ in the norm topology;
(v) if $f \in H(\sigma(B))$, then $\sigma(f(B)) = f(\sigma(B))$.

Proof. The proof can be found, e.g., in Dunford & Schwartz [44, Section VII.3].

Suppose that $\lambda_0 \in \sigma(B)$ is an isolated point of the spectrum of $B$, i.e., there exist disjoint neighborhoods $U_0$ of $\lambda_0$ and $U$ of $\sigma(B) \setminus \{\lambda_0\}$. The function

$$f(w) = \begin{cases} 1 & \text{for } w \in U_0, \\ 0 & \text{for } w \in U, \end{cases}$$

belongs to the collection $H(\sigma(B))$ and the operator $P_0 := f(B)$ is a projection of $X$ onto the subspace $X_0 := P_0(X)$ since $P_0^2 = P_0$. The operator $P_1 := I - P_0$ is a projection onto the complementary subspace $X_1$, i.e., $X = X_0 \oplus X_1$. Denote by $B_0$ and $B_1$ the restrictions of $B$ onto $X_0$ (i.e., $B_0 \in L(X_0)$) and $X_1$, respectively. Proposition 3.1.14(v) implies that

$$\sigma(B_0) = \{\lambda_0\}, \qquad \sigma(B_1) = \sigma(B) \setminus \{\lambda_0\}.$$
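For a matrix, the contour integral defining $f(B)$ and the projection $P_0$ can be approximated directly. The sketch below is illustrative only: the $2 \times 2$ matrix `B`, the circular contours and the trapezoidal quadrature in the angle are assumptions made here, not constructions from the text. On a circle $w = c + re^{i\varphi}$ we have $dw = ire^{i\varphi}\,d\varphi$, so each quadrature node contributes $f(w)\,re^{i\varphi}/n$ times the resolvent.

```python
import cmath
import math

def resolvent(B, w):
    """(wI - B)^{-1} for a 2x2 matrix B, via the explicit inverse formula."""
    a, b = w - B[0][0], -B[0][1]
    c, d = -B[1][0], w - B[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def contour_integral(f, B, center, radius, nodes=400):
    """(1 / (2 pi i)) * integral of f(w) (wI - B)^{-1} dw over a circle.

    Uses the trapezoidal rule in the angle; w = center + radius * e^{i phi}.
    """
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        phi = 2 * math.pi * k / nodes
        e = cmath.exp(1j * phi)
        w = center + radius * e
        R = resolvent(B, w)
        factor = f(w) * radius * e / nodes       # f(w) dw / (2 pi i) at this node
        for i in range(2):
            for j in range(2):
                total[i][j] += factor * R[i][j]
    return total

B = [[2.0, 1.0], [0.0, 5.0]]                     # hypothetical matrix with spectrum {2, 5}
# Circle around lambda_0 = 2 only: f = 1 near lambda_0 gives the projection P_0.
P0 = contour_integral(lambda w: 1.0, B, center=2.0, radius=1.0)
# Circle around the whole spectrum with f(w) = w^2: compare with Proposition 3.1.14(ii).
B_squared = contour_integral(lambda w: w * w, B, center=3.5, radius=3.0)
```

Up to quadrature error, `P0` satisfies $P_0^2 = P_0$ and $BP_0 = \lambda_0 P_0$ with $\lambda_0 = 2$, and `B_squared` agrees with the matrix product $B \cdot B$.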


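Put

$$\gamma_0 := \{\lambda_0 + re^{i\varphi} : \varphi \in [0, 2\pi]\} \quad \text{for a (small) positive } r.$$

Using Proposition 3.1.14 we get (see Exercise 3.1.16)

$$\begin{aligned}
(\mu I_0 - B_0)^{-1}x_0 &= \frac{1}{2\pi i} \int_{\gamma_0} \frac{(\lambda I_0 - B_0)^{-1}x_0}{\mu - \lambda}\,d\lambda \\
&= \frac{1}{2\pi i} \int_{\gamma_0} \frac{1}{\mu - \lambda_0} \sum_{n=0}^{\infty} \left( \frac{\lambda - \lambda_0}{\mu - \lambda_0} \right)^n (\lambda I_0 - B_0)^{-1}x_0\,d\lambda \\
&= \frac{x_0}{\mu - \lambda_0} + \sum_{n=1}^{\infty} \frac{(-1)^n}{(\mu - \lambda_0)^{n+1}} (\lambda_0 I_0 - B_0)^n x_0
\end{aligned} \tag{3.1.11}$$

for $x_0 \in X_0$, $|\mu - \lambda_0| > r$. The Taylor series for the function $\mu \mapsto (\mu I_1 - B_1)^{-1}x_1$ has the form (see Exercise 3.1.16)

$$\begin{aligned}
(\mu I_1 - B_1)^{-1}x_1 &= \sum_{n=0}^{\infty} \frac{(\mu - \lambda_0)^n}{n!} \left. \frac{d^n}{d\lambda^n} (\lambda I_1 - B_1)^{-1}x_1 \right|_{\lambda=\lambda_0} \\
&= \sum_{n=0}^{\infty} (-1)^n (\mu - \lambda_0)^n (\lambda_0 I_1 - B_1)^{-(n+1)} x_1,
\end{aligned} \tag{3.1.12}$$

$x_1 \in X_1$, $|\mu - \lambda_0| < r_0 := \|(\lambda_0 I_1 - B_1)^{-1}\|^{-1}$.

Proposition 3.1.15. If $\lambda_0$ is an isolated point of the spectrum $\sigma(B)$, $B \in L(X)$, then there exist operators $A_n \in L(X)$, $n \in \mathbb{Z}$, and $r > 0$ such that

$$(\mu I - B)^{-1}x = \sum_{n=-\infty}^{+\infty} (\mu - \lambda_0)^n A_n x \tag{3.1.13}$$

for all $x \in X$ and $0 < |\mu - \lambda_0| < r$. Moreover, if $k \in \mathbb{N}$ is such that

$$A_{-n} = O \quad \text{for every } n > k \quad \text{and} \quad z := A_{-k}x \ne o,$$

then $Bz = \lambda_0 z$. On the other hand, if $\lambda_0$ is a nonzero eigenvalue of a compact operator $B$, then $\lambda_0$ is a pole of the resolvent of $B$, i.e., there is $k \in \mathbb{N}$ such that

$$A_{-n} = O \quad \text{for all } n > k.$$

Proof. Let $\lambda_0$ be an isolated point of $\sigma(B)$ and $B \in L(X)$. If $P_0$, $P_1$ are the above projections onto $X_0$, $X_1$, then

$$(\mu I - B)^{-1}x = (\mu I - B_0)^{-1}x_0 + (\mu I - B_1)^{-1}x_1, \qquad P_0 x = x_0, \quad P_1 x = x_1,$$

and (3.1.13) follows from (3.1.11) and (3.1.12). Since $A_{-(k+1)}x = (B - \lambda_0 I)A_{-k}x$, the second statement holds as well.

Suppose now that $B$ is compact and $\lambda_0 \ne 0$ is an eigenvalue of $B$. By Corollary 2.2.13, $\lambda_0$ is an isolated point of $\sigma(B)$. Since the restriction $B_0$ of $B$ onto the subspace $X_0$ has the spectrum $\sigma(B_0)$ consisting of $\lambda_0$ only, $B_0^{-1}$ exists and is continuous. Therefore, the unit ball $B(x_0; 1) = B_0(B_0^{-1}(B(x_0; 1)))$ is a compact set. Proposition 1.2.15 says that $M = \dim X_0$ is finite. It follows from Lemma 1.1.31 that $X_0 = \operatorname{Ker}(\lambda_0 I_0 - B_0)^k$ for a certain $k \in \mathbb{N}$ and (see (1.1.20))

$$(\mu I_0 - B_0)^{-1}x_0 = \sum_{n=0}^{k-1} \frac{(-1)^n}{(\mu - \lambda_0)^{n+1}} (\lambda_0 I_0 - B_0)^n x_0 = \sum_{n=1}^{k} \frac{A_{-n}x}{(\mu - \lambda_0)^n}$$

where $P_0 x = x_0$. The proof is complete.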

Exercise 3.1.16. Give details to confirm the formulae for the resolvent (3.1.11) and (3.1.12).

Hint. For (3.1.11) interchange the sum and the integral and use Proposition 3.1.14(ii). For (3.1.12) use the resolvent identity

$$(\lambda I - B)^{-1} - (\mu I - B)^{-1} = (\mu - \lambda)(\lambda I - B)^{-1}(\mu I - B)^{-1}$$

and induction.

Exercise 3.1.17. Compare the functional calculus from Proposition 3.1.14 with that of Remark 2.2.18. More precisely, show that for a compact, self-adjoint operator $B$ the functional calculus given in Remark 2.2.18 is an extension of that of Proposition 3.1.14.

Exercise 3.1.18. Let $X$ be a Banach space. Assume that $f : [a, b] \to X$ has the Riemann integral over the interval $[a, b]$. Show that then the Bochner integral $\int_a^b f(t)\,dt$ also exists and the two integrals are equal. In particular, Proposition 3.1.3 is a special case of Proposition 3.1.12. However, the proof of Proposition 3.1.3(ii) is much simpler.

Exercise 3.1.19. Let $A : \operatorname{Dom} A \subset H \to H$ be a densely defined linear operator on a Hilbert space $H$. Assume that $A$ has a compact resolvent that is also self-adjoint.

(i) Extend the functional calculus (Remark 2.2.18 and Proposition 3.1.14) to such $A$. In particular, show that the formula for $\Phi(f)x$ still holds provided that

$$\sum_{n=1}^{\infty} |f(\lambda_n)|^2 |(x, e_n)|^2 < \infty$$

(here $\sigma(A) = \{\lambda_1, \dots\}$). Notice that $\Phi(f)$ need not be bounded if $\dim H = \infty$ since $\sigma(A)$ is unbounded in this case. Also, $\Phi(f)$, $\Phi(g)$ do not commute in general.

(ii) Suppose that $\sigma(A)$ is bounded above. Show that the function

$$u(t) := e^{tA}x_0 \quad \text{with } x_0 \in \operatorname{Dom} A$$

is a continuous solution to the initial value problem

$$\dot{x}(t) = Ax(t), \qquad x(0) = x_0. \tag{3.1.14}$$

(iii) Prove that

$$\int_0^t e^{sA}x\,ds \in \operatorname{Dom} A \quad \text{for all } x \in H$$

and

$$A \int_0^t e^{sA}x\,ds = e^{tA}x - x.$$

In other words, $e^{tA}x$ is a continuous solution of the integral form of (3.1.14).

(iv) Prove that

$$\int_0^\infty e^{-\lambda t}e^{tA}x\,dt = (\lambda - A)^{-1}x \quad \text{for all } x \in H$$

and sufficiently large $\operatorname{Re} \lambda$ (actually for $\operatorname{Re} \lambda > \sup\{(Ax, x) : x \in \operatorname{Dom} A,\ \|x\| = 1\}$).

(v) Let $g : [0, \infty) \to H$ be a continuous mapping and $u : [0, \infty) \to H$ a solution to the initial value problem

$$\dot{x}(t) = Ax(t) + g(t), \qquad x(0) = x_0.$$

Show that

$$u(t) = e^{tA}x_0 + \int_0^t e^{(t-s)A}g(s)\,ds \quad \text{for } t \ge 0.$$

(vi) Find conditions on a continuous mapping $h : H \to H$ such that the existence of a continuous solution to the integral equation

$$x(t) = e^{tA}x_0 + \int_0^t e^{(t-s)A}h(x(s))\,ds$$

follows from the Contraction Principle. Such a solution is called a mild solution of the problem

$$\dot{x}(t) = Ax(t) + h(x(t)), \qquad x(0) = x_0. \tag{3.1.15}$$

If $-A$ is as in Exercise 2.2.21(i), (ii), (iv), or, more generally, $A = \Delta$ with suitable boundary conditions, then (3.1.15) is a semilinear partial differential equation of parabolic type.

Exercise 3.1.20. Let $B$ be a compact, self-adjoint operator on a Hilbert space $H$ and let $\lambda_0 \in \sigma(B) \setminus \{0\}$. Compute $A_n$ in the expression (3.1.13).
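The formula of Exercise 3.1.19(v) can be tested in the simplest possible setting. The sketch below assumes, for illustration only, that $H = \mathbb{R}$, that $A$ is multiplication by a number $a$, and that $g(s) = \cos s$; the integral is approximated by the composite Simpson rule, and a closed-form solution of $\dot{x} = ax + \cos t$ serves as a reference.

```python
import math

def mild_solution(a, x0, g, t, m=2000):
    """u(t) = e^{ta} x0 + int_0^t e^{(t-s)a} g(s) ds for a scalar 'operator' A = a.

    The integral is approximated by the composite Simpson rule with m (even) subintervals.
    """
    step = t / m
    integrand = lambda s: math.exp((t - s) * a) * g(s)
    total = integrand(0.0) + integrand(t)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * integrand(k * step)
    return math.exp(t * a) * x0 + total * step / 3

def exact_solution(a, x0, t):
    """Closed-form solution of x' = a x + cos t, x(0) = x0 (used only as a check)."""
    particular = lambda s: (-a * math.cos(s) + math.sin(s)) / (1 + a * a)
    return math.exp(a * t) * (x0 - particular(0.0)) + particular(t)

a, x0 = -2.0, 1.0                      # hypothetical data: H = R, A = multiplication by a
times = (0.5, 1.0, 3.0)
u = [mild_solution(a, x0, math.cos, t) for t in times]
v = [exact_solution(a, x0, t) for t in times]
```

The two lists agree up to quadrature error, which is the scalar instance of the Variation of Constants Formula.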

3.2 Differential Calculus in Normed Linear Spaces

We suppose that the reader is acquainted with partial derivatives and the differential of functions of two real variables. Our goal in this section is to extend these notions to mappings between normed linear spaces. Many infinite dimensional spaces differ from $\mathbb{R}^N$ by the lack of any natural basis. In particular this means that there is no way of generalizing partial derivatives. We define a directional derivative instead.

Definition 3.2.1. Let $X$, $Y$ be normed linear spaces (both over the same scalar field) and let $f : X \to Y$. If for $a, h \in X$ the limit (in the norm of $Y$)

$$\lim_{\substack{t\to 0 \\ t\in\mathbb{R}}} \frac{f(a + th) - f(a)}{t}$$

exists, then its value is called the derivative of $f$ at the point $a$ and in the direction $h$ (or directional derivative or Gâteaux variation) and is denoted by $\delta f(a; h)$. If $\delta f(a; h)$ exists for all $h \in X$ and the mapping $Df(a) : h \mapsto \delta f(a; h)$ is linear and continuous, then $Df(a)$ is called the Gâteaux derivative of $f$ at the point $a$.^5

Remark 3.2.2. Simple examples of functions of two variables show that the directional derivative need not be linear in $h$ and not even the existence of $Df(a)$ guarantees the continuity of $f$ at the point $a$.

5 The terminology concerning Gâteaux differentiability is not fixed. Some authors do not assume linearity of $Df(a)$.

Example 3.2.3. Consider the standard bases $e_1^M, \dots, e_M^M$ and $e_1^N, \dots, e_N^N$ of $\mathbb{R}^M$ and $\mathbb{R}^N$, respectively. Then we can write $f : \mathbb{R}^M \to \mathbb{R}^N$ in the form

$$f(x) = \sum_{i=1}^{N} f^i(x)\,e_i^N \quad (\text{or briefly } f = (f^1, \dots, f^N)).$$

It is easy to see that $\delta f(a; h)$ exists if and only if $\delta f^i(a; h)$ exists for all $i = 1, \dots, N$. In particular, for $h = e_j^M$, the directional derivative $\delta f^i(a; h)$ is nothing else than $\frac{\partial f^i}{\partial x_j}(a)$. This means that the Gâteaux derivative $Df(a)$ has the matrix representation with respect to the standard bases in the form

$$\begin{pmatrix}
\dfrac{\partial f^1}{\partial x_1}(a) & \dots & \dfrac{\partial f^1}{\partial x_M}(a) \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f^N}{\partial x_1}(a) & \dots & \dfrac{\partial f^N}{\partial x_M}(a)
\end{pmatrix}.$$

This matrix is called the Jacobi matrix of $f$ at the point $a$. If $M = N$, then its determinant is denoted by

$$\frac{\partial(f^1, \dots, f^M)}{\partial(x_1, \dots, x_M)} = J_f$$

and is called the Jacobian of $f$ at $a$.
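The relation between directional derivatives and the Jacobi matrix in Example 3.2.3 can be checked numerically. In the sketch below the map `f`, the point `a` and the direction `h` are hypothetical, and the limit in Definition 3.2.1 is replaced by a central difference quotient with a small fixed `t`, which approximates $\delta f(a; h)$ only when that limit exists.

```python
import math

def directional_derivative(f, a, h, t=1e-6):
    """Central difference quotient approximating delta f(a; h)."""
    plus = f([ai + t * hi for ai, hi in zip(a, h)])
    minus = f([ai - t * hi for ai, hi in zip(a, h)])
    return [(p - q) / (2 * t) for p, q in zip(plus, minus)]

def jacobi_matrix(f, a):
    """Matrix whose j-th column is delta f(a; e_j), i.e. the entries d f^i / d x_j (a)."""
    M = len(a)
    columns = [directional_derivative(f, a, [1.0 if k == j else 0.0 for k in range(M)])
               for j in range(M)]
    N = len(columns[0])
    return [[columns[j][i] for j in range(M)] for i in range(N)]

# Hypothetical map f : R^2 -> R^3.
f = lambda x: [x[0] * x[1], math.sin(x[0]), x[1] ** 2]
a = [1.0, 2.0]
J = jacobi_matrix(f, a)
h = [0.3, -0.7]
Jh = [sum(J[i][j] * h[j] for j in range(2)) for i in range(3)]   # matrix applied to h
direct = directional_derivative(f, a, h)                          # delta f(a; h) directly
```

For this smooth map the vectors `Jh` and `direct` agree up to discretization error, which reflects the linearity of $h \mapsto \delta f(a; h)$ required of a Gâteaux derivative.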

Example 3.2.4. Suppose that $H$ is a Hilbert space and $f : H \to \mathbb{R}$ (or $\mathbb{C}$) has a Gâteaux derivative $Df(a)$ at $a \in H$. Then, by the Riesz Representation Theorem, there exists a unique point $\nabla f(a) \in H$ such that

$$Df(a)h = (h, \nabla f(a))_H.$$

The element $\nabla f(a)$ is called the gradient of $f$ at $a$. Notice that the gradient $\nabla f$ is a mapping from $H$ into itself.

Remark 3.2.5. One of the most important applications of the notion of derivative is in extremal problems of classical analysis. The well-known theorem (due to Fermat) asserts that the derivative is zero at an extremal point provided this derivative exists. The same result obviously holds for $f : X \to \mathbb{R}$ also in an infinite dimensional space $X$.^6

The previous remark indicates the use of the notion of derivative for solving the equation $F(x) = o$ for $F : H \to H$. Namely, suppose that there is a functional $f : H \to \mathbb{R}$ such that $\nabla f = F$. Then it is sufficient to show that $f$ has a local maximum or minimum. However, it is a very nontrivial problem to find such $f$ (the so-called potential of $F$) or to find conditions to ensure its existence. See Chapter 6 for more details. A discussion of the finite dimensional case ($H = \mathbb{R}^M$) is given in Appendix 4.3B (Remark 4.3.62 and Theorem 4.3.64).

We postpone examples since various properties of the derivative will be needed to introduce them.

6 A simple reason for this observation comes from the fact that the directional derivative $\delta f(a; h)$ describes the behavior of the functional $f$ along the straight line $\{a + th : t \in \mathbb{R}\}$, i.e., the behavior of the real function $t \mapsto f(a + th)$ near zero.

Theorem 3.2.6 (Mean Value Theorem). Let $X$ be a normed linear space and $Y$ a Banach space. Let $f : X \to Y$ have the directional derivative at all points of the segment joining points $a, b \in X$ in the direction of this segment, i.e., $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$. If the mapping $t \mapsto \delta f(a + t(b - a); b - a)$ is continuous on $[0, 1]$, then

$$f(b) - f(a) = \int_0^1 \delta f(a + t(b - a); b - a)\,dt. \tag{3.2.1}$$

Proof. Take a $\varphi \in Y^*$ and denote

$$g(t) = \varphi(f(a + t(b - a))), \quad t \in [0, 1].$$

By the definition of the directional derivative, we have

$$g'(t) = \varphi[\delta f(a + t(b - a); b - a)]$$

and $g'$ is continuous on $[0, 1]$. It follows from the Basic Theorem of Calculus that

$$\int_0^1 \varphi[\delta f(a + t(b - a); b - a)]\,dt = g(1) - g(0) = \varphi(f(b)) - \varphi(f(a)).$$

The Riemann integral $\int_0^1 \delta f(a + t(b - a); b - a)\,dt$ exists (see Theorem 3.1.2) and, by Proposition 3.1.12(i), we get

$$\int_0^1 \varphi[\delta f(a + t(b - a); b - a)]\,dt = \varphi\left( \int_0^1 \delta f(a + t(b - a); b - a)\,dt \right).$$

Since $\varphi \in Y^*$ has been chosen arbitrarily, the Hahn–Banach Theorem (in particular, Remark 2.1.17(ii)) implies the equality (3.2.1).

The following result offers another possible formulation.

Theorem 3.2.7 (Mean Value Theorem). Let $X$, $Y$ be normed linear spaces and let $f : X \to Y$. If for given $a, b \in X$ the directional derivative $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$, then

$$\|f(b) - f(a)\|_Y \le \sup_{t\in[0,1]} \|\delta f(a + t(b - a); b - a)\|_Y \tag{3.2.2}$$

and

$$\|f(b) - f(a) - \delta f(a; b - a)\|_Y \le \sup_{t\in[0,1]} \|\delta f(a + t(b - a); b - a) - \delta f(a; b - a)\|_Y. \tag{3.2.3}$$

Moreover, if $Df(a + t(b - a))$ exists for all $t \in [0, 1]$, then

$$\|f(b) - f(a)\|_Y \le \sup_{t\in[0,1]} \|Df(a + t(b - a))\|_{L(X,Y)} \|b - a\|_X. \tag{3.2.4}$$

Proof. An idea similar to the previous proof is used. By the dual characterization of the norm (Corollary 2.1.16) there is ϕ ∈ Y∗, ‖ϕ‖ = 1, such that ‖f(b) − f(a)‖ = ϕ(f(b) − f(a)). Define now

$$g(t) = \varphi(f(a + t(b - a))), \qquad t \in [0, 1].$$

Then g′(t) = ϕ(δf(a + t(b − a); b − a)) and, therefore, the function g satisfies all assumptions of the classical Mean Value Theorem. Consequently, if X is a real space, we get

$$\|f(b) - f(a)\| = g(1) - g(0) = g'(\vartheta) = \varphi(\delta f(a + \vartheta(b - a); b - a)) \le \|\delta f(a + \vartheta(b - a); b - a)\| \quad \text{for a } \vartheta \in (0, 1).$$

If X is a complex space, we consider Re g and obtain

$$\|f(b) - f(a)\| \le \sup_{\vartheta \in (0,1)} |g'(\vartheta)|$$

(see the next remark) and the assertion also follows. The proof of (3.2.3) is similar and (3.2.4) is an easy consequence of (3.2.2).

Remark 3.2.8. The Mean Value Theorem for functions from R → R is often stated in the following form: There is ϑ ∈ (0, 1) such that

$$f(b) - f(a) = f'(a + \vartheta(b - a))(b - a)$$

provided f is continuous on the interval [a, b] and f′(x) exists for every x ∈ (a, b).

Warning. This equality does not hold even for f : R → C (∼ R²) (e.g., f(x) = e^{ix}, a = 0, b = 2π)!

Example 3.2.9. Differentiability of the norm is connected with the properties of the corresponding space (see, e.g., Fabian et al. [49, Chapter 5]). As a simple example we will show the relation between the uniqueness of the supporting hyperplane at a given point a ∈ X, ‖a‖ = 1, and the Gâteaux differentiability of the norm at the point a. We recall that by Corollary 2.1.16 there is ϕ ∈ X∗, ‖ϕ‖ = 1, such that ϕ(a) = ‖a‖ = 1 and

$$\operatorname{Re} \varphi(x) \le 1 \quad \text{for all } x \in X,\ \|x\| \le 1.^{7}$$

⁷ The hyperplane M = {x ∈ X : ϕ(x) = 1} is then called a supporting hyperplane to the unit ball of X at the point a. Such a ϕ ∈ X∗ need not be uniquely determined.


Put f(x) = ‖x‖. Fix h ∈ X and let g(t) = ‖a + th‖, t ∈ R. The function g is a convex real function, and therefore the right and the left derivatives at zero exist and g′₋(0) ≤ g′₊(0). Further, we have

$$\frac{g(t) - g(0)}{t} \ge \frac{\varphi(a + th) - \varphi(a)}{t} = \varphi(h) \quad \text{for } t > 0.$$

In particular, g′₊(0) ≥ ϕ(h) and similarly g′₋(0) ≤ ϕ(h). This means that ϕ is uniquely determined provided the directional derivative of the norm exists at a for all h ∈ X. In particular,

$$\delta f(a; h) = \varphi(h),$$

i.e., the norm is Gâteaux differentiable at a. The converse is also true. Indeed, suppose by contradiction that δf(a; h) does not exist for an h, i.e., g′₊(0) > g′₋(0). Choose α ∈ [g′₋(0), g′₊(0)] and define ψ(γa + th) = γ + tα for scalars γ, t. Then

$$\alpha \le g'_+(0) \le \frac{g(t) - g(0)}{t} = \frac{\|a + th\| - \|a\|}{t} \quad \text{for } t > 0,$$

and therefore ψ(a + th) = 1 + tα ≤ ‖a + th‖. The same inequality holds for t ≤ 0. As an easy consequence we get

$$|\psi(\gamma a + th)| \le \|\gamma a + th\|, \qquad \psi(a) = 1.$$

This means that

$$\psi \in Y^*, \quad \|\psi\| = 1 \quad \text{where} \quad Y = \operatorname{Lin}\{a, h\}.$$

The Hahn–Banach Theorem yields an extension ϕ of ψ which determines a supporting hyperplane. Since for a different α we get a different ϕ, there is no uniqueness of supporting hyperplanes at a and the duality mapping⁸ is not single-valued at a.

Similarly to partial derivatives, the Gâteaux derivative is also unsuitable for the Chain Rule for differentiability. We recommend that the reader construct examples of f : R² → R, g : R → R² such that f(g) has no derivative at o in spite of the fact that

$$Df(o) = 0, \qquad g(0) = o, \qquad g'(0) = (0, 0).$$

For this purpose a stronger notion of differentiability is needed. The following definition is a straightforward generalization of the differential of a function of two variables.

⁸ The map κ : X → exp X∗ : κ(x) := {f ∈ X∗ : ‖f‖_{X∗} = ‖x‖_X, f(x) = ‖x‖²_X} is called the duality mapping. It is a multivalued mapping and belongs to the fundamental concepts in the Banach space theory.


Definition 3.2.10. Let X, Y be normed linear spaces (both over the same scalar field). A mapping f : X → Y is said to be Fréchet differentiable at a point a ∈ X if there exists A ∈ L(X, Y) such that

$$\lim_{h \to o} \frac{\|f(a + h) - f(a) - Ah\|_Y}{\|h\|_X} = 0. \tag{3.2.5}$$

In this case A is called the Fréchet derivative of f at the point a and is denoted by f′(a).

Remark 3.2.11.
(i) If f′(a) exists, then also Df(a) exists. Moreover, f′(a)h = Df(a)h for all h ∈ X.
(ii) Suppose that a linear operator A : X → Y has the property (3.2.5). It is easy to see that A is continuous if and only if f is continuous at a.
(iii) A basic analytical approach to the investigation of nonlinear problems involves their approximation by simpler objects. Among them linear approximations are more appropriate from the local point of view. The classical notion of the derivative as the best local linear approximation is the most transparent confirmation of this phenomenon (e.g., the Fermat Theorem for local extremal points). The notion of Fréchet derivative is a genuine generalization to infinite dimensional spaces.

Theorem 3.2.12 (Chain Rule). Let X, Y, Z be normed linear spaces and let there exist δg(a; h) for g : X → Y. If g(a) = b and for f : Y → Z the Fréchet derivative f′(b) exists, then

$$\delta(f \circ g)(a; h) = f'(b)[\delta g(a; h)].^{9} \tag{3.2.6}$$

Proof. Choose ε > 0 and h ∈ X. By (3.2.5) there is η > 0 such that

$$\|f(b + k) - f(b) - f'(b)k\|_Z \le \varepsilon \|k\|_Y \quad \text{for } \|k\|_Y < \eta.$$

Put ω(t) := g(a + th) − g(a) − tδg(a; h). By Definition 3.2.1, there is δ̂ > 0 such that ‖ω(t)‖_Y ≤ ε|t| for |t| < δ̂. For k := g(a + th) − g(a) = g(a + th) − b we have

$$\|k\|_Y \le |t| \, \|\delta g(a; h)\|_Y + \|\omega(t)\|_Y \le |t| \, [\|\delta g(a; h)\|_Y + \varepsilon].$$

⁹ For more transparent notation we will often use the symbol f ∘ g instead of f(g) for the composition of f and g.


We may choose δ̂ so small that the right-hand side in this inequality is less than η whenever |t| < δ̂. Using all the information and δg(a; h) = (k − ω(t))/t we obtain

$$\left\| \frac{f(g(a + th)) - f(g(a))}{t} - f'(b)[\delta g(a; h)] \right\|_Z = \left\| \frac{f(b + k) - f(b) - f'(b)k}{t} + f'(b)\frac{\omega(t)}{t} \right\|_Z \le \frac{\varepsilon \|k\|_Y}{|t|} + \|f'(b)\|_{L(Y,Z)} \frac{\|\omega(t)\|_Y}{|t|} \le \varepsilon \left[ \varepsilon + \|\delta g(a; h)\|_Y + \|f'(b)\|_{L(Y,Z)} \right]$$

for 0 < |t| < δ̂. The formula (3.2.6) follows.

Corollary 3.2.13. Let the hypotheses of Theorem 3.2.12 be satisfied. If, moreover, Dg(a) exists, then also D(f ∘ g)(a) exists and the analogue of (3.2.6) is true. A similar assertion is true for (f ∘ g)′(a) provided g′(a) exists.

Proof. The assertion on D(f ∘ g)(a) follows from (3.2.6). The proof for (f ∘ g)′(a) is similar to that given above.

Corollary 3.2.14. Let A ∈ L(Y, Z) and let δf(a; h) exist for f : X → Y. Then δ(Af)(a; h) = Aδf(a; h) and similarly for D(Af)(a) and (Af)′(a).

Proof. It is sufficient to show that

$$A'(y) = A \quad \text{for all } y \in Y,$$

but this follows immediately from the definition.

The verification of the degree of linear approximation needed in (3.2.5) is not always an easy task. The following condition can be of use in such situations.

Proposition 3.2.15. Let Df(x) exist for all x in a neighborhood of a point a ∈ X. If x ↦ Df(x) is continuous at a (as a mapping X → L(X, Y)), then f′(a) exists.

Proof. According to the estimate (3.2.3) we have for small h,

$$\|f(a + h) - f(a) - Df(a)h\|_Y \le \sup_{t \in [0,1]} \|Df(a + th) - Df(a)\|_{L(X,Y)} \, \|h\|_X$$

and the continuity of Df yields (3.2.5).

Definition 3.2.16. Let G be an open set in X and let f : X → Y. If the Gâteaux derivative Df : G → L(X, Y) is continuous on G (or equivalently, f′ is continuous on G), then we write f ∈ C¹(G).

One of the convenient conditions for the existence of the differential (i.e., Fréchet derivative) of f : R² → R is the continuity of partial derivatives. These can be interpreted also as derivatives with respect to one-dimensional subspaces. A generalization leads to the following definition.


Definition 3.2.17. Let f : X → Y where X = X₁ × X₂ and X₁, X₂, Y are normed linear spaces.¹⁰ Let a₂ ∈ X₂ and let f₁ : x₁ ↦ f(x₁, a₂). If f₁ has the Gâteaux (or Fréchet) derivative at a₁ ∈ X₁, then Df₁(a₁) (or f₁′(a₁)) is called the partial Gâteaux (or Fréchet) derivative of f at (a₁, a₂) with respect to the first variable and is denoted by D₁f(a₁, a₂) (or f′₁(a₁, a₂)). Similarly the partial derivative with respect to the second variable (D₂f or f′₂) is defined. If Df(a₁, a₂) exists, then also D₁f(a₁, a₂), D₂f(a₁, a₂) exist and

$$Df(a_1, a_2)(h_1, h_2) = D_1 f(a_1, a_2)h_1 + D_2 f(a_1, a_2)h_2. \tag{3.2.7}$$

For the converse assertion we need more assumptions:

Proposition 3.2.18. Assume that D₂f exists on a neighborhood of a point (a₁, a₂) and the mapping D₂f : X₁ × X₂ → L(X₂, Y) is continuous at (a₁, a₂). Assume, moreover, that D₁f exists at the point (a₁, a₂). Then Df(a₁, a₂) exists and (3.2.7) holds.

Proof. Choose sufficiently small h₁, h₂. Then, by (3.2.3),

$$\begin{aligned}
&\|f(a_1 + th_1, a_2 + th_2) - f(a_1, a_2) - tD_1 f(a_1, a_2)h_1 - tD_2 f(a_1, a_2)h_2\| \\
&\quad \le \|f(a_1 + th_1, a_2 + th_2) - f(a_1 + th_1, a_2) - tD_2 f(a_1 + th_1, a_2)h_2\| \\
&\qquad + \|D_2 f(a_1 + th_1, a_2) - D_2 f(a_1, a_2)\| \, |t| \, \|h_2\| + \|f(a_1 + th_1, a_2) - f(a_1, a_2) - tD_1 f(a_1, a_2)h_1\| \\
&\quad \le \sup_{0 \le \tau \le 1} \|D_2 f(a_1 + th_1, a_2 + t\tau h_2) - D_2 f(a_1 + th_1, a_2)\| \, |t| \, \|h_2\| \\
&\qquad + \|D_2 f(a_1 + th_1, a_2) - D_2 f(a_1, a_2)\| \, |t| \, \|h_2\| + \left\| \frac{f(a_1 + th_1, a_2) - f(a_1, a_2)}{t} - D_1 f(a_1, a_2)h_1 \right\| |t| = o(|t|)
\end{aligned}$$

as t → 0, and the result follows.

Remark 3.2.19. If, in addition to the assumptions of Proposition 3.2.18, f′₁(a₁, a₂) exists, then f′(a₁, a₂) exists, too. The proof then follows the same lines as that above.

Corollary 3.2.20. Let G be an open subset of X = X₁ × X₂ and f : X → Y. Then f ∈ C¹(G) if and only if both f₁, f₂ belong to C¹(G).

¹⁰ Then X is a normed linear space, too. A norm on X is, for example, defined by ‖x‖_X := ‖x₁‖_{X₁} + ‖x₂‖_{X₂}, x = (x₁, x₂) ∈ X₁ × X₂.


Example 3.2.21. One of the most important nonlinear mappings is the so-called Nemytski operator which is sometimes also called the substitution (or superposition) operator. As the latter term indicates, it arises by the substitution of a function ϕ : G ⊂ R^M → R into the function f : G × R → R. This leads to a new operator

$$F : \varphi \mapsto f(\cdot, \varphi(\cdot))$$

which acts on a space X of functions ϕ. We wish to find conditions on f for F to be a mapping from X into X and to have some derivatives. We start with the case X = C[0, 1]. It is clear that the continuity of f on [0, 1] × R is sufficient to guarantee that F : X → X. Since f is uniformly continuous on compact sets of the form

$$\{(x, y) \in [0, 1] \times \mathbb{R} : |y - \varphi(x)| \le 1\} \quad \text{for } \varphi \in C[0, 1],$$

F is also continuous on X. Suppose now that the partial derivative ∂f/∂y is continuous on [0, 1] × R. For ϕ, h ∈ X we have, by the classical Mean Value Theorem,

$$\frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} = \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta(t, x)th(x))h(x)$$

for a ϑ(t, x) ∈ (0, 1) and

$$\sup_{x \in [0,1]} \left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right| \le \sup_{x \in [0,1]} \sup_{0 \le \vartheta \le 1} \left| \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta th(x)) - \frac{\partial f}{\partial y}(x, \varphi(x)) \right| |h(x)| \le \varepsilon \|h\|_{C[0,1]}$$

for all sufficiently small |t| (again by uniform continuity of ∂f/∂y on compact sets). This means that the Gâteaux derivative DF(ϕ) exists and

$$DF(\varphi)h : x \mapsto \frac{\partial f}{\partial y}(x, \varphi(x))h(x).$$
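The formula for DF(ϕ)h can be checked numerically. The following Python sketch is only an illustration under assumed data: the functions f(x, y) = sin y + xy², ϕ(x) = x², h(x) = cos 3x, the grid on [0, 1] and the values of t are hypothetical choices that do not come from the text. It approximates the sup-norm of the difference between the quotient (F(ϕ + th) − F(ϕ))/t and ∂f/∂y(·, ϕ(·))h(·) on the grid.

```python
import math

# Hypothetical data, chosen only for illustration.
def f(x, y):
    return math.sin(y) + x * y * y

def df_dy(x, y):
    # Partial derivative of f with respect to y.
    return math.cos(y) + 2.0 * x * y

def phi(x):
    return x * x

def h(x):
    return math.cos(3.0 * x)

GRID = [i / 1000 for i in range(1001)]  # sample points in [0, 1]

def sup_error(t):
    """Grid approximation of sup_x |(F(phi + t h) - F(phi))(x)/t - df_dy(x, phi(x)) h(x)|."""
    return max(
        abs((f(x, phi(x) + t * h(x)) - f(x, phi(x))) / t - df_dy(x, phi(x)) * h(x))
        for x in GRID
    )

errors = [sup_error(10.0 ** (-k)) for k in range(1, 5)]
for k, err in zip(range(1, 5), errors):
    print(f"t = 1e-{k}: sup error on the grid ~ {err:.3e}")
```

For this smooth f the printed errors should shrink roughly in proportion to t, which is consistent with the existence of the Gâteaux derivative; a grid maximum is of course only a lower bound for the true supremum.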

Moreover, DF is continuous as a mapping X → L(X) (again by the uniform continuity of ∂f/∂y) and, by Proposition 3.2.15, F′(ϕ) exists for any ϕ ∈ X.

Warning. It is not always true that the existence of ∂f/∂y implies the existence of DF!

For example, let X = {ϕ ∈ C[0, ∞) : ϕ(x)e^{−x} is bounded on [0, ∞)} with the norm

$$\|\varphi\| := \sup_{x \in [0,\infty)} |\varphi(x)e^{-x}|$$


and let f(y) = sin y. Since f is Lipschitz continuous with constant 1 we obtain ‖F(ϕ₁) − F(ϕ₂)‖ ≤ ‖ϕ₁ − ϕ₂‖. In particular, F is a continuous mapping from X into itself. But

$$\delta F(0; h) \ne \sin'(0)h, \qquad h \in X,$$

as could be erroneously supposed by analogy. Namely, for h(x) = e^x ∈ X,

$$\sup_{x \in [0,\infty)} e^{-x} \left| \frac{\sin(te^x) - 0}{t} - e^x \right| = \sup_{y \in [t,\infty)} \left| \frac{\sin y}{y} - 1 \right| \ge 1 \quad \text{for any } t > 0.^{11}$$

Similar calculations yield that δF(0; h), h ≠ o, does not exist at all.
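The estimate in this Warning can be observed numerically. The sketch below is an illustration only: the truncation of [0, ∞) to [0, 20], the grid size and the values of t are assumptions made for the example. It evaluates the weighted norm of (F(th) − F(0))/t − h for h(x) = e^x on a grid.

```python
import math

def weighted_gap(t, x_max=20.0, n=20000):
    """Grid approximation of sup over [0, x_max] of e^{-x} |sin(t e^x)/t - e^x|."""
    best = 0.0
    for i in range(n + 1):
        x = x_max * i / n
        quotient = math.sin(t * math.exp(x)) / t  # (F(t h) - F(0))(x) / t with h(x) = e^x
        best = max(best, math.exp(-x) * abs(quotient - math.exp(x)))
    return best

gaps = [weighted_gap(t) for t in (1.0, 0.1, 0.001)]
for t, gap_value in zip((1.0, 0.1, 0.001), gaps):
    print(f"t = {t}: weighted sup-norm of the difference ~ {gap_value:.4f}")
```

The computed values do not decrease as t becomes small, in agreement with the lower bound 1 above: the difference quotient does not approach sin′(0)h = h in the norm of X.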


¹¹ The lack of differentiability of the Nemytski operators in weighted spaces causes big problems in the use of the Implicit Function Theorem.

The study of the Nemytski operator in spaces of integrable functions is much more complicated. First it has to be proved that F(ϕ) is a measurable function on Ω for ϕ ∈ L^p(Ω). The following notion is crucial for this purpose.

Definition 3.2.22. Let Ω be an open set in R^N. A function f : Ω × R → R is said to have the Carathéodory property (notation: f ∈ CAR(Ω × R)) if
(M) for all y ∈ R the function x ↦ f(x, y) is (Lebesgue) measurable on Ω;
(C) for a.a. x ∈ Ω the function y ↦ f(x, y) is continuous on R.

Proposition 3.2.23. Let Ω be an open set in R^N. Then
(i) if f : Ω × R → R is continuous on Ω × R, then f ∈ CAR(Ω × R);
(ii) if f ∈ CAR(Ω × R) and ϕ : Ω → R is (Lebesgue) measurable on Ω, then

$$F(\varphi) : x \mapsto f(x, \varphi(x)), \qquad x \in \Omega,$$

is a measurable function on Ω.

Proof. (i) Since a continuous function f(·, y) is Lebesgue measurable, the assertion is obvious.
(ii) Let ϕ be a measurable function on Ω. Then there is a sequence {s_n}_{n=1}^∞ of step functions which converge to ϕ a.e. in Ω. If

$$s(x) = \sum_{i=1}^{k} \alpha_i \chi_{\Omega_i}(x)$$

is a step function on Ω, i.e., there are pairwise disjoint measurable sets Ω₁, …, Ω_k,

$$\Omega = \bigcup_{i=1}^{k} \Omega_i \quad \text{and} \quad \chi_{\Omega_i}(x) = \begin{cases} 1, & x \in \Omega_i, \\ 0, & x \notin \Omega_i, \end{cases}$$

then

$$f(x, s(x)) = \sum_{i=1}^{k} f(x, \alpha_i)\chi_{\Omega_i}(x)$$

is a measurable function (property (M) in Definition 3.2.22). By property (C),

$$\lim_{n \to \infty} f(x, s_n(x)) = f(x, \varphi(x)) \quad \text{for a.a. } x \in \Omega,$$

i.e., F(ϕ)(x) = f(x, ϕ(x)) is measurable.

Having measurability of F(ϕ) we can ask when F(ϕ) ∈ L^q(Ω). It is plausible that a certain growth condition for f is needed.

Theorem 3.2.24. Let f ∈ CAR(Ω × R) and p, q ∈ [1, ∞). Let there exist g ∈ L^q(Ω) and c ∈ R such that

$$|f(x, y)| \le g(x) + c|y|^{p/q} \quad \text{for a.a. } x \in \Omega \text{ and all } y \in \mathbb{R}. \tag{3.2.8}$$

Then
(i) F(ϕ) ∈ L^q(Ω) for all ϕ ∈ L^p(Ω);¹²
(ii) F is a continuous mapping from L^p(Ω) into L^q(Ω);
(iii) F maps bounded sets in L^p(Ω) into bounded sets in L^q(Ω).

Proof. The proof of (i) is based on Proposition 3.2.23 and the use of the Minkowski inequality (Example 1.2.16) and it is straightforward. The proof of (ii) is quite involved and its crucial step consists in the fact that F maps sequences converging in measure into sequences with the same property. We omit details (see, e.g., Krasnoselski [78, § I.2] or Appell & Zabreiko [8]). The property (iii) follows from the growth condition (3.2.8).

¹² Actually, it can be proved that this property implies that (3.2.8) is satisfied for a g ∈ L^q(Ω) and c ∈ R, cf. Appell & Zabreiko [8].

Remark 3.2.25. The Carathéodory property can be generalized to functions f : Ω × R^M → R. Proposition 3.2.23 and Theorem 3.2.24 hold similarly for

$$F(\varphi_1, \ldots, \varphi_M)(x) := f(x, \varphi_1(x), \ldots, \varphi_M(x)).$$

Remark 3.2.26. Let Ω be an open subset of R^N and let f : Ω × R^{N+1} → R satisfy the Carathéodory property. Assume, moreover, there exist g ∈ L^q(Ω) and c ∈ R such that

$$|f(x, y)| \le g(x) + c \sum_{i=0}^{N} |y_i|^{p/q}$$

for a.a. x ∈ Ω and all y = (y₀, y₁, …, y_N) ∈ R^{N+1}. Then F defined by F(u)(x) := f(x, u(x), ∇u(x)) is a continuous mapping from W^{1,p}(Ω) into L^q(Ω) and maps bounded sets into bounded sets.


The growth condition with respect to y₀ can be relaxed according to the Embedding Theorem for W^{1,p}(Ω) (cf. Fučík & Kufner [54]).

Now we turn our attention to the directional derivative of the Nemytski operator in the space L²(Ω). The exponents p = q = 2 are considered for simplicity only. In accordance with the computation in Example 3.2.21 we could expect

$$\delta F(\varphi; h)(x) = \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \tag{3.2.9}$$

provided the right-hand side belongs to L²(Ω). This is true if ∂f/∂y(·, ϕ(·)) ∈ L^∞(Ω), e.g., whenever ∂f/∂y is a bounded continuous function on Ω × R. But this is not the whole story since we have to show that

$$\int_\Omega \left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right|^2 dx \to 0 \quad \text{for } t \to 0.$$

For a.a. x ∈ Ω the function under the integral sign can be estimated by the Mean Value Theorem (the formula (3.2.1)):

$$\left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right| \le \frac{1}{|t|} \sup_{0 \le \vartheta \le 1} \left| \frac{\partial f}{\partial y}(x, \varphi(x) + t\vartheta h(x)) - \frac{\partial f}{\partial y}(x, \varphi(x)) \right| |t| \, |h(x)|.^{13}$$

The right-hand side converges to zero for t → 0 for a.a. x ∈ Ω (by the continuity of ∂f/∂y). In order to justify the use of the Lebesgue Dominated Convergence Theorem we need a square integrable majorant. In particular, boundedness of ∂f/∂y on Ω × R is sufficient.¹⁴

In the case when F depends also on the gradient of ϕ the situation is only technically slightly more complicated. Nemytski operators appear often under the integral (see Chapters 6 and 7). Since the integral is a continuous linear form, and in particular Fréchet differentiable, we can use the Chain Rule to get

$$D\Phi(\varphi)h = \int_\Omega \left[ \frac{\partial f}{\partial y_0}(x, \varphi(x), \nabla\varphi(x))h(x) + \sum_{i=1}^{N} \frac{\partial f}{\partial y_i}(x, \varphi(x), \nabla\varphi(x)) \frac{\partial h}{\partial x_i}(x) \right] dx \tag{3.2.10}$$

¹³ It is worth noticing how the classical Mean Value Theorem is used here: to avoid problems with measurability of x ↦ ∂f/∂y(x, ϕ(x) + ϑ(x)th(x)), “the inequality form” of the theorem is employed.

¹⁴ The reader should notice problems in finding conditions which ensure that (3.2.9) is also the Fréchet derivative. The situation is even much worse than one would expect. The function f has to be linear for F : L²(Ω) → L²(Ω) to be Fréchet differentiable (see, e.g., Ambrosetti & Prodi [6, Chapter 1, Proposition 2.8]). See also Exercise 3.2.41 and Remark 3.2.42.


for

$$\Phi(\varphi) := \int_\Omega f(x, \varphi(x), \nabla\varphi(x)) \, dx,$$

under appropriate assumptions on f.

Now we turn our attention to higher derivatives. We restrict our attention to the second derivatives and believe that the reader will be able to define the third and higher order derivatives as well. Higher order derivatives of functions are defined by induction. We will do the same for abstract mappings. Let f : X → Y, and a, h, k ∈ X. Put g(t, s) = f(a + th + sk). Then

$$\frac{\partial g}{\partial t}(0, s) = \delta f(a + sk; h),$$

which is a mapping from R (of variable s) into Y and can be differentiated again:

$$\frac{\partial^2 g}{\partial t \, \partial s}(0, 0) := \frac{\partial}{\partial s} \left( \frac{\partial g}{\partial t}(0, s) \right) \bigg|_{s=0}.$$

If these derivatives exist, then

$$\delta^2 f(a; h, k) := \frac{\partial}{\partial s} \left( \frac{\partial g}{\partial t}(0, s) \right) \bigg|_{s=0}$$

is called the second directional derivative (in the directions h, k). Notice that generally δ²f(a; h, k) ≠ δ²f(a; k, h). (Find an example for f : R² → R!) It is easy to see that for f : R^M → R we have

$$\delta^2 f(a; e_i, e_j) = \frac{\partial^2 f}{\partial x_i \, \partial x_j}(a)$$

if e_i, e_j are the unit coordinate vectors in R^M. It may occur that the operator (h, k) ∈ X × X ↦ δ²f(a; h, k) is linear in both variables (i.e., it is a so-called bilinear operator) and is continuous on X × X.¹⁵ In that case δ²f(a; ·, ·) is called the second Gâteaux derivative and is denoted by D²f(a).

¹⁵ Equivalently, it is continuous at the point (o, o) if there is a constant c such that ‖δ²f(a; h, k)‖_Y ≤ c‖h‖_X‖k‖_X for all h, k ∈ X. (See a similar assertion in Proposition 1.2.10 for a linear operator.) Denoting the space of all continuous bilinear operators from X × X into Y by B₂(X, Y) we see that the least possible constant c in the above inequality is a norm on B₂(X, Y). See also the important Proposition 2.1.7.


Proposition 3.2.27 (Taylor Formula). Let X be a normed linear space and Y a Banach space. Assume that a, h ∈ X and that δ²f(x; h, h) exists for all x ∈ M := {a + th : t ∈ [0, 1]} and is continuous as a mapping from M into Y. Then

$$f(a + h) = f(a) + \delta f(a; h) + \int_0^1 (1 - t)\,\delta^2 f(a + th; h, h) \, dt.^{16} \tag{3.2.11}$$

Proof. Put g(t) = (1 − t)δf(a + th; h). Then we have

$$g'(t) = -\delta f(a + th; h) + (1 - t)\,\delta^2 f(a + th; h, h), \qquad t \in [0, 1].$$

Since both terms on the right-hand side are continuous we get

$$g(1) - g(0) = \int_0^1 g'(t) \, dt = -\int_0^1 \delta f(a + th; h) \, dt + \int_0^1 (1 - t)\,\delta^2 f(a + th; h, h) \, dt.$$

Using Theorem 3.2.6 we obtain (3.2.11).
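A finite-dimensional sanity check of the Taylor formula (3.2.11) may be helpful. The Python sketch below is an illustration under assumptions: the function f(x, y) = eˣ sin y + xy² on R², the point a, the direction h and the composite Simpson rule used for the integral are all hypothetical choices made for the example. Here δf(a; h) is the gradient applied to h and δ²f(x; h, h) is the quadratic form of the matrix of second partial derivatives.

```python
import math

# Hypothetical test function on R^2 with hand-computed derivatives.
def f(x, y):
    return math.exp(x) * math.sin(y) + x * y * y

def d1(x, y, h):
    """First directional derivative delta f((x, y); h)."""
    fx = math.exp(x) * math.sin(y) + y * y
    fy = math.exp(x) * math.cos(y) + 2.0 * x * y
    return fx * h[0] + fy * h[1]

def d2(x, y, h):
    """Second directional derivative delta^2 f((x, y); h, h)."""
    fxx = math.exp(x) * math.sin(y)
    fxy = math.exp(x) * math.cos(y) + 2.0 * y
    fyy = -math.exp(x) * math.sin(y) + 2.0 * x
    return fxx * h[0] ** 2 + 2.0 * fxy * h[0] * h[1] + fyy * h[1] ** 2

def simpson(g, n=1000):
    """Composite Simpson rule on [0, 1] with an even number n of subintervals."""
    step = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * step)
    return total * step / 3.0

a = (0.3, -0.7)
h = (0.5, 1.2)

lhs = f(a[0] + h[0], a[1] + h[1])
remainder = simpson(lambda t: (1.0 - t) * d2(a[0] + t * h[0], a[1] + t * h[1], h))
rhs = f(a[0], a[1]) + d1(a[0], a[1], h) + remainder
gap = abs(lhs - rhs)
print(f"f(a + h) = {lhs:.12f}, right-hand side of (3.2.11) = {rhs:.12f}, gap = {gap:.2e}")
```

The gap should be of the size of the quadrature error only, since (3.2.11) is an exact identity and not an approximation.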

If we wanted to define the second Fréchet derivative also by induction we should differentiate f′ : X → L(X, Y) at a ∈ X to obtain f″(a) ∈ L(X, L(X, Y)). But this space seems to be rather strange and the space L(X, L(X, L(X, Y))) (for f‴(a)) really awkward. Because of that we identify L(X, L(X, Y)) with the space of continuous bilinear operators B₂(X, Y) (see footnote 15) and define the second Fréchet derivative f″(a) to be an element of B₂(X, Y) with the approximation property

$$\lim_{h \to o} \frac{\|f'(a + h) - f'(a) - f''(a)(h, \cdot)\|_{L(X,Y)}}{\|h\|_X} = 0. \tag{3.2.12}$$

The careful reader can ask why we have written f″(a)(h, ·) and not f″(a)(·, h) in (3.2.12). The reason is that the mapping (h, k) ↦ f″(a)(h, k) is actually symmetric.

Proposition 3.2.28. If f″(a) exists, then

$$f''(a)(h, k) = f''(a)(k, h) \quad \text{for all } h, k \in X.$$

Proof. Similarly to the proof of the classical result on mixed partial derivatives we express the difference

$$f(a + h + k) - f(a + h) - f(a + k) + f(a)$$

which is equal to g_i(1) − g_i(0) for

$$g_1(t) := f(a + th + k) - f(a + th), \qquad t \in [0, 1],$$

$$g_2(s) := f(a + h + sk) - f(a + sk), \qquad s \in [0, 1].$$

¹⁶ If for n ∈ N the nth directional derivative δⁿf(a; h, …, h) exists for all h ∈ X, then the mapping

$$h \mapsto f(a) + \sum_{k=1}^{n} \frac{1}{k!} \, \delta^k f(a; \underbrace{h, \ldots, h}_{k\text{-times}})$$

is called the Taylor polynomial of the degree n of f at the point a.


Since f″(a) exists, both the mappings f and f′ are defined on a neighborhood U of a. Elements h, k are chosen so small that all variables belong to U. We can express

$$g_i(1) - g_i(0) = A_i + g_i'(0) \quad \text{where } A_i := g_i(1) - g_i(0) - g_i'(0),$$

and

$$g_1'(0) = e_1(h, k) + f''(a)(k, h), \qquad e_1(h, k) := f'(a + k)h - f'(a)h - f''(a)(k, h),$$

$$g_2'(0) = e_2(k, h) + f''(a)(h, k), \qquad e_2(k, h) := f'(a + h)k - f'(a)k - f''(a)(h, k).$$

Since g₁(1) − g₁(0) = g₂(1) − g₂(0), we have

$$f''(a)(h, k) - f''(a)(k, h) = A_1 - A_2 + e_1(h, k) - e_2(k, h). \tag{3.2.13}$$

Now we estimate all terms on the right-hand side of this equality. By Theorem 3.2.7,

$$\|A_i\| \le \sup_{t \in [0,1]} \|g_i'(t) - g_i'(0)\|.$$

Since f″(a) is bilinear, we have

$$\begin{aligned}
g_1'(t) - g_1'(0) &= f'(a + th + k)h - f'(a + th)h - f'(a + k)h + f'(a)h \\
&= [f'(a + th + k)h - f'(a)h - f''(a)(th + k, h)] - [f'(a + th)h - f'(a)h - f''(a)(th, h)] \\
&\quad - [f'(a + k)h - f'(a)h - f''(a)(k, h)].
\end{aligned} \tag{3.2.14}$$

Choose now ε > 0 and δ > 0 corresponding to the definition of f″(a) such that

$$\|f'(a + u)v - f'(a)v - f''(a)(u, v)\| \le \varepsilon \|u\| \, \|v\| \quad \text{for } \|u\| < \delta \text{ and any } v \in X.$$

Then every term on the right-hand side of (3.2.14) is bounded by ε(‖h‖ + ‖k‖)² provided ‖h‖, ‖k‖ < δ. The same estimate holds for e₁(h, k) and similarly also for g₂′(t) − g₂′(0), e₂(k, h). By (3.2.13) we obtain

$$\|f''(a)(h, k) - f''(a)(k, h)\| \le \|A_1\| + \|A_2\| + \|e_1(h, k)\| + \|e_2(k, h)\| \le 8\varepsilon [\|h\| + \|k\|]^2 \tag{3.2.15}$$

provided ‖h‖, ‖k‖ < δ. Choose h₀, k₀ ∈ X and put

$$h = \alpha h_0, \qquad k = \alpha k_0.$$

For a sufficiently small α the estimate (3.2.15) holds. Because of the bilinearity of f″(a), dividing by α² we get

$$\|f''(a)(h_0, k_0) - f''(a)(k_0, h_0)\| \le 8\varepsilon [\|h_0\| + \|k_0\|]^2 \quad \text{for any } \varepsilon > 0.$$

This completes the proof.


Remark 3.2.29.
(i) It is not difficult to see that the existence of f″(a) implies the existence of D²f(a) and the equality f″(a)(h, k) = D²f(a)(h, k). It is also possible to prove that the continuity of D²f on an open set G ⊂ X (as a mapping from G into B₂(X, Y)) is equivalent to the continuity of f″ on G. In this case we write f ∈ C²(G).
(ii) If X = R^M, Y = R and D²f(a) exists for f : R^M → R, then it is sufficient to know the values

$$D^2 f(a)(e_i, e_j), \qquad i, j = 1, \ldots, M,$$

to determine D²f(a). This means that D²f(a) (and also f″(a)) can be represented by the matrix (the so-called Hess matrix)

$$\left( \frac{\partial^2 f}{\partial x_i \, \partial x_j}(a) \right).$$

Exercise 3.2.30. Let A ∈ L(X, Y) and B ∈ B₂(X, Y). Compute A′, B′ and B″!

Exercise 3.2.31. Let f : X → Y be injective on an open set G ⊂ X. Denote (f|_G)^{−1} = g. Suppose that f′(a) and g′(b) exist for an a ∈ G, f(a) = b. Is it true that g′(b) = [f′(a)]^{−1}? (For conditions which guarantee the existence of g′(b) see Section 4.1.)

Exercise 3.2.32. Put Φ(A) = A^{−1} for an invertible A ∈ L(X, Y) (here X, Y are Banach spaces). Show that

$$\Phi'(A)(H) = -A^{-1} H A^{-1}, \qquad H \in L(X, Y) \text{ for all } A \in \operatorname{Dom} \Phi.$$

Hint. Use the same method as in Exercise 2.1.33.

Exercise 3.2.33. Let X be either C[0, 1] or L^p(0, 1), 1 ≤ p < ∞. Compute δf(x; h), Df(x) and f′(x) for f(x) = ‖x‖, x ∈ X.

Exercise 3.2.34.
(i) Compute the duality mapping for the space L^p(Ω), 1 ≤ p < ∞.
(ii) Show that the duality mapping for the space C[0, 1] need not be single-valued.

Exercise 3.2.35. Let p > 1, Ω ⊂ R^N and let

$$f(u) = \frac{1}{p} \int_\Omega \|\nabla u(x)\|^p \, dx, \qquad g(u) = \frac{1}{p} \int_\Omega |u(x)|^p \, dx$$

be functionals defined on W^{1,p}(Ω) (here ‖∇u(x)‖ := (Σ_{i=1}^{N} (∂u(x)/∂x_i)²)^{1/2}). Prove that f and g are Fréchet differentiable at each u ∈ W^{1,p}(Ω), and

$$f'(u)v = \int_\Omega \|\nabla u(x)\|^{p-2} (\nabla u(x), \nabla v(x)) \, dx, \qquad g'(u)v = \int_\Omega |u(x)|^{p-2} u(x)v(x) \, dx.$$

Hint. Let

$$\varphi(t) = |t|^{p-2} t, \quad t \ne 0, \qquad \varphi(0) = 0.$$

Then ϕ is continuous and d/dt ((1/p)|t|^p) = ϕ(t), t ∈ R. Similarly, for y ∈ R^N, ‖y‖ = (Σ_{i=1}^{N} y_i²)^{1/2}, set

$$\psi(y) = \|y\|^{p-2} y, \quad y \ne o, \qquad \psi(o) = o.$$

Then ∇((1/p)‖y‖^p) = ψ(y) for all y ∈ R^N.

Exercise 3.2.36. Find conditions on k and f for the so-called Hammerstein operator

$$H\varphi(t) = \int_a^b k(t, s) f(s, \varphi(s)) \, ds$$

to map L²(a, b) into itself, and then differentiate H!

Exercise 3.2.37. Differentiate the following operators:

(i) $$F(\varphi) = \int_0^1 \left( \int_0^t |\varphi(s)|^2 \, ds \right)^3 dt, \qquad \varphi \in C[0, 1] \text{ or } \varphi \in L^2(0, 1);$$

(ii) $$F(\varphi)(t) = \left( \int_0^t \varphi(s) \, ds \right)^2$$

as F : L¹(0, 1) → L¹(0, 1), F : C[0, 1] → C[0, 1], F : C[0, 1] → C¹[0, 1].

Exercise 3.2.38. Let f : [0, 1] × R → R and

$$F(\varphi) = \int_0^1 f(t, \varphi(t)) \, dt.$$

Under which conditions on f do there exist D²F(ϕ) and F″(ϕ) if we consider

$$F : C[0, 1] \to \mathbb{R} \quad \text{and} \quad F : L^2(0, 1) \to \mathbb{R}?$$


Remark 3.2.39. The following assertion is due to I.V. Skrypnik:

If ∂²f/∂y² is continuous and bounded on (0, 1) × R, then F ∈ C²(L²(0, 1)) (F is defined as in Exercise 3.2.38) if and only if

$$f(t, y) = a(t) + b(t)y + c(t)y^2 \quad \text{where } a, b, c \in L^\infty(0, 1).$$

It is not too difficult to prove that by contradiction.

Exercise 3.2.40. Let f : [0, 1] × R × R → R and

$$F(\varphi) = \int_0^1 f(t, \varphi(t), \varphi'(t)) \, dt, \qquad \varphi \in C^1[0, 1].$$

Under which conditions does D²F(ϕ) exist?

Exercise 3.2.41. Suppose that Ω is a bounded open subset of R^N, and that the function f : Ω × R → R and its partial derivative ∂f/∂y are continuous on Ω × R (or both satisfy the Carathéodory property). Let p > 2 and let there exist constants a, b such that

$$\left| \frac{\partial f}{\partial y}(x, y) \right| \le a + b|y|^{p-2}, \qquad x \in \Omega, \ y \in \mathbb{R}.$$

If f(·, 0) ∈ L^{p′}(Ω) where p′ = p/(p − 1) (the conjugate exponent) and F(ϕ)(x) = f(x, ϕ(x)), show the following facts:
(i) F maps L^p(Ω) into L^{p′}(Ω).
Hint. Integrate ∂f/∂y and use the above estimate and Theorem 3.2.24.
(ii) δF(ϕ)h : x ↦ ∂f/∂y(x, ϕ(x))h(x) for all h ∈ L^p(Ω).
Hint. Proceed similarly to the main text. Use the Hölder inequality to show that F_y(ϕ)(x) := ∂f/∂y(x, ϕ(x)) maps L^p(Ω) into L^q(Ω), q = p/(p − 2).
(iii) The Fréchet derivative F′(ϕ) exists for all ϕ ∈ L^p(Ω).

Remark 3.2.42. If differentiability properties of F are also needed for p ≤ 2, one should replace L^p(Ω) by more sophisticated spaces like Besov or Triebel–Lizorkin ones. See, e.g., Runst & Sickel [115, Chapter 5].

3.2A Newton Method

The Contraction Principle offers a very effective method for solving nonlinear equations, either to prove the existence of a solution or to find it numerically. Since the speed of convergence is not always satisfactory, various modifications have appeared. One of these modifications is even much older than the Contraction Principle itself and goes back to I. Newton. The idea of this method can be seen from Figure 3.2.1 where the iterations for solving the equation f(x) = o are shown.

[Figure 3.2.1. Labels in the figure: the graph f(x), the tangent line f′(a)(x − a) + f(a), the points a, x₁, x₂, y₁ and the solution x̃.]

Suppose that we have found an approximate solution a. We wish to construct a correction ỹ such that f(a + ỹ) = o. By the Taylor expansion,

$$f(a) = f(a + \tilde{y}) - f'(a + \tilde{y})\tilde{y} + r(\tilde{y}) = -f'(a + \tilde{y})\tilde{y} + r(\tilde{y}),$$

i.e.,

$$\tilde{y} = -[f'(a + \tilde{y})]^{-1} [f(a) - r(\tilde{y})] = -[f'(a + \tilde{y})]^{-1} [f(a + \tilde{y}) - f'(a + \tilde{y})\tilde{y}] =: F(\tilde{y}) \tag{3.2.16}$$

provided [f′(a + ỹ)]^{−1} exists. The idea is to solve the equation

$$\tilde{y} = F(\tilde{y}) \tag{3.2.17}$$

in a certain closed ball around o by iterations

$$y_{n+1} = F(y_n), \qquad y_0 = o.$$

Denoting x_n = a + y_n we can rewrite these iterations in the form

$$x_{n+1} - a = -[f'(x_n)]^{-1} f(x_n) + x_n - a, \tag{3.2.18}$$

which are exactly the iterations from Figure 3.2.1. If the sequence of iterations {y_n}_{n=1}^∞ converges to ỹ, then f(a + ỹ) = o as follows from (3.2.16). Our goal is to show:

(A1) There is δ > 0 such that F maps B(o; δ) into itself and it is a contraction on this ball.


(A2) The convergence of {x_n}_{n=1}^∞ is faster than the convergence of iterations given by the Contraction Principle (cf. Theorem 2.3.1), actually there is a constant c such that

$$\|x_{n+1} - x_n\| \le c \|x_n - x_{n-1}\|^2.^{17} \tag{3.2.19}$$

We apparently need some assumptions to reach this goal. We assume that X is a Banach space, f : X → X, and, moreover:

(H1) There is a ball B(a; δ̂) such that f ∈ C¹(B(a; δ̂)) and f′ satisfies the Lipschitz condition on this ball: there exists L such that

$$\|f'(x) - f'(y)\|_{L(X)} \le L \|x - y\|_X \quad \text{for } x, y \in B(a; \hat{\delta}).$$

(H2) The value ‖f(a)‖ is sufficiently small.¹⁸
(H3) The derivative f′(a) has a continuous inverse [f′(a)]^{−1} ∈ L(X).

The proof of (A1), (A2) will be done in several steps. For the sake of simplicity we denote

$$A(x) := f'(x), \qquad A := f'(a), \qquad \alpha := \|[f'(a)]^{-1}\|.$$

Step 1. If δ < 1/(αL), δ ≤ δ̂, then A^{−1}(x) exists and

$$\|A^{-1}(x)\| \le \frac{\alpha}{1 - \alpha L \delta} \quad \text{for } x \in B(a; \delta).$$

Indeed, we can write

$$A(x) = A[I + A^{-1}(A(x) - A)].$$

Since ‖A(x) − A‖ ≤ L‖x − a‖, we get ‖A^{−1}(A(x) − A)‖ ≤ αL‖x − a‖ and A^{−1}(x) exists for x ∈ B(a; δ) (by Proposition 2.1.2), and

$$A^{-1}(x) = \sum_{n=0}^{\infty} (-1)^n [A^{-1}(A(x) - A)]^n A^{-1}.$$

The estimate for ‖A^{−1}(x)‖ follows.

Step 2. If w, x ∈ B(a; δ), then

$$\|A^{-1}(w) - A^{-1}(x)\| \le \left( \frac{\alpha}{1 - \alpha L \delta} \right)^2 L \|w - x\|.$$

This estimate follows from the identity A^{−1}(w) − A^{−1}(x) = A^{−1}(w)[A(x) − A(w)]A^{−1}(x) and Step 1.

¹⁷ Compare this quadratic estimate (which yields an exponential one for ‖x̃ − x_n‖) with an estimate from the Contraction Principle ‖x̃ − x_n‖ ≤ qⁿ‖x₁ − x₀‖ for a 0 < q < 1.

¹⁸ This assumption means that we actually need a good approximation of a solution of the equation f(x) = o (see Step 4 for the estimate of ‖f(a)‖).


Step 3. We have

$$\|r(y)\| \le \frac{L}{2} \delta^2 \quad \text{and} \quad \|r(y) - r(z)\| \le 3L\delta \|y - z\| \quad \text{for } y, z \in B(o; \delta)$$

where r(y) := f(a) − f(a + y) + A(a + y)y (see (3.2.16)). Indeed, by Theorem 3.2.6, we get

$$r(y) = \int_0^1 [A(a + y) - A(a + (1 - t)y)] y \, dt$$

and

$$\begin{aligned}
r(y) - r(z) &= f(a + z) - f(a + y) + A(a + y)y - A(a + z)z \\
&= \int_0^1 [A(a + y + t(z - y)) - A(a + y)](z - y) \, dt + [A(a + y) - A(a + z)]z.
\end{aligned}$$

The estimates now follow from (H1) and Step 2.

Step 4. The assertion (A1) holds. Indeed, we have

$$F(y) - F(z) = A^{-1}(a + y)[r(y) - r(z)] + [A^{-1}(a + z) - A^{-1}(a + y)][f(a) - r(z)].$$

From (H1) and Steps 1–3 we get

$$\|F(y) - F(z)\| \le c(\delta + \|f(a)\| L) \|y - z\|$$

with a c which is a bounded function of δ ∈ [0, δ₀] (δ₀ small enough). This means that we can choose δ and the estimate of ‖f(a)‖ in (H2) such that

$$\|F(y) - F(z)\| \le q \|y - z\|, \qquad y, z \in B(o; \delta) \quad \text{for a } q \in (0, 1).$$

Moreover,

$$\|F(y)\| \le \|F(y) - F(o)\| + \|F(o)\| \le q\delta + \alpha \|f(a)\| \le \delta,$$

provided ‖f(a)‖ is sufficiently small.

Step 5. We can now prove the assertion (A2). By (3.2.18) and Theorem 3.2.6,

$$\begin{aligned}
f(x_n) &= f(x_n) - f(x_{n-1}) - f'(x_{n-1})(x_n - x_{n-1}) \\
&= \int_0^1 [f'(x_{n-1} + t(x_n - x_{n-1})) - f'(x_{n-1})](x_n - x_{n-1}) \, dt.
\end{aligned}$$

Hence

$$\|f(x_n)\| \le \frac{L}{2} \|x_n - x_{n-1}\|^2,$$

and also

$$\|x_{n+1} - x_n\| \le \|A^{-1}(x_n)\| \, \|f(x_n)\| \le c \|x_n - x_{n-1}\|^2.$$
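The quadratic estimate (3.2.19) can be observed on a small example. The following Python sketch is an illustration only: the mapping f(x, y) = (x² + y² − 4, xy − 1) on X = R², the Euclidean norm and the starting point a = (2, 0.5) are hypothetical choices, not taken from the text. Each step solves the linear equation f′(x_n)(x_{n+1} − x_n) = −f(x_n) from (3.2.18) by Cramer's rule.

```python
import math

# Hypothetical test mapping f : R^2 -> R^2.
def f(x, y):
    return (x * x + y * y - 4.0, x * y - 1.0)

def newton_step(x, y):
    """One iteration of (3.2.18): solve f'(x_n) d = -f(x_n) and return x_n + d."""
    f1, f2 = f(x, y)
    a11, a12, a21, a22 = 2.0 * x, 2.0 * y, y, x  # entries of the Jacobian f'(x, y)
    det = a11 * a22 - a12 * a21
    dx = (-f1 * a22 + f2 * a12) / det
    dy = (f1 * a21 - f2 * a11) / det
    return x + dx, y + dy

point = (2.0, 0.5)  # the approximate solution a
steps = []          # the norms |x_{n+1} - x_n|
for _ in range(6):
    new_point = newton_step(*point)
    steps.append(math.hypot(new_point[0] - point[0], new_point[1] - point[1]))
    point = new_point

# Ratios |x_{n+1} - x_n| / |x_n - x_{n-1}|^2, skipping steps at rounding level.
ratios = [steps[n + 1] / steps[n] ** 2 for n in range(len(steps) - 1) if steps[n] > 1e-6]
residual = max(abs(v) for v in f(*point))

for n, s in enumerate(steps, start=1):
    print(f"step {n}: |x_n - x_(n-1)| = {s:.3e}")
print("ratios:", [f"{r:.3f}" for r in ratios])
print(f"residual max|f(x_n)| = {residual:.1e}")
```

The ratios stay bounded while the steps collapse, which is the behavior predicted by (3.2.19) with c of the order of ‖A^{−1}(x_n)‖L/2.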

Remark 3.2.43. The drawback of the iteration procedure (3.2.18) consists in the requirement to compute the inverse to the derivative at each step. This is the price for fast convergence. One can assume that by replacing [f′(x)]^{−1} by the fixed inverse [f′(a)]^{−1} we should avoid this disadvantage. This idea is also due to I. Newton. Conditions for convergence of these iterations were found by Kantorovich (see Kantorovich [72]).

Serious problems appear when the derivative f′(x) is injective but not continuously invertible. In applications, e.g., to nonlinear partial differential equations, we have many possibilities of the choice of Banach spaces X_α, Y_α such that f : X_α → Y_α (see, e.g., Example 1.2.25 and Example 2.1.29). It can happen that

$$[f'(x)]^{-1} \in L(Y_\alpha, X_\beta) \quad \text{where } X_\alpha \subset X_\beta.$$

This means that the equation f′(x_n)(x − x_n) = −f(x_n) (see (3.2.18)), which has to be solved to obtain the (n + 1)st iteration x_{n+1}, has a solution in a larger space X_β provided x_n ∈ X_α. Therefore the iterations belong to larger and larger spaces and, after a finite number of steps, there is no solution at all. This can be also expressed by an observation that x_{n+1} is less smooth than x_n, or that “derivatives are lost” during iterations. One way to overcome these difficulties consists in the approximation of [f′(x)]^{−1} by a “better” operator L(x) in the sense that ‖f′(x)L(x) − I‖_{L(Y_α)} is smaller and smaller when x approaches a solution of f(x) = o. Precise conditions under which new iterations

$$w_{n+1} = w_n - L(w_n) f(w_n)$$

converge to a solution can be found, e.g., in Moser [97, pp. 265–315 and 499–535]. A similar idea appeared earlier in Nash [98]. See also Remark 4.1.6 for a slightly different explanation.

Exercise 3.2.44. Let f ∈ C¹(R) be a convex real function.
(i) Using only the results of elementary calculus prove the convergence of Newton approximations under appropriate assumptions. Give the recurrence formula for

$$f(x) = x^2 - A, \qquad A > 0.$$

(ii) The same as in (i) for the Kantorovich approximations (Remark 3.2.43).
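To compare the two procedures on the function of this exercise, the following Python sketch may serve as a starting point. It is an illustration under assumptions: the values A = 2 and a = 1.5 and the number of iterations are hypothetical choices. The first loop uses the Newton iterations (3.2.18), the second one the iterations with the fixed inverse [f′(a)]^{−1} described in Remark 3.2.43.

```python
import math

A = 2.0                # f(x) = x^2 - A, so the solution is sqrt(A)
a = 1.5                # assumed starting approximation
root = math.sqrt(A)

def newton_errors(steps):
    """Errors |x_n - sqrt(A)| for x_{n+1} = x_n - f(x_n)/f'(x_n) = (x_n + A/x_n)/2."""
    x, out = a, []
    for _ in range(steps):
        x = 0.5 * (x + A / x)
        out.append(abs(x - root))
    return out

def fixed_inverse_errors(steps):
    """Errors for x_{n+1} = x_n - f(x_n)/f'(a): the derivative is frozen at a."""
    x, out = a, []
    for _ in range(steps):
        x = x - (x * x - A) / (2.0 * a)
        out.append(abs(x - root))
    return out

newton = newton_errors(5)
fixed = fixed_inverse_errors(5)
for n, (e_newton, e_fixed) in enumerate(zip(newton, fixed), start=1):
    print(f"n = {n}: Newton error {e_newton:.2e}, fixed-inverse error {e_fixed:.2e}")
```

Both sequences coincide after the first step, since the derivative is evaluated at a in both cases. Afterwards the Newton errors are roughly squared at each step, while the errors of the fixed-inverse iterations decrease only by a roughly constant factor.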

Chapter 4

Local Properties of Differentiable Mappings

4.1 Inverse Function Theorem

In this section we are looking for conditions which allow us to invert a map f : X → Y, especially f : R^M → R^N. The simple case of a linear operator f indicates that a reasonable assumption is that M = N. Let us start with the simplest case M = N = 1. The well-known theorem says that if f is continuous and strictly monotone on an open interval I, then f is injective and f(I) is an open interval J. Moreover, the inverse function f^{−1} is continuous on J. It is not clear how to generalize the monotonicity assumption to R^M (cf. Section 5.3), and without it the theorem is not true even in R. Since the monotonicity of a differentiable function f : R → R is a consequence of the sign of the derivative of f, we take into consideration also f′. The example f(x) = x² where f is not injective in any neighborhood of the origin shows that we have to assume f′(x) ≠ 0. In fact, if f is continuous on an open interval I, f′(x) exists (possibly infinite) at all points of I, and f′ does not vanish at any point of I, then f is injective (actually strictly monotone since f′ is either strictly positive or strictly negative in I), and f^{−1} is continuous and differentiable on the open interval f(I).

Therefore, we are looking for a generalization of the assumption f′(x) ≠ 0 for maps f : R^M → R^M. Since we are interested in a (unique) solution of the equation f(x) = y, the case of a linear function f : R^M → R^M (then f′(x) = f) suggests assuming that f′(x) is either an injective or, equivalently because of the finite dimension, a surjective linear map. In both cases, f′(x) is an isomorphism of R^M onto R^M (for the case of Banach spaces see Theorem 2.1.8).


However, there is still one more problem: Let
$$f(r, \vartheta) = (r\cos\vartheta, r\sin\vartheta), \quad (r, \vartheta) \in (0, \infty) \times \mathbb{R}, \qquad \text{or} \qquad g(z) = e^z, \quad z \in \mathbb{C}.$$
Both functions are infinitely many times differentiable on their domains,
$$\det f'(r, \vartheta) = r \neq 0, \qquad g'(z) \neq 0,$$
and $f(r, \cdot)$ is $2\pi$-periodic and $g$ is $2\pi i$-periodic, i.e., $f$ and $g$ are not injective. Therefore, we cannot expect more than local invertibility. The philosophy behind this is simple: since the notion of derivative is a local one, we can deduce only local information from it.

After these preliminary considerations we can state the main theorem. Since there is no simplification in the case of finite dimension, we formulate it for general Banach spaces.

Theorem 4.1.1 (Local Inverse Function Theorem). Let $X$, $Y$ be Banach spaces, $G$ an open set in $X$, $f\colon X \to Y$ continuously differentiable on $G$. Let the derivative $f'(a)$ be an isomorphism of $X$ onto $Y$ for $a \in G$. Then there exist neighborhoods $U$ of $a$, $V$ of $f(a)$ such that $f$ is injective on $U$, $f(U) = V$. If $g$ denotes the inverse to the restriction $f|_U$, then $g \in C^1(V)$.

Proof. We will solve the equation $f(x) = y$ for a fixed $y$ near the point $b = f(a)$ by an iteration process. To do that we have to rewrite the equation $f(x) = y$ as an equation in $X$. We denote by $A$ the inverse map $[f'(a)]^{-1} \in \mathcal{L}(Y, X)$. Then
$$f(x) = y \iff F_y(x) := x - A[f(x) - y] = x. \tag{4.1.1}$$

The simplest condition for the convergence of iterations is given by the Contraction Principle (see Theorem 2.3.1). We have
$$\|F_y(x_1) - F_y(x_2)\| = \|x_1 - x_2 - A[f(x_1) - f(x_2)]\| \leq \|A\| \, \|f(x_2) - f(x_1) - f'(a)(x_2 - x_1)\| \leq \|A\| \sup_{\xi \in B(a;r)} \|f'(\xi) - f'(a)\| \, \|x_1 - x_2\|,$$
$x_1, x_2 \in B(a; r)$. (Here we have used the Mean Value Theorem (see formula (3.2.3)).) In other words, we can choose $r > 0$ so small that
$$\|F_y(x_1) - F_y(x_2)\| \leq \tfrac{1}{2} \|x_1 - x_2\| \tag{4.1.2}$$
for $x_1, x_2 \in B(a; r) \subset G$, $y \in Y$. Further,
$$\|F_y(x) - a\| = \|F_y(x) - F_y(a) + F_y(a) - a\| \leq \tfrac{1}{2} \|x - a\| + \|A\| \, \|b - y\|.$$
If $\delta > 0$ is such that $\|A\| \delta \leq \frac{r}{2}$, then $F_y(x) \in B(a; r)$ provided $x \in B(a; r)$, $y \in B(b; \delta)$. By the Contraction Principle, the equation (4.1.1) has a unique solution in $B(a; r)$,
$$x := g(y) \in B(a; r) \quad \text{for any } y \in B(b; \delta).$$

Moreover, if $g(y_i) = x_i$, $i = 1, 2$, then
$$\|g(y_1) - g(y_2)\| = \|F_{y_1}(x_1) - F_{y_2}(x_2)\| \leq \|F_{y_1}(x_1) - F_{y_1}(x_2)\| + \|F_{y_1}(x_2) - F_{y_2}(x_2)\| \leq \tfrac{1}{2} \|x_1 - x_2\| + \|A\| \, \|y_1 - y_2\|,$$
i.e.,
$$\|g(y_1) - g(y_2)\| \leq 2 \|A\| \, \|y_1 - y_2\|. \tag{4.1.3}$$
In particular, $g$ is a Lipschitz continuous map on $B(b; \delta)$.

To prove the differentiability of $g$, fix a $y \in B(b; \delta)$ and choose $\hat\delta > 0$ so small that $B(y; \hat\delta) \subset B(b; \delta)$. A candidate for $g'(y)$ is the inverse $C(x) := [f'(x)]^{-1}$ for $x = g(y)$. By (4.1.3), $x \in B(a; r)$ and
$$\|f'(x) - f'(a)\| \leq \frac{1}{2 \|[f'(a)]^{-1}\|}.$$
This means that $C(x)$ exists and $C(x) \in \mathcal{L}(Y, X)$ (cf. Exercise 2.1.33). So we wish to estimate the expression
$$\alpha(k) := g(y + k) - g(y) - C(x)k \quad \text{for } k \in Y, \ \|k\| < \hat\delta.$$
Put
$$h = g(y + k) - g(y), \quad \text{i.e.,} \quad k = f(x + h) - f(x).$$
We have
$$\alpha(k) = h - C(x)k = -C(x)[f(x + h) - f(x) - f'(x)h].$$
By the definition of the Fréchet derivative, for any $\varepsilon > 0$ there is $\eta > 0$ such that
$$\|f(x + h) - f(x) - f'(x)h\| \leq \varepsilon \|h\| \quad \text{provided } \|h\| < \eta.$$
But (see (4.1.3)) $\|h\| \leq 2\|A\| \, \|k\|$. This means that
$$\|\alpha(k)\| = o(\|k\|), \quad \text{i.e.,} \quad g'(y) = C(x) = [f'(x)]^{-1}.$$
This also implies the continuity of $g'(y)$ since the inverse $[f'(x)]^{-1}$ depends continuously on $x$ (see Exercise 2.1.33). To complete the proof it remains to put
$$V = B(b; \delta) \quad \text{and} \quad U = f^{-1}(V) \cap B(a; r).$$
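The iteration behind (4.1.1) can be tried numerically. The sketch below applies $x \mapsto F_y(x) = x - A[f(x) - y]$ with the frozen inverse $A = [f'(a)]^{-1}$ to a hypothetical map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ chosen only for illustration; the map, the point $a = (0, 0)$, the right-hand side $y$ and the step count are all assumptions and not taken from the text.

```python
import math

def f(x):
    """Hypothetical smooth map R^2 -> R^2 with f(0, 0) = (0, 0)."""
    x1, x2 = x
    return (2.0 * x1 + x2 ** 2, x1 + 3.0 * x2 + x1 ** 2)

# A = [f'(a)]^{-1} at a = (0, 0), where f'(a) = [[2, 0], [1, 3]].
A = ((0.5, 0.0), (-1.0 / 6.0, 1.0 / 3.0))

def F(x, y):
    """One step of F_y(x) = x - A [f(x) - y]; A is never updated."""
    fx = f(x)
    r1, r2 = fx[0] - y[0], fx[1] - y[1]
    return (x[0] - (A[0][0] * r1 + A[0][1] * r2),
            x[1] - (A[1][0] * r1 + A[1][1] * r2))

def local_inverse(y, steps=60):
    """Approximate g(y) by iterating F_y from the base point a = (0, 0)."""
    x = (0.0, 0.0)
    for _ in range(steps):
        x = F(x, y)
    return x

y = (0.1, 0.1)
x = local_inverse(y)
residual = math.hypot(f(x)[0] - y[0], f(x)[1] - y[1])
```

Unlike the Newton method, the derivative is inverted only once, at $a$; this is why the argument needs $y$ close to $b = f(a)$ and yields only linear convergence with contraction constant at most $\tfrac12$.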

Corollary 4.1.2. Let $X$, $Y$ be Banach spaces, $G$ an open subset of $X$, $f \in C^1(G, Y)$. If $f'(x)$ is an isomorphism of $X$ onto $Y$ for all $x \in G$, then $f(G)$ is an open subset of $Y$.

Proof. Use the definition of an open set and Theorem 4.1.1.

Example 4.1.3. If $f \in C^k(G)$, $k \in \mathbb{N}$, then $g \in C^k(V)$. This follows easily from the formula
$$g'(y) = [f'(x)]^{-1}, \quad x = g(y),$$
the Chain Rule and Exercise 3.2.32.

Definition 4.1.4. Let $X$, $Y$ be Banach spaces. Then $f\colon X \to Y$ is called a diffeomorphism of $G \subset X$ (or a diffeomorphism of $G$ onto $H = f(G)$) if the following conditions are satisfied:

(1) $G$ is an open set in $X$, $f \in C^1(G)$,

(2) $f(G) = H$ is an open set in $Y$,

(3) $f$ is injective on $G$ and the inverse $g = (f|_G)^{-1}$ belongs to $C^1(H)$.

If, moreover, $f \in C^k(G)$ for some $k \in \mathbb{N}$, and (therefore) $g \in C^k(H)$, then $f$ is called a $C^k$-diffeomorphism.

A diffeomorphism in $\mathbb{R}^M$ can be viewed as a nonlinear generalization of a linear invertible operator $A\colon \mathbb{R}^M \to \mathbb{R}^M$. Such $A$ yields a linear transformation of coordinates $y = Ax$. If $\varphi$ is a diffeomorphism of $G$ onto $H$ and $a \in G$, we can suppose without loss of generality that $\varphi(a) = o$ (if this is not true consider a new diffeomorphism on $G$: $\tilde\varphi(x) = \varphi(x) - \varphi(a)$). Then the Cartesian coordinates $y_1, \dots, y_M$ of $y = \varphi(x)$ can be taken as (generalized or nonlinear or non-Cartesian) coordinates of a point $x$ in the neighborhood $G$ of $a$. Such coordinates play an important role in problems where we have to work on non-flat domains (e.g., on nonlinear manifolds, see Appendix 4.3A). Notice that we can also interpret Theorem 4.1.1 in the finite dimensional case as follows: The Cartesian coordinates $(y_1, \dots, y_M)$ of the point $y = f(x)$ are nonlinear coordinates of the point $x$. In these nonlinear coordinates the diffeomorphism $f$ of $U$ is equal to the identity map.

Example 4.1.5. Standard examples of nonlinear coordinates:

(i) Polar coordinates in $\mathbb{R}^2$:
$$x = r\cos\varphi, \quad y = r\sin\varphi$$
($\psi(r, \varphi) = (x, y)$ is a diffeomorphism of $(0, \infty) \times (\alpha, \alpha + 2\pi)$ onto $\mathbb{R}^2$ without a half line);

(ii) Spherical coordinates in $\mathbb{R}^3$:
$$x = r\cos\varphi_1\cos\varphi_2, \quad y = r\sin\varphi_1\cos\varphi_2, \quad z = r\sin\varphi_2;$$

(iii) Spherical coordinates in $\mathbb{R}^M$:
$$\begin{aligned}
x_1 &= r\cos\varphi_1\cos\varphi_2\cdots\cos\varphi_{M-1},\\
x_2 &= r\sin\varphi_1\cos\varphi_2\cdots\cos\varphi_{M-1},\\
x_3 &= r\sin\varphi_2\cos\varphi_3\cdots\cos\varphi_{M-1},\\
&\ \ \vdots\\
x_{M-1} &= r\sin\varphi_{M-2}\cos\varphi_{M-1},\\
x_M &= r\sin\varphi_{M-1}.
\end{aligned}$$

Before using the Local Inverse Function Theorem we have to check that the functions
$$\psi_i(r, \varphi_1, \dots, \varphi_{M-1}) = x_i, \quad i = 1, \dots, M,$$
have continuous partial derivatives (obvious) and their Jacobi matrix is regular. Equivalently, the determinant $J_\psi$ of the Jacobi matrix is nonzero at a point $(\tilde r, \tilde\varphi_1, \dots, \tilde\varphi_{M-1})$, $\psi(\tilde r, \tilde\varphi_1, \dots, \tilde\varphi_{M-1}) = a$. Here
$$J_\psi = r^{M-1} \prod_{k=1}^{M-2} \cos^k \varphi_{k+1}, \quad M \geq 2.¹$$
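The formula for $J_\psi$ can be checked numerically in the case $M = 3$, where it reads $J_\psi = r^2\cos\varphi_2$. The sketch below approximates the Jacobi matrix of the spherical coordinates from (ii) by central differences and evaluates its determinant; the helper names, the sample point and the step size `h` are illustrative assumptions.

```python
import math

def psi(r, p1, p2):
    """Spherical coordinates in R^3 as in Example 4.1.5(ii)."""
    return (r * math.cos(p1) * math.cos(p2),
            r * math.sin(p1) * math.cos(p2),
            r * math.sin(p2))

def jacobian_det(r, p1, p2, h=1e-6):
    """Determinant of the Jacobi matrix of psi via central differences."""
    point = [r, p1, p2]
    rows = []
    for j in range(3):
        up = list(point)
        dn = list(point)
        up[j] += h
        dn[j] -= h
        fu, fd = psi(*up), psi(*dn)
        # rows[j][i] approximates d psi_i / d (variable j); the
        # determinant of the transpose equals that of the Jacobi matrix
        rows.append([(fu[i] - fd[i]) / (2 * h) for i in range(3)])
    a, b, c = rows
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

r, p1, p2 = 2.0, 0.7, 0.3
numeric = jacobian_det(r, p1, p2)
formula = r ** 2 * math.cos(p2)
```

The determinant vanishes exactly where $\cos\varphi_2 = 0$, i.e., at the poles, which is where the Local Inverse Function Theorem cannot be applied to $\psi$.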

Remark 4.1.6. The following question concerning the assumptions of Theorem 4.1.1 naturally arises: "What happens if $f'(a)$ is not an isomorphism?"

In the case of finite dimension, $f'(a)$ cannot be an isomorphism for $f\colon \mathbb{R}^M \to \mathbb{R}^N$ whenever $M \neq N$. If $M > N$, i.e., the number of equations is smaller than the number of variables, then we can expect (we recommend considering the case of a linear $f$) to compute some of the variables. The simplest case is solved in the next Section 4.2 (the Implicit Function Theorem). If $M < N$, then $f(G)$ will probably be a "thin" subset of $\mathbb{R}^N$. This case leads to the notion of a (differentiable) manifold (see the first part of Section 4.3 (Definition 4.3.4) and Appendix 4.3A).

If both $X$ and $Y$ have infinite dimension, it can occur that $f'(a)$ is injective but $\operatorname{Im} f'(a)$ is a dense subset of $Y$, different from $Y$. In this case, $A = [f'(a)]^{-1}$ exists but it is not continuous into $X$. We can also explain this situation as follows. If there is a constant $c > 0$ such that
$$\|f'(a)h\|_Y \geq c\|h\|_X \quad \text{for all } h \in X, \tag{4.1.4}$$
then $f'(a)$ is injective, $Y_1 := \operatorname{Im} f'(a)$ is a closed subspace of $Y$. Moreover, if we know that $Y_1$ is dense in $Y$, then $Y_1 = Y$ and Theorem 4.1.1 can be applied. But sometimes we are able to prove only a weaker estimate, namely that there is a constant $c > 0$ such that
$$\|f'(a)h\|_Y \geq c\|h\|_{\tilde X} \quad \text{for all } h \in X$$
where $\|\cdot\|_{\tilde X}$ is a weaker norm than $\|\cdot\|_X$. By this we mean that only the estimate
$$\|h\|_{\tilde X} \leq d\|h\|_X \quad \text{holds for all } h \in X$$
(e.g., $X = C^1[0, 1]$, $\|h\|_X = \sup_{t\in[0,1]} |h(t)| + \sup_{t\in[0,1]} |h'(t)|$, $\|h\|_{\tilde X} = \sup_{t\in[0,1]} |h(t)|$). Then $A := [f'(a)]^{-1}$ maps $Y$ continuously into the completion of $X$ with respect to the norm $\|\cdot\|_{\tilde X}$ (remember that we need complete spaces for the Contraction Principle). In the above example "one derivative is lost" in an iteration. An idea how to overcome this problem is to use an approximation of $A$ and a more rapid iteration process (e.g., the Newton iteration, see Appendix 3.2A) to compensate errors in the approximations of $A$ (results of this type are the so-called "Hard" Local Inverse/Implicit Function Theorems); see, e.g., Deimling [34], Hamilton [65], Moser [97], Nash [98] or Nirenberg [100].

¹ We use the notation $\prod_{j=1}^{p} a_j := a_1 \cdots a_p$ ($\prod_{\emptyset} := 1$).

We now turn to a global version of the Inverse Function Theorem.

Theorem 4.1.7 (Global Inverse Function Theorem). Let $X$, $Y$ be Banach spaces and let $f\colon X \to Y$ be continuously differentiable on $X$. Suppose that $f'(x)$ is continuously invertible for all $x \in X$ and there is a constant $c > 0$ such that
$$\|[f'(x)]^{-1}\|_{\mathcal{L}(Y,X)} \leq c \quad \text{for all } x \in X.$$

Then $f$ is a diffeomorphism of $X$ onto $Y$.

Proof. It is sufficient to prove that $f$ is injective and surjective. The statement on the diffeomorphism then follows from Theorem 4.1.1. Fix an $a \in X$ and denote $b = f(a)$.

Step 1. The map $f$ is surjective, i.e., $f(X) = Y$. To see this choose $y \in Y$ and put
$$\varphi(t) = (1 - t)b + ty, \quad t \in [0, 1].$$
We wish to show that there is a curve $\psi\colon [0, 1] \to X$ such that
$$f(\psi(t)) = \varphi(t), \quad \text{in particular,} \quad y = \varphi(1) = f(\psi(1)).$$
Since $f$ is locally invertible at $a \in X$ (Theorem 4.1.1), there is a neighborhood $U$ of $a$ and $\delta > 0$ such that $\psi(t) = (f|_U)^{-1}\varphi(t)$ is well defined for $t \in [0, \delta)$ and $\psi \in C^1([0, \delta), X)$. Let
$$\mathcal{A} := \{\tau \in [0, 1] : \exists\, \omega \in C^1([0, \tau], X), \ f(\omega(t)) = \varphi(t), \ t \in [0, \tau]\}, \tag{4.1.5}$$
and $\alpha = \sup \mathcal{A}$. Notice that $\omega$ is uniquely determined by (4.1.5) (this follows from the local invertibility of $f$), and therefore there is $\omega \in C^1([0, \alpha), X)$ such that $f(\omega(t)) = \varphi(t)$, $t \in [0, \alpha)$. Since we have
$$\|\omega(t_1) - \omega(t_2)\| \leq \sup_{t \in [t_1, t_2]} \|\omega'(t)\| \, |t_1 - t_2| \leq c\|y - b\| \, |t_1 - t_2|$$
for all $t_1, t_2 \in [0, \alpha)$, the mapping $\omega$ is uniformly continuous on the interval $[0, \alpha)$, hence
$$\lim_{t \to \alpha-} \omega(t) =: \omega(\alpha)$$
exists ($X$ is a complete space) and the equality (4.1.5) holds for all $t \in [0, \alpha]$. Now we are ready to prove that $\alpha = 1$. Indeed, if $\alpha < 1$, then we can apply Theorem 4.1.1 at the point $\omega(\alpha)$ to obtain a contradiction with the definition of $\alpha$.

Step 2. The map $f$ is injective. Suppose by contradiction that there are different $x_1, x_2 \in X$ for which $f(x_1) = f(x_2)$. Put
$$y := f(x_2), \quad \psi_i(t) = (1 - t)a + tx_i, \quad \varphi_i(t) = f(\psi_i(t)), \quad t \in [0, 1], \ i = 1, 2.$$
By a slight modification of the above procedure it is possible to prove the existence of a mapping $G\colon [0, 1] \times [0, 1] \to X$ such that
$$f(G(t, s)) = (1 - s)\varphi_1(t) + s\varphi_2(t), \quad (t, s) \in [0, 1] \times [0, 1].$$
Then
$$f(G(1, s)) = (1 - s)f(x_1) + sf(x_2) = y \quad \text{for every } s \in [0, 1].$$
This contradicts the local invertibility of $f$ at $x_1$ ($\neq x_2$).
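Step 1 of the proof is constructive in spirit: the curve $\omega$ lifts the segment $\varphi(t) = (1 - t)b + ty$ through $f$, and it satisfies $\omega'(t) = [f'(\omega(t))]^{-1}(y - b)$. The following sketch follows this curve numerically for a hypothetical scalar map with $|[f'(x)]^{-1}| \leq 2$ on all of $\mathbb{R}$. The map, the Euler predictor, the Newton corrector and all step counts are illustrative assumptions and not part of the proof.

```python
import math

def f(x):
    """Hypothetical map R -> R with f'(x) >= 1/2 everywhere."""
    return x + 0.5 * math.sin(x)

def df(x):
    return 1.0 + 0.5 * math.cos(x)

def lift_segment(a, y, steps=200, newton_iters=3):
    """Follow omega with f(omega(t)) = (1 - t) f(a) + t y from t = 0 to 1."""
    b = f(a)
    omega = a
    for k in range(1, steps + 1):
        t = k / steps
        target = (1.0 - t) * b + t * y           # phi(t)
        # Euler predictor for omega'(t) = (y - b) / f'(omega(t))
        omega += (y - b) / df(omega) / steps
        # Newton corrector: pull omega back onto f(omega) = phi(t)
        for _ in range(newton_iters):
            omega -= (f(omega) - target) / df(omega)
    return omega

x = lift_segment(0.0, 25.0)
```

The uniform bound on $[f'(x)]^{-1}$ is what keeps the speed of $\omega$ bounded by $c\|y - b\|$, so the curve cannot escape to infinity before $t = 1$.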

Exercise 4.1.8. A complex function $f\colon \mathbb{C} \to \mathbb{C}$ is called holomorphic in an open set $G \subset \mathbb{C}$ if $f'(z)$ exists for every $z \in G$. If $f'(a) \neq 0$ for an $a \in G$, then $f$ is locally invertible (Theorem 4.1.1). Prove that $(f|_U)^{-1}$ is holomorphic and apply this result to $f(z) = e^z$ to obtain a power series expression of a continuous branch of the "multivalued function" $\log$. (For the "complex function" proof see, e.g., Rudin [113, Theorem 10.30].)

Exercise 4.1.9. Let
$$f(x) = x + 2x^2 \sin\frac{1}{x}, \quad x \neq 0, \qquad f(0) = 0.$$
Show that $f$ is not injective on any neighborhood of zero. Which assumption of Theorem 4.1.1 is not satisfied?

Hint. If $U$ is a neighborhood of $0$, show that $f'(x) = 0$ has a solution in $U$ and $f''(x) \neq 0$ at any such solution. Hence $f$ is not injective on $U$. Note also that $f'$ is not continuous at $0$.
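The hint to Exercise 4.1.9 can be made tangible with a few evaluations of $f'(x) = 1 + 4x\sin\frac1x - 2\cos\frac1x$ for $x \neq 0$. The sketch below samples $f'$ at the hypothetical points $x = 1/(2\pi k)$ and $x = 1/((2k+1)\pi)$, which are chosen here only because $\cos\frac1x = \pm 1$ there; it illustrates the sign changes and does not replace the argument.

```python
import math

def fprime(x):
    """Derivative of f(x) = x + 2 x^2 sin(1/x) for x != 0."""
    return 1.0 + 4.0 * x * math.sin(1.0 / x) - 2.0 * math.cos(1.0 / x)

samples = []
for k in (10, 100, 1000):
    x_neg = 1.0 / (2.0 * math.pi * k)            # cos(1/x) = 1, f' near -1
    x_pos = 1.0 / ((2.0 * k + 1.0) * math.pi)    # cos(1/x) = -1, f' near 3
    samples.append((fprime(x_neg), fprime(x_pos)))
```

Since $f'(0) = 1$ while $f'$ takes values near $-1$ arbitrarily close to the origin, $f'$ cannot be continuous at $0$, and each sign change of $f'$ marks a local extremum of $f$.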


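Exercise 4.1.10. Find the form of the Laplace operator
$$\Delta u := \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$
in the polar coordinates in the set
$$G = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 > 0\} \quad \text{for } u \in C^2(G).$$

Hint. If $v(r, \varphi) = u(r\cos\varphi, r\sin\varphi)$, then
$$(\Delta u) \circ \Phi = \frac{\partial^2 v}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 v}{\partial \varphi^2} + \frac{1}{r}\frac{\partial v}{\partial r}$$
where
$$\Phi(r, \varphi) = (r\cos\varphi, r\sin\varphi)$$
is the transformation. Note that we have
$$\left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right) = \left(\frac{\partial v}{\partial r}, \frac{\partial v}{\partial \varphi}\right) \circ (\Phi')^{-1}.$$
It is more comfortable to use this formula once again to compute $\frac{\partial^2 u}{\partial x^2}$, $\frac{\partial^2 u}{\partial y^2}$.

Exercise 4.1.11. Show that the estimate
$$\|[f'(x)]^{-1}\|_{\mathcal{L}(Y,X)} \leq c + d\|x\|_X$$
is sufficient in Theorem 4.1.7.

Hint. Use the Gronwall Lemma (Exercise 5.1.16) to estimate $\omega'(t)$.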
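The polar form of the Laplace operator stated in the hint to Exercise 4.1.10 can be sanity-checked on a test function. The sketch below uses the hypothetical choice $u(x, y) = x^2 y$, for which $\Delta u = 2y$, and approximates the right-hand side of the hint by central differences; the test function, the sample point and the step `h` are assumptions made for illustration.

```python
import math

def v(r, phi):
    """u(x, y) = x^2 * y written in polar coordinates."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    return x * x * y

def polar_laplacian(r, phi, h=1e-4):
    """v_rr + v_phiphi / r^2 + v_r / r via central differences."""
    v_rr = (v(r + h, phi) - 2 * v(r, phi) + v(r - h, phi)) / h ** 2
    v_pp = (v(r, phi + h) - 2 * v(r, phi) + v(r, phi - h)) / h ** 2
    v_r = (v(r + h, phi) - v(r - h, phi)) / (2 * h)
    return v_rr + v_pp / r ** 2 + v_r / r

r, phi = 1.5, 0.8
numeric = polar_laplacian(r, phi)
exact = 2.0 * r * math.sin(phi)     # Delta u = 2 y evaluated at Phi(r, phi)
```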

4.2 Implicit Function Theorem

Let us start with a simple example of $f\colon \mathbb{R}^2 \to \mathbb{R}$, e.g., $f(x, y) := x^2 + y^2 - 1$. Denote
$$M = \{(x, y) \in \mathbb{R}^2 : f(x, y) = 0\},$$
i.e., $M$ is the unit circle in $\mathbb{R}^2$. We would like to solve the equation $f(x, y) = 0$ for the unknown variable $y$ or to express $M$ as the graph of a certain function $y = \varphi(x)$. We immediately see that for any $x \in (-1, 1)$ there is a pair of $y$'s ($y_{1,2} = \pm\sqrt{1 - x^2}$) such that $(x, y) \in M$. In particular, $M$ is not a graph of any function $y = \varphi(x)$. We can only obtain that $M$ is locally a graph, i.e., for $(a, b) \in M$, $a \in (-1, 1)$, there is a neighborhood $U$ of $(a, b)$ such that $M \cap U$ is the graph of a function $y = \varphi(x)$. On the other hand, for $x = \pm 1$ there is a unique $y$ ($y = 0$) for which $(x, y) \in M$. But there is no neighborhood $U$ of $(1, 0)$ such that $M \cap U$ is the graph of a function $y = \varphi(x)$. What is the difference between these two cases?

In the former case the tangent line to $M \cap U$ exists at the point $(a, b)$ with the slope $\varphi'(a)$. Since
$$f(x, \varphi(x)) = 0 \quad \text{for } x \in (a - \delta, a + \delta),$$
we have (formally by the Chain Rule)
$$\frac{\partial f}{\partial x}(a, b) + \frac{\partial f}{\partial y}(a, b)\varphi'(a) = 0, \tag{4.2.1}$$
i.e., $\varphi'(a) = -\frac{a}{b}$, since $\frac{\partial f}{\partial y}(a, b) = 2b \neq 0$.

In the latter case, where $(a, b) = (1, 0)$, we have $\frac{\partial f}{\partial y}(1, 0) = 0$, and $\varphi'(1)$ cannot be determined from (4.2.1). The tangent line to $M$ at the point $(1, 0)$ is parallel to the $y$-axis, which indicates some problems with determining a solution, i.e., the (implicit) function $\varphi$. The reader is invited to sketch a figure.

This discussion shows the importance of the assumption
$$\frac{\partial f}{\partial y}(a, b) \neq 0.$$
How can this assumption be generalized to $f\colon \mathbb{R}^{M+N} \to \mathbb{R}^N$? A brief inspection of the linear case leads to the observation that we can compute the unknowns $y_{M+1}, \dots, y_{M+N}$ from the equations
$$f_i(y_1, \dots, y_{M+N}) = \sum_{j=1}^{M+N} a_{ij} y_j = 0, \quad i = 1, \dots, N,$$
uniquely as functions of $y_1, \dots, y_M$ if and only if
$$\det (a_{ij})_{\substack{i=1,\dots,N \\ j=M+1,\dots,M+N}} \neq 0.$$
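Returning to the unit circle at the start of this section, the local graph $y = \varphi(x)$ near a point $(a, b) \in M$ with $b \neq 0$ can be computed by the simple iteration $y \mapsto y - f(x, y) / \frac{\partial f}{\partial y}(a, b)$, and the slope formula $\varphi'(a) = -a/b$ from (4.2.1) can be compared with a difference quotient. The point $(a, b) = (0.6, 0.8)$, the function names and the step counts below are illustrative assumptions.

```python
import math

def f(x, y):
    return x * x + y * y - 1.0

def implicit_y(x, a, b, steps=100):
    """Solve f(x, y) = 0 for y near b by y <- y - f(x, y) / f_y(a, b)."""
    fy_ab = 2.0 * b              # partial derivative of f in y at (a, b)
    y = b
    for _ in range(steps):
        y -= f(x, y) / fy_ab
    return y

a, b = 0.6, 0.8
h = 1e-6
slope = (implicit_y(a + h, a, b) - implicit_y(a - h, a, b)) / (2 * h)
y_half = implicit_y(0.5, a, b)
```

At $(a, b) = (1, 0)$ the divisor `fy_ab` vanishes and the iteration is not even defined, which mirrors the failure of the assumption $\frac{\partial f}{\partial y}(a, b) \neq 0$.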

Nevertheless, $a_{ij} = \frac{\partial f_i}{\partial y_j}$, and the condition on the regularity of the matrix $(a_{ij})_{i=1,\dots,N;\ j=M+1,\dots,M+N}$ means that the partial (Fréchet) derivative of $f$ (see Definition 3.2.17) with respect to the last $N$ variables is an isomorphism of $\mathbb{R}^N$.

Theorem 4.2.1 (Implicit Function Theorem). Let $X$, $Y$, $Z$ be Banach spaces, $f\colon X \times Y \to Z$. Let $(a, b) \in X \times Y$ be such a point that $f(a, b) = o$. Let $G$ be an open set in $X \times Y$ containing the point $(a, b)$. Let $f \in C^1(G)$ and let the partial Fréchet derivative $f_2'(a, b)$ be an isomorphism of $Y$ onto $Z$. Then there are neighborhoods $U$ of $a$ and $V$ of $b$ such that for any $x \in U$ there exists a unique $y \in V$ for which $f(x, y) = o$. Denote this $y$ by $\varphi(x)$. Then $\varphi \in C^1(U)$. Moreover, if $f \in C^k(G)$, $k \in \mathbb{N}$, then $\varphi \in C^k(U)$.

Proof. We denote $A := [f_2'(a, b)]^{-1}$ and define
$$F(x, y) = (x, Af(x, y)), \quad (x, y) \in G.$$
Then $F\colon X \times Y \to X \times Y$, $F \in C^1(G)$ and
$$F'(a, b)(h, k) = (h, Af'(a, b)(h, k)).$$
One can verify that $F'(a, b)$ is an isomorphism of $X \times Y$ onto itself. Hence we can apply Theorem 4.1.1 to get neighborhoods $U \times V$ of $(a, b)$ and $\tilde U \times \tilde V$ of $(a, o)$ such that for any $\xi \in \tilde U$ and $\eta = o \in \tilde V$ there exists a unique $(x, y) \in U \times V$ such that
$$F(x, y) = (x, Af(x, y)) = (\xi, o), \quad \text{i.e.,} \quad x = \xi, \quad U = \tilde U,$$
and, denoting $y = \varphi(x)$, $f(x, \varphi(x)) = o$. This means that
$$F^{-1}(x, o) = (x, \varphi(x)).$$

Since the inverse $F^{-1}$ is differentiable, by Theorem 4.1.1 we conclude that $\varphi \in C^1(U)$.

Remark 4.2.2. We can also deduce a formula for $\varphi'(x)$: Indeed, since
$$f(x, \varphi(x)) = o \quad \text{for every } x \in U$$
and both the functions $f$ and $\varphi$ are differentiable, we get from the Chain Rule
$$f_1'(x, \varphi(x)) + f_2'(x, \varphi(x)) \circ \varphi'(x) = o,$$
and therefore
$$\varphi'(x) = -[f_2'(x, \varphi(x))]^{-1} \circ f_1'(x, \varphi(x)) \quad \text{for } x \in U_1 \tag{4.2.2}$$
where $U_1 \subset U$ may be smaller if necessary in order to guarantee the existence of the inverse $[f_2'(x, \varphi(x))]^{-1}$ (see Exercise 2.1.33).

Remark 4.2.3. The statement of Theorem 4.2.1 is by no means the best one. If we had used the Contraction Principle directly, we would have obtained the existence of a solution $y = \varphi(x)$ under weaker assumptions. Namely, $f(x, y) = o$ is equivalent to
$$y = y - Af(x, y)$$

and since $x$ is a parameter here we do not need to assume the differentiability with respect to $x$ if we content ourselves just with the existence of $\varphi$ (and give up its differentiability). We recommend that the reader use the Contraction Principle directly to obtain the following statement:

Let $X$ be a normed linear space, $Y$, $Z$ be Banach spaces and let $f\colon X \times Y \to Z$ be continuous at the point $(a, b)$ where $f(a, b) = o$. Assume that the partial Fréchet derivative $f_2'(a, b)$ is an isomorphism of $Y$ onto $Z$ and $f_2'\colon X \times Y \to \mathcal{L}(Y, Z)$ is continuous at $(a, b)$. Then there are neighborhoods $U$ of $a$ and $V$ of $b$ such that for any $x \in U$ there is a unique $y = \varphi(x) \in V$ for which $f(x, \varphi(x)) = o$. Moreover, $\varphi$ is continuous at $a$.

It is also possible to avoid partly the requirement of invertibility of $f_2'(a, b)$ (see Remark 4.1.6 and references given there).

There are many examples in Calculus where the Implicit Function Theorem is used. We give one in Exercise 4.2.9, see also exercises in Dieudonné [35]. Our attention is turned mainly towards more theoretical applications.

Example 4.2.4. Let
$$P(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_0$$
be a polynomial with real or complex coefficients $a_0, \dots, a_{n-1}$. The famous Fundamental Theorem of Algebra says that if $n \geq 1$, then the equation $P(z) = 0$ has at least one solution $\tilde z \in \mathbb{C}$ and actually $n$ solutions if all of them are counted with their multiplicity. This means that $P$ can be factorized as follows:
$$P(z) = (z - z_1)^{k_1} \cdots (z - z_l)^{k_l}, \quad k_1 + \cdots + k_l = n,$$
where $z_1, \dots, z_l$ are different. A natural question arises: How do these solutions $z_1, \dots, z_l$ depend on the coefficients $a_0, \dots, a_{n-1}$ of $P$? Let
$$F(z, y_0, \dots, y_{n-1}) = z^n + y_{n-1}z^{n-1} + \cdots + y_0\colon \mathbb{C} \times \mathbb{C}^n \to \mathbb{C}.$$
Then
$$F(z_1, a_0, \dots, a_{n-1}) = P(z_1) = 0 \quad \text{and} \quad F \in C^\infty(\mathbb{C} \times \mathbb{C}^n).$$
If $z_1$ is a simple root, i.e., $k_1 = 1$, then
$$\frac{\partial F}{\partial z}(z_1, a_0, \dots, a_{n-1}) \neq 0,$$
and the Implicit Function Theorem says that $z_1$ depends continuously on $a_0, \dots, a_{n-1}$ (also in the real case). But what happens if $k_1 > 1$? Notice that the cases of real and complex roots are different. In the former case, the real root can disappear ($x^2 + \varepsilon = 0$ for $\varepsilon > 0$), and in the latter case, the uniqueness can be lost. Since the solution $z_1$ ramifies or bifurcates at $a_0, \dots, a_{n-1}$, this phenomenon is called a bifurcation. We postpone a basic discussion of this very important nonlinear phenomenon till the end of the next section.

Example 4.2.5 (dependence of solutions on initial conditions). Suppose that $f\colon \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}^N$ is continuous in an open set $G \subset \mathbb{R} \times \mathbb{R}^N$ and has continuous partial derivatives with respect to the last $N$ variables in $G$. Denote by $\varphi(\cdot; \tau, \xi)$ a (unique) solution of the initial value problem
$$\dot x = f(t, x), \quad x(\tau) = \xi$$
(see Theorem 2.3.4). We are now interested in the properties of $\varphi$ with respect to the variables $(\tau, \xi) \in G$, cf. Remark 2.3.5. Let us define
$$\Phi(\tau, \xi, \varphi)(t) := \xi + \int_\tau^t f(s, \varphi(s))\,ds - \varphi(t). \tag{4.2.3}$$

For a fixed $(t_0, x_0) \in G$ the solution $\varphi(\cdot; t_0, x_0)$ of $\Phi(t_0, x_0, \varphi) = o$ is defined on an open interval $J$. Choose a compact interval $I \subset J$ such that $t_0 \in \operatorname{int} I$. Then the mapping $\Phi$ given by (4.2.3) is defined on a certain open subset $H \subset \mathbb{R} \times \mathbb{R}^N \times C(I, \mathbb{R}^N)$ and takes its values in $C(I, \mathbb{R}^N)$. Further, $\Phi(t_0, x_0, \varphi(\cdot; t_0, x_0)) = o$ and
$$\begin{aligned}
[\Phi_1'(\tau, \xi, \varphi)](t) &= -f(\tau, \varphi(\tau)),\\
[\Phi_2'(\tau, \xi, \varphi)\eta](t) &= \eta,\\
[\Phi_3'(\tau, \xi, \varphi)\psi](t) &= \int_\tau^t f_2'(s, \varphi(s))\psi(s)\,ds - \psi(t),
\end{aligned}$$
$t \in I$, $\eta \in \mathbb{R}^N$, $\psi \in C(I, \mathbb{R}^N)$ whenever $(\tau, \xi, \varphi) \in H$.

Since these partial Fréchet derivatives are continuous, $\Phi \in C^1(H)$ (see Proposition 3.2.18). The crucial assumption of the Implicit Function Theorem is the continuous invertibility of $\Phi_3'(t_0, x_0, \varphi(\cdot; t_0, x_0))$ in the space $C(I, \mathbb{R}^N)$. Put
$$B\psi(t) = \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\psi(s)\,ds, \quad \psi \in C(I, \mathbb{R}^N).$$
We have proved in Example 2.3.7 that $\sigma(B) = \{0\}$. In particular,
$$B - I = \Phi_3'(t_0, x_0, \varphi(\cdot; t_0, x_0))$$
is continuously invertible. By Theorem 4.2.1, there exist neighborhoods $U$ of $(t_0, x_0)$ and $V$ of $\varphi(\cdot; t_0, x_0)$ such that for any $(\tau, \xi) \in U$ there is a unique $\varphi \in V$ such that $\Phi(\tau, \xi, \varphi) = o$. Moreover, this $\varphi$ is continuously differentiable with respect to $\tau$ and $\xi$, and for the continuous mappings
$$\Theta(\cdot) := \frac{\partial \varphi}{\partial \tau}(\cdot; t_0, x_0) \quad \text{and} \quad \Xi(\cdot) := \frac{\partial \varphi}{\partial \xi}(\cdot; t_0, x_0)$$
we have, by Remark 4.2.2,
$$\begin{aligned}
-f(t_0, x_0) + \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\Theta(s)\,ds - \Theta(t) &= o,\\
\eta + \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\Xi(s)\eta\,ds - \Xi(t)\eta &= o, \quad \eta \in \mathbb{R}^N.
\end{aligned}$$
This means that $\Theta$ and $\Xi$ solve the so-called equation in variations
$$\dot y(t) = f_2'(t, \varphi(t; t_0, x_0))y(t) \tag{4.2.4}$$
(this is a system of $N$ linear equations for $\Theta$ and a system of $N \times N$ equations for $\Xi$) and fulfil the initial conditions
$$\Theta(t_0) = -f(t_0, x_0), \quad \Xi(t_0) = I. \tag{4.2.5}$$
In particular, $\Xi(\cdot)$ is a fundamental matrix of (4.2.4).
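The statement that $\Xi = \partial\varphi/\partial\xi$ solves the equation in variations (4.2.4) with $\Xi(t_0) = I$ can be observed numerically. The sketch below takes a hypothetical scalar autonomous right-hand side $f(x) = x(1 - x)$, integrates the pair $(\varphi, \Xi)$ with a classical Runge–Kutta scheme, and compares $\Xi(T)$ with a difference quotient of the flow in $\xi$. The right-hand side, the integrator, $t_0 = 0$ and all step sizes are illustrative assumptions.

```python
import math

def f(x):
    return x * (1.0 - x)          # hypothetical right-hand side

def dfdx(x):
    return 1.0 - 2.0 * x

def rk4(rhs, state, t_end, n=1000):
    """Classical fourth-order Runge-Kutta on [0, t_end] for x' = rhs(x)."""
    h = t_end / n
    for _ in range(n):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def flow(xi, t_end):
    """phi(t_end; 0, xi)."""
    return rk4(lambda s: [f(s[0])], [xi], t_end)[0]

def flow_with_variation(xi, t_end):
    """Integrate x' = f(x) together with y' = f'(phi(t)) y, y(0) = 1."""
    return rk4(lambda s: [f(s[0]), dfdx(s[0]) * s[1]], [xi, 1.0], t_end)

x0, T, h = 0.2, 1.0, 1e-5
_, Xi = flow_with_variation(x0, T)
fd = (flow(x0 + h, T) - flow(x0 - h, T)) / (2 * h)
exact = math.exp(T) / (1.0 - x0 + x0 * math.exp(T)) ** 2
```

For this particular $f$ the flow is known in closed form, $\varphi(t; 0, \xi) = \xi e^t / (1 - \xi + \xi e^t)$, which gives the value `exact` used as a third point of comparison.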

As an application of differentiability with respect to initial conditions we briefly sketch the approach to orbital stability of periodic solutions.

Example 4.2.6. Assume that we know a non-constant $T$-periodic solution $\varphi_0$ of an autonomous system
$$\dot x = f(x),$$
and that we are interested in the behavior of other solutions which start at time $t = 0$ near $\varphi_0(0) = x_0$. We assume that $f \in C^1(G)$, $G$ is an open set in $\mathbb{R}^N$, and denote by $\varphi(\cdot, \xi)$ the solution satisfying $\varphi(0, \xi) = \xi$. Let
$$M = \{x \in \mathbb{R}^N : (x - x_0, f(x_0))_{\mathbb{R}^N} = 0\}.$$
In order to show that a solution $\varphi(\cdot, \xi)$ exists on such an interval $[0, t(\xi)]$ that it meets $M \cap U$ ($U$ is a neighborhood of $x_0$) for the first positive time $t(\xi)$ near $T$ ($T$ is the period of $\varphi_0$), see Figure 4.2.1, we can solve the equation
$$\Phi(t, \xi) := (\varphi(t, \xi) - x_0, f(x_0)) = 0$$
in the vicinity of the point $(T, x_0)$. We have
$$\Phi_1'(T, x_0) = (f(x_0), f(x_0)) > 0$$
($f(x_0) \neq 0$ since $\varphi_0$ is non-constant) and
$$\Phi_2'(t, x_0)\eta = \left(\frac{d\varphi}{d\xi}(t, x_0)\eta, f(x_0)\right) = (\Xi(t, x_0)\eta, f(x_0))$$
(see the previous example) where $\Xi(t, x_0)$ is a fundamental matrix of the linear $T$-periodic equation
$$\dot y(t) = f'(\varphi_0(t))y(t)$$
(cf. (4.2.4)). So, we may use the Implicit Function Theorem to get a function $t(\xi)$ such that
$$\Phi(t(\xi), \xi) = 0, \quad t(x_0) = T, \quad \xi \in U(x_0).$$

Figure 4.2.1. (Diagram in $\mathbb{R}^N$: the periodic orbit $\varphi_0(\cdot)$ through $x_0$, the vector $f(x_0)$, the hyperplane $M$, and a nearby solution $\varphi(\cdot, \xi)$ starting at $\xi$ and meeting $M$ at $\varphi(t(\xi), \xi)$.)

By (4.2.2) we also have
$$\frac{dt}{d\xi}(T, x_0)\eta = -\frac{1}{\|f(x_0)\|^2}(\Xi(t, x_0)\eta, f(x_0)), \quad \eta \in \mathbb{R}^N.$$
This allows us to investigate the behavior of the so-called Poincaré mapping
$$P(\xi) := \varphi(t(\xi), \xi), \quad \xi \in U \cap M.$$
The asymptotic orbital stability of $\varphi_0$ can be defined by the requirement
$$\lim_{n \to \infty} P^n(\xi) = x_0.$$
For more detail the interested reader can consult, e.g., Amann [4, Section 23].


We are often interested in asymptotic behavior of solutions of a system of ordinary differential equations (linear or nonlinear), e.g., boundedness of solutions or their convergence to some special solutions (constant, periodic, etc.). In the following example we briefly sketch a method which can be used.

Example 4.2.7. Consider the equation
$$\dot x = Ax + f \tag{4.2.6}$$
where $A$ is a constant $N \times N$ matrix and $f\colon \mathbb{R} \to \mathbb{R}^N$ is bounded and continuous on $\mathbb{R}$ ($f \in BC(\mathbb{R}, \mathbb{R}^N)$). We are interested in bounded solutions of (4.2.6) on $\mathbb{R}$. Let us assume $\sigma(A) \cap i\mathbb{R} = \emptyset$. With help of Functional Calculus (Theorem 1.1.38, in particular, Remark 1.1.39(i), (ii)) we can construct two projections $P^+$, $P^-$ onto complementary subspaces $X^+$, $X^-$ of $\mathbb{R}^N$ which commute with $\mathcal{A} \in \mathcal{L}(\mathbb{R}^N)$ ($A$ is the matrix representation of $\mathcal{A}$ in the standard basis) and such that
$$\sigma(A^-) = \sigma(A) \cap \{z \in \mathbb{C} : \operatorname{Re} z < 0\}, \quad \sigma(A^+) = \sigma(A) \cap \{z \in \mathbb{C} : \operatorname{Re} z > 0\}$$
($A^+$, $A^-$ are the restrictions of $A$ to $X^+$, $X^-$, respectively). With help of the Variation of Constants Formula it can be proved that for any $f \in BC(\mathbb{R}, \mathbb{R}^N)$ there is a unique solution $x$ of (4.2.6) in the space $BC(\mathbb{R}, \mathbb{R}^N)$, and this solution is given by the formula
$$x(t) = \int_{-\infty}^t e^{(t-s)A^-}P^- f(s)\,ds - \int_t^{+\infty} e^{(t-s)A^+}P^+ f(s)\,ds.² \tag{4.2.7}$$
If we are interested in bounded solutions only on $\mathbb{R}^+ := [0, \infty)$, a similar computation shows that all such solutions for $f \in BC(\mathbb{R}^+, \mathbb{R}^N)$ are given by
$$x(t) = e^{tA^-}x^- + \int_0^t e^{(t-s)A^-}P^- f(s)\,ds - \int_t^\infty e^{(t-s)A^+}P^+ f(s)\,ds \tag{4.2.8}$$
where $x^-$ is an arbitrary point in $X^-$. Both formulae (4.2.7) and (4.2.8) may be used for finding bounded solutions to a semilinear equation
$$\dot x = Ax + f(x) \quad \text{where } f(o) = o, \ f'(o) = o, \ f \in C^1(U) \tag{4.2.9}$$
($U$ is a neighborhood of $o \in \mathbb{R}^N$). To do that we solve the corresponding nonlinear equations (4.2.7), (4.2.8) where $f(\cdot)$ is replaced by $g(x(\cdot))$ where $g$ is bounded and $g(y) = f(y)$ in a neighborhood of $0$. For details see Hale [63, Sections III.6 and IV.3]. A solution in (4.2.8) depends on the parameter $x^-$, so we have the equation
$$\Phi(\xi, \varphi)(t) := \varphi(t) - e^{tA^-}\xi - \int_0^t e^{(t-s)A^-}P^- g(\varphi(s))\,ds + \int_t^\infty e^{(t-s)A^+}P^+ g(\varphi(s))\,ds = o$$
with $\Phi\colon X^- \times BC(\mathbb{R}^+, \mathbb{R}^N) \to BC(\mathbb{R}^+, \mathbb{R}^N)$ (check it; you have to use the estimates given in footnote 2). This formulation is suitable for the use of the Implicit Function Theorem. We have left details to the interested reader. The graph of the mapping
$$\kappa\colon \xi \in X^- \mapsto P^+\varphi(0, \xi)$$
is the so-called local stable manifold $W^s_{\mathrm{loc}}(x_0)$³ of the equation (4.2.9) ($\varphi(\cdot, \xi)$ is a solution of $\Phi(\xi, \varphi) = o$). It follows from the formula (4.2.2) that
$$\kappa'(o) = o,$$
i.e., $W^s_{\mathrm{loc}}(x_0)$ is tangent to the stable manifold $X^-$ of the linear equation $\dot x = Ax$, see Figure 4.2.2.

² The interested reader can check this formula and also (4.2.8) as an exercise on the use of the Variation of Constants Formula. Hint. Use the estimates $\|e^{tA^-}x\| \leq ce^{-\alpha t}\|x\|$ for $x \in X^-$, $t > 0$, and $\|e^{tA^+}x\| \leq ce^{\alpha t}\|x\|$ for $x \in X^+$, $t < 0$, where the positive constants $\alpha$, $c$ are independent of $t$ and $x$, $\alpha$ is such that $\sigma(A) \cap \{\lambda \in \mathbb{C} : |\operatorname{Re}\lambda| \leq \alpha\} = \emptyset$ and $c$ depends on $\alpha$ only. These estimates follow from Functional Calculus (see Exercise 1.1.42) and they ensure that the integrals in (4.2.7) do exist. Apply $P^+$ to both sides of the Variation of Constants Formula and send $t \to \infty$ to obtain
$$P^+x(t) = -\int_t^\infty e^{(t-s)A^+}P^+ f(s)\,ds$$
provided $x$ is a bounded solution. Similarly for $P^-x(t)$.
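Formula (4.2.7) can be tested in the simplest hyperbolic situation. The sketch below assumes the diagonal matrix $A = \operatorname{diag}(-1, 1)$ in $\mathbb{R}^2$, so that $X^-$ and $X^+$ are the coordinate axes and $P^\mp$ are the coordinate projections, and approximates the two improper integrals by a truncated trapezoidal rule. The matrix, the forcing $f(s) = (\cos s, \cos s)$, the truncation length `cutoff` and the quadrature are all illustrative assumptions.

```python
import math

def trapezoid(func, lo, hi, n=40000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h

def bounded_solution(t, f1, f2, cutoff=40.0):
    """Evaluate (4.2.7) for A = diag(-1, +1), integrals truncated at cutoff."""
    # stable part: integral over (-inf, t] of e^{-(t - s)} f1(s) ds
    x_minus = trapezoid(lambda s: math.exp(-(t - s)) * f1(s), t - cutoff, t)
    # unstable part: minus the integral over [t, +inf) of e^{(t - s)} f2(s) ds
    x_plus = -trapezoid(lambda s: math.exp(t - s) * f2(s), t, t + cutoff)
    return x_minus, x_plus

t = 0.7
x_minus, x_plus = bounded_solution(t, math.cos, math.cos)
```

For this forcing the bounded solutions of $\dot x_1 = -x_1 + \cos t$ and $\dot x_2 = x_2 + \cos t$ are $\tfrac12(\cos t + \sin t)$ and $\tfrac12(\sin t - \cos t)$, which can be verified by substitution; the stable component is integrated over the past and the unstable one over the future, exactly as in (4.2.7).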

Remark 4.2.8. It is sometimes convenient to define a solution of nonlinear, in particular partial, differential equations more generally, not assuming that a solution has all classical derivatives which appear in the equation (see Chapters 6 and 7). Actually, we have seen one such possibility in the reformulation of a differential equation as an integral equation $x = F(x)$ where $F$ is given by the formula (2.3.6). Having a more general notion of solution, a natural question arises: Under which conditions is this solution smoother, in particular, is it a "classical" solution? Such results are known as regularity assertions. The Implicit Function Theorem can be occasionally used to prove such statements. See Theorem 6.1.14.

³ The so-called stable manifold $W^s(x_0)$ of the stationary point $x_0$ of the equation $\dot x = g(x)$ ($g(x_0) = o$) is defined as follows: Let $\psi(\cdot, \xi)$ be a solution of this differential equation satisfying the initial condition $\psi(0, \xi) = \xi$. Then the stable manifold is
$$W^s(x_0) = \left\{\xi : \lim_{t \to \infty} \psi(t, \xi) = x_0\right\}$$
and a local stable manifold $W^s_{\mathrm{loc}}(x_0)$ is defined by $\{\xi \in W^s(x_0) : \psi(t, \xi) \in U \text{ for } t \geq 0\}$ where $U$ is a neighborhood of $x_0$. Notice the crucial assumption $\sigma(A) \cap i\mathbb{R} = \emptyset$ (i.e., $o$ is a so-called hyperbolic stationary point of the equation (4.2.9)) in the above argument. Figure 4.2.2 shows also the distinction between stable and local stable manifolds. It is worth mentioning that a similar approach cannot be used in the case $\sigma(A) \cap i\mathbb{R} \neq \emptyset$. Since there can exist eigenvalues on the imaginary axis of multiplicity greater than 1, we cannot expect a manifold consisting of bounded solutions. To get the so-called central manifold we are forced to solve a nonlinear version of the equations (4.2.7) in a weighted space instead of $BC(\mathbb{R}, \mathbb{R}^N)$. However, this problem is more difficult due to the lack of differentiability of the Nemytski operator (see footnote 11 on page 126). For details see, e.g., Chow, Li & Wang [25, Chapter 1] and references given there.

Figure 4.2.2. (Diagram in $\mathbb{R}^N$: the subspaces $X^-$ and $X^+$ through $o$, the stable manifold $W^s(o)$, its local part $W^s_{\mathrm{loc}}(o)$, and a point $\xi \in X^-$ with $\kappa(\xi)$ and $\varphi(0, \xi)$.)

Exercise 4.2.9. Let $f\colon \mathbb{R}^M \to \mathbb{R}^N$ and let $\Phi$ be a diffeomorphism defined on a neighborhood $U$ of the graph of $f$ onto $V \subset \mathbb{R}^{M+N}$. Write
$$\Phi^{-1}(\xi, \eta) = (\psi^1(\xi, \eta), \psi^2(\xi, \eta)) \quad \text{for } (\xi, \eta) \in V.$$
This means the graph of $f$ is isomorphic to
$$\Gamma = \{(\xi, \eta) \in \mathbb{R}^{M+N} : \psi^2(\xi, \eta) - f(\psi^1(\xi, \eta)) = o\}.$$
The Implicit Function Theorem yields conditions for $\Gamma$ to be the graph of a function $\eta = g(\xi)$.

(i) Formulate these conditions!

(ii) Express the derivative of $f$ in terms of the derivative of $g$.

Hint. $f'(\Phi^{-1}) = [(\psi^2)_2'\,g' + (\psi^2)_1'] \circ [(\psi^1)_1' + (\psi^1)_2'\,g']^{-1}$. Control question: Have you checked that the second term on the right-hand side is an isomorphism of $\mathbb{R}^M$ onto $\mathbb{R}^M$?

(iii) Without using the general result from (ii) transform the equation
$$\frac{dy}{dx} = f(x, y)$$
into polar coordinates!

Exercise 4.2.10. Let $M$ be a metric space and $f\colon M \times \mathbb{R} \to \mathbb{R}$ a continuous map. Let $c > 0$ be such that for all $x \in M$, $y_1, y_2 \in \mathbb{R}$, we have
$$(f(x, y_1) - f(x, y_2))(y_1 - y_2) \geq c|y_1 - y_2|^2.$$
Prove that for any $x \in M$ there exists a unique $y(x) \in \mathbb{R}$ such that $f(x, y(x)) = 0$ and, moreover, $y\colon x \mapsto y(x)$ is a continuous map from $M$ into $\mathbb{R}$.

Hint. Use the properties of real functions of one real variable.

Exercise 4.2.11. Let $M$ be a normed linear space, let $f$ be as in Exercise 4.2.10 and, moreover, $f \in C^k(M \times \mathbb{R})$ with some $k \in \mathbb{N}$. Then the implicit function $y = y(x)$ from Exercise 4.2.10 is of the class $C^k(M)$. Prove it!

Hint. Use Theorem 4.2.1.

Exercise 4.2.12. Give details which are omitted in Example 4.2.7.

Exercise 4.2.13. Let $A$ be a densely defined linear operator in a Hilbert space. Assume that $A$ has a compact self-adjoint resolvent. Extend the construction of the local stable manifold (Example 4.2.7) to the equation (4.2.6). See Exercise 3.1.19 for the properties of this equation.

Exercise 4.2.14. Assume that
$$f(x, y) = \sum_{j,k=0}^{\infty} a_{jk}(x - x_0)^j(y - y_0)^k, \quad |x - x_0| < \alpha, \quad |y - y_0| < \beta.$$
Moreover, let $a_{00} = 0$, $a_{01} \neq 0$. Apply the Implicit Function Theorem and show that the implicit function $y(x)$ is the sum of a power series in a neighborhood of $x_0$. Note that for complex variables the result follows directly from the properties of holomorphic functions and Theorem 4.2.1. In the real case one has to prove that the formal power series for $y(x)$ has a positive radius of convergence.

4.3 Local Structure of Diﬀerentiable Maps, Bifurcations We now revert to the topic of Remark 4.1.6, i.e., to the case when the assumptions of the Local Inverse Function Theorem (Theorem 4.1.1) are violated. In particular, it was mentioned there that the assumptions of the Local Inverse Function Theorem are never satisﬁed for f : RM → RN provided M = N . In the ﬁrst part we will study local behavior of such mappings. In the second part we stress the main idea of the Lyapunov–Schmidt Reduction and the approach to bifurcation phenomena (Crandall–Rabinowitz Bifurcation Theorem). Deﬁnition 4.3.1. Let f : X → Y be a diﬀerentiable map in a neighborhood of a point a ∈ X. If f (a) is neither injective nor surjective, then a is called a singular point of f .

4.3. Local Structure of Diﬀerentiable Maps, Bifurcations

157

The following proposition deals with the ﬁrst non-singular case for the mapping f : RM → RN , M < N . For the second one see Proposition 4.3.8. Proposition 4.3.2. Let f : RM → RN be a diﬀerentiable map on an open set G ⊂ RM . Let a ∈ G and let f (a) be injective. Let Q be a (linear) projection of RN onto Y1 Im f (a). Then there exist neighborhoods U of a, V of Qf (a) in Y1 , a diﬀeomorphism ϕ of U onto V and a diﬀerentiable map g : V → RN such that f =g◦ϕ (see Figure 4.3.1). RN = Y1 ⊕ Y2

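[Figure 4.3.1. Diagram of the factorization $f = g \circ \varphi$: the neighborhood $U \subset G \subset \mathbb{R}^M$ of $a$, the neighborhood $V$ of $Qf(a)$ in $Y_1 = \operatorname{Im} f'(a)$, the projection $Q$, and the decomposition $\mathbb{R}^N = Y_1 \oplus Y_2$.]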

Proof. The proof is almost obvious from Figure 4.3.1. Put $\varphi = Q \circ f$. Then $\varphi'(a) = Qf'(a)$ is an isomorphism of $\mathbb{R}^M$ onto $Y_1$. Since $\dim Y_1 = M$ is finite, $Y_1$ is a Banach space (as a closed subspace of the Banach space $\mathbb{R}^N$) and, by Theorem 4.1.1, $\varphi$ is a diffeomorphism of a neighborhood $U$ of $a$ onto a neighborhood $V$ (in $Y_1$) of $Qf(a)$. It suffices to put $g = f \circ \varphi^{-1}$.

Remark 4.3.3. (i) We have used the finite dimension of $Y := \mathbb{R}^N$ to ensure both the existence of a continuous linear projection $Q$ and the closedness of the range $\operatorname{Im} f'(a)$. If $f : X \to Y$, where $X$, $Y$ are Banach spaces, then neither of these two conditions has to be satisfied. It follows from the proof that Proposition 4.3.2 holds under these two additional assumptions. We notice that these assumptions are superfluous provided $X$ has a finite dimension (see Remark 2.1.19).


(ii) It is also easy to prove that
$$\Psi(y) := g(Qy) - (I - Q)y - Qf(a)$$
is a diffeomorphism of a neighborhood $W$ of $b = f(a)$ onto a neighborhood $\tilde{W}$ of $o$ in $\mathbb{R}^N$. Indeed,
$$\Psi(b) = o, \qquad \Psi'(b)k = \varphi'(a)h - (I - Q)k,$$
where $h \in \mathbb{R}^M$ is such that $Qk = \varphi'(a)h$. Moreover, $y \in f(G) \cap W$ if and only if there is an $x \in G$ such that
$$y = f(x) \quad \text{and} \quad (I - Q)\Psi(f(x)) = o.$$
This means that there exists a local (nonlinear) transformation of coordinates in $W$ (given by $\Psi$) such that $f(G) \cap W$ is expressed by $z_{M+1} = \dots = z_N = 0$ in these new coordinates.

(iii) An interpretation similar to (ii) follows:
$$(I - Q)f = (I - Q)g(\varphi) = (I - Q)g(Qf), \qquad \Phi(Qf) := (I - Q)g(Qf).$$
This means that after a linear transformation of coordinates the last $N - M$ components of $f$ (i.e., $(I - Q)f$) depend (via $\Phi$) on the first $M$ components of $f$ in a neighborhood of $a$. Compare this local nonlinear result to the linear one for the equation
$$Ax = y, \qquad A \in L(\mathbb{R}^M, \mathbb{R}^N).$$

[Figure 4.3.2. Immersion]

[Figure 4.3.3. Injective immersion]


(iv) A map $f$ which satisfies the assumptions of Proposition 4.3.2 at each point $a \in G$ is often called an immersion of $G$ into $\mathbb{R}^N$. An injective immersion which is also a homeomorphism of $G$ onto $f(G)$ (in the induced topology from $\mathbb{R}^N$) is called an embedding. Some examples of immersions which are not embeddings are shown in Figures 4.3.2 and 4.3.3. We note that we have already used the term embedding for an injective continuous linear operator.

Further examination of Proposition 4.3.2 leads to the following definition of a differentiable manifold. This notion is basic for differential geometry and global nonlinear analysis. In this textbook we will mostly use it for purposes of terminology only. Some basic facts on manifolds are given in Appendix 4.3A and will be used for developing the notion of degree (Appendix 4.3D).

Definition 4.3.4. A differentiable manifold of dimension $M$ and of the class $C^k$ is a subset $\mathcal{M}$ of $\mathbb{R}^N$ ($N \geq M$) with the following property: For each $x \in \mathcal{M}$ there is a neighborhood $W$ of $x$ (in $\mathbb{R}^N$) and a $C^k$-diffeomorphism $\psi$ of $W$ into $\mathbb{R}^N$ such that
$$\psi(\mathcal{M} \cap W) = \{y = (y_1, \dots, y_N) \in \mathbb{R}^N : y_{M+1} = \dots = y_N = 0\} \cap \psi(W).$$
A relative neighborhood $W \cap \mathcal{M}$ together with $\psi$ is called a (local) chart at the point $x \in \mathcal{M}$. The first $M$ coordinates $(y_1, \dots, y_M)$ are called the local coordinates of $x$ on $\mathcal{M}$. The collection of all charts of $\mathcal{M}$ is called an atlas of $\mathcal{M}$.

Example 4.3.5. (i) An open subset $G \subset \mathbb{R}^M$ is an $M$-dimensional differentiable manifold of the class $C^k$ for any $k \in \mathbb{N}$ (i.e., of the class $C^\infty$).

(ii) The graph of a function $f : \mathbb{R}^M \to \mathbb{R}$, $f \in C^k(G)$, $G$ an open subset of $\mathbb{R}^M$, is an $M$-dimensional differentiable manifold of the class $C^k$ in $\mathbb{R}^N$, $N \geq M + 1$.

(iii) Let $S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$ be the 2-dimensional sphere. Then $S^2$ is a 2-dimensional differentiable manifold of the class $C^\infty$ in $\mathbb{R}^N$, $N \geq 3$. Indeed, a chart for the upper open half-sphere can be constructed as follows: let
$$\psi(x, y, z) = \left(x, y, z - \sqrt{1 - x^2 - y^2}\right), \qquad W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 < 1,\ z > 0\}.$$
Then $\psi$ is a diffeomorphism of $W$ into $\mathbb{R}^3$ and
$$\psi(W \cap S^2) = \{(u, v, w) \in \mathbb{R}^3 : u^2 + v^2 < 1,\ w = 0\}.$$
We will see a more convenient proof in Example 4.3.10.
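The chart in Example 4.3.5(iii) can be checked numerically. The following minimal sketch samples points of the upper half-sphere, applies $\psi$, and confirms that the third coordinate of the image vanishes up to rounding. The helper names (`psi_inverse`, `point_on_upper_half_sphere`) and the sampling grid are illustrative choices, not part of the text.

```python
import math

def psi(x, y, z):
    """Chart from Example 4.3.5(iii): flattens the upper half-sphere onto w = 0."""
    return (x, y, z - math.sqrt(1.0 - x * x - y * y))

def psi_inverse(u, v, w):
    """Inverse of psi on its image: add the height of the half-sphere back."""
    return (u, v, w + math.sqrt(1.0 - u * u - v * v))

def point_on_upper_half_sphere(theta, phi):
    """Spherical coordinates with 0 <= theta < pi/2, so that z > 0."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# Sample grid (hypothetical): theta stays below 1.3 < pi/2.
samples = [point_on_upper_half_sphere(0.1 + 0.2 * i, 0.7 * j)
           for i in range(7) for j in range(9)]
images = [psi(*p) for p in samples]

# Largest |w| over the images of sphere points (should be rounding noise).
max_w = max(abs(w) for (_, _, w) in images)
# Largest error of psi_inverse(psi(p)) - p over the samples.
max_roundtrip = max(max(abs(a - b) for a, b in zip(p, psi_inverse(*q)))
                    for p, q in zip(samples, images))
```

Here `max_w` measures how far the images are from the plane $w = 0$, and `max_roundtrip` confirms that $\psi$ is invertible on the sampled set.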


Definition 4.3.6. Let $X$, $Y$ be Banach spaces, $f : X \to Y$ a differentiable map on a neighborhood of a point $a \in X$. If $f'(a)$ is a surjective map onto $Y$, then the point $a$ is called a regular point. If $a$ is not a regular point, then it is called a critical point. A value $b \in Y$ is called a critical value of $f$ provided the set $f^{-1}(b) := \{x \in X : f(x) = b\}$ contains a critical point. In the other case, $b$ is a regular value.

Remark 4.3.7. There is a difference between the notion of a singular point (Definition 4.3.1) and a critical point. For example, if $f : \mathbb{R}^M \to \mathbb{R}^N$, $M < N$, then all points in $\mathbb{R}^M$ are critical (but some of them can be non-singular). The importance of the notion of a critical point will be more apparent in connection with the Sard Theorem (Theorem 5.2.3) and its applications.

Proposition 4.3.8. Let $G$ be an open subset of $\mathbb{R}^M$, $f : G \to \mathbb{R}^N$, $f \in C^k(G)$. Let $a \in G$ be a regular point of $f$. Then there are neighborhoods $U$ of $o \in \mathbb{R}^M$, $V$ of $a$, and a diffeomorphism $\varphi \in C^k$ of $U$ onto $V$ such that
$$\{x \in V : f(x) = f(a)\} = \varphi(U \cap \operatorname{Ker} f'(a))$$
(see Figure 4.3.4).

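[Figure 4.3.4. Diagram for Proposition 4.3.8: the decomposition $\mathbb{R}^M = X_1 \oplus X_2$ with $X_1 = \operatorname{Ker} f'(a)$, the projection $P$, the maps $A$ and $A^{-1}$ between $X_2$ and $\mathbb{R}^N$, the diffeomorphism $\varphi$ of $U$ onto $V$, and the level set $f(x) = f(a)$.]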

Proof. By Remark 4.3.7, $M \geq N$. If $M = N$, then Theorem 4.1.1 can be applied. Therefore, we assume that $M > N$. Denote by $P$ a (linear continuous) projection of $X := \mathbb{R}^M$ onto $X_1 := \operatorname{Ker} f'(a)$ and by $X_2$ the complementary subspace given by $X_2 = \operatorname{Im}(I - P)$. If $A$ is the restriction of $f'(a)$ to $X_2$, then $A$ is an isomorphism of $X_2$ onto $\mathbb{R}^N$ ($A$ is both injective and surjective). Denote by $A^{-1}$ the inverse isomorphism of $\mathbb{R}^N$ onto $X_2$ ($A^{-1}$ is also called a right inverse of $f'(a)$). We can rewrite $f$ in the following way:
$$f(x) = f(a) + f'(a)\left[A^{-1}(f(x) - f(a)) + P(x - a)\right].$$

Let us denote
$$\psi(x) = A^{-1}(f(x) - f(a)) + P(x - a).$$
A simple calculation shows that
$$\psi'(a)h = A^{-1}f'(a)h + Ph = (I - P)h + Ph = h \quad \text{for any } h \in X.$$
Since $\psi(a) = o$, $\psi$ is a diffeomorphism of a neighborhood $V \subset G$ of $a$ onto a neighborhood $U$ of $o$ (Theorem 4.1.1). Further, $x \in \{y \in V : f(y) = f(a)\}$ if and only if $x \in V$ and $\psi(x) = P(x - a)$, i.e., $\psi(x) \in U \cap \operatorname{Ker} f'(a)$.

The desired diffeomorphism $\varphi$ is the inverse of $\psi$.

Remark 4.3.9. (i) Proposition 4.3.8 together with its proof also holds for $f : X \to Y$, $X$, $Y$ Banach spaces, provided there exists a linear continuous projection $P$ of $X$ onto $\operatorname{Ker} f'(a)$. The continuity of $A^{-1}$ follows in this case from the Open Mapping Theorem (Theorem 2.1.8). The existence of such a projection $P$ can be shown in two important cases, namely, when $Y$ has finite dimension (and therefore $\operatorname{Ker} f'(a)$ has finite codimension, see Example 2.1.12) or $\operatorname{Ker} f'(a)$ has finite dimension (Remark 2.1.19).

(ii) Notice that $\varphi$ can be viewed as a local (nonlinear) transformation of coordinates in which $f$ is a linear map, namely
$$f(\varphi(y)) = f(a) + f'(a)y, \qquad y \in U.$$
This formula also shows that all points in $V$ are regular. Moreover, if $z$ is sufficiently close to $b = f(a)$, then
$$y = A^{-1}(z - b) \in U \quad \text{and} \quad f(\varphi(y)) = z.$$
This shows that $f(G)$ is an open set in $\mathbb{R}^N$ provided all points of $G$ are regular.

(iii) In the terms of differentiable manifolds (Definition 4.3.4) the statement of Proposition 4.3.8 can be formulated as follows: If $f : \mathbb{R}^M \to \mathbb{R}^N$ is a differentiable map in an open set $G \subset \mathbb{R}^M$, $b \in \mathbb{R}^N$, then the set $\{x \in G : f(x) = b\}$ is a differentiable manifold (either empty or of dimension $M - N$) provided $b$ is a regular value of $f$.

(iv) Proposition 4.3.8 imposes certain restrictions on the set $\{x \in \mathbb{R}^M : f(x) = f(a)\}$. In Figures 4.3.5–4.3.7 there are some cases in which $a$ is not a regular point (i.e., it is a critical point). The value $f(a)$ is critical in all cases.


[Figures 4.3.5, 4.3.6 and 4.3.7. Three level sets through a critical point $a$; one of the panels is labeled "(cusp)".]
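To make the construction in the proof of Proposition 4.3.8 concrete, here is a small sketch for the hypothetical choice $f(x, y) = x^2 + y^2 - 1$ on $\mathbb{R}^2$ with the regular point $a = (0, 1)$. In this case $f'(a) = (0, 2)$, $\operatorname{Ker} f'(a)$ is the first coordinate axis, and both $\psi$ and its inverse $\varphi$ can be written down by hand. The sketch checks the formula $f(\varphi(y)) = f(a) + f'(a)y$ from Remark 4.3.9(ii) on a small grid; the grid and all names are assumptions made for illustration.

```python
import math

A_POINT = (0.0, 1.0)  # regular point a of f (assumed example)

def f(x, y):
    return x * x + y * y - 1.0

def psi(x, y):
    """psi(x) = A^{-1}(f(x) - f(a)) + P(x - a) for f'(a) = (0, 2).

    Ker f'(a) is the first coordinate axis, P projects onto it, and
    A^{-1}(s) = (0, s / 2) inverts f'(a) restricted to the second axis.
    """
    return (x - A_POINT[0], (f(x, y) - f(*A_POINT)) / 2.0)

def phi(u, v):
    """Inverse of psi near o, solved by hand for this particular f."""
    return (u, math.sqrt(1.0 + 2.0 * v - u * u))

grid = [-0.3 + 0.1 * i for i in range(7)]

# f(phi(u, v)) should equal f(a) + f'(a)(u, v) = 2 v.
linearization_error = max(abs(f(*phi(u, v)) - (f(*A_POINT) + 2.0 * v))
                          for u in grid for v in grid)
# Points phi(u, 0) with (u, 0) in Ker f'(a) should lie on {f = f(a)}.
level_set_error = max(abs(f(*phi(u, 0.0))) for u in grid)
# psi(phi(u, v)) should return (u, v).
roundtrip_error = max(max(abs(p - q) for p, q in zip(psi(*phi(u, v)), (u, v)))
                      for u in grid for v in grid)
```

In the new coordinates the level set $\{f = f(a)\}$ is the piece of the kernel $\{v = 0\}$, which is exactly the statement of the proposition for this example.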

Example 4.3.10. The sphere $S^2$ is a $C^\infty$-differentiable manifold. To see this it is sufficient to use Remark 4.3.9(iii) for
$$f(x, y, z) = x^2 + y^2 + z^2 - 1, \qquad b = 0.$$

The assertions of the last two propositions are part of the following more general result.

Theorem 4.3.11 (Rank Theorem). Let $f : \mathbb{R}^M \to \mathbb{R}^N$ be a differentiable map on an open subset $G \subset \mathbb{R}^M$ and let the dimension of $\operatorname{Im} f'(x)$ be constant for $x \in G$ (and equal to $L \in \mathbb{N}$). Then for any $a \in G$ there exist neighborhoods $U$ of $a$, $W$ of $b = f(a)$, cubes $C$ in $\mathbb{R}^M$, $D$ in $\mathbb{R}^N$ and diffeomorphisms $\Phi : C \to U$, $\Psi : W \to D$ such that the map $F$ defined by $F = \Psi \circ f \circ \Phi$ has the form
$$F(z_1, \dots, z_M) = (z_1, \dots, z_L, 0, \dots, 0) \quad \text{for all } z = (z_1, \dots, z_M) \in C$$
(see Figure 4.3.8).

Proof. Denote $X_2 = \operatorname{Ker} f'(a)$, $P$ a (linear) projection in $\mathbb{R}^M$ onto $X_2$, $X_1 = \operatorname{Ker} P$ and, similarly, $Y_1 = \operatorname{Im} f'(a)$, $Q$ a (linear) projection in $\mathbb{R}^N$ onto $Y_1$, $Y_2 = \operatorname{Ker} Q$. Then the restriction $A$ of $f'(a)$ to $X_1$ is an isomorphism of $X_1$ onto $Y_1$. Let $A^{-1}$ be the inverse isomorphism, $A^{-1} : Y_1 \to X_1$. By the proof of Proposition 4.3.8,
$$\alpha(x) = A^{-1}Q(f(x) - f(a)) + P(x - a)$$
is a diffeomorphism of a neighborhood $U$ of $a \in \mathbb{R}^M$ onto a neighborhood $\tilde{U}$ of $o \in \mathbb{R}^M$. Denote by $\varphi$ the inverse to $\alpha$. For $h_1 \in X_1$ we have
$$\alpha'(x)h_1 = A^{-1}Qf'(x)h_1.$$
This implies that $f'(x)$ is injective on $X_1$ ($\alpha'(x)$ has this property). Since
$$\dim X_1 = \dim \operatorname{Im} f'(x) = L,$$
the restriction of $f'(x)$ to $X_1$ is an isomorphism of $X_1$ onto $\operatorname{Im} f'(x)$. We can express this fact in the commutative diagram (Figure 4.3.9).

[Figure 4.3.8. Diagram of the Rank Theorem: the map $f$ between $U \subset \mathbb{R}^M = X_1 \oplus X_2$ and $W \subset \mathbb{R}^N = Y_1 \oplus Y_2$, the maps $\varphi$, $\psi$, $\tilde{g}$, $P$, $Q$, $A$, the diffeomorphisms $\Phi$, $\Psi$, $T_C$, $T_D$, and the normal form $F$ between the cubes $C \subset \mathbb{R}^L \times \mathbb{R}^{M-L}$ and $D \subset \mathbb{R}^L \times \mathbb{R}^{N-L}$.]

[Figure 4.3.9. Commutative diagram: $\alpha'(x) : X_1 \to X_1$ equals $f'(x) : X_1 \to \operatorname{Im} f'(x)$ followed by $A^{-1}Q$ (an isomorphism).]

Using the decomposition $\mathbb{R}^M = X_1 \oplus X_2$, we write
$$u = u_1 + u_2, \qquad u_i \in X_i,\ i = 1, 2, \quad \text{for } u \in \tilde{U},$$
and define $g(u_1, u_2) = f(\varphi(u_1 + u_2))$. Now, we show that $g$ actually depends on the first variable only. To see this we compute the derivative of $g$ with respect to the second variable:
$$g'_2(u_1, u_2)h_2 = f'(\varphi(u))\varphi'(u)h_2.$$


For $k := \varphi'(u)h_2$ and $\varphi(u) = x$ we have
$$h_2 = \alpha'(x)k = A^{-1}Qf'(x)k + Pk.$$
Since $h_2$ and $Pk$ lie in $X_2$ while $A^{-1}Qf'(x)k$ lies in $X_1$, this means that $A^{-1}Qf'(x)k = o$. Since $A^{-1}Q$ is an isomorphism of $\operatorname{Im} f'(x)$ onto $X_1$ (see Figure 4.3.9), we have $f'(x)k = o$, i.e.,
$$g'_2(u_1, u_2)h_2 = o \quad \text{for any } h_2 \in X_2.$$
The Mean Value Theorem (Theorem 3.2.7) implies that
$$g(u_1, u_2) = g(u_1, o) \quad \text{for } (u_1, u_2),\ (u_1, o) \in \tilde{U}$$
(see footnote 4). This result is shown in Figure 4.3.8 by shaded areas. Put $\tilde{g}(u_1) := g(u_1, o)$. We employ Proposition 4.3.2, in particular Remark 4.3.3(ii), to complete the proof. Replacing $f$ there by $\tilde{g}$, we obtain a diffeomorphism $\psi$ of a neighborhood $W$ of $b = f(a)$ onto a neighborhood $\tilde{W}$ of $o \in \mathbb{R}^N$ such that
$$(I - Q)\psi(f(U) \cap W) = o$$
(see the right lower corner of Figure 4.3.8). We get cubes $C$ and $D$ by diffeomorphisms $T_C$, $T_D$ in $\mathbb{R}^M$, $\mathbb{R}^N$, respectively, which transform non-Cartesian coordinates $\alpha$ in $X_1 \oplus X_2$ or $\psi$ in $Y_1 \oplus Y_2$ into Cartesian coordinates in $\mathbb{R}^M = \mathbb{R}^L \times \mathbb{R}^{M-L}$ ($T_C(X_1) = \mathbb{R}^L$), or in $\mathbb{R}^N = \mathbb{R}^L \times \mathbb{R}^{N-L}$, respectively (see the upper part of Figure 4.3.8).

Remark 4.3.12. The assertion of the Rank Theorem can be formulated in a slightly less informative way as follows: Under the hypotheses of Theorem 4.3.11, $f(G)$ is a differentiable manifold of dimension $L$.

Definition 4.3.13. Functions $f_1, \dots, f_N : \mathbb{R}^M \to \mathbb{R}$ are said to be independent in an open set $G \subset \mathbb{R}^M$ if any point $x \in G$ is regular for $f = (f_1, \dots, f_N)$. In the other case, the functions are called dependent.

The following assertion explains the notions of dependent and independent functions. Suppose the assumptions of the Rank Theorem are satisfied for
$$f = (f_1, \dots, f_L, f_{L+1}, \dots, f_N) : \mathbb{R}^M \to \mathbb{R}^N$$
where the functions $f_1, \dots, f_L$ are independent in a neighborhood of a point $a \in \mathbb{R}^M$. Then there is a smooth function $G : \mathbb{R}^L \to \mathbb{R}^{N-L}$ such that
$$(f_{L+1}(x), \dots, f_N(x)) = G(f_1(x), \dots, f_L(x))$$
for $x$ in a certain neighborhood of $a$.

Footnote 4. In fact, the use of Theorem 3.2.7 requires the segment joining $(u_1, o)$ to $(u_1, u_2)$ to lie in $\tilde{U}$. Taking a smaller $\tilde{U}$ if necessary we can assume that $\tilde{U}$ is convex. Notice that we have got a similar result at the end of the proof of Proposition 4.3.8 where we have considered only one fiber, namely $\{x : f(x) = f(a)\}$.


To prove this assertion notice first that $\operatorname{Im} f'(a)$ is an $L$-dimensional subspace of $\mathbb{R}^N$ and can be identified with $\mathbb{R}^L \times \{0\}$. This means that $Qf(x) = H_1(x) := (f_1(x), \dots, f_L(x))$ and, in the notation of the proof of Theorem 4.3.11,
$$f(x) = \tilde{g}(u_1) \quad \text{where} \quad u_1 = A^{-1}(H_1(x) - H_1(a)).$$
In particular, $f_{L+1}, \dots, f_N$ are smooth functions of $f_1, \dots, f_L$.

The notion of independent functions plays an important role also in the theory of ordinary differential equations. Indeed, let $\dot{x} = v(x)$ be a system of $M$ differential equations. A smooth non-constant function $f : \mathbb{R}^M \to \mathbb{R}$ is called a first integral of this system in an open set $G \subset \mathbb{R}^M$ if for any $a \in G$ there is an interval $I_a$ such that, for the solution $\varphi(\cdot, a)$ of the system with $\varphi(0, a) = a$,
$$\varphi(t, a) \in G \quad \text{and} \quad \frac{d}{dt} f(\varphi(t, a)) = 0$$
hold for $t \in I_a$.

It has been proved in the theory of ordinary differential equations that a system $\dot{x} = v(x)$ ($v : G \subset \mathbb{R}^M \to \mathbb{R}^M$ is smooth) has $M - 1$ independent first integrals $f_1, \dots, f_{M-1}$ in a neighborhood $U$ of any non-stationary point $a \in G$. A smooth function $g : U \to \mathbb{R}$ is a first integral if and only if $g, f_1, \dots, f_{M-1}$ are dependent on $U$. We remark that the knowledge of the first integrals reduces the original system. For example, if $f_1, \dots, f_{M-1}$ are independent first integrals in a neighborhood $U$ of a non-stationary point, then the transformation of coordinates
$$y_i = f_i(x), \quad i = 1, \dots, M - 1, \qquad y_M = x_M$$
leads to a new system
$$\dot{y}_i = 0, \quad i = 1, \dots, M - 1, \qquad \dot{y}_M = w(y_M) \quad \text{for a function } w,$$
and after rescaling in time, to
$$\dot{y}_i = 0, \quad i = 1, \dots, M - 1, \qquad \dot{y}_M = 1.$$
For another interpretation and a generalization of the notion of the first integral see Exercise 4.3.26 and the end of Appendix 4.3A.
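As a small illustration of first integrals and dependence, consider the harmonic oscillator $\dot{x} = y$, $\dot{y} = -x$ (an example chosen here, not taken from the text), so $M = 2$ and one independent first integral is expected near any non-stationary point. The sketch below checks by the chain rule that $f_1 = x^2 + y^2$ and $g = f_1^2$ are constant along solutions, that $g$ and $f_1$ are dependent (the Jacobian of $(g, f_1)$ is singular), and that the coordinate function $h(x, y) = x$ is neither. All function names are illustrative.

```python
def v(x, y):
    """Vector field of the harmonic oscillator x' = y, y' = -x."""
    return (y, -x)

def f1(x, y):
    return x * x + y * y

def grad_f1(x, y):
    return (2.0 * x, 2.0 * y)

def grad_g(x, y):
    """Gradient of g = f1**2."""
    s = 2.0 * f1(x, y)
    gx, gy = grad_f1(x, y)
    return (s * gx, s * gy)

def grad_h(x, y):
    """Gradient of the coordinate function h(x, y) = x."""
    return (1.0, 0.0)

def derivative_along_flow(grad, x, y):
    """d/dt F(phi(t, a)) at t = 0 equals grad F(a) . v(a) by the chain rule."""
    gx, gy = grad(x, y)
    vx, vy = v(x, y)
    return gx * vx + gy * vy

def jacobian_det(grad_first, x, y):
    """Determinant of the Jacobian of (F, f1); zero means the pair is dependent."""
    a, b = grad_first(x, y)
    c, d = grad_f1(x, y)
    return a * d - b * c

# Non-stationary sample points (the only stationary point is the origin).
sample_points = [(0.5, 0.2), (-1.0, 0.3), (0.7, -0.9)]
```

For instance, `derivative_along_flow(grad_g, 0.5, 0.2)` vanishes up to rounding, while `derivative_along_flow(grad_h, 0.5, 0.2)` equals `0.2`, so $h$ is not a first integral and, consistently, the pair $(h, f_1)$ has a non-singular Jacobian there.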


Remark 4.3.14. A result similar to the Rank Theorem holds also for a differentiable map $f : X \to Y$ where $X$, $Y$ are Banach spaces. The delicate question is the existence of continuous linear projections $P$ of $X$ (onto $\operatorname{Ker} f'(a)$) and $Q$ of $Y$ (onto $\operatorname{Im} f'(a)$). Such projections exist provided $f'(a)$ is a Fredholm operator, i.e., $\operatorname{Ker} f'(a)$ has finite dimension and $\operatorname{Im} f'(a)$ is a closed subspace of finite codimension in $Y$ (see page 70).

Notice that the equation $f(x) = y$ can be solved by the following procedure which is often called the Lyapunov–Schmidt Reduction: The equation $f(x) = y$ is equivalent to the pair of equations
$$y_1 := Qy = Qf(x_1 + x_2), \qquad y_2 := (I - Q)y = (I - Q)f(x_1 + x_2)$$
where
$$x = x_1 + x_2, \qquad x_2 = Px.$$
Suppose that the first equation may be solved (see footnote 5) for $x_1$ assuming $x_2$ to be fixed (looking at $x_2$ as a parameter). We obtain $x_1 = g(y_1, x_2)$. The second equation is now an equation (it is called the bifurcation equation or the alternative problem) of the form
$$(I - Q)f(x_2 + g(y_1, x_2)) = y_2 \quad \text{for an unknown } x_2.$$
If $f'(a)$ is a Fredholm map, then this equation is an equation in finite dimensional spaces: $x_2 \in \operatorname{Ker} f'(a)$, $y_2 \in Y_2$, $\dim \operatorname{Ker} f'(a) < \infty$, $\dim Y_2 = \operatorname{codim} \operatorname{Im} f'(a) < \infty$.

Notice that the Implicit Function Theorem ensures a unique local solution to the first equation for $y$ sufficiently close to $b = f(a)$. In this situation we also obtain $g'_2(b_1, a_2) = o$, i.e., the point $a_2$ is a critical point for
$$F(x_2) := (I - Q)f(x_2 + g(b_1, x_2)) - b_2.$$
The simplest case for the local study of $F$ is that
$$\operatorname{codim} \operatorname{Im} f'(a) = 1, \quad \text{i.e.,} \quad F : X_2 = \operatorname{Ker} f'(a) \to \mathbb{R}$$
(see Example 4.3.20). Notice that $\dim X_2$ is finite for $f'(a)$ being a Fredholm map.

Footnote 5. E.g., by the Implicit Function Theorem (Theorem 4.2.1) in the vicinity of a known solution $b = f(a)$ since $f'(a)$ is an isomorphism of $X_1$ onto $Y_1$ or, more generally, by an iteration process.
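The two steps of the Lyapunov–Schmidt Reduction can be traced on a toy problem. The sketch below uses the hypothetical map $f(x_1, x_2) = (x_1 + x_2^2,\ x_2^2 - x_1^2)$ on $\mathbb{R}^2$ with $a = o$, for which $f'(o)$ has kernel spanned by the second basis vector and range spanned by the first, so $Q$ and $I - Q$ simply pick the first and second component. The first equation is solved for $x_1$ by Newton's method, and the scalar bifurcation equation is then solved by bisection; the bracket $[0.5, 1.5]$ and all names are assumptions made for this example.

```python
def f(x1, x2):
    """Toy map R^2 -> R^2 with f(0) = 0 and f'(0) = [[1, 0], [0, 0]]."""
    return (x1 + x2 * x2, x2 * x2 - x1 * x1)

def solve_first_equation(y1, x2, tol=1e-14):
    """Solve Q f(x1 + x2) = y1 for x1 by Newton's method, x2 being a parameter."""
    x1 = 0.0
    for _ in range(50):
        residual = f(x1, x2)[0] - y1
        if abs(residual) < tol:
            break
        x1 -= residual / 1.0  # d/dx1 of the first component is identically 1
    return x1

def bifurcation_function(x2, y=(0.0, 0.0)):
    """F(x2) = (I - Q) f(g(y1, x2) + x2) - y2, a scalar function of x2."""
    x1 = solve_first_equation(y[0], x2)
    return f(x1, x2)[1] - y[1]

def bisect(func, lo, hi, iterations=80):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    flo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (flo < 0.0) == (fmid < 0.0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A nontrivial solution of f(x) = o found through the reduced equation.
x2_root = bisect(bifurcation_function, 0.5, 1.5)
x1_root = solve_first_equation(0.0, x2_root)
solution = (x1_root, x2_root)
```

Here the reduced function is $F(x_2) = x_2^2 - x_2^4$, which has $x_2 = 0$ as a critical point, in line with the remark that $a_2$ is critical for $F$; the nonzero root recovered by bisection corresponds to the solution $(-1, 1)$ of the full system.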


Example 4.3.15. As an application we will investigate the existence of a solution of the following boundary value problem for a system of ordinary differential equations:
$$\dot{x}(t) = f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1). \tag{4.3.1}$$
We suppose (see Theorem 2.3.4) that $f$ together with its partial derivatives with respect to the variables $x = (x_1, \dots, x_N)$ is continuous on $[0, 1] \times \mathbb{R}^N$. We know that any solution starting at $t = 0$ satisfies the integral equation
$$x(t) - x(0) = \int_0^t f(s, x(s))\,ds$$
for all $t$ from the interval of its existence. This means that $x$ satisfies the boundary value problem (4.3.1) if and only if
$$G(x_0) := \int_0^1 f(s, x(s, x_0))\,ds = o.$$
Here $x(\cdot, x_0)$ denotes the (unique) solution of $\dot{x}(t) = f(t, x(t))$ such that $x(0, x_0) = x_0$. The problem of solving the equation
$$G(x_0) = o \quad \text{for } G : \mathbb{R}^N \to \mathbb{R}^N$$
is a nontrivial topological task which we will deal with in Chapter 5. Notice that we cannot use the Implicit Function Theorem directly since there is no parameter in (4.3.1). Therefore we modify the problem by adding a multiplicative parameter $\varepsilon$ to (4.3.1), i.e., we investigate the problem
$$\dot{x}(t) = \varepsilon f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1). \tag{4.3.2}$$
Notice that for $\varepsilon = 0$ any $N$-dimensional constant $a$ solves (4.3.2). To be able to use the abstract approach described above we rewrite (4.3.2) in an operator form. To do this we define Banach spaces
$$X = \{x \in C([0, 1], \mathbb{R}^N) : x(0) = x(1)\}, \qquad Y = \{y \in C([0, 1], \mathbb{R}^N) : y(0) = o\}$$
and operators $L, N : X \to Y$:
$$Lx : t \mapsto x(t) - x(0), \qquad N(x) : t \mapsto \int_0^t f(s, x(s))\,ds, \quad t \in [0, 1].$$
Then the system (4.3.2) is equivalent to the operator equation
$$G(x, \varepsilon) := Lx - \varepsilon N(x) = o. \tag{4.3.3}$$


The operator $L$ is linear and continuous, therefore differentiable:
$$L'(x)h = Lh \quad \text{for } h \in X.$$
The operator $N$ is also continuously differentiable and
$$N'(x)h : t \mapsto \int_0^t f'_2(s, x(s))h(s)\,ds, \quad t \in [0, 1],\ h \in X.$$
Check this expression yourself, see also Example 3.2.21. This means that $G'_1(a, 0)h = Lh$ is not injective and $X_2 := \operatorname{Ker} L$ consists of $N$-dimensional constant functions. Moreover,
$$Y_1 := \operatorname{Im} L = \{y \in Y : y(1) = y(0) = o\}.$$
There are continuous linear projections $P$, $Q$ onto the closed subspaces $X_2$ and $Y_1$, respectively, given by
$$Px : t \mapsto x(0), \qquad Qy : t \mapsto y(t) - ty(1).$$
Having the decompositions
$$X = X_1 \oplus X_2, \qquad Y = Y_1 \oplus Y_2,$$
we can use the Lyapunov–Schmidt Reduction, i.e.,
$$x = x_1 + a, \qquad x_1 \in X_1, \quad a \in X_2,$$
solves (4.3.3) if and only if it solves the pair of equations
$$G^1(x_1, a, \varepsilon) := Lx_1 - \varepsilon QN(x_1 + a) = o, \tag{4.3.4}$$
$$G^2(x_1, a, \varepsilon) := (I - Q)N(x_1 + a) = o. \tag{4.3.5}$$
Since
$$G^1(o, a, 0) = o \quad \text{and} \quad (G^1)'_1(o, a, 0)h = Lh$$
is an isomorphism of $X_1$ onto $Y_1$ (it is both injective and surjective), the inverse is continuous by the Open Mapping Theorem (Theorem 2.1.8). The Implicit Function Theorem yields a solution $x_1 := \varphi(b, \varepsilon)$ of (4.3.4) in a neighborhood of $(a, 0)$ for a given $a \in X_2$. We also have
$$\varphi(a, 0) = o \quad \text{and} \quad \varphi'_1(a, 0) = o$$
(check it again). This means that it is sufficient to solve
$$H(b, \varepsilon) := (I - Q)N(\varphi(b, \varepsilon) + b) = o$$


with respect to $b$. Since $\dim X_2 = \dim Y_2 = N < \infty$ and $H : X_2 \times \mathbb{R} \to Y_2$ we can try to use the Implicit Function Theorem once more. To this end we need an $\tilde{a} \in X_2$ for which
$$\bigl[(I - Q)N(\tilde{a})\bigr](t) = t\int_0^1 f(s, \tilde{a})\,ds = o, \quad \text{i.e.,} \quad \int_0^1 f(s, \tilde{a})\,ds = o,$$
and the equation
$$\bigl[(I - Q)N'(\tilde{a})d\bigr](t) = t\int_0^1 f'_2(s, \tilde{a})\,ds\; d = tc$$
has a unique solution for every $c \in \mathbb{R}^N$. The last requirement means that the $N \times N$ matrix $\int_0^1 f'_2(s, \tilde{a})\,ds$ has to be regular.
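The two conditions reached at the end of the example can be tried out on a scalar model problem. The sketch below assumes $N = 1$ and the hypothetical right-hand side $f(t, x) = \cos 2\pi t + 1 - x^2$, for which $\int_0^1 f(s, a)\,ds = 1 - a^2$ vanishes at $\tilde{a} = 1$ and $\int_0^1 \partial f/\partial x\,(s, \tilde{a})\,ds = -2 \neq 0$. It then looks for an initial value $x_0$ with $x(1; x_0) = x_0$ for small $\varepsilon$, using a classical Runge–Kutta integrator and bisection; step counts and the bracket $[0.5, 1.5]$ are arbitrary choices.

```python
import math

def f(t, x):
    """Hypothetical scalar right-hand side (N = 1), chosen only for illustration."""
    return math.cos(2.0 * math.pi * t) + 1.0 - x * x

def flow_to_one(x0, eps, steps=200):
    """Integrate x' = eps * f(t, x) from t = 0 to t = 1 with classical RK4."""
    h = 1.0 / steps
    x = x0
    for i in range(steps):
        t = i * h
        k1 = eps * f(t, x)
        k2 = eps * f(t + 0.5 * h, x + 0.5 * h * k1)
        k3 = eps * f(t + 0.5 * h, x + 0.5 * h * k2)
        k4 = eps * f(t + h, x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def periodic_initial_value(eps, lo=0.5, hi=1.5, iterations=60):
    """Bisection on x0 -> x(1; x0) - x0, whose zeros give x(0) = x(1).

    Assumes eps > 0 is small enough that the mismatch changes sign on [lo, hi].
    """
    def mismatch(x0):
        return flow_to_one(x0, eps) - x0
    flo = mismatch(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        fmid = mismatch(mid)
        if (flo < 0.0) == (fmid < 0.0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_tilde = 1.0  # the integral of f(s, a) over [0, 1] equals 1 - a^2, zero at a = 1
x0_small = periodic_initial_value(0.01)
x0_large = periodic_initial_value(0.1)
```

In this model the computed initial values stay close to $\tilde{a} = 1$ and move toward it as $\varepsilon$ decreases, which is the behavior one expects from a branch $\varepsilon \mapsto x(\cdot, \varepsilon)$ starting at the constant $\tilde{a}$.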

To summarize the considerations of the previous example, we get the following conclusion.

Proposition 4.3.16. Let $f = (f^1, \dots, f^N) : [0, 1] \times \mathbb{R}^N \to \mathbb{R}^N$ be continuous and have continuous partial derivatives $\frac{\partial f^i}{\partial x_j}$ ($i, j = 1, \dots, N$). Let the function $f$ satisfy the conditions
$$\int_0^1 f(s, \tilde{a})\,ds = o, \qquad \det\left(\int_0^1 \frac{\partial f^i}{\partial x_j}(s, \tilde{a})\,ds\right) \neq 0$$
for a certain constant $\tilde{a} \in \mathbb{R}^N$. Then there exist $\delta > 0$ and a differentiable map $\varepsilon \mapsto x(\cdot, \varepsilon)$, $|\varepsilon| < \delta$, such that $x(\cdot, 0) = \tilde{a}$ and the functions $x(\cdot, \varepsilon)$ satisfy the boundary value problem (4.3.2).

Remark 4.3.17. Let us make some remarks on this result. If the function $f$ in (4.3.1) is 1-periodic in the variable $t$, then $x$ is a solution of (4.3.1) if and only if
$$\tilde{x}(t) = x(t - n), \qquad n = [t], \quad t \in \mathbb{R},$$
is a 1-periodic solution of $\dot{x} = f(t, x)$. Only technical difficulties appear when one generalizes the approach just described to a more general equation
$$\dot{x}(t) = A(t)x + \varepsilon f(t, x)$$
with more general boundary conditions
$$Bx(0) - Cx(1) = o$$
($B$, $C$ are $N \times N$ matrices). Notice also that having a result for a system of differential equations we can investigate boundary value problems for second order equations. For example, we put
$$x = \begin{pmatrix} y \\ \dot{y} \end{pmatrix}, \quad f(t, x) = \begin{pmatrix} 0 \\ g(t, y, \dot{y}) \end{pmatrix}, \quad B = \begin{pmatrix} b_1 & b_2 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 0 \\ c_1 & c_2 \end{pmatrix}$$
to rewrite
$$\ddot{y}(t) = a(t)y(t) + \varepsilon g(t, y(t), \dot{y}(t)), \quad t \in (0, 1), \qquad b_1 y(0) + b_2 \dot{y}(0) = 0, \quad c_1 y(1) + c_2 \dot{y}(1) = 0$$
into the form
$$\dot{x}(t) = \begin{pmatrix} 0 & 1 \\ a(t) & 0 \end{pmatrix} x(t) + \varepsilon f(t, x(t)), \quad t \in (0, 1), \qquad Bx(0) + Cx(1) = o.$$
Many other examples of the use of the Implicit Function Theorem can be found in Vejvoda et al. [130]. We will return to the problem (4.3.1) in Example 5.2.18.

We now turn to the study of the behavior of a differentiable function in the vicinity of a critical point. We recommend that the reader consider the cases $f(x) = x^n$,

$n > 1$, and $f(x) = \sum_{i,j=1}^{n} a_{ij}x_i x_j$, $a_{ij} = a_{ji}$, first.

Definition 4.3.18. Let $G$ be an open set in a Banach space $X$, $f : X \to \mathbb{R}$, $f \in C^2(G)$. A critical point $a \in G$ of $f$ is said to be non-degenerate if for any $h \in X$, $h \neq o$, the linear form $f''(a)(h, \cdot)$ does not vanish.

The following basic result holds also in a Hilbert space but its finite dimensional version is more transparent.

Theorem 4.3.19 (Morse). Let $G$ be an open set in $\mathbb{R}^M$, $f : \mathbb{R}^M \to \mathbb{R}$, $f \in C^2(G)$. Let $a \in G$ be a non-degenerate critical point of $f$. Then there exists a diffeomorphism $\varphi$ of a neighborhood $U$ of $a$ onto a neighborhood $V$ of $o \in \mathbb{R}^M$ such that for $x \in U$, $y = \varphi(x)$, the function $f$ can be expressed in the form
$$f(x) = f(a) + \frac{1}{2}\sum_{i=1}^{M} \lambda_i y_i^2$$
where $\lambda_1, \dots, \lambda_M$ are the eigenvalues of the symmetric matrix $f''(a)$.

Proof. We identify a bilinear operator with its matrix representation in the standard basis in $\mathbb{R}^M$ (Remark 3.2.29(ii)) and denote the collection of all $M \times M$ matrices by $\mathcal{M}$. Then we can write $B(x)(x - a, x - a) := (B(x)(x - a), x - a)_{\mathbb{R}^M}$. We choose a norm $\|\cdot\|_{M \times M}$ on $\mathcal{M}$ and keep it fixed throughout the proof. The subset of $\mathcal{M}$ consisting of symmetric matrices is denoted by $\mathcal{S}$. We also denote by $\mathcal{F}$ and $\mathcal{F}_S$ the sets of all bounded continuous maps of $G$ into $\mathcal{M}$ and $\mathcal{S}$, respectively. The space $\mathcal{F}$ equipped with the norm
$$\|A\|_{\mathcal{F}} := \sup_{x \in G} \|A(x)\|_{M \times M}$$
is a Banach space and $\mathcal{F}_S$ is its closed subspace. Without loss of generality we can assume that $G$ is a convex neighborhood of the point $a$ so small that $f''$ is bounded on $G$.

After these preliminaries we start with the proof. Since $f'(a) = o$, the Taylor Formula (Proposition 3.2.27) gives
$$f(x) = f(a) + \int_0^1 (1 - t) f''(a + t(x - a))(x - a, x - a)\,dt = f(a) + B(x)(x - a, x - a)$$
with
$$B(x) := \int_0^1 (1 - t) f''(a + t(x - a))\,dt$$
(the Riemann integral of a function with values in $\mathbb{R}^{M \times M}$). Note that we have $B(\cdot) \in \mathcal{F}_S$. Our aim is to show that we can choose $C(\cdot) \in \mathcal{F}$ such that
$$B(x) = C^*(x)JC(x)$$
where $J$ is the canonical form of $B(a) = \frac{1}{2} f''(a)$ (see footnote 6), i.e.,
$$J = \frac{1}{2}\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_M \end{pmatrix}.$$
Here $C^*$ stands for the adjoint matrix to $C$, i.e., $C^* = (c_{ji})$ provided $C = (c_{ij})$. The transformation of coordinates $y = C(x)(x - a)$ then yields
$$f(x) = f(a) + (J(y), y)_{\mathbb{R}^M} = f(a) + \frac{1}{2}\sum_{i=1}^{M} \lambda_i y_i^2.$$

Footnote 6. A symmetric matrix has a diagonal canonical form, see Proposition 6.3.8.

To achieve this goal we will use the Implicit Function Theorem (Theorem 4.2.1). We put
$$\Phi(B, C) = C^*(x)JC(x) - B(x) : \mathcal{F}_S \times \mathcal{F} \to \mathcal{F}_S.$$
In particular,
$$\Phi(B(a), T) = T^*JT - B(a) = o,$$


provided $T$ is a unitary matrix which transforms $B(a)$ into its canonical form $J$. Put $A := JT$. The partial differential of $\Phi$ with respect to the second variable has the form
$$\Phi'_2(B, C)M : x \mapsto M^*(x)JC(x) + C^*(x)JM(x), \qquad x \in G.$$
Then
$$\operatorname{Ker} \Phi'_2(B(a), T) = \{M \in \mathcal{F} : M^*(\cdot)A + A^*M(\cdot) = o\}$$
and
$$Q : M \mapsto \frac{1}{2}\left(M - (A^*)^{-1}M^*A\right)$$
is a continuous linear projection of $\mathcal{F}$ onto $\operatorname{Ker} \Phi'_2(B(a), T)$. By the assumption on the point $a$, $J$ is injective. Further, $T$ is a unitary matrix, i.e., $T^* = T^{-1}$. This means that $(A^*)^{-1} = J^{-1}T$ exists. It can be seen that $I - Q$ is a projection onto $\mathcal{F}_1 := (A^*)^{-1}(\mathcal{F}_S)$. The partial differential $\Phi'_2(B(a), T)$ is an isomorphism of $\mathcal{F}_1$ onto $\mathcal{F}_S$. Namely,
$$M := \frac{1}{2}J^{-1}TS \in \mathcal{F}_1 \quad \text{and} \quad \Phi'_2(B(a), T)M = S \in \mathcal{F}_S.$$
We can now apply the Implicit Function Theorem to $\Phi : \mathcal{F}_S \times \mathcal{F}_1 \to \mathcal{F}_S$ ($T \in \mathcal{F}_1$) and obtain positive numbers $\varepsilon$ and $\delta$ such that for any $B \in \mathcal{F}_S$, $\|B(\cdot) - B(a)\|_{\mathcal{F}} < \varepsilon$, there is a unique $C \in \mathcal{F}_1$, $\|C(\cdot) - T\|_{\mathcal{F}} < \delta$, for which
$$\Phi(B, C) = C^*(x)JC(x) - B(x) = o \quad \text{for all } x \in G.$$
To finish the proof we have to show that there is a neighborhood $U$ of $a$ such that
$$\|B(x) - B(a)\|_{\mathcal{F}} < \varepsilon \quad \text{for all } x \in U.$$
By the definition of $B$,
$$\|B(\cdot) - B(a)\|_{\mathcal{F}} = \sup_{x \in G}\left\|\int_0^1 (1 - t)\left[f''(a + t(x - a)) - f''(a)\right]dt\right\|_{M \times M} \leq \frac{1}{2}\sup_{x \in G}\|f''(x) - f''(a)\|_{M \times M}.$$
This means that we can find the desired neighborhood $U$.
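In one dimension the objects in the proof can be computed explicitly, which may help to see what the construction does. The sketch below takes the hypothetical function $f(x) = x^2 + x^3$ with the non-degenerate critical point $a = 0$, evaluates $B(x) = \int_0^1 (1 - t) f''(tx)\,dt$ by Simpson's rule, takes the scalar solution $C(x) = \sqrt{B(x)/J}$ of $C^*JC = B$ with $C(a) = 1$, and checks that $y = C(x)(x - a)$ turns $f$ into $f(a) + \frac{1}{2}\lambda y^2$. The choice of $f$, the single-panel Simpson rule and the names are assumptions for illustration only.

```python
import math

def f(x):
    return x * x + x ** 3

def f_second(x):
    return 2.0 + 6.0 * x

def B(x, a=0.0):
    """B(x) = integral over [0, 1] of (1 - t) f''(a + t (x - a)) dt.

    Simpson's rule with one panel is exact here because the integrand is a
    quadratic polynomial in t for this particular f.
    """
    def integrand(t):
        return (1.0 - t) * f_second(a + t * (x - a))
    return (integrand(0.0) + 4.0 * integrand(0.5) + integrand(1.0)) / 6.0

LAMBDA = f_second(0.0)   # the single eigenvalue of f''(a), a = 0
J = 0.5 * LAMBDA         # canonical form of B(a) = f''(a) / 2

def C(x):
    """Scalar solution of C(x) * J * C(x) = B(x) with C(a) = 1 (needs B(x) > 0)."""
    return math.sqrt(B(x) / J)

def morse_coordinate(x):
    return C(x) * x      # y = C(x)(x - a) with a = 0

xs = [-0.5 + 0.05 * i for i in range(21)]
ys = [morse_coordinate(x) for x in xs]
```

For this $f$ one gets $B(x) = 1 + x$, so $y = x\sqrt{1 + x}$ and $f(x) = y^2$ near the origin; the values `ys` are increasing on the sampled interval, consistent with $x \mapsto y$ being a local diffeomorphism.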

Example 4.3.20. Let $X$, $Y$ be Banach spaces and let $f : X \to Y$. Consider the equation
$$f(x) = o \tag{4.3.6}$$
in the vicinity of a known solution $x = a$. Let $f$ be a $C^2$-mapping in a neighborhood of $a$. Suppose that $f'(a)$ is a Fredholm operator (Remark 4.3.14) and, moreover, that the above equation can be reduced to the bifurcation equation
$$(I - Q)f(g(x_2) + x_2) = o.$$
Here $Q$ is a projection of $Y$ onto $\operatorname{Im} f'(a)$, $X = X_1 \oplus \operatorname{Ker} f'(a)$ and $g(x_2)$ is a (unique) solution of
$$Qf(x_1 + x_2) = o \quad \text{for } x_2 \in \operatorname{Ker} f'(a),$$
and $x_2$ is in a neighborhood of $a_2 \in \operatorname{Ker} f'(a)$ ($a = a_1 + a_2$). We also assume that this $g$ is given by the Implicit Function Theorem. In particular, this means that $g'(a_2) = o$. Suppose now that
$$\operatorname{codim} \operatorname{Im} f'(a) = 1,$$
i.e., $I - Q$ is a projection onto a 1-dimensional subspace $Y_2$ of $Y$. Let $Y_2 = \operatorname{Lin}\{y_2\}$. By Corollary 2.1.18 and Remark 2.1.19, there is $\varphi \in Y^*$, $\varphi \in [\operatorname{Im} f'(a)]^{\perp}$, and we may assume that $\varphi(y_2) = 1$. In other words, $(I - Q)y = \varphi(y)y_2$, and the bifurcation equation has the form
$$F(x_2) := \varphi(f(g(x_2) + x_2)) = 0.$$
We have
$$F'(a_2)h = \varphi[f'(a)(g'(a_2)h + h)] = 0, \qquad h \in \operatorname{Ker} f'(a),$$
i.e., $a_2$ is a critical point of $F$. Further,
$$F''(a_2)(h, k) = \varphi[f''(a)(g'(a_2)h + h, g'(a_2)k + k)] + \varphi[f'(a)(g''(a_2)(h, k))] = \varphi[f''(a)(h, k)]$$
since
$$\varphi \circ f'(a) = 0 \quad \text{and} \quad g'(a_2) = o.$$
If, for example,
$$\dim \operatorname{Ker} f'(a) = 2$$
(this can occur for $f : \mathbb{R}^{N+1} \to \mathbb{R}^N$) and the matrix of $F''(a_2)$ is regular, i.e., $a_2$ is a non-degenerate critical point of $F$, then after a suitable transformation of coordinates we get
$$F(x_2) = \frac{1}{2}(\lambda_1 \xi^2 + \lambda_2 \eta^2)$$
(the Morse Theorem) and the following conclusion: If $\operatorname{sgn} \lambda_1 = \operatorname{sgn} \lambda_2$, then the equation (4.3.6) has an isolated solution $x = a$; if $\operatorname{sgn} \lambda_1 = -\operatorname{sgn} \lambda_2$, then there are two curves of solutions given by
$$\xi = \pm\sqrt{-\frac{\lambda_2}{\lambda_1}}\;\eta.$$
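The final dichotomy of the example is easy to encode. The sketch below is only an illustration of the normal form $F = \frac{1}{2}(\lambda_1 \xi^2 + \lambda_2 \eta^2)$ in the Morse coordinates; the function name and the convention of returning the slopes $\xi/\eta$ are hypothetical.

```python
import math

def local_zero_set(lambda1, lambda2):
    """Describe the zeros of F = (lambda1 * xi^2 + lambda2 * eta^2) / 2 near o.

    Both eigenvalues are assumed nonzero (non-degenerate critical point).
    Returns an empty list for an isolated zero, otherwise the two slopes
    xi / eta of the solution lines in the Morse coordinates.
    """
    if lambda1 == 0.0 or lambda2 == 0.0:
        raise ValueError("degenerate critical point")
    if (lambda1 > 0.0) == (lambda2 > 0.0):
        return []
    slope = math.sqrt(-lambda2 / lambda1)
    return [slope, -slope]

def normal_form(lambda1, lambda2, xi, eta):
    return 0.5 * (lambda1 * xi * xi + lambda2 * eta * eta)
```

For example, `local_zero_set(2.0, -8.0)` gives the slopes `2.0` and `-2.0`, and `normal_form` vanishes along both lines, while equal signs give an empty list, i.e., an isolated solution.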

The previous example can be generalized. The following problem is a standard one in the bifurcation theory: A differentiable map $f : \mathbb{R} \times X \to Y$ is given where $X$, $Y$ are Banach spaces (see footnote 7). A smooth curve $x = \alpha(\lambda)$, $\lambda \in (-\delta, \delta)$, of solutions of the equation
$$f(\lambda, x) = o \tag{4.3.7}$$
is known. After the transformation $\xi = x - \alpha(\lambda)$, we can suppose that
$$f(\lambda, o) = o \tag{4.3.8}$$
for $\lambda$ in a neighborhood of (e.g.) $0 \in \mathbb{R}$.

Footnote 7. The set of parameters $\mathbb{R}$ can be replaced by a normed linear space in general.

Definition 4.3.21. Let (4.3.8) be satisfied for the equation (4.3.7). The point $(0, o) \in \mathbb{R} \times X$ is called a bifurcation point provided in any neighborhood of $(0, o)$ there is a solution $(\lambda_0, x_0)$ of (4.3.7) such that $x_0 \neq o$.

Notice that whenever $f$ is differentiable in a neighborhood $U$ of $(0, o)$ and $f'_2(0, o)$ is an isomorphism, then $(0, o)$ is not a bifurcation point (the Implicit Function Theorem). In order to find a sufficient condition for bifurcation suppose that $f \in C^2(U)$ and $A = f'_2(0, o)$ is not an isomorphism. More precisely, let $\operatorname{Ker} A$ be nontrivial, i.e., let 0 be an eigenvalue of $A$. The simplest case occurs when 0 is a simple eigenvalue, i.e.,
$$\operatorname{Ker} A = \operatorname{Lin}\{x_0\}, \qquad x_0 \neq o.$$
The following result is a classical one (see Crandall & Rabinowitz [29]).

Theorem 4.3.22 (Local Bifurcation Theorem). Let $X$, $Y$ be Banach spaces, $f : \mathbb{R} \times X \to Y$ a twice continuously differentiable map on a neighborhood of $(0, o)$. Let $f$ satisfy the assumptions
(i) $f(\lambda, o) = o$ for all $\lambda \in (-\delta, \delta)$ for some $\delta > 0$,
(ii) $\dim \operatorname{Ker} f'_2(0, o) = \operatorname{codim} \operatorname{Im} f'_2(0, o) = 1$,
(iii) if $f'_2(0, o)x_0 = o$, $x_0 \neq o$, then $f''_{1,2}(0, o)(1, x_0) \notin \operatorname{Im} f'_2(0, o)$.
Denote by $X_1$ the topological complement (see footnote 8) of $\operatorname{Ker} f'_2(0, o)$ in $X$. Then there is a $C^1$-curve $(\varphi, \psi) : (-\eta, \eta) \to \mathbb{R} \times X_1$ (for some $\eta > 0$) such that
$$\varphi(0) = 0, \qquad \psi(0) = o, \qquad f(\varphi(t), t(x_0 + \psi(t))) = o.$$
Moreover, there is a neighborhood $U$ of $(0, o)$ in $\mathbb{R} \times X$ such that
$$f(\lambda, x) = o \quad \text{for } (\lambda, x) \in U$$
if and only if either $x = o$ or
$$\lambda = \varphi(t), \qquad x = t(x_0 + \psi(t)) \quad \text{for a certain } t$$
(see Figure 4.3.10). Such a picture is called a bifurcation diagram.

[Figure 4.3.10. Bifurcation diagram in $\mathbb{R} \times X$: the trivial branch along the $\mathbb{R}$-axis and the curve $(\varphi(t), t(x_0 + \psi(t)))$ through $(0, o)$, inside the neighborhood $U$.]

Proof. We will give two proofs. The first one, for the finite dimensional case $X = Y = \mathbb{R}^M$, is based on the Morse Theorem. The second one, which is due to M. Crandall and P. Rabinowitz, is based on the Implicit Function Theorem and will be only sketched.

The first proof. We choose $\omega \in Y^* = \mathbb{R}^M$, $\omega \neq o$, such that
$$y \in \operatorname{Im} f'_2(0, o) \quad \text{if and only if} \quad \omega(y) = 0.$$
Using the Lyapunov–Schmidt Reduction (Remark 4.3.14) we obtain a map $g(\lambda, t) : \mathbb{R}^2 \to X_1$ such that the equation $f(\lambda, x) = o$ is locally equivalent to the equation
$$F(\lambda, t) := \omega[f(\lambda, tx_0 + g(\lambda, t))] = 0.$$
We now show that $(0, 0) \in \mathbb{R}^2$ is a non-degenerate critical point of $F$. To do this we need to compute $F'(0, 0)$ and $F''(0, 0)$. Since
$$f'_1(\lambda, o) = o, \qquad f''_{1,1}(\lambda, o) = o \quad \text{for } \lambda \in (-\delta, \delta)$$
(assumption (i)) and
$$g(\lambda, 0) = o, \qquad g'_2(0, 0) = o$$
(see Remark 4.3.14), we have
$$F'(0, 0) = 0 \quad \text{and also} \quad F''_{1,1}(0, 0) = 0.$$
Further,
$$F'_2(\lambda, 0) = \omega[f'_2(\lambda, o)(x_0 + g'_2(\lambda, 0))].$$
Therefore, by (iii),
$$\beta := F''_{1,2}(0, 0) = F''_{2,1}(0, 0) = \omega[f''_{1,2}(0, o)(1, x_0) + f'_2(0, o)g''_{1,2}(0, 0)] = \omega[f''_{1,2}(0, o)(1, x_0)] \neq 0$$
since $\omega(z) = 0$ for every $z \in \operatorname{Im} f'_2(0, o)$. If we denote $\alpha := F''_{2,2}(0, 0)$, we obtain the matrix representation of $F''(0, 0)$ in the form
$$\begin{pmatrix} 0 & \beta \\ \beta & \alpha \end{pmatrix}.$$
This matrix has determinant $-\beta^2 < 0$ and hence eigenvalues of different signs. The rest of the proof follows by applying the Morse Theorem (see also Example 4.3.20).

Footnote 8. Let $X = X_1 \oplus X_2$ and let the corresponding projection $P$ of $X$ onto $X_1$ be continuous. Then $X_1$ is called a topological complement of $X_2$ and vice versa.

The second proof proceeds by using the Implicit Function Theorem for the function $\Phi : \mathbb{R} \times \mathbb{R} \times X_1 \to Y$ defined by
$$\Phi(\lambda, t, x_1) = \begin{cases} \dfrac{1}{t} f(\lambda, t(x_0 + x_1)) & \text{for } t \neq 0, \\[1ex] f'_2(\lambda, o)(x_0 + x_1) & \text{for } t = 0. \end{cases}$$

Notice that Φ(0, 0, o) = o, and (λ, h) → Φ 1 (0, 0, o)λ + Φ 3 (0, 0, o)h is an isomorphism of R × X1 onto Y (assumptions (ii) and (iii)). For details see Crandall & Rabinowitz [29]. Example 4.3.23. The following two functions oﬀer very simple illustrative examples: f (λ, x) = λx − x2 , g(λ, x) = λx − x3 . Their bifurcation diagrams are shown in Figures 4.3.11 and 4.3.12. We use these functions to point out the typical examples of the changing of stability of a diﬀerential equation when the so-called non-hyperbolic stationary point is crossed.9 In these ﬁgures branches of stationary solutions of equations x˙ = f (λ, x),

x˙ = g(λ, x)

are shown with an indication of their stability (s for stable, u for unstable).

g

stationary point a ∈ RM is called hyperbolic for the equation x˙ = f (x) provided f (a) = o and σ(f (a)) ∩ iR = ∅. See also footnote 3 on page 154. 9A

Figure 4.3.11. Transcritical bifurcation (horizontal axis $\lambda$, vertical axis $x$; branches through $(0, 0)$ are marked s or u)

Figure 4.3.12. Pitchfork bifurcation (horizontal axis $\lambda$, vertical axis $x$; branches through $(0, 0)$ are marked s or u)
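The stability labels of Example 4.3.23 can be reproduced by a small computation. The sketch below is only an illustration: it uses the standard one-dimensional criterion that a stationary point $x^*$ of $\dot x = f(\lambda, x)$ is stable if $\partial f/\partial x(\lambda, x^*) < 0$ and unstable if it is positive, and all function names are assumptions introduced here.

```python
# Illustrative sketch (names are hypothetical, not from the text).
# Stationary points of x' = lam*x - x**2 and x' = lam*x - x**3 and their
# stability, decided by the sign of the derivative with respect to x.

def stationary_points_transcritical(lam):
    # lam*x - x**2 = 0  <=>  x = 0 or x = lam
    return sorted({0.0, float(lam)})

def stationary_points_pitchfork(lam):
    # lam*x - x**3 = 0  <=>  x = 0 or x = +-sqrt(lam) (only for lam > 0)
    points = {0.0}
    if lam > 0:
        root = lam ** 0.5
        points |= {root, -root}
    return sorted(points)

def df_transcritical(lam, x):
    return lam - 2.0 * x

def dg_pitchfork(lam, x):
    return lam - 3.0 * x * x

def classify(derivative):
    if derivative < 0:
        return "s"               # stable
    if derivative > 0:
        return "u"               # unstable
    return "non-hyperbolic"      # linearization gives no information

def diagram(points, derivative, lam):
    return [(x, classify(derivative(lam, x))) for x in points(lam)]

if __name__ == "__main__":
    for lam in (-1.0, 0.0, 1.0):
        print(lam, diagram(stationary_points_transcritical, df_transcritical, lam))
        print(lam, diagram(stationary_points_pitchfork, dg_pitchfork, lam))
```

For $\lambda \neq 0$ the trivial branch and the bifurcating branch have opposite labels in the transcritical case, while at $\lambda = 0$ the point $x = 0$ is non-hyperbolic for both equations.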

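Example 4.3.24. We wish to find a nontrivial $2\pi$-periodic solution of the nonlinear pendulum equation
$$\ddot x(t) + \lambda \sin x(t) = 0. \tag{4.3.9}$$
We put $f(\lambda, x) : t \mapsto \ddot x(t) + \lambda \sin x(t)$, and
$$X = \{x \in C^2(\mathbb{R}) : x \text{ is } 2\pi\text{-periodic}\}, \qquad \|x\|_X = \max_{t \in [0, 2\pi]} |x(t)| + \max_{t \in [0, 2\pi]} |\dot x(t)| + \max_{t \in [0, 2\pi]} |\ddot x(t)|,$$
$$Y = \{y \in C(\mathbb{R}) : y \text{ is } 2\pi\text{-periodic}\}, \qquad \|y\|_Y = \max_{t \in [0, 2\pi]} |y(t)|.$$
It is easy to show that
$$f'_2(\lambda, o)h = \ddot h(t) + \lambda h$$
and, therefore, $\operatorname{Ker} f'_2(\lambda, o)$ is nontrivial if and only if $\lambda = n^2$, $n \in \mathbb{N} \cup \{0\}$, and
$$\operatorname{Ker} f'_2(0, o) = \{\text{constant functions}\}, \qquad \operatorname{Ker} f'_2(n^2, o) = \operatorname{Lin}\{\sin nt, \cos nt\} \quad \text{for } n \in \mathbb{N}.$$
In the former case, i.e., $n = 0$, we can apply Theorem 4.3.22. Since
$$\operatorname{Im} f'_2(0, o) = \left\{ y \in Y : \int_0^{2\pi} y(s)\, ds = 0 \right\} \qquad \text{and} \qquad f''_{1,2}(0, o)(1, c) = c \quad \text{for } c \in \operatorname{Ker} f'_2(0, o),$$
the assumptions of Theorem 4.3.22 are satisfied. What can we do in the latter case, when $n \in \mathbb{N}$ and the dimension of $\operatorname{Ker} f'_2(n^2, o)$ is equal to 2? Although Theorem 4.3.22 cannot be used, we may still proceed with the Lyapunov–Schmidt Reduction: Denote $A := f'_2(n^2, o)$, $Y = \operatorname{Im} A \oplus Z$,
$$(I - Q)y : t \mapsto \frac{\sin nt}{\pi} \int_0^{2\pi} y(s) \sin ns\, ds + \frac{\cos nt}{\pi} \int_0^{2\pi} y(s) \cos ns\, ds, \qquad y \in Y.$$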


Then $I - Q$ is the projection onto $Z$ such that $\operatorname{Ker}(I - Q) = \operatorname{Im} A$. Similarly, let
$$X = \operatorname{Ker} A \oplus V \qquad \text{where } V = \{v \in X : (I - Q)v = o\}.$$
The operator $f$ can be expressed by
$$f(\mu + n^2, u + v) = Av + \mu(u + v) + h(\mu, u, v), \qquad u \in \operatorname{Ker} A, \ v \in V,$$
where
$$h(\mu, u, v) = \mu[\sin(u + v) - (u + v)] + n^2[\sin(u + v) - (u + v)].$$
Because of this special form of $h$ we will try to find a solution of (4.3.9) in the form $x = \mu(u + v)$. The equality $f(\mu + n^2, \mu(u + v)) = 0$ holds if and only if
$$Av + \mu(u + v) + \mu^2 g(\mu, u, v) = 0 \tag{4.3.10}$$
where
$$g(\mu, u, v) = \begin{cases} \dfrac{h(\mu, \mu u, \mu v)}{\mu^3}, & \mu \neq 0, \\[2mm] -\dfrac{n^2}{6}(u + v)^3, & \mu = 0. \end{cases}$$
For solving (4.3.10) we use the Lyapunov–Schmidt Reduction. According to it, the equation (4.3.10) is equivalent to the following pair of equations:
$$0 = Av + \mu Q(u + v) + \mu^2 Q g(\mu, u, v) \quad (= Av + \mu v + \mu^2 Q g(\mu, u, v)),$$
$$0 = (I - Q)(u + v) + \mu (I - Q) g(\mu, u, v) \quad (= u + \mu (I - Q) g(\mu, u, v)).$$
By the Implicit Function Theorem, the first equation has a unique solution $v = \varphi(\mu, u)$ in a neighborhood of the point $(0, u^*, o)$ for any $u^*$. We insert $\varphi$ into the bifurcation equation, obtaining
$$\Phi(\mu, u) = u + \mu (I - Q) g(\mu, u, \varphi(\mu, u)) = 0. \tag{4.3.11}$$
Since $\Phi(0, u^*) = u^*$, we take $u^* = o$ and solve (4.3.11) in a neighborhood of $(0, o)$. This can be done with the help of the Implicit Function Theorem since $\Phi'_2(0, o)$ is an isomorphism of $\operatorname{Ker} A$ onto $V$. Denoting this solution by $u = \omega(\mu)$ we come to the following conclusion: Any point $(n^2, o)$ is a bifurcation point of the equation (4.3.9) and a nontrivial branch of $2\pi$-periodic solutions of (4.3.9) has the form
$$x = \mu(\omega(\mu) + \varphi(\mu, \omega(\mu))), \qquad \mu \in (-\delta, \delta).$$
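The projection $I - Q$ used in Example 4.3.24 can be checked numerically. The following sketch is an illustration only: the helper names, the trapezoidal quadrature and the sample functions are assumptions introduced here. It confirms on examples that $I - Q$ keeps the $\sin nt$, $\cos nt$ components of $y$ and annihilates an element of $\operatorname{Im} A$.

```python
import math

# Illustrative sketch (helper names are hypothetical, not from the text).

def integrate(func, a=0.0, b=2.0 * math.pi, steps=2000):
    # Composite trapezoidal rule on [a, b].
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

def complement_projection(y, n):
    # Returns the function (I - Q)y for a given integer n >= 1.
    s = integrate(lambda t: y(t) * math.sin(n * t)) / math.pi
    c = integrate(lambda t: y(t) * math.cos(n * t)) / math.pi
    return lambda t: s * math.sin(n * t) + c * math.cos(n * t)

n = 2
# y has a sin(2t) component plus parts orthogonal to sin(2t), cos(2t).
y = lambda t: math.sin(2 * t) + 3.0 * math.cos(5 * t) + 1.0
projected = complement_projection(y, n)

# A v for v(t) = cos(5t): v'' + n^2 v = (n^2 - 25) cos(5t), an element of Im A.
Av = lambda t: (n * n - 25.0) * math.cos(5 * t)
projected_Av = complement_projection(Av, n)

if __name__ == "__main__":
    print(projected(0.7), math.sin(1.4))
    print(projected_Av(0.7))
```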


The reader is invited to generalize this procedure to obtain sufficient conditions for a bifurcation for the equation $f(\lambda, x) = o$ assuming that
$$f(\lambda, o) = o \ \text{ for } |\lambda - \lambda^*| < \delta, \qquad f \in C^2(U), \qquad \dim \operatorname{Ker} f'_2(\lambda^*, o) = \operatorname{codim} \operatorname{Im} f'_2(\lambda^*, o) = 2$$
where $U$ is a neighborhood of $(\lambda^*, o)$. We notice that no uniqueness of the bifurcation branch was proved even in our concrete example. Compare this with the assertion given in Theorem 4.3.22. This is due to our special choice of the form of the bifurcation branch, namely $x = \mu(u + v)$.

Example 4.3.25 (Application of Theorem 4.3.22). We will study the bifurcation points of the periodic problem
$$\begin{cases} \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t), \dot x(t)) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \end{cases} \tag{4.3.12}$$
In this example we will concentrate on the point $\lambda = 0$, which is an eigenvalue of the associated eigenvalue problem
$$\begin{cases} \ddot x(t) + \lambda x(t) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi), \end{cases} \tag{4.3.13}$$
of multiplicity 1. We consider the same function spaces $X$, $Y$ as in the previous example (Example 4.3.24). Let us define $F : \mathbb{R} \times X \to Y$ by
$$F(\lambda, x)(t) = \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t), \dot x(t))$$
where the function $g = g(\lambda, t, x, y)$ satisfies the following hypotheses:
(i) $g$ is $2\pi$-periodic in $t$ and continuous with respect to all four variables (as a function from $\mathbb{R}^4$ into $\mathbb{R}$);
(ii) the derivatives of $g$ with respect to $x$, $y$, $\lambda$ up to the order $p$ ($p \geq 2$) are continuous functions from $\mathbb{R}^4$ into $\mathbb{R}$;
(iii) $g(\lambda, t, 0, 0) = 0$ for all $t, \lambda \in \mathbb{R}$;
(iv) $g'_3(\lambda, t, 0, 0) = g'_4(\lambda, t, 0, 0) = 0$ for all $t, \lambda \in \mathbb{R}$.


It follows from (iii) that $F(\lambda, o) = o$ for all $\lambda \in \mathbb{R}$. Moreover, thanks to (iv) we have
$$F'_2(\lambda, o)w = \ddot w + \lambda w,$$
and so we conclude
$$\dim \operatorname{Ker} F'_2(0, o) = 1.$$
It follows from (ii) that $F \in C^p(X \times [0, 2\pi], Y)$. By Proposition 2.1.27(iv),
$$Y_1 = \operatorname{Im} F'_2(0, o) = \left\{ w \in Y : \int_0^{2\pi} w(t)\, dt = 0 \right\}$$
is a closed subspace of $Y$ of codimension 1. Set
$$x_0 = 1, \qquad X_1 = \operatorname{Lin}\{1\}, \qquad X_2 = \left\{ x \in X : \int_0^{2\pi} x(t)\, dt = 0 \right\}.$$
Since
$$F''_{1,2}(0, o)1 = 1 \qquad \text{and} \qquad 1 \notin \operatorname{Im} F'_2(0, o),$$
the condition (iii) in Theorem 4.3.22 is verified, too. It follows from the Crandall–Rabinowitz Bifurcation Theorem (see Theorem 4.3.22) that $(0, o)$ is a bifurcation point of (4.3.12). In particular, the point $(0, o) \in \mathbb{R} \times X$ belongs to the branch of trivial solutions $(\lambda, o)$, but also to the branch
$$\Gamma = \{(s + x(s), \hat\lambda(s)) : s \in (-\varepsilon, \varepsilon)\}, \qquad x(0) = o, \quad \dot x_s(0) = o, \quad \hat\lambda(0) = 0.$$

Hence for any $s \in (-\varepsilon, \varepsilon)$, $s \neq 0$, the nontrivial solution $s + x(s)$ is the sum of a constant function (with respect to $t$) and the perturbed function $x(s)$ (which depends on $t$) such that $x(s)$ belongs to $X_2$.

Exercise 4.3.26. (i) Let $G$ be an open subset of $\mathbb{R}^M$ and let $v \in C^1(G, \mathbb{R}^M)$. Assume that $a \in G$ is not a stationary point of $v$, i.e., $v(a) \neq o$. Prove that there exists a diffeomorphism $F$ of a neighborhood $U$ of $a$ onto a neighborhood $V$ of $o$ such that $F$ maps solutions to the equation
$$\dot x(t) = v(x(t)) \tag{4.3.14}$$
which lie in $U$ to solutions in $V$ of the system of equations
$$\dot y_1(t) = 1, \qquad \dot y_i(t) = 0, \quad i = 2, \dots, M.$$
Hint. Choose a subspace $Y$ of $\mathbb{R}^M$ for which $\mathbb{R}^M = Y \oplus \operatorname{Lin}\{v(a)\}$. Define
$$G(z) = \varphi(t; a + y) \qquad \text{for } z = y + t v(a), \ y \in Y,$$
where $\varphi(t; \xi)$ is a solution to (4.3.14) such that $\varphi(0; \xi) = \xi$. Prove that $G$ is a local diffeomorphism and $F = G^{-1}$ has the desired property.


(ii) Deduce from (i) that the equation (4.3.14) has $M - 1$ independent first integrals in a neighborhood of a non-stationary point.
(iii) Is there any relation between the first integral of (4.3.14) and the linear partial differential equation
$$v_1(x) \frac{\partial u}{\partial x_1} + \dots + v_M(x) \frac{\partial u}{\partial x_M} = 0?$$

Exercise 4.3.27. Apply Theorem 4.3.22 to the (Dirichlet) boundary value problem
$$\begin{cases} \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t), \dot x(t)) = 0, & t \in (0, \pi), \\ x(0) = x(\pi) = 0, \end{cases}$$
and show that every $(k^2, o)$, $k \in \mathbb{N}$, is a bifurcation point!

Exercise 4.3.28. Replace the Dirichlet boundary condition in Exercise 4.3.27 by the Neumann boundary condition
$$\dot x(0) = \dot x(\pi) = 0$$
and prove that every $(k^2, o)$, $k \in \mathbb{N} \cup \{0\}$, is a bifurcation point!

Exercise 4.3.29. Why cannot the approach used in Example 4.3.25 be applied to prove that the points $(k^2, o)$, $k \in \mathbb{N}$, are bifurcation points of (4.3.12) even if $k^2$ is an eigenvalue of the associated eigenvalue problem (4.3.13)? Can you modify the method from Example 4.3.24?

Exercise 4.3.30. Apply Theorem 4.3.22 to the boundary value problem
$$\begin{cases} \dfrac{d^4 x}{dt^4}(t) - \lambda x(t) + g(\lambda, t, x(t), \dot x(t), \ddot x(t), \dddot x(t)) = 0, & t \in (0, \pi), \\ x(0) = \ddot x(0) = x(\pi) = \ddot x(\pi), \end{cases}$$
and show that, under appropriate assumptions on $g$, every $(k^2, o)$, $k \in \mathbb{N}$, is a bifurcation point.

4.3A Differentiable Manifolds, Tangent Spaces and Vector Fields

We have defined a differentiable manifold in the basic text (Definition 4.3.4) and have also shown some examples (Examples 4.3.5 and 4.3.10). In the following Appendices 4.3A–4.3D we will provide more information about this object in order to develop a geometric approach to one of the most powerful tools of nonlinear analysis, namely to the Brouwer degree (cf. Section 5.2 for an alternative approach). There is no doubt as to the importance of the notion of the derivative (or differential) for local study of functions of one or more variables. Therefore a notion of the differential will be a rudiment of analysis on differentiable manifolds, too. We have learned that it is convenient to define the differential of $f : \mathbb{R}^M \to \mathbb{R}$ at a point $a \in \mathbb{R}^M$ as a linear form $f'(a)$ on $\mathbb{R}^M$ which approximates $f$ locally at $a$ in a given precise way. To extend such an approach to functions on differentiable manifolds we have to say what a linear form on a (nonlinear) manifold is. This is done by the important notion of the tangent space $T_a M$ of a differentiable manifold $M$ at a point $a \in M$. Roughly speaking, $T_a M$ is the collection of all tangent vectors to $M$ at the point $a$.

We can imagine a tangent vector $v$ at the point $a \in M$ with the help of the following physical interpretation. Consider a force $F$ which acts on a material point $P$. Suppose that there are certain rigid constraints which make $P$ move along a smooth curve $\gamma$ which lies on a manifold $M \subset \mathbb{R}^N$ (this manifold is determined by these constraints). As a concrete example, imagine that we move on the globe (because of gravitation), which can be approximated by the smooth two-dimensional sphere $S^2$. Let the point $P$ be at a certain instant, say $t = 0$, at the position $a = \gamma(0) \in M$, and let the force $F$ and also all constraints stop operating suddenly at this time. What will happen then? According to Newton's First Law the point $P$ will continue to move with a constant speed
$$v = |\dot\gamma(0)| \qquad \left( \dot\gamma(0) := \frac{d\gamma}{dt}(0) \right)$$
along the line with the directional vector $\dot\gamma(0)$. This vector is the tangent vector to the curve $\gamma$ at the point $a$. The collection of all these tangent vectors (which are given by all possible motions through a fixed point $a \in M$) forms the tangent space $T_a M$. More precisely, we give the following definition.

Definition 4.3.31. Let $M$ be an $M$-dimensional differentiable manifold in $\mathbb{R}^N$ and let $a \in M$. The tangent space $T_a M$ is the collection of tangent vectors $\dot\gamma(0)$ to all smooth curves
$$\gamma \in \Gamma_a := \{\gamma : \mathbb{R} \to \mathbb{R}^N : \text{there is an open interval } I_\gamma \ni 0 \text{ such that } \gamma \in C^1(I_\gamma), \ \gamma(I_\gamma) \subset M, \ \gamma(0) = a\}.$$

The method for computation of tangent vectors to a "parametrized" $M$-dimensional manifold $M \subset \mathbb{R}^N$ is based on the use of local coordinates: Let $a \in M$, let $\psi$ be a diffeomorphism of a neighborhood $W \subset \mathbb{R}^N$ of the point $a$ into $\mathbb{R}^N$ (see Definition 4.3.4). If $V = W \cap M$, then $(V, \psi)$ is a local chart of $M$ at the point $a \in M$. Denote the inverse of the restriction of $P\psi$ to $V$ by $\varphi$, where $Py = (y_1, \dots, y_M) \in \mathbb{R}^M$ for $y = (y_1, \dots, y_N) \in \mathbb{R}^N$. Then $\varphi$ maps the neighborhood $U = P\psi(V)$ of the point $b = P\psi(a) \in \mathbb{R}^M$ into $M$ and we will consider $\varphi$ also as an embedding (see Remark 4.3.3(iv)) of $U$ into $\mathbb{R}^N$. We call $\varphi$ the local parametrization of $V \subset M$. The main reason for introducing $\varphi$ is that $\varphi$ can be differentiated, but $\psi|_V$ cannot, and the whole $\psi$ does not describe $M$. See Figure 4.3.13. Consider now a smooth curve $\gamma \in \Gamma_a$ (see Figure 4.3.13). We can choose $I_\gamma$ so small that $\gamma(I_\gamma) \subset V$. Then
$$\kappa(t) := (P\psi)(\gamma(t))$$

is a smooth curve in $U \subset \mathbb{R}^M$. We have $\varphi \circ \kappa = \gamma$ and, consequently,
$$\dot\gamma_j(0) = \sum_{i=1}^{M} \frac{\partial \varphi_j}{\partial y_i}(b)\, \dot\kappa_i(0),^{10} \qquad j = 1, \dots, N,$$
or, more briefly,
$$\dot\gamma(0) = \varphi'(b)(\dot\kappa(0)). \tag{4.3.15}$$
Since also
$$\dot\kappa(0) = P\psi'(a)(\dot\gamma(0)),$$
there is a correspondence between the tangent vector $v = (\dot\gamma_1(0), \dots, \dot\gamma_N(0)) \in T_a M$ and the tangent vector $w = (\dot\kappa_1(0), \dots, \dot\kappa_M(0))$ to a curve
$$\kappa = (P\psi) \circ \gamma \qquad \text{at} \qquad b = \kappa(0) = P\psi(a).$$
Obviously, for any $w = (w_1, \dots, w_M) \in \mathbb{R}^M$ there is a smooth curve $\kappa$ (e.g., $\kappa(t) = b + t \sum_{i=1}^{M} w_i e_i$; $e_1, \dots, e_M$ is the standard coordinate basis in $\mathbb{R}^M$) such that $w = \dot\kappa(0)$.

Figure 4.3.13. Manifold (the chart $P\psi$ maps $V = W \cap M \subset \mathbb{R}^N$ onto $U \subset \mathbb{R}^M$, $\varphi$ is its inverse; the curve $\gamma$ through $a$ with tangent $\dot\gamma(0)$ corresponds to the curve $\kappa$ through $b$ with tangent $\dot\kappa(0)$)

10 The Einstein summation convention is often used in differential geometry. According to it, the sum is taken with respect to all indices which appear simultaneously in upper and lower positions. For example, if $e_1, \dots, e_M$ is a basis in $\mathbb{R}^M$, then the coordinates of a point $x \in \mathbb{R}^M$ with respect to this basis should be denoted by $x^1, \dots, x^M$, since $x = \sum_{i=1}^{M} x^i e_i =: x^i e_i$ by this convention. Similarly, $\varphi : \mathbb{R}^M \to \mathbb{R}^N$ has components $\varphi^1, \dots, \varphi^N$ and values $\varphi(x^1, \dots, x^M)$ $(= \varphi(x))$ and partial derivatives $\frac{\partial \varphi^j}{\partial x^i}$. Moreover, $\varphi'(a)(h^i e_i) = \frac{\partial \varphi^j}{\partial x^i}(a) h^i e_j$ $\left(= \sum_{j=1}^{N} \sum_{i=1}^{M} \frac{\partial \varphi^j}{\partial x^i}(a) h^i e_j\right)$. Since this notation is not too common in analysis we do not use it.

This means that
$$T_a M = \operatorname{Im} \varphi'(b)$$

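and the linear operations in $\mathbb{R}^M$ induce those in $T_a M$. Therefore, $T_a M$ is a linear space of dimension $M$ ($M = \dim M$) and $\varphi'(b)e_i$, $i = 1, \dots, M$, form a basis of $T_a M$. Since $(y^1, \dots, y^M) := (\psi^1(x), \dots, \psi^M(x))$ can be viewed as local (nonlinear) coordinates of a point $x \in V$, the vector $\varphi'(b)e_i$ is also denoted by $\frac{\partial}{\partial y_i}$. This means that
$$\dot\gamma(0) = \varphi'(b)(\dot\kappa(0)) = \varphi'(b)\left( \sum_{i=1}^{M} \dot\kappa^i(0) e_i \right) = \sum_{i=1}^{M} \dot\kappa^i(0) \frac{\partial}{\partial y_i}. \tag{4.3.16}$$

Example 4.3.32. Let us compute the tangent space to the 2-dimensional sphere
$$S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\} \qquad \text{at the point } a = \left( \frac{1}{2}, \frac{1}{2}, \frac{\sqrt{2}}{2} \right).$$
As local coordinates we choose the spherical coordinates
$$x = \cos\alpha \cos\beta, \qquad y = \sin\alpha \cos\beta, \qquad z = \sin\beta,$$
i.e., $(x, y, z) = \varphi(\alpha, \beta)$, $b = \left( \frac{\pi}{4}, \frac{\pi}{4} \right)$ and $\varphi\left( \frac{\pi}{4}, \frac{\pi}{4} \right) = a$. Then
$$\frac{\partial}{\partial \alpha} := \frac{\partial \varphi}{\partial \alpha}\left( \frac{\pi}{4}, \frac{\pi}{4} \right) = \left( -\frac{1}{2}, \frac{1}{2}, 0 \right), \qquad \frac{\partial}{\partial \beta} := \frac{\partial \varphi}{\partial \beta}\left( \frac{\pi}{4}, \frac{\pi}{4} \right) = \left( -\frac{1}{2}, -\frac{1}{2}, \frac{\sqrt{2}}{2} \right)$$
is a basis of $T_a S^2$. Choosing a vector $v$ perpendicular to both $\frac{\partial}{\partial \alpha}$ and $\frac{\partial}{\partial \beta}$, e.g., $v = (1, 1, \sqrt{2})$, we get the following expression for $T_a S^2$:
$$T_a S^2 = \{(x, y, z) \in \mathbb{R}^3 : x + y + \sqrt{2} z = 0\}.$$
(If you have drawn a picture, you get a slightly better insight.)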
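The computation in Example 4.3.32 is easy to verify numerically. The sketch below is only a check of the arithmetic; the function and variable names are assumptions introduced here.

```python
import math

# Illustrative check of Example 4.3.32 (names are hypothetical).

def phi(alpha, beta):
    # Spherical parametrization of the unit sphere.
    return (math.cos(alpha) * math.cos(beta),
            math.sin(alpha) * math.cos(beta),
            math.sin(beta))

def d_alpha(alpha, beta):
    # Partial derivative of phi with respect to alpha.
    return (-math.sin(alpha) * math.cos(beta),
            math.cos(alpha) * math.cos(beta),
            0.0)

def d_beta(alpha, beta):
    # Partial derivative of phi with respect to beta.
    return (-math.cos(alpha) * math.sin(beta),
            -math.sin(alpha) * math.sin(beta),
            math.cos(beta))

def dot(u, w):
    return sum(p * q for p, q in zip(u, w))

b = (math.pi / 4, math.pi / 4)
a = phi(*b)                       # expected (1/2, 1/2, sqrt(2)/2)
e_alpha = d_alpha(*b)             # expected (-1/2, 1/2, 0)
e_beta = d_beta(*b)               # expected (-1/2, -1/2, sqrt(2)/2)
normal = (1.0, 1.0, math.sqrt(2.0))

if __name__ == "__main__":
    print(a, e_alpha, e_beta)
    print(dot(normal, e_alpha), dot(normal, e_beta))
```

Both scalar products vanish up to rounding, so the two basis vectors lie in the plane $x + y + \sqrt{2} z = 0$.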

It was shown in Remark 4.3.9(iii) that a manifold can also be given implicitly, i.e., as the set of solutions of the equation $f(x) = o$.

Proposition 4.3.33. Let $f : \mathbb{R}^M \to \mathbb{R}^N$ have continuous partial derivatives in an open set $G \subset \mathbb{R}^M$ and let $o$ be a regular value of $f$ (Definition 4.3.6). Then
$$M = \{x \in G : f(x) = o\}$$
is an $(M - N)$-dimensional differentiable manifold provided $M$ is not empty, and for $a \in M$ the tangent space $T_a M$ is equal to $\operatorname{Ker} f'(a)$.

Proof. The first part is exactly Proposition 4.3.8 and Remark 4.3.9(iii). If a map $\gamma : I_\gamma \to M$ is a smooth curve, $\gamma(0) = a$, then
$$f(\gamma(t)) = o \quad \text{for } t \in I_\gamma \qquad \text{and} \qquad f'(a)\dot\gamma(0) = o, \quad \text{i.e.,} \quad \dot\gamma(0) \in \operatorname{Ker} f'(a).$$
Since $T_a M \subset \operatorname{Ker} f'(a)$ and both spaces have the same (finite) dimension (the assumption on regularity of $o$), we have $T_a M = \operatorname{Ker} f'(a)$.


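Since the same geometric object $M$ can be viewed as a manifold with different local parametrizations or as solutions of different equations, we would like to know how the notion of the tangent space (and other notions to be introduced later on) depends on the way it is introduced. As the implicit definition of a manifold leads to local parametrizations (see the proof of Proposition 4.3.8), we can consider only the definition given by parametrizations. First of all we should say when two atlases of $M$ define the same structure on $M$.

Definition 4.3.34. Two $C^k$-atlases $(V_\alpha, \psi_\alpha)_{\alpha \in A}$, $(\tilde V_\beta, \tilde\psi_\beta)_{\beta \in B}$ of $M$ are said to be equivalent if for every $a \in M$ and any $\alpha \in A$, $\beta \in B$ for which $a \in V_\alpha \cap \tilde V_\beta$ the mapping
$$\Phi = (P\tilde\psi_\beta) \circ \varphi_\alpha$$
is a $C^k$-diffeomorphism of $(\varphi_\alpha)^{-1}(V_\alpha \cap \tilde V_\beta)$ onto $(\tilde\varphi_\beta)^{-1}(V_\alpha \cap \tilde V_\beta)$ (see Figure 4.3.14).

Figure 4.3.14. (The mapping $\Phi = (P\tilde\psi) \circ \varphi$ between $U_1 := \varphi^{-1}(V \cap \tilde V) \subset U$ and $\tilde U_1 := \tilde\varphi^{-1}(V \cap \tilde V) \subset \tilde U$, with $b \mapsto \tilde b$.)

Example 4.3.35. Let $(V, \psi)$, $(\tilde V, \tilde\psi)$ be two local charts at a point $a \in M$ which belong to equivalent atlases of $M$. Let $v \in T_a M$, $v = \dot\gamma(0)$ for a smooth curve $\gamma$. Then
$$\gamma = \varphi \circ \kappa = \tilde\varphi \circ \tilde\kappa \qquad \text{where} \qquad \tilde\kappa = \Phi \circ \kappa$$
and $\Phi$ is defined as in Definition 4.3.34. It follows that
$$\varphi'(b)(\dot\kappa(0)) = \tilde\varphi'(\tilde b)(\Phi'(b)\dot\kappa(0)).$$
In particular, for $e_i = \dot\kappa(0)$, denoting $\frac{\partial}{\partial y_i} := \varphi'(b)e_i$ (as above) and $\frac{\partial}{\partial z_j} := \tilde\varphi'(\tilde b)e_j$, we get the transformation rule for the tangent vectors
$$\frac{\partial}{\partial y_i} = \tilde\varphi'(\tilde b)\left( \sum_{j=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b) e_j \right) = \sum_{j=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b) \frac{\partial}{\partial z_j}, \qquad i = 1, \dots, M. \tag{4.3.17}$$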

We will now examine a more general situation. Let $M$, $\tilde M$ be differentiable manifolds in $\mathbb{R}^N$ and $\mathbb{R}^{\tilde N}$, respectively. Suppose that $g : M \to \tilde M$ and let $(V, \psi)$, $(\tilde V, \tilde\psi)$ be local charts at $a \in M$ and $\tilde a = g(a) \in \tilde M$. Put
$$G = (\tilde P \tilde\psi) \circ g \circ \varphi$$
(see Figure 4.3.15). Then $G$ is called a local realization of $g$.

Figure 4.3.15. (The local realization $G = (\tilde P \tilde\psi) \circ g \circ \varphi$ maps $P\psi(g^{-1}(\tilde V) \cap V) \subset U \subset \mathbb{R}^M$ into $\tilde U \subset \mathbb{R}^{\tilde M}$, with $b \mapsto \tilde b$.)

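We say that a mapping $g : M \to \tilde M$ is of the class $C^k$ (or $k$-times continuously differentiable) if $G \in C^k(U, \tilde U)$ for all charts $(V, \psi)$ and $(\tilde V, \tilde\psi)$. In this case $g$ maps a smooth curve $\gamma \in \Gamma_a$ onto a smooth curve $g \circ \gamma \in \Gamma_{\tilde a}$. Namely,
$$g \circ \gamma = \tilde\varphi \circ G \circ \kappa \qquad \text{where} \qquad \gamma = \varphi \circ \kappa,$$
and
$$\frac{d}{dt}(g \circ \gamma)(0) = \tilde\varphi'(\tilde b)(G'(b)\dot\kappa(0)). \tag{4.3.18}$$
We say that $g$ "pushes forward" the tangent vector $\dot\gamma(0) \in T_a M$ to the tangent vector
$$\frac{d}{dt}(g \circ \gamma)(0) \in T_{g(a)} \tilde M$$
which is denoted by $g_*(\dot\gamma(0))$ and which is called a push-forward. In particular, $g$ pushes forward the tangent vector $\frac{\partial}{\partial y_i}$ to $g_*\left( \frac{\partial}{\partial y_i} \right)$ where
$$g_*\left( \frac{\partial}{\partial y_i} \right) = \sum_{j=1}^{\tilde M} \frac{\partial G_j}{\partial y_i}(b) \frac{\partial}{\partial z_j}; \qquad \frac{\partial}{\partial z_j} = \tilde\varphi'(\tilde b)e_j. \tag{4.3.19}$$
We wish to point out that $g_*$ is the generalization of $g'(a)$ for $g : \mathbb{R}^M \to \mathbb{R}^{\tilde M}$. The transformation rule (4.3.17) is a special case of (4.3.18) where $g = I$. An important special case of a smooth mapping is a differentiable function on a manifold. For such a function we define the notion of the differential.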


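Definition 4.3.36. A function $f : M \to \mathbb{R}$ is said to be differentiable at $a \in M$ if
$$f \circ \gamma \in C^1(I_\gamma) \qquad \text{for all } \gamma \in \Gamma_a.$$
The differential $df(a)$ of $f$ at the point $a$ is defined by the relation
$$df(a)(\dot\gamma(0)) := \frac{d}{dt}(f \circ \gamma)(0) \qquad \text{for all } \gamma \in \Gamma_a.$$
The (algebraic) dual space to $T_a M$ will be denoted by $(T_a M)^*$ (it is sometimes called the cotangent space) and the dual basis to $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ is denoted by $dy^1, \dots, dy^M$, i.e.,
$$dy^j\left( \frac{\partial}{\partial y_i} \right) = \delta_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$

Remark 4.3.37. (i) From the definition of $df(a)$ it is obvious that $df(a) \in (T_a M)^*$ and its values can be expressed in local coordinates as follows: If $(\psi, V)$ is a local chart at $a \in M$, $F = f \circ \varphi : U \subset \mathbb{R}^M \to \mathbb{R}$, then
$$f \circ \gamma = F \circ \kappa \quad \text{for } \gamma = \varphi \circ \kappa \in \Gamma_a, \qquad \text{i.e.,} \qquad \frac{d}{dt}(f \circ \gamma)(0) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\, \dot\kappa_i(0).$$
In other words,
$$df(a) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\, dy^i. \tag{4.3.20}$$
In particular, for $f(x) = \psi^i(x)$ we have
$$d\psi^i(a) = dy^i, \qquad i = 1, \dots, M.$$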

Observe that the formula (4.3.20) allows us to define continuity of the mapping $x \in M \mapsto df(x) \in (T_x M)^*$ by the requirement that all $F$ (corresponding to all charts of $M$) have continuous partial derivatives.$^{11}$

11 It is possible to define the structure of a differentiable manifold on the collection $TM := \{(x, v) : x \in M, \ v \in T_x M\}$ of all tangent spaces together with their "base" points. This set $TM$ is called a tangent bundle of $M$. The structure of a differentiable manifold on $TM$ is given by the local charts which are constructed as follows: If $(V, \psi)$ is a local chart at $a \in M$, then $V_T = \{\{x\} \times T_x M : x \in V\} \subset \mathbb{R}^{N+M}$ and $\psi_T(x, w) = (\psi(x), P\psi'(x)w) \in \mathbb{R}^{N+M}$. In a similar way the cotangent bundle is the collection $T^*M := \{(x, \kappa) : x \in M, \ \kappa \in (T_x M)^*\}$. The continuity of $df$ which has just been defined is then the continuity of $df : M \to T^*M$.

(ii) Suppose now that there are two local charts $(V, \psi)$, $(\tilde V, \tilde\psi)$ at a point $a \in M$. Let $f : M \to \mathbb{R}$ be differentiable at $a \in M$. Put
$$F := f \circ \varphi, \qquad \tilde F := f \circ \tilde\varphi, \qquad \text{i.e.,} \qquad F = \tilde F \circ \Phi$$
where $\Phi$ is defined in Definition 4.3.34. By virtue of (4.3.20) we have
$$df(a) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\, dy^i = \sum_{i=1}^{M} \sum_{j=1}^{M} \frac{\partial \tilde F}{\partial z_j}(\tilde b) \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i = \sum_{j=1}^{M} \frac{\partial \tilde F}{\partial z_j}(\tilde b) \sum_{i=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i = \sum_{j=1}^{M} \frac{\partial \tilde F}{\partial z_j}(\tilde b)\, dz^j. \tag{4.3.21}$$
The equality
$$dz^j = \sum_{i=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i \tag{4.3.22}$$
follows from this calculation applied to the $j$th coordinate function
$$z^j = \tilde\psi^j(x), \qquad x \in V \cap \tilde V$$

and Remark 4.3.37(i).$^{12}$

12 Notice the difference between the transformation rules (4.3.17) and (4.3.22). The reader who is acquainted with tensor analysis can realize that tangent vectors are transformed as contravariant tensors and differentials as covariant ones.

According to the last remark, the existence of the differential $df(a)$ does not depend on the choice of local chart. The fact that the differentiability of functions, and similarly other notions, does not depend on a particular choice of local coordinates is crucial for differential geometry and global analysis. Namely, these parts of mathematics study objects in their own geometric nature. The invariance of geometric objects with respect to various groups of transformations (in our case this group is the group of diffeomorphisms) plays a very important role in various applications, mainly in physics (e.g., in general relativity). It is also worth mentioning that the emphasis on invariance is a kind of philosophy dual to that of Descartes. Analytic geometry and classical differential geometry transform geometric properties into the language of analysis. Local coordinates, which introduce analytic tools into geometry, are used mainly for computations.

Remark 4.3.38. The transformation rule (4.3.21) can be generalized. Consider a function $f : \tilde M \to \mathbb{R}$ and suppose that a mapping $g : M \to \tilde M$ is given. We can look at $g$ as a "generalized" transformation (we do not assume that $g$ is injective). We are interested in the relation between $df(\tilde a)$ and the differential of the "transformed" function $f \circ g : M \to \mathbb{R}$. The desired chain rule can be obtained again with the help of local charts $(V, \psi)$ at $a \in M$ and $(\tilde V, \tilde\psi)$ at $\tilde a = g(a) \in \tilde M$ (see Figure 4.3.15). This means that we investigate $F \circ G$ instead of $f \circ g$. Here
$$G = (\tilde P \tilde\psi) \circ g \circ \varphi \qquad \text{and} \qquad F = f \circ \tilde\varphi.$$
According to (4.3.19)–(4.3.21) we have
$$d(f \circ g)(a)\left( \frac{\partial}{\partial y_i} \right) = \frac{\partial (F \circ G)}{\partial y_i}(b) = \sum_{j=1}^{\tilde M} \frac{\partial F}{\partial z_j}(\tilde b) \frac{\partial G_j}{\partial y_i}(b) = df(\tilde a)\left( g_*\left( \frac{\partial}{\partial y_i} \right) \right) =: g^*(df(\tilde a))\left( \frac{\partial}{\partial y_i} \right). \tag{4.3.23}$$


The linear form $g^*(df(\tilde a)) \in (T_a M)^*$ is called a pull-back of the linear form $df(\tilde a) \in (T_{\tilde a} \tilde M)^*$. This operation will play an important role in the definition of the degree (see Proposition 4.3.116). We will return to pull-back in Exercise 4.3.71.

Remark 4.3.39. The notion of the differentiable manifold can be generalized in such a way that it is not a priori assumed that $M$ is a subset of $\mathbb{R}^N$. In fact, we have needed in Definition 4.3.4 that $M$ has a topological structure (inherited from $\mathbb{R}^N$) such that the neighborhood $V = W \cap M$ is homeomorphic via $P\psi|_V$ ($(P\psi|_V)^{-1} = \varphi$) with $U \subset \mathbb{R}^M$. A differential structure on $M$ is introduced with the help of differentiability properties of the mappings $\Phi_{1,2} := \varphi_2^{-1} \circ \varphi_1$ for different neighborhoods $V_1$, $V_2$ of $M$ (cf. Definition 4.3.34). This is sufficient for correctness of the definition of the smooth function $f : M \to \mathbb{R}$. (It is smooth provided $f \circ \varphi : U \to \mathbb{R}$ is smooth for all $\varphi$. Namely,
$$f(x) = (f \circ \varphi_1)(y) = (f \circ \varphi_2)[\Phi_{1,2}(y)] \qquad \text{for } x \in V_1 \cap V_2.)$$
These considerations allow us to say that the topological space $M$$^{13}$ is called the $M$-dimensional differentiable manifold if $M$ is locally homeomorphic to open sets of $\mathbb{R}^M$ in such a way that all composite mappings $\Phi_{1,2}$ belong to the class $C^k$.$^{14}$

13 If $M = \bigcup_\alpha V_\alpha$ is a set such that there are injective and surjective mappings $\varphi_\alpha : U_\alpha \to V_\alpha$ where $U_\alpha$ are open subsets of $\mathbb{R}^M$, then the sets $V_\alpha$ form a subbase of a topology in $M$.

14 The deep theorem due to H. Whitney roughly says that a connected $M$-dimensional differentiable manifold can be embedded (Remark 4.3.3(iv)) into $\mathbb{R}^{2M+1}$ (see, e.g., Whitney [133, Chapter IV], Aubin [11, Theorem 1.22] or Sternberg [124, Theorem 2.4.4]). This means that our previous approach was not too restrictive.

Remark 4.3.40. We can define an infinite dimensional differentiable (e.g., of the class $C^k$) manifold by replacing $\mathbb{R}^M$ by a Banach space $X$. As an important example consider a mapping $f \in C^k(X, \mathbb{R})$, $k \geq 1$, and define
$$M = \{x \in X : f(x) = 1\}.$$
If $M \neq \emptyset$ and all its points are regular (i.e., $f'(x) \neq 0$ for all $x \in M$), then $M$ is an infinite dimensional manifold of the class $C^k$ (this follows from Proposition 4.3.8 and Remark 4.3.9(i)). Moreover, the tangent space $T_a M$ is equal to $\operatorname{Ker} f'(a)$ for $a \in M$. Indeed, the inclusion $T_a M \subset \operatorname{Ker} f'(a)$ can be proved as in Proposition 4.3.33. To get the reverse inclusion let $h \in \operatorname{Ker} f'(a)$ and let $\varphi$ be the diffeomorphism from Proposition 4.3.8. For $\gamma(t) := \varphi(th)$ there exists $\delta > 0$ such that $\gamma(t) \in M$ for $|t| < \delta$ and $\dot\gamma(0) = \varphi'(o)h = h$ (cf. the proof of Proposition 4.3.8), i.e., $h \in T_a M$.

In order to define the differential $df(a)$ for $f : M \to \mathbb{R}$ we need a generalization of the notion of the tangent space $T_a M$ in this more general setting. For $a \in V \subset M$ we define
$$\Gamma_a := \{\gamma : \mathbb{R} \to M : \exists \text{ open interval } I_\gamma \ni 0 : \gamma(0) = a \in V, \ \varphi^{-1} \circ \gamma \in C^1(I_\gamma), \text{ where } \varphi : U \to V \text{ is a local parametrization of } V\}.$$
Similarly as above, $T_a M$ is the collection of all vectors $\frac{d}{dt}(\varphi^{-1} \circ \gamma)\big|_{t=0}$ with a linear structure of $\mathbb{R}^M$. (Actually, $T_a M$ now coincides with $\mathbb{R}^M$. We remark that previously $T_a M$ was an $M$-dimensional subset of $\mathbb{R}^N$.) Further,
$$df(a) : T_a M \to \mathbb{R}$$


is defined as
$$df(a)v := \frac{d}{dt}(f \circ \gamma)\Big|_{t=0} = \frac{d}{dt}(F \circ \kappa)\Big|_{t=0} \qquad \text{where } v = \dot\kappa(0), \ \kappa = \varphi^{-1} \circ \gamma, \ F = f \circ \varphi,$$
see Figure 4.3.16.

Figure 4.3.16. (The curve $\gamma : I_\gamma \to V \subset M$ through $a$, the curve $\kappa = \varphi^{-1} \circ \gamma$ in $U \subset \mathbb{R}^M$, and the functions $f : V \to \mathbb{R}$, $F : U \to \mathbb{R}$.)

In Appendix 6.3B we consider the level sets of a function $\psi : X \to \mathbb{R}$, $\psi \in C^k$, $k \geq 1$. If 0 is the regular value of $\psi$ and $M = \{x \in X : \psi(x) = \psi(a)\} \neq \emptyset$, then $M$ is a $C^k$ differentiable manifold (with a "parameter space" $X_1 = \operatorname{Ker} \psi'(a)$, see Remark 4.3.9(i) and the proof of Proposition 4.3.8). In this case $T_a M$ can be identified with $X_1$ (the analogue of Proposition 4.3.33). The following notion is also useful in nonlinear analysis.

Definition 4.3.41. A vector field on a differentiable manifold $M$ is a mapping $v : M \to TM$ such that $v(x) \in T_x M$ for all $x \in M$.

A vector field $v$ determines the differential equation
$$\dot x = v(x). \tag{4.3.24}$$
A solution of (4.3.24) is a curve $\gamma : I_\gamma \to M$ ($I_\gamma$ is an open interval in $\mathbb{R}$) such that
$$\dot\gamma(t) = v(\gamma(t)) \qquad \text{for all } t \in I_\gamma.$$
If we choose a chart $(V, \psi)$ at a point $a \in M$, we can try to find a solution $\gamma$ of (4.3.24) which passes through the point $a$, i.e., we are looking for a curve $\gamma : I_\gamma \to M$, $0 \in I_\gamma$, which is a solution of (4.3.24) and $\gamma(0) = a$. If
$$v(x) = \sum_{i=1}^{M} v^i(x) \frac{\partial}{\partial y_i} \qquad \text{for } x \in V,$$


then the equation (4.3.24) has the form of a system
$$\dot y^i = v^i(\varphi \circ y), \qquad i = 1, \dots, M. \tag{4.3.25}$$
The local existence (and uniqueness) theorem for the system (4.3.25) can be used assuming that the vector field $v$ is continuous (and all partial derivatives $\frac{\partial (v^i \circ \varphi)}{\partial y_j}$ are continuous). A standard continuation process then yields a solution $\gamma$ which is defined on a maximal "time" interval $I_a$ ($\gamma(0) = a$). It is well known that even very simple differential equations in $\mathbb{R}$ need not have any solution defined on the whole of $\mathbb{R}$ (e.g., $\dot x = x^2 + 1$). The situation is better in the case of a compact differentiable manifold $M$ (i.e., when $M$ is a compact subset of $\mathbb{R}^N$). If $v$ is continuous on $M$, $a \in M$, then there exists a solution $\gamma$ of (4.3.24), $\gamma(0) = a$, which is defined on the whole of $\mathbb{R}$. Because of the compactness of $M$ there is a finite number of charts $(V_i, \psi_i)$, $i = 1, \dots, k$, such that $M = \bigcup_{i=1}^{k} V_i$. By the continuity of $v$ there is also a constant $K$ for which
$$|v(x)| \leq K \qquad \text{for all } x \in M.$$
Any solution $\gamma$ is therefore uniformly continuous on $I_\gamma$. If $I_\gamma \neq \mathbb{R}$, then the limits (in $M$) of $\gamma$ at the terminal point(s) of $I_\gamma$ exist, and $\gamma$ could be continued. The reader is invited to fill in all details of this proof using local coordinates.

The global existence of solutions means that the map
$$\sigma : \mathbb{R} \times M \to M : \sigma(t, a) = \gamma_a(t),$$
where $\gamma_a$ is a solution of (4.3.24) on $\mathbb{R}$ which satisfies $\gamma_a(0) = a$, is a smooth (provided $v$ is smooth) dynamical system on $M$. On the other hand, with any smooth dynamical system $\sigma$ on $M$ we can associate a smooth vector field $v$ on $M$. The reader who is interested in dynamical systems on manifolds can consult, e.g., Chillingworth [24], Ruelle [114] for brief information, Palis & De Melo [101] or Katok & Hasselblatt [74] (this is actually much more than an introduction).

We have mentioned on page 165 the role of the first integrals for (autonomous) systems of ordinary differential equations. The notion of the first integral is connected with partial differential equations, since $f : M \to \mathbb{R}$ is the first integral of (4.3.24) if $f$ satisfies (in local coordinates) the linear partial differential equation of the first order
$$\frac{d}{dt} f(\gamma(t)) = \frac{d}{dt} F(\kappa(t)) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(\kappa(t))\, v^i(\gamma(t)) = df(x)(v(x)) = 0$$
where $x = \gamma(t)$ and $\gamma(t) = \varphi(\kappa(t))$.
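As a concrete illustration of the first-integral condition $df(x)(v(x)) = 0$, the following sketch takes the planar field $v(x, y) = (-y, x)$ and the function $f(x, y) = x^2 + y^2$. Both the example and all helper names are assumptions introduced here; the numerical integrator is a standard fourth-order Runge–Kutta step.

```python
# Illustrative sketch (example field and helper names are hypothetical).

def v(p):
    # Rotation field on R^2.
    x, y = p
    return (-y, x)

def f(p):
    # Candidate first integral: squared distance from the origin.
    x, y = p
    return x * x + y * y

def df_applied(p, w, h=1e-6):
    # Central-difference approximation of df(p)(w).
    x, y = p
    fx = (f((x + h, y)) - f((x - h, y))) / (2 * h)
    fy = (f((x, y + h)) - f((x, y - h))) / (2 * h)
    return fx * w[0] + fy * w[1]

def shifted(p, k, s):
    return (p[0] + s * k[0], p[1] + s * k[1])

def rk4_step(p, dt):
    # One classical Runge-Kutta step for the equation p' = v(p).
    k1 = v(p)
    k2 = v(shifted(p, k1, dt / 2))
    k3 = v(shifted(p, k2, dt / 2))
    k4 = v(shifted(p, k3, dt))
    return (p[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            p[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def integrate(p, dt, steps):
    for _ in range(steps):
        p = rk4_step(p, dt)
    return p

start = (1.0, 0.0)
end = integrate(start, 0.01, 1000)

if __name__ == "__main__":
    print(df_applied((0.3, -0.8), v((0.3, -0.8))))
    print(f(start), f(end))
```

The value of $f$ stays (numerically) constant along the computed solution, which moves along a characteristic of the equation $-y\, \partial F/\partial x + x\, \partial F/\partial y = 0$.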

The solutions of (4.3.24) are called the characteristics of this partial differential equation for an unknown function $F$. We obtain a system of partial differential equations by considering a family of vector fields. Let $v_1, \dots, v_k$ be vector fields on a manifold $M$ and denote by
$$V(x) = \operatorname{Lin}\{v_1(x), \dots, v_k(x)\}$$
the subspace of $T_x M$. The first integral of the system $v_1, \dots, v_k$ (or the collection of subspaces $\{V(x)\}_{x \in M}$) is a function $f : M \to \mathbb{R}$ for which
$$df(x)(v_i(x)) = 0, \qquad i = 1, \dots, k, \quad x \in M, \tag{4.3.26}$$

or equivalently, $df$ annihilates all $V(x)$, $x \in M$, i.e., $L_w f = o$ for all vector fields $w$ such that $w(x) \in V(x)$. ($L_w f$ is the so-called Lie derivative, see Exercise 4.3.46.) From this formulation it is clear that we can suppose that the vector fields $v_1, \dots, v_k$ are linearly independent at each $x \in M$. Contrary to the case of one equation, the system (4.3.26) need not have a solution.

The following problem is similar to the preceding one: Let $G$ be an open subset of $\mathbb{R}^M$ and let $g = (g^1, \dots, g^M)$ be a smooth mapping from $G \times \mathbb{R}$ into $\mathbb{R}^M$. Since $\mathbb{R}^M$ can be also interpreted as the dual space to $\mathbb{R}^M$, the mapping $g$ determines a system of partial differential equations
$$u'(x) = g(x, u(x)), \qquad x \in G \subset \mathbb{R}^M. \tag{4.3.27}$$
Expressing the Fréchet derivative (i.e., the differential) $u'(x)$ in terms of partial derivatives we get the system
$$\frac{\partial u}{\partial x_i}(x) = g^i(x, u(x)), \qquad i = 1, \dots, M, \quad x \in G.$$
If this system has a solution $u$, then $u \in C^2(G)$ (since $g$ is supposed to be smooth), which implies a necessary condition for the existence of a solution given by the equality of mixed derivatives $\left( \frac{\partial^2 u}{\partial x_i \partial x_j} = \frac{\partial^2 u}{\partial x_j \partial x_i}, \ i, j = 1, \dots, M \right)$:
$$\frac{\partial g^i}{\partial x_j} + \frac{\partial g^i}{\partial u} g^j = \frac{\partial g^j}{\partial x_i} + \frac{\partial g^j}{\partial u} g^i, \qquad i, j = 1, \dots, M. \tag{4.3.28}$$
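The necessary condition (4.3.28) can be tested numerically for a given right-hand side. The sketch below is an illustration under assumptions made here: the function name `integrability_defect`, the finite-difference step and the two sample mappings are not from the text. The first mapping, $g(x, u) = (x_2 u, x_1 u)$, is solved by $u = c\, e^{x_1 x_2}$ and satisfies (4.3.28); the second, $g(x, u) = (x_2, -x_1)$, violates it.

```python
# Illustrative sketch (names and sample mappings are hypothetical).

def integrability_defect(g, x, u, i, j, h=1e-5):
    # Left-hand side minus right-hand side of condition (4.3.28)
    # at the point (x, u), with derivatives taken by central differences.
    def d_x(k, l):
        # Partial derivative of g^k with respect to x_l.
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        return (g(xp, u)[k] - g(xm, u)[k]) / (2 * h)

    def d_u(k):
        # Partial derivative of g^k with respect to u.
        return (g(x, u + h)[k] - g(x, u - h)[k]) / (2 * h)

    gx = g(x, u)
    left = d_x(i, j) + d_u(i) * gx[j]
    right = d_x(j, i) + d_u(j) * gx[i]
    return left - right

def g_good(x, u):
    # Solved by u(x) = c * exp(x_1 * x_2).
    return (x[1] * u, x[0] * u)

def g_bad(x, u):
    # No function u has gradient (x_2, -x_1).
    return (x[1], -x[0])

if __name__ == "__main__":
    point, value = [0.3, -0.7], 1.5
    print(integrability_defect(g_good, point, value, 0, 1))
    print(integrability_defect(g_bad, point, value, 0, 1))
```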

It is a question how to formulate this integrability condition for the system (4.3.26). The system (4.3.26) is said to be completely integrable in $M$ if for any $x \in M$ there is a submanifold $N(x)$ of $M$ (the integral manifold) containing $x$ such that
$$i_*(T_y N(x)) = V(y) \qquad \text{for all } y \in N(x)$$
($i$ is the natural embedding of $N(x)$ into $M$). Notice that for one vector field $v \neq o$ (i.e., $\dim V(x) = 1$) the integral manifold is the integral curve of the system $\dot x = v(x)$ and, in general, it contains the integral curves of all equations
$$\dot x = v_i(x), \qquad i = 1, \dots, k.$$
The gluing of all integral curves need not be a manifold. A possible problem is shown in Remark 4.3.43 below. The basic result on complete integrability is the next theorem (Frobenius Theorem), which we state without proof. This theorem is an important tool in differential geometry, and the reader can find its proof in textbooks on this subject, e.g., Aubin [11], Sternberg [124, § 3.5]. To formulate the theorem in a compact form we need$^{15}$ the notion

15 We will give another formulation of this theorem at the end of Appendix 4.3B.


of Lie brackets (they are sometimes, mainly in applications to Hamiltonian mechanics, called Poisson brackets). If v, w are two smooth vector ﬁelds with local representations v=

M

vi

i=1

∂ , ∂yi

w=

M

wj

j=1

∂ , ∂yj

then [v, w] is the vector ﬁeld with the local representation M M i i ∂ j ∂w j ∂v [v, w] = v −w . ∂y ∂y ∂y j j j j=1 i=1

(4.3.29)
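The coordinate formula (4.3.29) can be checked numerically. The following is only a minimal sketch: the helper name `lie_bracket` and the use of central finite differences in place of exact partial derivatives are assumptions made for illustration, not part of the text.

```python
def lie_bracket(v, w, y, h=1e-5):
    """Approximate [v, w](y) from (4.3.29); v, w map a point to a list of components."""
    n = len(y)
    vy, wy = v(y), w(y)

    def partial(field, j):
        # Central difference for d(field^i)/dy_j at y, for all components i.
        yp, ym = list(y), list(y)
        yp[j] += h
        ym[j] -= h
        fp, fm = field(yp), field(ym)
        return [(fp[i] - fm[i]) / (2 * h) for i in range(n)]

    dv = [partial(v, j) for j in range(n)]  # dv[j][i] approximates dv^i/dy_j
    dw = [partial(w, j) for j in range(n)]  # dw[j][i] approximates dw^i/dy_j
    return [sum(vy[j] * dw[j][i] - wy[j] * dv[j][i] for j in range(n))
            for i in range(n)]


# Hypothetical example fields on R^2: v = d/dy_1 and w = y_1 d/dy_2.
v = lambda y: [1.0, 0.0]
w = lambda y: [0.0, y[0]]
bracket = lie_bracket(v, w, [0.3, -1.2])  # exact value is d/dy_2, i.e. (0, 1)
```

Since the bracket of these two fields does not vanish, the corresponding flows are not expected to commute.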

For another interpretation of this operation see Remark 4.3.43 below or Exercise 4.3.47.

Theorem 4.3.42 (Frobenius). Let $v_1, \dots, v_k$ be smooth vector fields on a manifold $\mathcal{M}$. Then this system is completely integrable if and only if
$$[v_i, v_j](x) \in V(x) \quad \text{for all } x \in \mathcal{M}, \quad i, j = 1, \dots, k.$$

Remark 4.3.43. Suppose that two smooth vector fields $v$, $w$ are given on a (compact) manifold $\mathcal{M}$ and let $\sigma_v$, $\sigma_w$ be the corresponding dynamical systems on $\mathcal{M}$. There is no reason to expect that these systems commute, i.e., that $\sigma_v(t, \sigma_w(s, x)) = \sigma_w(s, \sigma_v(t, x))$, and it is not difficult to construct a counterexample (see Figure 4.3.17).

[Figure 4.3.17: the point $x$, the points $y_1 = \sigma_v(t, x)$ and $y_2 = \sigma_w(s, x)$, and the end points $\sigma_w(s, y_1)$ and $\sigma_v(t, y_2)$.]

It can be shown that a necessary and sufficient condition for commutativity is that $[v, w] = o$ (see (4.3.29)).

Exercise 4.3.44. We say that a function $f : \mathcal{M} \to \mathbb{R}$ is an integral of the equation (4.3.24) if $f(\gamma(\cdot))$ is constant for any solution $\gamma$ of (4.3.24). If this is true only locally, $f$ is called a local integral. Suppose that $\dim \mathcal{M} = 2$, $(V, \psi)$ is a local chart of $\mathcal{M}$, $f$ is an integral of (4.3.24) and $df(x) \neq 0$ for all $x \in V$. Prove the following assertions.

(i) There is no stationary point of (4.3.24) in $V$ ($a \in V$ is a stationary point of (4.3.24) if $\gamma(t) = a$, $t \in \mathbb{R}$, is a solution of (4.3.24)).


(ii) Take a function $g : U \to \mathbb{R}$, $\varphi(U) = V$, such that
$$z = \Phi(y) = (F(y), g(y)), \qquad \text{where } F = f \circ \varphi,$$
is a diffeomorphism of $U$ onto $\tilde{U}$ (why does such a function exist?). Then there exists $h : \tilde{U} \to \mathbb{R}$ such that the vector field $v$ has the form
$$v(x) = 0 \cdot \frac{\partial}{\partial z_1} + h(z) \frac{\partial}{\partial z_2}$$
in these new local coordinates $z = (z_1, z_2)$ (use the transformation rule (4.3.17) and the fact that $f$ is an integral of (4.3.24)). Notice that $h(z) \neq 0$ for all $z \in \tilde{U}$.

(iii) Put
$$H(z_1, z_2) = \int_{z_2^0}^{z_2} \frac{d\eta}{h(z_1, \eta)}$$
(what is the relation of $H$ to a solution of (4.3.24) in the $z$-coordinates?). Consider another transformation of coordinates $\xi = (z_1, H(z_1, z_2))$. Then the vector field $v$ has the form
$$v(x) = 0 \cdot \frac{\partial}{\partial \xi_1} + 1 \cdot \frac{\partial}{\partial \xi_2}$$

in the local coordinates $\xi = (\xi_1, \xi_2)$. Cf. Exercise 4.3.26. Can you formulate the result in terms of ordinary differential equations? Can you generalize this result to higher dimensional manifolds?

Exercise 4.3.45. Let $v_1, \dots, v_k$ be smooth vector fields on a differentiable manifold $\mathcal{M}$ which are linearly independent on a neighborhood $U$ of $a \in \mathcal{M}$. Assume that
$$[v_i, v_j] = o, \qquad i, j = 1, \dots, k, \quad \text{on } U.$$
Prove that there exist local coordinates $(y_1, \dots, y_M)$ such that
$$v_i = \frac{\partial}{\partial y_i}, \qquad i = 1, \dots, k,$$
in a neighborhood of $a$.
Hint. Cf. Exercises 4.3.44 and 4.3.26.

Exercise 4.3.46. Let $\mathcal{M}$ be a differentiable manifold of the class $C^\infty$ and let $X$ be the set of all $C^\infty$ functions on $\mathcal{M}$. Let $D : X \to X$ satisfy
(D1) $D$ is linear,
(D2) $D(fg) = g\,Df + f\,Dg$ for all $f, g \in X$ (pointwise multiplication).
Show that there is a vector field $v$ on $\mathcal{M}$ such that
$$(L_v f)(x) := df(x)(v(x)) = Df(x), \qquad x \in \mathcal{M}, \quad f \in X.$$

Here $L_v f$ is the directional derivative (in the direction of the vector field $v$), which is also called the Lie derivative (cf. page 192).^{16}
Hint. Put
$$v = \sum_{i=1}^{M} a_i \frac{\partial}{\partial y_i}, \qquad \text{where } a_i = D y_i.$$
Show that $Df - L_v f = o$ holds for polynomials of degree $\geq 1$ on $U$. Then use the Taylor polynomial. It remains to extend this result from local charts to the whole manifold: use a partition of unity, see Definition 4.3.74 and Theorem 4.3.76. The converse statement, i.e., the fact that $L_v$ satisfies (D1), (D2) for smooth $v$, is easy to prove. (Do that.) Is there any difference between the differential and the Lie derivative?

Exercise 4.3.47. Let $v$, $w$ be two smooth vector fields on a differentiable manifold. Define the vector field $[v, w]$, the so-called commutator (or Lie bracket) of $v$, $w$, by the formula
$$L_{[v,w]} f = L_v(L_w f) - L_w(L_v f) \quad \text{for every } f \in X$$
(see (4.3.29) and Exercise 4.3.46). Show that this definition is correct, i.e., $[v, w]$ is a vector field, and show that the Jacobi identity
$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = o$$
holds for any smooth vector fields $u$, $v$, $w$.^{17}

4.3B Differential Forms

Before starting with the notion of a differential form we need to summarize some basic facts from multilinear algebra. Let $X$ be a (real) linear space. A bilinear form is a map $A : X \times X \to \mathbb{R}$ which is linear in both variables. A typical example of a bilinear form is the scalar product in a real Hilbert space.^{18} A $p$-linear form $A : \underbrace{X \times \dots \times X}_{p} \to \mathbb{R}$ is defined in a similar way.

^{16} Sophus Lie was one of the promoters of geometric methods in analysis. A topological group (i.e., a group with a topological structure such that the group operations are continuous) with the structure of a differentiable manifold (e.g., $S^N$, groups of regular or orthogonal matrices) is called a Lie group.

^{17} Let $A$ be a set with two binary operations $+$, $\cdot$ such that
(A1) $A$ with operations $+$, $\cdot$ is a ring,
(A2) $a \cdot a = o$ for all $a \in A$,
(A3) $(a \cdot b) \cdot c + (b \cdot c) \cdot a + (c \cdot a) \cdot b = o$ for all $a, b, c \in A$.
Then $A$ is said to be a Lie ring. If $A$ is, moreover, a linear space, then $A$ is called a Lie algebra. (If $A$ is an associative ring and $[a, b] = a \cdot b - b \cdot a$, then $(A, +, [\cdot, \cdot])$ is a Lie ring.) For more information see, e.g., Adams [1], Bourbaki [15], Bröcker & Dieck [17], Helgason [66].

^{18} This is not true for a complex Hilbert space since $(x, \alpha y) = \bar{\alpha}(x, y)$ for $\alpha \in \mathbb{C}$.


Definition 4.3.48. A $p$-linear form $A$ is said to be skew-symmetric if
$$A(x_{\pi(1)}, \dots, x_{\pi(p)}) = \operatorname{sgn} \pi \, A(x_1, \dots, x_p)$$
holds for any permutation $\pi$ of the set $\{1, \dots, p\}$ and all $x_1, \dots, x_p \in X$. Here $\operatorname{sgn} \pi = 1$ if the number of sign changes in $\pi$ (a sign change occurs whenever $i < j$ and $\pi(i) > \pi(j)$) is even and $\operatorname{sgn} \pi = -1$ if this number is odd. The collection of all skew-symmetric $p$-linear forms is denoted by $\Lambda^p(X)$.

Remark 4.3.49. (i) Let $e_1, \dots, e_M$ be a basis of $X$ and $f^1, \dots, f^M$ its dual basis, i.e., a basis of the space $X^*$ of all linear forms on $X$ for which
$$f^i(e_j) = \delta^i_j = \begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases}$$
Then any element $x \in X$ can be expressed in the form
$$x = \sum_{i=1}^{M} f^i(x) e_i.$$
For $A \in \Lambda^p(X)$, $p \leq M = \dim X$, we have
$$\begin{aligned}
A(x_1, \dots, x_p) &= \sum_{i_1, \dots, i_p = 1}^{M} f^{i_1}(x_1) \cdots f^{i_p}(x_p)\, A(e_{i_1}, \dots, e_{i_p}) \\
&= \sum_{1 \leq i_1 < \dots < i_p \leq M} \Big[ \sum_{\pi} \operatorname{sgn} \pi \, f^{i_{\pi(1)}}(x_1) \cdots f^{i_{\pi(p)}}(x_p) \Big] A(e_{i_1}, \dots, e_{i_p}) \\
&= \sum_{1 \leq i_1 < \dots < i_p \leq M} \det \big( f^{i_j}(x_k) \big)_{j,k = 1, \dots, p} \, A(e_{i_1}, \dots, e_{i_p}),
\end{aligned}$$
where the inner sum runs over all permutations $\pi$ of $\{1, \dots, p\}$. In particular, if $p = M$, then
$$A(x_1, \dots, x_M) = \det \big( f^i(x_k) \big)_{i,k = 1, \dots, M} \, A(e_1, \dots, e_M), \tag{4.3.30}$$

i.e., $\dim \Lambda^M(X) = 1$. Notice also that $\Lambda^p(X) = \{o\}$ for $p > M$.

(ii) Elements $x_1, \dots, x_p$ of $X$ are linearly dependent if and only if
$$A(x_1, \dots, x_p) = 0 \quad \text{for all } A \in \Lambda^p(X).$$
This follows easily from the formula given above.

The product operation can be defined in the family of skew-symmetric forms.
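Formula (4.3.30) and the sign convention of Definition 4.3.48 can be tried out on small examples. The sketch below is illustrative only: the function names `sgn`, `det` and `top_form` are hypothetical, vectors are given by their coordinates $f^i(x_k)$ in a fixed basis, and the form is normalized so that $A(e_1, \dots, e_M) = 1$.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a permutation (tuple of 0-based images), by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1


def det(rows):
    """Determinant as the sum over permutations of sgn(pi) times a product of entries."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for k in range(n):
            term *= rows[perm[k]][k]
        total += term
    return total


def top_form(vectors):
    """A(x_1, ..., x_M) from (4.3.30), assuming A(e_1, ..., e_M) = 1."""
    n = len(vectors)
    # Entry (i, k) is f^i(x_k), the i-th coordinate of the k-th vector.
    return det([[vectors[k][i] for k in range(n)] for i in range(n)])


x1, x2, x3 = (1, 2, 3), (0, 1, 4), (5, 6, 0)
value = top_form([x1, x2, x3])
swapped = top_form([x2, x1, x3])            # skew-symmetry: the sign flips
dependent = top_form([x1, (2, 4, 6), x3])   # linearly dependent vectors give 0
```

The last two values illustrate skew-symmetry and part (ii) of the remark.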


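Definition 4.3.50. Let $A \in \Lambda^p(X)$, $B \in \Lambda^q(X)$ be skew-symmetric forms. Then their exterior product $A \wedge B$ is the $(p+q)$-skew-symmetric form defined by the formula
$$A \wedge B(x_1, \dots, x_{p+q}) = \sum_{\substack{\pi \\ \pi(1) < \dots < \pi(p) \\ \pi(p+1) < \dots < \pi(p+q)}} \operatorname{sgn} \pi \, A(x_{\pi(1)}, \dots, x_{\pi(p)})\, B(x_{\pi(p+1)}, \dots, x_{\pi(p+q)}),$$
where $\pi$ runs over the permutations of $\{1, \dots, p+q\}$.

Remark 4.3.51. (i) The exterior product of three or more skew-symmetric forms is defined by induction and the associative law holds, i.e.,
$$A \wedge B \wedge C := (A \wedge B) \wedge C = A \wedge (B \wedge C).$$
(ii) The exterior product is not commutative. Namely,
$$B \wedge A = (-1)^{pq} A \wedge B \quad \text{for } A \in \Lambda^p(X), \ B \in \Lambda^q(X).$$

Example 4.3.52. (i) If $A$, $B$ are one-forms (i.e., linear forms), then
$$A \wedge B(x_1, x_2) = A(x_1) B(x_2) - A(x_2) B(x_1).$$
More generally, by induction,
$$A_1 \wedge \dots \wedge A_n (x_1, \dots, x_n) = \det \big( A_i(x_j) \big)_{i,j = 1, \dots, n} \quad \text{for one-forms } A_1, \dots, A_n.$$
(ii) If $e_1, \dots, e_M$ and $f^1, \dots, f^M$ are mutually dual bases of $X$ and $X^*$, respectively, then
$$A = A(e_1, \dots, e_M)\, f^1 \wedge \dots \wedge f^M \quad \text{for any } A \in \Lambda^M(X).$$
In other words, $f^1 \wedge \dots \wedge f^M$ generates $\Lambda^M(X)$ ($\dim X = M$). More generally, the products $(f^{i_1} \wedge \dots \wedge f^{i_n})_{1 \leq i_1 < \dots < i_n \leq M}$ form a basis of $\Lambda^n(X)$, i.e.,
$$A = \sum_{1 \leq i_1 < \dots < i_n \leq M} a_{i_1, \dots, i_n}\, f^{i_1} \wedge \dots \wedge f^{i_n} \tag{4.3.31}$$
where $a_{i_1, \dots, i_n} = A(e_{i_1}, \dots, e_{i_n})$ (see Remark 4.3.49(i)).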
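The sum in Definition 4.3.50 runs over permutations that are increasing on the first $p$ and on the last $q$ positions, so it can be enumerated by choosing which $p$ arguments go to $A$. The following sketch makes this concrete. It is an illustration under assumptions: forms are modelled as plain Python functions of vectors, and the names `wedge`, `f1`, `f2`, `f3` are hypothetical.

```python
from itertools import combinations


def sgn(perm):
    """Sign of a permutation (tuple of 0-based images), by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1


def wedge(A, p, B, q):
    """Exterior product of a p-form A and a q-form B, following Definition 4.3.50."""
    def AB(*xs):
        assert len(xs) == p + q
        total = 0
        for first in combinations(range(p + q), p):
            rest = tuple(i for i in range(p + q) if i not in first)
            # first + rest is a permutation increasing on both blocks.
            total += (sgn(first + rest)
                      * A(*(xs[i] for i in first))
                      * B(*(xs[i] for i in rest)))
        return total
    return AB


# Dual basis of the standard basis of R^3, as coordinate functions.
f1 = lambda x: x[0]
f2 = lambda x: x[1]
f3 = lambda x: x[2]

x1, x2, x3 = (1, 2, 3), (0, 1, 4), (5, 6, 0)
two_form = wedge(f1, 1, f2, 1)
left = wedge(two_form, 2, f3, 1)(x1, x2, x3)          # (f1 ^ f2) ^ f3
right = wedge(f1, 1, wedge(f2, 1, f3, 1), 2)(x1, x2, x3)  # f1 ^ (f2 ^ f3)
```

For these vectors both bracketings should agree with the determinant of the matrix $(f^i(x_j))$, in line with Remark 4.3.51(i) and Example 4.3.52(i).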

The main goal of this appendix is to investigate skew-symmetric forms on manifolds which are continuous (smooth) with respect to the topological (differential) structure of the manifold. The basic definition is the following:

Definition 4.3.53. Let $\mathcal{M}$ be a differentiable manifold of dimension $M$. A mapping
$$\omega : x \in \mathcal{M} \mapsto \omega(x) \in \Lambda^p(T_x \mathcal{M})$$
is said to be a $p$-differential form on $\mathcal{M}$ if $\omega$ is continuous (or smooth) in the following sense: Let $(V, \psi)$ be a local chart of $\mathcal{M}$ and let
$$\omega(x) = \sum_{1 \leq i_1 < \dots < i_p \leq M} a_{i_1, \dots, i_p}(x)\, dy^{i_1} \wedge \dots \wedge dy^{i_p} \tag{4.3.32}$$
be the representation of $\omega$ in this chart (see (4.3.31)). Then all functions $a_{i_1, \dots, i_p}$ are continuous (or smooth) in $V$.

Remark 4.3.54. (i) A smooth function $f : \mathcal{M} \to \mathbb{R}$ is sometimes called a differential form of order 0. Its differential $df$ is the one-form with the local representation
$$df(x) = \sum_{i=1}^{M} \frac{\partial (f \circ \varphi)(y)}{\partial y_i}\, dy^i, \qquad \varphi(y) = x$$
(see (4.3.20)).

(ii) Let $\omega$ be a $p$-differential form in $\mathbb{R}^N$ with the representation
$$\omega(x) = \sum_{1 \leq i_1 < \dots < i_p \leq N} a_{i_1, \dots, i_p}(x)\, df^{i_1} \wedge \dots \wedge df^{i_p}$$
($f^1, \dots, f^N$ is the dual basis to the standard one $e_1, \dots, e_N$ in $\mathbb{R}^N$). In accordance with the notation of coordinates in $\mathbb{R}^N$ we can also write
$$\omega(x) = \sum_{1 \leq i_1 < \dots < i_p \leq N} a_{i_1, \dots, i_p}(x)\, dx_{i_1} \wedge \dots \wedge dx_{i_p}.$$
If $\mathcal{M}$ is a differentiable manifold in $\mathbb{R}^N$ of dimension $M \geq p$, then $\omega$ can be restricted to $\mathcal{M}$. Since $T_x \mathcal{M} \subset \mathbb{R}^N$ (see (4.3.16)) we have
$$\omega(x) \left( \frac{\partial}{\partial y_{j_1}}, \dots, \frac{\partial}{\partial y_{j_p}} \right) = \sum_{1 \leq i_1 < \dots < i_p \leq N} a_{i_1, \dots, i_p}(x) \det \left( f^{i_k} \left( \frac{\partial}{\partial y_{j_l}} \right) \right)_{k,l = 1, \dots, p}.$$

(iii) If $(V, \psi)$ and $(\tilde{V}, \tilde{\psi})$ are local charts at the same point, then we have two representations for a $p$-differential form $\omega$ in $V \cap \tilde{V}$:
$$\omega(x) = \sum_{1 \leq i_1 < \dots < i_p \leq M} f_{i_1, \dots, i_p}\, dy^{i_1} \wedge \dots \wedge dy^{i_p}, \qquad \omega(x) = \sum_{1 \leq j_1 < \dots < j_p \leq M} g_{j_1, \dots, j_p}\, dz^{j_1} \wedge \dots \wedge dz^{j_p}.$$
(Here $dy^1, \dots, dy^M$ is the basis of $(T_x \mathcal{M})^*$ with respect to the local chart $(V, \psi)$ and similarly $dz^1, \dots, dz^M$ for $(\tilde{V}, \tilde{\psi})$.) The Transformation Rule (4.3.22) yields a relation between the coefficients $f_{\dots}$ and $g_{\dots}$. This relation is simple for $M$-forms ($M = \dim \mathcal{M}$), namely
$$\omega(x) = g(x)\, dz^1 \wedge \dots \wedge dz^M = g(x) \det \left( \frac{\partial \Phi_j}{\partial y_i}(\psi(x)) \right) dy^1 \wedge \dots \wedge dy^M$$
where $\left( \frac{\partial \Phi_j}{\partial y_i}(y) \right)_{i,j = 1, \dots, M}$ is the Jacobi matrix of the transformation $z = \Phi(y)$ of

local coordinates (see Figure 4.3.14). The determinant of the Jacobi matrix will be called the Jacobian and denoted by $J_\Phi$ (Example 4.1.5).

This Transformation Rule can be generalized to mappings between manifolds in a way similar to (4.3.19) and (4.3.23). If $g : \mathcal{M} \to \tilde{\mathcal{M}}$ is a smooth map and $\omega$ is a $p$-differential form on $\tilde{\mathcal{M}}$, then the formula
$$g^* \omega(x)(v_1, \dots, v_p) = \omega(g(x))(g_* v_1, \dots, g_* v_p), \tag{4.3.33}$$
$x \in \mathcal{M}$, $v_1, \dots, v_p \in T_x \mathcal{M}$, where $g_* v_i$ is the push-forward of the tangent vector $v_i$ (see (4.3.19)), defines the pull-back of $\omega$. To obtain a local representation of the type (4.3.33) we choose local coordinates at $x$, put $v_k = \frac{\partial}{\partial y_{j_k}}$ and use the Transformation Rule (4.3.19). However, the final formula is rather cumbersome and we will not need it with the exception of the case when $\dim \mathcal{M} = \dim \tilde{\mathcal{M}} = M$ and $\omega$ is an $M$-form, $\omega(z) = f(z)\, dz^1 \wedge \dots \wedge dz^M$. Then
$$g^* \omega(x) = f(g(x))\, J_G(\psi(x))\, dy^1 \wedge \dots \wedge dy^M$$
where $G$ is the local realization of $g$ (see Figure 4.3.15). An important special case is $\varphi^* \omega$ where $\varphi$ is a coordinate mapping $U \subset \mathbb{R}^M \to V \subset \mathcal{M}$.

The next example shows how to compute the pull-back of $(M-1)$-forms for small $M$. These formulae are often used in vector calculus; see also special cases of integration in Appendix 4.3C.

Example 4.3.55. (i) Let $\omega(x, y) = f(x, y)\, dx + g(x, y)\, dy$ be a 1-form in $\mathbb{R}^2$ and $\gamma = (\gamma_1, \gamma_2) : (a, b) \to \mathbb{R}^2$ a smooth curve. Then

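$$\gamma^* \omega(t) \left( \frac{\partial}{\partial t} \right) = \omega(\gamma(t)) \left( \gamma_* \frac{\partial}{\partial t} \right), \quad \text{i.e.,} \quad \gamma^* \omega(t) = f(\gamma_1(t), \gamma_2(t))\, \dot{\gamma}_1(t)\, dt + g(\gamma_1(t), \gamma_2(t))\, \dot{\gamma}_2(t)\, dt.$$

(ii) Let $\omega(x, y, z) = f(x, y, z)\, dy \wedge dz + g(x, y, z)\, dz \wedge dx + h(x, y, z)\, dx \wedge dy$ be a 2-form in $\mathbb{R}^3$ and $\varphi : (u, v) \in U \subset \mathbb{R}^2 \to \mathbb{R}^3$ a smooth parametrization of a surface $S$ in $\mathbb{R}^3$. Then
$$[\varphi^* \omega(u, v)](e_1, e_2) = \omega(\varphi(u, v)) \big( \underbrace{\varphi'(u, v) e_1}_{\partial / \partial u}, \underbrace{\varphi'(u, v) e_2}_{\partial / \partial v} \big)$$
and, if
$$\frac{\partial}{\partial u} = u_1 \frac{\partial}{\partial x} + u_2 \frac{\partial}{\partial y} + u_3 \frac{\partial}{\partial z}, \qquad \frac{\partial}{\partial v} = v_1 \frac{\partial}{\partial x} + v_2 \frac{\partial}{\partial y} + v_3 \frac{\partial}{\partial z}$$
(here $\frac{\partial}{\partial x}$ is actually the first vector of the standard basis in $\mathbb{R}^3$, see Remark 4.3.54(ii)), then
$$dy \wedge dz \left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) = u_2 v_3 - u_3 v_2, \quad \text{etc.},$$
and eventually
$$\varphi^* \omega \left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) = \left( (f, g, h), \frac{\partial}{\partial u} \times \frac{\partial}{\partial v} \right)_{\mathbb{R}^3}$$
where the brackets $(\cdot, \cdot)_{\mathbb{R}^3}$ denote the scalar product in $\mathbb{R}^3$ and $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v}$ is the so-called cross (or vector) product of the vectors $\frac{\partial}{\partial u} = (u_1, u_2, u_3)$, $\frac{\partial}{\partial v} = (v_1, v_2, v_3)$ in $\mathbb{R}^3$, i.e.,
$$\frac{\partial}{\partial u} \times \frac{\partial}{\partial v} = (u_2 v_3 - u_3 v_2,\ u_3 v_1 - u_1 v_3,\ u_1 v_2 - u_2 v_1).$$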
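As an illustration of part (ii), the pull-back can be evaluated for a concrete surface. The sketch below is one possible worked instance, not taken from the text: it assumes the spherical parametrization $\varphi(u, v) = (\sin u \cos v, \sin u \sin v, \cos u)$ and the form with coefficients $(f, g, h) = (x, y, z)$. The function names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product in R^3, as in Example 4.3.55(ii)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pullback_2form(F, phi, dphi_du, dphi_dv, u, v):
    """phi^* omega(d/du, d/dv) = ((f, g, h)(phi(u, v)), d/du x d/dv)."""
    normal = cross(dphi_du(u, v), dphi_dv(u, v))
    coeffs = F(*phi(u, v))
    return sum(c * n for c, n in zip(coeffs, normal))


# Assumed parametrization of the unit sphere and its partial derivatives.
phi = lambda u, v: (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))
dphi_du = lambda u, v: (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u))
dphi_dv = lambda u, v: (-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0)

# omega = x dy^dz + y dz^dx + z dx^dy, i.e. (f, g, h) = (x, y, z).
F = lambda x, y, z: (x, y, z)

u0, v0 = 0.7, 2.1
value = pullback_2form(F, phi, dphi_du, dphi_dv, u0, v0)
```

A direct computation gives $\varphi^* \omega (\partial / \partial u, \partial / \partial v) = \sin u$ for this choice, which is the familiar area density of the unit sphere.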

Remark 4.3.56. The reader can ask why it is necessary (or reasonable) to introduce differential forms even though vectors and vector fields have been defined. Actually, there is only a technical difference for one-forms, since $T_a \mathcal{M}$ is isomorphic to its dual $(T_a \mathcal{M})^*$. For example, $df(a) \in (T_a \mathcal{M})^*$ and therefore it can be represented by a scalar product in $T_a \mathcal{M}$. Since $T_a \mathcal{M}$ is a linear subspace of $\mathbb{R}^N$ (for $\mathcal{M} \subset \mathbb{R}^N$) we may define the scalar product in $T_a \mathcal{M}$ as
$$(v, w)_{T_a \mathcal{M}} := (v, w)_{\mathbb{R}^N} \quad \text{for } v, w \in T_a \mathcal{M}.^{19}$$
In particular, this means that there is a vector $\nabla f(a)$, the so-called gradient of $f$, such that
$$df(a)(v) = (v, \nabla f(a))_{T_a \mathcal{M}}.$$
If $df(a) = \sum_i f_i(a)\, dy^i$, then
$$f_i(a) = \left( \frac{\partial}{\partial y_i}, \nabla f(a) \right)_{T_a \mathcal{M}}.^{20}$$

The reason for distinguishing between differential forms and vector fields lies in the richer structure of the collection of all differential forms: there are operations like the exterior product and the exterior differential (Definition 4.3.57). Moreover, the differential forms
$$\omega^1 = f\, dx + g\, dy + h\, dz \quad \text{and} \quad \omega^2 = f\, dy \wedge dz + g\, dz \wedge dx + h\, dx \wedge dy$$
can be attached to the vector field
$$F = f \frac{\partial}{\partial x} + g \frac{\partial}{\partial y} + h \frac{\partial}{\partial z}.$$
We will see in Appendix 4.3C that the integral of $\omega^1$ along a curve $\gamma$ can be interpreted as the work done by the force field $F$ along $\gamma$, and the integral of $\omega^2$ along a surface $S$ has the meaning of the rate at which a fluid flow represented by the velocity field $F$ crosses $S$.

Another reason consists in a simplification of various notions and results of classical vector analysis and differential geometry. Examples like orientation, elementary volume and the Stokes Theorem will be shown in Appendix 4.3C.

^{19} In this connection see footnote 27 on page 214.

^{20} We warn that the vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ need not be orthogonal in $T_a \mathcal{M}$!

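Definition 4.3.57. Let $\mathcal{M}$ be a differentiable manifold of dimension $M$ and let $\omega$ be a smooth $p$-differential form on $\mathcal{M}$ which has the local representation (4.3.32). Then the differential of $\omega$ is the $(p+1)$-differential form $d\omega$ with the local representation
$$d\omega(x) = \sum_{1 \leq i_1 < \dots < i_p \leq M} da_{i_1, \dots, i_p}(x) \wedge dy^{i_1} \wedge \dots \wedge dy^{i_p}. \tag{4.3.34}$$

Example 4.3.58. (i) If $f : \mathcal{M} \to \mathbb{R}$ is a differentiable function, i.e., a 0-form, then the differential $df$ given by Remark 4.3.54(i) is the same as that in (4.3.34).

(ii) Let $\omega(x) = f_1(x)\, dx_1 + f_2(x)\, dx_2 + f_3(x)\, dx_3$ be a 1-form on an open set $G$ in $\mathbb{R}^3$ and $(x_1, x_2, x_3)$ the Cartesian coordinates of a point $x$. If $f_1$, $f_2$, $f_3$ are smooth functions on $G$, then
$$\begin{aligned}
d\omega(x) &= df_1(x) \wedge dx_1 + df_2(x) \wedge dx_2 + df_3(x) \wedge dx_3 \\
&= \underbrace{\frac{\partial f_1}{\partial x_1}\, dx_1 \wedge dx_1}_{=0} + \frac{\partial f_1}{\partial x_2}\, dx_2 \wedge dx_1 + \frac{\partial f_1}{\partial x_3}\, dx_3 \wedge dx_1 + \cdots \\
&= \left( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right) dx_1 \wedge dx_2 + \left( \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3} \right) dx_2 \wedge dx_3 + \left( \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1} \right) dx_3 \wedge dx_1.
\end{aligned}$$
If we interpret the components $(f_1, f_2, f_3)$ of the form $\omega$ as those of a vector field $v$ (Remark 4.3.56), then the components of $d\omega$, more precisely
$$\left( \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right),$$
form the so-called curl of $v$ (notation $\operatorname{curl} v$ or $\nabla \times v$; for the cross product see Example 4.3.55(ii)).

Remark 4.3.59. By computing the differentials $df_{i_1 \dots i_p}$ of the coefficients as in the previous example and rearranging the sum in (4.3.34) to get rid of the zero terms $dy^{i_1} \wedge \dots \wedge dy^{i_{p+1}}$ where two indices coincide, we obtain, e.g., for an $(M-1)$-form
$$\omega(x) = \sum_{i=1}^{M} f_i(x)\, dy^1 \wedge \dots \wedge \widehat{dy^i} \wedge \dots \wedge dy^M$$
(here $\widehat{dy^i}$ means that $dy^i$ is missing),
$$d\omega(x) = \sum_{i=1}^{M} (-1)^{i-1} \frac{\partial (f_i \circ \varphi)}{\partial y_i}(\psi(x))\, dy^1 \wedge \dots \wedge dy^M.$$
Here $\varphi$, $\psi$ are given by a local chart at $x \in \mathcal{M}$.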

Example 4.3.58(i) leads to the following question: Are all one-differential forms differentials of smooth functions? In other words, does any (continuous) one-form $\omega$ have a "primitive function" $f$, i.e., is there $f$ such that $df = \omega$? A short speculation on one-forms in $\mathbb{R}^2$ suggests obstacles caused by mixed partial derivatives. We investigate this problem in a more general way.

Proposition 4.3.60.
(i) Let $\omega$ be a differential form of the class $C^2$. Then $d^2 \omega := d(d\omega) = 0$.
(ii) Let $\omega$ and $\kappa$ be a $p$-differential form and a $q$-differential form, respectively. Then
$$d(\omega \wedge \kappa) = (d\omega) \wedge \kappa + (-1)^p\, \omega \wedge (d\kappa).$$
Proof. An easy proof is left to the reader. Notice however that the exchangeability of mixed partial derivatives of $C^2$-functions is the crucial point in the statement (i).

Definition 4.3.61. A differential form $\omega$ is said to be
(1) closed if $d\omega = 0$,
(2) exact if there is a differential form $\kappa$ such that $\omega = d\kappa$.

Remark 4.3.62. The concept of exact differential forms is a generalization of the classical notion of the potential of a mapping $f : \mathbb{R}^M \to \mathbb{R}^M$: A function $F : G \to \mathbb{R}$ is called a potential of $f$ in an open set $G \subset \mathbb{R}^M$ if
$$F'(x) h = (f(x), h)_{\mathbb{R}^M}, \qquad x \in G, \quad h \in \mathbb{R}^M.$$
In particular, if $F$ is a potential of a $C^1$-function $f$, then
$$\frac{\partial f_j}{\partial x_i}(x) = \frac{\partial f_i}{\partial x_j}(x), \qquad i, j = 1, \dots, M, \quad x \in G.$$
The following example shows that this necessary condition is not sufficient.

Example 4.3.63. Let $G = \mathbb{R}^2 \setminus \{(0, 0)\}$ and let
$$\omega(x, y) = -\frac{y}{x^2 + y^2}\, dx + \frac{x}{x^2 + y^2}\, dy$$
be a 1-form in $G$. This form is closed in $G$. Suppose now that $\omega$ is exact, i.e., there is a function $f : G \to \mathbb{R}$ such that $df = \omega$, in particular,
$$\frac{\partial f}{\partial x} = -\frac{y}{x^2 + y^2}, \qquad \frac{\partial f}{\partial y} = \frac{x}{x^2 + y^2} \quad \text{in } G.$$
Integrating, we obtain
$$f(x, y) = \begin{cases} -\arctan \dfrac{x}{y} + C(y) & \text{for } (x, y) \in G,\ y \neq 0, \\[2mm] \arctan \dfrac{y}{x} + D(x) & \text{for } (x, y) \in G,\ x \neq 0. \end{cases}$$

Since $\arctan z + \arctan \frac{1}{z} = \pm \frac{\pi}{2}$ for $z \neq 0$ (with the sign of $z$), we have $C(y) - D(x) = \pm \frac{\pi}{2}$, i.e., $C$ and $D$ are constant functions in all quadrants. Taking limits for $x \to 0_\pm$, $y \to 0_\pm$ we arrive at a contradiction, i.e., $\omega$ is not exact.

The reader can ask how we have found this example. The problem is more transparent if $\mathbb{R}^2$ is identified with the complex plane. If
$$F(z) = \frac{1}{z}, \qquad z = x + iy,$$
then
$$\operatorname{Re} F(x + iy) = \frac{x}{x^2 + y^2}, \qquad \operatorname{Im} F(x + iy) = -\frac{y}{x^2 + y^2}.$$
It is well known that there is no (holomorphic) function $\Phi$ such that
$$\Phi'(z) = \frac{1}{z} \quad \text{for all } z \in \mathbb{C} \setminus \{0\}$$
(consider $\int_{S^1} \frac{dz}{z}$). In the theory of functions of a complex variable a primitive function can be constructed by a curve integral. We will use the same approach in constructing a "primitive form" to a differential form. This is the main idea of the proof of the following basic result.
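The obstruction in Example 4.3.63 can also be seen numerically: the integral of $\omega$ over a closed curve around the origin does not vanish, which would be impossible for an exact form. The following sketch uses a plain midpoint rule; the helper name `line_integral`, the step count and the two test circles are assumptions made for illustration.

```python
import math


def omega(x, y):
    """Coefficients (P, Q) of the 1-form P dx + Q dy from Example 4.3.63."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def line_integral(form, curve, dcurve, a, b, n=2000):
    """Midpoint rule for the integral over [a, b] of form(curve(t)) . curve'(t) dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        p, q = form(*curve(t))
        dx, dy = dcurve(t)
        total += (p * dx + q * dy) * h
    return total


# Unit circle around the origin, and a unit circle centered at (3, 0).
circle = lambda t: (math.cos(t), math.sin(t))
shifted = lambda t: (3.0 + math.cos(t), math.sin(t))
dcircle = lambda t: (-math.sin(t), math.cos(t))

around_origin = line_integral(omega, circle, dcircle, 0.0, 2 * math.pi)
away_from_origin = line_integral(omega, shifted, dcircle, 0.0, 2 * math.pi)
```

The first value should be close to $2\pi$ and the second close to $0$: the shifted circle lies in a ball contained in $G$, where $\omega$ does have a primitive function.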

Theorem 4.3.64 (H. Poincaré). Any closed differential form on a differentiable manifold is locally exact.

Proof. Let $\omega$ be a closed $p$-form on an $M$-dimensional manifold $\mathcal{M}$ ($1 \leq p \leq M$). We choose a local chart $(V, \psi)$ such that $P\psi(V) = U$ is an open ball in $\mathbb{R}^M$ with center at the origin. The pull-back (see Remark 4.3.54(iii)) $\Omega := \varphi^* \omega$ ($\varphi = (P\psi)^{-1}$) is a $p$-form in $U$. We define a $(p-1)$-form $\sigma$ on $U$ by the formula
$$\sigma(y)(v_1, \dots, v_{p-1}) = \int_0^1 t^{p-1}\, \Omega(ty)(y, v_1, \dots, v_{p-1})\, dt$$
for $y \in U$, $v_1, \dots, v_{p-1} \in T_y U$.^{21} We have to show that
(i) the integral exists (this fact follows from the continuity of $t \mapsto \Omega(ty)$);
(ii) $\sigma$ is a $(p-1)$-differential form on $U$ (the skew-symmetry of $\sigma$ follows from the same property of $\Omega$);
(iii) $d\sigma(y) = \Omega(y)$ for $y \in U$.
Verification of the last statement is technically complicated. The case $p = 1$ is more transparent, and therefore we will give the computation only for this case. For the induction step the reader can consult, e.g., Sternberg [124, Theorem III.4.1], Cartan [21, Theorem II.3.2.12.1] or Taylor [127, Theorem 1.13.2].

Suppose that $\Omega$ has in $U$ the form
$$\Omega(y) := (\varphi^* \omega)(y) = \sum_{i=1}^{M} g_i(y)\, df^i, \qquad y \in U,$$
where $f^1, \dots, f^M$ is the dual basis to the standard one $e_1, \dots, e_M$ in $\mathbb{R}^M$. We wish to show that the function
$$\sigma(y) := \sum_{i=1}^{M} \int_0^1 g_i(ty)\, y_i\, dt, \qquad y = (y_1, \dots, y_M) \in U, \tag{4.3.35}$$
has the differential $d\sigma(y) = \Omega(y)$, i.e.,
$$\frac{\partial \sigma}{\partial y_j}(y) = g_j(y), \qquad j = 1, \dots, M.$$
By differentiating the integral (4.3.35) with respect to the parameter $y_j$ we obtain
$$\frac{\partial \sigma}{\partial y_j}(y) = \int_0^1 g_j(ty)\, dt + \sum_{i=1}^{M} \int_0^1 \frac{\partial g_i}{\partial y_j}(ty)\, t y_i\, dt = \int_0^1 g_j(ty)\, dt + \sum_{i=1}^{M} \int_0^1 \frac{\partial g_j}{\partial y_i}(ty)\, t y_i\, dt.$$
For the last equality we have used the assumption $d\omega = 0$ and Exercise 4.3.71(iv): $d\Omega = d(\varphi^* \omega) = \varphi^*(d\omega) = 0$ and, consequently,
$$\frac{\partial g_i}{\partial y_j} = \frac{\partial g_j}{\partial y_i}, \qquad i, j = 1, \dots, M.$$
Using integration by parts we get
$$\int_0^1 g_j(ty)\, dt = \big[ t g_j(ty) \big]_{t=0}^{t=1} - \int_0^1 t \frac{d}{dt} g_j(ty)\, dt = g_j(y) - \sum_{i=1}^{M} \int_0^1 t \frac{\partial g_j}{\partial y_i}(ty)\, y_i\, dt.$$
Substituting this into the formula for $\frac{\partial \sigma}{\partial y_j}$, the two sums cancel and $\frac{\partial \sigma}{\partial y_j}(y) = g_j(y)$. If we put $f(x) = \sigma(y)$ for $x = \varphi(y)$, then $df = \omega$.

^{21} Here we identify the point $y \in U$ with the vector $y \in T_y U = \mathbb{R}^M$.
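The formula (4.3.35) is constructive, so the potential can be approximated directly. The following is a minimal numerical sketch for $p = 1$. The function name `potential`, the midpoint rule and the sample mapping $g = (2 y_1 y_2,\ y_1^2)$ are assumptions chosen for illustration; this $g$ satisfies the symmetry condition $\partial g_1 / \partial y_2 = \partial g_2 / \partial y_1$.

```python
def potential(g, y, n=2000):
    """Approximate sigma(y) = sum_i int_0^1 g_i(t y) y_i dt from (4.3.35), midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        gy = g([t * c for c in y])
        total += sum(gi * yi for gi, yi in zip(gy, y)) * h
    return total


# Hypothetical closed one-form g_1 dy_1 + g_2 dy_2 with g = grad(y_1^2 * y_2).
g = lambda y: [2 * y[0] * y[1], y[0] ** 2]

point = [0.5, -0.8]
sigma = potential(g, point)
exact = point[0] ** 2 * point[1]   # the known potential y_1^2 * y_2
```

For this example the integrand is $3 t^2 y_1^2 y_2$, so $\sigma(y) = y_1^2 y_2$ and the two values should agree up to the quadrature error.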

Remark 4.3.65. The proof of the case $p = 1$ shows that there exists a potential $\sigma$ of a smooth mapping $g = (g^1, \dots, g^M) : U \to \mathbb{R}^M$ in a ball $U$ provided the symmetry conditions
$$\frac{\partial g^j}{\partial y_i} = \frac{\partial g^i}{\partial y_j}, \qquad i, j = 1, \dots, M,$$
hold. Example 4.3.63 suggests that certain topological properties of $U$ are necessary if $U$ is not a ball. In the proof of the previous theorem the potential $\sigma$ was defined by the curve integral
$$\sigma(y) = \int_0^1 (g(\gamma(t)), \dot{\gamma}(t))_{\mathbb{R}^M}\, dt =: \int_{\gamma_{o,y}} \Omega \quad {}^{22}$$
along the curve $\gamma_{o,y} = \{ty : t \in [0, 1]\}$. The crucial point in the direct computation of the Fréchet derivative of $\sigma$ is an estimate of the difference $\sigma(y + h) - \sigma(y)$. If the curve integral depends only on the initial and terminal points and not on the path which joins these points, then
$$\sigma(y + h) - \sigma(y) = \int_{\gamma_{y, y+h}} \Omega = \int_0^1 (g(y + th), h)_{\mathbb{R}^M}\, dt = (g(y), h)_{\mathbb{R}^M} + o(\|h\|_{\mathbb{R}^M})$$
provided $g$ is continuously differentiable. Example 4.3.63 can be easily adapted to show that the independence of the curve integral on the path in $U$ implies that $U$ is not punctured.

^{22} The definition of the curve integral and of the integral of a differential form is given in the next Appendix 4.3C.

There is another way to express this observation. It consists in considering the obstacles preventing a closed form from being exact. Assume that $\mathcal{M}$ is a connected manifold and denote the group (with respect to pointwise addition) of closed $p$-differential forms on $\mathcal{M}$ by $Z^p(\mathcal{M})$ and the subgroup of exact $p$-forms by $B^p(\mathcal{M})$. The quotient
$$H^p(\mathcal{M}) := Z^p(\mathcal{M}) | B^p(\mathcal{M})$$
is called the $p$-(de Rham) cohomology group of $\mathcal{M}$. If $H^1(\mathcal{M})$ is trivial, i.e., any closed one-form in $\mathcal{M}$ is exact, then $\mathcal{M}$ is said to be simply connected. More details on the role of cohomology groups in the study of differentiable manifolds can be found, e.g., in Whitney [133, Chapter IV]. The calculation of cohomology groups is by no means trivial.

Example 4.3.66. (i) Let $f : \mathcal{M} \to \mathcal{N}$ be a smooth map. If $\omega \in Z^p(\mathcal{N})$, then
$$d(f^* \omega) = f^*(d\omega) = 0, \quad \text{i.e.,} \quad f^* \omega \in Z^p(\mathcal{M})$$
(Exercise 4.3.71(iv)). Similarly, if $\omega \in B^p(\mathcal{N})$, then $f^* \omega \in B^p(\mathcal{M})$. This means that $f$ induces a linear map $f^* : H^p(\mathcal{N}) \to H^p(\mathcal{M})$. In particular, if $f$ is a diffeomorphism of $\mathcal{M}$ onto $\mathcal{N}$, then $H^p(\mathcal{M})$ is isomorphic to $H^p(\mathcal{N})$.

(ii) Using the previous example we can show that $H^1(S^1)$ is isomorphic to $\mathbb{R}$. Instead of $H^1(S^1)$ it is sufficient to compute $H^1(\mathbb{R}|\mathbb{Z})$: Denote by $i$ the natural projection of $\mathbb{R}$ onto $\mathbb{R}|\mathbb{Z}$ and consider a closed one-form $\omega$ on $\mathbb{R}|\mathbb{Z}$. Then
$$f(x)\, dx := i^* \omega(x)$$
where $f$ is a 1-periodic function. Define
$$\vartheta(\omega) = \int_0^1 f(x)\, dx.$$
It is easy to see that $\vartheta(\omega) = 0$ if and only if $\omega$ is exact, and also that $\vartheta$ maps $Z^1(\mathbb{R}|\mathbb{Z})$ onto $\mathbb{R}$. This shows that $\vartheta$ induces an isomorphism of $H^1(\mathbb{R}|\mathbb{Z})$ onto $\mathbb{R}$.

Now we explain the notion of a simply connected domain in another way which will be important in the sequel (e.g., in the degree theory, Appendix 4.3D).

Definition 4.3.67. Let $X$, $Y$ be metric (topological) spaces. Continuous maps $f, g : X \to Y$ are called homotopic if there exists a continuous map $\Phi : X \times [0, 1] \to Y$ such that
$$\Phi(\cdot, 0) = f(\cdot), \qquad \Phi(\cdot, 1) = g(\cdot).$$
Such $\Phi$ is said to be a homotopy between $f$ and $g$.

Remark 4.3.68. Being homotopic is clearly an equivalence relation on continuous maps. The set of all continuous maps $C(X, Y)$ is therefore divided into disjoint classes of mutually homotopic maps. We denote the class containing $f$ by $[f]$. Here we are using the homotopy concept mainly for curves. The reader can imagine that curves $\gamma_0, \gamma_1 : [0, 1] \to \mathcal{M}$ are homotopic if $\gamma_0$ may be continuously deformed (in $\mathcal{M}$!) into $\gamma_1$. A curve $\gamma$ is called null-homotopic if $\gamma$ is homotopic to a constant curve (point) $\tilde{\gamma} : t \in [0, 1] \mapsto a \in \mathcal{M}$. In particular, this is important for closed curves ($\gamma(0) = \gamma(1)$). To see this, choose a fixed point $a \in \mathcal{M}$ and define
$$H_1(\mathcal{M}) := \{[\gamma] : \gamma : [0, 1] \to \mathcal{M} \text{ is continuous},\ \gamma(0) = \gamma(1) = a\}.$$
$H_1(\mathcal{M})$ forms a group, the so-called fundamental group of $\mathcal{M}$, under the "multiplication" $\gamma = \gamma_2 \cdot \gamma_1$ defined by
$$\gamma(t) = \begin{cases} \gamma_1(2t), & t \in \left[0, \frac{1}{2}\right], \\ \gamma_2(2t - 1), & t \in \left[\frac{1}{2}, 1\right] \end{cases}$$
(notice that the definition $[\gamma_2] \cdot [\gamma_1] := [\gamma_2 \cdot \gamma_1]$ is correct). If $\mathcal{M}$ is path-connected, i.e., for any $a, b \in \mathcal{M}$ there exists a continuous curve $\gamma$ in $\mathcal{M}$ such that $\gamma(0) = a$, $\gamma(1) = b$, then the fundamental group of $\mathcal{M}$ does not depend on the choice of the point $a$. Whenever the fundamental group is trivial, i.e., any closed curve can be continuously deformed into a point, there are no holes in $\mathcal{M}$ and the integral of a closed one-form along a closed curve is zero, i.e., this integral does not depend on the path (Remark 4.3.65).

Cohomology and homotopy groups belong to the main tools in algebraic topology. The reader who is interested in these techniques can consult the corresponding textbooks, e.g., Adams [1], Dold [36], Greenberg [61], Kosniowski [77] (here, in Chapter 26, you can find applications of fundamental groups to the classification of two-dimensional compact connected surfaces; for example, such a surface is simply connected if and only if it is homeomorphic to $S^2$), Spanier [122].
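The "multiplication" of closed curves in Remark 4.3.68 is a simple reparametrized concatenation. A minimal sketch follows, with curves modelled as Python functions on $[0, 1]$; the names `concat` and `loop` and the choice of loops on the unit circle based at $a = (1, 0)$ are assumptions for illustration.

```python
import math


def concat(gamma2, gamma1):
    """gamma2 . gamma1: run gamma1 on [0, 1/2], then gamma2 on [1/2, 1]."""
    def gamma(t):
        return gamma1(2 * t) if t <= 0.5 else gamma2(2 * t - 1)
    return gamma


def loop(k):
    """Closed curve winding k times around the unit circle, based at (1, 0)."""
    return lambda t: (math.cos(2 * math.pi * k * t), math.sin(2 * math.pi * k * t))


product = concat(loop(2), loop(1))
start, middle, end = product(0.0), product(0.5), product(1.0)
```

Both halves meet at the base point at $t = \frac{1}{2}$, which is what makes the concatenated curve continuous.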
At the end of this appendix we link differential one-forms and systems of differential equations and continue the discussion from Appendix 4.3A. To simplify it we assume that
$$\alpha = \sum_{i=1}^{M} \gamma_i\, dx_i$$
is a non-vanishing smooth one-form in an open set $G \subset \mathbb{R}^M$. The form $\alpha$ is uniquely determined by its kernel up to a multiplication factor. Let $v^1, \dots, v^{M-1}$ be a basis of this kernel, i.e., $v^1, \dots, v^{M-1}$ are vector fields on $G$ which are linearly independent at each point $x \in G$ and on which $\alpha$ vanishes. The equation
$$\alpha = 0 \tag{4.3.36}$$
is called the exterior differential equation in $G$ and its solution is a mapping $T : x \in G \mapsto$ a subspace $T(x) \subset T_x G$ such that
$$\alpha(v) = 0 \quad \text{for all } v \in T(x).$$

A submanifold $S$ of $G$ is said to be an integral manifold for the equation (4.3.36) if
$$\dim S = M - 1 \quad \text{and} \quad \varphi^* \alpha(y) = 0 \quad \text{for all } y \in S$$
where $\varphi$ is a local parametrization of $S$ in a neighborhood of $y$. This integral manifold is the same object as that at the end of Appendix 4.3A, i.e.,
$$i_*(T_y S) = \operatorname{Lin}\{v^1(y), \dots, v^{M-1}(y)\}$$
where $i : S \to G$ is an embedding.

The notion of the exterior differential equation can be generalized to a system
$$\alpha^j = 0, \qquad j = 1, \dots, k, \tag{4.3.37}$$
where $\alpha^1, \dots, \alpha^k$ are differential forms on a manifold $\mathcal{M}$, not necessarily of the same order. In the special case when $\alpha^1, \dots, \alpha^k$ are linearly independent one-forms (then (4.3.37) is the so-called Pfaff system), the intersection of their kernels has a basis formed by $M - k$ vector fields. The Frobenius Theorem for the existence of an integral manifold for (4.3.37) has the following form.

Theorem 4.3.69 (Frobenius, the differential forms version). Let $\alpha^1, \dots, \alpha^k$ be smooth differential one-forms on a differentiable manifold $\mathcal{M}$. The necessary and sufficient condition for the existence of an integral manifold in a vicinity of any point of $\mathcal{M}$ is that
$$d\alpha^i \wedge \alpha^1 \wedge \dots \wedge \alpha^k = 0, \qquad i = 1, \dots, k.$$

Proof. The equivalence of Theorem 4.3.42 and Theorem 4.3.69 is not difficult to prove, and the case $k = 1$, $\dim \mathcal{M} = 3$ is rather instructive. In this case a connection to the Poincaré Theorem 4.3.64 and its proof should also be recognized.

Exercise 4.3.70. Denote by $O(N)$ the set of all regular linear mappings $A : \mathbb{R}^N \to \mathbb{R}^N$ for which $A^{-1} = A^*$ and by $SO(N)$ the set $\{A \in O(N) : \det A = 1\}$.
(i) As a subset of $\mathbb{R}^{N \times N}$ the set $O(N)$ is a differentiable manifold. Find its dimension.
Hint. Consider $A \mapsto A^* A$. The dimension is $\frac{N(N-1)}{2}$.
(ii) Show that $SO(N)$ is the component of $O(N)$ containing the identity.
(iii) Show that $A \in O(N)$ induces a mapping of $S^{N-1}$ into itself.
(iv) Let $\omega$ be a one-form on $S^2$ which is invariant under $SO(3)$, i.e.,
$$A^* \omega = \omega \quad \text{for all } A \in SO(3).$$
Prove that $\omega = 0$.
(v) Does a result analogous to (iv) hold for a two-form $\omega$ on $\mathbb{R}^3$?

Exercise 4.3.71. Prove the following properties of the pull-back operation:
(i) $g^*(\omega \wedge \kappa) = (g^* \omega) \wedge (g^* \kappa)$,
(ii) $(h \circ g)^* \omega = g^*(h^* \omega)$,
(iii) $g^*(df) = d(f \circ g)$ for $f : \mathcal{M} \to \mathbb{R}$,^{23}
(iv) $g^*(d\omega) = d(g^* \omega)$ where $d$ denotes the differential.

^{23} If we interpret $f$ as a 0-form, then the notation $g^* f$ instead of $f \circ g$ is more agreeable.

Exercise 4.3.72. Let M be the open unit ball in R2 without its origin. Show that H1 (M ) is isomorphic to Z. Is it also true in R3 ? Exercise 4.3.73. Show that H1 (S 1 ) is isomorphic to Z. Hint. Use an approach similar to that in Example 4.3.66(ii), and instead of the mapping ϑ show that there is a lifting γ˜ of a continuous closed curve γ : [0, 1] → R|Z , i.e., γ ˜ : [0, 1] → R continuous such that i(˜ γ ) = γ, γ˜ (0) = 0. Now consider γ˜ (1) (actually this is the degree of γ – see Appendix 4.3D). For details see, e.g., Kosniowski [77, Chapter 16].

4.3C Integration on Manifolds We have met the curve integral in the previous Appendix 4.3B. There are two objects which can be integrated along curves: functions and diﬀerential one-forms. The situation with functions is simple. If M is an M -dimensional diﬀerentiable manifold in RN , f : M → R is a continuous function and γ : [a, b] → M is a smooth curve, then we deﬁne b f dγ f (γ(t))γ(t) ˙ dt. (4.3.38) γ

a

The Euclidian norm γ(·) ˙ of tangent vectors expresses here a quantity which could be viewed as the “inﬁnitesimal” length of γ. Recall in this context the formula for the length of γ: b l(γ) γ(t) ˙ dt. a

We will return to the length and area of a nonlinear object later in this appendix. The integral on the right-hand side of (4.3.38) is the Riemann integral and consequently it has reasonable properties. It could be generalized to some noncontinuous functions (via the Lebesgue integral) and/or to certain non-smooth curves (pairwise smooth or with bounded variation via the Riemann–Stieltjes integral). Since we are not interested in these generalizations we always assume that all objects are as smooth as we need (manifolds at least of the class C 1 , functions, vector ﬁelds, diﬀerential forms at least continuous). The situation with integration of one-forms is diﬀerent. Namely, diﬀerential forms are deﬁned only on manifolds (recall that an open subset of RM is also a manifold) and curves need not be manifolds (see Figure 4.3.2). There are two possibilities to avoid these obstacles: either to assume that γ lies on a manifold where the one-form is deﬁned or to restrict integration to curves which are themselves manifolds. We now examine the ﬁrst possibility and postpone the other one to Deﬁnition 4.3.86. For the deﬁnition of the integral of a one-form given in Remark 4.3.54 we have assumed that the whole curve γ lies in one chart to get the same representation of the form at all points of γ. If more charts are needed to cover the curve we have to be careful not to integrate over some parts of the curve several times. To eliminate this risk the following tool is very useful. In order to build it up we need a topological interlude.


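Definition 4.3.74. Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of a differentiable manifold $\mathcal{M}$. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a collection of smooth (often $C^\infty$) nonnegative functions on $\mathcal{M}$ which have the following properties:

(1) for all $n \in \mathbb{N}$ the support of $\alpha_n$ defined by $\operatorname{supp} \alpha_n := \overline{\{x \in \mathcal{M} : \alpha_n(x) \neq 0\}}$ is a compact subset of $V_n$;

(2) $\sum_{n=1}^{\infty} \alpha_n(x) = 1$ for all $x \in \mathcal{M}$.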

Then $\{\alpha_n\}_{n\in\mathbb{N}}$ is said to be a partition of unity subordinate to $\{V_n\}_{n\in\mathbb{N}}$.24

Since $\mathcal{M} \subset \mathbb{R}^N$ is separable in the induced topology a countable atlas always exists. It is also possible to construct a sequence $\{G_n\}_{n=1}^{\infty}$ of open subsets of $\mathcal{M}$ such that

$$\overline{G_n} \subset \operatorname{int} G_{n+1},{}^{25} \qquad \overline{G_n} \ \text{is compact}, \qquad \text{and} \qquad \mathcal{M} = \bigcup_{n=1}^{\infty} G_n.$$

For example, $G_n$ can be chosen as the intersection of $\mathcal{M}$ with the open ball centered at $o$ with radius $n$. For the construction of a partition of unity the following topological device is convenient. We will need various types of balls so $B(a; r)$ will denote the open ball in $\mathbb{R}^M$ ($M = \dim \mathcal{M}$).

Lemma 4.3.75. Let $\{W_\alpha\}_{\alpha\in A}$ be an open covering of an $M$-dimensional manifold $\mathcal{M}$ in $\mathbb{R}^N$. Then there is a countable open covering $\{V_m\}_{m\in\mathbb{N}}$ of $\mathcal{M}$ with the following properties:

(i) $\{V_m\}_{m\in\mathbb{N}}$ is subordinate to $\{W_\alpha\}_{\alpha\in A}$, i.e., for each $m \in \mathbb{N}$ there is an index $\alpha_m \in A$ such that $V_m \subset W_{\alpha_m}$;

(ii) there are smooth mappings $\varphi_m : B(o; 2) \to V_m$ such that $\bigcup_{m=1}^{\infty} \varphi_m(B(o; 1)) = \mathcal{M}$;

(iii) the collection $\{V_m\}_{m\in\mathbb{N}}$ forms a locally finite system, i.e., any point $x \in \mathcal{M}$ has a neighborhood which intersects only a finite number of the sets $\{V_m\}_{m\in\mathbb{N}}$.

Proof. Choose a sequence $\{G_n\}_{n=1}^{\infty}$ of open subsets of $\mathcal{M}$ which has the property stated prior to this lemma. Put in addition $G_0 = G_{-1} = \emptyset$. The main idea behind the forthcoming construction is that the compact sets

$$K_n := \overline{G_n} \setminus G_{n-1}, \qquad n \in \mathbb{N},$$

cover $\mathcal{M}$ and the larger open sets

$$H_n := G_{n+1} \setminus \overline{G_{n-2}}, \qquad n \in \mathbb{N},$$

form a locally finite system. Fix $n \in \mathbb{N}$. Let $(V, \psi)$ be a local chart at $x \in H_n^\alpha := H_n \cap W_\alpha$. Put $y = P\psi(x)$. There is a ball $B(y; \delta) \subset P\psi(V \cap H_n^\alpha)$. We now shift the center $y$ to the origin and expand the ball appropriately, namely we put

$$\varphi_x^n(z) = (P\psi)^{-1}\left(y + \frac{\delta}{2} z\right) \qquad \text{for } z \in B(o; 2).$$

With the help of these smooth maps $\varphi_x^n$ we return back to the manifold by setting $V_x^n = \varphi_x^n(B(o; 1))$ (see Figure 4.3.18). Notice that $V_x^n$ is open in $\mathcal{M}$. The open sets $\{V_x^n\}_{x \in H_n^\alpha,\, \alpha \in A}$ cover the compact set $K_n$. We choose a finite subcovering $V_{x_1}^n, \dots, V_{x_{k_n}}^n$. The collection $\{V_{x_j}^n\}_{j=1,\dots,k_n,\ n\in\mathbb{N}}$ covers $\mathcal{M}$, and $\{\varphi_{x_j}^n(B(o; 2))\}_{j=1,\dots,k_n,\ n\in\mathbb{N}}$ is the desired locally finite countable system $\{V_m\}_{m\in\mathbb{N}}$.

Figure 4.3.18.

24 A partition of unity is defined in topology in a more general way; see the corresponding textbooks, e.g., Dugundji [43, Chapter VIII].

25 We wish to point out that topological notions (like interior) are taken here with respect to the topology of $\mathcal{M}$, i.e., $G \subset \mathcal{M}$ is open provided there is an open set $H \subset \mathbb{R}^N$ such that $G = \mathcal{M} \cap H$.


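Theorem 4.3.76. For any atlas $(W_\alpha, \psi_\alpha)_{\alpha\in A}$ of a manifold $\mathcal{M}$ there exists a subordinate partition of unity.

Proof. According to the previous Lemma 4.3.75 we choose a locally finite subordinate covering $\{V_k\}_{k\in\mathbb{N}}$ of $\mathcal{M}$ and the corresponding functions $\{\varphi_k\}_{k=1}^{\infty}$. It is easy to show that the function

$$\eta(y) = \begin{cases} e^{-\frac{1}{1 - \|y\|^2}}, & \|y\| < 1, \\ 0, & \|y\| \ge 1, \end{cases} \qquad y \in \mathbb{R}^M,$$

is a $C^\infty$-function in $\mathbb{R}^M$. Put

$$\beta_k(x) = \begin{cases} \eta(y), & x \in V_k,\ x = \varphi_k(y), \\ 0, & x \in \mathcal{M} \setminus V_k. \end{cases}$$

Then $\beta_k$ is smooth (of the same order as $\mathcal{M}$) and

$$\beta_k(x) > 0 \qquad \text{for} \qquad x \in \varphi_k(B(o; 1)).$$

Since $\{V_k\}_{k\in\mathbb{N}}$ is a locally finite system, the series $\sum_{k=1}^{\infty} \beta_k(x)$ has only a finite number of nonzero terms. Moreover, $\sum_{k=1}^{\infty} \beta_k(x) > 0$ for all $x \in \mathcal{M}$ due to $\bigcup_{k=1}^{\infty} \varphi_k(B(o; 1)) = \mathcal{M}$. It is now sufficient to put

$$\alpha_k(x) = \frac{\beta_k(x)}{\sum_{n=1}^{\infty} \beta_n(x)}, \qquad k \in \mathbb{N},$$

to obtain the desired partition of unity.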
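The normalization step of this proof can be tried out in the simplest case. The sketch below assumes the one-dimensional manifold $\mathbb{R}$ with the hypothetical parametrizations $\varphi_k(y) = k + y$ on $B(o; 2) = (-2, 2)$, indexed by $k \in \mathbb{Z}$ instead of $\mathbb{N}$ for convenience; these choices and all function names are illustrative and not taken from the text.

```python
import math

def eta(y):
    """The bump function of the proof in dimension one:
    exp(-1 / (1 - y^2)) for |y| < 1 and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0

def beta(k, x):
    """beta_k(x) = eta(y) for x = phi_k(y) = k + y (assumed chart)."""
    return eta(x - k)

def alpha(k, x):
    """alpha_k = beta_k / sum_n beta_n. By local finiteness only the
    integers n with |x - n| < 1 contribute, so a short window suffices."""
    n0 = math.floor(x)
    denom = sum(beta(n, x) for n in (n0 - 1, n0, n0 + 1, n0 + 2))
    return beta(k, x) / denom

# The alpha_k are nonnegative, vanish outside (k - 1, k + 1) and sum to 1.
sample_sum = sum(alpha(k, 0.25) for k in range(-10, 11))
```

Here the sets $\varphi_k(B(o; 1)) = (k - 1, k + 1)$ cover $\mathbb{R}$, so the denominator is positive everywhere, exactly as in the argument above.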

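We can now return to the definition of the integral of a one-form $\omega$. If $\{\alpha_n\}_{n\in\mathbb{N}}$ is a partition of unity which is subordinate to a covering $\{V_n\}_{n\in\mathbb{N}}$ of $\mathcal{M}$ where $(V_n, \psi_n)_{n\in\mathbb{N}}$ is an atlas of $\mathcal{M}$, then

$$\omega(x) = \sum_{n=1}^{\infty} \alpha_n(x)\, \omega(x), \qquad x \in \mathcal{M}.$$

Notice that $\alpha_n \omega$ is a one-form and $\operatorname{supp} \alpha_n \omega \subset V_n$. This decomposition of $\omega$ allows us to define the integral locally. If $\gamma$ is a smooth curve in $\mathcal{M}$ and $\gamma(t) \in V_n$, then $\dot\gamma(t) \in T_{\gamma(t)}\mathcal{M}$ and it can be written in the form

$$\dot\gamma(t) = \sum_{i=1}^{M} \dot\gamma_i(t) \frac{\partial}{\partial y_i}.$$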

Definition 4.3.77. Let $\mathcal{M}$ be an $M$-dimensional differentiable manifold and denote $(V_n, \psi_n)_{n\in\mathbb{N}}$ its atlas. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to $\{V_n\}_{n\in\mathbb{N}}$. Let $\gamma : I \to \mathcal{M}$ be a smooth curve and $\omega$ a one-form on $\mathcal{M}$. If

$$\omega(x) = \sum_{i=1}^{M} f_i(x)\, dy^i, \qquad x \in V_n,$$

then we define

$$\int_\gamma \omega := \sum_{n=1}^{\infty} \int_\gamma \alpha_n \omega := \sum_{n=1}^{\infty} \int_I \alpha_n(\gamma(t)) \sum_{i=1}^{M} f_i(\gamma(t))\, \dot\gamma_i(t)\, dt \tag{4.3.39}$$

provided the integrals on the right-hand side exist and the sum is absolutely convergent.

Remark 4.3.78.

(i) If $\gamma$ is a smooth curve defined on a compact interval $I = [a, b]$ and $\{V_n\}_{n\in\mathbb{N}}$ is a locally finite covering of $\mathcal{M}$, then $\gamma$ lies in a finite number of $\{V_n\}_{n\in\mathbb{N}}$ only. If, moreover, the form $\omega$ is continuous, then the integrals in (4.3.39) exist and the sum is absolutely convergent, since it contains only finitely many nonzero terms. We require absolute convergence of the series because we do not want the value of the integral to depend on the arrangement of charts $(V_n, \psi_n)_{n\in\mathbb{N}}$ into a sequence.

(ii) It can be proved (do it as an exercise!) that the formula (4.3.39) does not depend on the choice of partition of unity. It should be also proved that the right-hand side in (4.3.39) is the same for all equivalent atlases on $\mathcal{M}$ (see Definition 4.3.34). This follows from the transformation rules for tangent vectors (4.3.17) and for differential forms (4.3.22).

Remark 4.3.79. We can interpret the local coordinates $(f_1, \dots, f_M)$ of a one-form $\omega$ as the local coordinates of a vector field

$$F(x) = \sum_{i=1}^{M} f_i(x) \frac{\partial}{\partial x_i}, \qquad x \in V_n$$

(and vice versa, see Remark 4.3.56). If we define

$$\int_\gamma F = \int_\gamma \omega,$$

then the integral $\int_\gamma F$ expresses the work done by the vector field $F$ along the curve $\gamma$. The special cases $\mathcal{M} = \mathbb{R}^2, \mathbb{R}^3$ are known from introductory courses in mechanics (see, e.g., Kittel, Knight & Ruderman [76, Chapter 5]).

Remark 4.3.80. Figures 4.3.2 and 4.3.3 show that a smooth curve need not be a differentiable manifold in $\mathbb{R}^N$. In order to avoid such cases it is sufficient to assume that the curve $\gamma$ has a parametrization which is an embedding (Remark 4.3.3(iv)). If, moreover, $\gamma$ lies on a manifold $\mathcal{M}$, then it is a so-called submanifold of $\mathcal{M}$ in the following sense: A subset $\mathcal{P}$ of a differentiable manifold $\mathcal{M}$ is said to be a $P$-dimensional submanifold of $\mathcal{M}$ if there is an atlas $(V_n, \psi_n)_{n\in\mathbb{N}}$ of $\mathcal{M}$ such that

$$\psi_n(x) = (y_1, \dots, y_P, 0, \dots, 0) \in \mathbb{R}^N \qquad \text{for all} \qquad x \in V_n \cap \mathcal{P}.$$

The proof of Proposition 4.3.2 shows that the image of an embedding is a submanifold.

In order to integrate functions over a surface in $\mathbb{R}^3$, or more generally over a manifold, we need to generalize the notion of area of a parallelogram to non-flat domains. Let us recall here the definition of the multiple Riemann integral. The notion of (normalized) area or volume is based on the fact that the unit cube

$$C := \left\{ \sum_{i=1}^{M} x_i e_i : 0 \le x_i \le 1 \right\}$$

($e_1, \dots, e_M$ is the standard basis in $\mathbb{R}^M$) has the $M$-dimensional volume (i.e., the Lebesgue measure) equal to 1. Let $A$ be a parallelepiped in $\mathbb{R}^M$ spanned by vectors $v_1, \dots, v_M$, i.e.,

$$A := \left\{ \sum_{j=1}^{M} \alpha_j v_j : 0 \le \alpha_j \le 1 \right\}.$$

Then the volume $V(A)$ of $A$ is defined by

$$V(A) := \int_A 1 \, dx.$$

This integral can be calculated with the help of the linear operator $T : \mathbb{R}^M \to \mathbb{R}^M$ which sends the vectors $e_1, \dots, e_M$ of the standard basis to $v_1, \dots, v_M$ ($T e_j = v_j = \sum_{i=1}^{M} t_{ij} e_i$). It is well known that

$$\int_A 1 \, dx = \int_C |\det T| \, dy = |\det T| = |\det (t_{ij})_{i,j=1,\dots,M}|.{}^{26} \tag{4.3.40}$$

There is a problem with the generalization to a manifold since a manifold is bent. Nonetheless, a manifold can be supposed to be locally flat provided it is smooth. This basic principle of analysis allows us to define an infinitesimal area or volume via these notions for flat tangent spaces. Another problem now arises since there is no natural unit cube in $T_a\mathcal{M}$. To overcome this obstacle we want to express the $M$-volume of the parallelepiped, spanned by the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$, without using the standard basis. This can be done for the parallelepiped $A$ given above with the help of the scalar product (in which the standard basis is an orthonormal basis). If $G(v_1, \dots, v_M)$ is the so-called Gram matrix of vectors $v_1, \dots, v_M$, i.e.,

$$G(v_1, \dots, v_M) = \begin{pmatrix} (v_1, v_1)_{\mathbb{R}^N} & \cdots & (v_1, v_M)_{\mathbb{R}^N} \\ \vdots & \ddots & \vdots \\ (v_M, v_1)_{\mathbb{R}^N} & \cdots & (v_M, v_M)_{\mathbb{R}^N} \end{pmatrix},$$

then the formula

$$V(A) = [\det G(v_1, \dots, v_M)]^{\frac{1}{2}} \tag{4.3.41}$$

holds (see Exercise 4.3.103).

26 More generally: If $T$ is a (nonlinear) diffeomorphism which maps $C$ onto $A := T(C)$, then the formula $V(A) := \int_A 1 \, dx = \int_C |\det T'(y)| \, dy$ holds for the Lebesgue measure $V(A)$ of $A$. The proof of this nonlinear version is based on (4.3.40).
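Formula (4.3.41) can be compared with (4.3.40) on a concrete parallelepiped. The following is a small sketch, assuming vectors given as tuples of floats and a determinant computed by Laplace expansion (adequate only for small matrices); the helper names and the sample vectors are illustrative.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gram(vectors):
    """Gram matrix of pairwise Euclidean scalar products (v_i, v_j)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def volume(vectors):
    """V(A) = [det G(v_1, ..., v_M)]^(1/2) as in (4.3.41)."""
    return math.sqrt(det(gram(vectors)))

# Three vectors in R^3: (4.3.41) should agree with |det T| from (4.3.40).
T = [(1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
vol_gram = volume(T)
vol_det = abs(det(T))

# Two vectors in R^3: the Gram formula still gives the area of the
# parallelogram, although no square matrix T is available.
area = volume([(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)])
```

The second call illustrates why the Gram determinant is the form of the volume that carries over to tangent spaces $T_a\mathcal{M} \subset \mathbb{R}^N$ with $M < N$.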


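Since $T_a\mathcal{M} \subset \mathbb{R}^N$, the scalar product in $\mathbb{R}^N$ can be restricted to $T_a\mathcal{M}$ to get a natural scalar product in $T_a\mathcal{M}$.27 This justifies the next definition. We wish to point out that the Gram matrix $G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)$ consists of scalar products of vectors $\frac{\partial}{\partial y_i}, \frac{\partial}{\partial y_j}$ in $\mathbb{R}^N$ (the differential structure of $\mathcal{M}$ is inherited from $\mathbb{R}^N$).

Definition 4.3.81. Let $\mathcal{M}$ be a differentiable manifold with an atlas $(V_n, \psi_n)_{n\in\mathbb{N}}$ and let

$$U_n = P_n\psi_n(V_n), \qquad \varphi_n = (\psi_n|_{V_n})^{-1}.$$

Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity in $\mathcal{M}$ subordinate to $\{V_n\}_{n\in\mathbb{N}}$. If $f$ is a continuous function on $\mathcal{M}$, then we define

$$\int_{\mathcal{M}} f \, dV := \sum_{n=1}^{\infty} \int_{U_n} (\alpha_n f)(\varphi_n(y)) \left[\det G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)\right]^{\frac{1}{2}} dy_1 \cdots dy_M \tag{4.3.42}$$

provided the right-hand side exists and the sum is absolutely convergent.

Remark 4.3.82. It is possible to show that this definition does not depend on the partition of unity and on the choice of an atlas. The right-hand side in (4.3.42) exists whenever $f$ has compact support or, in particular, if $\mathcal{M}$ is a compact manifold.

Example 4.3.83. Compute the surface area $V(S^2)$ of the unit sphere $S^2$ in $\mathbb{R}^3$. It is obvious that the two-dimensional surface area of the “Greenwich meridian $G$” is zero. The rest $S^2 \setminus G$ is covered by one chart with

$$\varphi(\alpha, \vartheta) = (\cos\alpha \cos\vartheta, \sin\alpha \cos\vartheta, \sin\vartheta), \qquad \alpha \in (0, 2\pi), \quad \vartheta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right).$$

Since

$$\frac{\partial}{\partial\alpha} = (-\sin\alpha \cos\vartheta, \cos\alpha \cos\vartheta, 0), \qquad \frac{\partial}{\partial\vartheta} = (-\cos\alpha \sin\vartheta, -\sin\alpha \sin\vartheta, \cos\vartheta),$$

we have $n = 1$, $U_1 = (0, 2\pi) \times \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, $\alpha_1 = 1$ and $\det G\left(\frac{\partial}{\partial\alpha}, \frac{\partial}{\partial\vartheta}\right) = \cos^2\vartheta$, so that

$$V(S^2) = \int_{S^2} 1 \, dS = \int_{(0,2\pi)\times(-\frac{\pi}{2},\frac{\pi}{2})} \cos\vartheta \, d\alpha \, d\vartheta = 4\pi.$$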
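The computation in Example 4.3.83 can be reproduced numerically straight from Definition 4.3.81. The sketch below assumes central differences for the coordinate vectors and a midpoint rule on $U_1$; the grid size and function names are illustrative choices.

```python
import math

def chart(alpha, theta):
    """The spherical chart phi(alpha, theta) of Example 4.3.83."""
    return (math.cos(alpha) * math.cos(theta),
            math.sin(alpha) * math.cos(theta),
            math.sin(theta))

def sqrt_det_gram(alpha, theta, eps=1e-5):
    """[det G(d/d alpha, d/d theta)]^(1/2) with the coordinate vectors
    approximated by central differences of the chart."""
    d_alpha = [(p - q) / (2 * eps)
               for p, q in zip(chart(alpha + eps, theta), chart(alpha - eps, theta))]
    d_theta = [(p - q) / (2 * eps)
               for p, q in zip(chart(alpha, theta + eps), chart(alpha, theta - eps))]
    g11 = sum(u * u for u in d_alpha)
    g12 = sum(u * v for u, v in zip(d_alpha, d_theta))
    g22 = sum(v * v for v in d_theta)
    return math.sqrt(max(g11 * g22 - g12 * g12, 0.0))

def sphere_area(n=200):
    """Midpoint rule for (4.3.42) with f = 1 on (0, 2 pi) x (-pi/2, pi/2)."""
    h_alpha = 2 * math.pi / n
    h_theta = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h_alpha
        for j in range(n):
            theta = -math.pi / 2 + (j + 0.5) * h_theta
            total += sqrt_det_gram(alpha, theta) * h_alpha * h_theta
    return total

area = sphere_area()
```

The value of `area` should be close to $4\pi$, and `sqrt_det_gram` should return approximately $\cos\vartheta$, in agreement with $\det G = \cos^2\vartheta$.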

(It is more common to denote the integration symbol in the two-dimensional case by $dS$ instead of $dV$.)

It follows from Definition 4.3.77 (see also Exercise 4.3.100) that the integral of a one-form along a curve $\gamma$ depends on the orientation of $\gamma$. Namely, if

$$\tilde\gamma(t) = \gamma(1 - t), \qquad t \in (0, 1),$$

then $\dot{\tilde\gamma}(t) = -\dot\gamma(1 - t)$, and hence

$$\int_{\tilde\gamma} \omega = -\int_\gamma \omega.$$

27 This scalar product leads to uniformly distributed mass or currents in physical applications but it is sometimes unrealistic. To cover further applications in a more realistic manner we can consider different scalar products at different points of a manifold. Since any positive definite symmetric bilinear form in $\mathbb{R}^M \times \mathbb{R}^M$ determines a scalar product in $\mathbb{R}^M$, we can introduce a metric structure on a manifold $\mathcal{M}$ by a (smooth) mapping $g : x \in \mathcal{M} \mapsto S_2^+(T_x\mathcal{M})$ (positive definite symmetric bilinear forms on $T_x\mathcal{M}$). Such $g$ is called a Riemann metric on $\mathcal{M}$.


This dependence on an orientation is crucial for the generalization of the curve integral to an integral of a differential form over a manifold. What can the orientation of a manifold be? Let us start with simple examples like $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$. It is the common understanding that the “standard” (equivalently “positive”) orientation on $\mathbb{R}$ is from the left to the right, in $\mathbb{R}^2$ anticlockwise and by the right-hand rule in $\mathbb{R}^3$. These slightly vague formulations can be made precise by taking fixed bases in $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$, e.g., the standard bases. Then all bases are divided into two disjoint classes according to the sign of the determinant of the transformation matrix $T$ which sends the fixed (e.g., standard) basis into a new one. We say that $\tilde e_1, \dots, \tilde e_M$ is a positive basis if $\det T > 0$. We want to remind the reader that

$$\det T = f^1 \wedge \cdots \wedge f^M(\tilde e_1, \dots, \tilde e_M)$$

if $T e_i = \tilde e_i$, $i = 1, \dots, M$, and $f^1, \dots, f^M$ is the dual basis to $e_1, \dots, e_M$ (Example 4.3.52(i)). This indicates that the choice of a fixed nowhere-vanishing continuous $M$-form $\omega$ (i.e., $\omega(x) \neq 0$ for all $x \in \mathcal{M}$) on the $M$-dimensional manifold $\mathcal{M}$ makes it possible to introduce an orientation on $\mathcal{M}$. If $(V, \psi)$ is a local chart at a point $x \in \mathcal{M}$, then the basis $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ of $T_x\mathcal{M}$ is said to be a positive basis of $T_x\mathcal{M}$ provided

$$\omega(x)\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right) > 0.$$

It can be proved that a continuous non-vanishing form exists on $\mathcal{M}$ if and only if there is an atlas $(V_n, \psi_n)$ of $\mathcal{M}$ such that for any two charts $(V, \psi)$, $(\tilde V, \tilde\psi)$ of this atlas the transformation matrix $((P\tilde\psi) \circ \varphi)'(y)$ (see (4.3.17)) has a positive determinant for all $y \in \psi(V \cap \tilde V)$ (provided $V \cap \tilde V \neq \emptyset$).

Definition 4.3.84. A differentiable manifold $\mathcal{M}$ of dimension $M$ is said to be orientable if there exists a continuous nowhere-vanishing $M$-form on $\mathcal{M}$. If such a form $\omega$ is fixed, then $(\mathcal{M}, \omega)$ is called an oriented manifold.

Example 4.3.85.

(i) Suppose that $\mathcal{M}$ is a two-dimensional orientable connected manifold in $\mathbb{R}^3$ (i.e., a surface) and $\omega$ is a nowhere-vanishing two-form on $\mathcal{M}$. The question is how these orientations of $T_x\mathcal{M}$ cohere with the natural orientation of $\mathbb{R}^3$. To find an answer we choose a point $x \in \mathcal{M}$ and local coordinates at $x$ such that $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ form a positive basis in $T_x\mathcal{M}$. It is obvious that there is a vector $n \in \mathbb{R}^3$ which is perpendicular to $T_x\mathcal{M} \subset \mathbb{R}^3$ and such that

$$\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$$

is a positive basis of $\mathbb{R}^3$. The vector $\frac{n}{\|n\|}$ is called a (unit) outer normal vector to $\mathcal{M}$ at the point $x$. It is easy to prove that

$$n = \frac{\partial}{\partial y_1} \times \frac{\partial}{\partial y_2}.$$

For the definition of the cross product see Example 4.3.55(ii).

(ii) The Möbius strip $S \subset \mathbb{R}^3$ is an example of a non-orientable manifold. An argument to prove that can be based on the above consideration. Choose a point $a \in S$, a basis $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ in $T_a S$ and find the outer normal vector $n_a$. Now move the point $a$ together with the basis $\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$ along the whole strip to come back to the initial position. The vector $n_a$ will end at $-n_a$ (see Figure 4.3.19).

Figure 4.3.19.

Definition 4.3.86. Let $(\mathcal{M}, \omega)$ be an oriented $M$-dimensional differentiable manifold. Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ for which the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for all $x \in V_n$ and all $n \in \mathbb{N}$. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to this atlas. If $\omega$ is a continuous $M$-form with the local representation

$$\omega(x) = f_n(x)\, dy^1 \wedge \cdots \wedge dy^M, \qquad x \in V_n,$$

then we define the integral of $\omega$ over $\mathcal{M}$ as

$$\int_{\mathcal{M}} \omega := \sum_{n=1}^{\infty} \int_{\mathcal{M}} \alpha_n \omega := \sum_{n=1}^{\infty} \int_{U_n} \varphi_n^*(\alpha_n \omega)(y) = \sum_{n=1}^{\infty} \int_{U_n} (\alpha_n f_n)(\varphi_n(y))\, dy_1 \cdots dy_M\,{}^{28} \tag{4.3.43}$$

provided the right-hand side exists and the sum is absolutely convergent.

Remark 4.3.87.

(i) If a form $\omega$ has compact support, in particular, if $\mathcal{M}$ is a compact set in $\mathbb{R}^N$, then the integral $\int_{\mathcal{M}} \omega$ exists.

28 For $\eta(y) = g(y)\, df^1 \wedge \cdots \wedge df^M$ a continuous $M$-form on a measurable set $U \subset \mathbb{R}^M$ (cf. Remark 4.3.54(ii)) we define

$$\int_U g(y)\, df^1 \wedge \cdots \wedge df^M = \int_U g(y)\, dy_1 \cdots dy_M.$$


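(ii) Definition 4.3.86 does not depend on the concrete choice of an atlas and on a partition of unity. If the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ determine a negative basis in $T_x\mathcal{M}$, then we change the order, e.g., to

$$\frac{\partial}{\partial y_2}, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_3}, \dots, \frac{\partial}{\partial y_M},$$

to get a positive basis. Notice that

$$\omega(x) = f(x)\, dy^1 \wedge \cdots \wedge dy^M = -f(x)\, dy^2 \wedge dy^1 \wedge dy^3 \wedge \cdots \wedge dy^M.$$

(iii) Definition 4.3.86 is also independent of a transformation of coordinates in the following sense: Let $g$ be a diffeomorphism of an oriented manifold $\mathcal{M}$ onto a manifold $\tilde{\mathcal{M}}$.29 Then $g$ induces an orientation on $\tilde{\mathcal{M}}$30 and with respect to this orientation the equality

$$\int_{\tilde{\mathcal{M}}} \omega = \int_{\mathcal{M}} g^*\omega \tag{4.3.44}$$

holds for any continuous $M$-form $\omega$ on $\tilde{\mathcal{M}}$.

(iv) If a curve $\gamma$ is itself a differentiable manifold (Remark 4.3.80) with the orientation induced from $\mathbb{R}$, then $\int_\gamma \omega$ defined by (4.3.39) is the same as in (4.3.43).

(v) If $\mathcal{M}$ is an oriented manifold and $\omega_V$ is the $M$-form given in local coordinates by

$$\omega_V = \left[\det G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)\right]^{\frac{1}{2}} dy^1 \wedge \cdots \wedge dy^M$$

where $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ is a positive basis, then the volume $V(\mathcal{M}) := \int_{\mathcal{M}} dV$ (see (4.3.42)) is given by

$$V(\mathcal{M}) = \int_{\mathcal{M}} \omega_V.$$

29 We have not defined this notion yet, but it is almost evident how to generalize the well-known case $\mathbb{R}^M \to \mathbb{R}^M$. One has to overcome certain difficulties which are caused by the local definition of a manifold and the global notion of diffeomorphism.

30 Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ such that the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for $x \in V_n$. Then $g_*\frac{\partial}{\partial y_1}, \dots, g_*\frac{\partial}{\partial y_M}$ determine a positive basis of $T_{g(x)}\tilde{\mathcal{M}}$.

Example 4.3.88. Let $\mathcal{M}$ be the part of the unit sphere given by

$$\left\{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1,\ z > -\frac{1}{2}\right\}$$

and

$$\omega(x, y, z) = y^2\, dx \wedge dy + yz\, dx \wedge dz + x^2\, dy \wedge dz.$$

We choose the spherical coordinates

$$\varphi: \quad x = \cos\alpha \cos\vartheta, \quad y = \sin\alpha \cos\vartheta, \quad z = \sin\vartheta, \qquad (\alpha, \vartheta) \in U := (-\pi, \pi) \times \left(-\frac{\pi}{6}, \frac{\pi}{2}\right),$$

and the orientation such that $\frac{\partial}{\partial\alpha}, \frac{\partial}{\partial\vartheta}$ is a positive basis. We wish to compute the integral $\int_{\mathcal{M}} \omega$.

The curve $\gamma = \left\{(-\cos\vartheta, 0, \sin\vartheta) : \vartheta \in \left(-\frac{\pi}{6}, \frac{\pi}{2}\right)\right\}$ has two-dimensional surface measure equal to zero, and therefore

$$\int_{\mathcal{M}} \omega = \int_U \varphi^*\omega.$$

We have (see Example 4.3.55(ii) or (4.3.33) and (4.3.19))

$$\varphi^*(dx \wedge dy) = \sin\vartheta \cos\vartheta \, d\alpha \wedge d\vartheta, \qquad \varphi^*(dy \wedge dz) = \cos\alpha \cos^2\vartheta \, d\alpha \wedge d\vartheta, \qquad \varphi^*(dz \wedge dx) = \sin\alpha \cos^2\vartheta \, d\alpha \wedge d\vartheta,$$

i.e.,

$$\varphi^*\omega = \cos^3\alpha \cos^4\vartheta \, d\alpha \wedge d\vartheta.$$

An easy computation gives

$$\int_{\mathcal{M}} \omega = \int_{(-\pi,\pi)\times(-\frac{\pi}{6},\frac{\pi}{2})} \cos^3\alpha \cos^4\vartheta \, d\alpha \, d\vartheta = 0.$$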
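The pull-back in Example 4.3.88 can be double-checked by evaluating the $2 \times 2$ Jacobian minors of the chart directly. This is a sketch under the assumption that the partial derivatives of $\varphi$ are coded by hand; the function names and the grid size are illustrative.

```python
import math

def pullback_coefficient(alpha, theta):
    """Coefficient of d alpha ^ d theta in phi^* omega for
    omega = y^2 dx^dy + y z dx^dz + x^2 dy^dz (Example 4.3.88)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    x, y, z = ca * ct, sa * ct, st
    # Partial derivatives of the chart with respect to alpha and theta.
    x_a, x_t = -sa * ct, -ca * st
    y_a, y_t = ca * ct, -sa * st
    z_a, z_t = 0.0, ct
    # Pull-backs of the basic two-forms are the 2x2 Jacobian minors.
    dx_dy = x_a * y_t - x_t * y_a
    dx_dz = x_a * z_t - x_t * z_a
    dy_dz = y_a * z_t - y_t * z_a
    return y * y * dx_dy + y * z * dx_dz + x * x * dy_dz

def integrate_over_chart(n=200):
    """Midpoint rule over U = (-pi, pi) x (-pi/6, pi/2)."""
    h_alpha = 2 * math.pi / n
    h_theta = (2 * math.pi / 3) / n
    total = 0.0
    for i in range(n):
        alpha = -math.pi + (i + 0.5) * h_alpha
        for j in range(n):
            theta = -math.pi / 6 + (j + 0.5) * h_theta
            total += pullback_coefficient(alpha, theta) * h_alpha * h_theta
    return total

integral = integrate_over_chart()
```

The coefficient should agree with $\cos^3\alpha \cos^4\vartheta$ at every sample point, and `integral` should vanish up to rounding because $\cos^3\alpha$ integrates to zero over a full period.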

It is evident that the computation of $\int_{\mathcal{M}} \omega$ need not be an easy task when several charts have to be used to cover the support of $\omega$. The main reason is that a partition of unity must be constructed and this is technically difficult. Because of that, we would like to have such useful tools as the Fubini Theorem and the Fundamental Theorem of Calculus. The latter theorem, i.e.,

$$\int_a^b f'(x) \, dx = f(b) - f(a),$$

can be interpreted in the manifold language as follows. The closed interval $[a, b]$ is a manifold $\mathcal{M}$ positively oriented from the left to the right. The one-form $f'(x)\, dx$ is the differential of the zero-form (i.e., the function $f$), and the (oriented) boundary of $\mathcal{M}$ consists of the points $b$, $a$ (in this order). The Fundamental Theorem of Calculus reduces the integral of the form $df(x) = f'(x)\, dx$ over $\mathcal{M}$ to an “integral” of $f$ over $\partial\mathcal{M}$. This observation is essential for the generalization to manifolds with boundaries. To do that we have to define the boundary of $\mathcal{M}$ first, and then to show how this boundary inherits the orientation of $\mathcal{M}$.

Definition 4.3.89. Let $\mathcal{N}$ be an $M$-dimensional differentiable manifold in $\mathbb{R}^N$. A closed subset $\mathcal{M}$ of $\mathcal{N}$ is said to be an $M$-dimensional differentiable manifold with boundary31 if $\overline{\operatorname{int} \mathcal{M}} = \mathcal{M}$ (the interior and closure are taken in the topology of $\mathcal{N}$) and for any point $x \in \mathcal{M}$ there is a chart $(V, \psi)$ of an atlas of $\mathcal{N}$ such that either

(i) $V \subset \mathcal{M}$ or

(ii) $P\psi(x) = (0, y_2, \dots, y_M)$ and $P\psi(V \cap \mathcal{M}) = \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 \le 0\} \cap P\psi(V)$.

A point $x$ is called an interior point of $\mathcal{M}$ in the case (i). If $x$ is not an interior point, then $x$ is called a boundary point of $\mathcal{M}$. The collection of all boundary points is called the boundary of $\mathcal{M}$ and denoted by $\partial\mathcal{M}$ (see Figure 4.3.20).

31 This boundary can be an empty set (see, e.g., Remark 4.3.90(i)). If this boundary is nonempty, then the manifold $\mathcal{M}$ is not a differentiable manifold in the sense of Definition 4.3.4. See also Remark 4.3.90(ii).

Figure 4.3.20.

Remark 4.3.90.

(i) A manifold can have empty boundary. The sphere $S^2$ in $\mathbb{R}^3$ is an example of this fact. Such a manifold is also called a manifold without boundary.

(ii) If $\mathcal{M}$ is a manifold with nonempty boundary $\partial\mathcal{M}$, then the boundary $\partial\mathcal{M}$ is itself a differentiable manifold of dimension $M - 1$ in $\mathbb{R}^N$. An atlas is given by the restriction of the original atlas. We notice that $\partial\mathcal{M}$ has empty boundary, i.e., $\partial(\partial\mathcal{M}) = \emptyset$.

(iii) Let $\mathcal{M}$ be a manifold with boundary. The tangent space $T_a\mathcal{M}$ for an interior point $a \in \mathcal{M}$ is defined as in Appendix 4.3A and $T_a\mathcal{M} = T_a\mathcal{N}$. If $a \in \partial\mathcal{M}$, then we take all smooth curves in $\overline{\mathbb{R}^M_-}$ ($\mathbb{R}^M_- := \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 < 0\}$) through the point $b = \psi(a)$ and transfer them into $\mathbb{R}^N$ by applying $\varphi = (P\psi)^{-1}$. The tangent vectors at the point $a$ of these transferred curves form the tangent space $T_a(\partial\mathcal{M})$. If $\omega : (-1, 1) \to \mathbb{R}^M$ is a smooth curve such that

$$\omega(0) = b, \qquad \omega(t) \in \begin{cases} \mathbb{R}^M_- & \text{for } t < 0, \\ \mathbb{R}^M_+ & \text{for } t > 0, \end{cases}$$

then $\varphi'(b)[\dot\omega(0)]$ is the so-called outer vector to $\partial\mathcal{M}$. The outer normal $n$ to $\partial\mathcal{M}$ at the point $a \in \partial\mathcal{M}$ is the unit outer vector which is perpendicular in $\mathbb{R}^N$ to $T_a(\partial\mathcal{M})$ (see Figure 4.3.21).

Figure 4.3.21.

(iv) If $\mathcal{M} \subset \mathcal{N}$ is a differentiable manifold with boundary and $(\mathcal{N}, \omega)$ is an oriented manifold, then $\omega$ induces an orientation on $\partial\mathcal{M}$ as follows. Choose $a \in \partial\mathcal{M}$ and

$$v_1 := \frac{d}{dt} \varphi(t, b_2, \dots, b_M)\Big|_{t=0}$$

(a special case of an outer vector), and define

$$\omega^\partial(a)(v_2, \dots, v_M) = \omega(a)(v_1, v_2, \dots, v_M).$$

The form $\omega^\partial$ defines the induced orientation on $\partial\mathcal{M}$. In other words: $v_2, \dots, v_M$ is a positive basis of $T_a(\partial\mathcal{M})$ provided $v_1, \dots, v_M$ is a positive basis of $T_a\mathcal{N}$.

Example 4.3.91.

(i) Let $\mathcal{M}$ be the closed ring $\{(x, y) \in \mathbb{R}^2 : 1 \le x^2 + y^2 \le 4\}$. Then $\mathcal{M} \subset \mathbb{R}^2$ is a two-dimensional manifold with boundary $\partial\mathcal{M} = S_1^1 \cup S_2^1$ ($S_r^1$ denotes the circle with radius $r$ and center at the origin). As the local coordinates we take the polar coordinates $(r, \vartheta)$. The (standard) orientation on $\mathcal{M}$ is given either by the standard Euclidean basis $e_1 = (1, 0)$, $e_2 = (0, 1)$ in $T_a\mathcal{M} = \mathbb{R}^2$ or by the form

$$\omega(a) = f^1 \wedge f^2 := dx \wedge dy = r(a)\, dr \wedge d\vartheta.$$

The special outer vector $v_1$ mentioned in Remark 4.3.90(iv) is

$$v_1 = \begin{cases} -\dfrac{\partial}{\partial r} & \text{at the point } a_1 = (1, \vartheta_1), \\[2mm] \dfrac{\partial}{\partial r} & \text{at the point } a_2 = (2, \vartheta_2). \end{cases}$$

The completion to a positive basis in $T_a\mathbb{R}^2$ is shown in Figures 4.3.22 and 4.3.23.

Figure 4.3.22.

Figure 4.3.23.

The form $\omega^\partial$ on $S_1^1$ is given (in the polar coordinates $r, \vartheta$) by

$$\omega^\partial(a)\left(\frac{\partial}{\partial\vartheta}\right) = r(a)\, dr \wedge d\vartheta\left(-\frac{\partial}{\partial r}, \frac{\partial}{\partial\vartheta}\right) = -r(a).$$

(ii) Let $B$ be the closed unit ball in $\mathbb{R}^3$. Then $B$ is a three-dimensional manifold (included in $\mathbb{R}^3$) with boundary $\partial B = S^2$ (the two-dimensional sphere). The standard orientation in $\mathbb{R}^3 = T_a B$ gives the orientation in $\operatorname{int} B$. The induced orientation on $\partial B$ is obtained by Remark 4.3.90(iv). Namely, we take a normal vector $n = \frac{\partial}{\partial r}$ at a point $a \in \partial B$ and independent vectors $v_2, v_3 \in T_a(\partial B)$ in the order given by the right-hand rule for $(n, v_2, v_3)$ (see Figure 4.3.24).32 Notice that the orientation on $B$ is given, e.g., by $\omega = dx \wedge dy \wedge dz = (r^2 \cos\vartheta)\, dr \wedge d\alpha \wedge d\vartheta$ in the spherical coordinates. Similarly,

$$\omega^\partial(v_2, v_3) = \omega\left(\frac{\partial}{\partial r}, v_2, v_3\right).$$

The next theorem is a basic result on differential forms and it is the promised generalization of the Fundamental Theorem of Calculus.

Theorem 4.3.92 (Stokes, abstract version). Let $\mathcal{M}$ be an $M$-dimensional oriented manifold with boundary $\partial\mathcal{M}$. Let $\omega$ be a smooth $(M - 1)$-differential form on $\mathcal{M}$ with compact support. Then

$$\int_{\mathcal{M}} d\omega = \int_{\partial\mathcal{M}} i^*\omega\,{}^{33}$$

provided $\partial\mathcal{M}$ has the induced orientation and $i : \partial\mathcal{M} \to \mathcal{M}$ is the canonical injection.

32 More precisely, for $n := v_2 \times v_3$ (the cross product is defined in Example 4.3.55(ii)) the vectors $n, v_2, v_3$ form a positive basis in $\mathbb{R}^3$ and $n$ is perpendicular to $v_2, v_3$.

33 Here $\int_{\mathcal{M}} d\omega$ is defined as in Definition 4.3.86 where $U_n$ need not be an open set; see footnote 28.

Figure 4.3.24.

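Proof. Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ such that the coordinate vectors given by $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form positive bases in the corresponding tangent spaces. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to $\{V_n\}_{n\in\mathbb{N}}$. Since $\omega$ has compact support, we have

$$\int_{\mathcal{M}} d\omega = \sum_{j=1}^{k} \int_{\mathcal{M}} \alpha_j \, d\omega \qquad \text{where} \qquad \operatorname{supp}(\alpha_j \, d\omega) \subset V_j.$$

Therefore, it is sufficient to prove the Stokes Theorem only for the case when $\operatorname{supp} \omega$ is contained in one coordinate neighborhood, say $V$. Suppose that $\omega$ has the representation

$$\omega(x) = \sum_{i=1}^{M} (-1)^{i-1} f_i(y)\, dy^1 \wedge \cdots \wedge \widehat{dy^i} \wedge \cdots \wedge dy^M, \qquad x = \varphi(y) \in V,$$

where the hat denotes that the term $dy^i$ is missing. Then

$$d\omega(x) = \sum_{i=1}^{M} \frac{\partial f_i}{\partial y_i}(y)\, dy^1 \wedge \cdots \wedge dy^M.$$

There are two cases for $V$:

Case 1 ($V \cap \partial\mathcal{M} = \emptyset$). Then $\int_{\partial\mathcal{M}} i^*\omega = 0$ and

$$\int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_U \frac{\partial f_i}{\partial y_i}(y)\, dy_1 \dots dy_M = 0$$

by the Fubini Theorem, since $f_i = 0$ outside of a compact subset of $U$.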


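Case 2 ($V \cap \partial\mathcal{M} \neq \emptyset$). According to the definition of the boundary we can assume that $V$ and $\psi(V) = U$ have the form as in Figure 4.3.20. Then

$$i^*\omega(x) = f_1(y)\, dy^2 \wedge \cdots \wedge dy^M \qquad \text{for} \qquad x = \varphi(y) \in V \cap \partial\mathcal{M}$$

and

$$\int_{\partial\mathcal{M}} i^*\omega = \int_{\partial\mathcal{M}} f_1(y)\, dy^2 \wedge \cdots \wedge dy^M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\, dy_2 \dots dy_M.$$

On the other hand,

$$\int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_U \frac{\partial f_i}{\partial y_i}\, dy_1 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} \left( \int_{-\infty}^{0} \frac{\partial f_1}{\partial y_1}(y)\, dy_1 \right) dy_2 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\, dy_2 \dots dy_M$$

since the integrals for $i = 2, \dots, M$ vanish because of the compact support of the restriction of $f_i$ to $U \cap \mathbb{R}^{M-1}$.

This completes the proof.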

Several special cases of the Stokes Theorem are worth mentioning. We say that a curve $\gamma : [a, b] \to \mathbb{R}^N$ is simple if

$$\gamma(t_1) = \gamma(t_2), \quad t_1 < t_2, \qquad \text{implies} \qquad a = t_1, \quad b = t_2.$$

Corollary 4.3.93 (Green). Let $\Omega$ be a bounded open subset of $\mathbb{R}^2$ the boundary of which is the image of a simple closed smooth curve $\gamma$ which is oriented so that $\Omega$ is on the left-hand side when we move along the curve. Let $F = (f, g) : \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$-mapping in a neighborhood of the closure $\overline{\Omega}$. Then

$$\int_\Omega \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx \, dy = \int_\gamma (f \, dx + g \, dy). \tag{4.3.45}$$

(See Definition 4.3.77 for the integral on the right-hand side.)

Proof. Notice that $\overline{\Omega}$ is an oriented manifold (the positive orientation of $\mathbb{R}^2$) with boundary $\partial\Omega$ (smoothness of $\gamma$) and the above mentioned orientation of $\gamma$ agrees with the induced orientation on $\partial\Omega$. Put

$$\omega = f \, dx + g \, dy$$

and use the Stokes Theorem.

Remark 4.3.94. For $F(x, y) = \frac{1}{2}(-y, x)$ the formula (4.3.45) gives

$$V(\Omega) = \frac{1}{2} \int_\gamma (-y \, dx + x \, dy),$$

which can be used for computation of the area of planar sets.
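The area formula of Remark 4.3.94 is easy to try on sampled boundary curves. The sketch below assumes that the boundary is replaced by a closed polygon through sample points, which is only an approximation of a smooth curve; on each straight edge from $(x_0, y_0)$ to $(x_1, y_1)$ the line integral of $\frac{1}{2}(-y \, dx + x \, dy)$ equals $\frac{1}{2}(x_0 y_1 - x_1 y_0)$. The function name and the sample curves are illustrative.

```python
import math

def area_by_green(points):
    """(1/2) * closed line integral of (-y dx + x dy) along the closed
    polygon through `points`, summed edge by edge."""
    total = 0.0
    count = len(points)
    for k in range(count):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % count]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

# Unit square traversed anticlockwise (interior on the left-hand side).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_area = area_by_green(square)

# Ellipse with half-axes 3 and 2, sampled anticlockwise at 2000 points.
samples = 2000
ellipse = [(3.0 * math.cos(2 * math.pi * k / samples),
            2.0 * math.sin(2 * math.pi * k / samples)) for k in range(samples)]
ellipse_area = area_by_green(ellipse)
```

Reversing the order of the points changes the sign of the result, which reflects the dependence of the curve integral on the orientation of $\gamma$; the ellipse value should be close to $6\pi$.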


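Corollary 4.3.95 (Gauss–Ostrogradski). Let $\Omega$ be a bounded open subset of $\mathbb{R}^3$ such that $\overline{\Omega}$ is a differentiable manifold with boundary $\partial\Omega$. Assume that $F = (f^1, f^2, f^3) : \mathbb{R}^3 \to \mathbb{R}^3$ is a smooth mapping in a neighborhood of $\overline{\Omega}$. Then

$$\int_\Omega \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i} \, dx_1 \, dx_2 \, dx_3 = \int_{\partial\Omega} (F, n) \, dS \tag{4.3.46}$$

where $n$ is the unit outer normal vector to $\partial\Omega$ and $(F, n)$ is the scalar product in $\mathbb{R}^3$. The integral on the right-hand side is defined in Definition 4.3.81.

Proof. Put

$$\omega = f^1 \, dy \wedge dz + f^2 \, dz \wedge dx + f^3 \, dx \wedge dy$$

and choose local coordinates $(u, v)$ on $\partial\Omega$ such that $n, \frac{\partial}{\partial u}, \frac{\partial}{\partial v}$ forms a positive orthonormal basis in $\mathbb{R}^3$. The pull-back $i^*\omega$ has been computed in Example 4.3.55(ii):

$$i^*\omega = \left(F, \frac{\partial}{\partial u} \times \frac{\partial}{\partial v}\right) du \wedge dv.$$

The cross product $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v}$ is the unit vector which is perpendicular in $\mathbb{R}^3$ to $\frac{\partial}{\partial u}, \frac{\partial}{\partial v}$. So, $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v} = n$.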

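Corollary 4.3.96 (Stokes, classical version). Let $\mathcal{M}$ be a bounded two-dimensional oriented manifold in $\mathbb{R}^3$ (i.e., a surface) with boundary which is described by a simple smooth curve $\gamma$. Let $F$ be a $C^1$-vector field on $\mathcal{M}$. Then

$$\int_{\mathcal{M}} (\operatorname{curl} F, n) \, dS = \int_\gamma F$$

where $\gamma$ has the orientation induced from $\mathcal{M}$, the vector $\operatorname{curl} F$ is defined in Example 4.3.58(ii) and $n$ is the unit outer normal vector to $\mathcal{M}$ in $\mathbb{R}^3$ (if $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ is a positive basis at $a \in \mathcal{M}$, then $n$ is perpendicular in $\mathbb{R}^3$ to $T_a\mathcal{M}$ and $\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$ is a positive basis in $\mathbb{R}^3$).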

Proof (a hint). Rewrite the abstract Stokes Theorem for this special case using the corresponding definitions of integrals and Example 4.3.58(ii).

Remark 4.3.97. Considering $F$ in Corollary 4.3.95 as a velocity field of a fluid flow (Remark 4.3.56) we can interpret the right-hand side in (4.3.46) as the amount of the fluid which flows out of a region $\Omega$ per unit time. In particular, if the divergence of $F$ ($\operatorname{div} F := \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}$) vanishes everywhere in $\Omega$, then this amount is zero for any subregion of $\Omega$. In other words, the fluid is incompressible in this case.

Remark 4.3.98. Using an “infinitesimal” ball $\Omega$ centered at $a$ in Corollary 4.3.95 it is possible to interpret the value of the function $\operatorname{div} F = \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}$ at the point $a$ physically. From the mathematical point of view it is more interesting that this is a starting point for the generalization of basic differential operators to non-flat domains. We now briefly describe this procedure. Let $F$ be a vector field on a manifold $\mathcal{M}$ and let $\omega$ be an $M$-form. Define an $(M - 1)$-form $\omega F$ by

$$(\omega F)(v_1, \dots, v_{M-1}) := \omega(F, v_1, \dots, v_{M-1}).$$


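If $(\mathcal{M}, \omega)$ is an oriented manifold, then $d(\omega F)$ has to be a multiple of $\omega$:

$$d(\omega F) := (\operatorname{div} F)\, \omega.$$

We strongly recommend that the reader computes $\operatorname{div} F$, e.g., on the sphere $S^2$. One of the most important partial differential operators is the Laplacian $\Delta$: if $G$ is an open set in $\mathbb{R}^N$, $f : G \to \mathbb{R}$ is smooth, then

$$\Delta f := \sum_{i=1}^{N} \frac{\partial^2 f}{\partial x_i^2}$$

and it is easy to see that

$$\Delta f = \operatorname{div}(\nabla f) \qquad \text{where} \qquad \nabla f := \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_N}\right)$$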

is the gradient of f . Since the notion of the gradient has been deﬁned for functions on manifolds (Remark 4.3.56), we are able to generalize the Laplacian to functions deﬁned on a manifold M : ∆M f = div (∇f ). This operator ∆M is often called the Laplace–Beltrami operator. For more information on the signiﬁcance of this operator the reader can consult, e.g., Chavel [23], Davies & Safarov [31], Robinson [108] or Rosenberg [110]. Remark 4.3.99. In weak formulations of boundary problems for elliptic diﬀerential equations (see Chapter 7) the following Green formula is frequently used: Let a manifold M satisfy the assumptions of the abstract Stokes Theorem and let f, g ∈ C 2 (M ). Then

$$\int_{\mathcal{M}} (\Delta_{\mathcal{M}} f)\, g \, \omega + \int_{\mathcal{M}} (\nabla f, \nabla g)_{T_x\mathcal{M}} \, \omega = \int_{\partial\mathcal{M}} g \, (\nabla f, n)_{T_x(\partial\mathcal{M})} \, \omega^\partial. \tag{4.3.47}$$

The proof is based on a generalization of (4.3.46):

$$\int_{\mathcal{M}} (\operatorname{div} F)\, \omega = \int_{\partial\mathcal{M}} (F, n)\, \omega^\partial$$

which follows from the abstract Stokes Theorem. Another ingredient is the formula for the divergence of the product of a function and a vector field

$$\operatorname{div}(gF) = g \operatorname{div} F + (\nabla g, F).$$

This formula follows from the definition of the divergence by computation.

Exercise 4.3.100. Prove that the definition of $\int_\gamma \omega$ does not depend on the choice of an atlas and a partition of unity. What can be said about the dependence on a parametrization of $\gamma$?

Exercise 4.3.101. Let $\omega$ be an exact one-form and $f$ its “primitive function”, i.e., $df = \omega$. Compute $\int_\gamma \omega$! What is the result if $\gamma$ is a closed curve?


Exercise 4.3.102. Let ω be a closed one-form in M. Show that ω is exact if and only if $\int_\gamma \omega = 0$ for any smooth closed curve γ in M.

Hint. Consult the proof of Theorem 4.3.64.

Exercise 4.3.103. Prove (4.3.41)!

Hint. Use induction. To prove the induction step it is convenient to compute the distance δ of $v_M$ from the span of $v_1, \dots, v_{M-1}$. Show that

\[
\delta^2 = \frac{\det G(v_1, \dots, v_M)}{\det G(v_1, \dots, v_{M-1})}
\]

provided $v_1, \dots, v_{M-1}$ are linearly independent.

Exercise 4.3.104. Check that the definition (4.3.38) is a special case of (4.3.42)! Express the length of a curve $\gamma \subset \mathbb{R}^3$ in spherical coordinates!

Exercise 4.3.105. Let a surface S be determined by the graph of a smooth function $f : U \subset \mathbb{R}^2 \to \mathbb{R}$. Find a formula for the area of S!

Hint.

\[
\iint_U \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\; dx\, dy
\]

Exercise 4.3.106. Let M be the graph of a smooth function $f : \mathbb{R}^M \to \mathbb{R}$ defined on an open set $U \subset \mathbb{R}^M$. Show that M is an orientable manifold!

Exercise 4.3.107. Let $f : \mathbb{R}^N \to \mathbb{R}$ be a smooth mapping for which o is a regular value. Show that $M = \{x \in \mathbb{R}^N : f(x) = 0\}$ is an orientable manifold of dimension N − 1 (provided $M \ne \emptyset$)!

Exercise 4.3.108. Deduce a version of the Cauchy Theorem, i.e., $\int_\gamma f\, dz = 0$, from the Green Theorem (Corollary 4.3.93)!

Hint. Interpret f(z) dz as the couple of differential 1-forms (g dx − h dy, h dx + g dy).

Exercise 4.3.109. Find the formula³⁴ for Δf

(i) in polar coordinates;

Hint. You should get

\[
\Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \vartheta^2};
\]

(ii) in spherical coordinates in $\mathbb{R}^3$;

³⁴ Such formulae are convenient if one is looking for a solution with some symmetries, i.e., invariant with respect to some group actions.

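Hint. You should get

\[
\Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{2}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2 \cos\vartheta}\left(\frac{1}{\cos\vartheta}\frac{\partial^2 f}{\partial \alpha^2} + \cos\vartheta\, \frac{\partial^2 f}{\partial \vartheta^2} - \sin\vartheta\, \frac{\partial f}{\partial \vartheta}\right);
\]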

(iii) on $S^2$;

(iv) on the Riemann manifold (see footnote 27 on page 214).

Exercise 4.3.110. Let M be a connected differentiable manifold.

(i) Prove that there exists a Riemann metric g on M.

Hint. M is embedded into $\mathbb{R}^N$.

(ii) Prove that any two points x, y ∈ M can be connected by a $C^1$-curve, i.e., there is

\[
\gamma : [a, b] \to M, \qquad \gamma \in C^1, \qquad \gamma(a) = x, \quad \gamma(b) = y.
\]

Hint. Use the assumption that M is connected.

(iii) Define the length of a $C^1$-curve γ : [a, b] → M by the formula

\[
l_g(\gamma) := \int_a^b \sqrt{g_{\gamma(t)}\bigl(\dot\gamma(t), \dot\gamma(t)\bigr)}\; dt.
\]

Put $\varrho_g(x, y) = \inf\{l_g(\gamma) : \gamma : [a, b] \to M,\ \gamma(a) = x,\ \gamma(b) = y\}$ and show that $\varrho_g$ is a metric on M.

(iv) How is the topology on M given by $\varrho_g$ related to the topology induced from $\mathbb{R}^N$?

4.3D Brouwer Degree

We will establish the main properties of the degree of a mapping $f : \mathbb{R}^M \to \mathbb{R}^M$ in the basic text (Section 5.2). In this appendix we will present another definition of the Brouwer degree, prove its main properties and give some of its topological applications. This goal can be achieved in different ways but all of them are lengthy and contain intricate calculations. Here we choose the treatment based on the integration of differential forms, mainly to introduce the interested reader to a “geometric world” and to relate the notion of the degree to classical results of the theory of functions of a complex variable. Since C.F. Gauss was one of the fathers of this theory we will start with his approach to the so-called Fundamental Theorem of Algebra.

Theorem 4.3.111 (Fundamental Theorem of Algebra). Let P be a polynomial with complex coefficients of degree at least 1. Then there exists $z_0 \in \mathbb{C}$ such that $P(z_0) = 0$. Equivalently, $P : \mathbb{C} \to \mathbb{C}$ is a surjective mapping.

We will give two rather different proofs, the ideas of which go back to Gauss. The former is purely geometric with a little analysis for introducing a differentiable structure on the sphere $S^2$. The latter uses the theory of functions of a complex variable and demonstrates the above-mentioned connection to the degree.


Proof (see, e.g., Milnor [95]). We will regard the sphere $S^2$ as a compactification of the complex plane and endow it with the structure of a differentiable manifold by two charts (see Example 4.3.5(iii)) given by the stereographic projections $\pi_+$, $\pi_-$ of $S^2 \setminus \{N\}$, where N is the north pole, and $S^2 \setminus \{S\}$, where S is the south pole, respectively, onto $\mathbb{R}^2$. See Figure 4.3.25.

Figure 4.3.25. Stereographic projections $\pi_+(x)$, $\pi_-(x)$ of a point $x \in S^2$ onto $\mathbb{R}^2$; N and S denote the north and south poles.

Let P be a non-constant polynomial. We define f to be

\[
f(x) := \begin{cases} \pi_+^{-1} \circ P \circ \pi_+(x), & x \in S^2 \setminus \{N\}, \\ N, & x = N. \end{cases}
\]

One can prove that $f : S^2 \to S^2$ is continuously differentiable at all points. To prove differentiability at N it is sufficient to verify another formula for f, namely

\[
f = \pi_-^{-1} \circ Q \circ \pi_- \qquad\text{where}\qquad Q(z) = \frac{1}{\overline{P(\bar z^{-1})}}, \quad Q(0) = 0
\]

($\bar z$ is the complex conjugate to z). The calculation of $f_*$ (see (4.3.19)) shows that $x_0 \in S^2$ is a singular point of f (i.e., $f_*(T_{x_0} S^2) \ne T_{f(x_0)} S^2$) if and only if

\[
P'(z_0) = 0, \qquad z_0 = \pi_+(x_0).
\]

Since the last equation has only a finite number of solutions, the set A of singular values of f is finite. This is the main point where we have used that P is a polynomial. Consider now a point $y \in S^2 \setminus A$, i.e., y is a regular value of f. Then the set $f^{-1}(y)$ is finite (possibly empty) since a polynomial takes any value only finitely many times. Let ϕ(y) denote the number of points, i.e., the cardinality, of $f^{-1}(y)$. The main part of the proof is to show that ϕ is a constant function on $B := S^2 \setminus A$. To prove that we consider two kinds of regular values, $B_1 = \{y : f^{-1}(y) = \emptyset\}$, $B_2 = B \setminus B_1$. Since $B_1 = S^2 \setminus f(S^2)$ and $S^2$ is compact, $B_1$ is open in $S^2$. For $y \in B_2$ we have $f^{-1}(y) = \{x_1, \dots, x_k\}$ and there are disjoint open neighborhoods $U(x_1), \dots, U(x_k)$ on which f is a diffeomorphism (see Local Inverse Function Theorem 4.1.1). Put $V_i := f(U(x_i))$. It is easy to see that ϕ is constant on the set

\[
\bigcap_{i=1}^{k} V_i \setminus f\Bigl(S^2 \setminus \bigcup_{i=1}^{k} U(x_i)\Bigr),
\]

which is a neighborhood of the point y. Since $S^2$ is connected and A is finite, the set B is also connected. The function ϕ, being locally constant, is constant on the whole of B. Moreover, ϕ cannot vanish on B, i.e., $B_1 = \emptyset$. This shows that f actually maps $S^2$ onto $S^2$, i.e., $P : \mathbb{C} \to \mathbb{C}$ is surjective as well. In particular, there exists $z_0 \in \mathbb{C}$ such that $P(z_0) = 0$.

The following result generalizes the above fact that f is surjective (see, e.g., Sternberg [124, Theorem 3.4.3]).

Proposition 4.3.112. Let $M_1$, $M_2$ be two oriented manifolds of the same dimension, and let $M_2$ be a connected space. Suppose that $f : M_1 \to M_2$ is a proper³⁵ differentiable mapping such that its realization (see Figure 4.3.15) has a nonnegative Jacobian at any point. Then either the Jacobian vanishes everywhere or $f(M_1) = M_2$.

Remark 4.3.113. Write $f : \mathbb{C} \to \mathbb{C}$ as

\[
f(z) = g(x, y) + i h(x, y) \qquad\text{where}\qquad z = x + iy \quad\text{and}\quad g, h : \mathbb{R}^2 \to \mathbb{R}.
\]

If f is a holomorphic function in an open set G, then the Cauchy–Riemann conditions

\[
\frac{\partial g}{\partial x} = \frac{\partial h}{\partial y}, \qquad \frac{\partial g}{\partial y} = -\frac{\partial h}{\partial x}
\]

hold for $z = x + iy \in G$, and the Jacobian of $(g, h) : \mathbb{R}^2 \to \mathbb{R}^2$ satisfies $J_{(g,h)}(x, y) = |f'(z)|^2$. In particular, $J_{(g,h)} \ge 0$ and if f is a polynomial, then f is proper (why?), and so Proposition 4.3.112 can be used to get the Fundamental Theorem of Algebra.

The idea of the latter proof is based on the notion of the index of a point a with respect to a curve. If γ is a closed $C^1$-curve in $\mathbb{C}$, $a \notin \gamma$,³⁶ then the index, $\operatorname{Ind}_\gamma a$, is defined by the formula

\[
\operatorname{Ind}_\gamma a := \frac{1}{2\pi i} \int_\gamma \frac{dz}{z - a}.
\]

We say that γ is positively oriented if $\operatorname{Ind}_\gamma a \ge 0$ for all $a \notin \gamma$.³⁷

³⁵ A mapping f is said to be proper if $f^{-1}(K)$ is compact whenever K is a compact set.

³⁶ Not to complicate matters by different notation we identify the curve, i.e., a mapping γ : interval I → $\mathbb{C}$, with its image.

³⁷ This definition, common in the theory of functions of a complex variable, coincides with our definition of an oriented manifold.


In particular, if

\[
a = 0 \qquad\text{and}\qquad \gamma(t) = r e^{it}, \quad t \in [0, 2n\pi], \quad n \in \mathbb{Z},
\]

then $\operatorname{Ind}_\gamma 0 = n$ and it can be interpreted as the number of revolutions of γ and also as the increment of the argument along γ divided by 2π.

Proposition 4.3.114 (Rouché). Let γ be a simple, closed, positively oriented $C^1$-curve in an open set Ω, and let $G := \{z \in \Omega \setminus \gamma : \operatorname{Ind}_\gamma z \ne 0\}$. If f is a holomorphic function in Ω for which $0 \notin f(\gamma)$, then the number $N_f(G)$ of solutions of the equation f(z) = 0 that belong to G is equal to

\[
N_f(G) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\, dz = \operatorname{Ind}_{f \circ \gamma} 0\,{}^{38} \tag{4.3.48}
\]

provided the solutions are counted with their multiplicity.³⁹ If g is another holomorphic function in Ω such that

\[
|f(z) - g(z)| < |f(z)|, \qquad z \in \gamma, \tag{4.3.49}
\]

then $N_f(G) = N_g(G)$.

Proof. The proof is based on the Residue Theorem, see, e.g., Rudin [113, Theorem 10.43]. We wish to point out that the condition (4.3.49) is a quantitative description of stability of the number of solutions with respect to perturbations of f.

The second proof of Theorem 4.3.111. Our second proof of the Fundamental Theorem of Algebra follows easily from the previous proposition: Suppose that $P(z) = z^n + a_1 z^{n-1} + \dots + a_n$ and put $f(z) = z^n$. Let R > 0 be such that

\[
|a_1 z^{n-1} + \dots + a_n| < |z^n| = R^n \qquad\text{for}\qquad |z| = R
\]

(why does such an R exist?). Proposition 4.3.114 implies that $n = N_f(G) = N_P(G)$ where G is the open ball B(0; R).
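The counting formula (4.3.48) and the Rouché step of the second proof can be checked numerically. The following Python sketch is only an illustration: the sample polynomial $P(z) = z^3 + z + 1$, the radius R = 2 and the helper names `poly`, `derivative` and `count_zeros` are assumptions made for this example and do not come from the text. It approximates the contour integral in (4.3.48) over the circle |z| = R by a Riemann sum, both for P and for the comparison function $f(z) = z^3$.

```python
import cmath
import math

def poly(coeffs, z):
    """Evaluate a polynomial by Horner's rule; coeffs run from the leading term down."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc

def derivative(coeffs):
    """Coefficients of the derivative, same ordering as the input."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def count_zeros(coeffs, radius, samples=4000):
    """Riemann-sum approximation of (1/(2*pi*i)) * integral of P'/P over |z| = radius."""
    dcoeffs = derivative(coeffs)
    total = 0j
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        z = radius * cmath.exp(1j * t)
        dz = 1j * z * (2.0 * math.pi / samples)   # gamma'(t) dt
        total += poly(dcoeffs, z) / poly(coeffs, z) * dz
    return total / (2j * math.pi)

P = [1, 0, 1, 1]   # z^3 + z + 1 (hypothetical example)
F = [1, 0, 0, 0]   # z^3, the comparison function f(z) = z^n
R = 2.0            # on |z| = 2 we have |z + 1| <= 3 < 8 = |z^3|

n_P = count_zeros(P, R)
n_F = count_zeros(F, R)
print(n_P, n_F)
```

Since the condition of (4.3.49) holds on |z| = 2 for this choice, both approximations should agree up to rounding, in accordance with $N_f(G) = N_P(G)$.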

The connection between the winding number and the degree deg (f, Ω, p) as the latter is defined in Definition 5.2.1 for a regular value p is given by the following result. For a holomorphic function f and its regular value p the degree deg (f, Ω, p) is defined as the number of solutions in Ω of the equation f(z) = p. By identification of

\[
f : \mathbb{C} \to \mathbb{C}, \quad f(z) = g(x, y) + i h(x, y) \qquad\text{with}\qquad (g, h) : \mathbb{R}^2 \to \mathbb{R}^2,
\]

this definition coincides with Definition 5.2.1.

³⁸ This quantity is also called the winding number of f at 0 with respect to γ.

³⁹ A solution $z_0$ has multiplicity k if $f(z_0) = \dots = f^{(k-1)}(z_0) = 0$, $f^{(k)}(z_0) \ne 0$. Notice that k is finite provided f is not identically zero.

Lemma 4.3.115. Let Ω be an open, bounded set whose boundary is the image of a $C^1$-simple, closed, positively oriented curve γ. Assume that $f : \mathbb{C} \to \mathbb{C}$ is holomorphic in a certain neighborhood of $\overline{\Omega}$ and $p \notin f(\partial\Omega)$ is a regular value of f (i.e., $f'(z) \ne 0$ whenever f(z) = p). Then

\[
\deg(f, \Omega, p) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z) - p}\, dz. \tag{4.3.50}
\]

Proof. Denote A = {z ∈ Ω : f(z) = p}. If $A = \emptyset$, then both sides vanish (use the Cauchy Theorem for the right-hand side). If $A \ne \emptyset$, then A is finite since $A \cap \partial\Omega = \emptyset$ and f is holomorphic. Both sides in (4.3.50) are equal to the cardinality of A. This follows for the left-hand side from the definition of the degree and for the right-hand side by the Residue Theorem.

We would like to point out here that the formula (4.3.50) indicates a way to remove the assumption on regularity of p. Namely, the integral exists for any holomorphic function f provided $p \notin f(\partial\Omega)$. Put

\[
d := \inf_{z \in \partial\Omega} |f(z) - p|,
\]

and let $A = \{z_1, \dots, z_k\}$ be the same as in the proof of Lemma 4.3.115. Assume that the solution $z_j$ is of multiplicity $m_j$, and denote $m := m_1 + \dots + m_k$. According to Proposition 4.3.114, the number of solutions (including multiplicity) of the equation f(z) = q is also m provided |p − q| < d. In this neighborhood of the point p there exists a regular value of f. This follows either from the Sard Theorem (see Theorem 5.2.3) or, in this special case more easily, from the properties of holomorphic functions, see, e.g., Rudin [113, Theorem 10.32]. If the degree has the property of stability with respect to the point (Property (vi) or (viii) in Theorem 5.2.7 or Theorem 4.3.124), the equality (4.3.50) will also hold for singular values.

In order to motivate the next result, we note that the integral in (4.3.50) can be rewritten as an integral over Ω (by the Green Theorem). We avoid such uninteresting and intricate calculation, and put aside the special case of holomorphic functions. Instead we will consider the general case of mappings $\mathbb{R}^M \to \mathbb{R}^M$. For the rest of this appendix we will suppose that

\[
\text{(H)} \qquad \begin{cases} \Omega \text{ is a bounded open set in } \mathbb{R}^M, \\ f : \overline{\Omega} \to \mathbb{R}^M \text{ is continuous on } \overline{\Omega}, \\ p \in \mathbb{R}^M \setminus f(\partial\Omega). \end{cases}
\]


Proposition 4.3.116. In addition to (H) let $f \in C^1(\Omega, \mathbb{R}^M)$ and let p be a regular value of f. Then there exist a neighborhood U of the point p and a smooth function $\alpha : \mathbb{R}^M \to \mathbb{R}$ supported in U with $\int_U \alpha(y)\, dy = 1$ such that

\[
\int_\Omega f^*\omega = \sum_{\substack{x \in \Omega \\ f(x) = p}} \operatorname{sgn} J_f(x) \qquad \bigl(=: \deg(f, \Omega, p)\bigr). \tag{4.3.51}
\]

Here $f^*\omega$ is the pull-back of the form $\omega(y) = \alpha(y)\, dy^1 \wedge \dots \wedge dy^M$ (see Remark 4.3.54(iii)).

Proof.⁴⁰ By the definition of the pull-back we have

\[
f^*\omega(x) = \alpha(f(x))\, J_f(x)\, dx^1 \wedge \dots \wedge dx^M \qquad\text{and}\qquad \int_\Omega f^*\omega = \int_\Omega \alpha(f(x))\, J_f(x)\, dx
\]

where $J_f(x)$ is the Jacobian of f at x. We consider two complementary cases:

Case 1 ($p \notin f(\Omega)$). Then there is a neighborhood U of p which is a subset of $\mathbb{R}^M \setminus f(\overline{\Omega})$. Choose a smooth function α on $\mathbb{R}^M$ with its support in U and such that $\int_U \alpha(y)\, dy = 1$. Then

\[
\int_\Omega f^*\omega = 0 \qquad\text{since}\qquad \alpha|_{f(\Omega)} = 0.
\]

The right-hand side of (4.3.51) also vanishes by definition ($\sum_{\emptyset} = 0$).

Case 2 ($p \in f(\Omega)$). Since p is a regular value, the set A := {x ∈ Ω : f(x) = p} is finite. Let $A = \{x_1, \dots, x_k\}$. Then for any $x_i \in A$ there is a neighborhood $V_i \subset \Omega$ of $x_i$ such that $f|_{V_i}$ is a diffeomorphism of $V_i$ onto a neighborhood $U_i$ of p. These neighborhoods $V_1, \dots, V_k$ can be chosen mutually disjoint. Take a neighborhood $U \subset \bigcap_{i=1}^{k} U_i$ of p such that

\[
f^{-1}(U) \subset \bigcup_{i=1}^{k} V_i
\]

(why does such a U exist?). Choose a smooth function α with its support in U and normalize α so that $\int_U \alpha(y)\, dy = 1$. Then

\[
\begin{aligned}
\int_\Omega f^*\omega &= \sum_{i=1}^{k} \int_{V_i} f^*\omega = \sum_{i=1}^{k} \int_{V_i} \alpha(f(x))\, J_f(x)\, dx = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{V_i} \alpha(f(x))\, |J_f(x)|\, dx \\
&= \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{f(V_i)} \alpha(y)\, dy = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i).
\end{aligned}
\]

⁴⁰ The proof based on a more explicit construction of the form ω is given by Mawhin [92]. There the reader can also find a different approach to the homotopy invariance property of the degree (see Lemma 4.3.117 below).
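Both sides of (4.3.51) can be compared numerically in a simple planar case. The sketch below is an illustration under assumptions that are not part of the text: Ω is taken to be the open unit disk, f is the map z ↦ z² written in real coordinates, p = (0.25, 0), and α is a normalized bump function supported in the disk of radius 0.2 about p. All function and variable names are hypothetical. The integral of $\alpha(f(x)) J_f(x)$ over Ω is approximated by the midpoint rule and compared with the sum of the signs of the Jacobian over the two preimages of p.

```python
import math

P = (0.25, 0.0)   # regular value of f (assumed example)
EPS = 0.2         # radius of the support of alpha around P

def f(x, y):
    """The map z -> z^2 written in real coordinates."""
    return x * x - y * y, 2.0 * x * y

def jacobian(x, y):
    """Jacobian determinant of f: 4 (x^2 + y^2)."""
    return 4.0 * (x * x + y * y)

def bump(u, v):
    """Unnormalized smooth bump supported in the disk of radius EPS about P."""
    s = ((u - P[0]) ** 2 + (v - P[1]) ** 2) / EPS ** 2
    return math.exp(-1.0 / (1.0 - s)) if s < 1.0 else 0.0

def midpoint_rule(func, x0, y0, side, n):
    """Midpoint rule on the square [x0, x0 + side] x [y0, y0 + side]."""
    h = side / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        for j in range(n):
            total += func(x, y0 + (j + 0.5) * h)
    return total * h * h

# Normalizing constant, so that alpha = bump / mass has integral 1.
mass = midpoint_rule(bump, P[0] - EPS, P[1] - EPS, 2.0 * EPS, 200)

def pullback_density(x, y):
    """alpha(f(x)) * J_f(x) on the unit disk, zero outside of it."""
    if x * x + y * y >= 1.0:
        return 0.0
    return bump(*f(x, y)) / mass * jacobian(x, y)

# Left-hand side of (4.3.51), approximated on a 300 x 300 grid.
integral = midpoint_rule(pullback_density, -1.0, -1.0, 2.0, 300)

# Right-hand side of (4.3.51): the preimages of P are (0.5, 0) and (-0.5, 0).
preimages = [(0.5, 0.0), (-0.5, 0.0)]
sign_sum = sum(1 if jacobian(x, y) > 0 else -1 for x, y in preimages)

print(sign_sum, integral)
```

The two numbers should agree up to the quadrature error, which reflects the change of variables used in Case 2 of the proof.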


We define the degree deg (f, Ω, p) for $f \in C^1(\Omega, \mathbb{R}^M)$ and p a regular value of f by the right-hand side of (4.3.51). This definition coincides with Definition 5.2.1. We also want to point out that Case 2 of the proof contains the essence of the use of differential forms for the definition of the degree. The formula (4.3.51) can hardly be used for computation of the degree but its advantage consists in the fact that the integral exists also if p is a singular value. However, we should be careful and examine whether this integral does not depend on the choice of an M-form ω also in the case when p is a singular value.

Lemma 4.3.117. Let ω be a smooth M-form in $\mathbb{R}^M$ with its support in a certain open cube Q. If $\int_Q \omega = 0$, then there exists a smooth (M − 1)-form η with support in Q such that dη = ω, i.e., ω is exact.

Proof. It is similar to that of the Poincaré Theorem (Theorem 4.3.64).

Corollary 4.3.118. Suppose (H) and $f \in C^1(\Omega, \mathbb{R}^M)$. Let ω, $\tilde\omega$ be two smooth M-forms with supports in a cube $Q \subset \mathbb{R}^M \setminus f(\partial\Omega)$ such that $\int_Q \omega = \int_Q \tilde\omega$. Then

\[
\int_\Omega f^*\omega = \int_\Omega f^*\tilde\omega.
\]

Proof. According to Lemma 4.3.117 there exists an (M − 1)-form η for which $d\eta = \omega - \tilde\omega$. Then

\[
f^*(\omega - \tilde\omega) = f^*(d\eta) = d(f^*\eta)
\]

(see Exercise 4.3.71(iv)). Since the support of the form $f^*\eta$ is a compact subset of Ω, the Stokes Theorem (Theorem 4.3.92) implies that $\int_\Omega d(f^*\eta) = 0$.

Let p be a singular value of f and B := B(p; r) be such a ball that $B \cap f(\partial\Omega) = \emptyset$. According to the Sard Theorem (see Corollary 5.2.4) there is a regular value q of f in B. We want to define

\[
\deg(f, \Omega, p) := \deg(f, \Omega, q) = \int_\Omega f^*\omega \qquad\text{with}\qquad \int_Q \omega = 1,
\]

where ω is a form as in Proposition 4.3.116 supported in a cube Q, q ∈ Q ⊂ B. To be sure that this definition does not depend on the choice of the point q we choose another regular value $\tilde q \in B$. It is obvious that a diffeomorphism h of B (onto B) can be constructed such that $h(q) = \tilde q$.


Let $\tilde\omega$ be a differential form constructed in the proof of Proposition 4.3.116 supported on a cube $\tilde U$, $\tilde q \in \tilde U \subset B$. Then by the definition of pull-back

\[
\int_{\tilde U} \tilde\omega = \int_{h^{-1}(\tilde U)} h^*\tilde\omega.
\]

The form $\tilde\omega$ can be chosen in such a way that $h^*\tilde\omega$ is supported by the same cube Q as the form ω (explain why!). By Corollary 4.3.118,

\[
\int_Q \omega = \int_Q h^*\tilde\omega = \int_{\tilde U} \tilde\omega.
\]

This shows that the assumption on regularity of p can be omitted in the definition of the degree. Now we may drop the assumption on smoothness of f. However, this will require some further effort.

Lemma 4.3.119. Let Ω be a bounded open set in $\mathbb{R}^M$ and let $H : [0, 1] \times \overline{\Omega} \to \mathbb{R}^M$ be such that the mapping $t \in [0, 1] \mapsto H(t, \cdot) \in C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ is continuous. If $p \in \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$, then deg (H(t, ·), Ω, p) is constant on the interval [0, 1].

Proof. First note that H is continuous as a mapping considered on $[0, 1] \times \overline{\Omega}$, and therefore $H([0, 1] \times \partial\Omega)$ is compact and there is an open cube $Q \subset \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$ which is a neighborhood of p. Choose a smooth M-form ω with its support in Q. By definition,

\[
\deg(H(t, \cdot), \Omega, p) = \int_\Omega (H(t, \cdot))^*\omega.
\]

This shows that the function $t \mapsto \deg(H(t, \cdot), \Omega, p)$ is continuous on [0, 1]. Taking only integer values it has to be constant.

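Corollary 4.3.120. Suppose (H) and denote d := dist (p, f(∂Ω)). If g, h are mappings from $C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ such that

\[
\|f - g\|_{C(\overline{\Omega}, \mathbb{R}^M)} := \sup_{x \in \overline{\Omega}} |f(x) - g(x)| < d, \qquad \|f - h\|_{C(\overline{\Omega}, \mathbb{R}^M)} < d,
\]

then deg (g, Ω, p) = deg (h, Ω, p).

Proof. Put

\[
H(t, x) := (1 - t) g(x) + t h(x), \qquad t \in [0, 1], \quad x \in \overline{\Omega}.
\]

The assertion now follows from Lemma 4.3.119.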
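The homotopy used in this proof can be followed numerically in the plane. The sketch below relies on an assumption that the text states only for holomorphic maps (Lemma 4.3.115): for a continuous planar map that does not vanish on the unit circle, the degree with respect to the open unit disk and the point o equals the winding number of the image of the circle around the origin. The maps f, g, h and all names are hypothetical choices for this illustration; here d = dist (o, f(∂Ω)) = 1.

```python
import cmath
import math

def winding_number(F, samples=2000):
    """Number of turns of F(cos t, sin t) around the origin for 0 <= t <= 2*pi."""
    total = 0.0
    prev = complex(*F(1.0, 0.0))
    for k in range(1, samples + 1):
        t = 2.0 * math.pi * k / samples
        cur = complex(*F(math.cos(t), math.sin(t)))
        total += cmath.phase(cur / prev)   # angle increment, lies in (-pi, pi]
        prev = cur
    return round(total / (2.0 * math.pi))

def f(x, y):
    """z -> z^2 in real coordinates; |f| = 1 on the unit circle."""
    return x * x - y * y, 2.0 * x * y

def g(x, y):
    """Perturbation of f with |f - g| <= sqrt(0.25 + 0.09) < 1 everywhere."""
    u, v = f(x, y)
    return u + 0.5 * math.sin(3.0 * x), v + 0.3 * math.cos(y)

def h(x, y):
    """Perturbation of f with |f - h| <= 0.5 < 1 on the closed unit disk."""
    u, v = f(x, y)
    return u - 0.4 * y, v + 0.5 * x

def H(t):
    """The homotopy H(t, .) = (1 - t) g + t h from the proof."""
    def H_t(x, y):
        gu, gv = g(x, y)
        hu, hv = h(x, y)
        return (1.0 - t) * gu + t * hu, (1.0 - t) * gv + t * hv
    return H_t

degrees = [winding_number(H(k / 10.0)) for k in range(11)]
print(degrees)
```

Because each H(t, ·) stays within distance less than d of f on the boundary, no H(t, ·) vanishes there, and the computed integer should not change with t.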

The last step in our approach to the deﬁnition of the degree consists in the approximation of a continuous mapping by smooth ones.


Lemma 4.3.121. Let Ω be a bounded open set in $\mathbb{R}^M$ and let $f : \overline{\Omega} \to \mathbb{R}^M$ be continuous. Then there exist mappings $f_n \in C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ that converge uniformly on $\overline{\Omega}$ to f.

Proof. Observe first that it is sufficient to prove the statement for individual components of f. So we will assume that $f : \overline{\Omega} \to \mathbb{R}$ is continuous. There are many ways to prove the density of smooth functions in the space of continuous functions. See the discussion on page 31. We will present the approach based on convolution approximations (see Proposition 1.2.20 and the proof of Proposition 1.2.21). Extend f to a continuous bounded function g on $\mathbb{R}^M$ (such an extension exists by the Tietze Theorem⁴¹). Choose a nonnegative $C^\infty$-function $\varphi : \mathbb{R}^M \to \mathbb{R}$ with compact support and $\int_{\mathbb{R}^M} \varphi(x)\, dx = 1$. Put

\[
\varphi_n(x) = n^M \varphi(nx).
\]

Then the convolutions

\[
(\varphi_n * g)(x) := \int_{\mathbb{R}^M} \varphi_n(x - y)\, g(y)\, dy, \qquad x \in \mathbb{R}^M,
\]

are $C^\infty$-functions which converge to g locally uniformly on $\mathbb{R}^M$, in particular, uniformly to f on $\overline{\Omega}$. Proposition 1.2.20(iii) implies the smoothness of $\varphi_n * g$. To see the convergence it is convenient to visualize the form of $\varphi_n$ (see Figure 4.3.26 for M = 1).⁴² The convergence is obtained as follows:

\[
\begin{aligned}
|(\varphi_n * g)(x) - g(x)| &\le \int_{\mathbb{R}^M} \varphi_n(x - y)\, |g(y) - g(x)|\, dy \\
&= \int_{U(x)} \varphi_n(x - y)\, |g(y) - g(x)|\, dy + \int_{\mathbb{R}^M \setminus U(x)} \varphi_n(x - y)\, |g(y) - g(x)|\, dy < \delta + 0.
\end{aligned}
\]

The first integral is arbitrarily small for a sufficiently small neighborhood U(x) of x by the continuity of g at x; the second integral is zero for sufficiently large n since $\varphi_n(x - \cdot)$ vanishes on $\mathbb{R}^M \setminus U(x)$ for such n.

⁴¹ The Tietze Theorem says: Assume that g is a bounded, continuous, real function on a closed non-void subset A of a metric space X equipped with a metric ϱ. Then there exists a continuous extension G : X → $\mathbb{R}$ such that

\[
\sup_{x \in A} g(x) = \sup_{x \in X} G(x), \qquad \inf_{x \in A} g(x) = \inf_{x \in X} G(x).
\]

This theorem permits generalization (nontrivial) to normal topological spaces (see, e.g., Dugundji [43, Section VII.5]). The proof for metric spaces is quite easy. Indeed, without loss of generality we can suppose that 1 ≤ g ≤ 2. Put $G(x) = \inf_{y \in A} \dfrac{\varrho(x, y)\, g(y)}{\operatorname{dist}(x, A)}$ for $x \in X \setminus A$, and G(x) = g(x) for x ∈ A. It is not difficult to show that G has all the required properties.

⁴² It is also said that $\varphi_n$ converge to the Dirac measure (this is true in the sense of distributions) or that $\{\varphi_n\}_{n=1}^{\infty}$ is the so-called approximate unit (the space $L^1(\mathbb{R}^M)$ with the convolution multiplication is a Banach algebra without a unit, and the convergence $\varphi_n * g \to g$ takes place in the $L^1$-norm for all $g \in L^1(\mathbb{R}^M)$).


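Figure 4.3.26. Graphs of $\varphi = \varphi_1$, $\varphi_2$ and $\varphi_5$ over the x-axis, with supports [−a, a], [−a/2, a/2] and [−a/5, a/5], respectively.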
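The convolution approximation from the proof of Lemma 4.3.121 can be tried out for M = 1. The following sketch is an illustration under assumptions not taken from the text: the kernel ϕ is the standard bump exp(−1/(1 − u²)) on (−1, 1), normalized numerically, the function g is g(x) = |x|, and the integral is replaced by a midpoint sum. All names are hypothetical. After the substitution u = n(x − y) the convolution reads $(\varphi_n * g)(x) = \int \varphi(u)\, g(x - u/n)\, du$, which is the form used in the code.

```python
import math

def phi(u):
    """Unnormalized smooth bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0

NODES = 1000
STEP = 2.0 / NODES
U = [-1.0 + (k + 0.5) * STEP for k in range(NODES)]   # midpoints in (-1, 1)
MASS = sum(phi(u) for u in U) * STEP
W = [phi(u) * STEP / MASS for u in U]                 # quadrature weights, sum to 1

def mollify(g, n, x):
    """Midpoint approximation of (phi_n * g)(x) = int phi(u) g(x - u/n) du."""
    return sum(w * g(x - u / n) for u, w in zip(U, W))

def sup_error(g, n, points=201):
    """Maximum of |phi_n * g - g| over an equidistant grid in [-1, 1]."""
    xs = [-1.0 + 2.0 * k / (points - 1) for k in range(points)]
    return max(abs(mollify(g, n, x) - g(x)) for x in xs)

errors = [sup_error(abs, n) for n in (1, 2, 5, 10)]
print(errors)
```

For this g the largest deviation occurs at the kink x = 0 and shrinks like 1/n, which matches the picture of the kernels $\varphi_n$ concentrating at the origin.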

Corollary 4.3.122. Suppose (H) and let $\{f_n\}_{n=1}^{\infty}$ be a sequence the existence of which follows from Lemma 4.3.121. Then

\[
\lim_{n \to \infty} \deg(f_n, \Omega, p)
\]

exists and its value does not depend on $\{f_n\}_{n=1}^{\infty}$ whenever this sequence possesses the properties from Lemma 4.3.121.

Proof. Since dist (p, f(∂Ω)) is positive, we conclude that $p \notin f_n(\partial\Omega)$ for all sufficiently large n and the degrees deg ($f_n$, Ω, p) are defined. Corollary 4.3.120 shows that the sequence $\{\deg(f_n, \Omega, p)\}_{n=1}^{\infty}$ is eventually constant. This corollary also yields the independence of the limit of the sequence of degrees on the choice of $\{f_n\}_{n=1}^{\infty}$.

The previous corollary shows how the definition of the degree deg (f, Ω, p) is extended to all triples (f, Ω, p) satisfying (H).

Remark 4.3.123. The approach which has just been described may be extended to a mapping $f : M \to \tilde M$ between manifolds of the same dimension. We need to assume that both M and $\tilde M$ are oriented (in order for the integral of an M-form to be defined, cf. Proposition 4.3.116 where instead of $J_f(x)$ we take the Jacobian of a realization of f). A set Ω ⊂ M is supposed to be an open set with compact closure in the topology of M. We consider the degree of $f \in C(\overline{\Omega}, \tilde M)$ with respect to a point $p \in \tilde M \setminus f(\partial\Omega)$. Here ∂Ω is again the boundary of Ω in the topology of M. M-forms have their supports in some coordinate neighborhoods V of p which are disjoint with f(∂Ω). An analogue of Lemma 4.3.117 still holds. There are problems with an analogue of Corollary 4.3.120: we have to use a topology on the space of mappings $M \to \tilde M$ since we have not defined any metric on a manifold. The existence of an approximating sequence similar to that of Lemma 4.3.121 is not clear, either. These obstacles can be overcome but special tools are required. Since they are beyond the scope of this book we only refer the interested reader to, e.g., the book Hirsch [67, Chapters 2 and 5].


We are now able to prove the main properties of the degree. See also Proposition 5.2.2 and Theorem 5.2.7. Notice that the following theorem is also true in the case of manifolds.

Theorem 4.3.124. There exists a mapping deg which sends any triple (f, Ω, p) satisfying (H) into $\mathbb{Z}$ and has the following properties:

(i) (normalization property) If f is the identity map, p ∈ Ω, then deg (f, Ω, p) = 1.

(ii) (additivity property) If $\Omega_1$, $\Omega_2$ are disjoint open subsets of Ω and the point p is such that $p \notin f(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2))$, then deg (f, Ω, p) = deg (f, $\Omega_1$, p) + deg (f, $\Omega_2$, p).

(iii) (continuity property) Let ($f_n$, Ω, p), n = 0, 1, . . . , satisfy (H). If the sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f_0$ on $\overline{\Omega}$, then

\[
\lim_{n \to \infty} \deg(f_n, \Omega, p) = \deg(f_0, \Omega, p).
\]

(iv) (translation invariance property) deg (f, Ω, p) = deg (f − p, Ω, o);

(v) (solution property) If deg (f, Ω, p) ≠ 0, then there exists an x ∈ Ω such that f(x) = p.

(vi) (homotopy invariance property) If $H : [0, 1] \times \overline{\Omega} \to \mathbb{R}^M$ is continuous and (H) is satisfied for all (H(t, ·), Ω, p), t ∈ [0, 1], then deg (H(t, ·), Ω, p) is constant on [0, 1].

(vii) (boundary values dependence property) If (f, Ω, p), (g, Ω, p) satisfy (H) and f and g coincide on ∂Ω, then deg (f, Ω, p) = deg (g, Ω, p).

(viii) (point dependence property) The mapping p ↦ deg (f, Ω, p) is constant on every component of $\mathbb{R}^M \setminus f(\partial\Omega)$.

(ix) (multiplication property) Let Ω be a bounded open set in $\mathbb{R}^M$ and let the mapping $f : \overline{\Omega} \to \mathbb{R}^M$ be continuous. Denote by $U_1, \dots$ all bounded components of $\mathbb{R}^M \setminus f(\partial\Omega)$. Suppose that the mapping $g : \mathbb{R}^M \to \mathbb{R}^M$ is continuous on $f(\overline{\Omega})$ and $p \notin g(f(\partial\Omega))$. Then

\[
\deg(g \circ f, \Omega, p) = \sum_i \deg(f, \Omega, U_i)\, \deg(g, U_i, p)\,{}^{43} \tag{4.3.52}
\]

where the sum contains only a finite number of nonzero terms.

⁴³ Because of property (viii), deg (f, Ω, U) := deg (f, Ω, q) for a q ∈ U is well defined for any component U of $\mathbb{R}^M \setminus f(\partial\Omega)$.


Proof. The degree is defined in Definition 5.2.1 for $f \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ and a regular value $p \notin f(\partial\Omega)$ and has been extended above by the procedure started with Proposition 4.3.116. It follows from this construction that it is sufficient to prove all parts of Theorem 4.3.124 for $f \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Another proof is given on pages 270–271 (the proof of Proposition 5.2.2).

(i) This follows immediately from the definition.

(ii) This is a consequence of Proposition 4.3.116 since an M-form ω can be chosen in such a way that its support is disjoint with $f(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2))$. Then

\[
\int_\Omega f^*\omega = \int_{\Omega_1} f^*\omega + \int_{\Omega_2} f^*\omega.
\]

(iii) It is obtained directly from Corollary 4.3.122.

(iv) It follows from Proposition 4.3.116.

(v) This is slightly tricky: Suppose by contradiction that $f^{-1}(p) \cap \Omega = \emptyset$, and choose four mutually disjoint, nonempty open subsets $\Omega_1, \dots, \Omega_4$ of Ω. Then, by the additivity property, we have deg (f, Ω, p) = deg (f, $\Omega_1$, p) + deg (f, $\Omega_2$, p) = deg (f, $\Omega_3$, p) + deg (f, $\Omega_4$, p), and also deg (f, Ω, p) = deg (f, $\Omega_1 \cup \Omega_2$, p) + deg (f, $\Omega_3 \cup \Omega_4$, p) = · · · = 2 deg (f, Ω, p). This contradicts the inequality deg (f, Ω, p) ≠ 0.

(vi) It follows from the construction, see Lemma 4.3.119.

(vii) It is sufficient to apply property (vi) to H(t, x) = t f(x) + (1 − t) g(x).

(viii) Choose an M-form ω supported in an open set U where $U \cap f(\partial\Omega) = \emptyset$. Then, by definition,

\[
\deg(f, \Omega, p) = \int_\Omega f^*\omega \qquad\text{for all } p \in U.
\]

This means that the degree is locally constant on $\mathbb{R}^M \setminus f(\partial\Omega)$, and therefore it is constant on every component of $\mathbb{R}^M \setminus f(\partial\Omega)$.

(ix) If the equation g(f(x)) = p has no solution in Ω, then the left-hand side of (4.3.52) vanishes (by the solution property). For the same reason all products on the right-hand side are also equal to zero. Suppose therefore that $f(\Omega) \cap g^{-1}(p) \ne \emptyset$. There is exactly one unbounded component $U_0$ of $\mathbb{R}^M \setminus f(\partial\Omega)$. Regardless of whether $U_0 \cap g^{-1}(p) = \emptyset$ or not, deg (f, Ω, $U_0$) = 0 (by (v) and (viii)). Since $g^{-1}(p) \cap f(\overline{\Omega})$


is compact there is only a finite number of bounded components of $\mathbb{R}^M \setminus f(\partial\Omega)$, say $U_1, \dots, U_k$, which contain some points of $g^{-1}(p)$ (different components are disjoint). According to property (ii) we have

\[
\deg(g \circ f, \Omega, p) = \sum_{i=1}^{k} \deg(g \circ f, \Omega_i, p) \tag{4.3.53}
\]

where $\Omega_i = f^{-1}(U_i)$. By definition, there is a neighborhood V of p, $V \cap g(f(\partial\Omega)) = \emptyset$, and an M-form ω supported in V with $\int_V \omega = 1$ for which

\[
\deg(g \circ f, \Omega, p) = \int_\Omega (g \circ f)^*\omega.
\]

First consider a component $U_i$ such that

\[
\deg(g, U_i, p) = \int_{U_i} g^*\omega \ne 0.
\]

In this case put

\[
\kappa_i = \frac{1}{\deg(g, U_i, p)}\, g^*\omega|_{U_i}.
\]

Then $\kappa_i$ is an admissible M-form for the definition of deg (f, $\Omega_i$, $U_i$), and we have

\[
\deg(g \circ f, \Omega_i, p) = \int_{\Omega_i} (g \circ f)^*\omega = \deg(g, U_i, p) \int_{\Omega_i} f^*\kappa_i = \deg(g, U_i, p)\, \deg(f, \Omega_i, U_i). \tag{4.3.54}
\]

Now consider a component $U_i$ such that

\[
\deg(g, U_i, p) = \int_{U_i} g^*\omega = 0.
\]

Let

\[
g^*\omega(y) = \kappa(y)\, dy^1 \wedge \dots \wedge dy^M.
\]

If κ does not vanish on $U_i$, then there are disjoint open sets $U_i^+$ and $U_i^-$ which carry the positive part $\kappa^+$ and the negative part $\kappa^-$, respectively. Let $\Omega_i^\pm = f^{-1}(U_i^\pm)$. Then

\[
\deg(f, \Omega_i, U_i) = \deg(f, \Omega_i^+, U_i^+) = \frac{\int_{\Omega_i^+} f^*(g^*\omega)}{\int_{U_i^+} \kappa^+(y)\, dy}, \qquad
\deg(f, \Omega_i, U_i) = \deg(f, \Omega_i^-, U_i^-) = -\frac{\int_{\Omega_i^-} f^*(g^*\omega)}{\int_{U_i^-} \kappa^-(y)\, dy}.
\]

Notice that both denominators are nonzero since $\int_{U_i} g^*\omega = 0$ and $g^*\omega$ does not vanish everywhere. This means that we have

\[
\begin{aligned}
\deg(g \circ f, \Omega_i, p) &= \int_{\Omega_i} f^*(g^*\omega) = \int_{\Omega_i^+} f^*(g^*\omega) + \int_{\Omega_i^-} f^*(g^*\omega) \\
&= \deg(f, \Omega_i, U_i) \left( \int_{U_i^+} \kappa^+(y)\, dy - \int_{U_i^-} \kappa^-(y)\, dy \right) = \deg(f, \Omega_i, U_i)\, \deg(g, U_i, p),
\end{aligned}
\]

i.e., (4.3.54) holds in this case as well. Summing up and using (4.3.53) we complete the proof.


Remark 4.3.125. It can be proved (see Amann & Weiss [5]) that the properties (i)–(iv) determine the degree uniquely.

We will conclude this appendix with some topological applications of the degree. Applications to differential equations are shown in Section 5.2. The well-known and basic Jordan Separation Theorem asserts that a Jordan curve γ divides the complex plane into two open components and exactly one of them is bounded (the interior domain of γ). This theorem has the following generalization.

Theorem 4.3.126 (Generalized Jordan Separation Theorem). Let K be a compact set in $\mathbb{R}^M$ such that $\mathbb{R}^M \setminus K$ has a finite number, say k, of components. If f is a continuous injection of K into $\mathbb{R}^M$, then $\mathbb{R}^M \setminus f(K)$ has exactly k components.

Proof (a sketch). Notice first that f is actually a homeomorphism of K, i.e., $f^{-1}$ is continuous on the compact set f(K). Applying the Tietze Theorem (see the footnote on page 236) to each coordinate $f^i$ of $f = (f^1, \dots, f^M)$ and $f^{-1}$ we conclude that there are continuous extensions g and h of f and $f^{-1}$, respectively, which are defined in $\mathbb{R}^M$. Denote by $G_0, \dots, G_{k-1}$ the components of $\mathbb{R}^M \setminus K$ where $G_0$ is the unique unbounded component. Similarly, let $U_0, \dots, U_m$ ($m \in \mathbb{N} \cup \{\infty\}$) denote the components of $\mathbb{R}^M \setminus f(K)$ where $U_0$ is the unique unbounded component. The idea of the proof consists in the application of the multiplication property of the degree (Theorem 4.3.124(ix)): Show that

\[
\begin{aligned}
\deg(h \circ g, G_j, p) &= \delta_{ij} \quad\text{for } p \in G_i, \\
\deg(g \circ h, U_i, q) &= \delta_{ij} \quad\text{for } q \in U_j,
\end{aligned}
\qquad i = 1, \dots, m, \quad j = 1, \dots, k - 1.
\]

Define matrices

\[
A := \bigl(\deg(g, G_j, U_i)\bigr)_{\substack{i = 1, \dots, m \\ j = 1, \dots, k-1}}, \qquad
B := \bigl(\deg(h, U_i, G_j)\bigr)_{\substack{j = 1, \dots, k-1 \\ i = 1, \dots, m}}
\]

and show that $AB = I_m$. This means that m ≤ k − 1. Similarly, $BA = I_{k-1}$. The equality m = k − 1 follows.

Corollary 4.3.127 (Invariance of domain). Let $G \subset \mathbb{R}^M$ be an open set and $f : G \to \mathbb{R}^M$ a continuous injection. Then f(G) is an open set.

Proof (a sketch). Show first that it is sufficient to prove the assertion for the case when G is an open ball B and f is continuous and injective on $\overline{B}$. According to Theorem 4.3.126, $\mathbb{R}^M \setminus f(\partial B)$ has exactly two components $U_0$, $U_1$. If $U_0$ denotes the unbounded component, then $\mathbb{R}^M \setminus f(\overline{B}) \subset U_0$ (again by Theorem 4.3.126), i.e., $U_1 \subset f(B)$. To show the opposite inclusion recall that f(B) is bounded and connected.

Remark 4.3.128. If M < N and $f : \mathbb{R}^M \to \mathbb{R}^N$ is a continuous injection, then it can be proved (not too easily) that the complement of $f(\mathbb{R}^M)$ is a dense set in $\mathbb{R}^N$. We want to recall the famous “Peano curve”, i.e., a continuous (but not injective) map from the interval [0, 1] onto the square [0, 1] × [0, 1].⁴⁴ This map can be used to construct a continuous surjection of $\mathbb{R}^M$ onto $\mathbb{R}^N$ for any M < N. The existence of such a surjection for M ≥ N is trivial.

⁴⁴ This example (for the construction see, e.g., Dugundji [43, Section IV.4]) has played an important role in developing the notion of a curve.

As we have mentioned in Section 4.3, our main interest in the degree theory consists in applying it to solving equations, i.e., in using the solution property (Theorem 4.3.124(v)). This means to compute the degree, which is by no means an easy task. Fortunately, we do not need to know the exact value of the degree. It will be sufficient to show that it is not equal to zero. For this purpose the following mapping property is very important.

Definition 4.3.129.

(1) A nonempty subset A of a linear space X is said to be symmetric if for every x ∈ A we have −x ∈ A.

(2) A mapping f : A ⊂ X → Y, X, Y linear spaces, is called an odd mapping on a symmetric set A if f(−x) = −f(x) for each x ∈ A.

Theorem 4.3.130 (Borsuk Antipodal Theorem). Let Ω be a bounded, open, symmetric subset of $\mathbb{R}^M$ and o ∈ Ω. Let $f : \overline{\Omega} \to \mathbb{R}^M$ be a continuous mapping whose restriction to ∂Ω is odd. If $o \notin f(\partial\Omega)$, then deg (f, Ω, o) is an odd integer. In particular, deg (f, Ω, o) ≠ 0, and there is a solution of the equation f(x) = o in Ω.
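The statement can be observed numerically for M = 2. The sketch below rests on an assumption which the text formulates only for holomorphic maps (Lemma 4.3.115): for a continuous planar map that does not vanish on the unit circle, the degree with respect to the open unit disk and o equals the winding number of the image of the circle around the origin. The three odd maps and the even map used for contrast, as well as all names, are hypothetical examples chosen for this illustration.

```python
import cmath
import math

def winding_number(F, samples=4000):
    """Number of turns of F(cos t, sin t) around the origin for 0 <= t <= 2*pi."""
    total = 0.0
    prev = complex(*F(1.0, 0.0))
    for k in range(1, samples + 1):
        t = 2.0 * math.pi * k / samples
        cur = complex(*F(math.cos(t), math.sin(t)))
        total += cmath.phase(cur / prev)   # angle increment, lies in (-pi, pi]
        prev = cur
    return round(total / (2.0 * math.pi))

def cube(x, y):
    """z -> z^3 in real coordinates (odd)."""
    return x ** 3 - 3.0 * x * y * y, 3.0 * x * x * y - y ** 3

def twist(x, y):
    """An odd polynomial map whose only zero is the origin."""
    return x ** 3 + y, y ** 3 - x

def mirror(x, y):
    """Complex conjugation (odd, orientation reversing)."""
    return x, -y

def square(x, y):
    """z -> z^2 in real coordinates (even, shown for contrast)."""
    return x * x - y * y, 2.0 * x * y

odd_degrees = [winding_number(F) for F in (cube, twist, mirror)]
even_degree = winding_number(square)
print(odd_degrees, even_degree)
```

All three odd maps should produce an odd integer, as Theorem 4.3.130 predicts, while the even map z ↦ z² shows that oddness on the boundary cannot simply be dropped.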

There are several proofs of this important topological theorem. For the proof based on algebraic machinery see, e.g., Dugundji [43, Section XVI.6]. Here we present the main steps of an “analytic” proof which is taken from Schwartz [118] (see also Nirenberg [100] or Rothe [111] or Krawcewicz & Wu [80]).

Proof of Theorem 4.3.130. The assertion is obvious for M = 1, therefore, we assume that M > 1. The main idea of this proof is quite simple. First observe that deg (f, Ω, o) does not depend on a continuous extension of f from ∂Ω into Ω (see Theorem 4.3.124(vii)). Since o ∈ Ω, there is a small open ball B := B(o; δ) inside Ω, and there is a continuous mapping $g : \overline{\Omega} \to \mathbb{R}^M$ such that

\[
g(x) = \begin{cases} f(x) & \text{for } x \in \partial\Omega, \\ x & \text{for } x \in \overline{B} \end{cases}
\]

(by the Tietze Theorem). Part (ii) of Theorem 4.3.124 implies that

\[
\deg(g, \Omega, o) = \deg(g, \Omega \setminus \overline{B}, o) + \deg(g, B, o) = \deg(g, \Omega \setminus \overline{B}, o) + 1.
\]

If g is constructed in such a way that $g \in C^1(\Omega \setminus \overline{B})$ and o is a regular value of g, then

\[
\deg(g, \Omega \setminus \overline{B}, o) = \sum_{x \in S} \operatorname{sgn} J_g(x) \qquad\text{where}\qquad S = \{x \in \Omega \setminus \overline{B} : g(x) = o\}.
\]


If, moreover, $g$ is odd, then $S$ is either the empty set or a symmetric set and $g'(-x) = g'(x)$, so $\deg(g, \Omega \setminus \overline{B}, o)$ is an even integer, and the proof is completed. Unfortunately, it is not known whether such a $g$ exists. Therefore, we will show that not all of the required properties of $g$ are actually needed. The demand for regularity of $o$ can be replaced by the assumption that $g$ does not vanish on a part of a hyperplane, for instance on $H := \{(x_1, \dots, x_{M-1}, 0) \in \Omega \setminus \overline{B}\}$. Indeed, in this case we have
\[
\deg(g, \Omega \setminus \overline{B}, o) = \deg(g, H_+, o) + \deg(g, H_-, o)
\]
where
\[
H_\pm := \{(x_1, \dots, x_M) \in \Omega \setminus \overline{B} : x_M > 0 \ (\text{respectively } x_M < 0)\}.
\]
Moreover,
\[
\deg(g, H_+, o) = \int_{H_+} g^*\omega = \int_{H_-} g^*\omega = \deg(g, H_-, o)
\]
since $J_g(x) = J_g(-x)$ and the mapping $x \in H_+ \mapsto -x \in H_-$ is a diffeomorphism. It can be shown, by smooth approximation, that the equality $\deg(g, H_+, o) = \deg(g, H_-, o)$ holds also for continuous $g$. Therefore, the core of the proof is the following substantial strengthening of the Tietze theorem for odd mappings.

Lemma 4.3.131. Let $D$ be a bounded, open, symmetric subset of $\mathbb{R}^M$ and $o \notin \overline{D}$. Let $f$ be a continuous mapping from $\partial D$ into $\mathbb{R}^M$ which is odd and nowhere zero on $\partial D$. Then there exists a continuous odd extension $g : \overline{D} \to \mathbb{R}^M$ such that
\[
g(x) \neq o \quad \text{for all} \quad x \in H := \{(x_1, \dots, x_M) \in D : x_M = 0\}.
\]

To elucidate the problems of construction we note that the requirement of oddness of $g$ does not cause difficulties. The crucial point is that we need $g$ to be nowhere zero on $H$. The following simple example shows that the existence of such an extension is not obvious. Let $D = (-1, 1)$, $f(-1) = -1$, $f(1) = 1$. Then any continuous extension of $f$ has a zero point in $D$.

Proof of Lemma 4.3.131. We assume again $M > 1$. The notation $H_\pm$ is used similarly as above. The key point is an odd, nowhere zero, continuous extension of $f$ from $\partial D \cap H$ to $\tilde{f} : H \to \mathbb{R}^M$. See the above example and notice the distinction, namely that $\dim H = M - 1$ and $f$ is a map into $\mathbb{R}^M$. Having such an extension, the rest of the proof is an application of the Tietze Theorem:
\[
g(x) = \begin{cases} f(x), & x \in \partial D, \\ \tilde{f}(x), & x \in H, \\ \tilde{g}(x), & x \in H_+, \\ -\tilde{g}(-x), & x \in H_-, \end{cases}
\]
where $\tilde{g}$ is the Tietze extension of $f$ and $\tilde{f}$. The existence of an extension $\tilde{f}$ follows from the following slightly more general assertion:


Chapter 4. Local Properties of Differentiable Mappings

Let $G$ be a bounded, open, symmetric subset of $\mathbb{R}^N$, $o \notin \overline{G}$, and let the mapping $f : \partial G \to \mathbb{R}^M$ be continuous, odd and nowhere zero. If $N < M$, then there exists a continuous, odd and nowhere zero extension $\varphi : \overline{G} \to \mathbb{R}^M$.

We will prove this assertion by induction with respect to the dimension $N$. If $N = 1$, then
\[
G \subset [-\beta, -\alpha] \cup [\alpha, \beta] \quad \text{for some} \quad 0 < \alpha < \beta < \infty,
\]
and we need to find a continuous extension $\varphi$ to $[\alpha, \beta]$ which is nowhere zero, define $\varphi$ on $[-\beta, -\alpha]$ to be odd, and restrict $\varphi$ to $\overline{G}$. The induction step is done similarly: First use the induction hypothesis for an extension to $G \cap \mathbb{R}^{N-1}$, and then for extending it into the upper half-space. In order to show that such an extension actually exists (also for $N = 1$) we need the following key result:

Let $K$ be a compact subset of $\mathbb{R}^N$ and let $f$ be a continuous nowhere-zero mapping from $K$ into $\mathbb{R}^M$ where $N < M$. Then for any compact set $L \supset K$ there exists a continuous $\varphi : L \to \mathbb{R}^M$ which extends $f$ and is nowhere-zero on $L$.

Recall again the above example to see the obstacles in the proof. Denote
\[
c := \min_{x \in K} |f(x)|
\]
and choose $\varepsilon \in \left(0, \frac{c}{2}\right)$. First we prove the existence of a smooth approximation which is defined on a neighborhood of $L$. By the Tietze Theorem, there is a continuous extension $f_1 : L \to \mathbb{R}^M$. This $f_1$ can be smoothly approximated on $L$ (see the proof of Lemma 4.3.121): Take $\varphi_1 \in C^1(U)$ where $U$ is a neighborhood of $L$ such that
\[
\|f_1 - \varphi_1\|_{C(L)} < \frac{\varepsilon}{2}.
\]

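Put
\[
\tilde{\varphi}_1(x_1, \dots, x_M) = \varphi_1(x_1, \dots, x_N) \quad \text{for} \quad x = (x_1, \dots, x_N, \dots, x_M) \in U \times \mathbb{R}^{M-N}.
\]
In particular, this means that
\[
\det \tilde{\varphi}_1'(x) = 0,
\]
i.e., all points of $\varphi_1(U) = \tilde{\varphi}_1(U \times \mathbb{R}^{M-N})$ are critical values of $\tilde{\varphi}_1$. According to the Sard Theorem (Theorem 5.2.3)
\[
\operatorname{meas} \varphi_1(U) = 0
\]
(meas is the Lebesgue measure in $\mathbb{R}^M$), and $\mathbb{R}^M \setminus \varphi_1(U)$ is dense (footnote 45). Therefore, there is a point $y_0 \in \mathbb{R}^M \setminus \varphi_1(U)$ such that $\|y_0\| < \frac{\varepsilon}{2}$. Put $\varphi_2 = \varphi_1 - y_0$. Then $\varphi_2(x) \neq o$ for every $x \in L$ and $\|f - \varphi_2\|_{C(K)} < \varepsilon$. Moreover,
\[
\|\varphi_2(x)\| \geq \frac{c}{2} \quad \text{for} \quad x \in K.
\]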


We can assume that the last inequality holds for all $x \in L$. Otherwise, we multiply $\varphi_2$ by the function $\frac{c}{2\|\varphi_2(\cdot)\|}$ outside the set where $\|\varphi_2(x)\| \geq \frac{c}{2}$. Since the Tietze extension retains upper and lower bounds, we can extend $f - \varphi_2$ from the set $K$ to a continuous mapping $\psi$ on $L$ for which $\|\psi(x)\| < \varepsilon$, $x \in L$. It remains to put
\[
\varphi(x) = \varphi_2(x) + \psi(x), \quad x \in L.
\]

Footnote 45. In fact we do not need here the whole strength of the Sard Theorem. The following much weaker result is sufficient: Let $G \subset \mathbb{R}^M$ be an open set and $\psi : G \to \mathbb{R}^M$ a $C^1$-mapping on $G$. If $A \subset G$ has Lebesgue measure zero, so has $\psi(A)$. Indeed, every point of $A$ belongs to a ball $B \subset G$ on which $\psi'$ is bounded. By the Mean Value Theorem, there is a constant $K$ such that
\[
\|\psi(x) - \psi(y)\| \leq K\|x - y\|, \quad x, y \in B. \tag{$*$}
\]
Since $\mathbb{R}^M$ is separable, the set $A$ can be covered by countably many balls $\{B_j\}_{j \in \mathbb{N}}$, i.e., $A = \bigcup_{j=1}^{\infty} (A \cap B_j)$, and $\psi(A) = \bigcup_{j=1}^{\infty} \psi(A \cap B_j)$. To complete the proof we show that $\operatorname{meas}(\psi(A \cap B_j)) = 0$, $j \in \mathbb{N}$. So take $\eta > 0$. Since $\operatorname{meas}(A \cap B_j) = 0$, there are countably many cubes (or balls) $\{Q_k\}_{k \in \mathbb{N}}$ with $A \cap B_j \subset \bigcup_{k=1}^{\infty} Q_k$, $\sum_{k=1}^{\infty} \operatorname{meas} Q_k < \eta$, and the estimate ($*$) holds for all $Q_k$ with the same constant $K$. This implies
\[
\operatorname{meas}(\psi(A \cap B_j)) \leq \sum_{k=1}^{\infty} \operatorname{meas} \psi(Q_k) \leq c \sum_{k=1}^{\infty} \operatorname{meas} Q_k < c\eta
\]
where the constant $c$ depends only on the dimension $M$ and on $K$. Since $\eta > 0$ is arbitrary, $\operatorname{meas}(\psi(A \cap B_j)) = 0$ for all $j \in \mathbb{N}$. To obtain the result required in the proof above take $A = U \times \{(0, \dots, 0)\}$, where $(0, \dots, 0)$ is an $(M-N)$-tuple.

The above proof is cumbersome and seems to be endless. So we recommend that the reader go through the main steps once again, not checking all the technicalities but concentrating on their main ideas.

Corollary 4.3.132 (Borsuk–Ulam). Let $f$ be a continuous mapping from the $M$-dimensional sphere $S^M \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^M$. Then there is a point $x_0 \in S^M$ such that $f(x_0) = f(-x_0)$.

Proof. Extend $\varphi(x) = f(x) - f(-x)$ to a mapping from the unit ball $B(o; 1) \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^M$ which is viewed as a subset $\mathbb{R}^M \times \{0\}$ of $\mathbb{R}^{M+1}$. If $o \in \varphi(S^M)$, the proof is complete. For the case $o \notin \varphi(S^M)$, the application of the Borsuk Theorem yields $\deg(\varphi, B(o; 1), o) \neq 0$. By Theorem 4.3.124(viii),
\[
\deg(\varphi, B(o; 1), o) = \deg(\varphi, B(o; 1), y) \quad \text{where} \quad y = (0, \dots, 0, 1) \in \mathbb{R}^{M+1},
\]
i.e., the equation $\varphi(x) = y$ has a solution in $B(o; 1)$. However, this is impossible since $\varphi(B(o; 1)) \subset \mathbb{R}^M \times \{0\}$.
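For $M = 1$ the statement of the Borsuk–Ulam Theorem can be checked numerically. Writing a continuous $f : S^1 \to \mathbb{R}$ as a $2\pi$-periodic function of the angle, the difference $g(\theta) = f(\theta) - f(\theta + \pi)$ satisfies $g(\theta + \pi) = -g(\theta)$, so it changes sign on $[0, \pi]$. The following Python sketch locates such an angle by bisection. It is only an illustration: the names `antipodal_equal_point` and `temperature` and the sample function are assumptions made here, not part of the text.

```python
import math

def antipodal_equal_point(f, tol=1e-12):
    """Return theta in [0, pi] with f(theta) ~ f(theta + pi).

    Assumes f is continuous and 2*pi-periodic, so that
    g(theta) = f(theta) - f(theta + pi) satisfies g(pi) = -g(0).
    """
    g = lambda t: f(t) - f(t + math.pi)
    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    # Invariant: g(lo) and g(hi) have opposite signs.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def temperature(t):
    # Hypothetical sample function on the circle (2*pi-periodic).
    return math.sin(t) + 0.5 * math.cos(2 * t) + 0.3 * math.sin(3 * t + 1.0)

theta = antipodal_equal_point(temperature)
print(theta, temperature(theta), temperature(theta + math.pi))
```

Bisection is used here only because the one-dimensional case reduces to the Intermediate Value Theorem; for $M > 1$ the theorem gives existence without any such constructive procedure.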

For more information in this direction see Schwartz [118].

Exercise 4.3.133. Deduce the classical Jordan Separation Theorem from Theorem 4.3.126! Hint. A Jordan curve is homeomorphic to $S^1$.

Exercise 4.3.134. Show that there is no continuous injection of $\mathbb{R}^M$ into $\mathbb{R}^N$ whenever $M > N$! Hint. Assume by contradiction that $\varphi$ is such a mapping and put $f(x) = (\varphi(x), 0, \dots, 0)$. Apply Corollary 4.3.127. For another proof see Exercise 4.3.137.

Exercise 4.3.135. Let $\Omega$ be a ball in $\mathbb{C}$ with sufficiently large radius and let $P$ be a polynomial of degree $n \geq 1$. Show that $\deg(P, \Omega, 0) = n$. Hint. For $P(x) = \sum_{k=0}^{n} a_k x^k$, $a_n \neq 0$, use the homotopy
\[
H(t, x) = tP(x) + (1 - t)a_n x^n \quad \text{on} \quad \overline{\Omega}.
\]
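The claim of Exercise 4.3.135 can be explored numerically. For a polynomial on a disc in $\mathbb{C}$ without zeros on the boundary circle, the degree with respect to $0$ coincides with the winding number of the image of the boundary circle around $0$ (a standard fact assumed here, not proved in the text). The Python sketch below accumulates phase increments along the circle; the function name `winding_number` and the sample polynomial are hypothetical choices for illustration.

```python
import cmath
import math

def winding_number(p, radius, samples=20000):
    """Winding number around 0 of the curve t -> p(radius * e^{it}).

    Assumes p has no zero on the circle and that `samples` is large
    enough for consecutive phase increments to stay below pi.
    """
    total = 0.0
    prev = p(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        cur = p(z)
        total += cmath.phase(cur / prev)  # increment of the argument
        prev = cur
    return round(total / (2 * math.pi))

def P(z):
    # Sample polynomial of degree n = 3.
    return z**3 - 2 * z + 5

print(winding_number(P, 10.0))  # large disc: contains all zeros of P
print(winding_number(P, 0.5))   # small disc: contains no zero of P
```

The second call indicates why the radius has to be sufficiently large: on a small disc containing no zero of $P$ the computed number is $0$ rather than $n$.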

Exercise 4.3.136. Let $f$ be an odd mapping from $S^M = \partial B(o; 1) \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^{M+1} \setminus \{o\}$. Show that there is no continuous extension of $f$ to
\[
\varphi : B(o; 1) \to \mathbb{R}^{M+1} \setminus \{o\}.
\]
(This is Borsuk's original formulation of Theorem 4.3.130.)

Exercise 4.3.137. Deduce the assertion of Exercise 4.3.134 from the Borsuk–Ulam Theorem (Corollary 4.3.132).

Exercise 4.3.138. Prove the following result due to Lusternik and Schnirelmann: Let $F_1, \dots, F_{M+1}$ be closed sets which cover $S^M$. Then at least one $F_i$ contains a pair of antipodal points (i.e., $x, -x \in F_i$). Hint. Let
\[
\varphi(x) := -x, \quad x \in S^M,
\]
and suppose that
\[
\varphi(F_i) \cap F_i = \emptyset \quad \text{for} \quad i = 1, \dots, M.
\]
There are continuous functions $f^i : S^M \to [0, 1]$ such that
\[
f^i(F_i) = \{0\}, \quad f^i(\varphi(F_i)) = \{1\}
\]
(this consequence of the Tietze Theorem is known in a normal topological space as the Urysohn Lemma). Put $f = (f^1, \dots, f^M)$ and apply the Borsuk–Ulam Theorem to obtain a point $x_0$. Show that $x_0 \in F_{M+1} \cap \varphi(F_{M+1})$.


Exercise 4.3.139. Prove the following complement of the above covering result of Lusternik and Schnirelmann: There exist closed sets $F_1, \dots, F_{M+2}$ which cover $S^M$, and such that no $F_i$ contains a pair of antipodal points. Hint. Proceed by induction with respect to $M$. The assertion is obviously true for $M = 1$. Let $M = 2$. Cover the equator of $S^2$ with three closed sets $E_1$, $E_2$ and $E_3$, with $E_j \cap (-E_j) = \emptyset$, $j = 1, 2, 3$. Then choose a latitude $L$ on the southern hemisphere and extend the cover of the equator to a covering $A_1$, $A_2$ and $A_3$ of the set of all points which lie to the north of $L$, including those of latitude $L$ (see Figure 4.3.27).

[Figure 4.3.27: the sets $A_1$, $A_2$, $A_3$, $A_4$, the equatorial sets $E_1$, $E_2$, $E_3$ and the latitude $L$.]

Here $A_j$ consists of all great circle arcs from latitude $L$ to the north pole which contain a point of $E_j$. Finally, let $A_4$ consist of all points lying to the south of $L$, including those of latitude $L$. Then $A_1$, $A_2$, $A_3$ and $A_4$ is the desired covering of $S^2$. Continue the argument for $M \geq 3$.

Exercise 4.3.140. Prove the following Bread–Ham–Cheese Theorem: If $B_1, \dots, B_M$ are bounded measurable sets in $\mathbb{R}^M$ with $M \geq 1$, then there is an $(M-1)$-dimensional plane which divides all the sets $B_j$ into two parts of the same measure. This assertion can be reformulated in three dimensions as follows: Suppose we have a sandwich of bread, ham and cheese with ham and cheese piled attractively but irregularly on the bread. Then the sandwich can be cut in two with one straight slash of a knife in such a way that each of two persons gets an identical share of bread, ham and cheese. Hint. Let $M = 2$ and let $d \in S^1$ determine a direction in $\mathbb{R}^2$. Take a perpendicular line to $d$ and move it from $-\infty$ to $+\infty$ (see Figure 4.3.28).


[Figure 4.3.28: the set $B_1$, the origin $o$, the direction $d$ and the line $H_d$.]

Take the first and the last perpendicular (they need not be distinct) which splits the set $B_1$ into two parts of the same measure. The perpendicular $H_d$ which is half-way between these two has the equation $(x, d) = a(d)$ where $a : \mathbb{R}^2 \to \mathbb{R}$ satisfies
\[
a(-d) = -a(d).
\]
In order to find $d$ for which $H_d$ also splits $B_2$ into two parts of the same measure, we set $f(d) := \operatorname{meas}\{x \in B_2 : (x, d) > a(d)\}$. Then $f : S^1 \to \mathbb{R}$ is continuous and $f(d) + f(-d) = \operatorname{meas} B_2$. By the Borsuk–Ulam Theorem, there is a point $d$ such that $f(d) = f(-d)$. Thus the corresponding $H_d$ divides $B_2$ into two parts of the same measure. If $M \geq 3$, then construct functions $f_2, \dots, f_M$ corresponding to $B_2, \dots, B_M$.

Chapter 5

Topological and Monotonicity Methods

5.1 Brouwer and Schauder Fixed Point Theorems

One of the most frequent problems in analysis, especially in its applications, consists in solving the equation $F(x) = y$ where $F$ is a mapping from a Banach space $X$ into a Banach space $Y$ (footnote 1). Such an equation can be reduced to the equation $F(x) = o$, or, provided $X \subset Y$, to the equation
\[
F(x) = x. \tag{5.1.1}
\]
In this section we present two basic results on the solvability of (5.1.1) in special cases, namely, for a continuous mapping $F$ and a finite dimensional $X$, and for a compact mapping $F$ in a general Banach space of infinite dimension: the Brouwer and the Schauder Fixed Point Theorems.

We start with the finite dimensional case. A brief inspection of $F : \mathbb{R} \to \mathbb{R}$ indicates that reasonable assumptions on $F$ are continuity on a closed interval $I$ and $F(I) \subset I$. Moreover, the interval $I$ should also be bounded. The Intermediate Value Theorem from Calculus applied to $g(x) = F(x) - x$ says that there is a solution of (5.1.1) in $I$ provided these assumptions are satisfied. Notice that these assumptions are too weak to say anything about the number of solutions. Since there is no appropriate ordering in $\mathbb{R}^2$, standard proofs of the above result fail in $\mathbb{R}^2$ and, therefore, a generalization is far from being simple.

Footnote 1. Spaces $X$, $Y$ are assumed to have linear and topological structure since we discuss problems of analysis and not only of algebra or topology. Banach space structures are supposed mainly for simplification here, but sometimes they can be crucial.


Instead of an interval we consider the closed unit ball $B := B(o; 1)$ in $\mathbb{R}^N$, $N \geq 2$, and a continuous mapping $F : B \to B$. Suppose that the equation (5.1.1) has no solution in $B$. Define a map $G$ as indicated in Figure 5.1.1, i.e.,
\[
G(x) = \tau(x)F(x) + (1 - \tau(x))x
\]
where $\tau(x) \leq 0$ is a solution of the quadratic equation
\[
\|\tau F(x) + (1 - \tau)x\|^2 = 1.
\]

[Figure 5.1.1: the ball $B$ with the points $F(x)$, $x$ and $G(x)$.]

The mapping $G$ is well defined (remember our assumption that $F(x) \neq x$ for $x \in B$), it maps $B$ continuously onto the unit sphere $S^{N-1}$ in $\mathbb{R}^N$ and
\[
G(x) = x \quad \text{for } \|x\| = 1.
\]
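The quadratic equation for $\tau$ can be solved explicitly. With $d = F(x) - x$ we have $G(x) = x + \tau d$, and $\|x + \tau d\|^2 = 1$ reads $\tau^2 \|d\|^2 + 2\tau (x, d) + \|x\|^2 - 1 = 0$; for $\|x\| \leq 1$ the product of its roots is nonpositive, so the smaller root is the one with $\tau \leq 0$. The Python sketch below evaluates $G$ this way. Since by the Brouwer Fixed Point Theorem no fixed-point-free $F : B \to B$ exists, the sample map (an assumption made for illustration) has a fixed point at the origin and $G$ is evaluated only away from it; the name `retract` is likewise hypothetical.

```python
import math

def retract(F, x):
    """Return G(x) = tau*F(x) + (1 - tau)*x with tau <= 0 and |G(x)| = 1.

    Assumes |x| <= 1 and F(x) != x (Euclidean norm).
    """
    Fx = F(x)
    d = [fi - xi for fi, xi in zip(Fx, x)]         # d = F(x) - x
    dd = sum(di * di for di in d)                  # |d|^2
    if dd == 0.0:
        raise ValueError("x is a fixed point of F; G(x) is undefined")
    xd = sum(xi * di for xi, di in zip(x, d))      # (x, d)
    xx = sum(xi * xi for xi in x)                  # |x|^2
    disc = xd * xd + dd * (1.0 - xx)               # quarter of the discriminant
    tau = (-xd - math.sqrt(max(disc, 0.0))) / dd   # the nonpositive root
    return [xi + tau * di for xi, di in zip(x, d)]

# Sample map of the unit disc into itself: rotation by 90 degrees, scaled by 1/2.
F = lambda x: [-0.5 * x[1], 0.5 * x[0]]

inner = retract(F, [0.3, 0.4])      # interior point is pushed to the circle
boundary = retract(F, [0.6, 0.8])   # boundary point stays (numerically) fixed
print(inner, math.hypot(*inner))
print(boundary)
```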

However, such a map $G$ seems to be impossible: our experience says that if the ball is continuously deformed (by $G$) onto the sphere, this ball has to be punctured. Nevertheless, a rigorous proof of this fact is far from being obvious. Such a proof could be based on introducing certain topological notions which are preserved under continuous deformation. If we show that some of these notions are different for the ball and its boundary, we obtain a contradiction with the existence of $G$. Algebraic topology is devoted to the study of topological invariants of an algebraic nature (homotopy groups, homology groups, etc.). However, such methods are beyond the scope of this book.

Instead we will give an analytic proof of the existence of a fixed point of a continuous mapping $F : B \to B$. This proof, which is due to Milnor [95], is based on the idea of approximating a "bad" nonlinear mapping $F$ by a simpler one. A smooth approximation is possible by the Weierstrass Approximation Theorem (see Theorem 1.2.14 and the discussion there). Suppose, by contradiction, that $F$ has no fixed point in $B$. For any $\varepsilon > 0$ there are polynomials $P_1, P_2, \dots, P_N$ of $N$ variables, such that for $P = (P_1, \dots, P_N)$ we have
\[
\sup_{\|x\| \leq 1} \|F(x) - P(x)\| < \varepsilon
\]
and also that $\tilde{P} := \frac{P}{1 + \varepsilon} : B \to B$. Moreover, $\tilde{P}$ has no fixed points in $B$, either, provided $\varepsilon$ is small enough. This follows from the estimate
\[
\|F(x) - x\| \leq \|F(x) - P(x)\| + (1 + \varepsilon)\left\|\tilde{P}(x) - x + \frac{\varepsilon}{1 + \varepsilon}x\right\| < 2\varepsilon + (1 + \varepsilon)\|\tilde{P}(x) - x\|
\]
and the fact that $\inf_{x \in B} \|F(x) - x\| > 0$, since $B$ is compact. Now we construct the mapping $G : B \to S^{N-1}$ corresponding to $\tilde{P}$ as has been shown in Figure 5.1.1 for $F$. We put
\[
H(t, x) := (1 - t)x + tG(x), \quad x \in B.
\]
The most important properties of $H$ are given in the next lemma.

Lemma 5.1.1.
(i) $H(t, \cdot)$ maps $B$ into itself for every $t \in [0, 1]$.
(ii) $H(t, \cdot)$ maps $\operatorname{int} B$ into itself for every $t \in [0, 1)$.
(iii) The partial Fréchet derivative $H_2'(t, x)$ exists on $[0, 1] \times \operatorname{int} B$ and is bounded on this set.
(iv) For small $t \geq 0$ the mapping $H(t, \cdot)$ is a diffeomorphism of $\operatorname{int} B$ onto itself.

Proof. The first two statements are obvious, the third follows from differentiation of $\tau(x)$ (see Exercise 5.1.18). Let us prove the fourth statement. For a small positive $t$ the derivative
\[
H_2'(t, x) = (1 - t)I + tG'(x)
\]
is an isomorphism (Proposition 2.1.2) and, by the Local Inverse Function Theorem, $H(t, \cdot)$ is a local diffeomorphism. Since, by the Mean Value Theorem applied to $G$,
\[
\|H(t, x) - H(t, y)\| \geq (1 - t)\|x - y\| - t \sup_{\|z\| < 1} \|G'(z)\| \, \|x - y\| \geq (1 - ct)\|x - y\|
\]
for a constant $c$ and all $x, y \in \operatorname{int} B$, the mapping $H(t, \cdot)$ is injective for small $t$ and hence it is also a diffeomorphism on the whole $\operatorname{int} B$. It remains to prove that $H(t, \cdot)(\operatorname{int} B) = \operatorname{int} B$. Notice that $\operatorname{int} B$ is a connected set and thus it is sufficient to show that $M := H(t, \cdot)(\operatorname{int} B)$ is open and relatively closed in $\operatorname{int} B$. The former property is a consequence of the local continuous invertibility of $H(t, \cdot)$. To see the latter property we point out that
\[
M = H(t, \cdot)(B) \cap \operatorname{int} B
\]
($\|H(t, x)\| = 1$ for every $\|x\| = 1$). Since $B$ is compact, $H(t, \cdot)(B)$ is also compact and therefore closed, i.e., $M$ is relatively closed in $\operatorname{int} B$.


Having Lemma 5.1.1 we continue the proof of our main statement on the existence of a fixed point of $F$. To reach a contradiction with the assumption of non-existence, we will use the substitution theorem for the Lebesgue integral: If $\operatorname{meas} A$ denotes the $N$-dimensional Lebesgue measure of $A \subset \mathbb{R}^N$, we have
\[
\operatorname{meas}(\operatorname{int} B) = \int_{\operatorname{int} B} dx = \int_{\operatorname{int} B} \det H_2'(t, y) \, dy \tag{5.1.2}
\]
for small positive $t$. Notice that $H_2'(0, y) = I$ and thus $\det H_2'(t, y)$ is positive for small $t > 0$. The second equality follows from the substitution $x = H(t, y)$ (Lemma 5.1.1(iv)). The last integral in (5.1.2) is defined for all $t \in [0, 1]$, and it is a polynomial $Q(t)$ of the variable $t \in [0, 1]$. Since $Q(t)$ is a constant for small $t$, we also obtain that $Q(1) = \operatorname{meas}(\operatorname{int} B)$. The substitution $G(y) := H(1, y) = z$ yields
\[
Q(1) = \int_{\operatorname{int} B} \det G'(y) \, dy = \int_{G(\operatorname{int} B)} dz = \operatorname{meas} G(\operatorname{int} B)
\]
(footnote 2). But $G(B) = S^{N-1}$ and $\operatorname{meas} S^{N-1} = 0$. Hence
\[
0 = Q(1) = Q(0) = \operatorname{meas}(\operatorname{int} B),
\]
a contradiction.

In order to get a fixed point theorem in reasonable generality we prove the following simple topological result.

Lemma 5.1.2. Let $K$ be a convex, closed and bounded subset of $\mathbb{R}^N$ which contains at least two different points. Then $K$ is homeomorphic to the unit ball $B^M := B(o; 1)$ in $\mathbb{R}^M$ for some $M \leq N$.

Proof. Choose linearly independent elements $x_1, \dots, x_M$ of $K$ such that $X := \operatorname{Lin}\{x_1, \dots, x_M\}$ contains $K$. The existence of $x_0 \in K$ such that for any $x \in X$ there is $\alpha > 0$ such that $x_0 + \frac{1}{\alpha}x \in K$ can be proved by induction with respect to the dimension of $X$. For the sake of simplicity we assume that $x_0 = o$, and define the Minkowski functional of $K$, i.e.,
\[
p(x) = \inf\left\{\alpha > 0 : \frac{1}{\alpha}x \in K\right\}, \quad \varphi(x) = p(x)\frac{x}{\|x\|_{\mathbb{R}^N}}, \ x \in X \setminus \{o\}, \quad \varphi(o) = o.
\]
It is not difficult to prove that $\varphi$ is a homeomorphism of $K$ onto $B^N \cap X$. Since $X$ with the induced $\mathbb{R}^N$-norm is isomorphic to $\mathbb{R}^M$ (Corollary 1.2.11(i)), $K$ is also homeomorphic to $B^M$.

Footnote 2. Notice that for this substitution we do not need $G$ to be a diffeomorphism.


The first main result of this section is the following Brouwer Fixed Point Theorem.

Theorem 5.1.3 (Brouwer Fixed Point Theorem). Let $K$ be a nonempty, convex, closed and bounded subset of $\mathbb{R}^N$. Assume that $F : K \to K$ is continuous. Then $F$ has a fixed point in $K$.

Proof. If $K$ has exactly one point then the statement is obvious. In other cases choose a homeomorphism $\Phi$ of $B^M = B(o; 1) \subset \mathbb{R}^M$ onto $K$ (Lemma 5.1.2). According to the above discussion, the mapping
\[
\Phi^{-1} \circ F \circ \Phi : B^M \to B^M
\]
has a fixed point $\tilde{x} \in B^M$. Then $F(\Phi(\tilde{x})) = \Phi(\tilde{x}) \in K$.

The following example shows an interesting application of the Brouwer Fixed Point Theorem in linear algebra.

Example 5.1.4. Let $A = (a_{ij})_{i,j=1,\dots,N}$ be a matrix such that
\[
a_{ij} \geq 0 \quad \text{for all } i, j = 1, \dots, N.
\]
Then there exists a nonnegative eigenvalue $\lambda$ of $A$ with an eigenvector $x = (x_1, \dots, x_N)$ having all its components
\[
x_i \geq 0, \quad i = 1, \dots, N.
\]
Indeed, consider the $l^1$-norm on $\mathbb{R}^N$, i.e.,
\[
\|x\|_1 = \sum_{i=1}^{N} |x_i|, \quad \text{and let} \quad D := \{x \in \mathbb{R}^N : \|x\|_1 = 1, \ x_i \geq 0, \ i = 1, \dots, N\}.
\]
Then $D$ is a nonempty, closed, convex and bounded subset of $\mathbb{R}^N$. Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear operator with the representation in the standard basis given by the matrix $A$. If $A$ vanishes at an $x \in D$, then such $x$ is an eigenvector for the eigenvalue $\lambda = 0$. If this is not the case, put
\[
f(x) = \frac{Ax}{\|Ax\|_1} \quad \text{for } x \in D.
\]
Since $f$ maps $D$ continuously into $D$, it has a fixed point $x$ in $D$. Then
\[
Ax = \lambda x \quad \text{where } \lambda = \|Ax\|_1.
\]
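The map $f(x) = Ax / \|Ax\|_1$ from Example 5.1.4 is easy to experiment with. The Python sketch below iterates $f$ on the simplex $D$ for a small matrix with positive entries. Note the swapped-in technique: the Brouwer Fixed Point Theorem only asserts that a fixed point exists, whereas plain iteration of $f$ (the power method) is an assumption made here that happens to converge for this sample matrix. The matrix and the name `normalized_image` are hypothetical.

```python
import math

def normalized_image(A, x):
    """Return (f(x), |Ax|_1) where f(x) = Ax / |Ax|_1."""
    y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    s = sum(abs(c) for c in y)
    if s == 0.0:
        raise ValueError("Ax = 0: x is an eigenvector for the eigenvalue 0")
    return [c / s for c in y], s

A = [[2.0, 1.0],
     [1.0, 3.0]]            # sample matrix with nonnegative entries
x = [0.5, 0.5]              # starting point in D
for _ in range(200):        # iterate f; convergence is not guaranteed in general
    x, lam = normalized_image(A, x)

Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
residual = max(abs(c - lam * xi) for c, xi in zip(Ax, x))
print(x, lam, residual)
```

For this matrix the limit can be compared with the largest eigenvalue $(5 + \sqrt{5})/2$ of $A$, computed by hand from the characteristic polynomial.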

Let us now mention the standard application of the Brouwer Fixed Point Theorem to the existence of periodic solutions of ordinary differential equations. The basic idea goes back to H. Poincaré: Denote by $x(\cdot; \xi)$ a solution of the initial value problem
\[
\dot{x}(t) = f(t, x(t)), \quad x(0) = \xi. \tag{5.1.3}
\]


Assume that $f$ satisfies conditions which ensure the existence and uniqueness of a solution of (5.1.3) (see, e.g., Theorem 2.3.4) and, moreover, that $f(\cdot, x)$ is $T$-periodic. Then $x(\cdot; \xi)$ is a $T$-periodic solution of (5.1.3) if and only if $x(\cdot; \xi)$ is defined on the interval $[0, T]$ and $P\xi := x(T; \xi) = \xi$ ($P$ is called the Poincaré mapping). Since $x(\cdot; \xi)$ depends continuously on the initial condition $\xi$ (under reasonable assumptions on $f$, see Remark 2.3.5 and Example 4.2.5), the Poincaré mapping is continuous and its fixed points can be found with the help of the Brouwer Fixed Point Theorem as the following example suggests.

Example 5.1.5. Assume in addition that there exists $r > 0$ such that
\[
(x, f(t, x))_{\mathbb{R}^N} \leq 0 \quad \text{for all } t \in [0, T] \text{ and } \|x\|_{\mathbb{R}^N} \leq r.
\]
Then there exists a $T$-periodic solution of (5.1.3). To be able to apply the Brouwer Fixed Point Theorem to the Poincaré mapping $P$ it is sufficient to show that $P$ maps the closed ball $B(o; r)$ into itself. Since
\[
\frac{d}{dt} \frac{1}{2}\|x(t)\|^2 = (x(t), f(t, x(t))) \leq 0 \quad \text{whenever } \|x(t)\| \leq r,
\]
the function $t \mapsto \|x(t)\|$ is decreasing provided $\|x(0)\| \leq r$. In particular, $P$ is well defined (i.e., a solution $x(\cdot, x_0)$ exists on the interval $[0, T]$ provided $\|x(0)\| < r$) and $P$ maps $B(o; r)$ into itself.

Example 5.1.6. Assume that the right-hand side in (5.1.3) is asymptotically linear in $\mathbb{R}^N$, i.e., there exist a $T$-periodic continuous matrix $A(t)$ and a function $g : \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}^N$, continuous, $T$-periodic in $t$, and locally Lipschitz with respect to the $x$-variables, such that for
\[
f(t, x) = A(t)x + g(t, x) \tag{5.1.4}
\]
the following condition is satisfied:
\[
\text{(H)} \quad \forall \varepsilon > 0 \ \exists b > 0 : \quad \|g(t, x)\| \leq b + \varepsilon\|x\| \quad \text{for all } t \in \mathbb{R}, \ x \in \mathbb{R}^N
\]
(footnote 3). We are again interested in periodic solutions to (5.1.3) with $f$ given by (5.1.4). First we have to show that the Poincaré mapping is well defined for all $\xi \in \mathbb{R}^N$. Denote by $\Phi(t, s) := \Phi(t)\Phi^{-1}(s)$ where $\Phi(t)$ is the fundamental matrix of the linear equation
\[
\dot{x} = A(t)x \tag{5.1.5}
\]
such that $\Phi(s, s) = I$ (footnote 4). Then the solution of (5.1.3) satisfies the integral equation
\[
x(t; \xi) = \Phi(t, 0)\xi + \int_0^t \Phi(t, s)g(s, x(s; \xi)) \, ds \tag{5.1.6}
\]
(the Variation of Constants Formula) whenever it exists on the interval $[0, t]$. The fundamental matrix $\Phi$ is continuous on $[0, T] \times [0, T]$ and therefore it is bounded:
\[
\|\Phi(t, s)\|_{\mathbb{R}^{N \times N}} \leq K, \quad (t, s) \in [0, T] \times [0, T].
\]
By (H), we get the estimate
\[
\|x(t; \xi)\| \leq K\|\xi\| + KbT + K\varepsilon \int_0^t \|x(s; \xi)\| \, ds
\]
and, with the help of the Gronwall inequality (see Exercise 5.1.16),
\[
\|x(t; \xi)\| \leq K(bT + \|\xi\|)e^{K\varepsilon T} = L_1 + L_2\|\xi\| \tag{5.1.7}
\]
whenever $x(\cdot; \xi)$ is defined on the interval $[0, t] \subset [0, T]$. If the maximal interval of the existence of the solution $x(\cdot; \xi)$ is $[0, \tau)$ with $\tau \leq T$, then the boundedness of $x(\cdot; \xi)$, (5.1.7) and the condition (H) imply that $x(\cdot; \xi)$ is uniformly continuous on $[0, \tau)$, and therefore it can be extended to a larger interval (see Proposition 1.2.4 and cf. a similar idea in Corollary 3.1.6). This implies that $x(\cdot; \xi)$ is defined on $[0, T]$ (actually on $\mathbb{R}$) and the mapping $P$ is defined for all $\xi \in \mathbb{R}^N$.

To apply the Brouwer Fixed Point Theorem (the problem is to show that $P$ maps a ball into itself) we assume that $1$ is not a Floquet multiplier of the linear equation (5.1.5), i.e., $1 \notin \sigma(\Phi(T, 0))$ or, equivalently, the equation (5.1.5) possesses only the trivial $T$-periodic solution. Then the equation $P(\xi) = \xi$ is equivalent to the equation
\[
F(\xi) := [I - \Phi(T, 0)]^{-1}[x(T; \xi) - \Phi(T, 0)\xi] = \xi.
\]
From (5.1.6) and (5.1.7) we obtain
\[
\|F(\xi)\| \leq \|[I - \Phi(T, 0)]^{-1}\| \, [Kb + K\varepsilon(L_1 + L_2\|\xi\|)]T = c_1(\varepsilon) + c_2\varepsilon\|\xi\|
\]
where $c_2$ does not depend on $\varepsilon$. Choose $\varepsilon$ small enough to satisfy $c_2\varepsilon < 1$. Keeping such $\varepsilon$ fixed there is $r > 0$ such that $c_1(\varepsilon) + c_2\varepsilon r \leq r$. It follows that $F$ maps the ball $B(o; r) \subset \mathbb{R}^N$ of radius $r$ into itself and the Brouwer Fixed Point Theorem yields a $T$-periodic solution of (5.1.3) with $f$ given by (5.1.4).

Footnote 3. Roughly speaking: $g$ has a uniformly (with respect to $t$) "vanishing derivative at infinity".

Footnote 4. This means that $x : t \mapsto \Phi(t, s)\xi$ is a (unique) solution of the equation (5.1.5) which satisfies $x(s) = \xi$.

The Brouwer Fixed Point Theorem is a very strong device for solving finite dimensional nonlinear equations. Unfortunately, it does not hold in infinite dimensions as the following example shows.


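Example 5.1.7 (Kakutani). Let $H$ be a separable Hilbert space with an orthonormal basis $\{e_n\}_{n=1}^{\infty}$. Denote by $A \in \mathcal{L}(H)$ the right shift given by
\[
Ae_n = e_{n+1}, \quad \text{i.e.,} \quad A\left(\sum_{n=1}^{\infty} x_n e_n\right) = \sum_{n=1}^{\infty} x_n e_{n+1},
\]
and
\[
F(x) = (1 - \|x\|^2)^{\frac{1}{2}} e_1 + Ax.
\]
Then $F$ is continuous and
\[
\|F(x)\|^2 = 1 - \|x\|^2 + \|Ax\|^2 = 1 \quad \text{for} \quad \|x\| \leq 1.
\]
If $x = \sum_{n=1}^{\infty} x_n e_n$ is a fixed point of $F$, then
\[
x_n = x_{n+1} \quad \text{and} \quad x_1 = (1 - \|x\|^2)^{\frac{1}{2}}.
\]
The series $\sum_{n=1}^{\infty} x_n^2$, with $x_n = x_{n+1}$, is convergent only if $x_n = 0$ for all $n$, i.e., $x = 0$. Then $x_1 = 1$, a contradiction.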
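The two computations in Example 5.1.7 can be tried on finitely supported sequences, which form a dense subset of the unit ball of $\ell^2$. The Python sketch below represents such an $x$ by the list of its first coordinates; the names `norm` and `kakutani_F` and the sample vector are assumptions for illustration only.

```python
import math

def norm(x):
    """Euclidean (l^2) norm of a finitely supported sequence."""
    return math.sqrt(sum(c * c for c in x))

def kakutani_F(x):
    """F(x) = sqrt(1 - |x|^2) e_1 + A x, where A is the right shift.

    x lists the first coordinates of a finitely supported sequence
    with |x| <= 1; the result has one more coordinate than x.
    """
    shifted = [0.0] + list(x)                        # A x: right shift
    shifted[0] += math.sqrt(max(0.0, 1.0 - sum(c * c for c in x)))
    return shifted

x = [0.5, -0.25, 0.1]
y = kakutani_F(x)
padded = list(x) + [0.0] * (len(y) - len(x))         # x in the same coordinates as y
gap = norm([a - b for a, b in zip(y, padded)])
print(norm(y), gap)   # F(x) has norm 1, and F(x) differs from x
```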

Notice that in the previous example the apparently simple linear operator $A$ is perturbed by a nonlinear operator with a one-dimensional range. Continuous operators with the range in a finite dimensional subspace form an important special subclass of the so-called (nonlinear) compact operators.

Definition 5.1.8. Let $X$, $Y$ be normed linear spaces and let $M \subset X$. A mapping $F : M \to Y$ is called a compact operator on $M$ into $Y$ if $F$ is continuous on $M$ ($M$ being a metric space with the metric induced by the norm of $X$) and $F(M \cap K)$ is a relatively compact set in $Y$ for any bounded set $K \subset X$. The set of all compact operators from $M$ into $Y$ is denoted by $\mathcal{C}(M, Y)$. If the range of $F \in \mathcal{C}(M, Y)$ is a subset of a finite dimensional subspace of $Y$, then we say that $F$ is a finite dimensional operator and write $F \in \mathcal{C}_f(M, Y)$.

We recall that linear compact operators have been investigated in Section 2.2.

Warning. In contrast to the linear case the continuity of a nonlinear operator $F$ is not a consequence of the fact that $F$ maps bounded sets onto relatively compact ones! A simple example can be constructed for $F : \mathbb{R} \to \mathbb{R}$.

Our interest in compact operators arises from the observation that they are close to finite dimensional ones. The precise formulation follows.

Theorem 5.1.9. Let $X$ be a normed linear space, $Y$ a Banach space and let $M$ be a bounded subset of $X$.
(i) If $F \in \mathcal{C}(M, Y)$, then there is a sequence $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}_f(M, Y)$ which converges to $F$ uniformly on $M$.
(ii) If $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}(M, Y)$ and $\lim_{n \to \infty} F_n(x) = F(x)$ uniformly for $x \in M$, then $F \in \mathcal{C}(M, Y)$.

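Proof. (i) Since $\overline{F(M)}$ is compact there is a finite $\frac{1}{n}$-net $y_1, \dots, y_m \in F(M)$ of $F(M)$ (Proposition 1.2.3). The functions
\[
\varphi_k(x) = \max\left\{0, \frac{1}{n} - \|F(x) - y_k\|\right\}
\]
are continuous on $M$ and $\sum_{k=1}^{m} \varphi_k(x) > 0$ for every $x \in M$. Therefore the functions
\[
\mu_k(x) := \frac{\varphi_k(x)}{\sum_{k=1}^{m} \varphi_k(x)}, \quad k = 1, \dots, m,
\]
form a continuous partition of unity on $M$. Put
\[
F_n(x) = \sum_{k=1}^{m} \mu_k(x)y_k, \quad x \in M.
\]
Then $F_n \in \mathcal{C}_f(M, Y)$ and
\[
\|F(x) - F_n(x)\| \leq \sum_{k=1}^{m} \mu_k(x)\|F(x) - y_k\| < \frac{1}{n} \quad \text{for every } x \in M.
\]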

(ii) If we literally translate the classical proof for real functions to vector functions we see that $F$ is continuous on $M$. Let $n \in \mathbb{N}$ be such that
\[
\sup_{x \in M} \|F(x) - F_n(x)\| < \varepsilon
\]
and let $y_1, \dots, y_k$ be an $\varepsilon$-net for $F_n(M)$. Then it is also a $2\varepsilon$-net for $F(M)$. Since $Y$ is a Banach space, Proposition 1.2.3 shows that $F(M)$ is relatively compact.

Remark 5.1.10. The assertion (i) of Theorem 5.1.9 obviously holds for linear compact operators, but generally we cannot guarantee linearity of the approximating sequence $\{F_n\}_{n=1}^{\infty}$ (see Remark 2.2.7).

The following theorem is a generalization of the Brouwer Fixed Point Theorem into the infinite dimensional setting.

Theorem 5.1.11 (Schauder Fixed Point Theorem). Let $K$ be a nonempty, closed, convex and bounded subset of a normed linear space $X$. Assume that $F \in \mathcal{C}(K, X)$ and $F(K) \subset K$. Then there is a fixed point of $F$ in $K$.

Proof. Let $\{F_n\}_{n=1}^{\infty}$ be the sequence constructed in the proof of Theorem 5.1.9(i). Denote $X_n := \operatorname{Lin}\{y_1, \dots, y_m\}$. Since $y_1, \dots, y_m \in F(K)$ and $K$ is convex, we have
\[
F_n(K) \subset K \cap X_n.
\]
The restriction of $F_n$ to $K \cap X_n$ satisfies the assumptions of the Brouwer Fixed Point Theorem and hence there is $x_n \in K \cap X_n$ such that $F_n(x_n) = x_n$. By the compactness of $F$ there is a subsequence $\{F(x_{n_k})\}_{k=1}^{\infty}$ which converges to an $x \in \overline{F(K)} \subset \overline{K} = K$. The estimate
\[
\|F(x_{n_k}) - x_{n_k}\| = \|F(x_{n_k}) - F_{n_k}(x_{n_k})\| < \frac{1}{n_k}
\]
implies that also $\lim_{k \to \infty} x_{n_k} = x$. Since $F$ is continuous, we conclude that
\[
\lim_{k \to \infty} F(x_{n_k}) = F(x) \quad \text{and} \quad F(x) = x.
\]
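The finite dimensional approximation used in the proofs of Theorems 5.1.9(i) and 5.1.11 is a convex combination of net points with weights $\mu_k$. The Python sketch below applies that formula to a value $v = F(x)$ in $\mathbb{R}^2$ and confirms that the result stays within $\varepsilon$ of $v$. The grid net, the value of `eps` and the name `schauder_projection` are assumptions made for illustration; in the proof the net consists of points of $F(M)$.

```python
import math

def schauder_projection(v, net, eps):
    """Return sum_k mu_k * y_k with mu_k proportional to max(0, eps - |v - y_k|).

    Assumes `net` is an eps-net for v, i.e. some y_k lies within eps of v.
    """
    weights = [max(0.0, eps - math.dist(v, y)) for y in net]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no net point within eps of v")
    dim = len(v)
    return [sum(w * y[i] for w, y in zip(weights, net)) / total
            for i in range(dim)]

# Grid of spacing 1/4 on the unit square: every point of the square lies
# within sqrt(2)/8 < 0.2 of some grid point, so this is a 0.2-net.
net = [(i / 4, j / 4) for i in range(5) for j in range(5)]
eps = 0.2

samples = [(0.1, 0.9), (0.5, 0.5), (0.33, 0.71), (0.125, 0.125)]
errors = [math.dist(schauder_projection(v, net, eps), v) for v in samples]
print(errors)   # each error is below eps
```

Only net points closer than `eps` to `v` receive a positive weight, which is exactly why the convex combination cannot move `v` by `eps` or more.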

Remark 5.1.12. The above proof of Theorem 5.1.11 is based on the approximation of $F$ by $F_n \in \mathcal{C}_f(K, X)$. The construction in the proof of Theorem 5.1.9(i) is surely not unique. We recommend that the reader think about a possible simplification when $F$ acts on a separable Hilbert space. Another possibility occurs when $K$ is a compact convex set. We obtain a typical situation as soon as $X$ is a reflexive Banach space and $K$ is a closed, convex and bounded subset of $X$. Then $K$ is compact in the weak topology (Theorem 2.1.25; see footnote 5) and the continuity of $F : K \to K$ in the weak topology (it sends weakly convergent sequences into weakly convergent ones) is sufficient to justify application of Theorem 5.1.11. A slightly more general statement was proved by A.N. Tikhonov (for a proof see, e.g., Dugundji [43, Appendix 1] and Deimling [34, § 10.3]).

Footnote 5. We also have to use the fact that a convex set which is closed in the norm topology is also weakly closed (cf. Exercise 2.1.39).

We now show how the Schauder Fixed Point Theorem can be applied to differential equations. To avoid technical details we restrict ourselves to ordinary differential equations. Their solutions are generally smooth, which suggests a relation to compact operators.

Proposition 5.1.13. Let $G$ be an open subset of $\mathbb{R}^{N+1}$ and let $f : G \to \mathbb{R}^N$ be continuous on $G$. Then for any $(t_0, x_0) \in G$ there exists $\delta > 0$ such that the equation
\[
\dot{x} = f(t, x)
\]
has a solution on the interval $(t_0 - \delta, t_0 + \delta)$ which satisfies the condition $x(t_0) = x_0$.

Proof. It has been shown in Lemma 3.1.5 that the initial value problem is equivalent to the integral equation
\[
F(x)(t) := x_0 + \int_{t_0}^{t} f(s, x(s)) \, ds = x(t) \tag{5.1.8}
\]
in the space $C[t_0 - \delta, t_0 + \delta]$. Choose $\delta > 0$, $r > 0$ such that
\[
M = [t_0 - \delta, t_0 + \delta] \times B(x_0; r) \subset G
\]
($B(x_0; r)$ is the closed ball in $\mathbb{R}^N$ of radius $r$ centered at $x_0$). Then $M$ is a compact set in $\mathbb{R}^{N+1}$, and therefore $f$ is bounded on $M$, say
\[
\|f(t, x)\|_{\mathbb{R}^N} \leq c \quad \text{for} \quad (t, x) \in M.
\]
Then
\[
\|F(x) - x_0\|_{C[t_0 - \delta, t_0 + \delta]} \leq c\delta \leq r \quad \text{for} \quad x \in K := \{y \in C[t_0 - \delta, t_0 + \delta] : \|y - x_0\|_{C[t_0 - \delta, t_0 + \delta]} \leq r\}
\]
provided $\delta$ is sufficiently small. This proves that $F(K) \subset K$. Since $f$ is also uniformly continuous on $M$, the operator $F$ is continuous on $K$ (the convergence on $K$ is the uniform convergence). Further, for $t, s \in [t_0 - \delta, t_0 + \delta]$, $t < s$, $x \in K$, we have
\[
\|F(x)(t) - F(x)(s)\|_{\mathbb{R}^N} \leq \int_t^s \|f(\sigma, x(\sigma))\|_{\mathbb{R}^N} \, d\sigma \leq c|s - t|.
\]
This means that $F(K)$ is equicontinuous. By Theorem 1.2.13, $F(K)$ is relatively compact in $C[t_0 - \delta, t_0 + \delta]$. It follows from Theorem 5.1.11 that the equation (5.1.8) has a solution.

Our second example concerns a boundary value problem for an ordinary differential equation.

Example 5.1.14. Let $f$ be a continuous function on $[0, 1] \times \mathbb{R}$. We wish to solve the equation
\[
\ddot{x}(t) = f(t, x(t)) \tag{5.1.9}
\]
with the Dirichlet boundary conditions
\[
x(0) = x(1) = 0. \tag{5.1.10}
\]

We have dealt with this problem already in Example 2.3.8. It has been proved there that $y$ is a solution of this problem if and only if it is continuous and satisfies the integral equation ($f$ is assumed to be continuous)
\[
Fy(t) := \int_0^1 G(t, s)f(s, y(s)) \, ds = y(t) \tag{5.1.11}
\]
where the Green function $G$ is given by
\[
G(t, s) = \begin{cases} s(t - 1), & 0 \leq s \leq t \leq 1, \\ t(s - 1), & 0 \leq t \leq s \leq 1. \end{cases}
\]
The operator $F$ maps $C[0, 1]$ into itself (actually into $C^2[0, 1]$) and is compact.


This can be proved by two types of argument: (i) For any R > 0 there is c(R) such that |f (s, y)| ≤ c(R)

for s ∈ [0, 1], |y| ≤ R.

Since

d2 F (y)(t) = f (t, y(t)), dt2 F maps the ball B(o; R) in C[0, 1] into the set of functions which have uniformly bounded second derivatives. Thus F (B(o; R)) is relatively compact in C[0, 1] (see Theorem 1.2.13). (ii) The operator F is a composition of a linear integral operator and a Nemytski operator (see Example 3.2.21). The Nemytski operator Φ : y → f (·, y(·)) is continuous from C[0, 1] into itself, and the integral operator 1 K : x → G(·, s)x(s) ds 0

is compact from $C[0, 1]$ into itself (Example 2.2.5). Therefore $F = K \circ \Phi$ is also compact.

It remains to prove that $F$ maps a ball $B(o; R) \subset C[0, 1]$ into itself. For this purpose some growth assumptions on $f$ are needed. If
\[
|f(s, y)| \le a + b|y| \quad \text{for } s \in [0, 1],\ y \in \mathbb{R},
\]
then
\[
\|F(y)\|_{C[0,1]} \le \left[ a + b\|y\|_{C[0,1]} \right] \sup_{t \in [0,1]} \int_0^1 |G(t, s)| \, ds \le \frac{a + b\|y\|_{C[0,1]}}{8}.
\]
Whenever $b < 8$ and $R \ge \frac{a}{8 - b}$, then $F$ maps $B(o; R)$ into itself and (5.1.11) has a solution.
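The constant 8 in the estimate of Example 5.1.14 comes from $\int_0^1 |G(t, s)|\,ds = t(1 - t)/2 \le 1/8$. The following Python sketch is only a numerical sanity check of that bound; the helper names `green`, `integral_abs_green` and `invariant_radius` are hypothetical and do not appear in the text.

```python
def green(t: float, s: float) -> float:
    """Green function of Example 5.1.14 for x'' = h, x(0) = x(1) = 0."""
    return s * (t - 1.0) if s <= t else t * (s - 1.0)


def integral_abs_green(t: float, cells: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of |G(t, s)| over s in [0, 1]."""
    h = 1.0 / cells
    return h * sum(abs(green(t, (j + 0.5) * h)) for j in range(cells))


def invariant_radius(a: float, b: float) -> float:
    """Smallest R with (a + b*R)/8 <= R; the estimate only works for b < 8."""
    if b >= 8:
        raise ValueError("the estimate needs b < 8")
    return a / (8.0 - b)


# Sample t on a grid whose points are also cell boundaries of the midpoint rule,
# so that the kink of |G(t, .)| at s = t never falls inside a cell.
t_grid = [i / 200 for i in range(201)]
sup_value = max(integral_abs_green(t) for t in t_grid)
print(sup_value)               # numerically close to 1/8
print(invariant_radius(4, 4))  # radius of an invariant ball for a = b = 4
```

The supremum is attained at $t = 1/2$, which is why the growth condition in the example is stated as $b < 8$.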

Exercise 5.1.15. If $f$ has a sublinear growth in $y$, i.e., there is $\alpha \in [0, 1)$ such that $|f(s, y)| \le a + b|y|^{\alpha}$, then no restriction on $b$ is needed. Prove this fact!

Exercise 5.1.16. Prove the Gronwall inequality: Let $f$ be a nonnegative continuous function on an interval $[a, b]$ and let $A$, $B$ be nonnegative reals. Assume that
\[
f(t) \le A + B \int_a^t f(s) \, ds, \quad t \in [a, b]. \tag{5.1.12}
\]
Then
\[
f(t) \le A e^{B(t - a)}, \quad t \in [a, b].
\]


Hint. Denote the right-hand side of (5.1.12) by $g$ and notice that $\dot{g}(t) = Bf(t) \le Bg(t)$.

Remark 5.1.17. More general integral and differential inequalities can be investigated in a similar way. Let us mention $\dot{x}(t) \le f(t, x(t))$ as an example.

Exercise 5.1.18. Let $\tau$ be as in the proof of Lemma 5.1.1. Prove that
\[
x \mapsto \tau(x), \quad x \in \operatorname{int} B,
\]
has a bounded Fréchet derivative.

Hint. Use the Implicit Function Theorem for $\Phi(\tau, x) := \|\tau \tilde{P}(x) + (1 - \tau)x\|^2 - 1$.

Exercise 5.1.19. Regard the operator $F$ given by (5.1.11) as an operator on a space of integrable functions. Repeat the argument from Example 5.1.14.

Exercise 5.1.20. Let $f$ in (5.1.9) depend also on $\dot{x}(t)$, i.e., $f = f(t, x(t), \dot{x}(t))$. Formulate assumptions on $f(x, y, z)$ to get the existence of a solution of the boundary value problem (5.1.9), (5.1.10). See also Example 5.2.16.

Exercise 5.1.21. Let $K$ be a bounded continuous real function on $[a, b] \times [a, b] \times \mathbb{R}$ and let $h \in C[a, b]$. Prove that the integral equation
\[
x(t) = \int_a^b K(t, \tau, x(\tau)) \, d\tau + h(t)
\]

has at least one solution x ∈ C[a, b].

5.1A Fixed Point Theorems for Noncompact Operators

There are many generalizations of the Schauder Fixed Point Theorem. We mention here one which shows that the assumption of compactness of the operator can be relaxed. However, having in mind Example 5.1.7 this must be done carefully and more than continuity of the operator must be required. For this purpose we need a tool which will measure "how much noncompact" the operator actually is.

Definition 5.1.22. Let $M$ be a bounded set in a metric space $(X, \varrho)$. The Kuratowski measure of noncompactness $\chi(M)$ is defined to be the infimum of the set of all numbers $d > 0$ with the property that

(KM) $M$ can be covered by finitely many sets, each of whose diameters⁶ is less than or equal to $d$.

If $X$ is complete, then it follows from Proposition 1.2.3 that $M$ is relatively compact if and only if (KM) holds for every $d > 0$. Therefore $\chi(M) = 0$ is equivalent to relative compactness of $M$. If the value of $\chi(M)$ increases, $M$ deviates more strongly (in the sense of condition (KM)) from relatively compact sets.

⁶ The diameter of $M$ is defined as $\operatorname{diam} M := \sup \varrho(x, y)$ where the supremum is taken over all $x, y \in M$.


Proposition 5.1.23 (Properties of the Kuratowski measure of noncompactness). Let $X$ be a (real or complex) Banach space. Then for all bounded subsets $M, M_1, \dots, M_n, N$ of $X$ the following assertions hold:

(i) $\chi(\emptyset) = 0$;
(ii) $\chi(M) = 0 \iff M$ is relatively compact;
(iii) $0 \le \chi(M) \le \operatorname{diam} M$;
(iv) $M \subset N \implies \chi(M) \le \chi(N)$;
(v) $\chi(M + N) \le \chi(M) + \chi(N)$;⁷
(vi) $\chi(\beta M) = |\beta| \chi(M)$ for all $\beta \in \mathbb{R}$ (or $\mathbb{C}$);
(vii) $\chi(M) = \chi(\overline{M})$;
(viii) $\chi\left( \bigcup_{i=1}^{n} M_i \right) = \max\{\chi(M_1), \dots, \chi(M_n)\}$;
(ix) $\chi(M) = \chi(\operatorname{Co} M)$.

Proof. The properties (i)–(vii) follow directly from Definition 5.1.22, and so the proof is left to the reader. Let us prove (viii). Set
\[
M = \bigcup_{i=1}^{n} M_i \quad \text{and} \quad a = \max\{\chi(M_1), \dots, \chi(M_n)\}.
\]

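Then it follows from $M_i \subset M$ and from (iv) that $\chi(M_i) \le \chi(M)$, so $a \le \chi(M)$. To prove the equality, choose $\varepsilon > 0$ and a covering $\{M_1^i, M_2^i, \dots, M_{m_i}^i\}$ of $M_i$ with $\operatorname{diam} M_j^i \le \chi(M_i) + \varepsilon \le a + \varepsilon$. All of these $M_j^i$ form a covering of $M$, so that
\[
\chi(M) \le a + \varepsilon, \quad \text{i.e.,} \quad \chi(M) \le a.
\]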

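Hence, $\chi(M) = a$ and (viii) is proved.

Finally, we prove (ix). It follows from $M \subset \operatorname{Co} M$ and (iv) that $\chi(M) \le \chi(\operatorname{Co} M)$. Conversely, we show that $\chi(\operatorname{Co} M) \le \chi(M)$. This will be done in three steps.

Step 1. We prove inequality (5.1.13) below. For every $\varepsilon > 0$ there exists a covering $M \subset \bigcup_{i=1}^{N} M_i$ with $\operatorname{diam} M_i \le \chi(M) + \varepsilon$. Since $\operatorname{diam}(\operatorname{Co} M_i) = \operatorname{diam} M_i$,⁸ we may assume that $M_i$ are all convex. Let
\[
\Lambda := \left\{ \lambda = (\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N : \sum_{i=1}^{N} \lambda_i = 1,\ \lambda_i \ge 0 \text{ for all } i \right\}
\]
and
\[
A(\lambda) := \sum_{i=1}^{N} \lambda_i M_i \quad \text{for all } \lambda = (\lambda_1, \dots, \lambda_N) \in \Lambda.
\]

⁷ $M + N := \{z = x + y : x \in M,\ y \in N\}$.
⁸ The reader is invited to prove this equality.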


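Now it follows from (iv), (v) and (vi) that
\[
\chi(A(\lambda)) \le \sum_{i=1}^{N} \lambda_i \chi(M_i) \le \chi(M) + \varepsilon. \tag{5.1.13}
\]

Step 2. We show that the union $\bigcup_{\lambda \in \Lambda} A(\lambda)$ is a convex set. Indeed, let
\[
x = \sum_{i=1}^{N} \lambda_i x_i, \quad y = \sum_{i=1}^{N} \mu_i y_i, \quad t \in [0, 1] \quad \text{and} \quad z = tx + (1 - t)y
\]
where $\lambda, \mu \in \Lambda$ and $x_i, y_i \in M_i$ for all $i$. The point $z$ can be represented in the form
\[
z = \sum_{i=1}^{N} \xi_i z_i \quad \text{where} \quad \xi_i = t\lambda_i + (1 - t)\mu_i, \quad z_i = \varrho_i x_i + (1 - \varrho_i) y_i, \quad \varrho_i = \begin{cases} \dfrac{t\lambda_i}{\xi_i}, & \text{for } \xi_i > 0, \\ 0, & \text{for } \xi_i = 0. \end{cases}
\]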

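By definition of $\xi$ we have $0 \le \varrho_i \le 1$. The set $M_i$ is convex, so $z_i \in M_i$. Moreover, $\xi \in \Lambda$, by the convexity of $\Lambda$. Hence $z \in A(\xi)$.

Step 3. We prove that $\chi(\operatorname{Co} M) \le \chi(M) + 3\varepsilon$. Since the set $\Lambda$ is compact, for a given $\varepsilon > 0$ we can find finitely many points $\lambda^{(1)}, \dots, \lambda^{(m)} \in \Lambda$ such that for any $x = \sum_{i=1}^{N} \lambda_i x_i \in A(\lambda)$ there exists $\lambda^{(j)} = \left( \lambda_1^{(j)}, \dots, \lambda_N^{(j)} \right)$ for which
\[
\left\| x - \sum_{i=1}^{N} \lambda_i^{(j)} x_i \right\| \le \max_{i=1,\dots,N} \left| \lambda_i - \lambda_i^{(j)} \right| \max_{i=1,\dots,N} \|x_i\| \le \frac{\varepsilon}{k}\, k = \varepsilon
\]
where $k > 0$ is a common bound for all sets $M_i$. Therefore,
\[
\bigcup_{\lambda \in \Lambda} A(\lambda) \subset \bigcup_{j=1}^{m} A\left( \lambda^{(j)} \right) + B(o; \varepsilon).
\]
So, by Step 2, we have $\operatorname{Co} M \subset \bigcup_{\lambda \in \Lambda} A(\lambda)$ and by the other statements and (5.1.13),
\[
\chi(\operatorname{Co} M) \le \chi\left( \bigcup_{\lambda \in \Lambda} A(\lambda) \right) \le \chi\left( \bigcup_{j=1}^{m} A\left( \lambda^{(j)} \right) + B(o; \varepsilon) \right) \le \chi\left( \bigcup_{j=1}^{m} A\left( \lambda^{(j)} \right) \right) + 2\varepsilon \le \chi(M) + 3\varepsilon,
\]
i.e., since $\varepsilon > 0$ is arbitrary, $\chi(\operatorname{Co} M) \le \chi(M)$.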

Example 5.1.24. Let $B(o; 1) \subset X$ be the open unit ball in a Banach space $X$. If $\dim X < \infty$, then $\chi(B(o; 1)) = \chi(\overline{B(o; 1)}) = \chi(\partial B(o; 1)) = 0$ (see Proposition 1.2.3). On the other hand, if $\dim X = \infty$, then
\[
\chi(B(o; 1)) = \chi(\overline{B(o; 1)}) = \chi(\partial B(o; 1)) = 2. \tag{5.1.14}
\]
The proof of this fact is not trivial. Since the diameter of $B(o; 1)$ is equal to 2, we know that $\chi(B(o; 1)) \le 2$. In order to prove (5.1.14) we show that $\chi(\partial B(o; 1)) \ge 2$. Assume


the contrary. Then there exist sets $M_i$ with
\[
\partial B(o; 1) = \bigcup_{i=1}^{n} M_i
\]
and the diameter of every $M_i$ is strictly less than 2. We may take all $M_i$'s to be closed. Let $X_n \subset X$ be a subspace of $X$ such that $\dim X_n = n$. Then we have
\[
\partial B(o; 1) \cap X_n = \bigcup_{i=1}^{n} (M_i \cap X_n).
\]
The sets $M_i \cap X_n$, $i = 1, 2, \dots, n$, cover the closed unit sphere $\partial B(o; 1) \cap X_n$ in $X_n$. By the result of Lusternik and Schnirelmann (see Exercise 4.3.138) there exists $M_j$ such that $M_j \cap X_n$ contains an antipodal pair $\{x, -x\}$. Consequently,
\[
2 \le \operatorname{diam}(M_j \cap X_n) \le \operatorname{diam} M_j,
\]
which is a contradiction. Finally, by (iv) and (vii) of Proposition 5.1.23 we have (5.1.14).

In the next definition we will consider a special class of continuous and bounded operators.

Definition 5.1.25. Let $T : \mathcal{M} \subset X \to X$ be a bounded operator⁹ from a Banach space $X$ into itself. The operator $T$ is called a $k$-set contraction if there is a number $k \ge 0$ such that
\[
\chi(T(M)) \le k\chi(M)
\]
for all bounded sets $M$ in $\mathcal{M}$. The bounded operator $T$ is called condensing if
\[
\chi(T(M)) < \chi(M)
\]
for all bounded sets $M$ in $\mathcal{M}$ with $\chi(M) > 0$.

Obviously, every $k$-set contraction for $0 \le k < 1$ is condensing. Every compact map $T$ is a $k$-set contraction with $k = 0$. A typical example of a $k$-set contraction with $0 \le k < 1$ is the following one.

Example 5.1.26. Let $K, C : D \subset X \to X$ be operators on a Banach space $X$. Let $K$ be $k$-contractive, i.e., there exists $k \in [0, 1)$ such that
\[
\|K(x) - K(y)\| \le k\|x - y\| \quad \text{for all } x, y \in D, \tag{5.1.15}
\]
and let $C$ be compact. Then $K + C$ is a $k$-set contraction. Indeed, let $M \subset D$ be a bounded set. By Definition 5.1.22 it follows from (5.1.15) that $\chi(K(M)) \le k\chi(M)$. By (ii) of Proposition 5.1.23 we have $\chi(C(M)) = 0$. Set $T := K + C$. Now (iv) and (v) of Proposition 5.1.23 imply
\[
\chi(T(M)) \le \chi(K(M) + C(M)) \le \chi(K(M)) + \chi(C(M)) \le k\chi(M).
\]

The following assertion is a generalization of the Schauder Fixed Point Theorem (note that every compact operator is condensing).

⁹ The operator $T$ is said to be bounded on $\mathcal{M}$ if $T(\mathcal{M} \cap A)$ is a bounded set provided $A$ is a bounded set.


Theorem 5.1.27 (Darbo). Let us suppose that

(i) $M$ is a nonempty, closed, bounded and convex subset of a Banach space $X$;
(ii) an operator $T : M \subset X \to M$ is condensing and continuous on $M$.

Then $T$ has a fixed point in $M$.

Proof. The idea of the proof is to find a suitable subset $A$ of $M$ which is mapped into itself by $T$ in such a way that the Schauder Fixed Point Theorem can be applied to the restriction $T : A \to A$. The resulting fixed point is then trivially a fixed point of the original mapping $T : M \to M$.

The set $A$ is constructed in the following way. Choose a point $m \in M$ and let $\Sigma$ denote the system of all closed, convex subsets $K$ of $M$ for which $m \in K$ and $T(K) \subset K$. Set
\[
A := \bigcap_{K \in \Sigma} K \quad \text{and} \quad C := \overline{\operatorname{Co}}\,\{T(A) \cup \{m\}\}.^{10}
\]
Since $m \in A$ and $T(A) \subset A$, it follows that $C \subset A$. This implies $T(C) \subset T(A)$. Obviously $T(A) \subset C$, i.e., $T(C) \subset C$ which means that $C \in \Sigma$. So, $A \subset C$. We have proved that $A = C$. Now, (vii), (viii) and (ix) of Proposition 5.1.23 imply that
\[
\chi(A) = \chi(C) = \chi(T(A)). \tag{5.1.16}
\]
Since $T$ is condensing, $\chi(A) = 0$ (otherwise $\chi(T(A)) < \chi(A)$, which contradicts (5.1.16)). Since $A$ is also closed, $A$ is a compact set. The restriction of $T$ to $A$ is thus a compact operator. Consequently, the Schauder Fixed Point Theorem can be applied to the mapping $T : A \to A$.

Corollary 5.1.28. Let $K, C : M \subset X \to X$ be operators in a Banach space $X$ such that $(K + C)(M) \subset M$, let $M$ be a nonempty, closed, bounded and convex set in $X$, let $K$ be $k$-contractive ($0 \le k < 1$) and $C$ compact. Then $K + C$ has a fixed point in $M$.

Proof. The proof follows immediately from Example 5.1.26 and Theorem 5.1.27.

¹⁰ Observe that $\Sigma \ne \emptyset$ because $M \in \Sigma$, and $A \ne \emptyset$ because $m \in A$.

The following assertion generalizes the existence part of Theorem 3.1.4 and follows from the previous Corollary 5.1.28, cf. the statement with the example on page 110. Let us consider the initial value problem
\[
\begin{cases} \dot{x} = f(t, x) + g(t, x), \\ x(t_0) = x_0 \end{cases} \tag{5.1.17}
\]
in a Banach space $Y$. For fixed positive numbers $a$ and $b$ define
\[
R := [t_0 - a, t_0 + a] \times \{x \in Y : \|x - x_0\| \le b\}.
\]

Proposition 5.1.29. Let us assume that

(i) the map $f : R \to Y$ is continuous and also Lipschitz continuous with respect to the second variable, i.e., there exists $L > 0$ such that
\[
\|f(t, x) - f(t, y)\| \le L\|x - y\| \quad \text{for all } (t, x), (t, y) \in R;
\]
(ii) the map $g : R \to Y$ is compact;


(iii) the sum $f + g$ is bounded, i.e., there exists $B > 0$ such that
\[
\|f(t, y) + g(t, y)\| \le B \quad \text{for all } (t, y) \in R;
\]
(iv) the number $c > 0$ is chosen such that
\[
c \le a, \quad cL < 1, \quad Bc \le b.
\]

Then the problem (5.1.17) has a solution $x = x(t)$ defined on $(t_0 - c, t_0 + c)$.

Proof. It follows from Lemma 3.1.5 that the problem (5.1.17) is equivalent to the integral equation
\[
x(t) = x_0 + \int_{t_0}^{t} [f(s, x(s)) + g(s, x(s))] \, ds. \tag{5.1.18}
\]
Let
\[
X = C([t_0 - c, t_0 + c], Y) \quad \text{and} \quad M = \{x \in X : \|x - x_0\|_X \le b\}.^{11}
\]
Then (5.1.18) can be regarded as the operator equation
\[
x = K(x) + C(x), \quad x \in M, \tag{5.1.19}
\]
where
\[
K(x)(t) = x_0 + \int_{t_0}^{t} f(s, x(s)) \, ds, \qquad C(x)(t) = \int_{t_0}^{t} g(s, x(s)) \, ds.
\]
Similarly to the proof of Theorem 3.1.4 we obtain that $(K + C)(M) \subset M$. Furthermore, the operator $K$ is $k$-contractive with $k = Lc$ and the operator $C$ is compact. So, Corollary 5.1.28 yields the existence of a solution of (5.1.19), hence of (5.1.18), and thus, ultimately, of (5.1.17).

A similar approach can be also used for functional differential equations. In the following example we describe a simple situation. For a more general treatment of evolution equations see, e.g., Milota & Petzeltová [96].

Example 5.1.30. Consider a system of ordinary functional differential equations
\[
\dot{x}(t) = A(t)x(t) + \int_0^t f(t, s, x(s)) \, ds, \quad x(0) = x_0, \tag{5.1.20}
\]

where $A$ is an $N \times N$-matrix with continuous entries on the interval $[0, T]$, $f : M := [0, T] \times [0, T] \times \mathbb{R}^N \to \mathbb{R}^N$ is continuous and locally $\alpha$-Hölder continuous with respect to the first variable and locally satisfies the Lipschitz condition with respect to the third variable, i.e., for any $(t_0, s_0, x_0) \in M$ there are a neighborhood $U$ of this point and constants $c > 0$, $L > 0$, $\alpha \in (0, 1)$ such that
\[
|f(t_1, s, x_1) - f(t_2, s, x_2)| \le c|t_1 - t_2|^{\alpha} + L|x_1 - x_2| \quad \text{for } (t_i, s, x_i) \in U,\ i = 1, 2.
\]
Instead of (5.1.20) we consider the equivalent integral equation
\[
x(t) = H(x)(t) := \Phi(t, 0)x_0 + \int_0^t \Phi(t, s) \int_0^s f(s, \sigma, x(\sigma)) \, d\sigma \, ds
\]
where $\Phi(t)$ is a fundamental matrix of the equation $\dot{x}(t) = A(t)x(t)$. We put
\[
F(s, x) := \int_0^s f(s, \sigma, x(\sigma)) \, d\sigma, \quad s \in [0, T], \quad x \in C([0, T], \mathbb{R}^N)
\]
and
\[
G_1(x)(t) = \int_0^t \Phi(t, s)[F(s, x) - F(t, x)] \, ds, \qquad G_2(x)(t) = \int_0^t \Phi(t, s) \, ds \, F(t, x).
\]

¹¹ Here $x_0 \in X$ is understood to be a constant function defined on $[t_0 - c, t_0 + c]$ with value in $Y$.

It is not difficult to show that there are $r > 0$, $\tau > 0$ small enough such that $H$ maps the set
\[
Q(r, \tau) := \left\{ y \in C([0, \tau], \mathbb{R}^N) : \sup_{t \in [0, \tau]} |y(t) - x_0| \le r \right\}
\]
into itself, $G_1$ is a compact mapping (using the Arzelà–Ascoli Theorem) on $Q(r, \tau)$ and $G_2$ is a contraction on $Q(r, \tau)$. The local existence of a solution of (5.1.20) follows now from Corollary 5.1.28. This local solution can be continuously extended. To keep the time step $\tau$ fixed it is sufficient to assume that $f$ satisfies the global Lipschitz condition with respect to the $x$-variable on the whole domain $M$.

Exercise 5.1.31. Let $H$ and $Q(r, \tau)$ be as in Example 5.1.30. Prove that $H$ maps the set $Q(r, \tau)$ into itself.

Exercise 5.1.32. Let $G_1$, $G_2$ and $Q(r, \tau)$ be as in Example 5.1.30. Prove that $G_1$ is a compact mapping on $Q(r, \tau)$ and $G_2$ is a contraction on $Q(r, \tau)$.

Exercise 5.1.33. Prove Proposition 5.1.23(i)–(vii).

Exercise 5.1.34. Consider the boundary value problem
\[
\begin{cases} \ddot{x}(t) = f(t, x(t), \dot{x}(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.1.21}
\]
where $f : \mathbb{R}^3 \to \mathbb{R}$ is a real function. Find conditions on $f$ and apply Corollary 5.1.28 to prove the existence of a solution to (5.1.21).

Hint. Look for conditions which guarantee that the Nemytski operator given by $f$ is a sum of a contraction and a compact operator.

5.2 Topological Degree

In this section we stress the basic properties of the Brouwer degree of a continuous map in finite dimensional spaces and of the Leray–Schauder degree of a compact perturbation of the identity in general Banach spaces. We start with some elementary considerations in one dimension. The reader can find another motivation from the theory of functions of a complex variable in Appendix 4.3D.

In the previous section we have dealt with a solution of the operator equation
\[
F(x) = 0.
\]


Now we are asking what happens with its solution if $F : \mathbb{R} \to \mathbb{R}$ is slightly perturbed. Figures 5.2.1–5.2.3 show that the situation can change considerably, namely, either a solution may disappear (if a perturbation takes place in the solid arrow direction) or the number of solutions may vary (in the dashed arrow direction).

[Figures 5.2.1, 5.2.2 and 5.2.3: graphs of $F$ and of a perturbed (dashed) curve $G$, with the marked points $x_0$, $x_1$, $x_2$.]

A closer examination indicates that this can happen since a solution $x_0$ is either on the boundary (Figure 5.2.1) or the derivative $F'$ vanishes at $x_0$, i.e., $x_0$ is a critical point of $F$ (Figures 5.2.2 and 5.2.3). We expect that a small perturbation of $F$ does not cause any alteration provided the just described cases do not occur.

There is another point which should be mentioned, namely the distinction between perturbations of $F$ in the direction of one or the other arrow in Figures 5.2.2 and 5.2.3. The number of solutions changes by two, being even in Figure 5.2.2 (0 is even by definition), and being odd in Figure 5.2.3. Is there any way to describe this phenomenon? Look at the dashed curve $G$ in Figure 5.2.2 and assume that $G \in C^1$. We have
\[
G'(x_1) < 0, \quad G'(x_2) > 0.
\]
These signs remain the same in some neighborhoods $U_1$, $U_2$ of $x_1$ and $x_2$. In particular, $G$ is injective (actually a diffeomorphism) on these neighborhoods and can be regarded as a local transformation of the $x$-coordinate. This transformation changes the orientation at $x_1$ and does not do that at $x_2$. The sum of signs of $G'$ at the solutions of $G(x) = 0$ is zero (more generally even) in Figure 5.2.2 and odd for the dashed curve in Figure 5.2.3.

This observation can be generalized to higher dimensions: If $A : \mathbb{R}^N \to \mathbb{R}^N$ is a linear transformation of coordinates (i.e., $A$ is injective and surjective), then we say that $A$ does not change the orientation in $\mathbb{R}^N$ provided $\det \mathbf{A} > 0$ where $\mathbf{A}$ is the matrix representation of $A$ (this does not depend on the choice of basis in which the representation is taken). This concept can be also used locally for a nonlinear $C^1$-transformation $G : \mathbb{R}^N \to \mathbb{R}^N$ by substituting $G'(a)$ for $A$. Then the sign of the matrix representation of $G'(a)$ is the sign of its Jacobian $J_G(a)$. This idea leads to the following preparatory definition.

Definition 5.2.1. Let $\Omega$ be an open and bounded subset of $\mathbb{R}^N$ and let $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Assume that $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ and $y_0$ is a regular value of $F$.¹² Then we define the Brouwer degree of $F$ as
\[
\deg(F, \Omega, y_0) = \sum_{x \in F^{-1}(y_0) \cap \Omega} \operatorname{sgn} J_F(x).^{13} \tag{5.2.1}
\]

¹² The definition of a regular value is given in Definition 4.3.6.
¹³ Here $\sum_{\emptyset} = 0$ as usual.

We point out that the sum in (5.2.1) is finite. Indeed, otherwise the set
\[
F^{-1}(y_0) \cap \Omega := \{x \in \Omega : F(x) = y_0\}
\]
has an accumulation point $\tilde{x} \in \overline{\Omega}$. By the continuity of $F$, $F(\tilde{x}) = y_0$, and since $y_0 \notin F(\partial\Omega)$, $\tilde{x} \in \Omega$. By the Local Inverse Function Theorem (Theorem 4.1.1), $F$ is injective in a neighborhood $U$ of $\tilde{x}$. But $U$ contains points of $F^{-1}(y_0)$ different from $\tilde{x}$, a contradiction. Notice that in this argument we have used all assumptions of Definition 5.2.1.

Proposition 5.2.2. Let $\Omega$ be an open bounded subset of $\mathbb{R}^N$. The degree defined in Definition 5.2.1 has the following properties ($I$ is the identity map):

(i) $\deg(I, \Omega, y_0) = \begin{cases} 1 & \text{if } y_0 \in \Omega, \\ 0 & \text{if } y_0 \notin \overline{\Omega}. \end{cases}$

Suppose that $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ and $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ is a regular value of $F$. Then

(ii) $\deg(F, \Omega, y_0) \in \mathbb{Z}$;
(iii) $\deg(F, \Omega, y_0) = \deg(F - y_0, \Omega, o)$;
(iv) if $\deg(F, \Omega, y_0) \ne 0$, then the equation $F(x) = y_0$ has a solution in $\Omega$;
(v) if $\Omega_1$ is an open subset of $\Omega$ and $y_0 \notin F(\overline{\Omega} \setminus \Omega_1)$, then $\deg(F, \Omega, y_0) = \deg(F, \Omega_1, y_0)$. More generally, if $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $y_0 \notin F\left( \overline{\Omega} \setminus \bigcup_{j=1}^{k} \Omega_j \right)$, then
\[
\deg(F, \Omega, y_0) = \sum_{j=1}^{k} \deg(F, \Omega_j, y_0). \tag{5.2.2}
\]


(vi) For all $y \in \mathbb{R}^N$ which are sufficiently close to $y_0$,
\[
\deg(F, \Omega, y) = \deg(F, \Omega, y_0) \tag{5.2.3}
\]
holds.
(vii) For all $G \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ which are sufficiently close to $F$ in the $C^1$-topology,¹⁴
\[
\deg(G, \Omega, y_0) = \deg(F, \Omega, y_0) \tag{5.2.4}
\]
is valid.

Proof. The properties (i)–(v) follow immediately from Definition 5.2.1. To prove (vi) let $F^{-1}(y_0) = \{x_1, \dots, x_k\}$ and let $F$ be a diffeomorphism of an open neighborhood $U_j$ of $x_j$ onto a neighborhood $V_j$ of $y_0$ (the Local Inverse Function Theorem, Theorem 4.1.1). Denote
\[
d := \inf \left\{ \|F(x) - y_0\| : x \in \overline{\Omega} \setminus \bigcup_{j=1}^{k} U_j \right\} > 0.
\]
If $\|y - y_0\| < d$, then there is no solution of $F(x) = y$ in $\overline{\Omega} \setminus \bigcup_{j=1}^{k} U_j$, and if $y \in \bigcap_{j=1}^{k} V_j$ (the neighborhood of $y_0$), then there is exactly one $\tilde{x}_j \in U_j$ such that $F(\tilde{x}_j) = y$. Moreover, $\operatorname{sgn} J_F(\tilde{x}_j) = \operatorname{sgn} J_F(x_j)$. This completes the proof of (5.2.3).¹⁵

To prove (vii) we use the same notation as above. Let $G$ differ a little from $F$ in the $C^1$-topology, say $\|F - G\|_{C^1(\overline{\Omega}, \mathbb{R}^N)} < \varepsilon$. The quantity $\varepsilon$ will be specified later. Put
\[
H(t, x) = (1 - t)F(x) + tG(x), \quad x \in \overline{\Omega},\ t \in (-\delta, 1 + \delta) \text{ for } \delta > 0. \tag{5.2.5}
\]

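Choose a fixed neighborhood $U_j$ as above. Notice that we can take $U_j$ so small that
\[
c_1 := \sup_{x \in U_j} \|F'(x)\|_{L(\mathbb{R}^N)} < \infty \quad \text{and} \quad c_2 := \sup_{x \in U_j} \|[F'(x)]^{-1}\|_{L(\mathbb{R}^N)} < \infty,
\]
and the determinant $\det F'(x)$ has a constant sign in $U_j$. We have
\[
\|H(t, x) - y_0\| \ge \|F(x) - y_0\| - |t|\,\|F(x) - G(x)\| \ge d - |t|\varepsilon > 0
\]
for every $x \in \overline{\Omega} \setminus \bigcup_{j=1}^{k} U_j$ and $t \in (-\delta, 1 + \delta)$, and for sufficiently small $\varepsilon > 0$. In particular, $\deg(H(t, \cdot), U_j, y_0)$ is well defined¹⁶ and, by (v),
\[
\deg(H(t, \cdot), \Omega, y_0) = \sum_{j=1}^{k} \deg(H(t, \cdot), U_j, y_0).
\]

¹⁴ I.e., there exists $\varepsilon > 0$ such that
\[
\|F - G\|_{C^1(\overline{\Omega}, \mathbb{R}^N)} := \sup_{x \in \overline{\Omega}} \|F(x) - G(x)\|_{\mathbb{R}^N} + \sup_{x \in \Omega} \|F'(x) - G'(x)\|_{L(\mathbb{R}^N)} < \varepsilon.
\]
¹⁵ Notice that this proof is correct also for $F^{-1}(y_0) = \emptyset$.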

We wish to prove that this degree is constant on the interval $[0, 1]$. We will study the set
\[
M := \{(t, x) \in [0, 1] \times U_j : H(t, x) = y_0\}
\]
with help of the Implicit Function Theorem at the point $(0, x_j)$. This is possible since
\[
\|H_2'(t, x) - F'(x)\|_{L(\mathbb{R}^N)} = |t|\,\|F'(x) - G'(x)\|_{L(\mathbb{R}^N)} \le |t|\varepsilon < \frac{1}{c_2}, \quad (t, x) \in (-\delta, 1 + \delta) \times U_j,
\]
for small $\varepsilon > 0$. This estimate implies that $[H_2'(t, x)]^{-1}$ exists (Exercise 2.1.33). The Implicit Function Theorem implies that $M$ has the form
\[
\{(t, \varphi(t)) : t \in [0, \beta)\}
\]
in a certain neighborhood of $(0, x_j)$ ($F(x_j) = y_0$) where $\varphi \in C^1([0, \beta), \mathbb{R}^N)$ and
\[
\|\dot{\varphi}(t)\| = \left\| [H_2'(t, \varphi(t))]^{-1} H_1'(t, \varphi(t)) \right\| \le \frac{c_2 \varepsilon}{1 - c_2 \varepsilon},
\]
see again Exercise 2.1.33. In particular, $\varphi$ is uniformly continuous and, if necessary, it can be continued at least until $t = 1$.¹⁷ Therefore $M$ is a graph of $\varphi$ on the interval $[0, 1]$, i.e.,
\[
\{x \in U_j : G(x) = H(1, x) = y_0\} = \{\varphi(1)\},
\]
and, consequently, $\deg(G, U_j, y_0) = \deg(F, U_j, y_0)$.
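Formula (5.2.1) and the properties (v) and (vi) of Proposition 5.2.2 can be tried out in one dimension, where the Jacobian is simply $F'(x)$. The sketch below is a hedged illustration only: the function names `solve_on_grid` and `degree_1d` and the sample map `cubic` are assumptions of this example, and the root search assumes that every grid cell contains at most one simple solution.

```python
from typing import Callable, List


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def solve_on_grid(F: Callable[[float], float], a: float, b: float,
                  y0: float, cells: int = 6000) -> List[float]:
    """Solutions of F(x) = y0 in (a, b), found by sign changes and bisection."""
    roots = []
    h = (b - a) / cells
    for j in range(cells):
        lo, hi = a + j * h, a + (j + 1) * h
        if sign(F(lo) - y0) * sign(F(hi) - y0) < 0:
            for _ in range(60):  # bisection on the bracketing cell
                mid = 0.5 * (lo + hi)
                if sign(F(lo) - y0) * sign(F(mid) - y0) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


def degree_1d(F: Callable[[float], float], a: float, b: float,
              y0: float, eps: float = 1e-6) -> int:
    """Sum of sgn F'(x) over the solutions of F(x) = y0, as in (5.2.1)."""
    total = 0
    for x in solve_on_grid(F, a, b, y0):
        derivative = (F(x + eps) - F(x - eps)) / (2 * eps)  # central difference
        total += sign(derivative)
    return total


def cubic(x: float) -> float:
    return x**3 - 3 * x  # critical values are -2 and 2, so y0 = 1 is regular


print(degree_1d(cubic, -3.0, 3.0, 1.0))   # three solutions with signs +, -, +
print(degree_1d(cubic, -1.0, 1.0, 1.0))   # only the middle solution
print(degree_1d(cubic, -3.0, 0.0, 1.0) + degree_1d(cubic, 0.0, 3.0, 1.0))
```

On $(-3, 3)$ the three solutions contribute $+1 - 1 + 1 = 1$, and splitting the interval at $0$ gives the same total, in line with (5.2.2).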

¹⁶ A homotopy $H(t, x)$ for which the degree is well defined is called an admissible homotopy.
¹⁷ If $\beta \le 1$, then $\lim_{t \to \beta-} \varphi(t) = \tilde{x}$ exists and $\tilde{x} \in U_j$ (see Proposition 1.2.4).

One of our main goals is to show that the degree is homotopically invariant, i.e., if for $H$ given by (5.2.5) we have
\[
y_0 \ne H(t, x) \quad \text{for all } t \in [0, 1],\ x \in \partial\Omega,
\]
then $\deg(H(t, \cdot), \Omega, y_0)$ is constant on $[0, 1]$. In particular,
\[
\deg(H(0, \cdot), \Omega, y_0) = \deg(H(1, \cdot), \Omega, y_0) \tag{5.2.6}
\]


provided at least one side in (5.2.6) is defined. The problem in proving this property can be seen from Figures 5.2.2 and 5.2.3. Namely, if the dashed curve $G$ is moving up, then it is equal to $F$ in one instance, $o$ is not a regular value for $F$, and so $\deg(F, \Omega, o)$ is not yet defined. To overcome this obstacle we approximate a critical value by a regular one. Such approximation is based on the so-called Sard Theorem. Its special case stated below will be sufficient for our purposes.

Theorem 5.2.3 (Sard). Let $\Omega$ be an open subset of $\mathbb{R}^N$ and assume that $F \in C^1(\Omega, \mathbb{R}^N)$. Then the Lebesgue measure of the set of critical values of $F$ is zero.

Proof. Since $\mathbb{R}^N$ can be covered by a sequence of bounded open sets and a countable union of sets of measure zero has also measure zero, we can suppose that $\Omega$ is bounded. Choose now an open subset $G \subset \Omega$ such that $\overline{G} \subset \Omega$. Let $S$ be the set of critical points of $F$ in $G$. By the same argument as above, it is sufficient to show that $\operatorname{meas}_N F(S) = 0$ where $\operatorname{meas}_N$ is the Lebesgue measure in $\mathbb{R}^N$. Since $\overline{G}$ is compact, $d := \operatorname{dist}(\overline{G}, \mathbb{R}^N \setminus \Omega) > 0$ and $\overline{G}$ can be covered by a finite number of closed cubes $C_1, \dots, C_k$ with sides parallel to the coordinate hyperplanes and edges of length $a$. If $a < d/\sqrt{N}$, then $\bigcup_{i=1}^{k} C_i \subset \Omega$ and
\[
c := \sup_{x \in \bigcup_{i=1}^{k} C_i} \|F'(x)\|_{L(\mathbb{R}^N)} < \infty.
\]
Again it is sufficient to show that
\[
\operatorname{meas}_N F(C_i \cap S) = 0, \quad i = 1, \dots, k.
\]
Choose one of these cubes and denote it by $C$. By the Mean Value Theorem (Theorem 3.2.7),
\[
\|F(y) - F(x)\| \le c\|x - y\| \quad \text{and} \quad \|F(y) - L_x y\| \le \omega(\|x - y\|)\|x - y\|, \quad x, y \in C,
\]
where $\lim_{r \to 0+} \omega(r) = 0$ (uniform continuity of $F'$ on the compact set $C$) and
\[
L_x y := F(x) + F'(x)(y - x).
\]
Divide now the cube $C$ into $m^N$ small cubes with edges $\frac{a}{m}$, and consider a small cube $\tilde{C}$ which contains a critical point $x$. Denote the diameter of $\tilde{C}$ by $\tilde{r}$ ($\tilde{r} = \sqrt{N}\,\frac{a}{m}$). Since $L_x(\tilde{C})$ lies in an $(N - 1)$-dimensional hyperplane ($x$ is a critical point!),
\[
\operatorname{meas}_{N-1} L_x(\tilde{C}) \le (c\tilde{r})^{N-1} \quad \text{and} \quad \operatorname{dist}(F(y), L_x(\tilde{C})) \le \omega(\tilde{r})\tilde{r} \quad \text{for } y \in \tilde{C},
\]
we have
\[
\operatorname{meas}_N F(\tilde{C}) \le c^{N-1} \omega(\tilde{r}) \tilde{r}^N.
\]


The number of such small cubes which contain critical points is $m^N$ at most. Therefore,
\[
\operatorname{meas}_N F(S \cap C) \le c^{N-1} \omega(\tilde{r}) \tilde{r}^N m^N \le c_1 \omega(\tilde{r})
\]
with a constant $c_1$ independent of $m$. Since $\omega(\tilde{r}) \to 0$ for $m \to \infty$, $\operatorname{meas}_N F(S \cap C) = 0$.

Corollary 5.2.4. Under the hypotheses of Theorem 5.2.3 the set of regular values of $F : \Omega \to \mathbb{R}^N$ is dense in $\mathbb{R}^N$.

Proof. The complement of regular values, i.e., $F(\Omega \cap S)$, cannot have an interior point and zero Lebesgue measure simultaneously.

Remark 5.2.5. A more general Sard Theorem concerns $F : \mathbb{R}^M \to \mathbb{R}^N$. It is surprising that the assertion $\operatorname{meas}_N F(S \cap C) = 0$ needs more smoothness of $F$, namely
\[
F \in C^r(\Omega, \mathbb{R}^N) \quad \text{where} \quad r > \max\{0, M - N\}.
\]

The proof is more involved (see, e.g., Hirsch [67, Chapter 3, Theorem 1.3] for the $C^\infty$-case and comments given there, or Sard [117], or Sternberg [124, Theorem II.3.1]). If $r \le \max\{0, M - N\}$, then there exists $F \in C^r(\Omega, \mathbb{R}^N)$ such that $\operatorname{int} F(\Omega \cap S) \ne \emptyset$, see Whitney [132]. The statement on the Lebesgue measure can be strengthened by considering the finer Hausdorff measure or dimension. The following result also holds: If $F : \mathbb{R}^M \to \mathbb{R}$ is analytic, then $F(\Omega \cap S)$ is even countable. For more detail see, e.g., Fučík et al. [56, Chapter IV and Appendix IV]. There is also a generalization for mappings $F : X \to Y$, $X$, $Y$ Banach spaces: If $F'(x)$ is a Fredholm operator for all $x \in \Omega$ and $F$ is sufficiently smooth, then $F(\Omega \cap S)$ is nowhere dense in $Y$. Sharper results can be proved for functionals (i.e., $Y = \mathbb{R}$) (see the book Fučík et al. [56] cited above).

Now we return to the degree $\deg(F, \Omega, y_0)$ where $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ and it is a critical value of $F$. According to Corollary 5.2.4 there is a sequence $\{y_n\}_{n=1}^{\infty}$ of regular values of $F$ such that
\[
\lim_{n \to \infty} y_n = y_0, \quad y_n \notin F(\partial\Omega).
\]
In particular, $\deg(F, \Omega, y_n)$ is well defined by Definition 5.2.1. Part (vi) of Proposition 5.2.2 allows us to presume that the sequence $\{\deg(F, \Omega, y_n)\}_{n=1}^{\infty}$ is eventually constant and does not depend on the choice of the sequence $\{y_n\}_{n=1}^{\infty}$ of regular values. To see this we need to extend Proposition 5.2.2(vi) to guarantee that $\deg(F, \Omega, y)$ is constant on any open connected set $G \subset \mathbb{R}^N \setminus F(\partial\Omega \cup S)$. This can be done due to the fact that any two different points in an open connected subset of $\mathbb{R}^N$ can be connected by a smooth curve in this subset (see Proposition 1.2.7). We leave details to the reader. He or she should also be convinced that all statements of Proposition 5.2.2 are still valid for this more general definition of the degree.

For the definition of the degree $\deg(F, \Omega, y_0)$ it is not necessary to assume that $F \in C^1(\Omega, \mathbb{R}^N)$ since any $F \in C(\overline{\Omega}, \mathbb{R}^N)$ can be approximated by smooth mappings. This is a consequence of the Stone–Weierstrass Theorem (see Theorem 1.2.14).¹⁸ To show that $\deg(G, \Omega, y_0)$ is the same for all $G \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ which are close to $F$ in the $C(\overline{\Omega}, \mathbb{R}^N)$-norm we need the following extension of Proposition 5.2.2(vii).

Proposition 5.2.6. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $F$, $G$ be mappings from $C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Put
\[
H(t, x) = (1 - t)F(x) + tG(x), \quad t \in [0, 1],\ x \in \overline{\Omega}.
\]
Assume that $y_0 \in \mathbb{R}^N \setminus \{H(t, x) : t \in [0, 1],\ x \in \partial\Omega\}$. Then
\[
\deg(F, \Omega, y_0) = \deg(G, \Omega, y_0). \tag{5.2.7}
\]

Proof. As has been stated above, Proposition 5.2.2(vii) holds for an arbitrary $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$, in particular for $H(t, \cdot)$ and $t$ small. Put
\[
t_0 = \sup\{t \in [0, 1] : \deg(H(t, \cdot), \Omega, y_0) = \deg(F, \Omega, y_0)\}.
\]
By the same statement ($\deg(H(t, \cdot), \Omega, y_0)$ is defined for all $t \in [0, 1]$),
\[
\deg(H(t_0, \cdot), \Omega, y_0) = \deg(H(t, \cdot), \Omega, y_0) \quad \text{for } t \in (t_0 - \delta, t_0 + \delta) \cap [0, 1],
\]
i.e., $t_0 = 1$, and the equality (5.2.7) follows.
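The hypothesis of Proposition 5.2.6 is easy to test on an interval $\Omega = (a, b)$. The sketch below relies on an elementary one-dimensional fact that is not stated in the text and is used here as an assumption of the illustration: on an interval the degree is determined by the signs of $F - y_0$ at the two endpoints. The names `degree_interval` and `admissible` are hypothetical.

```python
from typing import Callable


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def degree_interval(F: Callable[[float], float], a: float, b: float, y0: float) -> int:
    """One-dimensional degree on (a, b), read off from the endpoint signs."""
    sa, sb = sign(F(a) - y0), sign(F(b) - y0)
    if sa == 0 or sb == 0:
        raise ValueError("y0 is attained on the boundary; the degree is undefined")
    return (sb - sa) // 2


def admissible(F: Callable[[float], float], G: Callable[[float], float],
               a: float, b: float, y0: float) -> bool:
    """Check that H(t, x) = (1 - t) F(x) + t G(x) avoids y0 on the boundary.

    At a fixed endpoint x the map t -> H(t, x) - y0 is affine, so it has no
    zero on [0, 1] exactly when its values at t = 0 and t = 1 share a nonzero sign.
    """
    return all(
        sign(F(x) - y0) != 0 and sign(F(x) - y0) == sign(G(x) - y0)
        for x in (a, b)
    )


cubic = lambda x: x**3 - 3 * x
identity = lambda x: x
reflection = lambda x: -x

# Admissible homotopy: the degrees of F and G must agree, as in (5.2.7).
print(admissible(cubic, identity, -3.0, 3.0, 0.0),
      degree_interval(cubic, -3.0, 3.0, 0.0),
      degree_interval(identity, -3.0, 3.0, 0.0))

# Not admissible: H(t, 3) vanishes for some t in (0, 1), and the degrees differ.
print(admissible(cubic, reflection, -3.0, 3.0, 0.0),
      degree_interval(reflection, -3.0, 3.0, 0.0))
```

The second pair shows why the boundary condition on $H$ cannot be dropped: once $y_0$ is attained on $\partial\Omega$ for some $t$, a solution can leave $\Omega$ and the degree may change.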

¹⁸ For the set $A$ in Theorem 1.2.14 take restrictions of polynomials of $N$ variables to $\overline{\Omega}$ and approximate separately every $F_i$, $i = 1, \dots, N$, where $F = (F_1, \dots, F_N)$. For another proof see Lemma 4.3.121.

The following result is a summarization of the previous exposition.


Theorem 5.2.7. Let $\Omega$ be a bounded open set in $\mathbb{R}^N$. There exists a mapping
\[
\deg : C(\overline{\Omega}, \mathbb{R}^N) \times \mathbb{R}^N \to \mathbb{Z}
\]
defined for all $F \in C(\overline{\Omega}, \mathbb{R}^N)$ and $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ which has the properties (i)–(vii) from Proposition 5.2.2 and from Proposition 5.2.6.¹⁹ If, moreover, $F \in C^1(\Omega, \mathbb{R}^N)$ and $y_0$ is a regular value of $F$, then the formula (5.2.1) holds.

Remark 5.2.8. The function "deg" from the previous theorem is unique. The reader can consult, e.g., Amann & Weiss [5] or Deimling [34] to get more information.

Example 5.2.9 (Brouwer). Let $F$ be a continuous mapping from the closed unit ball $B := \overline{B(o; 1)}$ into itself. Then $F$ has a fixed point in $B$. Indeed, if there is $x \in \partial B$ such that $x - F(x) = o$, then the statement is true. In the other case put
\[
H(t, x) = x - tF(x).
\]
Then $H(1, x) \ne o$ for $x \in \partial B$ and
\[
\|H(t, x)\| \ge \|x\| - t\|F(x)\| \ge 1 - t > 0 \quad \text{for } t \in [0, 1),\ \|x\| = 1.
\]
By the homotopy invariance property of the degree,
\[
1 = \deg(I, \operatorname{int} B, o) = \deg(x - F(x), \operatorname{int} B, o).
\]
By property (iv), the equation $x - F(x) = o$ has a solution in $\operatorname{int} B$.

Example 5.2.10. Let $B := \overline{B(o; 1)}$ be the closed unit ball in $\mathbb{R}^N$ and let $A$ be a linear injective operator from $\mathbb{R}^N$ into $\mathbb{R}^N$.²⁰ Then
\[
\deg(A, \operatorname{int} B, o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(A) \\ \lambda < 0}} m(\lambda)
\]
and $m(\lambda)$ is the multiplicity²¹ of the eigenvalue $\lambda$ of $A$. This follows immediately from Definition 5.2.1 and Exercise 1.1.40. Notice that the same result is true when $A$ is replaced by a $C^1$-mapping $F : \mathbb{R}^N \to \mathbb{R}^N$ which has an isolated zero at $x_0 \in \mathbb{R}^N$, $B(o; 1)$ is replaced by a sufficiently small ball $B(x_0; r)$ (such that the equation $F(x) = o$ has a unique solution in this closed ball, namely $x_0$) and $F'(x_0)$ is injective. Under these hypotheses we have
\[
\deg(F, B(x_0; r), o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(F'(x_0)) \\ \lambda < 0}} m(\lambda). \tag{5.2.8}
\]
The value $\deg(F, B(x_0; r), o)$ is also called the index of an isolated solution $x_0$ of the equation $F(x) = o$.

¹⁹ Mappings $F$, $G$ are now supposed to be continuous only.
²⁰ Actually, $A$ maps $\mathbb{R}^N$ onto $\mathbb{R}^N$ and $o \notin A(\partial B)$.
²¹ See footnote 23 on page 84, $m(\lambda)$ is often called the algebraic multiplicity.

Example 5.2.11. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $Y$ be a linear subspace of $\mathbb{R}^N$. Suppose that $f \in C(\overline{\Omega}, Y)$ is such that $o \notin f(\partial\Omega)$. Denote by $\pi$ a projection of $\mathbb{R}^N$ onto $Y$, by $\tilde{f}$ the restriction of $\pi f$ onto $\overline{\Omega} \cap Y$ and $g(x) := \tilde{f}(\pi x) + (I - \pi)x$. Then
\[
\deg(g, \Omega, o) = \deg(\tilde{f}, \Omega \cap Y, o). \tag{5.2.9}
\]

To see this notice first that
\[
\partial(\Omega \cap Y) = \partial\Omega \cap Y \quad \text{and} \quad g^{-1}(o) = \tilde{f}^{-1}(o).
\]
This means that both sides of (5.2.9) are defined. By the construction of the degree, it suffices to prove the equality under the following additional assumptions: $f \in C^1(\Omega)$ and $o$ is a regular value. Since
\[
g'(y)(h + k) = \tilde{f}'(y)h + k \quad \text{for } y \in Y \cap \Omega,\ h \in Y,\ k \in \operatorname{Ker} \pi,
\]
we get
\[
\det g'(y) = \det \tilde{f}'(y),^{22}
\]
and (5.2.9) follows.
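The formula of Example 5.2.10 says that, for an injective linear map, $\operatorname{sgn} \det A = (-1)^p$ with $p$ the number of negative real eigenvalues counted with multiplicity. The following sketch checks this for a few $2 \times 2$ matrices; it is an illustration under the assumption $N = 2$ (so that eigenvalues come from the quadratic formula), and the helper names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def det2(A: Matrix) -> float:
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def negative_eigenvalue_count(A: Matrix) -> int:
    """Number p of negative real eigenvalues of a 2x2 matrix, with multiplicity."""
    trace = A[0][0] + A[1][1]
    disc = trace * trace - 4.0 * det2(A)
    if disc < 0:
        return 0  # complex conjugate pair, no real eigenvalues
    root = math.sqrt(disc)
    # A double eigenvalue (disc == 0) is listed twice, matching multiplicity 2.
    return sum(1 for lam in ((trace - root) / 2.0, (trace + root) / 2.0) if lam < 0)


def degree_linear(A: Matrix) -> int:
    """Degree of x -> Ax at o by (5.2.1): the sign of det A (A injective)."""
    d = det2(A)
    if d == 0:
        raise ValueError("A is not injective")
    return 1 if d > 0 else -1


samples = {
    "diag(2, -3)": [[2.0, 0.0], [0.0, -3.0]],
    "rotation by 90 degrees": [[0.0, -1.0], [1.0, 0.0]],
    "-I": [[-1.0, 0.0], [0.0, -1.0]],
    "[[1, 2], [3, 4]]": [[1.0, 2.0], [3.0, 4.0]],
}
for name, A in samples.items():
    p = negative_eigenvalue_count(A)
    print(name, degree_linear(A), (-1) ** p)
```

The case $-I$ shows the role of multiplicity: the eigenvalue $-1$ is counted twice, so $p = 2$ and the degree is $+1$.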

²² The reader is asked to check this equality.

Our next aim is to generalize the notion of the degree to infinite dimensional spaces. Since the Brouwer Theorem is a corollary of the homotopy invariance property of the degree and this theorem does not hold even in an infinite dimensional Hilbert space (Example 5.1.7) we cannot expect a meaningful generalization of the Brouwer degree which would be valid for all continuous mappings. Similarly as in the Schauder Fixed Point Theorem we restrict our attention to operators which are well approximated by finite dimensional ones, i.e., to compact operators.

One more remark is desirable. One of the main consequences of the notion of $\deg(F, \Omega, y_0)$ is a sufficient condition for the solvability of the equation $F(x) = y_0$ in the set $\Omega$. If $F$ is a compact operator, then $F(\Omega)$ is rather small in an infinite dimensional space. Therefore it is much better to solve either
\[
x - F(x) - y_0 = o
\]
(recall the Fredholm theory in Section 2.2) or, more generally,
\[
Ax - F(x) = o \quad \text{for a suitable } A.
\]

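The Leray–Schauder degree concerns operators of the type $I - F$ where $F : X \to X$ is a compact operator. Let $\Omega$ be a bounded open set in a Banach space $X$ and let $o \in X \setminus (I - F)(\partial\Omega)$. Using the compactness of $F$, it is easy to prove that $(I - F)(\partial\Omega)$ is closed, and hence
\[ d := \operatorname{dist}(o, (I - F)(\partial\Omega)) > 0. \]
Let $\{F_n\}_{n=1}^{\infty}$ be a sequence in $\mathcal{C}_f(\overline{\Omega}, X)$ which converges to $F$ uniformly on $\overline{\Omega}$ (Theorem 5.1.9). Denote
\[ X_n = \operatorname{Lin} F_n(\overline{\Omega}), \qquad \Omega_n = \Omega \cap X_n, \qquad G_n(x) = x - F_n(x) \quad\text{for } x \in \overline{\Omega}_n. \]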

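If $n$ is sufficiently large, then $o \notin G_n(\partial\Omega_n)$ and $\deg(G_n, \Omega_n, o)$ is well defined.

Lemma 5.2.12. Under the above stated hypotheses, the sequence of integers $\{\deg(G_n, \Omega_n, o)\}_{n=1}^{\infty}$ is constant for large $n$, and its limit does not depend on the choice of the approximating sequence $\{G_n\}_{n=1}^{\infty}$.

Proof. For a given $\varepsilon > 0$ there is $n_0$ such that
\[ \sup_{x \in \overline{\Omega}} \|F(x) - F_n(x)\| < \varepsilon \quad\text{for } n \ge n_0. \]
Choose some $n, m \ge n_0$ and put
\[ \tilde X = X_n + X_m, \qquad \tilde G_k(x) = x - F_k(x), \quad x \in \overline{\Omega} \cap \tilde X, \quad k = n, m. \]
Consider the homotopy
\[ H(t, x) = t\,\tilde G_m(x) + (1 - t)\,\tilde G_n(x), \qquad t \in [0, 1],\ x \in \overline{\Omega} \cap \tilde X. \]
For $x \in \partial\Omega \cap \tilde X = \partial(\Omega \cap \tilde X)$ we have
\begin{align*}
\|H(t, x)\| &= \|x - F(x) - t[F_m(x) - F(x)] - (1 - t)[F_n(x) - F(x)]\| \\
&\ge \|x - F(x)\| - t\|F_m(x) - F(x)\| - (1 - t)\|F_n(x) - F(x)\| \ge d - 2\varepsilon > 0
\end{align*}
for small $\varepsilon > 0$. By Theorem 5.2.7,
\[ \deg(\tilde G_m, \Omega \cap \tilde X, o) = \deg(\tilde G_n, \Omega \cap \tilde X, o). \]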


Using Example 5.2.11 we get
\[ \deg(\tilde G_k, \Omega \cap \tilde X, o) = \deg(G_k, \Omega_k, o), \qquad k = m, n, \]
i.e., $\deg(G_n, \Omega_n, o)$ is constant for $n \ge n_0$. If $\{\Phi_n\}$ is another approximating sequence of $F$, then the homotopy joining the restrictions of $I - F_n$ and $I - \Phi_n$ to the span $Z$ of $\operatorname{Im} F_n + \operatorname{Im} \Phi_n$ can be defined. The same procedure as above yields
\[ \deg(x - F_n(x), \Omega \cap Z, o) = \deg(x - \Phi_n(x), \Omega \cap Z, o). \]

We are now able to define the Leray–Schauder degree
\[ \deg(I - F, \Omega, y_0) := \deg(I - (F + y_0), \Omega, o) \]
as the limit of $\deg(G_n, \Omega_n, o)$ for any approximating sequence $\{G_n, \Omega_n\}_{n=1}^{\infty}$. This construction also shows that the Leray–Schauder degree inherits its properties from the Brouwer degree.

Theorem 5.2.13. Let $\Omega$ be a bounded open subset of a Banach space $X$. There exists a mapping $\deg(I - F, \Omega, y_0)$ defined for all $F \in \mathcal{C}(\overline{\Omega}, X)$ and $y_0 \in X$ such that $x - F(x) \ne y_0$ for all $x \in \partial\Omega$. This mapping has the following properties:

(i) $\deg(I, \Omega, y_0) = \begin{cases} 1 & \text{if } y_0 \in \Omega, \\ 0 & \text{if } y_0 \notin \overline{\Omega}. \end{cases}$

(ii) $\deg(I - F, \Omega, y_0) = \deg(I - F - y_0, \Omega, o)$.

(iii) If $\deg(I - F, \Omega, y_0) \ne 0$, then the equation $x - F(x) = y_0$ has a solution in $\Omega$.

(iv) If $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $x - F(x) \ne y_0$ for each $x \in \overline{\Omega} \setminus \bigcup_{j=1}^{k} \Omega_j$, then
\[ \deg(I - F, \Omega, y_0) = \sum_{j=1}^{k} \deg(I - F, \Omega_j, y_0). \]

(v) If $F, G \in \mathcal{C}(\overline{\Omega}, X)$ and
\[ \sup_{x \in \overline{\Omega}} \|F(x) - G(x)\|_X < \inf_{x \in \partial\Omega} \|x - F(x) - y_0\|_X, \]
then $\deg(I - F, \Omega, y_0) = \deg(I - G, \Omega, y_0)$.


(vi) (homotopy invariance property) If $F, G \in \mathcal{C}(\overline{\Omega}, X)$ and
\[ H(t, x) = (1 - t)F(x) + tG(x), \qquad t \in [0, 1],\ x \in \overline{\Omega}, \]
are such that
\[ x - H(t, x) \ne y_0 \quad\text{for every } x \in \partial\Omega \text{ and } t \in [0, 1], \]
then $\deg(I - H(t, \cdot), \Omega, y_0)$ is constant on $[0, 1]$. In particular, $\deg(I - F, \Omega, y_0) = \deg(I - G, \Omega, y_0)$.

Proof. Finite dimensional approximations and the corresponding properties of the Brouwer degree are used to prove all the statements. We only give details for (iii). We can assume that $y_0 = o$ (by (ii)). Let $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}_f(\overline{\Omega}, X)$ be a sequence of finite dimensional approximations that converges to $F$ uniformly on $\overline{\Omega}$. We know, by construction of the degree, that
\[ \deg(I - F, \Omega, o) = \deg(I_n - F_n, \Omega_n, o) \ne 0 \quad\text{for all } n \text{ large}. \]
By Theorem 5.2.7 there are $x_n \in \Omega_n \subset \Omega$ such that
\[ F_n(x_n) = x_n. \]
Since $F$ is compact there exists a subsequence $\{F(x_{n_k})\}_{k=1}^{\infty}$ converging to a $z \in X$. It follows from the uniform convergence of $F_n$ that $\lim_{k \to \infty} F_{n_k}(x_{n_k}) = z$, too. This means that also $\lim_{k \to \infty} x_{n_k} = z \in \overline{\Omega}$ and, therefore,
\[ F(z) = z. \]
But $z$ cannot belong to $\partial\Omega$, i.e., $z \in \Omega$.

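Example 5.2.14 (Rothe’s version of the Schauder Fixed Point Theorem). Assume that $F$ is a compact operator from the closed unit ball $\overline{B}(o; 1)$ of a Banach space $X$ into $X$. If $F(\partial B(o; 1)) \subset \overline{B}(o; 1)$, then $F$ has a fixed point in $\overline{B}(o; 1)$. Indeed, suppose not and consider $H(t, x) := x - tF(x)$. If $x = tF(x)$ for some $x \in \partial B(o; 1)$ and $t \in (0, 1)$, then $\|F(x)\| = 1/t > 1$, which is excluded; the values $t = 0$ and $t = 1$ are excluded because $o \notin \partial B(o; 1)$ and $F$ has no fixed point. By the homotopy invariance property, $\deg(I - F, B(o; 1), o) = \deg(I, B(o; 1), o) = 1$, a contradiction.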


Example 5.2.15 (Schaefer). Let $F \in \mathcal{C}(X, X)$ and let
\[ \Sigma := \{x \in X : \exists\, t \in [0, 1] \text{ such that } x - tF(x) = o\} \]
be bounded.$^{23}$ Then $F$ has a fixed point. To prove this choose an $r > 0$ such that $\Sigma \subset B(o; r)$ and put $\Omega = B(o; r)$ (open ball). The homotopy invariance property of the degree can be applied to
\[ H(t, x) = (1 - t)x + t(x - F(x)), \]
and the result follows.
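A one-dimensional illustration of Example 5.2.15 may help fix ideas. It is a sketch under our own assumptions: the choices $X = \mathbb{R}$, $F(x) = \cos x$ and $r = 2$ are not from the text.

```latex
% Illustration only: X = \mathbb{R}, F(x) = \cos x (assumed toy choice).
\[
  \Sigma = \{x \in \mathbb{R} : x = t\cos x \text{ for some } t \in [0,1]\}
  \subset [-1, 1],
  \qquad\text{since } |x| = t\,|\cos x| \le 1 .
\]
% Take r = 2 and \Omega = (-2, 2). On \partial\Omega = \{-2, 2\}:
\[
  |H(t,x)| = |x - t\cos x| \ge 2 - 1 = 1 > 0, \qquad t \in [0,1],
\]
% so the homotopy invariance property gives
\[
  \deg(I - F, \Omega, 0) = \deg(I, \Omega, 0) = 1 \neq 0,
\]
% and x = \cos x has a solution in (-2, 2) (numerically x \approx 0.739).
```

Note that the bound on $\Sigma$ was obtained without knowing whether any of the equations $x = t\cos x$ is solvable, which is exactly the sense of an a priori estimate.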

The next example shows that finding an a priori estimate need not be a trivial task.

Example 5.2.16. Consider the boundary value problem
\[ \begin{cases} \ddot x(t) = f(t, x(t), \dot x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.2.10} \]
where $f : [0, 1] \times \mathbb{R}^2 \to \mathbb{R}$ is a continuous function. We know (Example 2.3.8 and also Example 5.1.14) that a solution of (5.2.10) is also a solution of the integral equation
\[ F(x)(t) := \int_0^1 G(t, s) f(s, x(s), \dot x(s))\, ds = x(t) \]
where the Green function $G(t, s)$ is defined as follows:
\[ G(t, s) = \begin{cases} s(t - 1), & 0 \le s \le t \le 1, \\ t(s - 1), & 0 \le t < s \le 1. \end{cases} \]
Notice the difference between this example and Example 5.1.14. Here the operator $F$ depends also on the derivative $\dot x$, and it is thus defined only on a dense subset of the space $C[0, 1]$. The notion of the degree cannot be used for $F$ in this space. Therefore, we have to work either in $C^1[0, 1]$ or in the space
\[ X = \{x \in C^2[0, 1] : x(0) = x(1) = 0\} \]
to which our solution has to belong. In both spaces the problem of an a priori estimate of a possible solution to (5.2.10) occurs (see Step 2 of this example). We will work in $X$.

$^{23}$ This assumption is often called an a priori estimate. Notice that it is not assumed that the equation $x - tF(x) = o$ has any solution. However, if a solution exists, then it belongs to a certain ball the radius of which is independent of $t$. The result given in this example is also called the Leray–Schauder Continuation Method.


Step 1. First we show that $F$ is a compact operator on $X$. To prove this notice that $F$ is the composition $\Psi \circ \Phi$ where $\Phi : X \to C[0, 1]$ is a Nemytski type operator
\[ \Phi(x) : t \mapsto f(t, x(t), \dot x(t)) \]
and $\Psi : C[0, 1] \to X$ is the linear integral operator
\[ \Psi(y)(t) = \int_0^1 G(t, s) y(s)\, ds. \]
The Nemytski operator $\Phi$ is also a composition of a compact embedding of $X$ into $C^1[0, 1]$ (the Arzelà–Ascoli Theorem) and a continuous operator from $C^1[0, 1]$ into $C[0, 1]$. It is sufficient to show that $\Psi$ is a continuous linear operator from $C[0, 1]$ into $X$. Indeed, since $\Psi(y)$ is a solution of the boundary value problem
\[ \begin{cases} \ddot x(t) = y(t), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases} \]
(Example 2.3.8), we have $x = \Psi(y) \in X$. Because of the boundary conditions there is $t_0 \in (0, 1)$ such that $\dot x(t_0) = 0$ (the classical theorem due to Rolle). This allows us to write
\[ \dot x(t) = \int_{t_0}^{t} y(s)\, ds, \qquad t \in [0, 1], \]
and to get the estimate $|\dot x(t)| \le \|y\|_{C[0,1]}$. Since
\[ |x(t)| = \left| \int_0^t \dot x(s)\, ds \right| \le \|y\|_{C[0,1]}, \]
we have
\[ \|x\|_X := \sup_{t \in [0,1]} |x(t)| + \sup_{t \in [0,1]} |\dot x(t)| + \sup_{t \in [0,1]} |\ddot x(t)| \le 3\|y\|_{C[0,1]}. \]
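The bound for $\Psi$ in Step 1 can be checked on a concrete right-hand side. The test function $y \equiv 1$ below is our own choice, used only as a sanity check of the Green function and of the constant $3$.

```latex
% Sanity check with the assumed test function y \equiv 1.
\[
  \Psi(y)(t) = \int_0^t s(t-1)\,ds + \int_t^1 t(s-1)\,ds
             = \frac{t^2(t-1)}{2} - \frac{t(1-t)^2}{2}
             = \frac{t(t-1)}{2}.
\]
% Indeed x(t) = t(t-1)/2 satisfies \ddot x = 1, x(0) = x(1) = 0,
% and \dot x(t) = t - 1/2 vanishes at t_0 = 1/2 (Rolle). The three suprema are
\[
  \sup_{[0,1]}|x| = \tfrac18, \qquad
  \sup_{[0,1]}|\dot x| = \tfrac12, \qquad
  \sup_{[0,1]}|\ddot x| = 1,
\]
% so that
\[
  \|x\|_X = \tfrac18 + \tfrac12 + 1 = \tfrac{13}{8} \le 3 = 3\,\|y\|_{C[0,1]} .
\]
```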

Step 2. In order to establish an a priori estimate we have to require estimates on the behavior of $f(t, x, y)$ for large $x$ and $y$:

(H1) There is $M_0$ such that
\[ x f(s, x, 0) > 0 \quad\text{for } s \in [0, 1] \text{ and } |x| \ge M_0. \]

(H2) There are $c_1$, $c_2$ such that
\[ |f(s, x, y)| \le c_1 y^2 + c_2 \quad\text{for } s \in [0, 1],\ |x| \le M_0,\ y \in \mathbb{R}.^{24} \]

Suppose that (H1) and (H2) hold. Let there exist $x \in X$, $x \ne o$, and $\lambda \in (0, 1]$ such that
\[ x = \lambda F(x). \]

$^{24}$ The condition (H1) is sometimes called the sign condition and (H2) the Nagumo-type condition.


First we will estimate $\|x\|_{C[0,1]}$. There is $t_0 \in (0, 1)$ such that $|x(t_0)| = \|x\|_{C[0,1]}$ and we assume $x(t_0) > 0$. Then $\dot x(t_0) = 0$ and $\ddot x(t_0) \le 0$. Since
\[ 0 \ge x(t_0)\ddot x(t_0) = \lambda x(t_0) f(t_0, x(t_0), 0), \]
we have $x(t_0) \le M_0$ according to (H1). Similarly for $x(t_0) < 0$, i.e.,
\[ \|x\|_{C[0,1]} \le M_0. \]
To get an estimate of $\dot x$ we consider the function
\[ g(t) := \log\bigl[c_1 (\dot x(t))^2 + c_2\bigr]. \]
Since
\[ \dot g(t) = 2c_1 \frac{\dot x(t)\ddot x(t)}{c_1 (\dot x(t))^2 + c_2} = 2\lambda c_1 \frac{\dot x(t) f(t, x(t), \dot x(t))}{c_1 (\dot x(t))^2 + c_2}, \]
we obtain, by (H2),
\[ |\dot g(t)| \le 2c_1 |\dot x(t)|. \]
Let $G = \{t \in [0, 1] : \dot x(t) \ne 0\}$. Then
\[ G = \bigcup_j J_j \]
where $J_j$ is a closed interval such that $\dot x(t) \ne 0$ for $t \in J_j$ and $\dot x$ vanishes at one end point of $J_j$ (say $\tau_j$) at least. Now, we have
\begin{align*}
0 \le \log \frac{c_1 \dot x^2(t) + c_2}{c_1 \cdot 0 + c_2} &= g(t) - g(\tau_j) = \int_{\tau_j}^{t} \dot g(s)\, ds \le \left| \int_{\tau_j}^{t} |\dot g(s)|\, ds \right| \\
&\le 2c_1 \left| \int_{\tau_j}^{t} |\dot x(s)|\, ds \right| = 2c_1 |x(t) - x(\tau_j)|,
\end{align*}
since $\dot x$ does not change its sign in $J_j$. Hence
\[ \log \frac{c_1 \dot x^2(t) + c_2}{c_2} \le 4c_1 M_0. \]
This inequality shows that
\[ |\dot x(t)| \le M_1 := \sqrt{\frac{c_2}{c_1}}\, e^{2c_1 M_0}, \qquad t \in [0, 1]. \]
If
\[ M_2 := \sup\{|f(t, x, y)| : t \in [0, 1],\ |x| \le M_0,\ |y| \le M_1\}, \]
then $\|\ddot x\|_{C[0,1]} \le M_2$. These estimates of $\|x\|_{C[0,1]}$, $\|\dot x\|_{C[0,1]}$, $\|\ddot x\|_{C[0,1]}$ show that the set $\Sigma$ from Example 5.2.15 is bounded, and therefore the proof of existence of a solution of (5.2.10) under the hypotheses (H1) and (H2) is complete.

The reader can imagine that (H1), (H2) are not the only sufficient conditions for solving (5.2.10). However, the direct use of the Schauder Fixed Point Theorem leads to more restrictive assumptions on $f$. A survey of results until 1980 can be found in the monograph Fučík [53].
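To see how the constants of Step 2 fit together, one may trace them for a specific nonlinearity. The function $f$ below and the value $M_0 = 2$ are our own illustrative assumptions; they are not taken from the text.

```latex
% Illustration only: f(t,x,y) = x + y^2 - \sin(\pi t) (assumed choice).
% (H1): for |x| \ge 2,
\[
  x f(s,x,0) = x^2 - x\sin(\pi s) \ge |x|\,(|x| - 1) > 0,
  \qquad\text{so } M_0 = 2 \text{ works.}
\]
% (H2): for |x| \le M_0 = 2,
\[
  |f(s,x,y)| \le |x| + y^2 + 1 \le y^2 + 3,
  \qquad\text{so } c_1 = 1,\ c_2 = 3 .
\]
% Resulting a priori bounds from Step 2:
\[
  \|x\|_{C[0,1]} \le 2, \qquad
  \|\dot x\|_{C[0,1]} \le M_1 = \sqrt{3}\,e^{4}, \qquad
  \|\ddot x\|_{C[0,1]} \le M_2 \le 3 + M_1^2 = 3 + 3e^{8}.
\]
```

The bounds are far from sharp, but only their independence of $\lambda$ matters for the boundedness of $\Sigma$.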


It is clear that the above stated procedure can be used for solving a more general equation
\[ Ax = F(x) \quad\text{where } F \in \mathcal{C}(\overline{\Omega}, X) \tag{5.2.11} \]
and $A$ is a linear operator with a bounded inverse. In that case (5.2.11) is equivalent to $x = A^{-1}F(x)$ with a compact operator $A^{-1}F$. More interesting questions arise for a non-invertible $A$. Since many differential operators (both ordinary and partial) are Fredholm operators we will suppose that $A$ is a linear closed Fredholm operator$^{25}$ of index zero, and proceed as in Remark 4.3.14 with the exception that $A$ is not assumed to be continuous. We denote $X_1 := \operatorname{Ker} A$, $Y_2 := \operatorname{Im} A$, and choose topological complements $X_2$, $Y_1$ such that
\[ X = X_1 \oplus X_2, \qquad Y = Y_1 \oplus Y_2. \]
These closed complements exist because $X_1$ has a finite dimension and $Y_2$ a finite co-dimension (Example 2.1.12 and Remark 2.1.19). By the assumption on the index of $A$ there is also a homeomorphism $\Lambda$ of $Y_1$ onto $X_1$. Denote by $P$ and $Q$ the linear continuous projections onto $X_1$ and $Y_1$ with kernels $X_2$ and $Y_2$, respectively. Then the restriction of $A$ to $X_2 \cap \operatorname{Dom} A$ is an injective operator with a bounded inverse $B$.$^{26}$ The equation (5.2.11) is equivalent to the pair of equations
\[ Ax_2 = (I - Q)F(x_1 + x_2), \qquad o = QF(x_1 + x_2), \qquad x_1 \in X_1,\ x_2 \in X_2 \cap \operatorname{Dom} A, \]
(see Figure 5.2.4) or
\[ x_2 = B(I - Q)F(x_1 + x_2), \qquad x_1 = x_1 + \Lambda QF(x_1 + x_2).^{27} \tag{5.2.12} \]
This pair of equations is equivalent to the equation
\[ G(x) = x \quad\text{where}\quad G(x) := Px + \Lambda QF(x) + B(I - Q)F(x). \]

$^{25}$ A linear closed operator $A$ is said to be Fredholm if $\dim \operatorname{Ker} A < \infty$, $\operatorname{Im} A$ is closed and $\operatorname{codim} \operatorname{Im} A < \infty$. The index of such an operator is defined as $\operatorname{ind} A := \dim \operatorname{Ker} A - \operatorname{codim} \operatorname{Im} A$. See the definition on page 70.

$^{26}$ Indeed, $B$ is a closed operator (as an inverse to a closed operator) defined on the Banach space $Y_2$. The continuity of $B$ follows now from the Closed Graph Theorem (Corollary 2.1.10). The operator $B(I - Q)$ is called a generalized (or a right) inverse to $A$. It is characterized by the following two properties: (i) $AB(I - Q) = I - Q$ (the reason for calling it the right inverse); (ii) $B(I - Q)Ax = (I - P)x$ for $x \in \operatorname{Dom} A$.

$^{27}$ Since $F$ is nonlinear, $\Lambda$ need not be taken as linear. It is actually only essential that $\Lambda$ maps $Y_1$ into $X_1$ and $\Lambda^{-1}(o) = \{o\}$.

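Figure 5.2.4. (Diagram of the decompositions $X = X_1 \oplus X_2$ with $X_1 = \operatorname{Ker} A$ and $Y = Y_1 \oplus Y_2$ with $Y_2 = \operatorname{Im} A$, showing the projections $P$, $Q$ and the maps $B$, $\Lambda$.)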

If $\Omega$ is a bounded open subset of $X$, $Ax - F(x) \ne o$ for each $x \in \partial\Omega$ and $G$ is compact on $\overline{\Omega}$, then the Leray–Schauder degree $\deg(I - G, \Omega, o)$ is well defined. It is called the coincidence degree of the couple $(A, F)$. It can be proved that this definition does not depend on the choice of the projections $P$, $Q$ and the class of $\Lambda$’s which do not change the orientations in $X_1$ and $Y_1$. The coincidence degree was introduced by J. Mawhin (see, e.g., Mawhin [91] or Gaines & Mawhin [57]). J. Mawhin also proved the following theorem which generalizes the statement of Example 5.2.15.

Theorem 5.2.17 (J. Mawhin). Let $A : \operatorname{Dom} A \subset X \to X$ be a Fredholm operator of index zero, $\Omega$ a bounded open subset of a Banach space $X$, and let $B(I - Q)F \in \mathcal{C}(\overline{\Omega}, X)$ where $B(I - Q)$ is a generalized inverse to $A$. Assume further that

(i) $Ax - \lambda F(x) \ne o$ for $x \in \partial\Omega \cap \operatorname{Dom} A$, $\lambda \in (0, 1)$,$^{28}$

(ii) $\deg(\Lambda QF|_{\operatorname{Ker} A \cap \overline{\Omega}}, \operatorname{Ker} A \cap \Omega, o) \ne 0$.

Then the equation (5.2.11) has a solution in $\overline{\Omega}$.

Proof. The proof is based on the observation that the coincidence degree can be reduced, with the help of the homotopy invariance property and the Product Formula, to the Brouwer degree of the restriction of $\Lambda QF$ to $\operatorname{Ker} A \cap \Omega$; for details see the references given above.

Example 5.2.18. Consider the equation
\[ \dot x = f(t, x) \tag{5.2.13} \]
together with the periodic boundary condition
\[ x(0) = x(1) \tag{5.2.14} \]
where $f \in C([0, 1] \times \mathbb{R}, \mathbb{R})$. See also Example 4.3.15.

$^{28}$ Notice that injectivity of $A$ is actually not needed, hence the assumption (i) is not assumed for $\lambda = 0$.


We denote $\operatorname{Dom} A = \{x \in C^1[0, 1] : x(0) = x(1)\}$ and put
\[ Ax = \dot x \quad\text{for } x \in \operatorname{Dom} A. \]
Further, let $F$ be a Nemytski operator defined by
\[ F(x)(t) = f(t, x(t)), \qquad t \in [0, 1],\ x \in X := C[0, 1]. \]
Then $A$ is a Fredholm operator of index zero. We choose projections
\[ Px = x(0) \text{ onto } \operatorname{Ker} A \quad\text{and}\quad Qy = \int_0^1 y(s)\, ds \text{ onto the complement of } \operatorname{Im} A. \]
Then the generalized inverse to $A$ is given by
\[ B(I - Q)y(t) = \int_0^t y(s)\, ds - t \int_0^1 y(s)\, ds \]
and it is an isomorphism of $\operatorname{Im} A = \operatorname{Ker} Q$ onto $X_2 \cap \operatorname{Dom} A$ where $X_2 = \operatorname{Ker} P$. Moreover, $B(I - Q)F$ is a compact operator on $X$. To verify conditions (i), (ii) of Theorem 5.2.17 we suppose

(H) there are functions $f_+, f_- \in C[0, 1]$ such that
\[ \lim_{x \to -\infty} f(t, x) = f_-(t), \qquad \lim_{x \to +\infty} f(t, x) = f_+(t) \]
uniformly with respect to $t \in [0, 1]$ and
\[ f_-(t) < f(t, x) < f_+(t), \qquad t \in [0, 1],\ x \in \mathbb{R}.^{29} \]

Step 1. Verification of condition (i): For any solution $x$ of (5.2.13)–(5.2.14) we have, by integration,
\[ \int_0^1 f_-(t)\, dt < 0 = \int_0^1 f(t, x(t))\, dt < \int_0^1 f_+(t)\, dt. \tag{5.2.15} \]
Take an $\varepsilon > 0$ such that
\[ \int_0^1 f_-(t)\, dt < -\varepsilon, \qquad \int_0^1 f_+(t)\, dt > \varepsilon. \]
Then there exists $r > 0$ such that
\[ 0 < f(t, x) - f_-(t) < \frac{\varepsilon}{2} \quad\text{for } t \in [0, 1],\ x < -r, \]
and
\[ 0 < f_+(t) - f(t, x) < \frac{\varepsilon}{2} \quad\text{for } t \in [0, 1],\ x > r. \]
This implies that
\[ \int_0^1 f(t, x)\, dt < -\frac{\varepsilon}{2} \quad\text{for } x < -r, \qquad \int_0^1 f(t, x)\, dt > \frac{\varepsilon}{2} \quad\text{for } x > r. \]
It means that for any solution $x$ of (5.2.13)–(5.2.14) there exists $t_0 \in [0, 1]$ such that $|x(t_0)| \le r$. Since
\[ x(t) = x(t_0) + \int_{t_0}^{t} f(s, x(s))\, ds, \]
we get
\[ \|x\|_{C[0,1]} \le M := r + \max\{\|f_-\|_{C[0,1]}, \|f_+\|_{C[0,1]}\}. \]
If we take $\Omega$ to be a ball $B(o; R)$ of radius $R > M$, then the condition (i) from Theorem 5.2.17 is satisfied for $\lambda = 1$. The same is also true for the solutions of
\[ Ax = \lambda F(x), \qquad \lambda \in (0, 1). \]

$^{29}$ Conditions similar to (H) are called conditions of the Landesman–Lazer type. See, e.g., Landesman & Lazer [83], Fučík [53], Mawhin [91] or Drábek [39], and also Section 7.5.

Step 2. Verification of condition (ii): For $x \in \operatorname{Ker} A$ we have
\[ \Lambda QF(x) = \int_0^1 f(t, x)\, dt \]
provided we have taken $\Lambda$ as the identity map. Moreover,
\[ \int_0^1 f(t, R)\, dt > 0 > \int_0^1 f(t, -R)\, dt. \tag{5.2.16} \]
From the construction of the Brouwer degree it follows that we can assume that
\[ \Phi(x) := \int_0^1 f(t, x)\, dt \]
is smooth, and $0$ is a regular value of $\Phi$. By (5.2.16), the number of zero points of $\Phi$ is odd. This means that
\[ \deg(\Phi, (-R, R), 0) = \sum_{\Phi(x) = 0} \operatorname{sgn} \det \Phi'(x) = 1 = \deg(\Lambda QF|_{\operatorname{Ker} A \cap \overline{B}(o; R)}, \operatorname{Ker} A \cap B(o; R), o) \]
and, therefore, condition (ii) is also satisfied.
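Hypothesis (H) and the integral condition appearing in (5.2.15) become very concrete for a bounded monotone nonlinearity. The following sketch uses $f(t, x) = \arctan x + e(t)$ with $e \in C[0, 1]$; this choice of $f$ and the two constant forcings are our own assumptions.

```latex
% Illustration only: f(t,x) = \arctan x + e(t), e \in C[0,1] (assumed choice).
\[
  f_\pm(t) = e(t) \pm \frac{\pi}{2}, \qquad
  f_-(t) < \arctan x + e(t) < f_+(t),
\]
% and the limits are uniform in t because e does not depend on x.
\[
  \int_0^1 f_-(t)\,dt < 0 < \int_0^1 f_+(t)\,dt
  \iff \left|\int_0^1 e(t)\,dt\right| < \frac{\pi}{2}.
\]
% Two constant forcings:
%   e \equiv 1: the condition holds, and x(t) \equiv -\tan 1 is a constant
%               (hence periodic) solution of \dot x = \arctan x + 1.
%   e \equiv 2: the condition fails; \dot x = \arctan x + 2 > 2 - \pi/2 > 0,
%               so x(1) > x(0) and no solution of (5.2.13)-(5.2.14) exists.
```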


These considerations show that the problem (5.2.13)–(5.2.14) has a solution provided (H) is fulfilled. Notice that, by (5.2.15), the condition
\[ \int_0^1 f_-(t)\, dt < 0 < \int_0^1 f_+(t)\, dt \]
is also necessary for the solvability of (5.2.13)–(5.2.14) under the assumption (H).

Remark 5.2.19. It is not necessary to consider only projections $P$, $Q$ on the “small” $\operatorname{Ker} A$ and a complement of $\operatorname{Im} A$. For example, suppose that there are projections $\{Q_n\}$ converging to the identity in a certain sense, and $Q_n Q = Q$ (e.g., $Q_n$ can be the partial sums of the Fourier series of the elements of $Y = C[0, 1]$ for the periodic problem). If we can take projections $P_n$ so that $A(\operatorname{Im}(I - P_n)) = \operatorname{Im}(I - Q_n)$, then there is a chance to solve the first equation in (5.2.12) for a fixed $x_1$ by the Contraction Principle even if $F$ is only locally Lipschitz. This idea belongs to L. Cesari (see, e.g., his survey in Cesari [22]). Using this approach he proved the existence of a $2\pi$-periodic solution of the equation
\[ \ddot x + x^3 = \sin t. \]
Notice a significant difference in the sign of the nonlinear term here and in (H1) in Example 5.2.16, and the fact that the growth of the nonlinear term is faster here than in (H2).

At the end of this section we turn our attention to the bifurcations of solutions. As in Section 4.3 we consider the equation
\[ f(\lambda, x) = o \]
where $f : \mathbb{R} \times X \to X$ is continuous on $J \times U$, $J$ is an open interval and $U$ is a neighborhood of $o$ in a Banach space $X$. We suppose that
\[ f(\lambda, o) = o \quad\text{for all } \lambda \in J \]
and desire to find conditions under which the point $\lambda_0 \in J$ is a bifurcation point according to Definition 4.3.21. In Section 4.3 we used a method based on the Implicit Function Theorem; now we want to employ the topological approach based on the degree theory. Notice that the definition of the index of an isolated solution (Example 5.2.10) can be used literally also in an infinite dimensional space.


Proposition 5.2.20. Let $h(\lambda, \cdot) : X \to X$ be a compact operator on the neighborhood $U$ of zero in a Banach space $X$ for all $\lambda \in J$. Let $o$ be an isolated solution of
\[ f(\lambda, x) := x - h(\lambda, x) = o \quad\text{in } U \text{ for all } \lambda \in J \setminus \{\lambda_0\}. \]
Put $i(\lambda) = \deg(f(\lambda, \cdot), U, o)$. If
\[ \lim_{\lambda \to \lambda_0-} i(\lambda) \ne \lim_{\lambda \to \lambda_0+} i(\lambda), \tag{5.2.17} \]
then $(\lambda_0, o)$ is a bifurcation point of $f$.

Proof. Suppose not. Then there is a neighborhood $V = \tilde J \times \tilde U$ of $(\lambda_0, o)$ such that $(\lambda, o)$ are the only solutions of
\[ f(\lambda, x) = o \quad\text{in } V. \]
This means that for any $\lambda \in \tilde J$ the index $i(\lambda) = \deg(I - h(\lambda, \cdot), \tilde U, o)$ is defined and, by the general homotopy invariance property of the degree (Exercise 5.2.29), the index $i(\lambda)$ is constant, a contradiction to (5.2.17).

In practice, the use of Proposition 5.2.20 is limited by the difficulty of computing the index $i(\lambda)$. The following classical result (Theorem 5.2.23) is based on a special form of $f$ which is often met in applications. For the proof we need two prerequisites which are of independent interest.

Proposition 5.2.21. Let $\Omega$ be an open set in a Banach space $X$ and let $F \in \mathcal{C}(\Omega, X)$. If the Fréchet derivative $F'(x_0)$ exists for an $x_0 \in \Omega$, then $F'(x_0)$ is a (linear) compact operator.

Proof. If $F'(x_0)$ is not compact, then one can find $\varepsilon_0 > 0$ and a sequence $\{y_n\}_{n=1}^{\infty} \subset X$ such that $\|y_n\| \le 1$ and
\[ \|F'(x_0)y_k - F'(x_0)y_l\| \ge \varepsilon_0 \quad\text{for } k \ne l. \]
By the definition of Fréchet derivative, there is $\delta > 0$ such that
\[ \|F(x_0 + h) - F(x_0) - F'(x_0)h\| \le \frac{\varepsilon_0}{4}\|h\| \quad\text{provided } \|h\| < \delta\ (\le 1). \]
Choose $\tau > 0$ such that
\[ \|\tau y_k\| < \delta \quad\text{and}\quad x_0 + \tau y_k \in \Omega \quad\text{for all } k \in \mathbb{N}. \]


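Then, for $k \ne l$,
\begin{align*}
\|F(x_0 + \tau y_k) - F(x_0 + \tau y_l)\| &\ge \|F'(x_0)(\tau y_k - \tau y_l)\| - \|F(x_0 + \tau y_k) - F(x_0) - F'(x_0)\tau y_k\| \\
&\quad - \|F(x_0 + \tau y_l) - F(x_0) - F'(x_0)\tau y_l\| \ge \frac{\varepsilon_0 \tau}{2}.
\end{align*}
But this means that $F$ is not compact on $\Omega$, a contradiction.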

Proposition 5.2.22 (Leray–Schauder Index Formula). Let $\Omega$ be an open bounded set in a Banach space $X$ and let $F \in \mathcal{C}(\overline{\Omega}, X)$. Let $x_0 \in \Omega$ be a unique solution in $\overline{\Omega}$ of the equation
\[ x = F(x). \]
Assume that the Fréchet derivative $F'(x_0)$ exists and $I - F'(x_0)$ is continuously invertible. Then
\[ \deg(I - F, \Omega, o) = (-1)^{\beta} \quad\text{where}\quad \beta = \sum_{\substack{\lambda \in \sigma(F'(x_0)) \cap \mathbb{R} \\ \lambda > 1}} m(\lambda) \tag{5.2.18} \]
and $m(\lambda)$ is the multiplicity$^{30}$ of the eigenvalue $\lambda$ of the operator $F'(x_0)$.

Proof. First we recall that $F'(x_0)$ is a compact operator (Proposition 5.2.21) and, therefore, $\beta$ is a finite number (Corollary 2.2.13). Choose a ball $B(o; \varepsilon)$ so small that $x_0 + B(o; \varepsilon) \subset \Omega$, and put
\[ H(t, y) = \frac{F(x_0 + ty) - F(x_0)}{t}, \qquad t \in (0, 1],\ y \in \overline{B}(o; \varepsilon), \]
and
\[ H(0, y) = F'(x_0)y, \qquad y \in \overline{B}(o; \varepsilon). \]
The ball $B(o; \varepsilon)$ can be chosen such that the equation
\[ y = H(t, y) \]
has a unique solution in $\overline{B}(o; \varepsilon)$, namely $y = o$. Indeed,
\begin{align*}
\|H(t, y) - y\| &= \left\| \frac{1}{t}[F(x_0 + ty) - F(x_0) - F'(x_0)(ty)] + F'(x_0)y - y \right\| \\
&\ge \|F'(x_0)y - y\| - \frac{1}{t}\|F(x_0 + ty) - F(x_0) - F'(x_0)(ty)\| \ge c\|y\| - \frac{c}{2}\|y\|
\end{align*}
provided $\|y\|$ is small enough (by the definition of $F'(x_0)$ and the assumption on $I - F'(x_0)$).

$^{30}$ For the definition of multiplicity see footnote 23 on page 84.


By the homotopy invariance property (in a more general setting, see Exercise 5.2.29), $\deg(I - H(t, \cdot), B(o; \varepsilon), o)$ is constant on the interval $[0, 1]$. In particular,
\[ \deg(I - F, \Omega, o) = \deg(I - F'(x_0), B(o; \varepsilon), o). \]
Put
\[ X_1 := \bigoplus_{\substack{\lambda \in \sigma(F'(x_0)) \\ \lambda > 1}} \bigcup_{p=1}^{\infty} \operatorname{Ker}\,[\lambda I - F'(x_0)]^p. \]
As we have mentioned above, $\dim X_1 = \beta < \infty$. Moreover, there exists a topological complement $X_2$ to $X_1$ in $X$ which is $F'(x_0)$-invariant (see the decomposition (2.2.4)). This decomposition of $X$ allows us to use the Product Formula for the degree (Exercise 5.2.28) provided balls $B_i \subset X_i$, $i = 1, 2$, are chosen such that $B_1 \times B_2 \subset B(o; \varepsilon)$. Hence we obtain
\[ \deg(I - F'(x_0), B(o; \varepsilon), o) = \deg(F_1, B_1, o)\, \deg(F_2, B_2, o) \]
where $F_i$ denotes the restriction of $I - F'(x_0)$ to $X_i$, $i = 1, 2$. To compute $\deg(F_2, B_2, o)$ we introduce the homotopy
\[ H_2(t, y) = y - tF'(x_0)y, \qquad t \in [0, 1],\ y \in X_2. \]
Assume that $H_2(t, y) = o$ for a $y \ne o$. Then $t \in (0, 1)$ and $\frac{1}{t} \in \sigma(F'(x_0))$ and, by the definition of $X_1$, $y \in X_1$. Since $X_1 \cap X_2 = \{o\}$ we arrive at a contradiction. This consideration shows that we may apply the homotopy invariance property to $H_2$ to get
\[ \deg(F_2, B_2, o) = \deg(I, B_2, o) = 1. \]
The degree $\deg(F_1, B_1, o)$ is the Brouwer degree of the linear operator $F_1$ in the finite dimensional space $X_1$, and it was computed in Example 5.2.10. Notice that $A$ is here $(I - F'(x_0))|_{X_1}$, and thus the eigenvalues $\mu < 0$ of $A$ correspond to the eigenvalues $\lambda > 1$ of $F'(x_0)$ via $\mu = 1 - \lambda$. This shows that
\[ \deg(F_1, B_1, o) = (-1)^{\beta}. \]
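In finite dimensions the Index Formula can be compared directly with the sign of a determinant. The matrices below are our own illustrative choices, with $X = \mathbb{R}^2$, $\Omega = B(o; 1)$ and $F(x) = Mx$.

```latex
% Illustration only: X = \mathbb{R}^2, F(x) = Mx, x_0 = o (assumed choices).
\[
  M = \begin{pmatrix} 2 & 0 \\ 0 & \tfrac12 \end{pmatrix}:
  \qquad \{\lambda \in \sigma(M) : \lambda > 1\} = \{2\},\quad \beta = 1,\quad
  (-1)^{\beta} = -1,
\]
\[
  \operatorname{sgn}\det(I - M)
  = \operatorname{sgn}\bigl((-1)\cdot\tfrac12\bigr) = -1 .
\]
% A second matrix with two eigenvalues larger than 1:
\[
  M = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}:
  \qquad \beta = 2,\quad (-1)^{\beta} = +1
  = \operatorname{sgn}\bigl((-1)(-2)\bigr) = \operatorname{sgn}\det(I - M).
\]
```

Eigenvalues of $M$ smaller than $1$ give positive factors of $\det(I - M)$ and so do not affect the sign, which is why only $\lambda > 1$ is counted in (5.2.18).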

Theorem 5.2.23 (Krasnoselski Local Bifurcation Theorem). Let $U$ be a neighborhood of $o$ in a Banach space $X$, and let
\[ f(\lambda, x) = x - \lambda Ax - G(\lambda, x), \qquad \lambda \in J,\ x \in U, \]
where $J$ is an open interval in $\mathbb{R}$, $A$ is a linear compact operator on $X$, $G(\lambda, \cdot) : U \to X$ is a compact operator and
\[ \lim_{x \to o} \frac{G(\lambda, x)}{\|x\|} = o \quad\text{for all } \lambda \in J. \]
If $\lambda_0 \in J$ is such that $\frac{1}{\lambda_0}$ is an eigenvalue of $A$ of odd multiplicity, then $(\lambda_0, o)$ is a bifurcation point of $f$.
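The simplest instance of this theorem is scalar, and there the bifurcating branch can be written down explicitly. The data below ($X = \mathbb{R}$, $A = I$, $G(\lambda, x) = -x^3$) are our own assumptions chosen for illustration.

```latex
% Illustration only: X = \mathbb{R}, A = I, G(\lambda, x) = -x^3 (assumed choices).
\[
  f(\lambda, x) = x - \lambda x + x^3 = x\,(1 - \lambda + x^2),
  \qquad \frac{|G(\lambda, x)|}{|x|} = x^2 \to 0 \quad (x \to 0).
\]
% The only eigenvalue of A is 1, of multiplicity 1, so \lambda_0 = 1.
% Nontrivial solutions:
\[
  x = \pm\sqrt{\lambda - 1}, \qquad \lambda > 1,
\]
% which emanate from (\lambda_0, o) = (1, 0). The index of the trivial solution,
% computed on a sufficiently small interval around 0, is
\[
  i(\lambda) = \operatorname{sgn}\,\partial_x f(\lambda, 0)
             = \operatorname{sgn}(1 - \lambda)
             = \begin{cases} +1, & \lambda < 1,\\ -1, & \lambda > 1, \end{cases}
\]
% so the jump condition (5.2.17) of Proposition 5.2.20 holds at \lambda_0 = 1.
```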


Proof. Suppose that $(\lambda_0, o)$ is not a bifurcation point. Then there is a neighborhood $\tilde J \times V$ of $(\lambda_0, o)$ such that the equation
\[ f(\lambda, x) = o \]
has for every $\lambda \in \tilde J$ a unique solution in $V$, namely $x = o$. We may assume that $0 \notin \tilde J$, $\{\lambda : \frac{1}{\lambda} \in \sigma(A)\} \cap \tilde J = \{\lambda_0\}$, and also that $\lambda_0 > 0$ (for $\lambda_0 < 0$ consider $\tilde f(\lambda, x) = f(-\lambda, x)$). For $\lambda \in \tilde J \setminus \{\lambda_0\}$ the degree $\deg(f(\lambda, \cdot), V, o)$ is defined and it is given by the Leray–Schauder Index Formula (Proposition 5.2.22):
\[ \deg(f(\lambda, \cdot), V, o) = (-1)^{\beta} \quad\text{where}\quad \beta = \sum_{\substack{\mu \in \sigma(\lambda A) \cap \mathbb{R} \\ \mu > 1}} m(\mu) = \sum_{\substack{\kappa \in \sigma(A) \cap \mathbb{R} \\ \kappa > \frac{1}{\lambda}}} m(\kappa). \]
Since $m\bigl(\frac{1}{\lambda_0}\bigr)$ is odd, the degree $\deg(f(\lambda, \cdot), V, o)$ changes sign at $\lambda_0$. A contradiction follows now from Proposition 5.2.20.

The Krasnoselski Theorem 5.2.23 is of a local nature and does not say anything about the global behavior of a “branch” of nontrivial solutions of the equation $f(\lambda, x) = o$. The so-called global bifurcation theorems describe these branches (see Appendix 5.2A). The interested reader can also consult, e.g., Rabinowitz [103, pages 11–36], Izé [69], Nirenberg [100, Chapter 3], Krasnoselski & Zabreiko [79], Krawcewicz & Wu [80]. There are also methods depending on other topological tools. See, e.g., Alexander [3, pages 457–483] or Fitzpatrick [50] and references given there.

Remark 5.2.24 (Comparison of Theorems 4.3.22 and 5.2.23). Let $\frac{1}{\lambda_0}$ be an eigenvalue of $A$ and
\[ \dim \operatorname{Ker}\Bigl(\frac{1}{\lambda_0} I - A\Bigr) = 1. \]
Denote by $x_0$, $\|x_0\| = 1$, an eigenvector of $A$ associated with $\frac{1}{\lambda_0}$. Let us compare the assumptions of Theorems 4.3.22 and 5.2.23. One of the essential differences consists in the smoothness assumptions: while Theorem 4.3.22 applied to $f(\lambda, x) = x - \lambda Ax - G(\lambda, x)$ requires $G$ to be a $C^2$-mapping, Theorem 5.2.23 demands only that $G$ be compact (and so continuous). The assumption $G(\lambda, x) = o(\|x\|)$, $x \to o$, yields
\[ f_2'(\lambda_0, o) = I - \lambda_0 A, \qquad f_{1,2}''(\lambda_0, o) = -A. \tag{5.2.19} \]
Theorem 4.3.22 requires
\[ \operatorname{Im}(I - \lambda_0 A) = \overline{\operatorname{Im}(I - \lambda_0 A)}, \tag{5.2.20} \]
\[ \operatorname{codim} \operatorname{Im}(I - \lambda_0 A) = 1, \tag{5.2.21} \]
\[ f_{1,2}''(\lambda_0, o)(1, x_0) \notin \operatorname{Im}(I - \lambda_0 A). \tag{5.2.22} \]


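The compactness of $A$ (Theorem 5.2.23) implies that $I - \lambda_0 A$ is a Fredholm operator of index 0, hence (5.2.20) holds and also
\[ \operatorname{codim} \operatorname{Im}(I - \lambda_0 A) = \dim \operatorname{Ker}(I - \lambda_0 A) = 1, \]
i.e., (5.2.21) follows. The last assumption is closely connected with the multiplicity $m\bigl(\frac{1}{\lambda_0}\bigr)$ of $\frac{1}{\lambda_0}$, as follows from the assertion: The assumption (5.2.22) is verified if and only if
\[ m\Bigl(\frac{1}{\lambda_0}\Bigr) = \dim \bigcup_{k=1}^{\infty} \operatorname{Ker}(I - \lambda_0 A)^k = 1. \tag{5.2.23} \]
First, let us prove (5.2.23) $\Rightarrow$ (5.2.22). Assume the contrary:
\[ f_{1,2}''(\lambda_0, o)(1, x_0) \in \operatorname{Im}(I - \lambda_0 A). \]
According to (5.2.19) it means $-Ax_0 \in \operatorname{Im}(I - \lambda_0 A)$. Since $x_0 = \lambda_0 Ax_0$, we have $x_0 \in \operatorname{Im}(I - \lambda_0 A)$ as well. Then there exists $w \in X$ such that $x_0 = w - \lambda_0 Aw$. But $x_0 \in \operatorname{Ker}(I - \lambda_0 A)$, i.e., $w \in \operatorname{Ker}(I - \lambda_0 A)^2$. Since $x_0 \ne o$, we have $w \notin \operatorname{Ker}(I - \lambda_0 A)$, which implies
\[ \dim \operatorname{Ker}(I - \lambda_0 A)^2 > \dim \operatorname{Ker}(I - \lambda_0 A), \quad\text{i.e.,}\quad m\Bigl(\frac{1}{\lambda_0}\Bigr) > 1, \]
a contradiction.

Now, let us prove (5.2.22) $\Rightarrow$ (5.2.23). Take
\[ w \in \operatorname{Ker}(I - \lambda_0 A)^2 \quad\text{and set}\quad u = (I - \lambda_0 A)w. \]
Then $(I - \lambda_0 A)u = (I - \lambda_0 A)^2 w = o$, which implies $u \in \operatorname{Ker}(I - \lambda_0 A)$. Since $\operatorname{Ker}(I - \lambda_0 A)$ is generated by $x_0$, there exists $a \in \mathbb{R}$ such that $u = ax_0$. Simultaneously, $u = (I - \lambda_0 A)w \in \operatorname{Im}(I - \lambda_0 A)$. For $a \ne 0$ we have
\[ -Ax_0 = -\frac{1}{\lambda_0} x_0 = -\frac{1}{\lambda_0 a}\, u \in \operatorname{Im}(I - \lambda_0 A), \]
a contradiction with (5.2.19) and (5.2.22). Hence $a = 0$ and
\[ u = (I - \lambda_0 A)w = o, \quad\text{i.e.,}\quad w \in \operatorname{Ker}(I - \lambda_0 A). \]
This proves $\operatorname{Ker}(I - \lambda_0 A)^2 \subset \operatorname{Ker}(I - \lambda_0 A)$.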


Since the opposite inclusion is evident, we have proved
\[ \operatorname{Ker}(I - \lambda_0 A)^2 = \operatorname{Ker}(I - \lambda_0 A). \]
By induction on the power $n$ we now easily prove that
\[ \operatorname{Ker}(I - \lambda_0 A)^{n+1} = \operatorname{Ker}(I - \lambda_0 A)^n \quad\text{for any } n \in \mathbb{N} \]
(do it in detail!).

Exercise 5.2.25. Prove the following assertion: Let $f$, $A$ and $G$ be as in Theorem 5.2.23. Let $\lambda_0 \ne 0$ be a bifurcation point of $f$. Then $\frac{1}{\lambda_0}$ is an eigenvalue of $A$.

Hint. If $\lambda_0$ is a bifurcation point of $f$, there are
\[ o \ne x_n \to o, \qquad \lambda_n \to \lambda_0, \qquad f(\lambda_n, x_n) = o. \]
Set $v_n := \frac{x_n}{\|x_n\|}$. Then
\[ v_n = \lambda_n Av_n + \frac{G(\lambda_n, x_n)}{\|x_n\|}. \tag{5.2.24} \]
Since $\{v_n\}_{n=1}^{\infty}$ is bounded and $A$ is compact, passing to a subsequence if necessary we may assume that $v_n \to v$ for a $v \in X$, $v \ne o$. From (5.2.24) we obtain that $v = \lambda_0 Av$.

Exercise 5.2.26. Let $F \in \mathcal{C}(\overline{\Omega}, X)$ where $\Omega \subset X$ is an open, bounded, symmetric with respect to $o \in X$, and nonempty set in a Banach space $X$, $F(x) \ne o$ for all $x \in \partial\Omega$. Assume that
\[ F(x) = -F(-x) \quad\text{for any } x \in \partial\Omega. \]
Then the Leray–Schauder degree $\deg(I - F, \Omega, o)$ is an odd number.

Hint. Use finite dimensional approximations as in the construction of the Leray–Schauder degree (see pages 277–278) and Theorem 4.3.130.

Exercise 5.2.27. Modify the proof of (5.2.9) to obtain the so-called Product Formula:
\[ \deg(g, \Omega, y_0) = \deg(f_1, \Omega_1, y_{1,0})\, \deg(f_2, \Omega_2, y_{2,0}) \]
where $g = (f_1, f_2) : \overline{\Omega} \to \mathbb{R}^{N_1 + N_2}$, $\Omega = \Omega_1 \times \Omega_2$, $y_0 = (y_{1,0}, y_{2,0})$, $\Omega_i \subset \mathbb{R}^{N_i}$, $f_i \in C(\overline{\Omega}_i, \mathbb{R}^{N_i})$, $y_{i,0} \notin f_i(\partial\Omega_i)$, $i = 1, 2$.

Exercise 5.2.28. By repeating the construction of the Leray–Schauder degree show that the Product Formula and the boundary dependence (Theorem 4.3.124(vii)) of the degree also hold for the Leray–Schauder degree.


Exercise 5.2.29. Prove the following general homotopy invariance property: Let $\Omega$ be an open bounded set in a Banach space $X$ and assume that $h = h(t, x) \in \mathcal{C}([0, 1] \times \overline{\Omega}, X)$ and
\[ x - h(t, x) \ne y_0 \quad\text{for every } x \in \partial\Omega \text{ and } t \in [0, 1]. \]
Then $\deg(I - h(t, \cdot), \Omega, y_0)$ is constant with respect to $t \in [0, 1]$.

The following two exercises use an idea similar to that of Example 5.2.14.

Exercise 5.2.30. Let $H$ be a Hilbert space and $F$ a compact operator on a bounded open set $\overline{\Omega} \subset H$ into $H$. Assume that $o \in \Omega$ and
\[ (F(x), x) \le \|x\|^2 \quad\text{for each } x \in \partial\Omega. \]
Prove that $F$ has a fixed point in $\overline{\Omega}$.

Hint. Suppose not and show that there is $t_0 \in [0, 1]$, $x_0 \in \partial\Omega$ such that $x_0 = t_0 F(x_0)$. By assumption, $t_0 = 1$.

Exercise 5.2.31. Let $F$ be a compact operator from the closed unit ball $\overline{B}(o; 1)$ of a Banach space $X$ into $X$ and, moreover, let
\[ \|x - F(x)\|^2 \ge \|F(x)\|^2 - \|x\|^2 \quad\text{for } x \in \partial B(o; 1). \]
Prove that $F$ has a fixed point in $\overline{B}(o; 1)$.

Exercise 5.2.32. Let $f$ be continuous and satisfy the following growth conditions: There are $K > 0$ and $0 < \gamma < 1$ such that the inequality
\[ |f(t, x, y)| \le K(1 + |x|^{\gamma} + |y|^{\gamma}) \quad\text{holds for } t \in [0, 1],\ x, y \in \mathbb{R}. \]
Then the boundary value problem (5.2.10) has a solution. Prove that!

Hint. Proceed similarly to Example 5.2.16. Use the equation to estimate $x \in \Sigma$ and compute $\dot x$ with the help of the special form of the kernel $G$.

Exercise 5.2.33. Apply Theorem 5.2.23 to the Dirichlet boundary value problem
\[ \begin{cases} \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, & t \in (0, \pi), \\ x(0) = x(\pi) = 0, \end{cases} \]
and show that every point $(k^2, o)$, $k = 1, 2, \dots$, is a bifurcation point.
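A possible starting point for Exercise 5.2.33 is sketched below. It assumes, beyond what the exercise states here, that $g$ is continuous and $g(\lambda, t, x) = o(|x|)$ as $x \to 0$ uniformly in $t$ and locally uniformly in $\lambda$; the kernel $K$ and the working space $X = C[0, \pi]$ are our own notation.

```latex
% Sketch only; K, X = C[0,\pi] and the hypothesis on g are assumptions.
% Green function of -\ddot x = y, x(0) = x(\pi) = 0:
\[
  K(t,s) = \begin{cases}
     \dfrac{s(\pi - t)}{\pi}, & 0 \le s \le t \le \pi,\\[1ex]
     \dfrac{t(\pi - s)}{\pi}, & 0 \le t < s \le \pi,
  \end{cases}
  \qquad (Ay)(t) := \int_0^\pi K(t,s)\,y(s)\,ds .
\]
% The boundary value problem is then equivalent to
\[
  x = \lambda A x + G(\lambda, x), \qquad
  G(\lambda, x) := A\bigl[g(\lambda, \cdot, x(\cdot))\bigr].
\]
% Eigenvalues of A: Ax = \mu x iff \ddot x + \mu^{-1} x = 0, x(0) = x(\pi) = 0,
\[
  \mu_k = \frac{1}{k^2}, \qquad x_k(t) = \sin kt, \qquad k = 1, 2, \dots,
\]
% each with a one-dimensional eigenspace, so 1/\lambda_0 = \mu_k for \lambda_0 = k^2.
```

Odd multiplicity in the sense of Theorem 5.2.23 still requires checking that $\operatorname{Ker}(I - k^2 A)^2 = \operatorname{Ker}(I - k^2 A)$, together with the compactness of $A$ and $G(\lambda, \cdot)$; these steps remain part of the exercise.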


5.2A Global Bifurcation Theorem

In this appendix we study the bifurcation equation
\[ f(\lambda, x) := x - \lambda Ax - G(\lambda, x) = o. \tag{5.2.25} \]
The following result is due to Rabinowitz [103, pp. 11–36], Rabinowitz [104].

Theorem 5.2.34 (Rabinowitz Global Bifurcation Theorem). Let $X$ be a Banach space, $\Omega$ an open set in $\mathbb{R} \times X$, $(\lambda_0, o) \in \Omega$, $\lambda_0 \ne 0$. Let us assume:
\[ A \text{ is a compact linear operator from } X \text{ into } X, \tag{5.2.26} \]
\[ G \text{ is a compact (nonlinear) operator from } \Omega \text{ into } X, \tag{5.2.27} \]
\[ \text{for any bounded set } M \subset \{v \in \mathbb{R} : (v, o) \in \Omega\} \text{ we have } G(\lambda, x) = o(\|x\|),\ x \to o, \text{ uniformly for } \lambda \in M, \tag{5.2.28} \]
\[ \frac{1}{\lambda_0} \text{ is an eigenvalue of } A \text{ of odd multiplicity.} \tag{5.2.29} \]
Denote by $S$ the closure of all solutions of (5.2.25) with $x \ne o$, i.e.,
\[ S = \overline{\{(\lambda, x) \in \Omega : x \ne o,\ f(\lambda, x) = o\}}. \]
Then $S$ contains the point $(\lambda_0, o)$.$^{31}$ Let $C$ be a component of $S$ which contains $(\lambda_0, o)$. Then at least one of the following assertions holds:

(i) $C$ is not a compact set in $\Omega$.

(ii) $C$ contains an even number of points $(\lambda, o)$ where $\frac{1}{\lambda}$ is an eigenvalue of $A$ of odd multiplicity.

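Proof. We shall follow the proof of Izé [69]. The idea is the following. We will assume that $C$ is compact, and prove that it contains an even number of points described in (ii). Since $C$ is compact, it contains only a finite number of points $(\lambda, o)$ where $\lambda \ne 0$ and $\frac{1}{\lambda}$ is an eigenvalue of the compact operator $A$ (see Figure 5.2.5): We shall denote them by $(\lambda_0, o), \dots, (\lambda_{k-1}, o)$. Since $C$ is a component of $S$ in $\Omega$ and $S$ is closed, there exists an open bounded set $\tilde\Omega$ such that $C \subset \tilde\Omega$ and $S \cap \partial\tilde\Omega = \emptyset$. We prove that $\tilde\Omega$ can be chosen in such a way that

“$(\lambda_j, o) \in \tilde\Omega$, $j = 0, 1, \dots, k - 1$, but $(\lambda, o) \notin \tilde\Omega$ for $\frac{1}{\lambda} \in \sigma(A)$, $\lambda \ne \lambda_j$, $j = 0, 1, \dots, k - 1$”

(see Figure 5.2.6). Indeed, let $U$ be a $\delta$-neighborhood of $C$ such that $U \setminus C$ does not contain any point $(\lambda, o)$, $\lambda \ne 0$, $\frac{1}{\lambda} \in \sigma(A)$. The set $K = \overline{U} \cap S$ is then compact,$^{32}$ and obviously $C \cap (\partial U \cap S) = \emptyset$. By Deimling [34, Lemma 29.1] there exist compact disjoint sets $K_1, K_2 \subset K$ such that
\[ K = K_1 \cup K_2, \qquad C \subset K_1, \qquad \partial U \cap S \subset K_2. \]

$^{31}$ I.e., $(\lambda_0, o)$ is a bifurcation point in the sense of Definition 4.3.21.

$^{32}$ The reader is invited to prove it using the compactness of $A$ and $G$.

Figure 5.2.5. (Sketch in the $(\lambda, X)$-plane of the set $S$ and of the component $C$, which meets the $\lambda$-axis at the points $\lambda_0$, $\lambda_1$, $\lambda_2$, $\lambda_3$; the point $(0, o)$ is marked.)

Figure 5.2.6. (Sketch of the $\delta$-neighborhood $U$ of $C$, the sets $K_1$, $K_2$ and $\partial U \cap S$, the $\varepsilon_0$-neighborhood $\tilde\Omega$, and the neighborhoods $U_0(\varepsilon, \varepsilon), \dots, U_3(\varepsilon, \varepsilon)$ of the points $\lambda_0, \dots, \lambda_3$.)

Hence $\tilde\Omega$ can be chosen as an $\varepsilon_0$-neighborhood of $K_1$ with
\[ \varepsilon_0 < \min\{\operatorname{dist}(K_1, K_2),\ \operatorname{dist}(K_1, \partial U),\ \delta\}. \]
For any $r > 0$ define $f_r : \tilde\Omega \to \mathbb{R} \times X$ as
\[ f_r(\lambda, x) = (\|x\|^2 - r^2,\ f(\lambda, x)). \tag{5.2.30} \]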


Then obviously
\[ f_r(\lambda, x) = o \iff f(\lambda, x) = o \text{ and } \|x\| = r. \]
(In other words, the function $f_r$ “considers” the solutions of $f(\lambda, x) = o$ which belong to the sphere $\|x\| = r$.) Then thanks to the choice of $\tilde\Omega$ and the homotopy invariance property of the degree (Theorem 5.2.13(vi)), we conclude that
\[ \deg(f_r, \tilde\Omega, o) \]
is well defined and independent of $r > 0$. The rest of the proof consists in the calculation of this degree for sufficiently large $r$ and for sufficiently small $r$.

Step 1 (sufficiently large $r$). The boundedness of $\tilde\Omega$ implies that there exists $C > 0$ such that $\|x\| < C$ for any $(\lambda, x) \in \tilde\Omega$. Then for $r > C$ the equation
\[ f_r(\lambda, x) = o \]
has no solution in $\tilde\Omega$, and so, according to Theorem 5.2.13(iii), we have
\[ \deg(f_r, \tilde\Omega, o) = 0. \]

Step 2 (sufficiently small $r$). For $j = 0, 1, \dots, k - 1$ set
\[ U_j(\varepsilon, r) := \{(\lambda, x) : \|x\|^2 + |\lambda - \lambda_j|^2 < r^2 + \varepsilon^2\}, \]
and choose $\varepsilon > 0$ so small that the sets $U_j(\varepsilon, \varepsilon)$ are pairwise disjoint, all belong to $\tilde\Omega$, and do not contain $(0, o)$ (see Figure 5.2.6). We prove first that there exists $r > 0$ ($r \le \varepsilon$) such that
\[ x - \lambda Ax - tG(\lambda, x) \ne o \tag{5.2.31} \]
for all $t \in [0, 1]$, $(\lambda, x) \in \tilde\Omega$, $0 < \|x\| \le r$, $|\lambda - \lambda_j| \ge \varepsilon$, $j = 0, 1, \dots, k - 1$. Indeed, assume via contradiction that such $r > 0$ does not exist. Then there exist $t_n \in [0, 1]$ and $(\lambda_n, x_n) \in \tilde\Omega$, $n \in \mathbb{N}$, $o \ne x_n \to o$, $|\lambda_n - \lambda_j| \ge \varepsilon$, $j = 0, 1, \dots, k - 1$, not satisfying (5.2.31), i.e.,
\[ x_n - \lambda_n Ax_n - t_n G(\lambda_n, x_n) = o. \tag{5.2.32} \]
We can assume, without loss of generality, that $\lambda_n \to \tilde\lambda$. It follows from the construction of $\tilde\Omega$ that $\frac{1}{\tilde\lambda} \notin \sigma(A)$. On the other hand, it follows from (5.2.32) that (setting $y_n = \frac{x_n}{\|x_n\|}$)
\[ y_n - \lambda_n Ay_n - t_n \frac{G(\lambda_n, x_n)}{\|x_n\|} = o. \tag{5.2.33} \]
Now, the compactness of $A$ and (5.2.28) imply that for a $y \ne o$ ($y_{n_k} \to y$ for a subsequence) we have
\[ y - \tilde\lambda Ay = o, \]
a contradiction. We shall write $U_j = U_j(\varepsilon, r)$ for simplicity. It follows from Theorem 5.2.13(iv) that
\[ \deg(f_r, \tilde\Omega, o) = \sum_{j=0}^{k-1} \deg(f_r, U_j, o). \tag{5.2.34} \]

Let $\lambda_j$ be fixed. It follows from the choice of $\varepsilon > 0$ that for $0 < |\lambda - \lambda_j| \le \varepsilon$ we have $\frac{1}{\lambda} \notin \sigma(A)$. Then for any such $\lambda$ the degree $\deg(I - \lambda A, B(o; r), o)$ is well defined. Moreover, the homotopy invariance property of the degree implies that it is locally constant with respect to $\lambda$. Denote

$$i_j^- = \deg(I - (\lambda_j - \varepsilon)A, B(o; r), o), \qquad i_j^+ = \deg(I - (\lambda_j + \varepsilon)A, B(o; r), o).$$

It follows from Lemma 5.2.35 below that

$$\deg(f_r, U_j, o) = i_j^- - i_j^+. \tag{5.2.35}$$

If $m_j$ is the multiplicity of $\frac{1}{\lambda_j}$, then Proposition 5.2.22 yields

$$i_j^+ = (-1)^{m_j} i_j^-.$$

Hence for $m_j$ even we obtain

$$\deg(f_r, U_j, o) = i_j^- - i_j^+ = 0, \tag{5.2.36}$$

while for $m_j$ odd we have

$$\deg(f_r, U_j, o) = 2 i_j^-. \tag{5.2.37}$$

It follows from (5.2.34)–(5.2.37) that

$$\deg(f_r, \tilde\Omega, o) = 2 \sum_{\substack{j=0 \\ m_j \text{ odd}}}^{k-1} i_j^-.$$

Since this degree is independent of $r$, it must be equal to zero (see Step 1 of this proof). Since each $i_j^-$ equals $\pm 1$, there must be an even number of eigenvalues of odd algebraic multiplicity among $\lambda_0, \dots, \lambda_{k-1}$.

Now we prove an analogue of the Leray–Schauder Index Formula (see Proposition 5.2.22).

Lemma 5.2.35. Let $f_r$, $U_j$, $i_j^-$, $i_j^+$ be as above. Then

$$\deg(f_r, U_j, o) = i_j^- - i_j^+.$$

Proof. We will connect $f_r$ with a simpler mapping using a suitable homotopy. Let us define this homotopy in the following way: for all $t \in [0, 1]$,

$$f_{t,r} : \overline{U_j} \to \mathbb{R} \times X : \quad f_{t,r}(\lambda, x) = (\tau_t, y_t),$$

$$\tau_t = t(\|x\|^2 - r^2) + (1 - t)(\varepsilon^2 - (\lambda - \lambda_j)^2), \qquad y_t = x - \lambda Ax - tG(\lambda, x).$$

We prove that for any $t \in [0, 1]$

$$o \notin f_{t,r}(\partial U_j).$$

Assume the contrary, i.e., there exist $t \in [0, 1]$ and $(\lambda, x) \in \partial U_j$ such that $f_{t,r}(\lambda, x) = o$. The fact that $(\lambda, x) \in \partial U_j$ implies

$$\|x\|^2 + (\lambda - \lambda_j)^2 = r^2 + \varepsilon^2.$$

At the same time, from

$$0 = \tau_t = t(\|x\|^2 + (\lambda - \lambda_j)^2) - t(r^2 + \varepsilon^2) + \varepsilon^2 - (\lambda - \lambda_j)^2$$

we obtain $\lambda = \lambda_j \pm \varepsilon$, and so $\|x\| = r$. This together with $y_t = o$ contradicts (5.2.31). The homotopy invariance property of the degree then implies

$$\deg(f_r, U_j, o) = \deg(f_{0,r}, U_j, o).$$

The mapping $f_{0,r}$ is now easier to deal with. Indeed, the point $o$ has two preimages, $(\lambda_j - \varepsilon, o)$ and $(\lambda_j + \varepsilon, o)$, with respect to the mapping

$$f_{0,r}(\lambda, x) = (\varepsilon^2 - (\lambda - \lambda_j)^2,\ x - \lambda Ax).$$

At both points the Fréchet differential $f'_{0,r}$ is injective:

$$f'_{0,r}(\lambda, o)(\hat\lambda, u) = (-2(\lambda - \lambda_j)\hat\lambda,\ u - \lambda Au).$$

Let us choose sufficiently small neighborhoods of the points $(\lambda_j \pm \varepsilon, o)$ in the following way: let $V^{\pm}$ be small neighborhoods of the points $\lambda_j \pm \varepsilon$ and let $U$ be a small neighborhood of $o$ in $X$ such that $U^{\pm} := V^{\pm} \times U \subset U_j$. We have

$$\deg(f_{0,r}, U_j, o) = \deg(f_{0,r}, U^-, o) + \deg(f_{0,r}, U^+, o)$$

and, by the Product Formula (Exercise 5.2.28) and Proposition 5.2.22,

$$\deg(f_{0,r}, U^-, o) = \deg(I - (\lambda_j - \varepsilon)A, U, o) \cdot \deg(\varepsilon^2 - (\cdot - \lambda_j)^2, V^-, o) = i_j^- \cdot 1.$$

Similarly, we get

$$\deg(f_{0,r}, U^+, o) = i_j^+ \cdot (-1).$$

This completes the proof.
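The last step of the proof of Lemma 5.2.35 can be checked by hand in a finite-dimensional model. The sketch below is only an illustration under assumptions that are not in the text: $X = \mathbb{R}^m$, $A$ is a diagonal matrix (so the degree of an invertible linear map is the sign of its determinant), and all function names and numerical data are hypothetical.

```python
def sign(v):
    """Return +1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def index_linear(lam, diag_a):
    """Degree of I - lam*A at o for a diagonal A: sign of det(I - lam*A)."""
    det = 1.0
    for a in diag_a:
        det *= 1.0 - lam * a
    return sign(det)


def local_degrees(lam_j, eps, diag_a):
    """Contributions of the two zeros (lam_j -/+ eps, o) of
    f_0(lam, x) = (eps^2 - (lam - lam_j)^2, x - lam*A x).

    The Jacobian at (lam, o) is block diagonal with blocks
    -2*(lam - lam_j) and I - lam*A, so the sign of its determinant
    is the product of the signs of the two blocks.
    """
    out = []
    for lam in (lam_j - eps, lam_j + eps):
        out.append(sign(-2.0 * (lam - lam_j)) * index_linear(lam, diag_a))
    return out


# Hypothetical data: A = diag(1, 1/4, 1/9), so 1/lam_j is a simple
# eigenvalue of A for lam_j = 4.
diag_a = [1.0, 0.25, 1.0 / 9.0]
lam_j, eps = 4.0, 0.5

i_minus = index_linear(lam_j - eps, diag_a)
i_plus = index_linear(lam_j + eps, diag_a)
deg_f0 = sum(local_degrees(lam_j, eps, diag_a))
print(i_minus, i_plus, deg_f0)
```

In this model the eigenvalue has multiplicity one, so the sum of the two local contributions agrees both with $i_j^- - i_j^+$ and with $2 i_j^-$, as in (5.2.35) and (5.2.37).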

Corollary 5.2.36. If $\Omega = \mathbb{R} \times X$ in Theorem 5.2.34, then the first possibility (i) reduces to "$C$ is unbounded in $\mathbb{R} \times X$" and (ii) remains unchanged.

Proof. Let $(\lambda, x) \in C$. Then $x = \lambda Ax + G(\lambda, x)$. This implies that if $C$ is bounded in $\mathbb{R} \times X$, it is also relatively compact because $T(\lambda, x) = \lambda Ax + G(\lambda, x)$ is a compact operator. But $C$ is closed, and so it is compact. We have thus proved that if $C$ is bounded, it is also compact.

We will now discuss the special case when $\frac{1}{\lambda_0}$ is an eigenvalue of $A$ the multiplicity of which is equal to 1. If this is the case in Theorem 5.2.34 and $C$ is the component containing the point $(\lambda_0, o)$, then $C$ consists of two connected sets $C^{\pm}$ which near $(\lambda_0, o)$ meet only in $(\lambda_0, o)$. More precisely, the next assertion holds (see Deimling [34, Corollary 29.1]).

Corollary 5.2.37. Under the hypotheses of Theorem 5.2.34 suppose, in addition, that the multiplicity of $\frac{1}{\lambda_0}$ is 1. Then the component $C$ containing $(\lambda_0, o)$ consists of two connected sets $C^+$ and $C^-$, $C = C^+ \cup C^-$, such that

$$C^+ \cap C^- \cap B((\lambda_0, o); \varrho) = \{(\lambda_0, o)\} \quad \text{and} \quad C^{\pm} \cap \partial B((\lambda_0, o); \varrho) \neq \emptyset$$

for sufficiently small $\varrho > 0$.

The meaning of $C^{\pm}$ is the following. Let us assume that $(\lambda_n, x_n) \in C^{\pm}$, $\lambda_n \to \lambda_0$ and $x_n \to o$. Then similarly to Exercise 5.2.25 we prove that $\frac{x_n}{\|x_n\|} \to \pm v_0$ where $v_0 \neq o$ is a normalized eigenvector associated with the eigenvalue $\lambda_0$. In other words, the sets $C^{\pm}$ describe the "branches" of nontrivial solutions which bifurcate in the direction of the eigenvectors $\pm v_0$ (see Figure 5.2.7 for the projections of $C^{\pm}$ into the space $X$).

Figure 5.2.7. (Sketch: the projections of $C^+$ and $C^-$ into $X$, leaving $o$ in the directions $v_0$ and $-v_0$.)

The global properties of $C^{\pm}$ were studied by Dancer [30]. The main result of this paper can be formulated as follows.

Theorem 5.2.38 (Dancer Global Bifurcation Theorem). The sets $C^+$ and $C^-$ are either both unbounded, or $C^+ \cap C^- \neq \{(\lambda_0, o)\}$.

Example 5.2.39 (Application of the Dancer Global Bifurcation Theorem). Let us consider the Dirichlet boundary value problem

$$\begin{cases} \ddot x(t) + \lambda x(t) = g(\lambda, t, x(t)), & t \in (0, \pi), \\ x(0) = x(\pi) = 0. \end{cases} \tag{5.2.38}$$

We assume that $g = g(\lambda, t, s)$ is a continuous function from $\mathbb{R} \times [0, \pi] \times \mathbb{R}$ into $\mathbb{R}$ and, given any bounded interval $I \subset \mathbb{R}$,

$$\lim_{s \to 0} \frac{g(\lambda, t, s)}{s} = 0 \tag{5.2.39}$$

holds uniformly with respect to $t \in [0, \pi]$ and $\lambda \in I$. In particular,

$$g(\lambda, t, 0) = 0, \qquad t \in [0, \pi],\ \lambda \in \mathbb{R},$$

and so (5.2.38) has a trivial solution. In this example we discuss the existence and properties of nontrivial weak solutions of (5.2.38). Let $X := W_0^{1,2}(0, \pi)$ and define operators $A, G : X \to X$ as follows:

$$(Ax, y) = \int_0^{\pi} x(t)y(t)\,dt, \qquad (G(\lambda, x), y) = \int_0^{\pi} g(\lambda, t, x(t))y(t)\,dt \qquad \text{for any } x, y \in X.$$

The existence of a weak solution of (5.2.38) is equivalent to the existence of a solution of the operator equation (5.2.25), i.e.,

$$x - \lambda Ax - G(\lambda, x) = o,$$

cf. Example 5.3.11. Moreover, (5.2.39) implies (5.2.28) (the reader is invited to check it). Set $\lambda_0 = n^2$ where $n \in \mathbb{N}$ is fixed. Then $\frac{1}{\lambda_0} = \frac{1}{n^2}$ is an eigenvalue of $A$ of multiplicity 1. It follows from the above results (Theorems 5.2.34 and 5.2.38) that there is a component $C$ of $S$ which contains nontrivial solutions of (5.2.25), and such that

$$C = C^+ \cup C^-, \qquad (n^2, o) \in C^+ \cap C^-,$$

and $C^{\pm}$ are either both unbounded, or $C^+ \cap C^- \neq \{(n^2, o)\}$. We show that the latter case cannot occur if $g = g(\lambda, t, s)$ is locally Lipschitz continuous with respect to the third variable $s$ (cf. page 93). To prove this fact the properties of the initial value problem (with $\lambda$ fixed)

$$\begin{cases} \ddot x(t) + \lambda x(t) - g(\lambda, t, x(t)) = 0, \\ x(t_0) = x_0, \quad \dot x(t_0) = x_1, \end{cases} \qquad t_0 \in [0, \pi], \tag{5.2.40}$$

play a crucial role. In particular, we use the uniqueness of the solution to (5.2.40), which in turn implies that (5.2.40) with $x_0 = x_1 = 0$ has only the trivial solution. The regularity result (cf. Remark 5.3.10 and Exercise 5.3.26) for weak solutions of (5.2.38) yields that for any $(\lambda, x) \in C^{\pm}$ we have $x \in C^2[0, \pi]$, and the above mentioned uniqueness result for (5.2.40) also implies that any such $x$ has only a finite number of nodes in $(0, \pi)$. Let

$$v_0(t) = \frac{1}{n}\left(\frac{2}{\pi}\right)^{\frac{1}{2}} \sin nt$$

be a normalized eigenfunction associated with the eigenvalue $\frac{1}{n^2}$ of $A$. Consider $(\lambda_k, x_k) \in C^+$ such that $\lambda_k \to n^2$ and $x_k \to o$. Then

$$x_k - \lambda_k Ax_k - G(\lambda_k, x_k) = o.$$

The reflexivity of $X$, (5.2.39) and the compactness of $A$ imply that

$$v_k := \frac{x_k}{\|x_k\|} \to v_0 \quad \text{in } X.$$

The embedding $X = W_0^{1,2}(0, \pi) \subset C[0, \pi]$ and the fact that

$$v_k = \lambda_k Av_k + \frac{G(\lambda_k, x_k)}{\|x_k\|} \tag{5.2.41}$$

then yield that $v_k \to v_0$ even in $C^2[0, \pi]$. In particular, it means that for large enough $k$, the functions $x_k$ share the nodal properties of $v_0$. More precisely, let

$$A^+ := \{(\lambda, x) \in C^+ : x \text{ has exactly } (n - 1) \text{ nodes in } (0, \pi) \text{ and } \dot x(0) > 0\},$$
$$A^- := \{(\lambda, x) \in C^- : x \text{ has exactly } (n - 1) \text{ nodes in } (0, \pi) \text{ and } \dot x(0) < 0\}.$$

Then there exists $\varrho_0 > 0$ such that

$$C^{\pm} \cap B((n^2, o); \varrho) = A^{\pm} \qquad \text{for any } 0 \le \varrho \le \varrho_0.$$

In particular, $A^{\pm} \neq \emptyset$. We show that $A^{\pm}$ is closed and open in $C^{\pm}$. Let us consider $C^+$; the case of $C^-$ is similar. Recall that $C^+$ is a connected set with respect to the topology induced by the topology on $\mathbb{R} \times X$. For a given $(\hat\lambda, \hat x) \in C^+$ the convergence $(\lambda_k, x_k) \to (\hat\lambda, \hat x)$ in this topology means that

$$\lambda_k \to \hat\lambda \ \text{ in } \mathbb{R} \quad \text{and} \quad x_k \to \hat x \ \text{ in } X.$$

The above mentioned regularity result and the embedding $X = W_0^{1,2}(0, \pi) \subset C[0, \pi]$ then imply that

$$x_k \to \hat x \quad \text{in } C^2[0, \pi].$$

Let us assume that

$$(\lambda_k, x_k) \in A^+, \qquad (\lambda_k, x_k) \neq (\lambda, o), \qquad (\lambda_k, x_k) \to (\hat\lambda, \hat x) \in C^+.$$

The fact that $x_k \to \hat x$ in $C^2[0, \pi]$ then yields that $(\hat\lambda, \hat x) \in A^+$, i.e., $A^+$ is closed in $C^+$. On the other hand, if $(\hat\lambda, \hat x) \in A^+$, $(\hat\lambda, \hat x) \neq (n^2, o)$, then there exists $\hat\varrho > 0$ such that

$$C^+ \cap B((\hat\lambda, \hat x); \hat\varrho) \subset A^+,$$

for otherwise there would be $\lambda_k \to \hat\lambda$ and $x_k \to \hat x$ in $C^2[0, \pi]$, $(\lambda_k, x_k) \notin A^+$, $(\lambda_k, x_k) \in C^+$, a contradiction. Hence $A^+$ is open in $C^+$.

We have just proved that $A^{\pm} = C^{\pm}$ and so the sets $C^+$ and $C^-$ do not have any common point besides $(n^2, o)$. According to Theorem 5.2.38 both $C^+$ and $C^-$ are unbounded in $\mathbb{R} \times X$. Let us emphasize that this means that $C^{\pm}$ are unbounded either with respect to $x$, or with respect to $\lambda$ (or with respect to both $x$ and $\lambda$!). Some further properties of $g$ might provide more information about the sets $C^{\pm}$ (e.g., boundedness with respect to $x$, if there are a priori estimates for all solutions, and unboundedness with respect to $\lambda$; or vice versa, boundedness with respect to $\lambda$ and unboundedness with respect to $x$).

Exercise 5.2.40. Consider the boundary value problem (5.2.38) and apply Theorem 5.2.34 to get conclusions about the bifurcation branches. Formulate further assumptions on $g$ which will imply unboundedness of the branches with respect to $x$ and $\lambda$, respectively.

Exercise 5.2.41. Consider the Neumann boundary value problem

$$\begin{cases} \ddot x(t) + \lambda x(t) = g(\lambda, t, x(t)), & t \in (0, \pi), \\ \dot x(0) = \dot x(\pi) = 0. \end{cases} \tag{5.2.42}$$

Find conditions on $g = g(\lambda, t, s)$ and $\lambda$ making it possible to apply Theorem 5.2.34.

Exercise 5.2.42. Modify assumptions from Exercise 5.2.41 on $g$ so as to make it possible to apply Theorem 5.2.38 to (5.2.42) and to exclude the situation $C^+ \cap C^- \neq \{(n^2, o)\}$.

Exercise 5.2.43. Consider the periodic problem

$$\begin{cases} \ddot x(t) + \lambda x(t) = g(\lambda, t, x(t)), & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \end{cases} \tag{5.2.43}$$

Find conditions on $g = g(\lambda, t, s)$ and $\lambda$ making it possible to apply Theorem 5.2.34, cf. Example 4.3.25.

5.2B Topological Degree for Generalized Monotone Operators

Let $X$ be a reflexive real Banach space and $X^*$ its dual. We will consider the operator

$$T : X \to X^*. \tag{5.2.44}$$

The purpose of this appendix is to inform the reader about a possible method for extending the Leray–Schauder degree theory to mappings of the type (5.2.44). The following definition is the key to the theory presented in this appendix.

Definition 5.2.44. The operator $T : X \to X^*$ is said to satisfy the $(S_+)$ condition if the assumptions

$$u_n \rightharpoonup u_0 \ \text{(weakly) in } X \quad \text{and} \quad \limsup_{n \to \infty} \langle T(u_n), u_n - u_0 \rangle \le 0 \ ^{33}$$

imply

$$u_n \to u_0 \ \text{(strongly) in } X.$$

Remark 5.2.45. The topological degree for generalized monotone operators was independently introduced by Browder [19] and Skrypnik [121]. The notation $(S_+)$ is taken from Browder [19] while the same condition is called $\alpha(X)$ in Skrypnik [121]. The $(S_+)$ condition is a kind of compactness condition and plays an essential role in the construction of the degree for $T : X \to X^*$. This construction is based on the Brouwer degree and finite dimensional approximations, like the construction of the Leray–Schauder degree, and mappings satisfying the $(S_+)$ condition then play a similar role as compact perturbations of the identity. The following assertion illustrates this fact. Its proof is a straightforward consequence of Definition 5.2.44.

Lemma 5.2.46. Let $T : X \to X^*$ satisfy the $(S_+)$ condition and let $K : X \to X^*$ be a compact operator. Then the sum $T + K : X \to X^*$ satisfies the $(S_+)$ condition.

The following assertion is an analogue of Theorem 5.2.13 and of Exercise 5.2.26.

Theorem 5.2.47 (I. V. Skrypnik [121]). Let $T : X \to X^*$ be a bounded and demicontinuous³⁴ operator satisfying the $(S_+)$ condition. Let $D \subset X$ be an open, bounded and nonempty set with the boundary $\partial D$ such that $T(u) \neq o$ for $u \in \partial D$. Then there exists an integer $\deg(T, D, o)$ (called the degree of the mapping $T$) such that

(i) $\deg(T, D, o) \neq 0$ implies that there exists an element $u_0 \in D$ such that $T(u_0) = o$.

(ii) If $D$ is symmetric with respect to the origin and $T$ satisfies $T(u) = -T(-u)$ for any $u \in \partial D$, then $\deg(T, D, o)$ is an odd number (and thus different from zero).

(iii) (Homotopy invariance property) Let $T_\lambda$ be a family of bounded and demicontinuous mappings which satisfy the $(S_+)$ condition and which depend continuously on a real parameter $\lambda \in [0, 1]$, and let $T_\lambda(u) \neq o$ for any $u \in \partial D$ and $\lambda \in [0, 1]$. Then $\deg(T_\lambda, D, o)$ is constant with respect to $\lambda \in [0, 1]$. In particular, we have $\deg(T_0, D, o) = \deg(T_1, D, o)$.

³³ Here and in the sequel we denote by $\langle f, u \rangle := f(u)$ the value of the linear form $f \in X^*$ for an element $u \in X$. If $X$ is a Hilbert space, then according to the Riesz Representation Theorem, $\langle f, x \rangle = (x, f)$.
³⁴ We say that $T : X \to X^*$ is demicontinuous if $T$ maps strongly convergent sequences in $X$ to weakly convergent sequences in $X^*$.

The following assertion combined with Theorem 5.2.47(i) is a crucial tool in proving the existence of a solution or the existence of a bifurcation branch (see Appendix 7.5A).

Proposition 5.2.48 (I. V. Skrypnik [121]). Let $T : X \to X^*$ be a bounded, demicontinuous mapping satisfying the $(S_+)$ condition, $o \in D \setminus \partial D$, $T(u) \neq o$ for $u \in \partial D$, $D$ being as in Theorem 5.2.47. Let for $u \in \partial D$ the inequality

$$\langle T(u), u \rangle \ge 0$$

be valid. Then $\deg(T, D, o) = 1$.

Let $u_0 \in X$ be an isolated solution of the equation

$$T(u) = o. \tag{5.2.45}$$

Similarly to the finite dimensional case (and to the case of the Leray–Schauder degree) we define the index of an isolated solution $u_0$ as

$$i(T, u_0) = \lim_{r \to 0+} \deg(T, B(u_0; r), o).$$

Then we have the following useful property of the degree.

Proposition 5.2.49 (I. V. Skrypnik [121]). Let $T$ and $D$ be as in Theorem 5.2.47. Let $T(u) = o$ have only isolated solutions in $D$ and let $T(u) \neq o$ for $u \in \partial D$. Then there is only a finite number of solutions of (5.2.45) in $D$, $u_i$, $i = 1, \dots, n$, and the equality

$$\deg(T, D, o) = \sum_{i=1}^{n} i(T, u_i)$$

holds.
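The additivity of indices in Proposition 5.2.49 has a familiar one-dimensional analogue for the Brouwer degree, which is easy to verify numerically. The sketch below is not the degree of Theorem 5.2.47 itself; it assumes $X = \mathbb{R}$, a smooth function with simple zeros, and a hypothetical sample function $f(u) = u^3 - u$ on $D = (-2, 2)$.

```python
def sign(v):
    """Return +1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def f(u):
    """Hypothetical sample function with simple zeros at -1, 0, 1."""
    return u ** 3 - u


def df(u):
    """Derivative of f."""
    return 3 * u ** 2 - 1


def degree_on_interval(func, a, b):
    """Brouwer degree of func on (a, b) with respect to 0,
    assuming func(a) != 0 and func(b) != 0."""
    return (sign(func(b)) - sign(func(a))) // 2


zeros = [-1.0, 0.0, 1.0]
# The index of a simple zero is the sign of the derivative there.
indices = [sign(df(u)) for u in zeros]
total = degree_on_interval(f, -2.0, 2.0)
print(indices, total)
```

The degree computed from the boundary values coincides with the sum of the local indices, which is the content of the proposition in this simplified setting.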

The last assertion connects the properties of functionals and the degree of their Fréchet derivatives.

Proposition 5.2.50 (I. V. Skrypnik [121]). Assume that a real functional $F : X \to \mathbb{R}$ has a local minimum at $u_0 \in X$ and its Fréchet derivative $F' : X \to X^*$ is a bounded and demicontinuous mapping which satisfies the $(S_+)$ condition. Let, moreover, $u_0$ be an isolated solution of $F'(u_0) = o$. Then

$$i(F', u_0) = 1.$$

Example 5.2.51. Let us consider the boundary value problem

$$\begin{cases} -(|\dot x(t)|^{p-2}\dot x(t))\dot{} - g(x(t)) = f(t), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases} \tag{5.2.46}$$

where $p > 1$, $f \in L^{p'}(0, 1)$, $p' = \frac{p}{p-1}$, and $g : \mathbb{R} \to \mathbb{R}$ is a continuous function. The differential operator

$$x \mapsto (|\dot x|^{p-2}\dot x)\dot{} \ ^{35}$$

is the so-called one-dimensional $p$-Laplacian (or half-linear differential operator of the second order). The parameter $\lambda \in \mathbb{R}$ for which there is a nontrivial weak solution (cf. Remark 5.3.10) $\varphi = \varphi(t)$ (i.e., not identically equal to zero in $(0, 1)$) of the problem

$$\begin{cases} -(|\dot x(t)|^{p-2}\dot x(t))\dot{} - \lambda|x(t)|^{p-2}x(t) = 0, & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases} \tag{5.2.47}$$

is called an eigenvalue of the eigenvalue problem (5.2.47) and the function $\varphi$ an eigenfunction associated with the eigenvalue $\lambda$. It is known (see, e.g., Elbert [47]) that the problem (5.2.47) has a countable set of simple eigenvalues

$$0 < \lambda_1 < \lambda_2 < \cdots, \qquad \lim_{n \to \infty} \lambda_n = \infty$$

(cf. Appendix 6.4B for the case $p > 2$) and the values of $\lambda_n$, $n = 1, 2, \dots$, can be explicitly calculated in terms of $p$ and $\pi$. The eigenfunction $\varphi_n$ associated with $\lambda_n$ is continuously differentiable and has exactly $n - 1$ zero points in $(0, 1)$. In particular, we can choose $\varphi_1(t) > 0$, $t \in (0, 1)$. (See Elbert [47], Došlý [37], Drábek, Krejčí & Takáč [41] and references given there.) However, the concrete values of $\lambda_n$ are not important in this example. Let us assume that

$$\lim_{s \to \pm\infty} \frac{g(s)}{|s|^{p-2}s} = \lambda \quad \text{where} \quad \lambda_n < \lambda < \lambda_{n+1} \quad \text{for an } n = 1, 2, \dots. \tag{5.2.48}$$

The problem (5.2.46) is then called a nonresonance problem (cf. Remark 7.5.5). Put $X := W_0^{1,p}(0, 1)$ with the norm

$$\|x\| = \left(\int_0^1 |\dot x(t)|^p\,dt\right)^{\frac{1}{p}}.$$

Let us define operators $J, G : X \to X^*$ and an element $f^* \in X^*$ by

$$\langle J(x), y \rangle = \int_0^1 |\dot x(t)|^{p-2}\dot x(t)\dot y(t)\,dt,\ ^{36} \qquad \langle G(x), y \rangle = \int_0^1 g(x(t))y(t)\,dt,$$

$$\langle f^*, y \rangle = \int_0^1 f(t)y(t)\,dt \qquad \text{for any } x, y \in X.$$

³⁵ The right-hand side is defined by $(\varphi(\dot x))\dot{}$ where $\varphi(s) = |s|^{p-2}s$, $s \neq 0$, $\varphi(0) = 0$ for $p > 1$.
³⁶ By the Hölder inequality the integral exists and defines, for a fixed $x \in X$, a continuous linear form on $X$.

If we set $T := J - G$, then the operator equation

$$T(x) = f^* \tag{5.2.49}$$

is equivalent to the requirement that the integral identity

$$\int_0^1 |\dot x(t)|^{p-2}\dot x(t)\dot y(t)\,dt - \int_0^1 g(x(t))y(t)\,dt = \int_0^1 f(t)y(t)\,dt \tag{5.2.50}$$

holds for all $y \in X$. The function $x \in X$ satisfying (5.2.50) is then a weak solution of (5.2.46) (cf. Remark 5.3.10). It follows that the existence of a weak solution of (5.2.46) is equivalent to the existence of a solution of the operator equation (5.2.49). Our plan is to use the degree argument to prove the existence of a solution of (5.2.49). First we sketch the properties of the operators $J$ and $G$. The operator $J$ satisfies

$$\langle J(x), x \rangle = \|x\|^p. \tag{5.2.51}$$

Moreover, $J$ is an odd mapping, $(p - 1)$-homogeneous,³⁷ it is bounded, continuous (and so demicontinuous) and satisfies the $(S_+)$ condition. Indeed, let $x_n \rightharpoonup x_0$ in $X$ and

$$\limsup_{n \to \infty} \langle J(x_n), x_n - x_0 \rangle \le 0.$$

Then $\lim_{n \to \infty} \langle J(x_0), x_n - x_0 \rangle = 0$, and so

$$\begin{aligned}
0 &\ge \limsup_{n \to \infty} \langle J(x_n) - J(x_0), x_n - x_0 \rangle \\
&= \limsup_{n \to \infty} \int_0^1 \left[|\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x_0(t)|^{p-2}\dot x_0(t)\right](\dot x_n(t) - \dot x_0(t))\,dt \\
&\ge \limsup_{n \to \infty} \Bigg[\int_0^1 |\dot x_n(t)|^p\,dt - \left(\int_0^1 |\dot x_n(t)|^p\,dt\right)^{\frac{1}{p'}} \left(\int_0^1 |\dot x_0(t)|^p\,dt\right)^{\frac{1}{p}} \\
&\qquad\qquad - \left(\int_0^1 |\dot x_0(t)|^p\,dt\right)^{\frac{1}{p'}} \left(\int_0^1 |\dot x_n(t)|^p\,dt\right)^{\frac{1}{p}} + \int_0^1 |\dot x_0(t)|^p\,dt\Bigg] \\
&= \limsup_{n \to \infty} \left[\|x_n\|^{p-1} - \|x_0\|^{p-1}\right]\left[\|x_n\| - \|x_0\|\right] \ge 0
\end{aligned}$$

where the last inequality follows from the fact that $s \mapsto |s|^{p-1}$ is strictly increasing on $(0, \infty)$. Hence $\|x_n\| \to \|x_0\|$, and due to the uniform convexity of $X$ we have $x_n \to x_0$ in $X$.³⁸ The operator $J$ is also invertible and its inverse is continuous.³⁹ The operator $G$ is compact. This follows immediately from the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset C[0, 1]$ and from the continuity of $g$ (the reader is invited to prove it in detail). Hence, due to Lemma 5.2.46 the operator $T$ satisfies the $(S_+)$ condition.

³⁷ I.e., $J(tx) = t^{p-1}J(x)$ for any $t > 0$, $x \in X$.
³⁸ See Proposition 2.1.22(iv).
³⁹ See Exercise 5.2.53.

Let us define an operator $S : X \to X^*$ by

$$\langle S(x), y \rangle = \int_0^1 |x(t)|^{p-2}x(t)y(t)\,dt \qquad \text{for any } x, y \in X.$$

Then $S$ is $(p - 1)$-homogeneous and compact (use $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$). We define a homotopy

$$T_\tau(x) = J(x) - (1 - \tau)G(x) - \tau\lambda S(x) + (\tau - 1)f^* \qquad \text{for } \tau \in [0, 1],\ x \in X,$$

and show that there exists $R > 0$ (large enough) such that this homotopy is admissible with respect to the ball $B(o; R) \subset X$. The usual way to prove it relies on an indirect argument. Assume by contradiction that for any $k \in \mathbb{N}$ there exist $\tau_k \in [0, 1]$ and $x_k \in X$, $\|x_k\| \ge k$, such that $T_{\tau_k}(x_k) = o$, i.e.,

$$J(x_k) - (1 - \tau_k)G(x_k) - \tau_k\lambda S(x_k) + (\tau_k - 1)f^* = o. \tag{5.2.52}$$

We divide (5.2.52) by $\|x_k\|^{p-1}$, denote $v_k := \frac{x_k}{\|x_k\|}$ and use that $J$ and $S$ are $(p - 1)$-homogeneous to get

$$J(v_k) - (1 - \tau_k)\frac{G(x_k)}{\|x_k\|^{p-1}} - \tau_k\lambda S(v_k) + (\tau_k - 1)\frac{f^*}{\|x_k\|^{p-1}} = o. \tag{5.2.53}$$

Due to the reflexivity of $X$ and the compactness of the interval $[0, 1]$, we may assume that $v_k \rightharpoonup v$ in $X$ and $\tau_k \to \tau \in [0, 1]$. Using the compactness of the embedding $X \subset\subset L^p(0, 1)$, the facts that $G$ and $S$ are continuous as operators from $L^p(0, 1)$ into $L^{p'}(0, 1)$, and using the assumption (5.2.48) we obtain

$$(1 - \tau_k)\frac{G(x_k)}{\|x_k\|^{p-1}} \to (1 - \tau)\lambda S(v) \quad \text{in } X^*,$$
$$\tau_k\lambda S(v_k) \to \tau\lambda S(v) \quad \text{in } X^*,$$
$$(\tau_k - 1)\frac{f^*}{\|x_k\|^{p-1}} \to o \quad \text{in } X^*$$

as $k \to \infty$ (the reader is invited to justify all in detail!). Passing to the limit in (5.2.53) we thus get

$$J(v_k) \to (1 - \tau)\lambda S(v) + \tau\lambda S(v) \quad \text{in } X^* \text{ as } k \to \infty,$$

i.e.,

$$v_k \to J^{-1}(\lambda S(v)) \quad \text{in } X.$$

Since at the same time $v_k \rightharpoonup v$ in $X$, we have

$$v_k \to v \ \text{ in } X \quad \text{and} \quad J(v) - \lambda S(v) = o \ \text{ in } X^*. \tag{5.2.54}$$

Since $\|v_k\| = 1$ for all $k = 1, 2, \dots$, we have $\|v\| = 1$, and so (5.2.54) contradicts the fact that $\lambda$ is not an eigenvalue of (5.2.47). This proves that the homotopy $T_\tau$ is admissible with respect to the ball $B(o; R)$ if $R$ is large. Applying Theorem 5.2.47(iii) we arrive at

$$\deg(J - G - f^*, B(o; R), o) = \deg(J - \lambda S, B(o; R), o),$$

but the value of the degree on the right-hand side is an odd number according to Theorem 5.2.47(ii). Hence

$$\deg(J - G - f^*, B(o; R), o) \neq 0,$$

and the existence of at least one solution $x \in X$ of (5.2.49) which satisfies $\|x\| < R$ follows from Theorem 5.2.47(i).

Remark 5.2.52. It is possible to solve the problem (5.2.46) by means of the Leray–Schauder degree theory as well. In that case instead of solving the operator equation

$$J(x) - G(x) = f^*$$

one has to deal with

$$x = J^{-1}(f^* + G(x))$$

(cf. Exercise 5.2.54). Due to the properties of $J^{-1}$ (cf. Exercise 5.2.53) this approach is more or less equivalent to that presented in Example 5.2.51. However, in more complicated applications (equations of higher order, partial differential equations, etc., see, e.g., Appendix 7.5A) the use of the degree presented in Theorem 5.2.47 can be of essential advantage!

Exercise 5.2.53. Let $J$ be the operator from Example 5.2.51. Prove that there exists an inverse operator $J^{-1}$ which is bounded and continuous.

Hint. The strict monotonicity of $s \mapsto |s|^{p-2}s$ implies that

$$\langle J(u) - J(v), u - v \rangle > 0 \qquad \text{for } u \neq v.$$

Hence $J$ is injective. Using the Hölder inequality prove that

$$\langle J(x) - J(y), x - y \rangle \ge (\|x\|^{p-1} - \|y\|^{p-1})(\|x\| - \|y\|) \tag{5.2.55}$$

(cf. the proof of the $(S_+)$ condition in Example 5.2.51). The boundedness of $J^{-1}$ then follows. To prove that $J^{-1}$ is continuous proceed via contradiction. Suppose it is not, i.e., there exists a sequence $\{f_n\}_{n=1}^{\infty}$, $f_n \to f$ in $X^*$ and

$$\|J^{-1}(f_n) - J^{-1}(f)\| \ge \delta \qquad \text{for a } \delta > 0.$$

Let $x_n := J^{-1}(f_n)$, $x = J^{-1}(f)$. It follows that

$$\|f_n\|\|x_n\| \ge \langle f_n, x_n \rangle = \langle J(x_n), x_n \rangle = \|x_n\|^p, \quad \text{i.e.,} \quad \|x_n\|^{p-1} \le \|f_n\|.$$

We may then assume $x_n \rightharpoonup \tilde x$ in $X$ due to the reflexivity of $X$. Hence

$$\langle J(x_n) - J(\tilde x), x_n - \tilde x \rangle = \langle J(x_n) - J(x), x_n - \tilde x \rangle + \langle J(x) - J(\tilde x), x_n - \tilde x \rangle \to 0 \tag{5.2.56}$$

since $J(x_n) \to J(x)$ in $X^*$. It follows from (5.2.55) (with $x := x_n$, $y := \tilde x$) and (5.2.56) that $\|x_n\| \to \|\tilde x\|$. Hence $x_n \to \tilde x$ follows due to the fact that $X$ is a uniformly convex Banach space (see page 65 or Adams [2, Theorem 3.5]). Since $J$ is continuous and injective, $\tilde x = x$, a contradiction.
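Inequality (5.2.55) rests only on the Hölder inequality, so it already holds for finite sums. The sketch below tests a discrete analogue on random data; it assumes (as an illustration, not as part of the exercise) that $\dot x$ and $\dot y$ are replaced by vectors in $\mathbb{R}^6$ with the $\ell^p$ norm, and all names are hypothetical.

```python
import random


def phi(s, p):
    """The map s -> |s|^(p-2) * s, with phi(0) = 0."""
    return 0.0 if s == 0 else abs(s) ** (p - 2) * s


def norm_p(a, p):
    """Discrete l^p norm, standing in for the norm of W_0^{1,p}."""
    return sum(abs(s) ** p for s in a) ** (1.0 / p)


def lhs(a, b, p):
    """Discrete analogue of <J(x) - J(y), x - y>."""
    return sum((phi(s, p) - phi(t, p)) * (s - t) for s, t in zip(a, b))


def rhs(a, b, p):
    """Discrete analogue of the right-hand side of (5.2.55)."""
    na, nb = norm_p(a, p), norm_p(b, p)
    return (na ** (p - 1) - nb ** (p - 1)) * (na - nb)


random.seed(0)
worst_gap = float("inf")
for p in (1.5, 2.0, 3.0, 4.5):
    for _ in range(200):
        a = [random.uniform(-2, 2) for _ in range(6)]
        b = [random.uniform(-2, 2) for _ in range(6)]
        worst_gap = min(worst_gap, lhs(a, b, p) - rhs(a, b, p))

# Up to rounding errors the gap lhs - rhs should never be negative.
print(worst_gap >= -1e-9)
```

A random test of this kind proves nothing, but a negative gap would immediately expose a sign or exponent error in a hand derivation of (5.2.55).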

Exercise 5.2.54. Consider the boundary value problem (5.2.46) with $g$ satisfying (5.2.48). Prove the existence of at least one weak solution of (5.2.46) using the Leray–Schauder degree theory.

Hint. Prove that $J^{-1} \circ G$ is a compact operator from $X$ into itself and then use the homotopy invariance property of the Leray–Schauder degree to prove that

$$x = J^{-1}(f^* + G(x))$$

has at least one solution in $X$. Compare your proof with the method presented in Example 5.2.51.

Exercise 5.2.55. Consider the problem

$$\begin{cases} -(|\dot x(t)|^{p-2}\dot x(t))\dot{} = h(t, x(t), \dot x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.2.57}$$

where $p > 1$. Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a weak solution of (5.2.57) (see Remark 5.3.10).

5.3 Theory of Monotone Operators

The motivation for the methods presented in this section can be described by the following simple example of a real function of one real variable $f : \mathbb{R} \to \mathbb{R}$. We would like to find conditions on $f$ which guarantee that for any $y \in \mathbb{R}$ the equation

$$f(x) = y$$

has a (unique) solution $x$. One possible way to solve this first semester calculus problem is to consider $f$ which is continuous, (strictly) monotone and satisfies $\lim_{|x| \to \infty} |f(x)| = \infty$ (see Figure 5.3.1).

Figure 5.3.1. (Sketch: the graph $y = f(x)$ of such a function.)

If $f$ is replaced by an operator

$$T : H \to H$$

from a real Hilbert space $H$ (with a scalar product $(\cdot, \cdot)$ and the induced norm $\|\cdot\|$) into itself and the same question is posed, then similar conditions appear to be appropriate to prove that for any $h \in H$ the equation

$$T(u) = h$$

has a unique solution $u \in H$. It is clear how to reformulate the first condition on $f$ in the case of a general operator $T$. The third condition motivates the following definition.

Definition 5.3.1. Let $H$ be a real Hilbert space. An operator $T : H \to H$ satisfying

$$\lim_{\|u\| \to \infty} \|T(u)\| = \infty$$

is called weakly coercive.

In order to reformulate the second condition we should first note that a real function of one real variable is increasing (decreasing) if and only if

$$(f(x) - f(y))(x - y) \ge 0 \quad (\le 0) \qquad \text{for any } x, y \in \mathbb{R}.$$

Definition 5.3.2. Let $H$ be a real Hilbert space. An operator $T : H \to H$ satisfying

$$(T(u) - T(v), u - v) \ge 0 \qquad \text{for any } u, v \in H \tag{5.3.1}$$

is called a monotone operator. An operator $T$ is called strictly monotone if for $u \neq v$ the strict inequality holds in (5.3.1). An operator $T$ is called strongly monotone if there exists $c > 0$ such that

$$(T(u) - T(v), u - v) \ge c\|u - v\|^2 \qquad \text{for any } u, v \in H.$$

Remark 5.3.3. It is clear that a strongly monotone operator is strictly monotone and, therefore, monotone. Also, every strongly monotone operator is weakly coercive. Indeed, $T$ being strongly monotone implies

$$(T(u) - T(o), u) \ge c\|u\|^2. \tag{5.3.2}$$

The Schwarz inequality (see Proposition 1.2.30(i)) yields

$$(T(u) - T(o), u) \le [\|T(u)\| + \|T(o)\|]\|u\|. \tag{5.3.3}$$

Putting (5.3.2) and (5.3.3) together we get

$$\|T(u)\| \ge c\|u\| - \|T(o)\|,$$

and the weak coercivity follows.

The following theorem is the basic assertion of this section.

Theorem 5.3.4. Let $H$ be a real Hilbert space and let $T : H \to H$ be continuous, monotone and weakly coercive. Then $T(H) = H$. If, moreover, $T$ is strictly monotone, then for any $h \in H$ the equation

$$T(u) = h \tag{5.3.4}$$

has a unique solution.
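The estimate $\|T(u)\| \ge c\|u\| - \|T(o)\|$ from Remark 5.3.3 can be observed on a concrete operator. The sketch below assumes $H = \mathbb{R}^2$ and a hypothetical operator $T(u) = 2u + N(u)$, where $N(u) = (\sin u_1, \cos u_2)$ is 1-Lipschitz, so that $T$ is strongly monotone with $c = 1$; none of these choices comes from the text.

```python
import math
import random

C = 1.0  # strong monotonicity constant of the hypothetical T below


def T(u):
    """Hypothetical operator on R^2: 2u plus a 1-Lipschitz perturbation."""
    return (2.0 * u[0] + math.sin(u[0]), 2.0 * u[1] + math.cos(u[1]))


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def norm(a):
    return math.sqrt(dot(a, a))


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])


random.seed(1)
norm_T0 = norm(T((0.0, 0.0)))
ok_monotone = True
ok_coercive = True
for _ in range(1000):
    u = (random.uniform(-50, 50), random.uniform(-50, 50))
    v = (random.uniform(-50, 50), random.uniform(-50, 50))
    d = sub(u, v)
    # Strong monotonicity: (T(u) - T(v), u - v) >= c * ||u - v||^2.
    ok_monotone = ok_monotone and dot(sub(T(u), T(v)), d) >= C * dot(d, d) - 1e-6
    # Consequence from Remark 5.3.3: ||T(u)|| >= c * ||u|| - ||T(o)||.
    ok_coercive = ok_coercive and norm(T(u)) >= C * norm(u) - norm_T0 - 1e-6
print(ok_monotone, ok_coercive)
```

The lower bound grows linearly in $\|u\|$, which is exactly why strong monotonicity forces $\|T(u)\| \to \infty$ as $\|u\| \to \infty$.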

Proof. The uniqueness of the solution is a direct consequence of the strict monotonicity of $T$. The existence of a solution to (5.3.4) for any $h \in H$ is proved in two steps.

Step 1. Assume for a while that the assertion of the theorem holds if $T$ is continuous and strongly monotone. We prove this fact later, in Proposition 5.3.5. Since $T_n : H \to H$, $n \in \mathbb{N}$, defined by

$$T_n : u \mapsto \frac{1}{n}u + T(u)$$

is strongly monotone (prove it!) for any $n \in \mathbb{N}$, we claim that given $h \in H$ there exists $u_n \in H$ such that

$$T_n(u_n) = h. \tag{5.3.5}$$

Step 2. Let us prove that $\{u_n\}_{n=1}^{\infty}$ is a bounded sequence in $H$. Assume the contrary, i.e., there exists a subsequence, which will be denoted by $\{u_n\}_{n=1}^{\infty}$ again, such that

$$\lim_{n \to \infty} \|u_n\| = \infty.$$

It follows from the monotonicity of $T$ that

$$\|h\| \ge \left(h, \frac{u_n}{\|u_n\|}\right) = \frac{1}{n}\|u_n\| + \frac{1}{\|u_n\|}(T(u_n) - T(o), u_n) + \frac{1}{\|u_n\|}(T(o), u_n) \ge \frac{1}{n}\|u_n\| - \|T(o)\|,$$

i.e., $\left\{\frac{1}{n}u_n\right\}_{n=1}^{\infty}$ is a bounded sequence (and therefore weakly sequentially compact; see Theorem 2.1.25 and note that $H$ is reflexive). Hence there exists a subsequence $\left\{\frac{1}{n_k}u_{n_k}\right\}_{k=1}^{\infty} \subset \left\{\frac{1}{n}u_n\right\}_{n=1}^{\infty}$ which is weakly convergent, i.e.,

$$\frac{1}{n_k}u_{n_k} \rightharpoonup w.$$

According to (5.3.5),

$$T(u_{n_k}) \rightharpoonup h - w.$$

This implies that $\{T(u_{n_k})\}_{k=1}^{\infty}$ is a bounded sequence (Proposition 2.1.22(iii)), which contradicts the weak coercivity of $T$. This proves the boundedness of $\{u_n\}_{n=1}^{\infty}$. In particular,

$$\frac{1}{n}u_n \to o \quad \text{and} \quad T(u_n) \to h.$$

By Theorem 2.1.25, there is a subsequence $\{u_{m_k}\}_{k=1}^{\infty} \subset \{u_n\}_{n=1}^{\infty}$ such that

$$u_{m_k} \rightharpoonup u_0.$$

We prove that $T(u_0) = h$. Indeed, for any $v \in H$ and $k \in \mathbb{N}$ we have

$$(T(u_{m_k}) - T(v), u_{m_k} - v) \ge 0.$$

Passing to the limit for $k \to \infty$ we obtain

$$(h - T(v), u_0 - v) \ge 0 \qquad \text{for any } v \in H.^{40}$$

Set $v = u_0 + \lambda w$, $\lambda > 0$, $w \in H$. Then

$$(h - T(u_0 + \lambda w), w) \le 0 \qquad \text{holds for any } \lambda > 0 \text{ and } w \in H. \tag{5.3.6}$$

Passing to the limit for $\lambda \to 0+$ in (5.3.6) and using the continuity of $T$ and of the scalar product in $H$, we get

$$(h - T(u_0), w) \le 0 \qquad \text{for any } w \in H. \tag{5.3.7}$$

Since (5.3.7) holds simultaneously for any $w$ and $-w$, we actually have

$$(h - T(u_0), w) = 0 \quad \text{for any } w \in H, \qquad \text{i.e.,} \qquad T(u_0) = h.$$
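The regularization used in Step 1 of the proof can be watched at work in the simplest setting. The sketch below assumes $H = \mathbb{R}$ and the hypothetical operator $T(u) = u^3$, which is continuous, strictly monotone and weakly coercive but not strongly monotone near $0$; the equations $\frac{1}{n}u + T(u) = h$ are solved by bisection, a choice made here only for illustration.

```python
def T(u):
    """Hypothetical monotone operator on R (not strongly monotone near 0)."""
    return u ** 3


def solve_regularized(h, n, tol=1e-13):
    """Solve u/n + T(u) = h by bisection.

    The left-hand side is strictly increasing, and the interval
    [-bound, bound] brackets the solution because bound**3 >= |h|.
    """
    bound = max(1.0, abs(h))
    lo, hi = -bound, bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / n + T(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


h = 8.0  # the exact solution of T(u) = h is u = 2
approx = [solve_regularized(h, n) for n in (1, 10, 100, 10 ** 6)]
errors = [abs(u - 2.0) for u in approx]
print(approx)
```

As $n$ grows, the solutions $u_n$ of the regularized equations stay bounded and approach the solution of $T(u) = h$, in line with Step 2 of the proof.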

Now, it remains to justify the assumption made in Step 1. For this purpose we prove the following assertion.

Proposition 5.3.5. Let $H$ be a real Hilbert space and $S : H \to H$ a continuous and strongly monotone operator. Then $S(H) = H$.

Proof. The idea of the proof is easy. Since $H$ is a connected metric space, it is enough to prove (see Lemmas 5.3.6 and 5.3.8) that $S(H)$ is both open and closed in $H$. Then $S(H) = H$ because the only nonempty subset of $H$ which is both open and closed is the entire space $H$.

First we prove that $S(H)$ is closed.

Lemma 5.3.6. Let $D$ be a closed set in $H$, let $S : D \to H$ be a continuous and strongly monotone operator. Then $S(D)$ is a closed set in $H$.

Proof. Let $\{u_n\}_{n=1}^{\infty} \subset D$ be such that $S(u_n) \to h$. Since $S$ is strongly monotone, we have

$$(S(u_n) - S(u_m), u_n - u_m) \ge c\|u_n - u_m\|^2,$$

and using the Schwarz inequality we obtain

$$\frac{1}{c}\|S(u_n) - S(u_m)\| \ge \|u_n - u_m\|.$$

Hence $\{u_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and there exists $u_0 \in D$ such that $u_n \to u_0$. The continuity of $S$ implies that

$$S(u_n) \to S(u_0), \qquad \text{i.e.,} \qquad S(u_0) = h.$$

⁴⁰ Here we use that $x_n \rightharpoonup x$ and $y_n \to y$ imply $(x_n, y_n) \to (x, y)$. See Exercise 2.1.36.

To prove that $S(H)$ is an open set is more tricky. For this purpose we need an auxiliary assertion about an extension of Lipschitz continuous operators.

Lemma 5.3.7. Let $D$ be a subset of a real Hilbert space $H$, let $V : D \to H$ be an operator satisfying

$$\|V(u) - V(v)\| \le \|u - v\| \qquad \text{for any } u, v \in D.$$

Then there exists an operator $W : H \to H$ such that

$$\|W(u) - W(v)\| \le \|u - v\| \qquad \text{for any } u, v \in H \tag{5.3.8}$$

and, moreover,

$$W(u) = V(u) \qquad \text{for any } u \in D.$$
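For real-valued maps an extension as in Lemma 5.3.7 can be written down explicitly, which may help to visualize the statement. The sketch below uses the McShane formula $W(u) = \min_{d \in D}\,(V(d) + |u - d|)$ on a finite set $D \subset \mathbb{R}$. This is a different and much simpler technique than the one needed for the lemma: the formula relies on the order structure of $\mathbb{R}$ and does not apply to Hilbert-space-valued $V$. The data are hypothetical.

```python
import random

# Hypothetical 1-Lipschitz data V on a finite set D of real numbers.
V = {-2.0: 0.5, 0.0: -1.0, 1.0: -0.2, 3.5: 2.0}
D = sorted(V)


def W(u):
    """McShane extension of V to all of R: min over d in D of V(d) + |u - d|."""
    return min(V[d] + abs(u - d) for d in D)


# W agrees with V on D ...
agrees_on_D = all(W(d) == V[d] for d in D)

# ... and is 1-Lipschitz on randomly chosen pairs of points.
random.seed(2)
lipschitz = True
for _ in range(1000):
    u = random.uniform(-10, 10)
    v = random.uniform(-10, 10)
    lipschitz = lipschitz and abs(W(u) - W(v)) <= abs(u - v) + 1e-12
print(agrees_on_D, lipschitz)
```

In a Hilbert space of dimension greater than one no such pointwise formula is available, which is why the proof of the lemma has to construct the extension one point at a time.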

Proof. It follows from Zorn's Lemma (see Theorem 1.1.4) that there exists a maximal extension $W$ of the operator $V$, the domain of which satisfies

$$\operatorname{Dom} W \subset H, \qquad D \subset \operatorname{Dom} W,$$

and for any $u, v \in \operatorname{Dom} W$ the inequality (5.3.8) holds. Our aim is to prove $\operatorname{Dom} W = H$. Assume the contrary, i.e., there exists $u_0 \in H \setminus \operatorname{Dom} W$. In order to reach a contradiction it is enough to prove the existence of $v_0 \in H$ such that

$$\|v_0 - W(u)\| \le \|u_0 - u\| \qquad \text{for any } u \in \operatorname{Dom} W. \tag{5.3.9}$$

Indeed, setting

$$\tilde W : u \mapsto \begin{cases} v_0, & u = u_0, \\ W(u), & u \in \operatorname{Dom} W, \end{cases}$$

we obtain an operator

$$\tilde W : \operatorname{Dom} W \cup \{u_0\} \to H$$

satisfying (5.3.8) for any $u, v \in \operatorname{Dom} W \cup \{u_0\}$. This will be a contradiction with the maximality of the extension $W$. So, in the rest of the proof we concentrate on the existence of $v_0$ satisfying (5.3.9).

Let $B$ be a finite subset of $\operatorname{Dom} W$. Denote by $A_B$ the set of all $v_0 \in H$ satisfying (5.3.9) for any $u \in B$. Let $A$ denote the set of all $v_0 \in H$ satisfying (5.3.9) for all $u \in \operatorname{Dom} W$. Let $\mathcal{B}_n$ be the system of all finite subsets $B$ of $\operatorname{Dom} W$ which belong to the closed ball $\{u \in H : \|u\| \le n\}$, $n \in \mathbb{N}$. Set

$$A_n = \bigcap_{B \in \mathcal{B}_n} A_B.$$

Clearly, we have

$$A = \bigcap_{n=1}^{\infty} A_n, \qquad A_{n+1} \subset A_n \subset A_1.$$

We wish to prove that $A \neq \emptyset$. Observe first that $A_B$ and $A_n$ are weakly compact sets (they are bounded and weakly closed⁴¹). If $A_B \neq \emptyset$ for any finite subset $B \subset \operatorname{Dom} W$, then $A_n \neq \emptyset$ for any $n \in \mathbb{N}$ by Exercise 1.2.42. Applying this procedure again we finally obtain $A \neq \emptyset$ and the proof will be complete.

Assume now that there exists $B = \{u_1, \dots, u_m\} \subset \operatorname{Dom} W$ such that $A_B = \emptyset$. We want to reach a contradiction which will complete the proof. Denote

$$H_f = \operatorname{Lin}\{u_1 - u_0, \dots, u_m - u_0, W(u_1), \dots, W(u_m)\}.$$

Then $H_f$ is a subspace of $H$ and $\dim H_f \le 2m$. For any $w \in H_f$ set

$$h(w) = \max_{1 \le j \le m} \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|}.$$

If there exists $v_0 \in H_f$ such that $h(v_0) \le 1$, then $v_0 \in A_B$, a contradiction. So, assume that $h(w) > 1$ for any $w \in H_f$. Note that the real function $h$ is continuous on $H_f$ and

$$\lim_{\substack{\|w\| \to \infty \\ w \in H_f}} h(w) = \infty.$$

Hence, there exists $w_0 \in H_f$⁴² such that

$$h(w_0) = \min_{w \in H_f} h(w) = \lambda > 1.$$

Let us re-enumerate $u_1, \dots, u_m$ in such a way that

$$\begin{aligned} \frac{\|w_0 - W(u_j)\|}{\|u_0 - u_j\|} &= \lambda > 1, & 1 \le j \le k, \\ \frac{\|w_0 - W(u_j)\|}{\|u_0 - u_j\|} &< \lambda, & k + 1 \le j \le m. \end{aligned} \tag{5.3.10}$$

We prove that $w_0$ belongs to the convex hull $M = \operatorname{Co}\{W(u_1), \dots, W(u_k)\}$ of $\{W(u_1), \dots, W(u_k)\}$.⁴³ Let us assume the contrary. Then we can find $w_1 \in H_f$

⁴¹ Since the weak topology is not metrizable, the fact that $A_B$ is weakly closed has to be shown with the help of weak neighborhoods (Remark 2.1.23). But this is simple due to the Dual Characterization of the Norm (Corollary 2.1.16).
⁴² Recall that bounded sets in a finite dimensional space $H_f$ are relatively compact.
⁴³ The convex hull of the set $A$ is the least convex set containing $A$.

5.3. Theory of Monotone Operators

315

such that w1 − W (uj ) w0 − W (uj ) < = λ, u0 − uj u0 − uj w1 − W (uj ) < λ, u0 − uj

1 ≤ j ≤ k, k + 1 ≤ j ≤ m,

(see Figure 5.3.2 for m = 5 and k = 2).44 Hence h(w1 ) < h(w0 ), a contradiction.

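[Figure 5.3.2: the boundaries $\partial U_1, \dots, \partial U_5$ in $H_f$, the points $W(u_1), \dots, W(u_5)$, $w_0$, $w_1$, $w_M$, and the convex hull $M = \operatorname{Co}\{W(u_1), W(u_2)\}$.]

Figure 5.3.2. $U_j = \left\{ w \in H_f : \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|} < \lambda \right\} = B(W(u_j); \lambda\|u_0 - u_j\|) \cap H_f$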

Consequently, there are $c_1, \dots, c_k$ such that

$w_0 = \sum_{j=1}^{k} c_j W(u_j), \qquad c_j \ge 0, \qquad \sum_{j=1}^{k} c_j = 1.$

Set $z_j = w_0 - W(u_j)$, $\hat z_j = u_0 - u_j$, $1 \le j \le k$. Then

$\sum_{j=1}^{k} c_j z_j = o, \qquad \|\hat z_j\|^2 < \|z_j\|^2, \quad 1 \le j \le k.$   (5.3.11)

⁴⁴ In other words, $\bigcap_{1 \le j \le m} U_j \ne \emptyset$ where $U_j = \left\{ w \in H_f : \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|} < \lambda \right\}$ (see Figure 5.3.2). Indeed, $\bigcap_{k+1 \le j \le m} U_j$ is a nonempty open subset of $H_f$ and the latter (m − k) inequalities in (5.3.10) imply $w_0 \in \bigcap_{k+1 \le j \le m} U_j$. Using the convexity and compactness of M, the reader is invited to show that $\bigcap_{1 \le j \le k} U_j$ contains the segment $\{t w_0 + (1 - t) w_M : 0 < t < 1\}$ where $\|w_0 - w_M\| = \operatorname{dist}(w_0, M)$. Consequently, $\bigcap_{1 \le j \le k} U_j$ is a nonempty set, too, and $w_0$ belongs to its boundary.


For $1 \le j, n \le k$ we also have

$\|z_j - z_n\|^2 = \|W(u_n) - W(u_j)\|^2 \le \|u_n - u_j\|^2 = \|\hat z_j - \hat z_n\|^2,$

i.e.,

$\|z_j\|^2 + \|z_n\|^2 - 2(z_j, z_n) \le \|\hat z_j\|^2 + \|\hat z_n\|^2 - 2(\hat z_j, \hat z_n).$   (5.3.12)

We conclude from (5.3.11) and (5.3.12) that

$(\hat z_j, \hat z_n) < (z_j, z_n), \quad 1 \le j, n \le k,$

and thus

$\sum_{j,n=1}^{k} c_j c_n (\hat z_j, \hat z_n) < \sum_{j,n=1}^{k} c_j c_n (z_j, z_n).$

However,

$\sum_{j,n=1}^{k} c_j c_n (z_j, z_n) = \Big\| \sum_{j=1}^{k} c_j z_j \Big\|^2 = 0, \qquad \sum_{j,n=1}^{k} c_j c_n (\hat z_j, \hat z_n) = \Big\| \sum_{j=1}^{k} c_j \hat z_j \Big\|^2,$

i.e.,

$\Big\| \sum_{j=1}^{k} c_j \hat z_j \Big\|^2 < 0,$

a contradiction. This proves that there exists $v_0 \in H_f$ such that $h(v_0) \le 1$, i.e., $A_B \ne \emptyset$ for any finite set $B \subset \operatorname{Dom} W$, and the proof is complete.

Now, we are ready to prove that S(H) is also an open set in H.

Lemma 5.3.8. Let $D \subset H$ be an open set, let $S : D \to H$ be continuous and strongly monotone. Then S(D) is an open subset of H.

Proof. It is enough to prove this lemma for S satisfying the strong monotonicity assumption with c = 1 (explain why!). Let us denote $R := S(D)$. We are going to construct a continuous mapping $Z : H \to H$, $\operatorname{Dom} Z = H$, such that $Z^{-1}(D) = R$ and this will imply that R is open. The operator S is injective in D and $S^{-1}$ is continuous on R (see Exercise 5.3.12). So we intend to construct Z as an extension of $S^{-1}$. For this purpose set

$F(u) = S(u) - u.$

Then for $u, u_1 \in D$ we have

$(F(u) - F(u_1), u - u_1) \ge 0,$

i.e., F is monotone. For $v \in R$ set

$K(v) = S^{-1}(v) - F(S^{-1}(v)).$


Let $v, v_1 \in R$ be such that $v = S(u)$, $v_1 = S(u_1)$. Then

$\|K(v) - K(v_1)\|^2 = \|u - u_1\|^2 + \|F(u) - F(u_1)\|^2 - 2(F(u) - F(u_1), u - u_1),$
$\|v - v_1\|^2 = \|F(u) - F(u_1)\|^2 + \|u - u_1\|^2 + 2(F(u) - F(u_1), u - u_1).$

The monotonicity of F implies that for any $v, v_1 \in R$,

$\|K(v) - K(v_1)\| \le \|v - v_1\|.$

It follows from Lemma 5.3.7 that there exists a continuous extension $K_1$ of K which is defined on the whole H and for any $v, v_1 \in H$ we have

$\|K_1(v) - K_1(v_1)\| \le \|v - v_1\|.$

For $v \in H$ set

$Z(v) = \tfrac{1}{2}(v + K_1(v)).$

If $v \in R$ and $v = S(u)$, then

$v + K_1(v) = v + K(v) = 2u, \quad \text{i.e.,} \quad Z|_R = S^{-1} \text{ and } R \subset Z^{-1}(D).$

The inclusion

$Z^{-1}(D) \subset R$   (5.3.13)

will imply that $Z^{-1}(D) = R$ and, by the continuity of Z, the set $R = S(D)$ is open. To prove (5.3.13) it is enough to show that for any $v \in Z^{-1}(D)$ we have $v = S(Z(v))$. Assume by contradiction that there is $v \in Z^{-1}(D)$ such that for $u = Z(v)$ we have

$\|v - S(u)\| > 0.$   (5.3.14)

The continuity of S implies the existence of $d > 0$ such that $B(u; d) \subset D$ and for $u_1 \in B(u; d)$ we have

$\|S(u) - u - S(u_1) + u_1\| \le \tfrac{1}{2}\|v - S(u)\|.$

Let us choose $t > 0$ so small that $\|t(v - S(u))\| < d$. Set

$u_1 = u + t(v - S(u)), \qquad v_1 = S(u_1).$

Then $\|u - u_1\| < d$, and so

$(S(u_1) - u_1 - v + u, t(v - S(u))) = (v_1 - Z(v_1) - v + Z(v), Z(v_1) - Z(v))$
$= \tfrac{1}{4}(v_1 - K_1(v_1) - v + K_1(v), v_1 + K_1(v_1) - v - K_1(v))$
$= \tfrac{1}{4}\left(\|v - v_1\|^2 - \|K_1(v) - K_1(v_1)\|^2\right) \ge 0.$


Furthermore,

$(S(u_1) - u_1 - S(u) + u, u - u_1) = (S(u_1) - u_1 - v + u, -t(v - S(u))) + (v - S(u), -t(v - S(u)))$
$\le (v - S(u), -t(v - S(u))) = -t\|v - S(u)\|^2,$

and so

$t\|v - S(u)\|^2 \le (S(u_1) - u_1 - S(u) + u, t(v - S(u))) \le t\|S(u_1) - u_1 - S(u) + u\|\,\|v - S(u)\| \le \tfrac{1}{2} t \|v - S(u)\|^2.$

Since $t > 0$ this contradicts (5.3.14). This proves (5.3.13) and the proof is complete.

Let us point out that for operator equations with strongly monotone operators we obtain the continuous dependence of the solution on the right-hand side.

Corollary 5.3.9. Let H be a real Hilbert space and $T : H \to H$ a continuous and strongly monotone operator. Then for any $h \in H$ the equation

$T(u) = h$

has a unique solution. Let $T(u_1) = h_1$ and $T(u_2) = h_2$. Then

$\|u_1 - u_2\| \le \frac{1}{c}\|h_1 - h_2\|$

with $c > 0$ from Definition 5.3.2, i.e., $T^{-1}$ is Lipschitz continuous.

Proof. The existence part follows from Proposition 5.3.5. Uniqueness is obvious. For $T(u_1) = h_1$ and $T(u_2) = h_2$ we have (using the Schwarz inequality)

$c\|u_1 - u_2\|^2 \le (T(u_1) - T(u_2), u_1 - u_2) \le \|u_1 - u_2\|\,\|h_1 - h_2\|,$

which completes the proof.

Remark 5.3.10. Let $h : [0, 1] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a real function. Consider the boundary value problem

$-\ddot x(t) = h(t, x(t), \dot x(t)), \quad t \in (0, 1),$
$x(0) = x(1) = 0.$   (5.3.15)

Assume that h is continuous and $x \in C^2[0, 1]$ is a solution of (5.3.15). Let us multiply the equation in (5.3.15) by a function $y \in W_0^{1,2}(0, 1)$ and then integrate the equation from 0 to 1. Applying the Integration by Parts Formula on the left-hand side, we obtain

$\int_0^1 \dot x(t)\dot y(t)\,dt = \int_0^1 h(t, x(t), \dot x(t))\,y(t)\,dt.$   (5.3.16)


This identity makes sense for a more general x than that from $C^2[0, 1]$ and also for a more general function h. We discuss this issue in Section 7.3 in detail. If we assume that h is such that the integral on the right-hand side of (5.3.16) exists for any $x, y \in W_0^{1,2}(0, 1)$ (see the following Example 5.3.11), then the function $x \in W_0^{1,2}(0, 1)$ is called the weak solution of (5.3.15) if the integral identity (5.3.16) holds for any $y \in W_0^{1,2}(0, 1)$.

Once we succeed in finding a weak solution of (5.3.15), a natural question arises whether it belongs to a "better" space than $W_0^{1,2}(0, 1)$, e.g., the continuity of the first and second derivatives of x can be of interest. This is the so-called regularity problem. It is a very delicate issue in the theory of partial differential equations. On the other hand, for an ordinary differential equation, it is not. For instance, if h is a continuous function, independent of $\dot x$, and $x \in W_0^{1,2}(0, 1)$ is a weak solution of (5.3.15), then $x \in C^2[0, 1]$ is a classical solution of (5.3.15), i.e., the equation in (5.3.15) holds pointwise in (0, 1).

Example 5.3.11. Let us consider the boundary value problem

$-\ddot x(t) + g(x(t)) = f(t), \quad t \in (0, 1),$
$x(0) = x(1) = 0$   (5.3.17)

where $g : \mathbb{R} \to \mathbb{R}$ is a continuous function and $f \in L^2(0, 1)$ is a given function. Reformulate (5.3.17) as an operator equation. Put $H := W_0^{1,2}(0, 1)$ and define operators $J, G : H \to H$ and an element $f^* \in H$ by

$(J(x), y) = \int_0^1 \dot x(t)\dot y(t)\,dt, \qquad (G(x), y) = \int_0^1 g(x(t))\,y(t)\,dt, \qquad (f^*, y) = \int_0^1 f(t)\,y(t)\,dt$

for any $x, y \in H$. We will work with the scalar product

$(x, y) = \int_0^1 \dot x(t)\dot y(t)\,dt, \qquad x, y \in H,$

and with the norm

$\|x\| = \left( \int_0^1 |\dot x(t)|^2\,dt \right)^{1/2},$

cf. Exercise 1.2.46. The reader is invited to prove that the operators J and G as well as the element $f^*$ are well defined and that J is linear. Set $S = J + G$. Then the operator equation

$S(x) = f^*$

is equivalent to the requirement that the integral identity

$\int_0^1 \dot x(t)\dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt$   (5.3.18)

holds for all $y \in H$. This is the weak formulation of (5.3.17), and $x \in H$ satisfying (5.3.18) for any $y \in H$ is a weak solution of (5.3.17).

Let us prove that S is a continuous operator. This fact follows from the continuity of J and G. By the definition of J and of the scalar product in H, J is the identity on H, and so it is a continuous operator. Assume now that $x_n \to x$. The embedding of $H = W_0^{1,2}(0, 1)$ into $C[0, 1]$ (see Theorem 1.2.26) implies that $x_n \rightrightarrows x$ uniformly in [0, 1]. It follows then from the continuity of g on $\mathbb{R}$ that $g \circ x_n \rightrightarrows g \circ x$ uniformly in [0, 1] (justify this statement carefully!). Then using the Dual Characterization of the Norm and the Hölder inequality, we conclude that

$\|G(x_n) - G(x)\| = \sup_{\|w\| \le 1} |(G(x_n) - G(x), w)|$
$= \sup_{\|w\| \le 1} \left| \int_0^1 [g(x_n(t)) - g(x(t))]\,w(t)\,dt \right|$
$\le \left( \int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt \right)^{1/2} \sup_{\|w\| \le 1} \|w\|_{L^2(0,1)}$
$\le c \left( \int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt \right)^{1/2} \to 0 \quad \text{as } n \to \infty.$

Hence G is also a continuous operator. Next we prove that S is a strongly monotone operator provided g is an increasing function. Indeed, for any $x, y \in H$ we have

$(S(x) - S(y), x - y) = \int_0^1 |\dot x(t) - \dot y(t)|^2\,dt + \int_0^1 [g(x(t)) - g(y(t))][x(t) - y(t)]\,dt \ge \|x - y\|^2.$⁴⁵

It follows from Corollary 5.3.9 that the problem (5.3.17) has a unique weak solution for any $f \in L^2(0, 1)$. If $f_n \to f$ in $L^2(0, 1)$, then $f_n^* \to f^*$ strongly in H (prove it in detail!), and according to Corollary 5.3.9 the corresponding weak solutions $x_n \in H$ satisfy $\|x_n - x\| \to 0$. In particular, this means that a weak solution of (5.3.17) depends continuously on the right-hand side $f \in L^2(0, 1)$.

⁴⁵ Here we use that $(g(r) - g(s))(r - s) \ge 0$.
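The continuous dependence on the right-hand side in Example 5.3.11 can be observed numerically. The sketch below is only an illustration under several assumptions that do not come from the text: the problem is replaced by a finite-difference scheme on a uniform grid, the nonlinearity is chosen as g(s) = s³, the data are f(t) = sin(πt) and a perturbed copy, and the discrete equations are solved by Newton's method with a tridiagonal solver. For the discrete problem the analogue of the estimate in Corollary 5.3.9 reads ‖x₁ − x₂‖ ≤ ½‖f₁ − f₂‖ in the discrete energy and L² norms, where the factor ½ comes from a crude discrete Poincaré inequality.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def residual(x, f, h, g):
    """Discrete version of -x'' + g(x) - f with x(0) = x(1) = 0."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append((2.0 * x[i] - left - right) / h ** 2 + g(x[i]) - f[i])
    return r

def solve_bvp(f, g, dg, n, tol=1e-10, max_iter=50):
    """Newton's method for the discretized problem, started from zero."""
    h = 1.0 / (n + 1)
    x = [0.0] * n
    off = [-1.0 / h ** 2] * n
    for _ in range(max_iter):
        r = residual(x, f, h, g)
        if max(abs(v) for v in r) < tol:
            break
        diag = [2.0 / h ** 2 + dg(xi) for xi in x]
        step = solve_tridiagonal(off, diag, off, [-v for v in r])
        x = [xi + si for xi, si in zip(x, step)]
    return x

def energy_norm(e, h):
    """Discrete analogue of (integral of |e'|^2)^(1/2) with zero boundary values."""
    ext = [0.0] + list(e) + [0.0]
    return math.sqrt(sum((ext[i + 1] - ext[i]) ** 2 for i in range(len(ext) - 1)) / h)

def l2_norm(v, h):
    return math.sqrt(h * sum(a * a for a in v))

def g(s):   # assumed increasing nonlinearity
    return s ** 3

def dg(s):
    return 3.0 * s ** 2

n = 50
h = 1.0 / (n + 1)
t = [(i + 1) * h for i in range(n)]
f1 = [math.sin(math.pi * ti) for ti in t]
f2 = [v + 0.1 for v in f1]          # perturbed right-hand side
x1 = solve_bvp(f1, g, dg, n)
x2 = solve_bvp(f2, g, dg, n)
change_in_solution = energy_norm([a - b for a, b in zip(x1, x2)], h)
change_in_data = l2_norm([a - b for a, b in zip(f1, f2)], h)
```

Only the monotonicity of g enters the estimate, so any other increasing g with a known derivative could be substituted in this sketch.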


Exercise 5.3.12. Let H be a real Hilbert space and let $S : H \to H$ be a strongly monotone operator. Prove that S is injective and $S^{-1}$ is Lipschitz continuous.

Exercise 5.3.13. Let $\varepsilon > 0$ and $T : B(o; R + \varepsilon) \subset \mathbb{R}^N \to \mathbb{R}^N$ be a monotone operator. Prove that $T(B(o; R))$ is a bounded set.

Exercise 5.3.14. Let $B(o; 1) \subset \mathbb{R}^N$, $N \ge 2$. Prove that there exists a strongly monotone operator $T : B(o; 1) \to \mathbb{R}^N$ for which $T(B(o; 1))$ is an unbounded set.
Hint. Let $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^N$, $\|x_n\| = 1$, $x_n \ne x_m$ for $n \ne m$, $x_n \to x_0$. Set

$T : x \mapsto \begin{cases} x & \text{for } x \in B(o; 1),\ x \ne x_m,\ m = 1, 2, \dots, \\ x_n + n x_n & \text{for } x = x_n,\ n = 1, 2, \dots. \end{cases}$

Prove that $T(B(o; 1))$ is unbounded and T is strongly monotone.

Exercise 5.3.15. Define

$f_n : t \mapsto \begin{cases} 0 & \text{for } t \le \tfrac{1}{2}, \\ nt - \tfrac{n}{2} & \text{for } t > \tfrac{1}{2}. \end{cases}$

For $x = (x_1, x_2, \dots) \in l^2$ set

$T x = (f_1(x_1), f_2(x_2), \dots) + (x_1, x_2, \dots).$

Prove that

$(T(x) - T(y), x - y)_{l^2} \ge \|x - y\|_{l^2}^2$ for any $x, y \in l^2$

and $T(B(o; 1))$ is an unbounded set.

Exercise 5.3.16. Let $T : \mathbb{R}^N \to \mathbb{R}^N$ be a monotone operator and $T(\mathbb{R}^N) = \mathbb{R}^N$. Prove that T is weakly coercive.
Hint. Assume the contrary, i.e., there exist $M > 0$ and a sequence $\{u_n\}_{n=1}^{\infty}$ such that

$\|u_n\| \to \infty$ as $n \to \infty$ and $\|T(u_n)\| \le M.$

Choose a subsequence of $\left\{ \frac{u_n}{\|u_n\|} \right\}_{n=1}^{\infty}$ which is convergent to w. Since $T(\mathbb{R}^N) = \mathbb{R}^N$ there is $u \in \mathbb{R}^N$ such that $T(u) = (M + 1)w$. By the monotonicity of T we have

$\left( T(u_n), \frac{u_n - u}{\|u_n\|} \right) \ge \left( T(u), \frac{u_n - u}{\|u_n\|} \right).$

Taking lim sup on both sides we obtain a contradiction.

Exercise 5.3.17. Let $f : \mathbb{R} \to \mathbb{R}$ be defined as follows:

$f : x \mapsto \begin{cases} x & \text{for } x < 0, \\ x + 1 & \text{for } x \ge 0. \end{cases}$

For any $(x, y) \in \mathbb{R}^2$ set

$T(x, y) = (y + f(x), -x).$

Prove that T is an injective monotone operator, $T(\mathbb{R}^2) \ne \mathbb{R}^2$ and T is not continuous. Can there exist an injective, monotone function $T : \mathbb{R} \to \mathbb{R}$ which is not continuous and $T(\mathbb{R}) = \mathbb{R}$?

Exercise 5.3.18. Let H be a real Hilbert space and $T : H \to H$ a strongly monotone and Lipschitz continuous operator, i.e., there exist numbers $m > 0$, $M > 0$, $M > m$ such that

$(T(u) - T(v), u - v) \ge m\|u - v\|^2, \qquad \|T(u) - T(v)\| \le M\|u - v\|$

hold for all $u, v \in H$. Prove that the equation

$T(u) = h$

has precisely one solution for every $h \in H$, and it is possible to construct this solution by iterations.
Hint. Let $h \in H$, $\varepsilon > 0$. Define an operator $A_\varepsilon(u) = u - \varepsilon(T(u) - h)$. Prove that for $u, v \in H$,

$\|A_\varepsilon(u) - A_\varepsilon(v)\|^2 \le (1 - 2\varepsilon m + \varepsilon^2 M^2)\|u - v\|^2,$

and show that for $\varepsilon < \frac{2m}{M^2}$ the operator $A_\varepsilon$ is contractive. Apply the Contraction Principle (Theorem 2.3.1).
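The iteration from the hint to Exercise 5.3.18 is easy to try out in a finite dimensional setting. The following is a minimal sketch in H = R², with a sample operator chosen only for illustration (it does not appear in the text): T(x, y) = (2x + y + arctan x, −x + 2y + arctan y). For this choice one can take m = 2 and M = √5 + 1, and the step ε = m/M² lies below the threshold 2m/M² from the hint. The last lines also check the Lipschitz estimate of Corollary 5.3.9 with c = m.

```python
import math

def T(u):
    """Sample strongly monotone, Lipschitz operator on R^2 (an assumption)."""
    x, y = u
    return (2.0 * x + y + math.atan(x), -x + 2.0 * y + math.atan(y))

m = 2.0                    # strong monotonicity constant of the sample T
M = math.sqrt(5.0) + 1.0   # Lipschitz constant of the sample T

def solve(h, eps=m / M ** 2, steps=200):
    """Iterate u <- A_eps(u) = u - eps * (T(u) - h), starting from zero."""
    u = (0.0, 0.0)
    for _ in range(steps):
        Tu = T(u)
        u = (u[0] - eps * (Tu[0] - h[0]), u[1] - eps * (Tu[1] - h[1]))
    return u

def norm(v):
    return math.hypot(v[0], v[1])

h1 = (1.0, -2.0)
h2 = (1.5, -1.0)
u1 = solve(h1)
u2 = solve(h2)

residual1 = norm((T(u1)[0] - h1[0], T(u1)[1] - h1[1]))
distance_of_solutions = norm((u1[0] - u2[0], u1[1] - u2[1]))
distance_of_data = norm((h1[0] - h2[0], h1[1] - h2[1]))
```

With ε = m/M² the contraction factor from the hint is (1 − m²/M²)^{1/2}, roughly 0.79 for this sample operator, so 200 steps are far more than needed.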

Exercise 5.3.19. Let H be a Hilbert space and $T : H \to H$ a contraction. Prove that $I - T$ is a monotone operator.

Exercise 5.3.20. A function $x \in W^{1,2}(0, 1)$ satisfying (5.3.16) for any $y \in W^{1,2}(0, 1)$ is called a weak solution of the Neumann problem

$-\ddot x(t) = h(t, x(t), \dot x(t)), \quad t \in (0, 1),$
$\dot x(0) = \dot x(1) = 0.$   (5.3.19)

Prove that any weak solution x of (5.3.19) such that $x \in C^2[0, 1]$ satisfies the equation in (5.3.19) and $\dot x(0) = \dot x(1) = 0$, i.e., it is a classical solution of (5.3.19).
Hint. Taking $y \in \mathcal{D}(0, 1)$ in (5.3.16) show that the equation in (5.3.19) holds pointwise in (0, 1). Then take arbitrary $y \in C^2[0, 1]$ in (5.3.16) and integrate by parts.

Exercise 5.3.21. Find conditions on $\lambda$, $g = g(t, x)$ and f for the Neumann problem

$-\ddot x(t) + \lambda x(t) + g(t, x(t)) = f(t), \quad t \in (0, 1),$
$\dot x(0) = \dot x(1) = 0,$

to have a unique weak solution.
Hint. Use Corollary 5.3.9.


5.3A Browder and Leray–Lions Theorem

In this appendix we will discuss generalizations of the previous assertions from the basic text. We will present two assertions: one attributed to F.E. Browder, the other named after J. Leray and J.-L. Lions.

Theorem 5.3.22 (Browder). Let X be a reflexive real Banach space. Moreover, let $T : X \to X^*$ be an operator satisfying the conditions
(i) T is bounded;
(ii) T is demicontinuous;
(iii) T is coercive, i.e.,

$\lim_{\|u\| \to \infty} \frac{\langle T(u), u \rangle}{\|u\|} = +\infty,$

cf. Definition 6.2.17 in the Hilbert space setting;
(iv) T is monotone on the space X, i.e., for all $u, v \in X$ we have

$\langle T(u) - T(v), u - v \rangle \ge 0,$   (5.3.20)

cf. Definition 5.3.2 in the Hilbert space setting.
Then the equation

$T(u) = f^*$   (5.3.21)

has at least one solution $u \in X$ for every $f^* \in X^*$. If, moreover, the inequality (5.3.20) is strict for all $u, v \in X$, $u \ne v$, then the equation (5.3.21) has precisely one solution $u \in X$ for every $f^* \in X^*$.

The second assertion is more general since the monotonicity condition (iv) is replaced by a set of weaker conditions.

Theorem 5.3.23 (Leray–Lions). Let X be a reflexive real Banach space. Let $T : X \to X^*$ be an operator satisfying the conditions
(i) T is bounded;
(ii) T is demicontinuous;
(iii) T is coercive.
Moreover, let there exist a bounded mapping $\Phi : X \times X \to X^*$ such that
(iv) $\Phi(u, u) = T(u)$ for every $u \in X$;
(v) for all $u, w, h \in X$ and any sequence $\{t_n\}_{n=1}^{\infty}$ of real numbers such that $t_n \to 0$, we have $\Phi(u + t_n h, w) \rightharpoonup \Phi(u, w)$;
(vi) for all $u, w \in X$ we have $\langle \Phi(u, u) - \Phi(w, u), u - w \rangle \ge 0$ (the so-called condition of monotonicity in the principal part);
(vii) if $u_n \rightharpoonup u$ and $\lim_{n \to \infty} \langle \Phi(u_n, u_n) - \Phi(u, u_n), u_n - u \rangle = 0$, then we have $\Phi(w, u_n) \rightharpoonup \Phi(w, u)$ for arbitrary $w \in X$;
(viii) if $w \in X$, $u_n \rightharpoonup u$, $\Phi(w, u_n) \rightharpoonup z$, then $\lim_{n \to \infty} \langle \Phi(w, u_n), u_n \rangle = \langle z, u \rangle$.
Then the equation

$T(u) = f^*$

has at least one solution $u \in X$ for every $f^* \in X^*$.

The conditions (iv)–(viii) of Theorem 5.3.23 are somewhat unintuitive at first glance. We will try to clarify these conditions in Appendix 7.6A where an application to boundary value problems for partial differential equations is given. Next we will discuss the main steps of the proof of Theorem 5.3.22. The proof of Theorem 5.3.23 is similar, nonetheless it is technically more demanding (see, e.g., Leray & Lions [85]).

Proof of Theorem 5.3.22. We divide the proof into eight steps.

Step 1. Observe that the operator $T_{f^*}(u) := T(u) - f^*$ also satisfies all the conditions of Theorem 5.3.22. Hence it suffices to prove that the equation

$T(u) = o$   (5.3.22)

has at least one solution.

Step 2. We construct an "approximation of the infinite dimensional equation (5.3.22) by an equation in a space of finite dimension". More precisely: Let $\Lambda$ be the family of all subspaces of finite dimension in the space X. If $F \in \Lambda$, define the operator $j_F : F \to X$ by $j_F(u) = u$. Obviously $j_F$ is linear and continuous on F. Let $j_F^*$ be the adjoint operator to $j_F$ (see Section 2.1). Then $j_F^* : X^* \to F^*$ and for $u \in F$, put

$T_F(u) := j_F^*(T(u)).$

This defines a mapping $T_F$ from the space F into the space $F^*$ (see Figure 5.3.3).

Step 3. Since a continuous linear operator maps a weakly convergent sequence to a weakly convergent one (see Proposition 2.1.27(i)) and the weak convergence coincides with the strong convergence on the subspace F of finite dimension (cf. Remark 2.1.23), it follows from (ii) that $T_F$ is continuous. Put

$c(r) := \inf_{u \in X,\ \|u\| = r} \frac{\langle T(u), u \rangle}{\|u\|}.$


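[Figure 5.3.3: diagram of the mappings $j_F : F \to X$, $T : X \to X^*$, $j_F^* : X^* \to F^*$ and $T_F : F \to F^*$, with $u \mapsto T(u) \mapsto T_F(u) = j_F^*(T(u))$.]

Figure 5.3.3.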

By condition (iii), we have

$\lim_{r \to \infty} c(r) = \infty,$

i.e.,

$\langle T_F(u), u \rangle = \langle T(u), j_F(u) \rangle = \langle T(u), u \rangle \ge c(\|u\|)\|u\|$ holds for all $u \in F$.   (5.3.23)

Step 4. In Exercise 5.3.26 it is proved that the equation

$T_F(u) = o_{F^*}$   (5.3.24)

has at least one solution $u_F \in F$.

Step 5 (a priori estimate). There exists an $r_0 > 0$ such that

$\|u_F\| \le r_0$

holds for arbitrary $F \in \Lambda$ and for every solution $u_F \in F$ of the equation (5.3.24). Indeed, if such an $r_0$ did not exist, there would be a sequence $\{u_n\}_{n=1}^{\infty}$ of solutions of the equation (5.3.24) with $F = F_n$, n = 1, 2, ..., such that

$\lim_{n \to \infty} \|u_n\| = \infty, \qquad \lim_{n \to \infty} c(\|u_n\|) = \infty.$

This would lead to a contradiction in view of the inequality (5.3.23):

$c(\|u_n\|)\|u_n\| \le \langle T_{F_n}(u_n), u_n \rangle = 0.$

Step 6. Let $F_0 \in \Lambda$ and

$\Lambda_{F_0} = \{F \in \Lambda : F_0 \subset F\}.$

We denote by $U_{F_0}$ the set of all elements $u \in X$ which are solutions of the equation (5.3.24) for some $F \in \Lambda_{F_0}$. Furthermore, let $\overline{U_{F_0}}^{w}$ be the weak closure of the set $U_{F_0}$.⁴⁶ Note that $\overline{U_{F_0}}^{w} \subset B(o; r_0)$ for any $F_0 \in \Lambda$ (cf. Exercise 2.1.39 and the fact that $U_{F_0} \subset B(o; r_0)$ for all $F_0 \in \Lambda$). Let $\Lambda_f \subset \Lambda$ be any finite subset of $\Lambda$. Then

$\bigcap_{F_0 \in \Lambda_f} \overline{U_{F_0}}^{w} \ne \emptyset.$

Indeed, let $\Lambda_f = \{F_0^i \in \Lambda : \dim F_0^i < \infty,\ i = 1, \dots, n\}$. Then each of the sets $U_{F_0^i}$ contains all solutions $u \in X$ of the equation (5.3.24) in $\sum_{i=1}^{n} F_0^i$.

Since $B(o; r_0)$ is a compact topological space with respect to the weak topology (notice that X is reflexive), it follows from the result of Exercise 1.2.42 that

$\bigcap_{F_0 \in \Lambda} \overline{U_{F_0}}^{w} \ne \emptyset.$

Hence there exists

$u_0 \in \bigcap_{F_0 \in \Lambda} \overline{U_{F_0}}^{w}.$

In the next two steps we prove that $u_0$ is the desired solution of (5.3.22).

⁴⁶ In other words, $\overline{U_{F_0}}^{w}$ is the least weakly closed set which contains $U_{F_0}$.

Step 7. Let $v \in X$. Choose $F_0 \in \Lambda$ such that $v \in F_0$, and let $F \in \Lambda_{F_0}$. If $u_F \in F$ is a solution of the equation (5.3.24), then condition (iv) implies that

$0 \le \langle T(v) - T(u_F), v - u_F \rangle = \langle T(v), v - u_F \rangle - \langle T(u_F), v - u_F \rangle$
$= \langle T(v), v - u_F \rangle - \langle T(u_F), j_F(v - u_F) \rangle$
$= \langle T(v), v - u_F \rangle - \langle T_F(u_F), v - u_F \rangle = \langle T(v), v - u_F \rangle.$

Thus

$\langle T(v), v - u \rangle \ge 0$ holds for arbitrary $u \in U_{F_0}$.   (5.3.25)

By the definition of weak topology (see Remark 2.1.23), (5.3.25) is valid even for arbitrary $u \in \overline{U_{F_0}}^{w}$. In particular, we then have

$\langle T(v), v - u_0 \rangle \ge 0.$   (5.3.26)

Step 8 (the Minty trick). Choose $w \in X$, $t > 0$, and put $v = u_0 + tw$ in the inequality (5.3.26). Then

$0 \le \langle T(u_0 + tw), tw \rangle = t\langle T(u_0 + tw), w \rangle, \quad \text{i.e.,} \quad 0 \le \langle T(u_0 + tw), w \rangle.$

By passing to the limit as $t \to 0_+$, we obtain (applying condition (ii), the demicontinuity of T) the inequality

$\langle T(u_0), w \rangle \ge 0$   (5.3.27)

which is valid for all $w \in X$. Replacing the element w in (5.3.27) by the element −w, we have

$0 \le \langle T(u_0), -w \rangle = -\langle T(u_0), w \rangle,$   (5.3.28)

and thus

$\langle T(u_0), w \rangle = 0$ for every $w \in X$, i.e., $T(u_0) = o$.


Example 5.3.24. Let us consider the boundary value problem

$-\left( |\dot x(t)|^{p-2} \dot x(t) \right)^{\cdot} + g(x(t)) = f(t), \quad t \in (0, 1),$
$x(0) = x(1) = 0$   (5.3.29)

where $p > 1$, $f \in L^{p'}(0, 1)$ with $p' = \frac{p}{p-1}$, and g is as in Example 5.3.11 (continuous and increasing). Put $X := W_0^{1,p}(0, 1)$ with the norm

$\|x\| = \left( \int_0^1 |\dot x(t)|^p\,dt \right)^{1/p}$

and define operators $J, G : X \to X^*$ and an element $f^* \in X^*$ as in Example 5.2.51. Set $T = J + G$. Then the operator equation

$T(x) = f^*$   (5.3.30)

is equivalent to the requirement that the integral identity

$\int_0^1 |\dot x(t)|^{p-2} \dot x(t) \dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt$   (5.3.31)

holds for all $y \in X$. So, as in Example 5.2.51, to find a weak solution x of (5.3.29) (i.e., x satisfying (5.3.31)) is equivalent to finding a solution of (5.3.30). We have, by the Hölder inequality,

$\|J(x_n) - J(x)\| = \sup_{\|y\| \le 1} |\langle J(x_n) - J(x), y \rangle|$
$= \sup_{\|y\| \le 1} \left| \int_0^1 \left[ |\dot x_n(t)|^{p-2} \dot x_n(t) - |\dot x(t)|^{p-2} \dot x(t) \right] \dot y(t)\,dt \right|$
$\le \left( \int_0^1 \left| |\dot x_n(t)|^{p-2} \dot x_n(t) - |\dot x(t)|^{p-2} \dot x(t) \right|^{p'} dt \right)^{1/p'} \sup_{\|y\| \le 1} \left( \int_0^1 |\dot y(t)|^p\,dt \right)^{1/p}$
$\le \left( \int_0^1 \left| |\dot x_n(t)|^{p-2} \dot x_n(t) - |\dot x(t)|^{p-2} \dot x(t) \right|^{p'} dt \right)^{1/p'}.$   (5.3.32)

The last integral tends to zero as $\|x_n - x\| \to 0$ due to the continuity of the Nemytski operator $\Phi(\dot x)(t) = \varphi(\dot x(t))$ from $L^p(0, 1)$ into $L^{p'}(0, 1)$ with $\varphi(s) = |s|^{p-2}s$, $s \ne 0$, $\varphi(0) = 0$, $p > 1$ (see Theorem 3.2.24). Using the Hölder inequality we also have

$\|G(x_n) - G(x)\| = \sup_{\|y\| \le 1} |\langle G(x_n) - G(x), y \rangle| = \sup_{\|y\| \le 1} \left| \int_0^1 [g(x_n(t)) - g(x(t))]\,y(t)\,dt \right|$
$\le \sup_{\|y\| \le 1} \left( \int_0^1 |y(t)|^p\,dt \right)^{1/p} \left( \int_0^1 |g(x_n(t)) - g(x(t))|^{p'}\,dt \right)^{1/p'}$
$\le c \left( \int_0^1 |g(x_n(t)) - g(x(t))|^{p'}\,dt \right)^{1/p'} \to 0$   (5.3.33)

as $\|x_n - x\| \to 0$ (cf. Example 5.3.11 and the continuous embedding $W_0^{1,p}(0, 1) \subset C[0, 1]$ for $p > 1$). It follows from (5.3.32) and (5.3.33) that T is continuous and hence demicontinuous. The boundedness of T follows from estimates similar to (5.3.32), (5.3.33) (the reader is invited to do it in detail!). We also have

$\langle T(x) - T(y), x - y \rangle = \int_0^1 \left[ |\dot x(t)|^{p-2} \dot x(t) - |\dot y(t)|^{p-2} \dot y(t) \right] (\dot x(t) - \dot y(t))\,dt + \int_0^1 [g(x(t)) - g(y(t))](x(t) - y(t))\,dt$
$\ge \int_0^1 |\dot x(t)|^p\,dt - \int_0^1 |\dot y(t)|^{p-2} \dot y(t) \dot x(t)\,dt - \int_0^1 |\dot x(t)|^{p-2} \dot x(t) \dot y(t)\,dt + \int_0^1 |\dot y(t)|^p\,dt$
$\ge \int_0^1 |\dot x(t)|^p\,dt - \left( \int_0^1 |\dot y(t)|^p\,dt \right)^{1/p'} \left( \int_0^1 |\dot x(t)|^p\,dt \right)^{1/p} - \left( \int_0^1 |\dot x(t)|^p\,dt \right)^{1/p'} \left( \int_0^1 |\dot y(t)|^p\,dt \right)^{1/p} + \int_0^1 |\dot y(t)|^p\,dt$
$= \left[ \|x\|^{p-1} - \|y\|^{p-1} \right] \left[ \|x\| - \|y\| \right] \ge 0$

with strict inequality for $\|x\| \ne \|y\|$, since $s \mapsto |s|^{p-1}$ is a strictly increasing function on $(0, \infty)$. Hence the monotonicity of T follows. Finally,

$\langle T(x), x \rangle = \int_0^1 |\dot x(t)|^p\,dt + \int_0^1 g(x(t))\,x(t)\,dt$
$= \|x\|^p + \int_0^1 [g(x(t)) - g(0)](x(t) - 0)\,dt + \int_0^1 g(0)\,x(t)\,dt \ge \|x\|^p - |g(0)|\,\|x\|,$⁴⁷

i.e., T is coercive. It follows then from Theorem 5.3.22 that there is a unique solution of (5.3.30) (which in turn is a unique weak solution of (5.3.29)).

⁴⁷ We have used $\|x\|_{L^1(0,1)} \le \|x\|_{W_0^{1,p}(0,1)}$. Prove it!

The advantage of the Browder Theorem is more transparent in the case of partial differential equations when the embedding $W_0^{1,p}(\Omega) \subset C(\Omega)$ does not hold in general, and so only the demicontinuity of T can be proved. An application of the more general Theorem 5.3.23 is postponed to the last chapter, Appendix 7.6A.

Exercise 5.3.25. Prove that the unique weak solution $x = x(t)$ of (5.3.29) belongs to $C^1[0, 1]$, $|\dot x|^{p-2}\dot x$ is absolutely continuous and the equation

$-\left( |\dot x|^{p-2} \dot x \right)^{\cdot} + g(x(t)) = f(t)$ holds a.e. in (0, 1).

Hint. Integrating by parts in (5.3.31), we obtain that

$\int_0^1 \left[ |\dot x(t)|^{p-2} \dot x(t) - \int_0^t (g(x(\tau)) - f(\tau))\,d\tau \right] \dot y(t)\,dt = 0$   (5.3.34)

for every $y \in \mathcal{D}(0, 1)$. Set

$M(t) = |\dot x(t)|^{p-2} \dot x(t) - \int_0^t (g(x(\tau)) - f(\tau))\,d\tau.$

It follows from Lemma 6.1.9 that

$M(t) = c$ a.e. in (0, 1).   (5.3.35)

The assertion now follows from (5.3.35) as in the proof of Theorem 6.1.14.

Exercise 5.3.26. Prove the following assertion: Let T be a continuous mapping defined on a Banach space X of finite dimension with values in $X^*$. Assume that there exists a real function $c = c(r)$, defined on the interval $(0, \infty)$, such that

$\lim_{r \to \infty} c(r) = \infty$

and that

$\langle T(u), u \rangle \ge c(\|u\|)\|u\|$ holds for all $u \in X$.

Then $T(X) = X^*$, i.e., the equation

$T(u) = f^*$

has at least one solution in the space X for arbitrary $f^* \in X^*$.
Hint. Let $f^* \in X^*$. In the case when $X = X^* = \mathbb{R}^N$ and $\langle u, v \rangle = (u, v)_{\mathbb{R}^N}$ is the scalar product in $\mathbb{R}^N$, the operator $F : \mathbb{R}^N \to \mathbb{R}^N$ defined by the relation $F(u) = T(u) - f^*$ satisfies the assumption

$(F(u), u)_{\mathbb{R}^N} > 0$ for $u \in \partial B(o; R)$ with $R > 0$ large enough.   (5.3.36)

Apply the homotopy invariance property of the Brouwer degree and show that (5.3.36) implies that there exists $u_0 \in B(o; R)$ such that

$F(u_0) = o$, i.e., $T(u_0) = f^*$.

In the general case Remark 1.1.12(i) must be employed.

Exercise 5.3.27. Consider the problem

$-\left( |\dot x(t)|^{p-2} \dot x(t) \right)^{\cdot} = h(t, x(t), \dot x(t)), \quad t \in (0, 1),$
$x(0) = x(1) = 0,$   (5.3.37)

where $p > 1$. Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a weak solution of (5.3.37).
Hint. Apply Theorem 5.3.22.

Exercise 5.3.28. How do the conditions on h change if we replace the homogeneous Dirichlet boundary conditions in (5.3.37) by the Neumann ones?


5.4 Supersolutions, Subsolutions, Monotone Iterations

In this section we deal with another possibility of extending the notion of a monotone function to operators between Banach spaces of infinite dimension. Instead of characterizing an increasing function $f : \mathbb{R} \to \mathbb{R}$ in terms of the inequality

$(f(x) - f(y))(x - y) \ge 0$ for any $x, y \in \mathbb{R}$

(cf. Section 5.3), we use the usual "first semester calculus" definition

for any $x, y \in \mathbb{R}$ satisfying $x \le y$ we have $f(x) \le f(y)$.   (5.4.1)

However, to generalize the implication (5.4.1) to the case of general operators we have to introduce an inequality relation for Banach spaces which can be used analogously to the inequality relation for the set of real numbers.

Definition 5.4.1. Let X be a real Banach space and let K be a subset of X. Then K is called an order cone if
(1) K is closed, nonempty, and $K \ne \{o\}$;
(2) $a, b \in \mathbb{R}$, $a, b \ge 0$, $x, y \in K$ implies $ax + by \in K$;
(3) $x \in K$ and $-x \in K$ implies $x = o$.
On this basis we define

$x \le y$ provided $y - x \in K$,
$x < y$ provided $x \le y$ and $x \ne y$,
$x \ll y$ provided $y - x \in \operatorname{int} K$.   (5.4.2)

The set $[x, y] = \{z \in X : x \le z \le y\}$ is called an order interval in X. Note that

$x \ge y$ means $x - y \in K$

and similarly for ">" and "≫".

Remark 5.4.2. Condition (2) is equivalent to saying that K is convex, and if $x \in K$ and $a \ge 0$, then $ax \in K$.

Definition 5.4.3. By an ordered Banach space we mean a real Banach space together with an order cone.

Remark 5.4.4. The reader should notice the difference between order cones and cones. A subset C of the Banach space X is called a cone if $x \in C$ and $a > 0$ implies $ax \in C$. So, every order cone is a cone, but the converse is not true in general.

Example 5.4.5. Let $X = \mathbb{R}^N$. We set

$\mathbb{R}^{N,+} = \{(\xi_1, \dots, \xi_N) \in \mathbb{R}^N : \xi_i \ge 0 \text{ for all } i = 1, \dots, N\}.$

Then $K = \mathbb{R}^{N,+}$ is an order cone (see Figure 5.4.1).


We have

$(\xi_1, \dots, \xi_N) \le (\eta_1, \dots, \eta_N)$ if and only if $\xi_i \le \eta_i$ for all $i = 1, \dots, N$,
$(\xi_1, \dots, \xi_N) \ll (\eta_1, \dots, \eta_N)$ if and only if $\xi_i < \eta_i$ for all $i = 1, \dots, N$.
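The three relations of (5.4.2) for the cone $K = \mathbb{R}^{N,+}$ of Example 5.4.5 can be spelled out directly. The following minimal sketch uses hypothetical helper names `leq`, `lt` and `ll` for "≤", "<" and "≪"; it also shows that, unlike on the real line, two vectors need not be comparable.

```python
def leq(x, y):
    """x <= y: y - x lies in the cone K = R^{N,+}."""
    return all(b - a >= 0 for a, b in zip(x, y))

def lt(x, y):
    """x < y: x <= y and x != y."""
    return leq(x, y) and tuple(x) != tuple(y)

def ll(x, y):
    """x << y: y - x lies in the interior of K, i.e. all components are positive."""
    return all(b - a > 0 for a, b in zip(x, y))

x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 2.0)

# x < y holds, but x << y fails because the second components coincide.
first_pair = (leq(x, y), lt(x, y), ll(x, y))

# x << z holds since both components increase strictly.
second_pair = (leq(x, z), lt(x, z), ll(x, z))

# (1, 0) and (0, 1) are not comparable in either direction.
incomparable = not leq((1.0, 0.0), (0.0, 1.0)) and not leq((0.0, 1.0), (1.0, 0.0))
```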

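Example 5.4.6. The set C in Figure 5.4.2 is a cone in $\mathbb{R}^2$ but it is not an order cone.

[Figure 5.4.1: the order cone $K = \mathbb{R}^{2,+}$ in $\mathbb{R}^2$ with the points $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$.]

Figure 5.4.1.

[Figure 5.4.2: a cone C in the (x, y)-plane with vertex o.]

Figure 5.4.2.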

Example 5.4.7. Let $X = C(\Omega)$ for a bounded set $\Omega \subset \mathbb{R}^N$. We set

$C^+(\Omega) = \{f \in C(\Omega) : f(x) \ge 0 \text{ for every } x \in \Omega\}.$

Then $K = C^+(\Omega)$ is an order cone in X, and we have

$f \le g$ if and only if $f(x) \le g(x)$ for all $x \in \Omega$,
$f \ll g$ if and only if $f(x) < g(x)$ for all $x \in \Omega$.

The following assertion summarizes the basic properties of ordering in a Banach space X.

Proposition 5.4.8. For all $u, x, x_n, y, y_n, z \in X$ and all $a, b \in \mathbb{R}$, we have

$x \le x$,
$x \le y$ and $y \le x$ imply $x = y$,
$x \le y$ and $y \le z$ imply $x \le z$.

Furthermore, we have

$x \le y$ and $0 \le a \le b$ imply $ax \le by$,
$x \le y$ and $u \le z$ imply $x + u \le y + z$,
$x_n \le y_n$ for all n implies $\lim_{n \to \infty} x_n \le \lim_{n \to \infty} y_n$⁴⁸

provided the limits exist. For the symbol "≪", the following implications hold:

$x \ll y$ and $y \le z$ imply $x \ll z$,
$x \le y$ and $y \ll z$ imply $x \ll z$,
$x \ll y$ and $a > 0$ imply $ax \ll ay$.

⁴⁸ The limits are understood in the norm topology of X.

Proof. Use (5.4.2) and the properties of K. For example, if $x_n \le y_n$ for all n, then $y_n - x_n \in K$. Since K is closed and the limits $x$, $y$ of $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ exist, we conclude that $y - x \in K$, i.e., $x \le y$.

Definition 5.4.9. The order cone K is called normal if there is a number $c > 0$ such that for all $x, y \in X$, $o \le x \le y$ we have $\|x\| \le c\|y\|$.

Example 5.4.10. For $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$ is a normal order cone in $\mathbb{R}^N$. Similarly, $C^+(\Omega)$ is a normal order cone in $C(\Omega)$.

Lemma 5.4.11. If an order cone is normal, then every order interval [x, y] is bounded in the norm.

Proof. If $x \le w \le y$, then $o \le w - x \le y - x$, and hence

$\|w\| \le \|w - x\| + \|x\| \le c\|x - y\| + \|x\|.$

Now we can introduce the definition of a monotone increasing operator between ordered Banach spaces.

Definition 5.4.12. Let X and Y be ordered Banach spaces. An operator $T : \operatorname{Dom} T \subset X \to Y$ is said to be monotone increasing if

$x \le y$ implies $T(x) \le T(y)$ for any $x, y \in \operatorname{Dom} T$.

An operator T is said to be strictly or strongly monotone increasing if the symbol "≤" is replaced by "<" or "≪", respectively. Similarly we define (strictly, strongly) monotone decreasing operator. The operator T is said to be positive if $T(o) \ge o$ and for all $x \in \operatorname{Dom} T$,

$x > o$ implies $T(x) \ge o$.

As above, the operator is strictly or strongly positive if the symbol "≥" is replaced by ">" or "≫", respectively.

Example 5.4.13. Let $X = Y = \mathbb{R}$, $K = \mathbb{R}^+$. Then for a real function $f : \operatorname{Dom} f \subset \mathbb{R} \to \mathbb{R}$ the concepts of (strictly) monotone increasing (or decreasing) above coincide with the usual definitions. Because of the equivalence of $x \ll y$ and $x < y$ on $\mathbb{R}$ there is no difference here between strongly monotone increasing (decreasing) and strictly monotone increasing (decreasing).


Example 5.4.14. For a linear operator T, the concepts of (strictly, strongly) positive are the same as those of (strictly, strongly) monotone increasing. Indeed, let T be positive, for example. Then we have the following sequence of implications:

$x \le y \implies o \le y - x \implies o \le T(y - x) \implies o \le T(y) - T(x) \implies T(x) \le T(y),$

i.e., T is monotone increasing. The other proofs are similar.
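A finite dimensional instance of Example 5.4.14 is a matrix with nonnegative entries acting on $\mathbb{R}^2$ ordered by $K = \mathbb{R}^{2,+}$. The sketch below is an illustration only: the matrix `A`, the helper names and the random sampling are assumptions, and a sampled check is of course no substitute for the proof above.

```python
import random

# Hypothetical positive linear operator: all entries of A are nonnegative.
A = [[1.0, 2.0],
     [0.0, 3.0]]

def apply(matrix, x):
    """Matrix-vector product."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in matrix)

def leq(x, y):
    """Componentwise order induced by the cone R^{2,+}."""
    return all(a <= b for a, b in zip(x, y))

random.seed(0)
monotone_on_samples = True
for _ in range(1000):
    x = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    d = (random.uniform(0.0, 5.0), random.uniform(0.0, 5.0))   # d lies in K
    y = (x[0] + d[0], x[1] + d[1])                            # hence x <= y
    monotone_on_samples = monotone_on_samples and leq(apply(A, x), apply(A, y))
```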

Let $T : X \to X$ be an operator on a Banach space X. We consider the operator equation

$u = T(u)$   (5.4.3)

and apply an iterative method to solve it. For this purpose we consider the iterations (successive approximations)

$u_{n+1} = T(u_n)$ and $v_{n+1} = T(v_n)$, $n = 0, 1, 2, \dots$.   (5.4.4)

We illustrate the idea of approximations in Figure 5.4.3.

[Figure 5.4.3: graph of T over X with the iterates $u_0, u_1, u_2, \dots$ increasing to u and $v_0, v_1, v_2, \dots$ decreasing to v.]

Figure 5.4.3. Fixed points u, v of T.

The next definition is a basic definition of the existence theory for operator equations in ordered Banach spaces.


Definition 5.4.15. A point u is called a supersolution of (5.4.3) (or of the operator T) if $T(u) \le u$. The prefix "super" is replaced by "sub" when the respective inequalities are reversed. For example, $u \in X$ satisfying $u \le T(u)$ is a subsolution of (5.4.3).⁴⁹

The following assertions justify the general principle of super- and subsolutions. This principle can be formulated as follows: If there is a super- and subsolution, then a solution can be obtained by the convergent iterative method (5.4.4). Namely, we have the following results.

Theorem 5.4.16. Let $T : X \to X$ be a compact monotone increasing operator on a real Banach space X with a normal order cone $X^+$ and $u_0$ be a subsolution ($v_0$ a supersolution) of (5.4.3). Then $\{u_n\}_{n=1}^{\infty}$ ($\{v_n\}_{n=1}^{\infty}$) in (5.4.4) converges if and only if this sequence is bounded above (below).⁵⁰ In the case of convergence, the limit point u is the smallest fixed point u of T with $u_0 \le u$ (v is the greatest fixed point v of T with $v \le v_0$).⁵¹

Proof. We will consider the case of a subsolution. The case of a supersolution is very similar. Since T is monotone increasing, we have the sequence of implications

$u_0 \le T(u_0) = u_1 \implies u_1 \le u_2 \implies \cdots$, i.e., $u_0 \le u_1 \le u_2 \le \cdots$.

Now $u_n \to u$ implies that $u_n \le u$ for all n. Consequently, the sequence $\{u_n\}_{n=1}^{\infty}$ is bounded above, if it is convergent.

Conversely, the sequence $\{u_n\}_{n=1}^{\infty}$ is convergent if it is bounded above. Indeed, suppose $u_n \le v$ for all n. By Lemma 5.4.11, the sequence $\{u_n\}_{n=1}^{\infty}$ is bounded in the norm. Since $u_n = T(u_{n-1})$ and since T is compact, the set of all $u_n$ is relatively compact. Thus there exist a convergent subsequence $\{u_{n_k}\}_{k=1}^{\infty}$ and u such that $u_{n_k} \to u$. Since the sequence $\{u_n\}_{n=1}^{\infty}$ is monotone, all convergent subsequences have the same limit point. Therefore, the whole sequence $\{u_n\}_{n=1}^{\infty}$ converges to u as well. Since $u_{n+1} = T(u_n)$, letting $n \to \infty$ shows that $u = T(u)$.

Let w be a solution of (5.4.3) with $u_0 \le w$. Then $u_1 = T(u_0) \le T(w) = w$, etc., so that $u_n \le w$ for all n. Hence $u \le w$.

The intuitive meaning of the following assertion is demonstrated in Figure 5.4.3.

⁴⁹ The terminology is not fixed. Instead of super- and subsolution the notions upper and lower solutions have been also used.
⁵⁰ The set $M \subset X$ is bounded above if there is $m \in X$ such that $u \le m$ for any $u \in M$.
⁵¹ Note that the concepts of "smallest" and "greatest" are used in the usual sense, e.g., a smallest fixed point u in X is characterized by $u \le w$ for all other fixed points $w \ge u_0$.

5.4. Supersolutions, Subsolutions, Monotone Iterations


Corollary 5.4.17 (Monotone Iterative Method). Let $X$ be a real Banach space with a normal order cone and $T : X \to X$. Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.3), respectively, and $u_0 \le v_0$. If $T$ is a compact monotone increasing operator on the order interval $[u_0, v_0]$, then both the iterative sequences $\{u_n\}_{n=1}^{\infty}$ and $\{v_n\}_{n=1}^{\infty}$ from (5.4.4) are defined, converge, and
\[ u = \lim_{n\to\infty} u_n \quad\text{and}\quad v = \lim_{n\to\infty} v_n \]
are the smallest fixed point and the largest fixed point, respectively, of $T$ in $[u_0, v_0]$. Furthermore, we have the error estimates
\[ u_n \le u \le v \le v_n \quad\text{for all}\quad n = 0, 1, \dots \]
(Figure 5.4.3).

Proof. Since $u_0 \le T(u_0)$, $T(v_0) \le v_0$ and $u_0 \le v_0$ together imply that $u_0 \le u_1 \le v_0$ and similarly that $u_n \le v_0$ for all $n$, it follows that $\{u_n\}_{n=1}^{\infty}$ is bounded above. The proof then follows from Theorem 5.4.16.
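The two-sided error estimate of Corollary 5.4.17 can be watched in the simplest ordered Banach space, the real line with the cone $[0, \infty)$. The following sketch is only an illustration under assumptions that are not part of the text: the map $T(u) = \sqrt{u + 2}$, the subsolution $u_0 = 0$ and the supersolution $v_0 = 4$ are hypothetical choices, and the only fixed point of $T$ in $[0, 4]$ is $2$.

```python
import math

def T(u):
    """Monotone increasing map on the real line (hypothetical example)."""
    return math.sqrt(u + 2.0)

def iterate(operator, start, steps):
    """Return [start, T(start), T(T(start)), ...] after `steps` applications."""
    seq = [start]
    for _ in range(steps):
        seq.append(operator(seq[-1]))
    return seq

u0, v0 = 0.0, 4.0            # T(u0) >= u0 (subsolution), T(v0) <= v0 (supersolution)
lower = iterate(T, u0, 40)   # nondecreasing sequence u_n
upper = iterate(T, v0, 40)   # nonincreasing sequence v_n

# Each pair (u_n, v_n) brackets the fixed point, as in u_n <= u <= v <= v_n.
for n in (0, 1, 2, 5, 40):
    print(n, lower[n], upper[n])
```

In this toy case both sequences meet at the same fixed point, so the width $v_n - u_n$ gives a computable error bound at every step.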

Example 5.4.18 (Integral Equation). Let $\Omega$ be a bounded domain in $\mathbb{R}^N$, $G : \Omega \times \Omega \to \mathbb{R}$ be a continuous and nonnegative function and $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a continuous function which is increasing in the second variable. Consider the integral equation
\[ u(x) = \int_{\Omega} G(x, y) f(y, u(y))\, dy \tag{5.4.5} \]
in the space $C(\Omega)$. We write this equation in the form
\[ u = T(u), \qquad u \in C(\Omega). \]
Let us consider the normal order cone $C^+(\Omega)$ from Example 5.4.7. Then the operator $T : C(\Omega) \to C(\Omega)$ is compact and monotone increasing.⁵² Considering subsolutions and supersolutions now means that we replace “=” by “≤” and “≥”, respectively, in the integral equation (5.4.5). Corollary 5.4.17 implies that if $u_0 \in C(\Omega)$ is a subsolution and $v_0 \in C(\Omega)$ is a supersolution with $u_0 \le v_0$ on $\Omega$, then for $n \to \infty$ the iterative method
\[ u_{n+1}(x) = \int_{\Omega} G(x, y) f(y, u_n(y))\, dy, \qquad n = 0, 1, 2, \dots, \]
converges uniformly on $\Omega$ to a solution $u \in C(\Omega)$ of the integral equation with $u_0 \le u \le v_0$ on $\Omega$. Here $u$ is the smallest solution with this property. If, instead, the iterative method starts with $v_0$, then we obtain the greatest solution $v$ with $u_0 \le v \le v_0$. The most difficult task in solving (5.4.5) is to find at least one subsolution $u_0$ and/or supersolution.

⁵² The compactness of $T$ has been proved in Example 5.1.14; $f = f(y, u)$ increasing in $u$ immediately implies that $T$ is monotone increasing.
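The iteration of Example 5.4.18 can be imitated numerically by replacing the integral with a quadrature rule with nonnegative weights, which keeps the discrete operator monotone. The sketch below is an assumption-laden illustration, not part of the text: the interval $\Omega = (0, 1)$, the kernel $G(x, y) = \min(x, y)(1 - \max(x, y))$, the nonlinearity $f(y, u) = 1 + \arctan u$, the trapezoidal rule and the constants $u_0 \equiv 0$, $v_0 \equiv 1$ are all hypothetical choices.

```python
import math

N = 41                                     # grid points on [0, 1]
H = 1.0 / (N - 1)
GRID = [i / (N - 1) for i in range(N)]
# Trapezoidal weights are nonnegative, so the discrete operator stays monotone.
WEIGHTS = [H / 2 if i in (0, N - 1) else H for i in range(N)]

def G(x, y):
    """Continuous, nonnegative kernel (hypothetical choice)."""
    return min(x, y) * (1.0 - max(x, y))

def f(y, u):
    """Continuous and increasing in u (hypothetical choice)."""
    return 1.0 + math.atan(u)

def T(u):
    """Quadrature approximation of (Tu)(x) = int_0^1 G(x, y) f(y, u(y)) dy."""
    return [
        sum(w * G(x, y) * f(y, uy) for w, y, uy in zip(WEIGHTS, GRID, u))
        for x in GRID
    ]

def monotone_iteration(start, steps=30):
    u = list(start)
    for _ in range(steps):
        u = T(u)
    return u

u0 = [0.0] * N   # subsolution: 0 <= T(0), since G >= 0 and f(y, 0) = 1
v0 = [1.0] * N   # supersolution: T(1)(x) = (1 + pi/4) x (1 - x) / 2 <= 1
smallest = monotone_iteration(u0)
greatest = monotone_iteration(v0)
gap = max(abs(a - b) for a, b in zip(smallest, greatest))
print("max |v_n - u_n| on the grid:", gap)
```

For this particular choice the iterations from below and from above agree to rounding error, which suggests that the discrete problem has a single solution between $u_0$ and $v_0$; in general the two limits may differ.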


Example 5.4.19 (Differential Equation). Let us consider the Dirichlet boundary value problem
\[ \begin{cases} -\ddot{x}(t) = f(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.4.6} \]
where $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ is a function continuous in the first variable and continuously differentiable in the second one. Suppose $u_0, v_0 \in C^2[0, 1]$ are such that⁵³
\[ \begin{cases} -\ddot{v}_0(t) \ge f(t, v_0(t)), & t \in (0, 1), \\ -\ddot{u}_0(t) \le f(t, u_0(t)), & t \in (0, 1), \\ u_0(0) \le 0, \quad u_0(1) \le 0, \quad v_0(0) \ge 0, \quad v_0(1) \ge 0. \end{cases} \tag{5.4.7} \]
We will show that $u_0$, $v_0$ are a subsolution and a supersolution, respectively, for the operator $w = T(z)$ defined by a solution of the problem
\[ \begin{cases} -\ddot{w}(t) + c w(t) = f(t, z(t)) + c z(t) =: F(t, z), & t \in (0, 1), \\ w(0) = w(1) = 0, \end{cases} \]
where $c > 0$ is chosen in such a way that
\[ \frac{\partial f}{\partial s}(t, s) + c > 0 \quad\text{for } t \in [0, 1] \text{ and } s \in I_0 := \Bigl[\min_{t \in [0,1]} u_0(t),\ \max_{t \in [0,1]} v_0(t)\Bigr]. \]
Notice that $0 \in I_0$. The map $T$ is correctly defined because the Dirichlet problem
\[ \begin{cases} -\ddot{w}(t) + c w(t) = g(t), & t \in (0, 1), \\ w(0) = w(1) = 0 \end{cases} \tag{5.4.8} \]
has a unique solution for any fixed $g \in C[0, 1]$. Then $T : C[0, 1] \to C[0, 1]$ is a compact operator. This follows from the fact that $T$ is composed from the Nemytski operator $N : z(t) \mapsto f(t, z(t)) + c z(t)$, which is continuous, and a compact linear operator $A^{-1}$ where
\[ A(w(t)) = -\ddot{w}(t) + c w(t), \qquad \operatorname{Dom} A = \{w \in C^2[0, 1] : w(0) = w(1) = 0\} \]
(cf. Example 2.2.17), i.e., $T = A^{-1} \circ N$.

We will prove that $T : C[0, 1] \to C[0, 1]$ is a monotone increasing operator. Indeed, let $z_1, z_2 \in C[0, 1]$, $z_1 \le z_2$. By definition,
\[ \begin{cases} -\bigl(T(z_i)\bigr)^{\cdot\cdot}(t) + c\, T(z_i)(t) = f(t, z_i(t)) + c z_i(t), & t \in (0, 1), \\ (T(z_i))(0) = (T(z_i))(1) = 0 \end{cases} \qquad\text{for } i = 1, 2. \]

⁵³ The functions $u_0$, $v_0$ are called a subsolution and supersolution of the boundary value problem (5.4.6), respectively.


Putting $w = T(z_2) - T(z_1)$ we get
\[ \begin{cases} -\ddot{w}(t) + c w(t) = f(t, z_2(t)) - f(t, z_1(t)) + c(z_2(t) - z_1(t)), & t \in (0, 1), \\ w(0) = w(1) = 0. \end{cases} \]
However, the function $F : (t, s) \mapsto f(t, s) + cs$ is increasing in $s$ on the interval $I_0$ by the choice of $c$. Hence for $z_1 \le z_2$, $z_1(t), z_2(t) \in I_0$ for every $t \in [0, 1]$, we have
\[ 0 \le F(t, z_2) - F(t, z_1) = f(t, z_2(t)) - f(t, z_1(t)) + c(z_2(t) - z_1(t)) = -\ddot{w}(t) + c w(t). \]
Therefore
\[ \begin{cases} -\ddot{w}(t) + c w(t) \ge 0, & t \in (0, 1), \\ w(0) = w(1) = 0. \end{cases} \tag{5.4.9} \]
Assume that there is $t \in (0, 1)$ such that $w(t) < 0$. Then there is $t_0 \in (0, 1)$ such that
\[ 0 > w(t_0) = \min_{t \in [0,1]} w(t). \]
But then $\ddot{w}(t_0) \ge 0$, a contradiction with the inequality (5.4.9). Hence $w(t) \ge 0$ in $(0, 1)$, i.e., $T(z_1) \le T(z_2)$.⁵⁴

We now prove that $v_0 \ge T(v_0)$, i.e., $v_0$ is a supersolution of $T$. Set $v_1 = T(v_0)$. We get
\[ \begin{cases} -\ddot{v}_1(t) + c v_1(t) = f(t, v_0(t)) + c v_0(t), & t \in (0, 1), \\ v_1(0) = v_1(1) = 0, \end{cases} \]
therefore
\[ -\bigl(v_1(t) - v_0(t)\bigr)^{\cdot\cdot} + c\bigl(v_1(t) - v_0(t)\bigr) = f(t, v_0(t)) + c v_0(t) + \ddot{v}_0(t) - c v_0(t) \le 0 \]
for $t \in (0, 1)$. The same argument as above yields that $v_1(t) \le v_0(t)$, $t \in [0, 1]$. Analogously we prove that $u_0 \le T(u_0)$, i.e., $u_0$ is a subsolution of $T$. If, moreover, $u_0 \le v_0$, then Corollary 5.4.17 can be used (cf. Exercise 5.4.22).

Exercise 5.4.20. Let $T : \mathbb{R}^N \to \mathbb{R}^N$. Then the equation $x = T(x)$, $x \in \mathbb{R}^N$, in (5.4.3) describes a system of nonlinear equations
\[ x_i = T_i(x_1, \dots, x_N), \qquad i = 1, \dots, N. \]
Consider the order cone $\mathbb{R}^{N,+}$ from Example 5.4.5. Translate all the assumptions and conclusions of Theorem 5.4.16 and Corollary 5.4.17 to this system.

⁵⁴ The argument used to prove $w(t) \ge 0$ in $(0, 1)$ is a special version of the more general Maximum Principle (see, e.g., Protter & Weinberger [102]). The monotonicity of $T$ can also be shown by proving that the Green function corresponding to the operator $A$ (Example 2.2.17) is nonnegative.
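The construction of Example 5.4.19, where $T = A^{-1} \circ N$ with the shifted operator $A w = -\ddot{w} + c w$, translates directly into a finite-difference scheme. The sketch below is a hedged illustration only: the right-hand side $f(t, x) = 1 - x^3$, the constants $u_0 \equiv 0$ and $v_0 \equiv 1$ (so that $I_0 = [0, 1]$), the shift $c = 4$, the uniform grid and the tridiagonal solver are assumptions made here, not taken from the text.

```python
def f(t, x):
    """Right-hand side of -x'' = f(t, x); hypothetical choice, decreasing in x."""
    return 1.0 - x ** 3

C_SHIFT = 4.0      # c > 0 with df/dx + c = c - 3 x^2 > 0 on I0 = [0, 1]
N = 49             # interior grid points
H = 1.0 / (N + 1)
GRID = [(i + 1) * H for i in range(N)]

def solve_shifted(g):
    """Solve the finite-difference form of -w'' + c w = g, w(0) = w(1) = 0,
    with the Thomas algorithm for tridiagonal systems."""
    off = -1.0 / (H * H)
    diag = 2.0 / (H * H) + C_SHIFT
    cp, dp = [0.0] * N, [0.0] * N
    cp[0], dp[0] = off / diag, g[0] / diag
    for i in range(1, N):
        denom = diag - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (g[i] - off * dp[i - 1]) / denom
    w = [0.0] * N
    w[-1] = dp[-1]
    for i in range(N - 2, -1, -1):
        w[i] = dp[i] - cp[i] * w[i + 1]
    return w

def T(z):
    """Discrete analogue of T = A^{-1} o N with N(z) = f(t, z) + c z."""
    return solve_shifted([f(t, zi) + C_SHIFT * zi for t, zi in zip(GRID, z)])

def monotone_iteration(start, steps=60):
    z = list(start)
    for _ in range(steps):
        z = T(z)
    return z

u0 = [0.0] * N     # subsolution: -u0'' = 0 <= f(t, 0) = 1
v0 = [1.0] * N     # supersolution: -v0'' = 0 >= f(t, 1) = 0, v0 >= 0 at t = 0, 1
lower = monotone_iteration(u0)
upper = monotone_iteration(v0)
print("max gap:", max(abs(a - b) for a, b in zip(lower, upper)))
print("approximate x(1/2):", lower[N // 2])
```

The shift by $c$ is what makes the discrete operator monotone even though $f$ itself is decreasing in $x$; without it the iteration would not preserve the order between successive iterates.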


Exercise 5.4.21. Formulate conditions on a nonlinear function $f : \Omega \times \mathbb{R} \to \mathbb{R}$ which guarantee that there exist a subsolution $u_0 \in C(\Omega)$ and a supersolution $v_0 \in C(\Omega)$ of the operator $T$ from Example 5.4.18 such that $u_0 \le v_0$ on $\Omega$. Then formulate the corresponding existence result for the integral equation (5.4.5).

Exercise 5.4.22. Formulate conditions on a function $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ which guarantee that there exist a subsolution $u_0 \in C^2[0, 1]$ and a supersolution $v_0 \in C^2[0, 1]$ of the operator $T$ from Example 5.4.19 such that $u_0 \le v_0$ in $[0, 1]$. Then formulate the corresponding existence result for the boundary value problem (5.4.6).

Exercise 5.4.23. Replace in (5.4.6) the homogeneous Dirichlet boundary conditions by the Neumann ones. Modify the definitions of a subsolution and a supersolution in such a way that Corollary 5.4.17 could be applied. Formulate conditions on $f = f(t, x)$ which guarantee the existence of a solution of the corresponding Neumann problem.
Hint. Use Corollary 5.4.17.

Exercise 5.4.24. Consider the Dirichlet boundary value problem
\[ \begin{cases} -\ddot{x}(t) = h(t, x(t), \dot{x}(t)), & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{5.4.10} \]
Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a solution of (5.4.10).
Hint. Use Corollary 5.4.17.

Exercise 5.4.25. How do the conditions on $h$ change if we replace the homogeneous Dirichlet boundary conditions in (5.4.10) by the Neumann ones?

5.4A Minorant Principle and Krein–Rutman Theorem

In this appendix we study the eigenvalue problem
\[ T(x) = \lambda x, \tag{5.4.11} \]
and the corresponding inhomogeneous equation
\[ \lambda x - T(x) = y, \qquad y > o, \tag{5.4.12} \]
on a real Banach space $X$ with an order cone $K := X^+$.

Definition 5.4.26. By a positive solution $(x, \lambda)$ of (5.4.11), we mean a solution of $T(x) = \lambda x$ with $x > o$ and $\lambda > 0$. If we replace “=” by “≥”, then we speak about a positive subsolution.

Although we present mainly statements about linear problems, the following results play a central role in the investigation of nonlinear problems, for example, in the bifurcation theory, variational principles, etc. The essential tools for investigating (5.4.11) are the Minorant Principle and the Separation Theorem for convex sets (see Corollary 2.1.18). Set
\[ K_r = \{x \in K : \|x\| \le r\} \quad\text{for a fixed, given } r > 0. \]


The key is to find a suitable minorant $M$ for $T$, so that
\[ T(x) \ge M(x) \quad\text{for all}\quad x \in K_r, \tag{5.4.13} \]
and which satisfies appropriate conditions. Furthermore, it is important to know a subsolution $x_0$, i.e.,
\[ M(x_0) \ge c x_0, \qquad c > 0, \quad x_0 > o. \tag{5.4.14} \]
The general Minorant Principle: If we know a subsolution of (5.4.11), then we can obtain a positive eigenvalue with a positive eigenvector of (5.4.11), is formulated precisely in the following two theorems.

Theorem 5.4.27 (Krasnoselski). Suppose that
(i) $X$ is a real Banach space with an order cone $K$;
(ii) an operator $T : K_r \subset X \to X$ is compact and (5.4.13) holds;
(iii) a linear operator $M : K \to X$ is positive, and there are an $x_0 > o$ and a positive real number $c$ such that (5.4.14) holds.
Then for every $\varrho$ with $0 < \varrho < r$ the problem (5.4.11) has a positive solution $(x, \lambda)$ satisfying $\|x\| = \varrho$.

Theorem 5.4.28 (Zeidler). Let us set
\[ \alpha(x) = \sup\{t \ge 0 : x \ge t x_0\} \quad\text{for fixed } x_0 > o \text{ and all } x \in K. \]
The conclusion of Theorem 5.4.27 still holds if we replace (iii) by the following condition:
(iii′) suppose that $M : K \subset X \to K$ is an operator, not necessarily linear, for which there is an $x_0 > o$ and there are real numbers $s$ with $0 < s \le 1$ and $c > 0$ such that
\[ M(x) \ge (\alpha(x))^s c x_0 \quad\text{for all}\quad x \in K_r. \tag{5.4.15} \]
Theorem 5.4.27 is a special case of Theorem 5.4.28. Indeed, since $x \ge \alpha(x) x_0$ for $x \in K_r$, we have
\[ M(x) \ge \alpha(x) M(x_0) \ge \alpha(x) c x_0 \quad\text{for all}\quad x \in K_r. \]
Thus, (5.4.14) implies (5.4.15) with $s = 1$.

Proof of Theorem 5.4.28. We will use a regularization method and the Schauder Fixed Point Theorem (Theorem 5.1.11). Let us first solve an auxiliary problem
\[ \lambda_n x_n = T_n(x_n), \qquad \lambda_n > 0, \quad x_n > o, \quad \|x_n\| = \varrho, \tag{5.4.16} \]
where
\[ T_n(x) := T(x) + \frac{x_0}{n}, \qquad n = 1, 2, \dots, \quad 0 < \varrho \le r. \]
Let $n$ be fixed. Set
\[ z(x) = \frac{\varrho x}{\|x\|} \quad\text{for } x \ne o \quad\text{and}\quad z(o) = o. \]
For $x \in K_\varrho$ we set
\[ S(x) = \|x\|\, T_n(z(x)) + \frac{(\varrho - \|x\|) x_0}{n}. \]


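Then $S$ is compact on $K_\varrho$ (explain why!), and by (5.4.13), (5.4.16)
\[ S(x) \ge \frac{\|x\| x_0}{n} + \frac{(\varrho - \|x\|) x_0}{n} = \frac{\varrho x_0}{n} > o \quad\text{for all}\quad x \in K_\varrho. \]
So there is an $a_n > 0$ such that
\[ \|S(x)\| \ge a_n \quad\text{for all}\quad x \in K_\varrho. \]
It follows from the boundedness of $S(K_\varrho)$ that there exists a number $b_n > 0$ such that
\[ 0 < a_n \le \|S(x)\| \le b_n \quad\text{for all}\quad x \in K_\varrho. \tag{5.4.17} \]
By (5.4.17),
\[ V(x) = \frac{\varrho\, S(x)}{\|S(x)\|} \]
is well defined on $K_\varrho$. Furthermore, the operator $V : K_\varrho \to K_\varrho$ is compact on the closed, bounded, and convex set $K_\varrho$ (why?). By Theorem 5.1.11 (the Schauder Fixed Point Theorem) there is an $x_n \in K_\varrho$ such that
\[ x_n = V(x_n). \]
In particular, $\|x_n\| = \|V(x_n)\| = \varrho$, so $z(x_n) = x_n$. Therefore
\[ x_n = \frac{\varrho^2\, T_n(x_n)}{\|S(x_n)\|}, \]
which means that $x_n$ is a solution of (5.4.16) with $\lambda_n = \|S(x_n)\| / \varrho^2$.

Before we pass to the limit for $n \to \infty$, we estimate the value of $\lambda_n$. Namely, we will show that there exist numbers $a, b > 0$ such that
\[ 0 < a \le \lambda_n \le b \quad\text{for all}\quad n \in \mathbb{N}. \tag{5.4.18} \]
It follows from (5.4.16) that
\[ \varrho \lambda_n \le \|T(x_n)\| + \|x_0\|, \quad\text{so}\quad b := \sup_{n \in \mathbb{N}} \lambda_n < \infty. \]
On the other hand, $x_n \ge \alpha(x_n) x_0$ implies that there exists $\gamma$ such that
\[ \gamma := \sup_{n \in \mathbb{N}} \alpha(x_n) < \infty. \]
Indeed, otherwise there would be a subsequence, again denoted by $\{x_n\}_{n=1}^{\infty}$, with $\alpha(x_n) \to \infty$ as $n \to \infty$, contradicting
\[ o < x_0 \le \frac{x_n}{\alpha(x_n)} \to o \quad\text{as } n \to \infty. \]
Now (5.4.16), (5.4.13) and (5.4.15) imply that
\[ \lambda_n x_n = T(x_n) + \frac{x_0}{n} \ge M(x_n) + \frac{x_0}{n} \ge \frac{x_0}{n}. \]
Therefore, $\alpha(x_n) > 0$, and furthermore $\lambda_n x_n \ge M(x_n) \ge (\alpha(x_n))^s c x_0$, i.e., the definition of $\alpha(x_n)$ implies
\[ \alpha(x_n) \ge \frac{(\alpha(x_n))^s c}{\lambda_n}, \quad\text{so}\quad \lambda_n \ge \alpha(x_n)^{s-1} c \ge \gamma^{s-1} c =: a. \]
This proves (5.4.18).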


Now, we pass to the limit $n \to \infty$ in (5.4.16). Using (5.4.18) and $\|x_n\| = \varrho$, we can find convergent subsequences, again denoted by $\{\lambda_n\}_{n=1}^{\infty}$ and $\{T(x_n)\}_{n=1}^{\infty}$, with $\lambda_n \to \lambda$ and $T(x_n) \to y$ strongly in $X$. By (5.4.18), $\lambda > 0$. Then we have also strong convergence in $X$ for
\[ x_n = \lambda_n^{-1}\Bigl(T(x_n) + \frac{x_0}{n}\Bigr) \to x. \]
Hence $\lambda x = T(x)$ and $x \in K$, $\|x\| = \varrho$.

Example 5.4.29. We will consider the nonlinear system of equations
\[ \lambda \xi_i = f_i(\xi_1, \dots, \xi_N), \qquad i = 1, \dots, N, \tag{5.4.19} \]
with $x = (\xi_1, \dots, \xi_N)$ and $x \in K := \mathbb{R}^{N,+}$, $\|x\| = \varrho$, $\lambda > 0$. The following assertion (the Generalized Perron Theorem) is a consequence of Theorem 5.4.27:

Suppose that $f_i : K \to (0, \infty)$ is continuous for $i = 1, \dots, N$ and that there is a fixed $r > 0$ for which
\[ f_i(\xi_1, \dots, \xi_N) \ge \sum_{j=1}^{N} \mu_{ij} \xi_j \quad\text{holds for all } x \in K \text{ with } \|x\| \le r, \tag{5.4.20} \]
and $i = 1, \dots, N$. Assume that all the real numbers $\mu_{ij}$ are nonnegative, and that
\[ \min_{1 \le i \le N} \sum_{j=1}^{N} \mu_{ij} > 0. \]
Then (5.4.19) has a positive solution for every $\varrho$ with $0 < \varrho \le r$. Indeed, we can write (5.4.19) as $\lambda x = T(x)$ and apply Theorem 5.4.27 with $X = \mathbb{R}^N$, $X^+ = \mathbb{R}^{N,+}$, $x_0 = (1, \dots, 1)$, and
\[ M(x) := (\eta_1, \dots, \eta_N) \quad\text{where}\quad \eta_i = \sum_{j=1}^{N} \mu_{ij} \xi_j. \]
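The proof of Theorem 5.4.28 obtains the eigenvector as a fixed point of the normalized map $V(x) = \varrho\, S(x) / \|S(x)\|$ via the Schauder theorem, which gives existence but no algorithm. As a purely numerical experiment one can nevertheless try the analogous normalized iteration $x \leftarrow \varrho f(x) / \|f(x)\|$ for the system (5.4.19). The sketch below rests on assumptions not made in the text: the coefficients `MU`, the constant `SHIFT`, the Euclidean norm and the starting point are hypothetical, and convergence of the iteration is not guaranteed by the theorem in general (for this small affine example it does converge).

```python
import math

MU = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical coefficients mu_ij >= 0
SHIFT = [0.1, 0.1]              # makes f non-homogeneous and strictly positive

def f(x):
    """f_i(x) = sum_j mu_ij x_j + SHIFT_i, so (5.4.20) holds for every r > 0."""
    return [sum(m * xj for m, xj in zip(row, x)) + s for row, s in zip(MU, SHIFT)]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def normalized_iteration(rho, steps=200):
    """Iterate x <- rho * f(x) / ||f(x)||, a numerical analogue of the map V."""
    x = [rho / math.sqrt(2.0)] * 2      # start inside the cone with ||x|| = rho
    for _ in range(steps):
        fx = f(x)
        scale = rho / norm(fx)
        x = [scale * v for v in fx]
    lam = norm(f(x)) / rho              # at a fixed point, lam * x = f(x)
    return x, lam

x, lam = normalized_iteration(rho=1.0)
residual = max(abs(lam * xi - fi) for xi, fi in zip(x, f(x)))
print("x =", x, "lambda =", lam, "residual =", residual)
```

Because $f$ is not homogeneous, the computed $\lambda$ depends on the prescribed norm $\varrho$, which is the feature that distinguishes (5.4.19) from a linear eigenvalue problem.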

Example 5.4.30. We will consider the nonlinear integral equation
\[ \lambda x(t) = \int_a^b A(t, s) f(s, x(s))\, ds \tag{5.4.21} \]
on a finite interval $[a, b]$ with $\lambda > 0$. This time, for fixed $\mu > 0$ and $r > 0$, the key condition (substituting (5.4.20)) is
\[ f(s, x) \ge \mu x \quad\text{for all}\quad (s, x) \in [a, b] \times [0, r]. \tag{5.4.22} \]
Applying Theorem 5.4.27 we have the following assertion (the Generalized Jentzsch Theorem):

Suppose $A : [a, b] \times [a, b] \to \mathbb{R}$ is continuous, nonnegative, and
\[ \min_{t \in [a,b]} \int_a^b A(t, s)\, ds > 0. \]
Let $f : [a, b] \times \mathbb{R} \to \mathbb{R}$ be continuous and let (5.4.22) be satisfied. Then for every $\varrho$ with $0 < \varrho \le r$, (5.4.21) has a positive solution $x \in C^+[a, b]$ with $\|x\| = \varrho$.


Indeed, we write (5.4.21) as $\lambda x = T(x)$ and apply Theorem 5.4.27 with $X = C[a, b]$, $X^+ = C^+[a, b]$, $x_0(t) \equiv 1$ and
\[ M(x)(t) := \mu \int_a^b A(t, s) x(s)\, ds. \]

Proposition 5.4.31. Let $T : X \to X$ be a compact linear positive operator on a real ordered Banach space $X$. Then there exists a positive solution of (5.4.11) if and only if (5.4.11) has a positive subsolution.

Proof. This assertion is an immediate consequence of the Minorant Principle (Theorem 5.4.27 with $M = T$).

Our goal is to sharpen this result. Let $T : X \to X$ be a linear operator and let $r(T)$ denote the spectral radius of the complexification of $T$.⁵⁵ We call $\lambda$ a simple eigenvalue of $T$ if its multiplicity $m(\lambda)$ is equal to 1.⁵⁶ Recall that this means
\[ \dim \operatorname{Ker}(\lambda I - T) = 1 \quad\text{and}\quad \operatorname{Ker}(\lambda I - T)^2 = \operatorname{Ker}(\lambda I - T). \]
Let $K^* := X^{*,+}$ denote the set of all positive functionals $x^* \in X^*$, i.e.,
\[ \langle x^*, x \rangle \ge 0 \quad\text{for all}\quad x \in K := X^+. \]
We write $x^* \ge o$ if $x^*$ is positive. Furthermore, $x^* > o$ means
\[ x^* \ge o \quad\text{and}\quad \langle x^*, x \rangle > 0 \quad\text{for a certain } x \in K. \]
We call $x^*$ strictly positive if
\[ x > o \quad\text{always implies}\quad \langle x^*, x \rangle > 0. \]
A cone $K := X^+ \subset X$ is called total if $\operatorname{Lin}(K)$ is dense in $X$. Then $K$ being total implies that $K^*$ is an order cone (cf. Exercise 5.4.43). In this case we call $K^*$ the dual order cone of $K$. In particular, $K$ is total if $\operatorname{int} K \ne \emptyset$ (cf. Exercise 5.4.41). For $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$ we have $X^* = X$, $K^* = K$ (explain why!).

Proposition 5.4.32 (Krein–Rutman). Let $X$ be a real Banach space with a total order cone $K$. Suppose that $T : X \to X$ is linear, compact, and positive, with $r(T) > 0$. Then $r(T)$ is an eigenvalue of both $T$ and $T^*$ with eigenvectors in $K$ and $K^*$, respectively.

If $T$ is strongly positive, we get a sharper version of the previous assertion.

⁵⁵ If $X$ is a real Banach space, then by the complexification of $T$ we mean the operator $T : X_{\mathbb{C}} \to X_{\mathbb{C}}$ defined by $T(x + iy) = T(x) + iT(y)$, $x, y \in X$, where $X_{\mathbb{C}}$ is the complexification of $X$ in the sense of Example 1.1.6(iii).
⁵⁶ The significance of simple eigenvalues, roughly speaking, is that their behavior is very stable under perturbations of the operator (cf. Example 4.2.4 and, in more detail, Kato [73]). For this reason, simple eigenvalues also play a special role in the bifurcation theory.


Theorem 5.4.33 (Krein–Rutman). Let $X$ be a real Banach space with an order cone $K$ having nonempty interior. Then any linear, compact, and strongly positive operator $T : X \to X$ has the following properties:
(i) $T$ has exactly one eigenvector with $x > o$ and $\|x\| = 1$. The corresponding eigenvalue is $r(T)$ and it is algebraically simple. Furthermore, $x \gg o$.
(ii) If $\lambda \in \sigma(T)$, $\lambda \ne r(T)$, then $|\lambda| < r(T)$.
(iii) The dual operator $T^*$ has $r(T)$ as an algebraically simple eigenvalue with a strictly positive eigenvector $x^*$.

Remark 5.4.34. Recall that by the Riesz–Schauder theory (see Theorem 2.2.9), the spectrum of $T$ consists of at most countably many nonzero eigenvalues of finite multiplicity which can accumulate only at the origin, and $0 \in \sigma(T)$ whenever $\dim X = \infty$. The spectra of $T$ and $T^*$ coincide ($X$ is a real space).

Now, we will give proofs of Proposition 5.4.32 and Theorem 5.4.33.

Proof of Proposition 5.4.32. Let us consider $T$ on the complexification $X_{\mathbb{C}} = X + iX$. By the Riesz–Schauder theory (see Theorem 2.2.9), all of the nonzero points of the spectrum of $T$ are eigenvalues of finite multiplicity. The same holds for $T^*$. Note that $\sigma(T) \cap \{\lambda : |\lambda| = r(T)\} \ne \emptyset$. We consider the eigenvalues $\lambda$ of $T$ satisfying $|\lambda| = r(T)$, and distinguish three cases.

Case 1 ($\lambda_0 = r(T)$ is an eigenvalue). Our goal is to construct an $x > o$ and an $x^* > o$ such that $T(x) = \lambda_0 x$ and $T^*(x^*) = \lambda_0 x^*$. From footnote 3 on page 57 we have
\[ (\lambda I - T)^{-1} u = \sum_{j=0}^{\infty} \frac{T^j u}{\lambda^{j+1}}, \qquad \lambda > r(T), \]
and, therefore,
\[ (\lambda I - T)^{-1} u \ge o \quad\text{for}\quad \lambda > \lambda_0 \quad\text{and}\quad u \ge o. \]
Since $T$ is compact, $\lambda_0 \ne 0$ is an eigenvalue of finite multiplicity (Remark 2.2.10) and in the Laurent series
\[ (\lambda I - T)^{-1} = \sum_{n=-\infty}^{+\infty} (\lambda - \lambda_0)^n A_n, \tag{5.4.23} \]
there is an index $k$ such that
\[ A_{-n} = O \quad\text{for all } n > k \quad\text{and}\quad A_{-k} \ne O \]
(Proposition 3.1.15). So, there is $u > o$ such that $x := A_{-k} u \ne o$ (otherwise $A_{-k} = O$ since $K$ is total). It follows from Proposition 3.1.15 that $Tx = \lambda_0 x$.


Moreover, by (5.4.23) and its proof (cf. page 114)
\[ x = A_{-k} u = \lim_{\lambda \to \lambda_0+} (\lambda - \lambda_0)^k (\lambda I - T)^{-1} u \ge o, \quad\text{i.e.,}\quad x > o. \]
Let us construct the element $x^*$. By the previous step, $A_{-k}(K) \subset K$. We choose a $u^* \in K^*$ with $\langle u^*, x \rangle > 0$. This is possible by Exercise 5.4.42. We set $x^* = A_{-k}^* u^*$. Then $v \ge o$ implies
\[ \langle x^*, v \rangle = \langle u^*, A_{-k} v \rangle \ge 0 \quad\text{and}\quad \langle x^*, u \rangle = \langle u^*, x \rangle > 0. \]
Thus $x^* > o$. Passing to the dual operator in (5.4.23), we obtain $\lambda_0 x^* = T^*(x^*)$ analogously as above.

Case 2 (there is an eigenvalue $\lambda_0 \in \mathbb{C}$ of $T$ with $|\lambda_0| = r(T)$ and $\lambda_0^n > 0$ for an $n \in \mathbb{N}$, i.e., $\operatorname{Arg} \lambda_0$ is a rational multiple of $2\pi$⁵⁷). Now $T^n$ has a positive eigenvalue $\lambda_0^n$ which lies on the spectral circle of $T^n$, so by Case 1 there exists a $u > o$ with $T^n(u) = \lambda_0^n u$. If we set
\[ x = |\lambda_0|^{n-1} u + |\lambda_0|^{n-2} T(u) + \cdots + T^{n-1}(u), \]
then $x > o$ and $T(x) = |\lambda_0| x$. Analogously we construct an $x^*$ for $T^*$.

Case 3 (none of the eigenvalues of $T$ with $|\lambda| = r(T)$ has the property from Case 2). We show that this is impossible. So, let $\lambda_0$ be an eigenvalue of $T$ with $|\lambda_0| = r(T)$ and with the greatest possible real part. We set
\[ T_\varepsilon = T + \varepsilon T^2 \quad\text{for}\quad \varepsilon > 0. \]
By the Spectral Mapping Theorem (see Proposition 3.1.14(v)), all eigenvalues of $T_\varepsilon$ are of the form $\lambda + \varepsilon \lambda^2$ where $\lambda$ is an eigenvalue of $T$. One can check that $\lambda_1 = \lambda_0 + \varepsilon \lambda_0^2$ and $\bar{\lambda}_1$ are the eigenvalues of $T_\varepsilon$ of greatest absolute value (the reader is asked to justify it!). There is a sequence $\{\varepsilon_k\}_{k=1}^{\infty}$, $\varepsilon_k \to 0$, such that $\operatorname{Arg} \lambda_1^k$ is a rational multiple of $2\pi$ where $\lambda_1^k = \lambda_0 + \varepsilon_k \lambda_0^2$ (explain why!). According to Case 2, there is $n \in \mathbb{N}$ such that $(\lambda_1^k)^n > 0$. Since
\[ \lim_{k \to \infty} (\lambda_1^k)^n = \lambda_0^n > 0, \]
we get a contradiction.

Before we prove Theorem 5.4.33 we need the following geometrical result.

Lemma 5.4.35. Let $X$ be a real Banach space with an order cone $K := X^+$ containing an interior point. Let $u \gg o$. Then for every $v \notin K$ there is a uniquely determined number $\alpha_u(v) > 0$ such that
(i) $0 \le \alpha \le \alpha_u(v)$ implies $u + \alpha v \ge o$;
(ii) $\alpha > \alpha_u(v)$ implies $u + \alpha v \notin K$.
In particular,
\[ u + \alpha v \gg o \quad\text{and}\quad \alpha > 0 \quad\text{imply}\quad \alpha < \alpha_u(v). \tag{5.4.24} \]

⁵⁷ Any complex number $\lambda \ne 0$ can be written in the form $\lambda = |\lambda| e^{i \operatorname{Arg} \lambda}$.


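Proof. Consider the ray $\{u + \alpha v : \alpha \ge 0\}$. For small $\alpha \ge 0$ we have $u + \alpha v \in \operatorname{int} K$, and for large $\alpha \ge \alpha_0$ we have $u + \alpha v \notin K$. Otherwise $u + nv \in K$ for large $n \in \mathbb{N}$, and $\frac{u}{n} + v \in K$. Passing to the limit for $n \to \infty$, we obtain a contradiction $v \in K$. Set
\[ \alpha_u(v) := \sup\{\alpha > 0 : u + \alpha v \in \operatorname{int} K\}. \]
It is easy to show that $\alpha_u(v)$ has the desired properties.

Proof of Theorem 5.4.33. We proceed in six steps.

Step 1 (existence of a positive solution). We choose an $x > o$. Since $T$ is strongly positive, $T(x) \gg o$, so $T(x) \in \operatorname{int} K$. Thus $T(x) - \gamma x \in K$ for small $\gamma > 0$, so $T(x) \ge \gamma x$. By Proposition 5.4.31, there exists a positive solution $(e, \lambda_0)$:
\[ T(e) = \lambda_0 e \quad\text{with}\quad e > o \quad\text{and}\quad \lambda_0 > 0. \]
Since $T(e) \gg o$, we have also $e \gg o$.

Step 2. We show: If $T(x) = \lambda x$, $x > o$ and $\lambda \in \mathbb{R}$, then $x = \gamma e$ for a positive $\gamma$ and $\lambda = \lambda_0$. To begin with, $T(x) \gg o$, so $\lambda > 0$ and $x \gg o$. We consider two identities
\[ T(e - \beta x) = \lambda_0 (e - \beta \lambda_0^{-1} \lambda x), \tag{5.4.25} \]
\[ T(x - \gamma e) = \lambda (x - \gamma \lambda^{-1} \lambda_0 e), \tag{5.4.26} \]
and choose
\[ \beta = \alpha_e(-x) \quad\text{and}\quad \gamma = \alpha_x(-e). \]
Then $x = \gamma e$. Otherwise, $x - \gamma e > o$. This implies $T(x - \gamma e) \gg o$, and hence $\lambda^{-1} \lambda_0 < 1$ by (5.4.26) and (5.4.24). On the other hand, $e - \beta x \ge o$ immediately implies $T(e - \beta x) \ge o$, and (5.4.25) yields the contradiction $\lambda_0^{-1} \lambda \le 1$.

Step 3. We show: If $T(x) = \lambda x$ and $x \ne o$, $\lambda \in \mathbb{R} \setminus \{0\}$ as well as $x \ne \alpha e$ for all $\alpha \in \mathbb{R}$, then $|\lambda| < \lambda_0$. By Proposition 5.4.32, $\lambda_0 = r(T)$ now follows, and with respect to Step 2, $\dim \operatorname{Ker}(\lambda_0 I - T) = 1$. By Step 2, $\pm x \notin K$. We consider
\[ T(e \pm \beta_\pm x) = \lambda_0 (e \pm \beta_\pm \lambda_0^{-1} \lambda x) \tag{5.4.27} \]
and set $\beta_\pm = \alpha_e(\pm x)$. Since $e \pm \beta_\pm x \ne o$, we have $e \pm \beta_\pm x > o$, so $T(e \pm \beta_\pm x) \gg o$. Then (5.4.27) and (5.4.24) immediately imply $\lambda_0^{-1} |\lambda| < 1$.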


Step 4. We now consider the complexification $X_{\mathbb{C}} = X + iX$ and $T : X_{\mathbb{C}} \to X_{\mathbb{C}}$ (see footnote 55 on page 342). In this step we show: If $\lambda$ is a complex eigenvalue of $T$, then $|\lambda| < \lambda_0$. Let $\lambda = \sigma + i\tau$, $\sigma, \tau \in \mathbb{R}$, be an eigenvalue of $T$ and $z = x + iy$, $x, y \in X$, the corresponding eigenvector, i.e., according to the definition of $T$, we have $T(x + iy) = (\sigma + i\tau)(x + iy)$, which is equivalent to
\[ T(x) = \sigma x - \tau y, \qquad T(y) = \tau x + \sigma y. \tag{5.4.28} \]
Our goal is to show that (5.4.28) implies
\[ |\lambda| = \sqrt{\sigma^2 + \tau^2} < \lambda_0. \]
The reader is invited to prove that if $\lambda$ is not real and (5.4.28) holds, then $x$ and $y$ are linearly independent elements of $X$ (cf. Remark 1.1.35(ii)). In particular, $x \ne o$ and $y \ne o$. Let $P$ be a two-dimensional plane in $X$ which consists of elements $\xi x + \eta y$, $\xi, \eta \in \mathbb{R}$. Then $P$ is an invariant subspace of the operator $T$, i.e., $T(P) \subset P$. Let $\tilde{T}$ be the restriction of $T$ onto $P$. Since also $T(K) \subset K$, the cone $\tilde{K} = K \cap P$ is invariant with respect to $\tilde{T}$, i.e., $\tilde{T}(\tilde{K}) \subset \tilde{K}$. We want to prove that
\[ \tilde{K} = \{o\}. \tag{5.4.29} \]
Assume the opposite, then $\tilde{K}$ is an order cone in $P$ and $\tilde{T} : P \to P$ is strongly positive since $T$ is strongly positive. According to Step 1, there exists a positive eigenvector $\tilde{e} \in P$ of $\tilde{T}$ (and hence also of $T$) such that $\tilde{e} \in \tilde{K}$. According to Step 2 we necessarily have $\tilde{e} = \gamma e$ for a certain $\gamma \ne 0$, $\gamma \in \mathbb{R}$. But this fact combined with (5.4.28) implies that $x$ and $y$ are linearly dependent, which is a contradiction, i.e., (5.4.29) is proved.

It now follows from (5.4.29) that no elements $\xi x + \eta y$ with $|\xi| + |\eta| > 0$ belong to $K$. In particular, $x \notin K$. Since $\operatorname{int} K \ne \emptyset$ implies that $K$ is total, there exist nonzero elements $x' \in \operatorname{int} K$ and $x'' \in \operatorname{int} K$ such that $x = x' - x''$.⁵⁸ There exists $\beta > 0$ such that $T(x'') \le \beta e$. Indeed, since $e \in \operatorname{int} K$ we find $\beta > 0$ large enough to satisfy $e - \frac{1}{\beta} T(x'') \in K$.

⁵⁸ Indeed, if $v_0 \in \operatorname{int} K$, $v_0 \ne o$ and $\varepsilon > 0$ is small enough, then $u = v_0 + \varepsilon x \in \operatorname{int} K$, $u \ne o$. Hence $x = \frac{1}{\varepsilon} u - \frac{1}{\varepsilon} v_0$.


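So, we have
\[ T(x) = T(x') - T(x'') \ge -T(x'') \ge -\beta e, \quad\text{i.e.,}\quad e + \frac{1}{\beta} T(x) \in K. \]
It follows from (5.4.28) that $\psi = e + \frac{1}{\beta} T(x)$ can be written in the form
\[ \psi = \xi x + \eta y + e, \qquad |\xi| + |\eta| > 0. \tag{5.4.30} \]
Let $A$ be the set of all elements of the form (5.4.30) which belong to $K$. We have just shown that $A \ne \emptyset$. Let us consider a continuous function of two variables $f : A \to \mathbb{R}$ which with every $\psi \in A$ associates the number $\xi^2 + \eta^2$. Since $x \notin K$, $y \notin K$, the function $f$ must be bounded. It follows from the Extreme Value Theorem ($K$ is closed) that there is
\[ \psi_0 = \xi_0 x + \eta_0 y + e \in A \quad\text{such that}\quad f(\psi_0) = \max_{\psi \in A} f(\psi) =: M. \]
It follows from the strict positivity of $T$ that there exists $\delta > 0$ such that $T(\psi_0) \ge \delta e$. Indeed, $\psi_0 \in K$, $\psi_0 \ne o$, implies $T(\psi_0) \in \operatorname{int} K$. We then can find $\delta > 0$ small enough to satisfy $T(\psi_0) - \delta e \in K$. Let us assume without loss of generality that $\frac{\delta}{\lambda_0} < 1$. Let us rewrite $T(\psi_0) \ge \delta e$ as
\[ \lambda_0 \Bigl(1 - \frac{\delta}{\lambda_0}\Bigr) e + (\xi_1 x + \eta_1 y) \ge o \quad\text{where}\quad \xi_1 x + \eta_1 y = T(\xi_0 x + \eta_0 y). \tag{5.4.31} \]
Using (5.4.28), we have $T(\xi_0 x + \eta_0 y) = (\xi_0 \sigma + \eta_0 \tau) x + (-\xi_0 \tau + \eta_0 \sigma) y$ and hence
\[ \xi_1 = \xi_0 \sigma + \eta_0 \tau, \qquad \eta_1 = -\xi_0 \tau + \eta_0 \sigma. \tag{5.4.32} \]
Then $\xi_1^2 + \eta_1^2 = (\xi_0^2 + \eta_0^2)(\sigma^2 + \tau^2) = M |\lambda|^2$. It follows from (5.4.31) that
\[ \psi_1 = e + \frac{\xi_1}{\lambda_0 \bigl(1 - \frac{\delta}{\lambda_0}\bigr)}\, x + \frac{\eta_1}{\lambda_0 \bigl(1 - \frac{\delta}{\lambda_0}\bigr)}\, y \]
is an element of the form (5.4.30). Hence
\[ M \ge \Bigl(\frac{\xi_1}{\lambda_0 - \delta}\Bigr)^2 + \Bigl(\frac{\eta_1}{\lambda_0 - \delta}\Bigr)^2 = \frac{M |\lambda|^2}{(\lambda_0 - \delta)^2}, \]
which implies $|\lambda| < \lambda_0$.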


Step 5. We show that $\lambda_0$ is simple. Since $\dim \operatorname{Ker}(\lambda_0 I - T) = 1$ (Step 3), it is enough to prove $\operatorname{Ker}(\lambda_0 I - T)^2 = \operatorname{Ker}(\lambda_0 I - T)$. Let $(\lambda_0 I - T)^2 (x) = o$. By Step 2, this implies $(\lambda_0 I - T) x = \gamma e$. We want to show that $\gamma = 0$. Suppose $\gamma \ne 0$. We may assume that $\gamma > 0$, for otherwise we pass to $-x$. Set $\mu_0 = \lambda_0^{-1}$. Now $x = \mu_0 T(x + \gamma e)$ implies
\[ x + \gamma e = \mu_0 T(x + 2\gamma e) \quad\text{and}\quad x = \mu_0^2 T^2 (x + 2\gamma e). \]
It follows by induction that $x = \mu_0^n T^n (x + n\gamma e)$, so
\[ \frac{x}{n} = \mu_0^n T^n \Bigl(\gamma e + \frac{x}{n}\Bigr) \quad\text{for all}\quad n \in \mathbb{N}. \tag{5.4.33} \]
Since $e \in \operatorname{int} K$, we have $\gamma e + \frac{x}{n} \ge o$ for large $n$. By (5.4.33) and the positivity of $T$, we have $\frac{x}{n} \ge o$. Furthermore, from (5.4.33) and $\mu_0^n T^n (e) = e$ we immediately conclude
\[ \frac{x}{n} - \gamma e = \mu_0^n T^n \Bigl(\frac{x}{n}\Bigr) \ge o. \]
Passing to the limit for $n \to \infty$ we get $-\gamma e \ge o$, so $\gamma = 0$, contradicting $\gamma > 0$.

Step 6 (examination of $T^*$). By Proposition 5.4.32 there exists $e^* > o$ such that $T^*(e^*) = \lambda_0 e^*$. We show that
\[ \langle e^*, x \rangle > 0 \quad\text{provided}\quad x > o, \tag{5.4.34} \]
i.e., $e^*$ is strictly positive. Indeed, let $x > o$. Then $T(x) \gg o$ and by Exercise 5.4.44, $\langle e^*, T(x) \rangle > 0$. So
\[ \lambda_0 \langle e^*, x \rangle = \langle T^*(e^*), x \rangle = \langle e^*, T(x) \rangle > 0. \]
According to the Riesz–Schauder Theory (see Theorem 2.2.9),
\[ \dim \operatorname{Ker}(\lambda_0 I^* - T^*) = \dim \operatorname{Ker}(\lambda_0 I - T), \]
which is equal to 1 by Steps 2 and 3. To prove that $\lambda_0$ is an algebraically simple eigenvalue of $T^*$ choose $x^* \in \operatorname{Ker}(\lambda_0 I^* - T^*)^2$. Let $y^* = \lambda_0 x^* - T^* x^*$. Then $y^* = \alpha e^*$ for an $\alpha \in \mathbb{R}$. For any $x > o$ we have
\[ \alpha \langle e^*, x \rangle = \langle y^*, x \rangle = \langle x^*, \lambda_0 x - Tx \rangle. \]
In particular, taking $x = e$ we obtain $\alpha = 0$, i.e., $y^* = o$ and $x^* \in \operatorname{Ker}(\lambda_0 I^* - T^*)$. This proves $\operatorname{Ker}(\lambda_0 I^* - T^*)^2 = \operatorname{Ker}(\lambda_0 I^* - T^*)$. This completes the proof of Theorem 5.4.33.

The authors want to point out that another proof of the Krein–Rutman Theorem can be found in, e.g., Takáč [126].


Corollary 5.4.36. Let $X$ and $T$ be as in Theorem 5.4.33. For every $y > o$, (5.4.12) has exactly one solution $x > o$ if $\lambda > r(T)$, and no such solution if $\lambda \le r(T)$. Moreover,
\[ \lambda x - T(x) = \mu y \quad\text{and}\quad x > o,\ y > o \quad\text{imply}\quad \operatorname{sgn}(\mu) = \operatorname{sgn}(\lambda - r(T)). \]
Here $\lambda$ and $\mu$ are real numbers.

Proof. The resolvent $R_\lambda$ exists for $\lambda > r(T)$ and thus the equation
\[ \lambda x - T(x) = y \tag{5.4.35} \]
has a unique solution for any $y \in X$. Since $R_\lambda : K \to K$ by the proof of Proposition 5.4.32, $y > o$ implies $x > o$. On the other hand, if $\lambda \le r(T)$ and there is a positive solution $x$ of (5.4.35) for $y > o$, then choosing $e^* \in X^*$ as in Step 6 of the proof of Theorem 5.4.33 we arrive at
\[ (\lambda - r(T)) \langle e^*, x \rangle = \langle e^*, \lambda x - T(x) \rangle = \langle e^*, y \rangle > 0, \]
a contradiction. Finally, let $x > o$, $y > o$ and
\[ \lambda x - T(x) = \mu y \quad\text{for a certain}\quad \mu \in \mathbb{R}. \]
Then
\[ (\lambda - r(T)) \langle e^*, x \rangle = \langle e^*, \lambda x - T(x) \rangle = \mu \langle e^*, y \rangle, \quad\text{i.e.,}\quad \operatorname{sgn}(\lambda - r(T)) = \operatorname{sgn} \mu. \]
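Corollary 5.4.36 and the Neumann series used in the proof of Proposition 5.4.32 can be checked by hand in $\mathbb{R}^2$ with the cone $\mathbb{R}^{2,+}$. The sketch below is an illustration under assumptions not in the text: the matrix `T`, the right-hand side `y` and the two values of $\lambda$ are hypothetical, and the $2 \times 2$ system is solved by Cramer's rule.

```python
T = [[2.0, 1.0], [1.0, 3.0]]       # positive matrix (hypothetical example)
R_T = (5.0 + 5.0 ** 0.5) / 2.0     # its spectral radius, about 3.618

def apply(matrix, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def solve_2x2(lam, y):
    """Solve (lam * I - T) x = y by Cramer's rule."""
    a, b = lam - T[0][0], -T[0][1]
    c, d = -T[1][0], lam - T[1][1]
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det]

def neumann_resolvent(lam, y, terms=400):
    """Partial sum of sum_j T^j y / lam^(j+1), which converges for lam > r(T)."""
    term = [yi / lam for yi in y]
    total = list(term)
    for _ in range(terms):
        term = [v / lam for v in apply(T, term)]
        total = [s + v for s, v in zip(total, term)]
    return total

y = [1.0, 2.0]                          # y > o
x_above = solve_2x2(5.0, y)             # lam > r(T): the solution is positive
x_series = neumann_resolvent(5.0, y)    # same solution from the Neumann series
x_below = solve_2x2(3.0, y)             # lam < r(T): the solution is not positive
print(x_above, x_series, x_below)
```

Every term of the Neumann series is a nonnegative vector when $y \ge o$, which is exactly why the resolvent maps the cone into itself for $\lambda > r(T)$.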

Corollary 5.4.37 (Comparison Principle). Let $X$ and $T$ be as in Theorem 5.4.33. If $S : X \to X$ is a compact linear operator with
\[ S(x) \ge T(x) \quad\text{for all}\quad x \ge o, \]
then $r(S) \ge r(T)$. If $S(x) > T(x)$ for all $x > o$, then $r(S) > r(T)$.

Proof. Let
\[ S(x) \ge T(x) \quad\text{for all}\quad x \ge o. \]
Choose $e > o$ such that $T(e) = r(T) e$. Then $S(e) \ge T(e) = r(T) e$. By Proposition 5.4.31, $r(S) \in \sigma(S)$ and therefore $r(S) \ge r(T)$. In order to prove the second part of the statement we choose $x > o$ with $S(x) = r(S) x$ (see Proposition 5.4.32). We now set $A := S - T$ and choose $e^*$ as in Step 6 of the proof of Theorem 5.4.33. Then
\[ r(S) \langle e^*, x \rangle = \langle e^*, A(x) \rangle + \langle e^*, T(x) \rangle = \langle e^*, A(x) \rangle + \langle T^*(e^*), x \rangle = \langle e^*, A(x) \rangle + r(T) \langle e^*, x \rangle. \]
By (5.4.34), we have $\langle e^*, x \rangle > 0$ and also $\langle e^*, A(x) \rangle > 0$, and thus $r(S) > r(T)$.
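For $2 \times 2$ matrices with nonnegative entries the spectral radius has a closed form, so the Comparison Principle can be checked directly. The following sketch is a small sanity check under hypothetical choices: the matrices `T` and `S` are not from the text, and `S` is chosen as $T + \tfrac12 I$ so that $S(x) > T(x)$ for every $x > o$.

```python
import math

def spectral_radius_2x2(m):
    """Spectral radius of a 2x2 matrix with nonnegative entries.

    The eigenvalues are ((a + d) +/- sqrt((a - d)^2 + 4 b c)) / 2; with
    nonnegative entries they are real and the '+' root has the larger modulus.
    """
    (a, b), (c, d) = m
    return 0.5 * ((a + d) + math.sqrt((a - d) ** 2 + 4.0 * b * c))

T = [[2.0, 1.0], [1.0, 3.0]]    # strongly positive on the cone R^{2,+}
S = [[2.5, 1.0], [1.0, 3.5]]    # S = T + 0.5 * I, so S(x) - T(x) = 0.5 x > o for x > o

r_T = spectral_radius_2x2(T)
r_S = spectral_radius_2x2(S)
print("r(T) =", r_T, " r(S) =", r_S)
```

Here the gap $r(S) - r(T)$ equals the shift $\tfrac12$ because $S$ and $T$ share their eigenvectors; for a general perturbation only the inequality of the corollary is available.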


Example 5.4.38. Let $X = \mathbb{R}^N$ and $X^+ = \mathbb{R}^{N,+}$. Further, let $T$ be a real $(N \times N)$ matrix of positive elements only. Then $T : X \to X$ is linear, compact, and strongly positive. The conclusions of Theorem 5.4.33 coincide with those of the classical Perron Theorem.

Example 5.4.39. Let $\Omega$ be a bounded domain in $\mathbb{R}^N$. We set $X = C(\Omega)$, $X^+ = C^+(\Omega)$ (cf. Example 5.4.7) and consider the integral equation
\[ \lambda x(t) = \int_{\Omega} A(t, s) x(s)\, ds \quad\text{for all}\quad t \in \Omega, \tag{5.4.36} \]
with a positive continuous kernel $A : \Omega \times \Omega \to \mathbb{R}$. If we write (5.4.36) in the form
\[ \lambda x = T(x), \qquad x \in X, \]
then Theorem 5.4.33 is the classical Jentzsch Theorem.
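In the finite-dimensional setting of Example 5.4.38 the eigenpair described by the Krein–Rutman Theorem can be approximated by power iteration, and a quadrature discretization of (5.4.36) with a positive kernel leads to a matrix of the same type. The sketch below is a minimal illustration with assumptions of its own: the matrix `T` is a hypothetical example, the normalization by the largest component is one of several possible choices, and the power method is a standard numerical technique that the text itself does not discuss.

```python
def apply(matrix, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def power_iteration(matrix, steps=500):
    """Approximate r(T) and a positive eigenvector normalized so that max(x) = 1."""
    x = [1.0] * len(matrix)
    r = 0.0
    for _ in range(steps):
        y = apply(matrix, x)
        r = max(y)                  # eigenvalue estimate, since max(x) = 1 and y > o
        x = [v / r for v in y]
    return r, x

T = [[2.0, 1.0, 1.0],
     [1.0, 3.0, 1.0],
     [1.0, 1.0, 4.0]]               # hypothetical matrix with positive entries only

r, x = power_iteration(T)
residual = max(abs(t - r * xi) for t, xi in zip(apply(T, x), x))
print("r(T) ~", r, " eigenvector ~", x, " residual:", residual)
```

The computed eigenvector has strictly positive components, in line with $x \gg o$ in Theorem 5.4.33(i), and the estimate of $r(T)$ lies between the smallest and the largest row sum of the matrix.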

In the next example we use some facts from the forthcoming Chapter 7. The reader who is not acquainted with the properties of the Laplace operator can skip this example or consider the one-dimensional case and replace the Laplace operator by the second derivative.

Example 5.4.40. Let us consider the eigenvalue problem for the Laplace operator subject to the homogeneous Dirichlet boundary conditions
\[ \begin{cases} -\Delta u(x) = \mu u(x) & \text{in } \Omega, \\ u(x) = 0 & \text{on } \partial\Omega, \end{cases} \tag{5.4.37} \]
where $\Omega$ is a bounded domain in $\mathbb{R}^N$ and $\partial\Omega$ is its boundary (cf. Remark 7.2.2). Then (5.4.37) can be written in the form (5.4.36) with $\lambda = \frac{1}{\mu}$ where $A = A(t, s)$ is the Green function associated with the Laplace equation with the homogeneous Dirichlet boundary conditions. Since $A$ is a positive continuous function $A : \Omega \times \Omega \to \mathbb{R}$ (see, e.g., Gilbarg & Trudinger [59]), we can apply the result of Example 5.4.39. Multiplying the equation in (5.4.37) by $u = u(x)$ ($u$ is a real function) and using the Green Formula (cf. footnote 7 on page 479), we find
\[ \int_{\Omega} |\nabla u(x)|^2\, dx = \mu \int_{\Omega} u^2(x)\, dx, \]
which shows that (5.4.37) has only positive real eigenvalues. It then follows from Example 5.4.39 (and hence from the Krein–Rutman Theorem) that (5.4.37) has the least eigenvalue $\mu_1 > 0$ which is simple and which is the only eigenvalue of (5.4.37) having a positive eigenfunction $\varphi_1(x) > 0$, $x \in \Omega$.

Exercise 5.4.41. Show that if $\operatorname{int} K \ne \emptyset$, then $K$ is a total cone and construct an example of a cone which is not total.
Hint. If $y \in \operatorname{int} K$, then $y \pm \alpha x \in K$ for every $x \in X$ with $\alpha > 0$ sufficiently small. Thus $X = K - K$ because
\[ x = \frac{(y + \alpha x) - (y - \alpha x)}{2\alpha}. \]


Exercise 5.4.42. Show that for every $x \in K \setminus \{o\}$, there exists an $x^* \in X^*$ such that $\langle x^*, x \rangle > 0$.

Hint. Since $-x \notin K$ and $K$ is closed, $-x$ is an exterior point of $K$. Consequently, there is an open convex neighborhood $U$ of $-x$ which is disjoint from $K$. By the Separation Theorem for convex sets,$^{59}$ there is an $x^* \in X^*$ with $x^*(K) \ge 0$ and $x^*(U) < 0$. Hence $\langle x^*, x \rangle > 0$.

Exercise 5.4.43. Show that if $K$ is total, then $K^*$ is an order cone on $X^*$.

Hint. $K \neq \{o\}$ implies $K^* \neq \{o\}$ by Exercise 5.4.42. Suppose $\pm x^* \in K^*$. We have to show that $x^* = o$. Indeed, $\langle x^*, \pm x \rangle \ge 0$ for all $x \in K$ implies $\langle x^*, x \rangle \ge 0$ for all $x \in X$, because $K$ is total. Hence $x^* = o$.

Exercise 5.4.44. Let $x^* \in X^*$. Show that if $x^* > o$ (i.e., $x^* \ge o$ and $\langle x^*, y \rangle > 0$ for a $y > o$), then $\langle x^*, x \rangle > 0$ for all $x \in \operatorname{int} K$.

Hint. Suppose $\langle x^*, x \rangle = 0$ for an $x \in \operatorname{int} K$. Then $x \pm \alpha y \in K$ for sufficiently small $\alpha > 0$. Hence $\langle x^*, x \pm \alpha y \rangle \ge 0$, so $\langle x^*, y \rangle = 0$. This is a contradiction.

Exercise 5.4.45. Prove that the functional $v \mapsto \alpha_u(v)$ from Lemma 5.4.35 is continuous.

Exercise 5.4.46. Apply the Krein–Rutman Theorem to the problems in Examples 2.1.32 and 2.2.17.

59 This is a minor supplement of Corollary 2.1.18.

5.4B Supersolutions, Subsolutions and Topological Degree

In this appendix we show the connection between the supersolution and subsolution on the one hand and the topological degree on the other. We consider the quasilinear boundary value problem

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = f(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.4.38}$$

as a model example. A special case of it was studied in Examples 5.2.51 and 5.3.24. However, in this appendix we work in different function spaces. Here $p > 1$ is a real number and $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ is a function whose properties will be specified later. By a solution of (5.4.38) we understand a function $x \in C^1[0, 1]$ with $x(0) = x(1) = 0$ such that $|\dot{x}|^{p-2}\dot{x}$ is absolutely continuous and the equation in (5.4.38) holds a.e. in $(0, 1)$. Clearly, the problem (5.4.38) formally coincides with (5.4.6) if $p = 2$.

Definition 5.4.47. A function $u_0 \in C^1[0, 1]$ with $|\dot{u}_0|^{p-2}\dot{u}_0$ absolutely continuous is called a subsolution of (5.4.38) if $u_0(0) \le 0$, $u_0(1) \le 0$, and

$$-\bigl(|\dot{u}_0(t)|^{p-2}\dot{u}_0(t)\bigr)^{\cdot} \le f(t, u_0(t)) \quad \text{for a.e. } t \in (0, 1).$$

In an analogous way we define a supersolution $v_0$ of (5.4.38). We write $x \prec y$ if and only if

$$x(t) < y(t), \quad t \in (0, 1),$$

and either $x(0) < y(0)$, or $x(0) = y(0)$ and $\dot{x}(0) < \dot{y}(0)$, and the same alternatives hold at 1.$^{60}$

Definition 5.4.48. A subsolution $u_0$ of (5.4.38) is said to be strict if every possible solution $x$ of (5.4.38) such that $u_0 \le x$ on $[0, 1]$ satisfies $u_0 \prec x$. In an analogous way we define a strict supersolution of (5.4.38).

Let us formulate (5.4.38) as a "fixed point" operator equation. Assume that for any $y \in C_0^1[0, 1] := \{x \in C^1[0, 1] : x(0) = x(1) = 0\}$ we have

$$f(\cdot, y(\cdot)) \in L^\infty(0, 1).$$

We denote by $T : C_0^1[0, 1] \to C_0^1[0, 1]$ the solution operator of

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = f(t, y(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.4.39}$$

i.e., for $x, y \in C_0^1[0, 1]$, $x = T(y)$ if and only if the equation in (5.4.39) holds a.e. in $(0, 1)$. For any fixed $y \in C_0^1[0, 1]$ it follows by integration of (5.4.39) and the injectivity of $\varphi(s) = |s|^{p-2}s$ that the operator $T$ is well defined. Clearly, the problem (5.4.38) has a solution $x$ if and only if $x = T(x)$, i.e., $x$ is a fixed point of $T$.

Let $f$ be a Carathéodory function and for any $r > 0$ let there exist a constant $h_r > 0$ such that for a.e. $t \in (0, 1)$ and for all $s \in (-r, r)$,

$$|f(t, s)| < h_r. \tag{5.4.40}$$
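The solution operator $T$ from (5.4.39) can be sketched numerically for a given right-hand side $g(t) = f(t, y(t))$. The sketch below is an illustration under our own assumptions, not the construction used in the text: integrating the equation once gives $\dot{x}(t) = \varphi_{p'}\bigl(c - \int_0^t g\bigr)$ with $p' = p/(p-1)$ and the unknown constant $c = \varphi_p(\dot{x}(0))$, and since $x(1)$ depends monotonically on $c$, we determine $c$ by bisection from $x(1) = 0$. The names `phi` and `solve_p_laplace`, the uniform grid and the trapezoidal rule are hypothetical choices.

```python
def phi(s, q):
    """phi_q(s) = |s|^(q-2) s, with phi_q(0) = 0."""
    return 0.0 if s == 0.0 else abs(s) ** (q - 2.0) * s


def solve_p_laplace(g, p, n=1000, bisections=80):
    """Approximate the solution x of -(|x'|^(p-2) x')' = g(t), x(0) = x(1) = 0,
    on the uniform grid t_i = i/n. Returns the list of values x(t_i)."""
    q = p / (p - 1.0)                      # conjugate exponent p'
    h = 1.0 / n
    # G[i] approximates the integral of g over (0, t_i) (trapezoidal rule).
    G = [0.0]
    for i in range(1, n + 1):
        G.append(G[-1] + 0.5 * h * (g((i - 1) * h) + g(i * h)))

    def profile(c):
        # x'(t) = phi_q(c - G(t)); integrate once more starting from x(0) = 0.
        d = [phi(c - Gi, q) for Gi in G]
        x = [0.0]
        for i in range(1, n + 1):
            x.append(x[-1] + 0.5 * h * (d[i - 1] + d[i]))
        return x

    lo, hi = min(G), max(G)                # x(1) <= 0 for c = lo and x(1) >= 0 for c = hi
    for _ in range(bisections):
        mid = 0.5 * (lo + hi)
        if profile(mid)[-1] < 0.0:
            lo = mid
        else:
            hi = mid
    return profile(0.5 * (lo + hi))


# For p = 2 and g = 1 the exact solution is x(t) = t (1 - t) / 2.
x = solve_p_laplace(lambda t: 1.0, 2.0)
```

The constant $c$ plays the role of $\int_0^{t_x} f(\tau, y(\tau))\, d\tau$ for a point $t_x$ with $\dot{x}(t_x) = 0$; parametrizing by $c$ merely avoids searching for $t_x$ directly.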

This condition is satisfied if, e.g., $f(t, x(t)) = h(t) - g(x(t))$ where $h \in L^\infty(0, 1)$ and $g : \mathbb{R} \to \mathbb{R}$ is a continuous function (cf. Examples 5.2.51 and 5.3.24). We prove that the operator $T$ is compact. For this purpose we express $T$ in the integral form. By the Rolle Theorem, for any $x = T(y)$ there exists $t_x \in [0, 1]$ such that $\dot{x}(t_x) = 0$, i.e.,

$$\dot{x}(t) = \left| \int_t^{t_x} f(\tau, y(\tau))\, d\tau \right|^{p'-2} \int_t^{t_x} f(\tau, y(\tau))\, d\tau \tag{5.4.41}$$

and

$$x(t) = \int_0^t \left| \int_\sigma^{t_x} f(\tau, y(\tau))\, d\tau \right|^{p'-2} \left( \int_\sigma^{t_x} f(\tau, y(\tau))\, d\tau \right) d\sigma \tag{5.4.42}$$

where $p' = \frac{p}{p-1}$.

60 Here $\dot{x}(0)$ and $\dot{x}(1)$ mean the derivative from the right and from the left, respectively.

If $y_n \to y_0$ in $C_0^1[0, 1]$, then the continuity of the Nemytski operator

$$y \mapsto f(\cdot, y) \tag{5.4.43}$$


from $C[0, 1]$ into $C[0, 1]$, and (5.4.41), (5.4.42) imply that $x_n \to x_0$ in $C_0^1[0, 1]$ where $x_n = T(y_n)$, $x_0 = T(y_0)$, i.e., $T$ is continuous.

Let $M \subset C_0^1[0, 1]$ be a bounded set. To prove the compactness of $T$ we have to show that $T(M)$ is relatively compact. Let $\{x_n\}_{n=1}^\infty \subset T(M)$ be an arbitrary sequence, $x_n = T(y_n)$, $y_n \in M$. It follows from the compact embedding $C_0^1[0, 1] \subset\subset C[0, 1]$ (see Theorem 1.2.13) that there exists a subsequence $\{y_{n_k}\}_{k=1}^\infty \subset \{y_n\}_{n=1}^\infty$ which converges uniformly on $[0, 1]$. But the continuity of the Nemytski operator (5.4.43) and (5.4.41), (5.4.42) imply that $\{x_{n_k}\}_{k=1}^\infty$ converges in $C_0^1[0, 1]$, i.e., $T(M)$ is relatively compact. Hence the compactness of $T$ is proved.

The following assertion is referred to as the well-ordered case of a supersolution and a subsolution.

Theorem 5.4.49 (well-ordered case). Let $f$ be a Carathéodory function satisfying (5.4.40). Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.38), respectively, with $u_0 \le v_0$ (see Figure 5.4.4). Then the problem (5.4.38) has at least one solution $x$ satisfying

$$u_0 \le x \le v_0 \quad \text{in } [0, 1].$$

If, moreover, $u_0$ and $v_0$ are strict and satisfy $u_0 \prec v_0$, then there exists $R_0 > 0$ such that for all $R > R_0$,

$$\deg(I - T, \Omega_1, o) = 1$$

where

$$\Omega_1 := \{x \in C_0^1[0, 1] : u_0 \prec x \prec v_0\} \cap B(o; R)$$

is an open set in $C_0^1[0, 1]$ (cf. Exercise 5.4.53).

Figure 5.4.4. Well-ordered case (graphs of $u_0$ and $v_0$ over $t \in [0, 1]$)

Proof. Set

$$\tilde{f}(t, y) := \begin{cases} f(t, y) & \text{if } u_0(t) \le y \le v_0(t), \\ f(t, u_0(t)) & \text{if } y \le u_0(t), \\ f(t, v_0(t)) & \text{if } y \ge v_0(t). \end{cases}$$

Every solution of

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = \tilde{f}(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.4.44}$$

is a solution of (5.4.38). Indeed, assume that $x$ solves (5.4.44) and $x > v_0$ in an interval $I_+ \subset (0, 1)$ and $x = v_0$ on $\partial I_+$. Then

$$\int_0^1 \left| \frac{dx(t)}{dt} \right|^{p-2} \frac{dx(t)}{dt}\, \frac{d}{dt}\bigl(x(t) - v_0(t)\bigr)^*\, dt = \int_0^1 f(t, v_0(t))\, \bigl(x(t) - v_0(t)\bigr)^*\, dt \tag{5.4.45}$$

where

$$\bigl(x(t) - v_0(t)\bigr)^* = \begin{cases} x(t) - v_0(t) & \text{on } I_+, \\ 0 & \text{on } [0, 1] \setminus I_+. \end{cases}$$

Since $v_0$ is a supersolution, we have

$$\int_0^1 \left| \frac{dv_0(t)}{dt} \right|^{p-2} \frac{dv_0(t)}{dt}\, \frac{d}{dt}\bigl(x(t) - v_0(t)\bigr)^*\, dt \ge \int_0^1 f(t, v_0(t))\, \bigl(x(t) - v_0(t)\bigr)^*\, dt. \tag{5.4.46}$$

Hence, combining (5.4.45) and (5.4.46), we obtain

$$\int_{I_+} \bigl( |\dot{x}(t)|^{p-2}\dot{x}(t) - |\dot{v}_0(t)|^{p-2}\dot{v}_0(t) \bigr) \bigl( \dot{x}(t) - \dot{v}_0(t) \bigr)\, dt \le 0.$$

This is a contradiction,$^{61}$ which proves that

$$x(t) \le v_0(t), \quad t \in (0, 1).$$

The same argument shows that

$$x(t) \ge u_0(t), \quad t \in (0, 1).$$

Now, denote by $\tilde{T}(y)$ a solution of the boundary value problem

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = \tilde{f}(t, y(t)), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases}$$

for $y \in C_0^1[0, 1]$. Then $\tilde{T} : C_0^1[0, 1] \to C_0^1[0, 1]$ is compact$^{62}$ and the solutions of (5.4.44) are in a one-to-one correspondence with the fixed points of $\tilde{T}$. The definition of $\tilde{f}$ ensures that there exists a constant $R_0 > 0$ such that for any $y \in C_0^1[0, 1]$ we have

$$\|\tilde{T}(y)\|_{C_0^1[0,1]} < R_0 \tag{5.4.47}$$

(see (5.4.41), (5.4.42)). By the Schauder Fixed Point Theorem $\tilde{T}$ has a fixed point $x$ in $B(o; R_0)$, i.e., $x$ is a solution of (5.4.44). It follows from the above considerations that $u_0 \le x \le v_0$, and so $x$ is also a desired solution of (5.4.38).

The proof of the second part follows from the fact that, due to (5.4.47), we can construct an admissible homotopy

$$H(\tau, \cdot) := I - \tau\tilde{T}, \quad \tau \in [0, 1],$$

which shows that

$$\deg(I - \tilde{T}, B(o; R_0), o) = \deg(I, B(o; R_0), o) = 1.$$

Since $u_0$ and $v_0$ are strict and there is no solution $x$ of the equation $x - \tilde{T}(x) = o$ for which either $x(t) < u_0(t)$ or $x(t) > v_0(t)$ for a $t \in (0, 1)$, it follows from Theorem 5.2.13(iv) that

$$\deg(I - \tilde{T}, \Omega_1, o) = \deg(I - \tilde{T}, B(o; R_0), o) = 1.$$

The assertion now follows from the fact that $T$ and $\tilde{T}$ coincide in $\Omega_1$.

61 Note that $s \mapsto |s|^{p-2}s$ is a strictly increasing function!

62 The proof of this fact is the same as that for $T$.


The next assertion is referred to as the non-well-ordered case of a supersolution and a subsolution.

Theorem 5.4.50 (non-well-ordered case). Let $f$ be a Carathéodory function which satisfies the following assumption: there are $c_i > 0$, $i = 1, 2$, such that

$$|f(t, s)| \le c_1 + c_2 |s|^{p-1} \quad \text{for a.e. } t \in (0, 1) \text{ and for all } s \in \mathbb{R} \tag{5.4.48}$$

and, moreover,

$$\lim_{|s| \to \infty} \frac{f(t, s)}{|s|^{p-2}s} = \lambda_1.^{63} \tag{5.4.49}$$

Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.38), respectively, and there exists $t_0$ such that $u_0(t_0) > v_0(t_0)$ (see Figure 5.4.5).

Figure 5.4.5. Non-well-ordered case (graphs of $u_0$ and $v_0$ over $t \in [0, 1]$ with $u_0(t_0) > v_0(t_0)$)

Then (5.4.38) has at least one solution in the closure (with respect to the $C^1$-norm) of the set

$$S := \{x \in C_0^1[0, 1] : \exists\, t_1, t_2 \in (0, 1),\ x(t_1) < u_0(t_1),\ x(t_2) > v_0(t_2)\}.$$

Set $\Omega_2 := S \cap B(o; R)$ and assume that there is no solution of (5.4.38) on $\partial\Omega_2$. Then there exists $R_0 > 0$ such that for all $R > R_0$,

$$\deg(I - T, \Omega_2, o) = -1.$$

Proof. If (5.4.38) has a solution on $\partial S$, we are done. Let us assume in the sequel that (5.4.38) does not have a solution on $\partial S$. For $r > 0$ let us define

$$f_r(t, y) = \begin{cases} f(t, y) & \text{if } |y| \le r, \\ (1 + r - |y|)\, f(t, y) & \text{if } r < |y| < r + 1, \\ 0 & \text{if } |y| \ge r + 1. \end{cases}$$

63 Here $\lambda_1$ is the first eigenvalue of (5.2.47), see Example 5.2.51.
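The cut-off $f_r$ used in the proof can be written down directly. The following is a minimal sketch; the helper name `cut_off` and the sample nonlinearity are hypothetical and serve only to illustrate that $f_r$ agrees with $f$ for $|y| \le r$, vanishes for $|y| \ge r + 1$, and is damped linearly in between.

```python
def cut_off(f, r):
    """Return the function f_r(t, y) built from f(t, y) and the level r > 0."""
    def f_r(t, y):
        a = abs(y)
        if a <= r:
            return f(t, y)                     # unchanged inside [-r, r]
        if a < r + 1.0:
            return (1.0 + r - a) * f(t, y)     # linear damping factor in (0, 1)
        return 0.0                             # switched off for |y| >= r + 1
    return f_r


# Hypothetical sample nonlinearity f(t, s) = 2 s + 1.
f_2 = cut_off(lambda t, s: 2.0 * s + 1.0, 2.0)
```

Because the damping factor equals 1 at $|y| = r$ and 0 at $|y| = r + 1$, the function $f_r$ inherits the Carathéodory property of $f$.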


Next we show that there is $K > 0$ such that for any $r > 0$ and for any possible solution of

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = f_r(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{5.4.50}$$

the following a priori estimate holds:

$$\|x\|_{C_0^1[0,1]} \le K. \tag{5.4.51}$$

To prove this fact we argue by contradiction, and thus we assume that for any $k \in \mathbb{N}$ there are $r_k > 0$, $x_k \in S$ solving

$$\begin{cases} -\bigl(|\dot{x}_k(t)|^{p-2}\dot{x}_k(t)\bigr)^{\cdot} = f_{r_k}(t, x_k(t)), & t \in (0, 1), \\ x_k(0) = x_k(1) = 0, \end{cases} \tag{5.4.52}$$

and satisfying $\|x_k\| \ge k$. Set $y_k := \frac{x_k}{\|x_k\|}$ and divide (5.4.52) by $\|x_k\|^{p-1}$ to obtain

$$\begin{cases} -\bigl(|\dot{y}_k(t)|^{p-2}\dot{y}_k(t)\bigr)^{\cdot} = \dfrac{f_{r_k}(t, x_k(t))}{\|x_k\|^{p-1}}, & t \in (0, 1), \\ y_k(0) = y_k(1) = 0. \end{cases}$$

By integration we find that $\{y_k\}_{k=1}^\infty$ equivalently satisfies

$$\dot{y}_k(t) = \varphi_{p'}\!\left( \varphi_p(\dot{y}_k(0)) - \int_0^t \frac{f_{r_k}(\sigma, x_k(\sigma))}{\|x_k\|^{p-1}}\, d\sigma \right) \tag{5.4.53}$$

and

$$y_k(t) = \int_0^t \varphi_{p'}\!\left( \varphi_p(\dot{y}_k(0)) - \int_0^\tau \frac{f_{r_k}(\sigma, x_k(\sigma))}{\|x_k\|^{p-1}}\, d\sigma \right) d\tau, \quad t \in [0, 1], \tag{5.4.54}$$

where for $s > 1$ we set $\varphi_s(\xi) = |\xi|^{s-2}\xi$ if $\xi \neq 0$ and $\varphi_s(0) = 0$. Now, since $\|y_k\| = 1$, by passing to a subsequence if necessary, we have

$$y_k \to y \quad \text{in } C_0[0, 1] := \{x \in C[0, 1] : x(0) = x(1) = 0\}$$

for a $y \in C_0[0, 1]$.$^{64}$ But then (5.4.53) yields

$$y_k \to y \quad \text{in } C_0^1[0, 1]$$

(note that without loss of generality we may also assume that $\{\dot{y}_k(0)\}_{k=1}^\infty$ forms a convergent sequence!). It follows from (5.4.54), (5.4.48), (5.4.49) and the Lebesgue Dominated Convergence Theorem that $y$ solves the problem

$$\begin{cases} -\bigl(|\dot{y}(t)|^{p-2}\dot{y}(t)\bigr)^{\cdot} = \lambda_1 |y(t)|^{p-2}y(t), & t \in (0, 1), \\ y(0) = y(1) = 0. \end{cases}$$

Since $\|y\| = 1$, it follows that $y$ is a nonzero multiple of the first eigenfunction $\varphi_1(t) > 0$ in $(0, 1)$ (see Example 5.2.51). If $y > 0$ in $(0, 1)$, then we find that $x_k(t) \to \infty$ for any $t \in (0, 1)$, which contradicts $x_k \in S$. Also $y < 0$ in $(0, 1)$ leads to a contradiction. Hence the a priori estimate (5.4.51) is proved.

64 This is a consequence of the Arzelà–Ascoli Theorem.


Now choose $R > R_0 = \max\{K, \|u_0\|_{C[0,1]}, \|v_0\|_{C[0,1]}\} + 1$ and consider (5.4.50) with $r = R$ and $x_k = x$, i.e.,

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = f_R(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{5.4.55}$$

Obvious modifications of the definition of a strict subsolution and supersolution of (5.4.38) lead to the same notions associated with (5.4.55). Then $\alpha = -R - 2$ and $\beta = R + 2$ are a subsolution and a supersolution, respectively, associated with (5.4.55). Both are actually strict. Indeed, assume, e.g., that $x$ is a solution of (5.4.55), $x(t) \ge -R - 2$ and $x(t_0) = -R - 2$ for a certain $t_0 \in (0, 1)$. Then $x(t_0) = \min_{\tau \in (0,1)} x(\tau)$, i.e., $\dot{x}(t_0) = 0$ and there exists $\eta > 0$ such that $x(t) < -R - 1$ for $t \in [t_0, t_0 + \eta)$. But $f_R(t, x(t)) = 0$ by definition, so $x(t) \equiv -R - 2$ in $(t_0, t_0 + \eta]$. Since this implies that $x(t) \equiv -R - 2$ in $(t_0, 1]$, we obtain a contradiction. The same argument applies to $R + 2$. Notice also that $\alpha \prec v_0$ and $u_0 \prec \beta$.

Now, let us define $T_R : C_0^1[0, 1] \to C_0^1[0, 1]$ by $x := T_R(y)$ where $x$ is a solution of the problem

$$\begin{cases} -\bigl(|\dot{x}(t)|^{p-2}\dot{x}(t)\bigr)^{\cdot} = f_R(t, y(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases}$$

and define the sets

$$S_{\alpha\beta} := \{x \in C_0^1[0, 1] : \alpha \prec x \prec \beta\}, \quad S_{u_0\beta} := \{x \in C_0^1[0, 1] : u_0 \prec x \prec \beta\}$$

and

$$S_{\alpha v_0} := \{x \in C_0^1[0, 1] : \alpha \prec x \prec v_0\}$$

(see Figure 5.4.6).

Figure 5.4.6. (graphs of $\alpha = -R - 2$, $u_0$, $v_0$ and $\beta = R + 2$ over $t \in [0, 1]$)


By definition, $T_R$ and $T$ coincide in the ball $B(o; R)$. Applying Theorem 5.4.49 and Theorem 5.2.13(iv) we obtain

$$\begin{aligned} 1 &= \deg(I - T_R, B(o; R) \cap S_{\alpha\beta}, o) \\ &= \deg(I - T_R, B(o; R) \cap S_{\alpha v_0}, o) + \deg(I - T_R, B(o; R) \cap S_{u_0\beta}, o) + \deg(I - T_R, \Omega_2, o) \\ &= 2 + \deg(I - T_R, \Omega_2, o), \end{aligned}$$

which completes the proof.

Remark 5.4.51. There are several applications of Theorems 5.4.49 and 5.4.50. Generalizations of these results to the case of partial differential equations can also be found in the literature, see, e.g., Drábek, Girg & Manásevich [40]. In the next assertion we present one application of Theorems 5.4.49 and 5.4.50 which, under suitable assumptions on $f$, yields the multiplicity of solutions of (5.4.38).

Theorem 5.4.52. Let $f$ be as in Theorem 5.4.50 and let $u_0^i$ and $v_0^i$, $i = 1, 2$, be subsolutions and supersolutions of (5.4.38), respectively, which satisfy

$$u_0^1 \prec v_0^1, \quad u_0^2 \prec v_0^2,$$

and let there exist $t_0 \in (0, 1)$ such that $u_0^2(t_0) > v_0^1(t_0)$ (see Figure 5.4.7). Then the problem (5.4.38) has at least three distinct solutions.

Figure 5.4.7. (graphs of $u_0^1$, $v_0^1$, $u_0^2$, $v_0^2$ and of the solutions $x_1$, $x_2$, $x_3$ over $t \in [0, 1]$)

Proof. It follows from Theorem 5.4.49 that there are solutions $x_i = x_i(t)$, $i = 1, 2$, of (5.4.38) which satisfy

$$u_0^1 \prec x_1 \prec v_0^1, \quad u_0^2 \prec x_2 \prec v_0^2.$$

Now, let us apply Theorem 5.4.50 with a subsolution $u_0^2$ and a supersolution $v_0^1$. We get another solution $x_3 = x_3(t)$ of (5.4.38). Clearly, all $x_i$, $i = 1, 2, 3$, are mutually different.


Exercise 5.4.53. Prove that Ω1 from Theorem 5.4.49 is an open set in C01 [0, 1]. Exercise 5.4.54. Formulate conditions on f = f (t, x) which guarantee that the problem (5.4.38) has a pair of well-ordered supersolution and subsolution. Exercise 5.4.55. Formulate conditions on f = f (t, x) which guarantee that the problem (5.4.38) has a pair of non-well-ordered supersolution and subsolution. Exercise 5.4.56. Formulate conditions on f = f (t, x) which guarantee that the problem (5.4.38) has two pairs of supersolutions and subsolutions which satisfy the assumptions from Theorem 5.4.49.

Chapter 6

Variational Methods

6.1 Local Extrema

In this section we present necessary and/or sufficient conditions for local extrema of real functionals. The most famous ones are the Euler and Lagrange necessary conditions and the Lagrange sufficient condition. We also present the brachistochrone problem, one of the oldest problems in the calculus of variations, and discuss regularity of the point of a local extremum.

The methods presented in this section are motivated by the equation

$$f(x) = 0 \tag{6.1.1}$$

where $f$ is a continuous real function defined in $\mathbb{R}$. The solution of this equation can be transformed to the problem of finding a local extremum of the integral $F$ of $f$ (i.e., $F'(x) = f(x)$, $x \in \mathbb{R}$). Indeed, if there exists a point $x_0 \in \mathbb{R}$ at which the function $F$ has its local extremum, then the derivative $F'(x_0)$ necessarily vanishes due to a familiar theorem of first-semester calculus. The problem of finding solutions of (6.1.1) can thus be transformed to the problem of finding local extrema of the function $F$. On the other hand, one should keep in mind that the equation (6.1.1) may have a solution which is not a local extremum of $F$.

In what follows we will deal with real functionals

$$F : X \to \mathbb{R}$$

where $X$ is a normed linear space with the norm $\|\cdot\|$.

Definition 6.1.1. We say that $F$ has a local minimum (maximum) at a point $a \in X$ if there exists a neighborhood $U$ of $a$ such that for all $x \in U \setminus \{a\}$ we have

$$F(x) \ge F(a) \quad (F(x) \le F(a)).$$


If the inequalities are strict, we speak about a strict local minimum (strict local maximum). If the functional $F$ has a (strict) local minimum or (strict) local maximum at $a$, we say that it has a (strict) local extremum at $a$. In Figure 6.1.1 the critical point $a$ is not a point of extremum of $F$.

Figure 6.1.1. (graph of a function $F : \mathbb{R} \to \mathbb{R}$ with a critical point $a$ which is not an extremum)

The fundamental assertion is the following Euler (or Fermat) Necessary Condition.

Proposition 6.1.2 (Euler Necessary Condition). Let $F : X \to \mathbb{R}$ have a local extremum at $a \in X$. If for $v \in X$ the derivative $\delta F(a; v)$ exists, then $\delta F(a; v) = 0$.

Proof. Set

$$g(t) = F(a + tv), \quad t \in \mathbb{R}.$$

Then $g$ attains a local extremum at $t = 0$, thus $0 = g'(0) = \delta F(a; v)$.
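The reduction to the one-variable function $g(t) = F(a + tv)$ used in this proof is also convenient numerically. The following sketch, under our own assumptions, approximates $g'(0)$ by a central difference for a functional on $\mathbb{R}^N$; the name `directional_derivative`, the step size and the sample functional `F` are hypothetical.

```python
def directional_derivative(F, a, v, step=1e-6):
    """Central-difference approximation of g'(0) for g(t) = F(a + t v)."""
    def g(t):
        return F([ai + t * vi for ai, vi in zip(a, v)])
    return (g(step) - g(-step)) / (2.0 * step)


# Hypothetical sample functional on R^2 with a minimum at (1, -2).
def F(x):
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 2.0) ** 2


at_minimum = directional_derivative(F, [1.0, -2.0], [0.3, -0.7])
elsewhere = directional_derivative(F, [0.0, 0.0], [1.0, 0.0])
```

At the minimizer the approximation is zero up to rounding for every direction, as the Euler Necessary Condition requires, while at the point $(0, 0)$ the derivative in the direction $(1, 0)$ is close to $-2$.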

Definition 6.1.3. If

$$\delta F(a; v) = 0 \quad \text{for all } v \in X,$$

then $a$ is called a critical point of the functional $F$.$^{1}$

The more precise Lagrange Necessary Condition distinguishes between local minima and maxima, but requires the existence of the second derivative in the given direction.

Proposition 6.1.4 (Lagrange Necessary Condition). Let $F : X \to \mathbb{R}$ have a local minimum (maximum) at $a \in X$. If for $v \in X$ the second derivative $\delta^2 F(a; v, v)$ exists, then

$$\delta^2 F(a; v, v) \ge 0 \quad (\delta^2 F(a; v, v) \le 0).$$

1 Cf. Definition 4.3.6.


Proof. Let $g$ be as in the proof of Proposition 6.1.2. Then $g''(0) = \delta^2 F(a; v, v)$. Now we can apply the Lagrange necessary condition for local extrema of the real function $g$ of one real variable to get the conclusion.

Contrary to Propositions 6.1.2 and 6.1.4, the Lagrange Sufficient Condition tells us when a critical point of $F$ is a point of its local minimum or local maximum.

Theorem 6.1.5 (Lagrange Sufficient Condition). Let $a \in X$ be a critical point of $F : X \to \mathbb{R}$. Let there exist a neighborhood $U$ of $a$ such that the mapping $x \mapsto D^2 F(x)$ is continuous in $U$. If there exists $\alpha > 0$ such that

$$D^2 F(a)(v, v) \ge \alpha \|v\|^2 \quad \bigl(D^2 F(a)(v, v) \le -\alpha \|v\|^2\bigr) \quad \text{for any } v \in X,$$

then $F$ has a strict local minimum (maximum) at $a$.

Proof. Let $v \in X$ be such that $a + v \in U$. Then according to Proposition 3.2.27 we have

$$F(a + v) - F(a) = \int_0^1 (1 - t)\, D^2 F(a + tv)(v, v)\, dt.^{2} \tag{6.1.2}$$

On the other hand,

$$\begin{aligned} D^2 F(a + tv)(v, v) &\ge D^2 F(a)(v, v) - |D^2 F(a + tv)(v, v) - D^2 F(a)(v, v)| \\ &\ge \bigl[ \alpha - \|D^2 F(a + tv) - D^2 F(a)\|_{B_2(X, \mathbb{R})} \bigr] \|v\|^2. \end{aligned}$$

The continuity of $D^2 F(x)$ in $U$ implies that there is $\delta > 0$ so small that for $\|v\| < \delta$, $t \in [0, 1]$,

$$\|D^2 F(a + tv) - D^2 F(a)\|_{B_2(X, \mathbb{R})} < \alpha, \tag{6.1.3}$$

i.e., for $0 < \|v\| < \delta$ we have (due to (6.1.2) and (6.1.3)) $F(a + v) > F(a)$. The proof for a strict local maximum is similar.

2 We can assume that $U$ is convex. Then $D^2 F(a + tv)$ exists and is continuous for all $t \in [0, 1]$.

Let us first illustrate the general statements on a function of several real variables $F : \mathbb{R}^N \to \mathbb{R}$.

Example 6.1.6. Let $F : \mathbb{R}^N \to \mathbb{R}$ have all partial derivatives of the first order at a point $a \in \mathbb{R}^N$ and, moreover, let the function $F$ have a local extremum at $a$. Then Proposition 6.1.2 states that

$$\frac{\partial F}{\partial x_1}(a) = \frac{\partial F}{\partial x_2}(a) = \cdots = \frac{\partial F}{\partial x_N}(a) = 0. \tag{6.1.4}$$


On the other hand, it is well known that (6.1.4) does not imply that $F$ has a local extremum at the point $a$. To check that this is the case we can apply Theorem 6.1.5. If $F$ has continuous second partial derivatives in a neighborhood of $a$, then we should investigate the quadratic form

$$D^2 F(a)(v, v) = \sum_{i,j=1}^{N} \frac{\partial^2 F}{\partial x_i \partial x_j}(a)\, v_i v_j. \tag{6.1.5}$$

To prove that $F$ has, e.g., a local minimum at $a$, it is enough to show that there exists $\alpha > 0$ such that for any $v \in \mathbb{R}^N$, $\|v\| = 1$,

$$D^2 F(a)(v, v) \ge \alpha. \tag{6.1.6}$$

(Here we have used the fact that the quadratic form is homogeneous.) Since we are in finite dimension, the unit sphere in $\mathbb{R}^N$ is a compact set. Then (6.1.6) holds with an $\alpha > 0$ whenever

$$D^2 F(a)(v, v) > 0 \quad \text{for all } \|v\| = 1.^{3} \tag{6.1.7}$$

The reader is invited to justify that (6.1.7) implies (6.1.6) and to explain why this is not the case when $\mathbb{R}^N$ is replaced by a space of infinite dimension.

It follows from linear algebra$^{4}$ that for any quadratic form on $\mathbb{R}^N$ there exists a basis $\{u_1, \dots, u_N\}$ of $\mathbb{R}^N$ and numbers $\lambda_1, \dots, \lambda_N$ such that for any $v$ of the form

$$v = \sum_{i=1}^{N} \xi_i u_i$$

we have

$$D^2 F(a)(v, v) = \sum_{i=1}^{N} \lambda_i \xi_i^2.$$

The inequality (6.1.7) holds if and only if all $\lambda_i$, $i = 1, \dots, N$, are positive, and so according to Theorem 6.1.5 the function $F$ has a strict local minimum at $a$. If there is at least one positive and at least one negative number among $\lambda_i$, $i = 1, \dots, N$, then according to Proposition 6.1.4 the function $F$ does not have a local extremum at $a$.

Before we give an application in an infinite dimensional space, we prove the following assertion for convex functionals.

3 Here we use the fact that a positive continuous function on a compact set achieves its minimal value which has to be positive.

4 See also Corollary 6.3.9. (Remember that $\frac{\partial^2 F}{\partial x_i \partial x_j}(a) = \frac{\partial^2 F}{\partial x_j \partial x_i}(a)$.)
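The test described in Example 6.1.6 is easy to carry out for $N = 2$, where the numbers $\lambda_1, \lambda_2$ can be taken as the eigenvalues of the symmetric Hessian matrix and are available in closed form. The following is a minimal sketch; the function name `classify_critical_point` and the sample Hessians are hypothetical.

```python
import math


def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point of F: R^2 -> R from the entries of its
    (symmetric) Hessian matrix [[fxx, fxy], [fxy, fyy]]."""
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    lam_min, lam_max = mean - radius, mean + radius   # eigenvalues of the Hessian
    if lam_min > 0.0:
        return "strict local minimum"     # Theorem 6.1.5
    if lam_max < 0.0:
        return "strict local maximum"     # Theorem 6.1.5
    if lam_min < 0.0 < lam_max:
        return "no local extremum"        # Proposition 6.1.4
    return "inconclusive"                 # some eigenvalue vanishes


# Hessians at the origin of x^2 + y^2, x^2 - y^2 and x^4 + y^4, respectively.
results = [
    classify_critical_point(2.0, 0.0, 2.0),
    classify_critical_point(2.0, 0.0, -2.0),
    classify_critical_point(0.0, 0.0, 0.0),
]
```

The last case shows the limits of the second-order test: $x^4 + y^4$ has a strict minimum at the origin, but its Hessian vanishes there, so neither Theorem 6.1.5 nor Proposition 6.1.4 decides.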


Definition 6.1.7. Let $M \subset X$ be a convex set. A functional $F : X \to \mathbb{R}$ is said to be convex in $M$ if for any $u, v \in M$ and $t \in [0, 1]$ we have

$$F(tu + (1 - t)v) \le tF(u) + (1 - t)F(v).$$

The functional $F$ is said to be strictly convex in $M$ if for any $u, v \in M$, $u \neq v$, and $t \in (0, 1)$ we have

$$F(tu + (1 - t)v) < tF(u) + (1 - t)F(v).$$

Proposition 6.1.8. Let $F : X \to \mathbb{R}$ be a convex functional on a normed linear space $X$. Then every critical point of $F$ in $X$ is a point of minimum of $F$ over $X$.

Proof. Without loss of generality we can assume that

$$F(o) = 0 \quad \text{and} \quad \delta F(o; v) = 0 \quad \text{for any } v \in X$$

(i.e., $o \in X$ is a critical point). Assume that $F$ does not achieve the minimum value over $X$ at $o \in X$. Then there exists $u \in X$ for which $F(u) = \alpha < 0$. The convexity of $F$ implies that

$$F(tu + (1 - t)o) \le t\alpha \quad \text{for any } t \in (0, 1),$$

i.e.,

$$\frac{F(tu) - F(o)}{t} \le \alpha < 0. \tag{6.1.8}$$

But (6.1.8) implies $\delta F(o; u) \le \alpha < 0$, which is a contradiction.

The following result will be needed several times in the further text.

Lemma 6.1.9 (Fundamental Lemma in Calculus of Variations). Let $I$ be an open interval and $f \in L^1_{\mathrm{loc}}(I)$. If

$$\int_I f(x)\, \varphi'(x)\, dx = 0 \quad \text{for any } \varphi \in \mathcal{D}(I),^{5} \tag{6.1.9}$$

then $f = \text{const.}$ a.e. in $I$.

Proof. Let $J$ be a compact subinterval of $I$ and $\varphi$ a mollifier, $\varphi \in \mathcal{D}(\mathbb{R})$, $\operatorname{supp} \varphi \subset [-1, 1]$ (see Proposition 1.2.20(iv)). For

$$g(x) = \begin{cases} f(x), & x \in J, \\ 0, & x \in \mathbb{R} \setminus J, \end{cases}$$

we have $g \in L^1(\mathbb{R})$, and thus

$$\lim_{n \to \infty} g * \varphi_n = g \quad \text{in the } L^1(\mathbb{R})\text{-norm}^{6}$$

and (passing to a subsequence, cf. Remark 1.2.18) also a.e. in $\mathbb{R}$. Since

$$(g * \varphi_n)'(y) = \int_{\mathbb{R}} g(x)\, \varphi_n'(y - x)\, dx = \int_I f(x)\, \varphi_n'(y - x)\, dx$$

whenever $\left[ y - \frac{1}{n}, y + \frac{1}{n} \right] \subset J$, by the assumption (6.1.9), $(g * \varphi_n)(y)$ is constant for all such $y$. The convergence of $g * \varphi_n$ to $g$ means that $g$ is constant a.e. in $J$, i.e., $f = \text{const.}$ a.e. in $I$.

One of the oldest problems in the calculus of variations is studied in detail in the following example.

Example 6.1.10 (Brachistochrone Problem). The problem is formulated as follows: "For two given points $A$ and $B$ in a vertical plane find a curve connecting $A$ and $B$ which is optimal among all other such curves in the following sense. The point $P$ of unit mass which starts from $A$ with zero velocity and moves along this curve only due to the gravitational force will reach the point $B$ in a minimal time."$^{7}$

In order to find a suitable mathematical model we shall assume that the points $A = (0, 0)$ and $B = (a, b)$, $b \ge 0$, are situated in a vertical plane with the coordinate system chosen as in Figure 6.1.2. The reader is invited to verify that such a position of $A$ and $B$ can be considered without loss of generality. We shall concentrate first only on curves which are graphs of nonnegative functions $y = u(x)$ which belong to the space $C^1[0, a]$.

The point $P$ moves according to the Second Newton Law. The resulting force is a composition of the gravitational force and the reaction force of the constraint (the point $P$ moves along the given curve). The resulting direction is given by the tangent line of the curve, see Figure 6.1.2. The Second Newton Law says that for the velocity $v$ of the point $P$ the following identity holds:

$$m\dot{v} = F = mg \cos\alpha$$

(see Figure 6.1.2). Multiplying this identity by $v$ and taking into account that $\dot{x} = v \cos\alpha$, we obtain

$$\left( \tfrac{1}{2} v^2 \right)^{\cdot} = g v \cos\alpha = g\dot{x},$$

i.e.,

$$\tfrac{1}{2} v^2 = gx \tag{6.1.10}$$

(the Principle of Conservation of Energy).

5 See page 35 for the definition of $\mathcal{D}(I)$.

6 $\varphi_n$ is defined in Proposition 1.2.20(iv).

7 This problem was posed by Johann Bernoulli (see Berkovitz [12]).


Figure 6.1.2. The $x$-axis is oriented in the (downward) direction of the gravitational force. (The figure shows the points $A$ and $B = (a, b)$, the moving point $P$, the angle $\alpha$, and the forces $F$ and $mg$.)

Since the point $P$ moves along the graph of $u = u(x)$, its trajectory $s = s(t)$ is given by

$$s(t) = \int_0^{x(t)} \sqrt{1 + (u'(x))^2}\, dx.^{8} \tag{6.1.11}$$

Hence

$$v(t) = \frac{ds(t)}{dt} = \frac{ds(t)}{dx} \frac{dx}{dt} = \sqrt{1 + (u'(x(t)))^2}\; \dot{x}(t).$$

Using (6.1.10) and the strict monotonicity of $x$ we have

$$\frac{dt}{dx} = \frac{\sqrt{1 + (u'(x))^2}}{\sqrt{2gx}}.$$

Therefore the time needed to get from $A$ to $B$ is given by

$$\tilde{F}(u) = \int_0^a \frac{\sqrt{1 + (u'(x))^2}}{\sqrt{2gx}}\, dx. \tag{6.1.12}$$

We wish to apply Proposition 6.1.2 to the functional $\tilde{F}$. However, $\tilde{F}$ is not defined on a linear space ($u(a) = b \neq 0$). To avoid this obstacle we change the variable $u$ for the moment by the substitution

$$w(x) = u(x) - \frac{b}{a} x.$$

8 We use the formula for the length of a curve given by the graph of $u = u(x)$: $s = \int_0^{x_0} \sqrt{1 + (u'(x))^2}\, dx$.


So, we can write (6.1.12) as

$$F(w) = \tilde{F}\!\left( w + \frac{b}{a} x \right) = \int_0^a \frac{\sqrt{1 + \left( w'(x) + \frac{b}{a} \right)^2}}{\sqrt{2gx}}\, dx$$

where $w \in C_0^1[0, a] := \{w \in C^1[0, a] : w(0) = w(a) = 0\}$. We equip $C_0^1[0, a]$ with the norm

$$\|u\|_{C_0^1[0,a]} = \left( \int_0^a |u'(x)|^2\, dx \right)^{\frac{1}{2}}.$$

For a given $h \in C_0^1[0, a]$ we have (see Corollary 3.2.14 and Example 3.2.21)

$$\delta F(w; h) = \int_0^a \frac{w'(x) + \frac{b}{a}}{\sqrt{2gx \left[ 1 + \left( w'(x) + \frac{b}{a} \right)^2 \right]}}\, h'(x)\, dx.$$

The Euler Necessary Condition (Proposition 6.1.2) for the original variable $u$ reads

$$\int_0^a \frac{u'(x)}{\sqrt{2gx\,[1 + (u'(x))^2]}}\, h'(x)\, dx = 0 \quad \text{for all } h \in C_0^1[0, a]. \tag{6.1.13}$$

Let us denote

$$M(x) = \frac{u'(x)}{\sqrt{2gx\,[1 + (u'(x))^2]}}, \quad x \in (0, a).$$

Applying Lemma 6.1.9 we obtain that there is a constant $K \in \mathbb{R}$ such that $M(x) = K$ a.e. in $(0, a)$. However, the continuity of $M$ actually implies that

$$\frac{u'(x)}{\sqrt{2gx\,[1 + (u'(x))^2]}} = K \quad \text{for all } x \in (0, a). \tag{6.1.14}$$

We will find a solution of the Euler equation (6.1.14). Note that $K = 0$ implies $b = 0$, and so in this case $u = 0$ is a unique solution of (6.1.14). Assume now that $b > 0$, and write $K$ as $\pm\frac{1}{\sqrt{4gc}}$ with a $c > 0$. The equation (6.1.14) then implies

$$\left( 1 - \frac{x}{2c} \right) (u'(x))^2 = \frac{x}{2c}, \quad x \in [0, a]. \tag{6.1.15}$$

Hence $0 \le \frac{x}{2c} < 1$. After the change of variables $x = c(1 - \cos\tau)$, $\tau \in [0, \tau_0]$ (here $\tau_0 < \pi$ is such that $a = c(1 - \cos\tau_0)$) we obtain

$$\frac{du}{d\tau} = c \sin\tau\, \frac{du}{dx}$$


and (6.1.15) is transformed into

$$\left( \frac{du}{d\tau} \right)^2 = c^2 (1 - \cos\tau)^2.$$

Hence

$$u(\tau) = \pm c(\tau - \sin\tau), \quad \tau \in [0, \tau_0].$$

(Notice that the integration constant is zero since $u(0) = 0$, and only the plus sign corresponds to our problem.) Hence the parametric equations of the graph of $u$ are given by

$$x = c(1 - \cos\tau), \quad y = c(\tau - \sin\tau), \quad \tau \in [0, \tau_0].$$

This is a part of the cycloid, and we have to determine the parameters $c$ and $\tau_0$ so that $B$ is the end point of this curve. This means

$$\frac{b}{a} = \frac{\tau_0 - \sin\tau_0}{1 - \cos\tau_0}, \quad \tau_0 \in (0, \pi). \tag{6.1.16}$$

Since the function

$$\tau \mapsto \frac{\tau - \sin\tau}{1 - \cos\tau}, \quad \tau \in (0, \pi),$$

is strictly increasing with the supremum (over $(0, \pi)$) equal to $\frac{\pi}{2}$, we have that for $0 \le \frac{b}{a} < \frac{\pi}{2}$ the functional $F$ has a unique critical point $v \in C_0^1[0, a]$ such that the graph of the function $u(x) = v(x) + \frac{b}{a} x$ has parametric equations

$$x = a\, \frac{1 - \cos\tau}{1 - \cos\tau_0}, \quad y = a\, \frac{\tau - \sin\tau}{1 - \cos\tau_0}, \quad \tau \in [0, \tau_0], \tag{6.1.17}$$

where $\tau_0$ is given by (6.1.16).

On the other hand, for $\frac{b}{a} \ge \frac{\pi}{2}$ the functional $F$ does not have critical points in $C_0^1[0, a]$. However, this does not mean that the original problem has no solution at all! The restriction we made during the formulation of the mathematical model (considering only curves which are graphs of functions $y = u(x)$) does not fit the real situation if $\frac{b}{a} \ge \frac{\pi}{2}$! In this case one has to parametrize the curves $x = x(\tau)$, $y = y(\tau)$ and to investigate the functional

$$\tilde{F}(x, y) = \int_0^{\tau_0} \frac{\sqrt{\left( \frac{dx}{d\tau}(\tau) \right)^2 + \left( \frac{dy}{d\tau}(\tau) \right)^2}}{\sqrt{2g\,x(\tau)}}\, d\tau.$$

An analogous procedure leads to the solution of two differential equations for $x$ and $y$ and one can prove the existence of a unique critical point.$^{9}$

9 The reader is invited to prove it in detail as an exercise.
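For $0 < \frac{b}{a} < \frac{\pi}{2}$ the parameters in (6.1.16) and (6.1.17) can be computed numerically. The sketch below is only an illustration under our own assumptions: it solves (6.1.16) by bisection, using the strict monotonicity stated above, and compares the travel time along the cycloid with that along the straight segment $u(x) = \frac{b}{a}x$. The function names are hypothetical. The closed forms for the two times are not taken from the text; they follow from (6.1.12) by direct integration, namely $\tau_0 \sqrt{c/g}$ for the cycloid and $\sqrt{1 + (b/a)^2}\, \sqrt{2a/g}$ for the segment.

```python
import math


def cycloid_parameters(a, b, iterations=100):
    """Solve (tau - sin tau) / (1 - cos tau) = b / a for tau0 in (0, pi) by
    bisection and return (tau0, c) with a = c (1 - cos tau0)."""
    ratio = b / a
    if not 0.0 < ratio < math.pi / 2.0:
        raise ValueError("this sketch needs 0 < b/a < pi/2")
    lo, hi = 0.0, math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (mid - math.sin(mid)) / (1.0 - math.cos(mid)) < ratio:
            lo = mid                       # the left-hand side is increasing in tau
        else:
            hi = mid
    tau0 = 0.5 * (lo + hi)
    return tau0, a / (1.0 - math.cos(tau0))


def cycloid_time(a, b, g=9.81):
    """Travel time along the cycloid (6.1.17): tau0 * sqrt(c / g)."""
    tau0, c = cycloid_parameters(a, b)
    return tau0 * math.sqrt(c / g)


def segment_time(a, b, g=9.81):
    """Value of (6.1.12) for the straight segment u(x) = (b / a) x."""
    return math.sqrt(1.0 + (b / a) ** 2) * math.sqrt(2.0 * a / g)


# End point chosen so that tau0 = pi/2 and c = 1 (x is the downward coordinate).
a, b = 1.0, math.pi / 2.0 - 1.0
tau0, c = cycloid_parameters(a, b)
```

For this end point the cycloid time is smaller than the time along the segment, in line with the minimality of the critical point.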


Let us return to the case $\frac{b}{a} < \frac{\pi}{2}$. It still remains to show that the solution (6.1.17) is a global minimum of $F$ over $C_0^1[0, a]$. This follows from Proposition 6.1.8. Indeed, the function

$$z \mapsto \sqrt{1 + z^2}$$

is convex in $\mathbb{R}$. This immediately implies that the functional $F$ is convex on $C_0^1[0, a]$ (the reader is invited to prove both facts in detail). Hence the unique critical point of $F$ in $C_0^1[0, a]$ must be the point of its global minimum.

Let us now consider a more general situation. Namely, let

$$M = \{u \in C^1[a, b] : u(a) = u_1,\ u(b) = u_2\},$$

and let us introduce the functional

$$F(u) = \int_a^b f(x, u(x), u'(x))\, dx, \quad u \in M,$$

where $f = f(x, y, z)$ is a function defined on $[a, b] \times \mathbb{R}^2$ with continuous second partial derivatives with respect to all its variables. This assumption will hold throughout the rest of this section. Applying the Euler Necessary Condition (Proposition 6.1.2) we get the following assertion.

Proposition 6.1.11. Let $u_0 \in M$ be a local extremum of $F$ with respect to $M$. Then the function

$$x \mapsto \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) \tag{6.1.18}$$

is continuously differentiable on $[a, b]$ and

$$\frac{\partial f}{\partial y}(x, u_0(x), u_0'(x)) - \frac{d}{dx} \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) = 0 \tag{6.1.19}$$

for all $x \in [a, b]$.

Proof. Let us first assume $u_1 = u_2 = 0$. Let $w \in C_0^1[a, b]$. Since

$$0 = \delta F(u_0; w) = \int_a^b \left[ \frac{\partial f}{\partial y}(x, u_0(x), u_0'(x))\, w(x) + \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x))\, w'(x) \right] dx,$$

we get, by integrating by parts,

$$\int_a^b \left[ \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\, d\xi \right] w'(x)\, dx = 0. \tag{6.1.20}$$

Using Lemma 6.1.9 we get from (6.1.20) that there is $c \in \mathbb{R}$ such that

$$\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\, d\xi = c \tag{6.1.21}$$


for all $x \in [a, b]$. This equality shows that the function (6.1.18) is continuously differentiable and (6.1.19) holds for all $x \in [a, b]$.

In the general case we can consider $u - \frac{u_2 - u_1}{b - a}(x - a) - u_1$ instead of $u$ and apply the previous result to the transformed functional.

Remark 6.1.12. Equation (6.1.19) is the Euler Equation of the functional $F$. Taking the "formal" derivative of the second term in (6.1.19) we obtain

$$\frac{d}{dx} \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) = \frac{\partial^2 f}{\partial x \partial z}(x, u_0(x), u_0'(x)) + \frac{\partial^2 f}{\partial y \partial z}(x, u_0(x), u_0'(x))\, u_0'(x) + \frac{\partial^2 f}{\partial z^2}(x, u_0(x), u_0'(x))\, u_0''(x).$$

Hence (6.1.19) indicates that $u_0''(x)$ should exist. This motivates the following assertion.

Theorem 6.1.13 (Regularity of the "classical solution"). Let $u_0 \in M$ be a local extremum of $F$ with respect to $M$, and let $x_0 \in (a, b)$ be such that

$$\frac{\partial^2 f}{\partial z^2}(x_0, u_0(x_0), u_0'(x_0)) \neq 0.$$

Then there exists $\delta > 0$ such that $u_0 \in C^2(x_0 - \delta, x_0 + \delta)$.

Proof. For $x \in [a, b]$ and $z \in \mathbb{R}$ define a function $\varphi$ by

$$\varphi(x, z) = \frac{\partial f}{\partial z}(x, u_0(x), z) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\, d\xi - c$$

where $c$ is the constant from the proof of Proposition 6.1.11. The Implicit Function Theorem (see Theorem 4.2.1) implies that there exist $\delta_1 > 0$, $\hat{\delta} > 0$ with the following properties: for any $x \in (x_0 - \delta_1, x_0 + \delta_1)$ there exists a unique $z(x) \in (u_0'(x_0) - \hat{\delta}, u_0'(x_0) + \hat{\delta})$ such that

$$\varphi(x, z(x)) = 0.$$

Moreover, $z \in C^1(x_0 - \delta_1, x_0 + \delta_1)$. The continuity of $u_0'$ and the uniqueness of $z$ imply the existence of $\delta \in (0, \delta_1)$ such that

$$u_0'(x) = z(x) \quad \text{for } x \in (x_0 - \delta, x_0 + \delta).$$

In several situations it is more convenient to look for critical points of $F$ on "greater" sets than $M$. As we will see later (Section 6.2) this is mainly connected with the fact that the space of continuously differentiable functions $C^1[a, b]$ is not reflexive and it does not possess a Hilbert structure, either. For this purpose it is


more convenient to work in the Sobolev space $W^{1,2}(a, b)$ and to look for extrema of $F$ on the set

$$N = \{u \in W^{1,2}(a, b) : u(a) = u_1,\ u(b) = u_2\}.$$

Notice that it is not obvious whether the functional $F$ is well defined on the set $N$. We have to assume that $f$ satisfies certain growth conditions (see Theorem 3.2.24 and Remark 3.2.25; the Carathéodory property is guaranteed by the continuity of $f$ and its derivatives). In this case we have

Theorem 6.1.14 (Regularity of the "weak solution"). Let $h \in L^2(a, b)$, $c_1 \ge 0$ be such that for a.a. $x \in [a, b]$ and for all $(y, z) \in \mathbb{R}^2$,

$$|f(x, y, z)| \le h(x) + c_1 (y^2 + z^2), \tag{6.1.22}$$

$$\left| \frac{\partial f}{\partial y}(x, y, z) \right| \le h(x) + c_1 (|y| + |z|), \tag{6.1.23}$$

$$\left| \frac{\partial f}{\partial z}(x, y, z) \right| \le h(x) + c_1 (|y| + |z|). \tag{6.1.24}$$

Let $u_0 \in W^{1,2}(a, b)$ be a local minimum of $F$ on $N$. For $x \in [a, b]$ and $z \in \mathbb{R}$ set

$$\psi(x, z) = \frac{\partial f}{\partial z}(x, u_0(x), z).$$

Assume that $\frac{\partial \psi}{\partial z} > 0$ on $[a, b] \times \mathbb{R}$ and that for every fixed $x \in [a, b]$ the function $z \mapsto \psi(x, z)$ maps $\mathbb{R}$ onto $\mathbb{R}$. Then $u_0 \in C^2[a, b]$.

Proof. First, let us assume that $u_1 = u_2 = 0$. Conditions (6.1.22)–(6.1.24) guarantee that $F$ is well defined on $W_0^{1,2}(a, b)$ and that $\delta F(u_0; v)$ exists for any $v \in W^{1,2}(a, b)$.$^{10}$ It follows from Proposition 6.1.2 that for any $w \in W_0^{1,2}(a, b)$,

$$\delta F(u_0; w) = \int_a^b \left[ \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x))\, w'(x) + \frac{\partial f}{\partial y}(x, u_0(x), u_0'(x))\, w(x) \right] dx = 0.$$

10 The reader is invited to check these facts in detail, see Remark 3.2.25.

If we proceed literally as in the proof of Proposition 6.1.11 we arrive at (6.1.21) which now holds for a.a. $x \in [a, b]$. The function

$$g(x, z) = \psi(x, z) - c - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\, d\xi$$

is continuous on $[a, b] \times \mathbb{R}$; hence, by the assumptions on the function $\psi$, for any $x \in [a, b]$ the equation

$$g(x, z) = 0$$

6.1. Local Extrema


has a unique solution $z = z(x)$. Moreover, by the Implicit Function Theorem (see Remark 4.2.3, not Theorem 4.2.1!), the function $x \mapsto z(x)$ is continuous on $(a, b)$. It can be shown (Exercise 6.1.21) that it is continuous also at the end points $a$, $b$. So, it follows from (6.1.21) that
$$u_0'(x) = z(x) \ \text{ for a.a. } x \in [a, b], \quad\text{i.e.,}\quad u_0(x) = \int_a^x z(y)\,dy.$$
Hence $u_0 \in C_0^1[a, b]$ and it is a local minimum of $F$ in the space $C_0^1[a, b]$. The assertion now follows from Theorem 6.1.13.

In the general case, we consider again
$$u - \frac{u_2 - u_1}{b - a}(x - a) - u_1$$
instead of $u$ and apply the previous result to the transformed functional.

Exercise 6.1.15. Consider a function of two real variables
$$F(x, y) = \sin x + \sin y - \sin(x + y)$$
on the set
$$M = \left(-\frac{\pi}{2}, \frac{3\pi}{2}\right) \times \left(-\frac{\pi}{2}, \frac{3\pi}{2}\right).$$
Prove that $F$ has a local maximum at $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$, a local minimum at $\left(\frac{4\pi}{3}, \frac{4\pi}{3}\right)$, and there is no extremum at the critical point $(0, 0)$. For the graph of $F$ see Figure 6.1.3.
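The claims of Exercise 6.1.15 can be checked numerically before proving them. The following is a minimal sketch, not a proof: the helper names `gradient`, `hessian` and `classify` are introduced here for illustration only, and the second-derivative test is the standard one for functions of two variables. At $(0, 0)$ the Hessian vanishes, so the test is inconclusive there and the sketch instead evaluates $F$ along the diagonal, where $F(t, t) = 2\sin t\,(1 - \cos t)$ changes sign.

```python
import math

def F(x, y):
    return math.sin(x) + math.sin(y) - math.sin(x + y)

def gradient(x, y):
    # First partial derivatives of F.
    return (math.cos(x) - math.cos(x + y), math.cos(y) - math.cos(x + y))

def hessian(x, y):
    # Second partial derivatives (F_xx, F_xy, F_yy).
    s = math.sin(x + y)
    return (s - math.sin(x), s, s - math.sin(y))

def classify(x, y, tol=1e-12):
    # Second-derivative test; "degenerate" means the test gives no answer.
    fxx, fxy, fyy = hessian(x, y)
    det = fxx * fyy - fxy * fxy
    if det > tol:
        return "local maximum" if fxx < 0 else "local minimum"
    if det < -tol:
        return "saddle"
    return "degenerate"

for point in [(2 * math.pi / 3, 2 * math.pi / 3),
              (4 * math.pi / 3, 4 * math.pi / 3),
              (0.0, 0.0)]:
    print(point, gradient(*point), classify(*point))

# Near (0, 0) the function takes both signs, so the origin is not an extremum.
print(F(0.1, 0.1), F(-0.1, -0.1))
```

The sign change along the diagonal is exactly the argument one would write down for the third part of the exercise.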

Exercise 6.1.16. Find local and global extrema¹¹ of the functional
$$F : C[0, 1] \to \mathbb{R} : \quad F(u) = \int_0^1 \left[|u(t)|^2 + u(t)v(t) + w(t)\right] dt$$
where $v, w \in C[0, 1]$ are given functions.

Exercise 6.1.17. Use Theorem 6.1.5 to prove that the solution of the Euler equation (6.1.14) is a local minimum of $F$ from Example 6.1.10.
Hint. Show that
$$\delta^2 F(v; h, h) \ge \frac{(2c - a)^{3/2}}{4c\sqrt{gca}} \int_0^a |h'(x)|^2\,dx.$$

Exercise 6.1.18. Prove that the functional
$$F(u) = \int_0^\pi |u(x)|^2\left[1 - |u'(x)|^2\right] dx$$
has in $C_0^1[0, \pi]$ a unique local minimum at $u = 0$.

¹¹ The functional $F : X \to \mathbb{R}$ reaches its global minimum over $M \subset X$ if there exists $u_0 \in M$ such that $F(u) \ge F(u_0)$ for all $u \in M$. Global maximum is defined similarly. See Section 6.2 for more detail on the existence of global extrema.

Figure 6.1.3. Graph of F

Exercise 6.1.19 (Weierstrass Example). Prove that the functional
$$F(u) = \int_{-1}^{1} x^2 |u'(x)|^2\,dx$$
does not have its global minimum over the set
$$M = \{u \in C^1[-1, 1] : u(-1) = -1,\ u(1) = 1\}.$$
Hint. Set
$$u_n(x) = \frac{\arctan nx}{\arctan n}$$
and prove that $\lim_{n\to\infty} F(u_n) = 0$.
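The limit in the hint can be observed numerically. The sketch below is only an illustration under stated assumptions: `F_closed` uses the closed form $F(u_n) = \dfrac{\arctan n - n/(1+n^2)}{n \arctan^2 n}$, which follows from the substitution $t = nx$ and the antiderivative $\tfrac12\big(\arctan t - t/(1+t^2)\big)$ of $t^2/(1+t^2)^2$, and `F_quad` is a plain midpoint rule used as an independent cross-check. Both function names are introduced here and are not part of the text.

```python
import math

def F_closed(n):
    # Closed form of F(u_n) for u_n(x) = arctan(n x) / arctan(n).
    a = math.atan(n)
    return (a - n / (1 + n * n)) / (n * a * a)

def F_quad(n, panels=20000):
    # Composite midpoint rule for the integral of x^2 * u_n'(x)^2 over [-1, 1].
    a = math.atan(n)
    h = 2.0 / panels
    total = 0.0
    for k in range(panels):
        x = -1.0 + (k + 0.5) * h
        du = n / ((1 + (n * x) ** 2) * a)   # derivative of u_n at x
        total += x * x * du * du
    return total * h

for n in (1, 10, 100, 1000):
    print(n, F_closed(n), F_quad(n))
```

The values are positive and decrease towards zero, which matches the statement that the infimum $0$ is approached but, since $F(u) > 0$ for every $u \in M$, never attained.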

Exercise 6.1.20. Prove that the functional
$$F(u) = \int_{-1}^{1} x^{2/5} |u'(x)|^2\,dx$$
does not have its global minimum over the set $M$ from Exercise 6.1.19.
Hint. The corresponding Euler equation has no solution.

Exercise 6.1.21. Prove the following statement: Let $g : [a, b] \times \mathbb{R} \to \mathbb{R}$ be a function and assume that for any $x \in [a, b]$ the equation
$$g(x, z) = 0$$
has a solution denoted by $z = z(x)$ (not necessarily unique). If
$$\frac{\partial g}{\partial z}(x, z) > 0 \quad\text{on } [a, b] \times \mathbb{R},$$
then this solution is unique. If, moreover, $g$ and $\frac{\partial g}{\partial z}$ are continuous on $[a, b] \times \mathbb{R}$, then $z = z(x)$ is continuous on $[a, b]$ as well.


Hint. For the continuity of z = z(x) use the Implicit Function Theorem in the form of Remark 4.2.3 and notice that usage of the Contraction Principle is also possible at the end points a, b.

6.2 Global Extrema

In contrast with the previous section we focus now on points of global extrema. The key assertions deal with weakly coercive and weakly sequentially lower semicontinuous functionals.

Let us consider a differentiable function of one real variable, $F : \mathbb{R} \to \mathbb{R}$. It is not difficult to give an example which shows that local extrema of $F$ need not be its global extrema – see Figures 6.2.1, 6.2.2.

Figure 6.2.1. F attains neither its maximum nor its minimum on R.

Figure 6.2.2. F attains its extrema on [a, b] at a and b, respectively.

It is quite natural to ask: What properties of $F$ guarantee the existence of the point of global extremum of $F$? First of all let us note that we can look for global minima only because global maxima of $F$ are global minima of $-F$ and vice versa.

Let us consider the following very simple model example of a function $F : \mathbb{R} \to \mathbb{R}$ which is continuous in a bounded interval $[a, b]$. Then there exists a point $x_0 \in [a, b]$ such that
$$F(x_0) = \min_{x \in [a, b]} F(x),$$
i.e., the minimum of $F$ over $[a, b]$ is at the point $x_0$ (see Figure 6.2.3). The proof of this fact is typical for this section. Assume that $\{x_n\}_{n=1}^\infty$ is a minimizing sequence for $F$ on $[a, b]$, i.e.,
$$F(x_n) \searrow \inf_{x \in [a, b]} F(x).^{12} \tag{6.2.1}$$

¹² Note that, for a general $M$, we set $\inf M = -\infty$ if $M$ is not bounded below.

Figure 6.2.3.

The compactness of $[a, b]$ implies that there is a subsequence $\{x_{n_k}\}_{k=1}^\infty \subset \{x_n\}_{n=1}^\infty$ and a point $x_0 \in [a, b]$ such that $x_{n_k} \to x_0$. The continuity of $F$ then implies that
$$F(x_0) = \inf_{x \in [a, b]} F(x).$$

The reader should notice that a property weaker than the continuity of $F$ is sufficient to get this conclusion, namely
$$F(x_0) \le \liminf_{k \to \infty} F(x_{n_k}) \tag{6.2.2}$$
(cf. Definition 6.2.1 below). It follows now from (6.2.1) and (6.2.2) that
$$F(x_0) = \inf_{x \in [a, b]} F(x)$$
(see Figure 6.2.4). If, moreover,
$$F(a) > \inf_{x \in (a, b)} F(x), \qquad F(b) > \inf_{x \in (a, b)} F(x), \tag{6.2.3}$$
then $x_0$ is also a local minimum of $F$ (see Figures 6.2.3 and 6.2.4).

Assume in the sequel in this section that
$$F : X \to \mathbb{R}$$
is a functional on a (infinite dimensional) Banach space $X$. It is quite natural to ask if a similar result as above holds if $[a, b]$ is replaced by a closed and bounded set $D \subset X$ and (6.2.3) is substituted by
$$\inf_{u \in \partial D} F(u) > \inf_{u \in \operatorname{int} D} F(u).$$

Figure 6.2.4.

Unfortunately, the answer is no in general (see Exercise 6.2.23). The reason lies in the fact that the compactness of the bounded and closed interval $[a, b]$ is the crucial property which plays the essential role in the proof. In fact, one can imitate the proof above to get the following result: Let $F$ be a lower semi-continuous real functional on a compact set $K \subset X$. Then $F$ has a minimum in $K$. However, this assertion has practically no applications because compact subsets of the infinite dimensional Banach space $X$ are "too thin" (see Proposition 1.2.15). For instance, for any compact set $K \subset X$ we have $\operatorname{int} K = \emptyset$.

Because of this fact we have to look for a different (weaker – why?) topology on $X$ than that induced by the norm. We would like to find a new topology on $X$ with respect to which any bounded (in the norm) set $D \subset X$ is relatively compact. The lower semi-continuity of a functional $F$ with respect to this topology will then allow us to prove the above assertion with $K$ substituted by a bounded and closed set $D$ with respect to this new topology. These problems gave an impulse for the study of weak convergence introduced in Definition 2.1.21.

The reader should notice that we will discuss weak sequential continuity of functionals instead of weak continuity (these are different concepts since weak topology is not metrizable in general). The reason is quite practical: weak sequential (semi-) continuity is easier to prove for a concrete (e.g., integral) functional.

In order to make the exposition in this section as clear as possible we will restrict our attention to real Hilbert spaces $H$. The reader should have in mind that the following notions can also be defined in any Banach space.

Definition 6.2.1. Let $F : H \to \mathbb{R}$ be a functional, $M \subset H$. Then $F$ is said to be weakly sequentially lower semi-continuous at a point $u_0 \in M$ relative to $M$ if for any sequence $\{u_n\}_{n=1}^\infty \subset M$ such that $u_n \rightharpoonup u_0$ we have
$$F(u_0) \le \liminf_{n \to \infty} F(u_n).$$
We say that $F$ is weakly sequentially lower semi-continuous in $M \subset H$ if it is weakly sequentially lower semi-continuous at every point $u \in M$ relative to $M$.

Example 6.2.2. The norm $\|\cdot\|$ on $H$ is a weakly sequentially lower semi-continuous functional in $H$ as follows immediately from Proposition 2.1.22(iii).

Example 6.2.3. Let $L : H \to \mathbb{R}$ be a continuous linear form. Then $L$ is weakly sequentially lower semi-continuous in $H$. Indeed, it follows from the Riesz Representation Theorem (Theorem 1.2.40) that there is $v \in H$ such that
$$L(u) = (u, v) \quad\text{for all } u \in H.$$
Hence
$$u_n \rightharpoonup u_0 \quad\text{implies}\quad L(u_n) \to L(u_0),^{13}$$
in particular,
$$L(u_0) \le \liminf_{n \to \infty} L(u_n).$$
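The difference between weak and norm convergence behind Examples 6.2.2 and 6.2.3 can be made concrete with an orthonormal sequence. The sketch below works under explicit assumptions: the Hilbert space is modelled by a finite truncation of $\ell^2$ (dimension `DIM`, chosen arbitrarily), and `basis_vector`, `inner` and `norm` are helper names introduced here. For a fixed square-summable $v$, the pairings $(e_n, v)$ tend to $0$, so $e_n \rightharpoonup o$, while $\|e_n\| = 1$ for every $n$. Thus $\|o\| = 0 \le \liminf \|e_n\| = 1$ with strict inequality: the norm is weakly sequentially lower semi-continuous but not weakly sequentially continuous.

```python
import math

DIM = 2000  # truncation dimension; an assumption made for the illustration

def basis_vector(n, dim=DIM):
    # n-th unit vector e_n (1-based index) in the truncated space.
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u))

# A fixed element v = (1, 1/2, 1/3, ...) representing a linear form L(u) = (u, v).
v = [1.0 / k for k in range(1, DIM + 1)]

for n in (1, 10, 100, 1000):
    e_n = basis_vector(n)
    print(n, inner(e_n, v), norm(e_n))
```

The second column shrinks like $1/n$, the third stays equal to $1$.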

The following assertion is a counterpart of Proposition 1.2.2 which is known as the Extreme Value Theorem for $H = \mathbb{R}$.

Theorem 6.2.4 (Extreme Value Theorem). Let $M$ be a weakly sequentially compact nonempty subset of $H$ and let $F$ be a weakly sequentially lower semi-continuous functional in $M$. Then $F$ is bounded below in $M$, and there exists $u_0 \in M$ such that
$$F(u_0) = \min_{u \in M} F(u).$$

Proof. Let $\{u_n\}_{n=1}^\infty$ be a minimizing sequence for $F$ relative to $M$, i.e.,
$$\{u_n\}_{n=1}^\infty \subset M \quad\text{and}\quad F(u_n) \searrow \inf_{u \in M} F(u).$$
Since $M$ is weakly sequentially compact there exist $u_0 \in M$ and a subsequence $\{u_{n_k}\}_{k=1}^\infty \subset \{u_n\}_{n=1}^\infty$ such that $u_{n_k} \rightharpoonup u_0$. The assumption on $F$ implies
$$\inf_{u \in M} F(u) \le F(u_0) \le \liminf_{k \to \infty} F(u_{n_k}) = \lim_{n \to \infty} F(u_n) = \inf_{u \in M} F(u),$$
i.e.,
$$F(u_0) = \inf_{u \in M} F(u) > -\infty.$$

Corollary 6.2.5. Let $M \subset H$, $F : H \to \mathbb{R}$, and $u_0$ be as in Theorem 6.2.4. Assume, moreover, that $u_0 \in \operatorname{int} M$. If $\delta F(u_0; v)$ exists for a $v \in H$, then
$$\delta F(u_0; v) = 0.$$

¹³ If $u_n \rightharpoonup u_0$ implies $F(u_n) \to F(u_0)$, then the functional $F$ is called weakly sequentially continuous at $u_0$.

Proof. The assumption $u_0 \in \operatorname{int} M$ implies that $F$ attains also its local minimum at $u_0$. The assertion now follows from Proposition 6.1.2.

Example 6.2.6. Let us consider the boundary value problem for the second order ordinary differential equation
$$\begin{cases} -\ddot{x}(t) + x^3(t) = f(t), & t \in (0, 1),\\ x(0) = x(1) = 0, \end{cases} \tag{6.2.4}$$
where $f \in L^2(0, 1)$ is a given function. Put $H := W_0^{1,2}(0, 1)$ with the norm
$$\|x\| = \left(\int_0^1 |\dot{x}(t)|^2\,dt\right)^{1/2}.$$
A weak solution¹⁴ of (6.2.4) is a function $x \in H$ for which the integral identity
$$\int_0^1 \dot{x}(t)\dot{y}(t)\,dt + \int_0^1 x^3(t)y(t)\,dt = \int_0^1 f(t)y(t)\,dt$$
holds for any function $y \in H$. Let us define a functional $F : H \to \mathbb{R}$ by
$$F(x) = \frac{1}{2}\int_0^1 |\dot{x}(t)|^2\,dt + \frac{1}{4}\int_0^1 |x(t)|^4\,dt - \int_0^1 f(t)x(t)\,dt, \qquad x \in H.^{15}$$
Then for $x, y \in H$ we have
$$\delta F(x; y) = \int_0^1 \dot{x}(t)\dot{y}(t)\,dt + \int_0^1 x^3(t)y(t)\,dt - \int_0^1 f(t)y(t)\,dt,$$
and any critical point of $F$, i.e., $x \in H$ satisfying
$$\delta F(x; y) = 0 \quad\text{for an arbitrary } y \in H,$$
is a weak solution of (6.2.4) and vice versa. We will show that Corollary 6.2.5 applies to $F$ and a suitably chosen set $M \subset H$.

First let us prove that $F$ is a weakly sequentially lower semi-continuous functional on $H$. Consider an arbitrary $z \in H$ and $\{x_n\}_{n=1}^\infty \subset H$ such that $x_n \rightharpoonup z$ in $H$. Due to the compact embedding (Theorem 1.2.28(iii))
$$H = W_0^{1,2}(0, 1) \subset\subset C[0, 1],$$
we have that $x_n \to z$ in $C[0, 1]$ (Proposition 2.2.4(iii)).

¹⁴ For a detailed discussion of the notion of a weak solution see Remark 5.3.10. Note that this weak solution $x_0$ minimizes the energy functional $F$, i.e., it corresponds to the state with the minimal energy of the system.
¹⁵ This functional can represent the energy of a certain system. For this reason it is often called the energy functional.

This implies
$$\int_0^1 |x_n(t)|^4\,dt \to \int_0^1 |z(t)|^4\,dt, \qquad \int_0^1 f(t)x_n(t)\,dt \to \int_0^1 f(t)z(t)\,dt. \tag{6.2.5}$$
The weak sequential lower semi-continuity of the Hilbert norm $\|\cdot\|$ (Example 6.2.2) implies
$$\liminf_{n \to \infty} \|x_n\|^2 \ge \|z\|^2. \tag{6.2.6}$$
We obtain from (6.2.5) and (6.2.6) that
$$\liminf_{n \to \infty} F(x_n) \ge F(z).$$

To find a suitable set $M$ we first note that
$$\|x\|_{L^2(0,1)} \le \|x\|_{W_0^{1,2}(0,1)}.^{16} \tag{6.2.7}$$
Due to this fact we can estimate $F$ using the Hölder inequality as follows:
$$F(x) \ge \frac{1}{2}\|x\|^2 - \|f\|_{L^2(0,1)}\|x\|_{L^2(0,1)} \ge \frac{1}{2}\|x\|\left(\|x\| - 2\|f\|_{L^2(0,1)}\right). \tag{6.2.8}$$
It is clear that for $\|x\| > 2\|f\|_{L^2(0,1)}$ we have $F(x) > 0$, and at the same time $F(o) = 0$. So, taking
$$M = \{x \in H : \|x\| \le 2\|f\|_{L^2(0,1)} + 1\},$$
the assumptions of Corollary 6.2.5 are fulfilled, since a closed ball in a Hilbert space is weakly sequentially compact (see Theorem 2.1.25 and Proposition 2.1.22(iii)). We then conclude that there exists at least one weak solution $x_0 \in H$ of the boundary value problem (6.2.4).

From (6.2.8) it is easy to see that the functional $F$ from the previous example satisfies
$$\lim_{\|x\| \to \infty} F(x) = \infty.$$
This motivates the following general definition.

Definition 6.2.7. A functional $F : H \to \mathbb{R}$ is said to be weakly coercive on $H$ if
$$\lim_{\|u\| \to \infty} F(u) = \infty.$$

¹⁶ This follows by a direct calculation using the Hölder inequality for $x(t) = \int_0^t \dot{x}(s)\,ds$.
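The lower bound (6.2.8), which drives the weak coercivity of the energy functional, can be spot-checked on sample functions. The sketch below is an illustration under assumptions that are not in the text: the right-hand side $f(t) = 10t$ and the test functions $x(t) = c\sin(k\pi t)$ are arbitrary choices, integrals are approximated by a midpoint rule, and the names `integrate`, `energy`, `lower_bound` and `sample` are introduced here.

```python
import math

def integrate(g, panels=2000):
    # Composite midpoint rule on [0, 1].
    h = 1.0 / panels
    return h * sum(g((k + 0.5) * h) for k in range(panels))

def energy(x, dx, f):
    # F(x) = 1/2 * int |x'|^2 + 1/4 * int |x|^4 - int f x
    return (0.5 * integrate(lambda t: dx(t) ** 2)
            + 0.25 * integrate(lambda t: x(t) ** 4)
            - integrate(lambda t: f(t) * x(t)))

def lower_bound(dx, f):
    # Right-hand side of (6.2.8): 1/2 * ||x|| * (||x|| - 2 ||f||_{L^2}).
    norm_x = math.sqrt(integrate(lambda t: dx(t) ** 2))
    norm_f = math.sqrt(integrate(lambda t: f(t) ** 2))
    return 0.5 * norm_x * (norm_x - 2.0 * norm_f)

def sample(c, k):
    # Test function x(t) = c sin(k pi t), which vanishes at t = 0 and t = 1.
    x = lambda t: c * math.sin(k * math.pi * t)
    dx = lambda t: c * k * math.pi * math.cos(k * math.pi * t)
    return x, dx

f = lambda t: 10.0 * t   # hypothetical right-hand side

for c in (-5.0, 0.5, 3.0, 20.0):
    for k in (1, 2, 3):
        x, dx = sample(c, k)
        print(c, k, energy(x, dx, f), lower_bound(dx, f))
```

In every row the energy stays above the bound, and for large amplitudes both columns grow, in line with Definition 6.2.7.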


This notion together with Corollary 6.2.5 leads to the following global result.

Theorem 6.2.8. Let $F : H \to \mathbb{R}$ be a weakly sequentially lower semi-continuous and weakly coercive functional. Then $F$ is bounded below on $H$, and there exists $u_0 \in H$ such that
$$F(u_0) = \min_{u \in H} F(u).$$
Moreover, if $\delta F(u_0; v)$ exists for a $v \in H$, then
$$\delta F(u_0; v) = 0.$$

Proof. Let $d > \inf_{u \in H} F(u)$. There exists $R > 0$ such that for $u \in H$, $\|u\| \ge R$, we have $F(u) \ge d$. Hence
$$\inf_{\|u\| \le R} F(u) = \inf_{u \in H} F(u).$$
Now, we apply Theorem 6.2.4 with $M = \{u \in H : \|u\| \le R\}$. The assertion on a directional derivative follows from Corollary 6.2.5.

From the point of view of applications, it is convenient to have sufficient conditions "in the language of the topology on $H$ induced by the norm" which guarantee that
  • the set $M$ is weakly sequentially compact;
  • the functional $F$ is weakly sequentially lower semi-continuous in $M$.¹⁷

We recall the results from Chapter 2 which state that every closed, convex and bounded set $M \subset H$ is weakly sequentially compact (see Exercise 2.1.39, Theorem 2.1.25 and Remark 2.1.24). Concerning the desired property of $F$ we need the following auxiliary assertion.

Lemma 6.2.9. Let $M \subset H$. Then $F : H \to \mathbb{R}$ is weakly sequentially lower semi-continuous in $M$ if and only if for every $a \in \mathbb{R}$ the set
$$E(a) = \{u \in M : F(u) \le a\}$$
is weakly sequentially closed in $M$.¹⁸

¹⁷ Not every continuous functional is weakly sequentially lower semi-continuous (cf. Exercise 6.2.31).
¹⁸ The set $E \subset M$ is called weakly sequentially closed in $M$ if for any $\{x_n\}_{n=1}^\infty \subset E$, $x_n \rightharpoonup x \in M$, we have $x \in E$.

Proof. Let $F$ be a weakly sequentially lower semi-continuous functional in $M$, $a \in \mathbb{R}$, $\{u_n\}_{n=1}^\infty \subset E(a)$, $u_n \rightharpoonup u_0$, $u_0 \in M$. Then
$$F(u_0) \le \liminf_{n \to \infty} F(u_n) \le a, \quad\text{i.e.,}\quad u_0 \in E(a).$$
Hence $E(a)$ is weakly sequentially closed in $M$.

On the other hand, assume that for every $a \in \mathbb{R}$ the set $E(a)$ is weakly sequentially closed in $M$. Let $\{u_n\}_{n=1}^\infty \subset M$, $u_n \rightharpoonup u_0 \in M$ and denote
$$\gamma = \liminf_{n \to \infty} F(u_n).$$
Then there is a subsequence $\{u_{n_k}\}_{k=1}^\infty$ such that $F(u_{n_k}) \to \gamma$. For any $\varepsilon > 0$ we have $u_{n_k} \in E(\gamma + \varepsilon)$ for $k$ sufficiently large. Since $E(\gamma + \varepsilon)$ is weakly sequentially closed in $M$, $u_0 \in E(\gamma + \varepsilon)$. Since $\varepsilon > 0$ was arbitrary, $u_0 \in E(\gamma)$, i.e.,
$$F(u_0) \le \liminf_{n \to \infty} F(u_n).$$

Proposition 6.2.10. Let $F$ be a convex and continuous functional defined in a convex set $M \subset H$. Then $F$ is weakly sequentially lower semi-continuous in $M$.

Proof. It follows from the convexity of $F$ that the set
$$E(a) = \{u \in M : F(u) \le a\}$$
is convex. The continuity of $F$ implies that $E(a)$ is closed in $M$. It follows from Exercise 2.1.39 and Remark 2.1.24 that it is also weakly sequentially closed in $M$. The result now follows from Lemma 6.2.9.

These results combined with Theorems 6.2.4 and 6.2.8 allow us to formulate the following assertions, very often used in applications.

Theorem 6.2.11. Let $M$ be a closed, convex, bounded and nonempty subset of $H$. Let $F : H \to \mathbb{R}$ be a convex and continuous functional on $M$. Then $F$ is bounded below on $M$ and there exists $u_0 \in M$ such that
$$F(u_0) = \inf_{u \in M} F(u).$$
If, moreover, $F$ is strictly convex, then $u_0$ is the unique point with this property.¹⁹

Theorem 6.2.12. Let $F : H \to \mathbb{R}$ be continuous, convex and weakly coercive on $H$. Then $F$ is bounded below on $H$, and there exists $u_0 \in H$ such that
$$F(u_0) = \inf_{u \in H} F(u).$$

¹⁹ The reader is invited to prove the uniqueness of $u_0$!


If $\delta F(u_0; v)$ exists for a $v \in H$, then $\delta F(u_0; v) = 0$. If, moreover, $F$ is strictly convex, then $u_0$ is uniquely determined.

Example 6.2.13. For any real continuous linear form $L : H \to \mathbb{R}$ there exists $u \in H$ such that
$$\|u\| = 1 \quad\text{and}\quad \|L\| = L(u).$$
Indeed, the set $M = \{u \in H : \|u\| \le 1\}$ and the functional $F = -L$ satisfy the assumptions of Theorem 6.2.11. Hence there exists $u_0 \in M$ such that
$$-L(u_0) = \inf_{u \in M} (-L(u)).$$
By the linearity of $L$ and the symmetry of $M$ we have
$$-\inf_{u \in M} (-L(u)) = \sup_{u \in M} |L(u)|, \quad\text{i.e.,}\quad L(u_0) = \sup_{u \in M} |L(u)| = \|L\|.$$
Assume that $L \neq 0$ and $\|u_0\| < 1$. Then there exists $t > 1$ such that $\|tu_0\| = 1$, i.e., $tu_0 \in M$, and
$$L(tu_0) = tL(u_0) = t\|L\| > \sup_{u \in M} |L(u)|,$$
a contradiction. Note that this assertion can be proved directly using the Riesz Representation Theorem (Theorem 1.2.40).

Example 6.2.14. Let us consider the boundary value problem (6.2.4) and the energy functional
$$F(x) = \frac{1}{2}\int_0^1 |\dot{x}(t)|^2\,dt + \frac{1}{4}\int_0^1 |x(t)|^4\,dt - \int_0^1 f(t)x(t)\,dt, \qquad x \in H := W_0^{1,2}(0, 1),$$
associated with (6.2.4). We have actually proved in Example 6.2.6 that $F$ is weakly coercive on $H$. The continuity of $F$ on $H$ follows from the continuity of the norm in $H$, the continuity of the embedding $H = W_0^{1,2}(0, 1) \subset L^4(0, 1)$ and from the continuity of the linear form
$$x \mapsto \int_0^1 f(t)x(t)\,dt \quad\text{on } H$$
under the assumption $f \in L^2(0, 1)$. The strict convexity of $F$ follows from the strict convexity of the real functions
$$t \mapsto t^2, \qquad t \mapsto t^4,$$
and the convexity of the linear form. We conclude (see Theorem 6.2.12) that there exists a unique $x_0 \in H$ such that
$$F(x_0) = \min_{x \in H} F(x).$$
It follows then from Proposition 6.1.8 that $x_0$ is the unique weak solution of (6.2.4).

Remark 6.2.15. The reader should compare Examples 6.2.6 and 6.2.14. In the latter one we have used Theorem 6.2.12 which enables us to avoid verifying the assumption of the weak sequential lower semi-continuity of $F$. This might be a difficult task in general (it cannot always be done so easily by means of the compact embedding as in Example 6.2.6). The reader should also notice that the continuity of $F$ without any additional assumptions does not imply the weak sequential lower semi-continuity of $F$ (see Exercise 6.2.31).

In the last part of this section we show another possibility for finding critical points of $F$ under the assumption that $F$ is differentiable. First we need two auxiliary assertions.

Lemma 6.2.16. Let $F$ be a functional defined on $H$ and $\nabla F$ its gradient.²⁰ Let $\nabla F : H \to H$ be a monotone operator. Then $F$ is weakly sequentially lower semi-continuous on $H$.

Proof. Let $u, v \in H$. According to the Mean Value Theorem applied to the real function
$$\varphi : s \mapsto F(v + s(u - v)), \qquad s \in [0, 1],$$
there exists $t \in (0, 1)$ such that
$$\begin{aligned} F(u) - F(v) &= (\nabla F(v + t(u - v)), u - v)\\ &= (\nabla F(v), u - v) + (\nabla F(v + t(u - v)) - \nabla F(v), u - v)\\ &\ge (\nabla F(v), u) - (\nabla F(v), v).^{21} \end{aligned} \tag{6.2.9}$$
Let $\{v_n\}_{n=1}^\infty$ be a sequence in $H$ such that $v_n \rightharpoonup v$ in $H$; in particular,
$$(\nabla F(v), v_n) \to (\nabla F(v), v).$$
It follows from (6.2.9) that
$$\liminf_{n \to \infty} F(v_n) \ge F(v) - (\nabla F(v), v) + \lim_{n \to \infty} (\nabla F(v), v_n) = F(v).$$

²⁰ Remember that according to the Riesz Representation Theorem (Theorem 1.2.40), the Gâteaux derivative $DF(u)$ is identified with an element of $H$ which is denoted by $\nabla F(u)$ and called a gradient of $F$ at $u$. Remember also that $\nabla F$ is a mapping from $H$ into itself. (Cf. Example 3.2.4.)
²¹ Since the monotonicity of $\nabla F$ implies that $\varphi'$ is increasing, $\varphi$ is a convex function, i.e., $F$ is convex.


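Definition 6.2.17. Let $T : H \to H$ be an operator from $H$ into itself. We say that $T$ is coercive if
$$\lim_{\|u\| \to \infty} \frac{(T(u), u)}{\|u\|} = \infty.$$

Lemma 6.2.18. Let $F : H \to \mathbb{R}$ be a functional and $\nabla F : H \to H$ its gradient. Let $\nabla F$ be a coercive and bounded operator. Then $F$ is weakly coercive.

Proof. Since
$$\frac{d}{dt} F(tu) = (\nabla F(tu), u),$$
we obtain by integration
$$F(u) = F(o) + \int_0^1 (\nabla F(tu), tu)\,\frac{dt}{t} = F(o) + \int_0^{\|u\|} \left(\nabla F\!\left(\tau\frac{u}{\|u\|}\right), \tau\frac{u}{\|u\|}\right)\frac{d\tau}{\tau}$$
for any $u \in H$, $u \neq o$. The coercivity of $\nabla F$ implies that there exists $r \ge 0$ such that
$$\frac{1}{\tau}\left(\nabla F\!\left(\tau\frac{u}{\|u\|}\right), \tau\frac{u}{\|u\|}\right) \ge 1 \quad\text{for any } \tau \ge r \text{ and } u \in H,\ u \neq o.$$
The boundedness of $\nabla F$ implies
$$m := \sup_{\substack{\tau \in [0, r]\\ u \in H,\ u \neq o}} \left\|\nabla F\!\left(\tau\frac{u}{\|u\|}\right)\right\| < \infty.$$
Consequently, we obtain
$$\begin{aligned} F(u) &= F(o) + \int_0^r \left(\nabla F\!\left(\tau\frac{u}{\|u\|}\right), \tau\frac{u}{\|u\|}\right)\frac{d\tau}{\tau} + \int_r^{\|u\|} \left(\nabla F\!\left(\tau\frac{u}{\|u\|}\right), \tau\frac{u}{\|u\|}\right)\frac{d\tau}{\tau}\\ &\ge F(o) - rm + \|u\| - r \end{aligned}$$
for any $u \in H$, $\|u\| > r$. The last inequality yields the weak coercivity of $F$.

Remark 6.2.19. Let $F : H \to \mathbb{R}$, $h \in H$. Assume that the gradient $\nabla F(u)$ of $F$ exists at any point $u \in H$. Then the following equivalence obviously holds true: There exists $u_0 \in H$ such that
$$\nabla F(u_0) = h$$
if and only if there exists $u_0 \in H$ such that
$$\nabla G(u_0) = o$$
where
$$G : u \mapsto F(u) - (u, h). \tag{6.2.10}$$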


Theorem 6.2.20. Let $F : H \to \mathbb{R}$ and let $\nabla F : H \to H$ be the gradient of $F$. Let $\nabla F$ be a monotone, coercive and bounded operator. Then $\nabla F(H) = H$.²²

Proof. It follows from Remark 6.2.19 that it is enough to prove that for any $h \in H$, the functional $G$ defined by (6.2.10) has a critical point. But Lemmas 6.2.16 and 6.2.18 yield that $G$ is weakly sequentially lower semi-continuous and weakly coercive. The existence of a critical point of $G$ follows from Theorem 6.2.8.

Example 6.2.21. Let us consider again the boundary value problem (6.2.4) and the associated energy functional
$$F(x) = \frac{1}{2}\int_0^1 |\dot{x}(t)|^2\,dt + \frac{1}{4}\int_0^1 |x(t)|^4\,dt - \int_0^1 f(t)x(t)\,dt.$$
Then
$$\nabla F(x)y = \int_0^1 \dot{x}(t)\dot{y}(t)\,dt + \int_0^1 x^3(t)y(t)\,dt - \int_0^1 f(t)y(t)\,dt$$
is the Gâteaux derivative of $F$ in $H := W_0^{1,2}(0, 1)$. We verify the assumption of Theorem 6.2.20. Using the continuous embedding $H = W_0^{1,2}(0, 1) \subset C[0, 1]$ (Theorem 1.2.26) we prove the boundedness (and even continuity!) of $\nabla F$ in the space $H$. Since $s \mapsto s^3$ is monotone we have
$$(\nabla F(x_1) - \nabla F(x_2), x_1 - x_2) = \int_0^1 |\dot{x}_1(t) - \dot{x}_2(t)|^2\,dt + \int_0^1 (x_1^3(t) - x_2^3(t))(x_1(t) - x_2(t))\,dt \ge \|x_1 - x_2\|^2 \tag{6.2.11}$$
for $x_1, x_2 \in H$ and the monotonicity of $\nabla F$ follows. Finally, we have
$$\frac{(\nabla F(x), x)}{\|x\|} = \|x\| + \frac{\|x\|_{L^4(0,1)}^4}{\|x\|} - \frac{1}{\|x\|}\int_0^1 f(t)x(t)\,dt \ge \|x\| - \|f\|_{L^2(0,1)}\frac{\|x\|_{L^2(0,1)}}{\|x\|}.$$
Using the inequality (6.2.7) we get
$$\frac{(\nabla F(x), x)}{\|x\|} \ge \|x\| - \|f\|_{L^2(0,1)},$$
i.e., $\nabla F$ is coercive. We conclude from Theorem 6.2.20 that
$$\nabla F(H) = H,$$

²² Compare this result with Theorem 5.3.4.


in particular, there exists $x_0 \in H$ such that
$$\nabla F(x_0) = o.$$
Hence $x_0$ is a weak solution of (6.2.4). The estimate (6.2.11) then implies the uniqueness of $x_0$.²³

Remark 6.2.22. Most of the previous results hold true when the Hilbert space $H$ is replaced by a real reflexive Banach space $X$ and the scalar product $(\cdot, \cdot)$ is replaced by the duality pairing $\langle\cdot, \cdot\rangle$ between $X^*$ and $X$, i.e., for $f \in X^*$ and $x \in X$ we write $\langle f, x\rangle := f(x)$. However, the proofs are technically more involved and the gradient $\nabla F$ has to be replaced by the Gâteaux derivative $DF$.

²³ The reader is invited to apply Theorem 5.3.4 to get the same result.

Exercise 6.2.23. Let $\{e_n\}_{n=1}^\infty$ be an orthonormal basis in a Hilbert space $H$. Put
$$D_n = \left\{x \in H : \|x - e_n\| \le \frac{1}{2}\right\}$$
and define a functional
$$f(x) = \begin{cases} \|x\| & \text{for } x \notin \bigcup_{n=1}^\infty D_n,\\[4pt] \|x\| + \dfrac{2(n - 1)}{n}\left(\dfrac{1}{2} - \|x - e_n\|\right) & \text{for } x \in D_n. \end{cases}$$
Show that $f$ is continuous on $H$,
$$\sup_{\|x\| \le \frac{3}{2}} f(x) = 2,$$
but $f$ does not have maximum on the ball
$$\left\{x \in H : \|x\| \le \frac{3}{2}\right\}.$$

Exercise 6.2.24. The mapping $U : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
$$U : (x, y) \mapsto (y, -x).$$
Prove that $U$ is monotone and satisfies
$$\lim_{\|(x, y)\| \to \infty} \|U(x, y)\| = \infty$$
but is not coercive.
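The three properties asked for in Exercise 6.2.24 can be inspected on sample points before writing the proof. This is a minimal sketch; the helper names `dot`, `norm`, `monotonicity_gap` and `coercivity_quotient` are introduced here for illustration. Since $U$ is a rotation by a right angle, $(U(p) - U(q), p - q) = 0 \ge 0$ (monotone), $\|U(p)\| = \|p\|$ (norm blows up), while $(U(p), p)/\|p\| = 0$ does not tend to infinity (not coercive in the sense of Definition 6.2.17).

```python
import math

def U(p):
    x, y = p
    return (y, -x)

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

def norm(p):
    return math.hypot(p[0], p[1])

def monotonicity_gap(p, q):
    # (U(p) - U(q), p - q); monotone means this is >= 0 for all p, q.
    up, uq = U(p), U(q)
    return dot((up[0] - uq[0], up[1] - uq[1]), (p[0] - q[0], p[1] - q[1]))

def coercivity_quotient(p):
    # (U(p), p) / ||p||; coercive would mean this tends to infinity.
    return dot(U(p), p) / norm(p)

points = [(1.0, 2.0), (-3.5, 0.25), (1e3, -4e3), (7.0, 7.0)]
for p in points:
    for q in points:
        assert abs(monotonicity_gap(p, q)) < 1e-9
    print(p, norm(U(p)), coercivity_quotient(p))
```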


Exercise 6.2.25. Prove that any coercive map $F : H \to H$ satisfies
$$\lim_{\|u\| \to \infty} \|F(u)\| = \infty.$$

Exercise 6.2.26. Prove that the same conclusion as in Example 6.2.14 holds true also if $f \in L^1(0, 1)$.
Hint. Use the embedding $W_0^{1,2}(0, 1) \subset L^\infty(0, 1)$.

Exercise 6.2.27. Prove that the norm on $H$ and linear forms on $H$ are convex functionals.

Exercise 6.2.28. Prove that in Theorem 6.2.12 the weak coercivity of $F$ can be substituted by a weaker assumption: For any $u \in H$ there exists $r > 0$ such that for all $v \in H$, $\|v\| \ge r$, we have $F(v) > F(u)$.

Exercise 6.2.29. Let $M$ be an open convex subset of a real Hilbert space $H$, let $F : H \to \mathbb{R}$ be a functional such that for any $u \in M$ there exists the second Gâteaux derivative $D^2F(u)$. Prove that
$$\text{(a)} \implies \text{(b)} \implies \text{(c)}$$
where
(a) $D^2F(u)(h, h) \ge 0$ for $u \in M$, $h \in H$;
(b) $(\nabla F(u) - \nabla F(v), u - v) \ge 0$ for $u, v \in M$;
(c) $F$ is convex on $M$.
Hint. Use the Mean Value Theorem (see Theorem 3.2.6) as for real functions.

Exercise 6.2.30. Prove that for any $n \in \mathbb{N}$ and $f \in L^2(0, 1)$ the boundary value problem
$$\begin{cases} -\ddot{x}(t) + x^{2n+1}(t) = f(t), & t \in (0, 1),\\ x(0) = x(1) = 0 \end{cases}$$
has a unique weak solution.

Exercise 6.2.31. Let $f$ be the functional from Exercise 6.2.23. Prove that $f$ is not weakly sequentially upper semi-continuous (i.e., $-f$ is not weakly sequentially lower semi-continuous).
Hint. Remember that $e_n \rightharpoonup o$.

6.2A Ritz Method

In this part of the text we want to address one fundamental numerical approach to finding the global minimum of a real functional on a real Banach space. In applications such a minimum corresponds to a solution of a certain boundary value problem and the general method we will discuss below is a starting point for many numerical methods. Let us mention the Galerkin Method, the Finite Elements Method, the Katchanov–Galerkin Method, etc., which are powerful tools in the numerical solution of differential equations.

Let $X$ be a real Banach space and $F$ a real functional defined on $X$. An element $u_0 \in X$ satisfying
$$F(u_0) = \inf_{u \in X} F(u) \tag{6.2.12}$$
will be called a solution of the variational problem (6.2.12). We will discuss the Ritz Method which actually yields directly an algorithm for finding a solution of the variational problem.

The basic idea of the Ritz Method is rather simple: Instead of looking for the minimum of the functional $F$ on the entire space $X$, we look for its minimum on suitable subspaces of the space $X$ in which we know how to solve the variational problem. Let us now formulate this idea precisely: To every $n \in \mathbb{N}$, let a closed subspace $X_n$ of the space $X$ be assigned. The problem of finding an element $u_n \in X_n$ such that
$$F(u_n) = \inf_{u \in X_n} F(u) \tag{6.2.13}$$
holds is called the Ritz approximation of the problem (6.2.12) and the element $u_n \in X_n$ is called a solution of the problem (6.2.13).

The following two fundamental problems immediately present themselves:
(a) the problem of the existence and uniqueness of a solution of the problem (6.2.13);
(b) the relation between the solutions of the problems (6.2.12) and (6.2.13).

Problem (a) has already been solved by Theorem 6.2.12 in the framework of Hilbert spaces. It follows from Remark 6.2.22 that the same assertion can be proved in a reflexive Banach space $X$. Since a closed subspace $X_n$ of a reflexive Banach space $X$ is also a reflexive Banach space, we have the following assertion which follows directly from Theorem 6.2.12 and Remark 6.2.22.

Proposition 6.2.32. Let $X$ be a reflexive Banach space, and let a functional $F$ defined on the space $X$ be continuous, strictly convex and weakly coercive on $X$. Then each of the problems (6.2.12) and (6.2.13) has precisely one solution $u_0$ and $u_n$, respectively.

We now focus our effort on problem (b). We investigate under what condition
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0 \tag{6.2.14}$$
is true. If (6.2.14) is valid, then we say that the Ritz Method converges for the problem (6.2.12) and the solutions $u_n$ of the problems (6.2.13) approximate the solution of the problem (6.2.12) in the sense of the norm of the space $X$.

Proposition 6.2.33. Let $F$ be a continuous linear functional on a normed linear space $X$ and let $\{X_n\}_{n=1}^\infty$ be a sequence of closed subspaces of $X$ such that for every $v \in X$ there exist elements $v_n \in X_n$, $n \in \mathbb{N}$, such that
$$\lim_{n \to \infty} \|v - v_n\| = 0. \tag{6.2.15}$$


Let $u_n$ be such an element of $X_n$ that (6.2.13) holds. Then $\{u_n\}_{n=1}^\infty$ is a minimizing sequence for the functional $F$ on $X$, i.e.,
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in X} F(u). \tag{6.2.16}$$

Proof. Let $\{\alpha_k\}_{k=1}^\infty$ be a sequence such that
$$\alpha_k \searrow \inf_{u \in X} F(u).$$
Then there exist elements $v^{(k)} \in X$ for which
$$F\left(v^{(k)}\right) < \alpha_k.$$
By the assumption (6.2.15) we can find $w_n^{(k)} \in X_n$ satisfying $w_n^{(k)} \to v^{(k)}$ for $n \to \infty$. Hence
$$\inf_{u \in X} F(u) \le F(u_n) \le F\left(w_n^{(k)}\right).$$
By the continuity of $F$ we get
$$\limsup_{n \to \infty} F(u_n) \le \lim_{n \to \infty} F\left(w_n^{(k)}\right) = F\left(v^{(k)}\right) < \alpha_k.$$
This implies that
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in X} F(u).$$

The assertion on the convergence of the Ritz Method for the problem (6.2.12) is the following proposition.

Proposition 6.2.34 (Ritz Method). Let $H$ be a real Hilbert space,²⁴ and let $F$ be a continuous functional on the space $H$ which has the second Gâteaux derivative $D^2F(u) \in B_2(H, \mathbb{R})$.²⁵ Assume, further, that there exists a constant $c > 0$ such that for all $u, v \in H$ we have
$$D^2F(u)(v, v) \ge c\|v\|^2. \tag{6.2.17}$$
Let subspaces $H_n$ of the space $H$ satisfy condition (6.2.15). Then
(i) there exists precisely one solution $u_0 \in H$ of problem (6.2.12);
(ii) for every $n \in \mathbb{N}$ there exists precisely one solution $u_n \in H_n$ of problem (6.2.13);
(iii) the Ritz Method converges for problem (6.2.12), i.e.,
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0.$$

²⁴ We will state and prove Proposition 6.2.34 in the Hilbert space setting. The generalization to the Banach space setting can be obtained (cf. Remark 6.2.22). The reader can find details in specialized literature (see, e.g., Saaty [116]).
²⁵ See Section 3.2.


Proof. It follows from the Taylor Formula (Proposition 3.2.27) that
$$F(u + v) = F(u) + DF(u)(v) + \int_0^1 (1 - t)D^2F(u + tv)(v, v)\,dt. \tag{6.2.18}$$
Choosing $u = o$, we have due to (6.2.17)
$$F(v) = F(o) + DF(o)(v) + \int_0^1 (1 - t)D^2F(tv)(v, v)\,dt \ge F(o) - \|DF(o)\|\,\|v\| + \frac{c}{2}\|v\|^2,$$
so that $F$ is weakly coercive. Moreover, (6.2.17) and (6.2.18) imply that for any $u, v \in H$, $v \neq o$,
$$F(u + v) - F(u) - DF(u)(v) \ge \frac{c}{2}\|v\|^2 > 0.$$
In particular, for $u = tw_1 + (1 - t)w_2$, $w_1 \neq w_2$, $t \in (0, 1)$, we have
$$F(w_1) - F(u) > (1 - t)DF(u)(w_1 - w_2), \qquad F(w_2) - F(u) > -tDF(u)(w_1 - w_2).$$
Multiplying the first inequality by $t$, the second by $(1 - t)$ and adding both of them, we obtain that $F$ is strictly convex on $H$ (and also on $H_n$ for arbitrary $n$). The assertions (i) and (ii) now follow from Theorem 6.2.12.

It remains to prove assertion (iii). Let $u_0$ and $u_n$ be a solution of (6.2.12) and (6.2.13), respectively. Set $u := u_0$ and $v := u_n - u_0$ in (6.2.18). From (6.2.17) and (6.2.18) we obtain
$$F(u_n) \ge F(u_0) + DF(u_0)(u_n - u_0) + \frac{c}{2}\|u_n - u_0\|^2.$$
Since $u_0 \in H$ is the minimum point for $F$ on $H$, it follows from Theorem 6.2.12 that
$$DF(u_0)(u_n - u_0) = 0,$$
i.e.,
$$F(u_n) \ge F(u_0) + \frac{c}{2}\|u_n - u_0\|^2 \tag{6.2.19}$$
holds for arbitrary $n \in \mathbb{N}$. On the other hand, due to Proposition 6.2.33, the elements $u_n$, $n \in \mathbb{N}$, constitute a minimizing sequence for $F$ on $H$, i.e.,
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in H} F(u) = F(u_0). \tag{6.2.20}$$
It follows from (6.2.19) and (6.2.20) that
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0$$
and the proof is complete.
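To see the Ritz approximations $u_n$ in a case where everything can be computed exactly, consider the following sketch. It rests on simplifying assumptions that are not part of the proposition: the functional is the quadratic $F(x) = \frac12\int_0^1 |\dot x|^2\,dt - \int_0^1 f x\,dt$ on $W_0^{1,2}(0,1)$ with the hypothetical right-hand side $f(t) = t$, and the subspaces $H_n$ are spanned by the polynomials $e_i(t) = t^i(1 - t)$, $i = 1, \dots, n$. For a quadratic functional the minimization over $H_n$ reduces to a linear system $Ac = b$ with $A_{ij} = \int_0^1 \dot e_i \dot e_j\,dt$ and $b_i = \int_0^1 f e_i\,dt$, both of which have closed forms here. The names `stiffness`, `load`, `solve` and `ritz_coefficients` are introduced for this illustration.

```python
from fractions import Fraction

def stiffness(i, j):
    # Exact value of the integral over [0, 1] of e_i'(t) e_j'(t),
    # where e_i(t) = t^i (1 - t) and i, j >= 1.
    return (Fraction(i * j, i + j - 1)
            - Fraction(i * (j + 1) + (i + 1) * j, i + j)
            + Fraction((i + 1) * (j + 1), i + j + 1))

def load(i):
    # Exact value of the integral over [0, 1] of f(t) e_i(t) for f(t) = t.
    return Fraction(1, i + 2) - Fraction(1, i + 3)

def solve(A, b):
    # Gauss-Jordan elimination in exact rational arithmetic.
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [entry / M[col][col] for entry in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def ritz_coefficients(n):
    # Coefficients c_1, ..., c_n of the minimizer of F over span{e_1, ..., e_n}.
    A = [[stiffness(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    b = [load(i) for i in range(1, n + 1)]
    return solve(A, b)

for n in (1, 2, 3):
    print(n, ritz_coefficients(n))
```

For this right-hand side the exact minimizer is $u_0(t) = (t - t^3)/6 = \frac16 e_1(t) + \frac16 e_2(t)$, so the approximation with $n = 1$ is only approximate while $u_n = u_0$ for every $n \ge 2$. In the nonlinear setting of the text the system is no longer linear and the approximations generally differ from $u_0$ for all $n$.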

So far, we have answered theoretically problems (a) and (b) formulated at the beginning of this appendix. However, from the point of view of practical (numerical) calculations the most interesting problems start right now. The most frequent and most important case arises in practice when the spaces Hn are of ﬁnite dimension, e.g., dim Hn = N . If e1 , . . . , eN is a basis of Hn and N Fn (c1 , . . . , cN ) F ci ei , i=1

392

Chapter 6. Variational Methods

then the problem (6.2.13) means to ﬁnd c˜ = (˜ c1 , . . . , c˜N ) ∈ RN such that Fn (˜ c1 , . . . , c˜N ) =

inf

(c1 ,...,cN )∈RN

Fn (c1 , . . . , cN ).

(6.2.21)

If the assumptions of Proposition 6.2.34 are satisfied, then the function $F_n$ is continuous, strictly convex on the space $\mathbb{R}^N$, satisfies

$$\lim_{\|c\| \to \infty} F_n(c) = \infty,$$

and then the vector $\tilde c$ is a solution of problem (6.2.21) if and only if all partial derivatives of the first order of the function $F_n$ vanish at $\tilde c$ (cf. Theorem 6.2.12). Thus the problem of finding a solution of problem (6.2.21) is equivalent to the problem of finding a solution of the system

$$\frac{\partial F_n}{\partial c_1}(c_1, \dots, c_N) = 0, \quad \dots, \quad \frac{\partial F_n}{\partial c_N}(c_1, \dots, c_N) = 0. \tag{6.2.22}$$

The system (6.2.22) is a system of $N$ algebraic equations which are generally nonlinear. However, note that if the functional $F$ is quadratic, then the system (6.2.22) is a system of linear algebraic equations.

Remark 6.2.35. We have not been concerned with the question which is fundamental from the practical point of view: "How do we solve system (6.2.22) numerically?" A vast literature dedicated to numerical methods deals with this problem. Just for an illustration we mention one minimization method. Choose arbitrarily a vector $c^0 = (c^0_1, \dots, c^0_N) \in \mathbb{R}^N$. Let us present an algorithm for the construction of a sequence $\{c^m\}_{m=1}^{\infty}$ which converges under appropriate assumptions on f to the solution of system (6.2.22). If we know the vector $c^m = (c^m_1, \dots, c^m_N) \in \mathbb{R}^N$, we calculate the components of the vector $c^{m+1} = (c^{m+1}_1, \dots, c^{m+1}_N) \in \mathbb{R}^N$ as follows: Let the function

$$\xi \mapsto F_n(c^{m+1}_1, \dots, c^{m+1}_{i-1}, \xi, c^m_{i+1}, \dots, c^m_N)$$

of the variable $\xi$ on $\mathbb{R}$ assume its minimum at the point $\tilde c^{m+1}_i$. Put, then,

$$c^{m+1}_i := c^m_i + \omega(\tilde c^{m+1}_i - c^m_i) \quad \text{where} \quad 0 < \omega \le 2.$$

Here $\omega$ is the so-called relaxation parameter. If we choose $\omega = 1$ and if $F$ is a quadratic functional, we obtain the so-called Gauss–Seidel Iterative Method (see, e.g., Stoer & Bulirsch [125]). Nowadays there are plenty of packages available in Mathematica, Maple, Matlab, etc., offering different solvers of system (6.2.22).

From the practical point of view it is important that the system (6.2.22) be as simple as possible. The form of the system (6.2.22) depends in an essential way on the actual choice of the subspaces $H_n$. One special choice depends on the notion and the properties of the Schauder basis.
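The relaxation scheme of Remark 6.2.35 can be sketched for the quadratic case. The following is a minimal illustration, not taken from the text: it assumes $F_n(c) = \frac12 c^T A c - b^T c$ with a symmetric positive definite matrix $A$, so that the one-dimensional minimizer $\tilde c_i^{m+1}$ has a closed form. The function name `relaxation_minimize` and the sample data `A`, `b` are hypothetical.

```python
import numpy as np

def relaxation_minimize(A, b, omega=1.0, sweeps=200):
    """Coordinate-wise relaxation for F_n(c) = 0.5*c^T A c - b^T c (A assumed SPD)."""
    n = len(b)
    c = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            # Exact minimizer of xi -> F_n(c_1, ..., c_{i-1}, xi, c_{i+1}, ..., c_n);
            # entries with index < i already hold the updated values.
            off_diagonal = A[i] @ c - A[i, i] * c[i]
            c_tilde = (b[i] - off_diagonal) / A[i, i]
            # Relaxation step: c_i <- c_i + omega * (c_tilde - c_i).
            c[i] += omega * (c_tilde - c[i])
    return c

# Hypothetical sample data: a small SPD system, so (6.2.22) reads A c = b.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
c_gauss_seidel = relaxation_minimize(A, b, omega=1.0)   # omega = 1: Gauss-Seidel
c_relaxed = relaxation_minimize(A, b, omega=1.2)
```

With `omega=1.0` each sweep is one Gauss–Seidel iteration for the linear system $Ac = b$; for a non-quadratic $F_n$ the closed-form `c_tilde` would have to be replaced by a one-dimensional minimization.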


Let $\{e_i\}_{i=1}^{\infty}$ be a Schauder basis (see Section 1.2) of a Hilbert space $H$ (not necessarily orthonormal) and define the subspace $H_n$ as the set of all elements $u \in H$ which are of the form $u = c_1 e_1 + \dots + c_n e_n$. It follows from the definition of the Schauder basis that $\{H_n\}_{n=1}^{\infty}$ satisfies condition (6.2.15).

Example 6.2.36. Let $H := W_0^{1,2}(0, 1)$, $f \in L^1(0, 1)$ and

$$F(x) := \frac{1}{2} \int_0^1 |\dot x(t)|^2 \, dt + \frac{1}{4} \int_0^1 |x(t)|^4 \, dt - \int_0^1 f(t) x(t) \, dt, \quad x \in H. \tag{6.2.23}$$

Then $F$ is the energy functional associated with the Dirichlet problem

$$-\ddot x(t) + x^3(t) = f(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0 \tag{6.2.24}$$

(cf. Example 6.2.6). We have

$$DF(x)(y) = \int_0^1 \dot x(t) \dot y(t) \, dt + \int_0^1 x^3(t) y(t) \, dt - \int_0^1 f(t) y(t) \, dt,$$

$$D^2F(x)(y, y) = \int_0^1 |\dot y(t)|^2 \, dt + 3 \int_0^1 |x(t)|^2 |y(t)|^2 \, dt,$$

and the assumptions of Proposition 6.2.34 are satisfied.$^{26}$ The sequence of functions $e_i$, $i = 1, 2, \dots$, which are defined by $e_i(t) := t^i(1 - t)$, constitutes a Schauder basis of the space $H$ (see, e.g., Michlin [94]). Thus, if we construct the subspaces $H_n$ as above, the condition (6.2.15) will be satisfied. If we rewrite the system (6.2.22) for this particular case, we obtain the system of nonlinear equations for unknowns $c_1, \dots, c_n$,

$$\sum_{k=1}^{n} c_k \int_0^1 \dot e_k(t) \dot e_j(t) \, dt + \int_0^1 \Big( \sum_{k=1}^{n} c_k e_k(t) \Big)^3 e_j(t) \, dt = \int_0^1 f(t) e_j(t) \, dt, \tag{6.2.25}$$

$j = 1, \dots, n$. In each of the equations of system (6.2.25), all unknowns $c_1, \dots, c_n$ appear. This fact is rather unpleasant from the computational point of view! The question then arises whether it is possible to choose the spaces $H_n$ so that each of the equations of the system (6.2.22) depends on only a "small number" of unknowns. This is one of the fundamental questions of numerical mathematics. Such a choice of $H_n$ is possible, there are different ways to do it and each of them leads to a particular numerical method.

26 Note that we consider $\|x\| = \big( \int_0^1 |\dot x(t)|^2 \, dt \big)^{1/2}$ as the norm on $H$.

Below we indicate one possible choice of $H_n$ which is different from the previous one and which meets the above mentioned requirements. Let $n \in \mathbb{N}$, and put $t_i = \frac{i}{n}$ for $i = 0, 1, \dots, n$ and $I_j = [t_j, t_{j+1}]$ for $j = 0, 1, \dots, n - 1$. We define the spaces $H_n$ as follows: $H_n$ is the set of functions $x = x(t)$ continuous on the interval $[0, 1]$ which are linear on every interval $[t_i, t_{i+1}]$ and for which $x(0) = x(1) = 0$. Let $e_i \in H_n$, $i = 1, \dots, n - 1$, be functions such that

$$e_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \qquad j = 0, \dots, n.$$

It is easily established that the set $\{e_i\}_{i=1}^{n-1}$ constitutes a basis of the space $H_n$ and that for all $y \in H_n$ we have

$$y(t) = \sum_{j=1}^{n-1} y(t_j) e_j(t), \quad t \in [0, 1].$$

The system (6.2.22) constructed for this basis will now be itself a system for the unknown values $x_n(t_j)$ of the solution of problem (6.2.13). The crucial point in this construction is the fact that the functions $e_i(t)$ vanish outside the interval $I_{i-1} \cup I_i = \big[ \frac{i-1}{n}, \frac{i+1}{n} \big]$. We then have

$$e_i(t) e_j(t) = \dot e_i(t) \dot e_j(t) = 0 \quad \text{for } i, j = 1, \dots, n - 1, \ |i - j| > 1,$$

at every point $t \in [0, 1]$ (with the obvious exception, for derivatives, of the points $t_1, \dots, t_{n-1}$, which constitute a set of measure zero). Therefore, in each of the equations

$$\sum_{i=1}^{n-1} c_i \int_0^1 \dot e_i(t) \dot e_j(t) \, dt + \int_0^1 \Big( \sum_{i=1}^{n-1} c_i e_i(t) \Big)^3 e_j(t) \, dt = \int_0^1 f(t) e_j(t) \, dt, \tag{6.2.26}$$

$j = 1, \dots, n - 1$, of system (6.2.22) only the unknowns $c_{j-1}$, $c_{j+1}$ appear apart from $c_j$. If we compute a solution $c_1, \dots, c_{n-1}$ from these equations, and if we put

$$u_n(t) = c_1 e_1(t) + \dots + c_{n-1} e_{n-1}(t), \quad t \in [0, 1],$$

we obtain a solution of problem (6.2.13). Now, we wish to know whether

$$\lim_{n \to \infty} \|u_n - u\| = 0.$$

By Proposition 6.2.34, it suffices to show that the spaces $H_n$ satisfy condition (6.2.15). Let $y \in H$ and $\varepsilon > 0$. We shall show that there exist $n \in \mathbb{N}$ and $y_n \in H_n$ such that

$$\|y - y_n\| < \varepsilon. \tag{6.2.27}$$

Indeed, the set $\mathcal{D}(0, 1)$ is dense in $H$ (see Exercise 1.2.46). Hence there exists $w \in \mathcal{D}(0, 1)$ such that

$$\|y - w\| < \frac{\varepsilon}{2}. \tag{6.2.28}$$


Let $n \in \mathbb{N}$ be arbitrary, and let us construct a function $y_n \in H_n$ such that

$$y_n(t_i) = w(t_i) \quad \text{for all } i = 0, \dots, n.$$

Then we have (due to the Mean Value Theorem):

$$\|w - y_n\|^2 = \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} |\dot w(t) - \dot y_n(t)|^2 \, dt \le \sum_{i=0}^{n-1} \max_{t \in [0,1]} |\ddot w(t)|^2 \, \frac{1}{n^2} (t_{i+1} - t_i) = \frac{1}{n^2} \max_{t \in [0,1]} |\ddot w(t)|^2.$$

This implies that for sufficiently large $n \in \mathbb{N}$ we have

$$\|w - y_n\| < \frac{\varepsilon}{2}. \tag{6.2.29}$$

The desired inequality (6.2.27) now follows from (6.2.28) and (6.2.29).
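The piecewise linear construction of Example 6.2.36 can be tried out numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the stiffness integrals $\int_0^1 \dot e_i \dot e_j \, dt$ are exact for the hat functions, but the cubic term and the right-hand side of (6.2.26) are approximated by nodal quadrature (mass lumping), and the resulting tridiagonal nonlinear system is solved by Newton's method. The function name `ritz_hat_functions` and the manufactured solution $x(t) = \sin(\pi t)$ are hypothetical choices.

```python
import numpy as np

def ritz_hat_functions(f, n, tol=1e-12, max_iter=50):
    """Approximate -x'' + x^3 = f, x(0) = x(1) = 0, with hat functions on n intervals."""
    h = 1.0 / n
    t = np.linspace(0.0, 1.0, n + 1)[1:-1]        # interior nodes t_1, ..., t_{n-1}
    m = n - 1
    # Stiffness matrix: integrals of e_i' e_j' are 2/h on the diagonal, -1/h next to it.
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    b = h * f(t)                                  # lumped load: int f e_j ~ h f(t_j)
    c = np.zeros(m)
    for _ in range(max_iter):
        residual = K @ c + h * c**3 - b           # lumped cubic: int u^3 e_j ~ h c_j^3
        if np.max(np.abs(residual)) < tol:
            break
        jacobian = K + np.diag(3.0 * h * c**2)    # tridiagonal plus diagonal
        c = c - np.linalg.solve(jacobian, residual)
    return t, c

# Manufactured example: x(t) = sin(pi t) solves the problem for this right-hand side.
f_rhs = lambda t: np.pi**2 * np.sin(np.pi * t) + np.sin(np.pi * t) ** 3
nodes, coefficients = ritz_hat_functions(f_rhs, 64)
max_error = np.max(np.abs(coefficients - np.sin(np.pi * nodes)))
```

Each row of `K` couples only $c_{j-1}$, $c_j$, $c_{j+1}$, which is the sparsity pattern the example points out; a dense solver is used here only to keep the sketch short.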

Remark 6.2.37.

(i) Let us point out that to get system (6.2.26) it was not essential that an equidistant division of the interval $[0, 1]$ has been selected. Nonetheless, the norm of the division (i.e., the maximal distance between two consecutive points) must approach zero.

(ii) The spaces $H_n$ are the simplest which could be chosen for the given example. It is also possible to choose spaces of $C^1$-functions which are polynomials of higher degree on every interval $I_i$. For instance, one can choose

$$H_n = \{ y \in C^1[0, 1] : y(0) = y(1) = 0, \ y|_{I_i} \text{ is a polynomial of the third degree for all } i = 0, \dots, n - 1 \}.^{27}$$

There exists a basis of this space whose dimension is $2n$ which consists of the functions $e_1, \dots, e_{n-1}, \psi_0, \dots, \psi_n$ such that

$$e_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \qquad \dot e_i(t_j) = 0, \qquad i = 1, \dots, n - 1, \ j = 0, \dots, n;$$

$$\psi_i(t_j) = 0, \qquad \dot\psi_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \qquad i, j = 0, \dots, n,$$

see Figure 6.2.5. Every function $y \in H_n$ can be written in the form

$$y(t) = \sum_{j=1}^{n-1} y(t_j) e_j(t) + \sum_{j=0}^{n} \dot y(t_j) \psi_j(t), \quad t \in [0, 1].$$

Figure 6.2.5. (Graphs of the basis functions $e_i$ and $\psi_i$ over the nodes $0 = t_0, \dots, t_{i-1}, t_i, t_{i+1}, \dots, t_n = 1$ on the $t$-axis.)

(iii) From the computational point of view the question of how rapidly the solutions $u_n$ of problem (6.2.13) converge to a solution of problem (6.2.12) is very important. This question is closely related to the regularity of solutions of equations. If, e.g., $f \in C^0[0, 1]$, then $u_0 \in C^2[0, 1]$ (cf. Proposition 6.1.11 and Theorem 6.1.13) and using this it can be proved that there exists a constant $c > 0$ such that for all $n \in \mathbb{N}$ we have

$$\|u_0 - u_n\| \le \frac{c}{n}.$$

If, e.g., $u_0 \in C^4[0, 1]$, then we even have

$$\|u_0 - u_n\| \le \frac{c}{n^3}.$$

27 These functions are called cubic splines (see, e.g., de Boor [32]).

Remark 6.2.38 (Finite Elements Method). Similarly to Example 6.2.36 we could proceed even in the case $H := W_0^{k,2}(\Omega)$, $\Omega \subset \mathbb{R}^N$, $\Omega \in C^{0,1}$. The situation then corresponds to the boundary value problem for partial differential equations (see Chapter 7 for more details). Suppose that we can divide the set $\Omega$ into a finite number of open subsets $\Omega_i$, $i = 1, \dots, k$, such that their diameter

$$\operatorname{diam} \Omega_i = \sup_{x, y \in \Omega_i} \|x - y\| < \frac{1}{n}$$

and such that

$$\overline{\Omega} = \bigcup_{i=1}^{k} \overline{\Omega_i}, \qquad \Omega_i \cap \Omega_j = \emptyset \ \text{ for } i \ne j.$$

Each of the sets $\Omega_i$ is called a finite element. The space $H_n$ will consist of functions whose restrictions to $\Omega_i$ are smooth functions, for instance polynomials in $N$ variables, and satisfy certain conditions on the common boundary of the sets $\Omega_i$ and $\Omega_j$ ($i \ne j$). For simplicity and greater intuitive appeal we will consider $\Omega$ to be a polygon in $\mathbb{R}^2$ and for every $n \in \mathbb{N}$ we perform a triangulation $\mathcal{T}_n$ of the set $\Omega$, i.e., we put

$$\overline{\Omega} = \bigcup_{i=1}^{k} \overline{K_i} \quad \text{where } K_i \text{ are open triangles such that} \quad \operatorname{diam} K_i \le \frac{1}{n}, \ i = 1, \dots, k,$$

see Figure 6.2.6. Assume that precisely one of the following situations arises for the mutual position of triangles $K_i, K_j \in \mathcal{T}_n$ ($i \ne j$):

(a) the closures of two distinct triangles have no common point;
(b) the closures of two distinct triangles have only one vertex in common;
(c) the closures of two distinct triangles have an entire side in common.

The spaces $H_n$ will be sets of continuous functions whose restrictions to $K_i$ are polynomials of the $k$th order. Below, we give examples of the spaces $H_n$ for the case $k = 1$ and $k = 3$. The continuity of a function $v \in H_n$ is ensured on the set $\Omega$ by choosing the values of parameters (used for the construction of the function) to be equal at the common vertices. The reader will find more details in specialized literature on the Finite Elements Method (see, e.g., Brenner & Scott [16], Křížek & Neittaanmäki [81], Rektorys [107]).

Figure 6.2.6. (A triangulation of the polygon $\Omega$ into triangles $K_i$.)

Example 6.2.39 ($k = 1$). Let $\Omega$ be a polygon in $\mathbb{R}^2$. Let $K$ be an open triangle with vertices $Q_1, Q_2, Q_3$. Let $P_1(K)$ be the set of all polynomials of the first degree defined on $K$, i.e., $P \in P_1(K)$ if

$$P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 y, \quad (x, y) \in K.$$

It is easily shown that any function $P(x, y) \in P_1(K)$ is uniquely determined by its values at the vertices $Q_1, Q_2, Q_3$. The values $P(Q_1), P(Q_2), P(Q_3)$ serve as parameters by means of which the function $P(x, y)$ is constructed. The function $P \in P_1(K)$ for which

$$P(Q_i) = v(Q_i), \quad i = 1, 2, 3,$$

is called the Lagrange interpolation of the function $v \in C(K)$. The function $P(x, y)$ constructed in this way is denoted by $\Pi_K v$. Clearly, $\Pi_K$ is a linear operator from the space $C(K)$ into $P_1(K)$ and

$$\|v - \Pi_K v\|_{W^{1,2}(K)} \le c h_K \|v\|_{W^{2,2}(K)} \tag{6.2.30}$$

holds for arbitrary functions $v \in W^{2,2}(K)$ (here $h_K = \operatorname{diam} K$ and $c > 0$ is a constant independent of $v$ and $h_K$).$^{28}$ Define the space $H_n$ as follows:

$$H_n := \{ v \in C(\Omega) : v|_{K_i} \in P_1(K_i) \text{ for all } K_i \in \mathcal{T}_n \}.$$

Obviously, $H_n \subset H := W^{1,2}(\Omega)$. Let $v \in W^{2,2}(\Omega)$. Construct a function $v_n \in H_n$ in the following way:

$$v_n|_{K_i} = \Pi_{K_i} v.$$

Applying inequality (6.2.30), we obtain

$$\|v - v_n\| \le \frac{c}{n} \|v\|_{W^{2,2}(\Omega)}.$$

Thus, the function $v_n$ is arbitrarily close to the function $v$ provided $n$ is a sufficiently large nonnegative integer. Hence, making use of the fact that the space $W^{2,2}(\Omega)$ is dense in the space $H$ (explain why!), we conclude that the spaces $H_n$, $n \in \mathbb{N}$, satisfy condition (6.2.15). We can construct the basis functions $e_1, \dots, e_k$ of $H_n$ just as in Example 6.2.36. If $\{Q_i\}_{i=1}^{m}$ are all vertices of all triangles of the triangulation $\mathcal{T}_n$, then

$$e_i(Q_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \qquad j = 1, \dots, m.$$

28 The reader is invited to prove it in detail!

Example 6.2.40 ($k = 3$). Let $K$ be an open triangle with vertices $Q_1, Q_2, Q_3$ and with the center of gravity $Q_0$. Let $P_3(K)$ be the set of polynomials of the third degree defined on $K$, i.e., $P \in P_3(K)$ if

$$P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 + \alpha_4 xy + \alpha_5 xy^2 + \alpha_6 x^2 y + \alpha_7 y + \alpha_8 y^2 + \alpha_9 y^3, \quad (x, y) \in K.$$

A function $P(x, y) \in P_3(K)$ is uniquely determined by its values at the vertices and at the center of gravity and by the values of the first partial derivatives at the vertices of the triangle $K$. A function $\Pi_K v \in P_3(K)$ for which

$$\Pi_K v(Q_i) = v(Q_i), \quad i = 0, 1, 2, 3;$$

$$\frac{\partial \Pi_K v(Q_i)}{\partial x} = \frac{\partial v(Q_i)}{\partial x}, \qquad \frac{\partial \Pi_K v(Q_i)}{\partial y} = \frac{\partial v(Q_i)}{\partial y}, \quad i = 1, 2, 3,$$

is called the Hermite interpolation of the function $v \in C^1(K)$. Just as in the preceding example, the inequality

$$\|v - \Pi_K v\|_{W^{3,2}(K)} \le c h_K \|v\|_{W^{4,2}(K)} \quad \text{holds for all } v \in W^{4,2}(K).$$

If we put

$$H_n := \{ v \in C^1(K) : v|_{K_i} \in P_3(K_i) \text{ for every triangle } K_i \in \mathcal{T}_n \},$$

then $H_n \subset H := W^{3,2}(\Omega)$ and the spaces $H_n$, $n \in \mathbb{N}$, again satisfy condition (6.2.15) since the set $W^{4,2}(\Omega)$ is dense in the space $H$.

Exercise 6.2.41. Apply the spaces $H_n$ described in Remark 6.2.37(ii) to Example 6.2.36.
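The Lagrange interpolation $\Pi_K v$ of Example 6.2.39 amounts to solving a $3 \times 3$ linear system for $\alpha_0, \alpha_1, \alpha_2$. The sketch below illustrates this; the function name `p1_interpolant` and the sample triangle are hypothetical, and the triangle is assumed to be non-degenerate so that the system is uniquely solvable.

```python
import numpy as np

def p1_interpolant(vertices, values):
    """Return P(x, y) = a0 + a1*x + a2*y with P(Q_i) = values[i] at the three vertices."""
    Q = np.asarray(vertices, dtype=float)              # shape (3, 2), rows Q_1, Q_2, Q_3
    V = np.column_stack([np.ones(3), Q])               # rows (1, x_i, y_i)
    alpha = np.linalg.solve(V, np.asarray(values, dtype=float))
    return lambda x, y: alpha[0] + alpha[1] * x + alpha[2] * y

# Hypothetical sample triangle and function v(x, y) = x^2 + y.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
v = lambda x, y: x**2 + y
P = p1_interpolant(triangle, [v(*q) for q in triangle])
values_at_vertices = [P(*q) for q in triangle]
```

Since $P$ is determined by its vertex values, two neighbouring triangles that share a side produce interpolants that agree along that side, which is how continuity of functions in $H_n$ is obtained.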

6.2B Supersolutions, Subsolutions and Global Extrema

In this appendix we show the connection between the supersolutions and subsolutions (see Section 5.4) on the one hand and the existence of global minima (see Section 6.2) on the other. We will illustrate it on the Dirichlet boundary value problem

$$\ddot x(t) = f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{6.2.31}$$

where $f$ is a continuous function on $[0, 1] \times \mathbb{R}$ (cf. Example 5.4.19). Put $H := W_0^{1,2}(0, 1)$. The functional

$$\psi(x) := \int_0^1 \int_0^{x(t)} f(t, s) \, ds \, dt$$

defined on $H$ is of the class $C^1(H, \mathbb{R})$ and

$$\psi'(x)(h) = \int_0^1 f(t, x(t)) h(t) \, dt, \quad x, h \in H.^{29} \tag{6.2.32}$$

Then

$$F(x) = \int_0^1 \Big[ \frac{1}{2} |\dot x(t)|^2 + \int_0^{x(t)} f(t, s) \, ds \Big] dt$$

is of the class $C^1(H, \mathbb{R})$ and its critical points correspond to weak solutions of (6.2.31). A regularity argument applied to (6.2.31) (similar to that from Theorem 6.1.13) implies that every weak solution is a classical solution in the sense that $x \in C_0^2[0, 1] := \{x \in C^2[0, 1] : x(0) = x(1) = 0\}$ and the equation in (6.2.31) holds at every point $t \in (0, 1)$.

The link between the method of supersolutions and subsolutions on the one side and the method of finding the global minimizer on the other side is that the existence of a well-ordered pair of a subsolution and supersolution $u_0$ and $v_0$, respectively, implies that the functional $F$ has a minimum on the convex but noncompact set

$$M = \{x \in H : u_0(t) \le x(t) \le v_0(t) \text{ for all } t \in [0, 1]\}.$$

This minimum then solves (6.2.31). Namely, we have the following assertion.

Theorem 6.2.42. Let $u_0$ and $v_0$ be a subsolution and supersolution of (6.2.31) such that $u_0(t) \le v_0(t)$, $t \in [0, 1]$, $E := \{(t, x) \in [0, 1] \times \mathbb{R} : u_0(t) \le x \le v_0(t)\}$, and let $f : E \to \mathbb{R}$ be a continuous function. Then the functional $F$ has a global minimum on $M$, i.e., there exists $x_0 \in M$ such that

$$F(x_0) = \min_{\substack{x \in H \\ u_0 \le x \le v_0}} F(x).$$

Moreover, $x_0$ is a solution of (6.2.31).

Proof. Let $\gamma(t, x) := \max\{u_0(t), \min\{x, v_0(t)\}\}$ and consider the modified problem

$$\ddot x(t) = f(t, \gamma(t, x(t))), \quad t \in (0, 1), \qquad x(0) = x(1) = 0. \tag{6.2.33}$$

Define the energy functional associated with this modified problem by

$$\tilde F(x) = \int_0^1 \Big[ \frac{1}{2} |\dot x(t)|^2 + \int_0^{x(t)} f(t, \gamma(t, s)) \, ds \Big] dt.$$

29 Cf. Section 3.2 in order to prove these facts.

Then $\tilde F \in C^1(H, \mathbb{R})$ and its critical points correspond to the solutions of (6.2.33). It is easy to prove (the reader should do it as an exercise) that $\tilde F$ is weakly sequentially lower semicontinuous and weakly coercive. It then follows from Theorem 6.2.8 that $\tilde F$ has a global minimum on $H$ at $x_0 \in H$, $\tilde F'(x_0) = o$. This $x_0$ is a weak solution of (6.2.33) and it is regular, i.e., $x_0 \in C^2[0, 1]$, by Theorem 6.1.14. We shall show that $u_0(t) \le x_0(t) \le v_0(t)$. Indeed, assume by contradiction that

$$\min_{t \in [0,1]} (x_0(t) - u_0(t)) < 0$$

and define

$$t_0 := \max \Big\{ t \in [0, 1] : x_0(t) - u_0(t) = \min_{s \in [0,1]} (x_0(s) - u_0(s)) \Big\}.$$

From the definition of a subsolution $u_0$ and of $\gamma$ we obtain that $t_0 < 1$, and for $t \ge t_0$, $t$ close to $t_0$, we have

$$\dot x_0(t) - \dot u_0(t) = \int_{t_0}^{t} [\ddot x_0(s) - \ddot u_0(s)] \, ds = \int_{t_0}^{t} [f(s, u_0(s)) - \ddot u_0(s)] \, ds \le 0.$$

This contradicts the definition of $t_0$. Hence $x_0(t) \ge u_0(t)$, $t \in [0, 1]$. Similarly we prove $x_0(t) \le v_0(t)$, $t \in [0, 1]$. Notice that if $x$ is such that $u_0(t) \le x(t) \le v_0(t)$, then $\gamma(t, x(t)) = x(t)$, i.e., $x_0$ is a minimizer for $F$ on $M$ and $F'(x_0) = o$.

Example 6.2.43. Consider the problem

$$\ddot x(t) = \lambda f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{6.2.34}$$

where $f$ is continuous on $[0, 1] \times \mathbb{R}$, $f(t, 0) = 0$, $f(t, R) \ge 0$ for an $R > 0$ and there exists $w \in H := W_0^{1,2}(0, 1)$, $0 \le w(t) \le R$, $t \in [0, 1]$, such that

$$\int_0^1 \int_0^{w(t)} f(t, s) \, ds \, dt < 0.$$

Then there exists $\Lambda \ge 0$ such that for all $\lambda \ge \Lambda$, (6.2.34) has, besides the trivial solution, at least one nontrivial nonnegative solution. Indeed, $u_0 \equiv 0$ is a subsolution and $v_0 \equiv R$ is a supersolution, and according to Theorem 6.2.42 there exists $x_0 \in M := \{x \in H : 0 \le x(t) \le R\}$ which solves (6.2.34) and minimizes the energy functional $F$ on $M$. Moreover, taking $\lambda$ large enough, we have

$$F(w) = \int_0^1 \Big[ \frac{1}{2} |\dot w(t)|^2 + \lambda \int_0^{w(t)} f(t, s) \, ds \Big] dt < 0,$$

and so

$$F(x_0) = \min_{x \in M} F(x) \le F(w) < 0 = F(o).$$
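The size of $\lambda$ needed in Example 6.2.43 can be made concrete for one particular nonlinearity. The choices below are assumptions made only for illustration and do not come from the text: $f(t, s) = s^3 - s$, $R = 1$ and $w(t) = \sin(\pi t)$, for which $\int_0^{w} f(t, s) \, ds = w^4/4 - w^2/2$. The function name `energy_of_w` is hypothetical, and the integral over $(0, 1)$ is approximated by the midpoint rule.

```python
import numpy as np

def energy_of_w(lam, n=2000):
    """F(w) for w(t) = sin(pi t) and the sample nonlinearity f(t, s) = s^3 - s."""
    h = 1.0 / n
    t = (np.arange(n) + 0.5) * h                   # midpoints of the n subintervals
    w = np.sin(np.pi * t)
    w_dot = np.pi * np.cos(np.pi * t)
    primitive = w**4 / 4.0 - w**2 / 2.0            # int_0^{w(t)} (s^3 - s) ds
    return h * np.sum(0.5 * w_dot**2 + lam * primitive)

energy_small_lambda = energy_of_w(10.0)   # kinetic term dominates
energy_large_lambda = energy_of_w(20.0)   # potential term dominates
```

For this choice $F(w) = \pi^2/4 - 0.15625\,\lambda$, so $F(w) < 0$ as soon as $\lambda$ exceeds roughly $15.8$; for such $\lambda$ the minimizer $x_0$ on $M$ cannot be the trivial solution.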


Remark 6.2.44. The same results as in Theorem 6.2.42 and Example 6.2.43 hold if the continuity of $f$ is relaxed to $f \in CAR([0, 1] \times \mathbb{R})$ and for all $r > 0$ there exists $h \in L^1(0, 1)$ such that for a.e. $t \in (0, 1)$ and all $s \in \mathbb{R}$, $|s| \le r$, we have $|f(t, s)| \le h(t)$. The reader is invited to verify all the previous steps as an exercise. The reader who wants to learn more is referred to De Coster & Habets [33] where also the relation between non-well-ordered supersolutions and subsolutions on the one hand and the minimax method on the other is discussed.

Exercise 6.2.45. How does the proof of Theorem 6.2.42 change if the homogeneous Dirichlet boundary conditions in (6.2.31) are replaced by the Neumann ones?

Exercise 6.2.46. Consider the problem

$$\big( |\dot x(t)|^{p-2} \dot x(t) \big)^{\cdot} = f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{6.2.35}$$

where $p > 1$ and

$$F(x) = \int_0^1 \Big[ \frac{1}{p} |\dot x(t)|^p + \int_0^{x(t)} f(t, s) \, ds \Big] dt.$$

Prove the analogue of Theorem 6.2.42 for (6.2.35).

Exercise 6.2.47. Find conditions on a continuous function $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ which guarantee that the problem (6.2.35) has a subsolution $u_0$ and a supersolution $v_0$ satisfying

$$u_0(t) \le v_0(t) \quad \text{for all } t \in [0, 1].$$

Hint. Look for $u_0$ and $v_0$ constant on $[0, 1]$.

6.3 Relative Extrema and Lagrange Multipliers

In this section we will investigate the local minima or maxima of a real function $f$ on a smooth manifold $M$ (in particular, on a surface in $\mathbb{R}^3$). Such a manifold is often determined by various constraints which are given by certain equations like $\Phi(x) = o$ (cf. Remark 4.3.9). The key assertions of this section are the Lagrange Multiplier Method, the Courant–Fischer and Courant–Weinstein Variational Principles.

Definition 6.3.1. Let $X$ be a metric (or, more generally, topological) space, $M \subset X$. We say that a function $f : M \to \mathbb{R}$ has a local minimum (maximum) at a point $a \in M$ with respect to $M$ (or a constrained minimum on $M$) if there is a neighborhood $\mathcal{U}$ of $a$ such that

$$f(x) \ge f(a) \quad (f(x) \le f(a)) \quad \text{for all } x \in M \cap \mathcal{U}.$$

We will suppose that $M$ is given as the zero set of a map $\Phi : X \to Y$, i.e., $M = \{x \in X : \Phi(x) = o\}$. The way of investigating the behavior of $f$ in a relative neighborhood $\mathcal{U} \cap M$ of a point $a \in M$ is simple and transparent. It consists in expressing $M \cap \mathcal{U}$ as the graph of a map $\varphi : Z \to X$ and subsequently studying $f \circ \varphi$. This is always possible if $M$ is a differentiable manifold in $X = \mathbb{R}^N$ (Definition 4.3.4) or if $M$ is given by $\Phi$ as above and $\Phi$ satisfies certain regularity conditions (Proposition 4.3.8 and Remark 4.3.9(i)).

Theorem 6.3.2 (Lagrange Multiplier Method). Let $X$ be a Banach space, $f : X \to \mathbb{R}$, $\Phi = (\Phi_1, \dots, \Phi_N) : X \to \mathbb{R}^N$. Let $f$ have a local minimum or maximum with respect to $M = \{x \in X : \Phi(x) = o\}$ at a point $a \in M$. Let there be a neighborhood $\mathcal{U}$ of $a$ in $X$ such that $f, \Phi \in C^1(\mathcal{U})$ and let $a$ be a regular point of $\Phi$ (i.e., $\Phi'(a)$ is a surjective map onto $\mathbb{R}^N$). Then there exist numbers $\lambda_1, \dots, \lambda_N$$^{30}$ such that

$$\Big( f - \sum_{i=1}^{N} \lambda_i \Phi_i \Big)'(a) = o. \tag{6.3.1}$$

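Proof. Proposition 4.3.8 and Remark 4.3.9(i) yield a diffeomorphism $\varphi$ of a neighborhood $\mathcal{U}$ of $o \in X$ onto a neighborhood $\mathcal{V}$ of $a$ such that

$$\varphi(\mathcal{U} \cap \operatorname{Ker} \Phi'(a)) = M \cap \mathcal{V}, \qquad \varphi(o) = a.$$

If $\varphi_1$ denotes the restriction of $\varphi$ to $\mathcal{U} \cap \operatorname{Ker} \Phi'(a)$, then $f \circ \varphi_1$ has a local minimum (or maximum) at $o$ and therefore $(f \circ \varphi_1)'(o) = o$. Since $\varphi_1'(o) h = h$ for any $h \in \operatorname{Ker} \Phi'(a)$ (see the proof of Proposition 4.3.8), it follows that $\operatorname{Ker} \Phi'(a) \subset \operatorname{Ker} f'(a)$. The use of Proposition 1.1.19 completes the proof.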

Remark 6.3.3.

(i) The main significance of Theorem 6.3.2 consists in reducing a (difficult) problem of finding the constrained extremal points to an easier task of finding the local ones for a function

$$f - \sum_{i=1}^{N} \lambda_i \Phi_i$$

with unknown coefficients $\lambda_1, \dots, \lambda_N$ (they have to be determined in the course of calculation, see Example 6.3.4).

(ii) For an infinite number of constraints (i.e., $\Phi : X \to Y$, $Y$ is a Banach space of infinite dimension) the proof of Theorem 6.3.2 still holds provided there exists a continuous projection of $X$ onto $\operatorname{Ker} \Phi'(a)$. It is interesting that the statement (now $(f - F \circ \Phi)'(a) = 0$ for a certain $F \in Y^*$) is true without the assumption on existence of a projection (the so-called Lusternik Theorem), but the proof is more difficult (see Lusternik & Sobolev [90]).

30 The numbers $\lambda_1, \dots, \lambda_N$ are called Lagrange multipliers.

Example 6.3.4. Find the minimal and maximal values of

$$f(x, y, z) = x^2 y + x y^2 + z^2$$

on the set $M = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$.
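Before working through the necessary conditions, a brute-force evaluation on a grid can indicate what values to expect. The sketch below is only a numerical sanity check and not part of the Lagrange multiplier argument; the spherical parametrization and the grid sizes are arbitrary choices made for illustration.

```python
import numpy as np

# Parametrize the unit sphere M by spherical angles and sample f on a grid.
theta = np.linspace(0.0, np.pi, 401)            # polar angle
phi = np.linspace(0.0, 2.0 * np.pi, 801)        # azimuth
T, P = np.meshgrid(theta, phi, indexing="ij")
x = np.sin(T) * np.cos(P)
y = np.sin(T) * np.sin(P)
z = np.cos(T)

values = x**2 * y + x * y**2 + z**2             # f(x, y, z) restricted to M
f_max, f_min = values.max(), values.min()
```

On this grid the largest sampled value is attained at the poles $(0, 0, \pm 1)$ and the smallest one on the equator $z = 0$, which can be compared with the extremal values obtained analytically in the example.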

Notice first that all points of $M$ are regular. The necessary condition given by Theorem 6.3.2 for extremal points requires solving the following four equations:

$$2xy + y^2 - 2\lambda x = 0, \tag{6.3.2}$$

$$x^2 + 2xy - 2\lambda y = 0, \tag{6.3.3}$$

$$2z - 2\lambda z = 0, \tag{6.3.4}$$

$$x^2 + y^2 + z^2 = 1. \tag{6.3.5}$$

We have either $z = 0$ or $\lambda = 1$ from the third equation. Adding $x^2$ and $y^2$ to (6.3.2) and (6.3.3) we obtain

$$x^2 + 2\lambda x = y^2 + 2\lambda y, \quad \text{i.e.,} \quad (x - y)(x + y + 2\lambda) = 0.$$

Case 1 ($z = 0$). If $x = y$, then

$$x = y = \pm \frac{\sqrt{2}}{2} \quad \text{and} \quad f\Big( \pm \frac{\sqrt{2}}{2}, \pm \frac{\sqrt{2}}{2}, 0 \Big) = \pm \frac{\sqrt{2}}{2}.$$

If $x + y = -2\lambda$, then (6.3.2) and (6.3.5) imply $xy = -\frac{1}{3}$ and from equation (6.3.5) we find

$$x + y = \pm \frac{\sqrt{3}}{3} \quad \text{and hence} \quad f(x, y, 0) = xy(x + y) = \mp \frac{\sqrt{3}}{9}.$$

Case 2 ($\lambda = 1$). Again we have either $x = y$ or $x + y = -2$. Putting $x = y$ into (6.3.2) we find

$$x = y = 0, \ z = \pm 1 \quad \text{or} \quad x = y = \frac{2}{3}, \ z = \pm \frac{1}{3}$$

and

$$f(0, 0, \pm 1) = 1, \qquad f\Big( \frac{2}{3}, \frac{2}{3}, \pm \frac{1}{3} \Big) = \frac{19}{27}.$$

If $x + y = -2$, then (from (6.3.2) and (6.3.3))

$$x^2 + 2x - 4 = y^2 + 2y - 4 = 0.$$

Summing these equations we get $0 = x^2 + y^2 - 4 - 8$, i.e., there cannot exist $z$ such that $x^2 + y^2 + z^2 = 1$. We have found several points in $M$ for which the necessary condition is satisfied. Since $M$ is a compact set in $\mathbb{R}^3$ and $f$ is continuous, the maximum and the minimum of $f$ on $M$ have to exist. Comparing the values of $f$ at points at which the necessary condition is satisfied we find that

$$\max_M f = f(0, 0, \pm 1) = 1, \qquad \min_M f = f\Big( -\frac{\sqrt{2}}{2}, -\frac{\sqrt{2}}{2}, 0 \Big) = -\frac{\sqrt{2}}{2}.$$

If we were interested in local minima/maxima of $f$ with respect to $M$, we would need some sufficient conditions. Since we are able to reduce the problem of constrained minima/maxima to that of local ones (see the proof of Theorem 6.3.2), we might employ the sufficient condition which uses the second differential (Theorem 6.1.5). Cf. Exercise 6.3.17.

Example 6.3.5 (Existence of the principal eigenvalue). Let $p > 1$ be a real number, $X := W_0^{1,p}(0, 1)$.$^{31}$ Consider the eigenvalue problem

$$-\big( |\dot x(t)|^{p-2} \dot x(t) \big)^{\cdot} = \lambda |x(t)|^{p-2} x(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0 \tag{6.3.6}$$

with a real parameter $\lambda$. This problem is linear for $p = 2$ and nonlinear for $p \ne 2$. We say that $\lambda \in \mathbb{R}$ is an eigenvalue of (6.3.6) if there is a weak solution $x \in X$, $x \ne o$, of (6.3.6), i.e.,

$$\int_0^1 |\dot x(t)|^{p-2} \dot x(t) \dot y(t) \, dt = \lambda \int_0^1 |x(t)|^{p-2} x(t) y(t) \, dt \tag{6.3.7}$$

holds for every $y \in X$. The corresponding $x$ is then called an eigenfunction associated with the eigenvalue $\lambda$.$^{32}$

31 We will work with the norm $\|x\| = \big( \int_0^1 |\dot x(t)|^p \, dt \big)^{1/p}$.

32 To see the analogue to the linear case the reader should notice that for $p = 2$ such a function $x$ is an eigenvector (Definition 1.1.27) and $\lambda$ is an eigenvalue of the linear operator $Bx = \ddot x$, $\operatorname{Dom} B = \{x \in W_0^{1,2}(0, 1) : x(0) = x(1) = 0\} \subset L^2(0, 1)$. The identity (6.3.7) can be interpreted (for $p = 2$) as the operator equation $x = \lambda A x$ where $A$ is defined by the equality $(Ax, y)_{W_0^{1,2}(0,1)} = (x, y)_{L^2(0,1)}$. The eigenvalues of (6.3.6) are then reciprocal values of the eigenvalues of $A$.

Since (6.3.7) must also hold for $y = x$, we obtain

$$\lambda = \frac{\int_0^1 |\dot x(t)|^p \, dt}{\int_0^1 |x(t)|^p \, dt},$$

which implies that $\lambda > 0$ for any eigenvalue $\lambda$. We will prove that the value

$$\lambda_1 = \inf_{\substack{x \in X \\ x \ne o}} \frac{\int_0^1 |\dot x(t)|^p \, dt}{\int_0^1 |x(t)|^p \, dt}, \tag{6.3.8}$$

i.e.,

$$\lambda_1 = \inf_{x \in X} \Big\{ \int_0^1 |\dot x(t)|^p \, dt : \int_0^1 |x(t)|^p \, dt = 1 \Big\},$$

is attained and use the Lagrange Multiplier Method to show that $\lambda_1$ is the least eigenvalue (principal eigenvalue) of (6.3.6). Let us prove that the infimum in (6.3.8) is achieved at an $x_1 \in X$ with

$$\int_0^1 |x_1(t)|^p \, dt = 1.$$

Indeed, there exists a minimizing sequence $\{x_n\}_{n=1}^{\infty} \subset X$ such that

$$\int_0^1 |x_n(t)|^p \, dt = 1 \quad \text{and} \quad \int_0^1 |\dot x_n(t)|^p \, dt \to \lambda_1.$$

In particular, this means that the sequence $\{x_n\}_{n=1}^{\infty}$ is bounded in $X$. By the reflexivity of $X$ and the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$ (see Theorem 1.2.28 and Exercise 1.2.46(i)) there exists a subsequence $\{x_{n_k}\}_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ and a function $x_1 \in X$ such that

$$x_{n_k} \rightharpoonup x_1 \ \text{ in } X, \qquad x_{n_k} \to x_1 \ \text{ in } L^p(0, 1).$$

Hence

$$\int_0^1 |x_1(t)|^p \, dt = 1 \quad \text{and} \quad \|x_1\|^p \le \liminf_{n \to \infty} \|x_n\|^p = \lambda_1,$$

i.e.,

$$\int_0^1 |\dot x_1(t)|^p \, dt = \lambda_1.$$

Now we apply Theorem 6.3.2 with

$$f(x) = \int_0^1 |\dot x(t)|^p \, dt \quad \text{and} \quad g(x) = \int_0^1 |x(t)|^p \, dt - 1.$$

The Fréchet derivatives of $f$ and $g$ at $x_1$ (in the space $X$) are given by

$$f'(x_1) y = p \int_0^1 |\dot x_1(t)|^{p-2} \dot x_1(t) \dot y(t) \, dt, \qquad g'(x_1) y = p \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t) \, dt \quad \text{for any } y \in X$$

(cf. Exercise 3.2.35). Since $x_1 \ne o$, we also have $g'(x_1) \ne o$, and so the assumptions of Theorem 6.3.2 are fulfilled. Hence there exists $\lambda \in \mathbb{R}$ such that $f'(x_1) = \lambda g'(x_1)$, which is equivalent to

$$\int_0^1 |\dot x_1(t)|^{p-2} \dot x_1(t) \dot y(t) \, dt = \lambda \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t) \, dt \tag{6.3.9}$$

for any $y \in X$. Setting $y = x_1$ in (6.3.9) we get $\lambda = \lambda_1$. Now it follows from (6.3.7) and (6.3.8) that $\lambda_1$ is the least eigenvalue of (6.3.6).

Remark 6.3.6. Let us emphasize that Theorem 6.3.2 provides a necessary condition only. It means that not every point $a \in M$ for which

$$f'(a) - \sum_{i=1}^{N} \lambda_i \Phi_i'(a) = o \quad \text{with some } \lambda_i \in \mathbb{R}, \ i = 1, \dots, N,$$

need be a point of local extremum of $f$ relative to $M$! On the other hand, to find all local extrema of $f$ relative to $M$ one has to start with finding all $\lambda_i \in \mathbb{R}$, $i = 1, \dots, N$, such that the functional $f - \sum_{i=1}^{N} \lambda_i \Phi_i$ has a critical point $a \in M$. It is a well-known fact from the calculus of several real variables (when $X = \mathbb{R}^N$) that the set of all such $a$'s is "almost always" finite (see, e.g., Example 6.3.4). Hence a very natural and deep question arises: "How many points $a$ do we have if $\dim X = \infty$?"

Remark 6.3.7. Let us denote by $\Lambda \subset \mathbb{R}$ the set of all $\lambda \in \mathbb{R}$ such that $f - \lambda g$ has a critical point $a \in M$. If $X$ is a Hilbert space of infinite dimension, then in Krasnoselski [78, Chapter 6] the reader can find the proof of the assertion that the set $\Lambda$ contains a sequence of nonzero numbers $\lambda_n \ne 0$ such that $\lambda_n \to 0$. The same assertion for a Banach space $X$ can be found in Citlanadze [26], Browder [18], Fučík & Nečas [55]. Actually, the whole Chapter 6 of the lecture notes by Fučík et al. [56] is devoted to this problem. As for more recent references the reader may consult Zeidler [136] and the bibliography therein. Let us emphasize that in all above results the authors prove that the cardinality of the set $\Lambda$ is equal to infinity. The question: "When is $\Lambda$ a countable set?" is much more involved. Some partial results in this direction can be found in Fučík et al. [56]. The proofs are based on a stronger version of the Morse Theorem and go beyond the scope of this book.

Proposition 6.3.8. Let $H$ be an $N$-dimensional Hilbert space and let $A$ be a self-adjoint operator in $H$. Then $A$ has $N$ real eigenvalues $\lambda_1, \dots, \lambda_N$ (if they are counted with their multiplicities), and the corresponding eigenvectors $e_1, \dots, e_N$ form an orthonormal basis in $H$.

Proof. Consider two functions $f, \varphi_1 : H \to \mathbb{R}$ defined by

$$f(x) = (Ax, x), \qquad \varphi_1(x) = (x, x) - 1, \quad x \in H.$$

Then the set $M_1 = \{x \in H : \varphi_1(x) = 0\}$ (the unit sphere in $H$) is a compact subset of $H$ and the continuous function $f$ assumes its maximum in $M_1$ at a point $e_1 \in M_1$. By Theorem 6.3.2, there is a $\lambda_1 \in \mathbb{R}$ such that $f'(e_1) - \lambda_1 \varphi_1'(e_1) = o$. A simple calculation shows that $f'(e_1) h = 2(Ae_1, h)$, $\varphi_1'(e_1) h = 2(e_1, h)$. Therefore

$$(Ae_1 - \lambda_1 e_1, h) = 0 \quad \text{for all } h \in H, \quad \text{i.e.,} \quad Ae_1 = \lambda_1 e_1.$$

Taking $h = e_1$ we also get

$$\lambda_1 = (Ae_1, e_1) = \max_{x \in M_1} (Ax, x).$$

In particular, $\lambda_1$ is the largest (equivalently, first) eigenvalue. To find the second eigenvalue we add another constraint $\varphi_2(x) := (x, e_1) = 0$ (remember that eigenvectors of a symmetric matrix are pairwise orthogonal). The function $f$ has again a maximum with respect to

$$M_2 = \{x \in H : \varphi_1(x) = \varphi_2(x) = 0\}$$

and thus there are $e_2 \in M_2$, $\lambda_2, \tilde\lambda_2 \in \mathbb{R}$ such that

$$f'(e_2) h - \lambda_2 \varphi_1'(e_2) h - \tilde\lambda_2 \varphi_2'(e_2) h = (2Ae_2 - 2\lambda_2 e_2 - \tilde\lambda_2 e_1, h) = 0 \tag{6.3.10}$$

for all $h \in H$. In particular, for $h = e_1$ we get

$$0 = (2Ae_2, e_1) - \tilde\lambda_2 \|e_1\|^2 = 2(e_2, Ae_1) - \tilde\lambda_2 = 2\lambda_1 (e_2, e_1) - \tilde\lambda_2,$$

and consequently $\tilde\lambda_2 = 0$. The equality (6.3.10) hence yields $Ae_2 = \lambda_2 e_2$ and, similarly as above,

$$\lambda_2 = \max_{\substack{\|x\| = 1 \\ (x, e_1) = 0}} (Ax, x).$$

It is obvious that we can proceed by induction to obtain all eigenvalues $\lambda_1, \dots, \lambda_N$ and to show that the corresponding eigenvectors $e_1, \dots, e_N$ are orthonormal and form a basis of $H$.

Corollary 6.3.9. Let $A = (a_{ij})_{i,j=1,\dots,N}$ be a symmetric matrix ($a_{ij} = a_{ji}$ for $i, j = 1, \dots, N$). Then there exist real numbers $\lambda_1, \dots, \lambda_N$ and a basis $e_1, \dots, e_N$ of $\mathbb{R}^N$ such that

$$\sum_{i,j=1}^{N} a_{ij} x_i x_j = \sum_{i=1}^{N} \lambda_i \xi_i^2, \quad \text{where } x = (x_1, \dots, x_N), \quad x = \sum_{i=1}^{N} \xi_i e_i.$$
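Proposition 6.3.8 and Corollary 6.3.9 can be checked numerically on a small example. The sketch below uses `numpy.linalg.eigh` to obtain the eigenpairs instead of the constrained maximization from the proof, so it illustrates the statements rather than the method; the matrix `A`, the vector `x` and the random sampling are hypothetical choices.

```python
import numpy as np

# Hypothetical symmetric matrix; eigh returns ascending eigenvalues and orthonormal columns.
A = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
eigenvalues, E = np.linalg.eigh(A)
order = np.argsort(eigenvalues)[::-1]           # reorder so that lam[0] >= lam[1] >= ...
lam, E = eigenvalues[order], E[:, order]

# Corollary 6.3.9: sum a_ij x_i x_j = sum lam_i xi_i^2 with x = sum xi_i e_i.
x = np.array([0.3, -1.2, 0.7])
xi = E.T @ x                                    # coordinates of x in the basis e_1, ..., e_N
quadratic_form = x @ A @ x
diagonal_form = np.sum(lam * xi**2)

# Proposition 6.3.8: lam_1 = max (Ax, x) on the unit sphere, lam_2 the same with (x, e_1) = 0.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
rayleigh = np.einsum("ij,jk,ik->i", samples, A, samples)

projected = samples - np.outer(samples @ E[:, 0], E[:, 0])   # remove the e_1 component
projected /= np.linalg.norm(projected, axis=1, keepdims=True)
rayleigh_constrained = np.einsum("ij,jk,ik->i", projected, A, projected)
```

The sampled values of $(Ax, x)$ never exceed `lam[0]`, and after projecting onto the orthogonal complement of $e_1$ they never exceed `lam[1]`, in line with the two maximization problems used in the proof.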

Remark 6.3.10. The procedure explored in the proof of Proposition 6.3.8 has a disadvantage, namely, to find the $k$th eigenvalue $\lambda_k$ it is necessary to know the first $k - 1$ eigenvectors $e_1, \dots, e_{k-1}$. Because of that it can be convenient to have another expression for $\lambda_k$. We will now prove that

$$\lambda_k = \min_{y_1, \dots, y_{k-1}} \max \Big\{ \frac{(Ax, x)}{\|x\|^2} : (x, y_1) = \dots = (x, y_{k-1}) = 0 \text{ and } x \ne o \Big\} \tag{6.3.11}$$

provided $\dim H \ge k$. Expression (6.3.11) is called the Minimax Principle. Let $e_1, \dots, e_k$ be eigenvectors corresponding to the first $k$ eigenvalues $\lambda_1 \ge \dots \ge \lambda_k$. Take $y_1, \dots, y_{k-1} \in H$ and let

$$N = \{x \ne o : (x, y_1) = \dots = (x, y_{k-1}) = 0\}.$$

There is an $\tilde x \in N \cap \operatorname{Lin}\{e_1, \dots, e_k\}$, say $\tilde x = \sum_{i=1}^{k} \alpha_i e_i$. A simple argument to see this consists in the observation that the linear operator $\Phi : \mathbb{R}^k \to \mathbb{R}^{k-1}$ (or $\mathbb{C}^k \to \mathbb{C}^{k-1}$) given by

$$\Phi \alpha = \Big( \sum_{i=1}^{k} \alpha_i (e_i, y_j) \Big)_{j=1,\dots,k-1}$$

must have a nontrivial kernel. For such an $\tilde x$ we have

$$(A\tilde x, \tilde x) = \Big( \sum_{i=1}^{k} \alpha_i \lambda_i e_i, \sum_{j=1}^{k} \alpha_j e_j \Big) = \sum_{i=1}^{k} \lambda_i |\alpha_i|^2 \ge \lambda_k \sum_{i=1}^{k} |\alpha_i|^2 = \lambda_k \|\tilde x\|^2.$$

This shows that the maximum in (6.3.11) (denoted by $m(y_1, \dots, y_{k-1})$) is not less than $\lambda_k$ and therefore

$$\inf_{y_1, \dots, y_{k-1}} m(y_1, \dots, y_{k-1}) \ge \lambda_k,$$

too. But the above calculation yields that $m(e_1, \dots, e_{k-1}) = \lambda_k$.

Remark 6.3.11. This method of finding eigenvalues of a self-adjoint continuous operator $A$ cannot be extended to infinite dimensional Hilbert spaces. The reason is rather simple: such an operator need not have any eigenvector (Example: $Ax(t) = t x(t)$, $x \in L^2(0, 1)$). On the other hand, if we assume that $A$ is, in addition to self-adjointness, also compact, then a similar result holds.

Theorem 6.3.12 (Courant–Fischer Principle). Let $A : H \to H$ be a compact, self-adjoint and positive$^{33}$ linear operator from an (infinite dimensional) separable real Hilbert space $H$ into itself. Then all eigenvalues of $A$ are positive reals and there exists an orthonormal basis of $H$ which consists of eigenvectors of $A$. If, moreover,

$$\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots > 0, \qquad \lambda_n \to 0 \quad (n \to \infty),$$

denote the eigenvalues of $A$, then

$$\lambda_1 = \max \{(Au, u) : \|u\| = 1\}$$

and

$$\lambda_{k+1} = \min_{v_1, \dots, v_k} \max \{(Au, u) : \|u\| = 1, \ (u, v_1) = \dots = (u, v_k) = 0\}, \quad k = 1, 2, \dots.^{34}$$

Proof. Set

$$F(u) = (Au, u), \qquad \varphi_1(u) = \|u\|^2 - 1 \quad \text{for } u \in H,$$

and $M_1 = \{u \in H : \varphi_1(u) = 0\}$.

33 A linear self-adjoint operator $A$ is said to be positive if $(Au, u) > 0$ for all $u \ne o$.

34 The reader should compare this assertion and its proof with the Hilbert–Schmidt Theorem (Theorem 2.2.16).

Let {un }n=1 be a maximizing sequence for F subject to M1 , i.e., un = 1, n = 1, . . . , and lim F (un ) = sup {F (u) : u ∈ M1 }. n→∞

The boundedness of M1 and the compactness of A imply (Proposition 2.2.4(iii)) ∞ that we can pass to a subsequence (denoted again as {un }n=1 ) for which u n e1

and

Aun → Ae1

in

H

with an e1 ∈ H.

Then |(Aun , un ) − (Ae1 , e1 )| ≤ |(Aun − Ae1 , un )| + |(Ae1 , un − e1 )| → 0 since both terms on the right-hand side approach zero. So F (e1 ) = sup {F (u) : u ∈ M1 }. In particular, we have F (e1 ) > 0

and

e1 = o.

Let us prove that e1 = 1. Indeed, we have e1 ≤ lim inf un = 1. n→∞

Assume that $\|e_1\| < 1$. Then there exists $t > 1$ such that for $\tilde e_1 = te_1$ we have $\|\tilde e_1\| = 1$, i.e., $\tilde e_1 \in M_1$. Also
\[ F(\tilde e_1) = (A(te_1), te_1) = t^2 (Ae_1, e_1) = t^2 F(e_1) > \sup\{F(u) : u \in M_1\}, \]
a contradiction. Hence
\[ \lambda_1 = F(e_1) = \max\{F(u) : u \in M_1\}. \]
Applying Theorem 6.3.2 we prove exactly as in Proposition 6.3.8 that $\lambda_1$ is an eigenvalue of $A$ and $e_1$ is the corresponding eigenvector. Now, we proceed by induction using
\[ M_n = \{u \in H : \|u\| = 1 \text{ and } (u, e_1) = \dots = (u, e_{n-1}) = 0\} \]
as above to get the sequence of eigenvalues
\[ \lambda_1 \ge \lambda_2 \ge \dots > 0 \tag{6.3.12} \]
and the sequence of the corresponding eigenvectors $e_1, e_2, \dots$$^{35}$ which are pairwise orthogonal. Since $H$ is infinite dimensional, the above sequences are infinite in general.

35 The reader should perform this part of the proof in detail.


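Suppose now that there is $w \in H$ such that
\[ \|w\| = 1 \quad \text{and} \quad (w, e_n) = 0 \quad \text{for all } n \in \mathbb{N}. \]
Then
\[ w \in \bigcap_{n=1}^{\infty} M_n, \quad \text{and thus} \quad (Aw, w) \le \lambda_n \quad \text{for } n = 1, 2, \dots. \]
Since $\lambda_n \to 0$ (Corollary 2.2.13), we have $(Aw, w) = 0$. The assumption on the positivity of $A$ implies $w = o$, a contradiction. This result shows that $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis of $H$ (Corollary 1.2.36). Moreover, the sequence (6.3.12) contains all eigenvalues of $A$. Indeed, if
\[ Aw = \lambda w \quad \text{for} \quad w = \sum_{n=1}^{\infty} \alpha_n e_n \ne o, \]
then
\[ \lambda_n \alpha_n = \lambda \alpha_n \quad \text{for } n = 1, 2, \dots. \]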

Therefore $\alpha_n = 0$ provided $\lambda_n \ne \lambda$. The "min max" characterization of the $\lambda_n$'s follows as in the finite dimensional case (Remark 6.3.10).

Remark 6.3.13. It is remarkable that the Minimax Principle holds even without the assumption on the continuity of $A$ in the sense that
\[ \inf_{y_1,\dots,y_{k-1}} \sup\{(Ax, x) : x \in \operatorname{Dom} A,\ \|x\| = 1,\ (x, y_1) = \dots = (x, y_{k-1}) = 0\} \]
yields either the $k$th eigenvalue or an upper bound of the essential spectrum of a linear self-adjoint operator $A$ provided $A$ is bounded above. For details see, e.g., Reed & Simon [106]. There is also a dual characterization of the eigenvalues of $A$ called the Courant–Weinstein Variational Principle.

Theorem 6.3.14 (Courant–Weinstein Variational Principle). Let $H$ be a real separable Hilbert space, $A : H \to H$ a positive compact self-adjoint linear operator. Assume that the eigenvalues $\lambda_n$ of $A$ form a decreasing sequence
\[ \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_n \ge \dots > 0, \qquad \lambda_n \to 0 \quad (n \to \infty) \]
(cf. Theorem 6.3.12), and the multiplicity of an eigenvalue $\lambda$ indicates how many times this $\lambda$ repeats in the above sequence. Then for any $n \in \mathbb{N}$,
\[ \lambda_n = \sup_{\substack{X \subset H \\ \dim X = n}} \ \inf_{\substack{u \in X \\ \|u\| = 1}} (Au, u). \]
(Here $X$ is an arbitrary linear subspace of $H$ of dimension equal to $n$.)


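Proof. Keeping the notation from Theorem 6.3.12, in particular, $Ae_n = \lambda_n e_n$, we denote for $n \in \mathbb{N}$ fixed
\[ \tilde\lambda_n = \sup_{\substack{X \subset H \\ \dim X = n}} \ \inf_{\substack{u \in X \\ \|u\| = 1}} (Au, u). \]
Our aim is to prove $\tilde\lambda_n = \lambda_n$.

Step 1. We prove that $\tilde\lambda_n \ge \lambda_n$. Set
\[ X_0 = \operatorname{Lin}\{e_1, \dots, e_n\}. \]
Then $X_0$ is a linear subspace of $H$, $\dim X_0 = n$, and clearly
\[ \tilde\lambda_n \ge \min_{\substack{u \in X_0 \\ \|u\| = 1}} (Au, u). \]
However, we can estimate the minimum of the quadratic form on the right-hand side in terms of $\lambda_n$. For $u \in X_0$, $\|u\| = 1$ we have
\[ u = \sum_{i=1}^{n} x_i e_i, \qquad \sum_{i=1}^{n} x_i^2 = 1. \]
Then
\[ (Au, u) = \Big( \sum_{i=1}^{n} x_i \lambda_i e_i, \ \sum_{j=1}^{n} x_j e_j \Big) = \sum_{i=1}^{n} \lambda_i x_i^2 \ge \lambda_n, \quad \text{i.e.,} \quad \tilde\lambda_n \ge \lambda_n. \]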

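Step 2. We prove $\tilde\lambda_n \le \lambda_n$. Set
\[ Y = \operatorname{Lin}\{e_i\}_{i=n}^{\infty}. \]
Then $\operatorname{codim} Y = n - 1$. Let $X$ be an arbitrary linear subspace of $H$, $\dim X = n$. Then necessarily $\dim (X \cap Y) > 0$, and the space $X \cap Y$ must contain an element $w \ne o$. We can assume $\|w\| = 1$. Since $w \in Y$, we have
\[ w = \sum_{i=n}^{\infty} x_i e_i, \qquad \sum_{i=n}^{\infty} x_i^2 = 1. \]
The estimate of the quadratic form $(Au, u)$ on the unit sphere in $X$ yields
\[ \min_{\substack{u \in X \\ \|u\| = 1}} (Au, u) \le (Aw, w) = \sum_{i=n}^{\infty} \lambda_i x_i^2 \le \lambda_n \sum_{i=n}^{\infty} x_i^2 = \lambda_n. \]
Since $X$ is arbitrary, the inequality $\tilde\lambda_n \le \lambda_n$ follows.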
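The two steps of this proof can be checked numerically in a finite dimensional model. The following is only a sketch under assumptions that are not part of the text: $H$ is replaced by $\mathbb{R}^3$, $A$ by the diagonal matrix with eigenvalues $3 \ge 2 \ge 1$, and $n = 2$. For a two-dimensional subspace $X$ the minimum of $(Au, u)$ over the unit sphere of $X$ is the smaller eigenvalue of the $2 \times 2$ compression of $A$ to $X$, which has a closed form.

```python
import math
import random

# Assumed toy operator: A = diag(3, 2, 1) on R^3, so lambda_2 = 2.
LAM = [3.0, 2.0, 1.0]

def apply_A(u):
    return [l * x for l, x in zip(LAM, u)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_pair(a, b):
    """Gram-Schmidt: an orthonormal basis of X = span{a, b}."""
    na = math.sqrt(dot(a, a))
    q1 = [x / na for x in a]
    proj = dot(b, q1)
    rest = [x - proj * y for x, y in zip(b, q1)]
    nr = math.sqrt(dot(rest, rest))
    q2 = [x / nr for x in rest]
    return q1, q2

def min_on_unit_sphere(a, b):
    """min of (Au, u) over unit vectors u in span{a, b}, i.e. the smaller
    eigenvalue of the symmetric 2x2 matrix [(A q_i, q_j)]."""
    q1, q2 = orthonormal_pair(a, b)
    m11 = dot(apply_A(q1), q1)
    m12 = dot(apply_A(q1), q2)
    m22 = dot(apply_A(q2), q2)
    tr, det = m11 + m22, m11 * m22 - m12 * m12
    disc = max(tr * tr - 4.0 * det, 0.0)
    return 0.5 * (tr - math.sqrt(disc))

random.seed(0)
samples = []
for _ in range(2000):
    a = [random.gauss(0.0, 1.0) for _ in range(3)]
    b = [random.gauss(0.0, 1.0) for _ in range(3)]
    samples.append(min_on_unit_sphere(a, b))

# Step 2: no subspace X does better than lambda_2.
best_random = max(samples)
# Step 1: X_0 = Lin{e_1, e_2} attains lambda_2.
at_X0 = min_on_unit_sphere([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(best_random, at_X0)
```

Random subspaces only probe the supremum from below; the value $\lambda_2$ itself is reached at the subspace $X_0$ from Step 1.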


Example 6.3.15 (Higher eigenvalues). Let $p = 2$ in (6.3.6), i.e., let us consider the eigenvalue problem
\[ \begin{cases} \ddot x(t) + \lambda x(t) = 0, & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{6.3.13} \]
The eigenvalues of the linear problem (6.3.13) can be calculated in an elementary way. On the other hand, if we set $H := W_0^{1,2}(0, 1)$ and define a positive and compact operator $A : H \to H$ by
\[ (Ax, y)_{W_0^{1,2}(0,1)} = \int_0^1 x(t) y(t)\, dt,^{36} \]
then $\mu \ne 0$ is an eigenvalue of $A$ if and only if $\lambda = \frac{1}{\mu}$ is an eigenvalue of (6.3.13) (cf. footnote 32 on page 404). It follows from Theorem 6.3.14 that
\[ \frac{1}{\lambda_n} = \sup_{\substack{X \subset H \\ \dim X = n}} \ \min_{\substack{x \in X \\ \|x\| = 1}} \int_0^1 |x(t)|^2\, dt. \]
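As a quick sanity check of the relation $\mu = 1/\lambda$, one can evaluate the quotient $\int_0^1 |x|^2\,dt \big/ \int_0^1 |\dot x|^2\,dt$ at the functions $x(t) = \sin(n\pi t)$. The sketch below assumes that the norm on $W_0^{1,2}(0,1)$ is $\big(\int_0^1 |\dot x|^2\,dt\big)^{1/2}$ (this is consistent with the relation $\mu = 1/\lambda$ above) and uses a plain midpoint rule; the function name and the number of quadrature nodes are choices made for this illustration only.

```python
import math

def rayleigh_quotient(n, m=20000):
    """Approximate int_0^1 |x|^2 dt / int_0^1 |x'|^2 dt for x(t) = sin(n*pi*t)
    with the composite midpoint rule on m subintervals."""
    h = 1.0 / m
    num = 0.0
    den = 0.0
    for j in range(m):
        t = (j + 0.5) * h
        x = math.sin(n * math.pi * t)
        dx = n * math.pi * math.cos(n * math.pi * t)
        num += x * x * h
        den += dx * dx * h
    return num / den

for n in (1, 2, 3):
    # reciprocal of the quotient next to the elementary value (n*pi)^2
    print(n, 1.0 / rayleigh_quotient(n), (n * math.pi) ** 2)
```

The reciprocals agree with $n^2\pi^2$, the eigenvalues of (6.3.13) obtained in the elementary way.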

The following two exercises show the relation between the local (global) extremum subject to a constraint and the local (global) extremum of the functional depending on a parameter (without the constraint).

Exercise 6.3.16. Prove the following assertion: Let $f$, $\Phi$ be two real functionals defined on a real Hilbert space $H$. Let the functional $f - \lambda\Phi$ ($\lambda \in \mathbb{R}$) have a local (global) extremum at a point $x_0 \in H$. Then the functional $f$ has a local (global) extremum subject to the constraint $\{x \in H : \Phi(x) = \Phi(x_0)\}$ at the point $x_0$.

Exercise 6.3.17. Prove the following assertion: Let $f, \Phi : X \to \mathbb{R}$ satisfy the assumptions of Theorem 6.3.2 and let $x_0 \in X$, $\lambda \in \mathbb{R}$ be such that $f'(x_0) - \lambda\Phi'(x_0) = 0$. Assume, moreover, that there exist $D^2 f(x_0; h, h)$, $D^2\Phi(x_0; h, h)$. Then $x_0$ is a local minimum of $f - \lambda\Phi$ (without the constraint) provided the quadratic form
\[ h \mapsto D^2 f(x_0; h, h) - \lambda D^2\Phi(x_0; h, h), \qquad h \in X, \]
is positive definite in $X$.

36 By Example 2.2.17 the operator $A$ is also defined as $(Ax)(t) = \int_0^1 G(t, s) x(s)\, ds$, and the compactness of $A$ follows.


Exercise 6.3.18. Show that the first eigenvalue of
\[ \begin{cases} \ddot x(t) + \lambda x(t) = 0, & t \in (0, \pi), \\ x(0) = x(\pi) = 0 \end{cases} \]
is simple and equal to 1, and that given $\lambda > -1$ there exists $c = c(\lambda) > 0$ such that for any $x \in W_0^{1,2}(0, \pi)$,
\[ \int_0^{\pi} |\dot x(t)|^2\, dt + \lambda \int_0^{\pi} |x(t)|^2\, dt \ge c \int_0^{\pi} |\dot x(t)|^2\, dt. \]

Exercise 6.3.19. Prove that for all $x \in W_0^{1,2}(0, \pi)$ the inequality
\[ \int_0^{\pi} |x(t)|^2\, dt \le \int_0^{\pi} |\dot x(t)|^2\, dt \]
holds true.

Hint. Use Exercise 6.3.18.

6.3A Contractible Sets

This appendix has solely an auxiliary character and will be used in the proof of the Krasnoselski Potential Bifurcation Theorem in Appendix 6.3B. The proofs of the assertions from this appendix rely on the Brouwer Fixed Point Theorem (Theorem 5.1.3).

Definition 6.3.20. Let $A$ and $B$ be subsets of a topological space $Y$. Then by definition $A$ is contractible into $B$ in the space $Y$, briefly
\[ A \prec B \quad \text{in} \quad Y, \]
if there exists a homotopy $h \in C([0, 1] \times A, Y)$ such that for any $u \in A$,
\[ h(0, u) = u, \qquad h(1, u) \in B. \]

The next assertion shows that "$\prec$" is a transitive relation.

Lemma 6.3.21. Let $A$, $B$ and $C$ be subsets of $Y$. If $A \prec B$ and $B \prec C$ in $Y$, then also $A \prec C$ in $Y$.

Proof. Let us assume that $A \prec B$ and $B \prec C$ by means of homotopies $h$ and $g$. Define a homotopy $f$ by
\[ f(t, u) = \begin{cases} h(2t, u), & 0 \le t \le \frac12,\ u \in A, \\ g(2t - 1, h(1, u)), & \frac12 < t \le 1,\ u \in A. \end{cases} \]
Then $f \in C([0, 1] \times A, Y)$ (the two branches match at $t = \frac12$ because $g(0, h(1, u)) = h(1, u)$), and Definition 6.3.20 yields $A \prec C$.

Let $H_1$ and $H_2$ be two closed subspaces of a Hilbert space $H$ such that
\[ H = H_1 \oplus H_2. \]


Let $P_i : H \to H_i$, $i = 1, 2$, be projections (cf. Example 1.1.13(i)), and assume that $\dim H_1 < \infty$. Set
\[ R = \{x \in H : P_1 x \ne o\}. \]
The set $R$ equipped with the metric induced by the norm in $H$ is a metric space.

Lemma 6.3.22. The set $S_{1,r} := \partial B(o; r) \cap H_1$ is not contractible to a point in $R$.$^{37}$

Proof. It is enough to prove this assertion for the sphere with radius $r = 1$. Let us denote it by $S_1$. We proceed in two steps. We prove first that if $S_1$ were contractible to a point in $R$, then it would have to be contractible to a point in $S_1$. In the second step we show that this fact contradicts the Brouwer Fixed Point Theorem (Theorem 5.1.3).

Step 1. If $S_1$ is contractible to a point in $R$, then there exists a continuous mapping $f : [0, 1] \times S_1 \to R$ and $x_0 \in R$ such that
\[ f(0, x) = x, \qquad f(1, x) = x_0 \qquad \text{for all } x \in S_1. \]
For $t \in [0, 1]$, $x \in S_1$ set
\[ g(t, x) = \frac{P_1 f(t, x)}{\|P_1 f(t, x)\|}. \]
Then $g$ deforms the set $S_1$ continuously to the point $\dfrac{P_1 x_0}{\|P_1 x_0\|}$ in $S_1$.

Step 2. Let the unit sphere $S_1 \subset H_1$ be contractible to a point in $S_1$, i.e., there exists a continuous map $g : [0, 1] \times S_1 \to S_1$ and a point $x_0 \in S_1$ such that
\[ g(0, x) = x, \qquad g(1, x) = x_0 \qquad \text{for all } x \in S_1. \]
Now, we define $h : \overline{B(o; 1)} \cap H_1 \to \overline{B(o; 1)} \cap H_1$ by
\[ h : x \mapsto \begin{cases} -g\Big(1 - \|x\|, \dfrac{x}{\|x\|}\Big) & \text{for } x \ne o, \\ -x_0 & \text{for } x = o. \end{cases} \]
Then $h$ is continuous. Since $\dim H_1 < \infty$, the Brouwer Fixed Point Theorem (Theorem 5.1.3) implies that there exists $y \in \overline{B(o; 1)} \cap H_1$ such that $h(y) = y$. Since $h$ assumes only values from $S_1$, we have $y \in S_1$, $\|y\| = 1$. On the other hand, $h(y) = -g(0, y) = -y$, which is a contradiction.

Lemma 6.3.23. Let $\mathcal{F}$ be a subset of $R$. If there exists $x_0 \in H_1$, $\|x_0\| = 1$, such that
\[ P_1(\mathcal{F}) \cap \{y \in H_1 : y = a x_0,\ a \in \mathbb{R}\} = \emptyset, \]
then $\mathcal{F}$ is contractible to a point in $R$.

37 I.e., there is no $x \in R$ such that $S_{1,r} \prec \{x\}$ in $R$.


Proof. Define $f : [0, 1] \times \mathcal{F} \to R$ as
\[ f(t, x) = \begin{cases} x + 2t x_0 [1 - (x, x_0)] & \text{for } t \in \big[0, \frac12\big],\ x \in \mathcal{F}, \\ x_0 + 2(1 - t)[x - (x, x_0) x_0] & \text{for } t \in \big[\frac12, 1\big],\ x \in \mathcal{F}. \end{cases} \]
The mapping $f$ is continuous and deforms $\mathcal{F}$ to the point $x_0 \in R$. It is sufficient to verify that for any $t \in [0, 1]$, $x \in \mathcal{F}$ we have $P_1 f(t, x) \ne o$.

Indeed, for any $t \in \big[0, \frac12\big]$ we have
\[ P_1 f(t, x) = 2t[1 - (x, x_0)] x_0 + P_1 x, \]
for $t \in \big[\frac12, 1\big]$ we have
\[ P_1 f(t, x) = [1 - 2(1 - t)(x, x_0)] x_0 + 2(1 - t) P_1 x. \]
For $t \in [0, 1)$ we have then $P_1 f(t, x) \ne o$ due to the assumption $P_1(\mathcal{F}) \cap \operatorname{Lin}\{x_0\} = \emptyset$. For $t = 1$ we have $P_1 f(t, x) = x_0 \ne o$.

6.3B Krasnoselski Potential Bifurcation Theorem

Let us recall the definition of a potential operator.

Definition 6.3.24. Let $O$ be an open subset of a real Hilbert space $H$, $f : O \to H$. We say that $f$ has a potential (in $O$) if there exists a functional $F : O \to \mathbb{R}$ which is Fréchet differentiable in $O$, and for any $x \in O$ we have
\[ f(x) = F'(x). \tag{6.3.14} \]

Remark 6.3.25. Let us recall how to interpret the equality (6.3.14). The Fréchet derivative $F'(x)$ is a continuous linear operator from $H$ into $\mathbb{R}$. It follows from the Riesz Representation Theorem (see Theorem 1.2.40) that there is a unique point $z := z(x) \in H$ such that
\[ F'(x) y = (y, z), \qquad \|z\| = \|F'(x)\| \]
for any $y \in H$. In what follows we will identify $F'(x)$ with $z(x) \in H$ and study bifurcation points of the equation
\[ \lambda x - F'(x) = o. \tag{6.3.15} \]
The main objective of this appendix is to prove that (under the assumptions $F(o) = 0$, $F'(o) = o$ and some assumptions concerning the smoothness of $F$) every point $(\lambda, o)$ where $\lambda$ is a nonzero eigenvalue of $F''(o) : H \to H$ is a bifurcation point of (6.3.15).


Theorem 6.3.26 (Krasnoselski Potential Bifurcation Theorem). Let $F$ be a (nonlinear) functional on a Hilbert space $H$. Assume that
\[ F \text{ is twice differentiable in a certain neighborhood } U(o) \text{ of } o \in H, \tag{6.3.16} \]
\[ F' \text{ is compact on } U(o), \tag{6.3.17} \]
\[ F'' : U(o) \to L(H) \text{ is continuous at } o, \tag{6.3.18} \]
\[ F(o) = 0, \qquad F'(o) = o. \tag{6.3.19} \]
Then $(\lambda_0, o)$ where $\lambda_0 \ne 0$ is a bifurcation point of
\[ \lambda x - F'(x) = o \tag{6.3.20} \]
if and only if $\lambda_0$ is an eigenvalue of the operator $A := F''(o)$.

Remark 6.3.27. Note that the equation (6.3.20) is a special case of the equation
\[ o = \lambda x - Ax + G(\lambda, x) \]
from Theorem 5.2.23. Indeed, the left-hand side of (6.3.20) can be written as
\[ \lambda x - F''(o) x + [F''(o) x - F'(x)] \]
where $F''(o)$ is a compact linear operator (see Proposition 5.2.21), and
\[ F''(o) x - F'(x) = o(\|x\|), \qquad \|x\| \to 0. \]

Note first that the implication

if $\lambda_0 \ne 0$ and $(\lambda_0, o)$ is a bifurcation point of (6.3.20), then $\lambda_0$ is an eigenvalue of $A$,

follows from Exercise 5.2.25. So we will concentrate on the proof of the reversed implication. Roughly speaking, we know that the "linearization of (6.3.20)", i.e., the equation
\[ (\lambda I - F''(o)) x = o \]
has a nontrivial solution, and we want to show that there is also a nontrivial solution of the "close" but nonlinear equation (6.3.20). The basic idea of the proof consists in the fact that (6.3.20) is a necessary condition for $x$ to be a critical point of $F$ subject to the sphere
\[ \partial B(o; r) := \Big\{ x \in H : J(x) = \frac12 r^2 \Big\} \quad \text{where} \quad J(x) = \frac12 \|x\|^2. \]
Here we use the fact that the identity is the differential of the functional $J$, and the Lagrange Multiplier Method. Later we will prove the existence of a sufficiently large number of critical points of $F$ on $\partial B(o; r)$. If we restrict ourselves to spheres with sufficiently small radii ($B(o; r) \subset U(o)$ at least), we get critical points converging to zero. The last part of the proof consists in showing that the corresponding Lagrange multipliers can be chosen close to $\lambda_0$.


Let us assume that $\lambda_0 \ne 0$ is an eigenvalue of the operator $A$. The assumption (6.3.18) guarantees that $F''(o)$ is a linear self-adjoint operator (see Proposition 3.2.28). We can assume, without loss of generality, that $\lambda_0 > 0$. Let us start with a geometrical interpretation of the points $x \in \partial B(o; r)$ such that
\[ \lambda x = F'(x). \tag{6.3.21} \]
In this case the differential $F'(x)$ is perpendicular (recall that $F'(x) \in H$ in our interpretation) to the sphere $\partial B(o; r)$ at $x$. Then $x$ can be looked for as a limit of those points of the sphere $\partial B(o; r)$ at which the tangent projections (see (6.3.22) below and Figure 6.3.1) of $F'(x)$ converge to zero. More precisely, we have

Figure 6.3.1. (Sketch: the vector $F'(z)$ at a point $z$, its component $P(z) = \frac{(F'(z), z)}{(z, z)} z$ along $z$, and its projection $D(z)$ onto the hyperplane $\{y : (z, y) = 0\}$.)

Lemma 6.3.28. For $z \in H$, $z \ne o$, set
\[ D(z) = F'(z) - \frac{(F'(z), z)}{(z, z)} z \tag{6.3.22} \]
($D(z)$ is the orthogonal projection of $F'(z)$ to the tangent space of $\partial B(o; \|z\|)$ at $z$$^{38}$). Let $y_n \in \partial B(o; r)$, $y_n \rightharpoonup x_0$, and let $F'$ be continuous, and
\[ \lim_{n\to\infty} F'(y_n) = y \ne o, \qquad \lim_{n\to\infty} D(y_n) = o.^{39} \tag{6.3.23} \]
Then $y_n \to x_0$, $y = F'(x_0)$, $x_0 \ne o$, and
\[ \lambda x_0 - F'(x_0) = o \quad \text{where} \quad \lambda = \frac{1}{r^2} (F'(x_0), x_0). \tag{6.3.24} \]

Proof. From the weak convergence $y_n \rightharpoonup x_0$ and from (6.3.23) we obtain
\[ (F'(y_n), y_n) \to (y, x_0) \quad \text{and hence} \quad \frac{(F'(y_n), y_n)}{r^2} y_n \rightharpoonup \frac{(y, x_0)}{r^2} x_0. \]

38 This tangent space is equal to $\{x \in H : (x, z) = 0\}$; see Remark 4.3.40.
39 Both limits are considered with respect to the norm in $H$.


At the same time, from the definition of $D(y_n)$ and (6.3.23) we have
\[ \frac{(F'(y_n), y_n)}{r^2} y_n = F'(y_n) - D(y_n) \to y. \]
Hence
\[ y = \frac{1}{r^2} (y, x_0) x_0. \]
Since $y \ne o$, we have $x_0 \ne o$ and also $(y, x_0) \ne 0$. The definition of $D(y_n)$ and the fact that $D(y_n) \to o$ yield
\[ y_n = r^2 \frac{F'(y_n) - D(y_n)}{(F'(y_n), y_n)} \to r^2 \frac{y}{(y, x_0)} = x_0. \]
Continuity of $F'$ at $x_0$ then implies
\[ y = F'(x_0), \quad \text{i.e.,} \quad F'(x_0) = \frac{(y, x_0)}{r^2} x_0 = \frac{(F'(x_0), x_0)}{r^2} x_0. \]

We will look for a curve on the sphere $\partial B(o; r)$ which starts at a fixed point $x$; the values of $F$ along this curve do not decrease, and after a finite time (even if large) we "almost" reach the critical point of $F$. In other words, we are looking for a curve $k = k(t, x)$, $t \in [0, \infty)$, $x \in \partial B(o; r)$ such that
\[ k(0, x) = x, \tag{6.3.25} \]
and for all $t \in (0, \infty)$ we require
\[ k(t, x) \in \partial B(o; r), \quad \text{i.e.,} \quad \|k(t, x)\|^2 = r^2. \]
The last relation implies
\[ \frac{d}{dt} \|k(t, x)\|^2 = 0, \]
which is equivalent to
\[ \Big( \frac{d}{dt} k(t, x), k(t, x) \Big) = 0 \quad \text{for all } t \in (0, \infty). \tag{6.3.26} \]
The equality (6.3.26) states that for all $t \in (0, \infty)$ the element $\frac{d}{dt} k(t, x)$ is perpendicular to $k(t, x)$. This will be satisfied if we look for a solution of the initial value problem
\[ \begin{cases} \dfrac{d}{dt} k(t, x) = D(k(t, x)), & t \in (0, \infty), \\ k(0, x) = x. \end{cases} \tag{6.3.27} \]

The assumption (6.3.18) implies that $F'$ is Lipschitz continuous in a neighborhood of $o$. Hence, for $r > 0$ sufficiently small, $D$ is Lipschitz continuous. Then, by virtue of Corollary 3.1.6, there exists a unique solution of (6.3.27) which is defined on the whole interval $(0, \infty)$. It follows from Remark 3.1.7 that this solution depends continuously on the initial condition $x \in \partial B(o; r)$.


Let $k$ be a solution of the initial value problem (6.3.27). Then it has the following important properties:

(i) For any $t \in (0, \infty)$ we have $\|k(t, x)\| = \|x\|$.

(ii) For any $t \in (0, \infty)$ we have
\[ \frac{d}{dt} F(k(t, x)) = (F'(k(t, x)), D(k(t, x))) = \|D(k(t, x))\|^2 \ge 0. \]
In other words, the values of the functional $F$ do not decrease along $k$ regardless of the choice of $x \in \partial B(o; r)$.

(iii) For any $t \in (0, \infty)$ we have
\[ F(k(t, x)) = F(x) + \int_0^t \|D(k(\tau, x))\|^2\, d\tau. \]
Since $F$ is bounded on $\partial B(o; r)$ (by the Mean Value Theorem and (6.3.19)), there exists a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ such that
\[ \lim_{i\to\infty} D(k(t_i, x)) = o. \]

(iv) Since $\{k(t_i, x)\}_{i=1}^{\infty}$ is bounded, we can select a weakly convergent subsequence.$^{40}$
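Properties (i)–(iii) can be observed on a toy model. The sketch below is an illustration under assumptions that are not in the text: $H = \mathbb{R}^2$, $F(x) = \frac12 (Ax, x)$ with $A = \operatorname{diag}(2, 1)$ (so $F'(x) = Ax$ and $F''(o) = A$), radius $r = 0.5$, and a classical fourth-order Runge–Kutta step in place of the exact solution of (6.3.27).

```python
import math

A_DIAG = (2.0, 1.0)   # eigenvalues of the assumed model operator A on R^2

def F(z):
    return 0.5 * sum(a * x * x for a, x in zip(A_DIAG, z))

def grad_F(z):
    return [a * x for a, x in zip(A_DIAG, z)]

def D(z):
    """Tangential part of F'(z) on the sphere through z, cf. (6.3.22)."""
    g = grad_F(z)
    coef = sum(gi * zi for gi, zi in zip(g, z)) / sum(zi * zi for zi in z)
    return [gi - coef * zi for gi, zi in zip(g, z)]

def rk4_step(z, h):
    """One classical Runge-Kutta step for dk/dt = D(k)."""
    def shifted(base, direction, s):
        return [b + s * d for b, d in zip(base, direction)]
    k1 = D(z)
    k2 = D(shifted(z, k1, 0.5 * h))
    k3 = D(shifted(z, k2, 0.5 * h))
    k4 = D(shifted(z, k3, h))
    return [zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

r = 0.5
z = [r * math.cos(1.0), r * math.sin(1.0)]   # initial point x on the sphere
values = [F(z)]
for _ in range(4000):                        # integrate up to t = 40
    z = rk4_step(z, 0.01)
    values.append(F(z))

norm_end = math.hypot(z[0], z[1])                          # property (i)
lam = sum(g * x for g, x in zip(grad_F(z), z)) / r ** 2    # cf. (6.3.24)
residual = math.hypot(*D(z))                               # D(k(t, x)) -> o
print(norm_end, lam, residual)
```

In this model the trajectory stays on the sphere up to discretization error, $F$ does not decrease along it, and the multiplier computed as in (6.3.24) approaches the eigenvalue 2 of $A$.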

Summarizing, we have

Lemma 6.3.29. For any $x \in \partial B(o; r)$ there exist a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ and $x_0 \in H$ such that
\[ k(t_i, x) \rightharpoonup x_0, \tag{6.3.28} \]
\[ D(k(t_i, x)) \to o, \tag{6.3.29} \]
\[ \{F(k(t_i, x))\}_{i=1}^{\infty} \text{ is an increasing sequence.} \tag{6.3.30} \]

It follows from (6.3.28) and (6.3.17) that $F'(k(t_i, x)) \to y$. If we prove that $y \ne o$, then the assumptions of Lemma 6.3.28 are verified with $y_n = k(t_n, x)$, and so the existence of a solution $x_0$ of (6.3.20) with $\lambda$ described by (6.3.24) will be proved. By an appropriate choice of the initial condition $x \in \partial B(o; r)$, we show that the above convergence takes place and that $\lambda$ given by (6.3.24) is sufficiently close to $\lambda_0$.

Recall that $A = F''(o)$ is a compact linear self-adjoint operator in the Hilbert space $H$ (see Proposition 5.2.21). Its spectrum consists of a countable set of real eigenvalues with one possible limit point $\lambda = 0$. We split the set of all eigenvalues into the parts $\lambda \ge \lambda_0$ and $\lambda < \lambda_0$, respectively. We denote by $H_1$ and $H_2$, respectively, the corresponding closed linear subspaces generated by the eigenvectors (see Theorem 2.2.16). Note that $\lambda_0 > 0$ implies that $\dim H_1 < \infty$. The eigenspace associated with $\lambda_0$ will be denoted by $H_0$. Let $P_1$, $P_2$ be the orthogonal projections of $H$ onto $H_1$, $H_2$, respectively (see Figure 6.3.2).

40 The reader is invited to justify (i)–(iv).


Figure 6.3.2. (Sketch: the subspaces $H_1$ and $H_2$ with the projections $P_1$, $P_2$ and the origin $o$; $H_0 \subset H_1$.)

Let us denote $S_1 = \{x \in H_1 : \|x\| = r\}$.

Lemma 6.3.30. There exists $r_0 > 0$ such that $\partial B(o; r_0) \subset U(o)$ (see (6.3.16)), and for all $0 < r < r_0$ we have
(i) there is no $t \in [0, \infty)$ for which the set $k(t, S_1)$ is contractible to a point (see Definition 6.3.20) in $R = \{x \in H : P_1 x \ne o\}$,
(ii) for any $t \in [0, \infty)$ there exists $x_t \in S_1$ such that
\[ P_1 k(t, x_t) \in H_0, \quad \text{i.e.,} \quad k(t, x_t) \in H_0 \oplus H_2. \]

Proof. Lemma 6.3.23 and (i) imply (ii) (see Exercise 6.3.33). Hence we prove only (i). According to Lemma 6.3.21 it is sufficient to prove that for any $t$ the set $S_1$ is contractible into $k(t, S_1)$ in $R$. Indeed, according to Lemma 6.3.22 the set $S_1$ is not contractible to a point in $R$. Since $k$ is a continuous function of both variables, it is sufficient to prove that it assumes only values from $R$: we want to prove that
\[ P_1 k(t, x) \ne o \qquad \forall t \in [0, \infty),\ x \in S_1. \]
We have
\[ F(k(0, x)) = F(x) \ge \frac12 (F''(o) x, x) - \varepsilon(\|x\|) \|x\|^2 \ge \Big( \frac12 \lambda_0 - \varepsilon(\|x\|) \Big) \|x\|^2 \tag{6.3.31} \]
where $\varepsilon(r) \to 0$ as $r \to 0$ (see (6.3.19) and Proposition 3.2.27). Note that the last inequality holds due to $x \in H_1$. Since $F(k(t, x))$ is increasing in $t$, we conclude from this that
\[ F(k(t, x)) \ge \Big( \frac12 \lambda_0 - \varepsilon(r) \Big) r^2. \tag{6.3.32} \]
On the other hand, we have an estimate from above (we write $k$ instead of $k(t, x)$ for the sake of brevity):
\[ \begin{aligned} F(k) &= \frac12 (F''(o) k, k) + F(k) - \frac12 (F''(o) k, k) \\ &\le \frac12 (F''(o) P_1 k, P_1 k) + \frac12 (F''(o) P_2 k, P_2 k) + \varepsilon(\|k\|) \|k\|^2 \end{aligned} \]
(note that $(F''(o) P_1 k, P_2 k) = 0$ due to $H_1 \perp H_2$).

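Denote
\[ \mu = \max\{\lambda : \lambda \in \sigma(F''(o))\}, \qquad \nu = \sup\{\lambda \in \sigma(F''(o)) : \lambda < \lambda_0\}. \]
Then
\[ F(k) \le \frac{\mu}{2} \|P_1 k\|^2 + \frac{\nu}{2} \|P_2 k\|^2 + \varepsilon(\|k\|) \|k\|^2 = \frac{\nu}{2} \|k\|^2 + \frac{\mu - \nu}{2} \|P_1 k\|^2 + \varepsilon(\|k\|) \|k\|^2.^{41} \]
Hence, due to the fact that $\|k\| = r$, we have
\[ F(k) \le \frac{\nu}{2} r^2 + \frac{\mu - \nu}{2} \|P_1 k\|^2 + \varepsilon(r) r^2. \tag{6.3.33} \]
It follows from (6.3.32) and (6.3.33) that
\[ \|P_1 k(t, x)\|^2 \ge \frac{\lambda_0 - \nu}{\mu - \nu} r^2 - \frac{4}{\mu - \nu} \varepsilon(r) r^2. \]
This implies the existence of $r_0$ such that
\[ \|P_1 k(t, x)\|^2 \ge a r^2 \quad \text{for any } r \le r_0 \quad \text{where } a = a(r_0) > 0. \tag{6.3.34} \]
This completes the proof of Lemma 6.3.30.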

Proof of Theorem 6.3.26.

Step 1. Let $t_n \to \infty$ be an arbitrary sequence of positive numbers. Let $x_n$ be a point from $S_1$ for which $P_1 k(t_n, x_n) \in H_0$ (its existence follows from (ii) of Lemma 6.3.30). Since $S_1$ is compact, we can select a strongly convergent subsequence (denoted again by $\{x_n\}_{n=1}^{\infty}$) such that
\[ \lim_{n\to\infty} x_n = \tilde x. \tag{6.3.35} \]

Step 2. It follows from Lemma 6.3.29 that there is a sequence $\{\tau_i\}_{i=1}^{\infty}$ such that
\[ k(\tau_i, \tilde x) = y_i \rightharpoonup x_0 \quad \text{in } H, \]
and at the same time also $D(y_i) \to o$.

Step 3. The compactness of $F'$ implies that (passing again to a subsequence if necessary) there exists $y \in H$ such that
\[ \lim_{i\to\infty} F'(y_i) = y. \]
We show that $y \ne o$. Indeed, we have
\[ (F'(y_i), P_1 y_i) \to (y, P_1 x_0). \]
Also, for all $i \in \mathbb{N}$, we have the estimate
\[ \begin{aligned} (F'(y_i), P_1 y_i) &= (F''(o) y_i, P_1 y_i) + (F'(y_i) - F''(o) y_i, P_1 y_i) \\ &\ge \lambda_0 \|P_1 y_i\|^2 - \varepsilon(\|y_i\|) \|y_i\|^2 \ge \frac12 \lambda_0 a r^2 \end{aligned} \]
for all $r$ small enough due to (6.3.34). This immediately implies
\[ (y, P_1 x_0) \ne 0, \quad \text{and so} \quad y \ne o,\ x_0 \ne o. \]

41 We use the identity $\|P_1 k\|^2 + \|P_2 k\|^2 = \|k\|^2$.


Step 4. We have just verified the assumptions of Lemma 6.3.28. Hence $y_i \to x_0$ in $H$, and $x_0$ solves (6.3.20) with $\lambda$ given by (6.3.24):
\[ \lambda x_0 - F'(x_0) = o, \qquad \lambda = \frac{1}{r^2} (F'(x_0), x_0). \]

Step 5. The last step consists in proving the fact that for $r > 0$ small enough $\lambda$ is arbitrarily close to $\lambda_0$. Let us estimate
\[ \begin{aligned} |\lambda - \lambda_0| &= \frac{1}{r^2} |(F'(x_0), x_0) - \lambda_0 (x_0, x_0)| \\ &\le \frac{1}{r^2} |(F'(x_0) - F''(o) x_0, x_0)| + \frac{2}{r^2} \Big| \frac12 (F''(o) x_0, x_0) - F(x_0) \Big| + \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)| \\ &= \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)| + \varepsilon(r). \end{aligned} \]
Since $\varepsilon(r) \to 0$ as $r \to 0$, it suffices to estimate
\[ \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)|. \]
The continuity of $F$ implies
\[ F(x_0) = \lim_{i\to\infty} F(y_i). \tag{6.3.36} \]
Since $F$ is increasing along $k$, we also have
\[ F(y_i) = F(k(\tau_i, \tilde x)) \ge F(\tilde x). \tag{6.3.37} \]
Since $\tilde x \in S_1$,
\[ \frac12 (F''(o) \tilde x, \tilde x) \ge \frac{\lambda_0}{2} r^2. \tag{6.3.38} \]
Then (6.3.36)–(6.3.38) imply an estimate from below:
\[ F(x_0) \ge \Big( \frac{\lambda_0}{2} - \varepsilon(r) \Big) r^2. \tag{6.3.39} \]

Now we derive an estimate from above for $F(x_0)$. Since $x_n \to \tilde x$ and $k(\tau_i, \cdot)$ is continuous with respect to the second variable, for fixed $i \in \mathbb{N}$ we have
\[ k(\tau_i, x_n) \to k(\tau_i, \tilde x) = y_i. \]
The continuity of $F$ implies that for fixed $i \in \mathbb{N}$ and $r > 0$ there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ we have
\[ F(y_i) \le F(k(\tau_i, x_n)) + r^3. \tag{6.3.40} \]
However, for any fixed $i \in \mathbb{N}$ we find $n_i \ge n_0$ such that $t_{n_i} > \tau_i$, and the monotonicity of $F$ along $k$ then implies
\[ F(k(\tau_i, x_{n_i})) \le F(k(t_{n_i}, x_{n_i})). \tag{6.3.41} \]
The choice of $x_n$ from Step 1 guarantees that $k(t_{n_i}, x_{n_i}) \in H_0 \oplus H_2$, and so (writing $k_i$ instead of $k(t_{n_i}, x_{n_i})$) we have the estimate
\[ F(k_i) \le \frac12 (F''(o) k_i, k_i) + \varepsilon(\|k_i\|) \|k_i\|^2 \le \Big( \frac{\lambda_0}{2} + \varepsilon(r) \Big) r^2. \tag{6.3.42} \]


However, (6.3.36), (6.3.40) and (6.3.41) reduce (6.3.42) to
\[ F(x_0) \le \Big( \frac{\lambda_0}{2} + \varepsilon(r) \Big) r^2. \tag{6.3.43} \]
Both the estimates (6.3.39) and (6.3.43) yield that
\[ \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)| \to 0 \quad \text{as} \quad r \to 0. \]
This completes the proof of Theorem 6.3.26.

Remark 6.3.31. It follows from the Krasnoselski Potential Bifurcation Theorem that every point $(\lambda_0, o)$ where $\lambda_0$ is a nonzero eigenvalue of the operator $A$ is a bifurcation point. But there is no guarantee that there is a curve (or continuum) of nontrivial solutions which departs from $(\lambda_0, o)$. In fact, there are counterexamples even in finite dimension which prove that such a curve need not exist. Böhme [13] gave an example of a real function of two independent real variables, $F \in C^{\infty}(\mathbb{R}^2)$, for which $(\lambda_0, (0, 0))$ is a bifurcation point of
\[ f(z, \lambda) = \lambda z - F'(z) = o, \qquad z = (x, y) \in \mathbb{R} \times \mathbb{R}, \quad \lambda \in \mathbb{R} \tag{6.3.44} \]
and there is no continuous curve of nontrivial solutions of (6.3.44) which contains the point $(\lambda_0, (0, 0))$.

Example 6.3.32 (Application of the Krasnoselski Potential Bifurcation Theorem). We will consider a periodic problem similar to that studied in Example 4.3.25:
\[ \begin{cases} \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \end{cases} \tag{6.3.45} \]
The difference between (6.3.45) and (4.3.12) consists in the fact that now we do not allow $g$ to depend on $\dot x$. The reason for this restriction is that the boundary value problem (4.3.12) cannot be written in the form (6.3.51) if $g$ depends on $\dot x$. We simplify the situation even more and write $g$ in the form
\[ g(\lambda, t, s) = (\lambda + 1) \tilde g(t, s). \]
Set
\[ \tilde G(t, s) = \int_0^s \tilde g(t, \tau)\, d\tau = \int_0^1 \tilde g(t, s\sigma) s\, d\sigma, \]
i.e., $\tilde G$ is the primitive of $\tilde g$ with respect to the second variable $s$. Put
\[ F(x) = \int_0^{2\pi} \Big[ \frac12 |x(t)|^2 + \tilde G(t, x(t)) \Big]\, dt. \tag{6.3.46} \]
We work in the Hilbert space
\[ H := \{x \in W^{1,2}(0, 2\pi) : x(0) = x(2\pi)\} \tag{6.3.47} \]


with the scalar product on $H$ given by
\[ (x, y) = \int_0^{2\pi} [\dot x(t) \dot y(t) + x(t) y(t)]\, dt, \qquad x, y \in H. \]
Then
\[ (F'(x), y) = \int_0^{2\pi} [x(t) y(t) + \tilde g(t, x(t)) y(t)]\, dt \quad \text{for any } x, y \in H. \tag{6.3.48} \]
A weak solution of the periodic problem is a function $x \in H$ which satisfies the integral identity
\[ \int_0^{2\pi} [\dot x(t) \dot y(t) - \lambda x(t) y(t) - (\lambda + 1) \tilde g(t, x(t)) y(t)]\, dt = 0 \tag{6.3.49} \]
for any $y \in H$. The last equality (6.3.49) can be written as
\[ \int_0^{2\pi} [\dot x(t) \dot y(t) + x(t) y(t) - (\lambda + 1) x(t) y(t) - (\lambda + 1) \tilde g(t, x(t)) y(t)]\, dt = 0. \tag{6.3.50} \]
The integral identity (6.3.50) can be written for $\lambda \ne -1$ as the operator equation
\[ \mu x - F'(x) = o \quad \text{where} \quad \mu = \frac{1}{\lambda + 1}. \tag{6.3.51} \]
Let us define an operator $B : H \to H$ by
\[ (B(x), y)_H = \int_0^{2\pi} x(t) y(t)\, dt. \]
It follows easily that $B$ is a bounded linear operator and the compact embedding $H \subset\subset Y := \{x \in C[0, 2\pi] : x(0) = x(2\pi)\}$ (see Theorem 1.2.28) yields that $B$ is compact. Since $n^2$ is an eigenvalue of
\[ \begin{cases} \ddot x(t) + \lambda x(t) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi), \end{cases} \]
$\mu = \frac{1}{n^2 + 1}$ is an eigenvalue of $B$. We make the following assumptions:
\[ \tilde g,\ \frac{\partial \tilde g}{\partial s} : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \text{ are continuous functions,} \tag{6.3.52} \]
\[ \tilde g(t, 0) = 0, \qquad \frac{\partial \tilde g}{\partial s}(t, 0) = 0 \quad \text{for all } t \in \mathbb{R}. \tag{6.3.53} \]
Now we prove that $F$ verifies the assumptions of Theorem 6.3.26. Note that $F$ can be written as
\[ F(x) = \frac12 \int_0^{2\pi} |x(t)|^2\, dt + \int_0^{2\pi} \int_0^1 \tilde g(t, s x(t)) x(t)\, ds\, dt. \tag{6.3.54} \]
(i) $F(o) = 0$ is an immediate consequence of (6.3.53).
(ii) Differentiability of $F$ follows directly from (6.3.54).
(iii) Compactness of $F'(x)$. This is a consequence of the compactness of the embedding $H \subset\subset Y$ (cf. Exercise 6.3.35).
(iv) $F'(o) = o$ is a consequence of (6.3.53).
(v) $F''(o) = B$ and $F''$ is continuous at $o$ (cf. Exercise 6.3.35).


Theorem 6.3.26 now implies that every point $\big( \frac{1}{n^2 + 1}, o \big)$ is a bifurcation point of the equation $\mu x - F'(x) = o$. In other words, for any $n = 0, 1, \dots$ we have the following assertion: Under the assumptions (6.3.52) and (6.3.53), for an arbitrarily small neighborhood $U$ of the point $(n^2, o) \in \mathbb{R} \times H$ there exists $(\lambda, x) \in U$ such that $x \ne o$ is a weak solution of the periodic problem
\[ \begin{cases} \ddot x(t) + \lambda x(t) + (\lambda + 1) \tilde g(t, x(t)) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \end{cases} \]
Note that the continuity of $\tilde g$ and the regularity argument imply that every such nontrivial solution satisfies $x \in C^2[0, 2\pi]$ and $\dot x(0) = \dot x(2\pi)$.

Exercise 6.3.33. Prove that Lemma 6.3.23 and Lemma 6.3.30(i) imply the statement of Lemma 6.3.30(ii).
Hint. Argue by contradiction.

Exercise 6.3.34. Prove that $H$ defined in Example 6.3.32 by (6.3.47) is a closed subspace of $W^{1,2}(0, 2\pi)$, i.e., $H$ is a Hilbert space.

Exercise 6.3.35. Prove that $F$ from Example 6.3.32 is twice Fréchet differentiable, $F'$ is compact and that $F''$ is continuous at $o$.

Exercise 6.3.36. Apply Theorem 6.3.26 to the Dirichlet and the Neumann boundary value problem.

6.4 Mountain Pass Theorem

One of the most efficient tools to prove that a given functional having a local extremum at a point possesses another critical point is the Mountain Pass Theorem. In order to motivate the main ideas of this section we will consider a real function of two real independent variables
\[ F : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \]
which is continuously differentiable and satisfies the following condition: There exist $r > 0$, $e \in \mathbb{R}^2$, $\|e\| > r$ such that
\[ \inf_{\|x\| = r} F(x) > F(o) \ge F(e). \tag{6.4.1} \]
The graph of such a function is sketched in Figure 6.4.1. The Extreme Value Theorem and the first inequality in (6.4.1) immediately imply that $F$ has a local minimum and thus a critical point in the set $\{x \in \mathbb{R}^2 : \|x\| < r\}$.


Figure 6.4.1. (Sketch of the graph of $F$ with the points $(o, F(o))$ and $(e, F(e))$ marked.)

A hiker's experience suggests the idea that $F$ should have another critical point different from that local minimum. Indeed, if the values of $F$ are interpreted as mountains on the plastic map, then the valley (containing the origin) is surrounded by mountains. At the same time the altitude of every place whose distance from the origin is equal to $r$ is greater than that of the origin itself. So, there should be an "optimal pass" through the mountain range. Practical experience even suggests how to find such a critical point. Let us consider all continuous finite paths which lie on the graph of $F$ and which connect the points $(o, F(o))$ and $(e, F(e))$. On every curve we have at least one "highest" point. It seems that if we select the "highest" point with the "lowest" altitude, we have found a critical point of $F$. If we formulate precisely the considerations made above, then the "lowest" altitude of the "highest" points corresponds to the value
\[ c := \inf_{\gamma \in \Gamma} \max_{t \in [0, 1]} F(\gamma(t)) \tag{6.4.2} \]
where
\[ \Gamma = \{\gamma \in C([0, 1], \mathbb{R}^2) : \gamma(0) = o,\ \gamma(1) = e\}. \]
If $c$ is a critical value of $F$, then there exists $x_c \in \mathbb{R}^2$ such that
\[ F(x_c) = c \quad \text{and} \quad F'(x_c) = o. \]
However, the value $c$ defined above need not be a critical value of $F$! An example which illustrates this phenomenon is rather elementary.


Example 6.4.1 (Brézis–Nirenberg). Let
\[ F(x, y) = x^2 + (1 - x)^3 y^2 \quad \text{and} \quad r = \frac12, \quad e = (2, 2). \]
Then
\[ F(o) = F(e) = 0, \qquad \inf_{\|(x, y)\| = r} F(x, y) = \min_{\|(x, y)\| = r} F(x, y) > 0, \]
and so the value $c$ defined by (6.4.2) is positive. Since
\[ \frac{\partial F}{\partial x}(x, y) = 2x - 3(1 - x)^2 y^2, \qquad \frac{\partial F}{\partial y}(x, y) = 2(1 - x)^3 y, \]
the origin is the only critical point of $F$ and obviously $F(o) < c$. The reader is invited to sketch the level sets of the function $F$.

It is natural to ask why this happens. Such a situation corresponds, roughly speaking, to the fact that the altitude of the "highest" points approaches the value of $c$ but the distance of these points from the origin diverges to infinity. More precisely, if $x_n \in \mathbb{R}^2$ are such that
\[ F(x_n) = \max_{t \in [0, 1]} F(\gamma_n(t)) \quad \text{for } \gamma_n \in \Gamma \quad \text{and} \quad F(x_n) \to c, \tag{6.4.3} \]
then
\[ \|x_n\| \to \infty. \tag{6.4.4} \]
It follows that the existence of $r > 0$ and $e \in \mathbb{R}^2$ satisfying (6.4.1) is not sufficient to guarantee the existence of a critical point which is different from the local minimum in $\{x \in \mathbb{R}^2 : \|x\| \le r\}$. On the other hand, we will prove later that (6.4.1) guarantees the existence of a sequence $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^2$ such that
\[ F(x_n) \to c \quad \text{and} \quad \nabla F(x_n) \to o. \tag{6.4.5} \]
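For the function $F$ of Example 6.4.1 a sequence with vanishing gradient and unbounded norm can be written down explicitly. The sketch below uses a choice made only for this illustration: the points $x = 1 - 1/n$ with $y$ determined by $\partial F/\partial x = 0$, i.e. $y = \sqrt{2x/3}\,/(1 - x)$. Note that $F(1, y) = 1$ for every $y$, so every path from $o$ to $e = (2, 2)$ reaches the level 1 somewhere; the computed values suggest that this level is approached with vanishing gradient only at infinity.

```python
import math

def F(x, y):
    return x ** 2 + (1.0 - x) ** 3 * y ** 2

def grad_F(x, y):
    return (2.0 * x - 3.0 * (1.0 - x) ** 2 * y ** 2,
            2.0 * (1.0 - x) ** 3 * y)

def ps_point(n):
    """Point with x = 1 - 1/n on the curve dF/dx = 0 (a hypothetical choice
    made for this sketch, not a construction from the text)."""
    x = 1.0 - 1.0 / n
    y = math.sqrt(2.0 * x / 3.0) / (1.0 - x)
    return x, y

for n in (10, 100, 1000, 10000):
    x, y = ps_point(n)
    gx, gy = grad_F(x, y)
    # columns: n, F(x_n), |grad F(x_n)|, |x_n|
    print(n, F(x, y), math.hypot(gx, gy), math.hypot(x, y))
```

Along this sequence the values of $F$ tend to 1 and the gradient tends to zero while the points escape to infinity, which is the loss of compactness described in (6.4.3)–(6.4.5).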

Now, let us assume for a moment that $F$ satisfies the following condition:

$(\widetilde{\mathrm{PS}})$ Let $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^2$ be such that $\{F(x_n)\}_{n=1}^{\infty}$ is a bounded sequence in $\mathbb{R}$ and $\nabla F(x_n) \to o$. Then $\{x_n\}_{n=1}^{\infty}$ is a bounded sequence in $\mathbb{R}^2$.

Then the situation described in (6.4.3) and (6.4.4) cannot occur. Moreover, (6.4.5) together with $(\widetilde{\mathrm{PS}})$ already implies that $c$ is a critical value of $F$. Indeed, let $\{x_n\}_{n=1}^{\infty}$ satisfy (6.4.5). According to $(\widetilde{\mathrm{PS}})$ there exists a subsequence $\{x_{n_k}\}_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ such that $x_{n_k} \to x_c$. The continuous differentiability of $F$ and (6.4.5) imply that $\nabla F(x_c) = o$.


Let us consider a more general situation, namely
\[ F : H \to \mathbb{R} \]
where $H$ is a real Hilbert space with a scalar product $(\cdot, \cdot)$ and the induced norm $\|\cdot\|$. In order to simplify the proofs we will also require more smoothness of $F$, i.e., let $F \in C^2(H, \mathbb{R})$. For the sake of brevity we will denote
\[ F^d := F^{-1}((-\infty, d]). \]
The key assertion of this section is the following Quantitative Deformation Lemma. The reader should have in mind that it can be proved under more general assumptions ($H$ a Banach space, $F \in C^1(H, \mathbb{R})$); see also footnotes 44, 45 and 46 on pages 430–432.

Lemma 6.4.2 (Quantitative Deformation Lemma). Let $H$ be a real Hilbert space and $F$ a $C^2$-functional, $c \in \mathbb{R}$, $\varepsilon > 0$. Assume that
\[ \|\nabla F(u)\| \ge 2\varepsilon \quad \text{for any } u \in F^{-1}([c - 2\varepsilon, c + 2\varepsilon]). \]
Then there exists $\eta \in C(H, H)$ such that
(i) $\eta(u) = u$ for any $u \notin F^{-1}([c - 2\varepsilon, c + 2\varepsilon])$,
(ii) $\eta(F^{c+\varepsilon}) \subset F^{c-\varepsilon}$.

Proof. Let us introduce closed sets
\[ A = F^{-1}([c - 2\varepsilon, c + 2\varepsilon]), \qquad B = F^{-1}([c - \varepsilon, c + \varepsilon]) \]
(see Figure 6.4.2) and a functional
\[ \psi(u) = \frac{\operatorname{dist}(u, H \setminus A)}{\operatorname{dist}(u, H \setminus A) + \operatorname{dist}(u, B)}.^{42} \]

Figure 6.4.2. (Sketch: the sets $B \subset A \subset H$.)

42 Recall that $\operatorname{dist}(u, C) := \inf\{\|u - v\|_X : v \in C\}$ for $u \in X$, $C \subset X$.


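Then

$$\psi = \begin{cases} 1 & \text{on } B, \\ 0 & \text{on } H \setminus A, \end{cases} \qquad 0 \le \psi \le 1,$$

and $\psi$ is locally Lipschitz continuous.$^{43}$ Let us define a vector field

$$f(u) = \begin{cases} -\psi(u)\, \dfrac{\nabla F(u)}{\|\nabla F(u)\|^2} & \text{for } u \in A, \\[2mm] o & \text{for } u \in H \setminus A. \end{cases}$$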

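Then $f$ is also locally Lipschitz continuous$^{44}$ and for any $u \in H$ we have

$$\|f(u)\| \le \frac{1}{2\varepsilon}.$$

Indeed, for $u \in H \setminus A$ we have $f(u) = o$ and for $u \in A$ we have

$$\|f(u)\| \le |\psi(u)|\, \frac{\|\nabla F(u)\|}{\|\nabla F(u)\|^2} \le \frac{1}{\|\nabla F(u)\|} \le \frac{1}{2\varepsilon}.$$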

Consider the Cauchy problem

$$\dot\sigma = f(\sigma), \qquad \sigma(0) = u. \tag{6.4.6}$$

It follows from Corollary 3.1.6 that (6.4.6) has a unique solution, denoted by σ(·, u), which is deﬁned on R for any u ∈ H, and for any t > 0, σ(t, ·) : H → H is continuous (continuous dependence on the initial data – see Remark 3.1.7). Let us deﬁne η(u) = σ(2ε, u), u ∈ H. We will prove that η satisﬁes (i) and (ii). Property (i) follows from the fact that f (u) = 0 for u ∈ H \ A. Let us prove that (ii) is also satisﬁed. Since

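$$\frac{d}{dt} F(\sigma(t, u)) = \Bigl(\nabla F(\sigma(t, u)), \frac{d}{dt}\sigma(t, u)\Bigr) = \bigl(\nabla F(\sigma(t, u)), f(\sigma(t, u))\bigr) = -\psi(\sigma(t, u)), \tag{6.4.7}$$

the function $t \mapsto F(\sigma(t, u))$ is decreasing. Let $u \in F^{c+\varepsilon}$, i.e., $F(u) \le c + \varepsilon$. We have to show that $F(\sigma(2\varepsilon, u)) \le c - \varepsilon$. If there is $t \in [0, 2\varepsilon]$ such that $F(\sigma(t, u)) \le c - \varepsilon$, then also $F(\sigma(2\varepsilon, u)) \le c - \varepsilon$ and (ii) is satisfied. If, on the other hand,

$$c - \varepsilon < F(\sigma(t, u)) \le c + \varepsilon \quad\text{for all } t \in [0, 2\varepsilon], \quad\text{i.e.,}\quad \psi(\sigma(t, u)) = 1,$$

then we obtain from (6.4.7)

$$F(\sigma(2\varepsilon, u)) = F(u) + \int_0^{2\varepsilon} \frac{d}{dt} F(\sigma(t, u))\, dt = F(u) - \int_0^{2\varepsilon} \psi(\sigma(t, u))\, dt \le c + \varepsilon - 2\varepsilon = c - \varepsilon,$$

a contradiction, and so (ii) is also satisfied.

$^{43}$ Cf. Exercise 6.4.8.

$^{44}$ Here the assumption $F \in C^2(H, \mathbb{R})$ is essentially used (cf. Exercise 6.4.9).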
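The construction in the proof of Lemma 6.4.2 can be tried out numerically. The following is only a minimal sketch under assumptions that are not part of the text: $H = \mathbb{R}^2$, the sample functional $F(x, y) = x^2 - y^2$, the level $c = 1$ and $\varepsilon = 0.1$ (so that $\|\nabla F\| \ge 2\varepsilon$ on the band $|F - c| \le 2\varepsilon$). The distance-based function $\psi$ is replaced by a simpler cut-off depending only on the value $F(u)$; it is also locally Lipschitz, equals 1 on $B$ and vanishes outside $A$, which is all the proof uses. The flow is approximated by the classical Runge–Kutta scheme, so the inequalities hold only up to discretization error.

```python
def F(u):
    x, y = u
    return x * x - y * y

def grad_F(u):
    x, y = u
    return (2.0 * x, -2.0 * y)

def cutoff(u, c, eps):
    # Stand-in for psi (an assumption of this sketch): equals 1 where
    # |F(u) - c| <= eps and 0 where |F(u) - c| >= 2*eps, linear in between.
    return min(1.0, max(0.0, (2.0 * eps - abs(F(u) - c)) / eps))

def field(u, c, eps):
    # f(u) = -psi(u) * grad F(u) / |grad F(u)|^2, and f(u) = o where psi vanishes.
    w = cutoff(u, c, eps)
    if w == 0.0:
        return (0.0, 0.0)
    gx, gy = grad_F(u)
    n2 = gx * gx + gy * gy
    return (-w * gx / n2, -w * gy / n2)

def eta(u, c, eps, steps=2000):
    # Approximates sigma(2*eps, u) for sigma' = f(sigma), sigma(0) = u (RK4).
    h = 2.0 * eps / steps
    x, y = u
    for _ in range(steps):
        k1 = field((x, y), c, eps)
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]), c, eps)
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]), c, eps)
        k4 = field((x + h * k3[0], y + h * k3[1]), c, eps)
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return (x, y)

c, eps = 1.0, 0.1
for start in [(1.0, 0.0), (1.2, 0.6), (2.0, 0.0), (0.0, 0.0)]:
    end = eta(start, c, eps)
    print(start, F(start), "->", end, F(end))
```

Starting points with $F(u) \le c + \varepsilon$ should end at a level not exceeding $c - \varepsilon$ (property (ii)), while points outside the band $A$, such as $(2, 0)$ and $(0, 0)$, are not moved at all (property (i)).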

The Quantitative Deformation Lemma provides a tool for proving the existence of “almost critical” points of functionals which have the so-called mountain pass type geometry (see (6.4.8) below).

Proposition 6.4.3. Let $F \in C^2(H, \mathbb{R})$, $e \in H$ and $r > 0$ be such that $\|e\| > r$ and

$$b := \inf_{\|u\| = r} F(u) > F(o) \ge F(e). \tag{6.4.8}$$

Let

$$c := \inf_{\gamma \in \Gamma} \max_{t \in [0,1]} F(\gamma(t)) \quad\text{and}\quad \Gamma := \{\gamma \in C([0, 1], H) : \gamma(0) = o,\ \gamma(1) = e\}.$$

Then for each $\varepsilon > 0$ there exists $u \in H$ such that
(i) $c - 2\varepsilon \le F(u) \le c + 2\varepsilon$,
(ii) $\|\nabla F(u)\| < 2\varepsilon$.

Proof. Let $\gamma \in \Gamma$ be arbitrary. Then (6.4.8) implies

$$b \le \max_{t \in [0,1]} F(\gamma(t)), \quad\text{and so}\quad b \le c \le \max_{t \in [0,1]} F(te).$$

Without loss of generality, we can restrict ourselves to $\varepsilon$ small, satisfying

$$c - 2\varepsilon > F(o) \ge F(e). \tag{6.4.9}$$

Suppose that the conclusion of the proposition is not satisfied for an $\varepsilon > 0$, i.e., for each $u \in H$ satisfying (i) condition (ii) is violated. We apply Lemma 6.4.2 to get a contradiction. By the definition of $c$, there exists $\gamma \in \Gamma$ such that

$$\max_{t \in [0,1]} F(\gamma(t)) < c + \varepsilon. \tag{6.4.10}$$

Consider $\beta(t) = \eta(\gamma(t))$ where $\eta$ is from Lemma 6.4.2. Using Lemma 6.4.2(i), $\gamma(0) = o$, $\gamma(1) = e$, and (6.4.9) we conclude

$$\beta(0) = \eta(o) = o \quad\text{and}\quad \beta(1) = \eta(e) = e.$$

Hence $\beta \in \Gamma$, i.e., $c \le \max_{t \in [0,1]} F(\beta(t))$. It follows from Lemma 6.4.2(ii) and (6.4.10) that

$$\max_{t \in [0,1]} F(\beta(t)) \le c - \varepsilon,$$

a contradiction.


In order to prove that $c$ is a critical value, our functional $F$ has to satisfy a “compactness” condition of type $(\widetilde{\mathrm{PS}})$. However, in infinite dimensions we have to strengthen its formulation.

Definition 6.4.4. Let $F \in C^2(H, \mathbb{R})$ and $c \in \mathbb{R}$. The functional $F$ satisfies the Palais–Smale condition on the level $c$ (shortly $(\mathrm{PS})_c$) if any sequence $\{u_n\}_{n=1}^{\infty} \subset H$ such that

$$F(u_n) \to c, \qquad \nabla F(u_n) \to o \tag{6.4.11}$$

has a convergent (in the norm of $H$) subsequence.$^{45}$

Now we are ready to formulate the Mountain Pass Theorem which is the simplest and one of the most useful “variational” theorems. It is one of the most efficient tools to prove the existence of at least two critical points of a given functional (see, e.g., Example 6.4.7).

Theorem 6.4.5 (Mountain Pass Theorem). Let the assumptions of Proposition 6.4.3 be satisfied. Let $F$ satisfy $(\mathrm{PS})_c$. Then $c$ is a critical value of $F$.$^{46}$

Proof. It follows from Proposition 6.4.3 that there is a sequence $\{u_n\}_{n=1}^{\infty} \subset H$ satisfying (6.4.11). By $(\mathrm{PS})_c$ there exist $\{u_{n_k}\}_{k=1}^{\infty} \subset \{u_n\}_{n=1}^{\infty}$ and $u_0 \in H$ such that $u_{n_k} \to u_0$. But $F \in C^1(H, \mathbb{R})$ implies that

$$F(u_0) = c \quad\text{and}\quad \nabla F(u_0) = o.$$

Remark 6.4.6. Theorem 6.4.5 actually states that there exists a critical point $u_0 \ne o$ of $F$ since

$$c \ge \inf_{\|u\| = r} F(u) > F(o).$$
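To see the minimax level of Proposition 6.4.3 in the simplest possible setting, one may take $H = \mathbb{R}$. The sketch below is an illustration under assumptions not taken from the text: the sample functional $F(x) = x^2/2 - x^4/4$, with $r = 1$ and $e = 2$, so that $\inf_{|u| = 1} F(u) = 1/4 > F(0) = 0 \ge F(2) = -2$. In one dimension every continuous path from $0$ to $e$ covers the segment $[0, e]$, hence $c$ equals the maximum of $F$ on that segment; the code approximates it on a uniform grid (the names `mountain_pass_level` and `dF` are hypothetical helpers).

```python
def F(x):
    return 0.5 * x * x - 0.25 * x ** 4

def dF(x):
    return x - x ** 3

def mountain_pass_level(e, n=2000):
    # In one dimension every path from 0 to e covers [0, e], so the infimum over
    # paths equals the maximum of F on that segment (taken here over a grid).
    best_x, best_val = 0.0, F(0.0)
    for i in range(n + 1):
        x = e * i / n
        if F(x) > best_val:
            best_x, best_val = x, F(x)
    return best_x, best_val

x_star, c = mountain_pass_level(2.0)
print(x_star, c, dF(x_star))
```

The grid maximizer is a point where the derivative vanishes, in agreement with Theorem 6.4.5, and it is different from the local minimum at the origin, as noted in Remark 6.4.6.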

Example 6.4.7. Let us consider the boundary value problem

$$\begin{cases} -\ddot x(t) + \lambda x(t) = |x(t)|^{p-2} x(t), & t \in (0, \pi), \\ x(0) = x(\pi) = 0, \end{cases} \tag{6.4.12}$$

where $p > 2$ is a given real number and $\lambda \in \mathbb{R}$ is a parameter. Notice that the function identically equal to zero is a solution. We will prove that problem (6.4.12) also has a positive $C^2$-solution on $(0, \pi)$ if and only if

$$\lambda > -1. \tag{6.4.13}$$

$^{45}$ Definition 6.4.4 in a more general setting with $H$ replaced by a Banach space $X$ and $F \in C^1(X, \mathbb{R})$ is due to Brézis, Coron, Nirenberg. Cf. Corollary 6.4.19.

$^{46}$ The assertion of Theorem 6.4.5 where $H$ is replaced by a Banach space $X$ and $F \in C^1(X, \mathbb{R})$ is due to Ambrosetti, Rabinowitz, see Theorem 6.4.24.

Let us prove that (6.4.13) is a necessary condition. Let $x \in C^2[0, \pi]$ be a positive solution of (6.4.12). Multiply the equation in (6.4.12) by $\sin t$ and integrate by parts:

$$\int_0^{\pi} |x(t)|^{p-2} x(t) \sin t\, dt = \lambda \int_0^{\pi} x(t) \sin t\, dt - \int_0^{\pi} \ddot x(t) \sin t\, dt, \qquad \int_0^{\pi} \ddot x(t) \sin t\, dt = -\int_0^{\pi} x(t) \sin t\, dt.$$

Since $x > 0$ in $(0, \pi)$, combining the two identities gives $(\lambda + 1) \int_0^{\pi} x(t) \sin t\, dt = \int_0^{\pi} |x(t)|^{p-2} x(t) \sin t\, dt > 0$.

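Hence (6.4.13) follows. Next we show that (6.4.13) is also a sufficient condition. Let us define the following two functions from $\mathbb{R}$ into $\mathbb{R}$:

$$g(s) = \begin{cases} 0 & \text{for } s \le 0, \\ s^{p-1} & \text{for } s > 0, \end{cases} \qquad G(s) = \begin{cases} 0 & \text{for } s \le 0, \\ \dfrac{1}{p}\, s^p & \text{for } s > 0. \end{cases}$$

Then $G \in C^2(\mathbb{R})$ and $G'(s) = g(s)$ for all $s \in \mathbb{R}$ (remember that $p > 2$). Put $H := W_0^{1,2}(0, \pi)$ and define

$$F(x) := \frac{1}{2} \int_0^{\pi} |\dot x(t)|^2\, dt + \frac{\lambda}{2} \int_0^{\pi} |x(t)|^2\, dt - \int_0^{\pi} G(|x(t)|)\, dt, \qquad x \in H.$$

Then $F \in C^2(H, \mathbb{R})$ (see Exercise 6.4.10). We shall verify the assumptions of Theorem 6.4.5. Note that for $\lambda > -1$ the expression

$$|||x||| := \left( \int_0^{\pi} |\dot x(t)|^2\, dt + \lambda \int_0^{\pi} |x(t)|^2\, dt \right)^{\frac{1}{2}}$$

satisfies

$$c_1 \|x\| \le |||x||| \le c_2 \|x\| \quad\text{for any } x \in H \tag{6.4.14}$$

where $c_i > 0$, $i = 1, 2$, are constants independent of $x$ and

$$\|x\| = \left( \int_0^{\pi} |\dot x(t)|^2\, dt \right)^{\frac{1}{2}}$$

(cf. Exercise 6.3.18). Let us show that the functional $F$ has a mountain pass type geometry. It follows from the Sobolev Embedding Theorem (Theorem 1.2.26) that

$$\left( \int_0^{\pi} |x(t)|^p\, dt \right)^{\frac{1}{p}} \le c_p \left( \int_0^{\pi} |\dot x(t)|^2\, dt \right)^{\frac{1}{2}}. \tag{6.4.15}$$

Hence combining (6.4.14) and (6.4.15) we obtain

$$F(x) = \frac{1}{2} \int_0^{\pi} |\dot x(t)|^2\, dt + \frac{\lambda}{2} \int_0^{\pi} |x(t)|^2\, dt - \frac{1}{p} \int_0^{\pi} |x(t)|^p\, dt \ge \frac{1}{2} |||x|||^2 - \frac{1}{p} \left( \frac{c_p}{c_1} \right)^p |||x|||^p = |||x|||^2 \left[ \frac{1}{2} - \frac{1}{p} \left( \frac{c_p}{c_1} \right)^p |||x|||^{p-2} \right].$$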


So, because $p > 2$, due to (6.4.14) there exists $r > 0$ (small enough) such that

$$b = \inf_{\|x\| = r} F(x) > 0 = F(o).$$

Let $x \in H$, $x > 0$ in $(0, \pi)$. Then for $s > 0$ we have

$$\frac{1}{s^p} F(sx) = \frac{s^{2-p}}{2} \left( \int_0^{\pi} |\dot x(t)|^2\, dt + \lambda \int_0^{\pi} |x(t)|^2\, dt \right) - \frac{1}{p} \int_0^{\pi} |x(t)|^p\, dt.$$

For $s > 0$ set $e = sx$. Then for $s$ large we obtain

$$\|e\| > r \quad\text{and}\quad F(e) \le 0.$$
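The mountain pass geometry just verified can be observed along a single ray. The following sketch rests on choices that are not in the text: the profile $x(t) = \sin t$ (positive on $(0, \pi)$), the exponent $p = 4$, the parameter $\lambda = 0$, and a composite midpoint rule for the integrals. It evaluates $F(sx)$ for a few values of $s$; the helper names `integrate` and `F_on_ray` are hypothetical.

```python
import math

def integrate(fun, a, b, n=2000):
    # Composite midpoint rule on [a, b] with n subintervals.
    h = (b - a) / n
    return h * sum(fun(a + (k + 0.5) * h) for k in range(n))

def F_on_ray(s, lam, p):
    # F(s*x) for the sample profile x(t) = sin t; since s*x >= 0, G(|s*x|) = (s*x)^p / p.
    kinetic = integrate(lambda t: (s * math.cos(t)) ** 2, 0.0, math.pi)
    mass = integrate(lambda t: (s * math.sin(t)) ** 2, 0.0, math.pi)
    nonlin = integrate(lambda t: (s * math.sin(t)) ** p, 0.0, math.pi)
    return 0.5 * kinetic + 0.5 * lam * mass - nonlin / p

for s in (0.5, 1.0, 2.0, 3.0):
    print(s, F_on_ray(s, lam=0.0, p=4))
```

For this profile the integrals are known in closed form, $F(s \sin t) = \frac{\pi}{4}(1 + \lambda) s^2 - \frac{3\pi}{32} s^4$ when $p = 4$, so the values are positive for small $s$ and negative for large $s$, which is the behaviour used to choose $r$ and $e$.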

It remains to verify that $F$ satisfies the $(\mathrm{PS})_c$ condition. Actually, we will verify that $F$ satisfies even a stronger version of $(\mathrm{PS})_c$. Namely, we will prove that any sequence $\{x_n\}_{n=1}^{\infty} \subset H$ satisfying

$$d := \sup_n F(x_n) < \infty, \qquad \nabla F(x_n) \to o, \tag{6.4.16}$$

contains a convergent subsequence.$^{47}$ A typical scheme of the proof is the following: In Step 1 we prove that $\{x_n\}_{n=1}^{\infty}$ is a bounded sequence. In Step 2 we pass to a weakly convergent subsequence and show that it converges strongly as well.

Step 1. For $n$ large enough, we have by (6.4.16)$^{48}$

$$d + \|x_n\| \ge F(x_n) - \frac{1}{p} (\nabla F(x_n), x_n) = \left( \frac{1}{2} - \frac{1}{p} \right) \left( \int_0^{\pi} |\dot x_n(t)|^2\, dt + \lambda \int_0^{\pi} |x_n(t)|^2\, dt \right) \ge c_1^2 \left( \frac{1}{2} - \frac{1}{p} \right) \|x_n\|^2.$$

It follows from this quadratic inequality that $\|x_n\|$ is bounded.

Step 2. Passing to a subsequence if necessary, we can assume that

$$x_n \rightharpoonup x \quad\text{in } H.$$

By the compact embedding $H = W_0^{1,2}(0, \pi) \subset\subset C[0, \pi]$ (see Theorem 1.2.13) we have

$$x_n \to x \quad\text{in } C[0, \pi], \quad\text{and so}\quad g(x_n) \to g(x) \quad\text{in } C[0, \pi].$$

Observe that we have

$$|||x_n - x|||^2 = (\nabla F(x_n) - \nabla F(x), x_n - x) + \int_0^{\pi} (g(x_n(t)) - g(x(t)))(x_n(t) - x(t))\, dt. \tag{6.4.17}$$

It is clear that

$$(\nabla F(x_n) - \nabla F(x), x_n - x) \to 0 \quad\text{as } n \to \infty$$

(cf. (6.4.16)). The uniform convergence of $\{x_n\}_{n=1}^{\infty}$ and $\{g(x_n)\}_{n=1}^{\infty}$ implies that also

$$\int_0^{\pi} (g(x_n(t)) - g(x(t)))(x_n(t) - x(t))\, dt \to 0 \quad\text{as } n \to \infty.$$

Thus it follows from (6.4.17) that

$$|||x_n - x||| \to 0 \quad\text{as } n \to \infty, \quad\text{i.e.,}\quad x_n \to x \text{ in } H$$

due to (6.4.14). It follows from Theorem 6.4.5 that there exists a critical point $x_0 \in H$ of $F$ (and hence a weak solution of (6.4.12)) with $F(x_0) = c \ge b > 0$. In particular, $x_0 \ne o$. We prove that $x_0 > 0$ in $(0, \pi)$. Indeed,

$$\int_0^{\pi} \dot x_0(t) \dot y(t)\, dt + \lambda \int_0^{\pi} x_0(t) y(t)\, dt = \int_0^{\pi} g(x_0(t)) y(t)\, dt$$

holds for any $y \in H$. Taking $y = x_0^-$,$^{49}$ we get

$$\int_0^{\pi} \left| \frac{d x_0^-(t)}{dt} \right|^2 dt + \lambda \int_0^{\pi} |x_0^-(t)|^2\, dt = 0. \tag{6.4.18}$$

Hence $|||x_0^-||| = 0$, i.e., $x_0(t) \ge 0$ for all $t \in [0, \pi]$ due to (6.4.14). A similar argument to that used in Section 6.1 yields that $x_0 \in C^2[0, \pi]$ (cf. Exercise 6.4.11). Now, if there were $t_0 \in (0, \pi)$ such that $x_0(t_0) = 0$, then, due to $x_0^- \equiv 0$, also $\dot x_0(t_0) = 0$ would hold. However, the uniqueness theorem for the second order initial value problem

$$\begin{cases} -\ddot x(t) = -\lambda x(t) + |x(t)|^{p-2} x(t), \\ x(t_0) = \dot x(t_0) = 0 \end{cases}$$

implies that $x_0 \equiv 0$, i.e., a contradiction to $x_0 \ne o$. Hence $x_0 > 0$ in $(0, \pi)$.

$^{47}$ The reader should justify that $(\mathrm{PS})_c$ in the sense of Definition 6.4.4 is satisfied as well.

$^{48}$ Due to (6.4.16) we can actually assume that $\left| \frac{1}{p} (\nabla F(x_n), x_n) \right| \le \|x_n\|$.


$^{49}$ Here $x^- = \max\{0, -x\}$. One can prove that for $x \in W_0^{1,2}(0, \pi)$ we have $x^- \in W_0^{1,2}(0, \pi)$ (cf. Exercise 1.2.47 and the embedding of $W_0^{1,2}(0, \pi)$ into $C[0, \pi]$).

Exercise 6.4.8. Prove that $\psi$ defined in the proof of Lemma 6.4.2 is a locally Lipschitz continuous functional on $H$.

Hint. For $u_1$, $u_2$ from a bounded set we have

$$\operatorname{dist}(u_i, H \setminus A) + \operatorname{dist}(u_i, B) \ge \delta, \qquad i = 1, 2, \quad\text{with a } \delta > 0.$$

Using the triangle inequality prove that

$$|\psi(u_2) - \psi(u_1)| \le \frac{\operatorname{dist}(u_2, H \setminus A)}{\delta^2}\, |\operatorname{dist}(u_1, B) - \operatorname{dist}(u_2, B)| + \frac{\operatorname{dist}(u_2, B)}{\delta^2}\, |\operatorname{dist}(u_1, H \setminus A) - \operatorname{dist}(u_2, H \setminus A)|,$$

and then apply Exercise 1.2.45.


Exercise 6.4.9. Prove that $f$ defined in the proof of Lemma 6.4.2 is a locally Lipschitz continuous map from $H$ into itself.

Hint. Use the facts that $\psi$ is locally Lipschitz continuous (Exercise 6.4.8) and $F \in C^2(H, \mathbb{R})$.

Exercise 6.4.10. Prove that the functional $F$ from Example 6.4.7 satisfies $F \in C^2(H, \mathbb{R})$.

Hint. Use the fact that $G$ defined in Example 6.4.7 belongs to $C^2(\mathbb{R})$ if $p > 2$.

Exercise 6.4.11. Prove that $x_0 \in C^2[0, \pi]$ for any weak solution $x_0$ of (6.4.12).

Hint. Look at the proof of Proposition 6.1.11.

Exercise 6.4.12. Consider the boundary value problem

$$\begin{cases} -\ddot x(t) - \lambda x(t) = g(t, x(t)), & t \in (0, \pi), \\ x(0) = x(\pi) = 0. \end{cases} \tag{6.4.19}$$

Formulate conditions on $\lambda$ and $g$ which guarantee that the energy functional associated with (6.4.19) has a geometry corresponding to the Mountain Pass Theorem.

Exercise 6.4.13. Consider the Neumann boundary value problem

$$\begin{cases} -\ddot x(t) = h(t, x(t)), & t \in (0, \pi), \\ \dot x(0) = \dot x(\pi) = 0. \end{cases} \tag{6.4.20}$$

Formulate conditions on $h = h(t, x)$ which guarantee the existence of a weak solution of (6.4.20).

Exercise 6.4.14. Consider the Dirichlet boundary value problem for the fourth order equation

$$\begin{cases} \dfrac{d^4 x(t)}{dt^4} = h(t, x(t)), & t \in (0, \pi), \\ x(0) = \dot x(0) = x(\pi) = \dot x(\pi) = 0. \end{cases} \tag{6.4.21}$$

Formulate conditions on $h = h(t, x)$ which guarantee the existence of a weak solution of (6.4.21).

6.4A Pseudogradient Vector Fields in Banach Spaces

The aim of this appendix is to show how to extend the Quantitative Deformation Lemma (Lemma 6.4.2) to continuously differentiable functionals defined on a Banach space. For this purpose the notion of the pseudogradient introduced by Palais is crucial.

Definition 6.4.15. Let $M$ be a separable metric space, $X$ a normed linear space and $h : M \to X^* \setminus \{o\}$ a continuous mapping. A pseudogradient vector field for $h$ on $M$ is a locally Lipschitz continuous map $g : M \to X$ such that for every $u \in M$,

$$\|g(u)\|_X \le 2 \|h(u)\|_{X^*}, \qquad \langle h(u), g(u) \rangle_X \ge \|h(u)\|_{X^*}^2.$$


Lemma 6.4.16. For any $h$ as above there exists a pseudogradient vector field for $h$ on $M$.

Proof. For every $v \in M$ there exists $x \in X$ such that

$$\|x\| = 1 \quad\text{and}\quad \langle h(v), x \rangle > \frac{2}{3} \|h(v)\|.^{50}$$

Define $y := \frac{3}{2} \|h(v)\| x$. Then

$$\|y\| < 2 \|h(v)\| \quad\text{and}\quad \langle h(v), y \rangle > \|h(v)\|^2.$$

Since $h$ is continuous, there exists an open neighborhood $U(v) \subset M$ such that

$$\|y\| \le 2 \|h(u)\| \quad\text{and}\quad \langle h(u), y \rangle \ge \|h(u)\|^2 \quad\text{for every } u \in U(v). \tag{6.4.22}$$

The family $\mathcal{U} := \{U(v) : v \in M\}$ is an open covering of $M$. Since $M$ is a separable metric space, there exists a locally finite open covering $\mathcal{M} := \{M_i : i \in \mathbb{N}\}$ of $M$ which is subordinate to $\mathcal{U}$ (cf. Lemma 4.3.75), i.e., for each $i \in \mathbb{N}$ there exists $v \in M$ such that $M_i \subset U(v)$.$^{51}$ Hence there exists $y = y_i$ such that (6.4.22) is satisfied for every $u \in M_i$. Define, on $M$,

$$\varrho_i(u) := \operatorname{dist}(u, M \setminus M_i) \quad\text{and}\quad g(u) := \sum_{i \in \mathbb{N}} \frac{\varrho_i(u)}{\sum_{j \in \mathbb{N}} \varrho_j(u)}\, y_i.^{52}$$

It is now straightforward to verify that $g$ is the desired pseudogradient vector field for $h$ on $M$ (cf. Exercise 6.4.8).

The following generalization of Lemma 6.4.2 was proved by Willem [134].

Lemma 6.4.17 (Quantitative Deformation Lemma). Let $X$ be a Banach space, and let $F : X \to \mathbb{R}$, $F \in C^1(X, \mathbb{R})$, $S \subset X$, $S \ne \emptyset$, $c \in \mathbb{R}$, $\varepsilon, \delta > 0$ be such that for any $u \in F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}$ we have$^{53}$

$$\|F'(u)\|_{X^*} \ge \frac{8\varepsilon}{\delta}. \tag{6.4.23}$$

Then there exists $\eta \in C([0, 1] \times X, X)$ such that
(i) $\eta(t, u) = u$ if $t = 0$ or $u \notin F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}$,
(ii) $\eta(1, F^{c+\varepsilon} \cap S) \subset F^{c-\varepsilon}$,$^{54}$
(iii) for any $t \in [0, 1]$, $\eta(t, \cdot)$ is a homeomorphism of $X$,
(iv) for any $u \in X$ and any $t \in [0, 1]$, $\|\eta(t, u) - u\|_X \le \delta$,
(v) for any $u \in X$, $F(\eta(\cdot, u))$ is decreasing,
(vi) for any $u \in F^c \cap S_\delta$ and any $t \in (0, 1]$, $F(\eta(t, u)) < c$.

$^{50}$ Note that for any $v \in M$ we have $h(v) \ne o \in X^*$!

$^{51}$ Separability of $M$ is actually not necessary. In the case of a general metric space $M$ its paracompactness is used instead of separability (see Dugundji [43], Zeidler [136]).

$^{52}$ Note that the sums contain only a finite number of nonzero terms.

$^{53}$ Here $S_{2\delta} := \{u \in X : \operatorname{dist}(u, S) \le 2\delta\}$.

$^{54}$ Recall that $F^{c \pm \varepsilon} := F^{-1}((-\infty, c \pm \varepsilon])$.


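Proof.$^{55}$ By Lemma 6.4.16 there exists a pseudogradient vector field $g$ for $F'$ on $M := \{u \in X : F'(u) \ne o\}$. Let us define sets

$$A := F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}, \qquad B := F^{-1}([c - \varepsilon, c + \varepsilon]) \cap S_\delta$$

and a functional

$$\psi(u) := \frac{\operatorname{dist}(u, X \setminus A)}{\operatorname{dist}(u, X \setminus A) + \operatorname{dist}(u, B)}.$$

Then $\psi$ is locally Lipschitz continuous (see Exercise 6.4.8) and

$$\psi = \begin{cases} 1 & \text{on } B, \\ 0 & \text{on } X \setminus A. \end{cases}$$

Let us define a vector field

$$f(u) = \begin{cases} -\psi(u)\, \dfrac{g(u)}{\|g(u)\|^2} & \text{for } u \in A, \\[2mm] o & \text{for } u \in X \setminus A. \end{cases}$$

Then $f$ is also locally Lipschitz continuous (cf. Exercise 6.4.9) and by assumption (6.4.23) and by Definition 6.4.15, for any $u \in X$ we have

$$\|f(u)\| \le \frac{\delta}{8\varepsilon}. \tag{6.4.24}$$

Consider the Cauchy problem

$$\dot\sigma = f(\sigma), \qquad \sigma(0) = u. \tag{6.4.25}$$

It follows from Corollary 3.1.6 and Remark 3.1.7 that (6.4.25) has a unique solution $\sigma(\cdot, u)$ which is defined on the whole $\mathbb{R}$ and $\sigma$ is continuous on $\mathbb{R} \times X$. Let us define $\eta : [0, 1] \times X \to X$ by $\eta(t, u) := \sigma(8\varepsilon t, u)$. It follows from Definition 6.4.15, assumption (6.4.23) and from (6.4.24) that for $t \ge 0$ the inequalities

$$\|\sigma(t, u) - u\| = \left\| \int_0^t f(\sigma(\tau, u))\, d\tau \right\| \le \int_0^t \|f(\sigma(\tau, u))\|\, d\tau \le \frac{\delta t}{8\varepsilon} \tag{6.4.26}$$

and

$$\frac{d}{dt} F(\sigma(t, u)) = \Bigl\langle F'(\sigma(t, u)), \frac{d}{dt} \sigma(t, u) \Bigr\rangle = \langle F'(\sigma(t, u)), f(\sigma(t, u)) \rangle \le -\frac{1}{4} \psi(\sigma(t, u)) \le 0 \tag{6.4.27}$$

hold. To verify (i), (iii), (iv), (v) and (vi) is a matter of straightforward calculation (cf. Exercise 6.4.25).

$^{55}$ Cf. the proof of Lemma 6.4.2.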


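In order to verify (ii), let $u \in F^{c+\varepsilon} \cap S$. If there is $t \in [0, 8\varepsilon]$ such that $F(\sigma(t, u)) < c - \varepsilon$, then $F(\sigma(8\varepsilon, u)) < c - \varepsilon$ by (6.4.27) and (ii) is satisfied. If, on the other hand, we have $\sigma(t, u) \in F^{-1}([c - \varepsilon, c + \varepsilon])$ for any $t \in [0, 8\varepsilon]$, we obtain from (6.4.26) that $\sigma(t, u) \in B$ and hence (6.4.27) yields

$$F(\sigma(8\varepsilon, u)) = F(u) + \int_0^{8\varepsilon} \frac{d}{dt} F(\sigma(t, u))\, dt \le F(u) - \frac{1}{4} \int_0^{8\varepsilon} \psi(\sigma(t, u))\, dt \le c + \varepsilon - 2\varepsilon = c - \varepsilon$$

and (ii) is also satisfied.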

A special case of the Ekeland Variational Principle is considered to be the first application of Lemma 6.4.17.

Theorem 6.4.18 (Ekeland Variational Principle). Let $X$ be a Banach space, let $F \in C^1(X, \mathbb{R})$ be bounded below, and let $\varepsilon, \delta > 0$ be arbitrary. If

$$F(v) \le \inf_{u \in X} F(u) + \varepsilon \quad\text{for a } v \in X,$$

then there exists $u_0 \in X$ such that

$$F(u_0) \le \inf_{u \in X} F(u) + 2\varepsilon, \qquad \|u_0 - v\|_X \le 2\delta, \quad\text{and}\quad \|F'(u_0)\|_{X^*} < \frac{8\varepsilon}{\delta}.$$

Proof. We apply Lemma 6.4.17 with

$$S := \{v\} \quad\text{and}\quad c := \inf_{u \in X} F(u).$$

We proceed via contradiction. Assume that there exist $\varepsilon$ and $\delta$ such that $\|F'(u)\| \ge \frac{8\varepsilon}{\delta}$ for every $u \in F^{-1}([c, c + 2\varepsilon]) \cap S_{2\delta}$. Then

$$\eta(1, v) \in F^{c-\varepsilon}$$

by (ii) of Lemma 6.4.17. However, the definition of $c$ implies $F^{c-\varepsilon} = \emptyset$, a contradiction.

Corollary 6.4.19. Let $F \in C^1(X, \mathbb{R})$ be bounded below. If $F$ satisfies the $(\mathrm{PS})_c$ condition with $c := \inf_{u \in X} F(u)$, then every minimizing sequence for $F$ contains a converging subsequence. In particular, there exists $u_0 \in X$ such that

$$F(u_0) = \min_{u \in X} F(u).$$

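Proof. Let $\{v_n\}_{n=1}^{\infty} \subset X$ be a minimizing sequence for $F$. We apply Theorem 6.4.18 with

$$\varepsilon_n := \max\left\{ \frac{1}{n},\ F(v_n) - c \right\} \quad\text{and}\quad \delta_n := \sqrt{\varepsilon_n}.$$

Then there exists a sequence $\{u_n\}_{n=1}^{\infty} \subset X$ such that

$$F(u_n) \to c, \qquad F'(u_n) \to o \quad\text{and}\quad \|u_n - v_n\| \to 0.$$

The assertion follows now from $(\mathrm{PS})_c$.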
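The three conclusions of Theorem 6.4.18 can be checked on a one-dimensional example. The sketch below makes assumptions that are not in the text: $X = \mathbb{R}$, the sample functional $F(u) = e^u$ (bounded below with $\inf F = 0$, not attained), $\varepsilon = 10^{-3}$, $\delta = 10$, and a plain grid search in place of the deformation argument; `ekeland_candidate` is a hypothetical helper name.

```python
import math

def ekeland_candidate(F, dF, inf_F, v, eps, delta, n=4000):
    # Among grid points u with |u - v| <= 2*delta, return the one closest to v with
    # F(u) <= inf F + 2*eps and |F'(u)| < 8*eps/delta, or None if the grid has none.
    best = None
    for i in range(n + 1):
        u = v - 2.0 * delta + 4.0 * delta * i / n
        if F(u) <= inf_F + 2.0 * eps and abs(dF(u)) < 8.0 * eps / delta:
            if best is None or abs(u - v) < abs(best - v):
                best = u
    return best

eps, delta = 1e-3, 10.0
v = math.log(eps) - 0.1          # then F(v) = exp(v) < eps = inf F + eps
u0 = ekeland_candidate(math.exp, math.exp, 0.0, v, eps, delta)
print(v, u0, math.exp(u0))
```

Here the point $v$ itself violates the derivative bound, and the search returns a nearby point to its left. Every minimizing sequence of this functional tends to $-\infty$, so $(\mathrm{PS})_c$ with $c = 0$ fails, which is consistent with the minimum in Corollary 6.4.19 not being attained.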

Another application of Lemma 6.4.17 is the following result.

Theorem 6.4.20 (Brézis and Nirenberg). Let $F \in C^1(X, \mathbb{R})$. If

$$c := \liminf_{\|u\| \to \infty} F(u) \in \mathbb{R},$$

then for every $\varepsilon, \delta > 0$, $R > 2\delta$, there exists $u \in X$ such that
(i) $c - 2\varepsilon \le F(u) \le c + 2\varepsilon$,
(ii) $\|u\| > R - 2\delta$,
(iii) $\|F'(u)\| < \frac{8\varepsilon}{\delta}$.

Proof. We proceed by contradiction similarly to the proof of Theorem 6.4.18. Suppose that the assertion does not hold. Then there exist $\varepsilon$, $\delta$ and $R$ such that for any $u \in X$ satisfying (i) and (ii) the inequality in (iii) is false. Hence we can apply Lemma 6.4.17 with $S := X \setminus B(o; R)$. By the definition of $c$, $F^{c+\varepsilon} \cap S$ is an unbounded set and $F^{c-\varepsilon} \subset B(o; r)$ for $r > 0$ large enough. By (ii) and (iv) of Lemma 6.4.17,

$$\eta(1, F^{c+\varepsilon} \cap S) \subset F^{c-\varepsilon} \quad\text{and}\quad F^{c+\varepsilon} \cap S \subset B(o; r + \delta),$$

a contradiction.

Corollary 6.4.21 (Shujie Li). Let $F \in C^1(X, \mathbb{R})$ be bounded below. If for arbitrary $d \in \mathbb{R}$ every sequence $\{u_n\}_{n=1}^{\infty} \subset X$ such that

$$F(u_n) \to d, \qquad F'(u_n) \to o$$

is bounded, then

$$\lim_{\|u\| \to \infty} F(u) = \infty.$$

Proof. We proceed again by contradiction. Assume the assertion does not hold. Then

$$c := \liminf_{\|u\| \to \infty} F(u) \in \mathbb{R}.$$

By Theorem 6.4.20 there exists a sequence $\{u_n\}_{n=1}^{\infty} \subset X$ such that

$$F(u_n) \to c, \qquad F'(u_n) \to o \quad\text{and}\quad \|u_n\| \to \infty,$$

a contradiction.

Let us present now the most important application of Lemma 6.4.17, the General Minimax Principle.

Theorem 6.4.22. Let $X$ be a Banach space. Let $M_0$ be a subset of a metric space $M$ and $\Gamma_0 \subset C(M_0, X)$. Define

$$\Gamma := \{\gamma \in C(M, X) : \gamma|_{M_0} \in \Gamma_0\}.$$

If $F \in C^1(X, \mathbb{R})$ satisfies

$$a := \sup_{\gamma_0 \in \Gamma_0} \sup_{u \in M_0} F(\gamma_0(u)) < c := \inf_{\gamma \in \Gamma} \sup_{u \in M} F(\gamma(u)) < \infty, \tag{6.4.28}$$

then for every $\varepsilon \in \left(0, \frac{c - a}{2}\right)$, $\delta > 0$ and $\gamma \in \Gamma$ such that

$$\sup_{u \in M} F(\gamma(u)) \le c + \varepsilon, \tag{6.4.29}$$

there exists $u_0 \in X$ such that
(i) $c - 2\varepsilon \le F(u_0) \le c + 2\varepsilon$,
(ii) $\operatorname{dist}(u_0, \gamma(M)) \le 2\delta$,
(iii) $\|F'(u_0)\| < \frac{8\varepsilon}{\delta}$.

Proof. Suppose, by contradiction, that the assertion is false. Then there exist $0 < \varepsilon < \frac{c - a}{2}$, $\delta > 0$ and $\gamma \in \Gamma$ such that (6.4.29) holds and for any $u \in X$ satisfying (i) and (ii), the inequality in (iii) is false. Hence Lemma 6.4.17 can be applied with $S = \gamma(M)$. Define $\beta(u) := \eta(1, \gamma(u))$. Since $c - 2\varepsilon > a$, we obtain from (6.4.28) that

$$\beta(u) = \eta(1, \gamma(u)) = \gamma(u) \quad\text{for every } u \in M_0, \quad\text{so that}\quad \beta \in \Gamma.$$

It follows from (6.4.29) and Lemma 6.4.17 that

$$\sup_{u \in M} F(\beta(u)) = \sup_{u \in M} F(\eta(1, \gamma(u))) \le c - \varepsilon,$$

contradicting the definition of $c$.

We now have the following consequence.

Corollary 6.4.23. Let the assumptions of Theorem 6.4.22 be fulfilled. Then there exists a sequence $\{u_n\}_{n=1}^{\infty} \subset X$ satisfying

$$F(u_n) \to c, \qquad F'(u_n) \to o.$$

In particular, if $F$ satisfies the $(\mathrm{PS})_c$ condition, then $c$ is a critical value of $F$.

The special choices of $M$, $M_0$, $\Gamma$ and $\Gamma_0$ in Theorem 6.4.22 can yield the Mountain Pass Theorem (see below) and the Saddle Point Theorem (see Appendix 6.5A) under more general assumptions than in Sections 6.4 and 6.5, respectively.

Theorem 6.4.24 (Mountain Pass Theorem, Ambrosetti and Rabinowitz). Let $X$ be a Banach space and let $F \in C^1(X, \mathbb{R})$, $e \in X$ and $r > 0$ be such that $\|e\| > r$ and

$$\inf_{\substack{u \in X \\ \|u\| = r}} F(u) > F(o) \ge F(e).$$

If $F$ satisfies the $(\mathrm{PS})_c$ condition with

$$c := \inf_{\gamma \in \Gamma} \max_{t \in [0,1]} F(\gamma(t)) \quad\text{where}\quad \Gamma := \{\gamma \in C([0, 1], X) : \gamma(0) = o,\ \gamma(1) = e\},$$

then $c$ is a critical value of $F$.

Proof. It suffices to apply Corollary 6.4.23 with $M = [0, 1]$, $M_0 = \{0, 1\}$, $\Gamma_0 = \{\gamma_0\}$ where $\gamma_0(0) = o$ and $\gamma_0(1) = e$.

Exercise 6.4.25. Verify that (i), (iii), (iv), (v) and (vi) of Lemma 6.4.17 hold true. Let $g$ and $f$ be from the proof of Lemma 6.4.17. Explain why the local Lipschitz continuity of $g$ implies that $f$ is locally Lipschitz continuous. Compare this with the proof of Lemma 6.4.2.

Hint. Use (6.4.26) and (6.4.27).


Exercise 6.4.26. Consider the boundary value problem

$$\begin{cases} -\bigl( |\dot x(t)|^{p-2} \dot x(t) \bigr)^{\cdot} + \lambda x(t) = |x(t)|^{r-2} x(t), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.4.30}$$

where $p > 1$ and $r > p$. Let $\lambda_1$ be the first eigenvalue of (5.2.47). Prove that the problem (6.4.30) has a positive solution on $(0, 1)$ provided $\lambda > -\lambda_1$.

Hint. Follow the idea of proving the “sufficient condition” in Example 6.4.7 and apply Theorem 6.4.24.

Exercise 6.4.27. Consider the problem

$$\begin{cases} -\bigl( |\dot x(t)|^{p-2} \dot x(t) \bigr)^{\cdot} - \lambda |x(t)|^{p-2} x(t) = g(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.4.31}$$

where $p > 1$. Formulate conditions on $\lambda$ and $g$ which guarantee that the energy functional associated with (6.4.31) has a geometry corresponding to the Mountain Pass Theorem (Theorem 6.4.24).

Exercise 6.4.28. Consider the Dirichlet boundary value problem

$$\begin{cases} \bigl( |\dot x(t)|^{p-2} \dot x(t) \bigr)^{\cdot} = h(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.4.32}$$

where $p > 1$. Formulate conditions on $h = h(t, x)$ which guarantee that (6.4.32) has a weak solution.

6.4B Lusternik–Schnirelmann Method

The purpose of this appendix is to extend the results presented in Section 6.3, namely, we concentrate on the Lusternik–Schnirelmann Method which generalizes the Courant–Fischer and Courant–Weinstein Principles. In order to motivate the topic let us consider the unit circle $S^1$ in the plane and a continuous function $\varphi$ defined on it. The Extreme Value Theorem implies that this function has to attain its maximum and minimum values. If the function $\varphi$ is the restriction of a non-vanishing linear function of two independent variables to $S^1$, then $\varphi$ has exactly one maximum at $M$ and one minimum at $m$ as in Figure 6.4.3. The set $S^1$ can be covered by two closed sets which are contractible to a point in $S^1$ (see Figure 6.4.4).

[Figure 6.4.3: the circle $S^1$ with the maximum $M$ and the minimum $m$ of $\varphi$.]

[Figure 6.4.4: a covering of $S^1$ by two closed sets $A_1$, $A_2$.]

As another example consider a two-dimensional torus $T^2$ and identify it with the quotient space $\mathbb{R}^2 / \mathbb{Z}^2$ (see Figure 6.4.5). Let us consider a function $\varphi \in C^1(T^2, \mathbb{R})$$^{56}$ having a maximum at $M$ and a minimum at $m$. We also assume that the level sets of $\varphi$ are the curves indicated in Figure 6.4.5. The function $\varphi$ has three critical points on the torus: the maximum at $M$, the minimum at $m$ and a saddle point at $S$. In Figure 6.4.6 we give a covering of the torus $T^2$ by three closed sets which are contractible to a point in $T^2$ (see Appendix 6.3A for the notion of a contractible set).

$^{56}$ For the notion of the differentiability of functions defined on manifolds, like $S^1$, $T^2$, see Definition 4.3.36.

Definition 6.4.29. We define the Lusternik–Schnirelmann category $\operatorname{cat}_Y(A)$ of a closed nonempty subset $A$ of a topological space $Y$ as the least integer $n$ such that there exists a covering of $A$ by $n$ closed sets contractible to a point in $Y$.$^{57}$

The essential idea of the Lusternik–Schnirelmann method is the following one: The number of critical points of a $C^1$-functional $\varphi$ defined on a compact manifold $Y$ is greater than or equal to $\operatorname{cat}_Y(Y)$. The corresponding critical values are given by

$$c_k := \inf_{A \in \mathcal{A}_k} \sup_{u \in A} \varphi(u) \quad\text{where}\quad \mathcal{A}_k := \{A \subset Y : A \text{ closed},\ \operatorname{cat}_Y(A) \ge k\}.$$

Let us prove some elementary properties of the Lusternik–Schnirelmann category.

Lemma 6.4.30. Let $A$ and $B$ be closed subsets of $Y$. Then we have
(i) (normalization) $\operatorname{cat}_Y(\emptyset) = 0$,
(ii) (subadditivity) $\operatorname{cat}_Y(A \cup B) \le \operatorname{cat}_Y(A) + \operatorname{cat}_Y(B)$,
(iii) (monotonicity) if $A \prec B$,$^{58}$ then $\operatorname{cat}_Y(A) \le \operatorname{cat}_Y(B)$.

Proof. Properties (i) and (ii) follow directly from Definition 6.4.29. Let us prove (iii). Assume that $A \prec B$ by means of homotopy $h$, and let $\{B_1, \dots, B_n\}$ be a covering of $B$ corresponding to $n := \operatorname{cat}_Y(B)$ according to Definition 6.4.29. Define sets

$$A_j := \{u \in A : h(1, u) \in B_j\}, \qquad j = 1, \dots, n.$$

Then

$$A = \bigcup_{j=1}^{n} A_j, \qquad A_j \prec B_j, \quad j = 1, \dots, n.$$

According to Lemma 6.3.21, $\operatorname{cat}_Y(A) \le n$.

Definition 6.4.31. A metric space $Y$ is an absolute neighborhood extensor if for every metric space $E$, every closed subset $F \subset E$ and every continuous mapping $f : F \to Y$ there exists a continuous extension of $f$ defined on a neighborhood of $F$ in $E$.$^{59}$

$^{57}$ See Definition 6.3.20.

$^{58}$ See Definition 6.3.20 for the relation $\prec$.

$^{59}$ The terminology is not fixed in the literature. We follow here the books Willem [134] and Zeidler [136]. On the other hand, the same objects are called “absolute neighborhood retract” in Deimling [34] and Dugundji [43].

[Figure 6.4.5: level sets of $\varphi$ on the torus $T^2$, with the points $M$, $m$, $S$, $p$ and $q$ marked.]

[Figure 6.4.6: a covering of the torus $T^2$ by three closed sets $A_1$, $A_2$, $A_3$.]

Remark 6.4.32. Note that every normed linear space is an absolute neighborhood extensor (see, e.g., the Tietze–Dugundji Theorem in Zeidler [136, Prop. 2.1]).

Proposition 6.4.33. Let $A$ be a closed subset of an absolute neighborhood extensor $Y$. Then there exists a closed neighborhood $B$ of $A$ in $Y$ such that $\operatorname{cat}_Y(B) = \operatorname{cat}_Y(A)$.

Proof. The reader should realize that it is sufficient to consider the case $\operatorname{cat}_Y(A) = 1$ (cf. Lemma 6.4.30(ii) and (iii)). Let $h$ be the corresponding homotopy which contracts $A$ to a point. The set

$$N := ([0, 1] \times A) \cup (\{0, 1\} \times Y) \quad\text{is closed in}\quad M := [0, 1] \times Y.$$

Let $u_0 \in A$ be fixed. The map $f : N \to Y$ defined by

$$f(t, u) := \begin{cases} h(t, u), & t \in [0, 1],\ u \in A, \\ u, & t = 0,\ u \in Y, \\ h(1, u_0), & t = 1,\ u \in Y, \end{cases}$$

is continuous. The fact that $Y$ is an absolute neighborhood extensor implies that there exists a continuous extension $g$ of $f$ defined on a neighborhood $U$ of $N$. The compactness of $[0, 1]$ implies the existence of a closed neighborhood $B$ of $A$ such that $[0, 1] \times B \subset U$. However, then $B$ is contractible to a point in $Y$, i.e., $\operatorname{cat}_Y(B) = 1$.

Our aim is now to prove the Quantitative Deformation Lemma which will be the key tool for proving the existence of critical points on manifolds. In the following considerations we will always assume that $X$ is a real separable Banach space, $\psi \in C^2(X, \mathbb{R})$,

$$V := \{v \in X : \psi(v) = 1\} \ne \emptyset \quad\text{and}\quad \psi'(v) \ne 0 \quad\text{for every } v \in V.$$

The reader should be aware of the fact that some of these assumptions can be relaxed and more general results parallel to those from this section can be proved (see, e.g., Ghoussoub [58]).

The set $V$ is a differentiable manifold of the class $C^2$ (cf. Remark 4.3.39 or Deimling [34, § 27], Zeidler [136, Chapter 43]). The norm on $X$ induces a metric on $V$ and so $V$ becomes a metric manifold, i.e., a metric space and a manifold. It can be proved that $V$ is an absolute neighborhood extensor (see, e.g., Deimling [34, Proposition 27.6]). We denote by $T_v V$ its tangent space at $v$ (see Remark 4.3.40), i.e.,

$$T_v V := \{y \in X : \langle \psi'(v), y \rangle_X = 0\}.$$

Let $\varphi \in C^1(X, \mathbb{R})$ be given. The norm of the restriction of the derivative $\varphi'(v)$ to $T_v V$ is given by

$$\|\varphi'(v)\|_* := \sup_{\substack{y \in T_v V \\ \|y\|_X = 1}} \langle \varphi'(v), y \rangle_X.$$

The point $v$ is a critical point of the restriction of $\varphi$ to $V$ if the restriction of $\varphi'(v)$ to $T_v V$ is equal to $o$. We define

$$\varphi^d := \{v \in V : \varphi(v) \le d\}.$$

We will use the following Duality Lemma.


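Lemma 6.4.34 (Duality Lemma). If $f, g \in X^*$, then

$$\sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle = \min_{\lambda \in \mathbb{R}} \|f - \lambda g\|.$$

Proof. For every $\lambda \in \mathbb{R}$ we have

$$\sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle = \sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f - \lambda g, y \rangle \le \sup_{\|y\| = 1} \|f - \lambda g\| \|y\| = \|f - \lambda g\|.$$

By the Hahn–Banach Theorem (Corollary 2.1.15) there is a continuous linear functional $\tilde f$ on $X$, $\tilde f|_{\operatorname{Ker} g} = f|_{\operatorname{Ker} g}$, such that

$$\sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle = \|\tilde f\|.$$

Since $\operatorname{Ker} g \subset \operatorname{Ker}(f - \tilde f)$, there exists $\lambda \in \mathbb{R}$ such that $f - \tilde f = \lambda g$ (see Proposition 1.1.19). Hence we obtain $\tilde f = f - \lambda g$.

The above lemma immediately yields the following assertion.

Proposition 6.4.35. If $\varphi \in C^1(X, \mathbb{R})$ and $u \in V := \{v \in X : \psi(v) = 1\}$, then

$$\|\varphi'(u)\|_* = \min_{\lambda \in \mathbb{R}} \|\varphi'(u) - \lambda \psi'(u)\|.$$

In particular, $u$ is a critical point of $\varphi|_V$ if and only if there exists $\lambda \in \mathbb{R}$ such that $\varphi'(u) = \lambda \psi'(u)$.$^{60}$

Next we define a tangent pseudogradient vector field on $V$.

Definition 6.4.36.$^{61}$ Let $\varphi \in C^1(V, \mathbb{R})$. A tangent pseudogradient vector field for $\varphi$ on $M := \{u \in V : \|\varphi'(u)\|_* \ne 0\}$ is a locally Lipschitz continuous vector field $g : M \to X$ such that $g(u) \in T_u V$ and

$$\|g(u)\| \le 2 \|\varphi'(u)\|_*, \qquad \langle \varphi'(u), g(u) \rangle \ge \|\varphi'(u)\|_*^2 \quad\text{for every } u \in M.$$

Lemma 6.4.37.$^{62}$ Let $\varphi \in C^1(X, \mathbb{R})$. Then there exists a tangent pseudogradient vector field for $\varphi$ on $M$.

Proof. For every $v \in M$ there exists $x \in T_v V$ such that

$$\|x\| = 1 \quad\text{and}\quad \langle \varphi'(v), x \rangle > \frac{2}{3} \|\varphi'(v)\|_*.$$

There is also $z \in X$ such that

$$\langle \psi'(v), z \rangle = 1.$$

Set $y := \frac{3}{2} \|\varphi'(v)\|_*\, x$ and for $u \in V$ such that $\langle \psi'(u), z \rangle \ne 0$, set

$$g_v(u) := y - \frac{\langle \psi'(u), y \rangle}{\langle \psi'(u), z \rangle}\, z.$$

Since $\langle \psi'(v), y \rangle = 0$, we have $g_v(v) = y$ and

$$\|g_v(v)\| < 2 \|\varphi'(v)\|_*, \qquad \langle \varphi'(v), g_v(v) \rangle > \|\varphi'(v)\|_*^2.$$

Since $\varphi'$ and $g_v$ are continuous, there exists an open neighborhood $N(v)$ of $v$ such that

$$\|g_v(u)\| < 2 \|\varphi'(u)\|_*, \qquad \langle \varphi'(u), g_v(u) \rangle > \|\varphi'(u)\|_*^2 \quad\text{for every } u \in N(v).$$

The family $\mathcal{N} := \{N(v) : v \in M\}$ is an open covering of $M$. Since $M$ is a metric space, there exists a locally finite open covering $\{M_i : i \in \mathbb{N}\}$ of $M$ which is subordinate to $\mathcal{N}$, i.e., such that for every $i \in \mathbb{N}$ there exists $v \in V$ satisfying $M_i \subset N(v)$ (cf. the proof of Lemma 6.4.16). For any $i \in \mathbb{N}$ choose one such $v := v_i$ and define

$$g_i(u) := \begin{cases} g_{v_i}(u), & u \in N(v_i), \\ o, & u \notin N(v_i), \end{cases}$$

and

$$\varrho_i(u) := \operatorname{dist}(u, X \setminus M_i), \qquad g(u) := \sum_{i \in \mathbb{N}} \frac{\varrho_i(u)\, g_i(u)}{\sum_{j \in \mathbb{N}} \varrho_j(u)}.^{63}$$

It is now straightforward to verify that $g$ is a tangent pseudogradient vector field for $\varphi$ on $M$. (The interested reader is invited to check it in detail and realize that the fact $\psi \in C^2(X, \mathbb{R})$ is used!)

The proof of the following version of the Quantitative Deformation Lemma follows the lines of the proof of Lemma 6.4.17 (cf. Exercise 6.4.50).

Lemma 6.4.38 (Quantitative Deformation Lemma). Let $\varphi \in C^1(X, \mathbb{R})$, $S \subset V$, $c \in \mathbb{R}$, $\varepsilon, \delta > 0$ be such that

$$\|\varphi'(u)\|_* \ge \frac{8\varepsilon}{\delta} \quad\text{for any}\quad u \in \varphi^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta} \cap V.$$

Then there exists $\eta \in C([0, 1] \times V, V)$ such that
(i) $\eta(t, u) = u$ if $t = 0$ or $u \notin \varphi^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta} \cap V$,
(ii) $\eta(1, \varphi^{c+\varepsilon} \cap S) \subset \varphi^{c-\varepsilon}$,
(iii) $\varphi(\eta(\cdot, u))$ is decreasing for any $u \in V$.

$^{60}$ Cf. Theorem 6.3.2.

$^{61}$ Cf. Definition 6.4.15.

$^{62}$ Cf. Lemma 6.4.16.

$^{63}$ Note that the sums contain only a finite number of nonzero terms. Note also that separability of $X$ can be dropped and substituted by paracompactness.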


We are now ready to prove the General Minimax Principle on the manifold V . We assume that ϕ ∈ C 1 (X, R) is bounded below on V . For j ≥ 1, j ∈ N, we deﬁne Aj {A ⊂ V : A is closed, catV (A) ≥ j},

cj inf sup ϕ(u). A∈Aj u∈A

(6.4.33)

Theorem 6.4.39 (General Minimax Principle). Assume that ϕ and cj are as above. If c ck = ck+1 = · · · = ck+m ,

(6.4.34)

then for every ε > 0, δ > 0, A ∈ Ak+m and B ⊂ V closed such that sup ϕ(u) ≤ c + ε,

catV (B) ≤ m,

(6.4.35)

u∈A

there exists u0 ∈ V such that (i) c − 2ε ≤ ϕ(u0 ) ≤ c + 2ε, (ii) dist(u0 , A \ int B) ≤ 2δ, (iii) ϕ (u0 )∗ ≤

8ε . δ

Proof. We assume by contradiction that there exist numbers ε > 0, δ > 0, and closed sets A ∈ Ak+m , B ⊂ V such that (6.4.35) holds and for all u ∈ V satisfying (i) and (ii)64 the inequality (iii) is false. We apply Lemma 6.4.38 with S A \ int B. We obtain by virtue of Lemma 6.4.38(ii) that (A \ int B) ≺ ϕc−ε . It follows from Lemma 6.4.30(ii), (iii) and from the deﬁnition of ck that k + m ≤ catV (A) ≤ catV (A \ int B) + catV (B) ≤ catV (ϕc−ε ) + m ≤ k − 1 + m,

a contradiction.

Definition 6.4.40. A functional $\varphi$ satisfies the Palais–Smale condition (PS)$_c$ on $V$ if any sequence $\{u_n\}_{n=1}^\infty \subset V$ such that
$$\varphi(u_n) \to c, \qquad \|\varphi'(u_n)\|_* \to 0,$$
has a convergent subsequence.

Theorem 6.4.41. Let $\varphi$ be bounded below on $V$, satisfy (PS)$_c$ on $V$, and let (6.4.34) hold. Let
$$K_c := \{u \in V : \varphi(u) = c,\ \|\varphi'(u)\|_* = 0\}.$$
Then $\operatorname{cat}_V(K_c) \ge m + 1$.

Proof. Assume that $\operatorname{cat}_V(K_c) \le m$. Then Proposition 6.4.33 implies the existence of a closed neighborhood $B$ of $K_c$ in $V$ such that $\operatorname{cat}_V(B) \le m$.⁶⁵

⁶⁴ Cf. Exercise 6.4.55.
⁶⁵ Note that the fact that $V$ is an absolute neighborhood extensor is used here, cf. page 446.

Chapter 6. Variational Methods

By Theorem 6.4.39 for $A = V$ there exists a sequence $\{u_n\}_{n=1}^\infty \subset V$ satisfying
$$\varphi(u_n) \to c, \qquad \operatorname{dist}(u_n, V \setminus \operatorname{int} B) \to 0, \qquad \|\varphi'(u_n)\|_* \to 0.$$
It then follows from the (PS)$_c$ condition on $V$ that $K_c \cap (V \setminus \operatorname{int} B) \neq \emptyset$, a contradiction with the definition of $B$.

Theorem 6.4.42. Let $\varphi$ be bounded below on $V$, let $d \ge \inf_{u \in V} \varphi(u)$, and let $\varphi$ satisfy (PS)$_c$ on $V$ for any $c \in \bigl[\inf_{u \in V} \varphi(u), d\bigr]$. Then $\varphi|_V$ has a minimum and $\varphi^d$ contains at least $\operatorname{cat}_V(\varphi^d)$ critical points of $\varphi|_V$.

Proof. If $n := \operatorname{cat}_V(\varphi^d)$, then
$$\inf_{u \in V} \varphi(u) = c_1 \le c_2 \le c_3 \le \dots \le c_n \le d$$
where $c_i$, $i = 1, \dots, n$, are given by (6.4.33). The critical points corresponding to different critical levels are mutually different. If some levels coincide, we apply Theorem 6.4.41 to get the assertion.

Remark 6.4.43. Note that
(i) $\operatorname{cat}_X(B(o;1)) = 1$ for the closed ball $B(o;1)$ in $X$ where $X$ is a Banach space (see Figure 6.4.7);

Figure 6.4.7. (The ball $B(o;1) \subset X$.)

(ii) $\operatorname{cat}_{S^{N-1}}(S^{N-1}) = 2$ for the unit sphere $S^{N-1} = \partial B(o;1) \subset \mathbb{R}^N$, $N \ge 1$. Indeed, Figure 6.4.4 suggests that $\operatorname{cat}_{S^{N-1}}(S^{N-1}) \le 2$. On the other hand, it follows from Lemma 6.3.22 that $\operatorname{cat}_{S^{N-1}}(S^{N-1}) > 1$.

Definition 6.4.44. Let $S^{N-1} \subset \mathbb{R}^N$ be the unit sphere. Then $P^{N-1} = \{(u,-u) : u \in S^{N-1}\}$ is called an $(N-1)$-dimensional projective space.

The geometrical interpretation of $P^{N-1}$ is the following: the $(N-1)$-dimensional projective space $P^{N-1}$ results from $S^{N-1}$, $N \ge 1$, by identifying antipodal points (see Figure 6.4.8). The following identity is the key to the proof of existence of a sequence of eigenvalues of nonlinear problems:
$$\operatorname{cat}_{P^{N-1}}(P^{N-1}) = N. \tag{6.4.36}$$


To see that $\operatorname{cat}_{P^{N-1}}(P^{N-1}) \le N$ we can proceed by induction as follows. $S^1$ can be covered by two closed symmetric sets which are contractible to a point in $P^1$ (see Figure 6.4.9).

Figure 6.4.8.

Figure 6.4.9.

The closed strip along the equator on $S^2$ can be covered by two closed symmetric sets which are contractible to a point in $P^2$ as well. If we add the closed north and south caps, we get a covering of $S^2$ by three closed symmetric sets which are contractible to a point in $P^2$, etc. (see Figure 6.4.10).

Figure 6.4.10.

To prove the reversed inequality we proceed by contradiction. Assume that $\operatorname{cat}_{P^{N-1}}(P^{N-1}) < N$. Then according to Exercise 6.4.51 there exist $M < N$ and closed symmetric sets $A_i$, $i = 1, \dots, M$, such that
$$S^{N-1} = \bigcup_{i=1}^{M} A_i, \qquad A_i = \tilde{A}_i \cup (-\tilde{A}_i), \qquad \tilde{A}_i \cap (-\tilde{A}_i) = \emptyset.$$
Then $\tilde{A}_1, \dots, \tilde{A}_M, (-\tilde{A}_1) \cup \dots \cup (-\tilde{A}_M)$ is a covering of $S^{N-1}$ by $M + 1$ closed sets and none of them contains antipodal points. This contradicts the covering result of Lusternik


and Schnirelmann (if $M = N - 1$, we can apply directly Exercise 4.3.138; if $M < N - 1$, we complete the above covering by $N - 1 - M$ empty sets and apply again Exercise 4.3.138).

Similarly to Definition 6.4.44 we can define an infinite dimensional projective space $P^\infty := \{(u,-u) : u \in S\}$ where $S = \partial B(o;1)$ is the boundary of the unit ball $B(o;1)$ in an infinite dimensional Banach space. Then (6.4.36) immediately yields that $\operatorname{cat}_{P^\infty}(P^\infty) = \infty$.

Example 6.4.45. Let $f \colon \mathbb{R}^N \to \mathbb{R}$ be a continuously differentiable function. Since $S^{N-1}$ is compact, it follows from the Extreme Value Theorem that there exists
$$d > \sup_{u \in S^{N-1}} f(u).$$
It follows then from Theorem 6.4.42 that the number of critical points on $S^{N-1}$ is greater than or equal to $\operatorname{cat}_{S^{N-1}}(S^{N-1}) = 2$. However, this result is trivial. On the other hand, if $f$ is even, we can think of $f$ as a continuous mapping from $P^{N-1}$ into $\mathbb{R}$. Then by (6.4.36) and Theorem 6.4.42, $f$ has at least $N$ critical points in $P^{N-1}$, to which $N$ pairs $(-u, u)$ of critical points of $f$ on $S^{N-1}$ correspond. This is a nontrivial result.

If we combine this example and Theorem 6.4.42, we get the following assertion.

Theorem 6.4.46. Let $H$ be a real (separable) Hilbert space, $\dim H = \infty$, let the functional $\varphi \in C^1(H,\mathbb{R})$ be bounded below, even, and let it satisfy (PS)$_c$ on $\partial B(o;1) \subset H$ for any $c \ge \inf_{u \in \partial B(o;1)} \varphi(u)$. Then $\varphi|_{\partial B(o;1)}$ possesses infinitely many distinct pairs of critical points.

Proof. Since
$$\partial B(o;1) = \{u \in H : \psi(u) = 1\} \quad\text{where}\quad \psi(u) = (u,u)$$
is of the class $C^2(H,\mathbb{R})$, we can apply Theorem 6.4.42. Indeed, since $\varphi$ and $\psi$ are even, we can identify the antipodal points and define
$$X := \{x = (u,-u) : u \in H\}, \qquad V := \{x \in X : \psi(u) = 1\}.$$
Since $V = P^\infty$, we have
$$\operatorname{cat}_V(V) = \infty.$$
This completes the proof.
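The finite dimensional statement of Example 6.4.45 can be illustrated on a concrete even function. The sketch below is only an illustration under stated assumptions: it takes the hypothetical quadratic form $f(u) = \sum_i a_i u_i^2$ on $S^{N-1}$ with distinct coefficients `a` (not taken from the text) and checks that the $N$ antipodal pairs $(e_i, -e_i)$ are critical points of $f$ restricted to the sphere, by testing that the tangential part of the gradient vanishes there.

```python
import math

def tangential_gradient(a, u):
    """Projection of grad f(u) onto the tangent space of the unit sphere at u,
    for the even function f(u) = sum(a_i * u_i**2)."""
    grad = [2.0 * ai * ui for ai, ui in zip(a, u)]
    radial = sum(g * ui for g, ui in zip(grad, u))
    return [g - radial * ui for g, ui in zip(grad, u)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

N = 4
a = [1.0, 2.0, 3.0, 4.0]  # hypothetical distinct coefficients

# Collect the antipodal pairs (e_i, -e_i) at which the tangential gradient vanishes.
critical_pairs = []
for i in range(N):
    e = [1.0 if j == i else 0.0 for j in range(N)]
    minus_e = [-x for x in e]
    if norm(tangential_gradient(a, e)) < 1e-12 and norm(tangential_gradient(a, minus_e)) < 1e-12:
        critical_pairs.append((e, minus_e))

# A point of the sphere that is not a coordinate direction is not critical.
u_generic = [0.5, 0.5, 0.5, 0.5]
generic_residual = norm(tangential_gradient(a, u_generic))

print(len(critical_pairs), generic_residual)
```

For this particular $f$ the count of pairs equals $N$, which matches the lower bound $\operatorname{cat}_{P^{N-1}}(P^{N-1}) = N$; for a general even $f$ the theorem only guarantees at least $N$ pairs.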

Now we illustrate the connection between the critical points of functionals on manifolds in Banach spaces and the nonlinear eigenvalue problems. We present this fact by a simple example.

Example 6.4.47. Set $X := W_0^{1,p}(0,1)$, $p \ge 2$,
$$\varphi(x) := \int_0^1 |\dot{x}(t)|^p\,dt, \qquad \psi(x) := \int_0^1 |x(t)|^p\,dt, \qquad x \in X.$$
Then $\varphi$ and $\psi$ satisfy all the above assumptions. The functional $\varphi$ is bounded below on $V$ by $\lambda_1$ (see Example 6.3.5) and satisfies (PS)$_c$ on $V$ for any level $c \ge \lambda_1$. Indeed, let
$$\varphi(x_n) \to c, \qquad \|\varphi'(x_n)\|_* \to 0 \qquad\text{for}\quad x_n \in V. \tag{6.4.37}$$
The first convergence in (6.4.37) implies that $\{x_n\}_{n=1}^\infty$ is a bounded sequence in $X$. Then the reflexivity of $X$ implies that without loss of generality we can assume $x_n \rightharpoonup x$ in $X$,


and by the compact embedding $X = W_0^{1,p}(0,1) \subset\subset L^p(0,1)$ also $x_n \to x$ in $L^p(0,1)$. But then $x \in V$, i.e., $x \neq o$. It follows from (6.4.37) that
$$\Bigl\langle \Bigl(\frac{\varphi}{\psi}\Bigr)'(x_n), w \Bigr\rangle = \langle \varphi'(x_n), w \rangle - \varphi(x_n) \langle \psi'(x_n), w \rangle \to 0 \tag{6.4.38}$$
uniformly for all $w \in X$, $\|w\| \le R$ (cf. Exercise 6.4.53). We can take $w := x_n - x$ in (6.4.38) (note that $\{x_n\}_{n=1}^\infty$ is bounded in $X$). Hence
$$\int_0^1 |\dot{x}_n(t)|^{p-2} \dot{x}_n(t) (\dot{x}_n(t) - \dot{x}(t))\,dt - \varphi(x_n) \int_0^1 |x_n(t)|^{p-2} x_n(t) (x_n(t) - x(t))\,dt \to 0.$$
Since also
$$\langle \varphi'(x), x_n - x \rangle = \int_0^1 |\dot{x}(t)|^{p-2} \dot{x}(t) (\dot{x}_n(t) - \dot{x}(t))\,dt \to 0$$
by the weak convergence $x_n \rightharpoonup x$ in $X$ and
$$\int_0^1 |x_n(t)|^{p-2} x_n(t) (x_n(t) - x(t))\,dt \to 0$$
by the compact embedding $X = W_0^{1,p}(0,1) \subset\subset L^p(0,1)$, we obtain
$$\int_0^1 \bigl(|\dot{x}_n(t)|^{p-2} \dot{x}_n(t) - |\dot{x}(t)|^{p-2} \dot{x}(t)\bigr) (\dot{x}_n(t) - \dot{x}(t))\,dt \to 0.$$
However, for $p \ge 2$ we have
$$\int_0^1 \bigl(|\dot{x}_n(t)|^{p-2} \dot{x}_n(t) - |\dot{x}(t)|^{p-2} \dot{x}(t)\bigr) (\dot{x}_n(t) - \dot{x}(t))\,dt \ge 2^{2-p} \int_0^1 |\dot{x}_n(t) - \dot{x}(t)|^p\,dt,$$
i.e., $x_n \to x$. It follows from Theorem 6.4.46 that $\varphi$ has infinitely many distinct pairs of critical points $(x_i, -x_i)$, $i = 1, 2, \dots$, $x_i \in V$. It follows from Proposition 6.4.35 that there exist $\lambda_i$, $i = 1, 2, \dots$, such that
$$\varphi'(x_i) = \lambda_i \psi'(x_i).$$
But, since $\langle \psi'(x_i), x_i \rangle = p\psi(x_i)$, $\langle \varphi'(x_i), x_i \rangle = p\varphi(x_i)$, we have
$$\varphi(x_i) = \lambda_i, \qquad i = 1, 2, \dots,$$
i.e., the critical values of $\varphi|_V$ are the eigenvalues of the problem
$$\int_0^1 |\dot{x}(t)|^{p-2} \dot{x}(t) \dot{y}(t)\,dt = \lambda \int_0^1 |x(t)|^{p-2} x(t) y(t)\,dt. \tag{6.4.39}$$
From the proof of Theorem 6.4.42 we have
$$\lambda_i = \inf_{A \in \mathcal{A}_i}\, \sup_{x \in A} \varphi(x) \tag{6.4.40}$$
where
$$\mathcal{A}_i := \{A \subset V : A \text{ is closed, symmetric},\ \operatorname{cat}_V(A) \ge i\}.$$
We prove that
$$\lim_{i \to \infty} \lambda_i = \infty. \tag{6.4.41}$$
For this purpose we have to exclude the following two cases:
Case 1. There exists $n \in \mathbb{N}$ such that $\lambda_m = \lambda_n$ for any $m \ge n$.
Case 2. Case 1 does not occur but there exists $\Lambda \in \mathbb{R}$ such that $\lambda_i \nearrow \Lambda$.


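If Case 1 occurs, then necessarily $\operatorname{cat}_{\varphi^{\lambda_m + 1}}(K_{\lambda_m}) = \infty$ by Theorem 6.4.41. However, the (PS)$_{\lambda_m}$ condition implies that $K_{\lambda_m}$ is a compact set and hence $\operatorname{cat}_{\varphi^{\lambda_m + 1}}(K_{\lambda_m}) < \infty$ (see Exercise 6.4.52), i.e., Case 1 is excluded.

If Case 2 occurs, then we argue as follows. Let $\varepsilon > 0$ be specified later and denote
$$K := \{x \in V : \lambda_1 \le \varphi(x) \le \Lambda + \varepsilon,\ \|\varphi'(x)\|_* = 0\}.$$
By the (PS)$_c$ condition we know that $K$ is compact, and hence $j := \operatorname{cat}_V(K) < \infty$. According to Proposition 6.4.33 there exists a closed neighborhood $\tilde{K}$ of $K$ in $V$ such that $\operatorname{cat}_V(\tilde{K}) = j$. In particular, if we set
$$S := \varphi^{\Lambda + \varepsilon} \setminus \tilde{K},$$
we can apply Lemma 6.4.38. Indeed, choose $\varepsilon$ and $\delta$ such that the assumptions of Lemma 6.4.38 are satisfied with $c = \Lambda$, and let $m$ be the smallest integer such that $\lambda_m > \Lambda - \varepsilon > \lambda_1$. Choose $A \in \mathcal{A}_{j+m}$ such that $\sup_{x \in A} \varphi(x) \le \Lambda + \varepsilon$ (this is possible due to the variational characterization of $\lambda_{j+m}$, see (6.4.40)) and set
$$B := \overline{A \setminus \tilde{K}}$$
(the closure is taken in the topology of $V$). Then according to Lemma 6.4.30(ii),
$$\operatorname{cat}_V(B) \ge m, \quad\text{i.e.,}\quad B \in \mathcal{A}_m.$$
It follows from Lemma 6.4.30(iii) that
$$\operatorname{cat}_V(\eta(1,B)) \ge \operatorname{cat}_V(B) \ge m, \quad\text{i.e.,}\quad \eta(1,B) \in \mathcal{A}_m.$$
But then according to Lemma 6.4.38(ii),
$$\lambda_m \le \sup_{x \in B} \varphi(\eta(1,x)) \le \Lambda - \varepsilon < \lambda_m,$$
a contradiction. Hence Case 2 is also excluded, and (6.4.41) is proved.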
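The strong convergence step in Example 6.4.47 rests on a pointwise monotonicity inequality for the map $s \mapsto |s|^{p-2}s$. A standard form of it, for real $a$, $b$ and $p \ge 2$, is $(|a|^{p-2}a - |b|^{p-2}b)(a - b) \ge 2^{2-p}|a - b|^p$; the constant $2^{2-p}$ is the one assumed in this sketch. The following check samples random values and records the smallest observed ratio of the two sides. It is a numerical sanity check, not a proof, and the sampling ranges are arbitrary choices.

```python
import random

def lhs(a, b, p):
    """(|a|^(p-2) a - |b|^(p-2) b) * (a - b) for real a, b."""
    def phi(s):
        return abs(s) ** (p - 2) * s
    return (phi(a) - phi(b)) * (a - b)

def rhs(a, b, p):
    """2^(2-p) * |a - b|^p."""
    return 2.0 ** (2 - p) * abs(a - b) ** p

random.seed(1)
worst_ratio = float("inf")
for _ in range(20000):
    p = random.uniform(2.0, 6.0)
    a = random.uniform(-3.0, 3.0)
    b = random.uniform(-3.0, 3.0)
    if abs(a - b) < 1e-3:      # skip nearly equal points to avoid cancellation noise
        continue
    worst_ratio = min(worst_ratio, lhs(a, b, p) / rhs(a, b, p))

# The antipodal choice b = -a gives equality, so the constant cannot be improved.
equality_gap = abs(lhs(1.0, -1.0, 4.0) - rhs(1.0, -1.0, 4.0))
print(worst_ratio, equality_gap)
```

Integrating the pointwise inequality with $a = \dot{x}_n(t)$ and $b = \dot{x}(t)$ over $(0,1)$ gives the integral estimate used to conclude $x_n \to x$ in $X$.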

Remark 6.4.48. Using the technique of ordinary differential equations it is possible to prove that the set $\{\lambda_n\}_{n=1}^\infty$ represents all eigenvalues of (6.4.39) and that every $\lambda_n$, $n = 1, 2, \dots$, is simple (see, e.g., Elbert [47], Došlý [37] and references therein). The same approach as above can be used to prove the existence of an infinite sequence of eigenvalues, approaching infinity, of the $p$-Laplacian in more dimensions. Contrary to the one-dimensional case it is not clear if such a sequence exhausts all the eigenvalues or not. This has been a long standing and challenging open problem of nonlinear analysis. Note that the assumption $p \ge 2$ can be relaxed to $p > 1$. However, $V$ is not a manifold of the class $C^2$ for $1 < p < 2$ and so a more general approach must be employed (see, e.g., Ghoussoub [58]).

Remark 6.4.49. Similar and more general minimax arguments can be found in literature where instead of the (Lusternik–Schnirelmann) category a more general concept of the relative category is used.


One can develop an abstract index theory where the index of a set (an analogue of the category) satisfies certain axioms and similar results to those in this section can be proved. The reader can find various kinds of indices: Krasnoselski genus, $S^1$-index of Benci, cohomological index of Fadell–Rabinowitz, etc. (see, e.g., Zeidler [136] and references therein). The notion of a category is, in a certain sense, a maximal function satisfying the key properties of Lemma 6.4.30 (cf. Exercise 6.4.54).

Exercise 6.4.50. Give the proof of Lemma 6.4.38 in detail. Watch carefully for the moment when the assumption $\psi \in C^2(X,\mathbb{R})$ is essential.

Exercise 6.4.51. Every closed set $A^* \subset P^{N-1}$ corresponds to a symmetric closed set $A \subset S^{N-1}$ and vice versa as follows:
$$x \in A \quad\text{if and only if}\quad (x,-x) \in A^*.$$
Prove that $\operatorname{cat}_{P^{N-1}}(A^*) = 1$ if and only if there exists $\tilde{A}$ such that
$$A = \tilde{A} \cup (-\tilde{A}) \quad\text{and}\quad \tilde{A} \cap (-\tilde{A}) = \emptyset.$$
Hint. $\operatorname{cat}_{P^{N-1}}(A^*) = 1$ if and only if there exist an odd continuous mapping $f \colon [0,1] \times S^{N-1} \to S^{N-1}$ and a point $a \in S^{N-1}$ such that for $(x,-x) \in A^*$ we have
$$(f(0,x), f(0,-x)) = (-x, x) \quad\text{and}\quad (f(1,x), f(1,-x)) = (a,-a);$$
take $\tilde{A} = \{x \in S^{N-1} : f(1,x) = a\}$.

Exercise 6.4.52. Prove that if $K$ is a compact subset of a manifold $V$ of the class $C^2$, then $\operatorname{cat}_V(K) < \infty$.
Hint. For any $u \in K$ there exists $B(u;R(u))$ such that $B(u;R(u)) \cap V$ is contractible to a point in $V$ (use Remark 6.4.43(i) and the fact that $V$ is a manifold of the class $C^2$); $\bigcup_{u \in K} (B(u;R(u)) \cap V)$ is an open covering of $K$, choose $\bigcup_{i=1}^{n} B(u_i;R(u_i))$ a finite subcovering of $K$, use Lemma 6.4.30(ii) to show $\operatorname{cat}_V(K) \le n$.

Exercise 6.4.53. Prove (6.4.38).
Hint. Split $w := t_n x_n + y_n$, $t_n \in \mathbb{R}$, $\langle \psi'(x_n), y_n \rangle = 0$. Using the facts that $\{x_n\}_{n=1}^\infty$ is bounded in $X$ and $x_n \to x$ in $L^2(0,1)$, prove that for any $R > 0$ there exists $\hat{R} > 0$ such that $\|w\| \le R$ implies $\|y_n\| \le \hat{R}$ for all $n \in \mathbb{N}$. Now, take also into account $\psi(x_n) = 1$, $\langle \varphi'(x_n), x_n \rangle = p\varphi(x_n)$, $\langle \psi'(x_n), x_n \rangle = p$, in order to get
$$\Bigl\langle \Bigl(\frac{\varphi}{\psi}\Bigr)'(x_n), w \Bigr\rangle = \frac{\langle \varphi'(x_n), w \rangle \psi(x_n) - \varphi(x_n) \langle \psi'(x_n), w \rangle}{\psi^2(x_n)} = p t_n \varphi(x_n) + \langle \varphi'(x_n), y_n \rangle - p t_n \varphi(x_n) - \varphi(x_n) \langle \psi'(x_n), y_n \rangle = \langle \varphi'(x_n), y_n \rangle \to 0$$
uniformly with respect to $w \in X$, $\|w\| \le R$.

Exercise 6.4.54. Prove the following assertion: Let $\Phi_Y$ be a function defined on the class $\mathcal{A}$ of closed subsets $A$ of $Y$. If $\Phi_Y$ possesses properties (i)–(iii) of Lemma 6.4.30 and $\Phi_Y(A) = 1$ when $A$ consists of a single point, then
$$\Phi_Y(A) \le \operatorname{cat}_Y(A) \quad\text{for all}\quad A \in \mathcal{A}.$$


Hint. Let $\operatorname{cat}_Y(A) = 1$. Since $A$ is contractible to a point $u_0 \in Y$, the fact that $\Phi_Y$ satisfies (iii) of Lemma 6.4.30 gives $\Phi_Y(A) \le \Phi_Y(\{u_0\}) = 1$. Now the assertion follows by using a covering and (ii) of Lemma 6.4.30.

Exercise 6.4.55. Prove that the set of all $x \in V$ satisfying (i) and (ii) from Theorem 6.4.39 is not empty.

6.5 Saddle Point Theorem

The main assertion in this section, the Saddle Point Theorem, is a useful tool to prove existence of a critical point which is neither a local minimum nor a local maximum of a given functional. Let us start again by considering a real function of two independent variables
$$F \colon \mathbb{R} \times \mathbb{R} \to \mathbb{R}$$
which is continuously differentiable and satisfies the following condition: there exists $\varrho > 0$ such that
$$\inf_{y \in \mathbb{R}} F(0,y) > \max\{F(-\varrho,0), F(\varrho,0)\}. \tag{6.5.1}$$

Figure 6.5.1.

The graph of such a function is sketched in Figure 6.5.1. The impression one can get from the graph is that
$$c = \inf_{\gamma \in \Gamma}\, \max_{t \in [-\varrho,\varrho]} F(\gamma(t))$$
where
$$\Gamma = \{\gamma \in C([-\varrho,\varrho], \mathbb{R}^2) : \gamma(-\varrho) = (-\varrho,0),\ \gamma(\varrho) = (\varrho,0)\}$$
is a critical value of the functional $F$. The following example, however, shows that this is not the case in general.

Example 6.5.1. Let
$$F(x,y) = 2e^{-x^2} + e^{y}$$
(see Figure 6.5.2). Set $\varrho = 1$. Then we have
$$\inf_{y \in \mathbb{R}} F(0,y) = 2 > \max\{F(-1,0), F(1,0)\} = \frac{2}{e} + 1,$$
i.e., (6.5.1) is satisfied.

Figure 6.5.2.

On the other hand, since
$$\frac{\partial F}{\partial x}(x,y) = -4xe^{-x^2}, \qquad \frac{\partial F}{\partial y}(x,y) = e^{y},$$
there is no critical point of $F$.
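The claims of Example 6.5.1 are easy to check numerically. The sketch below evaluates the geometric condition (6.5.1) for $\varrho = 1$ and then follows the hypothetical sequence $u_n = (0,-n)$, chosen here only for illustration: along it the values of $F$ approach $2$ and the gradients approach zero, yet the points escape to infinity, which is the kind of behaviour a Palais–Smale condition rules out.

```python
import math

def F(x, y):
    """F(x, y) = 2 exp(-x^2) + exp(y) from Example 6.5.1."""
    return 2.0 * math.exp(-x * x) + math.exp(y)

def grad_F(x, y):
    """Gradient of F, computed by hand."""
    return (-4.0 * x * math.exp(-x * x), math.exp(y))

rho = 1.0
inf_on_axis = 2.0                               # inf over y of F(0, y), not attained
endpoint_max = max(F(-rho, 0.0), F(rho, 0.0))   # equals 2/e + 1
geometry_holds = inf_on_axis > endpoint_max     # condition (6.5.1)

# Illustrative sequence u_n = (0, -n): F(u_n) -> 2 and grad F(u_n) -> 0,
# but |u_n| -> infinity, so no subsequence converges.
ps_sequence = [(0.0, -float(n)) for n in range(1, 31)]
values = [F(x, y) for x, y in ps_sequence]
grad_norms = [math.hypot(*grad_F(x, y)) for x, y in ps_sequence]

print(geometry_holds, values[-1], grad_norms[-1])
```

The gradient norm stays strictly positive at every point because $\partial F / \partial y = e^y > 0$, in agreement with the absence of critical points.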

The reason why the geometric condition (6.5.1) is not sufficient to guarantee the existence of a critical point of $F$ is the same as in the previous section. If we introduce the assumption (PS)$_c$ (see page 428), then the value $c$ is a critical value of $F$ provided $F$ satisfies (PS)$_c$. Let us consider a more general situation
$$F \colon H \to \mathbb{R}$$
where $H$ is a real Hilbert space. We will use the Quantitative Deformation Lemma (Lemma 6.4.2) and prove the following analogue of Proposition 6.4.3.

Proposition 6.5.2. Let $F \in C^2(H,\mathbb{R})$ and let $H = Y \oplus Z$ where $\dim Y < \infty$ and $Z$ is a closed subspace of $H$. Moreover, assume that there is $\varrho > 0$ such that, denoting
$$M = \{u \in Y : \|u\| \le \varrho\}, \qquad M_0 = \{u \in Y : \|u\| = \varrho\},$$
we have
$$\inf_{u \in Z} F(u) > \max_{u \in M_0} F(u). \tag{6.5.2}$$
Let
$$c := \inf_{\gamma \in \Gamma}\, \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M,H) : \gamma|_{M_0} = I\}.$$
Then for each $\varepsilon > 0$ there exists $u \in H$ such that
(a) $c - 2\varepsilon \le F(u) \le c + 2\varepsilon$,
(b) $\|\nabla F(u)\| < 2\varepsilon$.

Proof. First of all we will show that
$$\tilde{c} := \inf_{u \in Z} F(u) \le c.$$
To establish this inequality it is sufficient to prove that for any $\gamma \in \Gamma$ there is a point $\tilde{u} \in M$ for which $\gamma(\tilde{u}) \in Z$. Let $P$ be a continuous projection of $H$ into $Y$ such that $\operatorname{Ker} P = Z$.


With this $P$ we wish to find a solution in $M$ of the equation
$$P\gamma(u) = o. \tag{6.5.3}$$
To do that we will use the Brouwer degree. Since $P\gamma|_{M_0} = I$, the homotopy invariance property (Proposition 5.2.6 and Theorem 5.2.7) yields that
$$\deg(P\gamma, \operatorname{int} M, o) = \deg(I, \operatorname{int} M, o) = 1.$$
Therefore (6.5.3) has a solution in $M$ (again Theorem 5.2.7).

Suppose that the conclusion of this proposition does not hold, i.e., assume that $\varepsilon > 0$ is so small that
$$\max_{u \in M_0} F(u) < c - 2\varepsilon,$$
and for all $u \in H$ satisfying (a) the condition (b) is violated. By the definition of $c$ there exists $\gamma \in \Gamma$ such that
$$\max_{u \in M} F(\gamma(u)) \le c + \varepsilon.^{66} \tag{6.5.4}$$
Consider $\beta(u) = \eta(\gamma(u))$ where $\eta$ is from Lemma 6.4.2. Using Lemma 6.4.2(i) we conclude that for $u \in M_0$ we have
$$\beta(u) = \eta(\gamma(u)) = \eta(u) = u, \quad\text{i.e.,}\quad \beta \in \Gamma, \quad\text{i.e.,}\quad c \le \max_{u \in M} F(\beta(u)).$$
On the other hand, it follows from Lemma 6.4.2(ii) and (6.5.4) that
$$\max_{u \in M} F(\beta(u)) \le c - \varepsilon,$$
a contradiction.
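The minimax value $c$ of Proposition 6.5.2 can be made concrete in the plane. The sketch below uses the hypothetical model $F(x,y) = y^2 - x^2$ on $H = \mathbb{R}^2$ with $Y$ the $x$-axis, $Z$ the $y$-axis and $\varrho = 1$; none of these choices come from the text. It evaluates $\max_{u \in M} F(\gamma(u))$ along an assumed one-parameter family of admissible paths $\gamma_s(t) = (t, s(\varrho^2 - t^2))$, each of which fixes the endpoints $M_0$ and crosses $Z$ at $t = 0$.

```python
def F(x, y):
    """Hypothetical saddle model: F(x, y) = y^2 - x^2."""
    return y * y - x * x

rho = 1.0
# Parameter grid on M = [-rho, rho]; the index 200 corresponds to t = 0.
ts = [-rho + 2.0 * rho * i / 400 for i in range(401)]

def path(s):
    """Illustrative admissible path gamma_s(t) = (t, s * (rho^2 - t^2)); endpoints are fixed."""
    return [(t, s * (rho * rho - t * t)) for t in ts]

def max_along(points):
    return max(F(x, y) for x, y in points)

boundary_max = max(F(-rho, 0.0), F(rho, 0.0))   # max of F on M0, equals -rho^2
inf_on_Z = 0.0                                  # inf of y^2 over the y-axis

maxima = {s: max_along(path(s)) for s in (-2.0, -0.5, 0.0, 0.5, 2.0)}
c_upper_bound = min(maxima.values())            # an upper bound for the minimax value

print(boundary_max, maxima, c_upper_bound)
```

Since only a family of paths is sampled, `c_upper_bound` bounds $c$ from above; the intersection argument in the proof gives $c \ge \inf_Z F = 0$, so for this model the two bounds meet at the critical value $F(0,0) = 0$.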

⁶⁶ Note that the “max” exists due to the assumption $\dim Y < \infty$.

Similarly to the previous section, employing the (PS)$_c$ condition, we have the following assertion called the Saddle Point Theorem.

Theorem 6.5.3 (Saddle Point Theorem). Let the assumptions of Proposition 6.5.2 be satisfied. Let $F$ satisfy (PS)$_c$. Then $c$ is a critical value of $F$.

Remark 6.5.4. The reader should have in mind that the Saddle Point Theorem was also proved under more general assumptions when $H$ is a Banach space and $F \in C^1(H,\mathbb{R})$. In this more general form it is attributed to Rabinowitz (see Theorem 6.5.12).

Example 6.5.5. Let us consider the boundary value problem
$$\begin{cases} -\ddot{x}(t) - x(t) = f(t) + g(x(t)), & t \in (0,\pi), \\ x(0) = x(\pi) = 0, \end{cases} \tag{6.5.5}$$

where $f \in L^2(0,\pi)$ is a given function and $g \colon \mathbb{R} \to \mathbb{R}$ is a continuous function having finite limits $\lim_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that
$$g(-\infty) < g(s) < g(+\infty) \quad\text{for all } s \in \mathbb{R}.$$
We will prove that the problem (6.5.5) has a weak solution if and only if
$$-g(+\infty) < \frac{1}{2} \int_0^\pi f(t) \sin t\,dt < -g(-\infty). \tag{6.5.6}$$
First let us prove that (6.5.6) is a necessary condition for the solvability of (6.5.5). Assume that $x \in H := W_0^{1,2}(0,\pi)$⁶⁷ is a weak solution of (6.5.5), i.e.,
$$\int_0^\pi (\dot{x}(t)\dot{y}(t) - x(t)y(t))\,dt = \int_0^\pi f(t)y(t)\,dt + \int_0^\pi g(x(t))y(t)\,dt$$
for any $y \in H$. Take $y = \sin t$, then
$$\int_0^\pi f(t) \sin t\,dt = -\int_0^\pi g(x(t)) \sin t\,dt.$$
However, an easy calculation yields
$$2g(-\infty) < \int_0^\pi g(x(t)) \sin t\,dt < 2g(+\infty).$$
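Condition (6.5.6) is straightforward to test for concrete data. The sketch below is an illustration under assumptions not made in the text: it takes $g = \arctan$, so that $g(\pm\infty) = \pm\pi/2$, and right-hand sides of the form $f(t) = k \sin t$, and it evaluates the integral with a hand-written trapezoidal rule.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# Hypothetical nonlinearity g = arctan with limits g(-inf) = -pi/2, g(+inf) = pi/2.
g_minus, g_plus = -math.pi / 2, math.pi / 2

def satisfies_condition(f):
    """Check -g(+inf) < (1/2) * int_0^pi f(t) sin t dt < -g(-inf), i.e. (6.5.6)."""
    value = 0.5 * integrate(lambda t: f(t) * math.sin(t), 0.0, math.pi)
    return -g_plus < value < -g_minus

# f(t) = sin t: the middle term equals pi/4, which lies inside (-pi/2, pi/2).
inside = satisfies_condition(lambda t: math.sin(t))
# f(t) = 3 sin t: the middle term equals 3*pi/4, which lies outside.
outside = satisfies_condition(lambda t: 3.0 * math.sin(t))
print(inside, outside)
```

For this family the condition reduces to $|k| < 2$, since $\frac{1}{2}\int_0^\pi k \sin^2 t\,dt = k\pi/4$.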

To prove that (6.5.6) is also a sufficient condition we apply Theorem 6.5.3. Define
$$F(x) = \frac{1}{2} \int_0^\pi |\dot{x}(t)|^2\,dt - \frac{1}{2} \int_0^\pi |x(t)|^2\,dt - \int_0^\pi \int_0^{x(t)} g(s)\,ds\,dt - \int_0^\pi f(t)x(t)\,dt, \qquad x \in H.$$
Let us verify that $F$ has a suitable geometry which corresponds to the Saddle Point Theorem. Let $Y = \operatorname{Lin}\{\sin t\}$, $Z = Y^\perp$.⁶⁸ For $z \in Z$ we have
$$\int_0^\pi |z(t)|^2\,dt \le \frac{1}{4} \int_0^\pi |\dot{z}(t)|^2\,dt \tag{6.5.7}$$
(cf. Exercise 6.5.6).

⁶⁷ We consider the norm $\|x\| = \bigl(\int_0^\pi |\dot{x}(t)|^2\,dt\bigr)^{1/2}$ on $H$.
⁶⁸ $H = Y \oplus Z$. Notice that $Y$ and $Z$ are orthogonal to each other also in the $L^2$-scalar product.

From (6.5.7) we get that $F$ is weakly coercive on $Z$. Namely, we have
$$F(z) \ge \frac{3}{8} \int_0^\pi |\dot{z}(t)|^2\,dt - \int_0^\pi \Bigl| \int_0^{z(t)} g(s)\,ds \Bigr|\,dt - \|f\|_{L^2(0,\pi)} \|z\|_{L^2(0,\pi)} \ge \frac{3}{8} \|z\|^2 - c\|z\|_{L^2(0,\pi)} \ge \|z\| \Bigl( \frac{3}{8} \|z\| - \frac{c}{2} \Bigr),$$

and so $F(z) \to \infty$ for $\|z\| \to \infty$, $z \in Z$. The functional $F$ is weakly sequentially lower semi-continuous on $Z$ (the argument is the same as that used in Example 6.2.6 and the reader should check it carefully). Then it follows from Theorem 6.2.8 that there exists $z_0 \in Z$ such that
$$-\infty < F(z_0) = \min_{z \in Z} F(z).$$
On the other hand,
$$F(\varrho \sin t) = \underbrace{\frac{\varrho^2}{2} \int_0^\pi \cos^2 t\,dt - \frac{\varrho^2}{2} \int_0^\pi \sin^2 t\,dt}_{=0} - \int_0^\pi \int_0^{\varrho \sin t} g(s)\,ds\,dt - \varrho \int_0^\pi f(t) \sin t\,dt.$$
Denote
$$\int_0^\sigma g(s)\,ds = G(\sigma).$$
Then
$$F(\varrho \sin t) = -\varrho \Bigl[ \int_0^\pi \frac{G(\varrho \sin t)}{\varrho}\,dt + \int_0^\pi f(t) \sin t\,dt \Bigr].$$
Since, by the l'Hospital Rule,
$$\lim_{\varrho \to \pm\infty} \frac{G(\varrho \sin t)}{\varrho} = \lim_{\varrho \to \pm\infty} g(\varrho \sin t) \sin t = g(\pm\infty) \sin t,$$
the Lebesgue Dominated Convergence Theorem and (6.5.6) yield
$$\lim_{|\varrho| \to \infty} F(\varrho \sin t) = -\infty. \tag{6.5.8}$$
Taking $\varrho_0$ large enough we then have $F(\pm\varrho_0 \sin t) < F(z_0)$, i.e., the assumptions of Theorem 6.5.3 are satisfied with
$$M = \{\varrho \sin t : \varrho \in [-\varrho_0, \varrho_0]\}, \qquad M_0 = \{-\varrho_0 \sin t, \varrho_0 \sin t\}.$$
It remains to prove that $F$ satisfies (PS)$_c$. Similarly to Example 6.4.7 we will prove that $F$ satisfies even a stronger version of (PS)$_c$:

Any sequence $\{x_n\}_{n=1}^\infty \subset H$ such that $\{F(x_n)\}_{n=1}^\infty$ is bounded in $\mathbb{R}$ and $\nabla F(x_n) \to o$ contains a convergent subsequence.

To prove it we follow the usual scheme.

Step 1. We will first show that $\{x_n\}_{n=1}^\infty$ is bounded in $H$. To do that, decompose $x_n = y_n + z_n$ where $y_n \in Y$ (i.e., $y_n(t) = \varrho_n \sin t$) and $z_n \in Z = Y^\perp$. First we prove that $\{z_n\}_{n=1}^\infty$ is bounded in $H$. To see this consider $(\nabla F(x_n), z_n)_H$. We have
$$\begin{aligned} (\nabla F(x_n), z_n) &= \int_0^\pi \dot{x}_n(t)\dot{z}_n(t)\,dt - \int_0^\pi x_n(t)z_n(t)\,dt - \int_0^\pi z_n(t)g(x_n(t))\,dt - \int_0^\pi f(t)z_n(t)\,dt \\ &= \|z_n\|^2 - \|z_n\|^2_{L^2(0,\pi)}{}^{69} - \int_0^\pi z_n(t)g(x_n(t))\,dt - \int_0^\pi f(t)z_n(t)\,dt \\ &\ge \frac{3}{4} \|z_n\|^2 - k\|z_n\|_{L^2(0,\pi)} \ge \frac{3}{4} \|z_n\|^2 - \frac{k}{2} \|z_n\| \end{aligned}$$
with a positive constant $k$ (for the last two inequalities we have used (6.5.7)). Since we have assumed that $\nabla F(x_n) \to o$ we know that
$$\|\nabla F(x_n)\| \le \text{const.} \quad\text{for all sufficiently large } n.$$
In particular, this means that for all these $n$ the inequalities
$$(\nabla F(x_n), z_n) \le \|z_n\|$$
hold. Hence
$$\frac{3}{4} \|z_n\|^2 - \frac{k}{2} \|z_n\| \le \|z_n\|,$$
and the boundedness of $\{z_n\}_{n=1}^\infty$ is shown. For the investigation of $y_n$ we will use the boundedness of $F(x_n)$. We have
$$\begin{aligned} F(x_n) = F(y_n + z_n) &= \frac{1}{2} \|y_n\|^2 + \frac{1}{2} \|z_n\|^2 - \frac{1}{2} \|y_n\|^2_{L^2(0,\pi)} - \frac{1}{2} \|z_n\|^2_{L^2(0,\pi)} \\ &\quad - \int_0^\pi \int_0^{y_n(t)} g(s)\,ds\,dt - \int_0^\pi \int_{y_n(t)}^{y_n(t)+z_n(t)} g(s)\,ds\,dt - \int_0^\pi f(t)y_n(t)\,dt - \int_0^\pi f(t)z_n(t)\,dt \\ &= F(y_n) + \frac{1}{2} \|z_n\|^2 - \frac{1}{2} \|z_n\|^2_{L^2(0,\pi)} - \int_0^\pi \int_{y_n(t)}^{y_n(t)+z_n(t)} g(s)\,ds\,dt - \int_0^\pi f(t)z_n(t)\,dt. \end{aligned}$$

⁶⁹ See footnote 68 on page 460.

By the boundedness of $\{z_n\}_{n=1}^\infty$, $g$ and $\{F(x_n)\}_{n=1}^\infty$, we obtain that $\{F(y_n)\}_{n=1}^\infty$ is bounded. Since $y_n(t) = \varrho_n \sin t$ and $\lim_{|\varrho| \to \infty} F(\varrho \sin t) = -\infty$ (see (6.5.8)), $\{\varrho_n\}_{n=1}^\infty$ and also $\|y_n\|$ have to be bounded.

Step 2. Passing to a subsequence if necessary we may assume that $x_n \rightharpoonup x_0$ in $H$ and $x_n \to x_0$ in $C[0,\pi]$. Since $\nabla F(x_n) \to o$ we get
$$(\nabla F(x_n) - \nabla F(x_0), x_n - x_0) \to 0 \quad\text{as } n \to \infty.$$
This means that
$$\int_0^\pi |\dot{x}_n(t) - \dot{x}_0(t)|^2\,dt - \int_0^\pi |x_n(t) - x_0(t)|^2\,dt - \int_0^\pi (g(x_n(t)) - g(x_0(t)))(x_n(t) - x_0(t))\,dt - \int_0^\pi f(t)(x_n(t) - x_0(t))\,dt \to 0.$$
However, the last three integrals tend to zero, i.e.,
$$\|x_n - x_0\| \to 0 \quad\text{as } n \to \infty.$$

Hence $x_n \to x_0$ in $H$.

Exercise 6.5.6. Prove the inequality (6.5.7) for $z \in Z$.
Hint. Remember that $\bigl\{\sqrt{2/\pi}\,\sin nt\bigr\}_{n=1}^\infty$, $\bigl\{\sqrt{2/\pi}\,\cos nt\bigr\}_{n=0}^\infty$ form an orthonormal basis in $L^2(0,\pi)$. For $z \in Z$ one has
$$z(t) = \sum_{k=2}^\infty a_k \sin kt$$
and (the Parseval equality)
$$\|z\|^2_{L^2(0,\pi)} = \frac{\pi}{2} \sum_{k=2}^\infty a_k^2.$$
A similar argument with $\dot{z}$ and integration by parts leads to (6.5.7).

Exercise 6.5.7. Find conditions on $\Phi \colon [0,1] \times \mathbb{R} \to \mathbb{R}$ such that the procedure for proving the existence of a solution of (6.5.5) could be used for the boundary value problem
$$\begin{cases} -\ddot{x}(t) = \dfrac{\partial \Phi}{\partial x}(t, x(t)), & t \in (0,1), \\ x(0) = x(1) = 0. \end{cases}$$

Exercise 6.5.8. Consider the boundary value problem
$$\begin{cases} -\ddot{x}(t) - \lambda x(t) = g(t, x(t)), & t \in (0,\pi), \\ x(0) = x(\pi) = 0. \end{cases} \tag{6.5.9}$$

Formulate conditions on λ and g which guarantee that the energy functional associated with (6.5.9) has a geometry corresponding to the Saddle Point Theorem.
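Inequality (6.5.7), the subject of Exercise 6.5.6, can be sanity-checked through the Parseval equality from its hint. The sketch below assumes $z(t) = \sum_{k \ge 2} a_k \sin kt$ with finitely many randomly chosen coefficients (an arbitrary test family) and compares $\|z\|^2_{L^2(0,\pi)} = \frac{\pi}{2}\sum a_k^2$ with $\|\dot{z}\|^2_{L^2(0,\pi)} = \frac{\pi}{2}\sum k^2 a_k^2$.

```python
import math
import random

def l2_norms_squared(coeffs):
    """For z(t) = sum over k of coeffs[k] * sin(k t) on (0, pi), return
    (||z||^2, ||z'||^2) in L^2(0, pi), computed via the Parseval equality."""
    z_sq = (math.pi / 2) * sum(a * a for a in coeffs.values())
    dz_sq = (math.pi / 2) * sum((k * a) ** 2 for k, a in coeffs.items())
    return z_sq, dz_sq

random.seed(2)
ratios = []
for _ in range(1000):
    # Only modes k >= 2 appear, which encodes z being orthogonal to sin t.
    coeffs = {k: random.uniform(-1.0, 1.0) for k in range(2, 12)}
    z_sq, dz_sq = l2_norms_squared(coeffs)
    ratios.append(z_sq / dz_sq)
worst = max(ratios)

# The single mode sin 2t attains the constant 1/4.
z_sq, dz_sq = l2_norms_squared({2: 1.0})
extremal_ratio = z_sq / dz_sq
print(worst, extremal_ratio)
```

The ratio never exceeds $1/4$ because every admissible mode has $k^2 \ge 4$; dropping the orthogonality to $\sin t$ would allow $k = 1$ and the ratio $1$.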


Exercise 6.5.9. Consider the Dirichlet boundary value problem
$$\begin{cases} -\ddot{x}(t) = h(t, x(t)), & t \in (0,\pi), \\ x(0) = x(\pi) = 0. \end{cases} \tag{6.5.10}$$
Formulate conditions on $h = h(t,x)$ which guarantee that (6.5.10) has a weak solution.

Exercise 6.5.10. Consider the Neumann boundary value problem
$$\begin{cases} -\ddot{x}(t) = h(t, x(t)), & t \in (0,\pi), \\ \dot{x}(0) = \dot{x}(\pi) = 0. \end{cases} \tag{6.5.11}$$
Formulate conditions on $h = h(t,x)$ which guarantee that (6.5.11) has a weak solution.

Exercise 6.5.11. Consider the Dirichlet boundary value problem
$$\begin{cases} -\ddot{x}(t) - n^2 x(t) = f(t) + g(x(t)), & t \in (0,\pi), \\ x(0) = x(\pi) = 0, \end{cases} \tag{6.5.12}$$
where $n \in \mathbb{N}$, $f \in L^2(0,\pi)$, $g \colon \mathbb{R} \to \mathbb{R}$ is a continuous function having finite limits $\lim_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that $g(-\infty) < g(s) < g(+\infty)$ for all $s$. Prove that (6.5.12) has a weak solution if and only if
$$g(-\infty) \int_0^\pi (\sin nt)^-\,dt - g(+\infty) \int_0^\pi (\sin nt)^+\,dt < \int_0^\pi f(t) \sin nt\,dt < g(+\infty) \int_0^\pi (\sin nt)^-\,dt - g(-\infty) \int_0^\pi (\sin nt)^+\,dt$$
where $(\sin nt)^+$ and $(\sin nt)^-$ are the positive and the negative part of $\sin nt$, respectively.
Hint. Modify the estimates from Example 6.5.5.
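The bounds in Exercise 6.5.11 depend on $n$ only through the two integrals of the positive and negative parts of $\sin nt$. The sketch below computes them by a midpoint rule and assembles the admissible interval for $\int_0^\pi f(t)\sin nt\,dt$. It assumes the convention that the negative part is taken as a nonnegative function, $(\sin nt)^- = \max\{-\sin nt, 0\}$, and the helper names are hypothetical.

```python
import math

def pos_neg_integrals(n, steps=20000):
    """Midpoint-rule approximations of the integrals over (0, pi) of the positive
    part and of the (nonnegative) negative part of sin(n t)."""
    h = math.pi / steps
    pos = 0.0
    neg = 0.0
    for i in range(steps):
        s = math.sin(n * (i + 0.5) * h)
        if s > 0.0:
            pos += s * h
        else:
            neg += -s * h
    return pos, neg

def admissible_interval(n, g_minus, g_plus):
    """Open interval in which the integral of f(t) sin(n t) must lie, following
    the two-sided condition of Exercise 6.5.11."""
    pos, neg = pos_neg_integrals(n)
    lower = g_minus * neg - g_plus * pos
    upper = g_plus * neg - g_minus * pos
    return lower, upper

# Example data (an assumption): g = arctan, so g(-inf) = -pi/2 and g(+inf) = pi/2.
for n in (1, 2, 3):
    print(n, pos_neg_integrals(n), admissible_interval(n, -math.pi / 2, math.pi / 2))
```

For $n = 1$ the negative part vanishes and the interval reduces to $(-2g(+\infty), -2g(-\infty))$, which is condition (6.5.6) multiplied by two.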

6.5A Linking Theorem

The aim of this appendix is to apply the General Minimax Principle (see Theorem 6.4.22) from Appendix 6.4A and to generalize the assertion of Theorem 6.5.3. Namely, we start with the Saddle Point Theorem (cf. Remark 6.5.4).

Theorem 6.5.12 (Saddle Point Theorem, Rabinowitz). Let $X = Y \oplus Z$ be a Banach space with $Z$ closed in $X$ and $\dim Y < \infty$. For $\varrho > 0$ define
$$M := \{u \in Y : \|u\| \le \varrho\}, \qquad M_0 := \{u \in Y : \|u\| = \varrho\}.$$
Let $F \in C^1(X,\mathbb{R})$ be such that
$$b := \inf_{u \in Z} F(u) > a := \max_{u \in M_0} F(u).$$
If $F$ satisfies the (PS)$_c$ condition with
$$c := \inf_{\gamma \in \Gamma}\, \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M,X) : \gamma|_{M_0} = I\},$$
then $c$ is a critical value of $F$.

Proof. We set $\Gamma_0 = \{I\}$ and apply Theorem 6.4.22 and Corollary 6.4.23. For this purpose it is enough to verify that $c \ge b$. Let us prove that $\gamma(M) \cap Z \neq \emptyset$ for every $\gamma \in \Gamma$. Denote by $P$ the projection onto $Y$ such that $PZ = \{o\}$. If $\gamma(M) \cap Z = \emptyset$, then the map
$$u \mapsto \varrho\,\frac{P\gamma(u)}{\|P\gamma(u)\|}$$
is a retraction⁷⁰ of the ball $M$ onto its boundary $M_0$ in the space $Y$. But this is impossible since $\dim Y < \infty$. Hence, for every $\gamma \in \Gamma$,
$$\max_{u \in M} F(\gamma(u)) \ge \inf_{u \in Z} F(u), \quad\text{i.e.,}\quad c \ge b.$$
This completes the proof.

We postpone the application of Theorem 6.5.12 to Appendix 7.7A. We prove now the Linking Theorem.

Theorem 6.5.13 (Linking Theorem, Rabinowitz). Let $X = Y \oplus Z$ be a Banach space with $Z$ closed in $X$ and $\dim Y < \infty$. Let $\varrho > r > 0$ and let $z \in Z$ be such that $\|z\| = r$. Define
$$M := \{u = y + \lambda z : \|u\| \le \varrho,\ \lambda \ge 0,\ y \in Y\}, \qquad N := \{u \in Z : \|u\| = r\},$$
$$M_0 := \{u = y + \lambda z : y \in Y,\ \|u\| = \varrho \text{ and } \lambda \ge 0, \text{ or } \|u\| \le \varrho \text{ and } \lambda = 0\}.$$
Let $F \in C^1(X,\mathbb{R})$ be such that
$$b := \inf_{u \in N} F(u) > a := \max_{u \in M_0} F(u).$$
If $F$ satisfies the (PS)$_c$ condition with
$$c := \inf_{\gamma \in \Gamma}\, \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M,X) : \gamma|_{M_0} = I\},$$
then $c$ is a critical value of $F$.

⁷⁰ For the notion of the retraction see Exercise 6.5.16.

Proof. We set $\Gamma_0 = \{I\}$ and apply Theorem 6.4.22 and Corollary 6.4.23. As in the previous proof it is sufficient to verify that $c \ge b$. Now, we prove that $\gamma(M) \cap N \neq \emptyset$ for every $\gamma \in \Gamma$. Denote by $P$ the projection onto $Y$ such that $PZ = \{o\}$, and by $R$ the retraction of $(Y \oplus \mathbb{R}z) \setminus \{z\}$ to $M_0$,⁷¹ see Figure 6.5.3. If $\gamma(M) \cap N = \emptyset$, then the map
$$u \mapsto R\Bigl( P\gamma(u) + \frac{1}{r} \|(I - P)\gamma(u)\|\,z \Bigr)$$
is well defined on $M$ and hence it is a retraction of $M$ to its boundary $M_0$. This is impossible since $M$ is homeomorphic to a finite dimensional ball (see Exercise 6.5.16). Hence for every $\gamma \in \Gamma$ we obtain
$$\max_{u \in M} F(\gamma(u)) \ge \inf_{u \in N} F(u), \quad\text{i.e.,}\quad c \ge b.$$

Figure 6.5.3.

⁷¹ By $\mathbb{R}z$ we denote the set $\{tz : t \in \mathbb{R}\} \subset Z$.

As an application we give the following example.

Example 6.5.14. Let us consider the boundary value problem
$$\begin{cases} -\ddot{x}(t) + a(t)x(t) = f(t, x(t)), & t \in (0,1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.5.13}$$
where $a = a(t)$ is a continuous function on $[0,1]$ and $f = f(t,s)$ is a continuous function on $[0,1] \times \mathbb{R}$ satisfying some additional hypotheses formulated below. It follows from the Sturm–Liouville theory for linear ordinary differential equations of the second order (see Example 2.2.17 and, e.g., Walter [131]) that the eigenvalues

$\{\lambda_n\}_{n=1}^\infty$ of
$$\begin{cases} -\ddot{x}(t) + a(t)x(t) = \lambda x(t), & t \in (0,1), \\ x(0) = x(1) = 0 \end{cases}$$
form a strictly increasing sequence where each eigenvalue is simple and $\lim_{n \to \infty} \lambda_n = \infty$. If we denote by $e_1, e_2, \dots, e_n, \dots$ the corresponding eigenfunctions, then they are mutually orthogonal in $L^2(0,1)$. Suppose that there exists $k \in \mathbb{N}$ such that
$$\lambda_1 < \lambda_2 < \dots < \lambda_k < 0 < \lambda_{k+1} < \lambda_{k+2} < \cdots.$$
Let us assume that $f$ satisfies the following assumptions:
(f1) there exist $p \in (1,2)$ and $c > 0$ such that $|f(t,s)| \le c\bigl(1 + |s|^{p-1}\bigr)$ for any $t \in [0,1]$, $s \in \mathbb{R}$;
(f2) there exist $\alpha > 2$ and $R > 0$ such that for $t \in [0,1]$ and $|s| > R$ we have $0 < \alpha \int_0^s f(t,\sigma)\,d\sigma \le s f(t,s)$;
(f3) $f(t,s) = o(|s|)$ as $|s| \to 0$ uniformly on $[0,1]$;
(f4) $\lambda_k \dfrac{s^2}{2} \le \int_0^s f(t,\sigma)\,d\sigma$ for all $s \in \mathbb{R}$ and $t \in [0,1]$.

We now consider the functional
$$F(x) := \int_0^1 \Bigl[ \frac{1}{2} |\dot{x}(t)|^2 + \frac{1}{2} a(t)|x(t)|^2 - \int_0^{x(t)} f(t,s)\,ds \Bigr]\,dt$$
on $H := W_0^{1,2}(0,1)$.⁷² The critical points of $F$ correspond to weak solutions of (6.5.13). Our plan is to apply the Linking Theorem (Theorem 6.5.13) so as to prove that $F$ has a critical point. Then the existence of a solution of (6.5.13) will follow from a regularity argument similar to that from Theorem 6.1.13. Denote by
$$\psi(x) := \int_0^1 \int_0^{x(t)} f(t,s)\,ds\,dt$$
the functional defined on the Sobolev space $H$. Then $\psi$ is of the class $C^1(H,\mathbb{R})$, and
$$(\psi'(x), h) = \int_0^1 f(t, x(t)) h(t)\,dt \tag{6.5.14}$$
(cf. Section 3.2). The fact $\psi \in C^1(H,\mathbb{R})$ implies immediately that $F \in C^1(H,\mathbb{R})$ as well. Let us define
$$Y := \operatorname{Lin}\{e_1, \dots, e_k\}, \qquad Z := \Bigl\{ x \in H : \int_0^1 x(t)y(t)\,dt = 0,\ y \in Y \Bigr\}.$$

⁷² Note that we use the norm $\|x\| = \|\dot{x}\|_{L^2(0,1)}$.

Then
$$\delta := \inf_{\substack{x \in Z \\ \|x\| = 1}} \int_0^1 \bigl(|\dot{x}(t)|^2 + a(t)|x(t)|^2\bigr)\,dt > 0. \tag{6.5.15}$$
Indeed, by definition, on $Z$ we have (see Remark 6.3.13)
$$\int_0^1 \bigl(|\dot{x}(t)|^2 + a(t)|x(t)|^2\bigr)\,dt \ge \lambda_{k+1} \int_0^1 |x(t)|^2\,dt.$$
Since $\inf_{\|x\| = 1} \|x\|_{L^2(0,1)} = 0$ we need some more work in order to establish (6.5.15). Consider a minimizing sequence $\{x_n\}_{n=1}^\infty \subset Z$:
$$\|x_n\| = 1, \qquad \int_0^1 \bigl(|\dot{x}_n(t)|^2 + a(t)|x_n(t)|^2\bigr)\,dt \to \delta.$$
Going to a subsequence if necessary, we may assume $x_n \rightharpoonup x$ in $H$, i.e., $x_n \to x$ in $L^2(0,1)$ by the compact embedding $H = W_0^{1,2}(0,1) \subset\subset L^2(0,1)$ (cf. Theorem 1.2.28). The continuity of $a = a(t)$ in $[0,1]$ then implies that
$$\int_0^1 a(t)|x_n(t)|^2\,dt \to \int_0^1 a(t)|x(t)|^2\,dt.$$
Since $Z$ is weakly closed and the norm on $H$ is weakly lower semicontinuous, we obtain
$$\delta = 1 + \int_0^1 a(t)|x(t)|^2\,dt \ge \int_0^1 |\dot{x}(t)|^2\,dt + \int_0^1 a(t)|x(t)|^2\,dt \ge \lambda_{k+1} \int_0^1 |x(t)|^2\,dt.$$
If $x = o$, we have $\delta = 1$, and if $x \neq o$, we have
$$\delta \ge \lambda_{k+1} \int_0^1 |x(t)|^2\,dt > 0$$
and so (6.5.15) is proved. Using (f1) and (f3), we obtain that for any $\varepsilon > 0$ there exists $c_\varepsilon > 0$ such that
$$\Bigl| \int_0^s f(t,\sigma)\,d\sigma \Bigr| \le \varepsilon |s|^2 + c_\varepsilon |s|^p. \tag{6.5.16}$$
It follows from (6.5.15) and (6.5.16) that on $Z$ we have
$$F(x) \ge \frac{\delta}{2} \|x\|^2 - \int_0^1 \bigl(\varepsilon |x(t)|^2 + c_\varepsilon |x(t)|^p\bigr)\,dt = \frac{\delta}{2} \|x\|^2 - \varepsilon \|x\|^2_{L^2(0,1)} - c_\varepsilon \|x\|^p_{L^p(0,1)}.$$
For $\varepsilon > 0$ small enough, by virtue of the inequality $1 < p < 2$ and the embedding $H = W_0^{1,2}(0,1) \subset L^q(0,1)$ for any $q > 1$, there exists $r > 0$ such that
$$b := \inf_{\substack{\|x\| = r \\ x \in Z}} F(x) > 0.$$
By (f4) we have
$$F(x) \le \int_0^1 \Bigl[ \lambda_k \frac{|x(t)|^2}{2} - \int_0^{x(t)} f(t,s)\,ds \Bigr]\,dt \le 0 \quad\text{for}\quad x \in Y. \tag{6.5.17}$$

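It follows from the first inequality in (f2) that there exist $c_1, c_2 > 0$ such that
$$\int_0^s f(t,\sigma)\,d\sigma \ge c_1 |s|^\alpha - c_2 \quad\text{for all } s \in \mathbb{R},\ t \in [0,1] \tag{6.5.18}$$
(cf. Exercise 6.5.17). Hence for $x \in H$ we have
$$F(x) \le \frac{1}{2} \bigl( \|x\|^2 + \|a\|_{C[0,1]} \|x\|^2_{L^2(0,1)} \bigr) - c_1 \|x\|^\alpha_{L^\alpha(0,1)} + c_2. \tag{6.5.19}$$
Set $z := r\,\dfrac{e_{k+1}}{\|e_{k+1}\|}$ with $r > 0$ given above. All norms are equivalent on the finite dimensional space $Y \oplus \mathbb{R}z$. In particular, there is a constant $c > 0$ such that
$$\|x\| \le c\|x\|_{L^\alpha(0,1)} \quad\text{for any}\quad x \in Y \oplus \mathbb{R}z.$$
Since $\alpha > 2$ we obtain from (6.5.19) that
$$\lim_{\substack{\|x\| \to \infty \\ x \in Y \oplus \mathbb{R}z}} F(x) = -\infty. \tag{6.5.20}$$
Define
$$M := \{x = y + \lambda z : y \in Y,\ \lambda \ge 0,\ \|x\| \le \varrho\},$$
$$M_0 := \{x = y + \lambda z : y \in Y,\ \|x\| = \varrho \text{ and } \lambda \ge 0, \text{ or } \|x\| \le \varrho \text{ and } \lambda = 0\}.$$
Since $F(z) \ge b > 0$, (6.5.17) and (6.5.20) imply that there is $\varrho > r$ such that
$$a := \max_{x \in M_0} F(x) \le 0.$$
It remains to verify that $F$ satisfies the (PS)$_c$ condition. This will be the case if we show that any sequence $\{x_n\}_{n=1}^\infty \subset H$ such that
$$d := \sup_n F(x_n) < \infty, \qquad F'(x_n) \to o, \tag{6.5.21}$$
contains a converging subsequence. We will prove it in two steps.

Step 1. First we prove that $\{x_n\}_{n=1}^\infty$ is bounded in $H$. Let $\beta \in \bigl(\frac{1}{\alpha}, \frac{1}{2}\bigr)$. For $n$ large enough we have for some $c_3, c_4 > 0$, by using (f2),
$$\begin{aligned} d + \|x_n\| &\ge F(x_n) - \beta (F'(x_n), x_n) \\ &= \int_0^1 \Bigl[ \Bigl(\frac{1}{2} - \beta\Bigr) \bigl(|\dot{x}_n(t)|^2 + a(t)|x_n(t)|^2\bigr) - \int_0^{x_n(t)} f(t,s)\,ds + \beta f(t, x_n(t)) x_n(t) \Bigr]\,dt \\ &\ge \Bigl(\frac{1}{2} - \beta\Bigr) \bigl( \delta \|z_n\|^2 + \lambda_1 \|y_n\|^2_{L^2(0,1)} \bigr) + (\alpha\beta - 1) \int_0^1 \int_0^{x_n(t)} f(t,s)\,ds\,dt - c_3 \\ &\ge \Bigl(\frac{1}{2} - \beta\Bigr) \bigl( \delta \|z_n\|^2 + \lambda_1 \|y_n\|^2_{L^2(0,1)} \bigr) + c_1 (\alpha\beta - 1) \|x_n\|^\alpha_{L^\alpha(0,1)} - c_4 \end{aligned} \tag{6.5.22}$$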

where $x_n = y_n + z_n$, $y_n \in Y$, $z_n \in Z$, $\delta$ is from (6.5.15), and we have also used
$$\|y_n\|^2 + \int_0^1 a(t)|y_n(t)|^2\,dt \ge \lambda_1 \|y_n\|^2_{L^2(0,1)}$$
and the fact that
$$\int_0^1 [\dot{y}_n(t)\dot{z}_n(t) + a(t)y_n(t)z_n(t)]\,dt = 0.^{73}$$
Since $\dim Y < \infty$, the norms $\|\cdot\|$ and $\|\cdot\|_{L^2(0,1)}$ are equivalent on $Y$, and (6.5.22) implies that $\{x_n\}_{n=1}^\infty$ is bounded (cf. Exercise 6.5.18).

Step 2. In the second step we prove that $\{x_n\}_{n=1}^\infty$ contains a convergent subsequence. Going to a subsequence if necessary, we can assume that $x_n \rightharpoonup x$ in $H$. By the Rellich–Kondrachov Theorem (Theorem 1.2.28), $x_n \to x$ in $C[0,1]$. Observe that
$$\|x_n - x\|^2 = (F'(x_n) - F'(x), x_n - x) + \int_0^1 \bigl[ (f(t, x_n(t)) - f(t, x(t)))(x_n(t) - x(t)) - a(t)|x_n(t) - x(t)|^2 \bigr]\,dt.$$
The boundedness of $\{x_n\}_{n=1}^\infty$ and (6.5.21) imply
$$(F'(x_n) - F'(x), x_n - x) \to 0,$$
$x_n \to x$ in $C[0,1]$ implies
$$\int_0^1 a(t)\bigl(|x_n(t)|^2 - |x(t)|^2\bigr)\,dt \to 0,$$
and the continuity of $f$ implies
$$\Bigl| \int_0^1 (f(t, x_n(t)) - f(t, x(t)))(x_n(t) - x(t))\,dt \Bigr| \le \|f(\cdot, x_n(\cdot)) - f(\cdot, x(\cdot))\|_{C[0,1]} \|x_n - x\|_{C[0,1]} \to 0 \quad\text{as}\quad n \to \infty.$$
Thus we have proved that
$$\|x_n - x\| \to 0, \qquad n \to \infty.$$

⁷³ Note that $\int_0^1 [\dot{e}_j(t)\dot{z}(t) + a(t)e_j(t)z(t)]\,dt = \lambda_j \int_0^1 e_j(t)z(t)\,dt = 0$ for $e_j$, $j = 1, \dots, k$, $z \in Z$.

Remark 6.5.15. If $\lambda_1 > 0$ (this is the case if, e.g., $a(t) \ge 0$ in $[0,1]$), it suffices to use the Mountain Pass Theorem instead of the Linking Theorem. The interested reader is invited to carry out the proof in detail as an exercise.

Exercise 6.5.16. A retraction of a topological space $X$ to a subspace $Y$ is a continuous map $r \colon X \to Y$ such that
$$r(y) = y \quad\text{for every}\quad y \in Y.$$
Prove that there is no retraction of $B(o;1) \subset \mathbb{R}^N$ to $S^{N-1} = \partial B(o;1)$.

Hint. Assume, by contradiction, that there is a retraction $r : B(o;1) \to S^{N-1}$. Using the homotopy $H(t,u) := (1-t)u + t\,r(u)$ we obtain (see Theorem 5.2.7)

\[
\deg(r, B(o;1), o) = \deg(I, B(o;1), o) = 1,
\]

i.e., there is $u_0 \in B(o;1)$ such that $r(u_0) = o$. This contradicts $r(u) \in S^{N-1}$ for any $u \in B(o;1)$!

Exercise 6.5.17. Prove that the condition (f2) implies (6.5.18).

Hint. Let $s > R$. It follows from (f2) that

\[
\frac{\alpha}{s} \le \frac{f(t,s)}{\int_0^s f(t,\sigma)\,d\sigma}
\]

and integrating over $[R,s]$ yields

\[
\alpha \log s - \alpha \log R \le \log \int_0^s f(t,\sigma)\,d\sigma - \log \int_0^R f(t,\sigma)\,d\sigma.
\]

Taking the exponential of both sides, we obtain

\[
\left(\frac{s}{R}\right)^{\alpha} \le \frac{\int_0^s f(t,\sigma)\,d\sigma}{\int_0^R f(t,\sigma)\,d\sigma}, \quad \text{i.e.,} \quad \frac{s^{\alpha}}{R^{\alpha}} \int_0^R f(t,\sigma)\,d\sigma \le \int_0^s f(t,\sigma)\,d\sigma.
\]

Since $f$ is continuous, we have

\[
\int_0^s f(t,\sigma)\,d\sigma \le \text{const.} \quad \text{for } t \in [0,1],\ s \in [0,R].
\]

Hence

\[
\int_0^s f(t,\sigma)\,d\sigma \ge c_1 s^{\alpha} - c_2.
\]

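Similarly for $s < 0$.

Exercise 6.5.18. Prove the boundedness of the sequence $\{x_n\}_{n=1}^\infty$ from Step 1 on page 469.

Hint. Write $x_n = y_n + z_n$, $y_n \in Y$, $z_n \in Z$. Since $Y$ and $Z$ are $L^2$-orthogonal, we have $\|y_n\|^2_{L^2(0,1)} = \|x_n\|^2_{L^2(0,1)} - \|z_n\|^2_{L^2(0,1)}$. Write (6.5.22) in the equivalent form

\[
\begin{aligned}
d + \|x_n\| &\ge \left(\tfrac12 - \beta\right)\left[\delta\|z_n\|^2 - \lambda_1\|z_n\|^2_{L^2(0,1)}\right] + \lambda_1\left(\tfrac12 - \beta\right)\|x_n\|^2_{L^2(0,1)} - c_4 + c_1(\alpha\beta - 1)\|x_n\|^{\alpha}_{L^{\alpha}(0,1)} \\
&\ge \left(\tfrac12 - \beta\right)\left[\delta\|z_n\|^2 - \lambda_1\|z_n\|^2_{L^2(0,1)}\right] + \|x_n\|^2_{L^{\alpha}(0,1)}\left[c_1(\alpha\beta - 1)\|x_n\|^{\alpha-2}_{L^{\alpha}(0,1)} + c_5\lambda_1\left(\tfrac12 - \beta\right)\right] - c_4
\end{aligned}
\]

(where the inequality $\|x_n\|_{L^2(0,1)} \le c_5\|x_n\|_{L^{\alpha}(0,1)}$ is used) and get the boundedness of $\{x_n\}_{n=1}^\infty$.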


Exercise 6.5.19. Consider the Dirichlet boundary value problem

\[
\begin{cases}
-\big(|\dot x(t)|^{p-2}\dot x(t)\big)^{\cdot} - \lambda|x(t)|^{p-2}x(t) = g(t, x(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases}
\tag{6.5.23}
\]

where $p > 1$. Formulate conditions on $\lambda$ and $g = g(t,x)$ which guarantee that (6.5.23) has a geometry corresponding to (i) the Saddle Point Theorem, (ii) the Linking Theorem.

Exercise 6.5.20. How do the conditions on $\lambda$ and $g$ change if the homogeneous Dirichlet conditions in (6.5.23) are replaced by the Neumann ones?

Chapter 7

Boundary Value Problems for Partial Differential Equations

7.1 Classical Solution, Functional Setting

In this section we will explain the notion of the classical solution of a semilinear problem with the Laplace operator and explain what is the "right" functional setting for it. Let $\Omega$ be an open bounded subset of $\mathbb{R}^N$ and let $u : \Omega \to \mathbb{R}$ be a real smooth function. We will denote by

\[
\Delta u(x) := \frac{\partial^2 u(x)}{\partial x_1^2} + \frac{\partial^2 u(x)}{\partial x_2^2} + \cdots + \frac{\partial^2 u(x)}{\partial x_N^2}, \qquad x = (x_1, \dots, x_N) \in \Omega,
\]

the Laplace operator defined in $\Omega$. Let $g : \Omega \times \mathbb{R} \to \mathbb{R}$ be a continuous real function. We will study the Dirichlet boundary value problem

\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{7.1.1}
\]

and look for its classical solution. Following the definition of the classical solution for the ordinary differential equation it should be a function $u \in C^2(\Omega) \cap C(\overline{\Omega})$ such that $u(x) = 0$ for every $x \in \partial\Omega$ and the equation $-\Delta u(x) = g(x, u(x))$ is satisfied at every point $x \in \Omega$. Let us explain why this is not a suitable definition of the solution for partial differential equations.

In order to apply the methods of nonlinear functional analysis we need an operator representation of the Laplace operator subject to the Dirichlet boundary


conditions. For this purpose we need the fact that for any $f \in C(\Omega)$ the linear problem

\[
\begin{cases}
-\Delta u(x) = f(x) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.1.2}
\]

has a unique solution $u \in C^2(\Omega) \cap C(\overline{\Omega})$. However, not for every $f \in C(\Omega)$ does such a solution exist in general! This fact is nontrivial and can be found in the book Gilbarg & Trudinger [59, Chapter 4]. So, we have to look for a different concept of the classical solution.

To motivate the definition of the classical solution of (7.1.1) we treat first the linear problem (7.1.2). In order to simplify the notation, in this chapter, denote by $|\cdot|$ the norm in $\mathbb{R}^N$ for any $N \ge 1$. For $\gamma \in (0,1)$ let us consider the space of $\gamma$-Hölder continuous functions

\[
C^{0,\gamma}(\Omega) := \{u \in C(\Omega) : \exists K > 0\ \forall x, y \in \Omega : |u(x) - u(y)| \le K|x - y|^{\gamma}\}
\]

(cf. Example 1.2.25 and Exercise 7.1.4). Let $\alpha = (\alpha_1, \dots, \alpha_N)$ be a multiindex of length

\[
|\alpha| = \sum_{i=1}^{N} \alpha_i,
\]

and for the sake of brevity denote

\[
D^{\alpha}u = \frac{\partial^{|\alpha|}u}{\partial x_1^{\alpha_1}\,\partial x_2^{\alpha_2} \cdots \partial x_N^{\alpha_N}} \qquad \left(= \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1} \cdots \left(\frac{\partial}{\partial x_N}\right)^{\alpha_N} u\right)
\]

(cf. Section 1.2). Set

\[
C^{2,\gamma}(\Omega) := \{u \in C^2(\Omega) : \forall |\alpha| = 2,\ D^{\alpha}u \in C^{0,\gamma}(\Omega)\}
\]

(cf. Exercise 7.1.5).

In what follows we will assume that $\Omega$ is a bounded domain (an open and connected set) of the class $C^{k,\gamma}$ ($k \in \mathbb{N} \cup \{0\}$, $\gamma \in (0,1]$), i.e., at each point $x_0 \in \partial\Omega$ there is a ball $B(x_0;\varepsilon)$ and a one-to-one mapping $\psi$ from $B(x_0;\varepsilon)$ onto $B(o;1) \subset \mathbb{R}^N$ such that

(1) $\psi(B(x_0;\varepsilon) \cap \Omega) \subset \mathbb{R}^N_+$;¹
(2) $\psi(B(x_0;\varepsilon) \cap \partial\Omega) \subset \partial\mathbb{R}^N_+$;
(3) $\psi \in C^{k,\gamma}(B(x_0;\varepsilon))$, $\psi^{-1} \in C^{k,\gamma}(B(o;1))$,

see Figure 7.1.1 (cf. Definition 4.3.89). For the sake of brevity we write $\Omega \in C^{k,\gamma}$. In particular, if $\Omega \in C^{0,1}$, then $\Omega$ will be called the domain with a Lipschitz boundary.

¹ $\mathbb{R}^N_+ = \{x \in \mathbb{R}^N : x = (x_1, x_2, \dots, x_N),\ x_N > 0\}$.


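The following deep result is crucial for the classical setting of the problem (7.1.1).

Theorem 7.1.1. Let $f \in C^{0,\gamma}(\Omega)$, $\gamma \in (0,1)$, $\Omega \in C^{2,\gamma}$. Then there exists a unique $u \in C^{2,\gamma}(\Omega)$ such that $u(x) = 0$, $x \in \partial\Omega$, and the equation in (7.1.2) is satisfied at every point $x \in \Omega$. Moreover, there exists $c > 0$ (which depends only on $\Omega$ and $\gamma \in (0,1)$) such that

\[
\|u\|_{C^{2,\gamma}(\Omega)} \le c\,\|f\|_{C^{0,\gamma}(\Omega)}.
\tag{7.1.3}
\]

Figure 7.1.1. [Sketch: the ball around $x_0 \in \partial\Omega$, the maps $\psi$ and $\psi^{-1}$, the unit ball $B(o;1)$ and the half-space $\mathbb{R}^N_+$.]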

Proof. The proof of this assertion can be found in Gilbarg & Trudinger [59, Chapter 6]. Note that (7.1.3) is a special case of more general Schauder estimates (see, e.g., Gilbarg & Trudinger [59]).

Set

\[
X := \{u \in C^{2,\gamma}(\Omega) : u(x) = 0,\ x \in \partial\Omega\} \quad \text{and} \quad Y := C^{0,\gamma}(\Omega).
\]

Then $X$ and $Y$ are Banach spaces (see Exercises 7.1.4 and 7.1.5) and the Arzelà–Ascoli Theorem (see Theorem 1.2.13) implies that the following compact embedding holds true:

\[
X \subset\subset Y.
\tag{7.1.4}
\]

Let us define an operator $L : X \to Y$ by

\[
(Lu)(x) := -\Delta u(x), \qquad u \in X.
\tag{7.1.5}
\]

Then $L$ is a linear and bounded operator (Exercise 7.1.7). It follows from Theorem 7.1.1 that $L^{-1} : Y \to X$ is a well-defined linear and bounded operator. Actually, the best constant (i.e., the least one) in (7.1.3) is nothing but $c = \|L^{-1}\|_{\mathcal{L}(Y,X)}$.


It follows from the compact embedding (7.1.4) that² $L^{-1} : Y \to Y$ is a compact operator. Notice, however, that $\|L^{-1}\|_{\mathcal{L}(Y)} \ne \|L^{-1}\|_{\mathcal{L}(Y,X)}$ in general, because

\[
\|L^{-1}\|_{\mathcal{L}(Y)} \le c_{\mathrm{emb}}\,\|L^{-1}\|_{\mathcal{L}(Y,X)}
\]

where $c_{\mathrm{emb}}$ is the constant of embedding of $X$ into $Y$, i.e., the least constant $c$ for which $\|u\|_Y \le c\|u\|_X$ holds for all $u \in X$. This constant depends on $\Omega$ and $\gamma \in (0,1)$.

Let us simplify the situation and assume that the nonlinear function $g = g(x,u)$ is given in the special form $g(x, u(x)) = g(u(x)) + f(x)$ (the "variables" $u$ and $x$ are "separated"). In order to find a suitable operator representation for

\[
\begin{cases}
-\Delta u(x) = g(u(x)) + f(x) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.1.6}
\]

we have to find conditions on $g$ which guarantee that $G : Y \to Y$ defined by

\[
G(u)(x) := g(u(x))
\tag{7.1.7}
\]

is a correctly defined operator (see also Example 3.2.21). The following nontrivial assertion provides a necessary and sufficient condition on $g$ guaranteeing that the Nemytski operator $G$ is continuous from $Y$ into $Y$.

Lemma 7.1.2 (Drábek [38]). The operator $G$ defined by (7.1.7) maps $Y$ continuously into $Y$ if and only if $g \in C^1(\mathbb{R})$.³

Definition 7.1.3. Let $f \in Y$ and $g \in C^1(\mathbb{R})$. Then $u \in X$ is a classical solution of (7.1.6) if the equation in (7.1.6) holds at every point of $\Omega$.

Let us give an operator representation of (7.1.6). Denote

\[
G_f(u)(x) := g(u(x)) + f(x).
\]

² To be precise, we should write $I \circ L^{-1} : Y \to Y$ where $I$ is the compact embedding of $X$ into $Y$. However, for the sake of brevity of notation we drop it.
³ Note that a similar assertion can be proved also for a more general Nemytski operator $G(u)(x) = g(x, u(x))$. However, the conditions on $g$ are more complicated in that case (see, e.g., Appell & Zabreiko [8]).


Then $G_f : Y \to Y$ is a continuous operator by Lemma 7.1.2 and it maps bounded sets onto bounded sets (see Exercise 7.1.8). The problem (7.1.6) is then equivalent to

\[
L(u) = G_f(u) \quad \text{or} \quad u = L^{-1}(G_f(u)).
\]

The operator $T = L^{-1} \circ G_f : Y \to Y$ is compact (explain why!). The problem (7.1.6) thus can be written as a fixed point problem

\[
u = T(u).
\tag{7.1.8}
\]

The equation (7.1.8) can be solved by applying some of the methods presented in the previous chapters. We will show some applications in the next section.

Exercise 7.1.4. Prove that for $\gamma \in (0,1)$ the set $C^{0,\gamma}(\Omega)$, $\Omega$ a bounded domain in $\mathbb{R}^N$, is a Banach space equipped with the norm

\[
\|u\|_{C^{0,\gamma}(\Omega)} = \sup_{x \in \Omega} |u(x)| + \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|u(x) - u(y)|}{|x - y|^{\gamma}}.
\]

Exercise 7.1.5. Prove that for $\gamma \in (0,1)$ the set $C^{2,\gamma}(\Omega)$, $\Omega$ a bounded domain in $\mathbb{R}^N$, is a Banach space equipped with the norm

\[
\|u\|_{C^{2,\gamma}(\Omega)} = \|u\|_{C^2(\Omega)} + \sup_{\substack{x, y \in \Omega,\ x \ne y \\ |\alpha| = 2}} \frac{|D^{\alpha}u(x) - D^{\alpha}u(y)|}{|x - y|^{\gamma}}.
\]

Exercise 7.1.6. Prove that for $\gamma > 1$ the following equivalence holds true: $u \in C^{0,\gamma}(\Omega)$ if and only if $u$ is constant on $\Omega$. (Cf. footnote 26 on page 38.)

Exercise 7.1.7. Prove that $L : X \to Y$ defined by (7.1.5) is a linear and bounded operator.

Exercise 7.1.8. Prove that $G_f$, defined above, maps bounded sets in $Y$ onto bounded sets in $Y$.
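The Hölder quotient in the norm of Exercise 7.1.4 can be explored numerically. The following is a minimal sketch under our own assumptions: $\Omega = (0,1)$, a uniform grid, and a hypothetical helper `holder_seminorm` (none of these come from the text). A grid maximum is only a lower bound for the true supremum. The sample function $\sqrt{x}$ is the image of $u(x) = x$ under $g(s) = \sqrt{|s|}$, which is not in $C^1(\mathbb{R})$; consistently with Lemma 7.1.2, its quotient stays bounded for $\gamma = 1/2$ but grows under grid refinement for $\gamma = 3/4$.

```python
import math


def holder_seminorm(u, gamma, n):
    """Largest quotient |u(x) - u(y)| / |x - y|**gamma over a uniform grid on [0, 1].

    This is only a grid-based lower bound for the supremum in Exercise 7.1.4.
    """
    xs = [i / n for i in range(n + 1)]
    vals = [u(x) for x in xs]
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            quotient = abs(vals[j] - vals[i]) / (xs[j] - xs[i]) ** gamma
            best = max(best, quotient)
    return best


# gamma = 1/2: the quotient of sqrt stays bounded by 1 on every grid.
half_coarse = holder_seminorm(math.sqrt, 0.5, 100)
half_fine = holder_seminorm(math.sqrt, 0.5, 400)

# gamma = 3/4: the quotient grows as the grid is refined (it behaves like n**0.25).
three_quarter_coarse = holder_seminorm(math.sqrt, 0.75, 100)
three_quarter_fine = holder_seminorm(math.sqrt, 0.75, 400)
```

The growth for $\gamma = 3/4$ comes from pairs of grid points next to $x = 0$, where $\sqrt{x}$ has an unbounded derivative.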

7.2 Classical Solution, Applications

In this section we will deal with the existence (and uniqueness) of the classical solution of the Dirichlet problem

\[
\begin{cases}
-\Delta u(x) = g(u(x)) + f(x) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.2.1}
\]

under the assumptions on $\Omega$, $g$ and $f$ introduced in Section 7.1. Namely, we assume that $\Omega$ is a domain of the class $C^{2,\gamma}$, $\gamma \in (0,1)$, $f \in Y = C^{0,\gamma}(\Omega)$ and $g \in C^1(\mathbb{R})$.


Let us start with a direct application of the Schauder Fixed Point Theorem.

Theorem 7.2.1. Let

\[
\sup_{s \in \mathbb{R}} |g(s)| < \infty \quad \text{and} \quad \sup_{s \in \mathbb{R}} |g'(s)| < \frac{1}{\|L^{-1}\|_{\mathcal{L}(Y)}}
\tag{7.2.2}
\]

where $L^{-1}$ was introduced in Section 7.1. Then for any $f \in Y$ the problem (7.2.1) has at least one classical solution.

Proof. We rewrite the problem (7.2.1) into the operator form (7.1.8) with $T$, $L^{-1}$ and $G_f$ introduced in Section 7.1. Due to our assumptions we find a ball $B(o;r) \subset Y$ with the property $T(B(o;r)) \subset B(o;r)$. The existence of at least one solution will then follow from the Schauder Fixed Point Theorem (Theorem 5.1.11). For $f \in Y$ fixed we get

\[
\begin{aligned}
\|T(u)\| = \|L^{-1}(G_f(u))\| &\le \|L^{-1}\|\,[\|f\| + \|G(u)\|] \\
&\le \|L^{-1}\|\left[\|f\| + \sup_{x \in \Omega} |g(u(x))| + \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|g(u(x)) - g(u(y))|}{|x - y|^{\gamma}}\right] \\
&\le \|L^{-1}\|\,\|f\| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g(s)| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g'(s)| \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|u(x) - u(y)|}{|x - y|^{\gamma}} \\
&\le \|L^{-1}\|\,\|f\| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g(s)| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g'(s)|\,\|u\|
\end{aligned}
\]

for any $u \in Y$. Note that the first two terms in the last sum are constants independent of $u \in Y$. So, taking $r > 0$ large enough, the assumption (7.2.2) will guarantee that

\[
\|T(u)\| < r \quad \text{for any } u \in B(o;r) \subset Y.
\]⁴

This completes the proof.
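The choice of $r$ in the proof can be made explicit. A minimal sketch, assuming only the last estimate of the proof, which has the form $\|T(u)\| \le C + q\|u\|$ with $C = \|L^{-1}\|(\|f\| + \sup|g|)$ and $q = \|L^{-1}\|\sup|g'| < 1$: then $C + qr < r$ holds exactly when $r > C/(1-q)$. The function name and the sample numbers below are hypothetical, since $\|L^{-1}\|$ is rarely known in practice.

```python
def invariant_ball_threshold(norm_L_inv, norm_f, sup_g, sup_dg):
    """Return C / (1 - q); every radius r above this value satisfies C + q*r < r.

    C = ||L^{-1}|| * (||f|| + sup|g|),  q = ||L^{-1}|| * sup|g'|.
    The inputs are assumed to be known upper bounds; they are not computed here.
    """
    q = norm_L_inv * sup_dg
    if not q < 1.0:
        raise ValueError("assumption (7.2.2) is violated: ||L^{-1}|| * sup|g'| >= 1")
    C = norm_L_inv * (norm_f + sup_g)
    return C / (1.0 - q)


# Hypothetical data: ||L^{-1}|| = 2, ||f|| = 1, sup|g| = 0.5, sup|g'| = 0.25.
threshold = invariant_ball_threshold(2.0, 1.0, 0.5, 0.25)
radius = 1.01 * threshold          # any radius strictly above the threshold
bound_on_T = 2.0 * (1.0 + 0.5) + 2.0 * 0.25 * radius
```

With these sample values `bound_on_T` stays below `radius`, which is the invariance $T(B(o;r)) \subset B(o;r)$ used in the proof.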

Remark 7.2.2. The above theorem actually says that the problem (7.2.1) is solvable in the classical sense if the smooth nonlinearity $g$ is uniformly bounded on $\mathbb{R}$ and, moreover, its derivative is uniformly bounded by a certain constant which depends on $\Omega$ and $\gamma \in (0,1)$.⁵ This constant may be rather difficult to calculate but its estimate from above can be given for special domains $\Omega$ and exponents $\gamma \in (0,1)$ (see Gilbarg & Trudinger [59] for more details).

⁴ The reader is invited to calculate the least value of such $r$ in terms of $\|L^{-1}\|$, $\|f\|$, $\sup_{s \in \mathbb{R}} |g(s)|$ and $\sup_{s \in \mathbb{R}} |g'(s)|$!
⁵ Note that $\|L^{-1}\|$ depends on $\Omega$ and $\gamma \in (0,1)$. See also Exercise 7.2.5.


It is quite natural to ask under which conditions on $g$ the classical solution from Theorem 7.2.1 is uniquely determined. For this purpose let us consider the eigenvalue problem for the Laplace operator subject to the homogeneous Dirichlet boundary conditions

\[
\begin{cases}
-\Delta u(x) = \lambda u(x) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{7.2.3}
\]

The problem (7.2.3) has only real eigenvalues. More precisely, there are only real numbers $\lambda$ for which (7.2.3) has nonzero classical solutions. There exists the so-called principal eigenvalue $\lambda_1 > 0$ having the property that

\[
\int_{\Omega} |\nabla u(x)|^2\,dx \ge \lambda_1 \int_{\Omega} |u(x)|^2\,dx \quad \text{for all } u \in Y
\tag{7.2.4}
\]⁶

(see Example 7.5.1 below). Now we can formulate the following existence and uniqueness result.

Theorem 7.2.3. Let

\[
\sup_{s \in \mathbb{R}} |g(s)| < \infty \quad \text{and} \quad \sup_{s \in \mathbb{R}} |g'(s)| < \min\left\{\frac{1}{\|L^{-1}\|_{\mathcal{L}(Y)}},\ \lambda_1\right\}.
\tag{7.2.5}
\]

Then for any $f \in Y$ the problem (7.2.1) has exactly one classical solution.

Proof. Let $f \in Y$ be arbitrary. The existence of at least one solution follows from Theorem 7.2.1. To prove that this solution is unique we proceed via contradiction. Let $u_1 \ne u_2$, $u_i \in X$, $i = 1, 2$, be two solutions of (7.2.1) for a given $f \in Y$. Then

\[
\begin{cases}
-\Delta u_i(x) = g(u_i(x)) + f(x) & \text{in } \Omega,\\
u_i = 0 & \text{on } \partial\Omega,
\end{cases}
\qquad i = 1, 2.
\]

Multiply both equations by the difference $u_1 - u_2$, use the Green Formula⁷ and the boundary conditions, and then subtract the first expression from the second. Thus we get

\[
\int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^2\,dx = \int_{\Omega} (g(u_1(x)) - g(u_2(x)))(u_1(x) - u_2(x))\,dx.
\tag{7.2.6}
\]

⁶ This inequality is the Poincaré inequality (see Exercise 1.2.46 and Remark 7.4.5).
⁷ The Green Formula reads: Let $\Omega$ be a domain with a Lipschitz boundary (see Section 7.1) and assume that $w, v \in C^2(\Omega)$. Then the relation
\[
\int_{\Omega} \Delta w(x)\,v(x)\,dx = -\int_{\Omega} (\nabla w(x), \nabla v(x))\,dx + \int_{\partial\Omega} (\nabla w(x), n(x))\,v(x)\,dS
\]
holds where $n$ is the unit vector of the outward normal to $\partial\Omega$ and $dS$ indicates integration with respect to the surface measure on $\partial\Omega$. For more details see Appendix 4.3C, in particular, Remark 4.3.99.


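Now, by virtue of (7.2.4) we have

\[
\int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^2\,dx \ge \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^2\,dx,
\tag{7.2.7}
\]

and (7.2.5) implies that

\[
\int_{\Omega} (g(u_1(x)) - g(u_2(x)))(u_1(x) - u_2(x))\,dx < \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^2\,dx.
\tag{7.2.8}
\]

It follows from (7.2.6)–(7.2.8) that

\[
\lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^2\,dx \le \int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^2\,dx < \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^2\,dx,
\]

a contradiction.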

Remark 7.2.4. One can ask why not prove the existence (and uniqueness) of a classical solution to (7.2.1) applying directly the Contraction Principle. The reason consists in the fact that the key assumption in this case would be the contractivity of $T$. Due to the linearity of $L^{-1}$ this is equivalent to the Lipschitz continuity of the Nemytski operator $G : Y \to Y$. However, according to Appell & Zabreiko [8, Theorem 7.8], $G$ satisfies the global Lipschitz condition if and only if $g$ is of the form $g(u) = a + bu$ where $a, b \in \mathbb{R}$, i.e., $g$ is a linear function. According to Appell & Zabreiko [8, Theorem 7.9], $G$ satisfies the local Lipschitz condition provided $g$ is locally Lipschitz continuous. In other words, this means that the assumptions would be too restrictive, and so the above functional setting is not suitable for direct application of the Contraction Principle.

Exercise 7.2.5. Prove that the statement of Theorem 7.2.1 holds provided $g$ is a uniformly Lipschitz continuous function on $\mathbb{R}$ with a constant $K < \frac{1}{\|L^{-1}\|}$. Generalize also Theorem 7.2.3.

Hint. Follow the proof of Theorem 7.2.1 and for an estimate of $\|g(u)\|$ use

\[
|g(u(\cdot))| \le |g(u(\cdot)) - g(0)| + |g(0)| \le K|u(\cdot)| + |g(0)|.
\]

Exercise 7.2.6. Let $u$ be a solution of the problem (7.2.1) where $\Omega$ is a domain of the class $C^{2,\gamma}$, $\gamma \in (0,1)$ and $f \in C^{0,\gamma}(\Omega)$. The maximum principle (see, e.g., Protter & Weinberger [102]) states that

\[
f \ge 0 \ \text{in } \Omega \quad \text{implies} \quad u \ge 0 \ \text{in } \Omega.
\]


Use this fact to generalize the result of Example 5.4.19 to the problem

\[
\begin{cases}
-\Delta u(x) = f(x, u(x)) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{7.2.9}
\]

Exercise 7.2.7. Formulate conditions on $f = f(x,u)$ which guarantee that there is a pair of a subsolution $u_0$ and a supersolution $v_0$ of (7.2.9) satisfying $u_0 \le v_0$ in $\Omega$.

Hint. Look for $u_0$ and $v_0$ constant on $\Omega$, cf. Exercise 6.2.47.

7.3 Weak Solutions, Functional Setting

As we have mentioned in the previous section the concept of the classical solution is not suitable for application of many abstract results of nonlinear analysis presented in the previous chapters. Let us list the major drawbacks connected with this fact:

the spaces of Hölder continuous functions do not possess Hilbert structure and are not reflexive;

to prove that the Nemytski operator between spaces of Hölder continuous functions has a certain property requires too strong assumptions about the nonlinearity.

To be more specific, the fact that $C^{k,\gamma}(\Omega)$ are not reflexive spaces prevents us from applying variational methods which strongly depend on the selection of weakly convergent sequences from bounded ones (which is usually guaranteed by reflexivity) and sometimes also on the Hilbert structure of the function space. Concerning the restrictive assumptions on nonlinearity let us recall Remark 7.2.4. That is why it is important to look for a different concept of "solution" than that introduced in Section 7.1.

Let us point out that there are also more practical reasons for the introduction of a different concept of solution. Many equations and boundary value problems are derived from globally formulated laws (coming from physics, chemistry, biology, sociology, economics, . . . ). To give a very simple example, let $G$ be the primitive of $g$, i.e., $G'(s) = g(s)$, and let the functional

\[
E(u) = \frac{1}{2} \int_{\Omega} |\nabla u(x)|^2\,dx - \int_{\Omega} G(u(x))\,dx - \int_{\Omega} f(x)u(x)\,dx
\tag{7.3.1}
\]

represent the energy of a certain system. Following the well-known law, a physicist will be interested in finding functions $u$ defined on $\Omega$, $u = 0$ on $\partial\Omega$, for which $E(u)$ attains its minimal value. The task for mathematicians is to find the corresponding formalism in the framework of which the existence of such functions is guaranteed (and they can be calculated).


Let us assume that $u$ belongs to a normed linear space $H$ of functions defined on $\Omega$ and let us specify the properties of $H$ later on. Let us also understand the following calculations formally and assume that $u$, $g$, $f$ and $\Omega$ have all properties necessary in order to perform the calculations. Assume that $u_0 \in H$, $u_0 = 0$ on $\partial\Omega$, is the point of minimum of $E$, i.e.,

\[
E(u_0) = \min_{u \in H} E(u).
\]

Then

\[
\delta E(u_0; v) = 0 \quad \text{for any } v \in H
\]

(see Proposition 6.1.2), i.e.,

\[
\int_{\Omega} (\nabla u_0(x), \nabla v(x))\,dx = \int_{\Omega} g(u_0(x))v(x)\,dx + \int_{\Omega} f(x)v(x)\,dx
\tag{7.3.2}
\]

for any $v \in H$. Note that (7.3.2) is nothing else than the Euler necessary condition for (7.3.1). Now, if we assume that $u_0 \in C^2(\Omega)$, $v \in C^1(\Omega)$, $v = 0$ on $\partial\Omega$, and apply the Green Formula to the left-hand side of (7.3.2) (using the fact that $v = 0$ on $\partial\Omega$), we arrive at

\[
\int_{\Omega} (-\Delta u_0(x))v(x)\,dx = \int_{\Omega} g(u_0(x))v(x)\,dx + \int_{\Omega} f(x)v(x)\,dx
\tag{7.3.3}
\]

for any $v \in H \cap C^1(\Omega)$. If $g \in C(\mathbb{R})$, $f \in C(\Omega)$ and $H$ contains "enough" functions (e.g., $\{v \in C^1(\Omega) : v = 0 \text{ on } \partial\Omega\} \subset H$), then (7.3.3) implies

\[
\begin{cases}
-\Delta u_0(x) = g(u_0(x)) + f(x) & \text{in } \Omega,\\
u_0 = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{7.3.4}
\]

On the other hand, if we find $u_0 \in H \cap C^2(\Omega)$ satisfying (7.3.4), then we can pass, using the Green Formula, easily "back" to (7.3.2). However, looking carefully at (7.3.2), we immediately realize that all the expressions in (7.3.2) make sense under more general assumptions on $u_0$ ($\Omega$, $g$ and $f$) than the expressions in (7.3.4) do. One can immediately see that for (7.3.2) to hold we do not need to assume the existence of second partial derivatives of $u_0$ at all. On the other hand, (7.3.4) does not make any sense without them if we understand it in the classical sense.

It also makes good sense from the physical point of view to consider integral identity (7.3.2) as a starting point for the definition of the "solution" $u_0$.⁸ The advantage of the weak solution of (7.3.4) consists in the fact that the equation and the boundary conditions are not satisfied pointwise but in a more general sense which can correspond better to the real situation described by problems of the type (7.3.4).

⁸ The usual scheme for deriving basic equations of mathematical physics is the following: First the "global formulation" in terms of an integral identity is derived from the conservation law, balance law, etc., and then the "local formulation" is derived in terms of differential equations. The second step, however, requires some extra assumptions about the solution (e.g., smoothness, etc.).

Let us point out here that the notion of the weak solution is a generalization of the notion of the classical solution. We shall show later that not every weak solution is a classical solution. We will also see later that more methods of nonlinear analysis from the previous chapters are applicable to get a weak solution instead of a classical one. On the other hand, it makes good sense to ask whether a weak solution to some problem has some better properties (continuity, Hölder continuity, differentiability, etc.). Very often this is the case and it depends on $\Omega$, $g$ and $f$ how regular (i.e., smooth) the weak solution is. For general partial differential equations, however, this is a difficult problem and the regularity theory which deals with these questions is an important part of basic research in mathematics.

Let us consider now a bit more general situation when $g = g(x,u)$, and for $u \in H$ let us investigate the identity

\[
\int_{\Omega} (\nabla u(x), \nabla v(x))\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx, \qquad v \in H,
\tag{7.3.5}
\]

in more detail and try to make clear how, under "weak" assumptions concerning the functions $u$, $v$ and $g$, the expressions appearing in (7.3.5) make sense.

Note that the Lebesgue integral is used in (7.3.5). So, in particular, the functions under the integral sign must be measurable. As for $\nabla u$, $\nabla v$ and $v$ this will be guaranteed by the requirement that $u$ and $v$ belong to a suitable space of integrable functions. The measurability of the composite function $g(x, u(x))$ will depend on the properties of the function $g$ itself. As was mentioned in Section 3.2 (Definition 3.2.22) the composite function $h(x) = g(x, u(x))$ is measurable provided $u$ is measurable and $g$ fulfils the Carathéodory conditions. For the sake of brevity we denote this fact as follows: $g \in CAR(\Omega \times \mathbb{R})$.

Let us return to the right-hand side of (7.3.5). Assume that there exist $r \in L^2(\Omega)$ and $c > 0$ such that for a.a. $x \in \Omega$ and for all $s \in \mathbb{R}$,

\[
|g(x,s)| \le r(x) + c|s|.
\tag{7.3.6}
\]

Let $u, v \in L^2(\Omega)$. Then, according to Theorem 3.2.24, $g(x, u(x)) \in L^2(\Omega)$ and the Hölder inequality yields that $g(x, u(x))v(x) \in L^1(\Omega)$.

Now, let us take care of the left-hand side of (7.3.5). For this purpose we have to employ the Sobolev spaces introduced in Section 1.2.


Recall that the Sobolev space $W_0^{1,p}(\Omega)$, $p > 1$, was defined in Exercise 1.2.46. It follows from the Poincaré inequality (see Exercise 1.2.46(ii), (iii)) that the expression

\[
\|u\| = \left(\int_{\Omega} |\nabla u(x)|^p\,dx\right)^{\frac{1}{p}}
\]

defines an equivalent norm on $W_0^{1,p}(\Omega)$. For functions $u \in W_0^{1,p}(\Omega)$, $v \in W_0^{1,p'}(\Omega)$, where $\frac{1}{p} + \frac{1}{p'} = 1$, we can apply the Hölder inequality to get

\[
\int_{\Omega} |(\nabla u(x), \nabla v(x))|\,dx \le \left(\int_{\Omega} |\nabla u(x)|^p\,dx\right)^{\frac{1}{p}} \left(\int_{\Omega} |\nabla v(x)|^{p'}\,dx\right)^{\frac{1}{p'}}.
\tag{7.3.7}
\]

Let us point out that not every function from $W_0^{1,p}(\Omega)$ is continuous in general (cf. Theorem 1.2.26). This fact depends on the values of $p > 1$ and $N$ (notice that $\Omega \subset \mathbb{R}^N$) and on the properties of the boundary of $\Omega$. For domains of the class $C^{0,1}$ (see page 474), i.e., domains with a Lipschitz boundary, we can define the space $L^p(\partial\Omega)$ with the norm

\[
\|u\|_{L^p(\partial\Omega)} = \left(\int_{\partial\Omega} |u(x)|^p\,dS\right)^{\frac{1}{p}}
\]

(see Definition 4.3.81 for the meaning of integral, and, e.g., Kufner, John & Fučík [82], Nečas [99]). The following assertion is called the Trace Theorem and it is the key to understanding the notion of "boundary values" of functions from $W^{1,p}(\Omega)$.

Theorem 7.3.1 (Trace Theorem). Let $\Omega \in C^{0,1}$. There exists one and only one continuous linear operator $T$ which assigns to every function $u \in W^{1,p}(\Omega)$ a function $Tu \in L^p(\partial\Omega)$ and has the following property:

"For $u \in C^{\infty}(\overline{\Omega})$ we have $Tu = u|_{\partial\Omega}$."

The following identity holds:

\[
W_0^{1,p}(\Omega) = \{u \in W^{1,p}(\Omega) : Tu = o \text{ in } L^p(\partial\Omega)\}
\]

(see, e.g., Kufner, John & Fučík [82] and Nečas [99]).

This assertion offers another way of understanding the space $W_0^{1,p}(\Omega)$. Instead of $Tu = o$ in $L^p(\partial\Omega)$ we can say

"$u = o$ on $\partial\Omega$ in the sense of traces"

but we usually write $u = 0$ on $\partial\Omega$.

Now, if we turn back to the integral identity (7.3.5), we can immediately see, by applying the Hölder inequality (7.3.7) with $p = 2$, that the integral on the


left-hand side is finite if $u, v \in W^{1,2}(\Omega)$. If, moreover, $u \in W_0^{1,2}(\Omega)$, then $u$ satisfies the boundary condition in the sense of traces. These facts motivate the following definition.

Definition 7.3.2. Let $g \in CAR(\Omega \times \mathbb{R})$ and let $\Omega \in C^{0,1}$ be a bounded domain in $\mathbb{R}^N$. By a weak solution of the Dirichlet problem (with homogeneous boundary conditions)

\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\]

we understand a function $u \in W_0^{1,2}(\Omega)$ such that the integral identity

\[
\int_{\Omega} (\nabla u(x), \nabla v(x))\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx
\tag{7.3.8}
\]

holds for every $v \in W_0^{1,2}(\Omega)$.⁹

⁹ Since $\mathcal{D}(\Omega)$ is dense in $W_0^{1,2}(\Omega)$ this is equivalent to the validity of (7.3.8) for all $v \in \mathcal{D}(\Omega)$. We use $v \in W_0^{1,2}(\Omega)$ to get the scalar product in $W_0^{1,2}(\Omega)$ on the left-hand side of (7.3.8).

In order to simplify the notation in the sequel, we write $\nabla u(x)\nabla v(x) := (\nabla u(x), \nabla v(x))$. So, (7.3.8) reads as

\[
\int_{\Omega} \nabla u(x)\nabla v(x)\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx,
\]

which is a more common form used in literature.

Remark 7.3.3. The function $v$ is called a test function. More general operators as well as nonhomogeneous boundary conditions are often dealt with in literature. The notion of a weak solution is then defined in a similar way. However, the definition is technically more complicated (see, e.g., Gilbarg & Trudinger [59], Fučík & Kufner [54]).

Remark 7.3.4. It was shown above that every classical solution is also a weak solution. The converse is not always true as follows, e.g., from the fact that the weak solution need not possess the second partial derivatives at all! Roughly speaking, the weak solutions are looked for in larger spaces than the classical ones. That is why the chance to prove the existence of a weak solution is usually bigger than the chance to prove the existence of a classical solution. Hence, some "ill-posed" problems in the framework of classical solutions can appear to be "well-posed" in the framework of weak solutions.

In the forthcoming sections we will work exclusively with bounded domains $\Omega \in C^{0,1}$ without any further specification.
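A one-dimensional illustration of Remark 7.3.4 may help here. The sketch below rests on our own assumptions, none of which come from the text: $\Omega = (0,1)$, $g(x,s) = f(x)$ with $f = 1$ on $(0, 1/2)$ and $f = 0$ on $(1/2, 1)$, and the candidate $u(x) = \frac{3}{8}x - \frac{1}{2}x^2$ for $x \le \frac12$, $u(x) = \frac{1}{8}(1 - x)$ for $x \ge \frac12$. This $u$ has a continuous first derivative but no second derivative at $x = \frac12$, so it cannot be a classical solution. The code checks the integral identity (7.3.8) against the test functions $v(x) = \sin(k\pi x)$ only, using a midpoint rule, so it is evidence and not a proof.

```python
import math


def f(x):
    """Discontinuous right-hand side (an assumption made for this sketch)."""
    return 1.0 if x < 0.5 else 0.0


def du(x):
    """Derivative of the candidate weak solution; continuous, with a kink at 1/2."""
    return 0.375 - x if x < 0.5 else -0.125


def midpoint(func, n=20000):
    """Composite midpoint rule on (0, 1); n is even so 1/2 is a cell boundary."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


def weak_residual(k):
    """int u'v' dx - int f v dx for the test function v(x) = sin(k*pi*x)."""
    v = lambda x: math.sin(k * math.pi * x)
    dv = lambda x: k * math.pi * math.cos(k * math.pi * x)
    lhs = midpoint(lambda x: du(x) * dv(x))
    rhs = midpoint(lambda x: f(x) * v(x))
    return lhs - rhs


residuals = [weak_residual(k) for k in range(1, 6)]
```

All residuals vanish up to quadrature error, in agreement with an integration by parts on each half of the interval.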


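Exercise 7.3.5. A function $u \in W^{1,2}(\Omega)$ satisfying (7.3.8) for any $v \in W^{1,2}(\Omega)$ is called a weak solution of the Neumann problem

\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\[2pt]
\dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{7.3.9}
\]

Here $\frac{\partial u}{\partial n}$ denotes the derivative of $u$ with respect to the outer normal $n$. Prove that any weak solution $u$ of (7.3.9) such that $u \in C^2(\Omega)$ satisfies

\[
\frac{\partial u}{\partial n} = 0 \quad \text{at every point of } \partial\Omega.
\]

Hint. Use an argument similar to that in Exercise 5.3.20. Notice that

\[
\|u\| = \left(\int_{\Omega} |\nabla u(x)|^2\,dx\right)^{\frac{1}{2}}
\]

is only a semi-norm on $W^{1,2}(\Omega)$!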

7.4 Weak Solutions, Application of Fixed Point Theorems

In this section we give an application of the fixed point theorems to proving the existence of a weak solution to the problem

\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{7.4.1}
\]

For a fixed $u \in W_0^{1,2}(\Omega)$ and $g \in CAR(\Omega \times \mathbb{R})$ satisfying (7.3.6) it is easy to see that

\[
\hat{L}_u : v \mapsto \int_{\Omega} \nabla u(x)\nabla v(x)\,dx, \qquad \hat{S}_u : v \mapsto \int_{\Omega} g(x, u(x))v(x)\,dx
\]

are continuous linear functionals on the space $W_0^{1,2}(\Omega)$. Since $W_0^{1,2}(\Omega)$ is a Hilbert space, by the Riesz Representation Theorem (Theorem 1.2.40) there exist uniquely determined elements $Lu, S(u) \in W_0^{1,2}(\Omega)$ such that

\[
(Lu, v)_{W_0^{1,2}(\Omega)} = \hat{L}_u(v), \qquad (S(u), v)_{W_0^{1,2}(\Omega)} = \hat{S}_u(v)
\tag{7.4.2}
\]

for all $v \in W_0^{1,2}(\Omega)$. To prove that the Dirichlet problem (7.4.1) has at least one weak solution it is necessary and sufficient to prove that the operator equation

\[
Lu = S(u)
\tag{7.4.3}
\]

has at least one solution in the space $W_0^{1,2}(\Omega)$. Let us therefore investigate the properties of the operators $L$ and $S$.


There are several equivalent inner products defined on $W_0^{1,2}(\Omega)$. If we choose

\[
(u, v)_{W_0^{1,2}(\Omega)} := \int_{\Omega} \nabla u(x)\nabla v(x)\,dx,
\]¹⁰

then $L$ defined by (7.4.2) is just an identity on $W_0^{1,2}(\Omega)$. On the other hand, for any $u \in W_0^{1,2}(\Omega)$ we also have $u \in L^2(\Omega)$, and Theorem 3.2.24 then implies that $g(x,u) \in L^2(\Omega)$. Let us assume that $g$ is Lipschitz continuous with respect to the second variable, i.e., there exists a constant $c > 0$ such that for a.a. $x \in \Omega$ and for any $s_1, s_2 \in \mathbb{R}$,

\[
|g(x, s_1) - g(x, s_2)| \le c|s_1 - s_2|.
\]

Then for $u_1, u_2 \in W_0^{1,2}(\Omega)$ we have

\[
\begin{aligned}
\|S(u_1) - S(u_2)\| &= \sup_{\|v\| \le 1} |(S(u_1) - S(u_2), v)| \\
&= \sup_{\|v\| \le 1} \left|\int_{\Omega} [g(x, u_1(x)) - g(x, u_2(x))]v(x)\,dx\right| \\
&\le \sup_{\|v\| \le 1} \int_{\Omega} c|u_1(x) - u_2(x)||v(x)|\,dx \\
&\le \sup_{\|v\| \le 1} c\|u_1 - u_2\|_{L^2(\Omega)}\|v\|_{L^2(\Omega)} \\
&\le \sup_{\|v\| \le 1} c\,c_{\mathrm{emb}}^2\|u_1 - u_2\|\|v\| = c\,c_{\mathrm{emb}}^2\|u_1 - u_2\|.
\end{aligned}
\tag{7.4.4}
\]¹¹

Hence (7.4.3) is equivalent in $W_0^{1,2}(\Omega)$ to the operator equation

\[
u = S(u)
\tag{7.4.5}
\]

where $S$ is a contraction if $c_{\mathrm{emb}}^2 c < 1$. We can apply the Contraction Principle to get the following result.

Theorem 7.4.1. Let $g \in CAR(\Omega \times \mathbb{R})$ be Lipschitz continuous with respect to the second variable with a constant $c > 0$, $c < c_{\mathrm{emb}}^{-2}$, where $c_{\mathrm{emb}}$ is the constant of the embedding of $W_0^{1,2}(\Omega)$ into $L^2(\Omega)$. Then there is a unique fixed point $u \in W_0^{1,2}(\Omega)$ of the operator $S$, i.e., $u$ is a unique weak solution of (7.4.1).

¹⁰ Cf. the Poincaré inequality (7.2.4).
¹¹ There exists a constant $c_{\mathrm{emb}} > 0$ such that $\|u\|_{L^2(\Omega)} \le c_{\mathrm{emb}}\|u\|$ for any $u \in W_0^{1,2}(\Omega)$ (cf. Theorem 1.2.26 and Exercise 1.2.46). It can be shown that the best value of $c_{\mathrm{emb}}$ is $\frac{1}{\sqrt{\lambda_1}}$ (cf. Remark 7.4.5 and also (7.2.4)).

Another possibility is to apply the Schauder Fixed Point Theorem. Let us assume that $g \in CAR(\Omega \times \mathbb{R})$ and

\[
|g(x,s)| \le r(x) \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}
\]

with a fixed $r \in L^2(\Omega)$. Then for any $u \in W_0^{1,2}(\Omega)$ we have

\[
\begin{aligned}
\|S(u)\| = \sup_{\|v\| \le 1} |(S(u), v)| &= \sup_{\|v\| \le 1} \left|\int_{\Omega} g(x, u(x))v(x)\,dx\right| \\
&\le \sup_{\|v\| \le 1} \left(\int_{\Omega} |g(x, u(x))|^2\,dx\right)^{\frac{1}{2}} \left(\int_{\Omega} |v(x)|^2\,dx\right)^{\frac{1}{2}} \le c_{\mathrm{emb}}\|r\|_{L^2(\Omega)}.
\end{aligned}
\tag{7.4.6}
\]

It follows from (7.4.6) that the operator $S$ maps the closure of the ball $B(o;R) \subset W_0^{1,2}(\Omega)$ with radius $R = c_{\mathrm{emb}}\|r\|_{L^2(\Omega)}$ into itself.

We will show that $S$ is compact. Indeed, let $\mathcal{M} \subset W_0^{1,2}(\Omega)$ be a bounded set and $\{w_n\}_{n=1}^{\infty} \subset S(\mathcal{M})$ an arbitrary sequence. Let $\{u_n\}_{n=1}^{\infty} \subset \mathcal{M}$ be such that $S(u_n) = w_n$. The reflexivity of $W_0^{1,2}(\Omega)$ implies that $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ at least for a subsequence. It follows from Theorem 1.2.28 that $u_n \to u$ in $L^2(\Omega)$. Estimates similar to those from (7.4.4) with $u_1$ replaced by $u_n$ and $u_2$ replaced by $u$ yield

\[
\|S(u_n) - S(u)\|_{W_0^{1,2}(\Omega)} \le c_{\mathrm{emb}}\|g(\cdot, u_n) - g(\cdot, u)\|_{L^2(\Omega)}.
\tag{7.4.7}
\]

The right-hand side approaches zero as follows from the continuity of the Nemytski operator from $L^2(\Omega)$ into $L^2(\Omega)$ (see Theorem 3.2.24), i.e.,

\[
w_n \to S(u) \quad \text{in } W_0^{1,2}(\Omega)
\]

(at least for a subsequence). This proves the compactness of $S(\mathcal{M})$. Note that (7.4.4) implies the continuity of $S$ as well, i.e., $S$ is a compact operator. Thus we have the following assertion.

Proposition 7.4.2. Let $g \in CAR(\Omega \times \mathbb{R})$ and let there exist $r \in L^2(\Omega)$ such that $|g(x,s)| \le r(x)$ for all $s \in \mathbb{R}$ and a.a. $x \in \Omega$. Then there is at least one fixed point $u \in W_0^{1,2}(\Omega)$ of $S$ (and so a weak solution of (7.4.1)) such that

\[
\|u\|_{W_0^{1,2}(\Omega)} \le c_{\mathrm{emb}}\|r\|_{L^2(\Omega)}.
\]
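Theorem 7.4.1 and the bound of Proposition 7.4.2 can be observed on a discrete model. The sketch below makes several assumptions of our own: $\Omega = (0,1)$, a finite-difference grid, the sample nonlinearity $g(x,s) = \sin s + 10x$ (Lipschitz in $s$ with constant $c = 1$ and bounded by $r(x) = 1 + 10x$), and hypothetical helper names. It iterates $u_{k+1} = S(u_k)$, where applying $S$ amounts to solving $-w'' = g(x, u_k)$ with zero boundary values, and records the step lengths in the discrete analogue of the $W_0^{1,2}$ norm.

```python
import math


def solve_dirichlet(rhs, h):
    """Solve (-u[i-1] + 2u[i] - u[i+1]) / h^2 = rhs[i], u = 0 at both ends.

    Thomas algorithm for the standard second-difference matrix.
    """
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] * h * h / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] * h * h + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def energy_norm(w, h):
    """Discrete analogue of (int |w'|^2 dx)^(1/2) for grid values vanishing at the ends."""
    padded = [0.0] + list(w) + [0.0]
    total = sum((padded[i + 1] - padded[i]) ** 2 for i in range(len(padded) - 1))
    return math.sqrt(total / h)


def picard(g, n=200, tol=1e-12, max_iter=100):
    """Iterate u <- S(u) from u = 0; return grid, last iterate and step lengths."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    u = [0.0] * n
    steps = []
    for _ in range(max_iter):
        u_new = solve_dirichlet([g(x, s) for x, s in zip(xs, u)], h)
        steps.append(energy_norm([a - b for a, b in zip(u_new, u)], h))
        u = u_new
        if steps[-1] < tol:
            break
    return xs, u, steps, h


def sample_g(x, s):
    """Sample nonlinearity (an assumption): Lipschitz constant 1 in s, |g| <= 1 + 10x."""
    return math.sin(s) + 10.0 * x


xs, u, steps, h = picard(sample_g)
# Least eigenvalue of the discrete Dirichlet Laplacian; it plays the role of lambda_1.
lam_min = 4.0 / (h * h) * math.sin(math.pi * h / 2.0) ** 2
```

In this discrete setting consecutive step lengths shrink at least by the factor $c/\lambda_{\min}$, the counterpart of $c\,c_{\mathrm{emb}}^2$ in (7.4.4), and the energy norm of the limit stays below the discrete version of $c_{\mathrm{emb}}\|r\|_{L^2(\Omega)}$.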

It is not hard to see that the assumptions of Proposition 7.4.2 are unnecessarily strong. In order to find a ball $B(o;R) \subset W_0^{1,2}(\Omega)$ which is mapped by $S$ into itself one can assume that $g$ has sublinear growth with respect to the second variable. More precisely, we assume that $g \in CAR(\Omega \times \mathbb{R})$ and there exist $r \in L^2(\Omega)$, $c > 0$ and $\delta \in (0,1)$ such that for all $s \in \mathbb{R}$ and a.a. $x \in \Omega$ we have

\[
|g(x,s)| \le r(x) + c|s|^{\delta}.
\tag{7.4.8}
\]


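Then for any $u \in W_0^{1,2}(\Omega)$, similarly to (7.4.6), we have

\[
\begin{aligned}
\|S(u)\| &\le \sup_{\|v\| \le 1} \left(\int_{\Omega} |g(x, u(x))|^2\,dx\right)^{\frac{1}{2}} \left(\int_{\Omega} |v(x)|^2\,dx\right)^{\frac{1}{2}} \\
&\le c_{\mathrm{emb}} \left(\int_{\Omega} \big|r(x) + c|u(x)|^{\delta}\big|^2\,dx\right)^{\frac{1}{2}} \\
&\le c_{\mathrm{emb}} \left[\left(\int_{\Omega} |r(x)|^2\,dx\right)^{\frac{1}{2}} + c\left(\int_{\Omega} |u(x)|^{2\delta}\,dx\right)^{\frac{1}{2}}\right]
\end{aligned}
\tag{7.4.9}
\]

where the last estimate is due to the Minkowski inequality (1.2.4) for $p = 2$. Applying the Hölder inequality we have

\[
\left(\int_{\Omega} |u(x)|^{2\delta}\,dx\right)^{\frac{1}{2}} \le \left(\int_{\Omega} |u(x)|^2\,dx\right)^{\frac{\delta}{2}} (\operatorname{meas} \Omega)^{\frac{1-\delta}{2}} \le c_{\mathrm{emb}}^{\delta}\,(\operatorname{meas} \Omega)^{\frac{1-\delta}{2}}\,\|u\|^{\delta}.
\tag{7.4.10}
\]

Now, (7.4.9) and (7.4.10) yield

\[
\|S(u)\| \le \underbrace{c_{\mathrm{emb}}\|r\|_{L^2(\Omega)}}_{C} + \underbrace{c\,c_{\mathrm{emb}}^{1+\delta}\,(\operatorname{meas} \Omega)^{\frac{1-\delta}{2}}}_{D}\,\|u\|^{\delta}.
\tag{7.4.11}
\]

It follows from (7.4.11) that for any $u \in B(o;R)$ we get

\[
\|S(u)\| < R \quad \text{provided} \quad C + DR^{\delta} < R
\tag{7.4.12}
\]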

and hence $S$ maps $B(o;R)$ into itself if $R$ is large enough. The map $S$ is compact as well (no growth restrictions on $g$ are needed to prove the compactness of $S$). Therefore we have a more general result than that in Proposition 7.4.2.

Theorem 7.4.3. Let $g \in CAR(\Omega \times \mathbb{R})$ satisfy (7.4.8) for $\delta < 1$. Then there is at least one weak solution $u$ of (7.4.1).

In Remark 7.4.5 we will show that it is not possible in general to allow the linear growth of $g = g(x,s)$ with respect to $s$. We will need the result from the next example.

Example 7.4.4. Let $\lambda_1 \in \mathbb{R}$ be defined as

\[
\lambda_1 = \inf_{\substack{u \in W_0^{1,2}(\Omega) \\ u \ne o}} \frac{\int_{\Omega} |\nabla u(x)|^2\,dx}{\int_{\Omega} |u(x)|^2\,dx}
\]

or equivalently as

\[
\lambda_1 = \inf\left\{\int_{\Omega} |\nabla u(x)|^2\,dx : \int_{\Omega} |u(x)|^2\,dx = 1,\ u \in W_0^{1,2}(\Omega)\right\}.
\]


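This is equivalent to the following characterization of $\lambda_1$:
\[ \frac{1}{\lambda_1} = \sup_{\substack{u \in W_0^{1,2}(\Omega) \\ u \ne o}} \frac{\int_\Omega |u(x)|^2 \, dx}{\int_\Omega |\nabla u(x)|^2 \, dx} \]
or
\[ \frac{1}{\lambda_1} = \sup \left\{ \int_\Omega |u(x)|^2 \, dx : \int_\Omega |\nabla u(x)|^2 \, dx = 1, \ u \in W_0^{1,2}(\Omega) \right\}. \]
The operator $A : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined by
\[ (A(u), v)_{W_0^{1,2}(\Omega)} = \int_\Omega u(x) v(x) \, dx, \qquad u, v \in W_0^{1,2}(\Omega), \]
is positive, self-adjoint and compact.[12] It follows then from Theorem 6.3.12 that the supremum $\frac{1}{\lambda_1}$ is achieved and the function $u_1 \in W_0^{1,2}(\Omega)$ such that
\[ \frac{1}{\lambda_1} = \int_\Omega |u_1(x)|^2 \, dx, \qquad \|u_1\|_{W_0^{1,2}(\Omega)} = 1, \]
is an eigenvector of $A$ (corresponding to the eigenvalue $\frac{1}{\lambda_1}$), cf. Example 6.3.15. Then
\[ \varphi_1 = \sqrt{\lambda_1} \, u_1 \]
satisfies
\[ \|\varphi_1\|_{L^2(\Omega)} = 1 \quad \text{and} \quad \lambda_1 = \int_\Omega |\nabla \varphi_1(x)|^2 \, dx. \]
Moreover, we have that
\[ \int_\Omega \nabla \varphi_1(x) \nabla v(x) \, dx = \lambda_1 \int_\Omega \varphi_1(x) v(x) \, dx \]
for any $v \in W_0^{1,2}(\Omega)$.[13] We will then call $\lambda_1$ the principal (least) eigenvalue and $\varphi_1$ the corresponding principal (normalized) eigenfunction of the eigenvalue problem
\[
\begin{cases}
-\Delta u(x) = \lambda u(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{7.4.13}
\]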

(cf. page 479). According to Example 5.4.40 the function $\varphi_1$ can be chosen positive in $\Omega$.

[12] Use the compact embedding $W_0^{1,2}(\Omega) \subset\subset L^2(\Omega)$ (Theorem 1.2.28).
[13] The reader is invited to prove this equality using the Lagrange Multiplier Method (Theorem 6.3.2).

Remark 7.4.5. Note that it follows from the definition of the principal eigenvalue $\lambda_1$ that
\[ \|u\|_{L^2(\Omega)} \le \frac{1}{\sqrt{\lambda_1}} \|u\|_{W_0^{1,2}(\Omega)} \quad \text{for any } u \in W_0^{1,2}(\Omega), \]
and $\frac{1}{\sqrt{\lambda_1}}$ is the least constant with this property. Hence
\[ c_{\mathrm{emb}} = \frac{1}{\sqrt{\lambda_1}}. \]
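To make the identity $c_{\mathrm{emb}} = 1/\sqrt{\lambda_1}$ concrete, the principal eigenvalue can be approximated numerically in the simplest case. The sketch below assumes $\Omega = (0,1)$ and $N = 1$, where $\lambda_1 = \pi^2$ and $\varphi_1$ is a multiple of $\sin(\pi x)$, and uses a finite-difference Laplacian. It applies power iteration to the discrete inverse Laplacian, a finite-dimensional stand-in for the operator $A$ whose largest eigenvalue is $1/\lambda_1$. The function names are hypothetical.

```python
import math

def solve_dirichlet(rhs, h):
    """Solve -u'' = rhs on a uniform grid with zero boundary values (Thomas algorithm)."""
    n = len(rhs)
    diag = [2.0] * n                      # main diagonal; off-diagonals are -1
    d = [h * h * r for r in rhs]
    for i in range(1, n):                 # forward elimination
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        d[i] += w * d[i - 1]
    u = [0.0] * n
    u[-1] = d[-1] / diag[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        u[i] = (d[i] + u[i + 1]) / diag[i]
    return u

def principal_eigenvalue(n=200, iters=60):
    """Power iteration for the discrete inverse Laplacian; returns (lambda_1, phi_1, h)."""
    h = 1.0 / (n + 1)
    u = [1.0] * n                         # positive start, not orthogonal to phi_1
    mu = 0.0
    for _ in range(iters):
        w = solve_dirichlet(u, h)         # w = A_h u
        mu = sum(a * b for a, b in zip(w, u)) / sum(b * b for b in u)
        norm = math.sqrt(h * sum(a * a for a in w))
        u = [a / norm for a in w]         # normalize in the discrete L^2 norm
    return 1.0 / mu, u, h

lam1, phi, h = principal_eigenvalue()
print(lam1, math.pi ** 2)                 # approximation vs. the exact value pi^2
print(1.0 / math.sqrt(lam1))              # approximate best embedding constant
```

The computed eigenvector has strictly positive entries, which matches the statement that $\varphi_1$ can be chosen positive in $\Omega$.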

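Let us now consider the problem (7.4.1) where $g$ is of the form
\[ g(x,u) = \lambda_1 u + f(x) \quad \text{with } f \in L^2(\Omega). \]
If $u \in W_0^{1,2}(\Omega)$ is a weak solution of
\[
\begin{cases}
-\Delta u(x) = \lambda_1 u(x) + f(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.4.14}
\]
then
\[ \int_\Omega \nabla u(x) \nabla v(x) \, dx = \lambda_1 \int_\Omega u(x) v(x) \, dx + \int_\Omega f(x) v(x) \, dx \]
for any $v \in W_0^{1,2}(\Omega)$. In particular, choosing $v = \varphi_1$ where $\varphi_1$ is the principal eigenfunction corresponding to $\lambda_1$, we have
\[ 0 = \int_\Omega \nabla u(x) \nabla \varphi_1(x) \, dx - \lambda_1 \int_\Omega u(x) \varphi_1(x) \, dx = \int_\Omega f(x) \varphi_1(x) \, dx, \]
i.e.,
\[ \int_\Omega f(x) \varphi_1(x) \, dx = 0 \]
is a necessary condition for (7.4.14) to have a weak solution (cf. Exercise 7.5.11). In other words, it is not possible to relax condition (7.4.8) to hold for $\delta = 1$ and to expect the existence of a weak solution of (7.4.1) at the same time.

Checking carefully estimates similar to (7.4.9) but with $\delta = 1$, we come to the conclusion that a small $c$ in (7.4.8) may compensate for the linear growth and Theorem 7.4.3 will still hold true. More precisely, assuming
\[ |g(x,s)| \le r(x) + c|s|, \tag{7.4.15} \]
we have for any $u \in W_0^{1,2}(\Omega)$, similarly to (7.4.9),
\[
\begin{aligned}
\|S(u)\| &\le \sup_{\|v\| \le 1} \left( \int_\Omega |g(x,u(x))|^2 \, dx \right)^{1/2} \left( \int_\Omega |v(x)|^2 \, dx \right)^{1/2} \\
&\le c_{\mathrm{emb}} \left( \int_\Omega \bigl| r(x) + c|u(x)| \bigr|^2 \, dx \right)^{1/2} \\
&\le c_{\mathrm{emb}} \left[ \left( \int_\Omega |r(x)|^2 \, dx \right)^{1/2} + c \left( \int_\Omega |u(x)|^2 \, dx \right)^{1/2} \right] \\
&\le c_{\mathrm{emb}} \|r\|_{L^2(\Omega)} + c \, c_{\mathrm{emb}}^2 \|u\|.
\end{aligned}
\]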


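If
\[ c < \frac{1}{c_{\mathrm{emb}}^2} \quad (\le \lambda_1), \tag{7.4.16} \]
then we can choose $R > 0$ such that
\[ c_{\mathrm{emb}} \|r\|_{L^2(\Omega)} + c \, c_{\mathrm{emb}}^2 R < R, \]
and for any $u \in B(o;R)$ we get
\[ \|S(u)\| < R. \tag{7.4.17} \]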
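The admissible radii can be written down explicitly. The following short computation is a sketch that uses only the condition $c < 1/c_{\mathrm{emb}}^2$ from (7.4.16); it solves the inequality $c_{\mathrm{emb}} \|r\|_{L^2(\Omega)} + c \, c_{\mathrm{emb}}^2 R < R$ for $R$.

```latex
\begin{aligned}
c_{\mathrm{emb}} \|r\|_{L^2(\Omega)} + c \, c_{\mathrm{emb}}^2 R < R
&\iff \bigl(1 - c \, c_{\mathrm{emb}}^2\bigr) R > c_{\mathrm{emb}} \|r\|_{L^2(\Omega)} \\
&\iff R > \frac{c_{\mathrm{emb}} \|r\|_{L^2(\Omega)}}{1 - c \, c_{\mathrm{emb}}^2},
\end{aligned}
```

where the division is legitimate because $1 - c \, c_{\mathrm{emb}}^2 > 0$. The lower bound for $R$ grows without limit as $c$ approaches $1/c_{\mathrm{emb}}^2$ from below.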

Since $S$ is also a compact operator under the growth condition (7.4.15), we can formulate

Theorem 7.4.6. Let $g \in CAR(\Omega \times \mathbb{R})$ and let there exist $r \in L^2(\Omega)$ and $c > 0$ such that (7.4.15) and (7.4.16) hold true. Then problem (7.4.1) has at least one weak solution $u \in W_0^{1,2}(\Omega)$.

Remark 7.4.7. It follows directly that the closer $c$ is to the value $\frac{1}{c_{\mathrm{emb}}^2}$, the larger $R$ has to be taken in order to get (7.4.17). At the same time one can see that the statement does not hold if $c = \lambda_1$ (see Remark 7.4.5).

Exercise 7.4.8. Prove that if $\lambda$ is an eigenvalue of (7.4.13), then $\lambda \ge \lambda_1$.
Hint. Apply Theorem 6.3.12 similarly to Example 6.3.15.

Exercise 7.4.9. Formulate and prove an analogue of Theorems 7.4.1, 7.4.3 and 7.4.6 for the Neumann problem
\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega, \\
\dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega.
\end{cases}
\]

7.5 Weak Solutions, Application of Degree Theory

In this section we show how to generalize assumptions (7.4.15) and (7.4.16). We apply the degree theory and prove that more general nonlinearities $g = g(x,s)$ can be considered in (7.4.1) than those satisfying (7.4.15) and (7.4.16). At the same time we must be aware of Remark 7.4.5 and avoid the situation treated there. But first we give an example concerning the higher eigenvalues of (7.4.13).

Example 7.5.1. If for a $\lambda$ there is $u \ne o$, $u \in W_0^{1,2}(\Omega)$, such that
\[ \int_\Omega \nabla u(x) \nabla v(x) \, dx = \lambda \int_\Omega u(x) v(x) \, dx \quad \text{holds for any } v \in W_0^{1,2}(\Omega), \tag{7.5.1} \]
then $\lambda$ is an eigenvalue and $u$ is the corresponding eigenfunction of (7.4.13). Let $A : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ be from Example 7.4.4. Then (7.5.1) is equivalent to the eigenvalue problem
\[ \mu u = A u \quad \text{where } \mu = \frac{1}{\lambda}. \tag{7.5.2} \]


It follows from Theorem 6.3.12 (cf. also the Hilbert–Schmidt Theorem, Theorem 2.2.16) that $A$ has a countable set of eigenvalues $\{\mu_n\}_{n=1}^{\infty}$ of finite multiplicity such that
\[ \mu_n \ge \mu_{n+1} \ge \cdots > 0, \qquad \mu_n \to 0. \]
Hence the eigenvalues of (7.4.13) form an increasing sequence
\[ 0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots, \qquad \lambda_n \to \infty. \]
In fact, it is also possible to prove that $\lambda_1$ has multiplicity 1 (i.e., $\lambda_1 < \lambda_2$) and the corresponding eigenspace is spanned by a function $\varphi_1$ which is positive in $\Omega$, see Example 5.4.40. The system of all normalized eigenfunctions $\{\varphi_n\}_{n=1}^{\infty}$ forms an orthonormal basis in $W_0^{1,2}(\Omega)$ (cf. the Hilbert–Schmidt Theorem (Theorem 2.2.16)). It follows from the regularity result (which is far from being simple for $N \ge 2$, see, e.g., Gilbarg & Trudinger [59], Fučík [53], Nečas [99]) that all eigenfunctions $\varphi_k$ are continuous functions in $\Omega$.

Let us assume now that the nonlinear function $g = g(x,s)$ is of the form
\[ g(x,s) = \lambda s + h(x,s) \quad \text{where } \lambda \in \mathbb{R}, \ h \in CAR(\Omega \times \mathbb{R}), \tag{7.5.3} \]
and there exists $r \in L^2(\Omega)$ such that
\[ |h(x,s)| \le r(x) \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}. \tag{7.5.4} \]

Now we can formulate the following assertion which generalizes (in the sense of the growth of $g$ with respect to $s$) the results of the previous section.

Theorem 7.5.2. Let $h \in CAR(\Omega \times \mathbb{R})$ satisfy (7.5.4). Let, moreover, $\lambda \ne \lambda_n$, $n = 1, 2, \dots$ Then the problem
\[
\begin{cases}
-\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.5.5}
\]
has at least one weak solution.

Note that the assertion of Theorem 7.5.2 does not hold without the assumption $\lambda \ne \lambda_n$, $n = 1, 2, \dots$, as shown in Remark 7.4.5.

Proof of Theorem 7.5.2. We will use the Leray–Schauder degree theory to prove the result. The reader is invited to verify that the existence of a weak solution of (7.5.5) is equivalent to the existence of a solution of the operator equation
\[ u = \lambda A u + S(u) \tag{7.5.6} \]
where $A$ is as in Example 7.4.4 and $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ is defined by
\[ (S(u), v) = \int_\Omega h(x, u(x)) v(x) \, dx \quad \text{for all } u, v \in W_0^{1,2}(\Omega). \]
Note that $S$ is a compact operator (see page 488).
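The equivalence between the weak formulation of (7.5.5) and the operator equation (7.5.6) can be spelled out in a few lines. This sketch assumes that the scalar product on $W_0^{1,2}(\Omega)$ is $(u,v) = \int_\Omega \nabla u \nabla v \, dx$, which is consistent with the normalization used in Example 7.4.4.

```latex
\begin{aligned}
& \int_\Omega \nabla u \nabla v \, dx = \lambda \int_\Omega u v \, dx + \int_\Omega h(x,u) v \, dx
  \quad \text{for all } v \in W_0^{1,2}(\Omega) \\
\iff\ & (u, v) = \lambda (A u, v) + (S(u), v) \quad \text{for all } v \in W_0^{1,2}(\Omega) \\
\iff\ & (u - \lambda A u - S(u), v) = 0 \quad \text{for all } v \in W_0^{1,2}(\Omega) \\
\iff\ & u = \lambda A u + S(u).
\end{aligned}
```

The second line uses the definitions of $A$ and $S$, and the last step takes $v = u - \lambda A u - S(u)$.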


Our plan is the following. The existence of at least one solution of (7.5.6) would follow from
\[ \deg(I - \lambda A - S(\cdot), B(o;R), o) \ne 0 \tag{7.5.7} \]
if we found a ball $B(o;R)$ for which (7.5.7) is valid. To prove (7.5.7) we use the homotopy invariance property of the degree (Theorem 5.2.13(vi)) and "connect" the operator $I - \lambda A - S(\cdot)$ with the operator $I - \lambda A$ on the boundary of a ball $B(o;R)$ with a sufficiently large radius $R > 0$. Once this is done we finally use $\deg(I - \lambda A, B(o;R), o) \ne 0$.[14] So, to complete the proof, we have to find an admissible homotopy connecting $I - \lambda A - S(\cdot)$ and $I - \lambda A$. Probably the simplest way to do it is the following. Define
\[ H(\tau, u) = u - \lambda A(u) - \tau S(u), \qquad \tau \in [0,1], \quad u \in W_0^{1,2}(\Omega), \]
and prove that there exists $R > 0$ such that for all $u \in W_0^{1,2}(\Omega)$, $\|u\| = R$ and $\tau \in [0,1]$ we have
\[ H(\tau, u) \ne o. \tag{7.5.8} \]
The usual way to establish (7.5.8) is an indirect proof. Assume that no such $R > 0$ exists, i.e., we can find sequences $\{u_n\}_{n=1}^{\infty} \subset W_0^{1,2}(\Omega)$ and $\{\tau_n\}_{n=1}^{\infty} \subset [0,1]$ such that $\|u_n\| \to \infty$ and
\[ u_n - \lambda A(u_n) - \tau_n S(u_n) = o. \tag{7.5.9} \]
Set $v_n := \dfrac{u_n}{\|u_n\|}$ and divide (7.5.9) by $\|u_n\|$ to get
\[ v_n - \lambda A(v_n) - \tau_n \frac{S(u_n)}{\|u_n\|} = o. \tag{7.5.10} \]
This is equivalent to
\[ \int_\Omega \nabla v_n(x) \nabla w(x) \, dx = \lambda \int_\Omega v_n(x) w(x) \, dx + \tau_n \int_\Omega \frac{h(x, u_n(x))}{\|u_n\|} w(x) \, dx \]
for any $w \in W_0^{1,2}(\Omega)$. Now, passing to suitable subsequences we can assume that $\tau_{n_k} \to \tau \in [0,1]$ and $v_{n_k} \rightharpoonup v$ in $W_0^{1,2}(\Omega)$. At the same time
\[ \int_\Omega \frac{|h(x, u_{n_k}(x))|}{\|u_{n_k}\|} |w(x)| \, dx \le \int_\Omega \frac{r(x)}{\|u_{n_k}\|} |w(x)| \, dx \to 0 \quad \text{as } k \to \infty. \]
To summarize, we have
\[ \tau_{n_k} \frac{S(u_{n_k})}{\|u_{n_k}\|} \to o, \tag{7.5.11} \]
\[ A(v_{n_k}) \to A(v) \tag{7.5.12} \]
(by the compactness of $A$, see Proposition 2.2.4(iii)).

[14] This follows from Proposition 5.2.22 or Exercise 5.2.26.


So, putting together (7.5.10)–(7.5.12) we also obtain that
\[ v_{n_k} \to v^* \quad \text{in } W_0^{1,2}(\Omega). \]
But $v^* = v$ by virtue of $v_{n_k} \rightharpoonup v$. Now, passing to the limit in (7.5.10) with $n$ replaced by $n_k$, we arrive at
\[ v - \lambda A(v) = o, \tag{7.5.13} \]
and $v \in W_0^{1,2}(\Omega)$ satisfies $\|v\| = 1$ (it is the strong limit of elements $v_{n_k}$ which satisfy $\|v_{n_k}\| = 1$!). However, this contradicts our assumption $\lambda \ne \lambda_n$, $n = 1, 2, \dots$. It proves that (7.5.8) holds, i.e., the homotopy $H$ is admissible. This completes the proof.

Following the scheme of the proof one can also handle nonlinearities of type (7.5.3) with $\lambda = \lambda_n$ for an $n = 1, 2, \dots$. However, the assumptions on $h$ must be strengthened in order "to help" to prove that the corresponding homotopy $H$ is admissible. The following assertion is a simple example of a generalization of Theorem 7.5.2 in this direction.

Theorem 7.5.3. Let $f \in L^2(\Omega)$ and $g : \mathbb{R} \to \mathbb{R}$ be continuous with finite limits $\lim_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that for all $s \in \mathbb{R}$ we have
\[ g(-\infty) < g(s) < g(+\infty). \]
Let $\lambda_1 > 0$ be the first eigenvalue of the Laplace operator subject to the Dirichlet boundary conditions. Then the problem
\[
\begin{cases}
-\Delta u(x) = \lambda_1 u(x) + g(u(x)) - f(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.5.14}
\]
has at least one weak solution if and only if
\[ g(-\infty) < \int_\Omega f(x) \varphi_1(x) \, dx < g(+\infty) \tag{7.5.15} \]
where $\varphi_1$ is the first positive eigenfunction normalized by $\int_\Omega \varphi_1(x) \, dx = 1$.

Proof. We will follow a scheme similar to the proof of Theorem 7.5.2. For $\delta > 0$ so small that $\lambda_1 + \delta < \lambda_2$ we define the homotopy[15]
\[ H(\tau, u) := u - \lambda_1 A(u) - (1 - \tau) \delta A(u) - \tau S(u), \qquad \tau \in [0,1], \quad u \in W_0^{1,2}(\Omega). \]
Performing all steps as in the proof of Theorem 7.5.2 we arrive at an analogue of (7.5.13), namely,
\[ v - [\lambda_1 + (1 - \tau)\delta] A(v) = o, \qquad \|v\| = 1, \qquad \text{for a } \tau \in [0,1]. \]
This is a contradiction if $\tau \ne 1$ since $\lambda_1 + (1 - \tau)\delta$ is not an eigenvalue and $v \ne o$.

[15] But now $(S(u), v) = \int_\Omega g(u(x)) v(x) \, dx$, $u, v \in W_0^{1,2}(\Omega)$.


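Let us assume $\tau = 1$, i.e., $\tau_{n_k} \to 1$. Now, however, we have no contradiction since $\lambda_1$ is an eigenvalue and $v - \lambda_1 A(v) = o$ has a solution with $\|v\| = 1$. Another step is necessary to reach a contradiction and to prove that the homotopy $H$ is admissible. We have to revise the last step when passing to the limit in
\[ v_n - \lambda_1 A(v_n) - (1 - \tau_n) \delta A(v_n) - \tau_n \frac{S(u_n)}{\|u_n\|} = o \]
and employ special properties of $S$. Namely,
\[ u_{n_k} - \lambda_1 A(u_{n_k}) - (1 - \tau_{n_k}) \delta A(u_{n_k}) - \tau_{n_k} S(u_{n_k}) = o \]
is equivalent to the integral identity
\[
\begin{aligned}
\int_\Omega \nabla u_{n_k}(x) \nabla w(x) \, dx ={}& [\lambda_1 + (1 - \tau_{n_k})\delta] \int_\Omega u_{n_k}(x) w(x) \, dx \\
&+ \tau_{n_k} \int_\Omega g(u_{n_k}(x)) w(x) \, dx - \tau_{n_k} \int_\Omega f(x) w(x) \, dx
\end{aligned}
\tag{7.5.16}
\]
for all $w \in W_0^{1,2}(\Omega)$. Taking $w = \varphi_1$ in (7.5.16) and using the fact that
\[ \int_\Omega \nabla u_{n_k}(x) \nabla \varphi_1(x) \, dx = \lambda_1 \int_\Omega u_{n_k}(x) \varphi_1(x) \, dx, \]
we obtain
\[ (1 - \tau_{n_k}) \delta \int_\Omega u_{n_k}(x) \varphi_1(x) \, dx + \tau_{n_k} \int_\Omega g(u_{n_k}(x)) \varphi_1(x) \, dx = \tau_{n_k} \int_\Omega f(x) \varphi_1(x) \, dx. \tag{7.5.17} \]
As above, $v_{n_k} := \dfrac{u_{n_k}}{\|u_{n_k}\|} \to v$ in $W_0^{1,2}(\Omega)$ and $v = \kappa \varphi_1$ with a $\kappa \ne 0$. Assume that $\kappa > 0$. Then (at least for a subsequence) $u_{n_k}(x) \to \infty$ a.e. in $\Omega$.[16] Passing to the limit in (7.5.17) and using $\tau_{n_k} \to 1^-$ and the Lebesgue Dominated Convergence Theorem we obtain
\[ \int_\Omega f(x) \varphi_1(x) \, dx \ge \lim_{k \to \infty} \int_\Omega g(u_{n_k}(x)) \varphi_1(x) \, dx = \int_\Omega g(+\infty) \varphi_1(x) \, dx = g(+\infty). \]
This contradicts the second inequality in (7.5.15). Similarly we proceed if $\kappa < 0$ to get a contradiction with the first inequality in (7.5.15). This proves that $H$ is admissible, and so (7.5.15) is sufficient for the existence of a weak solution of (7.5.14).

[16] Indeed, $v_{n_k} \to \kappa \varphi_1$ in $W_0^{1,2}(\Omega)$ implies $v_{n_k} \to \kappa \varphi_1$ in $L^2(\Omega)$ (Theorem 1.2.26). Hence there exists a subsequence $\{v_{n_{k_l}}\}_{l=1}^{\infty} \subset \{v_{n_k}\}_{k=1}^{\infty}$ such that $v_{n_{k_l}} \to \kappa \varphi_1 > 0$ a.e. in $\Omega$ (Remark 1.2.18), i.e., $u_{n_{k_l}}(x) \to \infty$ a.e. in $\Omega$.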


To prove that (7.5.15) is also necessary we proceed as follows. Let $u_0$ be a weak solution of (7.5.14), i.e., for any $w \in W_0^{1,2}(\Omega)$ let us have
\[ \int_\Omega \nabla u_0(x) \nabla w(x) \, dx = \lambda_1 \int_\Omega u_0(x) w(x) \, dx + \int_\Omega g(u_0(x)) w(x) \, dx - \int_\Omega f(x) w(x) \, dx. \]
Set $w = \varphi_1$. Then
\[ \int_\Omega g(u_0(x)) \varphi_1(x) \, dx = \int_\Omega f(x) \varphi_1(x) \, dx, \]
and the result follows from the fact that[17]
\[ g(-\infty) < \int_\Omega g(u_0(x)) \varphi_1(x) \, dx < g(+\infty). \]

[17] Indeed, note that $g(-\infty) < g(u_0(x)) < g(+\infty)$, $\varphi_1(x) > 0$ in $\Omega$, and $\int_\Omega \varphi_1(x) \, dx = 1$.

Remark 7.5.4. Theorem 7.5.3 holds for more general nonlinearities $g$ (see Exercises 7.5.9 and 7.5.10). Conditions of type (7.5.15) are called the Landesman–Lazer type conditions according to the paper Landesman & Lazer [83] where analogous results were proved for the first time.

Remark 7.5.5. The main difference between Theorems 7.5.2 and 7.5.3 consists in the fact that $\lambda$ is not an eigenvalue of (7.4.13) in Theorem 7.5.2 while $\lambda$ is an eigenvalue of (7.4.13) in Theorem 7.5.3. That is why we speak about a nonresonance problem in the former and about a resonance problem in the latter case. The word "resonance" is used here because of the fact that the corresponding ordinary differential version of (7.4.13) describes the resonance in electric circuits when $\lambda$ is an eigenvalue.

Remark 7.5.6. A result analogous to Theorem 7.5.3 can be proved also if $\lambda_1$ is replaced by any eigenvalue $\lambda_n$, $n \ge 2$. The Landesman–Lazer type conditions then have a different form and all linearly independent eigenfunctions associated with $\lambda_n$ must be involved (see, e.g., Fučík [53]).

Remark 7.5.7. Let us mention one geometrical aspect of condition (7.5.15). It follows from the linear Fredholm alternative (see Exercise 7.5.11) that
\[
\begin{cases}
-\Delta u(x) = \lambda_1 u(x) - f(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{7.5.18}
\]
with $f \in L^2(\Omega)$ has a weak solution if and only if $f$ belongs to a linear subspace $V$ of $L^2(\Omega)$ of codimension 1:
\[ V = \left\{ f \in L^2(\Omega) : \int_\Omega f(x) \varphi_1(x) \, dx = 0 \right\}. \]


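If $g(-\infty) < 0 < g(+\infty)$, then the set of all $f$ for which (7.5.14) has at least one weak solution contains $V$ as a proper subset and is in fact much larger (see Figure 7.5.1).

[Figure 7.5.1: sketch in $L^2(\Omega)$ showing the subspace $V$, the origin $o$, the direction $\varphi_1$, and the sets $\{ f : \int_\Omega f(x) \varphi_1(x) \, dx = g(\pm\infty) \}$.]

Figure 7.5.1. Problem (7.5.14) has a weak solution if and only if $f$ belongs to the shaded area.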

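Exercise 7.5.8. The same method as in the proof of Theorem 7.5.2 works if we replace (7.5.4) by a more general assumption
\[ |h(x,s)| \le r(x) + |s|^{\delta} \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}, \]
where $\delta \in (0,1)$. Modify the proof and get the corresponding generalization of Theorem 7.5.2.

Exercise 7.5.9. Let $h \in CAR(\Omega \times \mathbb{R})$ be bounded and satisfy the following conditions: Let there exist limits
\[ h(x, -\infty) = \lim_{s \to -\infty} h(x,s), \qquad h(x, +\infty) = \lim_{s \to +\infty} h(x,s) \]
for a.a. $x \in \Omega$. Assume that
\[ h(x, -\infty) < h(x, +\infty) \qquad \bigl( h(x, +\infty) < h(x, -\infty) \bigr) \quad \text{for a.a. } x \in \Omega. \]
Prove that
\[
\begin{cases}
-\Delta u(x) = \lambda_1 u(x) + h(x, u(x)) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{7.5.19}
\]
has at least one weak solution provided
\[
\begin{gathered}
\int_\Omega h(x, -\infty) \varphi_1(x) \, dx < 0 < \int_\Omega h(x, +\infty) \varphi_1(x) \, dx \\
\left( \int_\Omega h(x, +\infty) \varphi_1(x) \, dx < 0 < \int_\Omega h(x, -\infty) \varphi_1(x) \, dx \right).
\end{gathered}
\tag{7.5.20}
\]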


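Exercise 7.5.10. Assume that for a.a. $x \in \Omega$ and for all $s \in \mathbb{R}$ we have
\[ h(x, -\infty) < h(x,s) < h(x, +\infty) \qquad \bigl( h(x, +\infty) < h(x,s) < h(x, -\infty) \bigr). \]
Prove that (7.5.20) is also a necessary condition for the existence of a weak solution of (7.5.19).

Exercise 7.5.11. Let $f \in L^2(\Omega)$. Prove that (7.5.18) has a weak solution if and only if $f \in V$, i.e.,
\[ \int_\Omega f(x) \varphi_1(x) \, dx = 0. \]
(Cf. the Fredholm alternative mentioned in Theorem 2.2.9(iv).)
Hint. Let $A$ be from Example 7.4.4. Then $I - \lambda_1 A$ is a self-adjoint operator and $\operatorname{Ker}(I - \lambda_1 A) = \operatorname{Lin}\{\varphi_1\}$. Any $f \in L^2(\Omega)$ defines a continuous linear form $f^*$ on $W_0^{1,2}(\Omega)$ by
\[ f^*(u) = \int_\Omega f(x) u(x) \, dx, \qquad u \in W_0^{1,2}(\Omega). \]
It follows from Proposition 2.1.27(iv) that given $f \in L^2(\Omega)$ with $\int_\Omega f(x) \varphi_1(x) \, dx = 0$, then
\[ f^* \in (\operatorname{Ker}(I - \lambda_1 A))^{\perp} = \overline{\operatorname{Im}(I - \lambda_1 A)}. \]
However, $\overline{\operatorname{Im}(I - \lambda_1 A)} = \operatorname{Im}(I - \lambda_1 A)$ by Theorem 2.2.9(iii). On the other hand, if (7.5.18) has a weak solution $u_0$ for a given $f \in L^2(\Omega)$, then taking $v = \varphi_1$ as a test function we arrive at
\[ \int_\Omega f(x) \varphi_1(x) \, dx = 0. \]

Exercise 7.5.12. Prove an analogue of Theorem 7.5.2 for the Neumann problem
\[
\begin{cases}
-\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega, \\
\dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega.
\end{cases}
\]

Exercise 7.5.13. Prove an analogue of Theorem 7.5.3 for the Neumann problem
\[
\begin{cases}
-\Delta u(x) = g(u(x)) - f(x) & \text{in } \Omega, \\
\dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega.
\end{cases}
\]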

7.5A Application of the Degree of Generalized Monotone Operators

In this appendix we deal with the boundary value problem
\[
\begin{cases}
-\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(\lambda, x, u(x)) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{7.5.21}
\]


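where $p > 1$, $\Omega \in C^{0,1}$ is a bounded domain in $\mathbb{R}^N$, $f : \mathbb{R} \times \Omega \times \mathbb{R} \to \mathbb{R}$ is a Carathéodory function (see Remark 3.2.25) which satisfies some further conditions specified below, and
\[ \Delta_p u := \operatorname{div}\bigl( |\nabla u|^{p-2} \nabla u \bigr) \]
is the $p$-Laplacian.[18] It is well known that the problem
\[
\begin{cases}
-\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\]
has a principal eigenvalue (i.e., the least one) $\lambda_1 > 0$ which is simple, isolated and characterized variationally by
\[ \lambda_1 = \inf \frac{\int_\Omega |\nabla u(x)|^p \, dx}{\int_\Omega |u(x)|^p \, dx} \]
where "inf" is taken over all $u \in W_0^{1,p}(\Omega)$, $u \ne o$; no eigenfunction associated with $\lambda_1$ changes sign in $\Omega$ and they all form a one-dimensional linear space (see, e.g., Anane [7], Lindqvist [86], or Example 6.3.5 for the case $N = 1$).

[18] This means
\[
\Delta_p u = \frac{\partial}{\partial x_1} \left\{ \left[ \left( \frac{\partial u}{\partial x_1} \right)^2 + \cdots + \left( \frac{\partial u}{\partial x_N} \right)^2 \right]^{\frac{p-2}{2}} \frac{\partial u}{\partial x_1} \right\} + \cdots + \frac{\partial}{\partial x_N} \left\{ \left[ \left( \frac{\partial u}{\partial x_1} \right)^2 + \cdots + \left( \frac{\partial u}{\partial x_N} \right)^2 \right]^{\frac{p-2}{2}} \frac{\partial u}{\partial x_N} \right\},
\]
and $\Delta_p u = 0$ if $\nabla u = o$.

Our aim in this appendix is to show that under some appropriate assumptions on $f$ the value $\lambda_1$ is a bifurcation point of (7.5.21) in the sense that for any neighborhood of $(\lambda_1, o) \in \mathbb{R} \times W_0^{1,p}(\Omega)$ there exists at least one $(\lambda, u) \in \mathbb{R} \times W_0^{1,p}(\Omega)$, $u \ne o$, which solves (in the sense mentioned below) the boundary value problem (7.5.21).

Let us assume, for simplicity, that $f = f(\lambda, x, s)$ is uniformly bounded, i.e., there exists a constant $M > 0$ such that for any $(\lambda, s) \in \mathbb{R}^2$ and a.a. $x \in \Omega$,
\[ |f(\lambda, x, s)| \le M. \tag{7.5.22} \]
Let us define for $\lambda \in \mathbb{R}$ operators $J, S, F_\lambda : W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ by
\[
\begin{aligned}
\langle J(u), v \rangle &= \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x) \, dx, \qquad
\langle S(u), v \rangle = \int_\Omega |u(x)|^{p-2} u(x) v(x) \, dx, \\
\langle F_\lambda(u), v \rangle &= \int_\Omega f(\lambda, x, u(x)) v(x) \, dx \qquad \text{for } u, v \in W_0^{1,p}(\Omega).
\end{aligned}
\]
It follows from (7.5.22), Theorem 3.2.24 and Remark 3.2.26 that $J$, $S$ and $F_\lambda$ are well-defined operators (the reader should justify it in detail!). Actually, combining Theorems 1.2.26, 1.2.28, 3.2.24 and Remark 3.2.26 we have (cf. Exercise 7.5.18):[19]

(a) $J$ and $S$ are bounded operators in the sense that they map bounded sets onto bounded sets;
(b) $F_\lambda$ is a uniformly bounded operator in the sense that there exists a constant $c > 0$ such that for any $\lambda \in \mathbb{R}$ and $u \in W_0^{1,p}(\Omega)$ we have $\|F_\lambda(u)\|_{(W_0^{1,p}(\Omega))^*} \le c$;
(c) $J$, $S$ and $F_\lambda$ are continuous operators;
(d) for any $u, v \in W_0^{1,p}(\Omega)$, $\langle J(u) - J(v), u - v \rangle \ge \bigl( \|u\|^{p-1} - \|v\|^{p-1} \bigr) \bigl( \|u\| - \|v\| \bigr)$;
(e) if $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ and $\lambda_n \to \lambda$ in $\mathbb{R}$, then we have $S(u_n) \to S(u)$ and $F_{\lambda_n}(u_n) \to F_\lambda(u)$ in $(W_0^{1,p}(\Omega))^*$; in particular, $S$ and $F_\lambda$ are compact.

[19] We work with the norm $\|u\| = \left( \int_\Omega |\nabla u(x)|^p \, dx \right)^{1/p}$ on $W_0^{1,p}(\Omega)$, i.e., $\langle J(u), u \rangle = \|u\|^p$.

We say that a couple $(\lambda, u) \in \mathbb{R} \times W_0^{1,p}(\Omega)$ is a weak solution of (7.5.21) if
\[ \langle J(u), v \rangle = \lambda \langle S(u), v \rangle + \langle F_\lambda(u), v \rangle \quad \text{holds for any } v \in W_0^{1,p}(\Omega). \]
Thus, to find a weak solution of (7.5.21) is equivalent to finding a couple $(\lambda, u) \in \mathbb{R} \times W_0^{1,p}(\Omega)$ which satisfies the operator equation
\[ J(u) = \lambda S(u) + F_\lambda(u). \tag{7.5.23} \]

Lemma 7.5.14. For any $\lambda \in \mathbb{R}$ the operator $J - \lambda S - F_\lambda$ satisfies the $(S_+)$ condition (see Definition 5.2.44).

Proof. We use the property (d) to prove that $J$ satisfies the $(S_+)$ condition (we proceed similarly to Example 5.2.51). The assertion then follows from (e) and Lemma 5.2.46.

Assume that for any $\lambda \in \mathbb{R}$ and a.a. $x \in \Omega$ we have
\[ f(\lambda, x, 0) = 0. \tag{7.5.24} \]
This immediately yields
\[ F_\lambda(o) = o \quad \text{for any } \lambda \in \mathbb{R}. \]
We also assume that for any bounded interval $I \subset \mathbb{R}$ the limit
\[ \lim_{s \to 0} \frac{f(\lambda, x, s)}{|s|^{p-2} s} = 0 \tag{7.5.25} \]
exists uniformly for a.a. $x \in \Omega$ and $\lambda \in I$. This implies
\[ \lim_{\|u\| \to 0} \frac{F_\lambda(u)}{\|u\|^{p-1}} = o \quad \text{uniformly for } \lambda \in I \tag{7.5.26} \]
(cf. Exercise 7.5.19). The fact that the eigenvalue $\lambda_1 > 0$ is isolated and the compactness of $S$ imply that there exists $\delta > 0$ such that for any $\lambda \in (\lambda_1 - \delta, \lambda_1 + \delta)$, $\lambda \ne \lambda_1$, there exists $c = c(\lambda) > 0$ such that
\[ \|J(u) - \lambda S(u)\|_{(W_0^{1,p}(\Omega))^*} \ge c(\lambda) \|u\|^{p-1} \quad \text{for any } u \in W_0^{1,p}(\Omega) \tag{7.5.27} \]
(cf. Exercise 7.5.20). Denote $T_\lambda := J - \lambda S - F_\lambda$. Then (7.5.26) and (7.5.27) imply that for $\lambda \in (\lambda_1 - \delta, \lambda_1 + \delta)$, $\lambda \ne \lambda_1$, $o$ is an isolated solution of $T_\lambda(u) = o$ and its index $i(T_\lambda, o)$[20] is well defined.

[20] Here we mean the index from Appendix 5.2B, page 304.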


Proposition 7.5.15. Let us assume
\[ \lim_{\lambda \to \lambda_1^-} i(T_\lambda, o) \ne \lim_{\lambda \to \lambda_1^+} i(T_\lambda, o). \]
Then $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, i.e., there exist $\lambda_n \to \lambda_1$, $u_n \to o$, $u_n \ne o$, such that $T_{\lambda_n}(u_n) = o$.

Proof. To prove this assertion we can follow the proof of Proposition 5.2.20, but working with the index defined on page 304.

Combining (7.5.26) and (7.5.27) we obtain that
\[ i(T_\lambda, o) = i(J - \lambda S, o) \quad \text{for } \lambda \in (\lambda_1 - \delta, \lambda_1 + \delta), \ \lambda \ne \lambda_1 \tag{7.5.28} \]
(cf. Exercise 7.5.21). It then follows that, to prove that $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, it suffices to prove that
\[ \lim_{\lambda \to \lambda_1^-} i(J - \lambda S, o) \ne \lim_{\lambda \to \lambda_1^+} i(J - \lambda S, o). \tag{7.5.29} \]
We are now ready to prove the following assertion.

Theorem 7.5.16. Let $f$ satisfy all the assumptions stated above. Then $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, i.e., of (7.5.21).

Proof. Let $\delta > 0$ be such that (7.5.28) holds. Taking into account the above discussion it is enough to prove
\[ i(J - \lambda S, o) = 1 \quad \text{for } \lambda \in (\lambda_1 - \delta, \lambda_1), \tag{7.5.30} \]
\[ i(J - \lambda S, o) = -1 \quad \text{for } \lambda \in (\lambda_1, \lambda_1 + \delta). \tag{7.5.31} \]
It follows from the variational characterization of $\lambda_1 > 0$ that
\[ \langle J(u) - \lambda S(u), u \rangle > 0 \quad \text{for } u \ne o \text{ if } \lambda \in (\lambda_1 - \delta, \lambda_1). \]
Applying Proposition 5.2.48 we immediately get (7.5.30). To prove (7.5.31) we proceed in the following way. There exists $K > 0$ large enough so that we can define a function $\psi : \mathbb{R} \to \mathbb{R}$ by
\[
\psi(t) =
\begin{cases}
0 & \text{for } t \le K, \\
\dfrac{2\delta}{\lambda_1} (t - 2K) & \text{for } t \ge 3K,
\end{cases}
\]
and such that $\psi(t)$ is continuously differentiable in $\mathbb{R}$, positive and strictly convex in $(K, 3K)$ (see Figure 7.5.2, the reader is invited to write an explicit formula for such $\psi$). We define a functional
\[ \Psi_\lambda(u) := \frac{1}{p} \|u\|^p - \frac{\lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right), \qquad u \in W_0^{1,p}(\Omega). \]
Then $\Psi_\lambda$ is continuously Fréchet differentiable and its critical point $u_0 \in W_0^{1,p}(\Omega)$ corresponds to a solution of the equation
\[ J(u_0) - \frac{\lambda}{1 + \psi'\left( \frac{1}{p} \|u_0\|^p \right)} S(u_0) = o. \]

[Figure 7.5.2: graph of $\psi(t)$ against $t$, with the points $0$, $K$, $2K$ and $3K$ marked on the $t$-axis.]

Figure 7.5.2.
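One explicit choice of $\psi$ with the required properties is obtained by gluing a quadratic piece between the two prescribed branches. This is only one possible formula, not necessarily the one the authors have in mind: on $[K, 3K]$ take $\psi(t) = a (t - K)^2$ with $a = \delta / (2 \lambda_1 K)$, so that both the value $2\delta K / \lambda_1$ and the slope $2\delta / \lambda_1$ match the linear branch at $t = 3K$. The function names and the sample parameter values below are hypothetical.

```python
def make_psi(K, delta, lam1):
    """Return (psi, dpsi): a C^1 function that is 0 for t <= K, quadratic on
    [K, 3K], and equal to (2*delta/lam1)*(t - 2K) for t >= 3K."""
    a = delta / (2.0 * lam1 * K)          # quadratic coefficient fixed by C^1 matching at 3K
    slope = 2.0 * delta / lam1            # slope of the linear branch

    def psi(t):
        if t <= K:
            return 0.0
        if t >= 3.0 * K:
            return slope * (t - 2.0 * K)
        return a * (t - K) ** 2           # positive and strictly convex on (K, 3K)

    def dpsi(t):
        if t <= K:
            return 0.0
        if t >= 3.0 * K:
            return slope
        return 2.0 * a * (t - K)          # increases from 0 to slope on [K, 3K]

    return psi, dpsi

# Sample values chosen only for illustration.
K, delta, lam1 = 1.0, 0.5, 2.0
psi, dpsi = make_psi(K, delta, lam1)
print(psi(3.0 * K), dpsi(3.0 * K))        # value and slope at the gluing point t = 3K
```

With this choice $\psi'$ takes every value in $(0, 2\delta/\lambda_1)$ exactly once on $(K, 3K)$, which is the property used when locating the nonzero critical points of $\Psi_\lambda$.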

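However, since $\lambda \in (\lambda_1, \lambda_1 + \delta)$ and there is only one eigenvalue below $\lambda_1 + \delta$, a nonzero critical point $u_0$ of $\Psi_\lambda$ has to satisfy
\[ \frac{\lambda}{1 + \psi'\left( \frac{1}{p} \|u_0\|^p \right)} = \lambda_1, \quad \text{i.e.,} \quad \psi'\left( \frac{1}{p} \|u_0\|^p \right) = \frac{\lambda}{\lambda_1} - 1. \tag{7.5.32} \]
Due to the properties of $\psi$ we necessarily have
\[ \frac{1}{p} \|u_0\|^p \in (K, 3K) \]
and due to (7.5.32) and the simplicity of $\lambda_1$, either $u_0 = -u_1$ or $u_0 = u_1$ for a $u_1 > 0$ which is an eigenfunction associated with $\lambda_1$. So, we conclude that there are precisely three isolated critical points of $\Psi_\lambda$: $-u_1$, $o$, $u_1$.

The functional $\Psi_\lambda$ is weakly sequentially lower semicontinuous. Indeed, assume that $u_n \rightharpoonup v_0$ in $W_0^{1,p}(\Omega)$. Then
\[ \|u_n\|^p_{L^p(\Omega)} \to \|v_0\|^p_{L^p(\Omega)} \tag{7.5.33} \]
due to the compactness of $W_0^{1,p}(\Omega) \subset\subset L^p(\Omega)$, and then
\[ \liminf_{n \to \infty} \Psi_\lambda(u_n) \ge \Psi_\lambda(v_0) \]
by the fact that $\liminf_{n \to \infty} \|u_n\| \ge \|v_0\|$, (7.5.33) holds, and $\psi$ is increasing. Observe that $\Psi_\lambda$ is weakly coercive, i.e.,
\[ \lim_{\|u\| \to \infty} \Psi_\lambda(u) = \infty. \tag{7.5.34} \]
Indeed, we have
\[ \Psi_\lambda(u) = \frac{1}{p} \|u\|^p - \frac{\lambda_1}{p} \|u\|^p_{L^p(\Omega)} + \frac{\lambda_1 - \lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right). \]
Since
\[ \|u\|^p - \lambda_1 \|u\|^p_{L^p(\Omega)} \ge 0 \quad \text{for any } u \in W_0^{1,p}(\Omega), \tag{7.5.35} \]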


we also have ($\lambda > \lambda_1$)
\[
\begin{aligned}
\frac{\lambda_1 - \lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right)
&\ge \frac{\lambda_1 - \lambda}{p \lambda_1} \|u\|^p + \psi\left( \frac{1}{p} \|u\|^p \right) \\
&\ge -\frac{\delta}{p \lambda_1} \|u\|^p + \frac{2\delta}{\lambda_1} \left( \frac{1}{p} \|u\|^p - 2K \right) \to \infty
\end{aligned}
\tag{7.5.36}
\]
for $\|u\| \to \infty$, and (7.5.34) follows. Since $\Psi_\lambda$ is an even functional, there are precisely two critical points at which the global minimum is achieved: $-u_1$ and $u_1$. The third critical point $o$ is obviously an isolated critical point of "saddle type".[21] By virtue of Proposition 5.2.50 we have[22]
\[ i(\Psi_\lambda', -u_1) = i(\Psi_\lambda', u_1) = 1. \tag{7.5.37} \]
At the same time, due to the definition of $\psi$, we have
\[ \langle \Psi_\lambda'(u), u \rangle > 0 \quad \text{for any } u \in W_0^{1,p}(\Omega), \ \|u\| = \kappa \tag{7.5.38} \]
with $\kappa > 0$ large enough (cf. Exercise 7.5.22). Hence, Proposition 5.2.48 implies that
\[ \deg(\Psi_\lambda', B(o; \kappa), o) = 1. \tag{7.5.39} \]
Take $\kappa > 0$ so large that $\pm u_1 \in B(o; \kappa)$. By Proposition 5.2.49 (the additivity property), (7.5.37) and (7.5.39), we deduce
\[ i(\Psi_\lambda', o) = -1. \tag{7.5.40} \]
Since $\psi$ vanishes in a small neighborhood of $0$, we also have
\[ i(J - \lambda S, o) = i(\Psi_\lambda', o). \tag{7.5.41} \]
Hence (7.5.31) follows from (7.5.40) and (7.5.41).

[21] Consider $\Psi_\lambda$ in the direction of $u_1$ to get a local maximum at $o$ due to the variational characterization of $\lambda_1$, and, on the other hand, in the direction of $u \ne o$ satisfying $\int_\Omega |\nabla u(x)|^p \, dx - \lambda \int_\Omega |u(x)|^p \, dx > 0$ to get $\Psi_\lambda(tu) > 0$, $t \ne 0$.
[22] The reader is invited to check that the assumptions of Proposition 5.2.50 are satisfied.

Remark 7.5.17. Note that a global bifurcation result in the sense of Appendix 5.2A can be proved from (7.5.30) and (7.5.31) (see, e.g., Drábek, Kufner & Nicolosi [42]). We do not treat it here because other technical details have to be involved.

Exercise 7.5.18. Prove rigorously properties (a)–(e) from the beginning of Appendix 7.5A.
Hint. Use the Hölder inequality and (7.5.22) for the proofs of (a)–(c). For the proof of (d) adopt the estimate similar to the one from Example 5.3.24. For the proof of (e) use the Rellich–Kondrachov Theorem (Theorem 1.2.28) and Theorem 3.2.24.

Exercise 7.5.19. Prove that (7.5.25) implies (7.5.26).
Hint. Apply the Lebesgue Dominated Convergence Theorem.

Exercise 7.5.20. Prove (7.5.27).
Hint. The inequality (7.5.27) is equivalent to
\[ \|J(u) - \lambda S(u)\|_{(W_0^{1,p}(\Omega))^*} \ge c(\lambda) \quad \text{for all } \|u\| = 1. \]
Proceed via contradiction and use the compactness of $S$ and the fact that $\lambda$ is not an eigenvalue.


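Exercise 7.5.21. Prove (7.5.28).
Hint. Use (7.5.26) and (7.5.27) and the homotopy invariance property of the degree on the ball with a sufficiently small radius.

Exercise 7.5.22. Prove (7.5.38).
Hint.
\[
\begin{aligned}
\langle \Psi_\lambda'(u), u \rangle
&= \|u\|^p - \lambda \|u\|^p_{L^p(\Omega)} + \psi'\left( \frac{1}{p} \|u\|^p \right) \|u\|^p \\
&= \|u\|^p - \lambda_1 \|u\|^p_{L^p(\Omega)} + \psi'\left( \frac{1}{p} \|u\|^p \right) \left[ \|u\|^p - \frac{\lambda - \lambda_1}{\psi'\left( \frac{1}{p} \|u\|^p \right)} \|u\|^p_{L^p(\Omega)} \right] \\
&\ge \frac{2\delta}{\lambda_1} \left[ \|u\|^p - \frac{\lambda_1}{2} \|u\|^p_{L^p(\Omega)} \right] \to \infty \quad \text{for } \|u\| \to \infty,
\end{aligned}
\]
by the variational characterization of $\lambda_1$.

Exercise 7.5.23. Let $\lambda < \lambda_1$ and let $f \in L^{p'}(\Omega)$. Prove that there exists a unique weak solution $u \in W_0^{1,p}(\Omega)$ of the problem
\[
\begin{cases}
-\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\]
where $p > 1$, $p' = \dfrac{p}{p-1}$.

Exercise 7.5.24. Let $\lambda < \lambda_1$ and let $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a bounded Carathéodory function. Prove that there exists a weak solution $u \in W_0^{1,p}(\Omega)$ of the problem
\[
\begin{cases}
-\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x, u(x)) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega
\end{cases}
\]
where $p > 1$.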

7.6 Weak Solutions, Application of Theory of Monotone Operators

Before we apply Corollary 5.3.9 we revise our growth conditions on $g = g(x,s)$ with respect to the second variable. According to (7.3.6) and Theorem 3.2.24 we have $g(x, u(x)) \in L^2(\Omega)$ and the corresponding Nemytski operator is continuous from $L^2(\Omega)$ into $L^2(\Omega)$. The operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined at the beginning of Section 7.4 is then continuous as follows from the estimate (7.4.4). Our goal is to show that a growth condition more general than (7.3.6) can be considered in order to get analogous results. For this purpose, however, we have to substitute the embedding
\[ W_0^{1,2}(\Omega) \subset L^2(\Omega) \tag{7.6.1} \]
by a more general one. Recall that $\Omega \in C^{0,1}$ is a bounded domain. Namely, we have (see Theorem 1.2.26 or Kufner, John & Fučík [82])


(i) $N = 1 \implies \|u\|_{C(\overline{\Omega})} \le c_1 \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$;
(ii) $N = 2 \implies \|u\|_{L^q(\Omega)} \le c_{2,q} \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$, where $q \ge 1$ is arbitrary;
(iii) $N \ge 3 \implies \|u\|_{L^q(\Omega)} \le c_{N,q} \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$, where $1 \le q \le \frac{2N}{N-2}$.

The estimate (7.4.4) can then be modified as follows:
\[ \|S(u_1) - S(u_2)\|_{W_0^{1,2}(\Omega)} \le \|G(u_1) - G(u_2)\|_X \quad \text{where } G(u) = g(\cdot, u(\cdot)) \]
and
(i) for $N = 1$, $X = L^1(\Omega)$;
(ii) for $N = 2$, $q \ge 1$ arbitrary, $X = L^{q'}(\Omega)$ where $q' = \frac{q}{q-1}$;
(iii) for $N \ge 3$, $1 \le q \le \frac{2N}{N-2}$, $X = L^{q'}(\Omega)$ where $q' = \frac{q}{q-1}$.

The operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ is continuous provided the Nemytski operator $G$ is continuous
(i) from $L^\infty(\Omega)$ to $L^1(\Omega)$ for $N = 1$;
(ii) from $L^q(\Omega)$ to $L^{q'}(\Omega)$ for $N = 2$ where $q \ge 1$ is arbitrary;
(iii) from $L^q(\Omega)$ to $L^{q'}(\Omega)$ for $N \ge 3$ where $1 \le q \le \frac{2N}{N-2}$.

It follows from Theorem 3.2.24 that the following growth conditions guarantee the desired continuity of $G$:
(i) for $N = 1$:
\[ |g(x,s)| \le r(x) + C(|s|) \quad \text{where } r \in L^1(\Omega) \tag{7.6.2} \]
and $C(t)$ is a nonnegative continuous function of the variable $t \ge 0$;
(ii) for $N = 2$:
\[ |g(x,s)| \le r(x) + c|s|^{q-1} \tag{7.6.3} \]
where $r \in L^{\frac{q}{q-1}}(\Omega)$, $c > 0$, $q \ge 1$ is arbitrary;
(iii) for $N \ge 3$:
\[ |g(x,s)| \le r(x) + c|s|^{\frac{N+2}{N-2}} \quad \text{where } r \in L^{\frac{2N}{N+2}}(\Omega), \ c > 0. \tag{7.6.4} \]

The reader should verify in detail that the growth conditions (7.6.2)–(7.6.4) generalize the condition
\[ |g(x,s)| \le r(x) + c|s|, \qquad r \in L^2(\Omega), \ c > 0. \tag{7.6.5} \]
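The verification requested above can be sketched as follows. The computation assumes only that $\Omega$ is bounded, so that $L^2(\Omega) \subset L^s(\Omega)$ for every $1 \le s \le 2$; the particular choices of $C(t)$ and $q$ are one convenient option, not the only one.

```latex
\begin{aligned}
N = 1:&\quad \text{take } C(t) = ct; \text{ then (7.6.5) gives (7.6.2), since } r \in L^2(\Omega) \subset L^1(\Omega). \\
N = 2:&\quad \text{take } q = 2; \text{ then } |s|^{q-1} = |s| \text{ and } L^{\frac{q}{q-1}}(\Omega) = L^2(\Omega),
        \text{ so (7.6.5) is (7.6.3) with } q = 2. \\
N \ge 3:&\quad \tfrac{N+2}{N-2} > 1 \implies |s| \le 1 + |s|^{\frac{N+2}{N-2}}
        \implies |g(x,s)| \le \bigl( r(x) + c \bigr) + c|s|^{\frac{N+2}{N-2}}, \\
       &\quad \text{and } r + c \in L^2(\Omega) \subset L^{\frac{2N}{N+2}}(\Omega)
        \text{ because } \tfrac{2N}{N+2} < 2.
\end{aligned}
```

In each case a function $g$ satisfying (7.6.5) therefore satisfies the corresponding condition among (7.6.2)–(7.6.4), possibly with a different $r$.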

It is clear that the larger $q$ we choose in (7.6.3) the more general condition for $g$ we obtain. It is also clear that all conditions (7.6.2)–(7.6.4) generalize (7.6.5) in the sense that more nonlinearities $g$ can be taken into account in the nonlinear problem
\[
\begin{cases}
-\Delta u(x) = g(x, u(x)) & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{7.6.6}
\]


and the definition of weak solution still makes sense. Moreover, the operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined by
\[ (S(u), v) = \int_\Omega g(x, u(x)) v(x) \, dx \quad \text{for all } u, v \in W_0^{1,2}(\Omega) \]
is also a well-defined continuous operator.

Warning. However, $S$ is not compact in general!

Remark 7.6.1. In order to prove the compactness of $S$, we need to employ some compact embeddings of $W_0^{1,2}(\Omega)$. Namely, we use the following ones (see Theorem 1.2.28):
(i) $N = 1 \implies W_0^{1,2}(\Omega) \subset\subset C(\overline{\Omega})$;
(ii) $N = 2 \implies W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ where $q \ge 1$ is arbitrary;
(iii) $N \ge 3 \implies W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ where $1 \le q < \frac{2N}{N-2}$.
So, $S$ is compact if $N = 1$ and $N = 2$ provided (7.6.2) and (7.6.3), respectively, hold.[23] To get compactness also for $N \ge 3$ we need to modify the growth condition (7.6.4) as follows: there exists $\varepsilon > 0$ (arbitrarily small) such that for a.a. $x \in \Omega$ and all $s \in \mathbb{R}$ we have
\[ |g(x,s)| \le r(x) + c|s|^{\frac{N+2}{N-2}(1 - \varepsilon)} \quad \text{where } r \in L^{\frac{2N}{N+2}}(\Omega), \ c > 0. \tag{7.6.7} \]
Indeed, let $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$. Then $u_n \to u$ in $L^q(\Omega)$ for $q$ arbitrarily close to $\frac{2N}{N-2}$, e.g., $q = \frac{2N}{N-2}(1 - \varepsilon)$. Then (7.6.7) and Theorem 3.2.24 imply that
\[ g(\cdot, u_n) \to g(\cdot, u) \quad \text{in } L^{\frac{2N}{N+2}}(\Omega). \]
As a consequence we obtain
\[ S(u_n) \to S(u) \quad \text{in } W_0^{1,2}(\Omega).\text{[24]} \]

Let us give an application of Corollary 5.3.9.

Theorem 7.6.2. Let $g \in CAR(\Omega \times \mathbb{R})$ satisfy one of the growth conditions (7.6.2)–(7.6.4) depending on $N$. Moreover, let $g(x, \cdot)$ be a decreasing function for a.a. $x \in \Omega$. Then (7.6.6) has a unique weak solution.

Proof. Set $T = L - S$, i.e.,[25]
\[ (T(u), v) = \int_\Omega \nabla u(x) \nabla v(x) \, dx - \int_\Omega g(x, u(x)) v(x) \, dx \quad \text{for any } u, v \in W_0^{1,2}(\Omega). \]

[23] The reader is invited to repeat the argument from the beginning of Section 7.4 to prove the compactness of $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ for $N = 1, 2$.
[24] The reader is invited to perform these steps in detail.
[25] For the definition of $L : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ see (7.4.2) and for the definition of $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ see above in this section.


Then T is a continuous operator from W01,2 (Ω) into itself. Moreover, for u1 , u2 ∈ W01,2 (Ω) we have (T (u1 ) − T (u2 ), u1 − u2 ) = ∇(u1 (x) − u2 (x))∇(u1 (x) − u2 (x)) dx Ω − [g(x, u1 (x)) − g(x, u2 (x))](u1 (x) − u2 (x)) dx Ω |∇(u1 (x) − u2 (x))|2 dx = u1 − u2 2 . ≥ Ω

Hence $T$ is a strongly monotone operator. The operator equation $T(u) = o$ has a unique solution according to Corollary 5.3.9, i.e., (7.6.6) has a unique weak solution.

Reading carefully the proof of Theorem 7.6.2 one can easily see that the assumptions on monotonicity of $g$ could be relaxed. However, we have to pay a price for this modification (see conditions (7.6.10) and (7.6.13) below). Actually, strict monotonicity of $T$ would be enough to prove the same assertion provided we apply Theorem 5.3.4. For instance, the following assumption on $g = g(x, s)$ guarantees the strict monotonicity of $T$:

$$\int_\Omega [g(x, u_1(x)) - g(x, u_2(x))](u_1(x) - u_2(x))\,dx < \int_\Omega |\nabla(u_1(x) - u_2(x))|^2\,dx \tag{7.6.8}$$

for any $u_1, u_2 \in W_0^{1,2}(\Omega)$, $u_1 \neq u_2$. Since

$$\int_\Omega |\nabla(u_1(x) - u_2(x))|^2\,dx \ge \lambda_1 \int_\Omega |u_1(x) - u_2(x)|^2\,dx$$

by the definition of the first eigenvalue $\lambda_1$ (see (7.2.4)), the inequality (7.6.8) follows from

$$\int_\Omega [g(x, u_1(x)) - g(x, u_2(x))](u_1(x) - u_2(x))\,dx < \lambda_1 \int_\Omega |u_1(x) - u_2(x)|^2\,dx. \tag{7.6.9}$$

A sufficient condition for (7.6.9) to hold is the Lipschitz continuity of $g$ with respect to the second variable, i.e.,

$$|g(x, s_1) - g(x, s_2)| < \lambda_1 |s_1 - s_2| \tag{7.6.10}$$

for a.a. $x \in \Omega$ and any $s_1, s_2 \in \mathbb{R}$, $s_1 \neq s_2$. Then (7.6.8) is satisfied and $T$ is a strictly monotone operator (the reader should verify it in detail!). In order to apply Theorem 5.3.4 we also need $T$ to be weakly coercive, i.e.,

$$\lim_{\|u\| \to \infty} \|T(u)\| = \infty. \tag{7.6.11}$$
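As a hedged sketch of the step left to the reader, the passage from the pointwise condition (7.6.10) to the integral condition (7.6.9) can be written out as follows. The only assumption added here is that $u_1 \neq u_2$ in $W_0^{1,2}(\Omega)$ means that the set $\{x \in \Omega : u_1(x) \neq u_2(x)\}$ has positive measure, which is what makes the second inequality strict.

```latex
\begin{align*}
\int_\Omega [g(x,u_1) - g(x,u_2)](u_1 - u_2)\,dx
  &\le \int_{\{u_1 \neq u_2\}} |g(x,u_1) - g(x,u_2)|\,|u_1 - u_2|\,dx \\
  &<   \lambda_1 \int_{\{u_1 \neq u_2\}} |u_1 - u_2|^2\,dx
   =   \lambda_1 \int_\Omega |u_1 - u_2|^2\,dx .
\end{align*}
```

Combined with the eigenvalue inequality stated after (7.6.8), this gives (7.6.8) itself, i.e., $(T(u_1) - T(u_2), u_1 - u_2) > 0$ for $u_1 \neq u_2$.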


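We have

$$\begin{aligned}
\|T(u)\| &= \sup_{\|v\| \le 1} |(T(u), v)| = \sup_{\|v\| \le 1} \left| \int_\Omega \nabla u(x) \nabla v(x)\,dx - \int_\Omega g(x, u(x)) v(x)\,dx \right| \\
&\ge \|u\| - \sup_{\|v\| \le 1} \left| \int_\Omega g(x, u(x)) v(x)\,dx \right| \\
&\ge \|u\| - \frac{1}{\sqrt{\lambda_1}} \left( \int_\Omega |g(x, u(x))|^2\,dx \right)^{\frac12} \qquad {}^{26} \\
&\ge \|u\| - \frac{1}{\sqrt{\lambda_1}} \left[ \|r\|_{L^2(\Omega)} + (\lambda_1 - \varepsilon) \|u\|_{L^2(\Omega)} \right]
\end{aligned} \tag{7.6.12}$$

provided $g$ satisfies the growth condition

$$|g(x, s)| \le r(x) + (\lambda_1 - \varepsilon)|s| \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}. \tag{7.6.13}$$

Since

$$\frac{1}{\sqrt{\lambda_1}} (\lambda_1 - \varepsilon) \|u\|_{L^2(\Omega)} \le \frac{\lambda_1 - \varepsilon}{\lambda_1} \|u\|,$$

we get from (7.6.12) that

$$\|T(u)\| \ge \frac{\varepsilon}{\lambda_1} \|u\| - \frac{1}{\sqrt{\lambda_1}} \|r\|_{L^2(\Omega)},$$

and so (7.6.11) follows. We have just proved the following assertion.

Theorem 7.6.3. Let $g \in CAR(\Omega \times \mathbb{R})$ satisfy (7.6.10) and (7.6.13). Then the boundary value problem (7.6.6) has a unique weak solution.

Exercise 7.6.4. Let $g$ be as in Theorem 7.6.2 and let $\lambda < \lambda_1$ (here $\lambda_1 > 0$ is the principal eigenvalue of the Laplacian, see Example 7.4.4). Prove that

$$\begin{cases} -\Delta u(x) = \lambda u(x) + g(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega \end{cases}$$

has a unique weak solution $u \in W_0^{1,2}(\Omega)$.

Exercise 7.6.5. Let $g$ be as in Theorem 7.6.2 and let $\lambda < 0$. Prove that

$$\begin{cases} -\Delta u(x) = \lambda u(x) + g(x, u(x)) & \text{in } \Omega, \\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega \end{cases}$$

has a unique weak solution $u \in W^{1,2}(\Omega)$.

26 Note that $\frac{1}{\sqrt{\lambda_1}}$ is the best embedding constant.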


Exercise 7.6.6. Consider the problem

$$\begin{cases} -\Delta u(x) = h(x, u(x), \nabla u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.6.14}$$

Formulate conditions on $h = h(x, u, \xi)$ which guarantee that (7.6.14) has a unique weak solution.

Exercise 7.6.7. Replace in (7.6.14) the homogeneous Dirichlet condition by the Neumann one.

7.6A Application of Leray–Lions Theorem

Let $p > 1$, let $\Omega \in C^{0,1}$ be a bounded domain in $\mathbb{R}^N$ and $g : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ a Carathéodory function (see Remark 3.2.25). We shall consider the Dirichlet problem

$$\begin{cases} -\Delta_p u(x) + g(x, u(x), \nabla u(x)) = f(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.6.15}$$

Here $\Delta_p u$ is the $p$-Laplacian (see Appendix 7.5A). Assume that $1 < p < N$ $^{27}$ and that $g = g(x, s, t_1, \dots, t_N)$ satisfies the following growth condition: there exist (possibly small) $\varepsilon > 0$, a constant (possibly large) $c > 0$ and a function $g_0 \in L^{(p^*)'}(\Omega)$ such that

$$|g(x, s, t_1, \dots, t_N)| \le c \left( g_0(x) + |s|^{q(s) - \varepsilon} + \sum_{i=1}^N |t_i|^{q(t) - \varepsilon} \right) \tag{7.6.16}$$

for a.a. $x \in \Omega$ and for all $(s, t_1, \dots, t_N) \in \mathbb{R}^{N+1}$ where

$$q(s) = \frac{p^*}{(p^*)'} = p^* - 1, \qquad q(t) = \frac{p}{(p^*)'}.\,^{28}$$

We consider the Sobolev space $W_0^{1,p}(\Omega)$ with the norm

$$\|u\| = \left( \int_\Omega |\nabla u(x)|^p\,dx \right)^{\frac1p}.$$

Let us recall the continuous embedding

$$W_0^{1,p}(\Omega) \subset L^{p^*}(\Omega) \tag{7.6.17}$$

(see Theorem 1.2.26(i) and Remark 1.2.27), and the compact embedding

$$W_0^{1,p}(\Omega) \subset\subset L^q(\Omega) \quad \text{where} \quad q \in [1, p^*) \tag{7.6.18}$$

(see Theorem 1.2.28(i)). Similarly to Appendix 7.5A we define (nonlinear) operators $J, G : W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ and an element $f^* \in (W_0^{1,p}(\Omega))^*$ by

$$\begin{aligned}
\langle J(u), v \rangle &= \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x)\,dx, \\
\langle G(u), v \rangle &= \int_\Omega g(x, u(x), \nabla u(x)) v(x)\,dx, \\
\langle f^*, v \rangle &= \int_\Omega f(x) v(x)\,dx
\end{aligned}$$

for all $u, v \in W_0^{1,p}(\Omega)$.

It follows from (7.6.16) and Remark 3.2.26 that $G$ is well defined (the reader is invited to justify this statement!). We have the following properties of $J$ and $G$:

(a) $J$ and $G$ are bounded operators;
(b) $J$ and $G$ are continuous operators;$^{29}$
(c) for any $u, v \in W_0^{1,p}(\Omega)$,
$$\langle J(u) - J(v), u - v \rangle \ge \left( \|u\|^{p-1} - \|v\|^{p-1} \right) (\|u\| - \|v\|);$$
(d) if $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$, then $G(u_n) \to G(u)$ in $(W_0^{1,p}(\Omega))^*$.

27 This is a technical assumption. The case $p \ge N$ is easier since stronger embeddings are available (see Theorems 1.2.26 and 1.2.28). However, a slightly different technique must be employed.
28 Recall that $p^*$ is the critical Sobolev exponent (see Theorem 1.2.26), $(p^*)' = \frac{p^*}{p^* - 1} = \frac{pN}{pN - N + p}$, i.e., $q(s) = \frac{pN - N + p}{N - p}$, $q(t) = p - 1 + \frac{p}{N}$ for $1 < p < N$.
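The exponents $q(s)$ and $q(t)$ in the growth condition (7.6.16) are easy to get wrong by hand. The following is a minimal sketch, not part of the original text, that evaluates them in exact rational arithmetic. It assumes the standard formula $p^* = pN/(N-p)$ for the critical Sobolev exponent when $1 < p < N$; the function name `sobolev_exponents` is hypothetical.

```python
from fractions import Fraction


def sobolev_exponents(p, N):
    """Return (p*, (p*)', q(s), q(t)) for 1 < p < N as exact fractions.

    Assumes p* = pN / (N - p); q(s) and q(t) are the exponents
    appearing in the growth condition (7.6.16).
    """
    p = Fraction(p)
    N = Fraction(N)
    if not (1 < p < N):
        raise ValueError("this sketch requires 1 < p < N")
    p_star = p * N / (N - p)               # critical Sobolev exponent
    p_star_conj = p_star / (p_star - 1)    # conjugate exponent (p*)'
    q_s = p_star / p_star_conj             # bound on the power of |s|
    q_t = p / p_star_conj                  # bound on the power of |t_i|
    return p_star, p_star_conj, q_s, q_t


if __name__ == "__main__":
    for p, N in [(2, 3), (Fraction(3, 2), 4)]:
        print(p, N, sobolev_exponents(p, N))
```

The closed forms quoted in the footnote, $q(s) = p^* - 1$ and $q(t) = p - 1 + p/N$, can be compared against the returned values for any admissible pair $(p, N)$.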

Theorem 7.6.8. Let $1 < p < N$ and assume that $g : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ is a Carathéodory function satisfying (7.6.16) and that for all $(s, t_1, \dots, t_N) \in \mathbb{R}^{N+1}$ and almost all $x \in \Omega$ we have

$$s\,g(x, s, t_1, \dots, t_N) \ge 0. \tag{7.6.19}$$

Then (7.6.15) has at least one weak solution.

Proof. Set $T := J + G$. Then the operator equation

$$T(u) = f^* \tag{7.6.20}$$

is equivalent to the validity of the integral identity

$$\int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x)\,dx + \int_\Omega g(x, u(x), \nabla u(x)) v(x)\,dx = \int_\Omega f(x) v(x)\,dx$$

for all $v \in W_0^{1,p}(\Omega)$. This fact shows that the solutions of (7.6.20) correspond one-to-one to the weak solutions of (7.6.15). Next we verify the assumptions (i)–(viii) of Theorem 5.3.23 to prove that there is a solution of (7.6.20).

Assumptions (i) and (ii) follow directly from (a) and (b). The assumption (iii), i.e., the coercivity of $T$, is a direct consequence of (7.6.19):

$$\lim_{\|u\| \to \infty} \frac{1}{\|u\|} \langle T(u), u \rangle = \lim_{\|u\| \to \infty} \frac{1}{\|u\|} \left[ \int_\Omega |\nabla u(x)|^p\,dx + \int_\Omega g(x, u(x), \nabla u(x)) u(x)\,dx \right] \ge \lim_{\|u\| \to \infty} \|u\|^{p-1} = \infty.$$

29 To prove boundedness and continuity of $G$ we have to use the embedding $W_0^{1,p}(\Omega) \subset L^{p^*}(\Omega)$ and the continuity of the Nemytski operator given by $g$ from $L^p(\Omega)$ into the dual space of $L^{p^*}(\Omega)$.


Let us define an operator $\Phi : W_0^{1,p}(\Omega) \times W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ by

$$\langle \Phi(u, w), v \rangle := \langle J(u), v \rangle + \langle G(w), v \rangle \quad \text{for all } u, w, v \in W_0^{1,p}(\Omega).$$

It is straightforward to verify the assumption (iv). In order to verify the assumption (v), let $u, w, h \in W_0^{1,p}(\Omega)$ and $t_n \to 0$. Then

$$\Phi(u + t_n h, w) = J(u + t_n h) + G(w) \to J(u) + G(w) = \Phi(u, w)$$

by continuity of $J$ (see (b)). The validity of the assumption (vi) follows directly from (c). In order to verify the assumption (vii) let us assume that $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ and

$$\lim_{n \to \infty} \langle \Phi(u_n, u_n) - \Phi(u, u_n), u_n - u \rangle = 0,$$

i.e.,

$$\lim_{n \to \infty} \langle J(u_n) - J(u), u_n - u \rangle = 0. \tag{7.6.21}$$

But (7.6.21) together with (c) implies that $\|u_n\| \to \|u\|$. The last fact and the fact that $W_0^{1,p}(\Omega)$ is a uniformly convex Banach space (see footnote 10 on page 65 and check details) together with the weak convergence imply

$$u_n \to u \quad \text{in } W_0^{1,p}(\Omega)$$

(see Proposition 2.1.22(iv)). Now,

$$\Phi(w, u_n) = J(w) + G(u_n) \to J(w) + G(u) = \Phi(w, u) \quad \text{for arbitrary } w \in W_0^{1,p}(\Omega).$$

Finally, to verify the assumption (viii), let $w \in W_0^{1,p}(\Omega)$ and $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$. Then $G(u_n) \to G(u)$ in $(W_0^{1,p}(\Omega))^*$ by (d), and so

$$\langle \Phi(w, u_n), u_n \rangle = \langle J(w) + G(u_n), u_n \rangle \to \langle J(w), u \rangle + \langle G(u), u \rangle = \langle \Phi(w, u), u \rangle.$$

Since also $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ implies that $\Phi(w, u_n) \to J(w) + G(u)$, the last assumption of Theorem 5.3.23 is verified.

The advantage of the Leray–Lions Theorem becomes more transparent when one deals with partial differential equations of higher order. The reader is asked to see, e.g., Zeidler [136] for more advanced but also technically more involved problems.

Exercise 7.6.9. Modify the assumptions on $g$ in such a way that Theorem 5.3.22 could be applied to get at least one weak solution of (7.6.15).

Exercise 7.6.10. Prove the implication (d) from page 511.
Hint. Use Theorem 1.2.28(i) and Remark 3.2.26.

Exercise 7.6.11. Prove the following assertion: Let $p \ge 2$, then for all $x_1, x_2 \in \mathbb{R}^N$,

$$|x_2|^p \ge |x_1|^p + p |x_1|^{p-2} x_1 (x_2 - x_1) + \frac{|x_2 - x_1|^p}{2^{p-1} - 1}.\,^{30} \tag{7.6.22}$$

30 Recall that $xy := (x, y)_{\mathbb{R}^N}$.


Hint (Lindqvist [86]). The strict convexity of $x \mapsto |x|^p$ implies that for any $x_1, x_2 \in \mathbb{R}^N$, $x_1 \neq x_2$, $p > 1$,

$$|x_2|^p > |x_1|^p + p |x_1|^{p-2} x_1 (x_2 - x_1). \tag{7.6.23}$$

Then writing $\frac{x_2 + x_1}{2}$ instead of $x_2$ in (7.6.23) we get

$$\left| \frac{x_1 + x_2}{2} \right|^p \ge |x_1|^p + \frac12 p |x_1|^{p-2} x_1 (x_2 - x_1).$$

Using the Clarkson inequality (see, e.g., Adams [2, Theorem 2.28]) for $p \ge 2$,

$$|x_1|^p + |x_2|^p \ge 2 \left| \frac{x_1 + x_2}{2} \right|^p + 2 \left| \frac{x_1 - x_2}{2} \right|^p, \tag{7.6.24}$$

we arrive at

$$|x_2|^p \ge |x_1|^p + p |x_1|^{p-2} x_1 (x_2 - x_1) + 2 \left| \frac{x_1 - x_2}{2} \right|^p. \tag{7.6.25}$$

This is actually (7.6.22) but with $2^{1-p}$ in place of $\frac{1}{2^{p-1} - 1}$. Repeating this procedure, starting again with (7.6.24) but now using (7.6.25) instead of (7.6.23), we get the constant improved to $2^{1-p} + 4^{1-p}$. By iteration one obtains the constant

$$2^{1-p} + 4^{1-p} + 8^{1-p} + \cdots = \frac{1}{2^{p-1} - 1}$$

in (7.6.22).

Exercise 7.6.12. Prove the following assertion: Let $1 < p < 2$, then for all $x_1, x_2 \in \mathbb{R}^N$,

$$|x_2|^p \ge |x_1|^p + p |x_1|^{p-2} x_1 (x_2 - x_1) + c(p) \frac{|x_1 - x_2|^2}{(|x_1| + |x_2|)^{2-p}}. \tag{7.6.26}$$

Hint (Lindqvist [86]). Fix $x_1, x_2$ and expand the real function $f(t) = |x_1 + t(x_2 - x_1)|^p$ using the Taylor formula

$$f(1) = f(0) + f'(0) + \int_0^1 (1 - t) f''(t)\,dt.$$

Then, provided $f(t) \neq 0$ for all $0 \le t \le 1$,

$$|x_2|^p = |x_1|^p + p |x_1|^{p-2} x_1 (x_2 - x_1) + \int_0^1 (1 - t) f''(t)\,dt. \tag{7.6.27}$$

(In the case when there exists $t$, $0 \le t \le 1$, such that $|x_1 + t(x_2 - x_1)| = 0$ it is easily checked that (7.6.26) holds!) At the same time

$$f''(t) = p(p - 2) |x_1 + t(x_2 - x_1)|^{p-4} \left[ (x_1 + t(x_2 - x_1))(x_2 - x_1) \right]^2 + p |x_1 + t(x_2 - x_1)|^{p-2} |x_2 - x_1|^2,$$


and the Schwarz inequality yields

$$f''(t) \ge p(p - 1) |x_1 + t(x_2 - x_1)|^{p-2} |x_2 - x_1|^2. \tag{7.6.28}$$

Returning to (7.6.27) we estimate

$$\int_0^1 (1 - t) f''(t)\,dt \ge \frac34 \int_0^{\frac14} f''(t)\,dt \tag{7.6.29}$$

and since $|x_1 + t(x_2 - x_1)| \le |x_1| + |x_2|$, we use (7.6.28), (7.6.29) and arrive at (7.6.26) with $c(p) = \frac{3}{16} p(p - 1)$.

Exercise 7.6.13. Prove that the operator $J$ defined on page 511 is strictly monotone$^{31}$ for $1 < p < 2$ and strongly monotone$^{32}$ for $p \ge 2$.

Hint. For $u, v \in W_0^{1,p}(\Omega)$ we have

$$\begin{aligned}
\langle J(u) - J(v), u - v \rangle &= \int_\Omega \left( |\nabla u(x)|^{p-2} \nabla u(x) - |\nabla v(x)|^{p-2} \nabla v(x) \right) (\nabla u(x) - \nabla v(x))\,dx \\
&= \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) (\nabla u(x) - \nabla v(x))\,dx - \int_\Omega |\nabla v(x)|^{p-2} \nabla v(x) (\nabla u(x) - \nabla v(x))\,dx \\
&= I_1 + I_2.
\end{aligned}$$

For $p \ge 2$, it follows from Exercise 7.6.11 that

$$I_1 + I_2 \ge \frac{2}{p (2^{p-1} - 1)} \int_\Omega |\nabla u(x) - \nabla v(x)|^p\,dx = c \|u - v\|^p.$$

For $1 < p < 2$, it follows from Exercise 7.6.12 that

$$I_1 + I_2 \ge \frac{2 c(p)}{p} \int_\Omega \frac{|\nabla u(x) - \nabla v(x)|^2}{(|\nabla u(x)| + |\nabla v(x)|)^{2-p}}\,dx > 0$$

provided $u, v \in W_0^{1,p}(\Omega)$, $u \neq v$.
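The two vector inequalities (7.6.22) and (7.6.26), on which the hint above rests, can be sanity-checked numerically before attempting a proof. The sketch below is an illustration only, not part of the original text: it samples random vectors in $\mathbb{R}^3$, takes $c(p) = \frac{3}{16}p(p-1)$ from the hint to Exercise 7.6.12, and sets the gradient term $p|x_1|^{p-2}x_1$ to zero at $x_1 = 0$ (an assumption made to avoid a division by zero for $p < 2$). All function names are hypothetical. A random test of this kind supports the inequalities but does not prove them.

```python
import math
import random


def norm(x):
    return math.sqrt(sum(c * c for c in x))


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_part(x1, x2, p):
    """|x1|^p + p |x1|^(p-2) x1.(x2 - x1); the second term is taken as 0 at x1 = 0."""
    n1 = norm(x1)
    if n1 == 0.0:
        return 0.0
    diff = [b - a for a, b in zip(x1, x2)]
    return n1 ** p + p * n1 ** (p - 2) * dot(x1, diff)


def slack_p_ge_2(x1, x2, p):
    """Left side minus right side of (7.6.22); expected >= 0 for p >= 2."""
    diff = [b - a for a, b in zip(x1, x2)]
    return norm(x2) ** p - linear_part(x1, x2, p) - norm(diff) ** p / (2 ** (p - 1) - 1)


def slack_p_lt_2(x1, x2, p):
    """Left side minus right side of (7.6.26) with c(p) = 3p(p-1)/16, 1 < p < 2."""
    diff = [b - a for a, b in zip(x1, x2)]
    c_p = 3.0 * p * (p - 1) / 16.0
    denom = (norm(x1) + norm(x2)) ** (2 - p)  # nonzero unless x1 = x2 = 0
    return norm(x2) ** p - linear_part(x1, x2, p) - c_p * norm(diff) ** 2 / denom


def worst_slack(slack, p, dim=3, trials=2000, seed=0):
    """Smallest slack observed over random pairs (x1, x2) in [-1, 1]^dim."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x1 = [rng.uniform(-1, 1) for _ in range(dim)]
        x2 = [rng.uniform(-1, 1) for _ in range(dim)]
        worst = min(worst, slack(x1, x2, p))
    return worst


def series_constant(p, terms=200):
    """Partial sum of 2^(1-p) + 4^(1-p) + 8^(1-p) + ... from the hint to 7.6.11."""
    return sum((2.0 ** k) ** (1 - p) for k in range(1, terms + 1))


if __name__ == "__main__":
    print(worst_slack(slack_p_ge_2, 3.0), worst_slack(slack_p_lt_2, 1.5))
    print(series_constant(3.0), 1.0 / (2 ** 2 - 1))
```

For $p = 2$ the inequality (7.6.22) is an identity, so the observed slack is zero up to rounding; this is why any comparison should allow a small floating-point tolerance.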

Exercise 7.6.14. Prove that the weak solution from Exercise 7.5.23 is unique.

Exercise 7.6.15. Let $\lambda < \lambda_1$ and let $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a Carathéodory function which is decreasing with respect to the second variable, i.e.,

$$f(x, s_1) \ge f(x, s_2) \quad \text{for a.a. } x \in \Omega \text{ and } s_1, s_2 \in \mathbb{R},\ s_1 \le s_2.$$

Assume, moreover, that there exist $f_0 \in L^{p'}(\Omega)$, $p' = \frac{p}{p-1}$, $p > 1$ and $c > 0$ such that

$$|f(x, s)| \le f_0(x) + c |s|^{p-1}.$$

Prove that there is a unique weak solution $u \in W_0^{1,p}(\Omega)$ of the problem

$$\begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases}$$

31 I.e., for any $u, v \in W_0^{1,p}(\Omega)$, $u \neq v$: $\langle J(u) - J(v), u - v \rangle > 0$, cf. Definition 5.3.2.
32 I.e., there exists $c > 0$ such that for any $u, v \in W_0^{1,p}(\Omega)$: $\langle J(u) - J(v), u - v \rangle \ge c \|u - v\|^p$, cf. Definition 5.3.2.


7.7 Weak Solutions, Application of Variational Methods

Let us illustrate an application of Theorem 6.2.8 (the existence of a global minimizer) to the energy functional associated with the boundary value problem

$$\begin{cases} -\Delta u(x) = g(u(x)) + f(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.1}$$

We will assume that $g = g(s)$ is a continuous function and, moreover,

(i) $f \in L^1(\Omega)$ if $N = 1$;
(ii) $f \in L^{\frac{q}{q-1}}(\Omega)$ and there exists $c > 0$ such that $|g(s)| \le c |s|^{q-1}$ where $q \in [1, \infty)$ is arbitrary if $N = 2$, and $q \in \left[1, \frac{2N}{N-2}\right)$ if $N \ge 3$.

The energy functional$^{33}$ associated with (7.7.1) is defined as follows:

$$E(u) := \frac12 \int_\Omega |\nabla u(x)|^2\,dx - \int_\Omega G(u(x))\,dx - \int_\Omega f(x) u(x)\,dx,\,^{34} \quad u \in W_0^{1,2}(\Omega),$$

where

$$G(t) = \int_0^t g(s)\,ds.$$

We will assume that $g$ satisfies the sign condition:

$$s\,g(s) \le 0 \quad \text{for any } s \in \mathbb{R}. \tag{7.7.2}$$

Then $E$ is a weakly coercive functional on $W_0^{1,2}(\Omega)$. Indeed, the condition (7.7.2) immediately implies $G \le 0$, and thus

$$\int_\Omega G(u(x))\,dx \le 0 \quad \text{for any } u \in W_0^{1,2}(\Omega),$$

and so

$$E(u) \ge \frac12 \|u\|^2_{W_0^{1,2}(\Omega)} - \|f\|_X \|u\|_{X'},$$

33 Here $f$ represents the influence of external forces, $g$ a nonlinear damping or restoring force, and $\frac12 |\nabla u|^2$ the kinetic energy, respectively. Hence $E(u)$ corresponds to the total energy of the system (cf. Examples 6.2.6 and 6.2.14); that is where the expression "energy functional" comes from. For more details cf. Hlaváček & Nečas [68].
34 The reader is invited to check that if $u_0 \in W_0^{1,2}(\Omega)$ satisfies $\delta E(u_0; v) = 0$ for any $v \in W_0^{1,2}(\Omega)$, then it is a weak solution of (7.7.1), cf. Remark 5.3.10.


where $X = L^1(\Omega)$, $X' = C(\overline{\Omega})$ if $N = 1$, and $X = L^{\frac{q}{q-1}}(\Omega)$, $X' = L^q(\Omega)$ if $N \ge 2$. Hence there is a constant $c_1 > 0$ such that

$$E(u) \ge \frac12 \|u\|^2_{W_0^{1,2}(\Omega)} - c_1 \|u\|_{W_0^{1,2}(\Omega)}.$$

Hence

$$\lim_{\|u\| \to \infty} E(u) = \infty.$$

Next we prove that $E$ is weakly sequentially lower semicontinuous. Clearly, for any $u_n \rightharpoonup u_0$ in $W_0^{1,2}(\Omega)$ we have

$$\frac12 \int_\Omega |\nabla u_0(x)|^2\,dx \le \liminf_{n \to \infty} \frac12 \int_\Omega |\nabla u_n(x)|^2\,dx, \tag{7.7.3}$$

$$\int_\Omega f(x) u_0(x)\,dx = \lim_{n \to \infty} \int_\Omega f(x) u_n(x)\,dx. \tag{7.7.4}$$

Our assumptions imply that for $N \ge 2$ the function $G(s)$ satisfies the estimate

$$|G(s)| \le \frac{c}{q} |s|^q.$$

So, the Nemytski operator defined by $N_G(u)(x) = G(u(x))$ is continuous from $L^q(\Omega)$ into $L^1(\Omega)$ (see Theorem 3.2.24). Then this fact and the compact embedding $W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ imply that $u_n \to u_0$ in $L^q(\Omega)$, and thus

$$G(u_n(\cdot)) \to G(u_0(\cdot)) \quad \text{in } L^1(\Omega). \tag{7.7.5}$$

Since $u_n \to u_0$ in $C(\overline{\Omega})$ for $N = 1$ we obtain (7.7.5) easily in this case, too. Summarizing (7.7.3)–(7.7.5) we obtain

$$E(u_0) \le \liminf_{n \to \infty} E(u_n).$$
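The minimization argument above can be illustrated in one dimension. The following is a rough sketch under assumptions that are not in the original text: $\Omega = (0, 1)$, $g(s) = -s^3$ (which satisfies the sign condition (7.7.2)), $f \equiv 1$, a uniform finite-difference grid, and plain gradient descent on the discrete energy. It is meant only to show a coercive energy being driven to a critical point, not as a recommended solver; the function name and all parameters are hypothetical.

```python
def solve_by_energy_descent(n=15, iterations=6000):
    """Minimize a finite-difference energy for -u'' = -u^3 + 1 on (0, 1), u(0) = u(1) = 0.

    Returns (u, energy, max_abs_gradient) for the interior grid values u.
    """
    h = 1.0 / (n + 1)
    f = [1.0] * n
    u = [0.0] * n
    # Step size chosen below 2 / (largest Hessian eigenvalue), which is about 4 / h here.
    step = h / 5.0

    def gradient(v):
        grad = []
        for i in range(n):
            left = v[i - 1] if i > 0 else 0.0
            right = v[i + 1] if i < n - 1 else 0.0
            # Derivative of the discrete Dirichlet energy plus h * (-g(v_i) - f_i).
            grad.append((2.0 * v[i] - left - right) / h + h * (v[i] ** 3 - f[i]))
        return grad

    def energy(v):
        padded = [0.0] + v + [0.0]
        kinetic = sum((padded[i + 1] - padded[i]) ** 2 for i in range(n + 1)) / (2.0 * h)
        # -G(v) = v^4 / 4 for g(s) = -s^3, and the load term -f * v.
        potential = h * sum(vi ** 4 / 4.0 - fi * vi for vi, fi in zip(v, f))
        return kinetic + potential

    for _ in range(iterations):
        g = gradient(u)
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u, energy(u), max(abs(gi) for gi in gradient(u))


if __name__ == "__main__":
    u, e, residual = solve_by_energy_descent()
    print(e, residual, max(u))
```

Because $E(o) = 0$ and the gradient at $u = o$ does not vanish when $f \not\equiv 0$, the computed minimal energy is negative; the vanishing discrete gradient is the finite-dimensional counterpart of the weak formulation of (7.7.1).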

We have proved the following assertion.

Theorem 7.7.1. Let $g$ be a continuous function satisfying (7.7.2) and (ii) (for $N \ge 2$), let $f$ satisfy (i) (for $N = 1$) or (ii) (for $N \ge 2$). Then the boundary value problem (7.7.1) has at least one weak solution $u_0 \in W_0^{1,2}(\Omega)$.

The growth assumptions on $g$ stated above for $N \ge 3$ are not optimal and can be relaxed if we assume the monotonicity of $g$ and apply Theorem 6.2.12. Indeed, let $G(s) = \int_0^s g(\tau)\,d\tau$ be a concave function.$^{35}$ Then the energy functional $E$

35 In particular, if $g$ is decreasing, then $G$ is concave.


is a strictly convex functional. This functional is continuous even if the growth assumption (ii) for $N \ge 3$ is relaxed to

$$q \in \left[1, \frac{2N}{N-2}\right],$$

i.e., the value $q = \frac{2N}{N-2}$ is admissible, too.$^{36}$ Hence we have the following assertion.

Theorem 7.7.2. Let $f$ and $g$ be as in Theorem 7.7.1. If $g$ is decreasing, then (7.7.1) has a unique weak solution. The assertion remains true even if $g$ satisfies (ii) with $q = \frac{2N}{N-2}$ for $N \ge 3$.

Let us now consider the Dirichlet problem

$$\begin{cases} -\Delta u(x) + \lambda u(x) = |u(x)|^{p-2} u(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.6}$$

and look for nonnegative solutions $u \ge 0$, $u \neq 0$ a.e. in $\Omega$ (cf. Example 6.4.7 for $N = 1$). For future purposes we denote by $2^*$ an arbitrary value greater than or equal to 1 for $N = 2$ and

$$2^* = \frac{2N}{N-2} \quad \text{for } N \ge 3.$$

The real numbers $\lambda$ and $p$ in (7.7.6) are parameters. We will apply the Mountain Pass Theorem to prove the following assertion about nonnegative solutions of (7.7.6).

Theorem 7.7.3 (Willem [134]). Let $N \ge 2$, $2 < p < 2^*$. Then (7.7.6) has at least one nonnegative nontrivial weak solution if and only if $\lambda > -\lambda_1$ where $\lambda_1 > 0$ is the first eigenvalue of $-\Delta$ subject to the homogeneous Dirichlet boundary conditions on $\partial\Omega$ (see Example 7.4.4).

Proof. The necessary part is simple (cf. Example 6.4.7). Indeed, let $u \in W_0^{1,2}(\Omega)$ be a weak solution of (7.7.6), $u \ge 0$, $u \neq 0$ a.e. in $\Omega$. Then taking $v = \varphi_1$ (see Example 7.4.4) as a test function in

$$\int_\Omega \nabla u(x) \nabla v(x)\,dx + \lambda \int_\Omega u(x) v(x)\,dx = \int_\Omega |u(x)|^{p-2} u(x) v(x)\,dx$$

we obtain

$$(\lambda + \lambda_1) \int_\Omega u(x) \varphi_1(x)\,dx = \int_\Omega |u(x)|^{p-2} u(x) \varphi_1(x)\,dx \tag{7.7.7}$$

due to

$$\int_\Omega \nabla u(x) \nabla \varphi_1(x)\,dx - \lambda_1 \int_\Omega u(x) \varphi_1(x)\,dx = 0.$$

36 Use Theorem 3.2.24 and explain why!


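Since $\varphi_1 > 0$ in $\Omega$, we get from (7.7.7) that

$$\lambda + \lambda_1 > 0, \quad \text{i.e.,} \quad \lambda > -\lambda_1.$$

The sufficiency part is more involved and we will apply the Mountain Pass Theorem (Theorem 6.4.5) to prove it. So, we assume $\lambda > -\lambda_1$. Let us start with the observation that the expression

$$|||u||| = \left( \int_\Omega |\nabla u(x)|^2\,dx + \lambda \int_\Omega |u(x)|^2\,dx \right)^{\frac12} \quad \text{for } u \in W_0^{1,2}(\Omega)$$

satisfies $c_1 \|u\| \le |||u||| \le c_2 \|u\|$ with constants $c_i > 0$, $i = 1, 2$, independent of $u \in W_0^{1,2}(\Omega)$ where

$$\|u\| = \left( \int_\Omega |\nabla u(x)|^2\,dx \right)^{\frac12}.$$

Indeed, (7.2.4) yields

$$\int_\Omega |\nabla u(x)|^2\,dx + \lambda \int_\Omega |u(x)|^2\,dx \ge d \int_\Omega |\nabla u(x)|^2\,dx \tag{7.7.8}$$

for any $u \in W_0^{1,2}(\Omega)$ where $d = 1 + \min\left\{0, \frac{\lambda}{\lambda_1}\right\}$.$^{37}$ Then

$$d^{\frac12} \|u\| \le |||u||| \le \left( 1 + \frac{|\lambda|}{\lambda_1} \right)^{\frac12} \|u\| \quad \text{for any } u \in W_0^{1,2}(\Omega) \tag{7.7.9}$$

by (7.2.4) and (7.7.8).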
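The two-sided estimate (7.7.9) can be checked on a concrete model. The sketch below is an illustration under assumptions not made in the original text: $\Omega = (0, \pi)$ in one dimension, where the Dirichlet eigenfunctions are $\sin(jx)$ with eigenvalues $\lambda_j = j^2$, so that $\lambda_1 = 1$, and $u$ is a finite sine series with coefficients $a_j$. The function names are hypothetical.

```python
import math
import random

LAMBDA_1 = 1.0  # first Dirichlet eigenvalue of -u'' on (0, pi)


def grad_norm_sq(a):
    """||u||^2 = int |u'|^2 dx for u(x) = sum_j a[j-1] * sin(j x) on (0, pi)."""
    return 0.5 * math.pi * sum((j * j) * c * c for j, c in enumerate(a, start=1))


def l2_norm_sq(a):
    """int |u|^2 dx for the same sine series."""
    return 0.5 * math.pi * sum(c * c for c in a)


def triple_norm_sq(a, lam):
    """|||u|||^2 = int |u'|^2 dx + lam * int |u|^2 dx."""
    return grad_norm_sq(a) + lam * l2_norm_sq(a)


def bounds_hold(a, lam, tol=1e-12):
    """Check d ||u||^2 <= |||u|||^2 <= (1 + |lam| / lambda_1) ||u||^2, cf. (7.7.9)."""
    d = 1.0 + min(0.0, lam / LAMBDA_1)
    upper = 1.0 + abs(lam) / LAMBDA_1
    g = grad_norm_sq(a)
    t = triple_norm_sq(a, lam)
    return d * g - tol <= t <= upper * g + tol


def random_check(trials=500, modes=8, seed=1):
    """Test the bounds for random coefficients and random lam > -lambda_1."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-1, 1) for _ in range(modes)]
        lam = rng.uniform(-0.99, 5.0)
        if not bounds_hold(a, lam):
            return False
    return True


if __name__ == "__main__":
    print(random_check())
    # At lam = -lambda_1 the expression vanishes on the first eigenfunction.
    print(triple_norm_sq([1.0], -LAMBDA_1))
```

The last line shows why the restriction $\lambda > -\lambda_1$ matters: for $\lambda = -\lambda_1$ the quantity $|||\varphi_1|||^2$ is zero, so $|||\cdot|||$ is no longer an equivalent norm.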

Let us define $F(u) = \frac{|u^+|^p}{p}$. Then the functional

$$E(u) = \int_\Omega \left( \frac{|\nabla u(x)|^2}{2} + \lambda \frac{|u(x)|^2}{2} - F(u(x)) \right) dx, \quad u \in W_0^{1,2}(\Omega),$$

is the energy functional associated with the problem

$$\begin{cases} -\Delta u(x) + \lambda u(x) = |u^+(x)|^{p-1} & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.10}$$

To prove the existence of a weak solution of (7.7.10) (which possibly changes sign in $\Omega$) we apply the Mountain Pass Theorem (see Theorem 6.4.5). For this purpose we verify that $E$

(i) has the mountain pass type geometry (see Proposition 6.4.3),
(ii) satisfies the (PS)$_c$ condition (see Definition 6.4.4).

We have $E(o) = 0$

37 The reader should prove (7.7.8) in detail!


and


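$$E(u) \ge \frac12 |||u|||^2 - \frac1p \int_\Omega |u(x)|^p\,dx \ge \frac12 c_1^2 \|u\|^2 - \frac{c_{\mathrm{emb}}^p}{p} \|u\|^p = \|u\|^2 \left( \frac{c_1^2}{2} - \frac{c_{\mathrm{emb}}^p}{p} \|u\|^{p-2} \right).\,^{38}$$

Hence there exists $r > 0$ small enough and such that

$$b := \inf_{\|u\| = r} E(u) > 0 = E(o).$$

On the other hand, taking $w_0 > 0$ in $\Omega$ fixed, $w_0 \in W_0^{1,2}(\Omega)$, then

$$E(t w_0) \le \frac{c_2^2 t^2}{2} \|w_0\|^2 - \frac{t^p}{p} \int_\Omega |w_0(x)|^p\,dx \quad \text{for } t \ge 0.$$

So, there exists $t > 0$ (large enough) such that for $e = t w_0 \in W_0^{1,2}(\Omega)$ we have both

$$\|e\| > r \quad \text{and} \quad E(e) < 0.$$

(Remember that $p > 2$.)

In order to verify the (PS)$_c$ condition we proceed as follows. Let $\{u_n\}_{n=1}^\infty \subset W_0^{1,2}(\Omega)$ be a sequence satisfying

$$E(u_n) \to c, \quad \nabla E(u_n) \to o \quad \text{with a } c \in \mathbb{R}.$$

(For $\nabla E$ see Exercise 7.7.6.) For $n$ large enough, we also have

$$\|\nabla E(u_n)\| \le 1. \tag{7.7.11}$$

Since

$$(\nabla E(u), v) = \int_\Omega \nabla u(x) \nabla v(x)\,dx + \lambda \int_\Omega u(x) v(x)\,dx - \int_\Omega f(u(x)) v(x)\,dx$$

where $f(u) = (u^+)^{p-1}$, we have

$$(\nabla E(u), u) = |||u|||^2 - p \int_\Omega F(u(x))\,dx.$$

Since also

$$E(u) = \frac12 |||u|||^2 - \int_\Omega F(u(x))\,dx,$$

we get due to (7.7.11),

$$p E(u) = \left( \frac{p}{2} - 1 \right) |||u|||^2 + (\nabla E(u), u) \ge \left( \frac{p}{2} - 1 \right) c_1^2 \|u\|^2 - \|u\|.$$

Put $u := u_n$ to see that $\{u_n\}_{n=1}^\infty$ is a bounded sequence.

38 Note that by the Sobolev Embedding Theorem and (7.7.9) there exists a constant $c_{\mathrm{emb}} > 0$ such that $\|u\|_{L^p(\Omega)} \le c_{\mathrm{emb}} \|u\|_{W_0^{1,2}(\Omega)}$ for any $u \in W_0^{1,2}(\Omega)$.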


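Now, passing to a subsequence if necessary, we can assume that $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ for a $u \in W_0^{1,2}(\Omega)$. By the Rellich–Kondrachov Theorem we have $u_n \to u$ in $L^p(\Omega)$. It follows from the continuity of the Nemytski operator (see Theorem 3.2.24) that

$$f(u_n) \to f(u) \quad \text{in } L^{p'}(\Omega), \quad p' = \frac{p}{p-1}.$$

Observe that

$$|||u_n - u|||^2 = (\nabla E(u_n) - \nabla E(u), u_n - u) + \int_\Omega (f(u_n(x)) - f(u(x)))(u_n(x) - u(x))\,dx. \tag{7.7.12}$$

By the assumption $\nabla E(u_n) \to o$ and by $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ we have

$$(\nabla E(u_n) - \nabla E(u), u_n - u) \to 0 \quad \text{as } n \to \infty,$$

and by the Hölder inequality we conclude that

$$\left| \int_\Omega (f(u_n(x)) - f(u(x)))(u_n(x) - u(x))\,dx \right| \le \left( \int_\Omega |f(u_n(x)) - f(u(x))|^{p'}\,dx \right)^{\frac{1}{p'}} \left( \int_\Omega |u_n(x) - u(x)|^p\,dx \right)^{\frac1p} \to 0$$

as $n \to \infty$. Hence (7.7.12) implies that

$$|||u_n - u||| \to 0 \quad \text{as } n \to \infty, \quad \text{i.e.,} \quad u_n \to u \quad \text{in } W_0^{1,2}(\Omega).$$

Consequently, $E$ satisfies the (PS)$_c$ condition. It follows from Theorem 6.4.5 and Remark 6.4.6 that $E$ has a critical point $u_0 \in W_0^{1,2}(\Omega)$, $u_0 \neq o$. Since $u_0$ is also a weak solution of (7.7.10) we have

$$\int_\Omega \nabla u_0(x) \nabla v(x)\,dx + \lambda \int_\Omega u_0(x) v(x)\,dx = \int_\Omega |u_0^+(x)|^{p-1} v(x)\,dx \tag{7.7.13}$$

for any $v \in W_0^{1,2}(\Omega)$. Taking $v = u_0^-$ in (7.7.13) we arrive at

$$|||u_0^-|||^2 = \int_\Omega |\nabla u_0^-(x)|^2\,dx + \lambda \int_\Omega |u_0^-(x)|^2\,dx = 0,\,^{39}$$

39 The implication $u \in W_0^{1,2}(\Omega) \Rightarrow u^- \in W_0^{1,2}(\Omega)$ is nontrivial if $\Omega \subset \mathbb{R}^N$, $N \ge 2$, and it is not true in general if we replace $W_0^{1,2}(\Omega)$ by $W_0^{k,2}(\Omega)$ with $k \ge 2$! For the case $N = 1$ see Exercise 1.2.47. It follows from Gilbarg & Trudinger [59, Section 7.4] (or Leinfelder & Simander [84, Appendix], Ziemer [137, Corollary 2.1.8 and Theorem 2.1.11]) that
$$\nabla u^+ = \begin{cases} \nabla u & \text{if } u > 0, \\ 0 & \text{if } u \le 0, \end{cases} \qquad \nabla u^- = \begin{cases} 0 & \text{if } u \ge 0, \\ \nabla u & \text{if } u < 0, \end{cases} \qquad \text{and} \qquad \nabla |u| = \begin{cases} \nabla u & \text{if } u > 0, \\ 0 & \text{if } u = 0, \\ -\nabla u & \text{if } u < 0, \end{cases}$$
for $u \in W_0^{1,2}(\Omega)$.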


hence $u_0^- = 0$ a.e. in $\Omega$. This proves that $u_0$ is a nonnegative weak solution of (7.7.10). Since $u_0 \ge 0$ in $\Omega$, $u_0 \not\equiv 0$ in $\Omega$, we have $u_0^+ = u_0$ in $\Omega$, and so $u_0$ is a nonnegative nontrivial weak solution of (7.7.6).

Remark 7.7.4. The nonlinearity $u \mapsto |u|^{p-2} u$ in (7.7.6) has the so-called subcritical growth due to the inequality $p < 2^*$. The reader should notice that some existence results for (7.7.6) are known also in the case of the critical growth $N \ge 3$, $p = 2^*$ (see, e.g., Willem [134]). The proofs are based on the Concentration Compactness Principle which is attributed to Lions (see Lions [87], Lions [88], Lions [89]). These techniques go beyond the limits of this book and the reader can consult, e.g., the book of Flucher [51] to get more information in this direction.

Now we show an application of the Saddle Point Theorem. Let us consider the Dirichlet boundary value problem

$$\begin{cases} -\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.14}$$

If $h$ is bounded and $\lambda$ is not an eigenvalue of (7.4.13), the existence of a solution of (7.7.14) follows from Theorem 7.5.2. We prove the following assertion.

Theorem 7.7.5. Let $\lambda$ be an eigenvalue of (7.4.13) and $h$, $\frac{\partial h}{\partial s}$ be bounded and continuous. If, moreover,

$$H(x, s) = \int_0^s h(x, \tau)\,d\tau \to \infty \quad \text{as } |s| \to \infty \text{ uniformly for } x \in \Omega, \tag{7.7.15}$$

then (7.7.14) possesses a weak solution.$^{40}$

Proof. Let $\lambda = \lambda_k < \lambda_{k+1}$ for a $k \in \mathbb{N}$ where $\lambda_k$, $k \in \mathbb{N}$, are the eigenvalues of (7.4.13). The energy functional associated with (7.7.14),

$$E(u) = \frac12 \int_\Omega |\nabla u(x)|^2\,dx - \frac{\lambda_k}{2} \int_\Omega |u(x)|^2\,dx - \int_\Omega H(x, u(x))\,dx, \tag{7.7.16}$$

$u \in W_0^{1,2}(\Omega)$, has the property $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$ due to the assumptions on $h$, $\frac{\partial h}{\partial s}$ (Exercise 7.7.8). Let $\{\varphi_j\}_{j=1}^\infty$ be an orthonormal basis of $W_0^{1,2}(\Omega)$ consisting of the eigenfunctions associated with the eigenvalues $\{\lambda_j\}_{j=1}^\infty$, $0 < \lambda_1 \le \lambda_2 \le \cdots$ (see Example 7.5.1). In particular,

$$\lambda_j \int_\Omega |\varphi_j(x)|^2\,dx = \int_\Omega |\nabla \varphi_j(x)|^2\,dx = 1 \quad \text{holds for all } j \in \mathbb{N}. \tag{7.7.17}$$

40 The assertion can be proved under weaker assumptions on $h$, cf. Rabinowitz [105] and Remark 6.5.4.


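Let

$$Y := \operatorname{Lin}\{\varphi_1, \varphi_2, \dots, \varphi_k\}, \qquad Z := \left\{ u \in W_0^{1,2}(\Omega) : \int_\Omega u(x) v(x)\,dx = 0,\ v \in Y \right\},$$

i.e.,

$$W_0^{1,2}(\Omega) = Y \oplus Z, \qquad \dim Y < \infty,$$

and $\{\varphi_j\}_{j \ge k+1}$ forms an orthonormal basis of $Z$.

Step 1. We prove that $E$ has a geometry of the Saddle Point Theorem. If $u \in Z$, then $u = \sum_{j=k+1}^\infty a_j \varphi_j$ and (see (7.7.17))

$$\int_\Omega \left( |\nabla u(x)|^2 - \lambda_k |u(x)|^2 \right) dx = \sum_{j=k+1}^\infty a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|u\|^2. \tag{7.7.18}$$

Let $M := \sup_{x \in \Omega,\ s \in \mathbb{R}} |h(x, s)|$. Then

$$\left| \int_\Omega H(x, u(x))\,dx \right| \le M \int_\Omega |u(x)|\,dx \le M_1 \|u\| \tag{7.7.19}$$

for all $u \in W_0^{1,2}(\Omega)$, by the Hölder and Poincaré inequalities. Combining (7.7.18) and (7.7.19) shows that $E$ is bounded below on $Z$, i.e.,

$$\inf_{u \in Z} E(u) > -\infty. \tag{7.7.20}$$

Next, if $u \in Y$, then $u = u^0 + \hat{u}$ where

$$u^0 \in Y^0 := \operatorname{Lin}\{\varphi_j : \lambda_j = \lambda_k\}\,^{41} \quad \text{and} \quad \hat{u} \in \hat{Y} := \operatorname{Lin}\{\varphi_j : \lambda_j < \lambda_k\}.$$

Then for $u \in Y$, $u = \sum_{j=1}^k a_j \varphi_j$,

$$E(u) = \frac12 \sum_{j : \lambda_j < \lambda_k} a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) - \int_\Omega H(x, u^0(x))\,dx - \int_\Omega \left[ H(x, u^0(x) + \hat{u}(x)) - H(x, u^0(x)) \right] dx. \tag{7.7.21}$$

There is a constant $M_2 > 0$ such that

$$\frac12 \sum_{j : \lambda_j < \lambda_k} a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) \le -M_2 \|\hat{u}\|^2, \tag{7.7.22}$$

41 Note that the multiplicity of $\lambda_k$ need not be equal to 1 in general but it is finite.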


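and

$$\begin{aligned}
\left| \int_\Omega \left[ H(x, u^0(x) + \hat{u}(x)) - H(x, u^0(x)) \right] dx \right|
&= \left| \int_\Omega \left( \int_0^{u^0(x) + \hat{u}(x)} h(x, s)\,ds - \int_0^{u^0(x)} h(x, s)\,ds \right) dx \right| \\
&= \left| \int_\Omega \int_{u^0(x)}^{u^0(x) + \hat{u}(x)} h(x, s)\,ds\,dx \right| \le M \int_\Omega |\hat{u}(x)|\,dx \le M_1 \|\hat{u}\|.
\end{aligned} \tag{7.7.23}$$

It follows from (7.7.21)–(7.7.23) that

$$E(u) \le -M_2 \|\hat{u}\|^2 - \int_\Omega H(x, u^0(x))\,dx + M_1 \|\hat{u}\|.$$

This implies that

$$E(u) \to -\infty \quad \text{as } \|u\| \to \infty,\ u \in Y. \tag{7.7.24}$$

Indeed, either $\|\hat{u}\| \to \infty$ or $\|u^0\| \to \infty$, and then (7.7.24) follows from the assumption (7.7.15). It follows from (7.7.20) and (7.7.24) that $E$ verifies the hypotheses of Proposition 6.5.2, i.e., $E$ has the geometry corresponding to the Saddle Point Theorem.

Step 2. Now, we prove that $E$ satisfies the (PS)$_c$ condition. Let us assume that $|E(u_m)| \le K$ for some $K > 0$ and $\nabla E(u_m) \to o$.$^{42}$ Let us write

$$u_m = u_m^0 + \hat{u}_m + \tilde{u}_m \quad \text{where } u_m^0 \in Y^0,\ \hat{u}_m \in \hat{Y},\ \tilde{u}_m \in Z.$$

For large $m$, we have

$$\|\tilde{u}_m\| \ge |(\nabla E(u_m), \tilde{u}_m)| = \left| \int_\Omega \left[ \nabla u_m(x) \nabla \tilde{u}_m(x) - \lambda_k u_m(x) \tilde{u}_m(x) - h(x, u_m(x)) \tilde{u}_m(x) \right] dx \right|, \tag{7.7.25}$$

and the same for $\hat{u}_m$. On the other hand, since $Z = Y^\perp$, by (7.7.25), (7.7.18) and the boundedness of $h$ we obtain

$$\left| \int_\Omega \left[ \nabla u_m(x) \nabla \tilde{u}_m(x) - \lambda_k u_m(x) \tilde{u}_m(x) - h(x, u_m(x)) \tilde{u}_m(x) \right] dx \right| \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|\tilde{u}_m\|^2 - M_1 \|\tilde{u}_m\|. \tag{7.7.26}$$

42 $E(u_m) \to c$, $c \in \mathbb{R}$, implies that there exists $K > 0$ such that $|E(u_m)| \le K$.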


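From (7.7.25) and (7.7.26) we obtain

$$\|\tilde{u}_m\| \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|\tilde{u}_m\|^2 - M_1 \|\tilde{u}_m\|,$$

which shows that $\{\tilde{u}_m\}_{m=1}^\infty$ is bounded. Similarly, we prove that $\{\hat{u}_m\}_{m=1}^\infty$ is bounded, too. Finally, we claim that $\{u_m^0\}_{m=1}^\infty$ is bounded. To verify the claim, observe that

$$\begin{aligned}
K \ge |E(u_m)| = \bigg| &\frac12 \int_\Omega \left[ |\nabla \tilde{u}_m(x)|^2 + |\nabla \hat{u}_m(x)|^2 - \lambda_k \left( |\tilde{u}_m(x)|^2 + |\hat{u}_m(x)|^2 \right) \right] dx \\
&- \int_\Omega \left[ H(x, u_m(x)) - H(x, u_m^0(x)) \right] dx - \int_\Omega H(x, u_m^0(x))\,dx \bigg|.
\end{aligned}$$

By what has already been shown, the first integral on the right-hand side is bounded independently of $m$.$^{43}$ Therefore $\int_\Omega H(x, u_m^0(x))\,dx$ is bounded. In order to show that $\{u_m^0\}_{m=1}^\infty$ is bounded it is sufficient to prove that

$$\int_\Omega H(x, v(x))\,dx \to \infty \quad \text{as } \|v\| \to \infty \text{ for } v \in Y^0. \tag{7.7.27}$$

By (7.7.15) for any $l > 0$, there is $d_l$ such that

$$H(x, s) \ge l \quad \text{if } |s| \ge d_l \quad \text{for all } x \in \Omega.$$

Let $v \in Y^0$, $v \neq o$, and write

$$v = t\varphi \quad \text{where} \quad \varphi \in \partial B(o; 1) := \{w \in Y^0 : \|w\| = 1\}.$$

Then

$$\int_\Omega H(x, t\varphi(x))\,dx \ge \int_{\Omega_{tl}(\varphi)} H(x, t\varphi(x))\,dx - M_0 \tag{7.7.28}$$

where

$$\Omega_{tl}(\varphi) = \{x \in \Omega : |t\varphi(x)| \ge d_l\} \quad \text{and} \quad M_0 \ge (\operatorname{meas} \Omega) \left| \inf_{x \in \Omega,\ s \in \mathbb{R}} H(x, s) \right|.$$

For any $\psi \in \partial B(o; 1)$ we find an open neighborhood $U(\psi) \subset \partial B(o; 1)$ of $\psi$, $x = x(\psi) \in \Omega$ and $r = r(\psi) > 0$ with the following property: for an arbitrary $l \in \mathbb{N}$ there exists $t_l(\psi)$ such that

$$B(x(\psi); r(\psi)) := \{x \in \Omega : |x - x(\psi)| < r(\psi)\} \subset \Omega_{tl}(\varphi)\,^{44}$$

for any $t \ge t_l(\psi)$ and $\varphi \in U(\psi)$.

43 The reader should justify it using an estimate similar to (7.7.23).
44 Here we use the fact that the eigenfunctions of (7.7.13) are continuous in $\Omega$ (cf. Example 7.5.1).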


Then (7.7.28) implies that for any $l \in \mathbb{N}$ we have

$$\int_\Omega H(x, t\varphi(x))\,dx \ge l \operatorname{meas} B(x(\psi); r(\psi)) - M_0 \tag{7.7.29}$$

for any $t \ge t_l(\psi)$, $\varphi \in U(\psi)$. The system $\{U(\psi) : \psi \in \partial B(o; 1)\}$ is an open covering of $\partial B(o; 1)$. The compactness of $\partial B(o; 1)$ implies that there exists a finite subcovering $\{U(\psi_i) : \psi_i \in \partial B(o; 1)\}$, $i = 1, \dots, n$, of $\partial B(o; 1)$. Let

$$c := \min_{i = 1, \dots, n} \{\operatorname{meas} B(x(\psi_i); r(\psi_i))\}, \qquad t_l := \max_{i = 1, \dots, n} \{t_l(\psi_i)\} \quad \text{for any } l \in \mathbb{N}.$$

Then from (7.7.29) we obtain that

$$\int_\Omega H(x, t\varphi(x))\,dx \ge cl - M_0 \quad \text{for any } \varphi \in \partial B(o; 1) \text{ and } t \ge t_l,$$

i.e., (7.7.27) holds for $\|v\| \to \infty$, $v \in Y^0$.

So, we have proved that $\{u_m\}_{m=1}^\infty \subset W_0^{1,2}(\Omega)$ is a bounded sequence. Passing to a subsequence if necessary, we may assume that $u_m \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ and $u_m \to u$ in $L^2(\Omega)$. Then we have

$$(\nabla E(u_m) - \nabla E(u), u_m - u) \to 0. \tag{7.7.30}$$

But we also have

$$\int_\Omega |u_m(x) - u(x)|^2\,dx \to 0, \qquad \int_\Omega [h(x, u_m(x)) - h(x, u(x))](u_m(x) - u(x))\,dx \to 0.$$

These facts together with (7.7.30) imply

$$\int_\Omega |\nabla u_m(x) - \nabla u(x)|^2\,dx \to 0, \quad \text{i.e.,} \quad u_m \to u \quad \text{in } W_0^{1,2}(\Omega).$$

This proves that $E$ verifies the (PS)$_c$ condition, and the proof of Theorem 7.7.5 is complete.

Exercise 7.7.6. Let

$$E(u) = \frac12 \int_\Omega |\nabla u(x)|^2\,dx + \frac{\lambda}{2} \int_\Omega |u(x)|^2\,dx - \frac1p \int_\Omega |u^+(x)|^p\,dx, \quad \lambda \in \mathbb{R},\ p > 2.$$

Prove that $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$ and

$$(\nabla E(u), h) = \int_\Omega \nabla u(x) \nabla h(x)\,dx + \lambda \int_\Omega u(x) h(x)\,dx - \int_\Omega |u^+(x)|^{p-1} h(x)\,dx.$$

Exercise 7.7.7. Compare the assertion and the proof of Theorem 7.7.3 with Example 6.4.7. Point out the differences between the one-dimensional and higher dimensional cases.


Exercise 7.7.8. Prove that the functional $E(u)$ from (7.7.16) has the property $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$.

Hint. First prove that the second Gâteaux derivative is given by

$$D^2 E(u)(w, z) = \int_\Omega \nabla w(x) \nabla z(x)\,dx - \lambda_k \int_\Omega w(x) z(x)\,dx - \int_\Omega \frac{\partial h}{\partial s}(x, u(x)) w(x) z(x)\,dx.$$

Then show that $D^2 E(u)$ is continuous in $u$ (Remark 3.2.29). To prove continuity of the third term in $D^2 E(u)$ use the boundedness of $\frac{\partial h}{\partial s}$, the embedding $W_0^{1,2}(\Omega) \subset L^r(\Omega)$, $r > 2$ (Remark 1.2.24) and the continuity of the Nemytski operator from $L^r(\Omega)$ into $L^{\frac{r}{r-2}}(\Omega)$ (Theorem 3.2.24(ii)).

Exercise 7.7.9. Let $X \subset W_0^{1,2}(\Omega)$, $\dim X < \infty$. Why are the norms

$$\|u\| = \left( \int_\Omega |\nabla u(x)|^2\,dx \right)^{\frac12}, \qquad \|u\|_{L^{2^*}(\Omega)} = \left( \int_\Omega |u(x)|^{2^*}\,dx \right)^{\frac{1}{2^*}}, \qquad \|u\|_{L^p(\Omega)} = \left( \int_\Omega |u(x)|^p\,dx \right)^{\frac1p},\ 1 < p < 2^*,$$

equivalent on $X$? Why are these norms not equivalent on the whole space $X = W_0^{1,2}(\Omega)$?

Hint. Cf. Corollary 1.2.11.

Exercise 7.7.10. Replace (7.7.15) by the assumption

$$H(x, s) \to -\infty \quad \text{as } |s| \to \infty$$

and prove the assertion of Theorem 7.7.5.

Exercise 7.7.11. Consider the boundary value problem

$$\begin{cases} -\Delta u(x) + \lambda u(x) = g(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.31}$$

Formulate conditions on $\lambda$ and $g = g(x, s)$ which guarantee that the energy functional associated with (7.7.31)

(i) is coercive,
(ii) is weakly coercive,
(iii) has a geometry corresponding to the Mountain Pass Theorem,
(iv) has a geometry corresponding to the Saddle Point Theorem.


7.7A Application of the Saddle Point Theorem

In this appendix we will give another application of Theorem 6.5.12. Consider the existence of weak solutions of the boundary value problem

$$\begin{cases} -\Delta_p u(x) = \lambda_1 |u(x)|^{p-2} u(x) + f(x, u(x)) - h(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.32}$$

where $p > 1$, $\Omega \in C^{0,1}$ is a bounded domain in $\mathbb{R}^N$, $f : \Omega \times \mathbb{R} \to \mathbb{R}$ is a bounded Carathéodory function and $h \in L^{p'}(\Omega)$, $p' = \frac{p}{p-1}$. As in Appendix 7.5A, let $\lambda_1 > 0$ be the principal eigenvalue of $-\Delta_p$ on $\Omega$ with zero Dirichlet boundary conditions, and let us denote by $\varphi_1$ the positive (in $\Omega$) eigenfunction associated with $\lambda_1$ normalized by

$$\|\varphi_1\| = \left( \int_\Omega |\nabla \varphi_1(x)|^p\,dx \right)^{\frac1p} = 1.$$

We will suppose that $f$ satisfies the following condition: for a.a. $x \in \Omega$ there exist limits

$$\lim_{s \to -\infty} f(x, s) = f_{-\infty}(x), \qquad \lim_{s \to +\infty} f(x, s) = f_{+\infty}(x).$$

It is well known that under this condition the problem (7.7.32) need not have solutions (cf. Exercise 7.7.13). The following result extends the classical result of Landesman & Lazer [83].

Theorem 7.7.12. Suppose that either

$$\int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx \tag{7.7.33}$$

or else

$$\int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx. \tag{7.7.34}$$

Then there exists at least one weak solution $u \in W_0^{1,p}(\Omega)$ of the problem (7.7.32).

Proof. We follow the proof from Arcoya & Orsina [9]. Let us introduce the energy functional $E : W_0^{1,p}(\Omega) \to \mathbb{R}$ associated with (7.7.32):

$$E(u) := \frac1p \int_\Omega |\nabla u(x)|^p\,dx - \frac{\lambda_1}{p} \int_\Omega |u(x)|^p\,dx - \int_\Omega F(x, u(x))\,dx + \int_\Omega h(x) u(x)\,dx \tag{7.7.35}$$

where

$$F(x, s) = \int_0^s f(x, t)\,dt \quad \text{for a.a. } x \in \Omega \text{ and } s \in \mathbb{R}.$$

Then $E \in C^1(W_0^{1,p}(\Omega), \mathbb{R})$ (cf. Exercise 7.7.14) and its critical points correspond to the weak solutions of (7.7.32).
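Conditions (7.7.33) and (7.7.34) only compare three weighted integrals, so they are easy to test for concrete data. The following sketch is an illustration under assumptions not in the original text: $p = 2$ and $\Omega = (0, \pi)$ in one dimension, where $\varphi_1$ is a positive multiple of $\sin x$ (the normalization does not affect the inequalities), and a nonlinearity such as $f(x, s) = \arctan s$ with limits $f_{\pm\infty} = \pm\frac{\pi}{2}$. The helper names and the midpoint quadrature are hypothetical choices.

```python
import math


def integrate(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


def landesman_lazer(f_plus, f_minus, h, phi1, a, b):
    """Return "(7.7.33)" or "(7.7.34)" if that condition holds, otherwise None.

    f_plus, f_minus: the limits f_{+inf}(x), f_{-inf}(x) as functions of x.
    h: the right-hand side h(x); phi1: a positive first eigenfunction on (a, b).
    """
    i_plus = integrate(lambda x: f_plus(x) * phi1(x), a, b)
    i_minus = integrate(lambda x: f_minus(x) * phi1(x), a, b)
    i_h = integrate(lambda x: h(x) * phi1(x), a, b)
    if i_plus < i_h < i_minus:
        return "(7.7.33)"
    if i_minus < i_h < i_plus:
        return "(7.7.34)"
    return None


if __name__ == "__main__":
    half_pi = math.pi / 2
    # f(x, s) = arctan(s): limits +pi/2 and -pi/2, constant right-hand sides.
    for level in (1.0, 2.0):
        print(level, landesman_lazer(lambda x: half_pi, lambda x: -half_pi,
                                     lambda x, c=level: c, math.sin, 0.0, math.pi))
```

In this model case the condition reduces to $|h| < \frac{\pi}{2}$ for a constant right-hand side $h$, since all three integrals carry the same factor $\int_0^\pi \sin x\,dx = 2$.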

We proceed in three steps.

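Step 1. Let $\{u_n\}_{n=1}^\infty \subset W_0^{1,p}(\Omega)$ be such that there exists $c > 0$ such that

$$|E(u_n)| \le c \quad \text{for any } n \in \mathbb{N} \tag{7.7.36}$$

and there exists a strictly decreasing sequence $\{\varepsilon_n\}_{n=1}^\infty$, $\lim_{n \to \infty} \varepsilon_n = 0$, such that

$$|\langle E'(u_n), v \rangle| \le \varepsilon_n \|v\| \quad \text{for any } n \in \mathbb{N} \text{ and any } v \in W_0^{1,p}(\Omega).\,^{45} \tag{7.7.37}$$

Then we will prove that $\{u_n\}_{n=1}^\infty$ contains a subsequence which converges strongly in $W_0^{1,p}(\Omega)$.

Let us begin by proving that the sequence $\{u_n\}_{n=1}^\infty$ is bounded in $W_0^{1,p}(\Omega)$. Suppose, by contradiction, that $\|u_n\| \to \infty$, and define $v_n = \frac{u_n}{\|u_n\|}$. Thus $\{v_n\}_{n=1}^\infty$ is bounded in $W_0^{1,p}(\Omega)$ and hence, at least its subsequence, converges to a function $v_0$ weakly in $W_0^{1,p}(\Omega)$ and strongly in $L^p(\Omega)$. Dividing (7.7.35) with $u = u_n$ by $\|u_n\|^p$, we get, due to (7.7.36),

$$\limsup_{n \to \infty} \left[ \frac1p - \frac{\lambda_1}{p} \int_\Omega |v_n(x)|^p\,dx - \int_\Omega \frac{F(x, u_n(x))}{\|u_n\|^p}\,dx + \int_\Omega h(x) \frac{u_n(x)}{\|u_n\|^p}\,dx \right] \le 0.$$

Since

$$\lim_{n \to \infty} \left[ \int_\Omega \frac{F(x, u_n(x))}{\|u_n\|^p}\,dx + \int_\Omega h(x) \frac{u_n(x)}{\|u_n\|^p}\,dx \right] = 0$$

by the hypotheses on $f$, $h$, and $\{u_n\}_{n=1}^\infty$ while

$$\lim_{n \to \infty} \int_\Omega |v_n(x)|^p\,dx = \int_\Omega |v_0(x)|^p\,dx,$$

we have

$$\lambda_1 \int_\Omega |v_0(x)|^p\,dx \ge 1.$$

Using the weak lower semicontinuity of the norm and the variational characterization of $\lambda_1$ (see Appendix 7.5A), we get

$$1 \le \lambda_1 \int_\Omega |v_0(x)|^p\,dx \le \int_\Omega |\nabla v_0(x)|^p\,dx \le \liminf_{n \to \infty} \int_\Omega |\nabla v_n(x)|^p\,dx = 1.$$

Thus

$$\|v_0\| = 1 \quad \text{and} \quad \int_\Omega |\nabla v_0(x)|^p\,dx = \lambda_1 \int_\Omega |v_0(x)|^p\,dx.$$

This implies, by the definition of $\varphi_1$, that $v_0 = \pm\varphi_1$.$^{46}$ Now we write (7.7.36) and (7.7.37) with $v = u_n$ in the equivalent form

$$-cp \le \int_\Omega |\nabla u_n(x)|^p\,dx - \lambda_1 \int_\Omega |u_n(x)|^p\,dx - p \int_\Omega F(x, u_n(x))\,dx + p \int_\Omega h(x) u_n(x)\,dx \le cp,$$

$$-\varepsilon_n \|u_n\| \le -\int_\Omega |\nabla u_n(x)|^p\,dx + \lambda_1 \int_\Omega |u_n(x)|^p\,dx + \int_\Omega f(x, u_n(x)) u_n(x)\,dx - \int_\Omega h(x) u_n(x)\,dx \le \varepsilon_n \|u_n\|.$$

45 It will be convenient to express the assumption $E'(u_n) \to 0$ in this form.
46 Note that we have proved that $\|v_n\| \to \|v_0\|$ and so by the uniform convexity of $W_0^{1,p}(\Omega)$, $v_n \to \pm\varphi_1$, too.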

7.7A. Application of the Saddle Point Theorem


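Summing up and dividing by $\|u_n\|$, we obtain

$$\left|\int_\Omega \bigl[f(x,u_n(x))v_n(x) - p\,g(x,u_n(x))v_n(x) + (p-1)h(x)v_n(x)\bigr]\,dx\right| \le \frac{cp + \varepsilon_n\|u_n\|}{\|u_n\|}$$

where

$$g(x,s) = \begin{cases} \dfrac{F(x,s)}{s} & \text{if } s \ne 0, \\[1ex] f(x,0) & \text{if } s = 0. \end{cases} \tag{7.7.38}$$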
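The summation step can be checked term by term. In the sketch below, $A_n$ is shorthand introduced only for this display; the rest uses $F(x,u_n) = g(x,u_n)\,u_n$ and $u_n = \|u_n\|\,v_n$.

```latex
% Shorthand (assumed for this sketch): A_n := \int_\Omega |\nabla u_n|^p dx - \lambda_1 \int_\Omega |u_n|^p dx.
\begin{align*}
-cp &\le A_n - p\int_\Omega F(x,u_n)\,dx + p\int_\Omega h\,u_n\,dx \le cp, \\
-\varepsilon_n\|u_n\| &\le -A_n + \int_\Omega f(x,u_n)\,u_n\,dx - \int_\Omega h\,u_n\,dx \le \varepsilon_n\|u_n\| .
\end{align*}
% Adding the two chains cancels A_n:
\[
\Bigl|\int_\Omega \bigl[f(x,u_n)\,u_n - p\,F(x,u_n) + (p-1)\,h\,u_n\bigr]\,dx\Bigr|
  \le cp + \varepsilon_n\|u_n\| .
\]
% Dividing by \|u_n\| and inserting F(x,u_n) = g(x,u_n) u_n, u_n = \|u_n\| v_n:
\[
\Bigl|\int_\Omega \bigl[f(x,u_n)\,v_n - p\,g(x,u_n)\,v_n + (p-1)\,h\,v_n\bigr]\,dx\Bigr|
  \le \frac{cp}{\|u_n\|} + \varepsilon_n \longrightarrow 0 .
\]
```

Since the right-hand side tends to zero when $\|u_n\| \to \infty$, the $h$-term can be moved to the other side in the limit, which is where the factor $(1-p)$ comes from.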

Letting $n$ tend to infinity and supposing that $v_n$ converge to $+\varphi_1$ (for example), we obtain

$$\lim_{n\to\infty}\int_\Omega \bigl[f(x,u_n(x))v_n(x) - p\,g(x,u_n(x))v_n(x)\bigr]\,dx = (1-p)\int_\Omega h(x)\varphi_1(x)\,dx.$$

Since $v_n$ converge to $\varphi_1$, we have $\lim_{n\to\infty} u_n(x) = \infty$ for a.a. $x \in \Omega$, and so

$$f(x,u_n(x)) \to f_{+\infty}(x) \quad \text{for a.a. } x \in \Omega,$$

$$g(x,u_n(x)) \to f_{+\infty}(x) \quad \text{for a.a. } x \in \Omega.$$

The properties of $f$ and $F$ and the Lebesgue Theorem then imply

$$\lim_{n\to\infty}\int_\Omega \bigl[f(x,u_n(x))v_n(x) - p\,g(x,u_n(x))v_n(x)\bigr]\,dx = (1-p)\int_\Omega f_{+\infty}(x)\varphi_1(x)\,dx,$$

and so, since $p > 1$,

$$\int_\Omega f_{+\infty}(x)\varphi_1(x)\,dx = \int_\Omega h(x)\varphi_1(x)\,dx,$$

which contradicts both (7.7.33) and (7.7.34).

Thus $\{u_n\}_{n=1}^\infty$ is bounded. This implies that there exists $u \in W_0^{1,p}(\Omega)$ such that, at least its subsequence, $u_n$ converges to $u$ weakly in $W_0^{1,p}(\Omega)$ and strongly in $L^p(\Omega)$. Choosing $v = u_n - u$ in (7.7.37), we obtain

$$\left|\int_\Omega |\nabla u_n|^{p-2}\nabla u_n(\nabla u_n - \nabla u)\,dx - \lambda_1\int_\Omega |u_n|^{p-2}u_n(u_n - u)\,dx - \int_\Omega f(x,u_n)(u_n - u)\,dx + \int_\Omega h\,(u_n - u)\,dx\right| \le \varepsilon_n\|u_n - u\|.$$

Since $u_n \to u$ in $L^p(\Omega)$ and, by the hypotheses on $f$ and $h$,

$$\lim_{n\to\infty}\int_\Omega |u_n(x)|^{p-2}u_n(x)(u_n(x) - u(x))\,dx = 0,$$

$$\lim_{n\to\infty}\int_\Omega f(x,u_n(x))(u_n(x) - u(x))\,dx = 0,$$

$$\lim_{n\to\infty}\int_\Omega h(x)(u_n(x) - u(x))\,dx = 0,$$

we have

$$\lim_{n\to\infty}\int_\Omega |\nabla u_n(x)|^{p-2}\nabla u_n(x)(\nabla u_n(x) - \nabla u(x))\,dx = 0.$$

Subtracting

$$\int_\Omega |\nabla u(x)|^{p-2}\nabla u(x)(\nabla u_n(x) - \nabla u(x))\,dx$$

(which converges to zero as $n$ tends to infinity since $u$ belongs to $W_0^{1,p}(\Omega)$), we conclude that

$$0 = \lim_{n\to\infty}\int_\Omega \bigl(|\nabla u_n|^{p-2}\nabla u_n - |\nabla u|^{p-2}\nabla u\bigr)(\nabla u_n - \nabla u)\,dx \ge \lim_{n\to\infty}\bigl(\|u_n\|^{p-1} - \|u\|^{p-1}\bigr)\bigl(\|u_n\| - \|u\|\bigr) \ge 0,^{47}$$

which implies $\|u_n\| \to \|u\|$. The uniform convexity of $W_0^{1,p}(\Omega)$ yields that $u_n$ converges strongly to $u$ in $W_0^{1,p}(\Omega)$. This completes the proof of Step 1. Note that it follows from Step 1 that $E$ satisfies $(PS)_c$ on any level $c \in \mathbb{R}$.

Step 2. Note also that, in the proof of the Palais–Smale condition, we have proved that if $\{E(u_n)\}_{n=1}^\infty$ is a sequence bounded above with $\|u_n\| \to \infty$, then (at least its subsequence) $v_n = \frac{u_n}{\|u_n\|} \to \pm\varphi_1$ in $W_0^{1,p}(\Omega)$ (see footnote 46). Using this fact, it is easy to prove that $E$ is weakly coercive provided (7.7.33) holds. Otherwise, it is possible to choose a sequence $\{u_n\}_{n=1}^\infty$ such that

$$\|u_n\| \to \infty, \qquad E(u_n) \le c \qquad \text{and} \qquad v_n = \frac{u_n}{\|u_n\|} \to \pm\varphi_1 \quad \text{in } W_0^{1,p}(\Omega).$$

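Assume (for example) that $v_n \to \varphi_1$; arguing as in the previous proof we get

$$\int_\Omega h(x)\varphi_1(x)\,dx - \int_\Omega f_{+\infty}(x)\varphi_1(x)\,dx = \lim_{n\to\infty}\left[\int_\Omega h(x)v_n(x)\,dx - \int_\Omega \frac{F(x,u_n(x))}{\|u_n\|}\,dx\right] \le \limsup_{n\to\infty}\frac{E(u_n)}{\|u_n\|} \le \lim_{n\to\infty}\frac{c}{\|u_n\|} = 0,$$

which contradicts (7.7.33). The weak coerciveness of $E$ and the weak sequential lower semicontinuity (cf. Exercise 7.7.15) are enough in order to prove that $E$ attains its infimum (see Theorem 6.2.8 and Remark 6.2.22), so that (7.7.32) has at least one weak solution.

Step 3. If (7.7.34) holds, then $E$ has the geometry of the Saddle Point Theorem. Indeed, splitting $W_0^{1,p}(\Omega)$ as the direct sum of

$$Y = \operatorname{Lin}\{\varphi_1\} \qquad \text{and} \qquad Z := \left\{u \in W_0^{1,p}(\Omega) : \int_\Omega u(x)\varphi_1(x)\,dx = 0\right\},$$

we see that there exists $\bar\lambda > \lambda_1$ such that

$$\int_\Omega |\nabla u(x)|^p\,dx \ge \bar\lambda\int_\Omega |u(x)|^p\,dx \quad \text{for all } u \in Z$$

(cf. Exercise 7.7.16). Thus, by the Hölder inequality and by the properties of $F$, there exists $c > 0$ such that for every $u$ in $Z$,

$$E(u) \ge \frac{1}{p}\left(1 - \frac{\lambda_1}{\bar\lambda}\right)\int_\Omega |\nabla u(x)|^p\,dx - \int_\Omega F(x,u(x))\,dx + \int_\Omega h(x)u(x)\,dx$$

$$\ge \frac{1}{p}\left(1 - \frac{\lambda_1}{\bar\lambda}\right)\int_\Omega |\nabla u(x)|^p\,dx - \frac{c}{\bar\lambda^{1/p}}\left[(\operatorname{meas}\Omega)^{1/p'} + \|h\|_{L^{p'}(\Omega)}\right]\left(\int_\Omega |\nabla u(x)|^p\,dx\right)^{1/p}.$$

47 Cf. computation on page 328.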


Hence, $E$ is weakly coercive on $Z$, so that $B_Z := \min_{u \in Z} E(u) > -\infty$.
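To see why an estimate of this kind forces a finite lower bound, it helps to isolate its shape. The sketch below assumes only that the bound on $Z$ has the form $E(u) \ge a\|u\|^p - b\|u\|$ with constants $a > 0$ and $b \ge 0$; the letters $a$, $b$, $t$, $\varphi$ are introduced here for illustration and do not appear in the text.

```latex
% Notation for this sketch only: t := \|u\| = (\int_\Omega |\nabla u|^p dx)^{1/p}, a > 0, b >= 0.
\[
E(u) \ge \varphi(t) := a\,t^{p} - b\,t , \qquad u\in Z .
\]
% \varphi attains its minimum on [0,\infty) where \varphi'(t) = a p t^{p-1} - b = 0:
\[
t_* = \Bigl(\frac{b}{ap}\Bigr)^{\frac{1}{p-1}},
\qquad
\varphi(t_*) = t_*\bigl(a\,t_*^{\,p-1} - b\bigr) = -\frac{p-1}{p}\,b\,t_* .
\]
% Consequently E(u) >= -((p-1)/p) b t_* > -\infty on Z,
% and \varphi(t) \to +\infty as t \to \infty because p > 1.
```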

Observe that we have not yet used the fact that (7.7.34) holds. On the other hand, for every $t \in \mathbb{R}$ we have

$$\int_\Omega |\nabla(t\varphi_1)(x)|^p\,dx - \lambda_1\int_\Omega |t\varphi_1(x)|^p\,dx = 0$$

as follows from the definition of $\lambda_1$ and $\varphi_1$. Thus,

$$E(t\varphi_1) = t\int_\Omega h(x)\varphi_1(x)\,dx - \int_\Omega F(x,t\varphi_1(x))\,dx = t\left[\int_\Omega h(x)\varphi_1(x)\,dx - \int_\Omega g(x,t\varphi_1(x))\varphi_1(x)\,dx\right]$$

where $g$ has been defined by (7.7.38). Using the positivity of $\varphi_1$ and the hypotheses on $f$, it is easy to see that

$$\lim_{t\to+\infty} g(x,t\varphi_1(x))\varphi_1(x) = f_{+\infty}(x)\varphi_1(x) \quad \text{for a.a. } x \in \Omega.$$

Furthermore, there exists $c > 0$ such that $|g(x,t\varphi_1(x))\varphi_1(x)| \le c\,\varphi_1(x) \in L^1(\Omega)$, so that the Lebesgue Theorem implies

$$\lim_{t\to+\infty}\left[\int_\Omega h(x)\varphi_1(x)\,dx - \int_\Omega g(x,t\varphi_1(x))\varphi_1(x)\,dx\right] = \int_\Omega [h(x) - f_{+\infty}(x)]\varphi_1(x)\,dx,$$

and the limit is negative by (7.7.34). Analogously, if $t$ tends to $-\infty$, we have the same result with $f_{+\infty}$ replaced by $f_{-\infty}$, so that the limit is positive by (7.7.34). In both cases we have

$$\lim_{t\to\pm\infty} E(t\varphi_1) = -\infty.$$

Thus, there exists $R > 0$ such that if $|t| = R$, we have

$$E(t\varphi_1) < B_Z \le E(u) \quad \text{for all } u \in Z.$$
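The sign bookkeeping behind $E(t\varphi_1) \to -\infty$ in both directions can be laid out in a few lines. In this sketch $I(t)$, $I_+$ and $I_-$ are names assumed for illustration; the inequalities on $I_\pm$ are exactly (7.7.34).

```latex
% Names for this sketch only:
%   I(t) := \int_\Omega h \varphi_1 dx - \int_\Omega g(x, t\varphi_1) \varphi_1 dx,  so that E(t\varphi_1) = t I(t).
\[
I_+ := \lim_{t\to+\infty} I(t) = \int_\Omega (h - f_{+\infty})\,\varphi_1\,dx < 0 ,
\qquad
I_- := \lim_{t\to-\infty} I(t) = \int_\Omega (h - f_{-\infty})\,\varphi_1\,dx > 0 .
\]
% t -> +infinity: positive factor t times a factor tending to I_+ < 0;
% t -> -infinity: negative factor t times a factor tending to I_- > 0.
\[
E(t\varphi_1) = t\,I(t) \longrightarrow -\infty
\quad\text{as } t\to+\infty \text{ and as } t\to-\infty .
\]
```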

Hence $E$ satisfies the hypotheses of Theorem 6.5.12, and so there exists a critical point for $E$, that is, a weak solution of (7.7.32).

Exercise 7.7.13. Let $p = 2$, let $f(x,\cdot)$ be strictly increasing (decreasing). Then a necessary condition for the existence of a solution of (7.7.32) is that

$$\int_\Omega f_{-\infty}(x)\varphi_1(x)\,dx < \int_\Omega h(x)\varphi_1(x)\,dx < \int_\Omega f_{+\infty}(x)\varphi_1(x)\,dx$$

$$\left(\int_\Omega f_{+\infty}(x)\varphi_1(x)\,dx < \int_\Omega h(x)\varphi_1(x)\,dx < \int_\Omega f_{-\infty}(x)\varphi_1(x)\,dx\right).$$

Hint. Assume that $u$ is a solution of (7.7.32), i.e.,

$$\int_\Omega \nabla u(x)\nabla v(x)\,dx = \lambda_1\int_\Omega u(x)v(x)\,dx + \int_\Omega f(x,u(x))v(x)\,dx - \int_\Omega h(x)v(x)\,dx$$

for any $v \in W_0^{1,2}(\Omega)$. Choose $v = \varphi_1$ and use the fact that

$$\int_\Omega \nabla u(x)\nabla\varphi_1(x)\,dx = \lambda_1\int_\Omega u(x)\varphi_1(x)\,dx.$$
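One way to carry the hint through, for the strictly increasing case, is sketched below. It assumes that $f_{\pm\infty}(x)$ denote the limits of $f(x,s)$ as $s \to \pm\infty$ (as in the statement of the theorem, which is not reproduced here) and that $\varphi_1 > 0$ in $\Omega$.

```latex
% Test the weak formulation with v = \varphi_1 and cancel the two \lambda_1-terms:
\[
\int_\Omega f(x,u(x))\,\varphi_1(x)\,dx = \int_\Omega h(x)\,\varphi_1(x)\,dx .
\]
% If f(x,\cdot) is strictly increasing, then for a.a. x
\[
f_{-\infty}(x) < f(x,u(x)) < f_{+\infty}(x) ,
\]
% and multiplying by \varphi_1 > 0 and integrating gives
\[
\int_\Omega f_{-\infty}\,\varphi_1\,dx
  < \int_\Omega h\,\varphi_1\,dx
  < \int_\Omega f_{+\infty}\,\varphi_1\,dx .
\]
% The strictly decreasing case reverses both inequalities.
```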

Exercise 7.7.14. Prove that the functional $E(u)$ defined by (7.7.35) belongs to the space $C^1(W_0^{1,p}(\Omega), \mathbb{R})$.

Hint. Use an approach similar to that in Exercise 7.7.8.

Exercise 7.7.15. Prove that $E$ is a weakly sequentially lower semicontinuous functional on $W_0^{1,p}(\Omega)$.

Hint. Use the weak sequential lower semicontinuity of the norm in $W_0^{1,p}(\Omega)$, the compact embedding $W_0^{1,p}(\Omega) \subset\subset L^p(\Omega)$, and the continuity of the Nemytski operator $u \mapsto F(\cdot,u)$ from $L^p(\Omega)$ to $L^1(\Omega)$.

Exercise 7.7.16. Prove that there exists $\bar\lambda > \lambda_1$ such that

$$\int_\Omega |\nabla u(x)|^p\,dx \ge \bar\lambda\int_\Omega |u(x)|^p\,dx$$

for all $u \in Z := \left\{u \in W_0^{1,p}(\Omega) : \int_\Omega u(x)\varphi_1(x)\,dx = 0\right\}$.

Hint. Assume by contradiction that there exist $\varepsilon_n \to 0$ and $u_n \in Z$, $\|u_n\| = 1$, such that

$$1 = (\lambda_1 + \varepsilon_n)\int_\Omega |u_n(x)|^p\,dx.$$

Pass to a subsequence $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$, $u_n \to u$ in $L^p(\Omega)$ and show $u \ne o$,

$$\int_\Omega |\nabla u(x)|^p\,dx \le \lambda_1\int_\Omega |u(x)|^p\,dx.$$

This contradicts $u \in Z$ due to the simplicity of $\lambda_1$.

Exercise 7.7.17. Consider the boundary value problem

$$\begin{cases} -\Delta_p u(x) + \lambda|u(x)|^{p-2}u(x) = g(x,u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.39}$$

where $p > 1$. Formulate conditions on $\lambda$ and $g = g(x,s)$ which guarantee that the energy functional associated with (7.7.39) (i) is coercive, (ii) is weakly coercive, (iii) has a geometry corresponding to the Mountain Pass Theorem, (iv) has a geometry corresponding to the Saddle Point Theorem.
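As a starting point for this exercise, the energy functional in question can be written down by analogy with (7.7.35). The names $E_\lambda$ and $G$ below are assumptions made for this sketch, and $g$ is taken to satisfy whatever growth condition makes the integrals finite.

```latex
% Assumed notation: G(x,s) := \int_0^s g(x,t) dt.
\[
E_\lambda(u) := \frac{1}{p}\int_\Omega |\nabla u|^p\,dx
  + \frac{\lambda}{p}\int_\Omega |u|^p\,dx
  - \int_\Omega G(x,u)\,dx ,
  \qquad u\in W_0^{1,p}(\Omega).
\]
% Its formal derivative reproduces the weak form of (7.7.39):
\[
\langle E_\lambda'(u), v\rangle
  = \int_\Omega |\nabla u|^{p-2}\,\nabla u\cdot\nabla v\,dx
  + \lambda\int_\Omega |u|^{p-2}u\,v\,dx
  - \int_\Omega g(x,u)\,v\,dx .
\]
```

The four cases (i)–(iv) then amount to comparing the first two terms, which scale like $\|u\|^p$, with the growth of $G$.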

Summary of Methods Presented in This Book

Fixed Point Methods
Contraction Principle (Theorem 2.3.1)
Browder Theorem for non-expansive mappings (Proposition 2.3.10)
Brouwer Fixed Point Theorem (Theorem 5.1.3)
Schauder Fixed Point Theorem (Theorem 5.1.11, Example 5.2.14)
Fixed Point Theorem for condensing mappings (Theorem 5.1.27)

Local Differentiability Methods
Local Inverse Function Theorem (Theorem 4.1.1)
Implicit Function Theorem (Theorem 4.2.1)
Crandall–Rabinowitz Local Bifurcation Theorem (Theorem 4.3.22)

Topological Methods
Brouwer degree (Theorems 5.2.7 and 4.3.124)
Leray–Schauder degree (Theorem 5.2.13)
Topological degree of Browder and Skrypnik (Theorem 5.2.47)
Krasnoselski Local Bifurcation Theorem (Theorem 5.2.23)
Rabinowitz and Dancer Global Bifurcation Theorems (Theorems 5.2.34 and 5.2.38)

Monotonicity Methods
Monotone operators theory in Hilbert space (Theorem 5.3.4)
Monotone operators theory in Banach space (Theorem 5.3.22)
Leray–Lions Theorem for operators which are monotone in the principal part (Theorem 5.3.23)

Methods in Ordered Spaces
Monotone iterations, subsolutions and supersolutions (Theorem 5.4.16)
Supersolutions, subsolutions and topological degree (Theorems 5.4.49 and 5.4.50)
Supersolutions, subsolutions and global extrema (Theorem 6.2.42)

Variational Methods
Necessary and sufficient conditions for local extrema (Propositions 6.1.2 and 6.1.4, Theorem 6.1.5)
Global extrema (Theorems 6.2.4, 6.2.11 and 6.2.20)
Relative extrema and Lagrange Multiplier Method (Theorem 6.3.2)
Krasnoselski Potential Bifurcation Theorem (Theorem 6.3.26)
Mountain Pass Theorem (Theorem 6.4.5 in the Hilbert space setting, Theorem 6.4.24 in the Banach space setting)
Lusternik–Schnirelmann Method (Theorems 6.4.42 and 6.4.46)
Saddle Point Theorem (Theorem 6.5.3 in the Hilbert space setting, Theorem 6.5.12 in the Banach space setting)
Linking Theorem (Theorem 6.5.13)

Approximative Methods
Contraction Principle (Theorem 2.3.1)
Newton Method (Appendix 3.2A)
Ritz Method (Proposition 6.2.34)
Finite Element Method (Theorem 6.2.38)

Typical Applications

Most of the methods presented in this book are illustrated on boundary value problems for both ordinary and partial differential equations. For the reader's convenience we stress here some typical boundary value problems and the methods illustrated by them.

Semilinear Problems – Ordinary Differential Equations

$\dot x(t) = f(t,x(t))$, $x(0) = x(1)$
Lyapunov–Schmidt Reduction and Implicit Function Theorem (Example 4.3.15)
Coincidence degree (Example 5.2.18)

$\ddot x(t) = f(t,x(t))$, $x(0) = x(1) = 0$
Contraction Principle (Example 2.3.8)
Schauder Fixed Point Theorem (Example 5.1.14)
Supersolutions and subsolutions (Example 5.4.19)
Supersolutions and subsolutions combined with global extrema (Appendix 6.2B)

$-\ddot x(t) + g(x(t)) = f(t)$, $x(0) = x(1) = 0$
Monotone operators (Example 5.3.11)
Extreme Value Theorem (special case $g(s) = s^3$, Examples 6.2.6, 6.2.14 and 6.2.21)
Saddle Point Theorem (Example 6.5.5)

$-\ddot x(t) + \lambda x(t) = |x(t)|^{p-2}x(t)$, $x(0) = x(\pi) = 0$
Mountain Pass Theorem (Example 6.4.7)

$-\ddot x(t) + a(t)x(t) = f(t,x(t))$, $x(0) = x(1) = 0$
Linking Theorem (Example 6.5.14)

$\ddot x(t) + \lambda x(t) = 0$, $x(0) = x(1) = 0$
Courant–Weinstein Variational Principle (Example 6.3.15)

$-(p(t)\dot x(t))^{\cdot} + q(t)x(t) = \lambda x(t)$, $x(a) = x(b) = 0$
Hilbert–Schmidt Theorem (Example 2.2.17)
Krein–Rutman Theorem (Exercise 5.4.46)

$\ddot x(t) + \lambda\sin x(t) = 0$, $x(0) = x(2\pi)$, $\dot x(0) = \dot x(2\pi)$
Lyapunov–Schmidt Reduction (Example 4.3.24)

$\ddot x(t) + \lambda x(t) + g(\lambda,t,x(t)) = 0$, $x(0) = x(\pi) = 0$
Krasnoselski Local Bifurcation Theorem (Exercise 5.2.33)
Dancer Global Bifurcation Theorem (Example 5.2.39)

$\ddot x(t) + \lambda x(t) + g(\lambda,t,x(t)) = 0$, $x(0) = x(2\pi)$, $\dot x(0) = \dot x(2\pi)$
Krasnoselski Potential Bifurcation Theorem (Example 6.3.32)

$\ddot x(t) = f(t,x(t),\dot x(t))$, $x(0) = x(1) = 0$
Leray–Schauder degree (Example 5.2.16)

$\ddot x(t) + \lambda x(t) + g(\lambda,t,x(t),\dot x(t)) = 0$, $x(0) = x(2\pi)$, $\dot x(0) = \dot x(2\pi)$
Crandall–Rabinowitz Bifurcation Theorem (Example 4.3.25)

Semilinear Problems – Partial Differential Equations

$-\Delta u(x) = \mu u(x)$ in $\Omega$, $u = 0$ on $\partial\Omega$
Krein–Rutman Theorem (Example 5.4.40)

$-\Delta u(x) = g(u(x)) + f(x)$ in $\Omega$, $u = 0$ on $\partial\Omega$
Schauder Fixed Point Theorem – existence of a classical solution (Theorem 7.2.1)
Global extrema – existence of a weak solution (Theorem 7.7.1)

$-\Delta u(x) = g(x,u(x))$ in $\Omega$, $u = 0$ on $\partial\Omega$
Contraction Principle – existence of a weak solution (Theorem 7.4.1)
Schauder Fixed Point Theorem – existence of a weak solution (Proposition 7.4.2, Theorems 7.4.3 and 7.4.6)
Monotone operators – existence of a weak solution (Theorem 7.6.2)

$-\Delta u(x) = \lambda u(x) + h(x,u(x))$ in $\Omega$, $u = 0$ on $\partial\Omega$
Leray–Schauder degree – existence of a weak solution (Theorems 7.5.2 and 7.5.3, Exercises 7.5.8, 7.5.9, 7.5.10 and 7.5.11)
Saddle Point Theorem – existence of a weak solution (Theorem 7.7.5)

$-\Delta u(x) + \lambda u(x) = |u(x)|^{p-2}u(x)$ in $\Omega$, $u = 0$ on $\partial\Omega$
Mountain Pass Theorem – existence of a positive weak solution (Theorem 7.7.3)

Quasilinear Problems – Ordinary Differential Equations

$(|\dot x(t)|^{p-2}\dot x(t))^{\cdot} + \lambda|x(t)|^{p-2}x(t) = 0$, $x(0) = x(1) = 0$
Lagrange Multiplier Method (Example 6.3.5)
Lusternik–Schnirelmann Method (Example 6.4.47)

$-(|\dot x(t)|^{p-2}\dot x(t))^{\cdot} + g(x(t)) = f(t)$, $x(0) = x(1) = 0$
Topological degree of Browder and Skrypnik (Example 5.2.51)
Browder Theorem (Example 5.3.24)

$-(|\dot x(t)|^{p-2}\dot x(t))^{\cdot} = f(t,x(t))$, $x(0) = x(1) = 0$
Supersolutions and subsolutions combined with the topological degree (Appendix 5.4B)

$-(|\dot x(t)|^{p-2}\dot x(t))^{\cdot} + \lambda x(t) = |x(t)|^{r-2}x(t)$, $x(0) = x(1) = 0$
Mountain Pass Theorem (Exercise 6.4.26)

Quasilinear Problems – Partial Differential Equations

$-\Delta_p u(x) + g(x,u(x),\nabla u(x)) = f(x)$ in $\Omega$, $u = 0$ on $\partial\Omega$
Leray–Lions Theorem – existence of a weak solution (Appendix 7.6A)

$-\Delta_p u(x) = \lambda_1|u(x)|^{p-2}u(x) + f(x,u(x)) - h(x)$ in $\Omega$, $u = 0$ on $\partial\Omega$
Saddle Point Theorem – existence of a weak solution (Appendix 7.7A)

$-\Delta_p u(x) = \lambda|u(x)|^{p-2}u(x) + f(\lambda,x,u(x))$ in $\Omega$, $u = 0$ on $\partial\Omega$
Topological degree of Browder and Skrypnik – bifurcation result (Appendix 7.5A)

Comparison of Bifurcation Results Presented in This Book

Bifurcation results presented in this book are based on the following three basic tools:
Implicit Function Theorem (Crandall–Rabinowitz Local Bifurcation Theorem – Theorem 4.3.22),
Degree Theory (Krasnoselski Bifurcation Theorem – Theorem 5.2.23, Rabinowitz Global Bifurcation Theorem – Theorem 5.2.34, Dancer Global Bifurcation Theorem – Theorem 5.2.38),
Variational Principles (Krasnoselski Potential Bifurcation Theorem – Theorem 6.3.26).

Below we present a brief discussion of these results and point out differences and links among them.

A bifurcation result based on the Implicit Function Theorem provides very precise information about the structure of the set of nontrivial solutions near the bifurcation point – it is expressed in terms of a differentiable curve. Moreover, the result is obtained for “non potential” equations. On the other hand, to verify the assumptions, relatively strong smoothness assumptions on the nonlinearity are required and the information about the set of all nontrivial solutions has only local character. Also, the assumption that the dimension of the kernel of the linear part must be equal to 1 represents a relatively strong restriction.

Bifurcation results based on the Degree Theory do not require smoothness of the nonlinearity at all. They allow one to treat “non potential” equations as well and provide also some information about the global structure of the set of nontrivial solutions. On the other hand, the multiplicity of an eigenvalue of the “linear part” must be odd, the set of nontrivial solutions need not be a curve (even in a small neighborhood of the bifurcation point) and its global structure may be unclear if there is no additional information about “higher order terms”.

A bifurcation result based on Variational Principles holds for any multiplicity of an eigenvalue of the “linear part”. The price to be paid for that consists in the fact that the equation has to possess a potential. It provides only local information about nontrivial solutions and the structure of the set of all nontrivial solutions might be “very wild” in general.

We can summarize the above discussion in the following table.

| Bifurcation result | Basic tool to prove it | Form of the equation | Multiplicity m of an eigenvalue of the linear part | Character of the information | Smoothness required |
|---|---|---|---|---|---|
| Crandall–Rabinowitz Theorem (Theorem 4.3.22) | Implicit Function Theorem | non potential in general | m = 1 | local | twice continuously differentiable in a neighborhood |
| Krasnoselski Theorem (Theorem 5.2.23) | Degree Theory | non potential in general | m ≥ 1, m odd | local | continuous |
| Rabinowitz Theorem (Theorem 5.2.34) | Degree Theory | non potential in general | m ≥ 1, m odd | global | continuous |
| Dancer Theorem (Theorem 5.2.38) | Degree Theory | non potential in general | m = 1 | global | continuous |
| Krasnoselski Theorem (Theorem 6.3.26) | Lagrange Multiplier Method | potential only | m ≥ 1 arbitrary | local | twice continuously differentiable at the point |

List of Symbols

Sets and spaces
$M \times N$   Cartesian product of sets $M$ and $N$, 2
$\overline{M}$   closure of the set $M$, 25
$\overline{M}^{w}$   weak closure of the set $M$, 77
$\partial M$   boundary of the set $M$, 25
$\operatorname{int} M$   interior of the set $M$, 22
$\exp M$   set of all subsets of $M$, 121
$\sup M$   lowest upper bound (supremum) of the set $M$, 3
$\inf M$   greatest lower bound (infimum) of the set $M$, 3
$\mathbb{N}$   set of all positive integers, 19: $\mathbb{N} = \{1, 2, \dots\}$
$\mathbb{Z}$   set of all integers, 49
$\mathbb{Q}$   set of all rational numbers, 4
$\mathbb{R}$   set of all real numbers, 1
$\mathbb{C}$   set of all complex numbers, 1
$\mathbb{R}^N$, $\mathbb{C}^N$   real, or complex space of dimension $N \in \mathbb{N}$, 1, 12
[symbol missing]   set of scalars (in general), 1
$M^{\perp}$, $N_{\perp}$   nullsets, 12, 69
$M^{\perp}$   orthogonal complement of the set $M$, 47
$X^{\#}$   (algebraic) dual space of the linear space $X$, 10
$X^{*}$   dual (adjoint) space of the Banach space $X$, 55
$M \perp N$   orthogonality of the sets $M$ and $N$, 43
$[a, b]$   order interval in $X$, 330: $[a, b] = \{x \in X : a \le x \le b\}$
$U(x)$, $V(x)$, ...   neighborhood of the point $x$, 24
$\{U_n\}$   covering, 26
$B(a; r)$   open ball centered at the point $a \in X$ with radius $r > 0$, 25
$\partial B(a; r)$   sphere centered at the point $a \in X$ with radius $r > 0$, 279
$S^k$   $k$-dimensional sphere in $\mathbb{R}^N$, 182
$K$, $K_r$   order cone and $K_r = \{x \in K : \|x\| \le r\}$, 338
$X^{*,+}$   set of all positive functionals, 342
$\operatorname{Lin} M$   span of the elements of the set $M$, 2
$\operatorname{Co} M$   convex hull of the elements of the set $M$, 314
$M + N$   set of all $z = x + y$, $x \in M$, $y \in N$, 262
$X \oplus Y$   algebraic direct sum of the spaces $X$ and $Y$, 5
$\dim X$   dimension of the linear space $X$, 4
$\operatorname{codim} X$   co-dimension of the linear subspace $X$, 9
$X|Y$   factor space $X$ over $Y$, 9
$M$   manifold, 154
$TM$   tangent bundle, 187
$T^{*}M$   cotangent bundle, 187
$H^p(M)$   cohomology group of the manifold $M$, 205
$H_1(M)$   fundamental group of the manifold $M$, 206
$\{\alpha_n\}$   partition of unity, 209

Elements
$0$   zero number in $\mathbb{R}$ or $\mathbb{C}$, 2
$o$   zero element of a (topological, ..., Banach, Hilbert) space $X$ including $\mathbb{R}^N$ and $\mathbb{C}^N$, 1

Special spaces and classes
$l^p$   space of all sequences $\{x_n\}_{n=1}^\infty$ with $\sum_{n=1}^\infty |x_n|^p < \infty$, 37, 49
$l^\infty$   space of all bounded sequences, 10
$c_0$   space of all sequences with zero limit and with the sup norm, 59
$L(X, Y)$   space of all linear operators from $X$ into $Y$, 5
$\mathcal{L}(X, Y)$   space of all linear continuous operators from $X$ into $Y$, 29
$B_2(X, Y)$   space of all bilinear continuous operators from $X$ into $Y$, 129
$C(X, Y)$, $C[a, b]$   space of all continuous maps from $X$ into $Y$, and of the closed interval $[a, b]$, 30, 4
$C^k(X, Y)$   space of all maps from $X$ into $Y$ with continuous derivatives up to order $k$, 37
$C^k(\overline{M})$   space of all maps from $C^k(M)$ for which continuous derivatives up to order $k$ are bounded in $M$, 37
$C^{k,\gamma}(\overline{M})$   space of all $\gamma$-Hölder continuous, bounded functions in $M$ with continuous derivatives up to order $k$, 38
$C_0^k(X, Y)$   space of all maps from $X$ into $Y$ with continuous derivatives up to order $k$ and with compact supports in $X$, 368
$D(M)$   class of all infinitely differentiable functions on $M$ which have compact supports lying in $M$, 35
$BC(X)$   space of all bounded, continuous maps on $X$, 30
$L^p(M)$   Lebesgue space on $M$, 33
$L^\infty(M)$   space of all classes of essentially bounded functions, 33
$L^p_{\mathrm{loc}}(M)$   space of all classes of functions which are in $L^p$ on every compact subset of $M$, 38
$W^{k,p}(M)$, $W_0^{k,p}(M)$   Sobolev space on $M$, 39, 52
$T_a M$   tangent space of the differentiable manifold $M$ at the point $a$, 182
$(T_a M)^{*}$   cotangent space of the differentiable manifold $M$ at the point $a$, 187
$\mathrm{CAR}(M)$   class of all Carathéodory functions on $M$, 126
$H(M)$   class of all holomorphic functions on a neighborhood of the closed set $M$, 21
$\Gamma_a$   class of all smooth curves, 182
$\Lambda^n(x)$   class of all skew-symmetric $n$-linear forms, 196
$[f]$   class of mutual homotopic continuous maps containing $f$, 206
$C^{0,1}$   class of all domains with Lipschitz boundary, 396
$\mathcal{C}(X, Y)$   class of all compact operators from $X$ into $Y$, 77
$\mathcal{C}_f(X, Y)$   class of all finite-dimensional compact operators from $X$ into $Y$, 256

Maps, functions and operators
$\operatorname{Dom} f$   domain of the map $f$, 51
$\operatorname{Im} f$   range (image) of the map $f$, set of all values $f(x)$, $x \in \operatorname{Dom} f$, 7
$f\colon X \to Y$   $f$ is a map from $X$ into $Y$ (does not automatically mean that either $\operatorname{Dom} f = X$ or $\operatorname{Im} f = Y$), 26
$\operatorname{Ker} f$   kernel of the linear operator $f$, null-space of $f$, 7
$f_{-1}(M)$   set of all preimages, 26: $f_{-1}(M) = \{x \in \operatorname{Dom} f : f(x) \in M\}$
$f^{-1}$   inverse map to the map $f$, 7
$f^{+}$, $f^{-}$   positive and negative part of $f$, respectively, 53: $f^{+} = \max\{f, 0\}$, $f^{-} = \max\{-f, 0\}$, $f = f^{+} - f^{-}$
$f \circ g$   composition of the maps $f$ and $g$, 122: $f \circ g = f(g)$
$O$, $O$   zero map and zero matrix, respectively, 17, 15: $Ox = o$, $x \in X$, $Ox = o$

Special functions and maps
$J$   Jacobi matrix, 171
$J_f$   determinant of the Jacobi matrix of the map $f$ (Jacobian), 118
$f_1'$, $f_x'$   partial Fréchet derivative (in the infinite dimension) of $f$ with respect to the first variable $x$, 122
$\delta f(a; v)$, $\delta^k f(a; v_1, \dots, v_k)$   first (and $k$th) Gâteaux derivative of the map $f$ at the point $a$ in the direction $v$ (directions $v_1, \dots, v_k$), 117, 129
$Df(a)$, $D^k f(a)$   first (and $k$th) Gâteaux differential of the map $f$ at the point $a$, 117, 129
$D_w^{\alpha} f$   $\alpha$-weak derivative of the map $f$, 38
$\nabla f$   gradient of the map $f$, 52
$\operatorname{curl} f$   curl of the map $f$, 201: $\operatorname{curl} f = \nabla \times f$
$\operatorname{div} f$   divergence of the map $f$, 225: $\operatorname{div} f = \sum_{i=1}^N \frac{\partial f_i}{\partial x_i}$
$\Delta f$   Laplace operator of the map $f$ (Laplacian), 146: $\Delta f = \sum_{i=1}^N \frac{\partial^2 f}{\partial x_i^2}$, $\Delta f = \operatorname{div}(\nabla f)$
$\Delta_p f$   $p$-Laplacian of the map $f$, 500
$\Delta_M f$   Laplace–Beltrami operator of the map $f$ on the manifold $M$, 226
$L_g f$   directional (Lie) derivative of the map $f$ (in the direction of the vector field $g$), 192
$[f, g]$   commutator (Lie bracket) of the vector fields $f$ and $g$, 195
$\langle x, y\rangle$   duality between the Banach spaces $X^{*}$ and $X$, 303, 387: $\langle x, y\rangle = x(y)$, $x \in X^{*}$, $y \in X$
$(x, y)$   scalar (inner) product in the linear space $X$ or in $\mathbb{R}^N$, $\mathbb{C}^N$ ($xy := (x, y)_{\mathbb{R}^N}$ in Chapter 7), 41, 42: $(x, y)_{\mathbb{R}^N} = \sum_{i=1}^N x_i y_i$, $x, y \in \mathbb{R}^N$, $(x, y)_{\mathbb{C}^N} = \sum_{i=1}^N x_i \overline{y_i}$, $x, y \in \mathbb{C}^N$, respectively
$\|x\|$, $\|x\|_{\mathbb{R}^N} \equiv \|x\|_2$   norm in the normed space $X$ and Euclidean norm in $\mathbb{R}^N$ ($|x| := \|x\|_{\mathbb{R}^N}$ in Chapter 7), respectively, 28: $\|x\|_{\mathbb{R}^N}^2 = \sum_{i=1}^N x_i^2$, $x \in \mathbb{R}^N$
$x \wedge y$   exterior product in the linear space $X$, 197
$x \times y$   cross (vector) product in $\mathbb{R}^3$, 200
$X \subset Y$, $X \subset\subset Y$   the space $X$ is continuously (and compactly) embedded into the space $Y$, 36, 40
$\operatorname{meas} M$   Lebesgue measure of the set $M$, 36
$\operatorname{dist}(x, M)$, $\operatorname{dist}(M, N)$   distance of the point $x$ from the set $M$, and of the sets $M$ and $N$, respectively, 52, 272: $\operatorname{dist}(M, N) = \sup_{x \in M} \inf_{y \in N} \varrho(x, y)$
$\operatorname{diam} M$   diameter of the set $M \subset (X, \varrho)$, 261: $\operatorname{diam} M = \sup_{x, y \in M} \varrho(x, y)$
$\operatorname{Re} f$   real part of $f$, 41
$\operatorname{Im} f$   imaginary part of $f$, 61
$f_{*}(v)$   push-forward; the map $f$ pushes forward a tangent vector $v$, 186
$f^{*}g$   pull-back; the map $f$ pulls back a differential form $g$, 189
$\deg(f, M, a)$   degree of the map $f$ at the point $a$ with respect to the set $M$, 231
$\operatorname{ind} f$   index of the Fredholm operator $f$, 70: $\operatorname{ind} f = \dim \operatorname{Ker} f - \dim \operatorname{Ker} f^{*}$
$\operatorname{Ind}_\gamma a$   index of the point $a$ with respect to the curve $\gamma$, 230
$\operatorname{supp} f$   support of the map $f$, 35
$\sigma(f)$   spectrum (set of all eigenvalues) of the map $f$, 14
$\varrho(f)$   resolvent set of the map $f$, 56: $\varrho(f) = \mathbb{C} \setminus \sigma(f)$
$\omega_f$   $(M-1)$-form of the vector field $f$ and of the $M$-form $\omega$, 225

Others
$x := y$, $y =: x$   let $x$ be $y$; $x$ is defined by $y$; denote $y$ as $x$, 4, 135
a.a.   almost all (in the sense of the Lebesgue measure), 70
a.e.   almost everywhere, almost every (in the sense of the Lebesgue measure), 33
$x_n \to x$   strong convergence of the sequence $\{x_n\}_{n=1}^\infty$ to the element $x$, 25
$x_n \rightharpoonup x$   weak convergence of the sequence $\{x_n\}_{n=1}^\infty$ to the element $x$, 65
$x_n \overset{*}{\rightharpoonup} x$   weak star convergence of the sequence $\{x_n\}_{n=1}^\infty$ to the element $x$, 68
$x_n \rightrightarrows x$   uniform convergence of the sequence $\{x_n\}_{n=1}^\infty$ to the element $x$, 37
$p' = \frac{p}{p-1}$   exponent conjugate to $p$, 33
$p^{*} = \frac{Np}{N-kp}$   critical Sobolev exponent, 39
$x > o$, $x \ge o$, $x \gg o$   ordering in the Banach space $X$, $o, x \in X$, 330

Index

a posteriori estimate, 92 a priori estimate, 92, 280 absolute neighborhood extensor, 443 absolutely continuous function, 39 accumulation point, 85 adjoint operator, 12, 68, 72 algebraic, 11 admissible homotopy, 271 Alaoglu–Bourbaki theorem, 68 algebra, 31 Banach, 31 Lie, 195 normed, 31 alternative Fredholm, 14, 82 problem, 166 antipodal point, 450 theorem (Borsuk), 242 approximation theorem (Weierstrass), 250 approximative unit, 236 Arzelà–Ascoli theorem, 31 atlas, 159 equivalent, 185 ball, open, 25 Banach algebra, 31 space, 28 ordered, 330 Banach–Steinhaus theorem, 58 basis, 2 dual, 10 Hamel, 2 orthonormal, 43 positive, 215

Schauder, 40 standard, 3 Bessel inequality, 44 bifurcation, 150 branch, 176 diagram, 175 equation, 166 global Dancer theorem, 300 Rabinowitz theorem, 295 local Crandall–Rabinowitz theorem, 174 Krasnoselski theorem, 290 pitchfork, 177 point, 174 potential (Krasnoselski) theorem, 417 transcritical, 177 bijective linear operator, 7 mapping, 2 bilinear form, 195 operator, 58, 129 Bochner integral, 110 theorem, 111 Borel measure, 39, 63 Borsuk antipodal theorem, 242 Borsuk–Ulam theorem, 245 boundary, 25 condition, 91 Dirichlet, 91 mixed, 91 Neumann, 91 periodic, 91 Lipschitz, 474

of manifold, 218 value problem, 96 bounded operator, 264 brachistochrone problem, 366 brackets Lie, 193 Poisson, 193 branch, 176 bread–ham–cheese theorem, 247 Brouwer degree, 238, 269, 275 fixed point theorem, 253 Browder theorem, 323 bundle cotangent, 187 tangent, 187 calculus, functional, 22, 90 Dunford, 22, 113 canonical embedding, 9, 65 Carathéodory property, 126 Cartesian coordinates, 142 category, Lusternik–Schnirelmann, 443 Cauchy integral formula, 22 Cauchy sequence, 27 Cauchy–Riemann conditions, 230 central manifold, 154 chain rule, 121 characteristic equation, 15 of partial differential, 191 function of set, 36 chart, 159 classical solution, 74, 476 closed differential form, 202 graph theorem, 60 operator, 59 set, 25 weakly sequentially, 381 closure, 25 weak, 77 codimension, 9 coercive functional, weakly, 380 operator, 323, 385 weakly, 310 cohomology group, 205

coincidence degree, 284 commutator, 195 compact embedding, 40 operator linear, 77 nonlinear, 256 set, 26 relatively, 26 space, 26 sequentially, 26 compactness in $C(T)$, 31 in $L^p(\Omega)$, 35 comparison principle, 349 complement direct, 5 orthogonal, 47 topological, 175 complete metric space, 27 completely integrable system, 192 completion, 28 complex linear space, 2 complexification, 4 of operator, 342 component, 27 concentration compactness principle, 521 condensing operator, 264 condition boundary, 91 Dirichlet, 91 mixed, 91 Neumann, 91 periodic, 91 Cauchy–Riemann, 230 Euler necessary, 362 growth, 294, 506, 510 sublinear, 488 initial, 89 integrability, 192 Lagrange necessary, 362 sufficient, 363 Landesman–Lazer type, 285, 497 Lipschitz, 107 monotonicity in principal part, 323 Nagumo-type, 281

Palais–Smale ($(PS)_c$), 432 on manifold, 449 ($S_+$), 303 sign, 281, 515 cone, 330 order, 330 dual, 342 normal, 332 total, 342 conjugate exponent, 33 connected space, 27 constant, Lipschitz, 96 constrained extremum, 401 maximum, 401 minimum, 401 continuous embedding, 36 extension, 27 linear form, 50 mapping, 26 operator, 29 contractible set, 414 contraction, 92 k-set, 264 principle, 92 convergence, 25 in strong operator topology, 58 uniform, 30 locally, 30 weak, 65, 68 star, 68 convex functional, 365 strictly, 365 hull, 314 set, 13 convolution, 35 coordinates, 3 Cartesian, 142 local, 159 nonlinear, 142 polar, 142 spherical, 142, 143 cotangent space, 187 Courant–Fischer principle, 409

Courant–Weinstein variational principle, 411 covering, 26 Crandall–Rabinowitz local bifurcation theorem, 174 critical growth, 521 point, 160, 362 non-degenerate, 170 Sobolev exponent, 39 value, 160 cross product, 200 cubic spline, 395 curl of vector fields, 201 curve integral of function, 208 of one-form, 212 null-homotopic, 206 oriented, positively, 230 Peano, 241 simple, 224 Dancer global bifurcation theorem, 300 Darbo theorem, 265 deformation lemma, 429, 437, 448 degree Brouwer, 238, 269, 275 coincidence, 284 generalized monotone operator, 304 Leray–Schauder, 278 demicontinuity, 303 dense set, 25 density theorem, 35 dependent functions, 164 derivative directional, 117 second, 129 distributional, 39 Fréchet, 122 second, 130 Gâteaux, 117 second, 129 Lie, 192 partial Fréchet, 124 Gâteaux, 124 weak, 38

diagram, bifurcation, 175 diameter, 261 diffeomorphism, 142 differentiable manifold, 159 infinite dimensional, 189 with boundary, 218 differential equation characteristic, 191 exterior, 206 in Banach space, 106 form, 197 closed, 202 exact, 202 on manifold, 197 smooth, 201 of differential form, 201 on manifold, 187 operator, 74 Dirac measure, 38 direct sum, 5 directional derivative, 117 second, 129 Dirichlet boundary condition, 91 kernel, 57 distribution, 38 distributional derivative, 39 divergence, 225 domain of class $C^{k,\gamma}$, 474 with Lipschitz boundary, 474 dual basis, 10 characterization of norm, 61 order cone, 342 space, 10, 61 duality lemma, 447 mapping, 121 Dunford functional calculus, 22, 113 Eberlein–Šmulian theorem, 67 eigenfunction, 305, 404 principal, 405, 490 eigenvalue, 14, 305 principal, 405, 490 simple, 174, 342

eigenvector, 14 Ekeland variational principle, 439 element, finite, 396 embedding, 159 canonical, 9, 65 compact, 40 continuous, 36 energy functional, 379, 515 ε-net, 27 equality, Parseval, 48 equation bifurcation, 166 characteristic, 15 differential characteristics, 191 exterior, 206 in Banach space, 106 Euler, 368, 371 in variations, 151 integral first kind, 82 second kind, 82 well-posed, 82 equicontinuity, 31 equivalence of norms, 29 estimate a posteriori, 92 a priori, 92, 280 Schauder, 475 Euler equation, 368, 371 necessary condition, 362 exact differential form, 202 example Beals, 98 Brézis–Nirenberg, 428 Brouwer, 275 Edelstein, 101 Kakutani, 256 Weierstrass, 374 exponent conjugate, 33 Sobolev critical, 39 extensor, absolute neighborhood, 443 exterior differential equation, 206 product, 197

extreme value theorem, 378 extremum global, 373 local, 361 constrained, 401 strict, 362 factor space, 9 Fatou lemma, 34 Fermat necessary condition, 362 finite elements method, 396 intersection property, 51 first integral, 165 fixed point, 92 Floquet multiplier, 255 form bilinear, 195 differential closed, 202 exact, 202 on manifold, 197 smooth, 201 Jordan canonical, 19 linear, 10 continuous, 50, 61 skew-symmetric, 196 formula Cauchy integral, 22 Green, 226, 479 Leray–Schauder index, 289 product, 293 Taylor, 130 Fréchet derivative, 122 partial, 124 second, 130 differentiability, 122 Fredholm alternative, 14, 82 operator, 70, 283 Frobenius theorem, 193, 207 function Bochner integrable, 110 continuous absolutely, 39 Hölder, 38 Lipschitz, 38

uniformly, 27 differentiable on manifold, 187 essentially bounded, 33 Green, 75 holomorphic, 145 strongly measurable, 110 test, 485 vector-valued, 64 functional calculus, 22, 90, 113 coercive, weakly, 380 convex, 365 strictly, 365 energy, 379, 515 Minkowski, 62 positive, 342 sublinear, 61 weakly sequentially continuous, 378 lower semi-continuous, 377 functions dependent, 164 independent, 164 fundamental group, 206 lemma in calculus of variations, 365 matrix, 254 theorem of algebra, 15, 228 Gâteaux derivative, 117 partial, 124 second, 129 differentiability, 117 variation, 117 Gauss–Ostrogradski theorem, 225 Gauss–Seidel iterative method, 392 general minimax principle, 440, 449 on manifold, 449 generalized inverse, 283 geometry, mountain pass, 431 global extremum, 373 inverse function theorem, 144 maximum, 373 minimum, 373 gradient, 52, 118, 200 Gram matrix, 213

Graves theorem, 105 greatest lower bound, 3 Green formula, 226, 479 function, 75 theorem, 224 Gronwall inequality, 260 group cohomology, 205 fundamental, 206 Lie, 195 topological, 195 growth condition, 294, 506, 510 critical, 521 subcritical, 521 sublinear, 488 Hahn–Banach theorem, 61 half-linear differential operator, 305 Hamel basis, 2 Hamilton–Cayley theorem, 17 Hammerstein operator, 133 Heaviside function, 38 Hermite interpolation, 398 polynomial, 49 Hess matrix, 132 Hilbert space, 41 Hilbert–Schmidt operator, 79 theorem, 88 holomorphic function, 145 homeomorphism, 26 homotopic mappings, 205 homotopy, 205 admissible, 271 invariance property, 238, 279, 304 Hölder continuous function, 38 inequality, 33 hull, convex, 314 hyperbolic stationary point, 154, 176 hyperplane, 11 supporting, 120 identity Jacobi, 195

parallelogram, 42 polarization, 42 image, 7 immersion, 159 implicit function theorem, 147 independent functions, 164 index Leray–Schauder formula, 289 of fixed point, 230 of Fredholm operator, 70 of isolated solution, 276, 304 inequality Bessel, 44 Gronwall, 260 Hölder, 33 Minkowski, 33 Poincaré, 52 Schwartz, 41 triangle, 25, 28 infinite dimensional differentiable manifold, 189 initial condition, 89 value problem, 93 injection, linear, 7 injective linear operator, 7 integrability condition, 192 integral Bochner, 110 curve of function, 208 of one-form, 212 equation first kind, 82 second kind, 82 first, 165 formula (Cauchy), 22 manifold, 192, 207 of differential form, 216 operator (Volterra), 86 Riemann, 105 weak, 110 interior, 25 interpolation Hermite, 398 Lagrange, 397 interval, order, 330

invariance of domain, 241 invariant subspace, 15 inverse, 7 function theorem global, 144 local, 140 generalized, 283 matrix, 7 right, 160, 283 isomorphism in algebraic sense, 7 in topological sense, 29 Jacobi identity, 195 matrix, 118 Jacobian, 118 Jentzsch theorem, 350 generalized, 341 Jordan canonical form, 19 cell, 19 separation theorem, generalization, 241 Kakutani counterexample, 256 kernel, 7 Krasnoselski theorem bifurcation local, 290 potential, 417 minorant, 339 Krein–Rutman proposition, 342 theorem, 343 k-set contraction, 264 Kuratowski measure of noncompactness, 261 Lagrange interpolation, 397 multiplier, 402 method, 402 necessary condition, 362 sufficient condition, 363 Landesman–Lazer type conditions, 285, 497 Laplace operator, 226, 473 p-Laplacian, 500

one-dimensional, 305 Laplace–Beltrami operator, 226 Lax–Milgram proposition, 50 Le Shujie theorem, 440 Lebesgue measure, 32 space, 33 lemma duality, 447 Fatou, 34 fundamental in calculus of variations, 365 quantitative deformation, 429, 437, 448 Riemann–Lebesgue, 59 Urysohn, 246 Zorn, 2 Leray–Lions theorem, 323 Leray–Schauder continuation method, 280 degree, 278 formula, 289 Lie algebra, 195 brackets, 193 derivative, 192 group, 195 ring, 195 linear form, 10 operator, 5 space, 1 linking theorem, 465 Lipschitz boundary, 474 condition, 107 constant, 96 continuous function, 38 local bifurcation theorem, 174, 290 chart, 159 coordinates, 159 extremum, 361 inverse function theorem, 140 maximum, 361 minimum, 361 parametrization, 182 stable manifold, 154

locally finite system, 209 Lipschitz continuous mapping, 93 lower solution, 334 lowest upper bound, 3 Lusternik theorem, 403 Lusternik–Schnirelmann category, 443 method, 443 theorem, 246 Luzin theorem, 35 Lyapunov–Schmidt reduction, 166 manifold, 40 central, 154 differentiable, 159 infinite dimensional, 189 with boundary, 218 integral, 192, 207 orientable, 215 oriented, 215 simply connected, 205 stable, 154 local, 154 submanifold, 212 mapping bijective, 2 class C^k, 186 continuous, 26 locally Lipschitz, 93 uniformly Lipschitz, 94 contractive, 92 duality, 121 homotopic, 205 non-expansive, 98 odd, 242 Poincaré, 152, 254 proper, 230 retraction, 465 set contraction, 264 matrix fundamental, 254 Gramm, 213 Hess, 132 inverse, 7 Jacobi, 118 rank of, 14 regular, 14

representation, 6 transpose of, 11 Mawhin theorem, 284 maximum global, 373 local, 361 constrained, 401 strict, 362 mean value theorem, 119 measurable function, strongly, 110 measure Borel, 39, 63 Dirac, 38 Lebesgue, 32 of noncompactness (Kuratowski), 261 method finite elements, 396 Gauss–Seidel iterative, 392 Lagrange multiplier, 402 Leray–Schauder continuation, 280 Lusternik–Schnirelmann, 443 monotone iterative, 335 Newton, 97, 134 Ritz, 389 metric, 24 Riemann, 214 space, 24 symmetry of, 25 mild solution, 116 minimax principle, 408 general, 440, 449 minimizing sequence, 375, 390 minimum global, 373 local, 361 constrained, 401 strict, 362 Minkowski functional, 62 inequality, 33 minorant, 339 principle, 339 Minty trick, 326 mixed boundary condition, 91 Möbius strip, 215 mollifier, 35 monotone

convergence theorem, 34 iterative method, 335 operator, 310 decreasing, 332 increasing, 332 strictly, 310 strictly decreasing, 332 strictly increasing, 332 strongly, 310 strongly decreasing, 332 strongly increasing, 332 monotonicity in principal part, 323 Morse theorem, 170 mountain pass theorem, 432 Ambrosetti–Rabinowitz, 441 type geometry, 431 multiindex, 37 multiplicity of eigenvalue, 15, 84 multiplier Floquet, 255 Lagrange, 402 Nagumo-type condition, 281 neighborhood, 24 weak, 66 Nemytski operator, 125 net (ε-net), 27 Neumann boundary condition, 91 Newton method, 97, 134 Newton–Robin boundary condition, 91 nilpotent operator, 19 non-expansive mapping, 98 nonresonance problem, 305, 497 norm, 27 equivalent, 29 induced by scalar product, 41 of linear operator, 29 on Cartesian product, 124 normal cone, 332 normed algebra, 31 linear space, 28 null-homotopic curve, 206 number, winding, 231 odd mapping, 242 open

ball, 25 mapping theorem, 58 set, 24 operator adjoint, 12, 68, 72 algebraic, 11 bilinear, 58, 129 bounded, 264 closed, 59 coercive, 323, 385 weakly, 310 compact finite dimensional, 256 linear, 77 nonlinear, 256 condensing, 264 continuous, 29 demicontinuous, 303 differential, 74 half-linear, 305 Fredholm, 70, 283 Hammerstein, 133 Hilbert–Schmidt, 79 integral (Volterra), 86 inverse, 7 generalized, 283 right, 160, 283 Laplace, 226, 473 p-Laplacian, 500 Laplace–Beltrami, 226 linear, 5 bijective, 7 injective, 7 isomorphism, 7 surjective, 7 monotone, 310 decreasing, 332 increasing, 332 strictly, 310 strictly decreasing, 332 strictly increasing, 332 strongly, 310 strongly decreasing, 332 strongly increasing, 332 Nemytski, 125 nilpotent, 19 norm, 29

of finite rank, 78 positive, 332, 409 strictly, 332 strongly, 332 projection, 8 self-adjoint, 69 shift left, 10 right, 10 Sturm–Liouville, 88 sublinear, 61 substitution, 125 superposition, 125 unitary, 49 Volterra integral, 86 order cone, 330 dual, 342 interval, 330 ordered Banach space, 330 set, 2 ordering, 2 orientable manifold, 215 orientation, 215 induced, 221 oriented curve, 230 manifold, 215 orthogonal complement, 47 projection, 47 set, 43 orthogonalization, 43 orthonormal basis, 43 system, 43 outer normal vector, 215 Palais–Smale condition ((PS)_c), 432 on manifold, 449 parallelogram identity, 42 parametrization, local, 182 Parseval equality, 48 partition of unity, 209 Peano curve, 241 periodic condition, 91 Perron theorem, 350

generalized, 341 Pettis theorem, 112 Pfaff system, 207 Picard theorem, 93 pitchfork bifurcation, 177 p-Laplacian, 305, 500 non-well-ordered case, 355 well-ordered case, 353 Poincaré inequality, 52 mapping, 152, 254 theorem, 203 point antipodal, 450 bifurcation, 174 critical, 160, 362 non-degenerate, 170 fixed, 92 regular, 160 singular, 156 stationary, 176 hyperbolic, 176 stationary hyperbolic, 154 Poisson brackets, 193 polar coordinates, 142 polarization identity, 42 polynomial characteristic, 15 Hermite, 49 Taylor, 130 positive basis, 215 functional, 342 operator, 332, 409 strictly, 332 strongly, 332 solution, 338 subsolution, 338 positively oriented curve, 230 potential, 118, 202, 416 bifurcation theorem, 417 principal eigenfunction, 405 eigenvalue, 405 principle comparison, 349 concentration compactness, 521

contraction, 92 Courant–Fischer, 409 Courant–Weinstein variational, 411 Ekeland variational, 439 minimax, 408 general, 440, 449 general on manifold, 449 minorant, 339 super- and subsolutions, 334 uniform boundedness, 57 problem boundary value, 96 brachistochrone, 366 initial value, 93 nonresonance, 305, 497 regularity, 319 resonance, 497 product cross, 200 exterior, 197 formula, 293 scalar, 41 vector, 200 projection, 8 orthogonal, 47 projective space, 450 proper mapping, 230 property Carathéodory, 126 finite intersection, 51 homotopy invariance, 238, 279, 294, 304 proposition Bochner, 111 Browder, 98 Euler necessary condition, 362 Kolmogorov, 36 Krein–Rutman, 342 Lagrange necessary condition, 362 Lax–Milgram, 50 Leray–Schauder index formula, 289 Riesz, 32 Schauder, 81 Skrypnik, 304 Taylor formula, 130 (PS)_c condition, 432 pseudogradient, 436 vector field, 436

tangent, 447 pull-back, 189 push-forward, 186 quantitative deformation lemma, 429, 437, 448 Rabinowitz global bifurcation theorem, 295 linking theorem, 465 saddle point theorem, 464 radius, spectral, 57 rank of matrix, 14 theorem, 162 real linear space, 2 reducing subspace, 15 reduction (Lyapunov–Schmidt), 166 reflexive space, 66 regular matrix, 14 point, 160 value, 160 regularity of classical solution, 371 of weak solution, 372 problem, 319 theory, 483 relatively compact set, 26 relaxation parameter, 392 Rellich–Kondrachov theorem, 40 resolvent, 56 set, 56 resonance problem, 497 retraction, 465, 470 Riemann integral, 105 metric, 214 sum, 106 Riemann–Lebesgue lemma, 59 Riesz proposition, 32 representation theorem, 50 Riesz–Fischer theorem, 49 Riesz–Schauder theory, 82 right inverse, 160, 283 Ritz method, 389 Rothe theorem, 279

Rouché theorem, 231 rule, chain, 121 (S_+) condition, 303 saddle point theorem, 459 Rabinowitz, 464 Sard theorem, 245, 272 scalar product, 41 Schauder basis, 40 estimates, 475 fixed point theorem, 257 proposition, 81 Schmidt orthogonalization, 43 Schwartz inequality, 41 self-adjoint operator, 69 semi-continuity, 377 semi-norm, 61 separable space, 25 separation theorem, 62 sequence Cauchy, 27 minimizing, 375, 390 sequentially continuous functional weakly, 378 set closed, 25 weakly sequentially, 381 compact, 26 relatively, 26 contractible, 414 contraction, 264 convex, 13 dense, 25 diameter, 261 open, 24 weakly, 66 ordered, 2 orthogonal, 43 resolvent, 56 symmetric, 242 shift left, 10 right, 10 sign condition, 281, 515 simple curve, 224 eigenvalue, 174, 342

simply connected manifold, 205 singular point, 156 skew-symmetric form, 196 Sobolev critical exponent, 39 embedding theorem, 39 space, 39 solution classical, 74, 476 lower, 334 mild, 116 of variational problem, 389 operator, 352 positive, 338 strong, 74 upper, 334 weak, 319, 483–485 regularity of, 319, 372 space Banach, 28 completion, 28 ordered, 330 compact, 26 sequentially, 26 complete, 27 connected, 27 cotangent, 187 dual, 10, 61 factor, 9 Hilbert, 41 Lebesgue, 33 linear, 1 complex, 2 normed, 28 real, 2 metric, 24 complete, 27 of bounded sequences, 10 of compact linear operators, 77 of continuous functions, 30 of continuous linear operators, 55 of differentiable functions, 37 of integrable functions, 32 of linear operators, 5 projective, 450 reflexive, 66 separable, 25

Sobolev, 39 tangent, 182 topological, 24 uniformly convex, 65 with scalar product, 41 span, 2 spectral radius, 57 spectrum, 14, 56 spherical coordinates, 142, 143 spline cubic, 395 stability, 176 stable manifold, 154 local, 154 standard basis, 3 stationary point hyperbolic, 154, 176 non-hyperbolic, 176 step function, 110 Stokes theorem, 225 abstract, 222 Stone–Weierstrass theorem, 31 strictly monotone operator, 310 strong operator topology, 58 solution, 74 strongly measurable function, 110 monotone operator, 310 Sturm–Liouville operator, 88 subcritical growth, 521 sublinear functional, 61 growth condition, 260, 488 operator, 61 submanifold, 212 subsolution, 334, 339, 351 positive, 338 strict, 334, 352 strong, 334 subspace, 2 closed linear, 60 invariant, 15 reducing, 15 substitution operator, 125 sum, direct, 5 superposition operator, 125

supersolution, 334, 351 strict, 334, 352 strong, 334 support, 35, 209 supporting hyperplane, 120 surjection, linear, 7 surjective linear operator, 7 symmetric set, 242 symmetry of metric, 25 system completely integrable, 192 locally finite, 209 orthonormal, 43 Pfaff, 207 tangent bundle, 187 pseudogradient vector field, 447 space, 182 vector, 182 Taylor formula, 130 polynomial, 130 test function, 485 theorem Alaoglu–Bourbaki, 68 Ambrosetti–Rabinowitz, 441 Arzelà–Ascoli, 31 Banach–Steinhaus, 58 bifurcation, global Dancer, 300 Rabinowitz, 295 bifurcation, local (Crandall–Rabinowitz), 174 Bochner, 111 Borsuk antipodal, 242 Borsuk–Ulam, 245 Brézis–Nirenberg, 440 bread–ham–cheese, 247 Browder, 323 chain rule, 122 closed graph, 60 contraction principle, 92 Courant–Fischer principle, 409 Courant–Weinstein variational principle, 411 Crandall–Rabinowitz, 174 Dancer, 300

Darbo, 265 density, 35 dual characterization of norm, 61 Dunford functional calculus, 113 Eberlain–Smulyan, 67 Ekeland, 439 Euler necessary condition, 362 extreme value, 378 fixed point Brouwer, 253 Schauder, 257 Frobenius, 193, 207 functional calculus, 22 fundamental lemma in calculus of variations, 365 of algebra, 15, 228 Gauss–Ostrogradski, 225 Graves, 105 Green, 224 Hahn–Banach, 61 Hamilton–Cayley, 17 Hilbert–Schmidt, 88 implicit function, 147 invariance of domain, 241 inverse function global, 144 local, 140 Jentzsch, 350 generalized, 341 Jordan canonical form, 19 separation, generalization, 241 Krasnoselski local bifurcation, 290 minorant, 339 potential bifurcation, 417 Krein–Rutman, 343 Lagrange multiplier method, 402 necessary condition, 362 sufficient condition, 363 Le Shujie, 440 Leray–Lions, 323 linking, 465 Lusternik, 403 Lusternik–Schnirelmann, 246 Luzin, 35

Mawhin, 284 mean value, 119 minimax principle, 408, 440, 449 monotone convergence, 34 iterative method, 335 Morse, 170 mountain pass, 432 Ambrosetti–Rabinowitz, 441 on non-well-ordered case, 355 on well-ordered case, 353 open mapping, 58 Perron, 350 generalized, 341 Pettis, 112 Picard, 93 Poincaré, 203 Rabinowitz, 295, 464, 465 rank, 162 regularity of classical solution, 371 of weak solution, 372 Rellich–Kondrachov, 40 Riesz representation, 50 Riesz–Fischer, 49 Riesz–Schauder theory, 82 Rothe, 279 Rouché, 231 saddle point, 459 Rabinowitz, 464 Sard, 245, 272 separation, 62 Skrypnik, 134, 303–305 Sobolev embedding, 39 Stokes, 225 abstract, 222 Stone–Weierstrass, 31 Taylor, 130 Tietze, 236 trace, 484 uniform boundedness, 57 Weierstrass, 31 approximation, 250 Willem, 517 Zeidler, 339 theory regularity, 483

Riesz–Schauder, 82 Tietze theorem, 236 topological complement, 175 group, 195 space, 24 topology strong operator, 58 weak, 66 total cone, 342 trace, 484 theorem, 484 transcritical bifurcation, 177 transpose of matrix, 11 triangle inequality, 25, 28 triangulation, 396 uniform boundedness principle, 57 convergence, 30 locally, 30 uniformly continuous function, 27 convex space, 65 Lipschitz continuous mapping, 94 unit, approximative, 236 unitary operator, 49 upper solution, 334 Urysohn lemma, 246 value critical, 160 regular, 160 variation Gâteaux, 117 variational principle, 389

vector -valued function, 64 field on manifold, 190 pseudogradient, 436 of outer normal, 220 product, 200 tangent, 182 Volterra integral operator, 86 weak closure, 77 convergence, 65, 68 star, 68 derivative, 38 integral, 110 neighborhood, 66 solution, 319, 483–485 topology, 66 weakly coercive functional, 380 operator, 310 open set, 66 sequentially closed set, 381 continuous functional, 378 lower semi-continuous functional, 377 Weierstrass example, 374 theorem, 31 well-posed equation, 82 winding number, 231 Zorn's lemma, 2

Bibliography [1]

Adams, J.F., Lectures on Lie Groups, W.A. Benjamin, Inc., New York, 1969.

[2]

Adams, R.A., Sobolev spaces, Academic Press, New York, 1975.

[3]

Alexander, J.C., A primer on connectivity, pp. 455–483, in: Fixed Point Theory (Fadell, E. & Fournier, G., eds.), Lecture Notes in Math. 886, Springer Verlag, Berlin–Heidelberg–New York, 1981.

[4]

Amann, H., Ordinary Diﬀerential Equations. An Introduction to Nonlinear Analysis, de Gruyter Stud. in Math. 13, de Gruyter, Berlin–New York, 1990.

[5]

Amann, H. & Weiss, S., “On the uniqueness of the topological degree”, Math. Z. 130 (1973), 39–54.

[6]

Ambrosetti, A. & Prodi, G., A Primer of Nonlinear Analysis, Cambridge Univ. Press, Cambridge, 1993.

[7]

Anane, A., “Simplicit´e et isolation de la premi`ere valeur propre du plaplacien avec poids”, C. R. Math. Acad. Sci. Paris 305 (1987), 725–728.

[8]

Appell, J. & Zabreiko, P.P., Nonlinear Superposition Operators, Cambridge Univ. Press, Cambridge, 1990.

[9]

Arcoya, D. & Orsina, L., “Landesman–Lazer conditions and quasilinear elliptic equations”, Nonlinear Anal. 28 (1997), 1623–1632.

ë

ë

[10] Aubin, J.P. & Ekeland, I., Applied Nonlinear Analysis, Wiley, New York, 1984. [11] Aubin, T., A Course in Diﬀerential Geometry, Amer. Math. Soc., Providence, RI, 2000. [12] Berkovitz, L.D., Optimal Control Theory, Springer Verlag, Berlin–Heidelberg–New York, 1974. [13] B¨ ohme, R., “Die L¨ osung der Verzweigungsprobleme f¨ ur nichtlineare Eigenwertprobleme”, Math. Z. 127 (1972), 105–126. ´ ements de Math´ematique, vol. Livre VI, s´eries Int´egration, [14] Bourbaki, N., El´ ie Hermann et C , Paris, 1952. [15] Bourbaki, N., Groupes et Alg`ebres de Lie, Hermann et Cie , Paris, 1975.

562

Bibliography

[16] Brenner, S.C. & Scott, L.R., The Mathematical Theory of Finite Element Methods, 2nd edition, Springer Verlag, Berlin–Heidelberg–New York, 2002. [17] Br¨ ocker, T. & Dieck, T., Representations of Compact Lie Groups, Springer Verlag, Berlin–Heidelberg–New York, 1985. [18] Browder, F.E., Probl`emes Nonlin´eaires, Universit´e de Montr´eal, Montr´eal, 1966. [19] Browder, F.E., “Nonlinear elliptic boundary value problems and the generalized topological degree”, Bull. Amer. Math. Soc. 76 (1970), 999–1005. [20] Ca˜ nada, A., Dr´ abek, P., & Fonda, A., (eds.), Handbook on Diﬀerential Equations, Ordinary Diﬀerential Equations, vol. 1, Elsevier, North-Holland, Amsterdam, 2004. [21] Cartan, H., Calcul diﬀ´erentiel, Formes diﬀ´erentielles, Hermann et Cie , Paris, 1967. [22] Cesari, L., Hale, J.K., & La Salle, J., (eds.), Dynamical Systems, vol. I, Academic Press, New York, 1976. [23] Chavel, I., Eigenvalues in Riemann Geometry, Academic Press, New York, 1984. [24] Chillingworth, D.R.J., Diﬀerential Topology with View to Applications, Pitman Publ., Harlow, 1976. [25] Chow, S.-N., Li, C., & Wang, D., Normal Forms and Bifurcation of Planar Vector Fields, Cambridge Univ. Press, Cambridge, 1994. [26] Citlanadze, E.S., “Existence theorems for minimax points in Banach spaces”, Tr. Mosk. Mat. Obs. 3 (1953), 235–274, in Russian. [27] Coddington, E.A. & Levinson, N., Theory of Ordinary Diﬀerential Equations, McGraw–Hill, Inc., New York–Toronto–London, 1955. [28] Conway, J.B., A Course in Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1990. [29] Crandall, M. & Rabinowitz, P.H., “Bifurcation from simple eigenvalues”, J. Funct. Anal. 8 (1971), 321–340. [30] Dancer, E.N., “On the structure of solutions of nonlinear eigenvalue problems”, Indiana Univ. Math. J. 23 (1974), 1069–1076. [31] Davies, B. & Safarov, Y., (eds.), Spectral Theory and Geometry, Cambridge Univ. Press, Cambridge, 1999. [32] de Boor, C., A Practical Guide to Splines, Appl. Math. Sci. 
Verlag, Berlin–Heidelberg–New York, 2001.

ë 207, Springer

[33] De Coster, C. & Habets, P., The lower and upper solutions method for boundary value problems, in Ca˜ nada et al. [20], pp. 69–160. [34] Deimling, K., Nonlinear Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1985.

Bibliography

563

[35] Dieudonn´e, J., Foundations of Modern Analysis, Academic Press, New York– London, 1960. [36] Dold, A., Lectures on Algebraic Topology, Springer Verlag, Berlin–Heidelberg–New York, 1982. [37] Doˇsl´ y, O., Halﬂinear diﬀerential equations, in Ca˜ nada et al. [20], pp. 161–357. [38] Dr´ abek, P., “Continuity of Nemyckij’s operator in H¨ older spaces”, Comment. Math. Univ. Carolin. 16 (1975), 1, 37–57. [39] Dr´ abek, P., Solvability and Bifurcations of Nonlinear Equations, Pitman Research Notes Math. Ser. 264, Longman Scientiﬁc & Technical, Harlow, 1992. [40] Dr´ abek, P., Girg, P., & Man´ asevich, R., “Generic Fredholm alternative-type results for one dimensional p-Laplacian”, NoDEA Nonlinear Diﬀerential Equations Appl. 8 (2001), 285–298. [41] Dr´ abek, P., Krejˇc´ı, P., & Tak´ aˇc, P., Nonlinear Diﬀerential Equations, CRC Research Notes Math. 404, Chapman & Hall/CRC, Boca Raton, FL– London–New York–Washington, DC, 1999. [42] Dr´ abek, P., Kufner, A., & Nicolosi, F., Quasilinear Elliptic Equations with Degenerations and Singularities, de Gruyter Ser. Nonlinear Anal. Appl., de Gruyter, Berlin–New York, 1997. [43] Dugundji, J., Topology, Brown Publ., Dubuque, IA, 1989. [44] Dunford, N. & Schwartz, J.T., Linear Operators, vol. I. General Theory, Intersci. Publ., New York–London–Sydney–Toronto, 1958. [45] Dunford, N. & Schwartz, J.T., Linear Operators, vol. II, Intersci. Publ., New York–London–Sydney–Toronto, 1963. [46] Edmunds, D.E. & Evans, W.D., Spectral Theory and Diﬀerential Operators, Oxford Science Publications, Calderon Press, Oxford, 1987. ´ A half-linear second order diﬀerential equation, pp. 153–180, in: [47] Elbert, A., Qualitative Theory of Diﬀerential Equations, vol. I and II (Szeged, 1979), Colloq. Math. Soc. J´ anos Bolyai, 30, North-Holland, Amsterdam, 1981. [48] Evans, L.C., Partial Diﬀerential Equations, Amer. Math. Soc., Providence, RI, 1998. 
[49] Fabian, M., Habala, P., H´ ajek, P., Montesinos, V., Pelant, J., & Zizler, V., Functional Analysis and Inﬁnite-Dimensional Geometry, CMS Books Math. / Ouvrages Math. SMC 8, Springer Verlag, Berlin–Heidelberg–New York, 2001. [50] Fitzpatrick, P.M., “Homotopy, linearization, and bifurcation”, Nonlinear Anal. 12 (1988), 171–184. [51] Flucher, M., Variational Problems with Concentration, Birkh¨ auser, Basel– Boston–Berlin, 1999. [52] Folland, G., A Course in Abstract Harmonic Analysis, Chapman & Hall/ CRC, Boca Raton, FL, 1995.

ë

ë

ë

ë

ë

564

Bibliography

[53] Fuˇc´ık, S., Solvability of Nonlinear Equations and Boundary Value Problems, D. Reidel Publ., Dordrecht, 1980. [54] Fuˇc´ık, S. & Kufner, A., Nonlinear Diﬀerential Equations, Elsevier, Amsterdam–Oxford–New York, 1980. [55] Fuˇc´ık, S. & Neˇcas, J., “Ljusternik–Schnirelmann theorem and nonlinear eigenvalue problems”, Math. Nachr. 53 (1972), 277–289. [56] Fuˇc´ık, S., Neˇcas, J., Souˇcek, J., & Souˇcek, V., Spectral Analysis of Nonlinear Operators, Lecture Notes in Math. 346, Springer Verlag, Berlin–Heidelberg–New York, 1973. [57] Gaines, R.E. & Mawhin, J., Coincide Degree and Nonlinear Diﬀerential Equations, Lecture Notes in Math. 568, Springer Verlag, Berlin–Heidelberg–New York, 1977. [58] Ghoussoub, N., Duality and Perturbation Methods in Critical Point Theory, Cambridge Univ. Press, Cambridge, 1993. [59] Gilbarg, D. & Trudinger, N.S., Elliptic Partial Diﬀerential Equations of Second Order, Springer Verlag, Berlin–Heidelberg–New York, 2001. [60] Goebel, K., “An elementary proof of the ﬁxed-point theorem of Browder and Kirk ”, Michigan Math. J. 16 (1969), 381–383. [61] Greenberg, M., Lectures in Algebraic Topology, W.A. Benjamin, Inc., New York, 1967. [62] Gripenberg, C., Londen, S.O., & Staﬀans, O., Volterra Integral and Functional Equations, Cambridge Univ. Press, Cambridge, 1990. [63] Hale, J., Ordinary Diﬀerential Equations, Wiley, New York – London – Sydney – Toronto, 1969. [64] Halmos, P., Finite-Dimensional Vector Spaces, Van Nostrand, Princeton, NJ, 1960. [65] Hamilton, R.S., “The inverse function theorem of Nash and Moser ”, Bull. Amer. Math. Soc. 7 (1982), 65–222. [66] Helgason, S., Diﬀerential Geometry, Lie Groups, and Symmetric Spaces, Academic Press, New York, 1978. [67] Hirsch, M.W., Diﬀerential Topology, Springer Verlag, Berlin–Heidelberg– New York, 1976. [68] Hlav´ aˇcek, I. & Neˇcas, J., Mathematical Theory of Elastic and Elastoplastic Bodies: An Introduction, Elsevier, Amsterdam, 1981. 
[69] Iz´e, J.A., Bifurcation Theory for Fredholm Operators, Mem. Amer. Math. Soc. 174, Amer. Math. Soc., Providence, RI, 1971. [70] Kaczmarz, S. & Steinhaus, H., Theorie der Orthogonalreihen, Monografje Matematyczne, Paˇ nstwo Wydawnistwo Naukowe, Warszawa–Lwow, 1935. [71] Kakutani, S., “Some characterization of Euclidian space”, Japan. J. Math. 16 (1939), 93–97.

ë

ë

ë

Bibliography

565

[72] Kantorovich, “Functional analysis and applied mathematics”, Uspekhi Mat. Nauk 3 (1948), 3, 89–185 (in Russian).

ë

[73] Kato, T., Perturbation Theory for Linear Operators, Springer Verlag, Berlin– Heidelberg–New York, 1966. [74] Katok, A. & Hasselblatt, B., Introduction to the Modern Theory of Dynamical Systems, Cambridge Univ. Press, 1985. [75] Kelley, J.L., General Topology, Van Nostrand, Princeton, NJ, 1957. [76] Kittel, Ch., Knight, W.D., & Ruderman, M.A., Mechanics, Berkelly Physics Course, vol. I, McGraw–Hill, Inc., New York–San Francisco–Toronto– London, 1965. [77] Kosniowski, C., A First Course in Algebraic Topology, Cambridge Univ. Press, Cambridge, 1980. [78] Krasnoselski, M.A., Topological Methods in the Theory of Nonlinear Integral Equations, Pergamon, Oxford, 1964. [79] Krasnoselski, M.A. & Zabreiko, P.P., Geometric Methods of Nonlinear Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1984. [80] Krawcewicz, W. & Wu, J., Theory of Degrees with Applications to Bifurcations and Diﬀerential Equations, Wiley, New York, 1997. [81] Kˇr´ıˇzek, M. & Neitaanm¨aki, P., Finite Element Approximation of Variational Problems and Applications, Pitman Mon. Surv. Pure Appl. Math. 50, Longman Scientiﬁc & Technical, Harlow, 1990.

ë

[82] Kufner, A., John, O., & Fuˇc´ık, S., Function Spaces, Academia and Noordhoﬀ, Prague and Leyden, 1975. [83] Landesman, E.N. & Lazer, A.C., “Nonlinear perturbations of linear elliptic boundary value problems at resonance”, J. Math. Mech. 19 (1970), 609–623. [84] Leinfelder, H. & Simader, C.G., “Schroedinger operators with singular magnetic vector ﬁelds”, Math. Z. 176 (1981), 1–19. [85] Leray, J. & Lions, J.-L., “Quelques r´esultats de Viˇsik sur les probl`emes elliptiques nonlin´eaires par les m´ethodes de Minty–Browder ”, Bull. Soc. Math. France 93 (1965), 97–107. [86] Lindqvist, P., “On the equation div |∇u|p−2 ∇u + λ|u|p−2 u = 0”, Proc. Amer. Math. Soc. 109 (1990), 157–164. [87] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The locally compact case I and II ”, Ann. Inst. H. Poincar´e Anal. Non Lin´eaire 1 (1984), 109–145 and 223–283. [88] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The limit case I ”, Rev. Mat. Iberoamericana 1 (1985), 145–201. [89] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The limit case II ”, Rev. Mat. Iberoamericana 2 (1985), 45–121.

566

Bibliography

[90] Lusternik, L. & Sobolev, V., Elements of Functional Analysis, “Nauka”, Moscow, 1965 (in Russian; English translation published in Wiley, New York 1974). [91] Mawhin, J., Topological Degree Methods in Nonlinear Boundary Value Problems, Regional Conference Series in Mathematics 40, Amer. Math. Soc., Providence, RI, 1979. [92] Mawhin, J., “A simple approach to Brouwer degree based on diﬀerential forms”, Adv. Nonlinear Stud. 4 (2004), 535–548. [93] Maz’ja, V.G., Sobolev spaces, Springer Verlag, Berlin–Heidelberg–New York, 1985. [94] Michlin, S.G., Variationsmethoden in der Mathematischen Physik, Deutscher Verlag der Wissenschaften, Berlin, 1960. [95] Milnor, J., Topology from the Diﬀerentiable Viewpoint, Univ. of Virginia Press, Charlottesville, 1965. [96] Milota, J. & Petzeltov´ a, H., “An existence theorem for semilinear functional ˇ parabolic equations”, Casopis Pˇest. Mat. 110 (1985), 274–288. [97] Moser, J., “A rapidly convergent iteration method and nonlinear partial differential equations I, II ”, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 20 (1966), 265–315, 499–535. [98] Nash, J., “The imbedding problem for Riemann manifolds”, Ann. of Math. 63 (1956), 20–63. [99] Neˇcas, J., Les m´ethodes directes en th´eorie des ´equations elliptiques, Masson et Cie , Paris, 1967. [100] Nirenberg, L., Topics in Nonlinear Functional Analysis, New York Univ., Courant Inst. Math. Sci., New York, 1974. [101] Palis, J. & De Melo, W., Geometry Theory of Dynamical Systems, Springer Verlag, Berlin–Heidelberg–New York, 1982. [102] Protter, M.H. & Weinberger, H.F., Maximum Principle in Diﬀerential Equations, Prentice Hall, Englewood Cliﬀs, NJ, 1967. [103] Rabinowitz, P.H., A global theorem for nonlinear eigenvalue problems and applications, pp. 11–36, in: Contribution in Nonlinear Functional Analysis (Zarantonello, E.H., ed.), Academic Press, New York, 1971. [104] Rabinowitz, P.H., “Some global results for nonlinear eigenvalue problems”, J. Funct. Anal. 7 (1971), 487–513. 
[105] Rabinowitz, P.H., Minimax Methods in Critical Point Theory with Applications to Diﬀerential Equations, Amer. Math. Soc., Providence, RI, 1986. [106] Reed, M. & Simon, B., Methods of Modern Mathematical Physics, vol. IV, Academic Press, New York, 1978. [107] Rektorys, K., Variational Methods in Mathematics, Science and Engineering, 2nd edition, D. Reidel Publ., Dordrecht, 1990.

ë

Bibliography

567

[108] Robinson, D.W., Elliptic Operators and Lie Groups, Oxford Univ. Press, Oxford, 1991. [109] Rockafellar, R.T., Convex Analysis, Princeton Univ. Press, Princeton, NJ, 1970. [110] Rosenberg, S., The Laplacian on a Riemann Manifold, London Math. Soc., London, 1991. [111] Rothe, E.H., Introduction to Various Aspects of Degree Theory in Banach Spaces, Amer. Math. Soc., Providence, RI, 1986. [112] Rudin, W., Functional Analysis, McGraw–Hill, Inc., New York, 1973. [113] Rudin, W., Real and Complex Analysis, McGraw–Hill, Inc., New York, 1974. [114] Ruelle, D., Elements of Diﬀerentiable Dynamics and Bifurcation Theory, Academic Press, New York, 1989. [115] Runst, T. & Sickel, W., Sobolev Spaces of Fractional Order, Nemytskij Operators and Nonlinear Partial Diﬀerential Equations, De Gruyter Series in Nonlinear Analysis and Applications 3, de Gruyter, Berlin–New York, 1996. [116] Saaty, T.L., Modern Nonlinear Equations, McGraw–Hill, Inc., New York– Toronto–London–Sydney, 1967. [117] Sard, A., “The measure of the critical set values of diﬀerential mappings”, Bull. Amer. Math. Soc. 48 (1942), 883–890. [118] Schwartz, J.T., Nonlinear Functional Analysis, Gordon & Breach, New York–London–Paris, 1969. [119] Sehgal, “A ﬁxed point theorem for mapping with a contractive iterate”, Proc. Amer. Math. Soc. 23 (1969), 631–634. [120] Singer, I., Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces, Springer Verlag, Berlin–Heidelberg–New York, 1970. [121] Skrypnik, I.V., Nonlinear Elliptic Boundary Value Problems, Teubner, Leipzig, 1986. [122] Spanier, E., Algebraic Topology, McGraw–Hill, Inc., New York, 1966. [123] Stein, E.M., Singular Integrals and Diﬀerentiability Properties of Functions, Princeton Univ. Press, Princeton, NJ, 1970. [124] Sternberg, S., Lectures on Diﬀerential Geometry, Prentice Hall, Englewood Cliﬀs, NJ, 1964. [125] Stoer, J. 
& Bulirsch, R., Introduction to Numerical Analysis, 3rd edition, Texts in Applied Mathematics 12, Springer Verlag, Berlin–Heidelberg– New York, 2002. [126] Tak´ aˇc, P., “A short elementary proof of the Krein–Rutman theorem”, Houston J. Math. 20 (1994), 1, 93–98. [127] Taylor, M., Partial Diﬀerential Equations, vol. I, series Basic Theory, Springer Verlag, Berlin–Heidelberg–New York, 1996.

ë

ë

ë

568

Bibliography

[128] Triebel, H., Theory of Function Spaces, vol. I, Birkh¨ auser, Basel, 1983. [129] Triebel, H., Theory of Function Spaces, vol. II, Birkh¨ auser, Basel, 1992. [130] Vejvoda, O. et al., Periodic Solutions of Partial Diﬀerential Equations: Time Periodic Solutions, Sijthoﬀ Noordhoﬀ, The Netherlands, 1981. [131] Walter, W., Ordinary Diﬀerential Equations, Graduate Texts in Mathematics, Springer Verlag, Berlin–Heidelberg–New York, 1998. [132] Whitney, H., “A function not constant on a connected set of critical points ”, Duke Math. J. 1 (1935), 514–517. [133] Whitney, H., Geometric Integration Theory, Princeton Univ. Press, Princeton, NJ, 1957. [134] Willem, M., Minimax Theorems, Birkh¨ auser, Boston–Basel–Berlin, 1966. [135] Yosida, K., Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1965. [136] Zeidler, E., Nonlinear Functional Analysis and Its Applications, vol. I, II/A, II/B, III and IV, Springer Verlag, Berlin–Heidelberg–New York, 1986. [137] Ziemer, W.P., Weakly Diﬀerentiable Functions: Sobolev Spaces and Functions of Bounded Variation, Springer Verlag, Berlin–Heidelberg–New York, 1989.

Typeset by LaTeX 2ε with AMS fonts and BibTeX. Figures were sketched using PSTricks (with the aid of Mathematica) and Matlab.
