HANDBOOK OF APPLIED ANALYSIS
For other titles published in this series, go to www.springer.com/series/5613
Advances in Mechanics and Mathematics VOLUME 19 Series Editors David Y. Gao (Virginia Polytechnic Institute and State University) Ray W. Ogden (University of Glasgow)
Advisory Board Ivar Ekeland (University of British Columbia, Vancouver) Tim Healey (Cornell University, USA) Kumbakonam Rajagopal (Texas A&M University, USA) Tudor Ratiu (École Polytechnique Fédérale, Lausanne) David J. Steigmann (University of California, Berkeley)
Aims and Scope Mechanics and mathematics have been complementary partners since Newton’s time, and the history of science shows much evidence of the beneficial influence of these disciplines on each other. The discipline of mechanics, for this series, includes relevant physical and biological phenomena such as: electromagnetic, thermal, quantum effects, biomechanics, nanomechanics, multiscale modeling, dynamical systems, optimization and control, and computational methods. Driven by increasingly elaborate modern technological applications, the symbiotic relationship between mathematics and mechanics is continually growing. The increasingly large number of specialist journals has generated a complementarity gap between the partners, and this gap continues to widen. Advances in Mechanics and Mathematics is a series dedicated to the publication of the latest developments in the interaction between mechanics and mathematics and intends to bridge the gap by providing interdisciplinary publications in the form of monographs, graduate texts, edited volumes, and a special annual book consisting of invited survey articles.
HANDBOOK OF APPLIED ANALYSIS
By NIKOLAOS S. PAPAGEORGIOU National Technical University, Athens, Greece SOPHIA TH. KYRITSI-YIALLOUROU Hellenic Naval Academy, Piraeus, Greece
N.S. Papageorgiou National Technical University Department of Mathematics 157 80 Athens Zografou Campus Greece
[email protected]
S. Th. Kyritsi-Yiallourou Hellenic Naval Academy Military Institute of University Education Leoforos Chatzikyriakou 185 39 Piraeus Greece
[email protected]
Series Editors: David Y. Gao Department of Mathematics Virginia Polytechnic Institute and State University Blacksburg, VA 24061
[email protected]
Ray W. Ogden Department of Mathematics University of Glasgow Glasgow, Scotland, UK
[email protected]
ISBN 978-0-387-78906-4 e-ISBN 978-0-387-78907-1 DOI 10.1007/b120946 Springer Dordrecht Heidelberg London New York Library of Congress Control Number: 2009927202 Mathematics Subject Classification (2000): 34xx, 35xx, 46xx, 47xx, 49xx, 90xx, 91xx © Springer Science+Business Media, LLC 2009 All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
To the memory of my father S. Th. K.
The mathematical sciences particularly exhibit order, symmetry and limitation; and these are the greatest forms of the beautiful. Aristotle Metaphysica, 3M, 1078b
Series Preface
As any human activity needs goals, mathematical research needs problems. —David Hilbert Mechanics is the paradise of mathematical sciences. —Leonardo da Vinci
Mechanics and mathematics have been complementary partners since Newton’s time, and the history of science shows much evidence of the beneficial influence of these disciplines on each other. Driven by increasingly elaborate modern technological applications, the symbiotic relationship between mathematics and mechanics is continually growing. However, the increasingly large number of specialist journals has generated a duality gap between the partners, and this gap is growing wider. Advances in Mechanics and Mathematics (AMMA) is intended to bridge the gap by providing multidisciplinary publications that fall into the two following complementary categories: 1. An annual book dedicated to the latest developments in mechanics and mathematics; 2. Monographs, advanced textbooks, handbooks, edited volumes, and selected conference proceedings. The AMMA annual book publishes invited and contributed comprehensive research and survey articles within the broad area of modern mechanics and applied mathematics. The discipline of mechanics, for this series, includes relevant physical and biological phenomena such as: electromagnetic, thermal,
and quantum effects, biomechanics, nanomechanics, multiscale modeling, dynamical systems, optimization and control, and computation methods. Especially encouraged are articles on mathematical and computational models and methods based on mechanics and their interactions with other fields. All contributions will be reviewed so as to guarantee the highest possible scientific standards. Each chapter will reflect the most recent achievements in the area. The coverage should be conceptual, concentrating on the methodological thinking that will allow the nonspecialist reader to understand it. Discussion of possible future research directions in the area is welcome. Thus, the annual volumes will provide a continuous documentation of the most recent developments in these active and important interdisciplinary fields. Chapters published in this series could form bases from which possible AMMA monographs or advanced textbooks could be developed. Volumes published in the second category contain review/research contributions covering various aspects of the topic. Together these will provide an overview of the state-of-the-art in the respective field, extending from an introduction to the subject right up to the frontiers of contemporary research. Certain multidisciplinary topics, such as duality, complementarity, and symmetry in mechanics, mathematics, and physics are of particular interest. The Advances in Mechanics and Mathematics series is directed to all scientists and mathematicians, including advanced students (at the doctoral and postdoctoral levels) at universities and in industry who are interested in mechanics and applied mathematics.
David Y. Gao Ray W. Ogden
Preface
The aim of this book is to present the basic modern aspects of nonlinear analysis and then to illustrate their use in different applied problems. Nonlinear analysis was born from the need to deal with nonlinear equations which arise in various problems of science, engineering, and economics and which are often notoriously difficult to solve. On a theoretical level, nonlinear analysis is a remarkable mixture of various areas of mathematics such as topology, measure theory, functional analysis, nonsmooth analysis, and multivalued analysis. On an applied level, nonlinear analysis provides the necessary tools to formulate and study realistic and accurate models describing various phenomena in different areas of the physical sciences, engineering, and economics. For this reason, the theoretically inclined nonmathematician (physicist, engineer, or economist) needs to have a working knowledge of at least some of the basic aspects of nonlinear analysis. This knowledge can help him build good models for the phenomena he studies, study them in detail, and extract from them important information which is crucial to the design process. As a consequence, nonlinear analysis has acquired an interdisciplinary character and is a prerequisite for many nonmathematicians who wish to investigate their problems in detail with the greatest possible generality. This leads to a continuously increasing need for books that survey this large area of mathematical analysis and present its applications. There should be no misunderstanding. The subject is vast, it touches many different areas of mathematics, and its applications cover several other fields in science and engineering. In this volume, we make an effort to present the basic theoretical aspects and the main applications of nonlinear analysis. Of course the treatment is not exhaustive; such a project would require several volumes. Nevertheless, we believe that we touch the main parts of the theory and of the applications. Mathematicians and nonmathematicians alike can find in this volume material that covers their interests and can be useful in their research and/or teaching. Chapter 1 begins with the calculus of smooth and nonsmooth functions. We present the Gâteaux and Fréchet derivatives and develop their calculus in
full detail. In the direction of nonsmooth functions, first we deal with convex functions, for which we develop a duality theory and a theory of subdifferentiation. Subsequently, we generalize to locally Lipschitz functions (Clarke's theory). We also introduce and study related geometrical concepts (such as tangent and normal cones) for various kinds of sets. Finally we investigate a kind of variational convergence of functions, known as Γ-convergence, which is suitable for the stability (sensitivity) analysis of variational problems. In Chapter 2, we use the tools of the previous chapter in order to study extremal and optimal control problems. We begin with a detailed study of the notion of lower semicontinuity of functions. Next we examine constrained minimization problems and develop the method of Lagrange multipliers. This leads to minimax theorems, saddle points, and the theory of KKM-multimaps. Section 2.4 deals with some modern aspects of the direct method, which involve the so-called variational principles, central among them being the so-called "Ekeland variational principle". The last two sections deal with the calculus of variations and optimal control. In optimal control, we focus on existence theorems, relaxation, and the necessary conditions for optimality (Pontryagin's maximum principle). Chapter 3 deals with some important families of nonlinear maps and examines their uses in fixed point theory. We start with compact and Fredholm operators, which are the natural generalizations of finite rank maps. Subsequently we pass to operators of monotone type and to accretive operators. Monotone operators exhibit remarkable surjectivity properties which play a central role in the existence theory of nonlinear boundary value problems. Accretive operators are closely related to the generation theory of linear and nonlinear semigroups. We investigate this connection. Then we introduce the Brouwer degree (finite-dimensional) and the Leray–Schauder and Browder–Skrypnik degrees (the latter for operators of monotone type), which are infinite-dimensional. Having these degree maps, we can move to fixed point theory. We deal with metric fixed points and topological fixed points, and investigate the interplay between order and fixed point theory. Chapter 4 presents the main aspects of critical point theory, which is the basic tool in the so-called "variational method" in the study of nonlinear boundary value problems. We start with minimax theorems describing the critical values of a C¹-functional. Then we present the Ljusternik–Schnirelmann theory for multiple critical points of nonlinear homogeneous maps. In this way we have all the necessary tools to develop the spectral properties of the Laplacian and of the p-Laplacian (under Dirichlet, Neumann, and periodic boundary conditions). Then, using the Lagrange multiplier method, we deal with abstract eigenvalue problems. Finally we present some basic notions and results from bifurcation theory. Chapter 5 uses the tools developed in Chapters 3 and 4 in order to study nonlinear boundary value problems (involving ordinary differential equations and elliptic partial differential equations). First we illustrate the variational method based on the minimax principles of critical point theory and then
we present the method of upper and lower solutions and the degree-theoretic method. Subsequently we consider nonlinear eigenvalue problems, for which we produce constant sign and nodal (sign changing) solutions. Then we prove maximum and comparison principles involving the Laplacian and p-Laplacian differential operators. Finally we deal with periodic Hamiltonian systems. We consider the problem of prescribed minimal period and the problem of a prescribed energy level. For both we prove existence theorems. In Chapter 6, we deal with the properties of maps whose values are sets (multifunctions or set-valued maps). We introduce and study their continuity (Section 6.1) and measurability (Section 6.2) properties. Then for such multifunctions (continuous or measurable), we investigate whether they admit continuous or measurable selectors (Michael's theorem and the Kuratowski–Ryll-Nardzewski and Yankov–von Neumann–Aumann selection theorems). This leads to the study of the sets of integrable selectors of a multifunction, which in turn permits a detailed set-valued integration. The notion of decomposability (an effective substitute of convexity) plays a central role in this direction. Then we prove fixed point theorems for multifunctions and also study Carathéodory multifunctions. Finally we introduce and study various notions of convergence of sets that arise naturally in applications. In Chapter 7, we consider applications to problems of mathematical economics. We consider both static and dynamic models. We start with the static model of an exchange economy, assuming that perfect competition prevails, which is modelled by a continuum (a nonatomic measure space) of agents. We prove a "core–Walras equivalence theorem" and we also establish the existence of Walras allocations. We then turn our attention to growth models (dynamic models). First we deal with an infinite horizon, discrete-time, multisector growth model and we establish the existence of optimal programs for both discounted and undiscounted models. For the latter, we use the notion of "weak maximality". Then we determine the asymptotic properties of optimal programs via weak and strong turnpike theorems. We then examine uncertain growth models and optimal programs for both nonstationary discounted and stationary undiscounted models. We also characterize them using a price system. Continuous-time discounted models are then considered, and finally we characterize choice behavior consistent with the "Expected Utility Hypothesis". Chapter 8 deals with deterministic and stochastic games, which provide a substantial generalization of some of the notions considered in the previous chapter. We start with noncooperative n-player games, for which we introduce the notion of "Nash equilibrium". We show the existence of such equilibria. Then we consider cooperative n-player games, for which we define the notion of "core" and show its nonemptiness. We continue with random games with a continuum of players and an infinite-dimensional strategy space. For such games, we prove the existence of "Cournot–Nash equilibria". We also study corresponding Bayesian games. Subsequently, using the formalism of dynamic programming, we consider stochastic, 2-player, zero-sum
games. Finally, using approximate subdifferentials for convex functions, we produce approximate Nash equilibria for noncooperative games with noncompact strategy sets. Chapter 9 studies how information can be incorporated as a variable in various decision models (in particular in ones with an asymmetric information structure). First we present the mathematical framework which allows the analytical treatment of the notion of information. For this purpose, we define two comparable metric topologies, which we study in detail. Then we examine the ex-post view and the ex-ante view in the modelling of systems with uncertainty. In both cases we establish the continuity of the model in the information variable. Subsequently, we introduce a third mode of convergence of information and study prediction sequences. We also study games with incomplete information and games with a general state space and an unbounded cost function. The final chapter (Chapter 10) deals with evolution equations and the mathematical tools associated with them. These tools are developed in the first section, and central among them is the notion of "evolution triple". We consider semilinear evolutions, which we study using the semigroup method. We then move on to nonlinear evolutions. We consider evolutions driven by subdifferential operators (this class of problems incorporates variational inequalities) and problems with operators of monotone type, defined within the framework of an evolution triple. The first class is treated using nonlinear semigroup theory, while the second requires Galerkin approximations. We conclude with an analogous study of second-order evolutions. The treatment of all subjects is rigorous and every chapter ends with an extensive survey of the literature. We hope that mathematicians and nonmathematicians alike will find something interesting and useful for their needs in this volume. Acknowledgments: We would like to thank Professor D. Y. Gao for recommending this book for publication in the Advances in Mechanics and Mathematics (AMMA) series. Many thanks are also due to Elizabeth Loew and Jessica Belanger for their patience and professional assistance.
Athens, September 2008
Nikolaos S. Papageorgiou Sophia Th. Kyritsi–Yiallourou
Contents
Series Preface ............................................................. ix
Preface .................................................................... xi

1  Smooth and Nonsmooth Calculus ............................................ 1
   1.1 Gâteaux and Fréchet Derivatives ...................................... 2
   1.2 Convex Functions ..................................................... 12
   1.3 Locally Lipschitz Functions .......................................... 27
   1.4 Variational Geometry ................................................. 35
   1.5 Γ-Convergence ........................................................ 45
   1.6 Remarks .............................................................. 60

2  Extremal Problems and Optimal Control .................................... 63
   2.1 Lower Semicontinuity ................................................. 64
   2.2 Constrained Minimization Problems .................................... 72
   2.3 Saddle Points and Duality ............................................ 79
   2.4 Variational Principles ............................................... 89
   2.5 Calculus of Variations ............................................... 98
   2.6 Optimal Control ...................................................... 111
   2.7 Remarks .............................................................. 144

3  Nonlinear Operators and Fixed Points ..................................... 147
   3.1 Compact and Fredholm Operators ....................................... 148
   3.2 Monotone and Accretive Operators ..................................... 162
   3.3 Degree Theory ........................................................ 196
   3.4 Metric Fixed Points .................................................. 224
   3.5 Topological Fixed Points ............................................. 237
   3.6 Order and Fixed Points ............................................... 245
   3.7 Remarks .............................................................. 263

4  Critical Point Theory and Variational Methods ............................ 267
   4.1 Critical Point Theory ................................................ 268
   4.2 Ljusternik–Schnirelmann Theory ....................................... 289
   4.3 Spectrum of the Laplacian and of the p-Laplacian ..................... 297
   4.4 Abstract Eigenvalue Problems ......................................... 334
   4.5 Bifurcation Theory ................................................... 340
   4.6 Remarks .............................................................. 349

5  Boundary Value Problems–Hamiltonian Systems .............................. 351
   5.1 Variational Method ................................................... 352
   5.2 Method of Upper–Lower Solutions ...................................... 380
   5.3 Degree-Theoretic Methods ............................................. 404
   5.4 Nonlinear Eigenvalue Problems ........................................ 421
   5.5 Maximum and Comparison Principles .................................... 432
   5.6 Periodic Solutions for Hamiltonian Systems ........................... 440
   5.7 Remarks .............................................................. 450

6  Multivalued Analysis ..................................................... 455
   6.1 Continuity of Multifunctions ......................................... 456
   6.2 Measurability of Multifunctions ...................................... 470
   6.3 Continuous and Measurable Selectors .................................. 475
   6.4 Decomposable Sets and Set-Valued Integration ......................... 487
   6.5 Fixed Points and Carathéodory Multifunctions ......................... 504
   6.6 Convergence of Sets .................................................. 514
   6.7 Remarks .............................................................. 524

7  Economic Equilibrium and Optimal Economic Planning ....................... 529
   7.1 Perfectly Competitive Economies: Core and Walras Equilibria .......... 530
   7.2 Infinite Horizon Multisector Growth Models ........................... 542
   7.3 Turnpike Theorems .................................................... 558
   7.4 Stochastic Growth Models ............................................. 567
   7.5 Continuous-Time Growth Models ........................................ 589
   7.6 Expected Utility Hypothesis .......................................... 599
   7.7 Remarks .............................................................. 606

8  Game Theory .............................................................. 609
   8.1 Noncooperative Games–Nash Equilibrium ................................ 610
   8.2 Cooperative Games .................................................... 617
   8.3 Cournot–Nash Equilibria for Random Games ............................. 624
   8.4 Bayesian Games ....................................................... 628
   8.5 Stochastic Games ..................................................... 632
   8.6 Approximate Equilibria ............................................... 644
   8.7 Remarks .............................................................. 649

9  Uncertainty, Information, Decision Making ................................ 651
   9.1 Mathematical Space of Information .................................... 652
   9.2 The ex-Post View ..................................................... 662
   9.3 The ex-ante View ..................................................... 668
   9.4 Convergence of σ-Fields and Prediction Sequences ..................... 673
   9.5 Games with Incomplete Information .................................... 680
   9.6 Markov Decision Chains with Unbounded Costs .......................... 684
   9.7 Remarks .............................................................. 689

10 Evolution Equations ...................................................... 691
   10.1 Lebesgue–Bochner Spaces ............................................. 692
   10.2 Semilinear Evolution Equations ...................................... 701
   10.3 Nonlinear Evolution Equations ....................................... 718
   10.4 Second-Order Nonlinear Evolution Equations .......................... 740
   10.5 Remarks ............................................................. 750

References .................................................................. 753
List of Symbols ............................................................. 781
Index ....................................................................... 783
1 Smooth and Nonsmooth Calculus
Summary. In this chapter we provide an overview of the main aspects of the smooth and nonsmooth calculus. We start with the smooth calculus and we introduce the notions of Gâteaux and Fréchet derivatives. We develop all the important calculus rules and we present the implicit and inverse function theorems together with their most important consequences. Then we pass to nonsmooth functions, starting with convex functions. For such functions, we develop the duality theory using the notion of conjugate functions and then we present the subdifferential theory. Subsequently, we extend the subdifferential theory to locally Lipschitz functions (Clarke's theory). Using these theories, we also examine the local geometry of convex and nonconvex sets. Finally we present and study in detail a notion of convergence of functions distinct from pointwise convergence, known as Γ-convergence, which is designed to deal with the stability analysis of variational problems.
Introduction In this chapter we present an overview of the basic aspects of the smooth and nonsmooth calculus. This chapter provides all the basic tools to approach the theoretical and applied subjects in the chapters that follow. In Section 1.1 we present the basic items from the classical smooth calculus of functions defined on a Banach space. So we introduce the notions of Gâteaux and Fréchet derivatives and then study their properties. We develop mean value theorems and chain rules, we consider partial derivatives for functions defined on product spaces, and we conclude with two basic theorems that are important in analysis and differential geometry, namely the implicit function theorem and the inverse function theorem. In Section 1.2 we deal with convex functions defined on a Banach space. Convex functions provide the link that connects the smooth calculus with the nonsmooth one. Continuous convex functions have remarkable differentiability properties on certain Banach spaces, and in RN they possess a derivative under conditions that are weaker than the usual ones. For lower semicontinuous convex functions that are not differentiable, we can introduce a multivalued substitute of the derivative, known as the (convex) subdifferential of the function. In Section 1.2 we study the basic
properties of the subdifferential and simultaneously we develop the duality theory for convex functions which is closely related to the subdifferential theory. In Section 1.3 we extend the notion of subdifferential to nonconvex functions. This is achieved with the use of locally Lipschitz functions. For such functions we define a subdifferential operator, known as the generalized subdifferential, and develop the basic rules of the corresponding calculus. Section 1.4 deals with the local geometry of nonsmooth convex and nonconvex sets. First we deal with convex sets that are not smooth and introduce for them substitutes of the well-known tangent and normal planes of differential geometry. These are the tangent and normal cones. Then with the help of the generalized subdifferential of the distance function (which is Lipschitz continuous) we extend the notions of tangency and normalcy to nonsmooth nonconvex sets. Finally in Section 1.5, we introduce a convergence of functions, distinct from the pointwise convergence, which is designed in such a way that makes it the right tool in the stability (sensitivity) analysis of variational problems. This convergence is known as the Γ-convergence and in Section 1.5 we present the basic results concerning it. We also introduce the so-called Mosco convergence of functions, which is more suitable when dealing with problems defined in a reflexive Banach space.
1.1 Gâteaux and Fréchet Derivatives

Let X, Y be Banach spaces and f : X −→ Y. In what follows, by L(X, Y) we denote the Banach space of bounded linear operators from X into Y. First we introduce a weak form of differentiability.

DEFINITION 1.1.1 We say that f is Gâteaux differentiable at x ∈ X, if there exists L ∈ L(X, Y) such that

lim_{λ→0} [f(x + λh) − f(x)]/λ = Lh for all h ∈ X.

The operator L ∈ L(X, Y) is called the Gâteaux derivative of f at x ∈ X and is denoted by f′(x). We say that f is Gâteaux differentiable if it is Gâteaux differentiable at every x ∈ X.

REMARK 1.1.2 The Gâteaux derivative is a directional derivative and so it is essentially a one-dimensional concept. Indeed, if for fixed x, h ∈ X we introduce ϕ(λ) = f(x + λh), λ ∈ R, then (d/dλ)ϕ(λ)|_{λ=0} = f′(x)h. Evidently, if the Gâteaux derivative exists at x, then it is unique.

EXAMPLE 1.1.3 (a) If A ∈ L(X, Y), then A′(x) = A for all x ∈ X.
(b) If f = (f_k)_{k=1}^m : Rⁿ −→ Rᵐ, then f′(x) = [(∂f_i/∂x_j)(x)]_{i=1,...,m; j=1,...,n} ∈ R^{m×n} (the Jacobian matrix).

We can prove a mean value theorem for Gâteaux differentiable functionals ϕ : X −→ R.
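Before turning to the mean value theorem, here is a quick illustration of Definition 1.1.1 (this computation is ours, not part of the original text): let H be a real Hilbert space with inner product (·,·)_H and ϕ(x) = ‖x‖²_H. For any x, h ∈ H,
\[
\frac{\varphi(x+\lambda h)-\varphi(x)}{\lambda}
=\frac{2\lambda (x,h)_H+\lambda^{2}\|h\|_H^{2}}{\lambda}
=2(x,h)_H+\lambda\|h\|_H^{2}\longrightarrow 2(x,h)_H \quad\text{as }\lambda\to 0,
\]
so ϕ is Gâteaux differentiable with ϕ′(x)h = 2(x, h)_H, that is, L = 2(x, ·)_H ∈ L(H, R).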
PROPOSITION 1.1.4 If f : X −→ R is Gâteaux differentiable and x, h ∈ X, then there exists 0 < t < 1 such that f(x + h) − f(x) = f′(x + th)h.

PROOF: Let ϕ(λ) = f(x + λh), λ ∈ R. Then (d/dλ)ϕ(λ) = f′(x + λh)h. So, invoking the mean value theorem for real functions, we can find 0 < t < 1 such that ϕ(1) − ϕ(0) = (d/dλ)ϕ(λ)|_{λ=t}, hence f(x + h) − f(x) = f′(x + th)h.

If Y is not equal to R, then the mean value theorem is not valid.

EXAMPLE 1.1.5 Consider the function f : R −→ R² defined by f(t) = (cos t, sin t).

In this case the mean value theorem takes an inequality form. Namely we have the following.

PROPOSITION 1.1.6 If f : X −→ Y is Gâteaux differentiable, x, h ∈ X and y* ∈ Y*, then there exists 0 < t < 1 such that ⟨y*, f(x + h) − f(x)⟩_Y = ⟨y*, f′(x + th)h⟩_Y and

‖f(x + h) − f(x)‖_Y ≤ ‖f′(x + th)‖_L ‖h‖_X.

PROOF: Consider the function ϕ : X −→ R defined by ϕ(x) = ⟨y*, f(x)⟩_Y. Then ϕ′(x)h = ⟨y*, f′(x)h⟩_Y. By virtue of Proposition 1.1.4, we can find 0 < t < 1 such that ϕ(x + h) − ϕ(x) = ⟨y*, f′(x + th)h⟩_Y. Because y* ∈ Y* was arbitrary, we can choose y* ∈ Y* with ‖y*‖_{Y*} = 1 such that |⟨y*, f(x + h) − f(x)⟩_Y| = ‖f(x + h) − f(x)‖_Y. So from the first part of the proof, we have ‖f(x + h) − f(x)‖_Y = |⟨y*, f′(x + th)h⟩_Y| ≤ ‖f′(x + th)‖_L ‖h‖_X (recall that ‖y*‖_{Y*} = 1).
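To see concretely why Example 1.1.5 violates the equality form of the mean value theorem (this verification is ours): with f(t) = (cos t, sin t) we have
\[
f(2\pi)-f(0)=(0,0),\qquad f'(t)=(-\sin t,\cos t),\qquad \|f'(t)\|_{\mathbb{R}^2}=1\ \text{for all }t,
\]
so there is no t ∈ (0, 2π) with f(2π) − f(0) = 2π f′(t); only the inequality of Proposition 1.1.6 survives, and indeed ‖f(2π) − f(0)‖ = 0 ≤ 2π‖f′(t)‖.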
Now we introduce the second, stronger form of differentiability.

DEFINITION 1.1.7 We say that f : X −→ Y is Fréchet differentiable at x ∈ X, if there exists L ∈ L(X, Y) such that

f(x + h) − f(x) = Lh + r(x, h), where ‖r(x, h)‖_Y/‖h‖_X −→ 0 as ‖h‖_X −→ 0.

The operator L ∈ L(X, Y) is called the Fréchet derivative of f at x and it is denoted by f′(x). It is clear from the context which derivative we mean when we write f′(x). We say that f is Fréchet differentiable if it is Fréchet differentiable at every x ∈ X.
REMARK 1.1.8 The Fréchet derivative, when it exists, is unique. It is clear from the above definition that Fréchet differentiability implies Gâteaux differentiability. The converse is not in general true.

EXAMPLE 1.1.9 Consider the function f : R² −→ R defined by

f(x₁, x₂) = x₁⁶ / (x₁⁸ + (x₂ − x₁²)²) if (x₁, x₂) ≠ (0, 0), and f(x₁, x₂) = 0 if (x₁, x₂) = (0, 0).

Then it is easy to see that f is Gâteaux differentiable at the origin with zero Gâteaux derivative. On the other hand, moving along the curve (u, u²), we see that

|f(u, u²)| / ‖(u, u²)‖_{R²} = 1 / (u² ‖(u, u²)‖_{R²}) −→ +∞ as u −→ 0,

and so f is not Fréchet differentiable at x = 0.

The next proposition gives a situation where the two notions coincide.

PROPOSITION 1.1.10 If f : X −→ Y is Gâteaux differentiable on some open set U ⊆ X, x ∈ U and the map u −→ f′(u) is continuous at x ∈ X from X into L(X, Y) furnished with the operator norm topology, then f is also Fréchet differentiable at x ∈ X and the two derivatives coincide.

PROOF: Let r(x, h) = f(x + h) − f(x) − f′(x)h and y* ∈ Y*. Using Proposition 1.1.6 we can find 0 < t < 1 such that ⟨y*, r(x, h)⟩_Y = ⟨y*, f′(x + th)h − f′(x)h⟩_Y. As before, we choose y* ∈ Y* with ‖y*‖ = 1 such that ‖r(x, h)‖_Y = |⟨y*, r(x, h)⟩_Y|,

⇒ ‖r(x, h)‖_Y ≤ ‖f′(x + th) − f′(x)‖_L ‖h‖_X,
⇒ ‖r(x, h)‖_Y / ‖h‖_X ≤ ‖f′(x + th) − f′(x)‖_L −→ 0 as ‖h‖_X −→ 0.
This proves the Fréchet differentiability of f at x and that the two derivatives coincide.

Existence of the Fréchet derivative at x ∈ X implies continuity of f at x ∈ X.

PROPOSITION 1.1.11 If f is Fréchet differentiable at x ∈ X, then f is continuous at x ∈ X.

PROOF: From Definition 1.1.7, we see that there is δ > 0 such that

‖f(x + h) − f(x) − f′(x)h‖_Y ≤ ‖h‖_X, if ‖h‖_X ≤ δ.

Therefore we have

‖f(x + h) − f(x)‖_Y ≤ (1 + ‖f′(x)‖_L)‖h‖_X, if ‖h‖_X ≤ δ,

from which we conclude the continuity of f at x ∈ X.
REMARK 1.1.12 The result is not true if Fréchet differentiability is replaced by Gâteaux differentiability. The function in Example 1.1.9 is Gâteaux differentiable at the origin but it is not continuous there.

EXAMPLE 1.1.13 Let X = C([0, b]) (equipped with the supremum norm) and g ∈ C¹(R). Consider the map f(u) = ∫₀ᵇ g(u(t)) dt for all u ∈ C([0, b]). Then f : X −→ R is Fréchet differentiable and

f′(u)h = ∫₀ᵇ g′(u(t)) h(t) dt for all h ∈ C([0, b]).
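A brief sketch of why the formula in Example 1.1.13 holds (the argument is ours, it is not spelled out in the text): by the classical mean value theorem, for each t there is ϑ_t ∈ (0, 1) with
\[
g\big(u(t)+h(t)\big)-g\big(u(t)\big)-g'\big(u(t)\big)h(t)
=\Big[g'\big(u(t)+\vartheta_t h(t)\big)-g'\big(u(t)\big)\Big]h(t),
\]
and since g′ is uniformly continuous on the bounded set {s ∈ R : |s| ≤ ‖u‖_∞ + 1}, the bracket tends to 0 uniformly in t as ‖h‖_∞ → 0. Integrating over [0, b] gives |f(u + h) − f(u) − ∫₀ᵇ g′(u(t))h(t) dt| = o(‖h‖_∞), which is exactly Fréchet differentiability with the stated derivative.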
We have the following chain rule for derivatives.

PROPOSITION 1.1.14 If X, Y, Z are Banach spaces, f : X −→ Y is Gâteaux differentiable at x and g : Y −→ Z is Fréchet differentiable at f(x), then ξ = g ∘ f : X −→ Z is Gâteaux differentiable at x and

ξ′(x) = g′(f(x)) f′(x).

Moreover, if f is Fréchet differentiable at x ∈ X, then ξ is Fréchet differentiable at x ∈ X.

PROOF: If λ ≠ 0, then we have

(1/|λ|) ‖ξ(x + λh) − ξ(x) − λ g′(f(x)) f′(x)h‖_Z
≤ (1/|λ|) ‖g(f(x + λh)) − g(f(x)) − g′(f(x))(f(x + λh) − f(x))‖_Z
+ (1/|λ|) ‖g′(f(x))(f(x + λh) − f(x) − λ f′(x)h)‖_Z.     (1.1)

Due to the Gâteaux differentiability of f at x, we have

(1/|λ|) ‖g′(f(x))(f(x + λh) − f(x) − λ f′(x)h)‖_Z −→ 0 and ‖f(x + λh) − f(x)‖_Y −→ 0 as λ −→ 0.     (1.2)

Then for any λ ∈ R such that f(x + λh) ≠ f(x), due to the Fréchet differentiability of g at f(x), we have

(1/|λ|) ‖g(f(x + λh)) − g(f(x)) − g′(f(x))(f(x + λh) − f(x))‖_Z
= [‖g(f(x + λh)) − g(f(x)) − g′(f(x))(f(x + λh) − f(x))‖_Z / ‖f(x + λh) − f(x)‖_Y] × [‖f(x + λh) − f(x)‖_Y / |λ|] −→ 0 as λ −→ 0.     (1.3)

Returning to (1.1), passing to the limit as λ −→ 0, and using (1.2) and (1.3), we conclude that g ∘ f is Gâteaux differentiable at x ∈ X. In a similar fashion we can show that g ∘ f is Fréchet differentiable at x ∈ X, if f is Fréchet differentiable at x.
DEFINITION 1.1.15 We say that f : X −→ Y is continuously Fréchet differentiable at x ∈ X, if f is Fréchet differentiable at x ∈ X and u −→ f′(u) is continuous at x ∈ X from X into L(X, Y) furnished with the operator norm topology. If this is true at every x ∈ X, we simply say that f is continuously Fréchet differentiable and write f ∈ C¹(X, Y) (or simply f ∈ C¹(X) if Y = R).

Now suppose Y = Y₁ × Y₂ with Y₁, Y₂ Banach spaces. It is well known that Y can be given many different equivalent norms which make it a Banach space (e.g., ‖(y₁, y₂)‖₁ = ‖y₁‖_{Y₁} + ‖y₂‖_{Y₂} or ‖(y₁, y₂)‖_∞ = max{‖y₁‖_{Y₁}, ‖y₂‖_{Y₂}}). The space L(X, Y₁ × Y₂) is isometrically isomorphic to the product space L(X, Y₁) × L(X, Y₂). Let p₁ : Y −→ Y₁ and p₂ : Y −→ Y₂ be the corresponding canonical projection operators. We know that p₁ ∈ L(Y, Y₁) and p₂ ∈ L(Y, Y₂). If f : X −→ Y = Y₁ × Y₂, then f₁ = p₁ ∘ f and f₂ = p₂ ∘ f are the components of f; that is, f = (f₁, f₂). If i₁ : Y₁ −→ Y and i₂ : Y₂ −→ Y are the canonical injections, then f = i₁ ∘ f₁ + i₂ ∘ f₂. Recall that i₁ ∈ L(Y₁, Y) and i₂ ∈ L(Y₂, Y).

PROPOSITION 1.1.16 A function f : X −→ Y = Y₁ × Y₂ is Fréchet differentiable at x ∈ X if and only if f₁ and f₂ are Fréchet differentiable at x ∈ X. We also have f′(x) = (f₁′(x), f₂′(x)) with f₁′(x) = p₁ ∘ f′(x), f₂′(x) = p₂ ∘ f′(x).

PROOF: Suppose f is Fréchet differentiable at x. Then from Proposition 1.1.14, we have that f₁ and f₂ are both Fréchet differentiable at x ∈ X, f₁′(x) = p₁ ∘ f′(x), f₂′(x) = p₂ ∘ f′(x), and f′(x) = (f₁′(x), f₂′(x)).

Conversely, suppose that f₁ and f₂ are Fréchet differentiable at x ∈ X. Endow Y = Y₁ × Y₂, for example, with the norm ‖(y₁, y₂)‖_Y = max{‖y₁‖_{Y₁}, ‖y₂‖_{Y₂}} and for all h ∈ X set ξ(h) = (f₁′(x)h, f₂′(x)h). Then for every u ∈ X, we have

‖f(u) − f(x) − ξ(u − x)‖_Y = max_{k∈{1,2}} ‖f_k(u) − f_k(x) − f_k′(x)(u − x)‖_{Y_k} = o(‖u − x‖_X),

where o(‖u − x‖_X)/‖u − x‖_X −→ 0 as u −→ x in X. This shows that f is Fréchet differentiable at x ∈ X and f′(x) = (f₁′(x), f₂′(x)).

Next suppose that X = X₁ × X₂ with X₁, X₂ Banach spaces and f : X −→ Y, where Y is another Banach space. Let x = (x₁, x₂) ∈ X. The function u₁ −→ f(u₁, x₂) is called the first partial function associated with f at x and u₂ −→ f(x₁, u₂) is called the second partial function associated with f at x. We can consider the Fréchet derivatives of these partial functions. These are called partial derivatives of f and are denoted by f′_{u₁}(x) and f′_{u₂}(x). Then in a straightforward manner, we can show the following result.
PROPOSITION 1.1.17 If f : X = X₁ × X₂ −→ Y is Fréchet differentiable at x = (x₁, x₂) ∈ X, then the two partial functions associated with f at x are Fréchet differentiable at x₁ ∈ X₁ and at x₂ ∈ X₂, respectively, and we have

f′(x)h = f′_{x₁}(x)h₁ + f′_{x₂}(x)h₂ for all h = (h₁, h₂) ∈ X₁ × X₂.
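As an illustration of Proposition 1.1.17 (this example is ours): let B : X₁ × X₂ −→ Y be a bounded bilinear map and f(x₁, x₂) = B(x₁, x₂). Since
\[
B(x_1+h_1,\,x_2+h_2)-B(x_1,x_2)=B(h_1,x_2)+B(x_1,h_2)+B(h_1,h_2),
\qquad \|B(h_1,h_2)\|_Y\le \|B\|\,\|h_1\|_{X_1}\|h_2\|_{X_2},
\]
f is Fréchet differentiable with f′(x)(h₁, h₂) = B(h₁, x₂) + B(x₁, h₂), and the two summands are precisely the partial derivatives f′_{x₁}(x)h₁ and f′_{x₂}(x)h₂.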
REMARK 1.1.18 Clearly the result can be extended to an arbitrary finite product X = X₁ × · · · × X_n, n ≥ 2. The converse of Proposition 1.1.17 is not in general true. Namely, the existence of the partial derivatives of f : X = X₁ × X₂ −→ Y at x = (x₁, x₂) ∈ X does not imply the differentiability of f at x ∈ X. The next example illustrates this.

EXAMPLE 1.1.19 Let f : R² −→ R be defined by

f(x₁, x₂) = x₁x₂ / (x₁² + x₂²) if (x₁, x₂) ≠ (0, 0), and f(x₁, x₂) = 0 if (x₁, x₂) = (0, 0).
Then (∂f/∂x₁)(0, 0) = (∂f/∂x₂)(0, 0) = 0, but f is not continuous at (0, 0), hence a fortiori it is not Fréchet differentiable there (see Proposition 1.1.11).

PROPOSITION 1.1.20 If {X_k}_{k=1}^n, Y are Banach spaces and f : X = X₁ × · · · × X_n −→ Y, then f ∈ C¹(X, Y) if and only if the n partial derivatives f′_{x_k}(x) exist at every x ∈ X and x −→ f′_{x_k}(x) is continuous from X into L(X_k, Y) with the operator norm topology, for all 1 ≤ k ≤ n.

PROOF: We do the proof for n = 2. The general case follows easily by induction. First we assume that f ∈ C¹(X, Y). From Proposition 1.1.17 we know that f′_{x₁}(x), f′_{x₂}(x) exist for all x ∈ X and we have f′_{x₁}(x)h₁ = f′(x)(h₁, 0), f′_{x₂}(x)h₂ = f′(x)(0, h₂) for all x ∈ X, h₁ ∈ X₁, h₂ ∈ X₂. From this it follows at once that x −→ f′_{x_k}(x) is continuous from X into L(X_k, Y) with the operator norm topology for k = 1, 2.

Now suppose that the partial derivatives f′_{x_k} exist on X and are continuous. For h = (h₁, h₂) ∈ X = X₁ × X₂, let ξ(x)h = f′_{x₁}(x)h₁ + f′_{x₂}(x)h₂. Then for all u = (u₁, u₂) ∈ X, we have

‖f(u) − f(x) − ξ(x)(u − x)‖_Y ≤ ‖f(u₁, u₂) − f(x₁, u₂) − f′_{x₁}(x₁, x₂)(u₁ − x₁)‖_Y + ‖f(x₁, u₂) − f(x₁, x₂) − f′_{x₂}(x₁, x₂)(u₂ − x₂)‖_Y.     (1.4)

Then given ε > 0, we can find δ > 0 such that ‖u₂ − x₂‖_{X₂} ≤ δ implies

‖f(x₁, u₂) − f(x₁, x₂) − f′_{x₂}(x₁, x₂)(u₂ − x₂)‖_Y ≤ (ε/2)‖u₂ − x₂‖_{X₂}.     (1.5)

Also from Proposition 1.1.6, we have
‖f(u₁, u₂) − f(x₁, u₂) − f′_{x₁}(x₁, x₂)(u₁ − x₁)‖_Y ≤ sup_{z₁∈[x₁,u₁]} ‖f′_{x₁}(z₁, u₂) − f′_{x₁}(x₁, x₂)‖_L ‖u₁ − x₁‖_{X₁},     (1.6)

where [x₁, u₁] = {tx₁ + (1 − t)u₁ : 0 ≤ t ≤ 1}.
By hypothesis, we can find δ₁ ≤ δ such that max{‖z₁ − x₁‖_{X₁}, ‖u₂ − x₂‖_{X₂}} ≤ δ₁ implies

‖f′_{x₁}(z₁, u₂) − f′_{x₁}(x₁, x₂)‖_L ≤ ε/2.     (1.7)

Using (1.5) through (1.7) in (1.4), we see that if ‖u − x‖_X = max{‖u₁ − x₁‖_{X₁}, ‖u₂ − x₂‖_{X₂}} ≤ δ₁, then

‖f(u) − f(x) − ξ(x)(u − x)‖_Y ≤ (ε/2)‖u₂ − x₂‖_{X₂} + (ε/2)‖u₁ − x₁‖_{X₁} ≤ ε max{‖u₂ − x₂‖_{X₂}, ‖u₁ − x₁‖_{X₁}} = ε‖u − x‖_X.

This proves the Fréchet differentiability of f at x and f′(x)h = ξ(x)h = f′_{x₁}(x)h₁ + f′_{x₂}(x)h₂ for all h = (h₁, h₂) ∈ X₁ × X₂.

COROLLARY 1.1.21 If Y is a Banach space and f : Rⁿ −→ Y, then f ∈ C¹(Rⁿ, Y) if and only if the n partial derivatives (∂f/∂x_k)(x) are continuous from Rⁿ into Y for all k = 1, . . . , n.
In this last part of the section we prove the implicit function theorem and the inverse function theorem, which are important in analysis and differential geometry. The proof of the implicit function theorem depends on the Banach fixed point theorem. Here we simply state the result and postpone its proof until Section 3.4, where the result is formulated and proved in a more general setting.

PROPOSITION 1.1.22 If X is a Banach space, C ⊆ X is a nonempty closed subset, and f : C −→ C satisfies

‖f(x) − f(z)‖_X ≤ k‖x − z‖_X
for all x, z ∈ C with k < 1,
(1.8)
then there exists a unique x ∈ C such that x = f(x) (a fixed point of f). Moreover, if Y is another Banach space, U ⊆ Y is nonempty and open, and we have a family of functions f(u) : C −→ C, u ∈ U, satisfying (1.8) with k < 1 independent of u ∈ U, then the unique fixed point x(u) ∈ C (i.e., x(u) = f(u)(x(u)) for all u ∈ U) depends continuously on u ∈ U.

Using this proposition we can prove the so-called implicit function theorem. This theorem is about equations of the form f(x, y) = 0. It tells us under what conditions this equation defines a function y = g(x) (we say that y = g(x) is defined implicitly) and we would like to compute dy/dx. Given such a function f, in general we cannot solve for y explicitly, and so it is important to know that such a function g indeed exists without having to solve for it.
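Before the theorem, a standard one-variable illustration (ours, not from the text): take f(x, y) = x² + y² − 1 and (x₀, y₀) = (0, 1). Then f(x₀, y₀) = 0 and f′_y(x₀, y₀) = 2y₀ = 2 ≠ 0, so near (0, 1) the equation f(x, y) = 0 implicitly defines y = g(x) = √(1 − x²), and the formula of the theorem gives
\[
g'(x) = -\,\frac{f'_x\big(x,g(x)\big)}{f'_y\big(x,g(x)\big)} = -\,\frac{2x}{2g(x)} = -\,\frac{x}{\sqrt{1-x^{2}}}.
\]
Near (0, −1) one obtains instead the other branch y = −√(1 − x²), while near (±1, 0), where f′_y vanishes, no local resolution of the form y = g(x) exists.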
THEOREM 1.1.23 (Implicit Function Theorem) If X, Y, Z are Banach spaces, U ⊆ X × Y is a nonempty open set, (x₀, y₀) ∈ U, f ∈ C¹(U, Z), f′_y(x₀, y₀) ∈ L(Y, Z) is an isomorphism, and f(x₀, y₀) = 0, then there exist open neighborhoods V of x₀ and W of y₀ with V × W ⊆ U and a function g ∈ C¹(V, W) such that

f(x, g(x)) = 0 for all x ∈ V     (1.9)

and

g′(x) = −f′_y(x, g(x))⁻¹ ∘ f′_x(x, g(x)) for all x ∈ V.     (1.10)

Moreover, for every x ∈ V, g(x) is the unique solution in W of (1.9).

PROOF: For notational convenience, we set S₀ = f′_y(x₀, y₀). Evidently the equation f(x, y) = 0 is equivalent to

y = y − S₀⁻¹ f(x, y) = F(x, y).
(1.11)
The significance of this straightforward reformulation is that on F we can apply Proposition 1.1.22. Because S0−1 ◦ S0 = IdY , we have
see (1.11) . F (x, y1 ) − F (x, y2 ) = S0−1 S0 (y1 − y2 )− f (x, y1 ) − f (x, y2 ) Because of the hypotheses on f , we can find δ1 > 0 and r > 0 such that if x−x0 X ≤ δ1 and y1 − y0 Y ≤ r, y2 − y0 Y ≤ r (i.e., y1 − y2 Y ≤ 2r), then F (x, y1 ) − F (x, y2 )Z ≤
1 y1 − y2 Y . 2
(1.12)
Also we can find 0 < δ < δ1 such that, if x − x0 X ≤ δ, then F (x, y0 ) − F (x0 , y0 )Z ≤
r . 2
(1.13)
Therefore, if x − x0 X ≤ δ and y − y0 Y ≤ r, then F (x, y) − y0 Z = F (x, y) − F (x0 , y0 )Z ≤ F (x, y) − F (x, y0 )Z + F (x, y0 ) − F (x0 , y0 )Z 1 r
≤ y − y0 Y + see (1.12) and (1.13) 2 2 ≤ r. ¯r (y0 ) = {y ∈ Y : y − y0 Y ≤ r} onto itself Hence F (x, y) maps the closed ball B and similarly the open ball Br (y0 ) = {y ∈ Y : y − y0 Y < r} onto itself. Applying Proposition 1.1.22 to the parametric family of functions y −→ F (x, y) x∈B¯ (x ) 0 δ ¯δ (x0 ) we can find a unique y = y(x) ∈ B ¯r (y0 ) such (see (1.12)), for every x ∈ B that F (x, y) = y; hence f (x, y) = 0 (which proves (1.9) and x −→ g(x) = y(x) is continuous. Let V = Bδ (x0 ) and W = Br (y0 ). Choosing enough, we can guarantee that V × W ⊆ U . We claim
δ, r > 0 small that g ∈ C 1 Bδ (x0 ), Y . To show this let (x1 , y1 ) ∈ V × W, y1 = g(x1 ) (recall that G(x, ·) maps W onto itself). By virtue of the differentiability of f at (x1 , y1 ), we have f (x, y) = K(x − x1 ) + L(y − y1 ) + u(x, y)
for all (x, y) ∈ U,
(1.14)
where K = fx (x1 , y1 ) and L = fy (x1 , y1 ) and lim
(x,y)→(x1 ,y1 )
u(x, y)Z = 0. (x − x1 , y − y1 )X×Y
(1.15)
Inasmuch as f x, g(x) = 0 for all x ∈ V , from (1.14) we obtain
g(x) = y1 − L−1 K(x − x1 ) − L−1 u x, g(x)
for all x ∈ V.
(1.16)
Because of (1.15), we can find η > 0 such that if x−x1 X ≤ η and y−y1 X ≤ η, then 1 (x − x1 X + y − y1 Y ) 2L−1 L
1 (x − x1 X + g(x) − g(x1 )Y ), ⇒ u x, g(x) Z ≤ 2L−1 L u(x, y)Z ≤
(1.17)
for all x ∈ V . From (1.16) and (1.17) it follows that 1 g(x) − g(x1 )Y 2 with ϑ = 2L−1 KL + 1.
g(x) − g(x1 )Y ≤ L−1 KL x − x1 X + ⇒ g(x) − g(x1 )Y ≤ ϑx − x1 X
(1.18)
Let ξ(x) = L−1 u x, g(x) , x ∈ V . From (1.16), we have g(x) − g(x1 ) = L−1 K(x − x1 ) + ξ(x).
(1.19)
Note that ξ(x)Y ≤L−1 KL u x, g(x) Z , x∈V and lim
x→x1
u x, g(x) Z =0 x − x1 X
Hence we infer that lim
x→x1
(see (1.15) and (1.16)).
ξ(x)Y = 0. x − x1 X
(1.20)
Combining (1.19) and (1.20), we conclude that g is Fr´echet differentiable at x1 ∈ V and (1.21) g (x1 ) = −L−1 K which proves (1.10). Moreover, it is clear from (1.21) that g ∈ C 1 (V, W ).
REMARK 1.1.24 A special case of interest is when X = Rⁿ and Y = Z = Rᵐ. Then the function f has m components f_k : (x₁, . . . , x_n, y₁, . . . , y_m) −→ f_k(x₁, . . . , x_n, y₁, . . . , y_m), 1 ≤ k ≤ m, which are C¹ R-valued functions of the (n + m) variables x₁, . . . , x_n, y₁, . . . , y_m. Then

f′_y(x₀, y₀) = [(∂f_k/∂y_l)(x₀, y₀)]_{k,l=1}^m ∈ R^{m×m}.
Let ak = fk (x0 , y0 ), k ∈ {1, . . . , m}. Then according to Theorem 1.1.23, if detfy (x0 , y0 ) = 0, then the system of equations fk (x1 , . . . , xn , y1 , . . . , ym ) = ak , k ∈ {1, . . . , m} for every value of the parameter x = (xi )n i=1 ∈ V has a unique solution y = g(x) 1 with (x, y) ∈ U and g = (gk )m k=1 ∈ C (V, W ). So we have
fk x1 , . . . , xn , g1 (x1 , . . . , xn ), . . . , gm (x1 , . . . , xn ) = ak , k ∈ {1, . . . , m}. As a consequence of Theorem 1.1.23 we obtain the inverse function theorem. This theorem (as its name suggests), is about invertibility If
of functions. n f : Rn −→ Rn has a nonzero Jacobian determinant at x (i.e., det ∂fi ∂xj i,j=1 = 0), then f (x) ∈ L(Rn , Rn ) is an isomorphism . From the fact that the best linear approximation is invertible, we would like to conclude the invertibility of f . In general this is not possible. Consider the simple case of n = 1; that is, f : R −→ R. If f ∈ C 1 (R) and f (x) = 0, then we can only guarantee the invertibility of f near x, because f still has nonzero slope in a neighborhood of x. Therefore our main concern is local invertibility. THEOREM 1.1.25 (Inverse Function Theorem) If X, Y are Banach spaces, U ⊆ Y is a nonempty open set, f ∈ C 1 (U, X), y0 ∈ U , and f (y0 ) ∈ L(Y, X) is an isomorphism, then there exists a neighborhood U of y0 , U ⊆ U and V a neighborhood of x0 = f (y0 ) such that f : U −→ V is a C 1 -diffeomorphism (i.e., both f and f −1 are C 1 maps) and (f −1 ) (x0 ) = f (y0 )−1 . PROOF: Let F (x, y) = f (y) − x. Then Fy (x0 , y0 ) = f (y0 ) ∈ L(X, Y ) which by hypothesis is an isomorphism. So we can apply Theorem 1.1.23 and obtain 1 a neighborhood V of x
0 and g ∈ C (V , Y ) such that g(V ) ⊆ U 0 with U0 a neighborhood of y0 , F x, g(x) = 0 for all x ∈ V (i.e., f x, g(x) = x for all x ∈ V ) and f restricted on g(V ). Because g(x0 ) = y0 . In what follows we consider f x, g(x) = x, we see that g is injective on V , hence a bijection from V onto g(V ). Moreover, the set g(V ) = f −1 (V ) is open in Y because f ∈ C 1 (U, X). Set U = g(V ). We have that f : U −→ V is a bijection. Also from Theorem 1.1.23 we have
g′(x₀) = −(F′_y(x₀, g(x₀)))⁻¹ F′_x(x₀, g(x₀)),

hence f′(y₀) ∘ g′(x₀) = Id_X. Therefore we conclude that g′(x₀) = (f⁻¹)′(x₀) = f′(y₀)⁻¹.
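A classical illustration of why the invertibility in Theorem 1.1.25 is only local (this example is ours): let f : (0, ∞) × R −→ R² \ {0} be the polar-coordinate map f(r, θ) = (r cos θ, r sin θ). Its Jacobian determinant is
\[
\det f'(r,\theta)=\det\begin{pmatrix}\cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta\end{pmatrix}=r\neq 0,
\]
so f is a C¹-diffeomorphism near every point; yet f is not globally injective, since f(r, θ) = f(r, θ + 2π). Thus the conclusion of the theorem (and of Corollary 1.1.26 below, which assumes injectivity) cannot in general be made global.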
COROLLARY 1.1.26 If X, Y are Banach spaces, U ⊆ Y a nonempty open set, f ∈ C 1 (U, X), f is injective, and for every y ∈ U f (y) ∈ L(Y, X) is an isomorphism, then f (U ) is open in X and f is a C 1 -diffeomorphism (i.e., both f and f −1 are C 1 maps) from U onto f (U ). REMARK 1.1.27 Suppose X = Rn , Y = Rm , U ⊆ Rm nonempty open and f ∈ C 1 (U, Rn ) with f (y0 ) = x0 . Suppose that m ≤ n and rankf (y0 ) = m (note that f (y0 ) ∈ Rn×m ). Then we can find U ⊆ U a neighborhood of y0 , V a neighborhood of x0 , and a differentiable function g : V −→ U such that g ◦f = i with i : Rm −→ Rn
being the canonical injection map (i.e., i(u₁, . . . , u_m) = (u₁, . . . , u_m, 0, . . . , 0)). If m ≥ n and rank f′(y₀) = n, then we can find a neighborhood U′ ⊆ U of y₀ and a differentiable function g : U′ −→ U such that g(y₀) = y₀ and f ∘ g = p_{Rⁿ}, where p_{Rⁿ} : Rᵐ −→ Rⁿ is the canonical projection (i.e., p_{Rⁿ}(u₁, . . . , u_m) = (u₁, . . . , u_n)).

We conclude this section with a well-known result about the Nemytski operator (see, e.g., Gasiński–Papageorgiou [259]).

PROPOSITION 1.1.28 (a) If (Ω, Σ, µ) is a complete σ-finite measure space, X, Y are separable Banach spaces, and f : Ω × X −→ Y is a Carathéodory function (i.e., Σ-measurable in ω ∈ Ω and continuous in x ∈ X) such that

‖f(ω, x)‖_Y ≤ α(ω) + c‖x‖_X^{p/r}, p, r ∈ [1, ∞), α ∈ Lʳ(Ω), c > 0,

then x −→ N_f(x)(·) = f(·, x(·)) is continuous and bounded from Lᵖ(Ω, X) into Lʳ(Ω, Y).
(b) If f : Ω × R −→ R is a Carathéodory function,

F(ω, x) = ∫₀ˣ f(ω, s) ds and ψ(u) = ∫_Ω F(ω, u(ω)) dµ

for all u ∈ Lᵖ(Ω), then ψ ∈ C¹(Lᵖ(Ω)) and ψ′(u) = N_f(u).
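A standard special case of Proposition 1.1.28(b) (this worked example is ours): take f(ω, s) = |s|^{p−2}s with 2 ≤ p < ∞. Then |f(ω, s)| = |s|^{p−1} = |s|^{p/r} with r = p′ = p/(p − 1), so the growth condition of part (a) holds with α ≡ 0, and
\[
F(\omega,x)=\int_0^{x}|s|^{p-2}s\,ds=\tfrac{1}{p}|x|^{p},
\qquad
\psi(u)=\tfrac{1}{p}\int_\Omega |u(\omega)|^{p}\,d\mu=\tfrac{1}{p}\|u\|_{L^{p}}^{p},
\]
so the proposition asserts that u ↦ (1/p)‖u‖^p_{L^p} is C¹ on Lᵖ(Ω) with ψ′(u) = |u|^{p−2}u ∈ L^{p′}(Ω), a formula that is standard in the study of the p-Laplacian.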
1.2 Convex Functions

Throughout this section X is a locally convex vector space. Additional hypotheses are introduced as needed. Also, we consider functions that may take the value +∞. For this reason we introduce the set R̄ = R ∪ {+∞}.

DEFINITION 1.2.1 Let ϕ : X −→ R̄. The effective domain of ϕ is the set dom ϕ = {x ∈ X : ϕ(x) < +∞}. We say that ϕ is proper if dom ϕ ≠ ∅. The epigraph of ϕ is the set epi ϕ = {(x, λ) ∈ X × R : ϕ(x) ≤ λ}. The function ϕ is said to be lower semicontinuous (or closed) if for every λ ∈ R, L_λ = {x ∈ X : ϕ(x) ≤ λ} is closed. We say that ϕ is convex if for all x₁, x₂ ∈ dom ϕ and all λ ∈ [0, 1], we have

ϕ(λx₁ + (1 − λ)x₂) ≤ λϕ(x₁) + (1 − λ)ϕ(x₂).

If this inequality is strict when x₁ ≠ x₂ and λ ∈ (0, 1), then we say that ϕ is strictly convex. The cone of proper, convex, and lower semicontinuous functions is denoted by Γ₀(X). If C ⊆ X, then the indicator function of C is defined by

i_C(x) = 0 if x ∈ C, and i_C(x) = +∞ otherwise.
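To make Definition 1.2.1 concrete (this small example is ours): let X be a Banach space and C = B̄₁ = {x ∈ X : ‖x‖ ≤ 1}. Then
\[
\operatorname{dom} i_C = C,\qquad \operatorname{epi} i_C = C\times[0,+\infty),
\]
and since C is nonempty, closed, and convex, i_C is proper, lower semicontinuous, and convex, i.e., i_C ∈ Γ₀(X). Minimizing a function ϕ over C is the same as minimizing ϕ + i_C over all of X, which is how constraints are typically absorbed into the convex-analytic framework.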
REMARK 1.2.2 It is easy to check that ϕ is lower semicontinuous if and only if epi ϕ is closed, if and only if

ϕ(x) = lim inf_{z→x} ϕ(z) = sup_{U∈N(x)} inf_{z∈U} ϕ(z), where
N (x) is the filter of neighborhoods of x. Also ϕ is convex if and only if epi ϕ is convex in X × R. Moreover, if ϕ : X −→ R is convex, then for every λ ∈ R, the set Lλ = {x ∈ X : ϕ(x) ≤ λ} is convex. The converse is not true. Think, for example, of ϕ(x) = |x|, x ∈ R. If C ⊆ X is nonempty, closed (and convex), then iC is proper, lower semicontinuous (and convex, i.e., iC ∈ Γ0 (X)). What about continuity of convex functions? The next theorem summarizes the situation in this respect. THEOREM 1.2.3 If ϕ : X −→ R is a proper, convex function, then the following statements are equivalent. (a) ϕ is bounded above in a neighborhood of x0 ∈ X. (b) ϕ is continuous at x0 ∈ X. (c) int epi ϕ = ∅. (d) int dom ϕ = ∅ and ϕint dom ϕ is continuous. Moreover, if one of the above statements holds, then int epi ϕ = {(x, λ) ∈ X × R : x ∈ int dom ϕ, ϕ(x) < λ}.
PROOF: (a)⇒(b): Suppose that U is a neighborhood of x₀ such that ϕ|_U ≤ c for some c > 0. By translating things if necessary, without any loss of generality we may assume that x₀ = 0 and ϕ(0) = 0. Let 0 < ε ≤ c and set V_ε = (ε/c)U ∩ (−(ε/c)U). Then V_ε is a symmetric neighborhood of the origin. We show that

|ϕ(x)| ≤ ε for all x ∈ V_ε,     (1.22)

which of course implies the continuity of ϕ at x₀ = 0. So let x ∈ V_ε. Then (c/ε)x ∈ U and, by virtue of the convexity of ϕ and because ϕ(0) = 0, we have

ϕ(x) ≤ (ε/c) ϕ((c/ε)x) + (1 − ε/c) ϕ(0) ≤ (ε/c) c = ε.     (1.23)

Similarly −(c/ε)x ∈ U and we have

0 = ϕ(0) ≤ (1/(1 + ε/c)) ϕ(x) + ((ε/c)/(1 + ε/c)) ϕ(−(c/ε)x) ≤ (1/(1 + ε/c)) ϕ(x) + ((ε/c)/(1 + ε/c)) c,
⇒ −ε ≤ ϕ(x).     (1.24)
From (1.23) and (1.24), it follows that (1.22) holds and so ϕ is continuous at x0 = 0. (b)⇒(a): The continuity of ϕ at x0 implies that ϕ is bounded above in a neighborhood of x0 ∈ X.
(a)⇒(c): As before, let U be a neighborhood of x0 such that ϕU ≤ c. Then U ⊆ int dom ϕ and {(x, λ) ∈ X × R : x ∈ U, c < λ} ⊆ epi ϕ which implies that int epi ϕ = ∅. (c)⇒(a): Let (x, λ) ∈ int epi ϕ. Then we can find U a neighborhood of x and ε > 0 such that U × [λ − ε, λ + ε] ⊆ epi ϕ. Then U × {λ} ⊆ epi ϕ and so we have ϕ(x) ≤ λ for all x ∈ U , which is statement (a). (a)⇒(d): Again without any loss of generality, we may assume that x0 = 0. Let U be a neighborhood of x0 such that ϕU is bounded above. Then U ⊆ dom ϕ and so int dom ϕ = ∅. Clearly the set dom ϕ is convex. So if x ∈ int dom ϕ, we can find λ > 1 such that v = λx ∈ dom ϕ. Set 1 V =x+ 1− U, λ
which is a neighborhood of x. If z ∈ V , we have z = x + 1 − λ1 u with u ∈ U and so exploiting the convexity of ϕ, we have 1 1 1 1 ϕ(z) = ϕ v+ 1− u ≤ ϕ(v) + 1 − ϕ(u) λ λ λ λ 1 1 c = c0 . ≤ ϕ(v) + 1 − λ λ Therefore ϕV is bounded above and so it is continuous at x ∈ int dom ϕ (recall that we have already proved (a)⇔(b)). (d)⇒(a): Obvious. Now let E = {(x, λ) ∈ X × R : x ∈ int dom ϕ, ϕ(x) < λ}. Evidently int epi ϕ ⊆ E. Let v ∈ int dom ϕ such that ϕ(v) < λ. Choose ϕ(v) < µ < λ. Because ϕint dom ϕ is continuous we can find U a neighborhood of v such that U ⊆ int dom ϕ and ϕ(v) < µ for all x ∈ U . Therefore (v, λ) ∈ U × (µ, +∞) ⊆ int epi ϕ and so E ⊆ int epi ϕ. REMARK 1.2.4 Note that int dom ϕ = {x ∈ X : there exists λ ∈ R such that (x, λ) ∈ int epi ϕ}. In finite-dimensional spaces the situation simplifies. PROPOSITION 1.2.5 If X is finite-dimensional and ϕ : X −→ R is convex, then ϕ is continuous on int dom ϕ. PROOF: If x ∈ int dom ϕ, then we can find δ > 0 and {ek }N k=0 ⊆ X (N = dim X) such that Bδ (x) ⊆ conv{ek }N k=0 ⊆ dom ϕ. So, if v ∈ Bδ (x), then we can find {tk }N k=0 ⊆ [0, 1] such that
1.2 Convex Functions x=
N
tk ek
with
k=0
N
15
tk = 1.
k=0
Because of the convexity of ϕ, we have ϕ(x) ≤
N
tk ϕ(ek ) ≤ max ϕ(ek ) = c. 0≤k≤N
k=0
Invoking Theorem 1.2.3, we conclude that ϕint dom ϕ is continuous.
In infinite dimensions, we need an additional condition on ϕ in order to have continuity on int dom ϕ. PROPOSITION 1.2.6 If X is a Banach space and ϕ : X −→ R is convex and lower semicontinuous, then ϕint dom ϕ is continuous.
PROOF: Note that dom ϕ =
{ϕ ≤ n}. Due to the lower semicontinuity of ϕ,
n≥1
the sets {ϕ ≤ n} are closed. So, if x ∈ int dom ϕ, then by the Baire category theorem we can find n ≥ 1 such that int{ϕ < n} = ∅ and ϕ(x) < n. Let u ∈ int{ϕ < n} and introduce the function
ξ(λ) = ϕ x + λ(u − x) , λ > 0. ¯δu−x (x) ⊆ dom ϕ. It Because x ∈ int dom ϕ, we can find δ > 0 such that B X follows that [−δ, δ] ⊆ dom ξ and so 0 ∈ int dom ξ. Hence by Proposition 1.2.5 ξ is continuous at zero. Because ξ(0) < n, we can find ε > 0 such that ξ(λ) < n for all λ ∈ [−ε, 0]. We set y = x − ε(u − x). Then we have y ∈ {ϕ < n} and u ∈ int{ϕ < n}. Inasmuch as x ∈ [u, y) = (1 − t)u + ty : 0 ≤ t < 1 , it follows that x ∈ int dom {ϕ < n} and so by Theorem 1.2.3 ϕ is continuous at x. In fact we can identify the kind of continuity of ϕ on int dom ϕ. THEOREM 1.2.7 If X is a Banach space and ϕ : X −→ R is a convex function that is continuous at x0 ∈ dom ϕ, then there exists r > 0 such that ϕ is Lipchitz ¯r (x0 ). continuous on B PROOF: The continuity of ϕ at x0 implies that there exist r, δ, c > 0 such that |ϕ(y)| ≤ c
¯r+δ (x0 ). for all y ∈ B
Let x, y ∈ B̄_r(x0) with x ≠ y. We set
u = y + δ (y − x)/‖y − x‖_X   and   λ = ‖y − x‖_X / (‖y − x‖_X + δ) ∈ (0, 1).
Then u ∈ B̄_{r+δ}(x0) and y = λu + (1 − λ)x. So, because ϕ is convex,
ϕ(y) ≤ λϕ(u) + (1 − λ)ϕ(x),
⇒ ϕ(y) − ϕ(x) ≤ λ(ϕ(u) − ϕ(x)) ≤ (‖y − x‖_X / (‖y − x‖_X + δ)) 2c ≤ (2c/δ) ‖y − x‖_X.
Interchanging the roles of x, y in the above argument, we conclude that
|ϕ(y) − ϕ(x)| ≤ (2c/δ) ‖y − x‖_X   for all x, y ∈ B̄_r(x0).
COROLLARY 1.2.8 If X is a Banach space and ϕ : X −→ R is a continuous convex function, then ϕ is locally Lipschitz. Thus far we have concentrated our attention on the continuity properties of convex functions. Now we turn our attention to the differentiability properties that lead to the subdifferential. DEFINITION 1.2.9 Let X be a normed space, ϕ : X −→ R a convex function, and x0 ∈ dom ϕ. The directional derivative of ϕ at x0 in the direction h ∈ X, denoted by ϕ (x0 ; h), is defined by ϕ (x0 ; h) = lim λ↓0
(ϕ(x0 + λh) − ϕ(x0))/λ.
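For instance, if ϕ(x) = ‖x‖_X and x0 = 0, then (ϕ(λh) − ϕ(0))/λ = ‖h‖_X for every λ > 0, so ϕ′(0; h) = ‖h‖_X for all h ∈ X; this simple computation reappears when the subdifferential of the norm is identified in Example 1.2.41(c).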
REMARK 1.2.10 Due to the convexity of ϕ, the quotient map λ −→ (ϕ(x0 + λh) − ϕ(x0))/λ is increasing on (0, +∞) and so ϕ′(x0; h) is well defined. In fact ϕ′(x0; h) = inf_{λ>0} (ϕ(x0 + λh) − ϕ(x0))/λ. It is easy to see that ϕ′(x0; ·) is convex and positively homogeneous. Of course, if ϕ′(x0; ·) ∈ X*, then ϕ′(x0; ·) is the Gâteaux derivative of ϕ at x0.
The basic result on the Gâteaux differentiability of convex functions is the following theorem.
THEOREM 1.2.11 If X is a separable Banach space, U ⊆ X is a nonempty open convex set, and ϕ : U −→ R is a continuous convex function, then ϕ is Gâteaux differentiable on a dense Gδ-subset of U.
PROOF: For every h ∈ X and n ≥ 1, we introduce the set
V(h, n) = {x ∈ U : there exists δ = δ(x, n) > 0 such that sup_{0<λ<δ} (ϕ(x + λh) + ϕ(x − λh) − 2ϕ(x))/λ < 1/n}.
Because ϕ is continuous on U, given n ≥ 1, for every m ≥ 1 the set
W_m(h) = {x ∈ U : m (ϕ(x + (1/m)h) + ϕ(x − (1/m)h) − 2ϕ(x)) < 1/n}
is open in U. Note that V(h, n) = ∪_{m≥1} W_m(h). So V(h, n) is open.
Next we show that V(h, n) is dense in U. If this is not the case, then we can find x0 ∈ U and δ > 0 such that V(h, n) ∩ B_δ(x0) = ∅. This means that the convex function λ −→ ξ(λ) = ϕ(x0 + λh) is not differentiable at any point of (−δ, δ), which contradicts the classical result on the differentiability of real convex functions. So V(h, n) is dense in U. Because ϕ is continuous convex, from Corollary 1.2.8 we infer that ϕ′(x; ·) is continuous for every x ∈ U. So if {h_k}_{k≥1} is dense in ∂B_1 = {u ∈ X : ‖u‖_X = 1}, then ϕ is Gâteaux differentiable on the set ∩_{n,k≥1} V(h_k, n), which is dense in U (by Baire's theorem) and Gδ.
There is a corresponding result for Fr´echet differentiability due to Asplund [31]. THEOREM 1.2.12 If X is a Banach space with a separable dual, U is a nonempty open convex set, and ϕ : U −→ R a continuous convex function, then ϕ is Fr´echet differentiable on a dense Gδ -set in U . In Example 1.1.9 we saw that existence of partial derivatives does not in general imply Fr´echet differentiability of a function. In the case of convex functions, the situation changes. PROPOSITION 1.2.13 If U ⊆ RN is a nonempty open set, ϕ : U −→ R is convex, and all the partial derivatives of ϕ at x ∈ U exist, then ϕ is Fr´echet differentiable at x ∈ U . N
PROOF: Let A(x) ∈ R^N be defined by ⟨A(x), h⟩_{R^N} = Σ_{k=1}^{N} (∂ϕ/∂x_k)(x) h_k for all h = (h_k)_{k=1}^{N} ∈ R^N. Let r > 0 be such that B_r(x) ⊆ U. For each h ∈ B_r(0), we set
ϑ(h) = ϕ(x + h) − ϕ(x) − ⟨A(x), h⟩_{R^N}.
Clearly ϑ is convex on B_r(0). For each k ∈ {1, . . . , N} we define η_k : B_r(0) −→ R by
η_k(h) = ϑ(h_k e_k)/h_k   if h_k ≠ 0,   and   η_k(h) = 0   if h_k = 0,   for all h = (h_k)_{k=1}^{N} ∈ B_r(0),
where {e_k}_{k=1}^{N} is the usual orthonormal basis of R^N. Note that η_k(h) −→ 0 as ‖h‖_{R^N} −→ 0. For each h = (h_k)_{k=1}^{N} with ‖h‖_{R^N} < r/N, because ϑ is convex, we have
ϑ(h) = ϑ((1/N) Σ_{k=1}^{N} N h_k e_k) ≤ (1/N) Σ_{k=1}^{N} ϑ(N h_k e_k) = Σ_{k=1}^{N} h_k η_k(N h) ≤ ‖h‖_{R^N} Σ_{k=1}^{N} η_k(N h).
Because −ϑ(−h) ≤ ϑ(h), we have
−‖h‖_{R^N} Σ_{k=1}^{N} η_k(−N h) ≤ ϑ(h) ≤ ‖h‖_{R^N} Σ_{k=1}^{N} η_k(N h),
⇒ ϑ(h)/‖h‖_{R^N} −→ 0   as ‖h‖_{R^N} −→ 0.
This proves that ϕ is Fréchet differentiable at x ∈ U and ϕ′(x) = A(x).
COROLLARY 1.2.14 If X is a finite-dimensional Banach space, U ⊆ X is nonempty open, and ϕ : U −→ R is convex, then ϕ is Gˆ ateaux differentiable at x ∈ U if and only if it is Fr´ echet differentiable at x ∈ U . Before passing to the discussion of the subdifferential of convex functions, let us briefly mention a few things about the duality theory of such functions. Duality is a basic theme in the theory of convex analysis. We work within a dual pair (X, X ∗ ), where X is a Hausdorff locally convex space and X ∗ its dual. We supply X with the w(X, X ∗ )-topology (weak topology) and X ∗ with the w(X ∗ , X)-topology. The pairing of X and X ∗ is denoted by ·, ·X . In what follows R∗ = R ∪ {±∞}. DEFINITION 1.2.15 Given a function ϕ : X −→ R∗ , the conjugate (or Legendre– Fenchel transform) of ϕ is the function ϕ∗ : X ∗ −→ R∗ defined by ϕ∗ (x∗ ) = sup x∗ , xX − ϕ(x) : x ∈ X . Similarly the second conjugate of ϕ, is the function (ϕ∗ )∗ = ϕ∗∗ : X −→ R∗ defined by ϕ∗∗ (x) = sup x∗ , xX − ϕ∗ (x∗ ) : x∗ ∈ X ∗ . REMARK 1.2.16 If ϕ takes the value −∞, then ϕ∗ ≡ +∞. Also if ϕ = +∞, then ϕ∗ has values in R = R ∪ {+∞}. Directly from Definition 1.2.15, we have the following. PROPOSITION 1.2.17 If ϕ : X −→ R is proper, then ϕ∗ ∈ Γ0 (X ∗ ). PROPOSITION 1.2.18 If ϕ : X −→ R∗ , then ϕ(x) + ϕ∗ (x∗ ) ≥ x∗ , xX for all x∗ ∈ X ∗ , x ∈ X (Young–Fenchel inequality). REMARK 1.2.19 From the Young–Fenchel inequality we deduce that we always have ϕ∗∗ ≤ ϕ. In the sequel we provide necessary and sufficient conditions for equality to hold. PROPOSITION 1.2.20 If ϕ : X −→ R is proper, then ϕ admits a continuous affine minorant; that is there exists (x∗0 , ϑ0 ) ∈ X ∗ × R such that x∗0 , x − ϑ0 ≤ ϕ(x)
for all x ∈ X.
1.2 Convex Functions
19
PROOF: Let x0 ∈ X and β ∈ R such that β < ϕ(x0 ). Then (x0 , β) ∈ / epi ϕ. Because epi ϕ ⊆ X × R is closed convex, by the strong separation theorem, we can find (x∗0 , ξ0 ) ∈ X ∗ × R and ϑ0 ∈ R such that x∗0 , xX + ξ0 λ ≤ ϑ0 < x∗0 , x0 X + ξ0 β
for all (x, λ) ∈ epi ϕ,
⇒ x∗0 , xX + ξ0 ϕ(x) ≤ ϑ0 < x∗0 , x0 X + ξ0 β.
(1.25)
From (1.25) we see that ξ0 < 0. Without any loss of generality we may assume that ξ0 = −1. Then from (1.25) we have
⟨x*_0, x⟩_X − ϕ(x) ≤ ϑ0   for all x ∈ X,
⇒ ⟨x*_0, x⟩_X − ϑ0 ≤ ϕ(x)   for all x ∈ X.
So ϕ is minorized by a continuous affine function.
We use this proposition to produce necessary and sufficient conditions in order for the equality ϕ∗∗ = ϕ to hold. Recall that we always have ϕ∗∗ ≤ ϕ (see Remark 1.2.19). THEOREM 1.2.21 If ϕ : X −→ R is proper, then ϕ∗∗=ϕ if and only if ϕ ∈ Γ0 (X). PROOF: ⇒: Follows from Proposition 1.2.17. ⇐: From Remark 1.2.19 we know that ϕ∗∗ ≤ ϕ. We show that the opposite inequality is also true. So let x ∈ X and µ ∈ R such that µ < ϕ(x). Then (x, µ) ∈ / epi ϕ and because epi ϕ is closed convex, we can apply the strong separation theorem and obtain (x∗ , ξ) ∈ X ∗ × R, (x∗ , ξ) = (0, 0) and ε > 0 such that x∗ , yX + ξλ ≤ x∗ , xX + ξµ − ε
for all (y, λ) ∈ epi ϕ.
(1.26)
Note that λ can increase to +∞. So from (1.26) we infer that ξ ≤ 0. Suppose ξ < 0. Then x∗ , yX + ξϕ(y) < x∗ , xX + ξµ
for all y ∈ X,
⇒ (−ξϕ)∗ (x∗ ) ≤ x∗ , xX + ξµ.
We can easily check that (−ξϕ)*(x*) = −ξ ϕ*(−x*/ξ). So we have
−ξ ϕ*(−x*/ξ) ≤ ⟨x*, x⟩_X + ξµ,
⇒ µ ≤ ⟨−x*/ξ, x⟩_X − ϕ*(−x*/ξ) ≤ ϕ**(x).
Because µ < ϕ(x) was arbitrary, it follows that ϕ(x) ≤ ϕ**(x). Now suppose that ξ = 0. From (1.26) we have
⟨x*, y⟩_X ≤ ⟨x*, x⟩_X − ε   for all y ∈ dom ϕ,
⇒ x ∉ dom ϕ and so ϕ(x) = +∞.
Therefore we need to show that ϕ∗∗ (x) = +∞. Let η ∈ R be such that x∗ , yX < η < x∗ , xX
for all y ∈ dom ϕ.
(1.27)
From Proposition 1.2.20 we know that ϕ is bounded below by a continuous affine function. So we can find y ∗ ∈ X ∗ and ϑ0 ∈ R such that y ∗ , yX − ϑ0 ≤ ϕ(y)
for all y ∈ X.
Then for all γ > 0 we have
⟨y*, y⟩_X − ϑ0 + γ(⟨x*, y⟩_X − η) ≤ ϕ(y)   for all y ∈ X   (see (1.27)),
⇒ ⟨y* + γx*, y⟩_X − ϕ(y) ≤ ϑ0 + γη   for all y ∈ X,
⇒ ϕ*(y* + γx*) ≤ ϑ0 + γη.   (1.28)
Therefore
y ∗ , xX − ϑ0 + γ x∗ , xX − η ≤ y ∗ + γx∗ , xX − ϕ∗ (y ∗ + γx∗ ) (see (1.28)) ≤ ϕ∗∗ (x). Recall that γ > 0 was arbitrary. So we let γ ↑ +∞ and because x∗ , x − η > 0, we conclude that ϕ∗∗ (x) = +∞. Therefore ϕ∗∗ = ϕ. REMARK 1.2.22 It follows from Theorem 1.2.21 that ϕ ∈ Γ0 (X) if and only if ϕ is the upper envelope of all continuous affine functions majorized by ϕ. Also it follows that ϕ∗∗ is the lower semicontinuous, convex regularization ϕ (i.e., the biggest lower semicontinuous, convex function majorized by ϕ). Next we introduce an operation that is basic in convex analysis. DEFINITION 1.2.23 Let ϕ, ψ : X −→ R. The infimal convolution of ϕ and ψ denoted by ϕ ⊕ ψ, is defined by (ϕ ⊕ ψ)(x) = inf ϕ(y) + ψ(x − y) : y ∈ X = inf ϕ(u1 ) + ψ(u2 ) : u1 + u2 = x . As it turns out this operation is dual to addition (see Theorem 1.2.25 below). PROPOSITION 1.2.24 If ϕ, ψ : X −→ R are convex functions and there exists x0 ∈ dom ψ such that ϕ is continuous at x0 , then inf ϕ(x) + ψ(x) : x ∈ X = max − ϕ∗ (x∗ ) − ψ ∗ (−x∗ ) : x∗ ∈ X ∗ . PROOF: Using Definition 1.2.15 we can check that for all x ∈ X and all x∗ ∈ X ∗ , we have −ϕ∗ (x∗ ) − ψ ∗ (−x∗ ) ≤ ϕ(x) + ψ(x) ⇒ m∗ = sup − ϕ∗ (x∗ ) − ψ ∗ (−x∗ ) : x∗ ∈ X ∗ ≤ inf ϕ(x) + ψ(x) : x ∈ X = m. Because x0 ∈ dom ϕ ∩ dom ψ, we have −∞ ≤ m < +∞. If m = −∞, then we are done. So suppose that m ∈ R. LetE = int epi ϕ = ∅ (see Theorem 1.2.3) and G = (x, λ) ∈ X × R : λ ≤ m − ψ(x) = ∅. Both sets are convex and if (x, λ) ∈ E, then λ > ϕ(x) ≥ m−ψ(x) , hence E ∩G = ∅. Applying the weak separation theorem, we can find (x∗0 , λ0 ) ∈ X ∗ × R, (x∗0 , λ0 ) = (0, 0) and ξ ∈ R such that
⟨x*_0, x⟩_X + λ0 λ ≤ ξ ≤ ⟨x*_0, u⟩_X + λ0 t   for all (x, λ) ∈ E and all (u, t) ∈ G.   (1.29)
Because λ can grow to +∞, from (1.29) we see that λ0 ≤ 0. If λ0 = 0, then x∗0 , xX ≤ ξ ≤ x∗0 , x0 X for all x ∈ dom ϕ. But we can find δ > 0 such that Bδ (x0 ) ⊆ dom ϕ. So it follows that x∗0 = 0, a contradiction to the fact that (x∗0 , λ0 ) = (0, 0). Therefore λ0 < 0 and we can assume that λ0 = −1. We have − x∗0 , uX + t ≤ −ξ
for all (u, t) ∈ G
⇒ − x∗0 , uX + m − ψ(u) ≤ −ξ
(see (1.29)),
for all u ∈ X.
(1.30)
Also we have x∗0 , xX − λ ≤ ξ ∗
¯ = epi ϕ. for all (x, λ) ∈ E
(−x∗0 )
(1.31) ∗
∗
≤ −ξ − m and from (1.31) we have ϕ (x ) ≤ ξ. From (1.30) we obtain ψ Therefore m ≤ −ϕ∗ (x∗0 ) − ψ ∗ (−x∗0 ) ≤ m∗ ≤ m. Using this proposition we can establish the duality between addition and infimal convolution. THEOREM 1.2.25 If ϕ, ψ : X −→ R are convex functions and there exists x0 ∈ dom ψ such that ϕ is continuous at x0 , then (ϕ + ψ)∗ = ϕ∗ ⊕ ψ ∗ . PROOF: For every x∗ ∈ X ∗ we have −(ϕ + ψ)∗ (x∗ ) = inf g(x) + h(x) : x ∈ X ∗ with g(x) = ϕ(x) − x , xX and h(x) = ψ(x). Evidently g, h are convex functions and g is continuous at x0 ∈ dom g = dom ψ. So we can apply Proposition 1.2.24 and obtain −(ϕ + ψ)∗ (x∗ ) = − min h∗ (u∗ ) + g ∗ (−u∗ ) : u∗ ∈ X ∗ = − min ψ ∗ (u∗ ) + ϕ∗ (x∗ − u∗ ) : u∗ ∈ X ∗ . REMARK 1.2.26 It is straightforward to check that for any proper convex functions ϕ, ψ : X −→ R, we have (ϕ ⊕ ψ)∗ = ϕ∗ + ψ ∗ . EXAMPLE 1.2.27 (a) If C ⊆ X is nonempty, and iC is the indicator function, then i∗C (x∗ ) = σ(x∗ ; C) = sup x∗ , cX : c ∈ C (the support function of the set C). Moreover, if C ⊆ X is nonempty, closed, and convex, then x ∈ C if and only if x∗ , xX ≤ σ(x∗ ; C) for all x∗ ∈ X ∗ . (b) If X is a normed space, C ⊆ X is nonempty, and ϕ(x) = d(x, C) = inf x−cX : ∗ ∗ ∗ c ∈ C , then ϕ = iC ⊕ · X and so ϕ = iC + ( · X ) = σ(·; C) + iB¯1∗ (see Remark 1.2.26). (c) If X is a normed space and h : R −→ R is an even convex function, we define ϕ(x) = h(x) for all x ∈ X. Then ϕ∗ (x∗ ) = h∗ (x∗ X ∗ ).
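As a concrete instance of (c), take h(t) = |t|^p/p with 1 < p < ∞. Then h*(s) = |s|^{p′}/p′ with 1/p + 1/p′ = 1, so for ϕ(x) = (1/p)‖x‖_X^p we obtain ϕ*(x*) = (1/p′)‖x*‖_{X*}^{p′}; in particular ϕ(x) = (1/2)‖x‖_X^2 gives ϕ*(x*) = (1/2)‖x*‖_{X*}^2.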
Now we pass to the study of the subdifferential of a convex function. The subdifferential characterizes the local behavior of the convex function in a way analogous to that in which derivatives determine the local behavior of smooth functions. The mathematical setting remains unchanged. So X is a locally convex space, X ∗ its topological dual, and ·, ·X denotes the duality brackets for the pair (X, X ∗ ). We furnish X with the w(X, X ∗ )-topology and X ∗ with the w(X ∗ , X)-topology. DEFINITION 1.2.28 Let ϕ : X −→ R be a proper function and x0 ∈ dom ϕ. The subdifferential of ϕ at x0 is the set ∂ϕ(x0 ) ⊆ X ∗ (possibly empty) defined by ∂ϕ(x0 ) = x∗ ∈ X ∗ : x∗ , hX ≤ ϕ(x0 + h) − ϕ(x0 ) for all h ∈ X . REMARK 1.2.29 It is clear from the above definition that ∂ϕ(x0 ) is w(X ∗ , X)closed and convex. Moreover, x0 ∈ arg min ϕ (i.e., ϕ(x0 ) = inf ϕ) if and only if 0 ∈ ∂ϕ(x0 ) (generalized Fermat’s rule).
X
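For example, for X = R and ϕ(x) = |x|, Definition 1.2.28 gives ∂ϕ(0) = [−1, 1] and ∂ϕ(x) = {sgn x} for x ≠ 0; the generalized Fermat rule 0 ∈ ∂ϕ(0) detects the minimizer x0 = 0, even though ϕ is not differentiable there.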
The following propositions are straightforward consequences of Definition 1.2.28. PROPOSITION 1.2.30 If ϕ, ψ : X −→ R are proper functions, x0 ∈ dom ϕ ∩ dom ψ and λ > 0, then (a) ∂ϕ(x0 ) + ∂ψ(x0 ) ⊆ ∂(ϕ + ψ)(x0 ). (b) ∂(λϕ)(x0 ) = λ∂ϕ(x0 ). PROPOSITION 1.2.31 If ϕ : X −→ R is a proper function, then x∗ ∈ ∂ϕ(x) if and only if ϕ(x) + ϕ∗ (x∗ ) = x∗ , xX . REMARK 1.2.32 So for the elements of the subdifferential ∂ϕ(x) (also known as subgradients of ϕ at x ∈ X), the Young–Fenchel inequality (see Proposition 1.2.18) becomes an equality. Using this proposition, we obtain the following. PROPOSITION 1.2.33 (a) If ϕ : X −→ R is a proper function and x∗ ∈ ∂ϕ(x), then x ∈ ∂ϕ∗ (x∗ ). (b) If ϕ ∈ Γ0 (X), then x∗ ∈ ∂ϕ(x) if and only if x ∈ ∂ϕ∗ (x∗ ). In general the set ∂ϕ(x) may be empty (consider, e.g., the case x ∈ / dom ϕ). The next theorem gives a situation where we have subdifferentiability of ϕ at x ∈ X (i.e., ∂ϕ(x) = ∅). THEOREM 1.2.34 If X is a Banach space and ϕ : X −→ R is a convex function that is continuous at x ∈ X, then ∂ϕ(x) = ∅ and it is a w∗ -compact and convex set in X ∗ .
PROOF: From Theorem 1.2.3 we have that int epi ϕ ≠ ∅. Because (x, ϕ(x)) is a boundary point of epi ϕ, we can apply the weak separation theorem and obtain (x*, ξ) ∈ X* × R, (x*, ξ) ≠ (0, 0), such that
ξ ϕ(x) − λ ≤ x∗ , y − xX for all (y, λ) ∈ epi ϕ. (1.32)
Because λ can increase to +∞, from(1.32) we infer that ξ ≥ 0. If ξ = 0, then x∗ , xX ≤ x∗ , yX
for all y ∈ dom ϕ.
¯δ (0), But x ∈ int dom ϕ and so for some δ > 0 we have x∗ , uX ≥ 0 for all u ∈ B which implies that x∗ = 0, a contradiction to the fact that (x∗ , ξ) = (0, 0). So ξ > 0 and we can take ξ = 1. Then from (1.32) with λ = ϕ(y), we have ϕ(x) − ϕ(y) ≤ x∗ , y − xX
for all y ∈ dom ϕ,
⇒ −x* ∈ ∂ϕ(x), so ∂ϕ(x) ≠ ∅.
From Theorem 1.2.7 we know that there exists r > 0 such that ϕ|B̄_r(x) is Lipschitz continuous, with Lipschitz constant, say, k_r > 0. Then for x* ∈ ∂ϕ(x), we have
⟨x*, h⟩_X ≤ ϕ(x + h) − ϕ(x) ≤ k_r ‖h‖_X   for all h ∈ B̄_r(0),
⇒ ‖x*‖_{X*} ≤ k_r.
This proves that ∂ϕ(x) is bounded, hence w*-compact (by Alaoglu's theorem), and of course convex.
From Example 1.2.27(a) we know that closed convex sets are completely described by their support function. In the next theorem for a certain ϕ we identify the support function of ∂ϕ(x).
THEOREM 1.2.35 If X is a Banach space
and ϕ : X −→ R is a convex function that is continuous at x ∈ X, then σ(h; ∂ϕ(x)) = ϕ′(x; h) for all h ∈ X (see Definition 1.2.9).
PROOF: Let ψ(h) = ϕ′(x; h), h ∈ X. Because ϕ is continuous at x ∈ X, ψ is continuous convex and ∂ϕ(x) ≠ ∅ (see Theorem 1.2.34). So for all x* ∈ ∂ϕ(x) and all h ∈ X, we have ⟨x*, h⟩_X ≤ ψ(h) ≤ ϕ(x + h) − ϕ(x). If ψ_λ(h) = (1/λ)(ϕ(x + λh) − ϕ(x)), λ > 0, then we can easily calculate that
ψ*_λ(x*) = (1/λ)(ϕ*(x*) + ϕ(x) − ⟨x*, x⟩_X),   λ > 0.
Because ψ = inf_{λ>0} ψ_λ, we have ψ* = sup_{λ>0} ψ*_λ and so
ψ*(x*) = sup_{λ>0} (1/λ)(ϕ*(x*) + ϕ(x) − ⟨x*, x⟩_X) = 0 if x* ∈ ∂ϕ(x), and +∞ otherwise
(see Propositions 1.2.31 and 1.2.18). Therefore ψ ∗ = i∂ϕ(x) , hence ψ ∗∗ = ψ = σ∂ϕ(x) (see Theorem 1.2.21 and Example 1.2.27(a)). REMARK 1.2.36 Both Theorems 1.2.34 and 1.2.35 are still valid in the more general context of dual pairs of locally convex spaces. However, to avoid technical difficulties we have presented them in the more familiar framework of Banach spaces and their duals.
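Theorem 1.2.35 can be checked directly on ϕ(x) = ‖x‖_X at x = 0: there ∂ϕ(0) is the closed unit ball of X* (see Example 1.2.41(c) below), and indeed σ(h; ∂ϕ(0)) = ‖h‖_X = ϕ′(0; h) for all h ∈ X.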
It is natural to ask what is the exact relation between the subdifferential and Gˆ ateaux derivative when the latter exists. THEOREM 1.2.37 Let X be a Banach space and ϕ: X −→ R a proper convex function. (a) If ϕ is Gˆ ateaux differentiable at x ∈ X, then ∂ϕ(x) = {ϕ (x)}. (b) If ϕ is continuous at x ∈ X and ∂ϕ(x) is a singleton, then ϕ is Gˆ ateaux differentiable at x ∈ X and ∂ϕ(x) = {ϕ (x)}. PROOF: (a) Because ϕ is convex, we have 1 ϕ(x + λh) − ϕ(x) ≤ ϕ(x + h) − ϕ(x) ϕ (x), h X ≤ λ for all 0 < λ < 1 and all h ∈ X,
⇒ ϕ (x) ∈ ∂ϕ(x)
(see Definition 1.2.28).
Now let x∗ ∈ ∂ϕ(x). Then 1 ϕ(x + h) − ϕ(x) for all λ > 0 and all h ∈ X, λ ≤ ϕ (x), h X for all h ∈ X,
x∗ , hX ≤ ⇒ x∗ , hX
⇒ x∗ = ϕ (x); that is, ∂ϕ(x) = {ϕ (x)}. (b) Due to convexity of ϕ, ϕ(x) + λϕ (x; h) ≤ ϕ(x + h) for all λ ∈ R and all h ∈ X.
Then the line L = x + λh, ϕ(x) + λϕ (x; h) : λ ∈ R does not intersect the set int epi ϕ = ∅. So we can apply the weak separation theorem and produce a closed hyperplane H such that L ⊆ H and H ∩int epi ϕ = ∅. We know that H = Gr l where l is a continuous affine function on X and l(x) = ϕ(x). By hypothesis ∂ϕ(x) = {x∗ }. So l(h) = x∗ , hX + ξ, ξ ∈ R. Because L ⊆ H, we have ϕ (x; h) = x∗ , hX
for all h ∈ X
⇒ ϕ (x) = x∗ ; that is, ∂ϕ(x) = {ϕ (x)}. The subdifferential of convex functions has a rich calculus which in many respects parallels that of smooth functions. Next we present two basic rules of this calculus. THEOREM 1.2.38 If ϕ, ψ : X −→ R are proper convex functions and there exists x ˆ ∈ dom ϕ ∩ dom ψ where one of the two functions, say ϕ, is continuous, then ∂(ϕ + ψ)(x) = ∂ϕ(x) + ∂ψ(x) for all x ∈ X. PROOF: From Proposition 1.2.30(a) we know that we always have ∂ϕ + ∂ψ ⊆ ∂(ϕ + ψ), So it suffices to show that the opposite inclusion holds. To this end let u∗ ∈ ∂(ϕ + ψ)(x). Then x ∈ dom ϕ ∩ dom ψ and ψ(x) − ψ(y) ≤ ϕ(y) − ϕ(x) − u∗ , y − xX = h(y)
for all y ∈ X.
(1.33)
We consider the following two sets C1 = epi h
and
C2 = {(y, µ) ∈ X × R : µ ≤ ψ(x) − ψ(y)}.
Note that both sets are convex and from Theorem 1.2.3 we have that int C1 ≠ ∅. Moreover, int C1 ∩ C2 = ∅. To see this note that if (y, µ) ∈ int C1 ∩ C2, then h(y) < µ ≤ ψ(x) − ψ(y)
(see Theorem 1.2.3),
which contradicts (1.33). So we can apply the weak separation theorem and obtain (x*, ξ) ∈ X* × R, (x*, ξ) ≠ (0, 0), such that
⟨x*, u⟩_X + ξλ ≤ ⟨x*, y⟩_X + ξµ
for all (u, λ) ∈ C1 and all (y, µ) ∈ C2 .
(1.34)
The inequality in (1.34) is strict if (u, λ) ∈ int C1 . Note that (x, 0) ∈ C2 and λ can increase to +∞. So we must have ξ ≤ 0. If ξ = 0, then x∗ , u ≤ x∗ , x
for all u ∈ dom h.
(1.35)
But h is continuous at x and so dom h is a neighborhood of x. Therefore from (1.35) it follows that x∗ = 0; hence (x∗ , ξ) = (0, 0), a contradiction. So ξ < 0 and we may take ξ = −1. Then from (1.34), we have
x∗ , uX − h(u) ≤ x∗ , xX ≤ x∗ , yX − ψ(x) − ψ(y) , for all u ∈ domh and all y ∈ dom ψ. From the second inequality, we obtain −x∗ ∈ ∂ψ(x). From the first inequality and because h(x) = 0, we have x∗ ∈ ∂h(x), hence x∗ + u∗ ∈ ∂ϕ(x). So finally u∗ = u∗ + x∗ + (−x∗ ) ∈ ∂ϕ(x) + ∂ψ(x) ⇒ ∂(ϕ + ψ)(x) ⊆ ∂ϕ(x) + ∂ψ(x). REMARK 1.2.39 By induction it follows that Theorem 1.2.38 is true for any family {ϕk }n k=1 of proper convex functions ϕk : X −→ R = R ∪ {+∞}, such that all n but one of the ϕk s are continuous at a point x ∈ dom ϕk . k=1
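The continuity hypothesis in Theorem 1.2.38 cannot be dropped. For instance, let C1, C2 ⊆ R² be two closed disks that touch at a single point x, and take ϕ = i_{C1}, ψ = i_{C2}. Then ϕ + ψ = i_{{x}} and ∂(ϕ + ψ)(x) = R², whereas ∂ϕ(x) + ∂ψ(x) = N_{C1}(x) + N_{C2}(x) is only the line spanned by the common normal direction at x, so the inclusion of Proposition 1.2.30(a) is strict.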
THEOREM 1.2.40 If Y is another locally convex space, A ∈ L(X, Y ), and ϕ : Y −→ R is a proper convex function, then A∗ ∂ϕ(Ax) ⊆ ∂(ϕ ◦ A)(x) for all x ∈ X. Moreover, equality holds if there is a point in the range of A, where ϕ is continuous. PROOF: The inclusion follows at once from Definition 1.2.28. We show that equality holds when ϕ is continuous on the range of A. To this end let x∗ ∈ ∂(ϕ ◦ A)(x). Then x∗ , u − xX + (ϕ ◦ A)(x) ≤ (ϕ ◦ A)(u) for all u ∈ X. (1.36) Consider the affine space L ⊆ Y × R defined by
L = Au, x∗ , u − xX + (ϕ ◦ A)(x) ∈ Y × R : u ∈ X .
Note that L ∩ int epi ϕ = ∅, so L and epi ϕ have only boundary points in common. The weak separation theorem implies that there exists a closed hyperplane H such that L ⊆ H
and
H ∩ int epi ϕ = ∅.
We know that H = Gr l where l : Y −→ R is the continuous affine function defined by l(y) = y ∗ , yY + ϑ, y ∗ ∈ Y ∗ , ϑ ∈ R. From the inclusion L ⊆ H, we have y ∗ , AuY + ϑ = x∗ , u − xX + (ϕ ◦ A)(x)
for all u ∈ X.
Let u = 0. Then ϑ = (ϕ ◦ A)(x) − x∗ , xX , ⇒ y ∗ , AuY = x∗ , uX ∗
for all u ∈ X,
∗ ∗
⇒ x =A y . Moreover, inasmuch as H ∩ int epi ϕ = ∅, we have y ∗ , yY + (ϕ ◦ A)(x) − A∗ y ∗ , xX ≤ ϕ(y) ∗
∗
for all y ∈ Y,
∗
⇒ y ∈ ∂ϕ(Ax); that is, x ∈ A ∂ϕ(Ax). Therefore we have proved that ∂(ϕ ◦ A)(x) ⊆ A∗ ∂ϕ(Ax) and so equality must hold. EXAMPLE 1.2.41 (a) If ϕ : R −→ R is a proper convex function and x ∈ int dom ϕ, then ∂ϕ(x) = ϕ− (x), ϕ+ (x) . Here ϕ(u) − ϕ(x) u−x ϕ(u) − ϕ(x) ϕ+ (x) = lim . u↓x u−x
ϕ− (x) = lim u↑x
and
(b) If X is a Banach spaceand ϕ(x) = 12 x2 , then ∂ϕ(x) = F(x) = x∗ ∈ X ∗ : x∗ , xX = x2X = x∗ 2X ∗ . This is the duality map, which is important in the study of evolution equations and is examined in more detail in Section 3.2. At this point simply note that if X = H a Hilbert space is identified with its dual, F = IdH . (c) If X is a Banach space and ϕ(x) = x, then ¯1∗ B if x = 0 . ∂ϕ(x) = F (x) if x = 0 x ¯1∗ = {x∗ ∈ X ∗ : x∗ X ∗ ≤ 1}. Here B (d) If X is a locally convex space, ∗ C ∗⊆ X∗ is nonempty, closed, convex, and ∗ ϕ(x)∗= N (x) = x ∈ X : x , c − x ≤ 0 for all c ∈ C = x ∈X : iC (x), then ∂ϕ(x) = C X x∗ , xX =σ(x∗ , C) . This set is a closed convex cone, known as the normal cone to C at x. In Section 1.4 we study tangent and normal cones in more detail.
1.3 Locally Lipschitz Functions
1.3 Locally Lipschitz Functions In this section we extend the subdifferential theory beyond the domain of convex functions. The starting point is Corollary 1.2.8. According to that result every continuous convex function is locally Lipschitz. So it is reasonable to focus our attention on locally Lipchitz functions. Throughout this section X is a Banach space with topological dual X ∗ . By ·, ·X we denote the duality brackets for the pair (X, X ∗ ). Additional hypotheses are introduced as needed.
DEFINITION 1.3.1 A function ϕ : X −→ R is said to be locally Lipschitz if for every x ∈ X we can find a neighborhood U ⊆ X of x and a constant kU > 0 (depending on U ) such that |ϕ(y) − ϕ(u)| ≤ kU y − uX
for all y, u ∈ U.
REMARK 1.3.2 If ϕ : X −→ R is Lipschitz continuous on bounded sets, then ϕ is locally Lipschitz and if dim X < +∞, then these two properties are equivalent. Although the directional derivative need not exist, nevertheless the local Lipschitz property permits the introduction of the following substitute of it.
DEFINITION 1.3.3 Let ϕ : X −→ R be a locally Lipschitz function. The generalized directional derivative of ϕ at x ∈ X in the direction h ∈ X, is defined by ϕ0 (x; h) = lim sup x →x
λ↓0
ϕ(x + λh) − ϕ(x ) ϕ(x + λh) − ϕ(x ) sup = inf . ε,δ>0 x −x≤ε λ λ 0<λ≤δ
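To see how ϕ⁰ differs from the one-sided derivative of Definition 1.2.9, take X = R and ϕ(x) = −|x|. A direct computation gives ϕ⁰(0; h) = |h|, whereas ϕ′(0; h) = −|h|; so in general ϕ⁰(x; ·) only majorizes the usual directional derivative, and the two coincide exactly when ϕ is regular at x in the sense of Definition 1.3.14 below.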
This quantity has some useful properties that are summarized in the proposition that follows.
PROPOSITION 1.3.4 If ϕ : X −→ R is locally Lipschitz, then (a) For every x ∈ X, the function h −→ ϕ0 (x; h) is sublinear and Lipschitz continuous. (b) The function (x, h) −→ ϕ0 (x; h) is upper semicontinuous. (c) ϕ0 (x; −h) = (−ϕ)0 (x; h). PROOF: (a) It is clear from Definition 1.3.3 that ϕ0 (x; ·) is positively homogeneous. Also if h1 , h2 ∈ X, we have
28
1 Smooth and Nonsmooth Calculus ϕ0 (x; h1 + h2 )
ϕ x + λ(h1 + h2 ) − ϕ(x ) = lim sup λ x →x λ↓0
= lim sup x →x
ϕ x + λ(h1 + h2 ) − ϕ(x + λh2 ) + ϕ(x + λh2 ) − ϕ(x ) λ
λ↓0
ϕ x + λ(h1 + h2 ) − ϕ(x + λh2 ) ϕ(x + λh2 ) − ϕ(x ) + lim sup ≤ lim sup λ λ x →x x →x λ↓0
0
λ↓0
0
= ϕ (x; h1 ) + ϕ (x; h2 ). This proves the subadditivity of ϕ0 (x; ·), hence its sublinearity. If x is near x and λ > 0 is near 0, then due to the local Lipschitzness of ϕ, we have ϕ(x + λh) − ϕ(x ) for all h ∈ X, ≤ khX λ 0 for all h ∈ X, ⇒ ϕ (x; h) ≤ khX ⇒ |ϕ0 (x; h)| ≤ khX
(1.37)
for all h ∈ X (by virtue of the sublinearity of ϕ (x; ·)). From this and the sublinearity of ϕ0 (x; ·) follows easily the Lipschitz continuity of h −→ ϕ0 (x; h). 0
(b) Suppose (xn , hn ) −→ (x, h) in X × X as n → ∞. For every n ≥ 1, we can find un ∈ X and λn > 0 such that un X + λn ≤ 1/n and ϕ(xn + un + λn hn ) − ϕ(xn + un ) 1 + λn n ϕ(xn + un + λn hn ) − ϕ(xn + un + λn h) = λn ϕ(xn + un + λn h) − ϕ(xn + un ) 1 + + , λn n ⇒ lim sup ϕ0 (xn ; hn ) ≤ ϕ0 (x; h).
ϕ0 (xn ; hn ) ≤
n→∞
This proves the upper semicontinuity of (x, h) −→ ϕ0 (x; h). (c) From Definition 1.3.3, we have ϕ0 (x; −h) = lim sup x →x
ϕ(x − λh) − ϕ(x ) λ
λ↓0
= lim sup y→x
λ↓0
(−ϕ)(y + λh) − (−ϕ)(y) λ
(setting y = x − λh)
= (−ϕ)0 (x; h). The sublinearity and continuity of ϕ0 (x; ·) lead naturally to the following definition.
1.3 Locally Lipschitz Functions
DEFINITION 1.3.5 If ϕ : X −→ R is a locally Lipschitz function, the generalized subdifferential of ϕ at x ∈ X is defined by ∂ϕ(x) = {x∗ ∈ X ∗ : x∗ , hX ≤ ϕ0 (x; h) for all h ∈ X}. REMARK 1.3.6 By virtue of Proposition 1.2.20, for every x ∈ X the set ∂ϕ(x) is nonempty and clearly it is convex and w∗ -closed. Moreover, from (1.37) we see that ∂ϕ(x) is also bounded; hence it is w∗ -compact (Alaoglou’s theorem). Also 0 ϕ (x; ·) = σ ·; ∂ϕ(x) . PROPOSITION 1.3.7 If ϕ : X −→ R is locally Lipschitz, xn −→ x in X, x∗n −→ x∗ in X ∗ , and x∗n ∈ ∂ϕ(xn ) for all n ≥ 1, then x∗ ∈ ∂ϕ(x). w
PROOF: For every n ≥ 1 and every h ∈ X, we have x∗n , hX ≤ ϕ0 (xn ; h). Passing to the limit as n → ∞ and using Proposition 1.3.4(b), we obtain x∗ , hX ≤ ϕ0 (x; h) for all h ∈ X; hence x∗ ∈ ∂ϕ(x). PROPOSITION 1.3.8 Let ϕ : X −→ R be locally Lipschitz. (a) If ϕ is Gˆ ateaux differentiable at x ∈ X, then ϕ (x) ∈ ∂ϕ(x). (b) If ϕ ∈ C 1 (X), then ∂ϕ(x) = {ϕ (x)}. PROOF: (a) From Definition 1.3.3 we see that ϕ (x), h X ≤ ϕ0 (x; h)
⇒ ϕ (x) ∈ ∂ϕ(x)
for all h ∈ X,
(see Definition 1.3.5).
(b) Because ϕ ∈ C 1 (X), we have ϕ0 (x; h) = ϕ (x), hX for all h ∈ X; hence ∂ϕ(x) = {ϕ (x)}. REMARK 1.3.9 It can happen that ϕ is Gˆ ateaux differentiable at x ∈ X but ∂ϕ(x) contains elements other than ϕ (x). Consider the function ϕ : R −→ R defined by if x = 0 x2 sin x1 . ϕ(x) = 0 if x = 0 Then ϕ is differentiable on [0, 1], Lipschitz continuous on [0, 1], and ϕ (0) = 0. However, ϕ0 (0; h) = |h| and so ∂ϕ(0) = [−1, 1]. Also Proposition 1.3.8(b) is actually true under the weaker hypothesis that ϕ is strictly differentiable. Recall that ϕ is strictly differentiable, if for every x ∈ X, there exists ϕs (x) ∈ X ∗ such that for all h ∈ X we have ϕ(x + λh) − ϕ(x ) ϕs (x), h X = lim . λ x →x λ↓0
A C^1-function is strictly differentiable and a strictly differentiable function is locally Lipschitz. PROPOSITION 1.3.10 If ϕ is continuous and convex (hence locally Lipschitz, see Corollary 1.2.8), then ∂ϕ(x) coincides with the subdifferential in the sense of convex analysis (see Definition 1.2.28).
PROOF: For fixed h ∈ X, the function (t, u) −→ ϕ(u+th)−ϕ(u) is continuous from t (0, +∞) × X into R. So given ε > 0, we can find δ > 0 such that ϕ(u + th) − ϕ(u) ϕ(x + λh) − ϕ(x) ≤ +ε t λ for all (t, u) ∈ (0, +∞) × X such that |t − λ| ≤ δ and u − xX ≤ δ. Then
ϕ u + (λ + δ)h − ϕ(u) ϕ(x + λh) − ϕ(x) sup ≤ + ε. λ λ u−xX ≤δ Passing to the limit as λ, δ ↓ 0, we obtain ϕ0 (x; h) ≤ ϕ (x; h) + ε. Because ε > 0 was arbitrary let ε ↓ 0, to conclude that ϕ0 (x; h) ≤ ϕ (x; h). The opposite inequality is always true when ϕ (x; ·) exists, thus we conclude that ϕ0 (x; ·) = ϕ (x; ·). Then the proposition follows from Theorem 1.2.35. At this point let us recall the following fundamental result concerning vectorvalued locally Lipschitz functions known as Rademacher’s theorem. For a proof of it we refer to Evans–Gariepy [228]. THEOREM 1.3.11 If ϕ : Rn −→ Rm is locally Lipschitz, then ϕ is Fr´echet differentiable λn -a.e. (λn is the Lebesgue measure on Rn ). Using this theorem, when X is finite-dimensional, we can have a definition of the generalized subdifferential, which is more intuitive and geometric than the one in terms of the generalized directional derivative (see Definition 1.3.5). THEOREM 1.3.12 If ϕ : Rn −→ R is locally Lipschitz and E ⊆ Rn any Lebesgue n null set, then for every x ∈ R , ∂ϕ(x) = conv lim ∇ϕ(xn ) : xn −→ x, xn ∈ / n→∞ c E ∪Dϕ (here by Dϕ we denote the set of points of differentiability of ϕ; by Theorem c 1.3.11, λn (Dϕ ) = 0). PROOF: By (1.37), ∂ϕ(·) is bounded (i.e., maps bounded sets to bounded sets). Because ∇ϕ(xn ) ∈ ∂ϕ(xn ), n ≥ 1 (see Proposition 1.3.8(a)), we have {∇ϕ(xn )}n≥1 ∈ Rn is bounded and so we may assume that ∇ϕ(xn ) −→ u∗ in Rn . Moreover, Proposition 1.3.7 implies that u∗ ∈ ∂ϕ(x). Hence c / E ∪ Dϕ ⊆ ∂ϕ(x). S(x) = conv{lim ∇ϕ(xn ) : xn −→ x, xn ∈
Next for h = 0, let ϑh= lim ∇ϕ(x ), h x →x
Rn
(1.38)
. Given ε > 0, we can find δ = δ(ε) > 0
c x ∈E∪D / ϕ
such that
∇ϕ(x ), h
Rn
≤ ϑh + ε
c for all x − xRn ≤ δ, x ∈ / E ∪ Dϕ .
If 0 < λ < δ (2hRn ), then for almost all x ∈ Rn with x − xRn ≤ δ/2 we have
1.3 Locally Lipschitz Functions ϕ(x + λh) − ϕ(x ) =
λ
∇ϕ(x + th), h
0
⇒ ϕ0 (x; h) ≤ ϑh + ε,
⇒ ϕ0 (x; h) ≤ σ h, S(x) .
Rn
dt ≤ λ(ϑh + ε),
(1.39)
From (1.38) and (1.39), we conclude that S(x) = ∂ϕ(x) (see Example 1.2.27(a)). COROLLARY 1.3.13 If ϕ : Rn −→ R is locally Lipschitz, then ϕ0 (x; h) =
lim sup ∇ϕ(x ), h Rn . x →x
c x ∈E∪D / ϕ
The importance of the generalized subdifferential comes from the rich calculus that it has. We present a few basic results in this direction. We start with a definition. DEFINITION 1.3.14 A locally Lipschitz function ϕ : R −→ R is said to be regular at x if (i) The directional derivative ϕ (x; h) in the sense of Definition 1.2.9 exists for all h ∈ X. (ii) ϕ0 (x; h) = ϕ (x; h) for all h ∈ X. REMARK 1.3.15 Continuous convex functions and strictly differentiable functions (in particular C 1 -functions) are regular at x ∈ X. The first calculus rule is a direct consequence of Definition 1.3.5 (see also Definition 1.3.3). PROPOSITION 1.3.16 If ϕk : X −→ R k ∈ {1, . . . , m} are locally Lipchitz func m m λk ϕk (x) ⊆ λk ϕk (x) for all x ∈ X; equality
tions and {λk }m k=1 ⊆ R, then ∂
k=1
k=1
holds if each ϕk is regular at x ∈ X or if all but one of the ϕk s is strictly differentiable at x ∈ X and if λk ≥ 0. With the scalar multiples we can more precise. PROPOSITION 1.3.17 If ϕ : X −→ R is locally Lipchitz and λ ∈ R, then ∂(λϕ)(x) = λ∂ϕ(x) for all x ∈ X. PROOF: If λ ≥ 0, then (λϕ)0 = λϕ0 and so ∂(λϕ) = λ∂ϕ. Therefore we may assume that λ < 0 and in fact we can take λ = −1. Then x∗ ∈ ∂(−ϕ)(x) if and only if x∗ , hX ≤ (−ϕ)0 (x; h) (see Definition 1.3.5). But from Proposition 1.3.4(c), we have (−ϕ)0 (x; h) = ϕ0 (x; −h). Hence x∗ , hX ≤ ϕ0 (x; −h) for all h ∈ X and this is equivalent to saying that −x∗ ∈ ∂ϕ(x); that is, ∂(−ϕ)(x) = −∂ϕ(x). We have the following chain rule.
PROPOSITION 1.3.18 If Y is another Banach space, f ∈ C^1(X, Y), and ϕ : X −→ R is locally Lipschitz, then ϕ ◦ f : X −→ R is locally Lipschitz and
∂(ϕ ◦ f )(x) ⊆ ∂ϕ f (x) ◦ f (x)
(1.40)
in the sense that for every x∗ ∈ ∂(ϕ ◦ f )(x), we have x∗ = f (x)∗ u∗
for some u∗ ∈ ∂ϕ f (x) .
(1.41)
Moreover, if ϕ or (−ϕ) is regular at f (x), then ϕ ◦ f (or −ϕ ◦ f ) is regular at x and equality holds in (1.40). Also if f maps every neighborhood of x onto a set that is dense in a neighborhood f (x) (e.g., if f (x) ∈ L(X, Y ) is surjective), then equality holds in (1.40). PROOF: Because ϕ is locally Lipschitz, we can find V a neighborhood of f (x) such that ϕV is Lipschitz continuous (see Definition 1.3.1). Then U = f −1 (V ) is a neighborhood of x ∈ X and because f (U ) ⊆ V , we see that ϕ ◦ f U is Lipschitz continuous. Therefore ϕ ◦ f : X −→ R is locally Lipschitz. By virtue of Proposition 1.3.4(b), given ε > 0 we can find 0 < δ ≤ ε such that
ϕ0 f (x); h ≤ ϕ0 f (x); h + ε
for all h − hY < δ.
Also from the definition of the generalized directional derivative (see Definition 1.3.3), we can find 0 < ξ, β such that
ϕ(y + λh ) − ϕ(y) ≤ ϕ0 f (x); h + ε ≤ ϕ0 f (x); h + 2ε λ
(1.42)
for all y − f (x)Y ≤ ξ, λ ≤ β, h − hY < δ. We set h = f (x)v, v ∈ X. Because f ∈ C 1 (X, Y ) we can find 0 < η ≤ β such that f (u + λv) − f (u) − f (x)v Y < δ λ for all u − xX ≤ η, λ ≤ η. In (1.42), we set y = f (u) and h =
and
f (u+λv)−ϕ(u) . λ
f (u) − f (x)Y ≤ ξ
We have
(ϕ ◦ f )(u + λv) − (ϕ ◦ f )(u) ≤ ϕ0 f (x); f (x)v + 2ε, λ for all u − xX ≤ η, λ ≤ η,
⇒ (ϕ ◦ f )0 (x; v) ≤ ϕ0 f (x); f (x)v = max y ∗ , f (x)v Y : y ∗ ∈ ∂ϕ f (x) . (1.43) Therefore (1.40) holds. Next suppose that ϕ is regular at f (x). The
case where −ϕ is regular at f (x) can be derived from the former because ∂(−ϕ) f (x) = −∂ϕ f (x) (see Proposition 1.3.17). Because of the regularity of ϕ at f (x), we have
1.3 Locally Lipschitz Functions
ϕ0 f (x); f (x)v = ϕ f (x); f (x)v
ϕ f (x) + λf (x)v − ϕ f (x) = lim λ↓0 λ
ϕ f (x) + λf (x)v − ϕ f (x + λv) ϕ f (x + λv) − ϕ f (x) = lim + λ↓0 λ λ = (ϕ ◦ f ) (x; v)
(because ϕ is locally Lipschitz and f ∈ C 1 (X, Y ))
≤ (ϕ ◦ f ) (x; v). 0
(1.44)
Comparing (1.43) and (1.44) we conclude that
ϕ0 f (x); f (x)v = (ϕ ◦ f )0 (x; v)
⇒ ∂(ϕ ◦ f )(x) = ∂ϕ f (x) ◦ f (x).
for all v ∈ X, (1.45)
If f maps any neighborhood of x onto a dense subset of a neighborhood of f (x), then we can write
ϕ y + λf (x)v − ϕ(y) ϕ0 f (x); f (x)v = lim sup λ y→f (x) λ↓0
ϕ f (u) + λf (x)v − ϕ f (u) = lim sup λ u→x λ↓0
ϕ f (x + λv) − ϕ f (u) = lim sup λ u→x λ↓0
(because f ∈ C 1 (X, Y )) = (ϕ ◦ f )0 (x; v)
for all v ∈ X.
Therefore (1.45) must hold.
REMARK 1.3.19 Using the adjoint of the linear operator f (x) ∈ L(X, Y ) we can write (1.40) as
∂(ϕ ◦ f )(x) ⊆ f (x)∗ ∂ϕ f (x) (see (1.41)). COROLLARY 1.3.20 If X is continuously and densely embedded in another Ba nach space Y , ϕ : Y −→ R is locally Lipschitz, and ϕ = ϕX , then ∂ ϕ(x) = ∂ϕ(x) for all x ∈ X, which means that every element of ∂ ϕ(x) admits a unique extension to an element of ∂ϕ(x). PROOF: Let i ∈ L(X, Y ) be the canonical embedding. Then we apply Proposition 1.3.18 with f = i to obtain the desired equality. Another result in the same direction as Proposition 1.3.18, is the following. PROPOSITION 1.3.21 If T = [0, b], f ∈ C 1 (T, X), and ϕ : X −→ R is locally Lipschitz, then g = ϕ ◦ f : T −→ R is differentiable a.e. on T and
g (t) ≤ max x∗ , f (t) X : x∗ ∈ ∂ϕ f (t) .
PROOF: The function g : T −→ R is locally Lipschitz, hence it is differentiable almost everywhere (see Theorem 1.3.11). Let t0 ∈ T be a point of differentiability of g. We have
ϕ f (t0 + λ) − ϕ f (t0 ) g (t0 ) = lim λ→0 λ
ϕ f (t0 ) + λf (t0 ) + o(λ) − ϕ f (t0 ) = lim λ→0 λ
o(λ) −→ 0 as λ → 0 here λ ϕ f (t0 ) + λf (t0 ) − ϕ f (t0 ) = lim λ→0 λ
ϕ f (t0 ) + λf (t0 ) + o(λ) − ϕ f (t0 ) + λf (t0 ) +
λ ϕ f (t0 ) + λf (t0 ) − ϕ f (t0 ) = lim λ→0 λ
≤ ϕ0 f (t0 ); f (t0 ) = max x∗ , f (t0 ) X : x∗ ∈ ∂ϕ f (t0 ) . The next proposition produces an extension to the present nonsmooth setting of the classical Fermat’s rule for local extrema. PROPOSITION 1.3.22 If ϕ : X −→ R is locally Lipschitz which attains a local extremum (local minimum or local maximum) at x ∈ X, then 0 ∈ ∂ϕ(x). PROOF: Because ∂(−ϕ) = −∂ϕ it suffices to prove the proposition when x is a local minimizer. Then directly from Definition 1.3.3 we see that 0 ≤ ϕ0 (x; h) for all h ∈ X, hence 0 ∈ ∂ϕ(x) (see Definition 1.3.5). Using this proposition, we can prove a mean value theorem for the generalized subdifferential. THEOREM 1.3.23 If ϕ : X
−→ R is locally Lipschitz and x, u ∈ X, then we can find 0 < λ0 < 1 and v ∗ ∈ ∂ϕ (1 − λ0 )x + λ0 u such that ϕ(u) − ϕ(x) = v ∗ , u − xX . PROOF: Let ξ : R −→ R be defined
ξ(λ) = ϕ((1 − λ)x + λu + λ ϕ(x) − ϕ(u) ,
λ ∈ R.
Evidently ξ is locally Lipschitz on R, hence Lipschitz continuous on [0, 1]. Note that ξ(0) = ξ(1) = ϕ(x). So we can find λ0 ∈ (0, 1) such that ξ attains a local maximum or minimum at λ0 . By virtue of Proposition 1.3.22 we have 0 ∈ ∂ξ(λ0 ). Moreover, an easy calculation gives
∂ξ(λ0 ) ⊆ ∂ϕ x + λ0 (u − x) , u − x X + [ϕ(x) − ϕ(u)]. Therefore finally we have
1.4 Variational Geometry
ϕ(u) − ϕ(x) ∈ ∂ϕ x + λ0 (u − x) , u − x X . We conclude this section with a Lagrange multipliers rule. For a proof of it we refer to Clarke [149]. PROPOSITION 1.3.24 If ϕ, ψ : X −→ R are locally Lipschitz functions and x ∈ X solves the problem inf[ϕ(x) : ψ(x) ≤ 0], then there exist λ, µ ≥ 0 such that 0 ∈ λ∂ϕ(x) + µ∂ψ(x)
and
µψ(x) = 0.
Moreover, if ψ is also convex and there exists x0 ∈ X such that ψ(x0 ) < 0, then we have λ > 0 (and so we can take λ = 1, normal multipliers).
1.4 Variational Geometry Differential geometry is built upon smooth manifolds where the sets of tangents are convenient vector spaces. But in many applications we encounter nonsmooth sets. The first step away from the smooth setting of differential geometry was to consider convex sets. Then the tangent and normal spaces of differential geometry are replaced by the tangent and normal cones which actually share many of the properties of their smooth counterparts. The second step is to get rid of the convexity requirement on the set using the generalized subdifferential of locally Lipschitz functions. We show that some interesting results can be obtained this way. Throughout this section, X is a Banach space and X ∗ its topological dual. By ·, ·X we denote the duality brackets for the pair (X, X ∗ ). ¯ The DEFINITION 1.4.1 Let C ⊆ X be a nonempty convex set and x ∈ C. tangent cone to C at x is the set TC (x) =
! 1 (C − x). λ
λ>0
REMARK 1.4.2 Evidently TC (x) = TC¯ (x) and so without any loss of generality we can always assume that C is closed. Also from Definition 1.4.1 it follows that h ∈ TC (x) if and only if there exist sequences λn −→ 0+ in R and hn −→ h in X ¯ for all n ≥ 1. Note that C ⊆ x + TC (x). such that x + λn hn ∈ C PROPOSITION 1.4.3 If C ⊆ X is a nonempty closed convex set and x ∈ C, then TC (x) is a closed convex cone. 1 (C − x). First we show that for all h ∈ SC (x), we can PROOF: Let SC (x)= λ λ>0
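For instance, for X = R and C = [0, 1], Definition 1.4.1 gives T_C(0) = [0, +∞), T_C(1) = (−∞, 0], and T_C(x) = R for every x ∈ (0, 1), in agreement with Remark 1.4.4 below.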
find λ0 > 0 such that for all 0 ≤ λ ≤ λ0 we have x + λh ∈ C. Indeed note that from the definition of SC (x) we can find λ0 > 0 such that x + λ0 h ∈ C. Then for
36
1 Smooth and Nonsmooth Calculus
0 ≤ λ ≤ λ0 we have x + λh = 1 − (λ/λ0 ) x + (λ/λ0 )(x + λh) and the right-hand side is a convex combination of elements in C. Due to the convexity of C, we have that x+λh ∈ C. Using this fact we can show that SC (x) is a convex cone. To this end let h1 , h2 ∈ SC (x). Then we can find λ1 , λ2 > 0 such that x + λk hk ∈ C, k = 1, 2. Set λ0 = min{λ1 , λ2 }. From the first part of the proof we have x + λ0hk ∈ C, k = 1, 2. Exploiting the convexity of C, we obtain x + λ0 th1 + (1 − t)h2 ∈ C, t ∈ [0, 1]. This proves that SC (x) is a convex cone. It follows that SC (x)=TC (x) is a closed convex cone. REMARK 1.4.4 We have T{x} (x) = {0} and if x ∈ int C, then TC (x) = X. In differential geometry, the normal space to a smooth manifold embedded in RN at a point x, is the orthogonal complement in RN of the tangent space. In a similar way, replacing the orthogonal subspace by the negative polar cone, we can define the normal cone to C at x. DEFINITION 1.4.5 Let C ⊆ X be a nonempty, closed convex set. The normal cone to C at x ∈ C is defined by NC (x) = x∗ ∈ X ∗ : x∗ , xX = σ(x∗ ; C) ∗ = x ∈ X ∗ : x∗ , c − x ≤ 0 for all c ∈ C . REMARK 1.4.6 Evidently NC (x) = TC (x)− (= the negative polar cone of TC (x)) and so it is closed and convex. Then N{x} (x) = X ∗ and if x ∈ int C, we have NC (x) = {0} (see Remark 1.4.4). Moreover, if iC is the indicator function of the set C; that is, 0 if x ∈ C , iC (x) = +∞ if x ∈ /C then iC ∈ Γ0 (X) and ∂iC (x) = NC (x) (the subdifferential is taken in the sense of convex analysis; see Definition 1.2.28). The following calculus rules for the tangent and normal cones follow easily from Definitions 1.4.1 and 1.4.5. PROPOSITION 1.4.7 (a) If C1 ⊆ C2 ⊆ X are nonempty closed convex sets and x ∈ C1 , then TC1 (x) ⊆ TC2 (x) and NC2 (x) ⊆ NC1 (x). (b) If {Xk }n k=1 are Banach spaces Ck ⊆ Xk , k ∈ {1, . . . , n} are nonempty, closed, n n " " and convex sets, X = Xk , C = Ck ⊆ X and x = (xk )n k=1 ∈ C, then TC (x) =
n " k=1
k=1
TCk (xk ) and NC (x) =
n " k=1
k=1
NCk (xk ).
(c) If X, Y are Banach spaces, C ⊆ X is a nonempty, closed, and convex set, and A ∈ L(X, Y ), then TA(C) (Ax) = ATC (x) and NA(C) (Ax) = (A∗ )−1 NC (x). (d) If C1 , C2 ⊆ X are two nonempty, closed convex sets and x1 ∈ C1 , x2 ∈ C2 , then TC1 +C2 (x1 + x2 ) = TC1 (x1 ) + TC2 (x2 ) and NC1 +C2 (x1 + x2 ) = NC1 (x1 ) ∩ NC2 (x2 ).
1.4 Variational Geometry
37
For the next calculus rule, we need some auxiliary material which is actually of independent interest. We start with a minimax theorem, known as Nikaido’s theorem, which we state here without a proof. We return to this subject in Section 2.2. THEOREM 1.4.8 If V, Z are locally convex spaces, C ⊆ Z is nonempty, convex, and w(Z, Z ∗ )-compact, and η : V × Z −→ R is a function such that (i) For all z ∈ Z, v −→ η(v, z) is convex and lower semicontinuous; (ii) For all v ∈ V, z −→ η(v, z) is concave and upper semicontinuous; then inf max η(v, z) = max inf η(v, z). v∈V z∈C
z∈C v∈V
Using this theorem, we can prove the following optimization result. THEOREM 1.4.9 If X, Y spaces, A ∈ L(X, Y ) ϕ ∈
are reflexive Banach Γ0 (X), ψ ∈ Γ0 (Y ), 0 ∈ int A(dom ϕ) − dom ψ , and x0 ∈ X satisfies
ϕ(x0 ) + ψ A(x0 ) = min ϕ(x) + ψ A(x) : x ∈ X ,
then 0 ∈ A∗ ∂ψ A(x0 ) + ∂ϕ(x0 ). ¯1X = {x ∈ X : xX ≤ 1} and ψ ∗ : Y ∗ −→ R be the conjugate of ψ PROOF: Let B (see Definition 1.2.15). Let m = sup ∗inf ∗ sup ψ ∗ (y ∗ ) − y ∗ , A(x)Y − ϕ(x) . n≥1 y ∈Y
¯X x∈nB 1
Let η(y ∗ , x) = ψ ∗ (y ∗ ) − y ∗ , A(x)Y − ϕ(x). Evidently η satisfies hypotheses (i) and (ii) of Theorem 1.4.8 with V = Y ∗ and Z =Xw = the Banach space X furnished ¯1X is w-compact in X (due to the reflexivity of with the weak topology. Because B X), by virtue of Theorem 1.4.8, we have m = sup sup ∗inf ∗ ψ ∗ (y ∗ ) − y ∗ , A(x)Y − ϕ(x) ¯ X y ∈Y n≥1 x∈nB 1 ∗ ∗
= sup ∗inf ∗ ψ (y ) − y ∗ , A(x)Y − ϕ(x) . x∈X y ∈Y
From Propositions 1.2.18 and 1.2.31, we have that
inf ∗ ψ ∗ (y ∗ ) − y ∗ , A(x)Y = −ψ A(x) . ∗ y ∈Y
(1.46)
(1.47)
Using this (1.47) in (1.46), we obtain
m = sup −ψ A(x) −ϕ(x) =− inf ϕ(x)+ψ A(x) = −ϕ(x0 )−ψ A(x0 ) . (1.48) x∈X
x∈X
Let {yn∗ }n≥1 ⊆ Y ∗ be such that lim sup ψ ∗ (yn∗ ) − yn∗ , A(x)Y − ϕ(x) ≤ m n→∞
for all x ∈ X.
(1.49)
38
1 Smooth and Nonsmooth Calculus
By virtue of the hypothesis that 0 ∈ int A(dom ϕ) − dom ψ , we can find r > 0 such that ¯1Y ⊆ dom ψ − A(dom ϕ). rB ¯1Y can be written as u = yu − A(xu ) with yu ∈ dom ψ and Every u ∈ rB xu ∈ dom ϕ. Then yn∗ , uY = yn∗ , yu Y − A∗ (yn∗ , xu X ≤ ψ ∗ (yn∗ ) + ψ(yu ) − A∗ (yn∗ , xu X , ⇒
lim sup yn∗ , yu Y n→∞
(see Proposition 1.2.18), ¯1Y (see(1.49)). ≤ m + ϕ(xu ) + ψ(yu ) < +∞ for all u ∈ rB
Therefore {yn∗ }n≥1 ⊆ Y ∗ is weakly bounded, thus bounded. Because Y is reflexive, it is relatively w-compact and so by the Eberlein–Smulian theorem and by w passing to a subsequence if necessary, we may assume that yn∗ −→ y ∗ in Y ∗ as ∗ ∗ ∗ ∗ n → ∞. Because y −→ ψ (y ) − y , A(x)Y is weakly lower semicontinuous, from (1.49) we have
ψ ∗ (y ∗ ) − y ∗ , A(x)Y − ϕ(x) ≤ m = −ϕ(x0 ) − ψ A(x0 ) ,
(1.50)
for all x ∈ X (see (1.48)). Taking x = x0 , we obtain
ψ ∗ (y ∗ ) + ψ A(x0 ) ≤ y ∗ , A(x0 )Y ,
⇒ y ∗ ∈ ∂ψ A(x0 ) (see Propositions 1.2.18 and 1.2.31).
(1.51)
Returning to (1.50) and using (1.51) and Proposition 1.2.31, we obtain ϕ(x0 ) − ϕ(x) ≤ −A∗ (y ∗ ), x0 − xX ∗
for all x ∈ X,
∗
⇒ −A (y ) ∈ ∂ϕ(x0 ).
So finally 0 = A∗ (y ∗ ) − A∗ (y ∗ ) ∈ A∗ ∂ψ A(x0 ) (see (1.51)).
THEOREM 1.4.10 If X, Y are reflexive Banach spaces, A ∈ L(X, Y ) is invertible,
C ⊆ X, D ⊆ Y are nonempty, closed, and convex sets such that 0 ∈ int A(C) − D and E = {x ∈ C : A(x) ∈ D}, then for all x ∈ E, we have
TE (x) = TC (x) ∩ A−1 TD A(x) and NE (x) = NC (x) + A∗ ND A(x) .
PROOF: Let x∗ ∈ NC (x) and y ∗ ∈ ND A(x) . Set u∗ = x∗ + A∗ y ∗ . If u ∈ E, then u∗ , uX = x∗ , uX + y ∗ , A(u)Y ≤ x∗ , xX + y ∗ , A(x)Y , (see Definition 1.4.5), ⇒ u∗ , uX ≤ u∗ , xX ; that is, u∗ ∈ NE (x). Conversely, if u∗∈NE (x), then x∗ , uX = sup[u∗ , uX : u ∈ C and A(u) ∈ D]. We apply Theorem 1.4.9 with ϕ = iC − u∗ ∈ Γ0 (X) and ψ = iD ∈ Γ0 (Y ). Note that domϕ = C and dom ψ = D. Then because ∂ϕ(x) = NC (x) and ∂ψ A(x) = ND A(x) − u∗ , we have
0 ∈ A∗ ND A(x) + NC (x).
1.4 Variational Geometry
39
Finally from Definition 1.4.5 we know that the tangent cone is a negative polar cone of the normal cone. So
− TE (x) = NC (x) + A∗ ND A(x)
− = TC (x) ∩ A∗ ND A(x)
= TC (x) ∩ A−1 TD A(x) . This theorem permits the description of the tangent and normal cones for intersections. PROPOSITION 1.4.11 If X is a reflexive Banach space and C, D ⊆ X are nonempty, closed, convex sets such that 0 ∈ int(C − D), then for all x ∈ C ∩ D we have TC∩D (x) = TC (x) ∩ TD (x)
and
NC∩D (x) = NC (x) + ND (x).
PROOF: Apply Theorem 1.4.10 with Y = X and A = IdX .
REMARK 1.4.12 The above result fails if we do not have the constraint qualification condition that 0 ∈ int(C − D). Take, for example, two balls C, D ⊆ R2 that are tangent at x ∈ R2 . Then TC∩D (x) = T{x} (x) = {0} but TC (x) ∩ TD (x) = R. Recall that, if X = H = a Hilbert space and C ⊆ H is nonempty closed and convex, then every x ∈ H has a unique best approximation in C; that is, there is a unique c0 ∈ C such that x − c0 H = min{x − cH : x ∈ H}. This map, which to each x ∈ H assigns its unique best approximation pC (x) ∈ C, is called the metric projection map for C. Then directly from the definitions we have the following. PROPOSITION 1.4.13 If X = H =a Hilbert space, C ⊆ H is nonempty, closed, and convex and x ∈ C, then p−1 C (x) = x + NC (x) and h ∈ TC (x) if and only if h, u − xH ≤ 0 for all u ∈ p−1 C (x). PROPOSITION 1.4.14 If C ⊆ X1 is a nonempty, closed, and convex set such that int C = ∅, then int TC (x) = (int C − x). λ λ>0
1 (int C − x) is open as the union of open sets and so PROOF: Note that λ λ>0 1 1 (int C − x) ⊆ int TC (x). If SC (x)= (C − x), then int TC (x) = int SC (x). λ λ
λ>0
λ>0
So to prove the proposition it suffices to show that if h ∈ int SC (x), then for some λ > 0 h ∈ λ1 (int C − x). Let δ > 0 be such that h + δB1 ⊆ SC (x) (B1 = {x ∈ X : xX < 1}. If x + h ∈ C, then we are done. Otherwise, let u0 ∈ int C and set h0 = u0 − x.
40
1 Smooth and Nonsmooth Calculus
Then h − δ(h0 /h0 X ) ∈ SC (x) and so we can find λ > 0 such that x + λ h − δ(h0 /h0 X ) ∈ C. Let µ = λδ (λδ + h0 X ). We have x + (1 − µ)λh = µu0 + (1 − µ) x + λ h − δ
h0 . h0 X
Note that u0 ∈ int C and x + λ h − δ(h0 /h0 X ) ∈ C. Because µ ∈ (0, 1), it follows that x + (1 − µ)λh ∈ int C; that is, h ∈
1 (int C − x). (1 − µ)λ
Let us give some simple examples of tangent cones. ¯1 = {x ∈ X; xX ≤ 1} and x ∈ ∂ B ¯1 . Then if EXAMPLE 1.4.15 (a) Let C = B ∗ X∗ F : X −→ 2 isthe duality map for the space; that is, F(u)= x ∈ X ∗ : x∗ , u= x∗ 2X ∗ = u2X for all u ∈ X (see Example 1.2.41(b)), then ! − TC (x)= F (x) ={u∈X :F (x), uX ≤ 0} and NC (x)= λF(x). λ≥0
(b) If C ⊆X is a nonempty, closed, and convex cone and x∈C, then NC (x)=C − ∩ {x}⊥ , where C − = x∗ ∈X ∗ :x∗ , cX ≤ 0 for all c ∈ C and {x}⊥ = x∗ ∈X ∗ :x∗ , xX = 0 . Also we have TC (x)=C + Rx. If C ⊆X is a closed subspace and x ∈ C, then TC (x)=C and NC (x)=C ⊥ = x∗ ∈ X ∗ : x∗ , cX = 0 for all c ∈ C . (c) If Y is another Banach space, A∈L(X, Y ), and C =A−1 ({y}), then if A(x) = y, TA−1 ({y}) (x) = ker A. n n xk = 1 (the standard (d) If C ⊆ Rn is defined by C = x = (xk )n k=1 ∈ R+ :
simplex in Rn ) and I(x)={k = 1, . . . , n : xk = 0}, then h = (hk )n (x) k=1 ∈ TRn + and and
h= ∈ TC (x) n hk = 0. (hk )n k=1
k=1
if and only if hk ≥ 0 if and only if hk ≥ 0
for all k ∈ I(x) for all k ∈ I(x)
k=1
REMARK 1.4.16 A set C ⊆ X is said to be star-shaped around x ∈ C, if for all u ∈ C and all λ ∈ [0, 1], we have x + λ(u − x) ∈ C. For star-shaped around x sets 1 C we still have TC (x) = SC (x) = (C − x) and C ⊆ x + TC (x). λ λ>0
1.4 Variational Geometry
41
Now let us see what can be said about nonconvex sets C ⊆ X. Let dC (·) = inf[ · −c; c ∈ C] (the distance function from C ⊆ X). For every C ⊆ X nonempty dC (·) is Lipschitz continuous with Lipschitz constant equal to 1. Note also that dC (·) = ( · ⊕ iC )(·). Therefore, if C ⊆ X is convex, then dC (·) is convex too. ¯ Then the Clarke DEFINITION 1.4.17 Let C ⊆ X be nonempty and x ∈ C. tangent cone to C at x is defined by TC (x)= h∈X : d0C (x; h) ≤ 0 . The Clarke normal cone to C at x is defined by − NC (x)= TC (x) = x∗ ∈ X ∗ : x∗ , hX ≤ 0 for all h ∈ TC (x) . REMARK 1.4.18 Note that both cones are closed and convex in X and X ∗ , respectively. Moreover, because TC (x) = TC¯ (x), then we can always take without − (bipolar any loss of generality C ⊆ X to be closed. Note that TC (x) = NC (x) theorem). Also if int C = ∅ and x∈int C, then TC (x)=X and NC (x)={0}. Finally in the definition of TC (x) we actually have d0C (x; h)=0. ¯ then NC (x) = PROPOSITION 1.4.19 If C ⊆ X is nonempty and x ∈ C, w∗ λ∂dC (x) . λ≥0
PROOF: Note that h ∈ TC (x) if and only x∗ , hX ≤ 0 for all x∗ ∈ ∂dC (x). So the negative polar cone of TC (x) is the weak∗ -closed convex cone generated by the set w∗ λ∂dC (x) . ∂dC (x). Hence NC (x)= λ≥0
We can have an intrinsic characterization of TC (x) independent of the norm used on X (hence of the distance function too). ¯ then h ∈ TC (x) if and PROPOSITION 1.4.20 If C ⊆X is nonempty and x ∈ C, only if for every sequence {xn }n≥1 ⊆ C such that xn −→ x in X and every sequence λn ↓ 0 in R, we can find a sequence {hn }n≥1 ⊆ X such that hn −→ h in X and xn + λn hn ∈ C for all n ≥ 1. PROOF: ⇒: Let h∈TC (x) and consider sequences {xn }n≥1 ⊆ C and {λn }n≥1 ⊆ R+ such that xn −→ x in X and λn ↓ 0 as n → ∞. Then by virtue of Definition 1.4.17 (see also Remark 1.4.18), we have lim
n→∞
dC (xn + λn h) − dC (xn ) dC (xn + λn h) = lim = 0. n→∞ λn λn
(1.52)
Let cn ∈ C such that xn + λn h − cn X ≤ dC (xn + λn h) +
λn n
(1.53)
and set hn =(1/λn )(cn − xn ). Then from (1.52) and (1.53) it follows that hn −→ h in X as n → ∞. Moreover, xn + λn hn ∈ C for all n ≥ 1.
42
1 Smooth and Nonsmooth Calculus
⇐: Choose {un }n≥1 ⊆ X with un −→ x in X and {tn }n≥1 ⊆ R+ with tn ↓ 0 such that dC (un + tn h) − dC (un ) lim = d0C (x; h). n→∞ tn Let cn ∈ C such that cn − un X ≤ dC (un ) + (tn /n), n ≥ 1. Then cn −→ x in X. By hypothesis we can find {hn }n≥1 ⊆ X such that hn −→ h in X and cn + tn hn ∈ C, n ≥ 1. Because dC (·) is Lipschitz continuous, dC (un + tn h) ≤ dC (cn + tn h) + un − cn X + tn h − hn X
1 ≤ dC (un ) + tn + h − hn X n dC (un + tn h) − dC (un ) 1 ⇒ ≤ + h − hn X tn n 0 ⇒ dC (x; h) ≤ 0. Using this proposition, we can compare the two cones TC (x), TC (x) when the set is convex. PROPOSITION 1.4.21 If C ⊆ X is nonempty, closed, and convex and x ∈ C, then TC (x) = TC (x) and NC (x) = NC (x). PROOF: It is clear from Remark 1.4.2 and Proposition 1.4.20 that TC (x) ⊆ TC (x). So suppose that h ∈ TC (x). Then dC (x; h) = d0C (x; h) = 0 (see Remark 1.4.18). Hence h ∈ TC (x) and we conclude that TC (x) = TC (x). Taking negative polar cones, we have NC (x) = NC (x). COROLLARY 1.4.22 If C ⊆ X is nonempty, closed, and convex and x ∈ C, then h ∈ TC (x) if and only if dC (x; h) = d0C (x; h) = 0. The next variational principle underlines the significance of the distance function in variational problems. Roughly speaking, the result says that in many cases we can replace a constrained minimization problem by an unconstrained one with a suitably perturbed objective functional. PROPOSITION 1.4.23 If ϕ : X −→ R is a function that is Lipschitz continuous on B ⊆ X with Lipschitz constant k > 0, C ⊆ B is nonempty, and ϕ(x) = inf[ϕ(u) : u ∈ C], then for all k ≥ k the function u −→ ψk (u) = ϕ(u) + k dC (u) attains its minimum over B at x. Moreover, if k > k and C is closed, then any other point minimizing ψk over B must also lie in C. PROOF: Let us prove the first assertion of the proposition. We argue indirectly. Suppose that the result is not true. We can find v ∈ B and ε > 0 such that ϕ(v) + k dC (v) < ϕ(x) − k ε. Choose c ∈ C such that v − cX ≤ dC (v) + ε. Then we have ϕ(c) ≤ ϕ(v) + k c − vX ≤ ϕ(v) + k dC (v) + k ε < ϕ(x), a contradiction to the fact that inf ϕ = ϕ(x). C
Next suppose k > k and assume that C ⊆ B is closed. Let v ∈ B be an other minimizer of ψk on B. Using the first part of the proof (for (k + k )/2 > k), we have
1.4 Variational Geometry
43
k + k dC (v) 2 (because k > k) and so v ∈ C (because C is closed).
ϕ(v) + k dC (v) = ϕ(x) ≤ ϕ(v) + ⇒ dC (v) = 0
COROLLARY 1.4.24 If ϕ : X −→ R is Lipschitz continuous near x ∈ X, C ⊆ X is nonempty, and ϕ(x) = inf ϕ, then 0 ∈ ∂ϕ(x) + NC (x). C
PROOF: Let B be a neighborhood of x ∈ X such that ϕB is Lipschitz continuous with constant k > 0. We may assume that C ⊆ B (because the sets C and C ∩ B have the same normal cone at x). Then by Proposition 1.4.23 we know that x ∈ C minimizes the function u −→ ϕ(u) + kdC (u) locally. Hence 0 ∈ ∂(ϕ + kdC )(x)
(see Proposition 1.3.22),
⇒ 0 ∈ ∂ϕ(x) + k∂dC (x) ⊆ ∂ϕ(x) + NC (x)
(see Propositions 1.3.16 and 1.4.19).
We conclude this section with the introduction of two more geometric notions. They are useful in the analysis of noncoercive variational problems. Let C ⊆ X be a nonempty, closed, convex set. For x ∈ C, let # C −x C∞ (x) = h ∈ X : x + λh ∈ C for all λ > 0 = . λ λ>0
Clearly this set is a closed convex cone, which can be trivial (i.e., C∞ (x) = {0}). Think of a bounded set in RN . PROPOSITION 1.4.25 The closed convex cone C∞ (x) does not depend on the base point x ∈ C. PROOF: Let x, u ∈ C. We need to show that C∞ (x) ⊆ C∞ (u). To this end let
h ∈ C∞ (x) and λ > 0. For 0 < ε < 1 let vε = x + λh + (1 − ε)(u − x) = ε x + (λ/ε)h + (1 − ε)u. Because h ∈ C∞ (x) and C is convex, we have vε ∈ C. Note that u + λh = lim vε ∈ C. ε↓0
This proposition suggests that the notation C∞ is more appropriate. DEFINITION 1.4.26 The recession cone (or asymptotic cone) of the closed con = h ∈ X : x + λh ∈ C for all λ > 0 = vex set C ⊆ X, is the closed convex cone C ∞
(C − x)/λ , for any x ∈ C. λ>0
REMARK 1.4.27 This cone describes the set of directions along which one can go straight to infinity without leaving C. So, if C is bounded, then C∞ = {0}. Also we can view C∞ as the maximal (with respect to inclusion) set K ⊆ X such that K + C ⊆ C. If C is a closed convex cone, then K = C. PROPOSITION 1.4.28 A closed convex set C ⊆ RN is compact if and only if C∞ = {0}.
44
1 Smooth and Nonsmooth Calculus
PROOF: ⇒: Because C is compact, it is bounded and so it cannot have direction receding to infinity. ⇐: Suppose that we can find {xn }n≥1 ⊆ C \ {0} such that xn RN −→ +∞. Set hn = xn xn RN , n ≥ 1. We may assume that hn −→ h in RN and hRN = 1, hence h = 0. Given x ∈ C and λ > 0, we can find n ≥ 1 large such that xn ≥ λ. Then
λ λ x + λh = lim 1 − xn ∈ C x+ n→∞ xn RN xn RN ⇒ h ∈ C∞ , a contradiction because h = 0. Given ϕ ∈ Γ0 (X), we know that epi ϕ is a closed convex set in X × R. We can consider the set (epi ϕ)∞ . This is the epigragh of a function ϕ∞ and so we are led to the following definition. DEFINITION 1.4.29 If ϕ ∈ Γ0 (X), then the recession function (or asymptotic function) corresponding to ϕ is the function ϕ∞ ∈ Γ0 (X) such that epi ϕ∞ = (epi ϕ)∞ . Then we have ϕ∞ (h)=sup λ>0
ϕ(x + λh) − ϕ(x) ϕ(x + λh) − ϕ(x) = lim , λ−→+∞ λ λ
where x is arbitrary in dom ϕ. Using a topological approach, we can extend the notions of recession cone and recession function to nonconvex sets and functionals. In what follows by τ we denote either the strong or the weak topology on X. DEFINITION 1.4.30 (a) If C ⊆ X is nonempty, then the τ -recession cone of C is the τ -closed cone (nonconvex in general) defined by τ -C∞ = h ∈ X :there exist {λn }n≥1 ⊆ R+ , {hn }n≥1 ⊆ X with τ λn −→ +∞, hn −→ h and λn hn ∈ C for all n ≥ 1 . (b) If ϕ : X −→ R is a proper function, then the τ -recession function of ϕ, is the function ϕ(λn xn ) τ τ -ϕ∞ (h) = inf lim inf : λn −→ +∞, hn −→ h n→∞ λn
PROPOSITION 1.4.31 If C ⊆ X is nonempty, then (a) If C is bounded, we have τ -C∞ = {0}. (b) τ -C∞ =
µ>0 λ>µ
1 (C λ
− x)
τ
where x is any point of X.
(c) τ -C∞ = τ -(C \ B)∞ = τ -(C ∪ B)∞ for any nonempty bounded set B ⊆ X.
1.5 Γ-Convergence
45
PROPOSITION 1.4.32 If ϕ, ψ : X −→ R are proper functions, then (a) τ -ϕ∞ is τ -lower semicontinuous and positively homogeneous of degree one. (b) (τ -ϕ∞ ) + (τ -ψ∞ ) ≤ τ -(ϕ + ψ)∞ . (c) If ϕ is bounded below, then τ -ϕ∞ ≥ 0. (d) If ϕ is weakly coercive (i.e., ϕ(x) −→ +∞ as x −→ ∞), then ϕ∞ is weakly coercive too and τ -ϕ∞ (h) > 0 for all h ∈ X \ {0}.
1.5 Γ-Convergence

In optimization and in the calculus of variations, we are given an objective functional ϕ : X −→ R and a constraint set C ⊆ X and we want to find the minimum value

m(ϕ, C) = inf { ϕ(x) : x ∈ C }     (1.54)

and to determine the set of minimizers (solutions of (1.54))

M(ϕ, C) = { x ∈ X : ϕ(x) = m(ϕ, C) }.

The purpose of this section is to develop the tools that allow us to determine the dependence of the value m(ϕ, C) and of the solution set M(ϕ, C) on the data of the problem (ϕ, C).
The mathematical setting in this section is the following. We are given a Hausdorff topological space (X, τ) with topology τ and ϕn : X −→ R, n ≥ 1, a sequence of proper functions. Additional hypotheses are introduced as needed. In what follows, given x ∈ X, by N(x) we denote the filter of neighborhoods of x.

DEFINITION 1.5.1 With the sequence {ϕn}n≥1 we associate two limit functions: the Γτ-lower limit of the sequence {ϕn}n≥1, defined by

(Γτ-lim inf_{n→∞} ϕn)(x) = sup_{U∈N(x)} lim inf_{n→∞} inf_{u∈U} ϕn(u),

and the Γτ-upper limit of the sequence {ϕn}n≥1, defined by

(Γτ-lim sup_{n→∞} ϕn)(x) = sup_{U∈N(x)} lim sup_{n→∞} inf_{u∈U} ϕn(u).

If Γτ-lim inf_{n→∞} ϕn = Γτ-lim sup_{n→∞} ϕn = ϕ, then we say that the sequence {ϕn}n≥1 Γτ-converges to ϕ and we write Γτ-lim_{n→∞} ϕn = ϕ.
REMARK 1.5.2 Evidently we always have that Γτ-lim inf_{n→∞} ϕn ≤ Γτ-lim sup_{n→∞} ϕn and both limit functions are τ-lower semicontinuous. In the above definition we can replace N(x) by a local base B(x). Also, when the topology τ used is clearly understood and no confusion is possible, we drop the subscript τ and simply
write Γ-lim inf_{n→∞} ϕn, Γ-lim sup_{n→∞} ϕn and Γ-lim_{n→∞} ϕn. If ϕn(x) = cn ∈ R for all x ∈ X and n ≥ 1, then Γτ-lim inf_{n→∞} ϕn = lim inf_{n→∞} cn and Γτ-lim sup_{n→∞} ϕn = lim sup_{n→∞} cn. Also, if ϕn = ϕ for all n ≥ 1, then Γτ-lim inf_{n→∞} ϕn = Γτ-lim sup_{n→∞} ϕn = ϕτ, where ϕτ is the τ-lower semicontinuous regularization of ϕ; that is,

ϕτ(x) = sup_{U∈N(x)} inf_{u∈U} ϕ(u),   or equivalently   epi ϕτ = cl_τ(epi ϕ).

In general, Γτ-convergence and pointwise convergence are distinct modes of convergence.
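A quick numerical check of this discrepancy may help. The following short Python sketch (an illustration only, not part of the theory) approximates the Γ-limit at x = 0 of the sequence ϕn(x) = nx e^{−n²x²} of Example 1.5.3(a) below by minimizing ϕn over a fixed small neighborhood of 0, and compares it with the pointwise value ϕn(0) = 0.

    import numpy as np

    def phi(n, x):
        # the sequence of Example 1.5.3(a): phi_n(x) = n x exp(-n^2 x^2)
        return n * x * np.exp(-(n * x) ** 2)

    xs = np.linspace(-0.5, 0.5, 200001)   # a fixed neighborhood U of x = 0
    for n in (10, 100, 1000):
        inf_U = phi(n, xs).min()          # approximates inf_{u in U} phi_n(u)
        print(n, phi(n, 0.0), inf_U)
    # phi_n(0) = 0 for every n (so the pointwise limit at 0 is 0), while the
    # infimum over U stabilizes near -1/sqrt(2e) ≈ -0.4289, the Γ-limit at 0.

Shrinking the neighborhood U does not change the limiting value of the infima, in accordance with Definition 1.5.1.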
EXAMPLE 1.5.3 (a) Let X = R and consider the sequence ϕn(x) = nx e^{−n²x²}, n ≥ 1. Then

(Γ-lim_{n→∞} ϕn)(x) = −1/√(2e) if x = 0,   and   = 0 if x ≠ 0,

while lim_{n→∞} ϕn(x) = 0 for all x ∈ R. So both the Γ-limit and the pointwise limit exist, but they differ.
(b) Let X = R and consider the sequence ϕn(x) = sin(nx), n ≥ 1. Then Γ-lim_{n→∞} ϕn = ϕ with ϕ(x) = −1 for all x ∈ R, but the pointwise limit of the sequence {ϕn}n≥1 does not exist.
(c) Let X = R and consider the sequence

ϕn(x) = nx e^{−n²x²} if n is even,   ϕn(x) = 2nx e^{−n²x²} if n is odd,   n ≥ 1.

Then ϕn(x) −→ 0 for all x ∈ R but the Γ-limit does not exist. In fact we have

(Γ-lim inf_{n→∞} ϕn)(x) = −√(2/e) if x = 0, = 0 if x ≠ 0,   and   (Γ-lim sup_{n→∞} ϕn)(x) = −1/√(2e) if x = 0, = 0 if x ≠ 0.

Next we produce a characterization of Γ-convergence in terms of the convergence of the epigraphs of the functions. For this purpose we need to introduce a mode of set convergence. The subject of set convergence is examined more systematically in Section 6.6.

DEFINITION 1.5.4 Let {Cn}n≥1 be a sequence of subsets of X. The Kτ-lower limit of the sequence {Cn}n≥1, denoted by Kτ-lim inf_{n→∞} Cn (or simply K-lim inf_{n→∞} Cn when there is no ambiguity about the topology on X), is defined as the set of all x ∈ X with the property that for every U ∈ N(x) we can find n0 = n0(U) ≥ 1 such that U ∩ Cn ≠ ∅ for all n ≥ n0.
The Kτ-upper limit of the sequence {Cn}n≥1, denoted by Kτ-lim sup_{n→∞} Cn (or K-lim sup_{n→∞} Cn), is defined by

Kτ-lim sup_{n→∞} Cn = ⋂_{n≥1} cl_τ ( ⋃_{k≥n} Ck ).

So x ∈ Kτ-lim sup_{n→∞} Cn if and only if for all U ∈ N(x) and for every n ≥ 1 we can find k ≥ n such that U ∩ Ck ≠ ∅.
If Kτ-lim inf_{n→∞} Cn = Kτ-lim sup_{n→∞} Cn = C, then we say that the sequence {Cn}n≥1 converges to C in the sense of Kuratowski and we write Kτ-lim_{n→∞} Cn = C (or K-lim_{n→∞} Cn = C).
REMARK 1.5.5 Clearly we always have that Kτ-lim inf_{n→∞} Cn ⊆ Kτ-lim sup_{n→∞} Cn and both sets are closed, possibly empty.

EXAMPLE 1.5.6 (a) Let X = R and

Cn = [0, 1/n] if n is even,   Cn = [1, 1 + 1/n] if n is odd;

then K-lim inf_{n→∞} Cn = ∅ and K-lim sup_{n→∞} Cn = {0, 1}.
(b) Let X = R and Cn = [0, 1/n] ∪ [n, +∞). Then Cn −→K {0}.

Recall that for C ⊆ X we have the indicator function

iC(x) = 0 if x ∈ C,   iC(x) = +∞ if x ∉ C.

PROPOSITION 1.5.7 If {Cn}n≥1 is a sequence of nonempty subsets of X and we set C̲ = Kτ-lim inf_{n→∞} Cn, C̄ = Kτ-lim sup_{n→∞} Cn, then i_{C̲} = Γτ-lim sup_{n→∞} i_{Cn} and i_{C̄} = Γτ-lim inf_{n→∞} i_{Cn}.
PROOF: We show the first equality; the proof is similar for the second one. Let ϕ = Γτ-lim sup_{n→∞} i_{Cn}. Clearly ϕ takes only the values 0 and +∞ (see Definition 1.5.1). So it suffices to show that ϕ(x) = 0 if and only if x ∈ C̲. According to Definition 1.5.4, x ∈ C̲ if and only if for every U ∈ N(x) we can find n0 ≥ 1 such that Cn ∩ U ≠ ∅ for all n ≥ n0. This is equivalent to inf_{u∈U} i_{Cn}(u) = 0 for all n ≥ n0. Therefore x ∈ C̲ if and only if lim sup_{n→∞} inf_{u∈U} i_{Cn}(u) = 0 for all U ∈ N(x). So we conclude that x ∈ C̲ if and only if ϕ(x) = 0; that is, ϕ = i_{C̲}.
The next theorem relates Γ-convergence and the Kuratowski convergence of the epigraphs.
THEOREM 1.5.8 If ϕn : X −→ R, n ≥ 1, is a sequence of proper functions and ϕ̲ = Γτ-lim inf_{n→∞} ϕn, ϕ̄ = Γτ-lim sup_{n→∞} ϕn, then epi ϕ̲ = Kτ-lim sup_{n→∞} epi ϕn and epi ϕ̄ = Kτ-lim inf_{n→∞} epi ϕn.
PROOF: Again we prove the first equality; the proof of the second is similar. We know that (x, λ) ∈ epi ϕ̲ if and only if ϕ̲(x) ≤ λ. According to Definition 1.5.1, ϕ̲(x) ≤ λ if and only if for every ε > 0 and every U ∈ N(x) we have

lim inf_{n→∞} inf_{u∈U} ϕn(u) < λ + ε.

This last inequality is equivalent to saying that for every ε > 0, every U ∈ N(x) and every k ≥ 1, we can find n ≥ k such that inf_{u∈U} ϕn(u) < λ + ε. This in turn is equivalent to (U × (λ − ε, λ + ε)) ∩ epi ϕn ≠ ∅. Because the sets U × (λ − ε, λ + ε) with ε > 0 and U ∈ N(x) form a local basis at (x, λ) in X × R with the product topology, we conclude that (x, λ) ∈ epi ϕ̲ if and only if (x, λ) ∈ Kτ-lim sup_{n→∞} epi ϕn (see Definition 1.5.4).
REMARK 1.5.9 This is the reason why many authors call Γ-convergence epigraphical convergence, and denote it by eτ-lim_{n→∞} ϕn, eτ-lim inf_{n→∞} ϕn and eτ-lim sup_{n→∞} ϕn.

We have already seen that Γ-convergence and pointwise convergence are in general distinct notions. We now determine the precise relation between them. First of all, directly from Definition 1.5.1 we have the following.

PROPOSITION 1.5.10 If ϕn : X −→ R, n ≥ 1, is a sequence of proper functions, then Γτ-lim inf_{n→∞} ϕn ≤ lim inf_{n→∞} ϕn and Γτ-lim sup_{n→∞} ϕn ≤ lim sup_{n→∞} ϕn. In particular, if Γτ-lim_{n→∞} ϕn = ϕ and lim_{n→∞} ϕn = ϕ̂ both exist, then ϕ ≤ ϕ̂.
PROPOSITION 1.5.11 If ϕn : X −→ R, n ≥ 1, is a sequence of proper functions that converge uniformly to ϕ, then Γτ-lim_{n→∞} ϕn = ϕτ, where ϕτ is the τ-lower semicontinuous regularization of ϕ.

PROOF: For every U ⊆ X nonempty and open, we have

lim_{n→∞} inf_{u∈U} ϕn(u) = inf_{u∈U} ϕ(u).

Therefore, for every x ∈ X,

sup_{U∈N(x)} lim_{n→∞} inf_{u∈U} ϕn(u) = sup_{U∈N(x)} inf_{u∈U} ϕ(u) = ϕτ(x),
⇒ Γτ-lim_{n→∞} ϕn = ϕτ.
REMARK 1.5.12 The uniform limit of τ-lower semicontinuous functions is τ-lower semicontinuous too; therefore, if each ϕn is τ-lower semicontinuous, then ϕ is τ-lower semicontinuous and so Γτ-lim_{n→∞} ϕn = ϕ.

The next two propositions explain the importance of monotone methods in optimization.

PROPOSITION 1.5.13 If ϕn : X −→ R, n ≥ 1, is an increasing sequence of proper functions, then Γτ-lim_{n→∞} ϕn = lim_{n→∞} ϕnτ = sup_{n≥1} ϕnτ, where ϕnτ denotes the τ-lower semicontinuous regularization of ϕn.
PROOF: If U ⊆ X is nonempty and open, then

lim_{n→∞} inf_{u∈U} ϕn(u) = sup_{n≥1} inf_{u∈U} ϕn(u).

So for every x ∈ X,

sup_{U∈N(x)} lim_{n→∞} inf_{u∈U} ϕn(u) = sup_{U∈N(x)} sup_{n≥1} inf_{u∈U} ϕn(u) = sup_{n≥1} sup_{U∈N(x)} inf_{u∈U} ϕn(u) = sup_{n≥1} ϕnτ(x).
REMARK 1.5.14 If {ϕn}n≥1 is an increasing sequence of τ-lower semicontinuous functions, then ϕn ↑ ϕ = sup_{n≥1} ϕn is τ-lower semicontinuous too and thus Γτ-lim_{n→∞} ϕn = lim_{n→∞} ϕn = ϕ. So we see that in this context Γτ-convergence and pointwise convergence coincide. Without the τ-lower semicontinuity of the ϕn's this is no longer true.

PROPOSITION 1.5.15 If ϕn : X −→ R, n ≥ 1, is a decreasing sequence of proper functions and ϕn(x) −→ ϕ(x) for all x ∈ X, then Γτ-lim_{n→∞} ϕn = ϕτ.
PROOF: If U ⊆ X is nonempty and open, we have

lim_{n→∞} inf_{u∈U} ϕn(u) = inf_{u∈U} inf_{n≥1} ϕn(u) = inf_{u∈U} ϕ(u),
⇒ (Γτ-lim_{n→∞} ϕn)(x) = ϕτ(x)

(see Definition 1.5.1).
To have a general result on the equivalence of Γτ-convergence and pointwise convergence, we need the following notion.

DEFINITION 1.5.16 Let ϕn : X −→ R, n ≥ 1, be a sequence of proper functions. We say that {ϕn}n≥1 is equi-τ-lower semicontinuous at x ∈ X if, given any ε > 0, we can find U ∈ N(x) such that

ϕn(u) ≥ ϕn(x) − ε   for all u ∈ U and all n ≥ 1.
We say that {ϕn }n≥1 is equi-τ -lower semicontinuous if it is equi-τ -lower semicontinuous at every x ∈ X.
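For instance, the sequence ϕn(x) = nx e^{−n²x²} of Example 1.5.3(a) cannot be equi-lower semicontinuous at x = 0, even though every ϕn is continuous: by Proposition 1.5.17 below, equi-lower semicontinuity at 0 would force the Γ-limit and the pointwise limit to agree there, whereas they equal −1/√(2e) and 0, respectively.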
PROPOSITION 1.5.17 If ϕn : X −→ R, n ≥ 1, is a sequence of proper functions that is equi-τ-lower semicontinuous at x ∈ X, then (Γτ-lim inf_{n→∞} ϕn)(x) = lim inf_{n→∞} ϕn(x) and (Γτ-lim sup_{n→∞} ϕn)(x) = lim sup_{n→∞} ϕn(x). So if {ϕn}n≥1 is equi-τ-lower semicontinuous, then Γτ-lim_{n→∞} ϕn = ϕ if and only if ϕn −→ ϕ pointwise.
PROOF: As before we prove the first equality; the proof of the other is similar. By virtue of Proposition 1.5.10 it suffices to show that

lim inf_{n→∞} ϕn(x) ≤ (Γτ-lim inf_{n→∞} ϕn)(x).     (1.55)

Exploiting the equi-τ-lower semicontinuity of {ϕn}n≥1 at x ∈ X, given ε > 0, we can find U ∈ N(x) such that ϕn(x) − ε ≤ inf_{u∈U} ϕn(u) for all n ≥ 1. Hence

lim inf_{n→∞} ϕn(x) − ε ≤ sup_{U∈N(x)} lim inf_{n→∞} inf_{u∈U} ϕn(u).

Let ε ↓ 0 to obtain (1.55).
If X is a Banach space and the ϕn's are convex functions, then we can use Theorem 1.2.7 to produce the following useful consequence of Proposition 1.5.17.

PROPOSITION 1.5.18 If X is a Banach space and ϕn : X −→ R, n ≥ 1, is a sequence of proper convex functions that is equibounded above in U ∈ N(x) (i.e., there exists M > 0 such that sup_{n≥1} sup_{u∈U} ϕn(u) ≤ M < +∞), then (Γτ-lim inf_{n→∞} ϕn)(x) = lim inf_{n→∞} ϕn(x) and (Γτ-lim sup_{n→∞} ϕn)(x) = lim sup_{n→∞} ϕn(x). So, if {ϕn}n≥1 is equibounded above in some neighborhood of every point x ∈ X, then Γτ-lim_{n→∞} ϕn = ϕ if and only if ϕn −→ ϕ pointwise.
REMARK 1.5.19 If X is finite-dimensional and the sequence {ϕn, ϕ}n≥1 consists of convex functions with values in R (hence continuous; see Proposition 1.2.5), then Γτ-lim_{n→∞} ϕn = ϕ if and only if ϕn −→ ϕ pointwise. This result fails if ϕ does not take values in R. Let X = R and consider the sequence ϕn(x) = |nx − 1|. Then ϕn −→Γτ ϕ = i_{{0}} and ϕn −→ ϕ̂ pointwise, where

ϕ̂(x) = 1 if x = 0,   ϕ̂(x) = +∞ if x ≠ 0.

Next, by endowing the space X with more structure, we produce some convenient sequential characterizations of Γτ-lim inf_{n→∞} ϕn and of Γτ-lim sup_{n→∞} ϕn.
PROPOSITION 1.5.20 If X is first countable and ϕ̲ = Γτ-lim inf_{n→∞} ϕn, ϕ̄ = Γτ-lim sup_{n→∞} ϕn, then
(a) For every x ∈ X and every sequence xn −→ x, we have

ϕ̲(x) ≤ lim inf_{n→∞} ϕn(xn);     (1.56)

and for every x ∈ X we can find a sequence xn −→ x in X such that

ϕ̲(x) = lim inf_{n→∞} ϕn(xn).     (1.57)

(b) For every x ∈ X and every sequence xn −→ x, we have

ϕ̄(x) ≤ lim sup_{n→∞} ϕn(xn);     (1.58)

and for every x ∈ X we can find a sequence xn −→ x in X such that

ϕ̄(x) = lim sup_{n→∞} ϕn(xn).     (1.59)
PROOF: (a) Let U ∈ N(x). Then we can find n0 = n0(U) ≥ 1 such that xn ∈ U for all n ≥ n0. We have

inf_{u∈U} ϕn(u) ≤ ϕn(xn)   for all n ≥ n0,
⇒ lim inf_{n→∞} inf_{u∈U} ϕn(u) ≤ lim inf_{n→∞} ϕn(xn),
⇒ ϕ̲(x) ≤ lim inf_{n→∞} ϕn(xn).

This proves (1.56). Next we prove (1.57). Let x ∈ X be such that ϕ̲(x) < +∞ and let {Uk}k≥1 be a local basis at x ∈ X such that Uk+1 ⊆ Uk for all k ≥ 1. Consider a sequence λk ↓ ϕ̲(x) in R such that λk > ϕ̲(x) for all k ≥ 1. We have

lim inf_{n→∞} inf_{u∈Uk} ϕn(u) < λk   for all k ≥ 1.

So we can find a strictly increasing sequence of integers n(k) ≥ 1 such that inf_{u∈Uk} ϕ_{n(k)}(u) < λk for all k ≥ 1. For every k ≥ 1 we can find uk ∈ Uk such that ϕ_{n(k)}(uk) < λk. Then we introduce the sequence

xn = uk if n = n(k) for some k ≥ 1,   xn = x otherwise.

Evidently xn −→ x and ϕ̲(x) = lim_{k→∞} λk ≥ lim inf_{k→∞} ϕ_{n(k)}(uk) ≥ lim inf_{n→∞} ϕn(xn). From the first part of the proof we know that the opposite inequality holds. So ϕ̲(x) = lim inf_{n→∞} ϕn(xn); that is, (1.57) holds.
(b) Inequality (1.58) follows as (1.56). So we prove (1.59). Again let x ∈ X be such that ϕ̄(x) < +∞ and consider {µk}k≥1 ⊆ R such that µk ↓ ϕ̄(x) and µk > ϕ̄(x) for all k ≥ 1. As before, for every k ≥ 1 we can find a strictly increasing sequence of integers n(k) ≥ 1 such that

inf_{u∈Uk} ϕn(u) < µk   for all n ≥ n(k).

So for every n ≥ n(k) we can find u_{n,k} ∈ Uk such that ϕn(u_{n,k}) < µk. We introduce the sequence

xn = x if n < n(1),   xn = u_{n,k} if n(k) ≤ n < n(k + 1).

Then xn −→ x and so ϕ̄(x) = lim_{k→∞} µk ≥ lim sup_{n→∞} ϕn(xn). Combining this with (1.58), we conclude that ϕ̄(x) = lim sup_{n→∞} ϕn(xn); that is, (1.59) holds.
COROLLARY 1.5.21 If X is first countable and ϕ = Γτ-lim_{n→∞} ϕn, then for every x ∈ X and every sequence xn −→ x in X we have

ϕ(x) ≤ lim inf_{n→∞} ϕn(xn),     (1.60)

and for every x ∈ X we can find a sequence xn −→ x in X such that

ϕ(x) = lim_{n→∞} ϕn(xn).     (1.61)

PROOF: Use (1.56) to obtain (1.60) and combine (1.57) and (1.59) to get (1.61).

REMARK 1.5.22 From Proposition 1.5.20 and Corollary 1.5.21 it follows that when X is first countable, then
(Γτ-lim inf_{n→∞} ϕn)(x) = min { lim inf_{n→∞} ϕn(xn) : xn −→ x in X },
(Γτ-lim sup_{n→∞} ϕn)(x) = min { lim sup_{n→∞} ϕn(xn) : xn −→ x in X },
and
(Γτ-lim_{n→∞} ϕn)(x) = min { lim_{n→∞} ϕn(xn) : xn −→ x in X }.

Also note that in Corollary 1.5.21, (1.60) is a stronger requirement than pointwise convergence because it must hold for every sequence xn −→ x and not only for xn = x. On the other hand, (1.61) is a weaker requirement than pointwise convergence, where we must have xn = x for all n ≥ 1. This explains why the two convergence notions are in general distinct.
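For instance, in Example 1.5.3(a) (with ϕn(x) = nx e^{−n²x²} and x = 0), the constant sequence xn = 0 only gives lim ϕn(0) = 0, whereas the sequence xn = −1/(n√2) −→ 0 satisfies ϕn(xn) = −1/√(2e) for every n; it is such a "recovery" sequence that realizes the minimum in the formulas above and produces the value (Γ-lim_{n→∞} ϕn)(0) = −1/√(2e).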
In the case of a sequence {Cn}n≥1 ⊆ 2^X, we have the following convenient sequential characterizations of the Kuratowski limits (see Definition 1.5.4 and Proposition 1.5.7).

PROPOSITION 1.5.23 If X is first countable and {Cn}n≥1 ⊆ 2^X, then

lim inf_{n→∞} Cn = { x ∈ X : x = lim xn, xn ∈ Cn for all n ≥ 1 }   and
lim sup_{n→∞} Cn = { x ∈ X : x = lim x_{nk}, x_{nk} ∈ C_{nk}, n1 < n2 < · · · < nk < · · · }.

REMARK 1.5.24 So lim inf_{n→∞} Cn consists of all limits of sequences with elements in Cn, n ≥ 1, and lim sup_{n→∞} Cn consists of all subsequential limit points of such sequences. When X is a metric space, we can write that

lim inf_{n→∞} Cn = { x ∈ X : lim_{n→∞} d(x, Cn) = 0 }   and   lim sup_{n→∞} Cn = { x ∈ X : lim inf_{n→∞} d(x, Cn) = 0 }.

Using Proposition 1.5.23, we obtain the following double sequence lemma, which is a useful analytical tool on many occasions.

PROPOSITION 1.5.25 If X is first countable,
{x_{mn}}_{m,n≥1} ⊆ X and lim_{m→∞} lim_{n→∞} x_{mn} = x, then we can find sequences of integers {m(n)}_{n≥1} and {n(m)}_{m≥1}, increasing (not necessarily strictly) to +∞, such that

x_{m n(m)} −→ x as m → ∞   and   x_{m(n) n} −→ x as n → ∞.

PROOF: Let An = {x_{mn}}_{m≥1} and xm = lim_{n→∞} x_{mn}. By virtue of Proposition 1.5.23 we have xm ∈ lim inf_{n→∞} An for all m ≥ 1. Because xm → x as m → ∞ and lim inf_{n→∞} An is closed (see Remark 1.5.5), we have x ∈ lim inf_{n→∞} An. Therefore x = lim_{n→∞} un with un ∈ An; hence un = x_{m(n) n} and so x = lim_{n→∞} x_{m(n) n}.
Next let Bm = {x_{mn}}_{n≥1}. Then xm ∈ cl(Bm) for all m ≥ 1 and so x = lim_{m→∞} xm ∈ lim inf_{m→∞} cl(Bm) = lim inf_{m→∞} Bm. Therefore we can find vm ∈ Bm, m ≥ 1, such that x = lim_{m→∞} vm. Then vm = x_{m n(m)} and so x = lim_{m→∞} x_{m n(m)}.
Now let us check the variational properties of Γ-convergence. We start with a topological version of the notion of coercivity.

DEFINITION 1.5.26 Let (X, τ) be a Hausdorff topological space with τ being its topology.
(a) A function ϕ : X −→ R is said to be coercive (resp., sequentially coercive) if for every λ ∈ R the set cl_τ{x ∈ X : ϕ(x) ≤ λ} is τ-countably compact (resp., τ-sequentially compact).
(b) A sequence of functions ϕn : X −→ R, n ≥ 1, is said to be equicoercive if for every λ ∈ R there exists a τ-closed, τ-countably compact set Kλ ⊆ X such that {ϕn ≤ λ} = {x ∈ X : ϕn(x) ≤ λ} ⊆ Kλ for all n ≥ 1.
REMARK 1.5.27 Evidently every sequentially coercive function ϕ is coercive. If X is a reflexive Banach space and ϕ : X −→ R is coercive for X with the weak topology, then ϕ(x) −→ +∞ as ‖x‖ −→ +∞. Some authors call this property weak coercivity and call ϕ coercive when ϕ(x)/‖x‖ −→ +∞ as ‖x‖ −→ +∞ (i.e., ϕ exhibits superlinear growth at infinity).

PROPOSITION 1.5.28 If ϕn : X −→ R, n ≥ 1, is a sequence such that −∞ < inf_{n≥1} ϕn(x) for all x ∈ X, then the sequence {ϕn}n≥1 is equicoercive if and only if there exists a τ-lower semicontinuous coercive function g : X −→ R such that g ≤ ϕn for all n ≥ 1.

PROOF: ⇒: Let g0(x) = inf_{n≥1} ϕn(x) and set g = g0τ (the τ-lower semicontinuous regularization of g0). Because {ϕn}n≥1 is equicoercive, we can find a family {Kλ}λ∈R of τ-closed, τ-countably compact sets such that {ϕn ≤ λ} ⊆ Kλ for all λ ∈ R, n ≥ 1. Given x ∈ {g ≤ λ} and ε > 0, we can find n = n(ε) ≥ 1 such that ϕn(x) ≤ λ + ε and so x ∈ Kλ+ε. Therefore {g ≤ λ} ⊆ ⋂_{ε>0} Kλ+ε and the latter set is τ-closed and τ-countably compact, which proves that g is τ-lower semicontinuous and coercive.
⇐: For every λ ∈ R and every n ≥ 1, {ϕn ≤ λ} ⊆ {g ≤ λ} = Kλ, and Kλ is τ-closed (because g is τ-lower semicontinuous) and τ-countably compact (because g is coercive). Therefore the sequence {ϕn}n≥1 is equicoercive.

Equicoercive families exhibit interesting variational stability properties.

THEOREM 1.5.29 If ϕn : X −→ R, n ≥ 1, is an equicoercive sequence, inf_{n≥1} ϕn(x) >
−∞ for all x ∈ X, and as before ϕ̲ = Γτ-lim inf_{n→∞} ϕn, ϕ̄ = Γτ-lim sup_{n→∞} ϕn, then ϕ̲ and ϕ̄ are both coercive and

min_{x∈X} ϕ̲(x) = lim inf_{n→∞} inf_{x∈X} ϕn(x).     (1.62)

Moreover, if ϕ = Γτ-lim_{n→∞} ϕn, then ϕ is coercive and

min_{x∈X} ϕ(x) = lim_{n→∞} inf_{x∈X} ϕn(x).     (1.63)

PROOF: By virtue of Proposition 1.5.28, we can find a τ-lower semicontinuous and coercive function g : X −→ R such that g ≤ ϕn for all n ≥ 1. Then clearly g ≤ ϕ̲ ≤ ϕ̄ and so both functions ϕ̲ and ϕ̄ are coercive and τ-lower semicontinuous (see Remark 1.5.2).
Next we prove (1.62). First we show that, because of coercivity, ϕ̲ attains its infimum on X. Indeed, if {xn}n≥1 ⊆ X is a minimizing sequence and lim ϕ̲(xn) = inf_X ϕ̲ = m < +∞, then we can find n0 ≥ 1 such that xn ∈ K_{m+1} for all n ≥ n0, with K_{m+1} τ-closed and τ-countably compact. So {xn}n≥1 has a cluster point x ∈ X and, because ϕ̲ is τ-lower semicontinuous, we have

m ≤ ϕ̲(x) ≤ lim inf_{n→∞} ϕ̲(xn) = m,
⇒ m = ϕ̲(x); that is, inf_X ϕ̲ = min_X ϕ̲.

Directly from Definition 1.5.1, we have

lim inf_{n→∞} inf_{x∈X} ϕn(x) ≤ min_{x∈X} ϕ̲(x).     (1.64)
Suppose that lim inf_{n→∞} inf_{x∈X} ϕn(x) < +∞ (otherwise we are done; see (1.64)). We can find a subsequence {nk}k≥1 and λ ∈ R such that

lim_{k→∞} inf_{x∈X} ϕ_{nk}(x) = lim inf_{n→∞} inf_{x∈X} ϕn(x) < λ.     (1.65)

Evidently we may assume that

inf_{x∈X} ϕ_{nk}(x) < λ   for all k ≥ 1.     (1.66)

Because of the equicoercivity of the sequence {ϕn}n≥1, we can find a τ-closed and τ-countably compact set Kλ such that {ϕ_{nk} ≤ λ} ⊆ Kλ for all k ≥ 1. Note that because of (1.66) all the sets {ϕ_{nk} ≤ λ} ⊆ Kλ, k ≥ 1, are nonempty. Hence

inf_{x∈X} ϕ_{nk}(x) = inf_{x∈Kλ} ϕ_{nk}(x).     (1.67)

Let x_{nk} ∈ Kλ be such that ϕ_{nk}(x_{nk}) = inf_{x∈Kλ} ϕ_{nk}(x) (it exists because Kλ is τ-closed, τ-countably compact, and ϕ_{nk} is τ-lower semicontinuous by virtue of the equicoercivity of the sequence {ϕn}n≥1). The sequence {x_{nk}}k≥1 ⊆ Kλ has a cluster point x ∈ Kλ. For every U ∈ N(x) there exists k0 ≥ 1 such that x_{nk} ∈ U for all k ≥ k0. Hence

inf_{u∈U} ϕ_{nk}(u) ≤ ϕ_{nk}(x_{nk})   for all k ≥ k0,
⇒ lim inf_{k→∞} inf_{u∈U} ϕ_{nk}(u) ≤ lim_{k→∞} ϕ_{nk}(x_{nk}) = lim inf_{n→∞} inf_{x∈X} ϕn(x)     (1.68)

(see (1.67) and (1.65)). Because U ∈ N(x) was arbitrary, from (1.68) it follows that

(Γτ-lim inf_{k→∞} ϕ_{nk})(x) ≤ lim inf_{n→∞} inf_{x∈X} ϕn(x),
⇒ min_{x∈X} (Γτ-lim inf_{k→∞} ϕ_{nk})(x) ≤ lim inf_{n→∞} inf_{x∈X} ϕn(x),
⇒ min_{x∈X} ϕ̲(x) ≤ lim inf_{n→∞} inf_{x∈X} ϕn(x)     (1.69)

(because ϕ̲ ≤ Γτ-lim inf_{k→∞} ϕ_{nk}). From (1.64) and (1.69) we obtain (1.62).
Finally, if ϕn Γτ-converges to ϕ, then directly from Definition 1.5.1 we have

lim sup_{n→∞} inf_{x∈X} ϕn(x) ≤ inf_{x∈X} ϕ(x).     (1.70)

From (1.62) and (1.70) we obtain (1.63).
To continue with the variational features of Γ-convergence, we introduce some relevant notation.
DEFINITION 1.5.30 Let ϕ : X −→ R be a proper function. The set of all minimizers of ϕ in X, denoted by M(ϕ), is defined by

M(ϕ) = { x ∈ X : ϕ(x) = inf_X ϕ }.

If ϕ does not attain its infimum, we look for ε-minimizers, ε > 0. So let ε > 0 be given. An ε-minimizer of ϕ in X is a point x ∈ X such that

ϕ(x) ≤ max { inf_X ϕ + ε, −1/ε }.

We denote the set of all ε-minimizers of ϕ in X by Mε(ϕ).

REMARK 1.5.31 If inf_X ϕ > −∞ and ε > 0 is small, then x ∈ Mε(ϕ) if and only if ϕ(x) ≤ inf_X ϕ + ε. So in the above definition of ε-minimizer the term −(1/ε) appears in order to take care of the case where inf_X ϕ = −∞. Note that M(ϕ) = ⋂_{ε>0} Mε(ϕ). However, the set Mε(ϕ) is nonempty for all ε > 0, whereas M(ϕ) may be empty.
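For example, take X = R and ϕ(x) = eˣ. Then inf_R ϕ = 0 is not attained, so M(ϕ) = ∅, while for every ε > 0 we have Mε(ϕ) = {x ∈ R : eˣ ≤ ε} = (−∞, ln ε], which is nonempty; and indeed ⋂_{ε>0} Mε(ϕ) = ∅ = M(ϕ).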
PROPOSITION 1.5.32 Suppose ϕ = Γτ-lim_{n→∞} ϕn. Then we have
(a) Kτ-lim sup_{n→∞} M(ϕn) ⊆ ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn) ⊆ M(ϕ).
(b) If ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn) ≠ ∅, then M(ϕ) ≠ ∅ and min_X ϕ = lim sup_{n→∞} inf_X ϕn.
(c) If ⋂_{ε>0} Kτ-lim inf_{n→∞} Mε(ϕn) ≠ ∅, then M(ϕ) ≠ ∅ and min_X ϕ = lim_{n→∞} inf_X ϕn.
PROOF: (a) For every ε > 0 and every n ≥ 1, M(ϕn) ⊆ Mε(ϕn); thus the first inclusion is obvious. Now let x ∈ ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn). Then, according to Definition 1.5.4, for every ε > 0, U ∈ N(x), and k ≥ 1, we can find n ≥ k such that U ∩ Mε(ϕn) ≠ ∅. So we obtain

lim inf_{n→∞} inf_{u∈U} ϕn(u) ≤ max { lim sup_{n→∞} inf_X ϕn + ε, −1/ε },
⇒ ϕ(x) ≤ lim sup_{n→∞} inf_X ϕn   (because ε > 0 was arbitrary).

But directly from Definition 1.5.1 we see that lim sup_{n→∞} inf_X ϕn ≤ inf_X ϕ. So ϕ(x) ≤ inf_X ϕ, hence x ∈ M(ϕ) and

min_X ϕ = lim sup_{n→∞} inf_X ϕn.     (1.71)

(b) Follows at once from (a) (see (1.71)).
(c) From the proof of (a) and Definition 1.5.1, we have

ϕ(x) ≤ lim inf_{n→∞} inf_X ϕn ≤ inf_X ϕ,
⇒ inf_X ϕ = lim_{n→∞} inf_X ϕn,

which, together with (b), proves (c).
If ϕ = Γτ-lim_{n→∞} ϕn is not identically +∞, then we can improve the conclusions of Proposition 1.5.32.
THEOREM 1.5.33 Suppose that ϕ = Γτ-lim_{n→∞} ϕn, ϕ is not identically +∞, and consider the following statements:
(i) ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn) ≠ ∅.
(ii) M(ϕ) ≠ ∅ and min_X ϕ = lim sup_{n→∞} inf_X ϕn.
(iii) M(ϕ) = ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn).
(iv) ⋂_{ε>0} Kτ-lim inf_{n→∞} Mε(ϕn) ≠ ∅.
(v) M(ϕ) ≠ ∅ and min_X ϕ = lim_{n→∞} inf_X ϕn.
(vi) M(ϕ) = ⋂_{ε>0} Kτ-lim inf_{n→∞} Mε(ϕn) = ⋂_{ε>0} Kτ-lim sup_{n→∞} Mε(ϕn).
Then (a) (i) ⇔ (ii) ⇒ (iii); and (b) (iv) ⇔ (v) ⇒ (vi).
lim sup inf ϕn < +∞. So given any ε > 0, we have n→∞
X
ϕ(x) −
ε ≤ inf ϕn X 2
for infinitely many ns.
(1.72)
From Definition 1.5.1, we see that for all U ∈ N (x), ε 1 lim sup inf ϕn < max ϕ(x) + , − , 2 ε n→∞ U ε 1 ⇒ inf ϕn < max ϕ(x) + , − for all n ≥ 1 large, U 2 ε 1 for infinitely many ns (see (1.72)). ⇒ inf ϕn < max ϕ(x) + ε, − U ε (1.73) ∅ for infinitely many ns and Therefore U ∩ Mε (ϕn ) = every ε > 0 and so x ∈ Kτ - lim sup Mε (ϕn ). So we conclude that M(ϕ) = Kτ - lim sup Mε (ϕn ) = ∅ e>0
n→∞
e>0
n→∞
and from this argument we also infer that (i) and (ii) both imply (iii). (b) In this case (1.72) holds for n ≥ 1 large and then arguing as above we obtain that (1.73) is valid for alln ≥ 1 large. Hence U ∩ Mε (ϕn ) = ∅ for all n ≥ 1 large and every ε > 0 and so x ∈ Kτ - lim inf Mε (ϕn ). This combined with (a) above and e>0
n→∞
Proposition 1.5.32, gives (iv) ⇔ (v) ⇒(vi).
COROLLARY 1.5.34 If ϕ = Γτ - lim ϕn , xn ∈ Mε (ϕn ) with εn ↓ 0 and x is a n→∞
cluster point of {xn }n≥1 , then x ∈ M(ϕ) and ϕ(x) = lim sup ϕn (xn ). Moreover, if n→∞
xn −→ x in X, then x ∈ M(ϕ) and ϕ(x) = lim ϕn (xn ). n→∞
58
1 Smooth and Nonsmooth Calculus
PROOF: Note that x ∈ Kτ - lim sup Mεn (ϕn ) and n→∞
Kτ - lim sup Mε (ϕn ) = ∅. n→∞
ε>0
Then from Theorem 1.5.33(a) we obtain the first part of the corollary. If xn −→ x in X, then x ∈ Kτ - lim inf Mεn (ϕn ) and so Kτ - lim inf Mε (ϕn ) = ∅. Then the n→∞
n→∞
ε>0
second part of the corollary follows from Theorem 1.5.33(b). We have another result on the variational stability of equicoercive sequences.
THEOREM 1.5.35 If ϕ = Γτ - lim ϕn , ϕ is not identically +∞, {ϕn }n≥1 is n→∞
equicoercive, and inf ϕn (x) > −∞ for all x ∈ X, then for every V neighborhood of n≥1
M(ϕ), we can find ε0 > 0 and n0 ≥ 1 such that M(ϕn ) ⊆ Mε0 (ϕn ) ⊆ V
for all n ≥ n0 .
Moreover, for every x ∈ M(ϕ), every U ∈ N (x), and every ε > 0, we can find n1 ≥ 1 such that Mε (ϕn ) ∩ U = ∅ for all n ≥ n1 . PROOF: Let λ ∈ R such that inf ϕ < λ. From Theorem 1.5.29, we know that ϕ is X
coercive and so we can find K ⊆ X τ -closed, and τ -countably compact such that {ϕn ≤ λ} ⊆ K
for all n ≥ 1 and {ϕ ≤ λ} ⊆ K.
(1.74)
Let µ= minc ϕ. Note that K ∩ V c is τ -closed and τ −countably compact. This K∩V
together with the τ -lower semicontinuity and coercivity of ϕ, imply that we can find x ∈ K ∩ V c such that µ = ϕ(x). Because x ∈ / M(ϕ), we have inf ϕ < ϕ(x) = µ. Also X
as in the proof of Theorem 1.5.29 we obtain µ = minc ϕ ≤ lim inf inf c ϕn . K∩V
n→∞ K∩V
From (1.74) we also have that λ ≤ inf c ϕn . Therefore X∩V
min{λ, µ} ≤ lim inf inf c ϕn . n→∞ X∩V
On the other hand using Theorem 1.5.29, we have lim inf ϕn = min ϕ < min{λ, µ} ≤ lim inf inf c ϕn .
n→∞ X
X
So we can find ε0 > 0 and n0 ≥ 1 such that 1 max inf ϕ + ε0 , − < inf c ϕn X X∩V ε0 for all n ≥ n0 . ⇒ Mε0 (ϕn ) ⊆ V
n→∞ X∩V
for all n ≥ n0 ,
Finally because {ϕn }n≥1 is equicoercive, from Theorem 1.5.29 we see that statement (v) in Theorem 1.5.33 is satisfied. So by virtue of that theorem statement (vi) holds, which by virtue of Definition 1.5.4 implies the conclusion of the theorem. Sometimes it is suitable to consider different topologies when defining the two limit functions for a sequence {ϕn }n≥1 .
1.5 Γ-Convergence
59
DEFINITION 1.5.36 (a) Let X be a Banach space and ϕn : X −→ R, n ≥ 1, a sequence of proper functions. We say that ϕn s Mosco converge to a function M ϕ : X −→ R, denoted by ϕn −→ ϕ, if the following two conditions hold. w
(i) For every x ∈ X and every xn −→ x, we have ϕ(x) ≤ lim inf ϕn (xn ). n→∞
(ii) For every x∈X, we can find a sequence xn −→ x such that ϕ(x) = lim ϕn (xn ). n→∞
There is a corresponding convergence of sets. (b) Let {Cn }n≥1 ⊆2X \ {∅}. We set s-lim inf Cn ={x ∈ X :x=lim xn , xn ∈ Cn for all n→∞
n ≥ 1} and w-lim sup Cn ={x ∈ X :x=w- lim xnk , xnk ∈ Cnk , n1 < n2 < . . . < nk < k→∞
n→∞
M
. . .}. We say that the Cn s Mosco converge to a set C, denoted by Cn −→ C, if C = s- lim inf Cn = w- lim sup Cn . n→∞
n→∞
REMARK 1.5.37 Clearly we always have s- lim inf Cn ⊆ w-lim sup Cn . n→∞
n→∞
The following theorem justifies Definition 1.5.36. Its proof can be found in Attouch [34, p. 295] or Hu–Papageorgiou [313, p. 754].

THEOREM 1.5.38 If X is a reflexive Banach space and {ϕn, ϕ}n≥1 ⊆ Γ0(X), then ϕn −→M ϕ in X if and only if ϕ*n −→M ϕ* in X*.
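As a one-dimensional illustration of Theorem 1.5.38: on X = R, the functions ϕn(x) = x²/n Mosco converge to ϕ ≡ 0 (take xn = x as recovery sequences), while their conjugates ϕ*n(x*) = n(x*)²/4 Mosco converge to ϕ* = i_{{0}}, the indicator function of the origin.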
REMARK 1.5.39 So convex conjugation (see Definition 1.2.15) is bicontinuous for the Mosco convergence from Γ0(X) onto Γ0(X*), when X is reflexive. An immediate consequence of this theorem is the following proposition.

PROPOSITION 1.5.40 If X is a reflexive Banach space, {ϕn, ϕ}n≥1 ⊆ Γ0(X) and ϕn −→M ϕ, then for every x* ∈ X* we can find a sequence x*n −→ x* in X* such that

inf_X (ϕn − x*n) −→ inf_X (ϕ − x*).

We conclude the section with some additional variational properties of the Mosco convergence. The proof of this proposition can be found in Denkowski–Migórski–Papageorgiou [195, p. 468].

PROPOSITION 1.5.41 If X is a reflexive Banach space and {ϕn, ϕ}n≥1 ⊆ Γ0(X), then
(a) If ϕn −→M ϕ and inf_X ϕ < λ, then {ϕn ≤ λ} −→M {ϕ ≤ λ}.
(b) If {ϕn ≤ λ} −→M {ϕ ≤ λ} for every λ ∈ R, then ϕn −→M ϕ.
1.6 Remarks 1.1: Various aspects of the differential calculus in Banach spaces can be found in the books of Cartan [131], Denkowski–Mig´ orski–Papageorgiou [194], Dieudonn´e [200], Vainberg [590], Zeidler [620], and the survey papers of Averbukh–Smolyanov [48] and Nashed [453]. 1.2: There are several books treating the theory of convex functions. We mention the books of Rockafellar [522], Hiriart-Urruty–Lemarechal [308, 309], Webster [602], Rockafellar–Wets [530] (convex functions defined on RN ), and of Laurent [370], Roberts–Varberg [516], Holmes [310], Barbu–Precupanu [59], Ekeland– Temam [222], Ioffe–Tichomirov [327], Giles [264], Aubin–Ekeland [39], Phelps [495], and Denkowski–Mig´ orski–Papageorgiou [194] (convex functions defined on Banach spaces and on locally convex spaces). In the books of Holmes [310], Giles [264], and Phelps [495], the emphasis is on the relations between the theory of convex functions and the theory of Banach spaces. Theorem 1.2.11 dates back to Mazur [417]. The duality theory of convex functions was initiated by Fenchel [241] for functions defined on RN and was extended to dual pairs of locally convex spaces by Brondsted [112], Moreau [442], and Rockafellar [528]. This duality theory can give the following useful result originally due to H¨ ormander [311]. PROPOSITION 1.6.1 If X is a locally convex space, then there is a bijective correspondence between nonempty, closed, convex sets and sublinear w(X ∗ , X)-lower semicontinuous functions on X ∗ with values in R. This correspondence maps the set C to σ(·; C) its support function. The systematic study of the subdifferential starts with the works of Moreau [441, 442] and Rockafellar [517, 522, 523]. The subdifferential theory of convex integral functionals can be found in Rockafellar [519, 525]. The work of Rockafellar was extended by Levin [376, 377, 378], and Castaing–Valadier [134]. Interesting connections between the subdifferentiability of convex functions and Banach space theory can be found in Giles [264] and Phelps [495]. 1.3: The subdifferential theory of locally Lipschitz functions started with the thesis of Clarke [148], who using Rademacher’s theorem gave the description of the subdifferential included in Theorem 1.3.12 for functions in RN . Soon thereafter Clarke [149, 152] extended the results to locally Lipschitz functions defined on a Banach space. Proposition 1.3.24 is due to Lebourg [371]. A comprehensive presentation of the theory together with applications in mathematical programming, optimal control, and calculus of variations can be found in Clarke [153, 154]. We mention that Rademacher’s theorem (see Theorem 1.3.11) can be extended to functions between Banach spaces as follows. DEFINITION 1.6.2 Let (G, +) be an Abelian Polish group and d an invariant metric on G compatible with the topology (therefore automatically complete). A universally measurable set C ⊆ G is Haar-null , if there exists a probability measure µ on G (not unique), such that χC ∗ µ = 0. Here χC is the characteristic function of the set C and (χC ∗ µ)(x) = G χC (x + y)dµ(y).
1.6 Remarks
REMARK 1.6.3 So every translation of C is µ-null. The measure µ is usually called the test measure. Then the infinite-dimensional extension of Theorem 1.3.11 reads as follows (see Christensen [145]). THEOREM 1.6.4 If X is a separable Banach space, Y is a Banach space with the RNP (Radon–Nikodym property; see Denkowski–Mig´ orski–Papageorgiou [194, p. 372]) and ϕ : X −→ Y is locally Lipschitz, then there exists a universally measurable set D ⊆ X such that X \ D is Haar-null and ϕD is Gˆ ateaux differentiable. 1.4: Tangent cones and normal cones have been introduced over the years in a variety of contexts and it is almost impossible today to give a full account of the various cones existing in the literature. We have concentrated on the two most profilic cones, the tangent cones TC (x) and TC (x) and their respective normal cones. The tangent cone TC (x) (see Definition 1.4.1) was introduced by Bouligand [93] for general sets. It is usually called the contingent cone and it exhibits good properties in the context of convex sets. The other tangent cone TC (x) (see Definition 1.4.17) was introduced by Clarke [148, 153] and is always convex without the set being convex. So duality techniques are readily available for this cone. The price we pay for this convenience is that often this cone reduces to the trivial one {0}. A detailed study of these cones and of related concepts can be found in Rockafellar–Wets [530] (finite-dimensional theory) and Aubin–Frankowska [40], and Hu–Papageorgiou [313] (infinite-dimensional theory). 1.5: Γ-convergence was introduced and studied by De Giorgi–Franzoni [190, 191]. The related Kτ -convergence of sets (see Definition 1.5.4), known as the Kuratowski convergence, was actually first used by Painlev´e in 1902 in his lectures on analysis in the Universit´e de Paris. However, because it was more systematically examined and popularized by Kuratowski [366, 368], it carries his name. The Γ-convergence is designed in such a way so as to be suitable in the study of the stability analysis of variational problems. For this reason it is said to be a variational convergence and can be found in Attouch [34], Dal Maso [171], Denkowski–Mig´ orski–Papageorgiou [195], Dontchev–Zolezzi [202], Hu–Papageorgiou [313], and Rockafellar–Wets (only in RN ) [530]. The Mosco convergence of sets and functions (see Definition 1.5.36) was introduced and studied by Mosco [445, 446], who proved the main result, which is Theorem 1.5.38 and which justifies the introduction of this alternative variational convergence. For more on the Mosco convergence we refer to Attouch [34], Dontchev–Zolezzi [202], and Hu–Papageorgiou [313]. We should also mention the multiple Γ-convergence which is defined on the product of two Hausdorff topological spaces and is convenient in the variational analysis of optimal control problems. We refer the reader to Buttazzo [126] for details.
2 Extremal Problems and Optimal Control
Summary. *This chapter is devoted to the study of various kinds of variational problems and of the mathematical tools needed to deal with them. We start with the so-called “direct method”, that leads to a systematic study of lower semicontinuous functionals. Then we consider optimization problems with constraints using the Lagrange multipliers methods. This method is closely connected to nonlinear eigenvalue problems. We also develop a general duality theory for convex optimization problems and study saddle points, KKM-multimaps, coincidence theorems, variational inequalities and the Fenchel duality theory. Subsequently, we present the Ekeland variational principle and some of its most remarkable consequences. We also show that it is equivalent to a whole family of other important results of nonlinear analysis. With these mathematical tools, we then pass to the study of calculus of variations problems (Euler equation and canonical Hamiltonian equations) and of optimal control problems (existence theory, relaxation and Pontryagin’s maximum principle).
Introduction This chapter is devoted to the study of variational problems and of the mathematical tools that are necessary to do this. So, in Section 2.1 we conduct a detailed investigation of the topological notion of lower semicontinuity. Since the work of Tonelli, the concept of lower semicontinuity has played a central role in the study of variational problems. Our study outlines the so-called direct method of the calculus of variations and also introduces the notion of relaxed functional which we encounter also in Section 2.6. In Section 2.2 we deal with infinite-dimensional optimization problems with constraints and we analyze them by developing the so-called method of Lagrange multipliers. This method is closely connected to nonlinear eigenvalue problems. Section 2.3 studies saddle points and develops a general duality theory for convex optimization problems. Saddle points appear in game theory and in general in control problems where the controllers exhibit conflicting interests. In connection with saddle points we also introduce and study KKM-maps which are useful in producing coincidence theorems for families of multifunctions and also lead to existence N.S. Papageorgiou, S.Th. Kyritsi-Yiallourou, Handbook of Applied Analysis, Advances in Mechanics and Mathematics 19, DOI 10.1007/b120946_2, © Springer Science+Business Media, LLC 2009
theorems for variational inequalities. Duality is at the core of convex analysis. In introductory functional analysis we encounter the first instances of duality with the separation theorems for convex sets, which was extended in Definition 1.2.15 with the introduction of the conjugate of a convex function. Here using a “perturbation” approach to a convex minimization problem, we associate a concave maximization one and investigate the precise relation between the two. As a special case, we present the so-called Fenchel duality theory. In Section 2.4 first we present the Ekeland variational principle and discuss some remarkable consequences of it. Then we introduce some other fundamental results of nonlinear analysis, namely the Caristi fixed point theorem, the Takahashi variational principle, and the drop theorem, and we show that they are all equivalent to the Ekeland variational principle. Finally we prove a principle concerning ordered spaces, which we show generates all the aforementioned results. In Section 2.5 we present some basic aspects of the calculus of variations. We consider problems with integral cost functional and fixed endpoints. We develop the Euler equation for an optimal state as well as the canonical Hamiltonian equations. In Section 2.6 we deal with optimal control problems. We limit ourselves to systems monitored by ordinary differential equations (lumped parameter systems). First we develop the existence theory for such optimal control problems. This theory reveals the importance of convex structure in the problem. When this convexity is missing, we need to augment the system in order to assure existence of optimal pairs. This is the process of relaxation. We present four relaxation methods that we show are equivalent under reasonable conditions on the data. Finally we develop necessary conditions for optimality in the form of a maximum principle. Our approach is based on the Ekeland variational principle.
2.1 Lower Semicontinuity

In Section 1.2 we examined in some detail proper, lower semicontinuous, convex functions defined on a locally convex space. In this section we drop the convexity condition and concentrate on lower semicontinuous functions. The reason for this is that such functions are involved in the so-called direct method of the calculus of variations. Already in Section 1.5 we had some first results involving the notion of lower semicontinuity. Here we conduct a systematic study of this concept. Our setting is purely topological. So let (X, τ) be a Hausdorff topological space, with τ denoting its topology. Additional conditions on X are introduced as needed. For x ∈ X, by N(x) we denote the filter of neighborhoods of x. Also recall that R* = R ∪ {±∞}. We start by recalling the definition of τ-lower semicontinuity (both local and global) of a function ϕ : X −→ R*.

DEFINITION 2.1.1 A function ϕ : X −→ R* is said to be τ-lower semicontinuous at x if

ϕ(x) = lim inf_{u→x} ϕ(u) = sup_{U∈N(x)} inf_{u∈U} ϕ(u).
If ϕ is τ -lower semicontinuous at every x ∈ X, then we say that ϕ is τ -lower semicontinuous. If −ϕ is τ -lower semicontinuous (at x), then we say that ϕ is τ -upper semicontinuous (at x).
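For instance, on X = R the function ϕ with ϕ(0) = −1 and ϕ(x) = 0 for x ≠ 0 is lower semicontinuous at every point, whereas the function ψ with ψ(0) = 1 and ψ(x) = 0 for x ≠ 0 is not lower semicontinuous at x = 0, since lim inf_{u→0} ψ(u) = 0 < 1 = ψ(0).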
REMARK 2.1.2 If the topology τ is clearly understood, then we drop τ from the definition and say that ϕ is lower semicontinuous (at x ∈ X). Also, if x ∈ X is a point where ϕ(x) > −∞, then ϕ is τ-lower semicontinuous at x if and only if for every λ ∈ R with λ < ϕ(x), we can find U ∈ N(x) such that λ < ϕ(u) for all u ∈ U.

PROPOSITION 2.1.3 For a function ϕ : X −→ R* the following statements are equivalent.
(a) ϕ is τ-lower semicontinuous.
(b) For every λ ∈ R, the set Lλ = {x ∈ X : ϕ(x) ≤ λ} is τ-closed.
(c) epi ϕ = {(x, λ) ∈ X × R : ϕ(x) ≤ λ} is closed in X × R.

PROOF: (a)⇒(b) We show that Lλ^c is open. Let x ∈ Lλ^c. Then λ < ϕ(x) and by virtue of Remark 2.1.2 we can find U ∈ N(x) such that λ < ϕ(u) for all u ∈ U, which shows that Lλ^c is open; hence Lλ is closed.
(b)⇒(a) If ϕ(x) = −∞, then it is clear from Definition 2.1.1 that ϕ is τ-lower semicontinuous at x ∈ X. So suppose that −∞ < ϕ(x) and let λ < ϕ(x). Note that x ∈ Lλ^c and, because by hypothesis Lλ is closed, we can find U ∈ N(x) such that U ⊆ Lλ^c, hence λ < ϕ(u) for all u ∈ U, which proves the τ-lower semicontinuity of ϕ at x ∈ X. So we have established the equivalence of statements (a) and (b).
Next let h : X × R −→ R* be defined by h(x, λ) = ϕ(x) − λ. Clearly ϕ is τ-lower semicontinuous on X if and only if h is lower semicontinuous on the product space X × R. Because epi ϕ = {(x, λ) ∈ X × R : h(x, λ) ≤ 0}, the equivalence of (a) and (b) established above implies the equivalence of (b) and (c).

COROLLARY 2.1.4 If ϕi : X −→ R*, i ∈ I, is a family of τ-lower semicontinuous functions, then
(a) sup_{i∈I} ϕi is τ-lower semicontinuous.
(b) If I is finite, then inf ϕi is τ -lower semicontinuous. i∈I
COROLLARY 2.1.5 A set C ⊆ X is τ -closed if and only if iC is τ -lower semicontinuous. From Definition 2.1.1 we see that ϕ : X −→ R∗ is τ -lower semicontinuous at x ∈ X; then for every xn −→ x in X, we have ϕ(x) ≤ lim inf ϕ(xn ). The converse is n→∞
not in general true. EXAMPLE 2.1.6 Let X be an infinite-dimensional Banach space furnished with the weak topology denoted by w. Also let C ⊆ X be a nonempty set which is sequentially w-closed but not w-closed. Then the indicator function iC satisfies w iC (x) ≤ lim inf iC (xn ) for every sequence xn −→ x in X, but it is not w-lower n→∞
semicontinuous. For example in X = l1 , due to the Schur property the set C = ∂B1 = {x ∈ l1 : xl1 = 1} is sequentially w-closed but it is not w-closed (in fact, w C = B 1 = {x ∈ l1 : xl1 ≤ 1}).
DEFINITION 2.1.7 We say that ϕ : X −→ R∗ is sequentially τ -lower semicontinuous at x ∈ X, if ϕ(x) ≤ lim inf ϕ(xn ) for every sequence xn −→ x in X. We say that n→∞
ϕ is sequentially τ -lower semicontinuous, if it is sequentially τ -lower semicontinuous at every x ∈ X. REMARK 2.1.8 Let τseq denote the topology of X whose sets are sequentially τ -closed; then ϕ : X −→ R∗ is sequentially τ -lower semicontinuous if and only if it is τseq -lower semicontinuous. In general the τseq topology is stronger than the τ -topology and so the notion of sequential τ -lower semicontinuous is more general than the τ -lower semicontinuity. Clearly τ = τseq if and only if X is first countable. So we can state the following proposition. PROPOSITION 2.1.9 If (X, τ ) is first countable, ϕ : X −→ R∗ and x ∈ X, then ϕ is τ -lower semicontinuous at x ∈ X if and only if ϕ is sequentially τ -lower semicontinuous at x ∈ X. The next theorem summarizes the direct method of the calculus of variations and underlines the importance of the notion of lower semicontinuity in variational analysis. The result is known as the Weierstrass theorem. THEOREM 2.1.10 If ϕ : X −→ R∗ is τ -coercive and τ -lower semicontinuous (resp., sequentially τ -coercive and sequentially τ -lower semicontinuous), then (a) There exists x ∈ X such that ϕ(x) = inf ϕ. X
(b) If {xn }n≥1 is a minimizing sequence and x is a cluster point of {xn }n≥1 (resp., x is a subsequential limit of {xn }n≥1 ), then ϕ(x) = inf ϕ. X
(c) If ϕ is not identically +∞, then every minimizing sequence has a cluster point (resp., a convergent subsequence). PROOF: If ϕ=+∞, then all x ∈ X minimize ϕ and so (a) and (b) hold. So assume that ϕ is not identically +∞. Let {xn }n≥1 be a minimizing sequence for ϕ. We have ϕ(xn ) ↓ inf ϕ < +∞. Let λ ∈ R be such that inf ϕ < λ. Then we X
X
can find n0 ≥ 1 such that ϕ(xn )≤λ for all n ≥ n0 . Due to the τ -coercivity (resp., sequential τ -coercivity) of ϕ, we have that {xn }n≥1 has a cluster point x ∈ X (resp., a subsequential limit point x ∈ X); see Definition 1.5.26. Then due to the τ -lower semicontinuity (resp. sequential τ -lower semicontinuity) of ϕ, we have inf ϕ ≤ ϕ(x) ≤ lim ϕ(xn ) = inf ϕ, n→∞
X
X
⇒ ϕ(x) = inf ϕ. X
COROLLARY 2.1.11 If X is a reflexive Banach space, ϕ ∈ Γ0 (X), and ϕ(x) −→ +∞ as x −→ ∞, then we can find x ∈ X such that −∞ < ϕ(x) = inf ϕ. X
PROOF: Note that due to the convexity of ϕ, the function is lower semicontinuous if and only if it is w-lower semicontinuous. Moreover, due to the reflexivity of X bounded sets are relatively w-compact (in fact relatively sequentially w-compact by the Eberlein–Smulian theorem).
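A typical application: if X is a reflexive Banach space, x* ∈ X*, and ϕ(x) = (1/2)‖x‖² − ⟨x*, x⟩, then ϕ is convex, continuous and satisfies ϕ(x) ≥ (1/2)‖x‖² − ‖x*‖‖x‖ → +∞ as ‖x‖ → ∞, so by Corollary 2.1.11 it attains its infimum on X (and, more generally, on any nonempty closed convex subset of X, by adding the corresponding indicator function).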
REMARK 2.1.12 If ϕ : X −→ R∗ is lower semicontinuous convex and there exists x0 ∈ X such that ϕ(x0 ) = −∞, then ϕ is nowhere finite on X. For this reason, when convexity is present, we consider functions with values in R = R ∪ {+∞}. DEFINITION 2.1.13 Let ϕ : X −→ R∗ . The τ -lower semicontinuous regularization (or the τ -lower semicontinuous envelope) of ϕ, is the function ϕ τ : X −→ R∗ defined by ϕ τ (x) = lim inf ϕ(u) = sup inf ϕ(u). u→x
U ∈N (x) u∈U
REMARK 2.1.14 It is clear from this definition that ϕ τ is τ -lower semicontinuous. In the next proposition we show that ϕ τ is the biggest τ -lower semicontinuous function majorized by ϕ. PROPOSITION 2.1.15 If ϕ : X −→ R∗ and S(ϕ) = h : X −→ R∗ : τ τ h is τ -lower semicontinuous, h ≤ ϕ , then epi ϕ = epi ϕ and ϕ (x) = sup h(x) : h ∈ S(ϕ) for all x ∈ X. PROOF: Evidently epi ϕ ⊆ epi ϕ τ . Let (x, λ) ∈ epi ϕ τ . Then ϕ τ (x) ≤ λ. So for every U ∈ N (x) and for every ε > 0 we have inf ϕ(u) ≤ ϕ τ (x) ≤ λ < λ+ε. So we can u∈U
find u ∈ U and µ ∈ (λ, λ+ε) such that ϕ(u) < µ. Hence U ×(λ−ε, λ+ε) ∩epi ϕ = ∅, from which it follows that (x, λ) ∈ epi ϕ. Therefore epi ϕ τ = epi ϕ. Now note that ϕ τ ∈ S(ϕ). Also if h ∈ S(ϕ), then epi ϕ ⊆ epi h and so epi ϕ τ = epi ϕ ⊆ epi h = epi h (see Proposition 2.1.3). Therefore h ≤ ϕ τ and so we conclude that ϕ τ (x) = sup h(x) : h ∈ S(ϕ) for all x ∈ X. τ COROLLARY 2.1.16 If ϕ : X −→ R∗ and λ ∈ R, then ϕ τ ≤ λ = ϕ≤µ . µ>λ
COROLLARY 2.1.17 If C ⊆ X, then iC
τ
= iC τ .
COROLLARY 2.1.18 If ϕ, ψ : X −→ R∗ , then τ
τ
(a) (ϕ + ψ) ≥ ϕ τ + ψ . τ (b) If ψ is continuous and finite everywhere, then (ϕ + ψ) = ϕ τ + ψ. The following proposition characterizes ϕ τ in terms of sequences and follows from Proposition 1.5.20 (see also Remark 1.5.2). PROPOSITION 2.1.19 If (X, τ ) is first countable, ϕ : X −→ R∗ and x ∈ X, then ϕ τ. (a) For every sequence xn −→ x in X we have ϕ τ (x) ≤ lim inf ϕ(xn ). n→∞
68
2 Extremal Problems and Optimal Control
(b) There exists a sequence xn −→ x in X such that ϕ τ (x) = lim ϕ(xn ). n→∞
The next theorem summarizes the relaxation method in the study of variational problems. THEOREM 2.1.20 If ϕ : X −→ R∗ is τ -coercive, then (a) ϕ τ is τ -coercive and τ -lower semicontinuous. (b) There exists x ∈ X such that ϕ τ (x) = inf ϕ τ . X
(c) min ϕ τ = inf ϕ. X
X
(d) Every cluster point of a minimizing sequence for ϕ is a minimizer of ϕ τ . (e) If (X, τ ) is first countable, then the minimizers of ϕ τ are the limits of minimizing sequences for ϕ. PROOF: (a) We already know that ϕ τ is τ -lower semicontinuous. The τ -coercivity of ϕ τ follows from Theorem 1.5.29 and Remark 1.5.2 (see also Proposition 2.1.19). (b) Follows from part (a) and Theorem 2.1.10. (c) Let h(x) = inf ϕ for all x ∈ X. Then h ∈ S(ϕ) and so h ≤ ϕ τ (see Proposition X
2.1.15). It follows that inf ϕ ≤ min ϕ τ . X
X
Because the opposite inequality is always true (recall that ϕ τ ≤ϕ), we conclude that min ϕ τ =inf ϕ. X
X
(d) Let {xn }n≥1 be a minimizing sequence for ϕ and x a cluster point of {xn }n≥1 . Then ϕ τ (x) ≤ lim sup ϕ τ (xn ) ≤ lim ϕ(xn ) = inf ϕ = min ϕ τ n→∞
n→∞
X
X
(see part (c)), ⇒ ϕ τ (x) = min ϕ τ . X
(e) Let x ∈ X be such that ϕ τ (x) = min ϕ τ . Then from Proposition 2.1.19 and (c), we can find xn −→ x such that
X
inf ϕ = ϕ τ (x) = lim ϕ(xn ). X
n→∞
REMARK 2.1.21 This theorem shows that the relaxed functional captures the asymptotic behavior of the minimizing sequences for ϕ. For convex functions in a Banach space, the relaxation process is simpler because by Mazur’s lemma for convex sets, the closures in the norm and weak topologies coincide. So we have the following.
2.1 Lower Semicontinuity
69
PROPOSITION 2.1.22 If X is a Banach space and ϕ : X −→ R is convex, then ϕ s = ϕ w (here by s we denote the strong topology on X and by w the weak topology on X). Next we prove a lower semicontinuity result for integral functionals defined on products of Lebesgue spaces. The mathematical setting is the following. Let (Ω, Σ, µ) be a complete, finite nonatomic measure space and ϕ : Ω × RN × Rm −→ R a function that satisfies the following hypotheses. H0 :
(i) ϕ is Σ × B(RN ) × B(Rm )-measurable with B(RN ) (resp., B(Rm )) being the Borel σ-field of RN (resp., of Rm ). (ii) For µ-almost all ω ∈ Ω, (x, u) −→ ϕ(ω, x, u) is lower semicontinuous. (iii) For µ-almost all ω ∈ Ω and all x ∈ RN , u −→ ϕ(ω, x, u) is convex.
Let 1 ≤ p, r ≤ +∞ and consider the integral function Iϕ (x, u) : Lp (Ω, RN ) × L (Ω, Rm ) −→ R∗ defined by
Iϕ (x, u) = ϕ ω, x(ω), u(ω) dµ for all (x, u) ∈ Lp (Ω, RN ) × Lr (Ω, Rm ). r
Ω
We want to establish a sequential lower semicontinuity result when we consider the strong topology on Lp (Ω, RN ) and the weak topology on Lr (Ω, Rm ) (the weak∗ topology on L∞ (Ω, Rm ) when r = +∞). To do this we need the following approximation result, whose proof can be found in Buttazzo [126, p. 42]. PROPOSITION 2.1.23 If ϕ: Ω × RN × Rm −→ R satisfies H0 above and one of the following conditions. (iv) For µ-almost all ω ∈ Ω, we can function u0 : RN −→ Rm
find a continuous such that the function x −→ ϕ ω, x, u0 (x) is finite and continuous; or (v) There exists a function ϑ : R −→ R such that lim
s→+∞
ϑ(s) = +∞ s
and ϑ(uRm ) ≤ ϕ(ω, x, u) for µ-a.a. ω ∈ Ω, all x ∈ RN and all u ∈ Rm , then we can find two sequences of Carath´eodory functions an : Ω × RN −→ Rm and cn : Ω × RN −→ R such that
ϕ(ω, x, u) = sup an (ω, x), u Rm ) + cn (ω, x) n≥1
for µ-almost all ω ∈ Ω and all x ∈ RN, all u ∈ Rm . REMARK 2.1.24 Recall that h: Ω × Rk −→ Rs (1 ≤ s, k) is a Carath´eodory function, if for all x ∈ Rk , ω −→ h(ω, x) is Σ-measurable and for µ-almost all ω ∈ Ω, x −→ h(ω, x) is continuous. Such a function is automatically Σ × B(Rk )measurable. If in the above proposition ϕ ≥ 0, then it is easy to see the following approximation for ϕ holds +
ϕ(ω, x, u) = sup an (ω, x), u Rm + cn (ω, x) n≥1
with an : Ω × RN −→ Rm and cn : Ω × RN −→ R bounded Carath´eodory functions.
70
2 Extremal Problems and Optimal Control
We can prove the following sequential lower semicontinuity result for the integral functional Iϕ (x, u), (x, u) ∈ Lp (Ω, RN ) × Lr (Ω, Rm ). THEOREM 2.1.25 If ϕ : Ω×RN ×Rm −→ R satisfies H0 and the following holds, “for every sequence {xn }n≥1 converging in Lp (Ω, RN ) and every sequence {un }n≥1 weakly converging in Lr (Ω, Rm ) (weakly∗ if r = +∞) such that
ϕ+ ω, xn (ω), un (ω) dµ ≤ c + ϕ− ω, xn (ω), un (ω) dµ, Ω
Ω
for some c > 0 and all n ≥ 1, the sequence
ϕ− ·, xn (·), un (·)
n≥1
⊆ L1 (Ω)
is relatively weakly compact,” then the integral function Iϕ is well-defined on Lp (Ω, RN ) × Lr (Ω, Rm ) with values in R and it is sequentially lower semicontinuous when we consider Lp (Ω, RN ) with the norm topology and Lr (Ω, Rm ) with the weak topology (weak∗ if r = +∞). PROOF: Let x ∈ Lp (Ω, RN ), u ∈ Lr (Ω, Rm ) and suppose that for some c > 0 we have
ϕ+ ω, x(ω), u(ω) dµ ≤ c + ϕ− ω, x(ω), u(ω) dµ. (2.1) Ω Ω
Then by hypothesis ϕ− ·, x(·), u(·) ⊆ L1 (Ω) and so using (2.1) we conclude that
1 ϕ ·, x(·), u(·) ⊆ L (Ω). If (2.1) is not satisfied, then
ϕ− ω, x(ω), u(ω) dµ < +∞ ϕ+ ω, x(ω), u(ω) dµ = +∞, and Ω
Ω
⇒ ϕ ω, x(ω), u(ω) dµ = +∞. Ω
So indeed Iϕ is well-defined with values in R = R ∪ {+∞}. We prove the desired sequential lower semicontinuity of Iϕ , in steps. Step 1: We assume that there exists a function ϑ : R −→ R such that lim
s→+∞
ϑ(s) = +∞ s
and
ϑ(uRm ) ≤ ϕ(ω, x, u)
(2.2)
for µ-a.a. ω ∈ Ω, all x ∈ RN , and all u ∈ Rm . Because of (2.2) we can apply Proposition 2.1.23 (see also Remark 2.1.24) and obtain that +
ϕ(ω, x, u) = sup an (ω, x), u RN + cn (ω, x) (2.3) n≥1
with an : Ω × R −→ R and cn : Ω × RN −→ R, n ≥ 1, bounded Carath´eodory functions. Recall that
+
an (ω, x), u Rm + cn (ω, x) dµ Ω
= sup (2.4) an (ω, x), u Rm + cn (ω, x) dµ : C ∈ Σ . N
m
C
So in view of (2.3) and (2.4) it suffices to show that for all n ≥ 1 and all C ∈ Σ
2.1 Lower Semicontinuity
(x, u) −→
an (ω, x(ω)), u(ω)
C
RN
71
+ cn ω, x(ω) dµ
is sequentially lower semicontinuous on Lp (Ω, RN ) × Lr (Ω, Rm ) the first space in the product furnished with the norm topology and the second with the weak topology (weak∗ topology if r = +∞). But this is immediate because an and cn are bounded Carath´eodory functions. Step 2: Now we assume that ϕ is bounded below. Without any loss of generality, we may assume that ϕ ≥ 0. Suppose that xn −→ x w w in Lp (Ω, RN ) and un −→ u in Lr (Ω, Rm ) (weakly∗ if r = +∞). Because un −→ u w in Lr (Ω, Rm ) (weakly∗ if r = +∞) and µ is finite, we have that un −→ u in L1 (Ω, Rm ) and so by the Dunford–Pettis theorem the sequence {un }n≥1 ⊆ L1 (Ω, Rm ) is uniformly integrable. So according to the De La Vall´ee–Poussin theorem we can find ϑ : R −→ R+ such that
ϑ(s) lim ϑ(un (ω)Rm )dµ ≤ 1 for all n ≥ 1. = +∞ and s→+∞ s Ω Let ε > 0 and set ϕε (ω, x, u) = ϕ(ω, x, u) + εϑ(uRm ). Because of Step 1, we have
ϕ ω, x(ω), u(ω) dµ ≤ ϕε ω, x(ω), u(ω) dµ Ω Ω
≤ lim inf ϕε ω, xn (ω), un (ω) dµ n→∞
Ω
≤ lim inf ϕ ω, xn (ω), un (ω) dµ + ε. n→∞
Ω
Because ε > 0 was arbitrary, we let ε ↓ 0 to conclude that
ϕ ω, x(ω), u(ω) dµ ≤ lim inf ϕ ω, xn (ω), un (ω) dµ. n→∞
Ω
Ω
Step 3: General case. w Again suppose that xn −→ x in Lp (Ω, RN ) and un −→ u in Lr (Ω, Rm ) (weakly∗ if r = +∞). For every k ≥ 1 we define
ϕk (ω, x, u) = max{ϕ(ω, x, u), −k}, gn (ω) = ϕ− ω, xn (ω), un (ω) and Cn,k = {ω ∈ Ω : gn (ω) > k}. Without any loss of generality we may assume that Iϕ (xn , un ) tends to a finite limit as n → ∞. So by virtue of the hypothesis of the theorem {gn }n≥1 ⊆ L1 (Ω) is relatively weakly compact. Then for every k ≥ 1, because of Step 2, we have Iϕ (x, u) ≤ Iϕk (x, u) ≤ lim inf Iϕk (xn , un ) n→∞
= lim inf Iϕ (xn , un ) + n→∞
gn (ω) − k dµ
Cn,k
≤ lim inf Iϕ (xn , un ) + lim sup n→∞
n→∞
gn (ω) − k dµ.
Cn,k
72
2 Extremal Problems and Optimal Control
The Dunford–Pettis theorem implies that {gn }{xn }n≥1 ≥1 ⊆ L1 (Ω) is uniformly integrable. So if we pass to the limit as k −→ ∞, we conclude that Iϕ (x, u) ≤ lim inf Iϕ (xn , un ). n→∞
COROLLARY 2.1.26 If Ω ⊆ Rk is a bounded domain with Lipschitz boundary and ϕ : Ω×RN ×RN k −→ R+ satisfies (i) ϕ is Borel measurable. (ii) For almost all z ∈ Ω, (x, u) −→ ϕ(z, x, u) is lower semicontinuous. (iii) For almost all z ∈ Ω and all x ∈ RN , u −→ ϕ(z, x, u) is convex,
then the integral functional x −→ Jϕ (x) = Ω ϕ z, x(z), Dx(z) dz is sequentially 1,1 (Ω, RN ). w-lower semicontinuous on Wloc REMARK 2.1.27 This result is optimal when N = 1 (the so-called scalar case), in the sense that the convexity of ϕ(z, x, ·) is a necessary condition for the sequential weak lower semicontinuity of Jϕ (·) on W 1,p (Ω), 1 ≤ p ≤ ∞. In the case N > 1 (the so-called vector case), convex integrands produce only a small subclass of the sequentially weak lower semicontinuous functionals on W 1,p (Ω, RN ) (see Section 2.5). We can have a version of Theorem 2.1.25 for integrands defined on infinitedimensional Banach spaces. The result is due to Balder [53] and a different proof based on Young measures can be found in Hu–Papageorgiou [316, p. 31]. THEOREM 2.1.28 If X is a separable Banach space, Y is a separable reflexive Banach space, and ϕ : Ω×X ×Y −→ R satisfies
(i) ϕ is Σ × B(X) × B(Y ) measurable with B(X) (resp., B(Y )) the Borel σ-field of X (resp., Y ) ; (ii) For µ-almost all ω ∈ Ω, (x, y) −→ ϕ(ω, x, y) is lower semicontinuous; (iii) For µ-almost all ω ∈ Ω and all x ∈ X, y −→ ϕ(ω, x, y) is convex; (iv) There exist M > 0 and ϑ ∈ L1 (Ω) such that for µ−almost all ω ∈ Ω and all (x, y) ∈ X × Y we have
ϕ(ω, x, y) ≥ ϑ(ω) − M xX + yY ,
then (x, u) −→ Iϕ (x, u) = Ω ϕ ω, x(ω), u(ω) dµ is sequentially lower semicontinuous on L1 (Ω, X) × L1 (Ω, Y )w , where by L1 (Ω, Y )w we denote the Lebesgue–Bochner space L1 (Ω, Y ) furnished with the weak topology.
2.2 Constrained Minimization Problems In many applied situations we are not just looking for the minimum of an objective functional on an open set U , but we want to determine the minimum of ϕ subject to certain restrictions on the points x ∈ U . In this section we examine such minimization problems with side conditions and develop the method of Lagrange multipliers.
2.2 Constrained Minimization Problems
73
DEFINITION 2.2.1 Let X, Y be Banach spaces, U ⊆ X and V ⊆ Y nonempty open sets and ϕ : U −→ V a Fr´echet differentiable map. (a) We say that x0 ∈ U is a critical point of ϕ if and only if ϕ (x0 ) ∈ L(X, Y ) is not surjective. (b) We say that x0 ∈ U is a regular point of ϕ if and only if ϕ (x0 ) ∈ L(X, Y ) is surjective. REMARK 2.2.2 If Y = R, then x0 ∈ U is a critical (resp., regular) point of the function ϕ : U −→ R only if ϕ (x0 ) = 0 (resp., ϕ (x0 ) = 0). If X = R, then x0 ∈ (a, b) is a critical point of ϕ if and only if ϕ (x0 ) = 0. EXAMPLE 2.2.3 Let X = H = a Hilbert space, A ∈ L(H, H) is a self-adjoint isomorphism (i.e., A−1 ∈ L(H, H)) and ψ : H −→ R is Fr´echet differentiable. Let ϕ : H −→ R be defined by ϕ(x) =
1 Ax, xH − ψ(x), 2
where by ·, ·H we denote the inner product of H. Evidently ϕ is Fr´echet differentiable and ϕ (x) = Ax − ψ (x). Therefore x0 ∈ H is a critical point of ϕ if and only if x0 = A−1 ψ (x0 ). If ψ(x) = λ 2x2H , λ ∈ R, then ψ (x 0 ) = λx0 and so x0 ∈ H is a critical point of the function ϕ(x) = 1 2 Ax, xH − λ 2x2H if and only if Ax0 = λx0 ; that is, x0 is an eigenvector of the operator A with eigenvalue λ. Next we extend Definition 1.4.1 to nonconvex sets C. DEFINITION 2.2.4 Let X be a Banach space and C ⊆ X nonempty. (a) A vector h ∈ X is said to be tangent to the set C at x0 ∈ C, if there exist ε > 0 and a map r : [0, ε] −→ X such that x0 + λh + r(λ) ∈ C
for all λ ∈ [0, ε]
and
r(λ)X −→ 0 as λ −→ 0+ . λ
(b) The set of all vectors h ∈ X that are tangent to the set C at x0 ∈ C form a closed cone (which is nonempty because it contains the origin) denoted by TC (x0 ). If this cone is a subspace, then it is called the tangent space to the set C at the point x0 ∈ C. We also need the following notion from Banach space theory. DEFINITION 2.2.5 Let X be a Banach space and Y a closed subspace of it. We say that a subspace V is a topological complement of Y if (a) V is closed and (b) Y ∩ V = {0} and Y + V = X. In this case we write X = Y ⊕ V .
74
2 Extremal Problems and Optimal Control
REMARK 2.2.6 A closed subspace Y of X admits a topological complement if and only if there exists a projection operator onto Y ; that is, there exists pY ∈ L(X) such that pY Y = idY . It is easy to see that if Y is finite-dimensional, then it admits
a topological complement. Similarly, if Y has finite codimension (i.e., dim X/Y < +∞), then Y has a topological complement. In a Hilbert space every closed subspace admits a topological complement (consider the orthogonal complement of Y ). In a Banach space that is not isomorphic to a Hilbert space, we can always find a closed subspace which has no topological complement. Now let X, Y be two Banach spaces and ψ : X −→ Y a map. We consider the following constraint set C = {x ∈ X : ψ(x) = 0}. The next theorem produces a very convenient characterization of the tangent space TC (x0 ), x0 ∈ C. The result is known as Ljusternik’s theorem. THEOREM 2.2.7 If X, Y are Banach spaces, U ⊆ X is a nonempty open set, ψ : U −→ Y is continuously Fr´echet differentiable,
C = {x ∈ U : ψ(x) = 0}, and N ψ (x ) = ker ψ (x0 ) has a topological x0 ∈ C is a regular point of ϕ at which 0 complement, then TC (x0 ) = N ψ (x0 ) .
PROOF: Let V = N ψ (x0 ) . This is a closed subspace of X that by hypothesis has a topological complement W ⊆ X. So we can find pV , pW ∈ L(X) such that V = pV (X) = N (pW ), W = pW (X) = N (pV ), p2V = pV , p2W = pW (i.e., they are linear projection operators) and
pV + pW = idX
(i.e., X = V ⊕ W ). Because U ⊆ X is an open set, we can find r > 0 such that x0 + Br (0) + Br (0) ⊆ U
(recall that Br (0) = x ∈ X : xX < r ). Let f : V ∩ Br (0) × W ∩ Br (0) −→ Y be defined by
f (v, w) = ψ(x0 + v + w)
for all v ∈ V ∩ Br (0) and all w ∈ W ∩ Br (0).
The function f has the following properties. f (0, 0) = ψ(x0 ) = 0 and and
with f ∈ C (V × W, Y ) f2 (0, 0) = ψ (x0 ) W ∈ L(W, Y ). 1
f1 (0, 0)
= ψ (x0 )V = 0
(2.5)
(2.6)
Because f2 (0, 0) ∈ L(W, Y ) is bijective, from Banach’s theorem we know that f2 (0, 0)−1 ∈ L(Y, W ). These properties of f permit the use of the implicit function theorem (see Theorem 1.1.23), which gives a 0 < δ < r and a uniquely determined continuously Fr´echet differentiable g : V ∩ Bδ (0) −→ W such that
2.2 Constrained Minimization Problems
f v, g(v) = 0 and
g(0) = 0, g (0) =
for all v ∈ V ∩ Bδ (0) f2 (0, 0)−1 f1 (0, 0).
75 (2.7) (2.8)
From (2.6) and (2.8), we have that g (0) = 0 and so lim
v→0
g(v)Y = 0. vX
(2.9)
If we define ξ : x0 + V ∩ Bδ (0) −→ X by ξ(x0 + v) = x0 + v + g(v), then clearly ξ is continuous. Moreover, from (2.7) we have
0 = f v, g(v) = ψ x0 + v + g(v) , hence ξ has values in C. Note that v, g(v) are in complementary subspaces of X. Hence ξ isinjective and so we can consider its inverse on D = x0 + v + g(v) : v ∈ V ∩ Bδ (0) ⊆ C. So
ξ −1 x0 + v + g(v) = x0 + v = x0 + pV v + g(v) , ⇒ ξ −1 is continuous too. Therefore ξ is a homeomorphism of U = x0 + V ∩ Bδ (0) onto D ⊆ C. This together with (2.9) imply that TC (x0 ) = N ψ (x0 ) .
REMARK 2.2.8 If every x0 ∈ C is a regular point and N ψ (x0 ) admits a topological complement, then C = {x ∈ X : ψ(x) = 0} is a C 1 -manifold. If in addition ψ ∈ C r (U, Y ), then C is a C r -manifold. Note that the rather technical hypothesis that N ψ (x0 ) admits a topological complement is automatically satisfied in the following three cases. (a) Y is finite-dimensional. (b) X = H = a Hilbert space. (c) For all x ∈ U ϕ (x) ∈ L(X, Y ) is a Fredholm operator. Cases (a) and (b) follow from Remark 2.2.6, and Case (c) is considered in Section 3.1. COROLLARY 2.2.9 If X, Y, Z are Banach spaces, U ⊆ X is nonempty open, ψ ∈ C 1 (U, Y ), C = {x ∈ U : ψ(x) = 0}, x0 ∈ C is a regular point of ϕ at which N ψ (x0 ) admits a topological complement, and ϕ : U −→ Z is Fr´echet differentiable at x0 , then there exists a neighborhood U0 of the origin in TC (x0 ), a neighborhood D0 of x0 in C, and a continuously Fr´ echet differentiable homeomorphism
g of U0 onto D0 such that ϕ g(u) = ϕ(x0 ) + ϕ (x0 )u + o(uX ) with o(uX ) uX −→ 0 as xX −→ 0. Using Theorem 2.2.7, we can produce necessary conditions for the existence of solutions for a constrained minimization problem. This theorem is the first result establishing the well-known method of Lagrange multipliers.
76
2 Extremal Problems and Optimal Control
THEOREM 2.2.10 If X, Y are Banach spaces, U ⊆ X is a nonempty open set, ψ ∈ C 1 (U, Y ), ϕ : U −→ R is Fr´echet C = {x ∈ U : ψ(x) = 0}, x0 ∈ C
differentiable, is a regular point of ψ at which N ψ (x0 ) admits a topological complement, and x0 is a local minimizer of ϕ on the constraint set C, then there exists y0∗ ∈ Y ∗ such that ϕ (x0 ) = y0∗ ◦ ψ (x0 ).
PROOF: Let V = N ψ (x0 ) and let W be its topological complement (it exists by hypothesis). Because x0 ∈ C is a regular point of ψ, K = ψ (x0 ) ∈ L(W, Y ) W
is an isomorphism (i.e., K −1 ∈ L(Y, W ) by Banach’s theorem). According to Theorem 2.2.7 we can find D0 a neighborhood of x0 in C such that for every x0 ∈ D0 , we have
x = x0 + u + g(u) for all u ∈ V ∩ Bδ (0), V = N ψ (x0 )
with g(u)X uX −→ 0 as u −→ 0 in X. Because x0 is a local minimizer of ϕ on C, we can find 0 < δ1 ≤ δ such that
ϕ(x0 ) ≤ ϕ x0 + u + g(u) for all u ∈ V ∩ Bδ (0), for all u ∈ V ∩ Bδ1 (0), ⇒ 0 ≤ ϕ (x0 ), u + g(u) X + ox0 (u + g(u)X ) ⇒ ϕ (x0 ), u X = 0 for all u ∈ V ∩ Bδ1 (0) (recall g is a homeomorphism),
(2.10) ⇒ V ⊆ N ϕ (x0 ) = ker ϕ (x0 ). Let pW ∈ L(X) be the canonical projection on W , the topological complement of V . Because of (2.10) we can define ϕ (x0 ) ∈ W ∗ by for all x ∈ X. ϕ (x0 ), pW x W = ϕ (x0 ), x X Then y0∗ = ϕ (x0 ) ◦ K −1 ∈ Y ∗ . If pV ∈ L(X) is the canonical projection of X onto V , then for every x ∈ X we have x = pV x + pW x and so ∗ y0 ◦ ψ (x0 ), x X = y0∗ ◦ ψ (x0 ), pW x X = y0∗ ◦ K, pW xX = ϕ (x0 ), pW x X = ϕ (x0 ), x X
⇒ y0∗ ◦ ψ (x0 ) = ϕ (x0 )
in X ∗ .
REMARK 2.2.11 According to Theorem 2.2.10, x0 ∈ C is a critical point of the function L(x) = ϕ(x) − (y0∗ ◦ ψ)(x), known as the Lagrangian function of the constrained minimization problem inf ϕ. Also the element y ∗ ∈ Y ∗ is a Lagrange C
multiplier . COROLLARY 2.2.12 If U ⊆ RN is a nonempty open set, ϕ : U −→ R is Fr´echet differentiable, ψ ∈ C 1 (U, Rm ), C = {x ∈ U : ψ(x) = 0}, x0 ∈ C is a regular point
2.2 Constrained Minimization Problems
77
m of ψ, and it is also a local minimizer of ϕ on C, then we can find λ = (λk )m k=1 ∈ R such that m ∂ϕ ∂ψ (x0 ) = λk (x0 ) for all i = 1, . . . , N. ∂xi ∂xi k=1
EXAMPLE 2.2.13 Let X = H a Hilbert space and A ∈ L(H) a self-adjoint operator. Let ϕ : H −→ R be defined by ϕ(x) = Ax, xH . Also let ψ : H −→ R+ be defined by ψ(x) = x2H − 1. Suppose that x0 ∈ H, x0 H = 1 is a minimizer of ϕ on ∂B(0) = {x ∈ H : ψ(x) = 0}. According to Theorem 2.2.10 (with Y = R), we can find λ ∈ R such that ϕ (x0 ) = λψ (x0 ), hence 2A(x0 ) = 2λx0 and so A(x0 ) = λx0 . In other words x0 ∈ H is an eigenvector of A for the eigenvalue λ. This example shows how we can establish the existence of eigenvalues for bounded self-adjoint linear operators on a Hilbert space. Now we want to improve Theorem 2.2.10 by weakening the hypotheses on the constraint function ψ. To do this we need the following generalization of Ljusternik’s theorem (see Theorem 2.2.7) due to Ioffe–Tichomirov [327, p. 34], where the interested reader can find its proof. THEOREM 2.2.14 If X, Y are Banach spaces, U ⊆ X is a nonempty open set, ψ : U −→ R is Fr´echet differentiable, C = {x ∈ U : ψ(x)
= 0} and x0 ∈ C is a regular point of ψ (i.e., ψ (x0 )(X) = Y ), then TC (x0 ) = N ψ (x0 ) . REMARK 2.2.15 Note
that compared to Theorem 2.2.7, we have dropped the splitting property for N ψ (x0 ) . We also need the following general result from operator theory, which underlines the idea of the Lagrange multiplier method. PROPOSITION 2.2.16 If X, Y are Banach spaces, A ∈ L(X, Y ), x∗0 ∈ X ∗ , R(A) is closed and N (A) ⊆ N (x∗0 ), then there exists y0∗ ∈ Y ∗ such that x∗0 + y0∗ ◦ A = 0. Moreover, if A(X) = Y , then y0∗ is unique. PROOF: We know that R(A∗ ) = N (A)⊥ = x∗ ∈ X ∗ : x∗ , xX = 0 for all x ∈ N (A) . By hypothesis x∗0 ∈ N (A)⊥ . So it follows that x∗0 = −A∗ y0∗ for some y0∗ ∈ Y ∗ . Therefore (2.11) x∗0 + y0∗ ◦ A = 0. If R(A) = Y , then N (A∗ ) = R(A)⊥ = y ∗ ∈ Y ∗ : y ∗ , yY = 0 for all y ∈ R(A) = {0}. So A∗ is injective and by (2.11) this means that y0∗ is uniquely determined by x∗0 . COROLLARY 2.2.17 If X, Y are Banach spaces, A ∈ L(X, Y ), x∗0 ∈ X ∗ , R(A) is closed, and R(A) = Y , then there exists y0∗ ∈ Y ∗ , y0∗ = 0 such that y0∗ ◦ A = 0. PROOF: Let y ∈ Y such that y ∈ / R(A). Then we can find y0∗ ∈ Y ∗ with y0∗ Y ∗ = 1 such that y0∗ , yY = 0
and
y0∗ , uY = 0
for all u ∈ R(A).
78
2 Extremal Problems and Optimal Control Therefore it follows that y0∗ = 0 and y0∗ ◦ A = 0.
We are led to the following generalization of Theorem 2.2.10. THEOREM 2.2.18 If X, Y are Banach spaces, U ⊆ X is a nonempty open set, 1 ϕ : U −→ R Fr´
is echet differentiable, ψ ∈ C (U, Y ), C = {x ∈ U : ψ(x) = 0}, x0 ∈ C, R ψ (x0 ) is closed, and x0 ∈ C is a local minimizer of ϕ on C, then there exist λ0 ∈ R and y0∗ ∈ Y ∗ not both equal to zero such that λ0 ϕ (x0 ) = y0∗ ◦ ψ (x0 ).
(2.12)
PROOF: In the degenerate case R ψ (x0 ) = Y , (2.12) is true with λ0 = 0 by virtue of Corollary 2.2.17.
If R ψ (x0 ) =Y , then provided that N ψ (x0 ) admits a topological complement, the validity of (2.12) with λ0 =1 is a consequence
of Theorem 2.2.10. Finally, if we do not have the splitting of N ψ (x0 ) , then the result follows if we argue as in the proof of Theorem 2.2.10 using this time Theorem 2.2.14. Theorems 2.2.10 and 2.2.18 indicate that there is a close relation between Lagrange multipliers and constrained critical points of a function ϕ. DEFINITION 2.2.19 Let X be a Banach space, U ⊆ X nonempty open, ϕ : U −→ R a Fr´echet differentiable map, C ⊆ X a nonempty closed set and x0 ∈ C. We say that x0 is a critical point of ϕ subject to the constraint C, if for all curves u : (−ε, ε) −→ X such that u(t) ∈ C for all t ∈ (−ε, ε), u(0) = x0 , u (0) exists, we have d
= 0. ϕ u(t) dt t=0 REMARK 2.2.20 If x0 ∈ int C (the interior taken in X), then x0 is a usual critical point of ϕ (also known as the free critical point of ϕ). A critical point that is not a local extremum of ϕ (i.e., it is not a local minimizer or a local maximizer of ϕ), is called a saddle point. So x0 ∈ C is a saddle point of ϕ subject to the constraint C, if for every U ∈ N (x0 ) = filter of neighborhoods of x0 , we can find y, v ∈ C ∩ U such that ϕ(y) < ϕ(x0 ) < ϕ(v). The next theorem establishes the connection between constrained critical points and Lagrange multipliers. THEOREM 2.2.21 If X, Y are Banach spaces, U ⊆ X is nonempty open, ϕ : U −→ R is Fr´echet differentiable, ψ ∈ C 1 (U, Y ), C = {x ∈ X : ψ(x) = 0}, and x0 ∈ C is a regular point of ϕ at which N ϕ (x0 ) admits a topological complement, then x0 is a critical point of ϕ subject to the constraint C if and only if we can find y0∗ ∈ Y ∗ such that ϕ (x0 ) = y0∗ ◦ ψ (x0 ). PROOF:
⇒: From Definition 2.2.19 it follows that ϕ
(x0 ),hX = 0 for all h ∈ N ψ (x0 ) . But from Theorem 2.2.10 we have that N ψ (x0 ) = TC (x0 ). Invoking Proposition 2.2.16, we can find y0∗ ∈ Y ∗ such that ϕ (x0 ) = y0∗ ◦ ψ (x0 ).
2.3 Saddle Points and Duality
79
⇐: Let u : (−ε, ε) −→ X be a curve as in Definition 2.2.19. Then ψ u(t) = 0 for all t ∈ (−ε, ε) and ψ (x0 )u (0) = 0. It follows that ϕ (x0 ), u (0)X = 0, hence
= 0 (by the chain rule). So x0 ∈ C is a critical point of ϕ subject (d dt) ϕ u(t) t=0
to the constraint C.
We conclude this section with the so-called Dubovickii–Milyutin theorem, which is crucial for the development of the Dubovickii–Milyutin formalism for the analysis of constrained optimization problems. For the proof we refer to Girsanov [266, p. 37]. n+1 THEOREM 2.2.22 If X is a Banach space, Km m=1 are nonempty convex n+1 n cones in X, and Km m=1 are open, then Km = ∅ if and only if we can find m=1 ∗ x∗m ∈ Km = x∗ ∈X ∗ : x∗ , xX ≥ 0 for all x ∈ Km , m = 1, . . . , n + 1, not all zero, n+1 ∗ xm=0. such that m=1
2.3 Saddle Points and Duality Recall (see Remark 2.2.20), that a saddle point is a critical point of a function that is neither a local minimum nor local maximum. In this section we define a saddle point of a function on a product space C × D. This new concept that we introduce is independent of the notion introduced in the previous section and it is global in nature. DEFINITION 2.3.1 Let C, D be two nonempty sets and ϕ : C × D −→ R a function. We say that (x0 , y0 ) is a saddle point of ϕ, if we have ϕ(x0 , y) ≤ ϕ(x0 , y0 ) ≤ ϕ(x, y0 )
for all (x, y) ∈ C × D.
REMARK 2.3.2 The terminology of Definition 2.3.1 is justified by picturing the function ϕ(x, y) = x2 − y 2 in R3 (a hyperbolic paraboloid). For this function (0, 0) is a saddle point. Note that we always have sup inf ϕ(x, y) ≤ inf sup ϕ(x, y).
y∈D x∈C
x∈C y∈D
(2.13)
If equality holds in (2.13), then the common value is called the saddle value of ϕ on C ×D. DEFINITION 2.3.3 Let C, D be two nonempty sets and ϕ : C ×D −→ R a function. We say that ϕ satisfies a minimax equality on C × D, if the following three conditions hold. (a) ϕ has a saddle value (i.e., equality holds in (2.13)).
80
2 Extremal Problems and Optimal Control
(b) There is x0 ∈ C such that sup ϕ(x0 , y) = inf sup ϕ(x, y). x∈C y∈D
y∈D
(c) There is y0 ∈ D such that inf ϕ(x, y0 ) = sup inf ϕ(x, y). x∈C
y∈D x∈C
REMARK 2.3.4 In this case the inf and sup operations can be replaced by min and max, respectively, and we can write that max min ϕ(x, y) = min max ϕ(x, y). y∈D x∈C
x∈C y∈D
This is the reason for the terminology minimax equality. The following proposition is an immediate consequence of Definitions 2.3.1 and 2.3.3. PROPOSITION 2.3.5 If C, D are nonempty sets and ϕ : C×D −→ R, then ϕ has a saddle point in C × D if and only if ϕ satisfies a minimax equality on C ×D. DEFINITION 2.3.6 Let X be a vector space and B ⊆ X a nonempty set. A multifunction (set-valued function) F : B −→ 2X \ {∅} is said to be a Knaster– Kuratowski–Mazurkiewicz map (a KKM-map for short), if for every finite set m xk k=1 m ! m conv xk k=1 ⊆ F (xk ). k=1
The basic property of KKM-maps is the following. THEOREM 2.3.7 If X is a Hausdorff topological vector space, B ⊆ X is a X nonempty set, and F : B −→ 2 \{∅} is a KKM-map with closed values, then the family F (x) x∈B of sets has the finite intersection property; that is, the intersection of each finite subfamily is nonempty. PROOF: We argue indirectly. So suppose that
m k=1
m F (xk )=∅. Let C=conv xk k=1
and consider the function ϑ : C−→R+ defined by ϑ(u)=
m
d u, F (xk ) . Evidently k=1
ϑ(u) > 0 for all u ∈ C and so we can define the continuous map h : C −→ C by h(u) =
m 1
d u, F (xk ) xk . ϑ(u)
(2.14)
k=1
By Brouwer’s theorem (see Theorem 3.5.3), we can find u0 ∈ C such that h(u0 ) = u0 .
/ F (xk ). On the Let K = k ∈ {1, . . . , m} : d u0 , F (xk ) = 0 . Then u0 ∈
k∈K
other hand
u0 = h(u0 ) ∈ conv xk
k∈K
⊆
!
F (xk ),
k∈K
(see (2.14) and recall that F is a KKM-map). So we have a contradiction, which finishes the proof.
2.3 Saddle Points and Duality
81
REMARK 2.3.8 The hypotheses of this theorem can be weakened in the following way. We can drop the requirement that F has closed values and only assume that each F (x) is finitely closed. Namely, if V is a finite-dimensional flat in X, then V ∩ F (x) is closed in the Euclidean topology of V . Evidently in this case we need only assume that X is a vector space. The finitely closed sets define a topology on X, known as the finite topology. It is stronger than any Hausdorff linear topology on X. COROLLARY 2.3.9 If X is a Hausdorff topological vector space, B ⊆ X is a nonempty set, andF : B −→ 2X \{∅} is a KKM-map with closed values one of which is compact, then F (x) = ∅. x∈B
Using this corollary we can prove the following useful coincidence theorem for set-valued maps. PROPOSITION 2.3.10 If X, Y are Hausdorff topological vector spaces, C ⊆ X and D ⊆ Y are nonempty compact and convex sets, and F, G : C −→ 2D are two multifunctions such that (i) F (x) is open in D and G(x) is nonempty convex for every x ∈ C. (ii) G−1 (y) = x ∈ C : y ∈ G(x) is open in C and F −1 (y) = x ∈ C : y ∈ F (x) is convex for every y ∈ D, then we can find x0 ∈ C such that F (x0 ) ∩ G(x0 ) = ∅. PROOF: Let E = C ×D and let H : C ×D −→ 2X×Y be defined by
c H(x, y) = E ∩ G−1 (y) × F (x) . Because of hypotheses (i) and (ii), for every (x, y) ∈ E the set H(x, y) is closed in E, hence compact. Also given any (x0 , y0 ) ∈ E, choose (x, y) ∈ F −1 (y0 ) × G(x0 ) (by hypotheses (i) and (ii), F −1 (y0 ) × G(x0 ) = ∅). Hence (x0 , y0 ) ∈ G−1 (y) × F (x) and so we infer that ! −1 E= (2.15) G (y) × F (x) . (x,y)∈E
From (2.15) and the definition of the multifunction H, it follows that # H(x, y) = ∅.
(2.16)
(x,y)∈E
Because of (2.16) H cannot be a KKM-map. So according to Definition 2.3.6, we m ⊆ E such that can find elements (xk , yk ) k=1
m conv (xk , yk )
k=1
⇒ v=
m k=1
∈ /
λk (xk , yk ) ∈ /
m !
H(xk , yk )
k=1 m !
H(xk , yk )
k=1
with λk ≥ 0,
m k=1
λk = 1.
82
E\
2 Extremal Problems and Optimal Control Note that due to the convexity of E = C × D we have v ∈ E and so v ∈ m m H(xk , yk ) = G−1 (yk ) ∩ F (xk ). Hence k=1 m
k=1
λk xk ∈ G−1 (yk )
m
and
k=1
λk yk ∈ F (xk )
m
⇒ yk ∈ G
λk xk
and
xk ∈ F −1
m
k=1 m
⇒
k=1 m
⇒
for all k = 1, . . . , m,
k=1
for all k = 1, . . . , m,
k=1 m
λk yk ∈ G
λk xk
m
and
k=1 m
λk yk ∈ G
k=1
λk yk
k=1 m
λk xk
and
k=1
So, if we set x0 =
m
λk xk ∈ F −1
k=1
m
λk yk ,
k=1
λk yk ∈ F
k=1
λk xk and y0 =
m
m
λk xk .
k=1
λk yk , we see that y0 ∈ F (x0 ) ∩ G(x0 ) = ∅.
k=1
Using this proposition we can prove a basic minimax theorem. First we have a definition. DEFINITION 2.3.11 Let V be a vector space, B ⊆ V a nonempty convex set, and ψ : B −→ R. We say that ψ is quasiconvex (resp., quasiconcave), if for all λ ∈ R the set {x ∈ B : ψ(x) ≤ λ} (resp., the set {x ∈ B : ψ(x) ≥ λ}) is convex in B. REMARK 2.3.12 Evidently a convex (resp., concave) function is quasiconvex (resp., quasiconcave). The converse is not in general true. Consider, for example, the function x −→ ln(|x| + 1) which is quasiconvex, but not convex. THEOREM 2.3.13 If X, Y are Hausdorff topological vector spaces, C ⊆ X and D ⊆ Y are nonempty, compact, and convex sets, and ϕ : C ×D −→ R satisfies (i) For every y ∈ D, x −→ ϕ(x, y) is lower semicontinuous and quasiconvex. (ii) For every x ∈ C, y −→ ϕ(x, y) is upper semicontinuous and quasiconcave, then ϕ has a saddle point on C × D. PROOF: By virtue of Proposition 2.3.5 it suffices to show that ϕ satisfies a minimax equality on C × D, namely that min max ϕ(x, y) = max min ϕ(x, y). x∈C y∈D
y∈D x∈C
(2.17)
Because of the upper semicontinuity of ϕ(x, ·) and the compactness of D, max ϕ(x, y) exists for every x ∈ C (see Theorem 2.1.10). Also if h1 (x) = y∈D
max ϕ(x, y), then we claim that x −→ h1 (x) is lower semicontinuous. Indeed let y∈D
xα −→ x in C and suppose h1 (xα ) ≤ λ for all α ∈ J. We have ϕ(xα , y) ≤ λ for all α ∈ J and all y ∈ D. Because ϕ(·, y) is lower semicontinuous, we have
2.3 Saddle Points and Duality
83
ϕ(x, y) ≤ lim inf ϕ(xα , y) ≤ λ for all y ∈ D, hence h1 (x) ≤ λ and this by α∈J
virtue of Proposition 2.1.3(b) implies that h1 (·) is lower semicontinuous. Therefore min h1 (x) = min max ϕ(x, y) exists. Similarly we show that max min ϕ(x, y) x∈C
x∈C y∈D
y∈D x∈C
exists. We know that max min ϕ(x, y) ≤ min max ϕ(x, y) y∈D x∈C
x∈C y∈D
(see (2.13)).
(2.18)
We show that in (2.18) we cannot have strict inequality. We proceed by contradiction. Suppose we can find λ ∈ R such that max min ϕ(x, y) < λ < min max ϕ(x, y). y∈D x∈C
x∈C y∈D
(2.19)
We introduce the multifunctions F, G : C −→ 2D defined by F (x) = y ∈ D : ϕ(x, y) < λ and G(x) = y ∈ D : ϕ(x, y) > λ . Note that because of hypothesis (ii) for each x ∈ C, F (x) is open in D and G(x) is convex and nonempty (see (2.19)). Also G−1 (y) = x ∈ C : ϕ(x, y) > λ is open in C due to hypothesis (i), whereas F −1 (y) = x ∈ C : ϕ(x, y) < λ is convex again due to hypothesis (i). So we can apply Proposition 2.3.10 and produce a point (x0 , y0 ) ∈ C × D such that y0 ∈ F (x0 ) ∩ G(x0 ). But then λ < ϕ(x0 , y0 ) < λ, a contradiction. This proves that (2.17) holds and so ϕ admits a saddle point.
Another saddle point theorem useful in applications is the following. THEOREM 2.3.14 If X, Y are reflexive Banach spaces, C ⊆ X and D ⊆ Y are nonempty, closed, and convex sets, and ϕ : C ×D −→ R is a function such that (i) For every y ∈ D, x −→ ϕ(x, y) is convex and lower semicontinuous. (ii) For every x ∈ C, y −→ ϕ(x, y) is concave and upper semicontinuous. (iii) C is bounded or there exists y0 ∈ D such that ϕ(x, y0 ) −→ +∞ as xX −→ +∞. (iv) D is bounded or there exists x0 ∈ C such that ϕ(x0 , y) −→ −∞ as yY −→ +∞, then ϕ has a saddle point on C ×D. PROOF: Due to the reflexivity of X and Y , if C ⊆ X and D ⊆ Y are bounded, they are w-compact. Also due to convexity the functions x −→ ϕ(x, y) and y −→ −ϕ(x, y) are weakly lower semicontinuous. So in this case the result follows from Theorem 2.3.13. So suppose that at least one of the sets C ⊆ X and D ⊆ Y is unbounded. For every n ≥ 1 set Cn = x ∈ C : xX ≤ n and Dn = y ∈ D : yY ≤ n . For n ≥ 1 large Cn = ∅ and Dn = ∅. Then from the bounded case we know that we can find (xn , yn ) ∈ Cn × Dn such that
84
2 Extremal Problems and Optimal Control ϕ(xn , y) ≤ ϕ(xn , yn ) ≤ ϕ(x, yn )
for all (x, y) ∈ Cn × Dn , n ≥ 1.
(2.20)
Let x = x0 and y = y0 (see hypotheses (iii) and (iv)). Using those hypotheses, from (2.20) we infer that {xn }n≥1 ⊆ X, {yn }n≥1 ⊆ Y , and {ϕ(xn , yn )}n≥1 ⊆ R are all bounded. So passing to a subsequence if necessary, we may assume that w
w
xn −→ x0 in X,
yn −→ y0 in Y
and
ϕ(xn , yn ) −→ ξ ∈ R.
Evidently (x0 , y0 ) ∈ C × D. Also because of hypotheses (i) and (ii) we have ϕ(x0 , y) ≤ lim inf ϕ(xn , y) ≤ ξ ≤ lim sup ϕ(x, yn ) ≤ ϕ(x, y0 ) n→∞
n→∞
for all (x, y) ∈ C × D, ⇒ ϕ(x0 , y) ≤ ϕ(x0 , y0 ) ≤ ϕ(x, y0 ), that is, (x0 , y0 ) ∈ C × D is a saddle point of ϕ. REMARK 2.3.15 Functions ϕ(x, y) such as those in Theorems 2.3.13 and 2.3.14 that are convex in x ∈ C and concave in y ∈ D, are usually called saddle functions (or convex–concave functions). Next we derive a duality theory for convex optimization. Saddle functions are helpful in this respect. The mathematical setting is the following. We are given two Banach spaces X, Y and a convex function ϕ : X ×Y −→ R = R ∪ {+∞}. We know that (X × Y )∗ = X ∗ × Y ∗ and for the duality brackets we have (x∗ , y ∗ ), (x, y)X×Y = x∗ , xX + y ∗ , yY , for all x ∈ X, x∗ ∈ X ∗ , y ∈ Y , y ∗ ∈ Y ∗ . We can think of y ∈ Y as a perturbation and ϕ(x, y) as a perturbed functional with ϕ(x, 0) as the original unperturbed functional. Then the primal problem is: (P) : inf ϕ(x, 0).
(2.21)
x∈X
We also consider the value function m : Y −→ R∗ of the perturbed problem m(y) = inf ϕ(x, y).
(2.22)
x∈X
Clearly m(·) is convex. We have m∗ (y ∗ ) = sup y ∗ , yY − inf ϕ(x, y) y∈Y
x∈X
(see (2.22))
= sup 0, xX + y ∗ , yY − ϕ(x, y) = ϕ∗ (0, y ∗ ).
(2.23)
(x,y)∈X×Y
Then the dual problem associated to (P) (see (2.21)), is defined by
(D) : sup − ϕ∗ (0, y ∗ ) . y ∗ ∈Y ∗
PROPOSITION 2.3.16 We always have sup (D) ≤ inf (P) (weak duality).
(2.24)
2.3 Saddle Points and Duality
85
PROOF: For all x ∈ X and all y ∗ ∈ Y ∗ , by the Young–Fenchel inequality (see Proposition 1.2.18), we have ϕ(x, 0) + ϕ∗ (0, y ∗ ) ≥ 0, xX + y ∗ , 0Y = 0 ⇒ inf (P) ≥ sup (D). In the next proposition we provide conditions for strong duality to hold (i.e., to have equality of primal and dual problems). PROPOSITION 2.3.17 If ∂m(0) = ∅ and y ∗ ∈ ∂m(0), then y ∗ is a solution of (D) and inf (P) = max (D). PROOF: Because ∂m(0) = ∅, we have m(0) = inf (P) ∈ R and for all y ∗ ∈ Y ∗ , m∗ (y ∗ ) ≥ −m(0) > −∞. From Proposition 1.2.31, we have inf (P) + m∗ (y ∗ ) = m(0) + m∗ (y ∗ ) = y ∗ , 0Y = 0, ⇒ sup (D) = inf (P)
(see (2.23), (2.24), and Proposition 2.3.16).
In order to analyze further the strong duality situation, we need the following lemma. LEMMA 2.3.18 If V is a Banach space, f : V −→ R∗ is a convex function, and there exist (v0 , λ0 ) ∈ V × R such that f (v0 ) ∈ R and (v0 , λ0 ) ∈ int epi f , then f (v) > −∞ for all v ∈ V . PROOF: Let µ0 < f (v0 ). Then (v0 , µ0 ) ∈ / epi f and so by the separation theorem for convex sets we can find (v ∗ , t) ∈ V ∗ × R, (v ∗ , t) = (0, 0) such that v ∗ , v0 V + tµ0 ≤ v ∗ , vV + tλ
for all (v, λ) ∈ epi f.
(2.25)
Because λ can increase up to +∞, from (2.25) it follows that t ≥ 0. If t = 0, then v ∗ , v0 V ≤ v ∗ , vV
for all v ∈ dom f.
(2.26)
But int dom f = ∅ (see Theorem 1.2.3(d)). So from (2.26) we infer that v ∗ = 0, a contradiction to the fact that (v ∗ , t) = (0, 0). Hence t > 0 and we may assume that t = 1. We have v ∗ , v0 − vV + µ0 ≤ λ ⇒ v ∗ , v0 − vV + µ0 ≤ f (v)
for all (v, λ) ∈ epi f, for all v ∈ V (i.e., f (v) > −∞
for all v ∈ V ).
Using this auxiliary result we can have a theorem on the duality of problems (P) and (D). THEOREM 2.3.19 If there exists x0 ∈ X such that ϕ(x0 , ·) is finite and continuous at y = 0 and inf (P) ∈ R, then (a) m is continuous at y = 0.
86
2 Extremal Problems and Optimal Control
(b) y ∗ ∈ ∂m(0) if and only if y ∗ ∈ Y ∗ is a solution of the dual problem (D). (c) inf (P) = max (D) ∈ R (strong duality). PROOF: We can find r > 0 and M > 0 such that ϕ(x0 , y) ≤ M ⇒ m(y) ≤ M
for all y ∈ B r (0), for all y ∈ B r (0).
Then from Theorem 1.2.3 we have that m is continuous at y = 0 and (0, λ) ∈ int epi m for all λ > M . Because m(0) = inf (P) ∈ R, from Lemma 2.3.18 we infer that −∞ < m(y) for all y ∈ Y , hence m is a proper convex function continuous at y = 0. By virtue of Theorem 1.2.34 we have that ∂m(0) = ∅. Invoking Proposition 2.3.17, we know that every y ∗ ∈ ∂m(0) is a solution of (D) and inf (P) = max (P). Finally, if y ∗ ∈ Y ∗ is a solution of (D), we have m∗ (y ∗ ) ≤ m∗ (u∗ )
for all u∗ ∈ Y ∗ .
(2.27)
Because m is continuous at y = 0, using Proposition 1.2.24 (with ϕ = m and ψ = i{0} ) we obtain m(0) = sup u∗ , 0Y − m∗ (u∗ ) ≤ −m∗ (y ∗ ) (see (2.27)), u∗ ∈Y ∗
⇒ m(0) + m∗ (y ∗ ) ≤ 0
(i.e., y ∗ ∈ ∂m(0); see Proposition 1.2.31).
REMARK 2.3.20 In the above theorem, the hypothesis that inf (P) ∈ R is important and cannot be dropped. To see this consider the case where X = Y = R; let C(y) = {x ∈ R : x ≤ y} and ϕ(x, y) = x + iC(y) (x). Then for every x < 0, ϕ(x, ·) is continuous at 0, but m ≡ −∞ and so m∗ ≡ +∞. DEFINITION 2.3.21 The Lagrangian function corresponding to the dual pair of problems {(P),(D)}, is the function L : X ×Y ∗ −→ R∗ defined by L(x, y ∗ ) = inf ϕ(x, y) − y ∗ , yY : y ∈ Y for all (x, y ∗ ) ∈ X × Y ∗ . REMARK 2.3.22 Evidently inf L(x, y) = −ϕ∗ (0, y ∗ ) and L is a saddle function. x∈X
EXAMPLE 2.3.23 Consider the mathematical programming situation. Namely let (P) : inf [f (x) : fk (x) ≤ 0 for all k = 1, . . . , m], x∈X
with f, fk : X −→ R, k = 1, . . . , m, convex functions. Then ϕ : X × Rm −→ R is defined by ϕ(x, y) = f (x) + iC(y) (x)
m for all x ∈ X, y = (yk )m k=1 ∈ R ,
with C(y) = {x ∈ X : fk (x) ≤ yk for all k = 1, . . . , m}. Therefore ∗ f (x) − m if y ∗ ∈ −Rm + ∗ k=1 fk (x)yk L(x, y ) = , −∞ if y ∗ ∈ / −Rm + m for all x ∈ X, y ∗ = (yk∗ )m k=1 ∈ R .
2.3 Saddle Points and Duality
87
PROPOSITION 2.3.24 If ϕ ∈ Γ0 (X × Y ), then ϕ(x, 0) = sup L(x, y ∗ ). y ∗ ∈Y ∗
PROOF: Because of Theorem 1.2.21, we have
∗ ϕ(x, 0) = sup y ∗ , 0Y − ϕ(x, ·) (y ∗ ) : y ∗ ∈ Y ∗ = sup L(x, y ∗ ). y ∗ ∈Y ∗
We can characterize the situation of strong duality via the saddle points of the Lagrangian. THEOREM 2.3.25 If ϕ ∈ Γ0 (X×Y ) and L is the corresponding Lagrangian function, then the following statements are equivalent. (a) (x0 , y0∗ ) ∈ X × Y ∗ is a saddle point of L. (b) x0 is a solution of (P), y0∗ is a solution of (D) and min(P)= max(D). (c) ϕ(x0 , 0) = −ϕ∗ (0, y0∗ ). PROOF: (a)⇒(b),(c): We have L(x0 , y0∗ ) = inf L(x, y0∗ ) = −ϕ∗ (0, y0∗ )
(see Remark 2.3.22)
L(x0 , y0∗ )
(see Proposition 2.3.24).
x∈X
and
∗
= sup L(x0 , y ) = ϕ(x0 , 0) y ∗ ∈Y ∗
(2.28)
(2.29) So from (2.28) and (2.29), we have inf (P) ≤ ϕ(x0 , 0) = −ϕ∗ (0, y0∗ ) ≤ inf (D).
(2.30)
Combining (2.30) with Proposition 2.3.16, we conclude that min (P) = max (D) = ϕ(x0 , 0) = −ϕ∗ (0, y0∗ ).
(b)⇒(a), (c): We have ϕ(x0 , 0) = min (P),
−ϕ∗ (0, y0∗ ) = max (D)
and
ϕ(x0 , 0) = −ϕ∗ (0, y0∗ ). (2.31)
Note that ϕ(x0 , 0) = sup L(x0 , y ∗ ) ≥ L(x0 , y ∗ ) y ∗ ∈Y ∗
(see Proposition 2.3.24) (2.32)
and
−ϕ
∗
(0, y0∗ )
= inf
x∈X
L(x, y0∗ )
≤
L(x, y0∗ )
(see Remark 2.3.22). (2.33)
From (2.31) through (2.33) it follows that (x0 , y0∗ ) ∈ X × Y ∗ is a saddle point of the Lagrangian. (c)⇒(a), (b): We have
88
2 Extremal Problems and Optimal Control inf (P) ≤ ϕ(x0 , 0) = −ϕ∗ (0, y0∗ ) ≤ sup (D), ⇒ inf (P) = sup (D) = ϕ(x0 , 0) = −ϕ∗ (0, y0∗ )
(see Proposition 2.3.16).
Next we consider an important special case of the above general duality theory. So let f ∈ Γ0 (X), g ∈ Γ0 (Y ), and A ∈ L(X, Y ). The primal problem is:
(2.34) (P) : inf[f (x) + g A(x) : x ∈ X]. The convex perturbation functional ϕ : X ×Y −→ R = R ∪ {+∞} is given by
ϕ(x, y) = f (x) + g A(x) + y for all (x, y) ∈ X × Y. Then the dual problem is:
(2.35) (D) : sup − f ∗ A∗ (y ∗ ) − g ∗ (−y ∗ ) : y ∗ ∈ Y ∗ . Also the Lagrangian function for the pair of problems (P) , (D) (see (2.34) and (2.35)) is given by if f (x) < +∞ f (x) + y ∗ , A(x)Y − g ∗ (y ∗ ) ∗ L(x, y ) = (2.36) +∞ if f (x) = +∞ Using Theorem 2.3.25, we obtain the following. PROPOSITION 2.3.26 A pair (x0 , y0∗ ) ∈ X × Y ∗ is a saddle point of the Lagrangian L defined by (2.36) if and only if
and − A∗ (y0∗ ) ∈ ∂f (x0 ). y0∗ ∈ ∂g A(x0 ) Let us close this section with a simple application of duality theory on a calculus of variations problem. EXAMPLE 2.3.27 Let X = W01,2 (0, 1), Y = L2 (0, 1), A ∈ L(X, Y ) defined by A(x) = x , and f ∈ Γ0 (X), g ∈ Γ0 (Y ) defined by
1 f (x) = x∗0 , xX = − h(t)x(t)dt = −(h, x)L2 (0,1) for some h ∈ L2 (0, 1) 0
1 g(y) = y2L2 (0,1) . 2
and
Then identifying L2 (0, 1) with its dual (hence X⊆Y ⊆X ∗=W −1,2 (0, 1)), we have f ∗ (x∗ ) = i{x∗0 } (x∗ )
for all x∗ ∈ X ∗
and
Note that −A∗ (y ∗ ) = x∗0 if and only if
1
1 − y ∗ (t)x (t)dt = − h(t)x(t)dt 0
⇒ y ∗ ∈ W 1,2 (0, 1)
0
and
(y ∗ ) = −h.
g ∗ (y ∗ ) =
1 ∗ 2 y L2 (0,1) . 2
for all x ∈ W01,2 (0, 1),
2.4 Variational Principles
89
Then the primal problem is 1 (P) : inf x 2L2 (0,1) − (h, x)L2 (0,1) : x ∈ W01,2 (0, 1) 2 and the dual problem is 1 (D) : sup − y ∗ 2L2 (0,1) : y ∗ ∈ W 1,2 (0, 1), (y ∗ ) = −h . 2 Then problems (P) and (D) have solutions x0 ∈ W01,2 (0, 1) ∩ W 2,2 (0, 1) and ∈ W 1,2 (0, 1) (see Corollary 2.1.11) and they are unique due to the strict convexity of the functionals. Moreover, we have y0∗
−x0 (t) = h(t) a.e. on (0, 1),
x0 (0) = x0 (1) = 0
and
y0∗ = x0 .
2.4 Variational Principles In Theorem 2.1.10 and Corollary 2.1.11 we saw that if the objective functional ϕ : X −→ R is lower semicontinuous in a topology and the domain exhibits some kind of (local) compactness in the same topology, then a minimizing point exists. This is the cornerstone of the so-called direct method of the calculus of variations. However, in many situations the above convenient setting is not present. Think, for example, of infinite-dimensional Banach space functionals that are strongly lower semicontinuous, but not weakly. An effective tool to approach such problems is provided by the so-called Ekeland variational principle. Since its appearance in 1974, this result found many significant applications in different parts of analysis. Also it turned out to be equivalent to some other important results of nonlinear analysis and also served to provide new and elegant proofs to known results. In addition it initiated the production of some other related variational principles. In this section we survey this area exhibiting all those features that are significant for problems in nonlinear analysis. Of course our presentation is centered on the Ekeland variational principle. We start by giving the general form of the Ekeland variational principle. THEOREM 2.4.1 If (X, d) is a complete metric space, ϕ : X −→ R is a proper lower semicontinuous function that is bounded below, and ε > 0, x ¯ ∈ X satisfy ϕ(¯ x) ≤ inf ϕ + ε, X
then for any λ > 0, we can find xλ ∈ X such that x), ϕ(xλ ) ≤ ϕ(¯
d(xλ , x ¯) ≤ λ
and
ϕ(xλ ) ≤ ϕ(x) +
ε d(x, xλ ) λ
for all x ∈ X.
PROOF: In what follows dλ = (1/λ)d. We introduce the following relation on X × X. x≤u if and only if ϕ(x) ≤ ϕ(u) − εdλ (x, u). (2.37) It is easy to see that ≤ is reflexive, antisymmetric, and transitive, hence a partial order on X. We define a sequence of nonempty sets Cn ⊆ X, n ≥ 1 as follows. Let x1 = x ¯ and set
90
2 Extremal Problems and Optimal Control C1 = {x ∈ X : x ≤ x1 }. Choose x2 ∈ C1 such that ϕ(x2 ) ≤ inf ϕ + (ε 22 ) and define C2 = {x ∈ X : x ≤ C1
x2 }. We continue this way and inductively define Cn = {x ∈ X : x ≤ xn }
and xn+1 ∈ Cn ε ϕ(xn+1 ) ≤ inf ϕ + n+1 , n ≥ 1. Cn 2
such that (2.38)
Evidently {Cn }n≥1 is a decreasing sequence and each set Cn is closed. Indeed, suppose {um }m≥1 ⊆ Cn such that um −→ u. We have ϕ(um ) ≤ ϕ(xn ) − εdλ (um , xn ), ⇒ ϕ(u) ≤ lim inf ϕ(um ) ≤ ϕ(xn ) − εdλ (u, xn ) m−→∞
(i.e., u ∈ Cn ).
We claim that diam Cn −→ 0 as n → ∞. To this end let u ∈ Cn . We have u ≤ xn and so (2.39) ϕ(u) ≤ ϕ(xn ) − εdλ (u, xn ) (see (2.37)). Also note that u ∈ Cn−1 , because {Cn }n≥1 is a decreasing sequence. Therefore ϕ(xn ) ≤ ϕ(u) +
ε 2n
(see (2.38)).
(2.40)
1 . 2n
(2.41)
Combining (2.39) and (2.40), we obtain dλ (u, xn ) ≤
Because u ∈ Cn was arbitrary, from (2.41) and the triangle inequality it follows that 1 , 2n−1 ⇒ diam Cn −→ 0 as n → ∞. diam Cn ≤
Then by Cantor’s theorem, we have that
n≥1
Cn = {xλ }. Because xλ ∈ C1 , we have
¯ and so ϕ(xλ ) ≤ ϕ(¯ x) (see (2.37)). Also from the triangle inequality and xλ ≤ x (2.41), we have x, xn ) ≤ dλ (¯
n−1
dλ (xk , xk+1 ) ≤
k=1
k=1
⇒ dλ (¯ x, xλ ) = lim dλ (¯ x, xn ) ≤ 1 n→∞
n−1
1 2k
(i.e., dλ (¯ x, xλ ) ≤ λ).
Finally let x = xλ . Evidently we cannot have x ≤ xλ or otherwise x ∈
Cn , a
n≥1
contradiction. Then by virtue of (2.37) ϕ(xλ ) < ϕ(x) +
ε d(x, xλ ). λ
2.4 Variational Principles
91
REMARK 2.4.2 In the conclusions of Theorem 2.4.1, the relations d(¯ x, xλ ) ≤ λ and ϕ(xλ ) ≤ ϕ(x) + (ε/λ)d(x, xλ ) for all x ∈ X, are in a sense complementary. Indeed, the choice of λ > 0 allows us to strike a balance between them. So if λ > 0 is large the inequality d(x, xλ ) ≤ λ gives little information on the whereabouts of xλ , and the inequality ϕ(xλ ) ≤ ϕ(x) + (ε/λ)d(x, xλ ), x ∈ X becomes sharper and says that xλ is close to being a global minimizer of ϕ. The situation is reversed if λ > 0 is small. Two important cases are when λ = 1 (which means we do not care √ about the whereabouts of xλ ) and when λ = ε (which means that we need to have information from both inequalities). We state both cases as corollaries of Theorem 2.4.1. COROLLARY 2.4.3 If (X, d) is a complete metric space and ϕ : X −→ R is a proper lower semicontinuous function that is bounded below, then for any ε > 0, we can find xε ∈ X such that ϕ(xε ) ≤ inf ϕ + ε X
ϕ(xε ) ≤ ϕ(x) + εd(x, xε )
and
for all x ∈ X.
COROLLARY 2.4.4 If (X, d) is a complete metric space, ϕ : X −→ R is a proper lower semicontinuous function that is bounded below, and ε > 0, xε ∈ X satisfy ϕ(xε ) ≤ inf ϕ + ε, X
then we can find uε ∈ X such that √ ϕ(uε ) ≤ ϕ(xε ), d(xε , uε ) ≤ ε and
ϕ(uε ) ≤ ϕ(x) +
√ εd(x, uε )
for all x ∈ X.
Theorem 2.4.1 has some important consequences if we assume more structure on the space X and the function ϕ. THEOREM 2.4.5 If X is a Banach space, ϕ : X −→ R is a function that is bounded below and Gˆ ateaux differentiable, and ε > 0, xε ∈ X satisfy ϕ(xε ) ≤ inf ϕ + ε, X
then there exists uε ∈ X such that ϕ(uε ) ≤ ϕ(xε ),
uε − xε X ≤
√
ε
and
ϕ (uε )X ∗ ≤
PROOF: Apply Corollary 2.4.4 to obtain uε ∈ X such that √ and ϕ(uε ) ≤ ϕ(xε ), uε − xε X ≤ ε √ for all x ∈ X. − εx − uε X ≤ ϕ(x) − ϕ(uε )
√
ε.
(2.42)
From the third inequality in (2.42), taking x = uε + λh with λ > 0 and h ∈ X with hX ≤ 1, we have √ ϕ(uε + λh) − ϕ(uε ) − ε≤ , λ √ for all h ∈ B 1 (0), ⇒ − ε ≤ ϕ (uε ), h X √ ⇒ ϕ (uε )X ∗ ≤ ε.
92
2 Extremal Problems and Optimal Control
COROLLARY 2.4.6 If X is a Banach space and ϕ : X −→ R is a function that is bounded below and Gˆ ateaux differentiable, then for every minimizing sequence {xn }n≥1 of ϕ, we can find a minimizing sequence {un }n≥1 of ϕ such that ϕ(un ) ≤ ϕ(xn ),
un − xn X −→ 0
and
ϕ (un )X ∗ −→ 0
as n → ∞.
The following proposition is useful in critical point theory (see Section 4.1). PROPOSITION 2.4.7 If X is a Banach space and ϕ : X −→ R is a lower semicontinuous function that is bounded below, Gˆ ateaux differentiable, and satisfies the following compactness-type condition every sequence {xn }n≥1 ⊆ X that satisfies |ϕ(xn )| ≤ M for some M > 0, all n ≥ 1 and ϕ (xn )X ∗ −→ 0, it admits a strongly convergent subsequence, then we can find x0 ∈ X such that ϕ(x0 ) = inf ϕ and ϕ (x0 ) = 0. X
PROOF: Because of Theorem 2.4.5 for every n ≥ 1, we can find xn ∈ X such that ϕ(xn ) ≤ inf ϕ + X
1 n
and
ϕ (xn )X ∗ ≤
1 . n
Then by virtue of the compactness-type condition that ϕ satisfies, by passing to a subsequence if necessary, we may assume that xn −→ x0 in X. Clearly ϕ(x0 ) = inf ϕ and from this it follows that ϕ (x0 ) = 0.
X
REMARK 2.4.8 In critical point theory (see Section 4.1), we encounter the compactness-type condition of Proposition 2.4.7 in the context of functions that are C 1 on X with values in R. In such a setting it is called the Palais–Smale condition (PS-condition for short) and it is a basic tool in the derivation of minimax characterizations of the critical values of functions ϕ ∈ C 1 (X). PROPOSITION 2.4.9 If X is a Banach space and ϕ : X −→ R is a lower semicontinuous function that is bounded below, Gˆ ateaux differentiable, and ϕ(x) ≥ c1 x − c2 ∗
for some c1 , c2 > 0 and all x ∈ X,
(2.43)
∗
then ϕ (X) is dense in c1 B 1 where B 1 = {x∗ ∈ X ∗ : x∗ X ∗ ≤ 1}. ∗
PROOF: Let x∗ ∈ c1 B 1 and consider the function ψ(x) = ϕ(x) − x∗ , xX for all x ∈ X. Clearly ψ is lower semicontinuous, bounded below (because of (2.43)) and Gˆ ateaux differentiable. So by virtue of Theorem 2.4.5 we can find xε ∈ X such that ψ (xε )X ∗ ≤ ε, hence ϕ (xε ) − x∗ X ∗ < ε, which proves the density of ϕ (X) in ∗ c1 B 1 . COROLLARY 2.4.10 If X is a Banach space and ϕ : X −→ R is a lower semicontinuous function that is Gˆ ateaux differentiable and satisfies
ϕ(x) ≥ r(x) for all x ∈ X, with r : R+ −→ R a continuous function such that r(λ) λ −→ +∞ as λ −→ +∞, then ϕ (X) is dense in X ∗ .
2.4 Variational Principles
93
PROOF: Let c1 > 0 be given. Then we can find λ0 > 0 such that for all λ ≥ λ0 we have r(λ) ≥ c1 λ. Hence ϕ(x) ≥ c1 x for all x ∈ X with xX > λ0 . On the other hand if xX < λ0 , then ϕ(x) ≥ m = inf[r(λ) : 0 ≤ λ ≤ λ0 ]. Let c2 = |m|. Then we have ϕ(x) ≥ c1 x − c2 ∗
and we can apply Proposition 2.4.9 and obtain that ϕ (X) is dense in c1 B 1 . Because c1 > 0 was arbitrary we conclude that ϕ (X) is dense in X ∗ . Next we show that the Ekeland variational principle is actually equivalent to some other well-known results of nonlinear analysis. We start with Caristi’s fixed point theorem. THEOREM 2.4.11 If (X, d) is a complete metric space, ϕ : X −→ R is a proper, lower semicontinuous function that is bounded below, and F : X −→ 2X \{∅} is a multifunction such that ϕ(y) ≤ ϕ(x) − d(x, y)
for all x ∈ X and all y ∈ F (x),
(2.44)
then there exists x0 ∈ X such that x0 ∈ F (x0 ). PROOF: Apply Corollary 2.4.3 with ε = 1, to obtain x0 ∈ X such that ϕ(x0 ) < ϕ(x) + d(x, x0 )
for all x = x0 .
(2.45)
We claim that x0 ∈ F (x0 ). If this is not the case, then for every y ∈ F (x0 ) we have y = x0 . Then from (2.44) and (2.45) we have ϕ(y) ≤ ϕ(x0 ) − d(x0 , y) < ϕ(y), a contradiction. So we have seen that the Ekeland variational principle in the form of Corollary 2.4.3 implies Theorem 2.4.11 (Caristi’s fixed point theorem). Next we show that the opposite is also true. THEOREM 2.4.12 Caristi’s fixed point theorem (Theorem 2.4.11), implies the Ekeland variational principle in the form of Corollary 2.4.3. PROOF: We argue indirectly. Suppose that we cannot ϕ(xε ) < ϕ(x) + εd(x, xε ) for all x = xε . Let F (x) = y εd(y, x), y = x . Then F (x) = ∅ for all x ∈ X. Invoking find x0 ∈ X such that x0 ∈ F (x0 ), which is impossible. So principle holds.
find xε ∈ X satisfying ∈ X : ϕ(x) ≥ ϕ(y) + Theorem 2.4.11 we can the Ekeland variational
REMARK 2.4.13 Banach’s fixed point theorem (see Section 3.4), in its existence part, can be deducedfrom Theorem 2.4.11. Indeed, if F : X −→ X is a contraction, that is, d F (x), F (y) ≤ kd(x, y) for some k ∈ [0, 1), then it satisfies (2.44) with
ϕ(x) = 1 1 − k d x, F (x) . However, Banach’s fixed point theorem includes more information than the mere existence of a fixed point (a computational scheme to find the fixed point, rate of convergence, error estimates, etc.).
94
2 Extremal Problems and Optimal Control
Another result equivalent to the Ekeland variational principle and to the Caristi fixed point theorem, is the so-called Takahashi variational principle. THEOREM 2.4.14 If (X, d) is a complete metric space, ϕ : X −→ R = R ∪ {+∞} is a proper lower semicontinuous function that is bounded below, and for each u ∈ X with ϕ(u) > inf ϕ we can find v ∈ X such that v = u and ϕ(v) + d(u, v) ≤ ϕ(u), then X
there exists x0 ∈ X such that ϕ(x0 ) = inf ϕ. X
PROOF: We proceed by contradiction. So suppose that inf ϕ is not attained. We X
introduce the multifunction F : X −→ 2X defined by F (x) = {y ∈ X : ϕ(y) + d(x, y) ≤ ϕ(x), y = x}. By hypothesis F (x) = ∅ for all x ∈ X. Invoking Theorem 2.4.11, we can find x0 ∈ F (x0 ), a contradiction. So inf ϕ is attained. X
Thus we have seen that Caristi’s fixed point theorem (Theorem 2.4.11) implies Takahashi’s variational principle (Theorem 2.4.14). Next we show that the converse is also true. THEOREM 2.4.15 Takahashi’s variational principle (Theorem 2.4.14) implies Caristi’s fixed point theorem (Theorem 2.4.11). PROOF: We proceed by contradiction. So suppose that in Caristi’s fixed point theorem (Theorem 2.4.11), the multifunction F has no fixed point, (i.e., x ∈ / F (x) for all x ∈ X). Thus from the property of the multifunction F , we see that for every x ∈ X we can find y = x such that ϕ(y) + d(x, y) ≤ ϕ(x). This is the hypothesis in Theorem 2.4.14. By virtue of that result we can find x0 ∈ X such that ϕ(x0 ) = inf ϕ. X
Let y0 ∈ F (x0 ) be such that y0 = x0 and ϕ(y0 ) + d(x0 , y0 ) ≤ ϕ(x0 ). Then we have 0 < d(x0 , y0 ) ≤ ϕ(x0 ) − ϕ(y0 ) ≤ ϕ(y0 ) − ϕ(y0 ) = 0, a contradiction. So F has a fixed point and Caristi’s fixed point theorem (Theorem 2.4.11) holds. Another result of nonlinear analysis related to the Ekeland variational principle (in fact equivalent to it), is the so-called drop theorem. THEOREM 2.4.16 If X is a Banach space, C ⊆ X is a nonempty closed set, y ∈ X \ C, R = d(y, C), and 0 < r < R < , then there exists x0 ∈ C such that y − x0 ≤ and D(x0 ; y, r) ∩ C = {x0 }, where D(x0 ; y, r) = conv B r (y) ∪ {x0 } .
2.4 Variational Principles
95
PROOF: By translating things if necessary, without any loss of generality, we may assume that y = 0. Let S = B r (0) ∩ C. Then S is a closed set in X, hence a complete metric space. Consider the function ϕ : S −→ R defined by ϕ(x) =
+r xX . R−r
By virtue of Corollary 2.4.3 with ε = 1, we can find x0 ∈ S such that ϕ(x0 ) < ϕ(x) + x − x0 X
for all x ∈ S, x = x0 .
(2.46)
Because we have assumed that y = 0, we have y − x0 X = x0 X ≤ . Next we show that D(x0 ; 0, r) ∩ C = {x0 }. Suppose that this is not the case and we can find x = x0 such that x ∈ D(x0 ; 0, r) ∩ C. Then x∈C
and
x = (1 − λ)x0 + λu
with u ∈ B r (0) and 0 ≤ λ ≤ 1. (2.47)
Evidently 0 < λ < 1 (recall that < R). From (2.47) we have xX ≤ (1 − λ)x0 X + λuX ,
⇒ λ(R − r) ≤ λ x0 X − xX ≤ x0 X − xX .
(2.48)
From (2.46) and (2.47) it follows that +r +r +r x0 X < xX + x − x0 X = xX + λx0 − uX . R−r R−r R−r (2.49) Also from (2.48) we have λ≤
x0 X − xX . R−r
(2.50)
Using (2.50) in (2.49), we obtain ( + r)x0 X < ( + r)xX + x0 − uX (x0 X − xX ) ≤ ( + r)xX + ( + r)(x0 X − xX ) (because x0 X ≤ , u ∈ B r (0)), ⇒ x0 X < xX + x0 X − xX = x0 X ,
a contradiction.
Therefore we conclude that D(x0 ; 0, r) ∩ C = {x0 } and the proof of the theorem is finished. REMARK 2.4.17 The set D(x0 ; y, r) is called a drop, in view of its evocative geometry. In the above proof, we saw that the Ekeland variational principle (in the form of Corollary 2.4.3) implies the drop theorem (Theorem 2.4.16). In fact the converse is also true. For a proof of this we refer to Penot [494]. Thus we have the following. THEOREM 2.4.18 The drop theorem (Theorem 2.4.16) implies the Ekeland variational principle (Corollary 2.4.3). Summarizing we have the following.
96
2 Extremal Problems and Optimal Control
THEOREM 2.4.19 Corollary 2.4.3, Theorem 2.4.11, Theorem 2.4.14 and Theorem 2.4.16 are all equivalent. Now we present a general principle on partially ordered sets, which implies all the above equivalent theorems. Recall that on a set X, ≤ is a partial order if x ≤ x (reflexive) x ≤ y and y ≤ x imply x = y (antisymmetric) and x ≤ y, y ≤ z imply x ≤ z (transitive). THEOREM 2.4.20 If (X, ≤) is a partially ordered set, every increasing sequence {xn }n≥1 ⊆ X has an upper bound in X (i.e. if xn ≤ xn+1 for all n ≥ 1, there exists y ∈ X such that xn ≤ y for all n ≥ 1) and ϕ : X −→ R is an increasing function that is bounded above, then there exists x0 ∈ X such that x0 ≤ y implies ϕ(x0 ) = ϕ(y). PROOF: Let x1 ∈ X be any element. Inductively we produce an increasing sequence {xn }n≥1 ⊆ X. Suppose we have generated the element xn . We define Cn = {x ∈ X : xn ≤ x}
and
Mn = sup ϕ. Cn
If for xn , we have that xn ≤ y implies that ϕ(xn ) = ϕ(y), then clearly we are done. Otherwise ϕ(xn ) < Mn and so we can find xn+1 ∈ Cn such that Mn ≤ ϕ(xn+1 ) +
1
Mn − ϕ(xn ) . 2
(2.51)
So by induction, we have produced an increasing sequence {xn }n≥1 ⊆ X. By hypothesis we can find x0 ∈ X such that xn ≤ x0
for all n ≥ 1.
(2.52)
We claim that x0 ∈ X is the desired solution. Suppose that this is not the case. Then we can find y ∈ X such that x0 ≤ y and ϕ(x0 ) < ϕ(y). The sequence ϕ(xn ) n≥1 ⊆ R is increasing and bounded above by ϕ(y). Therefore it converges. We have lim ϕ(xn ) ≤ ϕ(x0 ) (see (2.52)). (2.53) n→∞
Because of (2.52), y ∈ Cn for all n ≥ 1. So from (2.51) it follows that ϕ(y) ≤ Mn ≤ 2ϕ(xn+1 ) − ϕ(xn ) ⇒ ϕ(y) ≤ ϕ(x0 )
for all n ≥ 1
(see (2.53)),
a contradiction to the choice of y. This proves the theorem.
REMARK 2.4.21 Theorem 2.4.20 can have a physical interpretation. We can think of ϕ as a function measuring the entropy of a system. The theorem guarantees the existence of a state of maximal entropy. To these states correspond stable equilibrium states of the system. COROLLARY 2.4.22 If X is a Hausdorff topological space equipped with a partial order ≤ and ψ : X −→ R a function that is bounded below such that (i) For every x ∈ X the set {y ∈ X : x ≤ y} is closed; (ii) x ≤ y and x = y imply ψ(y) < ψ(x) (strictly decreasing ψ);
2.4 Variational Principles
97
(iii) Any increasing sequence in X is relatively compact, then for each x ∈ X we can find x0 ∈ X such that x ≤ x0 and x0 is maximal. PROOF: Let {xn }n≥1 ⊆ X be an increasing sequence in X. By virtue of hypothesis (iii) {xn }n≥1 ⊆ X is relatively compact and so we can find a subsequence xnk k≥1 such that xnk −→ y in X. We claim that xn ≤ y for all n ≥ 1. Indeed given n ≥ 1, we have n ≤ nk for k ≥ kn and so xn ≤ xnk for k ≥ kn . From this and hypothesis (i) it follows that y ∈ Cn for all n ≥ 1; that is, xn ≤ y for all n ≥ 1. Taking ϕ = −ψ, because of hypothesis (ii) we can apply Theorem 2.4.20 starting from x1 = x and obtain x0 ∈ X such that x ≤ x0 and x0 is maximal. COROLLARY 2.4.23 Theorem 2.4.20 implies the Ekeland variational principle in the form of Corollary 2.4.3. PROOF: Without any loss of generality we take ε = 1 and define a partial order ≤ by x≤u if and only if ϕ(u) − ϕ(x) ≤ −d(x, u). (2.54) For any increasing sequence {xn }n≥1 ⊆ X, ϕ(xn ) n≥1 is decreasing and bounded below, so it converges. Therefore once more from (2.54), we infer that {xn }n≥1 ⊆ X is Cauchy, hence due to the completeness of X it converges to an element in X. Therefore we can apply Corollary 2.4.22 and finish the proof. REMARK 2.4.24 Thus Theorem 2.4.20 also implies Caristi’s fixed point theorem, Takahashi’s variational principle, and the drop theorem.
We conclude this section with a generalization of Theorem 2.4.1, which is used in Section 4.1. For a proof of it we refer to Zhong [626]. THEOREM 2.4.25 If h : R+ −→ R+ is a continuous nondecreasing function such +∞ 1 that dr = +∞, (X, d) is a complete metric space, x0 ∈ X is fixed, ϕ : X −→ 1+h(r) 0
R = R ∪ {+∞} is a proper, lower semicontinuous, bounded below function, ε > 0, ϕ(y) ≤ inf ϕ + ε and λ > 0, then there exists xλ ∈ X such that X
and
ϕ(xλ ) ≤ ϕ(y), d(xλ , x0 ) ≤ r0 + r¯ ε
d(xλ , x) ϕ(xλ ) ≤ ϕ(x) +
λ 1 + h d(x0 , xλ )
where r0 = d(x0 , y) and r¯ > 0 such that
r0 +¯ r 0
1 1+h(r)
for all x ∈ X,
dr ≥ λ.
REMARK 2.4.26 If h ≡ 0 and x0 = y, then Theorem 2.4.25 reduces to Theorem 2.4.1.
98
2 Extremal Problems and Optimal Control
2.5 Calculus of Variations The calculus of variations deals with the minimization or maximization of functions defined on function spaces. It is as old as calculus itself and has important applications in mechanics and physics. In this section we present some basic aspects of the theory concerning scalar problems. This means that the unknown functions are curves defined on closed intervals in R with values in a Banach space. For this reason we make the following definition. DEFINITION 2.5.1 Let T = [0, b] and X a Banach space. By C 1 (T, X) we denote the vector space of all functions u ∈ C(T, X) which are differentiable on (0, b) and the derivative u : (0, b) −→ X is bounded and uniformly continuous. REMARK 2.5.2 This means that u admits a unique continuous extension on T = [0, b] and u (0) = lim u (t), u (b) = lim u (t). t−→b−
t−→0+
In fact u (0) is the right derivative of u at t = 0 and u (b) is the left derivative of u at t = b. We furnish C 1 (T, X) with the norm uC 1 (T,X) = max u(t)X + max u (t)X . t∈T
t∈T
(2.55)
It is easy to see that (2.55) indeed defines a norm on C 1 (T, X). Also we have the following. PROPOSITION 2.5.3 The space C 1 (T, X) equipped with the norm · C 1 (T,X) becomes a Banach space. Now we introduce the setting of the calculus of variations problem that we study. So let T = [0, b], X be a Banach space, U be a nonempty open set in R × X × X, and L : U −→ R a continuous function, usually called the Lagrangian. A map u ∈ C 1 (T, X) is said to be admissible, if for t ∈ T , we have t, u(t), u (t) ∈ U . We introduce the integral functional IL : D ⊆ C 1 (T, X) −→ R defined by
b
IL (u) = L t, u(t), u (t) dt 0
for every admissible map u ∈ C (T, X). 1
PROPOSITION 2.5.4 The set D ⊆ C 1 (T, X) of all admissible maps is an open subset of C 1 (T, X). PROOF: We show that Dc⊆C 1 (T, X) is closed. To this end let {xn }n≥1 ⊆ Dc and 1 suppose that / D, we can find tn ∈ T
xn −→ x in C (T, X).c Because for all n≥1 xn ∈ such that tn , xn (tn ), xn (tn ) ∈ U . By passing to a subsequence if necessary, we may assume that tn −→ t ∈ T . Because xn −→ x and xn −→ x in C 1 (T, X), we have that xn (t n ) −→ x(t) and X as n → ∞. So in the limit as n → ∞, xnc(tn ) −→ x (t) in we have t, x(t), x (t) ∈ U , hence x ∈ / Dc , which finishes the proof.
2.5 Calculus of Variations
99
PROPOSITION 2.5.5 If Y, Z are Banach spaces, W ⊆ Y is a nonempty open set, b ϑ : T ×W −→ Z is a continuous map, and we set ξ(y) = 0 ϑ(t, y)dt for all y ∈ W , then (a) ξ : W −→ Z is a continuous map. (b) If in addition for all t ∈ T, y −→ ϑ(t, y) is Fr´echet differentiable and (t, y) −→ ϑ (t, y) is continuous from T × W into L(Y, Z), then ξ ∈ C 1 (W, Z) and ξ (y) = yb ϑ (t, y)dt. 0 y PROOF: (a) Let {yn }n≥1 ⊆ W and suppose that yn −→ y ∈ W . Then {yn , y}n≥1 = C is compact in Y and C ⊆ W . So h(T × C) is compact in Z. Therefore we can find M > 0 such that h(t, yn )Z ≤ M
for all t ∈ T and all n ≥ 1.
Because h(t, yn ) −→ h(t, y) in Z for all t ∈ T , from the dominated convergence theorem we have that
b
b ξ(yn ) = h(t, yn )dt −→ h(t, y)dt = ξ(y) in Z as n → ∞, 0
0
⇒ ξ : W −→ Z is continuous.
(b) We show that ξ is continuously Gˆ ateaux differentiable. Then it follows that ξ ∈ C 1 (W, Z) (see Proposition 1.1.10). So let h ∈ Y and λ = 0. We have ξ(y + λh) − ξ(y) = λ
0
b
ϑ(t, y + λh) − ϑ(t, y) dt. λ
(2.56)
From Proposition 1.1.6 (the mean value theorem), we know that ϑ(t, y + λh) − ϑ(t, y) ≤ sup ϑy (t, y + µλh)L(Y,Z) λhY .
(2.57)
µ∈[0,1]
Because by hypothesis (t, y) −→ ϑy (t, y) is continuous from T × W into L(Y, Z), we can find M > 0 such that ϑy (t, y + µλh)L(Y,Z) ≤ M > 0
for all t ∈ T,
all µ ∈ [0, 1] and all λ ∈ [−1, 1].
(2.58)
Therefore ϑ(t, y + λh) − ϑ(t, y) for all t ∈ T ≤ M hY λ Z and all λ ∈ [−1, 1] (see (2.57) and (2.58)).
(2.59)
Recall that ϑ(t, y + λh) − ϑ(t, y) −→ ϑy (t, y)h λ
for all t ∈ T as λ −→ 0.
(2.60)
Because of (2.59), (2.60), and the dominated convergence theorem, if we pass to the limit as λ −→ 0, we obtain
100
2 Extremal Problems and Optimal Control ξ (y)h =
b
ϑy (t, y)h dt
for all h ∈ Y.
0
Therefore by virtue of part (a), ξ ∈ C 1 (W, Z) and ξ (y) =
b
ϑy (t, y) dt
for all y ∈ Y.
0
We use this proposition to establish the continuous Fr´echet differentiability of the integral functional IL and determine its derivative in terms of the partial derivatives of the Lagrangian L. THEOREM 2.5.6 If the Lagrangian L : U −→ R is continuously Fr´echet differentiable, then IL : D ⊆ C 1 (T, X) −→ R is continuously Fr´echet differentiable too and
b
L2 t, u(t), u (t) , v(t) X dt IL (u), v C 1 (T,X) = 0
+
b
L3 t, u(t), u (t) , v (t) X dt
0
for all v ∈ C 1 (T, X). Here by ·, ·C 1 (T,X) we denote the duality brackets for the
pair C 1 (T, X), C 1 (T, X)∗ , by ·, ·X the duality brackets for the pair (X, X ∗ ), and Lk k = 2, 3 the partial derivative of L(t, x, y) with respect to the second and third variables, respectively. PROOF: From Proposition 2.5.4 we know that the set D ⊆ C 1 (T, X) of admissible maps is open. We consider the map ϑ : T × D −→ R defined by
ϑ(t, u) = L t, u(t), u (t) for all (t, u) ∈ T × D. Let ϑ1 : T × D −→ U and ϑ2 : U −→ R be defined by
ϑ1 (t, u) = t, u(t), u (t) and ϑ2 (t, x, y) = L(t, x, y) for all (t, u) ∈ T × D and all (t, x, y) ∈ U . Clearly ϑ1 , ϑ2 are continuous and ϑ = ϑ2 ◦ ϑ1 . So ϑ is continuous too. Moreover, because for t ∈ T fixed the maps u −→ u(t) and u −→ u (t) are continuous linear, we see that for every t ∈ T, ϑ(t, ·) is continuously Fr´echet differentiable. Then by virtue of Propositions 1.1.14 and 1.1.17, we have
ϑ2 (t, u), v C 1 (T,X)= L2 t, u(t), u (t) , v(t) X+ L3 t, u(t), u (t) , v (t) X for all v ∈ C 1 (T, X). It follows that (t, u) −→ ϑ2 (t, u) is continuous from T × D into C 1 (T, X)∗ and so we can apply Proposition 2.5.5 and conclude that
IL (u), v
= C 1 (T,X)
L2 t, u(t), u (t) , v(t) X dt
b
0
+ 0
L3 t, u(t), u (t) , v (t) X dt
b
2.5 Calculus of Variations for all v ∈ C 1 (T, X).
101
We consider the problem with fixed endpoints. Namely we require that all admissible maps satisfy u(0)=x0 and u(b) = xb , where x0 , xb ∈ X are given. So let Cx10 ,xb (T, X) = u ∈ C 1 (T, X) : u(0) = x0 , u(b) = xb . This is a closed affine subspace of C 1 (T, X) produced by translation of the closed 1 subspace C01 (T, X) = C0,0 (T, X). Now let us formally introduce the concept of the local (relative) extremal point of a functional. DEFINITION 2.5.7 Let Y be a Banach space, C ⊆ Y a nonempty set, and ϕ : C −→ R. We say that y0 ∈ C is a local (relative) extremum of ϕ, if we can find V a neighborhood of y0 in Y such that one of the following two holds. (a) ϕ(y0 ) ≤ ϕ(y) for all y ∈ V ∩ C (local minimum). (b) ϕ(y0 ) ≥ ϕ(y) for all y ∈ V ∩ C (local maximum). If the inequalities in (a) and (b) are strict for y = y0 , then we speak of a strict local extremum (strict local minimum and strict local maximum). If the inequalities in (a) and (b) are true for all y ∈ C, then we speak of a (global) extremum on C (global) minimum on C and (global) maximum on C . In the next proposition we produce a necessary condition for IL to admit a local extremum on Cx10 ,xb (T, X). In what follows we assume that D ∩ Cx10 ,xb (T, X) is nonempty. PROPOSITION 2.5.8 If L : U −→ R is a continuously Fr´echet differentiable Lagrangian and IL admits a local relative extremum on Cx10 ,xb (T, X) at the map u ∈ Cx10 ,xb (T, X), then for every v ∈ D ∩ C01 (T, X) we have
IL (u), v
C 1 (T,X)
b
= 0
L2 t, u(t), u (t) , v(t) X dt
b
+ 0
L3 t, u(t), u (t) , v (t) X dt.
(2.61)
PROOF: From Proposition 2.5.4 we know that D is an open subset of C 1 (T, X). Hence D ∩ Cx10 ,xb (T, X) is an open subset of the closed affine subspace Cx10 ,xb (T, X). By Theorem 2.5.6 IL : D −→ R is continuously Fr´echet differentiable, thus we have that its restriction on Cx10 ,xb (T, X) is also continuously Fr´echet differentiable. Because IL attains its local relative minimum or local relative maximum at a map u ∈ Cx10 ,xb (T, X), we have
IL
1 Cx (T,X) 0 ,xb
But
IL
1 Cx (T,X) 0 ,xb
and so (2.61) follows.
(u) = 0.
(u) = IL (u)
C01 (T,X)
102
2 Extremal Problems and Optimal Control
Equation (2.61) is not yet in a convenient form. To achieve this we need some auxiliary results that are basic in the theory of calculus of variations. The first auxiliary result is usually known in the literature as the Lagrange lemma. LEMMA 2.5.9 If ξ ∈ C(T, X ∗ ), then ξ ≡ 0 if and only if for all u ∈ C01 (T, X).
b 0
ξ(t), u(t)X dt = 0
PROOF: ⇒: Obvious. ⇐: Suppose that ξ is not identically zero. Then we can find t0 ∈ (0, b) such that ξ(t0 ) = 0. Hence we can find x ∈ X such that ξ(t0 ), xX = 0 and without any loss of generality we may assume that ξ(t0 ), xX > 0. The function t −→ ξ(t), xX is continuous on T and so we can find δ > 0 that [t0 −δ, t0 +δ] ⊆ (0, b) and ξ(t), xX > 0 for all t ∈ [t0 − δ, t0 + δ]. Consider a function ϑ ∈ C 1 (T ) such that ϑ≥0
and
ϑ(t) = 0
A possible such function (known as a cut-off ⎧ ⎪ ⎨ 0
1 ϑ(t) = exp − δ2 −(t−t 2 ) 0 ⎪ ⎩ 0
for all t ∈ / [t0 − δ, t0 + δ]. function) is if 0 ≤ t ≤ t0 − δ if t0 − δ ≤ t ≤ t0 + δ. if t0 + δ ≤ t ≤ b
Then let u0 (t) = ϑ(t)x. Evidently u0 ∈ C01 (T, X) and the function t −→ ξ(t0 ), u0 (t)X is continuous on T , nonnegative, and ξ(t0 ), u0 (t)X > 0. Therefore it follows that
b
ξ(t), u0 (t)X dt > 0, 0
a contradiction to the hypothesis of the lemma. The second auxiliary result is known as Du Bois–Reymond’s lemma.
LEMMA 2.5.10 If ξ ∈ C(T, X ∗ ), then ξ is constant if and only if for all u ∈ b b C(T, X) with mean value zero (i.e., 0 u(t)dt = 0) we have 0 ξ(t), u(t)X dt = 0. PROOF: ⇒ : Suppose that ξ(t) = x∗ ∈ X ∗ for all t ∈ T . Also assume that u ∈ C(T, X) has mean value zero. Then from the properties of the Bochner integral, we have
b
b
b ξ(t), u(t)X dt = x∗ , u(t)X dt = x∗ , u(t)dt = 0. 0
0
0
X
⇐ : Suppose that ξ is not constant. Because ξ ∈ C(T, X) we can find 0 < t1 < t2 < b such that ξ(t1 ) = ξ(t2 ). So there exists x ∈ X such that ξ(t1 ), xX = ξ(t2 ), xX . Without any loss of generality we may assume that ξ(t1 ), xX < ξ(t2 ), xX . By virtue of the continuity of the function t −→ ξ(t), xX on T , we can find δ > 0 such that 0 < t1 − δ < t1 + δ < t2 − δ < t2 + δ < b and
2.5 Calculus of Variations
103
max ξ(t), xX : t ∈ [t1 − δ, t1 + δ] < min ξ(t), xX : t ∈ [t2 − δ, t2 + δ] . From this inequality it follows that ξ(t2 + s) − ξ(t1 + s), xX > 0
for all s ∈ [−δ, δ].
We consider a continuous cut-off function ϑ for the interval [−δ, δ]; that is, ϑ ≥ 0, ϑ(s) > 0 for s ∈ (−δ, δ) and ϑ(s) = 0 for s ∈ / (−δ, δ). A possible such function is given by δ−|s| if |s| ≤ δ. δ ϑ(s) = 0 if |s| > δ b
Let η(t) = 0 ϑ(s − t2 ) − ϑ(s − t1 ) ds for all t ∈ T and set u(t) = η(t)x. Evidently u ∈ C 1 (T, X) and u(0)=u(b)=0. Then
b
b
ξ(t), u (t) X dt = ϑ(s − t2 ) − ϑ(s − t1 ) ξ(t), xX dt 0
0
δ
ϑ(s) ξ(t2 + s) − ξ(t1 + s), xX ds,
= δ
a contradiction to the hypothesis.
The previous two lemmata lead to the third auxiliary result which provides the tools to simplify the extremality equation (2.61). b LEMMA 2.5.11 If ξ1 , ξ2 ∈ C(T, X ∗ ), then 0 ξ1 (t), u(t)X +ξ2 (t), u (t)X dt = 1 0 for all u ∈ C0 (T, X) if and only if ξ2 is differentiable on T and ξ2 (t) = ξ1 (t) for all t ∈ T . t PROOF: Let β(t) = 0 ξ1 (s)ds for all t ∈ T . Then β ∈ C 1 (T, X ∗ ) and β (t) = ξ1 (t) for all t ∈ T . Let u ∈ C01 (T, X). We have d ξ1 (t), u(t)X = for all t ∈ T, β(t), u(t)X − β(t), u (t) X dt
b
b ⇒ ξ1 (t), u(t)X dt = β(b), u(b)X − β(0), u(0)X − β(t), u (t) X dt 0
=−
⇒ 0
0
b
β(t), u (t)
0 b
ξ1 (t), u(t)X + ξ2 (t), u (t) X dt =
Then by virtue of Lemma 2.5.10, we have
b ξ2 (t) − β(t), u (t) X dt = 0
dt
X
0
b
(because u(0) = u(b) = 0) ξ2 (t) − β(t), u (t) X dt.
for all u ∈ C01 (T, X)
0
if and only if ξ2 (t) − β(t) = x∗0 ∈ X ∗ Therefore ξ2 ∈ C (T, X) and 1
ξ2 (t)
for all t ∈ T. = ξ1 (t) for all t ∈ T .
Using this lemma we can simplify equation (2.61).
104
2 Extremal Problems and Optimal Control
THEOREM 2.5.12 If L : U −→ R is a continuously differentiable Lagrangian and IL admits a local relative extremum on Cx10 ,xb (T, X) at u ∈ Cx10 ,xb (T, X), then
d
L3 t, u(t), u (t) = L2 t, u(t), u (t) dt
for all t ∈ T.
(2.62)
REMARK 2.5.13 Equation (2.62) is known as Euler’s equation (in Lagrange form). The functions u ∈ Cx10 ,xb (T, X) along which (2.62) is valid are called extremals. Next we present some special cases where the Euler equation has easily determined integrals. COROLLARY 2.5.14 If the Lagrangian function L(t, x, y) is actually independent of the third variable y ∈ X, then a necessary condition for the extremality of the map u ∈ Cx10 ,xb (T, X) is given by
L2 t, u(t) = 0 for all t ∈ T. COROLLARY 2.5.15 If the Lagrangian function L(t, x, y) is actually independent of the second variable x ∈ X, then the Euler equation (2.62) admits the following solution
for all t ∈ T. p(t) = L3 t, u (t) = p∗0 ∈ X ∗ REMARK 2.5.16 In mechanics p(t) is the momentum function and Corollary 2.5.15 says that along an extremal the momentum is constant. COROLLARY 2.5.17 If the Lagrangian function L(t, x, y) actually does not depend on the time variable t ∈ T and the extremal u ∈ C 2 (T, X) ∩ Cx10 ,xb (T, X), then the Euler equation (2.62) admits an energy integral
H(t) = p(t), u (t) X − L u(t), u (t)
for all t ∈ T. = L2 u(t), u (t) , u (t) X − L u(t), u (t) = H0 ∈ R PROOF: From the chain rule we have
d
d H(t) = L2 u(t), u (t) , u (t) + L2 u(t), u (t) , u (t) X dt dt
− L1 u(t), u (t) , u (t) X − L2 u(t), u (t) , u (t) X = 0 (see (2.62)). Hence H(t) = H0 ∈ R for all t ∈ T .
REMARK 2.5.18 The above corollary says that along an extremal the energy is constant (conservation of energy). To be able to analyze further the Euler equation (2.62), we need to introduce second-order derivatives.
2.5 Calculus of Variations
105
DEFINITION 2.5.19 Let X, Y be Banach spaces, U ⊆ X nonempty open, x0 ∈ U , and ϕ : U −→ Y a Fr´echet differentiable map. If the Fr´echet derivative x −→ ϕ (x) from U into L(X, Y ) with the operator norm topology is differentiable at x0 , then ϕ is said to be twice differentiable at x0 and this second-order derivative is denoted by ϕ (x0 ).
REMARK 2.5.20 According to this definition ϕ (x0 ) ∈ L X, L(X, Y ) . DEFINITION 2.5.21 Let X1 , X2 and Y be Banach spaces. A map L : X = X1 ×X2 −→ Y is said to be bilinear if x1 −→ L(x1 , x2 ) from X1 into Y and x2 −→ L(x1 , x2 ) from X2 into Y are both linear. We say that L is continuous, if there exists M ≥ 0 such that L(x)Y ≤ M x1 X x2 X
for all x = (x1 , x2 ) ∈ X = X1 × X2 .
(2.63)
The infimum of all M ≥ 0 for which (2.63) is true is the norm of L. Equipped with this norm the space of continuous bilinear maps L : X1 × X2 −→ Y is a Banach space denoted by L2 (X1 × X2 ; Y ). REMARK 2.5.22 Note that on X = X1 × X2 , we consider the norm xX = max{x 1 X1 , x2 X2 } for all x = (x1 , x2 ) ∈ X = X1 × X2 ; then LL2 = sup L(x)Y : x ∈ X, xX ≤ 1 . PROPOSITION 2.5.23
If X1 , X2 , and Y are Banach spaces, then L2 (X1 × X2 ; Y ), L X1 , L(X2 , Y ) , and L X2 , L(X1 , Y ) are isometrically isomorphic Banach spaces. PROOF: Let L ∈ L2 (X1 × X2 ; Y ) and x1 ∈ X1 . We consider the map x2 −→ K(x1 )(x2 ) = L(x1 , x2 ) from X2 into Y . We have K(x1 )(x2 )Y ≤ LL2 x1 X1 x2 X2 , ⇒ K(x1 ) ∈ L(X2 , Y )
and
K(x1 )L ≤ LL2 x1 X .
(2.64)
Also the map x1 −→ K(x1 ) is linear and (2.64) implies that it is also continuous and satisfies ≤ LL . K
(2.65) 2 L X1 ,L(X2 ,Y )
On the other hand, if K ∈ L X1 , L(X2 , Y ) , for all x1 ∈ X1 we set L(x1 , x2 ) = K(x1 )x2 . Evidently L : X1 ×X2 −→ Y is a bilinear map and L(x1 , x2 )Y ≤ K(x1 )L x2 X2 ≤ K
L X1 ,L(X2 ,Y )
⇒ L ∈ L2 (X1 × X2 ; Y )
and
LL2 ≤ K
x1 X x2 X , 1 2
L X1 ,L(X2 ,Y )
.
(2.66)
From(2.65) and(2.66) we conclude that L 2 (X1×X2 ; Y ) and L X1 , L(X2 , Y ) are isometrically isomorphic and similarly for L X2 , L(X1 , Y ) . By virtue of Proposition 2.5.23, if ϕ : U −→ Y is a twice differentiable map and x0 ∈ U , then ϕ (x0 ) ∈ L2 (X × X; Y ). In fact we can say more.
106
2 Extremal Problems and Optimal Control
PROPOSITION 2.5.24 If X, Y are Banach spaces, U ⊆ X is nonempty open, x0 ∈ X, and ϕ : U −→ X is a map which is twice differentiable at x0 (see Definition 2.5.19), then ϕ (x0 ) ∈ L2 (X × X; Y ) is symmetric; that is, for all x, u ∈ X we have ϕ (x0 )(x, u) = ϕ (x0 )(u, x). PROOF: Let r > 0 be such that Br (x0 ) ⊆ U and consider the map ψ : [0, 1] −→ Y defined by ψ(t) = ϕ(x0 + tx + u) − ϕ(x0 + tx)
for xX , uY ≤ r.
From the mean value theorem (see Proposition 1.1.6), we have ψ(1) − ψ(0) − ψ (0)X ≤ sup ψ (t) − ψ (0)X .
(2.67)
t∈[0,1]
Also from the chain rule (see Proposition 1.1.14), we have
ψ (t) = ϕ (x0 + tx + u) − ϕ (x0 + tx) x
= ϕ (x0 + tx + u) − ϕ(x0 ) − ϕ (x0 )tx x
− ϕ (x0 + tx) − ϕ(x0 ) − ϕ (x0 )tx x. Given ε > 0, we can find δ > 0 such that if xX , uX ≤ δ and 0 ≤ t ≤ 1, then ϕ (x0 + tx + u) − ϕ (x0 ) − ϕ (x0 )(tx + u)L ≤ ε(xX + uX ) and
ϕ (x0 + tx) − ϕ (x0 ) − ϕ (x0 )(tx)L ≤ εxX .
These imply
and
ψ (t) − ϕ (x0 )u xY ≤ 2ε(xX + uX )
ψ(1) − ψ(0) − ϕ (x0 )u xY ≤ ψ(1) − ψ(0) − ϕ (0)Y
+ ϕ (0) − ϕ (x0 )u xY ≤ 6ε(xX + uX )xX
(2.68)
(2.69)
(see (2.67) and (2.68)). But ψ(1) − ψ(0) = ϕ(x0 + tx + u) − ϕ(x0 + x) − ϕ(x0 + u) + ϕ(x0 ) is symmetric in x, u ∈ X and so in (2.68) we may interchange x and u and obtain
ϕ (x0 )u x − ϕ (x0 )x uY ≤ 6ε(xX + uX )2 for all xX , uX ≤ δ. (2.70) If we replace x, u by λx, λu with λ > 0, then both sides of (2.70) are multiplied with λ2 . So (2.70) holds also for xX = uX = 1. Hence ϕ (x0 )(u, x) − ϕ (x0 )(x, u)Y ≤ 24ε
for all xX = uX = 1.
(2.71)
Because ε > 0 was arbitrary, from (2.71) we conclude that ϕ (x0 )(x, u) = ϕ (x0 )(u, x)
for all x, u ∈ X.
Now that we have the second-order derivative at our disposal, we can further analyze Euler’s equation (2.62). Assume that the extremal u belongs in C 2 (T, X).
2.5 Calculus of Variations
107
In order to keep track with respect to which variable we differentiate at each step, we consider the variables (t, x, y) of the Lagrangian and by Lt (resp., Lx , Ly ) we denote the partial derivative of L with respect to t (resp., with respect to x, y). Then we have d
d
L3 t, u(t), u (t) = Ly t, u(t), u (t) dt dt
= Lyt t, u(t), u (t) + Lyx t, u(t), u (t) u (t)
+Lyy t, u(t), u (t) u (t) in X ∗ . Therefore the Euler equation can be written as follows.
Lyy t, u(t), u (t) u (t) + Lyx t, u(t), u (t) u (t)
in X ∗ . + Lyt t, u(t), u (t) − Lx t, u(t), u (t) = 0
(2.72)
This is a second-order differential equation with the function u unknown. Of course (2.72) is not in canonical form, because it is not solved for∗u . In the sequel
by imposing a suitable condition on Lyy t, u(t), u (t) ∈ L(X, X ), we are able to put (2.72) in canonical form. As this point we focus on a difficulty that we encounter. Equation (2.72) was derived under the assumption that u ∈ C 2 (T, X). On the other hand the Euler equation (2.62) in Theorem 2.5.12 was obtained under the assumption that the extremal u belongs in C 1 (T, RN ). In what follows we show how we can overcome this difficulty. PROPOSITION 2.5.25 If L : U −→ R is a twice differentiable Lagrangian function (i.e., L ∈ C 2 (U )), (t0 , x0 , y0 ) ∈ U with 0 < t0 < b, and Lyy (t0 , x0 , y0 ) is an isomorphism from X into X ∗ , then (a) There exists a solution u of the Euler equation (2.62) defined on an open interval I0 containing t0 such that u(t0 ) = x0 , u (t0 ) = y0 and u is C 2 . (b) Every other solution u of the Euler equation (2.62) defined on an open interval I containing t0 and satisfying u(t0 ) = x0 , u (t0 ) = y0 coincides with u on an open interval containing t0 and u is C 2 on that interval. PROOF: Consider the map ψ : U −→ R × X × X defined by
ψ(t, x, y) = t, x, Ly (t, x, y) .
(2.73)
Because by hypothesis Lyy (t0 , x0 , y0 ) is an isomorphism from X onto X ∗ , we can apply the inverse theorem on the function Ly (see Theorem 1.1.25). So according to that theorem, we can find a neighborhood V of (t0 , x0 , y0 ) ∈ R × X × X and a neighborhood W of t0 , x0 , p0 = Ly (t0 , x0 , y0 ) ∈ R × X × X ∗ , such that the function ψ defined by (2.73) is a C 1 -diffeomorphism from V onto W . The inverse function ψ −1 (also a C 1 -function) is of the form (t, x, p) −→ t, x, y = g(t, x, p) . We consider the following system on X × X ∗
* ) x (t) = g t, x(t), p(t)
. (2.74) p (t) = Lx t, x(t), g t, x(t), p(t) Note that (2.74) is in canonical form (it is solved with respect to the higher-order derivatives x , p of the unknown functions x, p). The vector field of (2.74),
108
2 Extremal Problems and Optimal Control
(t, x, p) −→ g(t, x, p), Lx t, x, g(t, x, p) ,
is a C 1 -function from W into X × X ∗ and the triple (t0 , x0 , p0 ) corresponds to the Cauchy data of (2.74). So by the basic existence and uniqueness theorem for initial value problems in Banach spaces, we can find a unique maximal solution t −→ x(t), p(t) of (2.74) such that x(t0 ) = x0 and p(t0 ) = p0 . Let I be the open interval containing t0 on which the solution is defined. We know that on this interval the solution is C 2 . Let u(t) = x(t). Then u ∈ C 2 (I, X) satisfies the Euler equation (2.62) and the initial conditions u(t0 ) = x0 , u (t0 ) = y0 . Next let u : I −→ X be another solution defined on an interval I containing t0 satisfying u(t0 ) = x0 , u (t 0 ) = y0 . For t close to t0 , t, u(t), u (t) ∈ V . So we can consider ψ(t), u(t), u (t) (see (2.73)). Then the function t −→ u(t), u (t) solves (2.74) and satisfies the Cauchy data (t0 , x0 , y0 ). By virtue of the uniqueness of the solution, we conclude that for all t ∈ I ∩ I we have u(t) = u(t). REMARK 2.5.26 This proposition shows that in order to hope to be able to transform (2.72) to a canonical first-order system we need to assume that the Lagrangian function L is C 2 . Before proceeding to the Hamiltonian formulation of the necessary conditions for extremality, let us illustrate the use of Euler’s equation through some characteristic examples. EXAMPLE 2.5.27 (a) Let L(y) = 1 + y 2 and consider the problem inf IL (u): 1 u ∈ C ([0, 1]), u(0) = x0 , u(1) = x1 . Then the Euler equation is (d dt)(∂ ∂y) 1 + u (t)2 = 0, u(0) = x0 , u(1) = x1 , which gives u(t) = (x1 − x0 )t + x0 for all t ∈ T . This is a global minimum of IL (·) over Cx10 ,x1 ([0, 1]) and simply says that the curve with the minimum length joining x0 and x1 is the straight line connecting the two points. 1 (b) Let L(y)=y 3 and consider the problem inf[IL (u) : u ∈ C ([0, 1]), u(0) = 0, u(1) = 1]. Then the Euler equation is u (t)2 = 0, u(0) = 0, u(1) = 1. The unique solution of this boundary value problem is u(t) = t for t ∈ [0, 1]. This produces a 1 local minimum for IL on C0,1 ([0, 1]). Let v ∈ C01 ([0, 1]). Then u + v is admissible and we have
1
3 IL (u + v) = dt t + v(t) 0
1
v (t)dt +
= IL (u) + 3 0 1
1
3v (t)2 + v (t)3 dt
0
3v (t)2 + v (t)3 dt
= IL (u) +
(because v(0) = v(1) = 0).
0
Evidently if 3v (t)2 + v (t)3 ≥ 0 for all t ∈ [0, 1], then IL (u + v) ≥ IL (u). In particular if vC 1 ([0,1]) ≤ 3, then 3v (t)2 + v (t)3 ≥ 0 for all t ∈ [0, 1] and so IL (u + v) ≥ IL (u) which proves that the extremal u produces a local minimum of 1 IL on C0,1 ([0, 1]). (c) If L is not C 1 , then the extremal function need not be C 1 . Indeed let L(t, y) = t2/3 y 2 and consider the problem inf[IL (u) : u ∈ C 1 ([0, 1]), u(0) = 0, u(1) = 1]. In
2.5 Calculus of Variations this case the Euler equation has the form
2/3 2t u (t) = 0, u(0) = 0,
109
u(1) = 1.
Solving this boundary value problem, we obtain u(t) = t2/3 which does not 1 belong in C 1 ([0, 1]), but it is a global minimum of IL over C0,1 ([0, 1]). This example is due to Hilbert and shows that a variational problem does not always have a solution in the given class of curves under consideration. (d) In this example the Euler equation has no solutions and moreover, there is no solution even in the larger space of absolutely continuous functions. So let L(t, y) = (ty)2 and consider the problem inf[IL (u) : u ∈ C 1 ([0, 1]), u(0) = 0, u(1) = 1]. In this case the Euler equation has the form
2 2t u (t) = 0, u(0) = 0, u(1) = 1. The general solution of the differential equation is u(t) = (c/t) + d. First we observe that u does not belong in C 1 ([0, 1]) and none of these functions satisfies the boundary conditions u(0) = 0, u(1) = 1. In fact the variational problem has no solution in the space of absolutely continuous functions with the given boundary values. Indeed, if v(t) is such a function, then IL (v) > 0 and the value of the problem is zero. To see this consider nt if t ∈ [0, n1 ] . vn (t) = 1 if t ∈ [ n1 , 1] Then we have IL (xn ) −→ 0. This example is due to Weierstrass and was produced as an argument against Riemann’s justification of the Dirichlet principle. Now we introduce the Legendre transform, which is the “forefather” of the Legendre–Fenchel transform, which was introduced in Definition 1.2.15 and which is the basic tool in the duality theory of convex functions. DEFINITION 2.5.28 Let X be a Banach space, U ⊆ R × X × X a nonempty open set, and L : U −→ R a C 2 Lagrangian function. (a) The Legendre transform of L is the function L : U −→ R × X × X ∗ defined by
L(t, x, y) = t, x, p = Ly (t, x, y) . (b) We say that the Lagrangian is regular if the corresponding Legendre transform L is a local diffeomorphism. We say that the Lagrangian is hyperregular if the corresponding Legendre transform L is a diffeomorphism of U on an open set L(U ) ⊆ R × X × X ∗ . REMARK 2.5.29 From the inverse theorem (see Theorem 1.1.25), the Lagrangian L is regular if and only if the Fr´echet derivative of L at all (t, x, y) ∈ U is an isomorphism from T × X × X onto T × X × X ∗ . In what follows let D denote the Fr´echet derivative with respect to all three variables (t, x, y) ∈ U of the function involved. We have
DL(t, x, y)(λ, v, w) = λ, v, DLy (t, x, y)(λ, v, w) .
110
2 Extremal Problems and Optimal Control
Note that DLy (t, x, y)(λ, v, w) = λLyt (t, x, y) + Lyx (t, x, y)v + Lyy (t, x, y)w
in X ∗ .
Then DL(t, x, y) ∈ L(R × X × X, R × X × X ∗ ) is an isomorphism if and only if
Lyy (t, x, y) ∈ L(X, X ∗ ) is an isomorphism. In Proposition 2.5.25 we have assumed that Lyy (t0 , x0 , y0 ) ∈ L(X, X ∗ ) is an isomorphism. Recall that in L(X, X ∗ ) with the operator norm topology, the set of isomorphisms is open. So for all (t, x, y) near (t0 , x0 , y0 ) in R × X × X, we have that Lyy (t, x, y) ∈ L(X, X ∗ ) is an isomorphism. Therefore we see that the hypothesis in Proposition 2.5.25 implies that the Lagrangian L is regular in a neighborhood of the point (t0 , x0 , y0 ). that is hyperreguTHEOREM 2.5.30 If L : U −→ R is a C 2 Lagrangian function
lar, L is the Legendre transform of L with L−1 its inverse L−1 (t, x, y) = t, x, y = ∗ g(t, x, p) , and H : L(U ) −→ R × X × X is defined by H(t, x, p) = p, g(t, x, p)X − (L ◦ L−1 )(t, x, p), then the Euler equation
d
Ly t, u(t), u (t) = Lx t, u(t), u (t) dt
for all t ∈ T = [0, b]
is equivalent to the following first-order canonical system
) * x (t) = Hp t, x(t), p(t) . p (t) = −Hx t, x(t), p(t) for all t ∈ T = [0, b]
(2.75)
(2.76)
The function H is the Hamiltonian associated with the Lagrangian L and system (2.76) is called a Hamiltonian system. The equivalence of (2.75) and (2.76) is in the u(t) is a solution of the Euler equation (2.75), then t −→
following sense: if t −→ u(t), L y t, u(t), u (t) is a solution of the Hamiltonian system (2.76). Conversely, if t −→ x(t), p(t) is a solution of the Hamiltonian system (2.76), then t −→ u(t) = x(t) is a solution of the Euler equation (2.75). PROOF: In the proof of Proposition 2.5.25 we proved the local equivalence of (2.75) with the canonical system (2.74). In fact due to the hyperregularity of the Legendre transform L, this equivalence is actually global. So it suffices to show the equivalence of (2.74) and (2.76). By definition we have
H(t, x, p) = p, g(t, x, p)X − L t, x, g(t, x, p) . We have Hp (t, x, p) ∈ X ∗∗ . So for all x∗ ∈ X ∗ Hp (t, x, p), x∗ X ∗ = x∗ , g(t, x, p)X + p, gp (t, x, p)x∗ X
− Ly t, x, g(t, x, p) , gp (t, x, p)x∗ X .
(2.77)
But p = Ly t, x, g(t, x, p) . Hence from (2.77) we infer that for all x∗ ∈ X ∗ Hp (t, x, p), x∗ X ∗ = x∗ , g(t, x, p)X ⇒ Hp (t, x, p) = g(t, x, p) ∈ X.
(2.78)
2.6 Optimal Control
111
Then we see that the first equations in (2.74) and (2.76) are the same. Also if v ∈ X, then
Hx (t, x, p), v X = p, hx (t, x, p)v X − Lx t, x, g(t, x, p) , v X
− Ly t, x, g(t, x, p) , hx (t, x, p)v X . (2.79)
Again because p = Lx t, x, g(t, x, p) , from (2.79) we obtain
for all v ∈ X, Hx (t, x, p), v X = − Lx t, x, g(t, x, p) , v X
⇒ Hx (t, x, p) = Lx t, x, g(t, x, p) . EXAMPLE 2.5.31 The Poincar´ e half-plane: Let P = {(x, y) ∈ R2 : y > 0} and 2 L : P × R −→ R+ is defined by L(x, y, u, v) =
1 u2 + v 2 . 2 y2
The integral functional IL is defined by
1 b x (t)2 + y (t)2 IL (u) = dt, 2 0 y(t)2
where u : T = [0, b] −→ P and u(t) = x(t), y(t) for all t ∈ T . Then L(x, y, u, v) = (x, y, p, q)
with p = u y 2 , q = v y 2 . This is a diffeomorphism (i.e., L is hyperregular). Also H(x, y, p, q) =
1 2 2 y (p + q 2 ). 2
Then the Hamiltonian system (2.76) takes the following form: ) ) x (t) = y(t)2 p(t) p (t) = 0
. y (t) = y(t)2 q(t) q (t) = −y(t) p(t)2 + q(t)2 This system is a model for the hyperbolic geometry. For this reason its study is important in the understanding of non-Euclidean geometries.
2.6 Optimal Control In this section we study some basic aspects of optimal control theory. Specifically we focus our attention on the existence theory, the relaxation methods, and the derivation of the maximum principle (a necessary condition for optimality of an admissible state-control pair). To avoid technical complications that require the development of a substantial mathematical background, we limit ourselves to finite dimensional systems (i.e., systems driven by ordinary differential equations, also known as lumped parameter systems).
112
2 Extremal Problems and Optimal Control
So the mathematical setting is the following. The state space is RN and the control space is a Polish space Y . Recall that a Polish space is a Hausdorff topological space which is separable and can be metrized by means of a complete metric. The time horizon is T = [0, b]. Let Γ(T, Y ) = {u : T −→ Y : u is measurable}. Any element of Γ(T, Y ) is called a control. Also f : T × RN × Y −→ RN is a vector field and the controlled dynamical system we are considering is the following
x (t) = f t, x(t), u(t) a.e. on T, x(0) = x0 ∈ RN . (2.80) Any solution x(·) of (2.80) is referred to as a state trajectory of the control system corresponding to the initial state x0 ∈ RN and the control u(·). Note that we do not assume the uniqueness and/or the existence of solutions for (2.80). So for any control u(·) and any initial state x0 , we may have more than one or no response for the system. Moreover, in general we have some constraints on the control described by a multifunction (set-valued function) U : T −→ 2Y \ {∅}. For these reasons, the following definition is necessary.
DEFINITION 2.6.1 A pair x(·), u(·) ∈ W 1,1 (0, b), RN ×Γ(T, Y ) is said to be admissible (or feasible) state-control pair, if (2.80) is satisfied and u(t) ∈ U (t) a.e. on T . In what follows by αad we denote the set of all admissible state-control pairs. We are also given an integral cost functional
b
J(x, u) = L t, x(t), u(t) dt for all (x, u) ∈ αad . 0
Then our optimal control problem can be stated as follows: (P) Find (x, u) ∈ αad such that J(x, u) = inf J(x, u) : (x, u) ∈ αad = m.
(2.81)
then DEFINITION 2.6.2 If we can find (x, u) ∈ αad such that (2.81) is satisfied,
we say that (x, u) is an optimal admissible pair . The function x ∈ W 1,1 (0, b), RN is said to be an optimal trajectory and the function u ∈ Γ(T, Y ) is said to be an optimal control . The hypotheses on the data of the optimal control problem (P), are the following. H(f ): f : T × RN × Y −→ RN is a function, such that (i) For all (x, u) ∈ RN × Y, t −→ f (t, x, u) is measurable. (ii) For almost all t ∈ T, (x, u) −→ f (t, x, u) is continuous. (iii) For almost all t ∈ T , all x ∈ RN and all u ∈ U (t), we have f (t, x, u) ≤ α(t) + c(t)x
with α, c ∈ L1 (T )+ .
H(U): U : T −→ 2Y \ {∅} is a multifunction with compact values such that GrU = {(t, u) ∈ T × Y : u ∈ U (t)} ∈ B(T ) × B(Y ), with B(T ) (resp., B(Y )) being the Borel σ-field of T (resp., of Y ). REMARK 2.6.3 From measure theory we know that B(T )×B(Y ) = B(T ×Y )(= the Borel σ-field of T × Y ).
2.6 Optimal Control
113
H(L): L : T ×RN ×Y −→ R = R ∪ {+∞} is an integrand such that (i) (t, x, u) −→ L(t, x, u) is measurable. (ii) For almost all t ∈ T, (x, u) −→ L(t, x, u) is lower semicontinuous proper. (iii) For almost all t ∈ T , all x ∈ RN , and all u ∈ U (t) α0 (t) − c0 x ≤ L(t, x, u)
with α0 ∈ L1 (T ), c0 > 0.
In the existence theory the following convexity-type hypothesis plays a central role. Hc : For all (t, x) ∈ T × RN the set Q(t, x) = {(v, λ) ∈ RN ×R : v = f (t, x, u), u ∈ U (t), L(t, x, u) ≤ λ} is convex. REMARK 2.6.4 Hypotheses H(f), H(u), and H(L) imply that for all (t, x) ∈ T × RN , Q(t, x) is also closed. Hypothesis Hc means that the optimal control problem has enough “convex structure”. Note that if the control variable enters linearly in the dynamics of the system (i.e., f (t, x, u) = f1 (t, x) + f2 (t, x)u), Y = Rm , the control constraint multifunction has also convex values and the cost integrand L(t, x, u) is in addition convex in u ∈ Rm , then the hypothesis Hc is N satisfied. Note
that this hypothesis implies that for almost all t ∈ T and all x ∈ R , F (t, x) = f t, x, U (t) is convex. In Chapter 6, we produce equivalent conditions for Hc to hold.
1 In what follows let αad ⊆ W 1,1 (0, b), RN be the set of admissible states (traN jectories). We start by establishing the nonemptiness and
compactness in C(T, R ) 1 1,1 N of αad . Recall that the Sobolev space W (0, b), R is embedded continuously 1 (but not compactly) in C(T, RN ). To determine the properties of αad , we study the following differential inclusion.
) * x (t) ∈ F t, x(t) a.e. on T = [0, b], . (2.82) x(0) = x0 We impose the following conditions on the multifunction F (t, x): N
H(F): F : T × RN −→ 2R \ {∅} is a multifunction with compact and convex values such that (i) For all x ∈ RN , GrF (·, x)={(t, v)∈T ×RN : v∈F (t, x)}∈B(T ) × B(RN ). (ii) For almost all t ∈ T, GrF (t, ·) = {(x, v) ∈ RN ×RN : v ∈ F (t, x)} is closed. (iii) For almost all t ∈ T , all x ∈ RN , and all v ∈ F (t, x), we have v ≤ α(t) + c(t)x
with α, c ∈ L1 (T )+ .
REMARK 2.6.5 If F (t, x) = f t, x, U (t) , then by virtue of hypotheses H(f), H(U), H(L), and Hc , the multifunction F satisfies hypotheses H(F). DEFINITION
2.6.6 By a solution of (2.82), we mean a function x ∈ W 1,1 (0, b), RN such that x(0) = x0 and x (t) ∈ F t, x(t) for almost all t ∈ T . We
denote the set of solutions of (2.82) by SF (x0 ). Then SF (x0 ) ⊆ W 1,1 (0, b), RN ⊆ C(T, RN ).
114
2 Extremal Problems and Optimal Control
To be able to solve (2.82) and eventually study the optimal control problem, we need some results from multivalued analysis, which we state here for easy reference and postpone their proof until Chapter 6, where we conduct a more systematic and detailed study of multifunctions. The first result is known as the Kakutani–Ky Fan fixed point theorem. THEOREM 2.6.7 If X is a locally convex space, C ⊆ X is nonempty, compact and convex, and G : C −→ 2C is a multifunction with nonempty, closed and convex values that has a closed graph (i.e., Gr G = {(x, y) ∈ C × C : y ∈ G(x)} is closed in X × X), then there exists x ∈ X such that x ∈ G(x). The second result is known as the Yankov–von Neumann–Aumann selection theorem. It is stated in a less general form, which however suffices for our needs here. The general form of the result can be found in Theorem 6.3.20. THEOREM 2.6.8 If (Ω, Σ, µ) is a σ-finite measure space, X is a Polish space, and G : Ω −→ 2X is a multifunction such that Gr G = {(ω, x) ∈ Ω × X : y ∈ G(ω)} ∈ Σ × B(X), then there exists a Σ-measurable function g : Ω −→ X such that g(ω) ∈ G(ω) µ-a.e. on Ω. It is well known that given a Borel set in R2 , its projection on a coordinate axis need not be Borel. The next theorem provides conditions which guarantee that the projection is indeed Borel. THEOREM 2.6.9 If S, X are Polish spaces, G ∈ B(S × X) = B(S) × B(X), and for every s ∈ S the section G(s) = {x ∈ X : (s, x) ∈ G} is σ-compact, then projS G ∈ B(S). Now we are ready to determine the properties of the solution set S(x0 ) of problem (2.82). THEOREM 2.6.10 If hypotheses H(F ) hold, then SF (x0 ) = ∅ and SF (x0 ) is compact in C(T, RN ). PROOF: We start by establishing an a priori bound for the elements of SF (x0 ). 1 N So suppose that x ∈ SF (x0 ). Then for some v ∈ L (T, R ) that satisfies v(t) ∈ F t, x(t) a.e. on T , we have
t
x(t) ≤ x0 +
v(s)ds
0
⇒ x(t) ≤ x0 +
t
for all t ∈ T,
α(s) + c(s)x(s)
for all t ∈ T.
0
Invoking Gronwall’s inequality, we can find M1 > 0 such that x(t) ≤ M1
for all t ∈ T and all x ∈ S(x0 ).
Let pM1 : RN −→ RN be the M1 -radial contraction; that is,
2.6 Optimal Control pM1 (x) =
M1 x x
115
if x > M1
. if x ≤ M1
Clearly pM1 (·) is Lipchitz continuous. Let F1 (t, x) = F t, pM1 (x) . Then F1 still satisfies H(F )(i) and (ii) (with F replaced by F1 ) and in addition for almost all t ∈ T , all x ∈ RN , and all v ∈ F1 (t, x), we have x
v ≤ η(t) a.e. on T, with η(·) = α(·) + c(·)M ∈ L1 (T )+ . Set C = g ∈ L1 (T, RN ) : g(t) ≤ η(t) a.e. on T . By virtue of the Dunford– Pettis theorem, C furnished with the relative weak topology is a compact convex set. Let ξ : C −→ C(T, RN ) be defined by
t g(s)ds. ξ(g)(t) = x0 + 0
Via the Arzela–Ascoli theorem, we check that ξ is sequentially continuous from C with the relative weak topology into C(T, RN ) with the norm topology. Then let G1 : C −→ 2C be defined by
a.e. on T }. G1 (g) = {h ∈ L1 (T, RN ) : h(t) ∈ F1 t, ξ(g)(t) Let ξ(g) ∈ C(T, RN ) and let {sn }n≥1 be step functions from T into RN such that sn (t) −→ x(t) a.e. on T as n → ∞. Then because for all y ∈ RN , t −→ F1 (t, y) has a N measurable graph, applying Theorem 2.6.8 for every n ≥ 1 we can find hn : T −→ R a measurable map such that hn (t) ∈ F1 t, sn (t) a.e on T . Evidently hn (t) ≤ η(t) a.e on T and so by the Dunford–Pettis theorem and by passing to a suitable w subsequence if necessary, we may assume that hn −→ h in L1 (T, RN ). Then using Mazur’s lemma and
the fact that for almost all t ∈ T, y −→ F1 (t, y) has closed graph, we have h(t) ∈ F1 t, x(t) a.e. on T (i.e., h ∈ G1 (g)). So G1 has nonempty values, which clearly are closed and convex in C. Moreover, the above argument reveals that Gr G ⊆ C × C is sequentially closed in C × C, when C is equipped with the relative weak topology. Recall that the relative weak topology on C is metrizable. So we can apply Theorem 2.6.7 and obtain g ∈ C such that g ∈ G1 (g). Then if x = ξ(g), we have
x (t) ∈ F1 t, x(t) a.e. on T, x(0) = x0 ,
t v(s)ds for all t ∈ T, with v ∈ L1 (T, RN ), ⇒ x(t) ≤ x0 + 0
v(s) ∈ F1 s, x(s) a.e. on T. (2.83) But from the definition of F1 , we see that v(s) ≤ α(s) + c(s)x a.e. on T and so from (2.83) and Gronwall’s inequality as before it follows that for all t ∈ T, x(t) ≤ M1
for all t ∈ T ⇒ F1 t, x(t) = F t, x(t)
(i.e., x ∈ SF (x0 )).
Therefore we have proved the nonemptiness of SF (x0 ). Clearly S(x0 ) ⊆ C(T, RN ) is closed, whereas from the Arzela–Ascoli theorem we have that S(x0 ) is relatively compact in C(T, RN ). Then SF (x0 ) is compact in C(T, RN ).
116
2 Extremal Problems and Optimal Control
Now if F (t, x) = f t, x, U (t) and x ∈ S(x0 ), then we set
V (t) = {u ∈ U (t) : x (t) = f t, x(t), u },
⇒ Gr V = {(t, u) : x (t) = f t, x(t), u } ∩ Gr U.
Because of hypotheses H(f )(i),(ii), (t, u)−→x (t)−f t, x(t), u is a Carath´eodory
function (i.e., for all u ∈ Y, t−→x (t)−f t, x(t), u is measurable, whereas for almost all t ∈ T, u−→x (t)−f t, x(t), u is continuous). Hence (t, u) −→ x (t) − f t, x(t), u is measurable and combining this with hypothesis H(U ) we see that Gr V ∈ L(T ) × B(Y ) with L(T ) being the Lebesgue σ-field of T and B(Y ) the Borel σ-field of Y . So we can apply Theorem 2.6.8 and obtain u : T −→ Y a measurable map such that u(t) ∈ V (t) a.e. on T . Then (x, u) ∈ αad and so Theorem 2.6.10 implies that 1 αad = SF (x0 ) ⊆ C(T, RN ) is compact. Next we solve the optimal control problem (P). The approach is the following. We use hypothesis Hc to transform (P) into a control-free variational problem with convex structure (calculus of variations problem) with the same values as (P). We solve the calculus of variations problem using the direct method . This method is known as the reduction method . THEOREM 2.6.11 If hypotheses H(f ), H(U ), H(L), and Hc hold and m < +∞, 1 then problem (P) admits an optimal state-control pair (x, u) ∈ αad . PROOF: We start implementing the reduction method described above. To this end let A(t, x, v) = {u ∈ U (t) : v = f (t, x, u)}. This is the set of all admissible controls which at time t ∈ T and when the system is at state x ∈ RN , produce the “velocity” v ∈ RN . Evidently, by modifying f, U , and L on a Lebesgue-null set, we have Gr A = (t, x, v, u) ∈ T ×RN ×RN ×Y : (t, u) ∈ Gr U, v − f (t, x, u) = 0 ∈ B(T ) × B(RN ) × B(RN ) × B(Y ). Let p : T ×RN ×RN −→ R = R ∪ {+∞} be defined by p(t, x, v) = inf[L(t, x, u) : u ∈ A(t, x, v)]
(2.84)
which represents the minimum cost to generate “velocity” v, at time t ∈ T , when the state of the system is x ∈ RN and we use admissible controls. As always inf ∅ = +∞. Claim 1: (t, x, v) −→ p(t, x, v) is Borel-measurable. For every λ ∈ R we have {(t, x, v) ∈ T × RN × RN : p(t, x, v) ≤ λ} = projT ×RN ×RN (t, x, v, u) ∈ T × RN × RN × Y : L(t, x, u) ≤ λ, u ∈ A(t, x, v) ∈ B(T ) × B(RN ) × B(RN ) (see Theorem 2.6.9). Claim 2: For every t ∈ T , the function (x, v) −→ p(t, x, v) is lower semicontinuous.
2.6 Optimal Control
117
We need to show that for every λ ∈ R, the set S(λ) = {(x, v) ∈ RN × RN : p(t, x, v) ≤ λ} is closed. So let {(xn , vn )}n≥1 ⊆ S(λ) and assume that xn −→ x, vn −→ v in RN as n → ∞. From (2.84) we know that we can find un ∈ A(t, xn , vn ) such that p(t, xn , vn ) = L(t, xn , un ). We have un ∈ U (t) and vn = f (t, xn , un ). By passing to a suitable subsequence if necessary, we may assume that un −→ u ∈ U (t). Then v = f (t, x, u) (see hypothesis H(f )(ii)) and so u ∈ A(t, x, v). Therefore p(t, x, v) ≤ L(t, x, u) ≤ lim inf L(t, xn , un ) = lim inf p(t, xn , vn ) ≤ λ n→∞
n→∞
(see hypothesis H(L)(ii)), ⇒ (x, v) ∈ S(λ),
(i.e., p(t, ·, ·) is lower semicontinuous).
Claim 3: For all (t, x) ∈ T × RN , v −→ p(t, x, v) is convex. Note that epi(t, x, ·) = {(v, λ) ∈ RN × R : p(t, x, v) ≤ λ} # = {(v, λ) ∈ RN × R : L(t, x, u) ≤ λ + ε, ε>0
u ∈ U (t), v = f (t, x, u)}. So by virtue of hypothesis Hc , epi(t, x, ·) is convex; that is, v −→ p(t, x, v) is convex as claimed. Now let {(xn , un )}n≥1 ⊆ αad be a minimizing sequence for problem (P); that is, J(xn , un ) ↓ m as n → ∞. Because of Theorem 2.6.10, we may assume that xn −→ x in C(T, RN ). Also from hypothesis H(f )(iii) we see that {xn (·)}n≥1 ⊆ L1 (T, RN ) is uniformly integrable and so by the Dunford–Pettis theorem, we may assume that w xn −→ v in L1 (T, RN ). Evidently v = x . Using Claims 1, 2, and 3 and Theorem 2.1.28, we obtain
b
b
p t, x(t), x (t) dt ≤ lim inf p t, xn (t), xn (t) dt n→∞
0
0
≤ lim J(xn , un ) = m < +∞. n→∞
So due to hypothesis H(L)(iii) and by redefining t −→ p t, x(t), x (t) on a
Lebesgue-null set, we can say that for all t ∈ T, p t, x(t), x (t) is finite. A straightforward application of Theorem 2.6.8 produces a measurable function u : T −→ Y such that u(t) ∈ A t, x(t), x (t) for almost all t ∈ T and
p t, x(t), x (t) = L t, x(t), u (t) a.e. on T, ⇒ J(x, u) ≤ m
and (x, u) ∈ αad ,
⇒ J(x, u) = m
(i.e., (x, u) is an optimal state–control pair).
Reviewing the proof of the above theorem, we see that hypothesis Hc was crucial in the argument because it provided the needed convex structure for the direct method of the calculus of variations to work. If Hc fails, we need not have a solution. Then, in order to capture the asymptotic behavior of the minimizing sequences, we
118
2 Extremal Problems and Optimal Control
embed the original problem in a “larger” system exhibiting the necessary convex structure that guarantees existence of optimal pairs. This method of augmenting the system is known as relaxation. There is no unique approach to relaxation. Nevertheless, we can agree that a reasonable relaxation method should meet the following three criteria. (i) Every original state is also a relaxed state. (ii) The set of original states is dense in the set of relaxed states. (iii) The relaxed problem admits a solution and the values of the two problems (relaxed and original) are equal. The first two requirements concern the dynamics of the system, and the third also involves the cost functional. The requirement that the values of the two problems are equal is known as relaxability. If relaxability fails, it can be said that the original system has been enlarged too much. A relaxation method that meets the three criteria is said to be admissible. Next we present some different ways to relax (convexify) the original problem (P). The first relaxation method has its roots in the following observation. We recall that a sequence {un }n≥1 ⊆ L1 (T, Rm ) that converges weakly but not strongly to a limit u oscillates violently around its limit. However, in the limit u a great deal of information about the faster and faster oscillations is forgotten and only an average value is registered in the limit. Clearly this is not satisfactory if the control enters in a nonlinear fashion in the dynamics. The idea then is to assign as a limit not a usual Rm -valued function, but a probability-valued function, known as a transition measure (or parametrized measure or Young measure). DEFINITION 2.6.12 Let (Ω, Σ, µ) be a finite measure space and Y a Polish 1 space. Let M+ (Y ) be the space of all probability measures on Y . A relaxed control 1 (or transition probability or Young measure) is a function λ : Ω −→ M+ (Y ), such that for all C ∈ B(Y ) = the Borel σ-field of Y, ω −→ λ(ω)(C) is Σ-measurable. We denote the set of all relaxed controls from Ω into Y by R(Ω, Y ). 1 REMARK 2.6.13 The “narrow topology” on M+ (Y ) is the weakest topology on 1 1 M+ (Y ) that makes continuous all the maps ψu : M+ (Y ) −→ R with u ∈ Cb (Y ) = {u : Y −→ R : u is continuous and bounded} defined by ψu (λ) = Y u(y)λ(dy). If 1 1 we topologize M+ (Y ) this way, then λ ∈ R(Ω, Y ) if and only if λ : Ω −→ M+ (Y ) is 1 Σ-measurable. So the narrow topology is the topology w M+ (Y ), Cb (Y ) ). We can define an analogue of it on the space R(Ω, Y ) of relaxed controls.
DEFINITION 2.6.14 Let Car(Ω × Y ) be the space of L1 -Carath´eodory integrands. Namely all functions ϕ : Ω × Y −→ R such that (i) For all y ∈ Y, ω −→ ϕ(ω, y) is Σ-measurable. (ii) For µ-almost all ω ∈ Ω, y −→ ϕ(ω, y) is continuous. (iii) There exists h ∈ L1 (Ω)+ such that for µ-almost all ω ∈ Ω and all y ∈ R, |ϕ(ω, y)| ≤ h(ω). The narrow topology on R(Ω, Y ) is the weakest topology on R(Ω, Y ) with respect to which the functionals
2.6 Optimal Control
119
λ −→ Iϕ (λ) =
ϕ(ω, y)λ(ω)(dy)dµ,
ϕ ∈ Car(Ω × Y )
Ω Y
are continuous.
So the narrow topology on R(Ω, Y ) is the topology w R(Ω, Y ), Car(Ω × Y ) . REMARK 2.6.15 We say that an integrand ϕ : Ω×Y −→R = R ∪ {+∞} is normal if (i) ϕ is Σ × B(Y )-measurable. (ii) For µ-almost all ω ∈ Ω, y −→ ϕ(ω, y) is proper and lower semicontinuous. A positive normal integrand can be approximated pointwise from below by integrands in Car(Ω × Y ). So equivalently the narrow topology on R(Ω, Y ) can be defined as the weakest topology that makes all functionals λ −→ Iϕ (λ) = ϕ(ω, y)λ(ω)(dy)dµ lower semicontinuous as ϕ ranges over all positive norΩ Y
∗ mal integrands. If Y is compact, then R(Ω, Y ) ⊆ L∞ Ω, M (Y )w∗ = L1 Ω, C(Y ) (here M (Y ) is the space of all Radon measures on Y ; recall that C(Y )∗ = M (Y )) and the narrow topology on R(Ω, Y ) coincides with the relative w∗ -topology in 1 L∞ Ω, M (Y )w∗ . If Σ is countably generated, then L Ω, C(Y ) is separable and so
on bounded subsets of L∞ Ω, M (Y )w∗ (such as R(Ω, Y )), the relative w∗ −topology is metrizable. In the sequel this is the context in which we use the narrow topology on R(Ω, Y ). In what follows for simplicity we set Y = Rm . m
H(U)1 : U : T −→ 2R \ {∅} is a multifunction with compact values such that Gr U ∈ B(T ) × B(Rm ) and there exists r > 0 such that for almost all t ∈ T and all u ∈ U (t), u ≤ r. We set B r = {u ∈ Rm : u ≤ r} and we introduce the constraint set for the relaxed controls, namely
1 Σ(t) = µ ∈ M+ (B r ) : µ U (t) = 1 .
Then given a state function x ∈ W 1,1 (0, b), RN ⊆ C(T, RN ), the set of admissible relaxed controls is given by SΣ = {λ ∈ R(T, B r ) : λ(t) ∈ Σ(t) a.e. on T }. Now we have all the necessary tools to introduce and study the first relaxation of problem (P) which is based on Young measures (see Definition 2.6.12):
b ⎧ ⎫ Jr1 (x, λ) = 0 B r L t, x(t), u λ(t)(du)dt −→ inf = m1r ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎬
. (2.85) s.t. x (t) = B r f t, x(t), u λ(t)(du)dt a.e. on T ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎭ x(0) = x0 , λ ∈ SΣ Note that in (2.85) in both the dynamics and the cost functional, the relaxed control enters linearly. This provides problem (2.85) with the necessary convex structure to obtain an optimal pair. For problem (2.85) every original control u(·) can be viewed as a relaxed control by considering the Dirac transition probability δu(·) . So the first requirement for admissibility is satisfied.
120
2 Extremal Problems and Optimal Control
PROPOSITION 2.6.16 If hypotheses H(f ), H(U )1 , and H(L) hold and m < +∞,
then problem (2.85) has an optimal pair (x, λ) ∈ W 1,1 (0, b), RN × SΣ ; that is, Jr1 (x, λ) = m1r . PROOF: Evidently −∞ < m1r ≤ m < ∞. Also let{(x n , λn )}n≥1 be a minimizing sequence for problem (2.85). Note that if F (t, x) = B r f (t, x, u)λ(t)(du) : λ∈SΣ , then F satisfies hypotheses H(F ). On the other hand by virtue of Remark 2.6.15,
SΣ ⊆ L∞ T, M (B r )w∗ is w∗ -compact. So we may assume that xn −→ x in C(T, RN ) w∗
λn −→ λ ∈ SΣ
in L
(see Theorem 2.6.10) and T, M (B r )w∗ .
∞
Also from the dynamics of (2.85) and hypothesis H(f )(iii), we see that xn −→ x w
in L1 (T, RN ).
We have
f t, xn (t), · − f t, x(t), · C(T,Rm ) = max f t, xn (t), u − f t, x(t), u u∈B r
=f t, xn (t), un − f t, x(t), un , with un ∈ B r . We may assume that un −→ u in Rm . Then by virtue of hypothesis H(f )(ii)
f t, xn (t), un − f t, x(t), un −→ 0 as n → ∞,
⇒ fn (t) = f t, xn (t), · −→ f t, x(t), · = f (t) in C(B r , Rm ) as n → ∞,
⇒ fn −→ f in L1 T, C(B r , Rm ) as n → ∞.
w∗ Because λn −→ λ in L∞ T, M (B r )w∗ , for every C ∈ B(T ) we have
f t, xn (t), u λn (t)(du)dt −→ f t, x(t), u λ(t)(du) and
C
x (t)dt C n
Br
−→
C
Br
x (t)dt. So in the limit we have
x (t)dt = f t, x(t), u λ(t)(du)dt for all C ∈ B, C C Br
⇒ x (t) = f t, x(t), u λ(t)(du) for almost all t ∈ T, x(0) = x0 .
C
Br
Therefore the pair (x, λ) is admissible for the relaxed problem (2.85).
Let Ln (t, u) = L t, xn (t), u . Because of hypotheses H(L), this is a normal integrand (see Remark 2.6.15) and so we can find Lkn (t, u), k ≥ 1, Carath´eodory integrands such that Lkn (t, u) ↑ Ln (t, u) and
for almost all t ∈ T, all u ∈ B r as k −→ ∞
α0 (t) − c0 xn (t) ≤ Lkn (t, u)
for almost all t ∈ T, all u ∈ B r .
w∗ Then because xn −→ x in C(T, RN ) and λn −→ λ in L∞ T, M (B r )w∗ , we have
2.6 Optimal Control
b
b
Lkn t, xn (t), u λn (t)(du)dt −→
Br
0
121
Lk t, x(t), u λ(t)(du)dt,
Br
0
for all k ≥ 1, as n → ∞. Also from the monotone convergence theorem, we have
b
b
Lk t, x(t), u λ(t)(du)dt ↑ L t, x(t), u λ(t)(du)dt 0
Br
Br
0
as k −→ ∞.
Therefore according to Proposition 1.5.25 we can find a sequence {k(n)}n≥1 such that k(n) −→ +∞ as n → ∞ and
b 0
Lk(n) t, xn (t), u λn (t)(du)dt −→
b
Br
0
L t, x(t), u λ(t)(du)dt.
Br
b b
Because 0 B r Lk(n) t, xn (t), u λn (t)(du)dt ≤ 0 L t, xn (t), u λn (t)(du)dt for all n ≥ 1, we have
b
b
L t, x(t), u λ(t)(du)dt ≤ lim inf L t, xn (t), u λn (t)(du)dt ⇒
Br 0 1 Jr (x, λ)
n→∞
≤ lim inf n→∞
Jr1 (xn , λn )
=
0
Br
m1r .
Because (x, λ) is admissible for the relaxed problem (2.85), we conclude that Jr1 (x, λ) = m1r . To establish the admissibility of the relaxed problem (2.85), we need to have that the original states are dense in the relaxed ones for the C(T, RN )-norm. This gives equality of the values of the original and relaxed problems. To do this we need one more result from multivalued analysis which we state here for convenience and postpone its proof until Section 6.4. PROPOSITION 2.6.17 If Ω, Σ, µ is a σ-finite nonatomic measure space, X is a separable Banach space, F : Ω −→ 2X \{∅} a multifunction such that Gr F ={(ω, x) ∈ Ω × X : x ∈ F (ω)} ∈ Σ × B(X), and for some 1 ≤ p ≤ ∞, SFp = {u ∈ Lp (Ω, X) : u(ω) ∈ F (ω) µ-a.e.} = ∅, then denoting by w (resp., w∗ ) the weak (resp., weak∗ ) topology on Lp (Ω, X), 1 ≤ p < ∞ (resp., on L∞ (Ω, X)) we have SFp
w
p = SconvF 1 ≤ p<∞
(resp., SFp
w∗
p = SconvF for p = +∞).
Here by convF , we denote the multifunction ω −→ conv F (ω) and p = u ∈ Lp (Ω, X) : u(ω) ∈ conv F (ω) µ-a.e. on Ω . SconvF With the help of this abstract result for multifunctions, we are able to show the density for the C(T, RN )-norm topology of the original states in the relaxed ones (second requirement for the admissibility of a relaxation method). For this we need to strengthen our hypotheses on f as follows.
122
2 Extremal Problems and Optimal Control
H(f )1 : f : T ×RN ×Rm −→ RN is a function, such that (i) For all (x, u) ∈ RN × Rm , t −→ f (t, x, u) is measurable. (ii) For almost all t ∈ T , all x, y ∈ RN , and all u ∈ U (t), we have f (t, x, u) − f (t, y, u) ≤ k(t)x − y
with k ∈ L1 (T )+
and for almost all t ∈ T and all x ∈ RN , u −→ f (t, x, u) is continuous. (iii) For almost all t ∈ T , all x ∈ RN and all u ∈ U (t) f (t, x, u) ≤ α(t) + c(t)x
with α, c ∈ L1 (T )+ .
1 we denote the set of admissible states for the original In what follows by αad 1 system (see problem (P)) and by αad,r we denote the set of admissible states for 1 the relaxed system (2.85).
1 PROPOSITION 2.6.18 If hypotheses H(f )1 , H(U )1 and H(L) hold, then αad = 1 N αad,r1 the closure taken in the C(T, R )-norm.
PROOF: For every (t, x) ∈ T × RN , let
F (t, x) = f t, x, U (t) and G(t, x) = f (t, x, u)µ(du) : µ ∈ Σ(t) . Br
We claim that conv F (t, x) = G(t, x). Note that f t, x, U (t) is compact and then so is conv F (t, x). Also it is clear that F (t, x) ⊆ G(t, x) and the latter is a 1 convex set. Moreover, M+ (B r ) furnished with the narrow topology (see Remark 2.6.13), which due to the compactness of B r coincides with the relative w∗ -topology on M (B r ) = space of Radon measures on B r (recall that C(B r )∗ = M (B r )), is compact metrizable. Let {vn }n≥1 ⊆ G(t, x) and suppose vn −→ v in RN . We have
f (t, x, u)µn (du) with µn ∈ Σ(t). vn = Br
We may assume that µn −→ µ narrowly (equivalently in the relative w∗ -topology) 1 in M+ (B r ). So from the portmanteau theorem we have
lim sup µn U (t) ≤ µ U (t) , n→∞
⇒ 1 ≤ µ U (t) .
1 (B r ), we have µ U (t) = 1 and so µ ∈ Σ(t). Also Because µ ∈ M+
f (t, x, u)µn (du) −→
vn = Br
f (t, x, u)µ(du) = v ∈ G(t, x). Br
Therefore G(t, x) is closed and so conv F (t, x) ⊆ G(t, x).
(2.86)
2.6 Optimal Control
123
On the other hand if v ∈ G(t, x), we have v = B r f (t, x, u)µ(du), µ ∈ Σ(t). We n n rk δuk with uk ∈ U (t) and {rk }n≥1 ⊆ [0, 1], rk = 1 such that can find sn = w∗
k=1
k=1
1 (B r ). Then sn −→ µ in M+
vn =
f (t, x, u)sn (du) = Br
and vn −→ v =
Br
n
rk f (t, x, uk ) ∈ conv F (t, x)
k=1
f (t, x, u)µ(du). So it follows conv F (t, x) = G(t, x)
(see (2.86)).
(2.87)
Let SconvF (x0 ) be the solution set of (2.82) when the multivalued right hand side is conv F and by SG (x0 ) the solution set of (2.82) when the multivalued right hand is G. Then from (2.87) it follows that SconvF (x0 ) = SG (x0 ).
(2.88)
Because of hypothesis H(f )(ii) for almost all t ∈ T , x −→ convF (t, x) is Lipchitz continuous for the Hausdorff metric. So invoking the relaxation theorem for differential inclusions (see Denkowski–Mig´ orski–Papageorgiou [195, p. 262], we know that SconvF (x0 ) = SF (x0 ) the closure taken in C(T, RN ). (2.89) 1 1 = SF (x0 ) and αad,r = SG (x0 ), from (2.88) and (2.89) we conclude Because αad 1 that 1 1 αad,r = αad the closure taken in C(T, RN ). 1
REMARK 2.6.19 It is well-known that the relaxation theorem for differential inclusions fails if the multivalued nonlinearity F (t, x) is only continuous in the Hausdorff metric in the x ∈ RN -variable. For this reason we had to strengthen the hypothesis on the vector field x −→ f (t, x, u) (see hypothesis H(f )1 (iii)). The final step in the analysis of the relaxed problem (2.85) is to show that m1r = m (relaxability). This implies that the relaxation method of problem (2.85) is in fact admissible. To achieve this, we strengthen our hypotheses on the cost integrand L as follows. H(L)1 : L : T ×RN ×Rm −→ R is an integrand such that (i) For all (x, u) ∈ RN × Rm , t −→ L(t, x, u) is measurable. (ii) For almost all t ∈ T, (x, u) −→ L(t, x, u) is continuous. (iii) For almost all t ∈ T , all x ∈ RN with x ≤ n, n ≥ 1, and all u ∈ U (t), |L(t, x, u)| ≤ ηn (t)
with ηn ∈ L1 (T )+ .
THEOREM 2.6.20 If hypotheses H(f )1 , H(U )1 , and H(L)1 hold, then m1r = m.
124
2 Extremal Problems and Optimal Control
PROOF: From Proposition 2.6.16 we know that we can find (x, λ) an admissible state-control pair for the relaxed problem (2.85) such that Jr1 (x, λ) = m1r . Also by virtue of Propositions 2.6.17 and 2.6.18 we can find (xn , un ) ∈ αad such that
w∗ xn −→ x in C(T, RN ) and δun −→ λ in L∞ T, M (B r )w∗ as n → ∞.
Set Ln (t)(u) = L t, xn (t), u and L(t)(u) = L t, x(t), u . Evidently Ln −→ L
in L1 T, C(B r ) . So if by ·, · we denote the duality brackets for the pair
L1 T, C(B r ) , L∞ T, M (B r )w∗ , we have
b
. / . /
L t, xn (t), un (t) dt = Ln , δun −→ L, λ
0
b
= 0
⇒m≤
L t, x(t), u λ(t)(du)dt = m1r ,
Br m1r .
Because the opposite inequality is clearly true, we conclude that m = m1r .
So we have established the admissibility of the relaxation method based on Young measures. The next relaxation method is suggested by the reduction method used in the proof of Theorem 2.6.11. We introduce the following relaxed problem: ⎧ ⎫
b Jr2 (x, x ) = 0 p∗∗ t, x(t), x (t) dt −→ inf = m2r ⎬ ⎨ . (2.90)
⎩ ⎭ s.t. x (t) ∈ convF t, x(t) a.e. on T, x(0) = x0 Here p∗∗ stands for the second conjugate (in the sense of Definition 1.2.15) of the
function v −→ p(t, x, v). Also as before F (t, x) = f t, x, U (t) . Note that problem (2.90) is control-free (a calculus of variations problem with multivalued dynamic 1 constraints). In what follows by αad,r we denote the set of admissible states for 2 this new problem (2.90). PROPOSITION 2.6.21 If hypotheses H(f )1 , H(U )1 , and H(L) hold, then 1 1 1 αad,r = αad,r , m1r = m2r , and there exists x ∈ αad,r such that 1 2 2 Jr2 (x, x ) = m2r . PROOF: From the proof of Proposition 2.6.18, we know that f (t, x, u)µ(du) : µ ∈ Σ(t) . conv F (t, x) = Br 1 1 From (2.91) it follows that αad,r = αad,r . 1 2 By definition we have
(2.91)
2.6 Optimal Control 125 p∗∗ (t, x, v) = inf η ∈ R : (v, η) ∈ conv epi p(t, x, ·) . (2.92)
As always inf ∅ = +∞. Set H(t, x) = epi p(t, x, ·) =
f (t, x, u), η : u ∈ U (t), L(t, x, u) ≤ η .
As in the proof of Proposition 2.6.18, we can show that conv H(t, x) = conv epi p(t, x, ·)
f (t, x, u)µ(du), η : µ ∈ Σ(t), = Br
L(t, x, u)µ(du) ≤ η .
Br
Hence we have p∗∗ (t, x, v) = inf
L(t, x, u)µ(du) : µ ∈ Σ(t), v =
Br
f (t, x, u)µ(du) . (2.93)
Br
1 (B r ) Because µ −→ B r L(t, x, u)µ(du) is narrowly lower semicontinuous and M+ is narrowly compact, from Theorem 2.1.10, we see that the infimum in (2.93) is attained and an application of Theorem 2.6.8 implies that the minimizing measure depends measurable on t ∈ T . Combining this fact with Proposition 2.6.16, we infer that m1r = m2r and for x the optimal state for problem (2.85), we also have
Jr2 (x, x ) = m2r . Combining Propositions 2.6.18 and 2.6.21, we see the following. 1 = PROPOSITION 2.6.22 If hypotheses H(f ), H(U )1 , and H(L) hold, then αad 1 N αad,r2 , the closure taken in the C(T, R ) norm.
Now we can establish the admissibility of the second relaxed problem (2.90). THEOREM 2.6.23 If hypotheses H(f ), H(U )1 , and H(L)1 hold, then m2r = m. PROOF: From Theorem 2.6.20 we know that m1r = m. On the other hand Proposition 2.6.21 says that m1r = m2r . So finally m2r = m. So the second relaxation method, which was motivated from the reduction method, is also admissible. The third relaxed problem has its roots in the well-known theorem of Carath´eodory for the representation of convex sets in RN . Let us recall that theorem. THEOREM 2.6.24 If C ⊆ RN is nonempty, then every point of the set conv C is a convex combination of no more than n + 1 distinct points of the set C. Motivated by this result, we introduce the following relaxed problem. In what +1 N +1 follows u = (uk )N k=1 and r = (rk )k=1 .
126
2 Extremal Problems and Optimal Control ⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩
Jr3 (x, u, r) =
s.t.
x (t) =
N +1
+1 b N 0
rk (t)L t, x(t), uk dt −→ inf = m3r
k=1
rk (t)f t, x(t), uk (t) a.e. on T, x(0) = x0
k=1
uk ∈ L1 (T, Rm ), u(t) ∈ U (t) a.e. on T rk : T −→ [0, 1] measurable
N +1 k=1
rk (t) = 1
for all t ∈ T
⎫ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎬ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎭
.
(2.94)
1 By αad,r we denote the admissible states for problem (2.94). From the proof of 3 Proposition 2.6.21 it is clear that we have the following. 1 PROPOSITION 2.6.25 If hypotheses H(f ), H(U )1 , and H(L) hold, then αad,r = 1 1 1 1 2 3 1 1 αad,r2 = αad,r3 , mr = mr = mr , αad = αad,r3 the closure taken in the C(T, RN )norm, and there exists an optimal triple (x, u∗ , r∗ ) = m3r from problem (2.94); that is, Jr3 (x, u∗ , r∗ ) = m3r .
Also we have the admissibility of the new relaxed problem (2.94). THEOREM 2.6.26 If hypotheses H(f ), H(U )1 , and H(L)1 hold, then m3r = m. Next we derive a fourth relaxation method based on semicontinuity techniques, which are different from the ones employed so far. The new approach uses the socalled multiple Γ-operators which are related to the Γ-convergence that we studied in Section 1.5. DEFINITION 2.6.27 Let X1 , X2 be two Hausdorff topological spaces, and let ϕ : X1 ×X2 −→ R = R ∪ {+∞} be a proper function and (x1 , x2 ) ∈ X1 × X2 . In what follows by Z(+) we denote the sup operator and by Z(−) the inf operator. Also for k = 1, 2, let Sk be the set of sequences in Xk that converge to xk and let βk be one of the signs + and −. We define Γseq (X1β1 , X2β2 )ϕ(x1 , x2 ) = Z(β1 ) Z(β2 ) Z(−β1 ) Z(β1 ) ϕ(x1n , x2n ). 2 (x1 n )∈S1 (xn )∈S2
m≥1
n≥m
REMARK 2.6.28 Here for simplicity we have restricted ourselves to the sequential definition of multiple Γ-operators. Evidently we can have a more general topological version of the above notion, as we did for the Γ-convergence in Section 1.5. EXAMPLE 2.6.29 By virtue of Definition 2.6.27, we have Γseq (X1− , X2− )ϕ(x1 , x2 ) = and
Γseq (X1− , X2+ )ϕ(x1 , x2 )
=
inf
inf lim inf ϕ(x1n , x2n )
inf
sup lim inf ϕ(x1n , x2n )
2 x1 n →x1 xn →x2
xn →x1 x2 →x 2 n
n→∞ n→∞
If the Γseq -limit is independent of the sign + or − associated with one of the spaces, then this sign is omitted. So if
2.6 Optimal Control
127
Γseq (X1− , X2+ )ϕ(x1 , x2 ) = Γseq (X1+ , X2+ )ϕ(x1 , x2 ) then we simply write Γseq (X1 , X2+ )ϕ(x1 , x2 ). Note that Γseq (X1β1 , X2β2 )ϕ(x1 , x2 ) = −Γseq (X1−β1 , X2−β2 )(−ϕ)(x1 , x2 ) and if both X1 , X2 are metrizable, then
Γseq (X1− , X2− )ϕ(x1 , x2 ) = Γseq (X1 × X2 )− ϕ(x1 , x2 ). Although in general multiple Γ−operators are not distributive with respect to addition, nevertheless we have some useful inequalities. PROPOSITION 2.6.30 If ϕ, ψ : X1 ×X2 −→ R = R ∪ {+∞} are proper functions, then Γseq (X1− , X2− )ϕ+Γseq (X1− , X2− )ψ ≤ Γseq (X1− , X2− )(ϕ + ψ)
≤ Γseq (X1− , X2+ )ϕ + Γseq (X1+ , X2− )ψ.
PROOF: It is a consequence of inf λi + inf µi ≤ inf (λi + µi ) ≤ inf λi + sup µi
i∈I
i∈I
i∈I
i∈I
i∈I
for all {λi }i∈I , {µi }i∈I ⊆ R.
COROLLARY 2.6.31 If for (x1 , x2 ) ∈ X1 × X2 , Γseq (X1− , X2 )ϕ(x1 , x2 ), and Γseq (X1 , X2− )ψ(x1 , x2 ) exist, then Γseq (X1 , X2− )(ϕ + ψ)(x1 , x2 ) = Γseq (X1− , X2 )ϕ(x1 , x2 ) + Γseq (X1 , X2− )ψ(x1 , x2 ). In abstract terms the idea of this new relaxation method is the following. Suppose X is the state space, Y the control space (both Hausdorff topological spaces), ϕ : X × Y −→ R is the cost functional, and C is the set of all admissible statecontrol pairs (in C we have incorporated all constraints of the problem dynamic and nondynamic). Then the optimal control problem is inf (ϕ + iC )(x, u) : (x, u) ∈ X × Y , (2.95) where iC is the indicator function of the set C; that is, 0 if (x, u) ∈ C . iC (x, u) = +∞ otherwise Then the relaxed problem corresponding to (2.95) is inf Γseq (X − , U − )(ϕ + iC )(x, u) : (x, u) ∈ X × Y ; in other words in the relaxed problem we minimize the Γseq -relaxation of the extended cost functional ϕ + iC . So our goal is to determine Γseq (X − , U − )(ϕ + iC ). We do this by employing the auxiliary variable method , which is outlined in the next proposition.
128
2 Extremal Problems and Optimal Control
PROPOSITION 2.6.32 If Z is a third Hausdorff topological space, ξ : X×Y −→ Z is a map that satisfies (C0 ) Every sequence {(xn , un )}n≥1 that is convergent in X × Y and {(ϕ + iC )(xn , un )}n≥1 is bounded, the sequence {ξ(xn , yn )}n≥1 has a convergent subsequence in Z,
(ϕ + iC )(x, u) if v = ξ(x, u) , (2.96) +∞ otherwise
then Γseq (X − , Y − )(ϕ + iC )(x, u) = inf Γseq X − , (Y × Z)− ψ(x, u, v) : v ∈ Z . and
ψ(x, u, v) =
PROOF: Let (x, u, v) ∈ X ×Y ×Z and suppose that xn → x in X, un −→ u in Y , and vn −→ v in Z. We have (see (2.96)), (ϕ + iC )(xn , un ) ≤ ψ(xn , un , vn ), n ≥ 1
− − − ⇒ Γseq (X , Y )(ϕ + iC )(x, u) ≤ Γseq X , (Y × Z)− ψ(x, u, v) for all v ∈ Z, ⇒ Γseq (X,− , Y − )(ϕ + iC )(x, u)
≤ inf Γseq X − , (X × Y )− ψ(x, u, v) : v ∈ Z .
(2.97)
On the other hand let (x, u) ∈ X × Y and consider xn −→ x in X and un −→ u in Y . Without any loss of generality we assume that lim (ϕ + iC )(xn , un ) exists and it n→∞
is finite. Then by virtue of condition (C0 ), vn = ξ(xn , un ) −→ v ∈ Z. Hence using Definition 2.6.27 we have
Γseq X − , (Y ×Z)− ψ(x, u, v) ≤ lim inf ψ(xn , un , vn ) = lim ϕ(xn , un ) n→∞ n→∞
⇒ inf Γseq X −, (Y ×Z)− ψ(x, u, v) : v ∈ Z ≤ Γseq (X −, Y − )(ϕ + iC )(x, u). (2.98)
From (2.97) and (2.98), we conclude that equality holds.
The next proposition is helpful in the evaluation of Γseq X −, (Y × Z)− ψ. PROPOSITION 2.6.33 If the cost functional ϕ satisfies ϕ(x, u) ≤ ϕ(x , u) + w(x, x )h(x , u)
for all x, x ∈ X, u ∈ Y
(2.99)
with w : X ×X −→ R+ and h : X ×Y −→ R+ satisfying xn −→ x
in X implies that w(x, xn ), w(xn , x) −→ 0
(2.100)
and {(xn , un )}n≥1 is convergent in X × Y and {ϕ(xn , un )}n≥1 is bounded imply {h(xn , un )}n≥1 is bounded, −
then for all (x, u) ∈ X ×Y , Γseq (X, Y ) ϕ(x, u) exists and Γseq (X, Y − ) ϕ(x, u) = Γseq (δX , Y − ) ϕ(x, u) with δX being the discrete topology on X.
(2.101)
2.6 Optimal Control
129
PROOF: Let (xn , un ) −→ (x, u) in X × Y . Without any loss of generality we assume that {ϕ(xn , un )}n≥1 converges to a finite limit. From (2.99) we have ϕ(x, un ) ≤ ϕ(xn , un ) + w(x, xn )h(xn , un ), n ≥ 1. Then by virtue of (2.100) and (2.101) we have lim inf ϕ(x, un ) ≤ lim inf ϕ(xn , un ), n→∞
n→∞
⇒ Γseq (X, Y − ) ϕ(x, u) ≤ Γseq (X, Y − ) ϕ(x, u).
(2.102)
On the other hand let (xn , un ) −→ (x, u) in X × Y , and without any loss of generality assume that {ϕ(x, un )}n≥1 converges to a finite limit. Then because of (2.99), we have ϕ(xn , un ) ≤ ϕ(x, un ) + w(xn , x)h(x, un ),
n ≥ 1.
Then by virtue of (2.100) and (2.101), we have lim inf ϕ(xn , un ) ≤ lim ϕ(x, un ), n→∞
⇒ Γseq (δX , Y − ) ϕ(x, u) ≤ Γseq (δX , Y − ) ϕ(x, u). From (2.102) and (2.103) we conclude that equality must hold.
(2.103)
Now we introduce the specific optimal problem that we study. The state
space X is W 1,1 (0, b), RN endowed with the C(T, RN )-norm topology (recall that
W 1,1 (0, b), RN is embedded continuously in C(T, RN )) and the control space Y is L1 (T, Rm ) endowed with the weak topology. The optimal control problem is the following. ⎫ ⎧ b
J(x, u) = 0 L t, x(t), u(t) dt −→ inf = m, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎬ ⎨
(2.104) s.t. x (t) = A t, x(t) + C t, x(t) g t, u(t) a.e. on T . ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎭ ⎩ x(0) = x0 , u ∈ L1 (T, Rm ), u(t) ∈ U (t) a.e. on T Here A : T ×RN −→ RN , C : T ×RN −→ RN ×k , and g : T × Rm −→ Rk are measurable m functions, U : T −→ 2R \ {∅} is the control constraint multifunction, and L : T × RN × Rm −→ R = R ∪ {+∞} is the cost integrand. We introduce the following hypotheses on the above data of problem (2.104). H(A): A : T × RN −→ RN is a function such that (i) For all x ∈ RN , t −→ A(t, x) is measurable. (ii) For every r > 0, there exists βr ∈ L1 (T )+ such that for almost all t ∈ T and all x, y ≤ r, we have A(t, x) − A(t, y) ≤ βr (t)x − y. (iii) For almost all t ∈ T and all x ∈ RN , we have A(t, x) ≤ α(t) + c(t)x
with α, c ∈ L1 (T )+ .
130
2 Extremal Problems and Optimal Control
H(C): C : T ×RN −→ RN ×k is a function such that (i) For all x ∈ RN , t −→ C(t, x) is measurable. (ii) For every r > 0, there exists γr ∈ L∞ (T )+ such that for almost all t ∈ T and all x, y ≤ r, we have C(t, x) − C(t, y) ≤ γr (t)x − y. (iii) For almost all t ∈ T and all x ∈ RN , we have C(t, x) ≤ α1 (t) + c1 (t)x
with α1 , c1 ∈ L∞ (T )+ .
H(g): g : T ×Rm −→ Rk is a Borel function. m
H(U)2 : U : T −→ 2R \ {∅} is a multifunction with compact values, such that Gr m 1 m U = {(t, u) ∈ T ×Rm : u ∈ U (t)} ∈ B(T
)×B(R ) and there exists u ∈ L (T, R ) such that u(t) ∈ U (t) a.e. on T and g ·, u(·) is integrable on T . H(L)2 : L : T ×RN ×Rm −→ R+ = R+ ∪ {+∞} is an integrand such that (i) (t, x, y) −→ L(t, x, y) is measurable. (ii) For every r > 0, almost all t ∈ T , all x, y ≤ r and all u ∈ U (t), we have L(t, x, u) ≤ L(t, y, u) + ϑr (t, x − y) + σr (x − y)L(t, y, u), where ϑr : T ×R+ −→ R+ and σr : R+ −→ R+ are such that ϑr (·, λ) is integrable and ϑr (t, ·), σr (·) are continuous increasing with ϑr (t, 0) = σr (0) = 0. (iii) For almost all t ∈ T and all u ∈ U (t) ν(g(t, x) + u) − ζ(t) ≤ L(t, 0, u) with ν : R+ −→ R+ increasing convex function satisfying lim
r→+∞
ν(τ ) τ =
+∞ and ζ ∈ L1 (T )+ .
(iv) There exist u0 ∈ L1 (T, Rm ) such that t −→ L t, 0, u0 (t) is integrable on T. We implement of the auxiliary variable, outlined earlier. Let S =
the method {(x, u) ∈ W 1,1 (0, b), RN × L1 (T, Rm ) : (x, u) is an admissible pair for problem (2.104)} and set ϕ = J + iS . The next lemma shows that the method of auxiliary variable indeed can be used within the context of problem (2.104). LEMMA 2.6.34 If Z = L1 (T, Rk ) furnished with the weak topology and ξ : X × Y −→ Z is defined by
g t, u(t) if g ·, u(·) is integrable, ξ(x, u)(t) = , 0 otherwise then ϕ satisfies the compactness condition (C0 ) of Proposition 2.6.32.
2.6 Optimal Control
131
PROOF: Suppose (xn , un ) −→ (x, u) in X × Y and assume that |ϕ(xn , un )| ≤ M for some M > 0, all n ≥ 1. Let vn (t) = g t, un (t) . If εn = xn − xC(T,RN )
and r = sup xn C(T,RN ) (recall that on the state space X = W 1,1 (0, b), RN we n≥1
consider the C(T, RN )-norm topology), using hypothesis H(L), we have
b
b
L t, 0, un (t) dt ψ(vn (t)) − ζ(t) dt ≤ 0
0
b
≤ L t, xn (t), un (t) dt + ϑr (t, εn ) dt 0
+
b
σr (εn ) L t, xn (t), un (t) dt
0
≤ M1 1 + σr (εn ) +
b
ϑr (t, εn )dt
(2.105)
0
for some M1 > 0, all n ≥ 1. Then from (2.105), hypothesis H(L)2 (iii), and the De La Vall´ee–Poussin theorem, we conclude that {vn }n≥1 ⊆ L1 (T, Rk ) has a weakly convergent subsequence. This lemma permits the use of Proposition 2.6.32, which says the relaxation of the extended cost functional ϕ = J + iS is reduced to the relaxation in X × (Y × Z) of ψ(x, u, v) = J(x, u, v) + iS (x, u, v), b
where J(x, u, v) = 0 L t, x(t), u(t), v(t) dt with L(t, x, u, v) = L(t, x, u)+i{v=g(t,u)}
and S = (x, u, v) ∈ X × Y × Z : x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T, x(0) = x0 .
We compute separately Γseq X, (Y × Z)− J(x, u, v) and Γseq(X − , Y × Z)iC (x, u, v) and then use Corollary 2.6.31 to have Γseq X − , (Y × Z)− ψ(x, u, v), which is what we want. PROPOSITION 2.6.35 For all (x, u, v) ∈ X ×Y ×Z we have
b
L∗∗ t, x(t), u(t), v(t) dt, Γseq X, (Y × Z)− J(x, u, v) = 0
where L∗∗ (t, x, u, v) denotes the second conjugate of L(t, x, ·, ·). PROOF: From hypothesis H(L)2 (ii) we have
b ϑr (t, x − yC(T,RN ) )dt J(x, u, v) ≤ J(y, u, v) + 0
+ σr (x − yC(T,RN ) )J(y, u, v) for all x, y ∈ X, with xC(T,RN ) , yC(T,RN ) ≤ r, r > 0 and all u ∈ Y, v ∈ Z. Set
w(x, y) = 0
and
b
ϑr (t, x − yC(T,RN ) )dt + σr (x − yC(T,RN ) )
h(x, u, v) = 1 + J(x, u, v).
132
2 Extremal Problems and Optimal Control
We can apply Proposition 2.6.33 and have
Γseq X, (Y × Z)− J(x, u, v) = Γseq δX , (Y × Z)− J(x, u, v).
b But Γseq δX , (Y × Z)− J(x, u, v) = 0 L∗∗ t, x(t), u(t), v(t) dt (see Buttazzo [126, p. 74], and Denkowski–Mig´ orski–Papageorgiou [195, p. 588]). PROPOSITION 2.6.36 For all (x, u, v) ∈ X × Y × Z we have Γseq (X − , Y × Z)χS (x, u, v) = χS (x, u, v). PROOF: We need to show that if (xn , un , vn ) −→ (x, u, v) in X × Y × Z and (xn , un , vn ) ∈ S for all n ≥ 1, then (x, u, v) ∈ S
(2.106)
and if (x, u, v) ∈ S and (un , vn ) −→ (u, v) in Y × Z, then there exists xn −→ x in X with (xn , un , vn ) ∈ S, n ≥ 1. (2.107) First we prove (2.106). We have
* ) xn (t) = A t, xn (t) + C t, xn (t) vn (t) a.e. on T . xn (0) = x0 w
By virtue of hypotheses H(A) and H(C) and because vn −→ v in L1 (T, Rk ) (recall that Z = L1 (T, Rk ) is furnished with the weak topology), we have
w A ·, xn (·) + C ·, xn (·) vn (·) −→ A ·, x(·) + C ·, x(·) v(·) in L1 (T, RN ).
Because xn −→ x in C(T, RN ) (recall that X = W 1,1 (0, b), RN is furnished with the C(T, RN )-norm topology), in the limit as n → ∞, we have
* ) x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T . (2.108) x(0) = x0 So we have proved (2.106). Next let (x, u, v) ∈ X × Y × Z be such that it satisfies (2.108) and suppose
(un , vn ) −→ (u, v) in Y × Z. For every n ≥ 1, let xn ∈ W 1,1 (0, b), RN be the unique solution of the Cauchy problem
) * xn (t) = A t, xn (t) + C t, xn (t) vn (t) a.e. on T . (2.109) xn (0) = x0 Because sup vn L1 (T,Rk ) ≤ M2 < +∞, then from (2.109), after integration over [0, t] n≥1
and using hypothesis H(A)(iii), H(C)(iii), and Gronwall’s inequality, we obtain r > 0 such that sup vn L1 (T,Rk ) ≤ r. From the Arzela–Ascoli theorem it follows that n≥1
{xn }n≥1 ⊆ C(T, RN ) is relatively compact and so we may assume that xn −→ x in C(T, RN ). In the limit we have
)
2.6 Optimal Control
* x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T . x(0) = x0
133
Given the uniqueness of the solution of (2.108), it follows that x = x and so we have proved (2.107) and also the proposition. Combining Propositions 2.6.35 and 2.6.36 with Corollary 2.6.31, we have b
L∗∗ t, x(t), u(t), v(t) dt : v ∈ Z, Γseq (X − , Y − )ϕ(x, u) = inf 0
x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T, x(0) = x0 . The final step in the derivation of the relaxed problem, is to eliminate the variable v ∈ Z from the right-hand side expression. To this end let G(t, u) = v ∈ Rk : (u, v) ∈ conv{(u , v ) ∈ Rm × Rk : v = g(t, u )} L(t, x, u, w) = inf L∗∗ (t, x, u, v) : w = A(t, x) + C(t, x)v
S = (x, u) ∈ X × Y : x (t) ∈ A t, x(t) + C t, x(t) G t, u(t) a.e. on T, x(0) = x0 . THEOREM 2.6.37 If hypotheses H(A), H(C), H(g), H(U )2 , and H(L)2 hold, then b
Γseq (X − , Y − ) ϕ(x, u) = 0 L t, x(t), u(t), x (t) dt + iS (x, u). PROOF: Note that v ∈ G(t, u), when L∗∗ (t, x, u, v) < +∞. Therefore
L∗∗ t, x(t), u(t), x (t) dt : v ∈ Z, x (t) 0
= A t, x(t) + C t, x(t) v(t) a.e. on T + iS (x, u).
Note that if x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T , we have
L t, x(t), u(t), x (t) ≤ L∗∗ t, x(t), u(t), v(t) a.e. on T,
b
L t, x(t), u(t), x (t) dt + iS (x, u) ≤ Γseq (X − , Y − ) ϕ(x, u). ⇒ Γseq (X − , Y − ) ϕ(x, u) = inf
b
0
(2.110) b
On the other hand if (x, u) ∈ S and 0 L t, x(t), u(t), x (t) dt < +∞, then
L t, x(t), u(t), x (t) < +∞ a.e. on T and so we must have
L t, x(t), u(t), x (t) = L∗∗ t, x(t), u(t), v(t) a.e. on T,
with v ∈ L1 (T, Rk ) such that x (t) = A t, x(t) + C t, x(t) v(t) a.e. on T (see Theorem 2.6.8 and hypothesis H(L)2 ). It follows that
b
Γseq (X − , Y − ) ϕ(x, u) ≤ L∗∗ t, x(t), u(t), v(t) dt + iS (x, u) 0
=
b
L t, x(t), u(t), x (t) dt + iS (x, u).
0
(2.111)
134
2 Extremal Problems and Optimal Control Comparing (2.110) and (2.111), we obtain the desired equality. So the new relaxed problem corresponding to (2.104) is:
b
inf L t, x(t), u(t), x (t) dt + iS (x, u) : (x, u) ∈ X × Y = m4r .
(2.112)
0
Then by virtue of Theorem 2.1.20, we have the following. THEOREM 2.6.38 If hypotheses H(A), H(L), H(g), H(U )2 , and H(L)2 hold, then problem (2.112) has a solution (x, u) ∈ X × Y and m = m4r . To relate problem (2.112) with the relaxed problems introduced earlier we need to strengthen the hypotheses on g and U . H(g)1 : g : T × Rm −→ Rk is a Carath´eodory function (i.e., measurable in t ∈ T and continuous in x ∈ Rm ) and for almost all t ∈ T and all u ≤ r, we have g(t, u) ≤ p(t) with p ∈ L1 (T )+ (here r > 0 as in H(U )1 ). For the control constraint set U we assume hypothesis H(U )1 . Then from (2.85) with f (t, x, u) = A(t, x) + C(t, x)g(t, u), we have the following alternative relaxation of (2.104).
b ⎧ ⎫ Jr1 (x, λ) = 0 B r L t, x(t), u(t) λ(t)(du)dt −→ inf = m1r ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎬
s.t. x (t) = A t, x(t) + C t, x(t) B r g(t, u)λ(t)(du) a.e. on T . (2.113) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎭ x(0) = x0 , λ ∈ SΣ DEFINITION 2.6.39 Given an admissible original control u(·), the barycenter of u(·) is defined by
Bar(u) = λ ∈ SΣ : u(t) = uλ(t)(du) a.e. on T . Br
In what follows, for convenience we set ϕ(x, u) = Γseq (X − , Y − ) ϕ(x, u).
(2.114)
PROPOSITION 2.6.40 If hypotheses H(A), H(C), H(g), H(U )1 , and H(L)2 hold, then ϕ(x, u) = min Jr1 (x, λ) : (x, λ) is admissible for (2.113) and λ ∈ Bar(u) . PROOF: Let (x, λ) be an admissible state-control pair for problem (2.113) with λ ∈ Bar(u). Then by virtue of Proposition 2.6.17 we can find original
controls {un }n≥1 such that δun −→ λ narrowly in R(T, B r ). Let xn ∈ W 1,1 (0, b), RN be the unique state produced by control un . From Theorem 2.6.10 we know that {xn }n≥1 ⊆ C(T, RN ) is relatively compact and so we may assume that xn −→ x w in C(T, RN ). Also because of hypothesis H(U )1 , we may assume that un −→ u in 1 m L (T, R ). For all D ∈ B(T ) we have
un (t)dt −→ u(t)dt. (2.115) D
D
2.6 Optimal Control Moreover, because δun −→ λ narrowly and λ ∈ Bar u, we have
un (t)dt = uδun (t) (du)dt −→ uλ(t)(du)dt = u(t)dt. D
D Br
D Br
135
(2.116)
D
From (2.115), (2.116), and because D ∈ B(T ) was arbitrary, we infer that u = u w and so un −→ u in L1 (T, Rm ). Because (xn , un ) is admissible for the original optimal control problem, we have ϕ(xn , un ) = J(xn , un ), ⇒ lim inf ϕ(xn , un ) = lim inf J(xn , un ) = lim inf Jr1 (xn , δun ) = Jr1 (x, λ) n→∞
n→∞
n→∞
(see H(L)2 ),
⇒ ϕ(x, u) ≤ inf Jr1 (x, λ) : (x, λ) is admissible for (2.113), λ ∈ Bar u . (2.117)
On the other hand, if ϕ(x, u) < +∞, from the definition of ϕ (see (2.114)), given w ε > 0, we can find xn −→ x in C(T, RN ) and un −→ u in L1 (T, Rm ) such that lim inf J(xn , un ) ≤ ϕ(x, u) + ε.
(2.118)
n→∞
w∗ By Alaoglu’s theorem we may also assume that δun−→λ in L∞ T, M (B r )w∗ (hence narrowly in R(T, B r )). Then for all D ∈ B(T ) we have
un (t)dt = uδun (t) (du)dt −→ uλ(t)(du)dt D
D
Br
Br
(2.119)
un (t)dt −→
and
D
D
u(t)dt.
(2.120)
D
From (2.119), (2.120), and because D ∈ B(T ) was arbitrary it follows that u(t) = B r uλ(t)(du) a.e. on T (i.e., λ ∈ Bar u). Also because of hypotheses H(L)2 we have J(xn , un ) −→ Jr1 (x, λ). Therefore Jr1 (x, λ) ≤ ϕ(x, u) + ε Let ε ↓ 0, to conclude that
(see (2.118)).
Jr1 (x, λ) ≤ ϕ(x, u).
Comparing (2.117) and (2.121), we obtain the desired equation.
(2.121)
This proposition leads at once to the equivalence of relaxed problems (2.112) and (2.113). THEOREM 2.6.41 If hypotheses H(A), H(C), H(g), H(U )1 , and H(L)2 hold, then m = m1r = m4r and problems (2.112) and (2.113) are equivalent; that is, if (x, u) solves (2.112), then we can find λ ∈ Bar(u) such that (x, λ) solves (2.113) and conversely if (x, λ) solves (2.113) we can find an admissible original control u such that (x, u) solves (2.112). Both problems have optimal pairs.
136
2 Extremal Problems and Optimal Control
REMARK 2.6.42 Evidently (2.112) is also equivalent to the other two relaxed problems defined in (2.90) and (2.94) with f (t, x, u) = A(t, x) + C(t, x)g(t, u). In the last part of this section we derive the Pontryagin maximum principle for the following optimal control problem. ⎧ ⎫ b
J(x, u) = 0 L t, x(t), u(t) dt −→ inf = m ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎬
(2.122) a.e. on T = [0, b] . s.t. x (t) = f t, x(t), u(t) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪
⎩ ⎭ x(0), x(b) ∈ C, u ∈ Uad Here C ⊆ RN ×RN is nonempty and Uad = {u : T −→ Y measurable}, where Y is a separable metric space (the control space). Note the general endpoint constraints that include the periodic problem as a special case. Our derivation of the Pontryagin maximum principle is based on the Ekeland variational principle (see Theorem 2.4.1). The precise mathematical setting for problem (2.122) is the following. The state space is X = RN and the control space Y is a separable metric space. We impose the following conditions on the data of (2.122). H(f )2 : f : T ×RN ×Y is a function such that (i) For all (x, u) ∈ RN × Y, t −→ f (t, x, u) is measurable. (ii) For all t ∈ T and all u ∈ Y, x −→ f (t, x, u) is differentiable and (x, u) −→ fx (t, x, u) is continuous;. (iii) There exists M > 0 such that f (t, 0, u), fx (t, x, u) ≤ M for all (t, x, u) ∈ T × RN × Y . H(C) : C ⊆ RN ×RN is nonempty, closed, and convex. H(L)3 : L : T ×RN × Y −→ R is a function such that (i) For all (x, u) ∈ RN ×Y, t −→ L(t, x, u) is measurable. (ii) For all t ∈ T and all u ∈ Y, x −→ L(t, x, u) is differentiable and (x, u) −→ Lx (t, x, u) is continuous. (iii) There exists M1 > 0 such that L(t, 0, u), Lx (t, x, u) ≤ M1 for all (t, x, u) ∈ T × RN × Y . The control space Y is only a separable metric space and does not have a linear structure, therefore we cannot talk about convexity. For this reason the variations of the control function u(·) are of special type and are called spike perturbations of u. For this reason we need to use measure-theoretic arguments when dealing with such perturbations. In particular we need to know how we can approximate the constant function 1 by an oscillatory function of the form (1/λ) χC , λ ∈ (0, 1) (here χC denotes the characteristic function of C). LEMMA 2.6.43 If (Ω, Σ, µ) is a finite nonatomic measure space, X is a separable Banach space, and for any λ ∈ (0, 1), Sλ = C ∈ Σ : µ(C) = λµ(Ω) , then for any g ∈ L1 (Ω, X) we have
1 inf (2.123) χC (ω) − 1 g(ω)dµ = 0. C∈Sλ λ Ω
2.6 Optimal Control n
PROOF: Given ε > 0, we can find a simple function sε (ω) =
k=1
Ak ∈ Σ, xk ∈ X such that
χAk (ω)xk with
g − sε L1 (Ω,X) < ε. Due to the nonatomicity of µ, for each k ∈ {1, . . . , n} we can find that µ(Aλk ) = λµ(Ak ). Let Aλ =
n
137
(2.124) Aλk
⊆ Ak such
Aλk ∈ Sλ . Then
k=1
n 1 1 χAλ (ω) − 1 sε (ω)dµ = µ(Aλk ) − µ(Ak ) xk = 0, λ Ω λ k=1
1 1 ⇒ χAλ (ω) − 1 g(ω)dµX ≤ χAλ (ω) − 1 sε (ω)dµX + λ λ Ω Ω 1 + 1+ sε − gL1 (Ω,X) λ 1 ≤ 1+ ε (see (2.124)). λ Because ε > 0, was arbitrary, we conclude that (2.123) holds.
In dealing with the endpoint constraints, we use the distance function from the set C. Recall that if Z is a Banach space and D is a nonempty, closed, and convex subset of Z, the distance function from D defined by dD (z) = inf{z − uZ : u ∈ D} is both convex and Lipschitz continuous with Lipschitz constant 1. So the convex subdifferential (see Definition 1.2.28) and the generalized subdifferential (see Definition 1.3.5) coincide and we have ∂dD (z) = {z ∗ ∈ Z ∗ : z ∗ , z − z Z ≤ dD (z ) − dD (z) for all z ∈ Z} = {z ∗ ∈ Z ∗ : z ∗ , hZ ≤ dD0 (z; h)
for all h ∈ Z}. (2.125)
LEMMA 2.6.44 If Z is a Banach space and D ⊆ Z is nonempty, closed, and convex, then for any z ∈ / D and any z ∗ ∈ ∂dD (z) we have z ∗ Z ∗ = 1. PROOF: Because dD (·) is globally Lipschitz with constant 1, we have z ∗ Z ∗ ≤ 1. Because z ∈ / D, given any 0 < δ < 1, we can find uδ ∈ D such that 0 < (1 − δ)z − uδ Z ≤ dD (z). So by virtue of (2.125) we have z ∗ , uδ − zZ ≤ −dD (z) ⇒ (1 − δ)z − uδ Z ≤ dD (z) ≤ − z ∗ , uδ − zZ ≤ z ∗ Z ∗ uδ − zZ ⇒ (1 − δ) ≤ z ∗ Z ∗ . Because δ ∈ (0, 1) was arbitrary, we let δ ↓ 0, to conclude that z ∗ Z ∗ = 1.
138
2 Extremal Problems and Optimal Control
COROLLARY 2.6.45 If Z is a Banach space with a strictly convex dual Z ∗ and D ⊆ Z is a nonempty, closed, and convex set, then for any z ∈ / D and the set ∂dD (z) is a singleton. As we already mentioned, the derivation of the Pontryagin maximum principle is based on the Ekeland variational principle. We need to endow the set of admissible controls Uad with a metric structure. So for u, w ∈ Uad we define d(u, w) = {t ∈ T : u(t) = w(t)}1 . Recall that by | · |1 we denote the one-dimensional Lebesgue measure. It is easy to check that d is a metric and so (Uad , d) is a metric space. LEMMA 2.6.46 (Uad , d) is a complete metric space. PROOF: Let {un }n≥1 ⊆ Uad be a d-Cauchy sequence. By passing to a suitable subsequence, we may assume that d(un+1 , un ) ≤
1 , 2n
n ≥ 1.
Set Dnm = {t ∈ T : un (t) = um (t)}, n, m ≥ 1, and Ck = Clearly Ck k≥1 is decreasing and we have |Ck |1 ≤
1 1 = k−1 , 2n 2
Dn(n+1) , k ≥ 1.
n≥k
k ≥ 1.
n≥k
c Ck 1 = b. We set u(t) = uk (t) if t ∈ Ckc , k ≥ 1. From the definition It follows that k≥1
of Ck it is clear that u is well-defined and u ∈ Uad . Moreover, d(uk , u) ≤ |Ck |1 −→ 0
as k −→ ∞.
REMARK 2.6.47 A careful reading of the above proof reveals that the lemma remains valid if Y is only a measurable space without any metric structure. In addition T can be replaced by any nonatomic measure space. Now we are ready to state and prove the Pontryagin maximum principle for problem (2.122). THEOREM 2.6.48 If hypotheses H(f )2 , H(C) and H(L)3 hold and (x, u) is an optimal pair for problem (2.122), then there exists (ψ0 , ψ) ∈ R × W 1,1 (0, b), RN , (ψ0 , ψ) = (0, 0) such that (a) ψ0 ≤ 0.
∗
(b) ψ (t) = fx t, x(t), u(t) ψ(t) + ψ0 Lx t, x(t), u(t) a.e. on T .
2.6 Optimal Control (c)
139
ψ(0), x0 − x(0) RN ≤ ψ(b), x1 − x(b) RN for all (x0 , x1 ) ∈ C.
(d) If H(t, x, u, v0 , v)= v, f (t, x, u) RN +v0 L(t, x, u) for all (t, x, u, v0 , v) ∈ T × RN ×
Y ×R×RN , then H t, x(t), u(t), ψ0 , ψ(t) = max H t, x(t), u, ψ0 , ψ(t) for almost all t ∈ T .
u∈Y
PROOF: First let us briefly outline the strategy of the proof. It is divided into a series of steps. (i) We introduce a penalty function that is used to define an approximation of problem (2.122) (called the approximate problem) which has no end point constraint. (ii) We use the Ekeland variational principle to produce an optimal pair for the approximate problem. This pair is actually close to the original optimal pair. (iii) We derive necessary conditions for the approximate optimal pair. (iv) We pass to the limit to obtain necessary conditions for the original optimal pair. On RN × Uad we introduce the metric
d0 (x, u), (z, w) = x − zRN + d(u, w). Clearly (RN × Uad , d0 ) is a complete metric space (see Lemma 2.6.46). In what follows by x(x0 , u)(·) ∈ W 1,1 (0, b), RN we denote the unique state generated by u ∈ Uad and the initial condition x0 ∈ RN . Then we can write the cost functional J(x0 , u). If x(0) = x0 , then by translating the cost functional if necessary (i.e., considering the cost functional J(x0 , u)−J(x0 , u)), we may assume without any loss of generality that J(x0 , u) = 0. We start executing the four steps of the method of proof outlined above. Step i: We introduce the penalty function. For ε > 0 and (x0 , u) ∈ RN × Uad , we set
+ 1/2 . Jε (x0 , u) = d2C x0 , x(x0 , u)(b) + [ J(x0 , u) + ε ]2
(2.126)
Because of hypotheses H(f )2 and H(L)3 and Gronwall’s inequality for all (x0 , u), (x0 , u0 ) ∈ RN × Uad we have
and
x(x0 , u)(·) − x(x0 , u0 )(·)RN ≤ c(1 + max{x0 , x0 }) ×
(2.127) d0 (x0 , u), (x0 , u0 )
|J(x0 , u) − J(x0 , u0 )| ≤ c 1 + max{x0 , x0 }d0 (x0 , u), (x0 , u0 ) , for some c > 0.
(2.128)
From (2.126) through (2.128) it follows that Jε (·, ·) is continuous on (RN × Uad , d0 ). Step ii: Solution (via the Ekeland variational principle) of the problem with the same dynamics and cost functional a new expression involving Jε . From (2.126) we see that Jε ≥ 0 and Jε (x0 , u) = ε ≤ inf Jε (x0 , u) + ε (recall RN ×Uad
that x(0) = x0 and that we have assumed that J(x0 , u) = 0). So we can apply Corollary 2.4.3 and obtain (xε0 , uε ) ∈ RN × Uad such that
140
2 Extremal Problems and Optimal Control √
d0 (xε0 , uε ), (x0 , u) ≤ ε
√ Jε (xε0 , uε ) ≤ Jε (x0 , u) + ε d0 (x0 , u), (xε0 , uε )
and
for all (x0 , u) ∈ RN × Uad .
(2.129)
(2.130)
So if xε = x(xε0 , uε ), then (xε , uε ) is a solution of the approximate optimal control problem with the same dynamics and cost functional as the right-hand side of (2.130). Note that because of (2.129) the optimal pair of the approximate problem is located close to the optimal pair of the original problem. Step iii: We derive necessary conditions for the optimal pair (xε , uε ) of the approximate problem. Fix (e0 , w) ∈ B 1 × Uad (B 1 = {e0 ∈ RN : e0 RN ≤ 1}) and λ ∈ (0, 1] and set
L s, xε (s), w(s) − L s, xε (s), uε (s)
. g(s) = f s, xε (s), w(s) − f s, xε (s), uε (s) From Lemma 2.6.43 we know that given any δ > 0. we can find Cλ ⊆ T with |Cλ |1 = λb such that
b
g(s)ds − 0
1 λ
b 0
χCλ (s)g(s)dsRN ≤ δ.
(2.131)
We introduce the following spike perturbation of the optimal control uε (·), uε (t) if t ∈ T \ Cλ uελ (t) = . (2.132) w(t) if t ∈ Cλ Clearly uελ ∈ Uad and d(uελ ,uε ) ≤ |Cλ |1 = λb. Set xελ = x(xε0 + λe0 , uελ ). We define (z ε , z0ε ) ∈ W 1,1 (0, b), RN × R as follows
(z ε ) (t) = fx t, xε (t), uε (t) z ε (t) + f t, xε (t), w(t) − f t, xε (t), uε (t) a.e. on T, z ε (0) = e0 and b
Lx t, xε (t), uε (t) z ε (t)dt z0ε =
0
+
b
L t, xε (t), w(t) − L t, xε (t), uε (t) dt.
0
Exploiting hypotheses H(f )2 and H(L)3 , via Gronwall’s inequality, we can easily check that
and
xελ − xε − λz ε C(T,RN ) = o(λ)
(2.133)
J(xε0 + λe0 , uελ ) − J(xε0 , uε ) − λz0ε ) = o(λ),
(2.134)
where o(λ) λ −→ 0 as λ ↓ 0. Recalling that Jε is continuous on (RN × Uad , d0 ), from (2.133) and (2.133) and the definition of Jε , we see that Jε (xε0 + λe0 , uελ ) = Jε (xε0 , uε ) + ε(λ)
with ε(λ) ↓ 0.
(2.135)
2.6 Optimal Control
141
From (2.130) we have √ 1
ε(e0 RN + b) ≤ Jε (xε0 + λe0 , uελ ) − Jε (xε0 , uε ) λ 1
1 ε ε 2 ε ε 2 (x + λe , u ) − J (x , u ) = J ε 0 ε 0 λ 0 ε ε ε Jε (x0 + λe0 , uλ ) + Jε (x0 , uε ) λ −
=
1 2Jε (xε0 , uε )
d2C (xε0 + λe0 , xελ (b) − d2C (xε0 , xε (b)
1
+ ε(λ) λ
+ 2
+ 2 1 ε + − J(xε0 , uε ) + ε J x0 + λe0 , uε ) + ε (see (2.133)). λ (2.136) We know that d2C ∈ C 1 (RN × RN ) and 2 dC (x0 , x1 ) ∂dC (x0 , x1 ) 2 ∇dC (x0 , x1 ) = 0
/C if (x0 , x1 ) ∈ . if (x0 , x1 ) ∈ C
Recall that by virtue of Corollary 2.6.45 ∂dC (x0 , x1 ) = (α0 , α1 ) ∈ RN × RN and α0 2RN + α1 2RN = 1 (see Lemma 2.6.44), when (x0 , x1 ) ∈ / C. On the other hand because dC is Lipschitz continuous with Lipschitz constant 1, we always have ∂dC (x0 , x1 ) ⊆ B 1 = {(x, z) ∈ RN × RN : x2RN + z2RN ≤ 1}. So we can write with no ambiguity that ∇d2C (x0 , x1 ) = 2 dC (x0 , x1 )(α0 , α1 ) α0 2RN
+
α1 2RN
with (α0 , α1 ) ∈ ∂dC (x0 , x1 ),
= 1.
Because of (2.133), we have
d2C xε0 + λe0 , xελ (b) − d2C xε0 , xε (b) lim λ↓0 λ
ε ε ε = 2 dC x0 , x (b) (α0 , e0 )RN + α1ε , z ε (b) RN where
(α0ε , α1ε ) ∈ ∂dC xε0 , xε (b) and α0ε 2RN + α1ε 2RN = 1.
Also because of (2.133) we have + 2
+ 2 1 ε − J(xε0 , uε ) + ε J x0 + λe0 , uε ) + ε lim λ↓0 λ +
= 2 J(xε0 , uε ) + ε z0ε .
(2.137)
(2.138)
Returning to (2.136), passing to the limit as λ ↓ 0 and using (2.137) and (2.138) we obtain
√ (2.139) − ε(e0 RN + b) ≤ (ϕε , e0 )RN + ψε , zε (b) RN + ψ0ε z0ε
dC xε ,xε (b) dC xε ,xε (b) where ϕε = Jε (x0ε ,uε ) α0ε , ψε = Jε (x0ε ,uε ) α1ε , and 0
ψ0ε =
0
+ J(xε0 , uε ) + ε . Jε (xε0 , uε )
(2.140)
142
2 Extremal Problems and Optimal Control Because of (2.137) we have ϕε 2RN + ψε 2RN + (ψ0ε )2 = 1,
(2.141) xε0 , xε (b) is
and from the definition of the convex subdifferential and because ∂dC a cone we have
(ϕε , x0 − xε0 )RN + ψε , x1 − xε (b) RN ≤ 0 for all (x0 , x1 ) ∈ C.
(2.142)
Conditions (2.138)−→(2.142) are necessary conditions for optimality of (xε , uε ). Step iv: Passage to the
limit as ε ↓ 0. Let (z, z 0 ) ∈ W 1,1 (0, b), RN × R be defined by
z (t) = fx t, x(t), u(t) z(t) + f t, x(t), w(t) − f t, x(t), u(t)
z0 =
a.e. on T, z(0) = e0
b
Lx t, x(t), u(t) z(t)dt + L t, x(t), w(t) −L t, x(t), u(t) dt.
b
0
0
Then because of (2.129) and as before via Gronwall’s inequality, we have lim xε0 − x0 RN + z ε − zC(T,RN ) + |z0ε − z0 | = 0. ε↓0
So from (2.141) and (2.142), we have
(ϕε , x0 − x0 )RN + ψε , x1 − x(b) RN ≤ xε0 − x0 2RN + xε (b) − x(b)2RN = βε −→ 0 as ε ↓ 0.
Combining this inequality with (2.139), we obtain
ϕε , e0 − (x0 − x0 ) RN + ψε , z(b) − (x1 − x(b) + ψ0ε z0 RN √ ≥ − ε(e0 RN + b) − βε − z ε (b) − z(b)RN − |z0ε − z0 | ≥ −γε for all (x0 , x1 ) ∈ C,
(2.143)
where γε > 0 independent of u ∈ Uad , e0 ∈ B 1 , and γε ↓ 0 as ε ↓ 0. Because of (2.141) we may assume that (ϕε , ψε , ψ0ε ) −→ (ϕ, ψ, ψ0 ) in RN × RN × R as ε ↓ 0. Clearly (ϕ, ψ, ψ0 ) = (0, 0, 0) (see (2.141)). Let ε ↓ 0 in (2.143). We obtain
ϕ, e0 − (x0 − x0 ) RN + ψ, z(b) − x1 − x(b) + ψ0 z0 ≥ 0 RN
(2.144)
for all (x0 , x1 ) ∈ C, all w ∈ Uad , and all e0 ∈ B 1 . Let ψ ∈ W 1,1 (0, b), RN be the unique solution of the equation in statement (b) of the theorem ψ(b) = −ψ and ψ0 = −ψ0 ≤ 0 (see (2.140)). We rewrite (2.144) as follows,
(ϕ, x0 − x0 − e0 )RN − ψ(b), x1 − x(b) − z(b) RN + ψ0 z0 ≤ 0 (2.145) for all (x0 , x1 ) ∈ C, all w ∈ Uad , and all e0 ∈ B 1 .
2.6 Optimal Control
143
A straightforward computation gives
ψ(b), z(b) RN − ψ(0), e0 RN + ψ0 z0
b
= H t, x(t), w(t), ψ0 , ψ(t) − H t, x(t), u(t), ψ0 , ψ(t) dt, 0
(2.146) for all e0 ∈ B 1 and all w ∈ Uad . In (2.145) we set e0 = 0, x0 = xb = x(0) and use the resulting inequality in (2.146) to obtain
b
(2.147) H t, x(t), w(t), ψ0 , ψ(t) − H t, x(t), u(t), ψ0 , ψ(t) dt ≤ 0 0
for all w ∈ Uad . Recall that Y is separable. So we can find a countable dense set {um }m≥1 ⊆ Y . Then if
hm (t) = H t, x(t), um , ψ0 , ψ(t) − H t, x(t), u(t), ψ0 , ψ(t) , m ≥ 1, we have hm ∈ L1 (T ) and we can find Em ⊆ T with |Em |1 = b such that
t+δ 1 hm (s)ds = hm (t) for all t ∈ Em . lim δ↓0 2δ t−δ If t ∈ Em , we define
wδ (s) =
u(s) um
if |s − t| > δ . if |s − t| ≤ δ
From (2.147) we have
t+δ
hm (s)ds ≤ 0
for all δ > 0.
t−δ
Divide with 2δ and let δ ↓ 0, to obtain hm (t) ≤ 0. Thus,
H t, x(t), um , ψ0 , ψ(t) − H t, x(t), u(t), ψ0 , ψ(t) ≤ 0 for all t ∈ E = m≥1 Em , |E|1 = b. Exploiting the continuity of the Hamiltonian H, we have
H t, x(t), u(t), ψ0 , ψ(t) = max H t, x(t), u, ψ0 , ψ(t) a.e. on T. u∈Y
Also, if in (2.145) we let w = u, x0 = x0 , x1 = x(b) and we use (2.146) and (2.147), we obtain
(ϕ, e0 )RN ≥ ψ(b), z(b) RN + ψ0 z0 ≥ ψ(0), e0 RN for all e0 ∈ B 1 , ⇒ ϕ = ψ(0).
Therefore, if in (2.145) e0 = 0 and w = u, we obtain
ψ(0), x0 − x(0) RN ≤ ψ(b), x1 − x(b) RN
for all (x0 , x1 ) ∈ C.
Finally note that (ψ0 , ψ) = (0, 0), or otherwise we contradict the fact that (ϕ, ψ, ψ0 ) = (0, 0, 0).
144
2 Extremal Problems and Optimal Control
REMARK 2.6.49 The function ψ(·) is known as the costate or adjoint state. The system in (b) is known as the adjoint system. The inequality in (c) is known as the transversality condition and finally statement (d) is the maximum condition. In analogy with mechanics H is the Hamiltonian.
2.7 Remarks 2.1: The notion of lower semicontinuity for real functions was introduced by Borel [89], but systematic use of it in variational problems is due to Tonelli [585]. Lower semicontinuity is studied in the books of Barbu–Precupanu [59], Cesari [140], Dal Maso [171], Denkowski–Mig´ orski–Papageorgiou [194], Ekeland–Temam [222], and Ioffe–Tichomirov [327]. Theorem 2.1.25 was first proved by De Giorgi [189] for the case ϕ is a nonnegative Carath´eodory integrand. The version presented here is due to Ioffe [326]. Earlier results on the lower semicontinuity proper
ties of integral functionals J(x, u)= Ω ϕ ω, x(ω), u(ω) dµ on the Lebesgue spaces Lp (Ω, RN )×Lr (Ω, Rm ), were provided by Berkovitz [71], Cesari [138, 139], and Olech [467]. Corresponding results for the weak lower semicontinuity of integral function
als J(x) = Ω ϕ z, x(z), Dx(z) dz (Ω ⊆RN a bounded domain) on the Sobolev spaces W 1,p (Ω) were proved by Buttazzo [126], Dacorogna [170], Ekeland–Temam [222], Giaquinta [263], and Morrey [443]. Lower semicontinuity theorems in the weak topologies of Sobolev spaces when the function x(·) is vector-valued can be found in Acerbi–Fusco [1], Ball–Zhang [56], Dacorogna [170], and Morrey [443]. 2.2: The method of Lagrange multipliers is discussed in Alexeev–Tichomirov–Fomin [7], Tichomirov [582] and Zeidler [620]. Theorem 2.2.7 is due to Ljusternik [392]. It can also be found in the book of Ljusternik–Sobolev [395]. Nonsmooth extensions of the Lagrange multipliers method can be found in Aubin [37], Clarke [149, 153], Rockafellar [522], and Rockafellar–Wets [530]. The Dubovitskii–Milyutin formalism for infinite dimensional optimization problems, was first formulated by Dubovitskii– Milyutin [208]. A comprehensive and readable presentation of the theory is contained in the monograph of Girsanov [266]. 2.3: The first minimax theorem was formulated for bilinear functionals on finitedimensional simplices by von Neumann [454]. It was a basic tool in the study of his model for the growth of an economy. Since then there have been several generalizations of the result of von Neumann. We mention the works of Fan [232], Sion [557], Browder [119, 121], Brezis–Nirenberg–Stampacchia [101], Terkelsen [580], and Simons [556]. Theorem 2.3.7 was first proved by Knaster–Kuratowski–Mazurkiewicz [358] in the special case in which B is the set of vertices of a simplex in RN . The form presented here is due to Dugundji–Granas [211, 212]. An earlier version was proved by Fan [233] who also gave numerous applications of it (see, e.g., Fan [234, 235]; in fact Fan [235] obtained the coincidence result proved in Proposition 2.3.10). Theorem 2.3.13 is due to Sion [557]. The duality theory for convex optimization, which uses the idea of perturbed problems and conjugate convex functionals, was developed for the finite-dimensional case by Rockafellar [522]. The infinite-dimensional case presented here is due to Ekeland–Temam [222]. The special case considered in (2.34) (problem (P) ) is due to Fenchel [241] when A = I and to Rockafellar [518], the general situation with A ∈ L(X, Y ).
2.7 Remarks
145
2.4: Theorem 2.4.1 was proved by Ekeland [221]. A detailed discussion of the various applications that this theorem has can be found in Ekeland [223, 226]. Theorem 2.4.11 is due to Caristi [129] who proved it using transfinite induction. Takahashi [571] proved Theorem 2.4.14. Theorem 2.4.16 is due to Danes [175], who proved it using a result of Krasnoselskii–Zabreiko. Theorem 2.4.20 is due to Brezis–Browder [103]. A generalization of the Ekeland variational principle, in which the Lipschitz continuous perturbations are replaced by quadratic ones, was proved by Borwein– Preiss [92]. Additional results in this direction can be found in Deville–Godefroy– Zizler [197]. 2.5: The calculus of variations is of course a much broader subject. Detailed presentation of the subject can be found in the books Bliss [80], Bolza [83], Cesari [140], Gelfand–Fomin [260], Hestenes [292], Ioffe–Tichomirov [327], Morrey [443], Morse [444], Troutman [586], and Young [617]. The more recent developments on vectorial problems (with significant applications in elasticity theory) can be found in Dacorogna [170]. 2.6: Hypothesis Hc in the existence theory is related to the property (Q) of Cesari [138]. For a detailed discussion of the existence theory for finite dimensional optimal control problems and of the relevant property (Q), we refer to Cesari [140]. Similar results can also be found in Berkovitz [70], Fleming–Richel [249] (finite-dimensional problems) and Ahmed [4], Ahmed–Teo [3], Hu–Papageorgiou [316], Lions [388] and Tiba [581] (infinite-dimensional problems). Scrutinizing the proof of Theorem 2.6.11, we see that what makes it work is hypothesis Hc . So in order to be able to guarantee existence of an optimal pair in an optimal control problem, we need a convexity-type hypothesis. If such a condition is not present the problem need not have a solution. This difficulty was surmounted independently in the late fifties and early sixties by Filippov [247], Warga [601] and Gamkrelidze [255]. They realized that we need to suitably augment the system with the introduction of new admissible controls in order to be able to capture the asymptotic behavior of the minimizing sequences of the original problem. Gamkrelidze and Warga worked with parametrized measures (see problem (2.42)), whereas Filippov used a control-free problem (see problem (2.47)). Here we also present a third relaxation method based on Carath´eodory’s theorem for convex sets. The fourth relaxation method based on the Γ-regularization of the extended cost functional, is due to Buttazzo–Dal Maso [125] (see also Buttazzo [126]), who for that purpose developed the formalism of multiple Γ-limits. Further results on the relaxation of the calculus of variations and optimal control problems can be found in Buttazzo [126]), Dal Maso [171], Fattorini [239], and Roubicek [532]. Necessary conditions for optimality were first obtained by Pontryagin and his coworkers (see Pontryagin [499] and Pontryagin–Boltyanski–Gamkrelidze–Mischenko [500]. It has become a common terminology to call these necessary conditions the Pontryagin maximum principle. Proofs of the maximum principle for different types of optimal control problems can be found in Berkovitz [70], Cesari [140], Fleming– Richel [249], Hermes–La Salle [289], Ioffe–Tichomirov [327], Seierstadt–Sydsaeter [547] (finite-dimensional systems), and Ahmed [4], Ahmed–Teo [3], Fattorini [239], Li–Yong [381], Lions [388], and Tiba [581] (infinite-dimensional systems).
3 Nonlinear Operators and Fixed Points
Summary. *In this chapter, first we study certain broad classes of nonlinear operators that arise naturally in applications and then we develop some related degree theories and fixed point theorems. We start with compact operators, we conduct a detailed study of both linear and nonlinear operators. Then we pass to a broader class of nonlinear operators of monotone-type. We deal with operators from a reflexive Banach space X to its dual X ∗ (monotone operators and their generalizations) and with operators from X into itself (accretive operators). The latter, are related to the generation theory of linear and nonlinear semigroups. We deal with both. Then, we present the degree theories of Brouwer, (finite-dimensional) and Leray–Schauder and Browder–Skrypnik (infinite-dimensional) and several interesting topological applications of them. Finally we present the main aspects of fixed point theory. We consider metric fixed point theory, topologucal fixed point theory and the interplay between fixed point theory and partial order. In this direction we introduce and use the so-called “fixed point index”.
Introduction In this chapter, first we study certain classes on nonlinear operators that arise naturally in applications. Then we develop certain finite- and infinite-dimensional degree theories that are valuable tools in the study of boundary value problems. Degree theory leads in a natural way to fixed point theory, which is the topic of investigation in the last three sections of the chapter. In Section 3.1 we study compact operators. Compact operators were the first attempt to deal with infinite-dimensional operator equations. By its nature, compactness is a suitable concept to achieve finite approximations of infinite objects. We present the basics of both the linear and nonlinear theories. In the linear case we focus on compact self-adjoint operators on a Hilbert space, which have remarkable spectral properties. Compact operators have obvious limitations. A broader class producing a more general framework for the analysis of infinite-dimensional problems, is provided by the so-called monotone operators from a Banach space X into its topological dual X ∗ . They are the natural generalization of increasing real functions on R and their N.S. Papageorgiou, S.Th. Kyritsi-Yiallourou, Handbook of Applied Analysis, Advances in Mechanics and Mathematics 19, DOI 10.1007/b120946_3, © Springer Science+Business Media, LLC 2009
148
3 Nonlinear Operators and Fixed Points
definition does not require any order structure on the Banach space. Prominent among monotone operators are the so-called maximal monotone operators, which exhibit remarkable surjectivity properties. A detailed study of such operators appears in the first half of Section 3.2. In the second half we deal with accretive operators that are the corresponding of monotone operators, when we deal with maps from X into itself (and not into X ∗ ). The importance of accretive operators comes from the fact that they are the generators of linear and nonlinear semigroups which play a central role in the study of evolution equations. In Section 3.2 we present the basic aspects of both the linear and nonlinear semigroup theories. In Section 3.3 we present the basic finite- and infinite-dimensional degree theories. We start with Brouwer’s degree theory which concerns all continuous functions ϕ : U −→ RN with U ⊆ RN nonempty, bounded, and open. Then we present the Leray–Schauder degree theory which extends Brouwer’s theory to an infinitedimensional setting and specifically to maps of the form I −ϕ with ϕ : U −→ X compact (compact perturbations of the identity). Using measures of noncompactness we extend the theory further to maps of the form I − ϕ with ϕ being γ-condensing (Nussbaum–Sadovskii degree theory). Finally using Galerkin and Yosida approximations we produce degree maps for nonlinear operators of monotone type. We also have several interesting topological applications of these degree theories. Degree theory leads smoothly to fixed points. So in Section 3.4 we present the metric fixed point theory. Starting with the celebrated Banach contraction principle, we prove several fixed point theorems in which the metric structure of the spaces and/or of the maps involved, play a crucial role. In Section 3.5 we pass to topological fixed point theory. Now the topological properties of the spaces and/or of the maps involved are the important ones. The best-known result in this direction is Brouwer’s fixed point theorem and its infinitedimensional counterpart, the Schauder fixed point theorem. Other fixed point results of topological flavor are also presented. Finally in Section 3.6 we deal with OBS (ordered Banach spaces) which have a partial order induced by an order cone and maps that exhibit certain positivity properties with respect to this partial order. Using structural properties of the order cones and degree-theoretic methods based on the so-called fixed point index, we prove theorems on the existence and multiplicity of fixed points.
3.1 Compact and Fredholm Operators The first efforts to deal with nonlinear functional equations in infinite-dimensional spaces involved some kind of compactness. Moreover, compact operators provide the natural framework that allows a smooth transition from the finite-dimensional theory to the infinite-dimensional one. DEFINITION 3.1.1 Let X, Y be Banach spaces, C ⊆ X a nonempty set, and f : C −→ Y . (a) We say that f is compact if it is continuous and maps bounded sets of C to relatively compact sets of Y . By K(C, Y ) we denote the set of all compact maps from C into Y .
3.1 Compact and Fredholm Operators
149
(b) We say that f is completely continuous if it is sequentially continuous from C with the relative weak topology into Y with the norm topology; that is, if w {xn }n≥1 ⊆ C and xn −→ x ∈ C, then f (xn ) −→ f (x) in Y . REMARK 3.1.2 In general the two classes of mappings introduced above are not comparable. However, we have the following result. PROPOSITION 3.1.3 If X is a reflexive Banach space, Y is a Banach space, C ⊆ X nonempty, closed, and convex and f : C −→ Y is completely continuous, then f is also compact. PROOF: Evidently f is continuous. Let B be a bounded subset of X. We need to show that f (B ∩ C) is relatively compact in Y . So let {yn }n≥1 ⊆ f (B ∩ C). Then yn = f (xn ) with xn ∈ B ∩ C. Because X is reflexive and {xn }n≥1 is bounded, it is relatively weakly sequentially compact (Eberlein–Smulian theorem) and so by passw ing to a suitable subsequence if necessary, we may assume that xn −→ x. Because f is completely continuous, we have that yn = f (xn ) −→ f (x) = y in Y and this proves the compactness of f . PROPOSITION 3.1.4 If X is a reflexive Banach space, Y is a Banach space, and A ∈ L(X, Y ), then A is completely continuous if and only if A is compact. PROOF: ⇒: Follows from Proposition 3.1.3. w w ⇐: Let xn −→ x in X. Then A(xn ) −→ A(x) in Y . Also {xn }n≥1 ⊆ X is bounded and so {A(xn )}n≥1 is relatively compact in Y . Therefore A(xn ) −→ A(x) in Y . REMARK 3.1.5 It is clear from the second part of the above proof that for linear operators compactness always implies complete continuity, without any additional structure on the Banach spaces X, Y . The set of all compact maps f : X −→ Y is a linear space and we have the following. PROPOSITION 3.1.6 If X, Y are Banach spaces, fn : X −→ Y , n ≥ 1, is a sequence of compact mappings, and f : X −→ Y is a mapping such that fn (x) −→ f (x) in Y uniformly on bounded subsets of X, then f is compact too. PROOF: Clearly f is continuous. We need to show that for every bounded set B ⊆ X, f (B) is relatively compact. Given ε > 0, we can find n ≥ 1 such that fn (x) − f (x)Y ≤ ε/2 for all x ∈ B. Also, because fn (B) is relatively compact, we M
Bε/3 fn (xm ) ⊇ fn (B). Then if x ∈ B, there can find {xm }M m=1 ⊆ B such that is m ∈ {1, . . . , M } such that
m=1
f (x) − f (xm )Y ≤ f (x) − fn (x)Y + fn (x) − fn (xm )Y + + fn (xm ) − f (xm )Y ε ε ε < + + = ε, 3 3 3 ⇒ f (B) is totally bounded, hence relatively compact in Y.
150
3 Nonlinear Operators and Fixed Points
REMARK 3.1.7 Note that the linear space K(X, Y ) (see Definition 3.1.1(a)) is closed under composition with continuous bounded mappings. DEFINITION 3.1.8 Let X, Y be Banach spaces, C ⊆ X a nonempty set, and f : C −→ Y . We say that f is a finite rank mapping, if it is continuous, bounded (i.e., maps bounded sets to bounded ones), and f (C) lies in a finite dimensional subspace of Y . We denote the linear space of finite rank mappings f : C −→ Y , by Kf (C, Y ). REMARK 3.1.9 By virtue of the Heine–Borel theorem we have Kf (C, Y ) ⊆ K(C, Y ). In fact, we show that in a sense K(C, Y ) is the closure of Kf (C, Y ). THEOREM 3.1.10 If X, Y are Banach spaces and C ⊆ X is nonempty bounded, then f ∈ K(C, Y ) if and only if f is the uniform limit of elements in Kf (C, Y ). PROOF: ⇒: The set f (C) is relatively compact in Y . So for every n ≥ 1, we can n find a finite set {yk }N k=1 ⊆ f (C) such that 1 for all x ∈ C. (3.1) n Let ξk (x) = max (1/n) − f (x) − yk Y , 0 . Then ξk ∈ C(C) and the ξk s do not all vanish simultaneously at x ∈ C (see (3.1)). Moreover, if we set min f (x) − yk Y <
1≤k≤Nn
N n
fn (x) =
ξk (x)yk
k=1 N n
,
x ∈ C,
ξk (x)
k=1
then we have N
n ξk (x) yk − f (x) 1 k=1 f (x) − fn (x)Y = < N n n ξk (x)
for all x ∈ C (see (3.1)).
k=1
Then the boundedness of f (C) implies the boundedness of fn (C) and so fn ∈ Kf (C, Y ). ⇐: It follows from Proposition 3.1.6 and Remark 3.1.9.
We recall the following generalization of Tietze’s extension theorem due to Dugundji [209, p. 188]. THEOREM 3.1.11 If X is a metric space, A ⊆ X is a closed subset, Y is a locally convex space, and f : A −→ Y is a continuous map, then there exists a continuous map f : X −→ Y such that f = f and A
f (X) ⊆ conv f (A).
3.1 Compact and Fredholm Operators
151
REMARK 3.1.12 Evidently in the above theorem we can replace Y by any convex subset D of Y . Using Theorem 3.1.11 and the fact that the closed convex hull of a norm compact set in a Banach space is again norm compact (Mazur’s theorem), we can have the following theorem. THEOREM 3.1.13 If X is a Banach space, C ⊆ X is nonempty, closed, and bounded, Y is a Banach space, and f ∈ K(C, Y ), then there exists f ∈ K(X, Y ) such that f = f and f (X) ⊆ conv f (C). C
DEFINITION 3.1.14 Let X, Y be metric spaces, C ⊆ Y nonempty closed and f : C −→ Y a continuous map. We say that f is proper , if for any compact set D in Y , the set f −1 (D) is compact in C. REMARK 3.1.15 The importance of this class of continuous maps stems from the fact that the property of properness restricts the size of the solution set S(y) = {x ∈ C : f (x) = y} for any given y ∈ Y . Evidently, if X and Y are Banach spaces, then all proper operators A ∈ L(X, Y ) are injective and have closed range. PROPOSITION 3.1.16 If X and Y are metric spaces and f ∈ C(X, Y ), then the following are equivalent. (a) f is proper. (b) f is a closed map (i.e., maps closed sets to closed sets) and for every y ∈ Y , the set f −1 (y) = S(y) is compact in X. PROOF: (a)⇒(b): The singleton {y} is compact in Y , thus from the properness of f it follows that f −1 (y) is compact in X. Now let C ⊆ X be nonempty and closed. We show that f (C) is closed in Y . To this end let {yn }n≥1 ⊆ f (C) and suppose that yn −→ y in Y as n → ∞. We have yn = f (xn ), xn ∈ C for all n ≥ 1. Note that D = {yn , y}n≥1 ⊆ Y is compact and because f is proper, f −1 (D) is compact in X. Because {xn }n≥1 ⊆ f −1 (D), by passing to a suitable subsequence if necessary, we may assume that xn −→ x in X. Evidently x ∈ C and f (xn ) −→ f (x). Hence y = f (x) ∈ f (C) and so f is closed. (b)⇒(a): Let D ⊆ Y be nonempty compact. We have to show that f −1 (D) is compact. For this purpose, let {xn }n≥1 ⊆ f −1 (D). Then f (xn ) = yn ∈ D and so we may assume that yn −→ y ∈ D in Y as n → ∞. Let Cn = {xk }k≥n . Because f is closed, y ∈ f (Cn ) for all n ≥ 1. Therefore, if x ∈ f −1 (y), then f (C n ) = f (Cn ) and −1 f (x) ∈ f (C n ) and so f (y) ∩ C n = ∅. This means that {xn }n≥1 has a cluster n≥1
point x ∈ f −1 (y). So we conclude that f −1 (D) is compact.
PROPOSITION 3.1.17 If X and Y are finite-dimensional Banach spaces and f ∈ C(X, Y ), then the following are equivalent. (a) f is proper. (b) f (x)Y −→ +∞ as xX −→ +∞ (i.e., f is weakly coercive).
152
3 Nonlinear Operators and Fixed Points
PROOF: (a)⇒(b): Because both X and Y are finite-dimensional, the properness of f implies that the inverse image of a bounded set in Y is a bounded set in X. But this is simply a restatement of the property of weak coercivity. (b)⇒(a): If D ⊆ Y is nonempty compact, then by virtue of the weak coercivity of f , f −1 (D) is bounded in X. It is also closed in X (because f ∈ C(X, Y )). So by the Heine–Borel theorem, we conclude that f −1 (D) is compact, (i.e., f is proper). There is a partial generalization to an infinite-dimensional setting of the above proposition. PROPOSITION 3.1.18 If X, Y are Banach spaces, f ∈ C(X, Y ), f is weakly coercive (i.e., f (x)Y −→ +∞ as xX −→ +∞), and one of the following holds, (i) f = g + h with g ∈ C(X, Y ) proper and h ∈ K(X, Y ), or w (ii) X is reflexive and xn −→ x in X, f (xn ) −→ y in Y imply xn −→ x in X, then f is proper. PROOF: First assume that (i) holds. Let D ⊆ Y be nonempty compact and let {xn }n≥1 ⊆ f −1 (D). Then yn = f (xn ) ∈ D for all n ≥ 1 and so we may assume that yn −→ y in Y . The weak coercivity of f implies that {xn }n≥1 ⊆ X is bounded. Because h ∈ K(X, Y ), we may assume that h(xn ) −→ y in Y . Then g(xn ) = yn − h(xn ) −→ y − y in Y as n → ∞. Because g is proper, we must have, at least for a subsequence, that xn −→ x in X. Then f (xn ) −→ f (x) = y ∈ D and so we conclude that f is proper. Now suppose that (ii) holds. Because X is reflexive and f (xn ) −→ y in Y as n → ∞, from the weak coercivity w of f we have that {xn }n≥1 ⊆ X is bounded and so we may assume that xn −→ x in X (Eberlein–Smulian theorem). Because of (ii), xn −→ x in X and so f (xn ) −→ f (x) = y ∈ D, which proves that f is proper. PROPOSITION 3.1.19 If X is a Banach space, C ⊆ X is nonempty, closed, and bounded, and f ∈ K(X, Y ), then g = I − f ∈ C(X, Y ) is proper. PROOF: Let D ⊆ X be nonempty and compact and let {xn }n≥1 ⊆ g −1 (D). Then yn = g(xn ) ∈ D for all n ≥ 1 and so we may assume that yn −→ y ∈ D in X. Also because f ∈ K(X, Y ), we have that {f (xn )}n≥1 is relatively compact in X and so we may assume that f (xn ) −→ u in X as n → ∞. Then xn −→ y + u in X as n → ∞ and y = g(y + u), which proves the properness of g. PROPOSITION 3.1.20 If X, Y are Banach spaces, U ⊆ X is nonempty open, and f ∈ K(U, Y ) is Fr´echet differentiable at x0 ∈ U , then f (x0 ) ∈ L(X, Y ) is a compact linear operator. X
PROOF: Suppose that f (x0 ) ∈ L(X, Y ) is not compact. Then f (x0 )(B 1 ) is not X relatively compact in Y (recall that B 1 = {u ∈ X : uX ≤ 1}). So we can find X {un }n≥1 ⊆ B 1 and ε > 0 such that f (x0 )(un ) − f (x0 )(um )Y ≥ 3ε
for all n, m ≥ 1, n = m.
3.1 Compact and Fredholm Operators
153
But from the Fr´echet differentiability of f at x0 , we have f (x0 + h) − f (x0 ) = f (x0 )h + r(x0 , h)
for all h ∈ X
and r(x0 , h)Y ≤ εhX if hX ≤ δ for some δ > 0, (see Definition 1.1.7). Therefore f (x0 + δun ) − f (x0 + δum )Y ≥ δf (x0 )un − f (x0 )um Y − r(x0 , δun )Y − r(x0 , δun )Y ≥ 3δε − δε − δε = δε, which contradicts the fact that f ∈ K(U, Y ).
One of the fundamental notions in linear operator theory is that of the spectrum. It generalizes the concept of the eigenvalue of a linear mapping in a finite-dimensional space. Compact linear operators have remarkable spectral properties, culminating in the spectral resolution theorem for compact self-adjoint operators. In what follows for X, Y Banach spaces, we set Lc (X, Y ) = L(X, Y ) ∩ K(X, Y )
and
Lf (X, Y ) = L(X, Y ) ∩ Kf (X, Y ).
DEFINITION 3.1.21 Let X be a complex Banach space and A ∈ L(X). The spectrum σ(A) of A is defined by σ(A) = {λ ∈ C : λI − A is not invertible} and the resolvent set (A) is defined by (A) = C \ σ(A). If λ ∈ (A), then the operator R(λ) = (λI − A)−1 ∈ L(X) is called the resolvent of A. REMARK 3.1.22 In the case of a real Banach space, the definition is analogous considering only real λ. However, in the real case some important results are not true. For example, if X is a real Banach space it is not true that for every A ∈ L(X), σ(A) = ∅, although this is true for X a complex Banach space. Nevertheless because we focus on self-adjoint operators in Hilbert spaces, we show that σ(A) ⊆ R and so we can assume that we have a real Hilbert space, as are all the vector spaces in this volume. DEFINITION 3.1.23 Let X be a complex Banach space and A ∈ L(X). A number λ ∈ C is called an eigenvalue of A if there exists x ∈ X \ {0} such that A(x) = λx. Then such an x = 0 is called an eigenvector corresponding to the eigenvalue λ. The space ker(λI − A) is called the eigenspace corresponding to the eigenvalue λ. The set of all eigenvalues of A is called the point spectrum of A and it is denoted by σp (A). REMARK 3.1.24 Again, there is an analogous definition for real Banach spaces. Evidently σp (A) ⊆ σ(A). If dim X < +∞, then from linear algebra we know that the operator λI −A is invertible if and only if it is injective. So in this case σ(A) = σp (A). On the other hand in an infinite-dimensional space it may happen that the point spectrum is empty. Indeed let X = H = L2 [0, 1] and let A ∈ L(H) be defined by A(x)(t) = tx(t) for all t ∈ [0, 1]. If λ ∈ C and λx(t) − tx(t) = 0 for some x ∈ L2 [0, 1] and almost all t ∈ [0, 1], then (λ − t)x(t) = 0 a.e. on T and so x(t) = 0 a.e. on T . However, it can be shown that [0, 1] ⊆ σ(A). We leave the details to the reader.
154
3 Nonlinear Operators and Fixed Points
PROPOSITION 3.1.25 If H is a complex Hilbert space and A ∈ L(H) is selfadjoint, then σp (A) ⊆ R and eigenvectors corresponding to different eigenvalues are orthogonal. PROOF: A is self-adjoint therefore for every x ∈ H we have (Ax, x)H = (x, Ax)H = (Ax, x)H , ⇒ (Ax, x)H ∈ R
for all x ∈ H.
If λ ∈ σp (A) and x ∈ H is an eigenvector corresponding to λ, then (Ax, x)H = (λx, x)H = λx2 (Ax, x)H ⇒ λ= ∈ R (i.e., σp (A) ⊆ R). x2 If λ, µ ∈ σp (A), λ = µ and x1 , x2 are eigenvectors corresponding to λ, µ respectively, then Ax1 = λx1 and Ax2 = µx2 , ⇒ (Ax1 , x2 )H = λ(x1 , x2 )H and (Ax1 , x2 )H = (x1 , Ax2 )H = µ(x1 , x2 )H , ⇒ (λ − µ)(x1 , x2 )H = 0 (i.e., x1 ⊥x2 , because λ = µ). REMARK 3.1.26 Similarly we can show that eigenvectors corresponding to different eigenvalues of an operator A on a Banach space X are linearly independent. PROPOSITION 3.1.27 If H is a complex Hilbert space, A ∈ L(H) is self-adjoint and λ ∈ C, then λ ∈ σ(A) if and only if inf[ (λI − A)xH : xH = 1] = 0. PROOF: If λ ∈ (A), then (λI − A)−1 ∈ L(H) and for every x ∈ ∂B1 = {x ∈ H : xH = 1}, we have 1 = xH = (λI − A)−1 (λI − A)xH ≤ (λI − A)−1 L(H) (λI − A)xH , ⇒ inf[(λI − A)xH : xH = 1] ≥ (λI − A)L(H) > 0 for λ ∈ (A). (3.2)
We show that if inf (λI − A)xH : xH = 1 > 0, then λI − A is invertible. Clearly due to homogeneity, if inf[(λI − A)xH : xH = 1] = c0 > 0, then (λI − A)xH ≥ c0 xH
for all x ∈ H.
(3.3)
Hence (λI − A)x = 0 if and only if x = 0, (i.e., λI − A is injective). Next we show that λI − A is also surjective, which by Banach’s theorem implies that (λI − A)−1 ∈ L(H). First we show that (λI − A)(H) is dense in H. If this is not the case we can find h ∈ H, h = 0 such that (λI − A)x, hH = 0
for all x ∈ H,
⇒ x, (λI − A)hH = 0
for all x ∈ H (because A is self-adjoint),
⇒ (λI − A)h = 0,
a contradiction.
3.1 Compact and Fredholm Operators
155
Therefore (λI − A)(H) is dense in H. Also we show that (λI −A)(H) is closed in H. To this end let yn=(λI −A)xn −→ y in H. From (3.3) it follows that {xn }n≥1 ⊆ H is bounded and so we may assume w that xn −→ x in H. Then w
(λI − A)xn −→ (λI − A)x
in H,
⇒ y = (λI − A)x (i.e., (λI − A)(H) is closed). It follows that λI − A is surjective, hence (λI − A)⁻¹ ∈ L(H). Combining this with (3.2), we see that the proposition has been proved.
PROPOSITION 3.1.28 If H is a complex Hilbert space and A ∈ L(H) is self-adjoint, then σ(A) ⊆ R.
PROOF: Let λ = a + iβ with β ≠ 0. We show that |β|‖x‖_H ≤ ‖(λI − A)x‖_H for all x ∈ H. To this end, because A is self-adjoint, we have
(λx − Ax, x)_H − (x, λx − Ax)_H = λ‖x‖²_H − (Ax, x)_H − λ̄‖x‖²_H + (Ax, x)_H = (λ − λ̄)‖x‖²_H = 2iβ‖x‖²_H,
⇒ 2|β|‖x‖²_H = |(λx − Ax, x)_H − (x, λx − Ax)_H| ≤ |(λx − Ax, x)_H| + |(x, λx − Ax)_H| ≤ 2‖(λI − A)x‖_H ‖x‖_H,
⇒ |β|‖x‖_H ≤ ‖(λI − A)x‖_H for all x ∈ H.  (3.4)
Invoking Proposition 3.1.27, from (3.4) we conclude that λ ∈ ρ(A). Therefore σ(A) ⊆ R.
REMARK 3.1.29 This proposition suggests that when dealing with the spectrum of self-adjoint operators, we can assume without any loss of generality that the Hilbert space H is real. In what follows we do this.
PROPOSITION 3.1.30 If H is a Hilbert space, A ∈ L(H) is self-adjoint and m = inf_{‖x‖_H=1} (Ax, x)_H, M = sup_{‖x‖_H=1} (Ax, x)_H, then σ(A) ⊆ [m, M] and m, M ∈ σ(A).
PROOF: From Proposition 3.1.28 we know that σ(A) ⊆ R. Let η > 0. We show that M + η ∈ ρ(A). From Proposition 3.1.27, we know that it suffices to show that
inf[‖((M + η)I − A)x‖_H : ‖x‖_H = 1] > 0.  (3.5)
For every x ∈ ∂B₁, we have ((M + η)x − Ax, x)_H = (M + η)‖x‖²_H − (Ax, x)_H ≥ M + η − M = η > 0. Hence (3.5) is true and then Proposition 3.1.27 implies that M + η ∈ ρ(A). Similarly m − η ∈ ρ(A). Therefore σ(A) ⊆ [m, M]. Next note that σ(A − ξI) = σ(A) − ξ. So by considering, if necessary, A + ξI instead of A, we may assume without any loss of generality that 0 ≤ m ≤ M. Also using the parallelogram law, we can easily check that
‖A‖_{L(H)} = sup{(Ax, x)_H : ‖x‖_H = 1}.
It follows that M = ‖A‖_{L(H)}. We show that
inf[‖(MI − A)x‖_H : ‖x‖_H = 1] = 0.  (3.6)
To this end let {x_n}_{n≥1} ⊆ ∂B₁ be such that (Ax_n, x_n)_H ↑ M = ‖A‖_{L(H)} as n → ∞. Note that because m ≥ 0, we have (Ax, x)_H ≥ 0 for all x ∈ H. Hence
0 ≤ ‖(MI − A)x_n‖²_H = ‖Mx_n − Ax_n‖²_H = (Mx_n − Ax_n, Mx_n − Ax_n)_H = M²‖x_n‖²_H − 2M(Ax_n, x_n)_H + ‖Ax_n‖²_H ≤ 2M² − 2M(Ax_n, x_n)_H −→ 0 as n → ∞,
⇒ ‖(MI − A)x_n‖_H −→ 0 as n → ∞.
Therefore (3.6) holds and from Proposition 3.1.27 it follows that M ∈ σ(A). Similarly we show that m ∈ σ(A). PROPOSITION 3.1.31 If H is a Hilbert space and A ∈ L(H) is self-adjoint compact, then every λ ∈ σ(A)\{0} is an eigenvalue; that is, λ ∈ σp (A). PROOF: By virtue of Proposition 3.1.27, we can find {xn }n≥1 ⊆ ∂B1 such that λxn − Axn H −→ 0
as n → ∞.
(3.7)
Due to the compactness of A and by passing to a subsequence if necessary, we may assume that Axn −→ y in H. Because of (3.7) and because λ = 0, we have that xn −→ x in H. Then y = Ax and in the limit as n → ∞ we have λx = Ax with x = 1 (i.e., λ ∈ σp (A)). COROLLARY 3.1.32 If H is a Hilbert space and A ∈ L(H) is self-adjoint compact, then σp (A) = ∅. PROOF: If A = 0, then λ = 0 is an eigenvalue of A. So suppose that A = 0. Then at least one of m and M defined in Proposition 3.1.30 is nonzero and belongs in σ(A). By virtue of Proposition 3.1.31 it also belongs in σp (A). PROPOSITION 3.1.33 If X is a Banach space, A ∈ Lc (X) and λ ∈ σp (A)\{0}, then dim ker(λI − A) < +∞; that is, the eigenspace corresponding to λ is finitedimensional. PROOF: Let Eλ = ker(λI − A) and let C ⊆ Eλ be a nonempty bounded set. We have Ax = λx for all x ∈ C. (3.8) Because A is compact, A(C) is relatively compact. Therefore by (3.8) we have that λC = A(C) is relatively compact. Because C was an arbitrary bounded set in Eλ , we conclude that dim Eλ < +∞.
PROPOSITION 3.1.34 If H is a Hilbert space and A ∈ Lc (H) is self-adjoint, then σp (A) is a countable set with 0 as the only possible limit point. PROOF: We show that for every ε > 0, σp (A) ∩ {|λ| > ε} is finite. We argue by contradiction. So suppose that there exists ε > 0 and a sequence of distinct eigenvalues {λn }n≥1 of A such that |λn | > ε for all n ≥ 1. Let {xn }n≥1 ⊆ ∂B1 such that Axn = λn xn for all n ≥ 1. If n = m, then from Proposition 3.1.25 we have that (xn , xm )H = 0. Therefore Axn − Axm 2H = λn xn − λm xm 2H = λ2n + λ2m ≥ 2ε2 , ⇒ A(B 1 ) is not relatively compact in H.
But this contradicts the hypothesis that A ∈ Lc (H).
PROPOSITION 3.1.35 If H is a Hilbert space and A ∈ Lc (H) is self-adjoint, then there is an orthonormal basis of H formed by eigenvectors of A. PROOF: For every λ ∈ σp (A), let Eλ = ker(λI − A) andconsider Bλ an orthonormal basis of Eλ . Then by virtue of Proposition 3.1.25 Bλ is orthonormal set of λ∈σp (A) H. Suppose V = span Eλ = H. We consider the space V ⊥ (the orthogonal comλ∈σp (A)
plement of V in H). Clearly A(V ) ⊆ V and so A(V ⊥ ) ⊆ V ⊥ (i.e., the subspace V ⊥ is A-invariant). By Corollary 3.1.32 A has an eigenvalue and so an eigenvector V⊥ ⊥ ⊥ u ∈ V \ {0}. Then u ∈ V ∩ V, u = 0, a contradiction. So V = H and Bλ is λ∈σp (A)
an orthonormal basis of H.
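In finite dimensions, Proposition 3.1.35 and the spectral resolution theorem that follows reduce to the familiar eigendecomposition of a symmetric matrix. The following numpy sketch (an illustration added here, not part of the original text; the matrix is random) verifies orthonormality of the eigenbasis and the representation A(x) = Σ_n λ_n (x, e_n) e_n on R⁵.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2.0                      # a self-adjoint (symmetric) operator on R^5

lam, E = np.linalg.eigh(A)               # columns of E: an orthonormal eigenbasis
assert np.allclose(E.T @ E, np.eye(5))   # orthogonality of eigenvectors (Prop. 3.1.25)

x = rng.standard_normal(5)
Ax_series = sum(lam[n] * (x @ E[:, n]) * E[:, n] for n in range(5))
assert np.allclose(A @ x, Ax_series)     # A(x) = sum_n lambda_n <x, e_n> e_n
print("spectral resolution verified; eigenvalues:", lam)
```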
REMARK 3.1.36 This proposition essentially says that every self-adjoint, compact operator on a Hilbert space is diagonalizable. Now we are ready for the spectral resolution theorem for self-adjoint, compact operators on a separable Hilbert space. THEOREM 3.1.37 If H is an infinite-dimensional separable Hilbert space and A ∈ Lc (H) is self-adjoint, then there is an orthonormal basis {en }n≥1 of H consisting of eigenvectors of A such that for λn being the eigenvalue corresponding to en , we have A(x) = λn x, en H for all x ∈ H. n≥1
PROOF: Due to Proposition 3.1.35 and the separability of H, we can find a countable orthonormal basis {en }n≥1 of H consisting of eigenvectors of A. We have
‖ Σ_{n=k}^{m} λ_n (x, e_n)_H e_n ‖²_H = Σ_{n=k}^{m} |λ_n (x, e_n)_H|² ≤ ‖A‖²_{L(H)} Σ_{n=k}^{m} |(x, e_n)_H|² −→ 0 as k, m −→ ∞,
⇒ Σ_{n≥1} λ_n (x, e_n)_H e_n is convergent in H.
Moreover, if ‖x‖_H ≤ 1, then for every m ≥ 1 we have
‖ Σ_{n=1}^{m} λ_n (x, e_n)_H e_n ‖²_H = Σ_{n=1}^{m} λ_n² |(x, e_n)_H|² ≤ ‖A‖²_{L(H)} Σ_{n=1}^{m} |(x, e_n)_H|² ≤ ‖A‖²_{L(H)} Σ_{n≥1} |(x, e_n)_H|² = ‖A‖²_{L(H)} ‖x‖²_H,
⇒ the map x ↦ F(x) = Σ_{n≥1} λ_n (x, e_n)_H e_n belongs in L(H).
Because A(en ) = λn en , we infer that A(en ) = F (en ) for all n ≥ 1, hence A = F . PROPOSITION 3.1.38 If H is a Hilbert space and A ∈ Lc (H) is self-adjoint, then σ(A) = σp (A). PROOF: If dim H <+∞, then σ(A) = σp (A) = finite set (see Remark 3.1.24). If dim H = +∞ and λ ∈ σ(A) \ {0}, then λ ∈ σp (A) (see Proposition 3.1.31). Due to the compactness of A, 0 ∈ σ(A). If 0 is not an eigenvalue, then by virtue of the proof of Proposition 3.1.34, we can find {λn }n≥1 ⊆ σp (A) \ {0} such that λn −→ 0. Therefore σ(A) = σp (A). Now we return to the general Banach space setting and prove some results that lead to the introduction of Fredholm operators. PROPOSITION 3.1.39 If X is a Banach space, A ∈ Lc (X), and λ = 0, then R(λI − A) is closed. PROOF: We may assume that λ = 1. Let un = xn − A(xn ) and
suppose that un −→ u in X. We need to show that u ∈ R(λI − A). Let dn = d xn , ker(I − A) . Because ker(I − A) is finite-dimensional (see Proposition 3.1.33), we can find yn ∈ ker(I − A) such that dn = xn − yn X . We have un = xn − yn − A(xn − yn ),
n ≥ 1.
(3.9)
We show that {xn − yn }n≥1 ⊆X is bounded. Suppose that this is not true. We may assume that xn − yn X −→+∞ as n → ∞. Set vn =
xn − yn , xn − yn X
n ≥ 1.
From (3.9) we have vn − A(vn ) =
un −→ 0 xn − yn X
as n → ∞.
(3.10)
Due to the compactness of A, we may assume that A(vn ) −→ w in X. Then from (3.10) it follows that vn −→ w in X and w −A(w) = 0; that is, w ∈ ker(I −A). Note that
d(v_n, ker(I − A)) = d(x_n, ker(I − A)) / ‖x_n − y_n‖_X = 1 for all n ≥ 1,
⇒ d(w, ker(I − A)) = 1, a contradiction.
So indeed {xn − yn }n≥1 ⊆ X is bounded and due to the compactness of A, we may assume that A(xn − yn ) −→ h in X. Therefore from (3.9), we have xn − yn −→ u + h
in X
as n → ∞.
Set f = u + h. Then u = f − A(f ) and so u ∈ R(I − A), which proves the closedness of R(I − A). REMARK 3.1.40 From Proposition 3.1.39 it follows that R(I − A) = ker(I − A∗ )⊥ and R(I − A∗ ) = ker(I − A)⊥ . The next result is known as Schauder’s theorem and its proof is an easy application of the Arzela–Ascoli theorem. We leave the details to the reader. THEOREM 3.1.41 If X, Y are Banach spaces, then A ∈ Lc (X, Y ) if and only if A∗ ∈ Lc (Y ∗ , X ∗ ). The theorem that follows is known as the Fredholm alternative theorem and is useful in the study of boundary value problems. THEOREM 3.1.42 If X is a Banach space, A ∈ Lc (X), and λ = 0, then ker(λI − A) = {0} if and only if R(λI − A) = X. PROOF: Again without any loss of generality, we can assume that λ = 1. ⇒: We argue by contradiction. Suppose X1 = R(I − A) = X. From Proposition 3.1.39 we know that X1 is a Banach space and A(X1 ) ⊆ X1 (i.e., X1 is A-invariant). Then A ∈ Lc (X1 ) and X2 = (I − A)(X1 ) is a Banach subspace of X1 . Note X1
that because I − A is injective, X2 = X1 . We continue this way and produce {Xn = (I − A)n (X)}n≥1 a strictly decreasing sequence of Banach subspaces of X. Using the lemma of Riesz, we can have xn ∈ Xn , xn X = 1 and d(xn , xn+1 ) ≥ 12 , n ≥ 1. We have Axn − Axm = −(xn − Axn ) + (xm − Axm ) + (xn − xm ). If n > m, then Xn+1 ⊆ Xn ⊆ Xm+1 ⊆ Xm (with strict inclusions) and so −(xn − Axn ) + (xm − Axm ) + xn ∈ Xm+1 , 1 ⇒ Axn − Axm X ≥ , 2 which contradicts the fact that A ∈ Lc (X). Therefore R(I − A) = X. ⇐: Because R(I − A) = X, we have {0} = R(I − A)⊥ = ker(I − A∗ ). From Theorem 3.1.41, we know that A∗ ∈ Lc (X ∗ ). So from the first part of the proof, we know that R(I − A∗ ) = X ∗ . Then {0} = R(I − A∗ )⊥ = ker(I − A).
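A finite-dimensional sketch of the alternative (illustrative only, not from the text; the helper name and sample matrices are arbitrary): for a square matrix A, either I − A is invertible and (I − A)x = h has a unique solution for every h, or the homogeneous equation has a nontrivial solution.

```python
import numpy as np

def fredholm_alternative(A, h):
    """Solve (I - A)x = h if I - A is injective, otherwise return a kernel vector."""
    n = A.shape[0]
    T = np.eye(n) - A
    if np.linalg.matrix_rank(T) == n:        # injective <=> surjective in finite dimensions
        return "unique solution", np.linalg.solve(T, h)
    _, _, Vt = np.linalg.svd(T)              # last right singular vector spans ker(I - A)
    return "nontrivial kernel", Vt[-1]

A1 = np.array([[0.2, 0.1], [0.0, 0.3]])      # 1 is not an eigenvalue: unique solvability
A2 = np.array([[1.0, 0.0], [0.0, 0.5]])      # 1 is an eigenvalue: ker(I - A2) != {0}
print(fredholm_alternative(A1, np.array([1.0, 2.0])))
print(fredholm_alternative(A2, np.array([1.0, 2.0])))
```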
REMARK 3.1.43 This theorem says that if A ∈ Lc (X), then either for every x ∈ H the equation (λI − A)x = h has a unique solution or otherwise the homogeneous equation (λI − A)x = 0 has a nontrivial solution. If X is finite-dimensional, then every linear operator (which is automatically continuous) is injective if and only if it is surjective. In an infinite-dimensional space X, an operator A ∈ L(X) can be injective without being surjective and vice versa. DEFINITION 3.1.44 Let X, Y be Banach spaces and A ∈ L(X, Y ) (a) We say that A is a Fredholm operator , if ker A is finite-dimensional and R(A) is closed and finite-codimensional. Then
indA = dim(kerA) − codim R(A) is called the index of A. We denote the set of Fredholm operators by Φ(X, Y ). (b) We say that A is a semi-Fredholm operator , if ker A is finite-dimensional and R(A) is closed. We denote the set of semi-Fredholm operators by Φ+ (X, Y ). EXAMPLE 3.1.45 (a) If X = Rn and Y = Rm , then every linear operator A : Rn −→ Rm is Fredholm with index ind A = n − m. (b) Let X = H = l2 and A ∈ L(H) is defined by Ax = (x2 , x3 , . . .),
x = (xn )n≥1 ∈ l2 .
Then ker A is one-dimensional (ker A = {x = (x1 , 0, 0, . . .)}) and R(A) = l2 . If we consider A ∈ L(H) defined by A x = (0, x1 , x2 , . . .),
x = (xn )n≥1 ∈ l2 ,
then ker A′ = {0} and codim R(A′) = 1. So both A, A′ are Fredholm operators.
THEOREM 3.1.46 If X is a Banach space and A ∈ Lc(X), then λI − A ∈ Φ(X) for all λ ≠ 0.
PROOF: As before, without any loss of generality we can assume that λ = 1. Note that I|_{ker(I−A)} = A|_{ker(I−A)}. So I|_{ker(I−A)} is compact, which of course implies that dim ker(I − A) < +∞. Next we show that R(I − A) is closed. Let V be the topological complement of ker(I − A) (i.e., V ⊆ X is a closed subspace such that X = ker(I − A) ⊕ V). Let G = I − A and consider G|_V and A|_V. We see that ker G|_V = {0}. Then G(V) = G(X) and so it suffices to show that G(V) is closed. But this follows if we show that (G|_V)⁻¹ ∈ L(G(V), V). By linearity we have to show that (G|_V)⁻¹ is continuous at the origin. If (G|_V)⁻¹ is not continuous at x = 0, then we can find {x_n}_{n≥1} ⊆ V such that G(x_n) −→ 0 but {x_n}_{n≥1} does not converge to 0. We may assume that ‖x_n‖_X ≥ δ > 0 for all n ≥ 1. Then 1/‖x_n‖_X ≤ 1/δ for all n ≥ 1 and so
‖G(x_n/‖x_n‖_X)‖_X ≤ (1/δ)‖G(x_n)‖_X −→ 0 as n → ∞.
Also, because A ∈ Lc(X), we see that at least for a subsequence A(x_n/‖x_n‖_X) −→ u in X. Then because
G(x_n/‖x_n‖_X) = x_n/‖x_n‖_X − A(x_n/‖x_n‖_X),
we infer that
x_n/‖x_n‖_X −→ u ∈ V in X, u ≠ 0.
But G(u) = 0, which contradicts the fact that ker G ∩ V = {0}. Therefore G(V ) = G(X) is closed. Finally we show that G(X) has finite codimension. Suppose that this is not true. Then we can find a sequence {Xn }n≥0 of closed subspaces of X such that G(X) = X0 ⊆ X1 ⊆ . . . ⊆ Xn ⊆ . . . , where Xn is closed and of codimension one in Xn+1 , just by adding one-dimensional subspaces to G(X) = X0 . Using the Riesz lemma, for each Xn , we can find xn ∈ Xn with xn X = 1 and d(xn , xn−1 ) ≥ 1 − ε, ε > 0. Then for all m < n we have Axn − Axm X = xn − G(xn ) − xm + G(xm )X ≥ 1 − ε because −Axn − xm + Gxm ∈ Hn−1 . But this contradicts the fact that A ∈ Lc (X). THEOREM 3.1.47 If X, Y are Banach spaces, then Φ(X, Y ) is open in L(X, Y ) and S −→ ind S is continuous. PROOF: Let S ∈ Φ(X, Y ). Then X = ker S ⊕ V . Evidently S is an isomorphism of V on S(V ). Hence Y = S(V ) ⊕ W for some finite-dimensional subspace W . Consider the map ξ : V ⊕ W −→ S(V ) ⊕ W defined by ξ(x, y) = Sx + y.
This is an isomorphism. But recall that Isom V ⊕ W, S(V ) ⊕ W is open in L V ⊕ W, S(V ) ⊕ W . So if K is close to S, then the map η : V ⊕ W −→ K(V ) ⊕ W = Y defined by η(x, y) = Kx + y
is still in Isom V ⊕ W, S(V ) ⊕ W . Hence ker K is finite-dimensional and R(K) has finite codimension and so it is closed; that is, K ∈ Φ(X, Y ). Note that we can find E a finite-dimensional space such that X = V ⊕ ker K ⊕ E,
⇒ K ∈ Isom V ⊕ E, K(V ) ⊕ K(E)
and
dim E = dim K(E),
⇒ ind K = dim ker K − dim W + dim K(E) = dim ker S − dim W = ind S, ⇒ S −→ ind S is constant on connected components, hence continuous. REMARK 3.1.48 The above theorem remains true if instead of Φ(X, Y ) we consider Φ+ (X, Y ). Also from this theorem, we see that if X = Y and A ∈ Lc (X), then for all λ = 0 and ind(λI − A) = 0 (from Theorem 3.1.46 we know λI − A ∈ Φ(X)).
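Example 3.1.45(a) can be checked numerically: by rank–nullity, any linear A : R^n −→ R^m has ind A = dim ker A − codim R(A) = (n − r) − (m − r) = n − m, independently of the matrix chosen, which mirrors the stability of the index asserted above. A small sketch (illustrative, not part of the original text):

```python
import numpy as np

def index(A):
    """Fredholm index of a linear map A: R^n -> R^m, i.e. dim ker A - codim R(A)."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    return (n - r) - (m - r)                 # always equals n - m

rng = np.random.default_rng(1)
for (n, m) in [(5, 3), (3, 5), (4, 4)]:
    A = rng.standard_normal((m, n))
    print(f"A: R^{n} -> R^{m}, ind A = {index(A)} (expected {n - m})")
```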
The following proposition summarizes some of the useful properties of Fredholm operators and can be proved using the basic theorems of introductory linear functional analysis. PROPOSITION 3.1.49 If X, Y are Banach spaces and S ∈ Φ+ (X, Y ), then (a) if ind S = 0 and ker S = {0}, we have S −1 ∈ L(Y, X). (b) R(S) = (ker S ∗ )⊥ . (c) S ∗ ∈ Φ+ (Y ∗ , X ∗ ), ind S ∗ = −ind S and R(S ∗ ) = (ker S)⊥ . We conclude with a basic stability result for Fredholm operators. Its proof can be found in Kato [343, pp. 232–238]. THEOREM 3.1.50 If X, Y are Banach spaces, S ∈ Φ(X, Y ), and either A ∈ Lc (X, Y ) or A ∈ L(X, Y ) with AL ≤ ε for some ε = ε(S) > 0, then S + A ∈ Φ(X, Y ) and ind(S + A) = ind S. REMARK 3.1.51 The result is still valid if Φ(X, Y ) is replaced by Φ+ (X, Y ).
3.2 Monotone and Accretive Operators In this section we study the most important class of nonlinear maps from a Banach space X into its dual. These are the monotone maps. They constitute a manageable class of operators because of the very simple definition of monotonicity. They appear naturally in the calculus of variations and in the theory of nonlinear boundary value problems and provide an analytic framework broader than that of compact operators. When we deal with nonlinear maps from a Banach space into itself, then the corresponding notion is that of an accretive operator. When X = H = a pivot Hilbert space, the two notions coincide. Accretive operators are closely connected with the generation theory of semigroups (linear and nonlinear). In the second half of this section we deal with accretive operators and the semigroups they generate. Let X be a Banach space. By X ∗ we denote its topological dual and by ·, ·X the ∗ duality brackets for the pair (X, X ∗ ). Given a multivalued mapping A : X −→ 2X , let us fix some notation associated with it. (a) The domain of A is the set D(A) = {x ∈ X : A(x) = ∅}. (b) The graph of A is the set Gr A = {(x, x∗ ) ∈ X × X ∗ : x∗ ∈ A(x)}. (c) The inverse of A A−1 : X ∗ −→ 2X is defined by (A−1 )(x∗ ) = {x ∈ X : x∗ ∈ A(x)} (hence Gr A−1 = {(x∗ , x) ∈ X ∗ × X : (x∗ , x) ∈ Gr A}). REMARK 3.2.1 The generality of our setting allows us to identify every subset ∗ G ⊆ X × X ∗ with a mapping A : X −→ 2X defined by A(x) = {x∗ ∈ X ∗ : (x, x∗ ) ∈ G}. For this reason some authors identify a multivalued mapping with its graph. Note that in the present multivalued context the inverse mapping A−1 is always defined. ∗
DEFINITION 3.2.2 Given a mapping A : X −→ 2X , we say that:
(a) A is monotone if for any x, y ∈ D(A) and x∗ ∈ A(x), y ∗ ∈ A(y) we have x∗ − y ∗ , x − y ≥ 0.
(3.11)
(b) A is strictly monotone if it is monotone and equality in (3.11) implies that x = y. (c) A is maximal monotone if it is monotone and for (y, y ∗ ) ∈ X×X ∗ the inequalities y ∗ − x∗ , y − xX ≥ 0 for all (x, x∗ ) ∈ Gr A, imply (y, y ∗ ) ∈ Gr A. (d) A is locally bounded at x ∈ D(A) if we can find a neighborhood U of x such that A(U ) is bounded. We say that A is locally bounded if it is locally bounded at every x ∈ D(A). (e) A is bounded if it maps bounded sets in X to bounded sets in X ∗ . ∗
REMARK 3.2.3 According to Definition 3.2.2(c), a mapping A: X −→ 2X is maximal monotone if and only if its graph Gr A is maximal with respect to inclusion among the graphs of all monotone mappings. This definition makes it necessary to consider set-valued mappings. Indeed a discontinuous nondecreasing function f : R −→ R is not maximal monotone. To become maximal monotone, we need to fill in the jump discontinuities, namely introduce f : R −→ 2R defined by f (x) = [f (x− ), f (x+ )] for all x ∈ R. By Zorn’s lemma, every monotone map has a maximal monotone extension. Recall that C ⊆ X is said to be absorbing if 0 ∈ C and X = λC (equivalently λ>0
for every x ∈ X, we can find λ > 0 such that λx ∈ C). A point x ∈ D ⊆ X is said to be the absorbing point of D if D − x is absorbing. Evidently if int D = ∅, then every x ∈ int D is an absorbing point of D. But there can be absorbing points of D not belonging in int D. In fact it can happen that int D = ∅. Consider D = ∂B1 ∪ {0}. Then the origin is an absorbing point of D, but int D = ∅. ∗
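The jump-filling construction of Remark 3.2.3 can be sketched numerically (the function `fill_in` below is a hypothetical helper, not from the text): the nondecreasing but discontinuous sign function becomes a monotone multifunction, in fact the maximal monotone one, once its jump at 0 is replaced by the interval [−1, 1].

```python
import numpy as np

def fill_in(x):
    """Filled-in sign multifunction: f_bar(x) = [f(x-), f(x+)] for f = sgn."""
    if x < 0.0:
        return (-1.0, -1.0)
    if x > 0.0:
        return (1.0, 1.0)
    return (-1.0, 1.0)                   # the jump at 0 is filled with [-1, 1]

# Monotonicity check: (x* - y*)(x - y) >= 0 for all selections; since the product
# is affine in each selection, checking the interval endpoints is enough.
pts = np.linspace(-2.0, 2.0, 9)
ok = all((xs - ys) * (x - y) >= 0.0
         for x in pts for y in pts
         for xs in fill_in(x) for ys in fill_in(y))
print("filled-in graph is monotone:", ok)
```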
THEOREM 3.2.4 If A : X −→ 2X is a monotone map and x0 is an absorbing point of D(A), then A is locally bounded at x0 . PROOF: By picking any x∗0 ∈ A(x0 ) and replacing A by the monotone map A1 (y) = A(y +x0 )−x∗0 we see that without any loss of generality, we may assume that x0 = 0 and (0, 0) ∈ Gr A. Let ϕ(x) = sup y ∗ , x − yX : y ∈ D(A), yX ≤ 1, y ∗ ∈ A(y) for all x ∈ X
and
Lϕ 1 = {x ∈ X : ϕ(x) ≤ 1}.
Note that ϕ is the supremum of affine continuous functions. So ϕ is lower semicontinuous and convex, which means that Lϕ 1 is closed and convex. Note that because (0, 0) ∈ Gr A, we have ϕ ≥ 0. Also, if y ∈ D(A), y ∗ ∈ A(y), then due to the monotonicity of A and because (0, 0) ∈ Gr A, we have 0 ≤ y ∗ , y , ⇒ ϕ(0) = 0
(i.e., 0 ∈ Lϕ 1 ).
(3.12)
We show that Lϕ 1 is also absorbing. Recall that by hypothesis D(A) is absorbing. So if x ∈ X, we can find λ > 0 such that λx ∈ D(A) (i.e., A(λx) = ∅). Let u∗ ∈ A(λx). If (y, y ∗ ) ∈ Gr A, then by virtue of the monotonicity of A, we have
⟨y*, λx − y⟩_X ≤ ⟨u*, λx − y⟩_X, ⇒ ϕ(λx) ≤ sup{⟨u*, λx − y⟩_X : y ∈ D(A), ‖y‖_X ≤ 1} ≤ ⟨u*, λx⟩_X + ‖u*‖_{X*} < +∞. Choose t ∈ (0, 1) such that tϕ(λx) < 1. Because of the convexity of ϕ, we have ϕ(tλx) ≤ tϕ(λx) + (1 − t)ϕ(0) = tϕ(λx) < 1, ⇒ tλx ∈ L₁^ϕ
(see (3.12))
(i.e., Lϕ 1 is absorbing).
ϕ Then C =Lϕ 1 ∩ (−L1 ) is nonempty (because 0 ∈ C), closed, convex, symmetric, and absorbing, therefore 0 ∈ int Lϕ 1 . Hence we can find δ > 0 such that
ϕ(x) ≤ 1
for all xX ≤ 2δ,
⇒ y ∗ , xX ≤ y ∗ , yX + 1
for all y ∈ D(A), yX ≤ 1,
∗
all y ∈ A(y) and all xX ≤ 2δ, ⇒ 2δy ∗ X ∗ ≤ y ∗ X ∗ yX + 1 ≤ δy ∗ X ∗ + 1 for all y ∈ D(A) ∩ Bδ and all y ∗ ∈ A(y),
⇒ ‖y*‖_{X*} ≤ 1/δ for all y* ∈ A(D(A) ∩ B_δ).
PROPOSITION 3.2.5 If A : X −→ 2X is a maximal monotone map and x ∈ D(A), then A(x) is nonempty, w∗ -closed, and convex. PROOF: If x∗1 , x∗2 ∈ A(x) and λ ∈ [0, 1], then for any (y, y ∗ ) ∈ GrA, we have (1 − λ)x∗1 + λx∗2 − y ∗ , x − yX = (1 − λ) x∗1 − y ∗ , x − yX + λ x∗2 − y ∗ , x − yX ≥ 0. From the maximality of A (see Definition 3.2.2(c)), it follows that (1 − λ)x∗1 + λx∗2 ∈ A(x)
(i.e., A(x) is convex). w∗
Also if {x∗α }α∈J ⊆ A(x) is a net such that x∗α −→ x∗ in X ∗ , then for all (y, y ∗ ) ∈ Gr A we have x∗α − y ∗ , x − yX ≥ 0, ⇒ x∗ − y ∗ , x − yX ≥ 0, which by the maximality of A implies that (x, x∗ ) ∈ Gr A; that is, A(x) is w∗ -closed. Combining Theorem 3.2.4 and Proposition 3.2.5, we obtain the following. ∗
PROPOSITION 3.2.6 If A : X −→ 2X is a maximal monotone map and x ∈ int D(A), then A(x) is nonempty, w∗ -compact, and convex. ∗
PROPOSITION 3.2.7 If A : X −→ 2X is a maximal monotone map, then Gr ∗ ∗ ∗ A is closed in X × Xw ∗ and in Xw × X . Here by Xw ∗ (resp., Xw ) we denote the ∗ ∗ space X (resp., X) furnished with the w -topology (resp., the w-topology).
PROOF: Let {(x_α, x*_α)}_{α∈J} be a net in Gr A such that x_α −→ x in X and x*_α −→ x* weakly-* in X*. Then for any (y, y*) ∈ Gr A, we have ⟨x*_α − y*, x_α − y⟩_X ≥ 0, ⇒ ⟨x* − y*, x − y⟩_X ≥ 0. As before, from the maximality of A we infer that (x, x*) ∈ Gr A, which proves that Gr A is closed in X × X*_{w*}; the closedness in X_w × X* is proved similarly.
PROPOSITION 3.2.8 If A : X −→ 2X is a monotone map with convex and w∗ closed values, D(A) = X and for all x, u ∈ X the set-valued map λ −→ A(x + λu) ∗ from [0, 1] into X ∗ has a closed graph in [0, 1] × Xw ∗ , then A is maximal monotone. PROOF: Suppose that for (x, x∗ ) ∈ X × X ∗ , we have x∗ − y ∗ , x − yX ≥ 0
for all (y, y ∗ ) ∈ Gr A.
We need to show that (x, x∗ ) ∈ Gr A (see Definition 3.2.2(c)). We argue indirectly. So suppose that x∗ ∈ / A(x). By hypothesis A(x) is w∗ -closed and convex. So by the strong separation theorem, we can find u ∈ X \ {0} and ε > 0 such that sup v ∗ , uX : v ∗ ∈ A(x) ≤ x∗ , uX − ε. (3.13) In (3.13) we set y = yλ = x + λu, λ > 0 and y ∗ = yλ∗ ∈ A(yλ ). Then yλ∗ − x∗ , uX ≥ 0. Because of Theorem 3.2.4 {yλ∗ }λ∈[0,1] ⊆ X ∗ is bounded and so we may assume w∗
that yλ∗ −→ v ∗ in X ∗ . Then by hypothesis (x, v ∗ ) ∈ Gr A. If we pass to the limit as λ ↓ 0 we obtain 0 ≤ v ∗ − x∗ , uX
which contradicts (3.13).
DEFINITION 3.2.9 Let A : X −→ X ∗ be a single-valued map with D(A) = X. Then we say: w∗
(a) A is demicontinuous if xn −→ x in X, implies that A(xn ) −→ A(x) in X ∗ . (b) A is hemicontinuous if for all x, y ∈ X, the map λ −→ A(x + λy) is continuous ∗ from [0, 1] into Xw ∗. REMARK 3.2.10 Evidently any demicontinuous map is hemicontinuous. Conversely, it is easy to check that if A : X −→ X ∗ is monotone and hemicontinuous, it is demicontinuous. COROLLARY 3.2.11 If A : X −→ X ∗ is monotone, and hemicontinuous, then A is maximal monotone. An important monotone map that is a basic tool in the study of maximal monotone maps and of nonlinear evolution equations, is the so-called duality map.
DEFINITION 3.2.12 Let X be a Banach space and X ∗ its dual. ∗
(a) The duality map (or normalized duality map) is the map F : X −→ 2X defined by F(x) = {x∗ ∈ X ∗ : x∗ , xX = x2X = x∗ 2X ∗ }. (b) Let ξ : R+ −→ R+ be a continuous, strictly increasing function such that ξ(0) = 0 and lim ξ(r) = +∞ (such a function is called the gauge function). The n→∞
∗
duality map with gauge ξ is the map Fξ : X −→ 2X defined by Fξ (x) = x∗ ∈ X ∗ : x∗ , xX = x∗ X ∗ xX , x∗ X = ξ(xX ) . REMARK 3.2.13 By the Hahn–Banach theorem, it follows that D(F) = D(Fξ ) = X. Moreover, both F and Fξ have closed and convex values. Note that if ξ(r) = r, r that F = Fξ . If ξ is a gauge function and η(r) = 0 ξ(s)ds, then η is a convex function and Fξ (x) = ∂η(xX ). In particular, if η(x) = 12 x2X , then F(x) = ∂η(x) for all x ∈ X. Here by ∂η we denote the subdifferential of the convex function η(·) (see Definition 1.2.28). In the sequel, for simplicity in the presentation, we limit ourselves to the normalized duality map F. ∗
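For X = R^N with the p-norm (1 < p < ∞), the normalized duality map of Definition 3.2.12(a) has the explicit form F(x) = ‖x‖_p^{2−p} (|x_i|^{p−1} sgn x_i)_i for x ≠ 0, with the dual norm taken as the p′-norm, 1/p + 1/p′ = 1. The numerical sanity check below of the two defining identities is an added illustration only; the helper name is ours.

```python
import numpy as np

def duality_map(x, p):
    """Normalized duality map on (R^N, ||.||_p), 1 < p < infinity, x != 0."""
    q = p / (p - 1.0)                                    # dual exponent
    xstar = np.linalg.norm(x, p) ** (2.0 - p) * np.abs(x) ** (p - 1.0) * np.sign(x)
    return xstar, q

p = 3.0
x = np.array([1.0, -2.0, 0.5])
xstar, q = duality_map(x, p)
print(np.isclose(xstar @ x, np.linalg.norm(x, p) ** 2))            # <x*, x> = ||x||^2
print(np.isclose(np.linalg.norm(xstar, q), np.linalg.norm(x, p)))  # ||x*||_* = ||x||
```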
DEFINITION 3.2.14 Let A : X −→ 2X be a map. (a) We say that A is coercive, if either D(A) is bounded or D(A) is unbounded and inf[x∗ , xX : x∗ ∈ A(x)] −→ +∞ xX
as xX −→ +∞, x ∈ D(A).
(b) We say that A is weakly coercive if either D(A) is bounded or D(A) is unbounded and inf x∗ X ∗ : x∗ ∈ A(x) −→ +∞ as xX −→ +∞, x ∈ D(A). REMARK 3.2.15 Evidently coercivity implies weak coercivity. PROPOSITION 3.2.16 If X is a reflexive Banach space and X ∗ is strictly ∗ convex, then F : X −→ 2X is single-valued, demicontinuous, maximal monotone, bounded, coercive, and odd. PROOF: Suppose x∗k ∈ F (x), k = 1, 2. Then 2x∗k X ∗ xX ≤ x∗1 2X ∗ + x∗2 2X ∗ = x∗1 + x∗2 , xX ≤ x∗1 + x∗2 X ∗ xX , ⇒ x∗1 X ∗ ≤
1 ∗ x1 + x∗2 X ∗ . 2
(3.14)
Because x∗1 X ∗ = x∗2 X ∗ and X ∗ is strictly convex, then from (3.14) it follows that x∗1 = x∗2 . Next we show that F is demicontinuous. To this end let xn −→ x in X. Then F(xn )X ∗ = xn X −→ xX as n → ∞. Because X ∗ is reflexive, we may assume (at least for a subsequence), that F(xn ) −→ x∗ w
in X ∗
as n → ∞.
3.2 Monotone and Accretive Operators
167
So for all u ∈ X, we have x∗ , uX = lim F (xn ), uX ≤ lim xn X uX = xX uX . n→∞
n→∞
Also we have x∗ , xX = lim F (xn ), xn X = lim xn 2X = x2X , ⇒ x∗ = F (x).
n→∞
n→∞
Therefore by Urysohn’s criterion for convergent sequences, we conclude that for w the original sequence we have F(xn ) −→ F(x) in X ∗ as n → ∞. Note that for every x, u ∈ X, we have F (x) − F (u), x − uX = x2X − 2xX uX + u2X ≥ 0, ⇒ F is monotone, hence maximal monotone (see Corollary 3.2.11). Finally it is clear from its definition that F is bounded, coercive, and odd.
PROPOSITION 3.2.17 If X is a reflexive Banach space, X ∗ is strictly convex, and ψ0 (x) = xX , then ψ0 is Gˆ ateaux differentiable at all x ∈ X \{0} and ψ0 (x) =
F (x) x
for all x ∈ X \{0}.
PROOF: For all x, y ∈ X we have F (y), y − xX ≥ y2X − yX xX 1 ≥ y2X − (y2X + x2X ) 2 1 1 = y2X − x2X 2 2 ≥ yX xX − x2X ≥ F (x), y − xX . Let y = x + λh, h ∈ X, and λ ∈ R. We obtain λ F (x), hX ≤
1 1 x + λh2X − x2X ≤ λ F(x + λh), hX . 2 2
(3.15)
From Proposition 3.2.16 we know that F (x + λh), hX −→ F (x), hX as λ → 0. So from (3.15) it follows that F(x) = η (x) ateaux derivative of η(x) = 12 x2X . Because ψ0 (x) = η(x)1/2 , where η (x) is the Gˆ from Proposition 3.2.16 ateaux differentiable at all x ∈ X \ {0}
we infer that ψ0 is Gˆ and ψ0 (x) = F (x) x. REMARK 3.2.18 Proposition 3.2.17 established the following duality implication. If X is a reflexive Banach space and X ∗ is strictly convex, then X is smooth. In fact the following stronger result is true. Namely, if X is reflexive, then X is strictly convex (resp., smooth) if and only if X ∗ is smooth (resp., strictly convex).
PROPOSITION 3.2.19 If X is a reflexive Banach space and X ∗ is locally uniformly convex, then F : X −→ X ∗ is continuous and ψ0 (x) = x is Fr´echet differentiable at every x ∈ X \ {0}. PROOF: Let xn −→ x in X. Then F(xn )X −→ F (x)X and from Proposition w 3.2.16 we also have that F (xn ) −→ F (x) in X ∗ . But X ∗ being locally uniformly convex it has the Kadec–Klee property and so F(xn ) −→ F(x) in X ∗ (i.e., F is continuous). Combining this with Proposition 3.2.17, we see that x −→ η(x) = 1 x2X is Fr´echet differentiable (see Proposition 1.1.10). Therefore x −→ ψ0 (x) = 2
1/2 is Fr´echet differentiable at every x ∈ X \ {0}. x = 2η(x) REMARK 3.2.20 In fact if X is reflexive and both X and X ∗ are locally uniformly convex, then F : X −→ X ∗ is a homeomorphism. Also if X ∗ is uniformly convex, then F is uniformly continuous on bounded sets of X. EXAMPLE 3.2.21 (a) If (Ω, Σ, µ) is a σ-finite measure space and 1 ≤ p < ∞, then the duality map on Lp (Ω) corresponding to the gauge function ξ(r) = rp−1 , is the subdifferential of x −→ (1/p)xpp , x ∈ Lp (Ω). So
x∗ ∈ Fξ (x) if and only if x∗ (z) ∈ |x(z)|p−1 sgn x(z) a.e. on Ω. (b) If Ω ∈ RN is a bounded domain and 1 < p < ∞, then the duality map on the Sobolev space W01,p (Ω) with norm x = Dxp corresponding to the gauge function ξ(r) = rp−1 is
F (x) = −div(Dxp−2 Dx) ∈ W −1,p (Ω) = W01,p (Ω)∗ ,
where 1/p + 1/p′ = 1.
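Example 3.2.21(a) can be checked on a grid (a rough numerical sketch added here, not from the text; the sample function and grid are arbitrary): for the gauge ξ(r) = r^{p−1}, the selection x* = |x|^{p−1} sgn x satisfies ⟨x*, x⟩ = ‖x*‖_{p′}‖x‖_p and ‖x*‖_{p′} = ξ(‖x‖_p).

```python
import numpy as np

p = 3.0
q = p / (p - 1.0)
t = np.linspace(0.0, 1.0, 2001)
dt = t[1] - t[0]
x = np.sin(2.0 * np.pi * t) + 0.3                   # a sample element of L^p[0,1], discretized

xstar = np.abs(x) ** (p - 1.0) * np.sign(x)          # selection of F_xi(x), xi(r) = r^(p-1)
norm_x = (np.sum(np.abs(x) ** p) * dt) ** (1.0 / p)
norm_xstar = (np.sum(np.abs(xstar) ** q) * dt) ** (1.0 / q)
pairing = np.sum(xstar * x) * dt

print(np.isclose(pairing, norm_x * norm_xstar))      # <x*, x> = ||x*||_* ||x||
print(np.isclose(norm_xstar, norm_x ** (p - 1.0)))   # ||x*||_* = xi(||x||)
```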
The importance of maximal monotone operators stems from their remarkable surjectivity properties under coercivity conditions. Surjectivity results correspond to existence theorems for nonlinear boundary value problems. Next we prove the main surjectivity result for maximal monotone maps. To do this we need some preparation. We start with a finite-dimensional result. PROPOSITION 3.2.22 If X is a finite-dimensional Banach space, C ⊆ X is a ∗ nonempty closed convex set, A : X −→ 2X is a monotone map with D(A) ⊆ C, and ∗ B : C −→ X is monotone, continuous, and coercive, then there exists x0 ∈ C such that x∗ + B(x0 ), x − x0 X ≥ 0 for all (x, x∗ ) ∈ Gr A. PROOF: Without any loss of generality, we may assume that (0, 0) ∈ Gr A. Indeed if this is not the case, we fix x, x∗ ∈ Gr A and replace A and B, by A and B defined by A(x) = A(x + x) − x∗ and B(x) = B(x + x) − x∗ . Note that A and B still have all the properties of A and B, namely A is monotone with D(A) ⊆ C and B is monotone, continuous, and coercive. First suppose that D(A) is bounded. Let K = conv D(A). Then K is compact and convex. Suppose that the proposition is not true. Then for every y ∈ K we can find (x, x∗ ) ∈ Gr A such that
Therefore K =
x∗ + B(y), x − yX < 0.
y ∈ K : x∗ + B(y), x − yX < 0 . Clearly each set in the
(x,x∗ )∈Gr
A
union is open. Because K is compact, we can find {(xk , x∗k )}n k=1 ⊆ Gr A such that n !
y ∈ K : x∗k + B(y), xk − yX < 0 .
K=
k=1
Let {ϑk }n k=1 be a continuous partition of unity subordinate to this cover and consider the map ϕ : K −→ K defined by ϕ(y) =
n
ϑk (y)xk .
k=1
By Brouwer’s fixed point theorem (see Theorem 3.5.3), we can find x0 ∈ K such that ϕ(x0 ) = x0 . For every y ∈ K, we have h(y) =
n k=1 n
= =
k=1 n
ϑk (y)x∗k + B(y), ϕ(y) − yX ϑk (y)x∗k + B(y),
n
ϑi (y)(xi − y)X
i=1
ϑk (y)ϑi (y) x∗k + B(y), xi − yX .
k,i=1
If k = i and ϑ2k (y) = 0, then y∈ K and so x∗k + B(y), xk − yX < 0. If k = i ∗ and ϑk (y)ϑi (y) = 0, then y ∈ y ∈ K : x + B(y), x − y < 0 ∩ y ∈ K : k k X x∗i + B(y), xi − yX < 0 and so exploiting the monotonicity of A, we have x∗k + B(y), xi − yX + x∗i + B(y), xk − yX = x∗k + B(y), xk − yX + x∗i + B(y), xi − yX + x∗k − x∗i , xk − xi X < 0. Therefore h(y) < 0 for all y ∈ K. But h(x0 ) =
n
ϑk (x0 )x∗k + B(x0 ), ϕ(x0 ) − x0 X = 0
(recall that ϕ(x0 ) = x0 ),
k=1
a contradiction. Next we remove the hypothesis that D(A) is bounded. From the first part of the proof, we can find xn ∈ C such that for all (x, x∗ ) ∈ Gr AB (0) . x∗ + B(xn ), x − xn X ≥ 0 Since (0, 0) ∈ Gr AB
n
n (0)
, we obtain
B(xn ), xn X ≤ 0
for all n ≥ 1.
Due to the coercivity of B, it follows that {xn }n≥1 ⊆ C is bounded. So we may assume that xn −→ x0 in X as n → ∞. Then x0 ∈ C and
⟨x* + B(x₀), x − x₀⟩_X ≥ 0
for all (x, x∗ ) ∈ Gr A.
We extend this proposition to infinite dimensional Banach spaces and then we use the extension to prove surjectivity results for maximal monotone operators. PROPOSITION 3.2.23 If X is a reflexive Banach space, C ⊆ X is a nonempty ∗ closed convex set, A : X −→ 2X is a monotone map with D(A) ⊆ C and B: C −→ X ∗ is monotone, hemicontinuous, bounded, and coercive, then there exists x0 ∈ C such that x∗ + B(x0 ), x − x0 X ≥ 0 for all (x, x∗ ) ∈ Gr A. PROOF: We employ the method of finite-dimensional approximations (Galerkin method). So let {Xα }α∈J be a directed family of finite dimensional subspaces of X such that Xα = X. Let pα ∈ L(X, Xα ) be the projection onto the Xα operator. α∈J
Then p∗α ∈ L(Xα∗ , X ∗ ) is the corresponding embedding map. We define Aα = p∗α ◦ A ◦ pα ,
Bα = p∗α ◦ B ◦ pα
and
Cα = C ∩ Xα
for all α ∈ J.
Then Aα and Bα satisfy the hypotheses of Proposition 3.2.22 on Xα and so we can find xα ∈ Cα such that x∗ + Bα (xα ), x − xα Xα ≥ 0
⇒ x∗ + B(xα ), x − xα X ≥ 0
for all (xα , x∗α ) ∈ Gr Aα , for all (x, x∗ ) ∈ Gr (A ◦ pα ). (3.16)
In particular, B(xα ), xα X ≤ 0 for all α ∈ J and due to the coercivity and boundedness of B, we can find M > 0 such that xα X ≤ M
and
B(xα )X ∗ ≤ M
for all α ∈ J.
Due to the reflexivity of X, we can find a subsequence {xαn }n≥1 ⊆ {xα }α∈J such that w
xαn −→ x0
B(xαn ) −→ x∗0 w
and
with (x0 , x∗0 ) ∈ C × X ∗ .
Then from (3.16), we have lim sup B(xαn ), xαn X ≤ x∗ , x − x0 X + x∗0 , xX n→∞
for all (x, x∗ ) ∈ Gr A.
(3.17) By virtue of Zorn’s lemma, without any loss of generality, we may assume that A is maximal on D(A). We claim that there exists (x1 , x∗1 ) ∈ Gr A such that x∗1 , x1 − x0 X + x∗0 , x1 X ≤ x∗0 , x0 X .
(3.18)
Indeed if (3.18) is not true, then we have x∗ + x∗0 , x − x0 X > 0 ⇒
(x0 , −x∗0 )
∈ Gr A
for all (x, x∗ ) ∈ Gr A,
(due to the maximality of A).
(3.19)
So if in (3.19) we set x = x0 and x∗ = x∗0 , we reach a contradiction. Hence (3.18) holds and using this in (3.17), we obtain lim sup B(xαn ), xαn X ≤ x∗0 , x0 X , n→∞
⇒ lim sup B(xαn ), xαn − x0 X ≤ 0.
(3.20)
n→∞
Now let x ∈ D(A) and xλ = λx0 + (1 − λ)x, λ ∈ [0, 1]. Then xλ ∈ C and because of the monotonicity of B, we have B(xαn ) − B(xλ ), xαn − xλ X ≥ 0 ⇒
for all n ≥ 1 and all λ ∈ [0, 1],
λ B(xαn ), xαn − x0 X + (1 − λ) B(xαn ), xαn − xX ≥ λ B(xλ ), xαn − x0 X + (1 − λ) B(xλ ), xαn − xX
⇒
lim inf B(xαn ), xαn − xX ≥ B(xλ ), x0 − xX , n→∞
for all λ ∈ [0, 1] (see (3.20)). Because B is hemicontinuous, we have lim inf B(xαn ), xαn − xX ≥ B(x0 ), x0 − xX , n→∞
⇒ lim inf B(xαn ), xαn X ≥ B(x0 ), x0 − xX + x∗0 , xX n→∞
for all x ∈ D(A) (3.21)
Then from (3.17) and (3.21), we obtain B(x0 ), x0 − xX ≤ x∗ , x − x0 X
for all (x, x∗ ) ∈ Gr A.
Using this proposition we can prove some remarkable surjectivity results for maximal monotone operators. THEOREM 3.2.24 If X is a reflexive Banach space, C ⊆ X a nonempty, closed, ∗ convex set, A : X −→ 2X is a maximal monotone map with D(A) ⊆ C and B: C −→ X ∗ is monotone, hemicontinuous, bounded, and coercive, then A + B is surjective. PROOF: Let x∗0 ∈ X ∗ and introduce the map A(x) = A(x) − x∗0
for all x ∈ D(A).
Note that A is still maximal monotone and D(A) = D(A) ⊆ C. We can apply Proposition 3.2.23 and obtain x0 ∈ C such that for all (x, x∗ ) ∈ Gr A, x∗ + B(x0 ), x − x0 X ≥ 0 ∗ ∗ for all (x, x∗ ) ∈ Gr A. ⇒ x − x0 − B(x0 ) , x − x0 X ≥ 0
(3.22)
Due to the maximal monotonicity of A, from (3.22), we infer that
(x₀, x₀* − B(x₀)) ∈ Gr A, ⇒ x₀* ∈ A(x₀) + B(x₀). Because x₀* ∈ X* was arbitrary, we conclude that A + B is surjective.
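The surjectivity results of this section are pure existence theorems. In the simplest finite-dimensional, single-valued case, with A strongly monotone and Lipschitz on R^n (a much stronger assumption than the theorems require), a solution of A(x) = y can even be computed by a damped Picard iteration. The sketch below is a hedged illustration of that special case, not the method of the text; the map, step size, and iteration count are arbitrary choices.

```python
import numpy as np

def A(x):
    # Strongly monotone and Lipschitz on R^2: symmetric part of M is diag(3, 2) > 0.
    M = np.array([[3.0, 1.0], [-1.0, 2.0]])
    return M @ x + np.tanh(x)

y = np.array([1.0, -2.0])
x = np.zeros(2)
for _ in range(2000):
    x = x - 0.1 * (A(x) - y)                 # damped iteration; a contraction for small steps
print("residual:", np.linalg.norm(A(x) - y))  # ~0: a preimage of y exists and was found
```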
In what follows we use the following theorem, due to Troyanski [587], to equivalently renorm X and X ∗ in a way that it is helpful in our analysis.
THEOREM 3.2.25 If X is a reflexive Banach space, then there exists an equivalent norm on X so that both X and X ∗ are locally uniformly convex. REMARK 3.2.26 From Proposition 3.2.16 and Remark 3.2.20, we know that the duality map F : X −→ X ∗ corresponding to this new equivalent norm, is monotone, a homeomorphism, bounded, and coercive. ∗
THEOREM 3.2.27 If X is a reflexive Banach space and A : X −→ 2X is maximal monotone, then A is surjective if and only if A−1 is locally bounded. PROOF: ⇒: Note that A−1 is maximal monotone too. Moreover, D(A−1 ) = X ∗ . Then from Theorem 3.2.4 it follows that A−1 is locally bounded. ⇐: We show that the range R(A) of A is both closed and open in X ∗ . To this end let x∗n ∈ A(xn ), n ≥ 1, and suppose that x∗n −→ x∗ . Then xn ∈ −1 A (x∗n ), n ≥ 1. Because A−1 is locally bounded, {xn }n≥1 ⊆ X is bounded and so w we may assume that xn −→ x in X as n → ∞. Because (xn , x∗n ) ∈ Gr A for all n ≥ 1, from Proposition 3.2.7, we infer that (x, x∗ ) ∈ Gr A and so we have that R(A) is closed in X. Next we show that R(A) is open X. By virtue of Theorem 3.2.25, we may ∗ assume without any loss of generality that both X
and∗ X are locally uniformly ∗ −1 convex. Let x0 ∈ R(A) and r > 0 such that A B r (x0 ) is bounded in X. We can find x0 ∈ D(A) such that x∗0 ∈ A(x0 ). By virtue of Theorem 3.2.24, for every λ > 0 x −→ A(x) + λF(x − x0 ) is surjective (here F : X −→ X ∗ is the duality map). Given u∗ ∈ Br/2 (x∗0 ) we can find xλ ∈ D(A) and x∗λ ∈ A(xλ ) such that x∗λ + λF (xλ − x0 ) = u∗ ∗
⇒ u − λF (xλ − x0 ) −
x∗0 , xλ
(3.23) − x0 X ≥ 0
(because A is monotone)
⇒ u∗ − x∗0 , xλ − x0 X ≥ λ F (xλ − x0 ), xλ − x0 X = λxλ − x0 2X , r ⇒ λxλ − x0 X ≤ u∗ − x∗0 X ∗ < . 2
(3.24)
Also from (3.23), we have u∗ − x∗λ X ∗ = λF (xλ − x0 )X ∗ = λxλ − x0 X < ⇒ x∗λ − x∗0 X ∗ ≤ x∗λ − u∗ X ∗ + u∗ − x∗0 X ∗ < r Because A−1 B
r (x
∗)
is bounded, then the net {xλ }λ>0 is bounded in X and so
u∗ − x∗λ X ∗ = λxλ − xX −→ 0 ∗
⇒ u ∈ R(A) ⇒
Br/2 (x∗0 )
r 2 (see (3.23)).
as λ ↓ 0,
(because as we already proved R(A) is closed in X ∗ ),
⊆ R(A)
and so R(A) is open in X ∗ .
Because R(A) is both closed and open in X ∗ , we conclude that R(A) = X ∗ . COROLLARY 3.2.28 If X is a reflexive Banach space and A : X −→ 2X maximal monotone, and weakly coercive, then A is surjective.
∗
is
PROOF: The weak coercivity of A implies that A−1 is locally bounded. Then from Theorem 3.2.27 it follows that A is surjective. Combining the above result with Corollary 3.2.11, we obtain the following. COROLLARY 3.2.29 If X is a reflexive Banach space and A : X −→ 2X monotone, hemicontinuous, and weakly coercive, then A is surjective.
∗
is
The next theorem provides a useful criterion for maximal monotonicity in terms of the duality map. THEOREM 3.2.30 If X is a reflexive Banach space such that both X and X ∗ are ∗ strictly convex and A : X −→ 2X is a monotone map, then A is maximal monotone if and only if for every λ > 0 (equivalently for some λ > 0) A + λF is surjective. PROOF: ⇒: Follows from Theorem 3.2.24. ⇐: Suppose that for some (y0 , y0∗ ) ∈ X × X ∗ , we have x∗ − y0∗ , x − y0 X ≥ 0
for all (x, x∗ ) ∈ Gr A.
(3.25)
Then by hypothesis we can find (x1 , x∗1 ) ∈ Gr A such that x∗1 + λF(x1 ) = y0∗ + λF(y0 ).
(3.26)
In (3.25) we set x = x1 and x∗ = x∗1 . We obtain λ F(y0 ) − F (x1 ), x1 − y0 X ≥ 0
(see (3.25)).
(3.27)
But because X, X ∗ are strictly convex, F is strictly monotone and so from (3.27) we infer that x1 = y0 . Hence (y0 , y0∗ ) ∈ Gr A and this proves the maximality of A. ∗
COROLLARY 3.2.31 If X is a reflexive Banach space, A : X −→ 2X is monotone and B : X −→ X ∗ is strictly monotone, hemicontinuous, bounded, and coercive, then A is maximal monotone if and only if A + B is surjective. ∗
COROLLARY 3.2.32 If X is a reflexive Banach space, A : X −→ 2X is maximal monotone and B : X −→ X ∗ is monotone, hemicontinuous, and bounded, then A + B is maximal monotone. PROOF: By virtue of Theorem 3.2.25 we may assume without any loss of generality that both X and X ∗ are locally uniformly convex. Also if we replace A and B by A(x) = A(x + x0 ) + B(x0 )
and
B(x) = B(x + x0 ) − B(x0 )
for some x0 ∈ X,
then we see that we may assume that 0 = B(0). Then B + F is monotone, hemicontinuous, bounded, and coercive. Given u∗ ∈ X ∗ , by Proposition 3.2.23 we can find x0 ∈ X such that x∗ + B(x0 ) + F (x0 ) − u∗ , x − x0 X ≥ 0
for all (x, x∗ ) ∈ Gr A.
Because A is maximal monotone, we have that u* ∈ A(x₀) + B(x₀) + F(x₀), ⇒ R(A + B + F) = X*
and this by Theorem 3.2.30 implies that A + B is maximal monotone.
Another result on the maximality of the sum is the following one whose proof can be found in Barbu [58, p. 46]. ∗
THEOREM 3.2.33 If X is a reflexive Banach space and A1 , A2 : X −→ 2X are two maximal monotone maps such that int D(A1 ) ∩ D(A2 ) = ∅, then A1 + A2 is maximal monotone. REMARK 3.2.34 If X =H = a Hilbert space, then the domain condition
in the above theorem can be replaced by a weaker one which says that 0 ∈ int D(A1 ) − D(A2 ) (recall that int D(A1 ) − D(A2 ) ⊇ int D(A1 ) − D(A2 ). This new condition is symmetric in A1 and A2 and can be satisfied even if D(A1 ) and D(A2 ) have an empty interior. For details we refer to Attouch [33]. It is easy to see that the subdifferential of a function ϕ ∈ Γ0 (X) (see Definition 1.2.1) is monotone. Next we show that in fact it is maximal monotone. THEOREM 3.2.35 If X is a Banach space and ϕ ∈ Γ0 (X), then ∂ϕ : X −→ 2X is maximal monotone.
∗
PROOF: We do the proof for the case when X is reflexive and for the general case we refer to Rockafellar [522]. Directly from the definition of the subdifferential (see Definition 1.2.28), we can see that ∂ϕ is monotone. So we need to show that ∂ϕ is maximal. By virtue of Theorem 3.2.30, it suffices to show that ∂ϕ + F is surjective. To this end, let u∗ ∈ X ∗ and consider the function ψ : X −→ R = R ∪ {+∞} defined by 1 for all x ∈ X. ψ(x) = ϕ(x) + x2X − u∗ , xX 2 Clearly ψ ∈ Γ0 (X). Also from Proposition 1.2.20 we know that we can find (x∗0 , ϑ0 ) ∈ X ∗ × R such that ϕ(x) ≥ x∗0 , xX − ϑ0 for all x ∈ X. So 1 x2X − u∗ xX , 2 as xX −→ ∞.
ψ(x) ≥ −x∗0 X ∗ xX + ⇒ ψ(x) −→ +∞
Invoking Corollary 2.1.11 we obtain x ∈ X such that ψ(x) = inf ψ, X
⇒ 0 ∈ ∂ψ(x) = ∂ϕ(x) + F(x) − u∗ ⇒ u∗ ∈ ∂ϕ(x) + F(x). Because u∗ ∈ X ∗ was arbitrary, we conclude that ∂ϕ + F is surjective, hence x −→ ∂ϕ(x) is maximal monotone. In order to identify those maximal monotone maps on X that are of the subdifferential type, we need to introduce the following specification of monotonicity.
∗
DEFINITION 3.2.36 Let X be a Banach space and A : X −→ 2X . (a) We say that A is cyclically monotone if for any subset {x0 , . . . , xn } ⊆ D(A), n ≥ 1 and x∗k ∈ A(xk ), 0 ≤ k ≤ n we have n
x∗k , xk − xk+1 X ≥ 0,
with xn+1 = x0 .
k=0
(b) We say that A is maximal cyclically monotone if it is cyclically monotone and ∗ there is no cyclically monotone map A : X −→ 2X whose graph strictly contains the graph of A. REMARK 3.2.37 Every monotone map A : R −→ 2R is in fact cyclically mono∗ tone. To see this let {xk }n k=0 ⊆ D(A) and xk ∈ A(xk ) for all 0 ≤ k ≤ n. We may assume that xk < xk+1 for 0 ≤ k ≤ n − 1 and so x∗k ≤ x∗k+1 for all 0 ≤ k ≤ n − 1. So if xn+1 = x0 , we have n
x∗k (xk − xk+1 ) =
k=0
n−1
x∗k (xk − xk+1 ) + x∗n (xn − x0 )
k=0
=
n−1
(x∗k − x∗n )(xk − xk+1 ) ≥ 0.
k=0
EXAMPLE 3.2.38 Not every monotone map is cyclically monotone. Consider the linear map in R2 defined by A(x1 , x2 ) = (x2 , −x1 ). Consider the points (1, 1), (0, 1), and (1, 0) to check that A is not cyclically monotone. ∗
THEOREM 3.2.39 If X is a Banach space and A : X −→ 2X , then A is maximal cyclically monotone if and only if there exists ϕ ∈ Γ0 (X) such that A = ∂ϕ. PROOF: ⇐: Directly from Definition 1.2.28, we see that ∂ϕ is cyclically monotone and this combined with Theorem 3.2.25 implies that ∂ϕ is maximal cyclically monotone. ⇒: Let (x0 , x∗0 ) ∈ Gr A and define ϕ(x) = sup
n−1
x∗k , xk+1 − xk X + x∗n , x − xn X : (xk , x∗k ) ∈ Gr A,
k=0
0 ≤ k ≤ n, n ≥ 1 .
(3.28)
Evidently ϕ is convex and lower semicontinuous with values in R=R ∪ {+∞}. Moreover, ϕ(x0 ) ≤ 0 because A is cyclically monotone. Hence ϕ is proper; that is, ϕ ∈ Γ0 (X). Let (x, x∗ ) ∈ Gr A and u ∈ X. In the right-hand side of (3.28), we ∗ consider the (n + 1) pairs {(xk , x∗k )}n k=0 ∪ {(x, x )} ⊆ Gr A. We obtain ϕ(u) ≥
n−1
x∗k , xk+1 − xk X + x∗n , x − xn X + x∗ , u − xX ,
k=0
⇒ ϕ(u) ≥ ϕ(x) + x∗ , u − xX , ⇒ x∗ ∈ ∂ϕ(x).
So Gr A ⊆ Gr ∂ϕ and from the first part of the proof we infer that Gr A = Gr ∂ϕ (i.e., A = ∂ϕ). COROLLARY 3.2.40 Every maximal monotone map A: R −→ 2R is of the subdifferential type. THEOREM 3.2.41 If X is a reflexive Banach space, ϕ ∈ Γ0 (X), and 0 ∈ D(∂ϕ), then ϕ(x) lim = +∞ xX →∞ xX x∈domϕ
if and only if A = ∂ϕ is coercive. PROOF: ⇒: Let x0 ∈ D(∂ϕ). Then for all (x, x∗ ) ∈ Gr ∂ϕ ϕ(x) − ϕ(x0 ) ≤ x∗ , x − x0 ∗ ∗ inf x , x − x0 : x ∈ ∂ϕ(x) ⇒ lim = +∞ (i.e., A = ∂ϕ is coercive). xX xX →∞ x∈D(∂ϕ)
⇐: Because A = ∂ϕ is coercive, maximal monotone (see Theorem 3.2.30) it is surjective (see Corollary 3.2.28). Also by virtue of Proposition 1.2.20, we may assume without any loss of generality that ϕ ≥ 0. Then for every x∗ ∈ X ∗ with x∗ X ∗ ≤ r, we can find x∈D(∂ϕ) and M > 0 such that x∗ ∈∂ϕ(x) and xX ≤ M . By definition x∗ , y − xX ≤ ϕ(y) − ϕ(x)
for all y ∈ X,
for all y ∈ X, ⇒ x∗ , yX ≤ ϕ(y) + M r ∗ x , yX ϕ(y) Mr ⇒ ≤ + , yX yX yX ϕ(y) = +∞. ⇒ lim yX →∞ yX
x∈domϕ
When we are in a Hilbert space we can introduce some useful single-valued approximations of a maximal monotone operator. DEFINITION 3.2.42 Let H be a Hilbert space identified with its dual (i.e., H = H ∗ , pivot space) and A : H −→ 2H a maximal monotone map. For every λ > 0 we define the following well-known operators. (a) Jλ = (I + λA)−1 , the resolvent of A. (b) Aλ = (1/λ)(I − Jλ ), the Yosida approximation of A. REMARK 3.2.43 Note that D(Jλ ) = D(Aλ ) = H for all λ > 0 and both maps Jλ and Aλ are single-valued. Several important properties of Jλ and Aλ are gathered in the next proposition. They follow easily from Definition 3.2.42.
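For the maximal monotone map A = ∂|·| on H = R, both operators of Definition 3.2.42 are explicit: J_λ is the soft-thresholding map and A_λ(x) = (1/λ)(x − J_λ(x)) is the slope clipped to [−1, 1]. The sketch below is an added illustration only; it checks the nonexpansiveness of J_λ and the (1/λ)-Lipschitz bound for A_λ recorded in the proposition that follows.

```python
import numpy as np

lam = 0.5

def J(x):
    # Resolvent (I + lam * d|.|)^(-1): soft-thresholding at level lam.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def A_lam(x):
    # Yosida approximation (1/lam)(I - J); here it equals clip(x/lam, -1, 1).
    return (x - J(x)) / lam

xs = np.linspace(-3.0, 3.0, 601)
for x, y in zip(xs[:-1], xs[1:]):
    assert abs(J(x) - J(y)) <= abs(x - y) + 1e-12                 # J_lam nonexpansive
    assert abs(A_lam(x) - A_lam(y)) <= abs(x - y) / lam + 1e-12   # A_lam is (1/lam)-Lipschitz
print("A_lam(2.0) =", A_lam(2.0), "(the minimal-norm element of d|.|(2.0) is 1.0)")
```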
PROPOSITION 3.2.44 If H is a Hilbert space, H = H ∗ , and A : H −→ 2H is maximal monotone, then (a) Jλ is nonexpansive (i.e., Lipschitz continuous with constant 1) for all λ > 0.
(b) Aλ (x) ∈ A Jλ (x) for all x ∈ H and all λ > 0. (c) Aλ is monotone and Lipschitz continuous with constant 1/λ for all λ > 0. (d) For all x ∈ D(A), Aλ (x)H ≤ A0 (x)H = min x∗ H : x∗ ∈ A(x) for all λ > 0. (e) lim Aλ (x) = A0 (x) for all x ∈ D(A). λ→0
(f) D(A) is convex and lim Jλ (x) = proj x; D(A) for all x ∈ H. λ→0
REMARK 3.2.45 From (c) and Corollary 3.2.11 it follows that Aλ is maximal monotone. Also because A is maximal monotone, from Proposition 3.2.5 we know that A(x) is nonempty, closed, and convex for every x ∈ D(A). Hence the element of minimum norm in A(x) (the metric projection of the origin in A(x)) exists, is unique, and is denoted by A0 (x). In fact in every reflexive Banach space X, every ∗ maximal monotone map A : X −→ 2X has domain D(A) and range R(A)
which satisfy: D(A) and R(A) are both convex (virtual convexity). In (f) proj x; D(A) denotes the metric projection on the closed convex set D(A). The two operators Jλ and Aλ can also be defined when X is a reflexive Banach space that is not necessarily a Hilbert space. This is done as follows. DEFINITION 3.2.46 Let X be a reflexive Banach space such that both X and ∗ X ∗ are locally uniformly convex and A : X −→ 2X a maximal monotone map. Then by Theorem 3.2.30 for every λ > 0 and x ∈ X, the operator inclusion 0 ∈ F (y − x) + λA(y) has a solution. The hypothesis on X and X ∗ imply
that thissolution is unique and is denoted by Jλ (x). Also we set Aλ (x) = (1/λ)F x − Jλ (x) . REMARK 3.2.47 If H is a pivot Hilbert space (i.e., H = H ∗ ), then F = I and we recover Definition 3.2.42. Note that Jλ , Aλ are single-valued, and bounded and Aλ is also monotone, and demicontinuous (hence maximal monotone). Moreover, Aλ (x)X ∗ ≤ A0 (x)X ∗ for every x ∈ D(A) and lim Jλ (x) = x for every x ∈ D(A). λ↓0
For every x ∈ D(A), A(x) is nonempty, closed convex. Therefore due to the local uniform convexity of X ∗ , A0 (x) is well defined. If A is of the subdifferential type, then as we show below, Aλ is also of this type. To show this we introduce the following notion.
DEFINITION 3.2.48 Let X be a Banach space and ϕ ∈ Γ0 (X). Then for every λ > 0, the Moreau–Yosida approximation ϕλ of ϕ, is defined by ϕλ (x) = inf ϕ(y) + (1/2λ)x − y2X : y ∈ X for all x ∈ X. REMARK 3.2.49 Note that ϕλ = ϕ ⊕ 1/(2λ) · 2X (see Definition 1.2.23). So ϕλ is convex and dom ϕλ = X. PROPOSITION 3.2.50 If X is a reflexive Banach space, X and X ∗ are both locally uniformly convex, ϕ ∈ Γ0 (X), and A = ∂ϕ, then
(a) ϕλ (x)=ϕ Jλ (x) + (1/2λ)x − Jλ (x)2X for all λ > 0 and all x ∈ X. ateaux differentiable and for all λ > 0, Aλ = (∂ϕ)λ = ∂ϕλ = ϕλ . (b) ϕλ is Gˆ (c) ϕ(Jλ x) ≤ ϕλ (x) ≤ ϕ(x) for all λ > 0, all x ∈ X, and lim ϕλ (x) = ϕ(x) for all λ↓0
x ∈ X.
PROOF: (a) For fixed x ∈ X, we consider the function y −→ ϑλ (y) = ϕ(y) + (1/2λ)x − y2X . Evidently ϑλ (·) is proper, lower semicontinuous, strictly convex and coercive. So there exists unique element xλ ∈ X such that ϕλ (x) = ϑλ (xλ ). The vector xλ ∈ X is the unique solution of the operator inclusion 1 0 ∈ − F (x − xλ ) + ∂ϕ(xλ ), λ ⇒ xλ = Jλ (x) (see Definition 3.2.46) and this proves (a).
(3.29)
(b) From (3.29), we obtain 0 ≤ ϕλ (y) − ϕλ (x) − Aλ (x), y − xX ≤ Aλ (y) − Aλ (x), y − xX for all x, y ∈ X. Let y = x + th with t > 0. Dividing with t > 0, we obtain lim t↓0
⇒ ϕλ
ϕλ (x + th) − ϕλ (x) for all h ∈ X, = Aλ (x), hX t is Gˆ ateaux differentiable and Aλ = (∂ϕ)λ = ∂ϕλ = ϕλ .
(c) In view of (a) we have
ϕ Jλ (x) ≤ ϕλ (x) ≤ ϕ(x)
for all x ∈ X.
(3.30)
Let x ∈ dom ϕ. Then from Remark 3.2.47 we know that Jλ (x) −→ x as λ ↓ 0 (because D(∂ϕ) = dom ψ). So we have
ϕ(x) ≤ lim inf ϕ Jλ (x) ≤ lim inf ϕλ (x) ≤ ϕ(x) (see (3.30)), λ↓0
⇒ ϕλ (x) ↑ ϕ(x)
λ↓0
as λ ↓ 0.
Next suppose x ∈ / dom ϕ. We show that ϕλ (x) −→ +∞ as λ ↓ 0. If this not the case, then we can find λn ↓ 0 and M > 0 such that
ϕλn (x) ≤ M
for all n ≥ 1.
From part (a) we infer that Jλn (x) −→ x in X and ϕ Jλn (x) ≤ M . Hence ϕ(x) ≤ M , a contradiction to the fact that x ∈ / dom ϕ. In the context of a pivot Hilbert space, we can improve this proposition. COROLLARY 3.2.51 If X =H =a pivot Hilbert space, ϕ ∈ Γ0 (H), and A = ∂ϕ, then (a) ϕλ is Fr´echet differentiable and ∂ϕλ is Lipschitz continuous with constant 1/λ. (b) If λn −→ λ, xn −→ x in H, ∂ϕλn (xn ) −→ x∗ in H, then (x, x∗ ) ∈ Gr ϕ. w
(c) If λn −→ λ and xn −→ x in H, then ϕ(x) ≤ lim inf ϕλn (xn ) and if in addition n→∞ ϕλn (xn ) n≥1 ⊆ H, is bounded, then ϕλn (xn ) −→ ϕ(x). Before passing to generalizations of maximal monotonicity, let us mention the situation with linear operators. For a proof we refer to Brezis [102, p. 47]. PROPOSITION 3.2.52 If H is a pivot Hilbert space and A : H −→ H is linear, maximal monotone, then A = ∂ϕ with 1 A1/2 x2H if x ∈ D(A1/2 ) ϕ(x) = 2 +∞ if otherwise if and only if A is self-adjoint. Next we introduce some generalizations of the notion of maximal monotonicity that are useful in applications. ∗
DEFINITION 3.2.53 Let X be a reflexive Banach space and A : X −→ 2X . (a) We say that A is pseudomonotone if the following conditions are satisfied. (a1 ) For every x ∈ X, A(x) is nonempty, w-compact, and convex. (a2 ) A is upper semicontinuous (usc for short) from each finite-dimensional sub∗ space Y into X ∗ with the weak topology denoted by Xw (i.e., for all U ⊆ X ∗ + nonempty, w-open A (U ) = {x ∈ Y : A(x) ⊆ U } is open in Y ). (a3 ) If xn −→ x in X and if x∗n ∈ A(xn ) satisfies lim sup x∗n , xn − xX ≤ 0, w
n→∞
then for all u ∈ X there exists x∗ (u) ∈ A(x) such that x∗ (u), x − uX ≤ lim inf x∗n , xn − uX . n→∞
w
(b) We say that A is generalized pseudomonotone if for every sequence xn −→ x in w X and every sequence x∗n −→ x∗ in X ∗ with x∗n ∈ A(xn ) for all n ≥ 1 such that lim sup x∗n , xn − xX ≤ 0, n→∞
∗
then x ∈ A(x) and
x∗n , xn X
−→ x∗ , xX .
PROPOSITION 3.2.54 If X is a reflexive Banach space and A : X −→ 2X pseudomonotone, then A is generalized pseudomonotone.
∗
is
PROOF: Let {(xn , x∗n )}n≥1 ⊆ Gr A such that xn −→ x in X, x∗n −→ x∗ in X ∗ and lim sup x∗n , xn − xX ≤ 0. Because A is pseudomonotone, for every u ∈ X we can w
w
n→∞
find x∗ (u) ∈ A(x) such that x∗ (u), x − uX ≤ lim inf x∗n , xn − uX . n→∞
The sequence {x∗n , xn X }n≥1 ⊆ R is bounded and so passing to a subsequence if necessary, we may assume that x∗n , xn X −→ β ∈ R. Then lim sup x∗n , xn − xX = β − x∗ , xX ≤ 0.
(3.31)
n→∞
On the other hand, β − x∗ , uX ≥ lim inf x∗n , xn − uX ≥ x∗ (u), x − uX n→∞
⇒ x∗ , x − uX ≥ x∗ (u), x − uX
for all u ∈ X
(see (3.31)). (3.32)
We claim that x∗ ∈ A(x). If this is not true and since A(x) is w-compact, convex in X ∗ , by the strong separation theorem we can find v ∈ X \ {0} and c ∈ R such that (3.33) x∗ , vX < c ≤ inf y ∗ , vX : for all y ∗ ∈ A(x) . If in (3.32) we choose u = x − v, we reach a contradiction to (3.33). So indeed x∗ ∈ A(x). Next note that for some x∗ (x) ∈ X ∗ we have lim inf x∗n , xn − xX ≥ x∗ (x), x − xX = 0 n→∞
⇒ lim inf x∗n , xn X ≥ x∗ , xX . n→∞
(3.34)
On the other hand from the choice of the sequences {xn }n≥1⊆X and {x∗n }n≥1 ⊆ X , we have (3.35) lim sup x∗n , xn X ≤ x∗ , xX . ∗
n→∞
From (3.34) and (3.35) we conclude that x∗n , xn X −→ x∗ , xX and so A is generalized pseudomonotone. There is a converse to the previous proposition. ∗
PROPOSITION 3.2.55 If X is a reflexive Banach space and A : X −→ 2X is bounded, generalized pseudomonotone and for all x ∈ X A(x) is nonempty, closed, and convex in X ∗ , then A is pseudomonotone. w
PROOF: First we show Condition [a3 ] in Definition 3.2.53(a), namely that xn −→ x in X and lim sup x∗n , xn X ≤ 0, with x∗n ∈ A(xn ); for every u ∈ X we can find n→∞
x∗ (u) ∈ A(x) such that
⟨x*(u), x − u⟩_X ≤ lim inf_{n→∞} ⟨x*_n, x_n − u⟩_X.
(3.36)
n→∞
Because A is bounded, {x∗n }n≥1 ⊆ X ∗ is bounded and so by passing to subsew quence if necessary, we may assume that x∗n −→ x∗ in X ∗ . Suppose that (3.36) is not true. So we can find u ∈ X such that lim inf x∗n , xn − uX < inf y ∗ , x − uX : y ∗ ∈ A(x) . n→∞
By passing to a subsequence if necessary, we may assume that lim x∗n , xn − uX < inf y ∗ , x − uX : y ∗ ∈ A(x) .
(3.37)
n→∞
Because A is generalized pseudomonotone, we have x∗ ∈ A(x) and x∗n , xn X −→ x∗ , xX . Using this in (3.37), we obtain x∗ , x − uX < inf y ∗ , x − uX : y ∗ ∈ A(x) , which contradicts the fact that x∗ ∈ A(x). Finally we show that A is usc from every finite-dimensional subspace Y of X ∗ into Xw = the space X ∗ with the weak topology. But from Proposition 6.1.10, we ∗ know that it suffices to show that Gr A is sequentially closed in Y × Xw . It follows at once from Definition 3.2.53(a). PROPOSITION 3.2.56 If X is a reflexive Banach space and A : X −→ 2X maximal monotone, then A is generalized pseudomonotone.
∗
is
PROOF: Consider a sequence {(xn , x∗n )}n≥1 ⊆ Gr A such that xn −→ x in X, w x∗n −→ x∗ in X ∗ , and lim sup x∗n , xn − xX ≤ 0. We need to show that x∗ ∈ A(x) w
n→∞
and x∗n , xn X −→ x∗ , xX (see Definition 3.2.53(b)). For every (y, y ∗ ) ∈ Gr A, we have x∗n , xn X = x∗n − y ∗ , xn − yX + y ∗ , xn X + x∗n , yX − y ∗ , yX , for all n ≥ 1 ⇒ x∗ , xX ≥ y ∗ , xX + x∗ , yX − y ∗ , yX , ⇒ x∗ − y ∗ , x − yX ≥ 0
for all (y, y ∗ ) ∈ Gr A.
From the maximality of A, we infer that (x∗ , x) ∈ Gr A. Also for every n ≥ 1, we have x∗n − x∗ , xn − yX ≥ 0 ⇒ lim inf x∗n , xn X ≥ x∗ , xX ≥ lim sup x∗n , xn X . n→∞
x∗n , xn X
Therefore domonotone.
n→∞
∗
−→ x , xX and so we conclude that A is generalized pseu
Recalling that a maximal monotone operator is locally bounded on the interior of D(A) (see Theorem 3.2.4), we can easily prove the following result. We leave the details to the reader.
PROPOSITION 3.2.57 If X is reflexive and A : X −→ 2X is maximal monotone with D(A) = X, then A is pseudomonotone. The class of pseudomonotone maps is closed under addition. For a proof we refer to Browder–Hess [120]. ∗
PROPOSITION 3.2.58 If X is reflexive and A1 , A2: X −→ 2X are pseudomonotone maps, then A1 + A2 is pseudomonotone too. As was the case with maximal monotone maps, pseudomonotone maps exhibit remarkable surjectivity properties. The method of proof is based on Galerkin approximations, therefore we need the following finite-dimensional result whose proof uses fixed point results for multifunctions (see Section 6.5). PROPOSITION 3.2.59 If X is a finite-dimensional Banach space and F : X −→ ∗ 2X is usc, coercive, and has nonempty, compact, and convex values, then F is surjective. Using this result, we can prove the following surjectivity theorem for pseudomonotone maps. ∗
THEOREM 3.2.60 If X is a reflexive Banach space and A : X → 2^{X*} is pseudomonotone and coercive, then A is surjective.

PROOF: Let S be the family of all finite-dimensional subspaces of X partially ordered by inclusion. If Y ∈ S and i_Y ∈ L(Y, X) is the embedding operator, then i*_Y = p_{Y*} ∈ L(X*, Y*) is the projection onto Y*. Let A_Y = i*_Y A i_Y : Y → 2^{Y*}. Then A_Y is usc, coercive, and has nonempty, compact, and convex values. Note that if u* ∈ X*, by considering A_{u*}(x) = A(x) − u*, we see that A_{u*} has the same properties as A, and so it suffices to show that 0 ∈ A(X). From Proposition 3.2.59 we see that for every Y ∈ S we can find x_Y ∈ Y such that 0 ∈ A_Y(x_Y), hence
0 = i*_Y x*_Y for some x*_Y ∈ A(x_Y).
Because A is coercive, {x_Y}_{Y∈S} ⊆ X is bounded. For Y ∈ S let C_Y be the weak closure of {x_{Y'} : Y' ∈ S, Y' ⊇ Y}. Then C_Y ⊆ B̄_M for some M > 0 large. From the reflexivity of X and the finite intersection property, we have ∩_{Y∈S} C_Y ≠ ∅.
Let x₀ ∈ ∩_{Y∈S} C_Y and u ∈ X. Let Y ∈ S be such that {x₀, u} ⊆ Y. We can find {x_{Y_k} = x_k}_{k≥1} ⊆ C_Y such that x_k ⇀ x₀ in X. Then
⟨x*_k, x_k − x₀⟩_X = 0 with x*_k ∈ A(x_k) for all k ≥ 1.
Because A is pseudomonotone, for every u ∈ X we can find u*(u) ∈ A(x₀) such that
⟨u*(u), x₀ − u⟩_X ≤ lim_{k→∞} ⟨x*_k, x_k − u⟩_X = 0.   (3.38)
If 0 ∉ A(x₀), then by the strong separation theorem we can find u ∈ X such that
0 < inf{⟨x*, x₀ − u⟩_X : x* ∈ A(x₀)}.   (3.39)
Comparing (3.38) and (3.39), we reach a contradiction. So A is indeed surjective.

We conclude our discussion of maps of monotone type with one more surjectivity result. First we have a definition.

DEFINITION 3.2.61 Let X be a reflexive Banach space.
(a) A map A : X → 2^{X*} is said to be quasibounded if for each M > 0 we can find K(M) > 0 such that for all (x, x*) ∈ Gr A with ⟨x*, x⟩_X ≤ M∥x∥_X and ∥x∥_X ≤ M, we have ∥x*∥_{X*} ≤ K(M).
(b) A map C : X → X* is said to be smooth if D(C) = X and it is bounded, coercive, and maximal monotone.
(c) A map A : X → 2^{X*} is said to be regular if it is generalized pseudomonotone and for any smooth map C : X → X* the sum A + C is surjective.
(d) A map A : X → X* is said to be of type (S)₊ if for any sequence {x_n}_{n≥1} ⊆ D(A) the conditions x_n ⇀ x in X and lim sup_{n→∞} ⟨A(x_n), x_n − x⟩ ≤ 0 imply that x_n → x in X.

REMARK 3.2.62 Clearly every bounded map A : X → 2^{X*} is quasibounded. Also a monotone map A : X → 2^{X*} with 0 ∈ D(A) is regular if and only if it is maximal. A demicontinuous (S)₊ map is pseudomonotone.

We have the following surjectivity result, whose proof can be found in Pascali–Sburlan [490, p. 139].
THEOREM 3.2.63 If X is a reflexive Banach space, A : X → 2^{X*} is a maximal monotone map with 0 ∈ D(A), and C : X → X* is a regular coercive map and either A or C is quasibounded, then A + C is surjective.

Thus far we have considered mappings from a Banach space into its dual. Next we deal with mappings from a Banach space into itself. Such operators are closely related to the generation theory of semigroups (linear and nonlinear), which play a central role in the theory of evolution equations.

In what follows X is a Banach space and F : X → 2^{X*} is the normalized duality map. Additional hypotheses on X are introduced as needed.

DEFINITION 3.2.64 Let A : X → 2^X. Then
(a) We say that A is accretive if for any (x, u), (y, v) ∈ Gr A, we can find x* ∈ F(x − y) such that ⟨x*, u − v⟩_X ≥ 0.
(b) We say that A is maximal accretive if its graph is maximal with respect to inclusion among the graphs of all accretive mappings.
(c) We say that A is m-accretive if R(I + A) = X.
(d) We say that A is dissipative (m-dissipative) if −A is accretive (m-accretive).

REMARK 3.2.65 If X = H = a pivot Hilbert space (i.e., H = H*), then F = I and so A is accretive if and only if A is monotone. Moreover, in this case the notions of maximal accretivity and m-accretivity coincide and they correspond to maximal monotonicity (see Theorem 3.2.30). Note that maximal accretivity is equivalent to saying that if (x, u) ∈ X × X is such that for all (y, v) ∈ Gr A we can find x* ∈ F(x − y) with ⟨x*, u − v⟩_X ≥ 0, then (x, u) ∈ Gr A.

PROPOSITION 3.2.66 A mapping A : X → 2^X is accretive if and only if
∥x − y + λ(u − v)∥_X ≥ ∥x − y∥_X for all λ > 0 and all (x, u), (y, v) ∈ Gr A.

PROOF: ⇒: By Definition 3.2.64(a), we can find x* ∈ F(x − y) such that ⟨x*, u − v⟩_X ≥ 0. For λ > 0 we have
∥x − y∥²_X = ⟨x*, x − y⟩_X ≤ ⟨x*, x − y⟩_X + λ⟨x*, u − v⟩_X ≤ ∥x*∥_{X*} ∥x − y + λ(u − v)∥_X,
⇒ ∥x − y∥_X ≤ ∥x − y + λ(u − v)∥_X (because ∥x*∥_{X*} = ∥x − y∥_X).

⇐: Let x*_λ ∈ F(x − y + λ(u − v)) and set y*_λ = x*_λ / ∥x*_λ∥_{X*}. We have
∥x − y∥_X ≤ ∥x − y + λ(u − v)∥_X = (1/∥x*_λ∥_{X*}) ⟨x*_λ, x − y + λ(u − v)⟩_X
= ⟨y*_λ, x − y + λ(u − v)⟩_X ≤ ∥x − y∥_X + λ⟨y*_λ, u − v⟩_X.   (3.40)
By Alaoglu's theorem the net {y*_λ}_{λ>0} has a w*-limit point y* as λ ↓ 0 and, because of (3.40), we see that
∥y*∥_{X*} ≤ 1, ⟨y*, u − v⟩_X ≥ 0 and ⟨y*, x − y⟩_X ≥ ∥x − y∥_X,
⇒ ∥y*∥_{X*} = 1 and x* = ∥x − y∥_X y* ∈ F(x − y) with ⟨x*, u − v⟩_X ≥ 0,
⇒ A is accretive.
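The inequality of Proposition 3.2.66 is easy to test numerically. The following sketch is ours and not part of the text; the scalar monotone map A(x) = x³ (which is accretive on X = ℝ, where F = I) and the random sampling are illustrative assumptions.

```python
# Spot check of Proposition 3.2.66 for the accretive (monotone) map
# A(x) = x**3 on X = R: |x - y + lam*(A(x) - A(y))| >= |x - y|.
import random

random.seed(0)
A = lambda x: x ** 3
for _ in range(10000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    lam = random.uniform(0.01, 10.0)
    assert abs(x - y + lam * (A(x) - A(y))) >= abs(x - y) - 1e-12
print("inequality of Proposition 3.2.66 holds on all samples")
```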
As we did for monotone operators in Definition 3.2.42, we introduce the following single-valued approximations of the identity and of the operator itself.

DEFINITION 3.2.67 If A : X → 2^X is an accretive map and λ > 0, then we set
J_λ(x) = (I + λA)^{-1}(x),  A_λ(x) = (1/λ)(I − J_λ)(x),  D(A_λ) = D_λ = R(I + λA),
and
|A(x)| = inf{∥u∥_X : u ∈ A(x)}.
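To make the resolvent J_λ and the Yosida approximation A_λ of Definition 3.2.67 concrete, here is a small numerical sketch of ours (not from the book); the scalar accretive map A(x) = x³ and SciPy's brentq root finder are illustrative assumptions. It also checks the identity A_λ(x) ∈ A(J_λ(x)) established in Proposition 3.2.71 below.

```python
# Illustrative sketch: resolvent J_lam and Yosida approximation A_lam of
# the accretive (monotone, since X = R) map A(x) = x**3.
from scipy.optimize import brentq

def A(x):
    return x ** 3

def J(lam, x):
    # J_lam(x) solves y + lam * A(y) = x; y -> y + lam*y**3 is strictly
    # increasing, so the root is unique and bracketed below.
    return brentq(lambda y: y + lam * A(y) - x, -abs(x) - 1.0, abs(x) + 1.0)

def A_lam(lam, x):
    # Yosida approximation A_lam = (I - J_lam)/lam.
    return (x - J(lam, x)) / lam

x = 2.0
for lam in (1.0, 0.1, 0.01):
    y = J(lam, x)
    # A_lam(x) equals A(J_lam(x)); as lam -> 0, J_lam(x) -> x and A_lam(x) -> A(x).
    print(lam, y, A_lam(lam, x), A(y))
```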
Using Proposition 3.2.66 we check easily that the following is true.

PROPOSITION 3.2.68 A : X → 2^X is an accretive map if and only if for every λ > 0 the map J_λ is single-valued and
∥J_λ(x) − J_λ(y)∥ ≤ ∥x − y∥_X for all x, y ∈ D_λ.

PROPOSITION 3.2.69 If A : X → 2^X is m-accretive, then A is maximal accretive and R(I + λA) = X for every λ > 0.

PROOF: Let K : X → 2^X be an accretive operator such that Gr A ⊆ Gr K and let (x, u) ∈ Gr K. Because A is m-accretive, we can find (y, v) ∈ Gr A ⊆ Gr K such that x + u = y + v. Because K is accretive, from Proposition 3.2.66 we obtain x = y and so u = v. Therefore Gr A = Gr K and this proves that A is maximal accretive.
Next let λ > 0 and u ∈ X. Then u ∈ R(I + λA) if and only if we can find x ∈ D(A) such that
x = (I + A)^{-1}((1/λ)u − ((1 − λ)/λ)x).
Let K(x) = (I + A)^{-1}((1/λ)u − ((1 − λ)/λ)x). Then D(K) = X and, using Proposition 3.2.66, we see that
∥K(x) − K(y)∥_X ≤ (|1 − λ|/λ) ∥x − y∥_X.   (3.41)
If λ > 1/2, then |1 − λ|/λ < 1 and so, because of (3.41), we can apply Banach's fixed point theorem (see Theorem 3.4.3) and obtain x ∈ X such that K(x) = x. Now let β > 1/2 and set A_β = βA. Then from the previous argument we have that R(I + θA_β) = X for all θ > 1/2, hence R(I + θβA) = X for all θβ > 1/4. Continuing this way, we have that R(I + λA) = X for all λ > 1/2ⁿ, n ≥ 1, and hence for all λ > 0.

REMARK 3.2.70 When X = H = a pivot Hilbert space, we recover a special case of Theorem 3.2.30 (with F = I).

PROPOSITION 3.2.71 If A : X → 2^X is an accretive map, then
(a) For every λ > 0 the map A_λ is accretive, Lipschitz continuous on D_λ with constant 2/λ, ∥A_λ(x)∥_X ≤ |A(x)| for all x ∈ D_λ ∩ D(A), and A_λ(x) ∈ A(J_λ(x)) for all x ∈ D_λ.
(b) lim_{λ→0⁺} J_λ(x) = x for all x ∈ D(A) ∩ (∩_{λ>0} D_λ).
PROOF: (a) Let x, y ∈ D_λ and x* ∈ F(x − y). We have
⟨x*, A_λ(x) − A_λ(y)⟩_X = (1/λ)⟨x*, x − y⟩_X − (1/λ)⟨x*, J_λ(x) − J_λ(y)⟩_X
≥ (1/λ)∥x − y∥²_X − (1/λ)∥x − y∥_X ∥J_λ(x) − J_λ(y)∥_X
≥ (1/λ)∥x − y∥_X (∥x − y∥_X − ∥J_λ(x) − J_λ(y)∥_X) ≥ 0
(see Proposition 3.2.68),
⇒ A_λ is an accretive map defined on D_λ.
Also from Proposition 3.2.68, we have
∥A_λ(x) − A_λ(y)∥_X = (1/λ)∥(x − y) − (J_λ(x) − J_λ(y))∥_X ≤ (2/λ)∥x − y∥_X for all x, y ∈ D_λ.
If x ∈ D_λ ∩ D(A), then from Definition 3.2.67 we see that J_λ(x + λu) = x for every u ∈ A(x),
⇒ ∥A_λ(x)∥_X = (1/λ)∥x − J_λ(x)∥_X = (1/λ)∥J_λ(x + λu) − J_λ(x)∥_X ≤ ∥u∥_X (see Proposition 3.2.68),
⇒ ∥A_λ(x)∥_X ≤ |A(x)| for all λ > 0 and all x ∈ D_λ ∩ D(A).
Finally let x ∈ D_λ and put y = J_λ(x). Then x = y + λv with v ∈ A(y) (see Definition 3.2.67). So
A_λ(x) = (1/λ)(x − J_λ(x)) = (1/λ)(x − y) = v ∈ A(y).
Therefore A_λ(x) ∈ A(J_λ(x)) for all λ > 0 and all x ∈ D_λ.

(b) If x ∈ D(A) ∩ (∩_{λ>0} D_λ), then from part (a) we have
∥x − J_λ(x)∥_X = λ∥A_λ(x)∥_X ≤ λ|A(x)| → 0 as λ ↓ 0.
PROPOSITION 3.2.72 If A : X → 2^X is a maximal accretive map, then
(a) Gr A ⊆ X × X is closed and, if X* is uniformly convex, then Gr A is sequentially closed in X × X_w.
(b) If X* is strictly convex, then A(x) is closed and convex for all x ∈ D(A).

PROOF: (a) Let {(x_n, u_n)}_{n≥1} ⊆ Gr A and suppose that x_n → x and u_n → u in X as n → ∞. Then from Proposition 3.2.66, we have
∥x_n − y∥_X ≤ ∥x_n − y + λ(u_n − v)∥_X for all n ≥ 1, all λ > 0, and all (y, v) ∈ Gr A.
Passing to the limit as n → ∞, we obtain
∥x − y∥_X ≤ ∥x − y + λ(u − v)∥_X for all λ > 0 and all (y, v) ∈ Gr A.
So Gr A ∪ {(x, u)} is the graph of an accretive map. But by hypothesis A is maximal accretive. Hence (x, u) ∈ Gr A.
Because X* is uniformly convex, F is uniformly continuous on bounded sets of X (see Remark 3.2.20), and so if x_n → x in X and u_n ⇀ u in X, the inequalities ⟨F(x_n − y), u_n − v⟩_X ≥ 0 for all (y, v) ∈ Gr A imply ⟨F(x − y), u − v⟩_X ≥ 0 for all (y, v) ∈ Gr A. Because of the maximality of A, we conclude that (x, u) ∈ Gr A.

(b) From Proposition 3.2.16 we know that F is single-valued. The maximality of A implies that
A(x) = ∩_{(y,v)∈Gr A} {u ∈ X : ⟨F(x − y), u − v⟩_X ≥ 0},
⇒ A(x) is closed and convex for every x ∈ D(A).
DEFINITION 3.2.73 Let A : X → 2^X. The minimal section of A is the map A⁰ : X → 2^X defined by
A⁰(x) = {u ∈ A(x) : ∥u∥_X = |A(x)|}.

PROPOSITION 3.2.74 If X is reflexive, strictly convex, X* is strictly convex too, and A : X → 2^X is maximal accretive, then D(A⁰) = D(A) and A⁰ is single-valued.

PROOF: Let {u_n}_{n≥1} ⊆ A(x) be such that ∥u_n∥_X ↓ |A(x)|. Due to the reflexivity of X we may assume (at least for a subsequence) that u_n ⇀ u in X. But from Proposition 3.2.72(b) we know that A(x) is closed and convex. So u ∈ A(x). We have ∥u∥_X ≤ lim inf_{n→∞} ∥u_n∥_X = |A(x)|, hence ∥u∥_X = |A(x)|, which proves that D(A⁰) = D(A). Moreover, the strict convexity of X implies that A⁰ is single-valued.

The significance of the minimal section is explained by the next theorem, whose proof can be found in Barbu [58, p. 79].

THEOREM 3.2.75 If X* is uniformly convex and A, K : X → 2^X are two m-accretive maps such that
A⁰(x) ∩ K⁰(x) ≠ ∅ for all x ∈ D(A) = D(K),
then A = K. So if A⁰ = K⁰, then A = K.

Evidently the sum of two dissipative sets is a dissipative set on the intersection of the domains. But if the two operators are m-dissipative, the sum need not be m-dissipative. The next theorem provides a criterion which is easy to check in applications. For a proof we refer to Barbu [58, p. 82].

THEOREM 3.2.76 If X* is uniformly convex and A, K : X → 2^X are two m-accretive maps such that (i) D(A) ∩ D(K) ≠ ∅ and (ii) ⟨F(K_λ(x)), y⟩ ≥ 0 for all (x, y) ∈ Gr A and all λ > 0, then A + K is m-accretive.
The importance of accretive operators comes from their role in the generation theory of nonlinear semigroups. First let us briefly recall the very basic facts about linear semigroups. So suppose T = ℝ₊, A is an N × N matrix, f ∈ L¹(T, ℝᴺ), and x₀ ∈ ℝᴺ, and we consider the following system,
x′(t) = Ax(t) + f(t), t ∈ T, x(0) = x₀.
From the variation of constants formula, we know that
x(t) = S(t)x₀ + ∫₀ᵗ S(t − s)f(s) ds for all t ∈ T,   (3.42)
where S(t) = e^{At}, t ∈ T, is the fundamental solution of the equation x′(t) = Ax(t), x(0) = x₀. The theory of linear semigroups extends the concept of fundamental solution to arbitrary Banach spaces X and gives precise meaning to (3.42).
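As a quick numerical illustration of the variation of constants formula (3.42) — a sketch of ours, not part of the text; the particular 2×2 matrix, the forcing term, and the NumPy/SciPy routines are illustrative assumptions — one can compare (3.42) with a direct numerical integration of the system:

```python
# Illustrative check of (3.42) for x'(t) = A x(t) + f(t), x(0) = x0,
# with S(t) = expm(A t).
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, quad_vec

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
x0 = np.array([1.0, 0.0])
f = lambda t: np.array([np.sin(t), 0.0])

def x_voc(t):
    # x(t) = e^{At} x0 + \int_0^t e^{A(t-s)} f(s) ds
    integral, _ = quad_vec(lambda s: expm(A * (t - s)) @ f(s), 0.0, t)
    return expm(A * t) @ x0 + integral

sol = solve_ivp(lambda t, x: A @ x + f(t), (0.0, 2.0), x0, rtol=1e-9, atol=1e-12)
print(x_voc(2.0))        # variation of constants formula (3.42)
print(sol.y[:, -1])      # direct numerical integration; the two agree
```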
Let X be a Banach space and A ∈ L(X). Given x₀ ∈ X, the function x(t) = e^{At}x₀ is the unique solution of x′(t) = Ax(t), t ∈ T, x(0) = x₀, where e^{At} = Σ_{k≥0} (At)^k / k!. Let us briefly check the properties of the map x₀ ↦ x(t). This is a linear map on X and ∥x(t)∥ ≤ e^{t∥A∥_L} ∥x₀∥. Hence this linear map is bounded. Moreover, as t ↓ 0 we have x(t) → x₀ and also x(0) = x₀. Finally, by the uniqueness of the solution to the problem, if the initial state is x(t₀) and the system evolves for time t > 0, then we reach the state x(t + t₀) (recall also that e^{(t+t₀)A} = e^{tA} e^{t₀A}). We formalize these properties in the following definition.

DEFINITION 3.2.77 Let X be a Banach space and {S(t)}_{t≥0} ⊆ L(X). We say that {S(t)}_{t≥0} is a C₀-semigroup if the following are true.
(a) S(0) = I, the identity operator on X.
(b) S(t + s) = S(t)S(s) for all t, s ≥ 0.
(c) For every x ∈ X, S(t)x → x as t ↓ 0.

REMARK 3.2.78 Property (b) is called the semigroup property. Property (c) implies continuity with respect to t in the strong operator topology. If A ∈ L(X), then we already saw that S(t) = e^{At} defines a C₀-semigroup.

PROPOSITION 3.2.79 If {S(t)}_{t≥0} is a C₀-semigroup on X, then there exist M ≥ 1 and ω ≥ 0 such that ∥S(t)∥_L ≤ M e^{ωt} for all t ≥ 0.

PROOF: We claim that we can find M ≥ 1 and δ > 0 such that ∥S(t)∥_L ≤ M for t ∈ [0, δ]. If not, then we can find t_n → 0 such that {∥S(t_n)∥_L}_{n≥1} is unbounded. But from Definition 3.2.77(c), we have that S(t_n)x → x in X for every x ∈ X. Hence {S(t_n)x}_{n≥1} is bounded in X and so, by the Banach–Steinhaus theorem, {∥S(t_n)∥_L}_{n≥1} is bounded, a contradiction.
Let ω = (ln M)/δ ≥ 0. Given t ≥ 0 we can find an integer n ≥ 0 and 0 ≤ θ ≤ δ such that t = nδ + θ. Then S(t) = S(δ)ⁿ S(θ) and so
∥S(t)∥_L ≤ ∥S(δ)∥ⁿ_L ∥S(θ)∥_L ≤ Mⁿ M ≤ M e^{ωt},
because ln Mⁿ = n ln M ≤ nωδ ≤ ωt.

COROLLARY 3.2.80 For every x ∈ X, the map t ↦ S(t)x is continuous from T = ℝ₊ into X.

DEFINITION 3.2.81 If in Proposition 3.2.79 we can take M = 1 and ω = 0, then ∥S(t)∥_L ≤ 1 for all t ≥ 0 and {S(t)}_{t≥0} is called a contraction semigroup.

In the next definition we introduce an important concept in the theory of semigroups.

DEFINITION 3.2.82 Let {S(t)}_{t≥0} be a C₀-semigroup on X. The infinitesimal generator of the semigroup is the linear operator A defined by
Ax = lim_{t↓0} (S(t)x − x)/t for all x ∈ D(A) = {x ∈ X : lim_{t↓0} (S(t)x − x)/t exists}.
Next we derive some basic properties of the infinitesimal generator and for this purpose we need the following simple lemma.

LEMMA 3.2.83 If {S(t)}_{t≥0} is a C₀-semigroup, then
lim_{h↓0} (1/h) ∫_t^{t+h} S(τ)x dτ = S(t)x for all t ≥ 0 and all x ∈ X.

PROOF: We have
∥(1/h) ∫_t^{t+h} S(τ)x dτ − S(t)x∥ = ∥(1/h) ∫_t^{t+h} (S(τ) − S(t))x dτ∥
≤ (1/h) ∫_t^{t+h} ∥S(τ)x − S(t)x∥ dτ.   (3.43)
Given ε > 0, we can find h > 0 small such that ∥S(τ)x − S(t)x∥ < ε for all |τ − t| < h (see Corollary 3.2.80). So from (3.43) we obtain
∥(1/h) ∫_t^{t+h} S(τ)x dτ − S(t)x∥ < ε,
which proves the lemma.
PROPOSITION 3.2.84 If {S(t)}_{t≥0} is a C₀-semigroup with infinitesimal generator A and x₀ ∈ D(A), then
(a) S(t)x₀ ∈ D(A) for each t ≥ 0.
(b) (d/dt)S(t)x₀ = AS(t)x₀ = S(t)Ax₀ for each t > 0.
(c) t ↦ S(t)x₀ belongs in C¹((0, ∞), X) ∩ C(ℝ₊, D(A)).

PROOF: (a) We have
lim_{h↓0} (1/h)(S(h)S(t)x₀ − S(t)x₀) = lim_{h↓0} S(t)(1/h)(S(h)x₀ − x₀) = S(t)Ax₀.
Therefore S(t)x₀ ∈ D(A) (see Definition 3.2.82) and AS(t)x₀ = S(t)Ax₀.

(b) For h > 0, t ≥ 0, we have
(1/h)(S(t + h)x₀ − S(t)x₀) = S(t)(1/h)(S(h)x₀ − x₀) = (1/h)(S(h) − I)S(t)x₀.
By virtue of Definition 3.2.82 we know that lim_{h↓0} (1/h)(S(h)x₀ − x₀) exists, and so the other two limits exist too. So it follows that S(t)x₀ ∈ D(A). Also for t > 0 and h > 0 small, we have
(1/h)(S(t − h)x₀ − S(t)x₀) = −S(t − h)(1/h)(S(h)x₀ − x₀).
Hence we deduce that the strong derivative of t ↦ S(t)x₀ exists at each t > 0 and
(d/dt)S(t)x₀ = AS(t)x₀ = S(t)Ax₀.
(c) From (b) and Corollary 3.2.80 we infer that t ↦ S(t)x₀ belongs in C¹((0, ∞), X) ∩ C(ℝ₊, D(A)).

REMARK 3.2.85 From the above proposition it follows that t ↦ x(t) = S(t)x₀ is the unique solution of the initial value problem x′(t) = Ax(t), t ≥ 0, x(0) = x₀, provided that the initial datum is in D(A). If x₀ ∉ D(A), then t ↦ S(t)x₀ need not be differentiable with respect to t. We can, however, consider t ↦ x(t) = S(t)x₀ as a generalized solution of the initial value problem.

PROPOSITION 3.2.86 If {S(t)}_{t≥0} is a C₀-semigroup with infinitesimal generator A, then A is closed and densely defined.

PROOF: Let x ∈ X. Then for h > 0 we have
(1/h)(S(h) − I) ∫₀ᵗ S(τ)x dτ = (1/h) ∫₀ᵗ (S(τ + h)x − S(τ)x) dτ
= (1/h) ∫_t^{t+h} S(τ)x dτ − (1/h) ∫₀ʰ S(τ)x dτ → S(t)x − x
(see Lemma 3.2.83). Therefore ∫₀ᵗ S(τ)x dτ ∈ D(A). Once again Lemma 3.2.83 implies that
(1/h) ∫₀ʰ S(τ)x dτ → S(0)x = x,
⇒ D(A) is dense in X.
If x_n ∈ D(A), x_n → x in X and Ax_n → u in X, we need to show that x ∈ D(A) and Ax = u. We have
(1/h)(S(h)x − x) = lim_{n→∞} (1/h)(S(h)x_n − x_n) = lim_{n→∞} (1/h) ∫₀ʰ S(τ)Ax_n dτ.
Because Ax_n → u, we have
(1/h)(S(h)x − x) = (1/h) ∫₀ʰ S(τ)u dτ → u as h ↓ 0
(see Lemma 3.2.83). Therefore x ∈ D(A) and Ax = u.
(At)k . k!
k≥0
For the unbounded case, we know that if A is the infinitesimal generator of a strongly continuous semigroup, then we require that D(A) = X
and A is closed.
3.2 Monotone and Accretive Operators
191
The precise conditions are provided by the following classical result, known as the Hille–Yosida theorem. For a proof we refer to the book of Hille–Phillips [300, p. 360]. THEOREM 3.2.87 Necessary and sufficient conditions for a linear unbounded operator A on a Banach space X to be the infinitesimal generator of a C0 -semigroup satisfying that S(t)||L ≤ M eωt for some M ≥ 1, ω ≥ 0 are (a) A is closed and densely defined. (b) (λI − A)−1 exists for λ > ω, (c) (λI − A)−m L ≤
M , (λ−ω)m
λ > ω, m ≥ 1.
REMARK 3.2.88 The family of the operators R(λ) = (λI − A)−1 for λ > ω, is called the resolvent of A. A byproduct of the proof of the Hille–Yosida theorem, is the following exponential formula for the C0 -semigroup. THEOREM 3.2.89 If {S(t)}t≥0 is a contraction semigroup with infinitesimal ngenerator A, then for any x ∈ X, S(t)x= lim (I − (t/n)A)−n = lim (n/t)R(n/t) x. n→∞
n→∞
Let us give some characteristic examples of C₀-semigroups with their infinitesimal generators.

EXAMPLE 3.2.90 (a) Let X = H = L²(0, b) (a Hilbert space) and let us consider the operator A = d/dx with domain D(A) = {x ∈ W^{1,2}(0, b) : x(b) = 0}. The operator A is the generator of the so-called left-shift semigroup {S(t)}_{t≥0},
(S(t)x)(τ) = x(t + τ) if t + τ ∈ (0, b), and (S(t)x)(τ) = 0 if t + τ ∉ (0, b).

(b) Let H be a Hilbert space with a complete orthonormal basis {e_n}_{n≥1} and let {λ_n}_{n≥1} be a sequence in ℝ such that λ_n → −∞. We define
S(t)x = Σ_{n≥1} e^{λ_n t} ⟨x, e_n⟩_H e_n, x ∈ H, t ≥ 0.   (3.44)
It is easy to check that {S(t)}_{t≥0} is a C₀-semigroup. The infinitesimal generator of this semigroup is given by
Ax = Σ_{n≥1} λ_n ⟨x, e_n⟩ e_n with domain D(A) = {x ∈ H : Σ_{n≥1} λ_n² ⟨x, e_n⟩² < +∞}.

(c) Let H = L²[0, π] and Ax = d²x/dt² with domain D(A) = W₀^{1,2}(0, π) ∩ W^{2,2}(0, π). Then A is the infinitesimal generator of the semigroup given by (3.44) with e_n(t) = √(2/π) sin(nt), t ∈ [0, π], λ_n = −n², n ≥ 1.

(d) The infinitesimal generator A of the semigroup {S(t)}_{t≥0} on C(ℝ) given by
(S(t)x)(τ) = e^{−λt} Σ_{k≥0} ((λt)^k / k!) x(τ − kµ), λ, µ > 0,
is the difference operator (Ax)(τ) = λ[x(τ − µ) − x(τ)].

Now we pass to nonlinear semigroups. So let X be a Banach space and C ⊆ X be a nonempty subset of X.

DEFINITION 3.2.91 A family of maps S(t) : C → C, t ≥ 0, is called a semigroup of nonexpansive maps on C if
(a) S(t + τ) = S(t)S(τ) for all t, τ ≥ 0.
(b) S(0) = I_C.
(c) lim_{t↓0} S(t)x = x for each x ∈ C.
(d) ∥S(t)x − S(t)u∥ ≤ ∥x − u∥ for all t ≥ 0 and all x, u ∈ C.

REMARK 3.2.92 Given a semigroup of nonexpansive maps on C, the function (t, x) ↦ S(t)x is jointly continuous from ℝ₊ × C into C.

The main generation theorem for nonlinear semigroups is the following theorem due to Crandall–Liggett [166], where the interested reader can find its proof (see also Barbu [58, p. 104]).

THEOREM 3.2.93 If A : D(A) ⊆ X → 2^X is an m-accretive operator, then for every x ∈ D(A) the limit S(t)x = lim_{n→∞} (I + (t/n)A)^{−n} x exists, uniformly in t on compact subsets of ℝ₊. The family of maps S(t) : D(A) → D(A), t ≥ 0, is a semigroup of nonexpansive maps and for each x ∈ D(A) and t > 0 we have ∥S(t)x − x∥ ≤ t|A(x)|, where |A(x)| = inf{∥y∥ : y ∈ A(x)}.

Let A : D(A) ⊆ X → 2^X be an m-accretive operator. According to Theorem 3.2.93 it generates a semigroup of nonlinear contractions S(t) : D(A) → D(A), t ≥ 0. Recall that for each λ > 0, J_λ and A_λ are the resolvent and the Yosida approximation of A (i.e., J_λ = (I + λA)^{−1} and A_λ = (1/λ)(I − J_λ); see Definition 3.2.67). The next proposition relates the semigroup S(t) to the resolvent J_λ. Its proof can be found in Vrabie [597, p. 46].

PROPOSITION 3.2.94 If x ∈ D(A), t > 0, and λ > 0, then ∥S(t)x − x∥ ≤ (2 + (t/λ))∥J_λ(x) − x∥ and ∥J_λ(x) − x∥ ≤ (2/t)(1 + (λ/t)) ∫₀ᵗ ∥S(τ)x − x∥ dτ.
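The following sketch is ours and not part of the book; the scalar m-accretive map A(x) = x³ and SciPy's brentq root finder are illustrative assumptions. It approximates the Crandall–Liggett exponential formula of Theorem 3.2.93 by iterating the resolvent J_{t/n} (that is, by n implicit Euler steps) and compares the result with a direct solve of x′(t) + A(x(t)) = 0.

```python
# Illustrative sketch of the exponential formula of Theorem 3.2.93:
# S(t)x0 ~ (I + (t/n)A)^{-n} x0 for A(x) = x**3 on R, compared with a
# direct solve of x'(t) = -x(t)**3, x(0) = x0.
from scipy.optimize import brentq
from scipy.integrate import solve_ivp

def resolvent(lam, x):
    # J_lam(x): unique solution y of y + lam * y**3 = x.
    return brentq(lambda y: y + lam * y ** 3 - x, -abs(x) - 1.0, abs(x) + 1.0)

def crandall_liggett(t, x0, n):
    x = x0
    for _ in range(n):          # n backward Euler steps of size t/n
        x = resolvent(t / n, x)
    return x

x0, t = 2.0, 1.0
exact = solve_ivp(lambda s, x: -x ** 3, (0.0, t), [x0], rtol=1e-10).y[0, -1]
for n in (4, 16, 64, 256):
    print(n, crandall_liggett(t, x0, n), exact)  # converges to S(t)x0 as n grows
```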
REMARK 3.2.95 If X = H is a Hilbert space and A = ∂ϕ with ϕ ∈ Γ₀(H), then the second inequality can be improved in the following way: ∥J_λ(x) − x∥ ≤ (1 + 1/√2)∥S(t)x − x∥ for each x ∈ D(∂ϕ) and t, λ > 0. Moreover, ∥J_t(x) − x∥ ≤ (3/2)∥S(t)x − x∥, with 3/2 being the best possible constant.

DEFINITION 3.2.96 A nonlinear semigroup S(t) : C → C, t ≥ 0, of nonexpansive maps on C ⊆ X is called compact if for every t > 0, S(t) is a compact operator.

REMARK 3.2.97 Recall that S(0) = I_C. So if S(t) is compact for all t ≥ 0, then C is compactly embedded in X. Moreover, if C = X, then X is finite dimensional.

DEFINITION 3.2.98 A nonlinear semigroup of nonexpansive maps S(t) : C → C, t ≥ 0, is said to be equicontinuous if for each bounded set K ⊆ X the family {S(·)x : x ∈ K} is equicontinuous at every t > 0.

PROPOSITION 3.2.99 If S(t) : C → C, t ≥ 0, is a semigroup of nonlinear nonexpansive maps which is compact, then it is equicontinuous.

PROOF: Let K ⊆ X be bounded, t > 0, and choose λ ∈ (0, t). Due to the compactness of the semigroup, S(t − λ)K is totally bounded in X. So given ε > 0, we can find {x_k}_{k=1}^N such that
S(t − λ)K ⊆ ∪_{k=1}^N B_{ε/3}(S(t − λ)x_k),   (3.45)
where B_{ε/3}(S(t − λ)x_k) = {y ∈ X : ∥y − S(t − λ)x_k∥ < ε/3}. Also from the continuity of S(·)x_k we can find 0 < δ = δ(ε, t) < λ such that
∥S(t + h)x_k − S(t)x_k∥ < ε/3   (3.46)
for all k ∈ {1, …, N} and all h ∈ ℝ with |h| ≤ δ. Because of (3.45), for each x ∈ K there exists k ∈ {1, …, N} such that
∥S(t − λ)x − S(t − λ)x_k∥ ≤ ε/3.   (3.47)
Using (3.46), (3.47), and Definition 3.2.91, we obtain
∥S(t + h)x − S(t)x∥ ≤ ∥S(t + h)x − S(t + h)x_k∥ + ∥S(t + h)x_k − S(t)x_k∥ + ∥S(t)x_k − S(t)x∥
= ∥S(λ + h)S(t − λ)x − S(λ + h)S(t − λ)x_k∥ + ∥S(t + h)x_k − S(t)x_k∥ + ∥S(λ)S(t − λ)x_k − S(λ)S(t − λ)x∥
≤ 2∥S(t − λ)x − S(t − λ)x_k∥ + ∥S(t + h)x_k − S(t)x_k∥ ≤ ε
for all x ∈ K and all h ∈ ℝ with |h| ≤ δ. This proves that {S(·)x : x ∈ K} is equicontinuous.

Now we can characterize the compact nonlinear semigroups.
THEOREM 3.2.100 If A : D(A) ⊆ X → 2^X is an m-accretive operator and S(t) : D(A) → D(A), t ≥ 0, is the semigroup generated by A, then the following statements are equivalent.
(a) The semigroup {S(t)}_{t≥0} generated by A is compact.
(b) (b₁) For every λ > 0, J_λ is compact. (b₂) {S(t)}_{t≥0} is equicontinuous.

PROOF: (a)⇒(b): From Proposition 3.2.99 we already know that {S(t)}_{t≥0} is equicontinuous. We have
∥S(t)J_λ(x) − J_λ(x)∥ ≤ t∥A_λ(x)∥ = (t/λ)∥J_λ(x) − x∥.   (3.48)
Because J_λ is nonexpansive, from (3.48) it follows that lim_{t↓0} S(t)J_λ = J_λ uniformly on bounded sets in X. For every t > 0, S(t)J_λ is compact and so from Proposition 3.1.6 we infer that J_λ is compact.

(b)⇒(a): From Proposition 3.2.94, we see that for each λ > 0, t > 0, and x ∈ D(A), we have
∥J_λ(S(t)x) − S(t)x∥ ≤ (4/λ) ∫₀^λ ∥S(t + τ)x − S(t)x∥ dτ.   (3.49)
Due to the equicontinuity of {S(t)}_{t≥0}, from (3.49) it follows that lim_{λ↓0} J_λ S(t) = S(t) uniformly on bounded sets in D(A). But J_λ S(t) is compact. So Proposition 3.1.6 implies that {S(t)}_{t≥0} is compact.

COROLLARY 3.2.101 If A : D(A) ⊆ X → 2^X is a densely defined, linear, m-accretive operator and {S(t)}_{t≥0} is the nonexpansive semigroup generated by A, then the following statements are equivalent.
(a) The semigroup {S(t)}_{t≥0} is compact.
(b) (b₁) For each λ > 0, J_λ is compact. (b₂) t ↦ S(t) is continuous from ℝ₊ into L(X) with the uniform operator topology.

REMARK 3.2.102 Recall the resolvent identity, which says that
J_λ(x) = J_µ((µ/λ)x + ((λ − µ)/λ)J_λ(x))
for each x ∈ X and λ, µ > 0. So J_λ is compact for every λ > 0 if and only if it is compact for some λ > 0. So statement (b₁) is equivalent to saying that (I + A)^{-1} is compact (i.e., (I + A)^{-1} ∈ K(X)).
195
PROPOSITION 3.2.103 If A : D(A) ⊆ X −→ 2X is an m-accretive operator, then the following statements are equivalent. (a) For each λ > 0 Jλ is compact. (b) For each k > 0, the set Lk = {x ∈ D(A) : x + |A(x)| ≤ k} is relatively compact in X. PROOF: (a)⇒(b): Recall that Aλ (x) ≤ |A(x)| ≤ k for all x ∈ Lk and λ > 0. So x − Jλ (x) ≤ λk ⇒ Jλ −→ I
for each x ∈ Lk and λ > 0,
as λ ↓ 0 uniformly on Lk ,
⇒ ILk is compact by Proposition 3.1.6 (i.e., Lk is compact). (b)⇒(a): We have
Jλ (x) + |A Jλ (x) | ≤ Jλ (x) + Aλ (x) (because Aλ (x) ∈ A Jλ (x) ) 1 for all x ∈ X. ≤ Jλ (x) + x − Jλ (x) λ (3.50) If K ⊆ X is bounded, then Jλ (K) is bounded and so from (3.50) it follows that Jλ (K) ⊆ Lk for k > 0 large. Therefore Jλ (K) is relatively compact. Let X = H = a Hilbert space and ϕ ∈ Γ0 (H). In this case we can improve Proposition 3.2.103. First we have a definition. DEFINITION 3.2.104 A convex function ϕ : H −→ R = R ∪ {+∞} is said to be of compact type if for each k > 0, the level set Lk = {x ∈ H : x2 + ϕ(x) ≤ k} is relatively compact in H. REMARK 3.2.105 If ϕ ∈ Γ0 (H), then Lk is compact for every k > 0. Then we have the following improvement of Proposition 3.2.103. The proof of this result can be found in Vrabie [597, p. 58]. PROPOSITION 3.2.106 If H is a Hilbert space, ϕ ∈ Γ0 (H), and A = ∂ϕ, then the following statements are equivalent. (a) For each λ > 0 Jλ is compact. (b) ϕ is of compact type. (c) The semigroup generated by A on D(A) is compact.
196
3 Nonlinear Operators and Fixed Points
EXAMPLE 3.2.107 Let Z ⊆RN be a bounded domain with C 2 -boundary ∂Z, j∈ Γ0 (R), j ≥ 0, j(0) = 0, p ≥ 2, and λ > 0. We consider the function ϕλ : L2 (Z) −→ R+ = R+ ∪ {+∞} defined by ⎧
p p 1 λ ⎪ if x ∈ W 1,p (Z), ⎨ p Dxp + p xp + ∂Z j x(z) dσ
ϕλ (x) = j x(·) ∈ L1 (∂Z) ⎪ ⎩ +∞ otherwise.
2 Then ϕλ ∈ Γ0 L (Z) and it is of compact type. Note that ∂ϕλ (x) = −p x + λ|x|p−2 x
for all x ∈ D(∂ϕλ ) = x ∈ W 1,p (Z) : ∂x ∂n (z) ∈ ∂j x(z) a.e. on ∂Z .
3.3 Degree Theory

Degree theory is a basic tool of nonlinear analysis and produces powerful existence and multiplicity results for nonlinear boundary value problems. It concerns operator equations of the form ϕ(x) = y₀, where ϕ is a map (often continuous) of U̅, the closure of an open set U of the domain space X, into the range space Y, and y₀ ∈ Y satisfies y₀ ∉ ϕ(∂U). Then the degree of ϕ at y₀ relative to U, written d(ϕ, U, y₀), is an algebraic count of the number of solutions of the equation ϕ(x) = y₀. In particular the equation ϕ(x) = y₀ has solutions in U whenever d(ϕ, U, y₀) ≠ 0. Here we deal with "classical" degree theories; that is, the value of the degree map is an integer. This integer can be positive or negative, and in the context of finite-dimensional problems positive counts (positive degree) correspond to solutions at which ϕ is orientation preserving, whereas negative counts correspond to solutions at which ϕ is not orientation preserving. We start with a brief analytical presentation of Brouwer's degree theory, which is the first such theory and was introduced by Brouwer in 1912.

DEFINITION 3.3.1 Let U ⊆ ℝᴺ be a nonempty, bounded open set and ϕ ∈ C¹(U̅, ℝᴺ). We say that x ∈ U is a critical point of ϕ if
J_ϕ(x) = det ∇ϕ(x) = 0,
where ∇ϕ(x) = (∂ϕ_i/∂x_j(x))_{i,j=1}^N. We set C_ϕ = {x ∈ U : J_ϕ(x) = 0}, and ϕ(C_ϕ) is called the crease of ϕ. If y ∉ ϕ(C_ϕ), then y is a regular value of ϕ.

Now we can give the first definition of degree.

DEFINITION 3.3.2 If ϕ ∈ C¹(U̅, ℝᴺ) and y ∉ [ϕ(C_ϕ) ∪ ϕ(∂U)], then the degree of ϕ at y with respect to U is defined by
d(ϕ, U, y) = Σ_{x∈ϕ^{-1}(y)} sgn J_ϕ(x).
REMARK 3.3.3 Here sgn u = 1 if u > 0 and sgn u = −1 if u < 0. Note that because y is not a critical value, the set ϕ−1 (y) is discrete and so the summation in Definition 3.3.2 is finite. The determinant Jϕ (x) is positive or negative accordingly as ϕ is orientation preserving or orientation reversing at x.
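As a small illustration of Definition 3.3.2 — a sketch of ours, not from the book; the planar map ϕ(x) = (x₁² − x₂², 2x₁x₂) and the grid-based root search via SciPy are illustrative assumptions — one can compute d(ϕ, U, y) at a regular value y by locating the solutions of ϕ(x) = y numerically and summing the signs of the Jacobian determinants:

```python
# Illustrative Brouwer degree (Definition 3.3.2) for
# phi(x1, x2) = (x1**2 - x2**2, 2*x1*x2) on U = open ball of radius 2,
# at the regular value y = (1, 0); the two preimages give degree 2.
import numpy as np
from scipy.optimize import fsolve

def phi(x):
    return np.array([x[0] ** 2 - x[1] ** 2, 2.0 * x[0] * x[1]])

def jac_det(x):
    return 4.0 * (x[0] ** 2 + x[1] ** 2)   # det of the Jacobian of phi

y = np.array([1.0, 0.0])
roots = []
grid = np.linspace(-1.5, 1.5, 6)           # initial guesses away from the origin
for guess in [(a, b) for a in grid for b in grid]:
    x = fsolve(lambda x: phi(x) - y, guess)
    if np.linalg.norm(phi(x) - y) < 1e-8 and np.linalg.norm(x) < 2.0:
        if not any(np.linalg.norm(x - r) < 1e-6 for r in roots):
            roots.append(x)

degree = sum(int(np.sign(jac_det(r))) for r in roots)
print(roots, degree)   # preimages (1,0) and (-1,0); d(phi, U, y) = 2
```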
Next the goal is to remove from Definition 3.3.2 the restrictions that y is a regular value of ϕ and that ϕ ∈ C¹(U̅, ℝᴺ). The first can be removed with the help of the so-called Sard's theorem.

THEOREM 3.3.4 (Sard) If ϕ ∈ C¹(U̅, ℝᴺ), then ϕ(C_ϕ) is a set of measure zero.

We use Theorem 3.3.4 and suitable mollification techniques to replace the sum in Definition 3.3.2 by an appropriate integral. First let us recall the notion of mollifier.

DEFINITION 3.3.5 Consider the function ψ₁ : ℝᴺ → ℝ defined by
ψ₁(x) = c exp(1/(∥x∥² − 1)) if ∥x∥ < 1, and ψ₁(x) = 0 otherwise,
with c > 0 chosen so that ∫_{ℝᴺ} ψ₁(x) dx = 1. For ε > 0, we define ψ_ε(x) = (1/εᴺ)ψ₁(x/ε). We have ψ_ε ∈ C^∞(ℝᴺ), ∫_{ℝᴺ} ψ_ε(x) dx = 1, and supp ψ_ε = B̄_ε(0) = {x ∈ ℝᴺ : ∥x∥ ≤ ε}. The family of functions {ψ_ε}_{ε>0} is called mollifiers.

REMARK 3.3.6 If U ⊆ ℝᴺ is open and f ∈ L¹_loc(U), then we define f_ε = ψ_ε * f = ∫_U ψ_ε(x − y)f(y) dy for all x ∈ U_ε = {x ∈ U : d(x, ∂U) > ε}. Evidently f_ε ∈ C^∞(U_ε), and if f ∈ C(U) we have f_ε → f uniformly on compact subsets of U. Also if f ∈ L^p_loc(U) for some 1 ≤ p < ∞, then f_ε → f in L^p_loc(U).
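The following short sketch is ours and not from the text; the one-dimensional setting, the step function, and the NumPy quadrature are illustrative choices. It builds the mollifier ψ_ε of Definition 3.3.5 and smooths a discontinuous f by the convolution f_ε = ψ_ε * f of Remark 3.3.6.

```python
# Illustrative 1-D mollifier psi_eps and mollification f_eps = psi_eps * f.
import numpy as np

def psi1(x):
    # C^infty bump supported in [-1, 1]; normalized below so its integral is 1.
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(1.0 / (x[inside] ** 2 - 1.0))
    return out

grid = np.linspace(-1.0, 1.0, 4001)
c = 1.0 / np.trapz(psi1(grid), grid)       # normalization constant

def psi_eps(x, eps):
    return (c / eps) * psi1(np.asarray(x) / eps)

def mollify(f, x, eps, n=2001):
    # f_eps(x) = \int psi_eps(x - y) f(y) dy on a quadrature grid
    y = np.linspace(x - eps, x + eps, n)
    return np.trapz(psi_eps(x - y, eps) * f(y), y)

f = lambda t: np.where(t < 0.0, 0.0, 1.0)            # a step function in L^1_loc
print(mollify(f, 0.3, 0.1), mollify(f, -0.3, 0.1))   # ~1 and ~0, i.e. close to f away from the jump
```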
PROPOSITION 3.3.7 If (U, ϕ, y) are as in Definition 3.3.2 and {ψ_ε}_{ε>0} is a family of mollifiers, then there exists ε₀ = ε₀(y, ϕ) > 0 such that
d(ϕ, U, y) = ∫_U ψ_ε(ϕ(x) − y) J_ϕ(x) dx for all ε ∈ (0, ε₀).

PROOF: If ϕ^{-1}(y) = ∅, then the result is immediate because ψ_ε(ϕ(x) − y) = 0 for all ε < d(y, ϕ(U̅)).
So suppose ϕ^{-1}(y) = {x_k}_{k=1}^m. Then by the inverse function theorem we can find disjoint balls B_r(x_k) such that ϕ restricted to B_r(x_k) is a homeomorphism onto a neighborhood V_k of y and sgn J_ϕ(x) = sgn J_ϕ(x_k) for all x ∈ B_r(x_k). Let B_ϱ(y) ⊆ ∩_{k=1}^m V_k and Ω_k = B_r(x_k) ∩ ϕ^{-1}(B_ϱ(y)). Then
∥ϕ(x) − y∥ ≥ ξ on U̅ \ ∪_{k=1}^m Ω_k for some ξ > 0.
Therefore ε < ξ implies that
∫_U ψ_ε(ϕ(x) − y) J_ϕ(x) dx = Σ_{k=1}^m sgn J_ϕ(x_k) ∫_{Ω_k} ψ_ε(ϕ(x) − y) |J_ϕ(x)| dx.
Because J_ϕ(x) = J_{ϕ−y}(x) and ϕ(Ω_k) − y = B_ϱ(0), by a change of variables we obtain
∫_{Ω_k} ψ_ε(ϕ(x) − y) |J_{ϕ−y}(x)| dx = ∫_{B_ϱ(0)} ψ_ε(x) dx = 1 for ε ∈ (0, min{ξ, ϱ}).
We use this proposition to pass from regular to critical values in the definition of the degree. For this purpose we need the following two lemmata. LEMMA 3.3.8 If ψ ∈ Cc1 (RN ), K = supp ψ, U ⊆ RN a bounded open set and γ : [0, 1] −→ RN a continuous path such that A = {k + γ(t) : k ∈ K, t ∈ [0, 1]} ⊆ U , then we can find u ∈ Cc1 (U ) such that
(div u)(x) = f x − γ(0) − f x − γ(1) . 1 PROOF: First suppose that γ(t) = t¯ x and set Ψ(x) = 0 ψ(x−ϑ¯ x)dϑ, u(x) = x ¯Ψ(x). 1 x) = 0 for Clearly Ψ ∈ C (U ). Also if x ∈ U is such that Ψ(x) = 0, then ψ(x − ϑ¯ some ϑ ∈ [0, 1] and so x ∈ K + ϑ¯ x ⊆ A ⊆ U . Therefore supp Ψ ⊆ A and we have u ∈ Cc1 (U , RN ). Note that div u(x) =
N 0
k=1
=
N 1
N 1 0
k=1
=− 0
1
j=1
∂xj ∂ψ (x − ϑ¯ x) dϑ x ¯k ∂xj ∂xk
∂ψ (x − ϑ¯ x)dϑ x ¯k ∂xk
d ψ(x − ϑ¯ x)dϑ = ψ(x) − ψ(x − x ¯). dϑ
Now we consider a general path
γ. For some s, t ∈ [0, 1] introduce the equivalence relation s ∼ t if and only if ψ x − γ(s) − ψ x − γ(t) = div u(x) for some u ∈ Cc1 (U , RN ). We show that each equivalence class is an open set and this by the compactness of [0, 1] implies that there is only one equivalence class, which is what we want. Set yt = γ(t)
− γ(s), t ∈ [0, 1] and let E be an equivalence class for s. We introduce ψs (x) = ψ x − γ(s) , Ks = supp(ψs ) and β = d(Ks , ∂U ) > 0. We can find ε > 0 such that |t − s| < ε implies that yt < β/2. Therefore Ks = {k + ϑyt : k ∈ Ks , ϑ ∈ [0, 1]} ⊆ U. Then from the first part of the proof, we can find u ∈ Cc1 (U , RN ) such that if |t − s| < ε
div u(x) = ψs (x) − ψs (x − yt ) = ψ x − γ(s) − ψ x − γ(t) . This means that t ∼ s; that is, t ∈ E and so E is open, hence E = [0, 1].
LEMMA 3.3.9 If U ⊆ RN is a bounded open set, ϕ ∈ C 2 (U , RN ), and u ∈ Cc1 (RN , RN ) is such that supp u ∩ ϕ(∂U ) = ∅,
1 N then we can find w ∈ Cc (R , RN ) such that div w(x) = Jϕ (x)div u ϕ(x) . N
PROOF: Let Aij (x) i,j=1 be the matrix of cofactors of ∇ϕ(x). We set wi (x) =
N j=1
uj ϕ(x) Aji (x).
It is easy to see that w ∈ Cc1 (U , RN ). Recalling from the theory of differential N ∂Aji (x) = 0, we obtain that forms that ∂xi i=1
N N
∂ϕk ∂Aji ∂uj
∂wi (x) = (x)uj ϕ(x) + Aji (x) (x) ϕ(x) ∂xi ∂xi ∂xk ∂xi j=1 k=1
N
∂uj
= ϕ(x) δjk Jϕ (x) = Jϕ (x)div u ϕ(x) . ∂xk
j,k=1
The last auxiliary result before extending Definition 3.3.2 to critical values too, says that the definition of degree as it stands at this point (see Definition 3.3.2) is stable with respect to small changes in the function ϕ. For a proof of it we refer to Lloyd [396, p. 4]. LEMMA 3.3.10 If U⊆RN is a bounded open set, ϕ∈C 1 (U , RN ), and y ∈ / ϕ(Cϕ )∪ ϕ(∂U ) , then there exists δ > 0 such that if ϕ1 ∈ C 1 (U , RN ) and ϕ − ϕ1 C 1 ≤ δ, we have y∈ / ϕ1 (Cϕ1 ) ∪ ϕ1 (∂U ) and d(ϕ, U, y) = d(ϕ1 , U, y). The next proposition allows us to admit critical values in Definition 3.3.2. PROPOSITION 3.3.11 If U ⊆ RN is a bounded open set, ϕ ∈ C 1 (U , RN ), Z is a connected component of RN \ ϕ(∂U ), and y1 , y2 ∈ Z \ ϕ(Cϕ ), then d(ϕ, U, y1 ) = d(ϕ, U, y2 ). PROOF: First we suppose that ϕ ∈ C 2 (U , RN ). Recall that an open set in RN is connected if and only if it is path-connected. So we can find a continuous path γ : [0, 1] −→ Z such that γ(0) = y1 , γ(1) = y2 . Let {ψε }ε>0 be a family of mollifiers. By virtue of Proposition 3.3.7, we can find ε0 > 0 such that 0 < ε ≤ ε0 implies that
ψε ϕ(x) − yk Jϕ (x)dx for k = 1, 2. d(ϕ, U, yk ) = U
Let 0 < ε1 < ε0 be such that the ε1 -neighborhood of the path γ is in Z. Set A = {k + γ(t) : k ∈ supp ψε1 , By Lemma 3.3.8, we can find u ∈
Cc1 (Z, RN )
t ∈ [0, 1]} ⊆ Z. such that
div u(x) = ϕε1 (x − y1 ) − ϕε1 (x − y2 ) and supp u ∩ ϕ(∂U ) ⊆ Z ∩ ϕ(∂U ) = ∅. So by Lemma 3.3.9 there exists w ∈ C 1 (U , RN ) such that
div w(x) = div u ϕ(x) Jϕ (x) = ψε1 ϕ(x) − y1 − ψε1 ϕ(x) − y2 Jϕ (x). Using Green’s formula, we conclude that
d(ϕ, U, y1 ) − d(ϕ, U, y2 ) = ψε1 ϕ(x) − y1 − ψε1 ϕ(x) − y2 Jϕ (x)dx
U
= div u ϕ(x) Jϕ (x)dx = div w(x)dx = 0. U
U
Now we remove the restriction that ϕ ∈ C 2 (U , RN ). So if ϕ ∈ C 1 (U , RN ) we can find a sequence {ϕn }n≥1 ⊆ C 2 (U , RN ) such that ϕn − ϕC 1 −→ 0 as n → ∞.
With the path γ being as above, let δ = d1 γ, ϕ(∂U ) > 0. If ϕ − ϕn C 1 < (δ/2), then for x ∈ ∂U and t ∈ [0, 1], we have δ γ(t) − ϕn (x) ≥ γ(t) − ϕ(x) − ϕ(x) − ϕn (x) > . 2 Therefore y1 , y2 are also in the same component of RN \ϕn (∂U ). Then using Lemma 3.3.10 and the first part of the proof, we obtain d(ϕ, U, y1 ) = d(ϕn , U, y1 ) = d(ϕn , U, y2 ) = d(ϕ, U, y2 ). This proposition leads to the following generalization of Definition 3.3.2. DEFINITION 3.3.12 If U ⊆ RN is a bounded open set, ϕ ∈ C 1 (U , RN ), and y∈ / ϕ(∂U ), then we define d(ϕ, U, y) to be d(ϕ, U, y) for any y ∈ / ϕ(Cϕ ) ∪ ϕ(∂U )
such that y − y < d y, ϕ(∂U ) . REMARK 3.3.13 By Sard’s theorem (see Theorem 3.3.4) every ball Br (y) contains vectors y that are not critical values of ϕ. The component of y in RN \ ϕ(∂U ) contains the ball Bd1 (y,ϕ(∂U )) (y). So by Proposition 3.3.11, d(ϕ,
U, y) has the same value for all y that is not a critical value of ϕ and y − y < d y, ϕ(∂U ) . DEFINITION 3.3.14 Let U ⊆ RN be a bounded open set and ϕ0 , ϕ1 ∈ C 1 (U , RN ). A map h : [0, 1] × U −→ RN is a C 1 -homotopy between ϕ0 and ϕ1 , if (a) h(0, ·) = ϕ0 (·) and h(1, ·) = ϕ1 (·). (b) h(t, ·) ∈ C 1 (U , RN ) for every t ∈ [0, 1]. (c) lim h(t, ·) − h(s, ·)C 1 = 0 for every s ∈ [0, 1]. t→s
Using this notion and Sard’s theorem, we can easily establish some basic properties of the degree d(ϕ, U, y) for C 1 -functions. Namely we have the following. THEOREM 3.3.15 If U⊆ RN is bounded open and ϕ ∈ C 1 (U , RN ), then (a) d(ϕ, U, ·) is constant on each connected component of RN \ϕ(∂U ). / ϕ(∂U ) (b) If y ∈ / ϕ(∂U ), then we can find ε > 0 such that if ϕ − ϕC 1 ≤ ε; then y ∈ and d(ϕ, U, y) = d(ϕ, U, y). / h(t, ∂U ) for all (c) If h(t, x) is a C 1 -homotopy between ϕ0 , ϕ1 ∈C 1 (U , RN ) and y ∈ t ∈ [0, 1], then d(ϕ0 , U, y) = d(ϕ1 , U, y). (d) If y ∈ / ϕ(∂U ), then d(ϕ + z0 , U, y + z0 ) = d(ϕ, U, y) for all z0 ∈ RN .
Property (c) (the homotopy invariance property) with a further approximation enables d(ϕ, U, y) to be defined when ϕ is only continuous and
not continuously differentiable. So let ϕ ∈ C(U , RN ) and y ∈ / ϕ(∂U ). Set β = d y, ϕ(∂U ) > 0. We can find ϕ0 , ϕ1 ∈ C 1 (U , RN ) such that ϕ − ϕk C 1 < β, k = 0, 1. We consider the C 1 -homotopy h(t, x) = tϕ0 (x) + (1 − t)ϕ1 (x)
for all t ∈ [0, 1] and all x ∈ U .
We have
h(t, x) − ϕ(x) = t ϕ0 (x) − ϕ(x) + (1 − t) ϕ1 (x) − ϕ(x) < β. So if x ∈ ∂U , then y − h(t, x) ≥ y − ϕ(x) − ϕ(x) − h(t, x) > 0 ⇒ y∈ / h(t, ∂U )
for all t ∈ [0, 1],
for all t ∈ [0, 1].
So by virtue of Theorem 3.3.15(c), we have d(ϕ0 , U, y) = d(ϕ1 , U, y). It follows that d(ϕ, U, y) is the same for all ϕ ∈ C 1 (U , RN ) within β of ϕ. This leads to a further generalization of Definition 3.3.2. DEFINITION 3.3.16 Let U ⊆ RN be a bounded open set, ϕ ∈ C(U , RN ), and y ∈ / ϕ(∂U ). We define d(ϕ, U, y) to be equal to d(ϕ, U, y), where ϕ ∈ C 1 (U , RN ) satisfies
ϕ(x) − ϕ(x) < d y, ϕ(∂U ) for all x ∈ U . REMARK 3.3.17 In fact it is easy to check that in the above definition we can choose ϕ ∈ C 1 (U , RN ) such that y ∈ / ϕ(Cϕ ). This way we have concluded the definition of Brouwer’s degree which has been defined for the class C(U , RN ). In the next theorem we summarize the basic properties of Brouwer’s degree. THEOREM 3.3.18 If U ⊆ RN is a bounded open set, ϕ ∈ C(U , RN ), and y ∈ / ϕ(∂U ), then (a) d(I, U, y) = 1 for all y ∈ U . (b) Additivity with respect to the domain: If U1 , U2 are disjoint open subsets of U and y ∈ / ϕ U \(U1 ∪ U2 ) , then d(ϕ, U, y) = d(ϕ, U1 , y) + d(ϕ, U2 , y). (c) Homotopy invariance: If h : [0, 1] × U −→RN is a continuous map and y ∈ / h(t, ∂U ) for all t ∈ [0, 1], then d h(t, ·), U, y is independent of t ∈ [0, 1]. (d) Dependence on the boundary values: If ϕ ∈ C(U , RN ) and ϕ∂U = ϕ∂U , then d(ϕ, U, y) = d(ϕ, U, y).
(e) Excision property: If K ⊆ U is closed and y ∈ / ϕ(K), then d(ϕ, U, y) = d(ϕ, U \ K, y).
(f) Continuity with respect to ϕ: If ϕ ∈ C(U ) and ϕ − ϕ∞ < d y, ϕ(∂U ) , then d(ϕ, U, y) is defined and equals d(ϕ, U, y). (g) Existence property: d(ϕ, U, y) = 0 implies that ϕ−1 (y) = ∅. PROOF: (a) Follows at once from Definition 3.3.2. (b) First we show that ∂U1 , ∂U2 ⊆ ∂U . Clearly ∂Uk ⊆ U k ⊆ U for k = 1, 2. So if there is x ∈ ∂U1 with x ∈ / ∂U , then we must have x ∈ U . So x ∈ U2 and we find r > 0 such that Br (x) ⊆ U2 , hence Br (x) ∩ U1 = ∅. This contradicts the fact that x ∈ ∂U1 . So ∂U1 ⊆ ∂U and similarly ∂U2 ⊆ ∂U . We choose ϕ ∈ C 1 (U , RN ) such that
ϕ − ϕ∞ < d y, ϕ(∂U ) and y∈ / ϕ(Cϕ ). Because ∂Uk ⊆ ∂U , it follows that y ∈ / ϕ(∂Uk ) and
ϕ(x) − ϕ(x) < d y, ϕ(∂U ) for all x ∈ U k , k = 1, 2, ⇒ d(ϕ, Uk , y) = d(ϕ, Uk , y), k = 1, 2. So we have d(ϕ, U, y) = d(ϕ, U, y) =
sgn Jϕ (x)
x∈ϕ −1 (y)
=
sgn Jϕ (x) +
x∈ϕ −1 (y)∩U1
sgn Jϕ (x)
x∈ϕ −1 (y)∩U2
= d(ϕ, U1 , y) + d(ϕ, U2 , y) = d(ϕ, U1 , y) + d(ϕ, U2 , y). (c) It follows from Theorem 3.3.15(c) and Definition 3.3.16. (d) Consider the continuous map (homotopy) h(t, x) = tϕ(x) + (1 − t)ϕ(x) for all t ∈ [0, 1] and all x ∈ U . For x ∈ ∂U , we see that h(t, x) = ϕ(x) and so y ∈ / h(t, ∂U ) for all t ∈ [0, 1]. Therefore (c) implies that d(ϕ, U, y) = d(ϕ, U, y). (e) Again we choose ϕ ∈ C 1 (U , RN ) such that
ϕ − ϕ∞ < d y, ϕ(∂U ) , ϕ − ϕ∞ < d y, ϕ(K)
and
y∈ / ϕ(Cϕ ).
Then y ∈ / ϕ(K) and from the choice of ϕ, we have d(ϕ, U, y) = d(ϕ, U, y) = sgn Jϕ (x) = sgn Jϕ (x) x∈ϕ −1 (y)∩U
x∈ϕ −1 (y)∩(U \K)
= d(ϕ, U \K, y) = d(ϕ, U \K, y).
(f) Because ϕ − ϕ∞ < d y, ϕ(∂U ) and y ∈ / ϕ(∂U ), it follows that y ∈ / ϕ(∂U ) and so d(ϕ, U, y) is defined. Take g ∈ C 1 (U , RN ) such that
3.3 Degree Theory
g − ϕ∞ + ϕ − ϕ∞ < d y, ϕ(∂U ) ,
⇒ g − ϕ∞ < d y, ϕ(∂U ) , ⇒ d(g, U, y) = d(ϕ, U, y) (see Definition 3.3.16).
Because d y, ϕ(∂U ) ≤ d y, ϕ(∂U ) + ϕ − ϕ∞ , we have
g − ϕ∞ < d y, ϕ(∂U ) (see (3.51)).
203 (3.51) (3.52)
So once more Definition 3.3.16 implies that d(g, U, y) = d(ϕ, U, y).
Therefore finally we have that if ϕ − ϕ∞ < d y, ϕ(∂U ) , then d(ϕ, U, y) = d(ϕ, U, y)
(3.53)
(see (3.52) and (3.53)).
(g) If y ∈ / ϕ(U ), we take ϕ ∈ C 1 (U , RN ) such that
ϕ − ϕ∞ < d y, ϕ(U ) . Hence y ∈ / ϕ(U ) and so d(ϕ, U, y) = 0. Therefore by Definition 3.3.16 we have d(ϕ, U, y) = 0. From this we infer that y ∈ ϕ(U ) if d(ϕ, U, y) = 0. Next we present some consequences of these properties and of Definition 3.3.16. COROLLARY 3.3.19 If U ⊆ RN is bounded open and ϕ ∈ C(U , RN ), then d(ϕ, U, ·) is constant on the connected components of RN \ϕ(∂U ). PROOF: Let Z be a connected component of RN \ ϕ(∂U ) and let y1 , y2 ∈ Z. As before we can find a continuous path γ : [0, 1] −→ Z such that γ(0) = y1 and γ(1) = y2 . Choose ϕ ∈ C(U , RN ) such that
ϕ − ϕ∞ < d γ([0, 1]), ϕ(∂U ) . By virtue of Definition 3.3.16 we have that d(ϕ, U, yk ) = d(ϕ, U, yk ),
k = 1, 2
and y1 , y2 are in the same component of RN \ ϕ(∂U ). Then the result follows from Theorem 3.3.15(a). / ϕ(∂U ), COROLLARY 3.3.20 If U ⊆RN is bounded open, ϕ∈C(U , RN ), and y ∈ then d(ϕ, U, y) = d(ϕ − v, U, y − v) for all v ∈ RN . PROOF: Let ξ : [0, 1] −→ Z be defined by ξ(t) = d(ϕ − tv, U, y − tv). Note that y − tv ∈ / (ϕ − tv)(∂U ) and so ξ is well defined. Moreover, by virtue of Theorem 3.3.18(f) and Corollary 3.3.19, ξ(·) is continuous. Because it is Z-valued, it is constant. This permits the following improvement of the homotopy invariance property (see Theorem 3.3.18(c)).
COROLLARY 3.3.21 If U⊆RN is bounded open, h : [0, 1]×U −→ R N , y : [0, 1] −→ RN are continuous maps, and y(t) ∈ / h(t, ∂U ) for all t ∈ [0, 1], then d h(t, ·), U, y(t) is independent of t ∈ [0, 1]. PROOF: Consider the continuous map h(t, x) = h(t, x) − y(t)
for all (t, x) ∈ [0, 1] × U .
Then from Corollary 3.3.20 we have
d h(t, ·), U, 0 = d h(t, ·), U, y(t)
for all t ∈ [0, 1].
But from Theorem 3.3.18(c) d h(t, ·),
U, 0 is independent of t ∈ [0, 1]. Hence the same can be said for d h(t, ·), U, y(t) . COROLLARY 3.3.22 If U ⊆ RN is bounded open, ϕ, ϕ ∈ C(U , RN ) and for all x ∈ ∂U , y ∈ / [ ϕ(x), ϕ(x) ], then d(ϕ, U, y) = d(ϕ, U, y). PROOF: Consider the homotopy h(t, x) = tϕ(x) + (1 − t)ϕ(x)
for all (t, x) ∈ [0, 1] × U .
Note that by hypothesis t ϕ(x) − y + (1 − t) ϕ(x) − y = 0 for all t ∈ [0, 1] and all x ∈ ∂U . So h(t, x) = y for all (t, x) ∈ [0, 1] × ∂U. Invoking Theorem 3.3.18(c), we infer that d(ϕ, U, y) = d(ϕ, U, y).
Next we present some useful applications of Brouwer’s degree. Additional applications are presented in Section 3.5. PROPOSITION 3.3.23 If B R = x ∈ RN : x ≤ R , ϕ : B R −→ RN is continuous and ϕ(x) = −λx for all x ∈ ∂BR , λ ≥ 0, then the equation ϕ(x) = 0 has a solution x ∈ BR = {x ∈ RN : x < R}. PROOF: By hypothesis 0 ∈ / ϕ(∂BR ) and so the degree d(ϕ, BR , 0) is well defined. We consider the homotopy h(t, x) = tϕ(x) + (1 − t)x for all (t, x) ∈ [0, 1]×B R . By hypothesis 0 ∈ / h(t, ∂BR ) for all t ∈ [0, 1]. So from the homotopy invariance and the normalization properties of the degree (see Theorem 3.3.18(a) and (c)), d(ϕ, BR , 0) = d(I, BR , 0) = 1. So there exists x0 ∈ BR such that ϕ(x0 ) = 0 (see Theorem 3.3.18(g)).
An interesting consequence of this proposition is the following result.
COROLLARY 3.3.24 If ϕ ∈ C(RN , RN ) is such that ϕ(x), x RN ≥ 0 for all x ∈ ∂BR = {x ∈ RN : x = R}, then we can find x0 ∈ B R = {x ∈ RN : x ≤ R} such that ϕ(x0 ) = 0.
PROOF: If there is x0 ∈ ∂BR such that ϕ(x0 ) = 0, then we are done. So suppose that 0 ∈ / ϕ(∂BR ). Then we cannot have that ϕ(x) = −λx λ ≥ 0 and
for some x ∈ ∂BR or otherwise we contradict the hypothesis that ϕ(x), x RN ≥ 0 for all x ∈ ∂BR . Therefore we can apply Proposition 3.3.23 and conclude that ϕ(x) = 0 has a solution in B R .
COROLLARY 3.3.25 If ϕ ∈ C(RN , RN ) and (ϕ(x), x)RN x −→ +∞ as N N x −→ +∞, then ϕ is surjective; that is, ϕ(R ) = R . PROOF: Let y ∈ RN . Then (ϕ(x) − y, x)RN −→ +∞ as x −→ +∞. x
So we can find R > 0 large such that ϕ(x) − y, x RN ≥ 0 for all x = R. Apply Corollary 3.3.24 to obtain x0 ∈ B R such that ϕ(x0 ) = y. PROPOSITION 3.3.26 If N is an odd integer, U ⊆ RN is a bounded open set with 0 ∈ U , and ϕ : ∂U −→ RN \ {0} is continuous, then we can find x ∈ ∂U and λ = 0 such that ϕ(x) = λx. PROOF: By Tietze’s extension theorem, without any loss of generality we may assume that ϕ ∈ C(RN , RN ). Because N is odd, then d(−I, U, 0) = −1. Suppose that d(ϕ, U, 0) = −1. Then the map h(t, x) = (1 − t)ϕ(x) + t(−x) must have a zero (t0 , x0 ) ∈ (0, 1) × ∂U . Hence ϕ(x0 ) = t0 (1 − t0 ) x0 . On the other hand if d(ϕ, U, 0) = −1, then because d(I, U, 0) = 1, the map h(t, x) = (1 − t)ϕ(x) + tx must have a zero (t0 , x0 ) ∈ (0, 1) × ∂U . Hence ϕ(x0 ) = − t0 (1 − t0 ) x0 . REMARK 3.3.27 In the above proposition the dimension of the space must be odd. If it is even it is no longer true. Consider, for example the map ϕ : R2 −→ R2 defined by ϕ(x1 , x2 ) = (−x2 , x1 ) (rotation by π/2). If U = B1 = {x ∈ RN : x < 1}, then the proposition says that there is no continuous vector field
on S= ∂B1 which is nowhere zero; that is, ϕ : S −→ RN such that ϕ(x) = 0 and ϕ(x), x RN = 0 on S. So you cannot comb a hedgehog. Another useful topological application of Brouwer’s degree is the following theorem known as Borsuk’s theorem. THEOREM 3.3.28 If U ⊆ RN is bounded, open, and symmetric (i.e., U = −U ) with 0 ∈ U and ϕ : U −→ RN is continuous, odd, and 0 ∈ / ϕ(∂U ), then d(ϕ, U, 0) is odd. PROOF: First we assume that ϕ ∈ C 1 (U ) ∩ C(U ) and that det ϕ (0) = 0. We construct an odd function ψ ∈ C 1 (U ) ∩ C(U ) that is sufficiently close to ϕ and for which 0 is a regular value. The construction is done by induction. Let Uk = {x ∈ U : xi = 0 for some i ≤ k} and let ξ ∈ C(R) be odd and satisfy
ξ (0) = 0 and ξ(t) = 0 if and only if t = 0 3 (e.g., let ξ(t) = t ). Let ϕ(x) = 1 ξ(x1 ) ϕ(x) on the bounded open set U1 . Then
by Sard’s theorem (see Theorem 3.3.4) we can find y 1 ∈ / ϕ Cϕ (U1 ) with y 1 as small as we like. So if ψ1 (x) = ϕ(x) − ξ(x1 )y 1 , x ∈ U1 , then ψ1 (x) = ξ(x1 )ϕ (x) for x ∈ U1 such that ψ1 (x) = 0 and so 0 is a regular value of ψ1 . Suppose that we already have an odd function ψk ∈ C 1 (U ) ∩ C(U ) close to ϕ on U such that
0 ∈ ψk Cψk (Uk ) for some k ≤ N . Then define ψk+1 (x) = ψk (x) − ξ(xk+1 )y k+1 with y k+1 small and such that 0 is a regular value of ψk+1 on {x ∈ U : xk+1 = 0}. Evidently ψk+1 ∈ C 1 (U ) ∩ C(U ) is odd and close to ϕ on U . Moreover, if x ∈ Uk+1 and xk+1 = 0, then x ∈ Uk , ψ k+1 (x) = ψk (x), and ψk+1 (x) = ψk (x). Hence det ψk+1 (x) = 0 and so 0 ∈ / ψk+1 Cψk+1 (Uk ) . Continuing this way at the end we
/ ψ {x ∈ obtain ψ = ψN ∈ C 1 (U) ∩ C(U ) which is odd, close to ϕ on U , and 0 ∈ U \ {0} : det ψ (x) = 0} (because UN = U \ {0}). Moreover, ψ (0) = ψ1 (0) = ϕ (0) and so 0 is a regular value of ψ. Using Theorem 3.3.18(f), we have d(ϕ, U, 0) = d(ψ, U, 0) = sgn Jψ (0) + sgn Jψ (x) x∈ψ −1 (0)
x=0
where the sum is even because ψ(x) = 0 if and only if ψ(−x) = 0 and Jψ (·) is even. Therefore d(ϕ, U, 0) is odd, . Now we remove the restriction that ϕ ∈ C 1 (U ) ∩ C(U ).
Indeed if ϕ ∈ C(U ), we approximate ϕ by ψ ∈ C 1 (U ) ∩ C(U ) and let ψ0 (x) = 12 ψ(x) − ψ(−x) (the odd part of ψ). Choose δ > 0 small such that it is not an eigenvalue of ψ0 (0) and set ϕ = ψ0 − δI. Then ϕ ∈ C 1 (U ) ∩ C(U ), it is odd, det ϕ (0) = 0, and ϕ − ϕ∞ can be as small as we like (just choose δ > 0 small). Then by Theorem 3.3.18(f), we have d(ϕ, U, 0) = d(ϕ, U, 0) = odd number. COROLLARY 3.3.29 If U ⊆ RN is bounded, open, and symmetric (i.e., U = −U ) with 0 ∈ U and ϕ : ∂U −→ Rk is continuous with k < N , then there exists x ∈ ∂U such that ϕ(x) = ϕ(−x). PROOF: We proceed by contradiction. So suppose that ϕ(x) = ϕ(−x) for all x ∈ ∂U . Let ψ(x) be any odd extension of ϕ(x) − ϕ(−x) on U and let (N −k)−elements
1 23 4
ψ(x) = ψ(x), 0, . . . , 0
for all x ∈ U.
Then by virtue of Corollary 3.3.19 and Theorem 3.3.28, we have d(ψ, U, y) = d(ψ, U, 0) = 0
for all y ∈ Br (0) with r > 0 small.
But then Theorem 3.3.18 implies that Br (0) ⊆ ψ(U ), a contradiction.
REMARK 3.3.30 In particular Corollary 3.3.29 implies that if U is as above and ϕ : ∂U −→ Rk , k < N , is a continuous odd map, then it must have a zero. COROLLARY 3.3.31 If U ⊆ RN is bounded, open, and symmetric (i.e., U = −U ) with 0 ∈ U and ϕ ∈ C(U , RN ) is such that 0 ∈ / ϕ(∂U ) and ϕ(−x) = λϕ(x) for all λ ≥ 1 and all x ∈ ∂U , then d(ϕ, U, 0) is odd.
PROOF: Let ψ(x) = ϕ(x) − ϕ(−x). Clearly ψ ∈ C(U , RN ) and it is odd. Let h(t, x) = (1 − t)ϕ(x) + tψ(x) = ϕ(x) − tϕ(−x). Evidently, by hypothesis 0 ∈ / h(t, ∂U ) for all t ∈ [0, 1] and so Theorem 3.3.18(c) and Theorem 3.3.28 imply d(ϕ, U, 0) = d(ψ, U, 0) = odd number. Using this corollary, we can prove something about coverings of the boundary ∂U . The result is known as the Ljusternik–Schnirelmann–Borsuk theorem. It is useful in the study of the so-called Ljusternik–Schnirelmann category (see Section 4.2). N THEOREM 3.3.32 IfmU ⊆ R is bounded, open, and symmetric (i.e., U = −U ) with 0 ∈ U and let Ci i=1 be closed subsets of ∂U that cover ∂U and Ci ∩(−Ci ) = ∅ for i = 1, . . . , m, then m ≥ N + 1.
PROOF: We argue indirectly. Suppose m ≤ N . For i ∈ {1, . . . , m−1} let ϕi (x) = 1 if x ∈ Ci and ϕi (x) = −1 if x ∈ −Ci . Also for i ∈ {m, . . . , N } let ϕi (x) = 1 for all x ∈ U . For i ∈ {1, . . . , m − 1} we extend ϕi to a continuous function on all of U . Set ϕ = (ϕi }N i=1 . We claim that ϕ satisfies the hypotheses of Corollary 3.3.31. So we need to show that ϕ(−x) = λϕ(x)
for all λ ≥ 1 and all x ∈ ∂U.
(3.54)
If x ∈ Cm , then x ∈ / −Cm and so x ∈ −Ci for some i ∈ {1, . . . , m − 1}. Therefore we have m−1 !
∂U ⊆ Ci ∩ (−Ci ) . i=1
Let x ∈ ∂U . Then if x ∈ Ci , we have ϕi (x) = 1 and ϕi (−x) = −1, whereas if x ∈ −Ck , we have ϕk (x) = −1 and ϕk (−x) = 1. Therefore ϕi (−x) does not point in the same direction as ϕi (x) and so (3.54) holds. Apply Corollary 3.3.31 to obtain d(ϕ, U, 0) = 0 and so ϕi (x) = 0, for some x ∈ U , a contradiction to the fact that ϕN (x) = 1. N REMARK 3.3.33 So if Ci i=1 is a closed covering of ∂B1 in RN , then at least one of the Ci s must contain antipodal points. A final topological application of Brouwer’s degree is the so-called invariance of domain theorem, which provides sufficient conditions in order for a continuous map to be open. THEOREM 3.3.34 If U ⊆ RN is open and ϕ ∈ C(U, RN ) is locally one-to-one, then ϕ is open. PROOF: We need to show that for every V ⊆ U open, ϕ(V ) is open. Let x0 ∈ V and y0 = ϕ(x0 ). By translating things if necessary, we may assume that x0 = y0 = 0. Choose r > 0 such that Br (0) ⊆ V and ϕB (0) is one-to-one. Define r
3 Nonlinear Operators and Fixed Points 1 t x −ϕ − x h(t, x) = ϕ 1+t 1+t
for all (t, x) ∈ [0, 1] × B r (0).
Clearly h is continuous, h(0, ·) = ϕ, and h(1, x) = ϕ x/2 − − x/2
isodd. Moreover, r (0), we have h(t, x) = 0, then
if for
some (t, x) ∈ [0, 1] × ∂B
ϕ 1 (1 + t) x = ϕ − 1 (1 + t) x and since ϕB (0) is one-to-one, we have 1 (1 + t) x = r
− 1 (1 + t) x, hence x = 0, a contradiction. So
d ϕ, Br (0), 0 = d h(1, ·), Br (0), 0 = 0 (see Theorem 3.3.28),
⇒ d ϕ, Br (0), y = 0 for all y ∈ B (0) for some > 0 (see Corollary 3.3.19),
⇒ B (0) ⊆ ϕ Br (0) ; that is, ϕ(V ) is open and so ϕ is an open map. Before passing to degree theory in infinite dimensional spaces, let us mention the so-called multiplication formula. It concerns the Brouwer degree of composition of maps and is useful in applications because it facilitates the computation of the degree in many concrete situations. The proof of the result is rather lengthy and can be found in Lloyd [396, p. 29]. THEOREM 3.3.35 If U⊆RN is bounded open, ϕ∈C(U , RN ), V is a bounded open N set containing
ϕ(U ), {Dk }k≥1 are the components of V \ ϕ(∂U ), ψ∈C(V , R ), and y∈ / ψ ϕ(∂U ) ∪ ψ(∂V ), then d(ψ ◦ ϕ, U, y) = d(ψ, Dk , y)d(ϕ, U, zk ) where zk is k≥1
any point in Dk . Now we extend the notion of degree to infinite-dimensional Banach spaces. However, because in an infinite-dimensional Banach space a closed and bounded set need not be norm compact, mere continuity of the function ϕ is not enough. Using the theory of compact maps (see Section 3.1), we can have a degree d(ϕ, U, y) of a continuous map ϕ : X −→ X (X an infinite-dimensional Banach space), which has the form ϕ=I −f (3.55) with I being the identity operator on X and f : X −→ X being a compact map (see Definition 3.1.1(a)) (i.e., f ∈ K(X, X)). So let X be a Banach space and U ⊆ X a bounded open set. Also let ϕ ∈ C(U , X) be a map that admits the decomposition (3.55) with f ∈ K(U , X) (i.e., ϕ is a compact perturbation of the identity map). From Propositions 3.1.18 and 3.1.19 we know that ϕ(∂U ) is closed in X. So if y ∈ / ϕ(∂U ), then
ϑ = d y, ϕ(∂U ) > 0. (3.56) Invoking Theorem 3.1.10 we can find fε ∈ Kf (U , X) with ε ≤ ϑ/2 such that f (x) − fε (x) ≤ ε for all x ∈ U . (3.57) Let Yε = span R(fε ) ∪ {y} . This is a finite-dimensional subspace of X that contains the range of fε , R(fε ), and the vector y ∈ X. Let N = dim Yε and choosing a basis of Yε , we can identify it with RN . Note that for all x ∈ ∂(U ∩Yε ) = ∂(U ∩RN ), if ϕε = I − fε , we have
ϕε (x) − y = ϕ(x) + f (x) − fε (x) − y ≥ ϕ(x) − y − f (x) − fε (x) ≥ ϑ −
ϑ ϑ = >0 2 2
(see (3.56) and (3.57)),
⇒ y∈ / ϕε ∂(U ∩ RN ) = (I − fε ) ∂(U ∩ RN ) . This suggests that we can define the degree in an infinite-dimensional Banach space by means of the degree of the appropriate finite-dimensional approximation of ϕ; that is, d(ϕ, U, y) = d(ϕε , U ∩ Yε , y). (3.58) We need some additional work to show that this degree is in fact well defined. LEMMA 3.3.36 If U ⊆ RN +k (k ≥ 1) is a bounded open set, f ∈ C(U , RN ), y ∈ RN and y ∈ / (I + f )(∂U ), then d(I + f, U, y) = d(I + f, U ∩ RN , y). PROOF: First assume that f ∈ C 1 (U , RN ). Let JN +k be the Jacobian matrix in k–entries
R
N +k
1 23 4 of the map I + f where f = (f, 0, . . . , 0). Then from Definition 3.3.2, we have 5 6 JN (x) 0 sgn JN +k (x) = sgn det d(I + f , U, y) = 0 Ik x∈(I+f)−1 (y)
x∈(I+f)−1 (y)
= d(I + f, U ∩ RN , y). Now consider the general case. For i ∈ {N + 1, . . . , N + k}, let fi ≡ 0 and for
N +k i ∈ {1, . . . , N } choose gi ∈ C(U , R) such that if g = gi i=1 , then
f (x) − g(x) < d y, ϕ(∂U )
⇒ ϕ(x) − ψ(x) < d y, ϕ(∂U )
for all x ∈ U , for all x ∈ U ,
(3.59)
where ψ = I + g. Let ψ = ψ U ∩RN .
Because ψ −1 (y) ⊆ U ∩ RN and ψ(Cψ ) has RN -measure zero, by making a small translation on ψ we can guarantee that in addition to (3.59), we have y ∈ / ψ(Cψ ). Then d(ϕ, U, y) = d(ψ, U, y) and so the result of the lemma follows from the first part of the proof.
Using Lemma 3.3.36, we can show that the definition given in (3.58) is correct.
PROPOSITION 3.3.37 For ε > 0 small (ε < ϑ = d y, ϕ(∂U ) ), d(ϕε , U ∩ Yε , y) is independent of the ε-approximation fε ∈ K(U , X) of f . PROOF: First we show that d(ϕε , U ∩ Yε , y) is independent of the basis chosen in Yε . To see this, first assume that ϕε ∈ C 1 (U ∩ Yε , Yε ). Note that at a point the sign of the Jacobian determinant is invariant under a change of a basis in Yε . So the claim follows from Definition 3.3.2. The general case follows as before by approximating with C 1 -functions.
Next using Lemma 3.3.36 we show that d(ϕε , U ∩ Yε , y) stabilizes for ε > 0 small. To this end let ϕδ = I + fδ be the δ-approximation of ϕ and as before. Yδ = span R(fδ ) ∪ {y} . Let Y = span Yε ∪ Yδ }. Then by virtue of Lemma 3.3.36 for ε, δ > 0 small, we have
and
d(ϕε , U ∩ Yε , y) = d(ϕε , U ∩ Y, y)
(3.60)
d(ϕδ , U ∩ Yδ , y) = d(ϕδ , U ∩ Y, y).
(3.61)
Consider the homotopy h(t, x) = tϕε (x) + (1 − t)ϕδ (x) for all (t, x) ∈ [0, 1] × (U ∩ Y ). Evidently
y∈ / h t, ∂(U ∩ Y ) for all t ∈ [0, 1] and so from the homotopy invariance of Brouwer’s degree, we have d(ϕε , U ∩ Y, y) = d(ϕδ , U ∩ Y, y).
(3.62)
Comparing (3.60) through (3.62), we see that d(ϕε , U ∩ Yε , y) = d(ϕδ , U ∩ Yδ , y)
for ε, δ > 0 small.
This proposition leads to the following infinite-dimensional degree map. DEFINITION 3.3.38 Let X be a Banach space, U ⊆ X bounded open, ϕ = I −f / ϕ(∂U ). We introduce fε ∈ Kf (U , X) such that with f ∈ K(U , X), and y ∈
for all x ∈ U . f (x) − fε (x) < d y, ϕ(∂U ) Choose a finite-dimensional subspace Yε of X such that Yε ⊇ fε (U ) and y ∈ Yε . We set Uε = U ∩ Y, ϕε = I Y − fε and we define ε
d(ϕ, U, y) = d(ϕε , Uε , y). REMARK 3.3.39 It is clear from the above definition that U need not be bounded. We only need that U ∩ V is bounded for every finite-dimensional subspace V of Y . So we only need to know that U is finitely bounded. The degree d is known as the Leray–Schauder degree. So from Definition 3.3.38 and the corresponding properties of Brouwer’s degree we obtain the following theorem. THEOREM 3.3.40 If X is a Banach space, U ⊆ X is bounded open, ϕ = I − f with f ∈ K(U , X), and y ∈ / ϕ(∂U ), then (a) d(I, U, y) = 1 for all y ∈ U and d(I, U, y) = 0 if y ∈ / U.
3.3 Degree Theory
211
(b) Additivity with respect to thedomain: If U1 , U2 are disjoint open subsets of U such that y ∈ / ϕ U \ (U1 ∪ U2 ) , then d(I − f, U, y) = d(I − f, U1 , y) + d(I − f, U2 , y). (c) Homotopy invariance: If h : [0, 1] × U −→ X is compact, y : [0, 1] −→ X is con
tinuous and y(t) ∈ / I − h(t, ·) (∂U ) for all t ∈ T , then d I − h(t, ·), U, y(t) is independent of t ∈ [0, 1]. (d) Dependence on the boundary values: If ψ = I − g with g ∈ K(U , X) and ϕ = ψ on ∂U , then d(I − f, U, y) = d(I − g, U, y). (e) Excision property: If K ⊆ U is closed and y ∈ / ϕ(K), then d(ϕ, U, y) = d(ϕ, U \ K, y). (f) Continuity with respect to ϕ: If ψ = I −g with g ∈ K(U , X) and ϕ(x)−ψ(x) <
d y, ϕ(∂U ) for all x ∈ U , then y ∈ / ψ(∂U ) and d(ϕ, U, y) = d(ψ, U, y). (g) Existence property: d(ϕ, U, y) = 0 implies that there exists x ∈ U such that x − f (x) = y. (h) Dependence on y: X \ ϕ(∂U ).
d(ϕ, U, ·) is constant on the connected components of
(i) If u ∈ X, then d(ϕ, U, y) = d(ϕ − u, U, y − u). Next as we did for Brouwer’s degree, we present some topological applications of the Leray–Schauder degree. We start with the infinite-dimensional version of Borsuk’s theorem (see Theorem 3.3.28). THEOREM 3.3.41 If X is a Banach space, U ⊆ X is bounded, open, and symmetric (i.e., U = −U ) with 0 ∈ U ϕ = I − f with f ∈ K(U , X), 0 ∈ / ϕ(∂U ), and ϕ(−x) = λϕ(x) for all λ ≥ 1 and all x ∈ ∂U , then d(I − f, U, 0) is odd and in particular if f ∂U is odd, then d(I − f, U, 0) is odd. PROOF: Consider the compact homotopy h(t, x) =
t 1 f (x) − f (−x) 1+t 1+t
for all (t, x) ∈ [0, 1] × U .
Because by hypothesis ϕ(−x) = λϕ(x) for all λ ≥ 1 and all x ∈ ∂U , it follows that x = h(t, x) for all (t, x) ∈ [0, 1] × ∂U . Thus by virtue of Theorem 3.3.40(c) we have d(I − f, U, 0) = d(I − g, U, 0) where g(x) = h(1, x) = 2 f (x) − f (−x) . Note that g is the odd part of the map f .
We choose g1 ∈ Kf (U , X) such that g1 (x) − g(x) < d 0, ψ(∂U ) where ψ = I − g
and let g2 (x) = 12 g1 (x) − g1 (−x) (the odd part of g1 ). Then g2 ∈ Kf (U , X) is odd and
1
212
3 Nonlinear Operators and Fixed Points sup g2 (x) − g(x) ≤ sup x∈U
x∈U
= sup x∈U
1 1 g1 (x) − g(x) + sup g1 (−x) + g(x) 2 x∈∂U 2 1 1 g1 (x) − g(x) + sup g1 (−x) − g(−x), 2 x∈U 2
(because g is odd)
= sup g1 (x) − g(x) < d 0, ψ(∂U ) . x∈U
Therefore by virtue of Definition 3.3.38, we have d(I − g, U, 0) = d(I − g2 , U ∩ Y2 , 0) = odd
(by Theorem 3.3.28).
Also with a proof similar to that of Theorem 3.3.34, we can have the following invariance of domain theorem. THEOREM 3.3.42 If X is a Banach space, U ⊆ X is open, and ϕ = I − f with f ∈ K(U , X) and it is locally one-to-one, then ϕ is an open map. The next result is an infinite-dimensional generalization of Lemma 3.3.36 and provides one more property of the Leray–Schauder degree, namely the reduction property. THEOREM 3.3.43 If X is a Banach space, U ⊆ X is bounded open, Y is a closed linear subspace of X, f ∈ K(U , X), ϕ = I − f , y ∈ Y ,and y ∈ / ϕ(∂U ), then d(ϕ, U, y) = d(ϕU ∩Y , U ∩ Y, y).
PROOF: By hypothesis ϑ = d y, ϕ(∂U ) > 0. We can find g ∈ Kf (U , X) such that sup g(x) − f (x) < ϑ. x∈U
Let Y1 be a finite-dimensional linear subspace of X such that g(U ) ⊆ Y1 , y ∈ Y , and let U0 = U ∩ Y and U1 = U ∩ Y1 . Because ∂U1 ⊆ ∂U , we have sup g(x) − x∈U 0
f (x) < d∗ y, ϕ(∂U0 ) . Hence by Lemma 3.3.36 and Definition 3.3.38, we have
d(ϕ, U, y) = d (I − g)U , U1 , y = d(ϕU , U0 , y). 1
0
The next theorem, known as Birkoff–Kellogg theorem, is characteristic of infinitedimensional spaces and it is no longer true in a finite-dimensional setting. THEOREM 3.3.44 If X is an infinite-dimensional Banach space, U ⊆ X is bounded open with 0 ∈ U , and f ∈ K(U , X) satisfies inf f (x) : x ∈ ∂U > 0, then there exists x ∈ ∂U such that ϕ(x) = µx for some µ > 0 (i.e., ϕ has an invariant direction on ∂U ).
3.3 Degree Theory
213
PROOF: We argue indirectly. So suppose that µx = ϕ(x) for all µ > 0 and all x ∈ ∂U . Let r = inf ϕ(x) > 0. The set K = ϕ(U ) is compact, and the ball Br/2 (0) x∈∂U
is noncompact (since X is infinite dimensional). So we can find y0 ∈ Br/2 (0) \ K. For each v ∈ Br/2 (0) \ {y0 } we can write uniquely v = (1 − t)y0 + tv with t ∈ (0, 1) and v ∈ ∂Br/2 (0). Then we define ⎧ ⎪ if v = y0 ⎨0 h(v) = tv if v ∈ Br/2 (0)\{y0 } . ⎪ ⎩ v if v ∈ X \Br/2 (0) Evidently this is a homeomorphism on X and on Br/2 (0), which is the identity on X \Br/2 (0) and sends y0 to zero. Let g = h ◦ f . Then g ∈ K(U , X) and g ∂U = f ∂U . Moreover, by hypothesis ϕ(x) = x − f (x) = 0 for all x ∈ ∂U . So by Theorem 3.3.40(d) we have d(ϕ, U, 0) = d(ψ, U, 0), where ϕ = I − f , ψ = I − g. Choose R > 0 large such that U ⊆ BR (0) and let g1 (x) = R g(x) g(x), x ∈ U . Clearly g1 ∈ K(U , X) and g1 (U ) ∩ U = ∅. So d(ψ1 , U, 0) = 0 for ψ1 = I − g1 . Moreover, if h(t, x) = tR g(x) g(x), for all t ∈ [0, 1], x ∈ U , then h is a compact homotopy and so
1 = d I − h(0, ·), U, 0 = d I − h(1, ·), U, 0 = d ψ1 , U, 0 = 0, a contradiction. This proves that f has an invariant direction on ∂U .
EXAMPLE 3.3.45 As we already said Theorem 3.3.44 is no longer true if dim X < +∞. Indeed, let X = RN , U = B1 (0), and f : B 1 (0) −→ RN defined by ϕ(x) = −x. Then all the hypotheses of Theorem 3.3.44 are satisfied, but there is no invariant direction for ϕ on ∂B1 (0). Because if for some µ > 0 and some x ∈ ∂B1 (0) we have µx = ϕ(x), then (µ + 1)x = 0 and so x = 0, a contradiction. Finally there is a multiplication formula for the Leray–Schauder degree analogous to the one for Brouwer’s degree which was stated in Theorem 3.3.35. THEOREM 3.3.46 If X is a Banach space, U ⊆ X is bounded open, f ∈ K(U , X), ϕ = I − f , V is a bounded open set containing ϕ(U ), {Dk }k≥1 are the components of V \ ϕ(∂U ), g ∈ K(V , X), ψ = I − g, and y ∈ / ψ(ϕ(∂U ) ∪ ψ(∂U ), then d(ψ ◦ ϕ, U, y) = d(ψ, Dk , y) d(ϕ, U, zk ) where zk is any point in Dk . k≥1
For Brouwer’s degree we consider the triplets SB = (ϕ, U, y) : U ⊆ RN , ϕ ∈ C(U , RN ), y ∈ / ϕ(∂U ) , whereas for the Leray–Schauder degree the admissible triplets are SLS = (ϕ, U, y) : U ⊆ X bounded open, ϕ = I − f with f ∈ K(U , X), y∈ / ϕ(∂U ) . We look for Z-valued functions on SB and SLS , respectively, that satisfy the normalization, additivity, and homotopy invariance properties.
214
3 Nonlinear Operators and Fixed Points
THEOREM 3.3.47 (a) The Brouwer’s degree is the unique topological degree on SB . (b) The Leray–Schauder degree is the unique topological degree on SLS . Next we extend the Leray–Schauder degree to cover wider classes of maps of the form I − f . We require that f exhibit some kind of set-contracting. For this reason we introduce the following notion. DEFINITION 3.3.48 Let X be a Banach space and B the family of bounded subsets of X. (a) The Kuratowski measure of noncompactness α : B −→ R+ is defined by α(B) = inf{d > 0 : B admits a finite cover by sets of diameter ≤ d}. (b) The Hausdorff or ball measure of noncompactness β : B −→ R+ is defined by β(B) = inf{r > 0 : B admits a finite cover by balls of radius r}. REMARK 3.3.49 It is clear from the above definition that β(B) ≤ α(B) ≤ 2β(B) for all B ∈ B. PROPOSITION 3.3.50 If X is a Banach space and γ : B −→ R+ is either α or β (see Definition 3.3.48), then (a) A ⊆ B ⇒ γ(A) ≤ γ(B). (b) γ(A) = γ(A). (c) γ(A) = 0 if and only if A is compact. (d) γ(A ∪ B) = max{γ(A), γ(B)}, γ(A + B) ≤ γ(A) + γ(B) and γ(tA) = |t|γ(A), t ∈ R: (e) γ(A) = γ(convA). (f) If {An }n≥1 is a decreasing sequence of nonempty closed sets in B such that γ(An ) −→ 0 as n → ∞, then A = An is nonempty and compact. n≥1
PROOF: (a) If A ⊆ B, then any cover of B is also a cover of A and so we have the monotonicity property of γ. N (b) From (a) we have γ(A) ≤ γ(A). If {Un }N n=1 is a cover of A, then {U n }n=1 is a cover of A. So because diam Un = diam U n , directly from Definition 3.3.48 it follows that γ(A) ≤ γ(A). Therefore γ(A) = γ(A).
(c) Clear from Definition 3.3.48 and the definition of compactness. (d) Evidently γ(A ∪ B) ≤ max{γ(A), γ(B)}. The opposite inequality is true by (a), therefore equality must hold. The other two relations are direct consequences of Definition 3.3.48.
3.3 Degree Theory
215
(e) It is enough to show that γ(conv A) ≤ γ(A), because the opposite inequality is N always true by part (a). Let ξ > γ(A) and A ⊆ Cn where diam Cn ≤ ξ if γ = α n=1
and Cn = Bξ (xn ) if γ = β. Since diam(conv Cn ) ≤ ξ and Bξ (xn ) is convex, without any loss of generality we may assume that Cn is convex. Note that N
! Cn ) conv A ⊆ conv C1 ∪ conv n=2
N ! = conv C1 ∪ conv C2 ∪ conv( Cn ) ⊆ . . . . n=3
So it suffices to show that for convex sets C1 , C2 , we have
γ conv(C1 ∪ C2 ) ≤ max γ(C1 ), γ(C1 ) . tC1 + (1 − t)C2 and because C1 − C2 is bounded Note that conv(C1 ∪ C2 )⊆ 0≤t≤1
we can find r > 0 such that C1 − C2 ⊆ B r (0). Given ε > 0, let {tk }m k=1 ⊆ [0, 1] such m
that [0, 1] ⊆ ∪ B rε (tk ) and so k=1
m conv(C1 ∪ C2 ) ⊆ ∪ tk C1 + (1 − tk )C2 + B ε (0) k=1
⇒ γ conv(C1 ∪ C2 ) ≤ max γ(C1 ), γ(C1 ) + 2ε (see part (d)). Let ε ↓ 0 to finish the proof.
(f) Let xn ∈ An , n ≥ 1. Then γ {xn }n≥k ≤ γ(Ak ) −→ 0 as k → ∞. So xn −→ x in X and x ∈ A = ∅. Moreover, γ(A) ≤ γ(An ) −→ 0 as n → ∞, hence γ(A) = 0; that is, A being closed is compact; see part (c). REMARK 3.3.51 Part (f) of the above theorem generalizes a well-known theorem of Cantor which characterizes complete metric spaces. Using the measures of noncompactness, we can introduce some important classes of nonlinear operator that generalize the compact ones. DEFINITION 3.3.52 Let X be a Banach space, C ⊆ X, and f : C −→ X a continuous map. (a) We say that f is γ-Lipschitz if there is a k ≥ 0 such that
γ f (A) ≤ kγ(A) for all A ⊆ X bounded. (b) If f is γ-Lipschitz with constant k < 1, then we say that f is a γ-contraction. The family of such maps is denoted by SC γ (C).
(c) We say that f is γ-condensing if γ f (A) < γ(A) for all A ⊆ C bounded with γ(A) > 0. The family of such maps is denoted by Cγ (C).
216
3 Nonlinear Operators and Fixed Points
REMARK 3.3.53 Evidently γ-Lipschitz and γ-condensing maps are bounded (i.e., map bounded sets to bounded sets). In addition γ-contractions and γcondensing maps are proper (hence closed). Note that SC γ (C) ⊆ Cγ (C) and compact maps are of course γ-Lipschitz with constant k = 0. Let X be a Banach space and U ⊆ X bounded, open. We fix k < 1 and define a degree for the class Σk (U ) = {ϕ = I − f : f ∈ SC γ (U )}. We set D0 = U and define inductively: Dn = conv f (Dn−1 ∩ U )
for all n ≥ 1.
Clearly {Dn }n≥1 is a decreasing sequence of closed convex sets. We set # D= Dn . n≥1
Evidently D is closed and convex. Also using Proposition 3.3.50, we have
γ(Dn ) = γ conv f (Dn−1 ∩ U ) = γ f (Dn−1 ∩ U )
≤ γ f (Dn−1 ) ≤ kγ(Dn−1 ), ⇒ γ(Dn ) ≤ kn γ(D). Because k < 1, we see that γ(Dn ) −→ 0 and so by virtue of Proposition 3.3.50(f), D is nonempty and compact. Using Theorem 3.1.13 we can find f : U −→ D which is continuous and f D∩U = f . Because the range of f is compact and if we assume that 0∈ / ϕ(∂U ), then we can define d(I − f , U, 0). In fact we can show that this degree is actually independent of the extension f of f that we use. Indeed, let f$ : U −→ D be another continuous map such that f$D∩U = f . Consider the compact homotopy h(t, x) = tf (x) + (1 − t)f$(x)
for all t ∈ [0, 1] and all x ∈ U .
Because f and f$ both map U in D, then h(t, x) = x only if x ∈ D. But f = f$ = f on D. So h(t, x) = x only if f (x) = x. Therefore x = h(t, x) for all (t, x) ∈ [0, 1] × ∂U and we can use the homotopy invariance of the Leray–Schauder degree to conclude that d(I − f$, U, 0) = d(I − f , U, 0). This then makes the next definition meaningful. DEFINITION 3.3.54 Let X be a Banach space, U ⊆ X bounded open, ϕ ∈ Σk (U ) (k < 1), and 0 ∈ / ϕ(∂U ). We define the degree dS (ϕ, U, 0) to be the Leray– Schauder degree d( ϕ, U, 0) where ϕ = I − f and f : U −→ D is any extension of f D∩U (see the previous arguments). If y ∈ / ϕ(∂U ), then dS (ϕ, U, y) is defined to be equal to dS (ϕ − y, U, 0). This definition defines a topological degree, in the sense that it satisfies the normalization property (for the identity), the additivity with respect to the domain and the homotopy invariance property. The admissible homotopies in this case are I − h(t, ·), where h ∈ C([0, 1] × U , X) and γ h(t, B) ≤ kγ(B) for all t ∈ [0, 1] and all B ⊆ U . In fact exploiting the uniqueness of the Leray–Schauder degree (see Theorem 3.3.47(c)), we can have the following result.
3.3 Degree Theory
217
THEOREM 3.3.55 The degree dS (ϕ, U, 0) is the unique topological degree defined on Σk (U ). Next, we extend the definition of degree to cover the class Σ(U ) = {ϕ = I − f : f ∈ Cγ (U )}. But this is done easily by approximating the condensing map f with γcontractions. To this end let ϕ ∈ Σ(U ). Then ϕ = I − f with f ∈ Cγ (U ). Suppose that 0 ∈ / ϕ(∂U ) > 0. Set ϑ = d 0, ϕ(∂U ) > 0 (note that ϕ(∂U ) is a closed set because ϕ is a closed map; see Remark 3.3.53). We choose f0 ∈ SC γ (U ) such that f (x) − f0 (x) < ϑ/3 for all x ∈ U . Such a choice is possible, because we can take f0 = λf where λ ∈ (0, 1) with 1−λ sufficiently small. Let ϕ0 = I −f0 . Then ϕ0 (∂U ) is closed and d 0, ϕ0 (∂U ) ≥ 23 ϑ. Suppose that f1 is another element in SC γ (U ) such that f (x) − f1 (x) < ϑ/3 for all x ∈ U . Suppose f0 is a γ-contraction with constant k0 < 1 and f1 is a γ-contraction with constant k1 < 1. If k = max{k0 , k1 }, then both f0 and f1 are γ-contractions with constant k. Consider the homotopy h(t, x) = tf0 (x) + (1 − t)f1 (x)
for all (t, x) ∈ [0, 1] × U .
If h(t, x) = x with (t, x) ∈ [0, 1] × ∂U , then because f0 (x) − f1 (x) < 23 ϑ for all x ∈ U , we have 2 x − f1 (x) < ϑ, 3 a contradiction to the fact that x − f1 (x) ≥ 23 ϑ. So x = h(t, x) for all (t, x) ∈ [0, 1] × ∂U . Invoking the homotopy invariance of the degree dS , we have dS (I − f0 , U, 0) = dS (I − f1 , U, 0). This then justifies the following definition. DEFINITION 3.3.56 Let X be a Banach space, U ⊆ X bounded open, ϕ ∈ / ϕ(∂U ). We define the degree d0 (ϕ, U, 0) to be equal to dS (I − f , U, 0) Σ(U ), and 0 ∈ where f is any γ-contraction such that f (x) − f (x) < δ for all x ∈ U , where δ > 0 is small enough. If y ∈ / ϕ(∂U ), then we define d0 (ϕ, U, y) = d0 (ϕ − y, U, 0). REMARK 3.3.57 In this definition we may take δ =
1 3
d 0, ϕ(∂U ) .
Again this definition defines a topological degree, in the sense that it satisfies the normalization property (for the identity), the additivity with respect to the domain, and the homotopy invariance property. The admissible homotopies in this case are I − h(t, ·), where h ∈ C([0, 1] × U , X) and γ h(t, B) < γ(B) for all t ∈ [0, 1] and all B ⊆ U . Moreover, Theorem 3.3.55 leads to the following result. THEOREM 3.3.58 The degree d0 (ϕ, U, 0) is the unique topological degree on Σ(U ). REMARK 3.3.59 The degree d0 is usually called in the literature the Nussbaum– Sadovskii degree.
218
3 Nonlinear Operators and Fixed Points
It is clear from the presentation of the Nussbaum–Sadovskii degree, that the basic task of finding degree theorems more general than the Leray–Schauder degree, is therefore in part a problem of characterizing suitable classes of nonlinear maps that have enough structure and are sufficiently narrowed down from the broad class of all continuous maps so that a degree theory can exist. In the sequel we focus on maps f : U ⊆ X −→ X ∗ , where X is a reflexive Banach space and X ∗ is its topological dual. Such an extension of degree theory permits the introduction of degree-theoretic methods to nonlinear boundary value problems. First we present an extension of the degree to operators of type (S)+ (see Definition 3.2.61(d)). Note that if X = X ∗ is a Hilbert space, the mappings of type (S)+ contain as a special case the Leray–Schauder maps I − f , with f compact. This follows from the fact that each strongly monotone map is of type (S)+ and the class (S)+ is closed under compact perturbations. So let X be a reflexive Banach space with X ∗ being its dual and ·, · denoting the duality brackets for the pair (X, X ∗ ). Let U ⊆ X be bounded open and f : U −→ X ∗ a demicontinuous map of type (S)+ (see Definition 3.2.61(d)). Let {Xα }α∈J be the family of all finite-dimensional subspaces of X such that Uα = U ∩ Xα = ∅. We order J by inclusion. Also by iα we denote the injection of Xα into X and by i∗α the corresponding projection of X ∗ onto Xα∗ . In addition in order to simplify things, we assume that the reference point y ∗ ∈ X ∗ is the origin. We can always do this without any loss of generality, because we can replace f by f − y ∗ . We consider fα the Galerkin approximation of f with respect to Xα ; that is, fα (x), yXα = f (x), y
for all x ∈ U ∩ Xα and all y ∈ Xα .
Here by ·, ·Xα we denote the duality brackets for the pair (Xα , Xα∗ ). Evidently fα = i∗α ◦ f ◦ iα . An essential tool in producing a degree for (S)+ –maps is the following result due to Browder [122]. PROPOSITION 3.3.60 If Y is a finite-dimensional subspace of X, a degree map deg(S)+ (·) is defined on the demicontinuous (S)+ -maps defined on the closures of bounded open sets U in X with values in X ∗ , the degree deg(S)+ is invariant under affine homotopies (i.e., homotopies h(t, x) = tf0 (x) + (1 − t)f1 (x) with f0 , f1 : U −→ X ∗ demicontinuous, (S)+ -maps) and it is normalized by a duality map F of type (S)+ ; if f : U −→ X ∗ is such a demicontinuous, (S)+ -map, and fY is the Galerkin approximation of f with respect to Y , suppose that deg(S)+ (f, U, 0) is not defined or d(fY , U ∩ Y, 0) is not defined or both are well defined but deg(S)+ (f, U, 0) = d(fY , U ∩ Y, 0), then there exists x ∈ ∂U such that f (x), x ≤ 0 and f (x), y = 0 for all y ∈ Y . REMARK 3.3.61 By virtue of Theorem 3.2.25, we can always equivalently renorm X so that both X and X ∗ are locally uniformly convex, in which case the corresponding duality map F is of type (S)+ . Using Proposition 3.3.60 we can make the decisive step towards the stated goal of extending degree theory to demicontinuous (S)+ -maps from the closure of a bounded open set U ⊆ X in a reflexive Banach space X into its dual X ∗ .
3.3 Degree Theory
219
PROPOSITION 3.3.62 If {f, U, X} are as above and 0 ∈ / f (∂U ), then there exists α0 ∈ J such that for all α ≥ α0 (i.e., Xα ⊇ Xα0 ), we have 0 ∈ / fα (∂Uα ) and d(fα , Uα , 0) is independent of α ∈ J, α ≥ α0 . PROOF: We argue indirectly. So suppose that for each α ∈ J, we can find β ≥ α (i.e., Xβ ⊇ Xα ) such that d(fβ , Uβ , 0) = d(fα , Uα , 0) or one of the degrees is not defined. In all these cases we can apply Proposition 3.3.60 with X replaced by Xβ and Y = Xα . Evidently fα is the Galerkin approximation of fβ with respect to Xα . Both Xβ and Xα have degree functions defined (Brouwer’s degree because they are finite-dimensional), thus we can apply Proposition 3.3.60 and obtain x ∈ ∂Uβ such that f (x), x = fβ (x), xXβ ≤ 0
and
f (x), y = fβ (x), yXβ = 0
for all y ∈ Xα . We set Vα = x ∈ ∂U : f (x), x ≤ 0 and f (x), y = 0
for all y ∈ Xα .
Evidently if α1 ≤ α2 , then Vα2 ⊆ Vα1 . So the family {Vα }α∈J is contained in w a fixed bounded set ∂U and hasthe finite intersection property. The set V α is w w weakly compact in X and also V α = ∅. Let x0 ∈ V α and y ∈ X. We α∈J
α∈J
choose a finite-dimensional subspace Xα of X that contains both x0 and y. Because w x0 ∈ V α and the space X is reflexive, we can find a sequence {vn }n≥1 ⊆ Vα such that w orski–Papageorgiou [194, p. 306]). By definition vn −→ x0 in X (see Denkowski–Mig´ f (vn ), vn ≤ 0
and
f (vn ), y = 0
for all y ∈ Xα and all n ≥ 1.
So it follows that lim sup f (vn ), vn − x0 ≤ 0. n→∞
Because f is of type (S)+ , it follows that vn −→ x0 in X and x0 ∈ ∂U . Due to w the demicontinuity of f we have f (vn ) −→ f (x0 ) in X ∗ . Hence f (x0 ), y = 0 for all y ∈ X, so f (x0 ) = 0, a contradiction to the hypothesis that 0 ∈ / f (∂U ). Let X be a reflexive Banach space and consider the family S(S)+ = {(f, U, y ∗ ) : U ⊆ X bounded, open, f : U −→ X ∗ demicontinuous, / f (∂U )}. (S)+ , y ∗ ∈
Proposition 3.3.62 leads to a degree map defined on S(S)+ . DEFINITION 3.3.63 We can define a degree map on S(S)+ by setting deg(S)+ (f, U, y ∗ ) = d(fα , U ∩ Xα , y ∗ )
for Xα sufficiently large.
REMARK 3.3.64 For this definition only we used a general reference point y ∗ . In the sequel we return to our initial simplifying convention that y ∗ = 0.
220
3 Nonlinear Operators and Fixed Points
As admissible homotopies, we could use the affine homotopies h(t, x) = tf0 (x) + (1 − t)f1 (x) with (t, x) ∈ [0, 1] × U and f0 , f1 : U −→ X ∗ demicontinuous (S)+ maps. The degree map deg(S)+ is invariant under these homotopies. However, affine homotopies are too weak for some important applications. For this reason we introduce the following broader class of homotopies. DEFINITION 3.3.65 Let X be a reflexive Banach space and U ⊆ X bounded open. The map h : [0, 1] × U −→ X ∗ is said to be homotopy of class (S)+ , if it satisfies the following conditions. For any sequence {tn }n≥1 ⊆ [0, 1] converging to t w and any sequence {xn }n≥1 ⊆ U such that xn −→ x in X for which lim sup h(tn , xn ), xn − x ≤ 0 n→∞
we have that xn −→ x in X and h(tn , xn ) −→ h(t, x) in X ∗ . w
Using this broader class of admissible homotopies, we have the following. THEOREM 3.3.66 The degree map deg(S)+ : S(S)+ −→ Z defined in Definition 3.3.63 is the unique degree map such that (a) deg(S)+ (F , U, y ∗ ) = 1, where F : X −→ X ∗ is the duality map corresponding to an equivalent norm on X for which both X and X ∗ are locally uniformly convex (see Theorem 3.2.25). (b) Additivity with
respect to the domain: If U1 , U2 are disjoint open subsets of U and y ∗ ∈ / f U \ (U1 ∪ U2 ) , then deg(S)+ (F, U, y ∗ ) = deg(S)+ (F, U1 , y ∗ ) + deg(S)+ (F , U2 , y ∗ ). (c) Homotopy invariance: If h(t, x) is a homotopy of type (S)+ and y ∗ : I −→ ∗ X ∗ is continuous such / h(t, ∂U ) for all t ∈ [0, 1], then that y (t) ∈ deg(S)+ h(t, ·), U, y ∗ (t) is independent of t ∈ [0, 1]. REMARK 3.3.67 Of course the degree map deg(S)+ has other properties too, such as the excision property, the solution property, and the dependence on boundary values. Combining the Galerkin approximations technique (see the proof of Proposition 3.3.62) with a corresponding finite-dimensional result (see Amann [18]), we can have the following theorem. THEOREM 3.3.68 If X is a reflexive Banach space, U ⊆ X is open, F : U −→ R is Gˆ ateaux differentiable with f = F : U −→ X ∗ a demicontinuous, (S)+ -map, and there exist a, b ∈ R, a < b, and x0 ∈ X such that (i) V = {F < b} is bounded and V ⊆ U . (ii) If x ∈ {F ≤ a}, then tx + (1 − t)x0 ∈ V for all t ∈ [0, 1]. (iii) f (x) = F (x) = 0 for all x ∈ {a ≤ F ≤ b}, then deg(S)+ (f, V, 0) = 1.
3.3 Degree Theory
221
This theorem has some noteworthy consequences. COROLLARY 3.3.69 If X is a reflexive Banach space, F : X −→ R is a bounded, Gˆ ateaux differentiable functional such that f = F : X −→ X ∗ is of type (S)+ , F (x) −→ +∞ as x −→ +∞, and there exists r0 > 0 such that for all x ≥ r0 we have f (x) = 0, then there exists r1 ≥ r0 such that deg(S)+ f, Br (0), 0 = 1 for all r ≥ r1 . PROOF: Let a = sup{F (x) : x ∈ Br0 (0)} and r1 = sup x : x ∈ {F ≤ a} . Moreover, given r ≥ r1 , let b > sup{F (x) : x ∈ Br (0)}. Then the corollary follows from Theorem 3.3.68 with x0 = 0 and the excision property of the degree (see Remark 3.3.67). COROLLARY 3.3.70 If X is a reflexive Banach space, U ⊆ X is open convex, F : U −→ R is a C 1 -functional on U such that f = F : U −→ X ∗ is of type (S)+ , and F has at x0 and also x0 is an isolated critical point of F , then
a local minimum deg(S)+ f, Br (x0 ), 0 = 1 for some r > 0. PROOF: First we show that F is weakly sequentially lower on U . semicontinuous We argue by contradiction. So suppose that we can find xn , x n≥1 ⊆ U such that w
xn −→ x in X and lim inf F (xn ) < F (x). n→∞ We consider a subsequence xnk k≥1 of {xn }n≥1 such that lim F (xnk ) = lim inf F (xn ). n→∞
k→∞
(3.63)
(3.64)
From the mean value theorem (see Proposition 1.1.6), we can find tnk ∈ (0, 1) such that
F (xnk ) − F (x) = f x + tnk (xnk − x) , xnk − x
⇒ lim sup tnk F (xnk ) − F (x) k→∞
= lim sup f x + tnk (xnk − x) , x + tnk (xnk − x) − x . (3.65) k→∞
From (3.63) and (3.64), we infer that
lim sup tnk F (xnk ) − F (x) ≤ 0.
(3.66)
k→∞
w
Note that x + tnk (xnk − x) −→ x in X. So from (3.65), (3.66), and the fact that f is of type (S)+ , we have that x + tnk (xnk − x) −→ x in X. Because by hypothesis f is continuous, we obtain
lim F (xnk ) − F (x) = lim f x + tnk (xnk − x) , xnk − x = 0, k→∞
k→∞
a contradiction to the choice of xnk k≥1 . So indeed f is weakly sequentially lower semicontinuous on U . By hypothesis x0 is an isolated critical point of F and also a local minimizer, therefore we can find r0 > 0 such that
222
3 Nonlinear Operators and Fixed Points F (x0 ) < F (y)
and
f (y) = F (y) = 0
for all y ∈ Br0 (x0 ) \ {x0 }.
We show that for all r ∈ (0, r0 ), the following is true. inf F (y) : y ∈ Br0 (x0 ) \ Br (x0 ) > F (x0 ).
(3.67)
(3.68)
Suppose that (3.68) is not true. We can find r > 0 and {xn }n≥1 ⊆ Br0 (x0 ) \ Br (x0 ) such that F (xn ) ↓ F (x0 ) as n → ∞. (3.69) w
Due to reflexivity of the space, we may assume that xn −→ x in X. Because F is weakly sequentially lower semicontinuous we have F (x) ≤ lim inf F (xn ) = F (x0 ) (i.e., x = x0 ). n→∞
By the mean value theorem we have x + x . xn + x0 xn − x0 / n 0 F (xn ) − F = f tn xn + (1 − tn ) , 2 2 2 with tn ∈ (0, 1). Passing to the limit as n → ∞ and using the weak sequential lower semicontinuity of F and (3.69), we obtain . / xn + x0 lim sup f tn xn + (1 − tn ) , xn − x0 ≤ 0 2 n→∞ . / xn + x0 xn − x0 ⇒ lim sup f tn xn + (1 − tn ) , tn xn + (1 − tn ) − x0 2 2 n→∞ 1 + t . / xn + x0 n = lim sup ≤ 0. (3.70) f tn xn + (1 − tn ) , xn − x0 2 2 n→∞ Because f is of type (S)+ , from (3.70) it follows that tn xn + (1 − tn )
xn + x0 −→ x0 in X. 2
But
xn + x0 xn − x0 r − x0 = (1 − tn ) ≥ 2 2 2 for all n ≥ 1, a contradiction. So (3.68) is valid. Let b = inf F (x) : x ∈ Br0 (x0 ) \ Br0 /2 (x0 ) − F (x0 ). tn xn + (1 − tn )
From (3.68) we see that b > 0. Also let V = x ∈ Br0 /2 (x0 ) : F (x) − F (x0 ) < µ . Evidently V is nonempty and open. Fix r ∈ (0, r0 /2) such that Br (x0 ) ⊆ V . Choose a ∈ R satisfying 0 < a < inf F (x) : x ∈ Br0 (x0 ) \ Br (x0 ) − F (x0 ). Note that x ∈ Br0 (x0 ) : F (x) − F (x0 ) ≤ α ⊆ Br (x0 ) ⊆ Br (x0 ) ⊆ V ⊆ V ⊆ Br0 (x0 ).
3.3 Degree Theory
223
So we can apply Theorem 3.3.68 with U=Br0 (x0 ), F replaced by F B (x ) −F (x0 ), r0 0 and a, b as above. We obtain deg(S)+ (f, V, 0) = 1. By the excision property of the degree, we conclude that deg(S)+ (f, Br (x0 ), 0) = 1. Finally let us extend the degree to maps of the form f + A, with f bounded, demicontinuous, (S)+ , and A is maximal monotone with (0, 0) ∈ Gr A. We assume that the reflexive Banach space is equipped with a norm such that both X and X ∗ are locally uniformly convex (see Theorem 3.2.25). Let U ⊆ X be a bounded open ∗ set, f : U −→ X ∗ a bounded, demicontinuous, (S)+ -map and A : D(A) ⊆ X −→ 2X a maximal monotone operator such that (0, 0) ∈ Gr A. We assume that y ∗ ∈ / (f + A)(∂U ). We consider the Yosida approximation Aλ (λ > 0) of the operator A (see Definition 3.2.46). We know that Aλ is demicontinuous, monotone everywhere defined, hence it is maximal monotone (see Remark 3.2.47 and Corollary 3.2.11). Then f + Aλ is a bounded, demicontinuous, and (S)+ -map and it can be shown (see Browder [122]) that for λ > 0 small y∗ ∈ / (f + Aλ )(∂U )
and
deg(S)+ (f + Aλ , U, y ∗ ) stabilizes.
(3.71)
So for X a reflexive Banach renormed so that both X and X ∗ are locally uniformly convex, we consider the family SSM = (f + A, U, y ∗ ) : U ⊆ X bounded open, f : U −→ X ∗ bounded, demi∗
continuous, (S)+ , A : D(A) −→ 2X maximal monotone with (0, 0) ∈ Gr A and y ∗ ∈ / (f + A)(∂U ) . Then (3.71) makes the following definition possible. DEFINITION 3.3.71 We can define a degree map on SSM by setting deg(f + A, U, y ∗ ) = deg(S)+ (f + Aλ , U, y ∗ )
for λ > 0 small.
In this case the admissible homotopies are of the form h(t, x) = ft + At (x)
for all (t, x) ∈ [0, 1] × U ,
where {ft }t∈[0,1] is an (S)+ -homotopy (see Definition 3.3.65) and {At }t∈[0,1] is a pseudomonotone homotopy defined below. DEFINITION 3.3.72 Let {At }t∈[0,1] be a family of maximal monotone maps such that (0, 0) ∈ Gr At for all t ∈ [0, 1]. We say that {At }t∈[0,1] is a pseudomonotone homotopy of maximal monotone operators if it satisfies the following mutually equivalent conditions.
224
3 Nonlinear Operators and Fixed Points
(a) If tn → t in [0, 1], xn −→ x in X, x∗n −→ x∗ in X ∗ , (xn , x∗n ) ∈ GrAtn for all n ≥ 1 and lim sup x∗n , xn ≤ x∗ , x , w
w
n→∞
then (x, x∗ ) ∈ Gr At and x∗n , xn −→ x∗ , x. t (b) For every λ > 0 (equivalently for some λ > 0), the map (t, x) −→ JλA (x) is continuous from [0, 1] × X into X (X furnished with the strong topology). (c) For every λ > 0 (equivalently for some λ > 0) and every x ∈ X, the map t −→ t JλA (x) is continuous from [0, 1] to X with the strong topology. (d) If tn −→ t in [0, 1] and (x, x∗ ) ∈ Gr At , then there exist sequences {xn }n≥1 ⊆ w X, {x∗n }n≥1 ⊆ X ∗ such that (xn , x∗n ) ∈ Gr Atn , xn −→ x in X and x∗n −→ x∗ in ∗ X . REMARK 3.3.73 Anticipating some terminology from Section 6.6, we can rewrite Part (d) of the above definition as Gr At ⊆ s − lim inf Gr Atn . In general affine n→∞
homotopies are not pseudomonotone homotopies; that is, if A0 , A1 are two maximal monotone operators from X into X ∗ such that (0, 0) ∈ Gr Ak (k = 0, 1) and D(A0 ) = D(A1 ), then At = tA0 + (1 − t)A1 , t ∈ [0, 1] is not a pseudomonotone homotopy. However, if A0 is a continuous maximal monotone operator with D(A0 ) = X, then {At }t∈[0,1] is a pseudomonotone homotopy. THEOREM 3.3.74 The degree map deg : SSM −→ Z defined in Definition 3.3.71 is the unique degree map such that (a) deg(F, U, y ∗ ) = 1, where F : X −→ X ∗ is the duality map. (b) Additivity with respect to the domain: If U1 , U2 are disjoint open subsets of U
and y ∗ ∈ / (f + A) U \ (U1 ∪ U2 ) , then deg(f + A, U, y ∗ ) = deg(f + A, U1 , y ∗ ) + deg(f + A, U2 , y ∗ ). (c) Homotopy invariance: If {ft }t∈[0,1] is an (S)+ -homotopy such that each ft is bounded, {At }t∈[0,1] is a pseudomonotone homotopy of maximal monotone operators, and y ∗ : [0, 1] −→ X ∗ is a continuousmap such that y ∗ (t) ∈ / (ft + At )(∂U ) for all t ∈ [0, 1], then deg ft + At , U, y ∗ (t) is independent of t ∈ [0, 1]. REMARK 3.3.75 Of course deg has other properties too such as excision property, solution property, and dependence on boundary values.
3.4 Metric Fixed Points Fixed point theory is concerned with the conditions which guarantee that a map ϕ : X −→ X from a topological space X into itself, has one or more fixed points; that is, we can find x ∈ X such that ϕ(x) = x. Fixed point theory is one of the basic tools in nonlinear analysis and in particular in the study of nonlinear boundary value problems. There is an informal classification of fixed point theorems to metric fixed points, topological fixed points, and order fixed points. Metric fixed point theorems are
3.4 Metric Fixed Points
225
always formulated in a metric space setting (usually in a Banach space setting) and the methods involved in their study exploit the metric structure and geometry of the spaces involved together with the metric properties of the maps. The typical representative of this family of fixed point theorems is the celebrated Banach contraction principle. In topological fixed point theorems, the emphasis is on the topological properties of the spaces and maps involved. For this family, the typical representatives are the well-known Brouwer’s fixed point theorem and its infinite-dimensional extension, the Schauder–Tychonov fixed point theorem. Finally there is a third class of fixed point theorems, the so-called order fixed point theorems, in which the order structure of the ambient space (usually an ordered Banach space) and the monotonicity properties of the maps with respect to this order, play a crucial role. Here degree theory, via the so-called fixed point index is a valuable tool. The forerunner of results of this type is the Tarski fixed point theorem. In this section we deal with the metric fixed point theory, starting with the Banach contraction principle. Banach’s fixed point theorem is essentially an abstraction of the Liouville–Picard method of successive approximations to produce the solution of some integral equations. From that moment, because of its simplicity and remarkable power, it has found applications in many different branches of mathematical analysis. As we show, the importance of the theorem, derives from the fact that in addition to the existence of solutions, it provides much other valuable information, such as uniqueness of the solution, stability of the solution under small perturbations of the equation, a convergent approximation method to produce the solution, a priori error estimates and estimates of the rate of convergence of the approximation method, and finally stability of this method. DEFINITION 3.4.1 Let (X, d) be a metric space and ϕ : C ⊆ X −→ X a map (a) We say that ϕ is a k-contraction if and only if
d ϕ(x), ϕ(y) ≤ kd(x, y) for all x, y ∈ C with k ∈ [0, 1). (b) We say that ϕ is nonexpansive if and only if
d ϕ(x), ϕ(y) ≤ d(x, y) for all x, y ∈ C. (c) We say that ϕ is contractive if and only if
d ϕ(x), ϕ(y) < d(x, y) for all x, y ∈ C, x = y.
REMARK 3.4.2 Of course if d ϕ(x), ϕ(y) ≤ kd(x, y) for all x, y ∈ C and some k ≥ 0, we simply say that ϕ is k-Lipschitz continuous (or simply Lipschitz continuous) and k ≥ 0 is the Lipschitz constant of ϕ. If ϕ, ψ : X −→ X are two Lipschitz continuous maps with Lipschitz constants k(ϕ) and k(ψ), respectively, then ψ ◦ ϕ is Lipschitz continuous too with constant k(ψ ◦ ϕ) ≤ k(ψ)k(ϕ). In particular then k(ϕ(n) ) ≤ k(ϕ)n for all n ≥ 1 (recall that ϕ(n) = ϕ ◦ . . . ◦ ϕ n-times). If we are in the setting of a linear space, then k(ϕ + ψ) ≤ k(ϕ) + k(ψ) and k(λϕ) = λk(ϕ) for all λ ≥ 0. The next theorem is the celebrated Banach fixed point theorem (or Banach contraction principle).
226
3 Nonlinear Operators and Fixed Points
THEOREM 3.4.3 If (X, d) is a complete metric space (d denoting the metric of X) and ϕ : X −→ X is a k-contraction, then ϕ has a unique fixed point (i.e., there exists unique x ∈ X such that ϕ(x) = x). PROOF: Choose any point x0 ∈ X and define the sequence xn+1 = ϕ(xn ) n≥0 . This sequence satisfies
d(xn , xn+1 ) = d ϕ(n) (x0 ), ϕ(n+1) (x0 ) ≤ k d ϕ(n−1) (x0 ), ϕ(n) (x0 ) . Hence by induction we obtain d(xn , xn+1 ) ≤ kn d(x0 , x1 ).
(3.72)
If m ≥ n, from the triangle inequality we have
d ϕ(n) (x0 ), ϕ(m) (x0 ) ≤ d ϕ(n) (x0 ), ϕ(n+1) (x0 ) + . . .
+ d ϕ(m−1) (x0 ), ϕ(m) (x0 ) ≤ (kn + · · · + km−1 ) d(x0 , x1 ) kn ≤ d(x0 , x1 ), 1−k
⇒ d(xn , xm ) = d ϕ(n) (x0 ), ϕ(m) (x0 ) −→ 0 as n, m → ∞;
(see (3.72)) (3.73)
that is, {xn }n≥1 ⊆ C is a Cauchy sequence. Because (X, d) is by hypothesis complete, we have x n = ϕ(n)(x0 ) −→ x ∈ X in X. Because ϕ is continuous, we must have ϕ(xn ) = ϕ ϕ(n) (x0 ) −→ ϕ(x), hence xn+1 −→ ϕ(x) and so ϕ(x) = x. Suppose y ∈ C is another fixed point of ϕ. We have
d(x, y) = d ϕ(x), ϕ(y) ≤ kd(x, y),
a contradiction unless d(x, y) = 0; that is, x = y.
Some useful consequence of Theorem 3.4.3 and of its proof is collected in the next proposition. PROPOSITION 3.4.4 If (X, d) is a complete metric space and ϕ: X −→ X is a k-contraction, then (a) The unique fixed point x of ϕ is obtained as lim ϕ(n) (x0 ) for any x0 ∈ X. n→∞
(b) d x, ϕ(n) (x0 ) ≤ kn (1 − k) d x0 , ϕ(x0 ) for all n ≥ 0.
(c) For any x0 ∈ X, d(x0 , x) ≤ 1/(1 − k) d x0 , ϕ(x0 ) . (d) For all n ≥ 0 we have d(xn+1 , x) ≤ k d(xn , x) where xk = ϕ(k) (x0 ) for all k ≥ 0. PROOF: (a) Follows from the proof of Theorem 3.4.3.
(b) If in (3.73) we let m → ∞, we obtain d ϕ(n) (x0 ), x ≤ kn /(1−k) d x0 , ϕ(x0 ) for all n ≥ 0. (c) Let n = 0 in (b).
3.4 Metric Fixed Points
(d) Note that d(xn+1 , x) = d ϕ(xn ), ϕ(x) ≤ k d(xn , x).
227
REMARK 3.4.5 Part (b) gives an a priori error estimate of the successive approximations process starting from x0 . Part (d) determines the rate of convergence of the successive approximations process. There is a parametric version of Theorem 3.4.3 that illustrates the stability of the successive approximations method. PROPOSITION 3.4.6 If (X, d) is a complete metric space, (T, ) is another metric space (the parameter space with its metric), for each t ∈ T ϕt: X −→ X is a k-contraction with k ∈ [0, 1) independent of t ∈ T and for each x ∈ X t −→ ϕt (x) is continuous, then for each t ∈ T there exists a unique xt ∈ X such that ϕt (xt ) = xt and the map t −→ xt is continuous from T into X. PROOF: Theorem 3.4.3 guarantees the existence and uniqueness of xt . If tn → t, we have
d(xtn , xt ) = d ϕtn (xnt ), ϕt (xt ) ≤ d ϕtn (xtn ), ϕtn (xt ) + d ϕtn (xt ), ϕt (xt )
≤ k d(xtn , xt ) + d ϕtn (xt ), ϕt (xt ) ,
1 d ϕtn (xt ), ϕt (xt ) −→ 0 as n → ∞, ⇒ d(xtn , xt ) ≤ 1−k ⇒ t −→ xt is continuous from T into X. There is also a useful local version of Theorem 3.4.3. PROPOSITION 3.4.7 If (X, d) is a complete metric space,
B =Br(x0 )={x∈X : d(x, x0 ) < r} and ϕ: B −→ X is a k-contraction such that d ϕ(x0 ), x0 < (1 − k) r, then ϕ has a fixed point.
PROOF: We choose ε < r such that d ϕ(x0 ), x0 ≤ (1 − k) ε < (1 − k) r. Consider the set C = {x ∈ X : d(x, x0 ) ≤ ε}. We claim that ϕ(C) ⊆ C. We have
d ϕ(x), x0 ≤ d ϕ(x), ϕ(x0 ) + d ϕ(x0 ), x0 ≤ k d(x, x0 ) + (1 − k) ε ≤ ε. Because C ⊆ X is closed and ϕ : C −→ C is a k-contraction, we can apply Theorem 3.4.3 and obtain a fixed point. We can improve Theorem 3.4.3 by assuming that some “power” ϕ(n) of ϕ is a k-contraction. THEOREM 3.4.8 If (X, d) is a complete metric space and ϕ: X −→ X is a map that for some n ≥ 1, ϕ(n) is a k-contraction, then ϕ has a unique fixed point. PROOF: By Theorem 3.4.3 we can find a unique x ∈ X such that ϕ(n) (x) = x. Then
ϕ(x) = ϕ ϕ(n) (x) = ϕ(n+1) (x) = ϕ(n) ϕ(x) , ⇒ ϕ(x) is the unique fixed point of ϕ(n) (·) (i.e., ϕ(x) = x).
228
3 Nonlinear Operators and Fixed Points Suppose y ∈ X is another fixed point of ϕ. Then
y = ϕ(y) = ϕ ϕ(y) = · · · = ϕ(n) ϕ(y) ⇒ y is also a fixed point of ϕ(n) . So by the uniqueness of the fixed point of ϕ(n) , we have y = x.
REMARK 3.4.9 A function ϕ: X −→ X satisfying the hypotheses of Theorem 3.4.8 need not be continuous. To see this, consider the function ϕ : R −→ R defined by ) 1 if x = rational . ϕ(x) = 0 if x = irrational Evidently ϕ is not continuous but ϕ(2) (x) ≡ 1 for all x ∈ R. In the next example we present a continuous map that is not a k-contraction but ϕ(n) is a contraction for some n ≥ 1. EXAMPLE 3.4.10 Let ϕ : C[0, 2] −→ C[0, 2] be defined by
t x(s)ds for all t ∈ [0, 2]. ϕ(x)(t) = 0
It can be shown that ϕ(n) (x)(t) =
1 (n − 1)!
t
(t − s)n−1 x(s)ds,
n ≥ 1.
0
So for n ≥ 1, the map ϕ(n) is a k-contraction, but for n = 1 is not. We can also relax the hypothesis that ϕ is a k-contraction, provided we strengthen the topological structure of (X, d). THEOREM 3.4.11 If (X, d) is a compact metric space and ϕ: X −→ X is a contractive then ϕ has a unique fixed point and for any x0 ∈ X the sequence (n) map, ϕ (x0 ) n≥1 converges to this unique fixed point.
PROOF: Consider the function ϑ : X −→ R+ defined by ϑ(x) = d x, ϕ(x) . Clearly ϑ is continuous and so we can find x ∈ X such that ϑ(x) = inf ϑ. We claim that X
x = ϕ(x). Indeed, if this is not the case, then
ϑ ϕ(x) = d ϕ(x), ϕ(2) (x) < d x, ϕ(x) = ϑ(x), a contradiction to the fact that x ∈ X is a minimizer of ϑ. So x = ϕ(x). Clearly this fixed point is unique.
Next let x0 ∈ X and set ξn = d x, ϕ(n) (x0 ) . We have
ξn+1 = d x, ϕ(n+1) (x0 ) < d x, ϕ(n) (x0 ) = ξn , ⇒ {ξn }n≥1 ⊆ R+ is decreasing.
(3.74)
3.4 Metric Fixed Points
229
So we must have ξn ↓ ξ ≥ 0. Also because ϕ(n) (x0 ) n≥1 ⊆X and X is compact, we may assume (by passing to a subsequence if necessary) that ϕ(n) (x0 ) −→ u ∈ X as n → ∞. Using (3.74) we have
ξ = lim d x, ϕ(n+1) (x0 ) = d x, ϕ(u) n→∞
= d ϕ(x), ϕ(u) < d(x, u) = ξ if ξ > 0, a contradiction. So x = u. By Urysohn’s criterion for convergent sequences, we conclude that the original sequence ϕ(n) (x0 ) n≥1 converges to x. EXAMPLE 3.4.12 The compactness of (X, d) cannot be replaced by completeness and boundedness. To see this let X = {x ∈ C[0, 1] : 0 = x(0) ≤ x(t) ≤ x(1) = 1 for all t ∈[0, 1]} and ϕ : X −→ X be defined by ϕ(x)(t) = tx(t) for all x ∈ X and all t ∈ [0, 1]. The map ϕ is contractive and fixed point free. Next we present a useful consequence of Banach’s fixed point theorem. It is analogous to the invariance of domain results in Section 3.3 (see Theorems 3.3.34 and Theorem 3.3.42). THEOREM 3.4.13 If X is a Banach space, U ⊆ X is nonempty open, ϕ: U −→ X is a k-contraction, and h(x) = x − ϕ(x), then (a) h is an open map (so h(U ) is open in X). (b) h : U −→ h(U ) is a homeomorphism.
PROOF: We show that for any x∈ U if Br (x) ⊆ U , then B(1−k)r h(x) ⊆h Br (x) . To this end let uo ∈ B(1−k)r h(x) and define g : Br (x) −→ X by g(y) = u0 + ϕ(y). Then clearly g is a k-contraction and g(x) − x = u0 + ϕ(x) − x = u0 − h(x) < (1 − k)r. Invoking Proposition 3.4.7 we obtain x0 ∈ Br(x) such that x0 = g(x0 ) = u0 +
ϕ(x0 ), hence u0 = h(x0 ). Therefore B(1−k)r h(x) ⊆ h Br (x) and this proves that h is an open map. In particular h(U ) ⊆ X is open. This proves (a). To prove (b), note that if y, x ∈ U , then h(x) − h(y) ≥ x − y − ϕ(x) − ϕ(y) ≥ (1 − k)x − y, ⇒ h is one–to–one. Therefore h : U −→ h(U ) is a continuous bijection, thus a homeomorphism.
COROLLARY 3.4.14 If (X, d) is a Banach space and ϕ : X −→ X is a kcontraction, then h = I − ϕ has a homeomorphism. PROOF: We need only to show that h is surjective. So given x0 ∈ X let g(x) = x0 + ϕ(x), x ∈ X. Then g is a k-contraction and so by Theorem 3.4.3 we can find x ∈ X such that x = x0 + ϕ(x), hence h(x) = x0 . Before moving to other metric fixed point theorems let us state one more descendant of the Banach contraction principle.
230
3 Nonlinear Operators and Fixed Points
PROPOSITION 3.4.15 If (X, d) is a complete metric space and ϕ : X −→ X is a map such that
1 d ϕ(x), ϕ(y) ≤ kd x, ϕ(x) + kd y, ϕ(y) with k < (3.75) 2 then ϕ has a unique fixed point x0 and ϕ(n) (x) −→ x0 for every x ∈ X. PROOF: The function ϕ need not be continuous (see Example 3.4.20 below). However, if we set ξ(x) = d x, ϕ(x) because of (3.75) we have
ξ ϕ(n+1) (x) ≤ kξ ϕ(n) (x) + kξ ϕ(n+1) (x) , k n+1
ξ(x), ⇒ ξ ϕ(n+1) (x) ≤ 1−k
(n+1) (x) −→ 0 as n → ∞ ⇒ ξ ϕ (n) ⇒ ϕ (x) n≥1 is a Cauchy sequence. Due to the completeness of X, we have ϕ(n) (x) −→ x0 in X. Then
ξ(x0 ) = lim d ϕ(n) (x), ϕ(x0 ) ≤ k lim ξ ϕ(n−1) (x) + kξ(x0 ) = kξ(x0 ), n→∞
⇒ ξ(x0 ) = 0
n→∞
(because k < 1; i.e., x0 = ϕ(x0 )).
Clearly from (3.75) we infer that this fixed point is unique.
In Section 2.4 we proved another metric fixed point theorem under very general hypotheses. This was Caristi’s fixed point theorem (see Theorem 2.4.11). For completeness we repeat here the statement of that theorem which is a far-reaching generalization of the Banach contraction principle. THEOREM 3.4.16 If (X, d) is a complete metric space, ϕ : X −→ R = R∪{+∞} is a proper, lower semicontinuous function that is bounded below and F : X −→ 2X \{∅} is a multifunction such that ϕ(y) ≤ ϕ(x) − d(x, y)
for all x ∈ X and all y ∈ F (x),
then there exists x0 ∈ X such that x0 ∈ F (x0 ). REMARK 3.4.17 In Section 2.4 we proved that Theorem 3.4.16 is in fact equivalent to some basic variational principles of nonlinear analysis, most notable of which is the Ekeland variational principle (see Theorem 2.4.12). Next we present a generalization of Theorem 3.4.16. THEOREM 3.4.18 If (X, d), (Y, ) are complete metric spaces, g : X −→ X, f : X −→ Y is a closed map, and ϕ : f (X) −→ R+ is a lower semicontinuous function such that for each x ∈ X,
d x, g(x) ≤ ϕ f (x) − ϕ f (g(x))
and c f (x), f (g(x)) ≤ ϕ f (x) − ϕ f (g(x)) for some c > 0, (3.76) then g has a fixed point in X.
3.4 Metric Fixed Points
231
PROOF: On X we define a relation ≤ by
and
x≤u if and only if d(x, u) ≤ ϕ f (x) − ϕ f (u)
c f (x), f (u) ≤ ϕ f (x) − ϕ f (u) .
It is easy to verify that ≤ is a partial order on X. We consider a chain {xα }α∈J in X (i.e.,a totally ordered set in (X, ≤) with xα1 ≤ xα2 if and only if α1 α2 in J). Then ϕ f (xα ) α∈J is a decreasing net in R+ . So ϕ f (xα ) ↓ η, η ≥ 0. Given ε > 0, we can find α0 ∈ J such that
η ≤ ϕ f (xα ) ≤ η + ε for all α α0 . So if α0 α β, we have
d(xα , xβ ) ≤ ϕ f (xα ) − ϕ f (xβ ) ≤ ε
and c f (xα ), f (xβ ) ≤ ϕ f (xα ) − ϕ f (xβ ) ≤ ε. Therefore {xα }α∈J ⊆ X and f (xα ) α∈J ⊆ Y are both Cauchy nets. So we can find x ∈ X and y ∈ Y such that xα −→ x in X and f (xα ) −→ y in Y . Since by hypothesis f is closed we have that f (x) = y. Moreover, due to the lower semicontinuity of ϕ, we have ϕ f (x) ≤ η. Also if α β, then
d(xα , xβ ) ≤ ϕ f (xα ) − ϕ f (xβ ) ≤ ϕ f (xα ) − η
and c f (xα ), f (xβ ) ≤ ϕ f (xα ) − η. Taking the limit with respect to β, we obtain
d(xα , x) ≤ ϕ f (xα ) − η ≤ ϕ f (xα ) − ϕ f (x)
and c f (xα ), f (x) ≤ ϕ f (xα ) − ϕ f (x) , ⇒ xα ≤ x
for all α ∈ J.
This proves that every chain in (X, ≤) has an upper bound. So by Zorn’s lemma there exists a maximal element x ∈ X. Then from (3.76) we have x ≤ g(x) and so finally x = g(x). REMARK 3.4.19 If X = Y, f = I, and c = 1, then Theorem 3.4.18 reduces to Theorem 3.4.16. Next we pass to the investigation of fixed points for nonexpansive maps (see Definition 3.4.1(b)). In general a nonexpansive map in a Banach space need not have fixed points. EXAMPLE 3.4.20 Let X be a Banach space and ϕ : X −→ X be defined by ϕ(x) = x + u with u = 0. This map is nonexpansive and clearly fixed point free. Moreover, if ϕ = I (the identity map on X), then we see that the fixed point of a nonexpansive map need not be unique. PROPOSITION 3.4.21 If X is a Banach space, C ⊆ X a nonempty, bounded, closed set, and ϕ : C −→ C a nonexpansive map such that (I − ϕ)(C)⊆X is a closed set, then ϕ has a fixed point.
232
3 Nonlinear Operators and Fixed Points
PROOF: By translating things if necessary, we may assume that 0 ∈ C. Also because by hypothesis C is bounded, we can find R > 0 large such that C ⊆ BR (0). Let λn ↑ 1, λn ∈ (0, 1), n ≥ 1 and introduce ϕn = λn ϕ. For each n ≥ 1 ϕn is a λn -contraction and by Theorem 3.4.3 we can find xn ∈ C such that ϕn (xn ) = xn . We have ϕn (xn ) − xn = ϕn (xn ) − λn ϕ(xn ) ≤ (1 − λn ) R, ⇒
lim ϕn (xn ) − xn = 0,
n→∞
⇒ 0 ∈ (I − ϕ)(C)
(because by hypothesis (I − ϕ)(C) is closed).
So we can find x ∈ C such that ϕ(x) = x.
Using this proposition, we are led to the first general fixed point theorem for nonexpansive maps. THEOREM 3.4.22 If H is a Hilbert space, C ⊆ H is a nonempty, bounded, closed, and convex set, and ϕ: C −→ C is nonexpansive, then ϕ has a fixed point. PROOF: Let r : H −→ C be a retraction map for C (i.e., r is continuous and rC = IC = the identity map on C, in fact r(x) = pC (x) with p(·) being the metric projection to the set C). Then r is nonexpansive and so is ϕ ◦ r. Therefore, we can easily check that I − ϕ ◦ r is continuous, and monotone, hence maximal monotone (see Corollary 3.2.11). Using Proposition 3.2.7 we see that (I −ϕ◦r)(C) = (I −ϕ)(C) is closed. So we can apply Proposition 3.4.21 and establish the existence of a fixed point for ϕ. We can improve this theorem by replacing the Hilbert space H with a general uniformly convex Banach space X. Recall that parallelogram law we verify that a Hilbert space is uniformly convex. THEOREM 3.4.23 If X is a uniformly convex Banach space, C ⊆X is nonempty, bounded, closed, and convex, and ϕ : C −→ C is a nonexpansive map, then ϕ has a fixed point. PROOF: Let S = {K ⊆ C : nonempty, closed, and ϕ-invariant (i.e., ϕ(K) ⊆ K)}. Evidently S = ∅ since C ∈ S. We partially order S by reverse inclusion (i.e. K1 K2 if and only ifK2 ⊆ K1 ). If D is a chain in (S, ≤) (i.e., a totally ordered subset), then the set K is closed, convex ϕ-invariant, and nonempty because all K∈D sets K ∈ D are weakly compact in X (finite intersection property). So K∈S K∈D
and is an upper bound of D. By Zorn’s lemma S has a maximal element K and due to its maximality we have that K = conv ϕ(K). Clearly we finish the proof if we can show that K is a singleton. If K is not a singleton, because K = ∅, we have that r = diam K > 0. Choose x1 , x2 ∈ K such that x1 − x2 ≥ r/2. Let x ∈ K be the midpoint of the segment joining x1 and x2 (i.e., {x = λx1 + (1 − λ)x2 : λ ∈ [0, 1]}). Then x − y is the midpoint of the segment joining x1 − y and x2 − y, y ∈ K and x1 − y < r, x2 − y < r. By virtue of the uniform convexity of the space X, we can find ϑ > 0 such that x−y ≤ (1−ϑ) r = r0 < r. Let K0 = v ∈ K : v−u ≤ r0 . u∈K
3.4 Metric Fixed Points
233
Then K0 is a nonempty, closed, convex subset of K and x ∈ K0 . The set K0 is a strict subset of K since r0 < r = diam K. We show that K0 is ϕ-invariant. To this end let v ∈ K0 and y ∈ K. Given any ε > 0, we can find {wk }N k=1 ⊆ K and {λk }N k=1 ⊆ [0, 1] such that y −
N
λk ϕ(wk ) < ε
with
k=1
N
λk = 1.
k=1
Then we have ϕ(v) − y ≤ ϕ(v) −
N
λk ϕ(wk ) + ε
k=1
≤
N
λk ϕ(v) − ϕ(wk ) + ε
k=1
and
ϕ(v) − ϕ(wk ) ≤ v − wk ≤ r0
(recall v ∈ K0 ).
Therefore it follows that ϕ(v) − y ≤
N
λk r0 + ε = r0 + ε.
k=1
Because ε > 0 was arbitrary, we let ε ↓ 0 and obtain ϕ(v) − y ≤ r0 for all y ∈ K (i.e., ϕ(v) ∈ K0 ). Therefore K0 is a maximal element of S, a contradiction (because K0 is a strict subset of K). This implies that K is a singleton and so we are done. To further generalize this fixed point theorem, we need the following geometrical notion. DEFINITION 3.4.24 Let X be a Banach space and C ⊆ X nonempty. A point x ∈ C is said to be the diametral point of C, if sup{x − y : y ∈ C} = diam C. A convex set C ⊆ X is said to have normal structure, if for each bounded, convex subset K of C with diam K > 0, there exists some x ∈ K that is not a diametral point of K (i.e., there exists x ∈ K such that sup{x − u : u ∈ K} < diam K). REMARK 3.4.25 It is clear from the above definition that sets with normal structure have no convex subsets K which consist entirely of diametral points, except singletons. PROPOSITION 3.4.26 Every compact convex subset C of a Banach space X has normal structure. PROOF: Suppose that K does not have the normal structure. Without any loss of generality we assume that diam C > 0. For every x1 ∈ C there exists x2 ∈ C such that diam C = x2 − x1 . Since C is convex we have 12 (x1 + x2 ) ∈ C. We can find x3 ∈ C such that 1 diam C = x3 − (x1 + x2 ). 2
234
3 Nonlinear Operators and Fixed Points Proceeding this way, we obtain a sequence {xn }n≥1 such that n 1 xk , n ≥ 2, diam C = xn+1 − n k=1
n 1 ⇒ diam C ≤ xn+1 − xk ≤ diam C, n k=1
⇒ diam C = xn+1 − xk for 1 ≤ k ≤ n. So the sequence {xn }n≥1 has no convergent subsequence, a contradiction to the fact that K is compact. PROPOSITION 3.4.27 If X is a uniformly convex Banach space and C ⊆ X is closed, convex, and bounded, then C has normal structure. PROOF: Without any loss of generality, we assume that C ⊆ B 1 (0). Let K be a closed and convex subset of C. Let x1 ∈ K and ε = 12 . We can find x2 ∈ K such x2 − x1 ≥ 12 K1 . Then for any x ∈ K we have x − 1 (x1 + x2 ) = 1 (x − x1 ) + 1 (x − x2 ) 2 2 2
12 diam K ≤ diam K 1 − δ diam K
1 , ≤ diam K 1 − δ 2 where δ(ε) > 0 is the modulus of convexity of X. This proves that C has normal structure (see Definition 3.4.24). DEFINITION 3.4.28 Let X be a Banach space and C ⊆ X a nonempty, bounded, closed, and convex set. We introduce the following geometric quantities. (a) rx (C) = sup{x − y : y ∈ C} = the radius of C relative to x ∈ X. (b) r(C) = inf{rx (C) : x ∈ X} = the Chebyshev radius of C. (c) C0 (C) = {x ∈ X : rx (C) = r(C)} = the Chebyshev center of C. LEMMA 3.4.29 If X is reflexive, then C0 (C) is nonempty and convex. PROOF: Let Cn (x) = {u ∈ C : u − x ≤ r(C) + 1/n}. Evidently Dn = Cn (x) n≥1 is a decreasing sequence of nonempty, closed, and convex sets. Bex∈C cause X is reflexive by Smulian’s theorem, we have that C0 (C)= Dn is nonempty n≥1
and of course convex.
LEMMA 3.4.30 If X is a Banach space, C ⊆ X a closed convex set with diam C > 0, and C has normal structure, then diam C0 (C)
3.4 Metric Fixed Points
235
PROOF: Because by hypothesis C has normal structure, it has at least a nondiametral point x. Then rx (C) < diam C. Let y, v ∈ C0 (C), then y − v ≤ ry (C) = r(C). Therefore
diam C0 (C) = sup{y − v : y, v ∈ C0 (C)} ≤ r(C) ≤ rx (C) < diam C. Now we are ready for the main fixed point theorem for nonexpansive maps. THEOREM 3.4.31 If X is a reflexive Banach space, C ⊆ X is a nonempty, bounded, closed, and convex set with normal structure and ϕ : C −→ C is nonexpansive, then ϕ has a fixed point. PROOF: Let S be the collection of all nonempty, closed, and convex subsets of C, which are mapped to themselves by the map ϕ. From Smulian’s theorem and Zorn’s lemma, as in the proof of Theorem 3.4.23, we infer that S has a maximal element K. We finish the proof by showing that K is a singleton. So suppose that is not true. Then by Lemma 3.4.29 we can take x ∈ C0 (K). We have ϕ(x) − ϕ(y) ≤ x − y ≤ r(K)
⇒ ϕ(K) ⊆ Br(K) ϕ(x) .
for all y ∈ K,
Because ϕ Br(K) ϕ(x) ∩ K ⊆ Br(K) ϕ(x) ∩ K, the minimality of K implies that
K ⊆ B r(K) ϕ(x) . Therefore ϕ(x) ∈ C0 (K); that is, ϕ maps C0 (K) into itself and so C0 (K) ∈ S. If diam K > 0, then by Lemma 3.4.30 we have that C0 (K) is a proper subset of K, a contradiction to the minimality of K. So K is a singleton and so ϕ has a fixed point. Another formulation of the above theorem is the following. THEOREM 3.4.32 If X is a Banach space, C ⊆X is nonempty, weakly compact, and convex with normal structure, and ϕ : C −→ C is nonexpansive, then ϕ has a fixed point. EXAMPLE 3.4.33 In Theorem 3.4.31 the reflexivity of X (or Theorem 3.4.32 the weak compactness of the set C) can not be dropped. Let X = C[0, 1] and let C = {x ∈ X : x(0) = 0, x(1) = 1, 0 ≤ x(t) ≤ 1 for all t ∈ [0, 1]}. Then C is bounded, closed, and convex, but C is not weakly compact in X. Let ϕ : C −→ C be defined by ϕ(x)(t) = tx(t) for all t ∈ [0, 1]. Clearly ϕ is nonexpansive, but has no fixed points. Next we consider a certain class of maps that need not map the set into itself. For this reason, we introduce the following definition.
236
3 Nonlinear Operators and Fixed Points
DEFINITION 3.4.34 (a) Let X be a Banach space, C ⊆ X is nonempty and {xn }n≥1 a bounded sequence in X. The asymptotic center of {xn }n≥1 relative to C is defined by A(C, {xn }) = y ∈ C : lim sup y − xn = inf {lim sup z − xn } . z∈C
n→∞
n→∞
(b) Let X be a Banach space; C ⊆X is nonempty, closed, and convex. Then IC (x) = {(1 − λ)x + λy : λ ≥ 0, y ∈ C} is the inward set of x ∈ C with respect to C. Also a map ϕ : C −→ X is said to be inward if ϕ(x) ∈ IC (x) for all x ∈ C and weakly inward if ϕ(x) ∈ IC (x) for every x ∈ C. REMARK 3.4.35 If we set r(y)=lim sup y − xn , then r(·) is convex and nonexn→∞
pansive. Also, if the space X is uniformly convex and C ⊆ X is nonempty, bounded, closed, and convex, then A(C, {xn }) is a singleton. Finally
it is easy to check that ϕ : C −→ X is weakly inward if and only if lim(1/λ) x + λ ϕ(x) − x , C = 0 for all x ∈ C.
λ↓0
PROPOSITION 3.4.36 If X is a uniformly convex Banach space, C ⊆ X is a nonempty, closed, and convex set, {xn }n≥1 is a bounded sequence in C, and x is the asymptotic center of {xn } with respect to C (see Remark 3.4.35), then x is also the asymptotic center of {xn }n≥1 with respect to IC (x). PROOF: Let v be the asymptotic center of {xn }n≥1 with respect to IC (x) and assume that v = x. Because C ⊆ IC (x), we have v ∈ IC (x) − C and r(v) < r(x) (by the uniqueness of the asymptotic center). Due to the continuity of r (see Remark 3.4.35), we can find u ∈ IC (x)−C such that r(u) < r(x). Therefore u = (1−λ)x + λy for some y ∈ C and λ > 1. Exploiting the convexity of the function r(·) we have 1
1 1 1 r(y) = r + 1− x ≤ r(u) + 1 − r(x) λ λ λ λ
1 1 r(x) = r(x), < r(x) + 1 − λ λ a contradiction to the definition of x (see Definition 3.4.34). Hence v = x.
The following result is due to Caristi [129] and is used in the proof of the fixed point theorem for weakly inward nonexpansive maps. PROPOSITION 3.4.37 If X is a Banach space, C ⊆ X is a nonempty, closed, and convex set, and ϕ : C −→ C is weakly inward and a k-contraction, then ϕ has a unique fixed point. This proposition is used to prove the following fixed point theorem. THEOREM 3.4.38 If X is a uniformly Banach space, C ⊆ X is a nonempty, bounded, closed, and convex set, and ϕ : C −→ C is weakly inward and nonexpansive, then ϕ has a unique fixed point.
3.5 Topological Fixed Points
237
PROOF: Let u ∈ C and define ϕn by ϕn (x) = (1 − λn )u + λn ϕ(x),
0 < λn < 1 and λn −→ 1.
Evidently ϕn is a λn -contraction and so by Proposition 3.4.37 ϕn has a unique fixed point xn . Moreover, 1 1 xn − − 1 u xn − ϕn (xn ) = xn − λn λn 1 − 1 xn − u −→ 0 as n → ∞ = λn (recall that C is bounded). Let x be the asymptotic center with respect to C. We have
r ϕ(x) = lim sup ϕ(x) − xn n→∞
≤ lim sup ϕ(x) − ϕ(xn ) n→∞
≤ lim x − xn = r(x) n→∞
(3.77)
(recall that ϕ is nonexpansive). Because ϕ is also weakly inward, we have ϕ(x) ∈ IC (x). Also from Proposition 3.4.36 we have that x is the asymptotic center with respect to IC (x). So from (3.77) it follows that ϕ(x) = x (recall that the asymptotic center is unique). Moreover, it is clear from the above argument that the fixed point of ϕ is unique. We have one more fixed point theorem for weakly inward mappings. The result is due to Deimling [187], where the interested reader can find its proof (see also Deimling [188, p. 211]). THEOREM 3.4.39 If X is a Banach space, C ⊆ X a nonempty, bounded, closed, and convex set, and ϕ : C −→ X a weakly inward and γ-condensing map (with γ equal to α or β, see Definition 3.3.48), then ϕ has a fixed point. We conclude this section with a structural result concerning the set of fixed points Fix(ϕ) a map ϕ. The result is due to Vidossich [594]. THEOREM 3.4.40 If X is a Banach space, U ⊆ X is bounded open, ϕ : U −→ X is compact and nonexpansive, ϕ(x) = x on ∂U , and there exists x0 ∈ U such that ϕ(x) − x0 = λ(x − x0 ) on ∂U for all λ > 1 (Leray–Schauder condition), then Fix(ϕ) is connected.
3.5 Topological Fixed Points

In this section we present some basic topological fixed point theorems. Here the topological properties of the spaces and/or of the maps involved are crucial; in particular the notion of compactness is important in our considerations. The grandfather of all topological fixed point theorems is, of course, the famous fixed point theorem of Brouwer. To state and prove it in its general form, we need the following topological result.
PROPOSITION 3.5.1 If C ⊆ R^N is a nonempty, bounded, closed, and convex set with nonempty interior, then C is homeomorphic to the closed unit ball B̄1 in R^N.

PROOF: Let x0 ∈ int C and suppose that x ≠ x0. Then the segment [x, x0] = {λx0 + (1 − λ)x : λ ∈ [0, 1]}, or its continuation beyond x, meets the boundary of C in a uniquely determined point ξ(x). We define
ϑ(x) = 0 if x = x0, and ϑ(x) = (x − x0)/‖ξ(x) − x0‖ if x ≠ x0, for x ∈ R^N.
Then ϑ maps C onto B̄1 and is a homeomorphism of R^N onto itself.
REMARK 3.5.2 A convex set C in R^N may have empty interior. However, we always have r int C ≠ ∅, where r int C is the interior of C with respect to span C (the relative interior of C).

Now we can state and prove Brouwer's fixed point theorem.

THEOREM 3.5.3 If U ⊆ R^N is a bounded, open, and convex set and ϕ : Ū −→ Ū is a continuous map, then ϕ has a fixed point.

PROOF: First we assume that U = B1(0) = {x ∈ R^N : ‖x‖ < 1}. Clearly we may assume that ϕ(x) ≠ x for all x ∈ ∂B1(0) = {x ∈ R^N : ‖x‖ = 1}, or otherwise we are done. Consider the continuous homotopy h(t, x) = x − tϕ(x) for all (t, x) ∈ [0, 1] × B̄1(0). For all (t, x) ∈ [0, 1) × ∂B1(0) we have
‖h(t, x)‖ ≥ ‖x‖ − t‖ϕ(x)‖ ≥ 1 − t > 0,
since ‖x‖ = 1 and ‖ϕ(x)‖ ≤ 1, while for t = 1 we have h(1, x) = x − ϕ(x) ≠ 0 on ∂B1(0). Therefore by virtue of the homotopy invariance and normalization properties of Brouwer's degree, we have
d(I − ϕ, U, 0) = d(I, U, 0) = 1,
⇒ x = ϕ(x) for some x ∈ B1(0) (existence property of Brouwer's degree).
Now for the general case, we use Proposition 3.5.1 to find a homeomorphism θ : Ū −→ B̄1. We set ψ = θ ◦ ϕ ◦ θ⁻¹. Clearly ψ : B̄1(0) −→ B̄1(0) is continuous. So from the first part of the proof we obtain u ∈ B̄1(0) such that ψ(u) = u. Let x ∈ Ū be such that u = θ(x). Then
(θ ◦ ϕ)(x) = (θ ◦ ϕ ◦ θ⁻¹)(u) = ψ(u) = u = θ(x).
Because θ is a homeomorphism, we have ϕ(x) = x; that is, x ∈ Ū is a fixed point of ϕ.

DEFINITION 3.5.4 A Hausdorff topological space X has the fixed point property (fpp) if every continuous map ϕ : X −→ X has a fixed point.

REMARK 3.5.5 Theorem 3.5.3 says that every compact convex set in R^N has the fpp. More generally, it is easy to prove that if the space X has the fpp and C is a retract of X, then C has the fpp too. Recall that C is a retract of X if C ⊆ X and there exists a continuous map r : X −→ C such that r|_C = I|_C (the identity map restricted to C). The map r is said to be a retraction.
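Brouwer's theorem is a pure existence statement, but in low dimensions the guaranteed fixed point can simply be hunted down. The sketch below is only an illustration (it assumes NumPy and a hypothetical continuous self-map of the unit square chosen for the example): it scans a grid for the point minimizing ‖ϕ(x) − x‖, which the theorem guarantees can be made arbitrarily small.

```python
import numpy as np

def phi(v):
    # a continuous self-map of the unit square [0,1]^2 (not a contraction);
    # its fixed points are (0,0) and (1,1)
    x, y = v
    return np.array([y, x * x])

# brute-force search: by Theorem 3.5.3 the infimum of ||phi(v) - v|| over
# the square is zero, so a fine grid exposes an (approximate) fixed point
grid = np.linspace(0.0, 1.0, 201)
best, best_val = None, np.inf
for x in grid:
    for y in grid:
        v = np.array([x, y])
        val = np.linalg.norm(phi(v) - v)
        if val < best_val:
            best, best_val = v, val
print(best, best_val)    # e.g. [0. 0.] with residual ~ 0
```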
Next we present two remarkable consequences of Brouwer's fixed point theorem (see Theorem 3.5.3). The first is the so-called Perron–Frobenius theorem from linear algebra.

THEOREM 3.5.6 If A = (a_ij), i, j = 1, ..., N, is an N × N matrix which is positive, that is, a_ij ≥ 0 for all i, j ∈ {1, ..., N}, then A has a nonnegative eigenvalue and a corresponding nonnegative eigenvector; that is, there exist λ ≥ 0 and x ≠ 0 with x_k ≥ 0 for all k ∈ {1, ..., N} such that Ax = λx.

PROOF: Let C = {x = (x_k)_{k=1}^N ∈ R^N : x_k ≥ 0 for all k and Σ_{k=1}^N x_k = 1}. If for some x ∈ C we have Ax = 0, then the theorem holds with λ = 0. If Ax ≠ 0 on C, then we can find ξ > 0 such that Σ_{k=1}^N (Ax)_k ≥ ξ for all x ∈ C. Then the map
ϕ(x) = Ax / Σ_{k=1}^N (Ax)_k,   x ∈ C,
is continuous and ϕ(C) ⊆ C (because a_ij ≥ 0). So we can apply Theorem 3.5.3 (see also Remark 3.5.5) and obtain x0 ∈ C such that ϕ(x0) = x0, hence Ax0 = λx0 with λ = Σ_{k=1}^N (Ax0)_k ≥ 0.
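The map ϕ used in this proof is easy to experiment with. The following sketch is only an illustration under assumptions not in the text (NumPy, and a matrix with strictly positive entries, for which iterating ϕ is essentially the classical power method and converges); Brouwer's theorem itself only guarantees that a fixed point exists.

```python
import numpy as np

def perron_pair(A, tol=1e-12, max_iter=10_000):
    """Iterate phi(x) = Ax / sum((Ax)_k) on the probability simplex C."""
    N = A.shape[0]
    x = np.full(N, 1.0 / N)          # start at the barycenter of C
    for _ in range(max_iter):
        Ax = A @ x
        s = Ax.sum()                  # equals lambda at a fixed point
        x_new = Ax / s
        if np.linalg.norm(x_new - x, 1) < tol:
            return s, x_new
        x = x_new
    return s, x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.random((4, 4))            # entries a_ij >= 0 (here in fact > 0)
    lam, x = perron_pair(A)
    print("lambda =", lam)
    print("residual:", np.linalg.norm(A @ x - lam * x))   # ~ 0
```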
The next theorem is the so-called no-retraction theorem, and it says that it is impossible to retract the whole closed unit ball of R^N onto its boundary keeping the boundary fixed.

THEOREM 3.5.7 If B1(0) = {x ∈ R^N : ‖x‖ < 1}, then there is no continuous map ϕ : B̄1(0) −→ ∂B1(0) such that ϕ|_{∂B1(0)} = I|_{∂B1(0)}.

PROOF: We proceed by contradiction and assume that such a retraction exists. Then from the dependence of Brouwer's degree on the boundary values (see Theorem 3.3.18) we have
d(ϕ, B1(0), 0) = d(I, B1(0), 0) = 1,
⇒ 0 = ϕ(x) for some x ∈ B1(0),
a contradiction to the fact that ϕ(B̄1(0)) ⊆ ∂B1(0).
Isolating the essential parts of the proof of Theorem 3.5.3, we can prove the following fixed point theorem.

THEOREM 3.5.8 If U ⊆ R^N is bounded open, ϕ ∈ C(Ū, R^N), and there is a u ∈ U such that
ϕ(x) − u ≠ λ(x − u) for all x ∈ ∂U and all λ > 1,    (3.78)
then ϕ has a fixed point in Ū.
PROOF: Again we assume that ϕ(x) ≠ x for all x ∈ ∂U, or otherwise there is nothing to prove. Consider the continuous homotopy h(t, x) defined by
h(t, x) = x − u − t(ϕ(x) − u) for all (t, x) ∈ [0, 1] × Ū.
Suppose that h(t, x) = 0 for some t ∈ (0, 1) and x ∈ ∂U. Then
ϕ(x) − u = (1/t)(x − u),
and because 1/t > 1 this contradicts (3.78). So by the homotopy invariance of Brouwer's degree we have
d(I − ϕ, U, 0) = d(I − u, U, 0) = d(I, U, u) = 1 (because u ∈ U).
So we can find x ∈ U such that ϕ(x) = x.
REMARK 3.5.9 The geometric interpretation of (3.78) is that, for some point u ∈ U, for no x ∈ ∂U does ϕ(x) lie on the continuation of the segment [u, x] beyond x.

The next result is known as Borsuk's fixed point theorem.

THEOREM 3.5.10 If U ⊆ R^N is bounded, open, and symmetric with 0 ∈ U and ϕ ∈ C(Ū, R^N) is odd, then ϕ has a fixed point.

PROOF: Assume that x ≠ ϕ(x) for all x ∈ ∂U, or otherwise we are done. From Theorem 3.3.28 we know that d(I − ϕ, U, 0) is odd (the map I − ϕ being odd), hence d(I − ϕ, U, 0) ≠ 0, and so ϕ has a fixed point.

Can we have an infinite-dimensional version of Theorem 3.5.3? The next theorem shows that this is not possible. Its proof can be found in Dugundji–Granas [212, p. 47].

THEOREM 3.5.11 If X is a normed space, then each continuous map ϕ : B̄1(0) −→ B̄1(0) has a fixed point if and only if dim X < +∞.

This shows that in order to extend the Brouwer fixed point theorem to infinite-dimensional spaces, we need to introduce additional hypotheses.

EXAMPLE 3.5.12 Let H = l² = {x = (xn)n≥1 : Σ_{n≥1} xn² < +∞}. We know that this is a Hilbert space with inner product (x, y) = Σ_{n≥1} xn yn and norm ‖x‖ = (Σ_{n≥1} xn²)^{1/2}. By B̄1(0) we denote the closed unit ball in H. Let ϕ : B̄1(0) −→ B̄1(0) be defined by
ϕ(x) = ((1 − ‖x‖²)^{1/2}, x1, x2, ...).
Then ‖ϕ(x)‖² = (1 − ‖x‖²) + ‖x‖² = 1; that is, ϕ(B̄1(0)) ⊆ ∂B1(0) and ϕ is continuous. However, ϕ is fixed point free. Indeed, if ϕ(x) = x, then ‖x‖ = ‖ϕ(x)‖ = 1 and so
ϕ(x) = (0, x1, x2, ..., xn, ...) = x = (x1, x2, ..., xn, ...),
⇒ x1 = 0, x1 = x2, ..., x_{n−1} = xn, ...,
⇒ xn = 0 for all n ≥ 1,
a contradiction (because ‖x‖ = 1).
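The mechanism behind Example 3.5.12 can be probed by truncating l² to finitely many coordinates. The sketch below is a finite-dimensional caricature and not the book's construction (it assumes NumPy and drops the last coordinate so that the truncated map is a self-map of a finite-dimensional ball, to which Brouwer's theorem applies).

```python
import numpy as np

def phi_truncated(x):
    """Finite-dimensional caricature of the map of Example 3.5.12:
    the first coordinate becomes sqrt(1 - ||x||^2) and the last one is
    dropped, so Brouwer's theorem applies and a fixed point must exist."""
    head = np.sqrt(max(1.0 - float(np.dot(x, x)), 0.0))
    return np.concatenate(([head], x[:-1]))

for n in (5, 50, 500):
    # the fixed point of the truncated map: all coordinates equal 1/sqrt(n+1)
    x_star = np.full(n, 1.0 / np.sqrt(n + 1))
    print(n,
          float(np.linalg.norm(phi_truncated(x_star) - x_star)),  # ~ 0
          float(np.linalg.norm(x_star)))                          # -> 1
```

As n grows these fixed points converge weakly to 0 in l² while their norms tend to 1, so they have no strongly convergent subsequence; this is precisely the loss of compactness that the next remark addresses.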
REMARK 3.5.13 The difficulty is that in an infinite-dimensional Banach space a bounded closed set need not be compact. For this reason, as we did with the infinite-dimensional extension of Brouwer's degree, we look to maps exhibiting some kind of compactness. The next theorem is known as Schauder's fixed point theorem.

THEOREM 3.5.14 If X is a Banach space, C ⊆ X is nonempty, bounded, closed, and convex, and ϕ : C −→ C is compact, then ϕ has a fixed point.

PROOF: Let εn ↓ 0 and let ϕ_{εn} ∈ K_f(C, X) be the finite-dimensional εn-approximation of ϕ guaranteed by Theorem 3.1.10. Let X_{εn} be the finite-dimensional range space of ϕ_{εn}. Then ϕ_{εn} : C ∩ X_{εn} −→ C ∩ X_{εn} is continuous and so by Theorem 3.5.3 we can find xn ∈ C ∩ X_{εn} such that ϕ_{εn}(xn) = xn. Because ‖ϕ(y) − ϕ_{εn}(y)‖ ≤ εn for all y ∈ C and ϕ(C) is compact, it follows that {ϕ_{εn}(xn)}n≥1 ⊆ X is relatively compact. So we may assume that ϕ_{εn}(xn) = xn −→ x in X. But ‖ϕ(xn) − xn‖ = ‖ϕ(xn) − ϕ_{εn}(xn)‖ ≤ εn −→ 0. Hence ϕ(x) = x; that is, ϕ has a fixed point.

REMARK 3.5.15 We have given a nondegree-theoretic proof of Theorem 3.5.14. It is easy to see that we can also give a degree-theoretic proof of the result by mimicking the proof of Theorem 3.5.3. The next result is known as the nonlinear alternative theorem.

THEOREM 3.5.16 If X is a Banach space, U ⊆ X is a bounded open set with 0 ∈ U, and ϕ : Ū −→ X is a compact map, then we have the following alternative: either
(a) ϕ has a fixed point in Ū, or
(b) there exist λ ∈ (0, 1) and x ∈ ∂U such that x = λϕ(x).

PROOF: Let h(t, x) be the compact homotopy defined by h(t, x) = tϕ(x) for all (t, x) ∈ [0, 1] × Ū. If h(t, x) ≠ x for all (t, x) ∈ [0, 1] × ∂U, then from the homotopy invariance of the Leray–Schauder degree we have
d(I − ϕ, U, 0) = d(I, U, 0) = 1,
⇒ ϕ(x) = x for some x ∈ U.
If h(1, x) = x for some x ∈ ∂U, then ϕ(x) = x and so ϕ has a fixed point. If h(t, x) = x for some (t, x) ∈ (0, 1) × ∂U, then x = tϕ(x) and so (b) is satisfied.

This theorem suggests that we have to impose conditions on ϕ that do not allow (b) to happen in Theorem 3.5.16. Then automatically ϕ has a fixed point.
COROLLARY 3.5.17 If X is a Banach space, ξ : X −→ R₊ is a function such that ξ⁻¹(0) = {0} and ξ(λx) = λξ(x) for all λ > 0 and all x ∈ X, U ⊆ X is a bounded open set with 0 ∈ U, ϕ : Ū −→ X is a compact map, and one of the following conditions is satisfied:
(i) ξ(ϕ(x)) ≤ ξ(x) for all x ∈ ∂U (Röthe's condition), or
(ii) ξ(ϕ(x))² ≤ ξ(ϕ(x) − x)² + ξ(x)² for all x ∈ ∂U (Altman's condition),
then ϕ has a fixed point.

PROOF: We show that both Röthe's condition and Altman's condition exclude alternative (b) in Theorem 3.5.16. Arguing by contradiction, suppose that there is (λ, x) ∈ (0, 1) × ∂U such that x = λϕ(x). Because ξ(x) = λξ(ϕ(x)) < ξ(ϕ(x)), Röthe's condition is not satisfied. Also, since ϕ(x) − x = (1 − λ)ϕ(x), we have
ξ(ϕ(x) − x)² + ξ(x)² = (1 − λ)²ξ(ϕ(x))² + λ²ξ(ϕ(x))² < ξ(ϕ(x))²,
⇒ Altman's condition is not satisfied.

The next consequence of Theorem 3.5.16 is known in the literature as the Leray–Schauder alternative principle or Schäfer's fixed point theorem.

COROLLARY 3.5.18 If X is a Banach space, ϕ : X −→ X is a compact map, and we set S(ϕ) = {x ∈ X : there exists λ ∈ (0, 1) such that x = λϕ(x)}, then either S(ϕ) is unbounded or ϕ has a fixed point.

PROOF: Suppose that S(ϕ) is bounded. Then we can find r > 0 such that S(ϕ) ⊆ B_r(0). Because ϕ|_{B̄_r(0)} : B̄_r(0) −→ X is a compact map for which no x ∈ ∂B_r(0) satisfies x = λϕ(x) with λ ∈ (0, 1), alternative (b) in Theorem 3.5.16 is excluded and so ϕ must have a fixed point.

We can have an infinite-dimensional version of Theorem 3.5.10 (Borsuk's fixed point theorem). The proof is identical.

THEOREM 3.5.19 If X is a Banach space, U ⊆ X a bounded, open, convex, and symmetric set with 0 ∈ U, and ϕ : Ū −→ X a compact odd map such that 0 ∉ ϕ(∂U), then ϕ has a fixed point.

All these results for compact maps can be extended to γ-condensing maps. The proofs remain the same; we simply replace the Leray–Schauder degree by the Nussbaum–Sadovskii degree, so we omit them. We only prove the Sadovskii fixed point theorem that follows, which is an extension of Theorem 3.5.14 (Schauder's fixed point theorem).

THEOREM 3.5.20 If X is a Banach space, C ⊆ X is nonempty, bounded, closed, and convex, and ϕ : C −→ C is a γ-condensing map, then ϕ has a fixed point.
PROOF: First suppose that ϕ is a γ-contraction with constant k < 1. We define
C0 = C and, inductively, Cn = \overline{conv} ϕ(C_{n−1}) for all n ≥ 1.
We have
γ(Cn) ≤ k γ(C_{n−1}) ≤ · · · ≤ kⁿ γ(C0) for all n ≥ 1,
⇒ γ(Cn) −→ 0 as n → ∞,
⇒ C_∞ = ⋂_{n≥1} Cn is nonempty, compact, and convex
(see Proposition 3.3.50). Note that ϕ : C_∞ −→ C_∞. So by Theorem 3.5.14 we can find x ∈ C_∞ such that ϕ(x) = x.
Now we remove the restriction that ϕ is a γ-contraction. Let kn ↑ 1 and set ϕn = knϕ (without loss of generality we may assume that 0 ∈ C, so that ϕn maps C into C). Then ϕn is a γ-contraction and so from the first part of the proof we have xn ∈ C such that ϕn(xn) = xn. Then
xn − ϕ(xn) = (kn − 1)ϕ(xn) −→ 0,
and because (I − ϕ)(C) is closed, we conclude that for some x ∈ C we have ϕ(x) = x.

The nonlinear alternative theorem is still valid for γ-condensing maps.

THEOREM 3.5.21 If X is a Banach space, U ⊆ X is bounded open with 0 ∈ U, and ϕ : Ū −→ X is a γ-condensing map, then the following alternative holds: either
(a) ϕ has a fixed point in Ū, or
(b) there exists (λ, x) ∈ (0, 1) × ∂U such that x = λϕ(x).

This alternative theorem has some interesting consequences.

COROLLARY 3.5.22 If X is a Banach space, ξ : X −→ R₊ is a function as in Corollary 3.5.17, U ⊆ X is a bounded open set with 0 ∈ U, and ϕ : Ū −→ X is a γ-condensing map satisfying one of the following conditions:
(i) ξ(ϕ(x)) ≤ ξ(x) for all x ∈ ∂U (Röthe's condition), or
(ii) ξ(ϕ(x))² ≤ ξ(ϕ(x) − x)² + ξ(x)² for all x ∈ ∂U (Altman's condition),
then ϕ has a fixed point.

COROLLARY 3.5.23 If X is a Banach space, ϕ : X −→ X is a γ-condensing map, and we set S(ϕ) = {x ∈ X : there exists λ ∈ (0, 1) such that x = λϕ(x)}, then either S(ϕ) is unbounded or ϕ has a fixed point.

Borsuk's fixed point theorem (see Theorem 3.5.19) is still valid for γ-condensing maps.

THEOREM 3.5.24 If X is a Banach space, U ⊆ X a bounded, open, convex, and symmetric set with 0 ∈ U, and ϕ : Ū −→ X is a γ-condensing odd map such that 0 ∉ ϕ(∂U), then ϕ has a fixed point.
We prove two more fixed point theorems, this time directly for γ-condensing maps. We start with a definition.

DEFINITION 3.5.25 Let X be a Banach space and ϕ : X −→ X a map.
(a) We say that ϕ is quasibounded if
|ϕ|_∞ = lim sup_{‖x‖→∞} ‖ϕ(x)‖/‖x‖ = inf_{M>0} sup_{‖x‖≥M} ‖ϕ(x)‖/‖x‖ < +∞.
The number |ϕ|_∞ is called the quasinorm of ϕ.
(b) We say that ϕ is asymptotically linear if there exists T ∈ L(X) such that
lim_{‖x‖→∞} ‖Tx − ϕ(x)‖/‖x‖ = 0.
The operator T ∈ L(X) is uniquely determined; it is called the asymptotic derivative of ϕ and it is denoted by ϕ′(∞).

THEOREM 3.5.26 If X is a Banach space and ϕ : X −→ X is a quasibounded γ-condensing map, then for every y ∈ X and every λ ∈ [−1, 1] such that |λ| < 1/|ϕ|_∞ the operator equation y = x − λϕ(x) has a solution; in particular, the γ-condensing map λϕ has a fixed point.

PROOF: Let ψ_λ(x) = y + λϕ(x). Because |λ| ≤ 1, we see that ψ_λ is γ-condensing. It is easy to check that |ψ_λ|_∞ = |λ||ϕ|_∞ < 1. So we can find M > 0 such that
‖ψ_λ(x)‖/‖x‖ < 1 for all ‖x‖ ≥ M.
Then we can apply Corollary 3.5.22 with ξ(·) = ‖·‖ and U = B_M(0) (Röthe's condition) and obtain a fixed point of ψ_λ, that is, a solution of y = x − λϕ(x).

THEOREM 3.5.27 If X is a Banach space, ϕ : X −→ X is asymptotically linear and γ-condensing, and ‖ϕ′(∞)‖_L < 1, then for every y ∈ X and every λ ∈ [−1, 1] the equation y = x − λϕ(x) has a solution; in particular, the γ-condensing map λϕ has a fixed point.

PROOF: It is easy to check that ϕ is quasibounded and |ϕ|_∞ ≤ ‖ϕ′(∞)‖_L. Hence |ϕ|_∞ < 1 and so we can apply Theorem 3.5.26 and finish the proof.

We conclude this section with an extension of Theorem 3.5.14 to locally convex spaces. The result is known as Tychonov's fixed point theorem.

THEOREM 3.5.28 If X is a locally convex space, C ⊆ X a nonempty, closed, and convex set, ϕ : C −→ C is continuous, and ϕ(C) is compact, then ϕ has a fixed point.
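The quasinorm of Definition 3.5.25 and the solvability statement of Theorem 3.5.26 are easy to explore numerically. The following sketch is only an illustration under its own assumptions (NumPy; X = R⁶; a hypothetical map ϕ(x) = Ax + tanh(x) whose linear part has operator norm 0.8 and whose perturbation is bounded, so |ϕ|_∞ ≤ 0.8): it samples ‖ϕ(x)‖/‖x‖ on large spheres to estimate the quasinorm and then solves y = x − λϕ(x) for one admissible λ. The iteration used below happens to be a contraction, which is more than Theorem 3.5.26 requires, but it keeps the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
A *= 0.8 / np.linalg.norm(A, 2)          # spectral norm 0.8

def phi(x):
    # linear part of norm 0.8 plus a bounded perturbation, so |phi|_inf <= 0.8
    return A @ x + np.tanh(x)

# crude estimate of the quasinorm: sample ||phi(x)|| / ||x|| on growing spheres
for R in (1e2, 1e4, 1e6):
    xs = rng.standard_normal((200, n))
    xs = R * xs / np.linalg.norm(xs, axis=1, keepdims=True)
    print("R =", R, "max ratio ~", max(np.linalg.norm(phi(x)) / R for x in xs))

# solve y = x - lam*phi(x) for |lam| < 1/|phi|_inf; with lam = 0.5 the map
# x -> y + lam*phi(x) is a contraction, so simple iteration converges
y, lam = rng.standard_normal(n), 0.5
x = np.zeros(n)
for _ in range(200):
    x = y + lam * phi(x)
print("residual:", np.linalg.norm(x - lam * phi(x) - y))   # ~ 0
```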
3.6 Order and Fixed Points

In many applied fields, nonnegativity of the solution of a boundary value problem is a minimal requirement for a function to qualify as a state of the model. What we normally understand by nonnegativity can be described using order cones. Using order cones we can extend the notion of monotonicity of operators beyond the framework of dual pairs (see Section 3.2). Also, the theory of cones can be coupled with the notion of a fixed point index in order to investigate positive fixed points of various classes of nonlinear operators. Cone analysis is a valuable tool in the study of nonlinear problems, and in this section we survey some important parts of this theory. We start with a discussion of order cones.

DEFINITION 3.6.1 Let X be a Banach space. By an order cone we understand a closed convex set K ⊆ X such that λK ⊆ K for all λ ≥ 0 and K ∩ (−K) = {0}.

REMARK 3.6.2 Given an order cone K ⊆ X, we can introduce a partial ordering ≤ on X by x ≤ u if and only if u − x ∈ K. Usually we write x < u if and only if u − x ∈ K \ {0}, and x ≪ u if and only if u − x ∈ int K, provided of course that int K ≠ ∅. Using this partial ordering we can speak of monotone sequences {xn}n≥1, that is, of sequences satisfying xn ≤ x_{n+1} for all n ≥ 1 (increasing) or x_{n+1} ≤ xn for all n ≥ 1 (decreasing). Also, a set C ⊆ X is said to be bounded above (resp., bounded below) if there exists u ∈ X such that c ≤ u (resp., u ≤ c) for all c ∈ C. By sup C (resp., inf C) we denote the least upper bound (resp., the greatest lower bound) of C with respect to ≤, if it exists. The space X with the order ≤ induced by K is an ordered Banach space (OBS for short).

Next we introduce some special kinds of order cones that we encounter in practice.

DEFINITION 3.6.3 Let X be a Banach space, K ⊆ X an order cone, and ≤ the partial ordering induced by K.
(a) We say that K is solid if int K ≠ ∅.
(b) We say that K is generating (resp., total) if X = K − K (resp., X = \overline{K − K}, the closure of K − K).
(c) We say that K is normal if there exists β > 0 such that
‖x + u‖ ≥ β for all x, u ∈ K with ‖x‖ = ‖u‖ = 1.
(d) We say that K is regular if every increasing sequence which is bounded above is convergent.
(e) We say that K is fully regular if every increasing sequence which is norm bounded is convergent.
(f) We say that K is minihedral if sup{x, u} exists for all x, u ∈ X.
(g) We say that K is strongly minihedral if every set which is bounded above has a supremum.
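The prototypical cone K = R^N_+ (discussed in Example 3.6.13(a) below) makes these notions concrete. The sketch below is only an illustration, with hypothetical helper names, assuming NumPy: the induced order is componentwise comparison, the pairwise supremum of minihedrality is the componentwise maximum, and for the Euclidean norm in R²_+ the normality constant β can be taken to be √2, since positive unit vectors make an angle of at most π/2.

```python
import numpy as np

# Order cone K = R^N_+ : x <= u iff u - x lies in K.
def leq(x, u):
    return bool(np.all(u - x >= 0))

def sup2(x, u):
    # K is minihedral: sup{x, u} is the componentwise maximum
    return np.maximum(x, u)

x = np.array([1.0, 0.0, 2.0])
u = np.array([1.0, 3.0, 1.0])
print(leq(x, u), leq(x, sup2(x, u)), leq(u, sup2(x, u)))   # False True True

# Normality (Definition 3.6.3(c)) for K = R^2_+ with the Euclidean norm:
# ||x + u||^2 = 2 + 2<x, u> >= 2 for unit vectors x, u in K, so beta = sqrt(2).
rng = np.random.default_rng(0)
vals = []
for _ in range(10_000):
    a, b = rng.random(2), rng.random(2)
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    vals.append(np.linalg.norm(a + b))
print(min(vals) >= np.sqrt(2) - 1e-9)    # True
```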
REMARK 3.6.4 Normality means that the angle between two positive unit vectors is bounded away from π; that is, a normal cone cannot be too large. Also it is clear that K is regular if and only if every decreasing and bounded below sequence is convergent. In the next proposition we present some useful criteria for normality of a cone. PROPOSITION 3.6.5 If X is a Banach space and K ⊆ X is an order cone, then the following statements are equivalent. (a) K is normal. (b) There exists γ > 0 such that x + u ≥ γ max{x, u} for all x, u ∈ X. (c) There exists M > 0 such that 0 ≤ x ≤ u implies x ≤ M y (i.e., · is semimonotone). (d) There exists an equivalent norm · 1 on X such that 0 ≤ x ≤ u implies x1 ≤ u1 (i.e., the new norm · 1 is monotone). (e) If xn ≤ vn ≤ un , n ≥ 1 and xn −x −→ 0, un −x −→ 0, then vn −x −→ 0. (f)
B 1 (0) + K ∩ B 1 (0) − K is bounded, where B 1 (0) = {x ∈ X : x ≤ 1}.
(g) Every order interval [x, u] = {v ∈ X : x ≤ v ≤ u} is bounded (in norm). PROOF: (a)⇒(b): We assume without any loss of generality that x = 1 and u ≤ 1. We have 1 = x ≤ x + u + − u = x + u + u.
(3.79)
Also because K is normal, from Definition 3.6.3(c), we have 1 − u u u − 1 + u ≥ β − 1 + u. (3.80) − u ≥ x + x + u = x + u u u Combining (3.79) and (3.80) we obtain γ=
β ≤ x + u. 2
(b)⇒(c): If the implication is not true, then we can find sequences {xn }n≥1 , {un }n≥1 ⊆ K such that xn ≤ un and xn > nun for all n ≥ 1. We set yn =
xn un + xn nun
and
vn = −
xn un + xn nun
for all n ≥ 1.
Then {yn }n≥1 , {vn }n≥1 ⊆ K and we have 1 and vn ≥ 1 − for all n ≥ 1, n 1 2 = yn + vn ≥ γ 1 − for all n ≥ 1 (because of (b)). ⇒ n n yn ≥ 1 −
1 n
3.6 Order and Fixed Points
247
For n ≥ 1 large the above inequality leads to a contradiction. (c)⇒(d): We define x1 = inf u + inf v. u≤x
(3.81)
v≥x
We show that · 1 is the desired monotone norm on X. Clearly 01 = 0. Also suppose that x1 = 0. From (3.81) we see that given ε > 0 we can find u ≤ x ≤ v such that u, v < ε. Then because of (c), we have x ≤ x − u + u ≤ M v − u + u ≤ (2N + 1)ε. Because ε > 0 was arbitrary, we let ε ↓ 0 to get that x = 0; that is, x = 0. Clearly µx1 = |µ|x1 for all µ ∈ R and all x ∈ X. We check that · 1 satisfies the triangle inequality in order to establish that · 1 is a norm on X. So suppose x, y ∈ X. Given ε > 0 we can find u1 , v1 , u2 , v2 ∈ X such that u1 ≤ x ≤ v1 , u2 ≤ y ≤ v2 , u1 + v1 ≤ x1 + ε and u2 + v2 ≤ y1 + ε. We have u1 + u2 ≤ x + y ≤ v1 + v2 and so it follows that x + y1 ≤ u1 + u2 + v1 + v2 . Hence x + y1 ≤ x1 + y1 + 2ε. Because ε > 0 was arbitrary, we let ε ↓ 0, to obtain x + y1 ≤ x1 + y1 . From (3.81) it is clear that 0 ≤ x ≤ y, then x1 = inf v ≤ inf v = y1 , v≥x
v≥y
⇒ · 1 is monotone. Finally we need to show that the two norms · , · 1 are equivalent. Clearly from (3.81), we have · 1 ≤ 2 · . On the other hand for any u ≤ x ≤ v, we have x ≤ x − u + u ≤ M v − u + u ≤ (M + 1)(u + v) ⇒ x ≤ (N + 1)x1
(see (3.81)).
So the two norms are equivalent and we are done. (d)⇒(e): We have 0 ≤ vn − xn ≤ un − xn and so from (d) vn − xn 1 ≤ un − xn 1 . Because of the equivalence of the norms · and · 1 and since xn − x −→ 0, un − x −→ 0, we conclude that vn − x −→ 0.
(e)⇒(f): If D = B 1 (0) + K ∩ B 1 (0) − K is unbounded, then we can find {yn }n≥1 ⊆ D such that yn −→ ∞. Then xn ≤ yn ≤ un with xn , un ∈ B 1 (0) for all n ≥ 1. We set xn =
xn , yn
yn =
yn yn
and
un =
un yn
for all n ≥ 1
and we have xn ≤ yn ≤ un for all n ≥ 1. Note that xn , un −→ 0. Hence because of (e) it follows that we must have yn −→ 0, a contradiction to the fact that yn = 1 for all n ≥ 1.
(f)⇒(g): Suppose that B 1 (0) + K ∩ B 1 (0) − K ⊆ rB 1 (0), r > 0. Let =
max{x, u}. Then for any x ∈ [x, u], we have (1/)v ∈ B 1 (0) + K ∩ B 1 (0) − K ⊆ rB 1 (0), hence [x, u] ⊆ rB 1 (0); that is, [x, u] is bounded.
(g)⇒(a): Suppose K is not normal. Then according to Definition 3.6.3(c), we can find {xn }n≥1 , {un }n≥1 ⊆ ∂B1 (0) ∩ K such that xn + un < (1/22n ), n ≥ 1. We set yn =
xn xn + un 1/2
Then 0 ≤ yn ≤ vn and
and n≥1
vn ≤
vn =
xn + un xn + un 1/2
for all n ≥ 1.
(1/2n ) < +∞. So the series
n≥1
converging strongly to v ∈ X. We have 0 ≤ yn ≤ vn ≤ v and
vn is
n≥1
1 > 2n for all n ≥ 1, xn + un 1/2 ⇒ [0, v] is unbounded, a contradiction. yn =
REMARK 3.6.6 So by virtue of this proposition, in an ordered Banach space with a normal order cone, every order bounded set C (i.e., there is an order interval [x, u] containing C) is bounded. PROPOSITION 3.6.7 If X is an OBS with order cone K and int K = ∅, then K is generating. PROOF: Let x0 ∈ int K and choose r > 0 such that B r (x0 ) ⊆ K. If x ∈ X \ {0}, then x0 + r (x/x) ∈ K. Hence x=
x r x x0 + x − x0 ∈ K − K; r x r
that is, X = K − K.
From the above proof it is also clear that the following result is true. PROPOSITION 3.6.8 If X is an OBS with order cone K, int K = ∅ and x0 ∈ int K, then for every x ∈ X we can find λ = λ(x) > 0 such that λx0 − x ∈ int K. PROPOSITION 3.6.9 If X is an OBS with order cone K and we consider the following conditions, (i) K is fully regular, (ii) K is regular, (iii) K is normal, then (i)⇒(ii)⇒(iii). PROOF: First we show that if K is not normal, then it is not regular, nor fully regular. Because K is not normal, by virtue of Proposition 3.6.5(c), we can find {xn }n≥1 , {un }n≥1 ⊆ K such that 0 ≤ xn ≤ un and xn > 2n un for all n ≥ 1. We set xn un yn = and vn = n for all n ≥ 1. xn 2 un Then we have
3.6 Order and Fixed Points 0 ≤ yn ≤ Also
Hence
xn ≤ vn 2n un
vn =
n≥1
for all n ≥ 1.
(3.82)
1 = 1. 2n
(3.83)
n≥1
vn = v ∈ X. We define
n≥1
⎧ n ⎪ vk ⎪ ⎪ ⎨ k=1 wn =
if n = 2m, m ≥ 1
⎪ n ⎪ ⎪ ⎩ vk + y2m+1
. if n = 2m + 1, m ≥ 1
k=1
Then because of (3.82) and (3.83) it follows that 0 ≤ w1 ≤ · · · ≤ wn ≤ wn+1 ≤ · · · ≤ v.
(3.84)
sup wn ≤ 2.
(3.85)
n≥1
On the other hand w2m+1 − w2m = y2m+1 = 1. So {wn }n≥1 does not converge and this means that K is not regular (see (3.84)) and K is not fully regular (see (3.85)). So it remains to show that (i)⇒(ii). Because K is fully regular, then from the above argument it follows that K is normal. So let {xn }n≥1 ⊆ X be an increasing, order bounded sequence. We have x1 ≤ x2 ≤ · · · ≤ xn ≤ xn+1 ≤ · · · ≤ u, ⇒ 0 ≤ u − xn ≤ u − x1
for all n ≥ 1,
⇒ u − xn ≤ M u − x1 for some M > 0 and all n ≥ 1, (see Proposition 3.6.5(c)) ⇒ {xn }n≥1 ⊆ X is norm bounded and by full regularity of K converges. PROPOSITION 3.6.10 If X is a reflexive OBS with order cone K, then K is normal ⇔ K is regular ⇔ K is fully regular. PROOF: By virtue of Proposition 3.6.9 it suffices to show that normality of K implies full regularity of K. So suppose that {xn }n≥1 ⊆ X is an increasing, norm bounded sequence. Since X is reflexive we can find {xnk }k≥1 a subsequence of w {xn }n≥1 such that xnk −→ x ∈ X. We claim that xnk ≤ x for all k ≥ 1. If this is not the case, we can find k0 ≥ 1 such that x − xnk0 ∈ / K. Then the strong separation theorem for convex sets implies that there exist x∗ ∈ X ∗ \ {0} and ε > 0 such that . / . / (3.86) x∗ , x − xnk0 + ε ≤ x∗ , xnk − xnk0 for all k ≥ k0 . If we pass to the limit as k −→ ∞ in (3.86), we reach a contradiction. So xnk ≤ x for all k ≥ 1.
250
3 Nonlinear Operators and Fixed Points
We show that {xnk }k≥1 has a subsequence which strongly converges to x in X. Indeed if this is not the case, we can find ε > 0 and k0 ≥ 1 such that x − xnk ≥ ε
for all k ≥ k0 .
Let Ck = {x ∈ X : x ≤ xnk }, k ≥ k0 , and C =
(3.87) Ck . Evidently each set
k≥k0
is convex and {Ck }k≥1 is increasing. So C is convex and thus so is C. If x ∈ C, then x ∈ Ck for some k ≥ k0 and so x ≤ xnk , hence 0 ≤ x − xnk ≤ x − x and so x − xnk ≤ M x − x. Because of (3.87), we obtain ε ≤ x − x M
for all x ∈ C,
(i.e., x ∈ / C).
Then by the strong separation theorem for convex set we can find u∗ ∈ X ∗ \ {0} and δ > 0 such that x∗ , x + δ ≤ x∗ , c
for all c ∈ C.
(3.88)
But clearly x ∈ C and this contradicts (3.88). Therefore (3.87) can not happen and so we can find a subsequence {xnkm }m≥1 of {xnk }k≥1 such that xnkm −→ x in X as m → ∞. Since for every n ≥ 1, we can find m ≥ 1 large enough such that xn ≤ xnkm , we deduce that xn ≤ x. Then 0 ≤ x − xn ≤ x − xnkm ⇒ x − xn ≤ M x − xnkm
for all n ≥ nkm , (see Proposition 3.6.5(c)),
⇒ xn −→ x in X. PROPOSITION 3.6.11 If X is a separable OBS with order cone K that is regular and minihedral, then K is strongly minihedral. PROOF: Let C ⊆ X be bounded above. Let {xn }n≥1 ⊆ D be dense in D. Set un = sup{xk }n k=1 which exist because by hypothesis K is minihedral. We have u 1 ≤ u2 ≤ · · · ≤ u n ≤ · · · ≤ w
for some w ∈ X (since D is bounded above).
Because K is regular, we have un −→ u in X. We show that u = sup C. Clearly xn ≤ un ≤ u
for all n ≥ 1.
(3.89)
Let x ∈ C.Then we can find a subsequence {xnk }k≥1 of {xn }n≥1 such that xnk −→ x in X. Because of (3.89) it follows that x ≤ u and so u is an upper bound of C. If v ∈ X is any other upper bound of C, then un ≤ v for all n ≥ 1 and so u ≤ v. Therefore u = sup C. COROLLARY 3.6.12 If X is a separable reflexive OBS and the order cone K ⊆ X is normal and minihedral, then K is strongly minihedral. Let us see some characteristic examples of ordered Banach spaces.
EXAMPLE 3.6.13 (a) X = R^N with order (positive) cone K = R^N_+ = {x = (x_k)_{k=1}^N : x_k ≥ 0 for all k ∈ {1, ..., N}}. This cone is solid and generating, and because the norm in R^N is monotone, K is also normal (see Proposition 3.6.5(d)). Then Proposition 3.6.10 implies that K is also regular and fully regular. Because K is clearly minihedral, Proposition 3.6.11 implies that K is strongly minihedral.
(b) X = C(D), where D ⊆ R^N is a closed subset, with positive cone K = {x ∈ C(D) : x(t) ≥ 0 for all t ∈ D}. Then K is solid and normal (note that order intervals are bounded; see Proposition 3.6.5(g)). However, it is not regular. To see this, consider xn(t) = 1 − tⁿ, t ∈ [0, 1], n ≥ 1. Then {xn}n≥1 is an increasing sequence, order bounded by u ≡ 1, but it does not converge uniformly on [0, 1]. Moreover, K is minihedral but not strongly minihedral. Indeed, if E = {x ∈ C[0, 2] : x(t) < 1 for t ∈ (0, 1) and x(t) < 2 for t ∈ (0, 2)}, then sup E does not exist. Another positive cone for X = C(D) is the following:
K = {x ∈ C(D) : x(t) ≥ 0 for all t ∈ D and min_{t∈D0} x(t) ≥ ε0‖x‖_∞},
with D0 a closed subset of D and 0 < ε0 < 1. This is a solid normal cone.
(c) X = L^p(Z), 1 ≤ p < ∞, where Z ⊆ R^N is a Lebesgue measurable set with finite Lebesgue measure, with positive cone K = {x ∈ L^p(Z) : x(z) ≥ 0 a.e. on Z}. Clearly K is generating but not solid (i.e., int K = ∅). Also, it is normal because it has bounded order intervals. Moreover, if 1 < p < ∞, Proposition 3.6.10 implies that K is regular and fully regular. If p = 1, by the monotone convergence theorem and the Lebesgue dominated convergence theorem, we infer that K is fully regular, hence regular too. Also K is minihedral and, because L^p(Z) is separable and K is regular, K is also strongly minihedral (see Proposition 3.6.11).
(d) X = W^{1,p}(Z), 1 < p < ∞, and Z ⊆ R^N a bounded open set with a C¹-boundary ∂Z. The positive cone of this OBS is K = {x ∈ W^{1,p}(Z) : x(z) ≥ 0 a.e. on Z}. In general K is not solid. However, if p > N, by the Sobolev embedding theorem W^{1,p}(Z) is embedded in C(Z̄) and so in this case K is solid (i.e., int K ≠ ∅; see Example 3.6.13(b)). Note that order intervals are not bounded, so K is not normal (see Proposition 3.6.5).
(e) X = C0(R₊) = {x ∈ C(R₊) : x(t) −→ 0 as t → ∞} with positive cone K = {x ∈ C0(R₊) : x(t) ≥ 0 for all t ∈ R₊} (the norm on X is the usual supremum norm ‖x‖_∞ = max_{t≥0} |x(t)|). This cone is regular (hence normal too), but not fully regular.
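The failure of regularity in Example 3.6.13(b) is easy to see numerically. The following sketch is only an illustration (it assumes NumPy and approximates [0, 1] by a grid): the sequence xn(t) = 1 − tⁿ is increasing and order bounded, yet ‖x_{2n} − xn‖_∞ = max_t tⁿ(1 − tⁿ) stays at 1/4, so it is not uniformly Cauchy.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20_001)         # grid approximation of [0, 1]

def x(n):
    return 1.0 - t**n                      # x_n(t) = 1 - t^n

# increasing and order bounded by the constant function 1 (on the grid) ...
print(bool(np.all(x(5) <= x(10))), bool(np.all(x(10) <= 1.0)))    # True True

# ... but not uniformly Cauchy: ||x_{2n} - x_n||_sup stays near 1/4,
# so the sequence has no uniform limit and the cone is not regular
for n in (10, 100, 1000):
    print(n, float(np.max(np.abs(x(2 * n) - x(n)))))              # ~ 0.25
```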
Next let us say a few things about positive linear functionals. For this purpose we make the following definition.

DEFINITION 3.6.14 Let X be an OBS with positive cone K. An element x* ∈ X* is said to be positive if ⟨x*, x⟩ ≥ 0 for all x ∈ K. We consider the set of all positive elements of X*,
K* = {x* ∈ X* : ⟨x*, x⟩ ≥ 0 for all x ∈ K}.
Then K ∗ is nonempty, closed, and convex but need not satisfy K ∗ ∩ (−K ∗ ) = {0}. Nevertheless we call K ∗ the dual cone.
REMARK 3.6.15 Clearly if K is generating, then K ∗ ∩ (−K ∗ ) = {0} and so K ∗ is an order cone too. We say that x∗ ∈ K ∗ is strictly positive, if x∗ , x > 0 for all x ∈ K \ {0} and uniformly positive if x∗ , x ≥ cx for some c > 0 and all x ∈ K. In the next proposition we have collected some useful properties of dual cones and their elements. PROPOSITION 3.6.16 If X is an OBS with positive cone K = {0} and K ∗ its dual cone, then (a) K ∗ = {0} and x ∈ K if and only if x∗ , x ≥ 0 for all x∗ ∈ K ∗ . (b) For every x ∈ K \ {0} we can find x∗ ∈ K ∗ such that x∗ , x > 0. (c) If x ∈ / K, then we can find x∗ ∈ K ∗ such that x∗ , x < 0. (d) x ∈ int K if and only if x∗ , x > 0 for all x∗ ∈ K ∗ \ {0}. (e) If X is separable, then we can find x0 ∈ X ∗ such that x∗0 , x > 0 for all x ∈ K \ {0}. PROOF: (a),(b),(c): If x ∈ / K, then by the strong separation theorem for convex sets we can find x∗ ∈ X ∗ \ {0} and ε > 0 such that x∗ , x + ε ≤ x∗ , c
for all c ∈ K
(3.90)
⇒ x∗ , x ≤ −ε < 0 (i.e., x∗ = 0). Moreover, because λK ⊆ K for all λ ≥ 0, from (3.90) it follows that x∗ , c ≥ 0 for all c ∈ K and so x∗ ∈ K ∗ \ {0}. So (a), (b), and (c) follow. (d): If x0 ∈ ∂K, then by the weak separation theorem (applied to the sets {x0 } and int K, both convex), we can find x∗ ∈ K ∗ \ {0} such that x∗ , x0 = 0. On the other hand if x ∈ int K, we can find r > 0 such that Br (x) ⊆ K. Suppose that x∗ , x = 0 for some x∗ ∈ K ∗ \ {0}. Then x∗ , ru ≥ 0 for all u ∈ Br (0) and so x∗ = 0. ∗
∗ (e): Because X is separable, the set K ∗ ∩ B 1 (0) furnished with the weak -topology ∗ ∗ is compact metrizable. Let {xn }n≥1 be a dense sequence and set u = 1/n2 x∗n . n≥1
Clearly u∗ ∈ K ∗ and if u∗ , x = 0 for some x ∈ K, then x∗ , x = 0 for all x∗ ∈ K ∗ and so x = 0 (see part (b)). There is a duality relation between being normal and being generating for an order cone. The result is known as Krein’s theorem and its proof can be found in Deimling [188, p. 223]. PROPOSITION 3.6.17 If X is an OBS, K is the positive cone, and K ∗ the dual cone, then (a) K is generating if and only if K ∗ is normal. (b) K is normal if and only if K ∗ is generating.
Using the machinery of order cones, we can easily produce fixed point theorems for increasing maps.

DEFINITION 3.6.18 Let X be an OBS with positive cone K and ≤ the partial ordering defined by K. Let D ⊆ X and ϕ : D −→ X. We say that
(a) ϕ is increasing (resp., decreasing) if x ≤ u implies ϕ(x) ≤ ϕ(u) (resp., ϕ(u) ≤ ϕ(x));
(b) ϕ is strictly increasing (resp., strictly decreasing) if x < u implies ϕ(x) < ϕ(u) (resp., ϕ(u) < ϕ(x));
(c) ϕ is strongly increasing (resp., strongly decreasing) if x < u implies ϕ(x) ≪ ϕ(u) (resp., ϕ(u) ≪ ϕ(x)), provided int K ≠ ∅.

THEOREM 3.6.19 If X is an OBS with a normal positive cone K and ϕ : [x0, u0] −→ [x0, u0] is increasing and γ-condensing, then ϕ has a maximal fixed point x* and a minimal fixed point x_* in [x0, u0]; moreover, if un = ϕ(u_{n−1}) and xn = ϕ(x_{n−1}) for all n ≥ 1, then
x* = lim_{n→∞} un and x_* = lim_{n→∞} xn.

PROOF: Because ϕ is increasing, we have
x0 ≤ x1 ≤ · · · ≤ xn ≤ · · · ≤ un ≤ · · · ≤ u0.
The sequence S = {xn}n≥0 is bounded and S = ϕ(S) ∪ {x0}. Therefore
γ(S) = γ(ϕ(S)) (see Proposition 3.3.50(d)).    (3.91)
Because ϕ is γ-condensing, from (3.91) we infer that γ(S) = 0. So S = {xn}n≥0 is relatively compact. Hence we can find a subsequence {x_{n_k}}k≥1 of {xn}n≥1 such that x_{n_k} −→ x_*. Then xn ≤ x_* ≤ un for all n ≥ 1. For n ≥ n_k we have 0 ≤ x_* − xn ≤ x_* − x_{n_k} and, because K is normal, by Proposition 3.6.5(c) we have ‖x_* − xn‖ ≤ M‖x_* − x_{n_k}‖; hence xn −→ x_* in X. Recall that xn = ϕ(x_{n−1}), n ≥ 1. So passing to the limit as n → ∞, we obtain x_* = ϕ(x_*). Similarly we show that un −→ x* in X and x* = ϕ(x*). It remains to show that x* is the maximal fixed point of ϕ in [x0, u0] and x_* is the minimal fixed point of ϕ in [x0, u0]. Let v ∈ [x0, u0] be a fixed point of ϕ, that is, v = ϕ(v). Because ϕ is increasing, we see that xn ≤ v ≤ un for all n ≥ 1; hence x_* ≤ v ≤ x*.

REMARK 3.6.20 The above theorem produces an iteration process that generates the maximal and minimal fixed points of ϕ in [x0, u0]. So, if ϕ has a unique fixed point in [x0, u0], the successive iterates converge to it (see the numerical sketch below).

In the next fixed point theorem we do not require the map ϕ to be continuous.

THEOREM 3.6.21 If X is an OBS with a strongly minihedral positive cone and ϕ : [x0, u0] −→ [x0, u0] is an increasing map, then ϕ has a greatest fixed point x* and a smallest fixed point x_* in [x0, u0].
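The monotone iteration of Theorem 3.6.19, noted in Remark 3.6.20, is straightforward to run. The following minimal sketch is only an illustration under its own assumptions (NumPy; X = R² ordered by K = R²_+ as in Example 3.6.13(a); a hypothetical increasing self-map of the order interval [x0, u0] = [0, 1]² chosen for the example): it iterates from both endpoints and watches the two monotone sequences squeeze the unique fixed point.

```python
import numpy as np

# A hypothetical increasing map on [0,1]^2 (componentwise order): each
# output coordinate is nondecreasing in each input coordinate, and
# phi([0,1]^2) is contained in [0,1]^2.
def phi(x):
    return np.array([0.25 + 0.5 * x[1],
                     0.30 + 0.4 * np.sqrt(x[0])])

x_lo, x_hi = np.zeros(2), np.ones(2)       # x0 and u0
for _ in range(40):
    x_lo, x_hi = phi(x_lo), phi(x_hi)
    assert np.all(x_lo <= x_hi + 1e-15)    # the iterates stay ordered

print(x_lo, x_hi)                           # both ~ the unique fixed point
print(np.linalg.norm(phi(x_lo) - x_lo))     # ~ 0
```

Here the two limits coincide because the fixed point is unique; in general the lower iterates converge to the minimal fixed point and the upper iterates to the maximal one.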
PROOF: Let C = {x ∈ [x0 , u0 ] : x ≤ ϕ(x)}. Evidently x0 ∈ C and u0 is an upper bound of C. Because K is strongly minihedral x = sup C exists. We claim that x is the maximal fixed point of ϕ in [x0 , u0 ]. For any x ∈ C we have x0 ≤ x ≤ x ≤ u0 and because ϕ is increasing, we obtain x0 ≤ ϕ(x0 ) ≤ ϕ(x) ≤ ϕ(x) ≤ ϕ(u0 ) ≤ u0 . Since x ∈ C we have x ≤ ϕ(x) and so x ≤ ϕ(x) for all x ∈ C. Recalling
that x = supC, we obtain x ≤ ϕ(x). Also since x ≤ ϕ(x), we have ϕ(x) ≤ ϕ ϕ(x) (recall that ϕ is increasing) and so ϕ(x) ∈ C. Therefore ϕ(x) ≤ x, hence x = ϕ(x). Now if x is any other fixed point of ϕ in [x0 , u0 ], then x ∈ C and x ≤ x, which proves that x is the greatest fixed point of ϕ. Similarly if D = {x ∈ [x0 , u0 ] : ϕ(x) ≤ x}, we can show that x = inf D exists and is the smallest fixed point of ϕ in [x0 , u0 ]. We continue with fixed point theorems for maps ϕ that are not necessarily continuous. THEOREM 3.6.22 If X is an OBS with positive cone K and ϕ : [x0 , u0 ] −→ [x0 , u0 ] is an increasing map such that ϕ([x0 , u0 ]) is relatively compact in X, then ϕ has a fixed point in [x0 , u0 ]. PROOF: Let C = {x ∈ ϕ([x 0 , u0 ]) : x ≤ ϕ(x)}. Because x0 ≤ ϕ(x0 ) and ϕ is increasing, we have ϕ(x0 ) ≤ ϕ ϕ(x0 ) and so ϕ(x0 ) ∈ C, which means that the set C is nonempty. Evidently C is a partially ordered set with the partial ordering that it inherits from X. Let S be a chain (i.e., a totally ordered subset) of C. Because S ⊆ C ⊆ ϕ([x0 , u0 ]) and the latter is by hypothesis relatively compact, it follows that S is relatively compact in X. So S is separable. Let E = {vn }n≥1 ⊆ S be a dense sequence. Since S is a chain yn = sup{vk }n k=1 , n ≥ 1, exist and yn ∈ S (actually yn = vk0 for some k0 ∈ {1, . . . , n}. Because S is relatively compact, by passing to a subsequence if necessary, we may assume that yn −→ y in X. Clearly vn ≤ yn ≤ y for all n ≥ 1 and y ∈ S ⊆ C ⊆ ϕ([x0 , u0 ]) ⊆ [x0 , u0 ]. Hence v ≤ y for all v ∈ S and so v ≤ ϕ(v) ≤ ϕ(y) for all v ∈ S, which means that ϕ(y) is an upper bound
of S. In addition since yn ≤ ϕ(y), n ≥ 1, we obtain y ≤ ϕ(y) and so ϕ(y) ≤ ϕ ϕ(y) , hence ϕ(y) ∈ C. Therefore ϕ(y) is an upper bound of C. So we can apply Zorn’s lemma
and obtain a maximal element x of C. Because x ≤ ϕ(x), we have ϕ(x) ≤ ϕ ϕ(x) and so ϕ(x) ∈ C. By virtue of the maximality of x, we conclude that x = ϕ(x). COROLLARY 3.6.23 If X is an OBS with a normal positive cone K and ϕ : [x0 , u0 ] −→ [x0 , u0 ] is an increasing compact map, then ϕ has a fixed point. PROOF: Because K is normal, the order interval [x0 , u0 ] is bounded (see Proposition 3.6.5). Because ϕ is compact, it follows that ϕ([x0 , u0 ]) is relatively compact. So we can apply Theorem 3.6.22. THEOREM 3.6.24 If X is an OBS with a minihedral positive cone K and ϕ : [x0 , u0 ] −→ [x0 , u0 ] is increasing with ϕ([x0 , u0 ]) relatively compact in X, then ϕ has a greatest fixed point x and a smallest point x in [x0 , u0 ]. PROOF: Let C be as in the proof of Theorem 3.6.22. In that proof via Zorn’s lemma, we proved that C has a maximal element x and x = ϕ(x). We claim that x is the greatest fixed point of ϕ in [x0 , u0 ]. To this end let y be any fixed point of ϕ
in [x0 , u0 ]. Because K is minihedral v = sup{y, x} exists. We have y = ϕ(y) ≤ ϕ(v) and x = ϕ(x) ≤ ϕ(v). Hence v ≤ ϕ(v) and so y ≤ x, which proves that x is the greatest fixed point of ϕ in [x0 , u0 ]. In a similar fashion, working with D = {x ∈ ϕ([x0 , u0 ]) : ϕ(x) ≤ x}, we produce a minimal element x, which is a fixed point of ϕ and in fact is the smallest fixed point of ϕ in [x0 , u0 ]. COROLLARY 3.6.25 If X is an OBS with a normal, minihedral positive cone K and ϕ : [x0 , u0 ] −→ [x0 , u0 ] is an increasing compact map, then ϕ has a greatest fixed point x and a smallest point x in [x0 , u0 ]. Thus far we have assumed that the map ϕ : D ⊆ X −→ X is increasing. Now we drop this monotonicity requirement and instead use tools and methods from degree theory. For this purpose we consider a Banach space X and C ⊆ X a retract of X (see Remark 3.5.5). Let r : X −→ C be an arbitrary retraction. Let U ⊆ X be nonempty, bounded, and open in C and ϕ : U −→ C a compact map such that x = ϕ(x) for all x ∈ ∂U . We choose R > 0 such that U ⊆ BR (0). We define
(3.92) iC (ϕ, U ) = d I − ϕ ◦ r, BR (0) ∩ r−1 (U ), 0 . We show that this definition is independent of the choices of r and R > 0. First let r1 : X −→ C be another retraction of X onto C and set V = BR (0) ∩ r−1 (U ) ∩ r1−1 (U ). Then V is a bounded open set of X and U ⊆ V . It is easy to see that ϕ ◦ r has no fixed points in BR (0) ∩ r−1 (U ) \ V and ϕ ◦ r has no fixed points in BR (0) ∩ r1−1 (U ) \ V . So from the excision property of the Leray–Schauder degree, we have
d I − ϕ ◦ r, BR (0) ∩ r−1 (U ), 0 = d I − ϕ ◦ r, V, 0
and d I − ϕ ◦ r1 , BR (0) ∩ r1−1 (U ), 0 = d I − ϕ ◦ r1 , V, 0 . (3.93)
We introduce the compact homotopy h(t, x)=r tϕ r(x) + (1 − t)ϕ r1 (x) for all (t, x) ∈ [0, 1] × V . We claim that x = h(t, x) for all t ∈ [0, 1] and all x ∈ ∂V . Indeed suppose that we can find (t, x) ∈ [0, 1] × ∂V such that x = h(t, x). We have
x = r tϕ r(x) + (1 − t)ϕ r1 (x) ∈ C, ⇒ r(x) = x, r1 (x) = x
and so x = ϕ(x),
hence x ∈ U ⊆ V , a contradiction. Invoking the homotopy invariance of the Leray–Schauder degree, we obtain
(3.94) d I − ϕ ◦ r, V, 0 = d I − ϕ ◦ r1 , V, 0 . From (3.93) and (3.94) it follows that
d I − ϕ ◦ r, BR (0) ∩ r−1 (U ), 0 = d I − ϕ ◦ r1 , BR (0) ∩ r1−1 (U ), 0 .
(3.95)
From (3.95) we see that the definition made in (3.92) is in fact independent of the particular retraction we use. Next we show that the definition made in (3.92) is also independent of R > 0. So let R1 > R. We have
U ⊆ BR (0) ∩ r−1 (U ) ⊆ BR1 (0) ∩ r−1 (U )
and so ϕ ◦ r has no fixed points in BR1 (0) ∩ r−1 (U ) \ BR (0) ∩ r−1 (U ) . So once again from the excision property of the Leray–Schauder degree, we have
d I − ϕ ◦ r, BR1 (0) ∩ r−1 (U ), 0 = d I − ϕ ◦ r, BR (0) ∩ r−1 (U ), 0 which proves that the definition made in (3.92) is also independent of R > 0. So we can now formally introduce the notion of the fixed point index of ϕ with respect to a retract C. DEFINITION 3.6.26 Let X be a Banach space, C ⊆ X a retract of X and consider the family SI = (ϕ, U ) : U ⊆ C bounded open in C, ϕ : U −→ X a compact map such that x = ϕ(x) for all x ∈ ∂U . We can define a map iC : SI −→ Z by
iC (ϕ, U ) = d I − ϕ ◦ r, BR (0) ∩ r−1 (U ), 0 , where r : X −→ C is an arbitrary retraction of X onto C and R > 0 is such that U ⊆ BR (0). The integer iC (ϕ, U ) is called the fixed point index of ϕ on U with respect to C. From this definition and the properties of the Leray–Schauder degree (see Theorem 3.3.40) we obtain the following. THEOREM 3.6.27 If X is a Banach space, C ⊆ X a retract, and (ϕ, U ) ∈ SI (see Definition 3.6.26), then (a) Normalization: iC (ϕ, U ) = 1 if ϕ(x) = u0 ∈ U for all x ∈ U and iC (ϕ, U ) = 0 if u0 ∈ / U. (b) Additivity with respect to the domain: If U1 , U2 are disjoint open subsets of U and ϕ has no fixed points on U \(U1 , U2 ), then iC (ϕ, U ) = iC (ϕ, U1 )+iC (ϕ, U2 ). map such that (c) Homotopy invariance: If h : [0, 1] × U −→ C is a compact x = h(t, x) for all (t, x) ∈ [0, 1] × ∂U , then iC h(t, ·), U is independent of t ∈ [0, 1]. (d) Reduction property: If D is a retract of C and ϕ(U ) ⊆ D, then iC (ϕ, U ) = iD (ϕ, U ∩ D). (e) Excision property: If U0 is an open subset of U and ϕ has no fixed points in U \ U0 , then iC (ϕ, U ) = iC (ϕ, U0 ). (f) Solution property: If iC (ϕ, U ) = 0, then there exists x ∈ U such that x = ϕ(x). REMARK 3.6.28 The notion of the fixed point index can be extended to γcondensing maps, using the Nussbaum–Sadovskii degree d0 (see Definition 3.3.56). We leave the straightforward details to the reader. We still denote it by iC (ϕ, U ) and Theorem 3.6.27 remains valid.
Recall that every nonempty, closed, and convex subset of a Banach space X is retract of X (see Theorem 3.1.10 and Remark 3.1.12). PROPOSITION 3.6.29 If X is a Banach space, C ⊆ X is nonempty, closed, and convex, (ϕ, U ) ∈ SI , and there exists u ∈ U such that ϕ(x) − x = λ(x − u) for all λ ≥ 0 and all x ∈ ∂U , then iC (ϕ, U ) = 1. PROOF: Let h(t, x) = tϕ(x) + (1 − t)u for all (t, x) ∈ [0, 1] × U . If for some (t, x) ∈ [0, 1] × ∂U we have x = h(t, x), then clearly t = 0 (because u ∈ U ) and ϕ(x) − x = (1 − t)/t (x − u), so ϕ(x) − x = λ(x − u) with λ = (1 − t)/t, t = 0, a contradiction to the hypothesis. So invoking the homotopy invariance and normalization properties of the fixed point index (see Theorem 3.6.27(c) and (a)), we have iC (ϕ, U ) = iC (u, U ) = 1. PROPOSITION 3.6.30 If X is a Banach space, C ⊆ X is nonempty, closed, and convex, (ϕ, U ) ∈ SI , and there exists u ∈ C \ U such that ϕ(x) − x = λ(x − u) for all λ ≥ 0 and all x ∈ ∂U , then iC (ϕ, U ) = 0. PROOF: We consider the compact homotopy h(t, x) = tϕ(x) + (1 − t)u. If for some (t, x) ∈ [0, 1] × ∂U, x = h(t, x), then clearly t = 0 and ϕ(x) − x = (1 − t)/t (x − u), with (1 − t)/t ≥ 0 and this contradicts the assumption. So from the homotopy invariance and normalization properties of the fixed point index, we have iC (ϕ, U ) = iC (u, U ) = 0. COROLLARY 3.6.31 If X is an OBS with order cone K, U ⊆ X is a bounded open set, ϕ : K ∩ U −→ K is a compact map, and there exists u ∈ K \ {0} such that x − ϕ(x) = λu for all λ ≥ 0, and all x ∈ K ∩ ∂U , then iK (ϕ, K ∩ U ) = 0. REMARK 3.6.32 If X is a Banach space and C ⊆ X is a nonempty, closed, and convex set (hence a retract), then for any U ⊆ X bounded open, the set K ∩ U is bounded open in K and ∂(K ∩ U ) = K ∩ ∂U and K ∩ U = K ∩ U . PROPOSITION 3.6.33 If X is an OBS with order cone K, U ⊆ X is a bounded open set, and ϕ : K ∩ U −→ K is a compact map such that (i) inf ϕ(x) : x ∈ K ∩ ∂U > 0, and (ii) ϕ(x) = λx for all λ ∈ (0, 1] and all x ∈ K ∩ ∂U , then iK (ϕ, K ∩ U ) = 0. PROOF: Consider the compact homotopy h(t, x) = (1 + t)ϕ(x) for all t ≥ 0 and all x ∈ K ∩ U = K ∩ U . Because of hypothesis (ii), we see that x = h(t, x) for all t ≥ 0 and all x ∈ K ∩ ∂U = ∂(K ∩ U ). So by the homotopy invariance of the fixed point index we have iC (ϕ, K ∩ U ) = iC (1 + t)ϕ, K ∩ U for all t ≥ 0. If iC (ϕ, K ∩ U ) = 0, then we can find x = x(t) ∈ U such that (1 + t)ϕ(x) = x. But let t∗ > ξ1 + ξ2 ξ3 where ξ1 = sup{x : x ∈ K ∩ U }, ξ2 = sup{ϕ(x) : x ∈ K ∩ U }
and ξ3 = inf{ϕ(x) : x ∈ K ∩ U }. Note that because of hypothesis (i), the dependence of the Leray–Schauder degree on the boundary values and Theorem
3.1.13 we have ξ3 > 0 and so ξ1 + ξ2 ξ3 is finite. From the solution property of the fixed point index we can find x ∈ K ∩ U such that ϕ(x) + t∗ ϕ(x) = x x − ϕ(x) ξ1 + ξ2 ⇒ t∗ = , ≤ ϕ(x) ξ3
a contradiction.
Therefore we conclude that iK (ϕ, K ∩ U ) = 0.
PROPOSITION 3.6.34 If X is an OBS with order cone K, U ⊆ X is a bounded open set, and ϕ : K ∩ U −→ X is a compact map such that (i) ϕ(x) = λx λ ∈ [0, 1] and all x ∈ K ∩ ∂U = ∂(K ∩ U ), for all (ii) The set ϕ(x) ϕ(x) : x ∈ K ∩ ∂U is relatively compact, then iK (ϕ, U ) = 0. PROOF: Let ξ=sup{ϕ(x) : x ∈ K ∩∂U } > 0 and consider the map ψ : K ∩∂U −→ K defined by ϕ(x) ψ(x) = ξ . ϕ(x) By hypothesis (ii), ψ is compact and so by Theorem 3.1.13 we can find a compact extension ψ of ψ with values in K. It is easy to see that ψ satisfies the hypotheses of Proposition 3.6.33. So we have iK (ψ, K ∩ U ) = 0. We consider the compact homotopy h(t, x) = tϕ(x)(1 − t) + ψ(x)
for all t ∈ [0, 1] and all x ∈ K ∩ U = K ∩ U .
Then using hypothesis (i) we see x = h(t, x) for all t ∈ [0, 1] and K ∩ ∂U = ∂(K ∩ U ). Hence by the homotopy invariance of the degree, we have iK (ϕ, K ∩ U ) = iK (ψ, K ∩ U ) = 0. Now we use the above degree theoretic results involving the fixed point index in order to prove fixed point theorems of the cone expansion-compression form. THEOREM 3.6.35 If X is an OBS with order cone K, U1 , U2 ⊆ X are bounded open sets with 0 ∈ U1 U 1 ⊆ U2 , and ϕ : K ∩ (U 2 \ U1 ) −→ K is a compact map that satisfies one of the following two hypotheses, (i) ϕ(x) − x ∈ / K for all x ∈ K ∩ ∂U1 and x − ϕ(x) ∈ / K for all x ∈ K ∩ ∂U2 , or / K for all x ∈ K ∩ ∂U2 , (ii) x − ϕ(x) ∈ / K for all x ∈ K ∩ ∂U1 and ϕ(x) − x ∈ then ϕ has at least one fixed point in K ∩ (U 2 \ U1 ).
PROOF: By virtue of Theorem 3.1.13 we can assume that ϕ is defined on K ∩ U 2 . If hypothesis (i) is in effect, then ϕ(x) = λx for all λ ≥ 1 and all x ∈ K ∩ ∂U1 and so we can apply Theorem 3.6.29 and obtain iK (ϕ, K ∩ U1 ) = 1.
(3.96)
On the other hand choosing arbitrary u ∈ K \ {0}, because of the second part of hypothesis (i), we have that x − ϕ(x) = λu for all λ ≥ 0 and all x ∈ K ∩ ∂U2 . Therefore from Corollary 3.6.31, we have iK (ϕ, K ∩ U2 ) = 1.
(3.97)
From (3.96), (3.97), and the additivity property of the fixed point index, we have
iK ϕ, K ∩ (U2 \U 1 ) + iK (ϕ, K ∩ U1 ) = iK (ϕ, K ∩ U2 )
⇒ iK ϕ, K ∩ (U2 \U 1 ) = −1. So by the solution property of the index, ϕ has a fixed point in U2 \U 1 . If hypothesis (ii) is in effect, then iK (ϕ, K ∩ U1 ) = 0 and iK (ϕ, K ∩ U2 ) = 1 and so
iK ϕ, K ∩ (U2 \ U1 ) = 1 from which we infer that ϕ has a fixed point in U2 \ U 1 .
In the previous theorem the expansion-compression of the cone was in terms of the order of the space. In the next theorem is with respect to the norm. The proof is similar and so is omitted. THEOREM 3.6.36 If X is an OBS with order cone K, U1 , U2 ⊆ X are bounded open sets with 0 ∈ U1 U 1 ⊆ U2 , and ϕ : K ∩ (U 2 \ U1 ) −→ K is a compact map that satisfies one of the following two hypotheses, (i) ϕ(x) ≤ x for all x ∈ K ∩ ∂U1 and ϕ(x) ≥ x for all x ∈ K ∩ ∂U2 , or (ii) ϕ(x) ≥ x for all x ∈ K ∩ ∂U1 and ϕ(x) ≤ x for all x ∈ K ∩ ∂U2 , then ϕ has at least one fixed point in K ∩ (U 2 \ U1 ). Finally we present some theorems on the existence of multiple fixed point theorems. DEFINITION 3.6.37 Let X be a Banach space and C ⊆ X a nonempty closed convex set. Given a continuous convex function ξ : C −→ R, a continuous concave function ϑ : C −→ R, and real numbers η, µ, we set C(ξ, η) = {x ∈ C : ξ(x) ≤ η},
C(ϑ, µ) = {x ∈ C : ϑ(x) ≥ µ}
and
C(ξ, ϑ, η, µ) = C(ξ, η) ∩ C(ϑ, µ). PROPOSITION 3.6.38 If X is a Banach space, C ⊆ X is a nonempty, closed, and convex set, ξ : C −→ R is continuous convex, ϑ : C −→ R is continuous concave, η, µ ∈ R, ϕ : {x ∈ C : ϑ(x) < µ} −→ C is compact, and the following hypotheses hold,
(i) {x ∈ C : ϑ(x) < µ} = ∅ and it is bounded.
(ii) {x ∈ C(ξ, ϑ, η, µ) : ϑ(x) > µ} = ∅ and ϑ ϕ(x) > µ for all x ∈ C(ξ, ϑ, η, µ). (iii) ϑ ϕ(x) < η for all x ∈ C(ϑ, µ) with ξ ϕ(x) > η, then setting U = {x ∈ C : ϑ(x) < µ}, we have iC (ϕ, U ) = 0. PROOF: By hypothesis U ⊆ C is bounded open in C. Using hypothesis (ii) we can find u ∈ C such that ϑ(u) > µ
and
ξ(u) ≤ η.
Hence u ∈ C \ U . We show that ϕ(x) − x = λ(x − u) for all λ ≥ 0 and all x ∈ ∂U . Indeed, if there existλ ≥ 0 and
x ∈ ∂U such that
ϕ(x) − x = λ(x − u), then ϑ(x) = µ and x = 1/(1 + λ) ϕ(x) + λ/(1 + λ)
u. If ξ ϕ(x) ≤ η, then exploiting the convexity of ξ, we have ξ(x) ≤ 1/(1 + λ) ξ ϕ(x) + λ/(1 + λ) ξ(u) ≤ η. Hence
from hypothesis (ii) it follows that ϑ ϕ(x) > µ. Exploiting the concavity of ϑ, we
have
ϑ(x) ≥ 1/(1+λ) ϑ ϕ(x) + λ/(1+λ) ϑ(u) >
µ, a contradiction. Now suppose ξ ϕ(x) > η. Then from hypothesis (iii) we have ϑ ϕ(x) > µ, again a contradiction. Therefore we can apply Proposition 3.6.30 and conclude that iC (ϕ, U ) = 0. PROPOSITION 3.6.39 If X is a Banach space, C ⊆ X is nonempty, closed, and convex, ξ : C −→ R is continuous convex, ϑ : C −→ R is continuous concave, η, µ ∈ R, ϕ : {x ∈ C : ξ(x) < η} −→ C is compact, and the following hypotheses hold, (i) {x ∈ C : ξ(x) < η} is bounded;
(ii) {x ∈ C(ξ, ϑ, η, µ) : ξ(x) < η} = ∅ and ξ ϕ(x) < η for all x ∈ C(ξ, ϑ, η, µ); (iii) ξ ϕ(x) < η for all x ∈ C(ξ, η) with ϑ ϕ(x) < µ, then setting U = {x ∈ C : ξ(x) < η}, we have iC (ϕ, U ) = 1. PROOF: By hypothesis U ⊆ C is bounded open in C. Using hypothesis (ii) we can find u ∈ C such that ϑ(u) ≥ µ
and
ξ(u) < η.
So we have u ∈ C. We show that ϕ(x) − x = λ(x − u) for all λ ≥ 0 and all x ∈ ∂U . Indeed, if for some λ ≥ 0 and
x ∈ ∂Uwe have ϕ(x) − x = λ(x − u), then ξ(x) = η and x = 1/(1 + λ) ϕ(x) + λ/(1 + λ) u. First suppose that ϑ ϕ(x) ≥ µ.
1 λ ϑ ϕ(x) + 1+λ ϑ(u) ≥ µ and Then due to the concavity of ϑ, we have ϑ(x) ≥ 1+λ
so from hypothesis (ii) we have ξ ϕ(x) < η. So from the convexity of ξ, we have
η = ξ(x) ≤ 1/(1 + λ) ξ ϕ(x) + λ/(1 + λ) ξ(u) < η, a contradiction. Now suppose
that ϑ ϕ(x) < µ. Then we have µ = ξ(x) ≤
1 λ ξ ϕ(x) + ϕ(u) < µ, 1+λ 1+λ
again a contradiction. Therefore ϕ(x) − x = λ(x − u) for all λ ≥ 0 and all x ∈ ∂U . So we can apply Proposition 3.6.29 and conclude that iC (ϕ, U ) = 1. Next we present some multiplicity results for the fixed points of a compact map on an order cone.
THEOREM 3.6.40 If X is a Banach space, C ⊆ X a nonempty, bounded, closed, and convex set, U1 , U2 ⊆ X nonempty open, U 1 ∩ U 2 = ∅, and ϕ : C −→ C a compact map satisfying (i) There exists u1 ∈ U1 such that ϕ(x) − x = λ(x − u1 ) for all λ ≥ 0 and all x ∈ ∂U1 ; (ii) There exists u2 ∈ U2 such that ϕ(x) − x = λ(x − u2 ) for all λ ≥ 0 and all x ∈ ∂U2 , then ϕ has at least three fixed points x1 , x2 , x3 ∈ C such that x1 ∈ U1 , x2 ∈ U2 , x3 ∈ C \ (U1 ∪ U2 ). PROOF: Clearly iC (ϕ, C) = 1. Also from Proposition 3.6.29, we have iC (ϕ, U1 ) = iC (ϕ, U2 ) = 1. Then from the additivity property of the fixed point index (see Theorem 3.6.27(b)), we have
iC ϕ, C \ (U1 ∪ U2 ) + iC (ϕ, U1 ) + iC (ϕ, U2 ) = iC (ϕ, X) = 1,
⇒ iC ϕ, C \ (U1 ∪ U2 ) = −1. COROLLARY 3.6.41 If X is a Banach space, C ⊆ X is nonempty, bounded, closed, and convex, ξ : C −→ R is continuous convex, ϑ : C −→ R is continuous concave, η, µ ∈ R, U1 = {x ∈ C : ξ(x) < η} and U2 = {x ∈ C : ϑ(x) > µ} are nonempty, U 1 ∩ U 2 = ∅, ϕ : C −→ C is compact, and
(i) ξ ϕ(x) < η for all x ∈ C with ξ(x) = η; (ii) ϑ ϕ(x) > µ for all x ∈ C with ϑ(x) = µ, then ϕ has at least three fixed points x1 , x2 , x3 ∈ C such that ξ(x1 ) < η, µ < ϑ(x2 ), and η < ξ(x3 ), ϑ(x3 ) < µ. PROOF: First let u1 ∈ U1 . Then ξ(u1 ) < η. If we can find λ≥ 0 and x ∈ ∂U1 such that ϕ(x) − x = λ(x − u1 ), then ξ(x) = η and x = 1/(1 + λ) ϕ(x) + λ/(1 + λ) u1 . From the convexity of ξ and hypothesis (i) we have η = ξ(x) ≤
1 λ ξ ϕ(x) + ξ(u1 ) < η, 1+λ 1+λ
a contradiction. Therefore ϕ(x) − x = λ(x − u1 ) for all λ ≥ 0 and x ∈ ∂U1 . Similarly let u2 ∈ U2 . Then ϑ(u2 ) > µ and arguing as above we establish that ϕ(x) − x = λ(x − u2 ) for all λ ≥ 0 and x ∈ ∂U2 . So we can apply Theorem 3.6.40 and produce the three fixed points of ϕ with the stated properties. THEOREM 3.6.42 If X is an OBS with order cone K, ϕ : K −→ K is a compact map, ξ : K −→ R is continuous, convex, ϑ : K −→ R is continuous concave, η, µ ∈ R, and (i) 0 ∈ {x ∈ K : ϑ(x) < µ} and the set {x ∈ K : ϑ(x) < µ} is bounded;
(ii) {x ∈ K(ξ, ϑ, η, µ) : ϑ(x) > µ} ≠ ∅ and ϑ(ϕ(x)) > µ for all x ∈ K(ξ, ϑ, η, µ);
(iii) ϑ ϕ(x) > µ for all x ∈ K(ϑ, µ) with ξ ϕ(x) > η; (iv) There exist r0 > 0 small and R0 > r0 large such that for r ≤ r0 < R0 ≤ R we have iK (ϕ, Kr ) = iK (ϕ, KR ) = 1
where K = K ∩ B (0)
for all > 0,
then ϕ has at least three fixed points x1 , x2 , x3 ∈ K. PROOF: Let U = {x ∈ K : ϑ(x) < µ}. From Proposition 3.6.38, we have iK (ϕ, U ) = 0. Because 0 ∈ {x ∈ K : ϑ(x) < µ} and the set {x ∈ K : ϑ(x) < µ} is bounded (see hypothesis (i)), we can find r > 0 small and R > 0 large such that K r ⊆ U ⊆ U ⊆ KR . Then using the additivity property of the fixed point index, we obtain iK (ϕ, K r ) = −1, iK (ϕ, KR \ U ) = 1
and
iK (ϕ, Kr ) = 1.
So we obtain three fixed points x1 , x2 , x3 ∈ K of ϕ such that x1 ∈ Kr , x2 ∈ U \K r , and x3 ∈ KR \U . In a similar fashion using this time Proposition 3.6.39, we obtain the following. THEOREM 3.6.43 If X is an OBS with order cone K, ϕ : K −→ K is a compact map, ξ : K −→ R is continuous, convex, ϑ : K −→ R is continuous concave, η, µ ∈ R, and (i) 0 ∈ {x ∈ K : ξ(x) < η} and the set {x ∈ K : ξ(x) < η} is bounded;
(ii) {x ∈ K(ξ, ϑ, η, µ) : ξ(x) < η} = ∅ and ξ ϕ(x) < η for all x ∈ K(ξ, ϑ, η, µ)};
(iii) ξ(ϕ(x)) < η for all x ∈ K(ξ, η) with ϑ(ϕ(x)) < µ;
(iv) there exist r0 > 0 small and R0 > r0 large such that for r ≤ r0 < R0 ≤ R we have iK(ϕ, K_r) = iK(ϕ, K_R) = 0,
then ϕ has at least two fixed points.

We conclude this section with Amann's three fixed points theorem. For a proof of this result we refer to Amann [16].

THEOREM 3.6.44 If X is an OBS with order cone K that is solid (i.e., int K ≠ ∅) and normal, y1, y2, v1, v2 ∈ X, y1 < v1 < y2 < v2, ϕ : [y1, v2] −→ [y1, v2] is compact and strongly increasing (see Definition 3.6.18(c)), and ϕ(v1) < v1, y2 ≤ ϕ(y2), then ϕ has at least three fixed points x1, x2, x3 ∈ [y1, v2] with
y1 ≤ x1 ≪ v1,  y2 ≪ x2 ≤ v2,  and  y2 ≰ x3 ≰ v1.
3.7 Remarks

3.1: The first attempts to deal with nonlinear operator equations in infinite-dimensional Banach spaces involved compactness properties of the operator involved. Proposition 3.1.6 is due to Schauder [540]. Theorem 3.1.10 is due to Dugundji [209] (see also Dugundji [210, p. 188]). Proper maps (see Definition 3.1.14) are discussed in Berger [68]. A thorough discussion of compact operators and their spectral properties can be found in the books of Kato [343], Megginson [424], and Reed–Simon [513]. For Fredholm operators the main reference is the book of Kato [343].

3.2: Compact operators limit the class of nonlinear boundary value problems that we can study. Monotone operators, and in particular maximal monotone operators, are a systematic effort to broaden the problems that we can study. Monotone operators form a rather natural class of nonlinear operators, which is rooted in the calculus of variations. The systematic study of monotone nonlinear operators started in the early 1960s and coincided with the advent of the so-called nonsmooth analysis, with which for a long period it developed in parallel. The first result on monotone operators was obtained by Kachurovski [336], who proved that the gradient of a convex function on a Banach space is a monotone operator. In fact he is the one responsible for the term monotone operator. Then first came Minty [430] (for Hilbert spaces) and then Browder [121] (for general reflexive Banach spaces), who illustrated that the notion of monotonicity is a powerful tool in proving existence theorems for nonlinear operator equations. Theorem 3.2.4 is essentially due to Rockafellar [520]. Here we state a slightly more general version of Rockafellar's result. The duality map (see Definition 3.2.12) was introduced by Beurling–Livingstone [73] and is a basic tool in the study of evolution equations as well as in the investigation of the geometric properties of Banach spaces. A detailed study of the duality map can be found in the books of Browder [121], Cioranescu [146], Gasiński–Papageorgiou [258], and Zeidler [622]. Theorems 3.2.24 and 3.2.25 are due to Browder [119] and are useful in the study of nonlinear operator equations. Theorem 3.2.30 was first proved for pivot Hilbert spaces (in which case F = I) by Minty [430]. The maximal monotonicity of the sum of monotone operators was considered by Rockafellar [522, 523], who also proved Theorems 3.2.35 and 3.2.39. The notion of pseudomonotonicity (see Definition 3.2.53) was first introduced by Brezis [98] using nets, and soon thereafter Browder [121] provided the sequential definition. The importance of this class of nonlinear operators comes from Theorem 3.2.60, which is due to Browder–Hess [120]. The basic works on pseudomonotone operators are those by Browder–Hess [120] and Kenmochi [345, 346]. Accretive operators (see Definition 3.2.64) were introduced by Kato [341, 342]. For a comprehensive introduction to their theory, we refer to the books of Barbu [58] and Miyadera [436]. Semigroups (linear and nonlinear alike) are important in the study of evolution equations. The linear semigroup theory can be found in the books of Hille–Phillips [300] and Pazy [491]. Theorem 3.2.87 is the main result of the linear theory and was proved independently (for N = 1, ω = 0, i.e., contraction semigroups) by Hille [299] and Yosida [613]. The general case (stated in Theorem 3.2.89) is independently due to Feller [240], Miyadera [435], and Phillips [496].
For the nonlinear theory we refer to Barbu [58] and Vrabie [597]. The main result here is Theorem 3.2.93 proved by Crandall–Liggett [166].
3.3: The classical degree theory was introduced by Brouwer [115] (foreshadowed by earlier work of Kronecker) and is an algebraic count of the solutions of y_0 = f(x), where f : U −→ R^N is a continuous map, U ⊆ R^N a bounded open set, and y_0 ∈ R^N. Here we present the analytical approach due to Heinz [288] in the derivation of Brouwer's degree. Other analytical versions of the theory can be found in Nagumo [450] and Schwartz [545]. The proof of Borsuk's theorem (see Theorem 3.3.28) is due to Gromes [274]. Theorem 3.3.32 is independently due to Ljusternik–Schnirelmann [393] and Borsuk [91] and is useful in the study of the so-called Ljusternik–Schnirelmann category, which in turn plays a basic role in the critical point theory developed by Ljusternik–Schnirelmann. The classical degree theory of Brouwer was extended by Leray–Schauder [375] to a class of mappings in an infinite-dimensional Banach space. Their class of mappings had the form I − ϕ with ϕ compact (compact perturbations of the identity). The uniqueness of Brouwer's, and of course of the Leray–Schauder, degree was established by Amann–Weiss [15] and Führer [253]. Since 1934 there have been various efforts to move beyond the Leray–Schauder framework (i.e., beyond compact perturbations of the identity). The extension to maps I − ϕ with ϕ γ-condensing is due to Nussbaum [464]. The extensions of degree theory to operators of monotone type are due to Browder [122, 123]. Some of the results of the Leray–Schauder degree theory can be cast in terms of essential and inessential maps (see Granas [273] and Dugundji–Granas [212]).

DEFINITION 3.7.1 Let X be a Banach space, C ⊆ X a nonempty convex set, E ⊆ C a nonempty bounded set and A ⊆ E closed in E. By K_A(E, C) we denote the set of all compact maps ϕ : E −→ C such that ϕ|_A is fixed point free. A map ϕ ∈ K_A(E, C) is said to be essential provided every ψ ∈ K_A(E, C) such that ϕ|_A = ψ|_A has a fixed point. A map that is not essential is called inessential.

REMARK 3.7.2 In geometric terms, a compact map ϕ : E −→ C is essential if the graph of ϕ|_A does not intersect the diagonal D ⊆ E × C, but the graph of every compact map ψ : E −→ C that coincides with ϕ on A must cross the diagonal D.

The books of Deimling [188], Denkowski–Migórski–Papageorgiou [195], Fonseca–Gangbo [252], Lloyd [396], and Zeidler [620] have detailed presentations of the Brouwer, Leray–Schauder and Nussbaum–Sadovskii degree theories.

3.4: Theorem 3.4.3 is due to Banach [57]. In applying the theorem, sometimes we need to renorm the Banach space equivalently, in order to make the appropriate map a k-contraction. This renorming trick was first used by Bielecki [75] (see also Gasiński–Papageorgiou [258], Example 7.1.4(b)). Theorem 3.4.11 is due to Edelstein [217]. Theorem 3.4.23 was proved independently by Browder [116] and Göhde [271] and was the first fixed point theorem for nonexpansive maps. Immediately afterwards Kirk [354] observed that their result could be extended using the notion of normal structure (see Definition 3.4.24) and proved Theorem 3.4.31. The normal structure of a convex set C ⊆ X was introduced by Brodskii–Milman [113], who used it to study fixed points of isometries. The asymptotic center of {x_n}_{n≥1} relative to C (see Definition 3.4.34(a)) was introduced by Edelstein [218], and the notions of inward set and weakly inward map (see Definition 3.4.34(b)) were introduced by Halpern
in his thesis (see Halpern–Bergman [283]), who also established Theorem 3.4.38. Additional results can be found in Caristi–Kirk [128]. Detailed expositions of metric fixed point theory can be found in Dugundji–Granas [212], Goebel–Kirk [269], Smart [558], and Zeidler [619].

3.5: Theorem 3.5.3 is one of the oldest and best known results of topology. It was proved for N = 3 by Brouwer [114]. The case of general N ≥ 1 was proved by Hadamard [281], who in his proof used the Kronecker index. A proof of the theorem based on a combinatorial technique was provided by Knaster–Kuratowski–Mazurkiewicz [358]. The infinite-dimensional generalization of Theorem 3.5.6 (the Perron–Frobenius theorem) is the so-called Krein–Rutman theorem (see Zeidler [619, p. 290]).

THEOREM 3.7.3 If X is an OBS with a solid order cone K (i.e., int K ≠ ∅) which is total (i.e., X = K − K), A ∈ L_c(X), A is positive (i.e., A(K) ⊆ K) and the spectral radius r(A) = lim_{n→∞} ‖A^n‖_L^{1/n} > 0, then r(A) > 0 is an eigenvalue of A with positive eigenvector.

Concerning Theorem 3.5.7, we should mention that, in contrast, in infinite-dimensional Banach spaces such a retraction does exist. Example 3.5.12 is due to Kakutani [338], whereas Theorem 3.5.14 was proved by Schauder [540] and Theorem 3.5.10 is due to Borsuk [90]. Theorem 3.5.16 (the nonlinear alternative theorem) formalizes the well-known informal principle which roughly speaking says that the existence of a priori bounds implies existence. A direct elementary proof of it was given by Schaeffer [538] (see Corollary 3.5.18). Theorem 3.5.20 is due to Sadovskii [534]. Historical forerunners are the fixed point theorems of Darbo [176] and Krasnoselskii [361, 362]. The topological fixed point theory can be found in the books of Brown [124], Dugundji–Granas [212], Istratescu [329], Smart [558], and Zeidler [619].

3.6: The various types of order cones were first considered by Krasnoselskii [361]. Deimling [188] and Guo–Lakshmikantham [277] presented them in detail. The fixed point index was first used systematically by Amann [16]. Fixed point theorems using the order structure can be found in Amann [16], Guo [276], Guo–Sun [278], Leggett–Williams [374], Sun [567], and Sun–Sun [566]. Also we refer to the lecture notes of Amann [16] and the books of Deimling [188], Guo–Lakshmikantham [277], and Heikkila–Lakshmikantham [287].
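In finite dimensions Theorem 3.7.3 reduces to the classical Perron–Frobenius theorem, and the positive eigenvector corresponding to the spectral radius can be approximated by power iteration. The short Python sketch below is only an illustration of ours (the matrix A is an arbitrary example with nonnegative entries, so that A(K) ⊆ K for the cone K of componentwise nonnegative vectors).

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # nonnegative entries: A maps the positive cone into itself

x = np.ones(2)               # start inside the cone
for _ in range(100):         # power iteration
    y = A @ x
    r = np.linalg.norm(y)
    x = y / r

print(r, x)                  # r approximates the spectral radius r(A); x is a positive eigenvector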
4 Critical Point Theory and Variational Methods
Summary. The variational method in the study of nonlinear boundary value problems is based on the critical point theory, which provides minimax characterizations of the critical values over certain homotopically stable families of sets. Using the deformation method, we derive the main results of the smooth critical point theory and we also present results guaranteeing the existence of multiple critical points. We present an extension of the theory (perturbation of C¹-functionals by proper, convex, lsc functionals) which is suitable in the study of problems with unilateral constraints (variational inequalities). Then we present the Ljusternik–Schnirelmann theory which leads naturally to nonlinear eigenvalue problems and to the study of the Ljusternik–Schnirelmann category and of the Krasnoselskii genus (two instances of topological indices). The Ljusternik–Schnirelmann theory allows us to develop the spectral properties of the Dirichlet and Neumann Laplacian and p-Laplacian and of the periodic scalar p-Laplacian. Also, we examine further certain abstract nonlinear eigenvalue problems. We conclude with a look at the main features of bifurcation theory.
Introduction
In this chapter we develop the abstract results that are necessary in order to implement the variational method in the study of nonlinear boundary value problems. In Section 4.1 we develop the critical point theory for smooth functionals. This is the main tool in the variational method. This method consists of trying to find solutions of a given equation, by looking for stationary points of a real functional defined on the function space in which the solution of the equation is to lie. The given equation is the Euler–Lagrange equation satisfied by a stationary point. This functional (called energy or Euler functional of the problem) is often unbounded (indefinite functional) and so one cannot look for (global) maxima and minima. Instead one seeks saddle points characterized by a minimax expression. The approach that we follow to characterize the critical values of the functional is based on deformation arguments. We derive all the classical results such as the mountain pass theorem, the saddle point theorem, and the generalized mountain pass theorem. We also prove results on the existence of multiple critical points under the local linking
condition or under symmetry conditions. Finally we present an extension of the theory to functionals that are perturbations of a smooth functional by a proper, lower semicontinuous, and convex functional.
In Section 4.2 we present the Ljusternik–Schnirelmann theory for smooth functionals defined on a Banach space. When the functional is quadratic, the corresponding eigenvalue problem is linear and for such problems the eigenvalues have minimax characterizations, known as Courant's minimax principle. The Ljusternik–Schnirelmann theory is a method to advance critical point theory beyond quadratic functionals and independently of nondegeneracy considerations, which are present in Morse theory (not covered in this volume). To produce topological analogues of the minimax principles of the linear theory, we need to introduce substitutes for the notion of dimension. This is achieved using topological indices. We present two such indices, the Ljusternik–Schnirelmann category and the genus. Using them we state the basic results of the Ljusternik–Schnirelmann theory.
Section 4.3 investigates the spectral properties of the Laplacian and of the p-Laplacian under both Dirichlet and Neumann boundary conditions. We show that for the negative Laplacian we can have a full description of its eigenvalues and variational characterizations of them (via Courant's minimax principle). The situation is different with the negative p-Laplacian. Using the Ljusternik–Schnirelmann theory (Section 4.2), we produce an increasing sequence of eigenvalues converging to +∞. However, we do not know whether these eigenvalues exhaust all the eigenvalues. Nevertheless, we are able to fully describe the first eigenvalue and also show that the second eigenvalue and the second variational eigenvalue coincide.
In Section 4.4, we use the Ljusternik–Schnirelmann theory to study abstract nonlinear eigenvalue problems. The notion of a sphere-like constraint is important in these considerations and so we investigate it further.
Finally, Section 4.5 deals with bifurcation theory. Bifurcation theory is concerned with the structure of the solutions of the parametrized equation ϕ(λ, x) = 0 as a function of the parameter λ near a solution (λ0, x0) that is also a singular point of the map x −→ ϕ(λ0, x) (hence ϕ_x(λ0, x0) is not invertible and so we cannot use the implicit function theorem). Using some degree-theoretic arguments we prove local and global bifurcation theorems.
4.1 Critical Point Theory

One of the typical features of modern physics is the characterization of phenomena by variational principles. Thus many phenomena can be understood in terms of the minimization of an energy functional ϕ(x) over an appropriate class of functions. This leads naturally to the study of critical points of a functional ϕ (i.e., the solutions of the gradient operator equation ∇ϕ(x) = ϕ′(x) = 0). There are two types of critical points of ϕ: the local extrema (local minima and local maxima) and the saddle points. It is well known that in a finite-dimensional Banach space X any continuous functional defined on a bounded closed subset of X attains its infimum. This is not in general true in infinite-dimensional Banach spaces, because bounded closed subsets are not necessarily compact. In fact this lack of minimizers for such functionals was the starting point of Weierstrass's criticism of Riemann's approach to potential theory (see Courant [163]). To guarantee minimizers (or maximizers) we need two types of properties of the functional ϕ. One is quantitative and requires
that the sublevel sets {x ∈ X : ϕ(x) ≤ λ} are relatively compact in some useful topology on X (coercivity property), and the other is qualitative and requires that ϕ is lower semicontinuous (or upper semicontinuous) for the same topology on X. A coercive functional is bounded below. But many natural functionals ϕ that we encounter may not be bounded at all, neither from above nor from below. Thus we need to look for results where other types of critical points may be identified.

DEFINITION 4.1.1 Let X be a Banach space and ϕ ∈ C¹(X). We say that x ∈ X is a critical point of ϕ, if ϕ′(x) = 0. We say that c ∈ R is a critical value of ϕ, if there is a critical point x ∈ X such that c = ϕ(x). We say that c ∈ R is a regular value of ϕ, if it is not a critical value of ϕ.

We introduce the following notation, which is used throughout this chapter. For every λ ∈ R ∪ {+∞} and every c ∈ R, we set
ϕ^λ = {x ∈ X : ϕ(x) ≤ λ},
K = {x ∈ X : ϕ′(x) = 0} (the set of critical points of ϕ),
K_c = {x ∈ K : ϕ(x) = c} (the set of critical points of ϕ at level c ∈ R).
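As a simple illustration of this notation (our own, not part of the original text), take X = R and ϕ(x) = x³ − 3x. Then ϕ′(x) = 3x² − 3, so K = {−1, 1}, the critical values are ϕ(−1) = 2 and ϕ(1) = −2, K_2 = {−1}, K_{−2} = {1}, and every c ∉ {−2, 2} is a regular value. For the sublevel sets one has, for instance, ϕ^0 = {x ∈ R : x³ − 3x ≤ 0} = (−∞, −√3] ∪ [0, √3].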
We want to get information about the set K. To achieve this we employ the so-called deformation method.

DEFINITION 4.1.2 A continuous function h : [0, 1] × X −→ X such that h(0, ·) = I (i.e., h(0, x) = x for all x ∈ X) is called a deformation of X. A family S ⊆ 2^X is said to be deformation invariant, if h(1, A) ∈ S for every A ∈ S and every deformation h(·, ·) of X.

We look at deformations h which effectively decrease the values of ϕ on X \ K and try to determine critical values of ϕ by considering minimax expressions of the form
c = inf_{A∈S} sup_{x∈A} ϕ(x)
for various deformation invariant classes S. The construction of suitable deformations is the most technical part of this method. The guiding condition used to produce the appropriate deformations is the following one.
(D): for h : [0, 1] × X −→ X a deformation of X, we have:
(1) For all −∞ < a < b < +∞ with ϕ^{−1}([a, b]) ∩ K = ∅, there exists t_0 > 0 such that h(t_0, ϕ^b) ⊆ ϕ^a.
(2) If c ∈ R and U is a neighborhood of K_c, there exist t_0 > 0 and a, b ∈ R with a < c < b for which we have h(t_0, ϕ^b) ⊆ U ∪ ϕ^a.

REMARK 4.1.3 Condition (D)(1) says that h effectively decreases the values of ϕ in X \ K. So nothing can happen topologically between the levels a, b, if the interval [a, b] does not contain any critical values. On the other hand, condition (D)(2) says that if we start a little above a critical level c, then we either bypass the “critical” neighborhood U and, as before, reach a harmless level a < c, or we end up in U, where topologically interesting things may happen.
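To fix ideas, here is an elementary illustration of our own rather than an example from the text. Take X = R and ϕ(x) = x²/2, so K = {0}, and let h(t, x) = e^{−t}x. Then h is a deformation of X (h(0, ·) = I) and ϕ(h(t, x)) = e^{−2t}ϕ(x) ≤ ϕ(x), so h decreases the values of ϕ away from the critical point. Moreover, for levels 0 < a < b we have ϕ^{−1}([a, b]) ∩ K = ∅ and h(t_0, ϕ^b) ⊆ ϕ^a as soon as e^{−2t_0}b ≤ a, which is exactly the behavior required in condition (D)(1).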
The deformation theorem which we prove in the sequel shows how to construct deformations satisfying condition (D) above. The idea is to use the gradient or gradient-like flows on X; that is, some kind of method of steepest descent. To compensate for the lack of compactness in the ambient space X, we introduce the following compactness-type condition.

DEFINITION 4.1.4 Let X be a Banach space and ϕ ∈ C¹(X).
(a) We say that ϕ satisfies the Palais–Smale condition at level c ∈ R (the PS_c-condition for short), if every sequence {x_n}_{n≥1} ⊆ X such that
ϕ(x_n) −→ c and ϕ′(x_n) −→ 0 in X∗,
has a strongly convergent subsequence. If this is true at every level c ∈ R, then we simply say that ϕ satisfies the PS-condition.
(b) We say that ϕ satisfies the Cerami condition at level c ∈ R (the C_c-condition for short), if every sequence {x_n}_{n≥1} ⊆ X such that
ϕ(x_n) −→ c and (1 + ‖x_n‖)‖ϕ′(x_n)‖_{X∗} −→ 0,
has a strongly convergent subsequence. If this is true at every level c ∈ R, then we simply say that ϕ satisfies the C-condition.

REMARK 4.1.5 Clearly the C_c-condition is weaker than the PS_c-condition. These conditions are fairly strong and are not satisfied even by some “nice” functions.

EXAMPLE 4.1.6 (a) Let X = R and ϕ(x) = e^x. Because ϕ′(x) = e^x, we see that ϕ does not satisfy the PS_0- and C_0-conditions. It trivially satisfies the PS_c- and C_c-conditions for c ≠ 0.
(b) Let X = R and ϕ(x) = c ∈ R. Then clearly ϕ does not satisfy the PS- and C-conditions.
(c) Let X = R and ϕ(x) = sin x. If x_n = (4n + 1)π/2, n ≥ 1, then ϕ(x_n) = 1 and ϕ′(x_n) = 0 for all n ≥ 1. However, {x_n}_{n≥1} ⊆ R has no convergent subsequence.

When the functional ϕ is bounded below, then as we show the situation simplifies and the compactness conditions imply weak coercivity (i.e., ϕ(x) −→ +∞ as ‖x‖ −→ ∞; see Definition 3.2.14(b)).

PROPOSITION 4.1.7 If X is a Banach space, ϕ ∈ C¹(X) is bounded below, −∞ < m = inf_X ϕ < +∞, and ϕ satisfies the C_m-condition, then the set ϕ^{m+ϑ} is bounded for some ϑ > 0.
PROOF: We argue indirectly. Suppose that ϕ^{m+ϑ} is unbounded for all ϑ > 0. Then we can find a sequence {y_n}_{n≥1} ⊆ X such that
m ≤ ϕ(y_n) ≤ m + 1/n and ‖y_n‖ ≥ n.
Apply Theorem 2.4.25 (with ε = 1/n, h(r) = r, x_0 = y_n, and λ = 1/√n) to obtain a sequence {x_n}_{n≥1} ⊆ X such that
ϕ(x_n) ↓ m, (1 + ‖x_n‖)‖ϕ′(x_n)‖ −→ 0 and ‖x_n − y_n‖ ≤ r_n,
where r_n > 0 is such that ln(1 + r_n) = 1/√n. Clearly r_n −→ 0 and n ≤ ‖y_n‖ ≤ r_n + ‖x_n‖, hence ‖x_n‖ −→ +∞. But this contradicts the fact that ϕ satisfies the C_m-condition.

PROPOSITION 4.1.8 If X is a Banach space, ϕ ∈ C¹(X), and c ∈ R is such that ϕ^ϑ is unbounded for ϑ > c and ϕ^ϑ is bounded for ϑ < c, then there exists {x_n}_{n≥1} ⊆ X such that
ϕ(x_n) −→ c, (1 + ‖x_n‖)ϕ′(x_n) −→ 0 in X∗ and ‖x_n‖ −→ +∞.   (4.1)
PROOF: By hypothesis, for every n ≥ 1 we can find R_n ≥ n such that
ϕ^{c−(1/n)} ⊆ B_{R_n}(0) = {x ∈ X : ‖x‖ < R_n}.   (4.2)
Set C_n = X \ B_{R_n}(0), ϕ_n = ϕ|_{C_n} : C_n −→ R and note that
m_n = inf_{C_n} ϕ_n ≥ c − 1/n (see (4.2)).   (4.3)
Because by hypothesis ϕ^{c+(1/n)} is unbounded, we can find y_n ∈ X such that
ϕ(y_n) ≤ c + 1/n and ‖y_n‖ ≥ R_n + 1 + 1/√n for all n ≥ 1.   (4.4)
Therefore we have y_n ∈ C_n and from (4.3) and (4.4) it follows that
ϕ(y_n) ≤ c + 1/n ≤ m_n + 2/n for all n ≥ 1.   (4.5)
Applying Theorem 2.4.25 (with ε = 2/n, h(r) = r, x_0 = y_n, and λ = 1/√n) we obtain a sequence {x_n}_{n≥1} such that x_n ∈ C_n and
c − 1/n ≤ m_n ≤ ϕ(x_n) ≤ ϕ(y_n) ≤ c + 1/n ≤ m_n + 2/n for all n ≥ 1 (see (4.5)),   (4.6)
(1 + ‖x_n‖)‖ϕ′(x_n)‖ ≤ 2/√n for all n ≥ 1   (4.7)
and
‖y_n − x_n‖ ≤ r_n for all n ≥ 1, where r_n > 0 is such that ln(1 + r_n) = 1/√n.   (4.8)
From (4.4) and (4.6) through (4.8) we conclude that the sequence {x_n}_{n≥1} ⊆ X satisfies (4.1).

COROLLARY 4.1.9 If X is a Banach space, ϕ ∈ C¹(X) satisfies the PS_c-condition for some c ∈ R and ϕ^ϑ is bounded for every ϑ < c, then ϕ^{c+η} is also bounded for some η > 0.

REMARK 4.1.10 Evidently ϕ^c is bounded too. We can restate the corollary by assuming that ϕ^c is bounded. Of course this corollary generalizes Proposition 4.1.7.
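Before turning to Theorem 4.1.11 below, we make Example 4.1.6(a) concrete with a small numerical sketch of our own (the sequence x_n = −n is our illustrative choice): for ϕ(x) = e^x on X = R, which is bounded below but not coercive, the sequence below is a Palais–Smale sequence at the level c = 0 = inf_R ϕ with no convergent subsequence, so the PS_0- and C_0-conditions fail; this is exactly the situation described by Theorem 4.1.11 with ξ = 0.

import numpy as np

phi  = lambda x: np.exp(x)            # phi(x) = e^x, bounded below, not coercive
dphi = lambda x: np.exp(x)            # phi'(x) = e^x

x = -np.arange(1.0, 11.0)             # the sequence x_n = -n
print(phi(x))                         # phi(x_n) -> 0, the candidate critical level
print(dphi(x))                        # phi'(x_n) -> 0 as well
# phi(x_n) -> 0 and phi'(x_n) -> 0 while |x_n| -> infinity:
# no subsequence converges, so the PS-condition fails at c = 0.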
THEOREM 4.1.11 If X is a Banach space and ϕ ∈ C¹(X) is bounded below but not coercive, then ϕ does not satisfy the C_ξ-condition with ξ = sup{ϑ ∈ R : ϕ^ϑ is bounded}.

PROOF: Let C = {ϑ ∈ R : ϕ^ϑ is bounded}. Because by hypothesis ϕ is bounded from below, we have that (−∞, m) ⊆ C where m = inf_X ϕ (recall that the empty set is bounded; see Kuratowski [368, p. 107]). So C ≠ ∅ and we can define ξ = sup C. Because ϕ is not coercive, ξ < +∞. Then for all η > ξ we have that ϕ^η is unbounded. Invoking Proposition 4.1.8, we conclude that ϕ does not satisfy the C_ξ-condition.

A remarkable consequence of Theorem 4.1.11 is the following result.

THEOREM 4.1.12 If X is a Banach space, ϕ ∈ C¹(X) is bounded below and satisfies the C-condition, then ϕ is weakly coercive (i.e., ϕ(x) −→ +∞ as ‖x‖ −→ ∞).

REMARK 4.1.13 In Theorem 4.1.11, we can alternatively define ξ = inf{η ∈ R : ϕ^η is unbounded}. In general, for any functional ϕ : X −→ R, the set C = {ϑ ∈ R : ϕ^ϑ is bounded} is a left half-line, either open or closed, and we may have C = ∅ or C = R, the latter case being equivalent to the weak coercivity of ϕ.

THEOREM 4.1.14 If X is a Banach space and ϕ ∈ C¹(X) is bounded below, then the PS- and C-conditions are equivalent.

PROOF: Evidently we only need to show that the C-condition implies the PS-condition. To this end, let {x_n}_{n≥1} ⊆ X be such that |ϕ(x_n)| ≤ M for some M > 0, all n ≥ 1, and ϕ′(x_n) −→ 0. From Theorem 4.1.12 we know that ϕ is weakly coercive. Therefore it follows that {x_n}_{n≥1} ⊆ X is bounded. Hence (1 + ‖x_n‖)‖ϕ′(x_n)‖_{X∗} −→ 0. But ϕ satisfies the C-condition. So {x_n}_{n≥1} admits a strongly convergent subsequence and this proves the theorem.

As we already indicated, the deformation approach is implemented by considering the negative gradient or gradient-like flows. If we are in a pivot Hilbert space (i.e., H = H∗), the negative gradient flow is a suitable choice, because it occurs within H. However, if we are in a general Banach space, the gradient has values in X∗ and so the negative gradient flow occurs there, hence it is not helpful. In addition, even in the Hilbert space situation, for the steepest descent method to work we need more than a C¹(H)-functional (additional smoothness on the functional ϕ). For this reason we introduce the following substitute for the gradient vector field of ϕ.

DEFINITION 4.1.15 We say that u : X \ K −→ X is a pseudogradient vector field for ϕ, if u is locally Lipschitz and for every x ∈ X \ K, we have
‖ϕ′(x)‖²_{X∗} ≤ ⟨ϕ′(x), u(x)⟩ and ‖u(x)‖_X ≤ 2‖ϕ′(x)‖_{X∗}.

REMARK 4.1.16 More generally, for 0 < α < β, we can require that
α‖ϕ′(x)‖²_{X∗} ≤ ⟨ϕ′(x), u(x)⟩ and ‖u(x)‖_X ≤ β‖ϕ′(x)‖_{X∗} for all x ∈ X \ K.
However, to simplify things we have decided to take α = 1 and β = 2, as is done in most cases in the literature. If X = H is a Hilbert space with H = H∗ and ϕ ∈ C²(H), we can take
u(x) = ((α + β)/2) (1/‖ϕ′(x)‖²) ϕ′(x).
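For orientation we add a simple observation of our own (it is not part of the original remark): when X = H is a Hilbert space identified with its dual, the gradient itself, u(x) = ∇ϕ(x), already satisfies the two inequalities of Definition 4.1.15, since ⟨ϕ′(x), ∇ϕ(x)⟩_H = ‖∇ϕ(x)‖²_H and ‖∇ϕ(x)‖_H ≤ 2‖∇ϕ(x)‖_H. What may fail for a merely C¹ functional is the local Lipschitz continuity of x ↦ ∇ϕ(x); this is why additional smoothness (for instance ϕ ∈ C²(H)) is invoked in the remark above, and why Theorem 4.1.18 below constructs a locally Lipschitz substitute in the general Banach space setting.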
First we establish the existence of a pseudogradient vector field. To do this, we need the following lemma.

LEMMA 4.1.17 If X is a Banach space, Y is a metric space, and F : Y −→ 2^X \ {∅} is a multifunction with convex values such that for every y ∈ Y we can find a neighborhood U of y such that ∩_{y′∈U} F(y′) ≠ ∅, then we can find a locally Lipschitz map u : Y −→ X such that u(y) ∈ F(y) for all y ∈ Y.

PROOF: By hypothesis, for every y ∈ Y we can find U(y) an open neighborhood of y such that
∩_{y′∈U(y)} F(y′) ≠ ∅.
The family {U(y)}_{y∈Y} is an open cover of Y and, Y being a metric space, it is paracompact. So we can find a locally finite open refinement {V_i}_{i∈I}. There is a partition of unity {g_i}_{i∈I} subordinated to the open covering {V_i}_{i∈I}. For every i ∈ I, we choose z_i ∈ ∩_{y′∈V_i} F(y′) and we define a locally Lipschitz map u : Y −→ X by
u(y) = Σ_{i∈I} g_i(y) z_i.
Now given y ∈ Y, we can find {V_{i_k}}_{k=1}^n ⊆ {V_i}_{i∈I} such that y ∈ V_{i_k}, k ∈ {1, . . . , n} (because the cover is locally finite) and so we have
u(y) = Σ_{k=1}^n g_{i_k}(y) z_{i_k} with Σ_{k=1}^n g_{i_k}(y) = 1.
Because y ∈ V_{i_k} for all k ∈ {1, . . . , n}, z_{i_k} ∈ F(y) and the latter is convex. So we conclude that u(y) ∈ F(y) and u is a locally Lipschitz selector of F.

THEOREM 4.1.18 If X is a Banach space and ϕ ∈ C¹(X), then there exists a pseudogradient vector field for ϕ.

PROOF: We let Y = X \ K = {x ∈ X : ϕ′(x) ≠ 0} (the set of regular points of ϕ). For every y ∈ Y, let
F(y) = {v ∈ X : ‖v‖_X ≤ 2‖ϕ′(y)‖_{X∗} and ‖ϕ′(y)‖²_{X∗} ≤ ⟨ϕ′(y), v⟩}.
Clearly F(y) is a convex set. For every y ∈ Y, from the definition of the dual norm we can find x ∈ X with ‖x‖_X ≤ 1 such that
(4/5)‖ϕ′(y)‖_{X∗} ≤ ⟨ϕ′(y), x⟩.   (4.9)
We set v = (5/3)‖ϕ′(y)‖_{X∗} x. Then ‖v‖_X ≤ (5/3)‖ϕ′(y)‖_{X∗} and (4/3)‖ϕ′(y)‖²_{X∗} ≤ ⟨ϕ′(y), v⟩ (see (4.9)). Because ϕ ∈ C¹(X), we can find an open neighborhood U of y such that
‖v‖_X < 2‖ϕ′(y′)‖_{X∗} and ⟨ϕ′(y′), v⟩ > ‖ϕ′(y′)‖²_{X∗} for all y′ ∈ U,
⇒ v ∈ ∩_{y′∈U} F(y′).
So we can use Lemma 4.1.17 and produce a locally Lipschitz map u : Y −→ X such that u(y) ∈ F(y) for all y ∈ Y. Evidently this is the pseudogradient vector field for ϕ.

Now we can prove the so-called deformation theorem that leads to minimax characterizations of the critical values of ϕ.

THEOREM 4.1.19 If X is a Banach space and ϕ ∈ C¹(X) satisfies the C_c-condition for some c ∈ R, then for every ε_0 > 0, every neighborhood U of K_c (if K_c = ∅, then U = ∅) and every λ > 0, we can find ε ∈ (0, ε_0) and a continuous map h : [0, 1] × X −→ X (a continuous homotopy) such that for all (t, x) ∈ [0, 1] × X, we have
(a) ‖h(t, x) − x‖ ≤ λ(1 + ‖x‖)t;
(b) ϕ(h(t, x)) ≤ ϕ(x);
(c) h(t, x) ≠ x ⇒ ϕ(h(t, x)) < ϕ(x);
(d) |ϕ(x) − c| ≥ ε_0 ⇒ h(t, x) = x;
(e) h(1, ϕ^{c+ε}) ⊆ ϕ^{c−ε} ∪ U.
PROOF: Because ϕ satisfies the Cc -condition, we readily verify that Kc is compact (it may be empty). So we can find r > 0 such that B3r (Kc ) = {x ∈ X : d(x, Kc ) < 3r} ⊆ U . We show that there exist ε1 > 0 and ϑ > 0 such that c − 2ε1 ≤ ϕ(x) ≤ c + 2ε1
for all x ∈ / Br (Kc ) ⇒ (1 + x)ϕ (x)X ∗ ≥ ϑ. (4.10)
Suppose that this is not possible. Then we can find {xn }n≥1 ⊆ X such that ϕ(xn ) −→ c
and
(1 + xn )ϕ (xn )X ∗ −→ 0
as n → ∞.
(4.11)
Because ϕ satisfies the Cc -condition, from (4.11) and by passing to a subsequence if necessary, we may assume that xn −→ x in X. Then ϕ(x) = c, x ∈ / Br (Kc ) and ϕ(x) = 0, a contradiction. Let A = {x∈ X : |ϕ(x) − c| ≥ 2ε1 } ∩ Br (Kc ) and B = {x ∈ X : |ϕ(x) − c| ≤ c ε1 } ∩ B2r (Kc ) . These are two nonempty disjoint closed sets. So we can find β : X −→ [0, 1] a locally Lipschitz function such that β A = 0 and β B = 1 (consider,
e.g., β(x) = d(x, B) d(x, A) + d(x, B) ). Choose 0 < η < 1 such that eη ≤ λ. If u : X \ K −→ X is the pseudogradient vector field for ϕ obtained in Theorem 4.1.18, we define u(x) −ϑηβ(x) u(x) if |ϕ(x) − c| ≤ 2ε1 and x ∈ / Br (Kc ) 2 V (x) = 0 otherwise. Evidently V : X −→ X is a locally Lipschitz vector field. Also if |ϕ(x) − c| ≤ 2ε1 and x ∈ / Br (Kc ), from (4.11) and Definition 4.1.15, we have
4.1 Critical Point Theory V (x) ≤ ϑη
1 1 ≤ η(1 + x) ≤ ϑη u(x) ϕ (x)|xd
275
for all x ∈ X (4.12)
and
ϕ (x), V (x) = −ϑηβ(x)
ϕ (x), u(x) ≤ u(x)2
1 ≤ − ϑηβ(x) 4
ϕ (x)2X ∗ −ϑηβ(x) u(x)2
for all x ∈ X.
(4.13)
We consider the following Cauchy problem in the Banach space X: ) dξ(t)
= V ξ(t) ξ(0) = x dt
a.e. on [0, 1]
* .
(4.14)
Because V is a locally Lipschitz vector field on X with sublinear growth (see (4.12)), problem (4.14) admits a unique global solution ξ(x) : [0, 1] −→ X. Then if we set h(t, x) = ξ(x)(t), because of the continuous dependence of the solution ξ(x) on the initial condition x, we see that the map h : [0, 1] × X −→ X is a continuous homotopy. Note that from the definition of V we see that if |ϕ(x) − c| ≥ 2ε0 , then h(t, x) = x for all t ∈ [0, 1]. So we have proved statement (d) of the theorem. Also for all (t, x) ∈ [0, 1] × X, we have ∂h(t, x) ∂
ϕ h(t, x) = ϕ h(t, x) , ∂t ∂t
= ϕ h(t, x) , V h(t, x) ≤ 0 (see (4.13) and (4.14)), ⇒
(4.15)
ϕ h(t, x) ≤ ϕ(x)
for all (t, x) ∈ [0, 1] × X.
So we have proved statement (b) of the theorem. Moreover, note that
1 ∂
ϕ h(t, x) ≤ ϕ h(t, x) , V h(t, x) ≤ − ϑηβ h(t, x) ∂t 4 see (4.13) and (4.15)). Therefore, if h(t, x) = x, we see that
ϕ h(t, x) < ϕ(x), which proves statement (c) of the theorem. Integrating (4.14), we obtain
t
h(t, x) − x ≤
V h(s, x) ds
0
t
≤η
1 + h(s, x) ds
(see (4.13))
0 t
h(s, x) − xds + η(1 + x) t.
≤η 0
Invoking Gronwall’s inequality, from (4.16) we infer that
(4.16)
276
4 Critical Point Theory and Variational Methods
h(t, x) − x ≤ η(1 + x) t + η 2 (1 + x) t
t
eηs ds 0
= η(1 + x) t eηt ≤ λ(1 + x) t (recall the choice of 0 < η < 1 and that t ∈ [0, 1]). This proves statement (a) of the theorem.
Moreover, because for 0 ≤ t1 ≤ t2 ≤ 1, we have h(t2 , x1 ) = h t2 − t1 , h(t1 , x) , we also have
h(t2 , x) − h(t1 , x) ≤ λ 1 + h(t1 , x) (t2 − t1 ). (4.17) Finally let R > 0 and 0 < ε ≤ ε1 be such that B2r (Kc ) ⊆ BR (0), 8ε ≤ ϑη
8λ(1 + R)ε ≤ ϑηr.
and
(4.18)
To prove statement (e) of we proceed by contradiction. So let
the theorem x ∈ ϕc+ε and assume that ϕ h(1, x) > c − ε and h(1, x) ∈ / U . Note that because the homotopy h is ϕ-decreasing (see statement (b) of the theorem), we have
c − ε < ϕ h(t, x) ≤ c + ε for all t ∈ [0, 1]. Also h([0, 1], x) ∩ B2r (Kc ) = ∅. Indeed, if this is not the case, then from (4.13) we have
1 ϑη ≤ ϕ(x) − ϕ h(1, x) < 2ε. 4
(4.19) w
Because h(1, x) ∈ / U and B3r (Kc ) ⊆ U , we can find 0 ≤ t1 −→2 ≤ 1 such that
d h(t1 , x), Kc = 2r, d h(t2 , x), Kc = 3r and 2r < d h(t, x), Kc < 3r, for all t ∈ (t1 , t2 )
1 ⇒ ϑη(t2 − t1 ) ≤ ϕ h(t1 , x) − ϕ h(t2 , x) < 2ε (see (4.19)) 4
8ε ⇒ r ≤ h(t2 , x) − h(t1 , x) ≤ λ 1 + h(t1 , x) (t2 − t1 ) < λ(1 + R) ≤r ϑη see (4.17) and (4.18)), a contradiction. This proves statement (e) of the theorem. Now we use this deformation theorem to produce minimax characterizations of the critical values of ϕ ∈ C 1 (X). For this we need the following basic topological notion. DEFINITION 4.1.20 Let Y be a Hausdorff topological space, E0 ⊆ E and D be nonempty subsets of Y with D closed, and γ ∗ ∈ C(E0 , X). We say that the sets {E0 , E} and D link in Y via γ ∗ if and only if the following conditions are satisfied. (a) E0 ∩ D = ∅. (b) For any γ ∈ C(E, X) such that γ E = γ ∗ E , we have γ(E) ∩ D = ∅. 0
0
∗ ∗ The sets {E0 , E, D} are said to be linking sets via γ ∈ C(E0 , X) and if γ = I E , then we simply say that {E0 , E, D} are linking sets. 0
4.1 Critical Point Theory
EXAMPLE 4.1.21 (a) Let E0 ={0, x}, E ={tx : t ∈ [0, 1]}, and D =∂Br (0) with 0 < r < x. Then the sets {E0 , E, D} are linking sets. Indeed, first note that E0 ∩ D=∅. Also let γ ∈ C(E, X) be such that γ|E0 =IE0 . Hence γ(0)=0, γ(x)=x. Consider the function γ ∈ C([0, 1]) defined by γ(t)=γ(tx), t ∈ [0, 1]. Then γ(0)=0 and γ(1) = x > r. So by the intermediate value theorem, we can find t0 ∈ (0, 1) such that γ(t0 )=r, hence γ(t0 x)=r and so we conclude that γ(E) ∩ D = ∅. (b) Suppose X =Y ⊕ V with dim Y < +∞. Let E0 ={x ∈ Y : x = R}=∂BR (0) ∩ Y , E = {x ∈ Y : x ≤ R} = BR (0) ∩ Y and D = V as in the decomposition. We claim that the sets {E0 , E, D} are linking. To this end, let pY ∈ L(X) be the projection of X onto Y . It exists because Y is finite-dimensional. Let γ ∈ C(E, X) be such that γ| E0 = IE0 . In order to prove that γ(E) ∩ D = ∅, it suffices to show that 0 ∈ pY γ(E) . We consider the continuous homotopy h : [0, 1] × Y −→ Y defined by
h(t, x) = tpY γ(x) + (1 − t)x. Note that h(0, ·) = I Y , h(1, ·) = pY ◦ γ, and h(t, ·)E = IE0 for all t ∈ [0, 1]. 0 So from the homotopy invariance and normalization properties of Brouwer’s degree, we have
d pY ◦ γ, BR (0) ∩ Y, 0 = d I Y , BR (0) ∩ Y, 0 = 1
⇒ 0 ∈ pY γ(E) (existence property of Brouwer’s degree). (c) Let X =Y ⊕V with dim Y < +∞ and let v0 ∈ V with v0 =1. Suppose 0 < r < R1 and 0 < R2 and set E0 = λv0 + y : y ∈ Y and (0 ≤ λ ≤ R1 , y = R2 ) or (λ ∈ {0, R1 }, y ≤ R2 ) E = λv0 + y : y ∈ Y, 0 ≤ λ ≤ R1 and y ≤ R2 and
D = ∂Br (0) ∩ V.
Note that E is a cylinder in the space Y ⊕ Rv0 and E0 is the boundary of E in Y ⊕ Rv0 (the upper and lower bases corresponding to λ ∈ {0, R1 } and y ≤ R2 and the lateral surface corresponding to 0 ≤ λ ≤ R1 and y = R2 ). We claim that the sets {E0 , E, D} are linking. To show this, as before let pY ∈ L(X) be the projection operator of X onto Y . Also let γ ∈ C(E, X) be such that γ E = I E . To establish 0 0 the claimed linking, we need to show that there exists x ∈ E such that γ(x) = r
and pY γ(x) = 0. Consider the continuous homotopy h : [0, 1] × (Y × R) −→ Y × R defined by
h t, (y, λ) = tpY γ(x) + (1 − t)y, tγ(x) − pY γ(x) + (1 − t)λ − r , where x = λv0 + y ∈ Y ⊕ Rv0 " Y × R. If x = λv0 + y ∈ E0 , then
h t, (y, λ) = y, tx − y + (1 − t)λ − r = (y, λ − r). So if we identify E with a subset
of Y ×R (using the decomposition x = λv0 +y), we see that the Brouwer degree d h(t, ·), intE, 0 (the interior taken in Y × R) is well defined and from the homotopy invariance property of the degree we have
4 Critical Point Theory and Variational Methods
d h(0, ·), int E, 0 = d h(1, ·), int E, 0 .
So d h(0, ·), int E, 0 = 1, hence
Note that h 0, (y, λ) = (λ − r, y). d h(1, ·), int E, 0 = 1. So there exists x = λv0 + y ∈ E such that h 1, (y, λ) = 0, which implies
pY γ(x) = 0 and γ(x) − pY γ(x) = γ(x) = r which is what we wanted. Therefore the sets {E0 , E, D} are linking. (d) Let X =Y ⊕ V with dim Y < +∞, 0 < r < R and let v0 ∈ V be such that v0 = r and set E0 = x = λv0 + y : y ∈ Y and (λ ≥ 0, x = R) or (λ = 0, x ≤ R) E = x = λv0 + y : y ∈ Y, λ ≥ 0, x ≤ R and
D = ∂Br (0) ∩ V.
Note that E0 is the north hemisphere of a sphere of radius R in Y × Rv0 plus the circle of radius R in Y and it is the boundary of E in Y × Rv0 " Y × R, Arguing as in the previous example, we check that the sets {E0 , E, D} are linking. THEOREM 4.1.22 If X is a Banach space, ϕ ∈ C 1 (X), the sets {E0 , E,D} are linking via γ ∗ , α = sup < inf ϕ = β, Γ = γ ∈ C(E, X) : γ E = γ ∗ , c = 0 D E0
inf sup ϕ γ(x) , and ϕ satisfies the Cc -condition, then c ≥ β and c is a critical γ∈Γ x∈E
value of ϕ. PROOF: Because by hypothesis the sets {E0 , E, D} are linking via γ ∗ ∈ C(E0 , X), for every γ ∈ Γ we have γ(E) ∩ D = ∅ and so β ≤ c. To show that c is a critical value of ϕ, we argue indirectly. So suppose by contradiction that Kc = ∅. Set ε0 = β − α and U = ∅. Then by virtue of Theorem 4.1.19 (the deformation theorem), we can find 0 < ε < ε0 and a continuous homotopy h : [0, 1] × X −→ X that satisfies statements (a)−→(e) of Theorem 4.1.19. We choose γ ∈ Γ such that
ϕ γ(x) < c + ε for all x ∈ E. (4.20)
Set γ0 (x) = h 1, γ0 (x) for all x ∈ E. From the choice of ε0 > 0 and statement (d) in Theorem 4.1.19 we see that γ0 E = γ ∗ and so γ0 ∈ Γ. For all x ∈ E, we have 0
(4.21) ϕ γ0 (x) = ϕ h 1, γ(x) . Combining (4.20), (4.21), and statement (e) of Theorem 4.1.19, we infer that
ϕ γ0 (x) ≤ c − ε for all x ∈ E, a contradiction to the definition of c.
REMARK 4.1.23 It can be shown that if c = β, then there exists a critical point of ϕ in D. For details we refer to Gasi´ nski–Papageorgiou [259, p. 633]. Now with suitable choices of the linking sets, we can have the mountain pass, saddle point and generalized mountain pass theorems. We start with the mountain pass theorem.
4.1 Critical Point Theory
THEOREM 4.1.24 If X is a Banach space, ϕ ∈ C 1 (X), there exist x0 , x1 ∈ X and r > 0 such that x1 −x0 > r, max{ϕ(x0 ), ϕ(x0 )} < inf{ϕ(x)
: x−x0 = r}, Γ= {γ ∈ C([0, 1], X) : γ(0) = x0 , γ(1) = x1 }, c = inf max ϕ γ(t) , and ϕ satisfies the γ∈Γ t∈[0,1]
Cc -condition, then c ≥ inf{ϕ(x) : x − x0 = r} and c is a critical value of ϕ. PROOF: We consider the linking sets E0 = {x0 , x1 }, E = {(1 − t)x0 + tx1 : t ∈ [0, 1]}, D = ∂Br (x0 ) (see Example 4.1.21(a)) and we apply Theorem 4.1.22. Next we state the saddle point theorem. THEOREM 4.1.25 If X is a Banach space, X = Y ⊕ V with dim Y < +∞, ϕ ∈ C 1 (X), thereexists R : x ∈ ∂BR (0) ∩ Y } < inf{ϕ(x)
> 0 such that max{ϕ(x)
: x ∈ V } = β, Γ= γ ∈ C B R (0) ∩ Y, X : γ ∂B (0)∩Y =I∂B R (0)∩Y , c= inf sup ϕ γ(x) , R
γ∈Γ x∈E
and ϕ satisfies the Cc -condition, then c ≥ β and c is a critical value of ϕ.
PROOF: We consider the linking sets E0 =∂BR (0) ∩ Y , E =B R (0) ∩ Y , D =V (see Example 4.1.21(b)), and we apply Theorem 4.1.22. Finally we state the generalized mountain pass theorem. THEOREM 4.1.26 If X is a Banach space, X = Y ⊕ V with dim Y < +∞, ϕ ∈ C 1 (X) and there exist 0 < r < R and v ∈ V with v=1 such that if E0 , E, D are the sets from Example 4.1.21(c), have max ϕ(x) < inf ϕ(x) = β,
x∈E0
x∈D
Γ = γ ∈ C(E, X) : γ E = IE0 , c = inf sup ϕ γ(x) , and ϕ satisfies the Cc
0
γ∈Γ x∈E
condition, then c ≥ β and c is a critical value of ϕ. PROOF: Let {E0 , E, D} be the linking triple of Example 4.1.21(c) and apply Theorem 4.1.22. REMARK 4.1.27 An alternative version of Theorem 4.1.25 can be obtained using the linking triple {E0 , E, D} from Example 4.1.21(d). If ϕ is bounded below (definite functional), the situation is much simpler. THEOREM 4.1.28 If X is a Banach space, ϕ ∈ C 1 (X) and it is bounded below, m = inf ϕ and ϕ satisfies the Cm -condition, then m is a critical value of ϕ. X
PROOF: Invoking Theorem 2.4.25, we can find a sequence {xn }n≥1 ⊆X such that ϕ(xn ) ↓ m
as n → ∞
and
(1 + xn )ϕ (xn ) −→ 0.
By hypothesis ϕ satisfies the Cm -condition, thus by passing to a suitable subsequence if necessary, we may assume that xn −→ x in X. Then ϕ(xn ) ↓ ϕ(x) and so ϕ(x) = m = inf ϕ. Hence ϕ (x) = 0 and so m is a critical value of ϕ. X
Next we look for multiple critical points of a smooth functional ϕ. For this purpose we introduce the following notion.
4 Critical Point Theory and Variational Methods
DEFINITION 4.1.29 Let X be a Banach space, X = Y ⊕ V and ϕ ∈ C 1 (X). We say that ϕ has a local linking at 0, if there exists r > 0 such that ϕ(x) ≤ 0 if x ∈ Y, x ≤ r . ϕ(x) ≥ 0 if x ∈ V, x ≤ r REMARK 4.1.30 Evidently x = 0 is a critical point of ϕ. We produce two more, distinct nontrivial critical points of ϕ. We start with an auxiliary result. Here u(·) is the pseudogradient vector field (see Theorem 4.1.18). LEMMA 4.1.31 If X is a Banach space, ϕ ∈ C 1 (X) and satisfies the P Scondition, x0 ∈ X is the unique global minimum of ϕ
on X, y ∈ X is such that ϕ (y) = 0, and ϕ has no critical value in the interval ϕ(x0 ), ϕ(y) , then the negative pseudogradient flow defined by
u x(t) dx(t) , x(0) = y, (4.22) =−
dt u x(t) 2
exists for a maximal finite time T (y) > 0 and x T (y) = x0 . PROOF: Without any loss of generality we assume that x0 = 0 and ϕ(x0 ) = 0. Evidently problem (4.22) has a unique local solution x(t) and we have
dx(t)
dϕ x(t) = ϕ x(t) , dt dt 7
8 u x(t)
= ϕ x(t) , −
(see (4.22)) u x(t) 2X
ϕ x(t) 2X ∗ 1
≤− (see (4.1.15)). (4.23) ≤− 4 u x(t) 2X Let (0, b) be the maximal open interval of existence of the solution x(·). From (4.23), we have
ϕ(y) ≥ b + ϕ x(b) ≥ b (recall ϕ ≥ 0)
for all t ∈ (0, b). and 0 < ϕ x(t) < ϕ(y)
We show that ϕ x(t) −→ 0 = ϕ(x0 ) as t −→ b.
First suppose that we can find δ > 0 such that ϕ x(t) X ∗ ≥ δ for all t ∈ (0, b). From Definition 4.1.15, we have
u x(t) X ≥ ϕ x(t) X ∗ ≥ δ > 0 for all t ∈ [0, b] and so
b
0
dx(t) dt = dt
⇒ lim x(t) = x(b) t→b
b 0
1
dt < +∞, u x(t)
exists in X.
4.1 Critical Point Theory
Note that x(b)=0 or otherwise we could continue x(t) beyond t = b, a contradiction to the maximality of b.
Now suppose that we can find a sequence tn −→ b and ϕ x(tn ) X ∗ −→ 0. But by hypothesis ϕ satisfies the P S-condition. So by passing to a subsequence if necessary, we may assume that x(tn ) −→ x0 in X and x0 is a critical point of ϕ. Because by hypothesis ϕ has no critical values in ϕ(x0 ) = 0, ϕ(y) , we must have x0 = x0 = 0. Then lim ϕ x(t) = 0 and since ϕ satisfies the P S-condition, by the t→b
Ekeland variational principle (see Corollary 2.4.6), we conclude that x(t) −→ 0=x0 as t −→ b. Using this lemma, we can prove the following multiplicity result. THEOREM 4.1.32 If X is a Banach space, X =Y ⊕ V with dim Y < +∞, ϕ ∈ C 1 (X) and satisfies the P S-condition, ϕ is bounded below, m = inf < ϕ(0) = 0, and X
there exists r > 0 such that ϕ(x) ≤ 0 ϕ(x) ≥ 0
if x ∈ Y, x ≤ r , if x ∈ V, x ≤ r
(i.e., ϕ has a local linking at 0), then ϕ has at least two nontrivial critical points. PROOF: From Theorem 4.1.28, we know that ϕ attains its infimum at some point x0 ∈ X. Then m = inf ϕ = ϕ(x0 ) < ϕ(0) = 0, X
⇒ x0 = 0. Suppose that {0, x0 } are the only critical points of ϕ. Case I: dim Y > 0 and dim V > 0. Without any loss of generality, we assume that r = 1 < x0 . We show that there exists δ > 0 such that
1 x ∈ X : ϕ(x) < ϕ(x0 ) + δ ⊆ x ∈ X : x − x0 < x0 . 2
(4.24)
Indeed, if no such δ > 0 can be found, then there is a minimizing sequence {xn }n≥1 ⊆ X (i.e., ϕ(xn ) ↓ m = ϕ(x0 )) such that xn − x0 ≥ 12 x0 for all n ≥ 1. Because of Corollary 2.4.6 we can find {yn }n≥1 ⊆ X another minimizing sequence (i.e., ϕ(yn ) ↓ m = ϕ(x0 )) such that yn − xn ≤ 1/n and ϕ (yn ) −→ 0. Because by hypothesis ϕ satisfies the P S-condition, we may assume that yn −→ x0 in X. We have yn − x0 ≥ xn − x0 − yn − xn ≥ 12 x0 − n1 , hence x0 − x0 ≥ 12 x0 > 0 and so x0 = x0 . Also ϕ (yn ) −→ ϕ (x0 ) = 0 and so x0 ∈ X is a third critical point of ϕ, a contradiction to our hypothesis. So we can find δ > 0 for which (4.24) is true. For every y ∈ Y with y=1=r < x0 , we have ϕ (y) = 0 and so we can apply Lemma 4.1.31 and
have that the flow generated by (4.22) exists on a maximal open interval 0, b(y) with b(y) ≤ −4ϕ(x0 ) and x(t) −→ x0 as t −→ b(y). Moreover, by
choosing δ > 0 small in (4.24), we can find unique t = t(y) ∈ 0, b(y) such that
4 Critical Point Theory and Variational Methods
ϕ x t(y) = ϕ(x0 ) + δ. The uniqueness of t(y) also implies easily the continuity of the map y −→ t(y). Also ϕ ≤ 0 on {x ∈ X : ϕ(x) ≤ ϕ(x0 ) + δ}. Let v0 ∈ V be such that v0 = 1 and let E = x = λv0 + y : y ∈ Y, λ ≥ 0, x ≤ 1 E0 = x = λv0 + y : y ∈ Y and (λ ≥ 0, x = 1) or (λ = 0, x ≤ 1)}. We define γ ∗ ∈C(E0 , X) as follows. γ ∗ (y) = y
if y ∈ Y, y ≤ 1
and
γ ∗ (v0 ) = x0 .
(4.25)
Also for any x ∈ E0 with x = v0 and x=1, we have the unique representation x = λv0 + ηy
(4.26)
with 0 ≤ λ ≤ 1, y ∈ Y, y = 1, and 0 < η ≤ 1 (the uniqueness of this representation means that λ, η, y in (4.26) are uniquely defined). Then for such a x ∈ E0 , we set
γ ∗ (x) = γ ∗ (λv0 + ηy) = x 2λt(y)
1 when λ ∈ 0, 2
(4.27)
with x(t) being the solution of (4.22). So γ∗
1 1 v0 + ηy = x t(y) ∈ {x ∈ X : x − x0 ≤ x0 } 2 2
(see (4.24)).
Finally set
γ ∗ (λv0 + ηy) = (2λ − 1)v0 + (2 − 2λ)x t(y)
when
1 ≤ λ ≤ 1. 2
(4.28)
Note that as λ moves from 12 to 1, the right-hand side of the above equation traverses the line segment from x t(y) to x0 and so for λ ∈ 12 , 1 we have γ ∗ (λv0 + ηy) − x0 ≤
1 x0 . 2
The map γ ∗ : E0 −→ X is clearly continuous and ϕ ◦ γ ∗ ≤ 0 (recall the choice of δ > 0). In addition note that for some 0 < ϑ ≤ 1, we have γ ∗ (x) ≥ ϑ > 0
for all x ∈ E0 , x = 1
(see (4.25), (4.27), and (4.28)). Next we show that if D = ∂B (0) ∩ V with < ϑ, then the sets {E0 , E, D} are linking via γ ∗ . Let γ be any continuous extension of γ ∗ on E. Let W = Rv0 ⊕ Y and consider the map S : E −→ W defined by
S(x) = pY γ(x) + (I − pY )γ(x)v0 , (4.29) where pY ∈ L(X) is the projection operator onto Y (it exists because Y is finitedimensional). In order to show that γ(E) ∩ D = ∅, it suffices to show that there exists x ∈ E such that S(x) = x. Because < ϑ, we see that S(x) = v0 for all x ∈ E0 and so the degree d(S, int E, v0 ) (the intE taken in W ) is well-defined. Recall that the degree depends only on the boundary values. So we may consider only S E =∂E (the boundary ∂E considered in W ). Set 0
4.1 Critical Point Theory B1 = {y ∈ Y : y ≤ 1}
and
B2 = {x ∈ E : x = 1}.
Evidently E0 = ∂E = B1 ∪ B2 (B1 is the flat basis of the half-ball E and B2 is the hemisphere of the half-ball E). It is clear from (4.29) that S B = I B and 1 1 S(x) ≥ ϑ > 0 for x ∈ B2 . So on E0 = ∂E we define x if x ∈ B1 S(x) = . S(x) if x ∈ B2 S(x) Let h : [0, 1] × E0 −→ W \ {v0 } be defined by h(t, x) = tS(x) + (1 − t)S(x). This is a continuous homotopy. Note that S(B2 ) ⊆ B2 and if we consider the boundary ∂B2 of the manifold B2 in W , we see that S ∂B = I ∂B . Since B2 2
2
is homeomorphic to a ball, there is a continuous homotopy h(t, x) connecting S to I ∂B with h(t, ·) ∂B = I ∂B . So S E =∂E is homotopic to the identity in 2 2 2 0 W \ {v0 } and thus d(S, int E, v0 ) = d(I, int E, v0 ) = 1. This means that we can find x ∈ E such that S(x) = v0 and so the sets {E0 , E, D} are linking via γ ∗ . Invoking Theorem 4.1.22 we deduce that
c = inf max ϕ γ(x) , γ∈Γ x∈E
where Γ = {γ ∈ C(E, X) : γ E =γ ∗ } is a critical value of ϕ and c ≥ inf ϕ ≥ 0 (by the 0
D
local linking at 0). If c > 0, then we have a second nontrivial critical point because ϕ(x0 ) < ϕ(0) = 0. If c = 0, then by Remark 4.1.23, we can find x ∈ D such that ϕ(x)=c and so x = 0 is the other nontrivial critical point of ϕ. Case II: dim Y = 0 and dim V > 0. In this case by virtue of the local linking at 0, we see that the origin is a local minimizer of ϕ. Recall that we have assumed that {0, x0 } are the only critical points of ϕ. So we see that we can find > 0 small such that ϕ(x) > 0
for all x ∈ B (0) \ {0}.
We claim that there exists 0 < 0 < such that inf ϕ(x) : x ∈ ∂B0 (0) > 0 = ϕ(0). If not, for any given fixed 0 < 0 < , we have inf ϕ(x) : x ∈ ∂B0 (0) = 0. Let δ > 0 be such that 0 < 0 − δ < 0 + δ < and consider the ring R = {x ∈ X : 0 − δ ≤ x ≤ 0 + δ}. We can find {xn }n≥1 ⊆ R such that xn = 0
and
ϕ(xn ) ≤
1 n
(see (4.31)).
(4.30)
(4.31)
284
4 Critical Point Theory and Variational Methods
Invoking the Ekeland variational principle (see Theorem 2.4.1) we can find {yn }n≥1 ⊆ R such that ϕ(yn ) ≤ ϕ(xn ), yn − xn ≤
1 1 and ϕ(yn ) ≤ ϕ(x) + yn − x n n
for all x ∈ R.
(4.32) From the second assertion in (4.32), we see that for n ≥ 1 large we have that yn ∈ int R. So in the third assertion in (4.32), we can take x = yn + λu with u ∈ X, u = 1, and λ > 0 small. With this choice and by letting λ −→ 0, we obtain 1 − u ≤ ϕ (yn ), u . n Because u ∈ ∂B1 (0) was arbitrary it follows that ϕ (yn )X ∗ ≤
1 . n
This together with the first assertion in (4.32) and the fact that ϕ satisfies the P Scondition, implies that we may assume (at least for a subsequence) that yn −→ y in X. Evidently ϕ(y) = 0, ϕ (y) = 0 and y = , a contradiction. So (4.30) is true. If > 0 is small enough we have and ϕ(x0 ) < 0 = ϕ(0) < inf ϕ(x) : x ∈ ∂B (0) . 0 < < x0 So we can apply Theorem 4.1.24 and obtain a critical point x1 ∈ X of ϕ such that ϕ(x1 ) ≥ inf ϕ(x) : x ∈ ∂B (0) > 0 = ϕ(0) > ϕ(x0 ) ⇒ x1 = 0,
x1 = x0
and is the second nontrivial critical point of ϕ.
Case III: dim Y > 0 and dim V = 0. In this case we may assume that dim Y = +∞. Now the origin is a local minimizer of −ϕ. Because ϕ is bounded below and satisfies the P S-condition, from Theorem 4.1.12 we have that ϕ is weakly coercive. So −ϕ(x) −→ −∞ as x −→ ∞ and we can find u ∈ X such that −ϕ(u) < −ϕ(0) = 0 < inf − ϕ(x) : x ∈ ∂B (0) < −ϕ(x0 ) for > 0 small so that u > > 0 (see Case II). Once again Theorem 4.1.24 gives the second nontrivial critical point of ϕ. Hidden in the argument for Case II in the proof of Theorem 4.1.32 is the following result. THEOREM 4.1.33 If X is a Banach space, ϕ ∈ C 1 (X), ϕ satisfies the P Scondition, and it has two local minima, then ϕ has at least one more critical point. Borrowing the introductory concept of Morse theory, we make the following definition.
4.1 Critical Point Theory
DEFINITION 4.1.34 Let H be a Hilbert space and ϕ ∈ C 2 (H). Let u ∈ H and consider the operator Su ∈ L(H) defined by Su (x), yH = ϕ (u)(x, y)
for all (x, y) ∈ H × H.
Here by ·, ·H we denote the inner product of H. The operator Su ∈ L(H) is self-adjoint and is identified with ϕ (u). Suppose u ∈ H is a critical point of ϕ. We say that ϕ is a nondegenerate critical point if ϕ (u) = Su is invertible. The Morse index of u is defined as the supremum of the dimensions of the linear subspaces of H on which ϕ (u) = Su ∈ L(H) is negative definite. We state the following result, known as the Morse lemma and for its proof we refer to Chang [143, p. 46]. LEMMA 4.1.35 If H is a Hilbert space, ϕ ∈ C 2 (H), and x0 is a nondegenerate critical point of ϕ, then there exists a Lipschitz homeomorphism h from a neighborhood U of the origin to a neighborhood W of x0 such that h(0) = x0 and
ϕ h(x) = ϕ(x0 ) + x+ 2 − x− 2 where x = x+ + x− corresponds to the decomposition of H = H+ ⊕ H− with respect to the operator ϕ (x0 ) ∈ L(H). Using this lemma, we deduce the following useful consequence of Theorem 4.1.32. COROLLARY 4.1.36 If H is a Hilbert space, ϕ ∈ C 2 (H) is bounded below, satisfies the P S-condition, and x0 is a nondegenerate critical point of ϕ with a finite Morse index, and inf ϕ < ϕ(x0 ), then ϕ has at least two critical points that are X
distinct from x0 . PROOF: By considering ϕ(x) = ϕ(x + x0 ) − ϕ(x0 ), we may assume without any loss of generality that x0 = 0 and ϕ(x0 ) = 0. Then from Lemma 4.1.35 we see that ϕ has a local linking at 0 with Y = H− (dim H− < ∞ because ϕ has a finite Morse index at x0 = 0) and V = H+ . So we can apply Theorem 4.1.32 and produce two nontrivial critical points of ϕ. We can have a multiplicity result under symmetry conditions on the functional ϕ. The result is sometimes called in the literature the symmetric mountain pass theorem. We state the result and for a proof we refer to Rabinowitz [508, p. 55]. THEOREM 4.1.37 If X is a Banach space, ϕ ∈ C 1 (X), satisfies the P Scondition, it is even, and also the following hold (i) There existsa linear subspace V of X of finite codimension and numbers β, r > 0 such that ϕ∂B (0)∩V ≥ β. r (ii) There exists a finite-dimensional linear subspace Y of X with dim Y > co dim V such that ϕY is weakly anticoercive; that is, ϕ(y) −→ −∞ as y −→ ∞, y ∈ Y , then ϕ has at least k = dim Y − co dim V distinct pairs of nontrivial critical points.
4 Critical Point Theory and Variational Methods
COROLLARY 4.1.38 If the hypotheses of Theorem 4.1.37 are satisfied with (ii) replaced by (ii) for any integer k ≥ 1, there is a k-dimensional linear subspace Y of X such that ϕY is weakly anticoercive; that is, ϕ(y) −→ −∞ as y −→ ∞, y ∈ Y , then ϕ has infinitely many distinct pairs of nontrivial critical points. Continuing with functionals exhibiting symmetry, we present the principle of symmetric criticality. First we have a definition. DEFINITION 4.1.39 Let G be a topological group. A representation of G over a Banach space X is a family {S(g)}g∈G ⊆ L(X) such that (a) S(e) = I (e is the unit of G). (b) S(g1 ∗ g2 ) = S(g1 ) S(g2 ) for all g1 , g2 ∈ G (here ∗ denotes the operation of G). (c) (g, x) −→ S(g)x is continuous from G × X into X. A set C ⊆ X is invariant (under the representation), if S(g)(C) ⊆ C for all g ∈ G. A function ϕ : X −→ R is invariant (under the representation), if ϕ ◦ S(g) = ϕ for all g ∈ G. A map h : X −→ X is equivariant (under the representation), if S(g) ◦ h = h ◦ S(g) for all g ∈ G. A representation {S(g)}g∈G of G over X is said to be isometric, if S(g)x = x for all g ∈ G and all x ∈ X. Finally the set of invariant points of X under the representation, is defined by Fix(S) = Fix(G) = {x ∈ X : S(g)x = x for all g ∈ G}. REMARK 4.1.40 Very often to simplify the notation we identify S(g) with g. We do this here too. Also we speak about the action of G on X, to mean the representation of G over X. THEOREM 4.1.41 If H is a Hilbert space, G is a topological group that has an isometric representation over H, ϕ ∈ C 1 (H) is invariant, and x0 is a critical point of ϕ Fix(G) , then x0 is also a critical point of ϕ. PROOF: For every x, y ∈ H we have
ϕ (gx), y
H
ϕ(gx + λy) − ϕ(gx) λ ϕ(gx + λgg −1 y) − ϕ(gx) = lim λ→0 λ ϕ(x + λg −1 y) − ϕ(x) = lim (because ϕ is invariant) λ→0 λ (4.33) = ϕ (x), g −1 y H . = lim
λ→0
Moreover, because the representation is isometric (hence it preserves inner products), we have ϕ (x), g −1 y H = g ϕ (x), y H , for all y ∈ H (see (4.33)), ⇒ ϕ (gx), y H = g ϕ (x), y H ⇒ ϕ (gx) = g ϕ (x); that is, ϕ is equivariant. Now suppose that x0 is a critical point of ϕFix(G) . Then clearly
4.1 Critical Point Theory
g ϕ (x0 ) = ϕ (gx0 ) = ϕ (x0 ) ⇒ ϕ (x0 ) ∈ Fix(G). Therefore we have ϕ (x0 ) ∈ Fix(G) ∩ Fix(G)⊥ = {0}; that is, ϕ (x0 ) = 0.
We conclude by briefly mentioning a generalization of this theory to a class of functionals that arise when dealing with problems with unilateral constraints. So let X be a Banach space. The generalization is for functionals ϕ=ϕ+ψ
(4.34)
with ϕ ∈ C 1 (X) and ψ ∈ Γ0 (X) (see Definition 1.2.1). DEFINITION 4.1.42 Let X be a Banach space and ϕ : X −→ R = R ∪ {+∞} a functional satisfying the decomposition (4.34). (a) A point x ∈ X is said to be a critical point of ϕ, if x ∈ dom ψ and for all y ∈ X. ϕ (x), y − x + ψ(y) − ψ(x) ≥ 0
(4.35)
(b) We say that ϕ satisfies the generalized P S-condition if every sequence {xn }n≥1 ⊆ X such that {ϕ(xn )}n≥1 is bounded in R and for all y ∈ X ϕ (xn ), y − xn + ψ(y) − ψ(xn ) ≥ −εn y − xn with εn ↓ 0, has a strongly convergent subsequence. REMARK 4.1.43 Clearly x ∈ X is a critical point of ϕ if and only if −ϕ (x) ∈ ∂ψ(x). LEMMA 4.1.44 If X is a Banach space, ψ ∈ Γ0 (X) with ψ(0) = 0 and ψ(x) ≥ −x for all x ∈ X, then there exists x∗ ∈ X ∗ with x∗ ≤ 1 such that ψ(x) ≥ x∗ , x for all x ∈ X. PROOF: Let ξ(x) = ψ(x) + x, x ∈ X. Then ξ ∈ Γ0 (X), ξ ≥ 0, and ξ(0) = 0. So ∗ 0 ∈ ∂ξ(0) = ∂ψ(0) + ∂ · (0) (see Proposition 1.2.38). Because ∂ · (0) = B 1 (0) = {x∗ ∈ X ∗ : x∗ ≤ 1}, we obtain x∗ ∈ ∂ψ(0) with x∗ ≤ 1 such that x∗ , x ≤ ψ(x) for all x ∈ X. Using this lemma we can equivalently reformulate the generalized P S-condition in a more convenient form. PROPOSITION 4.1.45 If X is a Banach space and ϕ : X → R = R ∪ {+∞} a functional satisfying the decomposition (4.34), then the generalized P S-condition (see Definition 4.1.42 (b)) is equivalent to the following condition, every sequence {xn }n≥1 ⊆ X such that {ϕ(xn )}n≥1 is bounded and for all y ∈ X ϕ (xn ), y − xn + ψ(y) − ψ(xn ) ≥ x∗n , y − xn with x∗n −→ 0, has a strongly convergent subsequence.
4 Critical Point Theory and Variational Methods
Using the Ekeland variational principle (see Theorem 2.4.1), we can prove a deformation theorem for this new class of functionals and then use it to produce minimax characterizations of the critical values of ϕ. The details can be found in Szulkin [568]. Here we simply state without proofs the formulations of the mountain pass, saddle point and generalized mountain pass theorems in this context. First we have the mountain pass theorem. THEOREM 4.1.46 If X is a Banach space, ϕ: X −→ R = R ∪ {+∞} is a functional satisfying the decomposition (4.34) and the generalized P S-condition, and the following hold, (i) ϕ(0) = 0 and there exist β, r > 0 such that ϕ ≥ β, ∂Br (0)
(ii) ϕ(u) ≤ 0 for some u ∈ X with u > r,
then ϕ has a critical value c ≥ β that can be characterized by
c = inf sup ϕ γ(t) , γ∈Γ t∈[0,1]
where Γ={γ ∈C([0, 1], X):γ(0)=0, γ(1)=u}. Next we state the saddle point theorem. THEOREM 4.1.47 If X is a Banach space, X = Y ⊕ V with dim Y < +∞, ϕ : X −→ R = R ∪ {+∞} is a functional satisfying the decomposition (4.34) and the generalized P S-condition and the following hold, (i) There exist constants r > 0 and α ∈ R such that ϕ∂B (0)∩Y ≤ α r (ii) There exists a constant β > α such that ϕV ≥ β then ϕ has a critical value c ≥ β that can be characterized by
sup ϕ γ(x) , c = inf γ∈Γ
x∈B r (0)∩Y
where Γ = γ ∈ C B r (0) ∩ Y, X : γ ∂B
r (0)∩Y
= I ∂B
r (0)∩Y
.
Finally we state the generalized mountain pass theorem. THEOREM 4.1.48 If X is a Banach space, X = Y ⊕ V with dim Y < +∞, ϕ : X −→ R = R ∪ {+∞} is a functional satisfying the decomposition (4.34) and the generalized P S-condition, and the following hold, (i) There exist constants r, β > 0 such that ϕ ≥ β, ∂Br (0)∩V
(ii) There exists a constant R > r and v0 ∈ V with v0 = 1 such that E = x = λv0 + y : y ∈ Y, y ≤ R, 0 ≤ λ ≤ R and on E0 = ∂E in W = Rv0 ⊕ Y, we have ϕ ≤ 0, E0
then ϕ has a critical value c ≥ β that can be characterized by
c = inf sup ϕ γ(x) , γ∈Γ x∈E
where Γ = γ ∈ C(E, X) : γ E = I E . 0
0
4.2 Ljusternik–Schnirelmann Theory
Also the following is a straightforward consequence of the Ekeland variational principle (see Theorem 2.4.1). THEOREM 4.1.49 If X is a Banach space and ϕ : X −→ R = R ∪ {+∞} is a functional satisfying the decomposition (4.34), the generalized P S-condition, and also is bounded below, then m = inf ϕ = ϕ(x0 ) for some x0 ∈ X and it is a critical X
value of ϕ. REMARK 4.1.50 Multiplicity results under symmetry, similar to Theorem 4.1.37 and Corollary 4.1.38, can be found in Szulkin [568].
4.2 Ljusternik–Schnirelmann Theory Let H be an infinite-dimensional Hilbert space. In what follows we identify H with H ∗ and by ·, ·H we denote the inner product of H. In Section 3.1 we studied the spectral properties of a self-adjoint operator A∈Lc (H) (i.e., A : H −→ H is linear, compact, and self-adjoint). In particular we proved that A has a sequence {λn }n≥1 of eigenvalues such that λn −→ 0 as n → ∞. Moreover, there is an orthonormal basis {en }n≥1 ⊆ H corresponding to the eigenvalues {λn }n≥1 and we have the spec tral resolution A(x) = λn x, en H en . There is a variational characterization of n≥1
the eigenvalues. In particular, let {λ+ n }n≥1 be the positive eigenvalues ordered in decreasing order (with multiplicities repeated), Ln the collection of all linear subspaces of H of dimension n, and Cn the collection of all linear subspaces of H of codimension n − 1. Then we have the following variational characterizations of the eigenvalues. The result is known as the Courant minimax principle. Sometimes the names of Weyl and Fischer are attached to it too. PROPOSITION 4.2.1 If everything is as described above, then λ+ n = sup min Y ∈Ln x∈Y
A(x), xH A(x), xH = inf sup . 2 V ∈C x x2 n x∈V
(4.36)
REMARK 4.2.2 An analogous variational characterization is also true for the negative eigenvalues {λ− n }n≥1 . Namely, −λ− n = sup min − Y ∈Ln x∈Y
A(x), xH A(x), xH = inf sup − . V ∈Cn x∈V x2 x2
(4.37)
Of course, if A is monotone, then only nonnegative eigenvalues exist (such as in the case of the Laplacian differential operator; see Section 4.3). Note that if S = ∂B_1(0) = {x ∈ H : ‖x‖ = 1}, then we can equivalently rewrite (4.36) and (4.37) as
\[
\lambda_n^+ = \sup_{Y\in L_n}\,\min_{x\in S\cap Y}\langle A(x),x\rangle_H = \inf_{V\in C_n}\,\sup_{x\in S\cap V}\langle A(x),x\rangle_H, \tag{4.38}
\]
\[
-\lambda_n^- = \sup_{Y\in L_n}\,\min_{x\in S\cap Y}\big(-\langle A(x),x\rangle_H\big) = \inf_{V\in C_n}\,\sup_{x\in S\cap V}\big(-\langle A(x),x\rangle_H\big). \tag{4.39}
\]
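In finite dimensions, (4.36) through (4.39) reduce to the classical Courant–Fischer theorem for symmetric matrices, and the characterization is easy to test numerically. The following Python sketch is an added illustration (the matrix and the dimensions are arbitrary choices, not taken from the text); it confirms that the supremum in (4.38) is attained on the span of the eigenvectors belonging to the n largest eigenvalues.

```python
import numpy as np

# Check of the minimax characterization (4.38) in the finite-dimensional model case:
# for a symmetric positive definite A on R^6, the supremum over n-dimensional
# subspaces Y of  min_{x in S ∩ Y} <Ax, x>  is attained on the span of the
# eigenvectors of the n largest eigenvalues, and equals lambda_n^+.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B @ B.T                                  # symmetric, positive definite

evals, evecs = np.linalg.eigh(A)             # ascending order
lam_desc = evals[::-1]                       # eigenvalues in decreasing order

for n in range(1, 4):
    Q = evecs[:, ::-1][:, :n]                # orthonormal basis of the optimal Y in L_n
    min_on_sphere = np.linalg.eigvalsh(Q.T @ A @ Q).min()
    print(n, lam_desc[n - 1], min_on_sphere)  # the last two numbers agree
```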
The eigenvalues of A are precisely the critical values of the quadratic functional ϕ(x) = A(x), xH on the unit sphere S = ∂B1 (0) in H, therefore it is natural to extend (4.38) and (4.39) to general smooth functionals ϕ by finding “topological” analogues of the sets S ∩ Y and of the families {S ∩ Y : Y ∈ Ln }n≥1 . This will produce a far-reaching generalization of Proposition 4.2.1 to nonlinear eigenvalue problems. This is the goal of the Ljusternik–Schnirelmann theory, which we briefly outline in this section. A crucial step in extending the theory to nonquadratic functionals (i.e., to nonlinear eigenvalue problems in a Banach space X) is the determination of an interesting (i.e., nontrivial), computable N0 ∪ {+∞}-valued function ξ on a class of closed subsets of X that exhibits the following properties. (C) (1) ξ(∅) = 0 and ξ(A) = 1 if A is a finite nonempty set. (2) A ⊆ B ⇒ ξ(A) ≤ ξ(B) (monotonicity), (3) If h(t, x) is a deformation of Y (i.e. h : [0, 1] × Y −→ Y is a continuous map such that h(0, x) = x for all x ∈ Y ), then ξ(A) ≤ ξ h(1, A) . (4) ξ(A ∪ B) ≤ ξ(A) + ξ(B) (subadditivity). (5) There is a neighborhood U of A such that ξ(U ) = ξ(A) (continuity). REMARK 4.2.3 Such functions ξ are usually called topological indices. If S is the family of all closed sets in a Banach space X, the trivial function ξ : S −→ N0 ∪{+∞} defined by 1 if A ⊆ X is nonempty, closed ξ(A) = 0 if A = ∅, satisfies properties (C). Fortunately we can find more interesting indices than this one. DEFINITION 4.2.4 Let Y be a Hausdorff topological space and C ⊆ Y a nonempty set. We say that C is contractible in Y , if there exist a continuous map h : [0, 1] × C −→ Y and y0 ∈ Y such that h(0, x) = x and h(1, x) = y0 for all x ∈ C. REMARK 4.2.5 Intuitively, regarding t ∈ [0, 1] as a time parameter, h(t, x) describes a continuous deformation of the set C down to the singleton {y0 }, with h(t0 , C) expressing the state of the deformation at time t0 ∈ [0, 1]. Another way to describe contractible sets C is to say that the identity map is null-homotopic. Next we give the definition of a nontrivial function catY (A) which admits properties (C) and which is also maximal in the sense that if ξ(A) is another function defined on the closed subsets of X and possessing properties (C), then ξ(A) ≤ catY (A). DEFINITION 4.2.6 Let Y be a Hausdorff topological space. The Ljusternik– Schnirelmann category (LS-category for short), denoted by catY , is defined on the n closed subsets of Y by catY (∅) = 0, catY (A) = min{n ∈ N : A⊆ Sk }, Sk is closed, k=1
contractible for all k ∈ {1, . . . , n} and catY (A)= +∞ if A = ∅ does not admit such a finite cover.
REMARK 4.2.7 The properties of closedness and contractibility are preserved by homeomorphisms, thus it is clear that we get the same values for homeomorphic Y or homeomorphic closed set C. However, the index Y in catY (C) is essential if Y lies in a larger space Y , because a set may be contractible in Y but not in the smaller space Y . If Y is embedded continuously in Y , then catY (C) ≤ catY (C) for all closed C ⊆ Y . Next we verify that for a large class of metric spaces Y , catY (C) satisfies properties (C). For this purpose, we need to recall the following definition from topology. DEFINITION 4.2.8 A metric space Y is said to be an absolute neighborhood retract (ANR for short), if for any metric space Z, any closed set A ⊆ Z and any continuous function f : A −→ Y , we can find a neighborhood U of A (U depends on f ) and a continuous function f : U −→ Y such that f A = f . If f can always be extended to all of Z, then we say that Y is an absolute retract (AR for short). This property is not that restrictive as the following list of examples illustrates. EXAMPLE 4.2.9 (a) It is straightforward to check that a finite product of ANR’s is again an ANR. (b) From Theorem 3.1.11 (Dugundji’s extension theorem), we know that every closed and convex set of a locally convex space is an AR (hence a ANR too). (c) If S 1 ={x ∈ R2 : x=1} (the circle), then S 1 is an ANR. To see this let Z be a metric space, A ⊆ Z a closed subset and f ∈ C(A, S 1). By virtue of the Tietze extension theorem we can find f ∈ C(Z, R2 ) such that f A = f . On the other hand we can find a neighborhood U of S 1 in R2 and a retraction r : U −→ S 1 (recall that no global retraction on S 1 exists, see Theorem 3.5.7). Then the map r ◦ f defined on f −1 (U ) is the desired extension of f . PROPOSITION 4.2.10 If Y is a path-connected ANR, then catY satisfies properties (C). PROOF: Property (C)(1) holds if and if Y is path-connected. Property (C)(2) (monotonicity property) is obvious. For property (C)(3) we argue as follows. Suppose that h : [0, 1] × Y −→ Y is a n Ak continuous map such that h(0, ·) = I (a deformation of Y ) and that h(1, A)⊆ k=1
with A_k ⊆ Y closed and contractible for all k ∈ {1, . . . , n}. So we can find continuous maps h_k : [0,1] × A_k −→ Y such that h_k(0, x) = x for all x ∈ A_k and h_k(1, A_k) = {x_k} with x_k ∈ Y (see Definition 4.2.4). Let B_k = h(1, ·)^{-1}(A_k) for all k ∈ {1, . . . , n}. Evidently B_k is closed and if we set
\[
\hat{h}_k(t,x) =
\begin{cases}
h(2t, x) & \text{if } t \in [0, \tfrac{1}{2}],\\
h_k\big(2t-1,\, h(1, x)\big) & \text{if } t \in [\tfrac{1}{2}, 1],
\end{cases}
\]
then we see that \(\hat{h}_k\) is a continuous homotopy between \(I|_{B_k}\) and the constant map equal to x_k. So B_k is contractible to {x_k}. We have A ⊆ \(\bigcup_{k=1}^{n} B_k\) and so cat_Y(A) ≤ cat_Y(h(1, A)). Property (C)(4) is obvious.

To prove property (C)(5), we proceed as follows. First suppose that A is closed with cat_Y(A) = 1. Let h : [0,1] × A −→ Y be a deformation of A onto {y_0} with y_0 ∈ Y. Set Z = [0,1] × Y and D = ({0} × Y) ∪ ([0,1] × A) ∪ ({1} × Y) and define f : D −→ Y by f(0, x) = x for all x ∈ Y, f = h on [0,1] × A, and f(1, x) = y_0 for all x ∈ Y. Since by hypothesis Y is an ANR, we can find a continuous extension f̂ of f on some neighborhood V of D, and V ⊇ [0,1] × U for some neighborhood U of A. Then f̂|_{[0,1]×U} deforms U onto {y_0} and so cat_Y(U) = cat_Y(A) = 1. If cat_Y(A) = n, then A ⊆ \(\bigcup_{k=1}^{n} A_k\) with A_k closed and cat_Y(A_k) = 1. Therefore A ⊆ \(\bigcup_{k=1}^{n} U_k = U\) with A_k ⊆ U_k and cat_Y(U_k) = 1. Hence cat_Y(U) = n. If cat_Y(A) = +∞, then clearly cat_Y(Y) = +∞.
Now we show that cat_Y is maximal among all N_0 ∪ {+∞}-valued functions ξ(·) on the closed subsets of Y having properties (C).

PROPOSITION 4.2.11 If Y is a path-connected ANR and ξ is an N_0 ∪ {+∞}-valued function on the closed subsets of Y that satisfies properties (C), then ξ ≤ cat_Y.

PROOF: If cat_Y(A) = 1, then according to Definition 4.2.6, A is deformable to a point y ∈ Y. So from property (C)(3), we have ξ(A) ≤ ξ({y}) = 1, hence ξ(A) ≤ cat_Y(A). If cat_Y(A) = n < +∞, then A ⊆ \(\bigcup_{k=1}^{n} A_k\), where each A_k is closed and deformable to a point y_k ∈ Y. So ξ(A_k) ≤ ξ({y_k}) = 1, and in fact ξ(A_k) = 1 by monotonicity, for all k ∈ {1, . . . , n} (see properties (C)(2) and (C)(3)); then using the subadditivity property (C)(4) we obtain
\[
\xi(A) \leq \xi\Big(\bigcup_{k=1}^{n} A_k\Big) \leq \sum_{k=1}^{n} \xi(A_k) = n = \mathrm{cat}_Y(A).
\]
Therefore we conclude that ξ ≤ cat_Y.
EXAMPLE 4.2.12 (a) If Y =T n =Rn /Z2 is an n-torus, then catT n (T n ) = n + 1. For a proof of this we refer to Schwartz [545, p. 161]. (b) If Y =P n =S n /Z2 where S n = {x ∈ Rn+1 : x=1} (the real n-projective space, it arises from S n by identifying antipodal points), then catP n (P n ) = n + 1. Also catP n (P k ) = k + 1 for all n ≥ k. If X is an infinite-dimensional Banach space and P ∞ (X) is the infinite -dimensional projective space obtained by identifying the antipodal points of the unit sphere ∂B1 (0) = {x ∈ X : x = 1} (i.e., P ∞ (X) = ∂B1 (0)/Z2 ), then catP ∞ (P ∞ ) = +∞.
(c) If Y = R^n and A is a closed ball in R^n, then cat_{R^n}(A) = 1.
(d) If Y = R^n and A = S^{n−1} = {x ∈ R^n : ‖x‖ = 1}, then cat_{R^n}(S^{n−1}) = cat_{S^{n−1}}(S^{n−1}) = 2 (take as covering sets {A_1, A_2} two slightly overlapping “northern” and “southern” closed hemispheres).
(e) If Y = X is an infinite-dimensional Banach space and A = S = {x ∈ X : ‖x‖ = 1}, then cat_X(S) = cat_S(S) = 1 (recall that due to the infinite dimensionality of X, S is contractible in itself).

The determination of the LS-category of a set A ⊆ X is in general a difficult and complicated task, which involves various results from cohomology and homotopy theories. So it is desirable to have another interesting function ξ satisfying properties (C) which is more elementary than the LS-category. This is provided by the so-called genus of a set. As before, X is a Banach space. In this case we define an N_0 ∪ {+∞}-valued function γ on the closed subsets of X \ {0} that are symmetric with respect to the origin.

DEFINITION 4.2.13 Let X be a Banach space and Sym(X) = {A ⊆ X \ {0} : A is closed and A = −A}. The genus γ : Sym(X) −→ N_0 ∪ {+∞} is defined by
\[
\gamma(\emptyset) = 0, \qquad
\gamma(A) = \min\{n \in \mathbb{N} : \text{there exists an odd } f \in C(A, \mathbb{R}^n \setminus \{0\})\},
\]
and γ(A) = +∞ if no such continuous odd map exists.
REMARK 4.2.14 For any A ∈ Sym(X), by the Tietze extension theorem any odd map f ∈ C(A, R^n \ {0}) admits an extension f̃ ∈ C(X, R^n). In fact, if we let f̃_0(x) = ½(f̃(x) − f̃(−x)), the extension can be assumed to be odd too. The function γ is often called the Krasnoselskii genus. The notion of genus generalizes the concept of dimension of a linear space.

PROPOSITION 4.2.15 If X is a Banach space and U is a bounded symmetric neighborhood of the origin in X, then γ(∂U) = dim X.

PROOF: First suppose that 0 < dim X < +∞. If in Definition 4.2.13 we use f = I, we see that γ(∂U) ≤ dim X. If k < dim X, then by Corollary 3.3.29 we know that there is no odd function f ∈ C(∂U, R^k \ {0}). Hence we must have γ(∂U) = dim X. Next suppose that dim X = +∞. Let X_n be an n-dimensional linear subspace of X. Clearly from Definition 4.2.13 we have γ(∂U ∩ X_n) ≤ γ(∂U). Also from the first part of the proof we know that γ(∂U ∩ X_n) = n. So n ≤ γ(∂U) for all n ≥ 1, hence γ(∂U) = +∞ = dim X. Finally, if X = {0}, then ∂U = ∅ and so γ(∂U) = 0 = dim X.

There is a sort of converse to this proposition.

PROPOSITION 4.2.16 If H is a Hilbert space, A ∈ Sym(H) is compact, and γ(A) = k < +∞, then A contains at least k mutually orthogonal vectors {x_i}_{i=1}^k.
PROOF: Let {xi }m i=1 be a maximal set of mutually orthogonal vectors in A. Let m Y = span{xi }m i=1 which is isomorphic to R . Let pY : H −→ Y be the orthogonal projection onto Y . Then 0 ∈ / pY (A) and f = pY A : A −→ Rm \ {0} is continuous, odd, hence γ(A) = k < m. Next we state the relation between the two indices catY and γ. The result is due to Rabinowitz [503], where the interested reader can find its proof. PROPOSITION 4.2.17 If X is a Banach space, P = (x, −x):x ∈ X \{0} and A ∈ Sym(X), then catP (A) = γ(A) where A = (x, −x) : x ∈ A}. Using this result we can now prove that γ satisfies properties (C). PROPOSITION 4.2.18 If X is a Banach space then the genus γ : Sym(X) −→ N0 ∪ {+∞} satisfies properties (C) provided the deformations h(t, x) are understood to be odd in x ∈ X and h(1, A) ∈ Sym(X) for all A ∈ Sym(X). PROOF: Properties (C)(1) and (2) follow at once from Definition 4.2.13. Also let A ∈ Sym(X) and suppose that h(t, that is odd in x ∈ X and
x) is a deformation h(1, A) ∈ Sym(X). Then if f ∈ C h(1, A), Rm \ {0} , we have that f ◦ h(1, ·) : A −→ Rm \ {0} is continuous and odd and so γ(A) ≤ m, hence γ(A) ≤ γ h(1, A) . This proves property (C)(3). Next we prove property (C)(4) (the subadditivity property). Clearly we may assume that both γ(A) = m and γ(B) = n are finite, or otherwise we are done. So we can find f ∈ C(A, Rm \ {0}) and g ∈ C(B, Rn \ {0}) odd maps (see Definition 4.2.13). By the Tietze extension theorem, we can find f ∈ C(X, Rm ) and g ∈ C(X, Rn ) extensions of these maps. Let u = (f , g) ∈ C(X, Rm+n ). This is an odd map and u(x) = 0 for all x ∈ A ∪ B. Hence γ(A ∪ B) ≤ m + n. This proves property (C)(4). Property (C)(5) follows from Propositions 4.2.17 and 4.2.10. PROPOSITION 4.2.19 If X is a Banach space and A ∈ Sym(X) is compact, then γ(A) < +∞. PROOF: Because A ∈ Sym(X), / A we have 0 ∈ and so we find r > 0 such that A ∩ Br (0) = ∅. The family Br (x) ∪ Br (−x) x∈A is an open cover of A and N because A is compact, we can find a finite subcover Br (xk ) ∪ Br (−xk ) k−1 . Let {ψk }N k−1 be a continuous partition of unity subordinated to the open cover Br (xk )∪ N Br (−xk ) k−1 . Replacing ψk by ψk (x) = 12 ψk (x) + ψk (−x) if necessary, we may assume that each ψk , k = 1, . . . , N , is even. From the choice of r > 0 we see that Br (xk ) ∩ Br (−xk ) = ∅ for all k = 1, . . . , N . Consider the map u : X −→ RN with kth-component defined by ψk (x) if x ∈ Br (xk ) . uk (x) = if x ∈ Br (−xk ) −ψk (x) / u(A). Therefore γ(A) ≤ N < +∞. Evidently u ∈ C(X, Rm ), u is odd, and 0 ∈ PROPOSITION 4.2.20 If X is a Banach space, Y is a finite-dimensional linear subspace of X, pY ∈ L(X) is the projection operator onto Y , and A ∈ Sym(X) with γ(A) > k = dim Y , then A ∩ (I − pY )(X) = ∅.
PROOF: We have X = Y ⊕ V with V = (I − p_Y)(X). If A ∩ (I − p_Y)(X) = ∅, then A ∩ V = ∅ and so p_Y(x) ≠ 0 for every x ∈ A. Note that p_Y|_A ∈ C(A, Y \ {0}) and it is odd. Hence γ(A) ≤ k = dim Y, a contradiction to the hypothesis. Therefore A ∩ V = A ∩ (I − p_Y)(X) ≠ ∅.

The next result is an immediate consequence of Definition 4.2.13.

PROPOSITION 4.2.21 If X is a Banach space and A, B ∈ Sym(X) are homeomorphic with respect to an odd homeomorphism, then γ(A) = γ(B).

Now that we have two interesting and computable topological indices (namely the LS-category and the (Krasnoselskii) genus), we can state the main result of the Ljusternik–Schnirelmann theory, which justifies all the effort to define topological indices satisfying properties (C). So let X be a Banach space, S a deformation invariant family of closed sets of X, and ξ a topological index on S (i.e., ξ : S −→ N_0 ∪ {+∞} satisfies properties (C)). For example we can have ξ = cat_X defined on all closed subsets of X, or we can have ξ = γ defined on Sym(X). For every k ≥ 1, we set
\[
S_k = \{A \in \mathcal{S} : \xi(A) \geq k\}. \tag{4.40}
\]
Clearly property (C)(3) implies that S_k is deformation invariant; that is, if h(t, x) is a deformation of X and A ∈ S_k, then h(1, A) ∈ S_k. This fact in conjunction with the next proposition suggests the right way to proceed.

PROPOSITION 4.2.22 If X is a Banach space, ϕ ∈ C(X), K ⊆ X and ϕ(K) ⊆ R are both closed, condition (D) from the beginning of Section 4.1 is satisfied, and S ⊆ 2^X is a nonempty deformation invariant family, then
\[
c = \inf_{A \in \mathcal{S}}\, \sup_{x \in A} \varphi(x) < +\infty
\]
implies that c ∈ ϕ(K).

PROOF: Suppose that c ∉ ϕ(K). Because by hypothesis ϕ(K) is closed, we can find a < c < b such that ϕ^{-1}([a, b]) ∩ K = ∅. Let A ∈ S be such that sup_{x∈A} ϕ(x) ≤ b. Then from condition (D)(1) we know that we can find t_0 > 0 such that ϕ(h(t_0, x)) ≤ a for all x ∈ A. Since ĥ(t, x) = h(t t_0, x), t ∈ [0,1], x ∈ X, defines a deformation of X with ĥ(1, A) = h(t_0, A), we have that h(t_0, A) ∈ S (recall that by hypothesis S is deformation invariant). Therefore c ≤ a < c, a contradiction.

So we are led to consider
\[
c_k = \inf_{A \in S_k}\, \sup_{x \in A} \varphi(x) \qquad \text{if } S_k \neq \emptyset. \tag{4.41}
\]
THEOREM 4.2.23 If X is a Banach space, Sk = ∅ for all k ≥ 1 (see (4.40)), ϕ ∈ C 1 (X) is bounded below and satisfies the P S-condition, then (a) ck (see (4.41)) if finite it is attained and it is a critical value of ϕ. (b) If ck = ck+1 = c < +∞, then card Kc = +∞, where recall that Kc = {x ∈ X : ϕ (x) = 0 and ϕ(x) = c}. (c) If ck = ck+1 = · · · = ck+m = c < +∞, then ξ(Kc ) ≥ m + 1.
(d) If c_k = +∞ for some k ≥ 1, then sup_K ϕ = +∞, where recall that K = {x ∈ X : ϕ′(x) = 0}.
PROOF: (a) If c_k is finite and it is not a critical value of ϕ, then by virtue of Theorem 4.1.19(e) we may find ε > 0 small such that ϕ^{c_k+ε} can be deformed to ϕ^{c_k−ε}. So every A ∈ S_k such that A ⊆ ϕ^{c_k+ε} is deformed into a set B ⊆ ϕ^{c_k−ε}, and by property (C)(3) we have ξ(B) ≥ k (i.e., B ∈ S_k; recall that S_k is deformation invariant). Then c_k ≤ sup_{x∈B} ϕ(x) ≤ c_k − ε, a contradiction. Therefore c_k is a critical value of ϕ.

(b) Suppose that card(K_c) < +∞. Then from properties (C)(1) and (5), we can find a neighborhood U of K_c such that ξ(U) = 1. Moreover, from Theorem 4.1.19(e) we know that we can find ε > 0 and a deformation h(t, x) such that h(1, ϕ^{c+ε}) ⊆ ϕ^{c−ε} ∪ U. Because c = c_{k+1} < c + ε, we can find A ∈ S_{k+1} such that sup_{x∈A} ϕ(x) ≤ c + ε, hence h(1, A) ⊆ ϕ^{c−ε} ∪ U and so from properties (C) we have
\[
k + 1 \leq \xi(A) \leq \xi\big(h(1, A)\big) \leq \xi(\varphi^{c-\varepsilon}) + \xi(U) = \xi(\varphi^{c-\varepsilon}) + 1
\;\Rightarrow\; k \leq \xi(\varphi^{c-\varepsilon});
\]
that is, ϕ^{c−ε} ∈ S_k. So we have
\[
\sup_{x \in \varphi^{c-\varepsilon}} \varphi(x) \geq c_k = c > c - \varepsilon \geq \varphi(x) \qquad \text{for all } x \in \varphi^{c-\varepsilon},
\]
a contradiction. This proves that card(K_c) = +∞.

(c) Again property (C)(5) implies the existence of a neighborhood U of K_c such that ξ(U) = ξ(K_c). Also Theorem 4.1.19(e) implies the existence of an ε > 0 and of a deformation h(t, x) such that h(1, ϕ^{c+ε}) ⊆ ϕ^{c−ε} ∪ U. So from properties (C), we have
\[
\xi(\varphi^{c+\varepsilon}) \leq \xi(\varphi^{c-\varepsilon} \cup U) \leq \xi(\varphi^{c-\varepsilon}) + \xi(U) = \xi(\varphi^{c-\varepsilon}) + \xi(K_c). \tag{4.42}
\]
Note that ϕ^{c+ε} contains subsets A with ξ(A) ≥ k + m. So
\[
\xi(\varphi^{c+\varepsilon}) \geq k + m. \tag{4.43}
\]
Also because c = c_k, we have
\[
\xi(\varphi^{c-\varepsilon}) \leq k - 1. \tag{4.44}
\]
Combining (4.42) through (4.44), we obtain m + 1 ≤ ξ(K_c).

(d) Suppose that β = sup_K ϕ < +∞. Then by virtue of Theorem 4.1.19(e), for some ε > 0 the space X is deformable into ϕ^{β+ε}. Hence ξ(X) = ξ(ϕ^{β+ε}). Therefore ϕ^{β+ε} ∈ S_k and so
\[
c_k = \inf_{A \in S_k}\, \sup_{x \in A} \varphi(x) \leq \beta + \varepsilon,
\]
a contradiction to the fact that ck = +∞.
In fact this theorem can be restated in a global setting using the following notion.
DEFINITION 4.2.24 A C¹-manifold M (which is assumed to be a metric space), modelled on a Banach space X, is said to be a Finsler manifold if and only if the following hold.
(a) For every z ∈ M, the tangent space T_z M has a norm ‖·‖_z that is equivalent to the norm ‖·‖ of X.
(b) If z ∈ M and U is an open neighborhood of z locally trivializing the tangent bundle TM (i.e., TM|_U = U × X), then for every k > 1 there exists another open neighborhood V_k ⊆ U of z such that
\[
\frac{1}{k}\|x\|_v \leq \|x\|_z \leq k\|x\|_v \qquad \text{for all } (v, x) \in V_k \times X.
\]

REMARK 4.2.25 Trivially every Banach space X is itself a complete C^∞-Finsler manifold. Note that the cotangent bundle T*M also carries a natural Finsler structure by letting
\[
\|x^*\|_z = \sup\{|\langle x^*, x\rangle| : x \in T_z M,\ \|x\|_z \leq 1\} \qquad \text{for any } x^* \in T_z^* M.
\]
Finally, if M is also connected, then the Finsler structure introduces a metric on M, known as the Finsler metric, defined by
\[
d(z, u) = \inf\Big\{\int_0^1 \Big\|\frac{d\gamma(t)}{dt}\Big\|_{\gamma(t)}\, dt \;:\; \gamma \in C^1([0,1], M),\ \gamma(0) = z,\ \gamma(1) = u\Big\}.
\]
The existence of C¹-paths joining z and u follows from the work of Palais [471]. The deformation approach of critical point theory (see Section 4.1 and in particular Theorem 4.1.19) and the Ljusternik–Schnirelmann theory (in particular Theorem 4.2.23) can be recast in the more general framework of a connected, metric, C¹-Finsler manifold. In this context the next theorem provides useful information. For a proof of it we refer to Ghoussoub [262, p. 78].

THEOREM 4.2.26 If M is a connected, metric, C¹-Finsler manifold modelled on a Banach space X, and ϕ ∈ C¹(M) is bounded below and satisfies the PS-condition, then ϕ has at least cat_M(M) distinct critical points.
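As a quick worked illustration of Theorem 4.2.26 (added here; it uses only the value of the LS-category recorded in Example 4.2.12(a)): take M = T² = R²/Z², which is a compact connected C^∞-manifold, so every ϕ ∈ C¹(T²) is automatically bounded below and satisfies the PS-condition. Since
\[
\mathrm{cat}_{T^2}(T^2) = 2 + 1 = 3,
\]
Theorem 4.2.26 guarantees that every ϕ ∈ C¹(T²) has at least three distinct critical points.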
4.3 Spectrum of the Laplacian and of the p-Laplacian

In this section we determine some important spectral properties of the Laplacian and p-Laplacian differential operators. We start with the Laplacian (linear eigenvalue problem). So let Z ⊆ R^N be a bounded open set and first consider the Dirichlet eigenvalue problem
\[
\begin{cases} -\Delta u(z) = \lambda u(z) & \text{a.e. on } Z,\\ u|_{\partial Z} = 0, \quad \lambda \in \mathbb{R}. \end{cases} \tag{4.45}
\]
Recall that λ ∈ R is an eigenvalue of (−Δ, H_0^1(Z)) provided problem (4.45) has a nontrivial solution u ∈ H_0^1(Z). Such a nontrivial solution is called an eigenfunction of (−Δ, H_0^1(Z)). Also the pair (λ, u) is said to be an eigenelement
of (−Δ, H_0^1(Z)). To fully determine the properties of the eigenelements of (−Δ, H_0^1(Z)), we need to recall the basic regularity results for linear second order elliptic equations. We state the main regularity results without proofs, which can be found in Evans [229, Section 6.3]. So let L be the following second order differential operator in divergence form,
\[
Lu = -\sum_{i,j=1}^{N} D_j\big(a_{ij}(z) D_i u\big) + \sum_{i=1}^{N} b_i(z) D_i u + c(z)u, \tag{4.46}
\]
where D_k = ∂/∂z_k, z = (z_k)_{k=1}^N being the generic element of Z, and a_{ij}, b_i, c ∈ L^∞(Z), 1 ≤ i, j ≤ N. We require that L is uniformly elliptic; namely, there exists ϑ > 0 such that
\[
\sum_{i,j=1}^{N} a_{ij}(z)\,\xi_i \xi_j \;\geq\; \vartheta \|\xi\|_{\mathbb{R}^N}^2 \tag{4.47}
\]
for a.a. z ∈ Z and all ξ = (ξ_k)_{k=1}^N ∈ R^N. We also assume without any loss of generality that a_{ij} = a_{ji} for all 1 ≤ i, j ≤ N (symmetry condition). Uniform ellipticity means that for almost all z ∈ Z, the symmetric N × N matrix A(z) = (a_{ij}(z))_{i,j=1}^N is positive definite, with smallest eigenvalue greater than or equal to ϑ > 0 (see (4.47)). First we state the interior regularity results and then pass to the boundary regularity ones.
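As an aside (an added numerical illustration, not part of the original text), the uniform ellipticity condition (4.47) for a concrete coefficient field can be checked by sampling the smallest eigenvalue of the symmetric matrix A(z) over Z; the coefficients below are an arbitrary example on Z = (0,1)².

```python
import numpy as np

# Sampling-based check of the uniform ellipticity condition (4.47):
# the smallest eigenvalue of A(z) = (a_ij(z)) should stay above some theta > 0.
def A_of_z(z):
    # an arbitrary smooth symmetric coefficient matrix on Z = (0,1)^2 (illustration only)
    return np.array([[2.0 + np.sin(z[0]), 0.3 * z[0] * z[1]],
                     [0.3 * z[0] * z[1], 1.5 + np.cos(z[1])]])

samples = np.random.default_rng(1).uniform(0.0, 1.0, size=(1000, 2))
theta = min(np.linalg.eigvalsh(A_of_z(z)).min() for z in samples)
print("empirical ellipticity constant:", theta)   # (4.47) holds with this theta on the sample
```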
THEOREM 4.3.1 If aij ∈ C 1 (Z), bi , c ∈ L∞ (Z) for i, j ∈ {1, . . . , N }, f ∈ L2 (Z) and x ∈ H 1 (Z) is a weak solution of the problem Lx = f in Z (i.e., N aij (z)Dj x(z)Di ψ(z)dz + Z N i=1 bi (z)x(z)ψ(z)dz + Z c(z)x(z)ψ(z)dz = Z i,j=1 2 f (z)ψ(z)dz for all ψ ∈ Cc∞ (Z)), then x ∈ Hloc (Z) (i.e., x ∈ H 2 (Z ) for every Z Z ⊆Z open with Z ⊆Z). Moreover, for every Z ⊆Z open with Z ⊆Z (denoted by Z ⊂⊂Z), we have the estimate xH 2 (Z ) ≤ c (f L2 (Z) + xL2 (Z) ), with c > 0 depending only on Z , Z, and the coefficients of L. REMARK 4.3.2 Note that in the above theorem we do not assume any Dirichlet conditions for the problem; that is, we do not require that x ∈ H02 (Z). Moreover, 2 because finally x ∈ Hloc (Z), then a weak solution x is in fact a strong solution; namely Lx(z) = f (z) a.e. on Z. Increasing the regularity of the coefficient functions, we can iterate the argument used to prove Theorem 4.3.1 and have higher interior regularity. THEOREM 4.3.3 If m ≥ 1 is an integer, aij , bj , c ∈ C m+1 (Z) for all i, j ∈ {1, . . . , N }, f ∈ H m (Z), and x ∈ H 1 (Z) is a weak solution Lx = f in Z, then m+2 x ∈ Hloc (Z) and for each Z ⊂⊂ Z (see Theorem 4.3.1), we have the estimate xH m+2 (Z ) ≤ c (f H m (Z) + xL2 (Z) ), with the constant c > 0 depending only on m, Z , Z, and the coefficients of L.
We can now repeatedly apply Theorem 4.3.3 for m ≥ 1, increasing every time the regularity of the solution by 2, to obtain the following infinite interior regularity result. THEOREM 4.3.4 If aij , bi , c ∈ C ∞ (Z) for all i, j ∈ {1, . . . , N }, f ∈ C ∞ (Z), and x ∈ H 1 (Z) is a weak solution Lx = f in Z, then x ∈ C ∞ (Z). REMARK 4.3.5 Again note that we are not making any assumptions about the boundary values (in the sense of trace) of the solution x. So the theorem says that any possible singularities of x on the boundary, do not “propagate” in the interior. The next regularity results establish regularity of the weak solutions up to the boundary. THEOREM 4.3.6 If aij ∈ C 1 (Z), bi , c ∈ L∞ (Z) for all i, j ∈ {1, . . . , N }, f ∈ L2 (Z), ∂Z is a C 2 -manifold and x∈H01 (Z) is a weak solution of the Dirichlet problem (4.48) Lx = f in Z, x∂Z = 0, then x ∈ H 2 (Z) and we have the estimate xH 2 (Z) ≤ c (f L2 (Z) + xL2 (Z) )
(4.49)
with the constant c > 0 depending only on Z and the coefficients of L. REMARK 4.3.7 If problem (4.48) has a unique solution x ∈ H 2 (Z), then the estimate simplifies as follows, xH 2 (Z) ≤ cf L2 (Z) . Also note that in this case in contrast to Theorem 4.3.1 we impose Dirichlet boundary conditions (in the sense of trace). Proceeding as in the interior regularity, by improving the hypotheses on the coefficients we can conclude higher boundary regularity of the weak solution. THEOREM 4.3.8 If m ≥ 1 is an integer, aij , bi , c ∈ C m+1 (Z) for all i, j ∈ {1, . . . , N }, f ∈ H m (Z), ∂Z is a C m+2 -manifold, and x ∈ H01 (Z) is a weak solution of the Dirichlet problem Lx = f in Z, x∂Z = 0, (4.50) then x ∈ H m+2 (Z) ∩ H01 (Z) and we have the estimate xH m+2 (Z) ≤ c (f H m (Z) + xL2 (Z) )
(4.51)
with the constant c > 0 depending only on m, Z, and the coefficients of L. REMARK 4.3.9 Again problem (4.50) has a unique solution; the estimate (4.51) becomes xH m+2 (Z) ≤ c f H m (Z) .
We conclude with an infinite boundary regularity result.
THEOREM 4.3.10 If aij , bi , c∈C ∞ (Z) for all i, j ∈{1, . . . , N }, f ∈C ∞ (Z), ∂Z is a C ∞ -manifold, and x∈H01 (Z) is a weak solution of the Dirichlet problem Lx = f in Z, x∂Z = 0, then x ∈ C ∞ (Z). Now we return to the linear eigenvalue problem (4.45) and prove the following
fundamental result on the spectrum of (−Δ, H_0^1(Z)).

THEOREM 4.3.11 Problem (4.45) has a sequence {(λ_n, u_n)}_{n≥1} ⊆ R_+ × H_0^1(Z) of eigenelements such that 0 < λ_1 < λ_2 ≤ · · · ≤ λ_n ≤ · · ·, λ_n −→ +∞ as n → ∞, {u_n}_{n≥1} is an orthonormal basis of L²(Z), {(1/√λ_n) u_n}_{n≥1} is an orthonormal basis of H_0^1(Z), and u_n ∈ H_0^1(Z) ∩ C^∞(Z) for all n ≥ 1.

PROOF: For every h ∈ L²(Z) we consider the Dirichlet problem
\[
-\Delta x = h \ \text{ in } Z, \qquad x|_{\partial Z} = 0. \tag{4.52}
\]
If we consider the operator A ∈ L(H_0^1(Z), H^{-1}(Z)) defined by
\[
\langle Ax, y\rangle = \int_Z \big(Dx(z), Dy(z)\big)_{\mathbb{R}^N}\, dz,
\]
then clearly A is strongly monotone and weakly coercive (use Poincaré’s inequality) and so by virtue of Corollary 3.2.29 we can find a unique x ∈ H_0^1(Z) such that Ax = h. Evidently this is the unique weak (in fact strong) solution of problem (4.52). So we can define the solution map K : L²(Z) −→ H_0^1(Z) that to each h ∈ L²(Z) assigns the unique solution of problem (4.52). Clearly K is linear. We show that K is also completely continuous; namely, if h_n ⇀ h in L²(Z), then K(h_n) −→ K(h) in H_0^1(Z) as n → ∞. Set x_n = K(h_n). Note that for every n ≥ 1, we have
\[
A(x_n) = h_n \ \text{ in } H^{-1}(Z), \tag{4.53}
\]
\[
\Rightarrow\ \langle A(x_n), x_n\rangle = \|Dx_n\|_2^2 = \int_Z h_n x_n\, dz \leq \|h_n\|_2 \|x_n\|_2 \leq c_1 \|Dx_n\|_2 \quad \text{for some } c_1 > 0
\]
(by Poincaré’s inequality), so ‖Dx_n‖_2 ≤ c_1; that is, {x_n}_{n≥1} ⊆ H_0^1(Z) is bounded. Therefore by passing to a suitable subsequence if necessary, we may assume that x_n ⇀ x in H_0^1(Z), and then x = K(h). By Urysohn’s criterion for convergent sequences, we conclude that for the original sequence we have x_n ⇀ x = K(h) in H_0^1(Z), hence x_n −→ x in L²(Z) (recall that H_0^1(Z) is embedded compactly in L²(Z)). We have
\[
\langle A(x_n), x_n - x\rangle = \langle h_n, x_n - x\rangle = \int_Z h_n (x_n - x)\, dz \longrightarrow 0,
\;\Rightarrow\; \|Dx_n\|_2 \longrightarrow \|Dx\|_2.
\]
But Dx_n ⇀ Dx in L²(Z, R^N) and a Hilbert space has the Kadec–Klee property, thus x_n −→ x in H_0^1(Z). This proves the complete continuity of K. In particular, K viewed as a map from L²(Z) into itself is linear and compact (i.e., K ∈ L_c(L²(Z)); see Proposition 3.1.3), and it is also self-adjoint, because for every h, f ∈ L²(Z) we have
\[
\int_Z \big(DK(h), DK(f)\big)_{\mathbb{R}^N}\, dz = \int_Z K(h) f\, dz = \int_Z h\, K(f)\, dz.
\]
Finally note that
\[
\int_Z K(h)\, h\, dz = \int_Z \big(DK(h), DK(h)\big)_{\mathbb{R}^N}\, dz = \|DK(h)\|_2^2 > 0 \qquad \text{for all } h \in L^2(Z)\setminus\{0\},
\]
so K ∈ L_c(L²(Z)) is positive definite. So we can apply Theorem 3.1.37 and produce a sequence {μ_n}_{n≥1} of eigenvalues of K with μ_n −→ 0 and {u_n}_{n≥1} ⊆ L²(Z) a sequence of corresponding eigenfunctions that form an orthonormal basis of L²(Z). We have K(u_n) = μ_n u_n.
Because clearly K = A^{-1}, we have
\[
A(u_n) = \lambda_n u_n \qquad \text{with } \lambda_n = \frac{1}{\mu_n}.
\]
So 0 < λ_1 ≤ λ_2 ≤ · · · ≤ λ_n ≤ · · ·, λ_n −→ +∞, {u_n}_{n≥1} ⊆ H_0^1(Z), and the u_n are eigenfunctions of (−Δ, H_0^1(Z)). Moreover, from Theorem 4.3.4 we have that u_n ∈ H_0^1(Z) ∩ C^∞(Z) for all n ≥ 1. Finally, if in H_0^1(Z) we consider the inner product
\[
\langle x, y\rangle_{H_0^1(Z)} = \int_Z (Dx, Dy)_{\mathbb{R}^N}\, dz \qquad \text{for all } x, y \in H_0^1(Z),
\]
then we have
\[
\Big\langle \frac{1}{\sqrt{\lambda_n}}u_n,\ \frac{1}{\sqrt{\lambda_k}}u_k \Big\rangle_{H_0^1(Z)}
= \frac{1}{\sqrt{\lambda_n \lambda_k}} \int_Z (Du_n, Du_k)_{\mathbb{R}^N}\, dz
= \frac{\sqrt{\lambda_n}}{\sqrt{\lambda_k}} \int_Z u_n u_k\, dz = \delta_{nk}.
\]
Also if y ∈ H_0^1(Z) is such that ⟨y, u_n⟩_{H_0^1(Z)} = 0 for all n ≥ 1, then
\[
0 = \int_Z (Dy, Du_n)_{\mathbb{R}^N}\, dz = \lambda_n \int_Z y\, u_n\, dz
\;\Rightarrow\; \int_Z y\, u_n\, dz = 0 \ \text{ for all } n \geq 1
\;\Rightarrow\; y = 0
\]
(because {u_n}_{n≥1} ⊆ L²(Z) is a basis). Therefore we conclude that {(1/√λ_n) u_n}_{n≥1} is an orthonormal basis of H_0^1(Z).
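The conclusions of Theorem 4.3.11 can be observed numerically on a simple domain. The following Python sketch is an added illustration (the grid size and the domain are arbitrary choices): it approximates the Dirichlet eigenvalues of −Δ on the unit square by the standard five-point finite-difference Laplacian and compares them with the exact values π²(m² + k²).

```python
import numpy as np

# Finite-difference approximation of the Dirichlet spectrum of -Δ on (0,1)^2,
# illustrating 0 < λ1 < λ2 ≤ ... and λn → +∞ (Theorem 4.3.11).
n = 40                                             # interior grid points per direction
h = 1.0 / (n + 1)
T = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # 1-D Dirichlet Laplacian
L2d = np.kron(T, np.eye(n)) + np.kron(np.eye(n), T)               # 2-D five-point Laplacian

approx = np.sort(np.linalg.eigvalsh(L2d))[:5]
exact = np.sort([np.pi**2 * (m**2 + k**2) for m in range(1, 4) for k in range(1, 4)])[:5]
print(np.round(approx, 2))
print(np.round(exact, 2))
```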
REMARK 4.3.12 If ∂Z is a C^∞-manifold, then from Theorem 4.3.10 we have that u_n ∈ C^∞(Z̄) for all n ≥ 1.

Next we produce variational characterizations for the eigenelements {(λ_n, u_n)}_{n≥1}. For this purpose we introduce the so-called Rayleigh quotient
\[
R(x) = \frac{\|Dx\|_2^2}{\|x\|_2^2} \qquad \text{for all } x \in H_0^1(Z),\ x \neq 0. \tag{4.54}
\]

THEOREM 4.3.13 We have
\[
\begin{aligned}
\lambda_1 &= \min\{R(x) : x \in H_0^1(Z),\ x \neq 0\} = R(u_1), &&(4.55)\\
\lambda_n &= \max\big\{R(x) : x \in \mathrm{span}\{u_k\}_{k=1}^{n},\ x \neq 0\big\} &&(4.56)\\
&= \min\big\{R(x) : x \in \big(\mathrm{span}\{u_k\}_{k=1}^{n-1}\big)^{\perp},\ x \neq 0\big\} &&(4.57)\\
&= \min_{\substack{Y \subseteq H_0^1(Z)\\ \dim Y = n}}\ \max_{y \in Y,\, y\neq 0} R(y) &&(4.58)\\
&= \max_{\substack{V \subseteq H_0^1(Z)\\ \dim V^{\perp} = n-1}}\ \min_{v \in V,\, v\neq 0} R(v) = R(u_n), \qquad n \geq 2. &&(4.59)
\end{aligned}
\]
PROOF: Let Y_n = span{u_k}_{k=1}^{n}, n ≥ 1. Clearly for all n ≥ 1, λ_n = R(u_n) (see (4.45)). If v ∈ Y_{n−1}^⊥, then the expansion of v in L²(Z) with respect to the orthonormal basis {u_n}_{n≥1} is given by
\[
v = \sum_{k \geq n} \vartheta_k u_k \qquad \text{with } \vartheta_k = \int_Z v\, u_k\, dz.
\]
Let v_m = Σ_{k=n}^{m} ϑ_k u_k. Then v_m −→ v in L²(Z) and in H_0^1(Z) (see Theorem 4.3.11). Hence R(v_m) −→ R(v). We have
\[
R(v_m) = \frac{\sum_{k=n}^{m} \vartheta_k^2 \|Du_k\|_2^2}{\sum_{k=n}^{m} \vartheta_k^2}
= \frac{\sum_{k=n}^{m} \lambda_k \vartheta_k^2}{\sum_{k=n}^{m} \vartheta_k^2} \geq \lambda_n,
\;\Rightarrow\; R(v) \geq \lambda_n.
\]
Because R(u_n) = λ_n, we conclude that (4.57) holds, hence (4.55) holds too (let n = 1). On the other hand, if y ∈ Y_n, then y = Σ_{k=1}^{n} ϑ_k u_k and so
\[
R(y) = \frac{\sum_{k=1}^{n} \lambda_k \vartheta_k^2}{\sum_{k=1}^{n} \vartheta_k^2} \leq \lambda_n.
\]
4.3 Spectrum of the Laplacian and of the p-Laplacian
303
Now let Y be an n-dimensional linear subspace of H01 (Z). Let y0 ∈ Y such that y0 ⊥span{uk }n−1 k=1 . Then from (4.57) we have that R(y0 ) ≥ λn , ⇒ max R(y) ≥ λn . y∈Y
Because Y was an arbitrary n-dimensional linear subspace of H01 (Z), we obtain min 1 (Z) Y ⊆H0
max R(y) ≥ λn .
(4.60)
y∈Y
dim Y =n
Then because of (4.56), we see that equality must hold in (4.60). Dually we obtain (4.59). REMARK 4.3.14 Characterizations (4.58) and (4.59) are more satisfactory than (4.56) and (4.57), since they do not require a priori knowledge of the spectrum of
− , H01 (Z) . Moreover, we can say that (4.58) is more satisfactory than (4.59), because (4.58) involves a finite dimensional maximization and (4.59) has an infinite dimensional minimization. Expressions (4.58) and (4.59) are also known as Courant’s minimax principles (see also Section 4.2). Next we focus on the eigenelement {(λ1 , u1 )}. PROPOSITION 4.3.15 If for some x ∈ H01 (Z), x = 0 we have R(x) = λ1 , then x ∈ H01 (Z) is an eigenfunction corresponding to the eigenvalue λ1 > 0. PROOF: Let y ∈ H01 (Z). Then we have R(x + ry) ≥ R(x) = λ1
for all r > 0.
Without any loss of generality, we may assume that x2 = 1. Then we have D(x + ry)22 ≥ Dx22 = λ1 x + ry22
⇒ r2 Dy22 + 2r (Dx, Dy)RN dz ≥ λ1 2r xydz + r2 y22 . Z
Z
Dividing with r > 0 and then letting r −→ 0, since y ∈ H01 (Z) was arbitrary, we obtain
(Dx, Dy)RN dz = λ1 xydz, Z Z ⇒ −x(z) = λ1 x(z) a.e. on Z, x∂Z = 0. To fully describe the first eigenelement (λ1 , u1 ), we need the so-called strong maximum principle. The maximum principles for second-order linear partial differential equations are based upon the observation that if u ∈ C 2 (Z) attains its maximum over an open Z at a point z0 ∈ U , then
304
4 Critical Point Theory and Variational Methods Dx(z0 ) = 0
and
D2 u(z0 ) ≤ 0,
(4.61)
where the second -order condition means that the symmetric N×N matrix D2 u(z0 ) = N
Dj Di u(z0 ) i,j=1 is negative definite. Evidently arguments based on (4.61) are pointwise in nature, in contrast to the integral-based energy methods that we employed so far. Although we use the strong maximum principle in the context of the Laplacian differential operator, we state and prove the strong maximum principle for general second-order elliptic differential operators in nondivergence form Lu = −
N
aij (z)Dj Di u +
i,j=1
N
bi (z)Di u + cu.
(4.62)
i=1
We assume that aij , bi , c ∈ C(Z), the uniform ellipticity condition (4.47) holds and without any loss of generality, we also have the symmetry condition aij = aji for all i, j ∈ {1, . . . , N }. First let us state the so-called weak maximum principle, which is used in the proof of the strong maximum principle. For its proof we refer to Evans [229, p. 333]. THEOREM 4.3.16 If L is as in (4.62) with c ≥ 0 and u ∈ C 2 (Z) ∩ C(Z), then (a) If Lu(z) ≤ 0 for all z ∈ Z, we have max u ≤ max u+ (u+ = max{u, 0}). Z
∂Z
(b) If Lu(z) ≥ 0 for all z ∈ Z, we have min u ≥ − max u− (u− = max{−u, 0}). Z
∂Z
REMARK 4.3.17 If Lu(z) ≤ 0 for all z ∈ Z, we say that u is a lower solution of the equation Lu(z) = 0. Similarly if Lu(z) ≥ 0 for all z ∈ Z, we say that u is an upper solution of the equation Lu(z) = 0. In Section 5.2 we generalize these concepts. Note that if Lu(z) = 0 for all z ∈ U , then max |u| = max |u|. Finally if ∂Z
Z
c ≡ 0, then in (a) we have max u = max u and in (b) min u = min u. Z
∂Z
Z
∂Z
The strong maximum principle is essentially the following lemma, known in the literature as Hopf ’s lemma. LEMMA 4.3.18 If u ∈ C 2 (Z) ∩ C(Z), L is as in (4.62), Lu(z) ≤ 0 for all z ∈ Z, there exists z0 ∈ ∂Z such that u(z) < u(z0 ) for all z ∈ Z, and Z satisfies the interior ball condition at z0 ∈ ∂Z (i.e., there exists an open ball B ⊆ Z such that z0 ∈ ∂B; this is the case if ∂Z in C 2 ), then (a) If c = 0, we have (∂u/∂n)(z0 ) > 0 with n being the outward unit normal at z0 ∈ ∂Z. (b) If c ≥ 0, then the same conclusion holds provided u(z0 ) ≥ 0. PROOF: We assume c ≥ 0 and cover both (a) and (b). By translating things if necessary, we may assume without any loss of generality that B = Br (0), r > 0. Let 2
v(z) = e−λz − e−λr We have
2
for all z ∈ Br (0), λ > 0 to be chosen later.
4.3 Spectrum of the Laplacian and of the p-Laplacian Lv(z) = −
N
aij (z)Dj Di v(z) +
i,j=1
= e−λz
N
305
bi (z)Di v(z) + c(z)v(z)
i=1 N
2
aij (z) − 4λ2 zi zj + 2λδij
i,j=1
−e−λz
2
−λz2
≤e
N
2 2 bi (z)2λzi + c(z) e−λz − e−λr
i=1
− 4ϑλ2 z2 + 2λtrA(z) + 2λb(z)z + c(z)
(see (4.47))
N
N
for A(z) = aij (z) ij=1 and b(z) = bi (z) i=1 . Consider the open ring R = Br (0) \ B r/2 (0). We have Lv(z) ≤ e−λz
2
− ϑλ2 r2 + 2λtrA(z) + 2λb(z)r + c(z) ≤ 0
(4.63)
for all z ∈ R, provided we choose λ > 0 large enough. By virtue of the hypothesis about z0 ∈ ∂Z, we see that we can find ε > 0 small such that u(z) + εv(z) ≤ u(z0 ) for all z ∈ ∂Br/2 (0). (4.64) ≤ 0, we have that Moreover, because v ∂Br (0)
u(z) + εv(z) ≤ u(z)
for all z ∈ ∂Br (0).
(4.65)
Using (4.63) and the fact that Lu(z) ≤ 0 for all z ∈ Z, we obtain
L u(z) + εv(z) − u(z0 ) ≤ −cu(z0 ) ≤ 0 for all z ∈ R, (recall either c = 0 or u(z0 ) > 0). Also from (4.64) and (4.65), we have that u + εv − u(z0 )∂R ≤ 0. Invoking Theorem 4.3.16 (the weak maximum principle), we infer that u(z) + εv(z) − u(z0 ) ≤ 0
for all z ∈ R.
Note that u(z0 ) + εv(z0 ) − u(z0 ) = 0. Hence ∂/∂n u + εv − u(z0 ) (z0 ) = ∂u/∂n (z0 ) + ε ∂v/∂n (z0 ) ≥ 0. Consequently 2 ∂u ∂v ε
(z0 ) ≥ −ε (z0 ) = − Dv(z0 ), z0 RN = 2λεre−λr > 0. ∂n ∂n r
Using this lemma, we can prove the strong maximum principle. THEOREM 4.3.19 If L is as in (4.62) with c=0, Z ⊆RN is bounded, open, and connected, and u ∈ C 2 (Z) ∩ C(Z), then (a) If Lu(z) ≤ 0 for all z ∈ Z and u attains its maximum on Z at an interior point, then u is constant on Z.
306
4 Critical Point Theory and Variational Methods
(b) If Lu(z) ≥ 0 for all z ∈ Z and u attains its minimum on Z at an interior point, then u is constant on Z. PROOF: (a) Let M = max u and E = {z ∈ U : u(z) = M }. Suppose u is not Z
constant (i.e., u = M ), and let V = {z ∈ Z : u(z) < M } = ∅. This is an open set and we can choose x ∈ V such that d(x, E) < d∗ (x, ∂Z). Let B be the largest open ball with center at x located in V . Then we can find z0 ∈ E such that z0 ∈ ∂B. Evidently V satisfies the interior ball condition at z0 . So by virtue of Lemma 4.3.18 (∂u/∂n)(z0 ) > 0. But because u attains its supremum at z0 , Du(z0 ) = 0 and so (∂u/∂n)(z0 ) = 0, a contradiction. (b) Apply part (a) on −u.
Using part (b) of Lemma 4.3.18, we can have the following variant of the strong maximum principle. THEOREM 4.3.20 If L is as in (4.62) with c ≥ 0, Z ⊆ RN is bounded, open, and connected, and u ∈ C 2 (Z) ∩ C(Z), then (a) If Lu(z) ≤ 0 for all z ∈ Z and u attains a nonnegative maximum on Z at an interior point, then u is constant on Z. (b) If Lu(z) ≥ 0 for all z ∈ Z and u attains a nonpositive minimum on Z at an interior point, then u is constant on Z. We return to the linear eigenvalue problem (4.45) and using Theorem 4.3.19, we fully describe the first eigenelement (λ, u1 ). THEOREM 4.3.21 If Z ⊆ RN is a bounded domain (i.e., a bounded, open, and connected set) with a C ∞ -boundary ∂Z, then λ1 > 0 is simple and the corresponding eigenfunction u1 ∈ C ∞ (Z) does not change sign and does not vanish on Z (so we may assume that u1 (z) > 0 for all z ∈ Z). − 1 PROOF: We know that u+ 1 , u1 ∈ H0 (Z) and we have
+ 2 + 2 (Du1 , Du+ u1 u+ 1 )RN dz = λ1 1 dz ⇒ Du1 2 = λ1 u1 2 Z
and Z
Z
(Du1 , Du− 1 )RN dz = λ1
(4.66) − 2 − 2 u1 u− 1 dz ⇒ Du1 2 = λ1 u1 2 .
Z
(4.67) − If u1 does change sign, then u+ 1 , u1 = 0 and from (4.66) and (4.67), we have − R(u+ 1 ) = λ1 = R(u1 ).
(4.68)
− Invoking Proposition 4.3.15 from (4.68) we deduce that u+ 1 , u1 are eigenfunctions + − for λ1 > 0. Because u1 , u1 ≥0, from Theorem 4.3.19(b), we deduce that − u+ 1 (z) > 0 and u1 (z) > 0
for all z ∈ Z,
a contradiction. So u1 must have constant sign on Z and we can always assume u1 (z) > 0 for all z ∈ Z.
4.3 Spectrum of the Laplacian and of the p-Laplacian
307
If λ1 > 0 is not simple, then we can find another eigenfunction u1 ∈ C ∞ (Z) such that
u1 u1 dz = 0. (4.69) Z
From the first part of the proof we know that u1 , u1 do not change sign on Z and we can assume u1 , u1 > 0 on Z. This contradicts (4.69) and so λ1 > 0 is simple. REMARK 4.3.22 The hypothesis that ∂Z is C ∞ was made for convenience. As we show later in this section, in our discussion of the p-Laplacian, it suffices to assume that ∂Z is C 2 (see Theorem 4.3.35). What we need in the proof of Theorem 4.3.21 is to guarantee that u1 ∈ C(Z) which allows us to use Theorem 4.3.19. Note that if ∂Z is Lipschitz and N ≤ 3, then since H 2 (Z) ⊆ C(Z) (Sobolev embedding theorem) and u1 ∈ H 2 (Z) (see Theorem 4.3.6), we have u1 ∈ C(Z). For N > 3, note that we can use Theorem 4.3.8, in order through more smoothness of the manifold ∂Z to achieve higher regularity of u1 and eventually guarantee that u1 ∈ C(Z). But as we already indicated we can do better.
Now we consider the Neumann linear eigenvalue problem i.e., we determine the
spectrum of − , H 1 (Z) . So we consider the following problem. ⎧ ⎫ ⎨ −x(z) = λx(z) a.e. on Z, ⎬ . (4.70) ⎩ ∂x ⎭ (z) = 0 for all z ∈ ∂Z ∂n Here ∂x/∂n denotes the derivative in the direction of the exterior normal on ∂Z. Working as for the Dirichlet linear eigenvalue problem, we can have the following fundamental theorem for problem (4.70). THEOREM 4.3.23 If Z ⊆ RN is a bounded domain with a C ∞ -boundary ∂Z, then problem (4.70) has countably many eigenvalues {λn }n≥0 such that 0 = λ0 < λ1 ≤ λ2 ≤ · · · ≤ λn ≤ · · · ,
with λn −→ +∞
and the corresponding eigenfunctions {un }n≥0 ⊆ H 1 (Z) form an orthonormal ba √ −(1/2)
, un / λn ⊆ H 1 (Z) form an orthonormal basis for sis for L2 (Z), |Z|N n≥0
H 1 (Z) λ0 = 0 simple with corresponding eigenspace R (i.e., the constant functions) and we have the following variational characterizations of the eigenvalues λ0 = min{R(u) : u ∈ H 1 (Z), u = 0}, n ≥ 1, λn = max R(u) : u ∈ span{uk }n k=0 , u = 0 ⊥ , u = 0 = min R(u) : u ∈ span{uk }n−1 k=0 =
min
max R(y)
Y ⊆H 1 (Z) y∈Y
dim Y =n+1 y=0
=
max
min R(v)
V ⊆H 1 (Z) v∈V
dim V ⊥ =n
v=0
where recall R(u) = Du22 u22 (the Rayleigh quotient).
(4.71)
308
4 Critical Point Theory and Variational Methods
REMARK 4.3.24 Because H01 (Z) ⊆ H 1 (Z), we deduce that if by λD we n n≥1 denote the Dirichlet eigenvalues and by λN the Neumann eigenvalues, for all n n≥0
D n ≥ 1 we have λN n−1 ≤ λn .
open sets and PROPOSITION 4.3.25 If Z1 ⊆ Z2 ⊆ RN are bounded
λn (Z1 ) n≥1 (resp., λn (Z2 ) n≥1 ) are the eigenvalues of − , H01 (Z1 ) (resp.,
− , H01 (Z2 ) ), then λn (Z2 ) ≤ λn (Z1 ) for all n ≥ 1. PROOF: Note that any u ∈ H01 (Z1 ) can be extended to a function u ∈ H01 (Z2 ), by u(z) if z ∈ Z1 . u(z) = 0 if z ∈ Z2 \ Z1 So H01 (Z1 ) ⊆ H01 (Z2 ) and this by virtue of (4.58) implies that λn (Z2 ) ≤ λn (Z1 ) for all n ≥ 1. REMARK 4.3.26 This proposition is also valid for the Neumann eigenvalues but the proof is more involved. The next theorem is known as Courant’s nodal set theorem and its proof can be found in Courant–Hilbert [164]. THEOREM 4.3.27 If Z ⊆RN is a bounded open set, {λn }n≥1 ⊆ R+ are the eigen
values of − , H01 (Z) (i.e., of problem (4.45)) and {un }n≥1 ⊆ H01 (Z) ∩ C ∞ (Z) are the corresponding eigenfunctions, and for every n ≥ 1 Zn = {z ∈ Z : un (z) = 0} (the nodal set of un ), then Z \ Zn has at most n components. Before passing to nonlinear problems and in particular to the study of the spectral properties of the p–Laplacian, let us briefly summarize the situation with the scalar eigenvalue problems (i.e., N = 1, ordinary differential operators). So first consider the following eigenvalue problem: * ) −x (t) = λx(t) a.e. on (0, b), . (4.72) x(0) = x(b) = 0 and the PROPOSITION 4.3.28 The eigenvalues of (4.72) are λn =(nπ/b)2 n≥1
eigenfunctions are un (t)= 2/b sin (nπ/b)t . n≥1
Next consider the Neumann problem: ) * −x (t) = λx(t) a.e. on (0, b), . x (0) = x (b) = 0
(4.73)
2 and PROPOSITION 4.3.29 The eigenvalues of (4.73) are λn = nπ/b n≥0 √
the eigenfunctions are u0 (t)=1/ b, un (t)= 2/b cos (nπ/b)t . n≥1
4.3 Spectrum of the Laplacian and of the p-Laplacian
309
Finally we consider the periodic problem: ) * −x (t) = λx(t) a.e. on (0, b), . x(0) = x(b), x (0) = x (b)
(4.74)
2 PROPOSITION 4.3.30 The eigenvalues of (4.74) are λn = 2nπ/b n≥0 √ √
and the eigenfunctions are u0 (t) = 1/ b, un (t) = (2/ b) sin (2nπ/b)t +
cos (2nπ/b)t . n≥1
Now we turn our attention to nonlinear eigenvalue problems. Namely we study the spectral properties of the p-Laplacian differential operator −p x = −div(Dxp−2 Dx),
p > 1.
(4.75)
We start with some regularity results for elliptic equations involving the pLaplacian (see (4.75)). In what follows Z ⊆ RN is a bounded domain with a C 2 –boundary ∂Z. Also for 1 < p < ∞, by p∗ we denote the critical Sobolev exponent, namely Np if p < N N −p p∗ = . (4.76) +∞ if p ≥ N We start with a result on Sobolev functions, whose proof can be found in Brezis– Browder [103].
m,p −m,p (Z) ∩ LEMMA
4.3.31 If x ∈ W0 (Z) with m ≥ 1, 1 <1 p < +∞, S ∈ W 1 Lloc (Z) (1/p) + (1/p ) = 1 , and for some g ∈ L (Z) we have g(z) ≤ S(z)x(z) a.e. on Z, then S(x) ∈ L1 (Z) and S, x = Z S(z)x(z)dz (by ·, · we denote the
duality brackets for the pair W0m,p (Z), W −m,p (Z) ).
LEMMA 4.3.32 If x ∈ W01,p (Z), p x ∈ L1loc (Z), c > 0, ϑ ∈ [1, p∗ ) (see (4.76)),
q ∈ 1, (p∗ /p) , and α ∈ Lq (Z)+ , (1/q) + (1/q ) = 1 are such that −x(z)p x(z) ≤ α(z)|x(z)| + c|x(z)|ϑ and
a.e. on Z
(4.77)
p0 = pn+1
if p∗ < +∞ p∗ 2 max{pq, ϑ} if p∗ = +∞ p0 pn = p0 + min pn − ϑ, −1 p q
(see (4.76)), for all n ≥ 0,
then x ∈ Lpn (Z) for all n ≥ 0. PROOF: From the Sobolev embedding theorem, we have that x ∈ Lp0 (Z). We proceed by induction. So suppose that x ∈ Lpn (Z). We show that x ∈ Lpn+1 (Z). Consider {yk }k≥1 the truncations of u at the levels −k and k; that is, ⎧ ⎪ if k ≤ x(z) ⎨k yk (z) = x(z) if − k ≤ x(z) ≤ k . ⎪ ⎩ −k if x(z) ≤ −k
310
4 Critical Point Theory and Variational Methods
Set a = p(pn+1 − p0 ) p0 . Clearly a ≥ 0 because we have that p0 ≤ pn for all n ≥ 0. Then we have −|yk (z)|a yk (z)p x(z) ≤ α(z)|x(z)|a+1 +c|x(z)|ϑ+a
a.e. on Z (see (4.77)). (4.78)
By H¨ older’s inequality, we have
ϑ+a α(z)|x(z)|a+1 + c|x(z)|ϑ+a dz ≤ αq xa+1 q(a+1) + cxϑ+a .
(4.79)
Z
Because pn = max{ϑ + a, q(a + 1)}, from (4.79) it follows that
α(z)|x(z)|a+1 + c|x(z)|ϑ+a dz ≤ c1 xppnn + 1 for some c1 > 0.
(4.80)
Z
Because |yk |a yk ∈W01,p (Z), p x∈W −1,p (Z) ∩ L1loc (Z), and |yk |a yk p x is bounded below by an L1 (Z)-function (see (4.78) and (4.80)), from Lemma 4.3.31 we have that |yk |a yk p x ∈ L1 (Z) and
|yk |a yk p xdz = −p u, |yk |a yk , − Z
where by ·, · we denote the duality brackets for the pair W01,p (Z), W −1,p (Z) . Hence after integration by parts, we obtain
− |yk |a yk p xdz = (a + 1) |yk |a Dyk ϑ dz Z Z p p D(|yk |a/p yk )p dz = (a + 1) a+p Z 1 p a+p
, ≥ p yk (4.81) p p0 a+p c0 a + p where c0 > 0 is the embedding constant of W01,p (Z) into Lp0 (Z) (i.e., up0 ≤ c0 u1,p for all u ∈ W01,p (Z)). Combining (4.78), (4.80) and (4.81), we obtain p(pn+1 ) p0
yk pn+1
≤ c2 ppn+1 (xppnn + 1)
p with c2 = c1 c0 /p0 . Because yk (z) −→ x(z) a.e. on Z, we have
p(pn+1 ) p0
xpn+1
p(pn+1 ) p0
≤ lim inf yk pn+1 k→∞
≤ c2 ppn+1 (xppnn + 1),
⇒ x ∈ Lpn+1 (Z).
(4.82)
So we have completed the induction and we can say that x ∈ Lpn (Z) for all n ≥ 1. Next we consider the sequence {qn }n≥0 defined by q0 = p0
and
qn+1 = qn + q(p − 1) δ
with
δ=
p0 , pq
n ≥ 0.
(4.83)
LEMMA 4.3.33 If all the hypotheses of Lemma 4.3.32 are satisfied and {qn }n≥0 ⊆ R+ is defined by (4.83), then x∈Lqn (Z) for all n ≥ 0 and qn+1
qn
p δq xqn+1 ≤c3 qn+1 xqnq ,
where c3 > 0 depends only on αq , c, ϑ, N, p, q and xp0 .
PROOF: If r0 = min{q(p0 − ϑ), p0 − q}, for every n ≥ 1 we have p0 + r0
n
δ k = p0 + r0 δ
k=1
δn − 1 ≤ pn . δ−1
(4.84)
Because r0 > 0 and δ > 1, from (4.84) we see that pn −→ +∞. So from Lemma 4.3.32 we have that x ∈ Lr (Z) for all 1 ≤ r < ∞. Then |x|ϑ−1 ∈ Lq (Z) (see (4.77)) and from (4.82) we have that |x|ϑ−1 q ≤ c3
with c3 > 0 depending only on αq , c, ϑ, N, p and xp0 . We have
α(z)|x|qn /q + c|x|ϑ−1 dz ≤ cc3 + αq xqqnn /q . Z
If in (4.78) we replace a by (qn /q) − 1 and argue as in the proof of Lemma 4.3.32, we reach the conclusion of the lemma. Now we are ready for the first nonlinear regularity result. THEOREM 4.3.34 If x ∈ W01,p (Z), p x ∈ L1loc (Z), c > 0, ϑ ∈ [1, p∗ ), q ∈ 1, (p∗ /p) and α ∈ Lq (Z)+ are such that −x(z)p x(z) ≤ α(z)|x(z)| + c|x(z)|ϑ
a.e. on Z,
only on then x∈L∞ (Z) and x∞ ≤c4 with c4 > 0 depending
p∗ if p∗ < +∞ . αq , c, ϑ, N, p, q and xp0 where p0= 2 max{pq, ϑ} if p∗ = +∞ p ) with c3 > 0 as in Lemma PROOF: Let ξn = qn ln(xqn ) and βn = q ln(c3 qn+1 4.3.33. Then from Lemma 4.3.33, we have
ξn+1 ≤ δ(ξn + βn ) for all n ≥ 1, n k δ βn−k for all n ≥ 1. ⇒ ξn ≤ δ n ξ0 + k=1
n n n n Note that qn =
δ p0 + δq(p − 1) (δ − 1)/(δ − 1) . So we have δ p0 ≤ qn ≤ δ c4 with c4 = p0 + δq (p − 1)/(δ − 1) , hence βn ≤ γ1 + (n + 1)γ2 with γ1 = q ln(c3 cp4 ) and γ2 = pq ln δ. Then through some elementary calculations, we obtain
312
4 Critical Point Theory and Variational Methods n
δ k βn−k ≤ γ3 δ n
k=1
with γ3 = γ1 + γ2 δ (δ − 1) δ (δ − 1) . So finally xqn ≤ exp ξn δ n p0 ≤ exp (ξ0 + γ3 ) po ⇒ x∞ ≤ lim sup xqn ≤ exp (ξ0 + γ3 ) po = c4 . n→∞
Using this theorem, we can state a basic nonlinear regularity result, which is actually a particular case of a result of Lieberman [382]. THEOREM 4.3.35 If x ∈ W01,p (Z) ∩ L∞ (Z) and p x ∈ L∞ (Z), then x ∈ C 1,α (Z) and xC 1,α (Z) ≤ c5 with 0 < α < 1 and c5 > 0 depending only on
N, p, x∞ , p x∞ . REMARK 4.3.36 In fact the above regularity theorem is still valid if instead of the Dirichlet boundary condition, we have the Neumann boundary condition (so W01,p (Z) is replaced by W 1,p (Z)); see Lieberman [382, Theorem 2]. In addition to the nonlinear regularity result, we also need the following nonlinear strong maximum principle which generalizes Lemma 4.3.18 (Hopf’s lemma) and Theorem 4.3.19. The result is due to Vazquez [593] and the interested reader can look there for its proof. THEOREM 4.3.37 If x ∈ C 1 (Z), x(z) ≥ 0 for all z ∈ Z, x = 0, p x ∈ L2loc (Z), and there exists continuous nondecreasing function β : R+ −→ R such that β(0) = 0 and either β(r0 ) = 0 for some r0 > 0 or
1 1
1/p dr = +∞, 0 rβ(r) for which we have
p x(z) ≤ β x(z)
a.e. on Z,
then x(z) > 0 for all z ∈ Z; moreover, if z0 ∈ ∂Z, x ∈ C 1 (Z ∪ {z0 }) and x(z0 ) = 0, then (∂x/∂n)(z0 ) < 0. REMARK 4.3.38 In fact the result is true for any bounded domain Z ⊆ RN . In this case, as in Lemma 4.3.18, we need to assume that Z satisfies the interior ball condition at z0 ∈ ∂Z. Recall that this condition is automatically satisfied at every boundary point if ∂Z is a C 2 -manifold. Now we have at our disposal all the necessary analytical tools to start investigating the spectral properties of − p , W01,p (Z) . So let m ∈ L∞ (Z) and assume that {m > 0} > 0. N
We consider the following nonlinear weighted eigenvalue problem with weight m ∈ L∞ (Z) \ {0}.
) * p−2 Du(z)) = λm(z)|u(z)|p−2 u(z) a.e. on Z, −div Du(z) . (4.85) u∂Z = 0, λ ∈ R, 1 < p < ∞
We say that λ ∈ R is an eigenvalue of −p , W01,p (Z), m , if problem (4.85) has a corresponding nontrivial solution u ∈ W01,p (Z). This solution is an eigenfunction
to the eigenvalue λ ∈ R. The pair (λ, u) is an eigenelement of − p , W01,p (Z), m . The first observation about problem (4.85) is a straightforward consequence of the previous discussion on nonlinear regularity theory and the nonlinear strong maximum principle.
PROPOSITION 4.3.39 If u ∈ W01,p (Z) is an eigenfunction of − p , W01,p (Z) , then u ∈ C 1,α (Z) for some 0 < α < 1; moreover, if u ≥ 0, then u(z) > 0 for all z ∈ Z and (∂u/∂n)(z) < 0 for all z ∈ ∂Z. PROOF: The fact that u ∈ C 1,α (Z) for some α ∈ (0, 1) is a consequence of Theorem 4.3.35. Also if u ≥ 0, then because u = 0, p u ∈ L∞ (Z), and p u(z) ≤ |λ|m∞ u(z)p−1 a.e. on Z, we can apply Theorem 4.3.37 and conclude that u(z) > 0 for all z ∈ Z and (∂u/∂n)(z) < 0 for all z ∈ ∂Z. REMARK 4.3.40 If we consider the Banach space C01 (Z) = {x ∈ C 1 (Z) : x∂Z = 0}, then this becomes an ordered Banach space (OBS), with order cone C01 (Z)+ = {x ∈ C01 (Z) : x(z) ≥ 0
for all z ∈ Z}.
It is well known that int C01 (Z)+ is nonempty and more precisely, we have ∂x (z) < 0 int C01 (Z)+ = x ∈ C01 (Z)+ : x(z) > 0 for all z ∈ Z and ∂n for all z ∈ ∂Z . So Proposition 4.3.39 says that, if u ∈ W01,p (Z) is a nonnegative eigenfunc
tion of −p , W01,p (Z), m , then u ∈ int C01 (Z)+ . When m ≡ 1, we simply write
−p , W01,p (Z) . Now consider the functions ξ, η : W01,p (Z) −→ R+ defined by
1 1 and η(x) = m|x|p dz. ξ(x) = Dxpp p p Z Also for the needs of the argument that follows, we introduce the function η : Lp (Z, RN ) −→ R+ defined by 1 ypLp (Z,RN ) , y ∈ Lp (Z, RN ). p
Evidently ξ, η ∈ C 1 W01,p (Z) and η ∈ C 1 Lp (Z, RN ) . Note that η(y) =
314
4 Critical Point Theory and Variational Methods
x 1 m(z)|r|p−2 rdr for all z ∈ Z, all x ∈ R. m(z)|x|p = p 0
So from Proposition 1.1.14 we have η (x)(·) = m(·)|x(·)|p−2 x(·).
(4.86)
Similarly from Proposition 3.2.17 and Example 3.2.21(a), we have η (y)(·) = y(·)p−2 y(·) for all y ∈ Lp (Z, RN ). (4.87) RN
1,p Note that ξ = η ◦ D, D ∈ L W0 (Z), Lp (Z, RN ) being the gradient operator. Hence from the chain rule (see Proposition 1.1.14), we have ξ (x) = η (Dx) ◦ D
Dxp−2 (Dx, Du)RN dz ⇒ ξ (x), u =
(see (4.87)).
Z
So, if we define A : W01,p (Z) −→ W −1,p (Z) by
A(x), u = Dxp−2 (Dx, Du)RN dz for all x, u ∈ W01,p (Z),
(4.88)
Z
then ξ (x) = A(x). We recall the following elementary inequalities: p−1 v1 − v2 2 (v1 + v2 )p−2≤(v1 p−2 v1 − v2 p−2 v2 , v1 − v2 )RN 2 ≤ 22−p v1 − v2 p for all v1 , v2 ∈ RN , 1 < p ≤ 2, (4.89) and 22−p v1 − v2 p ≤ (v1 p−2 v1 − v2 p−2 v2 , v1 − v2 )RN ≤ cv1 − v2 2 (v1 + v2 )p−2
for all v1 , v2 ∈ RN , p ≥ 2.
(4.90)
From (4.88) and using (4.89) and (4.90) it follows easily that A is demicontinuous, strictly monotone (strongly monotone if p ≥ 2), and coercive. Hence A is maximal monotone (see Corollary 3.2.11) and surjective (see Corollary 3.2.28).
Moreover, recalling that D∗ = −div ∈ L Lp (Z, RN ), W −1,p (Z) , then we see that A(x)=−p x for all x∈W01,p (Z).
PROPOSITION 4.3.41 If A : W01,p (Z) −→ W −1,p (Z) is defined by (4.88), then A is of type (S)+ (see Definition 3.2.61(d)). w
PROOF: Let xn −→x in W01,p (Z) and suppose that lim sup A(xn ), xn − x≤0. A n→∞
is maximal monotone, therefore it is generalized pseudomonotone (see Proposition 3.2.56) and so A(xn ), xn −→ A(x), x (see Definition 3.2.53(b)), ⇒ Dxn p −→ Dxp . w
We know that Dxn −→ Dx in Lp (Z, RN ) and Lp (Z, RN ), being uniformly convex, has the Kadec–Klee property. Therefore we conclude that Dxn −→ Dx in Lp (Z, RN ) and so xn −→ x in W01,p (Z), which proves that A is of type (S)+ .
We define
\[
\lambda_1 = \inf\big\{\xi(u) : u \in W_0^{1,p}(Z),\ \eta(u) = 1\big\}
= \inf\Big\{\frac{1}{p}\|Du\|_p^p \;:\; u \in W_0^{1,p}(Z),\ \frac{1}{p}\int_Z m|u|^p\, dz = 1\Big\}. \tag{4.91}
\]
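Before analyzing (4.91), note that it can be explored numerically. The following Python sketch is an added illustration (grid size, tolerance, and the one-dimensional model setting Z = (0,1), m ≡ 1 are arbitrary choices): it minimizes the discretized Rayleigh quotient ∫|u′|^p dz / ∫|u|^p dz, in which the 1/p factors of (4.91) cancel (compare (4.111) below), over interior nodal values with zero boundary values. For p = 2 the infimum is the first Dirichlet eigenvalue π² of the Laplacian.

```python
import numpy as np
from scipy.optimize import minimize

# Approximating lambda_1 of (4.85) in the model case Z = (0,1), m ≡ 1, by minimizing
# the discretized Rayleigh quotient over nodal values with Dirichlet boundary data.
p, n = 2.0, 60                        # p and the grid size are arbitrary choices
h = 1.0 / n
nodes = np.linspace(h, 1.0 - h, n - 1)

def rayleigh(u_int):
    u = np.concatenate(([0.0], u_int, [0.0]))   # zero boundary values
    du = np.diff(u) / h
    return (np.sum(np.abs(du) ** p) * h) / (np.sum(np.abs(u) ** p) * h)

res = minimize(rayleigh, np.sin(np.pi * nodes), method="BFGS")
print("approximate lambda_1:", res.fun, "(pi^2 =", np.pi ** 2, "for p = 2)")
```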
PROPOSITION 4.3.42 If λ1 is defined by (4.91), then λ1 > 0 is an eigenvalue
of −p , W01,p (Z), m ; moreover u ∈ W01,p (Z) is an eigenfunction corresponding to λ1 > 0 if and only if ξ(u) − λ1 η(u) = 0 = inf ξ(x) − λ1 η(x) : x ∈ W01,p (Z) and u ∈ int C01 (Z)+ or u ∈ −int C01 (Z)+ . PROOF: Let {un }n≥1 ⊆ W01,p (Z), and un p = 1 be a minimizing sequence for problem (4.91). Then by Poincar´e’s inequality we have that {un }n≥1 ⊆ W01,p (Z) is bounded and so by passing to a suitable subsequence if necessary, we may assume that w un −→ u in W01,p (Z) and un −→ u in Lp (Z). From the weak lower semicontinuity of the norm functional in a Banach space, we have 1 1 (4.92) Dupp ≤ lim inf Dun pp = λ1 . n→∞ p p Also because un −→ u in Lp (Z), we have that (1/p) Z m|x|p dz=1. Therefore from (4.92) it follows that 1 Dupp = λ1 . p Then by the Lagrange multiplier rule we can find λ ∈ R such that A(u) = λη (u),
m|u|p dz (i.e., λ = λ1 ). ⇒ Dupp = λ Z
From the equation A(u) = λ1 η (u), it follows that λ1 > 0 is an eigenvalue of −p , W01,p (Z), m . Moreover, exploiting the p-homogeneity of the functions ξ and η, we have ξ(u) − λ1 η(u) = 0 = inf ξ(x) − λ1 η(x) : x ∈ W01,p (Z) .
Since u = 0 we must have u+ = 0 or u− = 0. We assume without any loss of generality that u+ = 0 (the other case is treated similarly). Then using as a test function u+ ∈ W01,p (Z) we obtain A(u), u+ = λ1 η (u), u+
⇒ Du+ pp = λ1 m|u+ |p dz, Z
⇒ u+ is an eigenfunction of − p , W01,p (Z), m . Then by virtue of Proposition 4.3.39, we have that u+ (z) > 0 for all z ∈ Z, hence u = u+ ∈ int C01 (Z)+ .
REMARK 4.3.43 The previous proposition gives a variational characterization of the eigenelement (λ1 , u), namely Dupp = λ1 upp and Dxpp ≥ λ1 xpp for all x ∈ W01,p (Z). Next we establish the simplicity of λ1 > 0. PROPOSITION 4.3.44 The eigenvalue λ1 > 0 is simple; that is, the corresponding eigenspace is one-dimensional. PROOF: Suppose that u≥0 and v≥0, u = v are both eigenfunctions corresponding to λ1 > 0. Then we have u, v∈int C01 (Z)+ (see Proposition 4.3.39) and A(u) = λ1 η (u)
(4.93)
A(v) = λ1 η (v).
(4.94) Note that u/v, v/u ∈ L∞ (Z) and set ϕ = (up − v p ) up−1 , ψ = (v p − up ) v p−1 . Evidently ϕ, ψ ∈ W01,p (Z) and we have
v p
v p−1 Dϕ = 1 + (p − 1) Dv (4.95) Du − p u u
u p−1
u p Du. (4.96) Dv − p and Dψ = 1 + (p − 1) v v and
We use ϕ as a test function in (4.93) and ψ as a test function in (4.94). We obtain
Dup−2 (Du, Dϕ)RN dz = λ1 m|u|p−2 uϕdz (4.97)
Z
Z and Dvp−2 (Dv, Dψ)RN dz = λ1 m|v|p−2 vψdz. (4.98) Z
Z
Using (4.95) and (4.96) in (4.97) and (4.98), respectively, and then adding the two equalities, we obtain
u p
v p 1 + (p − 1) Dup + 1 + (p − 1) Dvp dz u v
Z
v p−1 u p−1 = Dup−2 (Du, Dv)RN +p Dvp−2 (Dv, Du)RN dz. p u v Z (4.99) Note that D ln u=Du/u and D ln v=Dv/v. Therefore from (4.99) we have
(up − v p )(D ln up − D ln vp )dz
Z = pv p D ln up−2 (D ln u, D ln v − D ln u)RN dz Z
+ pup D ln vp−2 (D ln v, D ln u − D ln v)RN dz. (4.100) Z
If p ≥ 2, we recall the elementary inequality which says that
4.3 Spectrum of the Laplacian and of the p-Laplacian ξ2 2 − ξ1 2 ≥p ξ1 p−2 (ξ1 , ξ2 − ξ1 )RN +
1 ξ2 − ξ1 p 2p−1 − 1
317
for all ξ1 , ξ2 ∈RN . (4.101)
Therefore when p ≥ 2, we use (4.101) in (4.100) and have
1 1 1 + p vDu − uDvp dz = 0, 2p−1 − 1 Z v p u ⇒ v(z)Du(z) − u(z)Dv(z) = 0, a.e. on Z, ⇒ u = kv
for some k > 0.
If 1 < p < 2, then we recall the elementary inequality which says that ξ2 p − ξ1 p ≥ p ξ1 p−2 (ξ1 , ξ2 − ξ1 )RN +
3p(p − 1) ξ2 − ξ1 2 , 16 (ξ1 + ξ2 )2−p
(4.102)
for all ξ1 , ξ2 ∈RN . Therefore when 1 < p < 2, we use (4.102) in (4.100) and have
1 3p(p − 1) vDu − uDv2 1 + p dz = 0, p 16 u (vDu + uDv)2−p Z v ⇒ v(z)Du(z) − u(z)Dv(z) = 0 a.e. on Z, ⇒ u = kv
for some k > 0.
We conclude that the eigenspace corresponding to λ1 > 0 is one-dimensional; that is, λ1 > 0 is simple. In Proposition 4.3.42 we saw that the eigenfunctions corresponding to λ1 > 0 change sign. Next we show that these are the only eigenfunctions of
do not 1,p −p , W0 (Z), m with a fixed sign.
PROPOSITION 4.3.45 If u ∈ C_0^1(Z̄) is an eigenfunction of (−Δ_p, W_0^{1,p}(Z), m) corresponding to an eigenvalue λ > λ1, then u must change sign (i.e., u⁺ ≠ 0 and u⁻ ≠ 0) and we have
\[ \min\big\{ |Z^+|_N, |Z^-|_N \big\} \ge (\lambda\|m\|_\infty c)^\eta, \tag{4.103} \]
where Z⁺ = {u > 0}, Z⁻ = {u < 0}, c > 0 is a constant independent of the eigenelement (λ, u) and
\[ \eta = \begin{cases} -\dfrac{N}{p} & \text{if } p < N, \\ -1 & \text{if } p \ge N. \end{cases} \]
PROOF: Let u1 ∈ int C_0^1(Z)₊ be the normalized (i.e., ‖Du1‖_p = 1) eigenfunction corresponding to λ1 > 0. We argue indirectly. So suppose that u has constant sign on Z, say u ≥ 0 (the case u ≤ 0 is treated similarly). We may assume that ‖Du‖_p = 1. Note that by Proposition 4.3.44, we obtain
\[ 0 \le \Big\langle A(u_1), \frac{u_1^p - u^p}{u_1^{p-1}} \Big\rangle + \Big\langle A(u), \frac{u^p - u_1^p}{u^{p-1}} \Big\rangle = \int_Z (\lambda_1 - \lambda)m(u_1^p - u^p)\,dz = (\lambda_1 - \lambda)\Big( \frac{1}{\lambda_1} - \frac{1}{\lambda} \Big) < 0, \]
a contradiction.
Now we prove estimate (4.103). Note that A(u⁺) = λη′(u⁺), hence
\[ \|Du^+\|_p^p = \lambda\int_Z m|u^+|^p dz \le \lambda\|m\|_\infty \|u^+\|_{p^*}^p\, |Z^+|_N^{1-(p/p^*)} \tag{4.104} \]
(by Hölder's inequality). Similarly we obtain
\[ \|Du^-\|_p^p \le \lambda\|m\|_\infty \|u^-\|_{p^*}^p\, |Z^-|_N^{1-(p/p^*)}. \tag{4.105} \]
From the continuous embedding of W_0^{1,p}(Z) into L^{p^*}(Z) (Sobolev embedding theorem), we can find a constant c > 0 (independent of the eigenelement (λ, u)) such that
\[ \|u^+\|_{p^*}^p \le c\|Du^+\|_p^p \quad \text{and} \quad \|u^-\|_{p^*}^p \le c\|Du^-\|_p^p. \tag{4.106} \]
Combining (4.104) through (4.106) we obtain
\[ 1 \le \lambda\|m\|_\infty c\, |Z^\pm|_N^{1-(p/p^*)}, \]
hence
\[ (\lambda\|m\|_\infty c)^\eta \le \min\big\{ |Z^+|_N, |Z^-|_N \big\} \quad \text{with} \quad \eta = \begin{cases} -\dfrac{N}{p} & \text{if } p < N, \\ -1 & \text{if } p \ge N. \end{cases} \]
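For the reader's convenience we make explicit the elementary step behind the exponent η when p < N (this computation is only implicit in the text): since p* = Np/(N − p),
\[ 1 - \frac{p}{p^*} = 1 - \frac{N-p}{N} = \frac{p}{N}, \qquad \text{so } 1 \le \lambda\|m\|_\infty c\,|Z^\pm|_N^{p/N} \ \Longrightarrow\ |Z^\pm|_N \ge (\lambda\|m\|_\infty c)^{-N/p}. \]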
Using this proposition we show that λ1 > 0 is isolated.
PROPOSITION 4.3.46 The eigenvalue λ1 > 0 of (−Δ_p, W_0^{1,p}(Z), m) is isolated.
PROOF: Clearly from Proposition 4.3.42 (see also Remark 4.3.43), we have that λ1 > 0 is the smallest positive eigenvalue. Suppose we could find eigenvalues {λ_n} ⊆ R₊ with λ_n ≠ λ1 and λ_n ↓ λ1 as n → ∞. Then we can find {u_n} ⊆ W_0^{1,p}(Z) such that ‖Du_n‖_p = 1 and A(u_n) = λ_n η′(u_n), hence
\[ \|Du_n\|_p^p = \lambda_n\int_Z m|u_n|^p dz. \tag{4.107} \]
We may assume that
\[ u_n \xrightarrow{\ w\ } u \ \text{in } W_0^{1,p}(Z) \quad \text{and} \quad u_n \longrightarrow u \ \text{in } L^p(Z). \]
Passing to the limit as n → ∞ in (4.107) and using the weak lower semicontinuity of the norm functional in a Banach space, we obtain
\[ \|Du\|_p^p \le \lambda_1\int_Z m|u|^p dz, \quad \text{hence} \quad \|Du\|_p^p = \lambda_1\int_Z m|u|^p dz \ \text{(see Remark 4.3.43)}. \tag{4.108} \]
Note that
\[ \langle A(u_n), u_n - u\rangle = \lambda_n\int_Z m|u_n|^{p-2}u_n(u_n - u)\,dz \longrightarrow 0, \]
hence u_n → u in W_0^{1,p}(Z) (see Proposition 4.3.41). Therefore we infer that u = u1. We have u_n → u1 in W_0^{1,p}(Z) and by Egorov's theorem, given ε > 0, we can find a closed subset Z_ε ⊆ Z such that |Z_ε|_N ≥ |Z|_N − ε and u_n(z) → u1(z) uniformly on Z_ε. If Z_n^+ = {u_n > 0} and Z_n^- = {u_n < 0}, n ≥ 1, we have
\[ \liminf_{n\to\infty} |Z_n^+|_N \ge \liminf_{n\to\infty} |Z_n^+ \cap Z_\varepsilon|_N \ge |Z|_N - \varepsilon \quad (\text{recall } u_1 > 0). \tag{4.109} \]
On the other hand, from Proposition 4.3.45 we have
\[ \min\big\{ |Z_n^+|_N, |Z_n^-|_N \big\} \ge \xi > 0 \quad \text{for all } n \ge 1. \tag{4.110} \]
Because |Z_n^+|_N + |Z_n^-|_N ≤ |Z|_N, by choosing ε > 0 small and using (4.109) and (4.110) we reach a contradiction.
So, summarizing the situation with the beginning of the positive part of the spectrum of (−Δ_p, W_0^{1,p}(Z), m) (see problem (4.85)), we can state the following theorem.

THEOREM 4.3.47 Problem (4.85) has a smallest positive eigenvalue λ1 > 0. This eigenvalue is simple (i.e., the corresponding eigenspace is one-dimensional), isolated, and it has the variational characterization
\[ \lambda_1 = \inf\Big\{ \frac{\|Dx\|_p^p}{\int_Z m|x|^p dz} : x \in W_0^{1,p}(Z),\ x \ne 0 \Big\}; \tag{4.111} \]
this infimum is realized at the eigenfunction u1 ∈ int C_0^1(Z)₊ with ∫_Z m u_1^p dz = 1, and this is the only eigenfunction (up to scalar multiplication) that does not change sign.

REMARK 4.3.48 If in problem (4.85) we replace m by −m, we see that the above analysis gives a first negative eigenvalue λ_{−1} < 0, which is simple, isolated, and is the only negative eigenvalue with a positive eigenfunction. It is easy to check that the set of eigenvalues of (−Δ_p, W_0^{1,p}(Z), m) is closed in R.
What can be said about other eigenvalues and eigenfunctions of (−Δ_p, W_0^{1,p}(Z), m)? Here is where the Ljusternik–Schnirelmann theory from Section 4.2 enters the picture. So let ϕ : W_0^{1,p}(Z) → R be defined by
\[ \varphi(x) = \xi(x)^2 - \eta(x), \quad x \in W_0^{1,p}(Z). \]
Recall that
\[ \xi(x) = \frac{1}{p}\|Dx\|_p^p \quad \text{and} \quad \eta(x) = \frac{1}{p}\int_Z m|x|^p dz, \quad x \in W_0^{1,p}(Z). \]
Evidently ϕ ∈ C¹(W_0^{1,p}(Z)). Suppose x ≠ 0 is a critical point of ϕ (i.e., ϕ′(x) = 0) with critical value c (i.e., ϕ(x) = c). Then we have
\[ 0 = \varphi'(x) = 2\xi(x)\xi'(x) - \eta'(x), \quad \Rightarrow\quad 2\xi(x)\langle \xi'(x), x\rangle = \langle \eta'(x), x\rangle, \]
\[ \Rightarrow\quad 2\xi(x)\|Dx\|_p^p = \int_Z m|x|^p dz, \]
\[ \Rightarrow\quad 2p\xi(x)^2 - p\eta(x) = 0 \quad \Rightarrow\quad p\big(2\xi(x)^2 - \eta(x)\big) = 0 \quad \Rightarrow\quad p\big(\xi(x)^2 + c\big) = 0 \ \ (\text{i.e., } c = -\xi(x)^2 < 0). \]
Then we see that
\[ \xi(x) = \frac{1}{2\sqrt{-c}}\,\eta(x), \]
and so λ = 1/(2√−c) > 0 is an eigenvalue of (−Δ_p, W_0^{1,p}(Z), m) with associated eigenfunction x.
Conversely, if u ≠ 0 is an eigenfunction of (−Δ_p, W_0^{1,p}(Z), m) corresponding to an eigenvalue λ > 0, then v = (1/(2λξ(u)))^{1/p} u is also an eigenfunction corresponding to λ > 0. Exploiting the p-homogeneity of ξ, we see that λ = 1/(2ξ(v)). Because ϕ′(v) = 2ξ(v)ξ′(v) − η′(v), we see that v is a critical point of ϕ with corresponding critical value c = −1/(4λ²).
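The two directions are consistent, as the following elementary computation (added here for clarity) records:
\[ c = -\xi(v)^2 = -\frac{1}{4\lambda^2} \quad \Longleftrightarrow \quad \lambda = \frac{1}{2\sqrt{-c}} \qquad (c < 0,\ \lambda > 0). \]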
So the eigenelements of (−Δ_p, W_0^{1,p}(Z), m) are completely determined by those of the functional ϕ. Hence we may work with ϕ and apply to it the Ljusternik–Schnirelmann theory. To this end we introduce
\[ S_n^c = \big\{ K \subseteq W_0^{1,p}(Z) : K \text{ is compact, symmetric (i.e., } K = -K\text{) and } \gamma(K) \ge n \big\}, \quad n \ge 1. \tag{4.112} \]
Here γ is the genus introduced in Definition 4.2.13. Then we define
\[ c_n = \inf_{K \in S_n^c}\max_{x \in K}\varphi(x), \quad n \ge 1. \tag{4.113} \]
We proceed with the genus as our topological index because, as we explained in Section 4.2, it is in general easier to calculate than the LS-category. Also note that in the family (4.112), without any loss of generality we assume that the sets are compact (see Proposition 4.2.19 and Theorem 4.2.23).

LEMMA 4.3.49 The sequence {c_n}_{n≥1} defined by (4.113) consists of critical values of ϕ and
\[ -\infty < \inf_{W_0^{1,p}(Z)}\varphi = c_1 \le c_2 \le \cdots \le c_n \le \cdots < 0 = \varphi(0). \]
PROOF: First we show that ϕ is bounded below. Indeed we have
\[ \varphi(x) = \xi(x)^2 - \eta(x) = \frac{1}{p^2}\|Dx\|_p^{2p} - \frac{1}{p}\int_Z m|x|^p dz \ge \frac{1}{p^2}\|Dx\|_p^{2p} - \frac{1}{p\lambda_1}\|Dx\|_p^p \quad (\text{see (4.111)}), \]
hence ϕ is coercive and in particular bounded below.
Next we show that ϕ satisfies the PS-condition. To this end let {x_n}_{n≥1} ⊆ W_0^{1,p}(Z) be a sequence such that {ϕ(x_n)}_{n≥1} ⊆ R is bounded and ϕ′(x_n) → 0. Because ϕ is coercive, it follows that {x_n}_{n≥1} ⊆ W_0^{1,p}(Z) is bounded and so, by passing to a suitable subsequence if necessary, we may assume that
\[ x_n \xrightarrow{\ w\ } x \ \text{in } W_0^{1,p}(Z) \quad \text{and} \quad x_n \longrightarrow x \ \text{in } L^p(Z). \tag{4.114} \]
Then we have
\[ \langle \eta'(x_n), x_n - x\rangle = \int_Z m|x_n|^{p-2}x_n(x_n - x)\,dz \longrightarrow 0, \]
and because ϕ′(x_n) = 2ξ(x_n)A(x_n) − η′(x_n) → 0 in W^{-1,p'}(Z), it follows that
\[ \xi(x_n)\langle A(x_n), x_n - x\rangle \longrightarrow 0. \tag{4.115} \]
Because of (4.114), we may assume (at least for a subsequence) that ξ(x_n) → ϑ. If ϑ = 0, then x_n → x = 0 in W_0^{1,p}(Z) and we are done. Otherwise, from (4.115), we have ⟨A(x_n), x_n − x⟩ → 0, hence x_n → x in W_0^{1,p}(Z) (see Proposition 4.3.41).
This proves that ϕ satisfies the PS-condition.
Finally, for every n ≥ 1 we show that there exists K ⊆ W_0^{1,p}(Z) compact and symmetric with γ(K) = n such that sup_{x∈K} ϕ(x) < 0. Consider functions {y_k}_{k=1}^n ⊆ W_0^{1,p}(Z) such that
\[ \|y_k\|_p = 1, \quad \int_Z m|y_k|^p dz > 0 \quad \text{and} \quad \operatorname{supp}y_k \cap \operatorname{supp}y_m = \emptyset \ \text{ if } k \ne m,\ k, m \in \{1, \ldots, n\}. \]
Let Y_n = span{y_k}_{k=1}^n. Because Y_n is a finite-dimensional subspace of W_0^{1,p}(Z), all norms on it are equivalent and so we can find c > 0 such that
\[ c\,\xi(y) \le \eta(y) \le \frac{1}{c}\,\xi(y) \quad \text{for all } y \in Y_n \]
(by Poincaré's inequality). We consider the compact set
\[ K = \Big\{ y \in Y_n : \frac{c^2}{4} \le \eta(y) \le \frac{c^2}{3} \Big\}. \]
We have sup_{y∈K} ϕ(y) ≤ −(c²/8) < 0; indeed, for y ∈ K we have ξ(y) ≤ η(y)/c ≤ c/3, so ϕ(y) = ξ(y)² − η(y) ≤ c²/9 − c²/4 < −c²/8. Because Y_n is isomorphic to R^n, we can identify K with a ring (annulus) K_1 in R^n such that ∂B_1(0) = S^{n−1} = {y ∈ R^n : ‖y‖ = 1} ⊆ K_1 ⊆ R^n \ {0}. Then by virtue of Propositions 4.2.15 and 4.2.18, we have γ(K) = n.

REMARK 4.3.50 By virtue of Theorem 4.2.23(b), we see that ϕ has infinitely many critical points.

LEMMA 4.3.51 If {c_n}_{n≥1} ⊆ R₋ is defined by (4.113), then c_n → 0 as n → ∞.
PROOF: Let G = {x ∈ W_0^{1,p}(Z) : ϕ(x) ≤ 0}. Because ϕ is coercive (see the proof of Lemma 4.3.49), the set G ⊆ W_0^{1,p}(Z) is bounded. We show that, given ε > 0, we can find n_ε ≥ 1 such that for every K ∈ S_{n_ε}^c with K ⊆ G we have −ε ≤ sup_{x∈K} ϕ(x).
Exploiting the compact embedding of W_0^{1,p}(Z) into L^p(Z), given ϑ > 0 we can find a finite-dimensional subspace Y_ϑ ⊆ L^p(Z) and a continuous map i_ϑ : G → Y_ϑ such that
\[ \sup_{x \in G}\|x - i_\vartheta(x)\|_p \le \vartheta \quad (\text{see Theorem 3.1.10}). \]
Replacing i_ϑ by its odd part ½(i_ϑ(x) − i_ϑ(−x)) (which we still denote by i_ϑ), we clearly have
\[ \sup_{x \in G}\|x - i_\vartheta(x)\|_p \le \vartheta. \tag{4.116} \]
The set G is relatively compact in L^p(Z) (because of the compact embedding of W_0^{1,p}(Z) into L^p(Z)). Hence given ε > 0, we can find ϑ_ε > 0 such that
\[ |\eta(x) - \eta(i_{\vartheta_\varepsilon}(x))| \le \frac{\varepsilon}{2} \quad \text{for all } x \in G \quad (\text{see (4.116)}). \tag{4.117} \]
Let δ_ε > 0 be such that η(x) ≤ ε/2 if ‖x‖_p ≤ δ_ε. Then for x ∈ G with ‖i_{ϑ_ε}(x)‖_p ≤ δ_ε, we have
\[ \eta(x) \le |\eta(x) - \eta(i_{\vartheta_\varepsilon}(x))| + \eta(i_{\vartheta_\varepsilon}(x)) \le \varepsilon \quad (\text{see (4.117)}). \tag{4.118} \]
From (4.118) it follows that for every compact, symmetric K ⊆ G ∩ {x ∈ W_0^{1,p}(Z) : η(x) ≥ ε}, we have
\[ i_{\vartheta_\varepsilon}(K) \subseteq \{ y \in Y_{\vartheta_\varepsilon} : \|y\|_p \ge \delta_\varepsilon \}. \tag{4.119} \]
The set i_{ϑ_ε}(K) is compact and symmetric in L^p(Z) and so, if by γ we also denote the genus on L^p(Z), from (4.119) we have γ(i_{ϑ_ε}(K)) ≤ dim Y_{ϑ_ε}, hence γ(K) ≤ γ(i_{ϑ_ε}(K)) ≤ dim Y_{ϑ_ε}. Therefore for every compact and symmetric K ⊆ G satisfying γ(K) ≥ dim Y_{ϑ_ε} + 1, we can find x_0 ∈ K such that inf_{x∈K} η(x) ≤ η(x_0) < ε. Then, because ϕ(x) ≥ −η(x), we have
\[ \sup_{x \in K}\varphi(x) \ge -\inf_{x \in K}\eta(x) > -\varepsilon. \]
So combining Lemma 4.3.49 and Lemma 4.3.51, we have the following result concerning the spectrum of (−Δ_p, W_0^{1,p}(Z), m).

THEOREM 4.3.52 There exists a sequence {λ_n = 1/(2√−c_n)}_{n≥1} of positive eigenvalues of (−Δ_p, W_0^{1,p}(Z), m) such that λ_n → +∞.
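Indeed, the divergence of the eigenvalues is an immediate consequence of Lemma 4.3.51 (an elementary step we record explicitly here):
\[ c_n \longrightarrow 0^- \ \Longrightarrow\ \lambda_n = \frac{1}{2\sqrt{-c_n}} \longrightarrow +\infty. \]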
REMARK 4.3.53 So in addition to λ1 > 0 (see (4.91)) we have produced a whole sequence {λ_n}_{n≥1} of eigenvalues of (−Δ_p, W_0^{1,p}(Z), m) such that λ_n → +∞. If p = 2 (linear eigenvalue problem), then these are all the positive eigenvalues of (−Δ, H_0^1(Z), m) (see Theorem 4.3.13, which treats the case m ≡ 1). If p ≠ 2 (nonlinear eigenvalue problem), then we do not know whether this is the case. What we do know is that λ2 > λ1 (see Proposition 4.3.46). Because the set of eigenvalues of (−Δ_p, W_0^{1,p}(Z), m) is clearly closed, we have that
\[ \lambda_2^* = \inf\big\{ \lambda > 0 : \lambda \text{ is an eigenvalue of } (-\Delta_p, W_0^{1,p}(Z), m),\ \lambda \ne \lambda_1 \big\} \tag{4.120} \]
is the second positive eigenvalue of (−Δ_p, W_0^{1,p}(Z), m). Evidently λ2 ≥ λ2* and it is natural to ask whether λ2 = λ2*. In what follows we show that indeed this is the case. So the second eigenvalue and the second positive variational eigenvalue of (−Δ_p, W_0^{1,p}(Z), m) coincide.
Let (λ, u) ∈ R₊ × C_0^1(Z̄) be an eigenelement of (−Δ_p, W_0^{1,p}(Z), m) (see Proposition 4.3.39). We introduce the set Z(λ) = {z ∈ Z : u(z) = 0}. Let N(λ) be the number of components (maximal connected open sets) of the open set Z \ Z(λ). Generalizing Courant's nodal theorem, Anane–Tsouli [26] proved the following.

PROPOSITION 4.3.54 If (λ, u) ∈ R₊ × C_0^1(Z̄) is an eigenelement of (−Δ_p, W_0^{1,p}(Z), m), then λ_{N(λ)} ≤ λ.
COROLLARY 4.3.55 If λn < λn+1 , then N (λn ) ≤ n. PROOF: If N (λn ) > n, then λN (λn ) ≥ λn+1 . On the other hand Proposition 4.3.54 implies that λN (λn ) ≤ λn . Therefore it follows that λn = λn+1 , a contradiction to the hypothesis. Hence N (λn ) ≤ n. THEOREM 4.3.56 λ∗2 = λ2 . PROOF: Because λ1 is isolated (see Proposition 4.3.46) from (4.120) we see that λ∗2 > λ1 . If u ∈ C01 (Z) is an eigenfunction corresponding to λ∗2 > 0, u must change sign on Z (see Proposition 4.3.45). Hence N (λ∗2 ) ≥ 2 and so λ2 ≤ λN (λ∗2 ) . On the other hand from Proposition 4.3.54 we have that λN (λ∗2 ) ≤ λ∗2 and so from (4.120) we conclude that λ∗2 = λ2 . Next we investigate the dependence of λ1 and λ2 on the weight function m. Recall that m ∈ L∞ and |{m > 0}|N > 0. In what follows in order to stress the dependence of λ1 and λ2 on the weight function m, we write λ1 (m) and λ2 (m). The first proposition follows at once from (4.111). PROPOSITION 4.3.57 If m1 , m2 ∈L∞ (Z), m1 (z)≤m2 (z) a.e. on Z with strict inequality on a set of positive measure and |{m1 > 0}|N > 0, then λ1 (m2 ) < λ1 (m1 ). For the second eigenvalue we have the following result corresponding to the + monotone dependence on the weight m. In what follows Zm = {m > 0}.
PROPOSITION 4.3.58 If m1, m2 ∈ L^∞(Z), m1(z) ≤ m2(z) a.e. on Z, |Z_{m_1}^+|_N > 0, and m1(z) < m2(z) a.e. on Z_{m_1}^+, then λ2(m2) < λ2(m1).
PROOF: Let u2 be an eigenfunction corresponding to λ2(m1) > 0 of (−Δ_p, W_0^{1,p}(Z), m1). We know that u2 changes sign (see Proposition 4.3.45). So we can define
\[ v_1(z) = \begin{cases} \dfrac{u_2^+(z)}{\big(\tfrac{1}{p}\int_Z m_1|u_2^+|^p dz\big)^{1/p}} & \text{if } z \in \{u_2 > 0\}, \\ 0 & \text{otherwise}, \end{cases} \qquad v_2(z) = \begin{cases} \dfrac{u_2^-(z)}{\big(\tfrac{1}{p}\int_Z m_1|u_2^-|^p dz\big)^{1/p}} & \text{if } z \in \{u_2 < 0\}, \\ 0 & \text{otherwise}. \end{cases} \]
Evidently v1, v2 ∈ W_0^{1,p}(Z). Let Y2 = span{v1, v2} and K = {y ∈ Y2 : (1/p)∫_Z m1|y|^p dz = −2c2}, where c2 = −1/(4λ2²(m1)). Clearly K is compact and, since Y2 is isomorphic to R², K can be identified with the boundary of the unit ball in R². Therefore from Proposition 4.2.15 we have γ(K) = 2. Note that |{m2 > 0} ∩ {u2 > 0}|_N > 0 and |{m1 < 0} ∩ {u2 < 0}|_N > 0. So
\[ \int_{\{m_1>0\}\cap\{u_2>0\}} m_1|v_k|^p dz < \int_{\{m_1>0\}\cap\{u_2>0\}} m_2|v_k|^p dz \quad \text{for } k = 1, 2 \]
and
\[ \int_{\{m_1<0\}\cap\{u_2<0\}} m_1|v_k|^p dz < \int_{\{m_1<0\}\cap\{u_2<0\}} m_2|v_k|^p dz \quad \text{for } k = 1, 2. \]
It follows that
\[ \varphi_{m_1}(v) = \xi(v)^2 - \eta_{m_1}(v) = c_2(m_1) \quad \text{for all } v \in K, \]
\[ \Rightarrow\ \varphi_{m_2}(v) < \varphi_{m_1}(v) = c_2(m_1) \quad \text{for all } v \in K, \]
\[ \Rightarrow\ \max_{v \in K}\varphi_{m_2}(v) < c_2(m_1) \ \Rightarrow\ c_2(m_2) < c_2(m_1) \ \Rightarrow\ \lambda_2(m_2) < \lambda_2(m_1). \]
We can have another sequence of variational eigenvalues of (−Δ_p, W_0^{1,p}(Z)) (i.e., now we take m ≡ 1). We consider the set ∂B_1^{L^p(Z)} = {x ∈ L^p(Z) : ‖x‖_p = 1} and we set M = W_0^{1,p}(Z) ∩ ∂B_1^{L^p(Z)}. This is a C¹-manifold (if p ≥ 2 it is a C^{1,1}-manifold). For such manifolds we still have the deformation theorem (see Theorem 4.1.19). The result is due to Ghoussoub [262, p. 55], where the reader can find its proof.

THEOREM 4.3.59 If M is a complete, connected C¹-Finsler manifold (see Definition 4.2.24), ϕ ∈ C¹(M) satisfies the PS-condition, c ∈ R is a regular value of ϕ, and ε₀ > 0, then there exist 0 < ε < ε₀ and a continuous one-parameter family of homeomorphisms h : [0, 1] × M → M such that
(a) h(0, x) = x, and |ϕ(x) − c| ≥ ε₀ implies h(t, x) = x for all t ∈ [0, 1].
(b) For any x ∈ M, the map t ↦ ϕ(h(t, x)) is nonincreasing.
(c) h(1, ϕ^{c+ε}) ⊆ ϕ^{c−ε}.
(d) If ϕ is even, then h(t, ·) is odd for all t ∈ [0, 1].
We consider the functional ϕ : W_0^{1,p}(Z) → R defined by
\[ \varphi(x) = \|Dx\|_p^p \quad \text{for all } x \in W_0^{1,p}(Z). \]
We restrict ϕ to the C¹-manifold M; that is, we consider ϕ|_M. Then ϕ|_M ∈ C¹(M). Note that
\[ \varphi_M'(x) = p\big( A(x) - \varphi(x)|x|^{p-2}x \big) \quad \text{for all } x \in M. \]
c > 0, all n ≥ 1, and ϕM (xn ) = p A(xn ) − ϕ(xn )|xn |p−2 xn −→ 0 in W −1,p (Z). w Evidently {xn }n≥1 ⊆ W01,p (Z) is bounded and so we may assume that xn −→ x in W01,p (Z) and x in Lp (Z). Also we may assume that ϕ(xn ) −→ η in R. Note xn −→ p−2 that ϕ(xn ) Z |xn | xn (xn − x)dz −→ 0 and so A(xn ), xn − x −→ 0, which by virtue of Proposition 4.3.41 implies that xn −→ x in W01,p (Z). Let Dn = {C ⊆ M : C is compact, symmetric, and there exists h ∈ C(S n−1 , C), odd, surjective}, n ≥ 1 and S n−1 ={y ∈ Rn : y = 1}. We define µn = inf n sup ϕ(x), C∈D
n ≥ 1.
(4.121)
x∈C
Then µn ∈ R and we have the following counterpart of Theorem 4.3.52. THEOREM 4.3.61 For every integer n ≥ 1, µn is a critical value of ϕM . PROOF: We argue indirectly. So suppose that µn is a regular value of ϕM . We apply Theorem 4.3.59 with ε0 = 1 and c = µn . We can find 0 < ε < ε0 and a one-parameter family of homeomorphisms h(t, ·) : M −→ M, t ∈ [0, 1], which satisfy statements (a)→(d) of Theorem 4.3.59. From (4.121) we see that we can find C ∈ Dn such that sup ϕ(x) ≤ µk + ε. If ψ ∈ C(S n−1 , C) is odd and surjective, then x∈C
so is h 1, ψ(·) :S n−1 −→ h(1, C) (see Theorem 4.3.59(d)). Hence h(1, C) ∈ Dn and sup ϕ(x) ≤ µk − ε
(see Theorem 4.3.59(c)),
x∈h(1,C)
a contradiction.
Because of this proposition we have vn ∈ M such that
ϕM (vn ) = 0,
⇒ (µn , vn ) is an eigenelement of − p , W01,p (Z) . Note that Dn is a subfamily of the family of sets with genus n ≥ 1, we infer that
λ_n ≤ μ_n
for all n ≥ 1.
(4.122)
Clearly λ1 = µ1 and also λ2 = µ2 . To see this let u2 be a normalized eigenfunction corresponding to the eigenvalue λ2 . We know that u2 must change sign (see − + − Proposition 4.3.45). So u+ 2 = 0, u2 = 0. Set C = {ξu2 + ηu2 : ξ, η ∈ R and + − p p |ξ| u2 p + |η| u2 p = 1}. Clearly C ∈ D2 and for any u ∈ C, we have ϕ(u) = λ2 . Therefore from (4.121) it follows that µ2 ≤ λ2 . This combined with (4.122) implies that µ2 = λ2 . For n > 2 we do not know that if λn = µn . We return to the nonlinear eigenvalue problem (4.85) and we introduce the C 1 manifold
M0 = {x ∈ W01,p (Z) : m|x|p dz = 1} Z
and then the open set U in M0 , defined by U = {x ∈ M0 : Dxpp < λ2 }. Let u1 ∈ M0 ∩ int C01 (Z)+ be the eigenfunction corresponding to λ1 > 0. Evidently u1 , −u1 ∈ U . LEMMA 4.3.62 u1 and −u1 belong to different connected components of U . PROOF: We proceed by contradiction. So suppose that u1 and −u1 belong to the same connected component of U . Because such a component is also pathconnected, we can find a continuous curve γ : [−1, 1] −→ U joining u1 and −u1 . Set E = γ([−1, 1]) and C0 = E ∪ (−E). Clearly 0 ∈ / C0 , C0 is compact and symmetric, and γ(C0 ) > 1. Therefore C0 ∈ S2 and we have (recall the definition of U ), max Dxpp : x ∈ C0 < λ2 ⇒
inf max Dxpp < λ2 .
C∈S2 x∈C
Then by virtue of Theorem 4.3.52 we have λ2 < λ2 , a contradiction.
Let U + be the connected component of U containing u1 and U − the connected component of U containing −u1 . Evidently U + = −U − . Set E+ = {tU t : t > 0} E− = −E+ We have
and
E = E+ ∪ E− .
Dxpp < λ2
m|x|p dz
for all x ∈ E
(4.123)
m|x|p dz
for all x ∈ ∂E.
(4.124)
Z
and
Dxpp = λ2 Z
Here ∂E is the boundary of E in W01,p (Z). For r > 0, let (∂E)r = {x∈∂E : x = r}.
Let u2 be a normalized eigenfunction of −p , W01,p (Z), m corresponding to λ2 > 0. We set
and
Y1 = Ru1 , Y2 = span{u1 , u2 }
m|x|p dz . D = x ∈ W01,p (Z) : Dxpp = λ2 Z
Note that (4.124) implies that ∂E ⊆ D.
PROPOSITION 4.3.63 If u ∈ R+ u1 , u = 0, S =[−u, u]= v = t(−u) + (1 − t)u : t ∈ [0, 1] , and S0 = {−u, u}, then the pair {S0 , S} and D link in W01,p (Z) via γ ∗ = I (see Definition 4.1.20).
PROOF: Clearly S0 ∩D = ∅. Also let γ ∈ C S, W01,p (Z) be such that γ S = I S . 0 0 Note that E has two connected components E+ and E− =−E+ (see Lemma 4.3.62 and recall the definition of E). Then if for x = t(−u) + (1 − t)u ∈ S, t ∈ [0, 1], we set γ(t) = γ(x), then γ : [0, 1] −→ W01,p (Z) is a continuous curve joining −u and u and so we must have γ(S) ∩ ∂E = ∅. Because ∂E ⊆ D, we obtain that γ(S) ∩ D = ∅. This proves that the pair {S0 , S} and D link in W01,p (Z) via γ ∗ = I. 1/p
1/p
PROPOSITION 4.3.64 If 0 < r < R < ∞, u1 = (1/λ1 ) u1 , u2 = (1/λ2 ) u2 and C = u = tu1 + su2 : u ≤ R, s ≥ 0 , C0 = u = tu1 , u = |t| ≤ R ∪ u ∈ C : u = R , and Dr = u ∈ D : u = r , then {C0 , C} and Dr link in W01,p (Z) through γ ∗ = I. 1,p PROOF: Clearly C0 ∩ Dr = ∅. Let γ : C −→ W0 (Z) be a continuous map that γ C = I C . Let d1 = d(u1 , ∂K) and define the map k : W01,p (Z) −→ Y2 by 0
0
k(u) =
min d(u, ∂E), Rd1 u1 + (u − r)u2 − min d(u, ∂E), Rd1 u1 + (u − r)u2
if u ∈ / E− = −E+ . if u ∈ E−
Evidently k is continuous. If g = k ◦ γ ∈ C(C, Y2 ), then we can easily check that 0∈ / g(C0 ) and d(g, int C, 0) = 1 (the int C is taken in Y2 ). So we can find u ∈ C such that g(u) = k γ(u) = 0. This implies that γ(u) ∈ ∂E and also γ(u) = r. Hence γ(u) ∈ (∂E)r and γ(C) ∩ (∂E)r = ∅. Because (∂E)r ⊆ Dr , we have γ(C) ∩ Dr = ∅ and so the sets {C0 , C} and Dr link in W01,p (Z) through γ ∗ = I. Finally before turning our attention to the Neumann nonlinear eigenvalue problem for the negative p-Laplacian, let us mention what happens if in problem (4.85) the weight function m does not belong in L∞ (Z), but we have m ∈ Ls (Z) for some s > N/p if 1 < p ≤ N and m ∈ L1 (Z) if p > N . Details can be found in Cuesta [168]. In what follows we always assume m+ = 0. If this is the case with the weight function m, then for u ∈ W01,p (Z) an eigenfunction corresponding to an eigenvalue λ, we can still say that it belongs in L∞ (Z) but we no longer have that u ∈ C01 (Z). PROPOSITION 4.3.65 If m ∈ Ls (Z) for some s > N/p if 1 <
p ≤ N and m ∈ L1 (Z) if p > N and (λ, u) an eigenelement of −p , W01,p (Z), m , then u ∈ L∞ (Z) and u is locally H¨ older continuous; that is, there exists a ∈ (0, 1) which depends on for every Z ⊂⊂ Z, we can find c > 0, c depending on
(p, N, λ, ms ) such that p, N, λ, ms , d(Z , ∂Z) , such that |u(z) − u(z )| ≤ cu∞ z − z aRN
for all z, z ∈ Z .
The lack of global regularity of the eigenfunction u is a source of difficulties. Nevertheless we can still conduct an analysis of the eigenvalue problem and produce for λ2 > λ1 > 0 (the first two eigenvalues) results similar to the ones we had for the case when m ∈ L^∞(Z) with |{m > 0}|_N > 0.

THEOREM 4.3.66 If m ∈ L^s(Z) with s > N/p if 1 < p ≤ N and m ∈ L^1(Z) if p > N, then problem (4.85) has a smallest positive eigenvalue λ1 > 0, which is simple and isolated, and the corresponding eigenfunction does not change sign on Z; moreover
\[ \lambda_1 = \inf\Big\{ \frac{\|Du\|_p^p}{\int_Z m|u|^p dz} : u \in W_0^{1,p}(Z),\ u \ne 0 \Big\} \]
and the infimum is attained at the normalized principal eigenfunction u1 ∈ L^∞(Z) ∩ C_{loc}^{1,γ}(Z) for some γ ∈ (0, 1).
In deriving Theorem 4.3.66 the analysis differs from that of the m ∈ L^∞(Z) case in two respects. First, in the proof of Proposition 4.3.44, we used the function
\[ J(u, v) = \Big\langle A(u), \frac{u^p - v^p}{u^{p-1}} \Big\rangle + \Big\langle A(v), \frac{v^p - u^p}{v^{p-1}} \Big\rangle \tag{4.125} \]
for all (u, v) ∈ D(J) = {(u, v) ∈ W_0^{1,p}(Z) × W_0^{1,p}(Z) : u, v ≥ 0, (u/v), (v/u) ∈ L^∞(Z)}. We exploited the fact that
\[ J(u, v) \ge 0 \ \text{ for all } (u, v) \in D(J) \quad \text{and} \quad J(u, v) = 0 \text{ if and only if } u = \beta v,\ \beta \in \mathbb{R}. \tag{4.126} \]
This is sometimes called in the literature the Díaz–Saá inequality. Note that if u, v are eigenfunctions corresponding to λ1 > 0, u, v ≥ 0, then (u, v) ∈ D(J) (see Proposition 4.3.39). This is no longer available in the present m ∈ L^s(Z) setting due to the lack of global regularity of the eigenfunctions. Then (4.126) is replaced by Picone's identity, proved by Allegretto–Huang [11].

THEOREM 4.3.67 If u, v ∈ C(Z̄) are differentiable a.e. on Z, u ≥ 0, v > 0, and
\[ L(u, v) = \|Du\|^p + (p-1)\frac{u^p}{v^p}\|Dv\|^p - p\frac{u^{p-1}}{v^{p-1}}\|Dv\|^{p-2}(Dv, Du)_{\mathbb{R}^N}, \]
\[ R(u, v) = \|Du\|^p - \|Dv\|^{p-2}\Big( D\Big(\frac{u^p}{v^{p-1}}\Big), Dv \Big)_{\mathbb{R}^N}, \]
then
(a) L(u, v) = R(u, v);
(b) L(u, v) ≥ 0 a.e. on Z;
(c) L(u, v) = 0 a.e. on Z if and only if u = kv for some k ∈ R.
Second, in obtaining Proposition 4.3.39, which justified the use of the function J(u, v) (see (4.125)) and of (4.126), we used the nonlinear strong maximum principle (see Theorem 4.3.37). This result is not available in the m ∈ L^s(Z) case due to the lack of global regularity of the eigenfunctions. So the nonlinear strong maximum principle (Theorem 4.3.37) is replaced by the following nonlinear Harnack inequality, due to Serrin [548].
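To make Picone's identity concrete, here is the p = 2 special case (a standard illustration, added for orientation; it is not part of the original text):
\[ L(u, v) = \|Du\|^2 + \frac{u^2}{v^2}\|Dv\|^2 - 2\frac{u}{v}(Dv, Du)_{\mathbb{R}^N} = \Big\| Du - \frac{u}{v}Dv \Big\|^2 \ge 0, \]
with equality precisely when Du = (u/v)Dv, i.e., when u/v is constant, which makes properties (b) and (c) transparent in this case.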
THEOREM 4.3.68 If u ∈ W_0^{1,p}(Z) ∩ L^∞(Z) is a nonnegative eigenfunction of (−Δ_p, W_0^{1,p}(Z), m) and for some z0 ∈ Z and r > 0 we have B_{3r}(z0) ⊆ Z, then for some c > 0 depending on (p, N, r, λ, m, Z) we have
\[ \max_{\overline{B}_r(z_0)} u \le c\,\min_{\overline{B}_r(z_0)} u. \tag{4.127} \]
REMARK 4.3.69 Harnack's inequality (see (4.127)) says that the values of nonnegative eigenfunctions of (−Δ_p, W_0^{1,p}(Z), m) are comparable, at least in any subregion away from the boundary ∂Z. Also, as before, for the second eigenvalue λ2 > 0 we have the following.

THEOREM 4.3.70 If m ∈ L^s(Z) with s > N/p if 1 < p ≤ N and m ∈ L^1(Z) if p > N, then λ2* = λ2.
There is an alternative variational characterization of λ2 > 0 due to Cuesta–De Figueiredo–Gossez [167].

THEOREM 4.3.71 If S = W_0^{1,p}(Z) ∩ ∂B_1^{L^p(Z)}, where B̄_1^{L^p(Z)} = {x ∈ L^p(Z) : ‖x‖_p ≤ 1}, and Γ0 = {γ0 ∈ C([−1, 1], S) : γ0(−1) = −u1, γ0(1) = u1}, then
\[ \lambda_2 = \inf_{\gamma_0 \in \Gamma_0}\max\big\{ \|Dx\|_p^p : x \in \gamma_0([-1, 1]) \big\}. \]
Now we turn our attention to the Neumann nonlinear eigenvalue problem:
\[ \begin{cases} -\operatorname{div}\big( \|Dx(z)\|^{p-2}Dx(z) \big) = \lambda|x(z)|^{p-2}x(z) & \text{a.e. on } Z, \\ \dfrac{\partial x}{\partial n_p} = 0 & \text{on } \partial Z,\ \lambda \in \mathbb{R}. \end{cases} \tag{4.128} \]
Here for x ∈ C¹(Z̄), (∂x/∂n_p)(z) = ‖Dx(z)‖^{p−2}(Dx(z), n(z))_{R^N}, z ∈ ∂Z, where n(z) is the outward unit normal on ∂Z. First we recall a nonlinear Green's formula, which is a useful tool in the analysis of problem (4.128). So consider the following space,
\[ V^q(Z, \operatorname{div}) = \big\{ v \in L^q(Z, \mathbb{R}^N) : \operatorname{div}v \in L^q(Z) \big\}, \quad 1 < q < +\infty. \]
On V^q(Z, div) we consider the norm
\[ \|v\|_{V^q(Z,\operatorname{div})} = \big( \|v\|_{L^q(Z,\mathbb{R}^N)}^q + \|\operatorname{div}v\|_q^q \big)^{1/q}. \]
Furnished with this norm, V^q(Z, div) becomes a separable, reflexive Banach space in which C^∞(Z̄, R^N) is dense. The next theorem extends to a nonlinear setting the classical Green's identity. For its proof we refer to Casas–Fernandez [132] and Kenmochi [346].

THEOREM 4.3.72 If 1 < p < ∞ and (1/p) + (1/p′) = 1, then there exists a unique bounded linear operator
\[ \gamma_n : V^{p'}(Z, \operatorname{div}) \longrightarrow W^{-(1/p'),p'}(\partial Z) = W^{(1/p'),p}(\partial Z)^* \]
such that γ_n(v) = (v, n)_{R^N} for all v ∈ C^∞(Z̄, R^N) and
\[ \int_Z u\operatorname{div}v\,dz + \int_Z (Du, v)_{\mathbb{R}^N}\,dz = \langle \gamma_0(u), \gamma_n(v)\rangle_{\partial Z} \]
for all (u, v) ∈ W^{1,p}(Z) × V^{p′}(Z, div), where γ0 is the usual trace map on W^{1,p}(Z) and ⟨·,·⟩_{∂Z} denotes the duality brackets for the pair (W^{(1/p′),p}(∂Z), W^{-(1/p′),p′}(∂Z)).
∂u (p u)vdz + Dup−2 (Du, Dv)RN dz = , γ0 (v) . ∂np Z Z ∂Z
Clearly λ = 0 is an eigenvalue of problem (4.128) i.e., of −p , W 1,p (Z) , with the constant functions as eigenfunctions (i.e., the corresponding eigenspace is R). In fact we can say more.
PROPOSITION 4.3.74 λ = 0 is the first eigenvalue of −p , W 1,p (Z) and it is isolated and simple. PROOF: First note that problem (4.128) cannot have negative eigenvalues. Indeed, if λ < 0 is an eigenvalue for (4.128) with corresponding eigenfunction u ∈ W 1,p (Z), then Dupp = λupp (see Theorem 4.3.73), which is a contradiction because up > 0. The simplicity of λ = 0 is a direct consequence of the fact that 0 = inf
Dxp p
xpp
: x ∈ W 1,p (Z), x = 0 .
Finally suppose that λ = 0 is not isolated. We can find a sequence {λn }n≥1 of nonzero eigenvalues such that λn ↓ 0. Let {un }n≥1 ⊆ W 1,p (Z) be a sequence of corresponding eigenfunctions. We know that un ∈ C 1 (Z) for all n ≥ 1 (see Remark 4.3.36). We may assume that un p = 1. Hence λn = Dun pp −→ 0 ⇒ {un }n≥1 ⊆ W
1,p
as n → ∞,
(Z) is bounded.
So we may assume that w
un −→ u
in
W 1,p (Z)
and
We have up = 1 and Dup = 0. So
un −→ u
in
Lp (Z).
4.3 Spectrum of the Laplacian and of the p-Laplacian u=±
1 1/p
|Z|N
.
Using as test function v = 1 in (4.128), we obtain
|un |p−2 un dz = 0. Z
Passing to the limit as n → ∞, we have
|u|p−2 udz = 0, Z
which contradicts (4.129).
Next we characterize the first nonzero eigenvalue of −p , W01,p (Z) . Suppose that λ > 0 is an eigenvalue of (4.128) with corresponding eigenfunction u ∈ C 1 (Z). Integrate (4.128) over Z and then apply Theorem 4.3.73 (the nonlinear second Green’s identity). We obtain
|u|p−2 udz = 0. Z
So we are naturally led to the consideration of the following nonempty, pointed, closed, and symmetric cone
|x(z)|p−2 x(z)dz = 0 . (4.130) C(p) = x ∈ W 1,p (Z) : Z L (Z)
We consider the set C1 (p) = C(p) ∩ ∂B1 p ; that is,
C1 (p) = x ∈ W 1,p (Z) : |x(z)|p−2 x(z)dz = 0, xp = 1 .
(4.131)
Z
Let ϑ_p : W^{1,p}(Z) → R be the strictly convex C¹-map defined by ϑ_p(x) = ‖Dx‖_p^p, x ∈ W^{1,p}(Z). We consider the following minimization problem:
\[ \lambda_1(p) = \inf\big\{ \vartheta_p(x) : x \in C_1(p) \big\}. \tag{4.132} \]
PROPOSITION 4.3.75 Problem (4.132) has a value λ1 = λ1(p) > 0 that is attained on C1(p).
PROOF: Let {x_n}_{n≥1} ⊆ C_1(p) be a minimizing sequence for problem (4.132) (i.e., ϑ_p(x_n) ↓ λ1). Evidently {x_n}_{n≥1} ⊆ W^{1,p}(Z) is bounded and so we may assume that
\[ x_n \xrightarrow{\ w\ } x \ \text{in } W^{1,p}(Z), \quad x_n \longrightarrow x \ \text{in } L^p(Z), \quad x_n(z) \longrightarrow x(z) \ \text{a.e. on } Z, \]
and |x_n(z)| ≤ k(z) a.e. on Z, for all n ≥ 1, with k ∈ L^p(Z).
Using these relations and the Lebesgue dominated convergence theorem, in the limit as n → ∞ we obtain
\[ \int_Z |x(z)|^{p-2}x(z)\,dz = 0 \quad \text{and} \quad \|x\|_p = 1 \quad (\text{i.e., } x \in C_1(p);\ \text{see (4.131)}). \]
Also, exploiting the weak lower semicontinuity of the norm functional in a Banach space, we have ‖Dx‖_p^p ≤ λ1, hence ‖Dx‖_p^p = λ1. Because x ∈ C1(p), it follows that x is not constant and so λ1 > 0.
An interesting consequence of Proposition 4.3.75 is the following Poincaré–Wirtinger type inequality.

COROLLARY 4.3.76 If x ∈ C(p) (see (4.130)), then λ1‖x‖_p^p ≤ ‖Dx‖_p^p.
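For p = 2 (a classical special case, recorded here only for orientation) the cone C(2) consists of the zero-mean functions and Corollary 4.3.76 becomes the usual Poincaré–Wirtinger inequality,
\[ \int_Z x\,dz = 0 \ \Longrightarrow\ \lambda_1\int_Z x^2\,dz \le \int_Z \|Dx\|^2\,dz, \]
with λ1 the first nonzero Neumann eigenvalue of −Δ on Z.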
In fact for p ≥ 2 we show that λ1 > 0 is the first nonzero eigenvalue of (−Δ_p, W^{1,p}(Z)).

PROPOSITION 4.3.77 If p ≥ 2, then λ1 > 0 is the first nonzero eigenvalue of (−Δ_p, W^{1,p}(Z)).
PROOF: Let x ∈ C1(p) be a solution of problem (4.132) (see Proposition 4.3.75). By the Lagrange multiplier rule (see Theorem 2.2.10), we can find a, b, c ∈ R, not all of them equal to zero, such that for all y ∈ W^{1,p}(Z) we have
\[ ap\int_Z \|Dx\|^{p-2}(Dx, Dy)_{\mathbb{R}^N}\,dz + bp\int_Z |x|^{p-2}xy\,dz + c(p-1)\int_Z |x|^{p-2}y\,dz = 0. \tag{4.133} \]
Let y = c and recall that, because x ∈ C1(p), we have ∫_Z |x|^{p−2}x\,dz = 0 (see (4.131)). So
\[ c^2(p-1)\int_Z |x|^{p-2}\,dz = 0, \]
hence c = 0. Using this in (4.133), we obtain
\[ ap\int_Z \|Dx\|^{p-2}(Dx, Dy)_{\mathbb{R}^N}\,dz + bp\int_Z |x|^{p-2}xy\,dz = 0 \quad \text{for all } y \in W^{1,p}(Z). \]
If a = 0, then
\[ bp\int_Z |x|^{p-2}xy\,dz = 0 \quad \text{for all } y \in W^{1,p}(Z). \]
Select y = x ∈ W^{1,p}(Z). We have bp‖x‖_p^p = 0 (i.e., b = 0), a contradiction, because we cannot have a = b = c = 0. Hence a ≠ 0 and so without any loss of generality we may assume that a = 1. Then we have
\[ \int_Z \|Dx\|^{p-2}(Dx, Dy)_{\mathbb{R}^N}\,dz + b\int_Z |x|^{p-2}xy\,dz = 0 \quad \text{for all } y \in W^{1,p}(Z). \]
Once again we use y = x ∈ W^{1,p}(Z). Then
\[ \|Dx\|_p^p + b\|x\|_p^p = 0, \]
b = −λ1 because x ∈ C1 (p) is a solution of (4.132).
Via Theorem 4.3.73, we establish that x solves (4.128) with λ = λ1. Clearly from the definition of λ1 > 0, we see that we cannot find an eigenvalue of (−Δ_p, W^{1,p}(Z)) in (0, λ1).

REMARK 4.3.78 The eigenfunctions of −Δ (Dirichlet or Neumann) have the so-called unique continuation property, namely if u is an eigenfunction that vanishes on a set of positive Lebesgue measure, then u ≡ 0. In fact, because the eigenfunctions belong to C(Z̄) (recall that we have assumed that Z ⊆ R^N has a C²-boundary ∂Z), the unique continuation property can be equivalently restated as follows: if u is a nontrivial eigenfunction, then the set {z ∈ Z : u(z) = 0} has empty interior. The existence of such a property is an open problem for p ≠ 2. There is a counterexample due to Martio [409], which suggests that the unique continuation property is difficult to extend to the p-Laplacian.
Let us briefly mention the situation with the eigenvalue problems for the scalar ordinary p-Laplacian (i.e., N = 1). For details see Drabek–Manasevich [205] and Gasiński–Papageorgiou [259, Section 6.3]. We start with the Dirichlet eigenvalue problem:
\[ \begin{cases} -\big( |x'(t)|^{p-2}x'(t) \big)' = \lambda|x(t)|^{p-2}x(t) & \text{a.e. on } T = [0, b], \\ x(0) = x(b) = 0, & \lambda \in \mathbb{R},\ 1 < p < \infty. \end{cases} \tag{4.134} \]
Following the p = 2 case (linear eigenvalue problem), we set
\[ \pi_p = 2\int_0^{(p-1)^{1/p}} \frac{ds}{\big( 1 - \frac{s^p}{p-1} \big)^{1/p}} = \frac{2\pi(p-1)^{1/p}}{p\sin\frac{\pi}{p}}. \]
Note that π₂ = π. Also define the function sin_p : R → R by
\[ \int_0^{\sin_p t} \frac{ds}{\big( 1 - \frac{s^p}{p-1} \big)^{1/p}} = t \quad \text{for all } t \in \big[ 0, \pi_p/2 \big] \]
and then extend sin_p(·) to all of R in a similar way as for sin(·). We produce a 2π_p-periodic function.

THEOREM 4.3.79 Problem (4.134) has a sequence of eigenelements {(λ_n^D, u_n^D)}_{n≥1}, where
\[ \lambda_n^D = \Big( \frac{n\pi_p}{b} \Big)^p \quad \text{and} \quad u_n^D(t) = \mu\sin_p\Big( \frac{n\pi_p}{b}t \Big), \quad \mu \in \mathbb{R},\ n \ge 1. \]
These are all the eigenvalues of the scalar ordinary p-Laplacian with Dirichlet boundary conditions.
Next we consider the Neumann eigenvalue problem:
\[ \begin{cases} -\big( |x'(t)|^{p-2}x'(t) \big)' = \lambda|x(t)|^{p-2}x(t) & \text{a.e. on } T = [0, b], \\ x'(0) = x'(b) = 0, & \lambda \in \mathbb{R},\ 1 < p < \infty. \end{cases} \tag{4.135} \]
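A hedged p = 2 cross-check on Theorem 4.3.79 (not part of the original statement): since π₂ = π and sin₂ = sin, the Dirichlet eigenelements of (4.134) reduce to the classical spectrum of −x″ on [0, b],
\[ \lambda_n^D\big|_{p=2} = \Big( \frac{n\pi}{b} \Big)^2, \qquad u_n^D(t)\big|_{p=2} = \mu\sin\Big( \frac{n\pi}{b}t \Big), \quad n \ge 1. \]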
THEOREM N N 4.3.80 Problem (λn , un ) n≥0 , where N D λN 0 = 0, λn = λn
and
(4.135)
has
a
sequence
of
eigenelements
b N D
uN , 0 (t) = c ∈ R \ {0}, un (t) = un t − 2n
for all t ∈ T, n ≥ 1. These are all the eigenvalues of the scalar ordinary p-Laplacian with Neumann boundary conditions. Finally we deal with the periodic eigenvalue problem: ⎧ ⎫ ⎨ − |x (t)|p−2 x (t) = λ|x(t)|p−2 x(t) a.e. on T = [0, b] ⎬ ⎩
x(0) = x(b), x (0) = x (b) = 0, λ ∈ R, 1 < p < ∞
THEOREM 4.3.81 Problem P P (λn , un ) n≥0 , where P D λP 0 = 0, λn = λ2n
and
(4.136)
has
a
D uP n (t) = u2n (t)
sequence
⎭ of
.
(4.136)
eigenelements
for all t ∈ T, n ≥ 1.
These are all the eigenvalues of the scalar ordinary p-Laplacian with periodic boundary conditions. REMARK 4.3.82 The situation changes if instead of the scalar ordinary p
Laplacian, we consider the vector ordinary p-Laplacian; that is, − x (t)p−2 x (t)
with x ∈ W 1,p (0, b), RN . In this case for the Dirichlet problem, Theorem 4.3.79 remains unchanged, but for the periodic problem the spectrum contains p more than the sequence λP n(2π/b) . The n n≥0 , for example, the infinite sequence n≥1 spectrum of the vectorial ordinary p-Laplacian with periodic boundary conditions is far from being understood. We also mention the following sharp Poincar´e–Wirtinger inequality for the scalar ordinary p-Laplacian. 1,p (0, b) = {u ∈ W 1,p (0, b) : u(0) = u(b)} and THEOREM 4.3.83 If u ∈ Wper b p u(t)dt = 0, then λ1 upp ≤ u pp . 0 1,p (0, b) the evaluations at t = 0 and REMARK 4.3.84 In the definition of Wper 1,p t = b make sense because W (0, b) is embedded continuously (in fact compactly) in C[0, b].
4.4 Abstract Eigenvalue Problems We have seen in the previous section in the context of eigenvalue problems involving the partial p-Laplacian differential operator, that the Ljusternik–Schnirelmann theory (see Section 4.2), can be carried over to C 1 -manifolds modelled on a reflexive Banach space. In this section we go beyond the p-Laplacian, consider general
4.4 Abstract Eigenvalue Problems
nonlinear eigenvalue problems and see how the Ljusternik–Schnirelmann theory can be used to study them. So let X be a separable reflexive Banach space. By the Troyanski renorming theorem we assume without any loss of generality that both X and X ∗ are locally uniformly convex. By ·, · we denote the duality brackets for (X, X ∗ ). Let ϕ, ψ : X −→ R be two C 1 -functions. For fixed β > 0, we consider the nontrivial solutions of the equation (eigenvalue problem) ϕ (x) = λψ (x)
with ϕ(x) = β.
(4.137)
The solutions of (4.137) are contained in the set of critical points of the functional ψ on the surface Mβ = {x ∈ X : ϕ(x) = β}. We assume that β is a regular value of ϕ and so Mβ is a C 1 -manifold of codimension 1 modelled on the Banach space X. Now we introduce the precise hypotheses on the functionals ϕ and ψ. H: (i) ϕ, ψ ∈ C 1 (X) are even (hence ϕ , ψ are odd potential operators), ϕ(0) = ψ(0) = 0, and ϕ (0) = 0. (ii) ϕ is weakly coercive; that is, ϕ(x) −→ +∞ as x −→ ∞. (iii) For every x = 0, the function s −→ ϕ (sx), x is strictly increasing on R+ . w (iv) If xn −→ x in X, ϕ (xn ) −→ y in X ∗ , then xn −→ x in X and ϕ is bounded. (v) For every x ∈ X, ψ (x) ∈ X ∗ is compact and ψ (x) = 0 if and only if x = 0. We have the following existence result concerning problem (4.137). THEOREM 4.4.1 If hypotheses (H) hold, then problem (4.137) has a sequence of solutions (λn , xn ) ∈ R × Mβ such that |λn | −→ +∞
and
w
xn −→ 0
in X.
PROOF: Note that hypotheses (H)(i) and (ii) imply that Mβ = ∅ and it is bounded. Also for x ∈ Mβ we have
1 ϕ (sx), x ds = ϕ(x) = β > 0, ϕ (x), x > 0
⇒ ϕ (x)
X∗
>
β > 0. sup x
(4.138)
x∈Mβ
Therefore β is a regular value of ϕ and so Mβ is a C 1 -manifold of codimension 1 modelled on the Banach space X. The manifold Mβ is star-shaped with respect to the origin, because every ray through the origin {rx ∈ X : r ∈ R, x = 1}, intersects Mβ at precisely two points ∞ ±r(x)x (see (4.138)). Therefore there is a bijection from P (X) = ∂B (0) Z2 (see 1 Example 4.2.12(b)) onto Mβ Z2 defined by ϑ(x) = r(x)x
for all x ∈ P ∞ (X).
Note that for x ∈ Mβ , we have d ϕ(sx) = ϕ (x), x > 0 ds
for all s > 0
(see hypothesis (H)(iii)).
(4.139)
So the implicit function theorem (see Theorem 1.1.23) implies that x −→ r(x) is a differentiable function, consequently x −→ ϑ(x) is continuous (see (4.139)). So we have proved that Mβ Z2 (i.e., the manifold Mβ with the antipodal points identified), is homeomorphic to the projective space P ∞ (X) and the homeomorphism is ϑ (see (4.139)). Now we produce deformations of subsets of Mβ Z2 analogous to the pseudogradient deformations (see Theorem 4.1.19). So consider the duality map F ∗ : X ∗ −→ X of X ∗ . Because X and X ∗ are both locally uniformly convex, we have that F ∗ is locally Lipschitz. We consider the following Cauchy problem in X. ⎧ dx(t) ⎫
⎨ dt = v + ξ(x, v)F ∗ ϕ (x) ⎬ , (4.140) ⎩ ⎭ x(0) = x0 ∈ Mβ where v ∈ X and ξ(x, v) ∈ R are chosen so that (a) x(t; x0 ) ∈ Mβ for all t ≥ 0 (by x(·; x0 ) we denote the unique solution of (4.140)). (b) t −→ ψ x(t; t0 ) is decreasing. Note that, if x(t)
= x(t; x0 ) is the unique solution of (4.140),
then if we want (a) to be true, then ϕ x(t) = β, hence (d/dt) ϕ x(t) = ϕ x(t) , x (t) = 0 and this by (4.140) leads to the following determination of ξ(x, v).
ϕ (x), v + ξ(x, v)F ∗ ϕ (x) = 0, ⇒ ξ(x, v) = −
ϕ (x), v . ϕ (x)2X ∗
(4.141)
We also determine v ∈ X. We have
t
ψ x(s) , v + ξ x(s), v F ∗ ϕ x(s) ds ψ x(t) − ψ(x0 ) = 0 7 8
t ψ x(s) , F ∗ ϕ (x(s)
ϕ x(s) , v ds = ψ x(s) − ϕ x(s) 2X ∗ 0 (see (4.141)). We set
ψ x(s) , F ∗ ϕ (x(s)
ψ x(s) − = D ψ M x(s) , β ϕ x(s) 2X ∗
the gradient of ϕM . So if v = −D ψ M x(s) , we have
β
ϕ x(t) − ϕ(x0 ) = −
0
We define
β
t
D ψ
Mβ
2 x(s) X ∗ ds.
9, cat (C) ≥ n . Sn = C : C ⊆ Mβ Z2 = M M
9 " P ∞ (X), we have that {Sn }n≥1 is a strictly decreasing family. We Because M define − c+ n = inf sup ψ(x) and cn = sup inf ψ(x), n ≥ 1. C∈Sn x∈C C∈Sn x∈C − We show that {c+ } , {c } are critical values of ψ M . To this end we need n n≥1 n n≥1 β to verify the P S-condition. So we show that
if c = 0, ε > 0 is sufficiently small, xn ∈ {c − ε ≤ ψ ≤ c + ε} ∩ Mβ and D ψ M (xn ) −→ 0, then {xn }n≥1 has a strongly convergent subsequence. (4.142) β
w
Because Mβ is bounded in X, we may assume that xn −→ x in X. Also because of hypothesis (H)(iv), we have that {ϕ (xn )}n≥1 ⊆ X ∗ is bounded and so we may w assume that ϕ (xn ) −→ u∗ in X ∗ as n → ∞. Finally recall that ϕ (xn )X ∗ ≥ η1 > 0 for all n ≥ 1 (see (4.138)). From the choice of the sequence {xn }n≥1 , we have
ψ (xn ), F ∗ ϕ (xn ) D ψ M (xn ) = ψ (xn ) − ϕ (xn ) −→ 0 as n → ∞. (4.143) β ϕ (xn )2X ∗ Because of hypothesis (H)(v), we have that ψ (xn ) −→ v ∗ in X ∗ and so
ψ (xn ), F ∗ ϕ (xn ) ϕ (xn ) −→ v ∗ in X ∗ as n → ∞.
We may assume ψ (xn ), F ∗ ϕ (xn ) −→ a. Note that a = 0 or otherwise from (4.143) we have ψ (xn ) −→ 0 in X ∗ as n → ∞ ⇒ ψ (x) = 0 and so x = 0
(see hypothesis (H)(v)).
(4.144)
Using hypothesis (H)(v) and the mean value theorem, we can easily check that the set {c − ε ≤ ψ ≤ c + ε} is weakly closed and for ε > 0 small does not contain the origin. Since x ∈ {c − ε ≤ ψ ≤ c + ε} from (4.144) we have a contradiction. This proves that (4.142) holds; that is, ψ M satisfies the P S–condition. β Now we prove that each c+ n = 0 is a critical value of ψ M . Because of the P Sβ condition, the set of critical points of ψ M is closed. So if c+ n is not a critical value β + of ψ M , we can find ε > 0 small such that {c+ n − ε ≤ ψ ≤ cn + ε} ∩ Mβ contains β no critical points of ψ M . Consequently, as in the proof of Theorem 4.1.19(e), we β
can find a deformation h(t, x) and a set C ∈ Sn such that sup ψ(x) ≤ c+ n + ε and x∈A
+ sup ϕ(x) ≤ cn − ε. But catM h(1, C) ≥ n and so x∈h(1,C)
+ c+ n ≤ sup ψ(x) ≤ cn − ε, x∈h(1,C)
a contradiction. Similarly, using deformations so that t −→ ψ x(t) is increasing, we establish that c− n , n ≥ 1, are critical values of ψ M . β
− Let xn be a critical point with critical value c+ n or cn , n ≥ 1. We have
ψ (xn ), F ∗ ϕ (xn ) ψ (xn ) = ϕ (xn ) ϕ (xn )2X ∗
ψ (xn ), F ∗ ϕ (xn ) = 0 for all n ≥ 1. ϕ (xn )2X ∗
and
So finally we can say ϕ (xn ) = λn ψ (xn ) and
w
|λn | −→ +∞, xn −→ 0
in X
as n → ∞.
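As an illustration (our own, offered as a plausible model case rather than as part of the text), hypotheses (H) are modeled on the weighted p-Laplacian eigenvalue problem of Section 4.3: on X = W_0^{1,p}(Z), 1 < p < ∞, with a weight m ∈ L^∞(Z), m > 0 a.e. on Z, one may take
\[ \varphi(x) = \frac{1}{p}\|Dx\|_p^p, \qquad \psi(x) = \frac{1}{p}\int_Z m|x|^p\,dz. \]
Indeed ⟨ϕ′(sx), x⟩ = s^{p−1}‖Dx‖_p^p is strictly increasing in s > 0, ϕ is weakly coercive, ϕ′ = A has the convergence property of Proposition 4.3.41, and ψ′ is compact by the Sobolev embedding; with these choices equation (4.137) is (up to normalization) the eigenvalue problem (4.85).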
REMARK 4.4.2 Because of hypotheses (H)(ii) and (iv), we see that Mβ ⊆ X is compact. If we drop the evenness hypothesis on the functionals ϕ, ψ, we have the following. COROLLARY 4.4.3 If hypotheses (H) (without the evenness of ϕ and ψ) hold ± and we can find x± 0 ∈ Mβ such that ±ψ(x0 ) > 0, then problem (4.137) has a solution ± ± (λ , x ) such that ±ψ(x± ) = max ±ψ(x). x∈Mβ
Motivated from the proof of Theorem 4.4.1, we make the following definition. DEFINITION 4.4.4 A compact C 1 -manifold M in a Banach space X is said to be spherelike, if it is C 1 -diffeomorphic to the unit sphere ∂B1 (0) = {x ∈ X : x = 1} in X. The next proposition, resulting from the proof of Theorem 4.4.1, answers the question of when the level surface Mβ = {x ∈ X : ϕ(x) = β} of a function ϕ : X −→ R is spherelike. PROPOSITION 4.4.5 If X is a Banach space, ϕ ∈ C 1 (X), and (i) ϕ (x), x = 0 for all x ∈ Mβ . (ii) Every ray from the origin intersects Mβ at exactly one point, then Mβ is spherelike; the C 1 -diffeomorphism rβ : Mβ −→ ∂B1 (0) is given by rβ = rM , β
where r : X \{0} −→ ∂B1 (0) is the radial retraction defined by r(x) =
x ; x
also we have Drβ (x)L ≤ (2/x) for all x ∈ Mβ . PROOF: Note that r is Fr´echet differentiable on X\{0} and r (x) ∈ L(X) is given by r (x)(h) = ⇒ r (x)X ∗
F (x), h 1 x for all h ∈ X, h− x x3 2 ≤ for all x ∈ X \{0}. x
(4.145)
Because F (x), x = x2 , from (4.145) we also have that r (x)(x) ⊆ ker n (x)
for all x ∈ X \{0},
where n : X −→ R is the norm function n(x) = x which is Gˆ ateaux differentiable at every x ∈ X \ {0} because by hypothesis the space X is smooth. Recall that Mβ is a C 1 -manifold in X, which does not contain the origin of X. Also
4.4 Abstract Eigenvalue Problems
339
Tx (Mβ ) = ker ϕ (x)
for all x ∈ Mβ (see Theorem 2.2.7). (4.146) Because of hypothesis (ii) rβ = rM is a bijection from Mβ into ∂B1 (0). We β show that the derivative of this bijection is an isomorphism between the corresponding tangent spaces; that is,
Drβ : Tx (Mβ ) −→ Trβ (x) ∂B1 (0) (4.147) is an isomorphism. To this it suffices to show that r (x) is an isomorphism
end, from
ker ϕ(x) into ker n rβ (x) (see (4.146), (4.147), and recall that ker n (x) = Tx ∂B1 (0) if x = 1). Recall that r (x) maps X into ker n (x) for all x ∈ X \ {0}. Because n (x) = n (λx) for all λ > 0 and all x ∈ X \ {0} we see that r (x) maps linearly into
kern rβ (x) and this operator is also continuous. For x ∈ Mβ and u ∈ ker n rβ (x) = ker n (x), we set
v = xu − x
ϕ (x), u x. ϕ (x), x
The map u −→ v is linear and continuous. Moreover, note that ϕ (x), u ϕ (x), v = x ϕ (x), u − x ϕ (x), x = 0, ϕ (x), x
⇒ v ∈ ker ϕ (x).
Also we have r(x)(v) = u. Therefore r (x) ∈ L ker ϕ (x), ker n rβ (x) is surjective. It is also injective, because r (x)(h) = 0
if and only if h =
F(x), h x x2
(see (4.146)).
Because x does not belong to ker n (x), this equality holds only for h = 0, which proves the desired injectivity. Thus
Drβ (x) : Tx (Mβ ) −→ Trβ (x) ∂B1 (0) is an isomorphism and so rβ : Mβ −→ ∂B1 (0) is a C 1 -diffeomorphism.
DEFINITION 4.4.6 We say that ϕ ∈ C 1 (X) defines a spherelike constraint at β∈R, if Mβ =ϕ−1 ({β}) is a spherelike manifold (see Definition 4.4.4). We would like to have concrete and easily verifiable conditions on ϕ ∈ C 1 (X), which imply that for certain levels β, the manifold Mβ is spherelike. The next proposition provides such conditions. PROPOSITION 4.4.7 If X is a Banach space and ϕ ∈ C 1 (X) satisfies the following conditions. (i) ϕ and ϕ : X −→ X ∗ are bounded (i.e., map bounded sets to bounded sets); (ii) ϕ (x), x −→ +∞ as x −→ ∞, then there exists a β0 > 0 such that for all β ≥ β0 the manifolds Mβ = ϕ−1 ({β}) are spherelike.
340
4 Critical Point Theory and Variational Methods
PROOF: Because of hypothesis (ii), given M > 0, we can find R = R(M ) > 0 such that ϕ (x), x ≥ M for all x ∈ X with x ≥ R. (4.148) For r ≥ 1 and x ∈ X with x = R, we have
r d ϕ(rx) − ϕ(x) = ϕ(tx)dt 1 dt
r ϕ (tx), x dt = 1
r 1 ϕ (tx), tx dt = t 1
r dt = M ln r. ≥M 1 x
(4.149)
Because of hypothesis (i), we have m = sup |ϕ(x)| : x = R < +∞. So from (4.149) we have ϕ(rx) ≥ −m + M ln r, ⇒ ϕ(x) −→ +∞ as x −→ ∞
(i.e., ϕ is weakly coercive).
(4.150)
In particular (4.150) implies that for every β ∈ R, Mβ is bounded. Moreover, (4.150) shows that there exist R < R0 < R1 and M0 < M1 such that M0 ≤ ϕ(x) ≤ M1
for all x ∈ X with R0 ≤ x ≤ R1 .
Fix a y ∈ ∂B1 (0), otherwise arbitrary, and consider the ray {xt = ty}t≥0 . For M ∈ [M0 , M1 ] there exists a minimal t ≥ R0 such that the ray intersects Mβ . As above for t > t ≥ R, we have t ϕ(ty) − ϕ(y) ≥ M ln > 0. t So this ray intersects Mβ at exactly one point.
4.5 Bifurcation Theory Bifurcation theory is concerned with the structure of branch points of nonlinear parametric operator equations in a Banach space. So suppose that X, Y are Banach spaces and G : R × X −→ Y is a continuously Fr´echet differentiable map such that G(λ0 , x0 ) = 0. We are interested in constructing all solutions of the nonlinear operator equation G(λ, x) = 0 (4.151) in a neighborhood of (λ0 , x0 ). If Gx (λ0 , x0 ) ∈ L(X, Y ) is an isomorphism, then according to the implicit function theorem (see Theorem 1.1.23), there is a continu ously Fr´echet differentiable function x(λ) defined for λ ∈ R such that G λ, x(λ) = 0.
4.5 Bifurcation Theory
Moreover, all solutions of
(4.151) in a sufficiently small neighborhood of (λ0 , x0 ) in R × X lie on the curve λ, x(λ) . Bifurcation theory is concerned with what happens when Gx (λ0 , x0 ) is no longer an isomorphism. This simplest case occurs when L0 = Gx (λ0 , x0 ) is a Fredholm operator (see Definition 3.1.44(a)). In that case the bifurcation problem may be reduced, via an alternative method, to a system of n algebraic equations with n unknowns Fk (λ, y) = 0,
k ∈ {1, . . . , n}
and
n y = (yk )n k=1 ∈ R .
(4.152)
For such problems, the main task is the analysis of system (4.152), which is often a quite complex endeavor especially if n ≥ 1 is big. Bifurcation theory is an important topic in applied mathematics, because the phenomenon of bifurcation is intimately associated with loss of stability of the system under consideration. A complete resolution of the branching problem therefore requires a study of the stability of the bifurcation solutions. The bifurcation problem that we examine in this section usually arises from the steady-state solutions of dynamical systems. So let X be a Banach space, U ⊆ X an open set containing the origin, and T = (λ0 − δ, λ0 + δ) for some λ0 ∈ R and δ > 0. It is also given a compact map G : T × U −→ X that satisfies G(λ, 0) = 0 for all λ ∈ T . We consider the following operator system x = G(λ, x).
(4.153)
Clearly (λ, 0) for λ ∈ T satisfies (4.153). We call such solutions of (4.153) trivial solutions. All other solutions of (4.153) are called nontrivial solutions. DEFINITION 4.5.1 A point (λ0 , 0) ∈ T × U is a bifurcation point of (4.153) if every neighborhood of (λ0 , 0) contains a nontrivial solution of (4.153); that is, (λ0 , 0) ∈ cl {(λ, x) ∈ T × U : x = G(λ, x), x = 0}. REMARK 4.5.2 According to this definition (λ0 , 0) is a bifurcation point of (4.153) if and only if we can find a sequence {(λn , xn )}n≥1 ⊆ T × U such that xn = G(λn , xn ), xn = 0, and λn −→ λ0 , xn −→ 0 as n → ∞. Also, although strictly speaking (λ0 , 0) is the bifurcation point, we usually say (for simplicity) that λ0 is the bifurcation point (in fact if we identify R × {0} with R, we see that this abuse of the terminology is in fact justified). We focus our attention on maps G(λ, x) of the form G(λ, x) = λK(x) + ϑ(λ, x) (4.154) with K ∈ Lc (X) and ϑ(λ, x) x −→ 0 as x −→ 0 uniformly in λ ∈ T . Clearly (4.154) is obtained from (4.153) via linearization. Note that, if G : T × U −→ X is a compact map which is Fr´echet differentiable at x = 0 for all λ ∈ T with Fr´echet derivative λK, K ∈ L(X), then from Proposition 3.1.20, we know that K ∈ Lc (X) and so (4.154) is valid. The first result that we prove is a rather negative result, because it says that most numbers are not bifurcation points. Nevertheless, it is a useful result because it clarifies the situation and allows us to focus on the possible locations for bifurcation points.
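Before the proposition, it may help to keep in mind the purely linear finite-dimensional case (our own illustration, with hypothetical data): take X = R², ϑ ≡ 0 and K = diag(1, 1/2). Then
\[ x = \lambda K(x) \ \text{has a nontrivial solution} \iff \frac{1}{\lambda} \in \Big\{ 1, \frac{1}{2} \Big\} \iff \lambda \in \{1, 2\}, \]
so the only candidates for bifurcation points are the characteristic values of K, exactly as the next proposition asserts in general.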
PROPOSITION 4.5.3 If G(λ, x) has the form (4.154) and (λ0 , 0) is a bifurcation point of (4.153), then 1/λ0 is an eigenvalue of K ∈ Lc (X). PROOF: Suppose that 1/λ0 is not an eigenvalue of K. Then I − λ0 K is invertible. So if c = 1 (I − λ0 K)−1 L , we have (I − λ0 K)(x) ≥ c x
for all x ∈ X.
Hence we have x − G(λ, x) = x − λK(x) − ϑ(λ, x) = x − λK(x) − λ0 K(x) + λ0 K(x) − ϑ(λ, x) ≥ (I − λ0 K)(x) − |λ − λ0 | KL x − ϑ(λ, x) (4.155) ≥ cx − |λ − λ0 | KL x − ϑ(λ, x). Recall that by hypothesis ϑ(λ, x) x −→ 0 as x −→ 0 uniformly in λ ∈ T . So from (4.155) we infer that if |λ − λ0 |, x are small and x = 0, we have x − G(λ, x) > 0. So for all λ near λ0 and all x = 0 with x small x = G(λ, x), which shows that (λ0 , 0) is not a bifurcation point (see Definition 4.5.1). COROLLARY 4.5.4 If G(λ, x) has the form (4.154) and is defined on all R × X, then the set of bifurcation points is a discrete set in R. PROOF: If λ = 0 is a bifurcation point, then by virtue of Proposition 4.5.3 1/λ belongs to the point spectrum of K ∈ Lc (X), which is either a finite set or a sequence converging to zero. Therefore Proposition 4.5.3 implies that the set of bifurcation points is discrete. We have seen in Proposition 4.5.3 that a necessary condition for (λ0 , 0) to be a bifurcation point is the noninvertibility of λ0 K = Gx (λ0 , 0). However, this condition is not sufficient in general, as shown by the following simple example. EXAMPLE 4.5.5 Let T = R, X = R2 , and consider G(λ, x, y) = (−λx + y , −λy − x ) 3
3
x for all v = ∈ R2 . y
Then Gv (−1, 0, 0)=I =(−1)(−I) and K =−I has an eigenvalue equal to −1, but (−1, 0) is not a bifurcation point, because G(λ, x, y) = (x, y), ⇒ y(−λx + y 3 ) − x(−λy − x3 ) = 0,
x taking inner product with ∈ R2 , −y ⇒ x4 + y 4 = 0 (i.e., x = y = 0).
In what follows extending a standard notation for invertible N × N -matrices to isomorphisms of X onto itself, we set GL(X) = L ∈ L(X) : L is an isomorphism . Motivated from Proposition 4.5.3, we introduce the following definition. DEFINITION 4.5.6 The set of singular points of (4.153) is the set / GL(X) . Ds = (λ, 0) ∈ T × U : I − λK ∈ We want to derive sufficient conditions for (λ0 , 0) to be a bifurcation point of (4.153). According to Corollary 4.5.4 we focus on isolated critical points (λ0 , 0) and by virtue of Definition 4.5.1, (λ0 , 0) is a bifurcation point if and only if in every neighborhood of (λ, 0) problem (4.153) has a nontrivial solution. In order to distinguish nontrivial solutions from trivial ones, we use a function α ∈ C(T × U ) that is negative on the set T × {0} of trivial solutions of (4.153). This function is called the auxiliary function. Hence the search for nontrivial solutions of problem (4.153) is reduced to the resolution of the following system, ) * x = G(λ, x) . (4.156) α(λ, x) = 0 To solve system (4.156), we can use degree-theoretic methods. For this purpose we introduce the maps ϕ : T × U −→ X and ϕα : T × U −→ R × X defined by
and
ϕ(λ, x) = x − G(λ, x)
ϕα (λ, x) = α(λ, x), ϕ(λ, x)
for all (λ, x) ∈ T × U.
Let U be a small neighborhood of (λ0 , 0). If ϕα (λ, x) = (0, 0) for all (λ, x) ∈ ∂U , then we can consider the Leray–Schauder degree d(ϕα , U, 0) and to solve (4.156), try to show that d(ϕα , U, 0) = 0. To pursue this idea, we introduce the following notion. DEFINITION 4.5.7 Let (λ0 , 0) be an isolated singular point of (4.153) and let ε > 0, r > 0 be such that (a) c < |λ − λ0 | ≤ ε ⇒ λ ∈ / Ds . (b) x = G(λ, x) for all |λ − λ0 | = ε and 0 < x ≤ r. We set V (ε, r) = (λ0 −ε, λ0 +ε)×Br (0) and we call V (ε, r) a special neighborhood of (λ0 , 0). Also a function ξ : V (ε, r) × R −→ R satisfying (c) ξ(λ, 0) = −|λ − λ0 |, (d) ξ(λ, x) = r for x = r is called a complementing function (or Ize’s function) for (λ0 , 0). REMARK 4.5.8 It is easy to see that a special neighborhood of an isolated critical point (λ0 , 0) exists. Indeed, let ε > 0 be small so that I − λK is invertible for all |λ − λ0 | = ε and set
1 : |λ − λ0 | = ε . −1 (I − λ0 K) L Then choose r > 0 small enough so that ϑ(λ, x) x ≤ c for all 0 < x ≤ r. We assume without any loss of generality that this inequality is always satisfied by some special neighborhood of (λ0 , 0). A complementing function is then defined by
ξ(λ, x) = |λ−λ0 | (x−r)/r +x. Note that for every η > 0, αη (λ, x) = ξ(λ, x)−η is an auxiliary function. Exploiting of the Leray–Schauder
the homotopy invariance degree we have d ϕξ , V (ε, r), 0 = d ϕαη , V (ε, r), 0 for η > 0 small. So the existence
of nontrivial solutions in V (ε, r) can be deduced from the relation d ϕξ , V (ε, r), 0 = 0.
c = min
PROPOSITION 4.5.9 If G : T × U −→ X is a compact map given by (4.154), (λ0 , 0) is an isolated singular point of (4.153), V (ε, r) is a special neighborhood of (λ0 , 0), and ξ(λ, x) a complementing function defined on V (ε, r), then for every special neighborhood V (ε , r ) of (λ0 , 0) with 0 < ε < ε, 0 < r < r and for every complementing function ξ (λ, x) defined by V (ε , r ) we have
d ϕξ , V (ε, r), 0 = d ϕξ , V (ε , r ), 0 . PROOF: Let h : [0, 1] × V (ε, r) −→ X be defined by h(t, λ, x) = x − λK(x) − tϑ(λ, x) and hξ : [0, 1] × V (ε, r) −→ R × X be defined by
hξ (t, λ, x) = ξ(λ, x), h(t, λ, x) . If we set k(λ, x) = x − λK(x) and ϕ(λ, x) = x − G(λ, x), then because ϑ(λ, x) 1 : |λ − λ0 | = ε ≥ for all 0 < x ≤ r c = min −1 (I − λ0 K) L x (see Remark 4.5.8), we see that hξ is an admissible homotopy (for compact perturbations of the identity) between kξ and ϕξ . So from the homotopy invariance of the Leray–Schauder degree, we have
d kξ , V (ε, r), 0 = d ϕξ , V (ε, r), 0 . (4.157)
Here kξ (λ, x) = ξ(λ, x); x − k(λ, x) . In a similar fashion we show that
d kξ , V (ε, r ), 0 = d ϕξ , V (ε, r ), 0 .
(4.158)
kξ−1 (0)
= (λ0 , 0) and so from (4.157), (4.158), and the excision propNote that erty of the Leray–Schauder degree, we conclude that
d ϕξ , V (ε, r), 0 = d ϕξ , V (ε, r ), 0 . Recall that if we consider the general linear group GL(N, R), we can define GL+ (N, R) = A ∈ GL(N, R) : det A > 0 and GL− (N, R) = A ∈ GL(N, R) : det A < 0 .
Note that GL(N, R) can be regarded as a subset of RN ×N and so GL(N, R) is a topological space with the topology induced from RN ×N . It is well known that GL+ (N, R) and GL− (N, R) are two open, connected components of GL(N, R). In fact this has an infinite-dimensional analogue. Namely let GL (X) = I + K :K∈ c Lc (X) and (I + K)−1 ∈ L(X) . Then GLc (X) has two connected components GL+ (X) (which contains I) and GL− (X). DEFINITION 4.5.10 Let L ∈ GLc (X). Then 1 if L ∈ GL+ (X) . sgn L = −1 if L ∈ GL− (X) REMARK 4.5.11 As for Brouwer’s degree, if ϕ = I − f , with f ∈ K(U, X) ∩ C 1 (U, X) and ε = inf{ϕ(x) : x ∈ ∂U } > 0, then there exists y0 ∈ Bε (0) a regular value of ϕU such that d(ϕ, U, 0) = d(ϕ − y0 , U, 0) = sgn ϕ (x). x∈(ϕ−y0 )−1 (0)
Now we can produce a useful sufficient condition for (λ0 , 0) to be a bifurcation point. THEOREM 4.5.12 If G : T × U −→ X is a compact map given by (4.154), (λ0 , 0) is an isolated singular point of (4.153), V (ε, r) is a special
neighborhood of (λ0 , 0), ξ : V (ε, r) −→ R is a complementing function, and sgn I −(λ0 − ε)K = sgn I − (λ0 + ε)K , then (λ0 , 0) is a bifurcation point.
PROOF: Let k(λ, x) = x − λK(x) and kξ (λ, x) = ξ(λ, x), k(λ, x) for all (λ, x) ∈ V (ε, r). Then as in the proof of Proposition 4.5.9 we can show that
(4.159) d ϕξ , V (ε, r), 0 = d kξ , V (ε, r), 0 . By virtue of the homotopy invariance of the Leray–Schauder degree the map λ −→ λK is constant on some neighborhoods J+ and J− of λ0 + ε and λ0 − ε, respectively. We define the α(t, λ, x) = ξ(λ, x) + tη with η ∈ (0, ε) such that λ0 + η ∈ J+ and λ0 − ε ∈ J− . Then we set
h(t, λ, x) = α(t, λ, x), k(λ, x) . We have
d h(1, ·, ·), V (ε, r), 0
= sgn I −(λ0 − ε)K − sgn I −(λ0 + ε)K = 0 (see (4.159)), ⇒ d(ϕξ , V (ε, r), 0 = 0
(see Remark 4.5.11)
⇒ (λ0 , 0) is a bifurcation point. DEFINITION 4.5.13 Let G : T × U −→ X be a compact map given by (4.154), (λ0 , 0) an isolated singular point of (4.153), V (ε, r) a special neighborhood of (λ0 , 0), and ξ : V (ε, r) −→ R a complementing function. The number γ(λ0 ) = sgn I −(λ0 − ε)K − sgn I −(λ0 + ε)K is called a crossing number at (λ0 , 0).
Recall that the algebraicmultiplicity of aneigenvalue µ, denoted by mα (µ), is the dimension of the space N (µI − K)n = ker N (µI − K)n . n≥1
n≥1
THEOREM 4.5.14 If G : T × U −→ X is a compact map given by (4.154), and 1/λ0 is an eigenvalue of K which has odd algebraic multiplicity mα , then (λ0 , 0) is a bifurcation point. PROOF: By virtue of Definition 4.5.7, 1/λ is not an eigenvalue of K if λ ∈ [λ0 − ε, λ0 + ε] \ {λ0 }. Then from Remark 4.5.11 we know that as λ moves from λ0 − ε to
λ0 + ε, the degree d I − λK, Br (0), 0 = sgn(I − λK) changes by a multiplicative mα factor (−1) . Therefore Theorem 4.5.12 implies that (λ0 , 0) is a bifurcation point. So this theorem relates an eigenvalue of odd algebraic multiplicity to the occurrence of a bifurcation point. Next we show that bifurcation from eigenvalues of odd multiplicity is a global rather than a local phenomenon. We start with a topological result. LEMMA 4.5.15 If (Y, d) is a compact metric space, A ⊆ Y is a component (i.e., a maximal connected closed set), and B ⊆ Y is a closed set such that A ∩ B = ∅, then there exist compact sets K1 ⊇ A, K2 ⊇ B such that K1 ∩ K2 = ∅ and K1 ∪ K2 = Y . PROOF: Given ε > 0, we say that two points a, b ∈ Y are ε-chainable, if we can find a finite sequence {yk }n k=1 such that y1 = α, yn = b and d(yk , yk+1 ) < ε for all k ∈ {1, . . . , n − 1}. The finite sequence {yk }n k=1 is called an ε-chain joining a and b. We set Aε = y ∈ Y : there exists a ∈ A such that a and y are ε-chainable . Evidently A ⊆ Aε and Aε is both closed and open in Y (a clopen set in Y ). So it suffices to show that Aε ∩ B = ∅ for some ε > 0. We argue indirectly. So suppose that Aε ∩ B = ∅ for every ε > 0. Then we can find sequences {εn }n≥1 ⊆ R+ \ {0}, {an }n≥1 ⊆A and {bn }n≥1 ⊆B such that εn −→ 0
and
an and bn are εn -chainable for every n ≥ 1.
Both sets A and B are compact and so we may assume that an −→ a ∈ A and bn −→ b ∈ B. Therefore for every n ≥ 1, we can find an εn -chain Cn joining a and b. Consider the set Y0 = y∈Y : y = lim ynk with ynk ∈ Cnk . k→∞
We have (i) Y0 is a closed, hence compact set in Y . (ii) a, b ∈ Y0 . (iii) For any u, v ∈ Y0 and any n ≥ 1, the points u and v are εn -chainable.
Suppose that Y_0 is not connected. We can find compact sets C_1 and C_2 and δ > 0 such that Y_0 = C_1 ∪ C_2 and d(u, v) > δ for all u, v ∈ Y with d(u, C_1) < δ, d(v, C_2) < δ. Choose ε_n < δ, u ∈ C_1, and v ∈ C_2. Then u and v are ε_n-chainable, but this contradicts the previously mentioned property of C_1 and C_2. Therefore Y_0 is connected. Because a ∈ Y_0, we have Y_0 ⊆ A (recall that by hypothesis A is a component of Y). But b ∈ Y_0, hence b ∈ A ∩ B, a contradiction.
THEOREM 4.5.16 If U ⊆ R × X is a bounded neighborhood of (λ_0, 0), G : U −→ X is a compact map given by (4.154), and the points in D_s = {(λ, x) ∈ U : I − λK ∉ GL(X)} are isolated, then if we set S = {(λ, x) ∈ U : x = G(λ, x), x ≠ 0}, the component S_0 of S that contains (λ_0, 0) has at least one of the following properties:
(a) S_0 ∩ ∂U ≠ ∅.
(b) S_0 contains a finite number of trivial solutions (λ_k, 0), k ∈ {0, 1, . . . , n}, and ∑_{k=0}^{n} γ(λ_k) = 0, where γ(λ_k) denotes the crossing number of (λ_k, 0) (see Definition
4.5.13). PROOF: First note that S is compact in R × X. Suppose that S0 ∩ ∂U = ∅. Claim: Thereexists a bounded open set U 0 such that S0 ⊆ U0 ⊆ U and S ∩∂U0 = ∅. Set Wδ = (λ, x) ∈ U : d (λ, x), S0 < δ . Clearly S ∩ W δ is compact and so S0 ∩ ∂Wδ = ∅. First suppose that S ∩ W δ is connected. Then S0 = S ∩ W δ because S0 is a component of S. Therefore we can choose U0 = Wδ for δ > 0 small enough. Next suppose that S ∩ W δ is not connected. By Lemma 4.5.15 we can find compact sets K1 ⊇ S0 and K2 ⊇ S ∩ ∂Wδ such that K1 ∩ K2 = ∅ and K1 ∪ K2 = S ∩ W δ . Let a = d(K1 , K2 ) > 0 and set U0 = (K1 )α/2 ∩ W δ where (K1 )α/2 = (λ, x) ∈ U : d (λ, x), K1 < a/2 . Then for δ > 0 small, we have S0 ⊆ U0 ⊆ U
and
S ∩ ∂U0 = ∅.
Because by hypothesis the singular set D_s consists of isolated points and S_0 is compact, S_0 must contain a finite number of trivial solutions {(λ_k, 0)}_{k=0}^{n}. We may assume that
Ū_0 ∩ (R × {0}) ∩ U = ⋃_{k=0}^{n} ([λ_k − ε, λ_k + ε] × {0}),
where the intervals are disjoint. Also without any loss of generality we may assume that there is an open set V_0 ⊆ U_0 such that U_0 \ V̄_0 is a union of special neighborhoods {V(ε_k, r_k)}_{k=0}^{n} of the isolated points {(λ_k, 0)}_{k=0}^{n}. Taking ε = min_{1≤k≤n} ε_k and r = min_{1≤k≤n} r_k, we may assume that
V(ε_k, r_k) = (λ_k − ε, λ_k + ε) × B_r(0), k ∈ {0, . . . , n}.
Also let ξ_k : V(ε_k, r_k) −→ R be a corresponding complementing function for (λ_k, 0); that is,
ξ_k(λ, 0) = −|λ − λ_k| and ξ_k(λ, x) = r for ||x|| = r.
We can define a function ξ : Ū_0 −→ R by
ξ(λ, x) = ξ_k(λ, x) if (λ, x) ∈ V(ε_k, r_k), and ξ(λ, x) = r if (λ, x) ∈ V̄_0.
Clearly ξ is continuous and because ϕ_ξ(λ, x) ≠ 0 for all (λ, x) ∈ ∂U_0, the Leray–Schauder degree d(ϕ_ξ, U_0, 0) is well-defined. Let R > ε > 0 and consider the following compact homotopy
h(t, λ, x) = ((1 − t)ξ(λ, x) − tR, ϕ(λ, x)) for all t ∈ [0, 1] and all (λ, x) ∈ Ū_0.
If for some (t, λ, x) ∈ [0, 1] × ∂U_0 we have h(t, λ, x) = 0, then because ϕ(λ, x) ≠ 0 for all (λ, x) ∈ ∂U_0 \ V̄_0, we have that (λ, x) ∈ ∂U_0 ∩ V(ε_k, r_k). In this case ϕ(λ, x) = 0 implies x = 0 and so ξ(λ, x) = ξ_k(λ, x) = −|λ − λ_k| = −ε. Hence (1 − t)ξ(λ, x) − tR < 0, which contradicts the fact that h(t, λ, x) = 0. So the homotopy h is admissible. Note that h(0, ·, ·) = ϕ_ξ. Then by virtue of the homotopy invariance of the Leray–Schauder degree we have
d(ϕ_ξ, U_0, 0) = d(h(1, ·, ·), U_0, 0).    (4.160)
Because h(1, λ, x) = (−R, ϕ(λ, x)) ≠ 0 for all (λ, x) ∈ Ū_0, we must have
d(h(1, ·, ·), U_0, 0) = 0.    (4.161)
From (4.160) and (4.161) it follows that d(ϕ_ξ, U_0, 0) = 0. Because ϕ_ξ(λ, x) ≠ 0 for all (λ, x) ∈ V̄_0, from the excision and additivity with respect to the domain properties of the degree map, we have
0 = d(ϕ_ξ, U_0, 0) = ∑_{k=0}^{n} d(ϕ_ξ, V(ε_k, r_k), 0) = ∑_{k=0}^{n} γ(λ_k).
As a consequence of this theorem, we obtain the following.
THEOREM 4.5.17 If U ⊆ R × X is a bounded neighborhood of (λ_0, 0), G : U −→ X is a compact map given by (4.154), and 1/λ_0 is an eigenvalue of odd algebraic multiplicity of K, then if we set S = {(λ, x) ∈ U : x = G(λ, x), x ≠ 0}, the component S_0 of S that contains (λ_0, 0) has at least one of the following properties:
(a) S_0 ∩ ∂U ≠ ∅.
(b) S_0 contains an odd number of trivial solutions (λ_k, 0) ≠ (λ_0, 0) with 1/λ_k an eigenvalue of K of odd algebraic multiplicity.
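The global alternative of Theorem 4.5.17 can be visualized on a one-dimensional toy problem. The sketch below is a heuristic model of ours, not the operator-theoretic setting of the theorem: we take X = R, K = 1 and G(λ, x) = λx − x³, so 1/λ_0 = 1 is a simple (odd multiplicity) eigenvalue, and natural-parameter continuation traces the nontrivial branch x² = λ − 1 emanating from (λ_0, 0) = (1, 0) until it leaves a prescribed bounded neighborhood U, which is alternative (a).

```python
import numpy as np

# Heuristic scalar illustration: X = R, K = 1, G(lam, x) = lam*x - x**3.
# Dividing the fixed point equation x = G(lam, x) by x != 0, the nontrivial
# solutions satisfy g(lam, x) = lam - 1 - x**2 = 0, a branch emanating from
# (lam0, 0) = (1, 0).  We trace it by natural-parameter continuation with Newton
# corrections until it leaves the box U = (0,3) x (-2,2).

def g(lam, x):
    return lam - 1.0 - x**2

def newton_on_branch(lam, x0, tol=1e-12, itmax=50):
    x = x0
    for _ in range(itmax):
        x = x - g(lam, x) / (-2.0 * x)      # Newton step for g(lam, .) = 0
        if abs(g(lam, x)) < tol:
            break
    return x

branch = [(1.0, 0.0)]                        # the bifurcation point itself
lam, x = 1.01, 0.1                           # predictor just off (1, 0)
while 0.0 < lam < 3.0 and abs(x) < 2.0:      # stop when the branch exits U
    x = newton_on_branch(lam, x)
    branch.append((lam, x))
    lam += 0.05                              # natural-parameter step

print("branch left U near (lam, x) =", branch[-1])
```

For genuine infinite-dimensional problems one would use a proper branch-following scheme, but the picture is the same: the branch either reaches ∂U or returns to another trivial solution.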
4.6 Remarks 4.1: In the literature we can find two approaches to the critical point theory. One is based on deformation techniques along the negative gradient flow or a suitable substitute of it, namely the pseudogradient flow (see Definition 4.1.15). This is the approach that we follow here. An alternative approach and an outline of it can be found in de Figueiredo [244] (see also Cuesta [169] for critical point theory on C 1 -Banach manifolds). The Palais–Smale condition and the slightly more general Cerami condition (see Definition 4.1.4), are crucial in both approaches and were introduced by Palais–Smale [470] (the P S-condition) and by Cerami [136] (the C-condition). The relation between these compactness-type conditions and weak coercivity (see Theorem 4.1.12) was investigated by many authors. We mention the works by Caklovic–Li–Willem [127], Costa–Silva [158], Brezis–Nirenberg [105], and Goeleven [270]. The concept of pseudogradient vector field (see Definition 4.1.15) was introduced by Palais [471] in order to extend the classical Ljusternik–Schnirelmann theory (see Section 4.2) to infinite-dimensional Banach manifolds. He also proved that the pseudogradient vector field exists (see Theorem 4.1.18). Various forms of the deformation theorem (see Theorem 4.1.19) were first obtained by Browder [116], Palais [471], Schwartz [545], and Clark [147]. Of course this result is the main step in the deformation approach. The notion of linking sets (see Definition 4.1.20), is due to Benci–Rabinowitz [64]. A more restrictive version of Theorem 4.1.22 can be found in Struwe [564, p. 118]. The mountain pass theorem (see Theorem 4.1.24), is due to Ambrosetti–Rabinowitz [20]. The saddle point theorem (see Theorem 4.1.25), is due to Rabinowitz [504]. The generalized mountain pass theorem (see Theorem 4.1.26), is due to Rabinowitz [505]. The notion of local linking was introduced first by Brezis–Nirenberg [105]. Extensions of it can be found in Li–Willem [380]. The principle of symmetric criticality (see Theorem 4.1.41) is due to Palais [472] (see also Willem [606, p. 18]). The extension of the critical point theory to functions of the form ϕ = ϕ+ψ with ϕ ∈ C 1 (X), ψ ∈ Γ0 (X) is due to Szulkin [568]. In fact there are extensions of the theory to nonsmooth functions. We refer to Degiovanni–Marzocchi [192], Motreanu–Radulescu [439], and Gasi´ nski–Papageorgiou [258]. Finally in addition to Theorem 4.1.19, there is the so-called second deformation theorem due to R¨ othe [531] Marino–Prodi [408] and Chang [142]. For a proof of it we refer to Gasi´ nski–Papageorgiou [258, p. 617]. The result is useful in proving multiplicity results for nonlinear boundary value problems. THEOREM 4.6.1 If ϕ∈C 1 (X), a∈R, a < b≤+∞ (if b=+∞, then ϕb \ Kb =X), ϕ satisfies the P Sc -condition for every c∈[a, b), ϕ has no critical values in (a, b), and ϕ−1 (a) contains at most a finite number of critical points of ϕ, then there exists a ϕ-decreasing homotopy h : [0, 1] × (ϕb \ Kb ) −→ ϕb such that h(1, ϕb \Kb ) ⊆ ϕa and
h(t, x) = x
for all (t, x) ∈ [0, 1] × ϕa .
REMARK 4.6.2 So according to this theorem ϕa is a strong deformation retract of ϕb \ Kb . The critical point theory for smooth functionals (i.e., ϕ ∈ C 1 (X)), using the deformation approach can be found in the books by Chang [143], Struwe [564], and Willem [606].
4.2: The study of critical points of not necessarily quadratic functionals, started with the work of Ljusternik [391], who considered C 2 -functionals on finite dimensional manifolds and introduced for that purpose the notion of category (see Definition 4.2.6). Soon thereafter these ideas were pursued further by Ljusternik– Schnirelmann [393, 394]. They exploited the fact that a compact set has a neighborhood of the same category, to calculate categories by means of elementary concepts of combinatorial topology. A notion closely related to the genus (see Definition 4.2.13), was first introduced by Yang [610], under the name B-index and denoted by i(A). In fact i(A) ≤ γ(A) and i(S n ) = n, where S n = ∂B1 (0) in Rn+1 . The genus was first considered by Krasnoselskii [361], although the definition we use here is due to Coffmann [156]. Discussions of the Ljusternik–Schnirelmann theory and of the notions of category and genus can be found in Deimling [188], Gasi´ nski– Papageorgiou [258], Schwartz [545], Struwe [564], and Zeidler [619]. Extensions to Banach manifolds with Finsler structure (see Definition 4.2.24) were established by Palais [471], Szulkin [569], and Ghoussoub [262]. 4.3: The spectral properties of the negative Laplacian with Dirichlet or Neumann conditions are standard and various parts of them can be found in Evans [229] and Jost [335] (see also Gasi´ nski–Papageorgiou [258] for more general linear elliptic operators). The p-Laplacian differential operator has been the object of intense research for the last fifteen years. Here in developing the spectral properties of −p we follow the works of Anane [25] and Lindqvist [384, 385]. Theorem 4.3.56 is due to Anane–Tsouli [26]. Theorem 4.3.61 was proved by Drabek–Robinson [206], and Theorem 4.3.66 is due to Cuesta [168]. For the eigenvalues of the vector ordinary p-Laplacian, we refer to Manasevich–Mawhin [406, 407], and Mawhin [416]. 4.4: Theorem 4.4.1 is essentially due to Amann [16]. Discussion of abstract nonlinear eigenvalue problems can be found in Berger [68], Blanchard–Br¨ uning [79], and Zeidler [620]. 4.5: Bifurcation theory has its roots in the work of Poincar´e [498]. His work was dealing with the determination of equilibrium forms for a rotating ideal fluid. The local bifurcation theory (Theorem 4.5.14) is due to Krasnoselskii [362], and the global bifurcation theory (Theorem 4.5.17) is due to Rabinowitz [502]. However, here we follow the approach of Ize [325] (see also Krawcewicz–Wu [364]). Many aspects of local and global bifurcation bifurcation theory are discussed in Berger [68], Krasnoselskii–Zabreiko [363], Deimling [188], Zeidler [619], Dancer [174], Nussbaum [465], and Krawcewicz–Wu [364].
5 Boundary Value Problems–Hamiltonian Systems
Summary. Having the general abstract theories developed in Chapter 4, in this chapter we deal with concrete boundary value problems (both for ordinary and partial differential equations). In the first section, we illustrate the uses of the so-called "variational method", which is based on the critical point theory. We consider Dirichlet and Neumann problems for nonlinear problems driven by the p-Laplacian as well as semilinear periodic systems with indefinite linear part. We prove existence and multiplicity results. Then we illustrate the method of "upper–lower solutions". In fact, combining this method with variational arguments, we prove three-solutions theorems for p-Laplacian pde's. We also deal with nonlinear nonvariational boundary value problems and provide a unifying framework that enables us to deal at the same time with the Dirichlet, Neumann, Sturm–Liouville, and periodic problems. Subsequently we illustrate the "degree-theoretical approach", by proving multiplicity results for both ode and pde problems. In Section 5.4, we deal with perturbed elliptic eigenvalue problems and we prove existence and nonexistence results using the Pohozaev identity. Then we prove maximum and comparison principles for the p-Laplacian. These results are useful tools when studying the existence of multiple nontrivial solutions. Finally, we examine Hamiltonian systems and deal with the minimal period and prescribed energy level problems.
Introduction
In this chapter we use the general theories of the previous two chapters in order to solve certain boundary value problems for nonlinear ordinary and elliptic partial differential equations.
In Section 5.1, we use the critical point theory of Chapter 4 in order to illustrate the variational method for the study of boundary value problems. We consider Dirichlet and Neumann problems for nonlinear elliptic equations driven by the p-Laplacian differential operator and semilinear periodic systems with indefinite linear part. We prove existence and multiplicity results.
In Section 5.2 we present the method of upper and lower solutions. First we consider a perturbed eigenvalue problem with the p-Laplacian and a nonlinear perturbation, which is p-superlinear. Combining the method of upper and lower solutions with some variational arguments, we show that the problem has at least
three nontrivial solutions when λ > λ2 (= the second eigenvalue of −p , W01,p (Z) ). In the second part of the section, we use the method to solve an ordinary differential equation with the scalar p-Laplacian, a nonlinearity that may be discontinuous and nonlinear and with multivalued boundary conditions which provide a unified treatment for the Dirichlet, Neumann, and Sturm–Liouville problems. Moreover, our method of proof also applies to the periodic problem. We also establish the existence of extremal solutions in the order interval formed by an ordered pair of upper and lower solutions. In Section 5.3, we illustrate the degree-theoretical approach in the study of nonlinear boundary value problems. First using the degree map for (S)+ -operators (see Section 5.3), we prove the existence of multiple solutions of constant sign for nonlinear elliptic equations with the p-Laplacian. We also consider a periodic problem with the scalar ordinary p-Laplacian and we prove two existence theorems. In the first we impose a uniform nonresonance condition between two successive eigenvalues and in the second we employ certain Landesman–Lazer type conditions. In Section 5.4 we study two perturbed elliptic eigenvalue problems. The first is nonlinear with the p-Laplacian and the second semilinear with the Laplacian. For the nonlinear one, using the Ambrosetti–Rabinowitz condition and the mountain pass theorem (see Theorem 4.1.24), we prove the existence of a nontrivial smooth solution. For the semilinear problem, we assume a critical nonlinear perturbation. We establish the so-called Pohozaev identity, which we subsequently use to show that the semilinear problem cannot have a positive solution when λ ∈ / (0, λ1 ) and the domain is star-shaped. For the case λ ∈ (0, λ1 ) we state a result that reveals an interesting dependence on the dimension N of the domain. In Section 5.5 we present some maximum and comparison principles involving the p-Laplacian. These results extend well–known results concerning the Laplacian and are useful in proving positivity and multiplicity results for the solutions of nonlinear boundary value problems. Finally in Section 5.6, we study periodic Hamiltonian systems. We prove two existence theorems. One for a prescribed minimal period and the other for a prescribed energy level. The approach is variational.
5.1 Variational Method
In this section we use the variational method to solve some characteristic second-order boundary value problems (both for ordinary and partial differential equations). So our tools come from critical point theory (see Section 4.1) and the spectral properties of the Laplacian and of the p-Laplacian (see Section 4.3). Throughout this section Z ⊆ R^N is a bounded domain with a C²-boundary ∂Z. We consider the following problem:
−div(||Dx(z)||^{p−2} Dx(z)) = f(z, x(z)) a.e. on Z,  x|_{∂Z} = 0,  1 < p < ∞.    (5.1)
We look for nontrivial positive solutions of problem (5.1) under conditions that make the Euler (energy) functional of problem (5.1) noncoercive, in fact indefinite. More precisely, the hypotheses on the nonlinearity f(z, x) are the following.
H(f)_1: f : Z × R −→ R is a function such that f(z, 0) = 0 a.e. on Z and
(i) For all x ∈ R, z −→ f(z, x) is measurable.
(ii) For almost all z ∈ Z, x −→ f(z, x) is continuous.
(iii) |f(z, x)| ≤ α(z) + c|x|^{p−1} for a.a. z ∈ Z, all x ∈ R, and with α ∈ L^∞(Z)_+, c > 0.
(iv) There exists ϑ ∈ L^∞(Z)_+ such that ϑ(z) ≥ λ_1 a.e. on Z with strict inequality on a set of positive measure and ϑ(z) ≤ lim inf_{x→+∞} f(z, x)/x^{p−1} uniformly for a.a. z ∈ Z.
(v) There exists η ∈ L^∞(Z)_+ such that η(z) ≤ λ_1 a.e. on Z with strict inequality on a set of positive measure and lim sup_{x→0^+} f(z, x)/x^{p−1} ≤ η(z) uniformly for a.a. z ∈ Z.
REMARK 5.1.1 Hypotheses H(f)_1(iv) and (v) are nonuniform nonresonance conditions near +∞ and 0^+, respectively, at the principal eigenvalue λ_1 of (−Δ_p, W_0^{1,p}(Z)). When p = 2 (semilinear problem), these hypotheses incorporate in our framework the so-called asymptotically linear problems. Note that because f(z, 0) = 0 a.e., problem (5.1) has a trivial solution. Here we are interested in nontrivial positive solutions. We consider the Lipschitz truncation function τ : R −→ R_+ defined by
τ(x) = 0 if x ≤ 0,  τ(x) = x if x > 0,
and set f_1(z, x) = f(z, τ(x)). Note that f_1(z, x) is still a Carathéodory function (i.e., f_1 is measurable in z ∈ Z for all x ∈ R and continuous in x ∈ R for a.a. z ∈ Z). We consider the Euler functional for problem (5.1) when the right-hand side nonlinearity is f_1(z, x). So ϕ_1 : W_0^{1,p}(Z) −→ R is defined by
ϕ_1(x) = (1/p)||Dx||_p^p − ∫_Z F_1(z, x(z)) dz,
where F_1(z, x) = ∫_0^x f_1(z, r) dr for all (z, x) ∈ Z × R (the potential (or primitive) function corresponding to f_1(z, x)). We know that ϕ_1 ∈ C¹(W_0^{1,p}(Z)).
PROPOSITION 5.1.2 If hypotheses H(f)_1 hold, then ϕ_1 satisfies the PS-condition.
PROOF: Let {x_n}_{n≥1} ⊆ W_0^{1,p}(Z) be a sequence such that |ϕ_1(x_n)| ≤ M_1
for some M1 > 0, all n ≥ 1 and ϕ (xn ) −→ 0 as n → ∞.
Let A : W01,p (Z) −→ W −1,p (Z) = W01,p (Z)∗ , (1/p)+(1/p ) = 1 be the nonlinear operator defined by
Dxp−2 (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). A(x), y = Z
Hereafter by ·, · we denote the duality brackets for the pair W01,p (Z), W −1,p (Z) . We know that this operator is strictly monotone, maximal monotone and of type (S)+ (see Proposition 4.3.41). Also let Nf1 : W01,p (Z) −→ Lp (Z)
be the Nemitsky operator corresponding to the function f1 ; that is, Nf1 = f1 ·, x(·) for all x ∈ W01,p (Z). We know that this map is bounded continuous (see Proposition 1.1.28). We have ϕ1 (xn ) = A(xn ) − Nf1 (xn ) for all n ≥ 1. We claim that {xn }n≥1 ⊆ W01,p (Z) is bounded.Suppose that this is not the case. We may assume that xn −→ ∞. Set yn = xn xn , n ≥ 1. We can say (at least for a subsequence) that w
yn −→ y
in W01,p (Z)
and
yn −→ y
in Lp (Z) as n → ∞.
Because of hypothesis H(f )1 (iii), we have
|f1 z, xn (z) | α(z) ≤ + c |yn (z)|p−1 xn p−1 xn p−1 f1 ·, xn (·) ⊆ Lp (Z) is bounded. ⇒ xn p−1 n≥1 So we may assume that
f1 ·, xn (·) Nf1 (xn ) w = −→ h xn p−1 xn p−1
a.e. on Z,
(5.2)
in Lp (Z) as n → ∞.
Given ε > 0 and n ≥ 1, we consider the set Cε,n = z ∈ Z : xn (z) > 0,
un (z) ≥ ϑ(z) − ε , p−1 xn (z)
n ≥ 1.
Note that for a.a. z ∈ {y > 0}, we have xn (z) −→ +∞ as n → ∞. Then because of hypothesis H(f )1 (iv), we have χCε,n (z) −→ 1 a.e. on {y > 0}. By the dominated convergence theorem, we have (1 − χCε,n ) Nf1 (xn ) 1 −→ 0, xn p−1 L ({y>0}) Nf1 (xn ) w ⇒ χCε,n −→ h in Lp (Z) as n → ∞. xn p−1 From the definition of the truncated nonlinearity f1 (z, x) and (5.2), we see that h(z) = 0 a.e. on {y > 0}. We have
f1 z, xn (z) f1 z, xn (z) χCε,n (z) = yn (z)p−1 xn p−1 xn (z)p−1
≥ χCε,n (z) ϑ(z) − ε yn (z)p−1 a.e. on Z. Taking weak limits in L1 ({y > 0}), using Mazur’s lemma and letting ε ↓ 0, we obtain
h(z) ≥ ϑ(z) y^+(z)^{p−1} a.e. on Z.
Also note that h(z) ≤ c y + (z)p−1 a.e on Z. So it follows that h(z) = g(z) y + (z)p−1
a.e. on Z
(5.3)
with g ∈ L∞ (Z)+ , g(z) ≥ ϑ(z) a.e. on Z. From the choice of the sequence {xn }n≥1 ⊆ W01,p (Z) we have
| A(xn ), yn − y − Nf1 (xn )(yn − y)dz| ≤ εn yn − y with εn ↓ 0. Z
Dividing by xn
p−1
, we obtain
Nf1 (xn ) εn A(yn ), yn − y − (yn − y)dz ≤ yn − y. p−1 xn p−1 Z xn
(5.4)
Evidently we have
Nf1 (xn ) (yn − y)dz −→ 0, p−1 x n Z ⇒ A(xn ), yn − y −→ 0 (see (5.4)).
(5.5)
Because A is of type (S)+ , from (5.5) we infer that yn −→ y in W01,p (Z). Recall that
Nf1 (xn ) A(yn ), v − vdz ≤ εn v for all v ∈ W01,p (Z). p−1 Z xn In the limit as n → ∞, we obtain
g(y + )p−1 vdz for all v ∈ W01,p (Z). (5.6) A(y), v = Z
From (5.6) it follows that ⎫ ⎧
p−1 a.e. on Z, ⎬ ⎨ −div Dy(z)p−2 Dy(z) = g(z) y + (z) ⎩
y + ∂Z = 0
⎭
.
(5.7)
Using the test function −y − , we obtain y=y + ≥ 0. Then by virtue of Proposition 4.3.57 λ1 (g) ≤ λ1 (ϑ) < λ1 (λ1 ) = 1, (5.8)
1,p with λ1 being the principal eigenvalue of −p , W0 (Z), m = 1 . Combining (5.7), (5.8), and Proposition 4.3.45, we infer that y = 0. Because
Nf1 (xn ) A(yn ), yn − y dz ≤ εn with εn ↓ 0 p−1 n x n Z
Nf1 (xn ) Z xn p−1
yn dz −→ 0, we obtain Dyn p −→ 0, hence yn −→ 0 1,p W0 (Z), a contradiction to the fact that yn = 1 for all n ≥ 1. So {xn }n≥1 w W01,p (Z) is bounded and we may assume that xn −→ x in W01,p (Z). Because and clearly
A(xn ), xn − x −
Z
Nf1 (xn )(xn − x)dz ≤ εxn − x
for all n ≥ 1
in ⊆
(5.9)
and Z Nf1 (xn )(xn − x)dz −→ 0, from (5.9) and the type (S)+ property of A, we obtain that xn −→ x in W01,p (Z). This proves that ϕ1 satisfies the P S-condition. Next we show that ϕ1 satisfies the mountain pass geometry (see Theorem 4.1.24). We start with a lemma that highlights the significance of the nonuniform nonresonance condition at 0+ (see hypothesis H(f )1 (v)). LEMMA 5.1.3 If η ∈ L∞ (Z)+ satisfies η(z) ≤ λ1 a.e. on Z with strict inequality on measure, then there exists ξ > 0 such that ψ(x) = Dxpp − a set of positive p η(z)|x(z)| dz ≥ ξDxpp for all x ∈ W01,p (Z). Z PROOF: Note that ψ ≥ 0. Suppose that the Lemma is not true. Exploiting the p-homogeneity of ψ, we can find {xn }n≥1 ⊆ W01,p (Z) with Dxn p = 1 such that ψ(xn ) ↓ 0. We may assume that w
xn −→ x in W01,p (Z)
and
xn −→ x in Lp (Z).
Clearly ψ is w-lower semicontinuous on W01,p (Z) and so we have ψ(x) ≤ 0. Hence
η|x|p dz ≤ λ1 xpp , (5.10) Dxpp ≤ Z
⇒ x = 0 or x = ±u1 (the principal eigenfunction of −p , W01,p (Z) ). If x=0, then Dxn p −→ 0, which is a contradiction to the fact that Dxn p =1 for all n ≥ 1. So x = ±u1 . Recall that |x(z)| > 0 for all z ∈ Z (see Proposition 4.3.42). Then from the first inequality in (5.10) and the hypothesis in η ∈ L∞ (Z)+ , we obtain Dxpp < λ1 xpp , which contradicts the variational characterization of λ1 > 0 (see Theorem 4.3.47) with m = 1. PROPOSITION 5.1.4 If hypotheses H(f )1 hold, then there exists > 0 such that ϕ1 (x) ≥ β > 0 for all x ∈ W01,p (Z) with x = . PROOF: By virtue of hypothesis H(f )1 (v), given ε > 0 we can find δ = δ(ε) > 0 such that
f1 (z, x) = f (z, x) ≤ η(z) + ε xp−1 for a.a. z ∈ Z and all x ∈ (0, δ],
p for a.a. z ∈ Z and all x ∈ (0, δ]. (5.11) ⇒ F1 (z, x) ≤ (1/p) η(z) + ε x On the other hand, from hypothesis H(f )1 (iii) and the mean value theorem, we have F1 (z, x) ≤ c1 xr p < r ≤ p∗ =
for a.a. z ∈ Z, all x ≥ δ and with c1 > 0, pN N −p
+∞
if p < N . if p ≥ N
(5.12)
Because F1 (z, x) = 0 for a.a. z ∈ Z and all x ≤ 0, from (5.11) and (5.12) it follows that 1
F1 (z, x) ≤ (5.13) η(z) + ε |x|p + c1 |x|r for a.a. z ∈ Z, and all x ∈ R. p Hence if x ∈ W01,p (Z), we have
1 F1 z, x(z) dz Dxpp − p Z
1 1 ε p ≥ Dxp − η|x|p dz − xpp − c1 xr (see (5.13)) p p Z p 1 ε ≥ ξ− Dxpp − c2 Dxrp for some c2 > 0 p λ1
ϕ1 (x) =
(5.14)
(see Lemma 5.1.3). Choose ε < λ_1 ξ. Because p < r, from (5.14) and Poincaré's inequality it follows that if we choose ρ > 0 small, then ϕ_1(x) ≥ β > 0 for all x ∈ W_0^{1,p}(Z) with ||x|| = ρ.
PROPOSITION 5.1.5 If hypotheses H(f )1 hold, then ϕ1 (tu1 ) −→ −∞ as t −→ +∞. PROOF: Using hypotheses H(f )1 (iii), (iv), and the mean value theorem, we see that given ε > 0 we can find cε > 0 such that F1 (z, x) ≥
1
ϑ(z) − ε xp − cε p
for a.a. z ∈ Z, and all x ≥ 0.
(5.15)
Then for t > 0, we have
tp F1 z, tu1 (z) dz Du1 pp − p Z
tp tp εtp p ≤ Du1 p − ϑ|u1 |p dz + u1 pp + cε |Z|N p p Z p
tp = (λ1 − ϑ)|u1 |p dz + ε + cε |Z|N p Z (because Du1 pp = λ1 u1 pp and u1 p = 1).
ϕ1 (tu1 ) =
(see (5.15))
(5.16)
Because u1 (z) > 0 for all z ∈ Z, because of hypothesis H(f )1 (iv) we have
(λ1 − ϑ)|u1 |p dz = γ < 0. Z
So if we choose ε < −γ, then γ + ε < 0 and from (5.16) we conclude that ϕ1 (tu1 ) −→ −∞ as t −→ +∞. THEOREM 5.1.6 If hypotheses H(f )1 hold, then problem (5.1) has a nontrivial solution x ∈ C01 (Z) such that x(z) ≥ 0 for all z ∈ Z.
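Before passing to the proof, the mountain-pass geometry provided by Propositions 5.1.4 and 5.1.5 can be seen numerically on a one-dimensional model. The sketch below uses our own toy data, not the text's: Z = (0, 1), p = 2, λ_1 = π², u_1(z) = √2 sin(πz), and a nonlinearity whose slope is η = 5 < λ_1 near 0^+ and ϑ = 20 > λ_1 near +∞, mimicking hypotheses H(f)_1(iv), (v). The energy ϕ_1(t u_1) is then positive for small t > 0 and tends to −∞ as t → +∞.

```python
import numpy as np

# Toy 1-D illustration of the mountain-pass geometry along the ray t -> t*u_1.

eta, theta = 5.0, 20.0
z = np.linspace(0.0, 1.0, 2001)
u1 = np.sqrt(2.0) * np.sin(np.pi * z)

def integrate(v):
    # composite trapezoidal rule on the grid z
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(z)))

def F(x):
    # primitive of f(s) = s*(eta + (theta - eta)*s/(1+s)) for s >= 0 (F = 0 for s <= 0),
    # so f(s)/s -> eta as s -> 0+ and f(s)/s -> theta as s -> +infinity
    s = np.maximum(x, 0.0)
    return 0.5 * eta * s**2 + (theta - eta) * (0.5 * s**2 - s + np.log1p(s))

def energy(t):
    x = t * u1
    dx = np.gradient(x, z)
    return 0.5 * integrate(dx**2) - integrate(F(x))

for t in [0.05, 0.2, 1.0, 5.0, 20.0]:
    print(f"t = {t:5.2f}   phi_1(t*u_1) = {energy(t):12.3f}")
```

The small positive values for t = 0.05 and t = 0.2 and the increasingly negative values afterwards are exactly the barrier and the descent used in the proof below.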
PROOF: Because of Proposition 5.1.5, we can find t > 0 large such that
ϕ_1(tu_1) ≤ 0 = ϕ_1(0) < β ≤ inf[ϕ_1(v) : ||v|| = ρ] and ρ < t||u_1||.
(5.17)
Then (5.17) together with Proposition 5.1.2 permit the use of Theorem 4.1.24 (the mountain pass theorem) which gives x ∈ W01,p (Z) such that ϕ1 (0) = 0 < β ≤ ϕ1 (x)
and
ϕ1 (x) = 0.
(5.18)
From the inequality in (5.18), we infer that x = 0. Also from the equation ϕ1 (x) = 0, we obtain (5.19) A(x) = Nf1 (x). We act with the test function −x− ∈ W01,p (Z). Recalling that Dx+ (z) = 0 a.e. on {x ≤ 0} and that f1 (z, x) = 0 for a.a. z ∈ Z and all x ≤ 0, we obtain Dx− p = 0 (i.e., x ≥ 0). Also from (5.19) it follows that x ∈ W01,p (Z)+ solves problem (5.1). Finally Theorem 4.3.35 implies that x ∈ C01 (Z). We can improve the conclusion of this theorem, by strengthening a little the hypotheses on the nonlinearity f (z, x). H(f )2 : f : Z × R −→ R is a function such that f (z, 0) = 0 a.e. on Z, it satisfies hypotheses H(f )1 (i)−→(v), and (vi) There exists c0 > 0 such that −c0 xp−1 ≤ f (z, x) for a.a. z ∈ Z and x ≥ 0. THEOREM 5.1.7 If hypotheses H(f )2 hold, then problem (5.1) has a nontrivial solution x ∈ int C01 (Z)+ . PROOF: Theorem 5.1.6 gives a solution x ∈ C01 (Z)+ \{0}. Because of hypothesis H(f )2 (vi) we have
div Dx(z)p−2 Dx(z) ≤ c0 x(z)p−1 a.e. on Z. Invoking Theorem 4.3.37, we conclude that x ∈ int C01 (Z)+ .
It is clear from the above analysis that if the asymptotic conditions on the slopes f(z, x)/(|x|^{p−2}x) are symmetric for ±∞ and for 0^±, we can have a multiplicity result for problem (5.1). So we assume that:
H(f)_3: f : Z × R −→ R is a function such that f(z, 0) = 0 a.e. on Z, it satisfies hypotheses H(f)_1(i), (ii), (iii), and
(iv) lim inf_{|x|→∞} f(z, x)/(|x|^{p−2}x) ≥ ϑ(z) uniformly for a.a. z ∈ Z with ϑ ∈ L^∞(Z)_+ as in hypothesis H(f)_1(iv);
(v) lim sup_{x→0} f(z, x)/(|x|^{p−2}x) ≤ η(z) uniformly for a.a. z ∈ Z with η ∈ L^∞(Z)_+ as in
THEOREM 5.1.8 If hypotheses H(f )3 hold, then problem (5.1) has two nontrivial solutions x, u ∈ C01 (Z) such that u(z) ≤ 0 ≤ x(z) for all z ∈ Z. Also if we strengthen hypotheses H(f )3 we can have a stronger conclusion. H(f )4 : f : Z × R −→ R is a function such that f (z, 0) = 0 a.e. on Z, it satisfies hypotheses H(f )3 (i)−→(v), and (vi) for almost all z ∈ Z and all x ∈ R, f (z, x)x ≥ 0 (sign condition). THEOREM 5.1.9 If hypotheses H(f )4 hold, then problem (5.1) has two nontrivial solutions x, u such that x, −u ∈ intC01 (Z)+ . Next turn our attention to the Neumann problem: ⎧ ⎫
p−2 Dx(z) = f z, x(z) + h(z) a.e. on Z, ⎬ ⎨ −div Dx(z) ⎩
∂x ∂n
⎭
= 0 on ∂Z, 1 < p < ∞, h ∈ L∞ (Z)
.
(5.20)
In this case the hypotheses on the nonlinearity f (z, x) are the following. H(f )5 : f : Z × R −→ R is a function such that (i) For all x ∈ R, z −→ f (z, x) is measurable. (ii) For almost all z ∈ Z, x −→ f (z, x) is continuous. (iii) |f (z, x)| ≤ α(z)+c|x|r−1 for a.a. z ∈ Z, all x ∈ R and with α ∈ L∞ (Z)+ , c > 0, 1 ≤ r < p∗ . (iv) If F (z, x) =
Z
f (z, r)dr, then the limits
lim F (z, x) = F± (z) exist for
x→±∞
almost all z ∈ Z, and there is M > 0 such that F (z, x) ≥ F+ (z) for a.a. z ∈ Z, all x ≥ M and F (z, x) ≥ F− (z) for a.a. z ∈ Z, all x ≤ −M . 1 (v) There exist ξ ∈ L (Z) and x0 = 0 such that F (z, x) ≤ ξ(z) for a.a. z ∈ Z and also Z F (z, x0 )dz > 0.
Let h ∈ L∞ (Z) be such that problem.
Z
h(z)dz =0. We consider the following auxiliary
⎧ ⎫
p−2 Dx(z) = h(z) a.e. on Z, ⎬ ⎨ −div Dx(z) ⎩
∂x ∂n
= 0 on ∂Z
⎭
.
(5.21)
PROPOSITION 5.1.10 If h ∈ L∞ (Z) and Z h(z)dz = 0, then problem (5.21) has a unique solution v0 ∈ C01 (Z) such that Z v0 (z)dz = 0.
PROOF: Let ψ : W 1,p (Z) −→ R be the C 1 –function defined by
1 1 ψ(x) = Dxpp − hxdz for all x ∈ W 1,p (Z). p p Z We consider the direct sum decomposition W 1,p (Z) = R ⊕ V, where V = v ∈ W 1,p (Z) : Z v(z)dz = 0 . Because of the hypothesis on h we have ψ R = 0. Let ψ = ψ V (the restriction of ψ on V ). From the Poincar´e–Wirtinger
inequality, we see that ψ is coercive and also it is clear that ψ : V −→ R is weakly lower semicontinuous. So by the Weierstrass theorem (see Theorem 2.1.10(a)), we can find v0 ∈ V such that −∞ < m0 = inf ψ = ψ(v0 ), V
⇒ ψ (v0 ) = 0 in V ∗ .
(5.22)
Note that if x ∈ W 1,p (Z), then x = x + x uniquely with x ∈ R, x ∈ V , and
1 1 ψ(x) = ψ(x + x) = Dxpp − hxdz = ψ(x). p p Z Hence if pV : W 1,p (Z) −→ V is a projection map (i.e., pV (x) = pV (x + x) = x ∈ V ), then ψ = ψ ◦ pV . Thus by the chain rule, we obtain
ψ (x) = p∗V ψ pV (x)
for all x ∈ W 1,p (Z).
Let ·, ·V denote the duality brackets for the pair (V, V ∗ ) and ·, · the duality brackets for the pair W 1,p (Z), W 1,p (Z)∗ . For any x, y ∈ W 1,p (Z) , we have
ψ (x), y = ⇒ ψ (v0 ), y =
.
/ / .
p∗V ψ pV (x) , y = ψ pV (x) , pV (y) V . / . / ψ (v0 ), pV (y) = ψ (v0 ), y = 0 (see (5.22)). V
V
Because y ∈ W 1,p (Z) was arbitrary, it follows that ψ (v0 ) =0 in W 1,p (Z)∗ . Hence A(v0 ) = h, where A : W 1,p (Z) −→ W 1,p (Z)∗ is defined by
A(x), y = Dxp−2 (Dx, Dy)RN dz
(5.23)
for all x, y ∈ W 1,p (Z).
Z
For every ϑ ∈ Cc∞ (Z) we have
Dv0 p−2 (Dv0 , Dϑ)RN dz= hϑdz Z
Z
(see (5.23)).
(5.24)
From the representation theorem for the elements of W −1,p (Z)=W01,p (Z)∗ , we infer that div(Dxp−2 Dx)∈W −1,p (Z). Let ·, ·0 denote the duality brackets for
1,p the pair W0 (Z), W −1,p (Z) . From (5.24), we have
−div(Dv0 p−2 Dv0 ), ϑ
But
Cc∞ (Z)
0
= h, ϑ0
for all ϑ ∈ Cc∞ (Z).
(5.25)
W01,p (Z).
is dense in So from (5.25) it follows that
−div Dv0 (z)p−2 Dv0 (z) = h(z) a.e. on Z.
Also using Theorem 4.3.72 (nonlinear Green’s identity), for every y ∈ W 1,p (Z), we have
∂v0 Dv0 p−2 (Dv0 , Dy)RN dz + y div(Dv0 p−2 Dv0 )dz = , γ0 (y) , ∂np Z Z ∂Z (5.26) where ∂v0 /∂np = Dv0 p−2 (Dx, n)RN on ∂Z, γ0 is the trace map and ·, ·∂Z de
notes the duality brackets for the pair W (1/p),p (∂Z), W −(1/p),p (∂Z) . Combining (5.24) through (5.26), we infer that ∂v0 , γ0 (y) = 0. (5.27) ∂np ∂Z
Recall that γ0 W 1,p (Z) =W −(1/p ),p (∂Z). So from (5.27) we obtain ∂v0 =0 ∂np
in the sense of traces.
Therefore x ∈ W 1,p (Z) solves problem (5.21). Moreover, from Theorem 4.3.35 and Remark 4.3.36, we have that x ∈ C 1 (Z) and so the Neumann boundary condition is understood in a pointwise sense. Using this auxiliary result, we can prove an existence theorem for the nonlinear Neumann problem (5.20). THEOREM 5.1.11 If hypotheses H(f )5 hold and h ∈ L∞ (Z) satisfies 0, then problem (5.20) has a nontrivial solution x ∈ C 1 (Z).
Z
h(z)dz =
PROOF: We consider the C 1 -Euler functional ϕ : W 1,p (Z) −→ R for problem (5.20), defined by
1 ϕ(x) = Dxpp − F z, x(z) dz − h(z)x(z)dz p Z Z 1 ≥ Dxpp − ξ1 − c1 Dxp for some c1 > 0, (5.28) p (see hypothesis H(f )5 (v)). Here we have used the fact that x=x + x with x ∈ R, x ∈ V , that Z h(z)dz = 0, and the Poincar´e–Wirtinger inequality. From (5.28) it is clear that ϕ is bounded below.
Let −∞ < m = inf_{W^{1,p}(Z)} ϕ and consider a minimizing sequence {x_n}_{n≥1} ⊆ W^{1,p}(Z);
that is, ϕ(xn ) ↓ m as n → ∞. If {xn }n≥1 ⊆W 1,p (Z) is bounded in W 1,p (Z), then we w may assume that xn −→ x in W 1,p (Z) and xn −→ x in Lp (Z). Exploiting the weak lower semicontinuity of the Euler functional ϕ, we have m = ϕ(x) and ϕ (x) = 0. Note that ϕ(x) ≤ ϕ(x0 ) < 0 = ϕ(0) and so x = 0. From the equation ϕ (x) = 0, as in the proof of Proposition 5.1.10, we infer that x solves problem (5.20) and x ∈ C 1 (Z). Therefore, if m is not attained, then {xn }n≥1 ⊆ W 1,p (Z) is unbounded. We have xn = xn + xn , with xn ∈ R, xn ∈ V, n ≥ 1. From (5.28) and Young’s inequality, we obtain ϕ(xn ) ≥ c2 Dxn pp − c3
for some c2 , c3 > 0, all n ≥ 1,
⇒ {xn }n≥1 ⊆ W 1,p (Z) is bounded (Poincar´e–Wirtinger inequality). Because {xn }n≥1 ⊆ W 1,p (Z) is unbounded, we must have |xn | −→ ∞ as n → ∞. Suppose that xn −→ +∞ as n → ∞. Because {xn }n≥1 ⊆ W 1,p (Z) is bounded, we may assume that w
xn −→ x
in W 1,p (Z),
xn −→ x
in Lp (Z), xn (z) −→ x(z)
a.e. on Z
and |xn (z)| ≤ k(z) for a.a. z ∈ Z, all n ≥ 1, with k ∈ L (Z). Hence p
xn (z) = xn + xn (z) ≥ xn − k(z) ⇒ xn (z) −→ +∞
a.e. on Z,
a.e. on Z as n → ∞.
Therefore we have
1
m = lim F z, xn (z) dz − h(z)xn (z)dz Dxn pp − n→∞ p Z Z
1 p ≥ lim inf Dxn p − lim sup F z, xn (z) dz − lim h(z)xn (z)dz n→∞ p n→∞ Z n→∞
Z 1 ≥ Dxpp − F+ (z)dz − h(z)x(z)dz (by Fatou’s lemma) p Z Z
= ψ(x) − F+ (z)dz. (5.29) Z
By virtue of Proposition 5.1.10 and because xn −→ +∞, for n ≥ 1 large we have xn + v0 (z) ≥ M
for all z ∈ Z.
So by hypothesis H(f )5 (iv) we have
F z, xn + v0 (z) ≥ F+ (z) for a.a. z ∈ Z and for all n ≥ 1 large. Using this in (5.29), we obtain for all n ≥ 1 large
m ≥ ψ(x) − F z, xn + v0 (z) dz
Z
≥ ψ(v0 ) − F z, xn + v0 (z) dz (see the proof of Proposition 5.1.10) Z
= ϕ(xn + v0 ) (because h(z)dz = 0), Z
⇒ m = ϕ(xn + v0 ).
Therefore m is attained, a contradiction. Similarly, if xn −→ −∞, then again we reach a contradiction. It follows that m is attained; that is, we can find x ∈ W 1,p (Z) such that m = ϕ(x). Because ϕ(x) ≤ ϕ(x0 ) < 0 = ϕ(0), we have x = 0. Also ϕ (x) = 0 from which as we already mentioned we infer that x solves (5.20) and x ∈ C 1 (Z) (see the proof of Proposition 5.1.10). Next we consider a semilinear periodic system. Using the second deformation theorem (see Theorem 4.6.1), we prove a multiplicity result for such systems. So the problem under consideration is the following:
) * −x (t) − A(t)x(t) = ∇F t, x(t) a.e. on T = [0, b], . (5.30) x(0) = x(b), x (0) = x (b) Let GLs (RN ) be the group of N × N symmetric invertible matrices. In what follows for any A1 , A2 ∈ GLs (RN ), we set A1 ≤ A2 if and only if A2 − A1 is positive definite and A1 < A2 if and only if A2 − A1 is strictly positive definite. In problem (5.30) A ∈ L∞ T, GLs (RN ) , but note that we do not require that A(t) ≥ 0 for almost all t ∈ T . Also F (t, x) is a Carath´eodory function on T ×RN into R and for almost all t ∈ T, F (t, ·)∈C 1 (RN ). By ∇F (t, x) we denote the gradient of F (t, ·). We consider the Sobolev space
1,2
Wper (0, b), RN = x ∈ W 1,2 (0, b), RN : x(0) = x(b) .
Because W 1,2 (0, b), RN is embedded continuously (in fact compactly) in C(T, RN ), we see that the evaluations at t = 0 and t = b make sense and so 1,2 Wper (0, b), RN is well–defined. This is a Hilbert space with inner product
b
b
(x, y)1,2 = x(t), y(t) RN dt + x (t), y (t) RN dt, 0
0
1,2 (0, b), RN . The corresponding norm is of course the usual for all x, y ∈ Wper Sobolev norm x2 = x21,2 = x22 + x 22 .
PROPOSITION 5.1.12 If A ∈ L∞ T, GLs (RN ) , then there exists a sequence λn = λn (A) n≥1 ⊆ R such that
and
λ1 ≤ λ2 ≤ · · · ≤ λn ≤ · · · , λn = λn (A) −→ +∞ ) * −x (t) − A(t)x(t) = λn x(t) a.e. on T = [0, b], x(0) = x(b), x (0) = x (b)
(5.31)
has a nontrivial solution. Moreover, we have 1,2
(5.32) (0, b), RN = H− ⊕ H0 ⊕ H+ Wper 1,2
N with H− = span x ∈ Wper (0, b), R : −x − A(·)x = λx, λ < 0
H0 = ker − x − A(·)x 1,2
H+ = span x ∈ Wper (0, b), RN : −x − A(·)x = λx, λ > 0 . and the subspaces H− and H0 are finite-dimensional.
PROOF: Let λ∗ > A∞ and for h ∈ L2 (T, RN ) we consider the following auxiliary periodic problem.
) * −x (t) + λ∗ I − A(t) x(t) = h(t) a.e. on T = [0, b], . (5.33) x(0) = x(b), x (0) = x (b)
It is easy to see that (5.33) has a unique solution in C(T, RN ) ∩ W 2,2 (0, b), RN . We denote this solution by K(h). So we have defined a map K : L2 (T, RN ) −→ L2 (T, RN ), which clearly is linear, monotone, and self-adjoint, and because of the
1,2 compact embedding of Wper (0, b), RN into L2 (T, RN ), it is also compact. Hence by virtue of Theorem 3.1.37, we can find {µn }n ≥ 1 ⊆ R+ such that µn −→ 0 and the µn s are eigenvalues of K. Set λn = (1/µn ) − λ∗ , n ≥ 1. Then the λn s with the corresponding eigenfunctions of K satisfy (5.31). If by En (A) we denote the corresponding eigenspace and H− = H0 =
n :
Ek (A) if λ1 ≤ λ2 ≤ · · · ≤ λn < λn+1 = 0,
k=1 m :
Ek (A) if λn+1 = · · · = λm = 0 < λm+1
k=n+1
and H+ =
:
Ek (A),
k≥n+1
then we have the orthogonal direct sum decomposition (5.32). Note that if
b
x (t), y (t) RN − A(t)x(t), y(t) RN dt qA (x, y) = 0
for all x, y ∈ (0, b), R , then H− is the subspace where qA is strictly negative definite, H0 is the subspace where qA is null and H+ is the subspace where qA is strictly positive definite. Finally because µ ↓ 0, then only a finite number of λn = (1/µn ) − λ∗ is less than or equal to zero and so we conclude that dim H− , dim H0 < +∞. 1,2 Wper
N
We have the following unique continuation property for the nontrivial solutions of the eigenvalue problem (5.31).
LEMMA 5.1.13 If A ∈ L∞ T, GLs (RN ) and x ∈ C 1 (T, RN ) is a nontrivial solution of (5.31), then x has finite number of zeros. PROOF: Suppose x has an infinite number of distinct zeros. Without any loss of generality, we may assume that the sequence of zeros {tn }n≥1 ⊆ T satisfies tn < tn+1
N for all n ≥ 1 and tn −→ t∗ . Let x(t) = xk (t) k=1 . By Rolle’s theorem, we can find rn ∈ (tn , tn+1 ), n ≥ 1, such that xk (rn ) = 0
for all n ≥ 1.
∗
Evidently rn −→ t as n → ∞. Hence xk (t∗ ) = 0 for all k ∈ {1, . . . , N }. Because xk (t∗ ) = xk (t∗ ) = 0 for all k ∈ {1, . . . , N }, from the theory of linear differential equations, we conclude that x ≡ 0, a contradiction. The next lemma gives some useful inequalities satisfied by the elements in the component spaces H− and H+ .
LEMMA 5.1.14 (a) There exists β1 > 0 such that x 22 −
b
A(t)x(t), x(t)
RN
0
dt ≥ β1 x2
for all x ∈ H+ .
(b) There exists β2 > 0 such that x 22 −
b
A(t)x(t), x(t)
RN
0
dt ≤ −β2 x2
for all x ∈ H− .
PROOF: (a) Let ψ : H+ −→ R be the C 1 –function defined by ψ(x) = x 22 −
b
A(t)x(t), x(t)
0
RN
dt.
Evidently, if x ∈ H+ , then x ∈ L2 (T, RN ) and so by Green’s identity, we have x 22 = −x , x .
1,2
∗ 1,2 We denote for the pair Wper (0, b), RN , Wper (0, b), RN the duality brackets ·, ·. So from the definition of H+ (see Proposition 5.1.12), we have / . (5.34) ψ(x) = −x , x − A(x), x ≥ 0,
where A ∈ L L2 (T, RN ), L2 (T, RN ) is defined by A(x)(·) = A(·)x(·) (the Nemitsky operator corresponding to A(·)). If the inequality in (a) was not true, then we could find {xn }n≥1 ⊆ H+ with xn =1 such that ψ(xn ) ↓ 0 as n → ∞. By passing to a
w 1,2 suitable subsequence if necessary, we may assume that xn −→ x in Wper (0, b), RN and xn −→ x in C(T, RN ). So in the limit as n → ∞, we obtain ψ(x) = x 22 −
b
A(t)x(t), x(t)
RN
0
dt ≤ 0,
which in conjunction with (5.34), implies ψ(x) = x 22 −
b
A(t)x(t), x(t)
0
RN
dt = 0.
(5.35)
Because x ∈ H+ , from (5.35) it follows that x = 0. Then xn , xn −→ 0 in 1,2 L2 (T, RN ) and so xn −→ 0 in Wper (0, b), RN as n → ∞, a contradiction to the fact that xn = 1 for all n ≥ 1. (b) The proof of the inequality for the elements in H− is done similarly. We assume that dimH− = 0 and dimH0 = 0. The hypotheses on the potential function F (t, x) are the following. H(F): F : T × RN −→ R is a function such that F (t, 0) = 0 a.e. on T and (i) For every x ∈ RN , t −→ F (t, x) is measurable.
(ii) For almost all t ∈ T, F(t, ·) ∈ C¹(R^N).
(iii) ||∇F(t, x)|| ≤ α(t) for a.a. t ∈ T, all x ∈ R^N and with α ∈ L¹(T)_+.
(iv) There exists a function F_∞ ∈ L¹(T) such that F(t, x) −→ F_∞(t)
a.e. on T
b 0
F∞ (t)dt ≤ 0 and
as x −→ ∞
and if xn ∈ C 1 (T, RN ) and xn (t) −→ ∞ for almost all t ∈ T , then
b
∇F t, xn (t) , xn (t) RN dt −→ 0 as n → ∞. 0
(v) There exists δ > 0 such that for a.a. t ∈ T and all x ≤ δ, we have F (t, x) > 0
for x = 0.
(vi) If λm > 0 is the first strictly positive eigenvalue for problem (5.31), then F (t, x) ≤
λm x2 2
for a.a. t ∈ T and all x ∈ RN .
EXAMPLE 5.1.15 The following functions satisfy hypotheses H(F ). For simplicity we drop the t-dependence: λm (x2 − x3 ) if x ≤ 1 ln(x2 + 1) 2 , F2 (x) = c F1 (x) = , and λm 1 λm x2 + 1 − 2 if x > 1 2 x 2
F3 (x) = cx2 e−x
with c ≤ (λm /2).
1,2 (0, b), RN −→ R for problem (5.31), We consider the Euler functional ϕ : Wper defined by
b
b
1 ϕ(x) = x 22 − F t, x(t) dt A(t)x(t), x(t) RN dt − 2 0 0
1,2 N for all x ∈ Wper (0, b), R .
PROPOSITION 5.1.16 If A ∈ L∞ T, GLs (RN ) , hypotheses H(F ) hold and c = b − 0 F∞ (t)dt, then ϕ satisfies the Cc -condition.
1,2 (0, b), RN be a sequence such that PROOF: Let {xn }n≥1 ⊆ Wper and (1 + xn )ϕ (xn ) −→ 0 as n → ∞. ϕ(xn ) −→ c
∗
1,2
1,2 (0, b), RN , Wper (0, b), RN be defined by Let V ∈ L Wper
V (x), y= 0
b
x (t), y (t)
RN
dt
1,2
for all x, y ∈ Wper (0, b), RN .
Evidently V is monotone, hence maximal monotone. Also let N : L1 (T, RN ) −→ L (T, RN ) be the Nemitsky operator corresponding to the nonlinearity ∇F ; that is, 1
N(x)(·) = ∇F(·, x(·)) for all x ∈ L¹(T, R^N).
Clearly N is continuous Moreover, exploiting the compact embed and bounded. 1,2 N ding into C(T, RN ) (hence into L1 (T, RN ) too), we deduce that of Wper (0, b), R is completely continuous, thus compact (see Proposition 3.1.3). We N 1,2
Wper (0,b),RN
have
ϕ (xn ) = V (xn ) − A(xn ) − N (xn ) for all n ≥ 1,
2 N 2 N where as before A ∈ L L (T, R ), L (T, R ) is defined by A(x)(·) = A(·)x(·). We
1,2 show that the sequence {xn }n≥1 ⊆ Wper (0, b), RN is bounded. We proceed by contradiction. So suppose that (by passing to a subsequence if necessary), we have xn −→ +∞. We set yn =xn /xn for all n ≥ 1. We may assume that w 1,2
yn −→ y in Wper (0, b), RN and yn −→ y in C(T, RN ).
1,2 From the choice of the sequence {xn }n≥1 ⊆ Wper (0, b), RN , we have | ϕ (xn ), yn − y | ≤ εn with εn ↓ 0,
b
⇒ V (xn ), yn − y − A(t)xn (t), yn (t) − y(t) RN dt 0
b
−
N (xn )(t), yn (t) − y(t)
0
RN
dt ≤ εn .
We divide by xn and we obtain V (yn ), yn − y −
0
b
− 0
b
A(t)yn (t), yn (t) − y(t) RN dt
N (xn )(t) , yn (t) − y(t) RN dt ≤ εn . xn
(5.36)
Note that
and
A(t)yn (t), yn (t) − y(t) RN dt −→ 0 as n → ∞ 0
(because A ∈ L∞ T, GLs (RN ) )
b
N (xn )(t) , yn (t) − y(t) RN dt −→ 0 as n → ∞ xn 0 b
(see hypothesis H(F)(iii)). So from (5.36), we obtain V (yn ), yn − y −→ 0. Because V is maximal monotone, it follows that V (yn ), yn −→ V (y), y , ⇒ yn 2 −→ y 2 .
w 1,2 Because yn −→ y in Wper (0, b), RN , we infer that yn −→ y in L2 (T, RN ) and so finally
y_n −→ y in W_per^{1,2}((0, b), R^N).
yn −→ y in Wper (0, b), RN .
1,2 Recall that for all v ∈ Wper (0, b), RN , we have
b
V (yn ), v − A(t)yn (t), v(t) RN dt
(5.37)
0
b
− 0
N xn (t) , v(t) RN dt ≤ εn xn
with εn ↓ 0.
Passing to the limit as n → ∞ and using (5.37) and hypothesis H(F )(iii), we obtain
b
1,2
V (y), v = for all v ∈ Wper A(t)y(t), v(t) RN dt (0, b), RN 0
∗ 1,2
(i.e., y ∈ H0 ). ⇒ V (y) = A(y) in Wper (0, b), RN
Because yn = 1 for all n ≥ 1, from (5.37), we have y = 1 (i.e., y = 0). Moreover, from Lemma 5.1.13, we have y(t) = 0 a.e. on T and so xn (t) −→ ∞ a.e. on T . It follows that
a.e. on T. (5.38) F t, xn (t) −→ F∞ (t)
1,2 From the choice of the sequence {xn }n≥1 ⊆ Wper (0, b), RN , we have
b
b
V (xn ), xn − A(t)xn (t), xn (t) N dt − ∇F t, xn (t) , xn (t) N dt R R ≤ εn , ⇒ xn 22 −
0
0
A(t)xn (t), xn (t)
0
≤ εn .
b
dt − RN
∇F t, xn (t) , xn (t) RN dt
b
0
Because of hypothesis H(F )(iv), we have
b
∇F t, xn (t) , xn (t) RN dt −→ 0 0
⇒ xn 22 −
b
A(t)xn (t), xn (t)
0
RN
as n → ∞,
dt −→ 0
as n → ∞.
(5.39)
Hypotheses H(F )(iii) and (iv) imply that we can find ϑ ∈ L1 (T )+ such that |F (t, x)| ≤ ϑ(t)
for a.a. t ∈ T, all x ∈ RN .
(5.40)
From (5.38), (5.40), and the dominated convergence theorem, it follows that
b
b
F t, xn (t) dt −→ F∞ (t)dt as n → ∞. (5.41) 0
0
By hypothesis ϕ(xn ) −→ c and so given ε > 0, we can find n0 = n0 (ε) ≥ 1 such that for all n ≥ n0 we have c − ε ≤ ϕ(xn ) ≤ c + ε,
b
1 b 1 ⇒ c − ε ≤ xn 22 − A(t)xn (t), xn (t) RN dt − F t, xn (t) dt ≤ c + ε. 2 2 0 0
If we pass to the limit as n → ∞ and we use (5.39) and (5.41), we obtain
b c − ε ≤ − F∞ (t)dt ≤ c + ε. 0
But ε > 0 was arbitrary. So we let ε ↓ 0 to conclude that
b c=− F∞ (t)dt, 0
1,2 a contradiction. Therefore {xn }n≥1 ⊆ Wper (0, b), RN is bounded. Hence we may assume that w 1,2
xn −→ x in Wper (0, b), RN and xn −→ x in C(T, RN ). Then as above, we obtain lim V (xn ), xn − x = 0,
n→∞
from which because V is maximal monotone, we conclude that b xn −→ x in 1,2 Wper (0, b), RN . Therefore ϕ satisfies the Cc -condition for c = − 0 F∞ (t)dt. REMARK 5.1.17 Using hypotheses H(F )(iii) and (iv) and because dim H− =0, we can easily check that ϕ is bounded below. Using Lemma 5.1.14, we obtain the following.
PROPOSITION 5.1.18 If A ∈ L∞ T, GLs (RN ) , hypotheses H(F ) hold and c < 0, then ϕ satisfies the P Sc -condition. Now we are ready for the multiplicity result concerning problem (5.30).
THEOREM 5.1.19 If A ∈ L∞ T, GLs (RN ) and hypotheses H(F ) hold, then problem (5.30) has at least two nontrivial solutions x0 , x1 ∈ C 1 (T, RN ). PROOF: We know that ϕ is bounded below (see Remark 5.1.17). So 1,2
−∞ < m = inf{ϕ(x) : x ∈ Wper (0, b), RN }. By virtue of hypothesis H(F )(v), if x∈H0 and x(t) ≤ δ for all t ∈ T , then
b F∞ (t)dt ϕ(x) < 0 = ϕ(0) ≤ − 0
b
⇒ m < 0 = ϕ(0) ≤ −
F∞ (t)dt. 0
Because 5.1.18, ϕ satisfies the P Sm -condition and so we can find
of Proposition 1,2 x0 ∈ Wper (0, b), RN } such that ϕ(x0 ) = m < 0 = ϕ(0)
and
ϕ (x0 ) = 0
(see Theorem 4.1.28).
(5.42)
From the first relation in (5.42) we infer that x0 = 0. Suppose that x0 and 0 are the only critical points of ϕ. By virtue of hypothesis H(F )(v), we can find r > 0 small such that
where B r (0)={x ∈
(5.43) ξ = sup{ϕ(x) : x ∈ ∂Br (0) ∩ H0 } < 0,
N (0, b), R :x ≤ r}. On the other hand, if x ∈ H+ , then
1,2 Wper
x 22
−
b
A(t), xn (t), x(t)
0
RN
dt ≥ λm x22 ,
⇒ ϕ(x) ≥ 0 for all x ∈ H+ (see hypothesis H(F )(vi)).
1,2 (0, b), RN : γ ∂B (0)∩H = I ∂B (0)∩H and conLet Γ = γ ∈ C B r (0), Wper r r 0 0
1,2 (0, b), RN defined by sider the map γ0 : B r (0) −→ Wper x0 if x < r2
2(r−x) rx , γ0 (x) = h , x if x ≥ r2 r where h(t, x) is the homotopy postulated by Theorem 4.6.1 (note that ϕ satisfies the P Sc -condition for all c∈[m, 0), see Proposition 5.1.18). Note that for all x ∈ B r (0) ∩ H0 with x =
γ0 (x) = h(1, 2x)
r . 2
We have 2x = r and so ϕ(2x) < 0 (see (5.43)). Because by hypothesis x0 is the 1,2 only minimizer of ϕ on Wper (0, b), RN (recall that m < 0 = ϕ(0)), from Theorem 4.6.1 we infer that γ0 (x) = h(1, 2x) = x0 ,
1,2 (0, b), RN . which proves the continuity of γ0 ; that is, γ0 ∈ C B r (0) ∩ H0 , Wper Moreover, for all x∈∂Br ∩ H0 , we have γ0 (x)=h(0, x)=x ⇒ γ0 ∂B ∩H , r
⇒ γ0 ∈Γ.
(see Theorem 4.6.1),
(5.44)
0
Because h is ϕ-decreasing (see Theorem 4.6.1), we have
ϕ h(s, x) ≤ ϕ h(t, x) for all t < s and all x ∈ ϕ0 \ {0}. So from (5.43) and (5.44), it follows that
for all x ∈ B r (0) ∩ H0 . (5.45) ϕ γ0 (x) < 0
1,2 The sets ∂Br ∩ H0 , B r (0) ∩ H0 , H+ are linking in Wper (0, b), RN via I (see Definition 4.1.20 and Example 4.1.21(b)). So
for all γ ∈ Γ. γ B r (0) ∩ H0 ∩ H+ = ∅ Because 0 = inf ϕ, we have H+
sup ϕ γ(x) : x ∈ B r (0) ∩ H0 ≥ 0
⇒ sup ϕ γ0 (x) : x ∈ B r (0) ∩ W ≥ 0.
for all γ ∈ Γ,
Comparing (5.45)
and (5.46), we reach a contradiction. 1,2 Wper (0, b), RN such that
(5.46) So we can find x1 ∈
x1 = x0 , x1 = 0 and ϕ (x1 ) = 0.
1,2 (0, b), RN be a critical point of ϕ; that is, ϕ (x) = 0. We Now let x ∈ Wper have V (x) − A(x) = N (x). (5.47)
∗ Let v∈C 1 (0, b), RN . Because −x ∈W −1,2 (0, b), RN = W01,2 (0, b), RN and
if by ·, ·0 we denote the duality brackets for the pair W01,2 (0, b), RN ,
−1,2 (0, b), RN , we have W0
b
b
A(t)x(t), v(t) RN dt = ∇F t, x(t) , v(t) RN dt, V (x), v − 0 .0 / ⇒ −x , v 0 − A(x), x = N (x), x0 . (5.48)
0
is dense in W01,2 (0, b), RN , from (5.48) we infer that Because Cc (0, b), R ⎫ ⎧
⎨ −x (t) − A(t)x(t) = ∇F t, x(t) a.e. on T, ⎬ , (5.49) ⎭ ⎩ x(0) = x(b) N
⇒ x ∈ L1 (T, RN ) (i.e., x ∈ W 1,1 (0, b), RN ⊆ C(T, RN )), ⇒ x ∈ C 1 (T, RN ).
1,2 (0, b), RN , via integration by parts, we have Also if y ∈ Wper
b
x (t), y (t) RN dt V (x), y = 0
b
x (b), y(b) RN − x (0), y(0) RN − x (t), y(t) RN dt, 0 . / ⇒ − A(x), y − N (x), y . /
= x (b), y(b) RN − x (0), y(0) RN − A(x), y − N (x), y =
(see (5.47) and (5.49)),
⇒ x (0), y(0) RN = x (b), y(b) RN
1,2
for all y ∈ Wper (0, b), RN ,
⇒ x (0) = x (b).
Therefore x ∈ C 1 (T, RN ) solves problem (5.30). So we conclude that both x0 , x1 ∈ C 1 (T, RN ) are nontrivial distinct solutions of (5.30). Continuing with multiplicity results, we use Theorem 4.1.32, to produce two nontrivial solutions for problem (5.1). If we consider the nonlinear eigenvalue problem
⎫ ⎧ p−2 Dx(z) = λ|x(z)|p−2 x(z) a.e. on Z, ⎬ ⎨ −div Dx(z) , (5.50) ⎭ ⎩ x ∂Z = 0, λ > 0, 1 < p < ∞ then (5.50) has a first eigenvalue λ1 > 0, which is isolated (see Proposition 4.3.46) and simple (see Proposition 4.3.44). If by u1 ∈ C01 (Z) we denote a normalized
eigenfunction corresponding to λ1 > 0, then Ru1 is the eigenspace corresponding to λ1 > 0. If V is a topological complement of Ru1 (i.e., W01,p (Z) = Ru1 ⊕ V ), then (5.51) λV = inf Dupp : u ∈ V, up = 1 > λ1 . We consider the following topological complement of Ru1 ,
V = u ∈ W01,p (Z) : u(z)u1 (z)p−1 dz = 0 . Z
Evidently if p = 2, then V = (Ru1 )⊥ . The hypotheses on the nonlinearity f (z, x) of (5.1) are the following. H(f )6 : f : Z × R −→ R is a function such that f (z, x) = 0 a.e. on Z and (i) For all x ∈ R, z −→ f (z, x) is measurable. (ii) For almost all z ∈ Z, x −→ f (z, x) is continuous. (iii) For almost all z ∈ Z and all x ∈ R, we have |f (z, x)| ≤ α(z) + c|x|r−1 with α ∈ L∞ (Z)+ , c > 0 Np if N > p ∗ N −p . 1 ≤ r
and
(iv) There exist δ > 0 and λ ∈ (λ1 , λV ) such that for a.a. z ∈ Z and all |x| ≤ δ, we have λ1 |x|p ≤ pF (z, x) ≤ λ|x|p , x where F (z, x) = 0 f (z, r)dr (the potential function corresponding to f (z, x)); (v)
lim
|x|→∞
pF (z,x) |x|p
= λ1 and lim
most all z ∈ Z.
|x|→∞
pF (z, x) − λ1 |x|p ) = −∞ uniformly for al-
REMARK 5.1.20 Hypothesis H(f )6 (iv) implies that problem (5.1) is resonant near zero at the first eigenvalue λ1 > 0 from the right. On the other hand by virtue of hypothesis H(f )6 (v), problem (5.1) is resonant near infinity from the left. Therefore we are dealing with a problem which is resonant both near zero and near infinity and we cross λ1 as we move from 0 to infinity. The Euler functional ϕ : W01,p (Z) −→ R is given by
1 F z, x(z) dz. ϕ(x) = Dxpp − p Z
1,p 1 We know that ϕ ∈ C W0 (Z) . PROPOSITION 5.1.21 If hypotheses H(f )6 hold, then ϕ is weakly coercive; that is, ϕ(x) −→ +∞ as x −→ +∞.
PROOF: We proceed by contradiction. So suppose that ϕ is not weakly coercive. Then we can find a sequence {xn }n≥1 ⊆ W01,p (Z) and M > 0 such that xn −→ ∞
and
ϕ(xn ) ≤
for all n ≥ 1.
(5.52)
Set yn = xn /xn , n ≥ 1. We may assume that w
yn −→ y in W01,p (Z)
and
yn −→ y in Lp (Z).
From (5.52) we have 1 Dyn pp − p
Z
F z, xn (z) M dz ≤ xn p xn p
for all n ≥ 1.
(5.53)
From hypotheses H(f )6 (iii), (v), and the mean value theorem, we have |F (z, x)| ≤ α1 (z) + c1 |x|p for a.a. z ∈ Z, with α1 ∈ L∞ (Z)+ , c1 > 0,
F z, xn (z) α1 (z) ⇒ ≤ + c1 |yn (z)|p a.e. on Z for all n ≥ 1. (5.54) xn p xn p Because of (5.54) and the Dunford–Pettis theorem, we may assume that
F ·, xn (·) w −→ h in L1 (Z) as n → ∞. xn p
(5.55)
For every ε > 0 and every n ≥ 1, we consider the set
F z, xn (z) 1 1 Cε,n = z ∈ Z : xn (z) = 0, (λ1 − ε) ≤ ≤ (λ1 + ε) . p p |xn (z)| p Due to hypothesis H(f )6 (v), we have χCε,n (z) −→ 1 Note that
a.e. on {y = 0}.
(1 − χCε,n ) F ·, xn (·) 1 −→ 0 as n → ∞, p L ({y=0}) xn
F ·, xn (·) w −→ h in L1 ({y = 0}) (see (5.55)). ⇒ χCε,n (·) xn p
From the definition of the set Cε,n , we have
F z, xn (z) F z, xn (z) 1 = χ (z) |yn (z)|p (λ1 − ε)|yn (z)|p ≤ χCε,n (z) C ε,n p xn p |xn (z)|p 1 ≤ (λ1 + ε)|yn (z)|p a.e. on Z. p Taking weak limits in L1 ({y = 0}) and using Mazur’s lemma, we obtain 1 1 (λ1 − ε)|yn (z)|p ≤ h(z) ≤ (λ1 + ε)|yn (z)|p p p Because ε > 0 was arbitrary, it follows that
a.e. on {y = 0}.
h(z) =
λ1 |y(z)|p p
a.e. on {y = 0}.
(5.56)
On the other hand it is clear from (5.54) that h(z) = 0
a.e. on {y = 0}.
(5.57)
From (5.56) and (5.57) it follows that h(z) =
λ1 |y(z)|p p
a.e. on Z.
(5.58)
We return to (5.53) and pass to the limit as n → ∞. Using (5.55) and (5.58), we obtain Dypp ≤ λ1 ypp , ⇒ y = 0 or y = ±u1
(see Remark 4.3.43).
If y = 0, then we have yn −→ 0 in W01,p (Z), a contradiction to the fact that yn = 1, n ≥ 1. If y = ±u1 , then |xn (z)| −→ +∞ for a.a. z ∈ Z (see Theorem 4.3.47). By virtue of hypothesis H(f )6 (v), we have
pF z, xn (z) − λ1 |xn (z)|p −→ −∞ a.e. on Z as n → ∞. (5.59) We have
λ1
1 λ1 F z, xn (z) − Dxn pp − |xn (z)|p dz − xn pp p p p Z
λ1
≥− F z, xn (z) − |xn (z)|p dz (see Remark 4.3.43). p Z
ϕ(xn ) =
Using Fatou’s lemma and (5.59), we obtain ϕ(xn ) −→ +∞
as n → ∞,
which contradicts (5.52). This proves the weak coercivity of ϕ.
COROLLARY 5.1.22 If hypotheses H(f )6 hold, then ϕ satisfies the P S-condition and it is bounded below. PROOF: Let {xn }n≥1 ⊆ W01,p (Z) be a P S-sequence, namely {ϕ(xn )}n≥1 ⊆ R is
bounded and ϕ (xn ) −→ 0 in W −1,p (Z) as n → ∞ p1 + p1 = 1 . From Proposition 5.1.21 we know that ϕ is weakly coercive. So it follows that {xn }n≥1 ⊆ W01,p (Z) is bounded. Then arguing as in the last part of Proposition 5.1.2, we obtain xn −→ x in W01,p (Z). This proves that ϕ satisfies the P S-condition. Next suppose that {xn }n≥1 ⊆ W01,p (Z) satisfies ϕ(xn ) −→ −∞
as n → ∞.
(5.60)
Proposition 5.1.21 implies that {xn }n≥1 ⊆ W01,p (Z) is bounded and so we may w assume that xn −→ x in W01,p (Z). Exploiting the compact embedding of W01,p (Z)
into Lp (Z), we can easily see that ϕ is weakly lower semicontinuous on W01,p (Z). So we have −∞ < ϕ(x) ≤ lim inf ϕ(xn ), n→∞
which contradicts (5.60). This proves that ϕ is bounded below.
Now we can have a multiplicity result for problem (5.1). THEOREM 5.1.23 If hypotheses H(f )6 hold, then problem (5.1) has at least two nontrivial solutions x0 , x1 ∈ C01 (Z). PROOF: Recall the direct sum decomposition W01,p (Z) = Ru1 ⊕ V with V = u ∈ W01,p (Z) : Z u(z)u1 (z)p−1 dz . We know that u1 ∈ C01 (Z) (see Proposition 4.3.39). So we can find t0 > 0 such that |tu1 (z)| ≤ δ for all z ∈ Z and all |t| ≤ t0 . Then hypothesis H(f )6 (iv) implies that
λ1 |t|p u1 (z)p ≤ p F z, tu1 (z) ≤ λ|t|p u1 (z)p
a.e. on Z, with λ1 < λ < λV . (5.61)
So for |t| ≤ t0 , we have
|t|p F z, tu1 (z) dz Du1 pp − p Z |t|p |t|p p ≤ Du1 p − λ1 u1 pp (see (5.61)) p p =0 (recall that Du1 pp = λ1 u1 pp , see Remark 4.3.43).
ϕ(tu1 ) =
(5.62) Note that for a.a. z ∈ Z and all x ∈ R, we have F (z, x) ≤
λ p x + c2 |x|r p
with c2 > 0 and p < r < p∗ .
(5.63)
So if u ∈ V , we have
1 F z, u(z) dz Dupp − p Z 1 λ p ≥ Dup − upp − c2 urr (see (5.63)), p p 1 λ ≥ for some c3 > 0 Dupp − c3 Durr 1− p λV
ϕ(u) =
(5.64)
(see (5.51) and recall r < p∗ ). Recall that λ < λV and r > p. So we can find > 0 small such that ϕ(u) ≥ 0
for all u ∈ V, u ≤
(see (5.64)).
(5.65)
On the other hand from (5.62), if > 0 is small, we have ϕ(u) ≤ 0
for all u ∈ Ru1 , with u ≤ .
(5.66)
If inf ϕ = 0, then because of (5.66) all the elements of Ru1 ∩ B (0) \ {0} are critical points of ϕ and so we have a continuum of nontrivial solutions for problem (5.1), which by virtue of Theorems 4.3.34 and 4.3.35 belong in C01 (Z). If inf ϕ < 0, then because of (5.65), (5.66), and Corollary 5.1.22, we can apply Theorem 4.1.32 and produce two nontrivial critical points x0 , x1 ∈ W01,p (Z) of ϕ. These are solutions of (5.1) and as above the nonlinear regularity theory implies that x0 , x1 ∈ C01 (Z). We prove a multiplicity theorem for semilinear problems (i.e. p = 2). In this multiplicity theorem, we combine the variational method with the method of monotone iterations. This way we are well placed to consider the method of upper-lower solutions, examined in the next section. The problem under consideration is the following. ⎧ ⎫ 2 ⎨ −x(z) = x(z) + h(z) a.e. on Z, ⎬ . ⎩ ⎭ x ∂Z = 0, h ∈ L∞ (Z)
(5.67)
First using the monotone iteration method, we produce a nontrivial solution for problem (5.67). PROPOSITION 5.1.24 If h ∈ L∞ (Z), h = 0 and h(z) ≤ 0 a.e. on Z, then problem (5.67) has a solution u ∈ −int C01 (Z)+ . PROOF: Let x0 ∈ H01 (Z) be the unique solution of −x0 (z) = h(z) a.e. on Z. From Theorems 4.3.34 and 4.3.35 we have that x0 ∈ C01 (Z). Moreover, Theorem 4.3.37 implies that x0 ∈ −int C01 (Z)+ . We set M = 2 min x0 (z) > 0. z∈Z
Given y ∈ H01 (Z), we consider the problem ⎧ ⎫ 2 ⎨ −u(z) + M u(z) = M y(z) + y(z) + h(z) a.e. on Z, ⎬ ⎩
y ∂Z = 0
⎭
.
(5.68)
Problem (5.68) has a unique solution in C01 (Z) denoted by G(y). So we define a map G : H01 (Z) −→ H01 (Z). Evidently a fixed point of G solves problem (5.67). Claim: If y ∈ H01 (Z), x0 ≤ y ≤ 0, then x0 ≤ G(y) ≤ 0. We have M y + y 2 = y(M + y) ≤ 0. So from (5.68) and Theorem 4.3.37 it follows that u = G(y) ≤ 0. Then
5.1 Variational Method
377
−(u − x0 ) + M (u − x0 ) = M y + y 2 + h − h − M x0 = M (y − x0 ) + y 2 ≥ 0, ⇒ u ≥ x0
(see Theorem 4.3.37).
In a similar fashion, we show that G is increasing; that is, x0 ≤ y1 ≤ y2 ≤ 0 ⇒ G(y1 ) ≤ G(y2 ).
(5.69)
Now let u0 = 0 and define un+1 = G(un ), n ≥ 0. By virtue of the claim and (5.69), we have x0 ≤ · · · ≤ un ≤ un−1 ≤ · · · ≤ u0 = 0,
n ≥ 1.
(5.70)
We have |un | ≤ |x0 |
for all n ≥ 1,
⇒ {un }n≥1 ⊆ L2 (Z)
is bounded.
(5.71)
Moreover, by definition we have −un+1 + M un+1 = M un + u2n + h,
⇒ un+1 2 + M un+1 22 = M un un+1 dz + u2n un+1 dz + hun+1 dz Z
Z
Z
≤ (M un 2 + h2 )un+1 ≤ M1 un+1 for some M1 > 0, all n ≥ 1 (see (5.71)), ⇒ un n≥1 ⊆ H01 (Z) is bounded. From this and (5.70), it follows that for the whole sequence {un }n≥1 , we have
w un −→ u in H01 (Z) with u(z)=inf un (z). If A∈L H01 (Z), H −1 (Z) , is defined by n≥1
A(v), y= Z
(Dv, Dy)RN dz
for all v, y ∈ H01 (Z),
then A is maximal monotone and Aun+1 + M un+1 = M un + u2n + h,
(5.72)
⇒ lim Aun+1 , un+1 − u = 0, ⇒ un+1 −→ u in H01 (Z). So if we pass to the limit as n → ∞ in (5.72), we obtain Au = u2 + h, ⇒ u solves problem (5.67) and u ∈ C01 (Z). Finally, it is clear that u =0 and from Theorem 4.3.37 we have u∈−int C01 (Z). If h ≥ 0, h = 0, it can happen that problem (5.67) has no solution.
378
5 Boundary Value Problems–Hamiltonian Systems
EXAMPLE 5.1.25 Let u1 ∈ C01 (Z) be the principal eigenfunction of −, H01 (Z) . We know that u1 (z) > 0 for all z ∈ Z. Let h(z) = tu1 (z) and t > 0 to be specified in the sequel. Suppose that we could find u ∈ H01 (Z) a solution of (5.67) with h as above. We have
uu1 dz= (Du, Du1 )RN dz = u2 u1 dz + hu1 dz λ1 Z Z Z
Z = u2 u1 dz + t (because u1 2 = 1). Z
(5.73) Note that
√ √ (u u1 ) u1 dz Z 1/2 1/2 ≤ u1 dz u2 u1 dz .
uu1 dz = Z
Z
Z
Using the estimate in (5.73), we obtain
1/2 1/2 u2 u1 dz + t ≤ λ1 c u2 u1 dz with c = u1 dz = u1 1 . Z
Z
Z
But this is possible only if 4t ≤ λ21 c2 . So if t > h = tu1 has no solution.
λ1 c 2
2
, then problem (5.67) with
Now using the variational method we produce a second solution for problem (5.67). So we have the following multiplicity theorem. THEOREM 5.1.26 If h ∈ L∞ (Z), h = 0, and h(z) ≤ 0 a.e. on Z, then problem (5.67) has at least two nontrivial solutions u, v ∈ C01 (Z). PROOF: The first solution u ∈ C01 (Z) was obtained in Proposition 5.1.24. Next using the mountain pass theorem (see Theorem 4.1.24), we produce a second nontrivial solution for problem (5.67). Let y = u + εw, w ∈ H01 (Z), w = 1, ε > 0. We have
ε2 u2 wdz ϕ(u + εw) − ϕ(u) = ε (Du, Dw)RN dz + Dw22 − ε 2 Z Z
ε3 −ε uw2 dz − w3 dz − ε hwdz 3 Z Z Z
ε2 ε3 = uw2 dz − w3 dz, (5.74) Dw22 − ε 2 3 Z Z the second equality following from the fact that u ∈ C01 (Z) solves (5.67). Because u ∈ −int C01 (Z)+ (see Proposition 5.1.24), we have Z uw2 dz ≤ 0. Also Z w2 dz ≤ c1 for some c1 > 0 and all w = 1. Therefore ϕ(u + εw) − ϕ(u) ≥ ε2
1 2
−
c1 ε 3
(see (5.74)).
(5.75)
5.1 Variational Method
379
Choosing 0 < ε < 34 c1 , from (5.75) we have ϕ(y) − ϕ(u) ≥
ε2 4
for all y ∈ H01 (Z) with y − u = ε.
(5.76)
Next let y = tu1 . Then
t2 t3 hu1 dz, λ1 − u1 33 − t 2 3 Z ⇒ ϕ(y) −→ −∞ as t −→ +∞. ϕ(y) =
Therefore we can find y ∈ H01 (Z) such that ϕ(y) < ϕ(u)
and
y − u > ε.
(5.77)
Claim: ϕ satisfies the P S-condition. Suppose {xn }n≥1 is a P S-sequence; that is,
and
|ϕ(xn )| ≤ c2
for some c2 > 0 and all n ≥ 1
(5.78)
ϕ (xn ) −→ 0
in H −1 (Z).
(5.79)
Because of (5.79), we have
ϕ (xn ), xn = Dxn 22 − x3n dz − hxn dz ≤ εn xn with εn ↓ 0, Z
Z 1 2 ⇒ 3ϕ(xn ) − Dxn 2 + 2 hxn dz ≤ εn xn (see (5.78)). 2 Z
From Poincar´e’s inequality and (5.78), it follows that {xn }n≥1 ⊆ H01 (Z) is bounded. So we may assume that w
xn −→ x in H01 (Z)
xn −→ x in L2 (Z).
Then as before, due to the maximal monotonicity of A ∈ L H01 (Z), H −1 (Z) , we conclude that xn −→ x in H01 (Z) and so ϕ satisfies the P S-condition. Because of (5.76), (5.77), and the claim, we can apply the mountain pass theorem (see Theorem 4.1.24) and obtain v ∈ H01 (Z) such that ϕ(v) ≥ ϕ(u) +
and
ε2 , i.e. v = w, v = 0 4
and ϕ (v) = 0. So v solves problem (5.67) and also v ∈ C01 (Z) (regularity theory).
We conclude this section with a remarkable theorem, which provides a bridge between variational methods and methods based on the nonlinear strong maximum principle. The result is a valuable tool in the study of nonlinear boundary value problems involving the p-Laplacian. For a proof of it we refer to Gasi´ nski–Papageorgiou [258, p. 655]. So let f : Z × R −→ R be a Carath´eodory function such that |f (z, x)| ≤ α(z) + c|x|r−1
for a.a. z ∈ Z and all x ∈ R,
380
5 Boundary Value Problems–Hamiltonian Systems
with α ∈ L∞ (Z)+ , c > 0 and 1 ≤ r < p∗ . We introduce the C 1 -functional ϕ : W01,p (Z) −→ R defined by
1 F z, x(z) dz for all x ∈ W01,p (Z), ϕ(x) = Dxpp − p Z x where F (z, x) = 0 f (z, r)dr. THEOREM 5.1.27 If ϕ ∈ C 1 (W01,p (Z)) is as above and x0 ∈ W01,p (Z) is a local C01 (Z)-minimizer of ϕ; that is, there exists > 0 such that ϕ(x0 ) ≤ ϕ(x0 + y)
for all yC 1 (Z) ≤ , 0
then x0 ∈C01 (Z) and it is also a local W01,p (Z)-minimizer of ϕ; that is, there exists 0 > 0 such that ϕ(x0 ) ≤ ϕ(x0 + y) for all y ≤ 0 .
5.2 Method of Upper–Lower Solutions In this section we present the method of upper and lower solutions in the study of nonlinear boundary value problems. With this method, via truncation and penalization techniques, we produce solutions that are located in the order interval determined by an ordered pair of upper and lower solutions, which serve as upper and lower bounds, respectively. Under appropriate monotonicity conditions on the data, this method leads to monotone iterative processes that are amenable to numerical treatment. For a given problem one may try several different methods in order to produce the necessary ordered pair of upper and lower solutions. There is no specific methodology and in general one should try simple functions (such as constants, linear, quadratic, exponential, eigenfunctions of simpler operators, solutions of simpler auxiliary equations, etc.). Here we illustrate this method by analyzing two nonlinear boundary value problems. The first deals with elliptic differential equations, whereas the second is a boundary value problem for ordinary differential equations with general boundary conditions which unify the Dirichlet, Neumann, and Sturm–Liouville problems. We start with a nonlinear eigenvalue problem driven by the p-Laplacian. So let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z. We consider the following problem,
⎧ ⎫ p−2 Dx(z) = λ|x(z)|p−2 x(z) −f z, x(z) a.e. on Z, ⎬ ⎨ −div Dx(z) . (5.80) ⎩ ⎭ x ∂Z = 0, λ ∈ R, 1 < p < ∞ We produce a multiplicity result for problem (5.80) under the hypothesis that the right-hand side nonlinearity f (z, x) exhibits a p-superlinear growth both near 0 and ±∞. For this purpose our hypotheses on f (z, x) are the following. H(f )1 : f : Z × R −→ R is a function such that f (z, 0) = 0 a.e. on Z and
5.2 Method of Upper–Lower Solutions
381
(i) For all x ∈ R, z −→ f (z, x) is measurable. (ii) For almost all z ∈ Z, x −→ f (z, x) is continuous and f (z, x)x ≥ 0. (iii) For almost all z ∈ Z and all x ∈ R, we have |f (z, x)| ≤ α(z) + c|x|r−1 with α ∈ L∞ (Z)+ , c > 0 Np if N > p ∗ N −p . 1 ≤ r
and f (z,x) p−2 x x→0 |x|
(iv) lim (v)
= 0 uniformly for almost all z ∈ Z.
lim f (z,x) p−2 x |x|→∞ |x|
= +∞ uniformly for almost all z ∈ Z.
REMARK 5.2.1 When p=2 (semilinear problem), hypotheses H(f )1 (iv) and (v) imply that the nonlinearity f (z, x) is superlinear both at zero and at infinity. From hypotheses H(f )1 (iii), (iv), and (v), we see that given µ > 0, we can find ϑµ ∈ L∞ (Z)+ , ϑµ = 0, such that
and
f (z, x) > µ|x|p−2 x − ϑµ
for a.a. z ∈ Z and all x ≥ 0
(5.81)
f (z, x) < µ|x|p−2 x + ϑµ
for a.a. z ∈ Z and all x ≤ 0.
(5.82)
We use (5.81) and (5.82) to produce an ordered pair of upper and lower solutions for problem (5.80). Let us start by recalling the definitions of upper and lower solutions for (5.80). 1,p DEFINITION 5.2.2 (a) A function x ∈ W (Z) is an upper solution for problem (5.80), if x ∂Z ≥ 0 and
Dxp−2 (Dx, Du)RN dz ≥ λ |x|p−2 xudz − f (z, x)udz Z
Z
Z
for all u ∈ u ≥ 0. (b) A function x ∈ W (Z) is a lower solution for problem (5.80), if x∂Z ≤ 0 and
Dxp−2 (Dx, Du)RN dz ≤ λ |x|p−2 xudz − f (z, x)udz W01,p (Z), 1,p
Z
for all u ∈
W01,p (Z),
Z
Z
u ≥ 0.
Motivated by inequality (5.81), we consider the following auxiliary boundary value problem.
⎧ ⎫ p−2 Dx(z) = (λ − µ)|x(z)|p−2 x(z) +ϑµ (z) a.e. on Z, ⎬ ⎨ −div Dx(z) . (5.83) ⎩ ⎭ x ∂Z = 0 PROPOSITION 5.2.3 If λ > λ1 and we choose µ > λ − λ1 , then problem (5.83) has a solution x ∈ int C01 (Z)+ .
382
5 Boundary Value Problems–Hamiltonian Systems
PROOF: In what follows by ·, · we denote the duality brackets for the pair
1,p W0 (Z), W −1,p (Z) , (1/p) + (1/p ) = 1. Let A : W01,p (Z) −→ W −1,p (Z) be the nonlinear operator defined by
A(x), y = Dxp−2 (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). Z
We know that A is maximal monotone. Also let K : W01,p (Z) −→ Lp (Z) be defined by K(x)(·) = (λ − µ)|x(·)|p−2 x(·). Clearly K is completely continuous (recall that W01,p (Z) is embedded compactly into Lp (Z)). It follows then that A−K : W01,p (Z) −→ W −1,p (Z) is pseudomonotone. Moreover, we have A(x) − K(x), x = Dxpp − (λ − µ)xpp .
(5.84)
If λ ≤ µ, then it is clear from (5.84) that A − K is coercive. If λ > µ, then by hypothesis λ − µ = λ1 − ε with 0 < ε < λ1 and so A(x) − K(x), x ≥ Dxpp − ⇒ A − K is coercive.
λ1 − ε ε Dxpp = Dxpp , λ1 λ1
So in both cases A − K is coercive. Then by virtue of Theorem 3.2.60 we can find x ∈ W01,p (Z) such that A(x) − K(x) = ϑµ , ⇒ x ∈ W01,p (Z) is a solution of problem (5.83) and belongs in C01 (Z)+ . Because ϑµ = 0, it follows that x = 0. Moreover, using as a test function x − ∈ W01,p (Z), we obtain Dx − pp ≤ (λ − µ)x − pp
(recall that ϑµ ≥ 0).
If λ ≤ µ, then Dx − p = 0 and so we have x = 0. If λ − µ > 0, then λ − µ = λ1 − ε and so Dx − pp ≤ (λ1 − ε)x − pp , which contradicts Theorem 4.3.37, unless x ≥ 0. Therefore we have established that x ≥ 0, x = 0. Also from (5.83), we have
div Dx(z)p−2 Dx(z) ≤ |λ − µ| |x(z)|p−2 x(z) a.e. on Z and this by virtue of Theorem 4.3.37 implies that x ∈ int C01 (Z)+ .
COROLLARY 5.2.4 If λ > λ1 and we have µ > λ − λ1 , then the solution x ∈ int C01 (Z)+ of problem (5.83) produced in Proposition 5.2.3 is an upper solution for problem (5.80). Hypothesis H(f )1 (iv) implies that given ε > 0, we can find δ = δ(ε) > 0 such that and
f (z, x) < ε|x|p−2 x
for a.a. z ∈ Z and all x ∈ [0, δ]
(5.85)
f (z, x) > ε|x|p−2 x
for a.a. z ∈ Z and all x ∈ [−δ, 0].
(5.86)
The next lemma helps us generate a lower solution for problem (5.80).
5.2 Method of Upper–Lower Solutions
383
LEMMA 5.2.5 If X is an ordered Banach space with order cone K such that int K = ∅ and x0 ∈ int K, then given y ∈ X, we can find δ > 0 such that βx0 −y ∈ int K. PROOF: Because x0 ∈ int K, we can find δ > 0 such that B δ (x0 ) = x ∈ X : x − x0 ≤ δ ⊆ int K. Let y ∈ X, y = 0 (if y = 0, then the lemma is trivially true for all β > 0). We have y x0 ± δ ∈ B δ (x0 ) ⊆ int K, y y ⇒ x0 − y ∈ int K. δ So if β = β(y) = y/δ, then we have βx0 − y ∈ int K.
Now let u1 ∈ int C01 (Z)+ be the principal eigenfunction of −p , W01,p (Z) . Using Proposition 5.2.3 and Lemma 5.2.5, we can find ξ > 0 small such that ξu1 (z) ∈ [0, δ]
for all z ∈ Z and ξu1 ≤ x.
We set x = ξu1 . Evidently x ∈ int
C01 (Z)+
(5.87)
(of course x depends on ε > 0).
PROPOSITION 5.2.6 If λ > λ1 , then x ∈ int C01 (Z)+ defined above is a lower solution for problem (5.80). PROOF: Let ε > 0 be such that λ = λ1 + ε. Then we have
−div Dx(z)p−2 Dx(z) = λ1 |x(z)|p−2 x(z) = (λ − ε)|x(z)|p−2 x(z)
< λ|x(z)|p−2 x(z) − f z, x(z)
a.e. on Z
(see (5.85) and (5.87)), ⇒ x ∈ int C01 (Z)+ is a lower solution of (5.80)
(see Definition 5.2.2(b)).
Therefore we have produced an ordered pair {x, x} of lower and upper solutions. We introduce the truncation map τ+ : R −→ R defined by 0 if x ≤ 0 . τ+ (x) = x if x ≥ 0 We set
x
f+ (z, x) = f z, τ+ (x) , F+ (z, x) = f+ (z, r)dr
and
x F (z, x) = f (z, r)dr.
0
0
W01,p (Z) −→ R
We also consider the Euler functionals ϕ+ , ϕ :
1 λ ϕ+ (x) = Dxpp − xpp + F+ z, x(z) dz p p
Z
1 λ p p and ϕ(x) = Dxp − xp + F z, x(z) dz p p Z
defined by
for all x ∈ W01,p (Z).
384
5 Boundary Value Problems–Hamiltonian Systems
We have ϕ+ , ϕ ∈ C 1 W01,p (Z) and the critical points of ϕ are solutions of (5.80). Also we introduce the order interval E+ =[ x, x ]={x ∈ W01,p (Z) : x(z) ≤ x(z) ≤ x(z)
a.e. on Z}.
In the proof of the next proposition, we need the following nonlinear strong comparison principle, whose proof is postponed until Section 5.5 (see Theorem 5.5.11). THEOREM 5.2.7 If x, y ∈ C01 (Z), x = 0, f, g ∈ L∞ (Z), f ≥ 0,
a.e. on Z −div Dx(z)p−2 Dx(z) = f (z)
p−2 and −div Dy(z) Dy(z) = g(z) a.e. on Z f (z) ≥ g(z)
a.e. on Z and the set
C = {z ∈ Z : f (z) = g(z)} has an empty interior, then x(z) > y(z) for all z ∈ Z and (∂x/∂n)(z)<(∂y/∂n)(z) for all z ∈ ∂Z, that is x − y ∈ int C01 (Z)+ . PROPOSITION 5.2.8 If hypotheses H(f )1 hold and λ > λ1 , then we can find x0 ∈ E+ which is a local minimizer of ϕ. PROOF: Because of (5.81), we have F+ (z, x) ≥
µ p |x| − ϑµ (z)|x| p
for a.a. z ∈ Z and all x ∈ R+ .
(5.88)
Choose µ > 0 such that λ − µ < λ1 . Then
1 λ ϕ+ (x) = Dxpp − xpp + F+ z, x(z) dz p p Z 1 λ−µ p p ≥ Dxp − xp − c1 Dxp for some c1 > 0 (see (5.88)) p p 1
λ − µ = Dxpp − c1 Dxp , 1− p λ1 ⇒ ϕ+ is coercive on W01,p (Z)+
(because λ − µ < λ1 ).
Because of compact embedding of W01,p (Z) into Lp (Z), we can easily see that ϕ+ is weakly lower semicontinuous. So we can find x0 ∈ E+ such that ϕ+ (x0 ) = inf ϕ+ .
(5.89)
E+
Now let y ∈ E+ and set η(t)=ϕ+ ty + (1 − t)x0 for all t ∈ [0, 1]. Then η(0) ≤ η(t)
for all t ∈ [0, 1]
⇒ 0 ≤ η (0 ), +
⇒ 0 ≤ A(x0 ), y − x0 −λ
(see (5.10)),
|x0 |p−2 x0 (y − x0 )dz +
Z
for all y ∈ E+ . Given v ∈ W01,p (Z) and ε > 0, we define
f+ z, x0 (z) (y − x0 )dz
Z
(5.90)
5.2 Method of Upper–Lower Solutions ⎧ ⎪ if z ∈ {x0 + εv ≤ x} ⎨ x(z) y(z) = x0 (z) + εv(z) if z ∈ {x < x0 + εv < x} . ⎪ ⎩ x(z) if z ∈ {x ≤ x0 + εv} Evidently y ∈ E+ . We return to (5.90) and use this y ∈ E+ . We obtain
0≤
Dx0 p−2 (Dx0 , Dv)RN dz − λε
{x<x0 +εv<x}
f+ (z, x0 )vdz +
+ε
{x<x0 +εv<x}
+
− x0 )dz
− Dx0 )RN dz − λ
|x0 |p−2 x0 (x {x0 +εv≥x}
− x0 )dz
− x0 )dz =
Dx0 p−2 (Dx0 , Dv)RN dz − λε |x0 |p−2 x0 vdz Z Z
+ε f+ (z, x0 )vdz − Dxp−2 Dx, D(x0 + εv − x) RN dz
=ε
f+ (z, x0 )(x {x0 +εv≤x}
Dx0 p−2 (Dx0 , Dx {x0 +εv≥x} f+ (z, x0 )(x {x0 +εv≥x}
− Dv)RN dz
{x0 +εv≤x}
+
Dx0 p−2 (Dx0 , Dx {x0 +εv≤x}
|x0 |p−2 x0 (x − x0 )dz +
−λ
|x0 |p−2 x0 vdz Z
{x0 +εv≥x}
Z
|x|p−2 x(x0 + εv − x)dz −
+λ
{x0 +εv≥x}
+ εv − x)dz
Dxp−2 Dx, D(x − x0 − εv) RN dz
+
{x0 +εv≤x}
|x|p−2 x(x − x0 − εv)dz +
−λ
{x0 +εv≤x}
−
f+ (z, x)(x0 {x0 +εv≥x}
f+ (z, x)(x {x0 +εv≤x}
− x0 − εv)dz
f+ (z, x0 ) − f+ (z, x) (x0 + εv − x)dz
{x0 +εv≤x}
−
f+ (z, x) − f+ (z, x0 ) (x − x0 − εv)dz
{x0 +εv≥x}
(Dxp−2 Dx − Dx0 p−2 Dx0 , Dx0 − Dx)RN dz
+
{x0 +εv≥x}
(|x|p−2 x − |x0 |p−2 x0 )(x0 − x)dz
−λ
{x0 +εv≥x}
−
(Dxp−2 Dx − Dx0 p−2 Dx0 , Dx − Dx0 )RN dz
{x0 +εv≤x}
(|x|p−2 x − |x0 |p−2 x0 )(x − x0 )dz
+λ
+ε
{x0 +εv≤x}
(Dxp−2 Dx − Dx0 p−2 Dx0 , Dv)RN dz
{x0 +εv≥x}
385
386
5 Boundary Value Problems–Hamiltonian Systems
+ ε (Dxp−2 Dx − Dx0 p−2 Dx0 , Dv)RN dz {x0 +εv≤x}
− λε
(|x|p−2 x − |x0 |p−2 x0 )vdz
{x0 +εv≥x}
− λε
(|x|p−2 x − |x0 |p−2 x0 )vdz.
(5.91)
{x0 +εv≤x}
If u = (x0 +εv −x)+ ∈W01,p (Z)+ , then because x is an upper solution for problem (5.80), we have
− Dxp−2 Dx, D(x0 + εv − x) RN dz + λ |x|p−2 x(x0 + εv − x)dz {x0 +εv≥x}
{x0 +εv≥x}
−
f+ (z, x)(x0 {x0 +εv≥x}
+ εv − x)dz ≤ 0.
(5.92)
If u = (x − x0 − εv)+ ∈W01,p (Z)+ , then because x is a lower solution for problem (5.80), we have
Dxp−2 Dx, D(x − x0 − εv) RN dz − λ |x|p−2 x(x − x0 − εv)dz {x0 +εv≤x}
{x0 +εv≤x}
+
f+ (z, x)(x {x0 +εv≤x}
− x0 − εv)dz ≤ 0.
(5.93)
Recall that the map ψp : RN −→RN defined by yp−2 y if y = 0 ψp (y)= , 0 if y = 0 is a strictly monotone homeomorphism. Therefore
(Dxp−2 Dx − Dx0 p−2 Dx0 , Dx0 − Dx)RN dz ≤ 0
(5.94)
{x0 +εv≥x}
and
(Dxp−2 Dx − Dx0 p−2 Dx0 , Dx − Dx0 )RN dz ≤ 0.
(5.95)
{x0 +εv≤x}
Because x0 ≤ x and x − x0 ≤ εv on {x0 + εv ≥ x}, we have
−λ (|x|p−2 x − |x0 |p−2 x0 )(x0 − x)dz ≤ λε (|x|p−2 x − |x0 |p−2 x0 )vdz. {x0 +εv≥x}
Similarly because x ≤ x0 and x − x0 ≥ εv on {x0 + εv ≤ x}, we have
p−2 x − |x0 |p−2 x0 (x − x0 )dz ≤ λε (|x|p−2 x − |x0 |p−2 x0 )vdz. |x| λ {x0 +εv≤x}
(5.96)
{x0 +εv≤x}
(5.97)
{x0 +εv≤x}
Because x, x ∈ C01 (Z)+ , x ≤ x0 and using hypothesis H(f )1 (iii), we have
− f+ (z, x0 ) − f+ (z, x) (x0 + εv − x)dz ≤ c2 ε (−v)dz for some c2 > 0. (5.98) {x0 +εv≤x}
Similarly we have
{x0 +εv≤x<x0 }
5.2 Method of Upper–Lower Solutions 387
− f+ (z, x) − f+ (z, x0 ) (x − x0 − εv)dz ≤ c3 ε vdz for some c3 > 0. (5.99) {x0 +εv≥x}
{x0 +εv≥x>x0 }
We return to (5.91) and use (5.92)−→(5.99). This way we have
Dx0 p−2 (Dx0 , Dv)RN dz − λε x0 p−2 x0 vdz + ε f+ (z, x0 )vdz 0≤ε Z Z Z
+ c2 ε (−v)dz + c3 ε vdz. {x0 +εv≤x<x0 }
{x0 +εv≥x>x0 }
Divide with ε > 0 and then let ε ↓ 0. If by | · |N we denote the Lebesgue measure in RN , then |{x0 + εv ≤ x < x0 }|N −→ 0
and
|{x0 + εv ≥ x > x0 }|N −→ 0
as ε ↓ 0.
Therefore in the limit as ε ↓ 0, we obtain
Dx0 p−2 (Dx0 , Dv)RN dz − λ x0 p−2 x0 vdz + f+ (z, x0 )vdz. 0≤ Z
Z
Z
(5.100) Because x0 ∈W01,p (Z) was arbitrary, from (5.99) we infer that
⎧ ⎫ p−2 Dx0 (z) = λ|x0 (z)|p−2 x0 (z)−f z, x0 (z) a.e. on Z, ⎬ ⎨ −div Dx0 (z) ⎩
x∂Z = 0
⎭
,
(5.101)
(note that f+ z, x0 (z) = f z, x0 (z) ). From (5.101) we have that x0 ∈ C01 (Z). Moreover, from (5.81) and Proposition 5.2.6, we have x − x0 ∈ int C01 (Z)+ . Similarly because of (5.85) and Proposition 5.2.6, we have x0 − x ∈ int C01 (Z)+ . From the definition of f+ , it follows that x0 is a C01 (Z)-local minimizer of ϕ. Invoking Theorem 5.1.27, we infer that x0 is a W01,p (Z)-local minimizer of ϕ. From the above proof we have that x0 ∈ C01 (Z)+ is a solution of problem (5.80). We repeat a similar argument on the negative semiaxis of R. In this case motivated by (5.82), we consider the following auxiliary problem.
⎧ ⎫ p−2 Dx(z) = (λ − µ)|x(z)|p−2 x(z) − ϑµ (z) a.e. on Z, ⎬ ⎨ −div Dx(z) . (5.102) ⎩ ⎭ x ∂Z = 0 Following the reasoning of Proposition 5.2.3, we solve problem (5.102) and obtain a solution v ∈ −int C01 (Z)+ , which we can show is a lower solution for problem (5.80). Moreover, taking ξ > 0 as in the definition of x, we introduce v = ξ(−u1 ) which is an upper solution for problem (5.80). Using the ordered pair {v, v}, we consider the corresponding eodory truncation τ− (z, x) and then set f− (z, x) =
Carath´ x f z, τ− (z, x) , F− (z, x) = 0 f− (z, r)dr, and
388
5 Boundary Value Problems–Hamiltonian Systems
1 λ F− z, x(z) for all x ∈ W01,p (Z). ϕ− (x) = Dxpp − xpp + p p Z
We work as above, this time on the order interval E− = [ v, v ] and obtain the following. PROPOSITION 5.2.9 If hypotheses H(f )1 hold and λ > λ1 , then there exists v0 ∈ E− which is a local minimizer of ϕ. Again v0 ∈ −int C01 (Z)+ is a solution of problem (5.80). Therefore we have produced two nontrivial solutions x0 , v0 ∈ C01 (Z) of problem (5.80) which have constant sign. Next we show that if we restrict further the parameter λ > 0, namely we require that λ > λ2 , then we have a third nontrivial solution of problem (5.80), distinct from the other two. THEOREM 5.2.10 If hypotheses H(f )1 hold and λ > λ2 , then problem (5.80) has at least three distinct nontrivial solutions x0 , v0 , y0 ∈ C01 (Z). PROOF: We have produced two solutions x0 , v0 ∈ C01 (Z) that have constant sign and are also local minimizers of ϕ. We assume that these are the only nontrivial critical points of ϕλ , or otherwise we are done. Also clearly without any loss of generality, we can say that both x0 , v0 are strict local minimizers of ϕλ . We choose δ > 0 small such ϕ(v0 ) < inf ϕ(x) : x ∈ ∂Bδ (v0 ) and ϕ(x0 ) < inf ϕ(x) : x ∈ ∂Bδ (x0 ) . Let ϕ(v0 ) ≤ ϕ(x0 ), V =∂Bδ (x0 ), W0 ={v0 , x0 }, and W =[v0 , x0 ]={x ∈ W01,p (Z) : v0 (z) ≤ x(z) ≤ x0 (z) a.e. on Z}. The pair {W0 , W } links with V in W01,p (Z) via the identity map. Using hypothesis H(f )1 (v) it is easy to verify that ϕ is coercive from which it follows that ϕ satisfies the P S-condition. So by virtue of Theorem 4.1.28, we can find y0 ∈ W01,p (Z) such that ϕ (y0 ) = 0
and
(i.e., y0 is a critical point of ϕ)
ϕ(v0 ), ϕ(x0 ) < ϕ(y0 ) = inf max ϕ γ(t) ,
(5.103) (5.104)
γ∈Γ t∈[−1,1]
where Γ= γ ∈C [−1, 1], W01,p (Z) :γ(−1)=v0 , γ(1)=x 0 . Our goal is to produce a path γ ∈ Γ such that ϕγ < 0. Then y0 = 0 and of Lp (Z)
course it is distinct from v0 , x0 . To this end let S = W01,p (Z) ∩ ∂B1 where Lp (Z) Lp (Z) 1,p p 1 B1 = x ∈ L (Z) : xp ≤ 1 and Sc = W0 (Z) ∩ C0 (Z) ∩ ∂B1 . Evidently Sc is dense in S. From Theorem 4.1.28, we know that we can find γ0 ∈ Γ0 = γ0 ∈ C([−1, 1], S) : γ(−1)=−u1 , γ0 (1)=u1 such that γ0 ([−1, 1]) ⊆ Sc and max Dupp : u ∈ γ0 ([−1, 1]) ≤ λ2 + δ0 . (5.105) We choose δ0 > 0 such that λ2 + δ0 < λ. Also because of hypothesis H(f )1 (iv), given ε > 0 we can find δ=δ(ε) > 0 such that F (z, x) ≤
ε p |x| p
for a.a. z ∈ Z and all |x| ≤ δ.
(5.106)
Recall that γ0 ([−1, 1]) ⊆ Sc and −v0 , x0 ∈ int C01 (Z)+ . So we can choose ε > 0 small so that
5.2 Method of Upper–Lower Solutions |εu(z)| ≤ δ
389
for all z ∈ Z, all u ∈ γ0 ([−1, 1]),
λ2 + δ0 + ε < λ
and
εu ∈ W = [v0 , x0 ]
for all u ∈ γ0 ([−1, 1]). (5.107)
Therefore, if u ∈ γ0 ([−1, 1]), we have ϕ(εu) =
εp λεp Dupp − upp + p p
F z, εu(z) dz
Z
εp λεp εp+1 ≤ (λ2 + δ0 ) − + p p p and recall that up = 1), εp = (λ2 + δ0 + ε − λ) < 0 p
(see (5.105) and (5.106)
(see (5.107)).
Hence the path εγ0 joins −εu1 and εu1 and we have ϕεγ < 0.
(5.108)
0
Next we produce another continuous path γ+ : [0, 1] −→ W01,p (Z) which joins εu1 and x0 and along which ϕ is negative. For this purpose, we consider the truncation map x if x ≥ 0 τ+ (x) = . 0 if x < 0
x We set f+ (z, x)=f z, τ+ (x) , F+ (z, x)= 0 f+ (z, r)dr, and ϕ+ (x) =
1 λ Dxpp − x+ pp + p p
F+ z, x(z) dz
Z
for all x ∈ W01,p (Z).
Because of hypothesis H(f )1 (v), ϕ+ is coercive and so it satisfies the P Scondition. Also {0, x0 } are the only critical points of ϕ+ and ϕ+ (x0 ) = inf ϕ+ . 1,p
W0
(Z)
We set b+ = ϕ+ (εu1 )
and
a+ = ϕ+ (x0 ) = inf ϕ+ . 1,p
W0
(Z)
Invoking the second deformation theorem (see Theorem 4.6.1), we can find h ∈
b b C [0, 1] × ϕ++ , ϕ++ such that
and
a
h(t, x) = x
for all t ∈ [0, 1] and all x ∈ ϕ++ ,
h(0, x) = x
for all ϕ++
h(1, x) = x0
for all ϕ++ .
b
b
Then consider the continuous path γ+ (t) = h(t, εu1 ) for all t ∈ [0, 1], which joins εu1 and x0 . We have
ϕ γ+ (t) ≤ ϕ+ γ+ (t) ≤ b+ = ϕ+ (εu1 ) < 0 (5.109) ⇒ ϕγ < 0. +
390
5 Boundary Value Problems–Hamiltonian Systems
In a similar fashion, we produce a continuous path γ− : [0, 1] −→ W01,p (Z) joining −εu1 and v0 such that ϕγ < 0. (5.110) −
Concatenating the paths εγ0 , γ+ , and γ− we generate a path γ ∈ Γ such that ϕγ < 0 (see (5.108) through (5.110)) ⇒ ϕ(y0 ) < 0 = ϕ(0) (i.e. y0 = 0). Finally, nonlinear regularity theory implies that y0 ∈ C01 (Z).
As a second example of the method of upper–lower solutions, we consider the following nonlinear boundary value problem. ⎧ ⎫
⎨ − |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ t, x(t) a.e. on T = [0, b], ⎬ . (5.111)
⎩ ⎭ x (0) ∈ ξ1 x(0) , −x (b) ∈ ξ2 x(b) In this problem ξ1 and ξ2 are maximal monotone graphs in R2 . First let us specify the notions of upper and lower solutions for problem (5.111). DEFINITION 5.2.11 (a) A function ϕ ∈ C 1 (T ) such that |ϕ |p−2 ϕ ∈
W 1,p (0, b) , (1/p) + (1/p ) = 1 is said to be an upper solution of problem (5.111), if ⎫ ⎧
a.e. on T, ⎬ ⎨ − |ϕ (t)|p−2 ϕ (t) ≥ f t, ϕ(t), ϕ (t) + ϑ ϕ(t) .
⎭ ⎩ ϕ (0) ∈ ξ1 ϕ(0) − R+ , −ϕ (b) ∈ ξ2 ϕ(b) − R+
(b) A function ψ ∈ C 1 (T ) such that |ψ |p−2 ψ ∈ W 1,p (0, b) , (1/p) + (1/p ) = 1 is said to be a lower solution of problem (5.111), if ⎫ ⎧
⎨ − |ψ (t)|p−2 ψ (t) ≤ f t, ψ(t), ψ (t) + ϑ ψ(t) a.e. on T, ⎬ .
⎭ ⎩ ψ (0) ∈ ξ1 ψ(0) + R+ , −ψ (b) ∈ ξ2 ψ(b) + R+ We make the following hypothesis. H0 : There exist a lower solution ψ ∈ C 1 (T ) and an upper solution ϕ ∈ C 1 (T ) such that ψ(t) ≤ ϕ(t) for all t ∈ T . The hypotheses on the data of problem (5.111) are the following. H(f )2 : f : T × R × R −→ R is a function such that (i) For all x, y ∈ R, t −→ f (t, x, y) is measurable. (ii) For almost all t ∈ T, (x, y) −→ f (t, x, y) is continuous.
5.2 Method of Upper–Lower Solutions
391
(iii) For almost all t ∈ T , all x ∈ [ψ(t), ϕ(t)] and all y ∈ R, we have
|f (t, x, y)| ≤ η(|y|p−1 ) α(t) + c|y| with α ∈ L1 (T )+ , c > 0 and η : R+ −→R+ \{0} a Borel measurable nondecreasing function such that
∞ ds > α1 + c(max ϕ − min ψ) T T λp−1 η(s) b + sup |ϑ(z)| : |z| ≤ max{ϕ∞ , ψ∞ } η(λ) with λ = (1/b) max |ψ(0) − ϕ(b)|, |ψ(b) − ϕ(0)| .
(iv) For every r > 0, we can find γr ∈ Lp (T ) such that for almost all x, y ∈ R with |x|, |y| ≤ r we have |f (t, x, y)| ≤ γr (t). REMARK 5.2.12 Hypothesis H(f )2 (iii) is known as a Bernstein–Nagumo– Wintner growth condition and produces a uniform a priori bound for the derivatives of the solutions of problem (5.111). It is clear that if for almost all t ∈ T , all x ∈ [ ψ(t), ϕ(t) ], and all y ∈ R we have |f (t, x, y)| ≤ α(t) + c|y|p with α ∈ L1 (T )+ , c > 0, then hypothesis H(f )2 (iii) is satisfied. This is the growth condition initially used by Bernstein in his existence theory for second-order boundary value problems.
H(ξ): ξ1 , ξ1 : R −→ 2R are maximal monotone maps such that 0 ∈ ξ1 (0) and 0 ∈ ξ2 (0). REMARK 5.2.13 We know that there exist functions k1 , k2 ∈ Γ0 (R) such that ξ1 = ∂k1 and ξ2 = ∂k2 . H(ϑ): ϑ : R −→ R is a function that maps bounded sets to bounded sets and there exists M > 0 such that x −→ ϑ(x) + M x is increasing. REMARK 5.2.14 We emphasize that ϑ need not be continuous. We start with a lemma that produces a uniform bound for the derivatives of the solutions of problem (5.111). As we already mentioned, hypothesis H(f )2 (iii) is the crucial tool for this. LEMMA 5.2.15 If hypothesis H(f )2 (iii) holds and x ∈ C 1 (T ) satisfies
a.e. on T − |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) and
ψ(t) ≤ x(t) ≤ ϕ(t)
for all t ∈ T,
then there exists M1 > 0 (depending only on ψ, ϕ, η, α, c) such that |x (t)| ≤ M1
for all t ∈ T.
392
5 Boundary Value Problems–Hamiltonian Systems
PROOF: Set µ = ( 1/η(λ) sup |ϑ(z)| : |z| ≤ max{ϕ∞ , ψ∞ } < +∞ (see Hypothesis H(ϑ)). By virtue of hypothesis H(f )2 (iii), we can find M1 > λ such that
p−1
M1
λp−1
ds > α1 + c(max ϕ − min ψ) + µb. T T η(s)
We claim that |x (t)| ≤ M1 for all t ∈ T . Suppose that this is not the case. Then we can find t ∈ T such that |x (t )| > M1 . By the mean value theorem, we can find t0 ∈ (0, b) such that x(b) − x(0)=x (t0 )b. Without any loss of generality, we assume that t0 ≤ t (the analysis is similar if t0 > t ). Note that ψ(b) − ϕ(0) ≤ x(b) − x(0) ≤ ϕ(b) − ψ(0). So we have 1 1 |x(b) − x(0)| ≤ max |ψ(0) − ϕ(b)|, |ψ(b) − ϕ(0)| , b b ⇒ |x (t0 )| ≤ λ > M1 and |x (t )| < M1 (i.e., t0 < t ). |x (t0 )| =
Because x ∈ C 1 (T ), by the intermediate value theorem we can find t1 , t2 ∈ [t0 , t ] with t1 < t2 such that |x (t1 )| = λ and |x (t2 )| = M1 . By hypothesis we have
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T,
⇒ |x (t)|p−1 ≤ | |x (t)|p−2 x (t) | ≤ |f t, x(t), x (t) | + |ϑ x(t) |
≤ η |x (t)|p−1 α(t) + c|x (t)| + |ϑ x(t) | a.e. on T,
p−1 |ϑ x(t) | |x (t)| ≤ α(t) + c|x (t)| + ⇒ a.e. on [t1 , t2 ] η(λ) η |x (t)|p−1
t2 |x (t)|p−1
dt ≤ α1 + c(max ϕ − min ψ) + µb ⇒ p−1 T T t1 η |x (t)|
M p−1 1 ds ≤ α1 + c(max ϕ − min ψ) + µb, ⇒ T T η(s) p−1 λ
which contradicts the choice of M1 > 0.
The upper–lower solutions method employs truncation and penalization techniques. So we introduce a truncation map u : T×R×R−→R2 and a penalty function β : T ×R−→R defined by ⎧
⎪ ψ(t), ψ (t) if x < ψ(t) ⎪ ⎪
⎪ ⎪ ⎪ if x > ϕ(t) ⎨ ϕ(t), ϕ (t) u(t, x, y)= (x, M0 ) (5.112) if ψ(t) ≤ x ≤ ϕ(t), y > M0 ⎪ ⎪ ⎪(x, −M0 ) if ψ(t) ≤ x ≤ ϕ(t), y < −M ⎪ 0 ⎪ ⎪ ⎩ (x, y) if otherwise, where M0 > max{M1 , ϕ ∞ , ψ ∞ } and ⎧ p−2 ⎪ ψ(t) − |x|p−2 x ⎨ |ψ(t)| β(t, x)= 0 ⎪ ⎩ |ϕ(t)|p−2 ϕ(t) − |x|p−2 x
if x < ψ(t) if ψ(t) ≤ x ≤ ϕ(t) if ϕ(t) < x.
(5.113)
5.2 Method of Upper–Lower Solutions
393
We set f1 (t, x, y)=f t1 , u(t, x, y) . Note that for almost all x ∈ [ψ(t), ϕ(t)] and all |y| ≤ M , we have f1 (t, x, y) = f (t, x, y). Moreover, for almost all t ∈ T and all x, y∈R, we have |f1 (t, x, y)| ≤ αr (t) with r =max{M0 , ϕ∞ , ψ∞ }. For every x ∈
W 1,p (0, b) , let N1 (x)(·)=f1 ·, x(·), x (·) and β(x)(·)=β ·, x(·) (i.e., the Nemitsky operators corresponding
to f1 and β, respectively). Let G(x) = N1 (x) + β(x) for every x ∈ W 1,p (0, b) . The next proposition is an immediate consequence of the continuity of the truncation map (x, y) −→ u(t, x, y) (see (5.112)).
PROPOSITION 5.2.16 If hypotheses H(f )2 hold, then G : W 1,p (0, b) −→ Lp (T ), (1/p) + (1/p ) = 1 is continuous. Next we introduce the set
D = x ∈ C 1 (T ) : |x |p−2 x ∈ W 1,p (0, b) , x (0) ∈ ξ1 x(0)
and − x (b) ∈ ξ2 x(b)
and then define the nonlinear operator U : D ⊆ Lp (T ) −→ Lp (T ) by
for all x ∈ D. U (x)(·) = − |x (·)p−2 x (·)|
PROPOSITION 5.2.17 If hypotheses H(ξ) hold, then U : D ⊆ Lp (T )−→Lp (T ) is maximal monotone.
PROOF: Given h ∈ Lp (T ), we consider the following nonlinear boundary value problem, * ) =h(t) a.e. on T, − |x (t)|p−2 x (t) + |x(t)|p−2 x(t)
. (5.114) x (0) ∈ ξ1 x(0) , −x (b) ∈ ξ2 x(b) We show that problem (5.114) has a unique solution x ∈ C 1 (T ). To this end, given v, w ∈ R, we consider the following two-point boundary value problem. * ) − |x (t)|p−2 x (t) + |x(t)|p−2 x(t) = h(t) a.e. on T, . (5.115) x(0) = v, x(b) = w
Let us set γ(t) = 1 − (t/b) v + (t/b)w. Then γ(0) = v and γ(b) = w. We consider the function y(t) = x(t) − γ(t) and rewrite problem (5.115) in terms of this function )
* − |(y + γ) (t)|p−2 (y + γ) (t) +|(y + γ)(t)|p−2 (y + γ)(t) = h(t) a.e. on T, . y(0) = 0, y(b) = 0 (5.116) This is a homogeneous Dirichlet problem for y. To solve (5.116), we argue as follows.
Let V1 : W01,p (0, b) −→W −1,p (0, b) , be the nonlinear operator defined by
V1 (u), z0 =
b
|(u + γ) (t)|p−2 (u + γ) (t)z (t)dt
0
b
|(u + γ)(t)|p−2 (u + γ)(t)z(t)dt
+ 0
for all u, z∈W01,p (0, b) .
394
5 Boundary Value Problems–Hamiltonian Systems
It is easy to check that V1 is strictly monotone, demicontinuous, hence maximal monotone too. Moreover, we have V1 (u), u0 ≥ u + γp − c3 u + γp−1
for some c3 > 0,
⇒ V1 is coercive.
So from Corollary 3.2.28, we infer that there exists y ∈ W01,p (0, b) such that V1 (y) = h and due to the strict monotonicity of V1 , this solution is unique. From the equation V1 (y) = h it follows easily that y ∈ C 1 (T ) and it solves problem (5.116). Therefore x = y + γ ∈ C 1 (T ) and it is the unique solution of problem (5.115). We can define the solution map s : R × R −→ C 1 (T ) which to each pair (v, w) assigns the unique solution of problem (5.115). Then let : R × R −→ R × R be defined by
(v, w) = s(v, w) (0), −s(v, w) (b) . (5.117) Using integration by parts, we check easily that is monotone. Also let αn −→ α and βn −→ β in R. Set t t αn + βn xn = s(αn , βn ), x = s(α, β), γn (t) = 1 − b b t t and γ(t) = 1 − α+ β for all n ≥ 1. b b
Then directly from (5.115) we obtain that {xn }n≥1 ⊆ W 1,p (0, b) is bounded.
It follows that |xn |p−2 xn n≥1 ⊆Lp (T ) and |xn |p−2 xn n≥1 ⊆W 1,p (0, b) are both bounded. So we may assume that
w w xn −→ u in W 1,p (0, b) , |xn |p−2 xn −→ w in Lp (T ),
w and |xn |p−2 xn −→ v in W 1,p (0, b) .
Due to the compact embeddings of W 1,p (0, b) and W 1,p (0, b) into C(T ), we have xn −→ u in C(T ) and |xn |p−2 xn −→ v in C(T ). The map σ : C(T )−→C(T ) defined by
y(·) = ϑp y(·) , σ(y)(·) = ϑ−1 p where for any 1 < r < ∞,
ϑr (y) =
|y|r−2 y 0
if y = 0 , if y = 0
is continuous and maps bounded sets to bounded sets. Therefore xn −→ σ(v) in C(T ) ⇒ u = σ(v) (i.e., v = |u |p−2 u ). Therefore in the limit as n → ∞, we obtain ) * − |u (t)|p−2 u (t) + |u(t)|p−2 u(t) = h(t) a.e. on T, . u(0) = α, u(b) = β ⇒ x = s(α, β) (i.e., s : R × R−→C 1 (T ) is continuous).
5.2 Method of Upper–Lower Solutions
395
From the continuity of s it follows at once the continuity of (see (5.117)). Finally it is straightforward to check that is coercive. So is maximal monotone (being monotone, continuous) and coercive. Evidently so is k = + ξ : R × R −→ 2R×R . So we can find (α, β) ∈ R × R such that (0, 0) ∈ k(α, β). Then x0 = s(α, β) is the unique solution of the auxiliary problem (5.114). If K : Lp (T )−→Lp (T ) is the maximal monotone operator defined by K(x)(·) = |x(·)|p−2 x(·), then from the previous argument we have that
R(U + K) = Lp (T ), (i.e., U + K is surjective).
(5.118)
We claim that this surjectivity implies the maximal monotonicity of U . For this purpose suppose that for some y ∈ Lp (T ) and some v ∈ Lp (T ), we have U (x) − v, x − yp ≥ 0
for all x ∈ D.
(5.119)
Here by ·, ·p we denote the duality brackets for the pair Lp (T ), Lp (T ) . Because of (5.118), we can find x1 ∈ D such that U (x1 ) + K(x1 ) = v + K(y). We use this in (5.119) with x = x1 . So we obtain U (x1 ) − U (x1 ) − K(x1 ) + K(y), x1 − yp ≥ 0, ⇒ K(y) − K(x1 ), x1 − yp ≥ 0.
(5.120)
Because K is strictly monotone, from (5.120) we conclude that y = x1 ∈ D and v = U (x1 ). Therefore U is maximal monotone.
The operator U + K : D ⊆ Lp (T )−→ Lp (T ) is a maximal monotone and strictly monotone operator that is coercive (indeed note that U (x) + K(x), xp ≥ xpp , because U (x), xp ≥ 0 and K(x), xp = xpp ). Therefore U + K is surjective. This
means that the operator L : (U + K)−1 :Lp (T )−→D⊆W 1,p (0, b) is well-defined, single-valued, and maximal monotone (from Lp (T ) into Lp (T )).
PROPOSITION 5.2.18 If hypothesis H(ξ) holds, then L : Lp (T ) −→ D ⊆
1,p W (0, b) is completely continuous. w
PROOF: Suppose that vn −→ v in Lp (T ). We need to show that L(vn ) −→ L(v) in Lp (T ). We set xn = L(vn ) for all n ≥ 1. Then xn ∈ D and U (xn ) + K(xn ) = vn , ⇒ U (xn ), xn p + K(xn ), xn p = vn , xn p .
(5.121)
Because of the integration by parts formula, we have U (xn ), xn p = −|xn (b)|p−2 xn (b)xn (b) + |xn (0)|p−2 xn (0)xn (0) + xn pp . (5.122)
396
5 Boundary Value Problems–Hamiltonian Systems
Because xn ∈ D, we have xn (0) ∈ ξ1 xn (0) and −xn (b) ∈ ξ2 xn (b) for all n ≥ 1. Then recalling that (0, 0) ∈ Gr ξi , i = 1, 2, we have xn (0)xn (0) ≥ 0 ⇒ ⇒
− xn (b)xn (b) ≥ 0,
and
−|xn (b)|p−2 xn (b)xn (b) ≥ U (xn ), xn p ≥ xn pp
0
and
|xn (0)|p−2 xn (0)xn (0) ≥ 0
(see (5.122)).
(5.123)
Using (5.123) in (5.121), we obtain xn pp + xn pp = xn p ≤ vn p xn p ⇒ xn p−1 ≤ c3
for some c3 > 0 and all n ≥ 1, (0, b) is bounded. ⇒ {xn }n≥1 ⊆ W
w Therefore we may assume that xn −→ x in W 1,p (0, b) and xn −→ x in C(T ). Also directly from the equation U (xn ) + K(xn ) = vn for all n ≥ 1, it follows that p−2
|xn | xn n≥1 ⊆W 1,p (0, b) is bounded. So by passing to a suitable subsequence if necessary, we may assume that
w (5.124) |xn |p−2 xn −→ h in W 1,p (0, b) and |xn |p−2 xn −→ h in C(T ). 1,p
If ϑp : R −→ R is the homeomorphism |y|p−2 y ϑp (y) = 0
if y = 0 if y = 0
(see also the proof of Proposition 5.2.17), then
ϑ−1 |xn (t)|p−2 xn (t) −→ ϑ−1 h(t) for all t ∈ T, p p −1
⇒ xn (t) −→ ϑp h(t) for all t ∈ T, −1
p ⇒ xn −→ ϑp h(·) in L (T ) (by the dominated convergence theorem). But recall that xn −→ x in Lp (T ). So we infer that
x = ϑ−1 h(·) , p
⇒ h = ϑp |x (·)|p−2 x (·) , w
⇒ |xn |p−2 xn −→ |x |p−2 x ⇒ xn −→ x ⇒ xn −→ x
in C(T )
(see (5.124)),
in C(T ),
in W 1,p (0, b) .
This proves the complete continuity of the operator L.
We introduce the order interval
E=[ψ, ϕ]={x ∈ W 1,p (0, b) : ψ(t) ≤ x(t) ≤ ϕ(t) for all t ∈ T }.
Also let τ : W 1,p (0, b) −→W 1,p (0, b) , be the truncation operator defined by ⎧ ⎪ if x(t) < ψ(t) ⎨ ψ(t) τ (x)(t) = x(t) if ψ(t) ≤ x(t) ≤ ϕ(t) . ⎪ ⎩ ϕ(t) if ϕ(t) ≤ x(t)
5.2 Method of Upper–Lower Solutions
397
Clearly τ is continuous and bounded (i.e., maps bounded sets to bounded ones). Given w ∈ E, we consider the following auxiliary boundary value problem: ⎧
⎫
⎨ − |x (t)|p−2 x (t) =f1 t, x(t), x (t) + β t, x(t) −M τ (x)(t) + ϑ w(t) ⎬ . + M w(t) a.e. on T,
⎩ ⎭ x (0) ∈ ξ1 ψ(0) , −x (b) ∈ ξ2 ψ(b) (5.125) PROPOSITION 5.2.19 If hypotheses H0 , H(f )2 , H(ξ), and H(ϑ) hold, then problem (5.125) has a solution x ∈ C 1 (T ) ∩ E.
PROOF: Let G1 : W 1,p (0, b) −→Lp (T ) be the nonlinear operator defined by G1 (x) = G(x) + K(x) − M τ (x) + ϑ(w) + M w
for all x ∈ W 1,p (0, b) .
From Proposition 5.2.16 and the continuity of
the operators K and τ , we infer that G1 is continuous. For every x ∈ W 1,p (0, b) , we have p−1 G1 (x)p ≤ γr p + b1/p max ϕp−1 + M b1/p max ϕ∞ , ψ∞ ∞ , ψ∞ + ϑ(w)p + M wp = M2 , M2 > 0. Recall that r = max M0 , ϕ∞ , ψ∞ . Set C = {g ∈ Lp (T ) : g p ≤ M2 }. Clearly G1 maps bounded sets to bounded ones and LG1 W 1,p (0, b) ⊆ L(E)
1,p which is relatively compact in (0, b) (see Proposition 5.2.18). Therefore we
W 1,p can find x ∈ D ⊆ W (0, b) such that x = LG1 (x), ⇒ U (x) + K(x) = G(x) + K(x) − M τ (x) + ϑ(w) + M w, ⇒ U (x) = G(x) − M τ (x) + ϑ(w) + M w, ⇒ x ∈ D ⊆ C 1 (T ) solves problem (5.125)). It remains to show that x ∈ E. Because ψ ∈ C 1 (T ) is a lower solution for problem (5.111), we have ) *
− |ψ (t)|p−2 ψ (t) ≤ f t, ψ(t), ψ (t) + ϑ ψ(t) a.e. on T, . ψ (0) ∈ ξ1 ψ(0) + R+ , −x (b) ∈ ξ2 ψ(b) + R+
(5.126)
Subtracting (5.126) from (5.125), we obtain
|ψ (t)|p−2 ψ (t) − |x (t)|p−2 x (t) ≥ f t, x(t), x (t) + β t, x(t) − M τ (x)(t)
+ ϑ w(t) + M w(t) − f t, ψ(t), ψ (t)
− ϑ ψ(t) a.e. on T. (5.127)
We multiply (5.127) with (ψ − x)+ ∈ W 1,p (0, b) and then integrate on [0, b] the resulting inequality. We obtain
398
5 Boundary Value Problems–Hamiltonian Systems
|ψ (t)|p−2 ψ (t) − |x (t)|p−2 x (t) (ψ − x)+ (t)dt
b
0
b
b
f1 t, x(t), x (t) −f t, ψ(t), ψ (t) (ψ−x)+ (t)dt + β t, x(t) (ψ−x)+ (t)dt
≥ 0
0 b
ϑ w(t) +M w(t) − ϑ ψ(t) − M τ (x)(t) (ψ − x)+ (t)dt.
+ 0
(5.128) First we estimate the left-hand side in inequality (5.128). Performing integration by parts, we obtain
b
|ψ (t)|p−2 ψ (t) − |x (t)|p−2 x (t) (ψ − x)+ (t)dt 0 = |ψ (b)|p−2 ψ (b) − |x (b)|p−2 x (b) (ψ − x)+ (b) − |ψ (0)|p−2 ψ (0) − |x (0)|p−2 x (0) (ψ − x)+ (0)
b
(5.129) − |ψ (t)|p−2 ψ (t)−|x (t)|p−2 x (t) (ψ − x)+ (t)dt. 0
Recall that
+
(ψ − x)
(t) =
(ψ − x)(t) 0
if ψ(t) > x(t) . if ψ(t) ≤ x(t)
(5.130)
Also from the boundary conditions in (5.125) and (5.126), we have
−x (b) ∈ ξ2 x(b) and − ψ (b) ∈ ξ2 ψ(b) + eb with eb ≥ 0. If ψ(b) ≥ x(b), then of ξ2 (see hypothesis H(ξ)), we have the monotonicity
from ψ (b) ≤ x (b) ⇒ ϑp ψ (b) ≤ ϑp x (b) ⇒ |ψ (b)|p−2 ψ (b) ≤ |x (b)|p−2 x (b). So it follows that |ψ (b)|p−2 ψ (b) − |x (b)|p−2 x (b) (ψ − x)+ (b) ≤ 0. (5.131)
In asimilar fashion using the boundary conditions x (0) ∈ ξ1 x(0) and ψ (0) ∈ ξ1 ψ(0) + e0 with e0 ≥ 0, we obtain that |ψ (0)|p−2 ψ (0) − |x (0)|p−2 x (0) (ψ − x)+ (0) ≥ 0. (5.132) Also we have
=
|ψ (t)|p−2 ψ (t)−|x (t)|p−2 x (t) (ψ − x)+ (t)dt
b
0
|ψ (t)|p−2 ψ (t) − |x (t)|p−2 x (t) (ψ − x )(t)dt ≥ 0.
(5.133)
{ψ>x}
Using (5.131) through (5.133) in (5.129), we see that
b
|ψ (t)|p−2 ψ (t) − |x (t)|p−2 x (t) (ψ − x)+ (t)dt ≤ 0. 0
(5.134)
5.2 Method of Upper–Lower Solutions
399
Next we estimate the right-hand side of (5.128). We have
f1 t, x(t), x (t) −f t, ψ(t), ψ (t) = f t, ψ(t), ψ (t) −f t, ψ(t), ψ (t) = 0
b
⇒
a.e. on {ψ ≥ x},
f1 t, x(t), x (t) − f t, ψ(t), ψ (t) (ψ − x)+ (t)dt = 0.
(5.135)
0
Also from the definition of the penalty function β (see (5.113)), if |{ψ > x}| > 0 (by | · | we denote the Lebesgue measure on R), then
b
β t, x(t) (ψ − x)+ (t)dt
0
= |ψ(t)|p−2 ψ(t)−|x(t)|p−2 x(t) (ψ − x)(t)dt > 0. (5.136) {ψ>x}
Finally by virtue of hypothesis H(ϑ) and since w ∈ E, we see that
b
ϑ w(t) + M w(t) − ϑ ψ(t) − M τ (x)(t) (ψ − x)+ (t)dt ≥ 0.
(5.137)
0
Using (5.135) through (5.137), we have
b
b
f t, x(t), x (t) −f t, ψ(t), ψ (t) (ψ−x)+ (t)dt+ β t, x(t) (ψ−x)+ (t)dt 0
0
b
+ ϑ w(t) +M w(t) − ϑ ψ(t) −M τ (x)(t) (ψ−x)+ (t)dt > 0,
(5.138)
0
provided |{ψ > x}| > 0. Returning to (5.128) and using (5.134) and (5.138), we reach a contradiction when |{ψ > x}| > 0. So it follows that ψ(t) ≤ x(t) for all t ∈ T . In a similar fashion we show that x(t) ≤ ϕ(t) for all t ∈ T ; that is, x ∈ E. We use the solvability of the auxiliary problem (5.125) in order to produce a solution for the original problem (5.111). To do this, we need a fixed point theorem for multifunctions in an ordered Banach space. More on the fixed point theory for multifunctions can be found in Section 6.5. THEOREM 5.2.20 If X is a separable, reflexive, ordered Banach space, E ⊆ X is a nonempty and weakly closed set, and S :E −→ 2E \{∅} is a multifunction with weakly closed values, S(E) is bounded and (i) The set M ={x ∈ E : x ≤ y for some y ∈ S(x)} is nonempty. (ii) If x1 ≤ y1 , y1 ∈ S(x1 ) and x1 ≤ x2 , then we can find y2 ∈ S(x2 ) such that y 1 ≤ y2 , then S has a fixed point; that is, there exists x ∈ E such that x ∈ S(x). We apply this theorem, using the
following data. The separable, reflexive, ordered Banach space X = W 1,p (0, b) , E = [ψ, φ] and S:E−→2E \ {∅} is the solution multifunction for the auxiliary problem (5.125); that is, for every w ∈ E, S(w) is the set of solutions for problem (5.125). From Proposition 5.2.19, we know that S(w) = ∅ and S(w) ⊆ E.
400
5 Boundary Value Problems–Hamiltonian Systems
THEOREM 5.2.21 If hypotheses H(f )2 , H(ξ), and H(ϑ) hold, then problem (5.111) has a solution x ∈ C 1 (T ).
PROOF: Let X = W 1,p (0, b) , E = [ψ, φ] ⊆ W 1,p (0, b) and S : E −→ 2E \{∅} the solution multifunction for problem (5.125). From Proposition 5.2.19 we know that for every w ∈ E, S(w) = ∅ and
S(w) ⊆ E. Moreover, it is routine to check that S(w) is weakly closed in W 1,p (0, b) . In addition, from the proof of Proposition 5.2.19 we know that S(E) ⊆ W 1,p (0, b) is bounded. Therefore it remains to verify conditions (i) and (ii) of Theorem 5.2.20. Note that if w = ψ, then by Proposition 5.2.19, S(ψ) = ∅ and S(ψ) ⊆ E. So we have satisfied condition (i) of Theorem 5.2.20. Next we verify condition (ii) of Theorem 5.2.20. So let w1 , w2 ∈ E, w1 ≤ w2 , and x1 ∈S(w1 ) with w1 ≤x1 . Because x1 ∈ S(w1 )⊆E, we have β t, x1 (t) = 0 for all t ∈ T (see (5.113)) and τ (x1 ) = x1 . Also note that if pM0 :R−→R is the truncation function defined by ⎧ ⎪ if y < −M0 ⎨ −M0 pM0 (y) = y if − M0 ≤ y ≤ M0 , ⎪ ⎩ M0 if M0 < y then from the definition of the function u (see (5.112)), we have
u t, x1 (t), x1 (t) = τ (x1 )(t), pM0 τ (x1 ) (t) = x1 (t), pM0 x1 (t) and so it follows that
f1 t, x1 (t), x1 (t) = f t, x1 (t), x1 (t)
for all t ∈ T.
Therefore the auxiliary problem (5.125) becomes ⎫ ⎧
− |x1 (t)|p−2 x1 (t) = f t, x1 (t), x1 (t) + ϑ w1 (t) + M w1 (t) − M x1 (t) ⎪ ⎪ ⎪ ⎪ ⎬ ⎨ a.e. on T, . ⎪ ⎪ ⎪ ⎪
⎭ ⎩ x1 (0) ∈ ξ1 x1 (0) , −x1 (b) ∈ ξ2 x1 (b) (5.139) Because w1 ≤ w2 , by virtue of hypothesis H(ϑ), we have
ϑ w1 (t) + M w1 (t) ≤ ϑ w2 (t) + M w2 (t) for all t ∈ T. (5.140) Using (5.141) in (5.140), it follows that x1 ∈C 1 (T ) is a lower solution for the problem ⎫ ⎧
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ w2 (t) + M w2 (t) ⎪ ⎪ ⎪ ⎪ ⎬ ⎨ −M x(t) a.e. on T, (5.141) ⎪ ⎪ ⎪ ⎪
⎭ ⎩ x (0) ∈ ξ1 x(0) , −x (b) ∈ ξ2 x(b) (see Definition 5.2.11(b)). Also because ϕ ∈ C 1 (T ) is an upper solution for problem (5.111), we have ⎧ ⎫
⎨ − |ϕ (t)|p−2 ϕ (t) ≥ f t, ϕ(t), ϕ (t) + ϑ ϕ(t) a.e. on T, ⎬ . (5.142)
⎩ ⎭ ϕ (0) ∈ ξ1 ϕ(0) − R+ , −ϕ (b) ∈ ξ2 ϕ(b) − R+
5.2 Method of Upper–Lower Solutions
401
But f t, ϕ(t), ϕ (t) + ϑ ϕ(t) + M ϕ(t) ≥ f t, ϕ(t), ϕ (t) + ϑ w2 (t) + M w2 (t) a.e. on T (see hypothesis H(ϑ) and recall that w2 ∈ E). Using this inequality in (5.142), we obtain ⎧ ⎫
− |ϕ (t)|p−2 ϕ (t) ≥ f t, ϕ(t), ϕ (t) + ϑ w2 (t) + M w2 (t) ⎪ ⎪ ⎪ ⎪ ⎨ ⎬ −M ϕ(t) a.e. on T, . ⎪ ⎪ ⎪ ⎪
⎩ ⎭ ϕ (0) ∈ ξ1 ϕ(0) − R+ , −ϕ (b) ∈ ξ2 ϕ(b) − R+ This means that ϕ ∈ C 1 (T ) is an upper solution for problem (5.142) (see Definition 5.2.11(a)). Then for problem (5.142), using truncation and penalization techniques based on the ordered upper–lower solution pair {ϕ, x1 }, as in Proposition 5.2.19 we obtain a solution x2 ∈ E1 = [x1 , ϕ] for problem (5.142). Evidently x2 ∈ S(w2 ) and x1 ≤ x2 . So we have satisfied condition (ii) of Theorem 5.2.20. Therefore we can apply that theorem and obtain x ∈ E such that x ∈ S(x). Clearly x ∈ C 1 (T ) solves problem (5.111). Next we establish the existence of a greatest and of a smallest solution in the order interval E.These solutions are called extremal solutions. So let C1 = x ∈ C 1 (T ) : x be a solution of (5.111) and x ∈ E . On L∞ (T ) we consider the partial order structure induced by the order cone L∞ (T )+ = {x ∈ L∞ (T ) : x(t) ≥ 0 a.e. on T }. So x ≤ y in L∞ (T ) if and only if x(t) ≤ y(t) a.e. on T . From Theorem 5.2.21, we know that under hypotheses H0 , H(f )2 , H(ξ), and H(ϑ), the set C1 is nonempty. Recall that a set C in a partially ordered set is a chain (or totally ordered subset), if for every x, y ∈ C, either x ≤ y or y ≤ x. PROPOSITION 5.2.22 If hypotheses H0 , H(f )2 , H(ξ), and H(ϑ) hold, then every chain C in C1 has an upper bound. PROOF: Because C ⊆ L∞ (T ) is bounded and L∞ (T ) is a complete lattice, we can find {xn }n≥1 ⊆ C such that sup xn = sup C ∈ L∞ (T ). Moreover, because n≥1
of the lattice structure of L∞ (T ), we can assume that the sequence {xn }n≥1 is increasing. Invoking the monotone convergence theorem, we have that xn −→ x in Lp (T ). Because of Lemma 5.2.15 and H(f )2 (iv) and H(ϑ), we have |xn (t)|p−2 xn (t) ≤ γ(t) a.e. on T, for all n ≥ 1 with γ ∈ Lp (T ),
⇒ |xn |p−2 xn n≥1 ⊆ W 1,p (0, b) is bounded. Also
xn ∞ , xn ∞ ≤ r=max{M1 , ϕ∞ , ψ∞ } for all n ≥ 1, hence {xn }n≥1 ⊆ W (0, b) is bounded. Therefore we may assume that 1,p
|xn |p−2 xn −→ v w
in W 1,p (0, b)
and
w
xn −→ u
in W 1,p (0, b) .
Evidently u = x and as in proof of Proposition 5.2.18, we can check that v = |x |p−2 x . Note that xn −→ x in C(T ) and |xn |p−2 xn −→ |x |p−2 x in C(T ). So we have xn (t) −→ x(t) and xn (t) −→ x (t) for all t ∈ T . From the dominated convergence theorem, we have
f ·, xn (·), xn (·) −→ f ·, x(·), x (·) in Lp (T ).
402
5 Boundary Value Problems–Hamiltonian Systems
Also from the monotone convergence theorem we have
ϑ(xn ) + M xn −→ ϑ(x) + M x in Lp (T ).
Because M xn −→M x in C(T ), it follows that ϑ(xn )−→ ϑ(x) in Lp (T ). Therefore in the limit as n → ∞, we have
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T. Moreover, exploiting the fact that Gr ξ1 , Gr ξ2 are closed in R × R, we also have
x (0) ∈ ξ1 x(0) and − x (b) ∈ ξ2 x(b) . Therefore x ∈ C 1 (T ) is a solution of problem (5.111) and in addition x ∈ E. Hence x ∈ C1 and clearly is an upper bound of C. Recall that if (C0 , ≤) is a partially ordered set, we say that C0 is directed, if for each pair u1 , u2 ∈ C0 , we can find u3 ∈ C0 such that u1 ≤ u3 and u2 ≤ u3 . PROPOSITION 5.2.23 If hypotheses
H0 , H(f )2 , H(ξ), and H(ϑ) hold, then C1 ⊆ W 1,p (0, b) is directed, when W 1,p (0, b) is endowed with the pointwise order (order induced by C(T )). + PROOF: Let
x1 , x2 ∈C1 and set x3 =max{x1 , x2 }. We have x3 =(x1 −x2 ) +x2 and so x3 ∈ W 1,p (0, b) . We introduce the following truncation and penalty functions. ⎧
⎪ x3 (t), x3 (t) if x < x3 (t) ⎪ ⎪
⎪ ⎪ ⎪ (t) if x > ϕ(t) ϕ(t), ϕ ⎨ u0 (t, x, y)= (x, M1 ) if x3 (t) ≤ x ≤ ϕ(t), y > M1 ⎪ ⎪ ⎪ (x, −M1 ) if x3 (t) ≤ x ≤ ϕ(t), y < −M1 ⎪ ⎪ ⎪ ⎩ (x, y) if otherwise
and
⎧ p−2 ⎪ ϕ(t) − |x|p−2 x ⎨ |ϕ(t)| β0 (t, x)= 0 ⎪ ⎩ |x3 (t)|p−2 x3 (t) − |x|p−2 x
if x > ϕ(t) if x3 (t) ≤ x ≤ ϕ(t) . if x < x3 (t).
Then we introduce the following modification of the nonlinearity f ,
f0 (t, x, y) = f t, u0 (t, x, y) . Note that for almost all t ∈ T , all x ∈ [x3 (t), ϕ(t)], and all |y| ≤ M1 we have f0 (t, x, y) = f (t, x, y). Moreover, for almost all t ∈ T and all x, y ∈ R, |f0 (t, x, y)| ≤ γr (t)
with γr ∈ Lp (T ),
r = max{M1 , ϕ∞ , ψ∞ }.
Also we introduce the truncation operator τ0 : W 1,p (0, b) −→W 1,p (0, b) defined by
5.2 Method of Upper–Lower Solutions 403 ⎧ ⎪ if x(t) < x3 (t) ⎨x3 (t) τ (x)(t)= x(t) if x3 (t) ≤ x(t) ≤ ϕ(t) . ⎪ ⎩ ϕ(t) if ϕ(t) < x(t)
1,p (0, b) : x3 (t) ≤ x(t) ≤ ϕ(t) for all t ∈ T . Given Let E0 = [x3 , ϕ] = x ∈ W w ∈ E0 , we consider the following auxiliary problem ⎧ ⎫
− |x (t)|p−2 x (t) = f0 t, x(t),x (t) + β0 t, x(t) − M τ0 (x)(t) ⎪ ⎪ ⎪ ⎪ ⎨ ⎬ +ϑ w(t) + M w(t) a.e. on T, . (5.143) ⎪ ⎪ ⎪ ⎪
⎩ ⎭ x (0) ∈ ξ1 x(0) , −x (b) ∈ ξ2 x(b) Following the reasoning of Proposition 5.2.19, we can show that problem (5.143) 1 has a solution x ∈ C (T ) ∩ E0 . So we can define the solution multifunction S:E0 −→ 1,p
(0,b) \{∅} and then as in Theorem 5.2.21, via the use of Theorem 5.2.20 we 2W can produce x ∈ C 1 (T ) ∩ E0 such that it solves problem (5.111). Therefore x ∈ C1 and clearly x1 ≤ x, x2 ≤ x which proves that the set C1 is directed.
Now we use this proposition to establish the existence of extremal solutions for problem (5.111) in the order interval E = [ψ, ϕ]. THEOREM 5.2.24 If hypotheses H0 , H(f )2 , H(ξ), and H(ϑ) hold, then problem (5.111) has extremal solutions in the order interval E = [ψ, ϕ]. PROOF: By virtue of Proposition 5.2.22 and Zorn’s we can find x∗ ∈C1 a
lemma, maximal element for the pointwise ordering on W 1,p (0, b) . If x ∈ C1 , then because of Proposition 5.2.23, we can find y∈C1 such that x ≤ y, x∗ ≤ y. Because x∗ ∈ C1 is maximal, we must have x∗ = y. Then x ≤ x∗ and because x∈C1 was arbitrary, it follows that x∗ ∈ C1is the greatest element of C1 . If on W 1,p (0, b) we use the partial order ≤1 defined by x ≤1 y
if and only if y(t) ≤ x(t)
a.e. on T,
then by the same argument we produce x∗ ∈ C1 , which is the smallest element of C1 . Hence {x∗ , x∗ } are the extremal solutions of (5.111) in E = [ψ, ϕ]. The framework of problem (5.111) is general and it incorporates as special cases standard boundary value problems. EXAMPLE 5.2.25 (a) Let I1 , I2 ⊆ R be nonempty closed intervals containing zero. For i = 1, 2, we set 0 if x ∈ Ii (the indicator function of Ii ). iIi (x)= +∞ if otherwise Set ξi = ∂iIi , i = 1, 2 (the subdifferential in the sense of convex analysis. Then problem (5.111) becomes ⎧ ⎫
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T, ⎪ ⎪ ⎪ ⎪ ⎨ ⎬ x(0) ∈ I1 , x(b) ∈ I2 . (5.144) ⎪ ⎪ x (0)x(0) = sup[vx (0) : v ∈ I1 ] ⎪ ⎪ ⎩ ⎭ −x (b)x(b) = sup[−wx (b) : w ∈ I2 ]
404
5 Boundary Value Problems–Hamiltonian Systems
(b) If I1 = I2 = {0}, then ξi (x) = R for all x ∈ R and for i = 1, 2. Hence problem (5.144) is the classical Dirichlet problem. (c) If I1 = I2 = R, then ξi (x) = {0} for all x ∈ R and for i = 1, 2. Hence problem (5.144) is the classical Neumann problem. (d) If ξ1 (x) = (1/β)x and ξ2 (x) = (1/γ)x, β, γ > 0, then problem (5.111) becomes the following Sturm–Liouville problem. ) *
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T, . x(0) − βx (0) = 0, x(b) + γx (b) = 0 (e) If u1 , u2 : R −→ R are two contractions and ξ1 = u1 − id, ξ2 = u2 − id (both maximal monotone functions), then problem (5.111) becomes * )
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T,
. x(0) + x (0) = u1 x(0) , x(b) − x (b) = u2 x(b) REMARK 5.2.26 The boundary conditions of problem (5.111) do not include the periodic ones. However, a careful analysis of the methods of the proof, reveals that they are also valid for the periodic problem. So Theorems 5.2.21 and 5.2.24 are also true for the periodic problem ) *
− |x (t)|p−2 x (t) = f t, x(t), x (t) + ϑ x(t) a.e. on T, . (5.145) x(0) = x(b), x (0) = x (b)
5.3 Degree-Theoretic Methods In this section we use degree theoretic methods to study second-order nonlinear boundary value problems. We consider two examples. The first is a nonlinear elliptic equation driven by the p-Laplacian with Dirichlet boundary conditions. Using degree-theoretic methods based on the degree map for operators of monotone type (see Section 3.3), we establish the existence of multiple nontrivial solutions. In the second example, we deal with a second-order scalar equation driven by the ordinary p-Laplacian with periodic boundary conditions and using the Leray–Schauder degree, we produce solutions under conditions of nonuniform nonresonance and of Landesman–Lazer type for the nonlinearity. We start with the elliptic Dirichlet boundary value problem. So let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z. The problem under consideration is the following.
⎧ ⎫ p−2 Dx(z) = f z, x(z) a.e. on Z, ⎬ ⎨ −div Dx(z) . (5.146) ⎩ ⎭ x ∂Z = 0, 1 < p < ∞ The hypotheses on the nonlinearity f (z, x) are the following. H(f )1 : f : Z ×R −→ R is a function such that f (z, 0) = 0 a.e. on Z, f (z, x) ≥ 0 a.e. on Z for all x ≥ 0 and
5.3 Degree-Theoretic Methods
405
(i) For all x ∈ R, z −→ f (z, x) is measurable. (ii) For almost all z ∈ Z, x −→ f (z, x) is continuous. (iii) For almost all z ∈ Z and all x ∈ R, we have |f (z, x)| ≤ α(z) + c|x|p−1
with α ∈ L∞ (Z)+ , c>0.
(iv) There exists ϑ ∈ L∞ (Z)+ such that ϑ(z) ≤ λ1 a.e. on Z with strict inequality on a set of positive measure and f (z, x) ≤ ϑ(z) xp−1
lim sup x→+∞
uniformly for a.a. z ∈ Z,
and there exist η1 , η2 ∈ L∞ (Z)+ such that η1 (z) ≥ λ1 a.e. on Z with strict inequality on a set of positive measure and η1 (z) ≤ lim inf
x→−∞
f (z, x) f (z, x) ≤ lim sup p−2 ≤ η2 (z) |x|p−2 x x x→−∞ |x|
uniformly for a.a. z ∈ Z. (v) There exist η, η ∈ L∞ (Z)+ such that η(z) ≥ λ1 a.e. on Z with strict inequality on a set of positive measure and η(z) ≤ lim inf x→0+
and lim
x−→0−
f (z, x) f (z, x) ≤ lim sup p−1 ≤ η(z) xp−1 x + x→0
f (z,x) |x|p−2 x
=0
uniformly for a.a. z ∈ Z
uniformly for a.a. z ∈ Z.
REMARK 5.3.1 Hypotheses H(f )1 (iv) and (v) are nonresonance conditions at +∞ and 0+ , respectively, with respect to λ1 > 0 the principal eigenvalue of
−p , W01,p (Z) . In both conditions we allow partial interaction with λ1 > 0 (nonuniform nonresonance). Note that when p = 2, these conditions incorporate in the example the so-called asymptotically linear problems. Let τ+ : R −→ R+ be the positive truncation function defined by 0 if x ≤ 0 . τ+ (x) = x if x ≥ 0
x We set f+ (z, x) = f z, τ+ (x) and F+ (z, x) = 0 f+ (z, r)dr. Then we produce the energy functional ϕ+ : W01,p (Z) −→ R defined by
1 ϕ+ (x) = Dxpp − F+ z, x(z) dz for all x ∈ W01,p (Z). p Z
We know that ϕ+ ∈ C 1 W01,p (Z) . Also ϕ(x) = (1/p)Dxpp − Z F z, x(z) dz for all x∈W01,p (Z). Again ϕ∈C 1 (W01,p (Z)). PROPOSITION 5.3.2 If H(f )1 hold, then there exists x0 ∈ int C01 (Z) a local minimizer of ϕ.
406
5 Boundary Value Problems–Hamiltonian Systems
PROOF: By virtue of hypothesis H(f )1 (iv), given ε > 0, we can find M =M (ε) > 0 such that
f+ (z, x) ≤ ϑ(z) + ε xp−1 for a.a. z ∈ Z and all x ≥ M. (5.147) On the other hand from hypothesis H(f )1 (iii) and because f+ (z, x)=0 for a.a. z ∈ Z and all x ≤ 0, we have f+ (z, x) ≤ αε (z)
for a.a. z ∈ Z all x ≤ M, with αε ∈ L∞ (Z)+ .
Combining (5.147) and (5.148), we obtain
f+ (z, x) ≤ ϑ(z) + ε |x|p−1 + αε (z) 1
⇒ F+ (z, x) ≤ ϑ(z) + ε |x|p + αε (z)|x| p
(5.148)
for a.a. z ∈ Z, all x ∈ R, for a.a. z ∈ Z, all x ∈ R. (5.149)
Then for all x ∈ W01,p (Z) we have
1 F+ z, x(z) dz ϕ+ (x) = Dxpp − p Z
1 1 ε p ≥ Dxp − ϑ(z)|x(z)|p − xpp − c1 Dxp p p Z p for some c1 > 0, (see (5.149)) ε 1 ξ− Dxpp − c1 Dxp ≥ p λ1
(5.150)
(see Lemma 5.1.3 and Theorem 4.3.47). We choose ε < λ1 ξ. Then from (5.150) it follows that ϕ+ is coercive. Exploiting the compact embedding of W01,p (Z) into Lp (Z), we can see that ϕ+ is weakly lower semicontinuous. So we can find x0 ∈ W01,p (Z) such that ϕ+ (x0 ) = inf ϕ+ (x) : x ∈ W01,p (Z) . Using hypothesis H(f )1 (v), we see that ϕ+ (tu1 ) < 0 for t > 0 small. Hence m = ϕ+ (x0 ) < 0 = ϕ+ (0) and so x0 = 0. As before from nonlinear regularity theory and the nonlinear maximum principle (see Theorems 4.3.35 and 4.3.37), we obtain that x0 ∈ int C01 (Z). Then by virtue of Theorem 5.1.27, it follows that x0 is a local minimizer of ϕ. In what follows by N we denote the Nemitsky operator corresponding to the Carath´eodory function f (z, x); that is,
N (x)(·) = f ·, x(·) for all x ∈ Lp (Z).
We know that N :Lp (Z)−→Lp (Z), (1/p)+(1/p ) = 1 is continuous and bounded (see hypotheses H(f )1 (i), (ii), and (iii)). Also let A : W01,p (Z) −→ W −1,p (Z) = 1,p ∗ W0 (Z) be the nonlinear operator defined by
Dxp−2 (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). A(x), y = Z
Here by ·, · we denote the duality brackets for the pair W01,p (Z), W −1,p (Z) . We know that A is of type (S)+ (see Proposition 4.3.41). Due to the compact
5.3 Degree-Theoretic Methods
407
embedding of W01,p (Z) into Lp (Z), we can see that N is completely continuous. So it follows at once that the operator x −→ (A − N )(x) is of (S)+ –type and we can consider the degree in the sense of Definition 3.3.63. PROPOSITION 5.3.3 If hypotheses H(f )1 hold, then there exists R0 > 0 such that for all R ≥ R0 we have d(S)+ (A − N, BR , 0) = 0.
PROOF: Let K− : W01,p (Z) −→ W −1,p (Z) be defined by
$$K_-(x)(\cdot) = \big(x^-(\cdot)\big)^{p-1} \quad\text{for all } x \in W_0^{1,p}(Z).$$
Fix h ∈ L∞ (Z)+ , h(z) ≥ λ1 a.e. on Z with strict inequality on a set of positive measure. We consider the (S)+ -homotopy h1 : [0, 1] × W01,p (Z) −→ W −1,p (Z) defined by h1 (t, x) = A(x) − tN (x) + (1 − t)hK− (x). Claim: There exists R0 > 0 such that for all t ∈ [0, 1], all x ∈ ∂BR (0) and all R ≥ R0 , 0 = h1 (t, x). Suppose that the claim is not true. Then we can find {tn }n≥1 ⊆ [0, 1] and xn ⊆ W01,p (Z) such that tn −→ t
in $[0,1]$, $\|x_n\| \to \infty$, and
$$A(x_n) = t_n N(x_n) - (1 - t_n) h K_-(x_n) \quad\text{for all } n \ge 1.$$
Acting with the test function $x_n^+$ and using hypothesis H(f)₁(iv), we obtain that $\{x_n^+\}_{n\ge1} \subseteq W_0^{1,p}(Z)$ is bounded. Then we must have $\|x_n^-\| \to \infty$. We set $y_n = x_n^-/\|x_n^-\|$ and we may assume that
$$y_n \xrightarrow{\,w\,} y \ \text{in } W_0^{1,p}(Z), \quad y_n \to y \ \text{in } L^p(Z), \quad y_n(z) \to y(z) \ \text{a.e. on } Z,$$
$$|y_n(z)| \le k(z) \ \text{a.e. on } Z \ \text{for all } n \ge 1, \ \text{with } k \in L^p(Z)_+.$$
We have
$$\frac{1}{\|x_n^-\|^{p-1}}A(x_n^+) - A(y_n) = t_n\frac{N(x_n)}{\|x_n^-\|^{p-1}} - (1 - t_n)h\,y_n^{p-1}.$$
Acting with $y_n - y$ and passing to the limit as $n \to \infty$, we obtain $\lim\langle A(y_n), y_n - y\rangle = 0$. Because $A$ is of type $(S)_+$, it follows that $y_n \to y$ in $W_0^{1,p}(Z)$ and so $\|y\| = 1$. Also, arguing as in the proof of Proposition 5.1.2, we see that
$$\frac{N(x_n)}{\|x_n^-\|^{p-1}} \xrightarrow{\,w\,} \overline{h} \quad\text{in } L^{p'}(Z),$$
with $\overline{h} = -\overline{g}\,y^{p-1}$, where $\overline{g} \in L^\infty(Z)_+$, $\eta_1(z) \le \overline{g}(z) \le \eta_2(z)$ a.e. on $Z$. Hence
$$A(y) = \hat{g}\,y^{p-1}$$
(5.151)
with $\hat{g} = t\overline{g} + (1-t)h$. From (5.151) we have
$$\begin{cases} -\operatorname{div}\big(\|Dy(z)\|^{p-2}Dy(z)\big) = \hat{g}(z)|y(z)|^{p-2}y(z) & \text{a.e. on } Z,\\ y|_{\partial Z} = 0. \end{cases} \tag{5.152}$$
Note that $1 = \lambda_1(\lambda_1) > \lambda_1(\hat{g})$ and so from (5.152) it follows that $y$ must change sign, a contradiction. This proves the claim. So we have
$$d_{(S)_+}(A - N, B_R, 0) = d_{(S)_+}(A + hK_-, B_R, 0)$$
for all R ≥ R0 .
(5.153)
But from Godoy–Gossez–Paczka [268], we have that d(S)+ (A + hK− , BR , 0) = 0
for all R > 0.
(5.154)
Using (5.154) in (5.153), we conclude that d(S)+ (A − N, BR , 0) = 0
for all R ≥ R0 .
Next we prove an analogous result for small balls. PROPOSITION 5.3.4 If hypotheses H(f )1 hold, then there exists 0 > 0 such that d(S)+ (A − N+ , B , 0) = 0 for all 0 < ≤ 0 .
PROOF: In what follows, $K_+ : L^p(Z) \to L^{p'}(Z)$ is the continuous, monotone (hence maximal monotone too) map defined by
$$K_+(x)(\cdot) = \big(x^+(\cdot)\big)^{p-1}.$$
Let h3 : [0, 1] × W01,p (Z) −→ W −1,p (Z) be the homotopy defined by h3 (t, x) = A(x) − (1 − t)ηK+ (x) − tN (x), where η ∈ L∞ (Z)+ is as in hypothesis H(f )1 (v). Because of the complete continuity of the maps K+ , N: W01,p (Z) −→ W −1,p (Z), we can easily verify that h3 (t, x) is a homotopy of type (S)+ . Claim: There exists 0 > 0 such that for all t ∈ [0, 1], all x ∈ ∂B (0) and all 0< ≤ 0 , x = h3 (t, x). As in proof of Proposition 5.3.3, we argue indirectly. So suppose that the claim is not true. Then we can find {tn }n≥1 ⊆ [0, 1] and {xn }n≥1 ⊆W01,p (Z) such that tn −→ t
in [0, 1], xn −→ 0 as n → ∞ and h3 (tn , xn ) = 0
for all n ≥ 1.
From the last equality, we have A(xn ) = (1 − tn )ηK+ (xn ) + tn N (xn )
for all n ≥ 1.
We set yn =xn /xn , n ≥ 1. By passing to a suitable subsequence if necessary, we may assume that
$$y_n \xrightarrow{\,w\,} y \ \text{in } W_0^{1,p}(Z) \quad\text{and}\quad y_n \to y \ \text{in } L^p(Z) \quad\text{as } n \to \infty.$$
We have
$$A(y_n) = (1 - t_n)\eta K_+(y_n) + t_n\frac{N(x_n)}{\|x_n\|^{p-1}} \quad\text{for all } n \ge 1. \tag{5.155}$$
By virtue of hypotheses H(f)₁, we deduce that
$$|f(z,x)| \le c_1|x|^{p-1} \quad\text{for a.a. } z \in Z, \text{ all } x \in \mathbb{R}, \text{ with } c_1 > 0. \tag{5.156}$$
From (5.156) we infer that
$$\int_Z \Big|\frac{N(x_n)}{\|x_n\|^{p-1}}\Big|^{p/(p-1)} dz \le c_1^{p/(p-1)}\|y_n\|_p^p \quad\text{for all } n \ge 1,$$
$$\Rightarrow\ \Big\{\frac{N(x_n)}{\|x_n\|^{p-1}}\Big\}_{n\ge1} \subseteq L^{p'}(Z) \ \text{is bounded.} \tag{5.157}$$
By passing to a subsequence if necessary, we may assume that
$$\frac{N(x_n)}{\|x_n\|^{p-1}} \xrightarrow{\,w\,} h_0 \quad\text{in } L^{p'}(Z) \text{ as } n \to \infty, \text{ with } h_0 \in L^{p'}(Z). \tag{5.158}$$
As before, using hypothesis H(f)₁(v), we can check that
$$\eta(z)\big(y^+(z)\big)^{p-1} \le h_0(z) \le \overline{\eta}(z)\big(y^+(z)\big)^{p-1} \quad\text{a.e. on } Z,$$
$$\Rightarrow\ h_0(z) = g(z)\big(y^+(z)\big)^{p-1} \quad\text{a.e. on } Z,$$
with $g \in L^\infty(Z)_+$ such that $\eta(z) \le g(z) \le \overline{\eta}(z)$ a.e. on $Z$. Acting on (5.155) with the test function $y_n - y \in W_0^{1,p}(Z)$, we have
N (xn ) A(yn ), yn − y = (1 − tn )η(yn+ )p−1 + tn (yn − y)(z)dz −→ 0, xn p−1 Z as n → ∞, (see (5.156)). Because A is of type (S)+ , it follows that yn −→ y
in W01,p (Z).
(5.159)
Passing to the limit as n → ∞ in (5.155) and using (5.158) and (5.159), we obtain A(y) = (1 − t)ηK+ (y) + tgK+ (y).
Recall that +
Dy (z)=
Dy(z) 0
(5.160)
a.e. on {y > 0} . a.e. on {y ≤ 0}
So we can write that A(y + ) = σK+ (y),
(5.161)
where $\sigma \in L^\infty(Z)_+$ is given by $\sigma = (1-t)\eta + t g$. From (5.161) we have
$$\begin{cases} -\operatorname{div}\big(\|Dy(z)\|^{p-2}Dy(z)\big) = \sigma(z)|y(z)|^{p-2}y(z) & \text{a.e. on } Z,\\ y|_{\partial Z} = 0. \end{cases} \tag{5.162}$$
Once again, the monotonicity of the principal eigenvalue with respect to the weight function implies that $\lambda_1(\sigma) \le \lambda_1(\eta) < \lambda_1(\lambda_1) = 1$. Hence, because of (5.162), we see that $y^+ \in W_0^{1,p}(Z)$ cannot be the principal eigenfunction of $\big(-\Delta_p, W_0^{1,p}(Z), \sigma\big)$, and so it must change sign, which is a contradiction, unless $y^+ = 0$. Therefore $y \le 0$ and so $K_+(y) = 0$, from which we have
A(y) = 0
(see (5.161)),
⇒ y = 0, a contradiction to the fact that y = 1, see (5.160). This proves the claim. Then the homotopy invariance property implies that d(S)+ (A − N, B , 0) = d(S)+ (A − ηK+ , B , 0)
for all 0 < ≤ 0 .
(5.163)
As in the proof of Proposition 5.3.3, we have d(S)+ (A − ηK+ , B , 0) = 0.
(5.164)
From (5.163) and (5.164), we conclude that d(S)+ (A − N+ , B , 0) = 0
for all 0 < ≤ 0 .
Now we can prove the existence of multiple nontrivial solutions of constant sign for problem (5.146). THEOREM 5.3.5 If hypotheses H(f )1 hold, then problem (5.146) has at least two distinct nontrivial solutions x0 ∈ int C01 (Z)+
and
x1 ∈ C01 (Z).
PROOF: Let x0 ∈ W01,p (Z) be the local minimizer of ϕ obtained in Proposition 5.3.2. We have ϕ (x0 ) = 0, ⇒ A(x0 ) = N (x0 ). Assume that x0 is an isolated critical point of ϕ+ (otherwise we have infinitely many critical points of ϕ+ , hence infinitely many nontrivial positive solutions of (5.146)). So we can find r0 > 0 such that ϕ(x0 ) < ϕ(y) and ϕ (y) = 0 for all y ∈ B r0 (x0 ) \ {x0 }, (5.165) where B r0 (x0 ) = y ∈ W01,p (Z) : y − x0 ≤ r0 . We show that for all r ∈ (0, r0 ), the following property holds inf ϕ(y) : y ∈ B r0 (x0 ) \ Br (x0 ) > ϕ(x0 ). (5.166) We argue by contradiction. So suppose that there exists r > 0 and a sequence {xn }n≥1 ⊆ B r0 (x0 ) \ Br (x0 ) such that
$$\varphi(x_n) \downarrow \varphi(x_0) \quad\text{as } n \to \infty.$$
Evidently, we may assume that
$$x_n \xrightarrow{\,w\,} x \ \text{in } W_0^{1,p}(Z) \quad\text{and}\quad x_n \to x \ \text{in } L^p(Z) \ \text{as } n \to \infty. \tag{5.167}$$
Note that ϕ is weakly lower semicontinuous. So ϕ(x) ≤ lim inf ϕ(xn ) = ϕ(x0 ). n→∞
Because x ∈ B r0 (x0 ), it follows that ϕ(x)=ϕ(x0 ) and by virtue of (5.165), we have x = x0 . From the mean value theorem, we have x + x n 0 ∗ xn − x0 ϕ(x0 ) − ϕ = wn , , (5.168) 2 2 x0 where wn∗ =ϕ λn xn + (1 − λn ) xn + , λn ∈ [0, 1]. We know that 2 xn + x0 (5.169) + u∗n , wn∗ =A λn xn + (1 − λn ) 2 x0 , n ≥ 1. Using (5.169) in (5.168) and then passing to the limit where u∗n =N xn + 2 as n → ∞, we obtain xn + x0 , xn − x0 ≤ 0. lim sup A λn xn + (1 − λn ) 2 n→∞ We have
xn + x0 xn + x0 , λn xn + (1 − λn ) − x0 lim sup A λn xn + (1 − λn ) 2 2 n→∞ λ + 1 xn + x0 n lim sup lim sup A λn xn + (1 − λn ) ≤ 0. , xn − x0 2 2 n→∞ n→∞
Because A is of type (S)+ , it follows that λn xn + (1 − λn )
xn + x0 −→ x0 2
in W01,p (Z).
(5.170)
But note that λn xn + (1 − λn ) xn + x0 − x0 = (1 + λn ) xn − x0 ≥ r . 2 2 2
(5.171)
Comparing (5.170) and (5.171), we reach a contradiction. So (5.166) is valid. Let µ=inf ϕ(x) : x ∈ Br0 (x0 ) \ Br0 /2 (x0 ) − ϕ(x0 ). By (5.171) we have µ > 0. Also we introduce the set V = x ∈ B r0 (x0 ) : ϕ(x) − ϕ(x0 ) < µ . 2
Because x0 ∈ V , the set V is nonempty. It is clear that V is also bounded open. Fix r ∈ (0, r0 /2) with B r (x0 ) ⊆ V . Choose a number λ ∈ R such that (see (5.171)). 0 < λ < inf ϕ(x) : x ∈ Br0 (x0 ) \ Br (x0 ) − ϕ(x0 )
Note that λ < µ because r < r0 /2. Obviously V ⊆Br0 (x0 ) and V = x ∈ Br0 (x0 ) : ϕ(x) − ϕ(x0 ) < µ . From the definition of λ > 0, we see that x ∈ Br0 (x0 ) : ϕ(x) − ϕ(x0 ) ≤ λ ⊆ Br (x0 ) ⊆ V. Moreover, from (5.170) we see that ϕ (x) = 0
for all x ∈ Br0 (x0 ) with λ ≤ ϕ(x) − ϕ(x0 ) ≤ µ.
Therefore we can apply Corollary 3.3.70 and obtain d(S)+ (A − N, V, 0) = 1
(recall ϕ = A − N ).
(5.172)
From the excision property of the degree map, we have d(S)+ (A − N, V, 0) = d(S)+ (A − N, Br0 (x0 ), 0) ⇒ d(S)+ (A − N, Br0 (x0 ), 0) = 1
(see (5.172)).
(5.173)
Now we fix R0 > 0 in Proposition 5.3.3 sufficiently large and 0 > 0 in Proposition 5.3.4 sufficiently small in order to have x0 ∈ BR0 \ B r0 . Then we can find r > 0 such that Br (x0 )⊆BR0 and Br (x0 ) ∩ B0 =∅.
Let R > R0 and ∈ (0, 0 ). We claim that there exists x1 ∈ B R\ Br (x0 )∪B with
/ (A − N ) B R\ Br (x0 )∪B A(x1 ) = N (x1 ). Indeed, if this is not the case, then 0 ∈ and so invoking the additivity property of the degree map, we obtain d(S)+ (A−N, BR , 0) = d(S)+ (A−N, Br (x0 ), 0) + d(S)+ (A−N, B , 0), ⇒ 1 = 1 + (−1)
(see (5.172) and Propositions 5.3.3 and 5.3.4),
which is a contradiction. So we can find $x_1 \in W_0^{1,p}(Z)$ such that $x_1 \ne x_0$, $x_1 \ne 0$,
and
A(x1 ) = N (x1 ).
Therefore x1 > 0 is a nontrivial solution of problem (5.146) and from nonlinear regularity x1 ∈ C01 (Z)+ . The second boundary value problem that we study in this section is the following nonlinear second-order periodic differential equation. ) *
− |x (t)|p−2 x (t) = f t, x(t) + h(t) a.e. on T, . (5.174) x(0) = x(b), x (0) = x (b), h ∈ L1 (T ), 1 < p < ∞ We use degree-theoretic methods to prove two existence theorems for problem (5.174). In the first we assume conditions of nonuniform nonresonance between two successive eigenvalues of the negative scalar ordinary p-Laplacian with periodic boundary conditions. In the second existence theorem, at infinity the asymptotic f (z,x) are replaced by nonuniform nonresonance conditions for the “slopes” |x| p−2 x x=0
certain Landesman–Lazer conditions.
From Section 4.3, we know that the nonlinear eigenvalue problem
$$\begin{cases} -\big(|x'(t)|^{p-2}x'(t)\big)' = \lambda|x(t)|^{p-2}x(t) & \text{a.e. on } T,\\ x(0) = x(b),\ x'(0) = x'(b),\ \lambda \in \mathbb{R},\ 1 < p < \infty, \end{cases} \tag{5.175}$$
has a sequence of eigenvalues
$$\lambda_{2n} = \Big(\frac{2n\pi_p}{b}\Big)^{p}, \quad\text{where}\quad \pi_p = \frac{2\pi(p-1)^{1/p}}{p\sin(\pi/p)}.$$
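As a quick sanity check of this formula (our computation, not part of the text): for $p = 2$ we recover the classical periodic spectrum, since
$$\pi_2 = \frac{2\pi(2-1)^{1/2}}{2\sin(\pi/2)} = \pi, \qquad \lambda_{2n} = \Big(\frac{2n\pi}{b}\Big)^2,$$
which are precisely the eigenvalues of $-x'' = \lambda x$ with $b$-periodic boundary conditions.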
These are all the eigenvalues of (5.175). Also if we consider the nonlinear weighted eigenvalue problem ) *
− |x (t)|p−2 x (t) = λ + m(t) |x(t)|p−2 x(t) a.e. on T , (5.176) 1 x(0) = x(b), x (0) = x (b), λ ∈ R, m ∈ L (T ), 1 < p < ∞, we [624]) that (5.176) has a double sequence of eigenvalues know(see Zhang λ 2n (m) n≥1 and λ2n (m) n≥0 such that −∞ < λ0 (m) < λ 2 (m) ≤ λ2 (m) < · · · < λ 2n (m) ≤ λ2n (m) < · · · −→ +∞ as n → ∞. If p = 2 (linear case), the two sequences and λ 2n (m) λ 2n (m) n≥1 and λ2n (m) n≥0 are all the eigenvalues of (5.176). If p = 2 (nonlinear case), we do not know if this is true. For the first existence result under the nonuniform nonresonance conditions in an arbitrary spectral interval, our hypotheses on the nonlinearity f (t, x), are the following. H(f )2 : f : T × R −→ R is a function such that (i) For all x ∈ R, t −→ f (t, x) is measurable. (ii) For almost all t ∈ T, x −→ f (t, x) is continuous. (iii) For every r > 0, there exists αr ∈L1 (T )+ such that for almost all t ∈ T and all x ∈ R with |x| ≤ r, we have |f (t, x)| ≤ αr (t). (iv) There exist functions ϑ1 , ϑ2 ∈ L∞ (T )+ such that for some n ≥ 0, we have λ2n ≤ ϑ1 (t) ≤ ϑ2 (t) ≤ λ2n+2
a.e. on T
with the first and third inequalities strict on sets (not necessarily the same) of positive measure, and
$$\vartheta_1(t) \le \liminf_{|x|\to\infty}\frac{f(t,x)}{|x|^{p-2}x} \le \limsup_{|x|\to\infty}\frac{f(t,x)}{|x|^{p-2}x} \le \vartheta_2(t)$$
uniformly for almost all $t \in T$.
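An illustrative nonlinearity satisfying H(f)₂ (our example, with a hypothetical weight $\vartheta$; it is not taken from the text): take
$$f(t,x) = \vartheta(t)|x|^{p-2}x + \sin x, \qquad \vartheta \in L^\infty(T)_+,\ \ \lambda_{2n} \le \vartheta(t) \le \lambda_{2n+2} \ \text{a.e. on } T,$$
with both inequalities strict on sets of positive measure. Since $\sin x/(|x|^{p-2}x) \to 0$ as $|x| \to \infty$, the limits in (iv) equal $\vartheta(t)$, so one may take $\vartheta_1 = \vartheta_2 = \vartheta$; hypotheses (i)–(iii) hold with $\alpha_r(t) \equiv \|\vartheta\|_\infty r^{p-1} + 1$.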
REMARK 5.3.6 Hypothesis H(f )2 (iv) is the nonuniform nonresonance condition in the spectral interval [λ2n , λ2n+2 ]. We start with a simple observation about the eigenvalues of problem (5.176). PROPOSITION 5.3.7 If ϑ1 , ϑ2 ∈ L∞ (T )+ are as in hypothesis H(f )2 (iv) and m∈L1 (T )+ satisfies ϑ1 (t) ≤ m(t) ≤ ϑ2 (t)
a.e. on T,
then all the eigenvalues of problem (5.176) are nonzero and do not have zero as a limit point. PROOF: By virtue of the monotonicity of the eigenvalues λ 2n (m) n≥1 and λ2n (m) n≥0 on the weight function m ∈ L1 (T )+ , we have λ2n (m) ≤ λ2n (ϑ1 ) < λ2n (λ2n ) = 0
and
0 = λ 2n (λ2n ) < λ 2n (ϑ2 ) ≤ λ 2n (m).
(5.177) (5.178)
Also we know that if λ ∈ R is an eigenvalue of (5.176), then !
λ 2k (m), λ2k (m) ∪ − ∞, λ0 (m) λ∈
(5.179)
k≥1
(see Zhang [624]). Combining (5.179) with (5.177) and (5.178), we see that λ = 0. So if by σ(p) we denote the spectrum of the nonlinear weighted eigenvalue problem (5.176), we have that 0 ∈ / σ(p). Suppose that we can find {λn }n≥1 ⊆ σ(p) such that λn −→ 0. We can find 1 un ∈ Cper (T ) = x ∈ C 1 (T ) : x(0) = x(b), x (0) = x (b) such that un = 0 and
− |un (t)|p−2 un (t) = λ + m(t) |un (t)|p−2 un (t)
a.e. on T.
(5.180)
Because of the (p − 1)-homogeneity of problem (5.180), we may assume that 1,p forall n ≥ 1 ( · denotes the norm of the Sobolev space Wper un =1 (0, b) = x∈
W 1,p (0, b) : x(0)=x(b) . So by passing to a suitable subsequence if necessary, we may assume that w 1,p
un −→ u in Wper (0, b) and un −→ u in C(T ) as n → ∞.
∗ 1,p 1,p (0, b) , Wper (0, b) By ·, · we denote the duality brackets for the pair Wper
∗ 1,p 1,p and we let A:Wper (0, b) −→Wper (0, b) be the nonlinear operator defined by
$$\langle A(x), y\rangle = \int_0^b |x'(t)|^{p-2}x'(t)\,y'(t)\,dt \quad\text{for all } x, y \in W_{per}^{1,p}\big((0,b)\big).$$
We know that $A$ is monotone and demicontinuous, hence it is maximal monotone. Also by $K : W_{per}^{1,p}\big((0,b)\big) \to L^{p'}(T)$, $(1/p) + (1/p') = 1$, we denote the operator $K(x)(\cdot) = |x(\cdot)|^{p-2}x(\cdot)$.
Evidently K is bounded continuous. Then in terms of A and K, we can equivalently rewrite (5.180) as the following abstract operator equation A(un )=(λn + m)K(un ), n ≥ 1,
b
⇒ A(un ), un − u= λn + m(t) |un (t)|p−2 un (t)(un − u)dt,
n ≥ 1.
0
(5.181) Note that
λn + m(t) |un (t)|p−2 un (t)(un − u)(t)dt −→ 0.
b
0
So we obtain lim A(un ), un − u = 0.
(5.182)
n→∞
But A being maximal monotone, it is generalized pseudomonotone and so from (5.182) it follows that A(un ), un −→ A(u), u , ⇒ un p −→ u p . Because un −→ u in Lp (T ) and Lp (T ) being uniformly convex it has the Kadec– 1,p Klee property, it follows that un −→ u in Lp (T ) and so un −→ u in Wper (0, b) . Hence u = 1 and so u = 0. Passing to the limit as n → ∞ in (5.181), we obtain w
A(u) = mK(u). From (5.183) it follows that ) − |u (t)|p−2 u (t) = m(t)|u(t)|p−2 u(t) u(0) = u(b), u (0) = u (b)
(5.183)
a.e. on T,
* .
(5.184)
Because u = 0, from (5.184) it follows that 0 ∈ σ(p), which contradicts the first part of the proof. From the above proposition we have that 0 ∈ / σ(p) and we can find ε0 ∈ (0, 1) such that (−ε0 , ε0 ) ∩ σ(p) = ∅. We fix ε ∈ (0, ε0 ) and we consider the following periodic problem. ) * − |x (t)|p−2 x (t) + ε|x(t)|p−2 x(t) = h(t) a.e. on T, . (5.185) x(0) = x(b), x (0) = x (b), h ∈ L1 (T ), 1 < p < ∞ PROPOSITION 5.3.8 For every h∈L1 (T ) problem (5.185) has a unique solution 1 1,p Sε (h) ∈ Cper (T ) and the solution map Sε : L1 (T ) −→ Wper (0, b) is completely
w 1 1,p (0, b) . continuous; that is, if hn −→ h in L (T ), then Sε (hn )−→Sε (h) in Wper
∗ 1,p 1,p (0, b) −→Wper (0, b) be as in the proof of Proposition PROOF: Let A, K :Wper
∗ 1,p 5.3.7 (recall that Lp (T ) is embedded compactly in Wper (0, b) ). We consider the
∗ 1,p 1,p (0, b) −→Wper (0, b) defined by nonlinear operator Lε :Wper
$$L_\varepsilon(x) = A(x) + \varepsilon K(x) \quad\text{for all } x \in W_{per}^{1,p}\big((0,b)\big).$$
Clearly $L_\varepsilon$ is strictly monotone and demicontinuous, hence it is maximal monotone. Moreover, for every $x \in W_{per}^{1,p}\big((0,b)\big)$, we have $\langle L_\varepsilon(x), x\rangle = \|x'\|_p^p + \varepsilon\|x\|_p^p$, so $L_\varepsilon$ is coercive.
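Spelling out the estimate behind the coercivity claim (a routine computation, not written out in the text):
$$\langle L_\varepsilon(x), x\rangle = \|x'\|_p^p + \varepsilon\|x\|_p^p \ \ge\ \min\{1, \varepsilon\}\,\|x\|_{W_{per}^{1,p}}^p, \qquad\text{so}\qquad \frac{\langle L_\varepsilon(x), x\rangle}{\|x\|} \to +\infty \ \text{as } \|x\| \to \infty.$$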
1,p So Lε is surjective and we find x ∈ Wper (0, b) such that Lε (x) = h.
(5.186)
Due to the strict monotonicity of $L_\varepsilon$, the solution $x \in W_{per}^{1,p}\big((0,b)\big)$ of (5.186) is unique and we denote it by $S_\varepsilon(h)$. Thus we have defined the solution map $S_\varepsilon : L^1(T) \to W_{per}^{1,p}\big((0,b)\big)$, which to each forcing term $h \in L^1(T)$ assigns the unique solution of problem (5.185). We show that $S_\varepsilon$ is completely continuous. To this end, let $h_n \xrightarrow{w} h$ in $L^1(T)$ and set $x_n = S_\varepsilon(h_n) \in C^1_{per}(T)$, $n \ge 1$. We have
$$\begin{cases} -\big(|x_n'(t)|^{p-2}x_n'(t)\big)' + \varepsilon|x_n(t)|^{p-2}x_n(t) = h_n(t) & \text{a.e. on } T,\\ x_n(0) = x_n(b),\ x_n'(0) = x_n'(b), \end{cases} \tag{5.187}$$
$$\Rightarrow\ A(x_n) + \varepsilon K(x_n) = h_n, \quad n \ge 1.$$
∗ 1,p 1,p 1 Taking duality brackets in Wper (T ), we (0, b) , Wper (0, b) with xn ∈ Cper obtain
b hn (t)xn (t)dt ≤ c1 xn , εxn p ≤ xn pp + εxn pp = 0
for some c1 > 0, all n ≥ 1,
1,p
(0, b) ⇒ {xn }n≥1 ⊆ Wper
is bounded.
So we may assume that w 1,p
xn −→ x in Wper (0, b)
and
xn −→ x in C(T ).
1,p As before, acting on (5.187) with the test function xn − x ∈ Wper (0, b) and p using the Kadec–Klee property of the space L (T ), we can show that xn −→ x in 1,p Wper (0, b) . Passing to the limit as n → ∞ in (5.187), we obtain A(x) + εK(x) = h,
⇒ − |x (t)|p−2 x (t) + ε|x(t)|p−2 x(t) = h(t)
a.e. on T,
x(0) = x(b), x (0) = x (b), ⇒
x = Sε (h).
From Urysohn’s criterion for the convergence
of sequences, we conclude for the 1,p original sequence xn = Sε (hn ) n≥1 ⊆ Wper (0, b) , that 1,p
xn −→ x = Sε (h) in Wper (0, b)
⇒ Sε is indeed completely continuous.
THEOREM 5.3.9 If hypotheses H(f )2 hold, then for every h ∈ L1 (T ) problem 1 (5.174) has a solution x ∈ Cper (T ).
1,p PROOF: Let Nf : Wper (0, b) −→ L1 (T ) be the Nemitsky operator corresponding to the function f (t, x); that is,
Nf (x)(·) = f ·, x(·) .
Because of the compact embedding of $W_{per}^{1,p}\big((0,b)\big)$ into $C(T)$, we see that $N_f$ is compact. Let $g \in L^1(T)$ be such that $\vartheta_1(t) \le g(t) \le \vartheta_2(t)$ a.e. on $T$ and consider the compact homotopy $h : [0,1] \times W_{per}^{1,p}\big((0,b)\big) \to W_{per}^{1,p}\big((0,b)\big)$ defined by
$$h(\beta, x) = S_\varepsilon\big(\varepsilon K(x) + \beta N_f(x) + \beta h + (1-\beta)g K(x)\big).$$
Claim: There exists $R_0 > 0$ such that $h(\beta, x) \ne x$ for all $\beta \in [0,1]$, all $x \in W_{per}^{1,p}\big((0,b)\big)$ with $\|x\| = R$ and all $R \ge R_0$. We argue indirectly. Suppose that
the claim is not true. Then we can find 1,p (0, b) such that {βn }n≥1 ⊆ [0, 1] and {xn }n≥1 ⊆ Wper βn −→ β
in [0, 1],
xn −→ ∞
and
for all n ≥ 1.
xn = h(βn , xn )
We have A(xn ) = βn Nf (xn ) + βn h + (1 − βn )gK(xn ) Set yn = xn /xn , n ≥ 1. We may assume that w 1,p
yn −→ y in Wper (0, b) and yn −→ y
for all n ≥ 1.
(5.188)
in C(T ) as n → ∞.
Using hypothesis H(f )2 (iii) and (iv), we can check that N (x ) n f ⊆ L1 (T ) is uniformly integrable. xn p−1 n≥1 So by the Dunford–Pettis theorem, we may assume (at least for a subsequence), that Nf (xn ) w −→ w in L1 (T ). (5.189) xn p−1 As in previous proofs, we can check that w(t) = g0 (t)|y(t)|p−2 y(t)
a.e. on T,
(5.190)
with g0 ∈ L1 (T ) such that ϑ1 (t) ≤ g0 (t) ≤ ϑ2 (t) a.e. on T . Recall that A(xn ) = βn Nf (xn ) + βn h + (1 − βn )gK(xn ), Dividing with xn
p−1
A(yn ) = βn
n ≥ 1.
, we obtain
Nf (xn ) h + βn + (1 − βn )gK(yn ), xn p−1 xn p−1
n ≥ 1.
(5.191)
1,p (0, b) and using the As before, acting with the test function yn − y ∈ Wper
1,p Kadec–Klee property, we can conclude that yn −→ y in Wper (0, b) . So if we pass to the limit as n → ∞ in (5.191), we obtain
A(y) = βg0 + (1 − β)g K(y).
Let g = βg0 + (1 − β)g. Then g ∈ L1 (T )+ and ϑ1 (t) ≤ g(t) ≤ ϑ2 (t) a.e. on T . Also ) * − |y (t)|p−2 y (t) = g(t)|y(t)|p−2 y(t) a.e. on T, . (5.192) y(0) = y(b), y (0) = y (b) Note that y = 1 and so y = 0. Hence from (5.192), we infer that 0 ∈ σ(p) which contradicts Proposition 5.3.8. This proves the validity of the claim. Then from the homotopy invariance of the Leray–Schauder degree (see Section 3.3), we have
d I − Sε ◦ (ε + g)K, BR , 0 = d I − Sε ◦ (εK + Nf + h), BR , 0
for all R ≥ R0 . (5.193)
From the choice of ε > 0 and Proposition 5.3.8, we see that x = Sε ◦ (ε + g)K(x)
for all x = R ≥ R0 .
Moreover, it is clear that the map x −→ Sε ◦ (ε + g)K(x) is odd. So invoking Borsuk’s theorem (see Theorem 3.3.41), we have
d I − Sε ◦ (ε + g)K, BR , 0 = 0 for R ≥ R0 ,
⇒ d I − Sε ◦ (εK + Nf + h), BR , 0 = 0 for R ≥ R0
(see (5.193)).
Therefore by the solution property of the Leray–Schauder degree, we can find x ∈ BR such that x = Sε ◦ (εK + Nf + h)(x), ⇒
A(x) = Nf (x) + h,
⇒ − |x (t)|p−2 x (t) = f t, x(t) + h(t)
a.e. on T,
x(0) = x(b), x (0) = x (b), ⇒
1 (T ) is a solution of problem (5.174). x ∈ Cper
In the second existence theorem for problem (5.174), we employ a Landesman–Lazer type condition instead of the nonuniform nonresonance condition in the spectral interval $[\lambda_{2n}, \lambda_{2n+2}]$. More precisely, our hypotheses on the nonlinearity $f(t,x)$ are the following.
H(f)₃: $f : T \times \mathbb{R} \to \mathbb{R}$ is a function such that
(i) For all $x \in \mathbb{R}$, $t \mapsto f(t,x)$ is measurable.
(ii) For almost all $t \in T$, $x \mapsto f(t,x)$ is continuous.
(iii) For every $r > 0$, there exists $\alpha_r \in L^1(T)_+$ such that for almost all $t \in T$ and all $x \in \mathbb{R}$ with $|x| \le r$, we have $|f(t,x)| \le \alpha_r(t)$.
(iv) There exist functions $\eta_-, \eta_+ \in L^1(T)$ such that
$$\eta_-(t) = \liminf_{x\to-\infty} f(t,x) \quad\text{and}\quad \eta_+(t) = \limsup_{x\to+\infty} f(t,x),$$
uniformly for a.a. $t \in T$.

THEOREM 5.3.10 If hypotheses H(f)₃ hold and $h \in L^1(T)$ satisfies
$$\int_0^b \eta_+(t)\,dt < -\int_0^b h(t)\,dt < \int_0^b \eta_-(t)\,dt,$$
then problem (5.174) has a solution $x \in C^1_{per}(T)$.
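For orientation (an illustrative special case, not from the text): take $f(t,x) = -\arctan x$. Then $\eta_+(t) \equiv -\pi/2$ and $\eta_-(t) \equiv \pi/2$, and the condition of the theorem reduces to
$$-\frac{\pi b}{2} < -\int_0^b h(t)\,dt < \frac{\pi b}{2}, \qquad\text{i.e., } \Big|\int_0^b h(t)\,dt\Big| < \frac{\pi b}{2}.$$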
PROOF: We keep the notation introduced in the proof of Theorem 5.3.9. We
h(β, x) = Sε ◦ β(εK + Nf + h) (x). Again we show that there exists R > 0 large enough so that h(β, x) = x for all β∈[0, 1] and all x∈Wper (0, b) with x=R. We proceed by contradiction. So suppose that we can find {βn }n≥1 ⊆[0, 1] and 1,p {xn }n≥1 ⊆Wper (0, b) such that
βn −→ β in [0, 1], xn −→ ∞ and xn = Sε ◦ βn (εK + Nf + h) (xn ),
1,p
for all n ≥ 1. We set yn = xn /xn , n ≥ 1. We may assume that w 1,p
yn −→ y in Wper (0, b) and yn −→ y in C(T ) as n → ∞. For every n ≥ 1, we have A(xn ) + εK(xn ) = εβn K(xn ) + βn Nf (xn ) + βn h,
n ≥ 1.
Dividing by xn p−1 , we obtain A(yn ) + εK(yn ) = εβn K(yn ) + βn
Nf (xn ) h + βn , xn p−1 xn p−1
From hypotheses H(f )3 (iii) and (iv), we have that Nf (xn ) −→ 0 xn p−1
in L1 (T ).
Also we have
and
h −→ 0 in L1 (T ) xn p−1
b yn (t)(yn − y)(t)dt −→ 0. 0
n ≥ 1.
(5.194)
1,p So if in (5.194) we act with the test function yn − y ∈ Wper (0, b) and then pass to the limit as n → ∞, we obtain lim A(yn ), yn − y = 0, 1,p
(0, b) . ⇒ yn −→ y in Wper Therefore from (5.194) in the limit as n → ∞, we obtain A(y) = (β − 1)εK(y),
(5.195)
with y = 1, (y = 0). Because β ∈ [0, 1], ε > 0, from (5.195) it follows that β = 1 and so A(y) = 0, hence y = ξ ∈ R, ξ = 0 (recall that y = 0). First suppose that ξ > 0. If in (5.194) we act with the test function y = ξ, we obtain
b
b
(1 − βn )εxn p−1 ξ |yn |p−2 yn dt = βn ξ (5.196) Nf (xn ) + h dt. 0
0
Because βn ∈ [0, 1], βn −→ β, yn −→ξ > 0 in C(T ), from (5.196) it follows that we can find n0 ≥ 1 such that
b
for all n ≥ n0 . (5.197) Nf (xn ) + h dt > 0 0
Note that because we have assumed that ξ > 0, for all t ∈ T we have that xn (t) −→ +∞ as n → ∞. We claim that this convergence is uniform in t ∈ T . To this end let 0 < δ < ξ. Because yn −→ ξ in C(T ), we can find n1 = n1 (δ) ≥ 1 such that for all t ∈ T and all n ≥ n1 we have |yn (t) − ξ| < δ, hence 0 < ξ − δ < yn (t). Also because 9 > 0 we can find n2 = n2 (M 9) ≥ 1 such that xn ≥ M 9 for all xn −→ ∞, given M n ≥ n2 . So for all n ≥ n2 and all t ∈ T , we have xn (t) xn (t) ≥ = yn (t) > ξ − δ = γ > 0, 9 xn M 9 ⇒ xn (t) ≥ γ M for all n ≥ n2 and all t ∈ T. 9 > 0 was arbitrary, we conclude that xn (t) −→ +∞ uniformly in t ∈ T as Because M n → ∞. This then by virtue of hypothesis H(f )3 (iv) implies that
lim sup f t, xn (t) = η+ (t) uniformly for a.a. t ∈ T. n→∞
Then by Fatou’s lemma and using (5.197), we obtain
b
b
b η+ (t)dt, − h(t)dt ≤ lim sup Nf (xn )dt ≤ n→∞
0
0
0
a contradiction to the hypotheses of the theorem. If we assume ξ < 0, then arguing in a similar fashion we reach the contradiction
b
b η− (t)dt ≤ − h(t)dt. 0
0
Therefore we can find R > 0 large enough such that
$$h(\beta, x) \ne x \quad\text{for all } \beta \in [0,1] \text{ and all } \|x\| = R.$$
Invoking the homotopy invariance property of the Leray–Schauder degree, we have
d I − h(0, ·), BR , 0 = d I − h(1, ·), BR , 0 .
Note that h(0, ·) = 0 and so d I − h(0, ·), BR , 0 = d(I, BR , 0) = 1, hence
d I − h(1, ·), BR , 0 = 1. From the solution
property of the Leray–Schauder degree, it follows that thee 1,p exists x ∈ Wper (0, b) with x ≤ R such that x = h(1, x) = Sε ◦ (εK + Nf + h)(x), ⇒
A(x) = Nf (x) + h,
⇒ − |x (t)|p−2 x (t) = f t, x(t) + h(t)
a.e. on T,
x(0) = x(b), x (0) = x (b), ⇒
1 (T ) is a solution of problem (5.174). x ∈ Cper
REMARK 5.3.11 Theorems 5.3.9 and 5.3.10 are still valid if we consider the more general situation in which instead of the ordinary p-Laplacian differential operator, we have an operator of the form x −→ ϕ(x ) , with ϕ : R −→ R a suitable homeomorphism.
5.4 Nonlinear Eigenvalue Problems In this section we study two perturbed eigenvalue problems, in which the perturbation is nonlinear. The first problem is driven by the p-Laplacian and the second by the Laplacian differential operator. For both problems we establish the existence of nontrivial smooth solutions, as the parameter λ ∈ R moves in a certain interval. In fact for the second problem (the semilinear one), we also have a negative result, showing that no positive solution exists, when λ is out of the aforementioned interval. This is done using the so-called Pohozaev’s identity, which we also prove here. In this section Z ⊆ RN is a bounded domain with C 2 -boundary ∂Z. We start with the following perturbed weighted eigenvalue problem.
⎧ ⎫ p−2 Dx(z) − λm(z)|x(z)|p−2 x(z) = f z, x(z) a.e. on Z, ⎬ ⎨ −div Dx(z) . ⎩ ⎭ x ∂Z = 0, 1 < p < ∞ (5.198) The hypotheses on the weight function m(z) and the perturbation term f (z, x) are the following. H(m): m ∈ L∞ (Z)+ ={m∈L∞ (Z)+ : m(z) ≥ 0
a.e. on Z}, m = 0.
H(f )1 : f : Z × R −→ R is a function such that f (z, 0) = 0 a.e. on Z and (i) For all x ∈ R, z −→ f (z, x) is measurable. (ii) For almost all z ∈ Z, x −→ f (z, x) is continuous. (iii) For almost all z ∈ Z and all x ∈ R, we have
and
|f (z, x)| ≤ α(z) + c|x|r−1 with α ∈ L∞ (Z)+ , c > 0 Np if p < N ∗ N −p . p
x (iv) If F (z, x) = 0 f (z, r)dr (the potential function corresponding to the nonlinearity f (z, x)), then there exist constants µ > p and M > 0 such that for almost all z ∈ Z and all |x| ≥ M , we have µF (z, x) ≤ xf (z, x) and (v) lim sup x→0
f (z,x) xp−2 x
ess inf F (·, M ), ess inf F (·, −M ) > 0.
≤ 0 uniformly for almost all z ∈ Z.
REMARK 5.4.1 Hypothesis H(f )1 (iv) is the so-called Ambrosetti–Rabinowitz condition (AR-condition for short). However, note that in contrast to the standard AR-condition, we do not require that F (z, x) > 0 for almost all z ∈ Z and all |x| ≥ M . Instead we only assume that ess inf F (·, M ), ess inf F (·, −M ) > 0. The following function satisfies all the hypotheses in H(f )1 . For simplicity we drop the z-dependence and so we have f (x) = |x|r−2 x − c|x|p−2 x with c ≥ 0 and p < r < p∗ . In this example the AR-condition (see hypothesis H(f )1 (iv)), is satisfied for µ = r and all M > 0. We consider the functional ϕλ : W01,p (Z) −→ R defined by
1 λ ϕλ (x)= Dxpp − m(z)|x(z)|p dz − F z, x(z) dz for all x ∈ W01,p (Z). p p Z Z We know that this functional is C 1 on W01,p (R). Next we show that it satisfies the Cerami condition (C-condition). PROPOSITION 5.4.2 If hypotheses H(m) and H(f )1 hold, then ϕλ satisfies the C-condition. PROOF: Let {xn }n≥1 ⊆ W01,p (Z) be a sequence such that |ϕλ (xn )| ≤ M and
(1 +
for some M > 0, all n ≥ 1
xn )ϕλ (xn )
−→ 0 as n → ∞.
(5.199)
In the sequel by ·, · we denote the duality brackets for the pair W01,p (Z), W −1,p (Z) , (1/p) + (1/p ) = 1. Let A : W01,p (Z) −→ W −1,p (Z) be the nonlinear operator defined by
A(x), y = Dx(z)p−2 Dx(z), Dy(z) RN dz for all x, y ∈ W01,p (Z).
Z
We know that A is of type (S)+ (see Proposition 4.3.41). Moreover, we have ϕλ (xn ) = A(xn ) − λm|xn |p−2 xn − Nf (xn ),
where Nf (xn )(·) = f ·, xn (·) (the Nemitsky operator corresponding to the nonlin earity f ). Note that Nf :W01,p (Z)−→Lr (Z)⊆W −1,p (Z), where r1 + r1 = 1 (recall ∗ p < r < p ). From the choice of the sequence {xn }n≥1 ⊆ W01,p (Z)(see (5.199)), we have | ϕλ (xn ), xn | ≤ εn with εn ↓ 0,
(5.200) ⇒ −Dxn pp + λ m|xn |p dz + xn Nf (xn )dz ≤ εn . Z
Z
Also p ϕλ (xn ) ≤ pM for all n ≥ 1,
⇒ Dxn pp −λ m|xn |p dz − pF z, xn (z) dz ≤ pM . Z
(5.201)
Z
Adding (5.200) and (5.201), we obtain
− pF z, xn (z) −xn (z)f z, xn (z) dz ≤ εn + pM
Z
⇒ − µF z, xn (z) −xn (z)f z, xn (z) dz Z
+ (µ − p) F z, xn (z) dz ≤ εn + pM .
for all n ≥ 1,
(5.202)
Z
Note that
µF z, xn (z) −xn (z)f z, xn (z) dz
−
Z =−
−
µF z, xn (z) −xn (z)f z, xn (z) dz
{|xn |≥M }
µF z, xn (z) −xn (z)f z, xn (z) dz.
{|xn |<M }
(5.203) From hypothesis H(f )1 (iv), we have
− µF z, xn (z) −xn (z)f z, xn (z) dz ≥ 0. {|xn |≥M }
Also from hypothesis H(f )1 (iii) and the mean value theorem, we have
(5.204)
−
µF z, xn (z) −xn (z)f z, xn (z) dz ≥ −c1 for some c1 > 0, for all n ≥ 1.
{|xn |<M }
(5.205) Using (5.204) and (5.205) in (5.203) we obtain
− µF z, xn (z) −xn (z)f z, xn (z) dz ≥ −c1
for all n ≥ 1.
(5.206)
Z
For almost all z ∈ Z and all |x| ≥ M , the function s −→ (1/sµ )F (z, sx) is C 1 on R+ \ {0}. So we have µ 1 d d1 F (z, sx) = − F (z, sx) + F (z, sx) x. ds sµ sµ+1 sµ dx By virtue of the mean value theorem, for s > 1 we can find η ∈ (1, s) such that µ 1 d 1 F (z, sx) − F (z, x) = − µ+1 F (z, ηx) + µ F (z, ηx) x (s − 1) µ s η η dx d s − 1 = µ+1 − µF (z, ηx) + F (z, ηx) ηx η dx ≥ 0 for a.a. z ∈ Z and all |x| ≥ M, (see hypothesis H(f )1 (iv)), ⇒ sµ F (z, x) ≤ F (z, sx) for a.a. z ∈ Z, all |x| ≥ M and all s ≥ 1. (5.207) Using (5.207), we see that for almost all z ∈ Z, we have
x xµ F (z, M ) if x ≥ M M ≥ F (z, x) = F z, M Mµ
|x| |x|µ and F (z, x) = F z, F (z, −M ) if x ≤ M. (−M ) ≥ M Mµ We set ξ = (1/M µ ) min ess inf F (·, M ), ess inf F (·, −M ) > 0 (see hypothesis H(f )1 (iv)). So we have F (z, x) ≥ ξ|x|µ
for a.a. z ∈ Z and all |x| ≥ M.
(5.208)
From (5.208), hypothesis H(f )1 (iii), and the mean value theorem, we conclude that F (z, x) ≥ ξ|x|µ − c2
for some c2>0, almost all z ∈ Z, and all x ∈ R.
(5.209)
Using (5.209) and because µ > p, we obtain
for some c3 , c4 > 0, all n ≥ 1. (µ − p) F z, xn (z) dz ≥ c3 xn µ µ − c4
(5.210)
Z
Returning to (5.202) and using (5.206) and (5.210), we obtain c3 xn µ µ ≤ c5
for some c5 > 0, all n ≥ 1,
⇒ {xn }n≥1 ⊆ L (Z) is bounded, µ
⇒ {xn }n≥1 ⊆ Lp (Z) is bounded From (5.199) we have
(because p < µ).
(5.211)
$$\mu\varphi_\lambda(x_n) - \langle\varphi'_\lambda(x_n), x_n\rangle \le \mu M + \varepsilon_n \quad\text{for all } n \ge 1,$$
$$\Rightarrow\ \Big(\frac{\mu}{p}-1\Big)\|Dx_n\|_p^p - \lambda\Big(\frac{\mu}{p}-1\Big)\int_Z m|x_n|^p\,dz - \int_Z\big[\mu F\big(z,x_n(z)\big) - x_n(z)f\big(z,x_n(z)\big)\big]\,dz \le \mu M + \varepsilon_n,$$
$$\Rightarrow\ \Big(\frac{\mu}{p}-1\Big)\|Dx_n\|_p^p \le c_6 \quad\text{for some } c_6 > 0, \text{ all } n \ge 1,$$
(5.212)
(see (5.206) and (5.211)). Recall that µ > p. So from (5.212) and Poincar´e’s inequality, it follows that {xn }n≥1 ⊆ W01,p (Z) is bounded. Hence we may assume that w
and
xn −→ x
in W01,p (Z), xn −→ x in Lp (Z)
xn −→ x
in Lr (Z)
(because r < p∗ ).
From (5.199), we have
εn A(xn ), xn − x−λ m|xn |p−2 xn (xn − x)dz− f (z, xn )(xn − x)dz ≤(5.213) Z
Z
for all n ≥ 1. Note that
m|xn |p−2 xn (xn − x)dz −→ 0 λ
(because xn −→ x in Lp (Z) and
Z
because of (5.211)), f (z, xn )(xn − x)dz −→ 0
and Z
Nf (xn )
n≥1
(because xn −→ x in Lr (Z) and
⊆Lr (Z) is bounded; see hypothesis H(f )1 (iii)).
So from (5.213) it follows that lim sup A(xn ), xn − x ≤ 0.
(5.214)
Because A is of type (S)+ (see Proposition 4.3.41), from (5.214) it follows that xn −→ x in W01,p (Z).
Thus we conclude that ϕλ satisfies the C-condition.
Let u2 ∈ C01 (Z) be an eigenfunction corresponding to the second eigenvalue λ2 > 0 of −p , W01,p (Z) (see Section 4.3). Let
and
Y = span{u1 , u2 } HR = y ∈ Y : y = t1 u1 + t2 u2 , t1 ∈ R, t2 ≥ 0, y ≤ R for R > 0. (5.215)
Evidently HR is a hemisphere in Y . We consider the boundary of this hemisphere in Y . So we have 0 = y∈HR :y = t1 u1 , t1 ∈ R, y ≤ R ∪ y∈HR : y=R . (5.216) HR
0 The boundary HR is the union of two sets. The first set in this union is a circle and its interior and the second set is the dome of the hemisphere. Also we set
K = x ∈ W01,p (Z) : Dxpp = λ2 m|x|p dz . (5.217) Z
This is a closed, pointed, symmetric cone in
W01,p (Z).
For > 0, we set
D = K ∩ ∂B
where ∂B = {x ∈ W01,p (Z) : x = }. PROPOSITION 5.4.3 If hypotheses H(m), H(f )1 hold and λ ∈ [λ1 , λ2 ), then we can find R > 0 large enough such that ϕλ H 0 ≤ 0. R
PROOF: Let y ∈ Y . Then we have
1 λ m|y|p dz − F z, y(z) dz ϕλ (y) = Dypp − p p Z Z 1 p µ ≤ Dyp − ξyµ + c7 for some c7>0 independent of y ∈ Y (5.218) p (see (5.209)). Because Y ⊆ C01 (Z) is finite-dimensional, all the Lp and Sobolev norms on it are equivalent. So we can find c8 > 0 such that µ c8 Dyµ p ≤ yµ
for all y ∈ Y.
Using this in (5.218), we obtain ϕλ (y) ≤
1 Dypp − c9 Dyµ p + c7 p
for some c9 > 0 and all y ∈ Y.
(5.219)
Because µ > p, from (5.219), using Poincar´e’s inequality, we deduce that ϕλ (y) −→ −∞ as y −→ ∞, (5.220) that is, ϕλ Y is weakly anticoercive. Also for t > 0 and because u1 ∈ int C01 (Z)+ , we have
tp tp ϕλ (tu1 ) = Du1 pp − λ m|u1 |p dz − F (z, tu1 )dz p p Z Z λ tp for some c10 > 0 Du1 pp − tµ u1 µ 1− ≤ µ + c10 p λ1 (see (5.209)) tp λ ≤ for some c11 > 0. Du1 pp − tµ c11 Du1 µ 1− p + c10 p λ1 (5.221) From (5.220) and (5.221) and because λ1 ≤ λ < λ2 and p < µ, we deduce that there exists R > 0 large such that ϕλ H 0 ≤ 0 (see (5.216)). R
PROPOSITION 5.4.4 If H(m), H(f )1 hold and 0 < λ < λ2 , then we can find > 0 small such that ϕλ D ≥ ξ0 > 0. PROOF: By virtue of hypothesis H(f )1 (v), given ε > 0, we can find δ = δ(ε) > 0 such that f (z, x) ≤ εxp−1
for a.a. z ∈ Z and all x ∈ [0, δ]
f (z, x) ≥ ε|x|p−2 x
and
for a.a. z ∈ Z and all x ∈ [−δ, 0].
From these inequalities, after integration we obtain F (z, x) ≤
ε p |x| p
for a.a. z ∈ Z and all x ∈ [−δ, δ].
(5.222)
On the other hand from hypothesis H(f )1 (iii) and the mean value theorem, we have F (z, x) ≤ cε |x|r
for some cε > 0, a.a. z ∈ Z and all |x| > δ.
(5.223)
Combining (5.222) and (5.223), we obtain F (z, x) ≤
ε p |x| + cε |x|r p
for a.a. z ∈ Z and all x ∈ R.
(5.224)
Let x ∈ K. Then
1 λ m|x|p dz− F z, x(z) dz Dxpp − p p Z Z
1 λ ε p p ≥ Dxp − m|x| dz − xpp − cε xrr p p Z p 1 ε λ ≥ 1− Dxpp − c14 Dxpp − cε Dxrp p λ2 p
ϕλ (x) =
(see (5.224)) (5.225)
for some c14 , cε > 0 (see (5.217)). Because λ < λ2 , choosing ε > 0 small from (5.225) we have ϕλ (x) ≥ c15 Dxpp − cε Dxrp
for some c14 > 0 and all x ∈ K.
(5.226)
Recall that r > p. So from (5.226) and Poincar´e’s inequality, it follows that we can find > 0 small such that ϕλ D ≥ ξ0 > 0, where D =K ∩ ∂B (0).
Now we can establish the existence of a nontrivial solution for problem (5.198). THEOREM 5.4.5 If H(m), H(f )1 hold and 0 < λ < λ2 , then problem (5.198) has a nontrivial solution x ∈ C01 (Z). PROOF: First assume that λ1 ≤ λ < λ2 . We choose > 0 small and R > 0 big so thatPropositions5.4.3 and 5.4.4 hold. 0 From Proposition 4.3.63, we know that the sets HR , HR , D link in W01,p (Z) via the identity map. Then Proposition 5.4.2 allows the use of Theorem 4.1.22, which gives x∈W01,p (Z) such that
$\varphi_\lambda(x) \ge \xi_0 > 0 = \varphi_\lambda(0)$ (i.e., $x \ne 0$) and
ϕ (x) = 0.
From the last equation, we have A(x) − λm|x|p−2 x = Nf (x),
⎫
⎧ p−2 Dx(z) − λm(z)|x(z)|p−2 x(z) = f z, x(z) ⎬ ⎨ −div Dx(z) a.e. on Z, . ⇒ ⎭ ⎩ x ∂Z = 0, 1 < p < ∞ (5.227) From nonlinear regularity theory (see Theorem 4.3.35), we have that x ∈ C01 (Z) and of course it solves problem (5.198) (see (5.227)). Now assume that 0 < λ < λ1 . From the proof of Proposition 5.4.3, we know that ϕλ (tu1 ) −→ −∞
as t −→ ∞
(see (5.221)).
(5.228)
Moreover, as in the proof of Proposition 5.4.4, using (5.224), we obtain ϕλ (x) ≥
1 ε λ Dxpp − c14 Dxpp − cε Dxrp . 1− p λ1 p
Because λ < λ1 , choosing ε > 0 small, we have ϕλ (x) ≥ c16 Dxpp − cε Dxrp
for some c16 > 0, all x ∈ W01,p (Z).
Because p < r, we can find > 0 small, we have ϕλ ∂B (0) ≥ ξ0 > 0
(5.229)
(5.230)
(see (5.229) and use Poincar´e’s inequality). From (5.228), (5.230), and Proposition 5.4.2 and because ϕλ (0) = 0, we see that we can apply the mountain pass theorem (see Theorem 4.1.24) and obtain x ∈ W01,p (Z) such that ϕλ (x) ≥ ξ0 > 0 = ϕλ (0) (i.e., x = 0) and
ϕ (x) = 0.
As above it follows that $x \in C_0^1(Z)$ and it is a nontrivial solution of problem (5.198). Next we consider the following perturbed eigenvalue problem
$$\begin{cases} -\Delta x(z) = \lambda x(z) + |x(z)|^{2^*-2}x(z) & \text{a.e. on } Z,\\ x|_{\partial Z} = 0. \end{cases} \tag{5.231}$$
Here
$$2^* = \begin{cases} \dfrac{2N}{N-2} & \text{if } 2 < N,\\ +\infty & \text{if } N = 1, 2 \end{cases}$$
(the critical Sobolev exponent for 2). So we see that in this case the nonlinear perturbation in problem (5.231) has a critical growth and this causes serious difficulties
dealing with the problem, because the embedding of the Sobolev space H01 (Z) into ∗ L2 (Z) is no longer compact. We are looking for positive solutions of problem (5.231). First let us consider the special case λ = 0. One way to approach problem (5.231) variationally is to try to obtain positive solutions of (5.231), as relative minima of the functional. ϕλ (x) =
∗ 1 λ 1 Dx22 − x22 − ∗ x22∗ 2 2 2
on the C 1 -manifold M = {x ∈ H01 (Z) : x2∗ = 1}. So we want to find a solution of the following minimization problem Dx2 − λx2 2 2 : x ∈ H01 (Z), x = 0 = Sλ (Z). inf 2 x2∗ If λ = 0, then we have Dx2 1 2 : x ∈ H (Z), x = 0 = S(Z), S0 (Z) = inf 0 x22∗ ∗
where $S(Z)$ is the best Sobolev constant for the embedding of $H_0^1(Z)$ into $L^{2^*}(Z)$. We know that $S(Z) = S$ is independent of the domain $Z$ and is never attained on a domain $Z \subseteq \mathbb{R}^N$, $Z \ne \mathbb{R}^N$ (see, for example, Willem [606, Section 1.9]). So when $\lambda = 0$, problem (5.231) has no positive solution. More generally, we show that this is true for all $\lambda \le 0$ and all $\lambda \ge \lambda_1$, provided that the domain $Z$ has a certain geometric structure. So we start with a definition.

DEFINITION 5.4.6 A domain $Z \subseteq \mathbb{R}^N$ containing the origin is said to be star-shaped with respect to the origin, if $\big(z, n(z)\big)_{\mathbb{R}^N} > 0$ for all $z \in \partial Z$ (by $n(z)$ we denote the unit outward normal at $z \in \partial Z$).

EXAMPLE 5.4.7 Any open ball in $\mathbb{R}^N$ centered at $0$ is star-shaped with respect to the origin. To see this, let $z \in \partial Z$. Then $n(z) = z/R$, where $R > 0$ is the radius of the ball. So $\big(z, n(z)\big)_{\mathbb{R}^N} = \|z\|^2/R = R > 0$.

The negative result on the existence of positive solutions for problem (5.231) when $\lambda \notin (0, \lambda_1)$ is based on the following Pohozaev identity.

PROPOSITION 5.4.8 If $g : \mathbb{R} \to \mathbb{R}$ is continuous, $G(x) = \int_0^x g(r)\,dr$, and $x \in C_0^1(Z)$ is a solution of
$$\begin{cases} -\Delta x(z) = g\big(x(z)\big) & \text{a.e. on } Z,\\ x|_{\partial Z} = 0, \end{cases} \tag{5.232}$$
then
$$\Big(1 - \frac{N}{2}\Big)\int_Z g\big(x(z)\big)\,x(z)\,dz + N\int_Z G\big(x(z)\big)\,dz = \frac{1}{2}\int_{\partial Z}\big(z, n(z)\big)_{\mathbb{R}^N}\Big(\frac{\partial x}{\partial n}(z)\Big)^2 d\sigma. \tag{5.233}$$
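A standard special case may help fix ideas (the computation is ours, not the text's): for $g(x) = \lambda x$, so that $x$ is a Dirichlet eigenfunction and $G(x) = \lambda x^2/2$, identity (5.233) collapses to the classical Rellich identity
$$\Big(1 - \frac{N}{2}\Big)\lambda\int_Z x^2\,dz + \frac{N\lambda}{2}\int_Z x^2\,dz = \lambda\int_Z x^2\,dz = \frac{1}{2}\int_{\partial Z}\big(z, n(z)\big)_{\mathbb{R}^N}\Big(\frac{\partial x}{\partial n}\Big)^2 d\sigma.$$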
PROOF: We multiply the equation in (5.232) with zi (∂x/∂zi ) and then integrate over Z. So we have
∂x ∂G x(z) ∂x − x(z)zi (z)dz = g x(z) zi (z)dz = zi dz ∂zi ∂zi ∂zi Z Z
Z
∂x ⇒ − x(z)zi (z)dz = − G x(z) dz (by integration by parts). ∂z i Z Z So we have
−
G x(z) dz = −
Z
N ∂ 2 x ∂x zi dz ∂zk2 ∂zi Z k=1
N N ∂x ∂ 2 x ∂x ∂x zi dz + δik dz = ∂zk ∂zk ∂zi ∂zk ∂zi Z Z k=1
k=1
N ∂x ∂x zi nk dσ − ∂zk ∂zi ∂Z k=1
N
∂x 2 1 ∂ ∂x 2 dz = zi dz + 2 Z ∂zi ∂zk Z ∂zi k=1
∂x ∂x − zi dσ ∂n ∂Z ∂zi
N
∂x 2 ∂x 2 1 dz + dz =− 2 Z ∂zk Z ∂zi +
1 2
k=1
N
∂x 2 1 ∂x ∂x zi ni dσ − zi dσ. ∂zk 2 ∂Z ∂zi ∂n ∂Z k=1
Summing over i, we obtain
N 1 Dx2 (z, n)RN dσ Dx22 + −N G x(z) dz = 1 − 2 2 ∂Z Z
∂x − (z, Dx)RN dσ. ∂Z ∂n
(5.234)
From (5.232) we have
g x(z) x(z)dz.
Dx22 =
(5.235)
Z
Moreover, because x∂Z = 0, we have Dx(z)2RN = and
z, Dx(z)
RN
∂x
2
for all z ∈ ∂Z ∂n
∂x = z, n(z) RN (z) for all z ∈ ∂Z. ∂n (z)
(5.236) (5.237)
Using (5.235) through (5.237) in (5.234), we obtain (5.233) (Pohozaev’s identity).
Using Proposition 5.4.8, we can now prove a negative result concerning problem (5.231). Let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z, which is starshaped with respect to the origin. If x ∈ H01 (Z) is a solution of (5.231), then by the strong maximum principle we have that either x = 0 or x(z) > 0 for all z ∈ Z. We show that when λ ∈ / (0, λ1 ), the second possibility can not occur. THEOREM 5.4.9 If Z ⊆ RN , N ≥ 3 is a bounded domain with a C 2 -boundary, which is star-shaped with respect to the origin, and λ ∈ / (0, λ1 ), then problem (5.231) has no nontrivial solution x ∈ H01 (Z)+ . PROOF: First suppose that λ ≥ λ1 . If x ∈ H01 (Z) \ {0} is a solution of (5.231), then using the eigenfunction u1 > 0 corresponding to λ1 > 0 as a test function, we obtain
∗ λ1 xu1 dz = (Dx, Du1 )RN dz = λ xu1 dz + |x|2 −2 xu1 dz Z Z Z
Z >λ xu1 dz
Z ≥ λ1 xu1 dz, Z
a contradiction. Next suppose that λ ≤ 0. ∗ ∗ We set g(r) = λr + |r|2 −2 r. Then G(r) = (λ/2)r2 + (1/2∗ )|r|2 . Using these 1 functions in Pohozaev’s identity (5.233) and because x ∈ C0 (Z) (regularity theory), we have
∗ ∗ Nλ N N (λx2 + |x|2 )dz + x2 dz + ∗ |x|2 dz = 1− 2 2 2 Z Z Z
∂x 2 1 (z, n)RN dσ 2 ∂Z ∂n
N ∂x 2 N 1 2∗ 2 ⇒ + 1 − |x| dz + λ λx dz = (z, n) dσ. N R 2∗ 2 2 ∂Z ∂n Z Z But
N N N N +1− = − = 0. 2∗ 2 2 2
∂x 2 1 x2 dz = (z, n)RN dσ λ 2 ∂Z ∂n Z
So we obtain
which is a contradiction if λ < 0. If λ = 0 then because Z is star-shaped with respect to the origin, we must have (∂x/∂n)(z) = 0 for z ∈ ∂Z. Hence
∗ |x|2 dz = − x(z) = 0 (by Green’s formula) Z
and so x = 0.
Z
REMARK 5.4.10 If Z ⊆ RN is not star-shaped with respect to the origin, then we can have solutions even when λ ≤ 0. In fact if Z ⊆ RN is an annular domain,
then we know that (5.231) admits a radial solution for all λ ∈ (−∞, λ1 ). Recall that x ∈ H01 (Z) is a radial solution, if x(z) = x(zRN ) for all z ∈ Z. When 0 < λ < λ1 , problem (5.231) may admit nontrivial solutions x ∈ H01 (Z)+ . In this case we have an interesting dependence on the dimension N . More precisely, we have the following theorem due to Brezis–Nirenberg [104]. As always by λ1 > 0 we denote the principal eigenvalue of −, H01 (Z) . THEOREM 5.4.11 If Z ⊆ RN , (N ≥ 3) is a bounded domain with a C 2 -boundary ∂Z, then (a) If N ≥ 4, then for any λ ∈ (0, λ1 ), there exists a positive solution for problem (5.231). (b) If N = 3, then there exists λ∗ ∈ [0, λ1 ) such that for any λ ∈ (λ∗ , λ1 ) problem (5.231) has a positive solution. (c) If N = 3 and Z = B1 (0), then λ∗ = λ1 /4 and for λ ≤ λ1 /4 problem (5.231) has no nontrivial solution in H01 (Z)+ .
5.5 Maximum and Comparison Principles The main goal of this section is to present some maximum and comparison principles for certain nonlinear differential operators involving the p-Laplacian. So let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z. We start by considering the nonlinear differential operator Vp (x) = −div(Dxp−2 Dx) + mϑp (x),
x ∈ W01,p (Z),
(5.238)
∞
where 1 < p < ∞, m∈L (Z) (the weight function), ϑp :R−→R is the homeomorphism defined by ) p−2 |x| x if x = 0 ϑp (x) = 0 if x = 0
and ϑp :Lp (Z)−→Lp (Z), (1/p)+(1/p ) = 1 is the Nemitsky operator corresponding
to ϑp ; that is, ϑp (x)(·) = ϑp x(·) . Evidently ϑp is continuous and bounded (i.e., maps bounded sets to bounded sets). We start by recalling the following basic notions. 1,p DEFINITION 5.5.1 (a) A function x∈Wloc (Z) is said to be a weak solution of
−div(Dxp−2 Dx) + m|x|p−2 x = g,
if and only if
Dxp−2 (Dx, Dv)RN dz + m|x|p−2 xvdz = g, v Z
g ∈ W −1,p (Z)
(5.239)
for all v ∈ Cc∞ (Z)
Z
(here by·, ·we denote the duality brackets for the pair W01,p (Z), W −1,p (Z) ). 1,p (b) A function x∈Wloc (Z) is said to be an upper solution of (5.239), if
Dxp−2 (Dx, Dv)RN dz + m|x|p−2 xvdz ≥ g, v for all v ∈ Cc∞ (Z)+ . Z
Z
(5.240)
1,p (c) A function x∈Wloc (Z) is said to be a lower solution of (5.239), if
Dxp−2 (Dx, Dv)RN dz + m|x|p−2 xvdz ≤ g, v for all v ∈ Cc∞ (Z)+ . Z
Z
(5.241)
REMARK 5.5.2 Recall that Cc∞ (Z) is dense in W01,p (Z) and W −1,p (Z) = W01,p (Z)∗ . So relations (5.239) through (5.241) are in fact valid for all v ∈ W01,p (Z). DEFINITION 5.5.3 (a) We say that the operator Vp (see (5.238)) satisfies the maximum principle (MP, for short), if every weak solution x ∈ W 1,p (Z) of
* ) −div Dxp−2 Dx + m|x|p−2 x = g (5.242) x∂Z ≥ 0, g ∈ W −1,p (Z) satisfies x(z) ≥ 0 a.e. on Z when g ≥ 0. (b) We say that the operator Vp (see (5.238)) satisfies the strong maximum principle (SMP, for short), if every weak solution x ∈ W 1,p (Z) of (5.242) satisfies x(z) > 0 a.e. on Z when g ≥ 0, g = 0. (c) We say that the operator Vp (see (5.238)) satisfies the weak comparison principle (WCP, for short), if Vp x1 ≤ Vp x2 in Z and x1 ∂Z ≤ x2 ∂Z with x1 , x2 ∈ W 1,p (Z), imply that x1 (z) ≤ x2 (z) a.e. on Z. To establish the maximum and comparison principles for the operator Vp , we need to know its spectrum. For this purpose the relevant nonlinear eigenvalue problem is the following one.
* ) p−2 Dx(z) + m|x(z)|p−2 x(z) = λ|x(z)|p−2 x(z) a.e. on Z, −div Dx(z) . x∂Z = 0, λ ∈ R (5.243) Following step by step the arguments used to establish the existence and proper ties of the first eigenvalue of −p , W01,p (Z) with a weight m ∈ L∞ (Z) (see Section 4.3), we obtain the following result. PROPOSITION 5.5.4 The nonlinear eigenvalue problem (5.243) has a unique eigenvalue λ = λ1 (m) defined by
λ1 (m) = inf Dxpp + m|x|p dz : x ∈ W01,p (Z), xp = 1 (5.244) Z
with the property that it has a positive eigenfunction u1 ∈ W01,p (Z). Moreover, the eigenvalue λ1 (m) is simple and isolated and the eigenfunction u1 satisfies u1 ∈ intC01 (Z)+ . The next theorem is in the direction of the nonlinear regularity results proved in Section 4.3) and is due to Di Benedetto [198] and Tolksdorf [583]. So suppose f : Z × R −→ R is a Carath´eodory function (i.e., for all x ∈ R, z −→ f (z, x) is measurable and for almost all z ∈ Z, x −→ f (z, x) is continuous). We consider the following nonlinear elliptic equation,
−div Dx(z)p−2 Dx(z) = f z, x(z) in Z, 1 < p < ∞. (5.245) In analogy to Definition 5.5.1(a), we make the following definition.
1,p DEFINITION 5.5.5 A function x ∈ Wloc (Z) is said to be a weak solution of problem (5.245), if
p−2 Dx (Dx, Dv)RN dz = f (z, x)vdz for all v ∈ Cc∞ (Z). Z
Z
The regularity result of Di Benedetto–Tolksdorf, reads as follows. x ∈ W 1,p (Z) is a weak solution THEOREM 5.5.6 If Z ⊆RN is a bounded domain, r of (5.245), and z −→ f z, x(z) belongs in L (Z) with p/(p − 1)N < r ≤ ∞, then x ∈ C 1,α (Z) for some 0 < α < 1. Another such regularity result that we need is the next theorem, which is essentially Theorem 4.3.35 but with nonhomogeneous Dirichlet boundary conditions. So we consider the following problem:
) * p−2 −div Dx + m|x|p−2 x = g in Z Dx . (5.246) ∞ x∂Z = h, 1 < p < ∞, g ∈ L (Z) The next theorem is a particular case of a more general result due to Lieberman [382]. THEOREM 5.5.7 If x∈W 1,p (Z) ∩ L∞ (Z) is a weak solution of (5.246) and h ∈ C 2 (∂Z), then x ∈ C 1 (Z). Now we can pass to the analysis of operator Vp . THEOREM 5.5.8 The following statements are equivalent. The operator Vp satisfies the MP. The operator Vp satisfies the SMP. λ1 (m) > 0 (see (5.244)). There exists a positive strict upper solution x ∈ W01,p (Z) of the equation Vp (x) = 0 and Vp (x) ∈ L∞ (Z) (i.e., Vp (x) = g in Z, g ∈ L∞ (Z)+ , g = 0). (e) For every g ∈ L∞ (Z)+ , the problem Vp (x) = g, g ∂Z = 0, has a unique weak solution x ∈ W01,p (Z)+ .
(a) (b) (c) (d)
PROOF: (a)⇒(b). Let x ∈ W01,p (Z) be a weak solution of −p x + m|x|p−2 x = g in Z, x∂Z ≥ 0,
(5.247)
where g ∈ L∞ (Z)+ , g = 0. Because by hypothesis Vp satisfies the MP, we have x(z) ≥ 0 a.e. on Z. Because g = 0, we have that x = 0. Moreover, taking into account the fact that x ∈ L∞ (Z) (see Theorem 4.3.34), from Theorem 5.5.6 we infer that x ∈ C 1 (Z). We have −p x + m∞ |x|p−2 x ≥ 0
(5.248)
(see (5.247) and recall that x ≥ 0, g ≥ 0). Then from (5.248) and the nonlinear strong maximum principle (see Theorem 4.3.37), we deduce that x(z) > 0 for all z ∈ Z. So Vp satisfies the SMP.
1 (b)⇒(c). Suppose that
λ1 (m) ≤ 0 and let u1 ∈ int C0 (Z)+ be a positive eigenfunction for the operator Vp , W01,p (Z) (see Theorem 5.5.6). Then we have
Vp (−u1 ) = −p (−u1 ) + m| − u1 |p−2 (−u1 ) ≥ 0. Because Vp (u1 ) ∈ L∞ (Z) and by hypothesis Vp satisfies the SMP, we must have u1 ≤ 0, a contradiction.
(c)⇒(d). Just take x = u1 (= a positive eigenfunction of Vp , W01,p (Z) ). (d)⇒(c). As in Proposition 4.3.44 we introduce the set x y D(J) = (x, y) ∈ W01,p (Z) × W01,p (Z) : x ≥ 0, y ≥ 0, , ∈ L∞ (Z) . y x
Let u1 ∈ int C01 (Z)+ be a positive eigenfunction of Vp , W01,p (Z) . Note that because x ∈ W01,p (Z) is by hypothesis a strict upper solution of Vp (x) = 0 and Vp (x) ∈ L∞ (Z), we have x ∈ int C01 (Z)+ . So it follows that (u1 , x) ∈ D(J). In particular we have x u1 ∈ L∞ (Z). Set u1 = cu1 with c > 0 such that c ≥ x u1 ∞ . Suppose that λ1 (m) ≤ 0. Then if by ·, · we denote the duality brackets for the
pair W01,p (Z), W −1,p (Z) , (1/p) + (1/p ) = 1, we have up − xp −p x, 1 p−1 −m(up1 −xp )dz, ≥ x Z
(see (4.124) in Section 4.3) ⇒ J(u1 , x) ≤ λ1 (m)(up1 −xp )dz ≤ 0, Z
⇒ J(u1 , x) = 0 and so u1 = θx
for some θ > 0
(see (4.125) in Section 4.3).
So it follows that −p u + m|u|p−2 u = θp−1 g,
g ∈ L∞ (Z)+ , g = 0.
This contradicts the hypothesis that λ1 (m) ≤ 0. Hence we must have λ1 (m) > 0. (c)⇒(a). Suppose that x ∈ W 1,p (Z) satisfies −p x + m|x|p−2 x = g
in Z, x∂Z ≥ 0,
with g ∈ W −1,p (Z), g ≥ 0 (i.e., g, v ≥ 0 for all v ∈ W01,p (Z)+ ). Using as a test function −x− ∈ W01,p (Z) and recalling that ) −Dx(z) if x(z) < 0 , Dx− (z) = 0 if x(z) ≥ 0 we obtain Dx− pp +
m|x− |p dz = g, −x− ≤ 0
Z
⇒ λ1 (m) ≤ 0, a contradiction, unless x− = 0 in which case x ≥ 0. Thus far we have proved the equivalence of statements (a), (b), (c), (d).
(e)⇒(d). Immediate. So we are done if we show that one of the statements (a), (b), (c), or (d) implies (e). More precisely we show the following. (c)⇒(e). We consider the Euler functional ϕ : W01,p (Z) −→ R defined by
1 1 ϕ(x) = Dxpp + m|x|p dz − gxdz. p p Z Z
(5.249)
We claim that ϕ is weakly coercive. Suppose that this is not the case. Then we can find {xn }n≥1 ⊆ W01,p (Z) such that xn −→ ∞ as n → ∞ We have
and
ϕ(xn ) ≤ M
for some M > 0, all n ≥ 1.
λ1 (m) gxn dz ≤ ϕ(xn ) ≤ M for all n ≥ 1, xn pp − p Z λ1 (m) ⇒ xn pp ≤ M + cxn p for some c > 0, all n ≥ 1, p ⇒ {xn }n≥1 ⊆ Lp (Z) is bounded.
Then from the definition of ϕ (see (5.249)), we infer that {Dxn }n≥1 ⊆ Lp (Z, RN ) is bounded, ⇒ {xn }n≥1 ⊆ W01,p (Z) is bounded (by Poincar´e’s inequality), a contradiction to the fact that xn −→ ∞. Therefore ϕ is weakly coercive. Moreover, exploiting the compact embedding of W01,p (Z) into Lp (Z), we can see that ϕ is weakly lower semicontinuous on W01,p (Z). By the Weierstrass theorem, we can find x ∈ W01,p (Z) such that ϕ(x) =
inf
1,p
W0
ϕ,
(Z)
⇒ ϕ (x) = 0, ⇒ A(x) + m|x|p−2 x = g,
(5.250)
where A : W01,p (Z) −→ W −1,p (Z) as before is defined by
A(x), y = Dxp−2 (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). Z
From (5.250) it follows that
* ) p−2 Dx(z) + m(z)|x(z)|p−2 x(z) = g(z) a.e. on Z, −div Dx(z) x∂Z = 0 ⇒ x ∈ C01 (Z)+ is a strong solution of the problem Vp (x) = g, x∂Z = 0. Moreover, acting on (5.250) with the test function −x− ∈W01,p (Z) and since by hypothesis λ1 (m) > 0, it follows that x ≥ 0. In fact if g = 0, by the nonlinear strong maximum principle we have x ∈ int C01 (Z)+ .
Finally if x, y ∈ int C01 (Z)+ are two such solutions, when g = 0, then xp − y p xp − y p − −p y, p−1 0 ≤ J(x, y) = −p x, xp−1 y
p−1 p−1 y −x = (xp − y p )dz ≤ 0, xp−1 y p−1 Z ⇒ J(x, y) = 0 (i.e., x = cy for some c > 0), ⇒ cp−1 g = g (i.e., c = 1). Therefore x = y and we have proved the uniqueness of the solution. If g = 0, then we have A(x) + m|x|p−2 x = 0
(see (5.250)).
Acting with the test function x ∈ W01,p (Z), we obtain
m|x|p dz = 0, Dxpp + Z
⇒ λ1 (m)xpp ≤ 0 ⇒ x=0
(see (5.244)),
(because by hypothesis λ1 (m) > 0).
So when g = 0, the only possible solution is the trivial one.
THEOREM 5.5.9 If λ1 (m) > 0, xk ∈ W 1,p (Z) ∩ L∞ (Z) satisfy Vp (xk ) ∈ L∞ (Z), xk ∂Z ∈ C 2 (∂Z) k = 1, 2, * Vp (x 1 ) ≤ Vp (x 2 ) in Z (5.251) x1 ∂Z ≤ x2 ∂Z and finally Vp (x2 ) ≥ 0 in Z with x2 ∂Z ≥ 0, then x1 (z) ≤ x2 (z) for every z ∈ Z. Moreover, if in (5.251) x1 ∂Z = x2 ∂Z = 0, then the same conclusion holds under the following less restrictive assumptions, also we have
)
xk ∈ W01,p (Z), Vp (xk ) ∈ L∞ (Z)
for k = 1, 2 and Vp (x2 ) ≥ 0 in Z.
PROOF: If x2 = 0, then clearly then result is true, with x1 = 0. So suppose that x2 = 0. From Theorem 5.5.7 we have that xk ∈ C 1 (Z), k = 1, 2. Then since x2 ≥ 0 (recall λ1 (m) > 0), from the strong maximum principle, we 1 have that x2 ∈ int C0 (Z)+ . Therefore we can find c > 1 such that x1 < cx2 . Let g = Vp (x2 ), h =x2 ∂Z and consider the following problem * ) p−2 y = g in Z, − p y + m|y| . (5.252) y ∂Z = h We can easily check that y = x1
and
y = cx2
are lower and upper solutions, respectively, of (5.252). Thus employing the truncation and penalization techniques developed in Section 5.2, we can have a solution y ∈ C 1 (Z) (see Theorem 5.5.7) such that
$x_1(z) \le y(z) \le c\,x_2(z)$
for all z ∈ Z.
Because λ1 (m) > 0, y ≥ 0. But from the proof of Theorem 5.5.6 we know that (5.252) has a unique solution. So y = x2 and we have x1 ≤ x2 . Finally if we have homogeneous Dirichlet boundary conditions, then by Theorem 4.3.35, it suffices to assume that Vp (xk ) ∈ L∞ (Z), xk ∈ W01,p (Z), k = 1, 2, and Vp (x2 ) ≥ 0. The classical maximum principle (see Theorem 4.3.19), asserts that any superharmonic C 2 -function x on a smooth domain Z ⊆ RN can not achieve its minimum in the interior of Z, unless x is a constant. We saw that this result is still true if 1,p x ∈ Wloc (Z) is a p-superharmonic function such that div(Dxp−2 Dx) ∈ L2loc (Z) (see Theorem 4.3.37). From the classical maximum principle one immediately derives the strong comparison principle between two C 2 -functions xk (k = 1, 2) satisfying −x1 ≤ −x2 . Next we prove corresponding results for the p-Laplacian differential operator. Recall that Z ⊆ RN is a bounded domain with a C 2 -boundary ∂Z. 1,p PROPOSITION 5.5.10 If x, y ∈ Wloc (Z) (1 < p < ∞), f, g ∈ L∞ (Z),
−div Dx(z)p−2 Dx(z) = f (z)
and −div Dy(z)p−2 Dy(z) = g(z) a.e. on Z
x(z) ≥ y(z) and
for all z ∈ Z, f (z) ≥ g(z)
a.e. on Z
K = {x ∈ Z : x(z) = y(z)}
is compact, then K = ∅. PROOF: From nonlinear regularity theory (see Theorems 4.3.34 and 4.3.35), we know that x, y ∈ C 1,α (Z). Assume that K is nonempty compact. We can find a relatively compact open set Z1 such that K ⊆ Z1 ⊆ Z 1 ⊆ Z and y(z) < x(z)
for all z ∈ Z \ K.
Given ε > 0, let xε , yε ∈ W 1,p (Z1 ) be solutions of the equations ;
2 (p−2)/2 Dxε (z) = f (z) a.e. on Z1 , −div (ε + Dxε (z) ) (5.253) xε ∂Z = x and
1
; 2 (p−2/2 (z) ) Dy (z) = g(z) a.e. on Z , −div (ε + Dy ε ε 1 . (5.254) yε ∂Z = y
1
Such solutions can be easily obtained by direct minimization of the Euler functionals corresponding to problems (5.253) and (5.254). Moreover, Dxε ε∈(0,1] and Dyε ε∈(0,1] ⊆ Lp (Z1 ) are both bounded. Again we have xε , yε ∈ C 1,α (Z 1 ) (we can always assume that ∂Z 1 is C 2 ) and lim xε = x ε↓0
and
lim yε = y ε↓0
1,β weakly in W 1,p (Z1 ) and strongly in Cloc (Z1 ) for any β ∈ (0, α). We choose Z2 an open subset of Z such that K ⊆ Z2 ⊆ Z 2 ⊆ Z1 and set
$$\xi = \min_{\partial Z_2}(x - y) > 0 \quad\text{and}\quad u_\varepsilon = x_\varepsilon - y_\varepsilon.$$
We choose ε > 0 small so that ξ 4
and
ξ y − yε L∞ (Z2 ) < . 4
for all z ∈ ∂Z2
and
xε (z) − yε (z) <
x − xε L∞ (Z2 ) < So it follows that xε (z) − yε (z) >
ξ 2
ξ 2
for all z ∈ K.
Invoking the mean value theorem, we have −
N ∂ ε ∂uε αij =f −g ≥0 ∂zi ∂zj i,j=1
in Z2 ,
where
(p−4)/2
ε αij = ε + ti Dxε + (1 − ti )Dyε 2 δij ε + ti Dxε + (1 − ti )Dyε 2
∂xε ∂yε ∂xε ∂yε + (p − 2) ti + (1 − ti ) + (1 − ti ) ti ∂zi ∂zi ∂zj ∂zj with ti ∈ (0, 1). Set ηε = min(xε − yε ) and Kηε = z ∈ Z2 : (xε − yε )(z) = ηε . Z2
Evidently Kηε is a nonempty compact subset of Z2 and Dxε (z) = Dyε (z) for all z ∈ Kηε . So we have
(p−4)/2
∂xε ∂xε ε αij = ε + Dxε 2 δij ε + Dxε 2 + (p − 2) on Kηε . (5.255) ∂zi ∂zj ε Moreover, because αij are the entries of the Hessian matrix of the strongly
p/2 N , we have convex function v = (vk )k=1 −→ (1/p) ε + v2 N
ε αij ϑi ϑj ≥ γϑ2
(5.256)
i,j=1 N for some γ > 0 and all ϑ = (ϑk )N k=1 ∈ R . We choose an open neighborhood Zηε of Kηε such that Z ηε ⊆ Z2 , xε − yε > ηε and (5.256) still holds with γ replaced by 1 γ. Note that such a choice is possible because Dxε , Dyε ∈ C(Z 2 , RN ). Then by 2 the strong maximum principle (see Theorem 4.3.19), it follows that uε = xε − yε is constant on Z ηε , a contradiction. This proves that K = ∅.
This proposition leads to the following strong comparison principle. THEOREM 5.5.11 If x, y ∈ C01 (Z), x = 0, f, g ∈ L∞ (Z), f ≥ 0,
−div Dx(z)p−2 Dx(z) = f (z) a.e. on Z
p−2 and −div Dy(z) Dy(z) = g(z) a.e. on Z f (z) ≥ g(z) a.e. on Z and the set C = z ∈ Z : f (z) = g(z) has an empty interior, then x(z) > y(z) for all z ∈ Z and (∂x/∂n)(z) < (∂y/∂n)(z) for all z ∈ ∂Z; that is, we have that x − y ∈ int C01 (Z)+ .
440
5 Boundary Value Problems–Hamiltonian Systems
PROOF: From the nonlinear strong maximum principle (see Theorem 4.3.37), we have ∂x x(z) > 0 for all z ∈ Z and (z) < 0 for all z ∈ ∂Z. ∂n − Acting with the test function −(x − y) , we can see that x(z) ≥ y(z) for all z ∈ Z. We set K = z ∈ Z : x(z) = y(z) (the coincidence set in Z). From Proposition 5.5.10 we know that K cannot be compact unless K is empty. Suppose K is nonempty. Then we can find {zn }n≥1 ⊆ K and z ∈ ∂Z such that zn −→ z. Since Dx, Dy ∈ C(Z, RN ) (see Theorems 4.3.34 and 4.3.35) and x, y ∈ C01 (Z), we have
∂y ∂x (z) = (z) = −β 2 < 0. ∂n ∂n
(5.257)
Set u = x − y. Then u ∈ C01 (Z) and it satisfies −
N ∂
∂u αij =f −g ≥0 ∂z ∂x j i i,j=1
0 (see (5.255) with ε = 0). In particular, we have with αij = αij
∂x 2 ∂x ∂x ∂x p−4 (z) (z) . δij (z) + (p − 2) αij (z) = (z) ∂n ∂n ∂zi ∂zj Exploiting the continuity of the gradient vectors Dx, Dy, we can find a ball B ⊆ Z such that z ∈ ∂B and the elliptic operator defined by the αij is strictly elliptic on B. It follows that either u = 0 in B, which is impossible because by hypothesis int C = ∅ for u > 0 on B and (∂u/∂n)(z) < 0, which contradicts (5.257). Therefore we infer that x(z) > y(z)
for all z ∈ Z.
(5.258)
for all z ∈ ∂Z.
(5.259)
Moreover, from Lemma 4.3.18 we have ∂x ∂y (z) < ∂n ∂n
Hence from (5.258) and (5.259), we conclude that x − y ∈ int C01 (Z)+ .
5.6 Periodic Solutions for Hamiltonian Systems The Hamiltonian equations in mechanics (classical and celestial), can be written in the compact form
−Jx (t) = ∇H t, x(t) a.e. on T = [0, b], (5.260)
5.6 Periodic Solutions for Hamiltonian Systems
where J=
0N −IN IN 0N
441
is the standard 2N × 2N simplectic matrix and ∇H(t, x) is the gradient of the Hamiltonian function x −→ H(t, x) defined on R2N which is assumed to be C 1 . Recall that the simplectic matrix has the following properties, J 2 = −I2N
and
(Jx, y)R2N = −(x, Jy)R2N
for all x, y ∈ R2N .
(5.261)
A basic problem in mechanics is to prove the existence of periodic trajectories for system (5.260). So first we establish the existence of periodic solutions for problem (5.260), when the Hamiltonian function is superquadratic, but does not necessarily satisfy the Ambrosetti–Rabinowitz condition (AR-condition, for short). Recall that the AR-condition on H(t, ·), says the following. (AR): there exist µ > 0 and R > 0 such for almost all t ∈ T and all x ≥ R
0 < µH(t, x) ≤ ∇H(t, x), x R2N . (5.262) Condition (AR) implies that the growth of x−→H(t, x) is superquadratic. Here instead, we make the following hypotheses concerning the Hamiltonian function H(t, x). H1 : H : T × R2N −→ R is a function such that (i) For all x ∈ R2N , t −→ H(t, x) is measurable. (ii) For almost all t ∈ T, x −→ H(t, x) is a C 1 -function. (iii) H(t, x) ≥ 0 for a.a. t ∈ T , all x ∈ R2N and lim
x→∞ H(t,x) 2 x→0 x
(iv) lim
H(t, x) = +∞ x2
uniformly for a.a. t ∈ T.
= 0 uniformly for a.a. t ∈ T .
, c1 , c2 > 0, and M > 0 such (v) There exist constants 1 < r and 1 < ϑ < 1 + r−1 r that
c1 xr ≤ ∇H(t, x), x R2N −2H(t, x) and
∇H(t, x) ≤ c2 xϑ
for a.a. t ∈ T and all x ≥ M.
q
EXAMPLE 5.6.1 The function H(x) = x2 ln(1 + xp ) with p, q > 1 satisfies hypotheses H1 but not the AR-condition (see (5.262)). < Let S 1 =R (2π/b)Z. We consider the Hilbert space V =W 1/2,2 (S 1 , R2N ) of all b-periodic functions x(t) =
k∈Z
exp
2πk
tJ xk ∈ L2 [0, 1], R2N b
with Fourier coefficients xk ∈ R2N that satisfy
442
5 Boundary Value Problems–Hamiltonian Systems x2V = x0 2R2N +
2π |k|xk 2R2N . b k∈Z
The inner product on the Hilbert space V is given by (x, y)V = (x0 , y0 )R2N +
2π |k|(xk , yk )R2N . b k∈Z
Evidently the inner product (·, ·)V generates the norm · V . We define the operator A∈L(V ) by
b
(Ax, y)V =
− Jx (t), y(t)
0
RN
dt
for all x, y ∈ V.
Using the properties of the simplectic matrix (see (5.261)), we see that A is a bounded self-adjoint √ operator and ker A = R2N . Note that W 1/2,2 (S 1 , R2N ) = 1/2 D(|A| ), where |A| = A2 (since A is self-adjoint). Also we consider the energy functional ϕ : W 1/2,2 (S 1 , R2N )−→R defined by ϕ(x) =
1 (Ax, x)V − 2
b
H t, x(t) dt
for all x ∈ W 1/2,2 (S 1 , R2N ).
0
Clearly ϕ ∈ C 1 (V ) and the critical points of ϕ are the solutions of )
* −Jx (t) = ∇H t, x(t) a.e. on [0, b], . x(0) = x(b), x (0) = x (b)
(5.263)
We consider the orthogonal direct sum decomposition V = V − ⊕V0 ⊕V + , where
and
V −= {x ∈ V : xk = 0
for k ≥ 0}
V0 = {x ∈ V : xk = 0
for k = 0}
V = {x ∈ V : xk = 0
for k ≤ 0}.
+
So every x ∈ V may be uniquely written as x = x + x0 + x with x ∈ V − , x0 ∈ V0 and x ∈ V + . Note that the self-adjoint operator A ∈ L(V ) has eigenspaces 2πk Ek = exp tJ xk : xk ∈ R2N b with corresponding eigenvalues 2πk, k∈Z. Moreover, dim Ek = 2N . Hence V − = the negative definite subspace of A V0 = ker A = R2N and
V + = the positive definite subspace of A.
PROPOSITION 5.6.2 If hypotheses H1 hold, then ϕ satisfies the P S-condition.
5.6 Periodic Solutions for Hamiltonian Systems
443
PROOF: Let {xn }n≥1 ⊆ V be a sequence such that |ϕ(xn )| ≤ M
for some M > 0, all n ≥ 1 and ϕ (xn ) −→ 0 as n → ∞.
(5.264)
Because of hypotheses H1 , we have that |H(t, x)| ≤ c3 + c4 xϑ+1
for a.a. t ∈ T, all x ∈ R2N with c3 , c4 > 0.
This growth condition combined with hypothesis H1 (v), imply that
c1 xr − c5 ≤ ∇H(t, x), x R2N −2H(t, x)
(5.265)
for a.a. t ∈ T , all x ∈ R2N , with c5 > 0. Then we have
2ϕ(xn ) − ϕ (xn ), xn V =
b
∇H(t, xn ), xn
0
R2N
− 2H(t, xn ) dt
b
≥ c1
xn r dt − c5 b
(see (5.265)).
0
(5.266) We claim that {xn }n≥1 ⊆ V is bounded. Suppose that this is not the case. We may assume that xn V −→ ∞. Then from (5.264) and (5.266), it follows that 1 xn V
b
xn r dt −→ 0
as n → ∞.
(5.267)
0
By hypothesis H1 (v), we have 1 < ϑ < 1 + (r − 1)/r. Let ξ = (r − 1) r(ϑ − 1) . We have 1 ξ > 1 and ξϑ − 1 = ξ − . (5.268) r Hypothesis H1 (v) implies that ∇H(t, x)ξ ≤ c6 xϑξ + c7
for a.a. t ∈ T, all x ∈ R2N , with c6 , c7 > 0. (5.269)
Recall that xn = xn + x0n + xn with xn ∈ V , x0n ∈ V0 ad xn ∈ V . Exploiting the orthogonality of the component spaces, we have
ϕ (xn ), xn V = (Axn , xn )V −
b
∇H(t, xn ), xn
0
R2N
b
≥ (Axn , xn )V −
∇H(t, xn )xn dt 0
≥ (Axn , xn )V − c8
0
for some c8 > 0, all n ≥ 1. Using (5.269), we have
dt
1/ξ
b
∇H(t, xn )ξ dt
xn dt (5.270)
444
5 Boundary Value Problems–Hamiltonian Systems
1/ξ
b
∇H(t, xn )ξ dt
b
≤
(c6 xn ϑξ + c7 )dt
0
0
≤ c9
1/r
b
b
xn r dt 0
r
1−1/r
xn (ξϑ−1) r−1 dt 0
+ c10 , c9 , c10 > 0 1/r b (ξϑ−1) ≤ c11 xn r dt xn V + c10 , c11 > 0. 0
(5.271) If we combine (5.267), (5.268), and (5.271), we see that
1/ξ 1 b ∇H(t, xn )ξ dt −→ 0 as n → ∞. xn V 0 Then from (5.264) and (5.270), we infer that
1/ξ ϕ (xn )V xn V (Axn , xn )V 1 b ≤ + c8 ∇H(t, xn )ξ dt xn V xn V xn V xn V xn V 0 −→ 0
as n → ∞,
xn V −→ 0 as n → ∞. ⇒ xn V
(5.272)
In a similar fashion, we show that xn −→ 0 xn
as n → ∞.
From hypothesis H1 (v), we have
c12 x − c13 ≤ ∇H(t, x), x R2N −2H(t, x)
(5.273)
for a.a. t ∈ T, all x ∈ R2N ,
c12 , c13 > 0. So it follows that
2ϕ(xn ) − ϕ (xn ), xn V =
b
∇H(t, xn ), xn
0
0
R2N
− 2H(t, xn ) dt
b
≥
(c12 xn − c13 )dt b
≥
(c12 x0n − c12 xn − c12 xn − c13 )dt 0
≥ c13 x0n V − c14 (xn V + xn V ) − c15 , c13 , c14 , c15 > 0, because V0 =R2N . Therefore from (5.272) through (5.274), we deduce that x0n V −→ 0 xn V So finally
as n → ∞.
(5.274)
5.6 Periodic Solutions for Hamiltonian Systems 1=
xn V + x0n V + xn V xn V ≤ −→ 0 xn V xn V
445
as n → ∞,
a contradiction. This proves that {xn }n≥1 ⊆ V is bounded. So we may assume that w
xn −→ x
in X
and
xn −→ x
in L2 (T, R2N )
(recall that W (1/2),2 (S 1 , R2N ) is embedded compactly in L2 (T, R2N )). From (5.264), we have
b
1 as n → ∞. ∇H(t, xn ), xn − x R2N dt −→ 0 (Axn , xn − x)V − 2 0 Evidently
b
∇H(t, xn ), xn − x
0
R2N
dt −→ 0.
So we have (Axn , xn − x)V −→ 0, ⇒ xn V −→ xV
as n → ∞.
w
Because xn −→ x in H and H is a Hilbert space, we conclude that xn −→ x in V , therefore ϕ satisfies the P S-condition. THEOREM 5.6.3 If hypotheses H1 hold, then problem (5.263) admits a nonconstant solution x ∈ C 1 (T, R2N ). PROOF: Hypotheses H1 imply that for almost all t ∈ T and all x ∈ RN , we have |H(t, x)| ≤ c1 + c2 xϑ+1
with c1 , c2 > 0.
(5.275)
Also because of hypotheses H1 (iv), given ε > 0, we can find δ = δ(ε) > 0 such that H(t, x) ≤ εx2 for a.a. t ∈ T and all x ≤ δ. (5.276) From (5.275) and (5.276) it follows that we can find cε such that H(t, x) ≤ εx2 + cε xϑ+1
for a.a. t ∈ T and all x ∈ R2N .
(5.277)
Therefore by choosing ε > 0 small, for every x ∈ V + we have
b
1 b (−Jx , x)R2N dt − H t, x(t) dt ϕ(x) = 2 0 0 ≥ c3 x2V − c4 xϑ+1 V
(5.278)
(see (5.277) and recall that W (1/2),2 (S 1 , R2N ) is embedded compactly in Lϑ+1 (S 1 , R2N )). Because of (5.278), we can find > 0 small such that ϕ(x) ≥ β > 0
for all x ∈ V + with xV = .
(5.279)
Let W = V − ⊕ V 0 and consider e ∈ V + with eV = 1 and E = W ⊕ Re. We introduce
446
5 Boundary Value Problems–Hamiltonian Systems
and
µ = inf (Ax, x)V : x ∈ V − , xV = 1
BE = {x ∈ E : xV = 1},
1 1/2 γ= . A|L µ
(5.280)
For x ∈ BE we write x = x + x0 + x ∈ E. (a) If xV > γx0 + xV , then for any η > 0 we have
b 1 1 ϕ(ηx) = (Aηx, x)V + (Aη x, x)V − H(t, ηx)dt 2 2 0 1 µ ≤ − η 2 xV + AL η 2 x2V (see (5.280) and recall that H ≥ 0) 2 2 ≤0 (because xV > γx0 + xV ). (b) If xV ≤ γx0 + xV , then we have 1 = x2V = x2V + x0 + x2V ≤ (γ 2 + 1)x0 + x2V 1 ⇒ 0< ≤ x0 + x2V . 1 + γ2 $E = x ∈ BE : xV ≤ γx0 + x2V . Set B $E we have Claim: We can find ε1 > 0 such that for all x ∈ B t ∈ T : x(t) ≥ ε1 ≥ ε1 . 1
(5.281)
(5.282)
Here by | · |1 we denote the Lebesgue measure on T . $E Suppose that the claim is not true. Then for any n ≥ 1, we can find xn ∈ B such that t ∈ T : xn (t) ≥ 1 < 1 . n 1 n We set xn = xn + x0n + xn ∈ E. Because dim(V0 ⊕ Re) < ∞ and x0n + xn V ≤ 1, by passing to a subsequence if necessary, we may assume that x0n + xn −→ x0 + x ∈ V0 ⊕ Re as n → ∞, 1 ⇒ 0< ≤ x0 + x2V (see (5.281)). 1 + γ2
(5.283)
Also xn V ≤ 1 for all n ≥ 1 and so because V is a Hilbert space, by the Eberlein– w Smulian theorem, we may assume that xn −→ x in V − as n → ∞. So finally w
xn −→ x = x + x0 + x ⇒ xn −→ x
in L (S , R 2
1
2N
in V as n → ∞, ),
(because V is embedded compactly in L2 (S 1 , R2N )), ⇒ xV > 0
(see (5.283)),
⇒ x2 > 0. Therefore we can find δ2 ≥ δ1 > 0 such that t ∈ T : x(t) ≥ δ1 ≥ δ2 . 1
(5.284)
5.6 Periodic Solutions for Hamiltonian Systems
447
Indeed, if (5.284) is not true, then for all n ≥ 1, we must have t ∈ T : x(t) ≥ 1 = 0, n 1 1 = 1, ⇒ t ∈ T : x(t) < n 1
b 1 ⇒ 0< x(t)2 dt < 2 −→ 0 n 0
as n → ∞,
a contradiction. So we see that (5.284) is true. We set T = t ∈ T : x(t) ≥ δ1 , Tn = t ∈ T : xn (t) < 1/n for all n ≥ 1 and c Tn = T \ Tn . Because of (5.282) and (5.284), we have T ∩ Tn = T \ (T ∩ Tnc ) ≥ |T |1 − |T ∩ Tnc |1 ≥ δ2 − 1 . 1 1 n Let n ≥ 1 be large enough so that δ2 − 1/n ≥ δ2 /2 and δ1 − 1/n ≥ δ1 /2. We have
1 2 δ1 2 ≥ xn (t) − x(t)2 ≥ δ1 − n 2 So it follows that for n ≥ 1 large
b
xn (t) − x(t)2 dt ≥
δ1 2 |T ∩ Tn |1 2
δ1 2
1 ≥ δ2 − 2 n
δ1 3 > 0. ≥ 2
xn (t) − x(t)2 dt ≥
∩Tn T
0
for all t ∈ T ∩ Tn .
(5.285)
But recall that xn −→ x in L2 (S 1 , R2N ). Comparing this with (5.285), we reach a contradiction. This proves the claim (see (5.282)). $E , let T (x) = {t ∈ T : x(t) ≥ ε1 }. Hypothesis H1 (iii) For x = x + x0 + x ∈ B implies that for β0 = AL ε31 > 0, we can find M0 > 0 such that H(t, x) ≥ β0 x2
for a.a. t ∈ T and all x ≥ M0 .
Choose η0 ≥ M0 /ε1 . Then for η ≥ η0 , we have
H t, ηx(t) ≥ β0 ηx(t)2 ≥ β0 r2 ε21 for a.a. t ∈ T (x). Hence ϕ(ηx) = ≤ ≤ ≤ ⇒ ϕ(ηx) ≤
b
1 2 1 H t, ηx(t) dt η (Ax, x)V + η 2 (Ax, x)V − 2 2 0
1 2 H t, ηx(t) dt (because H ≥ 0) η AL − 2 T (x) 1 2 η AL − β0 η 2 ε21 |T (x)|1 2 1 2 1 η AL − β0 η 2 ε31 = − η 2 AL < 0, 2 2 0 for all x ∈ BE and all η ≥ η0 .
(5.286)
448
5 Boundary Value Problems–Hamiltonian Systems
Set C = x ∈ W : xV ≤ 2η0 ⊕ ηe : 0 ≤ η ≤ 2η0 and C0 = ∂C. Then from (5.286) we have ϕC ≤ 0. (5.287) 0
This together with (5.279) and Proposition 5.6.2, permit the use of the generalized mountain pass theorem (see Theorem 4.1.26), which gives x ∈ V such that ϕ(x) ≥ β > 0 = ϕ(0)
and
ϕ (x) = 0.
It follows that x = 0 and it is a solution of (5.263). Moreover, because H ≥ 0 we see that x is nonconstant. Finally, it is clear from (5.263) that x ∈ C(T, R2N ). Using the same technique we can also have an existence theorem for Hamiltonian systems for which the function H(t, x) satisfies the (AR)-condition. In fact in this case the proofs are simplified. So our hypotheses on the Hamiltonian H(t, x) are the following. H2 : H : T × R2N −→ R is a function such that (i) For all x ∈ R2N , t −→ H(t, x) is measurable. (ii) For almost all t ∈ T, x −→ H(t, x) is a C 1 -function. (iii) H(t, x) ≥ 0 for a.a. t ∈ T , all x ∈ R2N , and lim a.a. t ∈ T .
H(t,x) 2 x→0 x
= 0 uniformly for
(iv) Condition (AR) is satisfied (see (5.262)). REMARK 5.6.4 Integrating the (AR)-condition (hypothesis H2 (iv)), we see that for a.a. t ∈ T and all x ∈ R2N , we have c1 xµ − c2 ≤ H(t, x) with c1 , c1 > 0. Hence H(t, )˙ is superquadratic. Following the reasoning of the proof of Theorem 5.6.3, via the generalized mountain pass theorem (see Theorem 4.1.26), we can have the following theorem. The reader can fill the details. THEOREM 5.6.5 If hypotheses H2 hold, then problem (5.263) admits a nonconstant solution x ∈ C 1 (T, R2N ). Next we consider a problem of a seemingly different nature. So we consider a Hamiltonian system but instead of fixing the period, the energy level is prescribed. So we consider the following autonomous Hamiltonian system:
−Jx (t) = ∇H x(t) for all t ∈ T. (5.288) Let x ∈ C 1 (T, R2N ) be a solution of (5.288). Taking inner product of (5.288) with x (t), we obtain
∇H x(t) , x (t) R2N d
⇒ H x(t) = 0 for all t ∈ T, dt
⇒ H x(t) = constant for all t ∈ T.
5.6 Periodic Solutions for Hamiltonian Systems
449
This means that the energy is conserved. It is therefore natural to seek solutions of (5.288) and in particular periodic solutions with a prescribed energy level. The difficulty in dealing with this problem is that the period, and consequently the underlying solution space, is not a priori known. Nevertheless, with some assumptions on the Hamiltonian H(x), we are able to reduce the fixed energy case to the fixed period case. We start with a result, which says that under some conditions on the gradient map ∇H, the trajectories of (5.288) on the energy manifold S, are actually independent of the Hamiltonian H and depend only on the manifold S. PROPOSITION 5.6.6 If Hk ∈ C 1 (R2N ) and ck ∈ R, k = 1, 2, are such that
and
S = Hk−1 (ck ),
k = 1, 2
∇Hk (x) = 0
for all x ∈ S, k = 1, 2,
then the trajectories of the Hamiltonian systems
for all t ∈ T, −Jx (t) = ∇Hk x(t)
k = 1, 2,
are the same. of system correspondPROOF: Let x1 ∈C 1 (T, R2N ) be a solution the Hamiltonian
ing to H1 . Note that the vectors ∇H1 x1 (t) , ∇H2 x1 (t) , t ∈ T , are normal to S. So we can find µ : T −→ R such that
∇H2 x1 (t) = µ(t)∇H1 x1 (t) for all t ∈ T. (5.289)
By hypothesis ∇H1 x1 (t) , ∇H2 x1 (t) = 0 for all t ∈ T and so µ(t) = 0 for all t ∈ T . Moreover, because t −→ ∇Hk x1 (t) is continuous for k = 1, 2, it follows that t −→ µ(t) is continuous. Therefore t −→ µ(t) has constant sign on T . Let
t
ϑ(t) = 0
1 ds. µ(s)
Then ϑ is continuous and strictly monotone. We set x2 = x1 ◦ ϑ−1 and we have Jx2 (t) = J(x1 ◦ ϑ−1 ) (t) 1 = J(x1 ◦ ϑ−1 )(t) ◦ ϑ−1 (t) (by the chain rule) ϑ
= −∇H1 x1 ϑ−1 (t) µ ϑ−1 (t)
= −∇H2 x1 ϑ−1 (t) for all t ∈ ϑ(T ) (see 5.289))
for all t ∈ ϑ(T ), = −∇H2 x2 (t)
⇒ x2 ∈ C 1 ϑ(T ), R2N is a solution of − Jx (t) = −∇H2 x(t) , for all t ∈ ϑ(T ).
REMARK 5.6.7 Evidently if x1 ∈ C 1 (T, R2N ) is periodic, then so is x2 ∈ C 1 (T, R2N ).
450
5 Boundary Value Problems–Hamiltonian Systems
Now we use Proposition 5.6.6 and some convexity properties of H, to replace H by another Hamiltonian H that satisfies hypotheses H1 and so Theorem 5.6.3 can be used to produce a periodic trajectory. The hypotheses on the time-invariant (autonomous) Hamiltonian function H(x), are the following. H3 : H ∈ C 1 (R2N ), the energy surface S = H −1 is the boundary of a relatively compact, star-shaped neighborhood of the origin and ∇H(x) = 0 for all x ∈ S. THEOREM 5.6.8 If hypotheses H3 hold, then system (5.288) admits a periodic trajectory. PROOF: Because S = ∂U with U a relatively compact, star-shaped neighborhood of the origin in R2N , for every x ∈ R2N , we can find a unique ξ(x) ∈ S and µ(x)>0 such that x = µ(x)ξ(x) (in fact µ(x) = xn /ξ(x)). Evidently µ is 1-homogeneous and C 1 on R2N \ {0}. We define ) 0 if x = 0 . H(x)= µ(x)4 if x = 0
−1 Then H ∈ C 1 (R2N ), H (1) = S, H ≥ 0, lim H(x) x2 = 0 and since H is 4x→0
homogeneous we have that ∇H(x), x R2N = 4H(x) for all x ∈ R2N . Therefore in this case H satisfies the global (AR)-condition and so hypotheses H3 are satisfied and we can apply Theorem 5.6.5. So for b = 2π (for example), we can find a nonconstant
2π-periodic solution u ∈ C 1 (T, R2N ), T = [0, 2π]. We need not have H u(t) = 1
for all t ∈ T . Nevertheless, we have that H u(t) = λ = 0 (because u = 0). Set
1 . Then H νu(t) = 1 for all t ∈ T . Moreover, ν = λ1/4
1 for all t ∈ T. (5.290) −J νu (t) = ν∇H u(t) = 2 ∇H νu(t) ν
Set τ = ν −2 t and x(τ ) = νu(t). Then x ∈ C 1 ν12 T, R2N is a 2π -periodic map ν2 which by virtue of (5.290) satisfies −Jx (τ ) =
dt
1 ∇H νu(ν 2 τ ) (τ ) = ∇H x(τ ) . ν2 dτ
5.7 Remarks 5.1: Hypotheses H(f )1 are such that when p = 2 (semilinear problem), they incorporate in our framework the so-called asymptotically linear problems at 0+ and at +∞. Since the appearance of the pioneering work of Amann–Zehnder [17], these problems have attracted a lot of interest and several papers dealing with them have appeared. Indicatively we mention the works of Stuart–Zhou [565], Tehrani [579], and Zhou [627] (for p = 2), and Costa–Magalhaes [159], Li–Zhou [379], Fan–Zhao–Huang [237], Huang–Zhou [318], and Hu–Papageorgiou [317] (for p > 1). In the last work
5.7 Remarks
451
the potential function F (z, x) is nonsmooth, locally Lipschitz in x ∈ R, and so the right-hand side nonlinearity in problem (5.1) is multivalued, namely the generalized subdifferential of the function x −→ F (z, x). From the aforementioned nonlinear works with the exception of Huang–Zhou [318] and Hu–Papageorgiou [317], the rest deal with the situation
considered here in hypotheses H(f )1 , namely they assume that the slope f (z, x) (xp−1 ) stays below λ1 > 0 asymptotically as x −→ 0+ and stays above λ1 > 0 asymptotically as λ → +∞. However, their hypotheses are more restrictive than the ones used here. Huang–Zhou [318] and Hu–Papageorgiou [317] deal with the opposite situation. Compared with the Dirichlet problem, the study of nonlinear Neumann problems involving the p-Laplacian differential operator is lagging behind. We mention the works of Binding–Drabek–Huang [76], Faraci [238], Filippakis–Gasi´ nski– Papageorgiou [246], Motreanu–Papageorgiou [440], and Papalini [474]. When A = 0, problem (5.30) has been studied extensively and many solvability conditions are given, such as the coercivity condition (see Berger–Schechter [69]), the convexity condition (see Mawhin [414]), the subadditivity condition (see Tang [576]), and the sublinear condition (see Tang [577]). The case where A(t) = k2 ω 2 IN , with k ∈ N , ω = 2π/b, and IN the N×N identity matrix, was considered by Mawhin– Willem [415], under the assumption that the potential function x −→ F (t, x) is a C 1 convex function (hence x −→ ∇F (t, x) is a continuous, monotone, hence maximal monotone operator from RN into itself). The approach of Mawhin–Willem [415, p. 61], is based on a variant of the dual action principle. Also in the book of Mawhin–Willem [415, p. 88], we find the general problem, with the right-hand side nonlinearity F (t, x) satisfying |F (t, x)| ≤ h(t)
and
∇F (t, x) ≤ h(t)
for almost all t ∈ T , all x ∈ R , and with h ∈ L1 (T )+ . The potential function x −→ F (t, x) is not assumed to be convex. The work of Mawhin–Willem [415], was extended recently by Tang–Wu [578], who considered systems with a general Carath´eodory, strictly sublinear nonlinearity. The opposite situation, of a strictly superlinear nonlinearity, was studied by Motreanu–Motreanu–Papageorgiou [438]. We should mention that by virtue of hypothesis H(F )(iv), problem (5.31) is classified as a strongly resonant (at the zero eigenvalue) problem. Such problems exhibit a certain partial lack of compactness, which is evident in Proposition 5.1.16. Finally we should mention that uses of the second deformation theorem (see Theorem 4.6.1) to obtain multiplicity results for boundary value problems, can be found in Papageorgiou–Papageorgiou [484] and Kyritsi–Motreanu–Papageorgiou [369]. Multiplicity results for resonant nonlinear elliptic problems with the p-Laplacian can be found in Costa–Magalhaes [159], Li–Zhou [379], and Liu–Su [390]. Finally Theorem 5.1.27, was first proved for p = 2 by Brezis–Nirenberg [106]. N
5.2: The method of upper–lower solutions, provides an effective tool to produce existence theorems for first- and second-order initial and boundary value problems and to generate monotone iterative techniques which provide constructive methods (amenable to numerical treatment) to obtain these solutions. The question of existence of multiple solutions for problem (5.1) under condition of superlinear behavior of the nonlinearity f (z, x) both at zero and ±∞, has been studied in the past only in the context of semilinear problems (i.e., p = 2). First Ambrosetti–Mancini [22] proved that if λ > λ1 , then the problem has two nontrivial
452
5 Boundary Value Problems–Hamiltonian Systems
solutions of constant sign (one positive and one negative). Soon thereafter Struwe [563] improved their result by showing that if λ > λ2 , then problem (5.1) (with p = 2) has at least three nontrivial solutions. Ambrosetti–Lupo [23] slightly improved the result of Struwe and they also presented an approach based on Morse theory. This required that f (z, ·) ∈ C 1 (R). The most general result for the semilinear problem (p = 2), can be found in Struwe [564, p. 132], who succeeded in eliminating the differentiability condition on the nonlinearity f and simplified the argument of Ambrosetti–Lupo. Still though Struwe [564] requires that f is Lipschitz continuous in x ∈ R. So even when p = 2, Theorem 5.2.10 is more general than the result of Struwe [564]. To our knowledge no other such multiplicity result for problem (5.1) with p > 1, can be found in the literature. Problem (5.33) has the interesting feature that it presents a compact formulation for the Dirichlet, Neumann, and Sturm–Liouville problems. Moreover, the method of proof for problem (5.33), can be repeated with only very minor changes in the case of the periodic problem (see problem (5.145)). Even more general nonlinear, multivalued boundary conditions can be found in the works of Halidias–Papageorgiou [282] and Gasi´ nski–Papageorgiou [258]. The method of upper and lower solutions was used to study such problems by Bader–Papageorgiou [50] and Douka–Papageorgiou [204]. Further uses of the method to initial and boundary value problems can be found in the book of Heikkila–Lakshmikantham [287]. 5.3: Degree theory has always been a powerful tool for the study of boundary value problems, especially for ordinary differential equations. The monograph of Mawhin [412] contains many such examples. The relatively recent generalizations of degree theory to nonlinear operators of monotone type (see Section 3.3) paved the way for the use of degree-theoretic methods to boundary value problems of nonlinear partial differential equations and to problems with unilateral constraints (such as variational and hemivariational inequalities). Our approach here in dealing with problem (5.146) is close to Amann [16] (where p = 2) and Ambrosetti–Garcia Azorero–Peral Alonso [24]. However, instead of the Leray–Schauder degree, here we employ the degree for operators of type (S)+ (see Section 5.3). This permits us to assume a nonsymmetric structure for our problem, in contrast to Ambrosetti–Garcia Azorero–Peral Alonso [24] (see hypothesis H(f )1 ). Other uses of degree-theoretic methods in the study of nonlinear elliptic problems with unilateral constraints can be found in Fillipakis–Papageorgiou [245] and Aizicovici–Papageorgiou–Staicu [6]. For the periodic problem (5.180), hypotheses H(f )2 assume a nonuniform nonresonance condition between two successive eigenvalues of the negative scalar pLaplacian with periodic boundary conditions. Analogous situations for p = 2 (semilinear problems) were considered by Fonda–Mawhin [251], and Habets–Metzen [280], and for p > 1 (nonlinear problems) and Dirichlet boundary value conditions by Zhang [623]. In hypotheses H(f )3 the asymptotic at 0 and ±∞ nonuniform nonresonance conditions (see hypotheses H(f )3 (iv) and (v)), are replaced by Landesman–Lazer type conditions (see hypothesis H(f )2 (iv)). In this direction results for p = 2 were obtained by Cesari–Kannan [141], Iannacci–Nkashama [322]; also see the references therein. 
The results on problem (5.180) here are a particular case of those by Aizicovici–Papageorgiou–Staicu [5]. 5.4: Problem (5.198) was studied for p = 2 (semilinear case) by Rabinowitz [508]. Problem (5.198) with p > 1 and m = 1, was examined by Fan–Li [236]. They also used
5.7 Remarks
453
the Ambrosetti–Rabinowitz condition (see hypothesis H(f )1 (iv)) and in addition they assumed that near the origin the right-hand side nonlinearity of the problem grows like |x|r−1 with r > p. We should also mention the work of Garcia Melian– Sabina de Lis [257], who obtained bifurcation results but under radial symmetry and with a C 2 -right-hand side nonlinearity. Proposition 5.4.8 is due to Pohozaev [497]. In problem (5.231) the term causing ∗ problems is the last term |x|2 −2 x. Recall that in the Sobolev embedding theorem ∗ the case r = 2 is the limiting case for the inclusion H01 (Z) ⊆ Lr (Z) and it is not compact. Thus for s = (N + 2) (N − 2) − 2∗ − 1 the map x −→ xs is not compact from H01 (Z) into H −1 (Z). For this reason compactness arguments do not work for problem (5.231), the linear term λu may be replaced by other compact perturbations (see Brezis–Nirenberg [104]). Additional results for problems with critical nonlinearities can be found in the works of Ambrosetti–Struwe [21], Cerami– Fortunato–Struwe [137], Guedda–Veron [275], Miyagaki [437], and Willem [606]. 5.5: Maximum principle–type results for the p-Laplacian can be found in Alegretto– Huang [12] and Garcia Melian–Sabina de Lis [257] who proved Theorem 5.1.8. Analogous results for some general nonlinear differential operators, can be found in Damascelli [172]. Also, we should mention the so-called antimaximum principle. So consider the following problem.
) * −div Dx(z)p−2 Dx(z) = λ|x(z)|p−2 x(z) + h(z) a.e. on Z, . (5.291) Bx = 0 Here B stands for the Dirichlet or Neumann boundary conditions, 1 < p < ∞ and h ∈ L∞ (Z). THEOREM 5.7.1 If h ∈ L∞ (Z), h(z) ≥ 0 for almost all z ∈ Z, h = 0, then (5.291) there exists δ = δ(h) such that for all λ ∈ (λ1 , λ1 + δ) any solution x ∈ C 1 (Z) of (5.291) satisfies x(z) < 0 for all z ∈ Z and (∂x/∂n)(z) > 0 for all z ∈ ∂Z. REMARK 5.7.2 It can be shown that δ > 0 can be chosen independently of h for the scalar (i.e., N = 1) Neumann problem (uniform antimaximum principle). For p = 2, the antimaximum principle was investigated by Clement–Peletier [155] and Godoy–Gossez–Paczka [267]. For p = 2 we refer to Godoy–Gossez–Paczka [268]. For comparison results involving the p-Laplacian, we refer to Garcia Melian– Sabina de Lis [257] (they obtained Theorem 5.5.9) and Guedda–Veron [275] (who proved Theorem 5.5.11). Further results in this direction can be found in Alegretto– Huang [12] and Damascelli [172]. 5.6: Periodic solutions with prescribed minimal period were obtained by Clarke [150] (who introduced the dual action technique), Clarke–Ekeland [151], Girardi– Matzeu [265], Mawhin–Willem [413] and Rabinowitz [506] (who employed the (AR)condition, see (5.262)). Periodic solutions on a given energy surface were first proved by Rabinowitz [506] and Weinstein [603]. Additional existence and multiplicity results in this direction can be found in Berestycki–Lasry–Mancini–Ruf [66], Ekeland– Hofer [225], and Rabinowitz [507]. Detailed studies of Hamiltonian systems can be
454
5 Boundary Value Problems–Hamiltonian Systems
found in the books of Chang [143], Ekeland [227], Mawhin–Willem [415], and Struwe [564].
6 Multivalued Analysis
Summary. *Multivalued analysis deals with the study of maps whose values are sets. Multivalued analysis is closely related to nonsmooth analysis and a symbiotic relationship exists between them which feeds both with new ideas and directions to grow. This chapter presents in detail the main aspects of nonsmooth analysis. We study the continuity and measurability properties of multifunctions (set-valued functions) and we present the main selection theorems, for both continuous and measurable selectors (Michael’s theorem and the Kuratowski–Ryll Nardzewski and Yanlov–von Neumann–Aumann theorems). Then we deal with the set of integrable selectors of a multifunction and develop the main properties of the set-valued integral. We also prove fixed point theorems and study Carath´eodory multifunctions. Finally we examine the different modes of convergence of sets and of multifunctions.
Introduction Multivalued analysis deals with the study of the properties of the maps whose values are sets (elements in a hyperspace). The need for set-valued maps was recognized early in the twentieth century, but a systematic study started only in the mid-1960s. It is closely related to nonsmooth analysis (see Chapter 1). In fact the real explosion on multivalued analysis occurred at the exact same time that nonsmooth analysis made its appearance. Since then the two fields have moved in sychronization and provided each other with new tools, ideas, concepts, and results. This symbiotic relationship sustains their remarkable growth. In this chapter we attempt a survey of some of the basic aspects of the theory of multivalued analysis. Needless to say our presentation omits certain parts of the theory. A more comprehensive treatment can be found in the two volumes of Hu–Papageorgiou [313, 316]. In Section 6.1, which is topological in nature, we introduce various continuity notions. We investigate their properties and also examine how they are related among them. Section 6.2 is measure-theoretic in nature and deals with measurability properties of multifunctions. N.S. Papageorgiou, S.Th. Kyritsi-Yiallourou, Handbook of Applied Analysis, Advances in Mechanics and Mathematics 19, DOI 10.1007/b120946_6, © Springer Science+Business Media, LLC 2009
456
6 Multivalued Analysis
In Section 6.3 we address the fundamental problem of existence of measurable selectors for multifunctions. So we prove the Michael selection theorem (for continuous selectors) and the Kuratowski–Ryll Nardzewski and the Yankov–von Neumann– Aumann selection theorems. In Section 6.4 we examine decomposable sets and the resulting set-valued integration theory. It turns out that in many situations decomposability is an effective substitute of convexity. Section 6.5 contains some major fixed point theorems. Also we conduct a brief investigation of multifunctions of two variables (Carath´eodory multifunctions). Finally in Section 6.6 we introduce and study various modes of convergence of sets. We also obtain Fatou-type results for set-valued integrals and for sets of integrable selectors.
6.1 Continuity of Multifunctions We start by fixing the notation. So let X be a Hausdorff topological space. We introduce the following hyperspaces, Pf (X) = {A ⊆ X : A is nonempty and closed} and
Pk (X) = {A ⊆ X : A is nonempty and compact}.
If X is a normed space, we also consider the following hyperspaces Pf c (X) = {A ∈ Pf (X) : A is convex} P(w)kc (X) = {A ⊆ X : A is nonempty, (weakly-) compact and convex} Pbf (c) (X) = {A ∈ Pf (X) : A is bounded (and convex)}. In what follows for X a Hausdorff topological space and x ∈ X, by N (x) we denote the filter of all neighborhoods of x. If (X, d) is a metric space, x ∈ X, and r > 0, then Br (x) = {y ∈ X : d(x, y) < r} (= the open r-ball with center at x ∈ X) and B r (x) = {y ∈ X : d(x, y) ≤ r} (= the closed r-ball with center at x ∈ X). When X is a normed space and x = 0, then as before we write Br = Br (0) and B r = B r (0). For sets X, Y and a multifunction (set-valued function) F : X −→ 2Y , given a set C ⊆ Y , we have two types of inverse images of C under the action of F , namely F + (C) = {x ∈ X : F (x) ⊆ C} and
−
(the strong inverse image of C)
F (C) = {x ∈ X : F (x) ∩ C = ∅}
(the weak inverse image of C).
−
Evidently we have F (C) ⊆ F (C) ⊆ X. It is straightforward to check that these inverse images obey the following calculus rules. +
PROPOSITION 6.1.1 Suppose that X, Y, Z are nonempty sets. (a) If F, G :X −→2Y are two multifunctions and (F ∪ G)(x) = F (x) ∪ G(x), (F ∩ G)(x) = F (x) ∩ G(x) for all x ∈ X, then (F ∪ G)+ (C) = F + (C) ∪ G+ (C),
(F ∩ G)+ (C) ⊇ F + (C) ∩ G+ (C)
(F ∪ G)− (C) = F − (C) ∪ G− (C),
(F ∩ G)− (C) ⊆ F − (C) ∩ G− (C)
for all C ⊆ Y.
6.1 Continuity of Multifunctions
457
(b) If F : X −→ 2Y , G : X −→ 2Z and x −→ (G ◦ F )(x) = G F (x) = G(y) for y∈F (x)
all x ∈ X, then (G ◦ F )+ (C) = F + G+ (C) , (G ◦ F )− (C) = F − G− (C) for all C ⊆ Y . (c) If {Ci , C}i∈I are subsets of Y (I being an arbitrary index set), then ! + # +
!
# F (Ci ) ⊆ F + Ci , F (Ci ) ⊆ F + Ci i∈I
!
i∈I
−
F (Ci ) =
! i∈I
i∈I −
F (Ci ),
i∈I
F
−
# i∈I
Ci ⊆
#
i∈I
F − (Ci ).
i∈I
(d) If F : X −→ 2Y , G : X −→ 2Z are two multifunctions and F ×G : X −→ 2Y ×Z is defined by (F × G)(x) = F (x) × G(x) for all x ∈ X, then (F × G)+ (C × D) = F + (C) ∩ G+ (D), (F × G)− (C × D) = F − (C) ∩ G− (D) for all C ⊆ Y , D ⊆ Z; these equalities are still true for arbitrary products of multifunctions. Now we introduce the first continuity notions for multifunctions. In what follows X, Y are Hausdorff topological spaces. Additional hypotheses are introduced as needed. DEFINITION 6.1.2 Let F : X −→ 2Y be a multifunction. (a) We say that F is upper semicontinuous at x0 ∈ X (usc at x0 for short), if for all V ⊆ Y open such that F (x0 ) ⊆ V , we can find U ∈ N (x0 ) such that F (U ) ⊆ V . If this true at every x0 ∈ X, we say that F is upper semicontinuous (usc for short). (b) We say that F is lower semicontinuous at x0 ∈ X (lsc at x0 for short), if for all V ⊆ Y open such that F (x0 ) ∩ V = ∅, we can find U ∈ N (x0 ) such that F (x) ∩ V = ∅ for all x ∈ U . If this is true at every x0 ∈ X, we say that F is lower semicontinuous (lsc for short). (c) We say that F is continuous or (Vietoris continuous) at x0 ∈ X, if it is both usc and lsc at x0 ∈ X. If this is true at every x0 ∈ X, then we say that F is continuous or Vietoris continuous. The following propositions are immediate consequences of the above definitions. PROPOSITION 6.1.3 Given a multifunction F : X −→ 2Y , the following statements are equivalent. (a) F is usc. (b) For every C ⊆ Y closed, F − (C) is closed in X. (c) If x ∈ X, {xα }α∈J ⊆ X is a net in X, xα −→ x, V ⊆ Y is an open set with F (x) ⊆ V , then we can find α0 ∈ J such that F (xα ) ⊆ V for all α ∈ J with α ≥ α0 . PROPOSITION 6.1.4 Given a multifunction F : X −→ 2Y , the following statements are equivalent. (a) F is lsc. (b) For every C ⊆ Y closed, F + (C) is closed in X.
458
6 Multivalued Analysis
(c) If x ∈ X, {xα }α∈J ⊆ X is a net in X, xα −→ x, and V ⊆ Y is an open set with F (x) ∩ V = ∅, then we can find α0 ∈ J such that F (xα ) ∩ V = ∅ for all α ∈ J with α ≥ α0 . (d) If x ∈ X, {xα }α∈J ⊆ X is a net in X, xα −→ x, and y ∈ F (x), then for every α ∈ J we can find yα ∈ F (xα ) such that yα −→ y in X. REMARK 6.1.5 It is clear from Definition 6.1.2, that when F is single-valued, the notions of upper and lower semicontinuity coincide with the usual notion of continuity of a map between two Hausdorff topological spaces. Also because of Proposition 6.1.1(c), we see that in the definition of lower semicontinuity (see Definition 6.1.2(b)), we can take V to be a basic open set in Y . In general the notions of upper and lower semicontinuity are distinct. Upper semicontinuity allows upward jumps (in the sense of inclusion), and lower semicontinuity allows downward jumps (in the sense of inclusion); see Propositions 6.1.3(c) and 6.1.4(c), respectively. For example [0, 1] if x = 0 F1 (x)= 1 if x = 0 is usc but not lsc, whereas F2 (x)=
{0} [0, 1]
if x = 0 if x = 0
is lsc but not usc. Note that if F : R −→ 2R is defined by F (x) = [ψ(x), ϕ(x)] = {y ∈ R : ψ(x) ≤ y ≤ ϕ(x)} and ψ is lower semicontinuous, ϕ is upper semicontinuous, then F is usc, whereas if ψ is upper semicontinuous, and ϕ is lower semicontinuous, then F is lsc. Combining Propositions 6.1.3 and 6.1.4, we have the following. PROPOSITION 6.1.6 Given a multifunction F : X −→ 2Y , the following statements are equivalent. (a) F is continuous. (b) For every C ⊆Y closed, F + (C) and F − (C) are both closed in X. (c) If x ∈ X, {xα }α∈J ⊆ X is a net in X, xα −→ x, and V ⊆ Y is an open set with F (x) ⊆ V or F (x) ∩ V = ∅, then we can find α0 ∈ J such that for all α ∈ J, α ≥ α0 we have F (xα ) ⊆ V or F (xα ) ∩ V = ∅. DEFINITION 6.1.7 Given a multifunction F : X −→ 2Y , its graph is the set GrF = {(x, y) ∈ X × Y : y ∈ F (x)}. It is well known that a continuous map between two Hausdorff topological spaces, has a closed graph. The same is true for usc multifunctions. PROPOSITION 6.1.8 If Y is a regular topological space and F :X −→Pf (Y ) is usc, then GrF is closed in X × Y .
6.1 Continuity of Multifunctions
459
PROOF: Let {(xα , yα )}α∈J ⊆ GrF ⊆ X ×Y be a net and suppose that (xα , yα ) −→ (x, y) in X × Y . Arguing by contradiction, suppose that y ∈ / F (x). Then exploiting the regularity of the space Y , we can find V1 ∈ N (y) and V2 an open neighborhood of F (x) such that V2 ⊇ F (x) and V1 ∩ V2 = ∅. But because yα −→ y and using Proposition 6.1.3, we can find α0 ∈ J such that for all α ∈ J, α ≥ α0 we have yα ∈ V1 and F (xα ) ⊆ V2 , hence yα ∈ / F (xα ), a contradiction. REMARK 6.1.9 It is clear from the above proof that if F has values in Pk (Y ), then in the above proposition we can drop the requirement that Y is regular. This is consistent with the fact that for single-valued maps continuity implies closedness of the graph without any additional conditions on the space Y . Even for single-valued maps, the converse of Proposition 6.1.8 is not in general true. PROPOSITION 6.1.10 If F : X −→ Pk (Y ) has a closed graph and it is locally compact (i.e., for every x ∈ X we can find U ∈ N (x) such that F (U ) is compact in Y ), then F is usc. PROOF: Given C ⊆ Y we show that F − (C) is closed and this by virtue of Proposition 6.1.3 implies that F is usc. So let {xα }α∈J ⊆ F − (C) be a net and assume that xα −→ x. By hypothesis we can find U ∈ N (x) such that F (U ) is compact in Y . We can find α0 ∈ J such that if α ∈ J, α ≥ α0 , then xα ∈ U and so if yα ∈ F (xα ), α ≥ α0 , we can find a subnet {yβ }β∈I of {yα }α∈J and y ∈ F (U ) such that yβ −→ y. Evidently (x, y) ∈ GrF ∩ (X × C); that is, y ∈ F (x) and y ∈ C. Therefore x ∈ F − (C) and so F − (C) is closed. DEFINITION 6.1.11 We say that a multifunction F : X −→ 2Y is closed (resp., sequentially closed ), if GrF ⊆X×Y is closed (resp., sequentially closed). REMARK 6.1.12 Evidently a closed multifunction F : X −→ 2Y \{∅} has values in Pf (Y ). PROPOSITION 6.1.13 If F : X −→ Pk (Y ) is usc and K ⊆ X is compact, then F (K) ⊆ Y is compact. PROOF: Let {yα }α∈J ⊆ F (K) be a net. Then yα ∈ F (xα ), xα ∈ K for all α ∈ J. By virtue of the compactness of K we can find {xβ }β∈I a subnet of {xα }α∈J such that xβ −→ x. We claim that {yβ }β∈I has a cluster point in F (x). We argue indirectly. So suppose that for all y ∈ F (x), we can find β0 (y) ∈ J and V (y) ⊆ N (y) such that yβ ∈ / V (y) for all β ∈ I, β ≥ β0 (y). Note that {V (y)}y∈F (x) is an open cover of the compact set F (x). So we can find {V (y)}N k=1 an open subcover. Set N
V = ∪ V (yk ) ∈ N (y). Then we can find β ∈ I such that for all β ∈ I, β ≥ β1 , we k=1
N
have yβ ∈ / V = ∪ V (yk ) ⊇ F (x), which contradicts Proposition 6.1.3(c). Therefore k=1
we can find a subnet {yr }r∈S of {yβ }β∈I such that yr −→ y ∈ F (x) ⊆ F (K). This proves the compactness of F (K). The following two functions associated with a set A ⊆ X play a central role in what follows.
460
6 Multivalued Analysis
DEFINITION 6.1.14 (a) Let (X, d) be a metric space and A ⊆ X. For every x ∈ X, we define d(x, A) = inf[d(x, a) : a ∈ A]. As usual, we adopt the convention that inf = +∞. The distance function ∅
x −→ d(x, A) is a contraction. (b) Let X be a normed space. By X ∗ we denote the topological dual of X and by ·, · the duality brackets for the pair (X, X ∗ ). Given A ⊆ X, for every x∗ ∈ X ∗ we define σ(x∗ , A) = sup[x∗ , a : a ∈ A]. As usual, we adopt the convention that sup = −∞.The function x∗ −→ σ(x∗ , A) ∅
from X ∗ into R ∪ {±∞} is called the support function of the set A. PROPOSITION 6.1.15 Let F : X −→ 2Y \{∅} be a multifunction. (a) If (Y, d) is a metric space, then F (·) is lsc if and only if for every y ∈ Y , x −→ d y, F (x) is upper semicontinuous.
(b) If (Y, d) is a metric space and F is usc, then for every y ∈ Y x −→ d y, F (x) is lower semicontinuous; the converse is true if F is locally compact (see Proposition 6.1.10). (c) If Y is a normed space with the weak topology and F is usc, then for
furnished all y ∗ ∈ Y ∗ x −→ σ y ∗ , F (x) is an upper semicontinuous function from X into R = R ∪ {+∞}. PROOF: (a) ⇒ : Suppose F is lsc. We
need to show that for every λ ∈ R the upper level set Uλ = x ∈ X : ϑy (x) = d y, F (x) ≥ λ is closed. So suppose that {xα }α∈J ⊆ Uλ is a net and assume that xα −→ x. Given ε > 0 we can find v ∈ F (x) such that d(y, v) < ϑy (x) + ε. Because F is lsc, we can find α0 ∈ J such that if α ∈ J, α ≥ α0 we have F (xα ) ∩ Bε (v) = ∅. So we can find yα ∈ F (xα ), α ≥ α0 , such that d(y, yα ) ≤ ϑy (x) + 2ε and so ϑy (xα ) ≤ ϑy (x) + 2ε, hence λ ≤ ϑy (x) + 2ε. Because ε > 0 was we let ε ↓ 0 to obtain λ ≤ ϑy (x). Therefore Uλ is closed
arbitrary, and so x −→ d y, F (x) is upper semicontinuous. ⇐: Let V ⊆ Y be open and let x ∈ F − (V ). We choose y ∈ F (x) ∩ V . Let ε > 0 such that Bε (y) ⊆ V . Because ϑy (·) is upper semicontinuous, we can find U ∈ N (x) such that ϑy (x ) < ϑy (x) + ε = ε ⇒ F (x ) ∩ Bε (y) = ∅
⇒ F (x ) ∩ V = ∅
for all x ∈ U, for all x ∈ U,
for all x ∈ U.
This means that F is lsc (see Definition 6.1.2(b)). (b) We need that for every λ ∈ R, the lower level set Lλ = {x ∈ X :
to show ϑy (x) = d y, F (x) ≤ λ} is closed. To this end let {xα }α∈J ⊆ Lλ be a net such that xα −→ x. Because F is usc, given ε > 0, we can find α0 ∈ J such that if α ∈ J, α ≥ α0 , then
F (xα ) ⊆ F (x)ε = {v ∈ Y : d v, F (x) < ε}, ⇒ ϑy (x) < ϑy (xα ) + ε, ⇒ ϑy (x) < λ + ε.
6.1 Continuity of Multifunctions
461
Because ε > 0 was arbitrary, we let ε ↓ 0 to conclude that ϑy (x) ≤ λ and so x ∈ Lλ . This proves the lower semicontinuity of the distance function x −→ ϑy (x) = d y, F (x) . Now suppose that F : X −→ Pk (Y ) is locally compact and the distance function x −→ ϑy (x) is lower semicontinuous for all y ∈ Y . We show that F is usc. By virtue of Proposition 6.1.10, it suffices to show that Gr F is closed in X × Y . To this end let {(xα , yα
)}α∈J ⊆ Gr F be a net and assume that (xα , yα ) −→ (x, y) in X × Y . We have d y, F (x) ≤ d(y, yα ) −→ 0. Because ϑy (·) is lower semicontinuous
we also have d y, F (x) ≤ lim inf d y, F (xα ) . Therefore it follows that d y, F (x) = 0 (i.e., (x, y) ∈ Gr F ).
α∈J
(c) Fix y ∗ ∈ Y ∗ and ε > 0 and define W (y ∗ , ε) = {y ∈ Y : y ∗ , y < ε}. This is a weak neighborhood of the origin. By hypothesis F is usc from X into Yw (= space Y endowed with the weak topology). So we can find U ∈ N (x) such that for all u ∈ U, F (u) ⊆ F (x) + W (y ∗ , ε)
for all y ∈ U, ⇒ σ y ∗ , F (u) ≤ σ y ∗ , F (x) + ε
∗ ⇒ x −→ σ y , F (x) is upper semicontinuous. EXAMPLE 6.1.16 (a) In general lower semicontinuity of the distance function does not imply upper semicontinuity of the multifunction. To see this let X = R, Y = R 2 , and F : X −→ Pf (Y ) is defined by F (x) = (t, xt) : t ∈ R . Clearly ϑy (x)=d y, F (x) is continuous in x ∈ R, but F is not usc. (b) The converse of Proposition 6.1.15(c) is not in general true. To see this let X =R+ , Y =R and consider F :X −→Pkc (Y ) defined by F (x)=
{−1, 1} [0, x]
if x = 0 . if x = 0
For every y ∗ ∈Y ∗ =R we have that x −→ σ y ∗ , F (x) is upper semicontinuous, but x −→ F (x) is not usc at x = 0. However, if we strengthen the hypotheses on the multifunction F , we can have a converse to Proposition 6.1.15(c). For a proof of this result we refer to Hu– Papageorgiou [313, pp. 47–48]. PROPOSITION 6.1.17 If Y is a normed space and F : X −→ Pwkc (Y ) is a multifunction such that for all y ∗ ∈ Y ∗ the function x −→ σ y ∗ , F (x) from X into R = R ∪ {+∞} is upper semicontinuous, then the multifunction F is usc from X into Yw .
462
6 Multivalued Analysis
The next result is useful in many applications. Consider a function u : X ×Y −→ R ∪ {±∞} and a multifunction F : Y −→ 2X \ {∅}. We consider the following parametric maximization problem, v(y) = sup[u(x, y) : x ∈ F (y)]. We call v(·) the value function depending on the parameter y ∈ Y . In addition to the value function, we also consider the solution multifunction y −→ S(y) defined by S(y) = {x ∈ F (y) : u(x, y) = v(y)}. The next result, useful in a variety of applications, is often called the Berge maximum theorem. THEOREM 6.1.18 Let X, Y, F, v(·), and S(·) be as above and assume that for every y ∈ Y we can find x ∈ F (y) such that u(x, y) ∈ R. (a) If u is lower semicontinuous and F is lower semicontinuous, then the value function v : Y −→ R = R ∪ {+∞} is lower semicontinuous. (b) If u is upper semicontinuous and F is upper semicontinuous with values in Pk (X), then the value function v : Y −→ R = R∪{+∞} is upper semicontinuous. (c) If u : X × Y −→ R is continuous and F is continuous with values in Pk (X), then the value function v : Y −→ R is continuous and the solution multifunction S : Y −→ Pk (X) is upper semicontinuous. PROOF: (a) For every λ ∈ R, we need to show the level set Lλ = {y ∈ Y : v(y) ≤ λ} is closed. So let {yα }α∈J ⊆ Lλ be a net such that yα −→ y. By virtue of Proposition 6.1.4(d), for every α ∈ J we can find xα ∈ F (yα ) such that xα −→ x in X. We have u(xα , yα ) ≤ v(yα ) ≤ λ for all α ∈ J and because u is lower semicontinuous, we obtain u(x, y) ≤ λ. Because x ∈ F (y) was arbitrary, it follows that v(y) ≤ λ, hence y ∈ Lλ . This proves the lower semicontinuity of v(·). (b) For every λ ∈ R, we need to show that the upper level set Uλ = {y ∈ Y : v(y) ≥ λ} is closed. So let {yα }α∈J be a net such that yα −→ y. Since F has values in Pk (X), for every α ∈ J, we can find xα ∈ F (yα ) such that v(yα ) = u(xα , yα ) (Weierstrass theorem). As in the proof of Proposition 6.1.13, we can show that {xα }α∈J has a cluster point in F (y). Therefore we can find x ∈ F (y) and a subnet {xβ }β∈I of {xα }α∈J such that xβ −→ x. We have λ ≤ v(yβ ) = u(xβ , yβ )
for all β ∈ I and lim sup u(xβ , yβ ) ≤ u(x, y), β∈I
⇒ λ ≤ u(x, y) ≤ v(y)
(see x ∈ F (y)).
This proves that Uλ is closed and so v is upper semicontinuous. (c) The continuity of v follows from parts (a) and (b). So it remains to show that S : Y −→ Pk (X) is usc. Because of Proposition 6.1.3, given A ⊆ X closed, we need to show that S − (A)={y ∈ Y :S(y) ∩ A = ∅} is closed. So let {yα }α∈J ⊆ S − (A) be a net such that yα −→ y in Y . For each α ∈ J, we can find xα ∈ F (yα ) ∩ A such that u(xα , yα ) = v(yα ). As before {xα }α∈J has a cluster point x ∈ F (y). Thus we can find a subnet {xβ }β∈I of {xα }α∈J such that xβ −→ x. Evidently x ∈ F (y) ∩ A
6.1 Continuity of Multifunctions
463
and u(x, y) = v(y) (because u and v are continuous). Therefore y ∈ S − (A) and we have that S − (A) is closed in Y , hence S is usc. Given a multifunction F : X −→ 2Y , note that for V ⊆ Y open, F (x) ∩ V = ∅ if and only if F (x) ∩ V = ∅. So we can state the following result. PROPOSITION 6.1.19 A multifunction F : X −→ 2Y \ {∅} is lsc if and only if x −→ F (x) = F (x) is lsc. The above result is not true for usc multifunctions. EXAMPLE 6.1.20 Proposition 6.1.19 fails for usc multifunctions. To see this let X = Y = R and let F : R −→ 2R \ {∅} be defined by F (x)=(x − 1, x + 1). Note that F + (−1, 1) ={0} and so F is not usc. On the other hand F (x)=[x − 1, x + 1] and by virtue of Remark 6.1.5 F is continuous. For usc multifunctions we have the following result. PROPOSITION 6.1.21 If Y is normal and F : X −→ 2Y \ {∅} is usc, then so is F (·). PROOF: Let {xα }α∈J be a net such that xα −→ x and let V ⊆ Y be an open set such that F (x) ⊆ V . Because of the normality of Y , we can find V0 ⊆ Y another open set such that F (x) ⊆ F (x) ⊆ V0 ⊆ V 0 ⊆ V. (6.1) Because F is usc, we can find α0 ∈ J such that for α ∈ J, α ≥ α0 we have F (xα ) ⊆ V0 ⇒ F (xα ) ⊆ V ⇒ F
(see Proposition 6.1.3), for all α ≥ α0
(see (6.1)),
is usc (see Proposition 6.1.3).
PROPOSITION 6.1.22 If Y is normal and F1 , F2 : X −→ Pf (Y ) are both usc multifunctions such that for every x ∈ X, (F1 ∩ F2 )(x) = F1 (x) ∩ F2 (x) = ∅, then x −→ (F1 ∩ F2 )(x) is usc too. PROOF: We need to show that if V ⊆ Y is open, then (F1 ∩ F2 )+ (V ) is open. By definition (F1 ∩ F2 )+ (V ) = {x ∈ X : F1 (x) ∩ F2 (x) ∩ V c = ∅}. (6.2) Let x ∈ (F1 ∩ F2 )+ (V ). Then because of (6.2) the sets F1 (x) and F2 (x) ∩ V c are disjoint closed sets. Due to the normality of Y we can find V1 , V2 ⊆ Y disjoint open sets such that F1 (x) ⊆ V1 and F2 (x) ∩ V c ⊆ V2 . Let V3 = V2 ∪V . Then F2 (x) ⊆ V3 . Because F1 , F2 are usc, we can find U ∈ N (x) such that for all x ∈ U , we have F1 (x ) ⊆ V1 and F2 (x ) ⊆ V3 . Therefore F1 (x ) ∩ F2 (x ) ⊆ V1 ∩ V3 = V1 ∩ (V2 ∪ V ) ⊆ V ⇒ (F1 ∩ F2 )+ (V ) is open (i.e., F1 ∩ F2 is usc).
for all x ∈ U,
464
6 Multivalued Analysis
Another result in this direction is given below. PROPOSITION 6.1.23 If F1 :X −→Pf (Y ) is closed (see Definition 6.1.11), F2 : X −→ Pk (Y ) is usc and for all x ∈ X, (F1 ∩ F2 )(x) = F1 (x) ∩ F2 (x) = ∅, then x −→ (F1 ∩ F2 )(x) is usc. PROOF: We need to show that if V ⊆ Y is open, then (F1 ∩ F2 )+ (V ) is open. Let x ∈ (F1 ∩ F2 )+ (V ). From (6.2) we have that the sets F1 (x) and F2 (x) ∩ V c are disjoint. Note that F2 (x) ∩ V c is compact in Y . Let y ∈ F2 (x) ∩ V c . Then (x, y) ∈ / Gr F1 and because Gr F1 is closed, we can find Uy ∈ N (x) and Wy ∈ N (y) such that (Uy × Wy ) ∩ Gr F1 = ∅. Hence for all x ∈ Uy we have F1 (x ) ∩ Wy = ∅. As we already observed, the set F2 (x) ∩ V c is compact. So we N N c Wyk =V1 ∈N (y). Set U1 = Uyk ∈ N (x). If can find {yk }N k=1 ⊆ F2 (x) ∩ V ⊆ k=1
k=1
x ∈ U1 , then F1 (x ) ⊆ V1c . Let V2 = V ∪ V1 and U2 ∈ N (x) be such that if x ∈ U2 , then F2 (x ) ⊆ V2 (recall that F2 is usc). Then if x ∈ U1 ∩ U2 ∈ N (x), we have (F1 ∩ F2 )(x ) = F1 (x ) ∩ F2 (x ) ⊆ V1c ∩ V2 ⊆ V , which proves that (F1 ∩ F2 )+ (V ) is open, hence x −→ (F1 ∩ F2 ) is usc. What about the intersection of two lsc multifunctions? In this case the situation is more delicate. First note that, if F : X −→ 2Y \{∅} is lsc and V ⊆Y a nonempty open set such that F (x) ∩ V = ∅ for all x ∈ X, then x −→ F (x) ∩ V is lsc. Moreover, if C ⊆ X is closed and F (x) ∩ V for x ∈ C , F (x) = F (x) for x ∈ C c then it is straightforward to check that F (·) is lsc. More generally, we have the following result. PROPOSITION 6.1.24 If F1 :X −→2Y \{∅} is lsc, F2 :X −→2Y \{∅} has an open graph, and for every x ∈ X, (F1 ∩F2 )(x) = F1 (x)∩F2 (x) = ∅, then x −→ (F1 ∩F2 )(x) is lsc. PROOF: Let V ⊆Y be nonempty open, x∈(F1 ∩F2 )− (V ), and y∈F1 (x)∩F2 (x)∩V . Then (x, y) ∈ Gr F2 ∩ (X × V ). Since Gr F2 is open we can find U1 (x) ∈ N (x) and V1 (y) ∈ N (y) such that U1 (x)×V1 (y) ⊆ Gr F2 ∩(X ×V ). Note that F1 (x)∩V1 (y) = ∅ and because F1 is lsc, we can find U2 (x) ∈ N (x) such that F1 (x ) ∩ V1 (y) = ∅ for all x ∈ U2 (x). Set U (x) = U1 (x) ∩ U2 (x) ∈ N (x). Then for all x ∈ U (x) we have F1 (x ) ∩ V1 (y) = ∅ and U (x) × V1 (y) ⊆ Gr F2 ∩ (X × V ). Therefore for all x ∈ U (x), we have F1 (x ) ∩ F2 (x ) = ∅ which means that (F1 ∩ F2 )− (V ) is open and so x −→ (F1 ∩ F2 )(x) is lsc. REMARK 6.1.25 Note that if Gr F is open, then F has open values. However, Proposition 6.1.24 fails if F2 instead of an open graph has only open values. If Y =RN , then the situation improves and we have the following two propositions whose proof can be found in Hu–Papageorgiou [313, pp. 55–56].
6.1 Continuity of Multifunctions
465
PROPOSITION 6.1.26 If F1 , F2 : X −→ RN \{∅} are lsc multifunctions, F2 has open convex values, and for all x ∈ X, (F1 ∩ F2 )(x) = F1 (x) ∩ F2 (x) = ∅, then x −→ (F1 ∩ F2 )(x) is lsc. PROPOSITION 6.1.27 If F1 , F2 :X −→RN\{∅} are lsc with convex values and for every x ∈ X, (F1 ∩ int F2 )(x) = ∅, or (int F1 ∩ F2 )(x) = ∅, then x −→ (F1 ∩ F2 )(x) is lsc. REMARK 6.1.28 It is easy to check that if Y is a Banach space and F : X −→ 2Y \ {∅} is lsc, then so are the multifunctions x −→ conv F (x), x −→ conv F (x). Similarly if F : X −→ Pk (Y ) is usc, then so is the multifunction x −→ conv F (x). Finally for X, Y Hausdorff topological spaces, then both upper and lower semicontinuity are preserved by the set-theoretic operation of union. When X is a metric space, then we can exploit its metric structure, to introduce some additional continuity notions for multifunctions. For this purpose, we introduce the following notions. DEFINITION 6.1.29 Let (X, d) be a metric space and A, C ⊆ X. We set h∗ (A, C) = sup d(a, C) : a ∈ A = inf ε > 0 : A ⊆ Cε , where Cε = {x ∈ X : d(x, C) < ε} (the open ε-enlargement of C). We call h∗ (A, C) the excess of A over C. Then the Hausdorff distance between A and C is defined by h(A, C) = max h∗ (A, C), h∗ (C, A) = inf ε > 0 : A ⊆ Cε and C ⊆ Aε . It is easy to see that h(A, C) = 0 if and only if A = C and so
REMARK 6.1.30 Pf (X) ∪ {∅}, h is a (generalized) metric space. The empty set is an isolated point in this metric space and the metric h(·, ·) is called the Hausdorff metric. The next proposition is a straightforward consequence of Definition 6.1.29. PROPOSITION 6.1.31 Let (X, d) be a metric space and A, C ⊆ X nonempty. (a) h∗ (A, C) = sup d(x, C) − d(x, A) : x ∈ X (b) h(A, C) = sup |d(x, C) − d(x, A)| : x ∈ X ∗ (c) If X is a normed space and A, C ∈ Pbf c (X), then h(A, C) = sup |σ(x ; A) − ∗ ∗ ∗ ∗ σ(x ; C)| : x ∈ X , x ≤ 1 (H¨ ormander’s formula) The next proposition summarizes the basic facts about this metric. Detailed proofs can be found in Hu–Papageorgiou [313, p. 8]. PROPOSITION 6.1.32 (a) If (X, d) is a complete metric space, then so is (Pf (X), h).
466
6 Multivalued Analysis
(b) Pbf (X) is a closed subset of (Pf (X), h) and if X is separable, then so is (Pk (X), h). (c) If X is a normed space, then
Pkc (X) ⊆ Pbf c (X) ⊆ Pf c (X) and Pk (X) ⊆ Pbf (X) are all closed subspaces of Pf (X), h . Now we are ready to introduce the new continuity concepts for multifunctions. DEFINITION 6.1.33 Let X be a Hausdorff topological space, (Y, d) a metric space, and F : X −→ 2Y \{∅} a multifunction. (a) We say that F is h-upper semicontinuous at x0 ∈ X (h-usc for short), if the function x −→ h∗ F (x), F (x0 ) is continuous at x0 ∈ X. If F (·) is h-usc at every x0 ∈ X, then we say that F is h-upper semicontinuous (h-usc for short). (b) We say that F is h-lower semicontinuous at x0 ∈ X (h-lsc for short), if the function x −→ h∗ F (x0 ), F (x) is continuous at x0 ∈ X. If F (·) is h-lsc at every x0 ∈ X, then we say that F is h-lower semicontinuous (h-lsc for short). (c) We say that F is h-continuous at x0 ∈ X, if it is both h-usc and h- lsc at x0 ∈ X. If F (·) is h-continuous at every x0 ∈ X, then we say that F is h-continuous. REMARK 6.1.34 Note that h-continuity of F , is continuity from X into the pseudometric space 2Y \{∅}, h . It is natural to ask how these new continuity notions compare to those of Definition 6.1.2. In what follows X is a Hausdorff topological space and (Y, d) a metric space. PROPOSITION 6.1.35 If F : X −→ 2Y \{∅} is usc, then F is h-usc.
PROOF: Because F is usc, for every ε > 0 and every x ∈ X we have F + F (x)ε = Ux ∈ N (x) (recall that F (x)ε ={y ∈Y
: d y, F (x) < ε}). For every x ∈Ux we have F (x ) ⊆ F (x)ε , which means that h F (x ), F (x) < ε and so we conclude that F is h-usc. The converse of the above proposition is not in general true as the following example illustrates. EXAMPLE 6.1.36 h-usc usc: Let X = [0, 1], Y = R and consider the multifunction F : X −→ 2Y \ {∅} defined by F (x) =
[0, 1] [0, 1)
if 0 ≤ x < 1 . if x = 1
Clearly F is h-usc, but it is not usc because F + (−1, 1) ={1}. PROPOSITION 6.1.37 If F : X −→ 2Y \{∅} is h-lsc, then F is lsc.
6.1 Continuity of Multifunctions
467
PROOF: We need to show that for every C ⊆ Y closed, the set F + (C) is closed. To this end, let {xα }α∈J ⊆ F + (C) be a net and assume that xα −→ x. We have F (xα ) ⊆ C for all α ∈ J. Because F is h-lsc, given ε > 0, we can find α0 ∈ J such that for all α ∈ J, α ≥ α0 we have F (x) ⊆ F (xα ) ⊆ Cε . Let ε ↓ 0 to conclude that F (x) ⊆ C (i.e., x ∈ F + (C)). This proves that F + (C) is closed and so F is lsc. Again the converse of this proposition, in general fails. 2 Y EXAMPLE 6.1.38 lsc h-lsc: Let X = [0, 1], Y = R+ , and let F : X −→ 2 \{∅} be defined by F (x) = [t, xt] : t > 0 . Then F is lsc but not h-lsc.
However, for Pk (Y )-valued multifunction the situation is better. PROPOSITION 6.1.39 If F : X −→ Pk (Y ), then (a) F is usc if and only if F is h-usc. (b) F is lsc if and only if F is h-lsc. PROOF: (a)⇒ : This implication is Proposition 6.1.35. ⇐ : We need to show that for every C ⊆ Y closed, F − (C) is closed. To this end let {xα }α∈J ⊆ F − (C) be a net such that xα −→ x. Let yα ∈ F (xα ) ∩ C for all α ∈ J. We have
d yα , F (x) ≤ h∗ F (xα ), F (x) −→ 0
(because F is h-usc).
(6.3)
Because F (x) ∈ Pk (Y ), for every α ∈ J we can find uα ∈ F (x) such that d yα , F (x) = d(yα , uα ). Due to the compactness of F (x), we can find {uβ }β∈I a subnet of {uα }α∈J such that uβ −→ y ∈ F (x). Then because of (6.3), we see that yβ −→ y ∈ F (x) ∩ C (because C is closed). Therefore x ∈ F − (C) and we conclude that F − (C) is closed, hence F is usc. F (b)⇒ : Suppose {xα }α∈J ⊆X is a net such that xα −→ x. Because
(x) ∈ Pk (X), for every α ∈ J we can find uα ∈ F (x) such that d uα , F (xα ) = h∗ F (x), F (xα ) . Also we can find a subnet {uβ }β∈I of {uα }α∈J such that uβ −→ y. Since F is lsc, given ε > 0, we can find β0 ∈ I such that for all β ∈ I, β ≥ β0 , we have F (xβ ) ∩ Bε/2 (y) = ∅ and uβ ∈ Bε/2 (y). So for β ≥ β0 , we have
h∗ F (x), F (xβ ) ≤ d(uβ , y) + d y, F (xβ ) < ε,
⇒ lim h∗ F (x), F (xβ ) = 0. β∈I
(6.4)
Because every α }α∈J has a further subnet so that (6.4) holds, we
subnet of {x conclude that h∗ F (x), F (xβ ) −→ 0, hence F is h-lsc. ⇐ : This implication is Proposition 6.1.37.
COROLLARY 6.1.40 If F :X −→Pk (Y ), then F is continuous if and only if it is h-continuous.
468
6 Multivalued Analysis
REMARK 6.1.41 If a multifunction F : X −→ 2Y \{∅} is h-usc (resp., h-lsc, hcontinuous), then so are the multifunctions x −→ F (x), x −→ conv F (x), x −→ conv F (x). Moreover, if G : X −→ 2Y \ {∅} is another multifunction which too is h-usc (resp., h-lsc, h-continuous), then so is x −→ (F ∪ G)(x). The situation is more complicated with the operation of intersection. To deal with the h-continuity properties of the intersection of two multifunctions, we need the following lemma, known as the cancellation law lemma. LEMMA 6.1.42 If V is a locally convex space, D ⊆ V is nonempty bounded, and C ⊆ V is convex, then A + D ⊆ C + D implies A ⊆ C. PROOF: First we assume that C ∈ Pf c (V ) and A + D ⊆ C + D. Let a ∈ A and d1 ∈ D. By induction we can choose ck ∈ C and dk+1 ∈ D such that a + dk = ck + dk+1 for all k ≥ 1. Then ck = a + dk − dk+1 . Summing up to n and then dividing by n, we obtain a+
n d1 dn+1 1 ck . − = n n n k=1
Because by hypothesis D is bounded, we have n
d1 dn+1 1 a = lim a + ck ∈ C, − = lim n→∞ n→∞ n n n k=1
because we have assumed that C ∈ Pf c (V ). Now we consider the general case. Let U ∈ N (0) be convex and choose U1 ∈ N (0) convex such that U1 + U1 ⊆ U . We have A + D ⊆ C + D ⊆ C + D + U1 ⊆ C + U1 + D. Note that C + U1 ∈ Pf c (V ). So we are in the first part of the proof, from which it follows that A ⊆ C + U1 ⊆ C + U1 + U1 ⊆ C + U . Because U ∈ N (0) was arbitrary, we conclude that A ⊆ C. LEMMA 6.1.43 If V is a normed space and A ⊆ V is convex bounded with int A = ∅, then for every ε > 0 there exists C ⊆ int A and δ > 0 such that Cδ ⊆ A ⊆ Cε . PROOF: By translating things if necessary, we may assume that 0 ∈ int A. Because A is bounded, given ε > 0 we can find λ ∈ (0, 1) such that λ int A ⊆ Bε/2 . Also because 0 ∈ int A, we can find δ > 0 such that Bδ ⊆ λ int A. Set C = (1 − λ)int A. Then C + Bδ = Cδ ⊆ A. On the other hand because A is convex, A = int A and so A ⊆ C + Bε = Cε . REMARK 6.1.44 This lemma fails if A is not bounded. PROPOSITION 6.1.45 If Y is a normed space and F1 , F2 : X −→ Pbf c (Y ) are h-lsc and for all x ∈ X we have int (F1 ∩ F2 )(x) = ∅, then x −→ F (x) = (F1 ∩ F2 )(x) = F1 (x) ∩ F2 (x) is h-lsc.
6.1 Continuity of Multifunctions
469
PROOF: Fix x ∈ X and let ε > 0 be given. Because of Lemma 6.1.43, we can find C ⊆ int(F1 ∩ F2 )(x) and δ > 0 such that Cδ ⊆ F (x) ⊆ Cε . Since F1 , F2 are h-lsc, we can find U ∈ N (x) such that Fk (x) ⊆ Fk (x )δ
for all x ∈ U and for k = 1, 2, for all x ∈ U.
⇒ Cδ ⊆ Fk (x )δ
(6.5)
From (6.5) and Lemma 6.1.42, it follows that C ⊆ (F1 ∩ F2 )(x ) = F (x ) ⇒ F (x) ⊆ Cε ⊆ F (x )ε
for all x ∈ U,
for all x ∈ U,
⇒ F is h-lsc. There is an analogous result for h-usc multifunctions. PROPOSITION 6.1.46 If F1 :X−→Pf (Y ) and F1 :X−→Pk (Y ) are h-usc and for all x ∈ X, F1 (x) ∩ F2 (x) = ∅, then x −→ (F1 ∩ F2 )(x) = F1 (x) ∩ F2 (x) is h-usc. PROOF: It is easy to see that a Pf (Y )-valued, h-usc multifunction is closed. Then the proposition follows by combining Propositions 6.1.23 and 6.1.39(a). For h-continuity, if Y is an infinite-dimensional space, we cannot simply combine Propositions 6.1.45 and 6.1.46, because a normed compact set has empty interior. Nevertheless the result is still true for h-continuity (see Hu–Papageorgiou [313, p.66]). PROPOSITION 6.1.47 If Y is a normed space, F1 , F2 : X −→ Pbf c (Y ) are hcontinuous and for all x ∈ X int(F1 ∩ F2 )(x) = ∅, then x −→ F (x) = (F1 ∩ F2 )(x) is h-continuous. We conclude this section with two weak versions of lower semicontinuity and of h-lower semicontinuity. DEFINITION 6.1.48 Let X, Y be Hausdorff topological spaces and F : X −→ 2Y \{∅} a multifunction. (a) We say that F is almost lower semicontinuous (a-lscfor short), if for every x ∈ X and every ε > 0, we can find U ∈ N (x) such that F (x )ε = ∅. x ∈U
(b) We say that F is weakly h-lower semicontinuous (hw -lsc for short), if for every x ∈ X and every U ∈ N (x), we can find V ∈ N (x), V ⊆ U and a point x ∈ V such that F (x ) ⊆ F (u)ε for all u ∈ V . REMARK 6.1.49 If in Definition 6.1.48(b), x = x , then we recover the definition of h-lower semicontinuity. Clearly h-lower semicontinuity implies hw -lower semicontinuity and hw -lower semicontinuity implies almost lower semicontinuity.
470
6 Multivalued Analysis
6.2 Measurability of Multifunctions In the previous section we focused our attention to the topological properties of multifunctions. In this section we shift our investigation to the measure theoretic properties. Throughout this section (Ω, Σ) is a measurable space and (X, d) a metric space. Additional hypotheses are introduced as needed. DEFINITION 6.2.1 Let F : Ω −→ 2X be a multifunction. (a) We say that F is measurable, if for all U ⊆ X open, we have F − (U ) = {ω ∈ Ω : F (ω) ∩ U = ∅} ∈ Σ. (b) We say that F is graph measurable, if Gr F = {(ω, x) ∈ Ω × X : x ∈ F (ω)} ∈ Σ × B(X) with B(X) being the Borel σ-field of X. (c) If X is a separable Banach space,
we saythat F is scalarly measurable, if for all x∗ ∈ X ∗ , the function ω −→ σ x∗ , F (ω) is Σ-measurable. REMARK 6.2.2 We define the domain of a measurable multifunction F : Ω −→ 2X to be the set dom F = {ω ∈ Ω : F (ω) = ∅}. It is clear from Definition 6.2.1(a) that dom F ∈ Σ. So in the sequel without any loss of generality, for measurable multifunctions, we always assume that dom F = Ω. PROPOSITION 6.2.3 If F : Ω −→ 2X \{∅} and for every C ⊆ X closed we have F − (C) ∈ Σ, then F is measurable. PROOF: Recall that in a metric space, an open set U is Fσ . So U = Cn with n≥1 − Cn ⊆ X closed for all n ≥ 1. Hence F − (U ) = F − ( Cn ) = F (Cn ) ∈ Σ (see n≥1
Proposition 6.1.1(c)). Therefore F is measurable.
n≥1
PROPOSITION 6.2.4 A multifunction F : Ω −→ 2X \ {∅} is measurable if and only if for all x ∈ X, ω −→ d x, F (ω) is Σ-measurable.
PROOF: ⇒: For every λ > 0, we need to show that Lλ (x) = {ω ∈Ω : d x, F (ω) <
λ}∈Σ. Note that Lλ (x)=F − Bλ (x) ∈Σ.
− ⇐: For every x ∈ X and every λ > 0, we have F Bλ (x) ∈ Σ. If U ⊆ X is open, then U = Bλn (xn ) (recall that X is separable, hence second countable). Then n≥1
F − (U )=F − Bλn (xn ) = F − Bλn (xn ) ∈ Σ. n≥1
n≥1
In earlier sections we already used the notion of the Carath´eodory function. Because it is central in our subsequent considerations, for easy reference we recall here its definition (in a general setting) and some fundamental properties of such functions. DEFINITION 6.2.5 Let (Ω, Σ) be a measurable space and V, Y Hausdorff topological spaces. A function ϕ : Ω × V −→ Y is said to be a Carath´ eodory function, if
6.2 Measurability of Multifunctions
471
(a) For every v ∈ V, ω −→ ϕ(ω, v) is Σ, B(Y ) -measurable and (B(Y ) is the Borel σ-field of Y ). (b) For every ω ∈ Ω, v −→ ϕ(ω, v) is continuous. The two theorems that follow reveal two basic and useful properties of Carath´eodory functions. For their proof we refer to Denkowski–Mig´ orski– Papageorgiou [194, pp. 188–189]. THEOREM 6.2.6 If (Ω, Σ) is a measurable space, V is a separable metric space, Y a metric space, and ϕ : Ω × V −→ Y a Carath´ eodory function, then ϕ is Σ × B(V )measurable. REMARK 6.2.7 This theorem implies that ϕ is superpositionally measurable (sup-measurable for short), namely if u : Ω −→ V is Σ-measurable, then ω −→
ϕ ω, u(ω) is Σ-measurable (i.e., the Nemitsky operator corresponding to ϕ maps measurable maps to measurable ones). The second theorem is known in the literature as the Scorza–Dragoni theorem and is a parametric version of Lusin’s theorem. First a definition. DEFINITION 6.2.8 A Hausdorff topological space (V, τ ) (τ being the Hausdorff topology of V ), is said to be a Polish space, if it is separable and there exists a metric on V for which the topology τ is complete. THEOREM 6.2.9 If T, V are Polish spaces, Y is a separable metric space, µ a tight Borel measure on T , and ϕ : T × V −→ Y a Carath´ eodory function, then for every ε > 0, we can find Tε ⊆ T a compact subset with µ(T \ Tε ) < ε such that ϕT ×V is continuous. ε
Returning to the measurability of multifunctions, we have the following result. PROPOSITION 6.2.10 If F : Ω −→ Pf (X) is measurable, then F is graphmeasurable.
PROOF: Note that Gr F = (ω, x) ∈ Ω × X
: d x, F (ω) = 0 . From Proposition 6.2.4, we have that (ω, x) −→ ϕ(ω, x) = d x, F (ω) is a Carath´eodory function. Then by virtue of Theorem 6.2.9 ϕ(ω, x) is Σ × B(X)-measurable and so Gr F ∈ Σ × B(X) (i.e., F is graph-measurable). The converse of the previous proposition is not in general true. EXAMPLE 6.2.11 Graph-measurable measurable: Let Ω = [0, 1] with Σ = B([0, 1]) (the Borel σ-field of [0, 1]) and X = R\ Q = the set of irrationals. Recall that X is a Polish space. Consider C a closed subset of Ω × X such that projΩ C ∈ / Σ and projX C = X. Choose x0 ∈ X \ projX C and let F : Ω −→ Pf (X) be a multifunction defined by F (ω) = C(ω) ∪ {x0 }. Then clearly F has closed graph, hence it is graph-measurable. On the other hand, if U is a neighborhood of projX C not containing x0 , then F − (U ) = projΩ C ∈ / Σ and so F is not measurable.
472
6 Multivalued Analysis
For every open set U ⊆ X, we have A ∩ U = ∅ if and only if A ∩ U = ∅. This leads at once to the following proposition. PROPOSITION 6.2.12 F : Ω −→ 2X \{∅} is measurable if and only if F : Ω −→ Pf (X) is measurable. In the next result we produce conditions under which the converse of Proposition 6.2.3 holds. PROPOSITION 6.2.13 If F : Ω −→ Pk (X), then F is measurable if and only if F − (C) ∈ Σ for all C ⊆ X closed. PROOF: ⇒: Let C ⊆ X be closed and set U =X \C. Then we know that U = Cn n≥1 with Cn = x ∈ X : d(x, C) ≥ 1/n (i.e., U is an Fσ -set). Then F − (C) = Ω\F + (U ) =
Ω\F + Cn . From Proposition 6.1.1(c) we know that n≥1
F+
!
! + F (Cn ). Cn ⊇
n≥1
(6.6)
n≥1
We show that opposite inclusion is also true. Indeed, if this is not true, we can the find ω ∈ F + Cn such that F (ω)∩Cnc = ∅ for all n ≥ 1. Let xn ∈ F (ω)∩Cnc , n ≥ 1. n≥1
Because F (ω) ∈ Pk (X), by passing to a suitable subsequence if necessary, we may assume that xn −→ x and x ∈ F (ω) ⊆ U . On the other hand xn ∈ Cnc for all n ≥ 1 and so d(xn , C) < 1/n. Therefore in the limit as n → ∞, we obtain d(x, C) = 0; that is, x ∈ C = X \ U , a contradiction. So the opposite inclusion in (6.6) holds and we have
! ! + F+ F (Cn ) Cn = n≥1
n≥1
=
!
Ω \ F − (Cnc ) ∈ Σ
n≥1 −
⇒ F (C) ∈ Σ. ⇐: This implication is Proposition 6.2.3.
PROPOSITION 6.2.14 If F : Ω −→ Pf (X) is measurable, then F − (K) ∈ Σ for all K ⊆ X compact. PROOF: Because X is a separable metric space, we can think of X as a dense subspace of a compact metric space Y (in fact it is homeomorphic to a subset of the Hilbert cube [0, 1]N which is compact by Tychonov’s theorem). Let G : Ω −→ Pk (X) be defined by G(ω) = F (ω). From Proposition 6.2.12 we have that G is measurable. Now let K ⊆ X be compact. We have F − (K) = {ω ∈ Ω : F (ω) ∩ K = ∅} = {ω ∈ Ω : G(ω) ∩ X ∩ K = ∅} = G− (K).
(6.7)
6.2 Measurability of Multifunctions
473
But G is Pk (Y )-valued. So Proposition 6.2.13 implies that G− (K) ∈ Σ, hence F − (K) ∈ Σ (see (6.7)). If we enrich the structure of the space X, then we can strengthen our conclusions. PROPOSITION 6.2.15 If X is a σ-compact metric space and F : Ω −→ Pf (X), then the following statements are equivalent. (a) F − (C) ∈ Σ for every C ⊆ X closed. (b) F is measurable. (c) F − (K) ∈ Σ for every K ⊆ X compact. PROOF: (a)⇒ (b): This implication is Proposition 6.2.3. (b)⇒ (c): This implication is Proposition 6.2.14. (c)⇒ (a): Because X is σ-compact, we have X = Kn with Kn ⊆X is compact. Let n≥1
C ⊆X be a closed set. Then we have F − (C)=F − Kn ∩ C = F − (Kn ∩ C) ∈ Σ (because Kn ∩ C is compact for every n ≥ 1).
n≥1
n≥1
Next we introduce a class of spaces, broader than the class of Polish spaces, because they are not necessarily metrizable. This makes them useful in many concrete situations, for example, when we deal with certain Banach spaces furnished with their weak topology. DEFINITION 6.2.16 A Hausdorff topological space X is said to be a Souslin space, if there exists a Polish space Y and a continuous surjection u : Y −→ X. REMARK 6.2.17 Souslin subspaces of a Polish space, are known in the literature as analytic sets. A Souslin space is always separable but need not be metrizable. For example, consider an infinite-dimensional separable Banach space endowed with the weak topology. Evidently this is a nonmetrizable Souslin space. It can be shown that closed and open subsets of a Souslin space are Souslin, countable products of Souslin spaces are Souslin, and countable intersections and unions of Souslin subspaces of a Hausdorff topological space are Souslin. From these facts, it follows ∗ ∗ that if X is a separable Banach space and Xw of X endowed ∗ denotes the dual X ∗ ∗ ∗ with the w∗ –topology, then Xw nB 1 , where ∗ is Souslin (indeed note that X = n≥1 ∗ ∗ B 1 = x∗ ∈ X ∗ : x∗ ≤ 1 and recall that (B 1 , w∗ ) is compact metrizable, hence Polish, in particular then Souslin). It is well known that a Borel set in R2 does not in general project to a Borel set on R. The next theorem determines more precisely the structure of this projection. The result is known as the Yankov–von Neumann–Aumann projection theorem and its proof can be found in Hu–Papageorgiou [313, p. 149]. Recall that if (Ω, Σ) is a measurable space, the universal σ-field corresponding to Σ is defined by Σ = Σµ , µ
where µ ranges over all finite measures on Σ and Σµ denotes the µ-completion of Σ. If (Ω, Σ, µ) is a σ-finite complete measure space, then Σ = Σ.
474
6 Multivalued Analysis
THEOREM 6.2.18 If (Ω, Σ) is a measurable space, X is a Souslin space, and A ∈ Σ × B(X), then proj A ∈ Σ. Using this theorem, we can have a partial converse to Proposition 6.2.10. PROPOSITION 6.2.19 If (Ω, Σ) is a complete measurable space (i.e., Σ = Σ), X is a Souslin space, and F : Ω −→ 2X \ {∅} is a multifunction such that Gr F ∈ Σ × B(X), then for every D ∈ B(X), we have F − (D) ∈ Σ. PROOF: By definition F − (D) = ω ∈ Ω : F (ω) ∩ D = ∅ = proj GrF ∩ (Ω × D) ∈ Σ = Σ (see Theorem 6.2.18). Now we can state a theorem summarizing the situation concerning the measurability of a multifunction with values in a separable metric space. THEOREM 6.2.20 Let (Ω, Σ) be a measurable space, X a separable metric space, and F : Ω −→ Pf (X). We consider the following properties. (a) (b) (c) (d) (e)
For every D ∈ B(X), F − (D) ∈ Σ. For every C ⊆ X closed, F − (C) ∈ Σ. F is measurable.
For every x ∈ X, ω −→ d x, F (ω) is Σ-measurable. F is graph-measurable.
Then the following implications are true. (1) (a)⇒(b)⇒(c)⇔ (d)⇒(e). (2) If X is σ-compact, then (b)⇔ (c). (3) If Σ = Σ and X is complete (i.e., X is a Polish space), then statements (a) to (e) are equivalent. It is easy to see that countable unions preserve measurability and graph measurability. This is no longer true with intersections. Additional conditions are needed to guarantee measurability of countable intersections. PROPOSITION 6.2.21 If Fn : Ω −→ Pf (X), n ≥ 1, are measurable multifunctions and for some n0 ≥ 1 Fn0 has values in Pk (X), then ω −→ F (ω) = Fn (ω) is n≥1
measurable. PROOF: " First we assume that all multifunctions Fn have values in Pk (X). We set P(ω)= Fn (ω). It is easy to see that P is measurable. Because P is compactn≥1
valued, from Proposition 6.2.13 we have that P − (C) ∈ Σ for every C ⊆ X closed. N Let C ⊆ X be closed diagonal set of the product space. Then and ∆ ⊆ X be the F − (C)= ω ∈ Ω: Fn (ω) ∩ C = ∅ = ω ∈ Ω : P(ω) ∩ ∆ ∩ C N = ∅ ∈ Σ (because n≥1
∆ ∩ C N is closed in X N ). Next we assume that only one of the Fn ’s has compact values. Let Y be a metrizable compactification of X (recall that X is homeomorphic to a subset of the Hilbert cube H = [0, 1]N ). Let Fn : Ω −→ Pk (Y ) be defined by Fn (ω) = F n (ω) (the closure in Y ), for all n ≥ 1. We know that Fn is measurable (see Proposition
6.3 Continuous and Measurable Selectors 6.2.14). So from the first part of the proof, we know that if F (ω) =
475
Fn (ω); then
n≥1
F − (C) ∈ Σ for all C ⊆ Y closed. Because Fn = Fn for some n ≥ 1, we have F = F and so from Proposition 6.2.3 we conclude that F is measurable. In order to study scalarly measurable multifunctions (see Definition 6.2.1(c)), we need to produce some results on the existence of measurable selectors for multifunctions. This is done in the next section, where we prove theorems on the existence of continuous and measurable selectors.
6.3 Continuous and Measurable Selectors Given two sets X, Y and a multifunction F : X −→ 2Y \{∅}, a selector of F is a single-valued map f : X −→ Y such that f (x) ∈ F (x) for all x ∈ X. When X, Y both have topological structure, it is natural to look for continuous selectors. If X = Ω has a measure-theoretic structure, then we seek to produce measurable selectors. In this section we study both cases. We start with continuous selectors. First a negative observation directs our efforts to the appropriate class of multifunctions. EXAMPLE 6.3.1 An usc multifunction need not have a continuous selector. We consider the usc multifunction F : R −→ Pf c (R) defined by ⎧ ⎪ ⎨ −1 F (x) = [0, 1] ⎪ ⎩ 1
if x < 0 if x = 0 . if x > 0
It is clear that F cannot have a continuous selector. Note that F (x) = ∂ϕ(x) where ϕ(x) = |x| (the subdifferential is in the sense of convex analysis, see Definition 1.2.28). This example suggests that usc multifunctions are not the right class to consider. The next proposition indicates where we should look in order to produce continuous selectors. In what follows, for the study of continuous selectors of a multifunction we assume that X, Y are Hausdorff topological spaces. Additional hypotheses are introduced as needed. PROPOSITION 6.3.2 If F : X −→ 2Y \{∅} and for every (x, y) ∈ Gr F we can find U ∈ N (x) and f a continuous selector of F U such that f (x) = y, then F is lsc. PROOF: Given V ⊆ Y open set, we need to show that F − (V ) is open. Let (x, y) ∈ Gr F ∩ (X × V ). By hypothesis we know that there exist U ∈ N (x) and a continuous function f : U −→ Y such that f (u) ∈ F (u) for all u ∈ U and f (x) = y. Set U ={x ∈ U : f (x ) ∈ V } ∈ N (x). Then U ⊆ F − (V ) and so F − (V ) is open. REMARK 6.3.3 Multifunctions that satisfy the hypotheses of the above proposition are called locally selectionable.
476
6 Multivalued Analysis
PROPOSITION 6.3.4 If X is paracompact, Y is a topological vector space, and F : X −→ 2Y \{∅} is a multifunction with convex values such that F − (y) = {x ∈ X : y ∈ F (x)} is open for all y ∈ Y , then F admits a continuous selector f (i.e., there exists a continuous map f : X −→ Y such that f (x) ∈ F (x) for all x ∈ X). PROOF: For every (x, y) ∈ Gr F , we can take U = F − (y) ∈ N (x) and f : U −→Y defined by f (x ) = y for all x ∈ U . So F is locally selectionable (see Remark 6.3.3). Therefore for every (x, y) ∈ Gr F we can find Ux ∈ N (x) and fx : Ux −→ Y continuous. Because of paracompactness let {Ui }i∈I be a locally finite refinement of the cover {Ux }x∈X . Let {gi }i∈I be a continuous partition of unity subordinate to the cover {Ui }i∈I . We define the map gi (x)fxi (x) for all x ∈ X. (6.8) f (x) = i∈I
Here xi ∈X is such that Ui ⊆Uxi (recall that {Ui }i∈I is a refinement of {Ux }x∈X ). Note that because {Ui }i∈I is locally finite, at every x ∈ X, the summation in (6.8) is finite. Therefore f is continuous and because F is convex-valued, for every x ∈ X we have f (x) ∈ F (x). REMARK 6.3.5 If Gr F is open in X×Y , then for every y ∈ Y F − (y) is open. Proposition 6.3.2 suggests that we should focus on lsc multifunctions. This leads to the celebrated Michael’s selection theorem. THEOREM 6.3.6 If X is paracompact, Y is a Banach space, and F : X −→ Pf c (Y ) is lsc, then F admits a continuous selector. PROOF: In what follows B1 = {y ∈ Y : y < 1}. First we produce a continuous function f : X −→ Y such that f (x) ∈ F (x) + εB1
for all x ∈ X, with ε > 0.
(6.9)
For every x ∈ X, we choose yx ∈ F (x). Then due to the lower semicontinuity of F , the set F − (yx + εB1 ) is open and so {F − (yx + εB1 )}x∈X is an open cover of X. Due to the paracompactness of X, we can find F − (yi + εB1 ) i∈I a locally finite refinement of F − (yi + εBi ) i∈I . We set f (x) =
gi (x)ui
for all x ∈ X,
i∈I
where ui ∈ (yi +εBi ) ⊆ (yxi +εB1 ). As before, at every x ∈ X the above sum is finite and so f is continuous and because of the convexity of the values of x −→ F (x)+εB1 , we have f (x) ∈ F (x) + εB1 for all x ∈ X. Next, inductively we generate a sequence {fn }n≥1 of continuous functions fn : X −→ Y such that fn (x) ∈ F (x) + and
1 B1 2n
fn+1 (x) − fn (x) <
1 2n−1
(6.10) for all x ∈ X and all n ≥ 1.
(6.11)
6.3 Continuous and Measurable Selectors
477
For n = 1, by the first part of the proof we have f1 : X −→ Y a continuous function such that f1 (x) ∈ F (x) + 12 B1 for all x ∈ X. For the induction step, suppose we have produced functions {fk }n k=1 that satisfy (6.10) and (6.11). We consider the multifunction
1 H(x) = F (x) ∩ fn (x) + n B1 2
for all x ∈ X.
Because of (6.10) we see that H has nonempty values, which are convex. Moreover, from Proposition 6.1.24 we have that H is lsc. Then from the first part of the proof we know that we can find fn+1 : X −→ Y a continuous function such that 1 B1 for all x ∈ X, 2n+1 1 1 1 for all x ∈ X. ⇒ fn+1 (x) ∈ fn (x) + n B1 + n+1 B1 ⊆ fn (x) + n−1 B1 2 2 2 n−1 1 (2 ) < +∞, from (6.11) it follows This completes the induction. Because fn+1 (x) ∈ H(x) +
n≥1
that {fn }n≥1 is a Cauchy sequence in C(X, Y ). So we can find f ∈ C(X, Y ) such that fn (x) −→ f (x) uniformly in x ∈ X. Because F is closed-valued from (6.10) we deduce that f (x) ∈ F (x) for all x ∈ X. Therefore f is a continuous selector of F . REMARK 6.3.7 It can be shown that the existence of continuous selectors is in fact equivalent to the paracompactness of the space X. Namely, if X is a Hausdorff topological space, Y is a Banach space, and every lsc multifunction F : X −→ Pf c (Y ) admits a continuous selector, then X is necessarily paracompact. We can use Theorem 6.3.6 to produce a continuous selector that passes from a prescribed point of Gr F . COROLLARY 6.3.8 If X is a paracompact space, Y is a Banach space, and F : X −→ Pf c (Y ) is lsc, then given (x, y) ∈ GrF we can find a continuous selector f0 of F such that f0 (x) = y. PROOF: Let F0 : X −→ Pf c (Y ) be defined by if x = x F (x ) . F0 (x ) = {y} if x = x It is easy to see that Fo is lsc. So by virtue of Theorem 6.3.6 we can find f0 : X −→ Y a continuous function such that f0 (x ) ∈ F0 (x ) for all x ∈ X. Evidently f0 is a continuous selector of F such that f0 (x) = y. Another interesting consequence of Theorem 6.3.6, is the following corollary. COROLLARY 6.3.9 If X, Y are Banach spaces and A ∈ L(Y, X) is surjective, then there exists f : X −→ Y a continuous (not necessarily linear) map such that A ◦ f = IdX .
478
6 Multivalued Analysis
PROOF: Let F (x)=A−1 (x), x ∈ X. Clearly F has nonempty, closed, and convex values. Moreover, from the open mapping theorem we know that A is an open map. Now, let V ⊆ Y be open. We show that F − (V ) is open in X, hence F is lsc. Indeed we have F − (V ) = x ∈ X : F (x) ∩ V = ∅ : x ∈ X : A−1 (x) ∩ V = ∅ . If y ∈ A−1 (x) ∩ V , then V ∈N (y) and so U = A(V ) ∈ N (x) (recall A is an open map). Then for every x ∈ U we have A−1 (x ) ∩ V = ∅, hence F − (V ) is open (i.e., F is lsc). We apply Theorem 6.3.6 and obtain f : X −→ Y a continuous selector of F . Evidently A ◦ f = IdX . REMARK 6.3.10 An interesting byproduct of the above proof is the fact that if X, Y are Hausdorff topological spaces and ϕ :Y −→ X is an open map (i.e., it maps open sets to open sets), then x −→ ϕ−1 (x) : y ∈ Y : ϕ(y) = x is a multifunction which is lsc from X into 2Y \{∅}. In the same vein if ϕ : Y −→ X is a closed map (i.e., it maps closed sets to closed sets), then x −→ ϕ−1 (x) is usc from X into 2Y \{∅}. We can improve the conclusion of Theorem 6.3.6 and produce a whole sequence {fn }n≥1 of continuous selectors of F such that {fn (x)}n≥1 is dense in F (x) for all x ∈ X, provided we enrich the structure of the spaces X and Y . THEOREM 6.3.11 If X is metric space, Y is a separable Banach space, and F : X −→ Pf c (Y ) is lsc, then there exists {fn }n≥1 a sequence of continuous selectors of F such that F (x) = {fn (x)}n≥1 for all x ∈ X. PROOF: Let {yn }n≥1 be a dense set in Y and set Unm =F − B1/2m (yn ) , n, m ≥ 1. Due to the lower semicontinuity of F each set Unm is open. Because X is a metric space, every open set is Fσ . So we have ! Unm = Cnmk with Cnmk is closed for all k ≥ 1. k≥1
We define
Fnmk (x) =
⎧ ⎪ ⎨ F (x)
if x ∈ / Cnmk
⎪ ⎩ F (x) ∩ B1/2m (yn )
if x ∈ Cnmk
.
Clearly Fnmk is lsc with values in Pf c (Y ). So by Theorem 6.3.6 we can find fnmk : X −→ Y a continuous selector of Fnmk . We claim that fnmk n,m,k≥1 is the desired sequence. To see this let y ∈ F (x) and m ≥ 1. We pick yn ∈ y + (1/2m )B1 . Then x ∈ Un,m+2 and for some k ≥ 1. But then we so x ∈ Cn,m+2,k
have fn,m+2,k (x) ∈ yn + 1/(2m+2 ) B 1 ⊆ yn + 1/(2m+1 ) B1 ⊆ y + (1/2m )B1 . In general Theorem 6.3.6 (Michael’s selection theorem) is optimal in the sense that all hypotheses are needed and can not be relaxed, namely the completeness of the space Y and the convexity and closedness of the values of the lsc multifunction
6.3 Continuous and Measurable Selectors
479
F . However, if the Banach space Y is separable, then we can drop the closedness of the values of F (·), provided we assume that they have a nonempty interior. More precisely we have the following result, whose proof can be found in Michael [426, Theorem 3.1 , p. 368] (see also Hu–Papageorgiou [313, p. 97]). THEOREM 6.3.12 If X is a metric space, Y is a separable Banach space, and F : X −→ 2Y \{∅} is a lsc multifunction such that for all x ∈ X F (x) is convex and int F (x) = ∅, then F admits a continuous selector. Deutsch–Kenderov [196] determined the biggest class of convex-valued multifunctions that admit a continuous approximate selector. They proved the following result. THEOREM 6.3.13 If X is a Hausdorff topological space, Y is a normed space, and F : X −→ 2Y \{∅} is a multifunction with convex values, then F admits a continuous approximate selector (i.e., for every ε > 0 there exists fε : X −→ Y a continuous map such that fε (x) ∈ F (x) + εB1 for every x ∈ X) if and only if F (·) is a-lsc (see Definition 6.1.48(a)). However, there exist multifunctions with nonempty, closed, and convex values values which are a-lsc and do not admit continuous selectors. Deutsch–Kenderov [196] and De Blasi–Myjal [179], produced examples in this direction. In addition De Blasi–Myjal [179] proved the following theorem. THEOREM 6.3.14 If X is a paracompact space, Y is a Banach space, and F : X −→ Pf c (Y ) is an hw -lsc multifunction, then F admits a continuous selector. REMARK 6.3.15 Because a multifunction F : X −→ 2Y \{∅} that is lsc is not necessarily hw -lsc and vice-versa, we see that Theorem 6.3.14 is distinct from Theorem 6.3.6. Before passing to measurable selectors, let us have a look at what happens with an usc multifunction. Recall that we started this section with a simple example illustrating that in general we do not expect to have continuous selectors for an usc multifunction. However, we can have some kind of approximate continuous selector (verify this with the multifunction given in Example 6.3.1). PROPOSITION 6.3.16 If X is a metric space, Y is a Banach space, and F : X −→ 2Y \{∅} is usc with convex values, then given ε > 0, we can find a locally Lipschitz function fε : X −→ Y such that fε (X) ⊆ conv F (X)
and
h∗ (Gr fε , Gr F ) < ε.
PROOF: Fix ε > 0. Because F is usc, for every x ∈ X we can find 0 < δ < δ(ε, x) such that F (x ) ⊆ F (x)+(ε/2)B1 for all x ∈ Bδ(x) (x). The collection Bδ/4 (x) x∈X is an open cover of X. The space X being a metric space, it is paracompact and so we can find {Uα }α∈J a locally finite refinement and {gα }α∈J a corresponding locally Lipschitz partition of unity subordinate to it. For each α ∈ J we choose
480
6 Multivalued Analysis
(uα , yα ) ∈ Gr F ∩ (Uα × Y ) and set fε (x) =
gα (x)yα . Clearly fε is locally
α∈J
Lipschitz and fε (X) ⊆ conv F (X). For x ∈ X we can find J(x) ⊆ J finite such that gα (x) > 0 if α ∈ J(x). For every α ∈ J(x), let xα ∈ X be such that Uα ⊆ Bδα /4 (xα ) with δα = δ(xα ). Let β ∈ J(xα ) and set δβ = max{δα : α ∈ J(x)}. We have xα ∈ Bδβ /2 (xβ ) and so Uα ⊆ Bδβ (xβ ). Hence for any α ∈ J(x), we have yα ∈ F (Uα ) ⊆ F (xβ ) + (ε/2)B1 and so fε (x) ⊆ F (xβ ) + (ε/2)B1 . Because x ∈ X was arbitrary we conclude that h∗ (Gr fε , Gr F ) < ε. Next we turn our attention to measurable selectors of measurable multifunctions. The first result in this direction is the so-called Kuratowski–Ryll Nardzewski selection theorem. The technique of its proof is similar to the proof of Michael’s selection theorem (Theorem 6.3.6). THEOREM 6.3.17 If (Ω, Σ) is a measurable space, X is a Polish space, and F : Ω −→ Pf (X) is measurable, then F admits a measurable selector. PROOF: Let d be a compatible metric on X and let {xn }n≥1 be a dense subset of X. For every ω ∈ Ω let n ≥ 1 be the smallest integer such that F (ω)∩B1 (xn ) = ∅. We set f0 (ω) = xn . Because by hypothesis F is measurable, the function f0 : Ω −→ X is Σ-measurable. Moreover, we have
d f0 (ω), F (ω) < 1 for all ω ∈ Ω. Suppose that we have constructed measurable maps fk : Ω −→ X, k = 0, 1, . . . , m such that fk (Ω) ⊆ {xn }n≥1 and 1
d fk (ω), F (ω) < k 2
d fk (ω), fk+1 (ω) <
for all ω ∈ Ω and all k ∈ {0, 1, . . . , m} 1 2k−1
(6.12)
for all ω ∈ Ω and all k ∈ {0, 1, . . . , m − 1}. (6.13) −1 fm ({x n })={ω
∈ Ω : fm (ω)=xn }, n ≥ 1. Evidently Cn =Ω. Moreover, from (6.12) we have
We consider the sets Cn = these sets are mutually disjoint and
n≥1
F (ω) ∩ B1/2m (xn ) = ∅
for all ω ∈ Cn .
We fix n ≥ 1 and ω ∈ Cn . Consider the smallest integer i ≥ 1 such that F (ω) ∩ B1/2m (xn ) ∩ B1/2m+1 (xi ) = ∅. We set fm+1 (ω) = xi . Then we have
and
1 1 1 d fm (ω), fm+1 (ω) ≤ m + m+1 < m−1 2 2 2
1 d fm+1 (ω), F (ω) < m+1 . 2
Moreover, the map ω −→ fm+1 (ω) from Ω into {xn }n≥1 is Σ-measurable. So by induction we have produced a sequence of Σ-measurable functions fk : Ω −→
6.3 Continuous and Measurable Selectors
481
{xn }n≥1 that satisfy (6.12) and (6.13) for all k ≥ 1. From (6.13) we see that for all ω ∈ Ω {fk (ω)}k≥1 is a Cauchy sequence in X. Because X is complete fk (ω) −→ f (ω) as k −→ ∞, for all ω ∈ Ω (see (6.12)). Then f : Ω −→ X is Σ-measurable and f (ω) ∈ F (ω) for all ω ∈ Ω (recall that F has closed values). In analogy with Theorem 6.3.11, we can produce a whole sequence fn : Ω −→ X, n ≥ 1, of measurable selectors of F , such that {fn (ω)}n≥1 is dense in F (ω) for all ω ∈ Ω. THEOREM 6.3.18 If (Ω, Σ) is a measurable space, X is a Polish space and F : Ω −→ Pf (X) is measurable, then there exists a sequence fn : Ω −→ X of measurable selectors of F such that F (ω) = {fn (ω)}n≥1
for all ω ∈ Ω.
PROOF: Let {Un }n≥1 be a countable base for X (recall that X being separable it is second countable). We define Fn : Ω −→ 2X \ {∅} by F (ω) ∩ Un if F (ω) ∩ Un = ∅ . Fn (ω) = F (ω) otherwise Let Ωn = ω ∈ Ω : F (ω) ∩ Un = ∅ . Because F is measurable, we have that Ωn ∈ Σ for all n ≥ 1. Let V ⊆ X be nonempty open. Then Fn− (V )= ω ∈ Ωn : F (ω) ∩ Un ∩ V = ∅ ∪ ω ∈ Ωcn : F (ω) ∩ V = ∅ ∈ Σ. ⇒ Fn is measurable for every n ≥ 1. ⇒ F n is measurable for every n ≥ 1
(see Proposition 6.2.4).
Apply Theorem 6.3.17 to obtain fn : Ω −→ X a Σ-measurable map such that fn (ω) ∈ Fn (ω) for all ω ∈ Ω and all n ≥ 1. Then clearly F (ω) = {fn (ω)}n≥1 for all ω ∈ Ω. Now we can complete Theorem 6.2.20 as follows. THEOREM 6.3.19 If (Ω, Σ) is a measurable space, X is a separable metric space, F : Ω −→ Pf (X), and we consider the following statements For every D ∈ B(X), F − (D) ∈ Σ. For every C ⊆ X closed, F − (C) ∈ Σ. F is measurable.
For every x ∈ X, ω −→ d x, F (ω) is Σ-measurable. There exists a sequence {fn }n≥1 of Σ-measurable selectors of F such that F (ω) = {fn (ω)}n≥1 for all ω ∈ Ω. (f) Gr F ∈ Σ × B(X).
(a) (b) (c) (d) (e)
Then the following implications hold (1) (a)⇒(b)⇒(c)⇔ (d)⇒(f). (2) If X is complete, then (c)⇔(d) ⇔(e). (3) If X is σ-compact, then (b)⇔(c).
482
6 Multivalued Analysis
(4) If Σ is complete (i.e., Σ = Σ) and X is complete, then (a)−→(f) are all equivalent. The other measurable selection theorem is graph-conditioned and is known in the literature as the Yankov–von Neumann–Aumann selection theorem. We state the result and for a proof of it, we refer to Hu–Papageorgiou [313, p.p. 158–159]. THEOREM 6.3.20 If (Ω, Σ) is a complete measurable space, X is a Souslin space, and F : Ω −→ 2X \{∅} is graph-measurable, then there exists a sequence {fn }n≥1 of Σ-measurable selectors of F such that F (ω) ⊆ {fn (ω)}n≥1
for all ω ∈ Ω.
REMARK 6.3.21 If Σ is not complete, then the selectors are Σ–measurable. So if (Ω, Σ, µ) is a σ-finite measure space, we can find a sequence {fn }n≥1 of Σ-measurable selectors of F such that F (ω) ⊆ {fn (ω)}n≥1 µ-a.e. on Ω. Now let us prove some useful consequences of these selection theorems. The first is about the existence of implicit measurable functions and is sometimes called Filippov’s implicit function theorem. THEOREM 6.3.22 If (Ω, Σ) and (T, S) are measurable spaces, X is a Souslin X T h:Ω × X −→ T space,
F :Ω −→ 2 \{∅} and G:Ω −→ 2 \{∅} are graph-measurable,
is Σ × B(X), S -measurable, and for all ω ∈ Ω, h ω, F (ω) ∩ G(ω) = ∅, then there
exists a Σ-measurable selector u of F such that h ω, u(ω) ∈ G(ω) for all ω ∈ Ω. PROOF: Let K(ω)={x ∈ F (ω) : h(ω, x) ∈ G(ω)}. By hypothesis K(ω)
= ∅ for all ω ∈ Ω. Also let h : Ω × X −→Ω × T be defined by h (ω, x)= ω, h(ω, x) . Evidently 0 0
h0 is Σ × B(X), Σ × S -measurable. We have GrK = GrF ∩ h−1 0 (GrG). The measurability of h0 and the graph-measurability of G imply that h−1 0 (GrG) ∈ Σ × B(X), ⇒ GrK ∈ Σ × B(X)
(recall that F is graph-measurable).
So we can apply Theorem 6.3.20 (see also Remark 6.3.21) and obtain u : Ω −→ X, a Σ-measurable map such that u(ω) ∈ K(ω) for all ω ∈ Ω. Then
u(ω) ∈ F (ω) and h ω, u(ω) ∈ G(ω) for all ω ∈ Ω. REMARK 6.3.23 Theorem 6.3.22 is useful in control theory, where Ω = S = [0, b] and G(s) = {v(s)} with v : S −→
T is a (Σ, S)-measurable map, usually corresponding to a velocity. If v(s) ∈ h s, F (s) for all s ∈ S, then there exists a measurable
control u : S −→ X which is admissible; that is, u(s) ∈ F (s) and v(s) = h s, u(s) for all s ∈ S. In many cases the control constraint set is trivial, namely F (s) = X for s ∈ S. If T and X are metrizable spaces and h is a Carath´eodory function, then h is jointly measurable (see Theorem 6.2.6).
6.3 Continuous and Measurable Selectors
483
We can also use the measurable selection theorems to prove a measurable version of the Berge maximum theorem (see Theorem 6.1.18). THEOREM 6.3.24 If (Ω, Σ) is a measurable space, X is a Souslin space, ϕ : Ω×X −→R∗ = R ∪ {±∞} is a Σ × B(X)-measurable function, and F :Ω−→2X \ {∅} is a graph-measurable multifunction, then (a) ω −→ m(ω) = sup{ϕ(ω, x) : x ∈ F (ω)} is Σ-measurable. (b) If for all ω ∈ Ω, the set S(ω) = {x ∈ F (ω) : m(ω) = ϕ(ω, x)} is nonempty, then Gr S ∈ Σ × B(X). PROOF: (a) Let λ ∈ R. Note that m(ω) > λ if and only if we can find x ∈ F (ω) such that ϕ(ω, x) > λ. Hence {ω ∈ Ω : m(ω) > λ} = projΩ {(ω, x) ∈ Ω × X : ϕ(ω, x) > λ}. Because ϕ is jointly measurable, we have {(ω, x) ∈ Ω × X : ϕ(ω, x) > λ} ∈ Σ × B(X). Invoking Theorem 6.2.18, we obtain projΩ {(ω, x) ∈ Ω × X : ϕ(ω, x) > λ} ∈ Σ, ⇒ {ω ∈ Ω : m(ω) > λ} ∈ Σ. Because λ ∈ R was arbitrary, we conclude that ω −→ m(ω) is Σ-measurable. (b) Let ψ(ω, x) = m(ω) − ϕ(ω, x). Evidently ψ is Σ × B(X)-measurable (see part (a)). Then Gr S = Gr F ∩ {(ω, x) ∈ Ω × X : ψ(ω, x) = 0} ∈ Σ × B(X). REMARK 6.3.25 Because of Theorem 6.3.24(b) and Theorem 6.3.20 we can have a Σ-measurable selector of S. If S has empty values, then we can look for εmaximizers. More precisely, if m is R-valued and ε > 0, then the multifunction ω −→ Sε (ω) = {x ∈ F (ω) : ϕ(ω, x) ≥ m(ω) − ε} has always nonempty values, its graph belongs in Σ × B(X) and so admits a Σ-measurable selector. PROPOSITION 6.3.26 If (Ω, Σ) is a measurable space, X is a separable Banach space, u : Ω −→ X is a Σ-measurable
map, and r : Ω −→ R+is a Σ-measurable function, then ω −→ B r(ω) u(ω) = x ∈ X : x − u(ω) ≤ r(ω) is measurable. PROOF: Let {xn }n≥1 be dense in the closed unit ball of X. Set vn (ω) = u(ω) + r(ω)xn ,
n ≥ 1.
Evidently for every n ≥ 1, vn is Σ-measurable and
B r(ω) u(ω) = {vn (ω)}n≥1
⇒ ω −→ B r(ω) u(ω) is measurable (see Theorem 6.3.18(2)).
484
6 Multivalued Analysis
The measurable selection theorems also provide the necessary tools to deal with scalarly measurable multifunctions (see Definition 6.2.1(c)). PROPOSITION 6.3.27 If (Ω, Σ) is a complete measurable space, X is a separable Banach space, and F : Ω −→ 2X \{∅} is graph-measurable, then F is scalarly measurable. PROOF: By Theorem 6.3.20 we can find a sequence {fn }n≥1 of Σ-measurable selectors of F such that F (ω) ⊆ {fn (ω)}n≥1 for all ω ∈ Ω. Then for every x∗ ∈ X ∗ , we have
σ x∗ , F (ω) = sup x∗ , fn (ω) , n≥1
⇒ ω −→ σ x∗ , F (ω) is Σ-measurable, ⇒ F
is scalarly measurable.
If F is Pwkc -valued, then measurability and scalar measurability are equivalent notions. To prove this, we need to do some preparatory work. We start with a lemma. LEMMA 6.3.28 If V is a normed space and C ⊆ V is nonempty and convex, then for every v ∈ V , we have d(v, C) = sup v ∗ , v − σ(v ∗ , C) : v ∗ ≤ 1 . PROOF: Let ϕ(v) = d(v, C). We know that ϕ is continuous convex. From Example ∗ 1.2.27(b), we know that ϕ∗ =σ(·, C) + iB ∗1 , where B 1 = v ∗ ∈ V ∗ : v ∗ ≤ 1 and ∗
iB ∗1 (v ) =
0 +∞
∗
if v ∗ ∈ B 1 ∗ if v ∗ ∈ / B1
for all v ∗ ∈ V ∗ .
Because ϕ is continuous convex, we have that ϕ = ϕ∗∗ = (ϕ∗ )∗ (see Theorem 1.2.21). Hence ϕ(v) = sup v ∗ , v − σ(v ∗ , C) − iB ∗1 (v ∗ ) : v ∗ ∈ V ∗ = sup v ∗ , v − σ(v ∗ , C) : v ∗ ≤ 1 . Also we use a particular topology on X ∗ , which for convenience of the reader we recall in the next definition. DEFINITION 6.3.29 Let V be a locally convex space. The locally convex topology on V of uniform convergence on w∗ -compact, convex balanced sets in V ∗ is called the Mackey topology on V . Recall that C ⊆ X ∗ is balanced, if λC ⊆ C for all |λ| ≤ 1.
6.3 Continuous and Measurable Selectors
485
REMARK 6.3.30 The strongest locally convex topology τ on V for which we have (V, τ )∗ = V ∗ , is the Mackey topology. So, if V is a Banach space and we consider the strongest topology τ on V ∗ such that (V ∗ , τ )∗ = V , this is the Mackey topology, whereas the weakest topology on V ∗ for which this is true is the weak∗ -topology. Of course in this case the Mackey topology is strictly weaker than the strong (norm) topology. However, on the Banach space V the Mackey and strong topologies coin∗ cide. Finally note that in a locally convex
∗ space V ∗a set C ⊆ V is w(V, V ∗)–compact ∗ if and only if the function v −→ σ v , C is m(V , V )–continuous (m(V , V ) is the Mackey topology on V ∗ for the pair (V ∗ , V )). PROPOSITION 6.3.31 If (Ω, Σ) is a measurable space, X is a separable Banach space, and F : Ω −→ Pwkc (X), then F is measurable if and only if it is scalarly measurable. PROOF: ⇒: From Theorem 6.3.19(2), we know that we can find a sequence {fn }n≥1 of Σ-measurable selectors of F such that F (ω) = {fn (ω)}n≥1 for all ω ∈ Ω. Then
for all x∗ ∈ X ∗ we have σ x∗ , F (ω) = sup x∗ , fn (ω) and so we conclude that n≥1
ω−→σ x∗ , F (ω) is Σ-measurable. ∗ ∗ endowed with ⇐: Since X is separable, the space Xw ∗ (= the dual Banach space X ∗ the w -topology) is separable. It follows that for every other topology τ on X ∗ for which we have (Xτ∗ )∗ = X, the space Xτ∗ is separable. In particular this is true if τ = m(X ∗ , X). Because F is Pwkc (X)-valued, from Remark 6.3.30 we know that x∗ −→ σ x∗ , F (ω) is m(X ∗ , X)-continuous on X ∗ . From Lemma 6.3.28 for every x ∈ X and every ω ∈ Ω, we have
d x, F (ω) = sup x∗n , x − σ x∗n , F (ω) , (6.14) n≥1
∗
where {x∗n }n≥1 is dense in B 1 ={x∗ ∈X ∗ : x∗ ≤ 1} for the Mackey topology. From (6.14) and Theorem 6.3.19(2), we conclude that F is measurable. Now we can state a version of Theorem 6.3.19 for nonmetrizable spaces. THEOREM 6.3.32 If (Ω, Σ) is a complete measurable space, X is a regular Souslin space, F : Ω −→ Pf (X) is a multifunction, and we consider the following statements, (a) (b) (c) (d)
For every D ∈ B(X), F − (D) ∈ Σ; For every C ⊆ X closed, F − (C) ∈ Σ; For every U ⊆ X open, F − (U ) ∈ Σ; There exists a sequence {fn }n≥1 of Σ-selectors of F such that F (ω) = {fn (ω)}n≥1
for all ω ∈ Ω;
(e) Gr F ∈ Σ × B(X); (f) For every continuous function u : X −→ R, the function ω −→ m(ω) = sup[u(x) : x ∈ F (ω)] is Σ-measurable, then
486
6 Multivalued Analysis
(1) (a)⇔(d)⇔(e)⇔(f)⇒(b) ⇒(c). (2) If X is second countable, then (a)−→(f) are all equivalent. REMARK 6.3.33 The proof of this theorem is based on the fact that there exists a metric d on X defining a Souslin metric topology finer that the original topology on X (see Saint-Pierre [536]). Then B(X) = B(Xd ) and we can apply Theorem 6.3.19 to the multifunction F : Ω −→ Pf (Xd ) (Xd is the space X with the new Souslin metric topology). Now, if X is a separable Banach space, then Xw∗ (= the dual Banach space X ∗ with the w∗ -topology) is a Souslin space (see Remark 6.2.17) which is regular and second countable. Then using Theorem 6.3.32 we can have the following “dual” version of Proposition 6.3.31. PROPOSITION 6.3.34 If (Ω, Σ) is a complete measurable space, X is a separable Banach space, and F : Ω −→ 2X \ {∅} is a multifunction with w∗ -compact convex values, then for every U ⊆ X ∗ open F − (U ) ∈ Σ if and only if for every x ∈ X,
ω −→ σ x, F (ω) is Σ-measurable. Let (Ω, Σ) be a measurable space and X a separable Banach space. Given a multifunction F : Ω −→ 2X \ {∅}, by ext F we denote the multifunction which to each ω ∈ Ω assigns the set ext F (ω) of extreme points of the set F (ω). In the next proposition we establish the measurability properties of the multifunction ω −→ ext F (ω). PROPOSITION 6.3.35 If (Ω, Σ) is a measurable space, X is a separable Banach space, and F : Ω −→ Pwkc (X) is a measurable multifunction, then ω −→ ext F (ω) is graph-measurable. PROOF: From the Krein–Milman theorem we know that for all ω ∈ Ω, ext F (ω) = ∅. Recall that X ∗ with the Mackey topology m(X ∗ , X) is separable. Let {x∗n }n≥1 ⊆ ∗ B 1 be m(X ∗ , X)-dense and consider the function ϕF : Ω × X −→ R+ defined by ⎧ ∗ 2 xn ,x ⎨ if x ∈ F (ω) 2n . ϕF (ω, x) = n≥1 ⎩ +∞ otherwise Evidently ϕF is Σ × B(X)-measurable and for every ω ∈ Ω, ϕF (ω, ·)F (ω) is continuous. Let α be the set of all continuous affine function α : X −→ R. We consider ϕF (ω, x) = inf α(x) : α ∈
α, α(v) > ϕF (ω, v)
for all v ∈ F (ω) .
Let un : Ω −→ X, n ≥ 1, be a sequence of Σ-measurable selectors of F such that F (ω) = {un (ω)}n≥1
(see Theorem 6.3.19).
For every (ω, x∗ ) ∈ Ω × X ∗ , let rx∗ (ω) = sup[ ϕF (ω, v) − x∗ , v : v ∈ F (ω)].
6.4 Decomposable Sets and Set-Valued Integration
487
We have that rx∗ (ω) < +∞, for every ω ∈ Ω x∗ −→ rx∗ (ω) is continuous on X ∗ and rx∗ (ω) = sup ϕF (ω, x) − x∗ , x : x ∈ F (ω) ,
⇒ rx∗ (ω) = sup ϕF ω, un (ω) − x∗ , un (ω) , n≥1
∗
⇒ (ω, x ) −→ rx∗ (ω) is Σ × B(X ∗ )-measurable. If we set
ϕF (ω, x) = inf x∗ , x + rx∗ (ω) : x∗ ∈ X ∗ ,
∗ }m≥1 ⊆ X ∗ m(X ∗ , X)-dense set, we have then for {vm ∗ ∗ (ω)], ϕF (ω, x) = inf vm , x + rvm m≥1
⇒ (ω, x) −→ ϕF (ω, x) is Σ × B(X)-measurable. From Choquet [144, Chapter 6], we know that ext F (ω) = {x ∈ X : ϕF (ω, x) = ϕF (ω, x)}, ⇒ Gr F ∈ Σ × B(X). REMARK 6.3.36 If we set ξF (ω, x) = ϕF (ω, x) − ϕF (ω, x), then ξF is Σ × B(X)measurable, for every ω ∈ Ω, ξF (ω, ·) is strictly concave on F (ω), concave on X, and upper semicontinuous. The function ξF (ω, ·) is known in the literature as the Choquet function of the set F (ω).
6.4 Decomposable Sets and Set-Valued Integration Throughout this section the standing hypotheses are: (Ω, Σ, µ) is a σ-finite measure space and X is a separable Banach space. By L0 (Ω, X) we denote the space of all equivalence classes in the set of all Σ-measurable maps from Ω into X, for the equivalence relation of equality almost everywhere. DEFINITION 6.4.1 A set K ⊆ L0 (Ω, X) is said to be decomposable, if for every triple (A, f1 , f2 ) ∈ Σ × K × K we have χA f1 + χAc f2 = χA f1 + (1 − χA )f2 ∈ K. EXAMPLE 6.4.2 If F : Ω−→2X \ {∅} is a multifunction and SF ={f ∈ L0 (Ω, X) : f (ω) ∈ F (ω) µ-a.e.}, then SF is decomposable (possibly empty). The same is true for the sets SFp = SF ∩Lp (Ω, X), 1 ≤ p ≤ ∞. We show in the sequel that within closure all decomposable sets are of this form. The condition of decomposability looks similar to convexity but only formally, because χA is not a constant and does not assume values between zero and one. However, as we show in this section, decomposability has some implications analogous to those of convexity.
488
6 Multivalued Analysis
PROPOSITION 6.4.3 If F : Ω−→2X \{∅} is graph-measurable and SFp = ∅, 1 ≤ p ≤ ∞, then we can find a sequence {fn }n≥1 ⊆ SFp such that F (ω) ⊆ {fn (ω)}n≥1 µa.e. on Ω. PROOF: From Theorem 6.3.19 (see also Remark 6.3.21), we can find a sequence {gm }m≥1 of Σ-measurable selectors of F such that F (ω) ⊆ {gm (ω)}m≥1 µ-a.e. on Ω. Since µ is σ-finite on Ω, we can find {Ak }k≥1 ⊆ Σ a countable partition of Ω with µ(Ak ) < +∞ for all k ≥ 1. Let f ∈ SFp and define Cmki = {ω ∈ Ω : i − 1 ≤ gm (ω) < i} ∩ Ak for all m, k, i ≥ 1.
c fmki = χCmki gm + χCmki f
Evidently {fmki }m,k,i≥1 ⊆ SFp and F (ω) ⊆ {fmki (ω)}m,k,i≥1 µ-a.e. on Ω.
COROLLARY 6.4.4 If F1 , F2 : Ω−→2X \{∅} are graph measurable and for some 1 ≤ p ≤ ∞ we have SFp 1 =SFp 2 = ∅, then F1 (ω) = F2 (ω) µ-a.e. on Ω. LEMMA 6.4.5 If F : Ω −→ 2X \ {∅} is graph-measurable, 1 ≤ p ≤ ∞, {fn }n≥1 ⊆ SFp satisfies F (ω) ⊆ {fn (ω)}n≥1 µ-a.e. f ⊆ SFp , and ε > 0, then we can find a finite Σ-partition {Ck }N k=1 of Ω such that N f − χCk fk p < ε. k=1
PROOF: Without any loss of generality we assume that f (ω) ∈ F (ω) for all ω ∈ Ω. Let ϑ ∈ L1 (Ω), ϑ(ω) > 0 for all ω ∈ Ω and Ω ϑdµ < εp /3. We can find {Dn }n≥1 a Σ-partition of Ω such that f (ω) − fn (ω)p < ϑ(ω)
for all ω ∈ Dn , n ≥ 1.
We choose N ≥ 1 large enough so that εp f (ω)p dµ < and 3.2p Dn n≥N +1
f1 (ω)p dµ <
n≥N +1 Dn
We define a finite Σ-partition {Ck }N k=1 of Ω by
! Dn C1 = D1 ∪ and Ck = Dk
εp . 3.2p
for k = 2, . . . , N.
n≥N +1
Then we have N N p f − χCk fk p = f (ω) − fk (ω)p dµ + f (ω) − f1 (ω)p dµ k=1 Dk
k=1
≤
ϑ(ω)dµ + Ω
n≥N +1 Dk
2p−1 f (ω)p + f1 (ω)p dµ < εp .
n≥N +1
Dk
Using this lemma we can completely characterize closed decomposable sets in the Lebesgue–Bochner space Lp (Ω, X), 1 ≤ p < ∞.
6.4 Decomposable Sets and Set-Valued Integration
489
THEOREM 6.4.6 If K ⊆ Lp (Ω, X), 1 ≤ p < ∞ is nonempty and closed, then K is decomposable if and only if K =SFp for some F : Ω−→Pf (X)-measurable. PROOF: ⇒ : From Proposition 6.4.3 we know that we can find a sequence {fn }n≥1 ⊆ Lp (Ω, X) such that X = {fn (ω)}n≥1 µ-a.e. on Ω. Let ξn = inf fn − hp : h ∈ K and let {hnm }m≥1 ⊆ K be such that
fn − hnm p ↓ ξn
as m −→ ∞.
We set F (ω)={hnm (ω)}n,m≥1 for all ω ∈ Ω. Evidently ω −→ F (ω) is measurable from Ω into Pf (X) (see Theorem 6.3.18). We claim that K = SFp . To this end let f ∈ SFp and ε > 0. From Lemma 6.4.5, we know that we can find a finite Σ-partition N {Ck }N k=1 of Ω and {gk }k=1 ⊆{hnm (ω)}n,m≥1 such that N f − χCk gk p < ε. k=1
Due to the decomposability of K we have N
χCk gk ∈ K,
k=1
⇒ f ∈K ⇒ Suppose that
K = SFp .
SFp
(because K is closed),
⊆ K.
(6.15)
Then we can find f ∈ K, A ∈ Σ with µ(A) > 0 and δ > 0 such
f (ω) − hnm (ω) ≥ δ
for all ω ∈ A and all n, m ≥ 1.
For what follows we fix n ≥ 1 so that δ E = A ∩ ω ∈ Ω : f (ω) − fn (ω) < 3
(6.16)
(6.17)
has a positive measure. We define gm = χE f + χE c hnm ,
m ≥ 1.
We have gm ∈ K and hnm (ω) − fn (ω) ≥ hnm (ω) − f (ω) − f (ω) − fn (ω) 2δ δ for all ω ∈ E (see (6.16) and (6.17)). ≥δ− = 3 3 (6.18) Therefore fn − hnm pp − ξnp ≥ fn − hnm pp − fn − gm pp
≥ fn − hnm p − fn − f p dµ E
2δ δp p − p µ(E) > 0, ≥ 3 3
m≥1
(see (6.18)).
(6.19)
490
6 Multivalued Analysis
If in (6.19) we let m −→ ∞, we have a contradiction. Hence K ⊆ SFp . From (6.15) and (6.20), we conclude that K = SFp .
(6.20)
REMARK 6.4.7 The result is also true for p=+∞, provided that K is boundedly closed; that is, if {fn }n≥1 ⊆ K, sup fn ∞ < +∞ and fn (ω) −→ f (ω) µ-a.e. on Ω, then f ∈ K.
n≥1
Bounded decomposable sets in L1 (Ω, X) are in fact uniformly integrable. To show this we need the following lemma, whose proof can be found in Hu– Papageorgiou [313, p. 178], or Neveu [458, p. 121]. LEMMA 6.4.8 If (Ω, Σ, µ) is a finite space and R is a family of Σ-measurable, R+ = R ∪ {+∞}-valued functions on Ω, then we can find a unique (modulo µ-a.e. equality) Σ-measurable function h : Ω −→ R+ such that (a) For every f ∈ R, we have f (ω) ≤ h(ω) µ-a.e. on Ω. (b) If h : Ω −→ R+ is another Σ-measurable function such that f (ω) ≤ h(ω) µ-a.e. on Ω for all f ∈ R, then h(ω) ≤ h(ω) µ-a.e. on Ω. Moreover, we can find a sequence {fn }n≥1 ⊆ R such that h(ω) = sup fn (ω) n≥1
µ-a.e. on Ω.
Finally, if R is directed upwards (namely if for every f1 , f2 ∈ R, we can find f3 ∈ R such that f1 (ω), f2 (ω) ≤ f3 (ω) µ-a.e. on Ω), then {fn }n≥1 can be chosen to be increasing. REMARK 6.4.9 The function h is denoted by ess sup R and is the least upper bound of R in the sense of inequality µ-a.e. The essential supremum coincides with the supremum modulo µ-null sets for countable families, but it is not the same for uncountable families. To see this let A ⊆ [0, 1] and R = {χ{α} : α ∈ A}. Then ess sup R = 0, but sup χ{α} = χA which need not be measurable if A is not or even if A is measurable but of positive Lebesgue measure, then χA = 0. In a similar way we can define the essential infimum of R, denoted by ess inf R. PROPOSITION 6.4.10 If (Ω, Σ, µ) is a finite measure space, Y is a Banach space, and K ⊆ L1 (Ω, Y ) is bounded and decomposable, then K is uniformly integrable. PROOF: We introduce the set |K| = {f (·) : f ∈ K} ⊆ L1 (Ω) and let h = ess sup |K| (see Remark 6.4.9). From Lemma 6.4.8, we know that we can find {fn }n≥1 ⊆ K such that h(ω) = sup fn (ω) µ-a.e. on Ω. Moreover, the decomn≥1
posability of K implies that |K| is directed upwards and so {fn }n≥1 can be chosen such that fn (ω) ↑ h(ω) µ-a.e. on Ω as n → ∞. From the monotone convergence theorem and because by hypothesis K is bounded, it follows that h ∈ L1 (Ω). Because f (ω) ≤ h(ω) µ-a.e. on Ω for all f ∈ K, we conclude that K is uniformly integrable.
6.4 Decomposable Sets and Set-Valued Integration
491
An immediate consequence of this proposition and of the Dunford–Pettis theorem, is the following result. COROLLARY 6.4.11 If (Ω, Σ, µ) is a finite-measure space, Y is a reflexive Banach space, and K ⊆ L1 (Ω, Y ) is a bounded decomposable set, then K is relatively weakly compact in L1 (Ω, Y ). To prove the next proposition, we need to recall the following important result of measure theory, which has remarkable applications in control theory. The result is known as Lyapunov’s convexity theorem and for a proof of it we refer to Diestel–Uhl [199, pp. 264 and 266]. THEOREM 6.4.12 Let (Ω, Σ) be a measure space. (a) If m : Σ −→ RN is a nonatomic vector measure, then m(Σ) is compact and convex in X. (b) If X is a Banach space with the RNP (Radon–Nikodym property) and m : Σ −→ X is a nonatomic measure of bounded variation, then m(Σ) is compact and convex. Using this theorem, we can prove the next proposition, which is another indication that decomposability and convexity are closely related notions. PROPOSITION 6.4.13 If (Ω, Σ, µ) is a finite nonatomic measure space, Y is a Banach space, K ⊆ Lp (Ω, Y ) 1 ≤ p < ∞ is nonempty, and decomposable, V is a finite-dimensional Banach space, and L ∈ L Lp (Ω, Y ), V , then L(K) is convex. PROOF: Let v1 , v2 ∈ L(K). Then v1 = L(f1 ) and v2 = L(f2 ) with f1 , f2 ∈ K. Consider the vector-valued set function m : Σ−→V defined by m(A) = L χA (f1 − f2 ) for all A ∈ Σ. We claim that m is a vector measure. To this end, let {An }n≥1 ⊆ Σ be mutually disjoint sets and let A= An , Cn = Ak ∈ Σ. We have n≥1
k≥n+1
m(A) = L χA (f1 − f2 ) n
= L χAk (f1 − f2 ) + L χCn (f1 − f2 ) , k=1 n n
⇒ m(A) − L χAk (f1 − f2 ) V = m(A) − m(Ak )V k=1
k=1
= L χCn (f1 − f2 ) V .
p
Note that Cn ↓ ∅ as n → ∞ and so χCn (f1 − f2 ) −→ 0 in L (Ω, Y ) and so L χCn (f1 − f2 ) −→ 0 in V . This proves that m is a vector measure. Because µ is nonatomic and m ! µ, m is nonatomic too. Therefore we can apply Theorem 6.4.12(a) and deduce that m(Σ) is convex. Then D = m(Σ) + L(f2 ) is convex in V . Note that v1 , v2 ∈ D and so for every λ ∈ [0, 1] we have λv1 +(1−λ)v2 ∈ D. Because χA (f1 − f2 ) + f2 = χA f1 + χAc f2 ∈ K (due to the decomposability of K), we infer that D ⊆ L(K). Therefore λv1 + (1 − λ)v2 ∈ L(K) for all λ ∈ [0, 1]. Since v1 , v2 ∈ L(K) were arbitrary, we conclude that L(K) is convex.
492
6 Multivalued Analysis
PROPOSITION 6.4.14 If (Ω, Σ, µ) is a finite measure space, Y is a Banach space, and K ⊆ Lp (Ω, Y ) 1 ≤ p < ∞ is nonempty, decomposable and w-closed, then L(K) is convex. PROOF: Let g = conv K. Then g =
n
λk fk with λk ∈ [0, 1],
k=1
n
λk = 1 and
k=1
fk ∈ K for all k ∈ {1, . . . , n}. Let (·, ·)p denote the duality brackets for the pair
p L (Ω, Y ), Lp (Ω, Yw∗∗ ) , p1 + p1 = 1 (see Ionescu–Tulcea [328, p. 99]). Consider the basic weak neighborhood of g defined by V (g) = h ∈ Lp (Ω, Y ) : |(uk , h − g)p | < ε, k ∈ {1, . . . , N } ,
where N ≥ 1, uk ∈ Lp (Ω, Yw∗∗ ), k ∈ {1, . . . , N } and ε > 0. Let L : Lp (Ω, Y ) −→ RN be defined by N L(h) = (uk , h) k=1 . From Proposition 6.4.13 we know that L(K) is convex. Because L(conv K) = conv L(K) = L(K), we can find f ∈ K such that L(g) = L(f ), hence V (g) ∩ K = ∅ w and so g ∈ K = K. Therefore K = conv K. REMARK 6.4.15 Both Propositions 6.4.13 and 6.4.14 are also valid for σ-finite measure spaces (Ω, Σ, µ). In this case we work on the finite measure components of Ω. The next theorem is a basic tool in what follows. It is also very useful in various optimization problems. THEOREM 6.4.16 If (Ω, Σ, µ) is a σ-finite measure space, X is a separable Banach space, ϕ : Ω × X −→ R∗ = R ∪ {±∞} is a Σ × B(X)-measurable function, F : Ω −→ 2X \ {∅} is graph-measurable, and the integral functional
Iϕ (u) = ϕ ω, u(ω) dµ(ω), u ∈ Lp (Ω, X), 1 ≤ p ≤ ∞, Ω
is defined for all u ∈ SFp = {u∈Lp (Ω, X) : u(ω) ∈ F (ω) µ-a.e. on Ω}, exists and there p u0 ∈ Lp (Ω, X) such that I (u ) > −∞, then sup I (u) : u ∈ S sup ϕ(ω, x) : = ϕ 0 ϕ F Ω x ∈ F (ω) dµ. PROOF: Let m(ω) = sup ϕ(ω, x) : x ∈ F (ω) . From Theorem 6.3.24, we know
p for every u ∈ S we have ϕ ω, u(ω) ≤ m(ω) µ-a.e. that m is Σµ -measurable. Also F
on Ω. In particular then ϕ ω, u0 (ω) ≤ m(ω) and so we see that Ω mdµ exists (possibly equals +∞). We have
sup Iϕ (u) : u ∈ SFp ≤ mdµ. Ω
If Iϕ (u0 ) = +∞, then we are done.
So assume that in Iϕ (u0 ) ∈ R. Then the function ω −→ ϕ ω, uo (ω) belongs 1 L (Ω). Let ξ < Ω mdµ be given. We show that ξ < Iϕ (u) for some u ∈ SFp . Let {Cn }n≥1 ⊆ Σ be such that Cn ↑ Ω and µ(Cn ) < ∞ and consider a strictly positive function ϑ ∈ L1 (Ω). We set
6.4 Decomposable Sets and Set-Valued Integration
493
Dn = Cn ∩ {ω ∈ Ω : ϕ ω, uo (ω) ≤ n} ⎧ ϑ(ω) ⎪ if ω ∈ Dn and ϑ(ω) ≤ n ⎨ m(ω) − n ϑ(ω) mn (ω) = n − n if ω ∈ Dn and ϑ(ω) > n . ⎪ ⎩
if ω ∈ Dnc ϕ ω, uo (ω) − ϑ(ω) n
and
Clearly {m_n}_{n≥1} ⊆ L^1(Ω) and m_n ↑ m as n → ∞. Therefore by the monotone convergence theorem, we can find n_0 ≥ 1 such that ξ < ∫_Ω m_{n_0} dµ. Set η = m_{n_0}. Then ξ < ∫_Ω η dµ and η(ω) < m(ω) µ-a.e. on Ω. Define
G(ω) = F(ω) ∩ { x ∈ X : η(ω) ≤ ϕ(ω, x) } ≠ ∅   for all ω ∈ Ω.   (6.21)
Evidently Gr G ∈ Σ × B(X) and so by Theorem 6.3.20, we can find u : Ω −→ X a Σ–measurable function such that u(ω) ∈ G(ω) for all ω ∈ Ω. We define En = Cn ∩ {ω ∈ Ω : u(ω) ≤ n} vn = χEn u + χEnc u0 , n ≥ 1.
and
Evidently {v_n}_{n≥1} ⊆ S_F^p and we have
I_ϕ(v_n) = ∫_{E_n} ϕ(ω, u(ω)) dµ + ∫_{E_n^c} ϕ(ω, u_0(ω)) dµ
  ≥ ∫_{E_n} η(ω) dµ + ∫_{E_n^c} ϕ(ω, u_0(ω)) dµ   (see (6.21))
  = ∫_Ω η(ω) dµ + ∫_{E_n^c} ( ϕ(ω, u_0(ω)) − η(ω) ) dµ.   (6.22)
Because ∫_Ω η(ω) dµ > ξ and E_n ↑ Ω, from (6.22) it follows that for n ≥ 1 large we have I_ϕ(v_n) > ξ.
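A finite counterpart of Theorem 6.4.16 is easy to verify directly. The sketch below is not from the text; the set Ω, the weights µ, the sets F(ω), and the integrand ϕ are arbitrary illustrative choices. Because the terms of the integral decouple over Ω, the supremum of the integral functional over all selections equals the integral of the pointwise supremum, which is exactly the mechanism of the theorem.

```python
import itertools
import random

# Finite sketch of Theorem 6.4.16 (illustrative only): Ω = {0,...,4} with weights
# mu(ω), F(ω) a three-point set, and an arbitrary integrand phi.
random.seed(0)
Omega = range(5)
mu = {w: random.uniform(0.5, 2.0) for w in Omega}
F = {w: [random.uniform(-1.0, 1.0) for _ in range(3)] for w in Omega}
phi = lambda w, x: x ** 2 - w * x

# Left-hand side: supremum of the integral functional over all selections u of F.
lhs = max(
    sum(phi(w, u[w]) * mu[w] for w in Omega)
    for u in (dict(zip(Omega, choice))
              for choice in itertools.product(*(F[w] for w in Omega)))
)

# Right-hand side: integral of the pointwise supremum m(ω) = max over F(ω).
rhs = sum(max(phi(w, x) for x in F[w]) * mu[w] for w in Omega)

print(lhs, rhs)
assert abs(lhs - rhs) < 1e-12     # the two quantities coincide
```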
In what follows we study the set SFp 1 ≤ p ≤ ∞ for a multifunction F . Throughout this study (Ω, Σ, µ) is a σ-finite measure space and X a separable Banach space. As always additional hypotheses are introduced as needed. PROPOSITION 6.4.17 If F : Ω −→ 2X \ {∅} is graph-measurable and SFp = ∅, p 1 ≤ p < ∞, then conv SFp = SconvF .
PROOF: It is easy to see that S^p_{conv F} ∈ P_{fc}(L^p(Ω, X)). So we have
conv S_F^p ⊆ S^p_{conv F}.   (6.23)
Suppose that the inclusion in (6.23) is strict. So we can find f ∈ S^p_{conv F} such that f ∉ conv S_F^p. Then from the strong separation theorem we can find u* ∈ L^{p'}(Ω, X*_{w*}) = L^p(Ω, X)*, (1/p) + (1/p') = 1, such that
σ(u*, conv S_F^p) < ⟨u*, f⟩_p.   (6.24)
From the definition of the support function (see Definition 6.1.14(b)), we have
σ(u*, conv S_F^p) = σ(u*, S_F^p) = sup{ ⟨u*, h⟩_p : h ∈ S_F^p }
  = sup{ ∫_Ω ⟨u*(ω), h(ω)⟩ dµ : h ∈ S_F^p }
  = ∫_Ω sup{ ⟨u*(ω), x⟩ : x ∈ F(ω) } dµ   (see Theorem 6.4.16)
  = ∫_Ω σ(u*(ω), F(ω)) dµ
  = ∫_Ω σ(u*(ω), conv F(ω)) dµ.   (6.25)
From (6.24) and (6.25), and because f ∈ S^p_{conv F}, we have
∫_Ω σ(u*(ω), conv F(ω)) dµ < ∫_Ω ⟨u*(ω), f(ω)⟩ dµ ≤ ∫_Ω σ(u*(ω), conv F(ω)) dµ,
a contradiction. This proves that in (6.23) we have equality.
A useful byproduct of the previous proof is the following result.
COROLLARY 6.4.18 If F : Ω −→ 2^X \ {∅} is graph-measurable and S_F^p ≠ ∅, 1 ≤ p < ∞, then for every u* ∈ L^{p'}(Ω, X*_{w*}) = L^p(Ω, X)*, 1/p + 1/p' = 1, we have
σ(u*, S_F^p) = ∫_Ω σ(u*(ω), F(ω)) dµ.
PROPOSITION 6.4.19 If (Ω, Σ, µ) is a nonatomic, σ-finite measure space, F : w p Ω −→ 2X \{∅} is graph-measurable, and SFp = ∅, 1 ≤ p < ∞, then SFp = SconvF (here w we denote the closure in the weak topology). by w
PROOF: Evidently SFp is nonempty, decomposable, and weakly closed. Therefore w w by Proposition 6.4.14 SFp is convex. It follows that conv SFp = SFp . Invoking Proposition 6.4.17, we conclude that w
p SconvF = SFp .
PROPOSITION 6.4.20 If F : Ω −→ 2X \ {∅} is graph-measurable, and SFp = ∅, 1 ≤ p < ∞, then SFp =SFp .
PROOF: Because SFp ∈ Pf Lp (Ω, X) , we have SFp ⊆ SFp . Because µ is σ-finite, we can find {An }n≥1 ⊆ Σ of finite µ-measure suchp that An ⊆ An+1 for all n ≥ 1 and An =Ω. Then we have µ(An ) ↑ +∞. Let f ∈ SF and for every n ≥ 1, we consider n≥1
the multifunction
H_n(ω) = { x ∈ F(ω) : ‖f(ω) − x‖ < 1/µ(A_n)^2 }.
Clearly H_n(ω) ≠ ∅ for all ω ∈ Ω and Gr H_n ∈ Σ × B(X). Then by virtue of Theorem 6.3.20 (see also Remark 6.3.21), we can find h_n : Ω −→ X a Σ-measurable selector of F such that
‖f(ω) − h_n(ω)‖ < 1/µ(A_n)^2   µ-a.e. on Ω.   (6.26)
Let g ∈ S_F^p and set u_n = χ_{A_n} h_n + χ_{A_n^c} g. Then u_n ∈ S_F^p and we have
‖f − u_n‖_p^p = ∫_{A_n} ‖f(ω) − h_n(ω)‖^p dµ + ∫_{A_n^c} ‖f(ω) − g(ω)‖^p dµ
  < (1/µ(A_n)^{2p}) µ(A_n) + ∫_{A_n^c} ‖f(ω) − g(ω)‖^p dµ.   (6.27)
Note that ‖f(·) − g(·)‖^p ∈ L^1(Ω)_+ and µ(A_n^c) ↓ 0. So if we pass to the limit as n → ∞ in (6.27), we conclude that
u_n −→ f   in L^p(Ω, X).
Because un ∈ SFp , it follows that f ∈ SFp and so SFp ⊆ SFp . Hence finally SFp = SFp . From the structure of the set SFp , we can extract information about the pointwise properties of the multifunction F . PROPOSITION 6.4.21 If F :Ω−→2X \{∅} is graph-measurable and SFp , 1 ≤ p < ∞, is nonempty closed (resp., nonempty closed convex and µ is nonatomic), then for µ-a.a. ω ∈ Ω, F (ω) ∈ Pf (X) (resp., F (ω) ∈ Ff c (X)). PROOF: First we assume that SFp is nonempty and closed. From Proposition 6.4.20 we have SF = SF and then Corollary 6.4.4 implies that F (ω) = F (ω) µ-a.e. on Ω. Now assume that SFp is nonempty, closed, and convex and µ is nonatomic. Inw p p , hence SFp = SconvF (recall voking Proposition 6.4.19, we have that SFp = SconvF that a convex set is closed if and only if it is w-closed). Once again Corollary 6.4.4 implies F (ω) = conv F (ω) µ-a.e. on Ω; that is, F (ω) ∈ Pf c (X) µ-a.e. on Ω. DEFINITION 6.4.22 A multifunction F : Ω −→ 2X\{∅} is said to be Lp -integrably bounded (1 < p ≤ ∞) and simply integrably bounded (for p = 1), if there exists h ∈ Lp (Ω)+ such that |F (ω)| = sup x : x ∈ F (ω) ≤ h(ω) µ-a.e. on Ω.
For integrably bounded multifunctions we have a weak compactness result for the set SF1 , which is a very valuable tool in many applications. THEOREM 6.4.23 If F : Ω −→ Pwkc (X) is a graph measurable and integrably bounded multifunction, then SF1 is nonempty, convex, and w-compact. ∗ PROOF: Clearly SF1 is nonempty and convex. Also let u∗ ∈ L∞ (Ω, Xw ∗) = 1 ∗ L (Ω, X) . If by ·, · we denote the duality brackets for the pair 1
1 ∗ L (Ω, X), L∞ (Ω, Xw ∗ ) , we have
u∗ (ω), f (ω) dµ : f ∈ SF1 sup u∗ , f 1 : f ∈ SF1 = sup Ω
= sup u∗ (ω), x : x ∈ F (ω) dµ, Ω
(see Theorem 6.4.16). We introduce the multifunction S : Ω −→ 2X defined by
S(ω) = u ∈ F (ω) : u∗ (ω), u = sup u∗ (ω), x : x ∈ F (ω) = σ u∗ (ω), F (ω) . Because F has values in Pwkc (X), we see that S(ω) = ∅for all ω ∈ Ω. Moreover,
from Theorem 6.3.24 we know that ω −→ σ u∗ (ω), F (ω) is Σµ -measurable (Σµ being the µ-completion of Σ). It follows that GrS ∈ Σµ × B(X). Invoking Theorem 6.3.20 we can find u : Ω −→ X Σ-measurable such that u(ω) ∈ S(ω) Evidently u ∈ L1 (Ω, X) and we have sup u∗ , f 1 : f ∈ SF1 =
Ω
µ-a.e.
u∗ (ω), u(ω) dµ = u∗ , u1 .
By virtue of the James theorem, because SF1 is closed, convex, and bounded, we conclude that SF1 is w-compact in L1 (Ω, X). The converse of this theorem is also true (see Klei [356, p. 313]). THEOREM 6.4.24 If F : Ω −→ Pf (X) is graph-measurable, integrably bounded, and SF1 ⊆ L1 (Ω, X) is w-compact convex, then F (ω) ∈ Pwkc (X) µ-a.e. on Ω. PROPOSITION 6.4.25 If F : Ω −→ Pwkc (X) is a scalarly measurable multifunction, then ext SF = SextF = ∅ in L0 (Ω, X). PROOF: By virtue of Proposition 6.3.35 and Theorem 6.3.20, we have that SextF = ∅. Moreover, we have SextF ⊆ ext SF . Suppose that the inclusion is strict. So we can find f ∈ ext SF and A ∈ Σ, µ(A) > 0 such that f (ω) ∈ / ext F (ω) for all ω ∈ A. Consider the multifunction S : A −→ 2X×X defined by 1 S(ω) = (x, y) ∈ F (ω) × F (ω) : f (ω) = (x + y), x = y . 2
Then Gr S ∈ (Gr F × Gr F ) ∩ (ω, x, y) ∈ A×X×X : f (ω) = 12 (x + y) ∩ (Ω × ∆c ) where ∆ = {(x, y) ∈ X × X : x = y} (the diagonal in X × X). Hence Gr S ∈ A×B(X) and we can apply Theorem 6.3.20 and obtain f1 , f2 : A −→ X ΣA = Σ∩A
measurable functions such that f1 (ω), f2 (ω) ∈ S(ω) µ-a.e. on A. Let h ∈ SF (it exists by virtue of Theorem 6.3.20) and set f1 = χA f1 + χAc h, f2 = χA f2 + χAc h. Evidently f1 , f2 ∈ SF and f = 12 (f1 + f2 ), f1 = f2 , a contradiction to the fact that f ∈ extSF . Using this proposition, we can prove the following theorem. THEOREM 6.4.26 If F : Ω −→ Pwkc (X) is a scalarly measurable multifunction p and SFp = ∅, 1 ≤ p ≤ ∞, then ext SFp = SextF . p ∩ Lp (Ω, X) and from Proposition PROOF: Recall that SFp =SF ∩ Lp (Ω, X), SextF 6.4.25 we have ext S = S . So it suffices to show that ext S F ∩ Lp (Ω, X) = extF
F p ext SF ∩ L (Ω, X) . It is easy to see that ext SF ∩ Lp (Ω, X) ⊆ ext SF ∩ Lp (Ω,
X) . We showthat the opposite inclusion is also true. To this end let f ∈ ext SF ∩ Lp (Ω, X) and suppose that f ∈ / ext SF . So we can find f1 , f2 ∈ SF , f1 = f2 such that f = 12 (f1 + f2 ). By translating things if necessary, we may assume that f = 0. Because µ is σ-finite, we can find A ∈ Σ with 0 < µ(A) < +∞ such that f1 (ω) = f2 (ω) for all ω ∈ A. We define
⎧ ⎪ ⎨ fk (ω) hk (ω) =
⎪ ⎩
fk (ω) fk (ω)
0
if fk (ω) ≤ 1, ω ∈ A if fk (ω) > 1, ω ∈ A , k = 1, 2. otherwise
p 1 Clearly hk ∈ SF ∩ Lp (Ω, X) = SF and 0 = f = 2 (h1 + h2 ), which means that 0=f ∈ / ext SF ∩ Lp (Ω, X) .
Combining Proposition 6.3.35 and the Krein–Milman theorem, we obtain the following result. PROPOSITION 6.4.27 If (Ω, Σ, µ) is a nonatomic, σ-finite measure space and w p F : Ω −→ Pwkc (X) is scalarly measurable with SFp = ∅, 1 ≤ p ≤ ∞, then SextF = p SF . Before passing to set-valued integration, we state (without proofs which are rather long and technical) two continuous selection theorems for multifunctions that instead of convexity exhibit a decomposability property in their values. These theorems confirm the earlier observation that decomposability is a very effective substitute of convexity. For their proofs we refer to Hu–Papageorgiou [313, Section II.8]. The first theorem is a nonconvex-decomposable version of Michael’s selection theorem (see Theorem 6.3.6).
THEOREM 6.4.28 If Y is a metric space and F : Y −→ Pf L1 (Ω, X) is a lsc multifunction with decomposable values, then F admits a continuous selector.
For the second continuous selection theorem, we need to do some preparatory work. So let T = [0, b] be furnished with the one-dimensional Lebesgue measure λ. Let Y be a Polish space and X a separable Banach space. For the Lebesgue–Bochner space L^1(T, X), we introduce a new norm, useful in the study of differential and evolution equations.
DEFINITION 6.4.29 The weak norm on the Lebesgue–Bochner space L^1(T, X) is defined by
‖f‖_w = sup{ ‖ ∫_{t'}^{t} f(s) ds ‖ : 0 ≤ t' ≤ t ≤ b },   f ∈ L^1(T, X).
REMARK 6.4.30 Clearly an equivalent definition of the weak norm is the following:
‖f‖_w = sup{ ‖ ∫_0^t f(s) ds ‖ : 0 ≤ t ≤ b },   f ∈ L^1(T, X).
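The following sketch is not from the text; the grid and the oscillating integrand are illustrative choices. It computes the weak norm of Remark 6.4.30 for a discretized scalar function and contrasts it with the L^1-norm: rapidly oscillating functions have small weak norm, which is why ‖·‖_w interacts well with the weak topology (compare Proposition 6.4.33 below).

```python
import numpy as np

# Illustrative sketch (not from the text): the weak norm of Remark 6.4.30 for a
# discretized scalar function on T = [0, b], approximated on a uniform grid.
b, n = 1.0, 10_000
t = np.linspace(0.0, b, n, endpoint=False)
dt = b / n
f = np.sign(np.sin(40.0 * np.pi * t))                  # highly oscillating, |f| = 1 a.e.

partial = np.concatenate(([0.0], np.cumsum(f) * dt))   # t -> integral_0^t f(s) ds
weak_norm = np.max(np.abs(partial))
l1_norm = np.sum(np.abs(f)) * dt

print(weak_norm, l1_norm)   # weak norm about 1/40, although the L1 norm equals 1
```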
It is not immediately clear how the · w -norm topology and the weak topology on L1 (T, X) are related. It turns out that the next concept is the link between the two topologies. DEFINITION 6.4.31 A set W ⊆ L1 (T, X) has property U , if (a) W is uniformly integrable and (b) For every ε > 0, we can find Kε ∈ Pk (X) such that for every f ∈ W there exists Sε,f ⊆ T -measurable subset with λ(T \ Sε,f ) < ε such that f (t) ∈ Kε for all t ∈ Sε,f . REMARK 6.4.32 If W ⊆ L1 (T, X) has property U , then W is relatively weakly compact in L1 (T, X). Using property U , we can compare the · w -norm and weak topologies on L1 (T, X). PROPOSITION 6.4.33 If W ⊆ L1 (T, X) has property U , then the · w -norm and weak topologies coincide on the set W . The second continuous selection theorem uses the · w -norm topology on L1 (T, X). This theorem concerns multifunctions F (t, x) of two variables (t, x) ∈ T × X. The hypotheses on F are the following. H(F): F : T ×Y −→ Pwkc (X) is a multifunction such that (i) For every y ∈ Y, t −→ F (t, y) is measurable. (ii) For almost all t ∈ T, y −→ F (t, y) is h-continuous. (iii) For every C ∈ Pk (Y ), we can find αC ∈ L1 (T )+ such that for a.a. t ∈ T , all y ∈ C, and all u ∈ F (t, y), we have u ≤ αC (t).
REMARK 6.4.34 In the next section we discuss in some detail multifunctions F (t, x) of two variables. For the moment it suffices to mention that if F satisfies hypotheses H(F )(i), (ii), then F is jointly measurable, in particular then supmeasurable (compare with Theorem 6.2.6). This fact combined with hypothesis = ∅. H(F )(iii), implies that if y ∈ C(T, Y ), then S 1
F ·,y(·)
Now let K ⊆ C(T, Y ) be nonempty, and compact and consider the multifunc . Let CSΓw (resp., tion Γ : K −→ Pwkc L1 (T, X) defined by Γ(y) = S 1
F ·,y(·)
w CSextΓ ) be the set of selectors of Γ (resp., of ext Γ) that are continuous from K into L1 (T, X) endowed with the weak norm (denoted by L1w (T, X), not to be confused with L1 (T, X)w which is the Lebesgue–Bochner space L1 (T, X) with the weak topology).
THEOREM 6.4.35 If F (t, x) satisfies hypotheses H(F ), K⊆C(T, Y ) is compact, ·w w for all y ∈ K, then CSΓw = CSextΓ . and Γ(y) = S 1
F ·,y(·)
REMARK 6.4.36 Theorems 6.4.28 and 6.4.35 are indispensable tools in the study of nonconvex differential inclusions (see Hu–Papageorgiou [316]). Now we pass to set-valued integration. First of all the definition of the integral of a multifunction that we use is a straightforward continuous extension of the Minkowski sum of sets. Throughout this last part of this section (Ω, Σ, µ) is a σfinite measure space, X is a separable Banach space, and F : Ω −→ 2X \ {∅} is a multifunction with SF1 = ∅. REMARK 6.4.37 For a graph-measurable multifunction F : Ω −→ 2X \ {∅}, a straightforward measurable selection argument involving Theorem 6.3.20, reveals that SF1 = ∅ if and only if there exists h ∈ L1 (Ω)+ such that inf{u : u ∈ F (ω)} ≤ h(ω). DEFINITION 6.4.38 The set-valued integral of a multifunction F : Ω −→ 2X \{∅} with SF1 = ∅, is defined by
∫_Ω F dµ = { ∫_Ω f dµ : f ∈ S_F^1 }.
The following is an immediate consequence of Theorem 6.4.16. X measurable and SF1 = ∅, then PROPOSITION 6.4.39 If F : Ω −→ 2 \{∅} is graph ∗ ∗ ∗ ∗ for every x ∈ X , we have σ(x , Ω F dµ) = Ω σ(x , F )dµ.
Also from Proposition 6.4.17, we obtain the following. X 1 PROPOSITION 6.4.40 If F : Ω −→ 2 \{∅} is graph-measurable and SF = ∅, then cl Ω conv F dµ = conv Ω F dµ = cl Ω conv F dµ.
The set-valued integral has some remarkable intrinsic convexity properties.
X THEOREM 6.4.41 If µ is nonatomic and F : Ω −→ 2 \{∅} is graph-measurable with SF1 = ∅, then cl Ω F dµ is convex. PROOF: Let v1 , v2 ∈ Ω F dµ. Then by Definition 6.4.38, we have
v_1 = ∫_Ω f_1 dµ and v_2 = ∫_Ω f_2 dµ, with f_1, f_2 ∈ S_F^1.
Consider the vector measure m : Σ −→ X × X defined by
m(A) = ( ∫_A f_1 dµ, ∫_A f_2 dµ ),   A ∈ Σ.
Invoking Theorem 6.4.12(b), we see that the closure of m(Σ) is convex. We have m(∅) = 0 and m(Ω) = (v_1, v_2). So given ε > 0 and t ∈ [0, 1], we can find A ∈ Σ such that
‖ t ∫_Ω f_k dµ − ∫_A f_k dµ ‖ < ε/2,   k = 1, 2.
Set f = χ_A f_1 + χ_{A^c} f_2. Then f ∈ S_F^1 and we have
‖ t v_1 + (1 − t) v_2 − ∫_Ω f dµ ‖ < ε,
⇒ cl ∫_Ω F dµ is convex.
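The convexifying effect described by Theorem 6.4.41 (and by Theorem 6.4.44 below) can be seen in the simplest nonconvex example. The sketch below is illustrative and not part of the text; it takes F(ω) = {0, 1} on [0, 1] with Lebesgue measure and checks that the attainable integrals of selections fill the whole interval [0, 1], whereas an atomic measure would only produce {0, 1}.

```python
import numpy as np

# Sketch (illustrative, not from the text): F(ω) = {0, 1} for every ω, so each
# value set is nonconvex.  A selection f is {0,1}-valued and its integral equals
# µ({f = 1}).

# Nonatomic case: Lebesgue measure on [0, 1], discretized into N equal cells.
N = 1000
for t in (0.0, 0.137, 0.5, 0.9, 1.0):
    f = np.zeros(N)
    f[: round(t * N)] = 1.0        # equal to 1 on a set of measure approximately t
    print(t, f.mean())             # every value in [0, 1] is (approximately) attained

# By contrast, if µ were a single atom of mass 1, the only attainable integrals
# would be 0 and 1, so the set-valued integral would fail to be convex; this is
# where nonatomicity enters Theorem 6.4.41.
```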
COROLLARY 6.4.42 If µ is nonatomic and F : Ω −→ 2X\{∅} is graph-measurable 1 with SF = ∅, then cl Ω conv F dµ = cl Ω F dµ. REMARK 6.4.43 So when µ is nonatomic, convexification of the multifunction F essentially does not add anything new to the set-valued integral. If in Theorem 6.4.41, dim X<+∞, then we can exploit the fact that Lyapunov’s convexity theorem is exact (i.e., no closure is needed; see Theorem 6.4.12(a)), in order to obtain the following result. THEOREM 6.4.44 If µ is nonatomic, X is finite-dimensional, and F : Ω −→ 2X \{∅} is graph-measurable with SF1 = ∅, then Ω F dµ is convex. For RN -valued multifunctions, we have the following partial improvement of the above theorem. For a proof of this result see Hildenbrand [297, p. 48] and Hu–Papageorgiou [313, p. 202]. N
THEOREM 6.4.45 If F : Ω −→ 2R \{∅} is graph-measurable, SF1 = ∅, and F (ω) ⊆ RN + for all ω ∈ Ω, then Ω conv F dµ = conv Ω F dµ.
N
COROLLARY 6.4.46 If µ is nonatomic, F :Ω−→ 2 R \{∅} is graph-measurable, SF1 = ∅, and F (ω) ⊆ RN + for all ω ∈ Ω, then Ω F dµ = Ω conv F dµ. REMARK 6.4.47 Corollary 6.4.46 fails if RN is replaced by an infinitedimensional Banach space. Similarly Theorem 6.4.45 fails if we remove the graphmeasurability of F or the hypothesis that F (ω) ⊆ RN + for all ω ∈ Ω. As a straightforward consequence of Theorem 6.4.23, we have the following result. PROPOSITION 6.4.48 If F : Ω −→ Pwkc (X) is graph-measurable and integrably bounded, then Ω F dµ ∈ Pwkc (X). Continuing with the topological properties of the set-valued integral Ω F dµ, we obtain conditions for Ω F dµ to be open. has open values, is graph-measurable, PROPOSITION 6.4.49 If F : Ω −→ 2X\{∅} and SF1 = ∅, then for every A ∈ Σ the set A F dµ is open in X. PROOF: Without any loss of generality, we may assume that A = Ω. Let f ∈ SF1 . By replacing F by ω −→ F (ω) − f (ω) if necessary, we may assume that 0 ∈ F (ω) for all ω ∈ Ω. So we need to show thatthere exists ε > 0 such that Bε ⊆ Ω F dµ. For
this purpose let ξF (ω) = d 0, F (ω)c . Let λ > 0 and consider the set Lλ = {ω ∈ Ω : ξF (ω) < λ}. If G = {(ω, x) ∈ Ω × X : x ∈ Bλ ∩ F (ω)c } = (Ω × Bλ ) ∩ GrF c ∈ Σ × B(X), then Lλ = projΩ G ∈ Σµ (see Theorem 6.2.18). This proves that the function ξF is Σµ -measurable. Because F has open values, we have ξF (ω) > 0 for all ω ∈ Ω and so we can find ε > 0 and A ∈ Σ, with µ(A) > 0 such that ξF (ω) ≥ ε for all ω ∈ A. This means that Bε ⊆ F (ω) for all ω ∈ A and so µ(A)Bε ⊆ A F dµ ⊆ Ω F dµ (because 0 ∈ F (ω) for all ω ∈ Ω). In fact under some reasonable conditions, we can establish that the interior and integral operators commute. To show this we need two auxiliary results. LEMMA 6.4.50 If V is a Banach space and U1 ⊆ U2 ⊆ X are nonempty open sets with U1 convex and dense in U2 , then U1 = U2 . PROOF: We have U2 ⊆ int U 1 and because U1 is convex U1 = int U 1 . Therefore U1 = U2 . LEMMA 6.4.51 If F : Ω −→ 2X \{∅} is graph-measurable and intF (ω) = ∅ for all ω ∈ Ω, then Gr int F ∈ Σµ × B(X). PROOF: For every ω ∈ Ω we have int F (ω) = F (ω) \ ∂F (ω) with ∂F (ω) being the c boundary of the set F (ω). Since ∂F (ω) = F (ω) ∩ F (ω) , we infer that Gr ∂F ∈ Σµ ×B(X) (see the proof of Proposition 6.4.49). Hence Gr int F = Gr F ∩(Gr F )c ∈ Σµ × B(X). Using these auxiliary results, we can prove a theorem on the commutation of the interior and integral operators.
THEOREM 6.4.52 If µ is finite, F : Ω −→ 2X \{∅} is graph-measurable, SF1 = ∅, F (ω) is convex for all ω ∈ Ω, and int F (ω) = ∅ µ-a.e., then for all A ∈ Σ we have int A F dµ = A int F dµ. PROOF: Clearly without any loss of generality, we may assume that int F (ω) = ∅ for all ω ∈ Ω. We fix A ∈ Σ with µ(A) > 0. We show that A int F dµ. To this end let f ∈ SF1 and ε > 0. We introduce the multifunction Gε : Ω −→ 2X defined by Gε (ω) = x ∈ intF (ω) : x − f (ω) <
ε . µ(A)
Due to the convexity of F (ω) we have F (ω) = int F (ω) for all ω ∈ Ω. This implies that Gε (ω) = ∅ for all ω ∈ Ω. Also due to Lemma 6.4.51 Gr Gε ∈ Σµ ×B(X). Applying Theorem 6.3.20 we can find g : Ω −→ X a Σ-measurable map such that g(ω) ∈ Gε (ω) µ-a.e. So we have ε µ-a.e. on Ω, and g(ω) − f (ω) < µ(A)
1 and (g − f )dµ < ε, ⇒ g ∈ SintF A
⇒ int F dµ is dense in F dµ. g(ω) ∈ int F (ω)
A
A
and so But from Proposition 6.4.49, we have that A int F dµ is open int F dµ ⊆ int F dµ. Invoking Lemma 6.4.50, we conclude that int F dµ = A A A int A F dµ.
Because of Theorem 6.4.44, when X is finite-dimensional, we can have the following stronger version of the above theorem. THEOREM 6.4.53 If µ is finite and nonatomic, X is-finite dimensional, F : 1 Ω −→ 2X \ {∅} is graph-measurable, SF = ∅, and int F (ω) is dense in F (ω) for all ω ∈ Ω, then for every A ∈ Σ we have A int F dµ = int A F dµ. EXAMPLE 6.4.54 In the above theorem we cannot drop the condition that int F (ω) is dense in F (ω) for all ω ∈ Ω. To see this let Ω = [0, 1], µ = λ (λ is the Lebesgue measure on R), and X = R. Let F : Ω −→ Pf (X) be defined by F (ω) = [0, 21 ] ∪ {1} for all ω ∈ Ω. Then int Ω F dµ = (0, 1), and Ω int F dµ = (0, 12 ). In the next proposition, we use the set-valued integral to determine if f ∈ L1 (Ω, X) is an integrable selector of a multifunction F . PROPOSITION 6.4.55 (a) If F : Ω −→ Pwkc and inte (X) is graph-measurable grably bounded, then f ∈ SF1 if and only if A f dµ ∈ cl A F dµ = A F dµ for all A ∈ Σ. (b) If X ∗ is separable and F : Ω −→ P f c (X) is graph-measurable and integrably bounded, then f ∈ SF1 if and only if A f dµ ∈ cl A F dµ for all A ∈ Σ.
PROOF: (a) ⇒: This implication follows from the definition of the set-valued integral (see Definition 6.4.38). ⇐: If by ·, · we denote the duality brackets for the pair (X, X ∗ ), for all (x∗ , A) ∈ X ∗ × Σ we have
f dµ = x∗ , f dµ ≤ σ x∗ , F dµ = σ(x∗ , F )dµ. x∗ , A
A
∗
∗
A
A
So for all ω ∈ Ω \ N (x ), µ N (x ) = 0, we have
x∗ , f (ω) ≤ σ x∗ , F (ω) .
Consider {x∗n }n≥1 ⊆ X ∗ m-dense. Because F is Pwkc (X)–valued σ ·, F (ω) is m-continuous. So
x∗ , f (ω) ≤ σ x∗ ; F (ω) for all ω ∈ Ω \ N, µ(N ) = 0 with N = N (x∗n ) and all x∗ ∈ X ∗ . Hence f ∈ SF1 . n≥1
(b)⇒: Again this follows from Definition 6.4.38. ⇐: This follows similarly to the corresponding implication in part (a), using this ∗ ∗ X ∗ is separable) and the fact that time
{xn}n≥1 ⊆ X strongly dense (because ∗ σ ·, F (ω) is strongly continuous on X . We conclude this section with two other approaches to set-valued integration. The first is the so-called embedding method , which concerns Pkc (X)-valued multifunctions and is based on the following embedding theorem, known as the R˚ adstr¨ om embedding theorem. THEOREM 6.4.56 There exists a separable Banach space X in which Pkc (X) can be embedded as a closed convex cone. Moreover, the embedding iR : Pkc (X) −→ X is linear isometric. ∗
REMARK 6.4.57 Because X is separable, then (B 1 , w∗ ) is compact metrizable. ∗ ∗ In what follows we consider B 1 with the weak∗ -topology. We have that C(B 1 ) is a separable Banach space. Moreover, for any C ∈ Pbf c (X) we have that C ∈ Pkc (X) if and only if σ(·, C) is sequentially w∗ -continuous. So in Theorem 6.4.56 X is the ∗ closed subspace of C(B 1 ) generated by the set {σ(·, C)}C∈Pkc (X) . Using Theorem 6.4.56 we can have the following alternative approach to setvalued integration. THEOREM 6.4.58 If F : Ω −→ Pkc (X) is graph-measurable and integrably bounded, then Ω F dµ = (B)− Ω iR (F )dµ with (B)− Ω being the X-valued Bochner integral. The second alternative approach to set-valued integration requires topological structure on the set Ω and the multifunction F . So let Ω = Y with (Y, d) a metric space, µ a finite Borel measure on Y (recall that µ is regular), and F : Y −→ Pf c (X) is lsc. We consider the set
CS_F = { f ∈ C(Y, X) : f(y) ∈ F(y) for all y ∈ Y }.
From Theorem 6.3.6 (Michael’s selection theorem), we know that CSF = ∅. We can define a set-valued integral of F using the set CSF . DEFINITION 6.4.59
c Y
F dµ =
Y
f (y)dµ : f ∈ CSF .
REMARK 6.4.60 Clearly we always have
c Y
F dµ ⊆
Y
F dµ.
In the next theorem, we obtain equality for the closures of the two sets. The proof of this result can be found in Hu–Papageorgiou [313, p. 210]. THEOREM 6.4.61 If F : Y −→ Pf c (X) is lsc and integrably bounded, then for all c A ∈ B(Y ), cl A F (y)dµ = cl A F (y)dµ. c REMARK 6.4.62 If F is Pwkc (X)-valued, then A F dµ = A F dµ for all A ∈ B(Y if F is h-lsc, integrably bounded with open convex values, then c ). Moreover, F dµ = F dµ for all A ∈ B(Y ) (see Hu–Papageorgiou [313, p. 212]). A A
6.5 Fixed Points and Carath´ eodory Multifunctions The purpose of this section is to extend some of the main fixed point theorems for single-valued maps to multifunctions and so determine some basic properties of multifunctions F (ω, x) that are measurable (in some sense; see Section 6.2) in ω ∈ Ω and continuous (in some sense; see Section 6.1) in x ∈ X. In Section 3.4 we proved metric fixed point theorems for single-valued maps. Undoubtedly the most important such result is the Banach contraction principle (see Theorem 3.4.3). The first theorem of this section generalizes the fixed point theorem of Banach to multifunctions. DEFINITION 6.5.1 Let (X, d) be a metric space and h the corresponding Hausdorff metric on Pf (X) (see Definition 6.1.29). A multifunction is an h-contraction, if
h F (x), F (u) ≤ k d(x, u) for all x, u ∈ X with 0 < k < 1. THEOREM 6.5.2 If (X, d) is a complete metric space and F : X −→ Pf (X) is an h-contraction, then F has a fixed point; that is, there exists x ∈ X such that x ∈ F (x). PROOF: We choose k < k1 < 1 and x0 ∈ X. We pick x1 ∈ F (x0 ) such that d(x0 , x1 ) > 0. If no such x1 ∈ F (x0 ) can be found, then x0 ∈ F (x0 ) and we are done. Then
d x1 , F (x1 ) ≤ h F (x0 ), F (x1 ) ≤ kd(x0 , x1 ) < k1 d(x0 , x1 ). Hence we can find x2 ∈ F (x1 ) such that d(x1 , x2 ) < k1 d(x0 , x1 ). By induction we can generate a sequence {xn }n≥1 ⊆ X such that
x_{n+1} ∈ F(x_n) and d(x_n, x_{n+1}) < k_1^n d(x_0, x_1)   for all n ≥ 1.   (6.28)
From the inequality in (6.28) and since k1 < 1, we deduce that {xn }n≥1 ⊆ X is Cauchy. Hence xn −→ x ∈ X for some x ∈ X. From the inclusion in (6.28) and if we pass to the limit as n → ∞, we obtain x ∈ F (x). REMARK 6.5.3 In contrast to Theorem 3.4.3, in this case the fixed point is not unique. Indeed, if F is the trivial constant multifunction F (x) = X for all x ∈ X, then every point in X is a fixed point of F . If Fix(F ) = {x ∈ X : x ∈ F (x)} (the set of fixed points of F ), then it is easily seen that Fix(F ) is closed in X. The next proposition proves a remarkable stability property of the set Fix(F ) with respect to the multifunction F . PROPOSITION 6.5.4 If (X, d) is a complete metric space, F1 , F2 : X −→ Pbf (X) are h-contractions
with the same constant 0 < k < 1, then h Fix(F1 ), Fix(F2 ) ≤ 1/(1 − k) sup h F1 (x), F2 (x) . x∈X
PROOF: Let ε > 0 and also choose ξ > 0 such that ξ nkn < 1. We set ε1 = n≥1
ξε 1/(1 − k) , pick x0 ∈ Fix(F1 ), and then we choose x1 ∈ F2 (x0 ) such that
(6.29) d(x0 , x1 ) ≤ h F1 (x0 ), F2 (x0 ) + ε. We can find x2 ∈ F2 (x1 ) such that d(x2 , x1 ) ≤ k d(x0 , x1 ) + kε1 and then inductively we obtain {xn }n≥1 ⊆ X such that xn+1 ∈ F2 (xn ) ⇒
n≥m
and
d(xn+1 , xn ) ≤ kn d(x0 , x1 ) + nkn ε1
for all n ≥ 1, (6.30)
km d(xn+1 , xn ) ≤ nkn , d(x0 , x1 ) + ε1 1−k n≥m
⇒ {xn }n≥1 ⊆ X is a Cauchy sequence and so xn −→ x ∈ X. Then from (6.30) in the limit as n → ∞ we obtain x ∈ Fix(F2 ). Moreover, d(x0 , x) ≤
1 nkn d(x0 , x1 ) + ε1 1−k n≥m 1
≤ h F1 (x0 ), F2 (x0 ) + 2ε 1−k
d(xn , xn+1 ) ≤
n≥0
(see (6.29)).
Reversing the roles of F1 and F2 in the above argument, we obtain
h Fix(F1 ), Fix(F2 ) ≤
1 sup h F1 (x), F2 (x) . 1 − k x∈X
COROLLARY 6.5.5 If (X, d) is a complete metric space, Fn , F : X −→ Pbf (X) for n ≥ 1 are h-contractions all with the
same constant 0 < k < 1, and sup h Fn (x), F (x) −→ 0 as n → ∞, then h Fix(Fn ), Fix(F ) −→ 0 as n → ∞. x∈X
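The iteration used in the proof of Theorem 6.5.2 is easy to experiment with. The following sketch is not from the text; the interval-valued map F(x) = [x/3, x/3 + 1] and the starting points are illustrative. This F is an h-contraction with constant 1/3 and Fix(F) = [0, 3/2], and choosing x_{n+1} as the point of F(x_n) nearest to x_n produces a convergent sequence whose limit depends on the starting point, in line with Remark 6.5.3.

```python
# Sketch (illustrative, not from the text): the iteration behind Theorem 6.5.2
# for the interval-valued map F(x) = [x/3, x/3 + 1] on R, an h-contraction with
# constant k = 1/3 whose fixed point set is the whole interval [0, 3/2].

def F(x):
    return (x / 3.0, x / 3.0 + 1.0)          # the closed interval [lo, hi]

def step(x):
    lo, hi = F(x)
    return min(max(x, lo), hi)               # the point of F(x) nearest to x

for x0 in (10.0, -5.0, 0.7):
    x = x0
    for _ in range(60):
        x = step(x)
    lo, hi = F(x)
    print(x0, "->", round(x, 6), "fixed point:", lo <= x <= hi)
# Different starting points converge to different fixed points (cf. Remark 6.5.3).
```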
There is also a multivalued analogue of Theorem 3.4.23. DEFINITION 6.5.6 Let (X, d) be a metric space and h the corresponding Hausdorff
metric on Pf (X). A multifunction F : X −→ Pf (X) is said to be nonexpansive, if h F (x), F (y) ≤ d(x, y) for all x, y ∈ X. The multivalued generalization of Theorem 3.4.23 is due to Lim [383]. THEOREM 6.5.7 If X is a uniformly convex Banach space, C ∈ Pbf c (X), and F : C −→ Pf (C) is nonexpansive, then F has a fixed point. Now we turn our attention to topological fixed point theorems for multifunctions. DEFINITION 6.5.8 Let X be a vector space. (a) A subset C ⊆ X is said to be finitely closed, if C ∩ Y is closed for every finitedimensional affine subspace Y in X, when on Y we consider its Euclidean topology (recall that Y is an affine subspace of X, if Y = y0 + Y0 for some y0 ∈ X and some finite-dimensional subspace of X). (b) A family {Ci }i∈I of nonempty sets in X is said to have the finite intersection property, if the intersection of every finite subfamily is nonempty. (c) Let C ⊆ X be nonempty. A multifunction F : C −→ 2X is a Knaster– Kuratowski–Mazurkiewicz multifunction (a KKM-multifunction for short), if for every finite set {xk }m k=1 ⊆ C, we have conv{xk }m k=1 ⊆
m !
F (xk ).
k=1
(see also Definition 2.3.6). The next theorem gives the basic property of KKM-multifunctions. It extends Theorem 2.3.7. THEOREM 6.5.9 If X is a vector space, C ⊆ X is nonempty, and F : C −→ 2X is a KKM-multifunction with values that are finitely closed, then the family {F (x)}x∈C has the finite intersection property (see Definition 6.5.8(b)). PROOF: We argue by contradiction. So suppose that
n
F (xk ) = ∅. Let Y =
k=1
span{xk }n k=1 ,
let d be the Euclidean metric on Y and D = conv{xk }n k=1 ⊆ Y . We n
know that Y ∩ F (xk ) is closed for all k ∈ {1, . . . , n}. We have Y ∩ F (xk ) = ∅ k=1
n
and so the function ξ(x) = d x, Y ∩ F (xk ) satisfies ξ(x) > 0 for all x ∈ C. Then
ϑ : D −→ D defined by
k=1
ϑ(x) = (1/ξ(x)) Σ_{k=1}^{n} d(x, Y ∩ F(x_k)) x_k
is continuous and so by Theorem 3.5.3 (Brouwer’s fixed point theorem), we can find x0 ∈ D such that ϑ(x0 ) = x0 . Let J = k ∈ {1, . . . , n} : d x0 , Y ∩ F (xk ) = 0 . Then x0 ∈ / F (xk ). But k∈J
ϑ(x0 ) = x0 ∈ conv{xk }k∈J ⊆
!
F (x)
k∈J
(because F is a KKM-multifunction), a contradiction.
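The KKM property of Definition 6.5.8(c) and the nonempty intersection guaranteed by Theorems 6.5.9 and 6.5.10 can be tested on a canonical finite-dimensional example. The sketch below is illustrative and not part of the text; it uses the family F(e_k) = {x ∈ Δ : x_k ≥ max_j x_j} on the 2-simplex Δ, verifies the covering condition on random points of the sub-simplices, and exhibits the barycenter as a common point of the F(e_k).

```python
import itertools
import numpy as np

# Sketch (illustrative, not from the text): the canonical KKM family on the
# 2-simplex Δ = conv{e1, e2, e3}:  F(e_k) = {x in Δ : x_k >= max_j x_j}.
rng = np.random.default_rng(0)

def in_F(k, x):
    return x[k] >= x.max() - 1e-12

# KKM property: conv{e_k : k in J} is covered by the union of F(e_k), k in J.
for J in (s for r in (1, 2, 3) for s in itertools.combinations(range(3), r)):
    for _ in range(2000):
        w = rng.random(len(J))
        w /= w.sum()
        x = np.zeros(3)
        x[list(J)] = w                         # a random point of conv{e_k : k in J}
        assert any(in_F(k, x) for k in J)

# The sets F(e_k) are closed in the compact simplex, and their intersection
# contains the barycenter, in accordance with Theorem 6.5.10.
bary = np.full(3, 1.0 / 3.0)
print(all(in_F(k, bary) for k in range(3)))    # True
```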
An immediate consequence of the previous theorem is the following result (see also Corollary 2.3.9). THEOREM 6.5.10 If X is a Hausdorff topological vector space, C ⊆ X is nonempty, F : C −→ 2X is a KKM-multifunction with closed values, and for at least one x0 ∈ C, F (x0 ) is compact, then F (x) = ∅. x∈C
The same conclusion can be reached in another way, which avoids any topological structure and any topological hypotheses and instead uses an auxiliary multifunction. THEOREM 6.5.11 If X is a vector space, C ⊆ X is nonempty, F : C −→ 2X is a KKM-multifunction, G : C −→ 2X is another multifunction such that # # G(x) ⊆ F (x) for all x ∈ C and G(x)= F (x), x∈C
x∈C
and for some topology on X, G has compact values, then
F (x) = ∅.
x∈C
PROPOSITION 6.5.12 If X is a vector space, C ⊆ X is nonempty, and F : C −→ 2X is a multifunction such that x∈ / conv F (x)
for all x ∈ C,
then x −→ G(x) = X \F −1 (x) is a KKM-multifunction. PROOF: Let {xk }n k=1 ⊆ X. We have X\
n ! k=1
G(xk ) =
n #
F −1 (xk ).
k=1
Therefore we have y∈ /
n ! k=1
and this is equivalent to
G(xk )
if and only if y ∈
n # k=1
F −1 (xk )
y ∈ C
xk ∈ F (y)
and
Next let y ∈ then from (6.31) we have
conv{xk }n k=1 .
for all k ∈ {1, . . . , n}. (6.31) n We claim that y ∈ k=1 G(xk ). If this is not true,
conv {xk }n k=1 ⊆ conv F (y) ⇒ y ∈ conv F (y), a contradiction. Therefore, we have conv{xk }n k=1 ⊆
n !
G(xk ),
k=1
⇒ G is a KKM-multifunction. This proposition can be used to produce maximal elements with respect to certain irreflexive binary relations. The result is of importance in mathematical economics where the binary relation represents preference among pairs of goods. THEOREM 6.5.13 If X is a Hausdorff topological vector space, C ⊆ X is a nonempty, compact, convex set, ≺ is an irreflexive binary relation on C, and (i) For every x ∈ C, x ∈ / conv{y ∈ C : x ≺ y}. (ii) For every x ∈ C, the lower section {y ∈ C : y ≺ x} is open in C, then the set of ≺-maximal elements in C, namely the set {x ∈ C : x ⊀ y for all y ∈ C} is nonempty and compact. PROOF: We consider the multifunction x −→ F (x) = {y ∈ C : x ≺ y}
for all x ∈ C.
By virtue of hypothesis (i) and Proposition 6.5.12, we have that x −→ G(x) = X \F −1 (x) is a KKM-multifunction. Hence H : C −→ 2C defined by x −→ H(x) = C ∩ G(x) = C \F −1 (x)
is a KKM-multifunction too.
Moreover, because of hypothesis (ii) the set F −1 (x) is open and so H is compactvalued. So Theorem 6.5.10 implies that # #
H(x) = C \ F −1 (x) = ∅ x∈C
x∈C
and of course it is compact. But this is the set of ≺-maximal elements.
This theorem leads to a fundamental existence theorem for variational inequalities.
THEOREM 6.5.14 If X is a locally convex space, C ⊆ X is nonempty, compact, and convex, and u : C −→ X ∗ is a map such that (x, y) −→ u(x), y is jointly continuous on C×C (by ·, · we denote the duality brackets for the dual pair (X, X ∗ )), then we can find x ∈ C such that u(x), y − x ≥ 0 for all y ∈ C. PROOF: On C we define the irreflexive binary relation ≺ by x≺y
if and only if u(x), x − y > 0.
Note that for every x ∈ C, we have x∈ / conv{y ∈ C : x ≺ y} = {y ∈ C : x ≺ y} = y ∈ C : u(x), x − y > 0 . Also note that because of the hypothesis on u, the set {y ∈ C : y ≺ x} = y ∈ C : u(x), x − y < 0 is open. So we can apply Theorem 6.5.13 and obtain a ≺-maximal element x ∈ C. Hence u(x), x − y ≥ 0 for all y ∈ C. Next we use the results on KKM-multifunctions to deduce some basic topological fixed point theorems for set-valued maps. DEFINITION 6.5.15 Let X be a vector space and C ⊆ X a nonempty set. (a) The inward set of x ∈ C with respect to C is defined by IC (x) = {x + λ(y − x) : λ ≥ 0 and y ∈ C}. (b) A multifunction F : C −→ 2X \ {∅} is said to be weakly inward if F (x) ∩ IC (x) = ∅
for all x ∈ C.
REMARK 6.5.16 If F (C) ⊆ C, then F (x) ⊆ IC (x) for all x ∈ C. Indeed, if y ∈ F (x), then for λ = 1, y = x + λ(y − x) = x + (y − x) = y. Multifunctions satisfying F (x) ⊆ IC (x) are often called strongly inward . For single-valued maps the geometric notion of weak inwardness hasa metric equivalent,namely f : C −→ X is
weakly inward if and only if lim (1/λ)d x + λ f (x) − x , C = 0 for all x ∈ C. λ→0+
THEOREM 6.5.17 If X is a locally convex space, C ⊆ X is nonempty, compact, and convex, and F : C −→ Pf c (X) is usc and weakly inward, then F admits a fixed point; that is, there exists x ∈ C such that x ∈ F (x). PROOF: We argue indirectly. So suppose that F has no fixed point. Then for any given x ∈ C, we have 0 ∈ / x − F (x). Invoking the strong separation theorem for convex sets, we can find x∗ ∈ X ∗ \{0} such that x∗ , x − y < 0 for all y ∈ F (x),
∗ ⇒ σ x , x − F (x) < 0.
We set
U (x∗ ) = x ∈ C : σ x∗ , −F (x) < 0
for all x∗ ∈ X ∗ \ {0}.
Because F is usc, we have that U (x∗ ) is open (see Proposition 6.1.15(c)) and {U (x∗ )}x∗ ∈X ∗ \{0} is an open cover of C. Inasmuch as C is compact, we can find {U (x∗k )}m k=1 a finite subcover and a corresponding continuous partition of unity {pk }m . k=1 We define m u(x) = pk (x)x∗k . k=1
Clearly the function (x, y) −→ u(x), y is continuous on C × C (as before by ·, · we denote the duality brackets for the dual pair (X, X ∗ )). So we can apply Theorem 6.5.17 and obtain x0 ∈ C such that u(x0 ), y − x0 ≥ 0
for all y ∈ C.
By hypothesis we can find y ∈ F (x0 ) ∩ IC (x0 ). Then y = x0 + lim λn (yn − x0 ), n→∞
with λn ≥ 0 and yn ∈ C,
⇒ y − x0 = lim λn (yn − x0 ), n→∞
⇒ u(x0 ), y − x0 = lim λn u(x0 ), y − x0 ≥ 0. n→∞
But this contradicts the fact that
σ u(x), x − F (x) < 0
for all x ∈ C.
REMARK 6.5.18 It is easy to check that in the above theorem Fix(F ) = {x ∈ C : x ∈ F (x)} is compact. If we combine Theorem 6.5.17 with Remarks 6.5.16 and 6.5.18 and if we recall that a locally compact multifunction is usc if and only if it is closed (i.e., its graph is closed; see Proposition 6.1.10), we can have the following multivalued generalization of the Schauder–Tychonov fixed point theorem (see Theorem 3.5.28). The result is known as the Kakutani–Ky Fan fixed point theorem. THEOREM 6.5.19 If X is locally convex space, C ⊆ X is a nonempty, compact, and convex set, and F : C −→ Pf c (C) is closed, then the set Fix(F ) is nonempty and compact. A related topological fixed point theorem for multifunctions is the following. THEOREM 6.5.20 If X is a locally convex space, C ⊆ X is a nonempty, compact, and convex set, and F : C −→ 2C \{∅} is a multifunction with convex values such that for each y ∈ C the set F + ({y}) = {x ∈ C : y ∈ F (x)} is open, then F admits a fixed point.
PROOF: The family F + ({y}) y∈C is an open cover of C. Due to the compactness m of C, we can find a finite subcover F + ({yk }) k=1 and a corresponding continuous partition of unity {pk }m k=1 . We set m u(x) = pk (x)yk for all x ∈ C. k=1
Clearly u : C −→ C is continuous and u(x) ∈ F (x) for all x ∈ C (recall that F is convex-valued). Apply Theorem 3.5.28 to obtain x ∈ C such that u(x) = x ∈ F (x). Next let X, Y be Banach spaces and C ⊆ X, D ⊆ Y are nonempty, closed, convex sets. In what follows by (D, w) we denote D furnished with the relative weak topology of Y . We consider multifunctions G : C −→ 2C \{∅} that admit the following decomposition G = K ◦ N, (6.32) where N : C −→ 2D \ {∅} is usc from C with the relative strong (norm) topology into (D, w) and has weakly compact and convex values and K : (D, w) −→ C is sequentially continuous, namely w
yn −→ y
in D ⇒ K(yn ) −→ K(y)
in C.
Also we assume that G is compact; that is, it maps bounded sets in C into relatively compact sets in C. We emphasize that with the above hypotheses G need not have convex values. We have the following multivalued versions of the nonlinear alternative theorem (see Theorem 3.5.16) and of the Leray–Schauder alternative principle (see Corollary 3.5.18). Both theorems can be proved using degree-theoretic arguments. Details can be found in Bader [49]. THEOREM 6.5.21 If X is a Banach space and G : B R −→ 2X \{∅} is a compact multifunction that admits the decomposition (6.32), then at least one of the following statements holds. (a) There exists x0 ∈ ∂BR and λ ∈ (0, 1) such that x0 ∈ λG(x0 ). (b) Fix(G) = ∅. THEOREM 6.5.22 If X is a Banach space, G : C −→ 2C \ {∅} is a compact multifunction that admits the decomposition (6.32), and 0 ∈ C, then at least one of the following statements holds. (a) G has a fixed point. (b) The set S = {x ∈ C : x ∈ λG(x) for some 0 < λ < 1} is bounded. Let us conclude our discussion of the fixed point theory of multifunctions, with two results on the topological structure of the set Fix(F ) for h-contractions. First we have a definition. DEFINITION 6.5.23 (a) Let A be a subset of the metric space X. Then A is called a retract if there exists a continuous map (retraction) r : X −→ A such that rA = idA .
512
6 Multivalued Analysis
(b) A metric space E is said to be an absolute retract if for every metric space X and every nonempty closed set A ⊆ X, each continuous map f : A −→ E admits a continuous extension on all of X. REMARK 6.5.24 By virtue of Theorem 3.1.10 (Dugundji’s extension theorem), every closed convex set of a normed space is a retract. Every homeomorphic image of an absolute retract is also an absolute retract. Moreover, if E is an absolute retract and A is a retract of E, then A is also an absolute retract. The first structural theorem for the set Fix(F ) is due to Ricceri [514]. THEOREM 6.5.25 If X is a Banach space, C ⊆ X is nonempty, closed, and convex and F : C −→ Pf c (X) is an h-contraction, then Fix(F ) is an absolute retract. The second structural theorem, is due to Bressan–Cellina–Fryszkowski [97]. THEOREM 6.5.26 If (Ω, Σ, µ) is a finite nonatomic measure space, X is a separable Banach space, Y = L1 (Ω, X), and F : Y −→ 2Y \{∅} is an h-contraction with bounded, closed, and decomposable values, then Fix(F ) is an absolute retract. We conclude this section with a few facts about Caratheodory multifunctions. First suppose that T is a locally compact, σ-compact metric space equipped with a (regular) measure µ of bounded variation and defined on the σ-field S of a µ-measurable set (i.e., S = B(T )µ ). Also X is a Polish space and Y is a separable metric space. Recall that for Pk (Y )–valued multifunctions continuity and h-continuity coincide (see Corollary 6.1.40) and that Pk (Y ), h is a separable metric space (see Proposition 6.1.32(b)). Therefore invoking Theorem 6.2.9, we obtain the following result. PROPOSITION 6.5.27 If F : T ×X −→ Pk (Y ) is a multifunction such that (i) For every x ∈ X, t −→ F (t, x) is measurable and (ii) For every t ∈ T , x −→ F (t, x) is continuous,
then for every ε > 0 we can find Tε ⊆ T compact with µ(T \ Tε ) < ε such that F T ×X ε is continuous. Another result in the same direction is the following. PROPOSITION 6.5.28 If F : T × X −→ 2Y \{∅} is a multifunction such that (i) (t, x) −→ F (t, x) is measurable and (ii) For every t ∈ T , x −→ F (t, x) is lsc,
then for every ε > 0 we can find Tε ⊆ T compact with µ(T \ Tε ) < ε such that F T ×X ε is lsc.
PROOF: Fix y ∈ Y and consider the function (t, x)−→ϕy (t, x) = d y, F (t, x) . Evidently ϕy is L×B(X)-measurable (see Proposition 6.2.4) and for every t ∈ T ϕy (t, ·) is upper semicontinuous (see Proposition 6.1.15(a)). We can find a sequence {ϕn }n≥1 of Carath´eodory functions on T × X into Y such that ϕn (t, x) ↓ ϕy (t, x) for all
6.5 Fixed Points and Carath´eodory Multifunctions
513
(t, x) ∈ T × X. Invoking Theorem 6.2.9 given ε > 0, we can find Tε ⊆ T com pact with µ(T \ Tε ) < ε such that ϕn T ×X is continuous. Hence ϕy T ×X is lower ε ε semicontinuous. A new application of Proposition 6.1.15(a) implies that F T ×X is ε lsc. REMARK 6.5.29 The result fails if for all t ∈ T, F (t, ·) is usc. Next we present two parametrized versions of Michael’s selection (see Theorem 6.3.6). PROPOSITION 6.5.30 If Y is a separable reflexive Banach space and F : T × X −→ Pf c (Y ) a multifunction such that (i) (t, x) −→ F (t, x) is measurable and (ii) For every t ∈ T , x −→ F (t, x) is lsc, eodory functions such then there exists a sequence fm : T ×X −→ Y, m ≥ 1, of Carath´ that for all (t, x) ∈ T × X we have F (t, x) = {fm (t, x)}m≥1 . PROOF: By virtue of Proposition 6.5.28, for every n ≥ 1 we can find Tn ⊆ T compact such that µ(T \ Tn ) < 1/n and F T ×X is lsc. We apply Theorem 6.3.11 n to obtain continuous functions fn,m : Tn × X −→ Y, n, m ≥ 1, such that F (t, x) = {fnm (t, x)}m≥1 for all (t, x) ∈ Tn × X. Let fm : T ×X −→ Y, m ≥ 1, be defined by ⎧ ⎨fnm (t, x) if t ∈ Tn and t ∈ / Tk for k < n . fm (t, x) = T 0 if t ∈ T \ n ⎩ n≥1
Clearly for every m ≥ 1, fm is a Carath´eodory function and for all (t, x) ∈ T ×X, we have F (t, x) = {fm (t, x)}m≥1 . If (T, S, µ) is replaced by (Ω, Σ) a measurable space (no topological structure on Ω), then the situation is more involved. Nevertheless by adapting the proof of Michael’s selection theorem (see Theorem 6.3.6) to the present parametric situation, we can have the following theorem. For details we refer to Hu–Papageorgiou [313, p. 235]). THEOREM 6.5.31 If (Ω, Σ) is a complete measurable space, X is a Polish space, Y is a separable Banach space, and F : Ω × X −→ Pf c (Y ) is a multifunction such that (i) (ω, x) −→ F (ω, x) is measurable and (ii) For all ω ∈ Ω, x −→ F (ω, x) is lsc, eodory functions such then we can find a sequence fn : Ω× X −→ Y, n ≥ 1, of Carath´ that F (ω, x) = {fn (ω, x)}n≥1 for all (ω, x) ∈ Ω × X. REMARK 6.5.32 If F (ω, x) is measurable in ω ∈ Ω and usc or lsc in x ∈ X, it is not in general jointly measurable.
6.6 Convergence of Sets
We start this section with the definitions of the different modes of set convergence which we investigate.
DEFINITION 6.6.1 Let (X, d) be a metric space and {A_n, A}_{n≥1} ⊆ P_f(X). We say that the A_n's converge to A in the Hausdorff sense, denoted by A_n −→^h A or by h-lim_{n→∞} A_n = A, if and only if h(A_n, A) −→ 0 as n → ∞ (recall that h is the Hausdorff metric on P_f(X); see Definition 6.1.29).
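For finite sets the Hausdorff distance of Definition 6.1.29 is directly computable, which makes Definition 6.6.1 concrete. The sketch below is not from the text; the sets and the translation used are illustrative.

```python
import numpy as np

# Sketch (illustrative, not from the text): the Hausdorff distance between two
# finite subsets of R^2,
#   h(A, B) = max( sup_{a in A} d(a, B), sup_{b in B} d(b, A) ).
def hausdorff(A, B):
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)   # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
for n in (1, 2, 4, 8, 16):
    A_n = A + 1.0 / n                     # translate A by (1/n, 1/n)
    print(n, hausdorff(A_n, A))           # tends to 0 like 1/n, so A_n -> A in the Hausdorff sense
```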
REMARK 6.6.2 From Proposition 6.1.31(b), we see that An −→ A if and only if d(·, An ) −→ d(·, A) uniformly on X. Also if X is a normed space and {An , A}n≥1 ⊆ ∗
Pbf c (X), then An −→ A if and only if σ(·; An ) −→ σ(·, A) uniformly on B 1 = {x∗ ∈ X ∗ : x∗ ≤ 1} (see Proposition 6.1.31(c)). h
DEFINITION 6.6.3 Let (X, τ ) be a Hausdorff topological space (τ denotes the Hausdorff topology) and let {An }n≥1 ⊆ 2X \{∅}. We define τ- lim inf An = {x ∈ X : x = τ- lim xn , xn ∈ An , n ≥ 1} n→∞
n→∞
and
τ- lim sup An = {x ∈ X : x = τ- lim xnk , xnk ∈ Ank , nk < nk+1 , k ≥ 1}. n→∞
n→∞
The set τ-lim inf An is the τ -Kuratowski limit inferior of the sequence {An }n≥1 n→∞
and τ-lim sup An is the τ -Kuratowski limit superior of the sequence {An }n≥1 . If n→∞
A = τ-lim inf An = τ-lim sup An , then A is the τ -Kuratowski limit of the sequence n→∞
n→∞ K
τ A. {An }n≥1 and we write An −→
REMARK 6.6.4 If (X, d) is a metric space, then d- lim inf An = {x ∈ X : x = lim d(x, An ) = 0} n→∞
n→∞
and
d- lim sup An = {x ∈ X : x = lim inf d(x, An ) = 0}. n→∞
n→∞
Note that we always have τ -lim inf An ⊆ τ -lim sup An and if the space X is first n→∞ n→∞ An and so it is a closed set. If the topology countable, then τ -lim sup An = n→∞
k≥1 n≥k
τ is clearly understood, then we drop the letter τ . The Kuratowki mode of set-convergence turns out to be suitable for locally compact spaces. In order to deal with sets defined in an infinite-dimensional Banach space (which is not locally compact), we need a new mode of set-convergence, which involves all the useful topologies defined on the Banach space. DEFINITION 6.6.5 Let X be a Banach space. By w- (resp., s-) we denote the weak (resp. strong) topology on X. Let {An }n≥1 ⊆ 2X \ {∅}. We say that the An s converge to A in the Mosco sense if and only if w-lim inf An = s-lim inf An = A. n→∞
n→∞
6.6 Convergence of Sets
515
REMARK 6.6.6 Note that we always have s-lim inf An ⊆ w-lim inf An and sn→∞
n→∞
lim sup An ⊆ w-lim sup An . So comparing Definitions 6.6.3 and 6.6.5, we see that n→∞ M
n→∞
K
K
s w A and An −→ A. An −→ A if and only if An −→
DEFINITION 6.6.7 Let (X, d) be a metric space and {An , A} ⊆ Pf (X). We say W
that the An s converge to A in the Wijsman sense, denoted by An −→ A if and only if for all x ∈ X we have d(x, An ) −→ d(x, A). h
REMARK 6.6.8 From Remark 6.6.2 we already know that An −→ A implies W An −→ A. DEFINITION 6.6.9 Let X be a Banach space and {An , A} ⊆ Pf c (X). We say w that the An s converge to A weakly (or scalarly), denoted by An −→ A, if and only ∗ ∗ ∗ ∗ if for all x ∈ X we have σ(x , An ) −→ σ(x , A). REMARK 6.6.10 From Remark 6.6.2, we know that if {An }n≥1 ⊆ Pbf c (X), then h
w
An −→ A implies An −→ A. It is natural to ask what is the relation between these modes of set-convergence. Already in Remarks 6.6.8 and 6.6.10, we saw some first relations. In fact comparing the definitions and using Remark 6.6.6, we can have the following result which provides a first detailed general comparison between these notions. More specialized results are given later in the section. PROPOSITION 6.6.11 If X is a Banach space and {An }n≥1 ⊆ 2X \ {∅}, then (a) (b) (c) (d)
h
K
W
s An −→ A implies An −→ A and An −→ A. h w If {An , A}n≥1 ⊆ Pbf c (X), then An −→ A implies An −→ A. Ks Kw M A and An −→ A. An −→ A if and only if An −→ Ks W An −→ A implies An −→ A.
Without additional hypotheses we cannot say more. EXAMPLE 6.6.12 (a) In general M -convergence does not imply h-convergence: Let X = l2 and let {en }n≥1 be the standard orthonormal basis of the Hilbert space. M
We define An = {λen : 0 ≤ λ ≤ 1} and A = {0}. Evidently An −→ A but we do not have convergence in the Hausdorff metric because h(An , A) = 1 for all n ≥ 1. (b) In general h-convergence does not imply M -convergence: Let X be a reflexive Banach space and let An = A = ∂B1 = {x ∈ X : x = 1} for all k. The trivially h
An −→ A. Recall that ∂B1 convergence.
w
= {x ∈ X : x ≤ 1}. So we cannot have Mosco
(c) In general K-convergence does not imply W -convergence: Let X = l2 and let {en }n≥1 be the standard orthonormal basis of this Hilbert space. We define An = K
{x ∈ X : x = λe1 + (1 − λ)en ,√0 ≤ λ ≤ 1}, n ≥ 1 and A = {e1 }. Evidently An −→ A but d(0, An ) = 12 e1 + en = 2/2 and so we cannot have W -convergence.
516
6 Multivalued Analysis
The next proposition explains the situation in Example 6.6.12(b) and also justifies the use of monotone techniques in variational analysis. PROPOSITION 6.6.13 If (X, d) is a metric space and {An }n≥1 ⊆ 2X \ {∅}, then (a) The sets lim inf An , lim sup An are both closed (possibly empty). n→∞ n→∞ An . (b) If {An }n≥1 is increasing, then lim inf An = lim sup An = n→∞ n→∞ n≥1 (c) If {An }n≥1 is decreasing, then lim inf An = lim sup An = An . n→∞
PROOF: (a): Note that lim sup An = n→∞
n≥1
n→∞
An and so lim sup An is closed. Also
n≥1 n≥1
n→∞
let {xk }k≥1 ⊆ lim inf An and assume that xk −→ x as k −→ ∞. We have d(x, An ) ≤ n→∞
x − xk + d(xk , An ). Hence lim sup d(x, An ) ≤ x − xk . Because this is true for n→∞
all k ≥ 1, we conclude that d(x, An ) −→ 0 as n → ∞, hence x ∈ lim inf An (see n→∞
Remark 6.6.4). Therefore lim inf An is closed. n→∞
(a) and (c): Follow at once from Remark 6.6.4 and part (a). REMARK 6.6.14 Note that in a metric space X,
An ⊆ lim inf An .
k≥1 n≥k
n→∞
h
PROPOSITION 6.6.15 If X is a Banach space, {An }n≥1 ⊆ Pf c (X), and An −→ M
A, then An −→ A. K
s PROOF: From Proposition 6.6.11(a) we know that An −→ A. So we need to show that w-lim sup An ⊆ A. To this end let x ∈ w-lim sup An ⊆ A. Then by
n→∞
n→∞
virtue of Definition 6.6.3 we can find a subsequence {nk } on {n} and xnk ∈ Ank w such that xnk −→ x. Note that A ∈ Pf c (X) and so d(·, A) is convex on X. So it is w-lower semicontinuous and we have d(x, A) ≤ lim sup d(xnk , A). On the other k→∞
hand d(xnk , A) ≤ h(Ank , A) −→ 0 as k −→ ∞. Therefore d(x, A) = 0 and so x ∈ A (because A ∈ Pf c (X)). Hence we infer that w-lim sup An ⊆ A and we conclude that n→∞
M
An −→ A.
K
PROPOSITION 6.6.16 If (X, d) is a metric space, {An }n≥1 ⊆Pf (X), An −→ A, h
and there exists a compact set C ⊆ X such that An ⊆ C for all n ≥ 1, then An −→ A as n → ∞. PROOF: First note that A ∈ Pk (X). Let xn ∈ A be such that d(xn , An ) = sup d(x, An ) : x ∈ A] = h∗ (A, An ). Since {xn }n≥1 ⊆ C, we can find a subseK
quence {xnk }k≥1 of {xn }n≥1 such that xnk −→ x ∈ A. Also since An −→ A, we can find an ∈ An , n ≥ 1, such that an −→ x in X. Then we have h∗ (A, Ank ) = d(xnk , Ank ) ≤ d(xnk , ank ) −→ 0 as k −→ ∞.
6.6 Convergence of Sets
517
Next let un ∈ An , n ≥ 1, such that d(un , A) = sup[d(x, a) : x ∈ An ] = h∗ (An , A). As before we can find a subsequence {unk }k≥1 such that unk −→ u. Because K
An −→ A, we have u ∈ A. Then h∗ (Ank , A) = d(unk , A) ≤ d(unk , u) −→ 0
as k −→ ∞.
From the above arguments, we have h∗ (A, An )
and
h∗ (An , A) −→ 0
as n → ∞,
h
⇒ An −→ A. Now suppose that X is a Banach space and {An }n≥1 ⊆ 2X \{∅}. From Definition 6.6.3 it is clear that for every x∗ ∈ X ∗ , we have σ(x∗ , w- lim sup An ) ≤ lim sup σ(x∗ , An ). n→∞
(6.33)
n→∞
In the next proposition, we present a situation where equality holds in (6.33). PROPOSITION 6.6.17 If X is a Banach space, {An }n≥1 ⊆ 2X \{∅}, and for all n ≥ 1, An ⊆ W ∈ Pwk (X), then w-lim sup An = ∅ and for every x∗ ∈ X ∗ we have n→∞
∗
lim sup σ(x , An ) = σ(x∗ , w- lim sup An ). n→∞
n→∞
PROOF: Because An ⊆ W ∈ Pwk (X), n ≥ 1, from the Eberlein–Smulian theorem, we conclude that w-lim sup An = ∅. Let x∗ ∈ X ∗ and choose an ∈ An such that n→∞
σ(x∗ , An ) −
1 ≤ x∗ , an . n
(6.34)
By passing to a subsequence if necessary, we may assume that an −→ a ∈ wlim sup An . Then from (6.34), we have n→∞
lim sup σ(x∗ , An ) ≤ x∗ , a ≤ σ(x∗ , w- lim sup An ). n→∞
(6.35)
n→∞
Comparing (6.33) and (6.35), we conclude that σ(x∗ , w- lim sup An ) = lim sup σ(x∗ , An ). n→∞
n→∞
A related result is the following. PROPOSITION 6.6.18 If X is a Banach space, {An , A}n≥1 ⊆ 2X \ {∅} and for every x∗ ∈ X ∗ we have lim sup σ(x∗ , An ) ≤ σ(x∗ , A), n→∞
then w-lim sup An ⊆ conv A. n→∞
518
6 Multivalued Analysis
PROOF: Let a ∈ w-lim sup A. We can find a subsequence {nk } of {n} and ank ∈ n→∞ w
Ank , k ≥ 1, such that ank −→ a in X. For every x∗ ∈ X ∗ we have x∗ , ank −→ x∗ , a and so x∗ , a ≤ lim sup σ(x∗ , Ank ) ≤ σ(x∗ , A). Hence a ∈ conv A and we conclude k→∞
that w-lim sup An ⊆ conv A. n→∞
PROPOSITION 6.6.19 If X is a Banach space, {An }n≥1 ⊆ Pf c (X), for all n ≥ 1 K
w
w A, then An −→ A. we have An ∈ W ⊆ Pwk (X) and An −→
PROOF: Evidently A ∈ Pf c (X). We fix x∗ ∈ X ∗ and then choose an ∈ An such that x∗ , an = σ(x∗ , An ), n ≥ 1.
(6.36) w
By passing to a suitable subsequence if necessary, we may assume that an −→ Kw A). Hence a ∈ A (because An −→ x∗ , a = lim σ(x∗ , An ) ≤ σ(x∗ , A)
(see (6.36))
⇒ lim sup σ(x∗ , An ) ≤ σ(x∗ , A).
(6.37)
n→∞
From (6.33) and (6.37) and since A = w- lim sup An , we conclude that n→∞
σ(x∗ , An ) −→ σ(x∗ , A)
for all x∗ ∈ X ∗ ,
w
⇒ An −→ A. REMARK 6.6.20 The previous proposition fails if we do not have the uniform boundedness of the sequence {An }n≥1 by W ∈ Pwk (X). Moreover, if in Proposition K
w
w A if and only if An −→ A. For details we refer 6.6.19 X is separable, then An −→ to Hu–Papageorgiou [313, p. 676].
PROPOSITION 6.6.21 If X is a Banach space, {An }n≥1 ⊆ 2X \ {∅}, then for every x ∈ X we have lim sup d(x, An ) ≤ d(x, s- lim inf An ). n→∞
n→∞
PROOF: If s- lim inf An = ∅, then d(·, s- lim inf An ) ≡ +∞ and so the proposition n→∞
n→∞
is trivially true. Hence we may assume that s- lim inf An = ∅. Let a ∈ s- lim inf An . n→∞
n→∞
We can find an ∈ An , n ≥ 1, such that an −→ a in X. Then d(x, An ) ≤ x − an and so lim sup d(x, An ) ≤ x − a. Because a ∈ s- lim inf An was arbitrary, we deduce n→∞
n→∞
that lim sup d(x, An ) ≤ d(x, s- lim inf An ). n→∞
n→∞
PROPOSITION 6.6.22 If X is a Banach space, {An , A}n≥1 ⊆ 2X \ {∅} and for every x ∈ X, we have lim sup d(x, An ) ≤ d(x, A), n→∞
then A ⊆ s- lim inf An . n→∞
6.6 Convergence of Sets
519
PROOF: If a ∈ A, then d(a, An ) −→ 0 and so a ∈ s- lim inf An (see Remark 6.6.4). n→∞
Therefore we have A ∈ s- lim inf An . n→∞
If we combine Propositions 6.6.18 and 6.6.22, we obtain the following result. PROPOSITION 6.6.23 If X is a Banach space, {An , A}n≥1 ⊆ Pf c (X) and W
w
M
An −→ A, An −→ A, then An −→ A. In Propositions 6.6.21 and 6.6.22 we obtained results concerning s- lim inf An . n→∞
We want to have similar results for w- lim sup An . For this purpose we introduce n→∞
the following class of subsets of X. DEFINITION 6.6.24 Let X be a Banach space. We define A = {C ⊆ X : C is nonempty, weakly closed and for every r > 0, C ∩ B r ∈ Pwk (X)}, and Ac = {C ∈ A : C is also convex}. REMARK 6.6.25 Clearly the family A is closed under finite unions, arbitrary intersections, and of course contains the family of all weakly closed and locally weakly compact subsets of X. Moreover, if X is reflexive, then A = {C ⊆ X : C is nonempty and weakly closed}; that is, A = Pf (Xw ). PROPOSITION 6.6.26 If X is a Banach space, {An }n≥1 ⊆ 2X \ {∅} and An ⊆ W ∈ A for every n ≥ 1, then for every x ∈ X, we have d(x, w- lim sup An ) ≤ n→∞
lim inf d(x, An ). n→∞
PROOF: Let x ∈ X and r = lim inf d(x, An ). If r = +∞, then the proposition is n→∞
trivially true. So let r < +∞. Suppose that the result is not true. Then for some x ∈ X we have lim inf d(x, An ) < d(x, w- lim sup An ). (6.38) n→∞
n→∞
We can find a subsequence {nk } of {n}, such that lim d(x, Ank ) = lim inf d(x, Ank ) = r < ∞.
k→∞
n→∞
We pick ank ∈ Ank such that x − ank ≤ d(x, Ank ) +
1 k
for all k ≥ 1.
(6.39)
For k ≥ 1 large we have ank ∈ W ∩ B (r+x+1) ∈ Pwk (X). So by passing to a w further subsequence if necessary, we may assume that ank −→ a ∈ w- lim sup An = ∅. n→∞
Then x − a ≤ lim inf x − ank ≤ lim inf d(x, An ) k→∞
n→∞
(see (6.39))
< d(x, w- lim sup An ) n→∞
≤ x − a,
(see (6.38))
a contradiction.
Hidden in the above proof is also the result that w- lim sup An = ∅. So we can n→∞
state this separately. PROPOSITION 6.6.27 If X is a Banach space, {An }n≥1 ⊆ 2X \ {∅}, and r = lim inf d(x, An ), then n→∞
(a) w- lim sup An = ∅ implies r < +∞. n→∞
(b) If An ⊆ W ∈ A for all n ≥ 1 and r < +∞, then w- lim sup An = ∅. n→∞
REMARK 6.6.28 If X is reflexive, then in part (b), W = X. So in reflexive Banach spaces, for any sequence {An }n≥1 ⊆ 2X\{∅} we always have d(x, w- lim sup An ) ≤ n→∞
lim inf d(x, An ) for all x ∈ X. n→∞
The next result is analogous to Proposition 6.6.22 but for w-lim sup this time. For a proof of this result, we refer to Hu–Papageorgiou [313, pp. 673–674]. PROPOSITION 6.6.29 If X is a reflexive Banach space with X ∗ locally uniformly convex, {An , A}n≥1 ⊆ Pf c (X) and for every x ∈ X we have d(x, A) ≤ lim inf d(x, An ), then w- lim sup An ⊆ A. n→∞
n→∞
Recalling that weakly convergent sequences in a Banach space are bounded, we see that ! w- lim sup(An ∩ kB1 ). w- lim sup An = n→∞
k≥1
n→∞
This observation leads at once to the following result. PROPOSITION 6.6.30 If X is a Banach space, {An }n≥1 ⊆ 2X \ {∅} and either (i) X ∗ is separable or (ii) X is separable and An ⊆ W ∈ A for all n ≥ 1, then w An ∩ kB1 . w- lim sup An = n→∞
k≥1 m≥1 n≥m
REMARK 6.6.31 Clearly in the above proposition An ∩ kB1 can be replaced by τ An ∩ kB 1 or by An ∩ kB1 with τ = s or τ = w. Note that in general w- lim sup An n→∞
is neither strongly nor weakly closed. Finally in a finite-dimensional setting and under a compactness condition, the situation is very pleasant because we have the following. PROPOSITION 6.6.32 If X is a finite-dimensional Banach space and K w h {An , A}n≥1 ⊆ Pf c (X) with A compact, then An −→ A ⇔ An −→ A ⇔ An −→ W
A ⇔ An −→ A.
In the last part of this section we deal with the convergence of the set-valued integrals and of the sets of Lp -selectors of sequences of measurable multifunctions. A major tool in our considerations is the following result on the pointwise behavior of a weakly convergent sequence in Lp (Ω, X), (1 ≤ p < ∞). PROPOSITION 6.6.33 If (Ω, Σ, µ) is a σ-finite measure space, X is a Banach w space, {fn , f }n≥1 ⊆ Lp (Ω, X, ) 1 ≤ p < ∞, fn −→ f in Lp (Ω, X), and for µ-almost all ω ∈ Ω and all n ≥ 1 fn (ω) ∈ G(ω) ∈ Pwk (X), then f (ω) ∈ conv w- lim sup{fn (ω)} n→∞
µ-a.e. on Ω. PROOF: By Mazur’s lemma we have f (ω) ∈ conv
!
fn (ω)
µ-a.e. on Ω.
n≥k
So for every k ≥ 1, x∗ ∈ X ∗ and ω ∈ Ω \ N, µ(N ) = 0, we have !
x∗ , f (ω) ≤ σ x∗ , fn (ω) = sup x∗ , fn (ω) , n≥k
n≥k ∗
∗
⇒ x , f (ω) ≤ lim sup x , fn (ω) . n→∞
Invoking Proposition 6.6.17, we obtain
x∗ , f (ω) ≤ σ x∗ , w- lim sup fn (ω) , n→∞
∗
∗
for all ω ∈ Ω\N and all x ∈ X ,
⇒ f (ω) ∈ conv w- lim sup fn (ω) . n→∞
Our aim is to derive multivalued analogues of Fatou’s lemma. To do this we need the following measurability result whose proof can be found in Hu–Papageorgiou [313, p. 692]. LEMMA 6.6.34 If (Ω, Σ) is a measurable space, X is a separable Banach space, and Fn : Ω −→ 2X \ {∅}, n ≥ 1, are measurable multifunctions such that Fn (ω) ⊆ W (ω) ∈ A for all ω ∈ Ω and all n ≥ 1, then ω −→ lim sup Fn (ω) is measurable. n→∞
REMARK 6.6.35 If X is reflexive, then W (ω) = X for all ω ∈ Ω and so ω −→ lim sup Fn (ω) is always measurable. n→∞
PROPOSITION 6.6.36 If $(\Omega, \Sigma, \mu)$ is a nonatomic, $\sigma$-finite measure space, $X$ is a separable Banach space and $F_n : \Omega \to 2^X \setminus \{\emptyset\}$, $n \ge 1$, are graph-measurable multifunctions such that $F_n(\omega) \subseteq W(\omega)$ $\mu$-a.e. on $\Omega$, for all $n \ge 1$, with $W : \Omega \to P_{wkc}(X)$ integrably bounded, then
$$w\text{-}\limsup_{n\to\infty} \int_\Omega F_n\, d\mu \;\subseteq\; \operatorname{cl} \int_\Omega w\text{-}\limsup_{n\to\infty} F_n\, d\mu.$$

PROOF: Let $x \in w\text{-}\limsup_{n\to\infty} \int_\Omega F_n\, d\mu$. We can find a subsequence $\{n_k\}$ of $\{n\}$ and $x_{n_k} \in \int_\Omega F_{n_k}\, d\mu$ such that $x_{n_k} \xrightarrow{w} x$ in $X$. Then we have $x_{n_k} = \int_\Omega f_{n_k}\, d\mu$ with $f_{n_k} \in S^1_{F_{n_k}} \subseteq S^1_W$. From Theorem 6.4.23 we know that $S^1_W \subseteq L^1(\Omega, X)$ is $w$-compact. So we may assume that $f_{n_k} \xrightarrow{w} f$ in $L^1(\Omega, X)$. By virtue of Proposition 6.6.33 we have $f(\omega) \in \overline{\operatorname{conv}}\, w\text{-}\limsup_{n\to\infty} F_n(\omega)$ $\mu$-a.e. on $\Omega$, and for every $n \ge 1$, $\omega \to F_n(\omega)$ is $\Sigma_\mu$-measurable. So Lemma 6.6.34 implies that $\omega \to w\text{-}\limsup_{n\to\infty} F_n(\omega)$ is $\Sigma_\mu$-measurable. Using Corollary 6.4.42, we have
$$\operatorname{cl} \int_\Omega \overline{\operatorname{conv}}\, w\text{-}\limsup_{n\to\infty} F_n\, d\mu = \operatorname{cl} \int_\Omega w\text{-}\limsup_{n\to\infty} F_n\, d\mu$$
$$\Rightarrow\; \int_\Omega \overline{\operatorname{conv}}\, w\text{-}\limsup_{n\to\infty} F_n\, d\mu = \operatorname{cl} \int_\Omega w\text{-}\limsup_{n\to\infty} F_n\, d\mu \quad \text{(see Proposition 6.4.48)}$$
$$\Rightarrow\; x = \int_\Omega f\, d\mu \in \operatorname{cl} \int_\Omega w\text{-}\limsup_{n\to\infty} F_n\, d\mu.$$
REMARK 6.6.37 The closure on the set-valued integral $\operatorname{cl} \int_\Omega w\text{-}\limsup_{n\to\infty} F_n\, d\mu$ cannot be dropped in general. For a situation where this can be done, we refer to Hu–Papageorgiou [313, p. 697].

PROPOSITION 6.6.38 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space, and $F_n : \Omega \to 2^X \setminus \{\emptyset\}$, $n \ge 1$, are graph-measurable multifunctions such that $F_n(\omega) \subseteq W(\omega)$ $\mu$-a.e. with $W(\omega) \in P_{wk}(X)$ and $\sup\{\|u\| : u \in F_n(\omega)\} \le \varphi(\omega)$ $\mu$-a.e. on $\Omega$ with $\varphi \in L^1(\Omega)_+$, then $w\text{-}\limsup_{n\to\infty} S^1_{F_n} \subseteq \overline{\operatorname{conv}}\, S^1_{w\text{-}\limsup F_n}$.
PROOF: From Corollary 6.4.18, we know that for all $u \in L^\infty(\Omega, X^*_{w^*}) = L^1(\Omega, X)^*$ we have
$$\sigma(u, S^1_{F_n}) = \int_\Omega \sigma\big(u(\omega), F_n(\omega)\big)\, d\mu,$$
$$\Rightarrow\; \limsup_{n\to\infty} \sigma(u, S^1_{F_n}) = \limsup_{n\to\infty} \int_\Omega \sigma\big(u(\omega), F_n(\omega)\big)\, d\mu$$
$$\le \int_\Omega \limsup_{n\to\infty} \sigma\big(u(\omega), F_n(\omega)\big)\, d\mu \quad \text{(by Fatou's lemma)}$$
$$= \int_\Omega \sigma\big(u(\omega), w\text{-}\limsup_{n\to\infty} F_n(\omega)\big)\, d\mu \quad \text{(see Proposition 6.6.17)}$$
$$= \sigma\big(u, S^1_{w\text{-}\limsup F_n}\big) \quad \text{(see Lemma 6.6.34 and Corollary 6.4.18)},$$
$$\Rightarrow\; w\text{-}\limsup_{n\to\infty} S^1_{F_n} \subseteq \overline{\operatorname{conv}}\, S^1_{w\text{-}\limsup F_n} \quad \text{(see Proposition 6.6.18).}$$
REMARK 6.6.39 If $\mu$ is finite, then the pointwise boundedness condition on $|W(\omega)| = \sup\{\|u\| : u \in W(\omega)\}$ can be replaced by the condition that $\{|F_n(\cdot)|\}_{n\ge 1}$ is uniformly integrable.
We can have analogous Fatou-type results for the $s$-$\liminf$.

PROPOSITION 6.6.40 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space, and $F_n : \Omega \to 2^X \setminus \{\emptyset\}$, $n \ge 1$, are graph-measurable multifunctions such that $\sup_{n\ge 1} d\big(0, F_n(\cdot)\big) \in L^1(\Omega)$, then
$$\int_\Omega s\text{-}\liminf_{n\to\infty} F_n\, d\mu \;\subseteq\; s\text{-}\liminf_{n\to\infty} \int_\Omega F_n\, d\mu.$$

PROOF: Let $x \in \int_\Omega s\text{-}\liminf_{n\to\infty} F_n\, d\mu$. Then $x = \int_\Omega f\, d\mu$ with $f \in S^1_{s\text{-}\liminf F_n}$. Let
$$H_n(\omega) = \big\{x \in F_n(\omega) : \|f(\omega) - x\| \le d\big(f(\omega), F_n(\omega)\big) + 1/n\big\}.$$
Clearly $\operatorname{Gr} H_n \in \Sigma_\mu \times B(X)$ and so by Theorem 6.3.20, we can find $f_n : \Omega \to X$ a $\Sigma$-measurable function such that $f_n(\omega) \in H_n(\omega)$ $\mu$-a.e. on $\Omega$. We have $\|f_n(\omega) - f(\omega)\| \to 0$ $\mu$-a.e. on $\Omega$. Therefore by the dominated convergence theorem we have $f_n \to f$ in $L^1(\Omega, X)$ and so $x_n = \int_\Omega f_n\, d\mu \to x = \int_\Omega f\, d\mu$. Because $f_n \in S^1_{F_n}$, we conclude that
$$\int_\Omega s\text{-}\liminf_{n\to\infty} F_n\, d\mu \;\subseteq\; s\text{-}\liminf_{n\to\infty} \int_\Omega F_n\, d\mu.$$
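As a quick sanity check of Proposition 6.6.40 in a toy setting (our illustration, with made-up data), take $\Omega = [0,1]$ with Lebesgue measure, $X = \mathbb{R}$, and the interval-valued multifunctions $F_n(\omega) = [0,\, 1 + (-1)^n \omega]$, so that $s\text{-}\liminf_{n} F_n(\omega) = [0, 1-\omega]$. For interval-valued multifunctions with integrable endpoints the set-valued integral is just the interval of the endpoint integrals, and the $s$-$\liminf$ of intervals $[0, c_n]$ is $[0, \liminf_n c_n]$, so both sides of the inclusion are easy to approximate numerically.

```python
import numpy as np

# Toy check of Proposition 6.6.40 (made-up data): Omega = [0, 1], X = R,
# F_n(w) = [0, 1 + (-1)**n * w], so s-liminf_n F_n(w) = [0, 1 - w].
# Interval-valued integrals reduce to integrals of the endpoints, and
# s-liminf of [0, c_n] is [0, liminf c_n].

w = np.linspace(0.0, 1.0, 100_001)                 # grid on Omega

lhs_upper = (1.0 - w).mean()                       # upper endpoint of  int_Omega s-liminf F_n dmu
rhs_upper = min((1.0 + (-1.0) ** n * w).mean()     # liminf of the upper endpoints of int_Omega F_n dmu
                for n in range(1, 200))

print((0.0, lhs_upper), (0.0, rhs_upper))          # both ~ (0.0, 0.5): the inclusion holds
```

In general the inclusion of Proposition 6.6.40 can be strict; the symmetric choice above just happens to give equality.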
PROPOSITION 6.6.41 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space, $F_n : \Omega \to 2^X \setminus \{\emptyset\}$, $n \ge 1$, are graph-measurable multifunctions, and $\sup_{n\ge 1} d\big(0, F_n(\cdot)\big) \in L^1(\Omega)$, then $S^1_{s\text{-}\liminf F_n} \subseteq s\text{-}\liminf_{n\to\infty} S^1_{F_n}$.

PROOF: Using Theorem 6.4.16, for every $u \in L^1(\Omega, X)$ and every $n \ge 1$, we have
$$d(u, S^1_{F_n}) = \int_\Omega d\big(u(\omega), F_n(\omega)\big)\, d\mu,$$
$$\Rightarrow\; \limsup_{n\to\infty} d(u, S^1_{F_n}) \le \int_\Omega \limsup_{n\to\infty} d\big(u(\omega), F_n(\omega)\big)\, d\mu \quad \text{(by Fatou's lemma)}$$
$$\le \int_\Omega d\big(u(\omega), s\text{-}\liminf_{n\to\infty} F_n(\omega)\big)\, d\mu \tag{6.40}$$
(see Proposition 6.6.21). Note that
$$s\text{-}\liminf_{n\to\infty} F_n(\omega) = s\text{-}\liminf_{n\to\infty} \overline{F_n(\omega)} = \big\{x \in X : \lim_{n\to\infty} d\big(x, F_n(\omega)\big) = 0\big\},$$
hence $\omega \to s\text{-}\liminf_{n\to\infty} F_n(\omega)$ is $\Sigma_\mu$-measurable. Therefore by Theorem 6.4.16,
$$\int_\Omega d\big(u(\omega), s\text{-}\liminf_{n\to\infty} F_n(\omega)\big)\, d\mu = d\big(u, S^1_{s\text{-}\liminf F_n}\big),$$
$$\Rightarrow\; \limsup_{n\to\infty} d(u, S^1_{F_n}) \le d\big(u, S^1_{s\text{-}\liminf F_n}\big) \quad \text{(see (6.40))}$$
$$\Rightarrow\; S^1_{s\text{-}\liminf F_n} \subseteq s\text{-}\liminf_{n\to\infty} S^1_{F_n} \quad \text{(see Proposition 6.6.22).}$$
REMARK 6.6.42 If $\mu$ is finite, then the hypothesis that $\sup_{n\ge 1} d\big(0, F_n(\cdot)\big) \in L^1(\Omega)$ can be replaced by the weaker hypothesis that $\{d\big(0, F_n(\cdot)\big)\}_{n\ge 1}$ is uniformly integrable.
COROLLARY 6.6.43 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a separable Banach space, $F_n : \Omega \to P_{wkc}(X)$, $n \ge 1$, are graph-measurable multifunctions with $F_n(\omega) \subseteq W(\omega)$ $\mu$-a.e. on $\Omega$ for all $n \ge 1$, where $W : \Omega \to P_{wkc}(X)$ is integrably bounded, and $F_n(\omega) \xrightarrow{M} F(\omega)$ $\mu$-a.e. on $\Omega$, then $\int_\Omega F_n\, d\mu \xrightarrow{M} \int_\Omega F\, d\mu$ and $S^1_{F_n} \xrightarrow{M} S^1_F$.

COROLLARY 6.6.44 If $(\Omega, \Sigma, \mu)$ is a $\sigma$-finite measure space, $X$ is a finite-dimensional Banach space, $F_n : \Omega \to 2^X \setminus \{\emptyset\}$, $n \ge 1$, are graph-measurable multifunctions with $\sup_{n\ge 1} |F_n(\cdot)| \in L^1(\Omega)$ and $F_n(\omega) \xrightarrow{K} F(\omega)$ $\mu$-a.e. on $\Omega$, then $\int_\Omega F_n\, d\mu \xrightarrow{K} \int_\Omega F\, d\mu$.
We conclude with two interesting facts concerning weakly convergent sequences in $L^1(\Omega, \mathbb{R}^N)$. A sequence $\{f_n\}_{n\ge 1} \subseteq L^1(\Omega)$ that converges weakly but not strongly in $L^1(\Omega)$ oscillates wildly around its weak limit. The next proposition is well known from the theory of Lebesgue spaces (see, e.g., Gasiński–Papageorgiou [259, p. 170]).

PROPOSITION 6.6.45 If $(\Omega, \Sigma, \mu)$ is a finite measure space, $\{f_n, f\}_{n\ge 1} \subseteq L^1(\Omega)$, $f_n \xrightarrow{w} f$ in $L^1(\Omega)$ and one of the following two conditions holds,
(i) $f(\omega) \le \liminf_{n\to\infty} f_n(\omega)$ $\mu$-a.e., or
(ii) $\limsup_{n\to\infty} f_n(\omega) \le f(\omega)$ $\mu$-a.e.,
then $f_n \to f$ in $L^1(\Omega)$.

If $\mathbb{R}$ is replaced by $\mathbb{R}^N$, then conditions (i) and (ii) above are replaced by an extremality condition. The result is due to Visintin [596].

PROPOSITION 6.6.46 If $(\Omega, \Sigma, \mu)$ is a finite measure space and $\{f_n\}_{n\ge 1} \subseteq L^1(\Omega, \mathbb{R}^N)$ satisfies $f_n \xrightarrow{w} f$ in $L^1(\Omega, \mathbb{R}^N)$ with
$$f(\omega) \in \operatorname{ext} \overline{\operatorname{conv}} \limsup_{n\to\infty}\{f_n(\omega)\} \quad \mu\text{-a.e. on } \Omega,$$
then $f_n \to f$ in $L^1(\Omega, \mathbb{R}^N)$.

Using this proposition we can have the following result for the extreme points of the set-valued integral in $\mathbb{R}^N$. For the proof of this result, we refer to Hu–Papageorgiou [313, p. 736].

PROPOSITION 6.6.47 If $(\Omega, \Sigma, \mu)$ is a finite nonatomic measure space, $F : \Omega \to P_f(\mathbb{R}^N)$ is measurable, $e \in \operatorname{ext} \int_\Omega F\, d\mu$, $\{f_n\}_{n\ge 1} \subseteq S^1_F$ is uniformly integrable, and $\int_\Omega f_n\, d\mu \to e$ in $\mathbb{R}^N$, then we can find $f \in S^1_F$ such that $e = \int_\Omega f\, d\mu$ and $f_n \to f$ in $L^1(\Omega, \mathbb{R}^N)$.
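The oscillation phenomenon behind Propositions 6.6.45 and 6.6.46 can be seen in the classical example $f_n(t) = \operatorname{sign}(\sin 2\pi n t)$ on $[0,1]$, which converges weakly to $f = 0$ in $L^1(0,1)$ while $\|f_n\|_{L^1} = 1$ for every $n$; here $0$ lies in $\overline{\operatorname{conv}}\limsup_n\{f_n(t)\} = [-1,1]$ but is not an extreme point of it, so Visintin's extremality condition fails, consistent with the lack of strong convergence. The short numerical sketch below is ours, not from the text; the test function is an arbitrary bounded choice.

```python
import numpy as np

# f_n(t) = sign(sin(2*pi*n*t)) on [0, 1]: weakly convergent to 0 in L^1 (pairings with
# bounded functions vanish as n grows) but not strongly (||f_n||_1 stays equal to 1).

t = np.linspace(0.0, 1.0, 2_000_001)
g = np.exp(-t) * np.cos(3.0 * t)          # arbitrary bounded test function

for n in (1, 4, 16, 64, 256):
    f_n = np.sign(np.sin(2.0 * np.pi * n * t))
    pairing = (f_n * g).mean()            # ~ integral of f_n * g  -> 0   (weak convergence)
    l1_norm = np.abs(f_n).mean()          # ~ ||f_n||_{L^1}        -> stays ~ 1
    print(n, round(pairing, 4), round(l1_norm, 4))
```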
6.7 Remarks

6.1: There are hyperspace topologies corresponding to the various kinds of continuity of multifunctions introduced in this section. For details in this direction we refer to the books of Beer [61] and Kuratowski [368] and to the paper of Michael
[425]. Continuity properties of multifunctions are studied at various levels of generality in the books of Aliprantis–Border [10], Aubin–Cellina [38], Aubin–Frankowska [40], Berge [67], Castaing–Valadier [134], Denkowski–Migórski–Papageorgiou [194], Hu–Papageorgiou [313], Kisielewicz [355], and Klein–Thompson [357].

6.2: The study of the measurability properties of multifunctions was initiated with the works of Castaing [133] and Jacobs [330]. The approach of Castaing [133] is topological because the multifunctions are defined on a locally compact Hausdorff topological space equipped with a Radon measure and are compact valued, whereas Jacobs [330] drops this requirement. In Debreu [186] and Rockafellar [521, 529], the multifunctions are defined on a measure space $(\Omega, \Sigma, \mu)$ (no topological structure is assumed) and have values in $\mathbb{R}^N$. In Debreu [186] the multifunctions are compact valued and in Rockafellar [521, 529] only closed valued. Further contributions to the theory were made by Valadier [591], Leese [373], Himmelberg [301], and Himmelberg–Parthasarathy–Van Vleck [304]. When dealing with compact-valued multifunctions, the following result is helpful (see Debreu [186, p. 355]).

PROPOSITION 6.7.1 If $X$ is a separable metrizable space, then the Borel $\sigma$-field of the hyperspace $P_k(X)$, denoted by $B\big(P_k(X)\big)$, is generated by the family $\{U^+ : U \subseteq X \text{ open}\}$, where $U^+ = \{C \in P_k(X) : C \subseteq U\}$; also it is generated by the family $\{U^- : U \subseteq X \text{ open}\}$, where $U^- = \{C \in P_k(X) : C \cap U \ne \emptyset\}$.

6.3: Theorem 6.3.6 is due to Michael [426]. Michael [425, 426, 427, 428] made additional important contributions in this direction. Proposition 6.3.16 is due to Cellina [135], who was the first to conduct a systematic study of the existence of approximate continuous selectors and their use in differential inclusions. For $P_{kc}(\mathbb{R}^N)$-valued multifunctions we can have a Lipschitz selector using the Steiner point map.

DEFINITION 6.7.2 The Steiner point map is a map $s : P_{kc}(\mathbb{R}^N) \to \mathbb{R}^N$ that has the following properties.
(a) $s(C) \in C$ for all $C \in P_{kc}(\mathbb{R}^N)$.
(b) $s(C + D) = s(C) + s(D)$ for all $C, D \in P_{kc}(\mathbb{R}^N)$.
(c) $s(UC) = U s(C)$ for all $C \in P_{kc}(\mathbb{R}^N)$ and all $U \in O(\mathbb{R}^N)$, where $O(\mathbb{R}^N)$ is the group of orthogonal transformations on $\mathbb{R}^N$.
(d) $s(\lambda C) = \lambda s(C)$ for all $\lambda \in \mathbb{R}$ and $C \in P_{kc}(\mathbb{R}^N)$.

Using the Steiner point map, we have at once the following result on Lipschitz continuous selectors.

PROPOSITION 6.7.3 If $X$ is a metric space and $F : X \to P_{kc}(\mathbb{R}^N)$ is $h$-Lipschitz, then $F$ admits a Lipschitz continuous selector.

This result cannot be generalized to infinite-dimensional Banach spaces. More precisely, we have the following result due to Yost [615].

PROPOSITION 6.7.4 Let $X$ be a metric space and $Y$ a Banach space. Every $h$-Lipschitz multifunction $F : X \to P_{bfc}(Y)$ admits a Lipschitz continuous selector if and only if $\dim Y < +\infty$.
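As a computational aside (not from the text), one standard integral representation of the Steiner point is $s(C) = N \int_{S^{N-1}} \sigma_C(u)\, u\, d\sigma(u)$, where $\sigma_C$ is the support function of $C$ and $\sigma$ is the normalized surface measure on the unit sphere; the Lipschitz selector of Proposition 6.7.3 is then $x \mapsto s(F(x))$. The sketch below estimates $s(C)$ by Monte Carlo for a planar polytope and checks the translation property that follows from (a) and (b) of Definition 6.7.2; the function name, sample size, and test data are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def steiner_point(vertices, samples=200_000):
    """Monte Carlo estimate of the Steiner point of conv(vertices) in R^d via
    s(C) = d * E[ sigma_C(u) * u ], with u uniform on the unit sphere S^{d-1}."""
    V = np.asarray(vertices, dtype=float)
    d = V.shape[1]
    U = rng.normal(size=(samples, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)      # uniform random directions
    support = (U @ V.T).max(axis=1)                    # sigma_C(u) = max_v <v, u>
    return d * (support[:, None] * U).mean(axis=0)

tri = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
print(steiner_point(tri))
print(steiner_point(tri + np.array([5.0, -3.0])))      # ~ previous point + (5, -3)
```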
Theorem 6.3.17 is due to Kuratowski–Ryll–Nardzewski [367]. Theorem 6.3.20 in the form stated there is due to Saint-Beuve [535]. However, earlier versions with additional restrictions, primarily on X, were proved by Yankov [611], von Neumann [456], and Aumann [46]. Scalarly measurable multifunctions were first considered by Valadier [591]. For the graph measurability of the extremal multifunction ω −→ ext F (ω) (see Proposition 6.3.35), we refer to Benamara [63] (who followed a different approach to prove the result), Castaing–Valadier [134, Chapter IV], and Hu–Papageorgiou [313, Section II.4]. A final useful observation concerning measurable selectors is included in the next proposition (see Hu–Papageorgiou [313, p. 173]). PROPOSITION 6.7.5 If (Ω, Σ, µ) is a separable finite measure space and G : Ω −→ Pf (X), then there exists F : Ω −→ Pf (X) ∪ {∅}-measurable multifunction such that F (ω) ⊆ G(ω) µ-a.e. and every measurable selector of G is also a selector of F .
6.4: The importance of decomposability was evident from the beginning to people working in control theory and optimization; see Boltyanski–Gamkrelidze–Pontryagin [82] and Neustadt [457]. The name "convex with respect to switching" is also used in the control theory literature. Theorem 6.4.16 is a very valuable tool in applications and illustrates the power of the notion of decomposability. It was first proved by Rockafellar [529]. The form presented here is due to Hiai–Umegaki [293]. Theorem 6.4.23 is also useful in applications and was proved by Papageorgiou [475]. Systematic studies of decomposability can be found in Hiai–Umegaki [293], Hiai [294], Olech [468], and Hu–Papageorgiou [313, Section 2.3]. Decomposability is used to define the conditional expectation of multifunctions and to define multivalued martingales and their extensions. We refer to the papers of Hiai–Umegaki [293], Papageorgiou [475, 480], Wang–Xue [600], and Wang [599]. Property U (see Definition 6.4.31) was introduced by Bourgain [94]. Proposition 6.4.33, relating the weak norm (see Definition 6.4.29) and the weak topology, is due to Gutman [279]. Theorem 6.4.35 was proved by Tolstonogov [584], using the Baire category method of De Blasi–Pianigiani [180, 181, 182]. The set-valued integral given in Definition 6.4.38 is due to Aumann [44]. The topological variant given in Definition 6.4.59 is due to Cornwall [157]. The approach based on the Rådström embedding theorem (see Theorem 6.4.56) is due to Debreu [186]. For more details about embedding theorems of hyperspaces of convex sets, we refer to Schmidt [544]. Studies of the set-valued integral can be found in Klein–Thompson [357] and Hu–Papageorgiou [313].

6.5: Theorem 6.5.2 was proved by Nadler [449] for multivalued contractions with nonempty, bounded, closed values. Soon thereafter Covitz–Nadler [165] extended the result to multivalued contractions whose values need not be bounded. The stability results presented in Proposition 6.5.4 and Corollary 6.5.5 are due to Lim [383]. Theorem 6.5.10 was first proved by Knaster–Kuratowski–Mazurkiewicz [358] in the special case where C is the set of vertices of a simplex in $\mathbb{R}^N$. Their approach is combinatorial, based on Sperner's lemma. The infinite-dimensional version presented here is due to Fan [232]. The finite-dimensional forerunner of Theorem 6.5.13 is due to Hartman–Stampacchia [286], and the infinite-dimensional result presented
here is due to Browder [121]. Weakly inward maps (see Definition 6.5.15(b)) were introduced by Halpern–Bergman [283]. Theorem 6.5.17 is due to Halpern [284]. Theorem 6.5.19 is due to Kakutani [337] when $X = \mathbb{R}^N$ and due to Fan [232] when $X$ is a locally convex space. Theorem 6.5.20 is due to Browder [118], and Theorems 6.5.21 and 6.5.22 are due to Bader [49]. Additional fixed point theorems for multifunctions can be found in the books of Andres–Gorniewicz [27], Border [86], Gorniewicz [272], and Hu–Papageorgiou [313, 316].

6.6: The first systematic study of the limits in Definition 6.6.3 can be found in Kuratowski [366]. The Mosco convergence of sets was introduced by Mosco [445, 446]. From Section 1.5 we know that these two notions correspond to the Γ-convergence and the Mosco convergence of functions, respectively. The W-convergence (see Definition 6.6.7) and the weak or scalar convergence (see Definition 6.6.9) were both introduced by Wijsman [605] to deal with problems in mathematical statistics. A detailed study, together with comparison results accompanied by examples and counterexamples, can be found in Hu–Papageorgiou [313, Chapters I and VII]. For the Hausdorff and Kuratowski convergences, we have the following two compactness results.

THEOREM 6.7.6 If $(X, d)$ is a metric space, $\{A_n\}_{n\ge 1} \subseteq P_f(X)$, and for every $n \ge 1$, $A_n \subseteq K \in P_k(X)$, then we can find a subsequence $\{A_{n_k}\}_{k\ge 1} \subseteq \{A_n\}_{n\ge 1}$ and $A \in P_k(X)$ such that $A_{n_k} \xrightarrow{h} A$.

REMARK 6.7.7 This result is known as Blaschke's theorem.

THEOREM 6.7.8 If $(X, d)$ is a separable metric space and $\{A_n\}_{n\ge 1} \subseteq 2^X \setminus \{\emptyset\}$, then there exists a subsequence $\{A_{n_k}\}_{k\ge 1}$ of $\{A_n\}_{n\ge 1}$ and $A \in 2^X$ such that $A_{n_k} \xrightarrow{K} A$.

For multivalued Fatou's lemmata, we refer to Papageorgiou [477] and Hu–Papageorgiou [313, Section VII.3]. Proposition 6.6.45 was extended to functions in the Lebesgue–Bochner space $L^1(\Omega, X)$ by Balder [53] and Rzezuchowski [533].
7 Economic Equilibrium and Optimal Economic Planning
Summary. This chapter is devoted to mathematical economics. We start with the static model of an exchange economy with a measure space of agents (an abstract device to model perfect competition). We introduce the notions of core allocations and of Walras allocations. We show that in the context of perfect competition the two sets coincide and are nonempty (existence result). Then we pass to dynamic models and deal with discrete-time infinite horizon multisector growth models. We deal with both the discounted and the undiscounted model. For the latter, we introduce the optimality notion of "weak maximality". We prove existence theorems and we establish weak and strong "turnpike theorems". Turnpike programs are important because every optimal program eventually moves close to a turnpike one. Moreover, they are easier to compute and are relatively insensitive to the optimality criterion. Subsequently, we conduct an analogous study for models with uncertainty (nonstationary discounted and stationary undiscounted). Then we consider continuous-time models and finally we investigate the "expected utility hypothesis" (EUH), the main hypothesis in the theory of decision making.
Introduction Mathematical economics deals with the analytical and mathematical aspects of modern economic theory. The field has made remarkable progress in the last forty years and was also the source for significant developments in nonlinear analysis, such as nonsmooth analysis and multivalued analysis. Today the field has expanded very much and covers a variety of topics that cannot be surveyed in just one chapter. For this reason, here we focus on two particular topics that hold a central position in economic theory. We study the theory of competitive markets (in particular their equilibrium theory) and the theory of growth. At the end of the chapter we examine a notion, which is central in decision making and which brings us to the doorsteps of game theory, which is the topic of the next chapter. In Section 7.1, we discuss a static model of an exchange economy. We assume that perfect competition prevails. This is modelled by a continuum of agents (a nonatomic measure space of agents). For such a model, we introduce two equilibrium concepts. The core allocation (which is price-independent) and the Walras N.S. Papageorgiou, S.Th. Kyritsi-Yiallourou, Handbook of Applied Analysis, Advances in Mechanics and Mathematics 19, DOI 10.1007/b120946_7, © Springer Science+Business Media, LLC 2009
allocation (which is price-competitive). We show that in the assumed environment of perfect competition, the sets of these allocations coincide (core–Walras equivalence theorem). We also prove the existence of a Walras allocation. In Section 7.2 we turn our attention to growth theory and study infinite horizon discrete-time multisector growth models. First we deal with the discounted model. We show that it admits an optimal program (path) and then we characterize it via a system of supporting prices. Then we pass to the undiscounted model. In this case the intertemporal utility need not converge and so we need to have a new optimality concept. We introduce the notion of weak maximality and we prove that the model admits a weakly maximal program. In Section 7.3, we prove weak and strong turnpike theorems. A turnpike program is a special stationary program which in general is easy to compute and optimal programs eventually move close to a turnpike. So in general terms, turnpike theory deals with the asymptotic properties of optimal programs. In Section 7.4 we study growth under uncertainty. We investigate both nonstationary, discounted and stationary, undiscounted models. We obtain existence and characterization results that are analogous to the ones proved for the deterministic case (see Section 7.2). In Section 7.5 we discuss a continuous-time, discounted growth model. We prove an existence theorem. Now the choice of the topology on the set of feasible programs is crucial in the analysis of the model. Finally, in Section 7.6, we examine a basic hypothesis in the theory of decision making, namely the Expected Utility Hypothesis (EUH). Using a partial order in the space of measures, we characterize choice behavior that is consistent with EUH. The models considered in this chapter are highly idealized versions of real-life situations and to some readers, may appear to be only of theoretical interest. However, we think this is incorrect. First, there are specific economic situations for which these mathematical models are reasonable and realistic approximations. Second, the postulated optimality of a particular allocation or path (mystical as it may be) can be used to find practical solutions. It can serve as the yardstick against which we measure all other real-life trajectories of the particular economic systems and decide how good our choices actually are.
7.1 Perfectly Competitive Economies: Core and Walras Equilibria A perfectly competitive economy is one in which no individual agent can affect the social outcome of the economic activity by his or her individual decisions. To simplify our model we consider only pure exchange economies in which no production is possible. The prices at which the exchange takes place, are independent of the actions of the agents and are taken by the agents as given. With these prices, the agents can trade any amount of commodities, which of course implies that the supply and demand of each individual agent is negligible compared to the total volume of trade and so cannot influence the price. The mathematical device to express this economic situation is to use a continuum of agents (traders). In particular as we show in the sequel, we assume that the space of agents is a finite nonatomic measure space (e.g., the unit interval with the Lebesgue measure). This way we are able to
7.1 Perfectly Competitive Economies: Core and Walras Equilibria
model mathematically the situation in which an individual agent cannot influence the outcome of the collective activity. There are two equilibrium concepts for a pure exchange economy. The first one is a core allocation, whose origins can be traced back to Edgeworth [219], under the name contract curve. To illustrate this notion, consider an economic agent who suggests a feasible allocation to the other agents. Suppose now a group of agents, using their initial endowments, can make each member better off than the proposed allocation. Then we say that this group can block the proposed allocation. A feasible allocation, is a core allocation if no group of agents can block it. The set of core allocations forms the core of the economy. The second equilibrium concept goes back to Walras [598] and refers to the noncooperative allocation of resources via a price system. The idea behind the notion of Walras equilibrium is that when the agents are assumed to know only the price system and their own preferences and initial endowments and are allowed to trade freely among them in a decentralized market, then the result will be allocations that maximize the utilities of the agents (subject to their budget constraints) and equate supply and demand. Comparing the two equilibrium concepts introduced above, we see that in contrast to the Walras (competitive) allocation, a core allocation allows for the possibility of cooperation among agents in the economy. Each Walras allocation belongs to the core and the core is in general larger than the set of Walras allocations. A classical conjecture, which dates back to Edgeworth [219] says that the core shrinks to the set of Walras allocations as the number of economic agents increases. The conjecture was made mathematically precise with the introduction of measure space agents. Then in such a setting Aumann [45] proved that the core coincides with the set of Walras allocations. In this section we prove this equivalence result for a perfectly competitive pure exchange (no production) economy. We start with the description of the mathematical model for a perfectly competitive pure exchange economy. So let (Ω, Σ, µ) be a finite measure space, describing the agents of the economy. More precisely, Ω denotes the set of economic agents, the σ-field Σ represents the system of coalitions that can be formed among the agents, and the finite measure µ measures the size of all the feasible coalitions. The commodity space is RN . So there are N -commodities traded in the market. There is N a consumption correspondence F : Ω −→ 2R \ {∅}. For each agent ω ∈ Ω, let ≺ω denote an irreflexive binary relation defined as F (ω). This is the preference relation of ω and describes his consumption tastes. For two commodity vectors v, v ∈ F (ω) we write v ≺ω v (resp., v ⊀ω v), to express the fact that v is preferred (resp., not preferred) to v by the agent ω ∈ Ω. The preference relation ≺ can be derived from an indifference relation , which is usually assumed to be a reflexive, transitive and complete binary relation. To complete the description of the economy, we are also given a function e ∈ L1 (Ω, RN ) which assigns to each agent ω ∈ Ω, the agent’s initial endowment vector e(ω) ∈ F (ω). So we call e the initial endowment allocation. The economy under consideration is the quadruple (Ω, Σ, µ), F, ≺, e . DEFINITION 7.1.1 Let E = (Ω, Σ, µ), F, ≺, e be an exchange economy as described above. (a) An allocation for the economy E is a function u ∈ L1 (Ω, RN ). 
We say that the allocation u is feasible if Ω udµ = Ω edµ.
(b) A coalition A ∈ Σ can improve upon the allocation u if there is an allocation w for E such that (i) u(ω) ≺ω w(ω) µ-a.e. on A. (ii) µ(A) > 0 and A wdµ = A edµ. REMARK 7.1.2 In Definition 7.1.1(b), the coalition A ∈ Σ can block allocation u, because it can redistribute the initial endowment among its members in such a way so as to make each agent in the coalition better off (with respect to the allocation u). DEFINITION 7.1.3 Given a pure exchange economy E= (Ω, Σ, µ), F, ≺, e , the core C(E) of the economy E, is the set of all feasible allocations of E, upon which no coalition in Σ can improve. REMARK 7.1.4 The core of an economy is a fundamental concept in mathematical economics, because it provides a precise explanation of competitive behavior. Now let p ∈ RN + \{0} be the price prevailing in the market. Given a commodity vector v ∈ RN , the real number (p, v)RN is the value of the vector v. Clearly two commodity vectors v, v ∈ RN can be exchanged at price p, provided their values are equal; that is, (p, v)RN = (p, v)RN . DEFINITION 7.1.5 Let E = (Ω, Σ, µ), F, ≺, e be a pure exchange economy in which prevails the price system p ∈ RN + \ {0}. (a) The budget set of agent ω ∈ Ω is defined by
b(ω, p) = v ∈ F (ω) : (p, v)RN ≤ p, e(ω) RN and the demand set is defined by d(ω, p) = v ∈ b(ω, p) : there is no y ∈ b(ω, p) such that v ≺ω y (i.e., d(ω, p) consists of those elements in the budget set that are maximal in b(ω, p) with respect to ≺ω ). (b) The allocation-price pair (u, p) ∈ L1 (Ω, RN )×(RN\{0}) is a Walras or competitive equilibrium if and only if (i) u(ω) ∈ d(ω, p) µ-a.e. on Ω; (ii) Ω udµ = Ω edµ (i.e., the allocation u is feasible). We denote by W (E) the set of all Walras equilibriafor the economy E. Next we make precise our hypotheses on the pure exchange E= (Ω, Σ, µ), F, ≺, e . H: (i) (Ω, Σ, µ) is a finite nonatomic measure space. (ii) For all ω ∈ Ω, F (ω) = RN +. (iii)
$\int_\Omega e\, d\mu \in \operatorname{int} \mathbb{R}^N_+$.
(iv) For every ω ∈ Ω, ≺ω is irreflexive and transitive. N (v) For every v ∈ RN : v ≺ω y is open. + , the set y ∈ R N (vi) For every v ∈ RN : v ≺ω y is + , the multifunction ω −→ Gv (ω) = y ∈ R graph-measurable. N (vii) If v ∈ RN + and x ∈ R+ \ {0}, then v ≺ω v + x for all ω ∈ Ω.
REMARK 7.1.6 Hypothesis H(i) implies that we are in an economic situation of perfect competition. Hypothesis H(iii) implies that there are resources available to be traded among the agents. Hypotheses H(v), (vi) are continuity and measurability hypotheses on the preference relation and finally hypothesis H(vii) is a monotonicity relation. THEOREM 7.1.7 If E = (Ω, Σ, µ), F, ≺, e is a pure exchange economy satisfying hypotheses H, then C(E) = W (E). PROOF: First we show that W (E) ⊆ C(E). To this end let (u, p) ∈ L1 (Ω, RN ) × (RN \ {0}) be a Walras equilibrium for the economy E. Assume that u ∈ / C(E). Then according to Definition 7.1.3 we can find A ∈ Σ and w ∈ L1 (Ω, RN ) such that A wdµ = A edµ and u(ω) ≺ ω w(ω) µ-a.e. on A. Since (u, p) ∈ W (E), from Definition 7.1.5(b) we have that p, e(ω) RN <
p, w(ω) RN µ-a.e. on A. Hence
$$\int_A \big(p, e(\omega)\big)_{\mathbb{R}^N}\, d\mu < \int_A \big(p, w(\omega)\big)_{\mathbb{R}^N}\, d\mu,$$
$$\Rightarrow\; \Big(p, \int_A e\, d\mu\Big)_{\mathbb{R}^N} < \Big(p, \int_A w\, d\mu\Big)_{\mathbb{R}^N},$$
which contradicts the fact that $\int_A e\, d\mu = \int_A w\, d\mu$. So we have proved that $W(E) \subseteq C(E)$.
Next let $u \in C(E)$. We show that for some price system $p \in \mathbb{R}^N_+ \setminus \{0\}$, $(u, p) \in W(E)$. For this purpose we introduce the multifunction
$$\omega \to G(\omega) = \big\{y \in \mathbb{R}^N : u(\omega) \prec_\omega y\big\} \cup \{e(\omega)\}. \tag{7.1}$$
Evidently G is a graph-measurable multifunction. We claim that
Gdµ − edµ ∩ int(−RN + ) = ∅.
Ω
(7.2)
Ω
Here Ω Gdµ is the set-valued integral introduced in Definition 6.4.38. Note that (7.2) is equivalent
Gdµ − edµ ∩ int(−RN (7.3) + ) = ∅. Ω
Ω
We argue indirectly. So suppose that (7.3) is not true. Then we can find y ∈ int RN + such that
$$\int_\Omega e\, d\mu - y \in \int_\Omega G\, d\mu,$$
$$\Rightarrow\; \int_\Omega e\, d\mu - y = \int_\Omega g\, d\mu \ \text{ with } g \in S^1_G \quad \text{(see Definition 6.4.38).} \tag{7.4}$$
We define
$$A = \{\omega \in \Omega : u(\omega) \prec_\omega g(\omega)\} \quad \text{and} \quad C = \{\omega \in \Omega : g(\omega) = e(\omega)\}. \tag{7.5}$$
Evidently $A, C \in \Sigma$ (see hypothesis H(vi)). Because $\int_\Omega e\, d\mu \ne \int_\Omega g\, d\mu$ and $g \in S^1_G$, from (7.1) we see that $\mu(A) > 0$. Let $\hat g : A \to \mathbb{R}^N_+$ be the $\Sigma$-measurable function defined by
$$\hat g(\omega) = g(\omega) + \frac{1}{\mu(A)}\, y.$$
Because $y \in \operatorname{int} \mathbb{R}^N_+$, from hypothesis H(vii) we have
$$g(\omega) \prec_\omega \hat g(\omega) \quad \text{for all } \omega \in A. \tag{7.6}$$
Also from the definition of the multifunction $G$ (see (7.1)), we have
$$u(\omega) \prec_\omega g(\omega) \quad \text{for all } \omega \in A. \tag{7.7}$$
From (7.6), (7.7), and the transitivity of the binary relation $\prec_\omega$, $\omega \in A$, we have
$$u(\omega) \prec_\omega \hat g(\omega) \quad \text{for all } \omega \in A. \tag{7.8}$$
Moreover, note that
$$\int_A \hat g\, d\mu = \int_A g\, d\mu + y = \int_\Omega g\, d\mu - \int_C g\, d\mu + y = \int_\Omega e\, d\mu - \int_C e\, d\mu \ \ \text{(see (7.4))} = \int_A e\, d\mu \ \ \text{(see (7.5))},$$
$$\Rightarrow\; \hat g \in L^1(A, \mathbb{R}^N_+) \text{ is a feasible allocation for the coalition } A. \tag{7.9}$$
Combining (7.8) and (7.9) we reach a contradiction to the fact that $u \in C(E)$. So (7.3) and equivalently (7.2) hold.
Because $G$ is graph-measurable, $e \in S^1_G$, and the measure $\mu$ is nonatomic, we can apply Theorem 6.4.44 and infer that $\int_\Omega G\, d\mu - \int_\Omega e\, d\mu$ is convex. Because of (7.3), we can use the weak separation theorem and obtain $p \in \mathbb{R}^N_+ \setminus \{0\}$ such that
$$\Big(p, \int_\Omega e\, d\mu\Big)_{\mathbb{R}^N} \le (p, v)_{\mathbb{R}^N} \quad \text{for all } v \in \int_\Omega G\, d\mu,$$
$$\Rightarrow\; \Big(p, \int_\Omega e\, d\mu\Big)_{\mathbb{R}^N} \le \inf\Big\{\int_\Omega (p, g)_{\mathbb{R}^N}\, d\mu : g \in S^1_G\Big\},$$
$$\Rightarrow\; \int_\Omega (p, e)_{\mathbb{R}^N}\, d\mu \le \int_\Omega \inf\big\{(p, x)_{\mathbb{R}^N} : x \in G(\omega)\big\}\, d\mu \quad \text{(see Theorem 6.4.16).} \tag{7.10}$$
We claim that from (7.10) it follows that
$$\big(p, e(\omega)\big)_{\mathbb{R}^N} \le \inf\big\{(p, x)_{\mathbb{R}^N} : x \in G(\omega)\big\} \quad \mu\text{-a.e. on } \Omega. \tag{7.11}$$
Suppose that (7.11) is not true. Then we can find D ∈ Σ with µ(D) > 0 such that
inf (p, x)RN : x ∈ G(ω) < p, e(ω) RN for all ω ∈ D. From Theorem 6.3.24, we know that ω −→ m(ω) = inf (p, x)RN : x ∈ G(ω) is Σµ ∩D-measurable on D. Let ε : D −→ RN + \{0} be a Σ∩D-measurable function such
N that m(ω) + ε(ω) < p, e(ω) RN and consider the multifunction H : D −→ 2R \ {∅} defined by H(ω) = x ∈ G(ω) : (p, x)RN ≤ m(ω) + ε(ω) . Because G is graph-measurable, it follows that GrH ∈ (Σµ ∩ D) × B(RN ). So we can apply Theorem 6.3.20 (the Yankov–von Neumann–Aumann selection theorem) and obtain h : D −→ RN + , Σ ∩ D-measurable such that h(ω) ∈ H(ω) µ-a.e. on D. We have
$$\big(p, h(\omega)\big)_{\mathbb{R}^N} < \big(p, e(\omega)\big)_{\mathbb{R}^N} \quad \mu\text{-a.e. on } D \tag{7.12}$$
and
$$h(\omega) \in G(\omega) \quad \mu\text{-a.e. on } D. \tag{7.13}$$
Define $\hat h : \Omega \to \mathbb{R}^N_+$ by
$$\hat h(\omega) = \begin{cases} h(\omega) & \text{if } \omega \in D, \\ e(\omega) & \text{if } \omega \in D^c. \end{cases}$$
Evidently $\hat h \in S^1_G$ (see (7.13)). Moreover, we have
$$\int_\Omega (p, \hat h)_{\mathbb{R}^N}\, d\mu = \int_D (p, h)_{\mathbb{R}^N}\, d\mu + \int_{D^c} (p, e)_{\mathbb{R}^N}\, d\mu < \int_D (p, e)_{\mathbb{R}^N}\, d\mu + \int_{D^c} (p, e)_{\mathbb{R}^N}\, d\mu \ \ \text{(see (7.12))} = \int_\Omega (p, e)_{\mathbb{R}^N}\, d\mu,$$
which contradicts (7.10). This proves that (7.11) is true. Next we show that
$$\big(p, u(\omega)\big)_{\mathbb{R}^N} = \big(p, e(\omega)\big)_{\mathbb{R}^N} \quad \mu\text{-a.e. on } \Omega. \tag{7.14}$$
Because of (7.11), we have that $\big(p, e(\omega)\big)_{\mathbb{R}^N} \le \big(p, u(\omega)\big)_{\mathbb{R}^N}$ $\mu$-a.e. on $\Omega$. If this inequality is strict for all $\omega \in E$, with $E \in \Sigma$ and $\mu(E) > 0$, then
$$\Big(p, \int_\Omega u\, d\mu\Big)_{\mathbb{R}^N} = \Big(p, \int_E u\, d\mu\Big)_{\mathbb{R}^N} + \Big(p, \int_{E^c} u\, d\mu\Big)_{\mathbb{R}^N} = \int_E (p, u)_{\mathbb{R}^N}\, d\mu + \int_{E^c} (p, u)_{\mathbb{R}^N}\, d\mu$$
$$> \int_E (p, e)_{\mathbb{R}^N}\, d\mu + \int_{E^c} (p, e)_{\mathbb{R}^N}\, d\mu = \Big(p, \int_\Omega e\, d\mu\Big)_{\mathbb{R}^N},$$
a contradiction to the fact that Ω udµ = Ω edµ (recall u ∈ C(E)). So we have proved that (7.14) holds and so u(ω) ∈ b(ω, p) µ-a.e. on Ω. It remains to show that u(ω) ∈ d(ω, p) µ-a.e. on
Ω. By virtue of hypothesis H(iii), we have µ(S) > 0 where S = {ω ∈ Ω : p, e(ω) RN > 0}. Fix ω ∈ S. We can find v0 ∈ RN \ {0} such
+ that (p, v0 )RN < p, e(ω) . Let y ∈ RN + be such that (p, y)RN ≤ p, e(ω) RN . We set yt = tv0 + (1 − t)y for t ∈ (0, 1). Note that yt −→ y as t −→ 0. But because of hypothesis H(v), G(ω)c is closed and so y ∈ / G(ω). This proves that u(ω) ∈ d(ω, p). From this and hypothesis H(vii) it follows that (p, v)RN > 0 for all v ∈ RN + \ {0}, N hence p ∈ int RN we have proved that p ∈ int R and u(ω) ∈ d(ω, p) for + . Therefore +
all ω ∈ S. If ω ∈ S c , then p, e(ω) RN = 0. But because p ∈ int RN + , we deduce that b(ω, p) = {0}. Hence u(ω) = 0 a.e. on S c . Finally we can say that u(ω) ∈ d(ω, p) µa.e. on Ω and p ∈ int RN + , which implies that (u, p) ∈ W (E) (i.e., C(E) = W (E)). Thus far we have introduced the notion of markets with a continuum of traders, we have demonstrated their significance as mathematical models for the intuitive concept of perfect competition, and we proved that under very general conditions, the core of such markets equals the set of all Walras (competitive) equilibria. However, we did not establish the existence of a Walras equilibrium allocation. It may well happen that both sets C(E) and W (E) are empty. In the second part of this section we fill-in this gap by showing that W (E) = ∅ (hence C(E) = ∅ because we always have W (E) ⊆ C(E)). To do this we have to strengthen the hypotheses of our model. We still consider markets with a continuum of agents. So (Ω, Σ, µ) is a nonatomic finite measure space. The nonatomicity of µ expresses the fact that we have perfect competition. We can think of atoms as being oligopolistic agents. The commodity space is RN ; that is, there are N -different commodities traded in the market. Each N trader can trade things in RN + ; that is, for all ω ∈ Ω, F (ω) ∈ R+ . For each trader N (agent) ω ∈ Ω, there is defined on R+ (the set of feasible consumption plans of each trader ω ∈ Ω) a binary relation ω , known as the preference-indifference relation, which is assumed to be reflexive, transitive, and complete. From ω we define the relations ≺ω and ∼ω , called the preference and indifference relations, respectively, as follows. x ≺ω y
if and only if x ω y but not y ω x
(7.15)
x ∼ω y
if and only if x ω y and y ω x.
(7.16)
DEFINITION 7.1.8 We say that the commodity vector x ∈ RN + saturates a trader’s ω ∈ Ω desire, if for all y ∈ RN + , we have y ω x. Also there is an initial endowment e ∈ L1 (Ω, RN + ). Recall that e(ω) presents the commodity vector with which trader ω ∈ Ω comes to the market. So we have an
economy E = (Ω, Σ, µ), RN + , , e . The precise hypotheses on the characteristics of this economy, are the following. H : (i) (Ω, Σ, µ) is a finite nonatomic measure space. (ii) For µ-a.a. ω ∈ Ω, the set F (ω) of all feasible consumption vectors is RN +.
N (iii) There is an initial endowment e ∈ L1 (Ω, RN + ) such that e(ω) ∈ intR+ µ-a.e. on Ω.
(iv) For µ-a.a. ω ∈ Ω, there is a reflexive, transitive, and complete binary relation ω defined on RN + ; this is the preference–indifference relation for trader ω ∈ Ω and from it we can define a preference relation ≺ω and an indifference relation ∼ω as indicated in (7.15) and (7.16), respectively. (v)
N N N (ω, x, y) ∈ Ω × RN + ×R+ : x ≺ω y ∈ Σ×B(R+ )×B(R+ ).
N (vi) For µ-a.a. ω ∈ Ω and all y ∈ RN + , the sets {x ∈ R+ : y ≺ω x} are open.
(vii) For µ-a.a. ω ∈ Ω, unless the commodity bundle y ∈ RN + saturates the agent’s desire, we have that x − y ∈ int RN + implies y ω x. (viii) There is an allocation v ∈ L1 (Ω, RN + ) such that for µ-a.a. ω ∈ Ω, v(ω) saturates the desire of the trader (ix) For µ-a.a. ω ∈ Ω, the commodity bundle x ∈ RN + can not saturate the desire of the agent unless x − e(ω) ∈ int RN +. REMARK 7.1.9 Hypothesis H (vi) is a continuity in the commodity space for the preference relation ≺ω . In fact it can be shown that this hypothesis together with hypothesis H (iv), implies that there is a continuous utility function x −→ u(ω, x) for each trader ω ∈ Ω (see Debreu [185, p. 56]). Then hypothesis H (v) implies that the utility function (ω, x) −→ u(ω, x) is jointly measurable (a Carath´eodory function). Hypothesis H (vii) is a weak desirability hypothesis. The existence of the special allocation in hypothesis H (viii) is intuitively very acceptable. It says that there is an upper bound on the amount of a commodity that can be profitably used by a trader, no matter what other commodities are or are not available. The fact that v is an allocation (i.e., v ∈ L1 (Ω, RN + )), simply means that the economy as a whole can be saturated, namely the commodity vector Ω vdµ can be distributed among the traders in such a way as to saturate each trader’s desire. Under these hypotheses, we can prove the following existence theorem. THEOREM 7.1.10 If the economy E = (Ω, Σ, µ), RN satisfies hypotheses + , , e H , then W (E) = ∅. PROOF: For each trader ω ∈ Ω and each price vector p ∈ RN + , we define the budget set
b(ω, p) = x ∈ RN + : (p, x)RN ≤ p, e(ω) RN and the preferred set for all y ∈ b(ω, p) . ξ(ω, p) = x ∈ RN + : y ω x N and then set We define V (ω) = x ∈ RN + : v(ω) − x ∈ R+ ϑ(ω, p) = V (ω) ∩ ξ(ω, p).
Claim 1: For µ-a.a. ω ∈ Ω, p −→ ϑ(ω, p) is continuous on RN \{0}. N Fix ω ∈ Ω\N, µ(N ) = 0. Then for every p ∈ RN + , ϑ(ω, p) ⊆ V (ω) ∈ Pk (R ). So in order to prove that p −→ ϑ(ω, p) is usc, itsuffices to show that it has a closed graph (see Proposition 6.1.10). To this end let (pn , xn ) n≥1 ⊆ Gr ϑ(ω, ·) and assume that
/ ϑ(ω, p). pn −→ p = 0, xn −→ x in RN + . Evidently x ∈ V (ω). Suppose that x ∈ Then x ∈ / ξ(ω, p) and this means that we can find y ∈ b(ω, p) such that x ≺ω y. By virtue of hypothesis H (vi) we can find z ∈ RN + close to y such that
x ≺ω z and (p, z)RN < p, e(ω) RN
(recall that because of hypothesis H (iii) p, e(ω) RN > 0). Therefore we can find n0 ≥ 1 large such that
xn ≺ω z and (pn , z)RN ≤ pn , e(ω) RN for all n ≥ n0 , / ξ(ω, pn ) ⇒ xn ∈
for all n ≥ n0 , a contradiction.
This proves that Gr ϑ(ω, ·) is closed and so p −→ ϑ(ω, p) is usc. Next we show that p −→ ϑ(ω, p) is lsc. Assume that pn −→ p = 0 and let x ∈ ϑ(ω, p). If x saturates the desire of trader ω ∈ Ω, then x ∈ ξ(ω, pn ) for all n ≥ 1 and so if xn = x for all n ≥ 1 we have xn −→ x with xn ∈ ξ(ω, pn ) n ≥ 1. So we may assume that x does not saturate the desire of trader ω ∈ Ω. Because ϑ(ω, pn ) ∈ Pk (RN + ) we can find xn ∈ ϑ(ω, pn ) such that
d x, ϑ(ω, pn ) = x − xn . Let ε > 0 and set sε = (ε, . . . , ε) ∈ int RN + , yε = x + sε . Then by hypothesis H (vii), we have x ≺ ω yε .
Then either for all ε > 0 and all n ≥ 1 large, yε ∈ ϑ(ω, pn ) or else for some ε > 0, we can find a subsequence {nk }k≥1 of {n} such that yε ∈ / ϑ(ω, pn ). In the first case, we have √ xn − x ≤ yε − x ≤ ε N . Because ε > 0 was arbitrary, it follows that xn −→ x and because xn ∈ ϑ(ω, pn ), n ≥ 1, we are done. In the second case, we can find znk ∈ ϑ(ω, pn ) such that yε ≺ω znk . Because N N {z }k≥1⊆V (pnk , znk )RN ≤ n k
(ω)∈Pk (R+ ), we may assume that znk−→z in R . Since
pnk , e(ω) RN , in the limit as k −→ ∞ we obtain (p, z)RN ≤ p, e(ω) RN and so z ∈ b(ω, p). Also by virtue of hypothesis H (vi), we have yεωz and because x ≺ω yε , hypothesis H (iv) implies that x ≺ω z, a contradiction to the fact that x ∈ ϑ(ω, p). This proves that p −→ ϑ(ω, p) is lsc. So we conclude that p −→ ϑ(ω, p) is continuous (and h-continuous; see Corollary 6.1.40). Claim 2: For all p ∈ RN + \{0}, ω −→ ξ(ω, p) and ω −→ ϑ(ω, p) are graph measurable multifunctions. Clearly the budget multifunction ω −→ b(ω, p) is Pf (RN )-valued and measurable. So by virtue of Theorem 6.3.18, we can find hn : Ω −→ RN + , n ≥ 1 Σ–measurable maps such that b(ω, p) = {hn (ω)}n≥1 for all ω ∈ Ω.
Invoking hypothesis H (vi) we see that # ξ(ω, p) = x ∈ RN : hn (ω) ω x n≥1
⇒ ω −→ ξ(ω, p) is graph-measurable (see hypothesis H (v)), ⇒ ω −→ ϑ(ω, p) = V (ω) ∩ ξ(ω, p) is graph-measurable. Claim 3: p −→
Ω
ϑ(ω, p)dµ is Pkc (RN ) follows from Theorem 6.4.45 and 6.4.23.
Note that
h
ϑ(ω, p)dµ, Ω
ϑ(ω, p )dµ
Ω
≤ h ϑ(ω, p), ϑ(ω, p ) dµ for all p, p ∈ RN + \ {0}, Ω
⇒ p −→ ϑ(ω, p)dµ is continuous by virtue of Claim 1. Ω
Using Claim 3 we can consider the metric projection on the set denoted by proj ·; Ω ϑ(ω, p)dµ . Set d(p) = proj Ω edµ; Ω ϑ(ω, p)dµ .
Ω
ϑ(ω, p)dµ
Claim 4: p −→ d(p) is continuous.
Suppose that pn −→ p and let d(pn ) = proj Ω edµ; Ω ϑ(ω, pn )dµ , n ≥ 1. By virtue of Proposition 6.1.13 and Claim 3, {d(p n }n≥1 is relatively compact. So we can find a subsequence d(pnk ) k≥1 of d(pn ) n≥1 such that d(pnk ) −→ w in RN + . We have
Ω
edµ − d(pnk ) = d edµ, ϑ(ω, pnk )dµ . Ω
From Claim 3 and Proposition 6.6.32, we have
edµ, ϑ(ω, pnk )dµ −→ d edµ, ϑ(ω, p)dµ d Ω Ω Ω
Ω
and edµ − d(pnk ) −→ edµ − w. Ω
(7.17)
Ω
(7.18) (7.19)
Ω
Combining (7.17) through (7.19) we obtain
edµ − w = d edµ, ϑ(ω, p)dµ , Ω Ω
Ω ⇒ w = proj edµ; ϑ(ω, p)dµ = d(p). Ω
Ω
Then by Urysohn’s criterion for convergence of sequences we have d(pn ) −→ w = d(p) and so we conclude that p −→ d(p) is continuous. N Claim 5: For every p ∈ RN + \{0}, d(p)− Ω edµ ∈ R+ . Suppose that the claim is not true. Then without any loss of generality we may as
N 1 N sume that d1 (p) < Ω edµ (here d(p) = dk (p) k=1 ∈ RN and e = (ek )N k=1 ∈ L (Ω, R+ )). 1 By definition d(p) = Ω udµ with u ∈ Sϑ(·,p) . Let y = (v1 , u2 , . . . , uN ), where
N N v = (vk )N k=1 (see hypothesis H (viii)) and u = (uk )k=1 . Then y(ω) − u(ω) ∈ R+ and v(ω)−y(ω) ∈ RN for all ω ∈ Ω and so it follows that y(ω) ∈ ϑ(ω, p) µ-a.e. on Ω. + Hence
y1 dµ, d2 (p), . . . , dN (p) = ydµ ∈ ϑ(ω, p)dµ. Ω Ω Ω Recall that d1 (p) < Ω e1 dµ and by hypothesis H (ix), we have Ω e1 dµ < Ω v1 dµ. So we can find λ ∈ (0, 1) such that Ω e1 dµ = λ Ω v1 dµ + (1 − λ)d1 (p). Set w = λy + (1 − λ)u and z = Ω wdµ. Then z ∈ Ω ϑ(ω, p)dµ (see Claim 3) and z = Ω e1 dµ, d2 (p), . . . , dN (p) . We have
z −
N 2 2
edµ = ek dµ dk (p) −
Ω
Ω
k=2
<
N
dk (p) −
k=1
= d(p) −
2
ek dµ
(because d1 (p) <
Ω
e1 dµ) Ω
2 edµ ,
Ω
a contradiction to the definition of d(p). This proves the claim. N Now let γ(p) = d(p) − Ω edµ ∈ R + for all 5) and p −→ γ(p) p (see Claim is continuous. We define c(p) = proj Ω edµ; Ω ξ(ω, p)dµ . This is well-defined because Ω ξ(ω, p)dµ ∈ Pkc (RN ). We set η(p) = c(p)− Ω edµ. Claim 6: γ(p) = η(p) for all p. We fix p and write γ = γ(p), η = η(p), c = c(p), and d = d(p). If γ = 0, then we are done. So suppose that γ = 0. From the definition of γ, the hyperplane through d perpendicular to γ supports the compact convex set Ω ϑ(ω, p)dµ. This means that
for all x ∈ ϑ(ω, p)dµ − edµ. (7.20) (x, γ)RN ≥ γ2 Ω
Ω
Suppose that there is a point in Ω ξ(ω, p)dµ which is closer to Ω edµ, than d is. Then we can find y ∈ Ω ξ(ω, p)dµ− Ω edµ which is closer to the origin than γ is. We have y2 < γ2 . (7.21)
2 2 2 2 Moreover, y − 2(y, γ)RN + γ = y − γ > 0. Hence y > (y, γ)RN + (y, γ)RN − γ2 . If (y, γ)RN − γ2 ≥ 0, then y2 > (y, γ)RN ≥ γ2 , which contradicts (7.21). Therefore (y, γ)RN < γ2 . (7.22) 1 We know that y = Ω udµ− Ω edµ with u ∈ Sξ(·,p) . If u(ω, x) is the utility function corresponding to ω −→ω (see hypothesis H (iv)
and Remark 7.1.9). Then u ω, u(ω) on Ω. ∈ ϑ(ω, p) and u ω, u(ω) ≤ v(ω) µ-a.e. Set w(ω) = u ω, u(ω) . We have Ω wdµ ∈ Ω ϑ(ω, p)dµ and Ω wdµ − Ω edµ ≤ y N (i.e., y − Ω wdµ + Ω edµ ∈ RN + ). Because γ ∈ R+ (see Claim 3), it follows that
wdµ − edµ, γ RN ≤ (y, γ)RN ,
Ω
Ω
⇒ wdµ − edµ, γ RN < γ2 (see (7.22)). (7.23) Ω
Ω
7.1 Perfectly Competitive Economies: Core and Walras Equilibria But because
Ω
wdµ −
Ω
edµ ∈
Ω
ϑ(ω, p)dµ −
wdµ −
Ω
edµ, γ
Ω
RN
Ω
edµ, from (7.20) we have
≥ γ2 .
(7.24)
Comparing (7.23) and (7.24), we reach a contradiction. This proves the claim. N N pk = Next let P be the standard price simplex; that is, P = p = (pk )N k=1 ∈ R+ :
1 . Let ϕ : P −→ P be defined by
k=1
p + η(p) .
1
ϕ(p) = 1+
N
ηk (p)
k=1
By virtue of Claim 6, ϕ is continuous and by Brouwer’s fixed point theorem (see Theorem 3.5.3), we can find q ∈ P such that ϕ(q) = q. Then 1+
N
ηk (q) q = q + η(q)
k=1
⇒ η(q) = µq
with µ =
N
η(q) ≥ 0
(see Claim 6).
(7.25)
k=1
We show that η(q) = 0. Suppose that this is not true. From the definition of η and the convexity of Ω ξ(ω, p)dµ, we see that the hyperplane through η(p) + Ω edµ which is perpendicular to η(p) supports the closed, convex set Ω ξ(ω, p)dµ. So if p = q, we have
edµ, η(q) RN ≥ η(q)2 for all y ∈ ξ(ω, p)dµ. y− Ω
Ω
Because η(q) = 0, from (7.25) we have γ > 0 and
edµ, η(q) RN ≥ µ2 q2 > 0 for all y ∈ ξ(ω, p)dµ. y− Ω
(7.26)
Ω
Using hypothesis H (v) and the Yankov–von Neumann–Aumann selection theorem (see Theorem 6.3.20), we can find u ∈ L1 (Ω, RN + ) such that u(ω) ∈ b(ω, q) µ-a.e. and is ≺ω -maximal for µ-a.e. ω ∈ Ω. We have
u(ω) − e(ω), q RN ≤ 0 µ-a.e. on Ω (7.27) u(ω) ∈ ξ(ω, p)
and
µ-a.e. on Ω.
(7.28)
Integrating (7.27) and (7.28) we obtain
udµ − edµ, q RN ≤ 0 with udµ ∈ ξ(ω, p)dµ. Ω
Ω
Ω
(7.29)
Ω
Comparing (7.26) and (7.29), we reach a contradiction. Therefore η(q) = 0. 1 From this we infer that there exists s ∈ Sξ(·,q) such that
542
7 Economic Equilibrium and Optimal Economic Planning
sdµ = edµ. Ω
(7.30)
Ω
We claim that (s, q) is a Walras equilibrium (see Definition 7.1.5(b)). To estab1 lish this, it suffices to show s ∈ Sb(·,q) . From hypotheses H (viii) and (ix), we see that
q, e(ω) RN ≤ q, s(ω) RN µ-a.e. on Ω. If this inequality is strict on a set of positive µ-measure, we contradict (7.30). Therefore
q, s(ω) RN = q, e(ω) RN µ-a.e. on Ω, ⇒ s(ω) ∈ b(ω, q)
µ-a.e. on Ω.
REMARK 7.1.11 Walras equilibria are significant only in a perfect competition situation; that is, the market consists of a large number of individually insignificant traders. Theorem 7.1.10, illustrates that in a perfect competition situation, convex preferences are not needed to prove existence.
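Before passing to dynamic models, it may help to see Definition 7.1.5 at work in the simplest computational setting. The sketch below is ours, not from the text: it uses finitely many agents with Cobb–Douglas utilities, so it does not reproduce the continuum-of-agents framework of this section and only illustrates how a market-clearing price is found. With $u_i(x) = \sum_k a_{ik} \log x_k$ (weights summing to one) the demand set of Definition 7.1.5(a) is the singleton $d_i(p)_k = a_{ik}\, (p, e_i)_{\mathbb{R}^N} / p_k$, and a Walras equilibrium price makes aggregate demand equal aggregate endowment; a classical damped tâtonnement iteration (again, not the chapter's method) finds it for the made-up data below.

```python
import numpy as np

# Toy finite-agent exchange economy with Cobb-Douglas preferences (all data made up).
# Agent i: utility weights a[i] (rows sum to 1), endowment e[i];
# demand at prices p:  d_i(p) = a[i] * (p . e[i]) / p   (componentwise).

a = np.array([[0.6, 0.4],
              [0.2, 0.8],
              [0.5, 0.5]])            # preference weights
e = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])            # initial endowments
supply = e.sum(axis=0)

def excess_demand(p):
    income = e @ p                     # (p, e_i) for each agent
    demand = a * income[:, None] / p   # Cobb-Douglas demands
    return demand.sum(axis=0) - supply

p = np.ones(2) / 2                     # start from equal prices
for _ in range(2000):                  # damped tatonnement-style iteration
    p = p * (1.0 + 0.1 * excess_demand(p) / supply)
    p = p / p.sum()                    # prices matter only up to a positive scalar

print("equilibrium prices:", p)            # ~ (0.409, 0.591) for this data
print("excess demand:", excess_demand(p))  # ~ (0, 0): markets clear
```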
7.2 Infinite Horizon Multisector Growth Models In this section we turn our attention to dynamic economic models and we examine optimal growth models in discrete time. An important feature of such models is the infinite (unbounded) planning horizon. This results from purely economic considerations. For example, one of the main problems in dynamic economic planning is that of determining what amounts of resources should be allocated to the production of consumption goods and what amounts should be allocated to the production of capital equipment that will be used in the production of future consumption goods. The only rational way to address this question is by considering an infinite time horizon. It is true that we usually hear about five or ten year plans, which set targets for the level of capital accumulation. But then it is natural to ask about the criteria that led to these target levels. The only reasonable answer to this question can be some sort of trade-off between the sacrifices required to accumulate the capital during the planning period and the benefits that will result from the future use of the accumulated capital. In other words, a finite planning horizon requires some method of evaluating end-of-period capital stocks or assets and the only proper evaluation is their value in use in the subsequent future. But, if this is so, then the planning decisions have actually been based on considerations about times well beyond the period of the plan and this will be true independent of whether the planning period is short or long. Therefore the proper way to model this mathematically is through an infinite horizon. So the open endedness of the future in the model is important, because it expresses the fundamental economic fact that the consequences of investment are long-lived. If the criterion to be maximized (the intertemporal utility) is discounted at every step, then an optimal program can be obtained easily using the direct method of calculus of variations. For such a situation the interesting question is to
7.2 Infinite Horizon Multisector Growth Models
characterize the optimal programs (paths) using decentralizable conditions. If the discount factor δ equals 1 (i.e., we have an undiscounted instantaneous utility), then the intertemporal utility integral need not converge and we face some serious mathematical difficulties. For this reason we introduce some weaker notions of optimality, which we discuss in the second part of this section. In both the discounted and undiscounted cases, the growth model is a multisector model meaning that there is more than one commodity in the model. We start the discussion with the discounted model. In this case the model is N N described by a triplet (G, u, δ), where G ⊆ RN + ×R+ is the technology set, u : R+ −→ R is the utility function, and δ ∈ (0, 1) is the discount factor. So in our model the commodity space is RN , in other words there are N commodities. In the commodity N N space RN we use the l1 -norm x = |xk | for all x = (xk )N k=1 . Also for x, y ∈ R k=1
we use the following notation:
$$y \le x \quad \text{if and only if } y_k \le x_k \text{ for all } k \in \{1, \ldots, N\},$$
$$y < x \quad \text{if and only if } y \le x \text{ and } y \ne x,$$
$$y \ll x \quad \text{if and only if } y_k < x_k \text{ for all } k \in \{1, \ldots, N\}$$
(i.e., x − y ∈ intRN + ). A pair (x, y) ∈ G represents a technologically feasible production pair, where x stands for the initial stock of inputs and y stands for the final output, which can be produced using as inputs the vector x. The hypotheses on the model (G, u, δ) are the following. N H 1 : G ⊆ RN + ×R+ is a nonempty closed set such that
(i) There exists M > 0 such that if (x, y) ∈ G and x ≥ M , then y ≤ x. (ii) If (x, y) ∈ G and x ≤ x , 0 ≤ y ≤ y, then (x , y ) ∈ G. (iii) There exists (x, y) ∈ G such that x ! y. REMARK 7.2.1 Note that we do not assume that the technology set is convex. Hypothesis H1 (i) in economic terms says that if the input exists in sufficiently large quantities, then we have losses due to depreciation. From a mathematical viewpoint, this hypothesis guarantees the boundedness of the set of feasible programs (paths). Hypothesis H1 (iv) is the free disposability hypothesis, very common in economic models. It says that we are always open to accept a supplement of goods, even if we do not have to do anything with them. Finally hypothesis H1 (iii) means that there is a feasible production pair in which the output is strictly bigger than the input. H2 : u : R N + −→ R is continuous. REMARK 7.2.2 Note that no concavity assumption is made on u. H3 : 0 < δ < 1.
DEFINITION 7.2.3 A program (path) starting at y ∈ RN + , is a sequence {(xn , yn )}n≥0 such that y0 = y, 0 ≤ xn ≤ yn , and (xn , yn+1 ) ∈ G for all n ≥ 1. Associated with such a program {(xn , yn )}n≥0 is a consumption sequence {cn }n≥0 defined by cn = yn − xn ≥ 0 for all n ≥ 0. The utility function u at each stage is evaluated at the consumption level (i.e., we consider u(cn ), n ≥ 0) and it is discounted by the factor δ ∈ (0, 1). So the intertemporal utility generated by a consumption sequence {cn }n≥0 , is given by n δ u(cn ). (7.31) n≥0
LEMMA 7.2.4 If hypotheses H1 hold and (x, y) ∈ G, then (a) x ≤ M implies y ≤ M ; (b) y ≤ max{x, M }. PROOF: (a) We argue indirectly. So suppose that x ≤ M and y > M . Set x = x + (M − x)(e/N ) with e = (1, . . . , 1) ∈ int RN + . Then x ≥ x and so by hypothesis H1 (ii) (the free disposability hypothesis), we have (x , y) ∈ G. Note that x = M and so by virtue of hypothesis H1 (i) we have that y ≤ M , a contradiction. (b) If x ≤ M , then from part (a) we have y ≤ M . This combined with hypothesis H1 (iii) implies that y ≤ max{x, M }. LEMMA 7.2.5 If hypotheses H1 hold and {(xn , yn )}n≥0 is a program starting at y ∈ RN + , then xn , yn , cn ≤ M1 = max{y, M } for all n ≥ 0. PROOF: From Definition 7.2.3 we have that 0 ≤ cn = yn − xn ≤ yn
for all n ≥ 0.
Therefore it suffices to show that yn ≤ M1 for all n ≥ 0. Note that y0 = y and so y0 ≤ M1 . We proceed by contradiction. So suppose that yn ≤ M1 for some n ≥ 1. Then by virtue of Lemma 7.2.4(b) and because (xn , yn+1 ) ∈ G (see Definition 7.2.3), we have xn+1 ≤ yn+1 ≤ max{xn , M } ≤ M1
(by the induction hypothesis).
This implies that for every program {(xn , yn )}n≥0 emanating from y ∈ RN + , the intertemporal utility given in (7.31) is absolutely summable. So we can make the following definition. DEFINITION 7.2.6 A program {(x∗n , yn∗ )}n≥0 starting at y ∈ RN + is said to be optimal if n ∗ n δ u(cn ) ≤ δ u(cn ) n≥0
n≥0
for every other program {(xn , yn )}n≥0 starting from y. Here cn = yn − xn , c∗n = yn∗ −x∗n ≥ 0 for all n ≥ 0.
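For a concrete feel of Definition 7.2.6 and the intertemporal utility (7.31), the sketch below (ours, not from the text) treats the one-sector case $N = 1$ with the made-up specification $G = \{(x, y) \in \mathbb{R}^2_+ : y \le x^{\alpha}\}$, $u(c) = \sqrt{c}$, and discount factor $\delta \in (0,1)$; hypotheses H1, H2, H3 hold for this choice (with $M = 1$). The optimal value $V(y)$ of (7.31) over all programs starting at $y$ then satisfies the standard dynamic-programming equation $V(y) = \max_{0 \le x \le y}\,[\,u(y - x) + \delta V(x^{\alpha})\,]$, and value iteration on a grid approximates it together with an optimal program. The grid, parameter values, and variable names are our own choices.

```python
import numpy as np

# One-sector discounted growth model (illustrative data): technology y <= x**alpha,
# utility u(c) = sqrt(c). A program picks an input 0 <= x_n <= y_n, produces
# y_{n+1} = x_n**alpha and consumes c_n = y_n - x_n, so
# V(y) = max_{0 <= x <= y} [ u(y - x) + delta * V(x**alpha) ].

alpha, delta = 0.5, 0.95
grid = np.linspace(0.0, 1.5, 301)                 # output levels y (feasible outputs stay bounded)
u = np.sqrt

X, Y = np.meshgrid(grid, grid)                    # X[i, j] = input grid[j], Y[i, j] = output grid[i]
feasible = X <= Y
reward = np.where(feasible, u(np.clip(Y - X, 0.0, None)), -np.inf)
next_y = X ** alpha                               # next period's output when input X is used

V = np.zeros_like(grid)
for _ in range(600):                              # Bellman (value) iteration; contraction modulus delta
    cont = np.interp(next_y.ravel(), grid, V).reshape(next_y.shape)
    values = reward + delta * cont                # -inf wherever the input exceeds the output
    V_new = values.max(axis=1)
    diff = np.max(np.abs(V_new - V))
    V = V_new
    if diff < 1e-10:
        break

policy = grid[values.argmax(axis=1)]              # optimal input x*(y) on the grid
print("V(1.0)  =", np.interp(1.0, grid, V))
print("x*(1.0) =", np.interp(1.0, grid, policy))
```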
THEOREM 7.2.7 If hypotheses H1 , H2 , and H3 hold and y ∈ RN + , then there exists an optimal program starting at y. PROOF: Let F (y) be the set of all consumption sequences generated by programs ∞ N that start at y ∈ RN + . Note that F (y) ⊆ l (R )+ and from Lemma 7.2.4(b) we know that if (cn )n≥0 ⊆ F (y), then cn ≤ M1 for all n ≥ 0. Also because G is closed it is easy to see that F (y) is w∗ -closed in l∞ (RN ). Therefore by Alaoglu’s theorem F (y) N is w∗ -compact in l∞ (R n ). Similarly the intertemporal utility∗function U : F (y) −→ R defined by U (c) = δ u(cn ) for all c = (cn )n≥0 ∈ F (y) is w -continuous. Hence by n≥0
the Weierstrass theorem we can find c∗ ∈ F (y) such that U (c∗ ) realizes the maximum of U on F (y). Then the program {(x∗n , yn∗ )}n≥0 which generates the consumption sequence {c∗n }n≥0 is an optimal program. There is a close connection between price systems and optimal programs, which in some respects strongly resembles the relation existing between price systems and optimal programs in finite horizon problems. However, as we show there is a fundamental difference, which stems from the infinite horizon nature of the problem. In order to make all these facts precise, we need to introduce some additional concepts. DEFINITION 7.2.8 (a) The value function V : RN + −→ R is defined by n δ u(cn ), V (y) = n≥0
where {cn }n≥0 is the consumption sequence generated by an optimal program {(xn , yn )}n≥0 starting at y ∈ RN + (see Definition 7.2.6). (b) A sequence {(xn , yn , pn )}n≥0 } is a competitive program starting at y ∈ RN + , if {(xn , yn )}n≥0 } is a program starting at y ∈ RN , pn ∈ RN + for all n ≥ 0 and δ n u(c) − (pn , c)RN ≤ δ n u(cn )−(pn , cn )RN
for all c ∈ RN + , all n ≥ 0 (7.32)
and (pn+1 , y)RN −(pn , x)RN ≤ (pn+1 , yn+1 )RN −(pn , xn )RN
(7.33)
for all (x, y) ∈ G and all n ≥ 0. REMARK 7.2.9 Adding (7.32) and (7.33), we see that if {(xn , yn , pn )}n≥0 is a N competitive program starting at y ∈ RN + ; then for all (x, y) ∈ G, all c ∈ R+ , and all n ≥ 0 we have δ n u(c) + (pn+1 , y)RN − (pn , x + c)RN ≤ δ n u(cn ) + (pn+1 , yn+1 )RN − (pn , yn )RN .
(7.34)
In the definition of a competitive program, inequality (7.32) says that at every time period the consumption plan {cn }n≥0 maximizes the total utility among all other possible consumption vectors. The total utility is defined as δ n u(c) minus the present cost of the consumption (pn , c)RN . In inequality (7.33), we interpret the
quantity (pn+1 , y)RN −(pn , x)RN as the total profit from operating the technological process (x, y) ∈ G, when the price system {pn }n≥0 prevails in the market. Then inequality (7.33) says that a competitive program from y at every time period maximizes the total profit among all other technologically feasible input-output pairs. Finally inequality (7.34) says that a competitive program from y maximizes the sum of total utility and total profit at every time period in the set of all feasible input-output pairs. DEFINITION 7.2.10 A competitive program {(xn , yn , pn )}n≥0 is said to satisfy the transversality condition if lim (pn , xn )RN = 0. n→∞
REMARK 7.2.11 This condition means that the value of the input asymptotically equals zero. As we show in the sequel this is the condition that is characteristic of infinite horizon problems and which from an economic viewpoint is rather undesirable. But first let us state the theorem on the characterization of optimal programs. To do this we need to modify hypotheses H1 and H2 as follows. N H1 : G ⊆ RN + × R+ is a nonempty, closed, convex set that satisfies hypotheses H1 (i),(ii), and
(iii) (0, 0) ∈ G and if (0, y) ∈ G, then y = 0. (iv) There exists a pair (x, y) ∈ G such that y ∈ int RN . REMARK 7.2.12 Note that now the feasibility set G is convex. This is common hypothesis in economic models and simply means that technological processes can be mixed in arbitrary proportions. The two conditions in hypothesis H1 (iii) are very natural and simply mean that inaction is an option (first condition) and that a nonzero output cannot result from zero input (second condition). Finally hypothesis H1 (iv) means that there is an input x ∈ RN + that produces a strictly positive quantity of each commodity. This condition is a weaker version of hypothesis H1 (iii). Moreover, the input x ∈ RN + is often called sufficient. H2 : u : RN + −→ R is continuous, concave, and strictly increasing in the sense that if c ! c in RN + , then u(c ) < u(c). The next theorem characterizes optimal programs by means of competitive programs. Because the same result is proved in the more general context of stochastic models (see Theorem 7.4.10), here we simply state the theorem and postpone its proof until Section 7.4. THEOREM 7.2.13 If hypotheses H1 , H2 , H3 hold and {(xn , yn )}n≥0 is an optimal N program starting at y ∈ RN + , then there exists a price sequence {pn }n≥0 ⊆ R+ such that (a) The sequence {(xn , yn , pn )}n≥0 is competitive (see Definition 7.2.8(b)). (b) δ n V (y)−(pn , y)RN ≤ δ n V (yn )−(pn , yn )RN for any y ∈ RN + and n ≥ 0. (c) lim (pn , xn )RN = 0 (i.e., the transversality condition holds). n→∞
7.2 Infinite Horizon Multisector Growth Models
547
REMARK 7.2.14 According to (a) the optimal program {(xn , yn )}n≥0 satisfies certain support properties for the technology and utility explained in Remark 7.2.9 (see (7.32) and (7.33)). Condition (b) implies that at every stage we have minimization of the cost among all programs producing no less future value. Finally condition (c) is the transversality condition, which distinguishes infinite horizon from finite horizon problems. Conditions (b) and (c) are not independent, as the next proposition shows. PROPOSITION 7.2.15 If hypotheses H1 , H2 , H3 hold and {(xn , yn , pn )}n≥0 is a competitive program starting at y ∈ RN + , then (b) and (c) in Theorem 7.2.13 are equivalent. PROOF: (b)⇒(c): In statement (b) let y = 0. Then we have
0 ≤ (pn , yn )RN ≤ δ n V (yn ) − V (0) .
(7.35)
By hypothesis H2 , u is continuous. This combined with Lemma 7.2.5 implies that {V (yn )}n≥1 ⊆ R is bounded. Moreover, from Theorem 7.2.7, we have that V is defined on all of RN + . So from (7.35) we deduce that lim (pn , xn )RN = 0 (recall that n→∞
0 < δ < 1; see hypothesis H3 ). (c)⇒(b): Choose any k ≥ 1 and y ∈ RN + . Let {(xn , yn )}n≥0 be a program starting at y. Because by hypothesis {(xn , yn , pn )}n≥1 is a competitive program starting at y, from (7.34) we have for all n ≥ k
δ n u(cn−k )−u(cn ) ≤ (pn+1 , yn+1 )RN − (pn , yn )RN − (pn+1 , yn−k+1 )RN −(pn , yn−k )RN .
(7.36) Summing up (7.36) from n = k to n = k + k1 , k1 ≥ 1, we obtain
k+k1
δ n u(cn−k )−u(cn )
n=k
≤ (pk+k1 +1 , yk+k1 +1 )RN −(pk , yk )RN + (pk , y)RN . From this we obtain δk
k1
δ n u(cn )−
n=0
k+k1
δ n−k u(cn )
n=k
≤ (pk+k1 +1 , yk+k1 +1 )RN −(pk , yk )RN + (pk , y)RN .
(7.37)
Note that as k1 → +∞, the two sums in the left-hand side of (7.37) converge. So using the transversality condition (statement (c)), we obtain n−k δ n u(cn ) − δ u(cn ) ≤ (pk , y)RN −(pk , yk )RN . (7.38) δk n≥0
n≥k
But from the principle of optimality, {(xn+k , yn+k )}n≥0 is an optimal program starting at y ∈ RN + and
548
7 Economic Equilibrium and Optimal Economic Planning n−k n δ u(cn ) = δ u(cn+k ) = V (yk ). n≥k
(7.39)
n≥0
Using (7.39) in (7.38) and because {(xn , yn )}n≥0 is any program starting at y ∈ RN + , we have
δ k V (y) − V (yk ) ≤ (pk , y)RN −(pk , yk )RN . Now we show that a competitive program that satisfies the transversality condition is in fact optimal. THEOREM 7.2.16 If hypotheses H1 , H2 , H3 hold and {(xn , yn , pn )}n≥0 is a competitive program starting at y ∈ RN + that satisfies the transversality condition, then {(xn , yn )}n≥0 is an optimal program starting at y ∈ RN +. PROOF: From (7.34) we have k
k
δ n u(cn )−u(cn ) ≤ )RN −(pn , yn − yn )RN (pn+1 , yn+1 − yn+1
n=0
n=0
=
k
(pn+1 , xn+1 + cn+1 −xn+1 −cn+1 )RN
n=0
−(pn , xn + cn −xn −cn )RN
≤ (pk+1 , xk+1 )RN + (pk+1 , ck+1 − ck+1 )RN .
(7.40)
But from (7.32) we know that
(pk+1 , ck+1 −ck+1 )RN ≤ δ k u(ck+1 )−u(ck+1 ) .
(7.41)
Using (7.41) in (7.40), we obtain k+1
δ n u(cn )−u(cn ) ≤ (pk+1 , xk+1 )RN .
(7.42)
n=0
Passing to the limit as k−→+∞ in (7.42) and using the transversality condition, we conclude that {(xn , yn )}n≥0 is an optimal program starting at y ∈ RN +. As we already mentioned in Remark 7.2.11, the transversality condition is rather problematic from an economic viewpoint. Indeed, being asymptotic in nature, it is not a myopic behavior rule for an individual decision maker and for this reason it is unclear how this condition can be verified in a decentralized setting. So, we want to have a reformulation of Theorem 7.2.16 with the transversality hypothesis removed. For this purpose, we introduce the reachability condition. DEFINITION 7.2.17 We say that the discounted multisector growth model (G, u, δ) satisfies the reachability condition (R), if given any y ∈ RN + and any program {(xn , yn )}n≥0 starting at y ∈ RN + , there is an integer k0 ≥ 1 and a program {(xn , yn )}n≥0 starting at y such that yk 0 ≥ yk0 .
7.2 Infinite Horizon Multisector Growth Models
549
REMARK 7.2.18 In essence the reachability condition (R) may be rephrased as follows. The technological production possibilities are such that, beginning with a capital stock from which expansion of stocks are feasible (hypothesis H1 (iii)), it is possible (if need be through pure accumulation of capital over a sufficiently long period), to attain the stocks along any feasible program, at some future date. THEOREM 7.2.19 If hypotheses H1 , H2 , H3 hold, the reachability condition (R) is satisfied and {(xn , yn , pn )}n≥0 is a competitive program starting at any y ∈ RN +, then {(xn , yn )}n≥0 is an optimal program starting at y ∈ RN +. PROOF: Let c = 0 and apply (7.34) to (x, y) ∈ G, c = 0. We obtain δ n u(c) + (pn+1 , y)RN −(pn , x + c)RN = δ n u(0) + (pn+1 , y)RN −(pn , x)RN = δ n u(0) + (pn+1 , y − x)RN + (pn+1 − pn , x)RN ≤ δ n u(cn ) + (pn+1 , yn+1 )RN −(pn , yn )RN .
(7.43)
For k ≥ 2, we have k−1
δ n u(0) +
n=0
k−2
(pn+1 , y − x)RN + (pk , y)RN − (p0 , x)RN .
(7.44)
n=0
Consider the sequence {(xn , yn )}n≥0 defined by xn = xn+k , yn = yn+k
for all n ≥ 0.
{(xn , yn )}n≥0
is a program starting at yk . Invoking the reachability conEvidently dition (R) (see Definition 7.2.17), we can find a program {(xn , yn )}n≥0 starting at y and k0 ≥ 0 such that yk 0 ≥ yk0 = yk0 +k . (7.45) ) ∈ G cn = yn − xn , we have for all n ≥ 0. Applying (7.34) to (xn , yn+1 )RN − (pk+n , yn )RN δ k+n u(cn ) + (pk+n+1 , yn+1
≤ δ k+n u(ck+n ) + (pk+n+1 , yn+k+1 )RN − (pk+n , yn+k )RN .
(7.46)
We adopt the convention that if k0 = 0, the summation from 0 to k0 − 1, equals zero. Summing up (7.46) from n = 0 to n = k0 − 1, we have k0 −1
δ k+n u(cn ) + (pk+k0 , yk 0 )RN − (pk , y0 )RN
n=0 k0 −1
≤
δ k+n u(ck+n ) + (pk+k0 , yk+k0 )RN − (pk , yk )RN .
(7.47)
n=0
Recall that y0 = y, pn ≥ 0 for all n ≥ 0, and use (7.35). Then (7.47) yields k0 −1
δ k+n u(cn )−(pk , y)RN
n=0 k0 −1
≤
n=0
δ k+n u(ck+n )−(pk , yk )RN .
(7.48)
550
7 Economic Equilibrium and Optimal Economic Planning
From (7.44) and (7.48), for every k ≥ 2 we have k0 −1
k−1
δ n u(0) +
n=0
δ k+n u(cn ) +
n=0
k−2
(pn+1 , y − x)RN − (p0 , x)RN
n=0
k+k0 −1
≤
δ n u(cn ) − (p0 , y0 )RN .
(7.49)
n=0
From Lemma 7.2.5 we know that 91 = max{y, M } cn ≤ M1 = max{y, M } and cn ≤ M
for all n ≥ 0.
91 } and M3 = max |u(c)| : 0 ≤ c ≤ M2 e where e = Define M2 = max{M1 , M [1, 1, . . . , 1] ∈ RN . Then from (7.49), we have k−2
(pn+1 , y − x)RN
n=0 k+k0 −2
≤
δ n u(cn ) −
n=0
≤
k−1
k0 −1
δ n u(0) −
n=0
δ k+n u(cn ) + (p0 , x − y0 )RN
n=0
3M3 + (p0 , x − y0 )RN < +∞. 1−δ
So for any k ≥ 2
k−2
(pn+1 , y − x)RN < +∞.
(7.50)
n=0
Because y − x % 0 (see hypothesis H1 (iii)) and pn ≥ 0 for all n ≥ 0, from (7.50) we infer that pn < +∞, n≥0
⇒ pn −→ 0
as n → ∞.
Recall that xn ≤ M1 = max{y, M } for all n ≥ 0 (see Lemma 7.2.5). Therefore (pn , xn )RN −→ 0 as n → ∞. Now we can apply Theorem 7.2.16 and conclude that {(xn , yn )}n≥1 is an optimal program starting at y ∈ RN . Next we present an example of a linear model that satisfies the reachability condition (R). EXAMPLE 7.2.20 The dynamic Leontief model: Let yn ∈ RN be a vector whose ith component yni represents the output of the ith good in time period n ≥ 0. Let A = (aij )N i,j=1 be an N × N -matrix, where aij denotes the current input of the ith N good used to produce one unit of the jth good. Then Ay= aij yj is the vector of j=1 N with lk representing the amount inputs (raw materials). Also let l = (lk )N k=1 ∈ R
7.2 Infinite Horizon Multisector Growth Models
551
of labor required to produce one unit of the kth good. As before u : RN −→ R is a continuous function, representing the utility function and 0 < δ < 1 is the discount N factor. Then the technology feasibility set G ⊆ RN + × R+ is defined by: (x, y) ∈ G if and only if x ≥ Ay, x ≥ 0, y ≥ 0
and
(l, y)RN ≤ 1.
(7.51)
We impose the following conditions on the input coefficient matrix. (L1) A ≥ 0 in the sense that aij ≥ 0 for all i, j ∈ {1, . . . , N } and l % 0. (L2) A is productive, meaning that there is y % Ay and (l, y)RN ≤ 1. n From matrix theory n we know that under the above hypothesis A −→ 0 as n → ∞ −1 and (I − A) = A . Now we can verify that this model fits in the framework of n≥0
the previous analysis. Claim 1: If G is defined by (7.51) and A satisfies hypotheses (L1) and (L2), then G satisfies hypotheses H1 . Proof: Clearly we have the free disposability property H1 (ii). Let m = min lk . 1≤k≤N
Because by (L1) l % 0, we have m>0. Set M = 1/m. If (x, y) ∈ G, then (l, y)RN ≤ 1 N lk yk ≤ 1 and so y ≤ 1/m. Therefore we have (see (7.51)). Hence my ≤ k=1
satisfied hypothesis H1 (i). Finally set x = Ay. Then clearly (x, y) ∈ G and x ! y (see hypothesis (L2)). So we have satisfied hypothesis H1 (iii). N Claim 2: If G ⊆ RN + × R+ is defined by (7.51) and (L1) holds, then G satisfies hypotheses H1 if and only if hypothesis (L2) holds.
Proof: Note that if G satisfies H1 (iii), then it satisfies (L2). This combined with Claim 1 produces the desired equivalence. N Claim 3: If G ⊆ RN + × R+ is defined by (7.51), hypotheses (L1), (L2) hold and {(xn , yn )}n≥0 is a program starting at y ∈ RN + , then there is a program {(xn , yn )}n≥0 starting at y and an integer k0 ≥ 1 such that yk0 = yk0 .
Proof: From the proof of Claim 1, we know that yn ≤ M e for all n ≥ 1 with n e = (1, 1, . . . , 1) ∈ RN + . Because y % 0 and A −→ 0 as n → ∞, we can find k0 ≥ 2 such that Ak0 M e ! y. (7.52) We define a new program {(xn , yn )}n≥0 as follows. x0 = y0 = y, xn
= xn ,
yn
xn = Ak0 −n yk0 = yn =
yn
for 1 ≤ n ≤ k0 − 1,
for n ≥ k0 .
(7.53)
We also introduce the corresponding consumption sequence cn = yn − xn xn
yn
for n ≥ 0.
cn
Clearly ≥ 0, ≥ 0 for n ≥ 0, = 0 for 0 ≤ n ≤ k0 − 1, and cn = cn ≥ 0 for n ≥ k0 . Also y0 = y. So in order to have that {(xn , yn )}n≥0 is a program starting at y, we need to verify that Ayn+1 ≤ xn
and
(l, yn )RN
≤1
for all n ≥ 0 for all n ≥ 1.
(7.54) (7.55)
552
7 Economic Equilibrium and Optimal Economic Planning
From (7.53), we see that (7.54) and (7.55) hold for all n ≥ k0 . Because A ≥ 0 (see hypothesis (L1)), we have y1 = Ak0 −1 yk0 ≤ Ak0 −1 M e, ⇒ Ay1 ≤ Ak0 M e ! y = x0
(see (7.52)).
This proves (7.54) when n = 0. Next if 1 ≤ n ≤ k0 − 1, then because of (7.53), we have = AAk0 −(n+1) yk0 = xn . Ayn+1
This proves (7.54) when 1 ≤ n < k0 − 1. Finally, if n = k0 − 1, then Ayn+1 = Ayk 0 = Ayk0 = Ak0 −(k0 −1) yk0 = xk0 −1 = xn .
This proves (7.54) when n = k0 − 1 and so we have verified (7.54) for all n ≥ 0. It remains to verify (7.55) for 1 ≤ n ≤ k0 − 1. Because {(xn , yn )}n≥0 is a program starting at y ∈ RN + , we have Ayn+1 ≤ xn ≤ yn
for all n ≥ 0.
So for any integer i ≥ 1, we have Ai yn+i = Ai−1 yn+i ≤ Ai−1 yn+i−1
for all n ≥ 0.
Continuing this way until zero, we obtain Ai yn+i ≤ yn
for all n ≥ 0 and all i ≥ 1.
(7.56)
So from (7.53) for 1 ≤ n ≤ k0 − 1, after setting i = k0 − n in (7.56), we obtain yn = Ak0 −n yk0 ≤ yn . Because l % 0 (see hypothesis (L1)) and (l, yn )RN ≤ 1 for all n ≥ 1, we have (l, yn )RN ≤ (l, yn )RN ≤ 1
for 1 ≤ n ≤ k0 − 1.
Therefore we have verified (7.55) and we have proved Claim 3. Claim 3 implies that for the Leontief model the reachability condition (R) holds. So for the Leontief growth model Theorem 7.2.19 applies. Next we turn our attention to growth models that are undiscounted (i.e., δ = 1). In this case the sum describing the intertemporal utility along a program, need not converge. So in order to evaluate the performance of a program, we need to use a criterion different from the standard notion of optimality given in Definition 7.2.6. This leads us to the so-called weak maximality criterion. But first let us describe the model. As before the planning horizon is infinite, namely N0 = {0, 1, 2, . . .} (i.e., we deal with a discrete time, infinite horizon growth model). Also the commodity space is RN (there are N commodities in the market). The technological constraints are described by the following mathematical hypotheses.
7.2 Infinite Horizon Multisector Growth Models
553
N H4 : G : RN + −→ Pf c (R+ ) is a multifunction with a closed and convex graph such that
(i) There exists M > 0 such that |G(x)| ≤ sup{y : y ∈ G(x)} ≤ M for all x ∈ RN +. (ii) If x ≤ x , y ≤ y and y ∈ G(x), then y ∈ G(x ). (iii) There exists (x, y) ∈ Gr G such that x ! y. REMARK 7.2.21 Combining hypothesis H4 (i) with the fact that G has a closed graph, we infer that G is usc (see Proposition 6.1.10). Invoking the Kakutani–Ky Fan fixed point theorem (see Theorem 6.5.19), we have that Γ = {x ∈ RN + : x ∈ G(x)} = ∅. Hypothesis H4 (ii) is the free disposability hypothesis (see also hypothesis H1 (ii)). Hypothesis H4 (iii) is the same as hypothesis H1 (iii) and it says that the economy has an expansible commodity vector x ∈ RN +. Also we have a utility function u(x, y) that satisfies the following hypotheses. H5 : u : Gr G −→ R is an upper semicontinuous concave function such that x −→ u(x, y) is nondecreasing and y −→ u(x, y) is nonincreasing. DEFINITION 7.2.22 A feasible program is a sequence x = {xn }n≥0 , xn ∈ RN + for all n ≥ 0 such that xn+1 ∈ G(xn ) for all n ≥ 0. In what follows by F (x0 ) we denote the set of feasible programs starting at x0 ∈ RN +. REMARK 7.2.23 In contrast to the previous multisector growth model, in this one we have supressed the consumption sequence {cn }n≥0 . In the new model a feasible input-output pair (x, y) includes consumption activities as well as production activities. One might object that this way we are obscuring the distinction between the consumption and production activities. Although this may have some minor implications, such as, for example, obscuring the role of consumer satiation, in general it does not create any conceptual problems and it is reasonable from an economic viewpoint. Moreover, it is mathematically convenient. Due to the infinite planning horizon and the absence of a discount factor (in this model δ = 1), we have difficulties in defining an optimal feasible program, because the sum of the intertemporal utility need not converge. As a first attempt to overcome this difficulty, we introduce the following notion. DEFINITION 7.2.24 (a) Let {αn }n≥0 , {bn }n≥0 be two real sequences. We say m (αn − bn ) ≤ 0. that {bn }n≥0 catches up to {αn }n≥0 , if lim sup m→∞ n=0
(b) A feasible program x={xn }n≥0 starting from x0 ∈RN + is said to be strongly maximal (or optimal ) if and only if the sequence {u(xn , xn+1 )}n≥0 catches up to the sequence {u(yn , yn+1 )}n≥0 for any y = (yn )n≥0 ∈ F (x0 ), in other words lim sup m→∞
m
u(yn , yn+1 ) − u(xn , xn+1 )
n=0
for every y = (yn )n≥0 ∈ F (x0 ).
≤0
554
7 Economic Equilibrium and Optimal Economic Planning
REMARK 7.2.25 The catching-up criterion introduces an ordering on the real sequences that is reflexive and transitive. In this ordering, two sequences {αn }n≥0 , {bn }n≥0 are equivalent (each of them catching up the other) if and only m m (αn − bn ) = 0. The ordering is strict if lim sup (αn − bn ) < 0 or if lim m→∞ n=0 m
lim sup
(αn − bn ) = 0 and lim inf
m
m→∞ n=0
(αn − bn ) < 0. However, this ordering is
m→∞ n=0
m→∞ n=0
m (αn − bn ) and lim sup (bn − αn ) m→∞ n=0 m→∞ n=0 n may be strictly positive. To see this consider the sequences αn = (−1) n≥0 and bn = 1/(2n+2 ) n≥0 . In this case both lim sup are equal to 12 . For this reason instead of the strong maximality criterion, we prefer a weaker one, which nevertheless induces a total ordering.
not total. It may happen that both lim sup
m
DEFINITION 7.2.26 (a) Let {αn }n≥0 , {bn }n≥0 be two real sequences. We m (bn − an ) > 0 (equivalently say that {bn }n≥0 overtakes {αn }n≥0 if lim inf lim sup
m
m→∞ n=0
(αn −bn ) < 0).
m→∞ n=0
(b) A feasible program x = {xn }n≥0 starting at x0 ∈ RN + is said to be weakly maximal if and only if no element in F (x0 ) overtakes x, in other words lim inf m→∞
m
u(yn , yn+1 ) − u(xn , xn+1 ) ≤ 0
n=0
for every y = (yn )n≥0 ∈ F (x0 ). REMARK 7.2.27 Note that the overtaking criterion is total and another equivalent way to state it is to say that for every ε > 0 and every m0 ≥ 1, we can find m = m(ε, m0 , y) ≥ m0 such that m n=0
u(yn , yn+1 ) − ε ≤
m
u(xn , xn+1 ).
n=0
In a multisector growth model, of special interest are stationary programs, because they permit a balanced growth of the economy. This justifies the next definiN tion. First note that because Γ = Fix G = {x ∈ RN + : x ∈ G(x)} ∈ Pk (R+ ) and u is upper semicontinuous, the optimization problem u = sup[u(x, x) : x ∈ Γ]
(7.57)
has a solution. Moreover, due to hypothesis H4 (ii) (the free disposability) and hypothesis H5 , we can equivalently rewrite (7.57) as follows, u = max[u(x, y) : y ∈ G(x), x ≤ y].
(7.58)
Let S be the solution set of (7.57) (equivalently of (7.58)), (i.e., S = {x ∈ Γ : u = u(x, x)} = ∅).
7.2 Infinite Horizon Multisector Growth Models
555
DEFINITION 7.2.28 A weakly maximal stationary program is a vector x∗ ∈ S such that the constant (stationary) program x ∗ = x∗n = x∗ n≥0 is weakly maximal within F (x∗ ). We are looking for weakly maximal programs. To this end we start by producing a stationary price system that supports the optimization problem (7.58). PROPOSITION 7.2.29 If hypotheses H4 and H5 hold, then there exists p ∈ RN + such that for all (x, y) ∈ GrG we have u(x, y) + (p, y − x)RN ≤ u. PROOF: Let h : RN ×RN −→ RN be the continuous function defined by h(x, y) = y − x. Then we can write (7.58) as follows. u = max{(x, y) : (x, y) ∈ Gr G, h(x, y) ≥ 0}. By hypothesis H4 (iii) h(x, y) % 0. So by the Kuhn–Tucker theorem, we can find p ∈ RN + such that u(x, y) + (p, y − x)RN ≤ u
for all (x, y) ∈ Gr G.
Using this proposition, we can show that there is no feasible program starting + at x0 ∈ B M = {v ∈ RN + : v ≤ M } that is infinitely better than the value u of problem (7.58). PROPOSITION 7.2.30 If hypotheses H4 and H5 hold, then there exists M1 > 0 + such that for any x = (xn )n≥0 , x0 ∈ B M , such that for all m ≥ 1, we have m
u(xn , xn+1 ) − u ≤ M1 .
n=0
PROOF: By virtue of Proposition 7.2.29, we can find p ∈ RN + such that u(x, y) + (p, y − x)RN ≤ u
for all (x, y) ∈ Gr G.
From this, we have u(xn , xn+1 ) − u = (p, xn − xn+1 )RN for all x = (xn )n≥0 ∈ F (x0 ) m
⇒ u(xn , xn+1 ) − u ≤ (p, x0 )RN − (p, xm )RN ≤ p2M = M1 , n=0
(see H4 (i)).
So there is no feasible program that can produce an intertemporal utility infinitely better than the one generated by a stationary program x∗ ∈ S. However, a program can be infinitely worse (e.g., any nonoptimal stationary program). To avoid this situation, we introduce the following definition.
556
7 Economic Equilibrium and Optimal Economic Planning
DEFINITION 7.2.31 A feasible program x = {xn }n≥0 is said to be good if there m
u(xn , xn+1 ) − u ≥ γ; in other exists γ ∈ R such that for all m ≥ 1, we have n=0
words lim inf m→∞
m
u(xn , xn+1 ) − u > −∞.
n=0
REMARK 7.2.32 So a feasible program x = (xn )n≥0 is good, if it can sustain a utility level close to u. In the literature a good program is also called eligible (see Takayama [572]). The next proposition links good programs with the set S of solutions of problem (7.58) (equivalently of (7.58)). PROPOSITION 7.2.33 If hypotheses H4 and H5 hold and x = (xn )n≥0 is a n + 1/(n + 1) xk n≥0 good program with x0 ∈ B M , then L = limit points of k=0
is nonempty and a subset of S. n
PROOF: Note that 1/(n + 1)
+
xk ∈ B M for all n ≥ 0. So by the Heine–Borel
k=0
theorem, we have that L = ∅. Let x∗ ∈ L. Then we can find a sequence m −→ nm such that nm 1 xk −→ x∗ as m → ∞. (7.59) nm + 1 k=0
Exploiting the convexity of Gr G (see hypotheses H4 ), we have
nm n m +1 1 1 xk , xk ∈ Gr G. nm + 1 nm + 1 k=0
(7.60)
k=1
Note that n n m +1 m +1 1 1 1 xk = xk + (xnm +1 − x0 ) nm + 1 nm + 1 nm + 1 k=1
k=0
+
and x0 , xnm +1 ∈ B M . Therefore n m +1 1 xk −→ x∗ nm + 1
as m → ∞.
(7.61)
k=1
Because Gr G is closed from (7.59), (7.61) and (7.60), we infer that (x∗ , x∗ ) ∈ Gr G (i.e., x∗ ∈ Γ). Moreover, because of H5 , Jensen’s inequality, and the fact that by hypothesis x = {xn }n≥0 is a good program, we have (see Definition 7.2.31)
7.2 Infinite Horizon Multisector Growth Models
557
nm
1 γ ≤ u(xk , xk+1 ) − u nm + 1 nm + 1 k=0
nm n m +1 1 1 ≤u xk , xk − u. nm + 1 nm + 1 k=0
(7.62)
k=1
Passing to the limit as m −→ ∞ in (7.62), we obtain
u ≤ lim sup u m→∞
≤ u(x∗ , x∗ )
nm n m +1 1 1 xk , xk nm + 1 nm + 1 k=0
k=1
(because u is upper semicontinuous; see H4 ). (7.63)
Because x∗ ∈ Γ, from (7.63) it follows that u = u(x∗ , x∗ ) ⇒ x∗ ∈ S (i.e., L ⊆ S). This proposition suggests that we should look among the elements of S in order to produce a weakly maximal stationary program. THEOREM 7.2.34 If hypotheses H4 and H5 hold, then there exists a weakly maximal stationary program. ∗ ∗ PROOF: Let p ∈ RN + be as in Proposition 7.2.29 and let x ∈ S such that (p, x )RN ≤ (p, y)RN for ally ∈ S. It exists since S is compact. Consider the stationary program x∗ = x∗n = x∗ n≥0 . We claim that this is a weakly stationary program in F (x∗ ). We argue indirectly. So suppose that the claim is not true. Then there exist z ∈ F (x∗ ), ε > 0, and m0 ≥ 1 such that for all m ≥ m0 we have
Um (x∗ ) + ε ≤ Um (z), where Um (x∗ ) =
m
u(x∗n , x∗n+1 ) =
n=0
m
u(x∗ , x∗ ) and Um (z) =
n=0
m
u(zn , zn+1 ) (see
n=0
Remark 7.2.27). Then ε ≤ Um (z) − Um (x∗ )
for all m ≥ m0 .
(7.64)
From (7.64) and Definition 7.2.31 it follows that z is a good program. So by virtue of Proposition 7.2.33, we have (at least for a subsequence), that m 1 zn −→ y ∗ ∈ L ⊆ S m + 1 n=0
as m −→ ∞.
Hence lim
m→∞
m m
1 1 (p, zn )RN = lim p, zn RN ≤ lim sup(p, zm )RN , m→∞ m + 1 n=0 m + 1 n=0 m→∞
⇒ (p, y ∗ )RN ≤ lim sup(p, zm )RN . m→∞
(7.65)
558
7 Economic Equilibrium and Optimal Economic Planning Then from Proposition 7.2.29 and (7.64), we obtain 0 < ε ≤ (p, x∗ )RN − (p, zm+1 )RN
for all m ≥ m0
⇒ 0 < ε ≤ (p, x∗ )RN − lim sup(p, zm+1 )RN ≤ (p, x∗ )RN − (p, y ∗ )RN , m→∞
(see (7.64)), ⇒ (p, y ∗ )RN < (p, x∗ )RN
and
y ∗ ∈ S.
This contradicts the choice of x∗ ∈ S. This proves that x∗ = {x∗n = x∗ }n≥0 is a weakly maximal stationary program. REMARK 7.2.35 The result fails if the weak maximality criterion is replaced by the strong maximality criterion (see Definition 7.2.24(b)).
7.3 Turnpike Theorems In this section we continue the analysis of multisector growth models with an infinite planning horizon. As in the previous section, the model that we consider is stationary, namely the set of technological possibilities and the utility function are constant in time. The stationarity assumption is of course restrictive, but most ren alistic models such as the constant growth models,
that is, Gn = λ G with λ > 0 (the growth rate) and un (x, y) = u (1/λn )x, (1/λn )y , can be transformed into stationary models. In general, any realistic growth model, will involve some kind of stationarity assumption because any probabilistic prediction of the behavior of the system over a long period is almost always based on some kind of stationarity hypothesis. In the second part of Section 7.2, we saw that in stationary models the stationary programs are important (i.e., programs x = {xn }n≥0 , where xn is independent of time, namely xn = x∗ ∈ RN + ). In this section, we isolate among the stationary programs, a special subclass called turnpike programs (or simply turnpikes). A turnpike in a stationary model is a special stationary optimal program, distinguished by at least three properties. (a) An optimal program eventually moves close to the turnpike. (b) A turnpike is easier to compute than an arbitrary optimal program. (c) A turnpike is relatively insensitive to the optimality criterion. In this section we focus on property (a), which leads to the so-called turnpike theorems. These theorems for infinite programs maintain that these programs in a certain sense approach a turnpike. In fact this property motivated the term turnpike. In most books on the subject, the following analogy is given by way of further explanation. Suppose that we want to drive from town A to town B and the main highway (the turnpike) passes close to A and B. The time of an optimal journey is to drive from town A to the turnpike, follow the turnpike until the exit to town B, and then reach town B. Turnpike theorems for finite programs can have weak and strong forms. Weak turnpike theorems assert that for a sufficiently large planning horizon most of the time optimal programs remain close to the turnpike. However, these theorems say nothing about the whereabouts of the planning intervals where the optimal program deviates considerably from the turnpike. In contrast the strong
7.3 Turnpike Theorems
559
turnpike theorems maintain that these distant points are concentrated at the beginning and at the end of the planning interval. So strong turnpike theorems, show that optimal programs of economic growth are analogous to the time of the optimal journey from town A to town B described above. Turnpike theorems are important because they establish the proximity of optimal programs to special programs with simple structure, the turnpikes (maximal stationary programs). Turnpikes are much easier to obtain compared to a general optimal program. As in Section 7.2, the multisector growth model is described by a pair (G, u). N The set G ⊆ RN + × R+ describes the technological possibilities of the economy. A pair (x, y) ∈ G describes a technological process. So the expenditure of the vector x ∈ RN + (the input) at time n−1 (n ≥ 1), yields the production of the (not necessarily unique) output vector y ∈ RN + at time n (n ≥ 1). The function u(x, y) measures the aggregate utility of the technological process (x, y). The hypotheses on the two items of the economy are the following. N H 1 : G ⊆ RN + × R+ ) is a nonempty, compact, and convex set such that
(i) (0, 0) ∈ G. (ii) We can find (x, y) ∈ G such that x ! y. REMARK 7.3.1 As before hypothesis H1 (i) means that inactivity is one option, and hypothesis H1 (ii) means that there is an expansible stock, namely we can find N N N an input x = (xk )N such that k=1 ∈ R+ and a corresponding output y = (yk )k=1 ∈ R xk < yk for all k ∈ {1, . . . , N }. This is a productivity assumption and it is not that restrictive because the vectors x, y ∈ RN + may differ very little from each other and from the zero vector. Finally the convexity of the set G means that it is possible to mix technological processes in arbitrary proportions. H2 : u : G −→ R is a continuous, strictly concave function. REMARK 7.3.2 The concavity of the utility function reflects the nature of the preferences of the society for variety in general (half an apple and half an orange are preferred to either a whole apple or a whole orange). From hypotheses H1 and H2 we see that the model is stationary, because both the technology set G and the utility function u are time-variant. DEFINITION 7.3.3 A finite or infinite sequence {zn }m n=1 , (m ≤ +∞), is said to be a program (path) of the economy if and only if zn = (xn−1 , yn ) ∈ G
and
xn ≤ yn
for all n ≥ 1.
(7.66)
If the vector y0 ∈ RN + is fixed, then we have program originating from y0 . REMARK 7.3.4 In (7.66), the first condition means that all pairs zn = (xn−1 , yn ) are technologically feasible. The second condition is a resource restriction and simply says that at each stage the input (or expenditures) can not exceed the output (available capital stock) from the previous stage. The vector y0 ∈ RN + is called the initial vector for the program {zn }m n=1 , if y ≥ x0 ; that is, the input at stage zero does not exceed the initial resources.
560
7 Economic Equilibrium and Optimal Economic Planning
The next definition describes how we measure the performance of a program m {zn }m n=1 , (m ≤ ∞). We distinguish between finite programs {zn }n=1 , (m < ∞) and infinite programs {zn }n≥1 depending on whether we consider a finite or an infinite planning horizon. In the first case there is no issue of convergence of the intertemporal utility and so optimality is defined in the standard way. In the second case, we are faced with the possible nonconvergence of the intertemporal utility (because there is no discount; i.e., δ = 1) and so to define optimality we use the notions introduced in the last part of Section 7.2. N DEFINITION 7.3.5 (a) A finite program {zn }m n=1 , (m<∞) starting at y0 ∈ R+ is said to be optimal if and only if m
u(zn ) ≤
n=1
m
u(zn )
n=1
N for any finite program {zn }m n=1 starting at y0 ∈ R+ . (b) An infinite program {zn }n≥1 starting at y0 ∈ RN + is said to be optimal if and only if m
lim sup u(zn ) − u(zn ) ≤ 0 m→∞
n=1
for any infinite program {zn }n≥1 starting at y0 ∈ RN +. REMARK 7.3.6 In part (b) the optimality of {zn }n≥1 is the same as saying that the program is weakly maximal in the sense of Definition 7.2.26(b). If for every program {zn }n≥1 starting at y0 ∈ RN u(zn ) converges, + , the intertemporal utility n≥1
then an optimal program {zn }n≥1 (in the sense of Definition 7.3.5), maximizes the intertemporal utility (i.e., is optimal in the usual way of maximizing the performance criterion). Conversely, if exactly one program exists that maximizes the intertemporal utility, then it is optimal in the sense of Definition 7.3.5(b). Because our model is stationary, the following definition is important. DEFINITION 7.3.7 (a) A program {zn }n≥1 is said to be stationary if zn = z ∈ G for all n ≥ 1. (b) A stationary program {zn=z}n≥1 is a turnpike if it maximizes the utility function (i.e., u(z) = max[u(z ) : z ∈ G]). REMARK 7.3.8 Evidently a necessary and sufficient condition for a vector z = N (x, y) ∈ RN + × R+ to define a stationary program is that z = (x, y) ∈ G and x ≤ y. So the stationary program z defines a turnpike if and only if it solves the maximization problem max{u(z) : z = (x, y) ∈ G, x ≤ y}. (7.67) Because by hypothesis H2 , the utility function u is strictly concave, then a turnpike (which always exists due to the compactness of G), is unique. From Proposition 7.2.29, we know that there is a stationary price system supporting the turnpike. Namely we have the following.
7.3 Turnpike Theorems
561
N PROPOSITION 7.3.9 If hypotheses H1 and H2 and z ∗=(x∗ , y ∗ ) ∈ RN + × R+ is a turnpike, then there exists p∗ ∈ RN such that +
u(z) + (p∗ , x − y)RN ≤ u(z ∗ )
(7.68)
for all z = (x, y) ∈ G. Let ξ : G −→ R be defined by ξ(z) = u(z) + (p∗ , x − y)RN
for all z ∈ (x, y) ∈ G.
Evidently ξ(z) represents the total utility corresponding to the stationary prices {p∗n = p∗ }n≥1 (i.e., ξ(z) is the utility u(z) plus the profit resulting from realizing the pair (x, y) ∈ G, when the price system p∗ ≥ 0 prevails in the market). We set (z, z ) = |ξ(z) − ξ(z )| for all z, z ∈ G. (7.69) This is a pseudometric on G. In the formulation of the turnpike theorems, this pseudometric appears to be a more convenient measure of deviation from the turnpike z ∗ than the norm, because it facilitates the proofs. Let us recall the following notion from Section 7.2 (see Definition 7.2.31). This notion is crucial in obtaining the weak turnpike theorem. DEFINITION 7.3.10 A sequence of finite programs {znm }m n=1 is said to be good if there exists γ ∈ R such that m
u(znm ) − u(z ∗ ) ≥ γ.
n=1
REMARK 7.3.11 The good programs can be worse than the turnpike, only by a maximum utility value equal to −γ. In what follows z ∗ = (x∗ , y ∗ ) is a turnpike, p∗ ∈ RN + is the corresponding price system supporting z ∗ (see Proposition 7.3.9), and {znm }m n=1 , m ≥ 1 is a good sequence in the sense of Definition 7.3.10, with initial vector y0 ∈ RN +. LEMMA 7.3.12 We can find β > 0 independent of m ≥ 1 such that m
(znm , z ∗ ) ≤ β
for all m ≥ 1.
n=1
PROOF: We have ϑm =
m
(znm , z ∗ ) =
n=1
=
m
ξ(z ∗ ) − ξ(znm )
n=1 m
(see (7.69))
m (p∗ , y ∗ − x∗ )RN u(z ∗ ) − u(znm ) +
n=1 m
−
n=1 ∗ m (p∗ , ynm − xm n )RN + (p , x0 − ym )RN .
n=1
(7.70)
562
7 Economic Equilibrium and Optimal Economic Planning
∗ Because p∗ ∈ RN + is the stationary price system supporting the turnpike z = (x∗ , y ∗ ), we have (7.71) (p∗ , y ∗ − x∗ )RN = 0. m ∗ N Also we have xm n ≤ yn and because p ∈ R+ , we obtain
(p∗ , ynm − xm n )RN ≥ 0.
(7.72)
Moreover, because G is by hypothesis compact, we have m )RN ≤ η (p∗ , x0 − ym
for some η > 0, all m ≥ 1.
(7.73)
Returning to (7.70) and using (7.71) through (7.73), we infer that ϑm ≤
m
u(z ∗ ) − u(znm ) + η.
(7.74)
n=1
From (7.74) and the fact that {znm }m n=1 , m ≥ 1 is a good sequence (see Definition 7.3.10) we conclude that ϑm ≤ β for some β > 0 (independent of m ≥ 1) and all m ≥ 1. LEMMA 7.3.13 There exists a function λ : R+ −→ R+ nondecreasing such that λ(0) = 0, λ(ε) > 0 for all ε > 0 and p(z ∗ , z) ≥ λ(z ∗ − z) for all z ∈ G. PROOF: We set λ(ε) = inf p(z ∗ , z) : z ∈ G, z ∗ − z ≥ ε . We only need to show that λ(ε) > 0 for ε > 0. If this is not true, then we can find {zn }n≥1 ⊆ G such that z ∗ − zn ≥ ε for all n ≥ 1, but (z ∗ , zn ) −→ 0 as n → ∞. Because G is compact, we may assume that zn −→ z ∈ G. Clearly z ∗ − z ≥ ε. For any z ∈ G, we have (z ∗ , z) = |ξ(z ∗ ) − ξ(z)| = ξ(z ∗ ) − ξ(z) ≥ 0.
(7.75)
On the other hand 0 = lim (z ∗ , zn ) = ξ(z ∗ )− lim ξ(zn ) = ξ(z ∗ )−ξ(z) ∗
n→∞
n→∞ ∗
(see (7.75))
⇒ ξ(z ) = ξ(z) = max{u(z) + (R , y − x)RN : z = (x, y) ∈ G}.
(7.76)
But ξ is strictly monotone because u is strictly concave (see hypothesis H2 ). So from (7.76) it follows that z∗ = z which contradicts the fact that z ∗ − z ≥ ε.
Now we can prove a weak turnpike theorem for good sequences. So we continue to have z ∗ = (x∗ , y ∗ ) a turnpike and {znm }m n=1 , m ≥ 1 a good sequence starting at y0 ∈ RN + . By rm (ε) we designate the number of the periods k ∈ {1, . . . , m} for which we have zkm − z ∗ ≥ ε (as in Section 7.2 on RN × RN we consider the l1 -norm, 2N |zk |). namely z = k=1
7.3 Turnpike Theorems
563
DEFINITION 7.3.14 A vector y0 ∈ RN + is said to be sufficient if a strictly positive vector can be produced in a finite number of steps starting from y0 ∈ RN + ; that is, we can find a finite program {zn = (xn−1 , yn )}m starting from y such that ym % 0. 0 n=1 THEOREM 7.3.15 If z ∗ = (x∗ , y ∗ ) ∈ RN ×RN is a turnpike and {zn }m n=1 , m ≥ 1, is a good sequence starting at a sufficient vector y0 ∈ RN , then {rm (ε)}m≥1 is bounded. PROOF: Using Lemmata 7.3.12, and 7.3.13, we have rm (ε)λ(ε) ≤
m
(z ∗ , znm ) ≤ β,
for all m ≥ 1,
n=1
β for all m ≥ 1, λ(ε) ⇒ {rm }m n=1 is bounded. ⇒ rm (ε) ≤
However, our goal is to have this theorem for a sequence {znm }m n=1 , m ≥ 1, where the finite program is optimal {znm }m n=1 , m ≥ 1 for the finite horizon economy. So we need to check that the sequence {znm }m n=1 , m ≥ 1, of optimal programs forms a good sequence. For this in turn it is enough to construct an infinite program {z}n≥1 starting at y0 ∈ RN + for which we have m
u(z ∗ ) − u(z n ) ≤ β0
for some β0 > 0, all n ≥ 1.
(7.77)
n=1
Indeed, because of the optimality of {znm }m n=1 , we have m
m u(z ∗ ) − u(znm ) ≤ u(z ∗ ) − u(z n )
⇒ ⇒
n=1 m
for all m ≥ 1,
n=1
u(z ∗ ) − u(znm ) ≤ β0
n=1 {znm }m n=1 ,
for all m ≥ 1,
m ≥ 1, is a good sequence (see Definition 7.3.10).
An infinite program {z n }n≥1 satisfying (7.77) is of course a good program (see Definition 7.2.31). In other words, it cannot be infinitely worse than the turnpike. LEMMA 7.3.16 If y0 ∈ RN + is sufficient (see Definition 7.3.14), then a good program exists with initial vector y0 ∈ RN +. PROOF: Without any loss of generality, we may assume that y0 % 0. Indeed, because y0 is sufficient, according to Definition 7.3.14 with a finite program, we can reach a vector ym % 0. Then we continue as described below. N So y0 % 0. Let y ∈ RN + be the output of the expansible stock x ∈ R+ (see hypothesis H3 (ii)). We may assume that y ! y0 . Because (0, 0) ∈ G and G is convex (see hypothesis H1 ), we have that (µx, µy) ∈ G and of course µx ! µy (i.e., µx remains expansible). Let t ∈ (0, 1) and define
564
7 Economic Equilibrium and Optimal Economic Planning z$n = tn z + (1 − tn )z ∗ ,
where z = (x, y), n ≥ 1.
We show that {$ zn }n≥1 is a good sequence starting at y0 . Because G is convex, we see that z$n ∈ G for all n ≥ 1. To establish that {$ zn }n≥1 is a program we need to show that x $0 ≤ y0 and x $n ≤ y$n for all n ≥ 1, (7.78) (see Definition 7.3.3 and Remark 7.3.4). Note that x $0 = x ! y ≤ y0 . So we have verified the first inequality in (7.78). To check the second inequality in (7.78), by direct substitution we have
$n = tn y − tx + (1 − t)x∗ + (1 − tn )(y ∗ − x∗ ). (7.79) y$n − x Recall that x∗ ≤ y ∗ . Also because x ! y (see hypothesis H1 (ii)), by choosing t close to 1, we see that tx + (1 − t)x∗ ≤ y. Therefore from (7.79), we have x $n ≤ y$n for all n ≥ 1. So we have also verified the second inequality in (7.78). This means that {$ zn }n≥1 is a program starting at y0 . Because u is concave (see hypothesis H2 ), we have zn ) for all n ≥ 1, tn u(z) + (1 − tn )u(z ∗ ) ≤ u($
zn ) ≤ tn u(z ∗ ) − u(z) for all n ≥ 1, ⇒ u(z ∗ ) − u($ m zn ) ≤ β0 for some β0 > 0, all n ≥ 1, u(z ∗ ) − u($ ⇒ n=1
⇒ {$ zn }n≥1 is a good program starting at y0 ∈ RN +. Having this lemma, we can state the weak turnpike theorem for finite programs. THEOREM 7.3.17 If {znm }m n=1 , m ≥ 1, is a sequence of optimal finite programs starting at y0 ∈ RN + , then for every ε > 0, the sequence {rm (ε)}m≥1 is bounded. REMARK 7.3.18 This result says that the number of time periods when the optimal program differs from the turnpike by at least ε > 0 is bounded by a constant β(ε) > 0 that is independent of m ≥ 1. So the optimal program stays near the turnpike except possibly for at most β(ε) periods. In particular we
for all time have rm (ε) /m −→ 0 as m −→ ∞. So the proportion of time when the optimal program deviates from the turnpike goes to zero as the planning horizon increases. In fact we can extend the above weak turnpike theorem to infinite programs. THEOREM 7.3.19 If {zn }n≥1 is a good program, then zn −→ z ∗ as n → ∞. PROOF: As in the proof of Lemma 7.3.12, we can show that m n=1
(z ∗ , zn ) ≤
m
u(z ∗ ) − u(zn ) + η,
for some η > 0, all m ≥ 1.
n=1
From (7.80) and because {zn }n≥1 is a good program, we infer that
(7.80)
7.3 Turnpike Theorems (z ∗ , zn ) −→ 0
565
as n → ∞.
Suppose that for some subsequence {nk } of {n}, we have znk − z ∗ ≥ ε
for all k ≥ 1.
Then from Lemma 7.3.13, we have 0 < λ(ε) ≤ (z ∗ , znk ),
for all k ≥ 1,
a contradiction. Therefore we conclude that zn −→ z ∗ as n → ∞.
REMARK 7.3.20 So according to this theorem every good program, in particular every weakly maximal program, is asymptotic to the turnpike. m THEOREM 7.3.21 If y0 ∈ RN + is a sufficient vector, {zn }n≥1 , m ≥ 1, is a sequence of optimal finite programs starting at y0 , and the turnpike z ∗ ∈ int G, then for any ε > 0, we can find ϑ = ϑ(ε) > 0 (independent of m ≥ 1) such that for any m > 2ϑ, we can find a finite program {$ znm }m n=1 starting at y0 for which we have m ∗ (a) z$n = z for all n ∈ {ϑ, . . . , m-ϑ}. m
znm ) < ε. (b) u(znm ) − u($ n=1
PROOF: Because z ∗ ∈ int G, we can find r > 0 such that Br (z ∗ ) ⊆ G. Let η ∈ (0, r). By virtue of Theorem 7.3.17, we can find β = β(η) > 0 such that m ≥ 1 there are at most (β − 1) time periods n ≥ 1 for which we have znm − z ∗ ≥ η. Fix m > 2β. Among the first β periods {1, 2, . . . , β} and among the last β periods {m − β + 1, . . . , m}, there is at least one time period k = k(m) and i = i(m), respectively, such that zkm − z ∗ < η
and
zim − z ∗ < η.
We consider the following finite program: m ∗ ∗ ∗ ∗ ∗ ∗ m m m , (xm σm = z1m , . . . , zk−1 k−1 , y ), (x , y ), . . . , (x , y ), (x , yi ), zi+1 , . . . , zm . (7.81) Note that in this finite sequence σm of length m, the initial and last parts (with lengths k − 1 and m − i − 1, resp.) coincide with the optimal program. The middle part coincides with the turnpike. The turnpike is linked with the other two parts ∗ ∗ m using the pairs (xm k−1 , y ) and (x , yi ). It is straightforward to check that σm is indeed a finite program starting at y0 . So for every m > 2β we have a program σm = {$ znm }m n=1 as above. Then χ= =
m
u($ znm ) − u(znm )
n=1 m
u($ znm ) − u(znm )
(see (7.81))
n=k
=
i
∗ ∗ ∗ m ∗ u(z ∗ ) − u(znm ) + u(xm k−1 , y ) − u(z ) + u(x , yi ) − u(z ) ,
n=k
(7.82)
566
7 Economic Equilibrium and Optimal Economic Planning
(see (7.81)). ∗ N Because {znm }m n=1 is optimal, we have χ ≤ 0. Moreover, if p ∈ R+ is the price system supporting the turnpike (see Proposition 7.3.9), we have
∗ ∗ ∗ m ∗ µ(η) ≤ u(xm k−1 , y ) − u(z ) + u(x , yi ) − u(z ) − (p∗ , y ∗ − yim )RN + (p∗ , x∗ − xm i )RN ≤ χ ≤ 0
(7.83)
with µ(η) independent of m ≥ 1 and µ(η) −→ 0 as η −→ 0. Choose η > 0 small enough so that |µ(η)| < ε. Then set ϑ = ϑ(ε) = β(η) + 1. We see that both statements (a) and (b) of the theorem are satisfied. REMARK 7.3.22 Statement (a) of the theorem says that the program {$ znm }m n=1 coincides with the turnpike except perhaps in the initial and final stages. The length of these parts is bounded by a constant independent of m ≥ 1. Moreover, according to statement (b), this new program {$ znm }m n=1 is ε-optimal. m LEMMA 7.3.23 If {znm }m n=1 is an optimal program, {zn }n=1 is any finite program, and 1 ≤ k ≤ i ≤ m, then i
k−1
(z ∗ , znm ) ≤
n
u(znm ) − u(zn ) + u(znm ) − u(zn )
n=1
n=k
n=i+1
i
∗ ∗ m u(z ∗ ) − u(zn ) + (p∗ , x∗ − xm k−1 )RN + (p , y − yi )RN .
+
n=k
(7.84) PROOF: From (7.69) (definition of the pseudometric ), we have i
i
i−1 (p∗ , ynm − xm u(z ∗ ) − u(znm ) − n )RN
(z ∗ , znm ) =
n=k
n=k
n=k
+ (p∗ , y ∗ − yim )RN − (p∗ , x∗ − xm k−1 )RN i
u(z ∗ ) − u(znm ) + (p∗ , y ∗ − yim )RN − (p∗ , x∗ − xm k−1 )RN .
≤
n=k
(7.85) Because {znm }m n=1 is optimal, we have m
u(zn ) ≤
n=1
⇒
k−1
m
u(znm )
n=1 i m i
u(zn ) + u(znm ). u(zn ) − u(znm ) + u(zn ) − u(znm ) ≤
n=1
n=k
n=i+1
n=k
(7.86) Using (7.86) in (7.85), we obtain (7.84).
Using this lemma we can prove the following result, which is crucial in the proof of the strong turnpike theorem.
7.4 Stochastic Growth Models
567
m m LEMMA 7.3.24 If y0 ∈ RN + is a sufficient vector, {zn }n=1 is an optimal finite m−l program starting at y0 , and ϑ > 0, then we find l = l(ϑ) ≥ 1 such that (z ∗ , znm ) < n=l
ϑ. PROOF: Using (7.84) (see Lemma 7.3.23) with {znm }m n=1 = σm , as in the proof of Theorem 7.3.21, we have m−β
n=β
(z ∗ , znm ) ≤
i
(z ∗ , znm )
n=k
≤ u(z ∗ ) − u(zk ) + u(z ∗ ) − u(zi ) + 2ηp∗ ≤ ϕ(η),
where ϕ(η) −→ 0 as η −→ 0 because u is continuous on G and z ∗ −zk , z ∗ −zi < η. Choose η > 0 small enough so that ϕ(η) < ϑ. Then l = β(η) will do the job. This lemma leads to the following strong turnpike theorem. m m THEOREM 7.3.25 If y0 ∈ RN + is sufficient, {zn }n=1 , m ≥ 1, is a sequence of optimal finite programs, and the turnpike z ∗ ∈ int G, then for any ε > 0, we can find l = l(ε) > 0 (independent of m ≥ 1) such that for any m ≥ 2l, we have z ∗ − znm < ε for all h ∈ {l, . . . , m − l}.
PROOF: Given ε > 0, let ϑ = λ(ε) where λ(·) is as in Lemma 7.3.13. Let l = l(ϑ) ≥ 1 be as in Lemma 7.3.24. If z ∗ − znm ≥ ε for some n ∈ {l, . . . , m − l}, then (z ∗ , znm ) ≥ λ(ε) = ϑ (see Lemma 7.3.13) and this contradicts Lemma 7.3.24. So we have proved the strong turnpike theorem for the pseudometric . Then arguing as in the proof of Lemma 7.3.13, we can have it for the norm in RN ×RN . REMARK 7.3.26 This theorem says that an optimal finite program is in the vicinity of the turnpike for all time periods not less than l periods distant from the endpoints. So compared to Theorem 7.3.17, now we identify those time periods when the optimal program deviates from the turnpike.
7.4 Stochastic Growth Models In the previous sections we studied static and dynamic deterministic economic models. In this section, we examine discrete-time economic growth models with uncertainty. The uncertainty is present both in the utility and in the technological constraints. First we deal with the discounted nonstationary model and we prove the existence of an optimal program, which we characterize with a system of supporting prices. In the second half of the section, we deal with the undiscounted, stationary growth model for which we show that it has a weakly maximal program. So first we describe and study the discounted, nonstationary growth growth model. For this purpose let (Ω, Σ, µ) be a complete probability space. Then ω ∈ Ω represents a possible state of the environment, Σ is the collection of all possible events, and µ is the probability distribution of these events. We have a discrete-time,
568
7 Economic Equilibrium and Optimal Economic Planning
infinite planning horizon N0 = {0, 1, 2, . . .} (i.e., the model is a discrete-time, infinite horizon model). To describe mathematically the uncertainty of the system, we consider an increasing sequence {Σn }n≥1 of complete sub-σ-fields of Σ and assume that Σ = Vn≥1 Σn . The sub-σ-field Σn of Σ, represents the information about the state of the environment available up until time n ≥ 0. There are N commodities in the economy. So the commodity space is RN (multisector stochastic model). Finally there is a constant discount factor 0 < δ < 1 (discounted model). At time period n ≥ 1, the technological possibilities of the economy are described N N by a multifunction G : Ω −→ 2R+ ×R+ \ {∅} which has a graph belonging in Σn × N N B(RN + ) × B(R+ ) with B(R+ ) being the Borel σ-field. The set Gn (ω) describes all possible transformations of capital stock at time n ≥ 1, when the state of the environment is ω ∈ Ω. More precisely, if (k, y) ∈ Gn (ω), when the state of the environment is ω ∈ Ω, we can transform a capital input k at time n− 1 into a capital output y at time n. The uncertainty in this production process is manifested by the fact that the graph of Gn is Σn -measurable. Note that the technology feasibility set varies with time and so the model is nonstationary. At every stage n ≥ 1, the utility (gain) achieved by operating a particular technoN logical process (x, y) ∈ Gn (ω), is measured by a utility function un : Ω×RN + ×R+ −→ N R which is assumed to be Σn × B(RN ) × B(R )-measurable. Again the uncertainty + + is expressed by the Σn -measurability of the function. Finally there is a constant discount factor δ ∈ (0, 1). A sequence k = {kn }n≥0 with kn ∈ L∞ (Σn , RN ) is a program (or path). We
say that program k = {kn }n≥0 is feasible if kn (ω), kn+1 (ω) ∈ Gn+1 (ω) µ-a.e. on Ω, for all n ≥ 0. We say that k = {kn }n≥0 is a feasible program starting at x0 ∈ L∞ (Σ0 , RN ), if k0 = x0 . The set all feasible programs starting at x0 , is " of ∞ denoted by F (x0 ). Evidently F (x0 ) ⊆ L (Σn , RN ). n≥0
Given an initial capital stock x0 ∈ L∞ (Σ0 , RN )+ (i.e. x0 (ω) ≥ 0 µ-a.e. on Ω), we want to find an element in F (x0 ) (i.e., a feasible program starting at x0 ) that maximizes the intertemporal utility n+1 U (k) = δ Jn+1 (kn , kn+1 ), (7.87) n≥0
where k = {kn }n≥0 and
un+1 ω, v(ω), w(ω) dµ
Jn+1 (v, w) =
(7.88)
Ω
for all (v, w) ∈ L∞ (Σn , RN )+ × L∞ (Σn+1 , RN )+ , n ≥ 0. The mathematical hypotheses on the technological possibilities multifunction Gn are similar to those of the deterministic discounted multisector growth model. N H1 : Gn : Ω −→ Pf c (RN + ×R+ ) is a multifunction such that N (i) Gr Gn ∈ Σn × B(RN + ) × B(R+ ) (i.e., the multifunction Gn is Σn -graph measurable.
(ii) For every ω ∈ Ω, (0, 0) ∈ Gn (ω) and (0, y) ∈ Gn (ω) implies y = 0.
7.4 Stochastic Growth Models
569
(iii) For every ω ∈ Ω, if (k, y) ∈ Gn (ω), k ≤ k , and y ≤ y, then (k , y ) ∈ Gn (ω) (free disposability). (iv) There exists M > 0 such that if ω ∈ Ω and (k, y) ∈ Gn (ω) with k > M , then y ≤ k. The hypotheses on the instantaneous utility function, are the following. N H2 : un : Ω× RN + × R+ −→ R, n ≥ 1 is a function such that N (i) (ω, k, y) −→ un (ω, k, y) is Σn × B(RN + ) × B(R+ )-measurable.
(ii) For every ω ∈ Ω (k, y) −→ un (ω, k, y) is concave and upper semicontinuous. N (iii) For all (ω, k, y) ∈ Ω × RN + × R+ , we have
|un (ω, k, y)| ≤ ϕn (ω, k|, y) with ϕn : Ω × R+ × R+ −→ R+ a Σn × B(R+ ) × B(R+ )-measurable function, ϕn (ω, ·, ·) is nondecreasing, and sup ϕn (·, v, v)L1 (Σn ) < +∞ for all v ∈ R+ . n≥1
By a price system we mean a sequence p = {pn }n≥0 ⊆ L1 (Σn , RN )+ . Normally prices belong to the dual of the space of production vectors. However, L∞ (Σn , RN )∗ is too big and has no satisfactory economic interpretation. For this reason we limit ourselves to L1 (Σn , RN ) ⊆ L∞ (Σn , RN )∗ . The precise description of L1 (Σn , RN ) as a subspace of L∞ (Σn , RN )∗ is given by the Yosida–Hewitt theorem. For easy reference we state it here. First a definition. DEFINITION 7.4.1 Let (Ω, Σ, µ) be a σ-finite measure space. A functional u∈
L∞ (Ω, RN )∗ is said to be absolutely continuous if u(g) = Ω g(ω), f (ω) RN dµ for some f ∈ L1 (Ω, RN ) and all g ∈ L∞ (Ω, RN ). A functional u ∈ L∞ (Ω, RN )∗ is said to be singular if there exists a decreasing sequence {Cn }n≥1 ⊆ Σ such that µ(Cn ∩C) −→ 0 for every C ∈ Σ with µ(C) < +∞ and u is supported by Cn ; that is, u(g) = 0 for each g ∈ L∞ (Ω, RN ) which is identically zero on one of the Cn s (i.e., u(g) = u(χCn g), n ≥ 1, g ∈ L∞ (Ω, RN )). REMARK 7.4.2 Absolutely continuous elements in L∞ (Ω, RN )∗ are identified with L1 (Ω, RN ). Using the notions of absolutely continuous and singular elements of L∞ (Ω, RN )∗ , we can have a complete description of the dual space L∞ (Ω, RN )∗ . The result is known as the Yosida–Hewitt theorem and for a proof of it, we refer to Levin [377]. THEOREM 7.4.3 If (Ω, Σ, µ) is a σ–finite measure space and u∈L∞ (Ω, RN )∗ , then u can be uniquely written as u = ua + us , where ua is absolutely continuous and us is singular. If u ≥ 0, then ua , us ≥ 0. Moreover, we have u = ua + us .
570
7 Economic Equilibrium and Optimal Economic Planning
DEFINITION 7.4.4 A feasible program k∗ = {kn∗ }n≥0 ∈ F (x0 ) is said to be optimal if for all k ∈ F (x0 ). U (k) ≤ U (k ∗ ) To prove the existence of an optimal program for our model, we use the direct method of the calculus of variations. This requires that we prove the upper semicontinuity of the objective functional and the compactness of the constraint set in some useful topology and then apply the Weierstrass theorem. THEOREM 7.4.5 If hypotheses H1 , H2 hold and x0 ∈ L∞ (Σ0 , RN ), then there exists an optimal program in F (x0 ). PROOF: Using hypotheses H1 and arguing as in Lemma 7.2.5, we can have that for all k = {kn }n≥0 ∈ F (x0 ). kn ∞ ≤ M1 = max x0 ∞ , M Let Dn = h ∈ L1 (Σn , RN ) : h∞ ≤ M1 , n ≥ 1 (recall L∞ (Σn , RN ) ⊆ L1 (Σn , RN )). From the Eberlein–Smulian theorem, we infer that Dn ⊆ L1 (Σn , RN ) " is weakly sequentially compact. Therefore Dn is weakly sequentially compact in n≥1 " 1 " 1 L (Σn , RN ). Recall that the weak topology on L (Σn , RN ), is the product n≥1
n≥1
topology of the weak topologies on each factor space L1 (Σn , RN ), n ≥ 0. N We define un : Ω×RN + × R+ −→ R = R ∪ {−∞} by un (ω, k, y) if (ω, k, y) ∈ Gr Gn . un (ω, k, y) = −∞ otherwise It is easy to see that un ∈ Σn ×B(RN )×B(RN )-measurable (see hypotheses H1 (i) and H2 (i)) and for every ω ∈ Ω, un (ω, ·, ·) is concave and upper semicontinuous. Moreover, we have |un (ω, k, y)| ≤ ϕn (ω, k, y) Let U :
"
for all (ω, k, y) ∈ GrGn .
L1 (Σn , RN ) −→ R = R ∪ {−∞} be the expected intertemporal utility
n≥0
corresponding to un , n ≥ 1; that is, n+1 δ U (k) = Jn+1 (kn , kn+1 ) n≥0
for all k = {kn }n≥0 and where
un+1 ω, v(ω), w(ω) dµ
Jn+1 (v, w) = Ω
for all (v, w) ∈ L (Σn , R ) × L (Σn+1 , RN ). " 1 We show that U is weakly sequentially upper semicontinuous in L (Σn , RN ). n≥0 " 1 w L (Σn , RN ), with k m = {knm }n≥0 and So suppose that k m −→ k in F (x0 ) ⊆ 1
N
1
n≥0
7.4 Stochastic Growth Models
571
w
k = {kn }n≥0 . We have knm −→ kn in L1 (Σn , RN ) as m −→ ∞, for all n ≥ 0. Using Fatou’s lemma, we have m lim sup Jn+1 (knm , kn+1 ) ≤ Jn+1 (kn , kn+1 ). m→∞
Given v ∈
"
L1 (Σn , RN ) and K ≥ 0, we set
n≥0
UK (v) =
K
δ n+1 Jn+1 (vn , vn+1 ).
m=0
Using this notation we have lim sup UK (k m ) = lim sup m→∞
m→∞
≤
K
K
m δ n+1 Jn+1 (knm , kn+1 )
m=0
m δ n+1 lim sup Jn+1 (knm , kn+1 ) m→∞
m=0
≤
K
δ n+1 Jn+1 (kn , kn+1 ) = UK (k).
m=0
Because k ∈ F (x0 ), we have ∞ ∞ |U (k) − UK (k)| = δ n+1 Jn+1 (kn , kn+1 ) ≤ δ n+1 β, n=K+1
n=K+1
where β = sup ϕn (·, M1 , M1 )1 (see hypothesis H2 (iii)). It follows that n≥1
UK (k) −→ U (k)
as K −→ +∞.
Then we can find a map m −→ K(m) increasing to +∞ such that
lim sup UK(m) (k m ) ≤ lim sup lim sup UK (k m ) m→∞
K→+∞
m→∞
≤ lim sup UK (k) = U (k).
(7.89)
K→+∞
So we have U (k m ) − U (k) = U (k m ) − UK(m) (k m ) + UK(m) (k m ) − U (k) ∞ ≤ δ n+1 β + UK(m) (k m ) − U (k) n=K(m)+1 ∞ ⇒ lim sup U (k m ) − U (k) ≤ lim sup δ n+1 β
m→∞
m→∞ n=K(m)+1
+ lim sup UK(m) (k m ) − U (k) ≤ 0 m→∞
⇒ lim sup U (k m ) ≤ U (k). m→∞
(see (7.89))
572
7 Economic Equilibrium and Optimal Economic Planning This proves the weak sequential upper semicontinuity of U on
"
L1 (Σn , RN ).
n≥0
So by the Weierstrass theorem, we can find k ∗ ∈ F (x0 ) such that U (k ∗ ) = U (k ∗ ) = sup U (k) : k ∈ F (x0 ) . Next we characterize an optimal program by generating a price system that supports it. More precisely, we prove that there exists a price system such that: (a) At every time period, we have minimization of the cost among programs producing no less future value. (b) At every time period, we have maximization of the total utility, defined as the sum of the instantaneous utility and the net profit resulting from operating a particular technological process at that period. The net profit is calculated as the value of the produced output minus the cost of the used input. (c) The expected value of the optimal program goes to zero as the planning horizon expands to +∞ (transversality condition). Recall that the deterministic variant of such a result was given in Theorem 7.2.13. In our effort to produce a price characterization of the optimal program, we need some quantities, which we introduce next. So suppose that f ∈ L∞ (Σn , RN ), n ≥ 0. We define
Dn (f ) = h = {hm }m≥n : hn=f and hm (ω), hm+1 (ω) ∈ Gm+1 (ω) µ-a.e. on Ω m+1 and Vn (f ) = sup δ Jm+1 (hm , hm+1 ) : h ∈ Dn (f ) . m≥n
Clearly Vn is the value (Bellman) function of the sequential optimization problem and as it is well-known it satisfies the dynamic programming functional equation, namely Vn (f ) = sup δ m+1 Jm+1 (f, g) + Vn+1 (g) : g ∈ L∞ (Σn+1 , RN ),
f (ω), g(ω) ∈ Gn+1 (ω) µ-a.e. on Ω .
(7.90)
If k ∗ = {kn∗ }n≥0 ∈ F (x0 ) is an optimal program, then ∗ ∗ ∗ , kn+1 ) + Vn+1 (kn+1 ). Vn (kn∗ ) = δ n+1 Jn+1 (kn−1
(7.91)
Also in what follows by Sn+1 we denote the subset of L∞ (Σn , RN ) × L∞ (Σn+1 , RN ) consisting of pairs of functions which pointwise correspond to a technological feasible production process, namely Sn+1 = (f, g) ∈ L∞ (Σn , RN ) × L∞ (Σn+1 , RN ) :
f (ω), g(ω) ∈ Gn+1 (ω) µ-a.e. on Ω .
7.4 Stochastic Growth Models
573
Note that hypotheses H2 imply that the value function Vn is concave (i.e., −Vn is convex). Then for every z ∈ L∞ (Σn , RN ), we can define the subdifferential of Vn by ∂Vn (z) = p ∈ L∞ (Σn , RN )∗ : Vn (y) − Vn (z) ≤ pn (y − z) for all y ∈ L∞ (Σn , RN ) (see Definition 1.2.28). To generate the desired supporting price system, we need to strengthen hypotheses H2 . N H3 : un : Ω × RN + × R+ −→ R is a function such that N (i) For all (k, y) ∈ RN + × R+ , ω −→ un (ω, k, y) is Σn -measurable.
(ii) For every ω ∈ Ω, (k, y) −→ un (ω, k, y) is continuous concave. (iii) For every (ω, y) ∈ Ω × RN + , k −→ un (ω, k, y) is nondecreasing and for every (ω, k) ∈ Ω × RN + y −→ un (ω, k, y) is nonincreasing. N (iv) For every (ω, k, y) ∈ Ω × RN + × R+ , we have
|un (ω, k, y)| ≤ ϕn ω, k, y ,
with ϕn : Σn × B(R+ ) × B(R+ )-measurable, for every ω ∈ Ω ϕn (ω, ·, ·) is nondecreasing and sup ϕn (·, v, v)1 < ∞ for all v ≥ 0. n≥1
We also need a hypothesis concerning the value function Vn . H4 : For every n ≥ 0, Vn is continuous at some point in L∞ (Σn , RN ) and ∂V0 (x0 ) = ∅. REMARK 7.4.6 From Theorem 1.2.3, we know that the first part of hypothesis H4 is equivalent to saying that Vn is bounded below in a neighborhood of a point. Also from Theorem 1.2.34, we know that if V0 is continuous at x0 ∈ L∞ (Σ0 , RN ), then ∂V0 (x0 ) = ∅. Note that if the technology is rich enough to admit a feasible program v = {vn }n≥0 such that vn (ω), vn+1 (ω) + Bεn +1 ⊆ Gn+1 (ω) µ-a.e. on Ω (interior program), then hypothesis H4 is satisfied. To produce the supporting price system, we need to use the Mackey topology on the Lebesgue space L∞ (Σn , RN ). We have already encountered the Mackey topology in Chapter 6 (see Definition 6.3.29). For easy reference we recall this definition here in the context of the space L∞ (Σn , RN ), where it is used in the sequel. DEFINITION 7.4.7 The Mackey topology m∞ on L∞(Σn , RN ) induced by the
∞ N 1 N pair L (Σn , R ), L (Σn , R ) , is the topology of uniform convergence on the 1 N weakly compact and convex subsets
∞ of L N(Σn , 1R ). So, if by ·, · we denote the duality brackets for the pair L (Σn , R ), L (Σn , RN ) (recall L∞ (Σn , RN ) = m∞ k if and only if for every W ⊆ L1 (Σn , RN ) weakly L1 (Σn , RN )∗ ), then ka −→ compact and convex, we have sup | ka − k, p | : p ∈ W −→ 0.
574
7 Economic Equilibrium and Optimal Economic Planning
REMARK 7.4.8 Clearly the w∗ -topology on L∞ (Σn , RN ) = L1 (Σn , RN )∗ is weaker than the Mackey topology. We know that the Mackey
topology is∗the strongest locally convex topology τ on L∞ (Σn , RN ) such that L∞ (Σn , RN )τ = L1 (Σn , RN )). Convergence in the Mackey topology is related to the convergence in measure, µ denoted by −→. PROPOSITION 7.4.9 If {hk , h}k≥1 ⊆ L∞ (Σn , RN ), sup hk ∞ = η < +∞ and k≥1 µ
∞ hk −→ h as k −→ ∞, then hk −→ h in L∞ (Σn , RN ) as k −→ ∞.
m
PROOF: Let W ⊆ L∞ (Σn , RN ) be a weakly compact and convex set. According to Definition 7.4.7, we need to show that sup | hk − h, w | : w ∈ W −→ 0 as k −→ ∞. Replacing hk by hk − h if necessary, we may assume that h = 0. Because W ⊆ L1 (Σn , RN ) is weakly compact, from the Dunford–Pettis theorem, we have that W is uniformly integrable. Hence the set {v = g∞ w1 : g∞ ≤ η, w ∈ W } is uniformly integrable too. So given ε > 0, we can find ϕ ∈ L1 (Σn )+ , ϕ > 0 such that for all g ∈ L∞ (Σn , RN )+ with g∞ ≤ η and all w ∈ W , we have
g(ω) w(ω)dµ ≤ ε. {gw>ϕ}
So to prove the proposition, it suffices to show that
hm (ω) w(ω)dµ = 0. lim sup m→∞ w∈W
{hm w≤ϕ}
We can always assume that |W |1 = sup w1 : w ∈ W ≤ 1. Note that # {ϕ < λ} {ϕ = 0} = λ>0
and because ϕ(ω) > 0 µ-a.e. on Ω, we see that we can find ϑ > 0 small enough such that
ϕ(ω)dµ ≤ ε. (7.92) {ϕ<ϑ}
Also we can find δ = δ(ε) > 0 such that, if A ∈ Σn and µ(A) ≤ δ, then
ϕ(ω)dµ ≤ ε.
(7.93)
A
µ Let γ ∈ 0, min{ε, δ} . Since by hypothesis hk −→ h, we can find k0 ≥ 1 such that for all k ≥ k0 , we have
(7.94) µ {ϕ ≥ ϑ, hk ≥ γ} ≤ γ. Then, for all k ≥ k0 and all w ∈ W , we have
7.4 Stochastic Growth Models
575
≤
hk (ω) w(ω)dµ {hk w≤ϕ}∩{ϕ≥ϑ}∩{hk ≥γ} hk (ω) w(ω)dµ {ϕ≥ϑ}∩{hk ≥γ}
≤ε
(see (7.93) and (7.94)).
(7.95)
Also because we have assumed that |W |1 ≤ 1, for all k ≥ k0 and all w ∈ W , we have
hk (ω) w(ω)dµ < γ. (7.96) {hk <γ}
Hence, for all k ≥ k0 and all w ∈ W , we have
hk (ω) w(ω)dµ
=
hk w≤ϕ}
hk (ω) w(ω)dµ +
hk (ω) w(ω)dµ {hk w≤ϕ}∩{ϕ≥ϑ}∩{hk ≥γ} {hk w<ϕ}∩[{ϕ<ϑ}∪{hk <γ}]
≤ 3ε
(see (7.92) and 7.95)).
Therefore we conclude that lim sup
k→∞ w∈W
hk (ω) w(ω)dµ {hk w≤ϕ}
=0 m
∞ and as we already indicated in the beginning of the proof, this implies that hk −→ ∞ N h = 0 in L (Σn , R ) as k −→ ∞.
Using this proposition, we can now produce a price system supporting an optimal program and fulfilling the requirements described earlier. THEOREM 7.4.10 If hypotheses H1 , H3 , H4 hold, x0 ∈L∞ (Σ0 , RN ), x0 ≥ 0, and k ∗∈F (x0 ) is an optimal program, then we can find a price system p = {pn }n≥1 , pn ∈ L1 (Σn , RN )+ , n ≥ 1, such that (a) Vn (f ) − Vn (kn∗ ) ≤ pn , f − kn∗ for all f ∈ L∞ (Σn , RN ) (hereafter by ·, · we denote the duality brackets for the pair L1 (Σn , RN ), L∞ (Σn , RN ) = L1 (Σn , RN )∗ , n ≥ 1), (b) For every (f, g) ∈ Sn+1 , we have δ n+1 Jn+1 (f, g)−pn , f + pn+1 , g ∗ ∗ ) − pn , kn∗ + pn+1 , kn+1 ; ≤ δ n+1 Jn+1 (k∗ , kn+1
(c) lim pn , kn∗ = 0 (transversality condition). n→∞
PROOF: Let η1n : L∞ (Σn , RN ) × L∞ (Σn+1 , RN ) −→ L∞ (Σn , RN ) and η2n : L∞ (Σn , RN )×L∞ (Σn+1 , RN ) −→ L∞ (Σn+1 , RN ) be defined by η1n (f, g) = f
and
η2n (f, g) = g
(i.e., they are the projections on the factors of the Cartesian product L∞ (Σn , RN )× L∞ (Σn+1 , RN )). Then (η1n )∗ : L∞ (Σn , RN )∗ −→ L∞ (Σn , RN )∗ × L∞ (Σn , RN )∗ and (η2n )∗ : L∞ (Σn+1 , RN )∗ −→ L∞ (Σn , RN )∗ × L∞ (Σn+1 , RN )∗ are defined by
576
7 Economic Equilibrium and Optimal Economic Planning (η1n )∗ (v) = (v, 0) and (η1n )∗ (w) = (0, w)
(i.e., they are the embedding operators of the factor spaces in the Cartesian product L∞ (Σn , RN )∗ × L∞ (Σn+1 , RN )∗ ). Then we define ξ1n = Vn ◦ η1n
ξ2n = δ n+1 Jn+1 + Vn+1 ◦ η2n + iSn +1 ,
and
where
0 −∞
iSn +1 =
if (f, g) ∈ Sn+1 otherwise
(the indicator function of the set Sn+1 ). If (f, g) ∈ L∞ (Σn , RN ) × L∞ (Σn+1 , RN ), then from (7.90), we have ξ2n (f, g) ≤ ξ1n (f, g).
(7.97)
Because by hypothesis k = {kn∗ }n≥0 is an optimal program, from (7.91) we have ∗ ∗ ) = ξ1n (kn∗ , kn+1 ). ξ2n (kn∗ , kn+1
(7.98)
From (7.97) and (7.98) and the definition of the concave subdifferential, we deduce that ∗ ∗ ∂ξ1n (kn∗ , kn+1 ) ⊆ ∂ξ2n (kn∗ , kn+1 ). (7.99) Because of hypothesis H4 , we can use Theorem 1.2.40 (the nonsmooth chain rule) and obtain ∗ ∂ξ1n (kn∗ , kn+1 ) = (η1n )∗ ∂V2 (kn∗ ). (7.100) Recall that int L∞ (Σn , RN )+ = ∅. These facts combined with hypothesis H4 permit the use of Theorem 1.2.38, which implies that ∗ ∗ ) = ∂(δ n+1 Jn+1 + Vn+1 ◦ η2n + iSn+1 )(kn∗ , kn+1 ) ∂ξ2n (kn∗ , kn+1 ∗ ∗ ) + (η2n )∗ ∂Vn+1 (kn∗ ) + ∂iSn+1 (kn∗ , kn+1 ). = δ n+1 ∂Jn+1 (kn∗ , kn+1
(7.101) From (7.99) through (7.101), we see that if pn ∈ ∂Vn (kn ), then we can find ∗ ∗ ∗ (zn , zn+1 ) ∈ ∂ Jn+1 (kn∗ , kn+1 ), pn+1 ∈ ∂Vn+1 (kn+1 ) and (yn , yn+1 ) ∈ ∂iSn+1 (kn∗ , kn+1 ) such that (pn , 0) = (δ n+1 zn , δ n+1 zn+1 ) + (0, pn+1 ) + (yn , yn+1 ), ⇒ pn = δ n+1 zn + yn
and
− pn+1 = δ n+1 zn+1 + yn+1 .
Let pan+1 ∈ L1 (Σn+1 , RN ) be the absolutely continuous part of pn+1 ∈ ∗ L (Σn+1 , RN )∗ (see Theorem 7.4.3). We show that pan+1 ∈ ∂Vn+1 (kn+1 ). Because ∗ pn+1 ∈ ∂Vn+1 (kn+1 ), we have ∞
∗ ∗ ) ≤ pn+1 (w−kn+1 ) Vn+1 (w) − Vn+1 (kn+1
for all w ∈ L∞ (Σn+1 , RN )∗ . (7.102)
According to Definition 7.4.1, there is a sequence {Cm }m≥1 ⊆ Σn+1 which is decreasing, µ(Cm ) −→ 0 and psn+1 , the singular part of pn+1 , is supported on this sequence. We set ∗ ∞ c w + χC kn+1 ∈ L wm = χCm (Σn+1 , RN ) m
for all m ≥ 1.
(7.103)
7.4 Stochastic Growth Models
577
Returning to (7.102) and replacing w by wm given by (7.103), we obtain ∗ ) Vn+1 (wm ) − Vn+1 (kn+1 ∗ ∗ + psn+1 (wm − kn+1 ) ≤ pan+1 , wm − kn+1
a ∗ s ∗ ) , = pn+1 , wm − kn+1 + pn+1 χCm (wm − kn+1
(see (7.103) and Definition 7.4.1) ∗ . = pan+1 , wm − kn+1
(7.104)
µ
Evidently wm −→ w as m −→ ∞. Invoking Proposition 7.4.9, we have ∗ ) −→ 0 psn+1 (wm − kn+1
as m −→ ∞.
(7.105)
We also claim that Vn+1 (w) ≤ lim inf Vn+1 (wm ).
(7.106)
m→∞
To this end, given ε > 0 we can find y ∈ Dn+1 (w) such that Vn+1 (w) −
∞ ε δ i+1 Ji+1 (yi , yi+1 ). ≤ 2 i=n+1
(7.107)
We set ∗ ∞ c yi + χC ki ∈ L (Σi , RN ) yim = χCm m
and
m ∗ ∞ c yi+1 + χC ki+1 ∈ L = χCm (Σi+1 , RN ). yi+1 m
µ
µ
m −→ yi+1 as m −→ ∞. Because of hypotheses H3 , we Clearly yim −→ yi and yi+1 can find m0 ≥ 1 such that for all m ≥ m0 , we have ∞ ∞ ε m δ i+1 Ji+1 (yim , yi+1 )− δ i+1 Ji+1 (yi , yi+1 ) < . 2 i=n+1 i=n+1
(7.108)
Using (7.108) in (7.107), we obtain Vn+1 (w) − ε ≤
∞
m δ i+1 Ji+1 (yim , yi+1 ) ≤ Vn+1 (wm )
for all m ≥ m0 ,
i=n+1
which implies (7.106). We return to (7.104), pass to the limit as m −→ ∞, and use (7.105) and (7.106) to obtain ∗ ∗ ) ≤ pan+1 , w−kn+1 Vn+1 (w)−Vn+1 (kn+1
⇒
pan+1
∈
for all w ∈ L∞ (Σn+1 , RN ),
∗ ∂Vn+1 (kn+1 ).
Arguing in a similar fashion, we can show that a ∗ ) ∈ ∂Jn+1 (kn∗ , kn+1 ) (zn , zn+1
and
a ∗ (yn , yn+1 ) ∈ ∂iSn+1 (kn∗ , kn+1 ).
a a + yn+1 . Hence we have so far: pn ∈ ∂Vn (kn∗ ) and Also we have −pan+1 = δ n+1 zn+1 a ∗ ) ∈ ∂Jn+1 (kn∗ , kn+1 ), (zn , zn+1
(7.109)
578
7 Economic Equilibrium and Optimal Economic Planning ∗ pan+1 ∈ ∂Vn+1 (kn+1 ) n
and also pn = δ zn + yn ,
−pan+1
=
a ∗ (yn , yn+1 ) ∈ ∂iSn+1 (kn∗ , kn+1 ) a δ n+1 zn+1
+
a yn+1 .
(7.110)
From (7.109), we obtain
∗ δ Jn+1 (f, g) − δ Jn+1 (kn∗ , kn+1 ) n+1 ∗ n+1 a ∗ zn (f − kn ) + δ zn+1 , g − kn+1 , δ n+1
≤
and
n+1
for all (f, g) ∈ L∞ (Σn , RN ) × L∞ (Σn+1 , RN )
a ,g ⇒ δ n+1 Jn+1 (f, g) − δ n+1 zn (f ) − δ n+1 zn+1 ∗ a ∗ ≤ δ n+1 Jn+1 (kn∗ , kn+1 ) − δ n+1 zn (kn∗ ) − δ n+1 zn+1 , kn+1
for all (f, g) ∈ L∞ (Σn , RN ) × L∞ (Σn+1 , RN ).
(7.111)
Moreover, from the second inclusion in (7.110), we have a ∗ 0 ≤ yn (f − kn∗ ) + yn+1 , g − kn+1
for all (f, g) ∈ Sn+1 .
(7.112)
Adding (7.111) and (7.112) and recalling that pn = δ n+1 zn + yn , −pan+1 = a + yn+1 , we obtain
a δ n+1 zn+1
δ n+1 Jn+1 (f, g) − pn (f ) + pan+1 , g ∗ ∗ ) − pn (kn∗ ) + p∗n+1 , kn+1 ≤ δ n+1 Jn+1 (kn∗ , kn+1
for all (f, g) ∈ Sn+1 . (7.113)
In fact in (7.113), we can replace pn by its absolutely continuous part pan (see Definition 7.4.1). Indeed, let {Cm }m≥1 ⊆ Σn be a decreasing sequence such that µ(Cm ) −→ 0 as m −→ ∞ and which supports the singular functional psn . We set ∗ ∞ c f + χC kn ∈ L fm = χCm (Σn , RN ) m
and
∗ ∞ c g + χC kn+1 ∈ L gm = χCm (Σn+1 , RN ), m
m ≥ 1.
µ
m
∞ f . Therefore Evidently, we have fm −→ f and so by Proposition 7.4.9, fm −→
∗ s ∗ α ∗ pn (fm − kn∗ ) = pα n , fm − kn + pn (fm − kn ) −→ pn , f − kn
as m −→ ∞.
Also because of hypotheses H3 , we have Jn+1 (fm , gm ) −→ Jn+1 (f, g)
as m −→ ∞.
So, if in (7.113), for (f, g) we use (fm , gm ) ∈ Sn+1 and then pass to the limit as m −→ ∞, we obtain δ n+1 Jn+1 (f, g) − pan , f + pan+1 , g ∗ ∗ ) − pan , kn∗ + pan+1 , kn+1 ≤ δ n+1 Jn+1 (kn∗ , kn+1
for all (f, g) ∈ Sn+1 . (7.114)
Moreover, as we did for pn+1 , we can show that pan ∈ ∂Vn (kn∗ ). So inductively we can generate a sequence {pn }n≥1 , pn ∈ L1 (Σn , RN ) that satisfies statements (a) and (b) of the theorem. We show that pn ≥ 0. To this end, let zn = kn∗ + e with e ∈ L∞ (Σn , RN ), e ≥ 0. Because of the free disposability hypothesis (see hypothesis ∗ H1 (iii)), we have (zn , kn+1 ) ∈ Sn+1 . Hence from (7.113), we have
7.4 Stochastic Growth Models
579
∗ ∗ δ n+1 Jn+1 (zn , kn+1 ) − pn , zn + pn+1 , kn+1 ∗ ∗ ) − pan , kn∗ + pan+1 , kn+1 , ≤ δ n+1 Jn+1 (kn∗ , kn+1
n+1 ∗ ∗ ∗ Jn+1 (zn , kn+1 ) − Jn+1 (kn , kn+1 ) ≤ pn , e , ⇒ δ
⇒ 0 ≤ pn , e
(see hypothesis H3 (iii)).
(7.115)
Because e ∈ L∞ (Σn , RN )+ was arbitrary from (7.115) it follows that pn ≥ 0 for all n ≥ 1. Finally if in statement (a) we set f = 0 (see hypothesis H3 (ii)), we have 0 ≤ pn , kn∗ ≤ Vn (kn∗ ) − Vn (0).
(7.116)
Because Vn (kn∗ ) − Vn (0) −→ 0 as n → ∞, from (7.116) we conclude that lim pn , kn∗ = 0
n→∞
(the transversality condition).
REMARK 7.4.11 In Theorem 7.4.10 above, statements (a), (b), and (c) have significant economic interpretations. First of all note that if (f, g) ∈ Sn+1 , then pn , f is the cost of input f at time period n ≥ 0 and pn+1 , g represents the value of the resulting output g at time period n + 1. The quantity δ n+1 Jn+1 (f, g) is the direct utility resulting from operating the technological process (f, g) ∈ Sn+1 at time period n + 1. Therefore, we can view the quantity δ n+1 Jn+1 (f, g)−pn , f +pn+1 , g as the total expected utility generated by the process (f, g) ∈ Sn+1 . Having this in mind, we can interpret the statements in Theorem 7.4.10 as follows. (1) Statement (b) says that along the optimal program k ∗ = {kn∗ }n≥0 we have at every time period maximization of the total utility, among all other feasible processes. (2) Statement (a) says that the optimal program minimizes the cost of the input among all programs producing no less future value. (3) Statement (c) (the transversality condition) says that the expected value of the input (and of the output) goes to zero as n → ∞. (4) Borrowing a definition from the deterministic models (see Definition 7.2.8(b)), we can say that if p = {pn }n≥1 is the price system produced in Theorem 7.4.10, then (k ∗ , p) is competitive. In the next theorem, we prove the converse of Theorem 7.4.10. Namely we show that if a feasible program k ∗ = {kn∗ }n≥0 admits a price system p = {pn }n≥1 such that (k ∗ , p) is competitive, then k ∗ is an optimal program (compare with Theorem 7.2.19 for the deterministic model).
THEOREM 7.4.12 If hypotheses H1 , H3 hold and the program-price pair k ∗ = {kn∗ }n≥0 , p = {pn }n≥1 is competitive, then k ∗ ∈ F (x0 ) is optimal.
580
7 Economic Equilibrium and Optimal Economic Planning
PROOF: From the hypothesis that (k ∗ , p) is competitive, we have K
/ .
∗ ∗ δ m+1 Jm+1 (ym , ym+1 ) − Jm+1 (km∗ , km+1 ) ≤ pK+1 , kK+1
m=0
for all y = {yn }n≥0 ∈ F (x0 ) (recall that pK+1 ≥ 0, yK+1 ≥ 0). Let K −→ +∞. Using the tranversality condition, we obtain ∞
δ m+1 Jm+1 (ym , ym+1 )≤
m=0
⇒ k ∗ ∈ F (x0 ) is optimal.
∞
∗ δ m+1 Jm+1 (km∗ , km+1 )
for all y ∈ F (x0 ),
m=0
Next we turn our attention to the undiscounted stochastic economy, using as performance criterion the notion of weak maximality (see Definition 7.2.26(b)). To treat this case, we need to restrict ourselves to the stationary model. For this reason we assume that there exists a map τ : (Ω, Σn ) −→ (Ω, Σn−1 ) which is one-to-one onto and both τ and τ −1 are measurable. Also we assume that the transformation τ is measure preserving; namely for every n ≥ 1 and every A ∈ Σn−1 , we have
µ(A) = µ τ −1 (A) . Under these hypotheses the probability spaces (Ω, Σn , µ) and (Ω, Σn−1 , µ) are isomorphic with the transformation τ, τ (Σn ) = Σn−1
isomorphism and for every B ∈ Σ we have µ τ (B) = µ(B). So we see that the probability space n
Ω, τ (Σn ), µ is in fact equivalent to the probability space (Ω, Σn−1 ), µ). Therefore we can think of τ as a time shift operator. Suppose that f : Ω −→ RN is a Σn measurable function, then we can find g : Ω −→ RN is a Σn−1 -measurable function such that f = g ◦ τ , hence g = f ◦ τ −1 . Therefore any Σn -measurable function f : Ω −→ RN can be written as f = g ◦ τ n with g : Ω −→ RN Σ0 -measurable. Moreover, because τ is measure preserving, for every C ∈ RN , we have
µ f −1 (C) = µ (g ◦ τ n )−1 (C) = µ τ −1 ◦ (g ◦ τ n−1 )−1 (C)
= µ (g ◦ τ n−1 )−1 (C) = · · · = µg −1 (C). So f and g are essentially the same function, only f is shifted in a later time period. For reasons of convenience in the presentation, we replace the feasibility set Gn (ω) by Ln (ω, k) = y ∈ RN n ≥ 1. + : (k, y) ∈ Gn (ω) , We see that Gr Ln (ω, ·) = Gn (ω). The set Ln (ω, k) represents the set of all possible outputs when the state of the environment is ω ∈ Ω and the input is k ∈ RN + . This production multifunction and the instantaneous utility function, are stationary, namely
Ln (ω, k) = L1 τ n−1 (ω), k and un (ω, k, y) = u1 τ n−1 (ω), k, y . Therefore all the data of the model are described in terms of the corresponding quantity at time n = 1, by using the time shift operator τ . Hence L1 and u1 are the prototype production multifunction and utility function, respectively, and in the sequel we drop the subscript and simply denote them by L and u. Let us give the precise mathematical hypotheses on L and u.
7.4 Stochastic Growth Models
581
N
R+ H5 : L : Ω×RN \{∅} is a multifunction such that + −→ 2
N N N (i) Gr L = (ω, k, y) ∈ Ω × RN + × R+ : y ∈ L(ω, k) ∈ Σ1 × B(R+ ) × B(R+ ). N (ii) For every ω ∈ Ω, GrL(ω, ·) is closed and convex in RN + × R+ .
(iii) There exists M > 0 such that for all (ω, k) ∈ Ω × RN + , we have |L(ω, k)| = sup y : y ∈ L(ω, k) ≤ M. (iv) For every ω ∈ Ω, if y ∈ L(ω, k), k ≥ k, and y ≤ y, with k ≤ M , then y ∈ L(ω, k ). (v) For every ω ∈ Ω, there exists (k, y) ∈ Gr L(ω, ·) such that k ! y. REMARK 7.4.13 Comparing this set of hypotheses with H1 , we see that they are essentially the same. Only we have included hypothesis H5 (v) on the existence of an expansible capital stock. N H6 : u : Ω×RN + ×R+ −→ R = R ∪ {−∞} is a function such that N (i) (ω, k, y) −→ u(ω, k, y) is Σ1 × B(RN + ) × B(R+ )-measurable.
(ii) For every ω ∈ Ω, (k, y) −→ u(ω, k, y) is upper semicontinuous and concave. (iii) For all k, y ∈ RN + with k, y ≤ M , we have u(ω, k, y) ≤ ϕ(ω) µ-a.e. on Ω with ϕ ∈ L1 (Ω). (iv) For all (ω, y) ∈ Ω × RN + , k −→ u(ω, k, y) is nondecreasing and for all (ω, k) ∈ Ω × RN + , y −→ u(ω, k, y) is nonincreasing. By a program, we understand a sequence k = {kn }n≥0 , kn ∈ L∞ (Σn , RN ). The program is feasible, if we have
kn+1 (ω) ∈ L ω, kn (ω) µ-a.e. on Ω, n ≥ 0. There is an initial capital stock x0 ∈ L∞ (Σ0 , RN ) and the set of all feasible programs starting at x0 is denoted by F (x0 ). DEFINITION 7.4.14 A program k ∗ ={kn∗ }n≥0∈F (x0 ) is said to be weakly maximal if for every y = {yn }n≥0 ∈ F (x0 ), we have lim inf m→∞
m
∗ ◦ τ −n−1 ) ≤ 0, J(yn ◦ τ −n , yn−1 ◦ τ −n−1 ) − J(kn∗ ◦ τ −n , kn+1
n=0
where J(f, g) =
Ω
u ω, f (ω), g(ω) dµ for all f, g ∈ L∞ (Ω, RN ).
REMARK 7.4.15 Note that due to the stationarity of the model, we have Jn+1 (yn , yn+1 ) = J(yn ◦ τ −n , yn−1 ◦ τ −n−1 ) and
∗ ∗ ) = J(kn∗ ◦ τ −n , kn+1 ◦ τ −n−1 ), Jn+1 (kn∗ , kn+1
n ≥ 1.
582
7 Economic Equilibrium and Optimal Economic Planning
Therefore Definition 7.4.14 is Definition 7.2.26(b) adapted to the present stochastic setting. We introduce the set C = f ∈ L1 (Σ0 , RN ) : f (ω) ≤ M µ-a.e. on Ω . In the light of hypothesis H5 (iii), C is the set where we expect to locate any possible weakly maximal programs. Clearly this set is convex and weakly compact in L1 (Σ0 , RN ). We introduce the multifunction R : C −→ 2C \{∅} defined by
R(k) = y ∈ L1 (Σ0 , RN ) : y τ (ω) ∈ L ω, k(ω) µ-a.e. on Ω}. In what follows we use C endowed with the relative weak L1 (Σ0 , RN )-topology. PROPOSITION 7.4.16 If hypotheses H5 hold, then R is usc from C into Pkc (C). PROOF: Because C ⊆ L1 (Σ0 , RN ) is weakly compact and the weak topology on C is metrizable, to prove the desired upper semicontinuity of the multifunction R, it is enough to show that Gr R is sequentially closed in C × C (see Proposition 6.1.10). To this end let (kn , yn ) ∈ GrR, n ≥ 1 and assume that w
kn −→ k
and
w
yn −→ y
in L1 (Σ0 , RN )
as n → ∞.
In what follows by ·, · we denote the duality brackets for the pair L1 (Σ1 , RN ), L∞ (Σ1 , RN ) . For every A ∈ Σ1 and every x∗ ∈ RN , we have
χA x∗ , yn ◦ τ ≤ σ χA x∗ , R(kn ) ◦ τ = sup χA x∗ , z ◦ τ : z ∈ R(kn )
∗
⇒ dµ ≤ sup x , (z ◦ τ )(ω) RN dµ : z ∈ R(kn ) x∗ , yn τ (ω) N R A
A
= sup (x∗ , v)RN : v ∈ L ω, kn (ω) dµ, A
(see Theorem 6.4.16)
σ x∗ , L ω, kn (ω) dµ.
= A
From hypotheses H5 (ii) and (iii), it follows that z −→ σ x∗ , L(ω, z) is concave and upper semicontinuous (see Proposition 6.1.15(c)). So via Mazur’s lemma and Fatou’s lemma, we have
σ x∗ , L ω, kn (ω) dµ lim sup n→∞
A
≤ σ x∗ , L ω, k(ω) dµ = σ χA x∗ , R(k) ◦ τ ,
A
⇒ dµ ≤ σ x∗ , L ω, k(ω) dµ. (7.117) x∗ , y τ (ω) A
RN
A
Because A ∈ Σ1 and x∗ ∈ RN were arbitrary, from (7.117) it follows that
7.4 Stochastic Growth Models
≤ σ x∗ , L ω, k(ω) for µ-almost all ω ∈ Ω
x∗ , y τ (ω)
583
RN
and all x ∈ RN ,
⇒ y τ (ω) ∈ L ω, k(ω)
µ-a.e. on Ω,
⇒ y ∈ R(k). Proposition 7.4.16 and Theorem 6.5.19 (the Kakutani–Ky Fan fixed point theorem), imply S = {k ∈ C : k ∈ R(k)} = ∅. Moreover, S is weakly compact and convex in L1 (Σ0 , RN ) (see hypothesis H5 (ii)). We consider the following deterministic infinite-dimensional optimization problem, v = sup J(k, k) : k ∈ R(k) = sup J(k, k) : k ∈ S . (7.118) Because of hypotheses H6 , J(·, ·) is upper semicontinuous on C × C and so problem (7.118) admits a solution k∗ ∈ S. In addition, because of hypothesis H6 (iv), we see that v = sup J(k, y) : y ∈ R(k), k ≤ y . (7.119) Using the infinite-dimensional Kuhn–Tucker multiplier theorem we produce a stationary price system supporting the stationary program k∗ ∈ S. PROPOSITION 7.4.17 If hypotheses H5 and H6 hold, then we can find p ∈ L1 (Σ0 , RN ), p ≥ 0 such that for all (k, y) ∈ Gr R, we have J(k, y) + p, y − k ≤ J(k∗ , k∗ ) = v. PROOF: Let G1 (ω) = (k, y) ∈ GrL(ω, ·) : k ! y . The existence of an expansible capital stock for every ω ∈ Ω (see hypothesis H5 (v)), implies that G1 (ω) = ∅ for all ∗ ω ∈ Ω. Let {x∗n }n≥1 be dense in RN + , xn = ∅ for all n ≥ 1. Then we have # (k, y) ∈ GrL(ω, ·) : (x∗n , y − k)RN > 0 G1 (ω) = n≥1
⇒ GrG1 ∈ Σ1 × B(RN ) × B(RN )
(see hypothesis H5 (i)).
We can apply Theorem 6.3.20 (the Yankov–von Neumann–Aumann selection theorem) and obtain k , y : Ω −→ RN Σ1 -measurable functions such that
k (ω), y (ω) ∈ G1 (ω) for all ω ∈ Ω. We set k = k ◦ τ −1 and y = y ◦ τ −1 . Evidently (k, y) ∈ L∞ (Σ0 , RN )×L∞ (Σ0 , RN ) and we have k(ω) ! y(ω) ∞
for all ω ∈ Ω,
⇒ k ! y in L (Σ0 , RN )
(recall int L∞ (Σ0 , RN )+ = ∅).
So we can apply the infinite-dimensional Kuhn–Tucker multiplier rule and obtain p ∈ L∞ (Σ0 , RN ), p ≥ 0, such that
584
7 Economic Equilibrium and Optimal Economic Planning J(k, y) + p(y − k) ≤ J(k∗ , k∗ ) = v
for all (k, y) ∈ Gr R.
(7.120)
Arguing as in the proof of Theorem 7.4.10 and using Theorem 7.4.3, in (7.120), we can replace p by its absolutely continuous part pa ∈ L1 (Σ0 , RN ), pa ≥ 0, which is the desired stationary price system. Next we give the definition of good program for this stochastic model (see also Definition 7.2.31). DEFINITION 7.4.18 A feasible program k = {kn }n≥0 is a good program if lim sup m→∞
m
v − J(kn ◦ τ −n , kn+1 ◦ τ −n−1 < +∞.
n=0
Good programs exhibit a remarkable average asymptotic property. PROPOSITION 7.4.19 If hypotheses H5 , H6 hold, and k = {kn }n≥0 is a good program, then the sequence of arithmetic averages kn =
k0 + k1 ◦ τ −1 + · · · + kn ◦ τ −n n+1 n≥0
has weak limit points in L1 (Σ0 , RN ) and each such limit point solves the optimization problem (7.119) (hence it is an optimal stationary program). PROOF: Because of hypothesis H5 (ii), we have
n+1 n 1 1 (km ◦ τ −m )(ω), (km ◦ τ −m )(ω) ∈ Gr L(ω, ·) µ-a.e. on Ω. n + 1 m=0 n + 1 m=1 (7.121) We have n+1 n+1 1 1 1 (km ◦ τ −m )(ω) = (km ◦ τ −m )(ω) − k0 (ω). n + 1 m=1 n + 1 m=0 n+1
(7.122)
Also, because of hypothesis H5 (iii), we see that kn =
n 1 km ◦ τ −m ⊆ L1 (Σ0 , RN ), n + 1 m=0 n≥0
is relatively weakly sequentially compact. So we may assume that kn −→ k ∗ in L1 (Σ0 , RN ),
∗ ⇒ k (ω), k ∗ (ω) ∈ GrL(ω, ·) µ-a.e. on Ω, (see (7.121) and 7.121)), w
⇒ k ∗ ∈ S. Using Jensen’s inequality and because by hypothesis k is a good program,
7.4 Stochastic Growth Models J(k∗ , k∗ ) − J ≤
585
n+1 n 1 1 km ◦ τ −m , km ◦ τ −m n + 1 m=0 n + 1 m=1
n 1
γ J(k∗ , k∗ ) − J(km ◦ τ −m , km+1 ◦ τ −m−1 ) ≤ n + 1 m=0 n+1
(7.123) for some γ > 0 and all n ≥ 1. Recall that v = J(k∗ , k∗ ). Exploiting the weak upper semicontinuity of the expected utility function J(·, ·), we have lim sup J n→∞
n+1 n 1 1 km ◦ τ −m , km ◦ τ −m ≤ J(k ∗ , k ∗ ). n + 1 m=0 n + 1 m=1
(7.124)
So, if we pass to the limit as n → ∞ in (7.123) and we use (7.124), we obtain J(k∗ , k∗ ) = v ≤ J(k ∗ , k ∗ ). But recall that k ∗ ∈ S. So J(k∗ , k∗ ) = J(k ∗ , k ∗ ) = v.
We introduce the value loss function ξ(k, y) defined by ξ(k, y) = v − J(k, y) + p, y − k
for all (k, y) ∈ Gr R.
PROPOSITION 7.4.20 If hypotheses H5 , H6 hold and {kn }n≥0 , {zn }n≥0 are two good programs that have sequences of arithmetic means {kn }n≥0 , {z n }n≥0 which converge weakly in L1 (Σ0 , RN ) to the same limit k∗ , then lim inf m→∞
m
n=0 m
≤ lim sup m→∞
J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) − J(zn ◦ τ −n , zn+1 ◦ τ −n−1 )
ξ(zn ◦ τ −n , zn+1 ◦ τ −n−1 ) − ξ(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) .
n=0
(7.125) PROOF: We have m
J(kn ◦ τ −n , kn+1 ◦ τ −n−1 )−J(zn ◦ τ −n , zn+1 ◦ τ −n−1 )
=
n=0 m
J(kn ◦ τ −n , kn+1 ◦ τ −n−1 )− v − p, kn ◦ τ −n −kn+1 ◦ τ −n−1
n=0
−J(zn ◦ τ −n , zn+1 ◦ τ −n−1 ) + v + p, zn ◦ τ −n − zn+1 ◦ τ −n−1 + p, kn ◦ τ −n − kn+1 ◦ τ −n−1 − p, zn ◦ τ −n − zn+1 ◦ τ −n−1 m
≤ ξ(zn ◦ τ −n , zn+1 ◦ τ −n−1 )−ξ(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) n=0
+ p, zm+1 ◦ τ −m−1 − km+1 ◦ τ −m−1 .
Note that because of Proposition 7.4.19, we have
(7.126)
586
7 Economic Equilibrium and Optimal Economic Planning lim sup p, zm+1 ◦ τ −n − km+1 ◦ τ −m−1 m→∞
≤
lim
m→∞
m+1 1 p, zn ◦ τ −n − kn ◦ τ −n = 0. m + 2 n=0
(7.127)
So, if we pass to the limit as m −→ ∞ in (7.126) and use (7.127), we obtain (7.125). REMARK 7.4.21 Note that the lim inf in the left-hand side of (7.125) is similar to the one involved in the definition of weak maximality (see Definition 7.4.18). The above remark suggests that we should consider the following problem, sup[ϑ(k) : k ∈ F (x0 )], where ϑ(k) = lim
(7.128)
−ξ(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) , with k = {kn }n≥0 . Because
m
m→∞ n=0
−ξ ≤ 0, the limit in the definition of ϑ can be −∞. However, this can not happen if k is good. For this reason (in analogy with the deterministic case) we look at good programs to find a weakly maximal one. This requires that the solution set of (7.119) is a singleton. H7 : There exists at least one good program k = {kn }n≥0 ∈ F (x0 ) and the solution set S0 of (7.119) is a singleton. REMARK 7.4.22 If for all ω ∈ Ω, u(ω, ·, ·) is strictly concave, the second part of hypothesis H7 is automatically satisfied. Now we can prove the existence of a weakly maximal program. THEOREM 7.4.23 If hypotheses H5 ,H6 , and H7 hold, then there exists a weakly maximal program k ∗ = {kn∗ }n≥0 ∈ F (x0 ). PROOF: For every m ≥ 0, we set ϑm (k) =
m
−ξ(kn ◦ τ −n , kn+1 ◦ τ −n−1 )
n=0
for every k = {kn }n≥0 ∈ F (x0 ). Evidently ϑm is w∗ -upper semicontinuous and concave. Moreover, we have ϑm ↓ ϑ as m −→ ϑ is w∗ -upper semicontinu"∞∞. Hence n ous and concave. Also note that F" (x0 ) ⊆ n=0 C ◦ τ and by Tychonov’s theorem ∞ N the product set is w∗ -compact in ∞ n=0 L (Σn , R ). Therefore we can apply the ∗ ∗ theorem of Weierstrass and find k = {kn }n≥0 ∈ F (x0 ) such that ϑ(k ∗ ) = sup ϑ(k) : k ∈ F (x0 ) . We claim that the maximizing feasible program k ∗ is a good program. From hypothesis H7 there is a good program k = {kn }n≥0 ∈ F (x0 ). Then given ε > 0, we can find m0 = m0 (ε) ≥ 1 such that for all m ≥ m0 , we have
7.4 Stochastic Growth Models
587
m
J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) + p, kn+1 ◦ τ −n−1 − kn ◦ τ −n
≤ ⇒
n=0 ∞
∗ ∗ ◦ τ −n−1 ) + p, kn+1 ◦ τ −n−1 − kn∗ ◦ τ −n + ε, J(kn∗ ◦ τ −n , kn+1
n=0 m
∗ ◦ τ −n−1 ) v − J(kn∗ ◦ τ −n , kn+1
n=0
≤
m
v − J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) +
n=0
∗ ◦ τ −m−1 − km+1 ◦ τ −m−1 + ε p, km+1
≤ γ + 4M p1 + ε ⇒ k ∗ = {kn∗ }n≥0 ∈ F (x0 ) is a good program (see Definition 7.4.18). If k ∗ is not weakly maximal, then we can find ε > 0, m0 ≥ 1, and z ∈ F (x0 ) such that Vm (k ∗ ) + ε ≤ Vm (z), where Vm (k ∗ ) = and
Vm (z) =
m n=0 m
∗ J(kn∗ ◦ τ −n , kn+1 ◦ τ −n−1 )
J(zn ◦ τ −n , zn+1 ◦ τ −n−1 ).
n=0
Then we have
0 < ε ≤ Vm (z) − Vm (k ∗ ).
(7.129)
∗
Recall that k ∈F (x0 ) too is a good program. In addition, because k ∗ maximizes ϑ on F (x0 ), given δ > 0, we can find m0 = m0 (δ) ≥ 1 such that for all m ≥ m0 , we have m
∗ ∗ ◦ τ −n−1 ) − ξ(zn ◦ τ −n , zn+1 ◦ τ −n−1 ) ≤ δ, ξ(kn ◦ τ −n , kn+1 ⇒
n=0 m
∗ ∗ ◦ τ −n−1 ) + p, kn+1 ◦ τ −n−1 − kn∗ ◦ τ −n v − J(kn∗ ◦ τ −n , kn+1
n=0
− v + J(zn ◦ τ −n , zn+1 ◦ τ −n−1 ) − p, zn+1 ◦ τ −n−1 − zn ◦ τ −n ≤ δ, ∗ ⇒ ε ≤ p, zm+1 ◦ τ −m−1 − km+1 ◦ τ −m−1 + δ. Passing to the limit as m−→∞ and using (7.127) (recall that {zn }n≥0 , {kn∗ }n≥0 are both good programs), we obtain 0 < ε ≤ δ. But δ > 0 was arbitrary. So we let δ ↓ 0 to reach a contradiction. This proves that k ∗ ∈ F (x0 ) is weakly maximal. A careful reading of the previous proof reveals that the hypothesis concerning the existence of a good program (see hypothesis H7 ), was crucial. So we need to produce verifiable conditions that imply the existence of a good program in F (x0 ). For this reason we introduce the following hypotheses.
588
7 Economic Equilibrium and Optimal Economic Planning
N H8 : u : Ω×RN + ×R+ −→ R is a function such that N (i) (ω, k, y) −→ u(ω, k, y) is Σ1 × B(RN + ) × B(R+ )-measurable.
(ii) For every ω ∈ Ω, (k, y) −→ u(ω, k, y) is continuous, concave. (iii) For every r > 0, there exists ϕr ∈ L1 (Ω)+ such that |u(ω, k, y)| ≤ ϕr (ω) for µ-almost all ω ∈ Ω and all k, y ≤ r.
H9 : There exists w ∈ L∞ (Σ0 , RN ) such that (w ◦ τ ) ∈ L ω, x0 (ω) µ-a.e. on Ω and x0 (ω) ! w(ω) µ-a.e. on Ω. REMARK 7.4.24 This hypothesis means that the initial capital stock x is expansible. THEOREM 7.4.25 If hypotheses H6 , H8 , and H9 hold, then there exists a good program k = {kn }n≥0 ∈ F (x0 ). PROOF: Hypothesis H9 implies that there exists 0 < λ < 1 such that (1 − λ)k∗ + λx0 ≤ w
in the ordered Banach space L∞ (Σ0 , RN ),
where k∗ ∈ S is a solution of problem (7.119). We set kn = (1 − λn )k∗ ◦ τ n + λn x0 ◦ τ n ∈ L∞ (Σn , RN ),
n ≥ 0.
We claim that k = {kn }n≥0 ∈ F (x0 ). If n = 0, then because of τ 0 -identity, we have k0 = x0 . For n ≥ 1, because of the convexity of Gr L(ω, ·) (see hypothesis H5 (iii)), we have (1 − λn )(k∗ ◦ τ n )(ω) + λn (x0 ◦ τ n )(ω)
∈ L τ n (ω), (1 − λn )(k∗ ◦ τ n )(ω) + λ(x0 ◦ τ n )(ω)
n = L τ (ω), (1 − λn )(k∗ ◦ τ n )(ω) + λ(x0 ◦ τ n )(ω)
= Ln+1 ω, x0 (ω) µ-a.e. on Ω. ∗
∞
(7.130)
Because (1 − λ)k + λx0 ≤ w in L (Σ0 , R ), we have N
kn+1 = (1 − λn+1 )k∗ ◦ τ n+1 + λn+1 x0 ◦ τ n+1
≤ (1 − λn+1 )k∗ ◦ τ n + λn w ◦ τ n ◦ τ.
(7.131)
Then from (7.130), (7.131), and the free disposability hypothesis (see hypothesis H5 (iv)), we have
kn+1 (ω) ∈ Ln+1 ω, kn (ω) = L τ n (ω), kn (ω) µ-a.e. on Ω, ⇒ k ∈ F (x0 ). We claim that k is good. From hypotheses H8 we have that J is continuous, concave on L1 (Ω, RN )×L1 (Ω, RN ). Therefore it is Lipschitz continuous on bounded sets, in particular on C. Therefore we have
7.5 Continuous-Time Growth Models
589
v − J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) = J(k∗ , k∗ ) − J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) ≤ lM (λn + λn+1 )4M, for some lM > 0, all n ≥ 1. Summing up, we obtain
v − J(kn ◦ τ −n , kn+1 ◦ τ −n−1 ) < +∞, n≥0
⇒ k = {kn }n≥0 is a good program (see Definition 7.4.18).
7.5 Continuous-Time Growth Models In this section, we turn our attention to continuous-time growth models. We consider a general deterministic model of capital accumulation, with a convex technology and a concave utility function. We establish the existence of an optimal growth path using the direct method of the calculus of variations. This requires a careful choice of topology on the space of feasible programs. First let us describe the model. The planning horizon is R+ = [0, +∞} (continuous-time, infinite horizon model). There are N capital goods in the economy and so the commodity space is RN (multisector growth model). The technological possibilities of the economy are described by a multifunction G : R+ × RN + −→ RN + 2 \ {∅}. If y ∈ G(t, x), then this means that if at time t ≥ 0, we are given a capital stock x, then y can be accumulated as additional capital. Hence at every time instant t ≥ 0, G(t, ·) describes the technological capacity of the economy at that moment. For this reason, we call G the technology multifunction. Note that in our model this multifunction is time-varying and so the model (at least in its technology component) is nonstationary. The performance of a feasible technological process (x, y) ∈ Gr G(t, ·) is measured by an instantaneous utility function u(t, x, y) and future utilities are discounted by a factor δ∈(0, 1). So the intertemporal utility generated by a growth path x is given by
∞
J(x) = e−δt u t, x(t), x (t) dt. 0
Also, there is an initial capital stock x0 ∈ RN +. DEFINITION 7.5.1 A capital accumulation program x is an absolutely continuous map x: R+ −→ RN + . From Lebesgue’s theorem we know that such a function is differentiable almost everywhere and for every t ≥ 0 we have
t x (s)ds. x(t) = x(0) + 0
A program x is feasible if 0 ≤ x(0) ≤ x0 and x (t) ∈ G t, x(t) a.e. on R+ . The set of all feasible programs is denoted by F (x0 ). We assume throughout this section that F (x0 ) = ∅. Note that F (x0 ) ⊆ ACloc (R+ , RN ).
590
7 Economic Equilibrium and Optimal Economic Planning
REMARK 7.5.2 The hypothesis that F (x0 ) = ∅ is not at all restrictive. Indeed, we already know from the analysis of the discrete models, that it is reasonable to assume that inaction is an option (i.e., 0 ∈ G(t, 0) a.e. on R+ ) and that free disposability holds. Hence for all (t, x) ∈ R×RN + we have 0 ∈ G(t, x) and so we see that F (x0 ) = ∅. We introduce the measure µ on R+ defined by dµ = e−δt dt. Evidently µ is a finite measure on R+ which is absolutely continuous with respect to the Lebesgue measure. The Radon–Nikodym derivative of µ with respect to the Lebesgue measure is of course e−δt . In what follows by L1 (R+ , µ; R) we denote the space of functions from R+ into R whose absolute value is µ-integrable. Now we can introduce our hypotheses on the technology G and the utility u: N H1 : G : R×RN + −→ Pkc (R ) is a multifunction such that
(i) For all x ∈ RN + , t −→ G(t, x) is graph-measurable. (ii) For almost t ∈ T , x −→ G(t, x) has a closed graph. (iii) There exists a function ϕ ∈ L1loc (R+ ) such that for all x ∈ F (x0 ) we have x (t) ≤ ϕ(t)
a.e. on R+ .
REMARK 7.5.3 Hypothesis H1 (iii) is essentially a growth condition on the technology. For example, assume that for almost all t ∈ T and all x ∈ RN + we have |G(t, x)| = sup y : y ∈ G(t, x) ≤ ξ(t)ϕ(x), 1 with ∞ ξ ∈ Lloc (R+ ) and ϕ a nonnegative continuous function on R+ such that dr/ϕ(r) = +∞. Then hypothesis H1 (iii) is satisfied. To see this, fix b > 0 and then choose M > x0 + 1 such that
M
b dr ξ(t)dt. (7.132) > x0 +1 ϕ(r) 0
We claim x(t) ≤ M for all t ∈ T = [0, b]. If this is not true, then we can find t1 , t2 ∈ T such that 0 ≤ t1 ≤ t2 ≤ b and x(t1 ) = x0 + 1, x(t2 ) = M
and
x0 + 1 ≤ x(t) ≤ M
for all t ∈ [t1 , t2 ].
We have x (t) ≤ ξ(t)ϕ(x(t)) a.e. on T,
b
M
t2 dr ξ(s)ds ≤ ξ(s)ds ⇒ ≤ t1 x0 +1 ϕ(r) 0 which contradicts (7.132). In fact we can assume a more general majorant; namely |G(t, x)| ≤ g(t, x)
a.e. on R,
where g : R+ ×R+ −→ R+ is a function such that
for all x ≥ 0,
7.5 Continuous-Time Growth Models • • •
591
For all r ≥ 0, g(·, r) ∈ L1loc (R+ ). For all t ≥ 0, g(t, ·) is nondecreasing. There exist t0 ≥ 0 and u0 > 0 such that the scalar differential equation
u (t) = g t, u(t) , u(t0 ) = u0 has a bounded solution on [t0 , +∞). The hypotheses on the instantaneous utility function, are the following:
H2 : u : R+ ×RN ×RN −→ R = R ∪ {−∞} is a function such that (i) (t, x, y) −→ u(t, x, y) is Borel-measurable. (ii) For almost all t ≥ 0, (x, y) −→ u(t, x, y) is upper semicontinuous. (iii) For almost all t ≥ 0 and all x ∈ RN , y −→ u(t, x, y) is concave. (iv) For almost all t ≥ 0 and all x, y ∈ RN , we have |u(t, x, y)| ≤ α(t) + c(x + y)
with α ∈ L1 (R+ , µ; R)+ and c > 0.
REMARK 7.5.4 By allowing the utility function to take the value −∞, we have incorporated in its definition constraints such as u(t, x, y) = −∞ if x ∈ / RN + . Another economically interesting situation accommodated by hypotheses H2 , is when u(t, x, y) −→ −∞ as x, y −→ 0. Indeed, in many situations it is more natural to heavily penalize the inactivity option, more than simply setting u(t, 0, 0) = 0.
∞ If J(x) = 0 e−δt u t, x(t), x (t) dt, x ∈ F (x0 ), our goal is to find a solution of the following optimization problem, V (x0 ) = sup[J(x) : x ∈ F (x0 )].
(7.133)
As we already mentioned, we approach problem (7.133) using the direct method of the calculus of variations. This requires a careful choice of topology that will make the constraint set F (x0 ) compact and the objective functional (the discounted intertemporal utility) upper semicontinuous. The problem that we face is that even if we choose a good topology on the space of capital stocks, this translates into weak convergence of the corresponding investment flows. Weak convergence does not imply, in general, pointwise convergence (even for a subsequence). This lack of pointwise convergence poses several technical difficulties in the existence theory. For this reason, we have included convexity and concavity conditions in hypotheses H1 and H2 . First we focus on the constraint set and seek a topology that makes it compact. For this purpose, we introduce two topologies. The first one concerns the capital accumulation programs. DEFINITION 7.5.5 The c-topology on C(R+ , RN ) (compact-open topology) (hence on ACloc (R+ , RN ) too) is the topology of uniform convergence on compact sets.
592
7 Economic Equilibrium and Optimal Economic Planning
REMARK 7.5.6 The above topology is defined by the family of seminorms pn (x) = max{x(t) : t ∈ [0, n]}, where · is any of the equivalent lp -norms on RN . It is well-known that C(R+ , RN ) endowed with this topology becomes a Fr´echet space. This topology makes the constraint set compact. But in order to be able to show this, we need to consider another topology related to the investment flows. The investment flows are the derivatives of the capital accumulations. Because the latter by definition are absolutely continuous functions, by Lebesgue’s theorem the investment flows belong in L1loc (R+ , RN ). This too is a Fr´echet space, for the topology defined by the sequence of seminorms
n y(t)dt, n ≥ 1. qn (y) = 0
Then using this fact we can define a second topology on the space of absolutely continuous functions from R+ into RN , namely the following. DEFINITION 7.5.7 The α-topology on ACloc (R+ , RN ), is the Fr´echet topology generated by the sequence of seminorms rn (x) = pn (x) + qn (x )
for all n ≥ 1. n Recall that pn (x) = max{x(t) : t ∈ [0, n]} and qn (x ) = 0 x (t)dt. REMARK 7.5.8 Evidently on ACloc (R+ , RN ), the a-topology is stronger than the c-topology. Note that the a-topology is equivalent to the one given by the natural metric on the direct sum RN ⊕ L1loc (R+ , RN ). So we can make the identification ACloc (R+ , RN ) = RN ⊕ L1loc (R+ , RN ). Using this identification we can now determine the dual of ACloc (R+ , RN ) and therefore define the weak topology on ACloc (R+ , RN ). The dual of RN with the lp -norm, is of course RN with the lp -norm, (1/p) + (1/p ) = 1. The dual of L1loc (R+ , RN ) is the space ∗ ∞ {y ∈ L (R+ , RN ) : suppy ∗ is compact}. So x∗ ∈ ACloc (R+ , RN ) if and only if
b
∗
∗ ∗ y , x = v , x(0) RN + w (t), x (t) RN dt 0
∗
for some v ∈ R , some b > 0, and some w ∈ L∞ (T, RN ), T = [0, b]. Therefore with this duality we can define a weak topology on ACloc (R+ , RN ). Using the Dunford– w c Pettis theorem, we can check that if xn −→ x in ACloc (R+ , RN ), then xn −→ x. However, this is no longer true if sequences are replaced by nets. Recall that the generalized Lebesgue’s dominated convergence theorem fails for nets. N
∗
Let ξ ∈ L1loc (R+ ) and define D(ξ) = x ∈ ACloc (R+ , RN ) : x (t) ≤ ξ(t) a.e. on T . We show that when restricted to sets such as D(ξ), the weak and c-topologies coincide. This permits us to show the c-compactness of the set F (x0 ). To this end, first we prove the following lemma.
7.5 Continuous-Time Growth Models
593
c
LEMMA 7.5.9 If {xα }α∈S is a net in D(ξ) and xα−→x in ACloc (R+ , RN ), then w x −→ x in ACloc (R+ , RN ) and x ∈ D(ξ). PROOF: Fix b > 0. Given ε > 0, we can find δ > 0 such that if A ⊆ [0, b] is measurable and |A| < δ (by | · | we denote the Lebesgue measure on R), then we have
ξ(t)dt < ε. (7.134) A
We take 0 ≤ s1 ≤ t1 ≤ · · · ≤ sm ≤ tm ≤ b, with
m
(tk − sk ) < δ. Then
k=1 m
≤
k=1 m
x(tk ) − x(sk )
x(tk ) − xa (tk ) + xa (sk ) − x(sk ) +
k=1
≤ 2mx − xa C([0,b],RN ) +
m k=1
tk
xa (τ )dτ
sk tk
xa (τ )dτ
sk
< 2mx − xa C([0,b],RN ) + ε < 2ε
for a ∈ S large (see (7.134)),
which implies that x ∈ ACloc (R+ , RN ). Next given ε > 0, we can find δ > 0 such that
ξdt < ε and kdt < ε C
C
for all C ⊆ [0, b] measurable with 0 < |C| < δ. Let A ⊆ [0, b] be measurable with 0 < |A|. We can find U ⊆ [0, b] open such that |C = A U | < δ. We know that m U= (sk , tk ). We have k=1
x dt A
A
≤ (xa − x )dt + xa dt + x dt
xa dt −
U
≤
C
C
m
xa (tk ) − x(tk ) + xa (sk ) − xa (sk ) + 2ε
k=1
≤ 2mxa − xC([0,b],RN ) + 2ε,
xa dt −→ x dt. ⇒ A
A
Because A ⊆ [0, b]-measurable was arbitrary, we have xa −→ x in L1 ([0, b], RN ). w
Also xa (0) −→ x(0). Because b > 0 was arbitrary, we conclude that w
xa −→ x
in ACloc (R+ , RN ).
594
7 Economic Equilibrium and Optimal Economic Planning Moreover, from Mazur’s lemma, we infer that x (t) ≤ ξ(t)
a.e. on R+ ,
⇒ x ∈ D(ξ). w
c
LEMMA 7.5.10 If {xa }a∈S ⊆ D(ξ) and xa −→ x in ACloc (R+ , RN ), then xa −→ x and x ∈ D(ξ). PROOF: Let ε > 0 and b > 0 be given. We can find δ > 0 such that
t t ξ(τ )dτ < ε and x (τ )dτ < ε s
s w
for 0 ≤ s ≤ t ≤ b and (t − s) < δ. Because xa −→ x in ACloc (R+ , RN ) for a ∈ S large, we have
δn xa (δn) − x(δn) ≤ xa (0) − x(0) + (xa − x )dτ < ε, 0
where n is an integer such that 0 ≤ n ≤ (1/δ)b. Let t ∈ [0, b]. We choose an integer 0 ≤ n ≤ (1/δ)b with |t − δn| < δ. Then for a ∈ S large, we have
t xa (t) − x(t) ≤ xa (δn) − x(δn) + (xa − x )dτ < 3ε, δn c
⇒ xa −→ x and from Lemma 7.5.9, we have x ∈ D(ξ). From Lemmata 7.5.9 and 7.5.10, we infer the following. PROPOSITION 7.5.11 On the sets D(ξ), the c-topology and the weak-α-topology coincide and of course are metrizable. Moreover, D(ξ) is closed. PROPOSITION 7.5.12 Let Dr (ξ) = D(ξ) ∩ x ∈ ACloc (R+ , RN ) : x(0) ≤ r . N Then Dr (ξ) is c-compact in ACloc (R+ , R ). PROOF: From Proposition 7.5.11 we know that Dr (ξ) is c-closed. Also for every b > 0 and every t ∈ [0, b] Dr (ξ)(t) = {x(t) : x ∈ Dr (ξ)} is bounded and closed, hence compact. Moreover, if t, s ∈ [0, b], 0 ≤ s < t ≤ b, and x ∈ Dr (ξ), we have
t
t x (τ )dτ ≤ ξ(τ )dτ. x(t) − x(s) ≤ s
s
1 t Since ξ ∈ Lloc (R+ ), given ε > 0, we can find δ > 0 such that, if t − s < δ, then ξ(τ )dτ < ε. Hence, we have s
x(t) − x(s) < ε
for all x ∈ Dr (ξ) provided t − s < δ,
⇒ Dr (ξ) is equicontinuous.
7.5 Continuous-Time Growth Models Invoking the Arzela–Ascoli theorem, we conclude that Dr (ξ) is c-compact.
595
In the past literature on the subject, a common mistake was to claim that weak convergence in ACloc (R+ , RN ) implies pointwise convergence of at least a subsequence of the derivatives. Of course this is not true in general. EXAMPLE 7.5.13 For 1 ≤ p < ∞, consider the sequence {xn }n≥1 ⊆ Lp [0, 2π] w defined by xn (t) = cos(nt). The Riemann–Lebesgue lemma implies that xn −→ 0 p in L [0, 2π]. However, because xn 2 = π for all n ≥ 0, we cannot have xn −→ 0 in the norm of Lp [0, 2π], or pointwise, or in measure. In order to get pointwise convergence (of at least a subsequence) out of a weakly convergent sequence, we need additional conditions (see Proposition 6.6.45 and Proposition 6.6.46). These conditions are suggested by the following observation, which is an immediate consequence of Proposition 6.6.33. w
PROPOSITION 7.5.14 If {xn }n≥1 ⊆ L1loc (R+ ) and xn −→ x in L1loc (R+ ), then x ∈ L1loc (R+ ) and lim inf xn (t) ≤ x(t) ≤ lim sup xn (t) a.e. on R+ . n→∞
n→∞
Now we are ready to prove the compactness of the constraint set F (x0 ) in the c-topology (equivalently in the weak-α-topology). THEOREM 7.5.15 If hypotheses H1 hold, then F (x0 ) is c-compact. PROOF: Note that F (x0 ) ⊆ Dx0 (ϕ) and from Proposition 7.5.12, we know that Dx0 (ϕ) is c-compact. So it suffices to show that F (x0 ) is c-closed. To this c end let {xn }n≥1 ⊆ F (x0 ) and assume that xn −→ x. Then xn (0) −→ x(0) in N R , 0 ≤ xn (0) ≤ x0 for all n ≥ 1, hence 0 ≤ x(0) ≤ x0 . Also because of Proposition w 7.5.11 we have that xn −→ x in L1loc (R+ , RN ). For every n ≥ 1, we have
xn (t) ∈ G t, xn (t) a.e. on T. We have xn (t) −→ x(t) for every t ≥ 0. Moreover, by virtue of hypotheses H1 (ii), we have
for all t ≥ 0. (7.135) lim sup G t, xn (t) ⊆ G t, x(t) n→∞
Invoking Proposition 6.6.33 and using (7.135) and the fact that G is convexvalued, we have
x (t) ∈ G t, x(t) a.e. on T, ⇒ x ∈ F (x0 ). Therefore F (x0 ) is c-closed, hence it is c-compact.
To establish the desired upper semicontinuity of J(·) in the c-topology, we will need the following general upper semicontinuity result, whose proof can be found in Hu–Papageorgiou [316, p. 31] (see also Theorem 2.1.28).
596
7 Economic Equilibrium and Optimal Economic Planning
THEOREM 7.5.16 If (Ω, Σ, µ) is a finite measure space, Y is a separable Banach space, V is a separable reflexive Banach space, and u : Ω × Y × V −→ R = R ∪ {−∞} is a measurable integrand such that (i) For µ-a.e. ω ∈ Ω, (y, v) −→ u(ω, y, v) is upper semicontinuous; (ii) For µ-a.e. ω ∈ Ω and every y ∈ Y, v −→ u(ω, y, v) is concave; (iii) There exist α ∈ L1 (Ω)+ and c > 0 such that u(ω, y, v) ≤ a(ω) + c(yY + vV ) for almost all ω ∈ Ω and all (y, v) ∈ Y × V ,
then (y, v) −→ Iu (y, v) = Ω u ω, y(ω), v(ω) dµ is sequentially upper semicontinuous from L1 (Ω, Y ) × L1 (Ω, V )w into R = R ∪ {−∞} (by L1 (Ω, V )w we denote the Lebesgue–Bochner space equipped with the weak topology). Using this theorem, we can prove the upper semicontinuity of the objective functional. THEOREM 7.5.17 If hypotheses H1 and H2 hold, then the functional J(·) is cupper semicontinuous on F (x0 ).
c PROOF: Note that R+ , B(R+ ), µ is a finite measure space. Suppose xn −→ x w with xn ∈ F (x0 ). Then from Lemma 7.5.9, we have that xn −→ x in ACloc (R+ , RN ). Applying Theorem 7.5.16, we obtain lim sup J(xn ) ≤ J(x) n→∞
⇒ J(·) is c-upper semicontinuous. From Theorems 7.5.15 and 7.5.17, using the theorem of Weierstrass, we conclude that THEOREM 7.5.18 If hypotheses H1 and H2 hold, then there exists x∗ ∈ F (x0 ) such that J(x∗ ) = V (x0 ) (see (7.133)). The open-endedness of the planning horizon, although conceptually and mathematically elegant, is impractical from a computational point of view. To actually solve such a problem, we must usually content ourselves with finite horizon approximates. So we need to know that the values of these finite horizon approximations converge to the value V (x0 ) of the original problem (see (7.133)). Hence we consider the following finite horizon growth problem: b
Vb (x0 ) = sup e−δt u t, x(t), x (t) dt : x ∈ Fb (x0 ) . (7.136) 0
Here the feasibility set Fb (x0 ), is defined by
Fb (x0 ) = x ∈ AC([0, b], RN + ) : x (t) ∈ G t, x(t) 0 ≤ x(0) ≤ x0 .
a.e. on [0, b],
7.5 Continuous-Time Growth Models
597
We want to determine conditions which guarantee that Vb (x0 ) −→ V (x0 )
as b −→ +∞.
For this reason, we strengthen hypotheses H2 as follows. H3 : u : R+ ×RN ×RN −→ R = R ∪ {−∞} is a function that satisfies hypotheses H2 and also (v) There exists ψ ∈ L1 (R+ , µ) such that for almost all t ≥ 0, all x ≥ 0, and all y ∈ RN , we have ψ(t) ≤ u(t, x, y). THEOREM 7.5.19 If hypotheses H1 and H3 hold, then b −→ Vb (x0 ) is continuous on R+ . PROOF: Note that by Theorem 7.5.18, we have V (x0 ) ∈ R. We can find x ∈ F (x0 ) such that
∞
−∞ < V (x0 ) − ε ≤ e−δt u t, x(t), x (t) dt ≤ V (x0 ) < +∞. (7.137) 0
Then because of hypothesis H3 (v) we have
b
e−δt u t, x(t), x (t) dt = lim b→+∞
0
∞
e−δt u t, x(t), x (t) dt.
(7.138)
0
Let b > 0 and set Tb = [0, b]. We have xT ∈ Fb (x0 ) and so b
b
e−δt u t, x(t), x (t) dt ≤ V (x0 ),
0
e−δt u t, x(t), x (t) dt ≤ lim inf Vb (x0 ), b→+∞ 0
∞ −δt
e u t, x(t), x (t) dt ≤ lim inf Vb (x0 ) ⇒ V (x0 ) − ε ≤ ⇒
b
lim
b→+∞
b→+∞
0
(see (7.137)). Because ε > 0 was arbitrary, we let ε ↓ 0 and obtain V (x0 ) ≤ lim inf Vb (x0 ). b→+∞
(7.139)
Next let ξ = lim supb→+∞ Vb (x0 ). We can find bn ↑ +∞ such that ξ−
1 ≤ Vbn (x0 ) n
for all n ≥ 1.
Similarly as for the infinite horizon problem, the finite horizon problem has a solution (see Theorem 7.5.18). So we can find xn ∈ Fbn (x0 ) such that
b
1 e−δt u t, xn (t), xn (t) dt. (7.140) ξ− ≤ n 0 Extend xn on all of R+ by setting xn (t) = 0 for all t > bn . Denote the extended function by xn . Then xn ∈ ACloc (R+ , RN ) and in fact xn ∈ Dx0 (ϕ). From Proposition 7.5.12, we know that Dx0 (ϕ) is c-compact. So we may assume that
598
7 Economic Equilibrium and Optimal Economic Planning c
xn −→ y
with y ∈ Dx0 (ϕ).
From this it follows that xn (t) −→ y(t) and
for all t ≥ 0, uniformly on compact sets in R+
xn −→ y in L1loc (R+ , RN ) (see Lemma 7.5.9). w
Fix k ≥ 1. Then for every n ≥ k, we have
xn (t) ∈ G t, xn (t) a.e. on Tbk . By virtue of Proposition 6.6.33, we obtain
y (t) ∈ G t, y(t)
a.e. on R+ .
(7.141)
Moreover, we have 0 ≤ y(0) ≤ x0 and so we conclude that y ∈ F (x0 ). In addition, using Theorem 7.5.17, we have
∞
∞
e−δt u t, xn (t), xn (t) dt ≤ e−δt u t, y(t), y (t) dt. (7.142) lim sup n→∞
0
0
Using hypothesis H3 (v), we have
bn 0
≤
∞
e−δt u t, xn (t), xn (t) dt +
e−δt u t, xn (t), xn (t) dt.
∞
e−δt ψ(t)dt
bn
(7.143)
0
Note that because ψ ∈ L1 (R+ , µ), we have
∞ ψ(t) −→ 0
as n → ∞.
(7.144)
bn
Passing to the limit as n → ∞ in (7.143) and using (7.144), we obtain
bn
lim sup n→∞
e−δt u t, xn (t), xn (t) dt
0 ∞
e−δt u t, xn (t), xn (t) dt ≤ lim sup n→∞ 0
∞
e−δt u t, y(t), y (t) dt (see (7.142)). ≤
(7.145)
0
From (7.140) and (7.145) it follows that
∞
ξ≤ e−δt u t, y(t), y (t) dt ≤ V (x0 ) < +∞, 0
⇒ lim sup Vbn (x0 ) ≤ V (x0 ). n→∞
(7.146)
From (7.139) and (7.146), we conclude that Vbn (x0 ) −→ V (x0 ).
7.6 Expected Utility Hypothesis
599
∗ ∗ Let S and S(x0 ) = x∗ ∈ F (x0 ) : V (x0 ) = b (x0 ) = x ∈ F (x0 ) : Vb (x0 ) = Jb (x ) J(x∗ ) . Here
b
Jb (x) =
e−δt u t, x(t), x (t) dt
for all x ∈ ACloc (R+ , RN ).
0
From the above proof, we deduce the following. PROPOSITION 7.5.20 If hypotheses H1 and H3 hold and bn ↑ +∞, then lim sup Sbn (x0 ) ⊆ Sb (x0 ) in the c-topology. n→+∞
7.6 Expected Utility Hypothesis In this section we deal with an issue that brings us closer to the subject of the next chapter, which is the theory of games. The Expected Utility Hypothesis (EUH for short), is the dominant theory of decision making under risk, employed in economics and finance. In simple terms it says that decision makers choose (or ought to choose) among lotteries, by computing the expected value of the utility of the lottery prizes and finally selecting the lottery with the largest expected utility. In this section we want to characterize the choice behavior, which is consistent with the EUH, in risky situations where there are objective probabilities and a ranking of prizes. This setting covers many situations of interest in economic theory. The items chosen are not commodities, but lotteries. By a lottery we understand a probability distribution on a set of prizes. There is also a set of observations. To each observation corresponds a budget, which is a set of lotteries available to the decision maker and also a choice from each budget, which is a nonempty subset of the budget. If the choices are singletons, then we have a choice function. More generally, for at least some observations the choices may be more than one. This case we refer to as the choice multifunction. The choice function or multifunction determines the choice behavior of the decision maker. There is also a utility function. Since the prizes are monetary (remember we have assumed that there is an a priori ranking of prizes), it is reasonable to assume that the utility function is strictly increasing. Then a choice behavior is EU-rational if it maximizes the expected utility. Let us make precise the mathematical setting for the items described above. There is a set observation T which is a compact Hausdorff topological space and a space of prizes X which is a compact subset of R. A lottery is a probability measure µ on X (i.e., µ ∈ M1+ (X)). We topologize M1+ (X) as follows. DEFINITION 7.6.1 The weak topology on M1+ (X) is the weakest topology on M1+ (X) that makes continuous all functions ξh : M1+ (X) −→ R, h ∈ C(X) defined by
ξh (µ) =
h(x)dµ(x). X
We denote the weak topology on M1+ (X) by w M1+ (X), C(X) or simply by w. REMARK 7.6.2 A subbasic element for the weak topology on M1+ (X) is given by
600
7 Economic Equilibrium and Optimal Economic Planning
hdµ − hdµ < ε , Uh,ε (µ) = µ ∈ M1+ (X) : X
X w
where h ∈ C(X), ε > 0. Evidently for a net {µa }a∈J in M1+ (X), we have µa −→ µ if and only if for every h ∈ C(X), we have
hdµa −→ hdµ. X
X
Evidently the weak topology on M1+ (X) is the relative w∗ -topology on M1+ (X), when the latter is viewed as a subspace of the dual Banach space M (X) = C(X)∗ . Hence the weak topology on M1+ (X) is metrizable. + A budget is a set of lotteries. There is a budget multifunction B : T −→ 2M1 (X) \ {∅}. This multifunction provides to the decision maker a set of lotteries for each observation t ∈ T . A choice function is a measurable selector t −→ µt of the multifunction t −→ Bt . For each observation t ∈ T, µt represents the choice of the decision maker. DEFINITION 7.6.3 A choice function µ : T −→ M1+ (X) is said to be EU-rational if there is a strictly increasing utility function u : X −→ R (hence automatically Borel-measurable), such that for every observation t ∈ T , the choice µt maximizes the expected utility over the budget set
u(x)dν(x) ≤ u(x)dµt for every ν ∈ Bt and every t ∈ T. X
X
We say that u rationalizes µ. Given a choice function µ, a compound lottery can be found by first choosing t ∈ T at random according to a prior probability λ on T and then choosing a prize at random in Bt according to µt . This composition process generates a new lottery ν on X as follows. For every Borel set A ⊆ X, we have
µt (A)dλ(t); ν(A) = T
that is, ν = T µt dλ(t), where the vector-valued integral is understood as a Gelfand integral (see Denkowski–Mig` orski–Papageorgiou [194, p. 615]). Recall that M (X) is the Banach space of all finite Borel measures on X. We know that M (X) = C(X)∗ . DEFINITION 7.6.4 (a) If µ, ν ∈ M (X), then we say that ν ≺ µ if and only if
udν ≤ udµ for all u ∈ C(X) nondecreasing. X
X
(b) A choice function µ : T −→ M1+ (X) is ex-ante dominated if there exist another choice function ν : T −→ M1+ (X) and a probability measure λ on T such that
µt dλ(t) ≺ νt dλ(t) (see (a)). T
T
In this case we say that µ is dominated by ν with respect to the prior λ.
7.6 Expected Utility Hypothesis
601
REMARK 7.6.5 In what follows K ={µ∈M (X):0 ≺ µ}, the positive cone of ≺. We characterize EU-rational choice functions. This requires a closer look at the order described by ≺. We start with a result of Nachbin [447, 448]. In what follows M+ (X) denotes the nonnegative elements of M (X) (i.e., µ(A) ≥ 0 for every Borel set A ⊆ X), G = {(x, y) ∈ X × X : y ≤ x}, and D is the diagonal of the Cartesian product X × X. THEOREM 7.6.6 µ ∈ K (i.e., 0 ≺ µ) if and only if there exists ξ ∈ M+ (G) such that
gdµ = g(x) − g(y) dξ for all g ∈ C(X). X
G
This theorem has some remarkable consequences, which are useful in the sequel. COROLLARY 7.6.7 If u∈C(X), it is strictly increasing, µ∈K, and then µ = 0.
X
udµ = 0,
PROOF: According to Theorem 7.6.6, we can find ξ ∈ M+ (G) such that
udµ = u(x) − u(y) dξ. X
G\D
Because u is strictly increasing we have u(x) − u(y) > 0 We have
for all (x, y) ∈ G \ D.
u(x) − u(y) dξ,
udµ =
0= X
G\D
⇒ ξ(G \ D) = 0
(see (7.147)).
Hence for any g ∈ C(X), we have
gdµ = X
(7.147)
g(x) − g(y) dξ = 0
G\D
⇒ µ = 0. From the above corollary, we get the following result at once. COROLLARY 7.6.8 The following statements are equivalent. udν < X udµ. (a) For all u ∈ C(X) strictly increasing, we have X (b) For all u ∈ C(X) nondecreasing, we have X udν ≤ X udµ and µ = ν. COROLLARY 7.6.9 If g ∈ C(X) and for every µ ∈ K \ {0} we have then g is strictly increasing.
X
gdµ > 0,
602
7 Economic Equilibrium and Optimal Economic Planning
PROOF: Suppose y < x and let ξ = δ(x,y) ∈ M1+ (G) (by δ(x,y) we denote the Dirac measure concentrated at (x, y)). We choose µ ∈ K \ {0} corresponding to ξ. We have
0< gdµ = g(x) − g(y) dξ = g(x) − g(y), X
G\D
⇒ g is strictly increasing. Finally from Corollaries 7.6.7 and 7.6.8, we obtain the following. COROLLARY 7.6.10 If g0 ∈ C(X) is strictly increasing, then the set C = {µ ∈ K : X g0 dµ = 1} is a w∗ -closed and convex set in M (X) = C(X)∗ and K = {ϑµ : ϑ ≥ 0, µ ∈ C}. Now we can give some alternative descriptions of ≺ restricted on M1+ (X). PROPOSITION 7.6.11 If µ, ν ∈ M1+ (X), then the following statements are equivalent. (a) ν ≺ µ. (b) For every u : X −→ R nondecreasing, we have
X
(c) For every u ∈ C(X) strictly increasing, we have
udν ≤
X
udν <
X
(d) For every u : X −→ R strictly increasing, we have
X
udµ.
udµ or ν = µ.
X
udν <
X
udµ or ν = µ.
(e) For every x ∈ X, ν {y : x ≤ y} ≤ µ {y : x ≤ y} . PROOF: (b)⇒(a): Obvious (see Definition 7.6.4(a)). (a)⇔(c): Follows from Corollary 7.6.8. (d)⇒(c): Obvious. (e)⇒(b): Let S(X) be the family of all increasing sets A ⊆ X; that is, χA (the indicator function of A) is increasing. Let A ∈ S(X). We need to show that ν(A) ≤ µ(A). For this purpose, given ε > 0 we can find Cε ⊆ A closed, such that µ(A) − µ(Cε ) < ε and ν(A) − ν(Cε ) < ε (regularity of the measures µ, ν). Let Eε = {y ∈ X : x ≤ y for some x ∈ Cε }. Evidently Eε is closed and Eε ⊆ A. Because of (e), we have ν(Eε ) ≤ µ(Eε ) and because Cε ⊆ Eε ⊆ A, we have ν(A) ≤ ν(Eε ) + ε ≤ µ(Eε ) + ε. But ε > 0 was arbitrary. So we let ε ↓ 0 and obtain ν(A) ≤ µ(A). (b)⇒(e): This follows at once by taking u to be the indicator function of the set {y : x ≤ y}. (c)⇒(e): This implication holds because the indicator of {y : x ≤ y} is the poitwise limit of continuous nondecreasing functions.
7.6 Expected Utility Hypothesis
603
(b)⇒(d): Obvious. (e)⇒(d): From Kamae–Krengel–O’Brien [339, Theorem 1], we know that if (e) holds, then there are random variables Sµ and Sν defined on [0, 1] equipped with the Lebesgue measure with values in X and distributions µ and ν, respectively, such that Sν ≤ Sµ and the inequality is strict on a set of positive Lebesgue measure. Therefore if u : X −→ R is strictly increasing, then we have u(Sν ) ≤ u(Sµ ) with strict inequality on a set of positive Lebesgue measure. From this we conclude that (d) holds. REMARK 7.6.12 In the literature, the partial order induced on M1+ (X) by any of the equivalent statements of Proposition 7.6.11, is called first-order stochastic dominance. PROPOSITION 7.6.13 If V is a compact Hausdorff topological space, Y is a ∗ separable Banach space, F : V −→ 2Y \{∅} is a multifunction with w∗ -compact and convex values which is Vietoris continuous (see Definition 6.1.2(c)) from V into Y ∗ endowed with the w∗ -topology (denoted by Yw∗∗ ), and ! ! Rc = conv F (v) and G = F (v)dλ(v) λ∈M1+ (V )
v∈V
V
then Rc = G and G is w∗-compact and convex.
PROOF: From Proposition 6.1.13, we know that F(V) ⊆ Y∗ is w∗-compact. Recall that Y∗_{w∗} has the convex compact property (i.e., compact sets in Y∗_{w∗} have a compact closed convex hull). Therefore we deduce that Rc is w∗-compact, convex.
First we show that
G ⊆ Rc.   (7.148)
We argue indirectly. So suppose that we can find y∗ ∈ G such that y∗ ∉ Rc. Then by the strong separation theorem, we can find y ∈ Y such that
σ(y, Rc) < ⟨y∗, y⟩   (7.149)
(by ⟨·, ·⟩ we denote the duality brackets for the pair (Y∗, Y)). By definition
y∗ = ∫_V u∗(v) dλ(v),
where λ ∈ M1+(V) and u∗ : V −→ Y∗ is w∗-measurable with u∗(v) ∈ F(v) for all v ∈ V. Then
⟨y∗, y⟩ = ∫_V ⟨u∗(v), y⟩ dλ(v) ≤ ∫_V σ(y, F(v)) dλ(v) ≤ σ(y, Rc).   (7.150)
Comparing (7.149) and (7.150), we reach a contradiction. Therefore (7.148) holds.
Next we show that
Rc ⊆ G.   (7.151)
If y∗ ∈ G, then by definition we can find λ ∈ M1+(V) such that
⟨y∗, y⟩ ≤ ∫_V σ(y, F(v)) dλ(v)   for all y ∈ Y,   (7.152)
and conversely, (7.152) for some λ ∈ M1+(V) implies y∗ ∈ G. So suppose that y∗ ∈ Rc. Note that because Y is separable, the relative w∗-topology on Rc is compact and metrizable. So we can find {y∗_n}_{n≥1} ⊆ conv F(V) such that y∗_n −→ y∗ in Y∗ for the w∗-topology. There exist {λn}_{n≥1} ⊆ M1+(V), each with finite support, such that
y∗_n = ∫_V f(v) dλn(v),
where f : V −→ Y∗ is a w∗-measurable selector of F. We have
⟨y∗_n, y⟩ ≤ ∫_V σ(y, F(v)) dλn(v).   (7.153)
We may assume that λn −→ λ in M(V) = C(V)∗ for the w∗-topology and λ ∈ M1+(V). Because v −→ σ(y, F(v)) is continuous (recall F is Vietoris continuous), we have
∫_V σ(y, F(v)) dλn(v) −→ ∫_V σ(y, F(v)) dλ(v).   (7.154)
Therefore, if we pass to the limit as n → ∞ in (7.153) and we use (7.154), we obtain
⟨y∗, y⟩ ≤ ∫_V σ(y, F(v)) dλ(v)   for all y ∈ Y,
⇒ y∗ ∈ G   (see (7.152)),
⇒ Rc ⊆ G.   (7.155)
Combining (7.148) and (7.155), we conclude that Rc = G.
Now we can have the main theorem on the rationalization of the choice function.
THEOREM 7.6.14 If
(i) the set of prizes X is a compact subset of R;
(ii) the set of observations T is a compact Hausdorff topological space;
(iii) on M1+(X) we consider the weak topology (see Definition 7.6.1), and the budget multifunction B : T −→ 2^{M1+(X)}\{∅} has compact convex values and is Vietoris continuous;
and µ : T −→ M1+(X) is the choice function, then
(a) if µ is ex-ante dominated, then it is not EU-rational;
(b) if µ is continuous, then one of the following alternatives holds:
(b1) µ is rationalized by a continuous strictly increasing utility function;
(b2) µ is ex-ante dominated.
PROOF: (a) Clear from Definition 7.6.4.
(b) Suppose µ is continuous and it is ex-ante dominated. Then we can find a measurable selector ν of the budget multifunction such that for some λ ∈ M1+(T), we have
∫_T µt dλ(t) ≺ ∫_T νt dλ(t)   (see Definition 7.6.4(b)),
⇒ ∫_T (νt − µt) dλ(t) ∈ K \ {0}.   (7.156)
Set
G = ⋃ { ∫_T (Bt − µt) dλ(t) : λ ∈ M1+(T) }.   (7.157)
If µ is not ex-ante dominated, then G ∩ (K \ {0}) = ∅ (see (7.156)). Let C ⊆ K be as in Corollary 7.6.10 (i.e., a w∗-closed convex base of the cone K). Then G ∩ C = ∅. Recall that the weak topology on M1+(X) is the relative w∗-topology that M1+(X) inherits from the dual Banach space M(X) = C(X)∗. So t −→ Bt − µt is Vietoris continuous into M(X) with the w∗-topology (denoted by M(X)_{w∗}) and has nonempty, compact, and convex values in M(X)_{w∗}. Because G ∩ C = ∅, by the strong separation theorem we can find u ∈ C(X) = (M(X)_{w∗})∗ such that
⟨ν, u⟩ ≤ 0   for all ν ∈ G   (7.158)
and
⟨ξ, u⟩ > 0   for all ξ ∈ C.   (7.159)
Here by ⟨·, ·⟩ we denote the duality brackets for the pair (C(X), M(X)). From (7.159), we have
⟨ξ, u⟩ = ∫_X u dξ > 0   for all ξ ∈ C.
This combined with Corollaries 7.6.10 and 7.6.9 implies that u ∈ C(X) is strictly increasing. We claim that it rationalizes µ. To this end let t ∈ T and let ν ∈ Bt. Consider λ ∈ M1+(T) defined by λ = δ_t (the Dirac measure concentrated on t). Note that ν − µt ∈ Bt − µt and so
∫_T (ν − µt) dλ(t) = ν − µt ∈ G.
Then because of (7.158), we have ⟨u, ν − µt⟩ ≤ 0,
⇒ ∫_X u dν ≤ ∫_X u dµt,
⇒ u rationalizes µ   (see Definition 7.6.3).
EXAMPLE 7.6.15 The compactness of the space X of prizes cannot be dropped. Indeed, consider the situation where T = {t}, X = {0, 1, 2, . . .} and Bt = {µk}_{k≥0}, with µ0 = δ_1 and, for n ≥ 1, µn = (1/n)δ_0 + (1 − 1/n)δ_2. The choice µ0 is not ≺-dominated, but one can check that it is not EU-rational: for any strictly increasing u, ∫_X u dµn = (1/n)u(0) + (1 − 1/n)u(2) > u(1) = ∫_X u dµ0 for all n large enough, so µ0 never maximizes expected utility over Bt.
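To make Example 7.6.15 concrete, here is a small numerical sketch (not part of the original text; the truncation range and the sample utility u(x) = x are assumptions chosen purely for illustration). It checks that µ0 = δ_1 is not dominated by any µn in the sense of Proposition 7.6.11(e), while every strictly increasing utility eventually prefers µn to µ0, so µ0 cannot be EU-rational.

# Illustrative check for Example 7.6.15 on a finite truncation of X = {0, 1, 2, ...}.
# mu0 = delta_1 and mu_n = (1/n) delta_0 + (1 - 1/n) delta_2 for n >= 1.

def upper_mass(mu, x):
    """Mass that mu assigns to the upper set {y : y >= x}."""
    return sum(p for point, p in mu.items() if point >= x)

mu0 = {1: 1.0}

for n in range(2, 6):
    mun = {0: 1.0 / n, 2: 1.0 - 1.0 / n}
    # Condition (e) of Proposition 7.6.11: mu0 is dominated by mu_n only if every
    # upper-set mass of mu0 is <= the corresponding mass of mu_n.
    dominated = all(upper_mass(mu0, x) <= upper_mass(mun, x) for x in (0, 1, 2))
    print(n, "mu0 dominated by mu_n?", dominated)   # always False (fails at x = 1)

# Yet for a strictly increasing utility (u(x) = x as a sample choice), the expected
# utility of mu_n exceeds that of mu0 once n is large, so mu0 is never a maximizer.
u = lambda x: x
eu = lambda mu: sum(p * u(point) for point, p in mu.items())
print([round(eu({0: 1.0 / n, 2: 1.0 - 1.0 / n}), 3) for n in range(2, 6)], ">", eu(mu0))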
7.7 Remarks
7.1: The first to formulate and study a model of pure exchange was Edgeworth [219]. He introduced a model with two economic agents and two commodities and, using the recontracting adjustment process (i.e., no allocation occurs if it can be blocked – recontracted out – by a group of agents), he defined the well-known (Edgeworth) contract curve. This curve represents the locus of points where the marginal rate of substitution between the pair of goods is the same for both agents. This curve can be illustrated using a diagrammatic construction known as the Edgeworth box. The core, introduced in Definition 7.1.3, is a generalization of the contract curve. Note that the notion of core does not involve prices and it is the result of pure trading among the agents. In contrast, the other equilibrium notion, the Walras (or competitive) equilibrium, introduced in Definition 7.1.5(b), assumes that certain prices for the commodities prevail in the market and are accepted by all agents involved. But, intuitively speaking, we know that in reality prices are a convenience and simply provide a common yardstick to measure all traded quantities. Therefore it is reasonable to claim that the two equilibrium notions should lead to the same allocations. A moment’s reflection can lead to the conclusion that this should be the case in a perfect competition environment (see Theorem 7.1.7). This means that no individual agent or group of agents has enough power alone to influence the outcome of the economic process. The idea to model perfect competition using a continuum of agents (more precisely a nonatomic measure) is due to Aumann [43, 45]. In Aumann [43] we find the core–Walras equivalence theorem (see Theorem 7.1.7), and in Aumann [45] the author establishes the existence of a Walras (competitive) equilibrium (see Theorem 7.1.10). The results of Aumann were extended to economies with production by Hildenbrand [295]. There are some other equilibrium notions such as Pareto equilibrium, quasi-competitive equilibrium, and expenditure minimizing equilibrium. A comparative study of all these different equilibrium notions can be found in Hildenbrand [296]. There is another approach in core theory in which the primitive notion in the economic model is the coalition and not the individual agent. This changes the necessary mathematical tools, and so multifunctions are replaced by set-valued measures (multimeasures). This approach was initiated by Vind [595]. There is still a third, more complicated pure exchange model, with “big” and “small” traders, due to Dreze–Gabsewicz–Schmeidler–Vind [207]. In the last twenty years, a lot of effort was put into extending equilibrium theory to economies with an infinite-dimensional commodity space. In this direction we mention the works of Bewley [74], Aliprantis–Brown [8], Florenzano [250], Mas-Colell [411], Jones [333], Zame [618], and the book of Aliprantis–Brown–Burkinshaw [9], where a more detailed bibliography can be found. The books of Debreu [185], Hildenbrand [297], Hildenbrand–Kirman [298], and Takayama [572] contain additional material on economic equilibrium theory. In particular, Hildenbrand [297] deals with economies described by a continuum of agents. To do historical justice, we should say that the founding fathers of mathematical economics are Cournot, Walras, Pareto, Edgeworth, and Menger. Antoine Cournot (1801–1877) was French and is considered to have made the initial step in developing econometrics.
Leon Walras (1834–1910) was French and the first holder, in 1870, of the chair of political economy at the University of Lausanne in Switzerland.
His two main contributions in the theory of mathematical economics are the development of the marginal utility approach to the theory of value (1873) and the development of the theory of general equilibrium (1874–1877). Vilfredo Pareto (1848–1923) was Italian and the successor of Walras to the chair of political economy at the University of Lausanne (1892). He is better known for the Pareto optimum, which is described as a position from which it is impossible to improve anyone’s welfare by altering production or exchange without impairing someone else’s welfare. He is also the founder of positive economics, an economic science purged of all ethical elements. Pareto rejected socialism, justified the inequality of income, and viewed with sympathy the coming to power in Italy of the fascists (October 1922). Francis Ysidro Edgeworth (1845–1926) was British and professor of political economy at Oxford University from 1891 to 1922. He is credited with inventing indifference curves and the contract curve. Finally Carl Menger (1840–1921) was Austrian and one of the formulators (together with Walras) of the marginal utility theory of value. He is also the founder of the Austrian school of economic thought.
7.2: The question of price characterization of efficient capital accumulation and allocation of resources in infinite horizon economies (“efficient pricing”) started with the work of Malinvaud [403, 404], and was continued by Gale [254] in the context of optimal growth for multisector growth models (“competitive pricing”). Gale [254] introduced the notion of strong maximality (see Definition 7.2.24(b)) and assumed that the utility function is strictly concave. Brock [107] objected to this hypothesis, primarily because it excludes the von Neumann model. Brock [107] replaced the strict concavity of the utility function by the assumption that the optimal stationary program is unique and used the weak maximality criterion (see Definition 7.2.26(b)). Other contributions to the subject were made by Peleg [492], Peleg–Yaari [493], McKenzie [420, 423], Mitra [431, 432, 434], Mitra–Zilcha [433], Dechert–Nishimura [183], Khan–Mitra [350], and Joshi [334]. The books of Makarov–Rubinov [402] and Takayama [572] deal with discrete time growth models and contain additional references on the subject. Starting with the work by Hurwicz [319] on informational decentralization, people raised the question of characterizing the optimality of competitive programs in terms of conditions that can be verified by agents in an informationally decentralized mechanism, which involves two basic restrictions: first, that there is initial dispersion of information and each economic unit has only partial knowledge of the environment; second, that there is limited communication and so it is impossible to completely centralize dispersed information about the environment. Significant contributions in this direction were made by Brock–Majumdar [111], Hurwicz–Majumdar [320], Hurwicz–Weinberger [321], and Dasgupta–Mitra [177, 178], who introduced the reachability condition (R) (see Definition 7.2.17) and gave particular economic models which satisfy this condition.
7.3: Turnpike theory originates with a simple remark in the book of Dorfman–Samuelson–Solow [203]. Radner [509] proved a weak turnpike theorem for the von Neumann–Gale model, and McKenzie [418] did this for a Leontief-type model. A strong turnpike theorem was first proved by Nikaido [461]; see also Tsukui [589] and Winter [607].
Additional results can be found in McKenzie [419, 420, 421, 422, 423], Araujo–Scheinkman [28], Benveniste–Scheinkman [65], Fershtman–Mullar [242], and Read [512], and in the books of Arkin–Evstingeev [29] and Takayama [572].
7.4: Optimal growth under uncertainty was first investigated in the early seventies. We mention the works of Brock–Mirman [108, 109], Dana [173], Dynkin [215], Evstingeev [230], Jeanjean [331], Radner [510], and Tacsar [570]. Brock–Mirman modelled the uncertainty in their model with a sequence of independent, identically distributed (iid) random variables. Dynkin–Evstingeev, Jeanjean, Radner, and Tacsar considered a stationary Markov process modelling the uncertainty in the economy, or the more general situation of a probability distribution on the set of sequences of states. This approach incorporates the other two via Kolmogorov’s theorem (for the iid case) and the Ionescu–Tulcea disintegration theorem (for the Markovian case). Soon thereafter Zilcha [628, 629] considered the discounted model (see Zilcha [628]) and the undiscounted model using the strong maximality criterion (see Zilcha [629]) and established the existence of supporting prices for optimal programs. More recent works on the subject are those by Evstingeev–Katyshev [231], Majumdar–Radner [400], Majumdar–Zilcha [401], Nyarko [466], Zilcha [630], and Pantelides–Papageorgiou [469]. In particular, Nyarko [466] extends to a stochastic economic growth model the work of Dasgupta–Mitra [177] on temporal and informational decentralization. Various aspects of the theory of stochastic economic dynamics can be found in the book of Arkin–Evstingeev [29].
7.5: Continuous-time economic growth problems present more technical difficulties than the discrete ones. So the choice of the topology on the set of feasible programs is crucial. The investigation of the c-topology and the α-topology was conducted by Becker–Boyd–Sung [60], who in this way were able to rectify some erroneous claims existing in the literature. Namely, Brock–Haurie [110] and Yano [612] claimed that stock convergence implies, at least for a subsequence, flow convergence almost everywhere, and Balder [55] claimed that the c-topology and the weak α-topology coincide. We should mention that the problem studied by Becker–Boyd–Sung [60] involves a recursive utility. Undiscounted continuous-time growth models can be found in the works of Brock–Haurie [110], Papageorgiou [479], and in the book of Carlson–Haurie–Leizarowitz [130].
7.6: The model used in this section is due to Border [88]; see also Border [87]. Earlier works on the subject by Fishburn [248] and Ledyard [372] assumed that the set of observations and the budget sets are finite. Definition 7.6.4(a) is due to Nachbin [447]; see also Nachbin [448]. Proposition 7.6.11 is essentially due to Kamae–Krengel–O’Brien [339].
8 Game Theory
Summary. Game-theoretic models provide a substantial amount of generalization of the basic notions of mathematical economics, such as core, equilibria, saddle points, and intertemporal optimum. In this chapter, we deal with different models in game theory. We start with noncooperative games, which lead to the notion of “Nash equilibria”. Then we pass to cooperative games, for which we can define the notion of core. After that we consider random games with a continuum of players. For such games, we show the existence of Cournot–Nash equilibria. We also consider Bayesian games and stochastic 2-player, zero-sum games. Finally, using the notion of ε-subdifferential of convex functions, we prove the existence of approximate Nash equilibria for noncooperative games.
Introduction
Any discussion of economic models is incomplete if it is not accompanied by a presentation of game theory. This is because the models in game theory provide a substantial amount of generalization of the basic notions of mathematical economics such as core, equilibria, saddle point, and intertemporal optimum. A game is a decision problem for a group of players who may have cooperative or noncooperative (conflicting) objectives to optimize, with different or equal information structures and finite strategies from which to choose. In this chapter we present some basic issues from the theory of games. We deal with both static and dynamic (over an infinite discrete-time horizon) games and with deterministic and stochastic games.
In Section 8.1, we examine noncooperative games with n players. The basic equilibrium concept for such games is the so-called Nash equilibrium. We establish the existence of such an equilibrium. In Section 8.2, we deal with cooperative games with n players. We consider the characteristic function form of such games and we study both side-payment games and no-side-payment games. For both, the central stability concept is the core. We present theorems that guarantee the nonemptiness of the core. In Section 8.3 we deal with random games, with a continuum of players (namely, an atomless measure space of players) and an infinite-dimensional strategy space. For
such games, using notions and results from multivalued analysis, we prove the existence of Cournot–Nash equilibria. Continuing with random games, in Section 8.4 we consider Bayesian games with an infinity of players and an infinite-dimensional strategy space. In such games the players have different information structures that are modelled as sub-σ-fields of the σ-field of all possible outcomes. Again we show that under reasonable hypotheses, Bayesian games admit Cournot–Nash equilibria. In Section 8.5 we investigate stochastic 2-player zero-sum games, using the stochastic dynamic programming model, which we also present. We show the existence of a saddle point. Finally in Section 8.6 we introduce and use the notion of ε-subdifferential of a convex function, in order to show that noncooperative n-player games with noncompact strategy sets admit approximate Nash equilibrium.
8.1 Noncooperative Games–Nash Equilibrium
In this section we study n-person games in normal form, in which the players exhibit a noncooperative (individualistic) behavior. The central concept for such games is the notion of Nash equilibrium. We start with a description of the n-person game in normal (strategic) form. So let N = {1, 2, . . . , n}, n ≥ 2, be the set of players. Each player k ∈ N has a strategy set Xk which describes all strategies available to him. We set
X = ∏_{k=1}^n Xk   (the multistrategies set).
Also a set S(N) ⊆ X is given describing the feasible multistrategies. The preference relation of player k ∈ N among the strategies available to him is described by a utility function uk : S(N) −→ R, which to each feasible multistrategy x ∈ S(N) assigns the number uk(x) measuring the utility produced by x. Of course he prefers multistrategies that yield a maximum utility. We can also think of uk as a loss function, in which case −uk can be regarded as a utility function. The interpretation of uk as a utility function, which we have adopted here, is motivated from the economic applications. From the viewpoint of player k ∈ N, the set X of multistrategies is split as follows:
X = X̂ × Xk   with   X̂ = ∏_{i≠k} Xi.
We think of k^c = N \ {k} as a coalition adverse or complementary to player k ∈ N. Also let pk and p̂ be the projections from X onto Xk and X̂, respectively. The decomposition (x̂, xk) ∈ X̂ × Xk of a multistrategy x ∈ X = X̂ × Xk is important for player k ∈ N, because this way he distinguishes the component xk which he can control from the components x̂ = {xi}_{i≠k} ∈ X̂ over which he has no control.
DEFINITION 8.1.1 A game in normal form is the collection {Xk, uk}_{k∈N}.
Suppose that no player has any information about the choices made by the other players. Then a cautious way to proceed is for each player to maximize her minimum utility. More precisely, for every player k ∈ N, let
vk(xk) = inf{uk(x) : x ∈ S(N), pk(x) = xk}.   (8.1)
The quantity vk(xk) in (8.1) represents the least utility for player k among all feasible multistrategies when she employs strategy xk. Then she will try to maximize vk(·); that is,
vk = sup{vk(xk) : xk ∈ pk(S(N))}.   (8.2)
Note that vk is the outcome of a sup-inf process.
DEFINITION 8.1.2 The vector v = {vk}_{k=1}^n is the conservative value of the game and a multistrategy x∗ = {x∗k}_{k=1}^n ∈ S(N) such that vk = vk(x∗k) is a conservative or maximin multistrategy.
We have the following existence theorem for conservative multistrategies.
THEOREM 8.1.3 If the strategy sets Xk, k ∈ N, are Hausdorff topological spaces, S(N) ⊆ X is compact (for the product topology on X), and uk : S(N) −→ R, k ∈ N, is upper semicontinuous, then the game admits a conservative multistrategy x∗ ∈ S(N).
PROOF: Note that
vk(xk) = inf{uk(x) : x ∈ S(N), pk(x) = xk} = inf{uk(x̂, xk) : (x̂, xk) ∈ S(N)}.
So from Theorem 6.1.18(a) we have that vk : pk(S(N)) −→ R is upper semicontinuous. Because pk(S(N)) is compact in Xk, we can find x∗k ∈ pk(S(N)) such that
vk = vk(x∗k),   k ∈ N.
Evidently x∗ = {x∗k}_{k∈N} is a conservative multistrategy for the game.
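For a finite two-player game with S(N) = X1 × X2, the sup–inf construction of (8.1)–(8.2) can be carried out by direct enumeration. The following sketch is illustrative only; the strategy sets and payoff tables below are invented and are not part of the original text.

# Conservative (maximin) values for a finite two-player game in normal form,
# with feasible multistrategy set S(N) = X1 x X2 and sample payoff tables.
import itertools

X1, X2 = [0, 1], [0, 1, 2]
u1 = {(p, q): p - q for p in X1 for q in X2}          # sample utility of player 1
u2 = {(p, q): q - 2 * p for p in X1 for q in X2}      # sample utility of player 2
S = list(itertools.product(X1, X2))

def conservative_value(k, Xk, u):
    # v_k(x_k) = inf { u_k(x) : x in S(N), p_k(x) = x_k }, then v_k = sup over x_k.
    vk = {xk: min(u[x] for x in S if x[k] == xk) for xk in Xk}
    best = max(vk, key=vk.get)
    return best, vk[best]

print("player 1:", conservative_value(0, X1, u1))   # maximin strategy and value v_1
print("player 2:", conservative_value(1, X2, u2))   # maximin strategy and value v_2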
REMARK 8.1.4 In general note that player k ∈ N will reject any multistrategy x yielding a utility uk (x) smaller than vk . So vk is the minimum that each player can tolerate. In the above definition of conservative strategy, every player chooses his strategy independently of the rest of the players. We want to consider a different kind of behavior, in which every player varies his strategy based on the choice of the complementary coalition. Roughly speaking, what happens is the following: As soon as player k announces her intention to play, the complementary coalition N \ {k} moves first and then player k responds. With such a rule, an equilibrium will be a
multistrategy x∗ in which no player k can increase her utility by moving away from strategy x∗k, provided that the other players do not alter their choice. So we have
uk(x̂∗, x∗k) = sup{uk(x̂∗, y) : y = pk(x), x ∈ S(N)}.   (8.3)
In other words, (8.3) says that given the multistrategy x̂∗ of the complementary coalition N \ {k}, the k-player responds by maximizing the utility
y −→ uk(x̂∗, y),   y = pk(x),  x ∈ S(N).
DEFINITION 8.1.5 Given a game {Xk, uk}_{k∈N} in normal form, a Nash equilibrium is a feasible multistrategy x∗ ∈ S(N) such that for every k ∈ N we have
uk(x̂∗, y) ≤ uk(x̂∗, x∗k)   for all y = pk(x), x ∈ S(N).
REMARK 8.1.6 So in the Nash equilibrium there is no incentive for a player k ∈ N to change her strategy x∗k by herself. This is a noncooperative equilibrium concept.
To study the notion of Nash equilibrium, we introduce the function ξ : S(N) × S(N) −→ R defined by
ξ(x, z) = Σ_{i=1}^n [ui(x) − ui(x̂, zi)].   (8.4)
PROPOSITION 8.1.7 If {Xk, uk}_{k∈N} is a game in normal form and for x ∈ S(N) we have
ξ(x, z) ≥ 0   for all z ∈ S(N),   (8.5)
then x is a Nash equilibrium for the game. The converse is true provided that S(N) is a product space.
PROOF: Suppose that x ∈ S(N) satisfies (8.5) and let z ∈ S(N) be such that p̂(z) = x̂ = {xi}_{i≠k}. Using (8.5), we have
0 ≤ ξ(x, z) = Σ_{i=1}^n [ui(x) − ui(x̂, zi)] = Σ_{i≠k} [ui(x) − ui(x̂, xi)] + uk(x) − uk(x̂, zk) = uk(x) − uk(x̂, zk).
Therefore x ∈ S(N) is a Nash equilibrium (see Definition 8.1.5).
Now suppose that S(N) = ∏_{i=1}^n Si ⊆ ∏_{i=1}^n Xi and let x∗ ∈ S(N) be a Nash equilibrium.
Then by Definition 8.1.5, we have
ui(x∗) − ui(x̂∗, zi) ≥ 0   for all zi ∈ Si, i ∈ {1, . . . , n} = N.
Adding these inequalities, we obtain ξ(x∗, z) ≥ 0 for all z ∈ S(N).
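The criterion of Proposition 8.1.7 is easy to test by enumeration when the game is finite and the feasible set is a product. The sketch below is illustrative only (the payoffs describe a simple coordination game and are assumptions, not data from the text); it evaluates ξ(x, z) from (8.4) and reports the multistrategies at which (8.5) holds.

# Checking the Nash criterion of Proposition 8.1.7 for a finite 2-player game
# with product feasible set S(N) = S1 x S2 (payoffs chosen for illustration).
import itertools

S1, S2 = [0, 1], [0, 1]
u = [  # u[i][(x1, x2)] = utility of player i at the multistrategy (x1, x2)
    {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1},
]
S = list(itertools.product(S1, S2))

def xi(x, z):
    # xi(x, z) = sum_i [u_i(x) - u_i(x with its i-th component replaced by z_i)].
    total = 0
    for i in range(2):
        y = list(x)
        y[i] = z[i]
        total += u[i][x] - u[i][tuple(y)]
    return total

for x in S:
    if all(xi(x, z) >= 0 for z in S):
        print("Nash equilibrium:", x)   # prints (0, 0) and (1, 1) for these payoffs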
To prove the existence of a Nash equilibrium, we need the following abstract theorem, known as Ky Fan’s inequality. For a proof, we refer to Aubin–Ekeland [39, p. 327].
THEOREM 8.1.8 If K ⊆ R^N is a nonempty, compact, convex set and ξ : K × K −→ R is a function satisfying
(i) for every z ∈ K, x −→ ξ(x, z) is upper semicontinuous;
(ii) for every x ∈ K, z −→ ξ(x, z) is quasiconvex (see Definition 2.3.11);
(iii) for every z ∈ K, ξ(z, z) ≥ 0,
then we can find x∗ ∈ K such that inf_{z∈K} ξ(x∗, z) ≥ 0.
Now we can state and prove the main existence theorem for Nash equilibria.
THEOREM 8.1.9 If {Xk, uk}_{k∈N} is a game in normal form, S(N) ⊆ R^N is compact convex, for every k ∈ N uk is continuous, and for every x̂ ∈ p̂(S(N)), z −→ uk(x̂, z) is quasiconcave, then there exists a Nash equilibrium.
PROOF: The function
ξ(x, z) = Σ_{i=1}^n [ui(x) − ui(x̂, zi)],   (x, z) ∈ S(N) × S(N),
is continuous in x and quasiconvex in z. Moreover,
ξ(z, z) ≥ 0   for all z ∈ S(N).
So we can apply Theorem 8.1.8 and obtain x∗ ∈ S(N) such that
ξ(x∗, z) ≥ 0   for all z ∈ S(N).
This then by virtue of Proposition 8.1.7 implies that x∗ ∈ S(N) is a Nash equilibrium for the game.
At this point it is instructive to briefly present a special case, known as a bimatrix game.
EXAMPLE 8.1.10 Bimatrix Game: This is a two-player game; that is, N = {1, 2} and the two strategy sets X1 and X2 are finite sets; for example, X1 = {1, . . . , l1} and X2 = {1, . . . , l2}. Also we have two utility functions u1 and u2. We set
apq = u1(p, q)   and   bpq = u2(p, q)   for all p ∈ X1, q ∈ X2.
So apq is the first player's payoff when the first player chooses a (pure) strategy p ∈ X1 and the second chooses a (pure) strategy q ∈ X2. Similarly for the payoff bpq of the second player. Then the game {Xk, uk}_{k=1}^2 reduces to the following bimatrix game:
[ (a11, b11)   · · ·   (a1 l2, b1 l2) ]
[ (al1 1, bl1 1)   · · ·   (al1 l2, bl1 l2) ]
Evidently a Nash equilibrium in general need not exist, because the strategy sets X1 and X2 are not in general convex. The sets X1 and X2 are said to represent the pure strategies available to player 1 and player 2, respectively. When apq + bpq = 0 for all (p, q) ∈ {1, . . . , l1} × {1, . . . , l2}, we say that we have a zero-sum game. It
represents a situation of direct conflict between the two players, because the gain of one is equal to the loss of the other. To be able to produce a Nash equilibrium, we need to consider mixed strategies. Namely, we are led to another game in normal form with strategy sets
X̄k = {x = (xi)_{i=1}^{lk} ∈ R_+^{lk} : Σ_{i=1}^{lk} xi = 1},   k = 1, 2.
A point in the simplex X̄k is a mixed strategy and each component xi represents the probability attached to the pure strategy i. When the two players choose mixed strategies (x, y) ∈ X̄1 × X̄2, then the payoffs are
Σ_{p,q} apq xp yq   for player 1   (8.6)
and
Σ_{p,q} bpq xp yq   for player 2.   (8.7)
In this case all hypotheses of Theorem 8.1.9 are satisfied and so a Nash equilibrium exists.
Now let us see a specific bimatrix game with a unique Nash equilibrium. This is the famous prisoner's dilemma problem. There are two players who are prisoners and have been charged with some joint crime. Each player has a strategy set with two elements (strategies). The first strategy is to plead innocent and the second is to plead guilty. If both prisoners plead guilty they will be punished. If both plead innocent, then there is no solid case against them and their punishment will be light. If one pleads guilty and the other pleads innocent, then the first will be rewarded for his honesty and will be released, and the second will be punished severely. So let the bimatrix representing the game be the following one:
[ (10, 10)   (0, 15) ]
[ (15, 0)    (5, 5)  ]   (8.8)
The entries in this bimatrix are not the terms of imprisonment, but are the utility levels. Evidently the Nash equilibrium for this bimatrix game exists, is unique, and is the pair of strategies that generates (5, 5); that is, both prisoners plead guilty. If we pass to the mixed strategies the situation does not change. The Nash equilibrium remains the same (a pure strategy) and is unique. Indeed let (x∗, y∗) ∈ X̄1 × X̄2 be a Nash equilibrium. Then given y∗, x∗1 solves
max_{0≤x1≤1} [10·x1·y1∗ + 0·x1·y2∗ + 15·(1 − x1)·y1∗ + 5·(1 − x1)·y2∗]   (see (8.6))
⇒ x∗1 = 0. By using (8.7) we obtain y1∗ = 0. So indeed the unique Nash equilibrium is the pure strategy pair producing utility (5, 5) (see (8.8)), as before. Note that in this game the incentive, for a rational player concerned only with his own survival, is to confess and let the other suffer the consequences. But, as both are motivated to act in this way, they end up with an outcome that is worse for both than if they had been able to make a binding agreement between themselves that no one confesses.
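A brute-force check of the bimatrix (8.8) confirms this discussion; the following sketch is illustrative only and not part of the original text. It enumerates the pure-strategy equilibria and verifies that the coefficient of x1 in the mixed-strategy problem is negative, so x1∗ = 0 against every mixed strategy of player 2.

# Pure-strategy equilibria of the bimatrix game (8.8) and the mixed-strategy check.
A = [[10, 0], [15, 5]]   # payoffs a_pq of player 1 (row = plead innocent / guilty)
B = [[10, 15], [0, 5]]   # payoffs b_pq of player 2 (column = plead innocent / guilty)

pure_eq = [(p, q) for p in range(2) for q in range(2)
           if A[p][q] == max(A[r][q] for r in range(2))      # no profitable row deviation
           and B[p][q] == max(B[p][s] for s in range(2))]    # no profitable column deviation
print(pure_eq)   # [(1, 1)]: both plead guilty, payoffs (5, 5)

# Mixed strategies: against (y1, 1 - y1), the coefficient of x1 in player 1's
# expected payoff (8.6) is 10*y1 - 15*y1 - 5*(1 - y1) = -5 < 0, hence x1* = 0.
coeff = lambda y1: 10 * y1 - 15 * y1 - 5 * (1 - y1)
print(all(coeff(y1 / 10) < 0 for y1 in range(11)))   # True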
This example illustrates how rational behavior at a micro level leads to an apparently irrational macro outcome.
An interesting interpretation of the bimatrix game is in wage negotiations in a labor market. In this situation, player 1 is the labor union and player 2 is the management. Player 1 wants high wages, whereas management wishes to give as small a wage increase as possible. Hence the game is clearly a zero-sum game.
Next we go beyond games in normal form by introducing the notion of feasibility. This notion allows us to deal with situations in which the choices of players cannot be made independently (as was the case thus far). For example, consider several mining companies which extract coal from a field. Each company chooses to extract a certain amount xk and to sell it. The price of the coal depends on the amount sold. So each company (producer) has partial control of the price and consequently of its profits. On the other hand, the xk cannot be chosen independently, because
Σ_{k=1}^n xk (n = number of producers) cannot exceed the total amount of coal in the field. For this reason, for every player k ∈ N, we introduce a correspondence Fk : X −→ Xk, which describes those strategies available to player k ∈ N when all players have chosen their strategy. The way we have defined it, Fk also depends on the strategy of player k. This is done only for reasons of technical convenience. In concrete situations Fk is independent of player k's choice. Also now, in contrast to the previous model, we do not assume that the preferences of the players are represented by utility functions. Instead we are more general and assume that each player k ∈ N has a good reply multifunction Uk. So Uk : X = ∏_{i=1}^n Xi −→ Xk and yk ∈ Uk(x) if yk is a good reply
for player k ∈ N to multistrategy x ∈ X.
DEFINITION 8.1.11 A generalized game or abstract economy is a collection {Xk, Fk, Uk}_{k∈N}.
REMARK 8.1.12 This is not a game. No player can individually play this game, because he needs to know the strategies of the other players in order to determine his feasibility set. For this reason we call it a generalized game. A generalized game is a good mathematical tool to produce existence theorems in various applied contexts.
In this general setting, the notion of Nash equilibrium takes the following form:
DEFINITION 8.1.13 A Nash equilibrium for an abstract economy {Xk, Fk, Uk}_{k∈N} is a multistrategy x∗ ∈ X such that x∗ ∈ F(x∗) = ∏_{k=1}^n Fk(x∗) and Fk(x∗) ∩ Uk(x∗) = ∅ for all k ∈ N.
REMARK 8.1.14 So a Nash equilibrium for an abstract economy is a multistrategy for which no player has a good reply.
PROPOSITION 8.1.15 If Xk ⊆ R^{lk} are nonempty, compact, convex sets and Uk : ∏_{k=1}^n Xk −→ 2^{Xk} is a multifunction such that
(i) for every x ∈ X, Uk(x) is convex, and
(ii) for every xk ∈ Xk, the set U_k^{-1}(xk) = {x ∈ X : xk ∈ Uk(x)} is open,
then we can find x = {xk}_{k=1}^n ∈ X such that
xk ∈ Uk(x)   or   Uk(x) = ∅   for all k ∈ N = {1, . . . , n}.
PROOF: Let Γk = {x ∈ X : Uk (x) = ∅}. Because of hypothesis (ii), the set Γk is open in X. Let Uk = Uk Γ : X −→ 2Xk \{∅}. Apply Proposition 6.3.4 to obtain a k continuous function uk : Γk −→ Xk such that uk (x) ∈ Uk (x) for all x ∈ Γk . Consider the multifunction Gk : X −→ 2Xk \{∅} such that if x ∈ Γk {uk (x)} . Gk (x) = Xk otherwise Evidently Gk is usc and has nonempty, compact, and convex values. Hence so is n " Gk (x) from X into the nonempty, compact, and the multifunction x −→ G(x) = k=1
convex subsets of X. So we can apply the Kakutani–Ky Fan fixed point theorem (see Theorem 6.5.19) and obtain x∗ ∈ X such that x∗ ∈ G(x∗ ). If Gk (x∗ ) = Xk , then ∗ ∗ ∗ ∗ x∗k ∈ Gk (x∗ ) (x∗ = {x∗k }n k=1 ) implies that xk = uk (x ) ∈ Uk (x ). If Gk (x ) = Xk , ∗ then Uk (x ) = ∅. REMARK 8.1.16 A careful reading of the above proof reveals that if one of the Uk s is single-valued, then the result is automatically true (in this case the continuous selection γk is obtained trivially). COROLLARY 8.1.17 If Xk ⊆ Rlk is nonempty, compact, and convex, and for every k ∈ N = {1, . . . , n} the multifunction Uk : X −→ 2Xk has an open graph and satisfies xk ∈ / conv Uk (x) for all x ∈ X, then we can find x ∈ X such that Uk (x) = ∅ for all k ∈ N . PROOF: The multifunction x −→ conv Uk (x) satisfies hypotheses (i) and (ii) of Proposition 8.1.15. So by virtue of that result, we can find x = {xk }n k=1 ∈ X such that xk ∈ convUk (x) or conv Uk (x) = ∅, k ∈ N . But by hypothesis xk ∈ / conv Uk (x). Therefore Uk (x) = ∅ for all k ∈ N . THEOREM 8.1.18 If {Xk , Fk , Uk }k∈N is an abstract economy and for every k ∈ N, (i) Xk ⊆ Rlk is nonempty, compact, and convex, (ii) Fk : X −→ 2Xk has nonempty, closed, and convex values, is usc, for every x ∈ X we have Fk (x) = cl int Fk (x) and x −→ int Fk (x) has an open graph, (iii) Gr Uk is open in X × Xk , (iv) For every x ∈ X, xk ∈ / conv Uk (x), then the abstract economy admits a Nash equilibrium x∗ ∈ X (see Definition 8.1.13).
8.2 Cooperative Games PROOF: Let V0 = X, Vk = Xk for all k ∈ N and V = V0 × of V is denoted by (x, y) ∈ V with x ∈ V0 and y ∈
n "
n "
617
Vk . A generic element
k=1
Vk . For every k ∈ N , we
k=1
introduce a multifunction Hk : V −→ 2
Vk
defined by
H0 (x, y) = {y}
(8.9)
and for k ∈ N, Hk (x, y) =
int Fk (x) conv Uk (y) ∩ int Fk (x)
if yk ∈ / Fk (x) . if yk ∈ Fk (x)
(8.10)
Clearly H0 is single-valued continuous, whereas for k ∈ N, Hk has convex values (see hypothesis (ii)) and yk ∈ / Hk (x, y) (see hypothesis (iv)). We claim that GrHk ⊆ V × Vk is open. To this end let Ak = (x, y, zk ) : zk ∈ int Fk (x) Bk = (x, y, zk ) : yk ∈ / Fk (x) Ck = (x, y, zk ) : zk ∈ conv Uk (y) . Because by hypothesis (ii), x −→ int Fk (x) has an open graph, we deduce that Ak is open. Suppose that yk ∈ / Fk (x). Then we can find a closed neighborhood D of yk such that Fk (x) ⊆ Dc and because Fk is usc, it follows that Bk is open. Moreover, Ck is open because of hypothesis (iii). Since GrHk = (Ak ∩ Bk ) ∪ (Ak ∩ Ck ), we infer that indeed Gr Hk ⊆ V × Vk is open. Apply Proposition 8.1.15 (see also Remark 8.1.16) to obtain (x∗ , y ∗ ) ∈ V such that x∗ = y ∗
and
Hk (x∗ , y ∗ ) = ∅
for all k ∈ N
(see (8.9)).
(8.11)
From (8.10) and (8.11), we obtain conv Uk (x∗ ) ∩ int Fk (x∗ ) = ∅ ∗
∗
⇒ Uk (x ) ∩ int Fk (x ) = ∅ ∗
for all k ∈ N,
for all k ∈ N.
(8.12)
∗
But by hypothesis (ii), Fk (x ) = cl [int Fk (x )]. So from (8.12), we conclude that Uk (x∗ ) ∩ Fk (x∗ ) = ∅
for all k ∈ N,
⇒ x∗ ∈ X is a Nash equilibrium (see Definition 8.1.13).
8.2 Cooperative Games In the previous section, we examined noncooperative games and we introduced the concept of Nash equilibrium according to which each player will consider only unilateral strategy changes in deciding whether or not she can be made better off. In
618
8 Game Theory
this section we consider games in which players cooperate in order to improve their position. So coalitions of players may have more power to make their members better off than they would be by acting individually. So let N = {1, . . . , n} be the set of players and 2N \ {∅} (all nonempty subsets of N ) be the set of all possible coalitions of players. Throughout this section, we use the following notation. If {Xk }k∈N is"a family of sets and B ∈ 2N \ {∅}, " B then X = Xk . Similarly we set RB = R. By pB : X N −→ X B (resp., k∈B
k∈B
pB : RN −→ RB ), we denote the projection map. The simplest cooperative game is a game in characteristic function form with side payments (a side-payment game). This is defined as a function v : 2N \{∅} −→ R, which for every coalition B ∈ 2N \ {∅} assigns the maximal total utility of the coalition that its members can generate by their cooperation regardless of the actions of the players outside the coalition. A side-payment game corresponds to a game in normal form (see Definition 8.1.1). Indeed, for each player k ∈ N , his strategy set is given by the mutually disjoint union ! Xk = Xk,B : B ∈ 2N \ {∅}, k ∈ B . If x ∈ Xk,B , then player k ∈ N cooperates with the players in the coalition B. We set uk (x, xk ) = −∞ (x = pB c (x); see Section 8.1), / Xm,B . if for the coalition B for which xk ∈ Xk,B , there exists m ∈ B such that xm ∈ Also let uk be independent of {xk }k∈N \B where B is the coalition for which xk ∈ Xk,B . Then the required game satisfies for every coalition B ∈ 2N \ {∅}
uk ≤ v(B) = uk (x, xk ) k∈B for all k ∈ B, xk ∈ Xk,B . (uk )k∈B : k∈B
But clearly the associated game in normal form is not easy to deal with and for this reason, we prefer to work directly with the value function v. The basic solution concept for cooperative games is analogous to the notion of core of a pure competition economy (see Definition 7.1.3).
DEFINITION 8.2.1 The core of a side-payment game v is the set C(v) of all u = (uk)_{k∈N} ∈ R^N such that
(a) Σ_{k∈N} uk ≤ v(N);
(b) there is no coalition B ∈ 2^N\{∅} such that Σ_{k∈B} uk < v(B).
REMARK 8.2.2 In the above definition, the requirement (a) corresponds to the feasibility of the utility vector u = {uk}_{k∈N}, and the requirement (b) implies that no coalition can improve upon u ∈ R^N. Note that the core C(v) is the set of solutions of the 2^n linear inequalities
−(χN, x)_{R^n} ≥ −v(N)   and   (χB, x)_{R^n} ≥ v(B),
where for every B ∈ 2^N\{∅}, χB = Σ_{k∈B} ek, with ek = (ek,l)_{l=1}^n and ek,l = δkl (= 1 if k = l, 0 if k ≠ l).
8.2 Cooperative Games
619
The last observation leads us to consider systems of linear inequalities. So let A be an m × n-matrix and b ∈ Rm . First we consider the following system of linear equalities with a nonnegativity constraint. Ax = b
with x ≥ 0.
(8.13)
DEFINITION 8.2.3 We say that system (8.13) is consistent, if there is x ∈ Rn + (i.e., x ∈ Rn , x ≥ 0) that satisfies Ax = b. Such a vector is called a solution of (8.13). PROPOSITION 8.2.4 System (8.13) is consistent (i.e., it has a solution x ∈ Rn +) if and only if (y, b)Rm ≥ 0
for all y ∈ Rm such that A∗ y ≥ 0.
PROOF: Let Γ(A) = {Ax : x ≥ 0}. Clearly Γ(A) ⊆ R^m is a nonempty, closed, convex cone and 0 ∈ Γ(A). Note that the system (8.13) is consistent if and only if b ∈ Γ(A). If b ∈ Γ(A), then b = Ax for some x ≥ 0, and so if A∗y ≥ 0, then (A∗y, x)_{R^n} ≥ 0, hence (y, Ax)_{R^m} = (y, b)_{R^m} ≥ 0. Suppose b ∉ Γ(A). Then by the strong separation theorem we can find y ∈ R^m \ {0} and ε > 0 such that (y, b)_{R^m} + ε ≤ (y, u)_{R^m}, ⇒ (y, b)_{R^m} ≤ −ε
for all u ∈ Γ(A),
(because 0 ∈ Γ(A)).
On the other hand we have (y, u)Rm ≥ 0 for all u ∈ Γ(A). Indeed, if we can find u ∈ Γ(A) such that (y, u)Rm < 0, then because Γ(A) is a convex cone, for ϑ > 0 large enough, we have ϑu ∈ Γ(A)
and
(y, ϑu)Rm < (y, b)Rm + ε,
a contradiction.
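Proposition 8.2.4 can be explored numerically: feasibility of Ax = b, x ≥ 0 is a linear programming question, and when it fails a vector y with A∗y ≥ 0 and (y, b) < 0 certifies it. The sketch below is illustrative only; the matrix A, the vectors b, and the use of scipy.optimize.linprog are assumptions, not part of the original text.

# Minkowski-Farkas alternative (Proposition 8.2.4) checked on illustrative data:
# either Ax = b has a solution x >= 0, or some y satisfies A^T y >= 0 and (y, b) < 0.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([3.0, 1.0])

# Feasibility of {Ax = b, x >= 0} posed as an LP with zero objective.
res = linprog(c=np.zeros(2), A_eq=A, b_eq=b, bounds=[(0, None)] * 2)
print(res.success)    # True here: x = (1, 1) is a nonnegative solution

b_bad = np.array([-1.0, 1.0])
res = linprog(c=np.zeros(2), A_eq=A, b_eq=b_bad, bounds=[(0, None)] * 2)
print(res.success)    # False: no nonnegative solution exists
# A certificate in the sense of Proposition 8.2.4 is y = (1, -1):
# A^T y = (1, 1) >= 0 while (y, b_bad) = -2 < 0.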
REMARK 8.2.5 Proposition 8.2.4 is often called the Minkowski–Farkas lemma.
COROLLARY 8.2.6 If A is an m×n-matrix and b ∈ R^m, then the system Ax ≥ b is consistent (i.e., there exists x ∈ R^n such that Ax ≥ b) if and only if for every y ∈ R^m_+ for which A∗y = 0, we have (y, b)_{R^m} ≤ 0.
∗ A − A∗ − Im y ≥ 0, then (y, b)Rm ≥ 0. Now we return to the side-payment game and introduce the following notion.
620
8 Game Theory
DEFINITION 8.2.7 A collection S of coalitions (i.e., S ⊆ 2N \{∅}) is said to be balanced if there exists a set {λB }B∈S ⊆ R+ such that λB χB = χN . B∈S
Recall that for every B ∈ 2N \{∅} χB =
ek ,
k∈B n where ek = (ek,l )n l=1 ∈ R+ with
ek,l = δkl = Note that
1 0
if k = l . if k = l
λB χB = χN if and only for every k ∈ N ,
λB = 1. The set
B∈S
B∈S
k∈B
{λB}_{B∈S} is called the set of associated balancing coefficients. The side-payment game v is said to be balanced if for every balanced S ⊆ 2^N \ {∅} with associated balancing coefficients {λB}_{B∈S}, we have Σ_{B∈S} λB v(B) ≤ v(N).
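A quick numerical illustration of Definition 8.2.7 (not part of the original text; the collection and the sample characteristic function values are chosen for the example): for N = {1, 2, 3}, the three two-player coalitions with coefficients λB = 1/2 form a balanced collection.

# A balanced collection: N = {1, 2, 3}, S = {{1,2}, {1,3}, {2,3}}, lambda_B = 1/2.
S = [(1, 2), (1, 3), (2, 3)]
lam = {B: 0.5 for B in S}

# sum over B in S of lambda_B * chi_B(k) must equal 1 for every player k.
weights = {k: sum(lam[B] for B in S if k in B) for k in (1, 2, 3)}
print(weights)   # {1: 1.0, 2: 1.0, 3: 1.0} -> the collection is balanced

# For a game with v({i,j}) = 4 and v(N) = 9, the balancedness inequality at this
# collection reads 0.5*(4 + 4 + 4) = 6 <= 9, which holds.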
The notion of a balanced side-payment game is related to the nonemptiness of the core. THEOREM 8.2.8 If v : 2N \ {∅} −→ R is a side-payment game, then the core C(v) of v is nonempty if and only if for every family {λB }B∈2N \{∅} ⊆ R+ for which λB χB = χN , we have B∈2N \{∅}
λB v(B) ≤ v(N ).
B∈2N \{∅}
PROOF: By virtue of Corollary 8.2.6, we have that C(v) ≠ ∅ if and only if for every family {λB}_{B∈2^N\{∅}} ∪ {ξ} ⊆ R+ for which Σ_{B∈2^N\{∅}} λB χB + ξ(−χN) = 0,
we have
λB v(B) + ξ −v(N ) ≤ 0.
(8.15)
B∈2N \{∅}
If ξ = 0, then λB = 0 for all B ∈ 2^N \ {∅}, and (8.15) is trivially satisfied. If ξ > 0, then we set λ′B = λB/ξ and we obtain Σ_{B∈2^N\{∅}} λ′B v(B) ≤ v(N).
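Theorem 8.2.8 reduces nonemptiness of the core to a linear programming check: maximize Σ_B λB v(B) over nonnegative balanced weight systems and compare the value with v(N). The sketch below is illustrative only; the 3-player characteristic function v and the use of scipy.optimize.linprog are assumptions chosen for the example.

# Balancedness check (Theorem 8.2.8) for an illustrative 3-player side-payment game:
# the core is nonempty iff max { sum_B lam_B v(B) : sum_B lam_B chi_B = chi_N,
# lam >= 0 } does not exceed v(N).
import numpy as np
from scipy.optimize import linprog
from itertools import combinations

N = (1, 2, 3)
v = {(1,): 0, (2,): 0, (3,): 0, (1, 2): 4, (1, 3): 4, (2, 3): 4, (1, 2, 3): 9}
coalitions = [c for r in range(1, 4) for c in combinations(N, r)]

A_eq = np.array([[1.0 if k in B else 0.0 for B in coalitions] for k in N])
b_eq = np.ones(3)
c = -np.array([float(v[B]) for B in coalitions])      # maximize => minimize the negative

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(coalitions))
print(-res.fun, "<=", v[N], "->", -res.fun <= v[N] + 1e-9)   # 9.0 <= 9 -> core nonempty
# Consistently, u = (3, 3, 3) satisfies Definition 8.2.1 for this v.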
8.2 Cooperative Games
621
A shortcoming of the model of a side-payment game is that the notion of total utility v(B) for a coalition B ∈ 2N \{∅} is hard to interpret, unless the players have utility functions. If the players have preferences over outcomes that are not representable by utility functions, then the game must specify the physical outcomes that a coalition can guarantee for its members. The preferences can then be described as binary relations on vectors of physical outcomes and it is not necessary to rely on a utility function. To accommodate this situation, we introduce a more general game. So for every coalition B ∈ 2N \{∅}, we set n for every k ∈ B . RB = x = (xk )n k=1 ∈ R : xk = 0 A game in characteristic function form without side-payments (or simply a nonn side-payment game) is a multifunction V : 2N \{∅} −→ 2R \ {∅} such that V (B) ⊆ RB
for every B ∈ 2N \ {∅}.
We have that u ∈ V (B) if and only if cooperation among the members of coalition B, can bring about the utility allocation (uk )k∈B to the members of B. An equivalent definition of a non-side-payment game is a multifunction V : 2N \ n {∅} −→ 2R \{∅} such that u, u ∈ Rn
and
uk = uk
u ∈ V (B)
if and only if
for all k ∈ B
implies
u ∈ V (B).
Evidently the set V (B) is a cylinder with basis V (B); that is, V (B) = u ∈ Rn : (uk , δkB )k∈B ∈ V (B)
where δkB =
1 0
if k ∈ B . if k ∈ N \B
We adopt this definition of a non-side-payment game. Again we can associate with a game in normal form. However, its formulation is very cumbersome and so it is much easier to work with the value multifunction B −→ V (B). For such a game, the notion of core is defined as follows. DEFINITION 8.2.9 The core of a non-side-payment game V , is the set C(V ) of n all u = (uk )n k=1 ∈ R such that (a) u ∈ V (N ). (b) We can find coalition B ∈ 2N \{∅} and u ∈ V (B) such that uk < uk
for all k ∈ B.
REMARK 8.2.10 In the above definition, requirement (a) is a feasibility condition on the core vector and requirement (b) simply says that no coalition can improve upon u.
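To illustrate Definition 8.2.9, here is a sketch that is not part of the original text; the sets V(B), given by simple half-space constraints, and the grid search are assumptions chosen for the example. It tests core membership for a two-player non-side-payment game by checking feasibility and the absence of an improving coalition over a grid of candidate utility vectors.

# Core of an illustrative non-side-payment game with N = {1, 2}:
# V({1}) = {u : u1 <= 1}, V({2}) = {u : u2 <= 1}, V({1,2}) = {u : u1 + u2 <= 4}.
V = {
    (1,):   lambda u: u[0] <= 1,
    (2,):   lambda u: u[1] <= 1,
    (1, 2): lambda u: u[0] + u[1] <= 4,
}
grid = [(a / 4.0, b / 4.0) for a in range(-20, 21) for b in range(-20, 21)]

def improves(B, u):
    """Can coalition B find u' in V(B) with u'_k > u_k for all k in B (grid search)?"""
    return any(V[B](up) and all(up[k - 1] > u[k - 1] for k in B) for up in grid)

def in_core(u):
    return V[(1, 2)](u) and not any(improves(B, u) for B in V)

print(in_core((2.0, 2.0)))   # True: feasible and no coalition can improve upon it
print(in_core((0.5, 0.5)))   # False: already coalition {1} can reach u1' = 1 > 0.5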
622
8 Game Theory
DEFINITION 8.2.11 A non-side-payment game V is said to be balanced if for every balanced collection of coalitions S (see Definition 8.2.7) we have # V (S) ⊆ V (N ). B∈S
In what follows, we show that the notion of a balanced game, is sufficient for the nonemptiness of the core. However, in contrast to side-payment games (see Theorem 8.2.8) it is not necessary. But first we need to state a version of the KKM-theorem (see Theorem 6.5.9) due to Shapley [551], which we need in the proof of the theorem for the nonemptiness of the core. For every coalition B, ∆B = conv{ek : k ∈ B}. PROPOSITION 8.2.12 If CB : B ∈ 2N \ {∅} is a family of nonempty closed subsets of ∆N such that for each B ∈ 2N \ {∅}, ∆B ⊆ CD , then there is a balanced collection S such that
D⊆B
#
CB = ∅.
B∈S
Using this proposition we can now establish the nonemptiness of the core C(V ). n THEOREM 8.2.13 If V is a non-side-payment game, b = (bk )n k=1 ∈ R is defined by bk = sup uk : u = (ui )n i=1 ∈ V ({k})
and we have (i) For every B ∈ 2N \ {∅}, V (B) − Rn + = V (B); (ii) There exists M > 0 such that for every B ∈ 2N \ {∅} u ∈ V (B) ∩ {b} + Rn implies uk < M +
for all k ∈ B;
(iii) For every B ∈ 2N \ {∅}, V (B) is closed in Rn ; (iv) V is balanced, then C(V ) is nonempty. PROOF: We may assume without any loss of generality that b = 0. Let M > 0 be as in hypothesis (ii) and for every coalition B define DB = conv{−M nek : k ∈ B}. We consider the function ξ : DN −→ R defined by ! V (B) . ξ(y) = max λ ∈ R : y + λχN ∈ B∈2N \{∅}
The function ξ is continuous (see Theorem 6.1.18(c)). Hence the function y −→ g(y) = y + ξ(y)χN is continuous too. We set
8.2 Cooperative Games
623
CB = y ∈ DN : g(y) ∈ V (B) = g −1 V (B) , ⇒ CB ⊆ DN
is closed.
(8.16)
Claim: If B, B are coalitions and DB ∩ CB = ∅, then B ⊆ B . Clearly the claim is true if B = N . So suppose that |B | < n and let y ∈ DB ∩ CB . We have yk = −M n, k∈B
hence, we can find k0 ∈ B such that Mn < −M, |B | + ξ(y) ≥ 0 (because g(y) ∈ Rn + ),
yk0 ≤ − ⇒ yk0
⇒ ξ(y) > M
(see (8.17)).
(8.17)
(8.18)
On the other hand, we have g(y) ∈ V (B)
(see (8.16)), for all k ∈ B
⇒ yk + ξ(y) < M
for all k ∈ B
⇒ yk < 0
(see hypothesis (ii)),
(see (8.18)),
⇒ B ⊆ B. Thus we have satisfied all the hypotheses of Proposition 8.2.12. According to that result we can find a balanced collection S0 such that #
CB = ∅.
B∈S0
Let y0 ∈
B∈S0
CB . Then we have
g(y0 ) ∈ g
#
CB
⊆
B∈S0
#
g(CB )
B∈S0
⊆
#
V (B)
(see (8.16))
B∈S0
⊆ V (N )
(because S0 is balanced).
(8.19)
definition of ξ, we see that g(y0 ) is on the boundary of From (8.19)Nand the V (B) : B ∈ 2 \{∅} . This together with hypothesis (i) imply that g(y0 ) can not be improved upon by any coalition. Therefore g(y0 ) ∈ C(V ); that is, C(V ) = ∅. REMARK 8.2.14 Hypothesis (i) is usually called the comprehensiveness condition (free disposability in economic models). Also hypothesis (iii) can be replaced by the weaker condition: (iii) V (N ) ⊆ Rn is closed.
624
8 Game Theory
8.3 Cournot–Nash Equilibria for Random Games In Section 8.1 we considered noncooperative games and introduced an equilibrium notion for them, due to Nash (see Definition 8.1.5). The game had a finite number of players each of whom is described by a strategy set and a utility (payoff) function defined on the Cartesian product of the strategy sets. It was assumed that the strategy sets were subsets of R and in fact without any additional effort we can also assume that each strategy set is a subset of a finite-dimensional Banach space. In this section we present a twofold generalization of this model. We allow the strategy set of each player to be a subset of an infinite dimensional Banach space. On the other hand, we consider a continuum of players, in analogy to the model of perfect competition investigated in Section 7.1. So the space of players is a finite measure space. Each player has a preference multifunction, which describes his or her preference relation on the set of feasible strategies. This relation need not be transitive and complete and so may not be representable by a utility function. So let (T, T , µ) be a complete finite nonatomic measure space. Here T is the set of players, T is the set of all possible coalitions among the players, and µ is a set function measuring the size of each coalition. The space of all possible strategies (decisions) of the players is a separable Banach space X. More precisely, there is a multifunction F : T −→ 2X \{∅} which for each player t ∈ T describes his strategy set F (t) (i.e., F (t) ⊆ X is the set of all strategies (actions, decisions) available to player t ∈ T ). The randomness of the game is described by a complete probability space (Ω, Σ, µ), with Ω corresponding to the set of all possible states of nature, and Σ to the collection of all possible outcomes. Also given is a multifunction P : Ω × T × SF1 −→ 2X which describes the random preferences of the players. More precisely, for every (ω, t) ∈ Ω × T , P (ω, t, u) ⊆ F (t), and for µ-almost every player t ∈ T , the set P (ω, t, u) consists of all the strategies (actions), that he/she strictly prefers to her own strategy u(t), given that ω ∈ Ω is the state of the environment and given that the decisions of all players (modulo null-coalitions), are represented by u ∈ SF1 . So we see that each player’s preference pattern is influenced by the strategies of the other players (modulo null coalitions) and by the realized state of nature. However, each player must choose his strategy independently, after having observed the realized state ω ∈ Ω of nature. So the players’ actions can be modelled by a map g : Ω −→ SF1 , which prescribes for each possible state of nature ω ∈ Ω, the strategies of all players modulo null coalitions. Note that the players act independently and noncooperatively. So the random game under consideration is described by the quadruple R = (T, T , µ), (Ω, Σ, ν), F, P . DEFINITION 8.3.1 A Cournot–Nash equilibrium for the random game R, is a
map g : Ω −→ SF1 which is Σ, B L1 (T, X) -measurable such that for ν-a.a. ω ∈ Ω, we have
P ω, t, g(ω) = ∅ for µ-a.a. t ∈ T.
REMARK 8.3.2 Every Σ, B L1 (T, X) -measurable map g : Ω −→ SF1 , is called a strategy rule. According to the above definition, no player t ∈ T outside a powerless (due to the nonatomicity of µ) null coalition, can produce a strategy, which he/she strictly prefers to the equilibrium strategy g(ω)(t).
8.3 Cournot–Nash Equilibria for Random Games
625
Now we are ready to introduce the precise mathematical hypotheses on the data of the random game. In what follows (SF1 , w) is the set SF1 with the relative weak L1 (T, X)-topology. H1 : (T, T , µ) is a complete, finite, nonatomic measure space. REMARK 8.3.3 As was the case with the economic model of pure competition considered in Section 7.1, the nonatomicity of the measure space of players expresses the fact that there is no coalition of players which has more influence on the game than the others. H2 : F : T −→ Pwkc (X) is graph-measurable and integrably bounded (see Definition 6.4.22).
H3 : dom P = (ω, t, u) ∈ Ω × T × SF1 : P (ω, t, u) = ∅ ∈ Σ× T ×B L1 (T, X) and dom conv P (ω, t, ·) is w-open.
H4 : There is no Σ, B L1 (T, X) -measurable map f : Ω −→ SF1 (a decision rule),
such that for ν-a.a. ω ∈ Ω, we have f (ω)(t) ∈ conv P ω, t, f (ω) µ-a.e. on T . H5 : There exists a multifunction H : domP −→ Pf c (X) that is graph-measurable and (i) For every (ω, t, u) ∈ Ω × T × SF1 , H(ω, t, u) ⊆ conv P (ω, t, u) ⊆ F (t). (ii) For every (ω, t) ∈ Ω × T, u −→ H(ω, t, u) is usc from (SF1 , w) into Xw . REMARK 8.3.4 If (ω, t, u) −→ conv P (ω, t, u) is graph measurable and for all (ω, t) ∈ Ω × T, u −→ P (ω, t, u) is usc with closed values, then we can take H = convP . Let (Y, d) be a separable metric space and consider a multifunction G : Ω × T × Y −→ Pf c (X) such that for every (ω, t, y) ∈ Ω × T × Y we have G(ω, t, y) ⊆ F (t). 1
Then we define Γ : Ω × Y −→ 2SF by Γ(ω, y) = u ∈ SF1 : u(t) ∈ G(ω, t, y) µ-a.e. on T . PROPOSITION 8.3.5 If hypotheses H1 , H2 hold and for every (ω, t) ∈ Ω × T , G(ω, t, ·) is usc from Y into X furnished with the weak topology (denoted by Xw ), then (a) For every ω ∈ Ω, Γ(ω, ·) is usc from Y into (SF1 , w). (b) If Gr G ∈ Σ × T × T × B(Y ) × B(X), 1 then we can find a multifunction Γ : Ω×Y −→ 2SF such that GrΓ ∈ Σ×B(SF1 ) and for ν-a.a. ω ∈ Ω we have Γ(ω, ·) = Γ (ω, ·).
626
8 Game Theory
PROOF: (a) By virtue of hypothesis H1 and Theorem 6.4.23, we have that SF1 is nonempty, convex, and w-compact. Moreover, (SF1 , w) is metrizable. So in order to prove the desired upper semicontinnuity of Γ(ω, ·), it suffices to show that Gr Γ(ω, ·) is sequentially closed in Y × (SF1 , w) (see Proposition 6.1.10). To this end let (yn , un ) ⊆ Y × SF1 and suppose that yn −→ y
in (Y, d)
w
un −→ u
and
in L1 (T, X).
We have un (t) ∈ G(ω, t, yn )
µ-a.e. on T.
Invoking Proposition 6.6.33, we obtain u(t) ∈ conv w- lim sup G(ω, t, yn )
µ-a.e. on T.
But since G(ω, t, ·) is w-usc and has closed convex, values, we obtain u(t) ∈ G(ω, t, y)
µ-a.e. on T,
⇒ u ∈ Γ(ω, y). This proves the upper semicontinuity of y −→ Γ(ω, y) from Y into (SF1 , w). (b) Let
ϑ(ω, t, y, x) = iG(ω,t,y) (x) =
0 +∞
if x ∈ G(ω, t, y) , otherwise
the indicator function of the set G(ω, t, y). By virtue of the graph measurability of G, we have that ϑ is measurable. Moreover, because Gr G(ω, t, ·) is closed in Y ×Xw (due to the upper semicontinuity of y −→ G(ω, t, y) from Y into Xw ), we have that ϑ(ω, t, ·) is lower semicontinuous on Y × Xw and of course it is convex in x ∈ X. Therefore we can find an increasing sequence ϑn : Ω × T × Y × X −→ R, n ≥ 1, of measurable functions such that for every n ≥ 1 and every (ω, t) ∈ Ω × T , ϑn (ω, t, ·, ·) is Lipschitz continuous on Y ×X with Lipschitz constant n ≥ 1 and ϑn ↑ ϑ as n → ∞. Let
Iϑn (ω, y, u) = ϑn ω, t, y, u(t) dµ and Iϑ (ω, y, u) = ϑ ω, t, y, u(t) dµ. T
T
Using Theorem 2.1.28, we can check that for every ω ∈ Ω, Iϑn (ω, ·, ·) is continuous on Y ×L1 (T, X). Also for every n ≥ 1 and every (y, u) ∈ Y ×SF1 , ω −→ Iϑn (ω, y, u) is Σ-measurable. Thus by virtue of Theorem 6.2.6 for every n ≥ 1, (ω, y, u) −→ Iϑn (ω, y, u) is Σ×B(Y )×B(SF1 )-measurable. Note that from the monotone convergence theorem, we have Iϑn ↑ Iϑ
as n → ∞,
⇒ Iϑ is Σ × B(Y ) × B(SF1 )-measurable
recall that B(SF1 ) = B (SF1 , w) . Finally let Γ be defined by GrΓ = (ω, y, u) ∈ Ω×Y ×SF1 : lim Iϑn (ω, y, u) ≤ 0 .
n→∞
Then Γ (ω, y, u) has all the desired properties.
Now we are ready to establish the existence of a Cournot–Nash equilibrium (see Definition 8.3.1 for the random game under consideration.
8.3 Cournot–Nash Equilibria for Random Games
627
THEOREM 8.3.6 If hypotheses H1 , H2 , H3 , H4 , and H5 hold, then the random game R admits a Cournot–Nash equilibrium. PROOF: As we already pointed out (SF1 , w) is compact, and metrizable (see the proof of Proposition 8.3.5). Let G : Ω×T ×SF1 −→ 2X be defined by H(ω, t, y) if y ∈ dom P (ω, t, ·) G(ω, t, y) = . (8.20) F (t) otherwise Evidently G(ω, t, y) ∈ Pwkc (X) and G(ω, t, y) ⊆ F (t) for all (ω, t, y) ∈ Ω×T ×SF1 . Moreover, because of hypothesis H5 (ii), we have that for all (ω, t) ∈ Ω × T , G(ω, t, ·) is usc from (SF1 , w) into Xw . So, if we define Γ(ω, y) = u ∈ SF1 : u(t) ∈ G(ω, t, y) µ-a.e. on T , then we can apply Proposition 8.3.5 with Y = (SF1 , w) and
G, Γ as above,
to deduce that for every ω ∈ Ω, the multifunction y −→ Γ(ω, y) is usc from (SF1 , w) into (SF1 , w). Because Γ(ω, ·) has nonempty, compact, convex values in (SF1 , w), we can apply Theorem 6.5.19 (the Kakutani–Ky Fan fixed point theorem) and deduce that for every ω ∈ Ω, we can find u ∈ SF1 (depending on ω ∈ Ω) such that
u ∈ Γ(ω, u).
(8.21) 1
From Proposition 8.3.5(b), there is a multifunction Γ : Ω×SF1 −→ 2SF such that Γ is graph-measurable and for ν-a.a. ω ∈ Ω, we have Γ(ω, ·) = Γ (ω, ·).
(8.22)
So from (8.21) and (8.22) and the µ-completeness of Σ, we have D = (ω, u) ∈ Ω × SF1 : u ∈ Γ(ω, u) ∈ Σ × B(SF1 ). Invoking Theorem 6.3.20 (the Yankov–von Neumann–Aumann selection theo
rem), we obtain a Σ × B(SF1 ) -measurable map u : Ω −→ SF1 such that
u(ω) ∈ Γ ω, u(ω) ν-a.e. on Ω. This implies that for ν-a.a. ω ∈ Ω, we have
u(ω)(t) ∈ G ω, t, u(ω)
µ-a.e. on T.
Suppose that we can find B ∈ Σ with ν(B) > 0 such that for every ω ∈ B
domP ω, t, u(ω) = ∅ µ-a.e. on T. Then for every ω ∈ B, we have
u(ω)(t) ∈ H ω, t, u(ω) ⊆ conv P ω, t, u(ω)
µ-a.e. on T
(see (8.20)),
which contradicts hypothesis H4 . Therefore we conclude that for ν-a.a. ω ∈ Ω,
P ω, t, u(ω) = ∅ for µ-a.a. t ∈ T, ⇒ u ∈ SF1 is a Cournot-Nash equilibrium for R.
628
8 Game Theory
8.4 Bayesian Games In the previous section, we studied the existence of equilibria for random games in which all players shared the same information about the state of the environment. In this section, we consider the situation in which the players have private information (imperfect information in the terminology of Harsanyi [285] or differential information). Motivated by the analogous situation in statistics, we call such games Bayesian games. We consider the situation with a countable number of players and an infinite-dimensional strategy space. The precise mathematical model of the game is the following. Let (Ω, Σ, µ) be a complete probability space. Here Ω denotes the set of all possible states of the environment, the σ-field Σ corresponds to the collection of all possible events and the measure µ describes the distribution of these events. Also T is a countable set and its elements represent the players of the game. The players choose their strategies (actions) from a separable Banach space X. Each player t ∈ T , has a strategy (action) multifunction Ft : Ω −→ 2X \{∅}. If the state of the environment is ω ∈ Ω, the set Ft (ω) ⊆ X represents the set of strategies (actions) available to player t ∈ T when the state of the " environment is ω ∈ Ω. Also each player t ∈ T has a utility function ut (ω, ·) : Fs (ω) −→ R s∈T
that depends on the state of the environment ω ∈ Ω. This utility function describes the preference of the player among the various feasible strategies, when the state is ω ∈ Ω. Moreover, to every player t ∈ T corresponds a complete sub-σ-field Σt of Σ, which represents the private information of that player and a prior pt : Ω −→ R+ , which is a Radon–Nikodym derivative with respect to the probability µ such that p (ω)dµ = 1. t Ω Then the Bayesian game (or game with private information or game with differential information), is the quadruple. B = (Ft , ut , Σt , pt )t∈T . In what follows we use the following notation. SF1 t (Σt ) = vt ∈ L1 (Ω, Σt ; X); vt (ω) ∈ Ft (ω) µ-a.e. on Ω . We set SF1 =
=
SF1 t (Σt )
t∈T
and
SF1t =
=
SF1 s (Σs ).
s∈T
s=t
An element of SF1 t (Σt ) is a strategy for player t ∈ T . For each player t ∈ T , we assume that there exists a finite or countable partition Pt of Ω and Σt = σ(Pt ) (i.e., Pt generates the sub-σ-field Σt containing ω ∈ Ω. We assume that
pt (ω )dµ > 0. At (ω)
8.4 Bayesian Games We set
pt ω At (ω) =
0 pt (ω ) At (ω)pt (ω )dµ
if ω ∈ / At (ω) if ω ∈ At (ω)
.
629
(8.23)
DEFINITION 8.4.1 The conditional expected utility of player t ∈ T , Ut (ω, ·, ·) : SF1 t ×Ft (ω) −→ R, is defined by
ut ω , y(ω ), x pt ω At (ω) dµ
Ut (ω, y, x) =
(see (8.23)).
(8.24)
At (ω)
REMARK 8.4.2 In the above definition we understand Ut (ω, y, x) as the conditional expected utility of player t ∈ T ; when the state of the environment is ω ∈ Ω, the player chooses the strategy x ∈ Ft (ω) and the other players have chosen the strategy profile y ∈ SFt . DEFINITION 8.4.3 A Bayesian Nash equilibrium for the Bayesian game B is a strategy profile y ∗ ∈ SF1 such that for all t ∈ T , we have
Ut ω, y ∗ , yt (ω) = max Ut (ω, y ∗ , v) : v ∈ Ft (ω)
µ-a.e. on Ω.
We introduce some hypotheses on the data of the Bayesian game B. H1 : F : Ω × T −→ Pwkc (X) is a multifunction such that for every t ∈ T, Gr Ft ∈ Σt × B(X), and Ft is integrably bounded. Let Ft (ω) =
"
Fs (ω) and Xt = X for all t ∈ T .
s=t
H2 : u : Ω×
"
Xt , ω −→ R = R ∪ {−∞} is a function such that
t∈T
(i) For every (t, x) ∈ T ×
"
Xt , ω −→ ut (ω, x) is Σ-measurable.
t∈T
(ii) For every (ω, t) ∈ Ω×T , ut (ω, ·, ·) : Ft (ω)×Ft (ω) −→ R and it is continuous " when Ft (ω) is endowed with the relative product weak topology of Xs s=t
and Ft (ω) is endowed with the norm topology. (iii) For every (ω, t, x) ∈ Ω × T ×
"
Fs (ω), the function x −→ ut (ω, x, x) is
s=t
concave on Ft (ω). (iv) There exists h ∈ L1 (Ω, Σ) such that for µ-a.a. ω ∈ Ω and all (t, x) ∈ T × Ft (ω)×Ft (ω) we have |ut (ω, x)| ≤ h(ω). We start with an auxiliary result that establishes the continuity properties of the conditional expected utility (see (8.24)). Let µt (A) = A pt (ω)dµ for every t ∈ T and A ∈ Σ. Evidently this is a probability measure on (Ω, Σ).
630
8 Game Theory
PROPOSITION 8.4.4 If hypotheses H1 , H2 hold and for every fixed ω ∈ Ω, t ∈ T and A ∈ Σ, we set
A U t (y, x) = ut ω , y(ω ), x dµt (ω ) with y ∈ SF1t , x ∈ Ft (ω), A
A Ut
is continuous on SF1t ×Ft (ω), when SF1t is equipped with the relative product then weak topology and Ft (ω) is equipped with the norm topology. w
PROOF: Suppose that y n −→ y in SF1t and xn −→ x in Ft (ω). We know that SF1t is weakly compact (see hypothesis H1 and Theorem 6.4.23). We have y n = (ysn )s∈T s=t
w
and ysn −→ ys in SF1 s for all s ∈ T, s = t. w
Claim: For every ω ∈ Ω and every s ∈ T, s = t, we have ysn (ω) −→ ys (ω) in Fs (ω). We fix ω ∈ Ω. Let Ps be the countable Σs -partition of Ω such that σ(Ps ) = Σs . We set Ps = Dsk k≥1 . Hence we have ysn =
zsn,k χDsk
and
ys =
k≥1
zsk χDsk ,
k≥1 k(ω)
where zsn,k , zsk ∈ Fs (ω). For every s ∈ T , we can find a unique Ds
∈ Ps such that
ω ∈ Dsk(ω) . Then for every x∗ ∈ X ∗ , we have x∗ , ysn (ω) =
x∗ , ysn (ω)
χ k(ω) dµ(ω ) k(ω) µ(Ds ) Ds
x∗ , ysn (ω ) = χ k(ω) (ω )dµ(ω ) k(ω) ) Ds Ω µ(Ds Ω
(8.25)
(because ysn (ω ) = ysn (ω) for all ω ∈ Ds ).
k(ω) Evidently x∗ µ(Ds ) ∈ L∞ (Ω, X ∗ ) and ysn χDk(ω) ∈ L1 (Ω, X). Therefore k(ω)
s
w
from (8.25) and because ysn −→ ys in L1 (Ω, X), we obtain x∗ , ysn (ω) −→ x∗ , ys (ω) , w
⇒ ysn (ω) −→ ys (ω)
in Fs (ω) ⊆ X.
This proves the claim. Then using the claim and hypothesis H2 (ii), we have
ut ω, y n (ω), xn −→ ut ω, y(ω), x . From this and the dominated convergence theorem (it can be used here because of hypothesis H2 (iv)), we have A
A
U t (y n , xn ) −→ U t (y, x). Now we can have the equilibrium result for game B.
8.4 Bayesian Games
631
THEOREM 8.4.5 If hypotheses H1 and H2 hold, then the Bayesian game B admits a Bayesian Nash equilibrium. PROOF: By virtue of Proposition 8.4.4, for every ω ∈ Ω Ut (ω, ·, ·) : SF1t × Ft (ω) −→ R is continuous when SF1t is furnished with the relative product weak topology and Ft (ω) is furnished with the norm topology. Moreover, because of hypothesis H2 (iii) for every (ω, y) ∈ Ω × SF1t , Ut (ω, y, ·) is concave. For every t ∈ T , we consider the multifunction Et : Ω × SF1t −→ 2X defined by Et (ω, y) = x ∈ Ft (ω) : Ut (ω, y, x) = max{Ut (ω, y, v) : v ∈ Ft (ω)} . Due to the concavity and continuity of Ut (ω, y, ·), we have that it is weakly upper semicontinuous. Hence because the set Ft (ω) ∈ Pwkc (X), we deduce that Et (ω, y) = ∅. Moreover, the concavity of Ut (ω, y, ·) implies that Et (ω, y) ⊆ Ft (ω) is convex. Theorem 6.1.18(b) implies that for every ω ∈ Ω, y −→ Et (ω, y) is usc from SF1t with the relative product weak topology into the nonempty, closed, and convex subsets of Ft (ω), the latter equipped with the norm topology. From hypothesis H1 , we know that we can find vn : Ω −→ X, n ≥ 1, Σt measurable selectors of Ft (·) such that Ft (ω) = {vn (ω)}n≥1
for all ω ∈ Ω
(see Theorem 6.3.20).
Then exploiting the norm continuity of Ut (ω, y, ·), we have
max Ut (ω, y, v) : v ∈ Ft (ω) = max Ut ω, y, vn (ω) , n≥1 ⇒ ω −→ max Ut (ω, y, v) : v ∈ Ft (ω) = ξt (ω, y) is Σt -measurable, ⇒ GrEt (·, y) = (ω, x) ∈ GrFt : Ut (ω, y, x) = ξt (ω, y) ∈ Σt × B(X) (hypothesis H1 ). So applying Theorem 6.3.20 (the Yankov–von Neumann–Aumann selection theorem), we obtain ft : Ω −→ X a Σt -measurable map such that ft (ω) ∈ Et (ω, y)
for all ω ∈ Ω.
Since Et (ω, y) ⊆ Ft (ω) and by hypothesis H1 , Ft is integrably bounded, it follows 1 that ft ∈ SF1 t (Σt ). So, if we consider the multifunction Γt : SF1t −→ 2SFt (Σt ) defined by Γt (y) = x ∈ SF1 t (Σt ) : x(ω) ∈ Et (ω, y) µ-a.e. on Ω ; then we see that Γt has nonempty values. We claim that it is usc from SF1t with the relative product weak topology into SF1 t (Σt ) with the weak topology. Because SF1 t (Σt ) is weakly compact in L1 (Ω, Σt ; X) (see Theorem 6.4.23), it suffices to show that Gr Γt is closed in SF1t × SF1 t (Σt ) (see Proposition 6.1.10). Because the weak
632
8 Game Theory
topologies on SF1t and SF1 t (Σt ) are metrizable, we can use sequences. So suppose w w that yn −→ y in SF1t , xn −→ x in SF1 t (Σt ), and xn ∈ Γt (yn ). Then xn (ω) ∈ Et (ω, yn )
for all n ≥ 1.
µ-a.e. on Ω
Invoking Proposition 6.6.33, we obtain x(ω) ∈ conv w- lim sup Et (ω, yn ) n→∞
⊆ Et (ω, y),
(8.26)
where the inclusion in (8.26) follows from the fact that Et (ω, ·) is usc on SF1t with the relative product weak topology and it has closed and convex values. Therefore it follows that (y, x) ∈ Gr Γt , ⇒ Γt is usc as claimed. Let Γ : SF1 −→ Pwkc (SF1 ) be defined by = Γ(y) = Γt (yt ) t∈T
where y = (yt )t∈T with yt ∈ SF1 t (Σt ) and yt = (ys )s∈T ∈ SF1t . Evidently Γ is usc s=t
when SF1 is furnished with the relative product weak topology. We apply Theorem 6.5.19 (the Kakutani–Ky Fan fixed point theorem), to obtain y ∗ ∈ SF1 such that y ∗ ∈ Γ(y ∗ ). Clearly then
Ut ω, y ∗ , yt∗ (ω) = max Ut (ω, y ∗ , v) : v ∈ Ft (ω)
µ-a.e. on Ω,
⇒ y ∗ is a Bayesian Nash equilibrium for B.
8.5 Stochastic Games In this section, we consider discounted stochastic games in which the state space is a Borel space (i.e., a Borel subset of a Polish space) and the action spaces of the players are compact metric spaces. Under reasonable continuity hypotheses on the reward function and the transition probability describing the law of motion of the system, we show that the discounted stochastic game has a value and both players have optimal stationary policies. To do this we use the so-called Bellman’s principle of optimality for the discounted dynamic programming problem. For this reason our discussion begins with a brief presentation of the dynamic programming model and of the associated Bellman’s principle of optimality. The discounted dynamic programming model is determined by the following objects.
8.5 Stochastic Games
633
•
A state space S, which is assumed to be Borel space (i.e., it is the Borel subset of a Polish space).
•
A space of actions X, which is a Borel space too.
•
A constraint multifunction F : S −→ 2X \ {∅}; this multifunction assigns to each state s ∈ S a nonempty feasible (permissible) set of actions F (s).
•
A law of motion q, which to every pair (s, x) ∈ Gr F assigns a probability measure q ·s, x on the Borel sets of S. This probability measure describes the distribution of the state next visited by the system if the system is in state s ∈ S and action x ∈ X is taken.
•
A bounded reward function r : Gr F −→ R.
•
A discount factor δ ∈ (0, 1). We make the following mathematical hypotheses on the above items:.
H1 : Gr F ∈ B(S ×X) = B(S)×B(X) and contains the graph of a Borel map from S to X.
H2 : q ·s, x is a Borel-measurable transition probability the
on Borel σ-field B(S); that is, for every B ∈ B(S), the function (s, x) −→ q Bs, x is Borel-measurable from S × X into [0, 1]. H3 : The reward function r : Gr F −→ R is bounded and Borel-measurable. Let H1 = S and Hn = Gr F ×Hn−1 for n ≥ 2. These are the sets of all possible histories of the system up to the stage n ≥ 1. Then a policy π is a sequence {πn }n≥1 , where for each n ≥ 1, πn is a conditional probability on B(X) given the past history Hn of the system. We assume that πn satisfies the constraint
πn F (sn )hn = 1 for all histories hn = (s1 , x1 , . . . , sn ). DEFINITION 8.5.1 A history π is said to be stationary if there exists a Borel measurable selector of f of F such that πn = f ; that is, πn f (sn )hn = 1 for all histories hn = (s1 , x1 , . . . , sn ). The stationary policies are identified with SF = set of Borel selectors of F . Any policy π together with the law of motion q, defines a conditional probability pn on the set X × S × X × S × · · · of futures of the system, given the initial state s ∈ S; that is, pπ (·s) = π1 qπ2 q . . . (8.27) The expected total discounted reward is defined by L(π)(s) = Eπ
∞
n=1
δ n−1 r(sn , xn )s ,
s ∈ S.
(8.28)
Here Eπ (·s) denotes the conditional expectation with respect to pn (·s) (see (8.27)).
634
8 Game Theory
DEFINITION 8.5.2 A policy π ∗ is optimal if L(π)(s) ≤ L(π ∗ )(s) for all policies π and all states s ∈ S (see (8.28)). The next theorem, known as Bellman’s optimality principle, gives a necessary and sufficient condition for optimality of a policy π ∗ . We assume that hypotheses H1 , H2 , and H3 are in effect. THEOREM 8.5.3 A policy π ∗ is optimal if and only if its reward L(π ∗ ) satisfies the optimality equation
u(s )dq(s s, x) : x ∈ F (s) , s ∈ S. (8.29) u(s) = sup r(s, x) + δ S
PROOF: Necessity: Suppose that π ∗ is optimal. Fix a state s ∈ S. For any x ∈ F (s), we define G(x) = {s ∈ S : (s, x) ∈ Gr F }. (8.30) Evidently Gr G ∈ B(X)×B(S) (see hypothesis H1 ). Let f be a Borel-measurable selector of F . It exists by virtue of hypothesis H1 . For x ∈ F (s), we consider the map fx : S −→ X defined by x if s ∈ G(x) . fx (s) = f (s) otherwise Clearly f is Borel-measurable and a selector of F (see (8.30)). If, by (fx , π ∗ ) we denote the policy {fx , π1∗ , π2∗ , . . .}, then by virtue of the optimality of π ∗ , we have
∗ L(fx , π )(s) = r(s, x) + δ L(π ∗ )(s )dq(s s, x) ≤ L(π ∗ )(s). (8.31) S
Because x ∈ F (s) was arbitrary, from (8.31) it follows that
sup r(s, x) + δ L(π ∗ )(s )dq(s s, x) : x ∈ F (s) ≤ L(π ∗ )(s),
s ∈ S.
(8.32)
S
Given any policy π = {πn }n≥1 , we define πn (·|hn ) = πn+1 (·|s, x, hn ), n ≥ 1 and πs,x = {πn }n≥1 . We set (8.33) v(s, x, s ) = L(πs,x )(s ) and we have L(π)(s) =
v(s, x, s )dq(s s, x) dπ1 (x|s),
r(s, x) + δ
F (s)
s ∈ S.
(8.34)
S
From (8.34) it follows that L(π ∗ )(s) ≤ sup r(s, x) + δ
L(π ∗ )(s )dq(s s, x) : x ∈ F (s) .
(8.35)
S
Comparing (8.33) and (8.35), we conclude that
L(π ∗ )(s) = sup r(s, x) + δ L(π ∗ )(s )dq(s s, x) : x ∈ F (s) , S
s ∈ S.
8.5 Stochastic Games
635
Sufficiency: Suppose that L(π ∗ ) satisfies the functional equation (8.29). Let h1n = (s, x1 , . . . , sn , xn ) and u(s) = L(π ∗ )(s), s ∈ S. Then, for any policy π and any state s ∈ S, we have Eπ
δ n u(sn+1 ) − Eπ δ n u(sn+1 )|hn s = 0.
N
(8.36)
n=1
Here Eπ (·|hn ) denotes the conditional expectation given the history hn . Then for pπ (·|s)-almost all futures, we have
Eπ δ n u(sn+1 )|hn
= δn u(s )dq(s |sn , xn ) S
= δ n−1 r(sn , xn ) + δ u(s )dq(s |sn , xn ) − δ n−1 r(sn , xn ) S
≤ δ n−1 u(sn ) − δ n−1 r(sn , xn ).
(8.37)
Using (8.37) in (8.36), we obtain Eπ
δ n u(sn+1 ) − δ n−1 u(sn ) + δ n−1 r(sn , xn ) s
N
n=1
= Eπ δ N u(sN +1 ) − u(s) +
N
δ n−1 r(sn , xn )s ≤ 0.
(8.38)
n=1
We pass to the limit as N −→ ∞ in (8.38). Using the dominated convergence theorem, we have n−1 Eπ δ r(sn , xn )s = L(π)(s) ≤ u(s) = L(π ∗ )(s), s ∈ S, n≥1
⇒ π
∗
is an optimal policy.
Bellman’s optimality principle leads to the existence of optimal stationary policies. H1 : F : S −→ Pk (X) is a measurable multifunction. REMARK 8.5.4 By Theorem 6.3.17 (the Kuratowski–Ryll Nardzewski selection theorem), if F satisfies hypothesis H1 , it admits a Borel-measurable selector. H2 : For every s ∈ S and every bounded Borel measurable function v : S −→ R, the function
x −→ v(s )dq(s |s, x) S
is continuous. H3 : The reward function r : Gr F −→ R is bounded (i.e., |r(s, x)| ≤ M for some M > 0 and all (s, x) ∈ S×X), Borel-measurable and for every s ∈ S, x −→ r(s, x) is continuous.
636
8 Game Theory
First let us make a straightforward but useful observation concerning the solution of the optimality functional equation. In what follows by B0 (S) we denote the space of all bounded Borel-measurable functions on S. This is a Banach space for the supremum norm u∞ = sup |u(s)|. s∈S
PROPOSITION 8.5.5 If hypotheses H1 , H2 , and H3 hold, then the optimality functional equation (8.29) has a unique solution u∗ ∈ B0 (S). PROOF: Let u ∈ B0 (S). We consider the optimization problem
u(s )dq(s |s, x) : x ∈ F (s) = ξ(s). sup r(s, x) + δ
(8.39)
S
By virtue of hypothesis H1 we can find a sequence fn : S −→ X, n ≥ 1, of Borel-measurable functions such that F (s) = {fn (s)}n≥1
for all s ∈ S.
Because x −→ S u(s )dq(s |s, x) is continuous (hypothesis H2 ) and x −→ r(s, x) is continuous (see hypothesis H3 ), from (8.39) we have
u(s )dq s |s, fn (s) , ξ(s) = sup r s, fn (s) + δ n≥1
S
⇒ ξ is Borel-measurable. In addition it is clear that ξ is bounded (see hypothesis H3 ). So we can define the operator P : B0 (S) −→ B0 (S) by
u(s )dq(s |s, x) : x ∈ F (s) , s ∈ S. P (u)(s) = sup r(s, x) + δ S
For every u, v ∈ B0 (S) and every s ∈ S, we have |P (u)(s) − P (v)(s)|
u(s ) − v(s ) dq(s |s, x) ≤ sup δ S
≤ δu − v∞ . Therefore P is a contraction map and by the Banach fixed point theorem (see Theorem 3.4.3), P has a unique fixed point, which is the unique solution of the Bellman’s optimality equation (8.29). REMARK 8.5.6 In the above proof, the operator P is known as the dynamic programming operator . Now we can use Theorem 8.5.3 to establish the existence of an optimal stationary policy. THEOREM 8.5.7 If hypotheses H1 , H2 , and H3 hold, then there exists an optimal stationary policy.
8.5 Stochastic Games
637
PROOF: Let u∗ ∈ B(S) be the unique fixed point of the dynamic programming operator. We have
u∗ (s )dq(s |s, x) : x ∈ F (s) = ξ(s). u∗ (s) = sup r(s, x) + δ S
Let Γ(s) = x∗ ∈ F (s) : ξ(s) = r(s, x∗ ) + δ S u(s )dq(s |s, x ∗ ) . Because F (s) ∈ Pk (X) (see hypothesis H1 ) and the function x −→ r(s, x) + δ S u∗ (s )dq(s |s, x) is continuous (see hypotheses H1 and H3 ), from the Weierstrass theorem, we have that Γ(s) = ∅ for all s ∈ S and in fact Γ(s) ∈ Pk (X). Moreover, recalling that a Carath´eodory function is jointly measurable (see Theorem 6.2.6) and because s −→ ξ(s) is Borel-measurable (see the proof of Proposition 8.5.5), we conclude that GrF ∈ B(S) × B(X). Because Γ is compact valued, it follows that Γ is measurable. Therefore by the Kuratowski–Ryll–Nardzewski selection theorem (see Theorem 6.3.17), we can find f ∗ ∈ SF such that f ∗ (s) ∈ Γ(s) for all s ∈ S,
∗ ∗ u∗ (s )dq s |s, f ∗ (s) u (s) = r s, f (s) + δ
S = sup r(s, x) + δ u∗ (s )dq(s |s, x) : x ∈ F (s) S
and so by Theorem 8.5.3 we infer that f ∗ ∈ SF is optimal and L(f ∗ ) = u∗ .
∗ REMARK 8.5.8 In fact we can strengthen the say
above theorem
that ∗f ∈ ∗and ∗ ∗ SF is an optimal policy if and only if u (s) = r s, f (s) + δ S u (s )dq s |s, f (s) for all s ∈ S. Here again u∗ ∈ B(S) is the unique fixed point of the dynamic programming operator P (see Remark 8.5.6). For details we refer to Hernadez– Lerma [290].
Now we pass to discounted stochastic games. We consider a two-player stochastic game. The items determining such a stochastic game are the following. • S is a nonempty Borel subset of a Polish space (a Borel space) and corresponds to the set of all possible states of the system. •
X and Y are compact metric spaces and correspond to the set of actions available to player I and player II, respectively.
•
F1 : S −→ Pf (X) and F2 : S −→ Pf (Y ) are two measurable multifunctions that restrict the actions of the two players; so if the state of the system is s ∈ S, then player I can choose an action from the set F1 (s) and the player II from the set F2 (s).
•
q is the law of motion (transition law) of the system and with every triple (s, x, y) ∈ S ×X ×Y associates probability measure q(·|s, x, y) on the Borel subsets B(S) of S; so if the system is in state s ∈ S and the two players have chosen actions x ∈ X and y ∈ Y , respectively, the system moves to a new state according to the probability distribution q(·|s, x, y).
638
8 Game Theory
•
r : S ×X ×Y −→ R is a bounded reward function.
•
δ ∈ (0, 1) is the discount factor; so the unit income today is worth δ n at the nth period in the future.
The game is played as follows. Periodically (say, once a day), players I and II observe the current state s ∈ S of the system and choose actions x ∈ F1 (s) and y ∈ F2 (s), respectively. This choice is made with full knowledge of the history of the system as it has evolved up to the present time. Then as a consequence of the actions chosen by the two players, player II pays player I an amount equal to r(s, x, y). Subsequently the system moves to a new state s according to the distribution q(·|s, x, y). Then the whole process is repeated from the new state s ∈ S. The goal of player I is to maximize the expected discounted reward (income) as the game proceeds over the infinite future and the goal of player II is to minimize the expected discounted loss. For each n ≥ 0, we define the space Hn of all admissible histories of the systems up to time n (the n-histories), by H0 = S
and
Hn = Gr(F1 ×F2 )n ×S = Gr(F1 ×F2 )× Hn−1
for
n ≥ 1.
The generic element hn ∈ Hn , is a vector hn = (s0 , x0 , y0 , . . . , sn−1 , xn−1 , yn−1 , sn ), where (sk , xk , yk ) ∈ Gr(F1 × F2 ) for all k = 0, 1, . . . , n − 1 and sn ∈ S. DEFINITION 8.5.9 (a) A randomized, admissible policy π for player I, is a sequence π = {πn }n≥0 of stochastic kernels (transition probabilities) πn on the Borel σ-field B(X) given Hn and satisfying the constraint
πn F1 (s)|hn = 1 for all hn ∈ Hn and all n ≥ 0. Similarly a (randomized admissible) policy β for player II is a sequence β = {βn }n≥0 of stochastic kernels (transition probabilities) βn on the Borel σ-field B(Y ) given Hn and satisfying the constraint
βn F2 (s)|hn = 1 for all hn ∈ Hn and all n ≥ 0. We denote the set of all policies of player I (resp., of player II) by ∆1 (resp. ∆2 ). (b) A stationary policy for player I (resp., for player II) is a Borel-measurable map 1 1 f : S −→ M+ (X) (resp., a Borel-measurable map g : S −→ M+ (Y )). REMARK 8.5.10 A stationary policy is a particular case of a randomized admissible policy. Indeed, if pn :Hn −→ S is the projection map defined by pn (hn ) = sn (hn = (s0 , x0 , y0 , . . . , sn−1 , xn−1 , yn1 , sn )), then π = {πn = f ◦ pn }n≥0 ∈ ∆1 . So πn is the Dirac measure concentrated at f (sn ). Similarly for player II. A stationary policy is a special case of a Markov policy. Clearly the stationary policies of player I (resp., of player II), can be identified with SF1 (resp., with SF1 ). A pair (π, β) of policies for players I and II associates with every initial state th
s ∈ S, an n =-day expected reward (gain) rn (π, β)(s) for player I (see the dynamic
8.5 Stochastic Games
639
programming model). The total expected discounted reward for player I, is given by n L(π, β)(s) = δ rn (π, β)(s). (8.40) n≥0
The rewards rn (π, β)(·) are Borel-measurable and so from (8.40) it follows that L(π, β)(·) is Borel-measurable. DEFINITION 8.5.11 A policy π ∗ is optimal for player I if inf
sup L(π, β)(s) ≤ L(π ∗ , β)(s)
β∈∆2 π∈∆1
for all β ∈ ∆2 , s ∈ S.
A policy β ∗ is optimal for player II if sup inf L(π, β)(s) ≥ L(π, β ∗ )(s) for all π∈∆1 β∈∆2
π ∈ ∆1 , s ∈ S. The stochastic game is said to have a value if inf
sup L(π, β)(s) = sup inf L(π, β)(s)
β∈∆2 π∈∆1
π∈∆1 β∈∆2
for every s ∈ S.
In this case the function v(s) = inf
sup L(π, β)(s) = sup inf L(π, β)(s),
β∈∆2 π∈∆1
π∈∆1 β∈∆2
s ∈ S,
is called the value function of the stochastic game.
To be able to establish the existence of optimal policies for the two players, we need some mathematical hypotheses on the data {F1 , F2 , q, r} of the stochastic game. H4 : F1 : S −→ Pf (X) and F2 : S −→ Pf (Y ) are measurable multifunctions. REMARK 8.5.12 Because X and Y are compact metric spaces, the measurability
of F1 , F2 is equivalent
to Borel-measurability into the separable metric spaces Pk (X), h and Pk (Y ), h , respectively. Let Cb (S) denote the space of bounded continuous functions on S. This is a Banach space for the supremum norm u∞ = sup |u(s)|, u ∈ Cb (S). Also by s∈S 1 (S) we denote the space of probability measures on S. M+ 1 H5 : q : S ×X ×Y −→ M+ (S) is a map such that
(i) For every (x, y) ∈ X × Y and every B ∈ B(S), s −→ q(B|s, x, y) is Borelmeasurable. (ii) For every s ∈ S, (x, y) −→ q(·|s, x, y) is continuous in the sense that if (xn , yn ) −→ (x, y) in X × Y , then for every u ∈ B0 (S) we have
u(s )dq(s |s, xn , yn ) −→ u(s )dq(s |s, x, y) as n → ∞. S
S
640
8 Game Theory
REMARK 8.5.13 By approximating a function in Cb (S) uniformly by simple functions, we can check that hypothesis H5 (i) is in fact equivalent to saying that 1 for every (x, y) ∈ X × Y, s −→ q(·|s, from S into M+ (S)
x,1 y) is Borel-measurable 1 furnished with the weak topology w M+ (S), Cb (S) . Recall that M+ (S) topologized this way is a Borel space (hence a separable metrizable space). H6 : r : S ×X ×Y −→ R is a function such that (i) For every (x, y) ∈ X × Y, s −→ r(s, x, y) is measurable. (ii) For every s ∈ S, (x, y) −→ r(s, x, y) is continuous. REMARK 8.5.14 The reward function r is a Carath´eodory function, thus Borelmeasurable (see Theorem 6.2.6). 1 1 For u ∈ B(S) and (s, λ, µ) ∈ S × M+ (X) × M+ (Y ), we define
u(s )dq(s |s, λ, µ), Ku (s, λ, µ) = r(s, λ, µ) + δ
(8.41)
S
where
r(s, λ, µ) =
r(s, x, y)dλ(x)dµ(y)
Y X
and
q(B|s, λ, µ) =
q(B|s, x, y)dλ(x)dµ(y) Y
X
for all B ∈ B(S). Because of hypotheses H5 and H6 , the function (s, λ, µ) −→ 1 1 Ku (s, λ, µ) is a Carath´eodory function on S × M+ (X) × M+ (Y ) into R; that is, 1 1 for every (λ, µ) ∈ M+ (X) × M+ (Y ), s −→ Ku (s, λ, µ) is Borel-measurable and 1 1 for every s ∈ S, (λ, µ) −→ Ku (s, λ, µ) is continuous on M+ (X)
1× M+ (Y ), when 1 1 M + (X) and M+(Y ) are endowed with the weak topologies w M+ (X), C(X) and 1 1 1 w M+ (Y ), C(Y ) , respectively. In what follows M+ (X) and M+ (Y ) are topologized this way.
1
1 Let P1 : S −→ Pk M+ (X) and P2 : S −→ Pk M+ (Y ) be defined by
1 P1 (s) = λ ∈ M+ (X) : λ F1 (s) = 1
1 and P2 (s) = µ ∈ M+ (Y ) : µ F2 (s) = 1 . Note that the function Ku (s, ·, ·) defines a two-person, zero-sum game η(s, u) and P1 (s) and P2 (s) are the spaces of pure strategies in η(s, u) for players I and II, respectively, and Ku (s, ·, ·) is the reward (payoff, utility) function of the game. The next theorem, known as the portmanteau theorem, helps us establish the existence of a pair of optimal stationary strategies for the two players. For a proof of the portmanteau theorem, we refer to Denkowski–Mig` orski–Papageorgiou [194, p. 195] or Parthasarathy [487, p. 40]. Recall that if Z is a separable metric space,
1 1 then M+ (Z) endowed with the weak topology w M+ (Z), Cb (Z) is separable, and metrizable. 1 THEOREM 8.5.15 If Z is a separable metric space and {µn }n≥1 ⊆ M+ (Z), then the following statements are equivalent.
8.5 Stochastic Games w
1 (a) µn −→ µ in M+ (Z) 1 w M+ (Z), Cb (Z) .
(b)
641
w −→ denotes convergence in the weak topology
udµn −→ Z udµ for all u ∈ Ub (Z) = space of bounded, R-valued uniformly continuous functions.
Z
(c) lim sup µn (C) ≤ µ(C) for every closed set C ⊆ Z. n→∞
(d) lim inf µn (U ) ≥ µ(U ) for every open set U ⊆ Z. n→∞
(e) µn (A) −→ µ(A) for every Borel set A ⊆ Z such that µ(∂A) = 0. PROPOSITION 8.5.16 If hypothesis H4 holds, then the multifunctions P1 : 1 1 S −→ 2M+ (X) \ {∅} and P2 : S −→ 2M+ (Y ) \ {∅} have compact values and are Borel-measurable. PROOF: We do the proof for P1 . The proof for P2 is similar. 1 Because X is a compact metric space, then so is M+ (X) furnished with the weak topology. Hence using Theorem 8.5.15 (in particular the equivalence of (a) and (c)), we deduce at once that P 1 has compact values. Also the multifunction F1 is Borel measurable from S into Pk (X), h and the latter is a Polish space. Moreover, for 1 every C ∈ Pk (X), the map λ −→ λ(C) is continuous on M+ (X) (see Theorem 1 8.5.15). Therefore if we consider the function ξ1 : S × M+ (X) −→ R defined by
ξ1 (s, λ) = λ F1 (s) we see that ξ1 is a Carath´eodory function, hence jointly Borel-measurable. But
1 1 GrP1 = (s, λ) ∈ S × M+ (X) : ξ1 (s, λ) = 1 ∈ B(S) × B M+ (X) . Now we can prove the main existence theorem for the stochastic game. THEOREM 8.5.17 If hypotheses H4 , H5 , H6 hold, then the discounted stochastic game has a value, the value function is Borel-measurable, and the two players have optimal stationary policies. PROOF: Recall that for every u ∈ B0 (S), the function (s, λ, µ) −→ Ku (s, λ, µ)
(see (8.41))
1 1 is a Carath´eodory function from S × M+ (X) × M+ (Y ) into R. Hence it is jointly Borel-measurable. Also because of Proposition 8.5.16, we can find λn : S −→ 1 M+ (X), n ≥ 1, Borel-measuarble maps such that
P1 (s) = {λn (s)}n≥1 1 (Y ), we have So for fixed µ ∈ M+
for all s ∈ S.
642
8 Game Theory ϑu1 (s, µ) = max Ku (s, λ, µ) : λ ∈ P1 (s)
= sup Ku s, λn (s), µ , n≥1
⇒ s −→ ϑu1 (s, µ) is Borel-measurable. Moreover, Theorem 6.1.18(c) implies that µ −→ ϑu1 (s, µ)
1 is continuous on M+ (Y ).
1 Because of Proposition 8.5.16, we can find µn : S −→ M+ (Y ), n ≥ 1, Borel-measurable maps such that P2 (s) = {µn (s)}n≥1 for all s ∈ S. Therefore
1 v1 (s) = min ϑu1 (s, µ) : µ ∈ M+ (Y )
= inf ϑu1 s, µn (s) , ⇒ s −→
n≥1 u v1 (s) =
min
max Ku (s, λ, µ) is Borel-measurable.
s −→ v2u (s) = max
min Ku (s, λ, µ) is Borel-measurable.
µ∈P2 (s) λ∈P1 (s)
Similarly we show that λ∈P1 (s) µ∈P2 (s)
Moreover, Theorem 2.3.13, implies that v1u (s) = v2u (s) = v u (s)
for all s ∈ S.
(8.42)
Consider the operator V : B0 (S) −→ B0 (S) defined by V (u)(s) = v u (s). We have that V is a contraction map (see the proof of Proposition 8.5.5). So by the Banach fixed point theorem (see Theorem 3.4.3), there exists unique u∗ ∈ B0 (S) such that V (u∗ ) = u∗ . Then as in the proof of Theorem 8.5.7, we can find Borel maps 1 f ∗ : S −→ M+ (X)
such that
f ∗ (s) ∈ P1 (s)
and
and
1 g ∗ : S −→ M+ (Y ),
g ∗ (s) ∈ P2 (s)
u∗ (s) = min r s, f ∗ (s), µ + δ
for all s ∈ S.
u∗ (s )dq s |s, f ∗ (s), µ : µ ∈ P2 (s)
S
(8.43)
u∗ (s) = max r s, λ, g ∗ (s) + δ
u∗ (s )dq s |s, λ, g ∗ (s) : λ ∈ P1 (s)
S
u∗ (s) = r s, f ∗ (s), g ∗ (s) + δ
(8.44)
u∗ (s )dq s |s, f ∗ (s), g ∗ (s) .
S
(8.45)
8.5 Stochastic Games
643
The system of equations (8.43) through (8.45) may be interpreted as follows. For the game, u∗ (s) is the value of the game and f ∗ (s), g ∗ (s) are, respectively, the optimal strategies for players I and II. We prove that u∗ (·) is the value function of the stochastic game and f ∗ , g ∗ are optimal stationary policies for players I and II, respectively. To this end for every (f , g ) ∈ SP1 × SP2 (recall SP1 (resp., SP2 ) is the set of Borel-measurable selectors of P1 (resp., of P2 )), we set
W (f , g )(u)(s) = r s, f (s), g (s) + δ u(s )dq s |s, f (s), g (s) , s ∈ S. S
We interpret W (f , g )(u)(s) as the expected amount player II pays player I. When the initial state of the system is s ∈ S, player I uses strategy f (s), player II uses strategy g (s), and the game is terminated at the beginning of the second day with player II paying player I the amount u(s ) with s ∈ S being the state of the system on the second day. It is easy to check that W (f , g ) : B0 (S) −→ B0 (S) is a contraction and so it admits a unique fixed point, which is L(f ∗ , g ∗ ). Therefore we have
L(f ∗ , g ∗ )(s) = max r s, λ, g ∗ (s) + δ L(f ∗ , g ∗ )(s )dq s |s, λ, g ∗ (s) : S λ ∈ P1 (s)
= min r s, f ∗ (s), µ + δ L(f ∗ , g ∗ )(s )dq s |s, f ∗ (s), µ : S µ ∈ P2 (s) . Note that if we fix the stationary policy g ∗ of player II in the stochastic game (i.e., player II is allowed to use only this policy), then the stochastic game reduces to a dynamic programming model such as the one considered in the first part of this 1 section. More precisely, the state space is S, the action space M+ (X), the constraint ∗ multifunction is P1 , the law of motion q is given by q (s) , and the reward ·|s, λ, g
function r defined by r (s, λ) = r s, λ, g ∗ (s) . Hypotheses H4 , H5 , H6 imply that these quantities are well-defined and we can use Theorem 8.5.7 to conclude that f ∗ is an optimal stationary policy for the dynamic programming problem. From this it follows easily that L(f ∗ , g ∗ )(s) = sup L(π, g ∗ )(s) : π ∈ ∆1 , s ∈ S. (8.46) In a similar fashion, we show that L(f ∗ , g ∗ )(s) = inf L(f ∗ , β)(s) : β ∈ ∆2 ,
s ∈ S.
(8.47)
Therefore from (8.46) and (8.47) we deduce that L(f ∗ , g ∗ )(s) = inf sup L(π, β)(s) = sup inf L(π, β)(s), ∆2 ∆1
∆1 ∆2
s ∈ S.
(8.48)
From (8.48) it follows that the discounted stochastic game has a value, the value function s −→ L(f ∗ , g ∗ )(s) = u∗ (s) is bounded Borel-measurable (i.e., it belongs in B0 (S)), and f ∗ , g ∗ are optimal stationary policies for players I and II, respectively.
644
8 Game Theory
8.6 Approximate Equilibria In this section, we return to the first topic of this chapter, namely noncooperative games. The individual stability in such games was described by the fundamental notion of Nash equilibrium (see Definition 8.1.5). In Theorem 8.1.9 we established the existence of such an equilibrium for cooperative games. However, in order to do that, we had to assume that the strategy set of each player is compact. We want to weaken this compactness condition. Excluding the compactness condition on the strategy set of each player, leads to the notion of ε-equilibrium (approximate equilibrium), because an equilibrium need not exist anymore. We consider a noncooperative n-player game in normal form as in Section 8.1. So let N = {1, . . . , n} be the set of players. Each player has a strategy set Xk which is assumed to be a subset of a Banach space Zk , k ∈ N . In contrast to Section 8.1, in order to make use of the approximate convex subdifferential, we assume that n each player k ∈ N , has a loss function uk : X −→ R. Then u = (uk )n k=1 : X −→ R is the multiloss map. We emphasize that this is done only for reasons of convenience and in what follows if ε = 0 and uk is replaced by −uk , we recover the setting of Section " 8.1. As before, we consider the following splitting of the multistrategy set X= Xk , k∈N = Xi . X = X × Xk with X = i=k
As before, we think of k c = N \ {k} as the coalition adverse (complementary) to player k. By pk : X −→ X and pk : X −→ Xk we denote the projection maps of X onto X and Xk , respectively. Every multistrategy x ∈ X can be written as x = (x, xk ) with x = pk (x) ∈ X and xk = pk (x) ∈ Xk . For each k ∈ N , we set (8.49) βk = inf uk (x) : x ∈ X and throughout this section we assume that βk > −∞ for all k ∈ N . Then the vector β = (βk )n k=1 is called the shadow minimum of the game. Note that u(X) ⊆ β + Rn +. If β ∈ u(X), then β = u(x∗ ) with x∗ ∈ X and so x∗ realizes the infimum in (8.49) for every player k ∈ N . However, this situation is rare and for this reason, we need to introduce approximate equilibria. Recall that in Section 8.1, we introduced the following quantity vk (xk ) = inf uk (x) : x ∈ X, pk (x) = xk . (8.50) This quantity represents the least utility for player k ∈ N among all feasible multistrategies, when she employs strategy xk . DEFINITION 8.6.1 A multistrategy x = (xk )n k=1 ∈ X is said to be an εequilibrium if for some ε ≥ 0 and for all k ∈ N , we have uk (x) ≤ vk (xk ) + ε.
8.6 Approximate Equilibria
645
REMARK 8.6.2 If ε = 0, then we recover the notion of Nash equilibrium (see Definition 8.1.5 with uk there replaced by −uk ). In this case, given the strategy x ∈ X of the adverse coalition kc , the player k ∈ N responds by choosing the strategy xk ∈ Xk that minimizes vk on Xk ; that is, uk (x, xk ) = inf vk (xk ) : x ∈ X, xk ∈ Xk . (8.51) Moreover, if N = {1, 2} and u1 (x) + u2 (x) = 0 (zero-sum game), then the εequilibrium is an ε-saddle point (a saddle point if ε = 0). As in Section 8.1, we introduce the function ξ : X × X −→ R defined by ξ(x, z) =
n
ui (x) − ui (x, zi ) .
i=1
PROPOSITION 8.6.3 If for some ε ≥ 0 and x ∈ X we have sup ξ(x, z) : z ∈ X ≤ ε,
(8.52)
then x is an ε-equilibrium multistrategy. PROOF: Let z = (x, zk ). Then we have ξ(x, z) = uk (x) − uk (z) ≤ ε (see (8.52), ⇒ uk (x) ≤ inf uk (z) : z ∈ X, pk (z) = xk + ε,
for all k ∈ N,
⇒ x ∈ X is an ε-equilibrium. PROPOSITION 8.6.4 If x ∈ X is an ε-equilibrium, then for every multistrategy z ∈ X, we have ξ(x, z) ≤ nε. PROOF: From Definition 8.3.1, we have uk (x) − uk (x, zk ) ≤ ε
for all k ∈ N and all zk ∈ Xk
(see (8.51).
Adding these inequalities over k ∈ N , we obtain ξ(x, z) ≤ nε
for all z ∈ X.
We introduce the conjugate function of zk −→ uk (x, zk ) (see Definition 1.2.15). So
= u∗k (x)(zk∗ ) = sup zk∗ , zk Zk −uk (x, zk ) : z ∈ Z = Zk , pk (z) = x , zk∗ ∈ Zk∗ . k∈N
(8.53) Here by Zk∗ we denote the topological dual of the Banach space Zk and by ·, ·Zk the duality brackets for the pair (Zk∗ , Zk ). Recall that u∗k (x)(·) is convex and lower semicontinuous. In addition to the subdifferential of a convex function introduced in Definition 1.2.28, we can define the approximate subdifferential (ε-subdifferential), which is also a useful tool in convex analysis.
646
8 Game Theory
DEFINITION 8.6.5 Let Y be a Banach space and ϕ : Y −→ R = R ∪ {+∞} be a proper convex function. For each ε ≥ 0 the ε-subdifferential of ϕ at y ∈ dom ϕ, is defined to be the set ∂ε ϕ(y) = y ∗ ∈ Y ∗ : y ∗ , v − y − ε ≤ ϕ(v) − ϕ(y) for all v ∈ domϕ . Here ·, · denotes the duality brackets for the dual pair (Y ∗ , Y ). REMARK 8.6.6 If ε = 0, then the above definition coincides with Definition 1.2.28 (i.e., ∂0 ϕ = ∂ϕ). However there is a basic difference between the subdifferential (ε = 0) and the ε-subdiffrential (ε > 0). Although ∂ϕ = ∂0 ϕ is a local notion, for ε > 0 ∂ε ϕ is a global one, namely the behavior of ϕ on all of X may be relevant for the construction of ∂ε ϕ. This explains why ∂ϕ and ∂ε ϕ have in general different properties, with the most basic difference being that if ϕ ∈ Γ0 (Y ) and y ∈ dom ϕ, for every ε > 0 the set ∂ε ϕ(y) is nonempty, w∗ -closed, and convex. For the function u∗k (x) ∈ Γ0 (Zk∗ ), the ε-subdifferential ∂ε u∗k (x)(zk∗ ) is defined by ∂ε u∗k (x)(zk∗ ) = zk∗∗ ∈ Zk∗∗ : zk∗∗ , v ∗ − zk∗ − ε ≤ u∗k (x)(vk∗ ) − u∗k (x)(zk∗ ) for all vk∗ ∈ Zk∗ . Using the notion of ε-subdifferential, we can produce necessary and, under some additional hypotheses, also sufficient conditions for a multistrategy to be an εequilibrium. THEOREM 8.6.7 If x = (xk )n k=1 is an ε-equilibrium when Xk = Zk for all k ∈ N (i.e., X = Z), then xk ∈ ∂ε u∗k (x)(0). PROOF: From (8.53), we see that u∗k (x)(0) = − inf uk (x, zk ) : z ∈ Z, pk (z) = x ≤ −uk (x) + ε
(because x is an ε-equilibrium)
≤ −vk∗ , xk Zk + u∗k (x)(vk∗ ) + ε ⇒ xk ∈ ∂ε u∗k (x)(0)
(see Definition 8.6.5).
THEOREM 8.6.8 If for every k ∈ N and for some multistrategy x ∈ X, uk (x, ·) is convex and lower semicontinuous on Zk and xk ∈ ∂ε u∗k (x)(0), then x ∈ X is an ε-equilibrium strategy. PROOF: Because by hypothesis xk ∈ ∂u∗k (x)(0), for all k ∈ N and all vk∗ ∈ Zk∗ , we have vk∗ , xk Zk − ε ≤ u∗k (x)(vk∗ ) − u∗k (x)(0), ⇒ vk∗ , xk Zk − u∗k (x)(vk∗ ) ≤ ε − u∗k (x)(0) ⇒ u∗∗ k (x)(xk ) ≤ ε + inf uk (x, zk ) : z ∈ Z, pk (z) = x ⇒ uk (x) ≤ ε + inf uk (x, zk ) : z ∈ Z, pk (z) = x
8.6 Approximate Equilibria
647
(because uk (x, ·) ∈ Γ0 (Zk )). This implies that x ∈ X is an ε-equilibrium.
ateaux differenNow we assume that the loss function uk of each player k, is Gˆ tiable and we examine how the Gˆ ateaux derivative and the ε-equilibria multistrategies are related. THEOREM 8.6.9 If the strategy set Xk ⊆ Zk of each player k ∈ N is closed, n " int Xk = ∅ is an ε-equilibrium with nonempty interior, x = (xk )n k=1 ∈ int X = k=1
with ε > 0 and for every k ∈ N, uk (x, ·) is lower semicontinuous and Gˆ ateaux differentiable on Zk , then there exists a multistrategy y = (yk )n k=1 ∈ X such that √ y − x ≤ n ε ∂uk √ (x, yk )Z ∗ ≤ ε for every k ∈ N. and ∂zk k PROOF: Recall that in the beginning of the section, we have assumed that the game is bounded below. So we can apply Theorem 2.4.1 (the Ekeland variational principle) and obtain yk ∈ Xk such that √ (8.54) xk − yk ≤ ε √ for all zk ∈ Xk . (8.55) and uk (x, yk ) ≤ uk (x, zk ) + εzk − yk Let hk ∈ Zk and set zk = yk + thk for t > 0 small so that zk ∈ Xk (recall that xk ∈ int Xk ). Then √ 1 − εhk ≤ uk (x, yk + thk ) − uk (x, yk ) . t We let t −→ 0 to obtain √ − εhk ≤
∂uk (x, yk ), hk ∂zk
.
(8.56)
Zk
In (8.56), we take infimum of both sides with respect to all hk ∈ Zk with hk Zk = 1. It follows that ∂uk √ (x)Z ≤ ε, k ∈ N. k ∂zk Also, if we add inequalities (8.56), we conclude that √ x − yX ≤ n ε. THEOREM 8.6.10 If each player k ∈ N has a reflexive strategy space Zk , Xk = Zk , x = (xk )n k=1 ∈ X = Z is an ε-equilibrium multistrategy with ε > 0 and uk (x, ·) ∈ n " ∗ ∗ n ∗ Γ0 (Zk ), then we can find yε = (yε,k )n Zk∗ k=1 ∈ X = Z and yε = (yε,k )k=1 ∈ Z = such that for every player k ∈ N , we have xk − yε,k Zk ≤ √ ∗ Zk∗ ≤ ε; yε,k
k=1
√
ε
(8.57) (8.58)
648
8 Game Theory
moreover, yε = (yε,k )n uk , Xk }k∈N , k=1 ∈ X = Z is a Nash equilibrium of the game {$ where ∗ u $k (x) = uk (x, xk ) − yε,k , xk Z
for every x = (xk )n k=1 ∈ X =
k
n =
Xk .
k=1
PROOF: From Theorem 8.6.7, we have xk ∈ ∂ε u∗k (x)(0) ϑk (x)(vk∗ )
Let (8.59) we have
=
u∗k (x)
−
vk∗ , xk Zk
for all k ∈ N.
(8.59)
, k ∈ N . Then ϑk (x) ∈
Γ0 (Zk∗ )
ϑk (x)(0) ≤ inf ϑk (x)(vk∗ ) : vk∗ ∈ Zk∗ + ε.
and from (8.60)
From (8.60) it follows that ϑk (x)(·) is bounded below on Zk∗ . So we can apply ∗ Theorem 2.4.1 (the Ekeland variational principle) and obtain yε,k ∈ Zk∗ such that for ∗ all vk∗ = yε,k ∗ ϑk (x)(yε,k )−
√
∗ εvk∗ − yε,k Zk∗ < ϑk (x)(vk∗ ) √ ∗ ∗ ϑk (x)(yε,k ) ≤ ϑk (x)(0) − εyε,k Zk∗ .
and
(8.61) (8.62)
∗ From (8.61) it follows that yε,k minimizes the function √ ∗ f (x)(·) = ϑk (x)(·) + ε · −yε,k Zk∗
defined on Zk∗ . So we have
∗ √ ∗ 0 ∈ ∂ ϑk (x)(·) + ε · −yε,k Zk∗ (yε,k ).
Invoking Theorem 1.2.38, we have ∗ ∗ 0 ∈ ∂ϑk (x)(yε,k ) + ∂j(yε,k ),
where j(vk∗ ) =
√
(8.63)
∗ εvk∗ − yε,k Zk∗ . From Example 1.2.41(c), we know that ∗ )= ∂j(yε,k
√
∗∗
εB 1,k
∗∗ B 1,k being the closed unit ball in Zk∗∗ . The space Zk is Zk∗∗ = Zk and so from (8.63), (8.64), and the definition
with have that there exists yε,k ∈ Zk such that
and
(8.64) reflexive, therefore we of ϑk (x)(·), we deduce
∗ yε,k ∈ ∂u∗k (x)(yε,k ) √ yε,k = xk − εvk Zk , vk Zk ≤ 1.
From (8.66), we obtain xk − yε,k ≤
√
ε
which proves inequality (8.57) of the theorem. From (8.60) and (8.62), we have √ ∗ yε,k Zk∗ ≤ ε, which proves inequality (8.58) of the theorem. Next from (8.65), we have
(8.65) (8.66)
8.7 Remarks
∗ , yε,k vk∗ − yε,k
649
∗ ≤ u∗k (x)(vk∗ ) − u∗k (x)(yε,k ) for all vk∗ ∈ Zk∗ , ∗ ∗ ⇒ vk∗ , yε,k Zk − u∗k (x)(vk∗ ) − yε,k , yε,k Z ≤ −u∗k (x)(yε,k ), k ∗ ∗ ∗∗ ⇒ uk (x)(yε,k ) − yε,k , yε,k Z ≤ − sup yε,k , zk Z − uk (x, zk ) : zk ∈ Zk k k ∗ = inf uk (x, zk ) − yε,k , zk Z : zk ∈ Zk k ∗ ∗ ⇒ uk (x, yε,k ) − yε,k , yε,k Z ≤ inf uk (x, zk ) − yε,k , zk Z : zk ∈ Zk Zk
k
k
(8.67) Because uk (x, ·) ∈ Γ0 (Zk ). From (8.67) it follows that u $k (yε ) = min u $k (z) : z ∈ X, pk (z) = yε , $k }k∈N . ⇒ yε = (yε,k )n k=1 ∈ Z is a Nash equilibrium for the game {Xk , u
8.7 Remarks 8.1: Noncooperative games and the associated equilibrium notion (see Definition 8.1.5) were introduced by Nash [451, 452]. Nash proved Theorem 8.1.9 for games where the preferences of the players are described by continuous quasiconcave utility functions and the strategy sets are simplexes. The work of Nash generalized the pioneering work of von Neumann [454], who proved the first minimax theorem (using Brouwer’s fixed point theory) and initiated game theory. The work of Nash achieved the shift of attention from the minimax concept (saddle point) of von Neumann to the new concept of Nash equilibrium (noncooperative equilibrium). It is only with this new notion that the theory of noncooperative games could move with full generality to the case of n-players with n > 2. Theorem 8.1.9 assumes that the utility functions uk (x, yk ) are continuous in both variables. A partial generalization in this direction, can be found in Nikaido–Isoda [460]. Abstract economies (or generalized games; see Definition 8.1.11), were first introduced by Debreu [184], who was influenced by the work of Nash. He introduced and studied a model in which the agents’ (players’) preference relations are described by a utility function (hence they are transitive). Later Mas-Colell [410] and Shafer–Sonnenschein [549] dropped the transitivity requirement and described preferences using a preference multifunction (see Theorem 8.1.18). For further discussion of noncooperative games and abstract economies, with a rich bibliography, we refer to the books of Aubin [36], Border [86], and Ichiishi [324]. 8.2: The first side-payment game was introduced by von Neumann–Morgenstern [455], but had a specific interpretation of the characteristic function v of the game. The notion of core of a side-payment game (see Definition 8.2.1) was introduced by Shapley [550]. The concept of a balanced side-payment game was introduced by Bondavera [84, 85] and for non-side-payment games (see Definition 8.2.9) by Scarf [537]. Theorem 8.2.8 is independently due to Bondavera [84, 85], and Shapley [551]. Side-payment games with infinitely many players have been studied. We mention the works of Schmeidler [541, 542], Kannai [340], and Delbaen [193]. The first to
650
8 Game Theory
formulate non-side-payment games, were Aumann–Peleg [41]. Aumann [42] defined the core for such games (see Definition 8.2.9). Theorem 8.2.13, was first proved by Scarf [537]. However, the proof presented here, which is based on the special version of the KKM-Theorem (see Proposition 8.2.12), is due to Shapley [552]. Non-sidepayment games with a continuum of players, can be found in Aumann–Shapley [47], Ichiishi [324], and in the paper of Ichiishi–Weber [323]. 8.3: The first to study the existence of Cournot–Nash equilibria for games with an atomless measure space of players (a continuum game model), was Schmeidler [543]. In Schmeidler’s model the strategy spaces are finite-dimensional. Khan [349] extended this to the infinite-dimensional case. His work and the subsequent extensions by Khan–Papageorgiou [351] and Balder–Yannelis [54] considered models in which each player has a preference multifunction instead of a utility function. So her preference relation need not be transitive or complete. Cournot–Nash equilibrium distributions for games that are viewed as a probability measure on the space of payoff (utility) function were established by Mas Collel [411], and Khan [352]. 8.4: For games with a finite number of players and actions and with private information (imperfect information in the terminology of Harsanyi [285]), equilibrium results were proved by Radner–Rosenthal [511], Milgram–Weber [429], and Mamer–Schilling [405]. Bayesian games with an infinite number of players, were first introduced by Palfrey–Srivastava [473], and Postlewaite–Schmeidler [501], who introduced the information partition approach used here. Bayesian games with an infinite-dimensional strategy space can be found in Balder–Yannelis [54], and Kim– Yannelis [353]. 8.5: The dynamic programming model has its origins in the pioneering work of Bellman [62]. A rigorous foundation of stochastic dynamic programming, was given by Blackwell [77, 78], and Strauch [561]. Advanced treatment of stochastic dynamic programming can be found in the books of Hinderer [305], Striebel [562], Bertsekas– Shreve [72], Dynkin–Yushkevich [216], Arkin–Evstingeev [29], and Maitra–Sudderth [399]. Theorem 8.5.3 is basic in the theory of stochastic dynamic programming and it is due to Blackwell [77]. Discounted stochastic games were investigated by many authors. Indicatively we mention the works of Maitra–Parthasarathy [397, 398], Parthasarathy [488, 489], Himmelberg–Parthasarathy–Raghavan–Van Vleck [302], Himmelberg–Parthasarathy–Van Vleck [303], Nowak [462, 463], and Whitt [604]. Adaptive control problems can be found in the book of Hermandez Lerma [290], and gambling games in the book of Maitra–Sudderth [399]. 8.6: Approximate equilibria for noncooperative games were studied by Tanaka– Yokoyama [575]. The ε-subdifferential of convex functions (see Definition 8.6.5) was studied in a systematic way by Hiriart–Urruty [306, 307].
9 Uncertainty, Information, Decision Making
Summary. *In this chapter, we study how information can be incorporated as a variable in various decision models. In a decision model with uncertainty, it is natural to model information by sub-σ-fields of a probability space which represents the set of all possible states of the “world”. For this reason we topologize the set of sub-σ-fields with two topologies, using tools from probability theory and functional analysis. Then we examine the “ex-post view” and the “ex-ante view”. In both cases, we prove continuous dependence of the model on the information variable. Then we introduce a third mode of convergence of the information variable and we study prediction sequences. Finally we study games with incomplete information or unbounded cost and general state space.
Introduction The goal of this chapter is to study how information can be incorporated as a variable in various decision models. Such a formulation allows us to treat situations with asymmetric information. In Section 9.1 we present a mathematical framework that permits the analytical treatment of the notion of information. We produce two such comparable metric topologies using tools from probability theory and functional analysis. In Section 9.2, we examine the ex-post view, in modelling systems with uncertainty. According to this view, every agent chooses an action after observing his or her information and updating his or her belief. Using the topologies of Section 9.1, we prove the continuity of this model on the information variable. In Section 9.3, we examine an alternative view of uncertain decision systems, known as the ex-ante view. In this alternative approach, an agent formulates a plan of what action to choose at each state before observing her information, subject to the constraint that the plan must be measurable with respect to her information. In Section 9.4, we introduce a third mode of convergence of information, distinct from the ones introduced and studied in Section 9.1. We show that this new mode of convergence is suitable in the analysis of prediction sequences. Section 9.5 formulates and studies a general two-person, zero-sum game with incomplete information. The hypotheses on the data of the model are minimal. N.S. Papageorgiou, S.Th. Kyritsi-Yiallourou, Handbook of Applied Analysis, Advances in Mechanics and Mathematics 19, DOI 10.1007/b120946_9, © Springer Science+Business Media, LLC 2009
652
9 Uncertainty, Information, Decision Making
Finally in Section 9.6, we treat games with a general state space and unbounded cost. We prove the existence of equilibria.
9.1 Mathematical Space of Information In many economic models with uncertainty, the information structure is an endogenous variable or an exogenous parameter. A natural and general way to model information is to represent it by sub-σ-fields of a probability space (Ω, Σ, µ), which describes the set of all possible states of the world (Ω), the family (σ-field) of all possible events (Σ), and the distribution of those events (µ). In order to make formal statements on the relation of information to the other variables of the model, we need to define precisely the mathematical space of information and endow it with a suitable topology. This topology should not be too weak, or otherwise various decision variables that depend on information (such as consumer demand) may fail to be continuous. On the other hand the topology should not be too strong (too rich), or otherwise information structures (fields) that are known to lead to similar behavior, may fail to be topologically close to each other. So we need to achieve a rather delicate balance with the topology on the space of information. In this section, we outline a topology that achieves this balance. Let (Ω, Σ, µ) be a probability space. It represents the uncertainty present in the microeconomic model. By S ∗ we denote the set of all sub-σ-fields of Σ. On S ∗ we introduce the relation ∼, defined by for Σ1 , Σ2 ∈ S ∗ , we have Σ1 ∼ Σ2 if and only if they differ only by µ-null sets. So for every A ∈ Σ1 , there is a C ∈ Σ2 such that µ(A C) = 0 and vice versa. The space of information S0 is defined to be the set of ∼-equivalence classes of S ∗ ; that is, S0 = S ∗ / ∼ . In what follows, given Σ ∈ S ∗ and f ∈ L1 (Ω, Σ), by E(f |Σ ) we denote the conditional expectation of f with respect to the sub-σ-field Σ . Using the standard approximation of L1 (Ω, Σ) functions by simple functions, we obtain the following result. PROPOSITION 9.1.1 Σ1 ∼ Σ2 if and only if E(f |Σ1 ) = E(f |Σ2 ) for all f ∈ L1 (Ω, Σ). REMARK 9.1.2 Note that for every Σ ∈ S ∗ and every f ∈ L1 (Ω, Σ), E(f |Σ ) belongs in L1 (Ω, Σ ) ⊆ L1 (Ω, Σ).
Now let L L1 (Ω, Σ) be the Banach space of all bounded linear operators L : L1 (Ω, Σ) −→ L1 (Ω, Σ). Then for Σ ∈ S0 , E(·|Σ ) ∈ L1 (Ω, Σ ) ⊆ L1 (Ω, Σ) (see Remark 9.1.2) and the map Σ −→E(f |Σ ) from S0 into L1 (Ω, Σ) is injective (see Proposition 9.1.1). So we can embed S0 into L L1 (Ω, Σ) and topologize it
1 with the subspace topology of any topology we consider on the space L L (Ω, Σ) . The
space L L1 (Ω, Σ) is a Banach space with the norm
9.1 Mathematical Space of Information LL = sup
L(f )
1
f 1
653
: f ∈ L1 (Ω, Σ), f = 0 .
The induced norm topology is called operator topology. In this
the uniform topology the map {L, K} −→ L ◦ K ∈ L L1 (Ω, Σ) is jointly continuous. We now introduce another topology on L L1 (Ω, Σ) that is weaker than the uniform operator topology and more suitable for our purposes. DEFINITION 9.1.3 The strong operator topology or topology of pointwise con
vergence is the weakest topology on L L1 (Ω, Σ) such that the maps
Ef : L L1 (Ω, Σ) −→ L1 (Ω, Σ) defined by Ef (L) = L(f ) are continuous for every f ∈ L1 (Ω, Σ). REMARK 9.1.4 A neighborhood basis of the origin is given by the sets
L ∈ L L1 (Ω, Σ) : L(fk )1 < ε, k = 1, . . . , n , 1 this topology where {fk }n k=1 is a finite collection of elements in L (Ω, Σ) and ε > 0. In a net {La }a∈J of operators converges to an operator L ∈ L L1 (Ω, Σ) denoted by s La −→ L if and only if La (f ) − L(f )1 −→ 0 for all f ∈ L1 (Ω, Σ). The map {L, K} −→ L ◦ K is separately but not jointly continuous.
In general the strong operator topology on L L1 (Ω, Σ) is not well-behaved and it is not metrizable. However, the relative topology on S0 ⊆
in particular L L1 (Ω, Σ) is much better behaved because S0 is a uniformly equicontinuous subset. In what follows we assume that the Lebesgue space L1 (Ω, Σ) is separable. This is true when Σ is countably generated and this in turn holds if Σ is the Borel σ-field of a second-countable Hausdorff topological space (e.g., of a separable metric
space). In the sequel by τs we denote the strong operator topology on L L1 (Ω, Σ) . THEOREM 9.1.5 If L1 (Ω, Σ) is separable, then (S0 , τs ) is a Polish space and the metric is given by ds (Σ1 , Σ2 ) =
∞ 1 min E(fk |Σ1 ) − E(fk |Σ2 )1 , 1 , 2k
(9.1)
k=1
where {fk }k≥1 is a countable dense subset of L1 (Ω, Σ). Any two metrics using different countable dense subsets of L1 (Ω, Σ) are uniformly equivalent. PROOF: First we show that ds is a metric. Clearly ds satisfies the triangle inequality. Also from Proposition 9.1.1 we see that Σ1 = Σ2 ⇒ ds (Σ1 , Σ2 ) = 0. On the other hand, if ds (Σ1 , Σ2 ) = 0, then from (9.1) we see that E(fk |Σ1 ) − E(fk |Σ2 )1 = 0
for all k ≥ 1.
(9.2)
Given any h ∈ L1 (Ω, Σ) and ε > 0, we can find k ≥ 1 such that h − fk < ε. Then we have
654
9 Uncertainty, Information, Decision Making E(h|Σ1 ) − E(h|Σ2 )1 ≤ E(h|Σ1 ) − E(fk |Σ1 )1 +E(fk |Σ2 ) − E(h|Σ2 )1
(see (9.2)) ≤ h − fk 1 + fk − h1 < 2ε. Because ε > 0 was arbitrary, we let ε ↓ 0 to obtain E(h|Σ1 ) − E(h|Σ2 )1 = 0 ⇒ Σ1 = Σ2
for all h ∈ L1 (Ω, Σ)
(see Proposition 9.1.1).
Finally it is obvious that ds (Σ1 , Σ2 ) = ds (Σ2 , Σ1 ) for all Σ1 , Σ2 ∈ S0 . Therefore we have shown that ds is a metric on S0 . Note that for all f ∈ L1 (Ω, Σ) and all Σ ∈ S0 , we have E(f |Σ )1 ≤ f 1 , ⇒ S0 is uniformly equicontinuous. So from a result of point-set topology (see, e.g., Kelley [344, p. 238]), we have that the topologies ds and τs on S0 coincide and in fact ds is independent of the particular dense sequence {fk }k≥1 we use. Moreover, the topology is separable. It remains to show that the metric topology is complete. To this end let {Σn }n≥1 be a ds -Cauchy sequence. Then given ε ∈ (0, 1) and k ≥ 1, there exists N such that for m, n ≥ N , we have ε , 2k (see (9.1)), ⇒ E(fk |Σn ) − E(fk |Σm )1 < ε 1 ⇒ E(fk |Σn ) n≥1 ⊆ L (Ω, Σ) is a Cauchy sequence. ds (Σn , Σm ) <
Therefore we can find L ∈ L L1 (Ω, Σ) such that L(h) = lim E(h|Σn ) n→∞
for every h ∈ L1 (Ω, Σ).
(9.3)
We show that L is also a conditional expectation with respect to some Σ ∈ S0 . To this end we need to show that
L gL(h) = L(g)L(h) for all g ∈ L∞ (Ω, Σ), all h ∈ L1 (Ω, Σ)
L(h)dµ =
and
(9.4)
Ω
hdµ
for all h ∈ L1 (Ω, Σ)
Ω
(see Neveu [459, p. 12]). So we have
(see (9.3)) L gL(h) = L g lim E(h|Σn ) n→∞
(because L is continuous) = lim L gE(h|Σn ) n→∞
(see (9.3)) = lim E gE(h|Σn )|Σn n→∞
= lim E(g|Σn )E(h|Σn ) = L(g)L(h) n→∞
which proves (9.4). Also
(9.5)
L(h)dµ = Ω
9.1 Mathematical Space of Information
lim E(h|Σn )dµ = lim E(h|Σn )dµ = hdµ,
Ω n→∞
n→∞
Ω
655
Ω
which proves (9.5). Therefore we can find Σ ∈ S0 such that L(h) = E(h|Σ) for all h ∈ L1 (Ω, Σ) and so (S0 , ds ) is a Polish space. In the propositions that follow, we more closely examine the convergence in the Polish space (S0 , ds ). µ
d
s PROPOSITION 9.1.6 Σn −→ Σ if and only if E(χA |Σn ) −→ E(χA |Σ) for all A ∈ Σ.
PROOF: ⇒: This implication follows from the definition of the topology τs = ds (see Definition 9.1.3). ⇐: We know that for every f ∈ L1 (Ω, Σ), {E(f |Σ ) : Σ ∈ S0 } ⊆ L1 (Ω, Σ) is uniformly integrable. So from the hypothesis and the extended dominated convergence theorem (Vitali’s theorem) we have that E(s|Σn ) −→ E(s|Σ) for every simple function s ∈ L1 (Ω, Σ). But simple functions are dense in L1 (Ω, Σ). So given f ∈ L1 (Ω, Σ), we can find simple functions {sm }m≥1 ⊆ L1 (Ω, Σ) such that sm −→ f in L1 (Ω, Σ). We have E(f |Σn )−E(f |Σ)1 ≤ E(f |Σn )−E(sm |Σn )1 +E(sm |Σn )−E(sm |Σ)1 +E(sm |Σ)−E(f |Σ)1 ≤ 2f − sm 1 + E(sm |Σn ) − E(sm |Σ)1 −→ 0, as ds
⇒ Σn −→ Σ.
m −→ ∞,
In a similar fashion we also show the following result. d
s Σ if and only if for every D ⊆ L1 (Ω, Σ) dense set PROPOSITION 9.1.7 Σn −→ µ E(f |Σn ) −→ E(f |Σ) for all f ∈ D.
DEFINITION 9.1.8 (a) For Σ1 , Σ2 ∈ S0 , Σ1 ∨ Σ2 is the smallest σ-field containing both Σ1 and Σ2 . (b) For {Σn }n≥1 ⊆ S0 , we define D # # D lim inf Σn = Σn and lim sup Σn = Σn . n→∞
n→∞
m≥1 n≥m
m≥1 n≥m
d
s Σ. PROPOSITION 9.1.9 Σ = lim inf Σn = lim sup Σn implies Σn −→
n→∞
PROOF: Let Γm=
n≥m
n→∞
E Σn and Λm= n≥m Σn . Evidently {Γm }m≥1 is an increasing
sequence and {Λm }m≥1 is a decreasing sequence of sub-σ-fields of Σ. From the
656
9 Uncertainty, Information, Decision Making
martingale and reverse martingale convergence theorems, for every f ∈ L1 (Ω, Σ) we have E(f |Γm ) −→ E(f |Σ)
and
E(f |Λm ) −→ E(f |Σ)
in L1 (Ω, Σ)
as m −→ ∞.
Given ε > 0, we can find m0 = m0(ε) ≥ 1 such that
‖E(f|Γm) − E(f|Λm)‖₁ < ε²/2 for all m ≥ m0. (9.6)
Let m ≥ m0 and A ∈ Σm. Because Σm ⊆ Λm, we have
∫_A (E(f|Γm) − E(f|Σm)) dµ = ∫_A (E(f|Γm) − E(f|Λm)) dµ ≤ ∫_A |E(f|Γm) − E(f|Λm)| dµ ≤ ‖E(f|Γm) − E(f|Λm)‖₁ < ε²/2 (see (9.6)). (9.7)
Let Am = {ω ∈ Ω : E(f|Γm)(ω) − E(f|Σm)(ω) > ε} ∈ Σm (because Γm ⊆ Σm). Using this set in (9.7) we obtain
εµ(Am) ≤ ∫_{Am} (E(f|Γm) − E(f|Σm)) dµ < ε²/2,
⇒ µ(Am) < ε/2 for all m ≥ m0. (9.8)
Similarly, if Cm = {ω ∈ Ω : E(f|Γm)(ω) − E(f|Σm)(ω) < −ε}, we can say that
µ(Cm) < ε/2 for all m ≥ m0. (9.9)
Combining (9.8) and (9.9), we obtain
µ({ω ∈ Ω : |E(f|Γm)(ω) − E(f|Σm)(ω)| > ε}) < ε for all m ≥ m0,
⇒ E(f|Γm) − E(f|Σm) −→ 0 in µ-measure as m −→ ∞. (9.10)
Also from the martingale convergence theorem, we have
E(f|Γm)(ω) −→ E(f|Σ)(ω) µ-a.e. on Ω as m −→ ∞. (9.11)
From (9.10) and (9.11), we infer that
E(f|Σm) −→ E(f|Σ) in µ-measure as m −→ ∞. (9.12)
Also, again from the martingale convergence theorem, we have
‖E(f|Γm)‖₁, ‖E(f|Λm)‖₁ −→ ‖E(f|Σ)‖₁ as m −→ ∞. (9.13)
Because Γm ⊆ Σm ⊆ Λm, we have
‖E(f|Γm)‖₁ ≤ ‖E(f|Σm)‖₁ ≤ ‖E(f|Λm)‖₁ for all m ≥ 1. (9.14)
From (9.13) and (9.14), we infer that
‖E(f|Σm)‖₁ −→ ‖E(f|Σ)‖₁ as m −→ ∞. (9.15)
From (9.12) and (9.15), we conclude that
E(f|Σm) −→ E(f|Σ) in L1(Ω, Σ) as m −→ ∞.
Because f ∈ L1(Ω, Σ) was arbitrary, we have Σm −→ Σ in (S0, ds) as m −→ ∞.
The strong operator topology (topology of pointwise convergence) on S0 has the nice property that the subset of S0 consisting of the σ-fields generated by finite partitions of the state space is dense. This is nice because in models in which the information variable appears in a continuous fashion, finite-dimensional approximations can be used.
PROPOSITION 9.1.10 If Σ ∈ S0 and ε > 0 are given, then we can find a finite Σ-partition Σf ∈ S0 of Ω with Σf ⊆ Σ such that ds(Σ, Σf) < ε.
PROOF: Let {fk}_{k=1}^N ⊆ L∞(Ω, Σ) and let c > max_{1≤k≤N} ‖fk‖∞. Choose r > 0 such that c < rε/4. For every k ∈ {1, ..., N} and every integer i with |i| ≤ r we set
Aki = {ω ∈ Ω : εi/2 ≤ E(fk|Σ)(ω) ≤ ε(i + 1)/2}
and set
Σk = σ(Ak1, ..., Akr, Ak,(−1), ..., Ak,(−r)) and Σf = σ({Σk}_{k=1}^N).
Then |E(fk|Σ)(ω) − E(fk|Σf)(ω)| < ε µ-a.e. on Ω.
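The construction in the proof of Proposition 9.1.10, quantizing the conditional expectations E(fk|Σ) on a grid of mesh ε/2 and generating a finite sub-σ-field from the resulting level sets, can be sketched numerically. The snippet below is an added illustration, not part of the original proof; the finite state space, the sample σ-field, and the function names are hypothetical.

```python
import numpy as np

def cond_exp(f, partition, mu):
    """E(f | sigma(partition)) on a finite space: blockwise mu-average."""
    out = np.empty(len(f), dtype=float)
    for block in partition:
        idx = list(block)
        out[idx] = np.dot(f[idx], mu[idx]) / mu[idx].sum()
    return out

def finite_approximation(fs, partition, eps, mu):
    """Quantize each E(f_k|Sigma) with mesh eps/2; the joint level sets form a
    finite partition Sigma_f <= Sigma with |E(f_k|Sigma) - E(f_k|Sigma_f)| < eps."""
    labels = [np.floor(2.0 * cond_exp(f, partition, mu) / eps).astype(int) for f in fs]
    labels = np.stack(labels)            # row k = quantization indices of E(f_k|Sigma)
    blocks = {}
    for omega in range(labels.shape[1]):
        blocks.setdefault(tuple(labels[:, omega]), set()).add(omega)
    return list(blocks.values())

mu = np.full(8, 1 / 8)
sigma = [{0, 1, 2}, {3, 4}, {5, 6, 7}]                       # a sample sub-sigma-field
fs = [np.arange(8, dtype=float), (np.arange(8) % 2).astype(float)]
print(finite_approximation(fs, sigma, eps=0.5, mu=mu))       # the finite partition Sigma_f
```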
Next we show how the information topology depends on the probability measure µ. This way, we are able to handle certain asymmetries of the agents’ prior beliefs about uncertainty. Recall that the collection of null-sets affects the information space. For this reason, to be able to proceed further in our discussion we need the following definition. DEFINITION 9.1.11 Let µ, ν be two probability measures on the measurable space (Ω, Σ). We say that µ and ν are equivalent, if they produce the same null sets. REMARK 9.1.12 For equivalent measures there is no ambiguity in the expression almost everywhere. So the space of information S ∗ is the same. Also L∞ (Ω, Σ, µ) = L∞ (Ω, Σ, ν) for µ, ν equivalent. However, it is not true that L1 (Ω, Σ, µ) = L1 (Ω, Σ, ν) (although the common L∞ -space is dense in both). Moreover, the conditional expectations Eµ (·|Σ ), Eν (·|Σ ), Σ ∈ S0 , with respect to µ and ν, are different. The equivalent measures µ, ν are mutually absolutely continuous and so we can define the Radon–Nikodym derivatives (dµ)/(dν), (dν)/(dµ).
LEMMA 9.1.13 If µ, ν are equivalent probability measures on (Ω, Σ), Σ' ∈ S0, and f ∈ L∞(Ω, Σ), then Eµ(f|Σ') = Eµ(dν/dµ |Σ') Eν(f dµ/dν |Σ') a.e. on Ω.
PROOF: For A ∈ Σ', we have
∫_A Eµ(dν/dµ |Σ') Eν(f dµ/dν |Σ') dµ = ∫_A Eµ(Eν(f dµ/dν |Σ') dν/dµ |Σ') dµ = ∫_A Eν(f dµ/dν |Σ') dν = ∫_A f (dµ/dν) dν = ∫_A f dµ,
⇒ Eµ(f|Σ') = Eµ(dν/dµ |Σ') Eν(f dµ/dν |Σ') a.e. on Ω.
Taking f = 1 ∈ L∞(Ω, Σ), we obtain the following corollary.
COROLLARY 9.1.14 If µ, ν are equivalent probability measures on (Ω, Σ) and Σ' ∈ S0, then Eν(dµ/dν |Σ') = [Eµ(dν/dµ |Σ')]^{-1} a.e. on Ω.
Recall that throughout this section, we assume that L1(Ω, Σ) is separable.
LEMMA 9.1.15 If µ, ν are equivalent probability measures on (Ω, Σ) and {fk}k≥1 ⊆ L∞(Ω, Σ) is dense in L1(Ω, Σ, µ), then {fk dµ/dν}k≥1 is dense in L1(Ω, Σ, ν).
PROOF: Let h ∈ L1(Ω, Σ, ν). We have
‖h − fk dµ/dν‖_{L1(ν)} = ∫_Ω |h − fk dµ/dν| dν = ∫_Ω |h dν/dµ − fk| dµ = ‖h dν/dµ − fk‖_{L1(µ)},
⇒ {fk dµ/dν}k≥1 is dense in L1(Ω, Σ, ν) (because h dν/dµ ∈ L1(Ω, Σ, µ) and {fk}k≥1 is dense there).
Using these auxiliary results, we can now compare the metrics ds(µ) and ds(ν) generating the strong operator topologies (topologies of pointwise convergence) for the two probability measures µ and ν.
THEOREM 9.1.16 If µ, ν are equivalent probability measures on (Ω, Σ) and ds(µ), ds(ν) are the metrics generating the corresponding strong operator topologies on S0, then ds(µ) and ds(ν) are uniformly equivalent.
PROOF: Let {fk}k≥1 ⊆ L∞(Ω, Σ) be dense in L1(Ω, Σ, µ) with f1 = 1. From Theorem 9.1.5 and Lemma 9.1.15, we know that
ds(ν)(Σ1, Σ2) = Σ_{k=1}^∞ (1/2^k) min{‖Eν(fk dµ/dν |Σ1) − Eν(fk dµ/dν |Σ2)‖_{L1(ν)}, 1}
is a metric generating the strong operator topology (topology of pointwise convergence) on S0 for the prior ν. Given ε > 0, we choose k0 ≥ 1 and c > 1 such that
1/2^{k0−3} < ε and ‖fk‖∞ < c for all k < k0.
Because µ ≪ ν, we can find δ = δ(ε) > 0 such that for all A ∈ Σ,
ν(A) < δ implies µ(A) < ε/(4c).
We have
∫_Ω Eν(dµ/dν |Σ1) dν = ∫_Ω Eµ(dν/dµ |Σ1) dµ = 1.
So, if we set
A = {ω ∈ Ω : Eµ(dν/dµ |Σ1)(ω) (dµ/dν)(ω) ≤ 1/δ},
we have
µ(A) ≥ 1 − ε/(4c) (by Chebyshev's inequality).
If we set ϑ = δε/(2^{k0+2} c), then ds(ν)(Σ1, Σ2) < ϑ implies
‖Eν(fk dµ/dν |Σ1) − Eν(fk dµ/dν |Σ2)‖_{L1(ν)} < δε/(8c) for all k < k0,
and so, using Lemma 9.1.13, we obtain
‖Eµ(fk|Σ1) − Eµ(fk|Σ2)‖_{L1(µ)} < ε/2 for all k < k0,
⇒ ds(µ)(Σ1, Σ2) < ε.
Reversing the roles of µ, ν in the above argument, we conclude that the metrics ds(µ) and ds(ν) are uniformly equivalent.
There is another topology on S0, which is actually the first topology in the literature introduced for the purpose of topologizing the space of information.
DEFINITION 9.1.17 For Σ1, Σ2 ∈ S0, we define the quantity
d∗(Σ1, Σ2) = sup_{A∈Σ1} inf_{C∈Σ2} µ(A △ C)
and then set
d(Σ1, Σ2) = max{d∗(Σ1, Σ2), d∗(Σ2, Σ1)}.
It can be shown that d is a metric on S0, known in the literature as the Boylan metric.
REMARK 9.1.18 On Σ we can define the relation ∼ by A ∼ C if and only if µ(A △ C) = 0. This is an equivalence relation. We set Σ̇ = Σ/∼. Note that if A ∼ C, then µ(A) = µ(C). Thus a function µ̇ is unambiguously defined on Σ̇ by setting µ̇([A]) = µ(A), where [A] is the ∼-equivalence class of A ∈ Σ. Then ḋ([A], [C]) = µ(A △ C) is a metric on Σ̇. A probability space (Ω, Σ, µ) is said to be separable if the associated metric space (Σ̇, ḋ) is separable. The metric space (Σ̇, ḋ) is complete. We emphasize that completeness of the probability space (Ω, Σ, µ) has nothing to do with the completeness of the associated metric space (Σ̇, ḋ).
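For σ-fields generated by finite partitions, the Boylan metric of Definition 9.1.17 can be computed by brute force, since the generated σ-field is then a finite collection of events. The following sketch is an added illustration, not from the text; it enumerates the events of each σ-field and evaluates d∗ and d on a small example.

```python
import numpy as np
from itertools import combinations, chain

MU = np.full(8, 1 / 8)

def events(partition):
    """All members of the sigma-field generated by a finite partition."""
    blocks = [frozenset(b) for b in partition]
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            yield frozenset(chain.from_iterable(combo))

def mu_of(event):
    return sum(MU[i] for i in event)

def d_star(p1, p2):
    """sup over A in Sigma_1 of inf over C in Sigma_2 of mu(A symmetric-difference C)."""
    return max(min(mu_of(A ^ C) for C in events(p2)) for A in events(p1))

def boylan(p1, p2):
    return max(d_star(p1, p2), d_star(p2, p1))

coarse = [{0, 1, 2, 3}, {4, 5, 6, 7}]
fine = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
print(boylan(coarse, fine))   # 0.5: e.g. {0,1,4,5} is far from every event of the coarse field
print(boylan(fine, fine))     # 0.0
```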
Consider the Banach space L(L∞(Ω, Σ), L1(Ω, Σ)) of bounded linear operators from L∞(Ω, Σ) into L1(Ω, Σ). For every f ∈ L∞(Ω, Σ) and every Σ' ∈ S0, we have E(f|Σ') ∈ L∞(Ω, Σ') ⊆ L1(Ω, Σ). So we can consider that S0 is embedded in L(L∞(Ω, Σ), L1(Ω, Σ)). We have the following result due to Allen [13, Fact 9.3].
PROPOSITION 9.1.19 The d-metric topology on S0 coincides with the relative uniform operator topology of L(L∞(Ω, Σ), L1(Ω, Σ)) on S0. So d(Σn, Σ) −→ 0 if and only if sup{‖E(f|Σn) − E(f|Σ)‖₁ : ‖f‖∞ ≤ 1} −→ 0 as n → ∞.
This proposition together with the results of Allen [13, Proposition 13.1], leads to the following comparison of the ds- and d-metric topologies.
PROPOSITION 9.1.20 The d-metric topology on S0 is finer (stronger) than the ds-topology. The two coincide when µ is purely atomic and in this case S0 is a compact metric space.
In general the d-metric topology has some serious drawbacks. One such drawback is illustrated in the next example.
EXAMPLE 9.1.21 An increasing sequence {Σn}n≥1 ⊆ S0 need not converge in the metric d: To see this, for every n ≥ 1 let Ωn = [0, 1), Bn = B([0, 1)), and µn = λ = the Lebesgue measure on [0, 1). We set
Ω = ∏_{n≥1} Ωn, Σ = ⊗_{n≥1} Bn, and µ = ⊗_{n≥1} µn.
Let Tn = {∅, Ωn}, n ≥ 1, be the trivial σ-field on Ωn and set
Σn = (⊗_{k=1}^n Bk) ⊗ (⊗_{k≥n+1} Tk) for all n ≥ 1.
Evidently Σn ↑ Σ. However, we claim that
1/2 ≤ d(Σn, Σn+1) for every n ≥ 1. (9.16)
Let A, C be independent subsets of Ω. We have
µ(A △ C) = µ(A ∩ C^c) + µ(C ∩ A^c) = µ(A) − µ(A ∩ C) + µ(C) − µ(A ∩ C) = µ(A) + µ(C) − 2µ(A)µ(C)
(due to the independence of A and C),
⇒ µ(A △ C) = µ(A)(1 − 2µ(C)) + µ(C).
If µ(C) = 1/2, then we have µ(A △ C) = µ(C). Choose Cn ∈ Σn+1 independent of all the sets in Σn with µ(Cn) = 1/2. We have
1/2 = inf{µ(A △ Cn) : A ∈ Σn} ≤ d(Σn, Σn+1) for all n ≥ 1
(see Definition 9.1.17), which proves (9.16). Hence {Σn}n≥1 is not a d-Cauchy sequence, although Σn ↑ Σ as n → ∞. Note also that in this example (S0, d) is not compact.
In Proposition 9.1.10 we showed that the family of all finite partitions is dense in (S0, ds). This is important because it permits the use of finite-dimensional approximations in models which are ds-continuous with respect to the information variable. This is no longer true for the d-metric topology as the next example illustrates.
EXAMPLE 9.1.22 The family of all finite partitions of Ω is not d-dense in S0: Let Ω = [0, 1], Σ = B([0, 1]), and µ = λ = the Lebesgue measure on [0, 1]. Let {Ak}_{k=1}^N be a Borel partition of Ω. We know that we can find C ∈ Σ such that µ(Ak ∩ C) = (1/2)µ(Ak) for all k ∈ {1, ..., N}. Therefore, if Σf = σ({Ak}_{k=1}^N), we have
1/2 ≤ d(Σ, Σf).
REMARK 9.1.23 In fact the d-closure of the family of all finite Σ-partitions of Ω is {Σ' ∈ S0 : µ restricted to Σ' is purely atomic}. Example 9.1.22 shows that the d-metric topology is not suitable for considering finite-dimensional approximations of the model. However, the d-metric topology, being a relative uniform operator topology, has the property that the map ξ : S0 × S0 −→ S0 defined by
ξ(Σ1, Σ2) = Σ1 ∨ Σ2 = σ(Σ1 ∪ Σ2) (see Definition 9.1.8(a))
is jointly d-continuous. This is no longer true for the ds-metric topology. We can only say that if Σf is a finite Σ-partition of Ω, then Σ −→ Σf ∨ Σ is ds-continuous. In general the ∨-operation is not even separately continuous.
We conclude with one more observation emphasizing the differences between the two metrics ds and d. We start with a definition.
DEFINITION 9.1.24 The elements {Σk}_{k=1}^N ⊆ S0 are said to be independent if for every Ak ∈ Σk, k = 1, ..., N, we have µ(A1 ∩ ... ∩ AN) = µ(A1) ... µ(AN). A subset of S0 is said to be independent if every finite collection of elements in it is independent.
PROPOSITION 9.1.25 If {Σn}n≥1 ⊆ S0 is independent, then
(a) Σn −→ {∅, Ω} = T in (S0, ds);
(b) if sup_{n≥1} inf_{A∈Σn} |µ(A) − 1/2| < 1/2, then Σn does not converge to T in the d-metric.
PROOF: (a) From Kolmogorov's 0–1 law, we have lim sup_{n→∞} Σn = T. Because lim inf_{n→∞} Σn ⊆ lim sup_{n→∞} Σn, from Proposition 9.1.9 we conclude that Σn −→ T in (S0, ds).
(b) For every n ≥ 1, we have
d(Σn, T) = sup_{A∈Σn} min{µ(A), 1 − µ(A)} = 1/2 − inf_{A∈Σn} |µ(A) − 1/2| ≥ 1/2 − sup_{n≥1} inf_{A∈Σn} |µ(A) − 1/2| > 0 for all n ≥ 1,
⇒ Σn does not converge to T in the d-metric.
EXAMPLE 9.1.26 Let Ω = [0, 1], Σ = B([0, 1]), and µ = λ = the Lebesgue measure on [0, 1]. Set
An = {x ∈ [0, 1] : the nth entry in the binary expansion of x is 0} and Σn = {∅, Ω, An, An^c}, n ≥ 1.
So Σn reveals the nth binary entry of the true state. It is easy to see that {Σn}n≥1 is independent. Hence by Proposition 9.1.25(a), we have Σn −→ T in (S0, ds). On the other hand µ(An) = 1/2 for all n ≥ 1. Therefore by Proposition 9.1.25(b), Σn does not converge to T in the metric d.
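Example 9.1.26 can also be checked numerically: the σ-fields Σn generated by the binary digits carry vanishing information about an integrable function in the ds-sense, while staying at Boylan distance 1/2 from the trivial σ-field. The Monte Carlo sketch below is an added illustration with ad hoc sample size and test function, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(200_000)            # uniform draws from ([0,1], Lebesgue)

def nth_bit(x, n):
    """n-th digit (n >= 1) of the binary expansion of x."""
    return np.floor(x * 2 ** n).astype(int) % 2

f = x                               # test function f(x) = x, with mean 1/2
for n in (1, 3, 6, 10):
    b = nth_bit(x, n)
    # E(f | Sigma_n) is constant on A_n = {b = 0} and on its complement.
    cond = np.where(b == 0, f[b == 0].mean(), f[b == 1].mean())
    print(n, np.abs(cond - 0.5).mean())   # -> 0 like 2**-(n+1): d_s-convergence to T,
    # while d(Sigma_n, T) = 1/2 for every n, since mu(A_n) = 1/2.
```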
9.2 The ex-Post View
There are two ways we can view information and how it affects the behavior of the model. In this section, we examine the first way, which is known as the ex-post view or Bayesian view. According to it an agent chooses an action after observing his information and updating his belief. Then information affects state-dependent actions through posteriors (transition measures). When the state-dependent posteriors are not well-defined, then the idea is to show that the mapping from information to conditional expected utility is continuous and then use the fact that the mapping from utility to actions is continuous. We establish the continuity of consumer demand under uncertainty with respect to his private information. So consider the commodity space RN. We set
∆ = {p = (pk)_{k=1}^N ∈ R+^N : Σ_{k=1}^N pk = 1} (the usual price simplex).
There is a consumption set C ⊆ RN which is assumed to be closed, convex, and bounded from below. Let
U = {u : C −→ R : u is continuous, strictly increasing and strictly concave}.
We furnish U with the topology of uniform convergence on compacta (c-topology or compact-open topology). Recall that C(C, R) with the c-topology is a separable Fréchet space. Therefore U with the c-topology is a separable metric space. There
is uncertainty in the model described by the probability space (Ω, Σ, µ). The dependence of the utilities on the state of the world is described by a Σ, B(U) -measurable function U : Ω −→ U (by B(U) we denote the Borel σ-field of the separable metric space U ). We interpret U (ω) ∈ U as the consumer’s ex-post utility function when ω ∈ Ω is the true state of the world. We write U (ω)(x) = u(ω, x)
for all (ω, x) ∈ Ω × C.
(9.17)
Recall that the evaluation maps e : U × C −→ R
and
ex : U −→ R
for x ∈ C
defined by e(u, x) = u(x)
and
ex (u) = u(x)
are both continuous. Therefore from (9.17) and the measurability of U (·), we see that (ω, x) −→ u(ω, x) is a Carath´eodory function; that is, for every x ∈ C, ω −→ u(ω, x) is Σ-measurable and for every ω ∈ Ω, x −→ u(ω, x) is continuous. Therefore by virtue of Theorem 6.2.6 (ω, x) −→ u(ω, x) is Σ × B(C)-measurable. The consumer faces uncertainty only about her preferences for commodities. So only the utility function is state dependent. Her initial endowment e ∈ int C and the price vector p ∈ ∆ are both known and state independent (i.e., non-stochastic). The information of the consumer about the uncertainty she faces is described by Σ0 ∈ S 0 . The budget constraint of the consumer is the set B(e, p) = {x ∈ C : (p, x)RN ≤ (p, e)RN }. The consumer first observes Σ0 . This means that if the true state of the world is ω ∈ Ω, she observes the utility function
x −→ E(u(·, x)|Σ0)(ω)
and then she performs the maximization
sup{E(u(·, x)|Σ0)(ω) : x ∈ B(e, p)}.
To make sure that expected utilities are well defined, we must assume that, as the state of the world varies, with probability one the utilities remain in some compact subset of U. So we introduce the following hypothesis.
(H): The image measure (distribution) µ ∘ U^{-1} has compact support in U.
REMARK 9.2.1 The support of a measure ν is the smallest closed set whose complement is ν-null. Based on hypothesis (H), we introduce the space
D = {U : Ω −→ U : U is (Σ, B(U))-measurable and µ ∘ U^{-1} has compact support}.
If ρ is the metric on U, we endow D with the metric
ρ0(U, U') = E[ρ(U, U')] = ∫_Ω ρ(U, U') dµ.
Now the excess demand map de : ∆ × S0 × int C × D −→ L1(Ω, RN) is defined by
de(p, Σ0, e, U)(ω) = arg max{E(u(·, x)|Σ0)(ω) : x ∈ B(e, p)} − e. (9.18)
We want to determine the continuity properties of the excess demand function de when S0 is furnished with the ds-metric topology (topology of pointwise convergence). Before continuing our analysis to this end, we need to clarify the
conditional expected utility involved in (9.18). First note that for each x ∈ C E u(·, x)|Σ0 ∈ L1 (Ω). Because the Lebesgue dominated convergence theorem holds for conditional expectations, we see that
(ω, x) −→ w(ω, x) = E u(·, x)|Σ0 (ω) is a Σ0 -Carath´eodory function (i.e., w(·, x) is Σ0 -measurable). Also from the general theory of vector-valued integration, we have E(U |Σ0 )(ω) ∈ U
for µ-a.a. ω ∈ Ω.
So immediately we deduce the following. PROPOSITION 9.2.2 For any U ∈ D and Σ0 ∈ S0 , we have E(U |Σ0 ) ∈ D which is Σ0 -measurable and for all h ∈ L1 (Ω, Σ0 , RN ), we have
E(U |Σ0 )(ω) h(ω) = E u ω, h(ω) |Σ0 µ-a.e. on Ω. PROPOSITION 9.2.3 If U ∈ D and ε > 0 are given, then there exists Ω0 ∈ Σ with µ(Ω0 ) = 1 such that {E(U |Σ0 )(ω) : ω ∈ Ω0 , Σ0 ∈ S0 } is equicontinuous on C and uniformly equicontinuous on every compact K ⊆ C. PROOF: Let x ∈ X and δ > 0 be given. Because for all ω ∈ Ω0 , µ(Ω0 ) = 1, u(ω, ·) is continuous, we can find η = η(x, δ) > 0 such that x ∈ X
and
x − x < η imply |u(ω, x ) − u(ω, x)| < δ.
Then for all ω ∈ Ω0 we have
|E u(·, x)|Σ0 (ω) − E u(·, x )|Σ0 (ω)|
≤ E(|u(·, x) − u(·, x')| |Σ0)(ω) < δ.
This proves the first assertion of the proposition. The second assertion follows from the fact that on compact sets equicontinuity is in fact uniform equicontinuity.
Now we can have the first result on the continuous dependence on the information. Here L1(Ω, U) = {U : Ω −→ U : U is Borel-measurable and Eρ(U, 0) < ∞}.
THEOREM 9.2.4 If {Un, U}n≥1 ⊆ D, Un(ω) −→ U(ω) µ-a.e. in U, for every compact K ⊆ C there exists MK > 0 such that for all x ∈ K and all n ≥ 1 we have
|un(ω, x)| ≤ MK and |u(ω, x)| ≤ MK µ-a.e. on Ω,
and Σn −→ Σ in (S0, ds) as n → ∞, then E(Un|Σn) −→ E(U|Σ) as n → ∞ in L1(Ω, U) and in probability.
PROOF: Let K ⊆ C be compact and ε > 0 be given. By hypothesis Un(ω) −→ U(ω) µ-a.e. in U, therefore we have
lim_{n→∞} sup_{x∈K} |un(ω, x) − u(ω, x)| = 0 µ-a.e. on Ω.
So we can find n0(ε) ≥ 1 such that
sup_{x∈K} ‖un(·, x) − u(·, x)‖₁ < ε/3 for all n ≥ n0. (9.19)
Also we can find n1(ε) ≥ 1 such that
sup_{x∈K} ‖E(u(·, x)|Σn) − E(u(·, x)|Σ)‖₁ < ε/3 for all n ≥ n1. (9.20)
Then for every n ≥ max{n0(ε), n1(ε)}, we have
sup_{x∈K} ‖E(un(·, x)|Σn) − E(un(·, x)|Σ)‖₁
≤ sup_{x∈K} ‖E(un(·, x)|Σn) − E(u(·, x)|Σn)‖₁ + sup_{x∈K} ‖E(u(·, x)|Σn) − E(u(·, x)|Σ)‖₁ + sup_{x∈K} ‖E(u(·, x)|Σ) − E(un(·, x)|Σ)‖₁
< ε/3 + 2 sup_{x∈K} ‖un(·, x) − u(·, x)‖₁ (see (9.20))
< ε (see (9.19)). (9.21)
Because of Proposition 9.2.3, there exist Ω0 ∈ Σ with µ(Ω0) = 1 and, for every n ≥ 1, a δn > 0 such that ω ∈ Ω0 and x, x' ∈ K with ‖x − x'‖ < δn imply
|E(un(·, x)|Σn)(ω) − E(un(·, x')|Σn)(ω)| < ε (9.22)
and
|E(un(·, x)|Σ)(ω) − E(un(·, x')|Σ)(ω)| < ε. (9.23)
Because K is compact it is totally bounded and so we can find {xk^n}_{k=1}^m ⊆ K such that K ⊆ ⋃_{k=1}^m B_{δn}(xk^n). We have
∫_Ω sup_{x∈K} |E(un(·, x)|Σn)(ω) − E(un(·, x)|Σ)(ω)| dµ
≤ ∫_Ω sup_{x∈K} |E(un(·, x)|Σn)(ω) − E(un(·, xk^n)|Σn)(ω)| dµ + ∫_Ω |E(un(·, xk^n)|Σn)(ω) − E(un(·, xk^n)|Σ)(ω)| dµ + ∫_Ω sup_{x∈K} |E(un(·, xk^n)|Σ)(ω) − E(un(·, x)|Σ)(ω)| dµ
≤ 3ε (see (9.21), (9.22), and (9.23); here, for each x ∈ K, xk^n denotes a center with x ∈ B_{δn}(xk^n)). (9.24)
Recall that Un(ω) −→ U(ω) µ-a.e. on Ω in U. So we can find n2(ε) ≥ 1 such that for all n ≥ n2(ε), we have
sup_{x∈K} ∫_Ω |E(un(·, x)|Σ)(ω) − E(u(·, x)|Σ)(ω)| dµ ≤ ∫_Ω E(sup_{x∈K} |un(·, x) − u(·, x)| |Σ)(ω) dµ < ε. (9.25)
Finally letting n(ε) = max{n0(ε), n1(ε), n2(ε)}, for n ≥ n(ε) we have
sup_{x∈K} ∫_Ω |E(un(·, x)|Σn)(ω) − E(u(·, x)|Σ)(ω)| dµ < 4ε,
⇒ E(Un|Σn) −→ E(U|Σ) in L1(Ω, U) and in probability.
COROLLARY 9.2.5 If {Un, U}n≥1 ⊆ D, Un −→ U in L1(Ω, U), and Σn −→ Σ in (S0, ds), then E(Un|Σn) −→ E(U|Σ) in L1(Ω, U) and in probability.
Next we establish the continuity of the excess demand function with respect to the information variable.
THEOREM 9.2.6 The excess demand function de : ∆ × S0 × int C × D −→ L1(Ω, RN) is continuous when S0 is endowed with the topology of pointwise convergence.
PROOF: Suppose that (pn, Σn, en, Un) −→ (p, Σ, e, U) in ∆ × S0 × int C × D as n → ∞, with e ∈ int C. Then by virtue of Corollary 9.2.5, we have that
E(Un|Σn)(ω) −→ E(U|Σ)(ω) in U for µ-a.a. ω ∈ Ω. (9.26)
Let xn ∈ B(en, pn) be such that
E(un(·, xn)|Σn)(ω) = max{E(un(·, x)|Σn)(ω) : x ∈ B(en, pn)}.
Note that ⋃_{n≥1} B(en, pn) ∈ Pk(RN) and so we may assume that xn −→ x in RN. From Proposition 9.2.2 and (9.26) we have
E(un(·, xn)|Σn)(ω) −→ E(u(·, x)|Σ)(ω) for µ-a.a. ω ∈ Ω.
Because B(en, pn) −→ B(e, p), given any x' ∈ B(e, p) we can find x'n ∈ B(en, pn), n ≥ 1, such that x'n −→ x' in C and so
E(un(·, x'n)|Σn)(ω) −→ E(u(·, x')|Σ)(ω) for µ-a.a. ω ∈ Ω.
We have
E(un(·, x'n)|Σn)(ω) ≤ E(un(·, xn)|Σn)(ω) for µ-a.a. ω ∈ Ω,
⇒ E(u(·, x')|Σ)(ω) ≤ E(u(·, x)|Σ)(ω) for µ-a.a. ω ∈ Ω.
Because x' ∈ B(e, p) was arbitrary, it follows that
x = arg max{E(u(·, x')|Σ)(ω) : x' ∈ B(e, p)},
⇒ de is continuous as claimed by the theorem.
DEFINITION 9.2.7 The value of information of the choice problem is the function V : ∆ × S0 × int C × D −→ R defined by
V(p, Σ0, e, U) = ∫_Ω U(ω)(de(p, Σ0, e, U)(ω) + e) dµ.
Concerning the function V, we have the following continuity result.
THEOREM 9.2.8 The value of information map V : ∆ × S0 × int C × D −→ R is continuous when S0 is endowed with the topology of pointwise convergence.
PROOF: Suppose (pn, Σn, en, Un) −→ (p, Σ, e, U) in ∆ × S0 × int C × D with e ∈ int C. Then from Theorem 9.2.6, we have that
de(pn, Σn, en, Un) −→ de(p, Σ, e, U) in L1(Ω, RN) as n → ∞. (9.27)
Evidently the integral functional
h −→ ∫_Ω U(ω)(h(ω) + e) dµ
is continuous on L1(Ω, RN). Therefore because of (9.27) we have
V(pn, Σn, en, Un) −→ V(p, Σ, e, U),
⇒ V is continuous.
REMARK 9.2.9 Of course Theorems 9.2.6 and 9.2.8 are also valid if on S0 we consider the d-metric topology.
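The objects of this section — the budget set B(e, p), the conditional expected utility E(u(·, x)|Σ0)(ω), the excess demand de of (9.18), and the value of information V of Definition 9.2.7 — can be assembled in a small discretized example. The sketch below is an added illustration under simplifying assumptions (a finite state space, a coarse grid for the budget set, Cobb–Douglas-type utilities); none of these choices come from the text.

```python
import numpy as np
from itertools import product

# A toy ex-post consumer problem with two goods, four equally likely states and
# information Sigma_0 given by a partition of the state space.
states = range(4)
mu = np.full(4, 0.25)
info = [{0, 1}, {2, 3}]                       # Sigma_0: the agent learns which block occurred
p, e = np.array([0.5, 0.5]), np.array([1.0, 1.0])
alpha = np.array([0.2, 0.4, 0.6, 0.8])        # state-dependent utility weights

def u(s, x):
    return alpha[s] * np.log(x[0] + 1e-9) + (1 - alpha[s]) * np.log(x[1] + 1e-9)

grid = [np.array(x) for x in product(np.linspace(0, 2, 41), repeat=2)
        if np.dot(p, x) <= np.dot(p, e) + 1e-12]      # discretized budget set B(e, p)

def block_of(s):
    return next(B for B in info if s in B)

def demand(s):
    """Maximize the conditional expected utility E(u(.,x)|Sigma_0)(s) over B(e, p)."""
    B = list(block_of(s))
    w = mu[B] / mu[B].sum()
    return max(grid, key=lambda x: sum(wi * u(si, x) for wi, si in zip(w, B)))

# Value of information (Definition 9.2.7): integrate u(omega, .) at the chosen bundle.
V = sum(mu[s] * u(s, demand(s)) for s in states)
print({s: demand(s).round(2) for s in states}, round(V, 4))
```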
9.3 The ex-ante View In this section we investigate the second way of viewing information in a decision problem and how it affects behavior. This is the ex-ante view or measurability view. According to this approach, the agent formulates a plan of what action to take at each state before observing his information, subject to the constraint that the plan be measurable with respect to his information. So information affects the actions through this measurability constraint. The decision problem has the following structure. There is uncertainty represented by a probability space (Ω, Σ, µ) with Σ countably generated so that L1 (Ω) is separable. The information observed by the agent is represented by a sub-σ-field Σ0 of Σ. We emphasize that this is an ex-ante representation of information and so it does not denote what the agent has observed, but rather what an agent would observe in each state. Before observing his information, the agent formulates a statecontingent plan f : Ω −→ X where X ⊆ RN is a nonempty, compact, and convex set, representing the set of all feasible actions. We must have that f is Σ0 -measurable in order for the plan to be informationally feasible. The agent also faces for the plan f an additional constraint B(p) depending on a variable p ∈ P (p can be, e.g., the price system prevailing in the market). The set B(p) denotes all those plans that are economically feasible when the measurability constraint is not taken into account. The agent has some preferences among the feasible plans, which is described by the complete preorder . Then the decision problem facing the agent, is the following. max : f : Ω −→ X, f is Σ0 -measurable, f ∈ B(p) . (9.28) In problem (9.28), we view p −→ B(p) as the economic constraint multifunction. On the set E of all Σ-measurable functions f : Ω −→ X, we introduce the equivalence relation ∼ of equality µ-a.e. We set E0 = E/ ∼. DEFINITION 9.3.1 A plan is an element of E0 . The measurability constraint multifunction M : S0 −→ 2E0 is defined by M (Σ0 ) = f ∈ E0 : f is Σ0 -measurable . (9.29) Then the overall constraint multifunction ξ : P × S0 −→ 2E0 is defined by ξ(p, Σ0 ) = B(p) ∩ M (Σ0 ).
(9.30)
E0
The solution multifunction S : P × S0 −→ 2 for decision problem (9.28) is defined by S(p, Σ0 ) = f ∈ ξ(p, Σ0 ) : g f for all g ∈ ξ(p, Σ0 ) . (9.31) We want to characterize the continuity of S. DEFINITION 9.3.2 Let Z be a Hausdorff topological space and a complete preorder on Z. Then (a) is lower semicontinuous (lsc for short) if for every z ∈ Z, the set L(z) = {z ∈ Z : z z} is closed.
(b) is upper semicontinuous (usc for short) if for every z ∈ Z, the set U (z) = {z ∈ Z : z z } is closed. (c) is continuous if it is both lsc and usc. REMARK 9.3.3 Recall that a general binary relation R on a set X is a preorder (or quasiorder), if it is reflexive (i.e., xRx) and transitive (i.e., xRy, yRz imply xRz). If in addition xRy and yRx imply x = y, then R is called a partial order (or simply an ordering). We should mention that some authors do not require a partial order to be reflexive. If for a preorder R we necessarily have xRy or yRx or both, then we say that R is a complete or total preorder. Evidently a complete preorder is reflexive. A total partial order is called a linear order . Given a preorder on X, we can define the associated strict preorder ≺ by x≺y
if and only if x y
and
y x.
Evidently ≺ is irreflexive and transitive. If the set X is a Hausdorff topological space, then in analogy to Definition 9.3.2 we can say that ≺ is lsc (resp., usc), if for every x ∈ X, the set Ls (x) = {x ∈ X : x ≺ x } (resp., the set Us (x) = {x ∈ X : x ≺ x}) is open. Note that for a total preorder , lower semicontinuity of the corresponding ≺ is equivalent to the lower semicontinuity of (see Definition 9.3.2(a)). Similarly for the upper semicontinuity. The parameter space P is assumed to be Hausdorff topological space. Now we want to topologize E0 . We consider two topologies on E0 . The first is the norm topology of L1 (Ω, RN ), denoted by s-. The second is the weak topology of L1 (Ω, RN ), denoted by w-. PROPOSITION 9.3.4 If u : Ω × X −→ R is a function such that (i) For all x ∈ X, ω −→ u(ω, x) is Σ-measurable, (ii) For µ-a.a. ω ∈ Ω, x −→ u(ω, x) is continuous, (iii) ω −→ max |u(ω, x)| belongs in L1 (Ω), x∈X
then the integral operator U : E0 −→ R defined by
u ω, x(ω) dµ U (x) = Ω
is s-continuous. PROOF: This is an immediate consequence of a dominated convergence theorem and from the fact that from a strongly convergent sequence in L1 (Ω, RN ), we can extract a µ-a.e. convergent subsequence. REMARK 9.3.5 Note, however, that U is not in general w-continuous. Let Ω = [0, 1], Σ = B([0, 1]), and µ = λ = the Lebesgue measure. Also suppose N = 1 and X = [0, 1] ⊆ R. Consider the utility function u(x) = x2 . Then if {xn }n≥1 is the sequence of Rademacher functions; that is,
xn(t) = 1 if t ∈ [k/2^n, (k+1)/2^n) with k even, and xn(t) = −1 otherwise,
then xn −→ 0 in L1 ([0, 1]) but U (xn ) = 1 U (0) = 0. Our stated goal is to determine the continuity properties of the solution multifunction S(p, Σ0 ) (see (9.31)). A reasonable approach is to use the Berge maximum theorem (see Theorem 6.1.18). Then according to that result if ξ is lower semicontinuous and has a closed graph and the preference relation is continuous (in the sense of Definition 9.3.2(c)), then S has closed graph. This requires that on E0 we employ the w-topology, which makes E0 w-compact and assume that is w-continuous, a rather restrictive requirement as was observed in Remark 9.3.5. So we see that although the weak topology is a good choice for E0 because it gives compactness of that space (see Theorem 6.4.23), however, this choice is too restrictive for the preference relation , where the natural choice is the strong topology. Hence we have conflicting requirements and eventually what we need is a version of Theorem 6.1.18 when two different topologies are involved. Such a result was proved by Horsley–Van Zandt–Wrobel [312, Corollary 8]. PROPOSITION 9.3.6 If (P, τ ) is a Hausdorff topological space, V is a linear space on which we have two linear topologies s, w, whose restrictions on any straight line in V are identical to the usual topology of R, X is a convex and w-compact subset of V , B : P −→ 2X \{∅} has w-compact values and it is τ -to-w usc and τ -to-s lsc, is a total preorder on X which is s-lsc and w-usc, then the optimal action correspondence S(p) = x ∈ B(p) : for all z ∈ B(p) z x , p ∈ P, from P into 2X \ {∅} is τ -to-w usc with w-compact values. Using this proposition in the case of our decision problem, we obtain the following result. PROPOSITION 9.3.7 If (P, τ ) is a Hausdorff topological space, on S0 we have a Hausdorff topology τ0 , ξ is τ × τ0 -to-s lower semicontinuous, and Gr ξ is closed for the product topology τ ×τ0 ×w, is s-lsc and w-usc, then Gr S is closed for the product topology τ × τ0 × w and has nonempty and w-compact values. The above result is missing the concrete topology τ0 on S0 that will make the measurability constraint multifunction M (see (9.29)) s-lsc and w-closed. In Section 9.1, we defined and discussed the ds -metric topology on S0 (the topology of pointwise convergence). PROPOSITION 9.3.8 If on S0 we consider the ds -metric topology, then the map (f, Σ0 ) −→ E(f |Σ0 ) is w × ds -to-w continuous. PROOF: Recall that the weak topology on E0 is metrizable. So we can work with w×d sequences. Suppose (fn , Σn ) −→s (f, Σ). For every h ∈ L∞ (Ω), we have
∫_Ω E(fn|Σn)h dµ = ∫_Ω fn E(h|Σn) dµ.
Because L∞(Ω) ⊆ L1(Ω), we have E(h|Σn) −→ E(h|Σ) in L1(Ω). Also note that because X is compact and fn −→ f weakly in L1(Ω), we also have that fn −→ f weakly-* in L∞(Ω). Then
∫_Ω fn E(h|Σn) dµ −→ ∫_Ω f E(h|Σ) dµ = ∫_Ω E(f|Σ)h dµ,
⇒ ∫_Ω E(fn|Σn)h dµ −→ ∫_Ω E(f|Σ)h dµ.
Because h ∈ L∞(Ω) was arbitrary, we conclude that E(fn|Σn) −→ E(f|Σ) weakly
PROPOSITION 9.3.9 If on S0 we consider the ds -metric topology, then M : S0 −→ 2E0 has nonempty, w-compact, convex values and it is ds -to-s lower semicontinuous and Gr M is ds ×w-closed. PROOF: First we show the ds -to-s lower semicontinuity. To this end suppose ds Σn −→ Σ. It suffices to show that M (Σ) ⊆ s- lim inf M (Σn ).
(9.32)
So let g ∈ M (Σ). Then gn = E(g|Σn ) ∈ M (Σn ) for all n ≥ 1 and gn −→ g in L1 (Ω). This proves (9.32) and so we have the s-lower semicontinuity of M . Next we show that Gr M is ds × w-closed. To this end let {(Σn , gn )}n≥1 ⊆ Gr M d
w
s and Σn −→ Σ, gn −→ g in L1 (Ω). From Proposition 9.3.8 we have
w
gn = E(gn |Σn ) −→ E(g|Σ)
in L1 (Ω),
⇒ g = E(g|Σ) (i.e., g ∈ M (Σ)). Therefore Gr M is ds × w-closed.
PROPOSITION 9.3.10 If (P, τ ) is a Hausdorff topological space, S0 is endowed with the ds -metric topology, and B : P −→ 2E0 \{∅} has a graph that is τ × w-closed, then ξ : P ×S0 −→ 2E0 \{∅} has a graph that is τ ×ds × w-closed. DEFINITION 9.3.11 We say that the constraint multifunction B : P −→ 2E0 \{∅} is state independent if, for all p ∈ P, all g ∈ B(p), and all Σ0 ∈ S0 , we have E(g|Σ0 ) ∈ B(p). REMARK 9.3.12 One way to interpret the above condition is to say that the economic constraint multifunction B does not reveal information. Another interpretation is to say that information is not necessary in order to satisfy the constraint. There are economic models in which this condition is not satisfied (e.g., the microeconomic rational expectations general equilibrium model).
672
9 Uncertainty, Information, Decision Making
PROPOSITION 9.3.13 If (P, τ ) is a Hausdorff topological space, S0 is endowed with the ds -metric topology, and B : P −→ 2E0 \{∅} is τ -to-s lower semicontinuous, then ξ : P ×S0 −→ 2E0 \{∅} is τ ×ds -to-s lower semicontinuous. PROOF: First we show that
ξ(p, Σ0 ) = E B(p)|Σ0
for all (p, Σ0 ) ∈ P × S0 .
(9.33)
Let g ∈ ξ(p, Σ0 ). Then from (9.30) we see that g ∈ B(p) and because g is Σ0 -measurable, we have g = E(g|Σ0 ). Therefore g ∈ E B(p)|Σ0 and we have proved
(9.34) ξ(p, Σ0 ) ⊆ E B(p)|Σ0 .
On the other hand, if g ∈ E B(p)|Σ0 , then we can find h ∈ B(p) such that g = E(h|Σ0 ). Because B is by hypothesis state-independent, we have g ∈ B(p). Because g is Σ0 -measurable we have g ∈ M (Σ0 ), hence g ∈ B(p) ∩ M (Σ0 ) = ξ(p, Σ0 ). Therefore we have proved that
E B(p)|Σ0 ⊆ ξ(p, Σ0 ). (9.35) From (9.34) and (9.35), we conclude that (9.33) holds. Let η1 : P ×S0 −→ E0 ×S0 be defined by η1 (p, Σ0 ) = B(p) × {Σ0 }. Evidently this is τ × ds -to-s × ds lower semicontinuous. Also let η2 : E0 × S0 −→ E0 be defined by η2 (g, Σ0 ) = E(g|Σ0 ). We know that η2 is s × ds -to-s continuous (see Proposition 9.3.8). From (9.33) we see that ξ = η2 ◦ η 1 , ⇒ ξ
is τ × ds -to-s lower semicontinuous.
REMARK 9.3.14 The result remains true if instead we assume that the economic constraint multifunction B : P −→ 2E0 \ {∅} is τ -to-w lower semicontinuous (see Proposition 9.3.8). The above results yield the following theorem on the continuity of the solution multifunction ξ(p, Σ0 ). THEOREM 9.3.15 If (P, τ ) is a Hausdorff topological space, on S0 we consider the ds -metric topology, and (i) B : P −→ 2E0 \{∅} is τ -to-s lower semicontinuous and Gr B is τ × w-closed, (ii) B is state-independent (see Definition 9.3.11),
9.4 Convergence of σ-Fields and Prediction Sequences
673
(iii) is s-lower semicontinuous and w-upper semicontinuous, then the solution multifunction S : P ×S0 −→ 2E0 has nonempty, w-compact values, and Gr S is τ ×ds ×w-closed. PROOF: From Propositions 9.3.10 and 9.3.13 we have that (p, Σ0 ) −→ ξ(p, Σ0 ) is s-lsc and w-closed and clearly has nonempty values. These facts in conjunction with hypothesis (iii) permit the use of Proposition 9.3.6, from which we get the conclusion of the theorem.
9.4 Convergence of σ-Fields and Prediction Sequences In this section motivated from the work in Section 9.1 and in particular from Definition 9.1.8(b), we define still another mode of convergence of sub-σ-fields, which turns out to be suitable for the convergence of prediction sequences. So let (Ω, Σ, µ) be a complete probability space. To simplify our considerations throughout this section we assume that all sub-σ-fields considered are µ-complete. DEFINITION 9.4.1 Let {Σn }n≥1 be a sequence of sub-σ-fields of Σ (a) µ- lim inf Σn is the sub-σ-field Σ0 such that for every f ∈ L∞ (Ω), we have n→∞
E(f |Σ0 )1 ≤ lim inf E(f |Σn )1 .
(9.36)
n→∞
and Σ0 is maximal among all sub-σ-fields satisfying (9.36), namely if Σ0 is a sub-σ-field for which (9.36) is true (with Σ0 replaced by Σ0 ), then Σ0 ⊆ Σ0 . (b) µ- lim sup Σn is the sub-σ-field Σ0 such that for every f ∈ L∞ (Ω), we have n→∞
lim sup E(f |Σn )1 ≤ E(f |Σ0 )1 .
(9.37)
n→∞
and Σ0 is minimal among all sub-σ-fields satisfying (9.37), namely if Σ0 is a sub-σ-field for which (9.37) is true (with Σ0 replaced by Σ0 ), then Σ0 ⊆ Σ0 . Immediately from this definition, we have the following. PROPOSITION 9.4.2 If {Σn }n≥1 ∈ S0 , then we always have µ- lim inf Σn ⊆ µ- lim sup Σn .
n→∞
n→∞
REMARK 9.4.3 The inclusion in the above result can be strict. PROPOSITION 9.4.4 If {Σn }n≥1 ∈ S0 , then µ- lim inf Σn = n→∞ lim inf µ(A C) : C ∈ Σn = 0 .
n→∞
A ∈ Σ :
674
9 Uncertainty, Information, Decision Making
PROOF: In what follows Σl = A ∈ Σ : lim inf µ(A C) : C ∈ Σn = 0 . n→∞
For any sets A1 , A2 , C1 , C2 ∈ Σ we have
µ (A1 ∩ A2 ) (C1 ∩ C2 ) ≤ µ(A1 C1 ) + µ(A2 C2 )
and µ (A1 ∪ A2 ) (C1 ∩ C2 ) ≤ µ(A1 C1 ) + µ(A2 C2 ).
(9.38) (9.39)
From (9.38) and (9.39), we see that A1 , A2 ∈ Σl ⇒ A1 ∪ A2 ∈ Σl
and
A1 ∩ A2 ∈ Σl .
(9.40)
Suppose that {Cn }n≥1 ⊆ Σl and set C=
!
Cn ,
Cm =
n≥1
m !
Cn .
n=1
Given ε > 0, we can find m = m(ε) ≥ 1 such that ε µ(C Cm ) < . 2 Because of (9.40), we have that Cm ∈ Σl . So from the definition of Σl , we can find n0 ≥ 1 and Dn ∈ Σn such that for n ≥ n0 we have ε µ(Dn Cm ) < 2 ⇒ inf µ(C Cn ) : Cn ∈ Σn ≤ µ(C Cn ) ≤ (C Cn ) + µ(Cm Cn ) < ε for n ≥ n0 ⇒ Σl
is closed under countable unions.
Clearly Σl is closed under complementation and so Σl is a σ-field. It is easy to see that any set A ∈ Σl satisfies
1 1 − µ(A|Σn )dµ = − µ(A|Σl )dµ lim n→∞ Ω 2 2 Ω
(9.41)
and so from (9.41), it follows that for every f ∈ L∞ (Ω), we have E(f |Σl )1 ≤ lim inf E(f |Σn )1 . n→∞
(9.42)
Next we show that if Σl ∈ S0 also satisfies (9.42), then Σl ⊆ Σl . Let C ∈ Σl and take f = 12 − χC in (9.42). We have
1 1 1 − χC dµ = − µ(C|Σl )dµ = 2 2 2 Ω Ω
1 − µ(C|Σn )dµ ≤ lim inf n→∞ 2
Ω 1 − µ(C|Σn )dµ ≤ 1 ≤ lim sup 2 n→∞ Ω 2
1 1 − µ(C|Σn )dµ = , ⇒ lim n→∞ Ω 2 2 ⇒ lim inf µ(C Cn ) : Cn ∈ Σn = 0, n→∞
⇒ Σl ⊆ Σl .
9.4 Convergence of σ-Fields and Prediction Sequences
675
Therefore we conclude that Σl = µ- lim inf Σn . n→∞
Note that {L2 (Ω, Σn )}n≥1 is a sequence of closed subspaces of L2 (Ω). So we can define s- lim inf L2 (Ω, Σn ) and w- lim inf L2 (Ω, Σn ) in the sense of Definition 6.6.3. n→∞
n→∞
THEOREM 9.4.5 If {Σn }n≥1 ⊆ S0 , then s-lim inf L2 (Ω, Σn )= L2 (Ω, µ- lim inf Σn ). n→∞
n→∞
PROOF: First we show that we can find Σ ∈ S0 such that s- lim inf L2 (Ω, Σn ) = L2 (Ω, Σ). n→∞
To do this, we need to show that it is a lattice and it contains the constant functions (see e.g., Schaefer [539, p. 210]). So let f, g ∈ s- lim inf L2 (Ω, Σn ). By n→∞
virtue of Definition 6.6.3 we can find sequences {fn }n≥1 , {gn }n≥1 ⊆ L2 (Ω, Σn ) such that fn −→ f and gn −→ g in L2 (Ω) as n → ∞. Let ∨ denote the maximum operation and recall that for a, b, c, d ∈ R, we have (a ∨ b − c ∨ d)2 ≤ (a − c)2 + (b − d)2 .
(9.43)
Using (9.43), we obtain fn ∨ gn − f ∨ g22 ≤ fn − f 22 + gn − g22 −→ 0
as n → ∞,
⇒ f ∨ g ∈ s- lim inf L2 (Ω, Σn ).
(9.44)
n→∞
Because fn ∧ gn = fn ∨ gn − |fn − gn |, we see that f ∧ g ∈ s- lim inf L2 (Ω, Σn ). n→∞
(9.45)
From (9.44) and (9.45), it follows that s- lim inf L2 (Ω, Σn ) is indeed a lattice. Clearly n→∞
it contains the constant functions. So as we already said, we can find Σ ∈ S0 such that s- lim inf L2 (Ω, Σn ) = L2 (Ω, Σ). n→∞
Let A ∈ Σ. Then we can find {fn }n≥1 ⊆ L2 (Ω, Σn ) such that fn −→ χA in 2 L (Ω) and of course in probability. Set 1 An = ω ∈ Ω : fn (ω) ≥ . 2 We have
An A = ω ∈ Ω : χA (ω) = 0
1 2
1 or χA (ω) = 1 and fn (ω) < 2 1 for all n ≥ 1, ⊆ ω ∈ Ω : χA (ω) − fn (ω) ≥ 2 ⇒ µ(An A) −→ 0 as n → ∞, ⇒ A ∈ µ- lim inf Σn n→∞
and
fn (ω) ≥
676
9 Uncertainty, Information, Decision Making
(because An ∈ Σn for every n ≥ 1, see Proposition 9.4.4). Now suppose that A ∈ µ- lim inf n→∞ Σn . Then by Proposition 9.4.4 we can find An ∈ Σn , n ≥ 1, such that µ(An A) −→ 0. Then we have
χAn − χA 22 = |χAn − χA |2 dµ = χAn A dµ = µ(An A) −→ 0, Ω
Ω
⇒ χA ∈ s- lim inf L2 (Ω, Σn ), n→∞
⇒ A ∈ Σ. To further characterize µ- lim inf Σn we need the following general result about n→∞
the Mosco convergence of closed subspaces of a Hilbert space H. PROPOSITION 9.4.6 If H is a Hilbert space, and {Hn }n≥1 is a sequence of closed subspaces of H, then (a) s- lim inf Hn is the maximum among all closed subspaces H of H such that n→∞
pH (x) ≤ lim inf pHn (x) n→∞
for all x ∈ H.
(9.46)
(by pH we denote the orthogonal projection on a closed subspace H of H). (b) w- lim sup Hn is the minimum subspace among subspaces H of H such that n→∞
lim sup pHn (x) ≤ pH (x) n→∞
for all x ∈ H.
(9.47)
PROOF: (a) Let H = s- lim inf Hn . We have n→∞
pHn (x) = pHn pH (x) + pHn (1 − pH )(x)2 = pHn pH (x), pH (x) + pHn pH (x), (1 − pH )(x) + (1 − pH )(x), pHn pH (x) + pHn (1 − pH )(x)2 ≥ pHn pH (x), pH (x) + pHn pH (x), (1 − pH )(x) + (1 − pH )(x), pHn pH (x) −→ pH (x)2 2
(recall pHn (x) −→ x for all x ∈ H). Therefore s- lim inf Hn satisfies (9.36). On the other hand, let H be any closed subspace of H satisfying (9.36). We have pH (x)2 ≤ lim inf pHn pH (x)2 ≤ lim sup pHn pH (x)2 ≤ pH (x) n→∞
n→∞
and
pHn pH (x) −→ pH (x)
as n → ∞.
Hence we have pH (x) − pHn pH (x)2 = pH (x)2 − pH (x), pHn pH (x) − pHn pH (x), pH (x) + pHn pH (x), pH (x) = pH (x)2 − pHn pH (x)2 −→ 0
as n → ∞.
9.4 Convergence of σ-Fields and Prediction Sequences
677
(b) Let H = w- lim sup Hn . Then we have n→∞
(H )⊥ = s- lim inf Hn⊥ n→∞
and from (a) it follows that (1 − pH )(x)2 ≤ lim inf (1 − pHn )(x)2 n→∞
for any x ∈ H,
⇒ x2 − pH (x)2 ≤ x2 − lim sup pHn (x)2 , n→∞ 2
⇒ lim sup pHn (x) ≤ pH (x) 2
n→∞
for all x ∈ H.
On the other hand, let H be a closed subspace of H satisfying (9.47). By following the previous argument backwards, we obtain w- lim sup Hn ⊆ H . n→∞
Using this proposition we can further characterize µ- lim inf Σn . n→∞
THEOREM 9.4.7 If {Σn }n≥1 ⊆ S0 , then µ- lim inf Σn is the maximum among all n→∞
the sub-σ-fields Σ of Σ such that E(f |Σ )2 ≤ lim inf E(f |Σn )2 n→∞
for all f ∈ L2 (Ω).
(9.48)
PROOF: Recall that the conditional expectation E(·|Σn ) is the orthogonal projection of L2 (Ω) onto L2 (Ω, Σn ). From Theorem 9.4.5, we see that E(·|µ- lim inf Σn ) is n→∞
the orthogonal projection of L2 (Ω) onto s- lim inf L2 (Ω, Σn ). So by virtue of Propon→∞
sition 9.4.6, we have
E(f |s- lim inf Σn )2 ≤ lim inf E(f |Σn )2 . n→∞
n→∞
On the other hand if Σ ∈ S0 satisfies (9.48), then from Proposition 9.4.6 and Theorem 9.4.5, we have L2 (Ω, Σ ) ⊆ s- lim inf L2 (Ω, µ- lim inf Σn ), n→∞
n→∞
⇒ Σ ⊆ s- lim inf Σn . n→∞
THEOREM 9.4.8 If {Σn }n≥1 ⊆ S0 , then for any f ∈ L1 (Ω) the following statements are equivalent. (a) f is µ- lim inf Σn -measurable. n→∞
(b) E(f |Σn ) − f 1 −→ 0.
678
9 Uncertainty, Information, Decision Making
PROOF: (a)⇒(b): Given ε > 0 we can find g ∈ L2 (Ω, µ- lim inf Σn ) such that n→∞
ε f − g1 < . 3 We have E(g|Σn ) − g2 −→ 0
as n → ∞ (see Theorem 9.4.5).
So we can find n0 = n0 (ε) ≥ 1 such that E(g|Σn ) − g2 <
ε 3
for all n ≥ n0 .
Hence we have E(f |Σn ) − f 1 ≤ E(f |Σn ) − E(g|Σn )1 + E(g|Σn ) − g1 + g − f 1 ≤ 2f − g1 + E(g|Σn ) − g2 (because · 1 ≤ · 2 ) ε 2ε + =ε for all n ≥ n0 . < 3 3 Therefore we conclude that E(g|Σn ) − f 1 −→ 0
as n → ∞.
(b)⇒(a): Let V = f ∈ L1 (Ω) : E(g|Σn ) − f 1 −→ 0 as n → ∞ . Let f, g ∈ V . Then as above (see the proof of Theorem 9.4.5), we have E(f |Σn ) ∨ E(g|Σn ) − f ∨ g1 −→ 0
as n → ∞.
So we have E(f ∨ g|Σn ) − f ∨ g1 ≤ E(f ∨ g|Σn ) − E(f |Σn ) ∨ E(g|Σn )1 + E(f |Σn ) ∨ E(g|Σn ) − f ∨ g1
= E f ∨ g − E(f |Σn ) ∨ E(g|Σn )|Σn 1 + E(f |Σn ) ∨ E(g|Σn ) − f ∨ g1 ≤ 2E(f |Σn ) ∨ E(g|Σn ) − f ∨ g1 −→ 0
as n → ∞,
⇒ f ∨ g ∈ V. Similarly we show that f ∧ g ∈ V . Also note that V is a linear subspace, which is closed and contains the constant functions. Therefore we can find Σ ∈ S0 such that V = L1 (Ω, Σ ). For every f ∈ L∞ (Ω, Σ ), we have E(f |Σn ) − f 22 ≤ E(f |Σn ) − f ∞ E(f |Σn ) − f 1 ≤ 2f ∞ E(f |Σn ) − f 1 ⇒ f ∈ L (Ω, µ- lim inf Σn ), 2
n→∞
⇒ L∞ (Ω, Σ ) ⊆ L2 (Ω, µ- lim inf Σn ). n→∞
as n → ∞
9.4 Convergence of σ-Fields and Prediction Sequences
679
Taking the L1 -closure of both sides, we conclude that V ⊆ L1 (Ω, µ- lim inf Σn ). n→∞
Let {Σn }n≥1 be a sequence in S0 and define S0p = Σ ∈ S0 : lim sup E(f |Σn )p ≤ E(f |Σ )p n→∞ for all f ∈ L∞ (Ω) , p = 1, 2.
LEMMA 9.4.9 If Σ ∈ S01 ∪S02 then lim supE E(f |Σ )|Σn p = lim supE(f |Σn )p n→∞
for every f ∈ L∞ (Ω).
n→∞
PROOF: Let Σ ∈ S01 and f ∈ L∞ (Ω). We have
lim sup E E(f |Σ ) − f |Σn 1 ≤ E E(f |Σ ) − f |Σ 1 = 0, n→∞
⇒ lim sup E E(f |Σ )|Σn 1 − lim sup E(f |Σn )1 n→∞ n→∞
≤ lim sup E E(f |Σ )|Σn 1 = 0. n→∞
Also we have
lim sup E E(f |Σ ) − f |Σn 22 n→∞
≤ lim sup E E(f |Σ ) − f |Σn ∞ E E(f |Σ ) − f |Σn 1 n→∞
≤ E(f |Σ ) − f ∞ lim sup E E(f |Σ ) − f |Σn 1 = 0. n→∞
Therefore
lim sup E E(f |Σ )|Σn 22 − lim sup E(f |Σn )2 n→∞
n→∞
≤ lim sup E E(f |Σ )|Σn − E(f |Σn )2 = 0. n→∞
The proof for Σ ∈ S02 is done similarly.
LEMMA 9.4.10 If Σ ∈ S01 and Σ ∈ S02 , then Σ ∩ Σ ∈ S01 ∩ S02 . PROOF: To simplify the notation, we set E =E(·|Σ ) and E =E(·|Σ ). We claim that lim sup E(f |Σn )1 ≤ (E E )m f 1 n→∞
for every f ∈ L∞ (Ω), m ≥ 1.
(9.49)
Note that (9.49) is true for m = 1. Suppose it is true for m = k. Then using Lemma 9.4.9, we obtain lim sup E(f |Σn )1 = lim sup E(E E f |Σn )1 n→∞
n→∞
≤ (E E )k E E f 1 = (E E )k+1 f 1 .
680
9 Uncertainty, Information, Decision Making
So (9.49) is also true for m = k + 1 and by induction we have proved it for all m ≥ 1. Passing to the limit as m −→ ∞ in (9.49), we obtain lim sup E(f |Σn )1 ≤ E(f |Σ ∩ Σ )1 . n→∞
Similarly we prove (9.49) for · 2 . Using the two lemmata we can characterize µ- lim supn→∞ Σn .
THEOREM 9.4.11 If {Σn }n≥1 ⊆ S0 , then µ- lim sup Σn is the minimum sub σn→∞
field Σ such that lim sup E(f |Σn )2 ≤ E(f |Σ )2 n→∞
for all f ∈ L2 (Ω).
(9.50)
PROOF: By a density argument, we see that (9.50) is satisfied for every f ∈ L2 (Ω) if and only if it is satisfied for every f ∈ L∞ (Ω). Also by Lemma 9.4.10, we have S0 = S02 . This proves the theorem. Using this theorem together with Proposition 9.4.6(b), we obtain the following. THEOREM 9.4.12 If {Σn }n≥1 ⊆ S0 , then the closed linear lattice generated by w- lim sup L2 (Ω, Σn ) is equal to L2 (Ω, µ- lim sup Σn ). n→∞
n→∞
REMARK 9.4.13 The above results can be extended to general Lp -spaces (1 ≤ p < ∞).
9.5 Games with Incomplete Information The specification of a game with incomplete information provides a very general framework for the analysis of noncooperative decision making under uncertainty. One can describe such an n-player game with general hypotheses. So let (Ω, Σ, µ) be a probability space, with Ω a Polish space, Σ = B(Ω) = the Borel σ-field of Ω, and µ the probability measure. This probability space models the uncertainty in the system. There are n-players to whom correspond n-action spaces Xk , k = 1, . . . , n. We assume that each Xk is a compact metric space. Also there are n-random variables fk on Ω, k = 1, . . . , n, which we assume are bounded and measurable. Each player k ∈ {1, . . . , n} before choosing his action, consults his private information random variable fk , which leads him to an assessment of the state of nature ω ∈ Ω. This then dictates his choice of action from the action space Xk . As soon as all these are finalized, he receives a reward uk (ω, x1 , . . . , xn ). In this section we investigate the existence of Nash equilibrium for a two-person, zero-sum game with incomplete information. Our hypotheses on the data are minimal and in particular for the utility (payoff) function u(ω, x1 , x2 ), we only assume that it is bounded, measurable, and continuous in xk , k = 1, 2, separately for each fixed ω ∈ Ω. The hypotheses on the model do not include a continuity of information condition, which typically requires the joint distribution of the information available
9.5 Games with Incomplete Information
681
to the players be absolutely continuous with respect to a product measure on the product of the spaces of types. In general such a condition is difficult to verify. Now let us give the precise mathematical description of the game denoted by Γ. The game Γ is determined by the following mathematical items. (1) The compact metric spaces X and Y representing the action spaces of the two players. (2) Two Polish spaces Ω1 and Ω2 called the types of the two players. (3) Another Polish space Ω0 called the type of nature. We set Ω = Ω0 × Ω1 × Ω2 . (4) There is a probability measure µ on Ω, B(Ω) ; the probability space
Ω, B(Ω), µ represents the information structure of the game. (5) There is a utility (payoff) function u : Ω × X × Y −→ R which is assumed to be jointly Borel-measurable. The game begins with nature choosing a vector ω = (ωk )2k=0 ∈ Ω according to the probability µ. Then player I observes ω1 ∈ Ω1 and based on that, chooses her action x ∈ X, whereas player II observes ω2 and chooses action y ∈ Y . At the end player II pays player I the amount u(ω, x, y) (zero-sum game). DEFINITION 9.5.1 (a) A distributional strategy σ for player I is a probability measure on Ω1 × X such that the marginal probability of σ on Ω1 is equal to the marginal probability of µ on Ω1 (recall that for a measure ν on a product space S × T , the marginals νS and νT on S and T , respectively, are defined by νS (A) = ν(A × T ) for all A ∈ Σ(S) and by νT (C) = ν(S × C) for all C ∈ Σ(T ). (b) A distributional strategy τ for player II, is a probability measure on Ω2 × Y such that the marginal probability of τ on Ω2 is equal to the marginal probability of µ on Ω2 . REMARK 9.5.2 Informally, the conditional probability σ(x|ω1 ) represents the probability that player I will choose action x ∈ X, provided that she observes the type ω1 ∈ Ω1 . Similarly for the conditional τ (y|ω2 ). DEFINITION 9.5.3 The expected payoff of the game Γ is given by
u(ω, x, y)σ(dx|ω1 )τ (dy|ω2 )dµ U (σ, τ ) = Ω×X×Y
when player I uses the distributional strategy σ and player II uses the distributional strategy τ . An equilibrium point of game Γ is defined in the usual way. DEFINITION 9.5.4 An equilibrium of game Γ is a saddle point of the expected payoff, namely a pair (σ, τ ) of distributional strategies for the two players such that U (σ, τ ) = sup U (σ, τ ) = inf U (σ, τ ). σ
τ
Our goal is to establish the existence of an equilibrium for game Γ. To do this we need some auxiliary results.
682
9 Uncertainty, Information, Decision Making
PROPOSITION 9.5.5 If (Ω, Σ, µ) is a probability space, with Ω a Polish space, Σ = B(Ω) = the Borel σ-field of Ω, and µ the probability measure, Z is a compact metric space, and h : Ω × Z −→ R is a Carath´ eodory function (i.e., for all z ∈ Z, ω −→ h(ω, z) is Borel-measurable and for all ω ∈ Ω, z −→ h(ω, z) is continuous) which is bounded, then given ε > 0 we can find a set Dε ∈ Σ with µ(Ω\Dε ) < ε such that {h(ω, ·) : ω ∈ Dε } is equicontinuous. PROOF: Recall that µ is tight. Therefore we can apply Theorem 6.2.9 (the ScorzaDragoni theorem) and obtain Dε ⊆ Ω compact subset such that µ(Ω \ Dε ) < ε and hD ×Z is continuous. Therefore it follows at once that {h(ω, ·) : ω ∈ Dε } is ε equicontinuous. Keeping the setting of the previous proposition, we define 1 1 M+,µ (Ω × Z) = {ϑ ∈ M+ (Ω × Z) : ϑΩ = µ}; 1 (Ω × Z) are all probability measures on Ω × Z, such that Ωthat is, M+,µ
their 1 1 marginal ϑΩ equals µ. Clearly M (Ω×Z) inherits the weak topology w M (Ω× +,µ +,µ 1 Z), Cb (Ω × Z) from the space M+ (Ω × Z). As in Proposition 9.5.5, we consider a 1 Carath´eodory function h : Ω × Z −→ R and then define H : M+,µ (Ω × Z) −→ R by
h(ω, z)dϑ. H(ϑ) = Ω×Z 1 (Ω×Z) −→ R is weakly continuous. PROPOSITION 9.5.6 The function H : M+,µ w
1 PROOF: Let ϑn −→ ϑ in M+,µ (Ω × Z). By virtue of Theorem 6.2.9 we can find Dε ⊆ Ω compact with µ(Ω \ Dε ) < ε such that hD ×Z is continuous and so bounded. ε We have
hdϑn = hdϑn + hdϑn (9.51) Dε ×Z
Ω×Z
and
(Ω\Dε )×Z
hdϑn −→
Dε ×Z
hdϑ Dε ×Z
as n → ∞.
(9.52)
Also, if ξ > 0 is the bound of h (recall that by hypothesis h is bounded), then
hdϑn ≤ |h|dϑn ≤ ξε for all n ≥ 1. (9.53) (Ω\Dε )×Z
(Ω\Dε )×Z
So, if we pass to the limit as n → ∞ in (9.51) and we use (9.52) and (9.53), we obtain
hdϑ − ξε ≤ lim inf hdϑn ≤ lim sup hdϑn ≤ hdϑ + ξε. (9.54) Dε ×Z
n→∞
Ω×Z
n→∞
Ω×Z
Dε ×Z
Because ε > 0 was arbitrary we let ε ↓ 0. From (9.54) we infer that
9.5 Games with Incomplete Information
hdϑn −→ hdϑ,
Ω×Z
683
Ω×Z
⇒ H is weakly continuous. Now we are ready to prove an equilibrium theorem for game Γ. THEOREM 9.5.7 If the game Γ is as above and in addition the payoff function u : Ω×X ×Y −→ R is bounded and (i) For all (ω, y) ∈ Ω × Y, x −→ u(ω, x, y) is continuous; (ii) For all (ω, x) ∈ Ω × X, y −→ u(ω, x, y) is continuous, then the game Γ has an equilibrium. PROOF: First we show that the expected payoff function U (ε, τ ) is weakly continuous in each variable separately. For fixed ω1 ∈ Ω1 and distributional strategy τ ∗ for player II, we define the Borel measure ηω 1 on Ω0 × Ω2 × Y by
ηω 1 (A) = 1τ ∗ (dy|ω2 )µ d(ω, ω2 )|ω1 . A
We define H : Ω1 ×X −→ R by
H(ω1 , x) =
u(ω, x, y)dηω 1 .
Ω0 ×Ω2 ×Y
Claim: For every fixed ω1 ∈ Ω1 , x −→ H(ω1 , x) is continuous on X. By virtue of Proposition 9.5.5, for every n ≥ 1, we can find En ⊆ Ω0 × Ω2 × Y a Borel set with ηω 1 (En ) > 1 − (1/n) such that u(ω, ω1 , ω2 , ·, y) : (ω, ω2 , y) ∈ En is equicontinuous. Define Hn : X −→ R by
Hn (x) = u(ω, ω1 , ω2 , x, y)dηω 1 . En
Evidently Hn is continuous. Because u is bounded and ηω 1 (En ) > 1 − (1/n), it follows that Hn −→ H(ω1 , ·) uniformly, ⇒ H(ω1 , ·) is continuous.
Next let J(σ) =
H(ω1 , x)dσ.
Ω1 ×X
From Proposition 9.5.6 we have that J is weakly continuous. We have
684
9 Uncertainty, Information, Decision Making
u(ω, x, y)dηω1 dσ J(σ) = Ω ×X Ω0 ×Ω2 ×X
1 = u(ω, x, y)σ(dx|ω1 )τ ∗ (dy|ω2 )dµ Ω×X×Y
= U (σ, τ ∗ ). Therefore U (σ, τ ) is weakly continuous separately on each variable. Now observe that the sets of distributional strategies are convex and weakly compact. So we can apply Theorem 2.3.13 and produce a saddle point (σ, τ ) of U , namely U (σ, τ ) = inf sup U (σ, τ ) = sup inf U (σ, τ ). τ
σ
σ
τ
This pair (σ, τ ) is the desired equilibrium of game Γ.
REMARK 9.5.8 It is interesting to extend this result to the case of nonzero-sum games.
9.6 Markov Decision Chains with Unbounded Costs In this section we consider a general state space Markov decision chain (MDC), with finite action sets and possibly unbounded cost. We prove the existence of optimal stationary discounted policies. So let (Ω, Σ) be a measurable space for which singletons belong to Σ. Also X denotes the action set that is assumed to be countable. Given a state ω ∈ Ω and an action x ∈ X, the next state is chosen according to a transition probability (stochastic kernel), p(·|ω, x) defined on Σ. So for every (ω, x) ∈ Ω × X, we have: (a) p(·|ω, x) is a probability measure on Σ. (b) For every A ∈ Σ, p(A|·, ·) : Ω × X −→ [0, 1] is (Σ × 2X )-measurable. Let µ be a probability measure on Ω, known as the initial measure. Usually µ is the Dirac probability measure concentrated at the known initial state. A policy π is a rule for choosing actions, given the history of the process, and it may be randomized. So, if the policy is π, the process develops as follows. The initial state ω0 is chosen according to the distribution µ. Subsequently the initial action x0 is chosen according to the rule π. Then the next state ω1 is chosen according to the distribution p(·|ω0 , x0 ) and the process continues indefinitely. Of course the above description of a policy (strategy) π is informal. To rigorously define it, we proceed as follows. Let Ω0 , Ω1 , Ω2 , Ω3 , . . . be copies of the state space Ω and X0 , X1 , X2 , X3 , . . . be copies of the action set X. The history of the process up until time n ≥ 0, is defined inductively as follows. (a) H0 = Ω0 . (b) Given Ωn , we set Hn+1 = Hn Xn Ωn+1 . Clearly Hnrepresents the information available to the decision maker at stage 1 n ≥ 0. Let H= Hn and let M+ (X) be the space of all probability measures on X. n≥0
9.6 Markov Decision Chains with Unbounded Costs
685
DEFINITION 9.6.1 A policy (strategy) is a map π : H −→ M+ (X) such that for each n ≥ 0 and x ∈ X, hn −→ π H (hn )(x) is measurable on Hn with respect to the n
product σ-field Σ × 2X × Σ × · · · × Σ. REMARK 9.6.2 By the disintegration theorem (see, e.g., Ash " [30, p. 109]), there is a probability measure on the set of sample paths S = Ωk × Xk under the k≥0
policy π. Equip S with the σ-field Σ × 2X × Σ × 2X × . . . with initial measure µ on (Ω0 = Ω, Σ). Given (ω0 , x0 , . . . , ωn , xn ) we need a measure on Σ. This is the law of motion p(·|ωn , xn ). Also given (ω0 , x0 , . . . , ωn ) we need a measure on 2X . This measure is induced by the probability distribution π(ω0 , x0 , . . . , ωn ). Note that we do not assume that the state space is a Borel space (see Section 8.5). DEFINITION 9.6.3 A stationary policy is a measurable map g : Ω −→ X. REMARK 9.6.4 So whenever ωn = ω, the stationary policy πs chooses the action f (ω). For this reason the stationary policy πs is identified with f . We assume that a cost function is given, namely a function u : Ω × X −→ R+ = R+ ∪ {+∞}. We make the following hypothesis concerning u. H1 : u : Ω×X −→ R+ = R+ ∪ {+∞} is a measurable map such that for every ω ∈ Ω, the set x ∈ X : u(ω, x) < +∞ (9.55) is finite. REMARK 9.6.5 If an action x is infeasible, then u(ω, x) = +∞. With a slight abuse of notation, when a stationary policy g is employed, the cost function evaluated at the point ω is denoted by u(ω, g). There is a discount factor 0 < β < 1. Using it we can now define the total discounted cost under a policy π. For n ≥ 0, we define un : S −→ R = R+ ∪ {+∞} by un (ω0 , x0 , . . .) = u(ωn , xn ). Evidently these functions are measurable. We consider n β u(ωn , xn ). U (ω0 , x0 , . . .) = n≥0
Then U is measurable on S. Conditioned on ω0 = ω, we denote the conditional expectation by Vβ (π, ω). This quantity may be +∞ for some policies. Let Vβ (ω) = inf Vβ (π, ω) : π = policy . We make the following hypothesis. H2 : Vβ (ω) < +∞ for every (β, ω) ∈ (0, 1) × Ω.
686
9 Uncertainty, Information, Decision Making
REMARK 9.6.6 This hypothesis implies that given an initial state ω ∈ Ω, there must be at least one action x ∈ X such that u(ω, x) < +∞. Hence in hypothesis H1 , the set given by (9.55) is nonempty. As we already saw in Section 8.5, the proof of the existence of an optimal stationary policy depends on a measurable selection argument, which permits the realization of a stationary policy that achieves the minimum in the discounted optimality equation. The next proposition paves the way to obtain such a measurable selector. PROPOSITION 9.6.7 If hx : Ω −→ R+ = R measurable + ∪ {+∞}, x ∈ X, are functions such that for each ω ∈ Ω, the set x ∈ X : hx (ω) < +∞ is finite, h = min hx , the elements of the countable set X are ordered as x0 < x1 < x2 < · · · x∈X
and we define g : Ω −→ X by g(ω) =smallest x ∈ X such that hx (ω) = h(ω), then g is measurable. PROOF: Evidently h is measurable. We need to show that for every k ≥ 0, g −1 ({xk }) ∈ Σ. Note that k−1 #
#
i=0
i≥k+1
g −1 ({xk })=
{ω ∈ Ω : hxk (ω) < hxi (ω)} ∩
{ω ∈ Ω : hxk (ω) ≤ hxi (ω)},
⇒ g −1 ({xk }) ∈ Σ. If for some ω ∈ Ω, h(ω) = +∞, then the result also holds.
PROPOSITION 9.6.8 (a) If h : Ω × Ω −→ R+ is jointly measurable, then for every x ∈ X, ω −→ Ω h(ω, s)p(ds|ω, x) is measurable. (b) If h : Ω −→ R+ is Σ-measurable and g : Ω −→ X is a stationary policy, then ω −→ Ω h(s)p(ds|ω, g) is Σ-measurable. PROOF: (a) Let h(ω, s)=
N
k=1
λk χCk (ω)χDk (s) with Ck , Dk ∈Σ, k∈{1, . . . , N }. We
have
h(ω, s)p(ds|ω, x) = Ω
N
λk χCk (ω)p(Dk |ω, x),
k=1
⇒ ω −→
h(ω, s)p(ds|ω, x)
is Σ-measurable.
Ω
Because every measurable function h : Ω × Ω −→ R+ can be approximated by an increasing sequence of simple functions, we conclude that ω −→ Ω h(ω, s)p(ds|ω, x) is Σ-measurable. (b) Exploiting as above the approximation of a measurable function h : Ω×Ω −→ R+ by an increasing sequence of simple functions, it suffices to show that for each A ∈ Σ,
ω −→ p(A|ω, g(ω)) is Σ-measurable. To this end note that ξ_1 : Ω −→ Ω × X defined by
ξ_1(ω) = (ω, g(ω))
is (Σ, Σ × B(X))-measurable. Recall that ξ_2 : Ω × X −→ [0, 1] defined by
ξ_2(ω, x) = p(A|ω, x)
is Σ × B(X)-measurable. Therefore ξ_2 ◦ ξ_1 is Σ-measurable. But note that
(ξ_2 ◦ ξ_1)(ω) = p(A|ω, g(ω)).
DEFINITION 9.6.9 If π is a policy and (ω, x) ∈ Ω × X, then the shifted policy π(ω, x) is defined by π(ω, x)(ω_0, x_0, . . . , x_n) = π(ω, x, ω_0, x_0, . . . , x_n).
REMARK 9.6.10 According to this definition the decision made by the shifted policy π(ω, x) at the nth stage is the same as the decision made by π at stage n + 1, given that the initial state and the initial decision are (ω, x) ∈ Ω × X.
PROPOSITION 9.6.11 For every policy π, the value function ω −→ V_β(π, ω) is
Σ-measurable. Moreover, for every fixed x ∈ X, (ω, s) −→ V_β(π(ω, x), s) is Σ × Σ-measurable on Ω × Ω.
PROOF: To prove the first part of the proposition, it suffices to show that for all k ≥ 0, the function ω −→ E_π(u_k|ω_0 = ω) is Σ-measurable. We have
E_π(u_k|ω_0 = ω) = \sum_{x∈X} π(ω)(x) \int_Ω \sum_{x_1∈X} π(ω, x, ω_1)(x_1)\, p(dω_1|ω, x) \cdots \int_Ω \sum_{x_{k-1}∈X} π(ω, x, ω_1, . . . , ω_{k-1})(x_{k-1})\, p(dω_{k-1}|ω_{k-2}, x_{k-2}) \int_Ω \sum_{x_k∈X} u(ω_k, x_k)\, π(ω, x, ω_1, . . . , ω_k)(x_k)\, p(dω_k|ω_{k-1}, x_{k-1}).
From Definition 9.6.1 and Proposition 9.6.8, we see that the first part of the proposition holds. Similarly we deduce the second part of the proposition, by establishing in a similar fashion the joint measurability of (ω, s) −→ E_{π(ω,x)}(u_k|s_0 = s). Now we can prove a theorem on the existence of a β-discounted optimal stationary policy.
THEOREM 9.6.12 If hypotheses H_1 and H_2 hold, then the finite value function ω −→ V_β(ω) is the minimal nonnegative measurable solution of the optimality functional equation
V_β(ω) = \min_{x∈X} \Big[ u(ω, x) + β \int_Ω V_β(s)\, p(ds|ω, x) \Big], \quad ω ∈ Ω.   (9.56)
If the minimum in (9.56) is realized at g(ω), then ω −→ g(ω) is a stationary optimal policy.
PROOF: For each n ≥ 0, we define inductively a sequence of functions V_n : Ω −→ \overline{R}_+ = R_+ ∪ {+∞} as follows: V_0 = 0 and
V_n(ω) = \min_{x∈X} \Big[ u(ω, x) + β \int_Ω V_{n-1}(s)\, p(ds|ω, x) \Big], \quad ω ∈ Ω, \; n ≥ 1.   (9.57)
We show the following.
• ω −→ V_n(ω) is Σ-measurable.   (9.58)
• {V_n}_{n≥0} is increasing.   (9.59)
• V_n ≤ V_β(π, ·) for all n ≥ 0 and all policies π.   (9.60)
Clearly (9.58) follows from (9.57), Proposition 9.6.8(a), and the fact that X is countable. Also (9.59) follows immediately from the inductive definition in (9.57). To prove (9.60), let V_n(π, ω) be the expected discounted cost if we follow policy π for n steps and then the process is terminated with terminal cost zero. We claim
V_n ≤ V_n(π, ·) for all n ≥ 0.   (9.61)
Evidently (9.61) implies (9.60). So we need to prove (9.61). We have
V_1(ω) ≤ \sum_{x∈X} u(ω, x)\, π(ω)(x) = V_1(π, ω).
Suppose (9.61) is true for n − 1. We can easily check that
V_n(π, ω) = \sum_{x∈X} π(ω)(x) \Big[ u(ω, x) + β \int_Ω V_{n-1}(π(ω, x), s)\, p(ds|ω, x) \Big],
with π(ω, x) being the shifted policy (see Definition 9.6.9). From Proposition 9.6.11 we know that (ω, s) −→ V_{n-1}(π(ω, x), s) is Σ × Σ-measurable on Ω × Ω and so Proposition 9.6.8(a) implies that V_n(π, ·) is Σ-measurable. Then by the induction hypothesis, for every x ∈ X,
V_{n-1}(s) ≤ V_{n-1}(π(ω, x), s)
⇒ V_n(ω) ≤ u(ω, x) + β \int_Ω V_{n-1}(π(ω, x), s)\, p(ds|ω, x),
and this proves (9.61) (hence (9.60) too). Then \lim_{n→∞} V_n = v exists and is Σ-measurable (see (9.58) and (9.59)). Moreover, from the monotone convergence theorem, we see that v satisfies
v(ω) = \min_{x∈X} \Big[ u(ω, x) + β \int_Ω v(s)\, p(ds|ω, x) \Big], \quad ω ∈ Ω.   (9.62)
By virtue of Proposition 9.6.7, there exists a stationary policy g : Ω −→ X realizing the minimum in (9.62). So
v(ω) = u(ω, g) + β \int_Ω v(s)\, p(ds|ω, g).   (9.63)
From Proposition 9.6.8(b), the integral in the right-hand side of (9.63) produces a Σ-measurable function. Iterating (9.63), we obtain V_n(g, ω) ≤ v(ω), ⇒ V_β(g, ω) ≤ v(ω). This combined with (9.60) implies that
v = V_β and g is a stationary optimal policy.
Finally let w be a nonnegative Σ-measurable solution of (9.56) and let g_w be the stationary policy realizing the minimum in (9.56) with w in place of V_β. Then following the above argument, we obtain V_β ≤ V_β(g_w, ·) ≤ w, ⇒ V_β is indeed the minimal solution of (9.56).
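The proof of Theorem 9.6.12 is constructive: the iterates V_n of (9.57) increase to V_β, and selecting a minimizing action with the smallest index (Proposition 9.6.7) produces an optimal stationary policy. The following sketch carries out this value iteration for a finite state space and a finite action set; the transition kernels, cost matrix, and discount factor are purely illustrative assumptions, not data from the text.

```python
import numpy as np

def value_iteration(cost, P, beta, tol=1e-10, max_iter=10_000):
    """Iterate (9.57): V_n(w) = min_x [ u(w,x) + beta * sum_s p(s|w,x) V_{n-1}(s) ].

    cost : (n_states, n_actions) array; +inf marks infeasible actions (cf. H1)
    P    : (n_actions, n_states, n_states) array of transition kernels p(.|w,x)
    """
    n_states, n_actions = cost.shape
    V = np.zeros(n_states)                              # V_0 = 0
    for _ in range(max_iter):
        Q = cost + beta * np.einsum('aws,s->wa', P, V)  # Q[w, x]
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=1)                           # smallest minimizing index
    return V, policy

# Illustrative two-state, two-action example (all numbers are made up).
cost = np.array([[1.0, 4.0], [2.0, 0.5]])
P = np.array([[[0.9, 0.1], [0.2, 0.8]],                 # kernel of action 0
              [[0.5, 0.5], [0.1, 0.9]]])                # kernel of action 1
V_beta, g = value_iteration(cost, P, beta=0.95)
```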
9.7 Remarks 9.1: The first to apply topologies on sub-σ-fields to study the dependence of economic models on information was Allen [13], who employed the d-metric topology (see Definition 9.1.17). The d-metric was first defined by Boylan [96], who used it to study the convergence of certain martingale like sequences. The weaker topology of pointwise convergence (see Definition 9.1.3), was introduced by Cotter [160]. Proposition 9.1.9 is due to Fetter [243]. Other topologies considered in probability theory, can be found in Allen [14]. Stinchcombe [559, 560] introduced some Bayesian topologies, in which two information structures are similar if there is only small probability that the posteriors are far apart. Stinchcombe [559, 560], explored the Bayesian topologies on sub-σ-fields that are generated by different vector space topologies on the space of distributions. 9.2: The ex-post (or Bayesian) view was adopted by Allen [13], Cotter [160, 161] and Stinchcombe [559]. In Allen [13] and Cotter [160, 161] the state-dependent posteriors are not well defined and so instead they show that the map from information to expected utility is continuous and then use the fact that the map from utility to actions is continuous. In Stinchcombe [559], the author imposes assumptions on the probability space so that the state-dependent posteriors (regular conditional probabilities) given a sub-σ-field, are well defined. Then he showed that the mapping from information to state-dependent posteriors is continuous under suitable topologies on the space of information. The mapping from posteriors to actions is also continuous. Hence the composition of the two maps is continuous. In this section we present the first approach and outline the continuous dependence results due to Cotter [160] and Allen [13]. 9.3: The ex-ante view was adopted by Van Zandt [592]. Proposition 9.3.6 is due to Horsley–Van Zandt–Wrobel [312]. This approach can be found also in Cotter
[162] who proved that the set of type-correlated equilibria (an extension of correlated equilibria to Bayesian games) is upper semicontinuous with respect to the information variable.
9.4: Definition 9.4.1 was first introduced by Kudo [365], who also proved Proposition 9.4.4. If Σ = µ-\liminf_{n→∞} Σ_n = µ-\limsup_{n→∞} Σ_n, then we say that the sequence of Σ_n's strongly converges to Σ. The use of these limits of sub-σ-fields to study the convergence of prediction sequences is due to Tsukada [588].
9.5: The first general equilibrium result for games with incomplete information was proved by Milgrom–Weber [429]. They made two key hypotheses. The first (called continuity of payoffs) is that for fixed ω ∈ Ω, {u_k(ω, ·)}_{k=1}^n are uniformly equicontinuous. The second (called continuity of information) is that the joint distribution of the information available to the players is absolutely continuous with respect to a product measure on the product of the spaces of types. This last hypothesis is in general difficult to verify, unless the information random variables are either independent or discrete and finite-valued. Here we present the result of Mamer–Schilling [405], who were able to weaken the continuity of payoffs hypothesis to continuity of u(ω, ·) in each action variable separately and eliminated completely the continuity of information hypothesis.
9.6: The first work on Markov decision chains with an arbitrary state space and unbounded costs is due to Ritt–Sennott [515]. Previous works had a general state space but a bounded cost; see for example Bertsekas–Shreve [72] and Hernández-Lerma [290].
10 Evolution Equations
Summary. Evolution equations are the abstract formulation of dynamic partial differential equations. In this chapter, we examine both semilinear and nonlinear evolution equations. We start by developing the mathematical tools which are necessary in this study. Among them, the notion of evolution triple plays a central role and allows us to use different spaces within the analysis of a single evolution equation. First we consider semilinear evolutions, which we analyze using the semigroup method. Subsequently, we pass to nonlinear evolutions. We consider two such classes: evolutions of the subdifferential type (which incorporate variational inequalities) and evolutions formulated in the framework of evolution triples with operators of monotone type. The Galerkin method is crucial here. We conclude with the study of second-order evolutions.
Introduction
Evolution equations are the abstract formulation of dynamic partial differential equations (i.e., partial differential equations with time t as one of the independent variables). Such equations arise in many branches of science and engineering and describe mathematically important physical processes and phenomena. The basic theoretical question related to such problems is whether for given initial data the equation has a solution at least locally in time and whether this solution is unique. It is also important to know how this solution depends on the initial data. We investigate these issues in the context of certain broad classes of semilinear and nonlinear evolution equations. In Section 10.1 we present the mathematical tools needed to study evolution equations. Central in this section is the notion of evolution triple of spaces. Evolution triples provide a natural and flexible framework for the study of evolution equations. They realize the modern strategy for studying PDEs, which is to use many different spaces in the same problem. In Section 10.1, we examine some basic function spaces associated with evolution triples. Section 10.2 deals with semilinear evolution equations. We employ the semigroup method and prove existence, uniqueness, and blow-up of solutions for such evolutions. We also apply the abstract results to some parabolic initial-boundary value problems.
In Section 10.3 we study nonlinear evolution equations. We consider two classes of such equations. The first class involves time-invariant subdifferential operators and so it incorporates certain variational inequalities. The second class involves problems formulated in the framework of evolution triples. The Galerkin method is the basic tool here. Finally in Section 10.4 we deal with second-order evolutions. The same two classes of equations are studied in this section too.
10.1 Lebesgue–Bochner Spaces In this section we introduce some spaces of Banach space–valued functions that are needed in the study of evolution equations. By a Banach space–valued function, we mean any map that takes values in a Banach space X, for example, a map f : T = [0, b] −→ X. Throughout this section (Ω, Σ, µ) is a finite measure space and X a Banach space. Additional hypotheses are introduced as needed. By ‖·‖ we denote the norm of X and by X^* its topological dual.
DEFINITION 10.1.1 (a) A function f : Ω −→ X is a simple function if there exist finitely many mutually disjoint sets {C_k}_{k=1}^n ⊆ Σ and {x_k}_{k=1}^n ⊆ X such that
f(ω) = \sum_{k=1}^{n} x_k χ_{C_k}(ω) for all ω ∈ Ω.
(b) A function f : Ω −→ X is said to be strongly measurable if there exists a sequence of simple functions s_n : Ω −→ X, n ≥ 1, such that
\lim_{n→∞} ‖f(ω) − s_n(ω)‖ = 0
µ-a.e. on Ω.
(c) A function f : Ω −→ X is said to be weakly measurable if for any x∗ ∈ X ∗ the R-valued function ω −→ x∗ , f (ω) is Lebesgue measurable. Hereafter, by ·, · we denote the duality brackets for the pair (X ∗ , X). The next theorem, usually known as the Pettis measurability theorem relates the notions of strong and weak measurability (see Definition 10.1.1(b),(c)) and its proof can be found, for example, in Gasi´ nski–Papageorgiou [259, p. 109]. THEOREM 10.1.2 A function f : Ω −→ X is strongly measurable if and only if (a) f is almost separably valued (i.e., there exists a µ-null set C ⊆ Ω such that f (Ω\C) is separable in X) and (b) f is weakly measurable. In particular, if X is separable, then the notions of strong and weak measurability are equivalent.
DEFINITION 10.1.3 Let f : Ω −→ X be strongly measurable. We say that f is Bochner integrable if there exists a sequence {s_n}_{n≥1} of simple functions such that
\lim_{n→∞} \int_Ω ‖f(ω) − s_n(ω)‖\, dµ = 0.
In this case
\int_Ω f(ω)\, dµ = \lim_{n→∞} \int_Ω
sn (ω)dµ and it is called the Bochner integral of f .
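For a concrete finite-dimensional illustration of Definition 10.1.3, the sketch below approximates the Bochner integral of an R²-valued map on ([0, 1], Lebesgue measure) by integrals of simple functions built from uniform partitions; the particular map f and the partitions are illustrative assumptions only.

```python
import numpy as np

def simple_approx_integral(f, n_pieces):
    """Integral of the simple function equal to f(left endpoint) on each piece
    of a uniform partition of [0, 1]; values lie in R^2 (a Banach space)."""
    ts = np.linspace(0.0, 1.0, n_pieces, endpoint=False)
    return np.sum([f(t) * (1.0 / n_pieces) for t in ts], axis=0)

f = lambda t: np.array([np.cos(t), np.sin(t)])
for n in (4, 64, 1024):
    print(n, simple_approx_integral(f, n))   # converges to (sin 1, 1 - cos 1)
```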
REMARK 10.1.4 It is easy to see that the above definition of the Bochner integral is independent of the choice of the sequence {s_n}_{n≥1} of simple functions. For any C ∈ Σ we set \int_C f(ω)\, dµ = \int_Ω χ_C(ω) f(ω)\, dµ. An easy consequence of Definition 10.1.3 is the following convenient criterion for Bochner integrability.
PROPOSITION 10.1.5 If f : Ω −→ X is strongly measurable, then f is Bochner integrable if and only if ‖f(·)‖ ∈ L^1(Ω). Moreover, we have
\Big\| \int_Ω f(ω)\, dµ \Big\| ≤ \int_Ω ‖f(ω)‖\, dµ.
The Bochner integral possesses almost the same properties as the Lebesgue integral. In the next theorem we have gathered the basics of those properties. We omit the proofs, which can be found in Gasiński–Papageorgiou [259, Section 2.1].
THEOREM 10.1.6 (a) If f_n : Ω −→ X, n ≥ 1, is a sequence of Bochner integrable functions and f_n(ω) \xrightarrow{w} f(ω) µ-a.e. in X, then f is Bochner integrable and \big\| \int_Ω f(ω)\, dµ \big\| ≤ \liminf_{n→∞} \int_Ω ‖f_n(ω)‖\, dµ.
µ
(b) If fn : Ω −→ X, n ≥ 1, is a sequence of Bochner integrablefunctions, fn −→ f (i.e., for every ε > 0, µ {ω ∈ Ω : fn (ω) − f (ω) ≥ ε} −→ 0 as n → ∞) and there exists h ∈ L1 (Ω)+ such that fn (ω) ≤ h(ω) µ-a.e. on Ω, then s f is Bochner integrable and C fn dµ −→ C f dµ for all C ∈ Σ. In fact, lim Ω fn − f dµ = 0. n→∞
(c) If m(C) = C f dµ, C ∈ Σ, then m is a vector measure (i.e., m(∅) = 0) and if {Cn }n≥1 ⊆ Σ is a sequence of pairwise disjoint sets and C = Cn , then n≥1 m(Cn ) (the sum is actually absolutely convergent). Moreover, m ! µ m(C) = n≥1 (i.e., lim C f dµ = 0) and |m|(C) = C f dµ for all C ∈ Σ, with |m| being µ(C)→0
the total variation of the vector measure m. COROLLARY 10.1.7 If f, g : Ω −→ X are Bochner integrable and C f dµ = gdµ for all C ∈ Σ, then f (ω) = g(ω) µ-a.e. on Ω. C PROOF: Let m(C) = C (f − g)dµ, C ∈ Σ. Then by hypothesis m(C) = 0 for every C ∈ Σ, hence |m|(C) = 0. But |m|(C) = f − gdµ for all C ∈ Σ. So C f − gdµ = 0 for all C ∈ Σ, hence f (ω) − g(ω) = 0 µ-a.e. on Ω. C
The next theorem states a useful property of the Bochner integral, which has no counterpart in the Lebesgue integration. For a proof of this result we refer to Gasi´ nski–Papageorgiou [259, p. 116]. THEOREM 10.1.8 If Y is another Banach space, A : D(A) ⊆ X −→ Y is a closed linear operator,
and f : Ω −→ D(A) ⊆ X and Af : Ω −→ Y are both Bochner integrable, then A Ω f dµ = Ω Af dµ. Given f, g : Ω −→ X strongly measurable, we say that f ∼ g if and only if f (ω) = g(ω)µ-a.e. Evidently this is an equivalence relation. Now we can define the so-called Lebesgue–Bochner spaces. DEFINITION 10.1.9 (a) Let 1 ≤ p < ∞. By Lp (Ω, X) we denote the space of all the equivalence classes for the relation ∼ of the strongly measurable functions f : Ω −→ X such that f (·) ∈ Lp (Ω)+ . We set
1/p f (ω)p dµ . f Lp (Ω,X) = Ω
(b) The space L∞ (Ω, X) is defined to be the space of all equivalence classes for the relation ∼ of the strongly measurable functions f : Ω −→ X such that f (·) ∈ L∞ (Ω)+ . We set f L∞ (Ω,X) = ess supf (·). REMARK 10.1.10 For every 1 ≤ p ≤ ∞, Lp (Ω, X) is a Banach space. Moreover, if Σ is countably generated and X is a separable Banach space, then Lp (Ω, X), 1 ≤ p < ∞ is separable. Recall that, if Ω is an open or a closed subset of RN , then the Borel σ-field B(Ω) is countably generated. For 1 < p < ∞ Lp (Ω, X) is uniformly convex if and only if X is uniformly convex. Simple functions are dense in Lp (Ω, X) for 1 ≤ p < ∞ and countably-valued functions of L∞ (Ω, X) are dense in L∞ (Ω, X). Finally if Ω ⊆ RN is bounded open, then C ∞ (Ω, X) is dense in Lp (Ω, X), 1 ≤ p < ∞. The characterization of the dual of these spaces is not easy. The reason for this is the fact that the Radon–Nikodym theorem is not in general true for vector measures. This leads to the following classification of Banach spaces. DEFINITION 10.1.11 A Banach space X has the Radon–Nikodym property (RNP for short) if for every finite measure space (Ω, Σ, µ) and every vector measure m : Σ −→ X of bounded variation such that m ! µ, we can find f ∈ L1 (Ω, X) such that m(C) = C f (ω)dµ for all C ∈ Σ. Large classes of Banach spaces have the RNP (see Diestel–Uhl [199, pp. 79–82]). THEOREM 10.1.12 (a) Reflexive Banach spaces have the RNP. (b) Separable dual Banach spaces have the RNP. Using this notion we can have a Riesz representation theorem for the Lebesgue– Bochner spaces. For a proof see Diestel–Uhl [199, pp. 89–100].
THEOREM 10.1.13 If X is a Banach space such that X ∗ has the RNP and 1 ≤ p < ∞, then Lp (Ω, X)∗ = Lp (Ω, X ∗ ), (1/p) + (1/p ) = 1. There is a version of this theorem when X ∗ does not satisfy the RNP. We state here the case p = 1 which appears often in applications. DEFINITION 10.1.14 Given two functions g, h : Ω −→ X ∗ that are w∗ measurable (i.e., for every x ∈ X, ω −→ g(ω), x, ω −→ h(ω), x are Σ-measurable), we have g∼h
if and only if g(ω), x = h(ω), x
µ-a.e. on Ω
for every x ∈ X. (10.1)
Note that in (10.1) the µ-null set in general depends on x ∈ X. Also ∼ is an equiv∗ alence relation. By L∞ (Ω, Xw ∗ ), we denote the linear space of the ∼-equivalence classes of functions g : Ω −→ X ∗ that are w∗ -measurable and there exists c ≥ 0 such that | g(ω), x | ≤ cx µ-a.e. on Ω for every x ∈ X. (10.2) Again the µ-null set in (10.2) may depend on x ∈ X. The infimum of all c ≥ 0 ∗ ) and it is easily seen to be a norm such that (10.2) holds is denoted by gL∞ (Ω,Xw ∗ ∞ ∗ for the space L (Ω, Xw∗ ). ∗ REMARK 10.1.15 If X is separable and g ∈L∞ (Ω, Xw ∗ ), then ω −→ g(ω)X ∗ belongs in L∞ (Ω)+ and ∗ ) = ess supg(·)X ∗ . gL∞ (Ω,Xw ∗
Ω
The next theorem is usually called the Dinculeanu–Foias theorem (Ionescu– Tulcea [328, p. 95]). THEOREM 10.1.16 If X is an arbitrary Banach space, then L1 (Ω, X)∗ = ∗ L∞ (Ω, Xw ∗ ) and the duality pairing between the two spaces is given by
g(ω), f (ω) dµ. g, f 0 = Ω
Using the notion of RNP, we can also extend the fundamental theorem of the Lebesgue calculus to vector-valued functions. DEFINITION 10.1.17 A function f : T = [0, b] −→ X is said to be absolutely continuous if for every ε > 0, we can find δ > 0 such that for every sequence of disjoint (cn − αn ) < δ, we have f (cn ) − f (αn ) < ε. intervals (αn , cn ) ⊆ T verifying n≥1
n≥1
THEOREM 10.1.18 If X has the RNP and f : T −→ X is absolutely continuous, then f is strongly differentiable a.e. on T , f ∈ L1 (T, X), and
t
f (t) = f (0) + 0
f (s)ds
for all t ∈ T.
In applications this theorem is used in the context of reflexive Banach spaces (see Theorem 10.1.12(a)). For this reason in the next definition we assume that X is reflexive. DEFINITION 10.1.19 Suppose X is a reflexive Banach space. (a) By AC 1,p (T, X), 1 ≤ p ≤ ∞, we denote the space of all absolutely continuous p functions
f : T −→ X such that f ∈ L (T, X) (see Theorem 10.1.18). (b) By D (0, b), X we denote the space of all continuous
linear operators from D(0, b) = Cc∞ (0, b) into X. An element u ∈ D (0, b), X is an X-valued distri
bution on (0, b). If u ∈ D (0, b), X , then for a positive integer k ≥ 1 dk ϑ
Dk u(ϑ) = (−1)k u
dtk
for all ϑ ∈ D(0, b) = Cc∞ (0, b),
defines another vectorial distribution Dk u ∈ D (0, b), X , known as the k1 vectorial distributional derivative of u. If k = 1, then we write D = D. (c) Let 1 ≤ p ≤ ∞. The vectorial Sobolev space W 1,p (0, b), X is defined to be the space of all functions u ∈ Lp (T, X) such that the vectorial distributional derivative Du also belongs in Lp (T, X).
As in the case for R-valued functions, the spaces AC 1,p (T, X) and W 1,p (0, b), X can be identified (see Barbu [58, p. 19] and Gasi´ nski–Papageorgiou [259, p. 138]). THEOREM 10.1.20 If X is a reflexive Banach space and u ∈ Lp (T, X), 1 ≤ p ≤ ∞, then the following statements are equivalent.
(a) u ∈ W 1,p (0, b), X . (b) There exists u ∈ AC 1,p (T, X) such that u(t) = u(t) a.e. on (0, b). Suppose X and Y are Banach spaces and X is embedded continuously and densely into Y . Then we can easily verify that • Y ∗ is embedded continuously into X ∗ .
(10.3)
• If X is reflexive, then the above embedding is also dense.
(10.4)
With this in mind we make the following definition, which is basic in the study of evolution equations. DEFINITION 10.1.21 By an evolution triple (or Gelfand triple), we understand a triple of spaces X ⊆ H ⊆ X∗ such that (a) X is a separable, reflexive Banach space. (b) H is a separable Hilbert space that is identified with its dual (pivot space). (c) X is embedded continuously and densely into H.
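A standard concrete example to keep in mind (added here for illustration; it is the kind of triple underlying the problems formulated in the framework of evolution triples later in this chapter) is

X = H_0^1(Z) ⊆ H = L^2(Z) ⊆ X^* = H^{-1}(Z),

with Z ⊆ R^N a bounded domain: H_0^1(Z) is separable and reflexive, L^2(Z) is a separable pivot Hilbert space, and the embedding of H_0^1(Z) into L^2(Z) is continuous and dense.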
REMARK 10.1.22 By virtue of (10.3) and (10.4), for an evolution triple (X, H, X ∗ ) we have X ⊆ H and H ∗ = H ⊆ X ∗ with the embedding being continuous and dense. In what follows by ·, · we denote the duality brackets for the pair (X ∗ , X) and by (·, ·) the inner product of H. Also by · (resp., | · |, · ∗ ), we denote the norm of X (resp., of H, X ∗ ). We have ·, · H×X = (·, ·) and x∗ ≤ c|x| for some c > 0 and all x ∈ H. (10.5) Using evolution triples we can define the following Banach space, which is a useful tool in the study of evolution equations. DEFINITION 10.1.23 Wp (0, b) = x ∈ Lp (T, X) : x ∈ Lp (T, X ∗ ) , 1 < p < ∞, (1/p) + (1/p ) = 1. The space Wp (0, b) is equipped with the norm xWp (0,b) = xLp (T,X) + x Lp (T,X ∗ ) which clearly makes Wp (0, b) a Banach space. REMARK 10.1.24 If in the above definition, we interpret the derivative of x as a vectorial distributional derivative into X ∗ (weak derivative), by virtue of Theorem ∗ 10.1.20, we see that it is a strong derivative if
we view∗ x as an X -valued function. 1,p ∗ 1,p Evidently Wp (0, b) ⊆ AC (T, X ) = W (0, b), X . THEOREM 10.1.25 If (X, H, X ∗ ) is an an evolution triple and 1 < p < ∞, then Wp (0, b) is embedded continuously and densely in C(T, H). PROOF: Let c < 0 < b < d and x ∈ Wp (0, b). Because x ∈ AC 1,p (T, X ∗ ), we can extend x on (c, 0) and on (b, d) by setting x(t) = x(0) for t ∈ (c, 0) and x(t)
= x(b) for t ∈ (b, d). We choose ϕ ∈ Cc∞ (c, d) such that ϕT = 1. We set x(t) = x ϕ(t) for all t ∈ (c, d). Then clearly x ∈ Wp (c, d) and xT = x. Also we have xWp (0,b) ≤ xWp (c,d) ≤ c(ϕ)xWp (0,b) and we can find ε > 0 small such that x(c,c+ε) = 0 and
with c(ϕ) > 0
x(d−ε,d) = 0.
1 Let ϑ ∈ Cc∞ (−1, 1) be a mollifier (i.e., ϑ ≥ 0, −1 ϑ(t)dt). We set ϑm (s) = mϑ(ms) and define the corresponding mollification of x; that is,
d
x(s)ϑm (t − s)ds.
xm (t) = c
We know that xm ∈ Cc∞ (c, d), X for m ≥ 1 large and xm −→ x in Wp (c, d) with xm Wp (c,d) ≤ xWp (c,d) . We have
1 d |xm (t)|2 = xm (t), xm (t) = xm (t), xm (t) 2 dt
t 1 ⇒ |xm (t)|2 = xm (s), xm (s) ds 2 c
t xm (s)∗ xm (s)ds ≤
(see 10.5)),
c
≤ xm 2Wp (c,d)
(by the Cauchy–Schwartz inequality). (10.6)
From (10.6), we infer that for m, n ≥ 1, we have 1 |xm (t) − xn (t)|2 ≤ xm − xn 2Wp (c,d) 2 ⇒ {xm }m≥1 ⊆ C(T, H) is Cauchy.
for all t ∈ [c, d]
Therefore we have xm −→ x in C(T, H), hence x = x ∈ C(T, H). So we have proved that Wp (0, b) is embedded continuously in C(T, H). The embedding is also dense because the set of all X-valued polynomials is dense in both Wp (0, b) and C(T, H). From the above proof, we deduce the following corollary.
COROLLARY 10.1.26 If x, y ∈ Wp (0, b), then t −→ x(t), y(t) is absolutely continuous and d
x(t), y(t) = x (t), y(t) + y (t), x(t) a.e. on T. dt
(10.7)
REMARK 10.1.27 We call (10.7) the integration by parts formula for the elements in Wp (0, b). The embedding of Wp (0, b) in C(T, H) is not in general compact. We can have compact embedding of Wp (0, b) into Lp (T, H). This is a particular case of Theorem 10.1.29 below. First we need the following interpolation lemma, sometimes called in the literature Ehrling’s inequality. LEMMA 10.1.28 If X, Y, Z are Banach spaces such that X ⊆ Y ⊆ Z, the embedding of X into Y is compact, and the embedding of Y into Z is continuous, then given ε > 0, we can find c(ε) > 0 such that for all x ∈ X we have xY ≤ εxX + c(ε)xZ .
(10.8)
PROOF: Suppose that (10.8) is false. Then we can find ε > 0 and sequence {xn }n≥1 ⊆ X such that xn Y > εxn X + nxn Z
for all n ≥ 1.
We set yn = xn /xn X , n ≥ 1. Evidently yn X = 1 for all n ≥ 1 and we have yn Y ≥ ε + nyn Z
for all n ≥ 1.
(10.9)
Because {yn }n≥1 ⊆ X is bounded and by hypothesis X is embedded compactly in Y and Y is embedded continuously in Z, by passing to a suitable subsequence if necessary, we may assume that yn −→ y
in Y
and in
as n → ∞.
Z
(10.10)
From (10.9) and (10.10), we obtain yZ = 0
and
yY ≥ ε > 0,
a contradiction.
THEOREM 10.1.29 If X, Y, Z are Banach spaces with X, Z reflexive such that X ⊆ Y ⊆ Z, the embedding of X into Y is compact, the embedding of Y into Z is continuous, and for 1 < p, q < ∞, we define the Banach space Wpq (0, b) = x ∈ Lp (T, X) : x ∈ Lq (T, Z) , then Wpq (0, b) is embedded compactly in Lp (T, Y ). PROOF: Clearly Wpq (0, b) is a reflexive Banach space. Let {xn }n≥1⊆Wpq (0, b) be w a bounded sequence. We may assume that xn −→ x in Wpq (0, b) as n → ∞. We have w
xn −→ x
in Lp (T, X)
and
xn −→ x w
in Lq (T, Z)
as n → ∞.
Without any loss of generality we may assume that x = 0. First we show that xn (t) −→ 0 in Z for every t ∈ T . We show this convergence for t = 0 and for the other points in T the proof is similar. Because xn ∈ AC 1,r (T, X), r = min{p, q}, we have
t xn (0) = xn (t) − xn (s)ds, t ∈ T. 0
Integrating this inequality we obtain
s t
1 s xn (t)dt − xn (τ )dτ dt xn (0) = s 0 0 0 = an + cn , with an =
1 s
s
xn (t)dt
and
cn = −
0
1 s
s
(10.11)
(s − τ )xn (τ )dτ, n ≥ 1.
0
Given ε > 0, we choose s > 0 small so that
s ε xn (τ )Z dτ ≤ . cn Z ≤ 2 0 w
(10.12)
For this fixed s > 0 and because xn −→ 0 in Lp (T, X) (recall x = 0), we see that w
an −→ 0
in X
as n → ∞,
⇒ an −→ 0
in Z
as n → ∞.
So we can find n0 = n0 (ε) ≥ 1 such that for all n ≥ n0 , we have an Z ≤
ε . 2
(10.13)
Returning to (10.11) and using (10.12) and (10.13), we have xn (0)Z ≤ ε ⇒ xn (0) −→ 0
for all n ≥ n0 , in Z
as n → ∞.
(10.14)
Note that Wpq (0, b) is embedded continuously in C(T, Z) (see Theorem 10.1.20). So it follows that {xn }n≥1 ⊆ C(T, Z) is bounded. Therefore by virtue of the dominated convergence theorem (see (10.14)), we have xn −→ 0
in Lp (T, Z)
as n → ∞.
(10.15)
Invoking Lemma 10.1.28, for any δ > 0 we have xn Lp (T,Y ) ≤ δxn Lp (T,X) + c(δ)xn Lp (T,Z) ≤ δM + c(δ)xn Lp (T,Z) ⇒ xn Lp (T,Y ) −→ 0
(with M = sup xn Lp (T,X) < ∞) n≥1
as n → ∞ (see (10.15).
This proves that Wpq (0, b) is embedded compactly in Lp (T, Y ). Another useful result in this direction is included in the next proposition.
PROPOSITION 10.1.30 If X, Y are Banach spaces, X is reflexive and is embedded continuously in Y , and x ∈ L∞ (T, X) ∩ C(T, Yw ), then x ∈ C(T, Xw ) (here Xw , resp., Yw , denotes the space X, resp., Y , endowed with the weak topology). ·
PROOF: By replacing Y with X Y if necessary, we may assume that the embedding of X into Y is also dense. Then Y ∗ is embedded continuously and densely in X ∗ (see (10.3) and (10.4)). If tn −→ t in T , then because by hypothesis x ∈ C(T, Yw ), we have y ∗ , x(tn )Y −→ y ∗ , x(t)Y
as n → ∞,
for all y ∗ ∈ Y ∗ .
(10.16)
Here by ·, ·Y we denote the duality brackets for the pair (Y ∗ , Y ). First we show that x(t) ∈ X for all t ∈ T and x(t)X ≤ xL∞ (T,X)
for all t ∈ T.
(10.17)
To see this, extend x by zero outside T , denote this extension by x and then regularize x. So we can find a sequence {xn }n≥1 ⊆ C 1 (T, X) such that xn (t)X ≤ xL∞ (T,X) and
∗
∗
y , xn (t)Y −→ y , x(t)Y
for all t ∈ T, all n ≥ 1 as n → ∞,
for all y ∗ ∈ Y.
We have | y ∗ , xn (t)Y | = | y ∗ , xn (t)X | ≤ y ∗ X ∗ xL∞ (T,X) .
(10.18)
Here by ·, ·X we denote the duality brackets for the pair (X ∗ , X). Passing to the limit as n → ∞ in (10.18), we obtain | y ∗ , x(t)Y | = | y ∗ , x(t)X | ≤ y ∗ X ∗ xL∞ (T,X)
for all t ∈ T and all y ∗ ∈ Y ∗ . (10.19)
Because Y ∗ is dense in X ∗ , from (10.19) it follows that x(t) ∈ X
for all t ∈ T
and
(10.17) holds.
(10.20)
∗ ∗ Let x∗ ∈ X ∗ . Because Y ∗ is dense in X ∗ , we can find {ym }m≥1 ⊆ Y ∗ , ym −→ x∗ in X ∗ as m −→ ∞. We have ∗ ∗ ym , x(tn )X −→ ym , x(t)X
Also
as n → ∞
∗ ym , x(t)X −→ x∗ , x(t)X
for all m ≥ 1
(see (10.20)). (10.21)
as m −→ ∞.
(10.22)
Because of (10.21) and (10.22) we can find a sequence n −→ m(n) not necessarily strictly increasing, such that m(n) −→ ∞ as n → ∞ and ∗ ym(n) , x(tn ) X −→ x∗ , x(t)X as n → ∞. (10.23) Then we have | x∗ , x(tn )X − x∗ , x(t)X | ∗ ∗ , x(tn ) X + ym(n) , x(tn ) X − x∗ , x(t)X ≤ x∗ , x(tn )X − ym(n)
⇒ x∗ , x(tn )X −→ x∗ , x(t)X
as n → ∞
(see (10.23)) (10.24)
Because in (10.24) x∗ ∈ X ∗ was arbitrary, we conclude that x ∈ C(T, Xw ).
10.2 Semilinear Evolution Equations In this section we examine semilinear evolution equations using semigroups of linear operators (see Section 3.2). So let X be a Banach space, A : D(A) ⊆ X −→ X be the infinitesimal generator of a C0 -semigroup {S(t)}t≥0 and f : [0, b] × X −→ X be a function (the nonlinear perturbation). We consider the following semilinear Cauchy problem:
) * x (t) = Ax(t) + f t, x(t) , t ∈ T = [0, b], . (10.25) x(0) = x0 ∈ X We introduce the following hypotheses on the nonlinearity f (t, x). H(f ): f : T ×X −→ X is a function such that (i) For every x ∈ X, t −→ f (t, x) is strongly measurable.
702
10 Evolution Equations (ii) There exists k ∈ L1 (T )+ such that for almost all t ∈ T and all x, u ∈ X, we have f (t, x) − f (t, u) ≤ k(t)x − u and
f (t, 0) ≤ k(t).
DEFINITION 10.2.1 A function x ∈ C(T, X) is said to be a mild solution of (10.25) if it is a solution of the following integral equation
x(t) = S(t)x0 +
t
S(t − τ )f τ, x(τ ) dτ,
t ∈ T = [0, b].
(10.26)
0
REMARK 10.2.2 Hereafter we do not distinguish between the mild solutions of (10.25) and the solutions of (10.26). PROPOSITION 10.2.3 If A is the infinitesimal generator of a contraction semigroup {S(t)}t≥0 ; that is, S(t)L ≤ 1
for all t ≥ 0,
f (t, x) satisfies hypotheses H(f ), and x0 ∈ X, then problem (10.25) admits a unique mild solution x(·; x0 ) ∈ C(T, X) satisfying
and
k(τ )dτ x0 0
t k(τ )dτ x0 − x0 . x(t; x0 ) − x(t; x0 ) ≤ exp x(t; x0 ) ≤ exp
t
(10.27) (10.28)
0
PROOF: On C(T, X) we introduce the following equivalent norm
|u| = max exp −L
t
k(τ )dτ x(t) : t ∈ T
(10.29)
0
with L > 1. We consider the nonlinear operator ξ : C(T, X) −→ C(T, X) defined by
ξ(x)(t) = S(t)x0 +
t
S(t − τ )f τ, x(τ ) dτ.
0
We show that ξ is a | · |-contraction. To this end, for x, u ∈ C(T, X), we have ξ(x)(t) − ξ(u)(t)
t
S(t − τ ) f τ, x(τ ) − f τ, u(τ ) dτ = 0
t k(τ )x(τ ) − u(τ )dτ. ≤ 0
t Multiplying (10.30) with exp −L 0 k(τ )dτ , we obtain
(10.30)
t
k(τ )dτ ξ(x)(t) − ξ(u)(t) exp −L 0
t
τ
t
exp −L k(s)ds exp −L k(s)ds k(τ )x(τ ) − u(τ )dτ ≤ 0 τ 0
t
t
exp −L k(s)ds k(τ )dτ (see (10.29)) ≤ |x − u| 0
τ
1 ≤ |x − u| L 1 |x − u|. L Because L > 1, it follows that ξ is a |·|-contraction and so by Banach’s contraction principle, we deduce that there exists a unique x(·; x0 ) ∈ C(T, X) such that
x(·; x0 ) = ξ x(·; x0 ) . ⇒ |ξ(x) − ξ(u)| ≤
Clearly this is the unique mild solution of (10.25). We have
t
f τ, x(τ ) dτ x(t; x0 ) ≤ x0 + 0
t
f (τ, 0) + k(τ )x(τ ) dτ ≤ x0 + 0
t
k(τ ) 1 + x(τ ) dτ. ≤ x0 + 0
Invoking Gronwall’s inequality, we infer that
t k(τ )dτ (1 + x0 ). x(t; x0 ) ≤ exp 0
Moreover, we have
t
x(t; x0 ) − x(t; x0 ) ≤ S(t)(x0 − x0 )+ S(t − τ ) x(τ ; x0 ) − x(τ ; x0 ) dτ 0
t k(τ )x(t; x0 ) − x(t; x0 )dτ. ≤ x0 − x0 + 0
Once again Gronwall’s inequality implies that
t k(τ )dτ x0 − x0 . x(t; x0 ) − x(t; x0 ) ≤ exp 0
REMARK 10.2.4 Of course the result is still true if A is the infinitesimal generator of a C0 -semigroup {S(t)}t≥0 satisfying S(t)L ≤ M eωt
for all t ≥ 0
with M ≥ 1, ω > 0.
In this case the estimates (10.27) and (10.28) have the following form,
t
k(τ )dτ (1 + x0 ) x(t; x0 ) ≤ M exp ωt + M 0
t
k(τ )dτ x0 − x0 for all t ∈ T. and x(t; x0 ) − x(t; x0 ) ≤ M exp ωt + M 0
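Since the proof of Proposition 10.2.3 is a Banach fixed-point argument for the integral operator ξ of (10.26), the mild solution can also be approximated by Picard iteration. The sketch below does this for a matrix generator (so that S(t) = e^{tA}) and a simple globally Lipschitz perturbation; the trapezoidal quadrature, the matrix A, and the choice of f are illustrative assumptions only.

```python
import numpy as np
from scipy.linalg import expm

def picard_mild_solution(A, f, x0, b=1.0, n_grid=200, n_iter=40):
    """Approximate x(t) = S(t)x0 + int_0^t S(t-s) f(s, x(s)) ds on [0, b]
    by Picard iteration, with S(t) = expm(t*A)."""
    ts = np.linspace(0.0, b, n_grid)
    dt = ts[1] - ts[0]
    S = [expm(t * A) for t in ts]                      # S[i] = S(ts[i])
    x = np.array([S[i] @ x0 for i in range(n_grid)])   # zeroth iterate
    for _ in range(n_iter):
        fx = np.array([f(ts[j], x[j]) for j in range(n_grid)])
        new = []
        for i in range(n_grid):
            # trapezoidal approximation of int_0^{t_i} S(t_i - s) f(s, x(s)) ds
            vals = np.array([S[i - j] @ fx[j] for j in range(i + 1)])
            integral = np.trapz(vals, dx=dt, axis=0) if i > 0 else np.zeros_like(x0)
            new.append(S[i] @ x0 + integral)
        x = np.array(new)
    return ts, x

A = np.array([[0.0, 1.0], [-1.0, 0.0]])                # skew-symmetric generator
f = lambda t, x: -0.1 * x                              # globally Lipschitz perturbation
ts, x = picard_mild_solution(A, f, np.array([1.0, 0.0]))
```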
In Proposition 10.2.3 we assumed that x0 ∈ X. For any x0 ∈ X that does not belong in D(A), in general t −→ S(t)x0 is not necessarily differentiable and neither belongs in D(A) for t > 0. Therefore, if x0 ∈ D(A) (smooth initial condition), we expect to deduce some regularity for the mild solution. PROPOSITION 10.2.5 If A is the infinitesimal generator of a C0 -semigroup and f (t, x) satisfies hypotheses H(f ), with k ∈ L∞ (T )+ and x0 ∈ D(A), then the unique mild solution x(·; x0 ) is Lipschitz continuous. PROOF: Without any loss of generality we assume that {S(t)}t≥0 is a contraction semigroup. For any r > 0, we set x(t) = x(t + r). Evidently x is a mild solution of (10.25) with initial condition x(r). So from (10.28) it follows that
t x(t + r) − x(t) ≤ exp k(τ )dτ x(r) − x0 . (10.31) 0
Also note that
r
S(t − τ )f τ, x(τ ) dτ x(r) − x0 = S(r)x0 − x0 + 0
r
r
r S(τ )Ax0 dτ + f (τ, x0 )dτ + k(τ )x(τ ) − x0 dτ ≤ 0
0
(because x0 ∈ D(A))
0
r k(τ )(1 + x0 )dτ + k(τ )x(τ ) − x0 dτ 0 0
t
k(τ )x(τ ) − x0 dτ. ≤ Ax0 + (1 + x0 )k∞ r + r
≤ Ax0 r +
0
(10.32) From (10.32) and Gronwall’s inequality, we obtain x(r) − x0 ≤ cr
for some c > 0 (independent of r ∈ T ).
(10.33)
Using (10.33) in (10.31), we conclude that x(·; x0 ) is Lipschitz continuous.
COROLLARY 10.2.6 If X is a reflexive Banach space, A is the infinitesimal generator of a C0 -semigroup, f satisfies hypotheses H(f ) with k ∈ L∞ (T )+ , and x0 ∈ D(A), then the mild solution x(·; x0 ) is a strong solution; that is, it satisfies (10.25) pointwise almost everywhere. Moreover, if f is t-independent, then
the mild solution is a classical solution; that is, x(·; x0 ) ∈ C(T, X) ∩ C 1 (0, b), X . Proposition 10.2.3 is still valid if T = [0, b] is replaced by R+ . So we have the following. PROPOSITION 10.2.7 If A is the infinitesimal generator of C0 -semigroup, f (t, x) satisfies hypotheses H(f ), and x0 ∈ X, then problem (10.25) admits a unique global mild solution x(·; x0 ) ∈ C(R+ , X) and
t x(t; x0 ) − x(t; x0 ) ≤ exp k(τ )dτ x0 − x0 . 0
PROOF: It is similar to the proof of Proposition 10.2.3, so we only sketch it here. Again we introduce the map ξ : C(R+ , X) −→ C(R+ , X), defined by
t
S(t − τ )f τ, x(τ ) dτ, t ≥ 0. ξ(x)(t) = S(t)x0 + 0
t Let V = x ∈ C(R+ , X) : supt≥0 exp −L 0 k(τ )dτ x(t) < ∞ , with L > 1. In V we introduce the norm
t
k(τ )dτ x(t). |x| = sup exp −L t≥0
0
Clearly V equipped with this norm is a Banach space. Moreover, ξ(V ) ⊆ V . Indeed, for x ∈ V , we have
t
ξ(x)(t) ≤ S(t)x0 + S(t − τ )f τ, x(τ ) dτ 0
t
f τ, x(τ ) dτ ≤ x0 + 0
t k(τ )(1 + x(τ ))dτ ≤ x0 + 0
t
t
t
k(τ )dτ ξ(x)(t) ≤ c1 + exp −L k(τ )dτ k(τ )x(τ dτ, ⇒ exp −L 0
0
0
c1 > 0, ⇒ |ξ(x)| < ∞ (i.e., ξ(x) ∈ V ). As in the proof of Proposition 10.2.3, we can check that ξ is a contraction. So by the Banach contraction principle it has a unique fixed point in V , which is a mild solution of (10.25). Moreover, the uniqueness also holds in C(R+ , X). Indeed if x, u ∈ C(R+ , X) are mild solutions of (10.25) and y = x − u, then
t
S(t − τ ) f τ, x(τ ) − f τ, u(τ ) dτ, y(t) = 0
t k(τ )y(τ )dτ, t ≥ 0, ⇒ y(t) ≤ 0
⇒ y(t) = 0
for all t ≥ 0
(by Gronwall’s inequality),
⇒ x = u. The Lipschitz dependence on the initial condition follows via Gronwall’s inequality, as in the proof of Proposition 10.2.3. As before, for simplicity we assume that A is the infinitesimal generator of a contraction semigroup. Then from the Hille–Yosida theorem (see Theorem 3.2.87), we know that for every λ > 0, the operator I − λA is one-to-one from D(A) ⊆ X onto X and so by Banach’s theorem we know that Jλ = (I −λA)−1 ∈ L(X). This operator is called the resolvent of A. Using Jλ we also define
(10.34)
706
10 Evolution Equations 1 (I − Jλ ) ∈ L(X) λ
Aλ =
(10.35)
which is the so-called Yosida approximation of A (see also Definition 3.2.42, where these notions were defined in the context of nonlinear monotone operators defined on a pivot Hilbert space). As in Proposition 3.2.44, we can show that the following are true for the operators Jλ and Aλ defined in (10.34) and (10.35), respectively. PROPOSITION 10.2.8 (a) For any λ>0 and x ∈ X, we have Aλ (x) = A(Jλ x). (b) For any λ > 0 and x ∈ D(A), we have Aλ x = Jλ (Ax). (c) For any λ > 0 and x ∈ D(A), we have Aλ x ≤ Ax. (d) For any x ∈ X, we have lim Jλ x = x. λ→0
(e) For any x ∈ D(A), we have lim Aλ x = Ax. λ→0
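As a purely finite-dimensional illustration (with A a matrix, so that everything is bounded and the limits in Proposition 10.2.8 can be checked numerically), the resolvent J_λ and the Yosida approximation A_λ can be computed as below; here we use the identity A_λ = A J_λ of Proposition 10.2.8(a), and the particular matrix is an arbitrary assumption.

```python
import numpy as np

def resolvent(A, lam):
    """J_lambda = (I - lam*A)^{-1}, cf. (10.34)."""
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - lam * A)

def yosida(A, lam):
    """Yosida approximation A_lambda = A J_lambda = (J_lambda - I)/lam,
    consistent with Proposition 10.2.8(a)."""
    n = A.shape[0]
    return (resolvent(A, lam) - np.eye(n)) / lam

A = np.array([[0.0, 1.0], [-4.0, -0.5]])   # generator of a matrix semigroup
x = np.array([1.0, 2.0])
for lam in (1e-1, 1e-2, 1e-3):
    # Proposition 10.2.8(e): A_lambda x -> A x as lambda -> 0
    print(lam, np.linalg.norm(yosida(A, lam) @ x - A @ x))
```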
A is unbounded, therefore a mild solution need not be strongly differentiable. This may be an inconvenience in many applications. The following approximation result helps us to overcome such an inconvenience. PROPOSITION 10.2.9 If A is the infinitesimal generator of a contraction semigroup, f (t, x) satisfies hypotheses H(f ), x0 ∈ X, x ∈ C(T, X) is a mild solution of problem (10.25), and xλ ∈ C(T, X) is the solution of
t
xλ (t) = Sλ (t)x0 + Sλ (t − τ )f τ, xλ (τ ) dτ, t ∈ T, 0
where Sλ (t) = eAλ t =
(Aλ t) k!
k≥0
k
(the semigroup generated by Aλ ∈ L(X)) then
lim max xλ (t) − x(t) = 0.
λ→0 t∈T
PROOF: As before using hypothesis H(f )(ii) and Gronwall’s inequality, we obtain
b
sup Sλ (r) − S(r) f τ, x(τ ) dτ. (10.36) xλ (t) − x(t) ≤ 0 r∈T
But we know that for any x ∈ X. Sλ (t)x −→ S(t)x
as λ ↓ 0
uniformly in t ∈ T.
So from (10.36) and the dominated convergence theorem, we conclude that lim max xλ (t) − x(t) = 0.
λ→0 t∈T
In Section 3.2, we saw the Hille–Yosida characterization of the infinitesimal generator of a C0 -semigroup. In the next theorem we produce a different characterization for the infinitesimal generators of contraction semigroups. The result is known as the Lumer–Phillips theorem and its proof can be found in Yosida [614, p. 250].
THEOREM 10.2.10 If A is a closed and densely defined linear operator in a Banach space X, then A is the infinitesimal generator of a contraction semigroup if and only if the operator −A is m-accretive. Recall that if A is a closed operator in a Banach space X with domain D(A), then we can consider D(A) with the graph norm xD(A) = x + Ax.
(10.37)
Due to the closedness of A, (D(A), · D(A) ) is a Banach space that is embedded continuously in X. We consider the following Cauchy problem: ) * x (t) = Ax(t) + g(t), t ≥ 0, . (10.38) x(0) = x0 We assume that −A is closed, densely defined, and m-accretive, hence it is the infinitesimal generator of a contraction semigroup {S(t)}t≥0 . We have the following existence result for problem (10.38). PROPOSITION 10.2.11 If A is closed, densely defined, −A is m-accretive, g ∈ C 1 (R+ , X), and x0 ∈ D(A), then problem (10.38) admits a unique classical solution x such that
x ∈ C 1 (R+ , X) ∩ C R+ , D(A) which can be expressed as
t
S(t − τ )g(τ )dτ.
x(t) = S(t)x0 + 0
PROOF: We know that t −→ S(t)x0 (the solution of the homogeneous equation) belongs in C 1 (R+ , X) ∩ C R+ , D(A) . So we need to check that
t
S(t − τ )g(τ )dτ
u(t) = 0
belongs in C 1 (R+ , X) ∩ C R+ , D(A) and satisfies (10.38). To this end we consider
t
u(t + r) − u(t) 1 t+r S(t + r − τ )g(τ )dτ − S(t − τ )g(τ )dτ = r r 0 0
t
1 t+r = S(t + r−τ )g(τ )dτ + S(t + r−τ )−S(t −τ ) g(τ )dτ r t 0 (10.39)
1 t
1 t+r S(z)g(t + h−z)dz + S(z) g(t + h−z)−g(t−z) dz. = r t r 0 (10.40) We let r −→ 0 and in the right side of (10.40), we obtain
t S(t)g(0) + S(z)g (t − z)dz ∈ C(R+ , X). 0
708
10 Evolution Equations
So it follows that u ∈ C 1 (R+ , X). Moreover, the right-hand side in (10.39) gives S(0)g(t) − Au(t) = g(t) − Au(t). Hence u ∈ C R+ , D(A) and we have proved the proposition.
COROLLARY 10.2.12 If A is closed, densely defined, −A is m-accretive, g ∈
C R+ , D(A) , and x0 ∈ D(A), then problem (10.38) has a classical solution x given by
t
S(t − τ )g(τ )dτ,
x(t) = S(t)x0 +
t ≥ 0.
0
COROLLARY 10.2.13 If A is closed, densely defined, −A is m-accretive, g ∈ C(R+ , X), g is strongly differentiable with g ∈ L1loc (R+ , X), and x0 ∈ D(A), then problem (10.38) has a classical solution x given by
t x(t) = S(t)x0 + S(t − τ )g(τ )dτ, t ≥ 0. 0
PROOF: Let b > 0, T = [0, b]. By hypothesis g ∈ L1 (T, X). We set
t u(t) = S(t − τ )g(τ )dτ, t ∈ T. 0
We can easily check that u ∈ C(T, X). Also from (10.40) we see that u is strongly differentiable almost everywhere on T and at points of strong differentiability we have
t u (t) = S(t)g(0) + S(τ )g (t − τ )dτ 0
t S(t − τ )g (τ )dτ ∈ C(T, X). = S(t)g(0) + 0
Therefore for almost all t ∈ R+ , we have u (t) = Au(t) + f (t),
⇒ u ∈ C 1 (T, X) ∩ C T, D(A) . REMARK 10.2.14 The assumption g ∈ C(R+ , X) alone is not enough to guarantee the existence of a classical solution. COROLLARY 10.2.15 If X is a reflexive Banach space, A is closed, and densely defined, −A is m-accretive, g : R+ −→ X is Lipschitz continuous, and x0 ∈ D(A), then problem (10.38) has a classical solution x given by
t x(t) = S(t)x0 + S(t − τ )g(τ )dτ, t ≥ 0. 0
Now we use Proposition 10.2.11 to obtain an existence result for classical solutions for problem (10.25).
PROPOSITION 10.2.16 If A is closed, densely defined, −A is m-accretive, f ∈ C 1 (X, X), and there exists k > 0 such that f (x) − f (u) ≤ kx − u
for all x, u ∈ X,
then for every x0 ∈ D(A) problem (10.25) admits a unique global classical solution. PROOF: By virtue of Proposition 10.2.7, problem (10.25) has a unique global mild solution x ∈ C(R+ , X). Because of the hypotheses on f and Proposition 10.2.11, it suffices to show that x ∈ C 1 (R+ , X). To this end, we consider the following auxiliary problem,
) * u (t) = Au(t) + f x(t) u(t), t ≥ 0, . (10.41) u(0) = f (x0 ) + Ax0 ∈ X Problem (10.41) has a unique global mild solution u ∈ C(R+ , X) given by
u(t) = S(t) f (x0 ) − Ax0 ) +
t
S(t − τ )f x(τ ) u(τ )dτ,
t ≥ 0.
0
We have x(t + r) − x(t) − u(t) r
t+r 1
= S(t) S(r) − I x0 + S(t + r − τ )f x(τ ) dτ r 0
t
t
S(t − τ )f x(τ ) dτ−S(t) f (x0 )+Ax0 − S(t−τ )f x(τ ) u(τ )dτ − 0
0
t
S(r)x0 − x0
f x(τ + r) −f x(τ ) − Ax0 + − f x(τ ) u(τ )dτ ≤ r r 0
1 r
S(t + r − τ )f x(τ ) dτ − S(t)f (x0 ) + r 0 = ξ1 + ξ2 + ξ3 . (10.42) Because by hypothesis x0 ∈ D(A), we have ξ1 −→ 0
as r ↓ 0.
(10.43)
Also due to the continuity of x and f , we have ξ3 −→ 0
as r ↓ 0.
Next note that because f is strongly differentiable, we have
f x(τ + r) − f x(τ )
x(τ + r) − x(τ ) − f x(τ ) r r x(τ + r) − x(τ )
ϑ x(τ + r) − x(τ ) ≤ r (with ϑ(s) −→ 0 as s −→ 0). So we have
(10.44)
(10.45)
f x(τ + r) − f x(τ )
− f x(τ ) u(τ ) r
f x(τ + r) − f x(τ )
x(τ + r) − x(τ ) − f x(τ ) ≤ r r
x(τ + r) − x(τ ) + f x(τ ) − u(τ ) r x(τ + r) − x(τ )
ϑ x(τ + r)−x(τ ) +k x(τ + r) − x(τ ) − u(t) ≤ r r (10.46)
using (10.46) and recalling that f x(τ ) L ≤ k for all τ ≥ 0. Because of Proposition 10.2.5, we see that x(τ + r) − x(τ )
ϑ x(τ + r) − x(τ ) −→ 0 r
as r −→ 0.
(10.47)
Using (10.43), (10.44), (10.46), and (10.47), we see that given ε > 0, we can find r > 0 small such that
t x(t + r)−x(t) x(τ +r)−x(τ ) −u(t) ≤ ε+k −u(τ )dτ. r r 0 Invoking Gronwall’s inequality and because ε > 0 was arbitrary, we conclude that x(t+r)−x(t) − u(t) = 0, r for all t ≥ 0, ⇒ x (t) = u(t) lim
r→0
⇒ x ∈ C 1 (R+ , X). Thus far all the existence results that we proved required that the nonlinearity f (x) be globally Lipschitz. This is a rather restrictive hypothesis. In particular it implies that f (x) ≤ f (0) + kx. This means that f has at most linear growth and so we rule out several interesting nonlinearities (such as quadratic). For this reason we would like to be able to relax this global Lipschitzness hypothesis. In this direction we have the following existence result. PROPOSITION 10.2.17 If A is a closed, densely defined operator, −A is maccretive, f : X −→ X is Lipschitz continuous on bounded sets, and x0 ∈ X, then we can find b > 0 depending on x0 such that problem (10.25) admits a unique local solution x ∈ C(T, X). Moreover, if x0 ∈ D(A), then x is Lipschitz continuous on T . Finally if X is reflexive and x0 ∈ D(A), then the mild solution x is in fact classical. PROOF: Let r = 1 + x0 and let kr > 0 be the Lipschitz constant of f on B r = {x ∈ X : x ≤ r}. Let 0 < b < 1/kr and define Cb = x ∈ C(T, X) : x(t) ≤ r, t ∈ T = [0, b] .
Clearly Cb is a closed and convex subset of C(T, X). We consider the nonlinear map ξ : Cb −→ C(T, X) defined by
t
S(t − τ )f x(τ ) dτ for all t ∈ T. ξ(x)(t) = S(t)x0 + 0
We have
ξ(x)(t) ≤ x0 + b f (0) + kr r
for all t ∈ T.
If we choose b > 0 such that b < min
1 1 , , kr f (0) + kr r
(10.48)
then we deduce that ξ(x)(t) ≤ x0 + 1 = r. Therefore, ξ : Cb −→ Cb . Moreover, for every x, u ∈ Cb , we have
t
S(t − τ ) f x(τ ) − f u(τ ) dτ ξ(x) − ξ(u)C(T,X) = sup t∈T
0
≤ b kr x − uC(T,X) . Because bkr < 1 (see (10.48)), we see that ξ is a contraction on Cb . So by Banach’s fixed point theorem, we can find a unique fixed point of ξ. It is easily seen that this is the unique mild solution of (10.25). If x0 ∈ D(A), Proposition 10.2.5 implies that x is Lipschitz continuous. Finally when X is reflexive and x0 ∈ D(A), Corollary 10.2.6 implies that the unique mild solution is in fact a classical one. As is the case with ordinary differential equations in RN , we can always produce a maximal solution; that is, a solution defined on a maximal time interval. PROPOSITION 10.2.18 If A is a closed, densely defined operator, −A is maccretive, f : X −→ X is Lipschitz continuous on bounded sets, and x0 ∈ X, then problem (10.25) has a unique mild solution on a maximal time interval [0, bmax ) such that either (a) bmax = +∞; that is, the unique mild solution is global, or (b) bmax < +∞ and lim x(t) = +∞ (i.e., we have a blow-up of the solution in t→b− max
finite time). PROOF: Suppose x1 , x2 are two mild solutions defined on T1 = [0, b1 ] and T2 = [0, b2 ], respectively, and assume that b1 < b2 . Then T1 ⊆ T2 and due to the uniqueness of the mild solution on T1 we have x2 T = x1 . 1
Invoking Zorn’s lemma, we can find a maximal interval Tmax = [0, bmax ) on which the unique mild solution exists. We need to show that if bmax < +∞, then
712
10 Evolution Equations lim x(t) = +∞.
(10.49)
t→b− max
We proceed indirectly. So suppose that (10.49) is not true. We show that we can continue x beyond bmax , a contradiction to the maximality of the interval Tmax = [0, bmax ). So suppose we can find {tn }n≥1 ⊆ Tmax such that tn −→ b− max
as n → ∞
and
x(tn ) ≤ c for some c > 0, all n ≥ 1.
We consider the following Cauchy problem
) * u (t) = Au(t) + f u(t) , t ≥ 0, . u(0) = x(tn ), n ≥ 1
(10.50)
By virtue of Proposition 10.2.17, problem (10.50) has a unique mild solution on some interval [0, b], with b > 0 depending on c > 0. Let n ≥ 1 be large so that bmax < b + tn . We set x(t) if t ∈ [0, tn ] . (10.51) y(t) = if t ∈ [tn , tn + b] u(t − tn ) We show that y is a mild solution of (10.25) on the interval [0, b + tn ]. To this end note that
t
S(t − τ )f y(τ ) dτ for t ∈ [0, tn ] (see (10.51)) y(t) = S(t)x0 + 0
and
u(t) = S(t)x(tn ) +
(10.52) t
S(t − τ )f u(τ ) dτ
for t ∈ [0, b].
(10.53)
0
From (10.52) we see that y ∈ C([0, b + tn ], X) is a mild solution of (10.25) on the time interval [0, tn ]. Now, if t ∈ [0, b], we have
t
y(t + tn ) = u(t) = S(t)x(tn )+ S(t−τ )f u(τ ) dτ (see (10.51) and (10.53)) 0
tn
S(t − τ )f x(τ ) dτ = S(t) S(tn )x0 + 0
t
+ S(t−τ )f u(τ ) dτ 0
tn
S(t + tn − τ )f x(τ ) dτ = S(t+tn )x0 +
0
tn+t
+
S(t+tn − τ )f u(τ − tn ) dτ
tn
= S(t + tn )x0 +
t+tn
S(t+tn − τ )f y(τ ) dτ
(see (10.51)).
0
(10.54) From (10.54) we conclude that y is a mild solution of (10.25) on [0, b + tn ] which strictly contains Tmax , a contradiction to the maximality of Tmax .
Using Propositions 10.2.17 and 10.2.18, we can have an existence and uniqueness result for a global mild solution, under conditions that are slightly weaker than the global Lipschitz condition considered in the first part of this section. COROLLARY 10.2.19 If A is a closed, densely defined operator, −A is maccretive, and f : X −→ X is Lipschitz continuous on bounded sets, there exist c1 , c2 > 0 such that f (x) ≤ c1 + c2 x
for all x ∈ X
(10.55)
and x0 ∈ X, then problem (10.25) admits a unique maximal mild solution x ∈ C(R+ , X). PROOF: From Proposition 10.2.18 we know that problem (10.25) admits a unique maximal mild solution x ∈ C(Tmax , X), Tmax = [0, bmax ). We have
t
x(t) = S(t)x0 + 0
t
⇒ x(t) ≤ x0 + 0
≤ x0 +
S(t − τ )f x(τ ) dτ
for t ∈ Tmax = [0, bmax ),
f x(τ ) dτ c1 + c2 x(τ ) dτ
t
(see (10.55)).
0
If bmax < +∞, via Gronwall’s inequality, we obtain x(t) ≤ M
for some M > 0
and all t ∈ Tmax .
(10.56)
But from Proposition 10.2.18, we know that we must have lim x(t) = +∞,
t→b− max
a contradiction to (10.56).
COROLLARY 10.2.20 If X = H = Hilbert space, −A is maximal monotone, f : H −→ H is Lipschitz continuous on bounded sets, and there exist c1 , c2 > 0 such that (f (x), x)H ≤ c1 + c2 x2
for all x ∈ H,
(10.57)
then for every x0 ∈ X problem (10.25) admits a unique global mild solution x ∈ C(R+ , X). Moreover, if x0 ∈ D(A), then this unique global mild solution is in fact a classical one. PROOF: First suppose that x0 ∈ D(A). Then from Propositions 10.2.17 and 10.2.18, problem (10.25) admits a maximal classical solution x ∈ C(Tmax , H) with Tmax = [0, bmax ). We have
x (t) = Ax(t) + f x(t) for all t ∈ Tmax ,
⇒ x (t), x(t) H = Ax(t), x(t) H + f x(t) , x(t)
H
1 d ⇒ x(t)2 ≤ c1 + c2 x(t)2 2 dt (because − A is monotone and using (10.57))
t x(τ )2 dτ ⇒ x(t)2 ≤ c3 + c2
for all t ∈ Tmax .
0
(10.58) As in the proof of Corollary 10.2.19, (10.58) implies that bmax = +∞. Now suppose that x0 ∈ X.Then we can find {xn 0 }n≥1 ⊆ D(A) such that xn 0 −→ x0
in X
as n → ∞.
Let xn ∈ C(R+ , X) be the global classical solution emanating from xn 0 (see the first part of the proof). As above, using (10.57), we show that for every b > 0, we have xn (t) ≤ M
for all n ≥ 1, all t ∈ T = [0, b],
with M > 0 independent of n ≥ 1. Also if n, m ≥ 1 are fixed and we set u(t) = xn (t) − xm (t), then we have
)
t ∈ T,
* u (t) = Au(t) + f xn (t) − f xm (t) , t ≥ 0, . m u(0) = xn 0 − x0 , n ≥ 1
(10.59)
From (10.59) as above, using (10.57) and the monotonicity of −A, we obtain 1 d u(t)2 ≤ kM u(t)2 2 dt
(10.60)
with kM > 0 being the Lipschitz constant of f on B M . From (10.60), via Gronwall’s inequality, we obtain m 2 u(t)2 = xn (t) − xm (t)2 ≤ cxn 0 − x0 ,
for some c > 0 independent of n, m ≥ 1. Therefore {xn }n≥1 ⊆ C(T, X) is Cauchy and so xn −→ x in C(T, X). Evidently x is mild solution of (10.25) on T = [0, b]. Because b > 0 is arbitrary, it follows that x is a global mild solution. As before we can easily check that x is unique. We conclude this section by applying the abstract results to some concrete parabolic problems. So let Z ⊆ RN be a bounded domain with a C ∞ -boundary. First we consider the following initial-boundary value problem:
10.2 Semilinear Evolution Equations ⎧ ∂x ⎫ ⎨ ∂t + x(t, z) = g(t, z) on R+ × Z, ⎬ . ⎩ x ⎭ = 0, x(0, z) = x (z), z ∈ Z 0 R ×Z
715
(10.61)
+
We make the following hypotheses on g. H(g): g : T ×Z −→ R is a function such that g(0, ·) ∈ L2 (Z) and (i) For all t ∈ T , z −→ g(t, z) is measurable. (ii) For almost all z ∈ Z and all t, t ∈ R+ we have |g(t, z) − g(t , z)| ≤ k(z)|t − t |
with k ∈ L2 (Z).
Then we have the following result concerning problem (10.61). PROPOSITION 10.2.21 If g(t, z) satisfies hypotheses H(g) and x0 ∈ H01 (Z) ∩ H 2 (Z), then problem (10.61) admits a unique global classical solution
x ∈ C R+ , L2 (Z) ∩ C 1 (0, ∞), H01 (Z) ∩ H 2 (Z) . PROOF: Let X = H = L2 (Z) and A : D(A) ⊆ H −→ H defined by Ax = −x ∩ H (Z). Then we can rewrite (10.61) as the following with x ∈ D(A) = equivalent semilinear evolution equation ) * x (t) = Ax(t) + g(t) , (10.62) x(0) = x0 H01 (Z)
2
where g(t) = g(t, ·) ∈ H for all t ≥ 0. We claim that −A is maximal monotone. To this end let h ∈ H = L2 (Z) and consider the following stationary problem * ) u(z) − u(z) = h(z) a.e. on Z, . (10.63) u∂Z = 0
Let V ∈ L H01 (Z), H −1 (Z) be defined by
(Dx, Dy)RN dz. V (x), y H = Z
Evidently V is monotone (hence maximal monotone) and coercive. Therefore it is surjective (see Corollary 3.2.28) and we can find x ∈ H01 (Z) such that V (x) = h ∈ H. Then x is a solution of (10.63) and from the linear regularity theory, we have x ∈ H 2 (Z), ⇒ x ∈ D(A), ⇒ R(I − A) = L2 (Z) = H.
Because −A is clearly monotone, it follows that −A is maximal monotone (see Theorem 3.2.30). Note that by virtue of hypothesis H(g)(ii), g : R+ −→ H is Lipschitz continuous. So we can apply Corollary 10.2.15 and obtain a classical solution x such that
x ∈ C R+ , L2 (Z) ∩ C 1 (0, ∞), H01 (Z) ∩ H 2 (Z) . Next we consider the semilinear heat equation:
⎧ ∂x ⎫ ⎨ ∂t + x(t, z) = f0 x(t, z) on R+ ×Z, ⎬ . ⎩ x ⎭ = 0, x(0, z) = x0 (z), z ∈ Z R ×Z
(10.64)
+
PROPOSITION 10.2.22 If f0 ∈ C 1 (R) and f0 is uniformly bounded, then for every x0 ∈ L2 (Z), problem (10.64) admits a unique global mild solution
x ∈ C R, L2 (Z) . Moreover, if x0 ∈ H01 (Z) ∩ H 2 (Z), then the above solution is classical and
x ∈ C 1 R+ , L2 (Z) ∩ C R+ , H01 (Z) ∩ H 2 (Z) . PROOF: As before, let X = H = L2 (Z), A : D(A) ⊆ H −→ H is defined by Ax = −x for all x ∈ D(A) = H01 (Z) ∩ H 2 (Z), and f : L2 (Z) −→ L2 (Z) is defined by
f (x)(·) = f0 x(·) .
Evidently f ∈ C 1 L2 (Z), L2 (Z) and also it is Lipschitz continuous. Then problem (10.64) can be equivalently rewritten as the following abstract evolution equation
* ) x (t) = Ax(t) + f x(t) . (10.65) x(0) = x0 Invoking Proposition 10.2.16 and Corollary 10.2.15, we obtain the conclusions of the proposition. PROPOSITION 10.2.23 If N ≤ 3, f0 (x) = −x3 + x, and x0 ∈ H01 (Z) ∩ H 2 (Z), then problem (10.64) admits a unique classical solution
x ∈ C 1 R+ , L2 (Z) ∩ C R+ , H01 (Z) ∩ H 2 (Z) . PROOF: Again X = H = L2 (Z), A : D(A) ⊆ H −→ H is defined by Ax = −x for all x ∈ D(A) = H01 (Z) ∩ H 2 (Z) and f : L2 (Z) −→ L2 (Z) is defined by f (x)(·) = f0 x(·) . Then problem (10.64) is equivalent to the semilinear evolution equation (10.65).We claim that f : D(A) −→ D(A). Indeed, because N ≤ 3, by the Sobolev embedding theorem, we have that H01 (Z) is embedded continuously in L6 (Z) and H 2 (Z) is embedded continuously into C(Z). Because f0 ∈ C ∞ (R), we see that for all u ∈ D(A), f (u) ∈ H01 (Z) and from the chain rule we have
10.2 Semilinear Evolution Equations
and
717
Df0 u(z) = f0 u(z) Du(z)
D2 f0 u(z) = f0 u(z) D2 u(z) + f0 u(z) Du(z), Du(z) RN .
Therefore we see that f (u) ∈ H 2 (Z) and so we have f : D(A) −→ D(A). Now let H1 = D(A) (with the graph norm) and A1 : D(A1 ) ⊆ H1 −→ H1 is defined by A1 x = Ax with x ∈ D(A1 ) = D(A2 ). Then applying Proposition 10.2.18 on H1 with operator A1 , we obtain a unique maximal classical solution
x ∈ C 1 Tmax , L2 (Z) ∩ C Tmax , H01 (Z) ∩ H 2 (Z) with Tmax = [0, bmax ). We show that bmax = +∞. To this end, it suffices to show that x(t)H 2 (Z) is bounded uniformly in t ≥ 0. So we multiply (10.64) with xt and then integrate over Z. We obtain
∂x 2 1 1 1 d (10.66) Dx2 + x4 − x2 dz + L2 (Z) = 0. dt Z 2 4 2 ∂t Note that x(z)2 ≤
1 x(z)4 + 1. 4
Using this in (10.66) we obtain
∂x 2 1 d (Dx2 + x2 )dz + L2 (Z) = 0, dt Z 2 ∂t
t ∂x 2 dt ≤ M ⇒ xH 1 (Z) ≤ M and 0 ∂t L (Z) 0
for some M > 0.
For r > 0, let u(t) = x(t + r). Evidently we have
u ∈ C 1 [0, bmax − r), L2 (Z) ∩ C [0, bmax − r), H01 (Z) ∩ H 2 (Z) ⎧ ∂u ⎫ 3 ⎨ ∂t + u(t, z) = −u(t, z) + u(t, z) ⎬ and
⎩
u(t, ·)∂Z = 0, u(0, z) = x(r, z)
⎭
.
(10.67)
We set y(t, z) = u(t, z) − x(t, z) = x(t + r, z) − x(t, z). From (10.64) and (10.67), we have ⎧ ∂y ⎫
⎨ ∂t + y(t, z) + y(t, z) u(t, z)2 + u(t, z)x(t, z) + x(t, z)2 − y(t, z) = 0 ⎬ . ⎩ ⎭ y(t, ·)∂Z = 0, y(0, z) = x(r, z) − x0 (z) (10.68) We multiply (10.68) with y(t, z) and then integrate over Z. We obtain
1 d y(t, ·)2L2 (Z) +Dy(t, ·)2L2 (Z) + y 2 (x2 + xu + u2 )dz = y(t, ·)2L2 (Z) . (10.69) 2 dt Z
718
10 Evolution Equations Note that x2 + xu + u2 ≥ 0,
⇒ y 2 (x2 + xu + u2 )dz ≥ 0.
(10.70)
Z
Using (10.70) in (10.69), we have 1 d y(t, ·)2L2 (Z) ≤ y(t, ·)2L2 (Z) 2 dt ⇒ y(t, ·)2L2 (Z) = x(t + r, ·) − x(t, ·)2L2 (Z) ≤ M1 x(r, ·) − x0 2L2 (Z) (10.71) for some M1 > 0 (by Gronwall’s inequality). The constant M1 > 0 depends only on bmax . From (10.71) it follows that ∂x (t, ·)2 2 ≤ M1 − u0 − u30 + u0 2L2 (Z) < +∞ L (Z) ∂t
(10.72)
(recall u0 ∈ H01 (Z) ∩ H 2 (Z)). We have u(t, ·)H 1 (Z) ≤ M u(t, ·)L2 (Z) 0 ∂x 2 ≤ M1 (t, ·)L2 (Z) + x(t, ·)3 L2 (Z) + x(t, ·)L2 (Z) ∂t ≤ M2 (see (10.72)) (10.73) with M2 > 0 depending only on bmax > 0. So, if bmax < +∞, from (10.73) it follows that lim x(t, ·)L2 (Z) < +∞, t→b− max
a contradiction. Therefore bmax = +∞.
REMARK 10.2.24 A careful reading of the above proof reveals that the result is more generally true if N ≤ 3 and f0 ∈ C 3 (R) with f0 (0) = 0.
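To make the global bound of Proposition 10.2.23 concrete, the following short numerical sketch (in Python, for the one-dimensional model case Z = (0, 1), with an arbitrarily chosen grid, time step, and initial datum) integrates u_t = u_zz + u − u^3 with homogeneous Dirichlet data by a semi-implicit scheme; the reported L^2 and H^1 quantities stay bounded in time, in agreement with the uniform a priori estimate of the proof.

import numpy as np

# Semi-implicit finite-difference scheme for u_t = u_zz + u - u^3 on (0, 1)
# with homogeneous Dirichlet data; grid, time step, and initial datum are
# illustrative choices only.
N, dt, T = 100, 1e-3, 2.0
h = 1.0 / (N + 1)
z = np.linspace(h, 1 - h, N)                 # interior nodes
u = 2.0 * np.sin(np.pi * z)                  # initial condition in H^1_0 ∩ H^2

# Discrete Laplacian with Dirichlet boundary conditions
L = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / h**2
A = np.eye(N) - dt * L                       # implicit treatment of the diffusion

for n in range(int(T / dt)):
    rhs = u + dt * (u - u**3)                # explicit treatment of f0(u) = -u^3 + u
    u = np.linalg.solve(A, rhs)

grad = np.diff(np.concatenate(([0.0], u, [0.0]))) / h
print("L2 norm:", np.sqrt(h * np.sum(u**2)), " H1 seminorm:", np.sqrt(h * np.sum(grad**2)))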
10.3 Nonlinear Evolution Equations

In this section we deal with nonlinear evolution equations. We focus on two broad classes of nonlinear evolution equations that arise often in applications. The first class concerns dynamic problems driven by subdifferential operators. It contains gradient flows and certain variational inequalities and it is related to the nonlinear semigroup theory briefly mentioned in Section 3.2. The second class considers evolutions monitored by general nonlinear operators of monotone type that are defined within the framework of an evolution triple of spaces (see Section 10.1). We start with the evolution inclusions of the subdifferential type. To prove the main existence and regularity results for such systems, we need some preparatory work. We start with a Gronwall-type lemma, which is a basic tool in deriving a priori estimates.
LEMMA 10.3.1 If h ∈ L^1(T)_+, T = [0, b], c ≥ 0, and x ∈ C(T) satisfies
(1/2) x^2(t) ≤ (1/2) c^2 + ∫_0^t h(τ) x(τ) dτ   for all t ∈ T,
then |x(t)| ≤ c + ∫_0^t h(τ) dτ for all t ∈ T.
PROOF: For any ε ≥ 0, let
ϑ_ε(t) = (1/2)(c + ε)^2 + ∫_0^t h(τ) x(τ) dτ.
We have
ϑ_ε'(t) = h(t) x(t)   for a.a. t ∈ T   (10.74)
and
(1/2) x^2(t) ≤ ϑ_0(t) ≤ ϑ_ε(t)   for all t ∈ T.   (10.75)
From (10.74) and (10.75) it follows that
ϑ_ε'(t) ≤ h(t) √(2 ϑ_ε(t))   for a.a. t ∈ T.   (10.76)
The function t → ϑ_ε(t) is absolutely continuous and the function r → √r is locally Lipschitz. So from the Serrin–Vallée Poussin chain rule, we have
(d/dt) √(ϑ_ε(t)) = ϑ_ε'(t) / (2 √(ϑ_ε(t)))   for a.a. t ∈ T,
⇒ (d/dt) √(ϑ_ε(t)) ≤ (1/√2) h(t)   a.e. on T (see (10.75)),
⇒ √(ϑ_ε(t)) ≤ √(ϑ_ε(0)) + (1/√2) ∫_0^t h(τ) dτ   for all t ∈ T.   (10.77)
Combining (10.75) and (10.77) we obtain
|x(t)| ≤ √(2 ϑ_ε(t)) ≤ √(2 ϑ_ε(0)) + ∫_0^t h(τ) dτ = c + ε + ∫_0^t h(τ) dτ   for all t ∈ T.
Because ε > 0 was arbitrary, we let ε ↓ 0, to obtain
|x(t)| ≤ c + ∫_0^t h(τ) dτ   for all t ∈ T.
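For example, if h ≡ λ > 0 is constant and x(t) = c + λt, then
(1/2) x(t)^2 = (1/2) c^2 + ∫_0^t λ (c + λτ) dτ   and   |x(t)| = c + ∫_0^t λ dτ,
so both the hypothesis and the conclusion hold with equality; in this sense the bound of the lemma cannot be improved.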
Let H be a Hilbert space with inner product denoted by (·, ·)H . The next lemma extends the Serrin–Vall´ee Poussin chain rule.
LEMMA 10.3.2 If ϕ ∈ Γ_0(H), x ∈ W^{1,2}((0, b), H), x(t) ∈ D(∂ϕ) a.e. on (0, b), and we can find h ∈ L^2(T, H) such that h(t) ∈ ∂ϕ(x(t)) a.e. on T, then the function t → ϕ(x(t)) is absolutely continuous on T and
(d/dt) ϕ(x(t)) = (u, x'(t))_H   a.e. on T, for all u ∈ ∂ϕ(x(t)).
PROOF:
From Corollary 3.2.51(a), we know that for every λ > 0 the function t −→ ϕλ x(t) is differentiable a.e. on T and we have
d ϕλ x(t) = ϕλ x(t) , x (t) dt H
= (∂ϕ)λ x(t) , x (t) a.e. on (0, b), H
t
(∂ϕ)λ x(τ ) , x (τ ) dτ ⇒ ϕλ x(t) − ϕλ x(s) = H
s
for all s, t ∈ T. (10.78)
From Proposition 3.2.44(d)(e) we know that
(∂ϕ)λ x(t) ≤ (∂ϕ)0 x(t) ≤ h(t) a.e. on T
and (∂ϕ)λ x(t) −→ (∂ϕ)0 x(t) a.e. on T as λ ↓ 0. So, from the dominated convergence theorem, we have
(∂ϕ)λ x(·) = ∂ϕλ x(·) −→ (∂ϕ)0 x(·) in L2 (T, H) as λ ↓ 0. Then passing to the limit as λ ↓ 0 in (10.78), we obtain
t
(∂ϕ)0 x(τ ) , x (τ ) dτ ϕ x(t) − ϕ x(s) = s
H
for all s, t ∈ T
(see Corollary 3.2.51(c)). Therefore the function t −→ ϕ x(t) is absolutely
continuous on T . Let t0 ∈ T be a differentiability point for both x(·) and ϕ x(·) and
x(t0 ) ∈ D(∂ϕ). Then for every u ∈ ∂ϕ x(t0 ) , we have
for all y ∈ H. ϕ x(t0 ) − ϕ(y) ≤ u, x(t0 ) − y H Let y = x(t0 ± ε), ε > 0 and divide by ε. We have
1
1
≤ ϕ x(t0 ) − ϕ x(t0 ± ε) u, x(t0 ) − x(t0 ± ε) H . ε ε H Passing to the limit as ε −→ 0, we obtain
ϕ x(t0 ) = u, x (t0 ) H . Now let A : D(A) ⊆ H −→ 2H \{∅} be a possibly multivalued nonlinear operator, x0 ∈ H, and g ∈ L1 (T, H). We consider the following Cauchy problem,
−x'(t) ∈ A(x(t)) + g(t),  t ∈ T,   x(0) = x_0.   (10.79)
DEFINITION 10.3.3 (a) A function x ∈ C(T, H) is said to be a strong solution (or solution) for the Cauchy problem (10.79), if t → x(t) is absolutely continuous on every compact subinterval of (0, b), x(t) ∈ D(A) a.e. on T, x(0) = x_0, and x(·) satisfies the differential equation in (10.79) for almost all t ∈ T.
(b) A function x ∈ C(T, H) is said to be a weak solution
for the Cauchy problem (10.79), if there exist sequences {xn }n≥1 ⊆ W 1,1 (0, b), H and {gn }n≥1 ⊆ L1 (T, H) such that for every n ≥ 1, −xn (t) ∈ A xn (t) +gn (t) a.e. on T , gn −→ g in L1 (T, H), and xn −→ x in C(T, H) as n → ∞ with x(0) = x0 . The following existence theorem for problem (10.79) is proved in Barbu [58, p. 124] and in Brezis [102, p. 54]. THEOREM 10.3.4 If A : D(A) ⊆ H −→ 2H is maximal monotone, x0 ∈ D(A), and
g ∈ W^{1,1}((0, b), H), then problem (10.79) has a unique (strong) solution x ∈ C(T, H), which is Lipschitz continuous (i.e., x ∈ W^{1,∞}((0, b), H)), is right differentiable at every point t ∈ [0, b), and satisfies
(d^+x/dt)(t) = −(A(x(t)) + g(t))^0   for all t ∈ [0, b),   (10.80)
‖(d^+x/dt)(t)‖ ≤ ‖(A(x(t)) + g(t))^0‖ + ∫_0^t ‖g'(τ)‖ dτ   for all t ∈ [0, b).   (10.81)
Moreover, if x, y ∈ W^{1,∞}((0, b), H) are solutions for the data (x_0, g_1), (y_0, g_2) ∈ D(A) × W^{1,1}((0, b), H), respectively, we have
‖x(t) − y(t)‖ ≤ ‖x_0 − y_0‖ + ∫_0^t ‖g_1(τ) − g_2(τ)‖ dτ   for all t ∈ T.   (10.82)
REMARK 10.3.5 The result is still true if T =[0, b] is replaced by R+ . Note that (10.80) means that for the multivalued system, among all choices the system tends to minimize its velocity. We can also slightly generalize Theorem 10.3.4 and assume that for some ϑ > 0, x −→ A(x) + ϑx is maximal monotone. Then the existence theorem is still valid, only the estimates in (10.81) and (10.82) need to be modified accordingly. The next theorem concerns weak solutions. THEOREM 10.3.6 If A : D(A) ⊆ H −→ 2H is a maximal monotone operator, x0 ∈ D(A), and g ∈ L1 (T, H), then there exists a unique weak solution x ∈ C(T, H) of (10.79) such that
(1/2) ‖x(t) − v‖^2 ≤ (1/2) ‖x(s) − v‖^2 + ∫_s^t (g(τ) − w, x(τ) − v)_H dτ   (10.83)
for all 0 ≤ s ≤ t ≤ b and all (v, w) ∈ Gr A. Moreover, if x, y ∈ C(T, H) are weak solutions corresponding to the data (x_0, g_1), (y_0, g_2) ∈ D(A) × L^1(T, H), respectively, then we have
(1/2) ‖x(t) − y(t)‖^2 ≤ (1/2) ‖x(s) − y(s)‖^2 + ∫_s^t (g_1(τ) − g_2(τ), x(τ) − y(τ))_H dτ   (10.84)
for all 0 ≤ s ≤ t ≤ b.
PROOF: Fix (x_0, g) ∈ D(A) × L^1(T, H). We can find {x_0^n}_{n≥1} ⊆ D(A) and {g_n}_{n≥1} ⊆ W^{1,1}((0, b), H) such that
x_0^n → x_0 in H   and   g_n → g in L^1(T, H)   as n → ∞.
For every n ≥ 1, we consider the following Cauchy problem
−x_n'(t) ∈ A(x_n(t)) + g_n(t) a.e. on T,   x_n(0) = x_0^n.   (10.85)
By virtue of Theorem 10.3.4 problem (10.85) has a unique strong solution x_n ∈ W^{1,∞}((0, b), H). Moreover, (10.82) implies that for all n, m ≥ 1, we have
‖x_n(t) − x_m(t)‖ ≤ ‖x_0^n − x_0^m‖ + ∫_0^b ‖g_n(τ) − g_m(τ)‖ dτ,
⇒ {x_n}_{n≥1} ⊆ C(T, H) is Cauchy. So there exists x ∈ C(T, H) such that x_n → x in C(T, H).
Evidently x ∈ C(T, H) is a weak solution for problem (10.79). Moreover, because for every n ≥ 1 xn (·) satisfies (10.83) and (10.84), passing to the limit as n → ∞, we deduce that x(·) satisfies them too. Finally the uniqueness of the weak solution follows at once from (10.84). From Theorems 10.3.4 and 10.3.6 we see that in the general case, in order to have a strong solution we need to restrict
the data of the problem, namely we need to assume that x0 ∈ D(A) and g ∈ W 1,1 (0, b), H ; that is, we need a “smooth” initial condition and “smooth” forcing term. We show that in the case of subdifferential evolution inclusions these restrictive requirements can be removed. THEOREM 10.3.7 If A = ∂ϕ : D(∂ϕ) ⊆ H −→ 2H with ϕ ∈ Γ0 (H), x0 ∈ D(∂ϕ), and g ∈ L2 (T, H), then problem (10.79) admits a unique strong solution x ∈ C(T, H) such that √ t −→ t x (t) belongs in L2 (T ),
t −→ ϕ x(t) belongs in L1 (T, H), and it is absolutely continuous on [δ, b], ∀δ > 0. Moreover, if x0 ∈ dom ϕ, then x ∈ L2 (T, H) and t −→ ϕ x(t) is absolutely continuous on T . PROOF: First note that if (v0 , w0 ) ∈ Gr ∂ϕ and we introduce the function ϕ0 (x) = ϕ(x) − ϕ(v0 ) − (w0 , x − v0 )H ≥ 0, x ∈ H,
then the subdifferential inclusion −x (t) ∈ ∂ϕ x(t) + g(t) is equivalent to
−x (t) ∈ ∂ϕ0 x(t) + g(t) − w0 . So, without any loss of generality, we may assume that ϕ(v0 ) = inf ϕ = 0. H
(10.86)
Now suppose that x0 ∈ D(∂ϕ) and g ∈ W 1,2 (0, b), H . Then Theorem 10.3.4 implies that there exists a unique strong solution x ∈ W 1,∞ (0, b), H . We multiply the equation with tx (t) and use Lemma 10.3.2. We obtain
d
tx (t)2 + t ϕ x(t) = t g(t), x (t) H a.e. on T, dt
b
b
b
2 ⇒ tx (t) dt + b ϕ x(b) = t g(t), x (t) H + ϕ x(t) dt. 0
0
0
(10.87) Note that
1 1 t g(t), x (t) H ≤ tg(t) x (t) ≤ t g(t)2+ t x (t)2 2 2
for all t ∈ T. (10.88)
Using (10.88) and the fact that ϕ ≥ 0 (see (10.86)) in (10.87), we obtain
b
b
b
tx (t)2 dt ≤ tg(t)2 dt + 2 ϕ x(t) dt. (10.89) 0
0
0
Also from the definition of the convex subdifferential, we have
ϕ x(t) ≤ −x (t) − g(t), x(t) − v0 H (recall that φ(v0 ) = 0, see (10.86))
b
b
ϕ x(t) dt ≤ ⇒ −x (t), x(t)−v0 H dt+ 0
0
b
g(t) x(t) − v0 dt 0
b 1 d g(t) x(t) − v0 dt x(t) − v0 2 dt+ 0 2 dt 0
b 1 ≤ x0 − v0 2 + g(t) x(t) − v0 dt. (10.90) 2 0 b
=−
Next take the inner product of the differential equation with x(t) − v0 and then integrate over [0, s]. We obtain
s
s
−x (t), x(t) − v0 H dt ∈ ∂ϕ x(t) , x(t) − v0 dt H 0 0
s
(10.91) g(t), x(t) − v0 H dt. + 0
We have
s
−x (t), x(t) − v0
0
s
g(t), x(t) − v0
0
dt =
1 1 x0 − v0 2 − x(s) − v0 2 2 2
H
s
and
H
(10.92)
∂ϕ x(t) , x(t) − v0 dt ≥ 0 (because 0 ∈ ∂ϕ(v0 ) and ∂ϕ is
0
H
dt ≥ −
monotone) (10.93) s
g(t) x(t) − v0 dt. 0
Using (10.92) through (10.94) in (10.91), we obtain
(10.94)
s 1 1 g(t) x(t) − v0 dt, x(s) − v0 2 ≤ x0 − v0 2 + 2 2 0
b ⇒ x(s) − v0 ≤ x0 − v0 + g(t)dt for all s ∈ T.
(10.95)
0
Using (10.95) in (10.90), we obtain
b
ϕ x(t) dt ≤ x0 − v0 + 0
b
2 g(t)dt .
We use (10.96) in (10.89) and have
b
b
tx (t)2 dt ≤ tg(t)2 dt + 2 x0 − v0 + 0
(10.96)
0
0
b
2 g(t)dt .
(10.97)
0
Now suppose that x0 ∈ D(∂ϕ) = dom ϕ and g ∈ L2 (T, H). Then we can find {xn 0 }n≥1 ⊆ D(∂ϕ)
and
{gn }n≥1 ⊆ L2 (T, H)
such that xn in H and gn −→ g in L2 (T, H) as n → ∞. 0 −→ x0
Let xn ∈ W 1,∞ (0, b), H , n ≥ 1, be the unique strong solution of the Cauchy n problem with data (x0 , gn ). From (10.82), we see that {xn }n≥1 ⊆ C(T, H) is Cauchy. So we can find x ∈ C(T, H) such that xn −→ x
in C(T, H) as n → ∞.
So we have that xn −→ x in the sense of H-valued distributions on (0, b). Because 2 (10.97) is true for {xn , xn 0 , gn }n≥1 , we deduce that t −→ tx (t) belongs in L (T, H) and √ w √ t xn −→ t x in L2 (T, H) as n → ∞. In particular then, for any 0 < δ < b, we have xn −→ x
in L2 ([δ, b], H) as n → ∞.
If A is the realization of A in L2 ([δ, b], H), then A is maximal monotone and (xn , −xn − gn ) ∈ Gr A
for all n ≥ 1.
From Proposition 3.2.7 it follows that (x, −x − g) ∈ Gr A, )
−x (t) ∈ ∂ϕ x(t) + g(t) x(0) = x0
a.e. on T (because δ ≥ 0 is arbitrary),
* .
Next, let Jϕ : L1 (T, H) −→ R = R ∪ {+∞} be the integral functional defined by b
ϕ u(t) dt if ϕ u(·) ∈ L1 (T, H) 0 Jϕ (u) = . +∞ otherwise
We have that Jϕ ∈ Γ0 L1 (T, H) . Also because of (10.96) we see that we can find c0 > 0 such that for n ≥ 1 ⇒ Jϕ (xn ) ≤ c0 , ⇒ Jϕ (x) ≤ lim inf Jϕ (xn ) ≤ c0 , n→∞
⇒ x ∈ dom Jϕ .
So using Lemma 10.3.2, we have that t −→ ϕ x(t) belongs in L1 (T, H) and it is absolutely continuous on [δ, b] for all 0 < δ < b. Finally suppose that x0 ∈ dom ϕ. We have d
ϕ x(t) ≤ g(t) x (t) a.e. on T, dt d
1 1 ⇒ x (t)2 + ϕ x(t) ≤ g(t)2 a.e. on T, 2 dt 2
1 t ⇒ t −→ ϕ x(t) − g(τ )2 dτ is decreasing on T. 2 0 x (t)2 +
Because x0 ∈ dom ϕ, we obtain
1 t ϕ x(t) ≤ ϕ(x0 ) + g(τ )2 dτ 2 0
for all t ∈ [0, b).
(10.98) (10.99)
(10.100)
For δ > 0, from (10.98), we obtain
1 b 1 b x (t)2 dt ≤ ϕ x(δ) + g(t)2 dt 2 δ 2 δ
1 b ≤ ϕ(x0 ) + g(t)2 dt (see (10.98)), 2 0 ⇒ x ∈ L2 (T ).
∞ Finally from (10.100),
we have ϕ x(·) ∈ L (T ) and from Lemma 10.3.2 we conclude that t −→ ϕ x(t) is absolutely continuous.
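The approximation underlying this proof can be illustrated numerically. In the finite-dimensional case H = R^n with ϕ(x) = ‖x‖_1 (so that (I + τ ∂ϕ)^{-1} is the componentwise soft-thresholding map), the implicit Euler discretization of −x'(t) ∈ ∂ϕ(x(t)) + g(t) reads x_{k+1} = (I + τ ∂ϕ)^{-1}(x_k − τ g(t_k)). The following Python sketch (the choices of ϕ, of g, and of the step size τ are illustrative only) implements this scheme.

import numpy as np

# Implicit Euler (proximal) scheme for -x'(t) ∈ ∂φ(x(t)) + g(t), x(0) = x0,
# with H = R^n and φ(x) = ||x||_1, whose resolvent is soft-thresholding.
# Each step solves x_k - τ g(t_k) ∈ (I + τ ∂φ)(x_{k+1}).
def prox_l1(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

n, tau, steps = 5, 0.01, 500
rng = np.random.default_rng(0)
x = rng.normal(size=n)                      # initial datum x0
g = lambda t: 0.1 * np.sin(t) * np.ones(n)  # forcing term g ∈ L^2(T, H)

for k in range(steps):
    x = prox_l1(x - tau * g(k * tau), tau)

print("state after time", steps * tau, ":", x)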
REMARK 10.3.8 If g ∈ W 1,1 (0, b), H , then x is right differentiable at every
0 + point in (0, b), x(t) ∈ D(A) for every t ∈ (0, b] and − ddtx (t) ∈ ∂ϕ x(t) + g(t) for all t ∈ (0, b].
Subdifferential evolution inclusions exhibit a remarkable smoothing effect. In what follows by {S(t)}t≥0 we denote the semigroup of nonlinear contractions S(t) : D(A) −→ D(A), t ≥ 0, generated by A (see Theorem 3.2.93). THEOREM 10.3.9 If A = ∂ϕ : D(∂ϕ) ⊆ H −→ 2H , ϕ ∈ Γ0 (H), and x0 ∈ D(A), then (a) Sx0 ∈ D(A) for all t > 0 (smoothing effect). +
(b) ddt S(t)x0 = A0 S(t)x0 ≤ A0 (x0 ) + 1t x0 − v0 for all t > 0 and all v0 ∈ D(∂ϕ).
PROOF: (a) As in the proof of Theorem 10.3.7, let (v0 , w0 ) ∈ Gr ∂ϕ and define the function ϕ0 (x) = ϕ(x) − ϕ(v0 ) − (w0 , x − v0 )H , x ∈ H. We have ϕ0 ≥ 0 (recall the definition of the subdifferential) and 0 = ϕ0 (v0 ) = min ϕ0 . H
We set x(t) = S(t)x0 , t ≥ 0. Then x ∈ C(T, H) is a strong solution of the Cauchy problem
* ) −x (t) ∈ ∂ϕ0 x(t) + w0 , . (10.101) x(0) = x0 Then by virtue of Remark 10.3.8, we have x(t) = S(t)x0 ∈ D(A) for all t > 0. (b) Also, again from Remark 10.3.8, we have 0
d+ x for all t > 0 (10.102) (t) ∈ −∂ϕ x(t) − w0 dt
t
t t d+ x 2
d+ x ⇒ s ϕ0 x(s) ds − s w0 , (s) ds ≤ (s) ds dt dt H 0 0 0
t
t
ϕ0 x(s) ds − t w0 , x(t) H + w0 , x(s) ds = 0
0
(10.103) (integration by parts).
Because −w0 − x (t) ∈ ∂ϕ0 x(t) a.e on R+ , we have
ϕ0 x(t) ≤ (−w0 − x (t), x(t) − v0 )H a.e. on T,
t
t
ϕ0 x(s) ds ≤ (−w0 − x (s), x(s) − v0 )H ds ⇒ 0
0
=
1 1 x0 − v0 2 − x(t) − v0 2 + t(v0 , w0 )H 2 2
t
− (10.104) w0 , x(s) H ds, t ≥ 0. 0
Using (10.104) in (10.103), we obtain
t d+ x 2 1 1 s (s) ds ≤ x0 − v0 2 + t(w0 , v0 − x(t))H − v0 − x(t)2 dt 2 2 0 1
2 2 2 ≤ x0 − v0 + t w0 , t ≥ 0. (10.105) 2 Note that d+ for all t > 0 and all h > 0, x(t + h) − x(t)2 ≤ 0 dt x(t + h) − x(t) x(s + h) − x(s) ⇒ ≤ for all 0 < s ≤ t, h > 0, h h + d x ⇒ t −→ (t) is decreasing on (0, +∞). dt
From this fact and (10.105), we obtain d+ x 2 1 1 2 2 x − v + w , t > 0, (t) ≤ 0 0 0 dt 2 t2 + d 1 ⇒ S(t)x0 ≤ x0 − v0 + w0 , t > 0. dt t Let us present an application of Theorem 10.3.7. So let β : D(β) ⊆ R −→ 2R \{∅} be a maximal monotone function. We know that β = ∂j with j ∈ Γ0 (R). Also let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z. We consider the following heat equation with Neumann boundary condition. ⎧ ∂x ⎫ ⎨ − ∂t − x(t, z) = g(t, z) a.e. on (0, b) × Z, ⎬ . (10.106)
⎩ − ∂x ∈ β x(t, z) , x(0, z) = x0 (z) ⎭ ∂n (0,b)×∂Z 2 PROPOSITION 10.3.10 If g ∈ L2 (T ×Z) and x0 ∈ L (Z), then problem (10.106) has a unique solution x ∈ C T, L2 (Z) such that
√
tx ∈ L2 T, H 2 (Z) ,
√ ∂x t ∈ L2 (T × Z). ∂t
Moreover, if x0 ∈ H 1 (Z) and j x0 (·) ∈ L1 (Z), then
x ∈ L2 T, H 2 (Z) ,
∂x ∈ L2 (T × Z). ∂t
PROOF: Let H = L^2(Z) and let ϕ : H = L^2(Z) → R ∪ {+∞} be defined by
ϕ(x) = (1/2) ∫_Z ‖Dx(z)‖^2 dz + ∫_{∂Z} j(x(z)) dσ   if x ∈ H^1(Z) and j(x(·)) ∈ L^1(∂Z),   and ϕ(x) = +∞ otherwise.
Let A : L^2(Z) → L^2(Z) be the operator defined by
Ax = −∆x   for all x ∈ D(A) = { x ∈ H^2(Z) : −(∂x/∂n)(z) ∈ β(x(z)) a.e. on ∂Z }.
Evidently this is a nonlinear operator and the definition of D(A) makes sense because x ∈ H^2(Z) implies that x|_{∂Z} ∈ H^{3/2}(∂Z) and ∂x/∂n ∈ H^{1/2}(∂Z) ⊆ L^2(∂Z). Using Green's identity, we obtain
∫_Z (−∆x)(y − x) dz = ∫_Z (Dx, Dy − Dx)_{R^N} dz − ∫_{∂Z} (∂x/∂n)(y − x) dσ ≤ ϕ(y) − ϕ(x)
for all x ∈ D(A), y ∈ L^2(Z),
⇒ A ⊆ ∂ϕ. To show that A = ∂ϕ, it suffices to show that A is maximal monotone in L2 (Z) and this follows if we show that R(I + A) = L2 (Z). To this end, we consider the following elliptic (stationary) problem.
−∆x_λ(z) + x_λ(z) = h(z) a.e. on Z,   −(∂x_λ/∂n)(z) = β_λ(x_λ(z)) a.e. on ∂Z,   (10.107)
with h ∈ L^2(Z) and β_λ = (1/λ)(1 − (1 + λβ)^{−1}) (the Yosida approximation of β). Consider the operator K : L^2(∂Z) → L^2(∂Z), where K(x) = y|_{∂Z} with y being the unique solution of
−∆y(z) + y(z) = h(z) a.e. on Z,   y(z) + λ(∂y/∂n)(z) = (1 + λβ)^{−1}(x(z)) a.e. on ∂Z.   (10.108)
Problem (10.108) has a unique solution y ∈ H 2 (Z) and using Green’s identity we can have K(x) − K(u)L2 (∂Z) ≤ ϑx − uL2 (∂Z) 0 < ϑ < 1,
for all x, u ∈ L2 (Z).
So by Banach’s fixed point theorem we can find y ∈ L2 (∂Z) such that K(y) = y. Then the corresponding solution xλ ∈ H 2 (Z) of (10.108) also solves (10.107). Moreover, from Brezis [100], we know that
xλ H 2 (Z) ≤ c 1 + hL2 (Z) for some c > 0 and all λ > 0.
∂x Recalling that the trace map x −→ x∂Z , ∂n is continuous from H 2 ((Z)) ∂Z ∂xλ 3/2 1/2 into H (∂Z) × H (∂Z), we conclude that xλ , ∂n λ>0 ⊆ L2 (∂Z) × L2 (∂Z) is bounded. Also we know that H 2 (Z) is embedded compactly in L2 (Z). Hence we may assume that w
xλ −→ x
in H 2 (Z), xλ −→ x
Moreover, because βλ xλ (·)
in L2 (Z), λ>0
∂xλ w ∂x −→ ∂n ∂n
in L2 (∂Z)
as λ ↓ 0.
⊆ L2 (Z) is bounded, we have
(1 + λβ)−1 xλ −→ x
in L2 (Z) as λ ↓ 0.
(10.109)
Therefore, in the limit as λ ↓ 0, we obtain −x(z) + x(z) = h(z)
a.e. on Z.
2
Also, if β : D(β) ⊆ L2 (∂Z) −→ 2L (∂Z) is defined by
β(x) = y ∈ L2 (∂Z) : y(z) ∈ β x(z) a.e. on ∂Z ,
x ∈ L2 (∂Z)
(the realization (lifting) of β on L2 (∂Z)), then β is maximal monotone and
βλ (x)(z) = βλ x(z) a.e. on ∂Z.
Because βλ (x) ∈ β (I + λβ)−1 x , we obtain w
βλ (xλ ) −→
∂x ∂n
in L2 (∂Z)
as λ ↓ 0
(see (10.109)).
Because β is maximal monotone, we deduce that ∂x ∈ β(x). ∂n Hence we conclude that A is maximal monotone and so A = ∂ϕ. Now we equivalently rewrite (10.106) as the following nonlinear evolution inclusion
* ) −x (t) ∈ ∂ϕ x(t) + g(t) a.e. on T, . (10.110) x(0) = x0 with g(t) = g(t, ·) ∈ L2 (Z) for all t ∈ T . Then the conclusion of Proposition 10.3.10 follows from Theorem 10.3.7 applied to the Cauchy problem (10.110). Now we turn our attention to evolution inclusions driven by operators of monotone type defined in the framework of an evolution triple. So let T = [0, b] and let (X, H, X ∗ ) be an evolution triple of spaces (see Definition 10.1.21). We consider the following Cauchy problem,
) * −x (t) + A t, x(t) = f t, x(t) a.e. on T, . (10.111) x(0) = x0 The hypotheses on the data of (10.111) are the following. H(A): A : T ×X −→ X ∗ is a map such that (i) For every x ∈ X, t −→ A(t, x) is measurable. (ii) For almost all t ∈ T , x −→ A(t, x) is hemicontinuous monotone. (iii) For almost all t ∈ T and all x ∈ X, we have A(t, x)∗ ≤ α(t) + cxp−1
with 2 ≤ p < ∞, α ∈ L^{p'}(T), (1/p) + (1/p') = 1, and c > 0.
(iv) For almost all t ∈ T and all x ∈ X, we have A(t, x), x ≥ c1 xp
with c1 > 0.
REMARK 10.3.11 Hypothesis H(A)(ii) implies that for almost all t ∈ T, A(t, ·) is maximal monotone. H(f ): f : T ×H −→ H is a function such that (i) For all x ∈ H, t −→ f (t, x) is measurable. (ii) For almost all t ∈ T , x −→ f (t, x) is sequentially continuous from H into Hw (by Hw we denote the pivot Hilbert space H furnished with the weak topology).
(iii) For almost all t ∈ T and all x ∈ H, we have
|f(t, x)| ≤ α_2(t) + c_2(t) |x|^{2/p'}
with α_2, c_2 ∈ L^{p'}(T), (1/p) + (1/p') = 1.
By a solution of (10.111), we understand a function x ∈ Wp (0, b) (see Definition 10.1.23) that satisfies (10.111). In what follows by S(x0 ) we denote the solution set of (10.111). Then S(x0 ) ⊆ Wp (0, b) ⊆ C(T, H) (see Theorem 10.1.25). THEOREM 10.3.12 If hypotheses H(A), H(f ) hold, x0 ∈ H, and X is embedded compactly into H, then S(x0 ) is nonempty, weakly compact in Wp (0, b) and compact in C(T, H). PROOF: The proof of the existence is based on the Galerkin method. Because X is separable reflexive, we can find a basis {vk }k≥1 for X (hence for H and X ∗ too). We set Hn = span{vk }n k=1 ,
n≥1
and endow Hn with the inner product inherited from H. By pn : H −→ Hn we denote the orthogonal projection on Hn . Also let Xn be the finite dimensional space Hn equipped with the X-norm (i.e., we view Hn as a subspace of X rather than of H). Then the dual Xn∗ of Xn is the space Hn furnished with the X ∗ -norm. For every n ≥ 1, we consider An : T × Xn −→ Xn∗ to be restriction (Galerkin approximation) of A(t, ·) on Xn . So we have An (t, x) = y
for x ∈ Xn
with y ∈ Xn∗ satisfying A(t, x), u = y, u ,
for all u ∈ Xn ; that is, An (t, x), uXn = A(t, x), u for all x, u ∈ Xn . Here by ·, ·Xn , we denote the duality brackets for the pair (Xn∗ , Xn ). Clearly for every x ∈ Xn , t −→ An (t, x) is measurable and for almost all t ∈ T, x −→ An (t, x) is continuous. Also we set fn (t, x) = pn f (t, x) for (t, x) ∈ T × Hn . We see that fn is a Carath´eodory function and |fn (t, x)| = |pn f (t, x)| ≤ pn L |f (t, x)| ≤ α2 (t) + c2 (t)|x|2/q
for a.a. t ∈ T, all x ∈ H.
We consider the following finite-dimensional approximation (Galerkin approximation), of problem (10.111).
) * xn (t) + An t, xn (t) = f t, xn (t) a.e. on T, . (10.112) n xn (0) = pn x0 = x0 ∈ Hn From Carath´eodory’s existence theorem, we know that problem (10.112) has at least one solution xn ∈ Wp (0, b). Next we derive some a priori bounds for the
sequence {xn }n≥1 . For this purpose we take the duality brackets with xn (t) and obtain
xn (t), xn (t) + An t, xn (t) , xn (t) = pn f t, xn (t) , xn (t) a.e. on T = [0, b]
d ⇒ |xn (t)|2 + 2c1 xn (t)2 ≤ 2|pn f t, xn (t) ||xn (t)|, dt (see Corollary 10.1.26 and H(A)(iv)) εp
p 1 p ≤ 2ξ |p f t, x (t) | + (t) x n n n p εp p (by Young’s inequality) with ε > 0 and ξ > 0 such that | · | ≤ ξ · . We choose 1/p ε = c1εp >0 and obtain
p d a.e. on T with c3 > 0 |xn (t)|2 ≤ c3 pn f t, xn (t) dt p ≤ c3 α2 (t) + c2 |xn (t)|2/p ≤ 2p
−1
c3 α2 (t)p + 2p
−1
c3 cp2 |xn (t)|2
a.e. on T,
(see H(f )(iii)). We integrate this inequality over [0, t] and because |xn 0 | = |pn x0 | ≤ pn L |x0 | = |x0 |, we obtain
t |xn (t)|2 ≤ |x0 |2 + 2p −1 cp3 α2 pp + 2p −1 cp3 c2 (s)p |xn (s)|2 ds. (10.113) 0
From (10.113), using Gronwall's inequality, we obtain
|x_n(t)| ≤ M_1   for all n ≥ 1 and all t ∈ T = [0, b].   (10.114)
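Explicitly, (10.113) is of the form
|x_n(t)|^2 ≤ a + ∫_0^t b(s) |x_n(s)|^2 ds,   with a ≥ 0 a constant and b ∈ L^1(T)_+ independent of n,
so Gronwall's inequality yields |x_n(t)|^2 ≤ a exp(∫_0^t b(s) ds) ≤ a e^{‖b‖_1}; one may therefore take M_1 = (a e^{‖b‖_1})^{1/2}.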
Recall that
d |xn (t)|2 + 2c1 xn (t)2 ≤ 2pn f t, xn (t) |xn (t)| a.e. on T, dt
≤ 2M1 pn f t, xn (t) a.e. on T, (see (10.114)),
b
xn (t)dt ≤ |x0 |2 + 2M1 ⇒ 2c1 0
b
|f t, xn (t) |dt
0
b
≤ |x0 | + 2M1 2
0
⇒ xn L2 (T,X) ≤ M2
for some M2 > 0
Also we know that
xn (t) = −An t, xn (t) + pn f t, xn (t)
2/p
α2 (t) + c2 (t)M1
dt
and all n ≥ 1.
a.e. on T
for all n ≥ 1.
(10.115)
(10.116)
From (10.116) and hypotheses H(A)(iii) and H(f )(iii), we obtain that xn L2 (T,X ∗ ) ≤ M3
for some M3 > 0 and all n ≥ 1.
(10.117)
Motivated by (10.115) and (10.117), we introduce the set
C = { x ∈ Wp(0, b) : ‖x‖_{L^p(T,X)} ≤ M_2 and ‖x'‖_{L^{p'}(T,X^*)} ≤ M_3 }.
Evidently C is bounded, closed, and convex (hence w-compact, convex) in Wp(0, b). Invoking Theorem 10.1.29, we deduce that C is compact in L^p(T, H). Let S_n be the solution set of the Galerkin approximation problem (10.112). Note that S_n ⊆ C for all n ≥ 1. By passing to a subsequence if necessary, we may assume that
x_n → x weakly in Wp(0, b),   x_n → x in L^p(T, H),
and x_n(t) → x(t) in H for a.a. t ∈ T.
Also, if Nn (xn )(·) = fn ·, xn (t) = pn fn ·, xn (t) (the Nemitsky operator), then w
Nn (xn ) −→ g
in Lp (T, H)
as n → ∞.
For every D ⊆ T measurable and every h ∈ H, we have
pn f t, xn (t) , h dt Nn (xn )(t), h dt = D D
= f t, xn (t) , pn h dt, D
⇒ g(t), h dt = f t, x(t) , h dt (because pn h −→ h in H). D
D
Because D ⊆ T and h ∈ H, were arbitrary, we infer that
g(t) = f t, x(t) a.e. on T.
(10.118)
Let An : Lp (T, Xn ) −→ Lp (T, Xn ), n ≥ 1, be the Nemitsky operator correspond
ing to An (t, x); that is, An (x)(·) = An ·, x(·) for all x ∈ Lp (T, Xn ). Also let (·, ·)
p denote the duality brackets for the pair L (T, X ∗ ), Lp (T, X) ; that is, (u∗ , x) =
b
u∗ (t), x(t) dt
for all x ∈ Lp (T, X), u∗ ∈ Lp (T, X ∗ ).
0
Then for every y ∈ Lp (T, X), we have
(xn , y) + (An (xn ), y) = (Nn (xn ), y) .
(10.119)
Because of hypothesis H(A)(iii) and (10.115), we may assume that w
An (xn ) −→ v
in Lp (T, X ∗ )
as n → ∞.
So, if we pass to the limit as n → ∞ in (10.119), we obtain
for all y ∈ Lp (T, X), (x , y) + (v, y) = (N (x), y)
with N (x)(·) = f ·, x(·) , (see (10.118)),
⇒ x (t) + v(t) = f t, x(t) a.e. on T.
(10.120)
From Theorem 10.1.25, we know that Wp (0, b) is embedded continuously in w w C(T, H). So xn (0) = xn 0 −→ x(0) = x0 and xn (b) −→ x(b) in H. Also
(Nn , (xn ), xn ) − (xn , xn ) = (An (xn ), xn )
1
1 2 ⇒ (Nn (xn ), xn ) − |xn (b)|2 + |xn 0 | = (An (xn ), xn ) 2 2 (see Corollary 10.1.26),
1 1 ⇒ (N (x), x) − |xn (b)|2 + |x0 |2 ≥ lim sup (An (xn ), xn ) . 2 2 n→∞ (10.121) From (10.120), with y = x and (10.121), we have
lim sup (An (xn ), xn ) ≤ (v, x) , n→∞
⇒ lim sup (An (xn ), xn − x) ≤ 0. n→∞
(10.122)
But it is easy to see that A is hemicontinuous and monotone, hence maximal monotone. So it is generalized pseudomonotone and from (10.122) we have A(x) = v; w that is, An (xn ) −→ A(x) in Lp (T, X ∗ ). Thus finally x + A(x) = N (x), x(0) = x0 ,
⇒ x (t) + A t, x(t) = f t, x(t) a.e. on T, x(0) = x0 , ⇒ x ∈ S(x0 ) = ∅. Next we show the topological properties of S(x0 ) ⊆ Wp (0, b). From (10.114), (10.115), and (10.117), we know that we can find M = max{M1 , M2 , M3 } > 0 such that xLp (T,X) , x Lp (T,X ∗ ) , xC(T,X) ≤ M for all x ∈ S(x0 ).
If we set ϑ(t) = α2 (t) + c2 (t)M 2/p , then ϑ ∈ Lp (T ) and we may assume that |f (t, x)| ≤ ϑ(t) for a.a. t ∈ T, all x ∈ H.
Otherwise we replace f(t, x) by f(t, r_M(x)), with r_M : H → H being the M-radial retraction; that is, r_M(x) = x if |x| ≤ M and r_M(x) = Mx/|x| if |x| > M, for all x ∈ H. Then we introduce the set K = { h ∈ L^p(T, H) : |h(t)| ≤ ϑ(t)
a.e. on T .
In what follows we consider K equipped with the relatively weak Lp (T, H)topology. Then K becomes a compact metrizable space. For every h ∈ K we consider the auxiliary Cauchy problem
) * −x (t) + A t, x(t) = h(t) a.e. on T, . x(0) = x0
This problem has a unique solution x = η(h) ∈ Wp (0, b). We consider the solution map η : Lp (T, H) −→ C(T, H). Claim: η(K) is compact in C(T, H). Let {xn }n≥1 ⊆ η(K). Then xn = η(hn ) with hn ∈ K, n ≥ 1. From the a priori estimation of the first part of the proof, we know that {xn }n≥1 ⊆ Wp (0, b) is bounded. So we may assume that w
xn −→ x
in Wp (0, b),
xn (t) −→ x(t)
xn −→ x
in Lp (T, H)
for all t ∈ T \ N0 , |N0 | = 0
(see Theorem 10.1.29) in H
p
w
hn −→ h in L (T, H) as n → ∞. The sequence xn (·), xn (·) − x(·)}n≥1 is uniformly integrable. Therefore given ε > 0, we can find t ∈ T \ N0 such that and
b
| xn (s), xn (s) − x(s) |ds < ε
for all n ≥ 1.
(10.123)
t
For t ∈ T , let (·, ·) t denote the duality brackets for the pair Lp ([0, t], X ∗ ), Lp ([0, t], X) . Using Corollary 10.1.26, we have
1 (xn , xn − x) t = |xn (t) − x(t)|2 + (x , xn − x) t . 2
(10.124)
w
2 If
t ∈ T \N0 , then |xn (t) − x(t)| −→ 0. Also because xn −→ x in Wp (0, b), we have (x , xn − x) t −→ 0 as n → ∞. So from (10.124), we infer that
(xn , xn − x)
t
−→ 0
as n → ∞.
We have
b
(xn , xn − x) = (xn , xn − x) t + xn (s), xn (s) − x(s) ds t
≥ (xn , xn − x) t − ε (see (10.123)),
(10.125) ⇒ lim inf (xn , xn − x) ≥ 0 (because ε > 0 was arbitrary).
n→∞
In a similar fashion, we show that
lim sup (xn , xn − x) ≤ 0.
(10.126)
Therefore, from (10.125) and (10.126) we conclude that
(xn , xn − x) −→ 0 as n → ∞.
(10.127)
n→∞
Note that
(xn , xn − x) + (A(xn ), xn − x) =
⇒ lim (A(xn ), xn − x) = 0. n→∞
hn (t), xn (t) − x(t) dt,
b
0
(10.128)
As before from (10.128), it follows that
in Lp (T, X ∗ ).
w
A(xn ) −→ A(x) So, in the limit as n → ∞, we have
x + A(x) = h, h ∈ K, x(0) = x0 , ⇒ x ∈ η(K). We show that xn −→ x in C(T, H). We have
1 |xn (t) − x(t)|2 = (hn − h, xn − x) − (A(xn ) − A(x), xn − x) t , 2
b
1 ⇒ |xn (t) − x(t)|2 ≤ | hn (s) − h(s), xn (s) − x(s) |ds 2 0
b
+ | A(s), xn (s) − x(s) |dt + (A(x), xn − x) t . 0
(10.129) Note that
b
| hn (s) − h(s), xn (s) − x(s) |ds −→ 0
as n → ∞.
(10.130)
0
Let βn (t) = A t, xn (t), xn (t) − x(t) . If t ∈ T \N 0 , N0 ⊆ N 0 , |N 0 | = 0, then (10.131) βn (t) ≥ c1 xn (t)p α(t) + cxn (t)p−1 x(t). Let D = t ∈ T : lim inf βn (t) < 0 . This is a Lebesgue measurable subset of T . n→∞
Suppose |D| > 0. For all t ∈ C ∩ (T \ N 0 ) = ∅ we have that {xn (t)}n≥1 ⊆ X is w bounded (see (10.131)). Because xn −→ x in Wp (0, b) and Wp (0, b) is embedded w w continuously in C(T, H), we have xn −→ x in C(T, H) and so xn (t) −→ x(t) in H for all t ∈ T . Therefore, exploiting the reflexivity and separability of X, we have w xn (t) −→ x(t) in X as n → ∞ for all t ∈ T . Then because of the monotonicity of A(t, ·), we have
lim inf A t, xn (t) , xn (t) − x(t) ≥ lim A t, x(t) , xn (t) − x(t) = 0 n→∞
n→0
for all t ∈ D ∩ (T \ N0 ), a contradiction to the definition of D. Therefore |D| = 0 and we have 0 ≤ lim inf βn (t) a.e. on T. n→∞
So from Fatou’s lemma, we obtain
b lim inf βn (t)dt ≤ lim inf 0≤ 0
n→∞
n→∞
b
βn (t)dt = lim (A(xn ), xn − x) = 0,
0
(see (10.128)),
b βn (t)dt −→ 0. ⇒ 0
From (10.131), we see that there exists {γn }n≥1 uniformly integrable sequence such that
γ_n(t) ≤ β_n(t) a.e. on T, for all n ≥ 1,
⇒ 0 ≤ β_n^−(t) ≤ γ_n^−(t) a.e. on T, for all n ≥ 1,
⇒ ∫_0^b β_n^−(t) dt → 0 as n → ∞.
Therefore, we have
∫_0^b |β_n(t)| dt = ∫_0^b β_n(t) dt + 2 ∫_0^b β_n^−(t) dt → 0,
⇒ β_n → 0 in L^1(T),
⇒ ∫_0^b |⟨A(t, x_n(t)), x_n(t) − x(t)⟩| dt → 0.   (10.132)
Finally let ξ_n(t) = (A(x), x_n − x)_t. Then ξ_n ∈ C(T) and let t_n ∈ T be such that
ξ_n(t_n) = max_T ξ_n = ∫_0^{t_n} ⟨A(s, x(s)), x_n(s) − x(s)⟩ ds = ∫_0^b χ_{[0,t_n]} ⟨A(s, x(s)), x_n(s) − x(s)⟩ ds = (χ_{[0,t_n]} A(x), x_n − x) → 0,
⇒ sup_{t ∈ T} (A(x), x_n − x)_t → 0.   (10.133)
Returning to (10.129) and using (10.130), (10.132), and (10.133), we obtain xn −→ x
in C(T, H)
as n → ∞
and
x ∈ η(K)
⇒ η(K) is compact in C(T, H). This proves the claim. Note that S(x0 ) ⊆ η(K). Also from the previous part of the proof, we have that S(x0 ) ⊆ Wp (0, b) is weakly closed. So S(x0 ) ⊆ Wp (0, b) is weakly compact in Wp (0, b) and compact in C(T, H).
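The Galerkin scheme used in this proof is easy to realize numerically. For the model problem u_t − u_zz = f(u) on (0, π) with Dirichlet data and the orthonormal basis v_k(z) = √(2/π) sin(kz) of L^2(0, π), the finite-dimensional system (10.112) becomes the ODE system c_k'(t) = −k^2 c_k(t) + (f(u_n(t)), v_k) for the coefficients of u_n(t) = Σ c_k(t) v_k. The following Python sketch (the nonlinearity f(u) = u − u^3, the truncation level, and the crude quadrature are illustrative assumptions) integrates this system.

import numpy as np
from scipy.integrate import solve_ivp

# Galerkin approximation of u_t - u_zz = f(u) on (0, π) with Dirichlet data,
# using the basis v_k(z) = sqrt(2/π) sin(kz); n is the truncation level.
n, M = 8, 400
z = np.linspace(0, np.pi, M)
V = np.sqrt(2 / np.pi) * np.sin(np.outer(np.arange(1, n + 1), z))   # basis values v_k(z_i)
lam = np.arange(1, n + 1) ** 2                                       # eigenvalues of -d^2/dz^2
f = lambda u: u - u ** 3

def rhs(t, c):
    u = V.T @ c                                   # reconstruct u_n(t, z)
    proj = V @ f(u) * (np.pi / (M - 1))           # crude quadrature for (f(u_n), v_k)
    return -lam * c + proj                        # c_k' = -λ_k c_k + (f(u_n), v_k)

c0 = V @ (2.0 * np.sin(z)) * (np.pi / (M - 1))    # coefficients of the initial datum
sol = solve_ivp(rhs, (0.0, 2.0), c0, max_step=1e-2)
print("Galerkin coefficients at t = 2:", sol.y[:, -1])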
Let η : Lp (T, H) × H −→ C(T, H) be the map that for each pair (h, x0 ) ∈ p L (T, H)×H assigns the unique solution of
* ) −x (t) + A t, x(t) = h(t) a.e. on T, . (10.134) x(0) = x0
From the proof of Theorem 10.3.12, we have as a byproduct the following continuous dependence result. PROPOSITION 10.3.13 If hypotheses H(A) hold and X is embedded compactly into H, then η : Lp (T, H)w ×H −→ C(T, H) is sequentially continuous. Recall that Sn ⊆ Wp (0, b) is the solution set of the Galerkin approximation (10.109). Another byproduct of the proof of Theorem 10.3.12, is the following result.
PROPOSITION 10.3.14 If hypotheses H(A), H(f ) hold, x0 ∈ H, and X is embedded compactly in H, then lim sup Sn ⊆ S(x0 ) and h∗ Sn , S(x0 ) −→ 0 in C(T, H). n→∞
In particular, if S(x0 ) is a singleton (this is the case if f (t, ·) is Lipschitz continuous) and xn ∈ Wp (0, b) is the unique solution of the Galerkin approximation (10.109), then xn −→ x in C(T, H) with x ∈ Wp (0, b) the unique solution of (10.108). We can generalize hypotheses H(A) and still have an existence theorem for problem (10.74). In concrete parabolic problems, the new hypotheses on A(t, x) permit the differential operator to have lower-order terms of nonmonotone nature. In this extended existence theorem, we need the following notion. DEFINITION 10.3.15 Let X be a reflexive Banach space, L : D(L) ⊆ X −→ X ∗ ∗ is a linear maximal monotone operator, and A : X −→ 2X . We say that A is L-pseudomonotone if the following conditions hold. (a) For every x ∈ X, A(x) ⊆ X ∗ is nonempty, weakly compact, and convex. (b) A is usc from every finite-dimensional subspace of X into X ∗ furnished with the weak topology. w w (c) If {xn }n≥1 ⊆ D(L), xn −→ x ∈ D(L) in X, Lxn −→ Lx in X ∗ , x∗n ∈ A(xn ) for w all n ≥ 1, x∗n −→ x∗ in X ∗ and lim sup x∗n , xn − x ≤ 0, then x∗ ∈ A(x) and n→∞
x∗n , xn −→ x∗ , x.
The next result is the analogue of Theorem 3.2.60 for L-pseudomonotone operators. Its proof can be found in Papageorgiou–Papalini–Renzacci [482]. THEOREM 10.3.16 If X is a reflexive Banach space that is strictly convex, L : ∗ D(L) ⊆ X −→ X ∗ is a linear maximal monotone operator, and A : X −→ 2X is bounded, L-pseudomonotone and coercive, then L + A is surjective; that is, R(L + A) = X ∗ . For the extended version of Theorem 10.3.12, we employ the following conditions on the map A(t, x). Here X is a separable reflexive Banach space and X ∗ its topological dual. H(A) : A : T ×X −→ X ∗ is a map such that (i) For all x ∈ X, t −→ A(t, x) is measurable. (ii) For almost all t ∈ T , x −→ A(t, x) is pseudomonotone. (iii) For almost all t ∈ T and all x ∈ X, we have A(t, x)∗ ≤ α(t) + cxp−1 , with 2 ≤ p < ∞, α ∈ Lp (T )+ , p1 + p1 = 1 and c > 0. (iv) For almost all t ∈ T and all x ∈ X, we have A(t, x), x ≥ c1 xp − ϑ(t), with c1 > 0, ϑ ∈ L1 (T )+ .
REMARK 10.3.17 Note that the pseudomonotonicity of A(t, ·) for almost all t ∈ T (see H(A)(ii)) and the p-growth condition (see H(A)(iii)), imply that for almost all t ∈ T, A(t, ·) is demicontinuous. Using an argument similar to that in the last part of the proof of Theorem 10.3.12 with the functions {βn }n≥1 and the set D, we can show the following “lifting” theorem (see also Hu–Papageorgiou [316, p. 41]).
THEOREM 10.3.18 If hypotheses H(A) hold and A : Lp (T, X) −→ Lp (T, X ∗ ) is defined by
A(x)(·) = A ·, x(·) w
(the Nemitsky operator corresponding to A), then A is demicontinuous and if xn −→ x in Lp (T, X) and
lim sup (A(xn ), xn − x) ≤ 0, n→∞
we have A(xn ) −→ A(x) in Lp (T, X ∗ ) and (A(xn ), xn − x) −→ (A(x), x) as n → ∞. w
H(f ) : f : T ×H −→ H is a function such that (i) For all x ∈ H, t −→ f (t, x) is measurable. (ii) For almost all t ∈ T , x −→ f (t, x) is sequentially continuous from H into Hw . (iii) For almost all t ∈ T and all x ∈ H, we have |f (t, x)| ≤ α2 (t) + c2 |x|2/q
with α2 ∈ Lp (T )+ , c2 > 0 if p > 2 and α2 ∈ Lp (T )+ , c2 > 0, c2 β 2 < c1 /2, where β > 0 is such that | · | ≤ β · when p = 2. Using Theorems 10.3.16 and 10.3.18, we can have the following extension of Theorem 10.3.12. For the proof see Hu–Papageorgiou [316, p. 42]. THEOREM 10.3.19 If in the evolution triple (X, H, X ∗ ) the embedding of X into H is also compact, hypotheses H(A) and H(f ) hold, and x0 ∈ H, then the solution set S(x0 ) of problem (10.74) is nonempty, weakly compact in Wp (0, b), and compact in C(T, H). Let us see an application to a nonlinear parabolic problem. So let T = [0, b] and let Z ⊆ RN be a bounded domain with a C 2 -boundary ∂Z. We consider the following nonlinear parabolic initial boundary value problem.
⎧ ∂x ⎫ p−2 Dx(t, x) = f0 t, z, x(t, z) in T × Z, ⎬ ⎨ ∂t − div α(t, z)Dx(t, z) x(0, z) = x0 (z) a.e. on Z, . ⎩ ⎭ x T ×∂Z = 0, 2 ≤ p < ∞ (10.135) We impose the following conditions on the data of (10.135):
H(α): α : T ×Z −→ R is a measurable function such that 0 < γ1 ≤ α(t, z) ≤ γ2
for almost all (t, z) ∈ T × Z.
H(f0 ): f0 : T ×Z ×R −→ R is a function such that (i) For all x ∈ R, (t, z) −→ f0 (t, z, x) is measurable. (ii) For almost all (t, z) ∈ T × Z, x −→ f0 (t, z, x) is continuous.
(iii) |f0 (t, z, x)| ≤ η1 (t, z) + η2 |x|2/p for a.a.
Lp T, L2 (Z) and η2 ∈ Lp T, L∞ (Z) .
(t, z) ∈ T × Z, with η1 ∈
2 PROPOSITION 10.3.20 If hypotheses H(α), H(f 0 ) hold2 and x0 ∈ L (Z), then
1,p 2 problem (10.135) has a solution x ∈ L T, W0 (Z) ∩C T, L (Z) such that ∂x/∂t ∈
L2 T, W −1,p (Z) and the solution set of the problem is compact in C T, L2 (Z) .
PROOF: We consider the evolution triple (X, H, X ∗ ), consisting of the following spaces. 1 1 X = W01,p (Z), H = L2 (Z), X ∗ = W −1,p (Z) + =1 . p p From the Sobolev embedding theorem, we know that X is embedded compactly into H (recall p ≥ 2). Consider the time-dependent Dirichlet form α : T × X × X −→ R defined by
α(t, z)Dxp−2 (Dx, Dy)RN dz. α(t, x, y) = Z
Using hypotheses H(α), we can easily check that |α(t, x, y)| ≤ c3 xp−1 y
for some c3 > 0.
So we can introduce a map A : T ×X −→ X ∗ defined by A(t, x), y = α(t, x, y). Then clearly we have. •
For all x ∈ X, t −→ A(t, x) is measurable (see Theorem 10.1.2).
•
For almost all t ∈ T , x −→ A(t, x) is demicontinuous, monotone.
•
For almost all t ∈ T and all x ∈ X, we have A(t, x)∗ ≤ cxp−1 .
•
For almost all t ∈ T and all x ∈ X, we have A(t, x), x ≥ c1 xp .
Also let f : T ×H −→ H be the Nemitsky operator corresponding to the function f0 (t, ·, ·); that is,
f (t, x)(·) = f0 t, ·, x(·) . It is straightforward to check that f satisfies hypotheses H(f ). We equivalently rewrite (10.135) as the following nonlinear evolution equation.
) * x (t) + A t, x(t) = f t, x(t) a.e. on T, . x(0) = x0 ∈ H Invoking Theorem 10.3.12, we obtain all the conclusions of the proposition.
10.4 Second-Order Nonlinear Evolution Equations

In this section we deal with second-order nonlinear evolution equations. We consider two classes of problems, both defined in the framework of evolution triples. In the first class, the evolution triple consists of Hilbert spaces, the evolution equation involves multivalued terms, and the analysis is conducted using the material from the first half of Section 10.3. As a particular case of this class, we obtain hyperbolic variational inequalities. The second class of problems is formulated on a general evolution triple, involves maximal monotone coercive operators, and the method of proof is based on the results of the second half of Section 10.3. So let T = [0, b] and let (X, H, X^*) be an evolution triple consisting of Hilbert spaces. We are interested in the following second-order nonlinear evolution equation.
* ) x (t) + Ax(t) + B x (t) * g(t) a.e. on T, . (10.136) x(0) = x0 , x (0) = y0 The hypotheses on the data of (10.136) are the following. H(A): A ∈ L(X, X ∗ ) and Ax, x+c1 |x|2 ≥ c2 x2 for all x ∈ X with c1 ∈ R, c2 > 0. ∗
H(B): B : D(B) ⊆ X −→ 2X is maximal monotone.
THEOREM 10.4.1 If hypotheses H(A), H(B) hold, g ∈ W 1,1 (0, b), H , x0 ∈ X, y0 ∈ D(B), and
Ax0 + B(y0 ) ∩ H = ∅, then problem (10.136) admits a unique strong solution x(·) such that
x ∈ W 1,∞ (0, b), X ∩ W 2,∞ (0, b), H d+ x(t) dt
exists in X,
d+ x (t) dt
exists in H, both for all t ∈ [0, b) and
d+ x (t) + Ax(t) + B x (t) * g(t) dt
for all t ∈ [0, b).
PROOF: Let H0 = X ×H. This is a Hilbert space with inner product (u1 , u2 )0 = Ax1 , x2 + c1 (x1 , x2 ) + (y1 , y2 ) for all u1 = [x1 , y1 ], u2 = [x2 , y2 ] ∈ X ×H.
Let E : D(E) ⊆ H0 −→ 2H0 be the nonlinear operator defined by
E(u) = E([x, y]) = − y, Ax + B(y) ∩ H + ξ[x, y]
for all [x, y] ∈ D(E) = [x, y] ∈ X × H : Ax + B(y) ∩ H = ∅ and with ξ = sup
c1 (x, y) : x ∈ X, y ∈ H, x + |y| = 0 . Ax, y + c1 |x|2 + |y|2
(10.137)
Then we can equivalently rewrite problem (10.136), as the following first-order evolution equation in the Hilbert space H0 = X ×H:
) * u (t) ∈ −E u(t) + ξu(t) + g(t) on T, , (10.138) u(0) = u0 where u(t) = [x(t), x (t)], g(t) = 0, g(t) and u0 = [x0 , y0 ]. It is easy to check that E is monotone in H0 . We claim that in fact E is maximal monotone. It suffices to show that R(I + E) = H0 . Let (v, h) ∈ X × H be arbitrary. Then the conclusion u + E(u) * (v, h) is actually equivalent to the following system ) * x + (−y + ξx) = v, , y + Ax + B(y) + ξy * h ) * (1 + ξ)x − y = v, ⇒ . (1 + ξ)y + Ax + B(y) * h
(10.139)
From the first equation in (10.139) we have x=
1 (v + y). 1+ξ
Using this in the second inclusion of system (10.139), we obtain (1 + ξ)y +
1 1 Ay + B(y) * − Av. 1+ξ 1+ξ
(10.140)
Note that y −→ (1+ξ)y+1/(1+ξ)Ay is continuous, monotone, and coercive from X into X ∗ . Then y(1 + ξ)y + 1/(1 + ξ)Ay + B(y) is maximal monotone, and coercive (see Theorem 3.2.33). Therefore by virtue of Corollary 3.2.28 it is also surjective. So the inclusion (10.140) has a unique solution y ∈ D(E). Hence [v, h] ∈ R(I + E). Therefore, E is maximal monotone and we can apply Theorem 10.3.6 to finish the proof of the theorem. If B = ∂ϕ with ϕ ∈ Γ0 (X), then we have an existence result for nonlinear hyperbolic variational inequalities (see Theorem 10.3.7). COROLLARY
10.4.2 If hypothesis H(A) holds, B = ∂ϕ with ϕ ∈ Γ0 (X), g ∈ W 1,1 (0, b), H , x0 ∈ X, y0 ∈ D(∂ϕ), and
Ax0 + B(y0 ) ∩ H = ∅,
then there exists a unique function x ∈ C(T, X) such that x ∈ C(T, H) ∩ L∞ (T, X), x (t) ∈ dom ϕ
for all t ∈ (0, b)
∞
x ∈ L (T, H)
−x (t) − Ax(t) + g(t), y − x (t) ≤ ϕ(y) − ϕ x(t) ,
for a.a. t ∈ T, all y ∈ X x(0) = x0 , x (0) = y0 . As an application consider the following hyperbolic initial-boundary value problem. Let T = [0, b] and Z ⊆ RN is a bounded domain with a C 2 -boundary ∂Z. ⎧ 2 ⎫ ∂ x ∂x ⎪ ⎨ ∂t2 − x(t, z) + β ∂t * g(t, z) a.e. on T × Z, ⎪ ⎬ ∂x . (10.141) (z), (z, 0) = y (z) a.e. on Z, x(z, 0) = x 0 0 ∂t ⎪ ⎪ ⎩ ⎭ xT ×∂Z = 0 function (hence PROPOSITION 10.4.3 If β : R −→ 2R is a maximal monotone β = ∂j with j ∈ Γ0 (R)) with 0 ∈ D(β), g ∈ W 1,2 (0, b), L2 (Z) , x0 ∈ H01 (Z) ∩ H 2 (Z), and y0 ∈ H01 (Z) with y0 (z) ∈ D(β) a.e. on Z, then problem (10.141) has a unique solution x(t, z) that satisfies
x ∈ C T, H01 (Z) ∩ L∞ T, H 2 (Z)
∂x ∈ C T, L2 (Z) ∩ L∞ T, H01 (Z) ∂t √ ∂2x
∂2x t 2 ∈ L2 (T × Z), t 2 ∈ L∞ T, L2 (Z) . ∂t ∂t
Moreover, if j y0 (·) ∈ L1 (Z), then ∂ 2 x ∂t2 ∈ L2 (T × Z).
PROOF: Let X = H01 (Z), H = L2 (Z) and X ∗ = H −1 (Z). Let A ∈ L H01 (Z), H −1 (Z) be defined by
A(x), y = (Dx, Dy)RN dz for all x, y ∈ H01 (Z). Z
Hence Ax = −x (by integration by parts). Also let ϕ ∈ Γ0 H01 (Z) be defined by
ϕ(x) = j x(z) dz for all x ∈ H01 (Z) (recall β = ∂j with j ∈ Γ0 (R)). Z
Note that {x ∈ L2 (Z) : A ∈ L2 (Z)} = H01 (Z) ∩ H 2 (Z). Therefore, we have
Ax0 + ∂ϕ(y0 ) ∩ H = ∅. Thus we can apply Corollary 10.4.2 and establish the conclusions of the proposition. Next we consider a second-order evolution equation defined in the framework of a general evolution triple. So let T = [0, b] and (X, H, X ∗ ) a general evolution triple of spaces (see Definition 10.1.21). We examine the following second-order Cauchy problem:
x''(t) + A(t, x'(t)) + Bx(t) = f(t, x(t), x'(t)) a.e. on T,   x(0) = x_0,  x'(0) = y_0.   (10.142)
In problem (10.142), A : T × X −→ X ∗ is a nonlinear in general operator, B ∈ L(X, X ∗ ), f : T ×H ×H −→ H, and (x0 , y0 ) ∈ X ×H. DEFINITION 10.4.4 By a solution of problem (10.142), we mean a function x ∈ C(T, X) such that x ∈ Wp (0, b), (1 < p < ∞), it satisfies the equation in problem (10.142) for almost all t ∈ T and x(0) = x0 , x (0) = y0 . We denote the set of solutions of problem (10.142) by S(x0 , y0 ). REMARK 10.4.5 Because Wp (0, b) ⊆ C(T, H) (see Theorem 10.1.25), it follows that S(x0 , y0 ) ⊆ C 1 (T, H). We impose the following hypotheses on the maps A(t, x), B, and f (t, x, y). H(A) : A : T ×X −→ X ∗ is a map such that (i) For all x ∈ X, t −→ A(t, x) is measurable. (ii) For almost all t ∈ T , x −→ A(t, x) is demicontinuous monotone. (iii) For almost all t ∈ T and all x ∈ X, we have A(t, x)∗ ≤ α(t) + cxp−1 ,
with 2 ≤ p < ∞, α ∈ Lp (T )+ , (1/p) + (1/p ) = 1 and c > 0. (iv) For almost all t ∈ T and all x ∈ X, we have A(t, x), x ≥ c1 xp − α1 (t), with c1 > 0 and α1 ∈ L1 (T )+ . H(B) : B ∈ L(X, X ∗ ), Bx, x ≥ 0 for all x ∈ X and Bx, u = x, Bu for all x, u ∈ X. H(f ): f : T ×H ×H −→ H is a function such that (i) For all x, y ∈ H, t −→ f (t, x, y) is measurable. (ii) For almost all t ∈ T , (x, y) −→ f (t, x, y) is sequentially continuous from H × H into Hw (here Hw denotes the Hilbert space H furnished with the weak topology). (iii) For almost all t ∈ T and all x, y ∈ H, we have |f (t, x, y)| ≤ α2 (t) + c2 |x|2/p + |y|2/p ,
with α2 ∈ Lp (T )+ , c2 > 0.
THEOREM 10.4.6 If hypotheses H(A) , H(B) , H(f ) hold and (x 0 , y0 ) ∈ X ×H, then the solution set S(x0 , y0 ) is nonempty, weakly compact in W 1,p (0, b), X , and compact in C 1 (T, H). PROOF: First we derive some a priori bounds for the elements of S(x0 , y0 ). So let x ∈ S(x0 , y0 ). We know that x ∈ Wp (0, b) ⊆ C(T, H). We have
x (t), x (t) + A t, x (t) , x (t) + B x(t) , x (t)
= f t, x(t), x (t) , x (t) a.e. on T. (10.143) Using Corollary 10.1.26, we have
1 d x (t), x (t) = |x (t)|2 2 dt
a.e. on T.
Also because of hypotheses H(A) (iv) and H(B) , we have A t, x (t) , x (t) ≥ c1 x (t)p − α1 (t)
1 d
B x(t) , x(t) and B x(t) , x (t) = 2 dt
(10.144)
a.e. on T
(10.145)
a.e. on T.
(10.146)
So, if we return to (10.143) and use (10.144) through (10.146), we obtain 1 d
1 d |x (t)|2 + c1 x (t)p + B x(t) , x(t) 2 dt 2 dt
≤ f t, x(t), x (t) , x (t) + α1 (t) a.e. on T. Integrating, we have
t 1
1 x (s)p ds + |x (t)|2 + c1 B x(t) , x(t) 2 2 0
t
≤ f s, x(s), x (s) , x (s) ds + c3 for some c3 > 0, 0
t 1 x (s)p ds ⇒ |x (t)|2 + c1 2 0
t
α2 (s) + c2 |x(s)|2/p + |x (s)|2/p |x (s)|ds + c3 ≤
(10.147)
0
(see hypotheses H(B) and H(f )(iii)). Because x ∈ Wp (0, b), we have
t x (s)ds in X for all t ∈ T, x(t) = x0 + 0
t |x (s)|2 ds. ⇒ |x(t)|2 ≤ 2|x0 |2 + 2
(10.148)
0
So, if on the integral in the right-hand side of (10.147) we use Young’s inequality with ε > 0 and (10.148), after some calculations, we obtain
t 1 x (s)p ds |x (t)|2 + c1 2 0
t
t s
εp β p t |x (τ )|2 dτ ds + x (s)p ds ≤ c4 (ε) + c5 (ε) |x (s)|2 ds + c6 (ε) p 0 0 0 0 (10.149)
with c4 (ε), c5 (ε), c6 (ε) > 0 and β > 0 is such that | · | ≤ β · . We choose ε > 0 such that (εp β p )/p < c1 . We have
t
t s 1 |x (τ )|2 dτ ds. (10.150) |x (t)|2 ≤ c4 (ε) + c5 (ε) |x (s)|2 ds + c6 (ε) 2 0 0 0 From (10.150) and using the generalized Gronwall inequality (see Denkowski– Mig´ orski–Papageorgiou [195, p. 128]), we obtain M1>0 such that |x (t)| ≤ M1
for all t ∈ T
and all x ∈ S(x0 , y0 ).
(10.151)
From (10.148) and (10.151), it follows that we can find M2 > 0 such that |x(t)| ≤ M2
for all t ∈ T
and all x ∈ S(x0 , y0 ).
(10.152)
So, if in (10.149), we use (10.151) and (10.152) and recalling the choice of ε > 0, we see that there exists M3 > 0 such that x Lp (T,X) ≤ M3
Recall that x(t) = x0 +
t
for all x ∈ S(x0 , y0 ).
x (s)ds
(10.153)
in X, for all t ∈ T.
0
Hence there exists M4 > 0 such that x(t) ≤ M4
for all t ∈ T and all x ∈ S(x0 , y0 ).
(10.154)
Using (10.152) through (10.154), directly from the equation in (10.142), we can find M5 > 0 such that x Lp (T,X ∗ ) ≤ M5
for all x ∈ S(x0 , y0 ).
(10.155)
As in the proof of Theorem 10.3.12, because of (10.151) and (10.152), we can replace f (t, x, y) by f1 (t, x, y) defined by
f1 (t, x, y) = f t, rM1 (x), rM2 (y) (rMk is the Mk -radial retraction in H). Note that for almost all t ∈ T and all x, y ∈ H we have |f1 (t, x, y)| ≤ ψ(t) with ψ ∈ Lp (T )+ . Let K : Lp (T, X) −→ C(T, X) be the integral operator defined by
t (Ky)(t) = x0 + y(s)ds for all t ∈ T. 0
With the use of K, we can equivalently recast (10.142) as the following first order evolution equation.
) * y (t) + A t, y(t) + B (Ky)(t) = f1 t, (Ky)(t), y(t) a.e. on T, . (10.156) y(0) = y0 If y solves (10.156), then x(t) = x0 + use the material of Section 10.3.
t 0
y(s)ds solves (10.142). To solve (10.156) we
First assume that y0 ∈ X and set A1 (t, x) = A(t, x + y0 ). We can easily verify that A1 still satisfies hypotheses such as those in H(A) .
Also let B1 ∈ L Lp (T, X), Lp (T, X ∗ ) be defined by
B1 (y)(·) = B K(y + y0 )(·)
and let F1 : Lp (T, X) −→ Lp (T, X ∗ ) be defined by
F1 (y)(·) = f1 ·, K(y + y0 )(·), y(·) + y0 . We consider the following Cauchy problem )
y (t) + A1 t, y(t) + B1 (y)(t) = F1 (y)(t) y(0) = y0
a.e. on T = [0, b],
* .
(10.157)
Note that y ∈ Wp (0, b) solves problem (10.156) if and only if y(·) − y0 solves problem (10.157). Let L : D(L) ⊆ Lp (T, X) −→ Lp (T, X ∗ ) be the linear operator defined by Ly = y for all y ∈ D(L) = y ∈ Wp (0, b) : y(0) = 0 . It is easy to see that L is densely defined, maximal monotone. Also let A1 : Lp (T, X) −→ Lp (T, X ∗ ) be defined by
A1 (y)(·) = A1 ·, y(·) . Clearly A1 is demicontinuous, monotone, hence it is maximal monotone. We set V (y) = A1 (y) + B1 (y) − F1 (y), V : Lp (T, X) −→ Lp (T, X ∗ ). Claim 1: V is L-pseudomonotone (see Definition 10.3.15). w
Clearly V is bounded and demicontinuous. Suppose yn −→ y in Wp (0, b) and assume that
lim sup (V (yn ), yn − y) ≤ 0. (10.158) n→∞
From the definition of V , we have
lim sup (A1 (yn ), yn − y) = lim sup (V (yn ) − B1 (yn ), yn − y) . n→∞
n→∞
(10.159)
Also, if by (·, ·)p p we denote the duality brackets for the pair Lp (T, H), Lp (T, H) , then we have
(10.160) (F1 (yn ), yn − y) = F1 (yn ), yn − y p p −→ 0 as n → ∞. Moreover, because of H(B) , we have
0 = lim (B1 (y), yn − y) ≤ lim inf (B1 (yn ), yn − y) . n→∞
n→∞
If in (10.159), we use (10.158), (10.160), and (10.161), we obtain
lim sup (A1 (yn ), yn − y) ≤ 0. n→∞
(10.161)
(10.162)
By virtue of the maximal monotonicity of A1 , from (10.162), we have
w (A1 (yn ), yn ) −→ (A1 (y), y) and A1 (yn ) −→ A1 (y) in Lp (T, X ∗ ), w
⇒ V (yn ) −→ V (y)
in Lp (T, X ∗ ).
(10.163)
Also because of (10.158), we have
lim sup (B1 (yn ), yn − y) ≤ 0. n→∞
As before from (10.164) and because B1 is maximal monotone, we have
(B1 (yn ), yn ) −→ (B1 (y), y) w
⇒ (V (yn ), yn ) −→ (V (y), y) .
(10.164)
(10.165)
From (10.163) and (10.165), we conclude that V is L-pseudomonotone. Claim 2: V is coercive. Because of hypothesis H(A) (iv), we have
(A1 (y), y) ≥ c7 ypLp (T,X) − c8
for some c7 , c8 > 0.
(10.166)
Also hypothesis H(B) implies that
(B1 (y), y) ≥ 0,
(10.167)
whereas for F1 , we have
| (F1 (y), y) | = | F1 (y), y p p | ≥ −ψp yLp (T,H) .
(10.168)
Combining (10.166) through (10.168) and because 2 ≤ p < ∞, we conclude that V is coercive. Because of Claims 1 and 2 and using Theorem 10.3.16, we can find y ∈ D(L) ⊆ Wp (0, b) such that L(y) + V (y) = 0 when y0 ∈ X. Then y = y + y0 solves (10.157). Now we remove the restriction that y0 ∈ X. So suppose y0 ∈ H. We can find {y0n }n≥1 ⊆ X such that y0n −→ y0 in H as n → ∞. Let yn ∈ Wp (0, b) be the solution of (10.157) for y0n obtained above. An a priori estimation as in the proof of Theorem 10.3.12 (note that {y0n }n≥1 ⊆ H is bounded), implies that {yn }n≥1 ⊆ Wp (0, b) w is bounded. Therefore we may assume that yn −→ y in Wp (0, b). Then as in the proof of Theorem 10.3.12 (with the Galerkin solutions), we check that y is a solution of (10.157) with y0 ∈ H. This proves the solvability of (10.157), hence the solvability of (10.142) too. Next we prove the compactness properties of the solution set S(x0 , y0 ). The proof follows the steps of the corresponding part of the proof of Theorem 10.3.12. So let Σ = h ∈ Lp (T, H) : |h(t)| ≤ ψ(t) a.e. on T .
Endowed with the relative weak Lp (T, H)-topology, Σ becomes a compact metrizable space. Let Γ : Σ −→ 2C(T,H) be the multifunction which to each h ∈ Lp (T, H) assigns the set of solutions of
) y (t) + A t, y(t) + B (Ky)(t) = h(t) y(0) = y0
a.e. on T,
* .
We saw that Γ(h) = ∅ and in fact Γ(h) is a singleton. Claim 3: Γ(Σ) is weakly in Wp (0, b) and compact in C(T, H). To show the weak compactness in Wp (0, b), it suffices to show that Γ(Σ) is weakly sew quentially closed. So suppose {yn }n≥1 ⊆ Γ(Σ) and assume that yn −→ y in Wp (0, b). Then as in proof of Theorem 10.3.12, we can show that
(xn , xn − x) −→ 0. Also we have
and
0 = lim (B(Ky), yn − y) ≤ lim inf (B(Kyn ), yn − y) n→∞ n→∞
lim (hn , yn − y) = lim (hn , yn − y)p p = 0.
n→∞
n→∞
Therefore finally we have
lim sup (A(yn ), yn − y) ≤ 0. n→∞
Because of the maximal monotonicity of A, we infer that w
A(yn ) −→ A(y)
in Lp (T, X ∗ ).
Because yn + A(yn ) + B(Kyn ) = hn for all n ≥ 1, passing to the limit as n → ∞, we have y + A(y) + B(Ky) = h, y(0) = y0 ,
w
(where hn −→ H in Lp (T, H))
⇒ y ∈ Γ(Σ) (i.e., Γ(Σ) is weakly compact in Wp (0, b)). Next we show that Γ(Σ) is compact in C(T, H). So suppose {yn }n≥1 ⊆ Γ(Σ). w For every n ≥ 1, yn = Γ(hn ), hn ∈ Σ and we may assume that hn −→ h in Lp (T, H). For every t ∈ T and if y = Γ(h), we have
1 |yn (t) − y(t)|2 = (hn − h, yn − y) t − (A(yn ) − A(y), yn − y) t 2
− (B(Kyn ) − B(Ky), yn − y) t . (10.169) Note that
(B(Kyn ) − B(Ky), yn − y) t ≥ 0 (see hypothesis H(B) ).
(10.170)
Using (10.170) in (10.169), we obtain 1 |yn (t) − y(t)|2 ≤ 2
b
0
| hn (t) − h(t), yn (t) − y(t) |
+ 0
b
| A t, yn (t) , yn (t) − y(t) |dt + (A(y), yn − y) t .
w
Because yn −→ y in Wp (0, b), we have yn −→ y in Lp (T, H) (see Theorem 10.1.29). Hence
b
| hn (t) − h(t), yn (t) − y(t) | −→ 0. 0
Also from the proof of Theorem 10.3.12 we have
b
| A t, yn (t) , yn (t) − y(t) |dt −→ 0 0
and
sup (A(y), yn − y) t −→ 0. t∈T
So finally sup |yn (t) − y(t)|2 −→ 0 (i.e., yn −→ y in C(T, H)), t∈T
⇒ Γ(Σ) is compact in C(T, H). This proves Claim 3. Let S (x0 , y0 ) = x : x ∈ S(x0 , y0 ) . Evidently S (x0 , y0 ) ⊆ Γ(Σ) and it is weakly closed
in Wp(0, b) and strongly closed in C(T, H). Because S(x_0, y_0) = K(S'(x_0, y_0)), we conclude that S(x_0, y_0) is weakly compact in Wp(0, b) and strongly compact in C^1(T, H).
REMARK 10.4.7 The above theorem remains valid if instead of H(A)(ii), we assume the following more general condition: (ii) for almost all t ∈ T, x → A(t, x) is pseudomonotone. Moreover, if for almost all t ∈ T, f(t, ·, ·) is Lipschitz continuous, then the solution set S(x_0, y_0) is a singleton.
We conclude this section with an example of a nonlinear hyperbolic initial boundary value problem. So let T = [0, b] and Z ⊆ R^N be a bounded domain with a C^2-boundary ∂Z. We consider the following hyperbolic problem.
∂^2x/∂t^2 − ∆x(t, z) − div(‖Dx_t‖^{p−2} Dx_t) = f_0(t, z, x(t, z), x_t(t, z)) in T × Z,
x(z, 0) = x_0(z),  x_t(0, z) = y_0(z) a.e. on Z,   x|_{T×∂Z} = 0.   (10.171)
Here x_t = ∂x/∂t. The hypotheses on the function f_0(t, z, x, y) are the following.
H(f_0): f_0 : T × Z × R × R → R is a function such that
(i) For all x, y ∈ R, (t, z) → f_0(t, z, x, y) is measurable.
(ii) For almost all (t, z) ∈ T × Z, (x, y) → f_0(t, z, x, y) is continuous.
with α2 ∈ Lp T, L2 (Z) , c2 > 0, 2 ≤ p < ∞,
1 p
+
1 p
= 1.
DEFINITION 10.4.8 By a solution of problem (10.171), we mean a function
x ∈ C(T, W_0^{1,p}(Z)) such that ∂x/∂t ∈ L^p(T, W_0^{1,p}(Z)), ∂^2x/∂t^2 ∈ L^{p'}(T, W^{−1,p'}(Z)), and
(d^2/dt^2) ∫_Z x(t, z) u(z) dz + ∫_Z ‖Dx_t‖^{p−2} (Dx_t, Du)_{R^N} dz + ∫_Z (Dx, Du)_{R^N} dz = ∫_Z f_0(t, z, x, x_t) u(z) dz
for all u ∈ W_0^{1,p}(Z) (weak solution).
PROPOSITION 10.4.9 If hypotheses H(f0 ) hold, x0 ∈ W01,p (Z), and y0 ∈ L2 (Z), then
the set of (weak) solutions of problem (10.171) is nonempty and compact in C 1 T, L2 (Z) .
PROOF: The evolution triple is X = W01,p (Z),H = L2 (Z), X ∗ = W −1,p (Z). Also A : X −→ X ∗ is defined by
Dxp−2 (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). A(x), y = Z
Clearly it satisfies hypotheses H(A). Also B ∈ L(X, X ∗ ) is given by
Bx, y = (Dx, Dy)RN dz for all x, y ∈ W01,p (Z). Z
It satisfies hypothesis H(B). Finally we set
f (t, x, y)(·) = f0 t, ·, x(·), y(·) . Then f : T ×H ×H −→ H satisfies hypotheses H(f ). Using A, B, and f , we can equivalently rewrite (10.171) in the form of (10.136). We apply Theorem 10.4.6 to conclude the proof.
10.5 Remarks

10.1: The origins of the Bochner integral can be traced to the works of Bochner [81] and Dunford [213]. Some people call it Dunford's second integral. A more detailed study of the Bochner integral can be found in Diestel–Uhl [199], Denkowski–Migórski–Papageorgiou [194], and Gasiński–Papageorgiou [259]. There is also a weak vector-valued integral known as the Pettis integral. A detailed investigation of it can be found in Talagrand [573]. The RNP (see Definition 10.1.11) is investigated in a systematic way in the books of Diestel–Uhl [199] and Bourgin [95]. Theorem 10.1.16 is due to Dinculeanu–Foias [201]. For X = H a Hilbert space, Theorem 10.1.18 was proved by Komura [360]. Evolution triples (see Definition 10.1.21) are also known as Gelfand triples, because it was Gelfand who first made systematic use of them (see Gelfand–Shilov [261]). Lemma 10.1.28 appears in the literature under the names Lions lemma or Ehrling's inequality (see Ehrling [220]). Theorem 10.1.29
is due to Aubin [35]. More on evolution triples and the related function spaces, can be found in the paper of Simon [555] and in the books of Gasi´ nski–Papageorgiou [259], Showalter [554], Wloka [608], and Zeidler [621, 622]. 10.2: Semilinear evolution equations using the semigroup method can be found in the books of Amann [19], Henry [291], Pazy [491], Showalter [553, 554], Tanabe [574], and Zheng [625]. The case of a time-dependent operator A, in which case we are dealing with the so-called evolution operators, can be found in Amann [19] and Tanabe [574]. For the applications to parabolic problems, we follow Zheng [625]. 10.3: Evolution equations associated with maximal monotone operators (Hilbert space case) or accretive operators (Banach space case), were the starting point for the introduction of nonlinear semigroups: see Komura [360] (Hilbert spaces) and Crandall–Liggett [166]) (Banach spaces) (see also Segal [546]). Subdifferential evolution inclusion, with time-invariant ϕ, was first investigated by Brezis [99, 100]. Brezis [99] proved Theorem 10.3.7 and also established the regularizing effect on the initial condition. A comprehensive treatment of such evolution equations can be found in the books of Barbu [58] and Brezis [102]. The case of time-dependent subdifferentials, is studied by Attouch–Damlamian [32], Hu–Papageorgiou [314, 315], Kenmochi [347, 348], Yamada [609], and Yotsutani [616]. For the periodic problem, we refer to Bader–Papageorgiou [51]. Further results can be found in Chapter 2 of the book of Hu–Papageorgiou [316]. The Galerkin method together with the monotonicity method was used extensively in the framework of evolution triples by Lions [387]. Extensions can be found in Papageorgiou–Papalini–Yannakakis [483]. We also refer to the books of Showalter [554] and Zeidler [622]. 10.4: Theorem 10.4.1 was originally by Lions–Strauss [386], however the proof given here is due to Brezis (see also Barbu [58, pp. 268–269]). Similar results can be found in Brezis [100]. Second-order evolution equations defined in the framework of a general evolution triple can be found in Zeidler [622, Chapter 33]. The work of Zeidler was generalized by Papageorgiou–Yannakakis [485, 486]. Our presentation here is based on these papers.
References
[1] Acerbi, E.–Fusco, N., Semicontinuity problems in the calculus of variations, Arch. Rat. Mech. Anal. 86 (1984), 125–145 [2] Agarwal, R.–Filippakis, M.–O’Regan D.–Papageorgiou N. S., Solutions for nonlinear Neumann problems via degree theory for multivalued perturbations of (S)+ –maps, Adv. Diff. Eqns. II (2006), 961–980 [3] Ahmed, N. U.–Teo, K. L. H., Optimal Control of Disturbed Parameter Systems, North–Holland, New York (1981) [4] Ahmed, N. U., Optimization and Identification of Systems Governed by Evolution Equations in Banach spaces, Longman, Essex, UK (1988) [5] Aizicovici, S.–Papageorgiou, N. S.–Staicu, V. Periodic solutions for second order differential inclusions with the scalar p-Laplacian, J. Math. Anal. Appl. 322 (2006), 913–929 [6] Aizicovici, S.–Papageorgiou, N. S.–Staicu, V. Degree theory for operators of monotone type and nonlinear elliptic equations with inequality constraints, Memoirs AMS–December 2008 [7] Alexeev, V.–Tichomirov, V.–Fomin, S., Commande Optimale, Mir, Moscow (1982) [8] Aliprantis, C.–Brown, D., Equilibria in markets with a Riesz space of commodities, J. Math. Economics 11 (1983), 189–207 [9] Aliprantis, C.–Brown, D.–Burkinshaw, O., Existence and Optimality of Competitive Equilibria, Springer-Verlag, New York (1989) [10] Aliprantis, C.–Border, K., Infinite Dimensional Analysis, Springer-Verlag, Berlin (1994) [11] Allegretto, W.–Huang, Y–X., Principal eigenvalues and Sturm comparison via Picone’s identity, Nonlin. Anal. 32 (1998), 819–830 [12] Allegretto, W.–Huang, Y–X., A Picone’s identity for the p-Laplacian and applications, J. Diff. Eqns. 156 (1999), 427–438 [13] Allen, B., Neighboring information and distributions of agents characteristics under uncertainty, J. Math. Economics 12 (1983), 63–101 [14] Allen, B., Convergence of σ-fields and application to mathematical economics, Eds. K. Hammer–D. Pallaschke, Springer, Berlin, 161–174 [15] Amann, H.–Weiss, S., On the uniqueness of the topological degree, Math. Zeits 130 (1973), 39–54
References [16] Amann, H., Fixed point equations and nonlinear eigenvalue problems in ordered Banach spaces, SIAM Rev. 18 (1976), 620–709 [17] Amann, H.–Zehnder, E., Nontrivial solutions for a class of nonresonance problems and applications to nonlinear differential equations, Ann. Scuola Norm. Sup. Pisa 7 (1980), 539–603 [18] Amann, H., A note on degree theory for gradient mappings, Proc. AMS 85 (1982), 591–595 [19] Amann, H., Linear and Quasilinear Parabolic Problems, Vol. I:Abstract Linear Theory, Birkh¨ auser, Basel (1995) [20] Ambrosetti, A.–Rabinowitz, P., Dual variational methods in critical point theory and applications, J. Funct. Anal. 14 (1973), 349–381 ∗ [21] Ambrosetti, A.–Struwe, M., A note on the problem −u = λu + |u|2 −2 u, Manuscr Math. 54 (1986), 373–379 [22] Ambrosetti, A.–Mancini, G., Sharp nonuniqueness results for some nonlinear problems, Nonlin. Anal. 73, (1979), 635–648 [23] Ambrosetti, A.–Lupo, D., On a class of nonlinear Dirichlet problems with multiple solutions, Nonlin. Anal. 8 (1984), 1145–1150 [24] Ambrosetti, A.–Garcia Azorero, J.–Peral Alonso, I., Multiplicity results for some nonlinear elliptic equations, J. Funct. Anal. 137 (1996), 219–242 [25] Anane, A., Simplicit´e et isolation de la premi` ere valeur propre du pLaplacian avec poids, CRAS Paris, t.305 (1987), 725–728 [26] Anane, A.–Tsouli, N., On the second eigenvalue of the p-Laplacian in Nonlinear Partial Differential Equations, Eds. A. Benikrane–J.–P. Gossez, Pitman Research Notes in Math, Vol. 343 (1996), Longman, Harlow, 1–9 [27] Andres, J.–Gorniewicz, L., Topological Fixed Point Principles for Boundary Value Problems, Kluwer, (2003) [28] Araujo, A.–Scheinkman, J., Smoothness, comparative dynamics and the turnpike property, Econometrica 45 (1979), 601–620 [29] Arkin, V.–Evstigneev, I., Stochastic Models of Control and Economic Dynamics, Academic Press, New York (1987) [30] Ash, R. B., Real analysis and probability, Probability and Mathematical Statistics, No. 11, Academic Press, New York–London, (1972) [31] Asplund, E., Fr´echet differentiability of convex functions, Acta Math. 121 (1968), 31–47 [32] Attouch, H.–Damlamian, A., Probl´emes d’evolution dans les hilberts et applications, J. Math. Pures Appl. 54 (1975) [33] Attouch, H., On the maximality of the sum of two maximal monotone operators, Nonlin. Anal. 5 (1981), 143–147 [34] Attouch, H., Variational Convergence for Functions and Operators, Pitman London (1984) [35] Aubin, J.–P., Un th´eor`eme de compacit´ e, CRAS Paris, t. 256 (1963), 5042– 5044 [36] Aubin, J.–P., Mathematical Methods of Control and Economic Dynamics, Academic Press, New York (1979) [37] Aubin, J.–P., Further properties of Lagrange multipliers in nonsmooth optimization, Appl. Math. Optim. 6 (1980), 79–90 [38] Aubin, J.–P.–Cellina, A., Differential Inclusions, Springer-Verlag, Berlin (1984) [39] Aubin, J.–P.–Ekeland, I., Applied Nonlinear Analysis, Wiley, New York (1984)
[40] Aubin, J.–P.–Frankowska, H., Set–Valued Analysis, Birkh¨ auser, Boston (1990) [41] Aumann, R.–Peleg, B., von Neumann–Morgenstein solutions to cooperative games without side payments, Bull. AMS, 66 (1960), 173–179 [42] Aumann, R., The core of a cooperative game without side payments, Trans. AMS 98 (1961), 539–552 [43] Aumann, R., Markets with a continuum of traders, Econometrica 32 (1964), 39–50 [44] Aumann, R., Integral of set–valued functions, J. Math. Anal. Appl. 12 (1965), 1–12 [45] Aumann, R., Existence of competitive equilibrium in markets with a continuum of traders, Econometrica, 34 (1966), 1–17 [46] Aumann, R., Measurable utility and the measurable choice theorem, Actes Colloq. Internat. du CNRS Aix–en–Provence, (1969), 15–26 [47] Aumann, R.–Shapley, L., Values of Non-Atomic Games, Princeton Univ. Press, Princeton, NJ (1974) [48] Averbukh, V.–Smolyanov, O., The theory of differentiation in linear topological spaces, Russian Math Surveys 22 (1967), 201–258 [49] Bader, R., A topological fixed–point index theory for evolution inclusions, Z. Anal. Anwend., 20 (2001), 3–15 [50] Bader, R.–Papageorgiou, N. S., Nonlinear boundary value problems for differential inclusions, Math. Nachr., 244 (2002), 5–25 [51] Bader, R.–Papageorgiou, N. S., On a problem of periodic evolution inclusions of the subdifferential type, Z. Anal. Anwendungen, 21 (2002), 963–984 [52] Balder, E., A general denseness result for relaxed control theory, Bull. Austr. Math. Soc., 30 (1984), 463–475 [53] Balder, E., Necessary and sufficient conditions for L1 –strong–weak lower semicontinuity of integral functionals, Nonlin. Anal. 11 (1987), 1399–1404 [54] Balder, E.–Yannelis, N., Equilibria in random and bayesian games with a continuum of players in Equilibrium Theory in Infinite Dimensional Games, Eds. M. A. Khan–N. Yannelis, Springer-Verlag, Berlin (1991), 333– 350 [55] Balder, E., Existence of solutions for control and variational problems with recursive objectives, J. Math. Anal., 178 (1993), 418–437 [56] Ball, J.–Zhang, K., Lower semicontinuity of multiple integrals and the bitting lemma, Proc. Royal. Soc. Edinburgh, 114A (1990), 367–379 [57] Banach, S., Sur les op´erations dans les ensembles abstraits et leur application aux ´equations int´egrales, Fund. Math., 3 (1922), 133–181 [58] Barbu, V., Nonlinear Semigroups and Differential Equations in Banach Spaces, Noordhoff International, Leyden, The Netherlands (1976) [59] Barbu, V.–Precupanu, T., Convexity and Optimization in Banach Spaces, Reidel, Dordecht, The Netherlands (1986) [60] Becker, R.–Boyd, J.–Sung, B., Recursive utility and optimal capital accumulation I: Existence, J. Economic Theory 47 (1989), 76–100 [61] Beer, G., Topologies on Closed and Closed Convex Sets, Kluwer, Dordrecht (1993) [62] Bellman, R., Dynamic Programming, Princeton Univ. Press, Princeton, NJ (1957) [63] Benamara, M., Points Extremaux, Multi–Applications et Fonctionalles Integrales, Th´ese du 3eme Cycle, Univ. de Grenoble, France (1975)
References [64] Benci, V.–Rabinowitz, P., Critical point theorems for indefinite functionals, Invent. Math., 52 (1979), 241–273 [65] Benveniste, L.–Scheinkman, J., On the differentiability of the value function in dynamic models of economics, Econometrica, 47 (1979), 728–732 [66] Berestycki, H.–Lasry, M.–Mancini, G.–Ruf, B., Existence of multiple periodic orbits on star-shaped Hamiltonian surfaces, Comm. Pure Appl. Math., 38 (1985), 253–289 [67] Berge, C., Espaces Topologiques, Fonctions Multivoques (2–nd edition), Dunod, Paris (1966) [68] Berger, M., Nonlinearity and Functional Analysis, Academic Press, New York (1977) [69] Berger, M.–Schechter, M., On the solvability of semilinear gradient operator equations, Adv. Math., 25 (1977), 97–132 [70] Berkovitz, L., Optimal Control Theory, Springer-Verlag, New York (1974) [71] Berkovitz, L., Lower semicontinuity of integral functionals, Trans AMS, 192 (1974), 51–57 [72] Bertsekas, D.–Shreve, S., Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York (1978) [73] Beurling, M.–Livingston, A., A theorem on duality mapping in Banach spaces, Ark. f¨ or Math., 4 (1962), 405–411 [74] Bewley, T., Existence of equilibria in economics with infinitely many commodities, J. Economic Theory, 4 (1972), 514–540 [75] Bielecki, A., Une remarque sur la m´ ethode de Banach–Caccioppoli– Tichomirov, Bull. Acad. Polon. Sci., 4 (1956), 261–268 [76] Binding, P.–Drabek, P.–Huang, Y.–X., On Neumann boundary value problems for some quasilinear elliptic equations, Electr. J. Diff. Eqns., Vol. 1997, No. 5 (1997), 1–11 [77] Blackwell, D., Discounted dynamic programming, Ann. Math. Stat., 36 (1966), 226–235 [78] Blackwell, D., Positive dynamic programming in Proceedings of the Fifth Berkeley Symp. on Math. Stat. Prob., Ann. Math. Stat., Vol. 2, Univ. of California, Berkeley, (1967), 415–418 [79] Blanchard, P.–Br¨ uning, E., Variational Methods in Mathematical Physics, Springer-Verlag (1992) [80] Bliss, G., Lectures on the Calculus of Variations, Univ. of Chicago Press, Chicago (1946) [81] Bochner, S., Integration von Funktionen deren Werte die Elemente eines Vectorraumes sind, Fund. Math., 20 (1933), 262–272 [82] Boltyanski, V. G.–Gamkrelidze, R. V.–Pontryagin, L. S., Theory of optimal processes I. Maximum principle, (Russian) Izv. Akad. Nauk. SSSR, Ser. Mat. 24 (1960), 3–42 [83] Bolza, O., Lectures on the Calculus of Variations, Dover, New York (1961) [84] Bondavera, O. N., Theory of the core in the n-person game, Vestnik Leningrand, 13 (1962), 141–142 (in Russian) [85] Bondareva, O. N., Some applications of linear programming methods to the theory of cooperative games, Problemy Kibernetik, 10 (1963), 119–139 (in Russian) [86] Border, K., Fixed Point Theorems with Applications to Economics and Game Theory, Cambridge Univ. Press, Cambridge, UK (1985)
[87] Border, K., Functional analytic tools for expected utility theory in Positive Operartors, Riesz Spaces and Economics, Eds. C. Aliprantis–D. Brown–W. Luxemburg, Springer-Verlag (1991) [88] Border, K., Revealed preference, stochastic dominance and the expected utility hypothesis, J. Economic Theory, 56 (1992), 20–42 [89] Borel, E., Les “paradoxes” de la th´ eorie des ensembles, Ann. Sci. Ecole Norm. Sup., 25 (1908), 443–448 [90] Borsuk, K., Sur les retractes, Fund. Math., 17 (1931), 152–170 [91] Borsuk, K., Drei S¨ atze u ¨ber die n–dimensionale euklidische Sph¨ are, Fund. Math., 20 (1933), 177–190 [92] Borwein, J.–Preiss, D., A smooth variational principle with applications to subdifferentiability and to differentiability of convex functions, Trans. AMS, 303 (1987), 517–527 [93] Bouligand, G., Introduction a ` la Geometrie Infinit´esimale Directe, Gauthier Villars, Paris (1932) [94] Bourgain, J., An averaging result for l1 -sequences and applications to weakly conditionally compact sets in L1 (X), Israel J. Math., 32 (1979), 289–298 [95] Bourgin, R., Geometric Aspects of Convex Sets with the Radon–Nikodym Property, Lecture Notes in Math., Vol. 993, Springer-Verlag, Berlin (1983) [96] Boylan, E., Equiconvergence of martingales, Ann. Math. Stat., 42 (1971), 552–559 [97] Bressan, A.–Cellina, A.–Fryszkowski, A., A class of absolute retracts in spaces of integrable functions, Proc. Amer. Math. Soc., 112 (1991), 413– 418 [98] Brezis, H., Equations et inequations non lineaires dans les espaces vectoriels en dualit´e, Ann. Inst. Fourier, 18 (1968), 115–175 [99] Brezis, H., Propri´ et´es r´egularisantes de certains semi–groupes nonlin´ eaires, Israel J. Math., 9, (1971), 513–534 [100] Brezis, H., Probl`emes unilat´eraux, J. Math. Pures Appl., 51 (1972), 1–164 [101] Brezis, H.–Nirenberg, L.–Stampacchia, G., A remark of Ky Fan’s minimax principle, Bolletino UMI, 6 (1972), 293–300 [102] Brezis, H., Operateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, North Holland, Amsterdam (1973) [103] Brezis, H.–Browder, F., A general principle on ordered sets in nonlinear functional analysis, Adv. Math., 21 No. 3 (1976), 355–364 [104] Brezis, H.–Nirenberg, L., Positive solutions of nonlinear elliptic equations involving critical Sobolev exponents, Comm. Pure Appl. Math., 36 (1983), 437–477 [105] Brezis, H.–Nirenberg, L., Remarks on finding critical points, Comm. Pure Appl. Math., 44 (1991), 939–963 [106] Brezis, H.–Nirenberg, L., H 1 versus C 1 -local minimizers, CRAS Paris, t. 317 (1993), 465–472 [107] Brock, W., On the existence of weakly maximal programmes in a multisector economy, Review of Economic Studies, 37 (1970), 275–280 [108] Brock, W.–Mirman, L., Optimal economic growth and uncertainty: The discounted case, J. Economic Theory, 4 (1972), 479–513 [109] Brock, W.–Mirman, L., Optimal economic growth and uncertainty: The non-discounting case, Intern. Economic Review, 14 (1973), 560–573
[110] Brock, W.–Haurie, A., On the existence of overtaking optimal trajectories over an infinite time horizon, Math. Operat. Res., 1 (1976), 337–346 [111] Brock, W.–Majumdar, M., On characterizing optimal competitive programs in terms of decentralizable conditions, J. Economic Theory, 45, (1988), 262–273 [112] Brondsted, A., Conjugate convex functions in topological vector spaces, Mat–Fys. Medd. Danske Vid. Selsk, 34 (1964), 1–26 [113] Brodskii, M.–Milman, D., On the center of a convex set, Dokl. Akad. Nauk., USSR, 59 (1948), 837–840 (Russian) [114] Brouwer, L. E. J., On continuous one-to-one transformations of surfaces into themselves, Proc. Kon. Ned. Ak. V. Wet., Series A 11 (1909), 788–798 ¨ [115] Brouwer, L. E. J., Uber abblildung von mannigfaligkeiten, Math. Ann., 71, (1912), 97–115 [116] Browder, F., Nonexpansive nonlinear operators in a Banach space, Proc. Nat. Acad. Sci., USA, 54 (1965), 1041–1044 [117] Browder, F., Infinite dimensional manifolds and nonlinear eigenvalue problems, Ann. of Math., 82 (1965), 459–477 [118] Browder, F., The fixed point theory of multivalued mappings in topological vector spaces, Math. Ann., 177 (1968), 283–301 [119] Browder, F., Nonlinear maximal monotone mappings in Banach spaces, Math. Ann., 175 (1968), 81–113 [120] Browder, F.–Hess, P., Nonlinear mappings of monotone type in Banach Spaces, J. Funct. Anal., 11 (1972), 251–294 [121] Browder, F., Nonlinear Operators and Nonlinear Equations of Evolution in Banach Spaces, Proc. of Symposia in Pure Math., Vol. XVII, Part 2, AMS, Providence (1976) [122] Browder, F., Fixed point theory and nonlinear problems, Bull. AMS (N.S.), 9 (1983), 1–39 [123] Browder, F., Degree Theory for Nonlinear Mappings in Nonlinear Functional Analysis and its Applications, Ed. F. Browder, Proc. of Symposia in Pure Math., Vol. 45, Part I, AMS, Providence (1986) [124] Brown, R., A Topological Introduction to Nonlinear Analysis, Birkh¨ auser, Boston (1993) [125] Buttazzo, G.–Dal Maso, G., Γ-convergence and optimal control problems, J. Optim. Th. Appl., 38 (1982), 385–407 [126] Buttazzo, G., Semicontinuity, Relaxation and Integral Representation in the Calculus of Variations, Pitman Research Notes in Math., vol. 207, Longman Scientific and Technical, Essex, UK. (1989) [127] Caklovic, L.–Li, S.–Willem, M., A note on Palais–Smale condition and coercivity, Diff. Integral Eqns., 3 (1990), 799–800 [128] Caristi, J.–Kirk, W., Geometric fixed point theory and inwardness conditions, Lecture Notes in Math., Vol. 490, Springer-Verlag (1975) [129] Caristi, J., Fixed point theorems for mappings satisfying inwardness conditions, Trans. AMS, 215 (1976), 241–251 [130] Carlson, D.–Haurie, A.–Leizarowitz, A., Infinite Horizon Optimal Control: Deterministic and Stochastic Systems, Springer-Verlag, New York (1991) [131] Cartan, H., Calcul Diff´erentiel, Herman, Paris (1967) [132] Casas, E.–Fernadez, L., A Green’s formula for quasilinear elliptic operators, J. Math. Anal. Appl., 142 (1989), 62–73
[133] Castaing, C., Sur les multiapplications mesurables, Revue Francaise Inform. Rech. Operat., 1 (1967), 91–126 [134] Castaing, C.–Valadier, M., Convex Analysis and Measurable Multifunctions, Lecture Notes in Math., Vol. 580, Springer-Verlag, Berlin (1977) [135] Cellina, A., The role of approximation in the theory of multivalued mappings in Differential Games and Related Topics, Eds. H. W. Kuhn–G. P. Szego, North–Holland, Amsterdam, (1971), 209–220 [136] Cerami, G., Un criterio di existenza per i punti critici su varieta illimitate, 1st Lombardo Accad. Sci. Lett. Rend. A, 112 (1978), 332–336 [137] Cerami, G.–Fortunato, D.–Struwe, M., Bifurcation and multiplicity results for nonlinear elliptic problems involving critical Sobolev exponents, Ann. Inst. H. Poincar´e, Analyse Non Lin´eaire, 1 (1984), 341–350 [138] Cesari, L., Closure theorems for orietor fields and weak convergence, Arch. Rational Mech. Anal., 55 (1974), 332–356 [139] Cesari, L., Lower semicontinuity and lower closure theorems without seminormality conditions, Annali di Mat. Pura Appl., 98 (1974), 381–397 [140] Cesari, L., Optimization Theory and Applications, Springer-Verlag, New York (1983) [141] Cesari, L.–Kannan, R., Existence of solutions of a nonlinear differential equation, Proc. AMS, 88 (1983), 605–613 [142] Chang, K.–C., Solutions of asymptotically linear operator equations via Morse theory, Comm. Pure Math., 34 (1981), 693–712 [143] Chang, K.–C., Infinite Dimensional Morse Theory and Multiple Solution Problems, Birkh¨ auser, Boston (1993) [144] Choquet, G., Lectures on Analysis I,II,III, Benjamin, Reading, MA (1969) [145] Christensen, J., Topology and Borel Structure, North Holland, Amsterdam (1974) [146] Cioranescu, I., Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems, Kluwer, Dordrecht, The Netherlands (1990) [147] Clark, D., A variant of the Lusternik–Schnirelmann theory, Indiana Univ. Math. J., 22 (1972), 65–74 [148] Clarke, F., Generalized gradients and applications, Trans. AMS, 205 (1975), 247–262 [149] Clarke, F., A new approach to Lagrange multipliers, Math. Oper. Res., 1 (1976), 165–174 [150] Clarke, F., A classical variational principle for periodic Hamiltonian trajectories, Proc. AMS, 76 (1979), 186–188 [151] Clarke, F.–Ekeland, I., Hamiltonian trajectories with prescribed minimal period, Comm. Pure Appl. Math., 33 (1980), 103–116 [152] Clarke, F., Generalized gradients of Lipschitz functionals, Adv. Math., 40 (1981), 52–67 [153] Clarke, F., Optimization and Nonsmooth Analysis, Wiley, New York (1983) [154] Clarke, F., Methods of Dynamic and Nonsmooth Optimization, CBMS– NSF, Regional Conf. in Appl. Math., Vol. 57, SIAM, Philadelphia (1989) [155] Clement, P.–Peletier, L., An antimaximum principle for second order elliptic operators, J. Diff. Eqns., 34 (1979), 218–229 [156] Coffman, C. V., A minimum–maximum principle for a class of nonlinear integral equations, J. Anal. Math., 22 (1969), 391–419 [157] Cornwall, R., Conditions on the graph of the integral of a correspondence to be open, J. Math. Anal. Appl., 39 (1972), 771–792
[158] Costa, D.–Silva, E., The Palais–Smale condition versus coercivity, Nonlin. Anal., 16 (1991), 371–381 [159] Costa, D.–Magalhaes, C., Existence results for perturbations of the pLaplacian, Nonlin. Anal., 24 (1995), 409–418 [160] Cotter, K., Similarity of information and behavior with a pointwise convergence topology, J. Math. Economics, 15 (1986), 25–38 [161] Cotter, K., Convergence of information, random variables and noise, J. Math. Economics, 16 (1987), 39–51 [162] Cotter, K., Type correlated equilibria for games with payoff uncertainty, (English summary), Economic Theory, 4 (1994), 617–627 [163] Courant, R., Dirichlet’s Principle, Conformal Mapping and Minimal Surfaces, Appendix by M. Schiffer, Interscience, New York (1950) [164] Courant, R.–Hilbert, D., Methods of Mathematical Physics I, Interscience, New York (1953) [165] Covitz, H.–Nadler, S., Multivalued contraction mappings in generalized metric spaces, Israel J. Math., 8 (1970), 5–11 [166] Crandall, M.–Liggett, T., Generation of semigroups of nonlinear transformations on general Banach spaces, Amer. J. Math., 93 (1971), 265–298 [167] Cuesta, M.–De Figueiredo, D.–Gossez, J.–P., The beginning of the Fuˇcik spectrum for the p-Laplacian, J. Diff. Eqns., 159, No. 1 (1999), 212–238 [168] Cuesta, M., Eigenvalue problems for the p-Laplacian with indefinite weights, Electronic J. Diff. Eqns., Vol. 2001, No. 33 (2001), 1–9 [169] Cuesta, M., Minimax theorems on C 1 –manifolds via Ekeland variational principle, Abstr. Appl. Anal., 13 (2003), 757–768 [170] Dacorogna, B., Direct Methods in the Calculus of Variations, SpringerVerlag, New York (1989) [171] Dal Maso, G., An Introduction to Γ–Convergence, Birkh¨ auser, Boston (1993) [172] Damascelli, L., Comparison theorems for some quasilinear degenerate elliptic operators and applications to symmetry and monotonicity results, Ann. Inst. H. Poincar´e, Analyse Non Lin´eaire, 15 (1998), 493–516 [173] Dana, R. A., Evaluation of development programs in a stationary stochastic economy with bounded primary resources in Mathematical Models in Economics, Eds. J. Los–M. Los, North Holland, Amsterdam, (1974), 179– 205 [174] Dancer, E., Bifurcation theory in real Banach spaces, Proc. London Math. Soc., 23 (1971), 699–734 [175] Danes, J., A geometric theorem useful in nonlinear functional analysis, Bollettino UMI, 6 (1972), 369–375 [176] Darbo, G., Punti uniti in transformazioni a condominio non compatto, Rend. Sem. Mat. Univ. Padova, 24 (1955), 84–92 [177] Dasgupta, S.–Mitra, T., Characterization of intertemporal optimality in terms of decentralization conditions: The discounted case, J. Economic Theory, 45 (1988), 274–287 [178] Dasgupta, S.–Mitra, T., Optimal and competitive programs in reachable multisector models, Economic Theory, 14 (1999), 565–582 [179] De Blasi, F. S.–Myjal, J., The Baire category method in existence problems for a class of multivalued differential equations with nonconvex right hand side, Funkc. Ekvac, 28 (1985), 139–156
[180] De Blasi, F. S.–Pianigiani, G., The Baire category method in existence problems for a class of multivalued differential equations with nonconvex right hand side, Funkc. Ekvac, 28 (1985), 139–156 [181] De Blasi, F. S.–Pianigiani, G., Nonconvex valued differential inclusions in Banach spaces, J. Math. Anal. Appl., 157 (1991), 469–494 [182] De Blasi, F. S.–Pianigiani, G., Topological properties of nonconvex differential inclusions, Nonlin. Anal., 20, (1993), 871–894 [183] Dechert, W. D.–Nishimura, K., A complete characterization of optimal growth paths in an aggregated model with a non-concave production function, J. Economic Theory, 31 (1983), 332–354 [184] Debreu, G., A social equilibrium existence theorem, Proc. Nat. Acad. Sci. USA, 38 (1952), 886–893 [185] Debreu, G., Theory of Value, Cowles Foundation Monograph, Yale Univ. Press, New Haven, CO (1959) [186] Debreu, G., Integration of correspondences, Proc. of the fifth Berkley Symp. on Math. statistics and Probability (1965/66), Vol. II, Univ. of California Press, Berkley, 351–372 [187] Deimling, K., Fixed points of condensing maps in Volterra equations, Lecture Notes in Math., Vol. 737, (1979) Springer-Verlag, New York, 67–82 [188] Deimling, K., Nonlinear Functional Analysis, Springer-Verlag, Berlin (1985) [189] De Giorgi, E., Teoremi di Semicontinuita nel Calcolo delle Variazioni, Notes of a course held in INDAM, Rome (1968) [190] De Giorgi, E.–Franzoni, T., Su un tipo di convergenza variazionale, Atti. Accad. Naz. Lincei. Rend Cl. Sci. Fis. Mat. Nat., 58 (1975), 842–850 [191] De Giorgi, E.–Franzoni, T., Su un tipo di convergenza variazionale, Rend Sem. Mat. Brescia, 3 (1979), 63–101 [192] Degiovanni, M.–Marzocchi, M., A critical point theory for nonsmooth functionals, Ann. Mat. Pura ed Appl., 167 (1994), 73–100 [193] Delbaen, F., Convex games and extreme points, J. Math. Anal. Appl., 45 (1974), 210–233 [194] Denkowski, Z.–Mig´ orski, S.–Papageorgiou N. S., An Introduction to Nonlinear Analysis: Theory, Kluwer Academic, Boston (2003a) [195] Denkowski, Z.–Mig´ orski, S.–Papageorgiou N. S., An Introduction to Nonlinear Analysis: Applications, Kluwer Academic, Boston (2003b) [196] Deutsch, F.–Kenderov, P., Continuous selections and approximate selection for set–valued mappings and applications to metric projections, SIAM J. Math. Anal., 14 (1983), 185–194 [197] Deville, R.–Godefroy, R.–Zizler, V., Smoothness and Renorming in B– spaces, Longman, Essex, UK (1993) [198] E. Di Benedetto, E., Degenerate Parabolic Equations, Springer-Verlag, New York(1993) [199] Diestel, J.–Uhl, J., Vector Measures, Math. Surveys, Vol. 15 (1977), AMS, Providence, RI [200] Dieudonn´e, J., Foundations of Modern Analysis, Pure and Applied Mathematics, Vol. 10–I, Academic Press, New York (1969) [201] Dinculeanu, N.–Foias, C., Sur la representation int´egrale des certaines op´erations lin´eaires IV, Canad. J. Math., 13 (1961), 529–556 [202] Dontchev, A.–Zolezzi, T., Well–Posed Optimization Problems, Lecture Notes in Math., Vol. 1404, Springer-Verlag, Berlin (1994)
[203] Dorfman, R.–Samuelson, P.–Solow, R., Linear Programming and Economic Analysis, McGraw-Hill, New York (1958) [204] Douka, P.–Papageorgiou, N. S., Extremal solutions for nonlinear second order differential inclusions, Math. Nachr., 247 (2004), 30–45 [205] Drabek, P.–Manasevich, R., On the closed solution to some nonhomogeneous eigenvalue problems with p-Laplacian, Diff. Integral Eqns., 12, (1999), 773–788 [206] Drabek, P.–Robinson, S., Resonance problems for the p-Laplacian, J. Funct. Anal., 169 (1999), 189–200 [207] Dreze, G.–Gabszewicz, J.–Schmeidler, D.–Vind, K., Cores and prices in an exchange economy with an atomless sector, Econometrica, 40 (1972), 1091–1108 [208] Dubovitskii, A.–Milyutin, A., Extermal problems with side conditions, USSR Comput. Math. Phys., 5 (1965), 1–80 [209] Dugundji, J., An extension of Tietze’s theorem, Pacific J. Math., 1 (1951), 353–367 [210] Dugundji, J., Topology, Allyn and Bacon, Boston (1966) [211] Dugundji, J.–Granas, A., KKM–maps and variational inequalities, Ann. Scuola Norm. Sup. Pisa, 5 (1978), 679–682 [212] Dugundji, J.–Granas, A., Fixed Point Theory, Polish Scientific, Warsaw (1982) [213] Dunford, N., Integration in general analysis, Trans. AMS, 37 (1935), 441– 453 [214] Dunford, N.–Schwartz, Linear Operators I, Wiley, New York (1958) [215] Dynkin, E., Optimal programs and stimulating prices in stochastic models of economic growth in Mathematical Models in Economics, Eds. J. Los–M. Los, North Holland, Amsterdam, (1974), 207–218 [216] Dynkin, E.–Yushkevich, A., Controlled Markov Processes, Springer-Verlag, Berlin (1979) [217] Edelstein, M., On fixed and periodic points under contractive mappings, J. London Math. Soc., 37 (1962), 74–79 [218] Edelstein, M., The construction of an asymptotic center with a fixed point property, Bull. AMS, 78 (1972), 206–208 [219] Edgeworth, F. Y., Mathematical Physics, Paul Kegan, London (1881) [220] Ehrling, G., On a type of eigenvalue problems for certain differential operators, Math. Scand., 2 (1954), 267–287 [221] Ekeland, I., On the variational principle, J. Math. Anal. Appl., 47 (1974), 324–353 [222] Ekeland, I.–Temam R., Convex Analysis and Variational Problems, North– Holland, Amsterdam (1976) [223] Ekeland, I., Nonconvex minimization problems, Bull. AMS (N. S.), 1 (1979), 443–474 [224] Ekeland, I.–Lasry, M., On the number of periodic trajectories for a Hamiltonian flow on a convex energy surface, Ann. Math., 112, (1980), 283–319 [225] Ekeland, I.–Hofer, H., Convex Hamiltonian energy surfaces and their periodic trajectories, Comm. Math. Phys., 113 (1987), 419–469 [226] Ekeland, I., The ε–variational principle revisited in Methods of Nonconvex Analysis, ed. A. Cellina, Lecture Notes in Math., Vol. 1446, SpringerVerlag, Berlin, (1989), 1–15
[227] Ekeland, I., Convexity Methods in Hamiltonian Mechanics, SpringerVerlag, Berlin (1990) [228] Evans, I.–Gariepy, R., Measure Theory and Fine Properties of Functions, CRC Press, Boca Raton, FL (1992) [229] Evans, G., Partial Differential Equations, Graduate Studies in Mathematics, Vol. 19, AMS, Providence, (1998) [230] Evstigneev, I., Optimal stochastic programs and their stimulating prices in Mathematical Models in Economics, Eds. J. Los–M. Los, North Holland, Amsterdam, (1974), 219–252 [231] Evstigneev, I.–Katyshev, P., Equilibrium points in stochastic models of economic dynamics, Theory Prob. Appl., 27, (1982), 127–135 [232] Fan, K., Fixed points and minimax theorems in locally convex linear spaces, Proc. Nat. Acad. Sci. USA, 38 (1952), 121–126 [233] Fan, K., A generalization of Tychonoff ’s fixed point theorem, Math. Ann., 142 (1961), 305–310 [234] Fan, K., Sur un th´eor`eme minimax, CRAS Paris, t. 259 (1964), 3925–3928 [235] Fan, K., Applications of a theorem concerning sets with convex sections, Math. Ann., 163 (1966), 189–206 [236] Fan, X.–Li, Z., Linking and existence results for perturbations of the pLapalcian, Nonlin. Anal., 42 (2000), 1413–1420 [237] Fan, X.–L.–Zhao, Y.–Z.–Huang, G.–F., Existence of solutions for the p-Laplacian with crossing nonlinearity, Disc. Cont. Dynam. Systems, 8 (2002), 1019–1024 [238] Faraci, F., Multiplicity results for a Neumann problem involving the pLaplacian, J. Math. Anal. Appl., 277 (2003), 180–189 [239] Fattorini, H., Infinite Dimensional Optimization and Control Theory, Cambridge Univ. Press, Cambridge, UK (1999) [240] Feller, W., On the generalization of unbounded semigroups of bounded linear operators, Ann. of Math., 58 (1953), 166–174 [241] Fenchel, W., Convex Cones, Sets and Functions, Princeton University Press, Princeton, NJ (1951) [242] Fershtman, C.–Mullar, E., Turnpike properties of capital accumulation games, J. Economic Theory, 38 (1986), 167–177 [243] Fetter, H., On the continuity of conditional expectations, J. Math. Anal. Appl., 61 (1977), 227–231 [244] de Figueiredo, D., The Ekeland Variational Principle with Applications and Detours, Tata Institute of Fundamental Research, Bombay (1989) [245] Filippakis, M.–Papageorgiou, N. S., Solvability of nonlinear variational– hemivariational inequalities, J. Math. Anal. Appl., 311 (2005), 162–181 [246] Filippakis, M.–Gasi´ nski, L.–Papageorgiou, N. S., Multiplicity results for nonlinear Neumann problems, Canad. J. Math., 58 (2006), 64–92 [247] Filippov, A. F., On certain questions in the theory of optimal control, SIAM J. Control, 1 (1962), 76–84 [248] Fishburn, P., Separation theorems and expected utility, J. Economic Theory, II (1975), 16–34 [249] Fleming, W.–Richel, R., Deterministic and Stochastic Optimal Control, Springer-Verlag, New York (1975) [250] Florenzano, M., On the existence of equilibria in economies with an infinite dimensional commodity space, J. Math. Economics, 12 (1983), 207–219
[251] Fonda, A.–Mawhin, J., Quadratic forms, weighted eigenfunctions and boundary value problems for nonlinear second order differential equations, Proc. Royal Soc. Edinburgh, 112A (1989), 145–163 [252] Fonseca, I.–Gangbo, W., Degree Theory in Analysis and Applications, Clarendon Press, Oxford, UK (1995) [253] F¨ uhrer, L., Ein elementarer analytischer Beweis zu Eindeutigkeit des Abbildungsgrades in RN , Math. Nach., 52 (1972), 259–267 [254] Gale, D., An optimal development in a multisector economy, Review of Econ. Studies, 34 (1967), 1–18 [255] Gamkreilidze, R. V., Principles of Optimal Control Theory, Plenum Press, New York (1978) [256] Garcia Azorero, J.–Manfredi, J.–Peral Alonso, I., Sobolev versus H¨ older local minimizers and global multiplicity for some quasilinear elliptic equations, Comm. Contemp. Math., 2 (2000), 385–404 [257] Garcia Melian, J.–Sabina de Lis, J., Maximum and comparison principles for operators involving the p-Laplacian, J. Math. Anal. Appl., 218 (1998), 49–65 [258] Gasi´ nski, L.–Papageorgiou, N. S., Nonsmooth Critical Point Theory and Nonlinear Boundary Value Problems, Chapman & Hall/CRC Press, Boca Raton, Fl (2005) [259] Gasi´ nski, L.–Papageorgiou, N. S., Nonlinear Analysis, Chapman & Hall/CRC Press, Boca Raton, Fl (2006) [260] Gelfand, I. M.–Fomin, S. V., Calculus of Variations, Prentice Hall, Englewood Cliffs, NJ (1963) [261] Gelfand, I.–Shilov, G., Generalized Functions I: Properties and Operations, Academic Press, New York (1977) [262] Ghoussoub, N., Duality and Perturbation Methods in Critical Point Theory, Cambridge Univ. Press, Cambridge, UK (1993) [263] Giaquinta, M., Multiple Integrals in the Calculus of Variations and Nonlinear Elliptic Systems, Princeton Univ. Press, Princeton (1983) [264] Giles, J., Convex Analysis with Applications in Differentation of Convex Functions, Pitman, Boston (1982) [265] Girardi, M.–Matzeu, M., Solutions of prescribed maximal period to convex and nonconvex Hamiltonian systems, Bollettino UMI, 4 (1985), 951–967 [266] Girsanov, I. V., Lectures on Mathematical Theory of Extremum Problems, Springer-Verlag, New York (1972) [267] Godoy, T.–Gossez, J.–P.–Paczka, S., Antimaximum principle for elliptic problems with weight, Electr. J. Diff. Eqns., Vol. 1999, No. 22, (1999), 1–15 [268] Godoy, T.–Gossez, J.–P.–Paczka, S., On the antimaximum principle for the p-Laplacian with indefinite weight, Nonlin. Anal., 51 (2002), 449–467 [269] Goebel, K.–Kirk, W., Topics in Metric Fixed Point Theory, Cambridge Univ. Press, Cambridge, UK (1990) [270] Goeleven, D., A note on Palais–Smale condition in the sense of Szulkin, Diff. Integral Eqns., 6 (1993), 1041–1043 [271] G¨ ohde, D., Zum Prinzip der kontraktiven Abbildung, Math. Nachr., 30 (1965), 251–258 [272] Gorniewicz, L., Topological Fixed Point Theory of Multivalued Mappings, Kluwer, Dordrecht (1999)
[273] Granas, A., The theory of compact vector fields and some of its applications to the topology of functional spaces, Rozprawy Mathematyczne, No. 30, Warsaw, (1962), 93 [274] Gromes, W., Ein einfacher Beweis des Satzes von Borsuk, Math. Zeits., 178 (1981), 399–400 [275] Guedda, M.–Veron, L., Quasilinear elliprtic equations involving critical Sobelev exponents, Nonlin. Anal., 13 (1989), 879–902 [276] Guo, D., Some fixed point theorems of expansion and compression type with applications in Nonlinear Analysis and Applications, Ed. V. Lakshmikantham, Lecture Notes in Pure and Applied Mathematics, Vol. 109, Marcel Dekker, New York, (1987), 213–221 [277] Guo, D.–Lakshmikantham, Nonlinear Problems in Abstract Cones, Academic Press, Boston (1988) [278] Guo, D.–Sun J., Some global generalizations of the Birkhoff–Kellog theorem and applications, J. Math. Anal. Appl., 129 (1988), 231–242 [279] Gutman, S., Topological equivalence in the space of integrable vector–valued functions, Proc. AMS, 93 (1985), 40–42 [280] Habets, R.–Metzen, G., Existence of periodic solutions of Duffing equations, J. Diff. Eqns., 78 (1989), 1–32 [281] Hadamard, J., Sur quelques applications de l’indice de Kronecker, Hermann Paris, (1910), 437–477 [282] Halidias, N.–Papageorgiou, N. S., Existence and relaxation results for nonlinear second–order multivalued boundary value problems in RN , J. Diff. Eqns., 147 (1998), 123–154 [283] Halpern, B. R.–Bergman, G. M., A fixed point theorem for inward and outward maps, Trans. AMS, 130 (1968), 353–358 [284] Halpern, B. R., Fixed point theorems for set–valued maps in infinite dimensional spaces, Math. Anal., 189 (1970), 87–89 [285] Harsanyi, J. C., Games with randomly disturbed payoffs, Intern. J. Game Theory, 2 (1973), 1–23 [286] Hartman, P.–Stampacchia, G., On some nonlinear elliptic differential equations, Acta Math., 115 (1966), 271–310 [287] Heikkila, S.–Lakshmikantham, V., Monotone Iterative Techniques for Discontinuous Nonlinear Differential Equations, Marcel Dekker, New York (1994) [288] Heinz, E., An elementary theory of the degree of a mapping in n– dimensional space, J. Math. Mech., 8 (1959), 231–247 [289] Hermes, H.–La Salle, J., Functional Analysis and Time Optimal Control, Academic Press, New York (1969) [290] Hernandez Lerma, O., Adaptive Markov Control Processes, SpringerVerlag, New York (1989) [291] Henry, D, Geometric Theory of Semilinear Parabolic Equations, Lecture Notes in Math., Vol. 840, Springer-Verlag, New York (1981) [292] Hestenes, M., Calculus of Variations and the Optimal Control Theory, Wiley, New York (1966) [293] Hiai, F.–Umegaki, H., Integrals, conditional expectations and martingales of multivalued functions, J. Multiv. Anal., 7 (1977), 149–182 [294] Hiai, F., Representation of additive functionals on vector–valued normed K¨ othe spaces, Kodai Math.–J., 2 (1979), 303–313
[295] Hildenbrand, W., On the core of an economy with a measure space of agents, Review of Econ. Studies, 35 (1968), 443–452 [296] Hildenbrand, W., Existence of equilibria for economies with production and a measure space of consumers, Econometrica, 38 (1970), 608–623 [297] Hildenbrand, W., Core and Equilibria of a Large Economy, Princeton Univ. Press, Princeton, NJ (1974) [298] Hildenbrand, W.–Kirman, A., Introduction to Equilibrium Analysis, North–Holland, Amsterdam (1976) [299] Hille, E., Representation of one–parameter semigroup of linear transformations, Proc. Nat. Acad. Sci., USA, 28 (1942), 175–178 [300] Hille, E.–Phillips, R., Functional Analysis and Semigroups, AMS Colloquium Publications, Vol. 31, AMS, Providence (1957) [301] Himmelberg, C., Measurable relations, Fund. Math., 87 (1975), 53–72 [302] Himmelberg, C.–Parthasarathy, T.–Raghavan, T.–Van Vleck, F., Existence of p-equilibrium and optimal stationary srategies in stochastic games, Proc. AMS, 60 (1976), 245–251 [303] Himmelberg, C.–Parthasarathy, T.–Van Vleck, F., Optimal plans for dynamic programming problems, Math. Oper. Res., 1 (1976), 390–394 [304] Himmelberg, C.–Parthasarathy, T.–Van Vleck, F., On measurable relations, Fund. Math., 111, (1981), 161–167 [305] Hinderer, K., Foundations of Nonstationary Dynamic Programming with Discrete–Time Parameter, Lecture Notes in Oper. Res. and Math. Systems, Vol. 33, Springer-Verlag, New York (1970) [306] Hiriart-Urruty, J.–B., Lipschitz r-continuity of the approximate subdifferential of a convex function, Math. Scand., 47 (1980), 123–134 [307] Hiriart-Urruty, J.–B., ε-subdifferential calculus in Convex Analysis and Optimization, Eds. J.–P. Aubin–I. Ekeland, Research Notes in Math., Vol. 57, (1982), Pitman, London, 43–92 [308] Hiriart-Urruty, J.–B.–Lemarechal, C., Convex Analysis and Minimization Algorithms I, Die Grundlehren der Mathematischen Wissenschaften, Vol. 306, Springer-Verlag, New York (1993) [309] Hiriart-Urruty, J.–B.–Lemarechal, C., Convex Analysis and Minimization Algorithms II, Die Grundlehren der Mathematischen Wissenschaften, Vol. 306, Springer-Verlag, New York (1993) [310] Holmes, R., Geometric Functional Analysis and Applications, Graduate Texts in Mathematics, Vol. 24 Springer-Verlag, New York (1975) [311] H¨ ormander, L., Sur la fonction d’appui des ensembles convexes dans un espace localement convexe, Arckiv f¨ ur Math., 3, (1955), 180–186 [312] Horsley, A.–Van Zandt, T.–Wrobel, A., Berge’s maximum theorem with two topologies on actions, Econ. Letters, 61 (1998), 285–291 [313] Hu, S.–Papageorgiou, N. S., Handbook of Multivalued Analysis. Volume I: Theory, Kluwer, Dordrecht, The Netherlands, (1997) [314] Hu, S.–Papageorgiou, N. S., On nonlinear, nonconvex evolution inclusions, Kodai Math. J., 18 (1995), 169–186 [315] Hu, S.–Papageorgiou, N. S., Time-Dependent Subdifferential Evolution Inclusions and Optimal Control, Memoirs AMS, Vol. 632, (1998) [316] Hu, S.–Papageorgiou, N. S., Handbook of Multivalued Analysis Volume II: Applications, Kluwer, Dordrecht, The Netherlands, (2000) [317] Hu, S.–Papageorgiou, N. S., Solutions and multiple solutions for problems with the p-Laplacian, Monatsch. Math., 150 (2007), 309–326
[318] Huang, Y.–Zhou, H.–S., Positive solution for −p u = f (x, u) with f (x, u) growing as up−1 at infinity, Appl. Math. Letters, 17 (2004), 881–887 [319] Hurwicz, L., On informationally decentralized systems. Decision and organization, (a volume in honor of Jacob Marschak), pp. 297–336. Studies in Mathematical and Managerial Economics, Vol. 12, North–Holland, Amsterdam (1972) [320] Hurwicz, L.–Majumdar, M., Optimal intertemporal allocation mechanisms and decentralization of decisions, J. Economic Theory, 45 (1988), 228–261 [321] Hurwicz, L.–Weinberger, H., A necessary condition for decentralization and an application to intertemporal allocation, J. Economic Theory, 51 (1990), 313–345 [322] Iannacci, R.–Nkashama, M. N., Unbounded perturbations of forced second order ordinary differential equations at resonance, J. Diff. Eqns., 69 (1987), 289–309 [323] Ichiishi, T.–Weber, S., Some theorems on the core of a non side payment game with a measure space of players, Intern. J. Game Theory, 7 (1978), 95–112 [324] Ichiishi, T., Game Theory for Economic Analysis, Academic Press, New York (1983) [325] Ize, J., Bifurcation theory for Fredholm operators, Memoirs AMS, Vol. 174, AMS, Providence, RI (1976) [326] Ioffe, A., On lower semicontinuity of integral functionals I,II, SIAM J. Control Optim., 15 (1977), 169–186 & 991–1000 [327] Ioffe, A.–Tichomirov, V., Theory of Extremal Problems, North–Holland, Amsterdam (1979) [328] Ionescu–Tulcea, A. and C., Topics in the Theory of Lifting, SpringerVerlag, Berlin (1969) [329] Istratescu, V., Fixed Point Theory, Kluwer, Dordrecht, The Netherlands (1981) [330] Jacobs, M., Measurable multivalued maps and Lusin’s theorem, Trans. AMS, 134 (1968), 471–481 [331] Jeanjean, P., Optimal development programs under uncertainty: The undiscounted case, J. Economic Theory, 7 (1974), 66–92 [332] Jiu, Q.–Su, J., Existence and multiplicity results for Dirichlet problems with p-Laplacian, J. Math. Anal. Appl., 281 (2003), 587–601 [333] Jones, S., Existence of equilibria with infintely many commodities: Banach lattices reconsidered, J. Math. Economics, 16 (1987), 89–104 [334] Joshi, S., Existence in undiscounted non-stationary non-convex multisector environments, J. Math. Economics, 28 (1997), 111–126 [335] Jost, J., Partial Differential Equations, Graduate Texts in Mathematics, Vol. 214, Springer, New York (2002) [336] Kachurovski, R., On monotone operators and convex functionals, Uspekhi Mat. Nauk., 15 (1960), 213–215 (in Russian) [337] Kakutani, S., A generalization of Brouwer’s fixed point theorem, Duke Math. J., 8 (1941), 457–459 [338] Kakutani, S., Topological properties of the unit sphere of a Hilbert space, Proc. Imp. Acad. Tokyo, 19 (1943), 269–271 [339] Kamae, T.–Krengel, U.–O’Brien, G. L., Stochastic inequalities on partially ordered spaces, Ann. Prob., 5 (1977), 899–912
[340] Kannai, Y., Countably additive measures in cores of games, J. Math. Anal. Appl., 27 (1967), 227–240 [341] Kato, T., Nonlinear semigroups and evolution equations, J. Math. Soc. Japan, 19 (1967), 508–520 [342] Kato, T., Accretive operators and nonlinear evolution equations in Banach spaces in Nonlinear Functional Analysis, Ed. F. Browder, Proc. of Symposia in Pure Math., Vol. 18, AMS Providence (1968) [343] Kato, T., Perturbation Theory for Linear Operators, (2nd edition) Springer-Verlag, Berlin (1976) [344] Kelley, J., General Topology, Springer-Verlag, New York (1955) [345] Kenmochi, N., Nonlinear operators of monotone type in reflexive Banach spaces and nonlinear perturbations, Hiroshima Math. J., 4 (1974), 229–263 [346] Kenmochi, N., Pseudomonotone operators and nonlinear elliptic boundary value problems, J. Math. Soc. Japan, 27 (1975), 121–149 [347] Kenmochi, N., Some nonlinear parabolic variational inequalities, Israel J. Math., 22 (1975), 304–331 [348] Kenmochi, N., Solvability of nonlinear evolution equations with time– dependent constraints and applications, Bull. Fac. Edu., Chiba Univ., 30 (1980), 1–87 [349] Khan, M. A., Equilibrium points of nonatomic games over a Banach space, Trans. AMS, 293 (1986), 737–749 [350] Khan, M. A.–Mitra, T., On the existence of a stationary optimal stock of a multisector economy: A primal approach, J. Economic Theory, 40 (1986), 319–328 [351] Khan, M. A.–Papageorgiou, N. S., On Cournot–Nash equilibria in generalized quantitive games with an atomless measure space of agents, Proc. AMS, 100 (1987), 505–510 [352] Khan, M. A., On Cournot–Nash equilibrium for games with a nonmetrizable action space and upper semicontinuous payoffs, Trans. AMS, 315 (1989), 127–146 [353] Kim, T.–Yannelis, N., Existence of equilibrium in Bayesian games with infinitely many players, J. Economic Theory, 77 (1997), 330–353 [354] Kirk, W., A fixed point theorem for mappings which do not increase distances, Amer. Math. Monthly, 72 (1965), 1004–1006 [355] Kisielewicz, M., Differential Inclusions and Optimal Control, Kluwer, Dordrecht (1991) [356] Klei, H. A., A compactness criterion in L1 (E) and Radon–Nikodym theorems for multimeasures, Bull. Sci. Math., 112 (1988), 305–324 [357] Klein, E.–Thompson, A., Theory of Correspondences, Wiley, New York (1984) [358] Knaster, B.–Kuratowski, K.–Mazurkiewicz, S., Ein Beweis des Fixpunktsatze f¨ ur n–dimensionale Simplexe, Fund. Math., 14 (1929), 132–137 [359] Kobayashi, J.–Otani, M., An index formula for the degree of (S)+ – mappings associated with one–dimensional p-Laplacian, Abstr. Appl. Anal., Vol. 2004, No. 11 (2004), 981–995 [360] Komura, Y., Nonlinear semigroups in Hilbert spaces, J. Math. Soc. Japan, 19 (1967), 493–507 [361] Krasnoselskii, M. A., Positive Solutions of Operator Equations, Noordhoff, Groningen, The Netherlands (1964)
[362] Krasnoselskii, M. A., Topological Methods in the Theory of Nonlinear Integral Equations, Pergamon Press, Oxford, UK (1964) [363] Krasnoselskii, M. A.–Zabreiko, P., Geometrical Methods of Nonlinear Analysis, Springer-Verlag, New York (1984) [364] Krawcewicz, W.–Wu, J., Theory of Degrees with Applications to Bifurcations and Differential Equations, Wiley, New York (1997) [365] Kudo, H., A note on the strong convergence of σ–algebras, Ann. Prob., 2 (1974), 76–83 [366] Kuratowski, K., Les fonctions semi–continues dans l’espace des ensembles ferm´es, Fund. Math., 18 (1931), 148–159 [367] Kuratowski, K.–Ryll–Nardzewski, C., A general theorem on selectors, Bull. Acad. Polon. Sci. Ser. Sci. Math., Astron., Phys. 13 (1965), 397–403 [368] Kuratowski, K., Topology, Vol. I, Academic Press, New York (1966) [369] Kyritsi, S. T.–Motreanu D.–Papageorgiou, N. S., Two nontrivial solutions for strongly resonant nonlinear elliptic equations, Arch. Math., 83 (2004), 60–69 [370] Laurent, P., Approximation et Optimisation, Hermann, Paris (1972) [371] Lebourg, G., Valeur moyenne pour gradient g´en´eralis´e, CRAS Paris, t. 281 (1975), 795–797 [372] Ledyard, J., The scope of the hypothesis of Bayesian equilibrium, J. Economic Theory, 39 (1986), 59–82 [373] Leese, S., Multifunctions of Souslin type, Bull. Austr. Math. Soc., 11 (1974), 395–411 [374] Leggett, L.–Williams, L., Multiple positive fixed points of nonlinear operators on ordered Banach spaces, Indiana Univ. Math. J., 28 (1979), 673–688 [375] Leray, J.–Schauder, J., Topologie et equations fonctionelles, Ann. Sci. Ecole Norm. Super., 51, (1934), 45–78 [376] Levin, V., Subdifferentials of convex integral functionals and liftings that are the identity on subspaces in L∞ , Soviet Math. Dokl., 14 (1973), 1163– 1166 [377] Levin, V., The Lebesgue decomposition for functions on the vector space L∞ X , Funct. Anal. Appl., 8 (1974), 314–317 [378] Levin, V., Convex integral functionals and lifting theory, Russian Math. Surveys, 30 (1975), 119–184 [379] Li, G.–Zhou, H.–S., Asymptotically linear Dirichlet problem for the pLaplacian, Nonlin. Anal., 43 (2001), 1043–1055 [380] Li, S.–Willem, M., Applications of local linking to critical point theory, J. Math. Anal. Appl., 189 (1995), 6–32 [381] Li, X.–Yong, J., Optimal Control Theory for Infinite Dimensional Systems, Birkh¨ auser, Boston (1995) [382] Lieberman, G., Boundary regularity for solutions of degenerate elliptic equations, Nonlin. Anal., 12 (1988), 1203–1219 [383] Lim, T.–C., On fixed point stability for set–valued contractive mappings with applications to generalized differential equations, J. Math. Anal. Appl., 110 (1985), 436–441 [384] Lindqvist, P., On the equation div(∇xp−2 ∇x+λ|x|p−2 x = 0, Proc. AMS, 109 (1990), 157–164 [385] Lindqvist, P., Addendum to: On the equation div(∇xp−2 ∇x+λ|x|p−2 x = 0, Proc. AMS, 116 (1992), 583–584
[386] Lions, J.–L.–Strauss, W., Some nonlinear evolution equations, Bull. Soc. Math. France, 93 (1965), 43–96 [387] Lions, J.–L., Quelques M´ethodes de R´ esolution des Probl`emes aux Limites Non–Lin´eaires, Dunod, Paris (1969) [388] Lions, J.–L., Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, New York (1971) [389] Liu, J.–Li, S., An existence theorem of multiple critical points and its applications, Kexue Tongbao, 29 (1984), 1025–1027 (in Chinese) [390] Liu, J.–Q.–Su, J., Remarks on multiple nontrivial solutions for quasilinear resonant problems, J. Math. Anal. Appl., 258 (2001), 209–222 [391] Ljusternik, L., Topologishe Grundlagen der allgemeinen Eigenwertheorie, Monatsh. Math. Phys., 37 (1930), 125–130 [392] Ljusternik, L., On constrained extrema of functionals, Mat. Sb., 41 (1934), 390–401 [393] Ljusternik, L.–Schnirelmann, L., Methodes Topologiques dans les Probl´emes Variationelles, Hermann, Paris (1934) [394] Ljusternik, L.–Schnirelmann, L., Topological methods in variational problems and their application to the differential geometry of surfaces, Uspekhi Mat. Nauk., 2 (1947), 166–217 (in Russian) [395] Ljusternik, L.–Sobolev, V., Elements of Functional Analysis, Gordon and Breach Science, New York (1968) [396] Lloyd, N. G., Degree Theory, Cambridge Univ. Press, Cambridge, UK (1978) [397] Maitra, A.–Parthasarathy, T., On stochastic games, J. Optim. Theory Appl., 5 (1970), 289–300 [398] Maitra, A.–Parthasarathy, T., On stochastic games II, J. Optim. Theory Appl., 8 (1971), 154–160 [399] Maitra, A.–Sudderth, W., Discrete Gambling and Stochastic Games, Springer, New York (1996) [400] Majumdar, M.–Radner, R., Stationary optimal policies with discounting in a stochastic activity analysis model, Econometrica, 51 (1983), 1821–1837 [401] Majumdar, M.–Zilha, I., Optimal growth in a stochastic environment: Some sensitivity and turnpike results, J. Economic Theory, 43 (1987), 116– 133 [402] Makarov, V.–Rubinov, A., Mathematical Theory of Economic Dynamics and Equilibria, Springer-Verlag, New York (1977) [403] Malinvaud, E., Capital accumulation and efficient allocation of resources, Econometrica, 21 (1953), 233–268 [404] Malinvaud, E., Efficient capital accumulation: A corringendum, Econometrica, 30 (1962), 570–573 [405] Mamer, J.–Schilling, K., A zero–sum game with incomplete information and compact action spaces, Math. Oper. Res., 11 (1986), 627–631 [406] Manasevich, R.–Mawhin, J., Periodic solutions for nonlinear systems with p-Laplacian–like operators, J. Diff. Eqns., 145 (1998), 367–393 [407] Manasevich, R.–Mawhin, J., Boundary value problems for nonlinear perturbations of vector p-Laplacian–like operators, J. Korean Math. Soc., 37 (2000), 665–685 [408] Marino, A.–Prodi, G., Metodi perturbativi nella teoria di Morse, Bollettino UMI, 3 (1975), 1–32
[409] Martio, O., Counterexamples on the unique continuation, Manuscripta Math., 60 (1988), 21–47 [410] Mas Collel, A., An equilibrium existence theorem without complete or transitive preferences, J. Math. Economics, 1 (1974), 237–246 [411] Mas Collel, A., On a theorem of Schmeidler, J. Math. Economics, 13 (1984), 201–206 [412] Mawhin, J., Topological Degree Methods in Nonlinear Boundary Value Problems, CBMS, Regional Conference Series in Math., Vol. 40 (1979), AMS, Providence, RI [413] Mawhin, J.–Willem, M., Critical points of convex perturbations of some indefinite forms and semilinear boundary value problems at resonance, Ann. Inst. H. Poincar´e, Analyse Non–Lin´eaire 3 (1986), 431–453 [414] Mawhin, J., Semicoercive monotone variational problems, Acad. Royal Belg., Bull. Cl. Sci., 73 (1987), 118–130 [415] Mawhin, J.–Willem, M., Critical Point Theory and Hamiltonian Systems, Springer-Verlag, New York (1989) [416] Mawhin, J., Periodic solutions of systems with p-Laplacian–like operators in Nonlinear Analysis and its Applications to Differential Equations, Progress in Nonlinear Differential Equations, Vol. 45 (2001), Birkh¨ auser, Boston, 37–63 ¨ [417] Mazur, S., Uber konvexe Mengen in linearen normierten R¨ aumen, Studia Math., 4 (1933), 70–84 [418] McKenzie, L., Turnpike theorems for a generalized Leontief model, Econometrica, 31 (1963), 165–180 [419] McKenzie, L., Turnpike theorems with technology and welfare function variable in Mathematical Models in Economics, Eds. J. Los–M. Los, North– Holland, Amsterdam (1974), 120–145 [420] McKenzie, L., Turnpike theory, Econometrica, 44 (1976), 841–856 [421] McKenzie, L., A primal route to turnpike and Liapunov stability, J. Economic Theory, 27 (1982), 194–209 [422] McKenzie, L., Turnpike theory, discounted utility and the von Neumann Facet, J. Economic Theory, 30 (1983), 330–352 [423] McKenzie, L., Optimal economic growth, turnpike theorems and comparative dynamics in Handbook of Mathematical Economics, Vol. III, Eds. K. Arrow–M. Intriligator, North–Holland, Amsterdam (1986), 1281–1355 [424] Megginson, R., An Introduction to Banach Space Theory, Graduate Text in Math., Vol. 183, Springer, New York (1998) [425] Michael, E., Topologies on spaces of subsets, Trans. AMS, 71 (1951), 152– 182 [426] Michael, E., Continuous selections I, Ann. Math., 63 (1956), 361–382 [427] Michael, E., Continuous selections II, Ann. Math., 64 (1956), 562–580 [428] Michael, E., A survey of continuous selections in Set-valued Mappings, Selections and Topological Properties of 2X , Ed. W. M. Fleischman, Lecture Notes in Math., Vol. 171, Springer-Verlag, Berlin (1970), 54–57 [429] Milgram, P.–Weber, R., Distributional strategies for games with incomplete information, Math. Oper. Res., 10 (1985), 619–632 [430] Minty, G., Monotone (nonlinear) operators in a Hilbert space, Duke Math. J., 29 (1962), 341–346 [431] Mitra, T., On the value maximizing property of infinite horizon efficient programs, Inter. Economic Review, 20 (1979), 635–642
[432] Mitra, T., Efficiency, weak value maximality and weak value optimality in a multisector model, Review of Economic Studies, 48 (1981), 643–647 [433] Mitra, T.–Zilcha, I., An optimal economic growth with changing technology and tastes: Characterization and stability results, Inter. Economic Review, 22 (1981), 221–238 [434] Mitra, T., Sensitivity of optimal programs with respect to changes in target stocks: The case of irreversible investment, J. Economic Theory, 29 (1983), 172–184 [435] Miyadera, I., Nonlinear Semigroups, Translation of Math. Monographs, Vol. 109 (1992), AMS, Providence, RI [436] Miyadera, I., Generation of a strongly continuous semigroup of operators, Tˆ ohoku Math. J., 4 (1952), 109–114 [437] Miyagaki, O., On a class of semilinear elliptic problems in RN with critical growth, Nonlin. Anal., 29 (1997), 773–781 [438] Motreanu, D.–Montreanu, V.–Papageorgiou, N. S., Periodic solutions for nonautonomous systems with nonsmooth quadratic or superquadratic potential, Topol. Meth. Nonlin. Anal., 24 (2004), 269–296 [439] Motreanu, D.–Radulescu, V., Variational and Nonvariational Methods in Nonlinear Analysis and Boundary Value Problems, Kluwer, Dordrecht (2003) [440] Motreanu, D.–Papageorgiou, N. S., Existence and multiplicity of solutions for Neumann problems, J. Diff. Eqns., 232 (2007), 1–35 [441] Moreau, J.–J., Proximit` e et dualit´e dans un espace hilbertien, Bull. Soc. Math. France, 93 (1965), 273–299 [442] Moreau, J.–J., Fonctionelles Convexes, Seminaire sur les equations aux deriv`ees partielles, II, College de France, Paris (1966–67) [443] Morrey, C. B., Multiple Integrals in the Calculus of Variations, SpringerVerlag, Berlin (1966) [444] Morse, M., The Calculus of Variations in the Large, AMS, Providence, RI (1934) [445] Mosco, U., Convergence of convex sets and of solutions of variational inequalities, Adv. Math., 3 (1969), 510–585 [446] Mosco, U., On the continuity of the Young–Fenchel transform, J. Math. Anal. Appl., 35 (1971), 518–535 [447] Nachbin, L., Linear continuous funtionals positive on the increasing continuous functions, Summa Brasil Math., 2 (1951), 135–150 [448] Nachbin, L., Topology and Order, Van Nostrand, Princeton, NJ (1965) [449] Nadler, S., Multivalued contraction mappings, Pacific J. Math., 30 (1969), 475–488 [450] Nagumo, M., Degree of mapping in convex linear topological spaces, Amer. J. Math., 73 (1951), 497–511 [451] Nash, J., Equilibrium points in n–person game, Proc. Nat. Acad. Sci. USA, 36 (1950), 48–49 [452] Nash, J., Non–cooperative games, Ann. Math., 54 (1951), 286–295 [453] Nashed, Z., Differentiability and related properties of nonlinear operators: Some aspects of the role of differentials in nonlinear functional analysis in Nonlinear Functional Analysis and Applications, Academic Press, New York (1971)
References
773
[454] Neumann von, J., Zur Theorie der Gesellschaftsspiele, Math. Ann., 100 (1928), 295–320 (English Translation: On the theory of games of strategy in Contributions to the Theory of Games, Vol. IV, Eds. A. W. Tucker–R. D. Luce, Princeton Univ. Press, Princeton, NJ 13–42 [455] Neumann von, J.–Morgenstern, O., Theory of Games and Economic Behavior, Princeton Univ. Press, Princeton, NJ (1947) [456] Neumann von, J., On rings of operators, reduction theory, Ann. Math., 50 (1949), 401–485 [457] Neustadt, L., Optimization. A Theory of Necessary Conditions, Princeton Univ. Press, Princeton, NJ (1976) [458] Neveu, J., Calcul des Probabilit´es, Masson, Paris (1970) [459] Neveu, J., Discrete Parameter Martingales, North Holland, Amsterdam (1975) [460] Nikaido, H.–Isoda, K., Note on noncooperative convex games, Pacific J. Math., 5 (1955), 807–815 [461] Nikaido, H., Persistence of continual growth near the von Neumann ray: a strong version of the Radner turnpike theorem, Econometrica, 32 (1964), 151–163 [462] Nowak, A., Universally measurable strategies in zero–sum stochastic games, Ann. Prob., 13 (1985), 269–287 [463] Nowak, A., On the weak topology on a space of probability measures induced by policies, Bull. Polish Acad. Sci., 36 (1988), 181–188 [464] Nussbaum, R. D., Degree theory for local condensing maps, J. Math Anal. Appl., 37 (1972), 741–766 [465] Nussbaum, R. D., A global bifurcation theorem with application to functional differential equations, J. Funct. Anal., 19 (1975), 319–338 [466] Nyarko, Y., On characterizing optimality of stochastic competitive processes, J. Economic Theory, 45 (1988), 316–329 [467] Olech, C., A characterization of L1 –weak lower semicontinuity of integral functionals, Bull. Acad. Polon. Sci., 25 (1977), 135–142 [468] Olech, C., Decomposability as a substitute of convexity in Proceedings of the Conference on Multifunctions and Integrands, ed. G. Salinetti, Lecture Notes in Math., Vol. 1091, Springer-Verlag, Berlin (1984) [469] Pantelides, G.–Papageorgiou, N. S., Weakly maximal stationary programs for a descrete time stochastic growth model, Japan Jour. Indust. Appl. Math., 10 (1993), 471–486 [470] Palais, R.–Smale, S., A generalized Morse theory, Bull. AMS, 70 (1964), 165–172 [471] Palais, R., Lusternik–Schnirelmann theory on Banach Manifolds, Topology, 5 (1966), 115–132 [472] Palais, R., The principle of symmetric criticality, Comm. Math. Phys., 69 (1979), 19–30 [473] Palfrey, T.–Srivastava, S., Private information in large economies, J. Economic Theory, 39 (1986), 34–58 [474] Papalini, F., A quasilinear Neumann problem with discontinuous nonlinearity, Math. Nachr., 250 (2003), 82–97 [475] Papageorgiou, N. S., On the theory of Banach space valued multifunctions, part 1: Integration and conditional expectation, J. Multiv. Anal., 17 (1985), 185–206
774
References
[476] Papageorgiou, N. S., On the theory of Banach space valued multifunctions, part 2: Set valued martingales and set valued measures, J. Multiv. Anal., 17 (1985), 207–227 [477] Papageorgiou, N. S., On Fatou’s lemma and parametric integrals of set– valued functions, J. Math. Anal. Appl., 187 (1994), 809–825 [478] Papageorgiou, N. S., Optimal programs and their price characterizations in a multisector growth model with uncertainty, Proc. AMS, 122 (1994), 227–240 [479] Papageorgiou, N. S., Weakly maximal programs for a multisector growth model, J. Math. Anal. Appl., 183 (1994), 624–630 [480] Papageorgiou, N. S., On the conditional expectation and convergence properties of random sets, Trans. AMS, 347 (1995), 2495–2515 [481] Papageorgiou, N. S., On parametric evolution inclusions of the subdifferential type with applications to optimal control problems, Trans. AMS, 347 (1995), 203–231 [482] Papageorgiou, N. S.–Papalini, F.–Renzacci, F., Existence of solutions and periodic solutions for nonlinear evolution inclusions, Rend. Circ. Mat. Palermo (2), 48 (1999), 341–364 [483] Papageorgiou, N. S.–Papalini, F.–Yannakakis, N., Nonmonotone, nonlinear evolution inclusions, Math. Comp. Modelling, 32 (Special Issue on Nonlinear Operator Theory, Eds. R. Agarwal–D. O’Regan), (2000), 1345– 1366 [484] Papageorgiou, E.–Papageorgiou, N. S., Two nontrivial solutions for quasilinear periodic problems, Proc. AMS, 132 (2004), 429–434 [485] Papageorgiou, N. S.–Yannakakis, N., Second order nonlinear evolution inclusions I: Existence and relaxation results, Acta Math. Sinica, 21 (2005), 977–996 [486] Papageorgiou, N. S.–Yannakakis, N., Second order nonlinear evolution inclusions II: Structure of the solution set, Acta Math. Sinica, 22 (2006), 195–206 [487] Parthasarathy, K. R., Probability Measures on Metric Spaces, Academic Press, New York (1967) [488] Parthasarathy, T., Discounted, positive and non–cooperative stochastic games, Intern. J. Game Theory, 2 (1973), 25–37 [489] Parthasarathy, T., Existence of equilibrium stationary strategies in discounted stochastic games, Sankhya, 44 (1982), 114–127 [490] Pascali, D.–Sburlan, S., Nonlinear Mappings of Monotone Type, Sijthoff and Noordhoff, The Netherlands (1978) [491] Pazy, A., Semigroups of Linear Operators and Partial Differential Equations, Springer-Verlag, New York (1983) [492] Peleg, B., Efficiency prices for optimal consumption IV, SIAM J. Control, 10 (1972), 414–433 [493] Peleg, B.–Yaari, M., Price properties of optimal consumption programs in Models of Economic Growth, Eds. I. Mirelees–N. Stern, Macmillan, London (1973), 306–317 [494] Penot, J.–P., The drop theorem, the petal theorem and Ekeland’s variational principle, Nonlin. Anal., 10 (1986), 813–822 [495] Phelps, R., Convex Functions, Monotone Operators and Differentiablity, Lecture Notes in Math., Vol. 1364, Springer-Verlag, Berlin (1993)
References
775
[496] Phillips, R., Perturbation theory for semigroups of linear operators, Trans. AMS, 74 (1953), 199–221 [497] Pohozaev, S., Eigenfunctions of the equation u+λf (u) = 0, Soviet Math. Dokl., 6 (1965), 1408–1411 [498] Poincar´e, H., Les figures equilibrium, Acta Math., 7 (1885), 259–302 [499] Pontryagin, L. S., Optimal regulation processes, Amer. Math. Soc. Transl. (2), 18 (1961), 321–339 [500] Pontryagin, L. P.–Boltyanski, V.–Gamkrelidze, R.–Mischenko, E., Mathematical Theory of Optimal Processes, Wiley, New York (1962) [501] Postlewaite, A.–Schmeidler, D., Implementation in differential information economies, J. Economic Theory, 39 (1986), 14–33 [502] Rabinowitz, P., Some global results for nonlinear eigenvalue problems, J. Funct. Anal., 7 (1971), 487–513 [503] Rabinowitz, P., Some aspects of nonlinear eigenvalue problems, Rocky Mountain Consortium, Symposium on Nonlinear Eigenvalue Problems (Santa Fe, N. M., 1971), Rocky Mountain J. Math., 3 (1973), 161–202 [504] Rabinowitz, P., Some minimax theorems and applications to partial differential equations in Nonlinear Analysis–Selection of Papers in Honor of E. R¨ othe, Eds. L. Cesari–R. Kannan–H. Weinberg, Academic Press, New York (1978), 161–177 [505] Rabinowitz, P., Some critical point theorems and applications to semilinear elliptic partial differential equations, Ann. Scuola Norm. Sup. Pisa Cl. Sci., 5(4) (1978), 215–223 [506] Rabinowitz, P., Periodic solutions of Hamiltonian systems, Comm. Pure Appl. Math., 31 (1978), 157–184 [507] Rabinowitz, P., Periodic solutions of a Hamiltonian system on a prescribed energy surface, J. Diff. Eqns., 33 (1979), 336–352 [508] Rabinowitz, P., Minimax Methods in Critical Point Theory with Applications to Differential Equations, CBMS, Regional Conference Series in Math., No. 65, AMS, Providence, RI (1986) [509] Radner, R., Paths of economic growth that are optimal with regard only to final states: A turnpike theorem, Review of Economic Studies, 28 (1961), 98–104 [510] Radner, R., Optimal stationary consumption with stochastic production and resources, J. Economic Theory, 6 (1973), 68–90 [511] Radner, R.–Rosenthal, R. W., Private information and pure–strategy equilibria, Math. Oper. Res., 7 (1982), 401–409 [512] Read, T., Balanced growth without constant returns to scale, J. Math. Economics, 15 (1986), 171–178 [513] Reed, M.–Simon, B., Functional Analysis, Academic Press, New York (1972) [514] Ricceri, B., Une propriete toplogique de l’ensemble des points fixes d’une contraction multivoque a valeurs convexes, Atti Accad. Naz. Lincei., 81 (1987), 283–286 [515] Ritt, R.–Sennott, L., Optimal stationary policies in general state space, Markov decision chains with finite action sets, Math. Oper. Res., 17 (1992), 901–909 [516] Roberts, A.–Varberg, D., Convex Functions, Pure and Applied Mathematics, Vol. 57, Academic Press, New York (1973)
776
References
[517] Rockafellar, R. T., An extension of Fenchel’s duality theorem for convex functions, Duke Math. J., 33 (1966), 81–90 [518] Rockafellar, R. T., Duality and stability in extremum problems involving convex functions, Pacific J. Math., 21 (1967), 167–187 [519] Rockafellar, R. T., Integrals which are convex functionals, Pacific J. Math., 24 (1968), 525–539 [520] Rockafellar, R. T., Local boundedness of nonlinear monotone operators, Michigan Math. J., 16 (1969), 397–407 [521] Rockafellar, R. T., Measurable dependence of convex sets and functions on parameters, J. Math. Anal. Appl., 28 (1969), 4–25 [522] Rockafellar, R. T., Convex Analysis, Princeton Math. Series, Vol. 28, Princeton Univ. Press, Princeton, NJ (1970) [523] Rockafellar, R. T., On the maximal monotonicity of subdifferential mappings, Pacific J. Math, 24 (1970), 525–539 [524] Rockafellar, R. T., On the maximal monotonicity of sums of nonlinear monotone operators, Trans. AMS, 149 (1970), 75–88 [525] Rockafellar, R. T., Convex integral functionals and duality in Contributions to Nonlinear Analysis, Ed. E. Zarantonello, Academic Press, New York (1971) [526] Rockafellar, R. T., Integrals which are convex functionals II, Pacific J. Math., 39 (1971), 439–469 [527] Rockafellar, R. T., Weak compactness of level sets of integral functionals in Trois`eme Colloque sur l’Analyse Fonctionelle, Ed. H. Garnir, Vander, Louvain (1971) [528] Rockafellar, R. T., Conjugate Duality and Optimization, CBMS, Vol. 16, Regional Conference Series in Math., SIAM, Philadelphia (1974) [529] Rockafellar, R. T., Integral functionals, normal integrands and measurable selections in Nonlinear Operators and the Calculus of Variations, ed. I. Waelhroek, Lecture Notes in Math., Vol. 543, Springer-Verlag, New York (1976) [530] Rockafellar, R. T.–Wets, R., Variational Analysis, Die Grundlehren der Mathematischen Wissenschaften, Vol. 317, Springer-Verlag, Berlin (1998) [531] R¨ othe, E., Morse Theory in Hilbert space, Rocky Mountain J. Math., 3 (1975), 251–274 [532] Roubicek, T., Relaxation in Optimization Theory and Variational Calculus, W. De Gruyter, Berlin (1997) [533] Rzezuchowski, T., Strong convergence of selectors implied by weak, Bull. Austr. Math. Soc., 39 (1989), 201–214 [534] Sadovskii, B. N., Limit compact and condensing operators, Russian Math. Surveys, 27 (1972), 85–155 [535] Saint-Beuve, M.–F., On the extension of von Neumann–Aumann’s theorem, J. Funct. Anal., 17 (1974), 112–129 [536] Saint-Pierre, J., Une remarque sur les espaces sousliniens reguliers, CRAS Paris, t. 282 (1976), 1425–1427 [537] Scarf, H., The core os an N -person game, Econometrica, 35, (1967), 50–69 ¨ [538] Schaefer, H., Uber die Methode der a priori Schranken, Math. Ann., 129 (1955), 415–416 [539] Schaefer, H., Banach Lattices and Positive Operators, Springer-Verlag, New York (1974)
References
777
[540] Schauder, J., Der Fixpunftsatz in Funktionalr¨ aumen, Studia Math, 2 (1930), 171–180 [541] Schmeidler, D., On balanced games with infintely many players, Research Program in Game Theory and Mathematical Economics, Research Memorandum No. 28, Dept. of Mathematics, The Hebrew Univ. of Jerusalem (1967) [542] Schmeidler, D., Cores of exact games I, J. Math. Anal. Appl., 40, 214–225 [543] Schmeidler, D., Equilibrium points of non-atomic games, J. Stat. Phys., 7, 295–300 [544] Schmidt, K. D., Embedding theorems for classes of convex sets, Acta Appl. Math., 5 (1986), 209–237 [545] Schwartz, J., Nonlinear Functional Analysis, Gordon & Breach, New York (1969) [546] Segal, I., Nonlinear Semigroups, Ann. Math., 78 (1963), 339–364 [547] Seierstadt, A.–Sydsaeter, K., Optimal Control Theory with Economic Applications, North Holland, Amsterdam (1987) [548] Serrin, J., A Harnack inequality for nonlinear equations, Bull. AMS, 55 (1963), 481–486 [549] Shafer, W.–Sonnenschein, H., Equilibrium in abstract economies without ordered preferences, J. Math. Economics, 2 (1975), 345–348 [550] Shapley, L., A value for n-person games in Contributions to the Theory of Games, Vol. II, Eds. H. W. Kuhn–A. W. Tucker, Princeton Univ. Press, Princeton (1953), 307–317 [551] Shapley, L., On balanced sets and cores, Naval Res. Logist. Quart., 14 (1967), 453–460 [552] Shapley, L., On balanced games without side payments in Mathematical Programming, Eds. T. C. Hu–S. Robinson, Academic Press, New York (1973), 261–290 [553] Showalter, R., Hilbert Space Methods for Partial Differential Equations, Pitman, London (1977) [554] Showalter, R., Monotone Operators in Banach Space and Nonlinear Partial Differential Equations, Math. Surveys and Monographs, Vol. 49, AMS, Providence, RI (1997) [555] Simon, J., Compact sets in the space Lp (O, T ; B), Ann. Mat. Pura ed Appl., 146 (1987), 65–96 [556] Simons, S., An upward–downward minimax theorem, Arch. der Math., 55 (1990), 275–279 [557] Sion, M., On general min–max theorems, Pacific J. Math., 8 (1958), 171– 176 [558] Smart, D. R., Fixed Point Theorems (2nd Edition), Cambridge Univ,. Press, Cambridge, UK (1980) [559] Stinchcombe, M., Bayesian information topologies, J. Math. Economics, 19 (1990), 233–253 [560] Stinchcombe, M., A further note on Bayesian information topologies, J. Math. Economics, 22 (1993), 189–193 [561] Strauch, R., Negative dynamic programming, Ann. Math. Stat., 37 (1966), 871–890 [562] Striebel, C., Optimal Control of Discrete Time Stochastic Systems, Lecture Notes in Econ. Math. Systems, Vol. 110, Springer-Verlag, Berlin (1975)
778
References
[563] Struwe, M., A note on a result of Ambrosetti–Mancini, Ann. Mat. Pura ed Appl., 81 (1982), 107–115 [564] Struwe, M., Variational Methods, Springer-Verlag, Berlin (1990) [565] Stuart, C. A.–Zhou, H. S., Applying the mountain pass theorem to an asymptotically linear elliptic equation in RN , Comm. PDEs, 24 (1999), 1731–1758 [566] Sun, J.–Sun, Y., Some fixed point theorems for increasing operators, Appl. Anal., 23 (1986), 23–27 [567] Sun, Y., A fixed point theorem for mixed monotone operators with applications, J. Math. Anal. Appl., 156 (1991), 240–252 [568] Szulkin, A., Minimax principles for lower semicontinuous functions and applications to nonlinear boundary value problems, Ann. Inst. H. Poincar´e, Analyse Non Lin´eaire, 3 (1986), 77–109 [569] Szulkin, A., Ljusternik–Schnirelmann theory on C 1 -manifolds, Ann. Inst. H. Poincar´e, Analyse Non Lin´eaire, 5 (1988), 119–139 [570] Tacsar, M. L., Optimal planning over infinite time interval under random factors in Mathematical Models in Economics, Eds. J. Los–M. Los, North Holland, Amsterdam (1974), 289–298 [571] Takahashi, W., Existence theorems generalizing fixed point theorems of multivlaued mappings in Fixed Point Theory and Applications, Eds. J. Baillon–M. Thera, Pitman Res. Notes in Math., Vol. 252, Longman Scientific and Technical, Harlow (1991) [572] Takayama, A., Mathematical Econonics (2nd edition), Cambridge Univ. Press, Cambridge, UK (1985) [573] Talagrand, M., Pettis Integral and Measure Theory, Memoirs AMS, Vol. 51 (1984) [574] Tanabe, H., Equations of Evolution, Pitman, London (1979) [575] Tanaka, K.–Yokoyama, K., On ε-equilibrium point in a noncooperative nperson game, J. Math. Anal. Appl., 160 (1991), 413–423 [576] Tang, C.–L., Periodic solutions of nonautonomous second order systems with γ-quasisubadditive potential, J. Math. Anal. Appl., 189 (1995), 671– 675 [577] Tang, C.–L., Periodic solutions for nonautonomous second order systems with sublinear nonlinearity, Proc. Amer. Math. Soc., 126, No. 11 (1998), 3263–3270 [578] Tang, C.–L.–Wu, X.–P. Periodic solutions for a class of nonautonomous subquadratic second order Hamiltonian systems, J. Math. Anal. Appl., 275 (2002), 870–882 [579] Tehrani, H.–T., A note on asymptotically linear elliptic problems in RN , J. Math. Anal. Appl., 271 (2002), 546–554 [580] Terkelsen, F., Some minimax theorems, Math. Scad., 31 (1972), 405–413 [581] Tiba, D., Optimal Control of Nonsmooth Distributed Parameter Systems, Lecture Notes in Math., Vol. 1459, Springer-Verlag, New York (1997) [582] Tichomirov, V., Fundamental Principles of the Theory of Extremal Problems, Wiley, New York (1986) [583] Tolksdorf, P., Regularity for a more general class of quasilinear elliptic equations, J. Diff. Eqns., 51 (1984), 126–150 [584] Tolstonogov, A., Extremal selections of multivalued mappings and the bang–bang principle for evolution inclusions, Soviet Math. Doklady, 43 (1991), 481–485
References
779
[585] Tonelli, L., Fondamenti di Calcolo delle Variazioni, Zanichelli, Bologna (1921) [586] Troutman, J., Variational Calculus with Elementary Convexity, SpringerVerlag, New York (1983) [587] Troyanski, S. L., On equivalent norms and minimal systems in nonseparable Banach spaces, Studia Math., 43 (1972), 125–138 [588] Tsukada, M., Convergence of closed convex sets and σ-fields, Z. Wahr. verw. Gab., 62 (1983), 137–146 [589] Tsukui, J., Turnpike theorem in a generalized dynamic input–output system, Econometrica, 34 (1966), 396–407 [590] Vainberg, M., Variational and Method of Monotone Operators in the Theory of Nonlinear Equations, Halsted Press, New York (1973) [591] Valadier, M., Multiapplications measurables a valeurs convexes compactes, J. Math. Pures Appl., 50 (1971), 265–297 [592] Van Zandt, T., Information, measurability and continuous behavior, J. Math. Economics, 38 (2002), 293–309 [593] Vazquez, J. L., A strong maximum principle for some quasilinear elliptic equations, Appl. Math. Optim. 12 (1984), 191–208 [594] Vidossich, G., On the topological properties of the set of fixed points of nonlinear operators, Confer. Sem. Mat. Univ. Bari, 126 (1971), 3–63 [595] Vind, K., Edgeworth allocations in an exchange economy with many traders, Inter. Economic Review, 5 (1964), 165–177 [596] Visintin, A., Strong convergence results related to strict convexity, Comm. PDEs, 9 (1987), 439–466 [597] Vrabie, I., Compactness Methods for Nonlinear Evolutions, Longman Scientific and Technical, Essex, UK (1987) [598] Walras, L., El´ements d’Economie Pure, Corbaz, Lausanne (1874) [599] Wang, R., Essential (convex) closure of a family of random sets and its applications, J. Math. Anal. Appl., 262 (2001), 667–687 [600] Wang, Z.–Xue, X., On the convergence and closedness of multivalued martingales, Trans. AMS, 341 (1994), 807–827 [601] Warga, J., Optimal Control of Differential and Functional Equations, Academic Press, New York (1972) [602] Webster, R., Convexity, Oxford Science, Oxford University Press, New York (1994) [603] Weinstein, A., Periodic orbits for convex Hamiltonian systems, Annals Math., 108 (1978), 507–518 [604] Whitt, W., Representation and approximation of noncooperative sequential games, SIAM J. Control Optim., 28 (1990), 1148–1161 [605] Wijsman, R., Convergence of sequences of convex sets, cones and functions II, Trans. AMS, 123 (1966), 32–45 [606] Willem, M., Minimax Theorems, Birkh¨ auser, Boston (1996) [607] Winter, S. G., The norm of a closed technology and the straight–down–the turnpike theorem, Review Economic Studies, 34 (1967), 67–84 [608] Wloka, J., Partial Differential Equations, Cambridge Univ. Press, Cambridge, UK (1987) [609] Yamada, Y., On evolution equations generated by subdifferential operators, J. Fac. Sci. Univ. Tokyo, 23 (1976), 491–515 [610] Yang, C.–T., On theorems of Borsuk–Ulam, Kakutani–Yamabe–Yujobˆ o and Dyson II, Anal. Math., 62 (1955), 271–283
780
References
[611] Yankov, V., On the uniformization of A-sets, Doklady Akad. Nauk. USSR, 30 (1941), 591–592 [612] Yano, M., A note on the existenxe of an optimal capital accumulation in the continuous–time horizon, J. Economic Theory, 27 (1982), 421–429 [613] Yosida, K., On the differentiability and representation of one–parameter semigroups of linear operators, J. Math. Soc. Japan, 1 (1948), 15–21 [614] Yosida, K., Functional Analysis (Fifth Edition), Springer-Verlag (1978) [615] Yost, D., There can be no Lipschitz version of Michael’s selection theorem in Proceedings of the Analysis Conference (Singapore, 1986), Eds. S. Choy– J. Jesudason–P. Lee, North–Holland, Amsterdam (1988), 295–299 [616] Yotsutani, S., Evolution equations associated with subdifferentials, J. Math. Soc. Japan, 31 (1978), 623–646 [617] Young, L. C., Lectures on the Calculus of Variations and Optimal Control Theory, Saunders, Philadelphia (1969) [618] Zame, W., Competitive equilibria in production economies with an infinite dimensional commodity space, Econometrica, 55 (1987), 1075–110 [619] Zeidler, E., Nonlinear Functional Analysis and Its Applications I: Fixed Point Theorems, Springer-Verlag, New York (1985) [620] Zeidler, E., Nonlinear Functional Analysis and Its Applications III: Variational Methods and Optimization, Springer-Verlag, New York (1985) [621] Zeidler, E., Nonlinear Functional Analysis and Its Applications II/A: Linear Monotone Operators, Springer-Verlag, New York (1990) [622] Zeidler, E., Nonlinear Functional Analysis and Its Applications II/B: Nonlinear Monotone Operators, Springer-Verlag, New York (1990) [623] Zhang, M., Nonuniform nonresonance of semilinear differential equations, J. Diff. Eqns., 166 (2000), 35–50 [624] Zhang, M., The rotation number approach to eigenvalues of the one– dimensional p-Laplacian with periodic potentials, J. London Math. Soc. (2), 64 (2001), 125–143 [625] Zheng, S., Nonlinear Evolution Equations, Chapman & Hall/CRC Press, Boca Raton, FL (2004) [626] Zhong, C.–K., On Ekeland’s variational principle and a minimax theorem, Nonlin. Anal., 44 (2001), 909–918 [627] Zhou, H.–S., Existence of asymptotically linear Dirichlet problem, J. Math. Anal. Appl., 205 (1997), 239–250 [628] Zilha, I., Characterization by prices of optimal programs under uncertainty, J. Math. Economics, 3 (1976), 173–183 [629] Zilha, I., On competitive prices in a multisector economy with stochastic production and resources, Review of Economic Studies, 43 (1976), 431–438 [630] Zilha, I., Efficiency in economic growth models under uncertainty, J. Economic Dynam. Control, 16 (1992), 27–38
List of Symbols
Symbol                          Meaning
L(X, Y)                         space of linear operators from X to Y
Γ_0(X)                          proper, convex, lower semicontinuous functions
ϕ*                              conjugate function
∂ϕ                              convex subdifferential
ϕ^0                             generalized directional derivative
T_C                             tangent cone of C
N_C                             normal cone of C
T_C                             Clarke tangent cone of C
N_C                             Clarke normal cone of C
Γ-lim                           Γ-limit
B(X)                            Borel σ-field of X
Γ_seq(X_1^{β_1}, X_2^{β_2})     multiple Γ-operator
K(C, Y)                         compact maps from C into Y
K_f(C, Y)                       linear space of finite rank mappings from C to Y
σ(A)                            spectrum of A
L_c(X, Y)                       compact linear operators from X to Y
L_f(X, Y)                       finite rank operators from X to Y
Φ(X, Y)                         Fredholm operators from X to Y
Φ_+(X, Y)                       semi-Fredholm operators from X to Y
d                               Leray–Schauder degree map
α                               Kuratowski measure of noncompactness
β                               ball measure of noncompactness
d_0                             Nussbaum–Sadovskii degree
d_(S)+ = deg_(S)+               degree for (S)_+ maps
cat_Y A                         Ljusternik–Schnirelmann category of A in Y
γ                               genus
C_0^1(Z)                        space of C^1-functions on Z that are zero on ∂Z
W_per^{1,2}((0, b), R^N)        {x ∈ W^{1,2}((0, b), R^N) : x(0) = x(b)}
P_f(X), P_k(X)                  spaces of sets
P_fc(X), P_wkc(X)               spaces of sets
P_kc(X), P_bfc(X)               spaces of sets
m(X*, X)                        Mackey topology on X* for (X, X*)
F^+(C), F^-(C)                  inverse images of multifunctions
S_F^p                           L^p-selectors of F
J_λ                             resolvent operator
A_λ                             Yosida approximation of A
Index
L-pseudomonotone operator, 737 ε-minimizer, 56 m-accretive operator, 184 absolute neighborhood retract, 291 absolute retract, 291, 512 absolutely continuous function, 569, 695 abstract economy, 615 accretive operator, 183 adjoint state, 144 admissible map, 98 admissible pair, 112 admissible policy, 638 allocation core, 531 feasible, 532 initial endowment, 531 Altman’s condition, 242 Ambrosetti–Rabinowitz condition, 422 antimaximum principle, 453 asymptotic center, 236 cone, 43 derivative, 244 asymptotically linear problems, 353 balanced game, 622 non-side payment game, 622 balanced coalition, 620 balanced side-payment game, 620 Banach contraction principle, 225 Banach fixed point theorem, 225 barycenter, 134
Bayesian game, 628 Bayesian Nash equilibrium, 629 Bayesian view, 662 Bellman’s optimality principle, 634 Berge maximum theorem, 462, 670 bilinear map, 105 bimatrix game, 613 Blaschke’s theorem, 527 Bochner integrable function, 693 Bochner integral, 693 Borsuk’s fixed point theorem, 240 Borsuk’s theorem, 205 bounded operator, 163 Boylan metric, 659 Brouwer’s fixed point theorem, 238 budget multifunction, 600 budget set, 532 cancellation law lemma, 468 Carath´eodory function, 470 Caristi’s fixed point theorem, 93 center asymptotic, 236 Cerami condition, 270 Chain Markov Decision, 684 chain rule, 5 Serrin–Vall´ee Poussin, 719 Chebyshev center, 234 radius, 234 Choquet function, 487 Clarke normal cone, 41 Clarke tangent cone, 41 coalition balanced, 620
coercive function, 53 coercive map, 166 compact operator, 148 competitive program, 545 completely continuous, 149 comprehensiveness condition, 623 condition Altman’s, 242 Ambrosetti–Rabinowitz, 422, 441 Cerami, 270 comprehensiveness, 623 growth Bernstein–Nagumo–Wintner, 391 maximum, 144 nonuniform nonresonance, 412 Palais–Smale(P S), 92, 270 Palais–Smale(P S)-generalized, 287 R¨ othe’s, 242 reachability, 548 transversality, 144, 546 conditional expected utility, 629 cone τ -closed, 44 τ -recession, 44 asymptotic, 43 Clarke tangent, 41 contingent, 61 convex, 44 dual, 251 fully regular, 245 minihedral, 245 normal, 36, 245 order, 245 positive, 601 recession, 43 regular, 245 solid, 245 strongly minihedral, 245 tangent, 35 total, 245 conjugate function, 18 conservative multistrategy, 611 conservative value, 611 consistent system, 619 consumption sequence, 544 contingent cone, 61 continuity of information, 690 continuity of payoff, 690 continuously Fr´echet differentiable, 6
continuum of agents, 606 contractible set, 290 contraction semigroup, 188 convergence Γτ -, 45 epigraphical, 48 Hausdorff, 514 Mosco, 59, 514 scalar, 515 weak, 515 Wijsman, 515 convex cone, 44 function, 12 strictly, 12 core of a non-side-payment game, 621 of a side-payment game, 618 core allocation, 531 cost function, 685 Courant minimax principle, 289 Courant’s minimax principles, 303 Courant–Fischer–Weyl minimax principle, 289 Cournot–Nash equilibrium, 624, 627 crease, 196 criterion catching, 553 overtaking, 554 critical point, 73, 196, 269 free, 78 nondegenerate, 285 critical value, 269 cyclically monotone map, 175 decomposable set, 487 deformation, 269 invariant, 269 method, 269 theorem, 270, 274 degree, 196 Leray–Schauder, 210, 216 map, 219 Nussbaum–Sadovskii, 217 demand set, 532 demicontinuous map, 165 derivative asymptotic, 244 directional, 16
Index Fr´echet, 3 Gˆ ateaux, 2 generalized directional, 27 partial, 6 diametrical point, 233 diffeomorphism, 11 differentiable twice, 105 Dinculeanu–Foias theorem, 695 direct method, 116 directional derivative, 16 dissipative operator, 184 distance Hausdorff, 465 distance function, 41 distributional strategy σ, 681 distributional strategy τ , 681 domain effective, 12 star-shaped, 429 double sequence lemma, 53 drop set, 95 drop theorem, 94 Du Bois–Reymond’s lemma, 102 dual cone, 251 duality map, 26, 166 with gauge, 166 Dubovickii–Milyutin theorem, 79 Dunford’s second integral, 750 dynamic programming functional equation, 572 dynamic programming operator, 636 economy abstract, 615 effective domain, 12 Ehrling’s inequality, 698 eigenelement, 297 eigenfunction, 297 eigenspace, 153 eigenvalue, 153, 297 eigenvector, 153 Ekeland variational principle, 89 element independent, 661 positive, 251 embedding method, 503 envelope
τ -lower semicontinuous, 67 epigraph, 12, 44 equation dynamic programming, 572 Euler, 104 equi-τ -lower semicontinuity, 49 equicoercivity, 53 equilibrium ε, 644 Bayesian Nash, 629 competitive, 532 Cournot–Nash, 624, 627 of a game, 681 Walras, 531 equivalence of probability measures, 657 evolution triple, 696 ex-ante dominated choice function, 600 ex-ante view, 668 ex-post view, 662 expected payoff, 681 extremals, 104 feasible pair, 112 feasible program, 553, 581 Filippov’s implicit function theorem, 482 Finsler manifold, 297 Finsler metric, 297 first partial function, 6 first-order stochastic dominance, 603 fixed point index, 225, 256 fixed point property, 238 formula H¨ ormander’s, 465 Fr´echet derivative, 3 differentiable, 3 Fredholm operator, 160 fully regular cone, 245 function τ -coercive, 66 τ -lower semicontinuous, 64 τ -lower semicontinuous at x, 64 τ -recession, 44 τ -upper semicontinuous, 64 τ -upper semicontinuous at x, 64 conjugate, 18 quasiconcave, 82
Index
absolutely continuous, 569 asbolutely continuous, 695 asymptotic, 44 auxiliary, 343 Bochner integrable, 693 Carath´eodory, 470 choice, 599 Choquet, 487 closed, 12 coercive, 53 complementing, 343 convex, 12 convex of compact type, 195 convex–concave, 84 cost, 685 costate, 144 EU-rational choice, 600 ex-ante dominated choice, 600 first partial, 6 gauge, 166 graph-measurable, 471 Hamiltonian, 110 indicator, 12 invariant, 286 Ize’s, 343 Lagrangian, 76, 86 locally Lipschitz, 27 lower semicontinuous, 12 penalty, 392 proper, 12 quasiconvex, 82 recession, 44 regular, 31 saddle, 84 second partial, 6 sequentially coercive, 53 simple, 692 strongly measurable, 692 superpositionally measurable, 471 support, 21, 460 value, 545 value of stochastic game, 639 weakly measurable, 692 functional singular, 569 Gˆ ateaux derivative, 2 differentiable, 2
Galerkin method, 751 game a side-payment, 618 balanced, 622 balanced non-side payment, 622 bimatrix, 613 generalized, 615 in normal form, 610 non-side payment, 621 side-payment balanced, 620 zero-sum, 613 gauge function, 166 Gelfand triple, 750 Gelfand integral, 600 Gelfand triple, 696 generalized game, 615 generalized mountain pass theorem, 279 generalized pseudomonotone operator, 179 generalized subdifferential, 29 generating cone, 245 genus, 293 Krasnoselkii, 293 global extremum, 101 maximum, 101 minimum, 101 glossary, 781 gradient, 270 gradient-like flow, 270 graph, 162 norm, 707 Green’s second identity, 330 H¨ ormander’s formula, 465 Hamiltonian function, 110 system, 110 Hausdorff convergence, 514 distance, 465 measure of noncompactness, 214 metric, 465 hemicontinuous map, 165 Hilbert cube, 475 Hille–Yosida theorem, 191 homotopy pseudomonotone, 223
Index homotopy of class (S)+ , 220 Hopf’s lemma, 304 identity Picone’s, 328 Pohozaev’s, 421, 429 second Green’s, 330 independent element, 661 subset, 661 indicator function, 12 inequality Diaz–Saa, 328 Ehrling’s, 698 Harnack’s, 328 Ky Fan’s, 612 Young–Fenchel, 18 infimal convolution, 20 infinitesimal generator, 188 initial endowment allocation, 531 input sufficient, 546 int dom ϕ, 14 integral Bochner, 693 Gelfand, 600 Pettis, 750 second Dunford’s, 750 integration by parts formula, 698 invariance of domain theorem, 212 inward set, 509 isomorphism, 11 Kakutani–Ky Fan theorem, 114 Knaster–Kuratowski–Mazurkiewicz map, 80 Krasnoselskii genus, 293 Krein’s theorem, 252 Kuratowski limit, 514 interior, 514 superior, 514 Kuratowski measure of noncompactness, 214 Kuratowski–Ryll Nardzewski selection theorem, 480 Ky Fan’s inequality, 612 Lagrange lemma, 102
Lagrange multiplier, 76 Lagrange multiplier rule, 35 Lagrangian, 98 function, 76, 86 hyperregular, 109 regular, 109 Lebesgue–Bochner space, 694 Legendre transform, 109 Legendre–Fenchel transform, 18 lemma cancellation law, 468 Du Bois–Reymond’s, 102 Hopf’s, 304 Lagrange, 102 Lions, 750 Minkowski–Farkas, 619 Morse, 285 Leontief dynamic model, 550 Leray–Schauder alternative principle, 242 Leray–Schauder degree, 210 limit lower Γ, 45 lower Kτ , 46 upper Γ, 45 upper Kτ -, 47 linear order, 669 linking set, 276 List of Symbols, 781 Ljusternik’s theorem, 74 Ljusternik–Schnirelmann category, 207, 290 Ljusternik–Schnirelmann–Borsuk theorem, 207 local extremum, 34 linking, 280 maximum, 101 minimum, 101 locally bounded operator, 163 locally Lipschitz function, 27 lower Γ-limit, 45 limit-Kτ , 46 lower semicontinuous function, 12 lower semicontinuous regularization, 48 Lumer–Phillips theorem, 706 Lyapunov’s convexity theorem, 491
Index
Mackey topology, 484, 573 manifold Finsler, 297 spherelike, 338 map γ-Lipschitz, 215 γ-condensing, 215 γ-contraction, 215 k-contraction, 225 admissible, 98 asymptotically linear, 244 bilinear, 105 coercive, 166 contractive, 225 cyclically monotone, 175 degree, 219 demicontinuous, 165 duality, 26, 166 duality with gauge, 166 hemicontinuous, 165 inward, 236 Knaster–Kuratowski–Mazurkiewicz, 80 maximal cyclically monotone, 175 minimal section, 187 nonexpansive, 225 normalized duality, 166 policy, 685 proper, 151 quasibounded, 183, 244 regular, 183 smooth, 183 strategy, 685 traverse, 264 truncation, 392 type (S)+ , 183 weakly coercive, 166 mapping finite rank, 150 marginal probability, 681 Markov Decision Chain, 684 maximal accretive operator, 184 maximal cyclically monotone map, 175 maximal monotone operator, 163 maximin multistrategy, 611 maximum condition, 144 principle, 433 measurability view, 668
measurable strongly function, 692 weakly function, 692 measure probability initial, 684 test, 61 Young, 118 method auxiliary variable, 127 deformation, 269 direct, 116, 542 embedding, 503 Galerkin, 751 monotonicity, 751 of Lagrange multipliers, 75 reduction, 116 relaxation, 118 upper-lower solutions, 376, 380 metric Boylan, 659 Finsler, 297 Hausdorff, 465 projection, 39 Michael’s selection theorem, 476 minihedral cone, 245 minimal section map, 187 minimax equality, 79 minimizer-ε, 56 minimum shadow, 644 Minkowski–Farkas lemma, 619 mixed strategies, 614 model Leontief dynamic, 550 mollifier, 197 monotone operator, 163 Moreau–Yosida approximation, 178 Morse index, 285 Morse lemma, 285 Mosco convergence, 514 Mosco sense convergence, 514 mountain pass theorem, 278 multifunction h-continuous, 466 h-contraction, 504 h-lower semicontinuous, 466 h-upper semicontinuous, 466 p-integrably bounded, 495 almost lower semicontinuous, 469 budget, 600
Index choice, 599 closed, 459 constraint, 671 continuous, 457 good reply, 615 graph, 458 graph measurable, 470 integrably bounded, 495 Knaster–Kuratowski– Mazurkiewicz, 506 locally selectionable, 475 lower semicontinuous, 457 measurable, 470 nonexpansive, 506 scalarly measurable, 470 sequentially closed, 459 state independent constraint, 671 strongly inward, 509 upper semicontinuous, 457 Vietoris continuous, 457 weakly inward, 509 weakly lower semicontinuous, 469 multiplication formula, 208 multistrategy, 611 ε-equilibrium, 644 multivalued mapping domain, 162 inverse, 162 Nash equilibrium, 612 neighborhood special, 343 non-side-payment game, 621 nondivergence form, 304 norm graph, 707 weak, 498 normal cone, 36, 245 normal integrand, 119 normal structure, 233 number crossing, 345 Nussbaum–Sadovskii degree, 217 operator L-pseudomonotone, 737 m-accretive, 184 accretive, 183 bounded, 163 compact, 148
diagonalizable, 157 dissipative, 184 dynamic programming, 636 evolution, 751 Fredholm, 160 generalized pseudomonotone, 179 index, 160 inf, 126 locally bounded, 163 maximal accretive, 184 maximal monotone, 163 monotone, 163 pseudomonotone, 179 resolvent, 176, 705 semi-Fredholm, 160 strictly monotone, 163 sup, 126 Yosida approximation, 706 optimal admissible pair, 112 control, 112 trajectory, 112 optimal policy, 634 order cone, 245 ordering, 669 strict, 554 pair admissible, 112 feasible, 112 optimal admissible, 112 Palais–Smale condition, 92, 270 Palais–Smale generalized condition, 287 partial order relation, 669 path, 544 path of the economy, 559 payoff expected, 681 penalization technique, 392 perfect competition, 536 Perron–Frobenius theorem, 239 perturbation spike, 136 Pettis integral, 750 plan, 668 Poincar´e half-plane, 111 point bifurcation, 341 critical, 73, 196, 269, 287
Index
diametral, 233 invariant, 286 regular, 73 saddle, 78, 79 singular, 343 spectrum, 153 policy admissible, 638 map, 685 randomized, 638 shifted, 687 stationary, 685 Polish space, 471 Pontryagin maximum principle, 136 portmanteau theorem, 640 positive element, 251 preorder continuous, 669 lower semicontinuous, 668 upper semicontinuous, 669 preorder relation, 669 principle antimaximum, 453 Banach contraction, 225 Bellman’s optimality, 634 Courant’s minimax, 303 Ekeland, 89 Leray–Schauder, 242 maximum, 433 Pontryagin maximum, 136 strong maximum, 303, 305, 433 symmetric criticality, 286 Takahashi, 94 uniform antimaximum, 453 weak comparison, 433 prisoner’s dilemma problem, 614 probability initial measure, 684 marginal, 681 transition, 118 problem asymptotically linear, 405 prisoner’s dilemma, 614 value of information, 667 program, 544 capital accumulation, 589 competitive, 545 eligible, 556 feasible, 553, 589
feasible good, 556, 584 finite good, 561 finite optimal, 560 good feasible, 556, 584 infinite optimal, 560 optimal, 544, 553 stationary, 560 strongly maximal, 553 turnpike, 560 weakly maximal, 554, 581 weakly maximal stationary, 555 program of the economy, 559 proper function, 12 property U , 498 convex compact, 603 finite-dimensional, 506 fixed point(ffp), 238 Kadec–Klee, 415 Radon–Nikodym, 694 unique continuation, 333 pseudogradient vector field, 272 pseudometric, 561 pseudomonotone homotopy, 223 pseudomonotone operator, 179 quasibounded map, 244 quasiconcave function, 82 quasiconvex function, 82 quasinorm, 244 quasiorder relation, 669 R¨ othe’s condition, 242 R˚ adstr¨ om embedding theorem, 503 Rademacher’s theorem, 30 radius, 234 Chebyshev, 234 Radon–Nikodym property, 491 randomized policy, 638 Rayleigh quotient, 302 reachability condition, 548 recession function, 44 reduction method, 116 reduction property, 212 regular cone, 245 regular function, 31 regular point, 73 regular value, 196, 269
Index regularization τ -lower semicontinuous , 67 relation indifference, 536 partial order, 669 preference, 536 preference-indifference, 536 preorder, 669 quasiorder, 669 relaxability, 118 relaxation admissible, 118 relaxation method, 118 relaxed control, 118 representation isometric, 286 topological group, 286 resolvent operator, 705 resolvent set, 153 retract, 511 absolute, 512 retraction, 238 rule chain, 5 saddle point, 78 on product space, 79 saddle point theorem, 279 saddle value, 79 Sadovskii fixed point theorem, 242 Sard’s theorem, 197 saturation by a commodity vector, 536 Schauder’s fixed point theorem, 241 Schauder’s theorem, 159 Scorza–Dragoni theorem, 471 second partial function, 6 semigroup C0 , 188 compact, 193 contraction, 188 equicontinuous, 193 of nonexpansive maps, 192 separable probability space, 659 sequence consumption, 544 sequentially τ -lower semicontinuous, 66 τ -lower semicontinuous at x ∈ X, 66 Serrin–Vall´ee Poussin chain rule, 719
set τ -closed, 65 analytic, 473 budget, 532 contractible, 290 decomposable, 487 demand, 532 drop, 95 finitely closed, 506 Haar-null, 60 invariant, 286 inward, 236, 509 linking, 276 resolvent, 153 set-valued integral, 499 sgn, 345 shadow minimum, 644 shifted policy, 687 simple function, 692 solution extremal, 401 lower, 381, 433 mild, 702 strong, 721 upper, 381, 432 vector, 619 weak, 432, 721 Souslin space, 473 space Lebesgue–Bochner, 694 Polish, 471 separable probability, 659 Souslin, 473 tangent, 73 spectral resolution theorem, 157 spectrum, 153 spherelike constraint, 339 spike perturbation, 136 star-shaped domain, 429 state independent, 671 stationary history, 633 policy, 638, 685 Steiner point map, 525 strategy distributional σ, 681 distributional τ , 681 map, 685 rule, 624
791
Index
strictly convex, 12 strictly differentiable, 29 strictly monotone operator, 163 strong maximum principle, 303, 433 strong operator topology, 653 strong solution, 720 strong turnpike theorem, 567 strongly minihedral cone, 245 sub-σ-field µ- lim inf Σn , 673 n→∞
µ- lim sup Σn , 673 n→∞
subdifferential, 22 -ε, 646 generalized, 29 subset independent, 661 sufficient input, 546 sufficient vector, 563 support function, 21, 460 system adjoint, 144 consistent, 619 Hamiltonian, 144 Takahashi variational principle, 94 tangent cone, 35 space, 73 vector, 73 theorem Amann’s three fixed points, 262 Banach fixed point, 8, 225 Berge maximum, 462, 670 Birkoff–Kellogg, 212 Blaschke’s, 527 Borsuk’s, 205 Borsuk’s fixed point, 240 Brouwer’s fixed point, 238 Caristi’s fixed point, 93 Courant’s nodal set, 308 deformation, 270, 274, 288 Dinculeanu–Foias, 695 drop, 94 Dubovickii–Milyutin, 79 embedding R˚ adstr¨ om, 503 Filippov’s implicit function, 482 fixed point Kakutani–Ky Fan, 510 Fredholm alternative, 159 generalized mountain pass, 279, 288
Hille–Yosida, 191 implicit function, 8 infinite-dimensional Kuhn–Tucker multiplier, 583 invariance domain, 212 invariance of domain, 207 inverse function, 8, 11 Kakutani–Ky Fan, 114 Krein’s, 252 Kuratowski–Ryll Nardzewski, 480 Ljusternik’s, 74 Ljusternik–Schnirelmann–Borsuk, 207 Lumer–Phillips, 706 Lyapunov convexity, 491 Michael’s selection, 476 mountain pass, 278, 288 Nikaido’s, 37 no retraction, 239 nonlinear alternative, 241 Perron–Frobenius, 239 Pettis measurability, 692 portmanteau, 640 projection Yankov–von Neumann– Aumann, 473 Rademacher’s, 30 saddle point, 279, 288 Sadovskii fixed point, 242 Sard’s, 197 Sch¨ afer’s fixed point, 242 Schauder’s, 159 Schauder’s fixed point, 241 Scorza–Dragoni, 471 selection Yankov–von Neumann– Aumann, 114, 482 spectral resolution, 157 strong turnpike, 567 symmetric mountain pass, 285 turnpike, 558 Tychonov’s fixed point, 244 Vitali’s, 655 weak turnpike, 564 Weierstrass, 66 Yosida–Hewitt, 569 topological complement, 73 topological indices, 290 topology α, 592 c, 591
Index Mackey, 484, 573 narrow, 118 of pointwise convergence, 653 strong operator, 653 uniform operator, 653 weak, 599 total cone, 245 transform Legendre, 109 Legendre–Fenchel, 18 transition probability, 118 transversality condition, 144, 546 truncation technique, 392 turnpike, 560 programs, 558 theorem, 558 turnpikes, 558 twice differentiable, 105 Tychonov’s fixed point theorem, 244 uniform operator topology, 653 upper limit Kτ -, 47 limit-Γ, 45 utility conditional expected, 629 value conservative, 611 critical, 196, 269 of information, 667 regular, 269 value function, 545 value function of stochastic game, 639
variational convergence, 61 variational principle Ekeland, 89 vector solution, 619 sufficient, 563 tangent, 73 view Bayesian, 662 ex-ante, 668 ex-post, 662 Walras equilibrium, 531, 532 weak comparison principle, 433 weak maximality criterion, 552 weak maximum principle, 304 weak norm, 498 weak solution, 721 weak turnpike theorem, 564 weakly coercive map, 166 weakly inward map, 236 weakly maximal program, 554 weakly maximal stationary program, 555 Weierstrass theorem, 66 Wijsman convergence, 515 Yankov–von Neumann–Aumann projection theorem, 473 selection theorem, 114, 482 Yosida approximation, 176, 706 Yosida–Hewitt theorem, 569 zero-sum game, 613