Real Variables Alberto Torchinsky University of Indiana, Bloomington
Addison-Wesley Publishing Company, Inc. The Advanced Book Program Redwood City, California · Menlo Park, California · Reading, Massachusetts · New York · Amsterdam · Don Mills, Ontario · Sydney · Bonn · Madrid · Singapore · Tokyo · Bogota · Santiago · San Juan · Wokingham, United Kingdom
Publisher: Allan M. Wylde Production Administrator: Karen L. Garrison Editorial Coordinator: Pearline Randall Electronic Production Consultant: Mona Zeftel Promotions Manager: Celina Gonzales
Library of Congress Cataloging-in-Publication Data Torchinsky, Alberto. Real variables / Alberto Torchinsky. p. cm. Includes index. ISBN 0-201-15675-x : $39.95 1. Functions of real variables. I. Title QA331.5.T58& 1987 515.8-dc19
87-18629 CIP
Copyright © 1988 by Addison-Wesley Publishing Company. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Published simultaneously in Canada. Printed in the United States of America. This book was typeset in MicroTeX using a Leading Edge Model D computer. Camera-ready output from an Apple Laserwriter Plus Printer. ABCDEFGHIJ-AL-8987 0-201-15675-x
To Massi, Kurosh, and Darius
Author's Foreword

During the academic year 1985-1986 I gave a course on Real Variables at Indiana University. The main source of reference for the course was a set of class notes prepared by the students as we went along; this book is based on those notes. One of the purposes in those lectures was to present to students who are beginning a deeper study of the fairly esoteric subject of Real Variables an overview of how the familiar results covered in Advanced Calculus develop into a rich theory. Motivation is an essential ingredient in this endeavour, as are convincing examples and interesting applications. Now, in teaching a course at this level two facts quickly become apparent, to wit: (i) The background of the students is quite varied, as first year graduate and upper division undergraduate Math students, as well as various science and economics majors, enroll in it, and, (ii) Even those students with a strong background are not entirely at ease with proofs involving either an abstract new concept or an $\varepsilon$-$\delta$ argument. My idea of a course at this level is one that presents to the students a modern introduction to the theory of real variables without subjecting them to undue stress.

Although the material is not presented here in a radically different way than in other textbooks, this book offers a conceptually different approach. First, it takes into account, both in placement and content of the topics discussed, the uneven nature of the background of the students. Second, an attempt has been made to motivate the material discussed, and always the most "natural" rather than the most elegant proof of a result is given. We also stress the unity of the subject matter rather than individual results. Third, we go from the particular to the general, discussing each definition and result rather carefully, closer to the way a mathematician first thinks about a new concept.
Finally, students are not "talked down," but rather feel that the issues at hand are addressed in a forthright manner and in a direct language, one they can understand. It is important that readers have no difficulty in following the actual arguments presented and spend their time instead in considering questions such as: What is the role or roles of a given result? What is it good for? What are the important ideas, and which are the secondary ones? What are the basic problems in this area and how are they approached and solved?
In fact, we expect the serious students at this level to learn to ask these questions and this text will serve as a guide to ask them at the appropriate time.

How does the text present the material? An important consideration is that the students see the "big picture" rather than isolated theorems, and basic ideas rather than generality are stressed. Each chapter starts with a short reader's guide stating the goals of the chapter. Specific examples are discussed, and general concepts are developed through particular cases. There are 599 problems and questions that are used to motivate the material as well as to round out the development of the subject matter. The reader will be pleasantly surprised to find out that problems are in fact problems, and not further theorems to be proved. Problems are thought-provoking, and there is a mixture of routine to difficult, and concrete to theoretical.

Because I wanted this book to be essentially self-contained for those students with a good Advanced Calculus background as well as an elementary knowledge of the theory of metric spaces, the point of departure is an informal discussion of the theory of sets and cardinal numbers in Chapter I, and ordinal numbers and Zorn's Lemma in Chapter II. These topics give the student the opportunity to work with abstract, possibly new, concepts. Chapter III introduces the Riemann-Stieltjes integral and the limitations of the Riemann integral become quickly apparent; $\varepsilon$-$\delta$ proofs are discussed here. At the completion of these chapters the background of the students has been essentially equalized. Chapter IV is the exception that proves the rule. It develops the abstract concept of measure, a particular case of which, the Lebesgue measure on $R^n$, is discussed in Chapter V. Anyone objecting to this treatment can plainly, and almost painlessly, read these chapters in the opposite order.
The construction of the Lebesgue measure is a favorite among the students, as it allows them to discover where measures come from and how they are constructed. In Chapter VI we return to a somewhat abstract setting, although for reasons of simplicity Lusin's theorem is presented in the line where all the difficulties are already apparent. An important feature of this chapter is working with "good" and "bad" sets; this is an indispensable tool in other areas, including the Calderón-Zygmund decomposition of integrable functions discussed in Chapter VIII. The proof of Egorov's theorem illustrates our point of view: It is longer than the usual proof, but it is clear and understandable. In Chapter VII we introduce the notion of the integral and the role of almost everywhere convergence. I am confident that the path that leads to the various convergence theorems is direct and motivational. The material described thus far constitutes a solid first semester of a yearlong course.
Chapter VIII presents new properties of integrable functions, including the Lebesgue Differentiation Theorem. The proof given here makes use of the Hardy-Littlewood maximal function, and is one that most experts agree should have worked its way into the standard treatment of this topic by now. Chapter IX constructs important new examples of measures on the line, the Borel measures. The correspondence between these measures and their distribution functions, a subject that lies at the heart of the theory of Probability, is established in an elementary and computational manner. Chapter X discusses properties of absolutely continuous functions, including the Lebesgue decomposition of functions of bounded variation and the characterization of those functions on the line that may be recovered by integrating their derivatives. The abstract setting of these results is presented in detail in Chapter XI, where the Radon-Nikodym theorem is discussed. The basic theory of the Lebesgue $L^p$ spaces, including duality and the notion of weak convergence, is covered in Chapter XII. Chapter XIII deals with product measures and Fubini's theorem in the following manner: In the first section we discuss the version dealing with Lebesgue integrals in Euclidean space; the second section discusses some important applications, including convolutions and approximate identities; and, finally, the third section presents Fubini's theorem in an abstract setting. This is a concrete example of how to proceed from the particular to the general. However, if preferred, the third and second sections can be covered, and the first section assigned for reading. Chapter XIV deals with normed linear spaces, an abstraction of the notion of the $L^p$ spaces, and the Hahn-Banach theorems. Students are happy to see both the geometric and analytic forms of this result and their applications.
Chapter XV covers the basic principles of Functional Analysis, to wit, the Uniform Boundedness Principle, the Closed Graph Theorem, and the Open Mapping Theorem; each principle is given individual attention. In Chapter XVI we consider those Banach spaces whose norm comes from an inner product, or Hilbert spaces. The discussion of the geometry of Hilbert spaces and the spectral decomposition of compact self-adjoint operators are some of the features of this chapter. Brief historical references concerning the origin of some of the concepts introduced in the text have been made throughout the text, and Chapter XVII presents these remarks in their natural setting, namely, the theory of Fourier series. Finally, Chapter XVIII contains suggestions and comments to some of the problems and questions posed in the book; they are not meant, however, to make the learning of the material effortless. The notations used throughout the book are either standard or else they are explained as they are introduced. "Theorem 3.2" means that the result alluded to appears as the second item in Section 3 of the present chapter, and "Theorem 3.2 in Chapter X" means that it appears as the
second item of the third section in Chapter X. The same convention is used for formulas and problems.

A word about where the text fits into the existing literature. It is more advanced than Rudin's book Principles of Mathematical Analysis, a good reference for the material on Advanced Calculus and metric spaces. It is also more abstract than the treatise Measure and Integral by Wheeden and Zygmund. I learned much of the material on integration from Antoni Zygmund, and some of the topics discussed, including the construction of the Lebesgue measure and the outlook on the Euclidean version of Fubini's theorem, have his imprint. Then, there are the classics. They include Natanson's Theory of Functions of a Real Variable, Saks' Theory of the Integral, F. Riesz and Sz.-Nagy's Leçons d'Analyse Fonctionnelle, Halmos' Measure Theory, Hewitt and Stromberg's Real and Abstract Analysis, and Dunford and Schwartz's Linear Operators. Anyone consulting these books will gain the perspective of the masters.

Where do we go from here? I am confident that the reading of this book will adequately prepare the student to venture into diverse fields of Mathematics. Specifically, books such as Billingsley's Probability and Measure, Conway's A Course in Functional Analysis, Stein's Singular Integrals and Differentiability Properties of Functions and Zygmund's Trigonometric Series are now within reach.

Acknowledgments

It is always a pleasure to acknowledge the contribution of those who make a project of this nature possible. My friends and colleagues Hari Bercovici and Ron Kerman read the complete manuscript and made valuable suggestions and comments. The opportunity to create this manuscript with the MicroTeX version of TeX was an unexpected pleasure and challenge. Elena Fraboschi and George Springer were my mentors in this endeavour, and I owe them much. Pam Cunningham Pierce contributed the illustrations.
My largest debt, though, is to the students who attended the course and kept a keen interest in learning throughout the ordeal. Many examples and solutions to the problems are due to them, particularly to Steve Rowe. Steve Blakeman, Nick Kernene, and Shilin Wang were also very helpful. The manuscript was cheerfully typed by Storme Day. The staff at Addison-Wesley handled all my questions efficiently. Mona Zeftel provided the much needed technical assistance, and Allan Wylde was the best publisher this ambitious project could have had.
Contents

Foreword

Chapter I. Cardinal Numbers
  Sets
  Functions and Relations
  Equivalent Sets
  Cardinals
  Problems and Questions

Chapter II. Ordinal Numbers
  Ordered Sets
  Well-ordered Sets and Ordinals
  Applications of Zorn's Lemma
  The Continuum Hypothesis
  Problems and Questions

Chapter III. The Riemann-Stieltjes Integral
  Functions of Bounded Variation
  Existence of the Riemann-Stieltjes Integral
  The Riemann-Stieltjes Integral and Limits
  Problems and Questions

Chapter IV. Abstract Measures
  Algebras and $\sigma$-algebras of Sets
  Additive Set Functions and Measures
  Properties of Measures
  Problems and Questions

Chapter V. The Lebesgue Measure
  Lebesgue Measure on $R^n$
  The Cantor Set
  Problems and Questions

Chapter VI. Measurable Functions
  Elementary Properties of Measurable Functions
  Structure of Measurable Functions
  Sequences of Measurable Functions
  Problems and Questions

Chapter VII. Integration
  The Integral of Nonnegative Functions
  The Integral of Arbitrary Functions
  Riemann and Lebesgue Integrals
  Problems and Questions

Chapter VIII. More About $L^1$
  Metric Structure of $L^1$
  The Lebesgue Differentiation Theorem
  Problems and Questions

Chapter IX. Borel Measures
  Regular Borel Measures
  Distribution Functions
  Problems and Questions

Chapter X. Absolute Continuity
  Vitali's Covering Lemma
  Differentiability of Monotone Functions
  Absolutely Continuous Functions
  Problems and Questions

Chapter XI. Signed Measures
  Absolute Continuity
  The Lebesgue and Radon-Nikodym Theorems
  Problems and Questions

Chapter XII. $L^p$ Spaces
  The Lebesgue $L^p$ Spaces
  Functionals on $L^p$
  Weak Convergence
  Problems and Questions

Chapter XIII. Fubini's Theorem
  Iterated Integrals
  Convolutions and Approximate Identities
  Abstract Fubini Theorem
  Problems and Questions

Chapter XIV. Normed Spaces and Functionals
  Normed Spaces
  The Hahn-Banach Theorem
  Applications
  Problems and Questions

Chapter XV. The Basic Principles
  The Baire Category Theorem
  The Space $B(X, Y)$
  The Uniform Boundedness Principle
  The Open Mapping Theorem
  The Closed Graph Theorem
  Problems and Questions

Chapter XVI. Hilbert Spaces
  The Geometry of Inner Product Spaces
  Projections
  Orthonormal Sets
  Spectral Decomposition of Compact Operators
  Problems and Questions

Chapter XVII. Fourier Series
  The Dirichlet Kernel
  The Fejér Kernel
  Pointwise Convergence

Chapter XVIII. Remarks on Problems and Questions

Index
CHAPTER
I
Cardinal Numbers
We open our discussion by introducing, in a naive fashion, the notion of set. We are particularly interested in operating with sets and in the concept of "number of elements" in a set, or cardinal number. We consider various cases of infinite cardinals and do some cardinal arithmetic.
1. SETS
What is a set? According to G. Cantor (1845-1918), who initiated the theory of sets in the last part of the nineteenth century: "A set is a collection into a whole of definite, distinct objects of our intuition or our thought. The objects are called the elements (members) of the set." The origin of the theory of sets, like that of many of the basic notions and results that are covered in this book, can be traced back to the theory of trigonometric and Fourier series. The theory of sets was created by Cantor to address the problem of uniqueness for trigonometric series. We refer to the "whole of distinct objects" in Cantor's definition as the universal set. We denote sets by capital letters $A, \ldots$ and elements by small letters $a, \ldots$, say. The notation $a \in A$, which reads $a$ belongs to $A$, indicates the fact that $a$ is a member of $A$. Most of the sets we consider are of the following form: If $X$ is the universal set, then $A$ is the set of those $x$ in $X$ for which the property $P(x)$ is true. The convenient, and descriptive, notation we adopt in this instance is $A = \{x \in X : P(x)\}$, or plainly $A = \{x : P(x)\}$ or even $A = \{P(x)\}$. $N$ or $Z_+$ is the set of natural numbers $\{1, 2, \ldots\}$, $Z = \{\ldots, -1, 0, 1, \ldots\}$ is the set of integers, $Q = \{r : r = m/n,\ m, n \in Z,\ n \neq 0\}$ is the set of rational numbers, $I$ is the set of irrational numbers and $R$ is the (universal) set of real numbers. $Q_+ = \{r \in Q : r \geq 0\}$ and $Q_-$ denote the sets of
nonnegative and negative rational numbers respectively; similarly for $I_+$, $I_-$, $R_+$ and $R_-$. If $a$ is not a member of $A$ we write $a \notin A$, which reads $a$ does not belong to $A$. The complement $B \setminus A$ of a set $A$ relative to a set $B$ is defined as

$$B \setminus A = \{b \in B : b \notin A\}.$$
We call $X \setminus A$ the complement of $A$. For instance, in the universal set $R$, the complement of $Q$ is $I$ and that of $I$ is $Q$. It is not clear at this point what the complement of the universal set $X$ should be. For this, and other important reasons, we postulate the existence of a particular set. We say that $\emptyset$ is the empty set if $x \in \emptyset$ holds for no element $x$. For instance, for every set $A$, $A \setminus A = \emptyset$. If every element of a set $A$ also belongs to a set $B$ we say that $A$ is a subset of $B$ and we write $A \subseteq B$ or $B \supseteq A$; these expressions read $A$ is contained in or equal to $B$ and $B$ contains or is equal to $A$, respectively. For instance, $Z \subseteq Q \subseteq R$ and $\emptyset \subseteq A$ for any $A$. We say that sets $A$ and $B$ are equal, and we write $A = B$, if $A \subseteq B$ and $B \subseteq A$. Although this definition seems a bit cumbersome, it often represents the only practical way we have to determine whether two sets are equal. To emphasize that $A$ is a proper subset of $B$, i.e., $A \subseteq B$ and $A \neq B$, we write $A \subset B$ or $B \supset A$. Given a set $A$, we let $\mathcal{P}(A)$, or parts of $A$, be the set consisting of all the subsets of $A$, i.e., $\mathcal{P}(A) = \{B : B \subseteq A\}$. For instance, if $A = \{a, b\}$, then $\mathcal{P}(A) = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$. What operations can we perform with sets, and what new sets are generated? We begin by introducing the union and intersection. Let $A$ and $B$ be any two sets. By the union $A \cup B$ of $A$ and $B$ we mean the set consisting of those elements which belong to either $A$ or $B$. Thus $A \cup B = \{x : x \in A \text{ or } x \in B\}$. By the intersection $A \cap B$ of $A$ and $B$ we mean the set consisting of all elements which belong to both $A$ and $B$, i.e., $A \cap B = \{x : x \in A \text{ and } x \in B\}$. In case $A \cap B = \emptyset$ we say that the sets $A$ and $B$ are disjoint. For instance, $Q \cup I = R$ and $Q \cap I = \emptyset$. How do we operate with more than two sets? A set whose elements are sets is referred to as a collection, a class or a family. Families are denoted by script letters $\mathcal{A}, \ldots$ For the question we posed it often suffices to consider a family $\mathcal{A}$ of indexed sets. More precisely, if $I$ is a nonempty set and $\mathcal{A} = \{A_i : i \in I\}$, then we put
$$\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \text{ in } I\}$$

and

$$\bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \text{ in } I\}.$$
It is quite straightforward to operate with these concepts, cf. 5.1 below. If $\mathcal{A} = \{A_i : 1 \leq i \leq n\}$ is a family of $n$ sets, we define the Cartesian product $\prod_{i=1}^{n} A_i$, or product, of the $A_i$'s as the set of ordered $n$-tuples

$$\prod_{i=1}^{n} A_i = \{(a_1, \ldots, a_n) : a_i \in A_i,\ 1 \leq i \leq n\}.$$
This set is named after Descartes (1596-1650), who introduced the rectangular coordinates for the plane; the analogy of the concepts is clear. A familiar product is $R^n = \{(x_1, \ldots, x_n) : x_i \in R,\ 1 \leq i \leq n\}$. A product of two sets $A$ and $B$, say, is denoted by $A \times B$. A useful application of the notion of product is the following: If $x \notin A \cup B$, then the sets $A \times \{x\}$ and $\{x\} \times B$ look essentially like $A$ and $B$, and yet are disjoint.
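For finite sets, the operations of this section can be tried out directly. The following is a minimal sketch in Python; the function names and the sample family are our own choices for illustration and do not come from the text. It computes the set of parts of a two-element set, the union and intersection of a small indexed family, and the two disjoint copies $A \times \{x\}$ and $\{x\} \times B$.

```python
from itertools import chain, combinations, product

def power_set(a):
    """Return the set of parts of a finite set, as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def union_of_family(family):
    """Union of an indexed family {A_i : i in I}, given as a dict i -> A_i."""
    return set().union(*family.values())

def intersection_of_family(family):
    """Intersection of a nonempty indexed family {A_i : i in I}."""
    sets = list(family.values())
    return set(sets[0]).intersection(*sets[1:])

def disjoint_copies(a, b, x):
    """Return A x {x} and {x} x B; they are disjoint when x is in neither set."""
    return set(product(a, [x])), set(product([x], b))

# Hypothetical sample data, chosen only for illustration.
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
p = power_set({"a", "b"})
u = union_of_family(family)
n = intersection_of_family(family)
left, right = disjoint_copies({1, 2}, {2, 3}, 0)
```

Note that the sets $\{1, 2\}$ and $\{2, 3\}$ used in the last line overlap, while their two copies do not, which is the point of the construction.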
2. FUNCTIONS AND RELATIONS

Various fields of human endeavour have to do with relationships that exist between sets of objects. Graphs and formulas, for instance, are devices for describing special relations in a quantitative way. We start by defining a particular kind of relation, namely, a function. The terminology goes back to Leibniz (1646-1716) who used the term primarily to refer to certain kinds of mathematical formulas. The notion of function generally accepted today was first formulated in 1837 by Dirichlet (1805-1859) in a memoir dealing with the convergence of Fourier series. Given two sets $A$ and $B$, say, a function $f$ from $A$ into $B$ is a correspondence which associates with each element $a$ of $A$, in some manner, an element, and only one, $b$ in $B$, which we denote by $f(a)$. We refer to $f$ as a function (or map, mapping, correspondence or transformation) of $A$ into $B$. $A$ is called the domain of $f$ and those elements of $B$ of the form $f(a)$ form a subset of $B$, denoted by $f(A)$, called the range of $f$. Any letter in the English or Greek alphabets, capital or small, may be used to denote a function. The symbol $f: A \to B$ means that $f$ is a function with domain $A$ and range contained in $B$. If $f: A \to B$ and $g: B \to C$, then the mapping $g \circ f: A \to C$ is defined by $g \circ f(a) = g(f(a))$ for $a$ in $A$. The function $g \circ f$ is called the composition of $f$ and $g$. A function $F$ is said to be
an extension of a function $f$, and $f$ a restriction of the function $F$, if the domain of $F$ contains that of $f$ and $F(a) = f(a)$ for every $a$ in the domain of $f$. The restriction of $F$ to a subset $A$ of its domain is denoted by $F|A$. The function $f$ is said to map $A$ onto $B$ if $f(A) = B$; we also say that $f$ is surjective. The function $f$ is said to be a one-to-one mapping of $A$ into $B$, or plainly one-to-one or injective, if $f(a_1) \neq f(a_2)$ whenever $a_1 \neq a_2$ for all $a_1, a_2$ in $A$. Suppose $f: A \to B$ is one-to-one and onto. Then we can define the mapping $g: B \to A$ by means of $g(b) = a$ whenever $f(a) = b$. The function $g$ is called the inverse of $f$ and is denoted by $f^{-1}$. For example, the function $f: (-1,1) \to R$ given by $f(x) = \tan(\pi x/2)$ is one-to-one and onto, and its inverse $f^{-1}: R \to (-1,1)$ is $f^{-1}(x) = 2\arctan(x)/\pi$. Although somewhat inconsistent, we conform to tradition and adopt the following notation: If $f: A \to B$ and $C \subseteq B$, the set $\{a \in A : f(a) \in C\}$ is called the inverse image of $C$ by $f$ and is denoted by $f^{-1}(C)$. This set should not be confused with $(f^{-1})(C) = \{a : a = f^{-1}(b),\ b \in C\}$ which is only defined when $f^{-1}$ exists. Two particular functions have a specific name. They are the identity function $1: A \to A$, $1(a) = a$ for all $a$ in $A$, and the characteristic function $\chi_E$ of a set $E$, i.e., the function defined by the equation $\chi_E(x) = 1$ if $x \in E$ and $0$ otherwise. We often work with families of functions. The collection of all the functions $f: A \to B$ from a set $A$ into a set $B$ is denoted by $B^A$. For example, $R^N$ denotes the family of all real sequences $\{r_1, r_2, \ldots\}$. We visualize a function $f$ from $A$ into $B$ as a particular subset of $A \times B$. Indeed, we think of $f$ as the subset of $A \times B$ consisting of the ordered pairs $(a, f(a))$; in other words there is a natural identification between $f$ and its graph. This notion can be extended considerably. An arbitrary subset $R$ of $A \times B$ is called a relation. To emphasize this correspondence we often write $aRb$ to indicate that $(a, b) \in R$.
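The identification of a function with its graph is easy to experiment with on finite sets. In the sketch below, written in Python, a relation is stored as a set of ordered pairs; the helper names and the sample sets are assumptions made for the example, not notation from the text. The helpers test whether a relation is a function on a given domain, whether it is one-to-one or onto, and compute inverse images and compositions.

```python
def is_function(graph, domain):
    """A set of pairs is a function on `domain` if each a has exactly one image."""
    firsts = [a for a, _ in graph]
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))

def image(graph):
    """The range f(A) of the function with the given graph."""
    return {b for _, b in graph}

def is_injective(graph):
    """One-to-one: no element of B is the image of two distinct elements."""
    seconds = [b for _, b in graph]
    return len(seconds) == len(set(seconds))

def is_surjective(graph, codomain):
    """Onto: the range equals the whole of B."""
    return image(graph) == set(codomain)

def inverse_image(graph, c):
    """The set {a : f(a) in C}; defined whether or not f has an inverse."""
    return {a for a, b in graph if b in c}

def compose(g, f):
    """Graph of g o f, where f: A -> B and g: B -> C are given as graphs."""
    g_map = dict(g)
    return {(a, g_map[b]) for a, b in f}

# Hypothetical sample data.
A = {1, 2, 3, 4}
B = {"x", "y", "z"}
f = {(1, "x"), (2, "x"), (3, "y"), (4, "z")}
g = {("x", 0), ("y", 1), ("z", 0)}
```

Here `f` maps `A` onto `B` but is not one-to-one, so $f^{-1}$ does not exist as a function, while `inverse_image(f, {"x"})` is still the well-defined set `{1, 2}`.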
In addition to functions, an important instance of relations is provided by the so-called equivalence relations. In this particular case we have $A = B$ and the equivalence relation $R$ satisfies the following three properties: R (reflexivity) $aRa$, all $a$ in $A$, S (symmetry) $aRb$ iff (if and only if) $bRa$, T (transitivity) If $aRb$ and $bRc$, then $aRc$. The equivalence class $\mathcal{R}(a)$ of an element $a \in A$ is the set $\mathcal{R}(a) = \{b \in A : aRb\}$; $A$ is then the disjoint union of these equivalence classes. For instance, let $A$ be the collection of all the straight lines $L$ in $R^2$. Then the relation $L_1 R L_2$ iff $L_1$ and $L_2$ are parallel is an equivalence
relation, and the equivalence class $\mathcal{R}(L)$ of any line $L$ consists precisely of all the lines parallel to it.
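The decomposition of a set into equivalence classes can be illustrated with a finite collection of lines. The Python sketch below rests on an assumed encoding that is not in the text: a line $ax + by = c$ is stored as the triple $(a, b, c)$ with $(a, b) \neq (0, 0)$, and two lines are declared parallel when their normal vectors are proportional, so that coincident lines count as parallel.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into the classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity it suffices to test one representative.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

def parallel(l1, l2):
    """Lines a*x + b*y = c stored as (a, b, c); parallel iff a1*b2 == a2*b1."""
    a1, b1, _ = l1
    a2, b2, _ = l2
    return a1 * b2 == a2 * b1

# Hypothetical sample lines.
lines = [(1, 1, 0), (2, 2, 5), (1, -1, 0), (0, 1, 3), (0, 2, 7), (3, -3, 1)]
classes = equivalence_classes(lines, parallel)
```

The six lines fall into three classes of two lines each, one class per direction, and every line belongs to exactly one class.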
3. EQUIVALENT SETS

Suppose $A$ and $B$ are two sets for which there is a function $f: A \to B$ which is one-to-one and onto. Intuitively, the sets $A$ and $B$ are interchangeable provided we are interested in some property that does not concern the specific nature of their elements. Therefore, in this case we say that $A$ and $B$ are equivalent, with equivalence function $f$, and we write $A \sim B$. It is readily seen that $\sim$ is an equivalence relation among sets. Indeed, $\sim$ verifies the following three properties: R. $A \sim A$, S. If $A \sim B$, then $B \sim A$, T. If $A \sim B$ and $B \sim C$, then $A \sim C$. First, in the case of R, the identity equivalence function will do. As for S, if $f: A \to B$ is an equivalence function, then $f^{-1}: B \to A$ establishes an equivalence between $B$ and $A$. Finally, if $f: A \to B$ and $g: B \to C$ are equivalence functions, so is $g \circ f: A \to C$, cf. 5.10 below. By means of this equivalence relation we are able to sort sets as follows: A finite set is any set that is either empty or equivalent to $\{1, \ldots, n\}$ for some $n \in N$. Any set that is not finite is called infinite. For instance $N$, and any set equivalent to $N$, is infinite. Sets equivalent to $N$ are called countable; it is easy to see why. If $A$ is a countable set and $f: N \to A$ is an equivalence function, then each element $a \in A$ is of the form $a = f(n)$, $n \in N$, and can be identified with $n$. Thus $A$ can be explicitly written as the sequence $(a_1, a_2, \ldots)$, where $a_n = f(n)$, $n \in N$. A set which is either finite or countable is said to be at most countable. An uncountable set is one which is not at most countable. It is not hard to see that there are uncountable sets. Indeed, let $I_0 = [0,1]$ be the unit interval of the real line; we claim that $I_0$ is uncountable. Suppose not; then $I_0$ can be expressed as $r_1, r_2, \ldots$, say. Dividing $I_0$ into three closed intervals, each of length $1/3$ (they may have common endpoints), it is clear that one of the intervals, $I_1$ say, does not contain $r_1$; if there is more than one interval just choose any.
Next we divide $I_1$ into three closed intervals of equal length and choose a second subinterval, $I_2$ say, which does not contain $r_2$. Proceeding in this fashion we construct a nested sequence $I_n$ of closed intervals, each of which is one-third of the preceding in length and such that $r_n \notin I_n$, all $n$ in $N$. By the well-known nested interval principle, the intersection $\bigcap_{n \in N} I_n$ is not empty
and consists of a single real number $r$, say. Clearly $0 \leq r \leq 1$. Since our assumption is that all the real numbers in $I_0$ are listed in the sequence $r_1, r_2, \ldots$, $r$ must be one of the $r_n$'s. But since by construction $r_n \notin I_n$ while $r \in I_n$, we have $r \neq r_n$ for all $n$, and we have reached a contradiction. This result is so interesting that it deserves another proof, cf. 5.15 below. It is often helpful to "picture" a proof and once this is achieved to translate this proof into one that can be written out. For instance, let us show that any two closed, bounded intervals are equivalent. If the intervals are $[a,b]$ and $[c,d]$, say, and if $(b-a) < (d-c)$, then an equivalence function can be readily obtained as indicated in Figure 1. In fact, the picture hints that an explicit expression for $f$ might be $f(x) = ((d-c)/(b-a))(x-a) + c$. A similar picture can be used to establish that any two bounded, open intervals in the line are equivalent. Combining this result with the above observation that $(-1,1) \sim R$, it readily follows that any two open intervals, bounded or not, are equivalent. It is natural to consider whether $[0,1]$ and $[0,1)$ are equivalent. This is a slightly more complicated question since a proof by pictures is not easy to come by. Let $A = \{r_1, r_2, \ldots\}$ consist of a decreasing sequence of distinct points in $[0,1]$ such that $r_1 = 1$ and $\lim_{n \to \infty} r_n = 1/2$. Then the function $f: [0,1] \to [0,1)$ given by $f(r) = r$ if $r \notin A$ and $f(r_n) = r_{n+1}$, $r_n \in A$, establishes the desired equivalence. Now, that $[0,1)$ and $[0,1]$ are equivalent is not so surprising since $[0,1/2] \subseteq [0,1) \subseteq [0,1]$ and $[0,1/2] \sim [0,1]$. The remarkable fact is that a similar result, conjectured by Cantor, is true for arbitrary sets.
Figure 1
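Three constructions from the preceding discussion lend themselves to a quick computational check: the linear equivalence between $[a,b]$ and $[c,d]$, the map of $[0,1]$ onto $[0,1)$, and the choice of nested thirds in the proof that $[0,1]$ is uncountable. The Python sketch below uses exact rational arithmetic. The particular sequence $r_n = 1/2 + 1/2^n$ is our own choice of a decreasing sequence with $r_1 = 1$ and limit $1/2$, and the nested-interval routine can of course only process a finite initial segment of a proposed listing.

```python
from fractions import Fraction

def affine(a, b, c, d):
    """Equivalence function from [a, b] onto [c, d]: f(x) = ((d-c)/(b-a))(x-a) + c."""
    slope = Fraction(d - c, b - a)
    return lambda x: slope * (x - a) + c

def shift(r):
    """Map [0, 1] onto [0, 1): send r_n = 1/2 + 1/2**n to r_{n+1}, fix other points.

    The sequence r_n is an assumed example; r should be a Fraction (or an int).
    """
    t = r - Fraction(1, 2)
    d = t.denominator
    # r lies in A = {r_1, r_2, ...} exactly when r - 1/2 = 1/2**n for some n >= 1.
    in_a = t > 0 and t.numerator == 1 and d >= 2 and (d & (d - 1)) == 0
    return Fraction(1, 2) + t / 2 if in_a else r

def avoid(points):
    """Return a closed interval [lo, hi] in [0, 1] that contains none of `points`.

    At each step the current interval is cut into three closed thirds; a point
    lies in at most two of them, so some third misses the next listed point.
    """
    lo, hi = Fraction(0), Fraction(1)
    for r in points:
        third = (hi - lo) / 3
        for k in range(3):
            left, right = lo + k * third, lo + (k + 1) * third
            if not (left <= r <= right):
                lo, hi = left, right
                break
    return lo, hi

# Hypothetical sample data.
f = affine(0, 1, 2, 5)
listed = [Fraction(1, 2), Fraction(1, 3), Fraction(0), Fraction(1, 27)]
lo, hi = avoid(listed)
```

After processing $k$ listed points the interval returned by `avoid` has length $3^{-k}$ and misses all of them; in the argument of the text the same selection is carried out for every $n$ and the nested interval principle supplies the limiting point.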
Theorem 3.1 (Schröder-Bernstein). Let $A_0, A_1, A_2$ be distinct sets such that $A_2 \subset A_1 \subset A_0$ and suppose that $A_0 \sim A_2$. Then also $A_0 \sim A_1$.
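Proof. Let $f: A_0 \to A_2$ be an equivalence function and consider $f|A_1$, the restriction of $f$ to $A_1$. If we let $A_3 = f(A_1)$, then clearly $(f|A_1): A_1 \to A_3$ is one-to-one and onto and it establishes an equivalence between $A_1$ and $A_3$. By a proof by pictures it also follows that $A_0 \setminus A_1 \sim A_2 \setminus A_3$ and that $f|(A_0 \setminus A_1)$ is an equivalence function in this case. Next we put $g = f|A_1$ and we repeat the above argument with $A_2$ and $g$ in place of $A_1$ and $f$. That is, let $A_4 = g(A_2)$ and note that $g|A_2: A_2 \to A_4$ is an equivalence function. It is also readily seen that $g|(A_1 \setminus A_2)$ establishes the equivalence of $A_1 \setminus A_2$ and $A_3 \setminus A_4$. We repeat this procedure and thus obtain a decreasing sequence $\{A_n\}$ of subsets of $A_0$ which satisfies the following properties:

(i) $A_1 \sim A_3 \sim A_5 \sim \cdots$
(ii) $A_2 \sim A_4 \sim A_6 \sim \cdots$
(iii) $A_n \setminus A_{n+1} \sim A_{n+2} \setminus A_{n+3}$, all $n \geq 0$.

Furthermore, note that

$$A_0 = (A_0 \setminus A_1) \cup (A_1 \setminus A_2) \cup (A_2 \setminus A_3) \cup \cdots \cup \bigcap_{n} A_n \tag{3.1}$$

and

$$A_1 = (A_1 \setminus A_2) \cup (A_2 \setminus A_3) \cup (A_3 \setminus A_4) \cup \cdots \cup \bigcap_{n} A_n. \tag{3.2}$$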
The equivalence of $A_0$ and $A_1$ follows now readily since the sets on the right-hand side of (3.1) and of (3.2) are pairwise disjoint, the sets located at the odd spots in (3.1) are equivalent to the sets located at the even spots in (3.2) and the remaining sets are the same. ∎

This result has many important consequences, and we mention some.
Corollary 3.2. Let $A, B$ be arbitrary sets, and let $A_1 \subseteq A$ and $B_1 \subseteq B$ be such that $A_1 \sim B$ and $B_1 \sim A$. Then $A \sim B$.
Proof. Simply observe that by assumption $B_1 \subseteq B \sim A_1 \subseteq A$, and $B_1 \sim A$. Then (a simple variant of) Theorem 3.1 applies with $A_2 \sim B_1$ and $A_0 = A$. ∎
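The construction in the proof of Theorem 3.1 can be traced on a concrete example. In the Python sketch below we take, purely as an assumed illustration, $A_0 = \{1, 2, \ldots\}$, $A_1 = \{2, 3, \ldots\}$, $A_2 = \{3, 4, \ldots\}$ and $f(n) = n + 2$, so that $A_n = \{n+1, n+2, \ldots\}$ and the intersection of all the $A_n$ is empty. The equivalence of $A_0$ and $A_1$ described in the proof applies $f$ on the sets $A_n \setminus A_{n+1}$ with $n$ even and leaves every other point fixed.

```python
def layer(x, in_a1, in_a2, f_inv):
    """Index n with x in A_n minus A_{n+1}.

    Uses A_{n+2} = f(A_n): pulling x back by the inverse of f lowers the index
    by 2. Assumes x does not lie in the intersection of all the A_n.
    """
    n = 0
    while in_a2(x):
        x = f_inv(x)
        n += 2
    return n + 1 if in_a1(x) else n

def make_equivalence(f, f_inv, in_a1, in_a2):
    """Equivalence h of A_0 onto A_1: apply f on even layers, fix the rest."""
    def h(x):
        return f(x) if layer(x, in_a1, in_a2, f_inv) % 2 == 0 else x
    return h

# Assumed example: A_0 = {1, 2, ...}, A_1 = {2, 3, ...}, A_2 = {3, 4, ...}.
h = make_equivalence(
    f=lambda n: n + 2,
    f_inv=lambda n: n - 2,
    in_a1=lambda n: n >= 2,
    in_a2=lambda n: n >= 3,
)
values = [h(x) for x in range(1, 101)]
```

In this example `h` moves each odd number two steps up and fixes each even number, so it is one-to-one and its values fill out $A_1 = \{2, 3, \ldots\}$.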
An interesting application of Theorem 3.1 is to show that $Q_+$ is countable. Since $N \subseteq Q_+$ it is enough to show that $Q_+$ is equivalent to a subset
of $N$. But this is not hard: If $r = m/n$, $m, n \in N$, is the relatively prime expression of $r \in Q_+$, then put $f(r) = f(m/n) = 2^m 3^n$. It is clear that $f$ is one-to-one, and consequently, it is an equivalence between $Q_+$ and a subset of $N$, as we wanted to show. There are at least two other ways to verify that $Q_+$ is countable. For instance, we may exhibit $Q_+$, including repetitions, as the sequence $1/1, 2/1, 1/2, 1/3, 2/2, 3/1, 4/1, 3/2, 2/3, 1/4, \ldots$, ordered by the increasing magnitude of the sum of the numerator and denominator of each rational number. A proof by pictures leading to the above sequence is also available; we leave it to the reader to set it up. These observations may be cast in a more general setting.

Proposition 3.3. Let $\mathcal{A} = \{A_n\}$ be a family of countable sets. Then $A = \bigcup_{n \in N} A_n$ is also countable.
Proof. List the elements of each $A_n = \{a_{n,1}, a_{n,2}, \ldots\}$ and introduce the mapping $f: A \to N$ given by $f(a_{n,m}) = 2^n 3^m$. Since $f$ is one-to-one, $A$ is at most countable. Also, since $A \supseteq A_1$, say, $A$ is actually countable. The argument can be readily modified to show that a finite, or countable, union of at most countable sets is again at most countable. ∎
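Both ways of counting the positive rationals can be tried numerically. The Python sketch below, whose function names are ours, implements the coding $m/n \mapsto 2^m 3^n$ on fractions in lowest terms and an enumeration by increasing value of numerator plus denominator. Within each diagonal it runs through the numerators in increasing order, which differs slightly from the back-and-forth order displayed above, and it drops repeated values instead of listing them.

```python
from fractions import Fraction
from itertools import count, islice

def code(r):
    """Coding of a positive rational r = m/n in lowest terms as 2**m * 3**n."""
    # Fraction always stores numerator and denominator in lowest terms.
    return 2 ** r.numerator * 3 ** r.denominator

def by_diagonals():
    """Yield the pairs (m, n), m, n >= 1, ordered by increasing m + n."""
    for s in count(2):
        for m in range(1, s):
            yield m, s - m

def enumerate_rationals():
    """Yield every positive rational exactly once, skipping repeated values."""
    seen = set()
    for m, n in by_diagonals():
        r = Fraction(m, n)
        if r not in seen:
            seen.add(r)
            yield r

first = list(islice(enumerate_rationals(), 10))
codes = [code(r) for r in islice(enumerate_rationals(), 200)]
```

The same coding, applied to the pair of indices $(n, m)$ of $a_{n,m}$, is the one used in the proof of Proposition 3.3; distinct pairs receive distinct codes because factorization into primes is unique.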
4. CARDINALS
As pointed out above, sets which are equivalent cannot be told apart by purely set-theoretic properties. This observation leads to the following definition. Given a set $A$, we associate with it its cardinal number, with the property that any two sets $A$ and $B$ have the same cardinal number, or cardinality, provided that they are equivalent. We denote the cardinal number of $A$ by card $A$ and it is clear that card $A$ = card $B$ whenever $A \sim B$. This definition is somewhat imprecise, but it will do for the applications we have in mind. The cardinal number of the class of sets equivalent to $\emptyset$ is denoted by $0$, that of $\{1, \ldots, n\}$ by $n$, and that of $N$ by $\aleph_0$. Thus $\aleph_0$ is the first infinite cardinal. The cardinal number of the uncountable set $[0,1]$, or that of $R$ for that matter, is denoted by $c$ (for continuum). Small letters often are used to denote cardinal numbers. The inclusion relation between sets translates into a comparison relation for cardinal numbers. More precisely, given cardinal numbers, or plainly cardinals, $a$ and $b$, we say that $a$ precedes $b$, or that $a$ is less than or equal to $b$, and we write $a \leq b$, if there are sets $A$ and $B$ and a function $f: A \to B$ such that card $A = a$, card $B = b$ and $f$ is one-to-one. In other
words, and with the above notation, $a \le b$ if and only if $A \sim B_1$, where $B_1 \subseteq B$. It is clear that $\aleph_0 \le c$, and that $n \le m$ (in the cardinal sense) iff $n \le m$ (in the usual sense). Inspired by the concept of equivalent sets we say that the cardinals $a$ and $b$ are equal, and we write $a = b$, if $a \le b$ and $b \le a$. We say that $a < b$ if $a \le b$ and $a \ne b$. For instance, $\aleph_0 < c$.

Next we develop the arithmetic of cardinal numbers, including the operations of addition, multiplication and exponentiation. We do addition first. Given cardinals $a, b$, we define the sum $a + b$ of $a$ and $b$ as the cardinal number obtained as follows: Let $A, B$ be disjoint sets such that card $A = a$ and card $B = b$. Then put $a + b$ = card $(A \cup B)$. It is not hard to see that addition is commutative (since $A \cup B = B \cup A$) and associative (since $A \cup (B \cup C) = (A \cup B) \cup C$). For example, if $n, m$ are finite cardinals then $n + m$ is, as it should be, $(n + m)$ (let $A = \{1, \ldots, n\}$, $B = \{n+1, \ldots, n+m\}$). On the other hand, $n + \aleph_0 = \aleph_0$ (choose $A = \{1, \ldots, n\}$, $B = \{n+1, \ldots\}$ and note that $A \cap B = \emptyset$ and $A \cup B = N$) and $\aleph_0 + \aleph_0 = \aleph_0$ ($A$ = even natural numbers, $B$ = odd natural numbers). Also $\aleph_0 + c = c$, cf. 5.18 below, and $c + c = c$ ($A = [0, 1/2)$, $B = [1/2, 1]$).

As for the multiplication of cardinal numbers, given cardinals $a$ and $b$, we define the product $ab$ of $a$ and $b$ as the cardinal obtained as follows: Let $A, B$ be sets such that card $A = a$ and card $B = b$. Then put $ab$ = card $(A \times B)$. Multiplication of cardinal numbers is commutative and associative, and distributive with respect to addition, cf. 5.3 below. For example, in the case of finite cardinals $n$ and $m$, the product $nm$ is, as it should be, $(nm)$, i.e., the cardinal of $\{1, \ldots, n, \ldots, 2n, \ldots, nm\}$, and that of $\aleph_0 \aleph_0$ is $\aleph_0$ (put $A = N$, $B = \{1/n: n \in N\}$).

Finally we consider exponentiation. Given cardinals $a$ and $b$, we define the cardinal $b^a$ as follows: Let $A, B$ be sets with card $A = a$ and card $B = b$. Then we set $b^a$ = card $B^A$. The usual properties of exponentiation are not hard to check, cf. 5.21, 5.22 below. There is at least one exponential that is readily computed, and it corresponds to the case $b = 2$, since it is not hard to identify $2^A$. More precisely, we have
Proposition 4.1. Given any set $A$, $2^A \sim P(A)$.
Proof. Let $\psi: 2^A \to P(A)$ be defined as follows: If $f: A \to \{0,1\}$, then let $\psi(f)$ be the subset of $A$ corresponding to $f^{-1}(\{1\})$, i.e., put $\psi(f) = f^{-1}(\{1\})$. We claim that $\psi$ is an equivalence function. First note that if $\psi(f) = \psi(g)$, then $f^{-1}(\{1\}) = g^{-1}(\{1\})$, and consequently also $f^{-1}(\{0\}) = g^{-1}(\{0\})$ and $f = g$; thus $\psi$ is one-to-one. Next suppose that
$B \in P(A)$ and let $f = \chi_B$. Then $\psi(f) = f^{-1}(\{1\}) = B$ and $\psi$ is also onto. Thus $\psi$ is an equivalence function. • This result explains why $P(A)$ is also referred to as the power set of $A$, and it can be used to show that there is no largest cardinal number.
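For a small finite set the equivalence of Proposition 4.1 can be listed explicitly. The sketch below is only an illustration under our own naming: functions $A \to \{0,1\}$ are represented as Python dictionaries, `psi` sends such a function to the preimage of 1, and `chi` builds the characteristic function of a subset.

```python
from itertools import product


def psi(f: dict) -> frozenset:
    """Send f: A -> {0, 1} to the subset of A on which f takes the value 1."""
    return frozenset(x for x, value in f.items() if value == 1)


def chi(B, A) -> dict:
    """Characteristic function of the subset B of A, as a dictionary."""
    return {x: (1 if x in B else 0) for x in A}


A = ["a", "b", "c"]

# All functions from A into {0, 1}: one dictionary per 0/1 assignment.
functions = [dict(zip(A, values)) for values in product((0, 1), repeat=len(A))]

# The images under psi; psi is one-to-one iff no two functions collide here.
images = {psi(f) for f in functions}
```

With three elements there are eight functions and eight subsets, and `chi` undoes `psi`, which is the finite shadow of the statement that $\psi$ is one-to-one and onto.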
Proposition 4.2. For any set $A$, card $A$ < card $2^A$.
Proof. Since all the singletons of $A$ belong to $P(A)$ it is clear that card $A \le$ card $P(A)$. Let $\psi$ be a (one-to-one if you wish) map from $A$ into $P(A)$; we show that $\psi$ cannot be onto. This is not hard; suppose that $\psi$ is onto. Now, for each $x \in A$, $\psi(x)$ is a subset of $A$ and consequently the set $B = \{x \in A: x \notin \psi(x)\}$ is well defined. Since by assumption $\psi$ is onto, there exists $a \in A$ such that $\psi(a) = B$. Now, if $a \in B$, then by the definition of $B$, $a \notin \psi(a) = B$, and this cannot happen. If, on the other hand, $a \notin B$, then also $a \notin \psi(a) = B$ and consequently, by the definition of $B$, $a \in B$, which is also a contradiction. In other words, $\psi$ cannot be onto. •

Proposition 4.1 in particular implies that for finite cardinals $n$, $2^n$ is as expected. How about $2^{\aleph_0}$? This requires a new idea. Each real number $r$ in $[0,1]$ can be expressed as

$$r = \sum_{n=1}^{\infty} a_n 2^{-n} = .a_1 a_2 \ldots, \qquad a_n = 0, 1, \quad \text{all } n.$$
This is the so-called dyadic expansion of $r$. A minor inconvenience arises since expansions are not necessarily unique. For instance $1/2 = .011\ldots = .100\ldots$, one dyadic expansion terminating in 0's and one in 1's. But the set of such $r$'s is countable, cf. 5.14 below. In other words, if we consider all dyadic expansions, there are, counting repetitions, $c + \aleph_0 = c$ of them. Furthermore, the set of dyadic expansions is clearly equivalent to the set $A$ of all sequences which assume the values 0 and 1, and this set in turn is equivalent to $2^N$, $2 = \{0, 1\}$. Now, by definition, card $A$ = card $2^N = 2^{\aleph_0}$, and by the above remarks card $A = c$. Thus $2^{\aleph_0} = c$.

A similar argument allows us to compute $cc$. On the other hand, to compute this product it suffices to note that $cc = 2^{\aleph_0} 2^{\aleph_0} = 2^{2\aleph_0} = 2^{\aleph_0} = c$.

One point remains open. Given two cardinals $a$ and $b$, we cannot be sure that they are comparable. In order to answer this question we need a new concept, namely that of an ordered set, which we discuss in the next chapter.
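The non-uniqueness of dyadic expansions mentioned above can be observed with exact rational arithmetic. This is a small sketch under our own conventions: `dyadic_value` sums a finite block of digits $.a_1 a_2 \ldots a_k$, and `dyadic_digits` produces the expansion that terminates in 0's for a number in $[0, 1)$ given as a `Fraction`.

```python
from fractions import Fraction


def dyadic_value(digits) -> Fraction:
    """Exact value of the finite dyadic expansion .a1 a2 ... ak."""
    return sum((Fraction(a, 2**n) for n, a in enumerate(digits, start=1)),
               Fraction(0))


def dyadic_digits(r: Fraction, k: int) -> list:
    """First k dyadic digits of r in [0, 1), by repeated doubling.

    For a dyadic rational this yields the expansion terminating in 0's.
    """
    digits = []
    for _ in range(k):
        r *= 2
        a = 1 if r >= 1 else 0
        digits.append(a)
        r -= a
    return digits


half = Fraction(1, 2)
terminating = dyadic_digits(half, 6)          # the expansion .100000
truncated_ones = [0] + [1] * 9                # .0111111111, ten digits
gap = half - dyadic_value(truncated_ones)     # what the truncation misses
```

The truncation of $.0111\ldots$ after ten digits falls short of $1/2$ by exactly $2^{-10}$, so the gap tends to 0 as more 1's are kept, which is why both expansions represent the same number.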
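5. PROBLEMS AND QUESTIONS

5.1 Show that union and intersection are distributive with respect to intersection and union respectively. In other words, show that
$$C \cap \Big(\bigcup_{i \in I} A_i\Big) = \bigcup_{i \in I} (C \cap A_i) \quad \text{and} \quad C \cup \Big(\bigcap_{i \in I} A_i\Big) = \bigcap_{i \in I} (C \cup A_i).$$
In addition the de Morgan's laws also hold, to wit,
$$C \setminus \Big(\bigcup_{i \in I} A_i\Big) = \bigcap_{i \in I} (C \setminus A_i) \quad \text{and} \quad C \setminus \Big(\bigcap_{i \in I} A_i\Big) = \bigcup_{i \in I} (C \setminus A_i).$$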
5.2 Let $\mathcal{A} = \{A_n: n \in N\}$ be a family of sets and let $A = \bigcup_{n \in N} A_n$. Show that there is a family $\mathcal{B}$ consisting of pairwise disjoint sets, $\mathcal{B} = \{B_n: n \in N\}$, such that $B_n \subseteq A_n$ and $A = \bigcup_{n \in N} B_n$.
5.3 Show that $A \times (B \cup C) = (A \times B) \cup (A \times C)$ and that, in general, $A \cup (B \times C) \ne (A \cup B) \times (A \cup C)$.
5.4 Show that if $A \ne \emptyset$ and $A \times B = A \times C$, then $B = C$.
5.5 Suppose that $B \cap C = \emptyset$ and show that $A^{B \cup C} \sim A^B \times A^C$.
5.6 Suppose that $A^B = B^A$ and show that $A = B$.
5.7 Suppose that $A$ is a set of $n$ elements; how many relations are there in $A \times A$?
5.8 Let $f: A \to B$ and suppose that for all $i \in I$, $A_i \subseteq A$ and $B_i \subseteq B$. Discuss the (inclusion) relations between
and do the same for
Also, what are the inclusion relations between
and between
5.9 Let $f: A \to B$. Show that $f(f^{-1}(B)) \subseteq B$ and $f^{-1}(f(A)) \supseteq A$.
By means of examples show that the inclusions may indeed be proper. 5.10 Show that the composition of one-to-one functions is one-to-one, and that of onto functions is onto.
5.11 Let $f: A \to B$ and $g: B \to A$, and suppose that $g \circ f = 1$ (identity in $A$) and $f \circ g = 1$ (identity in $B$). Show that $f$ and $g$ are one-to-one and onto and that $g = f^{-1}$.
5.12 Suppose that $A \sim B$ and show that for any set $C$, $A^C \sim B^C$.
5.13 Show that if $f: [0,1] \times [0,1] \to [0,1]$ is one-to-one and onto, then it cannot be continuous.
5.14 Show that the set of real numbers in $[0,1]$ which have two decimal expansions (one terminating in 9's and one in 0's) is countable.
5.15 This is a sketch of a proof that $[0,1]$ is uncountable: Any countable listing $.a_{11}a_{12}a_{13}\ldots$, $.a_{21}a_{22}a_{23}\ldots$, $.a_{31}a_{32}a_{33}\ldots$, $\ldots$ of the real numbers in $[0,1]$ cannot be complete. Indeed, put
$$r = .b_1 b_2 b_3 \ldots, \qquad b_n \ne a_{nn}, \quad \text{all } n,$$
and note that the real number $r$ cannot be in the above listing. The actual proof requires some care; 5.14 is relevant here. The method of proof uses a "Cantor diagonal selection process."
5.16 Prove the following restatement of the corollary to the Schröder-Bernstein theorem: If $a$ and $b$ are cardinal numbers such that $a \le b$ and $b \le a$, then $a = b$. This result does not require the Axiom of Choice as do the results of Chapter II, but it merely asserts that both $a < b$ and $b < a$ cannot occur.
5.17 Show that if $A$ is an infinite set, then $A$ contains a countable subset.
5.18 Prove these two corollaries to 5.17: (i) A set is infinite iff it is equivalent to a proper subset, and, (ii) If $a$ is an infinite cardinal, then $a + \aleph_0 = a$.
5.19 Suppose $a_1, a_2$ are infinite cardinals, and that $a_1 < a_2$. Does it follow that for every cardinal $b$, $a_1 + b < a_2 + b$?
5.20 If $a$ and $b$ are cardinal numbers so that $a < c$ and $a + b = c$, does it follow that $b = c$?
5.21 Given cardinals $a$, $b$ and $d$, show that $(a^b)^d = a^{bd}$.
5.22 Show that for arbitrary cardinals $a, b$, $2^{a+b} = 2^a 2^b$.
5.23 Compute $\aleph_0 \aleph_0$, $c\,\aleph_0$, $cc$, $c^c$ and $\aleph_0^{\aleph_0}$.
5.24 Show that the set of all polynomials with rational coefficients is countable. Also the set of roots of such polynomials is countable; these are known as algebraic numbers. A real number is said to be transcendental if it is not algebraic. Show that the set of transcendental numbers is uncountable.
5.25 What is the cardinal number of the set of all the real-valued functions defined on $[0,1]$? What is the cardinality of $C([0,1])$, the set of all the continuous real-valued functions defined on $[0,1]$?
5.26 Let $S$ be the set of those sequences of natural numbers which are eventually 0; show that $S$ is countable.
5.27 Let $\mathcal{A}$ be the family of all the finite subsets of $R$ and let $\mathcal{B}$ be the family of all the at most countable subsets of $R$. Compute card $\mathcal{A}$ and card $\mathcal{B}$.
5.28 What is the cardinality of the family of all the open subsets of $R$? of $R^2$?
5.29 Let $\mathcal{A}$ be the family of all the convex subsets of $R^2$, and let $\mathcal{B}$ be the family of all the connected subsets of $R^2$. Evaluate card $\mathcal{A}$ and card $\mathcal{B}$.
5.30 Given a family of sets $\mathcal{A} = \{A\}$ such that card $\mathcal{A} \le c$, let $\mathcal{B} = \{B: B = \bigcup_{k=1}^{\infty} (A_k \setminus A_k'),\ A_k, A_k' \in \mathcal{A}\}$. Prove that card $\mathcal{B} \le c$.
5.31 If $\{A_i\}_{i \in I}$ is an indexed family of sets with card $A_i \le c$ for all $i \in I$, and card $I \le c$, show that card $(\bigcup_{i \in I} A_i) \le c$.
5.32 Let $A$ be a subset of the plane with the property that the (usual Euclidean) distance $d(x, y)$ is a rational number for any pair of elements $x, y \in A$. Show that $A$ is at most countable. Is the result valid for subsets of $R^n$ with the same property?
5.33 Suppose $A_1 \subseteq A_2 \subseteq \ldots$, $A \subseteq \bigcup_{n=1}^{\infty} A_n$, and for each infinite subset $B$ of $A$ there exists $A_N$ such that $B \cap A_N$ is an infinite set. Is it true that for some integer $n_0$, $B \subseteq A_{n_0}$?
5.34 Decide whether the following statements are true: (a) If $A \subseteq B$, then card $B$ = card $(B \setminus A)$ + card $A$, and, (b) card $(A \cap B)$ = card $A$ iff $A \subseteq B$.
5.35 Let $a$ be an infinite cardinal such that $a = 2a$. Show that if $a < b$, then $a + b = b$.
5.36 Let $a, b$ be cardinal numbers, $b \ge 1$. Show that $a + b \le ab$.
5.37 (Russell's Paradox). A set is either a member of itself, or it is not. Let $R$ denote the set of all sets which are not members of themselves. Then if $R \in R$ it follows that $R \notin R$. If $R \notin R$, it follows that $R \in R$. Hence it cannot be that $R \in R$ or that $R \notin R$. It is clear from this and other paradoxes, that there is a need for the axiomatization of intuitive set theory. These paradoxes are avoided in axiomatic set theory by the elimination of "sets" that are "too large." To develop the theory of sets from the axiomatic point of view is a long and difficult process, far removed from Real Analysis, which is the main subject of our text. For this reason we have made no effort to be rigorous in dealing with sets, but have rather appealed to intuition.
CHAPTER II
Ordinal Numbers
In Chapter I we identified sets according to their cardinal number; an attribute we want to consider next is that of order. For instance, we want to distinguish between the sets $A = \{1, 2, \ldots\}$ ordered by the usual "$\le$" relation and $B = \{\ldots, 2, 1\}$ ordered by "$\ge$." Although these sets are equivalent in the sense of cardinality, the same is not true if we take the order into account: $A$ has a first, but not a last, element, and $B$ has a last, but not a first, element.
1. ORDERED SETS
A partial order on a set $M$ is a special type of relation. More precisely, we say that the relation $\prec$ on $M \times M$ is a partial ordering on $M$ if it satisfies the following three properties:
R. $m \prec m$ for every $m \in M$.
AS. (antisymmetry) If $m_1 \prec m_2$ and $m_2 \prec m_1$, then $m_1 = m_2$.
T. If $m_1 \prec m_2$ and $m_2 \prec m_3$, then $m_1 \prec m_3$.
To stress that $M$ is partially ordered by $\prec$ we denote a partially ordered set by $(M, \prec)$. Also $(\{a_1, a_2, \ldots\}, \prec)$ denotes that the $a_n$'s are ordered by the relation $\prec$ in the order they are listed. Although the symbol $\prec$ reads "precedes", this terminology by no means assigns to $\prec$ an intuitive meaning. For instance, $Z$ ordered by $0, 1, -1, 2, -2, \ldots$, i.e., integers ordered by increasing magnitude with the positive integers coming first, satisfies the three order properties listed above. Another important example of a partial order is $(P(A), \subseteq)$, namely, the family of all the subsets of a fixed set $A$ partially ordered by inclusion; in this case not all the elements of $P(A)$ are comparable.
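The three properties R, AS and T can be checked mechanically on finite examples. The following sketch uses names of our own choosing (`is_partial_order`, `is_total`, `rank`); it tests inclusion on the subsets of a three-element set, and the ordering $0, 1, -1, 2, -2, \ldots$ restricted to a finite range of integers.

```python
from itertools import combinations


def powerset(A):
    """All subsets of A, as frozensets."""
    items = list(A)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def is_partial_order(M, precedes) -> bool:
    """Check properties R, AS and T of the relation `precedes` on the finite set M."""
    reflexive = all(precedes(m, m) for m in M)
    antisymmetric = all(m1 == m2 for m1 in M for m2 in M
                        if precedes(m1, m2) and precedes(m2, m1))
    transitive = all(precedes(m1, m3) for m1 in M for m2 in M for m3 in M
                     if precedes(m1, m2) and precedes(m2, m3))
    return reflexive and antisymmetric and transitive


def is_total(M, precedes) -> bool:
    """True when any two elements of M are comparable."""
    return all(precedes(m1, m2) or precedes(m2, m1) for m1 in M for m2 in M)


def subset(X, Y) -> bool:
    return X <= Y          # inclusion of frozensets


def rank(k: int) -> int:
    """Position of k in the listing 0, 1, -1, 2, -2, ..."""
    return 2 * k - 1 if k > 0 else -2 * k


def precedes_z(a: int, b: int) -> bool:
    return rank(a) <= rank(b)


P = powerset({1, 2, 3})
Z_part = list(range(-5, 6))
```

On these examples inclusion passes the three tests but is not total, since for instance $\{1\}$ and $\{2\}$ are not comparable, while the listed order on the integers is total.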
If $M_1 \subseteq M$ and $(M, \prec)$ is a partially ordered set, then we may consider $M_1$ as a partially ordered set by simply restricting $\prec$ to $M_1$, i.e., by ordering elements in $M_1$ as they were ordered in $M$. This partial ordering is denoted by $(M_1, \prec |M_1)$. As we saw above, a partial order does not insure that all the elements of a set are comparable. A partially ordered set $(M, \prec)$ is said to be totally ordered (also called linearly ordered or ordered), provided that for every $m \ne m_1$ in $M$, either $m \prec m_1$ or $m_1 \prec m$. $(Z, \le)$ is totally ordered.

In identifying sets according to their order properties, when do we say that $(M, \prec)$ and $(M^*, \prec^*)$ are order equivalent? First $M$ and $M^*$ must be equivalent, and in addition we require that there exists an equivalence function $f: M \to M^*$ which is order preserving, i.e., for all $m_1, m_2$ in $M$,

$$m_1 \prec m_2 \quad \text{iff} \quad f(m_1) \prec^* f(m_2).$$
For example, $(N, \le)$ and $(\{1, 3, \ldots, 2, 4, \ldots\}, \prec)$ are not order equivalent, whereas $(N, \le)$ and $(\{a_1, a_2, \ldots\}, \prec)$ are. Given the ordered sets $(M, \prec)$ and $(M^*, \prec^*)$, we say that they have the same order type provided they are order equivalent. The order type of $(\{1, 2, \ldots, n\}, \le)$ is denoted by $n$, that of $(\{1, 2, \ldots\}, \le)$ by $\omega$, that of $(\{\ldots, 2, 1\}, \ge)$ by $\omega^*$, that of $([0,1], \le)$ by $\lambda$ and that of $([0,1] \cap Q, \le |Q)$ by $\eta$. Although it is apparent that not all order types are comparable (for instance, we cannot compare $\omega$ and $\omega^*$), it still is possible to do some arithmetic with order types. We do addition and multiplication as examples. Given order types $\mu_1$ and $\mu_2$, we define the sum $\mu_1 + \mu_2$ as the order type obtained as follows: Let $(M_1, \prec$
order type $\mu$ corresponding to its order equivalence class. $\prec$ is called the lexicographic order on $M_1 \times M_2$, and the reason for this is apparent. Note that in general $\mu_1 \mu_2 \ne \mu_2 \mu_1$. However, as the reader can verify, multiplication is distributive with respect to addition, i.e., $\mu(\mu_1 + \mu_2) = \mu\mu_1 + \mu\mu_2$.
2. WELL-ORDERED SETS AND ORDINALS

Given a countable set $A$ it is possible to assign to it various nonequivalent order types. For instance, there are countable sets with order type $\omega$, $\omega + \omega$, $\omega^*$, $\omega + \omega^*$, ... In fact, there are $2^{\aleph_0}$ nonequivalent order types corresponding to sets of cardinal $\aleph_0$. Inspired by the basic model $(N, \le)$ we will focus on the theory of well-ordered sets. We need some definitions first.

Given an ordered set $(M, \prec)$, we say that $m \in M$ is the first element of $M$ if $m$ precedes any other element of $M$, i.e., $m \prec m'$ for every $m'$ in $M$. By AS the first element is unique. Similarly, we say that the ordered set $(M, \prec)$ has a last element if there is $m \in M$ which is preceded by any other element of $M$, i.e., $m' \prec m$ for every $m'$ in $M$. For instance, an ordered set with order type $\omega$ has a first, but no last element, and one with the order type $\omega + 1$ has both a first and a last element.

Closely related to these notions are the concepts of lower and upper bound, and minimal and maximal element. Given an ordered set $(M, \prec)$ and a subset $M'$ of $M$, we say that $m \in M$ is a lower bound for $M'$ if $m \prec m'$ for every $m' \in M'$; we say that $m_1$ is an upper bound for $M'$ if $m' \prec m_1$ for every $m'$ in $M'$. For instance, in $(P(A), \subseteq)$, $\emptyset$ is a lower bound and $A$ is an upper bound for any family of $P(A)$. An element $m$ is said to be a minimal element of $M$ iff there is no $m' \in M$, $m' \ne m$, so that $m' \prec m$. Similarly, an element $m \in M$ is said to be maximal iff there is no $m' \in M$, $m' \ne m$, so that $m \prec m'$. For instance, if $M = \{a, b, c\}$ is partially ordered by $a \prec b$ and $a \prec c$, then $a$ is the minimal element and $b$ and $c$ are maximal elements.

Finally, we say that an ordered set $(M, \prec)$ is well-ordered if $M$ and any of its nonempty subsets, ordered by the restriction order, has a first element. The order type of a well-ordered set is called an ordinal number, or plainly an ordinal. For instance, $(N, \le)$ and $(N + N, \le)$ are well-ordered sets with ordinal $\omega$ and $\omega + \omega$, respectively. On the other hand, the order types $\lambda$ and $\eta$ are not ordinals.
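The example $M = \{a, b, c\}$ with $a \prec b$ and $a \prec c$ is small enough to compute with directly. In the sketch below the relation is stored as a set of pairs together with the reflexive pairs required by property R; the helper names are ours, and `None` is returned when no first or last element exists.

```python
M = {"a", "b", "c"}

# The pairs a < b and a < c, plus the reflexive pairs (m, m).
relation = {("a", "b"), ("a", "c")} | {(m, m) for m in M}


def precedes(x, y) -> bool:
    return (x, y) in relation


def minimal_elements(M, precedes) -> set:
    """Elements m with no other element m' preceding m."""
    return {m for m in M
            if not any(precedes(other, m) for other in M if other != m)}


def maximal_elements(M, precedes) -> set:
    """Elements m that precede no other element m'."""
    return {m for m in M
            if not any(precedes(m, other) for other in M if other != m)}


def first_element(M, precedes):
    """The element preceding every element of M, or None if there is none."""
    candidates = [m for m in M if all(precedes(m, other) for other in M)]
    return candidates[0] if candidates else None


def last_element(M, precedes):
    """The element preceded by every element of M, or None if there is none."""
    candidates = [m for m in M if all(precedes(other, m) for other in M)]
    return candidates[0] if candidates else None


minimal = minimal_elements(M, precedes)
maximal = maximal_elements(M, precedes)
first = first_element(M, precedes)
last = last_element(M, precedes)
```

In this example $a$ is at once the only minimal element and the first element, whereas $b$ and $c$ are both maximal and there is no last element, which shows that "maximal" and "last" are different notions when the order is only partial.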
It is also clear that if $M_1 \subseteq M$ and $(M, \prec)$ is well-ordered, then so is $(M_1, \prec |M_1)$. More generally, if $(M, \prec)$ is well-ordered and if $A$ is equivalent to a subset $M_1$ of $M$, then $A$ can also be well-ordered. To see this let $f: M_1 \to A$ be an equivalence function and rewrite $A = \{a_m\}$ where $a_m = f(m)$, $m \in M_1$. It is then readily seen that the relation $\prec^*$ on $A \times A$ given by $a_m \prec^* a_{m'}$ iff $m \prec m'$ is an order relation on $A$, and that $(A, \prec^*)$ is well-ordered.

Although $\eta$ is not an ordinal, by the preceding discussion $Q \cap (0,1)$, or any other countable set for that matter, can be well-ordered by means of an order induced by $(N, \le)$. The next step is, then, to consider whether arbitrary sets may be well-ordered. In the early 1900's Zermelo showed that this was the case provided we assume the validity of the so-called Axiom of Choice. Again, as in the case of the rationals, if the set under consideration is already ordered, the well-ordering that Zermelo's theorem induces does not, in general, coincide with the existing order. We discuss this in more detail later, but first we consider the Axiom of Choice. It was introduced by Zermelo in 1904 and, in one of its many equivalent formulations, it states:
Axiom of Choice. Given an arbitrary family $\mathcal{A} = \{A_i: i \in I\}$ of nonempty sets indexed by (the nonempty) set $I$, there exists a function $f: I \to \bigcup_{i \in I} A_i$, called a choice or selection function, such that for each $i \in I$, $f(i)$ is an element of $A_i$.

Where does the Axiom of Choice stand in comparison to the more familiar (Zermelo-Fraenkel) axioms of set theory? Gödel (1906-1978) established the fact that the Axiom of Choice and the generalized continuum hypothesis, which we do not discuss here, are consistent with the remaining axioms of set theory, provided that these axioms are consistent themselves. On the other hand, some 60 years after its introduction, P.J. Cohen showed that the Axiom of Choice can neither be proved nor refuted from the usual axioms of set theory. The Axiom of Choice is thus a new principle of set formation.

Zermelo's theorem is actually equivalent to the Axiom of Choice. We leave to the reader to prove that the Axiom of Choice implies Zermelo's theorem, cf. 5.14 below. Conversely, suppose that any set can be well-ordered and let $\mathcal{A} = \{A_i: i \in I\}$ be an arbitrary family of nonempty sets indexed by $I$. Well-order $\bigcup_{i \in I} A_i$ and observe that by choosing the first element $a_i$, say, in each nonempty subset $A_i$ of $\bigcup_{i \in I} A_i$ we get the required selection function $f$ by simply putting $f(i) = a_i \in A_i$.

The Axiom of Choice plays an important role in Analysis. As we shall have the opportunity to discover as we go along, in addition to the results
discussed in the next section, the existence of a non Lebesgue-measurable set, the existence of a maximal orthonormal set in a Hilbert space and the proof of the Hahn-Banach Theorem, all rely on the Axiom of Choice.

In applications we often prefer to work with an equivalent principle, namely Zorn's Lemma, because it only requires that we deal with partially ordered sets. To state Zorn's Lemma we need a definition. We say that a subset $A$ of a partially ordered set $(M, \prec)$ is a chain in $M$ if $(A, \prec |A)$ is totally ordered. For instance, if $M = \{a, b\}$, then $A = \{\emptyset, \{a\}\}$ and $B = \{\emptyset, \{a\}, M\}$ are chains in $(P(M), \subseteq)$. The stage is now set for

Zorn's Lemma. A partially ordered set $(M, \prec)$ has a maximal element if every chain $A$ in $M$ has an upper bound.

As we stated above, Zorn's Lemma is equivalent to the Axiom of Choice. To illustrate a typical use of Zorn's Lemma we show that it actually implies the Axiom of Choice. Suppose, then, that $\mathcal{A} = \{A_i: i \in I\}$ is a family of nonempty sets indexed by a nonempty set $I$; we must find a selection function. Let $\mathcal{F} = \{f\}$ be the collection of those functions $f$ which satisfy the following three properties:
1. Domain of $f \subseteq I$.
2. Range of $f \subseteq \bigcup_{i \in I} A_i$.
3. $f(i)$ is an element of $A_i$ for each $i$ in the domain of $f$.
It is readily seen that $\mathcal{F} \ne \emptyset$. Indeed, choose $i_0$ in the nonempty set $I$, and since $A_{i_0}$ is nonempty, pick an element $a_{i_0} \in A_{i_0}$. Then, the function $f: \{i_0\} \to A_{i_0}$ given by $f(i_0) = a_{i_0}$ is in $\mathcal{F}$.

$\mathcal{F}$ may be partially ordered as follows: Given $f, g$ in $\mathcal{F}$, we say that $f \prec g$ iff $J$ = domain of $f \subseteq$ domain of $g$ and $g|J = f$. Next we check that any chain $C = \{g\}$ in $\mathcal{F}$ has an upper bound. Indeed, let $J = \bigcup_{g \in C}$ (domain of $g$) and $A = \bigcup_{g \in C}$ (range of $g$); by assumption $J \subseteq I$ and $A \subseteq \bigcup_{i \in I} A_i$. A candidate for an upper bound is the function $f: J \to A$ defined as follows: If $i \in J$, $i$ is in the domain of some $g$ in $C$; then put $f(i) = g(i) \in A_i$. There may be more than one $g$ in $C$ so that $i \in$ (domain of $g$), but since $C$ is a chain it readily follows that all functions $g$ in $C$ for which $g(i)$ is defined assume a common value at $i$, and consequently, $f$ is well-defined. It is now clear that $f \in \mathcal{F}$ and that $g \prec f$ for every $g$ in $C$. In other words, $f$ is an upper bound for $C$, and every chain in $\mathcal{F}$ has an upper bound.

By Zorn's Lemma $\mathcal{F}$ has a maximal element $f$, say. We claim that domain of $f$ = $I$, and if this is the case then clearly $f$ is a choice function. Suppose that the domain of $f$ is strictly contained in $I$ and let $i \in I \setminus$ (domain of $f$). Then since
$A_i \ne \emptyset$, we may pick $a_i \in A_i$, and the function $g$ defined on (domain of $f$) $\cup \{i\}$ by $g|$(domain of $f$) $= f$ and $g(i) = a_i$ satisfies properties 1, 2, and 3 listed above and $f \prec g$, thus contradicting the maximality of $f$ in $\mathcal{F}$. Whence the domain of $f$ is $I$, $f$ is a selection function and the Axiom of Choice follows.
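The extension step at the end of this argument can be imitated for a finite family, where of course no appeal to Zorn's Lemma or to the Axiom of Choice is needed. The sketch below is only such a finite illustration with names of our own: partial choice functions are dictionaries, `extend` performs the one-point extension, and a function that cannot be extended is maximal.

```python
def extend(f: dict, family: dict):
    """Return a one-point extension g of the partial choice function f.

    g agrees with f on the domain of f and picks an element of A_i at one
    new index i. Returns None when f is already defined on every index,
    i.e., when f is maximal with respect to extension.
    """
    for i, A_i in family.items():
        if i not in f:
            g = dict(f)
            g[i] = min(A_i)      # any element of the nonempty set A_i would do
            return g
    return None


def maximal_choice_function(family: dict) -> dict:
    """Extend the empty partial function until no proper extension exists."""
    f = {}
    while True:
        g = extend(f, family)
        if g is None:
            return f
        f = g


# A hypothetical finite family {A_i : i in I} of nonempty sets.
family = {"i": {3, 1}, "j": {7}, "k": {2, 5}}
choice = maximal_choice_function(family)
```

As in the proof, the maximal element has all of $I$ as its domain: if an index were missing, `extend` would produce a proper extension.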
3. APPLICATIONS OF ZORN'S LEMMA

It is readily seen that any continuous function $f: R \to R$ which satisfies $f(x + y) = f(x) + f(y)$ is of the form $f(x) = cx$, for some real number $c$ ($= f(1)$). Are there, necessarily discontinuous, functions $f(x) \ne cx$ which verify the same functional relation? Yes, there are, and to construct such functions we need some definitions.

We say that a nonempty subset $M$ of $R$ is l.i. (linearly independent) over $Q$ if every finite subset $\{x_1, \ldots, x_n\}$ of $M$ is l.i. over $Q$, i.e., there is no nontrivial linear combination $r_1 x_1 + \ldots + r_n x_n = 0$, $r_i \in Q$. A Hamel basis $H = \{x_i\}$ for $R$ is a maximal l.i. set. More precisely, $H$ is a set of nonzero real numbers which satisfies the following three properties:
1. $H$ is l.i. over $Q$.
2. Every $x$ in $R$ can be written as a finite linear combination, with rational coefficients, of the $x_i$'s.
3. If $H \subset H'$, then $H'$ is not l.i. over $Q$.
Suppose we have a Hamel basis $H$ for $R$ and observe that for each $x$ in $R$, the decomposition given in 2 above is unique. We may then construct a function $f$ as follows: If $x = r_1 x_1 + \ldots + r_n x_n$, $x_1, \ldots, x_n$ in $H$, put

$$f(x) = r_1 f(x_1) + \ldots + r_n f(x_n).$$
Since the decomposition for each $x$ is unique, $f$ is well-defined and it satisfies the relation $f(x + y) = f(x) + f(y)$ for all $x, y$ in $R$. Since $H$ has at least two elements $x_1$ and $x_2$, say, we may set $f(x_1) = 1$, $f(x_2) = 0$ and take arbitrary values for the other $x_i$'s in $H$. Furthermore, since $x_2 \ne 0$, $f$ cannot be of the form $f(x) = cx$ unless $c = 0$. But in this case we have $1 = f(x_1) = cx_1 = 0$, which is a contradiction. Thus $f$ is not linear and our construction will be accomplished once we exhibit a Hamel basis for $R$. It is at this juncture that we invoke Zorn's Lemma.
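A Hamel basis for $R$ cannot be written down explicitly, but the construction can be imitated on a small piece of $R$. The sketch below is an illustration under an assumption of ours: it works only on the rational span of the l.i. pair $x_1 = 1$, $x_2 = \sqrt{2}$, and stores the number $r_1 + r_2\sqrt{2}$ as the pair of rational coefficients $(r_1, r_2)$.

```python
from fractions import Fraction

# Elements of the rational span of {1, sqrt(2)} are stored as coefficient
# pairs (r1, r2), standing for the real number r1 * 1 + r2 * sqrt(2).
ONE = (Fraction(1), Fraction(0))
SQRT2 = (Fraction(0), Fraction(1))


def add(x, y):
    """Sum of two elements of the span, computed coefficientwise."""
    return (x[0] + y[0], x[1] + y[1])


def f(x):
    """The additive function determined by f(1) = 1 and f(sqrt(2)) = 0."""
    return x[0] * 1 + x[1] * 0


def to_float(x) -> float:
    """Approximate real value of the element, for comparison with f."""
    return float(x[0]) + float(x[1]) * 2 ** 0.5


x = (Fraction(1, 2), Fraction(3))      # 1/2 + 3 sqrt(2)
y = (Fraction(-2), Fraction(1, 3))     # -2 + (1/3) sqrt(2)

additive = f(add(x, y)) == f(x) + f(y)
```

On this span $f$ is additive, and $f(1) = 1$ would force $c = 1$ in $f(x) = cx$, yet $f(\sqrt{2}) = 0 \ne \sqrt{2}$; this is the same contradiction as in the text, with the uniqueness of the coefficient pair playing the role of the uniqueness of the decomposition.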
Proposition 3.1. There is a Hamel basis for $R$.
Proof. The class $\mathcal{M}$ of those subsets of $R$ which are l.i. over $Q$ is nonempty since it contains $\{r\}$ for any nonzero real number $r$. Let $C$ be a chain in the partially ordered set $(\mathcal{M}, \subseteq)$; we check that $C$ has an upper bound.
This is not hard to do since the union of all the sets in $C$ is an element of $\mathcal{M}$ which is an upper bound. By Zorn's Lemma, $\mathcal{M}$ has a maximal element $M$, say. It is readily seen that $M$ is a Hamel basis for $R$. Indeed, let $Y$ denote the span of $M$ in $R$. If $Y = R$ we are done, otherwise let $r \in R \setminus Y$ and observe that the set $M \cup \{r\}$ is l.i. over $Q$ and it properly contains $M$, contrary to the latter's maximality. Therefore, $H = M$ is a Hamel basis for $R$. •

We close this section with two interesting results that also follow from Zorn's Lemma, namely that cardinals are comparable, and that ordinals are comparable, by the relation less than or equal to.
Theorem 3.2. Any two cardinal numbers are comparable.
Proof. Let $a, b$ be nonzero cardinals and let $A$ and $B$ be (nonempty) sets so that card $A = a$ and card $B = b$. Furthermore let $\mathcal{F}$ be the family of functions $f$ defined from a subset $A_0$ of $A$, into a subset $B_0$ of $B$, which are one-to-one and onto. Given $x \in A$ and $y \in B$, $f(x) = y$ establishes an equivalence between $A_0 = \{x\}$ and $B_0 = \{y\}$, and $\mathcal{F} \ne \emptyset$. Also $\mathcal{F}$ may be partially ordered by the usual extension relation between functions. Next we check that any chain $C$ in $\mathcal{F}$ has an upper bound. Let $X = \bigcup$ (domain of $f$) $\subseteq A$, $Y = \bigcup$ (range of $f$) $\subseteq B$, where the unions are taken over those $f$'s in $C$. Since it is readily seen that the function $g: X \to Y$ given by $g|$(domain of $f$) $= f$ is an upper bound for $C$ (here we use the fact that $C$ is a chain to insure that $g$ is well-defined), by Zorn's Lemma $\mathcal{F}$ has a maximal element $f: X \to Y$, say. There are four possibilities, to wit: (i) $X = A$ and $Y = B$, in this case $a = b$, (ii) $X = A$ and $Y \subset B$, then $a \le b$, (iii) $X \subset A$ and $Y = B$, then $b \le a$, and finally (iv) $X \subset A$ and $Y \subset B$, which we show cannot occur. For, if case (iv) holds, then there are elements $x \in A \setminus X$ and $y \in B \setminus Y$, and the function $g: X \cup \{x\} \to Y \cup \{y\}$ given by $g|X = f$ and $g(x) = y$ is in $\mathcal{F}$, and it properly extends $f$, contradicting its maximality. •

In order to prove our next result we need a concept that will enable us to work with the first, or initial, elements of a well-ordered set. Given a well-ordered set $(M, \prec)$ and $m$ in $M$, let

$$I_m = \{m' \in M: m' \prec m,\ m' \ne m\};$$

$I_m$ is called the initial segment of $M$ corresponding to $m$. An important property that initial segments satisfy is
Proposition 3.3. No nonempty well-ordered set is order equivalent to any of its initial segments.

Proof. Let $(M, \prec)$ be a well-ordered set, suppose there is an element $m$ in $M$ so that $M$ is order equivalent to $I_m$, and let $f: M \to I_m$ be a function which establishes the order equivalence. Then $f(m) \in I_m$ and by the definition of $I_m$, $m_1 = f(m) \prec m = m_0$. We iterate this step and by setting $m_k = f(m_{k-1})$, $k = 1, 2, \ldots$, we obtain a sequence of elements of $M$ such that for all $k$, $m_k \prec m_{k-1} \prec \ldots \prec m_0$. Thus the subset of $M$ consisting precisely of the $m_k$'s has no first element, contradicting the fact that $M$ is well-ordered. Whence $M$ is not order equivalent to any $I_m$, as we wanted to show. •

We are now ready to compare ordinal numbers. Given ordinal numbers $a$ and $b$, we say that $a$ precedes $b$, and we write $a \le b$, provided the following holds: If $A, B$ are sets with ordinal number $a$ and $b$ respectively, then $A$ is order equivalent to $B$ or an initial segment of $B$. If $A$ is order equivalent to an initial segment of $B$ we write $a < b$. Also if $a \le b$ and $b \le a$, then $a = b$. The stage is now set for
Theorem 3.4. Any two ordinal numbers are comparable.
Proof. Let $a, b$ be nonzero ordinal numbers and let $(A, \prec)$ and $(B, \prec^*)$ be well-ordered sets with ordinal $a$ and $b$ respectively. Inspired by Theorem 3.2, let $\mathcal{F}$ denote the family of all order preserving mappings which establish an order equivalence between an initial segment of $A$, or $A$ itself, onto an initial segment of $B$ or $B$. It is clear that $\mathcal{F} \ne \emptyset$. Indeed, if $x$ is the first element of $A$ and $y$ is the first element of $B$, then the function $f: \{x\} \to \{y\}$ defined by $f(x) = y$ is in $\mathcal{F}$. As in the proof of Theorem 3.2 it is readily seen that when $\mathcal{F}$ is partially ordered by extension, any chain $C$ in $\mathcal{F}$ has an upper bound and consequently, $\mathcal{F}$ has a maximal element $f$, say. If $f: A_1 \to B_1$ is an order equivalence and $A_1, B_1$ are initial segments of $A$ and $B$, and if $A_1 \subset A$ and $B_1 \subset B$, then let $x, y$ be the first elements of $A \setminus A_1$ and $B \setminus B_1$ respectively. Then the mapping $g: A_1 \cup \{x\} \to B_1 \cup \{y\}$ defined by $g|A_1 = f$ and $g(x) = y$ is an order equivalence of an initial segment of $A$, or $A$, onto an initial segment of $B$, or $B$, which contradicts the maximality of $f$. Thus, either $A_1 = A$ or $B_1 = B$, or both. In the first case $a \le b$ and actually by Proposition 3.3 $a < b$; in the second case $b < a$ and in the last case $a \le b$ and $b \le a$, i.e., $a = b$. •
4. THE CONTINUUM HYPOTHESIS

We close this chapter with the discussion of a particular set $A$, namely one that is well-ordered, uncountable, and such that all its initial segments are at most countable. That uncountable well-ordered sets exist follows from Zermelo's theorem, so let $(X, \prec)$ be one such set. If $X$ has all the desired properties we are finished, otherwise the set $X_0 = \{x \in X: I_x \text{ is uncountable}\} \ne \emptyset$; let $x_0$ be the first element of $X_0$. Then $I_{x_0}$ is well-ordered, uncountable, and for every $x \prec x_0$, $x \ne x_0$, $I_x$ is at most countable. This is precisely the set $A$ we wanted to construct. Also observe that if $B$ is another set with similar properties, then clearly $B$ is not order equivalent to an initial segment of $A$ (since they are all at most countable), nor for the same reason is $A$ order equivalent to any initial segment of $B$. Thus by Theorem 3.4 the ordinal of $A$ is equal to that of $B$, and $A$ is unique.

The ordinal of $A$ is special, it is denoted by $\Omega$ and it is referred to as the first uncountable ordinal. The cardinal of $A$ is denoted by $\aleph_1$, and since $R$ is uncountable it is clear that $\aleph_1 \le c$. The question is whether $\aleph_1 = c$ or not. The Continuum Hypothesis was formulated by Cantor and it asserts that $\aleph_1 = c$. In 1900, Hilbert (1862-1943) included this hypothesis as the first item in a list of open problems that he presented to the Second International Congress of Mathematicians in Paris. In 1939 Gödel showed that the Continuum Hypothesis is consistent with the axioms of set theory, that is, using the usual axioms of set theory, including the Axiom of Choice, the Continuum Hypothesis cannot be proved false. On the other hand, in 1963 P.J. Cohen showed that the Continuum Hypothesis is independent of these axioms; this means that from these axioms one cannot prove the Continuum Hypothesis true.
5. PROBLEMS AND QUESTIONS

5.1 By means of an example show that not every AS relation may be extended to an order.
5.2 Let $P = \{p\}$ be the family of all real polynomials and let $R$ be the following relation on $P \times P$: $p_1 R p_2$ iff there is a real number $x_0$ such that $p_1(x) \le p_2(x)$ for any $x \ge x_0$. Is $R$ a partial order on $P$? Is it a total order? Is it a well-ordering?
5.3 Let $I = [0,1]$, $C(I) = \{f: I \to R: f \text{ is continuous}\}$, and let $R$ be the following relation on $C(I) \times C(I)$: $f_1 R f_2$ iff $f_1(x) \le f_2(x)$ for every $x$ in $I$. Is $R$ a partial order on $C(I)$? Is it a total order? Is it a well-ordering?
5.4 Let $\mathcal{A} = \{\prec\}$ be the collection of all partial orders on a given set $M$. $\mathcal{A}$ can be partially ordered by "extension" of orders; let $(\mathcal{A}, \le)$ denote this partially ordered set. Show that $\prec^*$ is a maximal element of $(\mathcal{A}, \le)$ iff $(M, \prec^*)$ is totally ordered.
5.5 Let $(M, \prec)$ be totally ordered. Show that $(M, \prec)$ is well-ordered iff there exists no infinite strictly decreasing sequence in $M$.
5.6 Let $(M, \prec)$ be totally ordered. Show that if every initial segment $I_m$ is well-ordered, then $M$ itself is well-ordered.
5.7 Suppose every countable subset of a totally ordered set $(M, \prec)$ is well-ordered, does it follow that $(M, \prec)$ is well-ordered?
5.8 Is the lexicographic order on $N \times N$ a well-ordering? How about the lexicographic order on $N \times R$?
5.9 Suppose $(M, \prec)$ is totally ordered and introduce a relation $R$ on $(M \times M) \times (M \times M)$ as follows: $(m, m_1) R (m', m_1')$ iff (i) $\max(m, m_1) \prec \max(m', m_1')$, or (ii) $\max(m, m_1) = \max(m', m_1')$ and $m \prec m'$, or (iii) $\max(m, m_1) = \max(m', m_1')$, $m = m'$, and $m_1 \prec m_1'$. Is $R$ a total order? If $(M, \prec)$ is well-ordered, is $(M \times M, R)$ well-ordered?
5.10 (Principle of Transfinite Induction) Given a nonempty well-ordered set $(M, \prec)$, let $P(m)$ be a statement which is formulated for each $m$ in $M$. Further suppose that (i) $P(m)$ is true for the first element of $M$, (ii) If $P(m)$ is true for each $m \prec m^*$, then $P(m^*)$ is also true. Show that $P(m)$ is true for every $m \in M$.
5.11 Suppose $M_1$ is a subset of a well-ordered set $M$ and that the following holds: If $m \in M$ and the initial segment $I_m \subseteq M_1$, then $m \in M_1$. Show that $M_1 = M$.
5.12 An interesting application of transfinite induction is to show the various properties concerning operations with order types. For instance, prove that addition is associative and that multiplication is left distributive with respect to addition.
5.13 Does the conclusion of the Principle of Transfinite Induction hold if, instead of (ii) there, we assume: (ii') If $P(m)$ is true, then $P(m + 1)$ is also true?

5.14 Prove that the Axiom of Choice implies Zermelo's theorem.
5.
Problems and Questions
5.15 Let $\mathcal{A}$ be a family of nonempty sets of nonnegative integers. Can you construct, without invoking the Axiom of Choice, a selection function on $\mathcal{A}$?

5.16 Use the Axiom of Choice to prove the following result: A mapping $f\colon A \to B$ is onto iff there is a mapping $g\colon B \to A$ such that $f \circ g = 1$ (identity on $B$). Moreover, any such function $g$ is one-to-one.

5.17 Use Zorn's Lemma to show that for every infinite cardinal $a$, $a + a = aa = a$.
5.18 Show that any two Hamel bases for $\mathbb{R}$ have the same cardinality.

5.19 Prove that Zermelo's theorem implies Zorn's Lemma.

5.20 Prove that every set of cardinals is well-ordered.

5.21 Given an ordinal $a$, let $W_a = \{\text{ordinals } b: b < a\}$. Show that $W_a$ is well-ordered and find its ordinal.

5.22 Is $\omega + \omega = \omega 2$? Is $\omega + \omega = 2\omega$?
5.23 Discuss which of the following statements is true: Given ordinals $a, b, d$, (i) If $a + b = a + d$, then $b = d$, (ii) If $b + a = d + a$, then $b = d$, (iii) If $b \ne 0$, then $a < a + b$, and, (iv) If $b \ne 0$, then $b < a + b$.

5.24 Show that given an ordinal a, there exists no ordinal b such that a
< b, show that there exists a unique ordinal
5.26 Show that every set of ordinals is well-ordered.

5.27 Prove that the Axiom of Choice is equivalent to the following principle: Let $\mathcal{A}$ be a family of pairwise disjoint nonempty sets. Then there exists a set $C$ such that $A \cap C$ contains exactly one element, for each $A \in \mathcal{A}$.

5.28 Suppose a discontinuous function $f\colon \mathbb{R} \to \mathbb{R}$ satisfies $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$. Show that $f^{-1}(\{0\})$ is dense in $\mathbb{R}$.

5.29 Show that there exists no infinite cardinal $a$ such that $\aleph_0 < a < \aleph_1$.
5.30 Prove that the set $A$ constructed in Section 4 has the following three properties: (i) $A$ has no largest element, (ii) For every ordinal $x_0 < \Omega$, the set $\{x: x_0 < x < \Omega\}$ is uncountable, and, (iii) Let $A_0$ denote the subset of $A$ consisting of all elements $x$ such that $x$ has no immediate predecessor, i.e., for no $y \in A$ we have $x = y + 1$. Then, $A_0$ is uncountable.
5.31 Can a subset $S$ of real numbers ordered by the order induced by $(\mathbb{R}, \le)$ be (order) equivalent to the set of all countable ordinals?

5.32 Let $\Omega$ be the first uncountable ordinal and suppose that $f$ is a real-valued monotone increasing function defined for all ordinals $a < \Omega$. Show that $f$ is bounded, and that in fact it is eventually constant.

5.33 Let $A$ be a set with ordinal $\Omega$. Show that every countable subset of $A$ has an upper bound.
5.34 (The Burali-Forti Paradox) For every infinite set A of ordinals there is an ordinal number that exceeds every ordinal in A. Can we then consider the set of all ordinals?
CHAPTER III

The Riemann-Stieltjes Integral

In this chapter we introduce the Riemann-Stieltjes integral and study some of its basic properties. We also describe a natural class of functions related to this notion of integral, namely the class BV of functions of bounded variation.
1. FUNCTIONS OF BOUNDED VARIATION

What do questions in Physics which involve mass distributions that are partly discrete and partly continuous, problems in Probability that consider continuous and discrete random variables simultaneously, and the determination of the arc length of a curve have in common? They all involve the Riemann-Stieltjes integral. It was the "moment problem", that is, the problem of finding a distribution of mass whose moments of all orders are known, that led Stieltjes (1856-1894) to introduce this integral, and in the process to give respectability to discontinuous functions. We begin by discussing the question of finding the arc length of a plane curve $\Gamma\colon I = [a,b] \to \mathbb{R}^2$ with graph $\{(x, y): x = \phi(t),\ y = \psi(t),\ t \in I\}$. Given a partition $\mathcal{P} = \{a = t_0 < t_1 < \cdots < t_n = b\}$ of $I$, we consider the sum

$$L(\Gamma, \mathcal{P}) = \sum_{j=1}^{n} \Big( (\phi(t_j) - \phi(t_{j-1}))^2 + (\psi(t_j) - \psi(t_{j-1}))^2 \Big)^{1/2}, \tag{1.1}$$
which corresponds to the length of the polygonal line. The next step is to take finer partitions of $I$ and to define the length $L(\Gamma)$ of $\Gamma$ as

$$L(\Gamma) = \lim L(\Gamma, \mathcal{P}),$$

where the limit is taken as the norm of the partition, i.e., $\sup_{1 \le j \le n} (t_j - t_{j-1})$, goes to 0. Rectifiable curves are those $\Gamma$'s for which $L(\Gamma) < \infty$.

There are many physical quantities whose calculation takes a similar form. For instance, to compute the mass of a slab of unit thickness we must consider expressions of the form $\sum_j g(x_j)(A(x_j) - A(x_{j-1}))$, where $g$ denotes density and $A$ denotes area, or $\sum_j g(x_j)(V(x_j) - V(x_{j-1}))$, where $V$ denotes volume, and their limits as the norm of the partition goes to 0. So, in general, we are led to consider, for arbitrary functions $f, g$ defined on $I$, expressions of the form

$$\sum_{j=1}^{n} g(\eta_j)\big(f(t_j) - f(t_{j-1})\big), \qquad \eta_j \in [t_{j-1}, t_j], \tag{1.2}$$
and their limits as the norm of the partition goes to 0. If this limit is to exist we expect that it will satisfy properties similar to those of the Riemann integral, which corresponds to the choice $f(t) = t$ in (1.2). It then becomes quickly apparent that step functions, i.e., those $g$'s of the form $\sum_{j=1}^{n} c_j \chi_j$, where the $c_j$'s are arbitrary real constants and the $\chi_j$'s are the characteristic functions of the intervals of a partition $\{I_j\}$ of $I$ consisting of nonoverlapping closed intervals (their interiors are disjoint), should be incorporated into the theory. In particular, if the $c_j$'s are $\pm 1$, we note that not only is each step function Riemann integrable, but also that
(1.3)

A moment's thought will convince the reader that an appropriate general version of (1.3), now with a constant on the right-hand side that also depends on $f$, should hold as well. Thus, with the familiar notation

$$\operatorname{sgn} x = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0, \end{cases}$$

if we put $g = \sum_j c_j \chi_j$, where $c_j = \operatorname{sgn}(f(t_j) - f(t_{j-1})) = \pm 1$ (or 0, but then the term is unimportant) in (1.2) above, and if $c = c_{I,f}$ denotes the uniform bound in the estimate analogous to (1.3), it follows that

$$\sum_{j=1}^{n} |f(t_j) - f(t_{j-1})| \le c, \qquad \text{all partitions } \mathcal{P}. \tag{1.4}$$
Functions that satisfy (1.4) are said to be of bounded variation, and the class of such functions is denoted by BV. These functions were discovered by Jordan (1838-1921) in 1881 while working through Dirichlet's proof concerning the convergence of Fourier series. Before we continue with the discussion of the integral we investigate what BV functions are like. A useful notation in what follows is this: For a BV function $f$ defined on $I = [a,b]$ and a subinterval $J = [c,d]$ of $I$, put $V(f; c, d) =$ sup of the left-hand side of (1.4), where the sup is taken over all finite partitions $\mathcal{P}$ of $J$. So, BV functions $f$ on $I$ are precisely those with $V(f; a, b) < \infty$.

What are some BV functions? Monotone functions $f$ are BV and it is readily seen that $V(f; a, b) \le |f(b) - f(a)|$. On the other hand, there are continuous functions which are not BV. Rather than looking for a function in our repertoire, cf. 4.12 below, we construct one. Let $\{a_k\}, \{d_k\}$ be two sequences in $(0,1]$, $a_1 = 1$, which decrease to 0, and suppose that $\sum d_k$ diverges. Define $f$ as follows: On each subinterval $[a_{k+1}, a_k]$ the graph of $f$ is the isosceles triangle with base $[a_{k+1}, a_k]$ and height $d_k$. Thus $f(a_k) = 0$ and $f((a_k + a_{k+1})/2) = d_k$. If we are careful to set $f(0) = 0$, $f$ is a continuous function and

$$V(f; 0, 1) \ge \sum_{k=1}^{n} d_k \to \infty, \qquad \text{as } n \to \infty.$$
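The growth of the variation sums for such a sawtooth function can be checked directly. The sketch below is only an illustration under assumed choices that are not in the text: it takes $a_k = 1/k$ and $d_k = 1/k$, and the helper names `sawtooth_vertices`, `variation_sum` and `partial_variation` are hypothetical. It evaluates the left-hand side of (1.4) on the partition through the vertices of the first $n$ triangles, using exact rational arithmetic.

```python
from fractions import Fraction

def sawtooth_vertices(n):
    """Vertices of the first n triangles of the sawtooth, as (x, f(x)) pairs.

    Assumed sample choices: a_k = 1/k and d_k = 1/k, so that sum d_k diverges.
    Triangle k has base [a_{k+1}, a_k] and apex (midpoint of the base, d_k).
    """
    points = [(Fraction(0), Fraction(0))]                    # f(0) = 0
    for k in range(1, n + 1):
        a_k, a_next = Fraction(1, k), Fraction(1, k + 1)
        points.append((a_k, Fraction(0)))                    # f(a_k) = 0
        points.append(((a_k + a_next) / 2, Fraction(1, k)))  # apex, height d_k
    points.append((Fraction(1, n + 1), Fraction(0)))         # f(a_{n+1}) = 0
    return sorted(points)

def variation_sum(values):
    """Left-hand side of (1.4) for consecutive function values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def partial_variation(n):
    """Variation sum of f over the partition through the first n triangles."""
    return variation_sum([y for _, y in sawtooth_vertices(n)])

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, float(partial_variation(n)))
```

Each triangle contributes its height twice (once going up, once coming down), so with these choices the sum over the first $n$ triangles is twice the $n$-th harmonic number, which is unbounded.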
Nevertheless, as we shall see below, the continuity of a BV function is reflected by that of its "variation function" $V(x) = V(f; a, x)$, $a \le x \le b$. Some of the properties of $V$ are readily obtained. For instance, $V(a) = 0$, $V$ is nondecreasing on $I$,

$$|f(x + h) - f(x)| \le V(f; x, x + h), \qquad a \le x < x + h \le b, \tag{1.5}$$

and

$$V(x) + V(f; x, x + h) = V(x + h), \qquad a \le x < x + h \le b. \tag{1.6}$$
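We sketch a proof of the identity (1.6). Given $\varepsilon > 0$, let $\mathcal{P}$ be a partition of $[a, x]$ such that

$$s = \sum_{\text{over } \mathcal{P}} |f(x_j) - f(x_{j-1})| > V(f; a, x) - \varepsilon/2,$$

and let $\mathcal{P}_1$ be a partition of $[x, x + h]$ such that

$$s_1 = \sum_{\text{over } \mathcal{P}_1} |f(x_j) - f(x_{j-1})| > V(f; x, x + h) - \varepsilon/2.$$

Since $\mathcal{P}_2 = \mathcal{P} \cup \mathcal{P}_1$, i.e., the partition consisting of the points in $\mathcal{P}$ or $\mathcal{P}_1$, is a partition of $[a, x + h]$ it follows that

$$V(f; a, x) + V(f; x, x + h) \le s + \varepsilon/2 + s_1 + \varepsilon/2 \le \sum_{\text{over } \mathcal{P}_2} |f(x_j) - f(x_{j-1})| + \varepsilon \le V(f; a, x + h) + \varepsilon. \tag{1.7}$$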
Since $\varepsilon > 0$ is arbitrary, (1.7) gives one inequality ($\le$) in (1.6). In order to show the opposite inequality, given $\varepsilon > 0$, let $\mathcal{P}$ be a partition of $[a, x + h]$ with the property that $V(f; a, x + h) \le s + \varepsilon$, where $s = \sum_{\text{over } \mathcal{P}} |f(x_j) - f(x_{j-1})|$. Clearly we may assume that $x$ is a point in $\mathcal{P}$, since $s$ can only increase when we add points to the partition. But then, breaking $\mathcal{P}$ up into the partitions $\mathcal{P}_1 = \{a = x_0 < \cdots < x_n = x\}$ and $\mathcal{P}_2 = \{x = x_n < \cdots < x_m = x + h\}$, it readily follows that

$$V(f; a, x + h) \le s + \varepsilon \le V(f; a, x) + V(f; x, x + h) + \varepsilon. \tag{1.8}$$

Again, since $\varepsilon > 0$ is arbitrary, (1.8) gives the other inequality ($\ge$) in (1.6) and equality holds there, as asserted. We are now ready to prove
Proposition 1.1. Suppose $f$ is BV on $I = [a,b]$ and let $x \in [a,b)$. Then $f$ is right-continuous at $x$ iff $V$ is right-continuous at $x$.

Proof. From the inequality (1.5) and the relation (1.6) it follows that if $V$ is right-continuous at $x$, so is $f$. Conversely, suppose that $f$ is right-continuous at $x$ and observe that since $V$ is nondecreasing, $\lim_{h \to 0^+} V(x + h) = L$ exists, cf. 4.3 below. Suppose that

$$L - V(x) = \eta > 0, \tag{1.9}$$

and choose $\delta > 0$ so that $x + \delta < b$ and

$$|f(y) - f(x)| < \eta/2, \qquad 0 < y - x < \delta. \tag{1.10}$$

From (1.9), (1.6) and the monotonicity of $V$ it readily follows that $V(x + \delta) - V(x) = V(f; x, x + \delta) > \eta$, and consequently, there is a partition $\mathcal{P}_0 = \{x = x_0 < x_1 = x + \delta_1 < x_2 < \cdots < x_n = x + \delta\}$ of $[x, x + \delta]$ for which also $\sum_{j=1}^{n-1} |f(x_{j+1}) - f(x_j)| > \eta$. Now, as before, by (1.10), there exists a partition $\mathcal{P}_1$ of $[x, x + \delta_1]$ consisting of more than one intermediate point such that

$$\sum_{\text{over } \mathcal{P}_1} |f(x_{j+1}) - f(x_j)| > \eta.$$
Therefore combining $\mathcal{P}_0$ and $\mathcal{P}_1$ we get a partition $\mathcal{P}$ of $[x, x + \delta]$ so that $\sum_{\text{over } \mathcal{P}} |f(x_{j+1}) - f(x_j)| > 2\eta$. We may now repeat this procedure, i.e., subdivide the first interval in each partition and thus obtain, for any given $k$, a partition $\mathcal{P}$ of $[x, x + \delta]$ so that

$$\sum_{\text{over } \mathcal{P}} |f(x_{j+1}) - f(x_j)| > k\eta.$$

But this implies that $f$ is not of bounded variation over $I$, which is a contradiction. In other words, $\eta = 0$, $V$ is right-continuous at $x$, and we have finished. ∎

A similar, yet simpler, argument shows that for $x$ in $(a,b]$, $f$ is left-continuous at $x$ iff $V$ is left-continuous at $x$. Thus combining these results we get that for $x \in (a,b)$, $f$ is continuous at $x$ iff $V$ is continuous at $x$. To complete the description of BV functions we note the following properties of $V$.

Proposition 1.2. Let $f$ be BV on $I$. Then $V \pm f$ are nondecreasing on $I$.

Proof. Observe that for $a \le x < y \le b$ we have

$$(V(y) - f(y)) - (V(x) - f(x)) = V(f; x, y) - (f(y) - f(x)) \ge 0,$$

and $V - f$ is nondecreasing. A similar argument applies to $V + f$. ∎
The difference, or any linear combination for that matter, of nondecreasing functions is BV. The interesting fact is that the converse is also true.

Theorem 1.3 (Jordan). Suppose $f$ is BV on $I$. Then $f$ can be written as a difference of two nondecreasing functions.

Proof. A decomposition that works is $f = V - (V - f)$. ∎

The decomposition in Jordan's theorem is not unique; in fact if $f = f_1 - f_2$ and if $g$ is any increasing function, then we also have $f = (f_1 + g) - (f_2 + g)$. In particular, if we need to, we may assume that the nondecreasing functions in Jordan's theorem are actually strictly increasing. Also, if $f$ is right-continuous on $[a,b)$, the increasing functions in the Jordan decomposition may be assumed to be right-continuous there.
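The decomposition $f = V - (V - f)$ is easy to observe numerically. The following is a minimal sketch under assumptions that are not part of the text: the sample function is $f(x) = \sin x$ on $[0, 2\pi]$, the variation function is approximated by cumulative variation sums over a fine uniform grid, and the helper names `variation_on_grid` and `is_nondecreasing` are hypothetical.

```python
import math

def variation_on_grid(f, grid):
    """Cumulative variation sums of f over the grid points.

    V[i] is a lower bound for V(f; a, grid[i]); it equals the variation
    whenever f is monotone between consecutive grid points.
    """
    V = [0.0]
    for x0, x1 in zip(grid, grid[1:]):
        V.append(V[-1] + abs(f(x1) - f(x0)))
    return V

def is_nondecreasing(values, tol=1e-12):
    """True if the list never drops by more than a rounding tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

# Assumed sample BV function: f(x) = sin x on [0, 2*pi].
n = 2000
grid = [2 * math.pi * i / n for i in range(n + 1)]
f_vals = [math.sin(x) for x in grid]

V = variation_on_grid(math.sin, grid)
p = V                                    # first nondecreasing piece: V
q = [v - y for v, y in zip(V, f_vals)]   # second nondecreasing piece: V - f

if __name__ == "__main__":
    print("total variation ~", V[-1])
    print("V nondecreasing:", is_nondecreasing(p))
    print("V - f nondecreasing:", is_nondecreasing(q))
```

On this grid the total variation comes out close to 4, the value one expects for the sine over a full period, and both pieces pass the monotonicity check, so $f = p - q$ is a difference of two nondecreasing functions as in Jordan's theorem.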
2. EXISTENCE OF THE RIEMANN-STIELTJES INTEGRAL

We return to the task at hand, namely that of defining the Riemann-Stieltjes integral. Since BV functions are differences of nondecreasing functions we first develop the theory for nondecreasing functions, and then discuss the general case. Given a bounded function $g\colon I \to \mathbb{R}$ and a partition $\mathcal{P}$ of $I = [a,b]$ consisting of the points $a = x_0 < \cdots < x_n = b$, put $I_k = [x_k, x_{k+1}]$ and set $m_k = \inf_{I_k} g$ and $M_k = \sup_{I_k} g$. Further, if $f$ is a nondecreasing function defined on $I$, let $\Delta_k f = f(x_{k+1}) - f(x_k)$, $0 \le k \le n - 1$. The lower and upper sums of $g$ corresponding to $\mathcal{P}$ with respect to $f$ are then defined by the expressions

$$s(g, f, \mathcal{P}) = \sum_{k=0}^{n-1} m_k \Delta_k f \qquad \text{and} \qquad S(g, f, \mathcal{P}) = \sum_{k=0}^{n-1} M_k \Delta_k f,$$

respectively. It is readily seen that as the partitions get finer, the lower sums increase and the upper sums decrease. Indeed, suppose that $\mathcal{P}$ is a partition of $I$, and that we have refined it by adding a point $t$, say, between $x_k$ and $x_{k+1}$. Then

$$\inf_{[x_k, t]} g,\ \inf_{[t, x_{k+1}]} g \ge m_k, \qquad \text{and} \qquad \sup_{[x_k, t]} g,\ \sup_{[t, x_{k+1}]} g \le M_k,$$

and our assertion follows at once from this. An argument using similar ideas, more precisely, working with a common refinement, shows that for any partitions $\mathcal{P}$ and $\mathcal{P}_1$ of $I$, we have

$$s(g, f, \mathcal{P}) \le S(g, f, \mathcal{P}_1). \tag{2.1}$$
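The behavior of lower and upper sums under refinement can be illustrated with a short computation. The sketch below rests on assumptions that are not in the text: the integrand $g(x) = x^2$ is nondecreasing on $[0,1]$ (so the infimum and supremum over each $I_k$ are attained at the endpoints), the integrator is $f(x) = x$ plus a unit jump at $1/2$, and the helper names `lower_upper_sums` and `uniform_partition` are hypothetical.

```python
from fractions import Fraction

def lower_upper_sums(g, f, partition):
    """Lower and upper sums s(g, f, P), S(g, f, P) for a NONDECREASING g.

    For nondecreasing g the infimum and supremum over [x_k, x_{k+1}] are
    attained at the endpoints, so m_k = g(x_k) and M_k = g(x_{k+1}).
    """
    s = S = Fraction(0)
    for x0, x1 in zip(partition, partition[1:]):
        df = f(x1) - f(x0)      # Delta_k f >= 0 since f is nondecreasing
        s += g(x0) * df
        S += g(x1) * df
    return s, S

def g(x):
    """Assumed sample integrand, nondecreasing on [0, 1]."""
    return x * x

def f(x):
    """Assumed sample integrator: the identity plus a unit jump at 1/2."""
    return x + (1 if x >= Fraction(1, 2) else 0)

def uniform_partition(n):
    """The partition of [0, 1] into n equal pieces, with exact endpoints."""
    return [Fraction(i, n) for i in range(n + 1)]

P = uniform_partition(4)
P_fine = uniform_partition(8)   # contains every point of P
s1, S1 = lower_upper_sums(g, f, P)
s2, S2 = lower_upper_sums(g, f, P_fine)

if __name__ == "__main__":
    print("coarse:", s1, S1)
    print("fine:  ", s2, S2)
```

Passing from four to eight pieces raises the lower sum and lowers the upper sum, and a lower sum over one partition never exceeds an upper sum over another, in agreement with (2.1).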
It is therefore natural to define the lower and upper Riemann-Stieltjes integrals of $g$ with respect to $f$ by

$$L(g, f, I) = \sup_{\mathcal{P}} s(g, f, \mathcal{P}), \qquad \text{and} \qquad U(g, f, I) = \inf_{\mathcal{P}} S(g, f, \mathcal{P}), \tag{2.2}$$

where the sup and inf are taken over all finite partitions $\mathcal{P}$ of $I$. By (2.1) it follows that $L(g, f, I) \le U(g, f, I)$. If $L(g, f, I) = U(g, f, I)$ we say that $g$ is Riemann-Stieltjes integrable with respect to $f$ over $I$, and we write $g \in \mathcal{R}(f, I)$; this common value is denoted by $\int_a^b g(x)\,df(x)$, or plainly by $\int_a^b g\,df$.
When $f(x) = x$ this definition coincides, as it should, with that of $\int_a^b g(x)\,dx$, the Riemann integral of $g$ over $I$; the class of Riemann integrable functions is denoted by $\mathcal{R}(I)$. Whereas we are interested in finding sufficient conditions for $g$ to be integrable, it is easier to find functions which are not integrable. For instance, on $I = [0,1]$, if $g$ is the Dirichlet function, i.e., if $g$ takes the value 1 on the irrationals and 0 on the rationals there, we have $s(g, f, \mathcal{P}) = 0$ and $S(g, f, \mathcal{P}) = f(b) - f(a)$ for any partition $\mathcal{P}$, and, unless $f$ is constant, $g \notin \mathcal{R}(f, I)$. As for the integrability of $g$, a useful criterion is the following result which deals with a single partition at a time.
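Proposition 2.1. Suppose $g$ is a bounded real-valued function on $I$, and let $f$ be nondecreasing there. Then $g \in \mathcal{R}(f, I)$ iff given $\varepsilon > 0$, there is a partition $\mathcal{P}$ of $I$ such that

$$S(g, f, \mathcal{P}) - s(g, f, \mathcal{P}) \le \varepsilon. \tag{2.3}$$

Note that if (2.3) holds, then it also holds for every partition finer than $\mathcal{P}$.

Proof. To show that the condition is necessary, given $\varepsilon > 0$, let $\mathcal{P}_1$ and $\mathcal{P}_2$ be partitions of $I$ such that $S(g, f, \mathcal{P}_1) \le U(g, f, I) + \varepsilon/2 = \int_a^b g\,df + \varepsilon/2$, and $s(g, f, \mathcal{P}_2) \ge L(g, f, I) - \varepsilon/2 = \int_a^b g\,df - \varepsilon/2$, and let $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$. Since under refinement the upper sums are nonincreasing and the lower sums are nondecreasing, we have

$$S(g, f, \mathcal{P}) - s(g, f, \mathcal{P}) \le S(g, f, \mathcal{P}_1) - s(g, f, \mathcal{P}_2) \le \Big(\int_a^b g\,df + \varepsilon/2\Big) - \Big(\int_a^b g\,df - \varepsilon/2\Big) = \varepsilon,$$

and (2.3) holds for $\mathcal{P}$. Conversely, if (2.3) holds, by the definition of $L(g, f, I)$ and $U(g, f, I)$, it readily follows that $0 \le U(g, f, I) - L(g, f, I) \le \varepsilon$, and, since $\varepsilon$ is arbitrary, we actually have $U(g, f, I) = L(g, f, I)$. ∎

Corollary 2.2. Suppose $g \in \mathcal{R}(f, I)$ and let $J \subseteq I$. Then $g \in \mathcal{R}(f, J)$. Moreover, if $I = [a,b]$, and if $a < c < b$, then

$$\int_a^b g\,df = \int_a^c g\,df + \int_c^b g\,df.$$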
The conclusion of Corollary 2.2 should not be confused with its converse, which is false. More precisely, if $g \in \mathcal{R}(f, [0,1])$ and $g \in \mathcal{R}(f, [1,2])$, then it is not necessarily true that $g \in \mathcal{R}(f, [0,2])$. To see this just take $g(x) = 0$ if $0 \le x \le 1$ and $g(x) = 1$ if $1 < x \le 2$, and $f(x) = 0$ if $0 \le x < 1$ and $f(x) = 1$ if $1 \le x \le 2$. Then $\int_0^1 g\,df = 0$ (since $g$ vanishes on $[0,1]$) and $\int_1^2 g\,df = 0$ (since $f$ is constant on $[1,2]$). However, $g \notin \mathcal{R}(f, [0,2])$, since for any partition $\mathcal{P}$ which does not contain the point 1 we have $s(g, f, \mathcal{P}) = 0$ and $S(g, f, \mathcal{P}) = 1$, and consequently, $U(g, f, I) = 1 > L(g, f, I) = 0$. The reader may have noticed that $f$ and $g$ have a common point of discontinuity, and it is a general fact that under these circumstances $g$ cannot be integrable with respect to $f$, cf. 4.17 below.

Corollary 2.3. Suppose that $g \in \mathcal{R}(f_1, I) \cap \mathcal{R}(f_2, I)$ and that $\lambda$ is a nonnegative real number. Then $g \in \mathcal{R}(f_1 + \lambda f_2, I)$, and

$$\int_a^b g\,d(f_1 + \lambda f_2) = \int_a^b g\,df_1 + \lambda \int_a^b g\,df_2. \tag{2.4}$$

Also, if $g_1, g_2 \in \mathcal{R}(f, I)$, then $g_1 + \lambda g_2 \in \mathcal{R}(f, I)$, and

$$\int_a^b (g_1 + \lambda g_2)\,df = \int_a^b g_1\,df + \lambda \int_a^b g_2\,df.$$
The converse to both the statements in Corollary 2.3 is false. Proposition 2.1 can be used to show that important classes of functions are Riemann-Stieltjes integrable, and Corollary 2.3 to show that the integral with respect to a BV function is well-defined.

Proposition 2.4. Suppose that $g$ is continuous on $I$, and that $f$ is nondecreasing there. Then $g \in \mathcal{R}(f, I)$.

Proof. If $f$ is constant, all upper and lower sums of $g$ are 0 and $\int_a^b g\,df = 0$. Otherwise, let $\varepsilon > 0$ be given, and put $\eta = \varepsilon/(f(b) - f(a))$. Since $I$ is closed and bounded and $g$ is continuous on $I$, $g$ is uniformly continuous there and there exists $\delta > 0$ so that

$$|g(x) - g(y)| < \eta \quad \text{whenever} \quad |x - y| < \delta, \qquad \text{all } x, y \in I.$$
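Put then $x_0 = a, x_1 = a + \delta, \ldots, x_{n-1} = a + (n-1)\delta$, where $n$ is chosen so that $a + n\delta \ge b$, and, if necessary, complete a partition $\mathcal{P}$ of $I$ by setting $x_n = b$. It is clear that

$$S(g, f, \mathcal{P}) - s(g, f, \mathcal{P}) = \sum_{k=0}^{n-1} (M_k - m_k)\Delta_k f \le \eta \sum_{k=0}^{n-1} \Delta_k f = \eta\,(f(b) - f(a)) = \varepsilon.$$

That is, (2.3) holds for this partition $\mathcal{P}$, and Proposition 2.1 gives that $g \in \mathcal{R}(f, I)$. ∎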
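The criterion (2.3) can be watched at work in a simple case. The sketch below is an illustration under assumptions that are not in the text: $g(x) = x^2$ on $[0,1]$ (continuous and nondecreasing, so $M_k - m_k = g(x_{k+1}) - g(x_k)$), $f(x) = x$ plus a unit jump at $1/2$, uniform partitions only, and the hypothetical helpers `gap` and `pieces_needed`.

```python
from fractions import Fraction

def g(x):
    """Assumed continuous, nondecreasing integrand on [0, 1]."""
    return x * x

def f(x):
    """Assumed nondecreasing integrator with a unit jump at 1/2."""
    return x + (1 if x >= Fraction(1, 2) else 0)

def gap(n):
    """S(g, f, P) - s(g, f, P) on the uniform partition of [0, 1] with n pieces.

    Because g is nondecreasing, M_k - m_k = g(x_{k+1}) - g(x_k).
    """
    xs = [Fraction(i, n) for i in range(n + 1)]
    return sum((g(x1) - g(x0)) * (f(x1) - f(x0)) for x0, x1 in zip(xs, xs[1:]))

def pieces_needed(eps):
    """Smallest power of two n for which the gap is at most eps."""
    n = 2
    while gap(n) > eps:
        n *= 2
    return n

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, gap(n))
    print("pieces for eps = 1/100:", pieces_needed(Fraction(1, 100)))
```

Even though $f$ jumps, the gap between upper and lower sums shrinks as the partition is refined, because the oscillation of the continuous integrand over each piece becomes small; this is the mechanism used in the proof of Proposition 2.4.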
If $f$ is BV on an interval $I = [a,b]$, then by Jordan's theorem we have $f = f_1 - f_2$, where $f_1$ and $f_2$ are nondecreasing functions defined on $I$. Thus, if the integral of $g$ with respect to both $f_1$ and $f_2$ over $I$ is defined, we put

$$\int_a^b g\,df = \int_a^b g\,df_1 - \int_a^b g\,df_2. \tag{2.5}$$

It is not hard to check that the left-hand side of (2.5) above is a well-defined quantity. Specifically, if $f = f_3 - f_4$ is another decomposition of $f$ as a difference of nondecreasing functions, and if $g \in \mathcal{R}(f_3, I) \cap \mathcal{R}(f_4, I)$, then the left-hand side of (2.5) also equals $\int_a^b g\,df_3 - \int_a^b g\,df_4$. Indeed, since it readily follows that $f_1 + f_4 = f_2 + f_3$, by Corollary 2.3 we get that $\int_a^b g\,df_1 + \int_a^b g\,df_4 = \int_a^b g\,df_2 + \int_a^b g\,df_3$, and our observation follows by subtraction.
Corollary 2.5. Suppose that $g$ is continuous on $I$ and that $f$ is a BV function there. Then $g \in \mathcal{R}(f, I)$.

Proof. Since $f$ can be written as the difference of nondecreasing functions, the assertion follows readily from Proposition 2.4 and Corollary 2.3. ∎

An interesting point to consider is whether in Corollary 2.5 the roles of $g$ and $f$ can be interchanged. Of course this question is loosely posed since the integral of a BV function with respect to a continuous function has not yet been defined. We proceed as follows: First we introduce a notion of a Riemann-Stieltjes integral that makes sense for arbitrary functions $f, g$, then show that when restricted to our old setting, i.e., $f \in$ BV and $g$ continuous, both definitions coincide, and finally we prove an "integration by parts" formula that will enable us to answer the question we posed. Given bounded functions $f, g$ defined on $I$, we say that they satisfy the property (RS) if there is a number $L$ so that for any $\varepsilon > 0$ there
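exists a partition $\mathcal{P}_\varepsilon$ of $I$ with the property that for every partition $\mathcal{P}$ of $I$ finer than $\mathcal{P}_\varepsilon$ and every choice of points $t_k$ in $[x_k, x_{k+1}]$,

$$\Big| \sum_{k=0}^{n-1} g(t_k)\Delta_k f - L \Big| < \varepsilon. \tag{2.6}$$

Note that the number $L$, when it exists, is uniquely determined.

Theorem 2.6. Suppose $g$ is a bounded function defined on $I = [a,b]$, and $f$ is nondecreasing there. Then the following are equivalent: (i) $g \in \mathcal{R}(f, I)$, (ii) $f, g$ satisfy property (RS) on $I$.

Proof. (i) implies (ii). Since $g \in \mathcal{R}(f, I)$, given $\varepsilon > 0$, there are partitions $\mathcal{P}'_\varepsilon$ and $\mathcal{P}''_\varepsilon$ of $I$ such that $S(g, f, \mathcal{P}'_\varepsilon) \le \int_a^b g\,df + \varepsilon$, and $s(g, f, \mathcal{P}''_\varepsilon) \ge \int_a^b g\,df - \varepsilon$. Put $\mathcal{P}_\varepsilon = \mathcal{P}'_\varepsilon \cup \mathcal{P}''_\varepsilon$ and note that if $\mathcal{P} = \{a = x_0 < \cdots < x_n = b\}$ is finer than $\mathcal{P}_\varepsilon$, and if $t_k$ is any choice of points in $[x_k, x_{k+1}]$, then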
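$$-\varepsilon + \int_a^b g\,df \le s(g, f, \mathcal{P}''_\varepsilon) \le s(g, f, \mathcal{P}) \le \sum_k g(t_k)\Delta_k f \le S(g, f, \mathcal{P}) \le S(g, f, \mathcal{P}'_\varepsilon) \le \int_a^b g\,df + \varepsilon.$$

This chain of inequalities gives that $\big| \sum_k g(t_k)\Delta_k f - \int_a^b g\,df \big| \le \varepsilon$ and, since $\varepsilon > 0$ is arbitrary, (ii) holds with $L = \int_a^b g\,df$.

(ii) implies (i). Given $\varepsilon > 0$, let $\mathcal{P}_\varepsilon = \{a = x_0 < \cdots < x_n = b\}$ be the partition corresponding to the choice $\varepsilon/4$ in the definition of (RS). Now, since $M_k - m_k = \sup\{g(x) - g(x'): x, x' \in I_k\}$ it follows that given $\eta > 0$, there is a choice of points $t_k$ and $t'_k$ in $I_k$ so that $M_k - m_k \le g(t_k) - g(t'_k) + \eta$. Whence

$$S(g, f, \mathcal{P}_\varepsilon) - s(g, f, \mathcal{P}_\varepsilon) = \sum_k (M_k - m_k)\Delta_k f \le \Big( \sum_k g(t_k)\Delta_k f - L \Big) + \Big( L - \sum_k g(t'_k)\Delta_k f \Big) + \eta\,(f(b) - f(a)),$$

and consequently, by the choice of $\mathcal{P}_\varepsilon$,

$$S(g, f, \mathcal{P}_\varepsilon) - s(g, f, \mathcal{P}_\varepsilon) \le \varepsilon/2 + \eta\,(f(b) - f(a)).$$

If $f(b) = f(a)$, then $f$ is constant and there is not much to do. Otherwise let $\eta = \varepsilon / \big(2(f(b) - f(a))\big)$ and observe that we have

$$S(g, f, \mathcal{P}_\varepsilon) - s(g, f, \mathcal{P}_\varepsilon) \le \varepsilon,$$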
and by Proposition 2.1, $g \in \mathcal{R}(f, I)$, as we wanted to show. In fact, now that the integrability of $g$ has been established, we may refer to the proof of the first implication and note that also $L = \int_a^b g\,df$. ∎
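Property (RS) says that the sums $\sum_k g(t_k)\Delta_k f$ settle down to $L$ no matter how the points $t_k$ are chosen. A rough numerical illustration follows, under assumptions that are not part of the text: $g(x) = x^2$ and $f(x) = x$ plus a unit jump at $1/2$ on $[0,1]$, uniform partitions with an even number of pieces, and tags picked at random rational positions. The helper names `rs_sum` and `random_tags` are hypothetical. For this pair the integral is $\int_0^1 x^2\,dx + g(1/2) \cdot 1 = 7/12$.

```python
from fractions import Fraction
import random

def g(x):
    """Assumed sample integrand, continuous and nondecreasing on [0, 1]."""
    return x * x

def f(x):
    """Assumed sample integrator: the identity plus a unit jump at 1/2."""
    return x + (1 if x >= Fraction(1, 2) else 0)

def rs_sum(g, f, xs, tags):
    """Riemann-Stieltjes sum: sum over k of g(t_k) * (f(x_{k+1}) - f(x_k))."""
    return sum(g(t) * (f(x1) - f(x0)) for x0, x1, t in zip(xs, xs[1:], tags))

def random_tags(xs, rng):
    """One tag t_k in each [x_k, x_{k+1}], at a random rational position."""
    return [x0 + Fraction(rng.randint(0, 10), 10) * (x1 - x0)
            for x0, x1 in zip(xs, xs[1:])]

EXACT = Fraction(7, 12)     # integral of x^2 dx plus g(1/2) times the jump of f
rng = random.Random(0)      # fixed seed, only so that runs are repeatable
errors = {}
for n in (4, 16, 64, 256):
    xs = [Fraction(i, n) for i in range(n + 1)]
    errors[n] = abs(rs_sum(g, f, xs, random_tags(xs, rng)) - EXACT)

if __name__ == "__main__":
    for n, err in errors.items():
        print(n, float(err))
```

Every tagged sum lies between the lower and the upper sum of its partition, so its distance to the integral is at most $S - s$, which for these uniform partitions equals $(2n-1)/n^2$; the random choice of tags does not matter.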
Because of Theorem 2.6, if $f$ and $g$ satisfy property (RS) we also write $g \in \mathcal{R}(f, I)$ and denote $L$ by $\int_a^b g\,df$. With this notation we have
Theorem 2.7. Let $f, g$ be real-valued bounded functions defined on $I$. If $g \in \mathcal{R}(f, I)$, then $f \in \mathcal{R}(g, I)$ and

$$\int_a^b g\,df + \int_a^b f\,dg = f(b)g(b) - f(a)g(a). \tag{2.7}$$
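Proof. Let $\varepsilon > 0$ be given, and let $\mathcal{P}_\varepsilon$ be the partition of $I$ corresponding to $\varepsilon$ in the definition of the integrability of $g$. Let $\mathcal{P} = \{a = x_0 < \cdots < x_n = b\}$ be a partition of $I$ finer than $\mathcal{P}_\varepsilon$ and finally let $t_k$ be any choice of points in $[x_{k-1}, x_k]$, $1 \le k \le n$; in this proof we write $\Delta_k g = g(x_k) - g(x_{k-1})$. Since $A = f(b)g(b) - f(a)g(a)$ can also be written as

$$A = \sum_{k=1}^{n} f(x_k)g(x_k) - \sum_{k=1}^{n} f(x_{k-1})g(x_{k-1}),$$

it readily follows that $A - \sum_{k=1}^{n} f(t_k)\Delta_k g$ equals

$$\sum_{k=1}^{n} g(x_k)\big(f(x_k) - f(t_k)\big) + \sum_{k=1}^{n} g(x_{k-1})\big(f(t_k) - f(x_{k-1})\big). \tag{2.8}$$

The sum in (2.8) corresponds to the partition of $I$ obtained by combining the $x_k$'s and the $t_k$'s, and, since this partition is finer than $\mathcal{P}_\varepsilon$, we have that

$$\Big| A - \int_a^b g\,df - \sum_{k=1}^{n} f(t_k)\Delta_k g \Big| < \varepsilon.$$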
Since $\mathcal{P}$ and the $t_k$'s are arbitrary it follows that $f \in \mathcal{R}(g, I)$, and that (2.7) holds. ∎

Theorem 2.7 and the results we have covered thus far provide examples of integrable functions; our aim, however, is to characterize these functions. For instance, when $f(x) = x$ experience indicates that it is reasonable to expect that the set of discontinuities of a Riemann integrable function can be covered by intervals whose total length is small. To make this precise Hankel (1839-1873) introduced the notion of "content" of a set (contained in $\mathbb{R}$). Shortly after that Cantor and Stolz extended the results to subsets of $\mathbb{R}^n$, although there were some problems when the sets in question were not closed. To deal with this inconvenience, a few years later, Peano (1858-1932) and Jordan introduced the concepts of inner and outer content. The notion of content accepted today was introduced by Borel (1871-1956) in 1898 and Lebesgue (1875-1941) in 1902. It is in terms of these concepts, namely the Lebesgue and Borel measures, that we characterize the Riemann integrable functions in Chapter VII, and the Riemann-Stieltjes integrable functions in Chapter IX.
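The integration by parts formula (2.7) has an exact discrete counterpart, which makes it easy to check on a computer. The sketch below is an illustration under assumptions that are not in the text: $g(x) = x^2$ and $f(x) = x$ plus a unit jump at $1/2$ on $[0,1]$, with the hypothetical helpers `left_sum` and `right_sum`. If the sum for $g\,df$ is tagged at left endpoints and the sum for $f\,dg$ at right endpoints, the two add up to $f(b)g(b) - f(a)g(a)$ for every partition, by summation by parts.

```python
from fractions import Fraction

def g(x):
    """Assumed sample function, continuous on [0, 1]."""
    return x * x

def f(x):
    """Assumed sample function: the identity plus a unit jump at 1/2."""
    return x + (1 if x >= Fraction(1, 2) else 0)

def left_sum(u, v, xs):
    """Sum over k of u(x_k) * (v(x_{k+1}) - v(x_k)): tags at left endpoints."""
    return sum(u(x0) * (v(x1) - v(x0)) for x0, x1 in zip(xs, xs[1:]))

def right_sum(u, v, xs):
    """Sum over k of u(x_{k+1}) * (v(x_{k+1}) - v(x_k)): tags at right endpoints."""
    return sum(u(x1) * (v(x1) - v(x0)) for x0, x1 in zip(xs, xs[1:]))

n = 1000
xs = [Fraction(i, n) for i in range(n + 1)]
g_df = left_sum(g, f, xs)       # approximates the integral of g df
f_dg = right_sum(f, g, xs)      # approximates the integral of f dg
boundary = f(xs[-1]) * g(xs[-1]) - f(xs[0]) * g(xs[0])

if __name__ == "__main__":
    print(float(g_df), float(f_dg), boundary)
```

For this pair one expects $\int_0^1 g\,df = 7/12$ and $\int_0^1 f\,dg = 17/12$, whose sum is the boundary term $2$; the two computed sums approach these values as $n$ grows, while their total equals the boundary term exactly at every stage.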
3. THE RIEMANN-STIELTJES INTEGRAL AND LIMITS

In this section we explore how successful we are in operating with the Riemann-Stieltjes integral, in other words what we can, and cannot, do with it. First note that if $g$ is a bounded function defined on $I$ and if $f$ is nondecreasing there, then $g \in \mathcal{R}(f, I)$ implies $|g| \in \mathcal{R}(f, I)$. This fact is a simple consequence of Proposition 2.1. Indeed, given $\delta > 0$, by Proposition 2.1 there is a partition $\mathcal{P}$ of $I$ so that $S(g, f, \mathcal{P}) - s(g, f, \mathcal{P}) < \delta/2$. Also, for any $\eta > 0$, there exist points $x'_k, x''_k$ in $I_k$ with the property that

$$\sup_{I_k} |g| - \inf_{I_k} |g| \le |g(x'_k)| - |g(x''_k)| + \eta, \qquad \text{all } k.$$

But since $|g(x'_k)| - |g(x''_k)| \le |g(x'_k) - g(x''_k)|$, rearranging if necessary $x'_k$ and $x''_k$ to remove the absolute value, it follows that

$$\sup_{I_k} |g| - \inf_{I_k} |g| \le g(x'_k) - g(x''_k) + \eta \le M_k - m_k + \eta, \qquad \text{all } k.$$

Multiplying these inequalities through by $\Delta_k f$ and adding them up we get

$$S(|g|, f, \mathcal{P}) - s(|g|, f, \mathcal{P}) \le S(g, f, \mathcal{P}) - s(g, f, \mathcal{P}) + \eta\,(f(b) - f(a)) \le \delta/2 + \eta\,(f(b) - f(a)),$$

and this quantity can be made less than or equal to $\delta$ by choosing $\eta$ small enough. Whence by Proposition 2.1, $|g| \in \mathcal{R}(f, I)$ as we wanted to show. The converse to this fact is not true: There is a bounded function $g$ so that $|g| \in \mathcal{R}(I)$ and yet $g \notin \mathcal{R}(I)$; on $[0,1]$ the function $g(x) = 1$ if $x$ is irrational and $g(x) = -1$ if $x$ is rational will do.
If $g$ and $f$ are as above, then $|g| \pm g$ are nonnegative Riemann-Stieltjes integrable functions, and consequently, $\int_a^b (|g| \pm g)\,df \ge 0$. By the linearity of the integral we get that $\pm \int_a^b g\,df \le \int_a^b |g|\,df$, or

$$\Big| \int_a^b g\,df \Big| \le \int_a^b |g|\,df. \tag{3.1}$$

A moment's thought will convince the reader that it is possible to extend (3.1) to include the BV functions $f$ as well. The estimate in this case proceeds as follows: Since for arbitrary partitions $\mathcal{P}$ of $I$ we have

$$|S(g, f, \mathcal{P})| \le \sup_I |g| \sum_k |\Delta_k f| \le \sup_I |g|\; V(f; a, b),$$

it is not difficult to see that

$$\Big| \int_a^b g\,df \Big| \le \sup_I |g|\; V(f; a, b). \tag{3.2}$$

Inequality (3.2) allows us to address the following question: Suppose $g_n \to g$, and both the $g_n$'s and $g$ are integrable, is it then true that $\int_a^b g_n\,df \to \int_a^b g\,df$? A simple example shows that this is not always the case: If $I = [0,1]$ and $g_n(x) = n\chi_{(0,1/n)}(x)$, then $g_n(x) \to g(x) = 0$ for $x$ in $I$, but $\int_0^1 g_n(x)\,dx = 1 \not\to \int_0^1 g(x)\,dx = 0$. Nevertheless, this situation can be remedied.

Proposition 3.1. Suppose that the sequence of bounded functions $\{g_n\}$ converges uniformly to $g$ on $I$, that $f$ is a BV function there, and that $g_n, g \in \mathcal{R}(f, I)$. Then $\lim_{n \to \infty} \int_a^b g_n\,df = \int_a^b g\,df$.
Proof. By the linearity of the integral and (3.2) it follows that

$$\Big| \int_a^b g_n\,df - \int_a^b g\,df \Big| = \Big| \int_a^b (g_n - g)\,df \Big| \le \sup_I |g_n - g|\; V(f; a, b).$$

Now, since the $g_n$'s converge uniformly to $g$ on $I$, $\sup_I |g_n - g| \to 0$ as $n \to \infty$, the right-hand side of the above inequality goes to 0 as $n$ tends to $\infty$, and so does the left-hand side. ∎

Some remarks concerning this result are in order. First, we assumed that $g \in \mathcal{R}(f, I)$. It would be more interesting if we could derive the integrability of $g$ from that of the $g_n$'s. Next, we assumed that the $g_n$'s
converge uniformly to $g$, and this is quite restrictive. However, these conditions cannot essentially be relaxed. Consider, for instance, the following situation: Enumerate the rationals in $I = [0,1]$, $r_1, r_2, \ldots$, say, and let $g_n(x) = 1$ if $x = r_1, \ldots, r_n$ and $g_n(x) = 0$ otherwise. Then $g_n \in \mathcal{R}(I)$, $\int_0^1 g_n(x)\,dx = 0$ for all $n$, $\lim_{n \to \infty} g_n(x) = g(x)$ exists everywhere in $I$, but, as we have seen above, $g$ is not Riemann integrable. Although the $g_n$'s do not converge uniformly to $g$, they do converge boundedly, and the example points out an unhappy state of affairs that will be corrected, in Chapter VII, by the Lebesgue integral. There is yet another way to interpret this last observation. Suppose the distance between $f, g \in \mathcal{R}(I)$ is measured by the quantity

$$d(f, g) = \int_0^1 |f(x) - g(x)|\,dx.$$

Now, endowed with this distance $\mathcal{R}(I)$ becomes a metric space. In this metric, the sequence $\{g_n\}$ introduced above is Cauchy, but it does not converge. Thus, this metric space is not complete. This shortcoming will also be corrected by the Lebesgue integral.
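The contrast between pointwise and uniform convergence discussed in this section can be seen in a small computation for the Riemann case $f(x) = x$ on $[0,1]$. The sketch below uses assumed examples and hypothetical helper names (`riemann_sum`, `spike`, `tilt`, `limit`): the spikes $g_n = n\chi_{(0,1/n)}$ from the text, and, as a uniformly convergent family chosen here only for illustration, $g_n(x) = x^2 + x/n$ with limit $x^2$. The integrals are approximated by midpoint Riemann sums on uniform partitions.

```python
from fractions import Fraction

def riemann_sum(g, n_pieces):
    """Midpoint Riemann sum of g over [0, 1] (the case f(x) = x)."""
    h = Fraction(1, n_pieces)
    return sum(g((k + Fraction(1, 2)) * h) * h for k in range(n_pieces))

def spike(n):
    """g_n = n * chi_(0, 1/n): tends to 0 at every point, but not uniformly."""
    def g_n(x):
        return n if 0 < x < Fraction(1, n) else 0
    return g_n

def tilt(n):
    """g_n(x) = x^2 + x/n: tends to x^2 uniformly, sup |g_n - g| = 1/n on [0, 1]."""
    def g_n(x):
        return x * x + x / n
    return g_n

def limit(x):
    """The uniform limit of the tilt family."""
    return x * x

indices = (1, 5, 25)
# 20 * n pieces, so that 1/n is a partition point and the sum is exact.
spike_integrals = [riemann_sum(spike(n), 20 * n) for n in indices]
tilt_errors = [abs(riemann_sum(tilt(n), 100) - riemann_sum(limit, 100))
               for n in indices]

if __name__ == "__main__":
    print("spike integrals:", spike_integrals)
    print("tilt errors:    ", tilt_errors)
```

The spike integrals stay at 1 although the pointwise limit is 0, while for the uniformly convergent family the difference of the sums is bounded by $\sup_I |g_n - g|$ times the variation of $f(x) = x$ over $[0,1]$, which is 1, and so it tends to 0.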
4. PROBLEMS AND QUESTIONS

4.1 Assume $\{f_n\}$ is a sequence of real-valued nondecreasing functions defined on $I = [a,b]$, and suppose $f(x) = \lim_{n \to \infty} f_n(x)$ exists for $x \in I$. Is $f$ necessarily nondecreasing?

4.2 (F. Riesz's Rising Sun Lemma) Assume $f$ is a bounded real-valued function defined on $I = [a,b]$, and let $\mathcal{F} = \{g: g$ is defined on $I$, $g$ is nonincreasing and $g(x) \ge f(x)$ for $x$ in $I\}$. Show that

$$f^*(x) = \sup\{f(y): x \le y \le b\}, \qquad x \in I,$$

belongs to $\mathcal{F}$, and in fact it is the smallest element there. Moreover, if $f$ is continuous at $x$, so is $f^*$.
4.3 Let $f$ be a real-valued nondecreasing function defined on $I = [0,1]$. Show that for $x \in [0,1)$,

$$f(x+) = \lim_{h \to 0,\ h > 0} f(x + h)$$

exists. Also, for $x \in (0,1]$,

$$f(x-) = \lim_{h \to 0,\ h < 0} f(x + h)$$

exists. Show that the functions

$$f_l(x) = f(x-), \quad 0 < x \le 1, \qquad f_l(0) = f(0)$$

and

$$f_r(x) = f(x+), \quad 0 \le x < 1, \qquad f_r(1) = f(1)$$
are nondecreasing on $I$. Furthermore, $f_l$ is left-continuous, and $f_r$ is right-continuous.

4.4 Show that a monotone function $f\colon [a,b] \to \mathbb{R}$ has, at most, countably many discontinuities, and that all are of the first kind. Conversely, if $D$ is an at most countable subset of $[a,b]$, there is a monotone function $f\colon [a,b] \to \mathbb{R}$ such that $D = \{x \in [a,b]: f$ is discontinuous at $x\}$.

4.5 Let $A$ be a nonempty subset of $\mathbb{R}$, and let $f$ be a bounded nondecreasing real-valued function defined on $A$. Show that $f$ can be extended to $\mathbb{R}$ as a nondecreasing function.

4.6 Let $f\colon [a,b] \to \mathbb{R}$ be nondecreasing, and for $x$ in $[a,b]$ put

$$s(x) = \sum_{a \le y < x} \big(f(y+) - f(y-)\big) + f(x) - f(x-).$$

Show that $s(x)$ is a well-defined function, called the "saltus" function of $f$. Show that $s$ and $f - s$ are nondecreasing and that $f - s$ is continuous.

4.7 A real-valued function $f$ defined on $I = [a,b]$ is said to be Lipschitz there if there is a constant $c$ so that $|f(x) - f(x')| \le c\,|x - x'|$ for all $x, x'$ in $I$. Show that if $f$ is Lipschitz on $I$ it is BV there.

4.8 Let $f, g$ be BV on $I = [a,b]$. Show that $f, g$ are bounded on $I$, and that for any real number $\eta$, $f + \eta g$ is BV on $I$ and

$$V(f + \eta g; a, b) \le V(f; a, b) + |\eta|\, V(g; a, b).$$
4.9 Let $f, g$ be BV on $I = [a,b]$. Show that $fg$ is BV on $I$, and that if $|g(x)| \ge \varepsilon > 0$ for $x \in I$, then also $1/g$ is BV on $I$. Estimate $V(fg; a, b)$ and $V(f/g; a, b)$ in terms of $V(f; a, b)$, $V(g; a, b)$ and $\varepsilon$.

4.10 Let $f, g$ be BV on $I = [a,b]$. Show that $(f \vee g)(x) = \max(f(x), g(x))$ and $(f \wedge g)(x) = \min(f(x), g(x))$ are BV on $I$.

4.11 Let $f, g$ be real-valued functions defined on $I = [a,b]$, and suppose that $f$ and $g$ differ at finitely many values. Show that $f$ is BV on $I$ iff $g$ is BV on $I$, and that $V(f; a, b) = V(g; a, b)$.
4.12 Characterize those real numbers $\eta, \varepsilon$ for which $f(x) = x^{\eta} \sin(1/x^{\varepsilon})$, $x \ne 0$, $f(0) = 0$, is BV on $[0,1]$. Verify that the choice $\eta = 2$, $\varepsilon = 3/2$ gives an example of a function $f$ which is BV on $I$, differentiable there, and yet $f'$ is unbounded.

4.13 Show that the plane curve $\Gamma\colon I \to \mathbb{R}^2$ with graph $\{(\phi(t), \psi(t)): t \in I\}$ is rectifiable iff $\phi$ and $\psi$ are BV on $I$. What is $L(\Gamma)$ in this case?

4.14 Let $f$ be BV and continuous on $I = [a,b]$. Show that for $a \le x \le b$ we have

$$V(x) = \lim_{\mathrm{norm}(\mathcal{P}) \to 0}\ \sum_{\text{over } \mathcal{P}} |f(x_j) - f(x_{j-1})|.$$

The above statement is understood as follows: Given $\varepsilon > 0$, there exists $\eta > 0$ such that for any partition $\mathcal{P}$ of $[a, x]$ with norm less than or equal to $\eta$ we have $\sum_{\text{over } \mathcal{P}} |f(x_j) - f(x_{j-1})| > V(x) - \varepsilon$.

4.15 Assume that $f$ is BV on $I = [a,b]$ and for $a \le x < b$ let $\mathcal{P} = \{a = x_0 < \cdots < x_n = x\}$ be a partition of $[a, x]$. Let

$$P(\mathcal{P}) = \{k: \Delta_k f > 0,\ 0 \le k \le n - 1\}, \qquad N(\mathcal{P}) = \{k: \Delta_k f < 0,\ 0 \le k \le n - 1\},$$

and put

$$P(x) = \sup \Big\{ \sum_{k \in P(\mathcal{P})} \Delta_k f \Big\}, \qquad N(x) = \sup \Big\{ \sum_{k \in N(\mathcal{P})} |\Delta_k f| \Big\},$$

where the sup in each expression above is taken over all finite partitions $\mathcal{P}$ of $[a, x]$; $P$ and $N$ are called the positive and the negative variations of $f$ on $I$. Prove they satisfy the following properties: (i) $P, N$ are nonnegative and nondecreasing, (ii) $P(x) + N(x) = V(x)$ and $P(x) - N(x) = f(x) - f(a)$, and, (iii) Every point of continuity of $f$ is also a point of continuity of $P$ and $N$.

4.16 Recall that for any real number $r$ we have

$$r^+ = \begin{cases} r & \text{if } r > 0 \\ 0 & \text{if } r \le 0, \end{cases} \qquad r^- = \begin{cases} 0 & \text{if } r > 0 \\ -r & \text{if } r \le 0. \end{cases}$$

These are called the positive and negative parts of $r$ and satisfy the relations $r^+, r^- \ge 0$, $r = r^+ - r^-$, and $|r| = r^+ + r^-$.
Show that if $f$ is BV on $I = [a,b]$, and if $f' \in \mathcal{R}(I)$ and $f(x) = \int_a^x f'(t)\,dt$ for $x \in I$ (this condition is not redundant), then for $a \le x \le b$, we have

$$P(x) = \int_a^x (f'(t))^+\,dt, \qquad N(x) = \int_a^x (f'(t))^-\,dt,$$

and

$$V(x) = \int_a^x |f'(t)|\,dt.$$
4.17 Assume that g, J are bounded functions defined on I = [a,b] which are discontinuous from the right at x E (a,b). Show that 9 fI. nu, I). 4.18 Assume that J is a nondecreasing real-valued function defined on an interval I, and that 9 E nU,I). Show that g2 E nU,I). Show that the converse is not true, Le., there are functions J,g such that g2 E nu, I) but 9 fI. nu, I). 4.19 Assume that J is a nondecreasing real-valued function defined on an interval I and that g, h E nu, I). Show that gh E nu, I). 4.20 Suppose that gEnU, I), and that J has a bounded derivative on 1= [a,b]. Show that gJ' E n(I) and
lb
gdJ =
lb
I'
g(x)J'(x)dx.
4.21 Let $f$ be BV on $I = [a,b]$, and let, as usual, $V$ denote its variation on $I$, $V(a) = 0$. Prove that if $g$ is bounded on $I$ and $g \in \mathcal R(f, I)$, then $g \in \mathcal R(V, I)$.

4.22 Suppose $f$ is BV on $I = [a,b]$ and the bounded function $g \in \mathcal R(f, I)$. For $x \in I$ put $G(x) = \int_a^x g\,df$, and show that $G$ is BV on $I$ and continuous at those points of $I$ where $f$ is continuous.

4.23 Let $f$ be a nondecreasing bounded function on $I = [a,b]$ and let $g \in \mathcal R(f, I)$, $m \le g(x) \le M$ for all $x \in I$. Show that there is a real number $c$, $m \le c \le M$, so that
$$\int_a^b g\,df = c\,(f(b) - f(a)).$$
4.24 Let $f, f_1$ be nondecreasing functions defined on an interval $I = [a,b]$ of the line with the property that $f(a) = f_1(a)$ and
$$\int_a^b g\,df = \int_a^b g\,df_1, \qquad \text{all } g \text{ continuous on } I.$$
Prove that if $x \in I$ is a point of continuity of both $f$ and $f_1$, then $f(x) = f_1(x)$.

4.25 Let $f_1, f_2$ be two real-valued nondecreasing functions defined on $I = [a,b]$ and suppose there is a value $c \in \mathbb{R}$ such that the set $D = \{x \in I: f_1(x) = f_2(x) + c\}$ is dense in $I$. Show that
$$\int_a^b g\,df_1 = \int_a^b g\,df_2, \qquad \text{all } g \text{ continuous on } I.$$

4.26 Let $I = [0,1]$ and suppose $f$ is a BV function defined on $I$. Let $f_1$ be the function defined on $I$ as follows: $f_1(0) = 0$, $f_1(x) = f(x+0) - f(0)$ if $0 < x < 1$, and $f_1(1) = f(1) - f(0)$. Show that $f_1$ is BV on $I$, and that for each continuous function $g$ we have $\int_0^1 g\,df = \int_0^1 g\,df_1$.

4.27 Let $f$ be a continuous function defined on $I = [a,b]$ and suppose that $g$ is nondecreasing there. Show that there is a point $x_0 \in I$ such that
$$\int_a^b g\,df = g(a)\int_a^{x_0} df + g(b)\int_{x_0}^b df.$$

4.28 (Change of variable) Let $f, g$ be bounded on $I = [a,b]$ and suppose that $g \in \mathcal R(f, I)$. Furthermore, assume there are an interval $J = [c,d]$ and a continuous, strictly monotone function $\phi$ such that $I = \phi(J)$, $\phi(c) = a$, $\phi(d) = b$. Show that the functions $F(x) = f(\phi(x))$ and $G(x) = g(\phi(x))$ are well-defined on $J$, $G \in \mathcal R(F, J)$, and that
$$\int_a^b g\,df = \int_c^d G\,dF.$$

4.29 Let $\{f_n\}$ be a sequence of BV functions on $I = [a,b]$ and suppose there exists a BV function $f$ defined on $I$ such that the variation $V(f - f_n; a, b)$ tends to 0 as $n \to \infty$. Assume also that $f_n(a) = f(a) = 0$ for each $n = 1, 2, \ldots$ If $g$ is continuous on $I$, prove that $g \in \mathcal R(f, I)$ and
$$\lim_{n \to \infty} \int_a^b g\,df_n = \int_a^b g\,df.$$
CHAPTER IV

Abstract Measures
In this chapter we study the notions of measure and of sets of "content" zero. These concepts are essential to measure the level sets of the new class of functions to be integrated and in the characterization of Riemann-Stieltjes integrable functions. A successful approach to these problems requires that we operate freely with sets, including taking limits. This we achieve with the introduction of algebras and $\sigma$-algebras of sets.
1. ALGEBRAS AND $\sigma$-ALGEBRAS OF SETS
A class $\mathcal A$ of subsets of a (universal) set $X$ is called an algebra of sets, or plainly an algebra, provided the following three properties hold:
(i) $\mathcal A$ is nonempty.
(ii) If $E \in \mathcal A$, then $X \setminus E \in \mathcal A$.
(iii) If $\{E_k\}_{k=1}^n \subseteq \mathcal A$, then $\bigcup_{k=1}^n E_k \in \mathcal A$.

Some sets of an algebra $\mathcal A$ are easily identified, namely $\emptyset$ and $X$. In fact, $\mathcal A = \{\emptyset, X\}$ is the most economical algebra. On the other hand, $\mathcal A = \mathcal P(X)$ is also an algebra. Another interesting example is $\mathcal E = \{E \subseteq \mathbb{R}: E$ can be written as a finite pairwise disjoint union of half-open intervals $(a,b]$, with $a, b$ in $\mathbb{R}\}$. Also, it is not hard to check that if $\{\mathcal A_i\}_{i \in I}$ is a collection of algebras, then $\mathcal A = \bigcap_{i \in I} \mathcal A_i$ is an algebra. If $\mathcal A$ is an algebra of subsets of $X$ and $E \subseteq X$, then the family $\mathcal A_E = \{E \cap A: A \in \mathcal A\}$ is an algebra of subsets of $E$.

What operations can we perform with the sets of an algebra $\mathcal A$ and still remain in $\mathcal A$?
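Before answering, it may help to see that for a finite universal set the three defining properties can be checked mechanically. The following Python sketch is only an illustration: the helper name `is_algebra` and the sample families are assumptions introduced here, not part of the text. Since closure under pairwise unions implies closure under finite unions, only pairs are tested.

```python
from itertools import combinations

def is_algebra(X, family):
    """Check properties (i)-(iii) for a family of subsets of a finite set X."""
    X = frozenset(X)
    fam = {frozenset(E) for E in family}
    if not fam:                                   # (i) nonempty
        return False
    if any(X - E not in fam for E in fam):        # (ii) closed under complements
        return False
    # (iii) closed under finite unions; pairwise unions suffice by induction
    return all(E | F in fam for E, F in combinations(fam, 2))

X = {1, 2, 3, 4}
trivial = [set(), X]                              # the "most economical" algebra
power_set = [set(c) for r in range(len(X) + 1)
             for c in combinations(sorted(X), r)]
not_closed = [set(), {1}, {2}, X]                 # complement of {1} is missing

print(is_algebra(X, trivial), is_algebra(X, power_set), is_algebra(X, not_closed))
```

The last family fails property (ii), which is the typical way a naive collection of sets falls short of being an algebra.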
Proposition 1.1. Suppose $\mathcal A$ is an algebra of sets, and $E_1, E_2 \in \mathcal A$. Then $E_1 \cap E_2$ and $E_1 \setminus E_2$ belong to $\mathcal A$.

Proof. Since by 5.1 in Chapter I
$$X \setminus (E_1 \cap E_2) = (X \setminus E_1) \cup (X \setminus E_2), \tag{1.1}$$
by (ii) and (iii) the set on the right-hand side of (1.1) is in $\mathcal A$ and consequently, the complement of $E_1 \cap E_2$ belongs to $\mathcal A$. By (ii) again, $E_1 \cap E_2 \in \mathcal A$. Moreover, since $E_1 \setminus E_2 = E_1 \cap (X \setminus E_2)$, by (ii) and the first part of the proof, we have $E_1 \setminus E_2 \in \mathcal A$. ■

In applications it is often convenient to replace (iii) by the seemingly weaker condition that $\mathcal A$ be closed under the union of pairwise disjoint sets, namely:
(iii') If $\{E_k\}_{k=1}^n$ is a collection of pairwise disjoint sets in $\mathcal A$, then $\bigcup_{k=1}^n E_k \in \mathcal A$.
However, as an argument using 5.2 in Chapter I and Proposition 1.1 readily shows, (iii) and (iii') are actually equivalent.

We consider the taking of limits next. Given a sequence $\{A_n\}$, we define the sets $\limsup A_n = \{x: x$ belongs to infinitely many $A_n$'s$\}$ and $\liminf A_n = \{x: x$ belongs to all but finitely many $A_n$'s$\}$. It is not hard to see that
$$\limsup A_n = \bigcap_{m=1}^\infty \Big(\bigcup_{n=m}^\infty A_n\Big), \tag{1.2}$$
and
$$\liminf A_n = \bigcup_{m=1}^\infty \Big(\bigcap_{n=m}^\infty A_n\Big). \tag{1.3}$$
For instance, if $A_n = [0,1]$, $n$ odd, and $A_n = [1,2]$, $n$ even, then $\liminf A_n = \{1\}$, and $\limsup A_n = [0,2]$. When the limits are equal we say that the sequence $\{A_n\}$ converges and the common value is denoted by $\lim A_n$.

From the expressions for the limits it is apparent that limiting operations are not necessarily closed in an algebra of sets; we are thus led to the concept of $\sigma$-algebra. We say that an algebra $\mathcal A$ of subsets of $X$ is a $\sigma$-algebra of sets, or plainly a $\sigma$-algebra, if it satisfies the additional property
(iv) If $\{E_k\}_{k=1}^\infty \subseteq \mathcal A$, then $\bigcup_{k=1}^\infty E_k \in \mathcal A$.
As before, (iv) is equivalent to the condition obtained by requiring that the $E_k$'s be pairwise disjoint.
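The formulas (1.2) and (1.3) for the upper and lower limits can be tried out on a small discrete analogue of the example above. The sketch below is hypothetical: it replaces the intervals $[0,1]$ and $[1,2]$ by the finite sets $\{0,1\}$ and $\{1,2\}$, and it assumes the sequence is periodic, so that every tail union and tail intersection is already determined by one period. The function names are introduced here for illustration only.

```python
def a(n):
    """Hypothetical sequence: A_n = {0, 1} for n odd, A_n = {1, 2} for n even."""
    return frozenset({0, 1}) if n % 2 == 1 else frozenset({1, 2})

def lim_sup_inf(seq, period):
    """Apply (1.2) and (1.3) to a sequence of finite sets with the given period.

    For a periodic sequence each tail union (intersection) is determined by
    `period` consecutive terms, so finite loops suffice.
    """
    tail_unions = []
    tail_intersections = []
    for m in range(1, period + 1):
        window = [seq(n) for n in range(m, m + period)]
        tail_unions.append(frozenset().union(*window))
        tail_intersections.append(frozenset.intersection(*window))
    lim_sup = frozenset.intersection(*tail_unions)    # formula (1.2)
    lim_inf = frozenset().union(*tail_intersections)  # formula (1.3)
    return lim_sup, lim_inf

lim_sup, lim_inf = lim_sup_inf(a, period=2)
print(sorted(lim_sup), sorted(lim_inf))  # prints [0, 1, 2] [1]
```

As in the interval example, the lower limit is the single common point and the upper limit is the union of the two alternating sets, so this sequence does not converge.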
$\mathcal P(X)$ is a $\sigma$-algebra, and the algebra $\mathcal E$ introduced above is not. Also, if $\{\mathcal A_i\}_{i \in I}$ is a family of $\sigma$-algebras, then $\mathcal A = \bigcap_{i \in I} \mathcal A_i$ is a $\sigma$-algebra as well. If $\mathcal A$ is a $\sigma$-algebra of subsets of $X$ and $E \subseteq X$, the collection $\mathcal A_E = \{A \cap E: A \in \mathcal A\}$ is a $\sigma$-algebra of subsets of $E$. As for the consideration of the limits we have

Proposition 1.2. Suppose $\{A_n\}$ is a sequence of sets of a $\sigma$-algebra $\mathcal A$. Then $\liminf A_n$ and $\limsup A_n$ belong to $\mathcal A$.

Proof. Since $A_n = X \setminus (X \setminus A_n)$, by the relations 5.1 in Chapter I we have that $\limsup A_n$ equals
$$\bigcap_{m=1}^\infty \Big(\bigcup_{n=m}^\infty \big(X \setminus (X \setminus A_n)\big)\Big) = \bigcap_{m=1}^\infty \Big(X \setminus \bigcap_{n=m}^\infty (X \setminus A_n)\Big) = X \setminus \bigcup_{m=1}^\infty \Big(\bigcap_{n=m}^\infty (X \setminus A_n)\Big),$$
i.e., $\limsup A_n = X \setminus \liminf (X \setminus A_n)$. Thus, if we prove the conclusion for the $\limsup A_n$, it will follow for the $\liminf A_n$, and vice versa. Now, since $\mathcal A$ is a $\sigma$-algebra, $B_m = \bigcup_{n=m}^\infty A_n \in \mathcal A$ for all $m$, and by the countable version of Proposition 1.1 (which holds for $\sigma$-algebras and which is proved in a similar fashion), $\limsup A_n = \bigcap_{m=1}^\infty B_m \in \mathcal A$, and we are done. ■

That the notion of $\sigma$-algebra is the natural one to deal with limits is also expressed by

Proposition 1.3. Suppose $\mathcal A$ is an algebra. Then $\mathcal A$ is a $\sigma$-algebra iff for every sequence $\{A_k\} \subseteq \mathcal A$, $\limsup A_k \in \mathcal A$.

Proof. Since Proposition 1.2 gives the necessity of the condition we only do the sufficiency. We must only show that property (iv) holds. Let $\{A_k\} \subseteq \mathcal A$, and set $B_n = \bigcup_{k=1}^n A_k$. Since $\mathcal A$ is an algebra, $B_n \in \mathcal A$ for all $n$, and consequently, by assumption, $\limsup B_n \in \mathcal A$. Now every $x$ in $\bigcup_{k=1}^\infty A_k$ belongs to infinitely many $B_n$'s; in fact, if $x \in A_k$, then $x \in B_n$ for all $n \ge k$. Since also each $B_n \subseteq \bigcup_{k=1}^\infty A_k$, it follows that $\bigcup_{k=1}^\infty A_k = \limsup B_n \in \mathcal A$, and we have finished. ■

Next we consider the following question: Given a family $\mathcal C$ of subsets of $X$, what is the smallest family of subsets of $X$ that contains the limits of all sequences of sets in $\mathcal C$? Or equivalently, which is the smallest $\sigma$-algebra of subsets of $X$ that contains $\mathcal C$? If $\mathcal C$ is a $\sigma$-algebra, then the answer is $\mathcal C$. Otherwise, let $\mathcal F$ be the family of all the $\sigma$-algebras of subsets of $X$ which contain $\mathcal C$. Since $\mathcal P(X) \in \mathcal F$,
$\mathcal F \ne \emptyset$. As observed above, the intersection of an arbitrary family of $\sigma$-algebras is again a $\sigma$-algebra, and the intersection of all the $\sigma$-algebras in $\mathcal F$ is the smallest $\sigma$-algebra that contains $\mathcal C$. This $\sigma$-algebra is called the $\sigma$-algebra generated by $\mathcal C$ and it is denoted by $\mathcal S(\mathcal C)$. For instance, if $\mathcal C$ is the family of all the singletons $\{x\}$ of $X$, then $\mathcal S(\mathcal C) = \{E \subseteq X:$ either $E$ is at most countable, or else $X \setminus E$ is at most countable$\}$. Of course, if $X$ is at most countable, then $\mathcal S(\mathcal C) = \mathcal P(X)$. There are two other examples which are useful in applications and we discuss them next.

Example 1.4. Let $\mathcal A_1$ be a $\sigma$-algebra of subsets of $X_1$ and $\mathcal A_2$ a $\sigma$-algebra of subsets of $X_2$. We are interested in constructing a $\sigma$-algebra of subsets of $X_1 \times X_2$ in terms of $\mathcal A_1$ and $\mathcal A_2$. Our first candidate is the family
$$\mathcal C = \{E_1 \times E_2: E_1 \in \mathcal A_1 \text{ and } E_2 \in \mathcal A_2\},$$
but it is not hard to see that $\mathcal C$ is not necessarily closed under complementation, and consequently, $\mathcal C$ is not even an algebra. Therefore we define the product $\sigma$-algebra $\mathcal A_1 \times \mathcal A_2 = \mathcal S(\mathcal C)$. This is the natural thing to do, since $\mathcal S(\mathcal C)$ is the smallest $\sigma$-algebra containing all the "rectangles" $E_1 \times E_2$. In some cases it is possible to give a concrete description of the product $\sigma$-algebra. For instance, if $\mathcal A_1 = \mathcal S(\mathcal C_1)$ and $\mathcal A_2 = \mathcal S(\mathcal C_2)$, then $\mathcal A_1 \times \mathcal A_2 = \mathcal S(\mathcal C_1 \times \mathcal C_2)$, cf. 4.22 below.

Example 1.5. Suppose $X = \mathbb{R}^n$ is the Euclidean $n$-dimensional space endowed with the usual topology, and let $\mathcal O$ denote the family of open sets of $\mathbb{R}^n$. Then $\mathcal S(\mathcal O)$ is called the $\sigma$-algebra of Borel subsets of $\mathbb{R}^n$ and it is denoted by $\mathcal B_n$. There is yet a simpler way to generate $\mathcal B_n$. When $n = 1$, let $\mathcal I$ denote the collection of all open intervals of $\mathbb{R}$. Since every open set is a disjoint, at most countable union of intervals in $\mathcal I$, it is clear that also $\mathcal S(\mathcal I) = \mathcal B_1$. In fact, since each interval $(a,b]$ of $\mathcal E$ can be expressed as the intersection
$$(a,b] = \bigcap_{n=1}^\infty (a, b + 1/n),$$
it also follows that $\mathcal S(\mathcal E) = \mathcal B_1$. When $n > 1$ we use the fact that every open set can be written as the countable union of nonoverlapping closed cubes, i.e., if the cubes intersect it is only along the faces, cf. 4.18 below. Thus, if $\mathcal C$ denotes now the family of all the closed cubes in $\mathbb{R}^n$, we also have $\mathcal S(\mathcal C) = \mathcal B_n$. Finally, by 4.21 below, $\mathcal B_n \times \mathcal B_m = \mathcal B_{n+m}$.
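When $X$ is finite, the generated $\sigma$-algebra $\mathcal S(\mathcal C)$ can be computed directly, since countable unions reduce to finite ones. The Python sketch below is a hedged illustration under that finiteness assumption; the function name `generated_sigma_algebra` is hypothetical. It does not form the intersection of all $\sigma$-algebras containing $\mathcal C$; instead it closes $\mathcal C$ under complements and pairwise unions until nothing new appears, which yields the same family in the finite case.

```python
def generated_sigma_algebra(X, C):
    """Smallest family containing C that is closed under complements and
    unions, for a FINITE set X (so sigma-algebra and algebra coincide)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(E) for E in C}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for E in current:
            # candidates: the complement of E and all unions E | F
            for candidate in [X - E] + [E | F for F in current]:
                if candidate not in family:
                    family.add(candidate)
                    changed = True
    return family

X = {1, 2, 3, 4}
one_set = generated_sigma_algebra(X, [{1}])
two_sets = generated_sigma_algebra(X, [{1}, {2}])
singletons = generated_sigma_algebra(X, [{x} for x in X])

print(sorted(sorted(E) for E in one_set))
print(len(two_sets), len(singletons))
```

With all singletons as generators the result is the whole power set, in agreement with the remark that $\mathcal S(\mathcal C) = \mathcal P(X)$ when $X$ is at most countable.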
2. ADDITIVE SET FUNCTIONS AND MEASURES

A measure on $\mathbb{R}^n$ is a natural generalization of such elementary notions as the length of a line segment, the area of a rectangle and the volume of a parallelepiped. In 1898, Borel formulated the following four postulates for defining measures of sets:
(i) A measure is always nonnegative.
(ii) The measure of the union of a finite number of pairwise disjoint sets is equal to the sum of their measures.
(iii) The measure of the complement of a set relative to another equals the difference of their measures.
(iv) Every set whose measure is not 0 is uncountable.

Based on these postulates we introduce the notion of an additive set function. Given a set $X$ and an algebra $\mathcal A$ of subsets of $X$, a set function $\psi$ on $\mathcal A$ is a function which assigns to each set of $\mathcal A$ a real value, or $\pm\infty$. To avoid technical difficulties we assume that if $\psi$ takes infinite values, they are all of the same sign. A set function $\psi$ is said to be additive provided the following property holds: If $\{E_k\}_{k=1}^n \subseteq \mathcal A$ and the $E_k$'s are pairwise disjoint, then
$$\psi\Big(\bigcup_{k=1}^n E_k\Big) = \sum_{k=1}^n \psi(E_k). \tag{2.1}$$
This property corresponds to Borel's postulate (ii). Because $\psi$ only takes infinite values of one sign, the right-hand side of (2.1) always makes sense under the usual arithmetic rules:
$$r + \infty = \infty, \quad r + (-\infty) = -\infty, \quad \infty + \infty = \infty, \quad (-\infty) + (-\infty) = -\infty.$$
Here are two examples of additive set functions.

Example 2.1. Let $X$ be an infinite set and let $\mathcal A = \{E \subseteq X:$ either $E$ is finite or $X \setminus E$ is finite$\}$. Then $\mathcal A$ is an algebra and the mapping $\psi: \mathcal A \to [0, \infty]$ given by $\psi(E) = 0$ if $E$ is finite and $\psi(E) = \infty$ if $X \setminus E$ is finite, is an additive set function.

Example 2.2. Let $X = (0,1]$, put $\mathcal E = \{E \subseteq X: E$ can be written as a finite disjoint union of half-open intervals $(a,b]\}$, and let $f: X \to \mathbb{R}$ be nondecreasing. Then $\mathcal E$ is an algebra of sets and $\psi: \mathcal E \to [f(0+), f(1)]$ given by $\psi((a,b]) = f(b) - f(a)$ and extended additively otherwise, is an additive set function; one must check, of course, that the value $\psi(E)$ does not depend on the way in which $E$ is represented as a disjoint union of
half-open intervals. There is more to this example than meets the eye, and we will return to it in Chapter IX.

Additive set functions have the following property: If $E_1, E_2 \in \mathcal A$, $E_1 \subseteq E_2$ and $\psi(E_1)$ is a finite value, then Borel's postulate (iii) holds, to wit
$$\psi(E_2 \setminus E_1) = \psi(E_2) - \psi(E_1). \tag{2.2}$$
Indeed, by (2.1) we have that $\psi(E_2) = \psi(E_1) + \psi(E_2 \setminus E_1)$. Now, $\psi(E_2)$ is either a finite value or not. In the former case we subtract $\psi(E_1)$ from both sides of the above equality and (2.2) follows; in the latter case we observe that $\psi(E_2 \setminus E_1)$ is infinite of the same sign as $\psi(E_2)$ and (2.2) still holds. An immediate consequence of (2.2) is that $\psi(\emptyset) = 0$. As a matter of fact, an additive set function $\psi$ is either identically infinite, or else $\psi(\emptyset) = 0$. From this point on we consider nonnegative set functions (Borel's postulate (i)), but we will have more to say concerning "signed" set functions, cf. 4.8 - 4.12 below and Chapter XI.

How do additive set functions behave with respect to limits? For instance, if in Example 2.1 the set $X = \{x_1, x_2, \ldots\}$ is countable and we put $E_n = \{x_1, \ldots, x_n\}$, $n \ge 1$, then it readily follows that $\psi(\lim E_n) = \psi(X) = \infty \ne \lim \psi(E_n) = 0$. To deal with this inconvenience we restrict the domain of an additive set function to a $\sigma$-algebra and require an additional compatibility condition, the $\sigma$-additivity. More precisely, given a set $X$ and a $\sigma$-algebra $\mathcal M$ of subsets of $X$, we say that a set function $\mu$ defined on $\mathcal M$ is a measure provided the following three properties hold:
(i) $\mu: \mathcal M \to [0, \infty]$.
(ii) $\mu(\emptyset) = 0$.
(iii) If $\{E_k\}_{k=1}^\infty \subseteq \mathcal M$ is a sequence of pairwise disjoint sets, then
$$\mu\Big(\bigcup_{k=1}^\infty E_k\Big) = \sum_{k=1}^\infty \mu(E_k).$$
Condition (ii) is assumed in order to exclude the possibility that $\mu$ is identically $\infty$. To emphasize the interrelation among these objects we say that $\mu$ is a measure on $(X, \mathcal M)$, or that the triplet $(X, \mathcal M, \mu)$ is a measure space. The $\sigma$-algebra $\mathcal M$ is called the family of $\mu$-measurable, or plainly measurable, sets.

A measure $\mu$ is said to be finite if $\mu(X) < \infty$. In this case, by (2.2), it follows that for every measurable set $E$, $\mu(E) \le \mu(X)$, and $\mu$ only takes
finite values. When $\mu$ is a finite measure we may rescale, i.e., consider $\mu_1(E) = \mu(E)/\mu(X)$, $E \in \mathcal M$, instead, and assume that $\mu(X) = 1$; these measures are called probability measures. The measure space $(X, \mathcal M, \mu)$ is said to be $\sigma$-finite if $X$ is the countable union of measurable sets, each of finite $\mu$ measure. Informally, we also say that $\mu$ is $\sigma$-finite.

Some examples will clarify these concepts. Let $\mathcal M$ be the $\sigma$-algebra of subsets of an uncountable set $X$ generated by the singletons of $X$. Then the set function $\psi(E) = 0$ when $E$ is finite, and $\psi(E) = \infty$ otherwise, is an additive set function which is not a measure. On the other hand, the set functions $\mu(E) =$ number of elements of $E$ when $E$ is finite and $\mu(E) = \infty$ when $E$ is measurable and infinite, and $\nu(E) = 0$ when $E$ is at most countable and $\nu(E) = 1$ when $X \setminus E$ is at most countable, are measures. $\mu$ is not $\sigma$-finite and $\nu$ is a probability measure. Also, if $X$ is countable, the measure $\mu$ on $(X, \mathcal P(X))$ given by $\mu(E) =$ number of elements of $E$ if $E$ is finite and $\mu(E) = \infty$ otherwise, is $\sigma$-finite.

Next suppose that $X$ is a nonempty set and that $\mathcal M = \mathcal P(X)$. Let $f: X \to [0, \infty]$, and for $E \in \mathcal M$ put
$$\mu(E) = \sum_{x \in E} f(x).$$
As usual the sum is defined as
$$\sum_{x \in E} f(x) = \sup \sum_{k=1}^n f(x_k),$$
where the sup is taken over all finite subsets $\{x_1, \ldots, x_n\}$ of $E$. To verify that $(X, \mathcal M, \mu)$ is a measure space, the only step that offers any difficulty is the $\sigma$-additivity of $\mu$; we do this next. Let $\{E_k\}_{k=1}^\infty$ be a sequence of pairwise disjoint measurable sets, and let $E$ denote its union. Given a finite subset $\{x_1, \ldots, x_n\}$ of $E$, suppose that $\{x_1, \ldots, x_n\} \subseteq \bigcup_{i=1}^m E_{k_i}$, $m \le n$, and note that
$$\sum_{k=1}^n f(x_k) \le \sum_{i=1}^m \mu(E_{k_i}) \le \sum_{k=1}^\infty \mu(E_k). \tag{2.3}$$
Thus, taking the supremum of the left-hand side of (2.3) over all finite subsets of $E$, we get that
$$\mu(E) \le \sum_{k=1}^\infty \mu(E_k). \tag{2.4}$$
We show the opposite inequality next. If $\mu(E_k) = \infty$ for some $k$, since $\mu(E) = \mu(E \setminus E_k) + \mu(E_k) \ge \mu(E_k)$, $\mu(E)$ is also infinite and there is nothing to prove. Otherwise, given $\varepsilon > 0$, let $\{x_{1,k}, \ldots, x_{n(k),k}\} \subseteq E_k$ be such that
$$\mu(E_k) \le \sum_{i=1}^{n(k)} f(x_{i,k}) + \varepsilon 2^{-k}, \qquad k = 1, 2, \ldots \tag{2.5}$$
For each integer $m$, $\{x_{1,1}, \ldots, x_{n(m),m}\}$ is a finite subset of $E$, and, by (2.5),
$$\sum_{k=1}^m \mu(E_k) \le \sum_{k=1}^m \sum_{i=1}^{n(k)} f(x_{i,k}) + \varepsilon \sum_{k=1}^m 2^{-k} \le \mu(E) + \varepsilon. \tag{2.6}$$
Since the right-hand side of (2.6) is independent of $m$, we may let $m \to \infty$ in the left-hand side there, and thus obtain
$$\sum_{k=1}^\infty \mu(E_k) \le \mu(E) + \varepsilon.$$
But since $\varepsilon$ above is arbitrary, the inequality opposite to (2.4) holds, and $\mu$ is $\sigma$-additive.

Three particular instances of this example are of interest. If
$$\sum_{x \in X} f(x) = 1,$$
$\mu$ is a probability measure. On the other hand, if $f(x) = 1$ for all $x \in X$, $\mu$ is called, for obvious reasons, the counting measure on $X$. The counting measure is finite if $X$ itself is finite and it is $\sigma$-finite if $X$ is countable. Finally, if $f(x_0) = 1$ for some fixed $x_0 \in X$ and $f(x) = 0$ for $x \ne x_0$, $\mu$ is called the Dirac measure supported at $x_0$ and is denoted by $\delta_{x_0}$; clearly $\delta_{x_0}(E) = 1$ or $0$, according as to whether $x_0$ belongs to $E$ or not. The interesting question of when, in general, $\mu$ is finite or $\sigma$-finite, is left for the reader to answer, cf. 4.24 below.

We close this section with a simple criterion that enables us to determine when an additive set function is a measure.

Theorem 2.3. Let $\psi$ be an additive, finitely valued set function defined on a $\sigma$-algebra $\mathcal A$. Then $\psi$ is a measure iff for any nonincreasing sequence $\{E_k\}_{k=1}^\infty \subseteq \mathcal A$ with $\bigcap_{k=1}^\infty E_k = \emptyset$, we have $\lim_{k \to \infty} \psi(E_k) = 0$.
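Proof. Assume that $\psi$ is a measure and let $\{E_k\}$ be a nonincreasing sequence of sets in $\mathcal A$ with $\bigcap_{k=1}^\infty E_k = \emptyset$. Then $E_1$ is the disjoint union of the sets $E_k \setminus E_{k+1}$, and by the $\sigma$-additivity of the finite set function $\psi$ we have
$$\psi(E_1) = \sum_{k=1}^\infty \psi(E_k \setminus E_{k+1}) = \lim_{n \to \infty} \sum_{k=1}^{n-1} \big(\psi(E_k) - \psi(E_{k+1})\big) = \psi(E_1) - \lim_{n \to \infty} \psi(E_n).$$
Since $\psi(E_1)$ is finite, $\lim_{n \to \infty} \psi(E_n) = 0$, and the necessity follows.

As for the sufficiency, let $\{E_k\}$ be a disjoint sequence of measurable sets with union $E$ and let $A_n = E \setminus (E_1 \cup \cdots \cup E_n)$, $n = 1, 2, \ldots$ Then $\{A_n\}$ is a nonincreasing sequence of measurable sets with $\bigcap_{n=1}^\infty A_n = \emptyset$, and $\lim_{n \to \infty} \psi(A_n) = 0$. Now, since $\psi$ is additive we have
$$\psi(E) = \sum_{k=1}^n \psi(E_k) + \psi(A_n), \qquad n = 1, 2, \ldots$$
Whence taking the limit as $n \to \infty$ it follows that
$$\psi(E) = \lim_{n \to \infty} \sum_{k=1}^n \psi(E_k) + \lim_{n \to \infty} \psi(A_n) = \sum_{k=1}^\infty \psi(E_k),$$
and the $\sigma$-additivity of $\psi$ has been established. ■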
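The measures $\mu(E) = \sum_{x \in E} f(x)$ discussed in this section are easy to experiment with when $X$ is finite. The following Python sketch is an illustration under that assumption; the helper `make_measure` and the particular weights are hypothetical choices made here. It builds the counting measure, a Dirac measure and a probability measure from a weight function $f$, and checks additivity on one pair of disjoint sets.

```python
from fractions import Fraction

def make_measure(f):
    """Return mu with mu(E) = sum of f[x] over x in E (finite sets only)."""
    def mu(E):
        return sum((f[x] for x in E), Fraction(0))
    return mu

X = ["a", "b", "c", "d"]

# f = 1 everywhere: the counting measure on X
counting = make_measure({x: Fraction(1) for x in X})

# f = 1 at "b" and 0 elsewhere: the Dirac measure supported at "b"
dirac_b = make_measure({x: Fraction(int(x == "b")) for x in X})

# weights summing to 1: a probability measure
prob = make_measure({"a": Fraction(1, 2), "b": Fraction(1, 4),
                     "c": Fraction(1, 8), "d": Fraction(1, 8)})

print(counting({"a", "c"}), dirac_b({"a", "b"}), dirac_b({"a"}), prob(X))
# additivity on the disjoint sets {"a"} and {"b", "c"}
print(prob({"a"}) + prob({"b", "c"}) == prob({"a", "b", "c"}))
```

Exact rational arithmetic is used so that the additivity check is not affected by floating-point rounding.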
3. PROPERTIES OF MEASURES

How do measures behave with respect to the usual set operations, and with respect to the limiting operations? Some of the basic properties are given in

Proposition 3.1. Suppose $(X, \mathcal M, \mu)$ is a measure space. Then the following properties hold:
(i) (Monotonicity) If $E, F$ are measurable and $E \subseteq F$, then $\mu(E) \le \mu(F)$. Moreover, if $\mu(E)$ is finite, then
$$\mu(F \setminus E) = \mu(F) - \mu(E). \tag{3.1}$$
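(ii) ($\sigma$-subadditivity) If $\{E_k\}$ is a sequence of measurable sets, then
$$\mu\Big(\bigcup_{k=1}^\infty E_k\Big) \le \sum_{k=1}^\infty \mu(E_k). \tag{3.2}$$
(iii) (Continuity from below) If $E_1 \subseteq E_2 \subseteq \cdots$ is a nondecreasing sequence of measurable sets, then
$$\mu\Big(\bigcup_{k=1}^\infty E_k\Big) = \lim_{k \to \infty} \mu(E_k). \tag{3.3}$$
(iv) (Continuity from above) If $E_1 \supseteq E_2 \supseteq \cdots$ is a nonincreasing sequence of measurable sets and for some $k$, $\mu(E_k) < \infty$, then
$$\mu\Big(\bigcap_{k=1}^\infty E_k\Big) = \lim_{k \to \infty} \mu(E_k). \tag{3.4}$$

Proof. The monotonicity was essentially established in (2.2) above, so we say no more. As for the $\sigma$-subadditivity, first note that on account of 5.2 in Chapter I and the properties of measurable sets, we may rewrite $\bigcup E_k = \bigcup F_k$, where $\{F_k\}$ is a sequence of pairwise disjoint measurable sets with $F_k \subseteq E_k$ for all $k$. Consequently, by the $\sigma$-additivity and the monotonicity of $\mu$ it follows that
$$\mu\Big(\bigcup_k E_k\Big) = \mu\Big(\bigcup_k F_k\Big) = \sum_k \mu(F_k) \le \sum_k \mu(E_k),$$
which is precisely (3.2).

Next note that if $\{E_k\}$ is nondecreasing, then $\lim E_k = \bigcup E_k$, so that in this particular instance the measure of the limit is the limit of the measures. Now, if $\mu(E_{k_0}) = \infty$ for some $k_0$, by the monotonicity of $\mu$, $\mu(E_k) = \infty$ for all $k \ge k_0$, and also $\mu(\bigcup E_k) = \infty$; in this case there is nothing to prove. Otherwise, since the sequence in question is nondecreasing, put $E_0 = \emptyset$, and note that $\bigcup_{k=1}^\infty E_k = \bigcup_{k=1}^\infty (E_k \setminus E_{k-1})$, where the sequence $\{E_k \setminus E_{k-1}\}$ is pairwise disjoint. Whence
$$\mu\Big(\bigcup_{k=1}^\infty E_k\Big) = \sum_{k=1}^\infty \mu(E_k \setminus E_{k-1}) = \lim_{n \to \infty} \sum_{k=1}^n \mu(E_k \setminus E_{k-1}).$$
Thus, by (3.1), we obtain that
$$\sum_{k=1}^n \mu(E_k \setminus E_{k-1}) = \sum_{k=1}^n \big(\mu(E_k) - \mu(E_{k-1})\big) = \mu(E_n) - \mu(E_0) = \mu(E_n),$$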
and (3.3) follows.

Finally, the idea to prove the continuity from above is to reduce the problem to one of continuity from below and to invoke (3.3). Replacing $E_k$ by $E_k \cap E_{k_0}$ if necessary, where $\mu(E_{k_0}) < \infty$, we may assume that $\mu(E_1) < \infty$. Since $\{E_k\}$ is nonincreasing, the sequence $\{E_1 \setminus E_k\}$ is nondecreasing, and, by (3.3),
$$\mu\Big(\bigcup_{k=1}^\infty (E_1 \setminus E_k)\Big) = \lim_{k \to \infty} \mu(E_1 \setminus E_k). \tag{3.5}$$
Since $\bigcup_{k=1}^\infty (E_1 \setminus E_k) = E_1 \setminus \bigcap_{k=1}^\infty E_k$, and since by (3.1) it follows that $\mu(\bigcup_{k=1}^\infty (E_1 \setminus E_k)) = \mu(E_1) - \mu(\bigcap_{k=1}^\infty E_k)$, and that $\mu(E_1 \setminus E_k) = \mu(E_1) - \mu(E_k)$, by (3.5) we get that
$$\mu(E_1) - \mu\Big(\bigcap_{k=1}^\infty E_k\Big) = \mu(E_1) - \lim_{k \to \infty} \mu(E_k).$$
Moreover, $\mu(E_1) < \infty$, and this quantity may be cancelled in the above equality to give (3.4), and to complete the proof. ■

The restriction that $\mu(E_k) < \infty$ for some $k$ is necessary for (3.4) to hold. Indeed, let $\mu$ be the measure on $(\mathbb{N}, \mathcal P(\mathbb{N}))$ given by $\mu(E) =$ number of elements of $E$ if $E$ is finite and $\mu(E) = \infty$ otherwise, and let $A_k = \{k, k+1, \ldots\}$. Then $\mu(A_k) = \infty$ for all $k$, but $\mu(\bigcap A_k) = \mu(\emptyset) = 0$.

In working with measures a useful result is the following

Theorem 3.2 (Borel-Cantelli Lemma). Suppose $(X, \mathcal M, \mu)$ is a measure space and let $\{E_n\}$ be a sequence of measurable sets with the property that $\sum_{n=1}^\infty \mu(E_n) < \infty$. Then
$$\mu(\limsup E_n) = 0. \tag{3.6}$$

Proof. First observe that by the $\sigma$-subadditivity of $\mu$,
$$\mu\Big(\bigcup_{n=1}^\infty E_n\Big) \le \sum_{n=1}^\infty \mu(E_n) < \infty.$$
Now, the sequence consisting of $A_m = \bigcup_{n=m}^\infty E_n$, $m \ge 1$, is nonincreasing and $\mu(A_1) < \infty$. Whence, by (3.4)
$$\mu\Big(\bigcap_{m=1}^\infty A_m\Big) = \lim_{m \to \infty} \mu(A_m). \tag{3.7}$$
The set on the left-hand side of (3.7) is precisely $\limsup E_n$. As for the right-hand side, note that the measure of each set there does not exceed
$\sum_{n=m}^\infty \mu(E_n)$, and since these are the tails of a convergent series, we have $\lim_{m \to \infty} \mu(A_m) = 0$. Consequently, (3.7) gives at once (3.6). ■
Measures, or additive set functions for that matter, can be restricted or extended. More precisely, if $\mathcal M_1 \subseteq \mathcal M_2$ are $\sigma$-algebras of subsets of $X$, we say that a measure $\mu_1$ on $(X, \mathcal M_1)$ is the restriction to $\mathcal M_1$ of the measure $\mu_2$ on $(X, \mathcal M_2)$, and we write $\mu_1 = \mu_2|\mathcal M_1$, if $\mu_1(E) = \mu_2(E)$ for every $E \in \mathcal M_1$. In this case we also say that $\mu_2$ is an extension of $\mu_1$ to $\mathcal M_2$. For instance, given a measure space $(X, \mathcal M, \mu)$ and $A \in \mathcal M$, $\mu|\mathcal M_A$ can intuitively be thought of as the restriction of $\mu$ to $A$.

Sets of measure 0 play a special role in many questions of interest to us. Given a measure space $(X, \mathcal M, \mu)$, any measurable set of measure 0 is called a null set. Null sets are often denoted by $N$. If $\{N_k\}$ is a sequence of null sets, then by the $\sigma$-subadditivity of $\mu$ it readily follows that $\bigcup N_k$ is also a null set. Also if $N$ is a null set and $A \subseteq N$, then by the monotonicity of $\mu$ it follows that $\mu(A) = 0$, provided that $A$ is measurable. Measurability of $A$, of course, does not hold for all measures. Consider, for instance, the following simple example: Let $X = \{a, b, c\}$, $\mathcal M = \{\emptyset, \{a\}, \{b,c\}, X\}$, and $\mu(\{a\}) = 1$, and $\mu(\{b,c\}) = 0$. Then $\mu$ is a probability measure and $N = \{b,c\}$ is a null set, but $\{b\}, \{c\} \subset \{b,c\}$ are not measurable. This motivates our next definition. A measure space $(X, \mathcal M, \mu)$ is said to be complete if whenever $N \in \mathcal M$ is a null set and $A \subseteq N$, then $A$ is also a measurable null set. In this case we also say, plainly, that $\mu$ is complete.

Since it is quite inconvenient to work with measure spaces which are not complete, we consider next whether a measure which is not complete can be extended, in a natural way, to a complete measure. Let, then, $(X, \mathcal M, \mu)$ be a measure space and put $\mathcal N = \{N \in \mathcal M: \mu(N) = 0\}$. If $\mu$ is not complete, then there is a set in $\mathcal P(\mathcal N)$, the family of all subsets of sets in $\mathcal N$, which is not measurable, and the first step in constructing an extension of $\mu$ which is complete is to find a $\sigma$-algebra $\mathcal M_1$ which contains both $\mathcal M$ and $\mathcal P(\mathcal N)$.

The natural choice for $\mathcal M_1$ is $\mathcal S(\mathcal M \cup \mathcal P(\mathcal N))$; fortunately there is a simpler way to characterize $\mathcal M_1$. Indeed, if
$$\mathcal A = \{E \cup F: E \in \mathcal M, F \in \mathcal P(\mathcal N)\},$$
then we claim that $\mathcal M_1 = \mathcal A$. Clearly $\mathcal A \subseteq \mathcal M_1$. If we can show that $\mathcal A$ is a $\sigma$-algebra of sets, since $\mathcal A$ contains $\mathcal M$ and $\mathcal P(\mathcal N)$, it also contains the $\sigma$-algebra generated by $\mathcal M \cup \mathcal P(\mathcal N)$, that is $\mathcal M_1$, and we are done. First, $\mathcal A \ne \emptyset$. Next we show that if $A \in \mathcal A$, then also $X \setminus A \in \mathcal A$. Let $A = E \cup F$, where $E$ is measurable and $F \subseteq N$, $N \in \mathcal N$. Since $N \in \mathcal N$
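and $E \cup F = E \cup (F \setminus E)$, replacing $N$ by $N \setminus E$ if necessary, we may assume that $E \cap F = E \cap N = \emptyset$. Now, since $E$ and $N$ are disjoint, we have that $E \cup F = (E \cup N) \cap (F \cup (X \setminus N))$, and consequently,
$$X \setminus (E \cup F) = \big(X \setminus (E \cup N)\big) \cup \big(X \setminus (F \cup (X \setminus N))\big) = \big(X \setminus (E \cup N)\big) \cup \big((X \setminus F) \cap N\big) = E_1 \cup F_1,$$
say. Since $E \cup N$ is measurable, $E_1 \in \mathcal M$, and since $F_1 \subseteq N$, $F_1 \in \mathcal P(\mathcal N)$; in other words $X \setminus (E \cup F) \in \mathcal A$ as we wanted to show. Finally we check that if $\{A_k\}$ is a sequence of sets in $\mathcal A$, then also $\bigcup_{k=1}^\infty A_k$ belongs to $\mathcal A$. This is not hard: Since $A_k = E_k \cup F_k$, $E_k \in \mathcal M$, $F_k \in \mathcal P(\mathcal N)$ for all $k$, it readily follows that $\bigcup_{k=1}^\infty A_k = (\bigcup_{k=1}^\infty E_k) \cup (\bigcup_{k=1}^\infty F_k) = E \cup F$, say. But as it is evident that $E \in \mathcal M$ and $F \in \mathcal P(\mathcal N)$, our verification that $\mathcal A$ is a $\sigma$-algebra is finished. We are now ready to prove

Theorem 3.3. Given a measure space $(X, \mathcal M, \mu)$, consider $\mathcal N = \{N \in \mathcal M: \mu(N) = 0\}$ and $\mathcal M_1 = \{E \cup F: E \in \mathcal M, F \in \mathcal P(\mathcal N)\}$. Then there is a unique extension $\mu_1$ of $\mu$ to $(X, \mathcal M_1)$ so that $(X, \mathcal M_1, \mu_1)$ is complete.

Proof. If $\mu$ is complete, $\mathcal M_1 = \mathcal M$, and by putting $\mu_1 = \mu$ we are done. Otherwise, if $\mu$ is not complete, define $\mu_1$ on $(X, \mathcal M_1)$ as follows: If $A = E \cup F \in \mathcal M_1$, let
$$\mu_1(A) = \mu(E). \tag{3.8}$$
First we show that $\mu_1$ is a well-defined set function on $(X, \mathcal M_1)$, i.e., if $A = E_1 \cup F_1 = E_2 \cup F_2$, then we have $\mu(E_1) = \mu(E_2)$. This is not hard; indeed if $F_2 \subseteq N_2 \in \mathcal N$, note that
$$E_1 \subseteq E_1 \cup F_1 = E_2 \cup F_2 \subseteq E_2 \cup N_2 \in \mathcal M,$$
and, by monotonicity, $\mu(E_1) \le \mu(E_2 \cup N_2) \le \mu(E_2) + \mu(N_2) = \mu(E_2)$. Reversing the roles of $E_1$ and $E_2$ we get that $\mu(E_2) \le \mu(E_1)$, and $\mu_1$ is well-defined.

Next we check that $\mu_1$ is a measure on $(X, \mathcal M_1)$. The only property that is not obvious is the $\sigma$-additivity. Let $\{A_k\}$ be a sequence of pairwise disjoint sets in $\mathcal M_1$, $A_k = E_k \cup F_k$, $F_k \subseteq N_k \in \mathcal N$ for all $k$. Then $F = \bigcup_k F_k \subseteq \bigcup_k N_k = N \in \mathcal N$, and since the $E_k$'s are pairwise disjoint we have
$$\mu_1\Big(\bigcup_k A_k\Big) = \mu_1\Big(\Big(\bigcup_k E_k\Big) \cup F\Big) = \mu\Big(\bigcup_k E_k\Big) = \sum_k \mu(E_k) = \sum_k \mu_1(A_k),$$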
and the $\sigma$-additivity follows.

Finally we verify that $\mu_1|\mathcal M = \mu$ and that $\mu_1$ is the only complete measure on $(X, \mathcal M_1)$ with this property. Since for $E \in \mathcal M$ we have $E = E \cup \emptyset$ and $\emptyset \in \mathcal N$, it is clear that $\mu_1(E) = \mu(E)$ and $\mu_1$ is an extension of $\mu$. To check that $\mu_1$ is complete, let $N_1 = E_1 \cup F_1 \in \mathcal M_1$, $\mu_1(N_1) = 0$, and let $M \subseteq N_1$. Since $N_1$ is a null set we get that $\mu(E_1) = 0$ and $E_1 \cup F_1 \subseteq N \in \mathcal N$. Thus $M \subseteq N$, $M \in \mathcal P(\mathcal N)$, $\mu_1(M) = 0$ and $\mu_1$ is complete. Further, to show that $\mu_1$ is unique, suppose that $\mu_2$ is another extension of $\mu$ to $\mathcal M_1$ and note that for $F \subseteq N \in \mathcal N$ we have $\mu_2(F) \le \mu_2(N) = \mu(N) = 0$. Thus, if $A = E \cup F \in \mathcal M_1$, it readily follows that
$$\mu_2(A) \le \mu_2(E) + \mu_2(F) = \mu(E) = \mu_1(A).$$
But since for $F \in \mathcal P(\mathcal N)$ also $\mu_1(F) = 0$, we may essentially reverse the roles of $\mu_1$ and $\mu_2$ above and obtain that $\mu_1(A) \le \mu_2(A)$ as well. In other words, $\mu_1(A) = \mu_2(A)$ for every $A \in \mathcal M_1$, and $\mu_1$ is unique. ■
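The construction of Theorem 3.3 can be carried out by hand, or by machine, for the three-point example considered earlier in this section, where $X = \{a,b,c\}$, $\mathcal M = \{\emptyset, \{a\}, \{b,c\}, X\}$, $\mu(\{a\}) = 1$ and $\mu(\{b,c\}) = 0$. The Python sketch below is an illustration for finite measure spaces only; the names `subsets` and `completion` are hypothetical helpers introduced here. It forms all sets $E \cup F$ with $E \in \mathcal M$ and $F$ a subset of a null set, and assigns $\mu_1(E \cup F) = \mu(E)$ as in (3.8).

```python
from itertools import combinations

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def completion(mu):
    """mu: dict mapping each measurable set (frozenset) to its measure.

    Returns a dict for mu_1 on M_1 = {E | F : E in M, F subset of a null set},
    with mu_1(E | F) = mu(E). Different representations of the same set give
    the same value, as shown in the proof of Theorem 3.3.
    """
    null_sets = [N for N, value in mu.items() if value == 0]
    subsets_of_null = {F for N in null_sets for F in subsets(N)}
    mu1 = {}
    for E, value in mu.items():
        for F in subsets_of_null:
            mu1[E | F] = value
    return mu1

a, b, c = "a", "b", "c"
mu = {frozenset(): 0, frozenset({a}): 1,
      frozenset({b, c}): 0, frozenset({a, b, c}): 1}
mu1 = completion(mu)

print(len(mu1))                                    # number of sets in M_1
print(mu1[frozenset({b})], mu1[frozenset({a, c})])
```

In this example $\mathcal M_1$ turns out to be the whole power set of $X$: the sets $\{b\}$ and $\{c\}$ become measurable null sets, and every set containing $a$ has measure 1.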
4. PROBLEMS AND QUESTIONS

4.1 Prove that $(\limsup A_n) \cap (\limsup B_n) \supseteq \limsup (A_n \cap B_n)$, and that $(\limsup A_n) \cup (\limsup B_n) = \limsup (A_n \cup B_n)$. What are the corresponding statements for the lim inf?

4.2 Let $\mathcal A$ be an algebra of subsets of $X$ and let $\psi$ be a finite additive set function defined on $\mathcal A$. Given $\{A_k\}_{k=1}^n \subseteq \mathcal A$, no two of the $A_k$'s being the same, let $C_m = \{x: x$ belongs to exactly $m$ of the $A_k$'s$\}$, $m = 1, 2, \ldots, n$. Show that $\sum_{k=1}^n \psi(A_k) = \sum_{m=1}^n m\,\psi(C_m)$.

4.3 In the notation of problem 4.2, let $B_m = \{x: x$ belongs to at least $m$ of the $A_k$'s$\}$, $m = 1, 2, \ldots, n$. Show that $\sum_{k=1}^n \psi(A_k) = \sum_{m=1}^n \psi(B_m)$.

4.4 Let $\psi$ be a finite additive set function defined on an algebra $\mathcal A$ of subsets of $X$, and let $A_1, A_2 \in \mathcal A$. Show that
$$\psi(A_1 \cap A_2) + \psi(A_1 \cup A_2) = \psi(A_1) + \psi(A_2).$$

4.5 It is possible to extend 4.4 to include more than two sets. Let $\{A_k\}_{k=1}^n \subseteq \mathcal A$ and for an integer $m \le n$ put
$$T_m = \sum_{k_1 < \cdots < k_m} \psi(A_{k_1} \cap \cdots \cap A_{k_m}).$$
Show that

4.6 Assume $\psi$ is an additive set function defined on an algebra $\mathcal A$ of subsets of $X$. Show that for arbitrary sets $A_1, \ldots, A_n \in \mathcal A$ we have

4.7 An extended real-valued set function $\psi$ defined on an algebra $\mathcal A$ of subsets of $X$ is said to be bounded if there exists a constant $M$ such that $|\psi(A)| \le M$ for all $A \in \mathcal A$. Show that any nonnegative additive set function which only assumes finite values is necessarily bounded.

4.8 Let $\psi$ be an extended real-valued set function defined on an algebra $\mathcal A$ of subsets of $X$ with the property that $\psi(\emptyset) = 0$. Given $A \in \mathcal A$, put
$$\psi^+(A) = \sup_{E \subseteq A,\, E \in \mathcal A} \psi(E), \qquad \psi^-(A) = \sup_{E \subseteq A,\, E \in \mathcal A} (-\psi(E)),$$
and $|\psi|(A) = \psi^+(A) + \psi^-(A)$; $\psi^+$ is called the positive variation, $\psi^-$ the negative variation and $|\psi|$ the total variation of $\psi$, respectively. Show that if $\psi$ is additive, then all the variations are nonnegative additive set functions on $\mathcal A$.

4.9 In the setting of 4.8, if for $A \in \mathcal A$, $\psi(A)$ is finite, show that

where the sup is taken over those subsets $E_1, E_2$ of $A$ which belong to $\mathcal A$.

4.10 In the setting of 4.8, show that if $\psi$ is bounded above, i.e., if there exists a constant $M$ such that $\psi(A) \le M$ for all $A \in \mathcal A$, then the positive variation $\psi^+$ is a finite additive set function. Similarly, if $\psi$ is bounded below, i.e., if there exists a constant $m$ such that $\psi(A) \ge m$ for all $A \in \mathcal A$, then the negative variation $\psi^-$ is a finite additive set function.

4.11 Assume $\psi$ is an additive set function defined on an algebra $\mathcal A$ of subsets of $X$ which is either bounded above or bounded below. Show
that $\psi$ can be represented as the difference of two nonnegative additive set functions. Specifically,
$$\psi(A) = \psi^+(A) - \psi^-(A), \qquad A \in \mathcal A.$$
This relation is referred to as the Jordan decomposition of $\psi$, cf. Theorem 1.3 in Chapter III.

4.12 The total variation $|\psi|$ of the additive set function $\psi$ defined on an algebra $\mathcal A$ of subsets of $X$ can also be determined by the formula
$$|\psi|(A) = \sup \sum_{k=1}^n |\psi(E_k)|,$$
where the sup is taken over all finite partitions $\{E_k\}$ of $A$ into disjoint sets of $\mathcal A$. Prove it.

4.13 Suppose $\psi$ is a bounded additive set function defined on a $\sigma$-algebra $\mathcal A$ of subsets of $X$ and suppose that

Uk1k.

4.19 Let $\mathcal A$ be an algebra of subsets of $X$. Then the $\sigma$-algebra $\mathcal S(\mathcal A)$ generated by $\mathcal A$ is the smallest family $\mathcal F$ of subsets of $X$ that contains $\mathcal A$ and satisfies the following two conditions: (i) If $E_n \in \mathcal F$ and $E_n \subseteq E_{n+1}$ for $n = 1, 2, \ldots$, then $\bigcup_{n=1}^\infty E_n \in \mathcal F$, and, (ii) If $E_n \in \mathcal F$ and $E_n \supseteq E_{n+1}$ for $n = 1, 2, \ldots$, then $\bigcap_{n=1}^\infty E_n \in \mathcal F$. Families that satisfy (i) and (ii) are said to be monotone. Prove that if the algebra $\mathcal A$ is a monotone family, then $\mathcal A$ is a $\sigma$-algebra.
4.20 If $\mathcal A_1$ is a $\sigma$-algebra of subsets of $X_1$ and $\mathcal A_2$ is a $\sigma$-algebra of subsets of $X_2$, show that $\mathcal A_1 \times \mathcal A_2$ is the smallest monotone family of subsets of $X_1 \times X_2$ that contains all finite disjoint unions of "rectangles" $E_1 \times E_2$ with $E_1 \in \mathcal A_1$ and $E_2 \in \mathcal A_2$.

4.21 Prove that $\mathcal B_n \times \mathcal B_m = \mathcal B_{n+m}$.

4.22 Prove that $\mathcal S(\mathcal E_1) \times \mathcal S(\mathcal E_2) = \mathcal S(\mathcal E_1 \times \mathcal E_2)$.

4.23 Let $\mathcal A$ be a $\sigma$-algebra of subsets of $X$ and $E \subset X$. If $\mathcal A_1 = \mathcal A \cup \{E\}$, show that $\mathcal S(\mathcal A_1)$ consists of those subsets of $X$ of the form $(A_1 \cap E) \cup (A_2 \cap (X \setminus E))$, where $A_1, A_2 \in \mathcal A$.

4.24 Let $f$ be an extended real-valued function defined on $X$, and let $\mu$ be the measure on $(X, \mathcal P(X))$ given by $\mu(E) = \sum_{x \in E} f(x)$. Find necessary and sufficient conditions, in terms of $f$, for $\mu$ to be finite or $\sigma$-finite.

4.25 Let $(X, \mathcal M, \mu)$ be a measure space, and let $\{E_k\} \subseteq \mathcal M$. Show that if $\mu(\bigcup E_k) < \infty$, and $\mu(E_k) \ge \eta > 0$ for infinitely many $k$'s, then $\mu(\limsup E_k) > 0$. By means of an example show that the condition $\mu(\bigcup E_k) < \infty$ cannot be removed.

4.26 Let $(X, \mathcal M, \mu)$ be a measure space, and $\{E_k\} \subseteq \mathcal M$. Show that $\mu(\liminf E_k) \le \liminf \mu(E_k)$, and that, provided that $\mu(\bigcup E_k) < \infty$, $\limsup \mu(E_k) \le \mu(\limsup E_k)$. By means of examples show that we may have strict inequalities above.

4.27 Let $(X, \mathcal M, \mu)$ be a probability measure space. If $\mu(\limsup A_n) = 1$ and $\mu(\liminf B_n) = 1$, prove that $\mu(\limsup (A_n \cap B_n)) = 1$. What happens if we assume instead that $\mu(\limsup B_n) = 1$?

4.28 Let $\mu$ be a measure on $(\mathbb{R}, \mathcal B_1)$ with the property that $\mu(I) < \infty$ for every finite interval $I$, let $y \in \mathbb{R}$, and put
$$F_y(x) = \begin{cases} \mu((y,x]) & \text{if } x > y \\ 0 & \text{if } x = y \\ -\mu((x,y]) & \text{if } x < y. \end{cases}$$
Show that $F_y$ is a nondecreasing right-continuous function; $F_y$ is called a distribution function induced by $\mu$.

4.29 Let $(X, \mathcal M, \mu)$ be a measure space, $T$ be a mapping of $X$ onto $Y$ and set $\mathcal N = \{A \subseteq Y: T^{-1}(A) \in \mathcal M\}$. Furthermore, let $\nu$ be the set function defined on $\mathcal N$ by $\nu(A) = \mu(T^{-1}(A))$. Prove that $\mathcal N$ is a $\sigma$-algebra of subsets of $Y$ and that $(Y, \mathcal N, \nu)$ is a measure space.
4.30 Let $\mu_1, \mu_2$ be measures on $(X, \mathcal M)$, $\mu_2(X) < \infty$, and suppose that $\mu_1(E) \ge \mu_2(E)$ for all $E \in \mathcal M$. Show that there exists a (unique) measure $\mu_3$ on $(X, \mathcal M)$ such that
$$\mu_1(E) = \mu_2(E) + \mu_3(E), \qquad \text{all } E \in \mathcal M.$$
Is the restriction $\mu_2(X) < \infty$ necessary?

4.31 Let $\{\mu_k\}$ be a sequence of measures on $(X, \mathcal M)$ with the property that $\mu_k(E) \le \mu_{k+1}(E)$ for all $E \in \mathcal M$. Is the set function $\mu$ defined on $(X, \mathcal M)$ by
$$\mu(E) = \lim_{k \to \infty} \mu_k(E),$$
necessarily a measure?

4.32 A useful concept in measure theory is that of semifiniteness, a notion weaker than that of $\sigma$-finiteness. A measure $\mu$ on $(X, \mathcal M)$ is said to be semifinite if for each $A \in \mathcal M$ with $\mu(A) = \infty$, there is a measurable set $E \subset A$, with $0 < \mu(E) < \infty$. Show that every $\sigma$-finite measure is semifinite, as is the counting measure, and give an example of a measure that is not semifinite. Also prove that if $\mu$ is semifinite and $\mathcal M$ does not contain an uncountable, pairwise disjoint collection of measurable sets of positive measure, then $\mu$ is $\sigma$-finite. Show by means of an example that this does not hold without the assumption that there are no "infinite atoms."

4.33 Let $\mu$ be a semifinite measure defined on $(X, \mathcal M)$. Show that every measurable set with infinite measure contains a measurable subset with arbitrarily large, finite, measure.

4.34 Let $\mu$ be a measure on $(X, \mathcal M)$. Show that $\mu$ can be written as the sum $\mu(A) = \mu_1(A) + \mu_2(A)$, $A \in \mathcal M$, where $\mu_1$ is a semifinite measure and the measure $\mu_2$ only assumes the values 0 and $\infty$; the decomposition need not be unique.

4.35 Assume that $(X, \mathcal M, \mu)$ is a measure space, and for $E_1, E_2 \in \mathcal M$ put
$$d(E_1, E_2) = \mu(E_1 \,\Delta\, E_2).$$
Discuss under what conditions, endowed with this distance, $(\mathcal M, d)$ becomes a complete metric space.

4.36 Assume that $(X_k, \mathcal M_k, \mu_k)$ are measure spaces for $k = 1, 2, \ldots$, and that the $X_k$'s are pairwise disjoint. Put $X = \bigcup_k X_k$, and let $\mathcal M$ be the class of subsets $A$ of $X$ of the form $A = \bigcup_k A_k$, where $A_k \in \mathcal M_k$ for all $k$. Show that $\mathcal M$ is a $\sigma$-algebra of subsets of $X$ and that the set function $\mu$ given on $\mathcal M$ by $\mu(A) = \sum_k \mu_k(A_k)$ is a measure on $(X, \mathcal M)$. When is $\mu$ finite? $\sigma$-finite?
CHAPTER
V
The Lebesgue Measure
In this chapter we introduce the most important example of a measure on $R^n$, the Lebesgue measure. I learned this construction from A. Zygmund.
1. LEBESGUE MEASURE ON $R^n$
In defining a measure on $R^n$ we must first decide on the σ-algebra of measurable sets; in the case at hand geometric considerations are also of importance. We call a closed parallelepiped, i.e., a closed bounded set of the form $\{(x_1, \dots, x_n) : a_k \le x_k \le b_k,\ 1 \le k \le n\}$, a closed interval. An open interval is the interior of a closed interval, i.e., a set of the form $\{(x_1, \dots, x_n) : a_k < x_k < b_k,\ 1 \le k \le n\}$. Intervals $I$, open and closed, have volume $v(I)$ equal to $\prod_{k=1}^n (b_k - a_k)$. We require that the σ-algebra of Lebesgue measurable sets contain all open and closed intervals and that the Lebesgue measure agree with the volume for intervals. Since each open set in $R^n$ is the countable union of nonoverlapping closed intervals, Lebesgue sets include all open sets, and consequently, also all closed and Borel sets. It is not intuitively clear what the Lebesgue measure of these general sets is; however, we expect the Lebesgue measure to satisfy two additional properties: It should be complete and translation invariant, i.e., if $A$ is Lebesgue measurable and $A_y = y + A = \{y + x : x \in A\}$, then $A_y$ is also measurable and $A$ and $A_y$ have the same measure. Translation invariance plays an essential role in the determination of the nature of the σ-algebra of Lebesgue measurable sets. In the early 1900's Vitali showed that if we accept the Axiom of choice, then not all subsets of $R$ are Lebesgue measurable. In the early 1970's Solovay proved
that it is consistent with the usual axioms of Set Theory, excluding the Axiom of choice, for every subset of $R$ to be Lebesgue measurable. One cannot conclude from Solovay's result, however, that the Axiom of choice and the existence of a subset of $R$ which is not Lebesgue measurable are equivalent.

Since the construction of a subset of $R$ which is not Lebesgue measurable depends on general, rather than specific, properties of the measure, including translation invariance, we describe such a set at this point. Let $I = [-1/2, 1/2]$ and let $\sim$ be the relation defined for numbers $x, y$ in $I$ by $x \sim y$ iff $x - y$ is a rational number (between $-1$ and $1$); it is not hard to check that $\sim$ is an equivalence relation. Let $\mathcal{R}$ be the collection of all the distinct equivalence classes of $\sim$; clearly $\mathcal{R}$ can be indexed by a subset of $I$. Now, by the Axiom of choice, there is a set $A \subset I$ which contains exactly one element from each equivalence class of $\sim$; it is the set $A$ we propose to show is not Lebesgue measurable.

Suppose, to the contrary, that $A$ is Lebesgue measurable, enumerate the rational numbers in $[-1, 1]$ as $r_0 = 0, r_1, \dots$, and let $A_k = r_k + A$, $k \ge 0$, be the rational translates of $A$. Since we assume that $A$ is Lebesgue measurable, the $A_k$'s are also Lebesgue measurable and they all have the same measure. Note that translates corresponding to different $r_k$'s are disjoint: Indeed, if for $j \ne k$ there is $y \in A_j \cap A_k$, then there are $x, x' \in A$ such that $y = x + r_j = x' + r_k$. Now, since $r_j \ne r_k$, then also $x \ne x'$, and $x$ and $x'$ correspond to two distinct equivalence classes of $\sim$. But we also have that $x - x' = r_k - r_j$ is a rational number (between $-1$ and $1$), and consequently, $x \sim x'$. However, since $A$ contains one element from each equivalence class this cannot happen, and the $A_k$'s are pairwise disjoint. Next observe that
$$I \subseteq \bigcup_k A_k \subseteq [-3/2, 3/2]. \tag{1.1}$$
Suppose we have shown (1.1) to be true. Then the left-hand side inclusion implies that the Lebesgue measure of $A$ cannot be $0$, for otherwise $\bigcup_k A_k$ would be a null set which contains a subset of measure $1$, and the right-hand side inclusion implies that the Lebesgue measure of $A$ cannot be positive, for otherwise $\bigcup_k A_k$ would be a measurable set of infinite measure contained in a set of measure $3$. In short, (1.1) implies that $A$ cannot be Lebesgue measurable. To prove that (1.1) holds is not hard. For $x \in I$, let $x'$ be the representative of the equivalence class of $x$ which belongs to $A$. Then there is a rational number $r_k$ in $[-1, 1]$ such that $x = x' + r_k$. Whence $x \in A_k$, and the left-hand side inclusion in (1.1) holds. Moreover, since
each $A_k \subseteq [-3/2, 3/2]$, the same is true of the union, and the right-hand side inclusion in (1.1) also holds.

Now that we are satisfied that not every subset of $R^n$ can be Lebesgue measurable, how do we go about constructing the σ-algebra of Lebesgue measurable sets? The idea is to introduce the Lebesgue measure as we go along. First we define a nonnegative σ-subadditive set function on $\mathcal{P}(R^n)$, the Lebesgue outer measure, and then select as Lebesgue measurable sets those subsets of $R^n$ which can be approximated, in a sense made precise by the outer measure, by open sets. Since intervals are the only sets we know how to measure at this point, it is natural that the definition of Lebesgue outer measure involve coverings with intervals. Given a subset $A$ of $R^n$, we define the Lebesgue outer measure $|A|_e$ of $A$ as the quantity
$$|A|_e = \inf \sum_k v(I_k), \tag{1.2}$$
where the infimum in (1.2) is taken over the family of all at most countable coverings $\{I_k\}$ of $A$ by closed intervals. Now, a countable mesh of nonoverlapping unit intervals (actually $n$-dimensional cubes of sidelength one) covers $R^n$, and consequently it also covers every subset of $R^n$. Thus the inf in (1.2) is a well-defined quantity. By (1.2) it is clear that if $A \subseteq B$, then $|A|_e \le |B|_e$ (monotonicity). Moreover, as expected, the outer measure of an interval coincides with its volume.

Proposition 1.1. Let $I$ be a closed interval in $R^n$. Then $|I|_e = v(I)$.
Proof. Since $I$ is a finite covering of itself, $|I|_e \le v(I) < \infty$. To prove the opposite inequality note that since $|I|_e < \infty$, given $\varepsilon > 0$, there is a family of closed intervals $\{I_k\}$ such that $I \subseteq \bigcup I_k$ and $\sum v(I_k) \le |I|_e + \varepsilon$. Furthermore, by extending the sidelengths of the $I_k$'s and then deleting the edges, it is readily seen that given $\eta > 0$, there are open intervals $I_k' \supseteq I_k$ such that $v(I_k') \le (1 + \eta) v(I_k)$, all $k$. Since $I \subseteq \bigcup I_k'$, and since $I$ is compact and the $I_k'$'s are open, by the Heine-Borel theorem there is a finite subcovering, which for simplicity we also denote by $\{I_k'\}$, such that $I \subseteq \bigcup_{k=1}^N I_k'$. In this case it is intuitively clear, although involved to prove, that $v(I) \le \sum_{k=1}^N v(I_k')$, and consequently we also have
$$v(I) \le (1 + \eta) \sum_{k=1}^N v(I_k) \le (1 + \eta) \sum_k v(I_k) \le (1 + \eta)(|I|_e + \varepsilon). \tag{1.3}$$
Since $\varepsilon$ and $\eta$ are both arbitrary, from (1.3) it readily follows that $v(I) \le |I|_e$, and we have finished. •
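The one step the proof leaves to intuition is the comparison of $v(I)$ with the total volume of a finite family of intervals covering $I$. As a sanity check rather than a proof, the following Python sketch computes $v(I)$ exactly and confirms that an exact mesh subdivision of $I$ into nonoverlapping closed subintervals has the same total volume. The helper names `volume` and `subdivide`, and the sample interval, are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product

def volume(interval):
    """v(I) = product of (b_k - a_k) for I given as a list of (a_k, b_k) pairs."""
    v = Fraction(1)
    for a, b in interval:
        v *= Fraction(b) - Fraction(a)
    return v

def subdivide(interval, m):
    """Split every side into m equal parts; returns m**n nonoverlapping closed cells."""
    sides = []
    for a, b in interval:
        a, b = Fraction(a), Fraction(b)
        step = (b - a) / m
        sides.append([(a + i * step, a + (i + 1) * step) for i in range(m)])
    return [list(cell) for cell in product(*sides)]

# Illustrative closed interval in R^3 (an assumption, not taken from the text).
I = [(0, 2), (1, 4), (-1, 1)]
cells = subdivide(I, 4)
total = sum(volume(c) for c in cells)
print(volume(I), len(cells), total)
```

A covering whose intervals overlap, or extend beyond $I$, can only make the sum larger, which is the content of the inequality $v(I) \le \sum_{k=1}^N v(I_k')$ used above.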
Along the same lines it is not hard to see that the following result holds: Let $\{I_k\}_{k=1}^N$ be a finite collection of nonoverlapping closed intervals in $R^n$; then
$$\Big| \bigcup_{k=1}^N I_k \Big|_e = \sum_{k=1}^N v(I_k).$$

Corollary 1.2. Let $I$ be an open interval. Then $|I|_e = v(I)$.

Proof. Let $\bar{I}$ denote the closure of $I$; then we have $|I|_e \le |\bar{I}|_e = v(\bar{I}) = v(I)$, and one inequality holds. Also, if $J$ is any closed interval contained in $I$, monotonicity implies that $v(J) = |J|_e \le |I|_e$. But since $v(I) = \sup v(J)$, where the sup is taken over the family of the closed subintervals $J$ of $I$, we get that $v(I) \le |I|_e$, and the opposite inequality holds. •

Next we show that the outer measure is σ-subadditive.

Proposition 1.3. Let $\{E_k\}_{k=1}^\infty$ be any sequence of subsets of $R^n$. Then
$$\Big| \bigcup_k E_k \Big|_e \le \sum_k |E_k|_e. \tag{1.4}$$

Proof. If $|E_k|_e = \infty$ for some $k$, then by monotonicity also the union has infinite outer measure, and we have equality in (1.4). Otherwise, suppose that $|E_k|_e < \infty$ for all $k$, and let $\varepsilon > 0$ be arbitrary. For each $k$ let $\{I_{k,j}\}$ be a covering of $E_k$ by closed intervals with the property that
$$\sum_j v(I_{k,j}) \le |E_k|_e + \varepsilon/2^k. \tag{1.5}$$
Clearly
$$\bigcup_k E_k \subseteq \bigcup_k \bigcup_j I_{k,j}, \tag{1.6}$$
and consequently, the closed intervals on the right-hand side of (1.6) form a covering of $\bigcup_k E_k$. Whence, by definition, we have
$$\Big| \bigcup_k E_k \Big|_e \le \sum_{k,j} v(I_{k,j}), \tag{1.7}$$
and since the summands on the right-hand side of (1.7) are nonnegative we may interchange freely the order of summation and estimate that expression by
$$\sum_k \sum_j v(I_{k,j}) \le \sum_k \big( |E_k|_e + \varepsilon/2^k \big) = \sum_k |E_k|_e + \varepsilon.$$
But since $\varepsilon > 0$ is arbitrary, we also have $\big| \bigcup_k E_k \big|_e \le \sum_k |E_k|_e$. •
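The proof turns on allotting the error $\varepsilon/2^k$ to the $k$-th set, so that all the errors together add up to at most $\varepsilon$. A minimal Python sketch of this bookkeeping, in exact rational arithmetic, follows; the function name `error_budget` and the sample value of $\varepsilon$ are hypothetical. The same allotment shows, for instance, that a countable set $\{x_1, x_2, \dots\}$ has outer measure $0$: cover $x_k$ by a closed interval of length $\varepsilon/2^k$.

```python
from fractions import Fraction

def error_budget(eps, n_terms):
    """The errors eps/2, eps/4, ..., eps/2**n_terms allotted to the first n_terms sets."""
    eps = Fraction(eps)
    return [eps / 2 ** k for k in range(1, n_terms + 1)]

eps = Fraction(1, 100)
budget = error_budget(eps, 50)
partial = sum(budget)

# Every partial sum equals eps * (1 - 2**(-N)), hence stays strictly below eps.
print(partial < eps)
```

However many terms are taken, the partial sums never reach $\varepsilon$, which is why the total error in (1.7) is controlled by a single $\varepsilon$.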
In contrast to the case of measures, it is interesting to point out that strict inequality may occur in (1.4), even if the $E_k$'s are pairwise disjoint. To see this observe that the Lebesgue outer measure is translation invariant, and that with the notation of (1.1) above, $|A|_e > 0$. Then, again by (1.1), $\big| \bigcup_k A_k \big|_e \le 3 < \sum_k |A_k|_e = \infty$.

Since open sets can be expressed as the union of closed intervals, it is reasonable to attempt to compute the outer measure of subsets of $R^n$ in terms of the outer measure of open sets. Specifically, we have

Proposition 1.4. Let $E$ be any subset of $R^n$. Then
$$|E|_e = \inf \{ |O|_e : O \text{ is open, and } O \supseteq E \}. \tag{1.8}$$
Proof. If $|E|_e = \infty$, by monotonicity every open set which contains $E$ (and this class is nonempty since $R^n$ is one such set) also has infinite outer measure and (1.8) holds. On the other hand, if $|E|_e$ is finite, given $\varepsilon > 0$, let $\{I_k\}_{k=1}^\infty$ be a covering of $E$ by closed intervals such that
$$\sum_k v(I_k) \le |E|_e + \varepsilon/2.$$
Furthermore, for each $k$, let $I_k'$ be an open interval containing $I_k$ such that $v(I_k') \le v(I_k) + \varepsilon/2^{k+1}$, and put $O = \bigcup_k I_k'$. By construction $O$ is open and it contains $E$. Moreover, by Proposition 1.3 and Corollary 1.2 it readily follows that
$$|E|_e \le |O|_e \le \sum_k |I_k'|_e = \sum_k v(I_k') \le \sum_k \big( v(I_k) + \varepsilon/2^{k+1} \big) = \sum_k v(I_k) + \varepsilon/2 \le |E|_e + \varepsilon.$$
Thus, for any $\varepsilon > 0$, there is an open set $O \supseteq E$ such that
$$|E|_e \le |O|_e \le |E|_e + \varepsilon, \tag{1.9}$$
and (1.8) holds. •
Proposition 1.4 is important in applications; to state some we need a definition. We say that a subset $H$ of $R^n$ is a $G_\delta$ set if $H$ is the intersection of an at most countable family of open sets. The complement of a $G_\delta$ set is an $F_\sigma$ set, i.e., an at most countable union of closed sets.

Corollary 1.5. Let $E$ be an arbitrary subset of $R^n$. Then there is a $G_\delta$ set $H$ which contains $E$ and such that $|H|_e = |E|_e$.
Proof. If $|E|_e = \infty$, put $H = R^n$. If $|E|_e < \infty$, by (1.9) there is a sequence $\{O_k\}_{k=1}^\infty$ of open sets containing $E$ such that $|O_k|_e \le |E|_e + 1/k$. Let now $H = \bigcap_k O_k$; by construction $H$ is a $G_\delta$ set, and $E \subseteq H \subseteq O_k$ for all $k$. Thus by monotonicity it follows that
$$|E|_e \le |H|_e \le |O_k|_e \le |E|_e + 1/k. \tag{1.10}$$
Since $k$ is arbitrary we get that $|E|_e = |H|_e$. •
A closer look at inequality (1.10) for sets $E$ with finite outer measure indicates that the following property is true: Given $\varepsilon > 0$, there is an open set $O \supseteq E$ such that $|O|_e - |E|_e < \varepsilon$. This estimate does not make sense when $|E|_e = \infty$, but a closely related result holds for any subset of $R^n$, to wit, for every open set $O \supseteq E$ we have $|O|_e \le |E|_e + |O \setminus E|_e$. This inequality is a simple consequence of the subadditivity of the outer measure, and it hints that rather than seeking to control $|O|_e - |E|_e$, which is meaningless when $|E|_e = \infty$, we may try to control $|O \setminus E|_e$. This control is, indeed, all that is needed.

We say that $E \subseteq R^n$ is Lebesgue measurable if for any $\varepsilon > 0$, there exists an open set $O \supseteq E$ such that
$$|O \setminus E|_e < \varepsilon.$$
The class of Lebesgue measurable sets is denoted by $\mathcal{L}_n$, or plainly by $\mathcal{L}$. Notice that open sets, as well as sets with outer measure equal to $0$, belong to $\mathcal{L}$. In the case of open sets this is obvious, and for sets $E$ of outer measure $0$ observe that by (1.10) there are open sets $O_k \supseteq E$ with $|O_k|_e \le 1/k$ for all $k$. Then, given $\varepsilon > 0$, let $k \ge 1/\varepsilon$, and note that
$$|O_k \setminus E|_e \le |O_k|_e \le 1/k \le \varepsilon.$$
In order to show that $\mathcal{L}$ is a σ-algebra we must verify that $\mathcal{L}$ is closed under countable unions, which is easy, and under complementation, which requires some work. We begin with the easier part.

Proposition 1.6. Let $\{E_k\}_{k=1}^\infty$ be a sequence of sets in $\mathcal{L}$; then $\bigcup_k E_k \in \mathcal{L}$.

Proof. For a given $\varepsilon > 0$, we must find an open set $O \supseteq \bigcup_k E_k$ so that $|O \setminus \bigcup_k E_k|_e < \varepsilon$. Since each $E_k$ belongs to $\mathcal{L}$, there are open sets $O_k \supseteq E_k$ such that $|O_k \setminus E_k|_e < \varepsilon/2^k$ for all $k$. Hence, $O = \bigcup_k O_k$ is an open set which contains $\bigcup_k E_k$, and since as is readily seen $(\bigcup_k O_k) \setminus (\bigcup_k E_k) \subseteq \bigcup_k (O_k \setminus E_k)$, by Proposition 1.3 we get that
$$\Big| O \setminus \bigcup_k E_k \Big|_e \le \sum_k |O_k \setminus E_k|_e < \sum_k \varepsilon/2^k = \varepsilon.$$
Thus, the union of the $E_k$'s belongs to $\mathcal{L}$. •
To complete the verification that $\mathcal{L}$ is a σ-algebra, we begin by showing that closed sets are indeed Lebesgue measurable. This requires some preliminary results.

Lemma 1.7. Let $E_1, E_2$ be subsets of $R^n$ with the property that
$$d(E_1, E_2) = \inf \{ |x - y| : x \in E_1,\ y \in E_2 \} > 0.$$
Then, $|E_1 \cup E_2|_e = |E_1|_e + |E_2|_e$. In particular, this is true if $E_1, E_2$ are compact and disjoint.

Proof. If either $|E_1|_e$ or $|E_2|_e$ is infinite, then the same is true of $|E_1 \cup E_2|_e$, and we are done. Now, if both are finite, since the outer measure is subadditive, it suffices to show that $|E_1|_e + |E_2|_e \le |E_1 \cup E_2|_e$. But in this case we also have that $|E_1 \cup E_2|_e < \infty$, and consequently, given $\varepsilon > 0$, there is a covering $\{I_k\}$ of $E_1 \cup E_2$ consisting of closed intervals such that
$$\sum_k v(I_k) \le |E_1 \cup E_2|_e + \varepsilon.$$
There are three relevant kinds of $I_k$'s, to wit:
(i) Those $I_k$'s such that $I_k \cap E_1 \ne \emptyset$, $I_k \cap E_2 = \emptyset$, call them $I_k'$'s;
(ii) Those $I_k$'s such that $I_k \cap E_1 = \emptyset$, $I_k \cap E_2 \ne \emptyset$, call them $I_k''$'s;
(iii) Those $I_k$'s which intersect both $E_1$ and $E_2$.
The intervals in the third class above may be subdivided into nonoverlapping closed subintervals of diameter less than $d(E_1, E_2)$. Each subinterval thus obtained either belongs to the first family (i), or to the second family (ii), or it does not intersect $E_1 \cup E_2$ and it can be discarded. Therefore, we divide the $I_k$'s into a covering of $E_1$, a covering of $E_2$, and throw away the rest. By definition we have
$$|E_1|_e + |E_2|_e \le \sum v(I_k') + \sum v(I_k'') \le \sum_k v(I_k) \le |E_1 \cup E_2|_e + \varepsilon,$$
which implies, since $\varepsilon$ is arbitrary, that, as asserted, $|E_1|_e + |E_2|_e \le |E_1 \cup E_2|_e$. •
We are now ready to prove

Theorem 1.8. Closed subsets of $R^n$ are Lebesgue measurable.

Proof. Suppose first that the closed set $F$ in question is bounded, and hence compact. Then, given $\varepsilon > 0$, by inequality (1.10) there is an open set $O \supseteq F$ such that $|O|_e \le |F|_e + \varepsilon$; we would like to show that $|O \setminus F|_e \le \varepsilon$ as well. Now, $O \setminus F$ is also open, and consequently it can be expressed as the countable union of nonoverlapping closed intervals, $\bigcup_k I_k$, say. By Proposition 1.3, it follows that $|O \setminus F|_e \le \sum_k v(I_k)$. On the other hand, since
$$F \cup \bigcup_{k=1}^N I_k \subseteq O, \quad \text{all } N,$$
by monotonicity we get that
$$\Big| F \cup \bigcup_{k=1}^N I_k \Big|_e \le |O|_e \le |F|_e + \varepsilon.$$
Furthermore, since $F$ and $\bigcup_{k=1}^N I_k$ are both compact and disjoint and the $I_k$'s are nonoverlapping, by Lemma 1.7 it readily follows that
$$\Big| F \cup \bigcup_{k=1}^N I_k \Big|_e = |F|_e + \Big| \bigcup_{k=1}^N I_k \Big|_e = |F|_e + \sum_{k=1}^N v(I_k).$$
In particular, since $|F|_e < \infty$,
$$\sum_{k=1}^N v(I_k) \le \varepsilon.$$
But since this inequality holds with a bound independent of $N$, we have $\sum_k v(I_k) \le \varepsilon$, and consequently also $|O \setminus F|_e \le \varepsilon$. Thus, in this case, $F$ is Lebesgue measurable.

For general closed subsets $F$ of $R^n$, let $F_k = \{x \in F : |x| \le k\}$, and observe that each $F_k$ is closed and bounded, and that $F = \bigcup_k F_k$. By the above argument each $F_k$ is measurable, and by Proposition 1.6 so is their union, $F$. •

Theorem 1.9. Suppose $E \in \mathcal{L}$; then $R^n \setminus E \in \mathcal{L}$.
Proof. Let $O_k \supseteq E$ be a family of open sets such that $|O_k \setminus E|_e \le 1/k$, $k = 1, 2, \dots$; each $R^n \setminus O_k$ is closed, and hence measurable. Furthermore, since $R^n \setminus O_k \subseteq R^n \setminus E$ for all $k$, $H = \bigcup_k (R^n \setminus O_k)$ is an $F_\sigma$ measurable subset of $R^n \setminus E$. Let $A = (R^n \setminus E) \setminus H$; since $R^n \setminus E = H \cup A$, we will be done once we check that $A$ is measurable. To do this we show that $|A|_e = 0$. Indeed, since for each $k$, $A \subseteq (R^n \setminus E) \setminus (R^n \setminus O_k) = O_k \setminus E$, it readily follows that
$$|A|_e \le |O_k \setminus E|_e \le 1/k, \quad \text{all } k.$$
Thus $|A|_e = 0$, and we are done. •
Theorem 1.9 completes the verification that $\mathcal{L}$ is a σ-algebra and, at the same time, it provides a description of the Lebesgue measurable sets. Indeed, the argument of Theorem 1.9 applied to the (measurable) complement $R^n \setminus E$ of a measurable set $E$ gives that $E = R^n \setminus (R^n \setminus E) = H \cup N$, where $H$ is an $F_\sigma$ set and $|N|_e = 0$. It also gives that the Lebesgue measurable sets are precisely those subsets $E$ of $R^n$ which satisfy the following property: Given $\varepsilon > 0$, there is a closed set $F \subseteq E$ such that $|E \setminus F|_e \le \varepsilon$.

Next we construct the Lebesgue measure on $(R^n, \mathcal{L})$; it is the restriction of the Lebesgue outer measure to $\mathcal{L}$. More precisely, we have

Theorem 1.10. The set function $|\cdot|_e$ restricted to $\mathcal{L}$ is a measure. We call this measure the Lebesgue measure on $R^n$ and denote it by $|\cdot|$.

Proof. The proof amounts to showing that the set function $|\cdot|_e$ is σ-additive on $\mathcal{L}$. For this purpose, let $\{E_k\}_{k=1}^\infty$ be a sequence of pairwise disjoint measurable sets, and let $E$ denote its union. Note that, by Proposition 1.3, $|E| \le \sum_k |E_k|$. As for the opposite inequality, assume first that the $E_k$'s are bounded. Given $\varepsilon > 0$, let $F_k \subseteq E_k$ be a sequence of closed sets such that $|E_k \setminus F_k| \le \varepsilon/2^k$ for all $k$. Since $E_k = F_k \cup (E_k \setminus F_k)$ we also have $|E_k| \le |F_k| + \varepsilon/2^k$. Furthermore, since the $E_k$'s are bounded and pairwise disjoint, the sequence of $F_k$'s is composed of pairwise disjoint compact subsets of $R^n$. Fix $N$, and note that by (a simple extension of) Lemma 1.7, $\big| \bigcup_{k=1}^N F_k \big| = \sum_{k=1}^N |F_k|$. Thus, since $\bigcup_{k=1}^N F_k \subseteq E$ for all $N$, it follows that $\sum_{k=1}^N |F_k| \le |E|$, all $N$, and consequently also $\sum_{k=1}^\infty |F_k| \le |E|$. Whence
$$\sum_{k=1}^\infty |E_k| \le \sum_{k=1}^\infty \big( |F_k| + \varepsilon/2^k \big) \le |E| + \varepsilon,$$
and, since $\varepsilon$ is arbitrary, we are done in this case.

In the general case, fix an increasing sequence $\{I_j\}$ of bounded intervals so that $\bigcup_j I_j = R^n$, $I_0 = \emptyset$, and put $S_j = I_j \setminus I_{j-1}$, $j = 1, 2, \dots$ Then, the sets $E_{k,j} = E_k \cap S_j$ are measurable, pairwise disjoint and bounded, and for each $k$ we have $\bigcup_j E_{k,j} = E_k$. Thus, on the one hand, $\big| \bigcup_{j,k} E_{k,j} \big| = \big| \bigcup_k E_k \big|$, and, on the other hand,
$$\Big| \bigcup_{j,k} E_{k,j} \Big| = \sum_{k,j} |E_{k,j}| = \sum_k \sum_j |E_{k,j}| = \sum_k |E_k|.$$
In other words, $|\cdot|$ is σ-additive on $\mathcal{L}$, and we have finished. •
The following characterization of $\mathcal{L}$, due to Carathéodory (1873-1950), highlights the interplay between the Lebesgue measurable sets and the Lebesgue measure, and it is very interesting since it can be used to define $\mathcal{L}$, and more general σ-algebras of sets, cf. 3.40 below.

Theorem 1.11 (Carathéodory). A subset $E$ of $R^n$ is Lebesgue measurable iff for every subset $A$ of $R^n$ we have
$$|A|_e = |A \cap E|_e + |A \setminus E|_e. \tag{1.11}$$

Proof. We begin by assuming that $E$ is measurable and $A$ is any subset of $R^n$. Since $A = (A \cap E) \cup (A \setminus E)$, by the subadditivity of the Lebesgue outer measure it follows that
$$|A|_e \le |A \cap E|_e + |A \setminus E|_e.$$
To prove the opposite inequality, note that by Corollary 1.5 there is a $G_\delta$ (hence measurable) set $H \supseteq A$ such that $|A|_e = |H|$. Now, since $E$ is also measurable, we have $|H| = |H \cap E| + |H \setminus E|$. Whence, by monotonicity
$$|A|_e = |H \cap E| + |H \setminus E| \ge |A \cap E|_e + |A \setminus E|_e,$$
and (1.11) holds.

Next assume that (1.11) is true for every subset $A$ of $R^n$; we distinguish the cases $|E|_e < \infty$ and $|E|_e = \infty$. In the former case, by Corollary 1.5 there is a $G_\delta$ set $H \supseteq E$ such that $|H| = |E|_e$. By (1.11) we have
$$|H| = |H \cap E|_e + |H \setminus E|_e = |E|_e + |H \setminus E|_e. \tag{1.12}$$
Since $|E|_e$ is finite, (1.12) gives at once that $H \setminus E$ is a measurable set of measure $0$, and that $E = H \setminus (H \setminus E)$ is also measurable. As for the latter case, let $E_k = \{x \in E : |x| \le k\}$, so that $|E_k|_e < \infty$, and let $H_k$ be a $G_\delta$ set containing $E_k$ such that $|H_k| = |E_k|_e$ for all $k$. By (1.11), with $A = H_k$ there, we get
$$|H_k| = |H_k \cap E|_e + |H_k \setminus E|_e \ge |E_k|_e + |H_k \setminus E|_e.$$
Since $|H_k| = |E_k|_e$, we have $|H_k \setminus E|_e = 0$ for all $k$. Whence, setting $H = \bigcup_k H_k$, it readily follows that $H$ is a measurable set which contains $E$, and that $H \setminus E = \bigcup_k (H_k \setminus E)$ is a measurable set of measure $0$. Thus, $E = H \setminus (H \setminus E)$ is also measurable. •
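Condition (1.11) makes sense for any outer measure, not only for the Lebesgue outer measure, and on a finite set it can be tested by brute force. The Python sketch below is a toy illustration under that assumption: `toy_outer` is a hypothetical monotone, subadditive set function on a three-point set, and `counting_outer` is the counting measure regarded as an outer measure. Neither is taken from the text.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def caratheodory_measurable(X, outer):
    """Sets E with outer(A) == outer(A & E) + outer(A - E) for every subset A of X."""
    subsets = powerset(X)
    return {E for E in subsets
            if all(outer(A) == outer(A & E) + outer(A - E) for A in subsets)}

X = frozenset({0, 1, 2})

def toy_outer(A):
    # Hypothetical outer measure: 0 on the empty set, 2 on X, 1 on every other set.
    if not A:
        return 0
    return 2 if A == X else 1

def counting_outer(A):
    # Counting measure, viewed as an outer measure on all subsets of X.
    return len(A)

toy_measurable = caratheodory_measurable(X, toy_outer)
counting_measurable = caratheodory_measurable(X, counting_outer)
print(len(toy_measurable), len(counting_measurable))
```

For the toy outer measure only the empty set and $X$ pass the test, while for the counting measure every subset does; in both cases the sets that pass form a σ-algebra, in line with the construction referred to in 3.40.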
2. THE CANTOR SET
It is easy to see that there are uncountable sets of measure $0$ in $R^n$, $n \ge 2$; indeed, the boundary of any interval is such a set. How about $R$? The Cantor set is such an example, and we construct it next.

Consider the closed interval $C_0 = [0, 1]$. The first stage of the construction is to trisect $C_0$ and to remove the interior of the middle interval, $(1/3, 2/3)$. Each successive step is essentially the same. Let $C_1 = [0, 1/3] \cup [2/3, 1]$; $C_1$ is the union of $2^1 = 2$ closed disjoint intervals. At the second stage we subdivide each of the closed intervals of $C_1$ into thirds and remove from each one the middle open thirds, $(1/9, 2/9)$ and $(7/9, 8/9)$. Suppose that $C_n$ has been constructed and that it consists of $2^n$ closed disjoint intervals, each of length $3^{-n}$. Subdivide each of the closed intervals of $C_n$ into thirds and remove from each one of them the interior of the middle interval. What is left from $C_n$ is $C_{n+1}$; note that $C_{n+1}$ is the union of $2^{n+1}$ closed intervals, each of length $3^{-(n+1)}$. The Cantor set $C$ is now defined as $C = \bigcap_{n=0}^\infty C_n$.

Some of the elementary properties of $C$ are the following: It is closed, it contains the endpoints of all intervals in $C_n$, and any point of $C$ is the limit of a nondecreasing (and a nonincreasing) sequence of endpoints of the intervals of the $C_n$'s. It is not hard to give an analytical description of the elements of $C$. Let $x = \sum_{n=1}^\infty a_n 3^{-n}$ be the triadic expansion of an arbitrary $x \in C$. We observe that since $x \notin (1/3, 2/3)$, $a_1 \ne 1$; similarly, since $x \notin (1/9, 2/9) \cup (7/9, 8/9)$, $a_2 \ne 1$, and so on. In other words, by induction we see that $a_n \ne 1$ for all $n$, and $C$ consists precisely of those points with $a_n = 0$ or $2$ in their triadic expansion. For example, the number $1/4 = \sum_{n=1}^\infty 2 \cdot 3^{-2n}$ is in $C$, but is not an endpoint of any of the intervals of the $C_n$'s.
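The stages $C_n$ and the triadic description of $C$ lend themselves to a direct check in exact arithmetic. The following Python sketch builds the intervals of $C_n$, extracts triadic digits, and tests the example $x = 1/4$; the function names `cantor_stage`, `in_stage` and `ternary_digits` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals of C_n as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))   # keep the left closed third
            nxt.append((b - third, b))   # keep the right closed third
        intervals = nxt
    return intervals

def in_stage(x, n):
    """True if x belongs to C_n."""
    return any(a <= x <= b for a, b in cantor_stage(n))

def ternary_digits(x, count):
    """First `count` triadic digits of x in [0, 1), by repeated multiplication by 3."""
    digits = []
    for _ in range(count):
        x *= 3
        d = x.numerator // x.denominator   # integer part
        digits.append(d)
        x -= d
    return digits

quarter = Fraction(1, 4)
print(len(cantor_stage(3)))            # number of intervals in C_3
print(ternary_digits(quarter, 6))      # [0, 2, 0, 2, 0, 2]
print(all(in_stage(quarter, n) for n in range(9)))
```

The digits of $1/4$ alternate between $0$ and $2$, so $1/4$ survives every stage, while every endpoint of an interval of $C_n$ is a fraction whose denominator is a power of $3$, which $1/4$ is not.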
[Figure 2: the intervals of $C_1$ and $C_2$, with endpoints $0$, $1/9$, $2/9$, $1/3$, $2/3$, $7/9$, $8/9$, $1$.]
As for the cardinality of $C$, we have

Proposition 2.1. $C$ is uncountable.

Proof. The idea is to show that $C \sim 2^N$, $2 = \{0, 1\}$. If $(x_n) \in 2^N$, let $y_n = 2x_n$, and put $f((x_n)) = \sum_{n=1}^\infty y_n 3^{-n}$. Since $y_n \ne 1$ for all $n$, $f$ maps $2^N$ into $C$; we want to show that $f$ is one-to-one and onto. Suppose that $(x_n) \ne (x_n')$, and let $m = \min\{n : x_n \ne x_n'\}$; we may assume that $x_m = 0$ and $x_m' = 1$. Since $2 \sum_{n=m+1}^\infty 3^{-n} = 3^{-m}$, it follows that
$$f((x_n')) = 2 \sum_{n=1}^\infty x_n' 3^{-n} \ge 2 \sum_{n=1}^{m-1} x_n 3^{-n} + 2 \cdot 3^{-m} > 2 \sum_{n=1}^\infty x_n 3^{-n} = f((x_n)),$$
and $f$ is one-to-one. Since given $x = \sum_{n=1}^\infty a_n 3^{-n}$ in $C$, we have $f((a_n/2)) = x$, $f$ is also onto. •
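The inequality in the proof says that $f$ preserves the lexicographic order of $0$-$1$ sequences. This can be observed on finite truncations, as in the Python sketch below; restricting to sequences of length $m$ (all later terms equal to $0$) is an assumption made so that the computation is finite, and `f_truncated` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

def f_truncated(bits):
    """f applied to a 0-1 sequence that vanishes after its listed terms."""
    return sum(Fraction(2 * x, 3 ** n) for n, x in enumerate(bits, start=1))

m = 6
# product(...) lists the 0-1 sequences of length m in lexicographic order.
values = [f_truncated(bits) for bits in product((0, 1), repeat=m)]

print(len(set(values)) == 2 ** m)   # distinct sequences give distinct points
print(values == sorted(values))     # lexicographic order is preserved
```

The $2^m$ values are distinct and increasing, and they are exactly the left endpoints of the intervals of $C_m$.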
Is $C$ measurable, and if so, what is its measure? Since $C$ is covered by the intervals in any $C_n$ we have $|C|_e \le 2^n 3^{-n}$ for all $n$, and consequently, $|C| = 0$. Thus $C$ is an example of an uncountable set of measure $0$ in the line.
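The bound $2^n 3^{-n}$ can be cross-checked against the total length of the open intervals removed up to stage $n$: at stage $k$ one removes $2^{k-1}$ intervals of length $3^{-k}$. A short Python sketch in exact arithmetic, with hypothetical function names:

```python
from fractions import Fraction

def stage_measure(n):
    """Total length of C_n: 2**n intervals of length 3**(-n)."""
    return Fraction(2, 3) ** n

def removed_up_to(n):
    """Total length of the middle thirds removed in stages 1, ..., n."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

for n in (1, 5, 20):
    # What is kept plus what was removed always adds up to the length of [0, 1].
    print(n, float(stage_measure(n)), stage_measure(n) + removed_up_to(n) == 1)
```

As $n$ grows the removed length tends to $1$ and the length of $C_n$ tends to $0$.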
3. PROBLEMS AND QUESTIONS
3.1 Suppose $A, B$ are not Lebesgue measurable; is the same true of $A \cup B$?
3.2 Suppose $|A|_e = 0$ and show that for every subset $B$ of $R^n$ we have $|B \cup A|_e = |B \setminus A|_e = |B|_e$.
3.3 Let $A, B \subseteq R^n$. Show that
3.4 Suppose $\{E_k\}$ is a nondecreasing sequence of subsets of $R^n$ and let $E = \bigcup_k E_k$. Is it true that $\lim_{k \to \infty} |E_k|_e = |E|_e$?
3.5 Does the notion of outer measure change if we replace the coverings by intervals by coverings with balls? How about parallelepipeds with a fixed orientation?
3.6 Show that if $\sum |E_k|_e < \infty$, then $|\limsup E_k|_e = 0$.
3.7 Suppose $A, B \subseteq R^n$, $A \in \mathcal{L}$, and $|A \,\triangle\, B|_e = 0$. Show that $B \in \mathcal{L}$ and $|A| = |B|$.
3.8 Assume $\{E_k\}$ is a sequence of pairwise disjoint Lebesgue measurable sets and let $A$ be any set. Is it true that
$$\Big| A \cap \Big( \bigcup_{k=1}^\infty E_k \Big) \Big|_e = \sum_{k=1}^\infty |A \cap E_k|_e \,?$$
3.9 Consider the transformation $\phi(x) = \eta x + \delta$ from $R$ into itself, where $\eta \ne 0$ and $\delta$ are real numbers. Show that: (i) For any set $E$, $|\phi(E)|_e = |\eta| |E|_e$. (ii) $E$ is Lebesgue measurable iff $\phi(E)$ is Lebesgue measurable, and in this case $|\phi(E)| = |\eta| |E|$. Can you think of extensions of this result to $R^n$?
3.10 A mapping $\phi$ from $R$ into itself is said to be an isometry if for any $x, x'$ in $R$ we have $|\phi(x) - \phi(x')| = |x - x'|$. Show that if $\phi$ is an isometry and $E \in \mathcal{L}$, then $\phi(E) \in \mathcal{L}$ and $|\phi(E)| = |E|$.
3.11 Assume that $|N| = 0$ and show that $\{x^3 : x \in N\}$ is a null Lebesgue set.
3.12 Suppose $|E|_e < \infty$ and show that $E \in \mathcal{L}$ iff for any $\varepsilon > 0$, we can write $E = (A \cup A_1) \setminus A_2$, where $A$ is the union of a finite collection of nonoverlapping intervals and $|A_1|_e, |A_2|_e < \varepsilon$.
3.13 Is the set of irrational numbers in the line a $G_\delta$ set?
3.14 Show that $E \in \mathcal{L}$ iff $E = H \setminus N$, where $H$ is a $G_\delta$ set and $|N| = 0$.
3.15 Does there exist a function $f : R \to [0, 1]$ such that the set $D$ of its discontinuities has $|D| = 0$ and $D \cap I$ is uncountable for every interval $I$ of $R$?
3.16 Assume $A$ is a Lebesgue measurable subset of $R$ of finite measure and put $\phi(x) = |A \cap (-\infty, x]|$. Show that $\phi$ is continuous at each $x$ of $R$.
3.17 Let $A$ be a Lebesgue measurable subset of $R$ and let $0 < \eta < |A|$. Show that there exists a Lebesgue measurable set $B$ so that $B \subseteq A$ and $|B| = \eta$.
3.18 Given $\varepsilon > 0$, show that there exists a dense open subset $O$ of $[0, 1]$ with $|O| < \varepsilon$ so that its boundary $\partial O$ satisfies $|\partial O| \ge 1 - \varepsilon$.
3.19 Let $A = \{x \in [0, 1] : x = .a_1 a_2 \dots,\ a_n \ne 7, \text{ all } n\}$. Prove that $|A| = 0$. Generalize this result to different configurations of $a_n$'s and to dyadic, triadic expansions.
3.20 Let $A = \{x \in [0, 1] : x = .a_1 a_2 \dots,\ a_n = 2 \text{ or } 3, \text{ all } n\}$. Show that $A$ is measurable and compute $|A|$.
3.21 Let $A = \{x \in R :$ there exist infinitely many pairs of integers $p, q$ such that $|x - p/q| \le 1/q^3\}$. Show that $|A| = 0$.
3.22 Suppose $I_1, \dots, I_n$ are open intervals in $R$, so that if $Q_1 = Q \cap [0, 1]$ denotes the rational numbers in $[0, 1]$, then $Q_1 \subset \bigcup_{j=1}^n I_j$. Prove that $\sum_{j=1}^n |I_j| \ge 1$. Is the conclusion true if the $I_j$'s are measurable sets rather than intervals? What if we allow the collection of intervals to be infinite rather than finite?
3.23 Let $r_1, r_2, \dots$ be an enumeration of $Q$. Show that
On the other hand, also show that
may, or may not, be empty.
3.24 Suppose $E$ is a bounded measurable subset of $R$, $|E| > 0$. Prove that there exist $x_1, x_2 \in E$ so that $x_1 - x_2 \in Q$.
3.25 Show that if $B$ is a Hamel basis for $R$, then $|B| = 0$.
3.26 Construct Lebesgue null subsets $B_1, B_2$ of $R$ such that
3.27 Construct a Cantor-type subset $C_\eta$ of $[0, 1]$ by removing at the $n$th stage a "middle" interval of length $(1 - \eta) 3^{-n}$, $0 < \eta < 1$. Show that $C_\eta$ enjoys all the properties of $C$, but it has Lebesgue measure $|C_\eta| = \eta$.
3.28 If $-1 \le r \le 1$, show there exist $x, y \in C$ such that $y - x = r$.
3.29 Does the Cantor set contain a Hamel basis for $R$?
3.30 Construct a Cantor-like subset of $[0, 1]$ which consists entirely of irrational numbers.
3.31 Let $(a_n)$ be a fixed decreasing sequence of real numbers such that $a_0 = 1$, and $0 < 2a_n < a_{n-1}$, and define the sequence $(d_n)$ by $d_n = a_{n-1} - 2a_n$, $n \ge 1$. Now let $I_{0,1} = [0, 1]$, $I_{1,1} = [0, a_1]$, $I_{1,2} = [1 - a_1, 1]$, $I_{2,2} = [a_1 - a_2, a_1]$, $I_{3,3} = [a_1 - a_2, a_1 - a_2 + a_3]$, $I_{3,4} = [a_1 - a_3, a_1]$, and so on; this definition can be made precise by induction. Now put
$$F_n = \bigcup_{k=1}^{2^n} I_{n,k}, \quad \text{and} \quad P = \bigcap_{n=1}^\infty F_n.$$
Show that $P \in \mathcal{L}$ and $|P| = \lim_{n \to \infty} 2^n a_n$. Moreover, if $0 \le \eta < 1$, the $a_n$'s can be chosen so that $|P| = \eta$. Also, if $r_n = a_{n-1} - a_n$, then the elements of $P$ are precisely those real numbers of the form $\sum_{k=1}^\infty \varepsilon_k r_k$, $\varepsilon_k = 0$ or $1$.
3.32 Show that the class of Lebesgue measurable subsets of $R^n$ has cardinality $2^c$. Consider now the following relation on $\mathcal{L}$: Given $E_1, E_2$ in $\mathcal{L}$, we say that $E_1 \sim E_2$ if $|E_1 \,\triangle\, E_2| = 0$. Show that $\sim$ is an equivalence relation on $\mathcal{L}$ and that the family of the equivalence classes has cardinality $c$.
3.33 Prove that there is no Lebesgue measurable subset $A$ of $R$ such that $a|I| \le |A \cap I| \le b|I|$ for all open intervals $I$ of the line. More precisely, prove the following two assertions: (a) If $|A \cap I| \le b|I|$ for all open intervals $I \subset R$ and $b < 1$, then $|A| = 0$, and, (b) If $a|I| \le |A \cap I|$ for all open intervals $I \subset R$ and $a > 0$, then $|A| = 1$.
3.34 Prove there exists a Lebesgue measurable set $E \subset R$ such that
$$0 < |E \cap I| < |I|, \quad \text{all bounded intervals } I \subset R.$$
3.35 Does there exist a measurable subset $E$ of $R$ such that
$$0 < |E \cap I|, \quad 0 < |I \setminus E|, \quad \text{all intervals } I \subset R?$$
3.36 A measurable subset $A$ of $R$ is said to have a well-defined density, if the limit
$$D(A) = \lim_{\lambda \to \infty} \frac{|A \cap (-\lambda, \lambda)|}{2\lambda}$$
exists. In this case $D(A)$ is called the density of $A$. Give an example of a measurable set whose density is defined, and one whose density is not defined. Further, prove that if $A_1$ and $A_2$ are disjoint and have well-defined density, then $A_1 \cup A_2$ also has a well-defined density, and $D(A_1 \cup A_2) = D(A_1) + D(A_2)$.
3.37 Prove that the Lebesgue measure enjoys the following property, known as regularity: Given a measurable set $A$ we have
$$|A| = \sup \{ |K| : K \text{ is compact, and } K \subseteq A \}.$$
3.38 It is difficult to approximate sets that are not Lebesgue measurable with measurable ones. Specifically, suppose $E \subset R^n$ is not Lebesgue measurable. Show there is $\eta > 0$ such that if $E \subseteq A$ and $R^n \setminus E \subseteq B$, and if $A$ and $B$ are Lebesgue measurable, then $|A \cap B| \ge \eta$.
3.39 Decide whether the following statement is true: $A \subseteq R^n$ is Lebesgue measurable iff for every open subset $G$ of $R^n$ we have
$$|G| = |G \cap A|_e + |G \setminus A|_e.$$
3.40 Suppose $\mu^*$ is a nonnegative σ-subadditive monotone set function defined for all the subsets of a set $X$ such that $\mu^*(\emptyset) = 0$. We say that $E \subseteq X$ is measurable with respect to $\mu^*$ if for every subset $A \subseteq X$ we have
$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus E).$$
Let $\mathcal{M}$ be the class of subsets of $X$ which are measurable with respect to $\mu^*$. Show that $\mathcal{M}$ is a σ-algebra of subsets of $X$ and that the restriction of $\mu^*$ to $\mathcal{M}$ defines a measure on $(X, \mathcal{M})$. This construction is known as the Carathéodory extension of an outer measure.
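Several of the problems above concern Cantor-type sets, and their finite stages can be explored numerically. As one example, the sketch below follows the construction of Problem 3.27 in exact arithmetic, assuming that the removed interval is centered in each remaining interval (the problem only says "middle") and that at stage $n$ the gap has length $(1 - \eta) 3^{-n}$ in each of the $2^{n-1}$ intervals. The names `fat_cantor_stage` and `total_length` are hypothetical. This is a check of the length bookkeeping, not a solution of the problem.

```python
from fractions import Fraction

def fat_cantor_stage(eta, n):
    """Intervals left after n stages, removing a centered gap of length (1 - eta)/3**stage."""
    eta = Fraction(eta)
    intervals = [(Fraction(0), Fraction(1))]
    for stage in range(1, n + 1):
        gap = (1 - eta) / 3 ** stage
        nxt = []
        for a, b in intervals:
            mid = (a + b) / 2
            nxt.append((a, mid - gap / 2))
            nxt.append((mid + gap / 2, b))
        intervals = nxt
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

eta = Fraction(1, 2)
lengths = [total_length(fat_cantor_stage(eta, n)) for n in range(8)]
for n, length in enumerate(lengths):
    # Compare with the closed form eta + (1 - eta) * (2/3)**n.
    print(n, length, length == eta + (1 - eta) * Fraction(2, 3) ** n)
```

The stage lengths decrease to $\eta$, in agreement with the measure asserted in the problem.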
CHAPTER
VI
Measurable Functions
In this chapter we introduce the class of measurable functions, for which the integral will be defined, and discuss some of its basic properties.
1. ELEMENTARY PROPERTIES OF MEASURABLE FUNCTIONS
Let $\mathcal{M}$ be a σ-algebra of (measurable) subsets of $X$ and suppose $f$ is an extended real-valued function defined on $X$; by this we mean that, in addition to real values, $f$ may also assume the values $\pm\infty$. We say that $f$ is measurable if for any real number $\lambda$, $\{x \in X : f(x) > \lambda\} = \{f > \lambda\} \in \mathcal{M}$; that is to say, all the level sets of $f$ are measurable. For instance, for any $\mathcal{M}$, $f = \chi_A$ is measurable iff $A \in \mathcal{M}$. If $\mathcal{M} = \{\emptyset, X\}$, only constant functions are measurable, and if $\mathcal{M} = \mathcal{P}(X)$, all functions are measurable. We begin by exploring some simple properties of measurable functions.

Proposition 1.1. Suppose $\mathcal{M}$ is a σ-algebra of subsets of $X$ and let $f$ be an extended real-valued function defined on $X$. Then, the following statements are equivalent:
(i) $f$ is measurable.
(ii) For any real $\lambda$, $\{f \ge \lambda\} \in \mathcal{M}$.
(iii) For any real $\lambda$, $\{f < \lambda\} \in \mathcal{M}$.
(iv) For any real $\lambda$, $\{f \le \lambda\} \in \mathcal{M}$.

Proof. (i) implies (ii). Fix $\lambda$, and for $n \ge 1$ let $A_n = \{f > \lambda - 1/n\}$; by assumption $A_n \in \mathcal{M}$, all $n$. Now, since $\{f \ge \lambda\}$ is the intersection of the $A_n$'s, it also belongs to $\mathcal{M}$, and (ii) holds.
(ii) implies (iii). $\{f < \lambda\} = X \setminus \{f \ge \lambda\}$.
(iii) implies (iv). $\{f \le \lambda\} = \bigcap_{n=1}^\infty \{f < \lambda + 1/n\}$.
(iv) implies (i). $\{f > \lambda\} = X \setminus \{f \le \lambda\}$. •

In working with measurable functions it is essential to know whether certain sets are measurable. Since these sets are readily obtained from those introduced in Proposition 1.1 we merely indicate how their measurability is established.
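\[ \{f = \infty\} = \bigcap_{n=1}^{\infty} \{f > n\}, \qquad \{f = -\infty\} = \bigcap_{n=1}^{\infty} \{f < -n\}. \tag{1.1} \]
\[ \{f < \infty\} = \bigcup_{n=1}^{\infty} \{f < n\}, \qquad \{f > -\infty\} = \bigcup_{n=1}^{\infty} \{f > -n\}. \tag{1.2} \]
\[ \{-\infty < f < \infty\} = \{f > -\infty\} \cap \{f < \infty\}. \tag{1.3} \]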
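Also, for any real numbers $\lambda, \mu$, we will have the occasion to deal with the measurable sets
\[ \{\lambda < f < \infty\} = \{f > \lambda\} \cap \{f < \infty\}, \qquad \{-\infty < f < \lambda\} = \{f < \lambda\} \cap \{f > -\infty\}. \tag{1.4} \]
\[ \{\lambda < f < \mu\} = \{f > \lambda\} \cap \{f < \mu\}, \qquad \{\lambda < f \le \mu\} = \{f > \lambda\} \cap \{f \le \mu\}. \tag{1.5} \]
\[ \{\lambda \le f < \mu\} = \{f \ge \lambda\} \cap \{f < \mu\}, \qquad \{\lambda \le f \le \mu\} = \{f \ge \lambda\} \cap \{f \le \mu\}. \tag{1.6} \]
\[ \{f = \lambda\} = \{f \ge \lambda\} \cap \{f \le \lambda\}, \qquad \{f \ne \lambda\} = X \setminus \{f = \lambda\}. \tag{1.7} \]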
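The definition of measurability can be checked mechanically when $X$ is finite. The following Python sketch is only an illustration under that assumption: the helper names (`sigma_algebra_from_partition`, `level_sets`, `is_measurable`) are hypothetical, and the $\sigma$-algebra is taken to be the one generated by a finite partition. On a finite set only finitely many distinct level sets $\{f > \lambda\}$ occur, so it suffices to test one threshold per finite value of $f$ and one threshold below all of them.

```python
import math
from itertools import combinations

def sigma_algebra_from_partition(atoms):
    """All unions of atoms: the sigma-algebra generated by a finite partition."""
    sets = set()
    for r in range(len(atoms) + 1):
        for combo in combinations(atoms, r):
            sets.add(frozenset().union(*combo))
    return sets

def level_sets(f, X):
    """The distinct sets {f > lam}, lam real, for f given as a dict on finite X."""
    finite = sorted({f[x] for x in X if math.isfinite(f[x])})
    # One threshold per finite value, plus one below every finite value.
    thresholds = finite + ([finite[0] - 1.0] if finite else [0.0])
    return [frozenset(x for x in X if f[x] > lam) for lam in thresholds]

def is_measurable(f, X, M):
    """True iff every level set {f > lam} belongs to M."""
    return all(E in M for E in level_sets(f, X))

X = range(6)
M = sigma_algebra_from_partition([{0, 1}, {2, 3}, {4, 5}])
inf = float("inf")
f = {0: 1.0, 1: 1.0, 2: -inf, 3: -inf, 4: inf, 5: inf}   # constant on atoms
g = {x: (1.0 if x == 0 else 0.0) for x in X}             # indicator of {0}
print(is_measurable(f, X, M), is_measurable(g, X, M))
```

Here `f` is constant on each atom and takes the values $\pm\infty$, while `g` is the characteristic function of a set that is not in the $\sigma$-algebra, in line with the remark that $\chi_A$ is measurable iff $A \in \mathcal{M}$.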
Our next result indicates how to handle the infinite values of a measurable function.
Proposition 1.2. Let $\mathcal{M}$ be a $\sigma$-algebra of subsets of $X$ and let $f$ be an extended real-valued function defined on $X$. Then, $f$ is measurable iff $\{f = -\infty\} \in \mathcal{M}$ and for each real $\lambda$, $\{\lambda < f < \infty\} \in \mathcal{M}$.
Proof. The necessity has been established in (1.1) and (1.4). As for the sufficiency, first observe that
\[ \{f < \infty\} = \{f = -\infty\} \cup \Big( \bigcup_{n=-\infty}^{\infty} \{n < f < \infty\} \Big) \]
belongs to $\mathcal{M}$ by assumption. Whence $\{f = \infty\} = X \setminus \{f < \infty\}$ is also in $\mathcal{M}$, and since for each real $\lambda$ we have
\[ \{f > \lambda\} = \{\lambda < f < \infty\} \cup \{f = \infty\} \in \mathcal{M}, \]
the level sets of $f$ are measurable and $f$ is measurable. ∎
In fact, a more general statement is true.
Proposition 1.3. Let $\mathcal{M}$ be a $\sigma$-algebra of subsets of $X$ and suppose $f$ is an extended real-valued function defined on $X$. Then, $f$ is measurable iff $\{f = -\infty\} \in \mathcal{M}$ and for each open subset $O \subseteq R$, $f^{-1}(O) \in \mathcal{M}$.
Proof. Since for each real $\lambda$, $(\lambda, \infty)$ is open, the sufficiency follows from Proposition 1.2. As for the necessity, suppose $O$ is an open subset of $R$ and write $O = \bigcup_k I_k$, where the $I_k$'s are an at most countable collection of pairwise disjoint open intervals, one or two of which are possibly unbounded. By 5.8 in Chapter I
\[ f^{-1}(O) = \bigcup_k f^{-1}(I_k). \tag{1.8} \]
Now, by (1.3), (1.4) and (1.5), the sets in the union on the right-hand side of (1.8) all belong to $\mathcal{M}$ and $f^{-1}(O)$ is measurable. Since $\{f = -\infty\} \in \mathcal{M}$ whenever $f$ is measurable, we have finished. ∎
So far the role of measures on $(X, \mathcal{M})$ is not apparent, but in dealing with measurable functions sets of measure 0 are important and the following concept essential. Given a measure space $(X, \mathcal{M}, \mu)$, we say that a property $P(x)$ is true $\mu$-almost everywhere on a measurable subset $E$ of $X$, and denote this by $\mu$-a.e. on $E$, if $\mu(\{x \in E : P(x) \text{ is not true}\}) = 0$. For instance, we say that a measurable function $f$ is finite $\mu$-a.e. on $E$ if $\mu(\{x \in E : f(x) = \pm\infty\}) = 0$. It is natural to expect that measurable functions that coincide $\mu$-a.e. on $X$ be, in some sense, equivalent. A more precise statement is
Theorem 1.4. Let $\mu$ be a complete measure on $(X, \mathcal{M})$, and let $f, g$ be extended real-valued functions defined on $X$. If $f$ is measurable and $g = f$ $\mu$-a.e., then $g$ is also measurable and
\[ \mu(\{g > \lambda\}) = \mu(\{f > \lambda\}), \quad \text{all real } \lambda. \tag{1.9} \]
Proof. Let $N = \{g \ne f\}$; by assumption $N$ is a null, measurable, set. Now, for each real $\lambda$ we have
\[ \{g > \lambda\} \cup N = \{f > \lambda\} \cup N. \tag{1.10} \]
Since $f$ is measurable, the set on the right-hand side of (1.10) is measurable, and so is the set on the left-hand side there. Moreover, since $\mu$ is complete and $N_0 = \{x \in N : g(x) \le \lambda\} \subseteq N$, then $N_0$ is also a null, measurable set and consequently,
\[ \{g > \lambda\} = (\{g > \lambda\} \cup N) \setminus N_0 \tag{1.11} \]
is also measurable. Next observe that since $N$ is null, we have $\mu(\{f > \lambda\} \cup N) = \mu(\{f > \lambda\})$ for all real $\lambda$. Whence, by (1.11) and (an argument similar to) 3.2 in Chapter V, we get
\[ \mu(\{g > \lambda\}) = \mu(\{g > \lambda\} \cup N) - \mu(N_0) = \mu(\{f > \lambda\} \cup N) = \mu(\{f > \lambda\}). \] ∎
Theorem 1.4 states that functions that coincide $\mu$-a.e. are roughly interchangeable; this property is essential in operating with extended real-valued functions. Consider, for instance, addition: $f(x) + g(x)$ is undefined for those $x$'s where $f$ and $g$ assume infinite values of opposite sign. The idea is to work with functions $\tilde f$ and $\tilde g$ which are closely related to $f$ and $g$, and for which the sum makes sense. We proceed as follows: Let the (bad) set
\[ B = \{x \in X : f(x) = \infty, g(x) = -\infty\} \cup \{x \in X : f(x) = -\infty, g(x) = \infty\}. \]
Since $B$ is measurable, $\mathcal{M}_{X \setminus B} = \{E \cap (X \setminus B) : E \in \mathcal{M}\}$ is a $\sigma$-algebra of subsets of $X \setminus B$. Observe that $\tilde f = f|(X \setminus B)$ and $\tilde g = g|(X \setminus B)$ are also measurable on $(X \setminus B, \mathcal{M}_{X \setminus B})$, and that $\tilde f(x) + \tilde g(x)$ is defined for any $x \in X \setminus B$. In fact, as we shall prove below, $\tilde f + \tilde g$ is also measurable.
To avoid having to go through various technical considerations each time we discuss an operation involving measurable functions, we sort the functions out into equivalence classes and operate at the level of classes. Let $(X, \mathcal{M}, \mu)$ be a measure space. We consider the collection $\mathcal{F}$ consisting of those functions $f$ that satisfy the following properties:
(i) $f$ is an extended real-valued function defined on $X \setminus N$, where $N$ is a null subset of $X$.
(ii) $f$ is measurable, as a function on $(X \setminus N, \mathcal{M}_{X \setminus N})$.
Note that we only require functions in $\mathcal{F}$ to be defined $\mu$-a.e. on $X$. Next we identify those measurable functions which coincide $\mu$-a.e.; more precisely, given $f, g \in \mathcal{F}$, we say that $f \sim g$ iff there is a null subset $N$ of $X$ such that $f(x) = g(x)$ for $x \in X \setminus N$. It is clear that $\sim$ is an equivalence relation on $\mathcal{F}$; the only property that offers any difficulty is the transitivity, and this follows at once from the fact that the union of two null sets is null.
We return to the addition: Given equivalence classes $\bar f, \bar g$ in $\mathcal{F}$ corresponding to the finite $\mu$-a.e. functions $f, g$, by removing the bad set $B$ we readily see that $h(x) = f(x) + g(x)$ is a finite quantity for $x \in X \setminus N$, $N$ null, and we put $\bar f + \bar g = \bar h$. It is straightforward to verify that $\bar h$ is well-defined, i.e., it is independent of the representatives of the classes $\bar f, \bar g$, and this completes our discussion.
Now that we know how things should be done we agree to denote the equivalence class $\bar f$ of a function $f$ once again by $f$, and to operate with the classes as if they were functions. This should cause no undue stress, and the reader should keep in mind that a statement such as "a function $f$ defined on $X$" actually means "an equivalence class $\bar f$ of a function $f$ defined $\mu$-a.e. on $X$." To deal with the usual arithmetic operations we need a preliminary result.
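Lemma 1.5. Let $(X, \mathcal{M}, \mu)$ be a measure space, and suppose $f, g$ are extended real-valued measurable functions defined on $X$. Then $\{f > g\} \in \mathcal{M}$.
Proof. Let $\{r_k\}$ be an enumeration of the rational numbers and observe that by Proposition 1.1
\[ E_k = \{f > r_k\} \cap \{g < r_k\} \in \mathcal{M}, \quad k = 1, 2, \ldots \]
Since $f(x) > g(x)$ iff there is a rational number $r_k$ with $f(x) > r_k > g(x)$, the set $\{f > g\} = \bigcup_k E_k$ is also measurable. ∎
We are now ready to prove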
Theorem 1.6. Let $(X, \mathcal{M}, \mu)$ be a measure space and $f, g$ be extended real-valued measurable functions defined on $X$. Then $f \pm g$ is also measurable.
Proof. We only do the addition. Observe that for any real $\lambda$, $\lambda - g$ is measurable. Since
\[ \{f + g > \lambda\} = \{f > \lambda - g\}, \quad \text{real } \lambda, \]
the conclusion follows at once from Lemma 1.5. ∎
The other operations of interest to us are covered by the following result.
Theorem 1.7. Let $(X, \mathcal{M}, \mu)$ be a measure space, assume $f$ is a measurable, finite $\mu$-a.e. function defined on $X$ and let $\phi$ be a real-valued continuous function defined on $R$. Then the composition $\phi \circ f$ is measurable.
Proof. Since $\{f = \pm\infty\}$ is a null set we may assume that $\phi \circ f$ is well-defined and that $\{\phi \circ f = -\infty\} = \emptyset$. By Proposition 1.2 the measurability of $\phi \circ f$ will be established once we show that
\[ (\phi \circ f)^{-1}((\lambda, \infty)) = f^{-1}(\phi^{-1}((\lambda, \infty))) \in \mathcal{M}, \quad \text{all real } \lambda. \tag{1.12} \]
But this is not hard; indeed, since $\phi$ is continuous, $\phi^{-1}((\lambda, \infty)) = O$ is an open subset of $R$, and, by Proposition 1.3, $f^{-1}(O) \in \mathcal{M}$. Thus (1.12) holds and we have finished. ∎
Theorem 1.7 shows that the composition $\phi \circ f$ of a measurable function $f$ with a continuous function $\phi$ is measurable; it is not intuitively apparent that the composition $f \circ \phi$ should also be measurable. In fact, it is not, as the following example shows. Let $\{K_n\}$ be a sequence of Cantor-like sets, $|K_n| = 1 - 1/n$, $n = 2, 3, \ldots$, and let $A = \bigcup_n K_n$. Since
\[ |[0,1] \setminus A| \le |[0,1] \setminus K_n| \le 1/n, \quad n = 2, 3, \ldots, \]
it readily follows that $[0,1] = \bigcup_n K_n \cup Z$, $|Z| = 0$, and consequently for any subset $B$ of $[0,1]$ we have
\[ B = \bigcup_n (B \cap K_n) \cup (B \cap Z). \]
In particular, if $B$ is not Lebesgue measurable, there is an index $N$ so that $B \cap K_N$ is not Lebesgue measurable. Referring to the construction of the Cantor set, let $D_n = [0,1] \setminus C_n$, where as usual $C_n$ denotes the union of the intervals remaining after $n$ steps. $D_n$ consists of $2^n - 1$ open intervals, $I_k^n$ say, ordered from left to right by $k$, removed in the first $n$ steps of the construction of $C$. Since $K_N$ is a Cantor-like set, there also is a sequence of open intervals, $J_k^n$ say, $1 \le k \le 2^n - 1$, ordered from left to right by $k$, removed in the first $n$ steps of the construction of $K_N$.
[Figure 3]
We define now a function $h$ from $[0,1]$ onto $[0,1]$ as follows: Construct $K_N$ in the interval $[0,1]$ corresponding to the domain of $h$, and $C$ in the interval $[0,1]$ that corresponds to the range of $h$. Then $h$ is the function that maps the left-end point of $J_k^n$ into the left-end point of $I_k^n$, the right-end point of $J_k^n$ into the right-end point of $I_k^n$, and is extended to $[0,1] \setminus K_N$ by continuity. It is not hard to check that $h$ is well-defined, one-to-one (if this were not the case $K_N$ would contain an interval, and this is not possible) and onto $[0,1]$. Let $B \cap K_N$ be a non-Lebesgue measurable subset of $K_N$ and put $A = h(B \cap K_N) \subseteq C$. Then $A$ is null, and consequently measurable; in other words the image of this non-Lebesgue measurable set by a continuous function is Lebesgue measurable. Another way to express this situation is the following: If $\phi = h^{-1}$ is the continuous inverse of $h$, then $\phi(A) = B \cap K_N$, and the image of a Lebesgue measurable set by a continuous function is not necessarily Lebesgue measurable. Also $\phi(C) = K_N$, and $\phi$ takes a null set onto a set of positive Lebesgue measure.
Returning to the question at hand, let $f = \chi_A$; since $A$ is null $f$ is Lebesgue measurable. Consider now the composition $f \circ h$ of $f$ with the continuous function $h$. The inverse image $(f \circ h)^{-1}((1/2, 3/2))$ is readily seen to equal
\[ h^{-1}(A) = B \cap K_N, \]
which is not measurable. Thus, as asserted, $f \circ h$ is not measurable.
The measurability of several expressions involving $f$ follows at once from Theorem 1.7, with an appropriate choice of $\phi$:
$f(x) \pm \lambda$, $\lambda$ real
$\lambda f(x)$, $\lambda$ real
$f(x)^p$, $p \ge 1$ an integer
$|f(x)|^p$, $p > 0$
$f^+(x) = \max(0, f(x))$
$f^-(x) = \max(0, -f(x))$
and when $f \ne 0$ $\mu$-a.e.,
$1/f(x)$
$|f(x)|^\eta$, $\eta$ real
We also have
Theorem 1.8. Let $(X, \mathcal{M}, \mu)$ be a measure space and assume $f, g$ are measurable, finite $\mu$-a.e. functions defined on $X$. Then $fg$ is measurable, and, if $g \ne 0$ $\mu$-a.e., also $f/g$ is measurable.
Proof. By Theorem 1.6, $f \pm g$ are measurable, and by the above remarks so are $(f \pm g)^2$. Whence, once again by Theorem 1.6,
\[ (f + g)^2 - (f - g)^2 = 4fg \]
is also measurable, and so is $fg$. Since $1/g$ is measurable, it readily follows that $f/g = f \cdot \frac{1}{g}$ is also measurable. ∎
Corollary 1.9. Let $(X, \mathcal{M}, \mu)$ be a measure space, and suppose $f$ is a "simple" function defined on $X$, i.e., $f(x) = \sum_{k=1}^{n} c_k \chi_{E_k}(x)$, where the $c_k$'s are real constants and the $E_k$'s form a measurable partition of $X$. Then $f$ is measurable.
Next we consider whether measurability is preserved under limiting operations. We begin by discussing the inf and sup of a sequence $\{f_n(x)\}$ of measurable functions, which exist for any $x \in X$.
Lemma 1.10. Let $(X, \mathcal{M}, \mu)$ be a measure space, and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on $X$. Then the functions
\[ \inf_n f_n(x) \quad \text{and} \quad \sup_n f_n(x), \quad x \in X, \]
are measurable.
Proof. Since for arbitrary sequences $\{g_n\}$ we have
\[ \inf_n g_n(x) = -\sup_n (-g_n(x)), \quad \text{all } x \in X, \]
we only need prove one statement, the one with the sup, say. But this is not hard; indeed given any real $\lambda$, note that
\[ \{\sup_n f_n > \lambda\} = \bigcup_n \{f_n > \lambda\}. \tag{1.13} \]
By assumption the set on the right-hand side of (1.13) is the countable union of measurable sets, and so it is measurable. Thus the set on the left-hand side is measurable, and the sup of the $f_n$'s is measurable. ∎
Next we consider the limsup and liminf of a sequence of functions; these limits exist for every $x \in X$, even if the sequence does not converge.
Theorem 1.11. Assume $(X, \mathcal{M}, \mu)$ is a measure space and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on $X$. Then $\liminf f_n$ and $\limsup f_n$ are measurable. In particular, if $f(x) = \lim f_n(x)$ exists for all $x \in X$, then $f$ is measurable.
Proof. Since
\[ \liminf f_n(x) = \sup_{k \ge 1} \Big( \inf_{m \ge k} f_m(x) \Big), \quad x \in X, \]
the measurability of the liminf follows at once from Lemma 1.10. Similarly, since
\[ \limsup f_n(x) = \inf_{k \ge 1} \Big( \sup_{m \ge k} f_m(x) \Big), \quad x \in X, \]
also the limsup is measurable. Finally, if the sequence converges, $f(x)$ coincides with both $\limsup f_n$ and $\liminf f_n$, and it is measurable as well. ∎
We close this section with an interesting and important result; it shows how to approximate arbitrary functions by functions that assume finitely many values.
Theorem 1.12. Let $(X, \mathcal{M}, \mu)$ be a measure space, and $f$ be an extended real-valued function defined on $X$. Then there is a sequence $\{f_n\}$ of simple real-valued functions defined on $X$, i.e.,
\[ f_n(x) = \sum_{i=1}^{k_n} c_i \chi_{E_i}(x), \quad c_i \text{ real}, \quad E_i \text{ pairwise disjoint}, \]
so that $\lim_{n \to \infty} f_n(x) = f(x)$, $x \in X$. Furthermore,
(i) If $f$ is measurable, so are the $f_n$'s.
(ii) If $f$ is nonnegative, then the sequence $\{f_n\}$ is nondecreasing, and
\[ 0 \le f_n(x) \le f(x), \quad \text{all } x \in X, \quad n = 1, 2, \ldots \]
(iii) If $f$ is bounded, i.e., $|f(x)| \le M$ for all $x \in X$, then the $f_n$'s converge uniformly to $f$.
Proof. The $f_n$'s are defined by looking closely at the level sets of $f$. Suppose first that $f$ is nonnegative, fix an integer $n \ge 1$, and consider the $n2^n$ pairwise disjoint subintervals of $[0, n)$ given by
\[ [(k-1)2^{-n}, k2^{-n}), \quad 1 \le k \le n2^n. \]
Put now
\[ f_n(x) = \begin{cases} (k-1)2^{-n} & \text{if } (k-1)2^{-n} \le f(x) < k2^{-n}, \ 1 \le k \le n2^n, \\ n & \text{otherwise.} \end{cases} \]
Clearly $f_n(x) \le f(x)$ for all $x \in X$ and $n = 1, 2, \ldots$ Also, each $f_n$ assumes a finite number of values; more precisely, if
\[ A_{k,n} = \{(k-1)2^{-n} \le f < k2^{-n}\} \quad \text{and} \quad A_n = \{f \ge n\}, \]
we have
\[ f_n(x) = \sum_{k=1}^{n2^n} (k-1)2^{-n} \chi_{A_{k,n}}(x) + n \chi_{A_n}(x). \tag{1.14} \]
Observe that $f_{n+1}(x)$ is obtained from $f_n(x)$ by dividing each interval $[(k-1)2^{-n}, k2^{-n})$ in half, and then only increasing $f_n(x)$ to $f_{n+1}(x)$ at those $x$'s where $f_n(x)$ is changed; this proves the remarks in (ii). We claim that
\[ \lim_{n \to \infty} f_n(x) = f(x), \quad x \in X. \tag{1.15} \]
Now, if $f(x) = \infty$, then $f_n(x) = n$ for all $n$, and $f_n(x) \to \infty$. On the other hand, if $f(x)$ is finite, by the definition of $f_n(x)$ it readily follows that
\[ f(x) - f_n(x) \le 2^{-n}, \quad \text{all } n > f(x), \tag{1.16} \]
and (1.15) holds. Moreover, if $f$ is measurable, by (1.6) the $A_{k,n}$'s are measurable, and, by (ii) in Proposition 1.1, the $A_n$'s are measurable. Whence $f_n$ is a simple measurable function and (i) is proved. Furthermore, if $f$ is bounded, (1.16) is true for $n > M$, uniformly for $x$ in $X$, the convergence is uniform and (iii) holds. In short, we have obtained the desired conclusion for nonnegative functions.
To complete the proof recall that each function $f$ is the difference of two nonnegative functions, $f = f^+ - f^-$, and apply the first part of the proof to $f^+$ and $f^-$ separately. Note that in this case, the $f_n$'s also have the property that
\[ |f_n(x)| \le |f(x)|, \quad \text{all } x \in X, \quad n = 1, 2, \ldots \] ∎
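The dyadic construction in the proof of Theorem 1.12 is easy to experiment with numerically. The sketch below is a minimal illustration, not part of the proof: the function name `simple_approx` is hypothetical, and it works with a single value $y = f(x) \ge 0$ rather than with a function on a measure space. It rounds $y$ down to the nearest multiple of $2^{-n}$ when $y < n$ and returns $n$ otherwise, which is exactly the rule defining $f_n(x)$.

```python
import math

def simple_approx(y, n):
    """Value f_n(x) of the n-th dyadic approximation, given y = f(x) >= 0.

    Returns (k - 1) * 2**-n when (k - 1) * 2**-n <= y < k * 2**-n <= n,
    and n when y >= n (including y = +infinity).
    """
    if y >= n:
        return float(n)
    return math.floor(y * 2**n) / 2**n

# Tabulate the approximations of a few sample values.
for y in (0.3, 2.7, float("inf")):
    print(y, [simple_approx(y, n) for n in range(1, 7)])
```

For each sample value the printed list is nondecreasing in $n$ and stays below $y$, as in (ii), and once $n > y$ the gap $y - f_n$ is at most $2^{-n}$, as in (1.16).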
2. STRUCTURE OF MEASURABLE FUNCTIONS
What does a measurable function $f$ look like? In case there is a topology defined on $X$, how far is $f$ from being continuous? First an example: Let $f$ be the characteristic function of the set $\mathcal{I}$ of irrational numbers in $[0,1]$. $f$ is Lebesgue measurable, yet discontinuous at every point of $[0,1]$. There is, however, another way to interpret this situation: Since $f$ is constant on $\mathcal{I}$, it is continuous there in the relative topology in $\mathcal{I}$ with respect to the usual topology of $R$. Now, $|[0,1] \setminus \mathcal{I}| = 0$ and consequently, we have that $f$ is continuous on a subset $\mathcal{I}$ of $[0,1]$ of full measure in the relative topology in $\mathcal{I}$. This observation points to a general fact concerning measurable functions. In order to avoid technical difficulties, and since the nature of the question is already apparent there, we restrict our attention to the Lebesgue measure on $X = [0,1]$. We begin by considering the simpler question of the boundedness of measurable functions.
Proposition 2.1. Assume $f$ is an extended real-valued Lebesgue measurable function defined on $X$ such that $|\{|f| = \infty\}| = 0$. Then, for any $\varepsilon > 0$, there exist a measurable subset $B$ of $X$ and a constant $M$ with the following properties:
(i) $|B| < \varepsilon$.
(ii) $|f(x)| \le M$ for $x \in X \setminus B$.
Proof. Let $B_n = \{|f| > n\}$, $n = 1, 2, \ldots$; $\{B_n\}$ is a nonincreasing sequence of measurable subsets of $X$. Suppose that for all $n$, $|B_n| \ge \varepsilon$. Since $\bigcap_n B_n = \{|f| = \infty\}$ and $|X| = 1$, by (3.4) in Chapter IV we get
\[ \lim_{n \to \infty} |B_n| = |\{|f| = \infty\}| \ge \varepsilon, \]
which is impossible. Thus, there exists $m$ such that $|B_m| < \varepsilon$, and the desired conclusion obtains with $B = B_m$ and $M = m$. ∎
As for the continuity of $f$, the following result of Lusin (1883-1950) provides the answer.
Theorem 2.2 (Lusin). Suppose $f$ is an extended real-valued measurable function defined on $X$ with the property that $|\{|f| = \infty\}| = 0$. Then, given $\varepsilon > 0$, there is a closed subset $F$ of $X$ such that
(i) $|X \setminus F| < \varepsilon$.
(ii) $f|F$ is continuous on $F$, in the relative topology in $F$.
Proof. Since $f = f^+ - f^-$ is the difference of two nonnegative measurable functions, we may assume that $f \ge 0$. Now, given $\varepsilon > 0$, let $B$ be a measurable subset of $X$ corresponding to the choice $\varepsilon/2$ in Proposition 2.1. The restriction $f|(X \setminus B)$ of $f$ to $X \setminus B$ is measurable and bounded, and with no fear of confusion we also denote it by $f$. By Theorem 1.12 there is a sequence $\{f_n\}$ of simple functions defined on $X \setminus B$ that converges uniformly to $f$ there. Let
\[ f_n = \sum_{i=1}^{i_n} c_{i,n} \chi_{E_{i,n}}, \quad n = 1, 2, \ldots, \]
where the $c_{i,n}$'s are real numbers and the $E_{i,n}$'s form a pairwise disjoint partition of $X \setminus B$. By the regularity properties of the Lebesgue measure, given $\eta_n > 0$, there exist closed subsets $F_{i,n}$ of $E_{i,n}$ with the property that
\[ |E_{i,n} \setminus F_{i,n}| \le \eta_n, \quad i = 1, \ldots, i_n, \quad n = 1, 2, \ldots \]
Put now $F_n = \bigcup_{i=1}^{i_n} F_{i,n}$; since each $F_{i,n}$ is closed, $F_n$ is a closed subset of $X$. Moreover, since the $F_{i,n}$'s are pairwise disjoint and compact, they can be separated by pairwise disjoint open subsets of $X$. Whence, the restriction of $f_n$ to $F_n$, or any of its subsets for that matter, is continuous in the relative topology in $F_n$. Next observe that
\[ |(X \setminus B) \setminus F_n| = \Big| \bigcup_{i=1}^{i_n} E_{i,n} \setminus \bigcup_{i=1}^{i_n} F_{i,n} \Big| \le \sum_{i=1}^{i_n} |E_{i,n} \setminus F_{i,n}| \le i_n \eta_n. \tag{2.1} \]
A good choice for $\eta_n$ is $\varepsilon / (i_n 2^{n+1})$, for then, by (2.1), we have
\[ |(X \setminus B) \setminus F_n| \le \varepsilon / 2^{n+1}, \quad n = 1, 2, \ldots \]
Put now $F = \bigcap_{n=1}^{\infty} F_n$; $F$ is a compact subset of $X \setminus B$ and $\{f_n|F\}$ is a sequence of continuous functions in the relative topology in $F$ which converges uniformly to $f$ on $F$. Since the uniform limit of a sequence of continuous functions on a compact set is continuous, $f|F$ is continuous in the relative topology in $F$. It only remains to estimate $|X \setminus F|$. First observe that $|(X \setminus B) \setminus F|$ equals
\[ \Big| (X \setminus B) \setminus \bigcap_n F_n \Big| = \Big| \bigcup_n ((X \setminus B) \setminus F_n) \Big| \le \sum_n |(X \setminus B) \setminus F_n| \le \sum_n \varepsilon / 2^{n+1} = \varepsilon / 2. \tag{2.2} \]
Now, since $X \setminus F = ((X \setminus B) \setminus F) \cup B$, by (2.2) and the choice of $B$ it readily follows that
\[ |X \setminus F| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \] ∎
3. SEQUENCES OF MEASURABLE FUNCTIONS
Let $(X, \mathcal{M}, \mu)$ be a measure space. The pointwise convergence of a sequence $\{f_n\}$ of real-valued functions defined on $X$ is a well-defined notion. Here, and unless otherwise specified, the term "convergence" means that the convergence is to a finite limit. Actually, a more relevant concept is that of $\mu$-a.e. convergence; sets of measure 0 play a special role in this setting. For instance, on $X = [0,1]$, the sequence $f_n(x) = 0$ if $x \ne 1/2$ and $f_n(1/2) = 1 + 1/n$ tends to $f(x) = 0$, $\delta_a$-a.e., $a \ne 1/2$, but $f_n \not\to f$ $\delta_{1/2}$-a.e. Other important notions of convergence involve uniformity, and we shall discuss them shortly.
How can we express the notion of $\mu$-a.e. convergence in terms of familiar concepts? First an observation: Since the limiting function $f$ is finite $\mu$-a.e., the statements $f_n(x) \to f(x)$ $\mu$-a.e. and $|f_n(x) - f(x)| \to 0$ $\mu$-a.e. are equivalent. So, we may restrict our attention to sequences of nonnegative functions which converge to 0 $\mu$-a.e. For such sequences we have
Proposition 3.1. Let $(X, \mathcal{M}, \mu)$ be a measure space and let $\{f_n\}$ be a sequence of measurable nonnegative functions defined on $X$. Then, $f_n \to 0$ $\mu$-a.e. on a subset $M \in \mathcal{M}$ iff for each $\eta > 0$, the sets $B_{n,\eta} = \{x \in M : f_n(x) > \eta\}$ satisfy $\mu(\limsup B_{n,\eta}) = 0$.
Proof. Let $f_n \to 0$ $\mu$-a.e. on $M$ and suppose that for some $\eta > 0$ we have $\mu(\limsup B_{n,\eta}) > 0$. Then each $x \in \limsup B_{n,\eta}$ belongs to infinitely many of the $B_{n,\eta}$'s, and consequently, there is a sequence $n_k \to \infty$ such that $f_{n_k}(x) > \eta$. Whence, $\limsup f_n(x) \ge \eta > 0$, and the $f_n$'s do not converge to 0 on a subset of $M$ of positive measure; this is a contradiction. Conversely, given $\varepsilon > 0$, pick $0 < \eta < \varepsilon$, and consider a point $x$ in $M \setminus (\limsup B_{n,\eta})$. Since $x$ belongs to at most finitely many of the $B_{n,\eta}$'s there is an $n_0$ such that $f_n(x) \le \eta < \varepsilon$ for all $n \ge n_0$. Taking $\eta = 1/j$, $j = 1, 2, \ldots$, and letting $N$ be the union of the corresponding null sets $\limsup B_{n,1/j}$, we get that $f_n(x) \to 0$ for $x \in M \setminus N$, $\mu(N) = 0$, which is precisely what we wanted to show. ∎
Since $\limsup B_{n,\eta} = \bigcap_{k=1}^{\infty} \big( \bigcup_{n=k}^{\infty} B_{n,\eta} \big)$, if $\mu\big(\bigcup_{n=k}^{\infty} B_{n,\eta}\big) < \infty$ for some $k$, by (3.4) in Chapter IV the conditions of Proposition 3.1 are satisfied iff
\[ \lim_{k \to \infty} \mu\Big( \bigcup_{n=k}^{\infty} B_{n,\eta} \Big) = 0, \quad \text{all } \eta > 0. \tag{3.1} \]
In particular, if $\mu(X) < \infty$, (3.1) describes convergence $\mu$-a.e. The relation (3.1) points to a possible limitation of the concept of $\mu$-a.e. convergence, namely, we require the control of all the $B_{n,\eta}$'s, from one index on. To illustrate this point, consider the sequence of (dyadic) subintervals of $I = [0,1]$ defined as follows: $I_0 = I$, $I_1 = [0,1/2]$, $I_2 = [1/2,1]$, $I_3 = [0,1/4]$, and so on. In other words, the sequence consists of successive blocks of $2^n$ nonoverlapping intervals, each of length $2^{-n}$, and the union of the intervals in each block is $I$. Let $\{f_n\}$ be the sequence consisting of the characteristic functions of the $I_n$'s. Clearly $\{f_n\}$ does not converge to 0 anywhere on $I$, yet in some sense the $f_n$'s are getting close to 0. Specifically,
\[ \lim_{n \to \infty} |\{f_n > \eta\}| = 0, \quad \text{all real } \eta > 0. \tag{3.2} \]
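The dyadic sequence above can be generated explicitly. The following sketch is an illustration only; the function names and the indexing formula (writing $n + 1 = 2^j + k$ with $0 \le k < 2^j$, so that $I_n = [k/2^j, (k+1)/2^j]$) are choices made here to reproduce the listed intervals $I_0, I_1, I_2, I_3$, not notation from the text. Exact rational arithmetic avoids rounding issues at the endpoints.

```python
from fractions import Fraction

def dyadic_interval(n):
    """Endpoints (a, b) of I_n for n >= 0, in the order I_0 = [0,1],
    I_1 = [0,1/2], I_2 = [1/2,1], I_3 = [0,1/4], ..."""
    j = (n + 1).bit_length() - 1   # block index: 2**j <= n + 1 < 2**(j + 1)
    k = (n + 1) - 2**j             # position of I_n inside block j
    return Fraction(k, 2**j), Fraction(k + 1, 2**j)

def f(n, x):
    """Characteristic function of I_n evaluated at x."""
    a, b = dyadic_interval(n)
    return 1 if a <= x <= b else 0

def measure_exceeding(n, eta):
    """Lebesgue measure of {f_n > eta}, assuming 0 < eta < 1: the length of I_n."""
    a, b = dyadic_interval(n)
    return b - a

x = Fraction(1, 3)
# Indices n in the first ten blocks (j = 0, ..., 9) at which f_n(x) = 1.
hits = [n for n in range(2**10 - 1) if f(n, x) == 1]
print(len(hits), measure_exceeding(2**10 - 1, Fraction(1, 2)))
```

Every block contributes one index at which $f_n(1/3) = 1$, so the values $f_n(1/3)$ do not tend to 0, while the measure of $\{f_n > \eta\}$ is the length of $I_n$ and shrinks like $2^{-j}$.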
Notice that in contrast to (3.1), we are dealing with one $B_{n,\eta}$ at a time. Motivated by this remark we introduce the following definition. Given a measure space $(X, \mathcal{M}, \mu)$ and a sequence $\{f_n\}$ of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$, we say that $f_n$ converges to 0 in $\mu$-measure iff
\[ \lim_{n \to \infty} \mu(\{f_n > \eta\}) = 0, \quad \text{all real } \eta > 0. \tag{3.3} \]
If $\mu(X) < \infty$ we refer to convergence in $\mu$-measure as convergence in the sense of probability, or convergence in probability. Thus, $\mu$-a.e. convergence implies convergence in probability, but the opposite is not true. Also, $\mu$-a.e. convergence does not, in general, imply convergence in $\mu$-measure. To see this consider $(R, \mathcal{L}, |\cdot|)$, and observe that the sequence $f_n(x) = \chi_{[n,\infty)}(x)$ tends to 0 everywhere, but $|\{f_n > 1/2\}| = \infty$ for all $n$.
Nevertheless, a closer look at the first example indicates that there is a subsequence $\{f_{n_k}\}$ of the $f_n$'s, specifically that consisting of the characteristic functions of the intervals $[0, 1/2^n]$, $n \ge 1$, with the property that $\lim_{k \to \infty} f_{n_k}(x) = 0$ for all $x \in (0,1]$. The remarkable fact is that this property is true for arbitrary sequences; before we prove this we need a bit of information concerning convergence in $\mu$-measure.
Proposition 3.2. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$. Then, $f_n \to 0$ in $\mu$-measure iff for any $\varepsilon, \delta > 0$, there exists a constant $N = N_{\varepsilon,\delta}$ such that
\[ \mu(\{f_n > \delta\}) < \varepsilon, \quad \text{all } n \ge N. \tag{3.4} \]
Proof. The necessity of the condition is obvious. As for the sufficiency, if $f_n \not\to 0$ in $\mu$-measure, then by (3.3) there exist $\eta > 0$ and a sequence $n_k \to \infty$ such that
\[ L = \limsup_{k \to \infty} \mu(\{f_{n_k} > \eta\}) > 0. \]
Thus, (3.4) cannot hold for $0 < \varepsilon < L$ and $\delta = \eta$, and this contradiction completes the proof. ∎
Next we show that convergence in probability implies $\mu$-a.e. convergence along a subsequence.
Proposition 3.3. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and assume $\{f_n\}$ is a sequence of measurable, nonnegative, extended real-valued, finite $\mu$-a.e. functions defined on $X$. If $f_n \to 0$ in probability, then there is an increasing sequence $n_k \to \infty$ such that
\[ \lim_{k \to \infty} f_{n_k} = 0 \quad \mu\text{-a.e. on } X. \]
Proof. By Proposition 3.2 it follows that for each $k$ we may find an index $n_k < n_{k+1}$ with the property that
\[ \mu(\{f_{n_k} > 1/2^k\}) \le 1/2^k. \tag{3.5} \]
Let $B_k = \{f_{n_k} > 1/2^k\}$ and consider the (bad) set $B = \limsup B_k$. Since by (3.5) $\sum_{k=1}^{\infty} \mu(B_k) < \infty$, by the Borel-Cantelli Lemma we have $\mu(B) = 0$. Now, it is not hard to see that
\[ \lim_{k \to \infty} f_{n_k}(x) = 0, \quad x \in X \setminus B. \tag{3.6} \]
Indeed, if $x \notin B$, then $x$ belongs to at most finitely many of the $B_k$'s and (3.6) holds. ∎
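The selection of the indices $n_k$ in the proof can be mimicked by a greedy search. The sketch below is a hedged illustration under stated assumptions: the names `block_measure` and `pick_subsequence` are hypothetical, the bound $2^{-k}$ is one convenient summable choice, and the test sequence is the dyadic sequence of characteristic functions of $I_0 = [0,1]$, $I_1 = [0,1/2]$, $I_2 = [1/2,1]$, $\ldots$, for which $|\{f_n > \eta\}|$ equals the length of $I_n$ when $0 < \eta < 1$. The search terminates only when the sequence does converge in measure.

```python
def block_measure(n, eta):
    """|{f_n > eta}| for the dyadic sequence, assuming 0 < eta < 1.

    I_n lies in block j with 2**j <= n + 1 < 2**(j + 1) and has length 2**-j.
    """
    j = (n + 1).bit_length() - 1
    return 2.0 ** -j

def pick_subsequence(measure_exceeding, K):
    """Greedy indices n_1 < ... < n_K with
    measure_exceeding(n_k, 2**-k) <= 2**-k for k = 1, ..., K."""
    indices, n = [], 0
    for k in range(1, K + 1):
        bound = 2.0 ** -k
        while measure_exceeding(n, bound) > bound:
            n += 1
        indices.append(n)
        n += 1          # force n_{k+1} > n_k
    return indices

print(pick_subsequence(block_measure, 5))
```

For the dyadic sequence the selected indices are the first indices of successive blocks, that is, the characteristic functions of the intervals $[0, 2^{-k}]$, which tend to 0 at every point of $(0,1]$.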
Sometimes we have to deal with questions of convergence when no limit is in evidence. For $\mu$-a.e. convergence this can be reduced to the numerical case, where the Cauchy criterion is available. Specifically, let $(X, \mathcal{M}, \mu)$ be a measure space, and let $\{f_n\}$ be a sequence of measurable extended real-valued finite $\mu$-a.e. functions defined on $X$. We say that $\{f_n\}$ is Cauchy $\mu$-a.e. if for $\mu$-almost every $x \in X$, given $\varepsilon > 0$, there is an integer $n_0 = n_0(x)$ such that
\[ |f_n(x) - f_{n'}(x)| \le \varepsilon, \quad \text{all } n, n' \ge n_0. \]
By the Cauchy criterion of convergence of numerical sequences, if $\{f_n\}$ is Cauchy $\mu$-a.e., then $\lim_{n \to \infty} f_n(x) = f(x)$ exists $\mu$-a.e. The same is true for convergence in probability. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable, extended real-valued, finite $\mu$-a.e. functions defined on $X$. We say that $\{f_n\}$ is Cauchy in probability if given $\varepsilon, \delta > 0$, there is an integer $n_0$ such that
\[ \mu(\{|f_n - f_{n'}| > \delta\}) < \varepsilon, \quad \text{all } n, n' \ge n_0. \]
Sequences which are Cauchy in probability converge in the sense of probability, cf. 4.30 below, and convergence in probability corresponds to a notion of "metric" convergence, cf. 4.31 below.
Next we discuss the concept of uniform convergence. Let $(X, \mathcal{M}, \mu)$ be a measure space and assume $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$. We say that $f_n \to 0$ almost uniformly if given $\varepsilon > 0$, we can find a measurable subset $B$ of $X$ such that $\mu(B) < \varepsilon$ and
\[ \lim_{n \to \infty} f_n(x) = 0, \quad \text{uniformly for } x \in X \setminus B. \]
Proposition 3.4. Let $(X, \mathcal{M}, \mu)$ be a measure space, let $\{f_n\}$ be a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$, and suppose that $f_n \to 0$ almost uniformly. Then $f_n \to 0$ $\mu$-a.e.
Proof. For every positive integer $k$ there is a measurable subset $B_k$ of $X$ such that $\mu(B_k) < 1/k$ and $f_n(x) \to 0$, uniformly for $x \in X \setminus B_k$. Clearly $f_n \to 0$ pointwise on the (good) set $G = \bigcup_{k=1}^{\infty} (X \setminus B_k)$. It only remains to check that $\mu(X \setminus G) = 0$; this is not hard. Since $X \setminus G$ equals
\[ \bigcap_{k=1}^{\infty} B_k \]
and $\mu(B_1) < \infty$, it readily follows that
\[ \mu(X \setminus G) = \lim_{m \to \infty} \mu\Big( \bigcap_{k=1}^{m} B_k \Big) \le \lim_{m \to \infty} 1/m = 0. \] ∎
How about the converse to Proposition 3.4? To decide whether it is true we investigate the rate at which arbitrary sequences converge pointwise to 0. First we show that the convergence occurs at a fairly rapid rate.
Theorem 3.5. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and suppose $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$ so that $f_n \to 0$ $\mu$-a.e. Then there exists a nondecreasing sequence of integers $\lambda_n \to \infty$ with the property that
\[ \lim_{n \to \infty} \lambda_n f_n = 0 \quad \mu\text{-a.e. on } X. \tag{3.7} \]
Proof. Redefining the $f_n$'s if necessary on a set of measure 0, we may assume that the $f_n$'s are finite everywhere and that $f_n(x) \to 0$ for every $x \in X$. Let
\[ g_n(x) = \sup_{k \ge n} f_k(x), \quad x \in X. \]
Clearly the $g_n$'s are measurable, $g_n(x) \ge f_n(x)$, all $n$, $x \in X$, and
\[ \lim_{n \to \infty} g_n(x) = 0. \]
In other words, working with the $g_n$'s instead, we may also assume that, in fact, the $f_n$'s decrease to 0 everywhere. The first step is to construct the $\lambda_n$'s. Let $n_1 = 1$, and note that since $f_n \to 0$ in probability, for each integer $k = 2, 3, \ldots$, there exist integers $n_k > n_{k-1}$ such that
\[ \mu(\{f_{n_k} > 1/k^2\}) \le 2^{-k}. \tag{3.8} \]
Put now
\[ \lambda_n = k, \quad \text{for } n_k \le n < n_{k+1}, \quad k = 1, 2, \ldots \tag{3.9} \]
In other words, the sequence of $\lambda_n$'s is defined in blocks: The first $(n_2 - n_1)$ entries are 1's, the next $(n_3 - n_2)$ entries are 2's, and so on. Furthermore, since $n_k \to \infty$ as $k \to \infty$, also $\lambda_n \to \infty$ as $n \to \infty$. Next we deal with the convergence of the sequence $\{\lambda_n f_n\}$. Let
\[ B_m = \bigcup_{n \ge n_m} \{\lambda_n f_n > 1/m\}, \quad m = 1, 2, \ldots \tag{3.10} \]
It is not hard to estimate $\mu(B_m)$. First observe that since the $\lambda_n$'s are constant on blocks we have
\[ B_m = \bigcup_{k=m}^{\infty} \bigcup_{n=n_k}^{n_{k+1}-1} \{\lambda_n f_n > 1/m\} = \bigcup_{k=m}^{\infty} \bigcup_{n=n_k}^{n_{k+1}-1} \{k f_n > 1/m\}. \tag{3.11} \]
Furthermore, since the sequence $\{f_n\}$ is nonincreasing and since $k \ge m$, the innermost union in (3.11) is contained in $\{f_{n_k} > 1/k^2\}$, and
\[ B_m \subseteq \bigcup_{k=m}^{\infty} \{f_{n_k} > 1/k^2\}. \tag{3.12} \]
Whence, by (3.12) and (3.8)
\[ \mu(B_m) \le \sum_{k=m}^{\infty} \mu(\{f_{n_k} > 1/k^2\}) \le 2^{-m+1}. \]
Let the (bad) set $B = \limsup B_m$. Since $\sum_m \mu(B_m) < \infty$, by the Borel-Cantelli Lemma we have $\mu(B) = 0$. It only remains to check that for any $x \in X \setminus B$, we have $\lim_{n \to \infty} \lambda_n f_n(x) = 0$. But this is not hard: Given $\varepsilon > 0$, let $m$ be so large that $1/m \le \varepsilon$ and $x \notin B_m$; such a choice is always possible since $x$ belongs to at most finitely many of the $B_m$'s. Then by (3.10) there exists $n_m$ so that
\[ \lambda_n f_n(x) \le 1/m \le \varepsilon, \quad \text{all } n \ge n_m. \] ∎
Our next result is an interesting interpretation of Theorem 3.5.
Theorem 3.6. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$ such that $f_n \to 0$ $\mu$-a.e. Then there exist a measurable nonnegative finite $\mu$-a.e. function $f$ defined on $X$ and a sequence of real numbers $\eta_n \to 0$ such that
\[ f_n \le \eta_n f \quad \mu\text{-a.e. on } X. \tag{3.13} \]
Proof. In the notation of Theorem 3.5, let
\[ f(x) = \sup_n \{\lambda_n f_n(x)\}, \quad x \in X. \]
Clearly $f$ is measurable and nonnegative, and since $\lambda_n f_n \to 0$ $\mu$-a.e., $f$ is also finite $\mu$-a.e. Put now $\eta_n = 1/\lambda_n \to 0$, and note that $f_n \le \eta_n f$ $\mu$-a.e., with $\eta_n$ as asserted. ∎
We are now ready to show that convergence $\mu$-a.e. implies almost uniform convergence; this result is due to Egorov (1869-1931).
Theorem 3.7 (Egorov). Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable nonnegative finite $\mu$-a.e. functions defined on $X$ such that $f_n \to 0$ $\mu$-a.e. Then $f_n \to 0$ almost uniformly.
Proof. We must show that given $\varepsilon > 0$, there exists $B \in \mathcal{M}$ such that $\mu(B) < \varepsilon$ and $f_n(x) \to 0$ uniformly on $X \setminus B$. Let $f$ be the $\mu$-a.e. finite function corresponding to the sequence $\{f_n\}$ constructed in Theorem 3.6. By Proposition 2.1 there is a constant $M$ such that $f(x) \le M$ for $x \in X \setminus B$, $\mu(B) < \varepsilon$. By Theorem 3.6
\[ f_n(x) \le \eta_n M, \quad x \in X \setminus B, \quad \mu(B) < \varepsilon, \]
and $f_n(x) \to 0$ uniformly for $x \in X \setminus B$. ∎
The measurability of the $f_n$'s is essential to the validity of Egorov's theorem, cf. 4.38 below, as is the assumption $\mu(X) < \infty$. Indeed, in the measure space $(R, \mathcal{L}, |\cdot|)$ the sequence $f_n = \chi_{[n,\infty)}$ tends to 0 everywhere on $R$, but not uniformly on any subset of $R$ that is unbounded above.
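A concrete instance of Egorov's theorem may help fix ideas. The sketch below uses an example chosen here, not one from the text: $f_n(x) = x^n$ on $X = [0,1]$ with Lebesgue measure. These functions tend to 0 on $[0,1)$, hence a.e., but not uniformly there, since $\sup_{0 \le x < 1} x^n = 1$ for every $n$. Removing the (hypothetical) bad set $B = (1 - \varepsilon/2, 1]$, of measure $\varepsilon/2 < \varepsilon$, leaves the supremum $(1 - \varepsilon/2)^n$, which does tend to 0.

```python
def sup_off_bad_set(n, eps):
    """Supremum of x**n over [0, 1 - eps/2], the complement of B = (1 - eps/2, 1]."""
    return (1.0 - eps / 2.0) ** n

def first_index_below(tol, eps):
    """Smallest n >= 1 such that x**n <= tol for every x in [0, 1 - eps/2]."""
    n = 1
    while sup_off_bad_set(n, eps) > tol:
        n += 1
    return n

eps, tol = 0.1, 1e-3
n0 = first_index_below(tol, eps)
print(n0, sup_off_bad_set(n0, eps))
```

Since $x^n$ decreases in $n$ for $0 \le x \le 1$, the bound $x^n \le$ `tol` then holds for all $n \ge$ `n0` and all $x$ outside $B$, which is the uniform convergence asserted by the theorem for this choice of $B$.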
4. PROBLEMS AND QUESTIONS The setting of the first thirteen problems and questions is the following: M is a u-algebra of (measurable) subsets of X and I is an extended real-valued function defined on X.
{I> A} EM for each rational number Aj is I measurable? Suppose I is a measurable real-valued function defined on X, and
4.1 Suppose 4.2
put g( x) = 0 if I( x) is rational and g( x) 9 measurable?
= 1 if I( x) is irrationalj is
I is measurable and B E Bl is a Borel subset of Rj does it follow that 1-1 (B) E M?
4.3 Suppose
I is a measurable real-valued function defined on X, and let > be a real-valued Borel measurable function defined on R. Show that the composition > 0 I is measurable.
4.4 Suppose
4.5 Suppose I is measurable and show that for each real r, s truncations
>
0, the
r if I(x) > r Ir,s(x) = { I_(sx) if - s ~ I(x) ~ r if I(x) < -s, are measurable. 4.6 If I,g are measurable real-valued functions defined on X and U,g) is measurable.
4.
Problems and Questions
99
4.7 Suppose $X = R$, and show that $\mathcal{B}_1$ is the smallest $\sigma$-field with respect to which all continuous functions are measurable. More precisely, show that every continuous function is measurable iff $\mathcal{B}_1 \subseteq \mathcal{M}$.

4.8 Let $\{f_n\}$ be a sequence of measurable functions defined on $X$, and let $A = \{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists}\}$. Show that $A \in \mathcal{M}$.

4.9 An elementary function is one which assumes at most countably many values. Show that if $f$ is an everywhere finite real-valued measurable function, then it is the uniform limit of a sequence of elementary functions. Also, unless $f$ is bounded, it is not the uniform limit of a sequence of simple functions.

4.10 Let $\mu$ be a measure on $(X,\mathcal{M})$. Show that the following assertions are equivalent: (a) $(X,\mathcal{M},\mu)$ is a complete measure space, and, (b) If $\{f_n\}$ is a sequence of measurable functions defined on $X$ which converges $\mu$-a.e., and if $f$ is any extended real-valued function defined on $X$ such that $f = \lim_{n\to\infty} f_n$ $\mu$-a.e., then $f$ is measurable.

4.11 Suppose $(X,\mathcal{M},\mu)$ is a $\sigma$-finite measure space and let $f$ be a measurable function defined on $X$. Show that the function $\mu(\{|f| > \lambda\})$, $\lambda > 0$, is nonincreasing and right-continuous. Furthermore, if $f$, $f_1$ and $f_2$ are nonnegative and measurable and $\eta_1, \eta_2$ are nonnegative real numbers so that $f \le \eta_1 f_1 + \eta_2 f_2$ $\mu$-a.e., then for any $\lambda > 0$

4.12 With the notation of 4.11, if $\{f_n\}$ is a nondecreasing sequence of measurable functions and $f = \lim_{n\to\infty} f_n$, then for any $\lambda > 0$,
$$\lim_{n\to\infty} \mu(\{|f_n| > \lambda\}) = \mu(\{|f| > \lambda\}).$$

4.13 With the notation of 4.11, suppose that $\mu(\{|f| > \lambda\}) \to 0$ as $\lambda \to \infty$, and let
$$f^*(t) = \inf\{\lambda : \mu(\{|f| > \lambda\}) \le t\}, \qquad t \ge 0.$$
Show that $f^*$ satisfies the following properties:
(i) It is nonincreasing and right-continuous.
(ii) If $0 < \eta \le \mu(\{|f| > \lambda\}) < \infty$, then
$$f^*(\mu(\{|f| > \lambda\})) \le \lambda < f^*(\mu(\{|f| > \lambda\}) - \eta).$$
(iii) If $f^*$ is continuous at $t = \mu(\{|f| > \lambda\})$, then $f^*(\mu(\{|f| > \lambda\})) = \lambda$.
(iv) $f$ and $f^*$ are equimeasurable, i.e., for $\lambda > 0$, $\mu(\{|f| > \lambda\}) = |\{f^* > \lambda\}|$.
Because of property (iv), $f^*$ is called the nonincreasing equimeasurable rearrangement of $f$.

The next six problems and questions deal with the Lebesgue measure on $R$.

4.14 Suppose $f$ is defined a.e. on $[0,1]$ and it is continuous a.e. there. Is $f$ Lebesgue measurable? What if $f$ is right-continuous instead?

4.15 Suppose $f$ is differentiable on $[0,1]$. Is $f'$ Lebesgue measurable?

4.16 Show that $f$ is Lebesgue measurable iff there is a Borel measurable function $g$ such that $f = g$ Lebesgue-a.e.

4.17 Suppose $f(x,y)$ is a Lebesgue measurable real-valued function defined on $R^2$ with the property that $f(x,\cdot)$ is Lebesgue measurable as a function of $x \in R$, and $f(\cdot,y)$ is continuous as a function of $y \in R$. Let now $\phi(x) = \max$ … is Lebesgue measurable.

4.18 Suppose that $f$ is Lebesgue measurable and $\phi$ is real-valued, continuous, and has the following property: For any null set $N$, $\phi^{-1}(N) \in \mathcal{L}$. Show that $f \circ \phi$ is Lebesgue measurable.

4.19 Suppose $\{A_n\}$ is a sequence of Lebesgue measurable subsets of $R$, and let $f_n = \chi_{A_n}$, $n = 1, 2, \ldots$ Find necessary and sufficient conditions for the sequence $\{f_n\}$ to: (a) converge Lebesgue-a.e., and, (b) converge uniformly.
In the next three problems and questions we assume that $(X,\mathcal{M},\mu)$ is a probability measure space.

4.20 An extended real-valued measurable function $f$ defined on $X$ is said to be bounded in probability if given $\varepsilon > 0$, there exists a finite real number $M_\varepsilon$ such that $\mu(\{|f| \le M_\varepsilon\}) \ge 1 - \varepsilon$. Prove that $f$ is bounded in probability iff $f$ is finite $\mu$-a.e.

4.21 A sequence of extended real-valued measurable functions $\{f_n\}$ is said to be bounded in probability iff $\sup |f_n|$ is bounded in probability. The sequence $\{f_n\}$ is said to diverge to $\infty$ in probability iff
for each $M > 0$ and $\varepsilon > 0$, there exists a finite integer $n_0(M,\varepsilon)$ such that if $n > n_0$, then $\mu(\{|f_n| > M\}) > 1 - \varepsilon$. Prove that if $\{f_n\}$ diverges to $\infty$ in probability and $\{g_n\}$ is bounded in probability, then the sequence $\{f_n + g_n\}$ diverges to $\infty$ in probability.

4.22 Suppose that $\sup f_n = \infty$ $\mu$-a.e. Does there necessarily exist a subsequence $\{f_{n_k}\}$ that diverges to $\infty$ in probability?

In the next nine problems and questions we assume that $(X,\mathcal{M},\mu)$ is a measure space.

4.23 This result concerns the approximation Theorem 1.12. Suppose the function $f$ there enjoys the property that $\{x \in X : f(x) \ne 0\}$ is $\sigma$-finite. Then prove that the approximating sequence $\{f_n\}$ may be constructed with the additional property that $\mu(\{f_n \ne 0\}) < \infty$ for each $n$. Conversely, if each of the simple functions $f_n$ satisfies $\mu(\{f_n \ne 0\}) < \infty$ and for each $x \in X$ we have $\lim_{n\to\infty} f_n(x) = f(x)$, then the set $\{f \ne 0\}$ is $\sigma$-finite.

4.24 Let $\mu(X) < \infty$, and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ that satisfies
$$\sum_{n=1}^{\infty} \mu(\{|f_n|/\lambda_n > 1\}) < \infty.$$
Prove that $\limsup (|f_n|/\lambda_n) \le 1$ $\mu$-a.e.

4.25 Suppose $\mu(X) = 1$. Show that the sequence of extended real-valued measurable functions $\{f_n\}$ converges $\mu$-a.e. to the measurable function $f$ iff for any $\varepsilon > 0$ and $N$ we have
$$\mu(\{|f_n - f| \le \varepsilon, \text{ for all } n \ge N\}) = 1.$$

4.26 Suppose $\mu(X) < \infty$ and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on $X$ which converges to $f$ in probability. Show that $f$ is measurable and that $f_n^+ \to f^+$, $f_n^- \to f^-$ and $|f_n| \to |f|$ in the sense of probability. Are the analogous statements for convergence in measure valid?

4.27 In the setting of 4.26, suppose that $\{g_n\}$ is a sequence of extended real-valued functions defined on $X$ which converges to $g$ in probability, and $\{\lambda_n\}$ a sequence of real numbers, $\lambda_n \to \lambda$. Does it follow that $f_n + \lambda_n g_n \to f + \lambda g$ in probability? Is it also true that $f_n/g_n \to f/g$ in probability? How about the corresponding statements for convergence in measure?
Consider further the following statement: If $\phi : R^2 \to R$ is continuous, then $\phi(f_n, g_n) \to \phi(f, g)$ in probability, or in measure. Is it true?

4.28 In the setting of 4.26, assume that the sequence $\{f_n\}$ is nondecreasing. Show that in this case we have $\lim_{n\to\infty} f_n = f$ $\mu$-a.e.

4.29 In the setting of 4.26, suppose that $\mu$ is a probability measure and, for $t \ne 0$ real, let $D(t) = 1$ if $t < 0$ and $D(t) = 0$ if $t > 0$. Show that $f_n \to 0$ in probability iff
$$\lim_{n\to\infty} \mu(\{f_n > t\}) = D(t), \qquad \text{all } t \ne 0.$$

4.30 Suppose that $\mu(X) < \infty$, and let $\{f_n\}$ be a Cauchy sequence in the sense of probability, i.e., for any $\varepsilon > 0$ there exists an integer $N = N_\varepsilon$ such that
$$\mu(\{|f_n - f_m| > \varepsilon\}) < \varepsilon, \qquad \text{all } n, m \ge N.$$
Show that there exists a measurable function $f$ such that $\lim_{n\to\infty} f_n = f$ in probability.

4.31 Suppose that $\mu(X) < \infty$, and for extended real-valued measurable functions $f, g$ defined on $X$ put
$$d(f,g) = \inf\{\varepsilon > 0 : \mu(\{|f - g| > \varepsilon\}) \le \varepsilon\}.$$
Observe that $d(f,g) = 0$ iff $f = g$ $\mu$-a.e.; whence (at the level of classes) upon identifying functions which agree $\mu$-a.e. we have $d(f,g) = 0$ iff $f = g$. Show that $d(f,g)$ is a distance function and that $f_n \to f$ in probability iff $d(f_n, f) \to 0$. Furthermore, show that endowed with the metric $d$, the space of measurable functions is a complete metric space. Is a similar result, involving now convergence in measure, true for arbitrary measure spaces?

4.32 Show that if $\mu$ is the counting measure on the integers $X = Z$, then convergence in measure is equivalent to uniform convergence.

4.33 Suppose $(X,\mathcal{M},\mu)$ is a finite measure space, and let $\{f_n\}$ be a sequence of measurable real-valued functions defined on $X$. Show that given $\varepsilon > 0$, there exist a (bad) set $B$ with $\mu(B) < \varepsilon$ and a constant $M$ such that
$$|f_n(x)| \le M, \qquad \text{all } x \in X \setminus B, \quad \text{all } n.$$
Is the conclusion true if we assume instead that the $f_n$'s are finite $\mu$-a.e.?

4.34 Suppose $(X,\mathcal{M},\mu)$ is a finite measure space, and let $\{f_n\}$ be a sequence of extended real-valued measurable functions defined on $X$. Show that $\{f_n\}$ converges to a finite limit $\mu$-a.e. iff given $\varepsilon > 0$, there exists a finite constant $M_\varepsilon$ such that

4.35 Suppose $(X,\mathcal{M},\mu)$ is a finite measure space, and let $\{f_n\}$ be a sequence of measurable real-valued functions defined on $X$. Show that there exists a sequence $(a_n)$ of positive real numbers with the property that $\lim_{n\to\infty} a_n f_n = 0$ $\mu$-a.e.

4.36 Show that the proof of Lusin's theorem may be adjusted to give the following result: Suppose $f$ is an extended real-valued finite Lebesgue-a.e. measurable function defined on $X = [a,b]$. Then, given $\varepsilon > 0$, there is a continuous function $\phi$ defined on $X$ such that
$$|\{x \in X : f(x) \ne \phi(x)\}| < \varepsilon.$$

4.37 In the setting of 4.36 show that the following is true: There is a sequence of continuous functions $\{f_n\}$ defined on $X$ such that $\lim_{n\to\infty} f_n = f$ Lebesgue-a.e.

4.38 By means of an example show that the conclusion of Egorov's theorem does not necessarily hold if the $f_n$'s are not Lebesgue measurable.

4.39 Construct a sequence of Lebesgue measurable functions $\{f_n\}$ defined on $X = [a,b]$ with the following properties: The $f_n$'s converge at every point of $X$, but they do not converge uniformly on any Lebesgue measurable subset $E$ of $X$ with $|X \setminus E| = 0$.

4.40 Let $\{f_n\}$ be a sequence of Lebesgue measurable functions defined on $X = [a,b]$ and suppose that $\lim_{n\to\infty} f_n = f$ exists Lebesgue-a.e. on $X$. If $f \ne 0$ Lebesgue-a.e. and $f_n \ne 0$ Lebesgue-a.e. on $X$, prove that given $\varepsilon > 0$, there exist $c > 0$ and a sequence $\{E_n\}$ of Lebesgue measurable subsets of $X$ such that
$$|f_n(x)| \ge c, \quad x \in E_n, \qquad \text{and} \qquad |X \setminus E_n| \le \varepsilon, \qquad n = 1, 2, \ldots$$
4.41 Prove the following variant of Egorov's theorem: Let $X = [a,b]$ and suppose $f, f_n$, $n = 1, 2, \ldots$, are Lebesgue measurable functions defined on $X$ such that $\limsup f_n \le f$ on $X$. Then, given $\varepsilon > 0$, there is a Lebesgue measurable set $E \subseteq X$ with $|E| < \varepsilon$ such that for each $\eta > 0$ there is $N$ with the property that
$$f_n(x) \le f(x) + \eta, \qquad \text{all } x \in X \setminus E, \quad n \ge N.$$

4.42 Prove the following extension of Egorov's theorem due to Lusin. Let $(X,\mathcal{M},\mu)$ be a $\sigma$-finite measure space, and $f, f_n$ be extended real-valued measurable functions defined $\mu$-a.e. on $X$, $n = 1, 2, \ldots$, such that $\lim_{n\to\infty} f_n = f$ $\mu$-a.e. Then there exist measurable sets $N, E_1, E_2, \ldots$ such that $X = N \cup \bigcup_k E_k$, $\mu(N) = 0$, and the sequence $\{f_n\}$ converges uniformly to $f$ on each $E_k$.

4.43 Let $\mathcal{M}$ be a $\sigma$-algebra of (measurable) subsets of $X$, and let $f$ be a complex-valued function defined on $X$. We say that $f = \Re f + i\,\Im f$ is measurable iff the (real-valued) functions $\Re f$ and $\Im f$ are measurable. This is an open ended question: Discuss the properties of complex-valued measurable functions. For instance, show that $f$ is measurable iff for any open subset $U$ in the complex plane, $f^{-1}(U) \in \mathcal{M}$.
CHAPTER
VII
Integration
In this chapter we introduce the integral and establish its basic properties, including taking limits under the integral sign. The relation between the Riemann and the Lebesgue integrals of a bounded function is elucidated.
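1. THE INTEGRAL OF NONNEGATIVE FUNCTIONS

Suppose $(X,\mathcal{M},\mu)$ is a measure space and let $\phi$ be a nonnegative simple function,
$$\phi(x) = \sum_{k=1}^{n} a_k \chi_{A_k}(x), \qquad a_k \text{ real},$$
where the $A_k$'s form a measurable pairwise disjoint partition of $X$. The integral of $\phi$ over $X$ with respect to $\mu$ is denoted by $\int_X \phi(x)\,d\mu(x)$, or simply $\int_X \phi\,d\mu$, and it is defined as
$$\int_X \phi\,d\mu = \sum_{k=1}^{n} a_k\,\mu(A_k). \tag{1.1}$$
The usual convention $0 \cdot \infty = 0$ is in effect, so that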
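Theorem 1.1. Let $(X,\mathcal{M},\mu)$ be a measure space. The integral of nonnegative simple functions has the following properties:

(i) It is well-defined, i.e., if also $\phi(x) = \sum_{h=1}^{m} b_h \chi_{B_h}(x)$, then
$$\sum_{k=1}^{n} a_k\,\mu(A_k) = \sum_{h=1}^{m} b_h\,\mu(B_h).$$

(ii) If $\phi$ and $\psi$ are nonnegative simple functions and $\phi = \psi$ $\mu$-a.e., then
$$\int_X \phi\,d\mu = \int_X \psi\,d\mu.$$

(iii) It is positively linear, i.e., if $\phi, \psi$ are nonnegative simple functions and $\lambda > 0$, then
$$\int_X (\phi + \lambda\psi)\,d\mu = \int_X \phi\,d\mu + \lambda \int_X \psi\,d\mu. \tag{1.2}$$

(iv) It is monotone, i.e., if $0 \le \phi \le \psi$ are simple functions, then
$$\int_X \phi\,d\mu \le \int_X \psi\,d\mu. \tag{1.3}$$

(v) For each nonnegative simple function $\phi$, the set function $\nu$ given by
$$\nu(E) = \int_X \phi\chi_E\,d\mu = \int_E \phi\,d\mu, \qquad E \in \mathcal{M},$$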
is a measure on $(X,\mathcal{M})$.

Proof. Since the proof of (i) follows along the lines of that of (iii), we only do (iii). Let, then, $\phi = \sum_n a_n \chi_{A_n}$ and $\psi = \sum_m b_m \chi_{B_m}$ be nonnegative simple functions; $\phi + \lambda\psi$ is then the simple function that takes the value $a_n + \lambda b_m$ on the set $A_n \cap B_m \in \mathcal{M}$. Note that the $a_n + \lambda b_m$'s are not necessarily distinct, but that the $A_n \cap B_m$'s are pairwise disjoint. Thus, by the definition of the integral, we have
$$\int_X (\phi + \lambda\psi)\,d\mu = \sum_{n,m} (a_n + \lambda b_m)\,\mu(A_n \cap B_m). \tag{1.4}$$
If the sum in (1.4) is infinite there are indices $n, m$ such that $a_n + \lambda b_m \ne 0$, $\mu(A_n \cap B_m) = \infty$, and either $a_n\mu(A_n) = \infty$ or $b_m\mu(B_m) = \infty$. In the former case we have $\int_X \phi\,d\mu = \infty$, in the latter case it follows that $\int_X \psi\,d\mu = \infty$, and in either case (1.2) holds. On the other hand, if the sum in (1.4) is finite, since the summands there are nonnegative they may be rearranged freely and we obtain at once that the sum equals
$$\sum_n a_n \sum_m \mu(A_n \cap B_m) + \lambda \sum_m b_m \sum_n \mu(A_n \cap B_m).$$
By the additivity of $\mu$ this expression is
$$\sum_n a_n\,\mu(A_n) + \lambda \sum_m b_m\,\mu(B_m) = \int_X \phi\,d\mu + \lambda \int_X \psi\,d\mu,$$
and (iii) holds.

We prove (ii) next. Since $\phi = \psi$ $\mu$-a.e., there are nonnegative simple functions $\zeta$, $\phi'$ and $\psi'$ and a null set $A$ such that $\phi'$ and $\psi'$ vanish off $A$ and $\phi = \zeta + \phi'$, $\psi = \zeta + \psi'$. By (i) it readily follows that
$$\int_X \phi\,d\mu = \int_X \zeta\,d\mu = \int_X \psi\,d\mu,$$
and (ii) is true.

As for (iv), let $\phi = \sum_n a_n \chi_{A_n}$, $\psi = \sum_m b_m \chi_{B_m}$, and observe that if $\mu(A_n \cap B_m) \ne 0$, then $a_n \le b_m$. Whence, since $\bigcup_{n,m} (A_n \cap B_m) = X$, we have
$$\int_X \phi\,d\mu = \sum_{n,m} a_n\,\mu(A_n \cap B_m) \le \sum_{n,m} b_m\,\mu(A_n \cap B_m) = \int_X \psi\,d\mu,$$
and (iv) is true.

(v) is a useful property and, among other things, it gives new examples of measures on $(X,\mathcal{M})$. Clearly $\nu$ is a nonnegative set function and $\nu(\emptyset) = 0$; only the $\sigma$-additivity requires some work. First observe that if $\phi$ is a nonnegative simple function and $E \in \mathcal{M}$, then $\phi\chi_E$ is also a nonnegative simple function and
$$\nu(E) = \int_X \phi\chi_E\,d\mu = \sum_n a_n\,\mu(A_n \cap E).$$
Suppose, then, that $\{E_k\}$ is a sequence of pairwise disjoint, measurable subsets of $X$, and let $E$ denote its union. Now, since $\mu$ is a measure, the right-hand side of the above expression equals
$$\sum_n a_n \sum_k \mu(A_n \cap E_k) = \sum_k \sum_n a_n\,\mu(A_n \cap E_k) = \sum_k \nu(E_k).$$
Whence $\nu$ is a measure on $(X,\mathcal{M})$. ∎
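The definition (1.1) and properties (i) and (iii) can be checked by hand on small examples. The following is a minimal sketch, not part of the text: it assumes the Lebesgue measure on $[0,1]$, represents a simple function as a list of pairs (value, measure of the set), and the helper name `integrate_simple` is hypothetical.

```python
from fractions import Fraction
import math

def integrate_simple(terms):
    """Integral of a nonnegative simple function given as (a_k, mu(A_k)) pairs,
    the A_k being pairwise disjoint. Uses the convention 0 * infinity = 0."""
    total = Fraction(0)
    for a, m in terms:
        if a == 0:
            continue              # 0 * mu(A_k) = 0 even when mu(A_k) is infinite
        if m == math.inf:
            return math.inf       # positive value on a set of infinite measure
        total += a * m
    return total

half, quarter = Fraction(1, 2), Fraction(1, 4)

# phi = 2 on [0,1/2), 3 on [1/2,1], written in two different ways.
phi = [(2, half), (3, half)]
phi_refined = [(2, quarter), (2, quarter), (3, half)]

# psi = 1 on [0,1/4), 0 on [1/4,1].
psi = [(1, quarter), (0, 3 * quarter)]

# phi + lam*psi on the common refinement [0,1/4), [1/4,1/2), [1/2,1].
lam = 4
combo = [(2 + lam * 1, quarter), (2 + lam * 0, quarter), (3 + lam * 0, half)]

print(integrate_simple(phi), integrate_simple(phi_refined))   # property (i)
print(integrate_simple(combo),
      integrate_simple(phi) + lam * integrate_simple(psi))    # property (iii)
print(integrate_simple([(0, math.inf), (1, half)]))           # 0 * infinity = 0
```

Exact rational arithmetic is used so that the two sides of (1.2) can be compared with equality rather than up to rounding.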
How about the integral of arbitrary nonnegative measurable functions? By Theorem 1.12 in Chapter VI, these functions are limits of nondecreasing sequences of simple functions, and this fact suggests a way of defining the integral. Let $(X,\mathcal{M},\mu)$ be a measure space, $f$ a nonnegative measurable function defined on $X$, and set
$$\mathcal{F}_f = \{\phi : \phi \text{ is simple, and } 0 \le \phi \le f\}.$$
The integral of $f$ over $X$ with respect to $\mu$ is denoted by $\int_X f(x)\,d\mu(x)$, or simply $\int_X f\,d\mu$, and it is defined as the quantity
$$\int_X f\,d\mu = \sup_{\phi \in \mathcal{F}_f} \int_X \phi\,d\mu. \tag{1.5}$$
By Theorem 1.12 in Chapter VI, $\mathcal{F}_f \ne \emptyset$, and consequently, $\int_X f\,d\mu$ is a well-defined nonnegative real number or $\infty$. (1.5) is similar to the definition of the lower Riemann integral of a nonnegative function $f$, but with a crucial difference: Rather than considering partitions of the "domain of integration" $X$, we work with partitions of the "range" of $f$, in a manner compatible with each individual function $f$. More precisely, $\mathcal{F}_f$ contains the $\phi$'s constructed in Theorem 1.12 in Chapter VI, and these simple functions are closely related to the level sets of $f$.

Before we go on we must check that if $f$ is simple, then the definitions in (1.1) and (1.5) coincide. If $\int_X f\,d\mu$ denotes the expression given by (1.1), since $f \in \mathcal{F}_f$ it readily follows that
$$\int_X f\,d\mu \le \sup_{\phi \in \mathcal{F}_f} \int_X \phi\,d\mu. \tag{1.6}$$
On the other hand, if $\phi \in \mathcal{F}_f$, then by (1.3),
$$\int_X \phi\,d\mu \le \int_X f\,d\mu, \qquad \text{all } \phi \in \mathcal{F}_f,$$
the inequality opposite to (1.6) holds, and the definitions given by (1.1) and (1.5) coincide. We are now ready to consider some elementary properties of the integral.
Theorem 1.2. Assume $(X,\mathcal{M},\mu)$ is a measure space, and let $f, g$ be nonnegative measurable functions defined on $X$. We then have
(i) If $f = g$ $\mu$-a.e., then $\int_X f\,d\mu = \int_X g\,d\mu$.
(ii) $$\int_X (f + g)\,d\mu \ge \int_X f\,d\mu + \int_X g\,d\mu. \tag{1.7}$$
(iii) If $f \le g$, then $\int_X f\,d\mu \le \int_X g\,d\mu$.
(iv) If $A \subseteq B$ are measurable, then
$$\int_A f\,d\mu \le \int_B f\,d\mu. \tag{1.8}$$

Proof. (i) follows at once since for any $\phi \in \mathcal{F}_f$ there exists a $\psi \in \mathcal{F}_g$ such that $\phi = \psi$ $\mu$-a.e., and, by property (ii) in Theorem 1.1, the integrals of $\phi$ and $\psi$ over $X$ with respect to $\mu$ are equal. As for (ii), observe that since for any $\phi \in \mathcal{F}_f$ and $\psi \in \mathcal{F}_g$ we have $\phi + \psi \in \mathcal{F}_{f+g}$, by (1.2) it follows that
$$\int_X \phi\,d\mu + \int_X \psi\,d\mu = \int_X (\phi + \psi)\,d\mu \le \int_X (f + g)\,d\mu.$$
Whence taking the sup of the left-hand side in the above inequality over $\phi \in \mathcal{F}_f$ and $\psi \in \mathcal{F}_g$ gives (1.7), and (ii) holds. As for (iii), note that in this case we have $\mathcal{F}_f \subseteq \mathcal{F}_g$, and consequently,
$$\int_X \phi\,d\mu \le \int_X g\,d\mu, \qquad \text{all } \phi \in \mathcal{F}_f.$$
Taking the sup over the $\phi$'s above gives (iii). To verify (1.8) it suffices to note that in this case $\mathcal{F}_{f\chi_A} \subseteq \mathcal{F}_{f\chi_B}$. Thus (iv) holds, and the proof is complete. ∎

It is of interest to determine whether equality holds in (1.7). To address this question, and to investigate the behavior of the integral with respect to limits, we consider the following result.
Theorem 1.3 (Beppo Levi). Let $(X,\mathcal{M},\mu)$ be a measure space and assume $\{f_n\}$ is a nondecreasing sequence of nonnegative finite $\mu$-a.e. measurable functions defined on $X$. Then, $\lim_{n\to\infty} f_n(x) = f(x)$ exists everywhere on $X$, $f(x)$ is nonnegative and measurable and
$$\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu \left(= \sup_n \int_X f_n\,d\mu\right). \tag{1.9}$$
Proof. That $f$ is nonnegative and measurable is clear. By monotonicity, the numerical sequence $\int_X f_n\,d\mu$, $n = 1, 2, \ldots$, is nondecreasing, and consequently, it has a limit $L$, say. Also, by monotonicity, $\int_X f_n\,d\mu \le \int_X f\,d\mu$ for all $n$, and
$$L = \sup_n \int_X f_n\,d\mu \le \int_X f\,d\mu. \tag{1.10}$$
If $L = \infty$, the right-hand side of (1.10) is also $\infty$, and (1.9) holds in this case. On the other hand, if $L$ is finite we must show the inequality opposite to (1.10), and this requires some work. Given $0 < \eta < 1$ and $\phi \in \mathcal{F}_f$, let $E_n = \{f_n \ge \eta\phi\}$; $\{E_n\}$ is a sequence of measurable sets and since the $f_n$'s are nondecreasing and $\phi \le f$, it readily follows that $E_n \subseteq E_{n+1}$ for all $n$, and that $\lim E_n = X$. Consider now the measure $\nu(E) = \int_E \phi\,d\mu$, $E \in \mathcal{M}$, and observe that by monotonicity we have
$$\int_X f_n\,d\mu \ge \int_{E_n} f_n\,d\mu \ge \eta \int_{E_n} \phi\,d\mu = \eta\,\nu(E_n). \tag{1.11}$$
By (1.10) and (v) in Theorem 1.1 both sides of (1.11) have a finite limit as $n \to \infty$. Whence, taking limits there we obtain at once
$$L \ge \lim_{n\to\infty} \eta\,\nu(E_n) = \eta\,\nu(X) = \eta \int_X \phi\,d\mu. \tag{1.12}$$
Now, (1.12) holds for each $\phi \in \mathcal{F}_f$, and taking sup over $\mathcal{F}_f$ we get
$$\eta \int_X f\,d\mu \le L. \tag{1.13}$$
Since $\eta < 1$ is arbitrary it is clear that (1.13) is also true with $\eta = 1$ and the inequality opposite to (1.10) holds. ∎
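As an illustration of (1.9), and not as part of the proof, consider the Lebesgue measure on $[0,1]$, $f(x) = x$, and the dyadic simple functions $\phi_n(x) = \lfloor 2^n x \rfloor / 2^n$, which increase to $f$. The sketch below is a hypothetical check, assuming this particular choice of $\phi_n$; the single point $x = 1$, where $\phi_n = 1$, is a null set and is ignored.

```python
from fractions import Fraction

def integral_phi(n):
    """Integral over [0,1] (Lebesgue measure) of phi_n = floor(2^n x) / 2^n,
    which equals j/2^n on [j/2^n, (j+1)/2^n), an interval of length 1/2^n."""
    N = 2 ** n
    return sum(Fraction(j, N) * Fraction(1, N) for j in range(N))

values = [integral_phi(n) for n in range(1, 8)]
for n, v in enumerate(values, start=1):
    # The gap to the integral of f, namely 1/2, shrinks as n grows.
    print(n, v, Fraction(1, 2) - v)
```

The integrals form a nondecreasing sequence bounded by $\int_{[0,1]} x\,dx = 1/2$, in line with the statement that the limit of the integrals is the integral of the limit.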
This result of Beppo Levi (1875-1961), also known as the Monotone convergence theorem, or MCT, has many important applications; before we discuss them we present a simple extension of MCT, also useful in applications.

Corollary 1.4 ($\mu$-a.e. version of MCT). Let $(X,\mathcal{M},\mu)$ be a measure space and $\{f_n\}$ a $\mu$-a.e. nondecreasing sequence of nonnegative finite $\mu$-a.e. measurable functions defined on $X$. Then $\lim_{n\to\infty} f_n(x) = f(x)$ exists $\mu$-a.e. on $X$, $f(x)$ is nonnegative and measurable and
$$\lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$
Proof. Let $N$ be the null set outside of which the $f_n$'s increase to $f$, and put $g_n = f_n$ on $X \setminus N$, and $g_n = 0$ on $N$, and $g = f$ on $X \setminus N$, and $g = 0$ on $N$. The point is that now the $g_n$'s converge to $g$ everywhere, and they coincide with the $f_n$'s and $f$ at the level of integrals. More precisely, by property (i) of Theorem 1.2,
$$\int_X f\,d\mu = \int_X g\,d\mu, \qquad \text{and} \qquad \int_X f_n\,d\mu = \int_X g_n\,d\mu, \quad \text{all } n,$$
and consequently
$$\lim_{n\to\infty} \int_X f_n\,d\mu = \lim_{n\to\infty} \int_X g_n\,d\mu = \int_X g\,d\mu = \int_X f\,d\mu.$$
∎
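As for the applications, we do the additivity and $\sigma$-additivity of the integral first.

Proposition 1.5. Suppose $(X,\mathcal{M},\mu)$ is a measure space and let $f, g$ be nonnegative extended real-valued measurable functions defined on $X$. Then
$$\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \tag{1.14}$$

Proof. Let $\{\phi_n\}$ and $\{\psi_n\}$ be nondecreasing sequences of nonnegative simple functions which converge to $f$ and $g$ respectively; such sequences exist by Theorem 1.12 in Chapter VI. By (1.2),
$$\int_X (\phi_n + \psi_n)\,d\mu = \int_X \phi_n\,d\mu + \int_X \psi_n\,d\mu, \qquad \text{all } n. \tag{1.15}$$
By MCT the left-hand side of (1.15) converges to $\int_X (f + g)\,d\mu$ as $n \to \infty$, and also by MCT the right-hand side converges to $\int_X f\,d\mu + \int_X g\,d\mu$ as $n \to \infty$. Thus (1.14) holds. ∎

We are now ready to prove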
Theorem 1.6. Assume $(X,\mathcal{M},\mu)$ is a measure space, suppose $\{f_n\}$ is a sequence of nonnegative extended real-valued measurable functions defined on $X$ and let $f = \sum_n f_n$. Then $f$ is nonnegative, extended real-valued and measurable, and
$$\int_X f\,d\mu = \sum_n \int_X f_n\,d\mu. \tag{1.16}$$
Proof. Let $S_k = \sum_{n=1}^{k} f_n$, $k = 1, 2, \ldots$, and observe that the $S_k$'s form a nondecreasing sequence of nonnegative, extended real-valued measurable functions defined on $X$; moreover, $\lim_{k\to\infty} S_k = f$, and $f$ is nonnegative, extended real-valued and measurable. By MCT
$$\int_X f\,d\mu = \lim_{k\to\infty} \int_X S_k\,d\mu. \tag{1.17}$$
Now, by (a simple extension of) Proposition 1.5, we have
$$\int_X S_k\,d\mu = \sum_{n=1}^{k} \int_X f_n\,d\mu,$$
and consequently,
$$\lim_{k\to\infty} \int_X S_k\,d\mu = \lim_{k\to\infty} \sum_{n=1}^{k} \int_X f_n\,d\mu = \sum_{n=1}^{\infty} \int_X f_n\,d\mu.$$
Whence, replacing this expression in the right-hand side of (1.17), (1.16) follows. ∎

An interesting consequence of Theorem 1.6 is
Proposition 1.7. Suppose $(X,\mathcal{M},\mu)$ is a measure space and let $f$ be a nonnegative extended real-valued measurable function defined on $X$. Then the set function
$$\nu(E) = \int_E f\,d\mu, \qquad E \in \mathcal{M},$$
is a measure on $(X,\mathcal{M})$.

Proof. That $\nu$ is nonnegative and $\nu(\emptyset) = 0$ is clear. As for the $\sigma$-additivity, let $\{E_n\}$ be a sequence of pairwise disjoint measurable sets, and put $f_n = f\chi_{E_n}$, $n = 1, 2, \ldots$ The sequence $\{f_n\}$ satisfies the hypothesis of Theorem 1.6, and consequently, by (1.16) we have
$$\sum_n \nu(E_n) = \sum_n \int_X f_n\,d\mu = \int_X \sum_n f_n\,d\mu = \int_X f \sum_n \chi_{E_n}\,d\mu = \int_X f\chi_{\bigcup E_n}\,d\mu = \nu\Big(\bigcup_n E_n\Big).$$
In other words, $\nu$ is $\sigma$-additive. ∎
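A small sanity check of Proposition 1.7 may be helpful. The sketch below is only an illustration under assumptions not made in the text: $X$ is the set of nonnegative integers, $\mu$ is the counting measure, so that $\int_E f\,d\mu = \sum_{k \in E} f(k)$, and $f(k) = 2^{-k}$ is an arbitrary choice.

```python
from fractions import Fraction

def f(k):
    """A nonnegative function on X = {0, 1, 2, ...}; here f(k) = 2^(-k)."""
    return Fraction(1, 2 ** k)

def nu(E):
    """nu(E) = integral of f over E with respect to the counting measure."""
    return sum((f(k) for k in E), Fraction(0))

# Three pairwise disjoint sets and their union.
E1, E2, E3 = {0, 2, 4}, {1, 3}, {5}
union = E1 | E2 | E3

print(nu(union), nu(E1) + nu(E2) + nu(E3))   # additivity over disjoint sets
print(nu(set()))                             # nu of the empty set
```

Only finitely many sets are used here, so this checks finite additivity; the content of the proposition is that the same identity persists for countably many sets.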
It is natural to consider whether Theorem 1.3 can be extended to include more general classes of functions. First we discuss a result due to Fatou (1878-1929); since its statement involves the liminf it applies to arbitrary sequences.
Theorem 1.8 (Fatou's Lemma). Suppose $(X,\mathcal{M},\mu)$ is a measure space and let $\{f_n\}$ be a sequence of nonnegative extended real-valued measurable functions defined on $X$. Then
$$\int_X \liminf f_n\,d\mu \le \liminf \int_X f_n\,d\mu. \tag{1.18}$$

Proof. The idea of the proof is to invoke MCT at the appropriate spot. Let
$$g_k = \inf_{n \ge k} f_n, \qquad k = 1, 2, \ldots$$
The sequence $\{g_k\}$ satisfies the following properties:
(i) The $g_k$ are nonnegative extended real-valued measurable functions.
(ii) The sequence is nondecreasing.
(iii) $g_k \le f_n$, all $n \ge k \ge 1$.
(iv) $\lim_{k\to\infty} g_k = \sup_{k \ge 1} (\inf_{n \ge k} f_n) = \liminf f_n$.
By (iv) and MCT we have
$$\lim_{k\to\infty} \int_X g_k\,d\mu = \int_X \lim_{k\to\infty} g_k\,d\mu = \int_X \liminf f_n\,d\mu. \tag{1.19}$$
On the other hand, by (iii) and monotonicity we get
$$\int_X g_k\,d\mu \le \int_X f_n\,d\mu, \qquad \text{all } n \ge k,$$
and consequently for each fixed $k$ we have
$$\int_X g_k\,d\mu \le \liminf \int_X f_n\,d\mu. \tag{1.20}$$
Combining now (1.19) and (1.20) it readily follows that (1.18) holds, and we have finished. ∎

It is not hard to see that strict inequality may occur in Fatou's Lemma. Indeed, for the Lebesgue measure on $[0,1]$ and the sequence $\{f_n\}$ given by $f_n(x) = n\chi_{(0,1/n)}(x)$, $n = 1, 2, \ldots$, we have $\liminf f_n(x) = 0$ for all $x$, and consequently, the left-hand side of (1.18) is $0$, but $\int_{[0,1]} f_n\,d\mu = 1$ for all $n$, and the right-hand side there is $1$. Fatou's Lemma is very important in applications; Fatou discovered it while investigating the convergence properties of the Poisson integral, a problem that lies at the heart of Harmonic Analysis. A related result will be discussed in Theorem 3.1 in Chapter XVII.
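The example $f_n = n\chi_{(0,1/n)}$ can be examined concretely. The following is a rough sketch under the stated setting (Lebesgue measure on $[0,1]$); the helper names are hypothetical, and the integral of $f_n$ is computed from the formula height times length rather than numerically.

```python
from fractions import Fraction
import math

def f(n, x):
    """f_n = n * chi_{(0, 1/n)} evaluated at a point x of [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Exact Lebesgue integral of f_n over [0,1]: height n times length 1/n."""
    return n * Fraction(1, n)

def vanishing_index(x):
    """An index N such that f_n(x) = 0 for every n >= N, so liminf f_n(x) = 0."""
    return 1 if x <= 0 else math.ceil(1 / x)

for x in [Fraction(0), Fraction(1, 1000), Fraction(3, 10), Fraction(1)]:
    N = vanishing_index(x)
    # From index N on, the sequence f_n(x) is identically zero.
    print(x, N, [f(n, x) for n in range(N, N + 3)])

print([integral_f(n) for n in range(1, 6)])   # the integrals do not tend to 0
```

Each point is eventually left behind by the shrinking intervals $(0,1/n)$, while the mass of $f_n$ stays constant; this is the mechanism behind the strict inequality in (1.18).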
2. THE INTEGRAL OF ARBITRARY FUNCTIONS

As interesting a result as Theorem 1.8 is, it still does not address the question of interchanging limits with integrals. We discuss this general question in the context of integrals of functions of arbitrary sign.

Let $(X,\mathcal{M},\mu)$ be a measure space, and let $f$ be an extended real-valued measurable function defined on $X$; we can write $f = f^+ - f^-$ as the difference of two nonnegative functions. In particular, both of the integrals $\int_X f^+\,d\mu$ and $\int_X f^-\,d\mu$ exist, and if either of these integrals is finite we define the integral $\int_X f\,d\mu$ of $f$ over $X$ with respect to $\mu$ as
$$\int_X f\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu. \tag{2.1}$$
Observe that if $f$ and $g$ are measurable and $f = g$ $\mu$-a.e., then $f^+ = g^+$, $f^- = g^-$ $\mu$-a.e. and consequently, by (i) in Theorem 1.2, the integral of $f$ over $X$ with respect to $\mu$ exists iff that of $g$ exists, and in this case they are equal. Some of the properties of the integral are readily obtained, for instance (2.2) and (2.3) Indeed, (2.2) follows at once from (2.1), and (2.3) is a simple consequence of this observation: For $\lambda > 0$ we have $(\lambda f)^+ = \lambda f^+$ and $(\lambda f)^- = \lambda f^-$, and for $\lambda < 0$ we have $(\lambda f)^+ = -\lambda f^-$ and $(\lambda f)^- = -\lambda f^+$.

An interesting theory may be developed for those functions for which both of the integrals on the right-hand side of (2.1) are finite. This class of functions is denoted by $L^1(X,\mu)$, or plainly $L(X,\mu)$, $L(X)$, or $L(\mu)$, and it is called the Lebesgue class $L^1$; the functions $f$ in $L^1$ are said to be integrable. Note that since
$$|f| = f^+ + f^- \tag{2.4}$$
and for functions $f$ in $L(\mu)$ the expression $\int_X (f^+ + f^-)\,d\mu = \int_X |f|\,d\mu$ is finite, it is built into the definition of $L(\mu)$ that $f \in L(\mu)$ iff the integral
$\int_X f\,d\mu$ is defined and $\left|\int_X f\,d\mu\right| < \infty$. On the other hand, as simple examples show, it is possible for $|f|$ to be integrable, and yet for $f$ not to be measurable. Thus the measurability of $f$ is essential in the definition of $L(\mu)$. The following estimate, in the spirit of (3.1) in Chapter III, is quite useful.

Proposition 2.1. Let $(X,\mathcal{M},\mu)$ be a measure space, and let $f$ be an extended real-valued measurable function defined on $X$ for which the integral over $X$ with respect to $\mu$ is defined. Then we have
$$\left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu. \tag{2.5}$$

Proof. If the right-hand side of (2.5) is infinite there is nothing to prove. On the other hand, if $f \in L(\mu)$, then by (2.4) and Proposition 1.5 the integrals of $f^+$ and $f^-$ over $X$ with respect to $\mu$ are also finite and we have
$$\left|\int_X f\,d\mu\right| = \left|\int_X f^+\,d\mu - \int_X f^-\,d\mu\right| \le \int_X f^+\,d\mu + \int_X f^-\,d\mu = \int_X (f^+ + f^-)\,d\mu = \int_X |f|\,d\mu.$$
∎
When $f$ is integrable, (2.5) may be interpreted as a statement concerning its "size." A more precise estimate was proved by Chebychev (1821-1894); the result is known as Chebychev's inequality.

Theorem 2.2. Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $f$ is an extended real-valued measurable function defined on $X$. Then for any real $\lambda > 0$ we have
$$\lambda\,\mu(\{|f| > \lambda\}) \le \int_X |f|\,d\mu. \tag{2.6}$$

Proof. Let $A_\lambda = \{|f| > \lambda\}$; $A_\lambda$ is measurable and $\lambda\chi_{A_\lambda} \le |f|$. Thus, by Theorem 1.2 (iii), it readily follows that
$$\lambda\,\mu(A_\lambda) = \int_X \lambda\chi_{A_\lambda}\,d\mu \le \int_X |f|\,d\mu.$$
∎
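Chebychev's inequality (2.6) is easy to test on a finite measure space. The sketch below is an illustration only: it assumes $X$ has eight points, each of measure $1/8$, and the values of $f$ are made up for the example.

```python
from fractions import Fraction

# mu: eight points, each of measure 1/8 (a hypothetical finite measure space).
weights = [Fraction(1, 8)] * 8
f = [-5, 0, 1, 2, -2, 7, 3, -1]           # values of f at the eight points

def measure_above(lam):
    """mu({|f| > lam}) for the finite measure space above."""
    return sum(w for w, v in zip(weights, f) if abs(v) > lam)

# Integral of |f| with respect to mu.
integral_abs = sum(w * abs(v) for w, v in zip(weights, f))

for lam in [Fraction(1, 2), 1, 2, 3, 5, 7]:
    lhs = lam * measure_above(lam)
    print(lam, lhs, integral_abs, lhs <= integral_abs)
```

The left-hand side of (2.6) depends on $\lambda$ while the right-hand side does not, which is why the estimate forces $\mu(\{|f| > \lambda\})$ to decay at least like $1/\lambda$.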
Corollary 2.3. Let $(X,\mathcal{M},\mu)$ be a measure space and let $f \in L(\mu)$. Then $f$ is finite $\mu$-a.e. Further, if $f$ is nonnegative and $\int_X f\,d\mu = 0$, then $f = 0$ $\mu$-a.e.

Proof. By Chebychev's inequality we have
$$\mu(\{|f| > n\}) \le \frac{1}{n} \int_X |f|\,d\mu \to 0 \qquad \text{as } n \to \infty.$$
Whence
$$\mu(\{|f| = \infty\}) = \lim_{n\to\infty} \mu(\{|f| > n\}) = 0,$$
and $f$ is finite $\mu$-a.e. Moreover, if $f$ is nonnegative and its integral over $X$ vanishes, then, also by Chebychev's inequality, $\mu(\{f > \lambda\}) = 0$ for all $\lambda > 0$, and $f$ vanishes $\mu$-a.e. ∎

How does the integral behave with respect to addition?

Proposition 2.4. Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $\lambda$ is a real number and $f, g \in L(\mu)$. Then $f + \lambda g$ is integrable and
$$\int_X (f + \lambda g)\,d\mu = \int_X f\,d\mu + \lambda \int_X g\,d\mu. \tag{2.7}$$
Proof. The integrability of $f + \lambda g$ follows at once from the estimate $|f + \lambda g| \le |f| + |\lambda||g|$. Now, since $h = f + \lambda g$ is integrable, the integral of $h$ over $X$ with respect to $\mu$ is a well-defined finite number. Furthermore, we have
$$h^+ - h^- = f^+ - f^- + (\lambda g)^+ - (\lambda g)^-,$$
and consequently, also
$$h^+ + f^- + (\lambda g)^- = h^- + f^+ + (\lambda g)^+. \tag{2.8}$$
All the summands in (2.8) are nonnegative, and by Corollary 2.3 finite $\mu$-a.e. By Proposition 1.5 then, it readily follows that
$$\int_X h^+\,d\mu + \int_X f^-\,d\mu + \int_X (\lambda g)^-\,d\mu = \int_X h^-\,d\mu + \int_X f^+\,d\mu + \int_X (\lambda g)^+\,d\mu,$$
and since all the integrals are finite we may move them freely and obtain
$$\int_X h^+\,d\mu - \int_X h^-\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu + \int_X (\lambda g)^+\,d\mu - \int_X (\lambda g)^-\,d\mu.$$
Thus, by (2.1) and (2.3), (2.7) holds. ∎
The following variant of Proposition 2.4 is important in applications since it allows us to consider arbitrary functions for which the integral is defined.
Proposition 2.5. Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $f, g$ are extended real-valued measurable functions defined on $X$ which satisfy the following conditions: The integral of $f$ over $X$ with respect to $\mu$ is defined, and $g$ is integrable. Then the integral of $f + g$ over $X$ with respect to $\mu$ is defined and
$$\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \tag{2.9}$$

Proof. By Proposition 2.4, (2.9) is only novel when $f$ is not integrable. If this is the case one of the quantities, $\int_X f^+\,d\mu$ or $\int_X f^-\,d\mu$, is infinite and the other one is finite. To fix ideas suppose that $\int_X f^-\,d\mu = \infty$, and observe that with the notation of Proposition 2.4, recalling that $h = f + g$ and setting $\lambda = 1$ in (2.8), we have
$$h^+ + f^- + g^- = h^- + f^+ + g^+. \tag{2.10}$$
Since $\int_X f^+\,d\mu, \int_X g^+\,d\mu < \infty$, from (2.10) it follows that $\int_X h^-\,d\mu = \infty$. Furthermore, since $h^- = 0$ when $h^+ \ne 0$, by (2.10) we also get that $h^+ \le f^+ + g^+$, and consequently $\int_X h^+\,d\mu < \infty$. Thus the integral of $h$ over $X$ with respect to $\mu$ exists and it equals $-\infty$. Whence the left-hand side of (2.9) equals $-\infty$, and so does the right-hand side. Thus (2.9) holds, and we are done. ∎
We are now in a position to explore what Fatou's Lemma says in the general context of functions of arbitrary sign.

Theorem 2.6 (Fatou's Lemma). Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ which satisfy the following property: There exists an integrable function $g$ such that
$$g \le f_n, \qquad \text{all } n. \tag{2.11}$$
Then the integrals of $\liminf f_n$ and $f_n$ over $X$ with respect to $\mu$ exist, $n = 1, 2, \ldots$, and we have
$$\int_X \liminf f_n\,d\mu \le \liminf \int_X f_n\,d\mu. \tag{2.12}$$
Proof. Since by (2.11) $f_n - g \ge 0$, the integral of the functions $f_n - g$ over $X$ with respect to $\mu$ is well-defined for $n = 1, 2, \ldots$; similarly for $\liminf f_n - g$. Now, by Fatou's Lemma for nonnegative functions we have
$$\int_X \liminf (f_n - g)\,d\mu = \int_X (\liminf f_n - g)\,d\mu \le \liminf \int_X (f_n - g)\,d\mu. \tag{2.13}$$
First we consider the left-hand side of (2.13). By Proposition 2.5 with $f = \liminf f_n - g$ and $g = g$ there, we get that $f + g = \liminf f_n$ has a well-defined integral over $X$ with respect to $\mu$ which satisfies
$$\int_X \liminf f_n\,d\mu = \int_X (\liminf f_n - g)\,d\mu + \int_X g\,d\mu. \tag{2.14}$$
Since $g$ is integrable, by (2.14) it readily follows that the left-hand side of (2.13) equals
$$\int_X \liminf f_n\,d\mu - \int_X g\,d\mu. \tag{2.15}$$
A similar argument gives that the integral of $f_n$ over $X$ with respect to $\mu$ exists, $n = 1, 2, \ldots$, and that the integral that appears on the right-hand side of (2.13) is equal to
$$\int_X f_n\,d\mu - \int_X g\,d\mu, \qquad n = 1, 2, \ldots \tag{2.16}$$
Thus combining (2.15) and (2.16) we may rewrite (2.13) as
$$\int_X \liminf f_n\,d\mu - \int_X g\,d\mu \le \liminf \int_X f_n\,d\mu - \int_X g\,d\mu.$$
Since $g$ is integrable we may now cancel $\int_X g\,d\mu$ in the above inequality. Whence (2.12) holds, and we have finished. ∎

There is a version of Theorem 2.6 with the inequality (2.12) reversed, but with the $\liminf$ replaced by the $\limsup$.

Corollary 2.7. Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$
which have the following property: There exists an integrable function $g$ such that
$$f_n \le g, \qquad \text{all } n. \tag{2.17}$$
Then the integrals of $\limsup f_n$ and $f_n$ over $X$ with respect to $\mu$ exist, $n = 1, 2, \ldots$, and we have
$$\limsup \int_X f_n\,d\mu \le \int_X \limsup f_n\,d\mu. \tag{2.18}$$

Proof. Observe that (2.17) is equivalent to
$$-g \le -f_n, \qquad \text{all } n,$$
and that $g$ is integrable iff $-g$ is integrable, in other words, the hypotheses of Fatou's Lemma are satisfied by $\{-f_n\}$ and $-g$. Since $\liminf (-f_n) = -\limsup f_n$, by Theorem 2.6 we have
$$-\int_X \limsup f_n\,d\mu = \int_X \liminf (-f_n)\,d\mu \le \liminf \int_X (-f_n)\,d\mu = \liminf \left(-\int_X f_n\,d\mu\right) = -\limsup \int_X f_n\,d\mu. \tag{2.19}$$
Now, (2.19), whether involving finite quantities or not, is equivalent to (2.18), and we have finished. ∎

Some remarks concerning these results: Clearly we may assume that (2.11) and (2.17) hold $\mu$-a.e. and obtain the same conclusion; also strict inequality may occur in estimates (2.12) and (2.18). For instance, for the Lebesgue measure on $I = [0,1]$ and the sequence $f_n = \chi_{[0,3/4)}$, $n$ odd, and $f_n = \chi_{[3/4,1]}$, $n$ even, we have
$$\int_I \liminf f_n\,d\mu = 0 < \liminf \int_I f_n\,d\mu = 1/4$$
and
$$\limsup \int_I f_n\,d\mu = 3/4 < \int_I \limsup f_n\,d\mu = 1.$$
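The numbers in this example can be verified directly. Since the sequence alternates between two functions, the $\liminf$ and $\limsup$ in $n$ are simply the minimum and maximum over one odd and one even index. The sketch below assumes exactly the example above and uses hypothetical helper names.

```python
from fractions import Fraction

THREE_QUARTERS = Fraction(3, 4)

def f(n, x):
    """f_n = chi_[0,3/4) for n odd, chi_[3/4,1] for n even, at x in [0,1]."""
    if n % 2 == 1:
        return 1 if 0 <= x < THREE_QUARTERS else 0
    return 1 if THREE_QUARTERS <= x <= 1 else 0

def integral_f(n):
    """Lebesgue integral of f_n over [0,1]: the length of the interval."""
    return THREE_QUARTERS if n % 2 == 1 else 1 - THREE_QUARTERS

# The sequence has period 2 in n, so liminf = min and limsup = max over n = 1, 2.
liminf_integrals = min(integral_f(1), integral_f(2))
limsup_integrals = max(integral_f(1), integral_f(2))

grid = [Fraction(j, 100) for j in range(101)]
pointwise_liminf = {min(f(1, x), f(2, x)) for x in grid}
pointwise_limsup = {max(f(1, x), f(2, x)) for x in grid}

print(liminf_integrals, limsup_integrals)
print(pointwise_liminf, pointwise_limsup)
```

On the sample grid the pointwise $\liminf$ is identically $0$ and the pointwise $\limsup$ is identically $1$, so their integrals over $[0,1]$ are $0$ and $1$, to be compared with $1/4$ and $3/4$.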
We close this section with the Lebesgue dominated convergence theorem, or LDCT, which describes under what conditions we may pass to the limit under the integral sign.
Theorem 2.8 (LDCT). Let $(X,\mathcal{M},\mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ such that
(i) $\lim_{n\to\infty} f_n = f$ exists $\mu$-a.e.
(ii) There is an integrable function $g$ so that for each $n$, $|f_n| \le g$ $\mu$-a.e.
Then $f$ is integrable and
$$\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu. \tag{2.20}$$

Proof. By (i) it readily follows that $f$ is measurable and by (ii) that $|f| \le g$ $\mu$-a.e., thus $f$ is also integrable. As for the $f_n$'s, by (ii) they are integrable and for any $n$ we have
$$-g \le f_n \le g \qquad \mu\text{-a.e.}$$
So, Fatou's Lemma and its corollary apply, and since $\limsup f_n = f = \liminf f_n$ $\mu$-a.e., it readily follows that
$$\int_X f\,d\mu \le \liminf \int_X f_n\,d\mu \le \limsup \int_X f_n\,d\mu \le \int_X f\,d\mu.$$
Whence, all four quantities in the above inequality are equal (and finite), and (2.20) holds. ∎

By the way, the example following Theorem 1.8 shows that, in the absence of an integrable majorant, the conclusion of LDCT may fail.
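To contrast the two situations, the sketch below looks at two sequences on $[0,1]$ with the Lebesgue measure. The first, $f_n(x) = x^n$, is an example chosen here (it is not in the text); it is dominated by the integrable function $g = 1$ and converges to $0$ a.e. The second is the sequence $n\chi_{(0,1/n)}$ of the example following Theorem 1.8, whose smallest majorant $\sup_n n\chi_{(0,1/n)}$ equals $k$ on $[1/(k+1), 1/k)$. Function names are hypothetical.

```python
from fractions import Fraction

def integral_power(n):
    """Exact integral of f_n(x) = x^n over [0,1], namely 1/(n+1)."""
    return Fraction(1, n + 1)

def integral_sup_partial(K):
    """Integral over [1/(K+1), 1) of sup_n n*chi_(0,1/n): the sup equals k on
    [1/(k+1), 1/k), an interval of length 1/k - 1/(k+1)."""
    return sum(k * (Fraction(1, k) - Fraction(1, k + 1)) for k in range(1, K + 1))

# Dominated case: the integrals tend to 0, the integral of the a.e. limit.
print([integral_power(n) for n in (1, 10, 100, 1000)])

# Undominated case: the partial integrals of the smallest majorant keep growing.
print([float(integral_sup_partial(K)) for K in (10, 100, 1000)])
```

The partial integrals in the second case are partial sums of the harmonic series minus $1$, so the smallest majorant is not integrable, which is consistent with the failure of (2.20) for that sequence.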
3. RIEMANN AND LEBESGUE INTEGRALS

Suppose $g$ is a Riemann integrable function over an interval $I$. Does the integral of $g$ over $I$ with respect to the Lebesgue measure exist? If so, do both integrals coincide? In other words, we would like to know whether the notion of Lebesgue integral extends that of Riemann integral. First some notations. Lebesgue measurable functions will be called measurable functions, Lebesgue integrable functions will be called integrable, Lebesgue a.e. will be denoted plainly by a.e. and the integral of $g$ over $I$ with respect to the Lebesgue measure is denoted by $\int_I g\,dx$. We then have
Theorem 3.1. Let $g$ be a bounded real-valued function defined on $I = [a,b]$ and suppose that $g \in \mathcal{R}(I)$. Then $g \in L(I)$ and
\[ \int_a^b g(x) \, dx = \int_I g \, dx. \tag{3.1} \]
Proof. Let $\mathcal{P}_n = \{a = x_{1,n} < \cdots < x_{k_n,n} = b\}$, $n = 1, 2, \ldots$, be a sequence of finite partitions of $I$ such that $\mathcal{P}_{n+1}$ is a refinement of $\mathcal{P}_n$, $n \ge 1$, and so that the norm of $\mathcal{P}_n \to 0$ as $n \to \infty$. If $I_{k,n} = [x_{k,n}, x_{k+1,n}]$, $1 \le k \le k_n - 1$, are the intervals induced by $\mathcal{P}_n$, and
\[ m_{k,n} = \inf_{I_{k,n}} g, \qquad M_{k,n} = \sup_{I_{k,n}} g, \]
note that the functions
\[ L_n(x) = \sum_k m_{k,n} \chi_{I_{k,n}}(x) \quad \text{and} \quad U_n(x) = \sum_k M_{k,n} \chi_{I_{k,n}}(x) \]
are bounded and measurable, and hence integrable over $I$. Now, the sequence $\{L_n\}$ is nondecreasing and the sequence $\{U_n\}$ is nonincreasing, and consequently, the limits
\[ L(x) = \lim_{n\to\infty} L_n(x) \quad \text{and} \quad U(x) = \lim_{n\to\infty} U_n(x) \tag{3.2} \]
exist and are finite everywhere on $I$. Furthermore, $L$ and $U$ are measurable, and
\[ L(x) \le g(x) \le U(x), \quad x \in I. \tag{3.3} \]
Next observe that for $n = 1, 2, \ldots$ we have
\[ \int_I L_n \, dx = \sum_k m_{k,n} |I_{k,n}| = s(g, \mathcal{P}_n) \tag{3.4} \]
and
\[ \int_I U_n \, dx = \sum_k M_{k,n} |I_{k,n}| = S(g, \mathcal{P}_n). \tag{3.5} \]
Since $g$ is bounded the $L_n$'s and the $U_n$'s are uniformly bounded on $I$, and consequently, by LDCT we get that
\[ \int_I L \, dx = \lim_{n\to\infty} \int_I L_n \, dx, \qquad \int_I U \, dx = \lim_{n\to\infty} \int_I U_n \, dx. \tag{3.6} \]
Moreover, since $g \in \mathcal{R}(I)$, by (3.4), (3.5) and (3.6) we have
\[ \int_I L \, dx = \int_I U \, dx = \int_a^b g(x) \, dx. \tag{3.7} \]
From (3.7) it readily follows that the integral of the nonnegative function $U - L$ over $I$ vanishes, and by Corollary 2.3 we have that $U = L$ a.e. Hence, by (3.3) we obtain that $g = U$ a.e., and since the Lebesgue measure is complete, by Theorem 1.4 in Chapter VI we get that $g$ is also measurable. By (3.3) it now follows that (3.1) holds, and we have finished. •

The converse to Theorem 3.1 is false, to wit, there are bounded integrable functions $g$ defined on $I$ which are not Riemann-integrable; the characteristic function of the rationals in $I$ will do. The notion of Riemann-integrability incorporates unbounded functions by means of the so-called "improper" convergence methods. For instance, suppose $g$ is unbounded on $I$, but it satisfies the following properties: (i) For each $0 < \varepsilon < b - a$, $g \in \mathcal{R}([a + \varepsilon, b])$. (ii) $\lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^b g(x) \, dx = \int_{a+}^b g(x) \, dx$ exists.
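Of course there are similar definitions for $\int_a^{b-} g(x) \, dx$ and $\int_{a+}^{b-} g(x) \, dx$. For instance, the function $g(x) = x^{-1/2}$ is unbounded on $(0,1]$, but
\[ \int_{0+}^1 x^{-1/2} \, dx = \lim_{\varepsilon \to 0^+} 2(1 - \varepsilon^{1/2}) = 2. \]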
Functions for which the improper Riemann integral exists are also integrable, as our next result shows.

Theorem 3.2. Suppose that the nonnegative function $g$ is defined and finite on $I = (a,b]$ and that $\int_{a+}^b g(x) \, dx$ exists. Then $g \in L([a,b])$ and
\[ \int_I g \, dx = \int_{a+}^b g(x) \, dx. \]
Proof. Let $0 < \varepsilon_n < b - a$ be a sequence which tends to 0 as $n \to \infty$. By Theorem 3.1 the functions $g_n = g \chi_{[a+\varepsilon_n, b]}$, $n = 1, 2, \ldots$, are integrable on $I$ and
\[ \int_I g_n \, dx = \int_{a+\varepsilon_n}^b g(x) \, dx. \]
Moreover, since the $g_n$'s increase to $g$ on $I$ (the value of $g$ at $a$ is irrelevant) by MCT it readily follows that
\[ \int_{a+}^b g(x) \, dx = \lim_{n\to\infty} \int_I g_n \, dx = \int_I g \, dx. \]
•
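The monotone approximation used in this proof can be traced on the function $g(x) = x^{-1/2}$ on $(0,1]$. The following is a minimal sketch, assuming the choice $\varepsilon_n = 10^{-n}$ (an illustrative choice, not one made in the text) and using the elementary antiderivative $2x^{1/2}$ to evaluate each truncated integral.

```python
import math

def truncated_integral(eps):
    """Integral of x**(-1/2) over [eps, 1], evaluated with the antiderivative 2*sqrt(x)."""
    return 2.0 * (1.0 - math.sqrt(eps))

# eps_n decreases to 0, so g_n = g * chi_[eps_n, 1] increases to g on (0, 1].
eps_values = [10.0 ** (-n) for n in range(1, 8)]
values = [truncated_integral(eps) for eps in eps_values]

for eps, value in zip(eps_values, values):
    print(f"eps = {eps:.0e}   integral of g_n = {value:.6f}")
```

The printed values increase with $n$ and stay below 2, the value of the improper integral, as the monotone convergence argument predicts.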
A similar result holds for the other improper integrals. On the other hand, it is also possible to consider the Riemann integral of a function $g$ defined on $R$ by simply letting
\[ \int_{-\infty}^{\infty} g(x) \, dx = \lim_{N\to\infty} \int_{-N}^{N} g(x) \, dx, \]
whenever the limit exists. In this case the function $g(x) = \sin x / x$ is Riemann integrable over $R$, but not integrable there since, as is readily seen, $\int_R |g(x)| \, dx = \infty$.

With the aid of the Lebesgue measure we are also able to identify those functions which are Riemann integrable on finite intervals.

Theorem 3.3. Suppose $g$ is a real-valued bounded function defined on $I = [a,b]$. Then $g \in \mathcal{R}(I)$ iff $g$ is continuous a.e. on $I$.

Proof. First assume $g \in \mathcal{R}(I)$, and fix a sequence of partitions $\{\mathcal{P}_n\}$ of $I$ as in Theorem 3.1. If $L$ and $U$ are defined by (3.2), let
\[ N = \{x \in I : L(x) < g(x) \text{ or } g(x) < U(x)\}. \]
In Theorem 3.1 we actually proved that $|N| = 0$. Now let $N'$ be the set consisting of all those points in $I$ which belong to some $\mathcal{P}_n$; $N'$ is countable and hence null. We claim that $g$ is continuous off the null set $N \cup N'$; we prove this by contradiction. Given an interval $J \subseteq I$, consider the oscillation $\mathrm{osc}(g, J)$ of $g$ over $J$ given by
\[ \mathrm{osc}(g, J) = \sup_J g - \inf_J g. \]
Now, if $g$ is not continuous at $x \notin N \cup N'$, there exists $\varepsilon > 0$, such that the oscillation of $g$ on any interval containing $x$ is at least $\varepsilon$. Since $x \notin N'$, $x$ is an interior point to one of the intervals of each $\mathcal{P}_n$, and consequently,
\[ U_n(x) - L_n(x) \ge \varepsilon \quad \text{for all } n. \]
From (3.2) it follows that if this is the case, then $U(x) - L(x) \ge \varepsilon$, and $x \in N$, which is the desired contradiction.
Conversely, suppose $\mathcal{P}_n$ is an arbitrary sequence of partitions of $I$ such that $\lim_{n\to\infty} (\text{norm } \mathcal{P}_n) = 0$ and observe that by assumption,
\[ \lim_{n\to\infty} L_n(x) = \lim_{n\to\infty} U_n(x) = g(x) \quad \text{a.e.} \]
By (3.4), (3.5) and LDCT we get that
\[ \lim_{n\to\infty} s(g, \mathcal{P}_n) = \lim_{n\to\infty} S(g, \mathcal{P}_n) = \int_I g \, dx. \]
Whence, by Proposition 2.1 in Chapter III, $g \in \mathcal{R}(I)$. •
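The converse direction can be watched numerically on a function with a single discontinuity. The sketch below assumes the illustrative function $g = \chi_{[1/3,1]}$ on $I = [0,1]$ and uniform partitions into $n$ closed cells (both are choices made here, not in the text); it computes the lower and upper sums $s(g, \mathcal{P}_n)$ and $S(g, \mathcal{P}_n)$ exactly with rational arithmetic.

```python
from fractions import Fraction

JUMP = Fraction(1, 3)  # g = 0 on [0, 1/3) and g = 1 on [1/3, 1]

def darboux_sums(n):
    """Lower and upper sums of g over the uniform partition of [0, 1] into n cells."""
    width = Fraction(1, n)
    lower = Fraction(0)
    upper = Fraction(0)
    for k in range(n):
        left, right = k * width, (k + 1) * width
        inf_g = 1 if left >= JUMP else 0   # g is 1 on the whole closed cell only if left >= 1/3
        sup_g = 1 if right >= JUMP else 0  # g reaches 1 somewhere on the cell if right >= 1/3
        lower += inf_g * width
        upper += sup_g * width
    return lower, upper

for n in (3, 10, 100, 1000):
    s, S = darboux_sums(n)
    print(n, s, S, S - s)
```

Exactly one cell straddles the jump, so $S - s = 1/n$, and both sums squeeze the Lebesgue integral $2/3$ as the norm of the partition tends to 0.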
4. PROBLEMS AND QUESTIONS

The first four problems describe how the concept of integral was viewed by different mathematicians. For simplicity we assume that $\mu$ is the Lebesgue measure defined on $(X, \mathcal{L})$, where $X$ is a compact interval of $R^n$, and that $f$ is a nonnegative measurable function defined on $X$.

4.1 Lebesgue defined the integral of a bounded measurable function as follows: If $0 \le f \le M$ a.e. on $X$, put
\[ \int_X f \, dx = \lim_{m\to\infty} \sum_{k=1}^{mM} \frac{k}{m} \, |\{k/m \le f < (k+1)/m\}|. \]
Show that this definition coincides with the one given in the text.

4.2 de la Vallée-Poussin (1866-1962) extended the definition of Lebesgue to include unbounded functions $f$ as follows: Let $f_m = A(f, m)$ be the truncation of $f$ at level $m$. $\{f_m\}$ is a sequence of bounded measurable functions, cf. 4.5 in Chapter VI, and we put
\[ \int_X f \, dx = \lim_{m\to\infty} \int_X f_m \, dx. \]
Show that this definition is equivalent to the one given in the text.

4.3 Saks (1897-1942) defines the integral in a manner reminiscent of the Riemann-Stieltjes integral. More precisely, let $\mathcal{P} = \{E_1, \ldots, E_k\}$ be a measurable partition of $X$, let $m_k = \inf_{E_k} f$, and put
Show that this definition is also equivalent to the one given in the text.

4.4 Finally, there is the notion of integral as the "area under the graph." Carathéodory defines the integral of a bounded measurable function $f$ as follows: Let $A(f) = \{(x,y) \in X \times R : 0 \le y \le f(x)\}$; show that $A(f)$ is a Lebesgue measurable subset of $R^{n+1}$ and put
\[ \int_X f \, dx = |A(f)|. \tag{4.1} \]
Show that (4.1) is equivalent to the definition given in the text. It is also interesting to interpret results such as MCT in terms of (4.1), as their meaning is quite apparent.

4.5 True or false: If $f$ is a nonnegative function defined on $R$ and $\int_R f \, dx < \infty$, then $\lim_{|x|\to\infty} f(x) = 0$.

4.6 Suppose $f$ is integrable on $R^n$ and for a fixed $h \in R^n$ let $g(x) = f(x + h)$ be a translate of $f$. Show that $g$ is also integrable and that
\[ \int_{R^n} g \, dx = \int_{R^n} f \, dx. \tag{4.2} \]
(4.2) is a restatement of the translation-invariance of the Lebesgue measure.

4.7 Let $(X,\mathcal{M},\mu)$ be a measure space and $\{f_n\}$ a sequence of measurable functions such that $\sum_n \int_X |f_n| \, d\mu < \infty$. Show that $\sum_n f_n$ converges absolutely $\mu$-a.e. and $\int_X \left(\sum_n f_n\right) d\mu = \sum_n \int_X f_n \, d\mu$. In particular, also $\lim_{n\to\infty} f_n = 0$ $\mu$-a.e.

4.8 Let $r_1, r_2, \ldots, r_n, \ldots$ be an enumeration of the rational numbers in $I = [0,1]$, and let
\[ f(x) = \sum_{\{n : x > r_n\}} 2^{-n}. \]
Compute IIf(x)dx. 4.9 Prove that the sum L:~=o ito,1r](1 - "'sin x)n cos x dx converges to a finite limit, and find its value. 4.10 Let f be a nonnegative measurable function defined on R. Prove that if L:~=-oo f( x + n) is integrable, then f = 0 a.e. On the other hand, if I is integrable, then 4>( x) = L:~=-oo 1(2nx + 1/ n) is finite a.e. and integrable, and JR 4>( x ) dx = JR I( x ) dx.
4.11 Let $(X,\mathcal{M},\mu)$ be a measure space and $f \in L(\mu)$. Show that the set $\{f \ne 0\}$ is $\sigma$-finite, i.e., the at most countable union of sets of finite measure.

4.12 Referring to Proposition 1.7, decide when the measure $\nu$ introduced there is: (a) finite, and (b) $\sigma$-finite.

4.13 Referring to Proposition 1.7 again, suppose that $g$ is a nonnegative measurable function defined on $X$. Show that $\int_X g \, d\nu = \int_X g f \, d\mu$.
4.14 Suppose that the assumptions of 4.27 in Chapter IV hold and let I be a nonnegative real-valued measurable function defined on Y. Prove that the following "change of variable" formula holds:
4.15 Show that Fatou's Lemma is also true for functions that depend on a continuous parameter. More precisely, under the relevant assumptions, the following is true: lim infiEI Ii dJL ~ lim infiEI Ii dJL.
Ix
Ix
4.16 Prove the following variant of Fatou's Lemma: If $\{f_n\}$ is a sequence of nonnegative measurable functions which converges to $f$ $\mu$-a.e. and $\int_X f_n \, d\mu \le M < \infty$ for all $n$, then $f$ is integrable and $\int_X f \, d\mu \le M$.

4.17 Decide whether the following Fatou-like statements are true: (a) If $\{f_n\}$ is a sequence of nonnegative measurable functions and $f_n$ converges to $f$ in probability, then $\int_X f \, d\mu \le \liminf \int_X f_n \, d\mu$, and, (b) Same result with convergence in probability replaced by convergence in measure.

4.18 Show that the following extension of Fatou's Lemma is true: Rather than assuming $g \in L(\mu)$, we may assume that $\int_X g^- \, d\mu < \infty$ in Theorem 2.6, and that $\int_X g^+ \, d\mu < \infty$ in Corollary 2.7.
4.19 Describe the relation of 4.26 in Chapter IV to Fatou's Lemma. 4.20 Let (X,M,JL) be a measure space and {
$\phi_k$'s, call it $\phi$, so that
\[ \int_{R^n} (f - \phi) \, dx < \varepsilon/2. \]
We have thus reduced the problem at hand to one of approximating simple integrable functions by continuous functions in the metric of L(Rn).
VIII.
134
More About £1
Suppose $\phi = \sum_{k=1}^m c_k \chi_{E_k}$, where for each $k$, $c_k > 0$ and $E_k$ is a measurable set of finite measure, $1 \le k \le m$. It suffices now to approximate each summand that appears in the definition of $\phi$, or equivalently, the characteristic function of a measurable set $E$, say, of finite measure. By the regularity of the Lebesgue measure, for any $\eta > 0$, there is an open set $O$ of finite measure such that
\[ |O \setminus E| < \eta, \qquad O \supseteq E. \]
This estimate, in particular, implies
\[ \int_{R^n} |\chi_O - \chi_E| \, dx = |O \setminus E| < \eta, \]
and since $\eta$ above is arbitrary, we may assume that the set $E$ in question is actually open. It is at this point that the geometry of the situation plays a role. Let $O = \bigcup_{k=1}^\infty I_k$ be an open set of finite measure; here the $I_k$'s are nonoverlapping closed intervals. Since $|O| = \sum_{k=1}^\infty |I_k| < \infty$, it readily follows that
\[ \lim_{m\to\infty} \Big| O \setminus \bigcup_{k=1}^m I_k \Big| = 0, \]
and consequently it suffices to approximate the characteristic function of $\bigcup_{k=1}^m I_k$, all finite $m$. But then it is enough to consider $\chi_I$, where $I$ is a closed interval of $R^n$. Suppose first that $n = 1$ and $I = [0,1]$, and given $\eta > 0$, let $\psi_\eta$ be the continuous function
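\[ \psi_\eta(x) = \begin{cases} 0 & \text{if } x < -\eta \\ 1 + x/\eta & \text{if } -\eta \le x < 0 \\ 1 & \text{if } 0 \le x \le 1 \\ (\eta + 1 - x)/\eta & \text{if } 1 < x \le 1 + \eta \\ 0 & \text{if } x > 1 + \eta. \end{cases} \]
Then clearly $\psi_\eta \ge \chi_I$ and
\[ \int_R (\psi_\eta - \chi_I) \, dx = \eta \to 0 \text{ with } \eta. \]
Now, if $I = [a,b]$, the function $\psi_\eta((x - a)/(b - a))$ does the job.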
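The trapezoidal function $\psi_\eta$ and the identity $\int_R (\psi_\eta - \chi_I)\,dx = \eta$ can be checked directly. The sketch below uses exact rational arithmetic; the helper names `psi`, `chi_unit`, `l1_distance` and `psi_interval` are introduced here for illustration only. Since $\psi_\eta - \chi_I$ is linear on each of the two ramps and vanishes elsewhere, the midpoint rule on each ramp gives the integral exactly.

```python
from fractions import Fraction

def psi(x, eta):
    """The continuous trapezoid psi_eta approximating chi_[0,1]."""
    if x < -eta or x > 1 + eta:
        return Fraction(0)
    if x < 0:
        return 1 + x / eta
    if x <= 1:
        return Fraction(1)
    return (eta + 1 - x) / eta

def chi_unit(x):
    """Characteristic function of the closed interval [0, 1]."""
    return Fraction(1) if 0 <= x <= 1 else Fraction(0)

def l1_distance(eta):
    """Integral of psi_eta - chi_[0,1]; the midpoint rule is exact on each linear ramp."""
    left_mid = -eta / 2
    right_mid = 1 + eta / 2
    left = (psi(left_mid, eta) - chi_unit(left_mid)) * eta
    right = (psi(right_mid, eta) - chi_unit(right_mid)) * eta
    return left + right

def psi_interval(x, a, b, eta):
    """Rescaled version psi_eta((x - a) / (b - a)) for the interval [a, b]."""
    return psi((x - a) / Fraction(b - a), eta)

for eta in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
    print(eta, l1_distance(eta))
```

Each ramp contributes a triangle of area $\eta/2$, which is where the value $\eta$ comes from.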
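As for the general case, observe that if $I = [a_1, b_1] \times \cdots \times [a_n, b_n]$, then $\chi_I$ can be approximated in the metric of $L(R^n)$ as close as we want by the continuous function
\[ \psi_\eta\!\left(\frac{x_1 - a_1}{b_1 - a_1}\right) \times \cdots \times \psi_\eta\!\left(\frac{x_n - a_n}{b_n - a_n}\right), \quad \eta \text{ small}. \]
•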
Theorem 1.3 indicates that simple functions are dense in $L(\mu)$. As for the Lebesgue integral, one of its important applications concerns the continuity of the translates of integrable functions.

Proposition 1.4. Integrable functions are continuous in the metric of $L(R^n)$. More precisely, for any $f \in L(R^n)$ we have
\[ \lim_{|h|\to 0} \int_{R^n} |f(x + h) - f(x)| \, dx = 0. \]
Proof. We show that given $\varepsilon > 0$, there exists $\delta > 0$, such that
\[ \int_{R^n} |f(x + h) - f(x)| \, dx \le \varepsilon, \quad \text{whenever } |h| \le \delta. \tag{1.5} \]
This is not hard. First let $g \in C_0(R^n)$ be such that $d(f,g) \le \varepsilon/3$, and observe that $|f(x + h) - f(x)|$ may be estimated by
\[ |f(x + h) - g(x + h)| + |g(x + h) - g(x)| + |g(x) - f(x)|. \tag{1.6} \]
Whence, integrating (1.6) over $R^n$, we get that the integral on the left-hand side of (1.5) does not exceed
\[ \int_{R^n} |f(x+h) - g(x+h)| \, dx + \int_{R^n} |g(x+h) - g(x)| \, dx + \int_{R^n} |f(x) - g(x)| \, dx = A + B + C, \]
say. By 4.6 in Chapter VII we have $A = C$ for all $h$, and, by our choice of $g$, $A, C \le \varepsilon/3$. As for $B$, note that $g$ is actually a uniformly continuous function that vanishes off a bounded interval of $R^n$. Thus, given $\eta > 0$, there exists $\delta > 0$ such that $|g(x + h) - g(x)| \le \eta$ for all $|h| \le \delta$, $x \in R^n$. Moreover, since for any fixed $h$, $|h| \le 1$, also $g(x + h) - g(x)$ vanishes off a bounded interval $I$ of $R^n$, it is clear that
\[ B \le |I| \, \eta, \]
whenever $|h| \le \min(1, \delta)$. Thus, by choosing $\eta$ small enough, we also have $B \le \varepsilon/3$ whenever $|h|$ is sufficiently small, and $A + B + C \le \varepsilon$. •
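As a quick check of Proposition 1.4 on a discontinuous function, here is a worked instance added for illustration (the choice $n = 1$, $f = \chi_{[0,1]}$ is ours, not the text's). The translate $f(\cdot + h)$ is the characteristic function of $[-h, 1-h]$, so for $0 < |h| \le 1$ the integrand is the characteristic function of a symmetric difference made of two intervals of length $|h|$:

```latex
\int_{R} \bigl| \chi_{[0,1]}(x+h) - \chi_{[0,1]}(x) \bigr| \, dx
  = \bigl| [-h,\, 1-h] \,\triangle\, [0,1] \bigr|
  = 2|h| \longrightarrow 0 \quad \text{as } |h| \to 0 .
```

So the translates converge in the metric of $L(R)$ even though $f(x+h)$ does not converge to $f(x)$ uniformly in $x$.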
2. THE LEBESGUE DIFFERENTIATION THEOREM

Given $x = (x_1, \ldots, x_n) \in R^n$ and $r > 0$, let $I(x,r) = \{y : |x_i - y_i| < r, \ i = 1, 2, \ldots, n\}$ denote the open interval of sidelength $2r$ centered at $x$. The question we address in this section is: If $f$ is an integrable function, for what $x$'s does
\[ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f \, dy = f(x) \ ? \tag{2.1} \]
At those points $x$ where (2.1) holds we say that the (indefinite) integral of $f$ differentiates to $f(x)$. In case $n = 1$, the question is whether
\[ \lim_{r\to 0} \frac{1}{2r} \int_{(x-r,\,x+r)} f \, dy = f(x), \]
which is equivalent to
\[ \lim_{h\to 0,\, h \ne 0} \frac{1}{h} \int_{[x,\,x+h)} f \, dy = f(x). \tag{2.2} \]
If we set $F(x) = \int_{[0,x)} f \, dy$, (2.2) reads precisely $F'(x) = f(x)$, and this justifies the terminology. In fact, there are two questions implicit in (2.1): When does the limit exist, and, if it exists, when does it equal $f(x)$. For instance, when $I = [0,1]$ and $f = \chi_I$, we have
\[ \frac{1}{2r} \int_{(-r,\,r)} f \, dy = 1/2, \quad \text{all } 0 < r \le 1. \tag{2.3} \]
Thus the limit of the left-hand side of (2.3) exists and it equals $1/2 \ne f(0) = 1$. Some observations are in order. First, the question we posed is "local" in nature, i.e., since we take limits as $r \to 0$ only the values of $f$ near $x$ are relevant. Thus, we may assume that $x \in I(0,1)$ and that $f$ vanishes off $I(0,2)$. Next, since the example given in (2.3) is not very reassuring, we consider an instance where (2.1) is true. Suppose, then, that $f$ is continuous at $x$ and note that
\[ \frac{1}{|I(x,r)|} \int_{I(x,r)} f \, dy - f(x) = \frac{1}{|I(x,r)|} \int_{I(x,r)} (f - f(x)) \, dy. \tag{2.4} \]
Given $\varepsilon > 0$, let $r$ be so small that $|f(x) - f(y)| < \varepsilon$ for $y \in I(x,r)$; clearly we may assume $r < 1$. By (2.4) it follows that
\[ \left| \frac{1}{|I(x,r)|} \int_{I(x,r)} f \, dy - f(x) \right| \le \frac{1}{|I(x,r)|} \int_{I(x,r)} |f - f(x)| \, dy \le \varepsilon, \]
and (2.1) is true in this case. Since continuous functions are dense in the metric of $L(R^n)$, we expect that the good behaviour of continuous functions will somehow translate into an a.e. good behaviour of integrable functions. The idea of Hardy (1877-1947) and Littlewood (1885-1977) is to seek the control of all the averages of $f$. They devised this procedure to study the convergence of Fourier series and were inspired by the averages in the game of cricket. To control the averages of $f$ we introduce the so-called Hardy-Littlewood maximal function. Specifically, suppose $f$ is an integrable function which vanishes off $I(0,2)$, and for $x \in R^n$ put
\[ Mf(x) = \sup_{r>0} \frac{1}{|I(x,r)|} \int_{I(x,r)} |f| \, dy. \tag{2.5} \]
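Definition (2.5) can be made concrete in dimension one. The sketch below assumes the test function $f = \chi_{[0,1]}$ (our choice, which does vanish off $I(0,2)$) and approximates the supremum over $r$ by a maximum over a finite grid of radii, so it only gives a lower bound for $Mf$ in general. For this particular $f$ the averages can be computed by hand, giving $Mf(x) = 1$ for $0 < x < 1$ and $Mf(x) = 1/(2\max(x, 1-x))$ otherwise; the code compares the grid value with that closed form.

```python
def average(x, r):
    """Average of chi_[0,1] over the open interval (x - r, x + r)."""
    overlap = max(0.0, min(1.0, x + r) - max(0.0, x - r))
    return overlap / (2.0 * r)

# Finite grid of radii in (0, 20]; an assumption of this sketch, not part of (2.5).
RADII = [k / 1000.0 for k in range(1, 20001)]

def maximal(x):
    """Grid approximation (from below) of the Hardy-Littlewood maximal function at x."""
    return max(average(x, r) for r in RADII)

def closed_form(x):
    """Hand-computed value of Mf(x) for f = chi_[0,1]."""
    if 0.0 < x < 1.0:
        return 1.0
    return 1.0 / (2.0 * max(x, 1.0 - x))

for x in (-2.0, 0.0, 0.5, 1.0, 3.0):
    print(x, maximal(x), closed_form(x))

# The averages about x = 0 reproduce (2.3): they equal 1/2 for small r.
print(average(0.0, 0.25))
```

For $|x|$ large the closed form behaves like $1/(2|x|)$, the one-dimensional instance of the $|x|^{-n}$ decay.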
What can we say about $Mf$? We claim that $Mf$ is a nonnegative lower-semicontinuous, and hence measurable, function, which tends to 0 as $|x| \to \infty$ at the rate of $|x|^{-n}$. To show that $Mf$ is lower-semicontinuous we must verify that for each $\lambda > 0$, the set $\{Mf > \lambda\}$ is open; this is not hard. Working with complements we show that for each $\lambda > 0$, $\{Mf \le \lambda\}$ is closed. Fix $\lambda > 0$, then, and suppose $\{x_k\}$ is a sequence of points in $\{Mf \le \lambda\}$ such that $x_k \to x$; we show that $x \in \{Mf \le \lambda\}$ as well. In other words, we check that all the averages of $|f|$ about $x$ are less than or equal to $\lambda$. First observe that since $x_k \to x$,
\[ \lim_{k\to\infty} |I(x_k, r) \,\triangle\, I(x, r)| = 0, \quad \text{all } r. \]
Therefore, if $\chi_k$ denotes the characteristic function of $I(x_k, r) \,\triangle\, I(x,r)$ and $f_k = f \chi_k$, it follows that
\[ |f_k(y)| \le |f(y)| \quad \text{and} \quad \lim_{k\to\infty} f_k = 0 \ \text{a.e.} \]
Thus, by LDCT
\[ \lim_{k\to\infty} \int_{I(x_k,r) \,\triangle\, I(x,r)} |f| \, dy = 0. \tag{2.6} \]
Next consider the average $\frac{1}{|I(x,r)|} \int_{I(x,r)} |f| \, dy$. Since $I(x,r) \subseteq (I(x,r) \,\triangle\, I(x_k,r)) \cup I(x_k,r)$ and $|I(x,r)| = |I(x_k,r)|$, the average in question does not exceed
\[ \frac{1}{|I(x,r)|} \int_{I(x,r) \,\triangle\, I(x_k,r)} |f| \, dy + \frac{1}{|I(x_k,r)|} \int_{I(x_k,r)} |f| \, dy = A + B, \]
say. By (2.6), $A \to 0$ as $k \to \infty$. As for $B$, since $Mf(x_k) \le \lambda$ for all $k$, we also have $B \le \lambda$. Thus all averages of $|f|$ about $x$ are less than or equal to $\lambda$, and $Mf(x) \le \lambda$.

Next we show that $Mf(x) \sim |x|^{-n}$ for $|x|$ large. To see this take $x \in R^n$ with $|x|$ large, $x \in R^n \setminus I(0,10)$ will do, and observe that unless $r > c|x|$, where $c$ is a dimensional constant independent of $x$, the average of $|f|$ about $x$ vanishes. Thus, with a dimensional constant $c$ which may differ at different occurrences even in the same chain of inequalities, we have
\[ \frac{1}{|I(x,r)|} \int_{I(x,r)} |f| \, dy \le \frac{c}{r^n} \int_{I(0,1)} |f| \, dy \le \frac{c}{|x|^n} \int_{I(0,1)} |f| \, dy, \tag{2.7} \]
and consequently,
\[ Mf(x) \le \frac{c}{|x|^n} \int_{I(0,1)} |f| \, dy. \]
Since there is a dimensional constant $c$ such that for $|x|$ large we have $I(x, c|x|) \supseteq I(0,1)$, it readily follows that
\[ Mf(x) \ge \frac{1}{|I(x, c|x|)|} \int_{I(x, c|x|)} |f| \, dy \ge \frac{c}{|x|^n} \int_{I(0,1)} |f| \, dy, \]
the inequality opposite to (2.7) holds, and, as asserted, $Mf(x) \sim |x|^{-n}$ for $|x|$ large. It is then apparent that $Mf$ is not integrable, cf. 4.46 in Chapter VII, but just barely. As for the function $|x|^{-n}$, it satisfies a weak integrability condition reminiscent of Chebychev's inequality. More precisely, there is a constant $c$ such that
\[ \lambda \, |\{|x|^{-n} > \lambda\}| \le c, \quad \text{all } \lambda > 0. \tag{2.8} \]
Indeed, if $|x|^{-n} > \lambda$, then there is a dimensional constant $c$ so that $x \in I(0, c\lambda^{-1/n})$, and
\[ \lambda \, |\{|x|^{-n} > \lambda\}| \le \lambda \, |I(0, c\lambda^{-1/n})| = (2c)^n. \]
The class of those measurable functions $f$ which satisfy the estimate
\[ \lambda \, |\{|f| > \lambda\}| \le c, \quad \text{all } \lambda > 0, \tag{2.9} \]
was studied by Marcinkiewicz (1910-1940). It is called the weak-$L^1$ class of Marcinkiewicz and it is denoted by wk-$L(R^n)$. By Chebychev's inequality, $L(R^n) \subseteq$ wk-$L(R^n)$, and for integrable functions $f$, (2.9) is true with $c = \int_{R^n} |f(y)| \, dy$. On the other hand, $|x|^{-n} \in$ wk-$L(R^n) \setminus L(R^n)$. The remarkable fact that Hardy and Littlewood proved is that although for $f$ integrable $Mf$ is not necessarily integrable, it belongs to wk-$L(R^n)$; in a sense this gives the next best result.
Theorem 2.1 (Hardy-Littlewood). Suppose $f$ is an integrable function which vanishes off $I(0,2)$. Then $Mf \in$ wk-$L(R^n)$, and for any $\lambda > 0$ we have
\[ \lambda \, |\{Mf > \lambda\}| \le 3^n \int_{R^n} |f| \, dy. \tag{2.10} \]
Proof. Given $\lambda > 0$, let $O_\lambda = \{Mf > \lambda\}$; we want to show that the open set $O_\lambda$ has finite measure and that (2.10) holds. Since by (2.7) $Mf(x) \to 0$ as $|x| \to \infty$, $O_\lambda$ is a bounded set of finite measure; to show that (2.10) holds requires some work. The following line of reasoning is a prototype of the so-called "covering arguments" and it is due to Wiener (1894-1964). If $O_\lambda = \emptyset$ there is nothing to prove. Otherwise, let $x \in O_\lambda$ and observe that by the definition of $Mf(x)$ there exists $r = r_x$ such that
\[ \frac{1}{|I(x, r_x)|} \int_{I(x, r_x)} |f| \, dy > \lambda. \tag{2.11} \]
Clearly
\[ O_\lambda \subseteq \bigcup_{x \in O_\lambda} I(x, r_x). \tag{2.12} \]
Although the set on the right-hand side of (2.12) appears to be quite cumbersome, with the overlaps and all, things are not as complicated as a first impression might indicate. Since by the regularity of the Lebesgue measure, cf. 3.37 in Chapter V,
\[ |O_\lambda| = \sup\{|K| : K \subset O_\lambda, \ K \text{ compact}\}, \tag{2.13} \]
it suffices to estimate $|K|$ for each compact subset $K$ of $O_\lambda$. Now, for each such compact set $K$ we also have
\[ K \subseteq \bigcup_{x \in O_\lambda} I(x, r_x), \]
and, since the $I(x, r_x)$'s are open, by the Heine-Borel Theorem there exist finitely many intervals $I(x_1, r_1), \ldots, I(x_m, r_m)$, say, so that
\[ K \subseteq \bigcup_{i=1}^m I(x_i, r_i). \tag{2.14} \]
Clearly we may assume that no interval that appears on the right-hand side of (2.14) is contained in the union of all the others, that is to say, each interval there contributes something to the union. Let $r = \max\{r_1, \ldots, r_m\}$ and, by renaming the intervals if necessary, suppose that $r_1 = r$; if more than one $r_i$ equals $r$ just choose any. At this juncture of the argument the geometry of the situation takes over: Observe that if
\[ I(x_1, r_1) \cap I(x_j, r_j) \ne \emptyset, \quad \text{then} \quad I(x_j, r_j) \subseteq I(x_1, 3r_1). \]
Because of this property we discard all the intervals $I(x_j, r_j)$, $j \ne 1$, which intersect $I(x_1, r_1)$, and repeat the same procedure with the remaining intervals, i.e., the family of those intervals $I(x_j, r_j)$ which are disjoint with $I(x_1, r_1)$. In other words, we separate an interval with largest sidelength, and then discard all the intervals which intersect it. Since the original family of intervals is finite, after a finite number of steps we are left with a pairwise disjoint family of open intervals $I(x_1, r_1), \ldots, I(x_k, r_k)$, say, which by (2.14) has the property that
\[ K \subseteq \bigcup_{i=1}^k I(x_i, 3r_i). \]
Whence
\[ |K| \le \sum_{i=1}^k |I(x_i, 3r_i)| = 3^n \sum_{i=1}^k |I(x_i, r_i)|. \tag{2.15} \]
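The selection procedure just described is a greedy algorithm, and it can be sketched in a few lines for the case $n = 1$. This is an illustration only: the function name `wiener_select` and the sample family are hypothetical, and the preliminary reduction that removes intervals contained in the union of the others is skipped, since the selection works without it. Open intervals $(c - r, c + r)$ and $(d - s, d + s)$ intersect exactly when $|c - d| < r + s$.

```python
def wiener_select(intervals):
    """Greedy selection from a finite family of open intervals given as (center, radius).

    Repeatedly keep an interval of largest radius and discard every remaining
    interval that intersects it. Returns the kept, pairwise disjoint subfamily.
    """
    remaining = sorted(intervals, key=lambda cr: cr[1], reverse=True)
    selected = []
    while remaining:
        c, r = remaining[0]
        selected.append((c, r))
        # Keep only the intervals disjoint from the one just selected.
        remaining = [(d, s) for (d, s) in remaining[1:] if abs(c - d) >= r + s]
    return selected

def covered_by_dilates(intervals, selected):
    """Check that every interval of the family lies in some 3-fold dilate I(c, 3r)."""
    return all(
        any(c - 3 * r <= d - s and d + s <= c + 3 * r for (c, r) in selected)
        for (d, s) in intervals
    )

family = [(0, 1), (1.5, 1), (0.5, 0.25), (4, 0.5), (4.4, 0.3)]
chosen = wiener_select(family)
print(chosen)
print(covered_by_dilates(family, chosen))
```

Because a discarded interval has radius at most that of the selected interval it meets, it is contained in the 3-fold dilate, which is the geometric observation that produces the factor $3^n$ in (2.15).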
But the intervals that appear on the right-hand side of (2.15) are special: They are pairwise disjoint and they all satisfy (2.11). Therefore the sum on the right-hand side of (2.15) is less than or equal to
\[ \sum_{i=1}^k \frac{1}{\lambda} \int_{I(x_i, r_i)} |f| \, dy = \frac{1}{\lambda} \int_{\bigcup_{i=1}^k I(x_i, r_i)} |f| \, dy \le \frac{1}{\lambda} \int_{R^n} |f| \, dy, \]
and consequently,
\[ |K| \le \frac{3^n}{\lambda} \int_{R^n} |f| \, dy. \tag{2.16} \]
Taking the sup of the left-hand side of (2.16) over those $K \subset O_\lambda$, by (2.13) it follows that (2.10) holds. •

We are now ready to address the questions raised in (2.1), to wit, the existence of the limit there, and its precise value.

Theorem 2.2 (Lebesgue Differentiation Theorem). Suppose $f$ is an integrable function which vanishes off $I(0,2)$. Then
\[ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y) \, dy = f(x) \quad \text{a.e. on } I(0,1). \tag{2.17} \]
Proof. First we show that the limit on the left-hand side of (2.17) exists a.e. For this purpose consider the function
\[ \Omega(f, x) = \limsup_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f \, dy - \liminf_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f \, dy. \]
Although $\Omega$ is measurable, we need not make use of this fact to proceed with the proof. However we point out that $\Omega$ is a well-defined nonnegative function, and show that $\Omega(f, x) = 0$ a.e. on $I(0,1)$. The idea is to control $\Omega$ by the Hardy-Littlewood maximal function. Now, if $g$ is continuous it is clear that $\Omega(g, x) = 0$ everywhere and consequently, $\Omega(f, x) = \Omega(f - g, x)$. Whence it readily follows that for any continuous function $g$ we have
\[ \Omega(f, x) = \Omega(f - g, x) \le 2M(f - g)(x), \quad \text{all } x. \]
Let now $\lambda > 0$, and note that the above inequality gives
\[ \{\Omega(f, \cdot) > \lambda\} \subseteq \{M(f - g) > \lambda/2\}. \]
Thus, by the monotonicity of the Lebesgue outer measure and Theorem 2.1, we obtain
\[ |\{\Omega(f, \cdot) > \lambda\}|_e \le |\{M(f - g) > \lambda/2\}| \le \frac{2 \cdot 3^n}{\lambda} \int_{R^n} |f - g| \, dy. \tag{2.18} \]
Now, since continuous functions are dense in the metric of $L(R^n)$, the integral on the right-hand side of (2.18) may be made arbitrarily small, and consequently,
\[ |\{x \in R^n : \Omega(f, x) > \lambda\}|_e = 0, \quad \text{all } \lambda > 0. \]
This can only be true if $\Omega(f, x) = 0$ a.e., in other words the lim sup is equal to the lim inf and the limit on the left-hand side of (2.17) exists a.e. Next we show that the limit equals $f$ a.e. Let
\[ W(f, x) = \left| \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y) \, dy - f(x) \right|. \]
$W$ is an a.e. well-defined nonnegative function, and it is apparent that for continuous $g$, $W(f, x) = W(f - g, x)$, all $x$. Whence it readily follows that $W(f, x) \le M(f - g)(x) + |f(x) - g(x)|$, and consequently, for $\lambda > 0$ we have
\[ \{x \in R^n : W(f, x) > \lambda\} \subseteq \{x \in R^n : M(f - g)(x) > \lambda/2\} \cup \{x \in R^n : |f(x) - g(x)| > \lambda/2\}. \]
Thus by the monotonicity of the Lebesgue outer measure, the Hardy-Littlewood maximal theorem and Chebychev's inequality, it follows that
\[ |\{W(f, \cdot) > \lambda\}|_e \le |\{M(f - g) > \lambda/2\}| + |\{|f - g| > \lambda/2\}| \le \frac{2 \cdot 3^n}{\lambda} \int_{R^n} |f - g| \, dy + \frac{2}{\lambda} \int_{R^n} |f - g| \, dy. \]
Since $g$ is an arbitrary continuous function, we get that
\[ |\{x \in R^n : W(f, x) > \lambda\}|_e = 0, \quad \text{all } \lambda > 0. \]
Whence $W(f, x) = 0$ a.e., and (2.17) holds. •
Although we have only discussed a "local" version of (2.1), it is not hard to obtain the "global" version as well. One way to go about this is to use the general Hardy-Littlewood maximal theorem, cf. 3.23 below, but a simpler way to proceed is this: First observe that $R^n = \bigcup_{k=1}^\infty I(0, 2k)$. Now, an argument entirely analogous to Theorems 2.1 and 2.2 gives that if $f \in L(I(0,2k))$, then the integral of $f$ differentiates to $f(x)$ a.e. on $I(0,k)$. Given an arbitrary integrable function $f$ note that $f_k = f \chi_{I(0,2k)}$ is integrable and it vanishes off $I(0,2k)$, and consequently there exists a null set $N_k$ such that the integral of $f_k$ differentiates to $f_k(x) = f(x)$ a.e. on $I(0,k)$, $k = 1, 2, \ldots$ Clearly the integral of $f$ differentiates to $f(x)$ off the null set $N = \bigcup_{k=1}^\infty N_k$, and the global version of the differentiation theorem also holds.
3. PROBLEMS AND QUESTIONS

In what follows $(X,\mathcal{M},\mu)$ is a measure space.

3.1 By means of an example show that if $f, g \in L(\mu)$ it is not necessarily true that $fg$ is integrable. However, if $f^2, g^2 \in L(\mu)$, prove that $fg \in L(\mu)$ and
\[ \int_X |fg| \, d\mu \le \left( \int_X f^2 \, d\mu \right)^{1/2} \left( \int_X g^2 \, d\mu \right)^{1/2}. \tag{3.1} \]
(3.1) is known as the Cauchy-Schwarz inequality and it has many interesting applications. One of them is the following: Show that if $\{f_n\}$, $\{g_n\}$ are sequences with the property that $f_n^2, g_n^2 \in L(\mu)$, $n = 1, 2, \ldots$, and if
\[ \lim_{n\to\infty} \int_X (f_n - f)^2 \, d\mu = \lim_{n\to\infty} \int_X (g_n - g)^2 \, d\mu = 0, \]
3.2 In the spirit of 3.1, show that if f E L(JL) and {fn} is a sequence of integrable functions so that limn~oo dUn, f) = 0, and if {gn} is a sequence of measurable functions such that Ignl ~ M JL-a.e.,
limgn = 9 JL-a.e.,
then lim [ fngn dJL = [ fg dJL. n~oolx
lx
= 1,2, ... , and limn~oo dUn' f) = 0. Show that there exist an integrable function F and a sequence {nk} such that
3.3 Suppose that f, fn E L(JL), n
Ifnk 1 :$ F JL-a.e.
and
lim fnk
k~oo
=f
f E L(R) with the followRand M > 0,
3.4 Construct a Lebesgue integrable function
ing property: For any interval I
~
JL-a.e.
I{x E I: If(x)1 > M}I > O.
3.5 Suppose $f$ is a Lebesgue integrable function on $R^+$ and put
\[ g(x) = \int_{[0,\infty)} \frac{f(t)}{x + t} \, dt, \quad x > 0. \]
Is $g$ continuous? Does $g$ have a limit as $x \to \infty$? Is $g$ differentiable?

3.6 Suppose $f \in L([a,b])$ and put
\[ F(x) = \int_{[a,x]} f \, dy, \quad a \le x \le b. \]
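Show that $F$ is a continuous BV function on $[a,b]$ that satisfies the following property: Given $\varepsilon > 0$, there exists $\delta > 0$, such that for any finite collection $\{[a_i, b_i]\}$ of nonoverlapping subintervals of $[a,b]$, we have
\[ \sum_i |F(b_i) - F(a_i)| < \varepsilon, \quad \text{whenever } \sum_i (b_i - a_i) < \delta. \tag{3.2} \]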
Functions which satisfy (3.2) are said to be "absolutely continuous" in the sense of Vitali; we will have more to say about absolutely continuous functions in Chapter X.

3.7 Show that the integral of an arbitrary $f \in L(\mu)$ is "absolutely continuous" in the following sense: Given $\varepsilon > 0$, there exists $\delta > 0$, such that
\[ \int_E |f| \, d\mu < \varepsilon, \quad \text{whenever } \mu(E) < \delta. \]
We will have more to say about this notion of absolute continuity in Chapter XI.

3.8 Let $(X,\mathcal{M},\mu)$ be a finite measure space, and $f$ a measurable extended real-valued function defined on $X$. Show that $f \in L(\mu)$ iff $\sum_{k=1}^\infty \mu(\{|f| \ge k\}) < \infty$.

3.9 Show that if $f \in L(\mu)$, then
\[ \lim_{\lambda\to\infty} \int_{\{|f| > \lambda\}} |f| \, d\mu = 0. \tag{3.3} \]
3.10 A sequence $\{f_n\}$ of integrable functions is said to be "uniformly integrable" if (3.3) holds uniformly in $n$. More precisely, $\{f_n\}$ is uniformly integrable iff
\[ \lim_{\lambda\to\infty} \sup_n \int_{\{|f_n| > \lambda\}} |f_n| \, d\mu = 0. \]
Show that if $\mu(X) < \infty$ and $\{f_n\}$ is uniformly integrable and $\lim_{n\to\infty} f_n = f$ $\mu$-a.e., then $f \in L(\mu)$ and
\[ \lim_{n\to\infty} \int_X f_n \, d\mu = \int_X f \, d\mu. \]

3.11 If
\[ \sup_n \int_X |f_n|^{1+\eta} \, d\mu < \infty, \quad \eta > 0, \]
show that $\{f_n\}$ is uniformly integrable. The same conclusion is true if there exists $g \in L(\mu)$ so that $|f_n| \le g$ $\mu$-a.e.; on the other hand the sequence
\[ f_n = (n / \ln n) \, \chi_{(0, 1/n)}, \quad n = 2, 3, \ldots \]
is uniformly integrable with respect to the Lebesgue measure of $I = [0,1]$. In fact, $\int_I f_n \, dx \to 0$, and yet the $f_n$'s are not dominated by any integrable function.
<
and show that uniformly integrable sequences are precisely those sequences with uniformly bounded integrals that are uniformly absolutely continuous. More precisely, {In} is uniformly integrable iff (i) There is a constant M such that I/nl dJl ::; M for all n, and, (ii) Given c > 0, there exists 6 > 0 such that Jl(E) < 6 implies I/nldJl < c for all n. 00
Ix
IE
< 00 and In ~ 0 Jl-a.e. for all n. Show that if the sequence {In} is uniformly integrable, then
3.13 Suppose Jl(X)
lim sup
Ix
In dJl ::;
Ix
lim sup In dJl .
= 1,2, ... , and limn->ood(fn,f) = o. Show that if Ihnl - 0, then limn->oo I/n(Y + hn) - l(y)1 dy = O.
3.14 Suppose 1,ln E L(Rn),n 3.15 Suppose that
IRn
IE L(Rn) and compute
r
lim I/(y Ihl->oo Jnn
+ h) + l(y)1 dy.
3.16 Prove that if A is a Lebesgue measurable subset of R n of positi ve measure, then the difference set A - A = {x E R n : x = Yl - Y2, Yl, Y2 E A} contains a neighbourhood of the origin.
State and prove a similar statement for $A + A$, and for $A \pm B$, where $B$ is another Lebesgue measurable set of positive measure.

3.17 Suppose $f$ is a measurable function defined on $R$ which assumes finite values on a set of positive measure and such that $f(x + y) = f(x) + f(y)$ for all real $x, y$. Show that $f$ is of the form $f(x) = cx$.

3.18 Prove that $C_0^1(R^n) = \{g : g$ is real-valued, it vanishes off a compact set, and its first order partial derivatives are continuous$\}$ is dense in $L(R^n)$. Also, $C_0^k(R^n) = \{g : g$ and all its partial derivatives of order $\le (k-1)$ belong to $C_0^1(R^n)\}$, $k = 2, 3, \ldots$, and $C_0^\infty(R^n) = \bigcap_{k=1}^\infty C_0^k(R^n)$, are dense in $L(R^n)$.

3.19 If $f \in L(R^n)$, prove that there exists a sequence $\{f_k\}$ of continuous functions such that $\lim_{k\to\infty} f_k = f$ a.e. Further show that we may also require that each $f_k$ vanish off a compact set, and that it belong to $C_0^1(R^n)$, or even to $C_0^\infty(R^n)$.

3.20 Given $f \in L(R^n)$, put
\[ F(x, r) = \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y) \, dy, \quad x \in R^n, \ r > 0. \]
Is $F$ continuous as a function of $x$, for each $r$ fixed? Is $F$ continuous as a function of $r$, for each $x$ fixed?

3.21 In the notation of 3.20, show that
\[ \limsup_{r\to 0} F(x, r) \quad \text{and} \quad \liminf_{r\to 0} F(x, r) \]
are Lebesgue measurable.

3.22 Prove that Theorem 2.1 is true if we replace intervals by balls, i.e., (2.10) holds with $Mf$ there replaced by
\[ M_1 f(x) = \sup_{r>0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f| \, dy, \]
where $B(x, r) = \{y \in R^n : |x - y| < r\}$ denotes the ball of radius $r$ centered at $x$.

3.23 Prove the general version of the Hardy-Littlewood maximal theorem, i.e., remove the assumption that the integrable function $f$ vanishes off a bounded set.
3.24 A point $x$ at which
\[ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} |f(y) - f(x)| \, dy = 0 \]
is called a Lebesgue point of $f$, and the collection of all such points is called the Lebesgue set of $f$. Prove that if $f$ is integrable, then almost every point is in the Lebesgue set of $f$. This notion is extremely important in the convergence of Fourier series, cf. Theorem 3.1 in Chapter XVII.

3.25 A family $\mathcal{R} = \{R\}$ of subsets of $R^n$ is said to be regular at $x$ provided that: (i) The diameters of the sets $R$ tend to 0, and, (ii) There is a constant $c$ such that if $I(x, r)$ denotes the smallest interval centered at $x$ containing $R$, then $|I(x,r)| \le c|R|$; the sets $R$ need not contain $x$. Show that if $f$ is integrable, $\mathcal{R}$ is regular at $x$, and $x$ is in the Lebesgue set of $f$, then
\[ \lim_{\mathrm{diam}(R)\to 0} \frac{1}{|R|} \int_R |f(y) - f(x)| \, dy = 0. \]

3.26 Let $E$ be a measurable subset of $R^n$; a point $x \in R^n$ for which
\[ \lim_{r\to 0} \frac{|E \cap I(x,r)|}{|I(x,r)|} = 1 \]
is called a point of density of $E$. If the above limit equals 0, $x$ is called a point of dispersion of $E$. Prove that almost every point of $E$ is a point of density of $E$ and a point of dispersion of $R^n \setminus E$.

3.27 Suppose that
\[ \int_I f(y) \, dy = 0 \tag{3.4} \]
for every subinterval $I$ of $R$, and show that $f = 0$ a.e. In fact, the same conclusion is true if (3.4) holds for every $I$ with $|I| = c > 0$, $c$ a fixed constant.

3.28 Suppose that $f \in L(R)$ vanishes off a bounded interval $I = [a,b]$, and let
\[ F(x) = \int_{[a,x]} f \, dy. \]
Is it true that
\[ \int_{[a,b]} \left| \lim_{h\to 0} \frac{F(x + h) - F(x)}{h} - f(x) \right| dx = 0 \ ? \]
Calderón and Zygmund, while considering problems related to the "norm" convergence of Fourier series of functions of several variables, introduced a decomposition of an integrable function into an essentially bounded "good" part and a "bad" part. The decomposition is described in the next four problems.

3.29 Let $I$ be a finite interval in $R$ and suppose $f$ is a nonnegative integrable function which vanishes off $I$. Show that for any $\lambda$ satisfying
\[ \frac{1}{|I|} \int_I f \, dy \le \lambda \]
there is a sequence $\{I_k\}$ of nonoverlapping intervals contained in $I$ such that (i) $\lambda < \frac{1}{|I_k|} \int_{I_k} f \, dy \le 2\lambda$, $k = 1, 2, \ldots$ (ii) $f \le \lambda$ a.e. on $I \setminus \bigcup_k I_k$. (iii) $|\bigcup_k I_k| \le \frac{1}{\lambda} \int_{\bigcup_k I_k} f \, dy \le \frac{1}{\lambda} \int_I f \, dy$.
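3.30 Referring to 3.29, consider the "good" function
\[ g(x) = \begin{cases} f(x) & \text{if } x \in I \setminus \bigcup_k I_k \\ \frac{1}{|I_k|} \int_{I_k} f \, dy & \text{if } x \in I_k \end{cases} \]
and the "bad" function
\[ b = f - g. \]
Show that these functions enjoy the following properties (i) $0 \le g \le \lambda$ a.e. on $I \setminus \bigcup_k I_k$. (ii) $0 \le g \le 2\lambda$ on $\bigcup_k I_k$. (iii) $b = 0$ in $I \setminus \bigcup_k I_k$; $\int_{I_k} b = 0$ for all $k$. (iv) $|b| \le f + 2\lambda$.

3.31 Show that 3.29 and 3.30 are valid for $I = R$, $f \in L(R)$, and any $\lambda > 0$.

3.32 The reader is invited to prove the $n$-dimensional version of 3.29, 3.30 and 3.31.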
CHAPTER
IX
Borel Measures
In this chapter we study Borel measures on Euclidean space, their regularity properties, and the distribution functions associated with them.
1. REGULAR BOREL MEASURES
A measure $\mu$ on $(R^n, \mathcal{B}_n)$ is called a Borel measure; the restriction of the Lebesgue measure to $\mathcal{B}_n$ is a familiar example of a Borel measure. In working with measures it is apparent that "regularity" plays an essential role. Now, in the case of the Lebesgue measure, regularity is built into its definition. We begin by showing that the same is true for those Borel measures which are finite on bounded sets; first the precise definition of regularity. A Borel measure $\mu$ is said to be regular if for any $E\in\mathcal{B}_n$, $\mu(E)$ may be computed by either of the expressions
$$\mu(E) = \sup\{\mu(K): K \text{ is compact, and } K\subseteq E\}, \tag{1.1}$$
$$\mu(E) = \inf\{\mu(O): O \text{ is open, and } O\supseteq E\}. \tag{1.2}$$
These conditions roughly state that $\mu$ is determined by the compact, or open, sets in $R^n$. A convenient way to verify these conditions is to consider the following equivalent formulations. For (1.1) we have: If $\mu(E)<\infty$, given $\varepsilon>0$, there exists a compact set $K\subseteq E$ such that
$$\mu(E\setminus K) = \mu(E)-\mu(K) \le \varepsilon, \tag{1.3}$$
and if $\mu(E)=\infty$, given $M>0$, we can find a compact set $K\subseteq E$ such that
$$\mu(K)\ge M. \tag{1.4}$$
As for (1.2), if $\mu(E)<\infty$, given $\varepsilon>0$, there exists an open set $O\supseteq E$ such that
$$\mu(O\setminus E) = \mu(O)-\mu(E) \le \varepsilon, \tag{1.5}$$
and if $\mu(E)=\infty$, by monotonicity any open set containing $E$ also has infinite measure. We consider the regularity of finite Borel measures first; although the idea for proving this assertion is clear, it takes some time to carry out the details.
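The $\varepsilon/2^m$ device that drives conditions such as (1.5) can be tried on a concrete case. The following Python sketch is only an illustration under stated assumptions: it uses the Lebesgue measure, a particular enumeration of the rationals in $[0,1]$ truncated to 50 points, and helper names (`first_rationals`, `open_cover`) that are ours, not the text's. The $m$-th point is covered by an open interval of length $\varepsilon/2^m$, so the union is an open set containing the points whose measure is at most $\varepsilon$.

```python
from fractions import Fraction

def first_rationals(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator
    (an assumed enumeration, chosen only for this sketch)."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def open_cover(points, eps):
    """Cover the m-th point (m = 1, 2, ...) by an open interval of length eps / 2**m."""
    cover = []
    for m, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (m + 1)
        cover.append((x - half, x + half))
    return cover

eps = Fraction(1, 100)
points = first_rationals(50)
cover = open_cover(points, eps)

# The Lebesgue measure of the union is at most the sum of the lengths.
total_length = sum(b - a for a, b in cover)
print(total_length < eps)
```

The same bookkeeping, with sets $E_m$ in place of points, is what makes the countable-union step in the proof below work.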
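Theorem 1.1. Suppose $\mu$ is a finite Borel measure; then $\mu$ is regular.
Proof. Let
$$\mathcal{A} = \{E\in\mathcal{B}_n : \text{(1.1) and (1.2) are true for } E\}.$$
The idea of the proof is to show that $\mathcal{A}$ is a $\sigma$-algebra that contains the closed intervals, and which therefore coincides with $\mathcal{B}_n$. It is not hard to check that $\mathcal{A}$ contains the closed intervals of $R^n$: Let $I$ be a closed interval; then $I$ is also compact and (1.1) holds. As for (1.2), let $\{I_k\}$ be a decreasing sequence of open intervals that converges to $I$. By (iv) in Proposition 3.1 in Chapter IV it follows that
$$\lim_{k\to\infty}\mu(I_k) = \mu(I),$$
and (1.2) holds as well. Next, if $\{E_m\}\subseteq\mathcal{A}$ and $E = \bigcup_m E_m$, then we have $E\in\mathcal{A}$. Indeed, suppose that $\varepsilon>0$ has been given, and invoke (1.3) to find a sequence $\{K_m\}$ of compact sets such that
$$K_m\subseteq E_m, \quad \mu(E_m\setminus K_m)\le \varepsilon/2^{m+1}, \quad m = 1,2,\dots$$
Furthermore, since $E\setminus\bigcup_m K_m \subseteq \bigcup_m (E_m\setminus K_m)$, we have
$$\mu\Bigl(E\setminus\bigcup_m K_m\Bigr) \le \sum_m \mu(E_m\setminus K_m) \le \varepsilon/2. \tag{1.6}$$
Now, since $\mu\bigl(\bigcup_m K_m\bigr)<\infty$, by (iii) in Proposition 3.1 in Chapter IV we get at once that $\lim_{M\to\infty}\mu\bigl(\bigcup_{m=1}^M K_m\bigr) = \mu\bigl(\bigcup_m K_m\bigr)$. Whence, for $M$ sufficiently large, the compact set $K = \bigcup_{m=1}^M K_m\subseteq E$ satisfies
$$\mu\Bigl(\bigcup_m K_m\setminus K\Bigr) \le \varepsilon/2. \tag{1.7}$$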
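Moreover, since
$$E\setminus K = \Bigl(E\setminus\bigcup_m K_m\Bigr)\cup\Bigl(\bigcup_m K_m\setminus K\Bigr),$$
by (1.6) and (1.7) it follows that $\mu(E)\le\mu(K)+\varepsilon$. As noted in (1.3) above, this estimate gives that (1.1) is true for $E$. Along similar lines, let $\{O_m\}$ be a sequence of open sets such that
$$E_m\subseteq O_m, \quad \mu(O_m\setminus E_m)\le\varepsilon/2^m, \quad m = 1,2,\dots$$
Since $O = \bigcup_m O_m$ is an open set containing $E$ and $O\setminus E\subseteq\bigcup_m (O_m\setminus E_m)$, we get that
$$\mu(O\setminus E)\le\sum_m\mu(O_m\setminus E_m)\le\varepsilon,$$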
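and consequently, (1.2) is also true for $E$. Thus $E\in\mathcal{A}$ and $\mathcal{A}$ is closed under countable unions. Finally we check that $\mathcal{A}$ is closed under complementation. Suppose $E\in\mathcal{A}$ and $\varepsilon>0$ is given. Since $\mu$ is finite, there exist an open set $O\supseteq E$ and a compact set $K\subseteq E$ such that
$$\mu(O\setminus E)\le\varepsilon/2, \quad\text{and}\quad \mu(E\setminus K)\le\varepsilon.$$
Moreover, since $E\setminus K = (R^n\setminus K)\setminus(R^n\setminus E)$ and $R^n\setminus K = O'$ is open, we also have
$$\mu\bigl(O'\setminus(R^n\setminus E)\bigr)\le\varepsilon, \quad O'\supseteq(R^n\setminus E),$$
and (1.2) holds for $R^n\setminus E$. A similar argument gives that
$$\mu\bigl((R^n\setminus E)\setminus(R^n\setminus O)\bigr) = \mu(O\setminus E)\le\varepsilon/2,$$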
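but we can only assert that $R^n\setminus O$ is closed. This is not a major difficulty: Since $R^n = \bigcup_m\{x: |x|\le m\} = \bigcup_m B_m$, say, the sequence of compact sets $\{(R^n\setminus O)\cap B_m\}$ converges to $R^n\setminus O$. Whence, by Proposition 3.1 (iii) in Chapter IV, we get that
$$\lim_{m\to\infty}\mu\bigl((R^n\setminus O)\cap B_m\bigr) = \mu(R^n\setminus O),$$
and consequently, for $m$ sufficiently large it follows that
$$\mu\bigl((R^n\setminus O)\setminus((R^n\setminus O)\cap B_m)\bigr)\le\varepsilon/2.$$
Let $K' = (R^n\setminus O)\cap B_m$; $K'$ is a compact subset of $R^n\setminus E$ and since
$$(R^n\setminus E)\setminus K' = \bigl((R^n\setminus E)\setminus(R^n\setminus O)\bigr)\cup\bigl((R^n\setminus O)\setminus K'\bigr),$$
by the above estimates it follows that $\mu(R^n\setminus E)-\mu(K')\le\varepsilon$. Thus, (1.1) holds for $R^n\setminus E$, and $\mathcal{A}$ is closed under the taking of complements. Whence $\mathcal{A}$ is a $\sigma$-algebra that coincides with $\mathcal{B}_n$, and we have finished.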
•
Since $R^n$ is $\sigma$-compact it is possible to extend Theorem 1.1 to more general Borel measures. More precisely, we have
Theorem 1.2. Suppose $\mu$ is a Borel measure which is finite on bounded subsets of $R^n$. Then $\mu$ is regular.
Proof. Since $R^n = \bigcup_m\{x: |x|\le m\} = \bigcup_m B_m$, say, for each $E$ in $\mathcal{B}_n$ we have
$$E = \bigcup_m (E\cap B_m), \quad\text{with } E\cap B_m\in\mathcal{B}_n, \quad m = 1,2,\dots$$
Thus, from Proposition 3.1 (iii) in Chapter IV we get
$$\mu(E) = \lim_{m\to\infty}\mu(E\cap B_m). \tag{1.8}$$
The idea is to approximate the measure of the sets that appear on the right-hand side of (1.8), and this will be achieved by "restricting" $\mu$ to $B_m$. More precisely, consider the sequence of Borel measures given by
$$\mu_m(E) = \mu(E\cap B_m), \quad m = 1,2,\dots \tag{1.9}$$
Since $\mu$ is finite on bounded sets, the $\mu_m$'s are finite Borel measures, and, by Theorem 1.1, they are regular. Fix now $E\in\mathcal{B}_n$, let $\varepsilon>0$ be given, and put $E_m = E\cap B_m$, $m = 1,2,\dots$ By regularity, there exist compact sets $P_m$ and open sets $G_m$ such that
(1.10)
and
(1.11)
We rewrite (1.10) and (1.11) in terms of $\mu$. Since $K_m = P_m$ is a compact subset of $E_m$, (1.10) reads
(1.12)
As for (1.11), let $\{I_{m,k}\}$ be a decreasing sequence of bounded open balls which converges to $B_m$. Since $\mu(I_{m,1})<\infty$, by (iv) in Proposition 3.1 in Chapter IV, we have
$$\lim_{k\to\infty}\mu(I_{m,k}) = \mu(B_m),$$
and consequently we can find a sequence of $k_m$'s such that
Now, $O_m = G_m\cap I_{m,k_m}$ is an open set that contains $E_m$, and (1.11) becomes
(1.13)
We are now ready to show that $\mu(E)$ may be computed by both (1.1) and (1.2); we do (1.1) first. Combining (1.8) and (1.12) it readily follows that we may find a sequence of compact sets $\{K_m\}$ with the property that
$$\mu(E) = \lim_{m\to\infty}\mu(K_m);$$
this gives (1.1) whether $\mu(E)$ is finite or not. As for (1.2), we must only do the case $\mu(E)<\infty$. If $\{O_m\}$ is the sequence of open sets introduced in (1.13), put $O = \bigcup_m O_m$ and observe that $O$ is an open set which contains $E$. Furthermore, since $O\setminus E\subseteq\bigcup_m (O_m\setminus E_m)$, by (1.13) it follows that
$$\mu(O)-\mu(E) = \mu(O\setminus E)\le\sum_m\mu(O_m\setminus E_m)\le\varepsilon.$$
Thus (1.2) is also true, and $\mu$ is regular.
•
2. DISTRIBUTION FUNCTIONS
Borel measures on the line which are finite on bounded sets are important in applications and there is a useful way to describe them. Let $BB$ denote the collection of those Borel measures which are finite on bounded sets, assume that $\mu\in BB$ is finite and, referring to 4.28 in Chapter IV, let $F_y$ be a distribution function induced by $\mu$. Since $\mu$ is finite, a way to normalize the $F_y$'s is to consider not the expression given there but rather the distribution function $F$ corresponding to $y = -\infty$, namely
$$F(x) = \mu((-\infty,x]). \tag{2.1}$$
$F$ is called the distribution function of $\mu$; it is nondecreasing and right-continuous, and it satisfies
$$\lim_{x\to-\infty}F(x) = 0, \quad \lim_{x\to\infty}F(x)<\infty. \tag{2.2}$$
Also, as a consequence of (2.1), it follows that for $-\infty<x<y<\infty$ we have
$$\mu((x,y]) = F(y)-F(x), \quad \mu([x,y]) = F(y)-F(x-),$$
$$\mu([x,y)) = F(y-)-F(x-), \quad \mu((x,y)) = F(y-)-F(x).$$
Furthermore, if $D$ is a dense subset of $R$, then the relation (2.1) is already determined by $x\in D$, or by any of the four above relations when $x,y\in D$. The remarkable fact is that, conversely, any nondecreasing right-continuous function $F$ that satisfies (2.2) determines a unique finite Borel measure $\mu$ such that (2.1) is true. Rather than proving this result we discuss a more general one that also includes $F(x) = x$, which intuitively corresponds to the Lebesgue measure on the line. The precise statement is
Theorem 2.1. Let $BB = \{\mu: \mu$ is a Borel measure on the line which is finite on bounded sets$\}$, and $\mathcal{D} = \{F: F$ is nondecreasing and right-continuous$\}$; we identify those functions in $\mathcal{D}$ that differ by a constant. Then, there is an injective mapping $T$ from $BB$ onto $\mathcal{D}$ which satisfies the following property: If $T\mu = F$ and $c$ is an arbitrary constant, we have
$$F(x) = \begin{cases} c+\mu((0,x]) & \text{if } x>0 \\ c & \text{if } x=0 \\ c-\mu((x,0]) & \text{if } x<0. \end{cases} \tag{2.3}$$
Clearly (2.3) is equivalent to
$$F(y)-F(x) = \mu((x,y]), \quad\text{all real } x<y. \tag{2.4}$$
Proof. It is not hard to check that if $T$ is given by (2.4), then $T$ is one-to-one. Indeed, if $T\mu$ is constant, from (2.4) it follows that $\mu$ vanishes on all half-open intervals $(x,y]$. Since any open subset of $R$ is a countable union of such intervals, $\mu$ also vanishes on all open sets. Furthermore, since $\mu$ is regular, by (1.2) we get $\mu(E) = 0$ for all $E\in\mathcal{B}_1$. Thus $\mu$ is the 0 measure, and $T$ is injective. To show that $T$ is onto requires some work. Suppose $F\in\mathcal{D}$ and observe that since $F$ is nondecreasing the (bad) set
$$B = \{r: F^{-1}(\{r\}) \text{ consists of more than one value } x\in R\}$$
is at most countable; $B$ consists of those points in the range of $F$ which correspond to the intervals of constancy of $F$. Now, let $\Phi$ be the interval-valued function defined as follows: Since $F\in\mathcal{D}$, $F(x-)$ and $F(x+) = F(x)$ exist for each $x\in R$; then put
$$\Phi(x) = [F(x-),F(x)], \quad x\in R. \tag{2.5}$$
Thus, for each real $x$, $\Phi(x)$ is a closed interval, degenerate if $F$ is continuous at $x$, and closed and bounded otherwise. For each subset $E$ of $R$ let
$$\Phi(E) = \bigcup_{x\in E}\Phi(x), \tag{2.6}$$
and put $J = \Phi(R)$; clearly $J$ is also an interval. Now set
$$\mathcal{A} = \{E\in\mathcal{B}_1: \Phi(E)\in\mathcal{L}\};$$
we claim that $\mathcal{A}$ is a $\sigma$-algebra which contains all intervals, and which consequently coincides with $\mathcal{B}_1$. First observe that if $E$ is an interval, then so is $\Phi(E)$; thus $\mathcal{A}$ contains all intervals. Next we verify that $\mathcal{A}$ is closed under complementation; this is clear if $F$ is strictly increasing for then
$$\Phi(R\setminus E) = J\setminus\Phi(E)\in\mathcal{L}, \quad\text{whenever } E\in\mathcal{A}. \tag{2.7}$$
In the general case, i.e., when $F$ is only monotone nondecreasing, a slight complication occurs when $E$ includes an interval of constancy of $F$. For instance, if $F(x) = r$ for $x\in I = [a,b]$ and $E = [a,(a+b)/2]$, then $\Phi(R\setminus E) = J$, but $J\setminus\Phi(E) = J\setminus\{r\}$, and equality does not hold in (2.7). Nevertheless, equality will hold there if we add a subset of $B$, namely $\{r\}$, to $J\setminus\Phi(E)$. The same reasoning applies to the general case. More precisely, for any $E\in\mathcal{A}$ there exists an (at most countable) subset $B_1$ of $B$ such that
$$\Phi(R\setminus E) = (J\setminus\Phi(E))\cup B_1. \tag{2.8}$$
From (2.8) it is clear that $\Phi(R\setminus E)\in\mathcal{L}$, which in turn implies that $R\setminus E\in\mathcal{A}$, and consequently, $\mathcal{A}$ is closed under the taking of complements. Finally we check that if $\{E_n\}\subseteq\mathcal{A}$ and $E = \bigcup_n E_n$, then $E$ also belongs to $\mathcal{A}$. This is a simple consequence of the readily verified identity
$$\Phi\Bigl(\bigcup_n E_n\Bigr) = \bigcup_n\Phi(E_n).$$
Thus, $\mathcal{A}$ is the $\sigma$-algebra $\mathcal{B}_1$.
Let now $\mu$ be the set function given by
$$\mu(E) = |\Phi(E)|, \quad E\in\mathcal{B}_1. \tag{2.9}$$
Since for each Borel set $E$, $\Phi(E)\in\mathcal{L}$, $\mu$ is well-defined. We claim that $\mu$ is a Borel measure which is finite on the bounded sets of $R$. That $\mu(\emptyset) = 0$ is obvious. Next let $\{E_n\}$ be a sequence of pairwise disjoint Borel subsets of the line and let $E$ denote their union. In general the $\Phi(E_n)$'s are not pairwise disjoint, as there may be overlaps created by the intervals of constancy of $F$; this is not a serious inconvenience. First recall that since $B$ is at most countable, for any $A\in\mathcal{L}$ we have $|A\setminus B| = |A|$, cf. 3.2 in Chapter V. Next observe that the sets $\Phi(E_n)\setminus B$, $n = 1,2,\dots$, are pairwise disjoint and Lebesgue measurable. Thus
$$\mu(E) = |\Phi(E)| = \Bigl|\bigcup_n\Phi(E_n)\Bigr| = \Bigl|\Bigl(\bigcup_n\Phi(E_n)\Bigr)\setminus B\Bigr| = \Bigl|\bigcup_n(\Phi(E_n)\setminus B)\Bigr| = \sum_n|\Phi(E_n)\setminus B| = \sum_n|\Phi(E_n)| = \sum_n\mu(E_n).$$
Whence, $\mu$ is a Borel measure on the line. It only remains to show that $\mu$ is finite on bounded sets, and that $T\mu = F$, i.e., that (2.3) holds. Let $E\subseteq[-n,n]$ be a bounded Borel set. Since $\Phi(E)\subseteq\Phi([-n,n])$ we obtain
$$\mu(E) = |\Phi(E)|\le|\Phi([-n,n])|<\infty,$$
and $\mu$ is finite on bounded sets. Finally, let $I = (x,y]$. Since
$$\lim_{z\to x+}F(z-) = F(x+) = F(x),$$
by (2.5) it is apparent that either
$$\Phi(I) = (F(x),F(y)], \quad\text{or}\quad \Phi(I) = [F(x),F(y)].$$
In either case we have
$$\mu(I) = |\Phi(I)| = F(y)-F(x),$$
and (2.4) holds. It is now easy to see that (2.3) holds as well, with $c = F(0)$ there. •
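To see how (2.4) and (2.5) work on a concrete case, the following Python sketch evaluates the measure of intervals and points for one particular $F$. Everything specific here is an assumption made for illustration: the sample function `F` (linear on $[0,1)$, a jump of size 1 at $x = 1$, constant on $[1,2)$, linear afterwards), its hand-written left limit `F_left`, and the helper names are ours and do not come from the text.

```python
from fractions import Fraction

def F(x):
    """A sample nondecreasing, right-continuous function (assumed for this sketch)."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(x)
    if x < 2:
        return Fraction(2)      # jump of size 1 at x = 1, then constant on [1, 2)
    return Fraction(x)

def F_left(x):
    """The left-hand limit F(x-) of the sample function above."""
    if x <= 0:
        return Fraction(0)
    if x <= 1:
        return Fraction(x)
    if x <= 2:
        return Fraction(2)
    return Fraction(x)

def phi(x):
    """Endpoints of the closed interval Phi(x) = [F(x-), F(x)] of (2.5)."""
    return (F_left(x), F(x))

def mu_half_open(x, y):
    """mu((x, y]) = F(y) - F(x), as in (2.4)."""
    return F(y) - F(x)

def mu_closed(x, y):
    """mu([x, y]) = F(y) - F(x-)."""
    return F(y) - F_left(x)

def mu_open(x, y):
    """mu((x, y)) = F(y-) - F(x)."""
    return F_left(y) - F(x)

def mu_point(x):
    """mu({x}) = |Phi(x)| = F(x) - F(x-); positive exactly at the jumps of F."""
    lo, hi = phi(x)
    return hi - lo

print(mu_half_open(0, 1), mu_open(0, 1), mu_point(1), mu_half_open(1, 2))
```

In this example the interval $(0,1]$ carries mass 2 while $(0,1)$ carries mass 1, the difference being the atom at $x = 1$; the interval of constancy $(1,2]$ carries no mass.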
To emphasize the fact that $T\mu = F$, and that, consequently, $\mu$ and $F$ are related by (2.3), or (2.4), we denote $\mu$ by $\mu_F$; $\mu_F$ is called the Lebesgue-Stieltjes measure induced by $F$, and
$$\int_R f\,d\mu_F, \quad f \text{ Borel measurable},$$
is called the Lebesgue-Stieltjes integral of $f$ over $R$ with respect to $d\mu_F$.
Figure 4
The following is a natural question to ponder: If $F\in\mathcal{D}$ has a measurable derivative $F'$ a.e., under what circumstances is $d\mu_F(x) = F'(x)\,dx$? That equality does not always hold follows by considering the Cantor-Lebesgue function, which we construct next. Referring to the construction of the Cantor set and to the construction in Section 1 of Chapter VI, let $I^n_k$ be the $2^n-1$ open intervals whose union is $D_n$, and for $n = 1,2,\dots$, let $f_n$ be the continuous function defined on $I = [0,1]$ which satisfies $f_n(0) = 0$, $f_n(1) = 1$, $f_n(x) = k2^{-n}$ for $x\in I^n_k$, and which is linear on each interval of $C_n$. It is clear that each $f_n$ is monotone nondecreasing, and since the values only change on the $C_n$'s, that $f_n = f_{n+1}$ on $I^n_k$, $k = 1,\dots,2^n-1$, and
$$|f_n(x)-f_{n+1}(x)|\le 2^{-n}, \quad\text{all } x \text{ in } I.$$
This estimate allows us to show that the $f_n$'s converge uniformly on $I$. Indeed, for any $k<m$ we have
$$|f_m(x)-f_k(x)| = \Bigl|\sum_{n=k}^{m-1}(f_n(x)-f_{n+1}(x))\Bigr| \le \sum_{n=k}^{\infty}|f_n(x)-f_{n+1}(x)| \le \sum_{n=k}^{\infty}2^{-n} = 2^{-k+1}.$$
Thus, given $\varepsilon>0$, we may choose $N$ so large that
$$|f_m(x)-f_k(x)|\le\varepsilon, \quad m,k\ge N, \text{ all } x\in I,$$
and the sequence $\{f_n\}$ is uniformly Cauchy on $I$. Whence $\lim_{n\to\infty}f_n(x) = f(x)$ exists on $I$, and since $I$ is compact and the $f_n$'s continuous and nondecreasing, $f$ is also continuous and nondecreasing. Also note that $f(0) = 0$, $f(1) = 1$, and that $f$ is constant on every interval removed in the construction of $C$; $f$ is called the Cantor-Lebesgue function. Let now
$$F(x) = \begin{cases} 0 & \text{if } x<0 \\ f(x) & \text{if } 0\le x\le 1 \\ 1 & \text{if } x>1. \end{cases}$$
Clearly $F\in\mathcal{D}$, and since $F$ is not constant, $\mu_F$ is not the 0 measure. On the other hand it is apparent that $F'(x) = 0$ for $x\notin C$, that is a.e. Whence it readily follows that
$$\int_R d\mu_F = \mu_F([0,1]) = F(1)-F(0) = 1 \ne \int_R F'\,dx = 0.$$
It is also clear that for the Cantor-Lebesgue function $f$ the following is true:
$$\int_{[0,1]} f'\,dx = 0 \ne f(1)-f(0) = 1.$$
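The phenomenon can be checked numerically. The Python sketch below is an illustration only: it evaluates the Cantor-Lebesgue function through the ternary expansion of $x$ (a standard alternative to the limit of the $f_n$'s used above), and the names `cantor` and `cantor_stage` as well as the truncation depth are our assumptions. It then covers the Cantor set by the $2^{10}$ closed intervals of the tenth construction stage: their total length is $(2/3)^{10}$, yet the function still gains a total of 1 across them.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor-Lebesgue function on [0, 1], read off the ternary digits of x.
    Exact when a digit 1 occurs or the expansion terminates within `depth` digits;
    otherwise the value is truncated after `depth` digits."""
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # x lies in a removed middle third
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value

def cantor_stage(n):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

stage = cantor_stage(10)
total_length = sum(b - a for a, b in stage)
total_increase = sum(cantor(b) - cantor(a) for a, b in stage)
print(float(total_length), total_increase)
```

So all of the increase of $f$ takes place on a set that can be covered by intervals of arbitrarily small total length, which is why integrating $f'$ cannot recover it.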
This puzzling fact will be discussed in full detail in Chapter X, where we characterize those functions which coincide with the integral of their derivatives. We close this section with two interesting remarks; since the proofs follow along familiar lines we leave it to the reader to carry them out, cf. Section 3 in Chapter VII.
Theorem 2.2. Let $g$ be a bounded real-valued function defined on $I = [a,b]$, $f$ a nondecreasing right-continuous function defined on $I$, and put
$$F(t) = \begin{cases} f(a) & \text{if } t\le a \\ f(t) & \text{if } a<t<b \\ f(b) & \text{if } t\ge b. \end{cases}$$
Then, $g\in\mathcal{R}(I)$ iff
$$\mu_F(\{x\in I: g \text{ is not continuous at } x\}) = 0.$$
Furthermore, if $\int_a^b g\,df$ exists, then $g\in L(\mu_F)$, and
$$\int_a^b g\,df = \int_I g\,d\mu_F.$$
3. PROBLEMS AND QUESTIONS
3.1 What is the cardinal number of $\mathcal{B}_n$?
3.2 Suppose $\mu$ is a Borel measure on the line such that $\mu([0,1]) = 1$, and for each real $x$,
Show that $\mu$ coincides with the restriction of the Lebesgue measure to $\mathcal{B}_1$.
3.3 Suppose that $\mu$ is a probability Borel measure on $[0,1]$, and that for each Borel set $E\subset[0,1]$ with $|E| = 1/2$ we also have $\mu(E) = 1/2$. Does it follow that $\mu$ coincides with the restriction of the Lebesgue measure to $\mathcal{B}_1$?
3.4 Suppose that $\mu$ is a finite Borel measure on $R^n$, and that $A\subset R^n$ is closed. Show that $\varphi(x) = \mu(x+A)$ is upper semicontinuous, and consequently, measurable.
3.5 Let $\mu$ be a finite Borel measure on a bounded interval of the line such that for each real $x$, $\mu(\{x\}) = 0$. Show that given $\varepsilon>0$, there exists $\delta = \delta(\varepsilon)>0$ with the property that if $E\in\mathcal{B}_1$ and $\operatorname{diam}(E)<\delta$, then $\mu(E)<\varepsilon$.
3.6 Suppose $\mu$ is a nonzero Borel measure on the line which is finite on bounded sets. Show that if
then $\mu$ is a Dirac measure.
3.7 Let $\varphi$ be a nonnegative additive set function defined on $\mathcal{B}_n$, and suppose that for $E$ in $\mathcal{B}_n$ we have
$$\varphi(E) = \sup\{\varphi(K): K \text{ compact}, K\subseteq E\}.$$
Show that $\varphi$ is $\sigma$-additive, and hence a Borel measure.
3.8 Suppose $\mu$ is a regular Borel measure on $R^n$, and let $E\in\mathcal{B}_n$. Show that there exist a $G_\delta$ set $U$ and an $F_\sigma$ set $V$ such that
$$V\subseteq E\subseteq U, \quad \mu(U\setminus V) = 0.$$
3.9 Suppose $\mu$ is a regular Borel measure on $R^n$, and let $f$ be a nonnegative integrable function. Show that the set function
is also a regular Borel measure on $R^n$.
3.10 Suppose $\mu$, $\lambda$ are Borel measures on $R^n$ and let
$$(\mu\vee\lambda)(E) = \sup\{\mu(A)+\lambda(E\setminus A): A\subseteq E, A\in\mathcal{B}_n\},$$
and
$$(\mu\wedge\lambda)(E) = \inf\{\mu(A)+\lambda(E\setminus A): A\subseteq E, A\in\mathcal{B}_n\}.$$
Show that $\mu\vee\lambda$ and $\mu\wedge\lambda$ are Borel measures and that
$$(\mu\vee\lambda)(E)+(\mu\wedge\lambda)(E) = \mu(E)+\lambda(E), \quad\text{all } E\in\mathcal{B}_n.$$
If $\mu$ and $\lambda$ are regular, are $\mu\vee\lambda$ and $\mu\wedge\lambda$ also regular?
3.11 An atom of a Borel measure $\mu$ is a singleton $\{x\}$ such that $\mu(\{x\})>0$. Show that the number of atoms of a $\sigma$-finite Borel measure $\mu$ is at most countable.
3.12 Suppose $\mu_F$ is the Borel measure on the line induced by $F\in\mathcal{D}$. Show that
$$\mu_F(\{x\}) = 0 \text{ iff } F \text{ is continuous at } x.$$
Moreover, if $\{x\}$ is an atom for $\mu_F$, then we have
3.13 Show that a regular Borel measure $\mu$ on the line is a probability measure iff there exists $F\in\mathcal{D}$ such that $\mu = \mu_F$ and
$$\lim_{x\to-\infty}F(x) = 0, \quad\text{and}\quad \lim_{x\to\infty}F(x) = 1.$$
3.14 We say that a Borel measure $\mu_F$ is atomic, or discrete, if $\mu_F(E) = 0$ whenever $E\in\mathcal{B}_1$ does not contain any atom of $\mu_F$; in this case the associated distribution function $F$ is said to be discrete. On the other hand, if $F$ is continuous, or equivalently when $\mu_F$ has no atoms, we say that the Borel measure is continuous. Show that if $\mu$ is a regular Borel measure on the line, then there exist a discrete measure $\mu_d$ and a continuous measure $\mu_c$ such that $\mu = \mu_d+\mu_c$. Is this decomposition unique?
3.15 Suppose $\mu$ is a regular Borel measure on the line with no atoms, and let $0<\eta<\mu(R)$. Show that there exists $E\in\mathcal{B}_1$ such that $\mu(E) = \eta$.
3.16 Let $\mu$ be a regular Borel measure on $R^n$. Show that there exists a unique closed subset $C$ of $R^n$ with the following two properties: (a) $\mu(R^n\setminus C) = 0$, and (b) If $O$ is an open set such that $C\cap O\ne\emptyset$, then $\mu(C\cap O)\ne 0$. This closed set $C$ is called the support of $\mu$, and we also say that $\mu$ is supported in $C$, and denote this relation by $\operatorname{supp}\mu = C$. What is the support of the Dirac $\delta_x$ measure? Given a compact subset $K$ of $R$, construct a measure $\mu$ such that $\operatorname{supp}\mu = K$.
3.17 Show that
$$\operatorname{supp}(\mu\vee\lambda) = \operatorname{supp}\mu\cup\operatorname{supp}\lambda,$$
and that
$$\operatorname{supp}(\mu\wedge\lambda)\subseteq\operatorname{supp}\mu\cap\operatorname{supp}\lambda. \tag{3.1}$$
By means of an example show that the inclusion in (3.1) may be proper.
3.18 Suppose $\mu$ is a regular Borel measure on the plane such that for any horizontal or vertical line $L$ we have $\mu(L) = 0$. Show that the function $\varphi(x) = \mu(I(x,1))$, $x\in R^2$, is continuous. Can you think of n-dimensional extensions?
3.19 Let $\mu$ be a finite Borel measure on the plane, and suppose that for any line $L$ we have $\mu(L) = 0$. Show that if $E\in\mathcal{B}_2$ and $0<\eta<\mu(E)$, there is a Borel set $A\subset E$ such that $\mu(A) = \eta$.
3.20 Suppose $\mu_F$ is the Borel measure induced by the distribution function $F\in\mathcal{D}$ given by
F(x) =
o { x2
ifx
<
ifx~1.
2
Find the measures $\mu_{F,c}$, $\mu_{F,d}$ associated to $\mu_F$ as in 3.14 above, and compute
$$\int x\,d\mu_F(x).$$
3.21 Suppose $F\in\mathcal{D}$ satisfies: $F(x) = 0$ for $x\le 0$, and $F(x) = 1$ for $x\ge 1$. Compare
$$\int_{[0,1]}(1-F(x))\,dx \quad\text{and}\quad \int_R x\,d\mu_F(x).$$
3.22 Given a distribution function $F$ such that $\lim_{x\to-\infty}F(x) = 0$ and $\lim_{x\to\infty}F(x) = 1$, show that $\int_{-\infty}^{\infty}|x|\,d\mu_F(x)<\infty$ iff the integrals $\int_{(-\infty,0]}F(x)\,dx$ and $\int_{[0,\infty)}(1-F(x))\,dx$ are finite.
3.23 Show that if $F\in\mathcal{D}$ is odd, then for any $g\in C_0(R)$ we have
$$\int_R g(x)\,d\mu_F(x) = \int_R g(-x)\,d\mu_F(x).$$
3.24 (Change of Variable). Let $\lambda$ be a finite Borel measure on $\mathcal{B}_n$, $T$ a continuous real-valued function defined on $R^n$, and set
$$F(x) = \lambda(T^{-1}((-\infty,x])), \quad x\in R.$$
Show that $F\in\mathcal{D}$, and that for any $g\in C_0(R)$ we have
$$\int_{R^n} g\circ T\,d\lambda = \int_R g\,d\mu_F.$$
In what follows we assume that the distribution functions $F\in\mathcal{D}$ we work with are normalized so that
$$\lim_{x\to-\infty}F(x) = 0, \quad\text{and}\quad \lim_{x\to\infty}F(x) = 1.$$
3.25 Given $F, F_n\in\mathcal{D}$, $n = 1,2,\dots$ we say that $F_n$ converges weakly to $F$ if at each point $x$ at which $F$ is continuous we have
$$\lim_{n\to\infty}F_n(x) = F(x).$$
Prove that $F_n$ converges weakly to $F$ iff
$$\lim_{n\to\infty}\mu_{F_n}((-\infty,x]) = \mu_F((-\infty,x]),$$
at every point $x$ for which $\mu_F(\{x\}) = 0$.
3.26 Among other reasons, the notion of weak convergence is important in Probability because of the following approximation property: For every $F\in\mathcal{D}$ there exists a sequence $\{F_n\}\subseteq\mathcal{D}$ such that (i) $F_n$ converges weakly to $F$. (ii) $F_n$ is continuous everywhere on $R$. (iii) $F_n$ is constant on each interval of the form $((k-1)/n, k/n]$, $k = 0,\pm 1,\pm 2,\dots$
3.27 The notion of weak convergence corresponds to a "metric" convergence. The Levy distance $d(F,G)$ between $F, G\in\mathcal{D}$ is defined as the infimum of those $\varepsilon>0$ for which
$$G(x-\varepsilon)-\varepsilon\le F(x)\le G(x+\varepsilon)+\varepsilon, \quad\text{all } x\in R.$$
Prove that a necessary and sufficient condition for $F_n$ to converge weakly to $F$ is that $d(F_n,F)\to 0$.
3.28 Show that if $F\in\mathcal{D}$ is everywhere continuous, then $F$ is uniformly continuous.
3.29 Given a real-valued function $G$ defined on $R$, we define its modulus of continuity $\omega(G,\varepsilon)$ by the expression
$$\omega(G,\varepsilon) = \sup\{G(x)-G(y): |x-y|\le\varepsilon\}.$$
Show that if $F, G\in\mathcal{D}$, and $d(F,G)<\varepsilon$, then we have $\sup|F(x)-G(x)|\le\varepsilon+\omega(F,\varepsilon)$. As a consequence of this prove that if $F_n$ converges weakly to an everywhere continuous distribution $F$, then $F_n$ converges uniformly to $F$. Is this statement true if $F$ is only continuous on a bounded closed interval?
3.30 Suppose $\mu, \mu_n\in BB$, $\mu(R) = \mu_n(R) = 1$ for all $n$. We say that $\mu_n$ converges weakly to $\mu$ if
$$\lim_{n\to\infty}\mu_n((-\infty,x]) = \mu((-\infty,x])$$
at every point $x$ at which $\mu(\{x\}) = 0$. For instance, the statements $F_n$ converges weakly to $F$ and $\mu_{F_n}$ converges weakly to $\mu_F$ are only different expressions of the same fact. Prove that if $\mu_n$ is the Dirac measure at $x_n$, and $\mu$ the Dirac measure at $x$, then $\mu_n$ converges weakly to $\mu$ iff $x_n\to x$.
3.31 Let $\mu, \mu_n$ be Borel measures on the line, $n = 1,2,\dots$ Then the following conditions are equivalent:
(i) $\mu_n$ converges weakly to $\mu$.
(ii) For every $f\in C_0(R)$ we have
$$\lim_{n\to\infty}\int_R f\,d\mu_n = \int_R f\,d\mu.$$
(iii) For every Borel set $E$ with the property that its boundary $\partial E = \overline{E}\cap\overline{(R\setminus E)}$ is $\mu$-null, we have
$$\lim_{n\to\infty}\mu_n(E) = \mu(E).$$
3.32 If $f$ denotes the Cantor-Lebesgue function in $I = [0,1]$ and $x\in C$ is of the form $x = \sum_{n=1}^{\infty}2a_n/3^n$, where each $a_n = 0$ or 1, show that $f(x) = \sum_{n=1}^{\infty}a_n/2^n$. Also compute
$$\int_I x^n f(x)\,dx, \quad n = 0,1,2.$$
CHAPTER
X
Absolute Continuity
In this chapter we discuss the class of absolutely continuous functions, namely, those functions which may be recovered by integrating their derivatives.
1. VITALI'S COVERING LEMMA
In dealing with the question of whether the indefinite integral of $f$ differentiates to $f(x)$ it was essential to handle families of intervals. The same is true in general problems of differentiation, or any other area of Analysis where intervals are sorted out. For the problem at hand this process is carried out by means of Vitali's covering lemma; first a definition. A family $\mathcal{V}$ of closed intervals of $R$ is said to be a covering of $E$ in the sense of Vitali if for any $x\in E$ and $\varepsilon>0$ there is an interval $I$ in $\mathcal{V}$ which contains $x$ and so that $|I|\le\varepsilon$. In other words, every point of $E$ belongs to arbitrarily small intervals of $\mathcal{V}$. Such coverings satisfy the following remarkable property.
Theorem 1.1 (Vitali's Covering Lemma). Suppose $E$ is a subset of the line with $|E|_e<\infty$ and $\mathcal{V}$ is a covering of $E$ in the sense of Vitali. Then there exists an at most countable family $\{I_k\}$ of pairwise disjoint intervals of $\mathcal{V}$ such that
$$\Bigl|E\setminus\bigcup_k I_k\Bigr| = 0. \tag{1.1}$$
Proof. Let $O$ be an open set of finite measure which contains $E$, and discard from $\mathcal{V}$ those intervals that are not totally contained in $O$. It is clear that this new family, which we call $\mathcal{V}$ again for simplicity, is also a Vitali covering of $E$.
Having done this, pick an interval $I_1$, say, of $\mathcal{V}$. If $|E\setminus I_1| = 0$ we are done; otherwise we choose recursively a family of intervals of $\mathcal{V}$ according to the following rule: Suppose that the pairwise disjoint intervals $I_1,\dots,I_n$ of $\mathcal{V}$ have been chosen and that $|E\setminus\bigcup_{k=1}^n I_k|_e>0$. Consider then the open set
$$G_n = O\setminus\bigcup_{k=1}^n I_k\ne\emptyset,$$
and the class of those intervals of $\mathcal{V}$ totally contained in $G_n$; the idea is to select $I_{n+1}$ as a largest interval in this class. Thus, if
$$k_n = \sup\{|I|: I\in\mathcal{V} \text{ and } I\subset G_n\}>0,$$
let $I_{n+1}$ be any interval of $\mathcal{V}$ contained in $G_n$ such that
$$|I_{n+1}|>k_n/2;$$
by construction it is clear that $I_{n+1}\cap(I_1\cup\dots\cup I_n) = \emptyset$. Either the selection process stops after a finite number of steps, and if this is the case we have finished, or else there exists a pairwise disjoint sequence $\{I_k\}$ of intervals of $\mathcal{V}$ such that
$$\bigcup_k I_k\subseteq O, \quad\text{and}\quad \sum_k|I_k|\le|O|<\infty.$$
In this case, given $\eta>0$, we may find $N$ such that $\sum_{k=N+1}^{\infty}|I_k|<\eta$. We consider $R_N = E\setminus\bigcup_{k=1}^N I_k$ and estimate its Lebesgue outer measure in terms of $\eta$. Since each $x$ in $R_N$ belongs to the open set $G_N$, by assumption there is an interval $I\in\mathcal{V}$ containing $x$ such that $I\cap(I_1\cup\dots\cup I_N) = \emptyset$. We claim that there is an index $n>N$ so that $I\cap I_n\ne\emptyset$. Indeed, if $I\in\mathcal{V}$ and for all $m$ we have $I\cap I_m = \emptyset$, it follows that
$$|I|\le k_m<2|I_{m+1}|\to 0, \quad\text{as } m\to\infty,$$
which is impossible. Let $n$ be the smallest index so that $I\cap I_{n+1}\ne\emptyset$; clearly $n\ge N$. Furthermore, since $I$ is disjoint from $I_1,\dots,I_n$, by the way the $I_n$'s were selected we have $|I|\le k_n<2|I_{n+1}|$, and by simple geometric considerations we obtain
$$d(x,\text{midpoint of } I_{n+1})\le|I|+|I_{n+1}|/2<2|I_{n+1}|+|I_{n+1}|/2 = 5|I_{n+1}|/2.$$
Let $J_{n+1}$ denote the interval concentric with $I_{n+1}$ with sidelength 5 times that of $I_{n+1}$. By the above estimate it follows that $x\in J_{n+1}$ and consequently, $R_N\subseteq\bigcup_{k=N+1}^{\infty}J_k$. Thus
$$|R_N|_e\le\sum_{k=N+1}^{\infty}|J_k| = 5\sum_{k=N+1}^{\infty}|I_k|<5\eta. \tag{1.2}$$
Whence by (1.2) we have
$$\Bigl|E\setminus\bigcup_{k=1}^{\infty}I_k\Bigr|_e\le\Bigl|E\setminus\bigcup_{k=1}^{N}I_k\Bigr|_e<5\eta,$$
and, since $\eta$ is arbitrary, (1.1) holds.
•
Corollary 1.2. Under the assumptions of Theorem 1.1, given $\varepsilon>0$, there exists a finite family $I_1,\dots,I_N$ of pairwise disjoint intervals of $\mathcal{V}$ such that
$$\Bigl|E\setminus\bigcup_{k=1}^{N}I_k\Bigr|_e<\varepsilon. \tag{1.3}$$
Proof. Pick $\eta = \varepsilon/5$ in the proof of Theorem 1.1; then (1.2) gives the desired conclusion. •
Note that whereas the validity of Corollary 1.2 requires that $|E|_e<\infty$, the conclusion of the Vitali covering lemma is true for an arbitrary subset $E$ of $R$.
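The selection rule in the proof of Theorem 1.1 has a finite analogue that is easy to experiment with. The Python sketch below is an illustration under assumptions of ours: the family is a finite, randomly generated list of closed intervals (not a Vitali covering), and at each step we take a longest interval disjoint from those already chosen, which in particular satisfies the requirement $|I_{n+1}|>k_n/2$. The function names and the test family are hypothetical. The check at the end mirrors the geometric step of the proof: every interval of the family lies inside the 5-fold concentric enlargement of some chosen interval.

```python
import random

def greedy_disjoint(intervals):
    """Select pairwise disjoint closed intervals, always taking a longest one
    among those that miss everything chosen so far."""
    chosen = []
    for a, b in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def enlarge(interval, factor=5):
    """Interval concentric with `interval` and `factor` times as long."""
    a, b = interval
    mid, half = (a + b) / 2, factor * (b - a) / 2
    return (mid - half, mid + half)

random.seed(0)
family = []
for _ in range(200):
    left = random.uniform(0, 10)
    family.append((left, left + random.uniform(0.01, 1.0)))

chosen = greedy_disjoint(family)
enlarged = [enlarge(iv) for iv in chosen]

# Each interval of the family meets a chosen interval at least as long as itself,
# hence it is contained in the corresponding enlargement.
covered = all(any(c <= a and b <= d for c, d in enlarged) for a, b in family)
print(len(chosen), covered)
```

Because this sketch picks an exact maximum rather than an interval merely longer than half the supremum, a 3-fold enlargement would already suffice here; the factor 5 is kept to match the proof.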
2. DIFFERENTIABILITY OF MONOTONE FUNCTIONS
Suppose $f$ is a real-valued function defined on $I = (a,b)$, and for $x\in I$ and $h\ne 0$ with $x+h\in I$ put
$$Df(x,h) = \frac{f(x+h)-f(x)}{h}.$$
Whether $f$ is differentiable at $x\in I$, or has a one-sided derivative at $x$, or not, the following four quantities, called the Dini numbers of $f$ at $x$, are well-defined:
$$D^+f(x) = \limsup_{h\to 0+}Df(x,h), \quad D_+f(x) = \liminf_{h\to 0+}Df(x,h),$$
$$D^-f(x) = \limsup_{h\to 0-}Df(x,h), \quad D_-f(x) = \liminf_{h\to 0-}Df(x,h).$$
Clearly $D_+f(x)\le D^+f(x)$ and $D_-f(x)\le D^-f(x)$, and $f'(x)$ exists iff all four Dini numbers of $f$ at $x$ are equal. The stage is now set for
Theorem 2.1 (Lebesgue). Let $I$ be an open subinterval of the line and suppose $f$ is a monotone real-valued function defined on $I$. Then $f'$ exists a.e. on $I$.
Proof. We may assume that $f$ is nondecreasing, and consider first the case when $I$ is bounded. We will be done once we show that
$$D_-f\le D^-f\le D_+f\le D^+f\le D_-f, \quad\text{a.e. on } I, \tag{2.1}$$
for then all the Dini numbers of $f$ are equal at those $x$'s where (2.1) holds, and $f'$ exists a.e. on $I$. As noted above, the first and third inequalities in (2.1) are always true, so we only need to establish the second and fourth inequalities there. Now, this amounts to showing that the (bad) sets
$$B = \{x\in I: D^+f(x)>D_-f(x)\} \quad\text{and}\quad B' = \{x\in I: D^-f(x)>D_+f(x)\}$$
are null. Since the proof for both sets follows along similar lines we only consider $B$. First observe that all Dini numbers are nonnegative, and if for rational numbers $u>v>0$ we put
$$B_{u,v} = \{x\in I: D^+f(x)>u>v>D_-f(x)\},$$
then we have $B = \bigcup_{u,v}B_{u,v}$. Thus the desired conclusion will follow once we show that each of the $B_{u,v}$'s is null. So we suppose that $|B_{u,v}|_e = \eta$, and show that $\eta = 0$. The idea of the proof is to approximate $B_{u,v}$ by a simpler set consisting of pairwise disjoint intervals (here we use the fact that $D_-f<v$ and the Vitali covering lemma) and then to further approximate the part of $B_{u,v}$ which lies within those intervals by another family of intervals (here we use the fact that $D^+f>u$ and Vitali's covering lemma again). First observe that since $B_{u,v}\subseteq I$ we have $\eta<\infty$, and, by (1.8) in Chapter V, given $\varepsilon>0$, there exists an open set $O\supseteq B_{u,v}$ such that
$$|O|\le\eta+\varepsilon.$$
Moreover, since for each $x$ in $B_{u,v}$ we have $D_-f(x)<v$, there exists a sequence $h_{x,n}>0$ approaching 0 such that the intervals $[x-h_{x,n},x]\subset O$ and
$$f(x)-f(x-h_{x,n})<vh_{x,n}, \quad\text{all } n. \tag{2.2}$$
Clearly $\mathcal{V} = \{[x-h_{x,n},x]\}$ is a covering of $B_{u,v}$ in the sense of Vitali and consequently, by Corollary 1.2 there is a finite collection $I_1 = [x_1-h_1,x_1],\dots,I_n = [x_n-h_n,x_n]$, say, of pairwise disjoint intervals of $\mathcal{V}$ such that
$$\Bigl|B_{u,v}\setminus\bigcup_{j=1}^n I_j\Bigr|_e<\varepsilon. \tag{2.3}$$
Next let $I_j^{\circ}$ denote the interior of $I_j$, $1\le j\le n$, and observe that since for each $x$ in $B'_{u,v} = B_{u,v}\cap\bigl(\bigcup_{j=1}^n I_j^{\circ}\bigr)$ we have $D^+f(x)>u$, there is a sequence $k_{x,m}>0$ tending to 0 such that $[x,x+k_{x,m}]\subseteq I_j$ for some $1\le j\le n$, and
$$f(x+k_{x,m})-f(x)>uk_{x,m}, \quad\text{all } m. \tag{2.4}$$
Since $\mathcal{V}_1 = \{[x,x+k_{x,m}]\}$ is a covering of $B'_{u,v}$ in the sense of Vitali, we can find a finite collection $J_1 = [x'_1,x'_1+k_1],\dots,J_m = [x'_m,x'_m+k_m]$, say, of pairwise disjoint intervals of $\mathcal{V}_1$ with the property that
$$\Bigl|B'_{u,v}\setminus\bigcup_{i=1}^m J_i\Bigr|_e<\varepsilon. \tag{2.5}$$
Let now
$$0\le\Delta_J = \sum_{i=1}^m\bigl(f(x'_i+k_i)-f(x'_i)\bigr)$$
denote the increase of $f$ along the $J_i$'s, and, similarly, let $\Delta_I$ denote the increase of $f$ along the $I_j$'s. It is not hard to check that
$$\Delta_J\le\Delta_I. \tag{2.6}$$
Indeed, suppose that $J_1,\dots,J_{m_1}$ are ordered from left to right and are contained in $I_1$; that $J_{m_1+1},\dots,J_{m_2}$ are ordered from left to right and are contained in $I_2$, and so on. Since $\{J_i\}_{i=1}^{m_1}$ is a pairwise disjoint collection of intervals contained in $I_1$ it readily follows that
$$\sum_{i=1}^{m_1}\bigl(f(x'_i+k_i)-f(x'_i)\bigr) = f(x'_{m_1}+k_{m_1})-\cdots-\bigl(f(x'_2)-f(x'_1+k_1)\bigr)-f(x'_1)\le f(x'_{m_1}+k_{m_1})-f(x'_1)\le f(x_1)-f(x_1-h_1).$$
Whence, by adding up the increase of $f$ along these blocks of $J_i$'s it follows that (2.6) holds. Now, by (2.2), and since the $I_j$'s are all contained in $O$, it is clear that
$$\Delta_I<\sum_{j=1}^n v|I_j|\le v|O|\le v(\eta+\varepsilon). \tag{2.7}$$
On the other hand, by (2.4) we get
$$\Delta_J>u\sum_{i=1}^m|J_i|, \tag{2.8}$$
and consequently, we need a lower bound for the right-hand side of (2.8). Since $B_{u,v}\subseteq\bigl(B_{u,v}\setminus\bigcup_{j=1}^n I_j\bigr)\cup B'_{u,v}\cup N$, where $N = \{$endpoints of the $I_j$'s$\}$ is a finite set, it is clear that
$$B_{u,v}\subseteq\Bigl(B_{u,v}\setminus\bigcup_{j=1}^n I_j\Bigr)\cup\Bigl(B'_{u,v}\setminus\bigcup_{i=1}^m J_i\Bigr)\cup\Bigl(\bigcup_{i=1}^m J_i\Bigr)\cup N.$$
Thus, on account of (2.3) and (2.5) it readily follows that
$$\eta = |B_{u,v}|_e\le\varepsilon+\varepsilon+\sum_{i=1}^m|J_i|, \quad\text{or}\quad \sum_{i=1}^m|J_i|\ge\eta-2\varepsilon.$$
Substituting this estimate in (2.8), and combining it with (2.6) and (2.7), we see that
$$(\eta-2\varepsilon)u<\Delta_J\le\Delta_I<v(\eta+\varepsilon), \quad \varepsilon>0. \tag{2.9}$$
Since $\varepsilon$ in (2.9) is arbitrary, we also have $\eta u\le\eta v$, and since $0<v<u$, this can only hold if, as asserted, $\eta = 0$. This completes the proof when $I$ is bounded. On the other hand, if the interval $I$ is unbounded, observe that
$$I = \bigcup_k I_k,$$
say, where $I_k$ is a bounded open interval, $k = 1,2,\dots$ Thus $f'$ exists a.e. on each $I_k$, and consequently also a.e. on $I$. •
Not only does the derivative of a monotone function exist a.e., but it also is integrable. More precisely, we have Lebesgue's theorem,
Theorem 2.2. Suppose $f$ is a nondecreasing real-valued function defined on $I = (a,b)$. Then $f'\in L(I)$ and
$$\int_I f'\,dx\le f(b-)-f(a+). \tag{2.10}$$
Proof. Extend $f$ to $R$ by setting $f(x) = f(a+)$ if $x\le a$, and $f(x) = f(b-)$ if $x\ge b$. Put now
$$f_n(x) = n\bigl(f(x+1/n)-f(x)\bigr), \quad x\in R, \quad n = 1,2,\dots,$$
and observe that since by Theorem 2.1 $f'$ exists a.e., we have
$$\lim_{n\to\infty}f_n = f' \quad\text{a.e. on } I.$$
Thus, by Fatou's Lemma, it follows that
$$\int_I f'\,dx\le\liminf\int_I f_n\,dx. \tag{2.11}$$
It is rather straightforward to compute the integral on the right-hand side of (2.11). Indeed, by 4.6 in Chapter VII, and with $n$ sufficiently large, we have
$$\int_I f_n\,dx = n\int_I f(x + 1/n)\,dx - n\int_I f\,dx = n\int_{[b,\,b+1/n)} f\,dx - n\int_{(a,\,a+1/n]} f\,dx = A + B,$$
say. Clearly $A = f(b^-)$, and $B \le -f(a^+)$. Whence, for all sufficiently large $n$ the integral in question is dominated by $f(b^-) - f(a^+)$, and (2.10) holds. •
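The computation of the integral of $f_n$ in the proof above can be checked exactly on a concrete case. The following sketch is only an illustration under stated assumptions: the sample function $f(x) = x^2$ on $I = (0,1)$, its constant extension outside $I$, and the helper names `G` and `integral_fn` are choices made here and do not come from the text.

```python
from fractions import Fraction

def G(x):
    """Antiderivative of the extended f (assumed example):
    f = 0 for x <= 0, f(x) = x**2 on [0, 1], f = 1 for x >= 1."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x <= 1:
        return x**3 / 3
    return Fraction(1, 3) + (x - 1)

def integral_fn(n):
    """Exact integral over (0, 1) of f_n(x) = n * (f(x + 1/n) - f(x))."""
    h = Fraction(1, n)
    shifted = G(1 + h) - G(h)   # integral of f(x + 1/n) over (0, 1)
    plain = G(1) - G(0)         # integral of f over (0, 1)
    return n * (shifted - plain)

bound = Fraction(1)  # f(1-) - f(0+) for this particular f

for n in (1, 2, 3, 10, 100):
    value = integral_fn(n)
    assert value <= bound       # the domination used to obtain (2.10)
    print(n, value)
```

For this smooth $f$ the values increase to the bound, which is consistent with equality in (2.10) for this example; a function with a jump would leave a gap instead.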
Corollary 2.3. Suppose $f$ is BV on a bounded interval $I = [a,b]$. Then $f' \in L(I)$ and
$$\int_I |f'|\,dx \le V(f; a, b). \qquad (2.12)$$
Our aim now is to discover when equality holds in (2.10); for this we need the concept of absolutely continuous functions.
3. ABSOLUTELY CONTINUOUS FUNCTIONS

Absolutely continuous, or AC, functions were introduced in 3.6 in Chapter VIII. They are continuous functions whose increment along any collection of pairwise disjoint intervals of sufficiently small total length is arbitrarily small. This concept excludes the Cantor-Lebesgue function $f$ which, although locally constant on a subset of $[0,1]$ of full measure, nevertheless increases from 0 to 1 there. To see this we cover the Cantor set $C$ by a union $\bigcup_n (a_n, b_n)$, say, of pairwise disjoint open intervals with $\sum_n (b_n - a_n)$ arbitrarily small. Extend $f$ so that $f(x) = 0$ for $x < 0$ and $f(x) = 1$ for $x > 1$. Then it is not hard to verify that $\sum_n (f(b_n) - f(a_n)) = 1$, and consequently, $\sum_{n=1}^{N} (f(b_n) - f(a_n)) > 1/2$ for sufficiently large $N$, while at the same time $\sum_{n=1}^{N} (b_n - a_n)$ is arbitrarily small.

So, which among the continuous functions are AC, and what properties do AC functions satisfy?

Proposition 3.1. Let $I = [a,b]$ and suppose $f$ is AC on $I$. Then $f$ is BV on $I$, and consequently, by Corollary 2.3, $f'$ exists a.e. and it is integrable there.
Proof. Let $\delta$ be the real number that corresponds to the choice $\varepsilon = 1$ in the AC definition of $f$, and let the integer $N > (b - a)/\delta$. Note that, in particular, we have
$$V(f; x, x + \eta) \le 1, \quad \text{any } x \in I,\ 0 < \eta \le \delta. \qquad (3.1)$$
The idea now is to use (3.1) to put together the estimates along the partitions of $I$. So, let $P = \{a = x_0 < \dots < x_n = b\}$ be a partition of $I$ and let $P'$ be the partition of $I$ obtained by adjoining the points $a + (b - a)/N,\ a + 2(b - a)/N, \dots, b$ to $P$. Since $P'$ is finer than $P$ it readily follows that
$$\sum_{\text{over } P} |\Delta_k f| \le \sum_{\text{over } P'} |\Delta_k f|. \qquad (3.2)$$
It is not hard to estimate the right-hand side of (3.2); indeed, by (3.1) it does not exceed
$$V(f; a, a + (b - a)/N) + \dots + V(f; a + (N - 1)(b - a)/N, b) \le N.$$
Since $P$ is arbitrary, by (3.2) it follows that $V(f; a, b) \le N$ and $f$ is BV on $I$. •
So, AC functions are continuous and BV, but is the converse to this statement true? It is partially true, and to discuss it we need some preliminary results.
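The Cantor-Lebesgue function discussed at the beginning of this section is continuous and nondecreasing, hence BV, yet it is not AC. The sketch below illustrates the covering argument with exact rational arithmetic. It makes a simplifying assumption: instead of open intervals it uses the $2^n$ closed intervals left after $n$ steps of the middle-thirds construction, which cover the Cantor set and have total length $(2/3)^n$. The function names are hypothetical.

```python
from fractions import Fraction

def cantor_intervals(n):
    """Closed intervals remaining after n steps of the middle-thirds construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

def cantor_function(x, depth=64):
    """Cantor-Lebesgue function, exact at endpoints of construction intervals
    of stage at most `depth` (read off the ternary digits of x)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit (x is nonnegative)
        x -= digit
        if digit == 1:          # x lies in a removed middle third
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
        if x == 0:
            break
    return value

for n in (1, 3, 5, 10):
    cover = cantor_intervals(n)
    total_length = sum(b - a for a, b in cover)
    increment = sum(cantor_function(b) - cantor_function(a) for a, b in cover)
    print(n, total_length, increment)
```

The total length of the covering tends to 0 while the sum of the increments stays equal to 1, so no $\delta$ can work for $\varepsilon < 1$ in the AC definition.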
Lemma 3.2. Suppose $A \subseteq I = [a,b]$, and let $f$ be a real-valued function defined on $I$ such that
$$|f'(x)| \le M, \quad x \in A.$$
Then
$$|f(A)|_e \le M|A|_e. \qquad (3.3)$$
Proof. Given $\varepsilon > 0$, let $O$ be an open set such that
$$|O| \le |A|_e + \varepsilon, \quad O \supseteq A. \qquad (3.4)$$
Break up $A$ into two disjoint parts, $A_1 = \{x \in A : f \text{ is constant in a neighbourhood of } x\}$, and $A_2 = \{x \in A : f \text{ is not constant in any neighbourhood of } x\}$, say. We construct a covering $\mathcal{V}$ of $f(A)$ in the sense of Vitali as follows: If $f(x) \in f(A)$ and $x \in A_1$, then there is an interval $J \subseteq O$ such
that $f(x) = f(y)$ for all $y \in J$. Now, if $I(w, w')$ denotes the closed interval with endpoints $w$ and $w'$ (note that $w' < w$ is possible), then we can find $h$ so that $I(x, x + Mh) \subset O$ and $f$ is constant on $I(x, x + Mh)$. Such values $f(x)$ are then assigned the intervals $I(f(x), f(x) + Mb)$, where $b$ satisfies
$$I(x, x + Mb) \subseteq I(x, x + Mh).$$
On the other hand, if $f(x) \in f(A)$ and $x \in A_2$, then there is a sequence $h_{x,n} \ne 0$, $n = 1, 2, \dots$, converging to 0, such that $I(x, x + h_{x,n}) \subset O$ for all $n$, and
$$0 < |f(x + h_{x,n}) - f(x)| \le (M + \varepsilon)|h_{x,n}|, \quad n = 1, 2, \dots$$
To these values $f(x)$ we assign the intervals $I(f(x), f(x + h_{x,n}))$, $n = 1, 2, \dots$ Now, the collection $\mathcal{V}$ of all the intervals introduced above is a covering of $f(A)$ in the sense of Vitali and consequently, there is an at most countable family consisting of pairwise disjoint intervals $I_1, \dots, I_k, \dots$, say, such that
$$\Bigl| f(A) \setminus \bigcup_k I_k \Bigr|_e = 0.$$
Whence, we also have
$$|f(A)|_e \le \sum_k |I_k|. \qquad (3.5)$$
To estimate the right-hand side of (3.5) we separate the $I_k$'s into two families: Those that correspond to $f(x)$ with $x \in A_1$, call them $I_k^1$'s, and those corresponding to $f(x)$ with $x \in A_2$, call them $I_k^2$'s. Furthermore, if $I_k^1 = I(f(x_k), f(x_k) + Mb_k)$, let $J_k^1 = I(x_k, x_k + b_k)$, and if $I_k^2 = I(f(x_k), f(x_k + h_k))$, let $J_k^2 = I(x_k, x_k + h_k)$. Since the $I_k$'s are pairwise disjoint, so are the $J_k$'s, and they are also contained in $O$. Consequently, by (3.4) we have
$$\sum |I_k| = \sum |I_k^1| + \sum |I_k^2| \le \sum M|b_k| + \sum (M + \varepsilon)|h_k| = M \sum |J_k^1| + (M + \varepsilon) \sum |J_k^2| \le (M + \varepsilon) \sum |J_k| \le (M + \varepsilon)|O| \le (M + \varepsilon)(|A|_e + \varepsilon),$$
which substituted into (3.5) gives $|f(A)|_e \le (M + \varepsilon)(|A|_e + \varepsilon)$. Moreover, since $\varepsilon$ is arbitrary, the above inequality is also true with $\varepsilon = 0$, and (3.3) holds. •

An interesting consequence of Lemma 3.2 is
Lemma 3.3. Suppose $f$ is a real-valued function defined on $I = [a,b]$, and let $A$ be a measurable subset of $I$ so that $f'(x)$ exists everywhere on $A$ and is measurable there. Then
$$|f(A)|_e \le \int_A |f'|\,dx. \qquad (3.6)$$
Proof. First suppose that for some integer $M$ we have $|f'(x)| \le M$ on $A$ and consider the level sets
$$A_{k,n} = \{x \in A : (k - 1)/2^n \le |f'(x)| < k/2^n\}, \quad k = 1, \dots, M2^n,\ n = 1, 2, \dots$$
Since for each $n$ we have $A = \bigcup_k A_{k,n}$, by Lemma 3.2 and Chebychev's inequality it readily follows that
$$|f(A)|_e \le \sum_k |f(A_{k,n})|_e \le \sum_k (k/2^n)|A_{k,n}| = \sum_k ((k - 1)/2^n)|A_{k,n}| + \frac{1}{2^n} \sum_k |A_{k,n}| \le \sum_k \int_{A_{k,n}} |f'|\,dx + \frac{1}{2^n}|A| \le \int_A |f'|\,dx + \frac{1}{2^n}|A|.$$
Since this estimate holds for every $n$, (3.6) is true in this case. As for the general case, note that
$$A = \bigcup_{k=1}^{\infty} \{x \in A : k - 1 \le |f'| < k\} = \bigcup_{k=1}^{\infty} A_k,$$
say, where the $A_k$'s are pairwise disjoint and $|f'|$ exists and is bounded and measurable on each $A_k$. Then, by the first part of the proof we have
$$|f(A)|_e \le \sum_k |f(A_k)|_e \le \sum_k \int_{A_k} |f'|\,dx = \int_A |f'|\,dx,$$
and (3.6) holds. •
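A quick worked instance may help fix the meaning of (3.6). The choice $f(x) = x^2$ and the two sets below are illustrative assumptions, not examples taken from the text.

```latex
% Assumed example: f(x) = x^2, so |f'(x)| = 2|x|.
% Case 1: A = [0,1]. Here f is one-to-one on A and equality holds in (3.6).
\[
  |f(A)| = |[0,1]| = 1, \qquad \int_{[0,1]} 2|x| \, dx = 1 .
\]
% Case 2: A = [-1,1]. Here f covers [0,1] twice and (3.6) is strict.
\[
  |f(A)| = |[0,1]| = 1 \;<\; \int_{[-1,1]} 2|x| \, dx = 2 .
\]
```

The second case suggests why (3.6) is only an inequality: the integral on the right counts the values of $f$ with multiplicity, while the measure of the image does not.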
We are now ready for
Theorem 3.4 (Banach-Zarecki). Suppose $f$ is a continuous, BV, real-valued function defined on $I = [a,b]$. Then $f$ is AC on $I$ iff $f$ maps null sets into null sets, i.e.,
$$|A| = 0 \quad \text{implies} \quad |f(A)| = 0.$$
Proof. We show the necessity first: It is enough to prove that given a null set $A \subseteq (a,b)$ and $\varepsilon > 0$, we have
$$|f(A)|_e \le \varepsilon. \qquad (3.7)$$
We invoke (iii) in 4.14 below: From the hypothesis of AC there exists $\delta > 0$ such that no matter what finite pairwise disjoint family $\{I_k = (a_k, b_k)\}$ of subintervals of $(a,b)$ we take, with the notation $\omega(f, J) = \sup_J f - \inf_J f$, we have
$$\sum (b_k - a_k) < \delta \quad \text{implies} \quad \sum \omega(f, I_k) < \varepsilon. \qquad (3.8)$$
Also observe that since $|f(I_k)|_e \le \omega(f, I_k)$ for each $k$, by (3.8) it readily follows that
$$\sum_k |f(I_k)|_e < \varepsilon \quad \text{whenever} \quad \sum_k (b_k - a_k) < \delta.$$
Choose now an open set $O$ with $|O| < \delta$ so that $A \subset O = \bigcup_{k=1}^{\infty} (a_k, b_k) \subseteq (a,b)$, where the $(a_k, b_k)$'s are pairwise disjoint, and note that the above estimate implies that
$$|f(A)|_e \le \sum_{k=1}^{\infty} |f((a_k, b_k))|_e \le \varepsilon;$$
(3.7) follows at once from this. As for the sufficiency, suppose that $\varepsilon > 0$ is given, and let $\{(a_k, b_k)\}$ be a finite pairwise disjoint family of subintervals of $I$. Then, if $A_k = \{x \in [a_k, b_k] : f'(x) \text{ exists}\}$, we have $|[a_k, b_k] \setminus A_k| = 0$ for all $k$. Furthermore, since $f$ is continuous we also have
$$|f(b_k) - f(a_k)| \le |f([a_k, b_k])|_e, \quad \text{all } k.$$
Whence, combining these remarks, and by our assumption and Lemma 3.3, we obtain
$$\sum_k |f(b_k) - f(a_k)| \le \sum_k |f([a_k, b_k])|_e \le \sum_k |f([a_k, b_k] \setminus A_k)|_e + \sum_k |f(A_k)|_e \le \sum_k \int_{A_k} |f'|\,dx = \int_{\bigcup_k A_k} |f'|\,dx. \qquad (3.9)$$
Now, since $f$ is BV on $I$, by Corollary 2.3 $f'$ is integrable on $I$, and we are in a position to invoke 3.7 in Chapter VIII: Choose $\delta > 0$ so that the
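conclusion there holds for the $\varepsilon$ we fixed at the beginning of the argument, and observe that since $\bigcup_k A_k \subseteq \bigcup_k [a_k, b_k]$, we also have
$$\Bigl|\bigcup_k A_k\Bigr| \le \delta \quad \text{whenever} \quad \sum_k (b_k - a_k) \le \delta.$$
Therefore, by (3.9) it follows at once that
$$\sum_k |f(b_k) - f(a_k)| \le \varepsilon \quad \text{whenever} \quad \sum_k (b_k - a_k) \le \delta,$$
and $f$ is AC on $I$. •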
That the assumption that $f$ is BV is necessary for the validity of Theorem 3.4 follows from a construction which is reminiscent of the discussion preceding (1.5) in Chapter III. Consider $I = [0,1]$ and a Cantor-like subset $K$ of $I$; the measure of $K$ may be positive or not. Write the set $I \setminus K = \bigcup_n (a_n, b_n)$ as the at most countable pairwise disjoint union of open intervals, and let $c_n$ denote the midpoint of $(a_n, b_n)$. If $d_n$ is a sequence of positive numbers with limit 0, define $f$ on $I$ as follows: $f(x) = 0$ for $x \in K$, $f(c_n) = d_n$ for all $n$, and $f$ is linear in $[a_n, c_n]$ and $[c_n, b_n]$. Then $f$ is continuous, and $V(f; 0, 1) = 2\sum_{n=1}^{\infty} d_n$. To see that $f$ maps null sets into null sets, consider a null subset $A$ of $I$. By 5.8 in Chapter I, we have $f(A) = f(A \cap K) \cup \bigcup_n f(A \cap (a_n, b_n))$. Since $f$ is linear in $[a_n, c_n]$ and in $[c_n, b_n]$, it readily follows that $|f(A \cap (a_n, b_n))| = 0$ for all $n$, and so $|f(A)| \le |\{0\}| + \sum_n |f(A \cap (a_n, b_n))| = 0$. If $\sum_n d_n = \infty$, then $f$ fails to be AC on $I$ since it is not BV there.

In order to establish further properties of AC functions we introduce the following definition: Suppose $f$ is a real-valued a.e. differentiable function on an interval $I$. We then say that $f$ is singular if $f' = 0$ a.e. on $I$. How do AC singular functions look?
Proposition 3.5. Suppose $f$ is an AC singular function defined on an interval $I$. Then $f$ is constant.

Proof. Let $A$ be a subset of $I$ of full measure so that $f'(x) = 0$ for $x \in A$. By Lemma 3.3 we have
$$|f(A)|_e \le \int_A |f'|\,dx = 0. \qquad (3.10)$$
Further, since $|I \setminus A| = 0$, by the necessity of Theorem 3.4 it follows that
$$|f(I \setminus A)| = 0. \qquad (3.11)$$
Whence combining (3.10) and (3.11) we get
$$|f(I)|_e \le |f(A)|_e + |f(I \setminus A)|_e = 0. \qquad (3.12)$$
Now, since $f$ is continuous, unless $f$ is constant, $f(I)$ contains an interval and (3.12) does not hold. Therefore $f$ must be constant. •

We are now ready to characterize, following Lebesgue, the class of functions which may be reconstructed by integrating their derivatives.

Theorem 3.6. Suppose $f$ is a real-valued function defined on $I = [a,b]$. Then, $f'$ exists a.e. in $(a,b)$, it is integrable there, and
$$f(x) - f(a) = \int_{[a,x]} f'(t)\,dt, \quad a \le x \le b, \qquad (3.13)$$
iff $f$ is AC on $I$.

Proof. We do the sufficiency first: Since $f$ is AC on $I$, $f'$ exists a.e. on $I$ and it is integrable, therefore we only need to show that (3.13) holds. For this purpose put $F(x) = \int_{[a,x]} f'(t)\,dt$, $a \le x \le b$, and observe that by 3.6 in Chapter VIII also $F$ is AC on $I$, and, by the Lebesgue differentiation theorem, $F' = f'$ a.e. Let $g = F - f$; $g$ is AC and singular on $I$, and, by Proposition 3.5, $g$ is constant there. More precisely, $F(x) - f(x) = F(a) - f(a)$, $a \le x \le b$, and since $F(a) = 0$, it readily follows that
$$F(x) = \int_{[a,x]} f'(t)\,dt = f(x) - f(a), \quad a \le x \le b,$$
as we wanted to show. Conversely, since
$$f(x) = f(a) + \int_{[a,x]} f'(t)\,dt, \quad a \le x \le b,$$
the necessity follows from Theorem 2.2 in Chapter VIII. •
Implicit in the proof of Theorem 3.6 is the following important result concerning BV functions.

Theorem 3.7 (Lebesgue). Suppose $f$ is BV on $I = [a,b]$. Then there exist an AC function $g$ and a singular function $h$ such that
$$f(x) = g(x) + h(x), \quad x \in I. \qquad (3.14)$$
Up to constants, the decomposition in (3.14) is unique.
Proof. Since $f$ is BV on $I$, $f'$ exists a.e. there, and it is integrable. Let $g(x) = \int_{[a,x]} f'\,dt$, $a \le x \le b$, and set
$$h(x) = f(x) - g(x), \quad a \le x \le b.$$
Then $g$ is AC on $I$ and by the Lebesgue differentiation theorem, $h' = f' - g' = 0$ a.e. on $I$. Thus $f = g + h$ is a desired decomposition. As for the uniqueness (modulo constants), suppose that also $f = g_1 + h_1$, where $g_1$ is AC on $I$ and $h_1$ is singular. We then have
$$g - g_1 = h_1 - h, \qquad (3.15)$$
where the expression on the left-hand side of (3.15) is AC and that on the right-hand side is singular. By Proposition 3.5 it readily follows that this function is constant, $c$ say. Thus $g = g_1 + c$, and $h = h_1 - c$. •
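The construction in the proof of Theorem 3.7 can be followed on a simple function with one jump. The function below is an assumed example chosen for illustration; it does not appear in the text.

```latex
% Assumed example on I = [0,1]: a smooth part plus a unit jump at x = 1/2.
\[
  f(x) = x^2 + \chi_{[1/2,\,1]}(x), \qquad 0 \le x \le 1 .
\]
% f is nondecreasing, hence BV, and f'(x) = 2x for every x \ne 1/2.
\[
  g(x) = \int_{[0,x]} f'(t)\,dt = \int_{[0,x]} 2t\,dt = x^2 ,
  \qquad
  h(x) = f(x) - g(x) = \chi_{[1/2,\,1]}(x) .
\]
% g is AC, and h' = 0 a.e., so h is singular in the sense defined above.
```

Replacing the jump by the Cantor-Lebesgue function gives a continuous example with a nonconstant singular part.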
4. PROBLEMS AND QUESTIONS

4.1 Show that in Vitali's covering lemma we may also demand that given $\varepsilon > 0$, $\sum_k |I_k| \le (1 + \varepsilon)|E|_e$.

4.2 State and prove Vitali's covering lemma, including the conclusion of 4.1, for subsets $E$ of $R^n$ with $|E|_e < \infty$, and covered in the sense of Vitali by closed $n$-dimensional intervals.

4.3 Let $E$ be a subset of $R^n$ that is the union of sets, each being an open interval together with any of its edges. Prove that $E$ is Lebesgue measurable.

4.4 A measure $\mu$ on $(R^n, \mathcal{L})$ is said to be doubling provided there exists an absolute constant $c$ such that
$$\mu(I(x, 2r)) \le c\,\mu(I(x, r)), \quad \text{all } x \in R^n,\ r > 0.$$
Given a doubling measure $\mu$, a family $\mathcal{V}$ of closed intervals of $R^n$ is said to be a covering of $E \in \mathcal{L}$ in the sense of Vitali if for any $x \in E$ and $\varepsilon > 0$, there is an interval $I \in \mathcal{V}$ which contains $x$ and so that $\mu(I) \le \varepsilon$. Show that if $\mathcal{V}$ is a covering of $E$ in the sense of Vitali, and $\mu(E) < \infty$, then there exists an at most countable family $\{I_k\}$ of pairwise disjoint intervals of $\mathcal{V}$ such that $\mu(E \setminus \bigcup_k I_k) = 0$.

4.5 Suppose $f$ is a real-valued function defined on an interval $I$, and that all the Dini numbers of $f$ for $x$ in $I$ lie between $-k$ and $k$,
where $k$ is some positive constant. Must $f$ be Lipschitz on $I$? If so, what is the relation between the Lipschitz constant of $f$ and $k$?

4.6 Let $f$ be a real-valued function defined on $I = [a,b]$ and suppose there exist real constants $u, v$ such that
$$u \le D^+ f(x) \le v, \quad \text{all } x \in I.$$
Is it then true that for all $a \le x < x + h \le b$ we have
$$uh \le f(x + h) - f(x) \le vh\,?$$

4.7 Suppose $f$ is a continuous real-valued function defined on $I = [a,b]$, $A \subset I$ is at most countable, and $D^+ f(x) \ge 0$ on $I \setminus A$. Prove that $f$ is nondecreasing on $I$.

4.8 Compute the Dini numbers of the Cantor-Lebesgue function at each point $x$ of $[0,1]$.

4.9 (Fubini's Lemma) Let $\{f_k\}$ be a sequence of nondecreasing functions defined on an interval $I$ of the line. Show that if the series $f(x) = \sum_k f_k$ converges to a finite limit on $I$, then $f' = \sum_k f_k'$ a.e. on $I$.

4.10 Let $(a_n)$ be a sequence of distinct points in an interval $I$ of the line, and suppose $(u_n)$, $(v_n)$ are sequences of real numbers such that $\sum |u_n|, \sum |v_n| < \infty$. Put

and show that $s = \sum f_n$ has a finite derivative a.e. on $I$ and that $s' = 0$ a.e.

4.11 Let $E$ be any subset of $R$. Show that for almost all $x$ in $E$ we have
$$\lim_{h \to 0} \frac{|E \cap [x - h, x + h]|_e}{2h} = 1.$$

4.12 Does there exist a strictly increasing function $f$ defined on an interval $I$ so that $f' = 0$ a.e. on $I$?

4.13 Suppose $f$ is a real-valued function defined on $I$. Show that if $f$ is not constant and $f' = 0$ a.e., then $f$ cannot be Lipschitz on $I$.

4.14 Let $f$ be a real-valued function defined on $I = [a,b]$. Prove that the following statements are equivalent:
(i) $f$ is AC on $I$.
(ii) Given $\varepsilon > 0$, there exists $\delta > 0$ such that for any finite collection $\{[a_i, b_i]\}$ of nonoverlapping subintervals of $[a,b]$ we have $|\sum_i (f(b_i) - f(a_i))| < \varepsilon$, whenever $\sum_i (b_i - a_i) < \delta$.
(iii) If $\omega(f, J) = \sup_J f - \inf_J f$, then for each $\varepsilon > 0$, there is $\delta > 0$ such that for any finite collection $\{I_k = [a_k, b_k]\}$ of nonoverlapping subintervals of $I$ we have $\sum_k \omega(f, I_k) \le \varepsilon$, whenever $\sum_k (b_k - a_k) < \delta$.

4.15 Let $f$ be a real-valued continuous function defined on $I = [a,b]$, and suppose that $f$ is AC on $[a,d]$, for any $d < b$. Show, then, that $f$ is AC on $I$. Is this result true if the assumption that $f$ is continuous on $I$ is dropped?

4.16 If the functions $f, g$ are AC on an interval $I$, show that their difference, sum and product are also AC on $I$. If $g$ vanishes nowhere on $I$, then the quotient $f/g$ is also AC on $I$.

4.17 Show that the composition of AC functions need not be AC, or even BV for that matter.

4.18 Let $f$ be AC on $I$, and $f(I) \subseteq J$. If $\phi : J \to R$ is Lipschitz, show that $\phi \circ f$ is AC on $I$.

4.19 Suppose $f$ is a nondecreasing, AC function on $I$, and $f(I) \subseteq J$. Show that if $\phi$ is AC on $J$, then $\phi \circ f$ is AC on $I$.

4.20 Suppose $f$ is a monotone function defined on $I = [0,1]$, and let $E = \{x \in I : D^+ f(x) = \infty\}$. Prove that $f$ is AC on $I$ iff $|f(E)| = 0$.

4.21 Suppose $f$ is a real-valued continuous function defined on $I$, and let $A$ be an $F_\sigma$ subset of $I$. Prove that $f(A)$ is $F_\sigma$ too, and show that as a consequence of this, if $f$ is AC on $I$, then it maps Lebesgue measurable sets into Lebesgue measurable sets.

4.22 Let $N$ be a Lebesgue null subset of $I = [0,1]$. Show that there is a real-valued function $f$ defined on $I$ which is AC there, and such that $f'(x) = \infty$ for each $x \in N$.

4.23 Show that there is an AC function $f$ defined on $[0,1]$ which is monotone on no subinterval of $I$.

4.24 Let $f$ be a real-valued Lebesgue measurable function defined on $I = [a,b]$, and suppose $\varepsilon, \eta > 0$ are given. Show that there exist a
Lebesgue measurable set $B \subset I$, and an AC function $F$ defined on $I$ such that $\int_{I \setminus B} |f - F| < \varepsilon$ and $|B| < \eta$.

4.25 Let $E$ be a bounded Lebesgue measurable set in the line and let
$$f_n(x) = n \int_{[x,\,x+1/n]} \chi_E\,dy, \quad n = 1, 2, \dots$$
Show that each $f_n$ is AC on every bounded interval of the line, that $0 \le f_n \le 1$, that $\lim_{n \to \infty} f_n = \chi_E$ a.e., and that $\lim_{n \to \infty} d(f_n, \chi_E) = 0$. Is this result sufficient to prove that AC functions are dense in the metric of $L(R)$?

4.26 Let $g$ be a continuous function defined on $I = [a,b]$ and suppose $f$ is AC there. Prove that $\int_a^b g\,df = \int_I g f'\,dy$.

4.27 Let $I = [a,b]$, and suppose $f$ is BV on $I$. Prove that for each Borel set $E \subseteq I$ we have $\int_E |f'|\,dy \le |V(f)(E)|_e$, and that there is equality here provided that $f$ is AC on $I$. A related result is the following: Suppose $f$ is a strictly monotone AC function defined on $I$, and let $f(I) = J$. Show that for every Borel subset $E$ of $J$ we have $\int_{f^{-1}(E)} f'(y)\,dy = |E|$.

4.28 (Change of Variable) Let $I = [a,b]$, and $g : I \to R$, $g(I) \subset J \subset R$, be continuous there. Furthermore, if $J = [c,d]$ and $f : J \to R$ is integrable, put $F(x) = \int_{[c,x]} f\,dy$, $c \le x \le d$. Now, suppose that $g$ and $F \circ g$ are a.e. differentiable on their domains of definition, and prove that the relation $(F \circ g)' = (f \circ g)\,g'$ holds a.e. on $I$. Finally, show that $F \circ g$ is AC on $I$ iff
(i) $(f \circ g)\,g' \in L([a,b])$.
(ii) For each subinterval $I' = [a', b']$ of $I$ we have
$$\int_{g(a')}^{g(b')} f\,dy = \int_{I'} (f \circ g)\,g'\,dy.$$

4.29 (Integration by Parts) Let $I = [a,b]$ be a bounded interval, and suppose $f, g \in L(I)$. Show that if
$$F(x) = \int_{[a,x]} f\,dy, \quad G(x) = \int_{[a,x]} g\,dy, \quad a \le x \le b,$$
then $fG, gF \in L(I)$ and
$$\int_I fG\,dy + \int_I Fg\,dy = F(b)G(b) - F(a)G(a).$$
Also, if $f, g$ are AC on $I$, we have
$$\int_I f g'\,dy + \int_I f' g\,dy = f(b)g(b) - f(a)g(a).$$

4.30 Suppose $f, f' \in L(R)$. Prove that $\int_R f'\,dy = 0$.

4.31 Let $I = [a,b]$, and suppose $f$ is a continuous real-valued function defined on $I$. The estimate (1.5) in Chapter III implies that if $V(x)$ is AC on $I$, then also $f$ is AC on $I$. Discuss whether the converse is true, to wit, does the assumption that $f$ is AC on $I$ imply that $V(x)$ is AC on $I$?

4.32 Let $\{f_n\}$ be a sequence of AC functions defined on $I = [0,1]$ such that $f_n(0) = 0$ for all $n$. Assume that the sequence of derivatives $\{f_n'\}$ is Cauchy in $L(I)$, i.e., $\lim_{n,m \to \infty} \int_I |f_n'(x) - f_m'(x)|\,dx = 0$. Prove that $\{f_n\}$ converges uniformly to a function $f$, and that $f$ is AC in $I$.

4.33 Prove that a real-valued function $f$ defined on $R$ is of the form
$$f(x) = \int_{(-\infty,\,x]} \phi\,dy, \quad \text{where } \phi \in L(R),$$
iff (a) $f$ is AC on $[-n,n]$ for all $n$, (b) $V(f; -n, n) \le k < \infty$ for all $n$, and, (c) $\lim_{|x| \to \infty} f(x) = 0$.

4.34 Let $I = [a,b]$, and suppose $f$ is a continuous function defined on $I$. Show that $f$ is AC on $I$ iff there exists a sequence $\{f_n\}$ of Lipschitz functions defined on $I$ such that $\lim_{n \to \infty} V(f - f_n; a, b) = 0$.

4.35 Suppose $f, f_n$ are BV on $I = [a,b]$, $n = 1, 2, \dots$, and
$$V(f_n - f; x) \to 0, \quad \text{for some } a < x < b.$$
Prove that there exists a subsequence $n_k \to \infty$ such that
CHAPTER
XI
Signed Measures
In this chapter we consider $\sigma$-additive set functions of arbitrary sign, or signed measures, establish their basic properties and describe, in the Theorems of Lebesgue and Radon-Nikodym, their basic form. I learned the proof of these theorems from R. Bradley.
1. ABSOLUTE CONTINUITY
In Chapter IV we dealt briefly with additive set functions of arbitrary sign; these functions appear quite naturally in applications. We consider here extended real-valued $\sigma$-additive set functions; we motivate our interest in them with two simple examples. Let $\mu_1, \mu_2$ be measures on $(X, \mathcal{M})$, and let $\nu$ be the set function defined on $\mathcal{M}$ by
$$\nu(E) = \mu_1(E) - \mu_2(E), \quad E \in \mathcal{M}. \qquad (1.1)$$
Although $\nu$ is not necessarily nonnegative, it satisfies many of the properties of a measure provided, of course, that the right-hand side of (1.1) is defined. Similar considerations apply to the set function
$$\nu(E) = \int_E f\,d\mu_1, \quad E \in \mathcal{M}, \qquad (1.2)$$
where $f$ is an extended real-valued measurable function defined on $X$ for which the integral in (1.2) exists. To consider the general setting we introduce the following definition: Given a set $X$ and a $\sigma$-algebra $\mathcal{M}$ of subsets of $X$, we say that a set
function $\nu$ defined on $\mathcal{M}$ is a signed measure provided the following three properties hold:
(i) $\nu : \mathcal{M} \to [-\infty, \infty]$, and $\nu$ assumes, at most, one of the values $-\infty$ or $\infty$.
(ii) $\nu(\emptyset) = 0$.
(iii) If $\{E_k\}_{k=1}^{\infty} \subseteq \mathcal{M}$ is a sequence of pairwise disjoint sets, then
$$\nu\Bigl(\bigcup_k E_k\Bigr) = \sum_k \nu(E_k). \qquad (1.3)$$
The equality in (1.3) means, in particular, that if $|\nu(\bigcup_k E_k)| < \infty$, then the series on the right-hand side of (1.3) converges absolutely and unconditionally, and that it diverges properly to $\pm\infty$ otherwise. Also note that property (ii) rules out the possibility that $\nu$ is identically $-\infty$ or identically $\infty$. The usual properties of a measure are true in this more general setting as well. For instance, properly interpreted, the results concerning limits discussed in Proposition 3.1 of Chapter IV hold; it is incumbent upon the reader to verify that this is the case.

How do signed measures look, and how can they be represented? Referring to 4.8-4.12 in Chapter IV, given a signed measure $\nu$ on $(X, \mathcal{M})$, let $\nu_+$, $\nu_-$ and $|\nu|$ denote the positive variation, the negative variation, and the total variation of $\nu$, respectively. By 4.14 in Chapter IV, all the variations are actually measures on $(X, \mathcal{M})$ and, by 4.11 in that Chapter, the Jordan decomposition
$$\nu(E) = \nu_+(E) - \nu_-(E), \quad E \in \mathcal{M},$$
obtains. Thus, a general representation in the spirit of (1.1) holds for arbitrary signed measures. We note in passing that a representation of this kind is not unique: If $\mu$ is any finite measure on $(X, \mathcal{M})$ and $\nu_1 = \nu_+ + \mu$ and $\nu_2 = \nu_- + \mu$, then we also have
$$\nu(E) = \nu_1(E) - \nu_2(E), \quad E \in \mathcal{M}.$$
However, the Jordan decomposition satisfies a "minimality" condition, cf. 3.8 below. As for (1.2), it also leads to an interesting theory; the observations in 3.7 in Chapter VIII are relevant here. In fact, motivated by those considerations we introduce the following definition: Suppose $\mu$ is a measure and $\nu$ is a signed measure on $(X, \mathcal{M})$; we say that $\nu$ is absolutely continuous with respect to $\mu$, and denote this by $\nu \ll \mu$, if $\nu(A) = 0$ for any $A \in \mathcal{M}$ with $\mu(A) = 0$. Informally, each $\mu$-null set has $\nu$ measure 0. Our first result explains how this nomenclature is derived from our usual understanding of absolute continuity.
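On a finite set every signed measure is given by point weights, and the notions just introduced can be checked by brute force. The sketch below is a toy illustration under that assumption; the set `X`, the weights, and all function names are hypothetical choices made here, with $\mathcal{M}$ taken to be all subsets of `X`.

```python
from fractions import Fraction
from itertools import combinations

X = [0, 1, 2, 3]
# Hypothetical point weights defining a signed measure nu on all subsets of X.
weights = {0: Fraction(3), 1: Fraction(-1), 2: Fraction(0), 3: Fraction(-2)}

def subsets(points):
    """Yield every subset of `points` as a set."""
    for r in range(len(points) + 1):
        for combo in combinations(points, r):
            yield set(combo)

def nu(E):
    return sum(weights[x] for x in E)

def nu_plus(E):   # positive variation on a finite set
    return sum(max(weights[x], Fraction(0)) for x in E)

def nu_minus(E):  # negative variation on a finite set
    return sum(max(-weights[x], Fraction(0)) for x in E)

def total_variation(E):
    return nu_plus(E) + nu_minus(E)

# A hypothetical measure mu whose only nonempty null set is {2}.
mu_weights = {0: Fraction(1), 1: Fraction(1), 2: Fraction(0), 3: Fraction(1)}

def mu(E):
    return sum(mu_weights[x] for x in E)

def is_absolutely_continuous(signed, measure):
    """Check that every measure-null set has signed measure 0."""
    return all(signed(E) == 0 for E in subsets(X) if measure(E) == 0)

for E in subsets(X):
    assert nu(E) == nu_plus(E) - nu_minus(E)                       # Jordan decomposition
    assert nu(E) == (nu_plus(E) + mu(E)) - (nu_minus(E) + mu(E))   # another representation
    assert abs(nu(E)) <= total_variation(E)

print(is_absolutely_continuous(nu, mu))
```

The second assertion in the loop is the non-uniqueness remark above with $\nu_1 = \nu_+ + \mu$ and $\nu_2 = \nu_- + \mu$.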
Proposition 1.1. Let $\mu_F$ be the Borel measure induced by the distribution function $F \in V$. Then $\mu_F$ is absolutely continuous with respect to the Lebesgue measure iff $F$ is AC on every bounded interval of $R$.

Proof. We show the necessity first: If $F$ is not AC on every bounded interval of $R$, then there exist an interval $I = [a,b]$ and $\varepsilon > 0$, such that for every $\delta > 0$ there is a family $\{[a_k, b_k]\}$ of nonoverlapping subintervals of $I$ so that
$$\sum_k |F(b_k) - F(a_k)| \ge \varepsilon, \quad \sum_k (b_k - a_k) < \delta. \qquad (1.4)$$
Now, since as is readily seen, cf. Theorem 2.1 in Chapter IX,
$$\mu_F([a_k, b_k]) \ge F(b_k) - F(a_k), \quad \text{all } k,$$
(1.4) implies that the set $B = \bigcup_k [a_k, b_k]$ verifies
$$\mu_F(B) \ge \varepsilon, \quad |B| < \delta. \qquad (1.5)$$
We now invoke (1.5) with $\delta = 1/2^n$, $n = 1, 2, \dots$, and construct a sequence of (bad) sets $\{B_n\}$ so that
$$\mu_F(B_n) \ge \varepsilon, \quad |B_n| \le 1/2^n, \quad B_n \subseteq I, \quad \text{all } n.$$
Thus, on the one hand by the Borel-Cantelli Lemma it follows that $|\limsup B_n| = 0$, and, on the other hand, by 4.25 in Chapter IV, we have $\mu_F(\limsup B_n) > 0$, contradicting the fact that $\mu_F$ is absolutely continuous with respect to the Lebesgue measure. As for the sufficiency, suppose that $|A| = 0$, and given $\varepsilon > 0$, let $\delta > 0$ be the number that corresponds to the choice of $\varepsilon$ in the AC definition of $F$ on $[-2n, 2n]$, $n > 0$. Observe that since $|A \cap [-n,n]| = 0$, there exists an open set $O = \bigcup_k (a_k, b_k) \subseteq [-2n, 2n]$ such that
$$A \cap [-n,n] \subset O, \quad |O| < \delta.$$
Also note that by (2.4) in Theorem 2.1 in Chapter IX, and since $\sum_{k=1}^{N} (b_k - a_k) \le |O| \le \delta$, we get
$$\mu_F\Bigl(\bigcup_{k=1}^{N} (a_k, b_k)\Bigr) \le \sum_{k=1}^{N} (F(b_k) - F(a_k)) \le \varepsilon, \quad \text{all } N.$$
Since this inequality holds for all $N$, it follows that $\mu_F(O) \le \varepsilon$, and since $\varepsilon$ is arbitrary and $\mu_F$ is regular, we have
$$\mu_F(A \cap [-n,n]) = 0, \quad \text{all } n.$$
But this can only be true if $\mu_F(A) = 0$, and consequently, $\mu_F$ is absolutely continuous with respect to the Lebesgue measure. •

Also corresponding to the $\varepsilon$-$\delta$ definition of AC functions, there is the following result.

Proposition 1.2. Suppose $\mu$ is a measure and $\nu$ is a signed measure on $(X, \mathcal{M})$ so that
$$|\nu(E)| < \infty \quad \text{whenever} \quad \mu(E) < \infty. \qquad (1.6)$$
Then, $\nu \ll \mu$ iff given $\varepsilon > 0$, there exists $\delta > 0$, such that
$$|\nu(E)| < \varepsilon \quad \text{whenever} \quad \mu(E) < \delta. \qquad (1.7)$$
Proof. To show that the condition is sufficient observe that if $\mu(E) = 0$, then $\mu(E) < \delta$ for all $\delta > 0$, and (1.7) gives that $\nu(E) = 0$. As for the necessity, suppose that $\nu \ll \mu$ and that (1.7) is false. Then there exist a sequence $\{B_n\} \subseteq \mathcal{M}$ and $\varepsilon > 0$, such that
$$|\nu(B_n)| > \varepsilon \quad \text{and} \quad \mu(B_n) \le 1/2^n, \quad n = 1, 2, \dots$$
Pick now a subsequence $\{B_{n_k}\}$, say, so that all the $\nu(B_{n_k})$'s are of the same sign, and observe that since $\mu(\bigcup_k B_{n_k}) < \infty$, by (1.6) it also follows that $|\nu(\bigcup_k B_{n_k})| < \infty$. The proof may now be finished in a stroke: By 4.25 in Chapter IV, $|\nu(\limsup B_{n_k})| > 0$, and by the Borel-Cantelli Lemma, $\mu(\limsup B_{n_k}) = 0$; this contradicts the fact that $\nu \ll \mu$. •

Observe that even if $\nu$ is a measure, (1.6) is necessary for Proposition 1.2 to hold. Consider, for example,
$$\nu(E) = \int_E x^2\,dx, \quad E \in \mathcal{L}.$$
Then $|E| = 0$ implies $\nu(E) = 0$, but since for $|x|$ large the set $E = (x, x + \eta)$ has $|E| = \eta$ and $\nu(E)$ is large, (1.7) fails.

Next suppose that $\mu$ is a probability measure and $\nu$ is a signed measure defined on $(X, \mathcal{M})$, and that
$$|\nu(E)| \le \mu(E), \quad E \in \mathcal{M}. \qquad (1.8)$$
Clearly $\nu \ll \mu$; the question is, then, how to go about constructing a measurable function $f$ so that
$$\nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{M}. \qquad (1.9)$$
Now, for $A \in \mathcal{M}$, the function
$$f = (\nu(A)/\mu(A))\,\chi_A + (\nu(X \setminus A)/\mu(X \setminus A))\,\chi_{X \setminus A}, \qquad (1.10)$$
satisfies (1.9) for $E = A, X \setminus A$. $f$ is a first and crude approximation to the solution to our problem, and it is well-defined if the convention $0/0 = 0$ is in force; in fact, by (1.8), it is natural to adopt this convention. Observe that $f$ is measurable and $|f| \le 1$. We may think of $f$ as $f_P$, namely, as a function associated to the measurable partition $P = \{A, X \setminus A\}$ of $X$. Intuitively, the more refined the partition $P$ is, the more spread out the function $f_P$ associated to $P$ by a formula similar to (1.10) will be. We remind the reader that we only consider measurable partitions of $X$, and that, given partitions $P$ and $P'$ of $X$, we say that $P'$ is finer than $P$ if given $A' \in P'$, there exists $A \in P$ such that $A' \subseteq A$.

Now, on a probability measure space a natural measure of the amount of spread of a measurable function is its "variance"; in the process described in (1.10) above the finer the partition $P$ becomes, the greater the variance of the associated function $f_P$ is. We expect, then, that the function $f$ that verifies (1.9) will emerge as the limit function of the $f_P$'s as the partitions $P$ get finer, or as the variance of the $f_P$'s approaches a supremum. The "expected value" of the $f_P$'s is the same, and, as in the case of the function defined by (1.10), it equals $\int_X f\,d\mu = \nu(X)$. Thus by the well-known formula
$$\text{variance} = (\text{second moment}) - (\text{expected value})^2,$$
it is clear that maximizing the variance of the $f_P$'s is equivalent to finding the maximum of the second moments of these functions. The proof of (1.9) we present below follows along these lines. More precisely, we construct the function $f$ by maximizing the second moments of an appropriate family of functions.

We begin by formalizing the relation between the partitions $P$ of $X$ and the functions $f_P$ associated with them. Specifically, given a partition $P = \{A_1, \dots, A_n\}$ of $X$, let $f_P : X \to [-1, 1]$ be defined by the expression
$$f_P = \sum_{k=1}^{n} (\nu(A_k)/\mu(A_k))\,\chi_{A_k}, \qquad (1.11)$$
where the convention $0/0 = 0$ is in force. Clearly, if $A \in \mathcal{M}$ is any set of the form $A = \bigcup_{j=1}^{m} A_{k_j}$, $1 \le k_1 < \dots < k_m \le n$, we have
$$\nu(A) = \int_A f_P\,d\mu,$$
and, in particular,
$$\nu(X) = \int_X f_P\,d\mu, \quad \text{all partitions } P.$$
The following properties of partitions are essential to carry out the verification of (1.9).

Lemma 1.3. Let $(X, \mathcal{M}, \mu)$ be a probability measure space, and suppose $P$ and $P'$ are partitions of $X$, with $P'$ finer than $P$. If $\nu$ is a signed measure on $(X, \mathcal{M})$ which satisfies (1.8), $A \in P$, and $f_P$ is given by (1.11), the following five properties hold:
$$\nu(A) = \int_A f_P\,d\mu = \int_A f_{P'}\,d\mu, \qquad (1.12)$$
$$\int_A f_P f_{P'}\,d\mu = \int_A f_P^2\,d\mu, \qquad (1.13)$$
$$\int_A f_P\,(f_{P'} - f_P)\,d\mu = 0, \qquad (1.14)$$
$$\int_A f_{P'}^2\,d\mu = \int_A (f_{P'} - f_P)^2\,d\mu + \int_A f_P^2\,d\mu, \qquad (1.15)$$
and
$$\int_A f_P^2\,d\mu \le \int_A f_{P'}^2\,d\mu. \qquad (1.16)$$
By the additivity of the integral it is clear that the above properties also hold for any set $E \in \mathcal{M}$ of the form $E = \bigcup_k A_k$, where $A_k \in P$ for each $k$. In particular, they are true for $E = X$.

Proof. Since $P'$ is finer than $P$, each $A \in P$ may be written as a finite union $A = \bigcup_k A_k$, $A_k \in P'$. Furthermore, since
$$f_{P'}(x) = \nu(A_k)/\mu(A_k), \quad x \in A_k,$$
it readily follows that
$$\int_A f_{P'}\,d\mu = \sum_k \int_{A_k} f_{P'}\,d\mu = \sum_k (\nu(A_k)/\mu(A_k))\,\mu(A_k) = \nu(A).$$
Similarly, since
$$f_P(x) = \nu(A)/\mu(A), \quad x \in A, \qquad (1.17)$$
we also have
$$\int_A f_P\,d\mu = (\nu(A)/\mu(A))\,\mu(A) = \nu(A),$$
and (1.12) holds. The verification of (1.13) is also simple: If $A \in P$ is partitioned into $A_k$'s $\in P'$ as before, we have
$$f_P(x) f_{P'}(x) = (\nu(A)/\mu(A)) \sum_k (\nu(A_k)/\mu(A_k))\,\chi_{A_k}(x), \quad x \in A,$$
and consequently, the left-hand side of (1.13) is equal to
$$\int_A f_P f_{P'}\,d\mu = (\nu(A)/\mu(A)) \sum_k (\nu(A_k)/\mu(A_k)) \int_{A_k} d\mu = (\nu(A)/\mu(A))\,\nu(A).$$
As for the right-hand side of (1.13), on account of (1.17) it equals
$$\int_A (\nu(A)/\mu(A))^2\,d\mu = \nu(A)^2/\mu(A),$$
and (1.13) holds. (1.14) is a simple rewriting of (1.13). Next we consider (1.15). Since $f_{P'} = (f_{P'} - f_P) + f_P$, it readily follows that
$$f_{P'}^2 = (f_{P'} - f_P)^2 + 2(f_{P'} - f_P) f_P + f_P^2. \qquad (1.18)$$
Whence, integrating the identity (1.18) over $A$, and invoking (1.14), we get at once that (1.15) holds. (1.16) follows from (1.15), and we have finished. •

We are now ready to prove

Theorem 1.4. Suppose $(X, \mathcal{M}, \mu)$ is a probability measure space and $\nu$ is a signed measure defined on $(X, \mathcal{M})$ such that
$$|\nu(E)| \le \mu(E), \quad \text{all } E \in \mathcal{M}. \qquad (1.19)$$
Then there exists a unique measurable function $f : X \to [-1, 1]$ such that
$$\nu(E) = \int_E f\,d\mu, \quad \text{all } E \text{ in } \mathcal{M}. \qquad (1.20)$$
Uniqueness is understood in the following sense: If $f_1$ is another measurable function defined on $X$ for which (1.20) holds, then $f = f_1$ $\mu$-a.e.
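Before the proof, it may help to see the functions $f_P$ of (1.11) and the properties of Lemma 1.3 on a finite probability space, where the finest partition already yields the function $f$ of (1.20). The sketch below is a toy check under assumptions made here: the four-point space, the weights `mu` and `nu` (chosen so that (1.19) holds), and the helper names are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations

X = [0, 1, 2, 3]
# Hypothetical data: mu is uniform, and |nu({x})| <= mu({x}) so that (1.19) holds.
mu = {x: Fraction(1, 4) for x in X}
nu = {0: Fraction(1, 4), 1: Fraction(-1, 8), 2: Fraction(1, 8), 3: Fraction(-1, 4)}

def f_P(partition):
    """Values of f_P from (1.11) as a dict point -> value, with 0/0 read as 0."""
    values = {}
    for block in partition:
        m = sum(mu[x] for x in block)
        v = sum(nu[x] for x in block)
        ratio = v / m if m != 0 else Fraction(0)
        for x in block:
            values[x] = ratio
    return values

def integrate(g, E):
    """Integral of g over E with respect to mu."""
    return sum(g[x] * mu[x] for x in E)

P = [[0, 1], [2, 3]]               # a coarse partition
P_fine = [[0], [1], [2], [3]]      # a finer partition (here, the finest one)
fP, fQ = f_P(P), f_P(P_fine)

square = lambda g: {x: g[x] ** 2 for x in X}
diff = {x: fQ[x] - fP[x] for x in X}
product = {x: fP[x] * diff[x] for x in X}

for A in P:
    nu_A = sum(nu[x] for x in A)
    assert integrate(fP, A) == nu_A == integrate(fQ, A)                               # (1.12)
    assert integrate(product, A) == 0                                                  # (1.14)
    assert integrate(square(fQ), A) == integrate(square(diff), A) + integrate(square(fP), A)  # (1.15)
    assert integrate(square(fP), A) <= integrate(square(fQ), A)                        # (1.16)

# With the finest partition, f_P already satisfies (1.20) on every subset E.
for r in range(len(X) + 1):
    for E in combinations(X, r):
        assert integrate(fQ, E) == sum(nu[x] for x in E)

print(integrate(square(fP), X), integrate(square(fQ), X))
```

The second moment grows from the coarse partition to the fine one, which is the monotonicity that the proof exploits when it passes to a maximizing sequence of partitions.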
Proof. Consider the collection of all the finite measurable partitions $P$ of $X$, to each associate the function $f_P$ given by (1.11), and set
$$\eta = \sup_P \int_X f_P^2\,d\mu.$$
Clearly $0 \le \eta \le 1$. Let now $\{P_n\}$ be a sequence of partitions of $X$ with the property that
$$\int_X f_{P_n}^2\,d\mu \ge \eta - 1/4^n, \quad n = 1, 2, \dots \qquad (1.21)$$
It is more efficient, however, to work with the sequence $\{P_n'\}$ consisting of the common refinement of the $P_n$'s; specifically, let $P_1' = P_1$, $P_2'$ be the common refinement of $P_1$ and $P_2$, and so on. Denote $f_{P_n'}$ by $f_n$, and observe that since $P_n'$ refines $P_n$ for all $n$, by (1.16), (1.21), and the definition of $\eta$, we have
$$\eta - 1/4^n \le \int_X f_n^2\,d\mu \le \eta, \quad n = 1, 2, \dots \qquad (1.22)$$
Next we construct a "maximal" element from the $f_n$'s. Since for each $n$ $P_{n+1}'$ refines $P_n'$, by (1.15) and (1.22) it follows that
$$\int_X (f_{n+1} - f_n)^2\,d\mu = \int_X f_{n+1}^2\,d\mu - \int_X f_n^2\,d\mu \le \eta - (\eta - 1/4^n) = 1/4^n, \quad n = 1, 2, \dots$$
Thus, by the Cauchy-Schwarz inequality, cf. (3.1) in Chapter VIII, we have
$$\int_X |f_{n+1} - f_n|\,d\mu \le \Bigl(\int_X (f_{n+1} - f_n)^2\,d\mu\Bigr)^{1/2} (\mu(X))^{1/2} \le 1/2^n, \quad n = 1, 2, \dots$$
and consequently, since
$$\sum_n \int_X |f_{n+1} - f_n|\,d\mu \le \sum_n 1/2^n < \infty,$$
we may invoke 4.7 in Chapter VII and obtain that $\sum_n (f_{n+1} - f_n)$ converges absolutely, and pointwise, $\mu$-a.e. on $X$. Put now
$$g = \lim_{N \to \infty} \sum_{n=1}^{N} (f_{n+1} - f_n) = \lim_{N \to \infty} f_{N+1} - f_1;$$
9 is well-defined J.L-a.e. on X.
Next we show that the function $f = g + f_1$ gets the job done; in other words, (1.20) is true for $f$ and any $E$ in $\mathcal{M}$. Given $E \in \mathcal{M}$, let $\{Q_n\}$ be the sequence of partitions consisting of the common refinement of the partitions $P'_n$ and $\{E, X \setminus E\}$. Clearly $Q_n$ is finer than $P'_n$, $n = 1,2,\ldots$, and $Q_{n+1}$ is finer than $Q_n$, $n = 1,2,\ldots$ Thus, if $h_n = f_{Q_n}$, $n = 1,2,\ldots$, by (1.22) and (1.16) it readily follows that
$$\eta - 1/4^n \le \int_X f_n^2\,d\mu \le \int_X h_n^2\,d\mu \le \eta, \quad \text{all } n,$$
and, by (1.15) and (1.22), we also get
$$\int_X (h_n - f_n)^2\,d\mu = \int_X h_n^2\,d\mu - \int_X f_n^2\,d\mu \le \eta - (\eta - 1/4^n) = 1/4^n, \quad \text{all } n. \tag{1.23}$$
Consequently, once again by the Cauchy-Schwarz inequality and (1.23), we have
$$\Big|\int_E (h_n - f_n)\,d\mu\Big| \le \Big(\int_E (h_n - f_n)^2\,d\mu\Big)^{1/2} \mu(E)^{1/2} \le \Big(\int_X (h_n - f_n)^2\,d\mu\Big)^{1/2} \le 1/2^n, \quad \text{all } n. \tag{1.24}$$
We are now ready to compute $\nu(E)$. Since $Q_n$ is finer than $\{E, X \setminus E\}$ for all $n$, by (1.12) we have
$$\nu(E) = \int_E h_n\,d\mu = \int_E (h_n - f_n)\,d\mu + \int_E f_n\,d\mu = A_n + B_n,$$
say. By (1.24), $\lim_{n\to\infty} A_n = 0$. As for the $B_n$'s, first observe that
$$\lim_{n\to\infty} f_n = f \ \ \mu\text{-a.e., and } |f_n| \le 1 \text{ all } n.$$
Whence, by LDCT it follows that
$$\lim_{n\to\infty} B_n = \int_E f\,d\mu,$$
and (1.20) holds. Further, if $f_1$ is another measurable function for which (1.20) is true, then $\int_E (f - f_1)\,d\mu = 0$ for all $E \in \mathcal{M}$, and consequently, $f - f_1 = 0$ $\mu$-a.e. •

The applications of Theorem 1.4 are numerous and interesting; we begin by showing how a signed measure may be decomposed into "positive" and "negative" parts. Theorem 1.5, and the remarks that follow it, are known as Hahn's decomposition theorem. This decomposition is true in greater generality, cf. 3.6 below, but the result presented here is sufficient for the applications we have in mind.
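Before turning to these applications, Theorem 1.4 can be checked by hand on a finite probability space, where every measure is a list of point masses. The Python sketch below is only an illustration: the space, the weights, and the helper names are hypothetical, and the formula used for $f_P$ (the ratio $\nu(A)/\mu(A)$ on each block $A$ of $P$) is an assumed reading of (1.11), which is consistent with the way (1.12) is used above. It shows that $\int_X f_P^2\,d\mu$ does not decrease under refinement, as in (1.16), and that the function attached to the finest partition satisfies (1.20).

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical finite probability space X = {0, 1, 2, 3} with point masses mu[x].
mu = {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 4), 3: Fr(1, 4)}
# A signed measure nu with |nu(E)| <= mu(E): here |nu[x]| <= mu[x] at every point.
nu = {0: Fr(1, 4), 1: Fr(-1, 8), 2: Fr(0), 3: Fr(-1, 4)}

def measure(w, E):
    """Measure of the set E for the point masses w."""
    return sum(w[x] for x in E)

def subsets(X):
    """All subsets of the finite set X."""
    X = sorted(X)
    return [set(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def f_P(P):
    """Step function equal to nu(A)/mu(A) on each block A of the partition P
    (assumed form of (1.11))."""
    return {x: measure(nu, A) / measure(mu, A) for A in P for x in A}

def energy(P):
    """Integral over X of f_P^2 with respect to mu."""
    fp = f_P(P)
    return sum(fp[x] ** 2 * mu[x] for x in mu)

coarse = [{0, 1, 2, 3}]
middle = [{0, 1}, {2, 3}]
finest = [{0}, {1}, {2}, {3}]

# Energies along a chain of refinements: they should be nondecreasing.
energies = [energy(P) for P in (coarse, middle, finest)]

# On a finite space the finest partition already yields the density f.
f = f_P(finest)
```

On a finite space the supremum $\eta$ is attained at the finest partition, so no limiting argument is needed; the sequence $\{P'_n\}$ in the proof plays the role of this finest partition in the general case.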
Theorem 1.5. Let $\nu$ be a signed measure defined on $(X,\mathcal{M})$, and suppose that its variation $|\nu|$ is a probability measure on $(X,\mathcal{M})$. Then there exist two disjoint, measurable sets $A$ and $B$, $A \cup B = X$, so that
(i) $\nu(E \cap A) \ge 0$, all $E \in \mathcal{M}$,
(ii) $\nu(E \cap B) \le 0$, all $E \in \mathcal{M}$.

Proof. Since
$$|\nu(E)| \le |\nu|(E), \quad \text{all } E \in \mathcal{M},$$
Theorem 1.4 applies with $\mu = |\nu|$. In particular, there is a measurable function $f$, $|f| \le 1$, that satisfies
$$\nu(E) = \int_E f\,d|\nu|, \quad \text{all } E \in \mathcal{M}.$$
We claim that $|f| = 1$ $\mu$-a.e. First note that since $|\nu|(X) = 1$, by 4.12 in Chapter IV, given $\varepsilon > 0$, there exists a partition $P = \{A_1,\ldots,A_m\}$ of $X$ such that
$$1 - \varepsilon \le \sum_{k=1}^m |\nu(A_k)| \le |\nu|(X) = 1.$$
Thus, by (1.12), it readily follows that
$$1 - \varepsilon \le \sum_{k=1}^m |\nu(A_k)| = \sum_{k=1}^m |\text{value of } f_P \text{ on } A_k|\,|\nu|(A_k) = \int_X |f_P|\,d|\nu| .$$
This estimate, together with the Cauchy-Schwarz inequality, and with the meaning of $\eta$ introduced in Theorem 1.4, give
$$1 - \varepsilon \le \int_X |f_P|\,d|\nu| \le \Big(\int_X f_P^2\,d|\nu|\Big)^{1/2} \le \eta^{1/2} \le 1 .$$
Since $\varepsilon$ is arbitrary, it follows at once that $\eta = 1$. Referring once again to the proof of Theorem 1.4, by LDCT we obtain
$$1 = \eta = \lim_{n\to\infty} \int_X f_n^2\,d|\nu| = \int_X f^2\,d|\nu| ,$$
and consequently, since $|\nu|$ is a probability measure, we have $f^2 = 1$ $|\nu|$-a.e., and $|f| = 1$ $|\nu|$-a.e., as claimed.
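Let now $A = \{x \in X : f(x) = 1\}$, and $B = X \setminus A$; $\{A,B\}$ is a disjoint partition of $X$. Since $f = -1$ $|\nu|$-a.e. on $B$, for any $E \in \mathcal{M}$ we have
$$\nu(E \cap A) = \int_{E \cap A} f\,d|\nu| = |\nu|(E \cap A) \ge 0,$$
and
$$\nu(E \cap B) = \int_{E \cap B} f\,d|\nu| = -|\nu|(E \cap B) \le 0 . \ •$$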
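The construction in this proof is easy to follow on a finite set. The Python sketch below uses a hypothetical four-point space and made-up point masses, normalized so that $|\nu|(X) = 1$; it builds the density $f = d\nu/d|\nu|$, forms $A = \{f = 1\}$ and $B = X \setminus A$, and checks (i) and (ii) over every subset $E$. It assumes every point carries nonzero mass, so that the ratio defining $f$ makes sense.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical signed measure on X = {"a", "b", "c", "d"}, given by point masses.
nu = {"a": Fr(3, 10), "b": Fr(-2, 10), "c": Fr(1, 10), "d": Fr(-4, 10)}
# Its variation |nu| has point masses |nu[x]|; here |nu|(X) = 1.
var = {x: abs(w) for x, w in nu.items()}

# Density f = d nu / d|nu|; it only takes the values +1 and -1.
f = {x: nu[x] / var[x] for x in nu}

A = {x for x in nu if f[x] == 1}   # the "positive" set
B = set(nu) - A                    # the "negative" set

def measure(w, E):
    """Measure of the set E for the point masses w."""
    return sum(w[x] for x in E)

def subsets(X):
    """All subsets of the finite set X."""
    X = sorted(X)
    return [set(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# Conditions (i) and (ii) of Theorem 1.5, tested on every E.
hahn_ok = all(
    measure(nu, E & A) >= 0 and measure(nu, E & B) <= 0
    for E in subsets(nu)
)
```

If a point of mass zero were added to the space, it could be placed in either $A$ or $B$ without affecting (i) and (ii), which is one way to see that the decomposition is not unique.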
Two remarks concerning this result: First, the decomposition $X = A \cup B$ is not unique, and second, the result is true in greater generality. For instance, if $|\nu|$ is $\sigma$-finite, then we have $X = \bigcup_k X_k$, where the $X_k$'s are pairwise disjoint and $|\nu|(X_k) < \infty$ for all $k$. By rescaling if necessary we may assume that $|\nu|(X_k) = 1$ for all $k$, and apply Theorem 1.5 to each $X_k$, thus obtaining sequences $\{A_k\}$, $\{B_k\}$ as in that theorem; then the sets $A = \bigcup_k A_k$ and $B = \bigcup_k B_k$ correspond to the "positive" and "negative" parts of $\nu$ respectively. Another interesting consequence of Theorem 1.4 is

Theorem 1.6. Suppose that $\lambda, \mu$ are $\sigma$-finite measures on $(X,\mathcal{M})$ and that
$$\lambda(E) \le \mu(E), \quad \text{all } E \in \mathcal{M}.$$
Then there exists a unique (in the $\mu$-a.e. sense as explained above) nonnegative measurable function $f\colon X \to [0,1]$ such that
$$\lambda(E) = \int_E f\,d\mu, \quad \text{all } E \in \mathcal{M}. \tag{1.25}$$
Furthermore, if $g$ is a measurable extended real-valued function defined on $X$, then
$$\int_X g\,d\lambda = \int_X g f\,d\mu . \tag{1.26}$$
The equality in (1.26) is understood as follows: If the integral on either side of (1.26) exists, then the integral on the other side also exists and they are equal.

Proof. Write $X = \bigcup_k X_k$, where the union is pairwise disjoint and $\mu(X_k) < \infty$ for all $k$. By rescaling if necessary, assume that $\mu(X_k) = 1$, and invoke Theorem 1.4 for the measures $\lambda_k, \mu_k$ on $(X,\mathcal{M})$ given by
$$\lambda_k(E) = \lambda(E \cap X_k), \quad \mu_k(E) = \mu(E \cap X_k), \quad E \in \mathcal{M}.$$
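The function $f$ in (1.25) is obtained as $f = \sum_k f_k \chi_{X_k}$, where the $f_k$'s are the (unique) functions that satisfy (1.20) for the measures $\lambda_k$ and $\mu_k$. As for (1.26), suppose first that $g \ge 0$, and let $\{\varphi_n\}$ be a sequence of nonnegative simple functions that increases to $g$. If $\varphi = \sum_j a_j \chi_{E_j}$ is any one of the $\varphi_n$'s, then by (1.25)
$$\int_X \varphi\,d\lambda = \sum_j a_j \lambda(E_j) = \sum_j a_j \int_{E_j} f\,d\mu = \int_X \Big(\sum_j a_j \chi_{E_j}\Big) f\,d\mu = \int_X \varphi f\,d\mu ,$$
and, by MCT, it follows that (1.26) holds in this case. As for a general function $g$, note that $g = g^+ - g^-$, and apply the above result to $g^+$ and $g^-$ separately. •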
2. THE LEBESGUE AND RADON-NIKODYM THEOREMS

Let $\mu$ and $\nu$ be signed measures on $(X,\mathcal{M})$. We say that $\mu$ and $\nu$ are mutually singular, and denote this by $\mu \perp \nu$, if there exists a disjoint partition $\{A, B\}$ of $X$ such that $|\mu|(A) = |\nu|(B) = 0$. In this case we also say that $\mu$ is singular with respect to $\nu$, or symmetrically, that $\nu$ is singular with respect to $\mu$. For instance, in Theorem 1.5 the measures
$$\nu_1(E) = \nu(E \cap A), \quad \nu_2(E) = \nu(E \cap B), \quad E \in \mathcal{M},$$
are mutually singular. Another interesting example we have encountered is that of $\mu = \mu_F$, the Borel measure induced by the extension $F$ to $R$ of the Cantor-Lebesgue function, and $\nu$ the Lebesgue measure on the line. To show that this last example is part of a general state of affairs, and to elucidate the notion of singularity, we present a preliminary result.

Lemma 2.1. Suppose $\mu_F$ is a finite Borel measure induced by a distribution $F \in \mathcal{D}$. If $A$ is a Borel set on the line such that $F'$ exists on $A$ and $M, m \ge 0$, then we have
(i) If $F'(x) \le M$ for $x \in A$, then $\mu_F(A) \le M|A|$.
(ii) If $F'(x) \ge m$ for $x \in A$, then $\mu_F(A) \ge m|A|$.
Proof. Given $\varepsilon > 0$, let $\mathcal{I}_n$ denote the collection of intervals of the form $(u,v]$ that satisfy the following two properties:
(a) $u, v$ are rational, and $0 < v - u < 1/n$.
(b) $F(v) - F(u) \le (M + \varepsilon)(v - u)$.
Observe that the sets
$$A_n = \bigcup_{(u,v] \in \mathcal{I}_n} \big(A \cap (u,v]\big), \quad n = 1,2,\ldots$$
are Borel sets, and that under the hypothesis of (i), they increase to $A$. Let $\{I_{n,k}\}$ be a sequence of nonoverlapping intervals, open on the left and closed on the right, that covers $A_n$ and such that
$$\sum_k |I_{n,k}| \le |A_n| + \varepsilon .$$
By working with the $I_{n,k}$'s it is possible to assume that each $I_{n,k}$ meets $A_n$, has rational endpoints, and that $|I_{n,k}| < 1/n$ for all $k$. Then, (a) and (b) above apply to each $I_{n,k} = (u_{n,k}, v_{n,k}]$, and it readily follows that
$$\mu_F(A_n) \le \sum_k \mu_F(I_{n,k}) = \sum_k \big(F(v_{n,k}) - F(u_{n,k})\big) \le (M + \varepsilon) \sum_k (v_{n,k} - u_{n,k}) = (M + \varepsilon) \sum_k |I_{n,k}| \le (M + \varepsilon)(|A_n| + \varepsilon).$$
Since $\varepsilon$ is arbitrary the above inequality implies that
$$\mu_F(A_n) \le M|A_n| ,$$
and since the $A_n$'s increase to $A$, a similar inequality holds with $A$ in place of $A_n$ above, and (i) is true. The proof of (ii) follows along similar lines (a Vitali covering argument also works in this case), and is therefore left to the reader. •

We are now ready to prove

Proposition 2.2. Suppose $\mu_F$ is a finite Borel measure induced by a distribution $F \in \mathcal{D}$. Then $\mu_F$ is singular with respect to the Lebesgue measure on the line iff $F$ is singular, i.e., $F' = 0$ a.e.
Proof. We first show that the condition is sufficient. Given $\eta > 0$, by (i) in Lemma 2.1 we get
$$\mu_F\big(\{F' = 0\} \cap [-n,n]\big) \le \eta\,\big|\{F' = 0\} \cap [-n,n]\big| \le 2n\eta ,$$
and consequently, by first letting $\eta \to 0$ and then $n \to \infty$, it readily follows that $\mu_F(\{F' = 0\}) = 0$. Put now $A = \{F' = 0\}$ and $B = R \setminus A$. Since $F$ is singular we have
$$\mu_F(A) = |B| = 0, \quad A \cap B = \emptyset, \quad A \cup B = R, \tag{2.1}$$
and consequently, $\mu_F$ is singular with respect to the Lebesgue measure. On the other hand, if (2.1) holds for Borel sets $A$ and $B$, by the other half of Lemma 2.1, it readily follows that for $\eta > 0$ we have
$$\eta\,|\{x : F' \ge \eta\}| = \eta\,|\{x \in A : F' \ge \eta\}| + \eta\,|\{x \in B : F' \ge \eta\}| = \eta\,|\{x \in A : F' \ge \eta\}| \le \mu_F(A) = 0 .$$
So, for each $\eta > 0$ we have $|\{F' \ge \eta\}| = 0$, and by letting $\eta \to 0$ we get that $F' = 0$ except on a null Lebesgue set. •

Propositions 1.1 and 2.2 suggest that it may be possible to decompose signed measures in terms of absolutely continuous and singular measures. This indeed is the case; we begin by discussing a preliminary result in this direction.

Theorem 2.3. Suppose $\lambda$ and $\mu$ are finite measures on $(X,\mathcal{M})$. Then there exist finite measures $\lambda_a$ and $\lambda_s$ which satisfy the following properties:
(i) $\lambda_a \ll \mu$, $\lambda_s \perp \mu$.
(ii) $\lambda = \lambda_a + \lambda_s$.
Furthermore, the measures $\lambda_a, \lambda_s$ are unique.

Proof. The set function
$$(\lambda + \mu)(E) = \lambda(E) + \mu(E), \quad E \in \mathcal{M},$$
is a measure on $(X,\mathcal{M})$ which satisfies
$$\lambda(E) \le (\lambda + \mu)(E), \quad \text{all } E \text{ in } \mathcal{M}.$$
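Whence, by (1.25) in Theorem 1.6, there exists a measurable function $f\colon X \to [0,1]$ such that
$$\lambda(E) = \int_E f\,d(\lambda + \mu) = \int_E f\,d\lambda + \int_E f\,d\mu, \quad E \in \mathcal{M}. \tag{2.2}$$
Next, let $g = f\chi_E$, and observe that by (1.26), with $\mu = \lambda + \mu$ there, we have
$$\int_X f\chi_E\,d\lambda = \int_X f\chi_E\,f\,d(\lambda + \mu),$$
and consequently, (2.2) may be rewritten as
$$\lambda(E) = \int_E f^2\,d\lambda + \int_E (f + f^2)\,d\mu . \tag{2.3}$$
Now, this procedure may be iterated with $g = f^2\chi_E$, $g = f^3\chi_E$, and so on, and (2.3) becomes
$$\lambda(E) = \int_E f^n\,d\lambda + \int_E (f + \cdots + f^n)\,d\mu, \quad n = 1,2,\ldots \tag{2.4}$$
Let $B = \{f = 1\}$, put
$$\lambda_s(E) = \lambda(E \cap B), \quad E \in \mathcal{M},$$
and observe that (2.4) becomes
$$\lambda(E) = \lambda_s(E) + \int_{E \cap (X \setminus B)} f^n\,d\lambda + \int_E (f + \cdots + f^n)\,d\mu = I + J + K, \tag{2.5}$$
say. Clearly, by LDCT, $\lim_{n\to\infty} J = 0$. As for $K$, note that by MCT and (2.4) it follows that
$$\lim_{n\to\infty} (f + \cdots + f^n) = \frac{f}{1 - f} \in L(\mu),$$
and consequently, $\mu(B) = 0$, and
$$\lim_{n\to\infty} K = \int_E \frac{f}{1 - f}\,d\mu . \tag{2.6}$$
Now, by (2.6) we obtain that the measure
$$\lambda_a(A) = \int_A \frac{f}{1 - f}\,d\mu, \quad A \in \mathcal{M},$$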
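is absolutely continuous with respect to $\mu$. Thus, returning to (2.5), we get
$$\lambda(E) = \lim_{n\to\infty} (I + J + K) = \lambda_s(E) + \lambda_a(E),$$
and (ii) holds. Next we show that $\lambda_s \perp \mu$; since $\mu(B) = 0$, this reduces to checking that
$$\lambda_s(X \setminus B) = \int_{(X \setminus B) \cap B} d\lambda = 0,$$
which is obviously true. Finally we show that the decomposition is unique; this is not hard. Suppose $\lambda = \lambda'_a + \lambda'_s$ is a decomposition of $\lambda$ that satisfies (i) above, and let $\lambda'_s(X \setminus B') = \mu(B') = 0$, $B' \in \mathcal{M}$. We claim that
$$\lambda_a(E) = \lambda'_a(E), \quad \text{all } E \in \mathcal{M}. \tag{2.7}$$
Indeed, since $\mu(B \cup B') = 0$, by the absolute continuity of $\lambda_a$ and $\lambda'_a$ we have
$$\lambda_a(E \cap (B \cup B')) = \lambda'_a(E \cap (B \cup B')) = 0, \quad \text{all } E \in \mathcal{M}. \tag{2.8}$$
Moreover, since $E \cap (X \setminus (B \cup B'))$ is a subset of both $X \setminus B$ and $X \setminus B'$, we also have
$$\lambda_s(E \cap (X \setminus (B \cup B'))) = \lambda'_s(E \cap (X \setminus (B \cup B'))) = 0, \quad E \in \mathcal{M}.$$
Whence, since $\lambda_a + \lambda_s = \lambda'_a + \lambda'_s$, it readily follows that
$$\lambda_a(E \cap (X \setminus (B \cup B'))) = \lambda'_a(E \cap (X \setminus (B \cup B'))), \quad E \in \mathcal{M}. \tag{2.9}$$
Finally, combining (2.8) and (2.9), for each $E \in \mathcal{M}$ we get
$$\begin{aligned}
\lambda_a(E) &= \lambda_a(E \cap (B \cup B')) + \lambda_a(E \cap (X \setminus (B \cup B'))) \\
&= \lambda_a(E \cap (X \setminus (B \cup B'))) = \lambda'_a(E \cap (X \setminus (B \cup B'))) \\
&= \lambda'_a(E \cap (B \cup B')) + \lambda'_a(E \cap (X \setminus (B \cup B'))) = \lambda'_a(E),
\end{aligned}$$
and (2.7) holds. Further, since $\lambda_a = \lambda'_a$ and since all the measures involved are finite, then $\lambda_s = \lambda'_s$ also holds. •

We are now ready to prove a result in the spirit of Theorem 3.7 in Chapter X; it is appropriately known as the Lebesgue Decomposition Theorem.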
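The proof of Theorem 2.3 is constructive, and on a finite set each step can be carried out exactly. The Python sketch below is an illustration under stated assumptions: the four-point space and the point masses of $\lambda$ and $\mu$ are hypothetical, and every point is assumed to have positive $(\lambda+\mu)$-mass so that $f = d\lambda/d(\lambda+\mu)$ is defined everywhere. It computes $f$, the set $B = \{f = 1\}$, the two parts $\lambda_s$ and $\lambda_a$, and the right-hand side of the iterated identity (2.4).

```python
from fractions import Fraction as Fr

# Hypothetical finite measures on X = {0, 1, 2, 3}, given by point masses.
lam = {0: Fr(1), 1: Fr(2), 2: Fr(3), 3: Fr(0)}
mu = {0: Fr(1), 1: Fr(0), 2: Fr(1), 3: Fr(5)}

# f = d lam / d(lam + mu), with values in [0, 1], as in (2.2).
f = {x: lam[x] / (lam[x] + mu[x]) for x in lam}

# B = {f = 1} carries the singular part; note that mu(B) = 0.
B = {x for x in lam if f[x] == 1}
lam_s = {x: (lam[x] if x in B else Fr(0)) for x in lam}

# Off B, the absolutely continuous part has density f / (1 - f) w.r.t. mu.
h = {x: (f[x] / (1 - f[x]) if x not in B else Fr(0)) for x in lam}
lam_a = {x: h[x] * mu[x] for x in lam}

def iterate(E, n):
    """Right-hand side of (2.4) for the set E and the exponent n."""
    first = sum(f[x] ** n * lam[x] for x in E)
    second = sum(sum(f[x] ** j for j in range(1, n + 1)) * mu[x] for x in E)
    return first + second
```

Here `iterate({x}, n)` equals `lam[x]` for every `n`, while the split between its two terms shifts with `n`: off $B$ the first term tends to 0 and the second to $\lambda_a(\{x\})$, and on $B$ the first term is constantly $\lambda_s(\{x\})$.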
Theorem 2.4. Suppose $\mu$ and $\lambda$ are $\sigma$-finite measures defined on $(X,\mathcal{M})$. Then there exist $\sigma$-finite measures $\lambda_a, \lambda_s$ defined on $(X,\mathcal{M})$ such that:
(i) $\lambda_a \ll \mu$, $\lambda_s \perp \mu$.
(ii) $\lambda = \lambda_a + \lambda_s$.
Furthermore, the measures $\lambda_a$ and $\lambda_s$ are unique.

Proof. The idea of the proof is to reduce the general hypothesis to the special case when the measures involved are finite, and then to invoke Theorem 2.3. First note that since $\mu$ and $\lambda$ are $\sigma$-finite, we can write $X$ as a pairwise disjoint union
$$X = \bigcup_k X_k, \quad \text{with } \mu(X_k), \lambda(X_k) < \infty, \text{ all } k.$$
Next we localize the problem at hand by introducing the finite measures $\mu_k$ and $\lambda_k$ defined on $(X,\mathcal{M})$ by
$$\mu_k(E) = \mu(E \cap X_k), \quad \lambda_k(E) = \lambda(E \cap X_k), \quad E \in \mathcal{M}.$$
For each $k \ge 1$, let $\lambda_k = \lambda_{k,a} + \lambda_{k,s}$ be the unique decomposition of $\lambda_k$ obtained in Theorem 2.3 with the property that
$$\lambda_{k,a} \ll \mu_k, \quad \lambda_{k,s} \perp \mu_k .$$
Moreover, since the $X_k$'s are pairwise disjoint it is also simple to verify that if we put
$$\lambda_a = \sum_k \lambda_{k,a}, \quad \lambda_s = \sum_k \lambda_{k,s},$$
then $\lambda = \lambda_a + \lambda_s$ is the unique decomposition of $\lambda$ for which (i) above holds. •

It is not hard to extend this result to signed measures. Indeed, we have

Theorem 2.5. Suppose $\mu$ is a $\sigma$-finite measure and $\nu$ is a signed measure defined on $(X,\mathcal{M})$. If $|\nu|$ is $\sigma$-finite, then there exist signed measures $\nu_a, \nu_s$ defined on $(X,\mathcal{M})$ such that:
(i) $\nu_a \ll \mu$, and $\nu_s \perp \mu$.
(ii) $\nu = \nu_a + \nu_s$.
Furthermore, the signed measures $\nu_a$ and $\nu_s$ are unique.
Proof. By the remarks following the Hahn decomposition theorem we can find $\sigma$-finite measures $\lambda, \lambda'$ on $(X,\mathcal{M})$ such that $\nu = \lambda - \lambda'$. Let now
$$\lambda = \lambda_a + \lambda_s, \quad \lambda' = \lambda'_a + \lambda'_s$$
be the unique decomposition of $\lambda$ and $\lambda'$ obtained in Theorem 2.4 with the property that
$$\lambda_a, \lambda'_a \ll \mu, \quad \lambda_s, \lambda'_s \perp \mu .$$
It is not hard to check that if we put
$$\nu_a = \lambda_a - \lambda'_a, \quad \nu_s = \lambda_s - \lambda'_s,$$
then $\nu = \nu_a + \nu_s$ is the (unique) decomposition of $\nu$ that does the job. •

These interesting results still do not address the question raised in considering (1.2) above. Now, in that example we had $\nu \ll \mu$, and under this assumption we show that (a preliminary version of) the Radon-Nikodym Theorem holds.

Theorem 2.6. Suppose $\mu$ and $\lambda$ are finite measures defined on $(X,\mathcal{M})$, and that $\lambda \ll \mu$. Then there exists a nonnegative integrable function $h$ such that
$$\lambda(E) = \int_E h\,d\mu, \quad \text{all } E \in \mathcal{M}. \tag{2.10}$$
$h$ is called the Radon-Nikodym derivative of $\lambda$ with respect to $\mu$, and one writes $h = d\lambda/d\mu$, or $d\lambda = h\,d\mu$. Furthermore, $h$ is unique.

Proof. The proof is identical to that of Theorem 2.3. First observe that since in the notation of Theorem 2.3 $\mu(B) = 0$, and since $\lambda \ll \mu$, we also have $\lambda(B) = 0$. But this readily implies that $\lambda_s(E) = 0$ for all $E \in \mathcal{M}$, and consequently for any $E$ in $\mathcal{M}$ we have
$$\lambda(E) = \lambda_a(E) = \int_E h\,d\mu, \quad \text{where } h = \frac{f}{1 - f} \in L(\mu). \ •$$

It is clear that we cannot expect (2.10) to hold in general. For instance, if $\lambda$ is the Lebesgue measure on $R$ and $\mu = \delta$ is the Dirac delta
measure at 0, it is obvious that (2.10) cannot hold for any $h \in L(\mu)$; thus the assumption $\lambda \ll \mu$ is necessary. On the other hand, if $\mu$ is the counting measure on $[0,1]$, then $\lambda \ll \mu$ but still (2.10) is not true for any nonnegative measurable function $h$ defined on $[0,1]$. The difficulty here is that the integral on the right-hand side of (2.10) is finite only when the set $\{h \ne 0\} \cap E$ is at most countable, and $[0,1]$ is an uncountable set of finite Lebesgue measure. A moment's thought will convince the reader that these are the only difficulties in extending Theorem 2.6 to a more general setting. Indeed, we have

Theorem 2.7 (Radon-Nikodym Theorem). Let $\mu$ be a $\sigma$-finite measure, and $\nu$ a signed measure defined on $(X,\mathcal{M})$. If $|\nu|$ is $\sigma$-finite and $\nu \ll \mu$, then there exists an extended real-valued measurable function $h$ defined on $X$ such that if $E \in \mathcal{M}$ and $|\nu|(E) < \infty$,
$$\nu(E) = \int_E h\,d\mu . \tag{2.11}$$
$h$ is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$, and one writes $h = d\nu/d\mu$, or $d\nu = h\,d\mu$. Also $h$ is unique, in the sense that if $h'$ is another extended real-valued measurable function defined on $X$ for which (2.11) holds, then $h = h'$ $\mu$-a.e.

Proof. The proof follows along the lines of that of Theorem 2.5; first we localize the problem at hand and reduce it to the particular case of finite measures. Write $X$ as a pairwise disjoint union
$$X = \bigcup_k X_k, \quad \text{with } \mu(X_k), |\nu|(X_k) < \infty, \text{ all } k.$$
Consider now the finite measures $\mu_k$ and signed measures $\nu_k$ defined on $(X,\mathcal{M})$ by
$$\mu_k(E) = \mu(E \cap X_k), \quad \text{and} \quad \nu_k(E) = \nu(E \cap X_k), \quad k = 1,2,\ldots$$
Further, write $\nu_k = \lambda_k - \lambda'_k$, where the $\lambda_k$'s and $\lambda'_k$'s are finite measures on $(X,\mathcal{M})$, $k = 1,2,\ldots$ It is also easy to check that under our assumptions we have
$$\lambda_k, \lambda'_k \ll \mu_k, \quad \text{and} \quad \lambda_{k,s} - \lambda'_{k,s} = 0, \quad \text{all } k.$$
Now, by Theorem 2.6 there exist (unique) nonnegative functions $h_k, h'_k$ in $L(\mu_k)$ with the property that
$$\lambda_k(E) = \int_E h_k\,d\mu_k, \quad \lambda'_k(E) = \int_E h'_k\,d\mu_k, \quad \text{all } E \in \mathcal{M},$$
and consequently for each $k = 1,2,\ldots$, we have
$$\nu_k(E) = \int_E (h_k - h'_k)\,d\mu_k, \quad E \in \mathcal{M}.$$
It is now readily seen that the function $h = \sum_k (h_k - h'_k)$ has all the desired properties. •

The assumption that $\mu$ is $\sigma$-finite is essential for the validity of Theorem 2.7. For, suppose $X$ is an uncountable set, and let $\mathcal{M}$ be the $\sigma$-algebra of those subsets of $X$ which are either at most countable or so that their complements are at most countable. For $E \in \mathcal{M}$ put $\mu(E) =$ the number of elements of $E$ if $E$ is finite and $\mu(E) = \infty$ otherwise, and $\nu(E) = 0$ or $1$ according as to whether $E$ is at most countable or not. Then $\nu \ll \mu$, but no integral representation such as the one given in (2.11) is possible. Clearly $\mu$ is not $\sigma$-finite.

The theorems of Lebesgue and Radon-Nikodym have many interesting applications; we discuss one next. If $\nu$ is a signed Borel measure, we say that $\nu$ is differentiable at $x \in R^n$, provided that
$$D\nu(x) = \lim_{r \to 0} \frac{\nu(I(x,r))}{|I(x,r)|} \quad \text{exists.}$$
Then the following is true.

Proposition 2.8. Suppose $\nu$ is a signed Borel measure so that $|\nu|$ is finite on bounded sets of $R^n$. Then $\nu$ is differentiable a.e. More precisely, if $\nu = \nu_a + \nu_s$ is the Lebesgue decomposition of $\nu$ with respect to the Lebesgue measure $\mu$, $\nu_a \ll \mu$, $\nu_s \perp \mu$, we have
$$D\nu_s = 0 \ \text{a.e.} \quad \text{and} \quad D\nu_a = d\nu_a/d\mu \ \text{a.e.}$$

Proof. Since $|\nu|$ is $\sigma$-finite, Theorem 2.4 applies. Let $h$ denote the Radon-Nikodym derivative of $\nu_a$ with respect to the Lebesgue measure and observe that
$$\frac{\nu(I(x,r))}{|I(x,r)|} = \frac{\nu_s(I(x,r))}{|I(x,r)|} + \frac{1}{|I(x,r)|} \int_{I(x,r)} h\,dy = A + B,$$
say. Since by the Lebesgue Differentiation Theorem we have $\lim_{r \to 0} B = h$ a.e., we are reduced to showing that $\lim_{r \to 0} A = 0$ a.e.; this is not hard. First observe that since $\nu_s$ is the difference of two Borel measures whose variations are finite on bounded sets, also $|\nu_s|$ enjoys this property. The proof now follows along the lines of Proposition 2.2. Since
$$|\nu_s(I(x,r))| \le |\nu_s|(I(x,r)),$$
it clearly suffices to show that
$$D|\nu_s| = 0 \quad \text{a.e.} \tag{2.12}$$
Let $B$ be a Borel set such that $|\nu_s|(B) = |R^n \setminus B| = 0$, and for $k = 1,2,\ldots$, let
$$F_k = \Big\{x \in R^n : \limsup_{r \to 0} \frac{|\nu_s|(I(x,r))}{|I(x,r)|} > \frac{1}{k}\Big\}.$$
Since $|F_k \cap (R^n \setminus B)| = 0$, in order to prove (2.12) it is enough to show that
$$|F_k \cap B| = 0, \quad \text{all } k. \tag{2.13}$$
Now, since $|\nu_s|$ is a regular Borel measure, given $\varepsilon > 0$, there exists an open set $O \supseteq B$ such that $|\nu_s|(O) \le \varepsilon$. Further, to each $x \in F_k \cap B$ associate an interval $I(x,r) \subseteq O$ with the property that
$$|\nu_s|(I(x,r)) > \frac{1}{k}\,|I(x,r)| . \tag{2.14}$$
Observe that if $K$ is an arbitrary compact subset of $F_k \cap B$, by a covering argument similar to that in the proof of the Hardy-Littlewood maximal theorem, specifically estimate (2.15) there, there exists a finite family $I(x_1,r_1),\ldots,I(x_m,r_m)$ of pairwise disjoint subintervals of $O$ such that
$$|K| \le c \sum_{j=1}^m |I(x_j,r_j)| , \tag{2.15}$$
where the constant $c$ depends only on the dimension $n$. But the intervals that appear on the right-hand side of (2.15) are special. In addition to being pairwise disjoint, they all satisfy (2.14). Therefore the sum on the right-hand side of (2.15) can be estimated by
$$\sum_{j=1}^m |I(x_j,r_j)| \le k \sum_{j=1}^m |\nu_s|(I(x_j,r_j)) \le k\,|\nu_s|(O) \le k\varepsilon .$$
Since $\varepsilon$ is arbitrary, this means that $|K| = 0$. By the regularity of the Lebesgue measure it follows that (2.13) holds, and we have finished. •
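Proposition 2.8 can be seen at work on a simple measure on the line. The Python sketch below is illustrative only: the measure $\nu = \nu_a + \nu_s$, with $\nu_a$ of density $h(y) = 2y$ on $[0,1]$ and $\nu_s$ a unit point mass at $0.5$, is a hypothetical example, and $I(x,r)$ is assumed to be the open interval $(x - r, x + r)$. The quotient $\nu(I(x,r))/|I(x,r)|$ settles at $h(x)$ away from the atom and blows up at the atom, which is a set of Lebesgue measure 0.

```python
# Hypothetical signed Borel measure on the line: nu = nu_a + nu_s, where
# nu_a has density h(y) = 2y on [0, 1] and nu_s is a unit point mass at 0.5.

def nu_a(a, b):
    """nu_a of the interval (a, b): the integral of 2y over (a, b) within [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return hi * hi - lo * lo if hi > lo else 0.0

def nu_s(a, b):
    """nu_s of the interval (a, b): mass 1 if the atom 0.5 lies in (a, b)."""
    return 1.0 if a < 0.5 < b else 0.0

def ratio(x, r):
    """nu(I(x, r)) / |I(x, r)|, assuming I(x, r) = (x - r, x + r)."""
    a, b = x - r, x + r
    return (nu_a(a, b) + nu_s(a, b)) / (b - a)

radii = [10.0 ** (-k) for k in range(1, 7)]

# Away from the atom the quotients approach h(0.3) = 0.6.
at_regular_point = [ratio(0.3, r) for r in radii]

# At the atom the quotients grow like 1 / (2r): D nu does not exist there.
at_atom = [ratio(0.5, r) for r in radii]
```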
3. PROBLEMS AND QUESTIONS

In what follows, all measures are assumed to be defined on a $\sigma$-algebra $\mathcal{M}$ of subsets of a space $X$, even when this is not explicitly stated.

3.1 Referring to the notation of 4.8 in Chapter IV and 3.10 in Chapter IX, if $\nu$ is a signed measure, describe $\nu^+ \wedge \nu^-$, and $\nu^+ \vee \nu^-$.

3.2 Let $\mu$ be a measure and $f$ an extended real-valued measurable function defined on $X$. If $\nu$ is the signed measure given by
$$\nu(E) = \int_E f\,d\mu, \quad E \in \mathcal{M},$$
describe $\nu^+$, $\nu^-$ and $|\nu|$ in terms of $f$. What if $\mu$ is a signed measure and $f$ has constant sign? What if both $\mu$ and $f$ are allowed to have variable sign?

3.3 Let $\mathcal{F}$ denote the class of all measures on $(X,\mathcal{M})$. Show that the relation
$$\lambda \prec \mu \ \text{ iff } \ \lambda(E) \le \mu(E), \quad \text{all } E \in \mathcal{M}, \tag{3.1}$$
is a partial ordering on $\mathcal{F}$.

3.4 Show that if (3.1) holds then $L^1(\mu) \subseteq L^1(\lambda)$. Is the converse true?

3.5 A subset $A \in \mathcal{M}$ is said to be positive with respect to a signed measure $\nu$ if for each $E \subseteq A$, $E \in \mathcal{M}$, we have $\nu(E) \ge 0$. Similarly, a measurable set $C$ is said to be negative with respect to $\nu$ if for each $E \subseteq C$, $E \in \mathcal{M}$, we have $\nu(E) \le 0$. Finally, a set $N \in \mathcal{M}$ is said to be null with respect to $\nu$ if it is simultaneously positive and negative with respect to $\nu$. Show that there may be sets of measure 0 which are not null, as there may be sets of positive measure which are not positive, and sets of negative measure which are not negative, all with respect to $\nu$. Also, investigate some of the properties of these classes of sets. For instance, show that every measurable subset of a positive set is positive, etc.

3.6 If $\nu$ is a signed measure which does not assume the value $\infty$, and if $\nu(E) > -\infty$, show that $E$ contains a measurable subset $A$ such that: (a) $\nu(A) \ge \nu(E)$, and, (b) $A$ is positive with respect to $\nu$.

3.7 If $\nu$ is a signed measure, prove that $E$ is null with respect to $\nu$ iff $|\nu|(E) = 0$.
3.8 If $\nu$ is a signed measure and $\lambda, \mu$ are measures such that $\nu = \lambda - \mu$, show that
$$\lambda(E) \ge \nu^+(E), \quad \text{and} \quad \mu(E) \ge \nu^-(E), \quad \text{all } E \in \mathcal{M}.$$

3.9 Suppose $\mu_F$ is the probability measure on $R$ induced by the distribution function $F$, and that $F(x) = \mu_F((-\infty,x])$. If $\lambda$ denotes the Lebesgue measure on $R$ and $d\mu_F/d\lambda = f$, show that
$$F(x) = \int_{(-\infty,x]} f(y)\,d\lambda(y), \quad -\infty < x < \infty.$$
The nonnegative function $f$ is known as the probability density of $F$. Compute the probability density that corresponds to the distribution function
$$F(x) = \begin{cases} 0 & \text{if } x < a \\ (x - a)/(b - a) & \text{if } a \le x < b \\ 1 & \text{if } x \ge b. \end{cases}$$

3.10 Suppose
$$F(x) = \begin{cases} 0 & \text{if } x < -1 \\ 1 - x^2 & \text{if } x \ge -1, \end{cases}$$
and let $\mu$ be the signed measure on $(R,\mathcal{B})$ that satisfies $\mu((x,y]) = F(y) - F(x)$. Find the Hahn decomposition of $\mu$ and for an arbitrary interval $I = (x,y]$ of $R$ find explicit formulas for $\mu^+(I)$, $\mu^-(I)$ and $|\mu|(I)$.

3.11 Does there exist an increasing distribution function $F$ on $R$ such that the induced Borel measure $\mu_F$ is not absolutely continuous with respect to the Lebesgue measure on $R$?

3.12 If $\nu$ is a finite measure and $\mu$ a measure, show that the following are equivalent: (a) $\nu \ll \mu$, and, (b) If the sequence $\{E_n\} \subseteq \mathcal{M}$ has the property that $\lim_{n\to\infty} \mu(E_n) = 0$, then $\lim_{n\to\infty} \nu(E_n) = 0$.

3.13 Show that $\nu \ll \mu$ iff $\nu^+, \nu^- \ll \mu$ iff $|\nu| \ll \mu$.

3.14 If $\nu_1, \nu_2$ are signed measures, $\mu$ is a measure and $\nu_1, \nu_2 \ll \mu$, prove that $\nu_1 + \nu_2 \ll \mu$ and $d(\nu_1 + \nu_2)/d\mu = d\nu_1/d\mu + d\nu_2/d\mu$.

3.15 Given measures $\lambda, \mu$, show that
$$d\lambda/d(\lambda + \mu) + d\mu/d(\lambda + \mu) = 1 \quad (\lambda + \mu)\text{-a.e.}$$
3.16 If $\lambda, \mu, \nu$ are measures such that $\lambda \ll \mu$ and $\mu \ll \nu$, show that $\lambda \ll \nu$ and $d\lambda/d\nu = (d\lambda/d\mu) \cdot (d\mu/d\nu)$ $\lambda$-a.e.

3.17 Suppose $\mu, \nu$ are $\sigma$-finite measures so that $\mu \ll \nu$ and $\nu \ll \mu$. Show that
$$d\nu/d\mu = 1/(d\mu/d\nu) \quad \mu\text{-a.e.}$$

3.18 Let $\mu, \nu$ be $\sigma$-finite measures, and suppose $\mu - \nu$ is a measure so that $\nu \ll \mu - \nu$. Show that $\mu(\{d\nu/d\mu = 1\}) = 0$.

3.19 Let $\mu, \nu$ be $\sigma$-finite measures, and suppose $\nu \ll \mu$. Show that $\nu(\{d\nu/d\mu = 0\}) = 0$.

3.20 Referring to 4.14 in Chapter VII, if $\mu$ is a finite measure and $\lambda$ is a $\sigma$-finite measure on $(Y,\mathcal{N})$, then show that there exists a function $h \in L^1(\lambda)$ such that $\int_X f \circ T\,d\mu = \int_Y f\,d\nu = \int_Y f h\,d\lambda$ for all $f$ in $L^1(\nu)$.

3.21 Let $\mu, \nu$ be $\sigma$-finite measures and assume $\nu \ll \mu$. Show that for all $f \in L^1(\nu)$, $f\,(d\nu/d\mu) \in L(\mu)$ and
$$\int_X f\,d\nu = \int_X f\,(d\nu/d\mu)\,d\mu .$$

3.22 Consider the relation defined on classes of measures by $\lambda \sim \mu$ iff $\lambda \ll \mu$ and $\mu \ll \lambda$. Show that $\sim$ is an equivalence relation and describe the relation between $L^1(\lambda)$ and $L^1(\mu)$.

3.23 Given a measure $\mu$, put
$$\nu(E) = \begin{cases} 0 & \text{if } \mu(E) = 0 \\ \infty & \text{if } \mu(E) > 0. \end{cases}$$
Show that $\nu$ is a measure on $(X,\mathcal{M})$, and that $\nu \ll \mu$. Also find $d\nu/d\mu$.

3.24 Let $(X,\mathcal{M},\mu)$ be a measure space, $\mathcal{N} \subset \mathcal{M}$ a $\sigma$-algebra of subsets of $X$, and $\nu = \mu|\mathcal{N}$ the restriction of $\mu$ to $\mathcal{N}$. Show that given $f \in L(\mu)$, there exists a uniquely determined ($\nu$-a.e., that is) $f_1 \in L(\nu)$ such that $\int_A f\,d\mu = \int_A f_1\,d\nu$ for every $A \in \mathcal{N}$. $f_1$ is called the conditional expectation of $f$ with respect to $\mathcal{N}$.

3.25 Show that the Radon-Nikodym Theorem remains true if $\mu$ is a $\sigma$-finite signed measure.

3.26 Let $\mu, \nu$ be regular Borel measures. Show that if $\operatorname{supp} \mu \cap \operatorname{supp} \nu = \emptyset$, then $\mu \perp \nu$.

3.27 Show that if $\nu_1, \nu_2 \perp \mu$, then $\nu_1 + \nu_2 \perp \mu$.
3.28 Show that $\mu \perp \nu$ iff $|\mu| \perp |\nu|$.

3.29 Let $\mu, \nu$ be $\sigma$-finite measures. Show that $\mu \perp \nu$ iff there exists no nonzero measure $\lambda$ such that $\lambda \ll \mu$ and $\lambda \ll \nu$.

3.30 Suppose $\lambda$ and $\mu$ are finite measures. Prove there is a (measurable) partition $\{A_1, A_2, A_3\}$ of $X$ such that:
(i) $\lambda(A_1) = 0$,
(ii) $\mu(A_2) = 0$, and
(iii) On $A_3$, $\lambda \ll \mu$ and $\mu \ll \lambda$.
Also, show there exists a finite positive measurable function $h$ on $A_3$ such that for every nonnegative measurable function $f$ on $A_3$,
$$\int_{A_3} f\,d\lambda = \int_{A_3} f h\,d\mu \quad \text{and} \quad \int_{A_3} f\,d\mu = \int_{A_3} (f/h)\,d\lambda .$$

3.31 Suppose $\nu$ is a signed Borel measure with the property that $|\nu|$ is finite on bounded sets of $R^n$. Referring to 3.25 in Chapter VIII, show that if $\mathcal{R} = \{R\}$ is a regular family, then
$$\lim_{\operatorname{diam}(R) \to 0} \frac{\nu(x + R)}{|R|} \quad \text{exists a.e.}$$

3.32 Suppose $\nu$ is as in 3.31 and that $A$ is a Borel subset of $R^n$ so that $D\nu(x) \ge \lambda$ for all $x \in A$. Prove that $\nu(A) \ge \lambda|A|$.

3.33 Assume $\mu, \mu_m$ are finite Borel measures, $m = 1,2,\ldots$, such that $\lim_{m\to\infty} \mu_m(A) = \mu(A)$ for all $A \in \mathcal{B}_n$, monotonically (either nondecreasing or nonincreasing). Prove that
$$\lim_{m\to\infty} D\mu_m = D\mu \quad \text{a.e.}$$

3.34 A complex-valued set function $\nu$ of the form
$$\nu(E) = \nu_1(E) + i\,\nu_2(E), \quad E \in \mathcal{M},$$
where $\nu_1$ and $\nu_2$ are signed measures is called a complex measure. $\nu_1$ and $\nu_2$ are called the real and imaginary parts of $\nu$, respectively. This is an open ended question: Discuss the properties of complex measures. For instance, show that if $|\nu|$ is given by 4.12 in Chapter IV, then the set function $|\nu|$ is a measure called the (total) variation of $\nu$. Also prove that if $\eta = \sup\{|\nu(A)| : A \in \mathcal{M}, A \subseteq E\}$, then
$$\eta \le |\nu|(E) \le 4\eta, \quad \text{all } E \in \mathcal{M}.$$
3.35 Let $\mu$ and $\nu$ be complex measures. We say that $\nu$ is absolutely continuous with respect to $\mu$, and we write $\nu \ll \mu$, if $|\mu|(E) = 0$ implies $\nu(E) = 0$, for $E \in \mathcal{M}$. Discuss properties of absolutely continuous measures. For instance, prove that if $\nu = \nu_1 + i\,\nu_2$, then the following are equivalent: (i) $\nu \ll \mu$, (ii) $\nu_1^+, \nu_1^-, \nu_2^+, \nu_2^- \ll \mu$, and, (iii) $|\nu| \ll \mu$.

3.36 Let $\mu$ and $\nu$ be complex measures. We say that $\mu$ and $\nu$ are singular, and we write $\mu \perp \nu$, if there exists a set $A \in \mathcal{M}$ such that $|\mu|(A) = |\nu|(X \setminus A) = 0$. Discuss properties of singular measures.

3.37 Can you think of a Lebesgue decomposition theorem in case $(X,\mathcal{M},\mu)$ is a $\sigma$-finite measure space and $\nu$ is a complex measure on $(X,\mathcal{M})$?
CHAPTER XII

$L^p$ Spaces
In this chapter we introduce the Lebesgue spaces of $p$-integrable functions and study their basic properties. In considering the various results discussed here, the reader should keep in mind the three basic examples: The $L^p$ spaces of Lebesgue measurable functions on the line, or $R^n$, the $L^p$ spaces of Lebesgue measurable functions on a bounded interval, and the sequence $\ell^p$ spaces.

1. THE LEBESGUE $L^p$ SPACES

Let $(X,\mathcal{M},\mu)$ be a measure space and $f$ an extended real-valued measurable function defined on $X$. Then, for $0 < p < \infty$, $|f|^p$ is also measurable and the expression
$$\|f\|_p = \Big(\int_X |f|^p\,d\mu\Big)^{1/p}, \quad 0 < p < \infty, \tag{1.1}$$
whether finite or not, is well-defined and is called the "$p$ norm" of $f$. The case $p = 1$ has been studied in Chapter VIII, and the natural question to consider is to what extent those results can be extended to values of $p$ other than 1. First a definition. The Lebesgue class $L^p(X,\mu)$, or plainly $L^p(X)$ or $L^p(\mu)$, is defined as
$$L^p(X,\mu) = \{f \text{ measurable} : \|f\|_p < \infty\}, \quad 0 < p < \infty. \tag{1.2}$$
Our immediate goals are to show that $L^p(\mu)$ is a linear class and to introduce a metric that will turn $L^p(\mu)$ into a complete metric space.
It is not hard to verify that $L^p(\mu)$ is a linear class. Indeed, given $f, g \in L^p(\mu)$ and a real scalar $\lambda$, first note that $f + \lambda g$ is measurable. Furthermore, since for nonnegative real numbers $a, b$ we have
$$(a + b)^p \le 2^p (a^p + b^p), \quad 0 < p < \infty, \tag{1.3}$$
it follows that $|f(x) + \lambda g(x)|^p \le 2^p(|f(x)|^p + |\lambda|^p |g(x)|^p)$, $x \in X$, and consequently, we also have
$$\|f + \lambda g\|_p^p \le 2^p \big(\|f\|_p^p + |\lambda|^p \|g\|_p^p\big) < \infty .$$
As for the metric, inspired by (1.1) in Chapter VIII, a natural choice is
$$d_p(f,g) = \|f - g\|_p, \quad 0 < p < \infty.$$
Since $d_p(f,g) = 0$ implies $f = g$ $\mu$-a.e., strictly speaking, the elements of $L^p(\mu)$ are equivalence classes of functions defined on $X$, where we agree that $f = g$ means that $f = g$ $\mu$-a.e. Now, since the cases $0 < p < 1$ and $1 < p < \infty$ are essentially different, we treat them separately; we do the former case first. So, fix $0 < p < 1$, and note that (1.3) can be improved to
$$(a + b)^p \le a^p + b^p, \quad a, b \ge 0, \tag{1.4}$$
with equality occurring in (1.4) iff $ab = 0$. To see this observe that if $ab = 0$ there is nothing to prove. On the other hand, if $ab \ne 0$, and since (1.4) is to hold for any $a, b > 0$, we may replace $a$ by $tb$ there and consider the equivalent inequality
$$\varphi(t) = (1 + t)^p - 1 - t^p \le 0, \quad \text{all } t \ge 0, \tag{1.5}$$
with $\varphi(t) = 0$ iff $t = 0$. But this is easily checked: Indeed, since $0 < p < 1$, we have $\varphi'(t) = p(1 + t)^{p-1} - p\,t^{p-1} < 0$ for $t > 0$, and consequently, $\varphi(t) < \varphi(0) = 0$, which gives (1.5), including the remark concerning equality. With this observation out of the way, let $A, B$ be disjoint measurable subsets of $X$ neither of which is null, and let $f = \chi_A$, $g = \chi_B$. Then, putting $a = \mu(A)^{1/p}$ and $b = \mu(B)^{1/p}$ in (1.4) we obtain
$$d_p(f,g) = \Big(\int_X |f - g|^p\,d\mu\Big)^{1/p} = \big(\mu(A) + \mu(B)\big)^{1/p} = \Big(\big(\mu(A)^{1/p}\big)^p + \big(\mu(B)^{1/p}\big)^p\Big)^{1/p} \ge \mu(A)^{1/p} + \mu(B)^{1/p},$$
or, in other words,
$$d_p(f,g) \ge d_p(f,0) + d_p(0,g),$$
with strict inequality when $\mu(A), \mu(B) < \infty$, by the remark concerning equality in (1.4). Thus, the inequality opposite to the triangle inequality holds, and $d_p$ cannot be a distance function on $L^p(\mu)$. This is not, however, a serious difficulty. Indeed, by (1.4), the expression
$$d_p(f,g)^p = \int_X |f - g|^p\,d\mu$$
satisfies the requirements of a metric, and essentially all the results discussed in Chapter VIII are true: Endowed with this metric $L^p(\mu)$ is a complete metric space and $C_0(R^n)$ is dense in $L^p(R^n)$.

It is interesting to point out that $L^p$ integrable functions are not necessarily integrable, or locally integrable for that matter. For instance, when $0 < p < 1$, by 4.46 in Chapter VII, the function $|x|^{-n} \in L^p(I(0,1))$, but it is not locally integrable in any neighbourhood of the origin.

In order to deal with the case $1 < p < \infty$ we need a preliminary result which is essential in what follows. We have already encountered a particular instance of this result, the Cauchy-Schwarz inequality, in 3.1 in Chapter VIII. We say that $1 < p, q < \infty$ are conjugate indices provided that $1/p + 1/q = 1$.

Conjugate indices are also related by the expressions $p + q = pq$ and $q = p/(p - 1)$. An important property that the conjugate indices $p = 1/\eta$, $q = 1/(1 - \eta)$, $0 < \eta < 1$, satisfy, is this: For any $a, b \ge 0$ we have
$$a^\eta b^{1-\eta} \le \eta a + (1 - \eta) b . \tag{1.6}$$
Furthermore, equality holds in (1.6) iff $a = b$. To see this note that if $ab = 0$ there is nothing to prove. Otherwise, if $ab \ne 0$, and since (1.6) is to hold for any $a, b > 0$, we may replace $a$ by $tb$ there and consider the equivalent assertion
$$\varphi(t) = t^\eta - \eta t - (1 - \eta) \le 0, \quad \text{all } t \ge 0,$$
with $\varphi(t) = 0$ iff $t = 1$. Note that since $0 < \eta < 1$, we have
$$\varphi'(t) = \eta\,t^{\eta - 1} - \eta \ \begin{cases} > 0 & \text{if } 0 < t < 1 \\ = 0 & \text{if } t = 1 \\ < 0 & \text{if } t > 1. \end{cases}$$
XII.
210
LP Spaces
It is not hard to verify that LP(J-L) is a linear class. Indeed, given f,g E LP(J-L) and a real scalar .,x, first note that f + .,Xg is measurable. Furthermore, since for nonnegative real numbers a, b we have
( a + b)P ::; 2P (a P + bP) ,
0
it follows that If(x) + .,Xg(x)IP ::; 2P(lf(x)iP consequently, we also have
(1.3)
00 ,
+ 1.,XIPlg(x)IP),x
E X, and
As for the metric, inspired by (1.1) in Chapter VIII, a natural choice is
dp(f,g) = IIf - gllp,
0
00.
Since dp(f, g) = 0 implies f = 9 J-L-a.e, strictly speaking, the elements of LP(J-L) are equivalence classes of functions defined on X, where we agree that f = 9 means that f = 9 J-L-a.e. Now, since the cases 0 < p < 1 and 1 < p < 00 are essentially different, we treat them separately; we do the former case first. So, fix 0 < p < 1, and note that (1.3) can be improved to
(a
+ b)P ::; aP + bP,
a, b ~ 0 ,
(1.4)
with equality occurring in (1.4) iff ab = o. To see this observe that if ab = 0 there is nothing to prove. On the other hand, if ab ::f:. 0, and since (1.4) is to hold for any a, b > 0, we may replace a by ab there and consider the equivalent inequality
+ t)P -
1 - t P ::; 0 ,
all t
~
0,
(1.5)
with 0, and consequently,
dp(f,g)
=
(Ix If - glP dJ-L)
= ((J-L(A)l/P)P
l/p
= (J-L(A) + J-L(B))l/ P
+ (p.(B)l/P)p
riP ~
p.(A)l/P + p.(B)l/P ,
1.
The Lebesgue LP Spaces
211
or, in other words,
Thus, the inequality opposite to the triangle inequality holds, and dp cannot be a distance function on LP(p.). This is not, however, a serious difficulty. Indeed, by (1.4), the expression
d~(J,g) =
LIf -
glP dp.
satisfies the requirements of a metric, and essentially all the results discussed in Chapter VIII are true: Endowed with this metric LP(p.) is a complete metric space and Co(Rn) is dense in LP(Rn). It is interesting to point out that LP integrable functions are not necessarily integrable, or locally integrable for that matter. For instance, when 0 < p < 1, by 4.46 in Chapter VII, the function Ixl- n E LP(I(O,l)), but it is not locally integrable in any neighbourhood of the origin. In order to deal with the case 1 < p < 00 we need a preliminary result which is essential in what follows. We have already encountered a particular instance of this result, the Cauchy-Schwarz inequality, in 3.1 in Chapter VIII. We say that 1 < p, q < 00 are conjugate indices provided that l/p+ l/q = 1.
Conjugate indices are also related by the expressions $p + q = pq$ and $q = p/(p - 1)$. An important property that the conjugate indices $p = 1/\eta$, $q = 1/(1 - \eta)$, $0 < \eta < 1$, satisfy, is this: For any $a, b \ge 0$ we have
$$a^{\eta} b^{1-\eta} \le \eta a + (1 - \eta) b. \tag{1.6}$$
Furthermore, equality holds in (1.6) iff $a = b$. To see this note that if $ab = 0$ there is nothing to prove. Otherwise, if $ab \ne 0$, and since (1.6) is to hold for any $a, b > 0$, we may replace $a$ by $tb$ there and consider the equivalent assertion
$$\varphi(t) = t^{\eta} - \eta t - (1 - \eta) \le 0, \qquad \text{all } t \ge 0,$$
with $\varphi(t) = 0$ iff $t = 1$. Note that since $0 < \eta < 1$, we have
$$\varphi'(t) = \eta t^{\eta - 1} - \eta \; \begin{cases} > 0 & \text{if } 0 < t < 1, \\ = 0 & \text{if } t = 1, \\ < 0 & \text{if } t > 1. \end{cases}$$
Thus $\varphi(t)$ increases to 0 as $t$ increases to 1, and then decreases to $-\infty$ as $t \to \infty$, and $\varphi(t) = 0$ iff $t = 1$. This gives (1.6), including the remark concerning equality.

A useful reformulation of (1.6), known as Young's inequality, is this: If $a, b > 0$ and $p, q$ are conjugate indices, then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}, \tag{1.7}$$
and equality holds in (1.7) iff $a^p = b^q$. We are now ready to prove Hölder's inequality.

Theorem 1.1. Let $(X, \mathcal{M}, \mu)$ be a measure space, $p, q$ conjugate indices, and suppose $f \in L^p(\mu)$, $g \in L^q(\mu)$. Then $fg$ is integrable, and
$$\int_X |fg|\,d\mu \le \|f\|_p \|g\|_q. \tag{1.8}$$
Moreover, equality holds in (1.8) iff there exist nonnegative constants $A, B$, $AB \ne 0$, such that
$$A|f|^p = B|g|^q \quad \mu\text{-a.e.} \tag{1.9}$$

Proof. If the right-hand side of (1.8) is 0, then either $f = 0$ $\mu$-a.e. or else $g = 0$ $\mu$-a.e., and we have equality in (1.8). If, on the other hand, $\|f\|_p \|g\|_q \ne 0$, by replacing $f$ and $g$ by $f/\|f\|_p$ and $g/\|g\|_q$ if necessary, we may assume that $\|f\|_p = \|g\|_q = 1$. Now, since $f$ and $g$ are finite $\mu$-a.e., from (1.7) it follows that
$$|fg| \le \frac{|f|^p}{p} + \frac{|g|^q}{q}, \quad \mu\text{-a.e.}, \tag{1.10}$$
and consequently, integrating over $X$ we obtain
$$\int_X |fg|\,d\mu \le \frac{1}{p}\int_X |f|^p\,d\mu + \frac{1}{q}\int_X |g|^q\,d\mu = 1,$$
which is precisely (1.8). A moment's thought will convince the reader that equality can occur in (1.8) iff it occurs in (1.10) $\mu$-a.e. But, by (1.7), this is true iff $|f|^p = |g|^q$ $\mu$-a.e., and this gives (1.9) when $\|f\|_p = \|g\|_q = 1$. As for arbitrary functions $f$ and $g$, we normalize them as above and note that equality holds iff $|f|^p/\|f\|_p^p = |g|^q/\|g\|_q^q$ $\mu$-a.e. Whence a possible choice for the (nonunique) constants is $A = \|g\|_q^q$ and $B = \|f\|_p^p$. ■

We are now in a position to prove one of the essential results in the theory of $L^p$ spaces, which is due to Minkowski (1864-1909), namely, Minkowski's inequality.
Theorem 1.2. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $f, g \in L^p(\mu)$, $1 \le p < \infty$. Then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{1.11}$$
As for equality in (1.11) there are two separate cases, depending on whether $p = 1$ or not. If $1 < p < \infty$ the condition is: There exist constants $A, B$, $AB \ne 0$, such that $Af = Bg$ $\mu$-a.e. On the other hand, if $p = 1$ the condition is: There exists a nonnegative measurable function $h$ such that $fh = g$ $\mu$-a.e. on the set $\{fg \ne 0\}$.

Proof. (1.11), in case $p = 1$, was already established in Chapter VIII. As for $p > 1$, since $f, g$ are finite $\mu$-a.e., so is $f + g$, and we have that $|f + g|^p$ is less than or equal to
$$|f + g||f + g|^{p-1} \le |f||f + g|^{p-1} + |g||f + g|^{p-1} \quad \mu\text{-a.e.} \tag{1.12}$$
Whence integrating (1.12) over $X$ we get
$$\|f + g\|_p^p \le \int_X |f||f + g|^{p-1}\,d\mu + \int_X |g||f + g|^{p-1}\,d\mu = I + J, \tag{1.13}$$
say. To estimate $I$ and $J$ we apply Hölder's inequality with indices $p$ and its conjugate $q = p/(p - 1)$, and note that
$$I \le \|f\|_p \Big(\int_X |f + g|^{(p-1)p/(p-1)}\,d\mu\Big)^{(p-1)/p} = \|f\|_p \|f + g\|_p^{p-1}, \tag{1.14}$$
and similarly,
$$J \le \|g\|_p \|f + g\|_p^{p-1}. \tag{1.15}$$
Whence substituting (1.14) and (1.15) into (1.13) it follows that
$$\|f + g\|_p^p \le (\|f\|_p + \|g\|_p)\,\|f + g\|_p^{p-1},$$
which is equivalent to (1.11). Finally, when $1 < p < \infty$, it is clear that we only have equality in (1.11) provided we have equality in (1.12) and in the estimates of $I$ and $J$ in (1.14) and (1.15), respectively. Now, equality holds in (1.12) if $f$ and $g$ are of the same sign $\mu$-a.e. Also, by (1.9), equality holds in the estimates of $I$ and $J$ if for some constants $A, B$ we have $A|f|^p = |f + g|^p = B|g|^p$ $\mu$-a.e. Whence combining these remarks it follows that equality holds if, as asserted, $Af = Bg$ $\mu$-a.e.

When $p = 1$ we must have equality in $|f + g| \le |f| + |g|$ $\mu$-a.e., and this occurs when $f$ and $g$ are of the same sign $\mu$-a.e. Then the function $h = g/f$ will do the job. ■

An interesting consequence of Minkowski's inequality is that $d_p$ is a metric on $L^p(\mu)$, $1 < p < \infty$. The only property that offers any difficulty is the triangle inequality, and it is obtained as follows: If $f, g, h \in L^p(\mu)$, we have
$$d_p(f,g) = \|f - g\|_p = \|(f - h) + (h - g)\|_p \le \|f - h\|_p + \|h - g\|_p = d_p(f,h) + d_p(h,g).$$
Furthermore,
Theorem 1.3 (F. Riesz-Fischer). Let $(X, \mathcal{M}, \mu)$ be a measure space. Then, the distance function $d_p$ turns $L^p(\mu)$ into a complete metric space, $1 < p < \infty$.

Proof. We fix $1 < p < \infty$, and assume that $\{f_n\}$ is a Cauchy sequence of functions in $L^p(\mu)$. We must show that there is a function $f \in L^p(\mu)$ so that $\lim_{n\to\infty} d_p(f_n, f) = 0$. First observe that since $\{f_n\}$ is Cauchy we can find an increasing sequence $n_{k+1} > n_k$ such that
$$d_p(f_n, f_{n_k}) \le 1/2^k, \qquad \text{all } n \ge n_k, \quad k = 1, 2, \ldots$$
Put now $f_{n_0} = 0$, and let
$$g_m = \sum_{k=1}^{m} |f_{n_k} - f_{n_{k-1}}|, \qquad m = 1, 2, \ldots$$
Clearly the sequence $\{g_m\}$ is nondecreasing; let $g$ denote its limit. Furthermore, since by (a simple extension of) Minkowski's inequality it follows that $\|g_m\|_p \le 1$ for all $m$, by MCT we get that $\int_X g^p\,d\mu \le 1$. Thus, in particular, $g$ is finite $\mu$-a.e., and the series with terms $f_{n_k} - f_{n_{k-1}}$, $k = 1, 2, \ldots$, converges absolutely to a finite sum $\mu$-a.e. Let then
$$f = \lim_{m\to\infty} \sum_{k=1}^{m} (f_{n_k} - f_{n_{k-1}});$$
$f$ is measurable and finite $\mu$-a.e. Moreover, since the sum on the right-hand side above telescopes to $f_{n_m}$, it readily follows that
$$\lim_{k\to\infty} f_{n_k} = f \quad \mu\text{-a.e.}$$
We want to show that the convergence is also in the metric of $L^p(\mu)$. First observe that since
$$|f| \le g, \qquad \text{and} \qquad |f_{n_k}| \le g_k \le g, \quad \text{all } k,$$
and since $g \in L^p(\mu)$, by LDCT we get
$$\lim_{k\to\infty} \int_X |f_{n_k} - f|^p\,d\mu = \lim_{k\to\infty} d_p^p(f_{n_k}, f) = 0.$$
To complete the proof we invoke the well-known fact that if a Cauchy sequence in a metric space has a convergent subsequence, then the sequence itself converges to the same limit. ■

As in the case of integrable functions, the metric structure of $L^p(X, \mu)$ permits us to establish the following properties:
(i) Simple functions are dense in $L^p(X, \mu)$, $1 < p < \infty$.
(ii) $C_0(R^n)$, $C_0^\infty(R^n)$ are dense in $L^p(R^n)$, $1 < p < \infty$.
(iii) $C_0^k(I) = \{f \in C_0^k(R^n) : f \text{ vanishes off } I\}$ is dense in $L^p(I)$, $1 < p < \infty$, $0 \le k \le \infty$.
(iv) The class of sequences $\{c_n\}$ which eventually vanish is dense in $\ell^p$, $0 < p < \infty$.
(v) The translates of $L^p(R^n)$ functions are continuous in the norm, i.e., if $f \in L^p(R^n)$, then $\lim_{|h|\to 0} \|f(\cdot + h) - f(\cdot)\|_p = 0$, $1 < p < \infty$.
Since the proof of these statements follows along the lines of that of the corresponding results for $p = 1$, we leave their verification to the reader.

Finally, in the scale of Lebesgue spaces there are two limiting cases left to consider: The case $p = 0$ and the case $p = \infty$. The idea here is to let $p \to 0$ and $p \to \infty$ in the expressions corresponding to $\|f\|_p^p$ and $\|f\|_p$ respectively, and study what happens. In the limiting case $p = 0$, it is clear that the limit exists and it equals $\|f\|_0 = \int_{\{f \ne 0\}} d\mu$. The class $L^0(X, \mu)$, consisting of those extended real-valued finite $\mu$-a.e. measurable functions $f$ defined on $X$ so that $\mu(\{f \ne 0\}) < \infty$, enjoys many of the properties of the $L^p(X, \mu)$ spaces and is very useful in applications; we discuss it no further here.

As for the case $p = \infty$, to fix ideas consider $I = [0,1]$ and a measurable function $f$ defined on $I$. If $1 < p < q < \infty$ are given, observe that $r = q/p > 1$ and $r' = r/(r - 1) = q/(q - p)$ are conjugate indices, and consequently, by Hölder's inequality, and assuming that the quantities involved are finite, we have
$$\int_I |f|^p\,dx \le \Big(\int_I |f|^{pr}\,dx\Big)^{1/r} \Big(\int_I dx\Big)^{1/r'} = \Big(\int_I |f|^q\,dx\Big)^{p/q},$$
or equivalently,
$$\|f\|_p \le \|f\|_q, \qquad 1 < p < q < \infty.$$
Whence, the $p$ norms of $f$ are nondecreasing and $\lim_{q\to\infty} \|f\|_q$ exists. Suppose this limit is finite (this is the case of interest to us), and call it $L$. By Chebychev's inequality we have
$$\lambda\,|\{|f| > \lambda\}|^{1/q} \le \|f\|_q \le L, \qquad \text{all } \lambda > 0. \tag{1.16}$$
Moreover, since $\lim_{q\to\infty} |\{|f| > \lambda\}|^{1/q} = 1$ whenever $|\{|f| > \lambda\}| \ne 0$, by (1.16) it readily follows that
$$|\{|f| > \lambda\}| = 0, \qquad \text{all } \lambda > L. \tag{1.17}$$
Functions that satisfy (1.17) are called "essentially bounded". In contrast to bounded functions, essentially bounded functions may assume infinite values, but only on null sets. These observations motivate our next definition. Let $(X, \mathcal{M}, \mu)$ be a measure space and $f$ an extended real-valued measurable function defined on $X$. The expression
$$\|f\|_\infty = \inf\{\lambda > 0 : \mu(\{|f| > \lambda\}) = 0\}, \tag{1.18}$$
whether finite or not, is well-defined and is called the $\mu$-essential sup of $|f|$, or the "$\infty$ norm" of $f$. The Lebesgue class $L^\infty(X, \mu)$ consists of those measurable functions $f$ with $\|f\|_\infty < \infty$, and it is also denoted by $L^\infty(\mu)$ or $L^\infty(X)$. Note that since
$$\mu(\{|f| > \|f\|_\infty + 1/n\}) = 0, \qquad n = 1, 2, \ldots,$$
we also have
$$\mu(\{|f| > \|f\|_\infty\}) = 0, \tag{1.19}$$
and the infimum in (1.18) is attained.

The next step in the study of the $L^\infty$ spaces is to establish which properties of the $L^p$ spaces remain valid in this setting, and which do not. For instance, since convergence in the $L^\infty$ norm corresponds to uniform convergence, if $I = [0,1]$, only continuous functions can be approximated in $L^\infty(I)$ by continuous functions, and consequently, $C_0(I)$ is not dense in $L^\infty(I)$. On the other hand, on a positive note we have
Theorem 1.4. Endowed with the metric
$$d_\infty(f,g) = \|f - g\|_\infty, \tag{1.20}$$
$L^\infty(X, \mu)$ becomes a complete metric space.

Proof. To show that $d_\infty$ is a metric the only property that offers any difficulty is the triangle inequality. The proof of this depends on the following variant of Minkowski's inequality:
$$\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty. \tag{1.21}$$
To see this observe that
$$\{|f + g| > \|f\|_\infty + \|g\|_\infty\} \subseteq \{|f| > \|f\|_\infty\} \cup \{|g| > \|g\|_\infty\},$$
and since by (1.19) the sets on the right-hand side above are null, so is the set on the left-hand side, and (1.21) follows. The completeness of $L^\infty(\mu)$ with the metric $d_\infty$ follows along the lines of Theorem 1.3 and is therefore left for the reader to verify. ■

Another important property of $L^\infty(\mu)$ is that 1 and $\infty$ are conjugate indices, in the sense that $1/1 + 1/\infty = 1$. A justification for this convention is given by the following extension of Hölder's inequality.
Proposition 1.5. Let $(X, \mathcal{M}, \mu)$ be a measure space, and suppose $f \in L(\mu)$, $g \in L^\infty(\mu)$. Then $fg \in L(\mu)$, and we have
$$\int_X |fg|\,d\mu \le \|f\|_1 \|g\|_\infty. \tag{1.22}$$
Equality holds in (1.22) iff $|g| = \|g\|_\infty$ $\mu$-a.e.

Proof. By (1.19) we may assume the integral in (1.22) to be extended over the set $\{|g| \le \|g\|_\infty\}$, and in this case (1.22) holds trivially. As for the case of equality in (1.22), observe that since
$$|g|/\|g\|_\infty \le 1 \quad \mu\text{-a.e.},$$
the relation
$$\int_X |f|\,(|g|/\|g\|_\infty)\,d\mu = \int_X |f|\,d\mu$$
can only be true if, as asserted,
$$|g|/\|g\|_\infty = 1 \quad \mu\text{-a.e.}$$
■
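The limiting argument that motivated the definition of the $\infty$ norm can be checked numerically. The following is a minimal sketch, not part of the text: it assumes Lebesgue measure on $I = [0,1]$, the sample function $f(x) = x$ (whose essential sup is 1), and a midpoint-rule approximation of the integral; the helper name `p_norm` is hypothetical.

```python
def p_norm(f, p, n=100_000):
    """Midpoint-rule approximation of (integral over [0,1] of |f|^p dx)^(1/p)."""
    h = 1.0 / n
    total = sum(abs(f((i + 0.5) * h)) ** p for i in range(n))
    return (total * h) ** (1.0 / p)


def f(x):
    # Sample function (an assumption): essentially bounded on [0,1], ess sup = 1.
    return x


exponents = [1, 2, 4, 8, 16, 64, 256]
norms = [p_norm(f, p) for p in exponents]

for p, value in zip(exponents, norms):
    # The printed values should be nondecreasing in p and stay below 1.
    print(f"p = {p:4d}   ||f||_p ~ {value:.6f}")
```

For this particular $f$ the exact value is $\|f\|_p = (p+1)^{-1/p}$, which increases to $1 = \|f\|_\infty$ as $p \to \infty$, in agreement with the inequality $\|f\|_p \le \|f\|_q$ for $p < q$ on a space of total measure 1.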
2. FUNCTIONALS ON $L^p$

The next topic we consider is that of mappings defined on $L^p(X, \mu)$, and the simplest case is that of scalar-valued mappings. For instance, point-evaluation is a well-defined mapping on $C_0(R^n)$, and since this class is dense in $L^p(R^n)$, $0 < p < \infty$, it is natural to consider whether point-evaluation may be extended, in some natural way, to $L^p(R^n)$. Since functions in these classes need only be defined $\mu$-a.e. it is not intuitively clear how to construct such an extension, or whether, in fact, one such extension exists. The answer to these questions is postponed to Chapter XIV, where the Hahn-Banach Theorem is discussed. In this chapter we take a different approach, one suggested by Hölder's inequality. For example, referring to (1.22), given $g \in L^\infty(\mu)$, let $L_g$ be the mapping on $L(\mu)$ given by
$$L_g f = \int_X fg\,d\mu, \qquad f \in L(\mu). \tag{2.1}$$
Clearly $L_g$ is well-defined, and it satisfies the following properties:
(a) For $f, h \in L(\mu)$ and a scalar $\lambda$ we have $L_g(f + \lambda h) = L_g f + \lambda L_g h$,
(b) $|L_g f| \le \|g\|_\infty \|f\|_1$ for all $f \in L(\mu)$, and,
(c) If $\lim_{n\to\infty} d_1(f_n, f) = 0$, then $\lim_{n\to\infty} L_g f_n = L_g f$.

In fact, $L_g$ is a prototype of those mappings $L$ on the Lebesgue $L^p$ spaces which satisfy the following properties:
(i) $L$ is a well-defined scalar-valued mapping.
(ii) (Linearity) For each scalar $\lambda$ and $f, g \in L^p(\mu)$,
$$L(f + \lambda g) = Lf + \lambda Lg.$$
(iii) (Continuity) If $\lim_{n\to\infty} d_p(f_n, f) = 0$, then $\lim_{n\to\infty} Lf_n = Lf$.
(iv) (Boundedness) There is a constant $k$ such that
$$|Lf| \le k\|f\|_p, \qquad \text{all } f \in L^p(\mu).$$

More precisely, a mapping $L$ which satisfies (i) above is called a functional. $L$ is said to be a linear, continuous or bounded functional if (ii), (iii) or (iv), respectively, hold. Although not apparent at first glance, the concepts of continuity and boundedness are equivalent.

Proposition 2.1. Suppose $L$ is a linear functional on $L^p(\mu)$. Then $L$ is continuous iff $L$ is bounded.
Proof. To prove the necessity, suppose that $L$ is continuous but unbounded. Then, for each $n$, there is $f_n \in L^p(\mu)$ such that
$$|Lf_n| > n\|f_n\|_p.$$
Now, the sequence $g_n = (1/n\|f_n\|_p) f_n$, $n = 1, 2, \ldots$, satisfies
$$\lim_{n\to\infty} d_p(g_n, 0) = 0, \qquad \text{and} \qquad |Lg_n| \ge 1, \quad \text{all } n,$$
thus contradicting the continuity of $L$. As for the sufficiency, observe that if $f, f_n \in L^p(\mu)$, $n = 1, 2, \ldots$, by the linearity and boundedness of $L$ we have
$$|Lf_n - Lf| = |L(f_n - f)| \le k\|f_n - f\|_p.$$
Whence, if $d_p(f_n, f) \to 0$, the right-hand side in the above inequality tends to 0 with $n$, as does the left-hand side there, and the desired conclusion follows. ■
It is rather straightforward to construct bounded linear functionals on $\ell^p$, $0 < p \le 1$. Indeed, if $(m_n) \in \ell^\infty$, the mapping
$$L(c_n) = \sum_n m_n c_n, \qquad (c_n) \in \ell^p,$$
is readily seen to be such a functional. In fact, by (1.4) it is clear that
$$|L(c_n)| \le \|(m_n)\|_\infty \sum_n |c_n| \le \|(m_n)\|_\infty \|(c_n)\|_p,$$
and (iv) above holds with $k = \|(m_n)\|_\infty$ there. The following result, then, is a bit surprising.
Theorem 2.2 (M. M. Day). Let $I = [0,1]$, and suppose $L$ is a continuous linear functional on $L^p(I)$, $0 < p < 1$. Then $L$ is the zero functional, i.e., for every $f \in L^p(I)$ we have $Lf = 0$.

Proof. Suppose, to the contrary, that $L$ is not the zero functional. Then, by rescaling if necessary, we may assume that there is a function $f \in L^p(I)$ such that
$$Lf = 1, \qquad \|f\|_p \ne 0. \tag{2.2}$$
Observe that as a function of $x$ in $I$, $f\chi_{[0,x]}$ is continuous in the metric of $L^p(I)$, and consequently, since $L$ is continuous we have that
$$\Phi(x) = L(f\chi_{[0,x]}), \qquad x \in I,$$
is a continuous real-valued function defined on $I$ which satisfies $\Phi(0) = 0$, and $\Phi(1) = Lf = 1$. Since $\Phi$ is continuous there is $x \in (0,1)$ such that $\Phi(x) = 1/2$; further, since $f = f\chi_{[0,x]} + f\chi_{[x,1]}$ (in $L^p(I)$), by the linearity of $L$ it readily follows that
$$L(f\chi_{[0,x]}) = L(f\chi_{[x,1]}) = 1/2. \tag{2.3}$$
Moreover, since
$$\|f\chi_{[0,x]}\|_p^p + \|f\chi_{[x,1]}\|_p^p = \|f\|_p^p, \tag{2.4}$$
one of the summands on the left-hand side of (2.4) does not exceed $\|f\|_p^p/2$. Call $g_1$ one of the functions in (2.4), either $f\chi_{[0,x]}$ or $f\chi_{[x,1]}$, such that $\|g_1\|_p^p \le \|f\|_p^p/2$, and rewrite this estimate
$$\|2g_1\|_p^p \le 2^{p-1}\|f\|_p^p. \tag{2.5}$$
Put now $f_1 = 2g_1$ and observe that, by combining (2.3) and (2.5), we get
$$Lf_1 = 1, \qquad \text{and} \qquad \|f_1\|_p^p \le 2^{(p-1)}\|f\|_p^p.$$
Repeating the above argument with $f_1$ in place of $f$ above we obtain a function $f_2$, say, such that
$$Lf_2 = 1, \qquad \text{and} \qquad \|f_2\|_p^p \le 2^{(p-1)}\|f_1\|_p^p \le 2^{2(p-1)}\|f\|_p^p.$$
It is now apparent that iterating this inequality we get a sequence $\{f_n\} \subseteq L^p(I)$ which satisfies
$$Lf_n = 1, \qquad \text{and} \qquad \|f_n\|_p^p \le 2^{n(p-1)}\|f\|_p^p. \tag{2.6}$$
Since $0 < p < 1$, (2.6) implies that the $f_n$'s satisfy
$$Lf_n = 1, \qquad \lim_{n\to\infty} d_p^p(f_n, 0) = 0,$$
which is impossible if $L$ is continuous. This contradiction was derived from (2.2), and hence $L$ is the zero functional. ■

The situation is quite different when $p \ge 1$: Not only are there plenty of functionals on $L^p(\mu)$, but it is also possible to characterize them. We begin with a definition. The norm $\|L\|$ of a bounded linear functional $L$ on $L^p(\mu)$, $1 \le p \le \infty$, is defined by the quantity
$$\|L\| = \sup_{\|f\|_p \ne 0} \frac{|Lf|}{\|f\|_p}. \tag{2.7}$$
For instance, as pointed out in (b) above, the functional $L_g$ on $L(\mu)$ given by (2.1) satisfies $\|L_g\| \le \|g\|_\infty$. In fact, if $\mu$ is $\sigma$-finite, or more generally semifinite, we have equality here. To see this, given $\varepsilon > 0$, let $E \in \mathcal{M}$ be a set of positive measure so that
$$|g| > \|g\|_\infty - \varepsilon, \quad \mu\text{-a.e. on } E.$$
Moreover, since $\mu$ is $\sigma$-finite, or semifinite, we may also assume that $\mu(E) < \infty$ and consequently, the function $f = (\operatorname{sgn} g)\chi_E \in L(\mu)$. Finally, since
$$L_g f = \int_E (\operatorname{sgn} g)\,g\,d\mu = \int_E |g|\,d\mu \ge (\|g\|_\infty - \varepsilon)\,\mu(E) = (\|g\|_\infty - \varepsilon)\,\|f\|_1,$$
it follows that
$$\|L_g\| \ge \|g\|_\infty - \varepsilon,$$
which gives the desired conclusion since $\varepsilon$ is arbitrary.

It is natural to pose the analogous question for $L_g$, $0 \ne g \in L^q(\mu)$, considered as a bounded linear functional on $L^p(\mu)$, $1/p + 1/q = 1$. First observe that by Hölder's inequality we have
$$|L_g f| \le \|g\|_q \|f\|_p, \tag{2.8}$$
and indeed $L_g$ is a well-defined continuous linear functional on $L^p(\mu)$ with $\|L_g\| \le \|g\|_q$. To see that we actually have equality in the norms, if $1 < p < \infty$, put $f = (\operatorname{sgn} g)|g|^{q/p}$, and note that $\|f\|_p^p = \|g\|_q^q < \infty$ and $fg = |g|^{q/p}(\operatorname{sgn} g)\,g = |g|^q$. Thus, for this particular $L^p$ function $f$ we have $L_g f = \int_X |g|^q\,d\mu$, and consequently,
$$\|L_g\| \ge |L_g f|/\|f\|_p = \|g\|_q^q/\|g\|_q^{q/p} = \|g\|_q,$$
as we wanted to show. The case $p = \infty$ is even simpler, as the limiting process gives the right answer: If $0 \ne g \in L(\mu)$ now, put $f = \operatorname{sgn} g$, and observe that $\|f\|_\infty = 1$ and
$$L_g f = \int_X (\operatorname{sgn} g)\,g\,d\mu = \|g\|_1.$$
Whence, as asserted, $\|L_g\| \ge L_g f = \|g\|_1$.

A natural question to consider is whether every bounded linear functional $L$ on $L^p(\mu)$ is of the form $L = L_g$, for some $g \in L^q(\mu)$, $1/p + 1/q = 1$. In Chapter XIV we will see that this is not the case for $p = \infty$, but how about for finite $p$'s? In order to address this question we consider a converse to Hölder's inequality.
Proposition 2.3. Let $(X, \mathcal{M}, \mu)$ be a measure space, $1 \le p \le \infty$, $1/p + 1/q = 1$, and suppose $f \in L^p(\mu)$. Then if $1 \le p < \infty$, we have
$$\|f\|_p = \sup_{\|g\|_q \le 1} |L_g f|. \tag{2.9}$$
If $\mu$ is $\sigma$-finite it is also true that
$$\|f\|_\infty = \sup_{\|g\|_1 \le 1} |L_g f|. \tag{2.10}$$

Proof. We may assume that $f \ne 0$ on a set of positive $\mu$ measure, for otherwise there is nothing to prove. Now, if $1 \le p < \infty$, by Hölder's inequality it follows that the sup on the right-hand side of (2.9) is less than or equal to $\|f\|_p$. Furthermore, putting $g = (1/\|f\|_p^{p-1})(\operatorname{sgn} f)|f|^{p-1}$, we see at once that $\|g\|_q = 1$, and that
$$L_g f = \frac{1}{\|f\|_p^{p-1}} \int_X |f|^{p-1}(\operatorname{sgn} f)\,f\,d\mu = \|f\|_p,$$
and (2.9) holds. As for (2.10), again by Hölder's inequality it suffices to show that the sup on the right-hand side there is at least $\|f\|_\infty$. Now, since $\mu$ is $\sigma$-finite, given $\varepsilon > 0$, we can find a set $E \in \mathcal{M}$ such that
$$|f| > \|f\|_\infty - \varepsilon, \quad \mu\text{-a.e. on } E, \qquad 0 < \mu(E) < \infty.$$
Then the function $g = (1/\mu(E))(\operatorname{sgn} f)\chi_E$ is integrable, has norm 1, and satisfies
$$L_g f = \frac{1}{\mu(E)} \int_E f(\operatorname{sgn} f)\,d\mu \ge \|f\|_\infty - \varepsilon.$$
Since $\varepsilon$ is arbitrary, (2.10) holds. ■
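The extremal function used in the proof of (2.9) can be tested on a toy example. The sketch below is an illustration under stated assumptions, not the general statement: $X$ is a five-point set with counting measure (so functions are finite lists and integrals are sums), $p = 3$, and the helper names `norm`, `pairing` and `extremal` are hypothetical.

```python
import math
import random


def norm(v, p):
    """p-norm with respect to counting measure on a finite set."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def pairing(f, g):
    """L_g f = sum of f*g, the integral of fg against counting measure."""
    return sum(a * b for a, b in zip(f, g))


def extremal(f, p):
    """g = (sgn f)|f|^(p-1) / ||f||_p^(p-1), the function from the proof of (2.9)."""
    scale = norm(f, p) ** (p - 1)
    return [math.copysign(abs(t) ** (p - 1), t) / scale for t in f]


p = 3.0
q = p / (p - 1.0)            # conjugate index
f = [2.0, -1.0, 0.5, 0.0, -3.0]
g = extremal(f, p)

print("||g||_q =", norm(g, q))       # should be 1 up to rounding
print("L_g f   =", pairing(f, g))    # should equal ||f||_p up to rounding
print("||f||_p =", norm(f, p))

# Random competitors with ||h||_q = 1 never exceed ||f||_p (Hoelder).
random.seed(0)
competitors = []
for _ in range(1000):
    h = [random.uniform(-1.0, 1.0) for _ in f]
    size = norm(h, q)
    competitors.append(abs(pairing(f, [t / size for t in h])))
print("largest competitor value:", max(competitors))
```

The random search only samples the unit sphere of the dual space, so it illustrates the inequality half of (2.9); the equality half is carried by the explicit function `g`.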
An interesting and useful variant of this result is

Theorem 2.4. Let $(X, \mathcal{M}, \mu)$ be a measure space, $1 \le p \le \infty$, $1/p + 1/q = 1$, and let $f$ be an extended real-valued measurable function defined on $X$ with the property that for every simple function $\phi$ defined on $X$ we have
$$\Big|\int_X f\phi\,d\mu\Big| \le k\|\phi\|_q. \tag{2.11}$$
Then, if $\mu$ is $\sigma$-finite, it follows that $f \in L^p(\mu)$ and $\|f\|_p \le k$, where $k$ is the constant in (2.11).

Proof. Write $X = \bigcup_n X_n$ as an increasing union of sets of finite measure, and begin by constructing a sequence $\{\phi_n\}$ of simple functions such that
$$|\phi_n| \le |f|, \qquad \text{and} \qquad \lim_{n\to\infty} \phi_n = f \quad \text{everywhere.} \tag{2.12}$$
Further, let
Jxfill dJL =
lim n-+oo
Jxf I In dJL ::; k,
as we wished to show. Next, if 1 < p < 00, put In = l
(i) IIn
Ix l
P
dJL =
=
Ix I/n
IIx I In
dJL ::;
dJLI ::;
Ix I/n/l
dJL
kll/nll q ::; kll
It is clear that unless I = 0 JL-a.e., and in this case there is nothing to prove, we may also assume that 0 < lI
With this computation out of the way, observe that by Fatou's Lemma we have
and (2.11) holds. As for the case $p = \infty$, let $\varepsilon > 0$ be given, and put $E = \{|f| > k + \varepsilon\}$. If $\mu(E) > 0$, since $\mu$ is $\sigma$-finite, $E$ contains a subset $B$, say, of positive, finite measure. Setting
$$\phi = \frac{1}{\mu(B)}(\operatorname{sgn} f)\chi_B, \qquad \|\phi\|_1 = 1,$$
by (2.11) we have
$$k \ge \int_X f\phi\,d\mu > \frac{1}{\mu(B)} \int_B (k + \varepsilon)\,d\mu = k + \varepsilon,$$
which is impossible. Thus, we have $\mu(E) = 0$, and $\|f\|_\infty \le k$. ■
We are now in a position to describe the bounded linear functionals on $L^p(\mu)$.

Theorem 2.5 (F. Riesz Representation Theorem). Let $(X, \mathcal{M}, \mu)$ be a measure space, $1 \le p < \infty$, and $1/p + 1/q = 1$. Then, if $\mu$ is $\sigma$-finite, to each continuous linear functional $L$ on $L^p(\mu)$ there corresponds a unique function $g \in L^q(\mu)$ such that $\|L\| = \|g\|_q$, and
$$Lf = L_g f = \int_X fg\,d\mu, \qquad \text{all } f \in L^p(\mu).$$

Proof. Suppose first that $\mu$ is a finite measure and introduce the set function
$$\nu(E) = L\chi_E, \qquad E \in \mathcal{M}. \tag{2.13}$$
Since $L$ is linear it readily follows that $\nu$ is an additive set function on $\mathcal{M}$. We claim that actually $\nu$ is a signed measure and that $\nu \ll \mu$. Clearly $\nu(\emptyset) = 0$. Moreover, since $L$ is bounded, $\nu$ only assumes finite values and for each $E \in \mathcal{M}$ we have
$$|\nu(E)| = |L\chi_E| \le \|L\|\,\mu(E)^{1/p} < \infty. \tag{2.14}$$
Thus, to verify that $\nu$ is a signed measure, it only remains to check that $\nu$ is $\sigma$-additive. Let, then, $\{E_n\}$ be a sequence of pairwise disjoint measurable subsets of $X$, and put $E = \bigcup_n E_n$. By (2.13) and (2.14), and since $E \setminus \bigcup_{n=1}^{m} E_n = \bigcup_{n=m+1}^{\infty} E_n$, we get
$$\Big|\nu(E) - \sum_{n=1}^{m} \nu(E_n)\Big| = \Big|\nu\Big(E \setminus \bigcup_{n=1}^{m} E_n\Big)\Big| = \big|L\chi_{\bigcup_{n=m+1}^{\infty} E_n}\big| \le \|L\|\,\mu\Big(\bigcup_{n=m+1}^{\infty} E_n\Big)^{1/p}. \tag{2.15}$$
Now, since the $E_n$'s are pairwise disjoint and $\mu(E) < \infty$, we have
$$\mu\Big(\bigcup_{n=m+1}^{\infty} E_n\Big) = \sum_{n=m+1}^{\infty} \mu(E_n) \to 0, \qquad \text{as } m \to \infty,$$
and consequently, the right-hand side of (2.15) tends to 0 as $m \to \infty$. Whence, it readily follows that
$$\nu(E) = \sum_{n=1}^{\infty} \nu(E_n),$$
and $\nu$ is a signed measure on $(X, \mathcal{M})$. Moreover, if $\mu(E) = 0$, from (2.14) we get that $|\nu(E)| \le \|L\|\,\mu(E)^{1/p} = 0$, and consequently, $\nu \ll \mu$.

We are now in a position to invoke the Radon-Nikodym Theorem. Let $g = d\nu/d\mu$ be the Radon-Nikodym derivative of $\nu$ with respect to $\mu$; $g$ is uniquely determined and locally integrable, and we want to show that also $g \in L^q(\mu)$ and that $\|g\|_q = \|L\|$. First recall that, in particular, we have
$$\nu(E) = \int_E g\,d\mu, \qquad E \in \mathcal{M}. \tag{2.16}$$
Now, if $f = \sum c_n \chi_{E_n}$ is a simple function defined on $X$, by (2.13) and (2.16) we get
$$Lf = \sum c_n L\chi_{E_n} = \sum c_n \nu(E_n) = \sum c_n \int_{E_n} g\,d\mu = \int_X \Big(\sum c_n \chi_{E_n}\Big) g\,d\mu = \int_X fg\,d\mu. \tag{2.17}$$
Moreover, since $L$ is bounded, by (2.17) we obtain
$$\Big|\int_X fg\,d\mu\Big| \le \|L\|\,\|f\|_p,$$
and by Theorem 2.4 it follows that $g \in L^q(\mu)$ and $\|g\|_q \le \|L\|$. In order to show that $L = L_g$, we must still prove that (2.17) is true for arbitrary $f$'s in $L^p(\mu)$. Given $f \in L^p(\mu)$, let $\{f_n\}$ be a sequence of simple functions such that $f_n \to f$ and $|f_n| \le |f|$. By (2.17) we have
$$Lf_n = \int_X f_n g\,d\mu, \qquad n = 1, 2, \ldots \tag{2.18}$$
and consequently, the limit of the right-hand side of (2.18) and that of the left-hand side there, if they exist, must be equal.
As for the left-hand side, note that since $L$ is bounded, we have
$$|Lf_n - Lf| \le \|L\|\,\|f_n - f\|_p \to 0, \qquad \text{as } n \to \infty,$$
and consequently,
$$\lim_{n\to\infty} Lf_n = Lf. \tag{2.19}$$
On the other hand, by the linearity of the integral and Hölder's inequality, we also have
$$\Big|\int_X f_n g\,d\mu - \int_X fg\,d\mu\Big| \le \int_X |f_n - f||g|\,d\mu \le \|f_n - f\|_p \|g\|_q \to 0, \qquad \text{as } n \to \infty,$$
and consequently,
$$\lim_{n\to\infty} \int_X f_n g\,d\mu = \int_X fg\,d\mu. \tag{2.20}$$
Whence combining (2.18), (2.19) and (2.20), we get
$$Lf = \int_X fg\,d\mu, \qquad \text{all } f \in L^p(\mu). \tag{2.21}$$
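To complete the proof in the general case write $X = \bigcup_n X_n$ as an increasing union of sets of finite measure, and consider the restriction $L_n$ of the functional $L$ to $L^p(X_n, \mu)$. Each $L_n$ is a linear functional, and since
$$|L_n f| = |Lf| \le \|L\|\,\|f\|_p, \qquad \text{all } f \in L^p(X_n, \mu),$$
the $L_n$'s are also bounded. Thus, by the first part of the proof, we can find functions $g_n \in L^q(X_n, \mu)$, $n = 1, 2, \ldots$, such that
$$\|g_n\|_q \le \|L_n\| \le \|L\| \tag{2.22}$$
and
$$L_n f = \int_{X_n} f g_n\,d\mu, \qquad \text{all } f \in L^p(X_n, \mu). \tag{2.23}$$
Now, since $X_n \subseteq X_m$ for all $n \le m$, we also have $L^p(X_n, \mu) \subseteq L^p(X_m, \mu)$. Furthermore, since for each $f \in L^p(X_n, \mu)$ we have $L_n f = L_m f$ for all $m \ge n$, by (2.23) it follows that for such functions we have $\int_X f g_n\,d\mu = \int_X f g_m\,d\mu$, or
$$\int_X f(g_n - g_m)\,d\mu = 0, \qquad \text{all } m \ge n. \tag{2.24}$$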
In particular, since $\mu(X_n) < \infty$, the functions $\operatorname{sgn}(g_n - g_m) \in L^p(X_n, \mu)$ for all $m \ge n$, and by (2.24) we get that $|g_n - g_m| = 0$ $\mu$-a.e. on $X_n$. In other words, $g_n = g_m$ $\mu$-a.e. on $X_n$ for all $m \ge n$, and consequently, the function $g$ on $X$ given by
$$g(x) = g_n(x), \qquad x \in X_n,$$
is well-defined and it satisfies $|g| = \lim_{n\to\infty} |g_n|$ $\mu$-a.e. Thus, by Fatou's Lemma and (2.22) it follows that
$$\|g\|_q \le \liminf_{n\to\infty} \|g_n\|_q \le \|L\|, \tag{2.25}$$
and $g$ is an $L^q(\mu)$ function with norm less than or equal to $\|L\|$. Next note that since for each $f \in L^p(\mu)$ we have
$$f_n = f\chi_{X_n} \to f \quad \mu\text{-a.e.}, \qquad \text{and} \qquad |f_n| \le |f| \quad \mu\text{-a.e.},$$
by LDCT it follows that $\lim_{n\to\infty} \|f_n - f\|_p = 0$, and consequently, by the continuity of $L$ we obtain $\lim_{n\to\infty} Lf_n = Lf$. Moreover, since by (2.23) $Lf_n = \int_{X_n} f_n g_n\,d\mu = \int_X f_n g\,d\mu$, by Hölder's inequality we also have $\lim_{n\to\infty} \int_X f_n g\,d\mu = \int_X fg\,d\mu$, and consequently,
$$Lf = \int_X fg\,d\mu,$$
as we wanted to show. It thus only remains to verify that $\|L\| = \|g\|_q$, and by (2.25) we only need to check that $\|L\| \le \|g\|_q$. But since $L = L_g$, this is an easy consequence of Hölder's inequality. ■

Two natural questions arise from this result: How can we go about representing the bounded linear functionals on $L^\infty(\mu)$, and, is the assumption concerning the $\sigma$-finiteness of $\mu$ necessary? The former question will be addressed in Chapter XIV, and the latter question has two answers, to wit: If $1 < p < \infty$, it is not necessary that $\mu$ be $\sigma$-finite, and if $p = 1$, it is.

We do the case $p = 1$ first. Let $X = (0,1)$, let $\mathcal{M}$ be the $\sigma$-algebra of those subsets $E$ of $X$ which are either countable or such that $X \setminus E$ is countable, and assume $\mu$ is the counting measure on $(X, \mathcal{M})$; $\mu$ is not $\sigma$-finite. If $\nu$ denotes the counting measure on $(X, \mathcal{P}(X))$, put
$$Lf = \int_X f\chi_{(0,1/2)}\,d\nu, \qquad f \in L^1(\mu).$$
Since for $f \in L^1(\mu)$ the set $\{f \ne 0\}$ is $\mu$ and $\nu$ $\sigma$-finite, it is clear that $|Lf| \le \|f\|_1$, and $L$ is a bounded linear functional on $L^1(\mu)$ of norm less than or equal to 1. It is intuitively clear that if $Lf = L_g f$, then $g$ must be the function $\chi_{(0,1/2)}$, which is measurable with respect to the $\sigma$-algebra $\mathcal{P}(X)$, but not measurable with respect to the $\sigma$-algebra $\mathcal{M}$; the verification of this observation is left to the reader. On the other hand, the situation is quite different if $1 < p < \infty$, for then $\chi_{(0,1/2)} \notin L^q(\nu)$ for any $q < \infty$.

Finally, to see that in the case $1 < p < \infty$ the $\sigma$-finiteness of $\mu$ is not needed in the Riesz representation theorem, note that, in the notation of that theorem, given a $\sigma$-finite subset $E$ of $X$, there is a unique function $g = g_E$ vanishing off $E$ so that $g_E \in L^q(E, \mu)$ and
$$Lf = \int_E fg\,d\mu, \qquad \text{all } f \in L^p(E, \mu).$$
Furthermore, if $L|_E$ denotes the restriction of $L$ to $L^p(E, \mu)$, we also have
$$\|g_E\|_q \le \|L|_E\| \le \|L\|, \qquad \text{all } \sigma\text{-finite } E.$$
Also (a simple variant of) the argument in (2.24) above gives that if $E_1 \supseteq E$ are $\sigma$-finite subsets of $X$, then we have $g_{E_1} = g_E$ $\mu$-a.e. on $E$, and $\|g_E\|_q \le \|g_{E_1}\|_q$. Let now $\eta$ be the finite quantity
$$\eta = \sup\{\|g_E\|_q : E \text{ is a } \sigma\text{-finite subset of } X\},$$
and let $\{E_n\}$ be a sequence of $\sigma$-finite subsets of $X$ with the property that $\lim_{n\to\infty} \|g_{E_n}\|_q = \eta$. Observe that if $E = \bigcup_n E_n$, then $E$ is also a $\sigma$-finite subset of $X$, and since
$$\|g_E\|_q \ge \|g_{E_n}\|_q, \qquad \text{all } n,$$
it readily follows that $\|g_E\|_q = \eta$. Now, life outside $E$ is uneventful. Indeed, let $A$ be a $\sigma$-finite subset of $X$, and put $A_1 = (A \setminus E) \cup E$. Then $A_1$ is also a $\sigma$-finite subset of $X$, and since $q < \infty$ and
$$\int_{A_1} |g_{A_1}|^q\,d\mu = \int_{A \setminus E} |g_A|^q\,d\mu + \int_E |g_E|^q\,d\mu = \int_{A \setminus E} |g_A|^q\,d\mu + \eta^q \ge \eta^q,$$
while the left-hand side is at most $\eta^q$ by the definition of $\eta$, it readily follows that $g_A = 0$ $\mu$-a.e. on $A \setminus E$. This is all we need to know: If $f$ is an arbitrary function in $L^p(\mu)$, then the set $A = \{f \ne 0\}$ is $\sigma$-finite, cf. 4.11 in Chapter VII, and
$$Lf = \int_A f g_A\,d\mu = \int_{A \setminus E} f g_A\,d\mu + \int_{A \cap E} f g_E\,d\mu = \int_X f g_E\,d\mu,$$
which is the desired representation of $L$.
3. WEAK CONVERGENCE

Assume $(X, \mathcal{M}, \mu)$ is a measure space, and let $f, f_n \in L^p(\mu)$, $n = 1, 2, \ldots$, $1 \le p < \infty$. We say that the sequence $\{f_n\}$ converges weakly to $f$ in $L^p(\mu)$, if, with $1/p + 1/q = 1$, we have
$$\lim_{n\to\infty} \int_X f_n g\,d\mu = \int_X fg\,d\mu, \qquad \text{all } g \in L^q(\mu). \tag{3.1}$$
We now give a few examples to show that there is no connection between weak convergence and any of the other forms of convergence, unless further assumptions are made on either the sequence itself or the measure space involved. For instance, in $\ell^p$, consider the sequence $\{e_n\}$ consisting of those sequences $e_n = (0, \ldots, 1, 0, \ldots)$ with 1 in the $n$th place and zeroes elsewhere. If $1 < p < \infty$, and $x = (x_1, \ldots, x_n, \ldots) \in \ell^q$, then the functional $L_x$ has the property that
$$\lim_{n\to\infty} L_x e_n = \lim_{n\to\infty} x_n = 0, \tag{3.2}$$
and since by the Riesz representation theorem these are all the functionals on $\ell^p$, the sequence $\{e_n\}$ converges weakly to 0. Nevertheless, since $\|e_n - e_m\|_p = 2^{1/p}$ for all $n \ne m$, neither the sequence itself nor any of its subsequences converges to 0. Neither does $\{e_n\}$ converge to 0 in measure, nor uniformly, nor even in the pointwise sense. Note however that $\{e_n\}$ does not converge weakly to 0 in $\ell^1$; this is clear since the sequence $x = (1, 1, \ldots)$ is bounded and $L_x e_n = 1$ for all $n$. Now, in the case of $\ell^1$ we have the following interesting result.

Proposition 3.1 (Schur). If the sequence $\{x_n\}$ converges weakly to $x$ in $\ell^1$, then
$$\lim_{n\to\infty} \|x_n - x\|_1 = 0. \tag{3.3}$$
Proof. By considering, if necessary, the sequence $\{x_n - x\}$ we may assume that $\{x_n\}$ converges weakly to 0. Suppose that $\lim_{n\to\infty} \|x_n\|_1 \ne 0$; passing to a subsequence, if needed, we may assume that $\|x_n\|_1 \ge \eta > 0$ for all $n$. In this case also $(1/\|x_n\|_1)x_n$ converges weakly to 0, and so we may as well assume that $\|x_n\|_1 = 1$ for all $n$. In addition, if $x_n = (x_{n,1}, \ldots, x_{n,m}, \ldots)$, $n = 1, 2, \ldots$, by the weak convergence it follows that
$$\lim_{n\to\infty} x_{n,m} = 0, \qquad \text{all } m. \tag{3.4}$$
Observe that since $\|x_1\|_1 = 1$, we can find $m_1$ so that $\sum_{m=1}^{m_1} |x_{1,m}| > 3/4$. Further, by (3.4) it readily follows that there exists an index $n_2 > 1$ such that $\sum_{m=1}^{m_1} |x_{n_2,m}| < 1/4$, and consequently, since $\|x_{n_2}\|_1 = 1$, we can find an index $m_2 > m_1$ so that $\sum_{m=m_1+1}^{m_2} |x_{n_2,m}| > 3/4$. The pattern is now clear: Having chosen sequences $m_0 = 0 < m_1 < m_2 < \cdots < m_k$ and $n_1 = 1 < n_2 < \cdots < n_k$, choose first $n_{k+1}$ with the property that
$$\sum_{m=1}^{m_k} |x_{n_{k+1},m}| < 1/4,$$
and then $m_{k+1}$ so that
$$\sum_{m=m_k+1}^{m_{k+1}} |x_{n_{k+1},m}| > 3/4.$$
Consider now the sequence $y \in \ell^\infty$ with terms
$$y_m = \operatorname{sgn} x_{n_k,m}, \qquad m_{k-1} < m \le m_k, \quad k = 1, 2, \ldots$$
Since $|y_m| \le 1$ for all $m$, the functional $L_y$ on $\ell^1$ satisfies
$$|L_y(x_{n_k})| = \Big|\sum_{m=1}^{\infty} x_{n_k,m}\,y_m\Big| \ge \sum_{m=m_{k-1}+1}^{m_k} |x_{n_k,m}| - \Big(\sum_{m=1}^{m_{k-1}} + \sum_{m=m_k+1}^{\infty}\Big) |x_{n_k,m}\,y_m| \ge 2\sum_{m=m_{k-1}+1}^{m_k} |x_{n_k,m}| - \|x_{n_k}\|_1 > 2 \cdot 3/4 - 1 = 1/2, \qquad k = 1, 2, \ldots$$
Thus, $\limsup |L_y(x_n)| \ge 1/2$, which contradicts the weak convergence of $\{x_n\}$ to 0. ■
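The behaviour of the unit sequences $e_n$ discussed before Proposition 3.1 can be observed on truncations. This is a small sketch under explicit assumptions: sequences are cut off after `LENGTH` terms, $p = 2$, the test sequence is $x = (1/m)$ (which lies in $\ell^q$ for every $q > 1$), and the helper names are hypothetical. A finite truncation can only suggest the limits; it does not prove them.

```python
def unit_vector(n, length):
    """First `length` coordinates of e_n: 1 in the n-th place (n >= 1), 0 elsewhere."""
    return [1.0 if m == n else 0.0 for m in range(1, length + 1)]


def pairing(x, y):
    """L_x y = sum of x_m * y_m over the truncated range."""
    return sum(a * b for a, b in zip(x, y))


def norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


LENGTH = 2000
p = 2.0
x = [1.0 / m for m in range(1, LENGTH + 1)]   # truncation of (1/m)
ones = [1.0] * LENGTH                         # truncation of (1, 1, ...) in l^infinity

indices = [1, 10, 100, 1000]
weak_values = [pairing(x, unit_vector(n, LENGTH)) for n in indices]
l1_values = [pairing(ones, unit_vector(n, LENGTH)) for n in indices]
difference = [a - b for a, b in zip(unit_vector(3, LENGTH), unit_vector(7, LENGTH))]
gap = norm(difference, p)

print("L_x e_n for x = (1/m):", weak_values)   # equals x_n = 1/n, tending to 0
print("L_x e_n for x = (1,1,...):", l1_values) # constantly 1, so no weak limit 0 in l^1
print("||e_3 - e_7||_p:", gap, "vs 2^(1/p):", 2.0 ** (1.0 / p))
```

The first list decays like $1/n$, the second stays at 1, and the distance between two distinct unit sequences is $2^{1/p}$ regardless of the indices, matching the three observations in the text.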
In a different direction, an interesting example is the sequence $\{f_n\} \subseteq L^p([0,1])$, $1 \le p$, given by $f_n = n\chi_{[0,1/n)}$, which converges to 0 in measure and a.e., and does not converge to 0 weakly in $L^p([0,1])$ for any $p \ge 1$. (Just consider the functional induced by $\chi_{[0,1]}$.) Finally, the sequence $\{f_n\} \subseteq L^p(R)$, $1 \le p$, given by $f_n = (1/n)\chi_{[1,e^n)}$, converges uniformly to 0, yet it does not converge weakly to 0 in $L^p(R)$ for any $p \ge 1$. (The functional induced by $(1/x)\chi_{[1,\infty)}(x)$ will do.)

There are additional assumptions that we may impose on weakly convergent sequences to ensure that they also are convergent. Again we discuss the $\ell^p$ case; the result is also true for general $L^p(\mu)$ spaces, but the proof is more complicated.

Proposition 3.2. Suppose the sequence $\{x_n\}$ converges weakly to $x$ in $\ell^p$, $1 \le p < \infty$, and that, in addition,
$$\lim_{n\to\infty} \|x_n\|_p = \|x\|_p. \tag{3.5}$$
Then we also have
$$\lim_{n\to\infty} \|x_n - x\|_p = 0. \tag{3.6}$$

Proof. As before it follows that if
$$x = (x_1, \ldots, x_m, \ldots), \qquad x_n = (x_{n,1}, \ldots, x_{n,m}, \ldots), \quad n = 1, 2, \ldots$$
then
$$\lim_{n\to\infty} x_{n,m} = x_m, \qquad \text{all } m. \tag{3.7}$$
Also, by (3.5) and (3.7), for each fixed $M$ we have
$$\lim_{n\to\infty} \Big(\sum_{m=M}^{\infty} |x_{n,m}|^p\Big)^{1/p} = \Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p}. \tag{3.8}$$
Whence, for each fixed $M$ we get,
$$\|x_n - x\|_p \le \Big(\sum_{m=1}^{M-1} |x_{n,m} - x_m|^p\Big)^{1/p} + \Big(\sum_{m=M}^{\infty} |x_{n,m} - x_m|^p\Big)^{1/p} = A + B,$$
say. It is not hard to estimate $B$. By Minkowski's inequality and (3.8) it follows that for all sufficiently large $n$,
$$B \le \Big(\sum_{m=M}^{\infty} |x_{n,m}|^p\Big)^{1/p} + \Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p} \le 3\Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p},$$
XII.
232
V Spaces
which, since x E i P , can be made arbitrarily small provided M is sufficiently large. Once M is fixed, it is clear that, on account of (3.7), A also can be made arbitrarily small provided n is large enough. Thus (3.6) holds, and we have finished. • As noted above, LP spaces do not have the Bolzano-Weierstrass property: There are bounded sequences for which no convergent subsequence may be found. The concept of weak convergence is also relevant in this context. Theorem 3.3. Let {/k} be a bounded sequence in LP(Rn), 1 < P < 00, with bound M. Then there exist a subsequence k m -+ 00 and a function / E LP(Rn) with II/lip ~ M, such that {/k m} converges weakly to / in LP(Rn). Proof. We divide the proof into a number of steps, and begin by showing that there is a subsequence k m -+ 00 with the property that limm -+ oo /kmg dx exists, provided 9 is any function in a fixed countable /kgh family {gh} ~ Lq(Rn), 1/p+l/q = 1. This is not hard: Let Ck,h = for k, h ~ 1, and note that by HOlder's inequality we have
Inn
Inn
(3.9) Fix h = 1 now. By (3.9), (Ck,l) is a bounded sequence and consequently, there is a subsequence kl -+ 00 such that limkl-+OO Ckl,1 exists. Repeating this argument with (CkI.l) in place of (Ck,t) above we obtain a new subsequence k2 -+ 00, say, such that limk:l-+oo Ck2,h exists for h = 1,2. These are the first steps of the by now familiar Cantor diagonal process which ensures the existence of a subsequence k m -+ 00 so that limkm-+oo Ckm,h exists for each h. We choose now for the 9h's a dense family in Lq(Rn ), which, since Lq(Rn ) is separable, is clearly possible, and define the functional L on the gh's by means of the expression Lg = limm -+oo /km 9 dx. Now, L is decidedly linear over these gh's, i.e., L(9hl + ).g~) = Lghl + )'Lg~ for all scalars )., and by HOlder's inequality it also satisfies ILgl ~ MlIgliq. We claim that L can be extended linearly and continuously to all of Lq(Rn). Indeed, to each 9 E Lq(Rn ) there corresponds a sequence of gh's such that limh-+oo IIg - ghllq = 0 and lim IIghllq = IIgliq. Now, for these gh's -the sequence of scalars (Lgh) is Cauchy, and consequently, convergent. Putting Lg = limh-+oo Lgh, L turns out to be a well-defined linear functional on Lq(Rn), and since
Inn
ILgl ~ lim sup ILg - Lghl + lim sup ILghl ~ MlIgllq ,
$L$ is also bounded and has norm $\|L\| \le M$. By Theorem 2.5 there exists a function $f \in L^p(R^n)$ with $\|f\|_p \le M$ such that $Lg = \int_{R^n} fg$; the function $f$ satisfies all the required conditions. ■

The proof of this interesting result relies on the fact that the functional $L$, originally defined on a subset of $L^q(R^n)$, can be extended to all of $L^q(R^n)$ without an increase of its "norm." A more general setting where this is also true is described in Chapter XIV.
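To see why the norm condition (3.5) cannot simply be dropped in Proposition 3.2, one can look at the unit vectors $e_n$ of $\ell^2$, a standard example that is not taken from the text above. The following is a minimal numerical sketch, assuming sequences are truncated to finitely many coordinates and testing only one fixed $y \in \ell^2$ (both are simplifications, and the names `unit_vector`, `pairing`, and `lp_norm` are illustrative). The pairings tend to 0, as weak convergence to 0 requires, while the norms stay equal to 1, so neither (3.5) nor (3.6) holds with $x = 0$.

```python
def unit_vector(n, length):
    """e_n truncated to `length` coordinates (coordinates are indexed from 1)."""
    return [1.0 if m == n else 0.0 for m in range(1, length + 1)]

def pairing(x, y):
    """The pairing sum_m x_m * y_m of two finite sequences."""
    return sum(a * b for a, b in zip(x, y))

def lp_norm(x, p):
    """The l^p norm of a finite sequence."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

LENGTH = 1000                                   # truncation length (an assumption)
y = [1.0 / m for m in range(1, LENGTH + 1)]     # a truncated element of l^2
indices = (1, 10, 100, 1000)

pairings = [pairing(unit_vector(n, LENGTH), y) for n in indices]
norms = [lp_norm(unit_vector(n, LENGTH), 2) for n in indices]

for n, c, r in zip(indices, pairings, norms):
    print(n, c, r)   # the pairing equals 1/n, the norm equals 1
```

A single test sequence $y$ does not prove weak convergence, of course; the sketch only illustrates the behavior that the proposition rules out once (3.5) is assumed.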
4. PROBLEMS AND QUESTIONS

In what follows $(X,\mathcal{M},\mu)$ denotes a measure space, and we do not find it necessary to stress this point at each instance.

4.1 Suppose $0 < p < q \le \infty$. Give examples of functions $f$ defined on $R$ such that $f \in L^r(R)$ iff (a) $p < r < q$, (b) $p \le r \le q$, and, (c) $r = p$.

4.2 Let $I$ be a bounded interval of $R$. By means of an example show that, in general, no

4.3 The following inequality is often referred to as Hölder's inequality; prove it and identify the cases of equality: If $f \in L^p(\mu)$ and $g \in L^q(\mu)$, $1 < p, q < \infty$, $1/p + 1/q = 1$, then $|\int_X fg\,d\mu| \le \|f\|_p\|g\|_q$.

4.4 The following is an extension of Hölder's inequality to more than two indices: Suppose $f \in L^p(\mu)$, $g \in L^q(\mu)$, $h \in L^r(\mu)$, $1 < p, q, r < \infty$, $1/p + 1/q + 1/r = 1$. Prove that $fgh \in L(\mu)$ and that

Can you think of a further extension to more than three functions?

4.5 Let $I = [0,\pi]$. Show that
$$\int_I x^{-1/4}\sin x\,dx \le \pi^{3/4}.$$

4.6 Show that if for some $0 < p < \infty$, $f \in L^p(\mu) \cap L^\infty(\mu)$, then for all $p < q < \infty$, $f \in L^q(\mu)$ and $\|f\|_q \le \|f\|_p^{p/q}\|f\|_\infty^{1-p/q}$.

4.7 If for some $0 < p, q < \infty$, $f \in L^p(\mu) \cap L^q(\mu)$, then $f \in L^r(\mu)$ for all $p < r < q$, and $\|f\|_r \le \|f\|_p^{1-\eta}\|f\|_q^{\eta}$, where $0 < \eta < 1$ is given by $1/r = (1-\eta)/p + \eta/q$.

4.8 Let $I = [0,\pi]$ and $f \in L^2(I)$. Is it possible to have simultaneously $\int_I (f(x) - \sin x)^2\,dx \le 4/9$ and $\int_I (f(x) - \cos x)^2\,dx \le 1/9$?
4.9 Suppose an extended real-valued function $f$ defined on $R^n$ satisfies the following two properties: (a) There is a $p$, $1 \le p < \infty$, such that $f \in L^p(I)$, for every bounded interval $I$ in $R^n$, and, (b) $|\int_I f|^p \le c|I|^{p-1}\int_I |f|^p$, for every bounded interval $I$ and a constant $0 < c < 1$ independent of $I$. Show that $f = 0$ a.e.

4.10 Let $0 < p < q \le \infty$. Prove that $L^p(\mu)$ is not contained in $L^q(\mu)$ iff $X$ contains sets of arbitrarily small, positive, $\mu$ measure.

4.11 Let $0 < p < q < \infty$. Show that $L^q(\mu)$ is not contained in $L^p(\mu)$ iff $X$ contains sets of arbitrarily large, finite, $\mu$ measure. What can you say about the case $q = \infty$?

4.12 Prove that if $\lim_{n\to\infty}\|f_n\|_p = 0$, $1 \le p \le \infty$, then there exist a subsequence $\{f_{n_k}\}$ and a nonnegative function $h \in L^p(\mu)$ such that $|f_{n_k}| \le h$ $\mu$-a.e., and $\lim_{k\to\infty} f_{n_k} = 0$ $\mu$-a.e.

4.13 Prove that if $\|f_n - f\|_p \to 0$ and $\|g_n - g\|_q \to 0$, $1 \le p, q \le \infty$, $1/p + 1/q = 1$, then $\|f_n g_n - fg\|_1 \to 0$.

4.14 The $L^p(X,\mu)$ spaces are not, in general, separable when $0 < p \le \infty$. Give an example of a measure space $(X,\mathcal{M},\mu)$ so that $L^p(\mu)$ is not separable, and prove that the Lebesgue spaces $L^p(R^n)$ are separable for $0 < p < \infty$. On the other hand, $L^\infty(R^n)$ is not separable. When is $L^\infty(\mu)$ separable?

4.15 Suppose $f \in L^p(R^n)$, $0 < p < \infty$, and compute
$$\lim_{|h|\to\infty}\int_{R^n} |f(y+h) + f(y)|^p\,dy.$$

4.16 Suppose $f, f_n \in L^p(\mu)$, $n = 1, 2, \ldots$, satisfy $\lim_{n\to\infty} f_n = f$ $\mu$-a.e., and $\lim_{n\to\infty}\|f_n\|_p = \|f\|_p$, $0 < p < \infty$. Prove that $\lim_{n\to\infty}\|f_n - f\|_p = 0$.

4.17 Is the conclusion of 4.16 true if we replace $\mu$-a.e. convergence by convergence in measure?

4.18 Let $(X,\mathcal{M},\mu)$ be a finite measure space, $0 < r < p$, and $\{f_n\}$ a sequence of $L^p(\mu)$ functions such that $\|f_n\|_p \le k$ for all $n$ and $\lim_{n\to\infty} f_n = f$ $\mu$-a.e. Prove that $\lim_{n\to\infty}\|f_n - f\|_r = 0$, and that the conclusion may fail if either $\mu(X) = \infty$ or $r = p$.

4.19 (Vitali's Convergence Theorem) Let $\{f_n\}$ be a sequence of $L^p(\mu)$ functions, $1 \le p < \infty$, and assume $\lim_{n\to\infty} f_n = f$ $\mu$-a.e. Show that
$f \in L^p(\mu)$ and $\lim_{n\to\infty}\|f_n - f\|_p = 0$ iff: (a) For each $\varepsilon > 0$, there exists a set $A_\varepsilon$ such that $\mu(A_\varepsilon) < \infty$ and $\int_{X\setminus A_\varepsilon} |f_n|^p\,d\mu < \varepsilon$ for all $n$, and, (b) $\lim_{\mu(E)\to 0}\int_E |f_n|^p\,d\mu = 0$, uniformly in $n$.

4.20 Verify that for every measurable function $f$,
$$\int_X |f|^p\,d\mu = \int_{[0,\infty)} p\,t^{p-1}\mu(\{|f| > t\})\,dt, \quad 0 < p < \infty.$$

4.21 Suppose the nonnegative functions $f, g \in L^p(\mu)$, $1 < p < \infty$, satisfy the relation $\lambda\mu(\{g > \lambda\}) \le \int_{\{g>\lambda\}} f\,d\mu$ for all $\lambda > 0$. Show that $\|g\|_p \le p'\|f\|_p$, where $1/p + 1/p' = 1$.

4.22 Suppose $0 < r < p < q \le \infty$, and let $f \in L^p(\mu)$. Show that $f$ can be written $f = g + h$, where $g \in L^r(\mu)$ and $h \in L^q(\mu)$. Further, given $t > 0$, we can choose $g$ and $h$ so that $\|g\|_r^r \le t^{r-p}\|f\|_p^p$ and $\|h\|_q^q \le t^{q-p}\|f\|_p^p$.

4.23 Suppose $f \in$ wk-$L(R^n)$ is such that $|\{f \ne 0\}| < \infty$. Prove that $f \in L^p(R^n)$ for each $0 < p < 1$. Also, if $f \in$ wk-$L(R^n) \cap L^\infty(R^n)$, show that $f \in L^p(R^n)$ for $1 < p < \infty$.

4.24 The Hardy-Littlewood maximal operator takes $L^p(R^n)$ functions into $L^p(R^n)$ functions, $1 < p < \infty$. More precisely, show there is a constant $c = c_{n,p}$ such that

4.25 Show that if $f \in L^p(R^n)$, $1 \le p \le \infty$, then the integral of $f$ differentiates to $f(x)$ for almost every $x$ in $R^n$.

4.26 Given an interval $I = [a,b]$ in the line, show that a necessary and sufficient condition for a function $F$ to be the integral of $f \in L^p(I)$, $1 < p < \infty$, is that the sums

formed for every partition $\{a \le x_0 < \cdots < x_n \le b\}$ be bounded. The sup of these sums is then $\int_I |f(x)|^p\,dx$.

4.27 Suppose $\mu$ is a finite measure and $\nu \ll \mu$. Prove that the Radon-Nikodym derivative $h = d\nu/d\mu \in L^p(\mu)$ iff there is a constant $c$ such that for all measurable at most countable partitions $\{E_n\}$ of $X$, we
have $\sum_{n=1}^{\infty}\nu(E_n)^p/\mu(E_n)^{p-1} \le c$. What is the relation between $c$ and $\|h\|_p$?

4.28 If $\mu$ is a finite measure, and the sequence $\{f_n\}$ of nonnegative functions satisfies $\limsup \int_X f_n\,d\mu = \eta$ and $\liminf \int_X f_n^p\,d\mu < \infty$ for some $1 < p < \infty$, show that $\mu(\{\limsup f_n \ge \eta\}) > 0$.
4.29 Show that each function $f \in L^p(\mu)$, $0 < p < \infty$, satisfies the following property: $\lim_{\lambda\to\infty}\lambda^p\mu(\{|f| > \lambda\}) = 0$.

4.30 Suppose $f$ is an extended real-valued function defined on a probability measure space $(X,\mathcal{M},\mu)$. Show that
$$\operatorname{ess\,sup} f = \inf\{\lambda : \mu(\{x \in X : f(x) \le \lambda\}) = 1\}.$$

4.31 If $f$ is an extended real-valued function defined on $X$, define its essential infimum by
$$\operatorname{ess\,inf} f = \sup\{\lambda : \mu(\{x \in X : f(x) < \lambda\}) = 0\}.$$
Explore the properties of this quantity. In particular show that if $f \ge 0$ $\mu$-a.e., then $\operatorname{ess\,inf} f = 1/\|1/f\|_\infty$.

4.32 Suppose that $\lim_{n\to\infty}\|f_n - f\|_p = 0$, $0 < p < \infty$, and that $g, g_n$ are uniformly bounded measurable functions, $n = 1, 2, \ldots$, such that $\lim_{n\to\infty} g_n = g$ $\mu$-a.e. Prove that $\lim_{n\to\infty}\|f_n g_n - fg\|_p = 0$.

4.33 Suppose $f, f_n \in L^p(\mu)$, $n = 1, 2, \ldots$, $1 < p < \infty$. Show that $f_n$ converges weakly to $f$ in $L^p(\mu)$ iff (a) $\sup\|f_n\|_p < \infty$, and, (b) $\lim_{n\to\infty}\int_E f_n\,d\mu = \int_E f\,d\mu$ for all $E \in \mathcal{M}$.
4.34 Suppose $\{f_n\}$ is a bounded set in $L^p(\mu)$, $1 < p < \infty$. Show that if $\lim_{n\to\infty} f_n = f$ $\mu$-a.e., then also $f_n$ converges weakly to $f$ in $L^p(\mu)$.

4.35 Show that the conclusion of 4.34 holds with the assumption of the pointwise $\mu$-a.e. convergence replaced by convergence in measure.

4.36 Let $(X,\mathcal{M},\mu)$ be a measure space and $\{f_n\}$ a sequence of $L^p(\mu)$ functions so that $\liminf\|f_n\|_p < \infty$ for some $0 < p < \infty$. Suppose further that $\lim_{n\to\infty} f_n = f$ $\mu$-a.e. What can then be said about $\|f\|_p$? Well, prove that
$$\lim_{n\to\infty}\big(\|f_n - f\|_p^p - \|f_n\|_p^p + \|f\|_p^p\big) = 0.$$
CHAPTER XIII

Fubini's Theorem
In this chapter we deal with the questions of "exchanging the order of integration" in a double integral, and of evaluating an integral as an iterated one.
1. ITERATED INTEGRALS

The Lebesgue theory of integration developed in Chapter VII is independent of the dimension of the ambient space, and this is an attractive feature. Nevertheless, the computation of an integral in $R^n$ is usually carried out as a succession of $n$ one-dimensional integrals. The following is a familiar situation: Suppose $f$ is a continuous function on a rectangle $I = [a,b] \times [c,d]$ in $R^2$. Then it is true that
$$\int_I f(x,y)\,dA = \int_a^b\!\int_c^d f(x,y)\,dy\,dx = \int_c^d\!\int_a^b f(x,y)\,dx\,dy,$$
where all integrals are taken in the sense of Riemann. Simple examples indicate that, in general, this is not always the case. Consider, for instance, $I = (0,1) \times (0,1)$ and
$$f(x,y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}, \quad (x,y) \in I.$$
Since as a simple computation shows $\int_0^1 f(x,y)\,dy = 1/(1+x^2)$, we have $\int_0^1\!\int_0^1 f(x,y)\,dy\,dx = \pi/4$. Similarly, $\int_0^1\!\int_0^1 f(x,y)\,dx\,dy = -\pi/4$. Now, a couple of things go wrong with $f$. First notice that for $0 < x < 1$, we have $\int_0^x f(x,y)\,dy = 1/2x$, and consequently, $\int_0^1 |f(x,y)|\,dy \ge 1/2x$.
It is then apparent that $\int_0^1\!\int_0^1 |f(x,y)|\,dy\,dx = \infty$. Also, since on the region $\{(x,y) \in I : |x| > \sqrt{3}\,|y|\}$ we have
$$|f(x,y)| \ge \frac{1}{2(x^2 + y^2)},$$
by x.xx in Chapter VII, it is clear that $f \notin L(I)$.

Motivated by this example we introduce the following definition: Let $I_1 \subseteq R^n$ and $I_2 \subseteq R^m$ be intervals in $R^n$ and $R^m$ respectively, and consider the "rectangle" $I = I_1 \times I_2 \subseteq R^n \times R^m = R^{n+m}$. An integrable function $f \in L(I)$ is said to satisfy, or have, "property F", provided the following three properties hold:

(i) For almost every $x \in I_1$, in the sense of the $n$-dimensional Lebesgue measure, $f(x,y)$ is a measurable and integrable function of $y$ on $I_2$, with respect to the $m$-dimensional Lebesgue measure.

(ii) As a function of $x \in I_1$, $\int_{I_2} f(x,y)\,dy$ is measurable and Lebesgue integrable.

(iii)
$$\int_I f = \int_{I_1}\!\int_{I_2} f(x,y)\,dy\,dx.$$
What we hope to prove is that every $f \in L(I)$ has property F. This we achieve in a series of lemmas, each involving a basic step required in approximating arbitrary integrable functions. First an observation: By setting $f = 0$ off $I$ if necessary we may assume that $I_1 = R^n$, $I_2 = R^m$ and $I = R^{n+m}$.
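Returning for a moment to the example $f(x,y) = (x^2 - y^2)/(x^2 + y^2)^2$ on $(0,1) \times (0,1)$, the two iterated integrals can be checked numerically. The sketch below is only an illustration: it uses the closed forms of the inner integrals (the antiderivative of $f$ in $y$ is $y/(x^2+y^2)$) together with a composite midpoint rule, and the helper names `midpoint`, `inner_dy`, and `inner_dx` are ours, not the text's.

```python
import math

def f(x, y):
    """The integrand (x^2 - y^2) / (x^2 + y^2)^2 from the example."""
    return (x * x - y * y) / (x * x + y * y) ** 2

def midpoint(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def inner_dy(x):
    # Integral of f(x, y) in y over (0, 1); y / (x^2 + y^2) is an antiderivative.
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    # Since f(x, y) = -f(y, x), the integral in x over (0, 1) has the opposite sign.
    return -1.0 / (1.0 + y * y)

# Sanity check of the closed form at x = 0.5, where f(0.5, .) is smooth on [0, 1].
check = midpoint(lambda y: f(0.5, y), 0.0, 1.0)

dy_dx = midpoint(inner_dy, 0.0, 1.0)   # integrate in y first, then in x
dx_dy = midpoint(inner_dx, 0.0, 1.0)   # integrate in x first, then in y

print(check, inner_dy(0.5))
print(dy_dx, math.pi / 4)
print(dx_dy, -math.pi / 4)
```

The two iterated integrals differ in sign, which is consistent with $f$ failing to be integrable on the square.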
Lemma 1.1. Suppose $\{f_k\}$ is a nondecreasing sequence of functions each of which has property F, and suppose that $\lim_{k\to\infty} f_k = f \in L(R^{n+m})$. Then $f$ also has property F.

Proof. Let $N_k$ be a null set (in $R^n$) such that $f_k(x,y)$ is measurable and integrable as a function of $y \in R^m$ for $x \notin N_k$, and let $N = \bigcup_k N_k$; clearly $|N| = 0$. Now, for $x \notin N$, $f_k(x,y)$ increases (in the wide sense) to $f(x,y)$, and consequently, for those $x$'s, $f(x,\cdot)$ is measurable. Thus by MCT it follows that
$$\lim_{k\to\infty}\int_{R^m} f_k(x,y)\,dy = \int_{R^m} f(x,y)\,dy. \tag{1.1}$$
Moreover, since each $f_k$ has property F, each integral on the left-hand side of (1.1) is a measurable, integrable function of $x$ and consequently, $\int_{R^m} f(x,y)\,dy$ is also a measurable function of $x$. Again by MCT, (1.1) gives
$$\lim_{k\to\infty}\int_{R^n}\!\int_{R^m} f_k(x,y)\,dy\,dx = \int_{R^n}\!\int_{R^m} f(x,y)\,dy\,dx. \tag{1.2}$$
On the other hand, also by MCT,
$$\lim_{k\to\infty}\int_{R^{n+m}} f_k = \int_{R^{n+m}} f. \tag{1.3}$$
Now, since the $f_k$'s have property F, for each $k$ the integral on the left-hand side of (1.2) is equal to that on the left-hand side of (1.3) and consequently, we get
$$\int_{R^{n+m}} f = \int_{R^n}\!\int_{R^m} f(x,y)\,dy\,dx.$$
Since $f \in L(R^{n+m})$, $f$ has property F. ■
Corollary 1.2. Suppose $\{f_k\}$ is a nonincreasing sequence of functions each of which has property F, and suppose that $\lim_{k\to\infty} f_k = f \in L(R^{n+m})$. Then $f$ also has property F.

Lemma 1.3. Suppose $E = \bigcap_{k=1}^{\infty} O_k$, $O_k$ open, is a $G_\delta$ subset of $R^{n+m}$ so that $|O_k| < \infty$ for some $k$. Then $\chi_E$ satisfies property F.

Proof. We proceed in steps. Note first that if $I_1$ and $I_2$ are open, bounded intervals in $R^n$ and $R^m$ respectively, and if $I = I_1 \times I_2$, then $\chi_I(x,y) = \chi_{I_1}(x)\chi_{I_2}(y)$, and $\chi_I$ has property F. Also note that if $A$ is any subset of the boundary of $I$, we then have
$$|\{y \in R^m : (x,y) \in A\}| = 0, \quad \text{a.e. } x \text{ in } R^n.$$
Roughly speaking, this relation is true except, possibly, along the "endpoints" of $I_1$. It then follows that
$$\int_{R^n}\!\int_{R^m}\chi_A(x,y)\,dy\,dx = 0,$$
which together with $\int_{R^{n+m}}\chi_A = |A| = 0$, implies that $\chi_A$ has property F.

Consider next a subset $\tilde{I}$ of $R^{n+m}$ consisting of a rectangle $I = I_1 \times I_2$, $I_1$, $I_2$ open in $R^n$ and $R^m$ respectively, plus some portion $A$ of the boundary of $I$. Since $\chi_{\tilde{I}} = \chi_I + \chi_A$, and since as is readily verified any finite linear combination of functions that satisfy property F also has property F, then by the above results it follows that $\chi_{\tilde{I}}$ has property F.

Now suppose $O$ is an open set in $R^{n+m}$ with finite measure; we can write $O = \bigcup_k \tilde{I}_k$ as the pairwise disjoint union of subsets $\tilde{I}_k$ of $R^{n+m}$ consisting of open rectangles $I_k$ and a subset of their boundary. Thus, $\chi_O = \lim_{N\to\infty}\sum_{k=1}^{N}\chi_{\tilde{I}_k}$, and since each sum on the right-hand side above has property F, by Lemma 1.1 $\chi_O$ also satisfies property F.

We are finally ready to handle $\chi_E$. Under our conditions we may assume that
$$E = \bigcap_{k=1}^{\infty} O_k, \quad |O_1| < \infty.$$
Then $\chi_E = \lim_{N\to\infty}\chi_{\bigcap_{k=1}^{N} O_k}$, and each of the sets $\bigcap_{k=1}^{N} O_k$ is open and has finite measure. Therefore, each of the functions on the right-hand side above has property F, and, by Corollary 1.2, $\chi_E$ has property F. ■

Lemma 1.4. Let $N \subseteq R^{n+m}$, $|N| = 0$. Then $\chi_N$ has property F.
Proof. By Corollary 1.5 in Chapter V there is a $G_\delta$ set $E \supseteq N$ such that $|E| = |N| = 0$. Now, $\chi_E$ has property F, and, in particular,
$$\int_{R^n}\!\int_{R^m}\chi_E(x,y)\,dy\,dx = |E| = 0.$$
Thus, by Corollary 2.3 in Chapter VII, we get that
$$\int_{R^m}\chi_E(x,y)\,dy = |\{y \in R^m : (x,y) \in E\}| = 0, \quad x \text{ a.e. in } R^n,$$
which in turn implies that
$$\int_{R^n}\!\int_{R^m}\chi_N(x,y)\,dy\,dx = \int_{R^n}|\{y \in R^m : (x,y) \in N\}|\,dx = 0.$$
Since also $\int_{R^{n+m}}\chi_N = |N| = 0$, it readily follows that $\chi_N$ has property F. ■

Lemma 1.5. Suppose $E \subseteq R^{n+m}$ is Lebesgue measurable, $|E| < \infty$. Then $\chi_E$ has property F.
Iterated Integrals
Proof.
241
By 1.5 in Chapter V we have E = H \ N,
HaGs set, and N null.
where H = n~l Ok, and some Ok has finite measure. Now, since XE = XH - XN' and since by Lemmas 1.2 and 1.4 the functions XH and XN have property F, it follows that XE also has property F. • The stage is now set for a preliminary version of Fubini's theorem. Theorem 1.6. Proof.
Suppose f E L(Rn+m). Then f has property F.
Since f = f+ - f- , f+, f- integrable, we may assume that
f is nonnegative. The conclusion follows now by a familiar limiting argument. Indeed, let {
o ~
and
lim
k-+oo
f
a.e.
Now, since each
$$E_x = \{y \in R^m : (x,y) \in E\}, \quad \text{all } x \in R^n,$$
and
$$E^y = \{x \in R^n : (x,y) \in E\}, \quad \text{all } y \in R^m.$$
Although not intuitively apparent, the sections of a measurable subset of $R^{n+m}$ are a.e. measurable in the respective ambient spaces. This is a particular instance of our next result, which extends the following interesting consequence of the preliminary version of Fubini's theorem: If $f(x,y) \in L(R^{n+m})$, then $f(x,y)$ is a Lebesgue measurable function of $y \in R^m$, for almost every $x \in R^n$. We now show that the same is true if $f$ is merely measurable; the result about the sections follows by considering $f = \chi_E$, which is the first step of the proof.

Proposition 1.7. Suppose $f$ is a measurable function defined on $R^{n+m}$. Then, for almost every $x \in R^n$, $f(x,y)$ is a measurable function of $y \in R^m$. Symmetrically, for almost every $y \in R^m$, $f(x,y)$ is a measurable function of $x \in R^n$.
Proof. Assume first that $f = \chi_E$, where $E$ is a measurable subset of $R^{n+m}$. Then write $E = H \cup N$, where $H$ is of type $F_\sigma$ in $R^{n+m}$ and $|N| = 0$. Then $E_x = H_x \cup N_x$, where $H_x$ is an $F_\sigma$ subset of $R^m$, and by Lemma 1.4, $|N_x| = 0$ for almost every $x \in R^n$. Therefore, $E_x$ is measurable for almost every $x \in R^n$. Similarly for $E^y$.

If $f$ is any measurable function on $R^{n+m}$ now, given $\lambda$ real, consider the set $E(\lambda) = \{(x,y) \in R^{n+m} : f(x,y) > \lambda\}$. Since $E(\lambda)$ is measurable in $R^{n+m}$, the section $E(\lambda)_x$ is measurable in $R^m$ for almost every $x$ in $R^n$; the exceptional null subset of $R^n$ depends, of course, on $\lambda$. The union $N$ of these exceptional sets for all rational $\lambda$ is also a null set, and the set $\{y \in R^m : f(x,y) > \lambda\}$ is measurable, provided that $\lambda$ is rational and $x \notin N$. By 4.1 in Chapter VI the same is true for all $\lambda$, and consequently $f(x,y)$ is a measurable function of $y \in R^m$, for almost every $x \in R^n$. The other statement is proved in an analogous fashion. ■
We now state Fubini's theorem in its general form.

Theorem 1.8. Let $f(x,y)$ be a measurable function defined on a measurable subset $E$ of $R^{n+m}$. Then,

(i) For almost every $x \in R^n$, $f(x,y)$ is a measurable function of $y$ on $E_x$.

(ii) For almost every $y \in R^m$, $f(x,y)$ is a measurable function of $x$ on $E^y$.

(iii) If $f \in L(E)$, then for almost every $x$ in $R^n$, $f(x,\cdot) \in L(E_x)$. Moreover, $\int_{E_x} f(x,y)\,dy$ is an integrable function of $x$ and
$$\int_E f = \int_{R^n}\!\int_{E_x} f(x,y)\,dy\,dx.$$

(iv) If $f \in L(E)$, then for almost every $y$ in $R^m$, $f(\cdot,y) \in L(E^y)$. Moreover, $\int_{E^y} f(x,y)\,dx$ is an integrable function of $y$ and
$$\int_E f = \int_{R^m}\!\int_{E^y} f(x,y)\,dx\,dy.$$
Proof. Let $\tilde{f}$ be the function equal to $f$ on $E$ and equal to 0 elsewhere in $R^{n+m}$. Since $f$ is measurable on $E$ and $E$ is measurable, $\tilde{f}$ is measurable on $R^{n+m}$. Therefore, by Proposition 1.7, $\tilde{f}(x,y)$ is a measurable function of $y$ for almost every $x$ in $R^n$. Moreover, since $E_x$ is measurable for almost every $x$ in $R^n$, it follows that $f(x,y)$ is measurable as a function of $y$ on $E_x$, for almost every $x$. This proves (i); the proof of (ii) is analogous.

Now, if $f$ is integrable on $E$, then $\tilde{f} \in L(R^{n+m})$ and by Theorem 1.6 we get
$$\int_E f = \int_{R^{n+m}}\tilde{f} = \int_{R^n}\!\int_{R^m}\tilde{f}(x,y)\,dy\,dx. \tag{1.4}$$
On the other hand, $E_x$ is measurable for almost every $x$ in $R^n$, and for those $x$'s we get
$$\int_{R^m}\tilde{f}(x,y)\,dy = \int_{E_x} f(x,y)\,dy. \tag{1.5}$$
Whence integrating (1.5) over $R^n$, and combining the resulting expression with (1.4), it readily follows that (iii) holds. The proof of (iv) is analogous. ■

Next we consider whether a converse to Fubini's theorem is true. For instance, suppose $f$ is measurable on $R^{n+m}$, and the iterated integrals of $f$ exist and are equal. Does it, then, follow that $f$ is integrable? The answer to this question is no, and the counterexample is constructed along the lines of Figure 5 below.
Figure 5 (subsquares labeled $\psi_k = 1$ and $\psi_k = -1$)
Indeed, let $J = [0,1] \times [0,1]$, divide $J$ into four equal squares, separate the bottom left square, call it $J_1$, divide the right upper square into four equal squares, call the bottom left square $J_2$, and so on. On each square $J_k$, $k = 1, 2, \ldots$, define a function $\psi_k$ as follows: Divide each $J_k$ into four equal squares, and let $\psi_k$ equal $-1$ on the interior of the left bottom and right upper squares, equal 1 on the interior of the right bottom and left upper squares, and 0 otherwise. Put now
$$f(x,y) = \sum_{k=1}^{\infty}\frac{1}{|J_k|}\psi_k(x,y).$$
There is no problem with convergence here since at most one of the terms of the above series is nonzero. Also $f$, being the sum of measurable functions, is itself measurable. Now, since for all $x \in [0,1]$ we have $\int_{[0,1]} f(x,y)\,dy = 0$, it is clear that
$$\int_{[0,1]}\!\int_{[0,1]} f(x,y)\,dy\,dx = 0.$$
A similar argument gives that the other iterated integral is 0, and so both iterated integrals exist and are equal. Is $f$ integrable, and, if so, is its integral equal to 0? Well, since $|\psi_k(x,y)| = 1$ for $(x,y) \in J_k$, it follows that
$$\int_{[0,1]\times[0,1]} |f| = \sum_k \frac{1}{|J_k|}\int_{J_k} |\psi_k| = \infty,$$
and the integral of $f$ is not defined on $[0,1] \times [0,1]$. In deciding what goes wrong with $f$, we note that it changes signs. In fact, this is the only difficulty.

Theorem 1.9 (Tonelli). Suppose $f$ is a nonnegative measurable function defined on $R^{n+m}$. Then,
$$\int_{R^{n+m}} f = \int_{R^n}\!\int_{R^m} f(x,y)\,dy\,dx = \int_{R^m}\!\int_{R^n} f(x,y)\,dx\,dy. \tag{1.6}$$
The identity in (1.6) is understood as follows: If any one of the three expressions there is infinite, so are the other two, and if any one is finite, the other two are also finite and all three are then equal.

Proof. Let $\{f_k\}$ be a nondecreasing sequence of integrable functions that converges to $f$ a.e. For instance, if $\chi_k(x,y)$ denotes the characteristic function of $I_1(0,k) \times I_2(0,k)$, then the sequence
$$f_k(x,y) = \min\{f(x,y), k\}\chi_k(x,y), \quad k = 1, 2, \ldots$$
will do. Now apply Fubini's theorem to each $f_k$ and note that outside of a null subset $N_k$ of $R^n$, $\int_{R^m} f_k(x,y)\,dy$ is well-defined and
$$\int_{R^{n+m}} f_k = \int_{R^n}\!\int_{R^m} f_k(x,y)\,dy\,dx, \quad k = 1, 2, \ldots \tag{1.7}$$
Let $N = \bigcup_k N_k$; $N$ is a null subset of $R^n$ and for $x \in R^n \setminus N$, by MCT, it follows that
$$\lim_{k\to\infty}\int_{R^m} f_k(x,y)\,dy = \int_{R^m} f(x,y)\,dy.$$
Thus, again by MCT,
$$\lim_{k\to\infty}\int_{R^n}\!\int_{R^m} f_k(x,y)\,dy\,dx = \int_{R^n}\!\int_{R^m} f(x,y)\,dy\,dx. \tag{1.8}$$
On the other hand, also by MCT, we get
$$\lim_{k\to\infty}\int_{R^{n+m}} f_k = \int_{R^{n+m}} f. \tag{1.9}$$
Whence, combining (1.7), (1.8), and (1.9) we see that
$$\int_{R^{n+m}} f = \int_{R^n}\!\int_{R^m} f(x,y)\,dy\,dx,$$
which is one of the identities in (1.6). The other is obtained in a similar fashion. ■

An interesting application of Tonelli's theorem is the following remark concerning the evaluation of the measure of a set $E$ in $R^{n+m}$ in terms of its sections. Since for each fixed $x \in R^n$, $\chi_E(x,y) \ne 0$ iff $\chi_{E_x}(y) \ne 0$, we have
$$|E| = \int_{R^n}\!\int_{R^m}\chi_E(x,y)\,dy\,dx = \int_{R^n} |E_x|\,dx.$$
A similar argument also gives that $|E| = \int_{R^m} |E^y|\,dy$. Note that in particular we have that $E$ is null iff $|E_x| = 0$ for $x$ a.e. in $R^n$ iff $|E^y| = 0$ for $y$ a.e. in $R^m$.
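As a concrete instance of computing a measure from sections, take $E$ to be the open unit disk in $R^2$ (our choice of example, not the text's). Each section $E_x$ is an interval of length $2\sqrt{1 - x^2}$ for $|x| < 1$ and is empty otherwise, so $\int_R |E_x|\,dx$ should equal $\pi$. A minimal numerical sketch, with illustrative helper names and a plain midpoint rule:

```python
import math

def section_length(x):
    """|E_x| for the unit disk E = {x^2 + y^2 < 1}: an interval of length 2*sqrt(1 - x^2)."""
    return 2.0 * math.sqrt(1.0 - x * x) if abs(x) < 1.0 else 0.0

def midpoint(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Integrate the section lengths over the projection of E onto the x-axis.
area = midpoint(section_length, -1.0, 1.0, 200_000)

print(area, math.pi)
```

By symmetry the same computation with the roles of $x$ and $y$ exchanged gives $\int_R |E^y|\,dy$, in agreement with the remark above.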
2. CONVOLUTIONS AND APPROXIMATE IDENTITIES

Given integrable functions $f$, $g$ defined on $R^n$, we have already noted that the pointwise product $fg$ is not necessarily integrable. In other words, the expression $\int_{R^n} f(y)g(y)\,dy$ is not necessarily finite. It may come as a surprise that a closely related expression, namely the convolution $f * g(x)$ of $f$ and $g$ at $x$, is nevertheless finite a.e. on $R^n$. The precise definition is as follows: Suppose $f, g \in L(R^n)$, and consider the integral
$$f * g(x) = \int_{R^n} f(x-y)g(y)\,dy, \quad x \in R^n. \tag{2.1}$$
Suppose for the moment that $f, g \ge 0$, and note that for any fixed $x \in R^n$, the integrand in (2.1) is a nonnegative measurable function, so that $f * g(x)$ is a well-defined real number or $\infty$. But, is there any $x$ for which $f * g(x) < \infty$? A way to go about answering this question is to show that actually the function $f * g$ is integrable, for then it will be finite a.e. As for the integrability, observe that
$$\|f * g\|_1 = \int_{R^n}\!\int_{R^n} f(x-y)g(y)\,dy\,dx,$$
and this expression corresponds to one of the iterated integrals of the function $f(x-y)g(y)$. But it is the other iterated integral, namely,
$$\int_{R^n}\!\int_{R^n} f(x-y)g(y)\,dx\,dy = \int_{R^n}\Big(\int_{R^n} f(x-y)\,dx\Big)g(y)\,dy = \|f\|_1\|g\|_1 < \infty,$$
which is easy to handle. To pass from one to the other we wish to invoke Tonelli's theorem, and to do this we need to prove first that the function $h(x,y)$ defined by
$$h(x,y) = f(x-y)g(y), \quad x, y \in R^n,$$
is actually measurable in $R^n \times R^n = R^{2n}$. Now, since $g$ is measurable, given a scalar $\lambda$, the set
$$\{(x,y) \in R^{2n} : g(y) > \lambda\} = R^n \times \{y \in R^n : g(y) > \lambda\}$$
is a measurable subset of $R^{2n}$, and $g$ is also measurable when considered as a function defined on $R^{2n}$.
So, we are reduced to showing that if $f$ is a measurable function defined on $R^n$, then $h(x,y) = f(x-y)$ is a measurable function on $R^n \times R^n = R^{2n}$. To clarify the ideas behind this result we do the case $n = 1$, where the main difficulties are already apparent; we leave it to the reader to think of the general proof. We proceed step by step and assume first that $f = \chi_I$, where $I = (a,b)$ is an open interval. In this case the level sets
$$\{(x,y) \in R^2 : f(x-y) > \lambda\}, \quad \lambda \text{ real},$$
are empty if $\lambda \ge 1$, the open strip $\{(x,y) : a < x-y < b\}$ if $0 \le \lambda < 1$, and all of $R^2$ if $\lambda < 0$. In each case, the level sets are measurable.

Next, if $f = \chi_O$, $O$ open, then write $O = \bigcup_k I_k$ as the pairwise disjoint union of the open intervals $I_k$, and note that the level sets are now $\{f(x-y) > \lambda\} = \bigcup_k\{\chi_{I_k}(x-y) > \lambda\}$, and hence they are also measurable. If $f = \chi_G$, $G$ a $G_\delta$ subset in $R$, then we have $G = \bigcap_k O_k$, $O_k$ open, where we may assume that the $O_k$'s decrease, and consequently $f(x-y)$ is the limit $f(x-y) = \lim_{k\to\infty}\chi_{O_k}(x-y)$ of measurable functions, and hence measurable.

The same is true if $f = \chi_N$, $N$ a null subset of $R$. Indeed, let $G$ be a $G_\delta$ subset in $R$ such that $N \subseteq G$, $|G| = 0$, and let $\mathcal{G} = \{(x,y) \in R^2 : x - y \in G\}$. As pointed out above, we have $|\mathcal{G}| = 0$ iff $|\mathcal{G}_x| = 0$ for almost every $x \in R$. Now,
$$\mathcal{G}_x = \{y \in R : x - y \in G\} = \{y \in R : y = x - g,\ g \in G\} = x - G,$$
and by the translation invariance of the Lebesgue measure we have $|x - G| = |G| = 0$ for all $x \in R$. Thus $|\mathcal{G}| = 0$. Consider now
$$\mathcal{N} = \{(x,y) \in R^2 : x - y \in N\} \subseteq \mathcal{G}.$$
By the completeness of the Lebesgue measure it follows that also $|\mathcal{N}| = 0$, and so $\chi_N(x-y)$ is measurable.

Now, if $f = \chi_E$, where $E$ is a measurable subset of $R$, write $E = G \setminus N$, with $G$ a $G_\delta$ subset of $R$ and $N$ a null set, and note that in this case $f(x-y) = \chi_G(x-y) - \chi_N(x-y)$ is also measurable. Whence simple functions $\phi$ also enjoy the property that $\phi(x-y)$ is measurable, and the same is true of the limits of simple functions, to wit, arbitrary measurable functions. This is precisely what we set out to prove.

Returning to the properties of convolutions, we have

Theorem 2.1. Suppose $f, g \in L(R^n)$. Then
$$\int_{R^n} |f(x-y)||g(y)|\,dy < \infty, \quad x \text{ a.e. in } R^n. \tag{2.2}$$
For those $x$'s that satisfy (2.2) we define
$$f * g(x) = \int_{R^n} f(x-y)g(y)\,dy. \tag{2.3}$$
Then $f * g \in L(R^n)$, and
$$\|f * g\|_1 \le \|f\|_1\|g\|_1. \tag{2.4}$$
Proof. First observe that the function $h$ on $R^{2n}$ defined by $h(x,y) = |f(x-y)||g(y)|$ is measurable and nonnegative. Thus by Tonelli's theorem we have the equality of the iterated integrals of $h$, and consequently,
$$\int_{R^n}\!\int_{R^n} |f(x-y)||g(y)|\,dy\,dx = \int_{R^n}\!\int_{R^n} |f(x-y)||g(y)|\,dx\,dy = \int_{R^n}\Big(\int_{R^n} |f(x-y)|\,dx\Big)|g(y)|\,dy.$$
By the translation invariance of the Lebesgue measure it readily follows that the innermost integral above is equal to $\|f\|_1$, and the right-hand side is $\|g\|_1\|f\|_1$. In particular, the inner integral on the left-hand side is finite for almost every $x$, which is (2.2). Now, since $|f * g(x)| \le \int_{R^n} |f(x-y)||g(y)|\,dy$, (2.4) holds. ■
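The bound $\|f * g\|_1 \le \|f\|_1\|g\|_1$ of Theorem 2.1 has an exact discrete analogue, which makes it easy to illustrate numerically. The sketch below is an illustration under stated assumptions: functions on $R$ are replaced by samples on a uniform grid of spacing `h` truncated to $[-5,5]$, integrals by Riemann sums, and the particular $f$ and $g$ are our own choices. For nonnegative data the two sides agree, as in the Tonelli computation; when $g$ changes sign, cancellation makes the inequality strict.

```python
import math

def convolve(a, b):
    """Full discrete convolution: out[k] = sum over i + j = k of a[i] * b[j]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def l1_norm(values, h):
    """Riemann-sum approximation of the L^1 norm on a grid of spacing h."""
    return h * sum(abs(v) for v in values)

h = 0.05
grid = [k * h for k in range(-100, 101)]          # samples on [-5, 5]
f = [math.exp(-x * x) for x in grid]              # nonnegative
g = [x * math.exp(-abs(x)) for x in grid]         # changes sign at 0

# Riemann-sum version of (f * g)(x) = integral of f(x - y) g(y) dy.
f_conv_g = [h * v for v in convolve(f, g)]
lhs = l1_norm(f_conv_g, h)
rhs = l1_norm(f, h) * l1_norm(g, h)

# Same computation with |g| in place of g: no cancellation, so equality holds.
g_abs = [abs(v) for v in g]
lhs_abs = l1_norm([h * v for v in convolve(f, g_abs)], h)

print(lhs, rhs, lhs_abs)
```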
What more can we say about convolutions? For instance, a natural question to consider is whether the operation of convolution has a unit, i.e., whether there exists an integrable function $f$ such that $f * g = g$ for any $g \in L(R^n)$. Suppose such a function $f$ exists, and given $0 \ne x \in R^n$, let $0 < \eta < |x|$. Put now $g(y) = (1/2\eta)^n\chi_{I(0,\eta)}(y)$ and note that
$$0 = g(x) = f * g(x) = \frac{1}{|I(0,\eta)|}\int_{I(0,\eta)} f(x-y)\,dy. \tag{2.5}$$
Now, by the Lebesgue Differentiation Theorem, the limit as $\eta \to 0$ of the above expression is $f(x)$ a.e., and consequently, we get that $f = 0$ a.e. Thus, since $0 * g = 0$, such a function $f$ cannot exist. Nevertheless, if we put $x = 0$ in (2.5), since $g(0) = 1/|I(0,\eta)|$, again by the Lebesgue Differentiation Theorem it follows that
$$\infty = \lim_{\eta\to 0}\frac{1}{|I(0,\eta)|}\int_{I(0,\eta)} f(-y)\,dy = f(0).$$
Still one more property: Since $f \ge 0$, by taking $g$ to be nonnegative and integrable, we get
$$\int_{R^n} g = \int_{R^n} f * g = \int_{R^n} f\int_{R^n} g,$$
and so we have $\int_{R^n} f = 1$. Combining these remarks we note that the integrable function $f$, which does not exist, satisfies the following properties: It is 0 a.e. away from the origin, where it assumes the value $\infty$, and it has integral 1. In other words, $f$ corresponds to the Dirac measure $\delta_0$.

The convolution is also well-defined in the more general context of the Lebesgue $L^p$-spaces, $1 < p \le \infty$. More precisely, we have

Theorem 2.2. Suppose $g \in L(R^n)$, $f \in L^p(R^n)$, $1 < p \le \infty$. Then the convolution $f * g(x)$ is a well-defined $L^p(R^n)$ function that satisfies
$$\|f * g\|_p \le \|g\|_1\|f\|_p. \tag{2.6}$$

Proof. Since $|f * g(x)| \le |f| * |g|(x)$ we may assume that $f$ and $g$ are nonnegative. Since the case $p = \infty$ follows at once from the definition of convolution, we assume that $1 < p < \infty$. Let $1 < q < \infty$ be the conjugate index to $p$, i.e., $1/p + 1/q = 1$, and note that by Hölder's inequality, Tonelli's theorem, and the translation invariance of the integral we have
$$\|f * g\|_p^p = \int_{R^n}\Big(\int_{R^n} f(x-y)g(y)^{1/p}g(y)^{1/q}\,dy\Big)^p dx \le \int_{R^n}\Big(\int_{R^n} f(x-y)^p g(y)\,dy\Big)\Big(\int_{R^n} g(y)\,dy\Big)^{p/q} dx$$
$$= \|g\|_1\|f\|_p^p\|g\|_1^{p/q} = \|g\|_1^{p/q+1}\|f\|_p^p = \|g\|_1^p\|f\|_p^p,$$
and, as asserted, (2.6) holds. ■
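The estimate $\|f * g\|_p \le \|g\|_1\|f\|_p$ applies even when $f$ itself is not integrable. The following sketch illustrates this on a grid, again as a hedged numerical experiment rather than a proof: the grid spacing, the truncation to $[-15,15]$, and the choices $f(x) = (1+|x|)^{-3/4}$ (which on the whole line belongs to $L^p$ only for $p > 4/3$) and a tent function $g$ are all assumptions made for the illustration. The discrete inequality being tested holds exactly for finite sequences, so the comparison is not affected by discretization error.

```python
def convolve(a, b):
    """Full discrete convolution: out[k] = sum over i + j = k of a[i] * b[j]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def lp_norm(values, p, h):
    """Riemann-sum approximation of the L^p norm on a grid of spacing h."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

h = 0.1
grid = [k * h for k in range(-150, 151)]           # samples on [-15, 15]
f = [(1.0 + abs(x)) ** -0.75 for x in grid]        # slow decay: not integrable on R
g = [max(0.0, 1.0 - abs(x)) for x in grid]         # tent function, integrable

# Riemann-sum version of (f * g)(x) = integral of f(x - y) g(y) dy.
f_conv_g = [h * v for v in convolve(f, g)]

results = {}
for p in (1.5, 2.0, 4.0):
    lhs = lp_norm(f_conv_g, p, h)                  # approximates ||f * g||_p
    rhs = lp_norm(g, 1.0, h) * lp_norm(f, p, h)    # approximates ||g||_1 ||f||_p
    results[p] = (lhs, rhs)
    print(p, lhs, rhs)
```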
Thus the convolution inherited the integrability properties of $f$; next we show that it also inherits the "smoothness" properties of $g$. Smooth integrable functions abound; $e^{-|x|^2}$ is one such example. In applications it is important to have at hand a smooth integrable function $\phi$ that vanishes off $|x| \le 1$. To construct such a function, let $\psi$ be defined on the line by
$$\psi(t) = \begin{cases} e^{1/t}, & t < 0, \\ 0, & t \ge 0. \end{cases}$$
Since all the derivatives of $\psi$ exist when $t \ne 0$ and converge to 0 when $t \to 0$, and since by a simple application of l'Hôpital's rule it follows that all derivatives of $\psi$ vanish when $t = 0$, $\psi \in C^\infty(R)$. The function $\phi$ is now defined by letting $\phi(x) = \psi(|x|^2 - 1)$.

Before we continue we need a simple observation concerning convolutions: By the translation invariance of the Lebesgue measure it readily follows that at those $x$'s where $f * g(x)$ is defined, we also have
$$f * g(x) = \int_{R^n} f(y)g(x-y)\,dy.$$
We are now ready to prove

Theorem 2.3. Suppose $f \in L^p(R^n)$, $1 \le p \le \infty$, and $\phi \in C_0^m(R^n)$, $m \ge 1$. Then $f * \phi \in L^p(R^n) \cap C^m(R^n)$.
II * l/J(x + h) - 1* l/J(x)1 =
lin
I(y)(l/J(X + h - y) -l/J(X - Y))dyl
JRn II(y)IIl/J(x + h -
:::; f
y) -l/J(X - y)1 dy
:::; IIllIplll/J(X + h - .) -l/J(X - ')lIq, (2.7) conjugate indices. Now, if q < 00, since l/J E Lq(Rn) and
where p,q are since translations are continuous in Lq, the right-hand side of (2.7) goes to oas Ihl -+ 0, and so does the left-hand side; thus I*l/J(x) is continuous. On the other hand, if q = 00, by the first inequality above it readily follows that II * l/J(x + h) - 1* l/J(x)1 :::; 111111 sup Il/J(' + h) -l/J(-)I·
Now, by the uniform continuity of l/J we get that the right-hand side of the above inequality goes to 0 with h, and so does the left-hand side there. As for the smoothness, let h = (h, 0, ... ,0) denote the vector with scalar h i- 0 in the first position and zeros elsewhere, and put
"'( h)_l/J(x+h-y)-l/J(x- y ) _ (al/J) ( _ ) .,.., x,y, h aX! x y. By the conditions of the theorem it is clear that for each fixed x E R n , ~(x,y,h) -+ 0, uniformly and boundedly in y as h -+ O. Whence
lim h-+O
f I(y)~(x,y,h) dy = 0, JJln
Moreover, since the integral in (2.8) equals
x E Rn.
(2.8)
2.
Convolutions and Approximate Identities
251
it readily follows that
au *
VXl
o
*~
VXl
Clearly a similar argument applies to all other derivatives of the convolution, and the statement concerning the smoothness has also been established. • Now, the convolution f * 0, are the dilates of the kernel
Suppose
l
lim
1
Rn
and &-+0
{lxl>M}
l
Rn
o.
all€>O,
(2.9)
alIM>O.
(2.10)
Proof. We begin by checking (2.9) for the simplest function, namely
and (2.9) holds. Next we show that this is also true if
is also null, and consequently, by 3.9 in Chapter V, we have
Now, by Tonelli's theorem again, we get that (2.9) holds in this case as well, both integrals there being O. Combining these observations it readily follows that (2.9) is true for 0, and let y = x/c. Thus, by (2.9), we have
~n c
1
{lxl>M}
1
{IYI>M/~}
and since M / c -+ 00 as c -+ 0, it readily follows that the integral on the right-hand side above tends to 0 with c. • What is the meaning of these properties of
will be to emphasize the values of f(x - y) corresponding to small y. Our next results show that, in fact, f *
lim IIf *
~-+O
= o.
(2.11)
Proof.
Under the conditions of the theorem, by (2.9) we have
f(x)
= f(x) f
JRn
f f(x)
all£>O.
JRn
Whence it readily follows that
If *
~ and consequently sup
Ilf *
y) -
f(X»
JRR If(x - y) -
f(x)I
= IJRR (J(X -
(2.12)
flip is dominated by
f f If(x-y)-f(x)I
JRn JRn
(2.13)
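where the sup in (2.13) is taken over those $g$'s in $L^q(R^n)$, $1/p + 1/q = 1$, with $\|g\|_q \le 1$. Now, by Tonelli's theorem, each integral in (2.13) equals
$$\int_{R^n}\int_{R^n}|f(x-y) - f(x)|\,|g(x)|\,dx\,\phi_\varepsilon(y)\,dy \le \int_{R^n}\|f(\cdot - y) - f(\cdot)\|_p\,\|g\|_q\,\phi_\varepsilon(y)\,dy$$
$$\le \Big(\int_{\{|y|\le M\}} + \int_{\{|y|>M\}}\Big)\|f(\cdot - y) - f(\cdot)\|_p\,\phi_\varepsilon(y)\,dy = A + B,$$
say. Since $\|f(\cdot - y) - f(\cdot)\|_p \to 0$ as $|y| \to 0$, given $\eta > 0$, we can choose $M$ so that $\|f(\cdot - y) - f(\cdot)\|_p \le \eta$ if $|y| \le M$, and consequently
$$A \le \eta\int_{\{|y|\le M\}}\phi_\varepsilon(y)\,dy \le \eta, \qquad \text{all } \varepsilon > 0.$$
Moreover, since $\|f(\cdot - y) - f(\cdot)\|_p \le 2\|f\|_p$, and having fixed $M$ as above, by (2.10) we get
$$B \le 2\|f\|_p\int_{\{|y|>M\}}\phi_\varepsilon(y)\,dy \to 0, \qquad \text{as } \varepsilon \to 0.$$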
Whence combining these estimates we get that the integrals in (2.13) can be made arbitrarily small provided $\varepsilon > 0$ is small enough, and this completes the proof. •

An interesting consequence of this result is

Corollary 2.6. $C_0^\infty(R^n)$ is dense in $L^p(R^n)$, $1 \le p < \infty$.
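Proof. Suppose $f \in L^p(R^n)$, $1 \le p < \infty$. Given $\eta > 0$, choose $M$ so large that $\int_{\{|y|>M\}}|f(y)|^p\,dy \le \eta^p$. Next pick a nonnegative kernel $\phi \in C_0^\infty(R^n)$ with integral 1, and let $f_\varepsilon = (f\chi_{B(0,M)}) * \phi_\varepsilon$. Now, by Theorem 2.3, $f_\varepsilon \in L^p(R^n) \cap C^\infty(R^n)$; but there is something else we can say. Indeed, since both $f\chi_{B(0,M)}$ and $\phi$ vanish off a compact set $K$, say, the convolution $f_\varepsilon(x) = \int_{R^n}(f\chi_{B(0,M)})(x - y)\,\phi_\varepsilon(y)\,dy$ vanishes unless there are points $x$ and $y$ such that $x - y \in K$ and $y/\varepsilon \in K$. Hence, $f_\varepsilon(x) = 0$ unless $x$ is of the form
$$x = (x - y) + \varepsilon\,(y/\varepsilon) \in K + \varepsilon K,$$
and this is a bounded set of points in $R^n$. Thus $f_\varepsilon \in C_0^\infty(R^n)$. Finally,
$$\|f - f_\varepsilon\|_p \le \|f\chi_{B(0,M)} - f_\varepsilon\|_p + \|f(1 - \chi_{B(0,M)})\|_p \le \|f\chi_{B(0,M)} - f_\varepsilon\|_p + \eta,$$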
and by Theorem 2.5 the right-hand side above can be made arbitrarily small with $\varepsilon$. •

There are substitute results for $f \in L^\infty(R^n)$; one of them is

Theorem 2.7. Suppose $\phi$ is a nonnegative integrable function with integral 1, and let $f \in L^\infty(R^n)$. Then
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x), \tag{2.14}$$
at every point $x$ of continuity of $f$.

Proof. As before, by (2.12) we have
$$|f * \phi_\varepsilon(x) - f(x)| \le \int_{R^n}|f(x-y) - f(x)|\,\phi_\varepsilon(y)\,dy \le \Big(\int_{\{|y|\le M\}} + \int_{\{|y|>M\}}\Big)|f(x-y) - f(x)|\,\phi_\varepsilon(y)\,dy = A + B,$$
say. Now, if $f$ is continuous at $x$, given $\eta > 0$, there exists $M > 0$ such that $|f(x - y) - f(x)| \le \eta$ if $|y| \le M$. With this choice of $M$ we have
$$A + B \le \eta + 2\|f\|_\infty\int_{\{|y|>M\}}\phi_\varepsilon(y)\,dy,$$
where, by (2.10), the integral on the right-hand side above tends to 0 as $\varepsilon \to 0$. Since $\eta > 0$ is arbitrary, this readily implies that (2.14) is true. •
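The convergence in (2.14) can be observed numerically. The following is a minimal sketch, assuming the one-dimensional box kernel $\phi = \frac12\chi_{[-1,1]}$ and a step function for $f$; the helper names `dilated_box_average` and `step` are illustrative and do not come from the text. For this kernel, $f * \phi_\varepsilon(x)$ is the average of $f$ over $[x - \varepsilon, x + \varepsilon]$, approximated here by a midpoint rule.

```python
def dilated_box_average(f, x, eps, n=20000):
    """Midpoint-rule approximation of (f * phi_eps)(x), where
    phi = (1/2) * indicator of [-1, 1] and phi_eps(y) = phi(y / eps) / eps."""
    h = 2.0 * eps / n                 # width of each subinterval of [-eps, eps]
    total = 0.0
    for k in range(n):
        y = -eps + (k + 0.5) * h      # midpoint of the k-th subinterval
        total += f(x - y)
    # phi_eps equals 1 / (2 * eps) on [-eps, eps]
    return total * h / (2.0 * eps)


def step(t):
    """Bounded function, continuous everywhere except at t = 0."""
    return 1.0 if t >= 0 else 0.0


for eps in (1.0, 0.1, 0.01):
    at_continuity_point = dilated_box_average(step, 0.5, eps)
    at_jump = dilated_box_average(step, 0.0, eps)
    print(eps, at_continuity_point, at_jump)
```

At the continuity point $x = 0.5$ the averages approach $f(0.5) = 1$ as $\varepsilon$ decreases, whereas at the jump $x = 0$ they stay near $1/2$, which differs from $f(0) = 1$; this is consistent with the theorem asserting convergence only at points of continuity.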
Still we must address the harder question concerning the pointwise convergence to $f$ of the convolutions of $f$ with the approximate identities $\phi_\varepsilon$ for integrable, or more generally, $p$-integrable functions. We begin by proving

Theorem 2.8. Suppose $\phi$ is a nonnegative integrable function with integral equal to 1, and let $f \in L^p(R^n)$, $1 \le p < \infty$. If in addition $\phi$ satisfies
$$\phi(y) \le c/|y|^{n+\eta}, \qquad \eta > 0, \text{ all } y \in R^n, \tag{2.15}$$
then at each point $x$ of continuity of $f$ we have
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x). \tag{2.16}$$
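Proof. The proof follows along the lines of that of Theorem 2.7. For, in the notation of that theorem, and with the same choice of $M$ as there, we still have that $A$ can be made arbitrarily small at a point of continuity $x$ of $f$. As for $B$, it is majorized by
$$\int_{\{|y|>M\}}|f(x-y)|\,\phi_\varepsilon(y)\,dy + |f(x)|\int_{\{|y|>M\}}\phi_\varepsilon(y)\,dy = B_1 + B_2,$$
say. (2.10) establishes that $\lim_{\varepsilon\to 0} B_2 = 0$. Also, if $p > 1$, by Hölder's inequality with indices $p$ and its conjugate $q$, we have
$$B_1 \le \Big(\int_{R^n}|f(x-y)|^p\,dy\Big)^{1/p}\Big(\int_{\{|y|>M\}}\phi_\varepsilon(y)^q\,dy\Big)^{1/q} = \|f\|_p\,\varepsilon^{-n+n/q}\Big(\int_{\{|y|>M/\varepsilon\}}\phi(y)^q\,dy\Big)^{1/q} = \|f\|_p\,\Phi(\varepsilon),$$
say. Consequently, any condition on $\phi$ that ensures that $\lim_{\varepsilon\to 0}\Phi(\varepsilon) = 0$ will give (2.16); we show next that (2.15) is one such condition. Indeed, if (2.15) holds, then
$$\Phi(\varepsilon) \le c\,\varepsilon^{-n+n/q}\Big(\int_{\{|y|>M/\varepsilon\}}\frac{dy}{|y|^{(n+\eta)q}}\Big)^{1/q} = c_M\,\varepsilon^{\eta},$$
where $c_M$ depends only on $c$, $M$, $n$, $\eta$ and $q$, which clearly tends to 0 with $\varepsilon$. Finally, if $p = 1$, then
$$B_2 \le |f(x)|\int_{\{|y|>M/\varepsilon\}}\phi(y)\,dy,$$
which, since $\phi$ is integrable, also tends to 0 with $\varepsilon$. •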
To complete the analogy with the Lebesgue Differentiation Theorem, we discuss the a.e. convergence of $f * \phi_\varepsilon$ to $f$. The results are now more complicated, cf. 4.18 below and Theorem 3.1 in Chapter XVII, but are surprisingly simple in case $\phi$ vanishes off a compact set.

Theorem 2.10. Suppose $\phi$ is a nonnegative bounded integrable function with integral equal to 1, which vanishes off $B(0,1)$, and let $f \in L^p(R^n)$, $1 \le p \le \infty$. Then, at each point $x$ of the Lebesgue set of $f$, and in particular a.e., we have
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x).$$

Proof. As before, and in the notation of Theorem 2.7, since $\phi_\varepsilon$ vanishes off the set $\{|y| \le \varepsilon\}$, the choice $M = \varepsilon$ gives that $B = 0$. As for $A$, since $\phi$ is bounded, it is dominated by
$$A \le c\,\frac{1}{\varepsilon^n}\int_{\{|y|\le\varepsilon\}}|f(x-y) - f(x)|\,dy,$$
which goes to 0 with $\varepsilon$ at precisely those points $x$ in the Lebesgue set of $f$. •

The reader will note that the convergence results presented above may be extended to the following setting: We may assume that $\phi$ is an integrable function with integral one such that $|\phi(x)| \le \psi(x)$ for all $x \in R^n$, where $\psi$ satisfies the conditions that we previously required the nonnegative function $\phi$ to verify.
3. ABSTRACT FUBINI'S THEOREM

In this section we present the abstract version of Fubini's theorem. Now, in the case of Euclidean space the problem at hand was facilitated by the fact that the Lebesgue measure is defined on the various spaces involved. Thus, given measures $\mu$, $\nu$ defined on $(X,\mathcal M)$ and $(Y,\mathcal N)$ respectively, the first order of business is to construct a "product measure" on $\mathcal M \times \mathcal N$, the $\sigma$-algebra introduced in Chapter IV, one that will make statements such as Fubini's theorem true.

First some definitions. A measurable rectangle is any subset of $X \times Y$ of the form $A \times B$, $A \in \mathcal M$, $B \in \mathcal N$. Finite unions of pairwise disjoint measurable rectangles are called elementary sets, and are often denoted by $Q$. If $E \subseteq X \times Y$, we define the section $E_x$ of $E$ (at level $x \in X$) as the subset of $Y$ given by
$$E_x = \{y \in Y : (x,y) \in E\}, \qquad x \in X. \tag{3.1}$$
Similarly, the section $E^y$ of $E$ (at level $y \in Y$) is defined as
$$E^y = \{x \in X : (x,y) \in E\}, \qquad y \in Y. \tag{3.2}$$
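How do sections behave with respect to measurability?

Proposition 3.1. Every section of a measurable set $E \in \mathcal M \times \mathcal N$ is measurable. Specifically, $E_x \in \mathcal N$ for all $x \in X$, and $E^y \in \mathcal M$ for every $y \in Y$.

Proof. Let $\mathcal F$ denote the class of those $E \in \mathcal M \times \mathcal N$ such that $E_x \in \mathcal N$ for all $x \in X$; we intend to show that $\mathcal F$ is a $\sigma$-algebra of subsets of $X \times Y$ that contains all measurable rectangles and which therefore coincides with $\mathcal M \times \mathcal N$. First note that if $E = A \times B$ is a measurable rectangle, then $E_x = B$ when $x \in A$ and $E_x = \emptyset$ otherwise, and consequently, every measurable rectangle belongs to $\mathcal F$. In particular $X \times Y \in \mathcal F$. Further, since $\mathcal N$ is a $\sigma$-algebra, it readily follows that if $E \in \mathcal F$, then
$$((X \times Y) \setminus E)_x = \{y \in Y : (x,y) \notin E\} = Y \setminus E_x \in \mathcal N, \qquad \text{all } x \in X,$$
and $\mathcal F$ is closed under complementation. Finally, if $E_n \in \mathcal F$, $n = 1, 2, \ldots$, and $E = \bigcup_n E_n$, since
$$E_x = \bigcup_n (E_n)_x \in \mathcal N, \qquad \text{all } x \in X,$$
$\mathcal F$ is also closed under countable unions. Thus $\mathcal F$ is the $\sigma$-algebra $\mathcal M \times \mathcal N$. The proof for the $E^y$'s is the same. •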
The statement of Proposition 3.1 is one about characteristic functions of measurable sets. For arbitrary measurable functions the situation is as follows: If $f$ is a function defined on $X \times Y$, we call the function
$$f_x(y) = f(x,y), \qquad x \in X,$$
the $X$-section of $f$ at level $x$. Similarly, the $Y$-section of $f$ at level $y$ is defined by
$$f^y(x) = f(x,y), \qquad y \in Y.$$
We begin by showing that sections of measurable functions are measurable; the measurability of functions defined on $X \times Y$ is always understood to be with respect to the $\sigma$-algebra $\mathcal M \times \mathcal N$.

Proposition 3.2. The sections of measurable functions are measurable. More precisely, if $f$ is a measurable function defined on $X \times Y$, $f_x$ is a measurable function on $(Y, \mathcal N)$ for every $x \in X$, and $f^y$ is a measurable function on $(X, \mathcal M)$ for every $y \in Y$.

Proof. Let $f$ be measurable, and given an open set $O$ note that for each $x \in X$ we have
$$f_x^{-1}(O) = \{y \in Y : f_x(y) \in O\} = \{y \in Y : f(x,y) \in O\} = \{y \in Y : (x,y) \in f^{-1}(O)\} = (f^{-1}(O))_x.$$
Now, since $f^{-1}(O) \in \mathcal M \times \mathcal N$, the measurability of the set on the right-hand side above follows from Proposition 3.1, and that of $f_x$ from Proposition 1.3 in Chapter VI. The proof for $f^y$ is the same. •

We are now ready to prove the basic result needed to introduce the product measure.

Theorem 3.3. Let $(X,\mathcal M,\mu)$, $(Y,\mathcal N,\nu)$ be $\sigma$-finite measure spaces, and suppose $E \in \mathcal M \times \mathcal N$. Then, as a function of $x \in X$, $\nu(E_x)$ is a measurable function on $(X,\mathcal M)$, and as a function of $y \in Y$, $\mu(E^y)$ is a measurable function on $(Y,\mathcal N)$. Furthermore
$$\int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu. \tag{3.3}$$
Proof. The measurability of $E_x$ and $E^y$ has been established in Proposition 3.1, so we begin by computing the integrands of the integrals in (3.3). Now, as noted above, if $E = \bigcup_n (A_n \times B_n)$ is an elementary set, then $\chi_{E_x}(y) = \sum_n \chi_{A_n}(x)\chi_{B_n}(y)$, and by the additivity of $\nu$ it readily follows that
$$\nu(E_x) = \sum_n \chi_{A_n}(x)\,\nu(B_n).$$
Clearly $\nu(E_x)$ is a measurable function on $(X,\mathcal M)$, and
$$\int_X \nu(E_x)\,d\mu = \sum_n \mu(A_n)\,\nu(B_n). \tag{3.4}$$
In a similar fashion it follows that $\mu(E^y)$ is a measurable function on $(Y,\mathcal N)$ and that its integral over $Y$ with respect to $\nu$ is equal to the right-hand side of (3.4). Therefore the assertion of the theorem is true for all elementary sets in $\mathcal M \times \mathcal N$; we now show that the collection $\mathcal F$ of sets in $\mathcal M \times \mathcal N$ for which (3.3) is true is a monotone class which, on account of 4.19 in Chapter IV, coincides with $\mathcal M \times \mathcal N$.

Let $\{E_n\} \subseteq \mathcal F$ be a nondecreasing sequence, and write $E = \bigcup_n E_n$; we must show that $E \in \mathcal F$ as well. First note that $\{(E_n)_x\} \subseteq \mathcal N$ is a nondecreasing sequence that converges to $E_x \in \mathcal N$, and consequently by (3.3) in Chapter IV,
$$\lim_{n\to\infty}\nu((E_n)_x) = \nu(E_x), \qquad \text{all } x \in X.$$
Whence $\nu(E_x)$ is a limit of measurable functions on $(X, \mathcal M)$, and is therefore measurable. Further, by MCT it readily follows that
$$\lim_{n\to\infty}\int_X \nu((E_n)_x)\,d\mu = \int_X \nu(E_x)\,d\mu. \tag{3.5}$$
Similarly $\mu(E^y)$ is measurable on $(Y,\mathcal N)$, and
$$\lim_{n\to\infty}\int_Y \mu((E_n)^y)\,d\nu = \int_Y \mu(E^y)\,d\nu. \tag{3.6}$$
Now, since for each $n$ the integrals that appear on the left-hand side of (3.5) and (3.6) are equal, so are their limits. In other words, the integrals on the right-hand side of (3.5) and (3.6) are equal, and (3.3) holds in this case.
Suppose next that $\{E_n\} \subseteq \mathcal F$ is a nonincreasing sequence of sets, and put $E = \bigcap_n E_n$; we must show that $E \in \mathcal F$. The preceding argument, invoking now (3.4) in Chapter IV instead, certainly goes through if $X$ and $Y$ have finite measure. To see that the same is true in the $\sigma$-finite case, let $\{X_k\}$, $\{Y_k\}$ be nondecreasing sequences of sets of finite measure such that $X = \bigcup X_k$ and $Y = \bigcup Y_k$. Then, since for each $k$ the nonincreasing sequence $\{E_n \cap (X_k \times Y_k)\}$ converges to $E \cap (X_k \times Y_k)$ and $\mu(X_k), \nu(Y_k) < \infty$, (3.3) is true for $E \cap (X_k \times Y_k)$, $k = 1, 2, \ldots$ But since the nondecreasing sequence $\{E \cap (X_k \times Y_k)\}$ converges to $E$, the conclusion of the theorem also holds for $E$. We have thus shown that $\mathcal F$ is a monotone class, and the proof is complete. •
The following example shows that the $\sigma$-finiteness of the measures was necessary. Let $I = [0,1]$ and consider the measure spaces $(I, \mathcal L, |\cdot|)$ and $(I, \mathcal P(I), \nu)$, where $\nu$ denotes the counting measure on $(I, \mathcal P(I))$. Further, let $E = \{(x,y) \in I \times I : x = y\}$ be the "diagonal set" in $I \times I$; it is not difficult to show that $E \in \mathcal L \times \mathcal P(I)$. Now, since for each $x, y \in I$ the sets $E_x$ and $E^y$ consist of a single point, we have $\nu(E_x) = 1$ and $|E^y| = 0$, so that
$$\int_I \nu(E_x)\,dx = 1, \qquad \text{and} \qquad \int_I |E^y|\,d\nu = 0.$$
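If $(X,\mathcal M,\mu)$ and $(Y,\mathcal N,\nu)$ are as in Theorem 3.3, we define the set function $\mu \times \nu$ on $(X \times Y, \mathcal M \times \mathcal N)$ by
$$(\mu \times \nu)(E) = \int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu, \qquad E \in \mathcal M \times \mathcal N. \tag{3.7}$$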
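Definition (3.7) can be made concrete on finite sets, where every subset is measurable and the integrals reduce to weighted sums. The sketch below is only an illustration under that assumption; the point masses `mu`, `nu`, the set `E`, and all function names are hypothetical choices, not taken from the text. It computes $(\mu \times \nu)(E)$ through the $x$-sections and through the $y$-sections and compares the two values.

```python
from fractions import Fraction


def section_x(E, x):
    """E_x = {y : (x, y) in E}."""
    return {y for (a, y) in E if a == x}


def section_y(E, y):
    """E^y = {x : (x, y) in E}."""
    return {x for (x, b) in E if b == y}


def measure(weights, A):
    """Measure of a finite set A for a measure given by point masses."""
    return sum((weights[p] for p in A), Fraction(0))


def product_measure_via_x(E, mu, nu):
    """Integral over X of nu(E_x) with respect to mu."""
    return sum((mu[x] * measure(nu, section_x(E, x)) for x in mu), Fraction(0))


def product_measure_via_y(E, mu, nu):
    """Integral over Y of mu(E^y) with respect to nu."""
    return sum((nu[y] * measure(mu, section_y(E, y)) for y in nu), Fraction(0))


# Hypothetical point masses on X = {a, b, c} and Y = {0, 1}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
nu = {0: Fraction(2), 1: Fraction(5)}
E = {("a", 0), ("a", 1), ("c", 1)}

print(product_measure_via_x(E, mu, nu), product_measure_via_y(E, mu, nu))
```

Exact rational arithmetic is used so that the equality of the two iterated sums can be checked without rounding error.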
The equality of the integrals in (3.7) is assured by Theorem 3.3. We call $\mu \times \nu$ the "product" of the measures $\mu$ and $\nu$, and it follows without much difficulty from MCT that $\mu \times \nu$ indeed is a measure. Observe that $\mu \times \nu$ is also $\sigma$-finite. Now, since $\nu(E_x) = \int_Y \chi_E(x,\cdot)\,d\nu$ and $\mu(E^y) = \int_X \chi_E(\cdot, y)\,d\mu$, (3.7) actually states that a Tonelli-like identity is true for the characteristic functions of measurable sets. In fact, the general statement holds as well, to wit,

Theorem 3.4. Let $(X,\mathcal M,\mu)$, $(Y,\mathcal N,\nu)$ be $\sigma$-finite measure spaces, and let $f$ be a nonnegative extended real-valued measurable function defined on $(X \times Y, \mathcal M \times \mathcal N)$. Then $\int_Y f_x(y)\,d\nu$ is a measurable function on $(X,\mathcal M)$, $\int_X f^y(x)\,d\mu$ is a measurable function on $(Y,\mathcal N)$, and
$$\int_{X\times Y} f\,d(\mu\times\nu) = \int_X\int_Y f_x(y)\,d\nu\,d\mu = \int_Y\int_X f^y(x)\,d\mu\,d\nu. \tag{3.8}$$
Proof. By (3.7) the theorem is true for characteristic functions of measurable sets, and hence (3.8) holds for all nonnegative simple functions. By Theorem 1.12 in Chapter VI we know that $f$ is the limit of a nondecreasing sequence of simple functions, and consequently (3.8) follows by MCT as in Theorem 1.8. •

Corollary 3.5. Under the assumptions of Theorem 3.4, if
$$\int_X\int_Y |f|_x(y)\,d\nu\,d\mu < \infty,$$
then $f \in L(X \times Y, \mu \times \nu)$.

Proof. Apply Theorem 3.4 to $|f|$. •

Theorem 3.6 (Fubini). Under the assumptions of Theorem 3.4, if $f \in L(X \times Y, \mu \times \nu)$, then $f_x \in L(Y,\nu)$ $\mu$-a.e. on $X$, $f^y \in L(X,\mu)$ $\nu$-a.e. on $Y$, the functions
$$\int_Y f_x(y)\,d\nu \in L(X,\mu), \qquad \int_X f^y(x)\,d\mu \in L(Y,\nu),$$
and (3.8) holds.
Proof. Write $f = f^+ - f^-$. By Theorem 3.4 we have
$$\int_{X\times Y} f^+\,d(\mu\times\nu) = \int_X\int_Y (f^+)_x(y)\,d\nu\,d\mu = \int_Y\int_X (f^+)^y(x)\,d\mu\,d\nu.$$
Since $f \in L(X \times Y, \mu \times \nu)$, the left-hand side above is finite and so
$$\int_Y (f^+)_x(y)\,d\nu \in L(X,\mu), \qquad \text{and} \qquad \int_X (f^+)^y(x)\,d\mu \in L(Y,\nu).$$
This implies that $(f^+)_x(\cdot) \in L(Y,\nu)$ $\mu$-a.e. and that $(f^+)^y(\cdot) \in L(X,\mu)$ $\nu$-a.e. The same result holds if we replace $f^+$ by $f^-$, and thus the theorem follows. •

A word about the product measure: If $(X,\mathcal M,\mu)$ and $(Y,\mathcal N,\nu)$ are complete measure spaces, it does not follow that $(X \times Y, \mathcal M \times \mathcal N, \mu \times \nu)$ is complete. For instance, if $\mu = \nu = |\cdot|$, the Lebesgue measure on $R$, let $A = \{x\}$ consist of a single point, and $B$ be any non-Lebesgue measurable subset of $R$. Then $A \times B \subset A \times R$, $|A \times R| = 0$ but $A \times B \notin \mathcal L \times \mathcal L$. Thus $|\cdot| \times |\cdot|$ is not a complete measure, and, in particular, it is not the Lebesgue measure on the plane. However, the completion of $|\cdot| \times |\cdot|$ is the Lebesgue measure on the plane, cf. 4.27 below. The statement of Fubini's theorem in this context is left to the reader, cf. 4.28 below.
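The integrability hypothesis in Fubini's theorem cannot simply be dropped. A standard illustration, sketched here under the assumption that $X = Y = \{1, 2, \ldots\}$ carry the counting measure (a $\sigma$-finite setting in which integrals are sums), uses the array $a(i,j)$ equal to $1$ if $j = i$, $-1$ if $j = i + 1$, and $0$ otherwise. The function names below are illustrative. Since each row and each column has at most two nonzero entries, the inner sums are computed exactly.

```python
def a(i, j):
    """Entry of the array: +1 on the diagonal, -1 just to the right of it."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0


def row_sum(i):
    """Sum over all j >= 1 of a(i, j); only j = i and j = i + 1 contribute."""
    return sum(a(i, j) for j in (i, i + 1))


def col_sum(j):
    """Sum over all i >= 1 of a(i, j); only i = j - 1 and i = j contribute."""
    return sum(a(i, j) for i in (j - 1, j) if i >= 1)


N = 1000  # number of outer terms; the partial sums below do not depend on N
rows_first = sum(row_sum(i) for i in range(1, N + 1))
cols_first = sum(col_sum(j) for j in range(1, N + 1))
print(rows_first, cols_first)
```

Every row sums to $0$, while the first column sums to $1$ and all other columns to $0$, so the two iterated sums are $0$ and $1$. This does not contradict Theorem 3.6 because $\sum_{i,j}|a(i,j)| = \infty$, that is, the array is not integrable with respect to the product of the counting measures.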
4. PROBLEMS AND QUESTIONS

4.1 Prove that if $E$ is a Lebesgue measurable subset of the rectangle $[0,1] \times [0,1]$, and if $|E_x| \le 1/2$ for almost all $x \in [0,1]$, then $|\{y \in [0,1] : |E^y| = 1\}| \le 1/2$. Can you think of $n$-dimensional extensions?

4.2 Show that if an extended real-valued function $f$ defined on $R^{n+m}$ has the property that $f_x$ is Borel measurable for every $x \in R^n$ and $f^y$ is continuous for every $y \in R^m$, then $f$ is Borel measurable.

4.3 Let $E$ be a dense measurable subset of $R^n$, and $f$ be an extended real-valued function defined on $R^{n+m}$. Show that if $f_x$ is Lebesgue measurable for all $x \in E$ and $f^y$ is continuous for almost every $y \in R^m$, then $f$ is Lebesgue measurable.

4.4 Calculate
$$\int_{[0,\infty)}\int_{[0,\sqrt{\pi}]}\frac{x^3y^3\cos(y^2)}{(x^4 + y^4)^{3/2}}\,dy\,dx.$$

4.5 Given that $f \in L(R)$ and that $\int_R\int_R f(4x)f(x + y)\,dx\,dy = 1$, calculate $\int_R f(x)\,dx$.
4.6 Let $f$ be an extended real-valued measurable function defined on $I = [0,1]$. Show that if the function $F(x, y) = f(x) - f(y) \in L(I \times I)$, then $f \in L(I)$.

4.7 For $t > 0$, let
$$\phi(t) = \begin{cases} 1 & \text{if } 2n \le t < 2n + 1 \text{ for some } n \\ -1 & \text{otherwise,} \end{cases}$$
let $I = [0,1]$, and for $(x,y) \in I \times I$ put $f(x,y) = (1/y)\phi(x/y)$ and $g(x) = \int_I f(x,y)\,dy$. Does $f \in L(I \times I)$? Does $g \in L(I)$?

4.8 Given $f \in L(R)$, let
$$\phi_h(x) = \frac{1}{2h}\int_{x-h}^{x+h} f(t)\,dt, \qquad h > 0.$$
Prove that $\phi_h \in L(R)$ and $\int_R |\phi_h(x)|\,dx \le \|f\|_1$.

4.9 Let $F$ be a closed subset of the line, and let $\delta(x) = \delta(x, F) = \inf\{|x - y| : y \in F\}$ denote the distance of $x$ from $F$. If $\lambda > 0$ and $f$ is a nonnegative function, $f \in L(R \setminus F)$, prove that the function
$$M_\lambda(f,x) = \int_R \frac{\delta(y)^\lambda f(y)}{|x - y|^{1+\lambda}}\,dy, \qquad x \in R,$$
is integrable over $F$, and so finite a.e. there, and $\infty$ if $x \notin F$. $M_\lambda$ is called the Marcinkiewicz function corresponding to $F$, and it is an indispensable tool in the theory of Fourier series. The particular case $f = \chi_I$, where $I$ is a bounded interval of the line, is of interest.

4.10 If $f$ is a nonnegative extended real-valued Borel measurable function on $R^{n+m}$, show that $\int_{R^n} f(x, y)\,dx$ and $\int_{R^m} f(x, y)\,dy$ are Borel measurable, and
$$\int_{R^{n+m}} f = \int_{R^m}\int_{R^n} f(x,y)\,dx\,dy = \int_{R^n}\int_{R^m} f(x,y)\,dy\,dx.$$

4.11 Let $E$ be a domain in the plane bounded by the continuous curves $y = \phi(x)$, $y = \psi(x)$ for $x \in I = [a,b]$, where $\phi(x) < \psi(x)$. Prove that if $f$ is a Borel measurable, integrable function defined on $E$, then
$$\int_E f = \int_I\int_{[\phi(x),\psi(x)]} f(x,y)\,dy\,dx.$$
4.12 Let $I = I(0,1)$ denote the unit interval in $R^n$, $0 < \eta < n$, and suppose $b(x, y)$ is an essentially bounded function defined on $I \times I$. Show that if $f \in L(I)$, the function
$$F(x) = \int_I \frac{b(x,y)f(y)}{|x - y|^\eta}\,dy, \qquad x \in I,$$
is finite a.e. on $I$. In fact, $F \in L(I)$. What is an estimate of $\|F\|_1$ in terms of $\|f\|_1$?

4.13 Prove that $L^p(R^n) * L^q(R^n) \subseteq L^\infty(R^n) \cap C(R^n)$, $1 < p, q < \infty$, $1/p + 1/q = 1$.

4.14 Prove that $L^p(R^n) * L^q(R^n) \subseteq L^r(R^n)$, $1 < p, q, r < \infty$, $1 + 1/r = 1/p + 1/q$. This result is known as Young's convolution theorem.

4.15 Suppose $\phi \in L^1(R^n) \cap L^\infty(R^n)$ has the property that $\int_{R^n}\phi = 1$ and $\lim_{|x|\to\infty}\phi(x)|x|^n = 0$. Now, if $f \in L^p(R^n)$, $1 \le p < \infty$, show that at each point of continuity $x$ of $f$, $\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x)$.

4.16 Verify that the assumptions of 4.15 are satisfied on the line by the Poisson kernel $P(x) = (1/\pi)(1/(1 + x^2))$, the Fejér kernel $K(x) = (1/\pi)(\sin x/x)^2$, and the Gauss-Weierstrass kernel $W(x) = (1/\sqrt{\pi})e^{-x^2}$.
4.17 Suppose that $f$ is integrable over the shell $0 \le r \le |x| \le R < \infty$, and that $\phi \in C([r, R])$. Then, if $F(\rho) = \int_{\{r\le|x|\le\rho\}} f(x)\,dx$, $r \le \rho \le R$, show that
$$\int_{\{r\le|x|\le R\}} f(x)\,\phi(|x|)\,dx = \int_r^R \phi(\rho)\,dF(\rho),$$
the integral on the right-hand side above being a Riemann-Stieltjes integral.
4.18 Suppose $\phi \in L^1(R^n) \cap L^\infty(R^n)$ has the property that $\int_{R^n}\phi = 1$ and that for some $\eta > 0$, $|\phi(x)||x|^{n+\eta} \le c$ for all $|x|$ large. Now, if $f \in L^p(R^n)$, $1 \le p < \infty$, show that at each point $x$ in the Lebesgue set of $f$, we have $\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x)$.

4.19 A sequence $\{\phi_k\} \subseteq L(R^n)$ is called an "approximate unit" if: (a) $\phi_k \ge 0$, for all $k$, (b) $\|\phi_k\|_1 = 1$, for all $k$, and (c) For each neighbourhood $G$ of 0 we have $\lim_{k\to\infty}\int_{R^n\setminus G}\phi_k = 0$. If $\{\phi_k\}$ is an approximate unit, and if $1 \le p < \infty$, prove that $\lim_{k\to\infty}\|f * \phi_k - f\|_p = 0$ for all $f \in L^p(R^n)$.

4.20 Show that $E = R^2 \setminus \{(x, y) \in R^2 : x - y \text{ is rational}\}$ contains no measurable rectangle of positive Lebesgue measure.

4.21 Prove that the operation of convolution is associative in $L(R^n)$. Specifically, if $f$, $g$, and $h$ are integrable, show that $f * (g * h)(x) = (f * g) * h(x)$ a.e.

4.22 Suppose $f$ and $g$ are nonnegative integrable functions defined on $R^n$ so that both $f$ and $g$ are strictly positive on some set of positive measure (not necessarily the same for $f$ and $g$). Prove that $f * g > 0$ on a set of positive measure.

4.23 (Minkowski's Integral Inequality). Under all appropriate measurability conditions on $f$, show that if $1 \le p < \infty$ we have
$$\Big(\int_Y\Big(\int_X |f(x,y)|\,d\mu\Big)^p d\nu\Big)^{1/p} \le \int_X\Big(\int_Y |f(x,y)|^p\,d\nu\Big)^{1/p} d\mu.$$
If we write this inequality in the form
$$\Big\|\int_X |f(x,\cdot)|\,d\mu\Big\|_p \le \int_X \|f(x,\cdot)\|_p\,d\mu,$$
then it is also true for $p = \infty$.
4.24 Concerning the example following Theorem 3.3, show that the integral $\int_{I\times I}\chi_E\,d(|\cdot| \times \nu)$, where $E$ is the "diagonal" set given there, is different from either of the iterated integrals.

In what follows $(X,\mathcal M,\mu)$ and $(Y,\mathcal N,\nu)$ are measure spaces.

4.25 If $\lambda$ is a measure on $\mathcal M \times \mathcal N$ such that $\lambda(A \times B) = \mu(A)\nu(B)$ for all measurable rectangles $A \times B$, show that $\lambda = \mu \times \nu$.

4.26 If the measure spaces involved are complete and $\sigma$-finite, and if $(\mu \times \nu)(E) = 0$, show that for every $F \subseteq E$ we have
$$\mu(F^y) = 0 \ \ \nu\text{-a.e.}, \qquad \text{and} \qquad \nu(F_x) = 0 \ \ \mu\text{-a.e.}$$

4.27 The measure space $(X \times Y, \mathcal M \times \mathcal N, \mu \times \nu)$ is seldom complete, even when the measure spaces involved are both complete. Prove that this is the case if there exists a set $A \subset X$ such that $A \notin \mathcal M$, and a nonempty set $B \in \mathcal N$ such that $\nu(B) = 0$. In particular, if $\mu$ denotes the Lebesgue measure on the line, $(R^2, \mathcal L \times \mathcal L, \mu \times \mu)$ is an incomplete measure space.

4.28 An alternative statement of the Fubini-Tonelli theorem is the following: Suppose $(X,\mathcal M,\mu)$ and $(Y,\mathcal N,\nu)$ are complete $\sigma$-finite measure spaces, and let $(X \times Y, \mathcal F, \lambda)$ denote the completion of $(X \times Y, \mathcal M \times \mathcal N, \mu \times \nu)$, cf. Theorem 3.3 in Chapter V. If $f$ is measurable (with respect to $\mathcal F$) and either (a) $f \ge 0$ or (b) $f \in L^1(\lambda)$, then $f_x$ is $\mathcal N$-measurable $\mu$-a.e., $f^y$ is $\mathcal M$-measurable $\nu$-a.e., and, in case (b) holds, also $f_x \in L^1(\nu)$ and $f^y \in L^1(\mu)$, in the a.e. sense. Moreover, the functions $\int_Y f_x\,d\nu$ and $\int_X f^y\,d\mu$ are measurable and
$$\int_{X\times Y} f\,d\lambda = \int_X\int_Y f_x\,d\nu\,d\mu = \int_Y\int_X f^y\,d\mu\,d\nu.$$
Prove it.

4.29 The requirement that $f$ be measurable cannot be dispensed with for the validity of Fubini's theorem. To see this let $X = Y$ be well-ordered sets with ordinal $\Omega$, $\mathcal M = \mathcal N$ be the $\sigma$-algebra consisting of those sets which are either at most countable or so that their complement is at most countable, and let $\mu = \nu$ be the measure defined for $A \in \mathcal M$ by $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = 1$ otherwise. Show that if $E = \{(x, y) \in X \times Y : x \prec y\}$, then $E_x$ and $E^y$ are measurable for all $x, y$, and that both iterated integrals of $\chi_E$ exist and are unequal. Hence $E \notin \mathcal M \times \mathcal N$, and Fubini's theorem does not hold in this case.
4.30 If one accepts the Continuum Hypothesis the construction in 4.29 leads to the following situation: There is a subset $E$ of $X = [0,1] \times [0,1]$ such that $E_x$ is at most countable for all $x \in [0,1]$, $[0,1] \setminus E^y$ is at most countable for all $y \in [0,1]$, but $E$ is not Lebesgue measurable.

4.31 The following result describes the behaviour of absolute continuity and singularity with respect to product measures. Let $\mu$ and $\mu^*$ be $\sigma$-finite measures on $(X,\mathcal M)$, and $\nu$ and $\nu^*$ be $\sigma$-finite measures on $(Y,\mathcal N)$. Prove that if $\mu \ll \mu^*$ and $\nu \ll \nu^*$, we have $\mu \times \nu \ll \mu^* \times \nu^*$ and
$$\frac{d(\mu \times \nu)}{d(\mu^* \times \nu^*)}(x, y) = \frac{d\mu}{d\mu^*}(x)\,\frac{d\nu}{d\nu^*}(y), \qquad \text{all } (x, y) \in X \times Y.$$
Also, if $\mu \perp \mu^*$ or $\nu \perp \nu^*$, then $\mu \times \nu \perp \mu^* \times \nu^*$.

4.32 In the notation of 4.31, prove that if $\mu = \mu_a + \mu_s$ is the Lebesgue decomposition of $\mu$ with respect to $\mu^*$, and similarly $\nu = \nu_a + \nu_s$ that of $\nu$ with respect to $\nu^*$, then the Lebesgue decomposition of $\mu \times \nu$ is given by $(\mu \times \nu)_a = \mu_a \times \nu_a$ and
$$(\mu \times \nu)_s = \mu_a \times \nu_s + \mu_s \times \nu_a + \mu_s \times \nu_s.$$

4.33 Let $\mu_1$ be a finite Borel measure on $R^{n_1}$, and $\mu_2$ a finite Borel measure on $R^{n_2}$. If $\mu_1 \times \mu_2$ is absolutely continuous with respect to the Lebesgue measure $\lambda$ on $R^{n_1+n_2}$, does it necessarily follow that $d(\mu_1 \times \mu_2)/d\lambda = f \cdot g$, where $f$ is a Lebesgue measurable function defined on $R^{n_1}$, and $g$ is a Lebesgue measurable function defined on $R^{n_2}$?
CHAPTER XIV

Normed Spaces and Functionals

In this chapter we study the basic properties of linear spaces, and in particular of those spaces which are normed, and of those which are complete in the metric induced by the norm, or Banach spaces. The existence of continuous linear functionals on these spaces is established by the Hahn-Banach Theorem.

1. NORMED SPACES
The time has come to set up a general framework to address some of the important questions we have posed, including the existence of bounded linear functionals on various linear spaces. We begin by introducing the necessary definitions.

Suppose $X$ is a vector space over the field of real, or complex, scalars; since the theory in both cases follows along similar lines we consider them simultaneously. A scalar valued function defined on $X$ is called a functional. We are first interested in a particular kind of functional, namely a seminorm. A nonnegative functional $p$ defined on $X$ is called a seminorm provided the following two properties are satisfied:

(i) (Triangle Inequality) $p(x + y) \le p(x) + p(y)$, $x, y \in X$.

(ii) (Absolute Homogeneity) $p(\lambda x) = |\lambda|p(x)$, $\lambda$ scalar, $x \in X$.

Of course, in (ii) above, $|\lambda|$ denotes the absolute value of $\lambda$ when $X$ is a vector space over the reals and the modulus of $\lambda$ when the scalar field is the complex numbers. It follows from (ii) that $p(0) = 0$. We say that the seminorm $p$ is a norm provided that

(iii) (Uniqueness) $p(x) = 0$ implies $x = 0$.
Norms are often denoted by $\|\cdot\|$, or variants thereof. To emphasize that $X$ is endowed with a norm we call $X$ a normed linear space. We have already encountered many instances of normed linear spaces. The finite-dimensional spaces $R^n$ and $C^n$ may, of course, be normed in different ways. For instance, if $z = (z_1, \ldots, z_n) \in C^n$, then the expressions
$$\|z\|_p = \Big(\sum_{j=1}^n |z_j|^p\Big)^{1/p}, \qquad 1 \le p < \infty, \tag{1.1}$$
and
$$\|z\|_\infty = \max\{|z_1|, \ldots, |z_n|\} \tag{1.2}$$
are norms on $C^n$. Observe that
$$\|z\|_\infty \le \|z\|_p, \qquad \text{and} \qquad \|z\|_p \le n^{1/p}\|z\|_\infty, \qquad \text{all } z \in C^n.$$
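The two inequalities relating $\|z\|_\infty$ and $\|z\|_p$ can be checked numerically on a sample vector. The following is a small sketch; the vector `z` and the function names `p_norm` and `sup_norm` are illustrative choices, not part of the text.

```python
def p_norm(z, p):
    """The p-norm (sum of |z_j|^p)^(1/p) of a finite sequence of complex numbers."""
    return sum(abs(c) ** p for c in z) ** (1.0 / p)


def sup_norm(z):
    """The max-norm: the largest modulus among the coordinates."""
    return max(abs(c) for c in z)


z = [3 + 4j, -1j, 2.0]   # a sample point of C^3
n = len(z)

for p in (1, 2, 5):
    lower = sup_norm(z)
    middle = p_norm(z, p)
    upper = n ** (1.0 / p) * sup_norm(z)
    # ||z||_inf <= ||z||_p <= n^(1/p) ||z||_inf
    assert lower <= middle <= upper
    print(p, lower, middle, upper)
```

A check on one vector is of course no proof; it only illustrates how the constants $1$ and $n^{1/p}$ enter the comparison.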
In general, if $\|\cdot\|_1$ and $\|\cdot\|_2$ are norms on $X$, we say that $\|\cdot\|_1$ is weaker than $\|\cdot\|_2$ if for some constant $k$ we have $\|x\|_1 \le k\|x\|_2$ for all $x \in X$. We also say that the norms are equivalent if we have both
$$\|x\|_1 \le k\|x\|_2 \qquad \text{and} \qquad \|x\|_2 \le k'\|x\|_1, \qquad \text{all } x \in X.$$
All norms on $C^n$ are equivalent. On the other hand, if $I$ is a bounded interval of $R^n$, then the $L^p$-norm on $L^p(I)$ is weaker than the $L^q$-norm on $L^p(I)$ iff $p \le q$, and consequently, these norms are not equivalent.

In the general context of function spaces defined on subsets of $R^n$ there are other instances of normed linear spaces that are of interest to us. For example, the space $B(I)$ consisting of those real, or complex-valued, bounded functions $f$ defined on $I$ may be normed by
$$\|f\| = \sup_{x\in I}|f(x)|. \tag{1.3}$$
Clearly this expression is also a norm on the subspace $C(I)$ of $B(I)$ consisting of those functions which are continuous. Now, if for a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$ of nonnegative integers we let $|\alpha| = \alpha_1 + \cdots + \alpha_n$, and
$$D^\alpha = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}},$$
then $C^k(I) = \{f \in C(I) : D^\alpha f \in C(I), |\alpha| \le k\}$ may be normed by
$$\|f\|_k = \sum_{|\alpha|\le k}\|D^\alpha f\|. \tag{1.4}$$
Another example is the class of BV functions defined on an interval $I = [a,b]$ of $R$; it is not hard to see that the expression
$$\|f\| = V(f; a, b) + |f(a)| \tag{1.5}$$
is a norm.

Note that if $X$ is a normed linear space, the function $d$ on $X \times X$ given by $d(x,y) = \|x - y\|$, $x, y \in X$, defines a metric on $X$. For, $d(x, y) = 0$ implies $\|x - y\| = 0$, which in view of (iii) above is equivalent to $x = y$. The symmetry of $d$ is obvious from the definition. Finally, the triangle inequality is a simple consequence of (i):
$$d(x,y) = \|x - y\| \le \|x - z\| + \|z - y\| = d(x,z) + d(z,y), \qquad \text{all } x, y, z \in X.$$
$d$ is called the metric induced by the norm $\|\cdot\|$. Among the normed linear spaces, a particularly important role is played by those spaces which are complete metric spaces in the metric induced by the norm; these are the so-called Banach spaces. For instance, $C(I)$ normed by (1.3) is a Banach space, but it is not a Banach space when normed by $\|\cdot\|_1$, the Lebesgue $L^1$ norm. Now, $C(I)$ normed with $\|\cdot\|_1$ is densely embedded in the Banach space $L^1(I)$ and an interesting question to ponder is whether any normed linear space which is not complete may be densely embedded in a Banach space; we will answer this question in the next section. In the meantime, inspired by the proof of Theorem 1.3 in Chapter XII, we consider a useful criterion to decide when a normed linear space $X$ is complete.

Observe that in a normed linear space $X$ it is possible to assign a sum $s$ to the (formal) series
$$\sum_{n=1}^\infty x_n = x_1 + \cdots + x_n + \cdots, \qquad x_n \in X, \text{ all } n.$$
Indeed, the series $\sum x_n$ is said to converge to the sum $s$ if the sequence $\{s_m\}$ of the partial sums $s_m = x_1 + \cdots + x_m$, $m = 1, 2, \ldots$, converges to $s$ in the norm, or metric, of $X$. Along the same lines we say that the series $\sum x_n$ is "absolutely convergent" if the numerical series
$$\sum_{n=1}^\infty \|x_n\| = \|x_1\| + \cdots + \|x_n\| + \cdots \tag{1.6}$$
converges. It is not hard to see that in a Banach space every absolutely convergent series converges. For this it suffices to verify that the sequence of partial sums $\{s_n\}$ is Cauchy in the metric of $X$. First observe that if $m > n$, then $s_m - s_n = x_{n+1} + \cdots + x_m$, and consequently, by (a simple extension of) (i) above, we also have
$$\|s_m - s_n\| \le \sum_{k=n+1}^m \|x_k\| \le \sum_{k=n+1}^\infty \|x_k\|. \tag{1.7}$$
Since the series in (1.6) converges, the right-hand side of (1.7) can be made as small as desired provided $n$ is sufficiently large; thus $\{s_m\}$ is Cauchy in $X$ and consequently, convergent to a limit $s \in X$, say. Furthermore, by the continuity of the norm, cf. 4.7 below, we also have $\|s\| \le \sum_{n=1}^\infty \|x_n\|$. These remarks lead to the following useful result.

Theorem 1.1. Let $X$ be a normed linear space. Then $X$ is a Banach space iff every absolutely convergent series converges.

Proof. It only remains to show that the condition is sufficient. Let $\{x_n\}$ be a Cauchy sequence of elements of $X$ and choose a sequence $n_{k+1} > n_k$, $k = 1, 2, \ldots$, such that
$$\|x_n - x_m\| \le 2^{-k}, \qquad \text{all } n, m \ge n_k.$$
Put $x_{n_0} = 0$, and let $y_k = x_{n_k} - x_{n_{k-1}}$ for $k = 1, 2, \ldots$ By the above estimate it follows that $\sum_{k=1}^\infty \|y_k\| \le \|x_{n_1}\| + \sum_k 2^{-k} < \infty$, and the series $\sum y_k$ is absolutely convergent. By assumption this series converges in $X$, and since the partial sums $\sum_{k=1}^m y_k$ of the series equal $x_{n_m}$, the subsequence $\{x_{n_m}\}$ converges in $X$. In order to complete the proof it suffices to invoke the well-known fact that if a Cauchy sequence in a metric space has a convergent subsequence, then the sequence itself converges to the same limit. •

As an illustration of this result we show that normed by (1.5) the space of BV functions on an interval $I = [a,b]$ is a Banach space. Indeed, let $\{f_n\}$ be a sequence of BV functions on $I$ such that $\sum \|f_n\| < \infty$. In particular, for $a \le x \le b$ we have
$$|f_n(x)| \le |f_n(a)| + |f_n(x) - f_n(a)|,$$
and consequently by (1.5) in Chapter III it follows that
$$\sum_n |f_n(x)| \le \sum_n \big(|f_n(a)| + V(f_n; a, b)\big) = \sum_n \|f_n\| < \infty.$$
Now, since the series with terms $\{f_n(x)\}$ converges absolutely for each $x \in I$, it also converges there. Let $f(x) = \sum f_n(x)$ denote its sum; we must show that $f$ is BV on $I$, and that $\lim_{m\to\infty}\|\sum_{n=1}^m f_n - f\| = 0$. First observe that since in summing a series with nonnegative terms we may interchange freely the order of summation, given a partition $\mathcal P = \{x_k\}$ of $I$ we have
$$\sum_{\text{over } \mathcal P}|f(x_k) - f(x_{k-1})| \le \sum_{\text{over } \mathcal P}\sum_n |f_n(x_k) - f_n(x_{k-1})| = \sum_n\sum_{\text{over } \mathcal P}|f_n(x_k) - f_n(x_{k-1})| \le \sum_n V(f_n; a, b),$$
and consequently $f$ is BV on $I$, and $V(f; a, b) \le \sum V(f_n; a, b)$. Moreover, since $|f(a) - \sum_{n=1}^m f_n(a)| \to 0$ as $m \to \infty$, and
$$V\Big(f - \sum_{n=1}^m f_n; a, b\Big) \le \sum_{n=m+1}^\infty V(f_n; a, b) \to 0, \qquad \text{as } m \to \infty,$$
it readily follows that $\|f - \sum_{n=1}^m f_n\| \to 0$ as $m \to \infty$, and we are done. The advantage of the above argument is apparent: We showed that a normed linear space was complete without dealing with Cauchy sequences, which are often difficult to handle.
2. THE HAHN-BANACH THEOREM

A functional $L$ defined on a linear space $X$ is said to be linear if for every $x, y \in X$ and scalar $\lambda$, we have
$$L(x + \lambda y) = Lx + \lambda Ly. \tag{2.1}$$
When $X$ is finite dimensional, the conjugate, or dual, space $X^*$ consisting of all the linear functionals on $X$, plays an important role in the development of the general theory of linear spaces. One of the main results is that the natural embedding of $X$ into $X^{**}$ is an isomorphism. In general, it is not possible to extend this result to infinite dimensional linear spaces; in fact, the result itself is not quite as relevant because most of the functionals that arise in concrete analytic situations and examples are also bounded. This requires, of course, that $X$ be normed and we make this clear in our next definition. We say that a linear functional $L$ defined on a normed linear space $X$ is bounded if there is a constant $c$, independent of $x \in X$, so that
$$|Lx| \le c\|x\|, \qquad \text{all } x \in X. \tag{2.2}$$
The study of bounded linear functionals originated in two closely related areas: The solution of linear systems of infinitely many equations with infinitely many unknowns, and the so-called summability methods of divergent series, in particular, the Fourier series of integrable functions. Now, even when the boundedness condition is imposed, it is found that the duality theory for infinite dimensional linear spaces is more complex than the finite dimensional case. We have already encountered some of these results as we examined the theory of the Lebesgue LP spaces, 0 < P ~ 00, in Chapter XII. The Hahn-Banach Theorem, or theorems actually, are an indispensable tool in the theory of duality. In the case of arbitrary linear spaces, where no topology is apparent, the Hahn-Banach Theorem assures a plentiful supply of linear functionals, and in the case of normed linear spaces, under some general "domination" assumptions, a supply of bounded linear functionals. To elucidate this point we discuss first a simple question of geometric nature. We say that a subset C of a normed linear space X is convex if for every X,Y E C, the set {7JX + (1 - 7J)Y: 0 ~ 7J ~ I} is contained in C. The question of interest to us may be loosely stated as follows: If C f:. R2 is a closed convex subset of the plane which contains the origin as an interior point, can we draw a line so that C lies entirely on one side of the line? The answer to this question is not intuitively obvious. First some definitions. Given a convex subset C of a normed linear space X which contains o as an interior point, there is a natural functional, called the Minkowski functional of C, which is denoted by Pc, and which satisfies the following properties: (i) Pc is nonnegative and finite everywhere. (ii) (Positive homogeneity) For each x E X and 7J ~ 0 we have
$$p_C(\eta x) = \eta\, p_C(x).$$
(iii) (Triangle inequality) For any $x, y \in X$ we have
$$p_C(x+y) \le p_C(x) + p_C(y).$$
(iv) For every $x \in X \setminus C$ we have $p_C(x) \ge 1$. This is how we go about defining $p_C$: Given $x \in X$, put
$$p_C(x) = \inf\{1/\lambda : \lambda > 0 \text{ and } \lambda x \in C\}. \tag{2.3}$$
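As a numerical illustration of (2.3), the following sketch approximates the Minkowski functional by bisection on the scalar λ. The helper names `minkowski` and `unit_disk`, the search bound `hi`, and the choice of C as the closed unit disk of the plane are assumptions made only for this example; for that C the functional should agree with the Euclidean norm.

```python
def minkowski(in_C, x, hi=1e6, iters=100):
    """Approximate p_C(x) = inf{1/lam : lam > 0, lam*x in C} by bisection.

    Assumption: C is convex and contains 0, so the admissible lam form an
    interval starting at 0, and bisection locates its right endpoint.
    """
    def scaled(lam):
        return tuple(lam * t for t in x)

    if in_C(scaled(hi)):
        return 0.0  # the ray through x stays in C as far as we look
    lo = 0.0  # lo * x = 0 belongs to C
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if in_C(scaled(mid)):
            lo = mid
        else:
            hi = mid
    return 1.0 / hi


def unit_disk(v):
    """Membership test for the closed unit disk of the plane (illustrative C)."""
    return v[0] ** 2 + v[1] ** 2 <= 1.0


x, y = (3.0, 4.0), (-1.0, 2.0)
s = (x[0] + y[0], x[1] + y[1])      # the sum x + y
p_x = minkowski(unit_disk, x)       # approximately 5, the Euclidean norm of x
p_y = minkowski(unit_disk, y)
p_s = minkowski(unit_disk, s)
print(p_x, p_y, p_s)
```

With these values one can observe properties (ii)-(iv) directly: doubling x doubles the value, the value at x + y does not exceed the sum of the values at x and y, and x, which lies outside the disk, has value at least 1.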
We claim that $p_C$ satisfies (i)-(iv) above. First note that since by the continuity of the norm $\|\lambda x\| \to 0$ as $\lambda \to 0$, and since 0 is an interior point of C, $\lambda x \in C$ for sufficiently small $\lambda$. Thus the inf in (2.3) is finite and (i) holds. As for (ii), since $\lambda 0 \in C$ for every $\lambda$, it follows that $p_C(0) = 0$, and consequently, we may assume that $\eta \ne 0$ and $x \ne 0$. Now,
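$$p_C(\eta x) = \inf\{1/\lambda : \lambda > 0,\ \lambda\eta x \in C\} = \inf\{\eta/\lambda : \lambda > 0,\ \lambda x \in C\} = \eta \inf\{1/\lambda : \lambda > 0,\ \lambda x \in C\} = \eta\, p_C(x),$$
which is precisely (ii). To prove (iii), given $\varepsilon > 0$, let $\mu, \nu > 0$ be such that
$$\mu x \in C, \quad 1/\mu \le p_C(x) + \varepsilon/2, \tag{2.4}$$
and
$$\nu y \in C, \quad 1/\nu \le p_C(y) + \varepsilon/2, \tag{2.5}$$
and put
$$1/\lambda = 1/\mu + 1/\nu. \tag{2.6}$$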
Now, since $0 < \eta = \lambda/\mu < 1$ and C is convex, it readily follows that $\lambda(x+y) = \eta(\mu x) + (1-\eta)(\nu y) \in C$, and by (2.3), (2.6), (2.4) and (2.5) we have $p_C(x+y) \le 1/\lambda \le p_C(x) + p_C(y) + \varepsilon$. But $\varepsilon > 0$ is arbitrary, and consequently, (iii) holds. Finally, if $x \in X \setminus C$, since C is convex and contains 0 we cannot have $\lambda x \in C$ for some $\lambda \ge 1$, for then $x = (1/\lambda)(\lambda x) + (1 - 1/\lambda)0$ would belong to C. Whence, if $\lambda x \in C$, it follows that $\lambda < 1$ and, as asserted, $p_C(x) \ge 1$, thus proving (iv). By means of the Minkowski functional we may answer the question posed above concerning convex sets of the plane. The idea, after some simple arguments, is to consider the linear functional L defined on the one-dimensional subspace of X consisting of all elements of the form $\lambda x_0$, $x_0 \notin C$, by the formula $L(\lambda x_0) = \lambda p_{C_1}(x_0)$, where $C_1$ is a convex set related to C, and to observe that in this case $L(\lambda x_0) \le p_{C_1}(\lambda x_0)$ for all real $\lambda$. This expression exhibits the domination alluded to above, and the question is whether L can be extended to the plane satisfying the same inequality. We make these remarks precise with the aid of the following result. Theorem 2.1 (Hahn-Banach Theorem). Suppose X is a real linear space and p is a functional on X which satisfies the triangle inequality, and so that $p(\lambda x) = \lambda p(x)$ for all $x \in X$ and $\lambda > 0$. Further, let $X_0$ be a linear subspace of X and $L_0$ a linear functional on $X_0$ such that $L_0 x \le p(x)$,
all $x \in X_0$. (2.7)
Then there is a linear functional L defined on X that extends $L_0$, i.e., $Lx = L_0 x$ for $x \in X_0$, and so that
$$Lx \le p(x), \quad \text{all } x \in X. \tag{2.8}$$
Proof. The idea of the proof is to invoke Zorn's Lemma to construct a maximal extension of $L_0$, and then to show that this extension satisfies (2.8). Let $\mathcal{X}$ be the collection of all pairs of the form (Y, L) where (i) Y is a linear subspace of X and $X_0 \subseteq Y \subseteq X$. (ii) L is a linear functional on Y, $L|X_0 = L_0$, and $Lx \le p(x)$ for all $x \in Y$. Note that $(X_0, L_0) \in \mathcal{X}$. On $\mathcal{X}$ we introduce a partial ordering as follows: We say that (Y, L) precedes (Y', L'), and we write $(Y, L) \prec (Y', L')$, if $Y \subseteq Y'$ and $L'|Y = L$. In order to apply Zorn's Lemma we must first check that any linearly ordered family $\{(Y_s, L_s)\}$, say, of elements in $\mathcal{X}$ has an upper bound. But this is not hard: Indeed, put $Y = \bigcup_s Y_s$, and consider the functional L on Y so that $L|Y_s = L_s$. Since the family is linearly ordered it readily follows that (Y, L) is an upper bound, and we are in a position to invoke the conclusion of Zorn's Lemma, to wit, $\mathcal{X}$ has a maximal element $(X_1, L_1)$, say. There are two possibilities: Either $X_1 = X$, and in this case we are done, or else $X_1$ is a proper linear subspace of X. Next we show that the latter possibility does not occur, for otherwise we would reach a contradiction. Indeed, if the latter possibility occurs, let $x_0 \in X \setminus X_1$ and consider the linear subspace $X_2$ of X spanned by $X_1$ and $\{x_0\}$, i.e., $X_2$ consists of all linear combinations of the form $x_1 + \lambda x_0$, where $x_1 \in X_1$ and $\lambda$ is a real number. We claim that $L_1$ may be extended to a linear functional $L_2$ on $X_2$ which satisfies $L_2 x \le p(x)$ for all $x \in X_2$, thus contradicting the maximality of $(X_1, L_1)$. Denote by $L_2$ a candidate for such an extension of $L_1$ to $X_2$, and observe that if $L_2 x_0 = \eta$, an arbitrary real scalar, we have
$$L_2(x_1 + \lambda x_0) = L_1 x_1 + \lambda\eta \le p(x_1) + \lambda\eta.$$
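If we can produce a scalar $\eta$ so that for all $x_1$ in $X_1$ and scalars $\lambda$ the inequality
$$p(x_1) + \lambda\eta \le p(x_1 + \lambda x_0) \tag{2.9}$$
is true, it then follows that $(X_2, L_2) \in \mathcal{X}$, and that $(X_1, L_1)$ strictly precedes it, thus contradicting its assumed maximality. Observe that if (2.9) holds, we also have
$$\eta \le \big(p(x_1 + \lambda x_0) - p(x_1)\big)/\lambda, \quad \lambda > 0, \tag{2.10}$$
$$\eta \ge \big(p(x_1 + \lambda x_0) - p(x_1)\big)/\lambda, \quad \lambda < 0. \tag{2.11}$$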
By setting $\lambda = -\mu$, $\mu > 0$, in (2.11), (2.10) and (2.11) may be combined into the single expression
$$\frac{p(x_1) - p(x_1 - \mu x_0)}{\mu} \le \eta \le \frac{p(x_1 + \lambda x_0) - p(x_1)}{\lambda}, \tag{2.12}$$
which should now hold for all $\lambda, \mu > 0$. Thus, the existence of $\eta$ is equivalent to the validity of the inequality
$$\frac{p(x_1) - p(x_1 - \mu x_0)}{\mu} \le \frac{p(x_1 + \lambda x_0) - p(x_1)}{\lambda}, \quad \text{all } \lambda, \mu > 0, \tag{2.13}$$
for then $\eta$ may be chosen to be any real number lying between the sup of the left-hand side of (2.13) and the inf of the right-hand side there. To show that (2.13) holds is not hard. First observe that it is equivalent to
$$p(x_1) \le \frac{\mu\, p(x_1 + \lambda x_0) + \lambda\, p(x_1 - \mu x_0)}{\lambda + \mu}. \tag{2.14}$$
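Next note that since
$$x_1 = \frac{\mu}{\lambda + \mu}\,(x_1 + \lambda x_0) + \frac{\lambda}{\lambda + \mu}\,(x_1 - \mu x_0),$$
and since p is subadditive and positively homogeneous, it follows that
$$p(x_1) \le \frac{\mu}{\lambda + \mu}\, p(x_1 + \lambda x_0) + \frac{\lambda}{\lambda + \mu}\, p(x_1 - \mu x_0),$$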
which is precisely (2.14). Thus, reversing the steps, also (2.13) holds and $L_1$ can be extended to a subspace of X properly containing $X_1$ and satisfying (2.8) there, thus contradicting the maximality of $(X_1, L_1)$. Whence $X_1$ is actually X, and the proof is complete. • Next we consider the Hahn-Banach Theorem for complex linear spaces; the proof presented here is due to Bohnenblust and Sobczyk. We begin by exploring the relationship between real and complex functionals.
Lemma 2.2. Suppose X is a complex linear space, and let L be a (complex) linear functional defined on X. Then $L_1 x = \Re(Lx)$ is a real linear functional defined on X and
$$Lx = L_1 x - iL_1(ix). \tag{2.15}$$
Conversely, if $L_1$ is a real linear functional defined on X, then the functional L defined by (2.15) is a complex linear functional on X.
Proof. That $L_1$ is a real functional on X if L is a complex functional on X is a simple verification left to the reader. Now, since for any complex number z we have that $\Im(z) = -\Re(iz)$, it readily follows that
$$Lx = \Re(Lx) + i\,\Im(Lx) = L_1 x + i\,(-\Re(iLx)) = L_1 x - i\,\Re(L(ix)) = L_1 x - iL_1(ix),$$
and (2.15) follows. Finally, if $L_1$ is a real functional defined on X and L is given by (2.15), in order to show that L is a complex linear functional on X it suffices to check that $L(ix) = iLx$ for all $x \in X$. But
$$L(ix) = L_1(ix) - iL_1(i(ix)) = L_1(ix) - iL_1(-x) = i\,(L_1 x - iL_1(ix)) = iLx. \quad •$$
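Formula (2.15) is easy to test numerically in finite dimensions. The sketch below takes X to be the space of pairs of complex numbers and L(x) = a·x for a fixed coefficient vector; the vector `a`, the test point `x`, and all function names are illustrative assumptions, not part of the text.

```python
# An illustrative complex-linear functional on C^2: L(x) = a[0]*x[0] + a[1]*x[1].
a = (2 - 1j, 0.5 + 3j)


def L(x):
    return sum(ai * xi for ai, xi in zip(a, x))


def L1(x):
    """Real part of L; this is the real-linear functional of Lemma 2.2."""
    return L(x).real


def times_i(x):
    """Multiply the vector x by the imaginary unit."""
    return tuple(1j * t for t in x)


def L_rebuilt(x):
    """Recover L from its real part via (2.15): Lx = L1 x - i L1(ix)."""
    return L1(x) - 1j * L1(times_i(x))


x = (1 + 2j, -3 + 0.5j)
print(L(x), L_rebuilt(x))  # the two values agree up to rounding
```

The same function `L_rebuilt` also satisfies `L_rebuilt(times_i(x)) == 1j * L_rebuilt(x)` up to rounding, which is the homogeneity checked at the end of the proof.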
We are now ready to present the complex version of the Hahn-Banach Theorem. Theorem 2.3. Let X be a complex linear space, p a seminorm on X, $X_0$ a linear subspace of X and $L_0$ a complex linear functional defined on $X_0$ such that
$$|L_0 x| \le p(x), \quad \text{all } x \in X_0. \tag{2.16}$$
Then there is a linear functional L defined on X which extends $L_0$, i.e., $Lx = L_0 x$ for $x \in X_0$, and so that
$$|Lx| \le p(x), \quad \text{all } x \in X. \tag{2.17}$$
Proof. Let $L_1 = \Re L_0$; by Lemma 2.2, $L_1$ is a real linear functional defined on $X_0$, and by (2.16) we have $L_1 x \le |L_0 x| \le p(x)$ for $x \in X_0$. We are now in a position to invoke Theorem 2.1 and extend $L_1$ to a real linear functional $L_2$ defined on X with the property that $L_2 x = L_1 x$, $x \in X_0$, and
$$L_2 x \le p(x), \quad \text{all } x \in X. \tag{2.18}$$
Since p is a seminorm, replacing x by $-x$ if necessary in (2.18), we note that we have $|L_2 x| \le p(x)$ as well. Inspired by Lemma 2.2, let
$$Lx = L_2 x - iL_2(ix).$$
L is a complex linear functional defined on X, and since $L_2$ extends $L_1$, the restriction of L to $X_0$ coincides with $L_0$. It only remains to check (2.17):
Since for each $x \in X$ so that $Lx \ne 0$ we have $|Lx| = \lambda Lx = L(\lambda x)$, where $\lambda = \overline{Lx}/|Lx|$ is a complex number of modulus 1, it follows that $L(\lambda x)$ is real, and consequently we also have
$$|Lx| = L(\lambda x) = L_2(\lambda x) \le p(\lambda x) = p(x),$$
and (2.17) holds. •
We focus our discussion next on normed linear spaces; we begin with some definitions. A functional L defined on a normed linear space X is said to be continuous if
$$\|x_n - x\| \to 0 \quad \text{implies} \quad |Lx_n - Lx| \to 0.$$
For linear functionals L, which are the functionals of interest to us, the notion of continuity is equivalent to that of continuity at a single point of X. For, suppose that L is continuous at a point $x_0$ of X and let $x_n \to x \in X$. Then we have $x_n - x + x_0 \to x_0$, and consequently, $|L(x_n - x + x_0) - Lx_0| \to 0$. But, since L is linear, it is obvious that $L(x_n - x + x_0) - Lx_0 = Lx_n - Lx$, and our assertion follows. Also, for linear functionals, the concepts of boundedness and continuity are interchangeable. Proposition 2.4. Suppose L is a linear functional on a normed linear space X. Then L is bounded iff L is continuous. Proof. Suppose first that L is bounded; since L is also linear we have
$$|Lx_n - Lx| = |L(x_n - x)| \le c\,\|x_n - x\|, \tag{2.19}$$
and the right-hand side, and consequently also the left-hand side, of (2.19) goes to 0 with $\|x_n - x\|$. Whence, L is continuous. Conversely, suppose that L is a continuous linear functional on X which is not bounded. Then, by (2.2), for each positive integer n there is $y_n \ne 0$ in X, so that $|Ly_n| > n\|y_n\|$. Put $x_n = (1/n\|y_n\|)\,y_n$, and observe that the sequence $\{x_n\} \subseteq X$ satisfies
$$\|x_n\| \to 0 \quad \text{and yet} \quad |Lx_n| > 1 \text{ for all } n.$$
But this is not possible if L is continuous. •
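The mechanism in the second half of this proof can be watched in a concrete, elementary case. The sketch below is an illustration chosen for this purpose and is not an example from the text: it assumes the space of polynomials on [0,1] with the uniform norm and the linear functional Lp = p'(1). The monomials $t^n$ all have norm 1, while L takes the value n on them, so no constant c as in (2.2) can exist. Polynomials are represented by coefficient lists, and the uniform norm is approximated on a grid that contains both endpoints.

```python
def sup_norm(coeffs, samples=1001):
    """Approximate max |p(t)| on [0,1] on a uniform grid (endpoints included)."""
    def p(t):
        return sum(c * t ** k for k, c in enumerate(coeffs))
    return max(abs(p(j / (samples - 1))) for j in range(samples))


def deriv_at_one(coeffs):
    """The linear functional L p = p'(1), computed from the coefficients."""
    return sum(k * c for k, c in enumerate(coeffs))


def monomial(n):
    """Coefficient list of t**n."""
    return [0.0] * n + [1.0]


# |L y_n| / ||y_n|| for y_n = t**n; a bounded functional would keep these bounded.
ratios = [abs(deriv_at_one(monomial(n))) / sup_norm(monomial(n))
          for n in (1, 5, 25, 125)]
print(ratios)
```

Normalizing as in the proof, the polynomials $x_n = t^{n+1}/n$ tend to 0 in the uniform norm while $|Lx_n| = (n+1)/n > 1$ for every n.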
Although not intuitively apparent, there are linear functionals that are not bounded. To see this consider an infinite dimensional linear space X and, referring to Section 3 in Chapter II, let H be a Hamel basis for X over the ambient scalar field. It is a straightforward application of Zorn's Lemma to prove that any linearly independent subset of X is contained in a Hamel basis for X. In particular, any linear space has a Hamel basis. Now, for each x in X we can find unique elements $h_1, \ldots, h_n$ in H and scalars $\lambda_1, \ldots, \lambda_n$, say, such that $x = \sum_{i=1}^n \lambda_i h_i$. Define $\|x\|_\infty$ to be the maximum of the numbers $|\lambda_i|$, $i = 1, \ldots, n$. Clearly $\|x\|_\infty$ is a norm in X, and consequently, any linear space over the real or complex scalar field can be given a norm. There is an interesting case for which a Hamel basis can be exhibited. Let I = [0,1] and let $P(I) \subset C(I)$ denote the class of all polynomial functions on I. Then $H = \{1, x, x^2, \ldots\}$ is a Hamel basis for P(I). Let now $H_1$ be a Hamel basis for C(I) that contains H and choose any element $h_1 \in H_1 \setminus H$. Put $Lh_1 = 1$ and $Lh = 0$ for all h in $H_1$, $h \ne h_1$, and extend L to all of C(I) by requiring that it be linear. It is clear that L cannot be continuous with respect to the uniform norm on C(I). Indeed, if this were the case, then by 4.15 below the set $\{f \in C(I) : Lf = 0\}$ would be a closed subspace of C(I). But this set contains P(I), which, by the Weierstrass theorem, cf. Corollary 2.3 in Chapter XVII, is dense in C(I). Hence if L were continuous, it would have to be identically 0, contrary to the fact that $Lh_1 = 1$. We are now ready to introduce the conjugate, or dual, space $X^*$ of a normed linear space X, i.e., the space consisting of all continuous linear functionals defined on X. More precisely, given a normed linear space X, let
$$X^* = \{L : L \text{ is a continuous linear functional on } X\}.$$
It is readily seen that $X^*$ is itself a linear space over the scalar field of X; $L_1 + \lambda L_2$ is defined as the continuous linear functional on X given by
$$(L_1 + \lambda L_2)(x) = L_1 x + \lambda L_2 x, \quad \text{all } x \in X.$$
We also have Proposition 2.5. Suppose X is a normed space. Then $X^*$, normed by
$$\|L\| = \sup_{x \ne 0} \frac{|Lx|}{\|x\|}, \tag{2.20}$$
is a Banach space. Proof. It is clear that the expression in (2.20) is a seminorm on $X^*$. Now, if $\|L\| = 0$, it follows that $|Lx| = 0$ for each $x \in X$, L is the 0 functional and consequently, $\|L\|$ is a norm on $X^*$.
To show that, normed by (2.20), $X^*$ becomes a Banach space, by Theorem 1.1 it suffices to prove that if $\sum \|L_n\| < \infty$, then $\sum L_n$ converges in $X^*$. First observe that for each $x \in X$ we have
$$\sum_n |L_n x| \le \Big(\sum_n \|L_n\|\Big)\|x\| < \infty. \tag{2.21}$$
Thus the numerical series with terms $(L_n x)$ converges absolutely for each x in X, and since the scalar field is complete, also $\sum L_n x$ converges, even unconditionally, to a sum Lx, say. First we show that L is a bounded linear functional on X. Indeed, given x, y in X and a scalar $\lambda$, we have
$$L(x + \lambda y) = \sum_n L_n(x + \lambda y) = \lim_{m\to\infty} \sum_{n=1}^m L_n(x + \lambda y) = \lim_{m\to\infty} \sum_{n=1}^m (L_n x + \lambda L_n y) = Lx + \lambda Ly,$$
and the linearity of L follows. Moreover, by (2.21) it also readily follows that $|Lx|/\|x\| \le \sum \|L_n\|$, $x \ne 0$, and consequently, L is bounded. Finally, since for $x \in X$ we have
$$\Big|Lx - \sum_{n=1}^m L_n x\Big| = \Big|\sum_{n=m+1}^\infty L_n x\Big| \le \Big(\sum_{n=m+1}^\infty \|L_n\|\Big)\|x\|,$$
we get that
$$\Big\|L - \sum_{n=1}^m L_n\Big\| \le \sum_{n=m+1}^\infty \|L_n\| \to 0 \quad \text{as } m \to \infty,$$
and $L = \lim_{m\to\infty} \sum_{n=1}^m L_n$ (in $X^*$). Since all the assumptions of Theorem 1.1 are now satisfied, we get that $X^*$ is complete. • It is interesting to point out that the conclusion of Proposition 2.5 holds whether X itself is complete or not, as the proof only makes use of the completeness of the field of scalars. After this brief digression we turn to prove a version of the Hahn-Banach Theorem that deals with continuous linear functionals. Theorem 2.6 (Hahn-Banach Theorem). Suppose X is a normed linear space, and let $L_0$ be a bounded linear functional defined on a subspace $X_0$ of X. Then there exists a bounded linear functional L defined on X such that
$$L|X_0 = L_0 \quad \text{and} \quad \|L\| = \|L_0\|. \tag{2.22}$$
Proof. We consider first the case when X is a real linear space. Since $L_0$ is a bounded linear functional on $X_0$, it follows that
$$L_0 x \le |L_0 x| \le \|L_0\|\,\|x\|, \quad \text{all } x \in X_0. \tag{2.23}$$
Note that the expression on the right-hand side of (2.23) may be thought of as a seminorm on X. More precisely, if for x in X we put $p(x) = \|L_0\|\,\|x\|$, then p is a seminorm on X, and (2.23) actually states that the assumptions of Theorem 2.1 are satisfied. By the conclusion of that theorem there is a linear functional L defined on X such that
$$Lx = L_0 x, \quad x \in X_0, \quad \text{and} \quad Lx \le p(x), \quad \text{all } x \in X. \tag{2.24}$$
The estimate in (2.24) may be rewritten as
$$Lx \le \|L_0\|\,\|x\|, \tag{2.25}$$
and since L is linear we also have
$$-Lx = L(-x) \le \|L_0\|\,\|{-x}\| = \|L_0\|\,\|x\|. \tag{2.26}$$
Thus combining (2.25) and (2.26) it follows that $|Lx| \le \|L_0\|\,\|x\|$ for all $x \in X$, and consequently, $\|L\| \le \|L_0\|$. Furthermore, since the restriction of L to $X_0$ is $L_0$, we also have
$$\|L\| \ge \sup_{x \ne 0,\ x \in X_0} \frac{|L_0 x|}{\|x\|} = \|L_0\|,$$
and $\|L\| = \|L_0\|$. This completes the proof in the real case. As for the complex case, the proof follows along similar lines once we invoke Theorem 2.3. • Many important topics in the theory of linear spaces rely on the notion of convexity; as a first application of the Hahn-Banach Theorem we formalize the discussion preceding Theorem 2.1; first a definition. Given subsets $X_0$ and $X_1$ of a linear space X, a linear functional L defined on X is said to separate $X_0$ and $X_1$ if
$$\sup_{x \in X_1} Lx \le \inf_{x \in X_0} Lx.$$
The lack of symmetry in this definition is only apparent as the roles of $X_0$ and $X_1$ are interchanged when L is replaced by $-L$. It follows at once from this definition that L separates $X_0$ and $X_1$ iff L separates $X_0 - X_1 = \{z : z = x_0 - x_1,\ x_0 \in X_0,\ x_1 \in X_1\}$ and $\{0\}$ iff L separates $X_0 - x = \{z : z = x_0 - x,\ x_0 \in X_0\}$ and $X_1 - x$ for every $x \in X$.
We then have Theorem 2.7. Let $C_0$, $C_1$ be two disjoint, nonempty convex subsets of a real normed linear space X, and suppose that at least one of the sets, $C_0$ say, has a nonempty interior. Then there exists a nontrivial linear functional L on X that separates $C_0$ and $C_1$. Proof. Let $x_0$ be an interior point to $C_0$; by considering if necessary $C_0 - x_0$ and $C_1 - x_0$, which are also convex, we may assume that 0 is an interior point to $C_0$. Let $x_1$ be a point of $C_1$; then $-x_1$ is an interior point to the convex set $C_0 - C_1 = \{z : z = x - y,\ x \in C_0,\ y \in C_1\}$ and 0 is an interior point to the convex set $C = C_0 - C_1 + x_1 = \{z : z = x + x_1,\ x \in C_0 - C_1\}$. Moreover, since $C_0$ and $C_1$ are disjoint we also have
$$0 \notin C_0 - C_1, \quad x_1 \notin C = C_0 - C_1 + x_1. \tag{2.27}$$
Let $p_C$ be the Minkowski functional corresponding to C; from (2.27) it follows that $p_C(x_1) \ge 1$. Let $X_1$ be the one-dimensional subspace of X spanned by $x_1$; $X_1$ consists of all elements of the form $\lambda x_1$, $\lambda$ real, and consider the linear functional $L_1$ defined on $X_1$ by
$$L_1(\lambda x_1) = \lambda\, p_C(x_1).$$
Since $p_C(\lambda x_1) = \lambda\, p_C(x_1)$ if $\lambda \ge 0$, while
$$L_1(\lambda x_1) = \lambda\, p_C(x_1) < 0 \le p_C(\lambda x_1) \quad \text{if } \lambda < 0,$$
we also have $L_1(\lambda x_1) \le p_C(\lambda x_1)$, for all real $\lambda$. We are now in a position to invoke the Hahn-Banach Theorem, and extend $L_1$ to a linear functional L defined on the whole space X satisfying the condition
$$Lx \le p_C(x), \quad \text{all } x \in X. \tag{2.28}$$
Since $p_C(x) \le 1$ on C, while $Lx_1 = L_1 x_1 \ge 1$, by (2.28) it follows that L separates C and $\{x_1\}$. But as observed above this is equivalent to the statement that L separates $C_0 - C_1$ and $\{0\}$, which is in turn equivalent to the fact that L separates $C_0$ and $C_1$. • We discuss next further applications of the Hahn-Banach Theorem to different settings.
3. APPLICATIONS
We begin by discussing three interesting applications of the Hahn-Banach Theorem: the determination of when a linear subspace is dense in a linear space, the general form of the converse to Hölder's inequality, and the construction of a natural embedding of a normed space into a Banach space. First we prove Proposition 3.1. Let Y be a linear subspace of a normed linear space X, and suppose $x \in X$ is such that $d(x, Y) = \inf_{y \in Y} \|x - y\| = \eta > 0$. Then there is a bounded linear functional L on X with norm $\|L\| = 1/\eta$ which separates x from Y. More precisely, we have
$$Lx = 1, \quad \text{and} \quad Ly = 0 \ \text{ for all } y \in Y.$$
Proof. Let $Y_1$ be the subspace of X spanned by Y and $\{x\}$; each element $y_1$ of $Y_1$ can be written uniquely as $y_1 = y + \lambda x$, with $y \in Y$ and a scalar $\lambda$. Now, if $y_1 = y + \lambda x$, note that
$$|\lambda|\,\eta \le \|y_1\|. \tag{3.1}$$
Indeed, if $\lambda = 0$ there is nothing to prove. Otherwise, if $\lambda \ne 0$, since $(-1/\lambda)y \in Y$, it readily follows that $\eta \le \|x - (-1/\lambda)y\| = \|y_1\|/|\lambda|$, and (3.1) holds. We define now the linear functional $L_1$ on $Y_1$ as follows: If $y_1 = y + \lambda x$, then put $L_1 y_1 = \lambda$. By (3.1) it follows that $|L_1 y_1| \le \|y_1\|/\eta$, and consequently, we have $\|L_1\| \le 1/\eta$. To show that equality actually holds here let $\{y_n\} \subseteq Y$ be such that $\lim_{n\to\infty} \|x - y_n\| = \eta$. It is clear that $1 = L_1(x - y_n) \le \|L_1\|\,\|x - y_n\|$, where the right-hand side tends to $\|L_1\|\,\eta$ as $n \to \infty$. Thus we also have $1/\eta \le \|L_1\|$, and consequently, $\|L_1\| = 1/\eta$. We are now in a position to invoke Theorem 2.6. By that result there exists a linear functional L defined on X with $\|L\| = 1/\eta$ that extends $L_1$. Since it is also clear that $Lx = L_1 x = 1$ and $Ly = L_1 y = 0$ for $y \in Y$, the functional L does the job. • Corollary 3.2. Let X be a normed linear space. For any $0 \ne x \in X$ there exists a linear functional L defined on X with $\|L\| = 1$ and such that $Lx = \|x\|$. In particular, if x and y are distinct points of X, there exists $L \in X^*$ such that $Lx \ne Ly$.
Proof. Suppose $x \ne 0$. Then by Proposition 3.1, with $Y = \{0\}$ there, there exists a functional $L' \in X^*$ such that $\|L'\| = 1/\|x\|$ and $L'x = 1$. The first part of the conclusion follows now upon setting $L = \|x\| L'$. As for the second part, it follows from the first with x replaced by $x - y \ne 0$. • Next we show a "density" result; it roughly states that if Y is a dense subspace of X, then the only bounded linear functional that vanishes on Y is the trivial, or zero, functional. Proposition 3.3. Suppose X is a normed linear space, and let Y be a subspace of X which is not dense in X. Then there exists a nontrivial linear functional L defined on X which vanishes on Y. Proof. Since Y is not dense in X, there is $x \in X$ which satisfies $\inf_{y \in Y} \|x - y\| > 0$. To obtain L apply now Proposition 3.1. •
The next result we discuss is an extension to the converse to Hölder's inequality in the spirit of Proposition 2.3 in Chapter XII. Proposition 3.4. Suppose X is a normed linear space, and let $x \in X$. Then we have
$$\|x\| = \sup_{L \ne 0} \frac{|Lx|}{\|L\|} = \sup_{\|L\| = 1} |Lx|. \tag{3.2}$$
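Before turning to the proof, a finite dimensional sanity check of (3.2) may be helpful. The sketch below assumes $X = R^4$ with the norm $\|x\|_1 = \sum |x_i|$, and uses the standard fact (not proved in this section) that the functionals on this space are $L(x) = a \cdot x$ with $\|L\| = \max_i |a_i|$. Since $|a \cdot x|$ is convex in a, its supremum over the cube $\max_i |a_i| \le 1$ is attained at a vertex, so it suffices to scan the sign vectors. The vector `x` and the function names are illustrative assumptions.

```python
from itertools import product


def l1_norm(x):
    """The norm ||x||_1 = sum of |x_i|."""
    return sum(abs(t) for t in x)


def sup_over_unit_functionals(x):
    """Max of |a . x| over the vertices a in {-1, +1}^n of the unit cube.

    Assumption: functionals are L(x) = a . x with norm max|a_i|, so the
    vertices of the cube are functionals of norm exactly 1.
    """
    return max(abs(sum(ai * xi for ai, xi in zip(a, x)))
               for a in product((-1.0, 1.0), repeat=len(x)))


x = (3.0, -1.5, 0.0, 2.0)
print(l1_norm(x), sup_over_unit_functionals(x))  # both equal 6.5
```

The maximizing vertex has $a_i$ equal to the sign of $x_i$, which is the finite dimensional counterpart of the norming functional supplied by Corollary 3.2.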
Proof. Since for each $L \in X^*$ we have $|Lx| \le \|L\|\,\|x\|$, it readily follows that either sup in (3.2) above is less than or equal to $\|x\|$. As for the opposite inequality, note that by Corollary 3.2 there is a bounded linear functional L of norm 1 defined on X so that $Lx = \|x\|$. For this functional we have $\|x\| = |Lx|/\|L\|$, and we have finished. • Next we discuss the embedding of a normed space into a Banach space, but first a definition. The natural map, denoted by $J_X$, of a normed linear space X into its second conjugate space $X^{**}$ (the Banach space of bounded linear functionals on $X^*$) is defined by
$$(J_X x)L = Lx, \quad \text{all } L \in X^*. \tag{3.3}$$
It is not hard to check that for each $x \in X$, $J_X x$ is a bounded linear functional on $X^*$. To show that $J_X x$ is a linear functional on $X^*$, let $L_1, L_2 \in X^*$, $\lambda$ a scalar, and note that by (3.3) we have
$$(J_X x)(L_1 + \lambda L_2) = (L_1 + \lambda L_2)(x) = L_1 x + \lambda L_2 x = (J_X x)L_1 + \lambda (J_X x)L_2.$$
To show that $J_X x$ is actually bounded we make use of (3.2): If $\|J_X x\|$ denotes the norm of $J_X x$ as an element of $X^{**}$, then by (3.3) and (3.2) it follows that
$$\|J_X x\| = \sup_{L \ne 0} \frac{|(J_X x)L|}{\|L\|} = \sup_{L \ne 0} \frac{|Lx|}{\|L\|} = \|x\|.$$
In fact, we have shown that $J_X$ is also norm preserving, and consequently, one-to-one. In other words, the natural map establishes a linear isometric embedding from X into $X^{**}$. These properties of the natural map lead to a simple proof of the following result. Theorem 3.5. Every normed linear space is a dense subspace of a Banach space.
Proof. Given a normed linear space X, let $X_1 = J_X(X) \subseteq X^{**}$ denote the image of X into $X^{**}$ under the natural map. Since, as established above, X and $X_1$ are isometrically isomorphic, we may think of X as $X_1$, and prove the conclusion for $X_1$ instead. Let $X_2$ denote the closure of $X_1$ in $X^{**}$; $X_2$ is a closed subspace of a complete space, and consequently it is also complete. Moreover, since by construction $X_1$ is dense in $X_2$, we are done. •
If the range of the natural map $J_X$ is all of $X^{**}$, then X is said to be reflexive. For instance, from the definition of the natural map and the representation of the dual space to the Lebesgue $L^p$ spaces given in Theorem 2.5 in Chapter XII, it follows that $L^p(\mu)$ is reflexive when $1 < p < \infty$. The reader should be warned that, in general, the equivalence of a normed linear space with its second conjugate does not guarantee the reflexivity of the space. On the other hand, $L^1(\mu)$ is not in general reflexive, and to see this we make use of the following observation. A normed linear space X is said to be separable if there exists a countable dense subset of X. For instance, $L^p(R^n)$ is separable if $1 \le p < \infty$, and is not separable if $p = \infty$. We then have
Proposition 3.6. If the conjugate X* of a normed linear space X is separable, then X is also separable.
Proof. Let $\{L_n\}$ be an at most countable dense subset of $X^*$, and $\{x_n\}$ a sequence of elements in X such that $|L_n x_n| \ge \|L_n\|/2$, $\|x_n\| = 1$, for all n. We claim that the linear subspace Y of X spanned by the $x_n$'s is dense in X. Suppose this is not the case. Then, by Proposition 3.3, there is a nontrivial linear functional $L \in X^*$ such that $Lx = 0$ for every $x \in Y$. Since by assumption $\{L_n\}$ is dense in $X^*$, there is a sequence $\{L_{n_m}\}$ that converges to L. Now, since for each m we have
$$\|L - L_{n_m}\| \ge |(L - L_{n_m})x_{n_m}| = |L_{n_m} x_{n_m}| \ge \|L_{n_m}\|/2,$$
it readily follows that $\lim_{m\to\infty} L_{n_m} = 0$. But this is impossible since $\lim_{m\to\infty} L_{n_m} = L \ne 0$, and consequently, Y is dense in X. Finally, since the set consisting of all finite linear combinations of the $x_n$'s with rational coefficients is countable and dense in X, X is separable. • Since $\ell^\infty = (\ell^1)^*$ is not separable but $\ell^1$ is, the converse to the above proposition is not true. Nevertheless, we have Corollary 3.7. The conjugate space of a reflexive separable space is also separable.
Proof. Suppose X is a normed linear space which is reflexive and separable. Then $X^{**} = J_X(X)$ is also separable and, by Proposition 3.6, $X^*$ is separable. • Since as pointed out above $(\ell^1)^* = \ell^\infty$ and $\ell^1$ is separable but $\ell^\infty$ is not, $\ell^1$ is not reflexive. It is therefore of interest to describe the dual space to $L^\infty(\mu)$, a task we left open in Chapter XII. We begin by discussing a related result of independent interest, namely the dual space to C(I). Let I = [0,1], and L be a continuous linear functional defined on C(I). Since C(I) is a (closed) linear subspace of $L^\infty(I)$, by Theorem 2.6 there is a bounded linear functional $L_1$ defined on $L^\infty(I)$ that satisfies
$$L_1 f = Lf \ \text{ if } f \in C(I), \quad \text{and} \quad \|L_1\| = \|L\|. \tag{3.4}$$
Now, for each $x \in I$ we define a bounded function
$$\varphi_x(t) = \begin{cases} 1 & \text{if } 0 \le t \le x, \\ 0 & \text{if } x < t \le 1, \end{cases}$$
and put $g(x) = L_1 \varphi_x$.
Let $0 = x_0 < x_1 < \cdots < x_n = 1$ be a partition of I, and put $c_i = \mathrm{sgn}(g(x_i) - g(x_{i-1}))$. We then have
$$A = \sum_{i=1}^n |g(x_i) - g(x_{i-1})| = \sum_{i=1}^n c_i\,(g(x_i) - g(x_{i-1})) = \sum_{i=1}^n c_i\,(L_1\varphi_{x_i} - L_1\varphi_{x_{i-1}}) = L_1\varphi,$$
where $\varphi = \sum_{i=1}^n c_i\,(\varphi_{x_i} - \varphi_{x_{i-1}})$ is a bounded function with $\|\varphi\|_\infty \le 1$. Now, by (3.4) we get $A \le \|L_1\|\,\|\varphi\|_\infty \le \|L\|$, and consequently, g is BV on I, and $V(g; 0, 1) \le \|L\|$. Given $f \in C(I)$, define the bounded functions
$$f_n = \sum_{k=1}^n f(k/n)\,(\varphi_{k/n} - \varphi_{(k-1)/n}), \quad n = 1, 2, \ldots$$
lim Ldn = Ld = LI.
n--+oo
11100
--t
°
(3.5)
On the other hand, since L1/n may be rewritten as n
L1/n
=L
I(k/n)(g(k/n) - g((k - 1)/n» ,
k=l by Theorem 2.6 in Chapter III we obtain lim Ldn
n--+oo
=
11 0
Idg.
(3.6)
Thus combining (3.5) and (3.6) we conclude that
$$Lf = \int_0^1 f\,dg. \tag{3.7}$$
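The sums $\sum_k f(k/n)\,(g(k/n) - g((k-1)/n))$ that lead to (3.7) are easy to compute. In the text g arises from the functional L through $g(x) = L_1\varphi_x$; the sketch below simply assumes a concrete pair, $f(t) = t$ and $g(t) = t^2$, chosen only because the Riemann-Stieltjes integral can then be evaluated by hand: $\int_0^1 t\,d(t^2) = \int_0^1 2t^2\,dt = 2/3$. The function name `stieltjes_sum` is hypothetical.

```python
def stieltjes_sum(f, g, n):
    """The sum over k = 1..n of f(k/n) * (g(k/n) - g((k-1)/n))."""
    return sum(f(k / n) * (g(k / n) - g((k - 1) / n)) for k in range(1, n + 1))


def f(t):
    return t          # continuous integrand (illustrative choice)


def g(t):
    return t * t      # integrator of bounded variation on [0,1] (illustrative choice)


for n in (10, 100, 10000):
    print(n, stieltjes_sum(f, g, n))  # the values approach 2/3 as n grows
```

For this pair the sum can be computed in closed form as $(4n^2 + 3n - 1)/(6n^2)$, so the error decays like $1/(2n)$.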
Furthermore, by (3.2) in Chapter III, we also have
$$|Lf| \le \max_{x \in I} |f(x)|\; V(g; 0, 1),$$
and consequently
$$\|L\| = V(g; 0, 1). \tag{3.8}$$
Two observations: First, since g(0) = 0, it follows that $V(g; 0, 1) = \|g\|$, the norm on BV introduced in (1.5). Also, by (3.8), for BV functions g with g(0) = 0, the integral in (3.7) determines a bounded linear functional L on C(I) with $\|L\| \le \|g\|$. The only difficulty here is that the expression in (3.7) does not uniquely determine the functional L, cf. 4.24-4.26 in Chapter III, and, as in the case of the Lebesgue $L^p$ spaces, some kind of normalization is needed. The details are left to the reader, cf. 4.36 below. We close this section with the description of the conjugate space to $L^\infty(\mu)$. It is not intuitively clear how the bounded linear functionals on $L^\infty(\mu)$ look. On the one hand, it is obvious that functions $g \in L^1(\mu)$ induce such functionals by means of
$$Lf = \int_X f g\,d\mu, \tag{3.9}$$
but it is not hard to see that not all functionals are of this form. Indeed, let J = [-1,1], and
$$Y = \Big\{ f \in L^\infty(J) : \lim_{r \to 0^+} \frac{1}{r} \int_{(0,r)} f\,dy \ \text{exists} \Big\}.$$
Then Y is a nonempty subspace of $L^\infty(J)$, and
$$Lf = \lim_{r \to 0^+} \frac{1}{r} \int_{(0,r)} f\,dy$$
is a bounded linear functional on Y with $\|L\| = 1$. Now, by the Hahn-Banach Theorem, L can be extended to a bounded linear functional on $L^\infty(J)$, also of norm 1. For simplicity denote this extension also by L and observe that L cannot be of the form (3.9) for any $g \in L^1(J)$. Indeed, if (3.9) is true for an integrable function g, let $f_\eta = \chi_{R \setminus (0,\eta)}\,\mathrm{sgn}\,g$, where $0 < \eta \le 1$; it is clear that $f_\eta \in Y$ and $Lf_\eta = 0$. It then readily follows that
$$Lf_\eta = \int_{[\eta,1]} |g|\,dy = 0, \quad \text{all } \eta > 0.$$
But this implies that g = 0 a.e. on (0,1], and a similar argument gives that g = 0 a.e. on [-1,0]. In other words, g = 0 a.e., and L is then the zero functional, contrary to the fact that $\|L\| = 1$.
The analytic representation of the conjugate space to $L^\infty(\mu)$ requires that we extend the notion of integral to include integration with respect to a signed additive set function. Because it suffices for this application, we restrict our attention to the case when both the function to be integrated and the set function with respect to which the integration is carried out are bounded. First some definitions. Let $\mathcal{A}$ be an algebra of subsets of X and $\psi$ a bounded nonnegative set function defined on $\mathcal{A}$. Given a bounded function $g : X \to R$ and a partition P of X consisting of pairwise disjoint measurable sets $E_1, \ldots, E_n$, put $m_k = \inf_{E_k} g$, $M_k = \sup_{E_k} g$, and consider the lower and upper sums of g corresponding to P with respect to $\psi$, defined by the expressions
$$s(g, \psi, P) = \sum_{k=1}^n m_k\,\psi(E_k) \quad \text{and} \quad S(g, \psi, P) = \sum_{k=1}^n M_k\,\psi(E_k)$$
respectively. The usual properties of lower and upper sums hold in this case as well. They are: (i) If a partition P' refines a partition P, then we have
$$s(g, \psi, P) \le s(g, \psi, P') \quad \text{and} \quad S(g, \psi, P') \le S(g, \psi, P).$$
(ii) No lower sum exceeds an upper sum, even when they are formed with two different partitions. In case the quantities
$$\sup_P s(g, \psi, P) \quad \text{and} \quad \inf_P S(g, \psi, P)$$
coincide, we define the integral $\int_X g\,d\psi$ of g over X with respect to $\psi$ as that common value. The class of functions for which the integral exists is rather wide and, as we now show, it includes the bounded measurable functions. By the way, since $\mathcal{A}$ is not necessarily a $\sigma$-algebra, we say that a function is measurable provided all four conditions in Proposition 1.1 in Chapter VI are satisfied. Proposition 3.8. If $\psi(X) < \infty$ and g is bounded and measurable, then $\int_X g\,d\psi$ exists.
Proof. By (i) and (ii) above it suffices to show that there are partitions P of X for which the lower and upper sums are arbitrarily close to each other. Let m < M be real numbers such that m < g(x) < M for all $x \in X$, suppose $\eta$ is an arbitrary constant, $0 < \eta < M - m$, and divide the interval (m, M) by means of the points $m = t_0 < t_1 < \cdots < t_n = M$ into a finite number of subintervals, each of length less than or equal to $\eta$. Form now the sets
$$E_k = \{x \in X : t_{k-1} \le g(x) < t_k\}, \quad k = 1, \ldots, n,$$
and observe that they are pairwise disjoint and measurable. Let P denote the partition of X into the $E_k$'s; if any $E_k$ is empty, simply drop it. Further note that since this family is finite we have $\psi(X) = \sum \psi(E_k)$. Moreover, since for each k we have
$$M_k - m_k \le t_k - t_{k-1} \le \eta,$$
it readily follows that
$$S(g, \psi, P) - s(g, \psi, P) = \sum (M_k - m_k)\,\psi(E_k) \le \eta \sum \psi(E_k) = \eta\,\psi(X).$$
Thus, by means of an appropriate choice of $\eta$, the difference between the upper and lower sums above can be made arbitrarily small, which is what we set out to prove. •
It is interesting to point out that, in general, the class of functions for which the integral exists includes functions that are not measurable. Indeed, if X = N and $\mathcal{A}$ is the algebra of those subsets E of N which are either finite or so that $N \setminus E$ is finite, then
$$\psi(E) = \begin{cases} 0 & \text{if } E \text{ is finite}, \\ \infty & \text{if } N \setminus E \text{ is finite}, \end{cases}$$
is an additive set function defined on $\mathcal{A}$, the function g = characteristic function of the odd integers is not measurable, and yet $\int_N g\,d\psi = \infty$ exists. It is possible to define the integral of g with respect to a signed additive set function $\psi$ over $\mathcal{A}$ as follows: If $\psi_+$ and $\psi_-$ denote the positive and negative variations of $\psi$ respectively, cf. 4.8 in Chapter IV, let
$$\int_X g\,d\psi = \int_X g\,d\psi_+ - \int_X g\,d\psi_-, \tag{3.10}$$
provided the expression on the right-hand side of (3.10) is well-defined. Now, from (3.10) it follows that the basic properties of the Riemann-Stieltjes and Lebesgue integrals hold in this context, with slight or no change. We need two specific properties of the integral, to wit, linearity and boundedness; we state them next; their proof is left to the reader. If the integral of $g_1$ and that of $g_2$ with respect to $\psi$ exist and $\lambda$ is a scalar, then the integral of $g_1 + \lambda g_2$ with respect to $\psi$ exists, and we have
$$\int_X (g_1 + \lambda g_2)\,d\psi = \int_X g_1\,d\psi + \lambda \int_X g_2\,d\psi.$$
Also, if $|g(x)| \le M$ for all $x \in X$, then
$$\left|\int_X g\,d\psi\right| \le M\,|\psi|(X). \tag{3.11}$$

We are now ready to give a description of the dual to $L^\infty(\mu)$. Suppose $(X,\mathcal{M},\mu)$ is a measure space, let $L$ be a bounded linear functional defined on $L^\infty(\mu)$, and for $E \in \mathcal{M}$ put

$$\psi(E) = L\chi_E. \tag{3.12}$$

From the linearity of $L$ it readily follows that $\psi$ is an additive set function defined on $\mathcal{M}$. Moreover, since $L$ is bounded we also have

$$|\psi(E)| \le \|L\|\,\|\chi_E\|_\infty. \tag{3.13}$$

Now, if $\mu(E) = 0$ we have $\|\chi_E\|_\infty = 0$, and consequently, by (3.13) we obtain $\psi(E) = 0$. Moreover, since we also have that $\mu(A) = 0$ for any $A \subseteq E$, $A \in \mathcal{M}$, it follows that $\psi(A) = 0$ for those sets, and, by 4.8 in Chapter IV, we get that $\psi_+(E) = \psi_-(E) = 0$, and $|\psi|(E) = 0$.

Suppose now that $f \in L^\infty(\mu)$ and consider a representative in the equivalence class of $f$, which we call $f$ again, which is bounded everywhere by $\|f\|_\infty$. Let $M > \|f\|_\infty$, and divide the interval $[-M, M]$ by means of the points $-M = t_0 < t_1 < \cdots < t_n = M$ into a finite number of subintervals, each of length less than or equal to an arbitrary real number $\eta$. Form now the partition of $X$ consisting of the measurable sets

$$E_k = \{t_{k-1} \le f < t_k\}, \quad k = 1, \ldots, n,$$

and let $h$ be the measurable function $h = \sum_{k=1}^{n} t_{k-1}\chi_{E_k}$. Now, if $x \in E_k$, then $|f(x) - h(x)| = |f(x) - t_{k-1}| \le \eta$, and consequently, we have

$$\|f - h\|_\infty = \sup_{x \in X} |f(x) - h(x)| \le \eta. \tag{3.14}$$

Further, since $L$ is linear and bounded, by (3.14) it follows that

$$|Lf - Lh| = |L(f - h)| \le \|L\|\,\|f - h\|_\infty \le \|L\|\,\eta. \tag{3.15}$$
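The construction of the step function $h$ can be tried out numerically. The following is only a sketch under stated assumptions: the function `math.sin`, the bound `M = 1.5`, the mesh `eta = 0.125`, and the helper names `simple_approximation` and `sup_error` are illustrative choices, not part of the text, and the supremum is taken over a finite sample of points rather than over all of $X$.

```python
import math

def simple_approximation(f, M, eta):
    """Build the step function h of the text for a function f bounded by M.

    Returns (t, h): the partition points t_0 < ... < t_n of [-M, M], with
    mesh at most eta, and h(x) = t_{k-1} whenever t_{k-1} <= f(x) < t_k.
    """
    n = math.ceil(2 * M / eta)          # number of subintervals
    t = [-M + 2 * M * k / n for k in range(n + 1)]

    def h(x):
        # index of the subinterval [t_{k-1}, t_k) that contains f(x)
        k = int((f(x) + M) * n / (2 * M))
        k = min(max(k, 0), n - 1)       # guard against rounding at the ends
        return t[k]

    return t, h

def sup_error(f, h, xs):
    """Largest value of |f(x) - h(x)| over the sample points xs."""
    return max(abs(f(x) - h(x)) for x in xs)

# Illustrative data (assumptions): f = sin, M = 1.5 > sup|f|, eta = 1/8.
xs = [i / 1000 for i in range(-3000, 3001)]
t, h = simple_approximation(math.sin, M=1.5, eta=0.125)
print(len(t) - 1, sup_error(math.sin, h, xs))
```

On the sample the error stays below `eta`, in agreement with (3.14); the step function never exceeds $f$ because it takes the left endpoint $t_{k-1}$ on each $E_k$.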
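On the other hand, both $f$ and $h$ have an integral with respect to $\psi$ on $X$, and since $\int_X h\,d\psi = \sum_k t_{k-1}\psi(E_k) = Lh$, we get

$$\int_X f\,d\psi - Lh = \int_X (f - h)\,d\psi.$$

Whence it follows that

$$\left|\int_X f\,d\psi - Lh\right| = \left|\int_X (f - h)\,d\psi\right| \le \|f - h\|_\infty\,|\psi|(X) \le \eta\,|\psi|(X),$$

which combined with (3.15) allows us to estimate $\left|Lf - \int_X f\,d\psi\right|$ by

$$|Lf - Lh| + \left|Lh - \int_X f\,d\psi\right| \le \big(\|L\| + |\psi|(X)\big)\,\eta.$$

Since $\eta$ is arbitrary this can only happen if

$$Lf = \int_X f\,d\psi, \quad \text{all bounded } f. \tag{3.16}$$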
This is the first step in obtaining the representation of $L$. There is, of course, the question of the uniqueness of the representation: We must be sure that for each $f \in L^\infty(\mu)$ the right-hand side in (3.16) above is independent of the bounded representative we choose in the equivalence class of $f$. This amounts to proving that if $f$ is equivalent to 0, then we have $\int_X f\,d\psi = 0$. Now, in this case, $X$ can be partitioned into two disjoint measurable sets $E_1$ and $E_2$, say, so that $f = 0$ on $E_1$ and $\mu(E_2) = 0$. By the definition of the integral it is clear that $\int_{E_1} f\,d\psi = 0$, and since as observed above we also have $|\psi|(E_2) = 0$, if $c$ is a bound for $f$, by (3.11) we get that

$$\left|\int_{E_2} f\,d\psi\right| \le c\,|\psi|(E_2) = 0.$$

By the linearity of the integral it now follows that $\int_X f\,d\psi = 0$, which ensures that the right-hand side of (3.16) is well-defined at the level of equivalence classes of bounded functions in $L^\infty(\mu)$. The stage is now set for
Theorem 3.9. Let $(X,\mathcal{M},\mu)$ be a measure space. The dual to $L^\infty(\mu)$ can be described as follows: Each bounded linear functional $L$ on $L^\infty(\mu)$ is of the form

$$Lf = \int_X f\,d\psi, \quad f \in L^\infty(\mu), \tag{3.17}$$

where $\psi$ is a bounded additive set function defined on $\mathcal{M}$ satisfying the condition $|\psi|(E) = 0$ whenever $\mu(E) = 0$. Furthermore, the norm $\|L\|$ is

$$\|L\| = |\psi|(X). \tag{3.18}$$
Proof. As discussed above, if $L$ is a bounded linear functional defined on $L^\infty(\mu)$, then there is a bounded additive function $\psi$ defined on $\mathcal{M}$ such that (3.17) holds. Moreover, this representation is independent of the bounded representative of each $f \in L^\infty(\mu)$, and (3.11) implies that $\|L\| \le |\psi|(X)$. Thus to verify (3.18) it suffices to prove that we also have $|\psi|(X) \le \|L\|$. Now, by 4.9 in Chapter IV, given $\varepsilon > 0$, there exist measurable subsets $E_1$, $E_2$ of $X$ such that

$$\psi(E_1) - \psi(E_2) \ge |\psi|(X) - \varepsilon.$$

Put $f = \chi_{E_1} - \chi_{E_2}$ and observe that since $f$ only takes the values 0 and $\pm 1$, we have $\|f\|_\infty = 1$ and consequently,

$$\|L\| \ge |Lf| \ge Lf = L\chi_{E_1} - L\chi_{E_2} = \psi(E_1) - \psi(E_2) \ge |\psi|(X) - \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, the above estimate implies that $\|L\| \ge |\psi|(X)$, (3.18) holds, and the integral representation of $L$ has been established.

On the other hand, if $\psi$ is a bounded additive set function defined on $\mathcal{M}$ with the property that $|\psi|(E) = 0$ whenever $\mu(E) = 0$, then it is not hard to see that (3.17) defines a bounded linear functional $L$ on $L^\infty(\mu)$ with norm $\|L\| = |\psi|(X)$. Indeed, if $f$ is a bounded representative of a function $f \in L^\infty(\mu)$, consider the measurable partition of $X$ consisting of the sets $B = \{|f| > \|f\|_\infty\}$, and $X \setminus B$. Since $\mu(B) = 0$, it follows that $|\psi|(B) = 0$ and by (3.11) we get that $\int_B f\,d\psi = 0$. Whence we have $\int_X f\,d\psi = \int_{X \setminus B} f\,d\psi$ and, by (3.11) again, it follows that

$$|Lf| \le |\psi|(X \setminus B)\,\|f\|_\infty \le |\psi|(X)\,\|f\|_\infty.$$

These observations imply at once that $L$ is a bounded linear functional defined on $L^\infty(\mu)$ with $\|L\| \le |\psi|(X)$. The opposite inequality holds as before, $L$ also satisfies (3.18), and we have finished. ■
4. PROBLEMS AND QUESTIONS

4.1 Is every metric on a linear space induced by a norm?

4.2 Let $X$ be a normed linear space, and $B = \{x \in X : \|x\| < 1\}$. Show that the closure of $B$ is the set $\{x \in X : \|x\| \le 1\}$.

4.3 Suppose $M$ is a closed subspace of a normed linear space $X$ and define an equivalence relation $R$ on $X \times X$ by $xRy$ iff $x - y$ belongs to $M$. If $X/M$ denotes the set of equivalence classes and $x + M$ the equivalence class corresponding to $x \in X$, show that $X/M$ is a linear space over the scalar field of $X$ with the operations

$$(x + M) + (\lambda y + M) = (x + \lambda y) + M, \quad x, y \in X,\ \lambda \text{ scalar}.$$

For future reference, the dimension of $X/M$ is called the codimension of $M$. Further, $X/M$ is also a normed space with norm $\|x + M\| = d(x, M)$. Are these conclusions true if $M$ is not necessarily closed?

4.4 Let $X$ be a normed linear space, and $M$ be a closed subspace of $X$. Prove that $X$ is complete iff $M$ and $X/M$ are complete. Also, show that $X$ is separable iff $M$ and $X/M$ are separable.

4.5 If $M$ is a finite-dimensional proper subspace of a normed linear space $X$, prove there exists an element $x \in X$, $\|x\| = 1$, such that $d(x, M) = 1$.

4.6 (F. Riesz) Let $X$ be a normed linear space and $M$ be a proper closed linear subspace of $X$. Show that given $\varepsilon > 0$, there exists an element $x \in X$, $\|x\| = 1$, such that $d(x, M) > 1 - \varepsilon$.

4.7 Let $X$ be a normed linear space and suppose $\lim_{n\to\infty} \|x_n - x\| = 0$. Show that $\lim_{n\to\infty} \|x_n\| = \|x\|$.

4.8 Suppose $x_1, \ldots, x_n$ are linearly independent elements of a normed linear space $X$. Show that there exists a constant $c > 0$ with the property that for every choice of scalars $\lambda_1, \ldots, \lambda_n$ we have

$$\|\lambda_1 x_1 + \cdots + \lambda_n x_n\| \ge c\,(|\lambda_1| + \cdots + |\lambda_n|).$$

4.9 Referring to the construction of the norm on a linear space following Proposition 2.4, suppose we put $\|x\|_p = \big(\sum_{i=1}^{n} |\lambda_i|^p\big)^{1/p}$, $1 \le p < \infty$ there. Is $\|\cdot\|_p$ a norm?
4.10 Let $(X,\mathcal{M},\mu)$ be a measure space and $1 \le p, q \le \infty$. Show that $L^p(\mu) + L^q(\mu) = \{f : f \text{ can be written as } f = g + h,\ g \in L^p(\mu),\ h \in L^q(\mu)\}$ is a linear space. Further, normed by

$$\|f\| = \inf\{\|g\|_p + \|h\|_q : f = g + h\},$$

$L^p(\mu) + L^q(\mu)$ is a Banach space. Along similar lines, $L^p(\mu) \cap L^q(\mu)$ normed by $\|f\| = \max\{\|f\|_p, \|f\|_q\}$ is also a Banach space. Can you characterize the conjugate space to $L^p(\mu) + L^q(\mu)$? to $L^p(\mu) \cap L^q(\mu)$?

4.11 A sequence $(x_n)$ of elements of a Banach space $X$ is said to be a Schauder basis for $X$ if for each $x \in X$ there is a unique sequence of scalars $(\lambda_n)$ such that $\lim_{m\to\infty} \big\|x - \sum_{n=1}^{m} \lambda_n x_n\big\| = 0$. Show that $\ell^p$ has a Schauder basis if $1 \le p < \infty$, but $\ell^\infty$ does not.

4.12 Prove that if a Banach space has a Schauder basis, then it is separable.

4.13 Let $c_0 = \{(x_n) \in \ell^\infty : \lim_{n\to\infty} x_n = 0\}$. Show that $c_0$ is a closed linear subspace of $\ell^\infty$, and that it has a Schauder basis. Is $c_0$ reflexive?

4.14 For each positive integer $n$ let $e_n$ be the sequence with 1 in the $n$th place and zeros elsewhere. Prove that $\{e_n\}$ is a Schauder basis for $\ell^1$, but it is not a Hamel basis for $\ell^1$.

4.15 Let $X$ be a linear normed space, and $L$ a nontrivial linear functional on $X$. Prove the following three conditions are equivalent: (a) $L$ is continuous, (b) The null space of $L$ is a proper, closed linear subspace of $X$, and, (c) The null space of $L$ is not dense in $X$.

4.16 Let $X$ be a linear normed space over $\mathbb{C}$. If a linear functional $L$ on $X$ is not continuous, prove that $\{Lx : \|x\| \le 1\}$ is all of $\mathbb{C}$.

4.17 Let $L \ne 0$ be a linear functional on a linear space $X$ and $x_0$ any fixed element of $X \setminus N$, where $N$ is the null space of $L$. Show that any $x \in X$ has a unique representation $x = \lambda x_0 + y$, where $\lambda$ is a scalar and $y \in N$.

4.18 Referring to 4.17, show that any two elements $x_1, x_2 \in X$ belong to the same element of $X/N$ iff $Lx_1 = Lx_2$. Further, the codimension of $N$ is equal to 1.

4.19 Show that two linear functionals $L_1, L_2$ which are defined on the same linear space and have the same null space are proportional.

4.20 If $Y$ is a subspace of a linear space $X$ and the codimension of $Y$ is equal to 1, then every element of $X/Y$ is called a hyperplane
parallel to $Y$. Show that for any linear functional $L \ne 0$ on $X$ the set $Y_1 = \{x \in X : Lx = 1\}$ is a hyperplane parallel to the null space $N$ of $L$. Further, show that the norm $\|L\|$ of $L$ can be interpreted geometrically as the reciprocal of the distance of the hyperplane $Y_1$ to the origin.

4.21 Let $X$ be a normed linear space, and suppose $L$ is a bounded linear functional on $X$ with norm 1. Given $\varepsilon > 0$, show there exists $x_\varepsilon \in X$ such that $\|x_\varepsilon\| = 1$ and $Lx_\varepsilon > 1 - \varepsilon$. Give an example to show that there need not exist $x \in X$ such that $\|x\| = 1$ and $Lx = 1$.

4.22 Let $X$ be a normed linear space and let $\{x_n\} \subseteq X$. Prove that $x \in X$ is the limit of finite linear combinations of the $x_n$'s iff $Lx = 0$ for all continuous linear functionals $L$ on $X$ such that $Lx_n = 0$ for all $n$.

4.23 Let $Y$ be a subset of a (real) normed linear space $X$, and $L_0$ a functional defined on $Y$. Show that a necessary and sufficient condition for $L_0$ to have a bounded linear extension to $X$ is that there exists a constant $k$ with the property that $\big|\sum_{j=1}^{n} \lambda_j L_0 x_j\big| \le k\,\big\|\sum_{j=1}^{n} \lambda_j x_j\big\|$ for any $x_1, \ldots, x_n$ in $Y$ and scalars $\lambda_1, \ldots, \lambda_n$.

4.24 Let $1 < p, q < \infty$ be conjugate indices, i.e., $1/p + 1/q = 1$. Suppose $g \in L^q(\mathbb{R}^n)$ has the property that $\int_{\mathbb{R}^n} fg\,dx = 0$ for each $f$ in $D = \{f \in L^p(\mathbb{R}^n) \cap L(\mathbb{R}^n) : \int_{\mathbb{R}^n} f\,dx = 0\}$. Prove that $g = 0$ a.e. As a consequence, show that $D$ is dense in $L^p(\mathbb{R}^n)$. Is a similar statement true if we consider a bounded interval $I$ instead of $\mathbb{R}^n$? Also, what can we say about the case $p = 1$?

4.25 Suppose $1 < p, q < \infty$ are conjugate indices, and $f \notin L^p(X,\mu)$. Show that the set $\{g \in L^q(X,\mu) : fg \in L(X,\mu) \text{ and } \int_X fg\,d\mu = 0\}$ is dense in $L^q(X,\mu)$.

4.26 Let $X = L^2(\mu) \times L^2(\mu) = \{(f,g) : f, g \in L^2(\mu)\}$ be the linear space normed by

$$\|(f,g)\| = \big(\|f\|_2^3 + \|g\|_2^3\big)^{1/3}.$$

Show that $X$ is a Banach space and describe $X^*$.

4.27 Let $X \ne \{0\}$ be a normed linear space. Show that $X^* \ne \{0\}$. Moreover, prove that if $X$ has $n$ linearly independent elements, so does $X^*$.

4.28 Show that if $Lx = Ly$ for every $L \in X^*$, then $x = y$.

4.29 Prove that if a normed linear space $X$ is reflexive, so is $X^*$.
4.30 Prove that the "completion" of the normed linear space described in Theorem 3.5 is unique up to isomorphisms.

4.31 Let $Y$ be a closed subspace of a normed linear space $X$. Show that if every $L \in X^*$ which vanishes on $Y$ vanishes also on $X$, then $Y = X$.

4.32 Let $X$ be a normed linear space. A sequence $\{x_n\} \subseteq X$ is said to converge weakly to an element $x \in X$ if $\lim_{n\to\infty} Lx_n = Lx$ for all $L \in X^*$. Prove that no sequence can have two distinct weak limits. Further, a sequence $\{x_n\} \subseteq X$ is said to be weakly Cauchy if $\{Lx_n\}$ is a Cauchy (scalar) sequence for every $L \in X^*$, and $X$ is said to be weakly sequentially complete if every weakly Cauchy sequence converges weakly. Prove that if $X$ is weakly sequentially complete, then it is complete.

4.33 Prove that a reflexive Banach space is weakly sequentially complete.

4.34 Show that any closed subspace of a weakly sequentially complete Banach space is itself weakly sequentially complete.

4.35 Show that $\ell^1$ is weakly sequentially complete.

4.36 Describe a normalization of BV functions that allows for the identification of the dual of $C(I)$.

4.37 If $I$ is a compact interval of $\mathbb{R}^n$ and $\mu$ is a finite Borel measure on $I$, then $Lf = \int_I f\,d\mu$ is a positive bounded linear functional on $C(I)$. Positive here means that $Lf \ge 0$ whenever $f \ge 0$. Prove now the following result, a particular case of the so-called Riesz Representation Theorem: Suppose $I$ is a compact interval of $\mathbb{R}^n$ and $L$ is a positive bounded linear functional on $C(I)$. Then there is a unique Borel measure $\mu$ such that $Lf = \int_I f\,d\mu$ for every $f \in C(I)$.

4.38 Fix an integer $N$. Does there exist a function $f$ which is BV in $I = [0,1]$ such that $\int_I p(x)\,df(x) = \sum_{n=1}^{N} p^{(n)}(n/N)$ for all polynomials $p$ of degree less than or equal to $N$?

4.39 Let $I = [0,1]$ and consider the sequence $\{f_n\} \subset C(I)$ defined by $f_n(x) = nx$ if $0 \le x \le 1/n$, $= 2 - nx$ if $1/n \le x \le 2/n$, and $= 0$ otherwise. Show that $\{f_n\}$ converges weakly to 0 in $C(I)$, but that $\lim_{n\to\infty} \|f_n\| \ne 0$.
CHAPTER XV

The Basic Principles

In this chapter we consider the three basic principles concerning continuous linear transformations that provide the foundation for many results in linear analysis. These principles are: The Uniform Boundedness Principle, The Open Mapping Theorem, and the Closed Graph Theorem.

1. THE BAIRE CATEGORY THEOREM

Baire's theorem concerning the structure of complete metric spaces is an essential ingredient in proving the validity of the basic principles alluded to above. To state it we need some definitions. Let $(X,d)$ be a metric space. A set $E \subseteq X$ is said to be nowhere dense if its closure $\overline{E}$ has empty interior. The sets of first category in $X$ are those that are countable unions of nowhere dense sets; these sets are also called meager. All other sets are said to be of second category in $X$, or nonmeager. For instance, the rational numbers $\mathbb{Q}$ are of first category in $\mathbb{R}$, and the irrational numbers $\mathbb{I}$ are of second category in $\mathbb{R}$. We begin by proving

Theorem 1.1 (Baire's Category Theorem). A complete metric space $X$ is of second category in itself.

Proof. Suppose, to the contrary, that $X$ is of first category in itself and let

$$X = \bigcup_{n=1}^{\infty} X_n, \quad X_n \text{ nowhere dense, all } n.$$
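Take a point $x_0$ in $X$ and consider the (nonempty) open ball $B(x_0,1)$ centered at $x_0$ of radius 1. Since the interior of $\overline{X_1}$ is empty, $\overline{X_1}$ does not contain $B(x_0,1)$; let then $x_1$ be a point in $B(x_0,1) \setminus \overline{X_1}$ and $0 < r_1 < 1/2$ be such that

$$\overline{B(x_1,r_1)} \subset B(x_0,1), \quad \text{and} \quad \overline{B(x_1,r_1)} \cap \overline{X_1} = \emptyset.$$

Similarly, since $X_2$ is nowhere dense, $\overline{X_2}$ does not contain $B(x_1,r_1)$ and, as before, there are a point $x_2 \in B(x_1,r_1) \setminus \overline{X_2}$ and $0 < r_2 < 1/4$ such that

$$\overline{B(x_2,r_2)} \subset B(x_1,r_1), \quad \text{and} \quad \overline{B(x_2,r_2)} \cap \overline{X_2} = \emptyset.$$

Continuing in this fashion step by step we get a decreasing sequence of closed balls $\{\overline{B(x_n,r_n)}\}$ with the property that

$$\overline{B(x_n,r_n)} \cap \overline{X_n} = \emptyset, \quad 0 < r_n < 1/2^n, \quad \text{all } n.$$

Now, by a well-known result in the theory of metric spaces, actually an extension of the Nested Sequence Theorem on the real line, since the $\overline{B(x_n,r_n)}$'s form a monotone decreasing sequence of nonempty closed sets whose diameters tend to 0, and since $(X,d)$ is complete, there exists one, and only one, point $x \in X$ so that

$$\bigcap_{n=1}^{\infty} \overline{B(x_n,r_n)} = \{x\}.$$

By construction $x \notin X_n$ for all $n$; thus $x \notin \bigcup_n X_n = X$, and this is a contradiction. ■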
Theorem 1.1 is often cast in the following form: If $O_n = X \setminus \overline{X_n}$ denotes the complement of $\overline{X_n}$, then each $O_n$ is open and dense in $X$, and the conclusion of Baire's Category Theorem is that $\bigcap_n O_n \ne \emptyset$. More precisely, the intersection of every countable family of dense open subsets of $X$ is dense in $X$.

The Baire Category Theorem is useful in proving that a set is nonempty. In fact, the category method furnishes a whole class of examples, and it often makes it possible to construct an explicit example by successive approximations. We exemplify this by showing that, in the sense of category, almost all continuous functions are nowhere differentiable. In fact, as we prove below, it is exceptional for a continuous function to have a finite one-sided derivative anywhere in an interval.
To make this precise let $I = [0,1]$, consider $C(I)$ with the uniform metric, and let $E_n$ denote the class of functions $f \in C(I)$ such that for some $x$ in $[0, 1 - 1/n]$ we have

$$|f(x+h) - f(x)| \le nh, \quad \text{all } 0 < h < 1 - x.$$

We claim that for each $n$, $E_n$ is closed and nowhere dense in $C(I)$. To see that $E_n$ is closed let $f \in \overline{E_n}$, and let $\{f_k\}$ be a sequence in $E_n$ that converges to $f$. Then, there is a sequence $(x_k)$ such that $0 \le x_k \le 1 - 1/n$, and $|f_k(x_k + h) - f_k(x_k)| \le nh$, all $0 < h < 1 - x_k$. By the Bolzano-Weierstrass theorem we may also assume that $x_k \to x$ for some $0 \le x \le 1 - 1/n$, since this condition is satisfied if we replace $(x_k)$ by a suitable subsequence. Now, if $0 < h < 1 - x$, note that the inequality $0 < h < 1 - x_k$ holds for sufficiently large $k$, and that

$$|f(x+h) - f(x)| \le |f(x+h) - f(x_k+h)| + |f(x_k+h) - f_k(x_k+h)| + |f_k(x_k+h) - f_k(x_k)| + |f_k(x_k) - f(x_k)| + |f(x_k) - f(x)| = A_1 + A_2 + A_3 + A_4 + A_5,$$

say. Clearly, $A_2, A_4 \le \|f - f_k\|$. Also by the choice of $x_k$, we get $A_3 \le nh$. Moreover, since $f$ is uniformly continuous in $I$, we have $\lim_{k\to\infty} A_1, A_5 = 0$. Thus, letting $k \to \infty$, it follows that

$$|f(x+h) - f(x)| \le nh, \quad \text{all } 0 < h < 1 - x,$$

$f$ belongs to $E_n$, and $E_n$ is closed.

Next, since any continuous function on $I$ can be approximated arbitrarily closely by a piecewise linear continuous function $g$, to show that $E_n$ is nowhere dense in $C(I)$ it suffices to prove that given any such function $g$ and $\varepsilon > 0$, there is a function $h$ in $C(I) \setminus E_n$ such that $\|g - h\| \le \varepsilon$. This is not hard; in fact, a "proof by pictures" using "saw-tooth" functions works, cf. 6.1 below. Hence the set $E = \bigcup E_n$ is of first category in $C(I)$. This is the set of all continuous functions that have bounded right difference quotients at some point of $[0,1)$ and it contains the set of all functions in $C(I)$ that have a finite right-sided derivative somewhere there.

The use of the Baire Category Theorem in this, and other contexts, amounts to the verification of the fact that a set is not empty by showing that an element of the set can be found as the limit of a suitable sequence. In fact, the above proof hints that a nowhere differentiable function can be exhibited as the sum of a (uniformly convergent) series of "saw-tooth" functions, cf. 6.2 below.
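As a rough numerical companion to the remark about series of saw-tooth functions, one may look at the classical series $\sum_{k \ge 0} 2^{-k}\,s(2^k x)$, where $s(x)$ is the distance from $x$ to the nearest integer. This particular series is an assumption made here for illustration; the example intended in 6.2 may be a different one, and the names `sawtooth`, `partial_sum` and `right_quotient` are hypothetical. The sketch only probes the right difference quotients at the single point $x = 0$ along $h = 2^{-j}$; it does not prove nowhere differentiability.

```python
def sawtooth(x):
    """Distance from x to the nearest integer (a 1-periodic saw-tooth)."""
    return abs(x - round(x))

def partial_sum(x, m):
    """m-th partial sum of sum_{k>=0} sawtooth(2^k x) / 2^k."""
    return sum(sawtooth(2**k * x) / 2**k for k in range(m))

def right_quotient(x, h, m=60):
    """Right difference quotient of the m-th partial sum at x with step h."""
    return (partial_sum(x + h, m) - partial_sum(x, m)) / h

# The quotients at x = 0 along h = 2^-j grow like j, so they are unbounded.
for j in (2, 5, 10, 20):
    print(j, right_quotient(0.0, 2.0**-j))
```

Since each term is bounded by $2^{-k-1}$, the partial sums converge uniformly, so the limit is continuous, while the printed quotients keep growing with $j$.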
2. THE SPACE B(X, Y)

The time has come to consider the theory of linear mappings from a linear space into another linear space over the same field of scalars. More precisely, let $X$, $Y$ be normed linear spaces over the same field of scalars, usually but not necessarily the real or complex numbers, and let $T$ be a mapping, or operator, with domain $D(T)$ in $X$ and range $R(T)$ contained in $Y$. We say that $T$ is linear if for all $x_1, x_2$ in $D(T)$ and scalars $\lambda$ we have

$$T(x_1 + \lambda x_2) = Tx_1 + \lambda Tx_2. \tag{2.1}$$

A word about (2.1) above: The sign "+" denotes the addition in $X$ on the left-hand side of (2.1) and the addition in $Y$ on the right-hand side there; a similar remark applies to other notations throughout this chapter.

A mapping $T\colon X \to Y$ is said to be continuous at a point $x_0$ in $X$ if given $\varepsilon > 0$, there exists $\delta = \delta(x_0,\varepsilon)$ such that $\|Tx - Tx_0\| \le \varepsilon$ whenever $\|x - x_0\| \le \delta$. Closely related to this concept is that of boundedness: A mapping $T\colon X \to Y$ is said to be bounded if

$$\|T\| = \sup_{x \ne 0} \frac{\|Tx\|}{\|x\|} < \infty. \tag{2.2}$$

As in the case of linear functionals we have

Proposition 2.1. Let $X$ and $Y$ be normed linear spaces and $T$ be a linear operator defined on $X$ with range in $Y$. The following statements are equivalent: (i) $T$ is continuous at a point $x_0 \in X$. (ii) $T$ is uniformly continuous on $X$. (iii) $T$ is bounded.

The proof of the proposition follows along the lines of that of Proposition 2.4 in Chapter XIV and is therefore left to the reader. We denote by $B(X,Y)$ the collection of all the bounded linear mappings defined on $X$ with range in $Y$; when $X = Y$ we simply write $B(X)$.

We begin by giving an example of a mapping in $B(X,Y)$, and computing its norm. Let $X = Y = C(I)$, where $I$ is a compact interval of the line, and let (the kernel) $k$ be a continuous real-valued function defined on $I \times I$. We consider the mapping $T$ on $f \in C(I)$ given by

$$Tf(x) = \int_I k(x,y)f(y)\,dy.$$
From the continuity of $k$ and LDCT it readily follows that also $Tf \in C(I)$. The question is whether $T \in B(C(I))$, and if so to compute $\|T\|$. First observe that

$$|Tf(x)| \le \|f\| \sup_{x \in I} \int_I |k(x,y)|\,dy,$$

and consequently $T \in B(C(I))$, and

$$\|T\| \le \sup_{x \in I} \int_I |k(x,y)|\,dy. \tag{2.3}$$
g(x) =
i
Ik(x, y)1 dy
is continuous on I, it attains its maximum value at some point Xo in I. Now, the function h defined by h(y) = sgn (k(xo, y» is bounded and measurable, and consequently, integrable on I. It follows from Theorem 1.3 in Chapter VIII that there is a sequence {
and
n-+oo
o.
On the one hand we have max f Ik(x, y)1 dy = f Ik(xo, y)1 dy = f k(xo, y)h(y) dy, :eEl ~ ~ ~
(2.4)
and, on the other hand, by LDCT it follows that lim T
(2.5)
1/
Moreover, since for each n we have T
(2.6) it readily follows that IITI + AT211 ~ belongs to B(X, Y). Moreover, we also have
IITIII + IAIIIT211,
and TI
+ AT2
also
Proposition 2.2. Let $X$, $Y$ be normed spaces over the same field of scalars. Then, normed by (2.2), $B(X,Y)$ is a normed space. Furthermore, if $X \ne \{0\}$, $B(X,Y)$ is a Banach space iff $Y$ is complete.

Proof. That $B(X,Y)$ is a normed space, and that it is complete if $Y$ is complete, follows along the lines of the particular case of functionals, cf. Proposition 2.5 in Chapter XIV, so we say no more. Now, suppose that $B(X,Y)$ is a Banach space and let $\{y_n\}$ be a sequence of elements in $Y$ such that $\sum_n \|y_n\| < \infty$; we must show that $\sum_n y_n$ converges in $Y$. Let now $x_0 \in X$, $\|x_0\| = 1$, invoke the Hahn-Banach theorem to construct a bounded linear functional $L$ on $X$ so that $Lx_0 = \|x_0\| = 1$, and define the sequence $\{T_n\} \subseteq B(X,Y)$ by

$$T_n x = (Lx)\,y_n, \quad x \in X.$$

Since

$$\|T_n\| = \sup_{x \ne 0} \frac{\|T_n x\|}{\|x\|} \le \|L\|\,\|y_n\|, \quad \text{all } n,$$

it follows at once that $\sum_n \|T_n\| < \infty$, and since $B(X,Y)$ is complete $\sum_n T_n$ converges to a sum $T \in B(X,Y)$, say, in the norm of $B(X,Y)$. In particular, we have

$$\Big\|\sum_{n=1}^{m} T_n x_0 - Tx_0\Big\| \le \Big\|\sum_{n=1}^{m} T_n - T\Big\|\,\|x_0\| = \Big\|\sum_{n=1}^{m} T_n - T\Big\|. \tag{2.7}$$

But the right-hand side of (2.7) tends to 0 as $m \to \infty$, and so does the left-hand side there. By the definition of the $T_n$'s this means that

$$\sum_{n=1}^{m} T_n x_0 = Lx_0 \sum_{n=1}^{m} y_n = \sum_{n=1}^{m} y_n \to Tx_0, \quad \text{as } m \to \infty,$$

and $\sum_n y_n$ converges to $Tx_0$ in $Y$. ■
We operate with elements in $B(X,Y)$ pretty much like with numerical functions, including the taking of inverses. Indeed, suppose $T$ is a one-to-one linear operator in $B(X,Y)$. The inverse $T^{-1}$ of $T$ is the map from $R(T)$ into $X$ given by $T^{-1}(Tx) = x$ for all $x \in X$. It is clear that $T^{-1}$ is linear, and we also have

Proposition 2.3. Let $T \in B(X,Y)$. Then $T^{-1}$ exists and is continuous iff there exists a constant $c > 0$ such that

$$\|Tx\| \ge c\,\|x\|, \quad \text{all } x \in X. \tag{2.8}$$
Proof. Suppose (2.8) is true and observe that if $x \ne 0$, then also $Tx \ne 0$, and $T$ is one-to-one. Moreover, given $y \in R(T)$, let $y = Tx$, $x \in X$, and note that

$$\|T^{-1}y\| = \|x\| \le \frac{1}{c}\,\|Tx\| = \frac{1}{c}\,\|y\|,$$

and $T^{-1} \in B(R(T), X)$. On the other hand, if $T^{-1}$ exists and is continuous, we have

$$\|x\| = \|T^{-1}(Tx)\| \le \|T^{-1}\|\,\|Tx\|, \quad x \in X,$$

and (2.8) follows upon taking $c = 1/\|T^{-1}\|$. ■
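A finite-dimensional sanity check of Proposition 2.3 may help fix ideas. The sketch below is an illustration under assumptions not found in the text: $X = Y = \mathbb{R}^2$ with the maximum norm, a hypothetical invertible matrix `T`, and the standard fact that the operator norm induced by the maximum norm is the largest absolute row sum. It compares $c = 1/\|T^{-1}\|$ with the smallest ratio $\|Tx\|/\|x\|$ found on a small grid of vectors.

```python
def max_norm(v):
    """Maximum norm of a vector given as a sequence of numbers."""
    return max(abs(c) for c in v)

def apply(A, v):
    """Matrix-vector product, A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def induced_norm(A):
    """Operator norm induced by the maximum norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

# Hypothetical example: T has determinant 1 and the inverse written below.
T = [[2.0, 1.0], [1.0, 1.0]]
T_inv = [[1.0, -1.0], [-1.0, 2.0]]

c = 1.0 / induced_norm(T_inv)          # the constant of (2.8), here 1/3
grid = [(a, b) for a in range(-5, 6) for b in range(-5, 6) if (a, b) != (0, 0)]
worst = min(max_norm(apply(T, v)) / max_norm(v) for v in grid)
print(c, worst)
```

On this grid the smallest ratio is attained at $x = (2,-3)$, where $Tx = (1,-1)$, so the constant $c = 1/\|T^{-1}\|$ cannot be improved for this matrix.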
3. THE UNIFORM BOUNDEDNESS PRINCIPLE

A family $\mathcal{F} \subseteq B(X,Y)$ can be "bounded" in at least two different senses. First, since $B(X,Y)$ is normed, $\mathcal{F}$ can be bounded in the norm, i.e., $\sup\{\|T\| : T \in \mathcal{F}\}$ is finite. If this is the case we say that $\mathcal{F}$ is a norm bounded set. On the other hand, it can also happen that $\sup\{\|Tx\| : T \in \mathcal{F}\}$ is finite for each $x$ in $X$. When this is the case we say that $\mathcal{F}$ is pointwise bounded on $X$. A norm bounded set is certainly pointwise bounded on $X$. The remarkable fact is that, when $X$ is complete, the converse is true. This is the content of the Uniform Boundedness Principle, and of the Resonance Theorem, which we present next.

Theorem 3.1 (The Uniform Boundedness Principle). Let $X$ be a Banach space, $Y$ a normed linear space, and $\mathcal{F} = \{T_i : i \in I\} \subseteq B(X,Y)$ a pointwise bounded family on $X$. Then

$$\lim_{x \to 0} \|T_i x\| = 0, \quad \text{uniformly in } I.$$

Proof. We show that given $\varepsilon > 0$, there exists $\delta > 0$ such that

$$\|T_i x\| \le \varepsilon, \quad \text{whenever } \|x\| \le \delta, \quad \text{all } i \in I. \tag{3.1}$$

Let

$$X_k = \Big\{x \in X : \sup_{i \in I} \|T_i x\| \le k\Big\}, \quad k = 1, 2, \ldots$$

Since the $T_i$'s are continuous, each $X_k$ is closed. Moreover, since by assumption $X = \bigcup_k X_k$, by the Baire Category Theorem some $X_k$, $X_{k_0}$ say, contains an open ball $B(x_0,r) = \{x \in X : \|x - x_0\| < r\}$. This implies, in particular, that

$$\sup_{i \in I} \|T_i x\| \le k_0, \quad \text{all } x \in B(x_0,r).$$

Now it is all a matter of translations: Observe that if $x \in B(0,r)$, then $x + x_0 \in B(x_0,r)$ and consequently, since $x_0 \in X_{k_0}$, we have

$$\|T_i x\| = \|T_i(x + x_0) - T_i x_0\| \le k_0 + \|T_i x_0\| \le 2k_0, \quad \text{all } i \in I.$$

Next observe that given $0 \ne x \in X$, $(r/2\|x\|)x \in B(0,r)$, and consequently, by the above estimate we also have

$$\|T_i((r/2\|x\|)x)\| \le 2k_0, \quad \text{all } i \in I,$$

or equivalently,

$$\|T_i x\| \le (4k_0/r)\,\|x\|, \quad \text{all } i \in I. \tag{3.2}$$

Whence letting $\delta = \varepsilon r/4k_0$, from (3.2) it follows that (3.1) is true, and the proof is complete. ■

As for the Resonance Theorem, it states

Theorem 3.2 (Banach-Steinhaus). Under the assumptions of Theorem 3.1, $\mathcal{F}$ is a norm bounded set.
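Proof. In the notation of Theorem 3.1, by (3.1) it readily follows that for all $i \in I$ we have

$$\|T_i\| = \sup_{x \ne 0} \frac{\|T_i x\|}{\|x\|} = \frac{1}{\delta} \sup_{x \ne 0} \|T_i((\delta/\|x\|)x)\| \le \varepsilon/\delta.$$

Thus $\sup_{i \in I} \|T_i\| \le \varepsilon/\delta$, and $\mathcal{F}$ is a norm bounded set. ■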
It is interesting to note that the assumption concerning the completeness of $X$ is necessary for the above results to hold. To see this let $X$ be the linear subspace of $\ell^1$ consisting of all sequences that have only a finite number of nonzero terms; clearly $X$ is dense, but not closed, in $\ell^1$. For each positive integer $n$, let $T_n$ denote the linear functional defined on sequences $x = (x_1, \ldots, x_m, \ldots)$ of $X$ by

$$T_n x = n\,x_n.$$

Now, since $T_n x = 0$ for each $x \in X$ and all sufficiently large $n$, it is clear that the family $\{T_n\}$ is pointwise bounded on $X$. On the other hand, if $e_n \in \ell^1$ is the sequence which has a 1 in the $n$th position and zeros elsewhere, we have $e_n \in X$, $\|e_n\| = 1$, and

$$\|T_n\| \ge |T_n e_n| = n.$$

Thus $\{T_n\}$ is not a norm bounded set.

Important applications of these results, including the existence of a continuous function whose Fourier series diverges at a point, cf. the discussion following Proposition 1.2 in Chapter XVII, follow from the following restatement of Theorem 3.2 involving sequences of linear operators: Let $X$ be a Banach space, $Y$ a normed linear space, and $\{T_n\}$ a sequence in $B(X,Y)$. Then the (good) set
$$G = \Big\{x \in X : \limsup_{n\to\infty} \|T_n x\| < \infty\Big\} \tag{3.3}$$

either coincides with $X$ or is a set of first category in $X$. This formulation enables us to generalize this result somewhat.

Theorem 3.3 (Principle of the Condensation of Singularities). Let $\{T_{m,n}\}$, $n = 1, 2, \ldots$, be a sequence of bounded linear operators from a Banach space $X$ into a normed linear space $Y_m$, $m = 1, 2, \ldots$ Suppose that for each $m$ there exists $x_m \in X$ such that

$$\limsup_{n\to\infty} \|T_{m,n}x_m\| = \infty, \quad m = 1, 2, \ldots \tag{3.4}$$

Then the (bad) set

$$B = \Big\{x \in X : \limsup_{n\to\infty} \|T_{m,n}x\| = \infty, \text{ all } m = 1, 2, \ldots\Big\}$$

is of second category in $X$.

Proof. Consider the sequence $\{G_m\}$ of (good) subsets of $X$ given by

$$G_m = \Big\{x \in X : \limsup_{n\to\infty} \|T_{m,n}x\| < \infty\Big\}, \quad m = 1, 2, \ldots$$

By the preceding remark and (3.4) it readily follows that each $G_m$ is of first category in $X$. Since $X$ is complete, by the Baire Category Theorem, we get that $B = X \setminus (\bigcup_m G_m)$ is of second category in $X$. ■
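Returning to the example of the finitely supported sequences used in this section to show that completeness cannot be dropped, the two notions of boundedness can be compared numerically. The sketch assumes that the functionals are $T_n x = n\,x_n$, one standard choice consistent with the properties used in that example; this formula, the list representation of sequences, and the helper names are assumptions made for illustration.

```python
def T(n, x):
    """T_n x = n * x_n for a finitely supported sequence x.

    x is a list holding the possibly nonzero terms; x[0] plays the role of x_1.
    """
    return n * x[n - 1] if n <= len(x) else 0.0

def unit_vector(n):
    """The sequence e_n: 1 in the n-th place and zeros elsewhere."""
    return [0.0] * (n - 1) + [1.0]

def l1_norm(x):
    return sum(abs(t) for t in x)

# Pointwise boundedness: for a fixed x only finitely many T_n x are nonzero.
x = [3.0, -1.0, 4.0]
pointwise_sup = max(abs(T(n, x)) for n in range(1, 1000))

# No norm bound: |T_n e_n| / ||e_n|| = n is a lower bound for ||T_n||.
lower_bounds = [abs(T(n, unit_vector(n))) / l1_norm(unit_vector(n))
                for n in range(1, 6)]
print(pointwise_sup, lower_bounds)
```

For the fixed sequence the supremum over $n$ is finite, while the lower bounds for $\|T_n\|$ grow without limit.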
4. THE OPEN MAPPING THEOREM

Suppose that $X$, $Y$ are normed linear spaces over the same field of scalars, and let $T$ be a mapping from $X$ into $Y$. We say that $T$ is open at $x \in X$ if $T(V)$ contains a neighbourhood of $Tx$ whenever $V$ is a neighbourhood of $x$. We say that $T$ is open if $T(U)$ is open in $Y$ whenever $U$ is open in $X$. It is clear that $T$ is open iff $T$ is open at every $x \in X$. Because of the translation invariance of the neighbourhoods in linear spaces, it is also clear that a linear mapping $T$ is open iff $T$ is open at a single point of $X$.

An interesting question to consider is whether a one-to-one continuous linear mapping from a Banach space onto another has a continuous inverse. The answer is affirmative and, as we shall see below, it is a special case of our next result.

Theorem 4.1 (Open Mapping Theorem). Let $X$, $Y$ be Banach spaces, and suppose $T \in B(X,Y)$. If $T$ maps $X$ onto $Y$, then $T$ is open.

Proof. The proof amounts to showing that $T$ is open at the origin. To simplify the notation, in what follows we put $B(0,r) = B_r$, the open ball of $X$ of radius $r$ centered at the origin, and similarly, $B'_r$, the open ball of $Y$ of radius $r$ centered at the origin. Now, suppose that $V$ is a neighbourhood of 0 in $X$; we will be done once we show that $T(V)$ contains a neighbourhood of $T0 = 0$ in $Y$. Since every neighbourhood $V$ of 0 contains a ball $B_r$ for sufficiently small $r$, it suffices to prove that given an arbitrary ball $B_r$, $T(B_r)$ contains a ball $B'_{r'}$. We do this in two steps: First we show that given $\varepsilon > 0$, there exists $\eta > 0$ such that the closure $\overline{T(B_\varepsilon)}$ of $T(B_\varepsilon)$ contains $B'_\eta$, and then we show that the same is true of $T(B_\varepsilon)$.

First note that since for all $\varepsilon > 0$, $\bigcup_{n=1}^{\infty} B_{n\varepsilon} = X$, and since $T$ is onto, we have

$$Y = T(X) = \bigcup_{n=1}^{\infty} T(B_{n\varepsilon}), \quad \text{all } \varepsilon > 0. \tag{4.1}$$

We are now in a position to invoke the Baire Category Theorem and conclude that at least one of the sets in the union on the right-hand side of (4.1) is not nowhere dense in $Y$, and consequently, its closure contains a ball. Specifically, there exist an integer $n$ and $r > 0$ such that

$$\overline{T(B_{n\varepsilon})} \supseteq y_0 + B'_r = B'(y_0,r). \tag{4.2}$$
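Moreover, since

$$B'_r \subseteq B'(y_0,r) - B'(y_0,r) = \{y \in Y : y = y_1 - y_2,\ y_1, y_2 \in B'(y_0,r)\},$$

from (4.2) we get

$$\overline{T(B_{n\varepsilon})} - \overline{T(B_{n\varepsilon})} \supseteq B'_r. \tag{4.3}$$

If $y$ belongs to the set on the left-hand side of (4.3), there are sequences $\{x_k\}$ and $\{x'_k\}$ of points in $B_{n\varepsilon}$ such that

$$y = \lim_{k\to\infty} Tx_k - \lim_{k\to\infty} Tx'_k = \lim_{k\to\infty} T(x_k - x'_k).$$

Since the points $x_k - x'_k \in B_{2n\varepsilon}$ for all $k$, by (4.3) we get that also

$$\overline{T(B_{2n\varepsilon})} \supseteq B'_r. \tag{4.4}$$

Finally, since $B_{2n\varepsilon} = 2nB_\varepsilon = \{x \in X : x = 2nx',\ x' \in B_\varepsilon\}$ and $T(2nB_\varepsilon) = 2nT(B_\varepsilon)$, (4.4) gives

$$\overline{T(B_\varepsilon)} \supseteq B'_\eta, \quad \eta = r/2n, \tag{4.5}$$

and our first assertion is true.

Next, given $\varepsilon > 0$, put $\varepsilon_n = \varepsilon/2^n$, $n = 1, 2, \ldots$, and for each $n$ let $\eta_n$ be the choice of $\eta$ corresponding to $\varepsilon_n$ in (4.5) above. Whence

$$\overline{T(B_{\varepsilon_n})} \supseteq B'_{\eta_n}, \quad n = 1, 2, \ldots \tag{4.6}$$

Clearly we may, and do, assume that $\lim_{n\to\infty} \eta_n = 0$. Put now $\eta = \eta_1$, and suppose $y \in B'_\eta$. By (4.6) also $y \in \overline{T(B_{\varepsilon_1})}$, and consequently, $y$ can be approximated as closely as we want by points in $T(B_{\varepsilon_1})$. In particular, there exists $x_1 \in B_{\varepsilon_1}$ such that $\|y - Tx_1\| < \eta_2$. In this case, since $y - Tx_1 \in B'_{\eta_2}$, we can find $x_2 \in B_{\varepsilon_2}$ such that $\|y - Tx_1 - Tx_2\| < \eta_3$. In general, having chosen $x_i \in B_{\varepsilon_i}$, $1 \le i \le k$, pick $x_{k+1} \in B_{\varepsilon_{k+1}}$ with the property that

$$\Big\|y - \sum_{i=1}^{k+1} Tx_i\Big\| < \eta_{k+2}. \tag{4.7}$$

We claim that $\sum x_k$ converges to a point $x \in B_\varepsilon$, and that $y = Tx$. If this is the case, then we have $T(B_\varepsilon) \supseteq B'_\eta$, and the second assertion is also true.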
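To see that $\sum x_k$ converges, since $X$ is complete it suffices to check that $\sum \|x_k\| < \infty$. Now, since $x_k \in B_{\varepsilon_k}$ for all $k$, it readily follows that $\sum_{k=1}^{\infty} \|x_k\| < \sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$. Whence $\sum_{k=1}^{\infty} x_k$ converges to an element $x \in X$ with $\|x\| < \varepsilon$. Also, by the continuity of $T$ we get that $T(\sum_k x_k) = Tx$, and, since $\lim_{k\to\infty} \eta_k = 0$, from (4.7) it is clear that

$$\|y - Tx\| = \lim_{k\to\infty} \Big\|y - \sum_{i=1}^{k} Tx_i\Big\| = 0.$$

Thus $y = Tx$, and we have finished. ■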
A word about the hypothesis of Theorem 4.1. The assumption that T is continuous is not essential, cf. 6.29 below, and the assumption that T is onto may be replaced by the assumption that the range R(T) of T is of second category in Y. On the other hand, this last assumption, as well as the completeness of X, are necessary for the map T to be open. We postpone the presentation of specific examples until after the proof of the Closed Graph Theorem. Now, concerning the inverses, we have
Corollary 4.2. Let $X$, $Y$ be Banach spaces and assume $T \in B(X,Y)$ is a one-to-one mapping from $X$ onto $Y$. Then $T^{-1}$ is a well-defined bounded linear mapping from $Y$ onto $X$.

Proof. By Theorem 4.1 there exists $\eta > 0$ such that

$$T(B_1) \supseteq B'_\eta. \tag{4.8}$$

Now, since $T^{-1}$ is well-defined, (4.8) is equivalent to

$$B_1 \supseteq T^{-1}(B'_\eta). \tag{4.9}$$

Thus, if $0 \ne y \in Y$, we have $(\eta/2\|y\|)y \in B'_\eta$, and by (4.9) it follows that

$$\|T^{-1}((\eta/2\|y\|)y)\| \le 1.$$

By the linearity of $T^{-1}$ we get $\|T^{-1}y\| \le (2/\eta)\|y\|$, and consequently, $T^{-1} \in B(Y,X)$, and $\|T^{-1}\| \le 2/\eta$. ■
Corollary 4.3. Suppose that the linear space $X$ is normed by $\|\cdot\|$ and by $\|\cdot\|_1$, and that, endowed with either norm, $X$ is complete. Then if for some constant $c$ we have

$$\|x\| \le c\,\|x\|_1, \quad \text{all } x \in X, \tag{4.10}$$

there is a constant $k$ such that

$$\|x\|_1 \le k\,\|x\|, \quad \text{all } x \in X. \tag{4.11}$$

More precisely, the norms are equivalent.
Proof. Let $T$ denote the identity map from $X$, normed by $\|\cdot\|_1$, into $X$, normed by $\|\cdot\|$; clearly $T$ is linear, one-to-one and onto, and, by (4.10), it is also continuous. By Corollary 4.2, $T^{-1} = T$ is also continuous, and (4.11) holds. ■

The completeness of $X$ under both norms is essential for (4.11) to be true. Referring to the construction following Proposition 2.4 in Chapter XIV, if $X$ is a Banach space, and if we introduce the norm $\|\cdot\|_\infty$ in $X$, then (4.10) holds. Now, if (4.11) were to hold, then $X$ normed by $\|\cdot\|_\infty$ would become a complete normed space, which it is not.
5. THE CLOSED GRAPH THEOREM

Many important operators in Analysis enjoy the following property: They are well-defined on a dense subspace of a normed linear space $X$, and yet fail to be continuous. For instance, put $I = [0,1]$, and let $X = Y = C(I)$, and $X_1 = C^1(I) \subset X$; $X_1$ is dense in the uniform norm of $X$, and a proof of this will be given in Corollary 2.3 in Chapter XVII. Consider now the linear operator $T: X_1 \to Y$ given by
$$Tf = f', \quad \text{or} \quad Tf(x) = f'(x), \quad \text{all } x \in I.$$
Thus $T$ is a densely defined operator, and it is clear that it is not bounded since the sequence $\{f_n\}$ consisting of the functions
$$f_n(x) = x^n, \quad n = 1, 2, \ldots$$
satisfies $\|Tf_n\| = n$ and $\|f_n\| = 1$ for all $n$.

The challenge is to incorporate operators such as $T$ into the theory we are developing, and to discover what properties they satisfy. Referring to the differentiation operator $T$, we are interested in considering sequences $\{f_n\} \subseteq C^1(I)$, $f_n \to f \in C(I)$, and the corresponding sequences $\{Tf_n\} \subseteq C(I)$. As observed above, $\{Tf_n\}$ need not converge, but when it does, i.e., if $\lim_{n\to\infty} Tf_n = g \in C(I)$, then the following is true: Since the sequence $\{f_n'\}$ converges uniformly to $g$ on $I$, the sequence $\{f_n\}$ converges uniformly to an antiderivative of $g$ on $I$. But since by assumption also $\{f_n\}$ converges uniformly to $f$ on $I$, it follows that $f \in C^1(I)$ and that $Tf = g$.

Because of the importance of this example we formalize these considerations into a definition. Let $X$, $Y$ be normed spaces, and let $T: D(T) \to Y$ be a linear mapping. We say that $T$ is closed in $X$, if for any sequence $\{x_n\} \subseteq D(T)$,
$$\lim_{n\to\infty} x_n = x \quad \text{and} \quad \lim_{n\to\infty} Tx_n = y \tag{5.1}$$
imply
$$x \in D(T) \quad \text{and} \quad Tx = y. \tag{5.2}$$
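The unboundedness of the differentiation operator noted above can be illustrated numerically. The following is a minimal sketch and not part of the original text; the helper names `sup_norm` and `norm_ratio` and the sampling grid are illustrative assumptions. It approximates the uniform norms of $f_n(x) = x^n$ and $f_n'(x) = n x^{n-1}$ on $[0,1]$ and returns the quotient $\|Tf_n\| / \|f_n\|$, which grows like $n$.

```python
def sup_norm(f, samples=1001):
    """Approximate the uniform norm of f on [0, 1] by sampling an even grid."""
    return max(abs(f(k / (samples - 1))) for k in range(samples))


def norm_ratio(n):
    """Return ||T f_n|| / ||f_n|| for f_n(x) = x**n and T f = f'."""
    f = lambda x: x ** n
    df = lambda x: n * x ** (n - 1)  # derivative of x**n
    return sup_norm(df) / sup_norm(f)


if __name__ == "__main__":
    # The ratio is not bounded by any fixed constant as n grows.
    for n in (1, 5, 25, 125):
        print(n, norm_ratio(n))
```

Since both maxima are attained at the grid point $x = 1$, the sampled norms agree with the true ones here; for other functions the grid would only give a lower estimate.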
As the differentiation mapping shows, not all closed operators are continuous. The opposite is also true; namely, not all continuous operators are closed. For instance, if $X_1$ is a proper dense subspace of a normed space $X = Y$, then the identity map $T: X_1 \to Y$ is obviously bounded, but not closed.

The Closed Graph Theorem, a close relative of the Open Mapping Theorem, establishes when a closed mapping is bounded. In order to prove it we find it convenient to consider a more "geometric" setting. First a definition. Let $X$, $Y$ be normed linear spaces and let $X \times Y$ be the linear space normed by $\|(x,y)\| = \|x\| + \|y\|$. Given a linear mapping $T: D(T) \to Y$, the graph $G(T)$ of $T$ is the set
$$G(T) = \{(x, Tx) : x \in D(T)\} \subseteq X \times Y.$$
Since $T$ is linear, $G(T)$ is a linear subspace of $X \times Y$. Now, when $T$ is closed, (5.1) and (5.2) imply that $G(T)$ is a closed subspace of $X \times Y$. The converse is also true: If $G(T)$ is closed in $X \times Y$, then $T$ is closed in $X$. Thus, the concepts $T$ closed and $G(T)$ closed are interchangeable. It is also clear that if $D(T)$ is a closed subspace of $X$ and $T$ is continuous, then $T$ is closed in $X$. The remarkable fact is that for Banach spaces the converse to this statement is also true. More precisely, we have
Theorem 5.1 (Closed Graph Theorem). Let $X$, $Y$ be Banach spaces, and suppose $T: X \to Y$ is linear. If $T$ is closed in $X$, then $T$ is continuous in $X$.

Proof. Since $X \times Y$ is a complete normed space, and since by assumption $G(T)$ is a closed subspace of $X \times Y$, $G(T)$ is also a Banach space in the norm induced by that of $X \times Y$. Consider now the (projection) linear mapping $P: G(T) \to X$ given by
$$P((x, Tx)) = x, \quad x \in X.$$
Note that in addition to being linear, $P$ is one-to-one and onto $X$. Moreover, since also
$$\|P((x,Tx))\| = \|x\| \le \|x\| + \|Tx\| = \|(x,Tx)\|,$$
$P$ is also bounded. Thus by Corollary 4.2, the inverse $P^{-1}: X \to G(T)$ of $P$ given by
$$P^{-1}x = (x, Tx), \quad x \in X,$$
is also bounded. Specifically, there exists a constant $c$ such that
$$\|(x,Tx)\| = \|x\| + \|Tx\| \le c\,\|x\|, \quad x \in X.$$
This clearly implies that $\|Tx\| \le c\,\|x\|$, i.e., that $T$ is bounded. ∎
For the validity of the Closed Graph Theorem it is essential that both the domain $X$ and the target space $Y$ be complete, as may be seen by the following examples. As pointed out above, the differentiation operator $T: C^1(I) \to C(I)$ is closed but not bounded; in this case the domain $X = C^1(I)$ is not complete. An example along similar lines, roughly speaking it corresponds to differentiation of Fourier series, is the following: Let $\ell^1 = \{(a_n) : \|(a_n)\|_1 = \sum_{n=1}^{\infty} |a_n| < \infty\}$, and
$$X = \Big\{(a_n) \in \ell^1 : \sum_{n=1}^{\infty} n\,|a_n| < \infty\Big\}.$$
Since $X$ is a proper dense subspace of $\ell^1$, it is not complete. Let now $T: X \to \ell^1$ be the mapping defined by
$$T(a_n) = (n\,a_n).$$
It is easy to check that $T$ is well-defined and closed, and since for the sequence $e_n \in \ell^1$ which has a 1 in the $n$th position and zeros elsewhere we have
$$\|Te_n\|_1 = n, \quad \|e_n\|_1 = 1,$$
it follows that $T$ is not bounded. Now, since $T$ is also one-to-one and onto $\ell^1$, it has a well-defined inverse $T^{-1}: \ell^1 \to X$; in fact, we have
$$T^{-1}(a_n) = (a_n/n).$$
Observe that indeed $T^{-1}$ is defined on the whole of $\ell^1$ since $x \in \ell^1$ implies $T^{-1}x \in X$, and
$$\|T^{-1}x\|_1 \le \|x\|_1.$$
In fact, the above remark shows that $T^{-1}$ is bounded, and putting $x = e_1$ there we also get that $\|T^{-1}\| = 1$. Moreover, since $x = T^{-1}(Tx)$, it
follows that $T^{-1}$ is onto $X$. Now, $T^{-1}$ is not open, for if it were open, then $(T^{-1})^{-1} = T$ would be continuous, and, as we saw above, this is not the case. Thus, for the validity of the Open Mapping Theorem it is essential that the target space be complete.

Consider next an infinite dimensional Banach space $X$, let $H$ be a (necessarily infinite) Hamel basis for $X$ such that $\|h\| = 1$ for all $h \in H$, and let $\|\cdot\|_1$ be the norm on $X$ given by
$$\|x\|_1 = \sum_{i=1}^{n} |a_i|, \quad x = \sum_{i=1}^{n} a_i h_i, \quad h_i \in H, \ 1 \le i \le n.$$
Let $Y$ denote the linear space $X$ endowed with the norm $\|\cdot\|_1$, and let $T: X \to Y$ be the identity map. $T$ is one-to-one, onto and, as it is readily verified, closed. However $T$ is not bounded, for otherwise the fact that $X$ is complete would imply that $Y$ is also complete, and this is not the case. Thus, for the Closed Graph Theorem to be true it is essential that the target space $Y$ be complete. Now, $T^{-1}: Y \to X$ is also the identity, and as such it is one-to-one and onto. Moreover, since
$$\|x\| \le \|x\|_1, \quad \text{all } x \in X,$$
$T^{-1}$ is also bounded. However, $T^{-1}$ is not open, for if it were open, then $(T^{-1})^{-1} = T$ would be continuous, and as pointed out above, this is not the case. Thus, for the Open Mapping Theorem to be true, the domain must be complete.
6. PROBLEMS AND QUESTIONS

6.1 Show that any continuous function on $[0,1]$ can be approximated uniformly and arbitrarily closely by a piecewise linear continuous function.

6.2 Let $\phi(x)$ denote the function that assigns to each real $x$ the distance from $x$ to the nearest integer. Prove that for appropriately chosen sequences $(c_n)$ and $(k_n)$ the function given by the uniformly convergent series $\sum_{n=1}^{\infty} c_n \phi(k_n x)$ is nowhere differentiable.

6.3 Let $X$, $Y$ be Banach spaces and $L(x,y)$ be a functional on $X \times Y$, continuous and linear in each variable separately. Prove that $L$ is continuous at $(0,0)$, and consequently everywhere.
6.4 If $X$ is a finite dimensional normed linear space, prove that every linear operator $T: X \to X$ is bounded.

6.5 Let $X, Y \neq \{0\}$ be normed linear spaces and suppose the dimension of $X$ is infinite. Show that there is at least one unbounded linear operator $T: X \to Y$.

6.6 If $T \in B(X,Y)$, $T \neq 0$, and $\|x\| < 1$, then $\|Tx\| < \|T\|$. Is it also true that $\|Tx\| < \|x\|$?

6.7 Suppose $T, T_n \in B(X,Y)$, $n = 1, 2, \ldots$, $\sup \|T_n\| < \infty$, and $\lim_{n\to\infty} \|T_n x - Tx\| = 0$ for every $x$ in a dense subset of $X$. Does it follow that $\lim_{n\to\infty} \|T_n - T\| = 0$?

6.8 Suppose $0 \neq T \in B(X)$ and $\{x_n\} \subseteq X$ has the property that $\lim_{n\to\infty} \|x_n\| = \infty$. Does it follow that $\lim_{n\to\infty} \|Tx_n\| = \infty$?

6.9 Suppose $T_1, T_2 \in B(X)$. Show that $T_1T_2 \in B(X)$, and $\|T_1T_2\| \le \|T_1\|\,\|T_2\|$.
6.10 Prove that if $X$ is a Banach space and $T \in B(X)$ and $\|T\| < 1$, then the "geometric series" $I + T + \cdots + T^n + \cdots$ converges in $B(X)$. What does it converge to?

6.11 Referring to 6.10, the condition $\|T\| < 1$ is not necessary for $I + T + \cdots + T^n + \cdots$ to converge in $B(X)$. For, suppose
$$\lim_{n\to\infty} \sqrt[n]{\|T^n\|} = L$$
exists. Show that if $L < 1$ the above series converges and if $L > 1$ it does not. Further, prove that a necessary and sufficient condition for the series to converge is that for some $k$ we have $\|T^k\| < 1$.

6.12 (Banach) Let $X$ be a Banach space and suppose $T \in B(X)$ is such that $\|T\| \le \eta < 1$. Prove that the operator $I - T$ has a continuous inverse $(I - T)^{-1}$ and $\|(I - T)^{-1}\| \le 1/(1 - \eta)$.

6.13 Let $T_0 \in B(X,Y)$, where $X$ and $Y$ are Banach spaces, and suppose $T_0$ has a bounded inverse $T_0^{-1} \in B(Y,X)$. Show that if an operator $T \in B(X,Y)$ satisfies $\|T\| < 1/\|T_0^{-1}\|$, then the operator $U = T_0 + T: X \to Y$ has a continuous inverse and
6.14 Let $X$, $Y$ be normed linear spaces and let $T_0$ be a mapping from a subset $M$ of $X$ into $Y$. Show that a necessary and sufficient condition for $T_0$ to have a bounded linear extension to the span of $M$ is that there exists a constant $k$ such that
$$\Big\|\sum_{i=1}^{m} \lambda_i T_0 x_i\Big\| \le k\,\Big\|\sum_{i=1}^{m} \lambda_i x_i\Big\|$$
for any $x_1, \ldots, x_m$ in $M$ and scalars $\lambda_1, \ldots, \lambda_m$.
6.15 Let $X$, $Y$ be normed linear spaces and suppose $Y$ is complete. Show that every continuous linear operator $T_0$ from a subset $M$ of $X$ into $Y$ has a unique continuous linear extension $T$ to the closure of $M$ into $Y$, and $\|T\| = \|T_0\|$. In particular, prove that if a continuous linear operator $T$ from a normed linear space $X$ into a Banach space $Y$ maps a dense subset of $X$ into 0, then $Tx = 0$ for all $x \in X$.

6.16 Let $X$, $Y$ be Banach spaces and $T: X \to Y$, $T$ linear. Prove that if $L \circ T \in X^*$ for every $L \in Y^*$, then $T \in B(X,Y)$.

6.17 Suppose $X$, $Y$ are Banach spaces and $\{T_n\} \subseteq B(X,Y)$. Prove that if for each $L \in Y^*$ we have
$$\sup_n |LT_n x| < \infty, \quad \text{all } x \in X,$$
then $\sup_n \|T_n\| < \infty$.
6.18 Assume $\{T_n\} \subseteq B(X)$ and $\lim_{n\to\infty} T_n x = Tx$ exists for each $x$ in $X$. Show that $T$ is a bounded linear operator on $X$ and that $\|T\| \le \limsup_{n\to\infty} \|T_n\|$.

6.19 Let $X$ be a Banach space, and $T, T_1, T_2, \ldots$ be bounded linear operators defined on $X$ with the property that $\lim_{n\to\infty} T_n x = Tx$ for all $x \in X$. Prove that there exists a constant $c > 0$ such that $\sup_n \|T_n\| \le c$.

6.20 As an application of the Uniform Boundedness Principle show that the space $X$ of polynomials $p(x) = \sum_{n=0}^{\infty} a_n x^n$, where $a_n = 0$ for all but finitely many $n$'s, normed by $\|p\| = \max_n |a_n|$, is not complete.

6.21 Let $X$ be a normed linear space, $Y$ a Banach space, and suppose $T \in B(X,Y)$. If $N$ denotes the null space of $T$ we may define a map $T^*: X/N \to Y$ as follows: For each class $x + N$, let $T^*(x + N) = Tx$. Prove that $T^* \in B(X/N, Y)$, and that if $X$ is a Banach space, then $T^*$ is an isomorphism.
6.22 Let $X$ be an infinite-dimensional Banach space, and $\{x_n\}$ a linearly independent set in $X$. Show that for each $n = 1, 2, \ldots$, the linear span of $\{x_1, \ldots, x_n\}$ is a nowhere dense subset of $X$. As a consequence of this result prove that the dimension of every infinite-dimensional Banach space is at least $\aleph_1$.

6.23 Prove that if $X$ is an infinite-dimensional Banach space, there is an embedding of $\ell^\infty$ into $X$.

6.24 Prove that every separable Banach space is isomorphic to some quotient space of $\ell^1$.

6.25 Let $1 < p < \infty$. If $\sum_n a_n b_n$ converges for every sequence $(b_n)$ such that $\sum_n |b_n|^p < \infty$, prove that $\sum_n |a_n|^q < \infty$, where $q$ is the index conjugate to $p$.

6.26 Prove that there is no sequence of positive real numbers $(a_n)$ such that $\sum_n a_n |b_n|$ converges iff the sequence $(b_n)$ is bounded.

6.27 Let $X$ be a Banach space, and assume $Y$ and $Z$ are closed subspaces of $X$. If each $x \in X$ has a unique representation of the form $x = y + z$ with $y \in Y$ and $z \in Z$, show that there exists a constant $c$ such that for all $x = y + z$ we have $\|y\|, \|z\| \le c\,\|x\|$.

6.28 Let $I = [0,1]$. Show that $L^p(I)$ is properly contained in $L(I)$ for each $p > 1$, and conclude that in the metric of $L(I)$, $L^p(I)$ is a set of first category in $L(I)$.

6.29 Let $X$ be a Banach space and $Y$ a normed linear space of second category. Prove that if the linear mapping $T: X \to Y$ is closed and onto, then $T$ takes open sets into open sets.

6.30 Let $I$ be a compact interval of the line, and $K$ a closed subinterval of $I$ with the following property: For every function $g \in C(K)$ there exists a function $f \in C(I)$ such that $f|_K = g$. Show that there exists a constant $c > 0$ with the property that for every $g \in C(K)$ there exists $f \in C(I)$ such that $f|_K = g$, and $\max_I |f| \le c \max_K |g|$.
6.31 Let $I = [0,1]$ and suppose $A$ is a closed subspace of $C(I)$. Suppose that for each $f \in A$, $Tf = \phi f \in A$, where $\phi$ is a real-valued function defined on $I$. Show that $T$ is continuous in the norm of $A$. Is $\phi$ necessarily continuous?

6.32 Let $X$ be a Banach space, and suppose that $T_1$, $T_2$ are linear operators from $X$ into itself, and that $T_2 \in B(X)$ is also one-to-one. Show that $T_1 \in B(X)$ iff $T_2T_1 \in B(X)$.
6.33 Let $I = [0,1]$, $1 < p < \infty$, and suppose $T$ is the linear operator defined on $L^p(I)$ by $Tf(x) = \int_I k(x,y) f(y)\,dy$, $x \in I$. Specifically, assume that for almost every $x \in I$, the function $k(x,y)f(y)$ is integrable as a function of $y \in I$, and $Tf \in L^p(I)$. Prove that $T$ is bounded.

6.34 Let $X$, $Y$ be normed linear spaces and suppose $T: X \to Y$. Prove that if $T$ is closed and one-to-one, then $T^{-1}$ is also closed. Also, if $T \in B(X,Y)$ and the domain $D(T)$ of definition of $T$ is closed, then $T$ is closed.

6.35 Suppose $X$, $Y$ are Banach spaces, and let $T \in B(X,Y)$. Show that $R(T)$ is closed in $Y$ iff there exists a constant $c > 0$ such that $\inf\{\|x - y\| : Ty = 0\} \le c\,\|Tx\|$ for all $x \in X$.

6.36 Let $Y$ be a subspace of a normed linear space $X$. A mapping $P \in B(X,Y)$ is said to be a projection of $X$ onto $Y$ if $P$ maps $X$ onto $Y$ and $P^2 = P$. Suppose now $Y$ is a closed subspace of a Banach space $X$. Show that there exists a projection $P$ of $X$ onto $Y$ iff there exists a closed subspace $Z$ of $X$ such that $X = Y \oplus Z$, i.e., $Y \cap Z = \{0\}$ and every $x \in X$ can be written uniquely as $x = y + z$ with $y$ in $Y$ and $z$ in $Z$. If this is the case, there also exists a constant $k > 0$ such that
$$\|y + z\| \ge k\,\|y\|, \quad y \in Y, \ z \in Z.$$

6.37 Let $X$, $Y$ be normed linear spaces and suppose $T: D(T) \subseteq X \to Y$. $T$ is said to be closable if there exists a linear extension of $T$ to all of $X$ which is closed in $X$; the domain $D(T)$ of $T$ is not required to be dense in $X$. Prove that the following are equivalent:
(i) $T$ is closable.
(ii) For any $y \neq 0$ in $Y$, $(0,y)$ is not in the closure of the graph of $T$.
(iii) $T$ has a minimal closed linear extension, i.e., there exists a closed linear extension $T^*$ of $T$ such that any closed linear extension of $T$ is a closed linear extension of $T^*$.

6.38 Let $X$, $Y$ be Banach spaces and $T: X \to Y$ be closed. Prove that $T$ has a bounded inverse iff $T$ is one-to-one and has a closed inverse.
CHAPTER
XVI
Hilbert Spaces
In this chapter we consider those normed spaces where, as in the case of the Euclidean space $\mathbb{R}^n$, there is an inner product which is connected with the norm by a simple relation: The square of the norm of an element is the inner product of that element with itself. Some examples to keep in mind are, of course, $\mathbb{R}^n$ and $\mathbb{C}^n$, and what we will have the occasion to verify is a universal model of a Hilbert space, $L^2(\mu)$.
1. THE GEOMETRY OF INNER PRODUCT SPACES

A complex vector space $X$ is said to be an inner product space provided there is a complex-valued map defined on $X \times X$ denoted by $\langle \cdot, \cdot \rangle$, and called an inner product on $X$, or plainly an inner product, which satisfies the following properties:
(i) $\langle x_1 + \lambda x_2, y \rangle = \langle x_1, y \rangle + \lambda \langle x_2, y \rangle$, all $x_1, x_2$ in $X$, and $\lambda \in \mathbb{C}$.
(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$, all $x, y$ in $X$.
(iii) $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ iff $x = 0$.
Of course we may also consider real inner product spaces, and in this case we restrict our attention to $\lambda \in \mathbb{R}$, and require that $\langle \cdot, \cdot \rangle$ be real valued. Property (i) is known as the linearity of the inner product, and together with (ii) it implies "conjugate linearity" in the second variable, to wit,
(iv) $\langle x, \lambda y \rangle = \bar{\lambda} \langle x, y \rangle$, all $x, y \in X$ and $\lambda \in \mathbb{C}$.
An immediate consequence of these properties is that $\langle x, 0 \rangle = \langle 0, x \rangle = 0$ for all $x \in X$. Inner product spaces satisfy an important inequality, which we prove next.
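Proposition 1.1 (The Cauchy-Schwarz Inequality). For any $x, y$ in $X$ we have
$$|\langle x, y \rangle|^2 \le \langle x, x \rangle\,\langle y, y \rangle. \tag{1.1}$$

Proof. Note that for any $x, y$ in $X$ and scalars $\lambda$, by (iii), (iv) and (ii) above it follows that
$$\begin{aligned}
0 \le \langle x + \lambda y, x + \lambda y \rangle
&= \langle x, x \rangle + \langle x, \lambda y \rangle + \langle \lambda y, x \rangle + \langle \lambda y, \lambda y \rangle \\
&= \langle x, x \rangle + \bar{\lambda} \langle x, y \rangle + \lambda \langle y, x \rangle + \lambda \bar{\lambda} \langle y, y \rangle \\
&= \langle x, x \rangle + \bar{\lambda} \langle x, y \rangle + \lambda \overline{\langle x, y \rangle} + |\lambda|^2 \langle y, y \rangle \\
&= \langle x, x \rangle + 2\,\mathrm{Re}\big(\bar{\lambda} \langle x, y \rangle\big) + |\lambda|^2 \langle y, y \rangle.
\end{aligned}$$
Now, if $y = 0$, (1.1) is trivially true since both sides there are 0. On the other hand, if $y \neq 0$, then $\langle y, y \rangle > 0$, and putting $\lambda = -\langle x, y \rangle / \langle y, y \rangle$ the above estimate becomes
$$\langle x, x \rangle - \frac{2\,|\langle x, y \rangle|^2}{\langle y, y \rangle} + \frac{|\langle x, y \rangle|^2}{\langle y, y \rangle^2}\,\langle y, y \rangle = \langle x, x \rangle - \frac{|\langle x, y \rangle|^2}{\langle y, y \rangle} \ge 0,$$
which is clearly equivalent to (1.1). ∎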
IIxll=~,
(1.2)
xEX,
then X is turned into a normed space. Indeed, with the exception of the triangle inequality, all other properties of the norm follow at once from (1.2) and the properties of the inner product. As for the triangle inequality, given x, y EX, observe that
IIx + Yll2 = (x + y, x + y) = IIxll2 + 2~(x, y) + lIyll2 ,
(1.3)
which, by the Cauchy-Schwarz inequality, is dominated by
IIxll 2+ 211xllilyll + lIyll2 = (lIxll + lIyl1)2 . we get IIx + Yl12 :::; (lIxll + lIy11)2, which is
In other words, equivalent to the triangle inequality. An easy consequence of the Cauchy-Schwarz inequality is the continuity of the inner product. More precisely, if Xn -+ x and Yn -+ y, then \Xn, Yn) -+ (x, y). This observation follows from the estimate
l{xn,Yn) - (x,y)1 = l(xn,Yn) ± (xn'Y) - (x,y)1 :::; I(xn, Yn - y)1 + I(xn - x, y}1
:::; IfxnllllYn - yll + IIx n
-
xlillyll,
1.
Geometry of Inner Product Spaces
319
and the fact that convergent sequences are bounded.

An inner product space $X$ endowed with the norm introduced in (1.2) is said to be a pre-Hilbert space. A complete pre-Hilbert space is called a Hilbert space. For instance, $L^2(\mu)$ is a Hilbert space with inner product given by
$$\langle f, g \rangle = \int_X f\,\bar{g}\,d\mu, \quad f, g \in L^2(\mu).$$
$\ell^2$ is the prototype of a Hilbert space. It was introduced by Hilbert in the early 1900's in his work on integral equations. The axiomatic definition of a Hilbert space was not given until much later by J. von Neumann (1903-1957) in the mid 1920's, in a paper dealing with the mathematical foundations of Quantum Mechanics.

In what follows we restrict our attention to Hilbert spaces since, as we show next, every inner product space can be "completed", and the completion is a Hilbert space, unique up to isomorphisms. In the present context, a linear mapping $T: X \to Y$ from an inner product space $X$ onto another inner product space $Y$ over the same field of scalars is said to be an isomorphism if it preserves inner products, i.e.,
$$\langle Tx, Ty \rangle = \langle x, y \rangle, \quad \text{all } x, y \in X.$$
Thus, isomorphisms of inner product spaces preserve their whole structure, including inner products and norms.

Proposition 1.2. Suppose $X$ is an inner product space. Then there exist a Hilbert space $Y$ and an isomorphism $T$ of $X$ onto a dense subspace of $Y$. The space $Y$ is unique up to isomorphisms.

Proof. Because the proof follows along familiar lines we only sketch it. By Theorem 3.5 in Chapter XIV there exist a unique, up to isometries, Banach space $Y$ and an isometric (in the linear sense) isomorphism $T$ of $X$ onto a dense subspace of $Y$. Consider now the complex-valued map $\langle \cdot, \cdot \rangle_1$ defined on $Y \times Y$ as follows: If $x, y \in Y$, and $\{x_n\}$ and $\{y_n\}$ are sequences of elements in $X$ that converge in $Y$ to $x$ and $y$ respectively, let
$$\langle x, y \rangle_1 = \lim_{n\to\infty} \langle x_n, y_n \rangle.$$
By the continuity of the inner product it is not hard to see that $\langle \cdot, \cdot \rangle_1$ is well-defined, i.e., it is independent of the approximating sequences chosen, and that it is an inner product on $Y$. Also, by (1.2), it follows that $T$ is an isomorphism of $X$ onto a dense subspace of $Y$. ∎
The notion of Hilbert space is an immediate generalization of Euclidean space, so its "geometry" approaches Euclidean geometry more closely than that of other Banach spaces. For instance, the "parallelogram law" holds: For any $x, y \in X$ we have
$$\|x + y\|^2 + \|x - y\|^2 = 2\,\|x\|^2 + 2\,\|y\|^2. \tag{1.4}$$
Indeed, (1.4) follows at once by adding (1.3) and the expression we obtain replacing $y$ by $-y$ there. In fact, even more is true.

Proposition 1.3. A normed linear space $X$ is an inner product space iff the "parallelogram law" holds.

Proof. It only remains to check that if (1.4) holds, then $X$ is an inner product space. The proof of this result is entirely computational since we can exhibit explicitly the inner product $\langle \cdot, \cdot \rangle$ associated to the norm $\|\cdot\|$ in $X$: It is given by the expression
$$\langle x, y \rangle = \tfrac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) + \tfrac{i}{4}\big(\|x + iy\|^2 - \|x - iy\|^2\big). \tag{1.5}$$
When $X$ is a real normed space the second summand above is omitted. The details of the straightforward, and tedious, computation needed to verify the properties of the inner product are left to the reader. ∎

An interesting application of Proposition 1.3 is that the Lebesgue $L^p$ spaces are inner product spaces only if $p = 2$. For instance, in the case of $\ell^p$, consider $x = (1, 1, 0, 0, \ldots)$, and $y = (1, -1, 0, 0, \ldots)$. We then have
$$\|x\|_p = \|y\|_p = 2^{1/p}, \quad \|x + y\|_p = \|x - y\|_p = 2,$$
and for these elements (1.4) holds iff $p = 2$.

Another important notion is that of orthogonality. Elements $x, y$ in an inner product space $X$ are said to be orthogonal, and we write $x \perp y$, when $\langle x, y \rangle = 0$. If $x \in X$ is orthogonal to each element of a subset $A$ of $X$, then $x$ is said to be orthogonal to $A$, and we write $x \perp A$. In this context we have

Proposition 1.4 (Pythagorean Theorem). Suppose $\{x_i\}_{i=1}^{n}$ is a collection of pairwise orthogonal elements of $X$. Then
$$\Big\|\sum_{i=1}^{n} x_i\Big\|^2 = \sum_{i=1}^{n} \|x_i\|^2. \tag{1.6}$$
Proof. By definition, the left-hand side of (1.6) equals
$$\Big\langle \sum_{i=1}^{n} x_i, \sum_{j=1}^{n} x_j \Big\rangle = \sum_{i,j=1}^{n} \langle x_i, x_j \rangle.$$
Also, since the $x_i$'s are pairwise orthogonal, the sum on the right-hand side above equals $\sum_{i=1}^{n} \|x_i\|^2$. ∎

Our next goal is to explore whether given a subspace $M$ of a Hilbert space $X$ and $x \in X \setminus M$, we can find $y \in M$ such that
$$d(x, M) = \inf\{\|x' - x\| : x' \in M\} = \|x - y\|. \tag{1.7}$$
This question is related to that of dropping a perpendicular from $x$ to $M$, or "projecting" $x$ onto $M$. Simple examples in $\mathbb{R}^2$ when $M$ is an open segment or an arc show that there may exist no points $y \in M$ which satisfy (1.7), or that there may exist infinitely many such $y$'s. The following result handles the difficulties raised by these examples.
Proposition 1.5 (Existence of the Minimizing Element). Let $X$ be an inner product space and $M$ a nonempty, complete, convex subset of $X$. Then for every $x \in X$ there exists a unique $y \in M$ such that
$$d(x, M) = \|x - y\|. \tag{1.8}$$

Proof. If $x \in M$, then (1.8) holds with $y = x$. Otherwise, if $x \notin M$, let $\{y_n\} \subseteq M$ be a minimizing sequence, i.e.,
$$\lim_{n\to\infty} \|x - y_n\| = d(x, M). \tag{1.9}$$
We claim that the sequence $\{y_n\}$ is Cauchy. Indeed, by the parallelogram law we have
$$\begin{aligned}
\|y_n - y_m\|^2 &= \|(y_n - x) - (y_m - x)\|^2 \\
&= 2\,\|y_n - x\|^2 + 2\,\|y_m - x\|^2 - \|(y_n - x) + (y_m - x)\|^2 \\
&= 2\,\|y_n - x\|^2 + 2\,\|y_m - x\|^2 - 4\,\|(y_n + y_m)/2 - x\|^2.
\end{aligned}$$
Now, since $M$ is convex it follows that $(y_n + y_m)/2 \in M$, and consequently, $d(x, M) \le \|(y_n + y_m)/2 - x\|$. Thus the above estimate becomes
$$\|y_n - y_m\|^2 \le 2\,\|y_n - x\|^2 + 2\,\|y_m - x\|^2 - 4\,d(x, M)^2, \tag{1.10}$$
and, in view of the choice of the $y_n$'s, the right-hand side of (1.10) goes to 0 as $n, m \to \infty$. But this implies that the sequence $\{y_n\} \subseteq M$ is Cauchy and, since $M$ is complete, that it converges to a limit $y \in M$, say. Furthermore, passing to the limit in (1.9) it follows at once that (1.8) holds. It only remains to check the uniqueness of $y$. Suppose $y$ and $y'$ are elements of $M$ that satisfy (1.8). Since (1.10) is actually true for arbitrary elements of $M$, by setting $y_n = y$ and $y_m = y'$ there, we get that $\|y - y'\| \le 0$. Thus $y = y'$. ∎

Turning from arbitrary convex sets to subspaces we obtain the result alluded to above concerning projections. But first a definition. Given a subset $A$ of an inner product space $X$, let the orthogonal complement $A^\perp$ of $A$ be the set of all elements of $X$ orthogonal to $A$, to wit,
$$A^\perp = \{x \in X : x \perp y \text{ for all } y \in A\}. \tag{1.11}$$
An elementary and important property of the orthogonal complement is

Proposition 1.6. $A^\perp$ is a closed subspace of $X$.

Proof. First observe that if $x_1, x_2 \in A^\perp$, $\lambda$ is a scalar and $y \in A$, then we have
$$\langle x_1 + \lambda x_2, y \rangle = \langle x_1, y \rangle + \lambda \langle x_2, y \rangle = 0,$$
and consequently, $x_1 + \lambda x_2 \in A^\perp$. Next suppose that $\{x_n\} \subseteq A^\perp$ and $\lim_{n\to\infty} x_n = x$. Now, if $y \in A$, we get that
$$|\langle x, y \rangle| = |\langle x - x_n, y \rangle| \le \|x - x_n\|\,\|y\| \to 0 \quad \text{as } n \to \infty.$$
Thus, $x \in A^\perp$, and $A^\perp$ is closed. ∎
We are now ready to prove a result of fundamental importance in the theory of Hilbert spaces, the projection theorem.

Theorem 1.7. Let $X$ be a Hilbert space and $M$ a complete subspace of $X$. Then every element $x \in X$ can be expressed in the form
$$x = x_1 + x_2, \quad x_1 \in M, \ x_2 \in M^\perp. \tag{1.12}$$
Furthermore, the representation is unique, and
$$\|x\|^2 = \|x_1\|^2 + \|x_2\|^2. \tag{1.13}$$
Proof. If $x \in M$ we put $x_1 = x$ and $x_2 = 0$. Otherwise, let $x_1$ be the unique element of $M$ which satisfies (1.8), i.e.,
$$\|x - x_1\| = d(x, M). \tag{1.14}$$
Next we verify that $x_2 = x - x_1$ is orthogonal to $M$, and so it belongs to $M^\perp$. Let $0 \neq y \in M$ and observe that since for each scalar $\lambda$ also $x_1 + \lambda y \in M$, it readily follows that
$$\|x - x_1\|^2 \le \|x - (x_1 + \lambda y)\|^2. \tag{1.15}$$
Now, by a familiar argument using (1.14), (1.15) may be rewritten
$$-\bar{\lambda} \langle x_2, y \rangle - \lambda \langle y, x_2 \rangle + |\lambda|^2 \langle y, y \rangle \ge 0.$$
In particular, when $\lambda = \langle x_2, y \rangle / \langle y, y \rangle$ we conclude that
$$-\frac{|\langle x_2, y \rangle|^2}{\langle y, y \rangle} - \frac{|\langle x_2, y \rangle|^2}{\langle y, y \rangle} + \frac{|\langle x_2, y \rangle|^2}{\langle y, y \rangle} \ge 0,$$
that is, $|\langle x_2, y \rangle|^2 \le 0$. But, this can only happen when $x_2 \perp y$, and consequently $x_2 \in M^\perp$. By the Pythagorean theorem, (1.13) is then true.

Finally we prove that the representation in (1.12) is unique. For, if we also have
$$x = x_1' + x_2', \quad x_1' \in M, \ x_2' \in M^\perp,$$
then, comparing this with (1.12), we get
$$x_1 - x_1' = x_2' - x_2.$$
Now, the element on the left-hand side above belongs to $M$, while that on the right-hand side belongs to $M^\perp$. Thus,
$$\langle x_1 - x_1', x_2' - x_2 \rangle = \langle x_1 - x_1', x_1 - x_1' \rangle = 0,$$
and $x_1 - x_1' = 0$. This implies that also $x_2' - x_2 = 0$, and we are done. ∎
The elements $x_1 \in M$ and $x_2 \in M^\perp$ uniquely determined by $x$ are called the projections of $x$ onto $M$ and $M^\perp$ respectively. The operator $P_M: X \to M$ given by $P_M x = x_1$ is called the projection on $M$; it is not hard to see that $P_M$ is a bounded linear operator onto $M$, and that $\|P_M\| \le 1$. As we describe in Section 2, projection operators play an important role in the description of basic properties of the subspaces of $X$.

Theorem 1.7 may be applied to characterize the bounded linear functionals on a Hilbert space.
Theorem 1.8 (F. Riesz). Let $X$ be a Hilbert space, and suppose $L$ is a bounded linear functional defined on $X$. Then there exists a unique $y \in X$ such that
$$Lx = \langle x, y \rangle, \quad \text{all } x \in X. \tag{1.16}$$
Furthermore, $\|L\| = \|y\|$.
Proof. Let $M$ be the null space of $L$, i.e., $M = \{x \in X : Lx = 0\}$; since $L$ is continuous, $M$ is a closed subspace of $X$, cf. 4.15 in Chapter XIV. If $M = X$, then we choose $y = 0$, and we are done. Otherwise, let $x_0 \notin M$, and note that by Theorem 1.7 there is an element $z = (1/\|x_0 - P_M x_0\|)(x_0 - P_M x_0)$ in $M^\perp$ with $\|z\| = 1$. Now, given $x \in X$, put $x_1 = (Lz)x - (Lx)z$, and observe that
$$Lx_1 = L((Lz)x) - L((Lx)z) = Lz\,Lx - Lx\,Lz = 0,$$
i.e., $x_1 \in M$. Furthermore, since $z \in M^\perp$ and $\|z\| = 1$, we get
$$\langle (Lz)x - (Lx)z, z \rangle = Lz\,\langle x, z \rangle - Lx\,\langle z, z \rangle = Lz\,\langle x, z \rangle - Lx = 0.$$
In other words, we have
$$Lx = Lz\,\langle x, z \rangle = \langle x, \overline{(Lz)}\,z \rangle,$$
and (1.16) holds with $y = \overline{(Lz)}\,z$.

Suppose now there is another point $y'$, say, such that $Lx = \langle x, y' \rangle$ for all $x \in X$. Then $\langle x, y - y' \rangle = 0$ for all $x \in X$, and by taking $x = y - y'$ we find that $\|y - y'\| = 0$, and so $y = y'$.

Finally, about the norm. Since $\|z\| = 1$ we have $\|y\| = |Lz|\,\|z\| \le \|L\|$. On the other hand,
$$|Lx| = |\langle x, y \rangle| \le \|x\|\,\|y\|, \quad \text{all } x \in X,$$
and consequently, $\|L\| \le \|y\|$. We have thus proved that $\|L\| = \|y\|$, and we have finished. ∎
A well-known property of finite-dimensional linear spaces is that of being algebraically reflexive. It is therefore natural to consider whether arbitrary Hilbert spaces are reflexive. In order to answer this question we begin by showing that $X^*$, which we already know to be complete, is an inner product space as well.

Proposition 1.9. If $X$ is a Hilbert space, then $X^*$ is a Hilbert space.
Proof. Consider the mapping $T: X \to X^*$ defined by $Tx = \langle \cdot, x \rangle$, i.e., $Tx$ is the bounded linear functional on $X$ given by
$$(Tx)y = \langle y, x \rangle, \quad \text{all } y \in X. \tag{1.17}$$
It is apparent that $T$ is one-to-one and that, by Theorem 1.8, it is also norm preserving and onto. Observe that (1.17) may be rewritten
$$Ly = \langle y, T^{-1}L \rangle, \quad \text{for each } L \in X^*, \ y \in X. \tag{1.18}$$
Now, $T$ establishes an equivalence between $X$ and $X^*$ at the level of sets, but not as linear spaces, since $T$ is not linear but rather conjugate linear. More precisely, we have
$$T(x_1 + \lambda x_2) = Tx_1 + \bar{\lambda}\,Tx_2, \quad \text{all } x_1, x_2 \text{ in } X, \text{ scalars } \lambda.$$
Nevertheless, this property of $T$ enables us to introduce an inner product $\langle \cdot, \cdot \rangle_*$ on $X^*$ as follows: Given $L_1, L_2 \in X^*$, let
$$\langle L_1, L_2 \rangle_* = \langle T^{-1}L_2, T^{-1}L_1 \rangle. \tag{1.19}$$
A straightforward computation gives that $\langle \cdot, \cdot \rangle_*$ is an inner product on $X^*$, and that the norm and inner product on $X^*$ are related by (1.2). Thus $X^*$ is a Hilbert space. ∎

We are now ready to show

Proposition 1.10.
Suppose $X$ is a Hilbert space. Then $X$ is reflexive.

Proof. Along the lines of the proof of Proposition 1.9 above, let $\tau: X^* \to X^{**}$ be the mapping on $X^*$ given by
$$\tau L = \langle \cdot, L \rangle_*, \quad L \in X^*. \tag{1.20}$$
It then readily follows that $\tau$ is one-to-one, onto and norm-preserving, and consequently, (1.20) may be rewritten
$$x^{**}L = \langle L, \tau^{-1}x^{**} \rangle_*, \quad \text{all } x^{**} \in X^{**}, \ L \in X^*.$$
But then, by (1.19) and (1.18), for any $x^{**} \in X^{**}$ and $L \in X^*$ we have
$$x^{**}L = \langle T^{-1}\tau^{-1}x^{**}, T^{-1}L \rangle = L(T^{-1}\tau^{-1}x^{**}) = J_X(T^{-1}\tau^{-1}x^{**})L.$$
Since $L$ is arbitrary this can only mean that $x^{**} = J_X(T^{-1}\tau^{-1}x^{**})$, and consequently $T^{-1}\tau^{-1}x^{**} \in X$. Thus, the natural map $J_X$ is onto, and $X$ is reflexive. ∎
2. PROJECTIONS

We consider now some of the connections between the class of projections and the geometry of subspaces of $X$. We begin by noting an obvious property of projections: If $P_M: X \to M$ is the projection of $X$ onto the subspace $M$, then $P_M$ is a bounded linear operator and, if $M \neq \{0\}$, we have $\|P_M\| = 1$. Indeed, for $x, y \in X$ and scalars $\lambda$, by Theorem 1.7 it follows that
$$x = x_1 + x_2, \quad y = y_1 + y_2, \quad x_1, y_1 \in M, \ x_2, y_2 \in M^\perp,$$
and
$$x + \lambda y = (x_1 + \lambda y_1) + (x_2 + \lambda y_2), \quad x_1 + \lambda y_1 \in M, \ x_2 + \lambda y_2 \in M^\perp.$$
Whence, we see that
$$P_M(x + \lambda y) = x_1 + \lambda y_1 = P_M x + \lambda P_M y,$$
and $P_M$ is linear. Further, by (1.13), $\|P_M x\|^2 \le \|x\|^2$, so that $\|P_M\| \le 1$. Provided that $M \neq \{0\}$, we can choose $x \in M$ with $\|x\| = 1$. Then $\|P_M\| \ge \|P_M x\| = \|x\| = 1$, and $\|P_M\| = 1$. It is also possible to characterize the projections.
Proposition 2.1. Let $P: X \to X$ be a linear map from a Hilbert space $X$ into itself. Then $P$ is a projection iff
(i) $\langle Px, y \rangle = \langle x, Py \rangle$, all $x, y \in X$. (2.1)
(ii) $P^2 x = P(Px) = Px$, all $x \in X$. (2.2)

Proof. We do the necessity first. Suppose $P = P_M$ is a projection onto a closed subspace $M$, and, given $x, y \in X$, write $x = x_1 + x_2$, $y = y_1 + y_2$, with $x_1, y_1 \in M$ and $x_2, y_2 \in M^\perp$. We then have
$$\langle Px, y \rangle = \langle x_1, y_1 + y_2 \rangle = \langle x_1, y_1 \rangle = \langle x_1 + x_2, y_1 \rangle = \langle x, Py \rangle,$$
which is precisely (2.1). Moreover, since for $y \in M$ we have $Py = y$, it readily follows that $P(Px) = Px$, $x \in X$, and (2.2) is true.
Conversely, suppose $P$ is a linear mapping which satisfies both the conditions in the theorem, and first note that $P$ is bounded. Indeed, by (2.1), (2.2) and the Cauchy-Schwarz inequality we have
$$\|Px\|^2 = \langle Px, Px \rangle = \langle x, P^2 x \rangle = \langle x, Px \rangle \le \|x\|\,\|Px\|,$$
and consequently, $P$ is bounded, and $\|P\| \le 1$.

Next let $M = P(X)$ be the image of $X$ under $P$. It is clear that $M$ is a linear subspace of $X$; it is also closed. Indeed, if $\{y_n\} \subseteq M$ and $y_n \to y$, let $\{x_n\} \subseteq X$ be such that $Px_n = y_n$, and note that by (2.2) it follows that $Py_n = P(Px_n) = Px_n = y_n$, all $n$. Hence by the continuity of $P$ we get
$$y = \lim_{n\to\infty} y_n = \lim_{n\to\infty} Py_n = Py \in M,$$
and $M$ is the closed subspace of $X$, $M = \{x \in X : Px = x\}$.

Finally we show that $P$ is the projection $P_M$ of $X$ onto $M$. Take any $x \in X$ and write it as $x = Px + (x - Px)$; we want to show that $Px \in M$ and $(x - Px) \in M^\perp$. The first assertion obtains since $Px = P(Px) \in M$. Also, if $y \in M$, then $Py = y$ and hence, by (2.1) and (2.2),
$$\langle x - Px, y \rangle = \langle x, y \rangle - \langle Px, Py \rangle = \langle x, y \rangle - \langle x, P^2 y \rangle = \langle x, y \rangle - \langle x, Py \rangle = \langle x, y \rangle - \langle x, y \rangle = 0,$$
and $x - Px \in M^\perp$. ∎
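Conditions (2.1) and (2.2) are easy to test for matrices acting on $\mathbb{R}^n$, where (2.1) says the matrix is symmetric and (2.2) says it is idempotent. The following sketch is an illustration, not the text's construction: it assumes $X = \mathbb{R}^2$, builds the projection onto the line spanned by a vector $v$ as $vv^{T}/\langle v, v\rangle$, and uses hypothetical helper names. It also shows an idempotent matrix that fails (2.1), so both conditions are needed.

```python
from fractions import Fraction


def rank_one_projection(v):
    """Matrix of the orthogonal projection of R^n onto span{v}: v v^T / <v, v>."""
    vv = sum(t * t for t in v)
    return [[Fraction(a * b, vv) for b in v] for a in v]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


P = rank_one_projection([1, 2])
P_self_adjoint = transpose(P) == P   # condition (2.1) for real matrices
P_idempotent = matmul(P, P) == P     # condition (2.2)

# An oblique "projection": idempotent, but not self-adjoint.
Q = [[1, 1], [0, 0]]
Q_idempotent = matmul(Q, Q) == Q
Q_self_adjoint = transpose(Q) == Q
```

The matrix `Q` maps onto the first coordinate axis along a direction that is not perpendicular to it, which is why it satisfies (2.2) but not (2.1).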
Two closed subspaces $M$, $N$ of a Hilbert space $X$ are said to be orthogonal, and we write $M \perp N$, if
$$\langle x, y \rangle = 0, \quad \text{all } x \in M, \ y \in N.$$
It is not hard to characterize orthogonal subspaces in terms of the projections they determine.

Proposition 2.2. Suppose $X$ is a Hilbert space, and $M$, $N$ are closed subspaces of $X$. Then, $M \perp N$ iff $P_M P_N = P_N P_M = 0$.

Proof. First suppose $M$ and $N$ are orthogonal and let $x, y \in X$. Then $P_M x \in M$, $P_N y \in N$, and
$$\langle P_N y, P_M x \rangle = \langle P_M P_N y, x \rangle = 0.$$
Since $x, y$ are arbitrary this can only happen if $P_M P_N = 0$; similarly $P_N P_M = 0$. Conversely, suppose that $P_M P_N = 0$, and let $x \in M$, $y \in N$. Then,
$$\langle x, y \rangle = \langle P_M x, P_N y \rangle = \langle x, P_M P_N y \rangle = 0,$$
and $M \perp N$. Note that the above assumptions are somewhat redundant in that $P_M P_N = 0$ iff $P_N P_M = 0$. ∎

How about the sum of projections? First a definition. Let $M$, $N$ be closed linear subspaces of a Hilbert space $X$, and assume that every element in the vector sum $M + N$ has a unique representation of the form $x + y$, where $x \in M$, $y \in N$. Then we call $M + N$ the direct sum of $M$ and $N$. If $M \perp N$, we denote this direct sum by $M \oplus N$. We leave it to the reader to verify that $M \oplus N$ is also a closed subspace of $X$. If $Y = M \oplus N$, then we say that $N$ is the orthogonal complement of $M$ in $Y$, and we write $N = Y \ominus M$; symmetrically, $M = Y \ominus N$ denotes the fact that $M$ is the orthogonal complement of $N$ in $Y$. For instance, in the projection theorem we have $X = M \oplus M^\perp$, $M = X \ominus M^\perp$, and $M^\perp = X \ominus M$.
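Proposition 2.3. Suppose $X$ is a Hilbert space, and $M$, $N$ are closed subspaces of $X$. Then the sum $P_M + P_N$ of the projections $P_M$ and $P_N$ is a projection iff $P_M P_N = P_N P_M = 0$. In this case, $P_M + P_N = P_{M \oplus N}$.
Proof. First assume that $P = P_M + P_N$ is a projection. By Proposition 2.1 we have
$$\|Px\|^2 = \langle Px, Px\rangle = \langle P^2 x, x\rangle = \langle Px, x\rangle, \quad \text{all } x \in X,$$
and similarly,
$$\|P_M x\|^2 = \langle P_M x, x\rangle, \qquad \|P_N x\|^2 = \langle P_N x, x\rangle, \quad \text{all } x \in X.$$
Whence, we get
$$\|P_M x\|^2 + \|P_N x\|^2 = \langle P_M x, x\rangle + \langle P_N x, x\rangle = \langle Px, x\rangle = \|Px\|^2 \le \|x\|^2. \tag{2.3}$$
Consider now an arbitrary element $y$ in $X$ and put $x = P_N y$ in (2.3). Since $P_N x = P_N^2 y = P_N y$, this gives
$$\|P_M P_N y\|^2 + \|P_N y\|^2 \le \|P_N y\|^2,$$
which can only be true if $P_M P_N = 0$. By the way, this is equivalent to $P_N P_M = 0$.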
Conversely, we verify that $P = P_M + P_N$ satisfies the conditions of Proposition 2.1, and so it is a projection. Since $P$ is a sum of operators that satisfy (2.1), it also satisfies (2.1). As for (2.2), since
$$P^2 = (P_M + P_N)^2 = (P_M + P_N)(P_M + P_N) = P_M^2 + P_M P_N + P_N P_M + P_N^2 = P_M^2 + P_N^2 = P_M + P_N = P,$$
we have that $P^2 = P$, and $P$ is a projection. Finally, it is clear that $Px = P_M x + P_N x$ varies over $M \oplus N$ as $x$ varies over $X$. Conversely, if $x = x_1 + x_2 \in M \oplus N$, $x_1 \in M$, $x_2 \in N$, since $PP_M = P_M$, $PP_N = P_N$, and since $x_1 = P_M x_1 = PP_M x_1$, $x_2 = P_N x_2 = PP_N x_2$, we have
$$Px = Px_1 + Px_2 = PP_M x_1 + PP_N x_2 = P_M x_1 + P_N x_2 = x_1 + x_2 = x.$$
Hence $P = P_{M \oplus N}$.
•
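As a finite-dimensional illustration of Proposition 2.3, the following sketch checks numerically that a sum of two projections on $\mathbb{R}^3$ is again a projection when the ranges are orthogonal, and fails to be one otherwise. The matrices and the helper names (`matmul`, `matadd`, `proj_onto_unit`, `is_projection`) are assumptions made for this example only; they do not come from the text.

```python
# Orthogonal projections on R^3 written as 3x3 matrices (lists of rows).
# All names below are illustrative helpers, not notation from the text.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def proj_onto_unit(u):
    """Matrix of the orthogonal projection onto span{u}, for a real unit vector u."""
    return [[ui * uj for uj in u] for ui in u]

def is_projection(p, tol=1e-12):
    """Check symmetry, as in (2.1), and idempotence, as in (2.2), up to rounding."""
    n = len(p)
    p2 = matmul(p, p)
    sym = all(abs(p[i][j] - p[j][i]) <= tol for i in range(n) for j in range(n))
    idem = all(abs(p2[i][j] - p[i][j]) <= tol for i in range(n) for j in range(n))
    return sym and idem

s = 2 ** -0.5
P_M = proj_onto_unit([1.0, 0.0, 0.0])   # M = span{e1}
P_N = proj_onto_unit([0.0, 1.0, 0.0])   # N = span{e2}, orthogonal to M
P_L = proj_onto_unit([s, s, 0.0])       # L = span{e1 + e2}, not orthogonal to M

print(is_projection(matadd(P_M, P_N)))  # orthogonal ranges
print(is_projection(matadd(P_M, P_L)))  # non-orthogonal ranges
```

In the first case $P_M P_N = 0$ and the sum projects onto $M \oplus N$; in the second case $P_M P_L \ne 0$ and idempotence is lost.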
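How about the product, or composition, and the difference of projections?
Proposition 2.4. Suppose $X$ is a Hilbert space, and $M$ and $N$ are closed subspaces of $X$. Then the composition $P = P_M P_N$ of the projections $P_M$ and $P_N$ is a projection iff they commute, i.e., $P_M P_N = P_N P_M$. In that case $P = P_{M \cap N}$.
Proof. If $P$ is a projection, then
$$\langle P_M P_N x, y\rangle = \langle x, P_M P_N y\rangle, \quad \text{all } x, y \in X.$$
Moreover, since $P_M$ and $P_N$ are projections we also have
$$\langle P_M P_N x, y\rangle = \langle P_N x, P_M y\rangle = \langle x, P_N P_M y\rangle,$$
and consequently, we get
$$\langle x, P_M P_N y\rangle = \langle x, P_N P_M y\rangle, \quad \text{all } x, y \in X.$$
But this can only be true if $P_M P_N = P_N P_M$, and the necessity has been established.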
Conversely, if $P_N$ and $P_M$ commute, then essentially reversing the above steps we get that $P = P_M P_N = P_N P_M$ satisfies
$$\langle Px, y\rangle = \langle x, Py\rangle, \quad \text{all } x, y \in X.$$
Moreover, since also
$$P^2 = (P_M P_N)(P_M P_N) = P_M (P_N P_M) P_N = P_M (P_M P_N) P_N = P_M^2 P_N^2 = P_M P_N = P,$$
by Proposition 2.1 $P$ is a projection. Furthermore, since
$$Px = P_M(P_N x) = P_N(P_M x) \in M \cap N, \quad \text{all } x \in X,$$
$P$ projects $X$ into $M \cap N$. On the other hand, if $x \in M \cap N$, then
$$Px = P_M(P_N x) = P_M x = x,$$
and $P = P_{M \cap N}$ is the projection onto $M \cap N$.
•
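The dichotomy in Proposition 2.4 can be seen on small matrices. The sketch below, with subspaces and helper names chosen only for illustration, compares a commuting pair of projections on $\mathbb{R}^3$ with a non-commuting pair on $\mathbb{R}^2$.

```python
# Illustrative matrices only; none of these names appear in the text.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projection(p, tol=1e-12):
    """Symmetric and idempotent, up to rounding."""
    n = len(p)
    p2 = matmul(p, p)
    sym = all(abs(p[i][j] - p[j][i]) <= tol for i in range(n) for j in range(n))
    idem = all(abs(p2[i][j] - p[i][j]) <= tol for i in range(n) for j in range(n))
    return sym and idem

# Commuting case in R^3: M = span{e1, e2}, N = span{e2, e3}, M ∩ N = span{e2}.
P_M = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
P_N = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
commuting_product = matmul(P_M, P_N)

# Non-commuting case in R^2: A = span{e1}, B = span{(1, 1)}.
P_A = [[1.0, 0.0], [0.0, 0.0]]
P_B = [[0.5, 0.5], [0.5, 0.5]]
AB = matmul(P_A, P_B)
BA = matmul(P_B, P_A)

print(commuting_product, is_projection(commuting_product))
print(AB, BA, is_projection(AB))
```

The commuting product is the diagonal matrix that projects onto $\mathrm{span}\{e_2\} = M \cap N$; the non-commuting product is not even symmetric.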
Before we consider the question of the difference of projections we need a preliminary result.
Lemma 2.5. Suppose $X$ is a Hilbert space and let $P_M$, $P_N$ be projections onto the subspaces $M$ and $N$, respectively. Then the following four conditions are equivalent:
(i) $\langle P_M x, x\rangle \ge \langle P_N x, x\rangle$, all $x \in X$.
(ii) $M \supseteq N$.
(iii) $P_M P_N = P_N$.
(iv) $P_N P_M = P_N$.
Proof. (i) implies (ii). Let $x \in X$. Since $x - P_M x \in M^\perp$, we have
$$\|x\|^2 = \|P_M x\|^2 + \|x - P_M x\|^2.$$
Now, if $x \in N$, then $P_N x = x$, and $\langle P_N x, x\rangle = \|x\|^2$. Whence, by (i) we get
$$\|x\|^2 = \langle P_N x, x\rangle \le \langle P_M x, x\rangle = \langle P_M x, P_M x\rangle = \|P_M x\|^2 \le \|x\|^2.$$
Thus $\|x\| = \|P_M x\|$, and by (1.13) we conclude that $\|x - P_M x\| = 0$. Consequently, $x = P_M x \in M$, and (ii) holds.
(ii) implies (iii). Since $P_N x \in N \subseteq M$, we have $P_M P_N x = P_N x$.
(iii) implies (iv). By Proposition 2.4, since the composition $P_M P_N = P_N$ is a projection, the projections commute. Specifically, $P_N P_M = P_M P_N = P_N$, which is precisely (iv).
(iv) implies (i). It is a straightforward computation: For $x \in X$ we have
$$\langle P_N x, x\rangle = \|P_N x\|^2 = \|P_N P_M x\|^2 \le \|P_M x\|^2 = \langle P_M x, x\rangle,$$
and (i) holds. •
Proposition 2.6. Suppose $X$ is a Hilbert space, and let $P_M$, $P_N$ be projections onto the subspaces $M$ and $N$, respectively. The difference $P = P_M - P_N$ is a projection iff $N \subseteq M$. If this is the case, then $P = P_{M \ominus N}$.
Proof. Suppose that $P$ is a projection. Then, since $P_M = P + P_N$ is also a projection, by Proposition 2.3 it follows that $PP_N = 0$. Whence
$$(P_M - P_N)P_N = P_M P_N - P_N = 0,$$
and (iii) of Lemma 2.5, and consequently, also (ii) in that lemma, are true. Conversely, if $N \subseteq M$, it is clear that $P_N P_M = P_M P_N = P_N$, and
$$(P_M - P_N)^2 = (P_M - P_N)(P_M - P_N) = P_M^2 - P_N P_M - P_M P_N + P_N^2 = P_M - P_N - P_N + P_N = P_M - P_N.$$
Also,
$$\langle Px, y\rangle = \langle P_M x, y\rangle - \langle P_N x, y\rangle = \langle x, P_M y\rangle - \langle x, P_N y\rangle = \langle x, Py\rangle, \quad \text{all } x, y \in X,$$
and by Proposition 2.1, $P$ is a projection. Moreover, since as observed above $(P_M - P_N)P_N = 0$, by Proposition 2.3 the subspace $Y$ of the projection $P_M - P_N$ satisfies $Y \oplus N = M$. Therefore $Y = M \ominus N$. •
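For a quick sanity check of Lemma 2.5 and Proposition 2.6, one can take nested coordinate subspaces of $\mathbb{R}^3$. The sketch below uses $N = \mathrm{span}\{e_1\} \subseteq M = \mathrm{span}\{e_1, e_2\}$; the matrices and helper names are assumptions of this example.

```python
# Nested coordinate subspaces of R^3; names are illustrative only.

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(a, b):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def is_projection(p, tol=1e-12):
    """Symmetric and idempotent, up to rounding."""
    n = len(p)
    p2 = matmul(p, p)
    sym = all(abs(p[i][j] - p[j][i]) <= tol for i in range(n) for j in range(n))
    idem = all(abs(p2[i][j] - p[i][j]) <= tol for i in range(n) for j in range(n))
    return sym and idem

P_M = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # M = span{e1, e2}
P_N = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # N = span{e1}, N inside M

difference = matsub(P_M, P_N)   # candidate projection onto M ⊖ N = span{e2}
reversed_difference = matsub(P_N, P_M)

print(matmul(P_M, P_N) == P_N, matmul(P_N, P_M) == P_N)  # conditions (iii), (iv)
print(is_projection(difference), is_projection(reversed_difference))
```

The difference $P_M - P_N$ is the projection onto $\mathrm{span}\{e_2\}$, while $P_N - P_M$ is not a projection, in line with the requirement $N \subseteq M$.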
3. ORTHONORMAL SETS
A general question we address in this section is the following: Given a Hilbert space $X$ and a subset $Y$ of $X$, how can we best approximate elements of $X$ by those of $Y$? A good measure of the approximation is given by the quantity
$$d(x, Y) = \inf_{y \in Y} \|x - y\|,$$
so we are naturally interested in whether the inf above is actually achieved. We begin by considering a simple example, namely, the case when $Y$ is the (finite dimensional) subspace of $X$ spanned by $\{x_1, \ldots, x_n\}$. Given $x \in X$, we seek to minimize the expression
$$\Big\| x - \sum_{i=1}^n \lambda_i x_i \Big\|. \tag{3.1}$$
Clearly we may assume the $x_i$'s are linearly independent, and, by the Gram-Schmidt process, cf. 5.21 below, orthonormal. More precisely, we may assume that each $x_i$ has norm 1, and that $x_i \perp x_j$ for $1 \le i \ne j \le n$. The $x_i$'s are then said to constitute an orthonormal system, or ONS, in $X$. A closer look at (3.1) and (1.13) suggests that we consider the projection of $X$ onto $Y$, and in order to invoke the projection theorem we begin by showing that $Y$ is closed.
Proposition 3.1. If $M$ is a closed subspace of a normed space $X$ and if $x \in X$, then the span $\{M, x\}$ of $M$ and $\{x\}$ is also a closed subspace of $X$.
Proof. If $x \in M$, then the span $\{M, x\} = M$, and we are done. Otherwise, assume $x \notin M$ and suppose a sequence $\{m_n + \lambda_n x\}$ of elements of $\{M, x\}$ converges to an element $y \in X$; we must show that actually $y \in \{M, x\}$. First note that since the sequence converges it is bounded, i.e., there is a constant $c$ such that
$$\|m_n + \lambda_n x\| \le c, \quad \text{all } n.$$
We claim that $\{|\lambda_n|\}$ is also bounded. If this is not the case, there is a subsequence $n_k \to \infty$ such that $|\lambda_{n_k}| \to \infty$ as $k \to \infty$, and consequently,
$$\big\| \lambda_{n_k}^{-1} m_{n_k} + x \big\| \le c/|\lambda_{n_k}| \to 0 \quad \text{as } k \to \infty.$$
Whence $x$ belongs to the closure of $M$, which is $M$ since $M$ is closed, and this is a contradiction. Now, since $\{|\lambda_n|\}$ is bounded, passing to a subsequence if necessary, we may assume that the $\lambda_n$'s converge to a limit $\lambda$, say. But then
$$m_n = (m_n + \lambda_n x) - \lambda_n x \to y - \lambda x,$$
and since $M$ is closed, $y - \lambda x \in M$. Thus $y = (y - \lambda x) + \lambda x$ belongs to $\{M, x\}$. •
Corollary 3.2. Let $X$ be a normed space. Then the subspace $Y$ spanned by $\{x_1, \ldots, x_n\}$ is closed.
Proof. Observe that $\{x_1\} = \{\lambda x_1 : \lambda \text{ is a scalar}\}$ is closed in $X$ and apply Proposition 3.1 as many times as necessary. •
The stage is now set to invoke the projection theorem, and to obtain the answer to our question: It is $P_Y x$. In fact, by the projection theorem
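we can find the $\lambda_i$'s as follows: Since $x - P_Y x \in Y^\perp$, if $P_Y x = \sum_{i=1}^n \lambda_i x_i$, we have
$$\Big\langle x - \sum_{i=1}^n \lambda_i x_i,\, x_k \Big\rangle = 0, \quad 1 \le k \le n.$$
Moreover, since $\{x_i\}$ is an ONS in $X$, it readily follows that
$$\lambda_k = \langle x, x_k\rangle, \quad 1 \le k \le n,$$
and consequently, the minimum value of the expression (3.1) equals
$$\Big\| x - \sum_{i=1}^n \langle x, x_i\rangle x_i \Big\|. \tag{3.2}$$
Furthermore, we may compute the exact value of the expression (3.2): Its square equals
$$\|x\|^2 - \Big\langle \sum_{i=1}^n \langle x, x_i\rangle x_i,\, x \Big\rangle - \Big\langle x,\, \sum_{k=1}^n \langle x, x_k\rangle x_k \Big\rangle + \sum_{i=1}^n |\langle x, x_i\rangle|^2 = \|x\|^2 - \sum_{i=1}^n |\langle x, x_i\rangle|^2. \tag{3.3}$$
The minimizing values of the $\lambda_k$'s, to wit, the scalars $\langle x, x_k\rangle$, are called the Fourier coefficients of $x$ with respect to the ONS $\{x_k\}$. In addition to providing a complete answer to the question posed above, (3.3) implies
Proposition 3.3 (Bessel's Inequality). Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$. Then Bessel's inequality holds, i.e.,
$$\sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2 \le \|x\|^2, \quad \text{all } x \in X. \tag{3.4}$$
In particular, for each $x$ in $X$, all but an at most countable number of the Fourier coefficients $\langle x, x_\alpha\rangle$ of $x$ with respect to the ONS $\{x_\alpha\}$ vanish.
Proof. Let $\alpha_1, \ldots, \alpha_n$ be a finite subset of $A$. Then, given $x \in X$, by (3.3) we have
$$\Big\| x - \sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i} \Big\|^2 = \|x\|^2 - \sum_{i=1}^n |\langle x, x_{\alpha_i}\rangle|^2 \ge 0.$$
It then readily follows that
$$\sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2 = \sup_{\{\alpha_1, \ldots, \alpha_n\} \subseteq A}\ \sum_{i=1}^n |\langle x, x_{\alpha_i}\rangle|^2 \le \|x\|^2,$$
and (3.4) holds. •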
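The best-approximation recipe above (orthonormalize, take Fourier coefficients, project) is easy to try in $\mathbb{R}^3$. The following is a minimal sketch under the assumption of real scalars and the Euclidean inner product; the vectors and the function names `gram_schmidt` and `best_approximation` are chosen for this example and are not from the text.

```python
import math

def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize real vectors, discarding numerically dependent ones."""
    ons = []
    for v in vectors:
        w = list(v)
        for e in ons:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            ons.append([wi / norm for wi in w])
    return ons

def best_approximation(x, ons):
    """Return the Fourier coefficients <x, e_i> and the sum of <x, e_i> e_i."""
    coeffs = [dot(x, e) for e in ons]
    proj = [sum(c * e[j] for c, e in zip(coeffs, ons)) for j in range(len(x))]
    return coeffs, proj

# Y is the plane spanned by (1, 1, 0) and (1, 0, 1); x is the point to approximate.
ons = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
x = [1.0, 2.0, 3.0]
coeffs, proj = best_approximation(x, ons)

residual = [a - b for a, b in zip(x, proj)]
dist_sq = dot(residual, residual)                       # square of (3.2)
bessel_gap = dot(x, x) - sum(c * c for c in coeffs)     # right-hand side of (3.3)

print(dist_sq, bessel_gap)
```

Both printed numbers agree up to rounding, as (3.3) predicts, and their common value is nonnegative, which is Bessel's inequality for this finite ONS. Here the plane has unit normal $(1,-1,-1)/\sqrt{3}$, so the squared distance is $16/3$.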
There is yet another way to interpret the inequality (3.4): Suppose $A$ is endowed with the counting measure, and let $T$ be the linear mapping from $X$ into the space of sequences $(c_\alpha)_{\alpha \in A}$ which takes $x$ into its sequence of Fourier coefficients,
$$Tx = (\langle x, x_\alpha\rangle)_{\alpha \in A}, \quad x \in X. \tag{3.5}$$
Bessel's inequality asserts that $T$ is a bounded mapping from $X$ into $\ell^2(A)$ with norm $\|T\| \le 1$. Since we are interested in describing $X$ in terms of $\ell^2(A)$, we must decide when $T$ is one-to-one and onto. We begin by settling the "onto" question.
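Theorem 3.4 (F. Riesz-Fischer). Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$, and let $(c_\alpha) \in \ell^2(A)$. If $T$ is given by (3.5), then there exists $y \in X$ such that $Ty = (c_\alpha)_{\alpha \in A}$. More precisely, $T$ is onto.
Proof. Since $(c_\alpha) \in \ell^2(A)$, there are at most countably many $\alpha$'s, $\alpha_1, \ldots, \alpha_j, \ldots$, say, so that $c_{\alpha_j} \ne 0$. Put now
$$y_n = \sum_{j=1}^n c_{\alpha_j} x_{\alpha_j}, \quad n = 1, 2, \ldots$$
We claim that the sequence $\{y_n\}$ is Cauchy in $X$, and consequently, it converges. This is not hard; indeed, let $n < m$, and note that by Proposition 1.4 we have
$$\|y_m - y_n\|^2 = \Big\| \sum_{j=n+1}^m c_{\alpha_j} x_{\alpha_j} \Big\|^2 = \sum_{j=n+1}^m |c_{\alpha_j}|^2.$$
Now, since $(c_\alpha) \in \ell^2(A)$, the right-hand side of the above equality is dominated by the tail of a convergent series, and it tends to 0 as $n \to \infty$. Whence the same is true of the left-hand side, and $\{y_n\}$ is Cauchy. Moreover, if $y \in X$ is the limit of the $y_n$'s, from the continuity of the inner product we get
$$\langle y, x_\alpha\rangle = \lim_{n \to \infty} \langle y_n, x_\alpha\rangle = \begin{cases} 0 & \text{if } \alpha \ne \alpha_k, \text{ all } k, \\ c_{\alpha_k} & \text{if } \alpha = \alpha_k. \end{cases}$$
Thus, $Ty = (c_\alpha)_{\alpha \in A}$. •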
Still the question as to whether $T$ is one-to-one remains open; first a definition. We say that an ONS $\{x_\alpha\}_{\alpha \in A}$ in a Hilbert space $X$ is maximal, or complete, if no nonzero element can be added to it so that the resulting collection of elements is still an ONS in $X$. Note that given a Hilbert space $X$, we can always find a maximal orthonormal system in $X$; this is a simple consequence of Zorn's Lemma, cf. 5.23 below. The stage is now set for
Theorem 3.5. Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$. Then the following properties are equivalent:
(i) $\{x_\alpha\}_{\alpha \in A}$ is maximal in $X$.
(ii) The collection of all finite linear combinations of the $x_\alpha$'s is dense in $X$.
(iii) (Plancherel's Equality) Equality holds in Bessel's inequality, i.e.,
$$\|x\|^2 = \sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2, \quad \text{all } x \in X. \tag{3.6}$$
(iv) (Parseval's Identity) For all $x, y \in X$, we have
$$\langle x, y\rangle = \sum_{\alpha \in A} \langle x, x_\alpha\rangle \overline{\langle y, x_\alpha\rangle}. \tag{3.7}$$
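Proof. (i) implies (ii). Let $M$ be the closure of the subspace of $X$ consisting of all finite linear combinations of the $x_\alpha$'s; $M$ is then a closed subspace of $X$. Now, if $X \setminus M \ne \emptyset$, by Theorem 1.7 we can find an element $x \in M^\perp$ with $\|x\| = 1$. In particular, we have
$$\langle x, x_\alpha\rangle = 0, \quad \text{all } \alpha \in A,$$
thus contradicting the assumed maximality of the $x_\alpha$'s.
(ii) implies (iii). By Bessel's inequality, it suffices to prove the "$\le$" inequality in (3.6). Given $x \in X$ and $\varepsilon > 0$, there exist a finite subset $\{\alpha_1, \ldots, \alpha_n\}$ of $A$ and scalars $\lambda_1, \ldots, \lambda_n$ such that
$$\Big\| x - \sum_{i=1}^n \lambda_i x_{\alpha_i} \Big\| \le \varepsilon. \tag{3.8}$$
Now, since the best approximation to $x$ in the subspace spanned by $\{x_{\alpha_1}, \ldots, x_{\alpha_n}\}$ is given by $\sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i}$, we also have
$$\Big\| x - \sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i} \Big\| \le \varepsilon. \tag{3.9}$$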
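Moreover, since by Proposition 1.4 we have
$$\Big\| \sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i} \Big\|^2 = \sum_{i=1}^n |\langle x, x_{\alpha_i}\rangle|^2,$$
from (3.9) it follows that
$$\|x\| \le \Big\| x - \sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i} \Big\| + \Big\| \sum_{i=1}^n \langle x, x_{\alpha_i}\rangle x_{\alpha_i} \Big\| \le \varepsilon + \Big( \sum_{i=1}^n |\langle x, x_{\alpha_i}\rangle|^2 \Big)^{1/2} \le \varepsilon + \Big( \sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2 \Big)^{1/2}. \tag{3.10}$$
Thus, since $\varepsilon > 0$ is arbitrary, the inequality opposite to Bessel's inequality holds, and (3.6) is true.
(iii) implies (iv). Identity (3.7) is one of inner products and (3.6) one of norms; we derive the former from the latter via (1.5). Specifically, given $x, y \in X$, using (3.6) we compute $\langle x, y\rangle$ by evaluating the norms that appear on the right-hand side of (1.5). In fact, adding up the relation
$$\|x + y\|^2 = \sum_{\alpha \in A} |\langle x, x_\alpha\rangle + \langle y, x_\alpha\rangle|^2$$
to those corresponding to $x - y$ and $x \pm iy$, by (1.5) again, it readily follows that if $(\langle x, x_\alpha\rangle)$ and $(\langle y, x_\alpha\rangle)$ are the sequences of Fourier coefficients of $x$ and $y$ respectively, then
$$\langle x, y\rangle = \sum_{\alpha \in A} \langle x, x_\alpha\rangle \overline{\langle y, x_\alpha\rangle}.$$
This is precisely (3.7). The argument involved in this step is known as "polarization."
(iv) implies (i). Suppose $\langle x, x_\alpha\rangle = 0$ for all $\alpha \in A$. Then, by (3.7) we get
$$\|x\|^2 = \langle x, x\rangle = \sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2 = 0,$$
and consequently, $x = 0$. Thus $\{x_\alpha\}$ is a maximal ONS in $X$. •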
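Plancherel's equality and Parseval's identity can be checked by hand in $\mathbb{C}^2$, where the role of the complex conjugate on the second Fourier coefficient becomes visible. The sketch below assumes an inner product that is linear in the first slot and conjugate-linear in the second; the basis and the vectors are chosen only for illustration.

```python
import math

def inner(u, v):
    """Inner product on C^n, linear in the first slot: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

s = 1 / math.sqrt(2)
basis = [[s, s * 1j], [s, -s * 1j]]   # an orthonormal basis of C^2 (assumed example)
x = [1 + 2j, 3 + 0j]
y = [2 + 0j, 1 - 1j]

cx = [inner(x, e) for e in basis]     # Fourier coefficients of x
cy = [inner(y, e) for e in basis]     # Fourier coefficients of y

parseval_sum = sum(a * b.conjugate() for a, b in zip(cx, cy))
no_conjugate = sum(a * b for a, b in zip(cx, cy))   # what one gets if the bar is dropped
plancherel_sum = sum(abs(a) ** 2 for a in cx)

print(inner(x, y), parseval_sum, no_conjugate)
print(inner(x, x).real, plancherel_sum)
```

With these vectors $\langle x, y\rangle = 5 + 7i$, which matches the sum with the conjugate; omitting the conjugate gives $-1 + 7i$ instead.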
Observe that, in particular, (iii) above implies that $T$ is an isometry of $X$ onto $\ell^2(A)$, and consequently, $T$ is one-to-one. Thus $T$ establishes an isomorphism between $X$ and $\ell^2(A)$ provided $\{x_\alpha\}_{\alpha \in A}$ is a maximal ONS in $X$. We reiterate that, by Zorn's Lemma, all Hilbert spaces have a maximal ONS. The question is then, how to produce concrete examples of such systems. In the familiar case of $L^2(I)$, where $I$ is an interval of the line, we construct such an example following Corollary 2.3 in Chapter XVII.
It is also apparent that (3.6) gives
$$x = \sum_{\alpha \in A} \langle x, x_\alpha\rangle x_\alpha, \quad \text{all } x \in X, \tag{3.11}$$
where the sum is understood to converge in the norm of $X$. Thus, in this setting, each element of $X$ is represented by its Fourier series. It is also customary to call a maximal ONS in $X$ a basis. An interesting question we are able to settle at this time is when $X$ has an at most countable basis.
Proposition 3.6. Suppose $X$ is a Hilbert space. Then $X$ has an at most countable basis iff $X$ is separable, and in this case all bases are at most countable.
Proof. We do the sufficiency first. Let $\{x_n\}$ be an at most countable dense subset of $X$. First discard any $x_n$ which is a linear combination of the $x_i$'s with $1 \le i \le n - 1$, and then, by the Gram-Schmidt process, orthonormalize the remaining elements. Call this ONS $\{x_n\}$ again, and note that its span coincides with that of the original dense set, and consequently, the closure of its span is $X$. By (ii) in Theorem 3.5, the set $\{x_n\}$ is an at most countable basis for $X$. Conversely, suppose $\{x_n\}$ is an ON basis for $X$, and note that by (ii) in Theorem 3.5, finite linear combinations of the $x_n$'s are dense in $X$. Let now $\{\lambda_n\}$ be a countable dense subset of the field of scalars, and observe that
$$S = \Big\{ \sum_{\text{finite sums}} \lambda_k y_k : y_k = x_n, \text{ some } n \Big\}$$
is a countable dense subset of $X$, and consequently, $X$ is separable. It only remains to show that any other basis $\{y_\alpha\}_{\alpha \in A}$ for $X$ is also at most countable. For each integer $n$ consider the set
$$A_n = \{\alpha \in A : \langle y_\alpha, x_n\rangle \ne 0\}, \quad n = 1, 2, \ldots$$
Since by (3.6)
$$\sum_{\alpha \in A} |\langle x_n, y_\alpha\rangle|^2 = \|x_n\|^2 = 1,$$
each $A_n$ is at most countable. Thus $A' = \bigcup_n A_n$ is also at most countable. We claim that, unless $\alpha \in A'$, $y_\alpha = 0$. For, if $y_\alpha \ne 0$, then by the maximality of the $x_n$'s there exists an index $n$ such that $\langle y_\alpha, x_n\rangle \ne 0$. In this case $\alpha \in A_n \subseteq A'$, and we have finished. •
4. SPECTRAL DECOMPOSITION OF COMPACT OPERATORS
As in the case of the Lebesgue $L^2$ spaces, there is a notion of weak convergence in Hilbert spaces. More precisely, inspired by the Riesz Representation Theorem, we say that a sequence $\{x_n\}$ in a Hilbert space $X$ converges weakly to $x \in X$, provided that
$$\lim_{n \to \infty} \langle x_n, y\rangle = \langle x, y\rangle, \quad \text{all } y \in X. \tag{4.1}$$
As before, convergent sequences are weakly convergent, but the converse is not true. Also, bounded sequences have weakly convergent subsequences. Given a bounded sequence $\{y_n\}$ in $X$, let $\{x_{\alpha_m}\}$ be the at most countable subset of a basis $\{x_\alpha\}_{\alpha \in A}$ for $X$ such that the Fourier coefficients of the $y_n$'s with respect to the ONS $\{x_\alpha\}_{\alpha \in A}$ do not vanish. Setting now
$$c_{n,m} = \langle x_{\alpha_m}, y_n\rangle, \quad \text{all } m, n,$$
we obtain a weakly convergent subsequence along the lines of the proof of Theorem 3.3 in Chapter XII. How do weakly convergent sequences look?
Proposition 4.1. Let $X$ be a Hilbert space, and suppose the sequence $\{x_n\} \subseteq X$ converges weakly to $x \in X$. Then
(i) The set $\{x_n\}$ is bounded.
(ii) The weak limit $x$ lies in the subspace of $X$ spanned by $\{x_1, x_2, \ldots\}$.
(iii) $\|x\| \le \liminf \|x_n\|$.
Proof. Let $y \in X$; since by assumption we have
$$\lim_{n \to \infty} L_y(x_n - x) = \lim_{n \to \infty} \langle x_n - x, y\rangle = 0,$$
it readily follows that for each $y \in X$ we have
$$\sup_n |L_y(x_n - x)| \le c_y < \infty.$$
Thus, by the Uniform boundedness principle we get
$$\sup_n \|x_n - x\| < \infty,$$
which gives (i) at once.
Next, if (ii) is not true, by Proposition 3.3 in Chapter XIV and Theorem 1.8, there exists $y \in X$ such that
$$\langle x_n, y\rangle = 0, \quad \text{all } n, \quad \text{and} \quad \langle x, y\rangle \ne 0.$$
However, since the sequence $\{x_n\}$ converges weakly to $x$, this is not possible, and (ii) also holds. Finally, let $\{x_{n_k}\}$ be a subsequence such that $\lim_{n_k \to \infty} \|x_{n_k}\| = \liminf \|x_n\|$. Since $\{x_{n_k}\}$ also converges weakly to $x$, we get
$$|\langle x, y\rangle| = \lim_{n_k \to \infty} |\langle x_{n_k}, y\rangle| \le \lim_{n_k \to \infty} \big( \|x_{n_k}\| \|y\| \big) \le (\liminf \|x_n\|) \|y\|,$$
and (iii) follows at once from Proposition 3.4 in Chapter XIV.
•
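A standard example behind Proposition 4.1 is the sequence of unit coordinate vectors $e_n$ in $\ell^2$, which converges weakly to 0 while $\|e_n\| = 1$ for all $n$. The sketch below works in a truncation of $\ell^2$ to its first `N` coordinates, an assumption needed to keep the computation finite, and pairs $e_n$ with the fixed element $y = (1/k)_k$.

```python
import math

N = 1000  # truncation length; a stand-in for the infinite sequence space

def dot(u, v):
    """Real inner product of two finite sequences."""
    return sum(a * b for a, b in zip(u, v))

def unit_vector(n):
    """The n-th coordinate vector e_n (1-based) in the truncated space."""
    return [1.0 if k == n else 0.0 for k in range(1, N + 1)]

y = [1.0 / k for k in range(1, N + 1)]   # a fixed square-summable element

# <e_n, y> = 1/n tends to 0, although every e_n has norm 1.
pairings = [dot(unit_vector(n), y) for n in (1, 10, 100, 1000)]
norms = [math.sqrt(dot(unit_vector(n), unit_vector(n))) for n in (1, 10, 100, 1000)]

# Distinct coordinate vectors stay sqrt(2) apart, so no subsequence converges in norm.
diff = [a - b for a, b in zip(unit_vector(1), unit_vector(2))]
distance = math.sqrt(dot(diff, diff))

print(pairings, norms, distance)
```

Here the weak limit is 0 and $\|0\| < \liminf \|e_n\| = 1$, so the inequality in (iii) can be strict.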
As in the case of the $L^p$ spaces, this last result suggests under what conditions the notions of weak convergence and convergence coincide.
Proposition 4.2. Let $X$ be a Hilbert space, and suppose $\{x_n\} \subseteq X$ converges weakly to $x \in X$. If in addition $\lim_{n \to \infty} \|x_n\| = \|x\|$, then $\lim_{n \to \infty} x_n = x$.
Proof. As usual,
$$\|x_n - x\|^2 = \langle x_n - x, x_n - x\rangle = \|x_n\|^2 - \langle x_n, x\rangle - \langle x, x_n\rangle + \|x\|^2.$$
Whence, by assumption
$$\lim_{n \to \infty} \|x_n - x\|^2 = \|x\|^2 - \|x\|^2 - \|x\|^2 + \|x\|^2 = 0. \quad \text{•}$$
How does the notion of weak convergence fit into the theory of continuous linear operators? Well, if $T \in B(X)$, then $T$ also preserves weak convergence. Specifically, if the sequence $\{x_n\} \subseteq X$ converges weakly to $x \in X$, we also have that $\{Tx_n\}$ converges weakly to $Tx$. We prove this using the notion of adjoint mapping. Given $y \in X$, consider the functional $L$ on $X$ defined by
$$Lx = \langle Tx, y\rangle, \quad \text{all } x \in X.$$
As before, cf. Proposition 1.9, it is clear that $L$ is actually bounded, that the dependence on $y$ is linear, and that if $T^*$ denotes the correspondence
between $y$ and $L$, then $T^*$ is also a bounded linear mapping on $X$ with norm $\|T^*\| = \|T\|$, and
$$\langle Tx, y\rangle = \langle x, T^* y\rangle, \quad \text{all } x, y \in X. \tag{4.2}$$
$T^*$ is called the adjoint of the operator $T$; some of its basic properties are discussed in 5.36-5.38 below. Now, in our case, by (4.2) we have $\langle Tx_n, y\rangle = \langle x_n, T^* y\rangle$, and consequently,
$$\lim_{n \to \infty} \langle Tx_n, y\rangle = \langle x, T^* y\rangle = \langle Tx, y\rangle, \quad \text{all } y \in X,$$
which establishes the weak convergence of $\{Tx_n\}$ to $Tx$. We can also consider the special case when an operator $T$ maps weakly convergent sequences into convergent sequences. More precisely, we say that a linear mapping $T$ from $X$ into itself is completely continuous if whenever the sequence $\{x_n\} \subseteq X$ converges weakly to $x$, then we have
$$\lim_{n \to \infty} \|Tx_n - Tx\| = 0.$$
Not all continuous mappings on $X$ are completely continuous. Indeed, since as observed above there are sequences in $\ell^2$ that converge weakly but not strongly, the identity map on $\ell^2$ is continuous but not completely continuous. On the other hand, it is clear that completely continuous operators are also continuous. Completely continuous operators are also called compact, and the following result explains why.
Proposition 4.3. Let $X$ be a Hilbert space, and $T \in B(X)$. Then, $T$ is completely continuous iff given any bounded sequence $\{x_n\}$ of elements in $X$, there is a subsequence $\{Tx_{n_k}\} \subseteq X$ which converges.
Proof. Suppose first that $T$ is completely continuous and that $\{x_n\} \subseteq X$ is bounded. Since as noted above $\{x_n\}$ has a weakly convergent subsequence, and since $T$ is completely continuous, the necessity follows. Conversely, assume the sequence $\{x_n\} \subseteq X$ is bounded. Passing to a subsequence if necessary we may assume that $\{x_n\}$ converges weakly to $x \in X$. Since $T$ is bounded, $\{Tx_n\}$ is weakly convergent to $Tx$. Now, if the sequence $\{Tx_n\}$ does not converge to $Tx$, then there exist a subsequence $\{x_{n_k}\}$ and $\eta > 0$, such that
$$\|Tx_{n_k} - Tx\| \ge \eta, \quad k = 1, 2, \ldots$$
Note that the sequence $\{x_{n_k}\}$ is weakly convergent and therefore also bounded. By the assumptions on $T$ there are yet another subsequence,
which we call $\{x_{n_k}\}$ again, and $y \in X$ such that $\lim_{k \to \infty} Tx_{n_k} = y$. But then $\{Tx_{n_k}\}$ is also weakly convergent to $y$, and since weak limits are unique, cf. 4.32 in Chapter XIV, it follows that $Tx = y$. Hence,
$$\lim_{k \to \infty} \|Tx_{n_k} - Tx\| = 0,$$
which is a contradiction. •
What are some completely continuous operators? The following is a useful sufficient condition to identify them.
Proposition 4.4. Let $X$ be a Hilbert space, $\{x_\alpha\}_{\alpha \in A}$ a basis for $X$, and $T \in B(X)$. If
$$\sum_{\alpha, \beta \in A} |\langle Tx_\alpha, x_\beta\rangle|^2 < \infty, \tag{4.3}$$
then $T$ is completely continuous.
Proof. By translations it suffices to show that if the sequence $\{y_n\} \subseteq X$ converges weakly to 0, then $\{Ty_n\}$ converges to 0. First observe that by Theorem 3.5 there are at most countably many $\alpha$'s so that
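We may, therefore, consider $\alpha$ as an integer index, and note that from (3.6) we obtain
$$\|Ty_n\|^2 = \sum_{\alpha=1}^N |\langle Ty_n, x_\alpha\rangle|^2 + \sum_{\alpha=N+1}^\infty |\langle Ty_n, x_\alpha\rangle|^2 = A + B,$$
say. Now, since
$$y_n = \sum_{\beta=1}^\infty \langle y_n, x_\beta\rangle x_\beta$$
and $T$ is continuous, we get at once that
$$\langle Ty_n, x_\alpha\rangle = \sum_{\beta=1}^\infty \langle y_n, x_\beta\rangle \langle Tx_\beta, x_\alpha\rangle,$$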
and consequently, by the Cauchy-Schwarz inequality we get
$$B = \sum_{\alpha=N+1}^\infty \Big| \sum_{\beta=1}^\infty \langle y_n, x_\beta\rangle \langle Tx_\beta, x_\alpha\rangle \Big|^2 \le \sum_{\alpha=N+1}^\infty \Big( \sum_{\beta=1}^\infty |\langle y_n, x_\beta\rangle|^2 \Big) \Big( \sum_{\beta=1}^\infty |\langle Tx_\beta, x_\alpha\rangle|^2 \Big) \le \|y_n\|^2 \sum_{\alpha=N+1}^\infty \sum_{\beta} |\langle Tx_\beta, x_\alpha\rangle|^2.$$
Furthermore, since $\{y_n\}$ is bounded, by (4.3) it follows that the right-hand side of the above inequality can be made as small as we wish, $\le \eta$ say, provided $N$ is sufficiently large. Once $N$ has been fixed it is not hard to estimate $A$. Indeed, since $T$ is bounded and the sequence $\{y_n\}$ converges weakly to 0, then also $\{Ty_n\}$ converges weakly to 0, and consequently,
$$\lim_{n \to \infty} \langle Ty_n, x_\alpha\rangle = 0, \quad \alpha = 1, \ldots, N.$$
Thus each summand in $A$ goes to 0 with $n$, and we have $A \le \eta$, provided $n$ is large enough. Finally, combining this estimate with that for $B$, we get $\|Ty_n\|^2 \le 2\eta$, all large $n$. •
This result allows us to construct an interesting, and important, example of a compact operator. Let $I = [0, 1]$, and consider the integral operator $T$ on $L^2(I)$ given by
$$Tf(x) = \int_I k(x, t) f(t)\, dt, \quad x \in I,$$
with kernel $k(x, t) \in L^2(I \times I)$. We claim that $T$ is a bounded, completely continuous mapping. The boundedness follows from the Cauchy-Schwarz inequality. Indeed,
$$\|Tf\|^2 = \int_I \Big| \int_I k(x, t) f(t)\, dt \Big|^2 dx \le \int_I \Big( \int_I |k(x, t)|^2\, dt \Big) \Big( \int_I |f(t)|^2\, dt \Big) dx = \Big( \int_I \int_I |k(x, t)|^2\, dt\, dx \Big) \|f\|^2.$$
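The boundedness estimate $\|Tf\|^2 \le \|k\|_{L^2(I \times I)}^2 \|f\|^2$ can be observed on a discretized integral operator. The sketch below is only an illustration under several assumptions: the kernel $k(x, t) = \min(x, t)$, the function $f \equiv 1$, and a midpoint rule with `n` nodes are all chosen for this example and do not come from the text.

```python
# Midpoint-rule discretization of Tf(x) = integral over [0, 1] of k(x, t) f(t) dt.
# Kernel, test function, and grid size are assumptions made for this sketch.

n = 200
h = 1.0 / n
nodes = [(i + 0.5) * h for i in range(n)]

def k(x, t):
    """Sample square-integrable kernel on [0, 1] x [0, 1]."""
    return min(x, t)

def f(t):
    """Sample function in L^2([0, 1])."""
    return 1.0

# Discrete version of Tf at each node.
Tf = [sum(k(x, t) * f(t) for t in nodes) * h for x in nodes]

norm_Tf_sq = sum(v * v for v in Tf) * h
norm_k_sq = sum(k(x, t) ** 2 for x in nodes for t in nodes) * h * h
norm_f_sq = sum(f(t) ** 2 for t in nodes) * h

print(norm_Tf_sq, norm_k_sq * norm_f_sq)
```

For this kernel the exact values are $\|Tf\|^2 = 2/15$ and $\|k\|^2 \|f\|^2 = 1/6$, and the discrete quantities respect the same inequality because the Cauchy-Schwarz inequality also holds for the finite sums.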
As for the compactness, let $\{\phi_n\}$ be a basis for $L^2(I)$, and consider the quantity
$$\sum_{m, n} |\langle T\phi_m, \phi_n\rangle|^2; \tag{4.4}$$
we must show it is finite. First note that $T^*$ is given by
$$T^* f(x) = \int_I \overline{k(s, x)}\, f(s)\, ds, \quad x \in I.$$
By Tonelli's theorem, (4.4) can be evaluated by the iterated sum obtained by summing first over $m$ and then over $n$. Now,
$$\sum_{m=1}^\infty |\langle T\phi_m, \phi_n\rangle|^2 = \sum_{m=1}^\infty |\langle \phi_m, T^*\phi_n\rangle|^2 = \|T^*\phi_n\|^2, \quad n = 1, 2, \ldots$$
Thus summing over $n$ above we get
$$\sum_{m, n} |\langle T\phi_m, \phi_n\rangle|^2 = \sum_{n=1}^\infty \|T^*\phi_n\|^2 = \sum_{n=1}^\infty \int_I \Big| \int_I \overline{k(s, x)}\, \phi_n(s)\, ds \Big|^2 dx = \int_I \sum_{n=1}^\infty |\langle \phi_n, k(\cdot, x)\rangle|^2\, dx = \int_I \int_I |k(s, x)|^2\, ds\, dx < \infty,$$
(4.3) holds, and $T$ is compact.
We are now ready to give a detailed description of a compact operator $T$ with the additional property that $T^* = T$; any operator which satisfies this relation is said to be self-adjoint. For example, in the case of the integral operator described above, this corresponds to those kernels $k$ that satisfy $k(x, t) = \overline{k(t, x)}$, $x, t \in I$. As we shall have the opportunity to verify in the case of compact operators, it turns out that the structure of such an operator is reminiscent of that of a symmetric matrix. And, as in the case of matrices, eigenvalues and related concepts play an important role in determining the properties of a compact self-adjoint operator. An eigenvalue of a linear operator $T$ defined on $X$ is a scalar $\lambda$ such that there exists $0 \ne x \in X$, with the property that
$$Tx = \lambda x. \tag{4.5}$$
An element $x \in X$ for which (4.5) holds is called an eigenvector associated, or corresponding, to the eigenvalue $\lambda$. The collection of all eigenvectors associated to an eigenvalue $\lambda$ is a subspace $X_\lambda$ of $X$, called the eigenspace corresponding to $\lambda$. The relevant facts concerning a self-adjoint operator are included in our next result.
Proposition 4.5. Let $X$ be a Hilbert space, and suppose $T$ is a self-adjoint mapping defined on $X$. Then $T$ satisfies the following properties:
(i) $T$ is bounded.
(ii) For each $x \in X$, $\langle Tx, x\rangle$ is a real number.
(iii) The norm of $T$ is given by
$$\|T\| = \sup_{\|x\| = 1} |\langle Tx, x\rangle|. \tag{4.6}$$
(iv) All eigenvalues of $T$ are real.
(v) Eigenspaces corresponding to different eigenvalues are orthogonal.
(vi) If $P_\lambda$ denotes the projection onto the eigenspace $X_\lambda$ corresponding to the eigenvalue $\lambda$ of $T$, then
$$TP_\lambda = P_\lambda T = \lambda P_\lambda. \tag{4.7}$$
Proof. (i) Assume that $x_n \to x$, and that $Tx_n \to y$; by the Closed graph theorem it suffices to show that $x \in D(T)$ and $Tx = y$. Let $z \in X$, and observe that on the one hand,
$$\langle Tx_n, z\rangle = \langle x_n, Tz\rangle \to \langle x, Tz\rangle,$$
and, on the other hand,
$$\langle Tx_n, z\rangle \to \langle y, z\rangle.$$
Thus, combining the above relations we have
$$\langle Tx, z\rangle = \langle x, Tz\rangle = \langle y, z\rangle, \quad \text{all } z \in X,$$
which, since $z$ is arbitrary, means that $x \in D(T)$ and that $Tx = y$.
(ii) Since for $x \in X$ we have $\langle Tx, x\rangle = \langle x, Tx\rangle = \overline{\langle Tx, x\rangle}$, as anticipated, $\langle Tx, x\rangle$ is real.
(iii) Let $\eta = \sup_{\|x\| = 1} |\langle Tx, x\rangle|$. Since for $\|x\| = 1$ we have $|\langle Tx, x\rangle| \le \|Tx\| \|x\| \le \|T\|$, it follows that $\eta \le \|T\|$. On the other hand, since by a simple computation it readily follows that
$$4\,\Re\langle Tx, y\rangle = \langle T(x + y), x + y\rangle - \langle T(x - y), x - y\rangle,$$
by putting $x_1 = (1/\|x + y\|)(x + y)$, $x_2 = (1/\|x - y\|)(x - y)$, $\|x_1\| = \|x_2\| = 1$, we get that
$$4\,\Re\langle Tx, y\rangle = \|x + y\|^2 \langle Tx_1, x_1\rangle - \|x - y\|^2 \langle Tx_2, x_2\rangle \le \eta \|x + y\|^2 + \eta \|x - y\|^2 = 2\eta \big( \|x\|^2 + \|y\|^2 \big). \tag{4.8}$$
Pick now $x \in X$ such that $Tx \ne 0$ and $\|x\| = 1$, and put $y = (1/\|Tx\|)Tx$, $\|y\| = 1$, in (4.8). That inequality then becomes
$$4\,\Re\langle Tx, Tx\rangle / \|Tx\| \le 4\eta,$$
or, equivalently, $\|Tx\| \le \eta$. This gives $\|T\| \le \eta$, and (4.6) holds.
(iv) If $\lambda$ is an eigenvalue of $T$ and $x$ is an eigenvector corresponding to $\lambda$ with $\|x\| = 1$, we have
$$\lambda = \lambda \langle x, x\rangle = \langle Tx, x\rangle,$$
which by (ii) above is real.
(v) Let $\lambda \ne \mu$ be eigenvalues of $T$, and let $X_\lambda$ and $X_\mu$ denote the eigenspaces corresponding to $\lambda$ and $\mu$, respectively. Now, assume $x \in X_\lambda$, $y \in X_\mu$, and note that since the eigenvalues are real we get
$$\lambda \langle x, y\rangle = \langle Tx, y\rangle = \langle x, Ty\rangle = \mu \langle x, y\rangle,$$
which can only be true if $\langle x, y\rangle = 0$.
(vi) Since $P_\lambda x \in X_\lambda$ for all $x \in X$, it readily follows that
$$TP_\lambda x = \lambda P_\lambda x, \quad x \in X.$$
As for the commutation relation in (4.7), observe that for $x, y \in X$ we have
$$\langle TP_\lambda x, y\rangle = \langle \lambda P_\lambda x, y\rangle = \langle x, \lambda P_\lambda y\rangle = \langle x, TP_\lambda y\rangle = \langle Tx, P_\lambda y\rangle = \langle P_\lambda Tx, y\rangle.$$
Since $x$ and $y$ are arbitrary, this can only happen if $TP_\lambda = P_\lambda T$.
•
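Properties (ii), (iv), and (v) of Proposition 4.5 can be verified directly for a small Hermitian matrix acting on $\mathbb{C}^2$. The matrix, the test vector, and the eigenpairs below were worked out by hand for this sketch and are assumptions of the example, not data from the text.

```python
def inner(u, v):
    """Inner product on C^n, linear in the first slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    """Matrix-vector product."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

T = [[2 + 0j, 1j], [-1j, 2 + 0j]]   # equals its own conjugate transpose
x = [1 + 2j, 3 - 1j]

quadratic_form = inner(apply(T, x), x)   # should be real, by (ii)

# Hand-computed eigenpairs of T: eigenvalues 3 and 1.
eigenpairs = [(3.0, [1j, 1 + 0j]), (1.0, [-1j, 1 + 0j])]
residuals = [max(abs(a - lam * b) for a, b in zip(apply(T, v), v))
             for lam, v in eigenpairs]
cross = inner(eigenpairs[0][1], eigenpairs[1][1])   # should vanish, by (v)

print(quadratic_form, residuals, cross)
```

The quadratic form evaluates to 44 with zero imaginary part, both eigenvalues are real, and the eigenvectors for the two distinct eigenvalues are orthogonal.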
Referring to the finite dimensional case, it is possible to represent a self-adjoint mapping $T: \mathbb{C}^n \to \mathbb{C}^n$ in terms of projections. First choose a basis for $\mathbb{C}^n$ and represent $T$ by a Hermitian matrix, which we also denote by $T$; for simplicity assume that the matrix $T$ has $n$ different, necessarily real, eigenvalues $\lambda_1 < \cdots < \lambda_n$, say. Then $T$ has an ONS of $n$ eigenvectors $x_1, \ldots, x_n$, say, where $x_i$ corresponds to the eigenvalue $\lambda_i$, $1 \le i \le n$. But this is a basis for $\mathbb{C}^n$, so that
$$x = \sum_{i=1}^n \langle x, x_i\rangle x_i, \quad \text{all } x \in \mathbb{C}^n.$$
Consequently, $Tx$ is equal to
$$\sum_{i=1}^n \langle x, x_i\rangle Tx_i = \sum_{i=1}^n \lambda_i \langle x, x_i\rangle x_i, \quad x \in \mathbb{C}^n.$$
Thus, if $P_i: \mathbb{C}^n \to \{x_i\}$ denotes the projection of $\mathbb{C}^n$ onto the eigenspace of $T$ corresponding to the eigenvalue $\lambda_i$, $1 \le i \le n$, the above expression becomes
$$T = \sum_{i=1}^n \lambda_i P_i.$$
This is a representation of $T$ in terms of projections; a natural question is whether a similar representation holds for compact self-adjoint operators on a Hilbert space. The first step is to show that such operators do have eigenvalues.
Lemma 4.6. Let $X$ be a Hilbert space and $T$ a compact self-adjoint mapping defined on $X$. Then $T$ has a nonzero eigenvalue.
Proof. Since the conclusion of the lemma is obvious when $T = 0$, we assume that $T \ne 0$. Let
$$\mu = \inf_{\|x\| = 1} \langle Tx, x\rangle \quad \text{and} \quad \eta = \sup_{\|x\| = 1} \langle Tx, x\rangle;$$
by (iii) in Proposition 4.5, we have
$$\|T\| = \max\{|\mu|, \eta\}.$$
We claim that
$$\lambda = \begin{cases} \mu & \text{if } \|T\| = |\mu|, \\ \eta & \text{if } \|T\| = \eta, \end{cases}$$
is an eigenvalue of $T$. Consider, for instance, the case when $\|T\| = |\mu| > 0$. By the definition of $\mu$ there exists a sequence $\{x_n\}$ such that
$$\lim_{n \to \infty} \langle Tx_n, x_n\rangle = \mu, \quad \|x_n\| = 1, \text{ all } n.$$
Since $T$ is compact we can choose a subsequence, which we call $\{x_n\}$ again, such that $\lim_{n \to \infty} Tx_n = y$, say. Moreover, since
$$\|Tx_n - \mu x_n\|^2 = \|Tx_n\|^2 - 2\mu \langle Tx_n, x_n\rangle + \mu^2 \le \|T\|^2 - 2\mu \langle Tx_n, x_n\rangle + \mu^2,$$
and the right-hand side above tends to 0 as $n \to \infty$, we get that
$$\lim_{n \to \infty} (Tx_n - \mu x_n) = 0.$$
But then, writing
$$x_n = \frac{1}{\mu} \big( Tx_n - (Tx_n - \mu x_n) \big), \quad n = 1, 2, \ldots$$
it follows that $\lim_{n \to \infty} x_n = (1/\mu) y$. In this case we have
$$y = \lim_{n \to \infty} Tx_n = (1/\mu) Ty,$$
and consequently, $\mu$ is an eigenvalue for $T$, and $y$ is an eigenvector corresponding to $\mu$. •
It is interesting to point out that the assumption that $T$ is self-adjoint is necessary for the validity of Lemma 4.6. Indeed, let $X = \ell^2$ and suppose $T: X \to X$ is given by
T is compact, but not self-adjoint. Also, as the reader can readily verify, T has no nonzero eigenvalues.
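The role of self-adjointness already shows up for $2 \times 2$ matrices. The sketch below is a finite-dimensional stand-in, not the operator on $\ell^2$ referred to in the text: it compares a nonzero nilpotent matrix, whose only eigenvalue is 0, with a symmetric matrix, which has an eigenvalue of absolute value equal to its norm. The helper `eigenvalues_2x2` is a hypothetical name for this example; it solves the characteristic polynomial by the quadratic formula.

```python
import cmath

def eigenvalues_2x2(a):
    """Roots of the characteristic polynomial t^2 - tr(a) t + det(a)."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]

nilpotent = [[0.0, 0.0], [1.0, 0.0]]   # nonzero, not symmetric, square is zero
symmetric = [[0.0, 1.0], [1.0, 0.0]]   # self-adjoint on R^2

nilpotent_eigs = eigenvalues_2x2(nilpotent)
symmetric_eigs = eigenvalues_2x2(symmetric)

print(nilpotent_eigs)   # both roots are 0
print(symmetric_eigs)   # roots 1 and -1
```

The nilpotent matrix is nonzero yet has no nonzero eigenvalue, so the conclusion of Lemma 4.6 fails without self-adjointness even in finite dimensions.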
Corollary 4.7. Let $X$ be a Hilbert space, and suppose $T$ is a compact self-adjoint operator defined on $X$. If $T$ has no nonzero eigenvalues, then $T = 0$.
Observe that the eigenvalue we just found in Lemma 4.6 is one with largest absolute value. Indeed, if $\lambda$ is an eigenvalue of $T$, and $x$ is an eigenvector corresponding to $\lambda$ with $\|x\| = 1$, then we have
$$|\lambda| = |\lambda| \langle x, x\rangle = |\langle Tx, x\rangle| \le \|T\| = |\mu|,$$
which is precisely our remark. The stage is now set for the description of the action of a compact self-adjoint mapping in terms of projections.
Theorem 4.8 (Spectral Decomposition of Compact Operators). Let $X$ be a Hilbert space, and $T$ a compact self-adjoint mapping defined on $X$. Then, the set of distinct eigenvalues $\{\lambda_n\}$ of $T$ is at most countable, and if $P_{\lambda_n}$ denotes the projection of $X$ onto the eigenspace $X_{\lambda_n}$ corresponding to the eigenvalue $\lambda_n$, we have
$$T = \sum_n \lambda_n P_{\lambda_n}. \tag{4.9}$$
The convergence of the series in (4.9) is understood to be in the sense of the norm of $B(X)$, i.e.,
$$\lim_{m \to \infty} \Big\| T - \sum_{n=1}^m \lambda_n P_{\lambda_n} \Big\| = 0.$$
Proof. Unless $T = 0$, by Lemma 4.6, $T$ has an eigenvalue $\lambda_1$, say, with the largest absolute value. If $T_1 = T$ and if $P_{\lambda_1}$ denotes the projection of $X$ onto $X_{\lambda_1}$, the eigenspace corresponding to the eigenvalue $\lambda_1$, consider the mapping $T_2$ given by $T_2 = T_1 - \lambda_1 P_{\lambda_1}$. In view of (vi) in Proposition 4.5 we can rewrite $T_2$ as
$$T_2 = T_1 (I - P_{\lambda_1}) = (I - P_{\lambda_1}) T_1. \tag{4.10}$$
Since $T_1$ is compact and self-adjoint, by 5.48 below, also $T_2$ is compact and self-adjoint. Moreover, since $I - P_{\lambda_1}$ is a projection, and as such $\|I - P_{\lambda_1}\| \le 1$, we have
$$\|T_2\| \le \|T_1\| \, \|I - P_{\lambda_1}\| \le \|T_1\|. \tag{4.11}$$
Whence applying Lemma 4.6 to $T_2$ now, unless $T_2 = 0$ we can find an eigenvalue $0 \ne \lambda_2$ of $T_2$ with largest absolute value. By (4.11), $|\lambda_1| \ge |\lambda_2|$. Moreover, we claim that the following is also true: $\lambda_1$ is not an eigenvalue of $T_2$, and every nonzero eigenvalue of $T_2$ is at the same time an eigenvalue of $T_1$, and the corresponding eigenspaces coincide. First we show that $\lambda_1$ is not an eigenvalue of $T_2$. For if it were, let $0 \ne x$ be an eigenvector corresponding to $\lambda_1$, and note that by (4.10) and (4.7) we have
$$T_1 x - \lambda_1 P_{\lambda_1} x = \lambda_1 x. \tag{4.12}$$
Now, applying $P_{\lambda_1}$ to both sides, again by (4.7), we get
$$\lambda_1 P_{\lambda_1} x - \lambda_1 P_{\lambda_1} x = \lambda_1 P_{\lambda_1} x, \quad \text{i.e.,} \quad \lambda_1 P_{\lambda_1} x = 0, \tag{4.13}$$
which substituted into (4.12) gives $T_1 x = \lambda_1 x$, i.e., $x \in X_{\lambda_1}$. But if this is the case, then by (4.13) we also have $x = P_{\lambda_1} x = 0$, which gives the desired contradiction. We now show that every nonzero eigenvalue of $T_2$ is at the same time an eigenvalue of $T_1$, and the corresponding eigenspaces coincide. In fact, let $\lambda \ne 0$ be an eigenvalue of $T_2$, and $x \ne 0$ an eigenvector of $T_2$ corresponding to $\lambda$. By the definition of $T_2$ we have
$$T_1 (I - P_{\lambda_1}) x = \lambda x, \tag{4.14}$$
and consequently, it follows that
$$(I - P_{\lambda_1}) T_1 (I - P_{\lambda_1}) x = \lambda (I - P_{\lambda_1}) x. \tag{4.15}$$
Moreover, since $T_1$ commutes with $(I - P_{\lambda_1})$, the left-hand side of (4.15) equals $T_1 (I - P_{\lambda_1}) x$, which happens to be the left-hand side of (4.14). Whence equating the right-hand sides of (4.14) and (4.15), and since $\lambda \ne 0$, we get $x = (I - P_{\lambda_1}) x$, which gives
$$T_1 x = T_1 (I - P_{\lambda_1}) x = \lambda x.$$
Thus, $\lambda$ is also an eigenvalue of $T_1$ with eigenvector $x$, and consequently the eigenspace corresponding to $\lambda$ as an eigenvalue of $T_2$ is contained in that of $\lambda$ as an eigenvalue of $T_1$. Now, since $\lambda \ne \lambda_1$, by (v) in Proposition 4.5 it follows that the eigenspaces $X_\lambda$ and $X_{\lambda_1}$, corresponding to $T_1$, are orthogonal. Whence if $y$ is an eigenvector of $T_1$ corresponding to $\lambda$ it follows that $P_{\lambda_1} y = 0$, and by (4.10) we have
$$T_2 y = T_1 (I - P_{\lambda_1}) y = T_1 y = \lambda y.$$
Thus, $y$ is also an eigenvector of $T_2$ corresponding to $\lambda$, and the eigenspaces coincide. Repeating this process we construct compact self-adjoint operators $T_1 = T, T_2, \ldots, T_n$, and eigenvalues $\lambda_1, \ldots, \lambda_n$ of these operators, such that
$$T_n = T - \sum_{k=1}^{n-1} \lambda_k P_{\lambda_k}$$
and
$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|.$$
Further, by what has been shown above, the $\lambda_k$'s are distinct eigenvalues of $T_1$. Now, if for some $n$ we have $T_n = 0$, then the sum in (4.9) is finite, and the conclusion follows. However, if $T_n \ne 0$ for every $n$, then the process described above leads to a sequence $\{T_n\}$ of compact self-adjoint operators and a corresponding sequence $\{\lambda_n\}$ of eigenvalues. Next we show that in this case $\lambda_n \to 0$ as $n \to \infty$. Suppose not, then it follows that
$$|\lambda_n| \ge \varepsilon > 0, \quad \text{all } n.$$
Choose now an ONS $\{x_n\}$ consisting of eigenvectors associated to the $\lambda_n$'s. By the Pythagorean theorem we get
$$\|Tx_m - Tx_n\|^2 = \|\lambda_m x_m - \lambda_n x_n\|^2 = |\lambda_m|^2 + |\lambda_n|^2 \ge 2\varepsilon^2, \quad m \ne n,$$
and consequently, neither the sequence $\{Tx_n\}$ nor any of its subsequences converges, thus contradicting the fact that $T$ is compact. Moreover, since $\|T_n\| = |\lambda_n|$ for all $n$, we also have $\lim_{n\to\infty}\|T_n\| = 0$, and (4.9) holds.

There is yet a last detail to be checked, namely, that $T$ has no nonzero eigenvalues apart from the $\lambda_n$'s. For, if $\lambda \neq 0$ is such an eigenvalue and if $x \neq 0$ is an eigenvector corresponding to $\lambda$, then by (4.9) we get
$$\lambda x = Tx = \sum_n \lambda_n P_{\lambda_n} x. \tag{4.16}$$
Now, by (v) in Proposition 4.5 the elements $P_{\lambda_n}x$, $n = 1,2,\ldots$ are pairwise orthogonal. Hence by (4.16) it follows that
$$\lambda P_{\lambda_m} x = P_{\lambda_m}\Big(\sum_n \lambda_n P_{\lambda_n} x\Big) = \lambda_m P_{\lambda_m} x, \qquad \text{all } m,$$
and since $\lambda \neq \lambda_m$, we have that $P_{\lambda_m}x = 0$ for all $m$. Again by (4.16) we get that $\lambda x = 0$, which, in turn, implies that $x = 0$, a contradiction. ∎

Theorem 4.8 has many important applications, including the development of a functional calculus for compact self-adjoint operators. For instance, if $T$ is such an operator, to represent $T^2$ observe that on account of (4.9) we have
$$T^2 x = \sum_n \lambda_n P_{\lambda_n} Tx = \sum_n \lambda_n^2 P_{\lambda_n} x, \qquad \text{all } x \in X.$$
It is then apparent that for polynomials $p$, we also have
$$p(T)x = \sum_n p(\lambda_n) P_{\lambda_n} x,$$
where the notation $p(T)$ is self-explanatory. Moreover, since functions $f$ that are continuous on $[\mu,\eta]$ are uniform limits of polynomials (we prove this in Corollary 2.3 in Chapter XVII), we also have
$$f(T)x = \sum_n f(\lambda_n) P_{\lambda_n} x, \qquad \text{all } x \in X.$$
In fact, the expression on the right-hand side above defines $f(T)$.
There is yet another way to express the identity in (4.9). Consider the operators $E_\lambda = \sum_{\lambda_n \le \lambda} P_{\lambda_n}$, $\lambda \in R$. It is not hard to check that the family $\{E_\lambda\}_{\lambda \in R}$ has the following property: For $x, y \in X$, put $\phi(\lambda) = (E_\lambda x, y)$; $\phi$ is a right-continuous function that vanishes for $\lambda < \mu$, and that is constant for $\lambda > \eta$. We then have
$$(Tx, y) = \mu\,\phi(\mu) + \int_\mu^\eta \lambda\, d\phi(\lambda),$$
where the integral is an ordinary Riemann-Stieltjes integral. A similar representation is true for arbitrary self-adjoint operators, not necessarily compact; we will not discuss it here.
5. PROBLEMS AND QUESTIONS

5.1 Suppose $X$ is a real inner product space and that the elements $x, y \in X$ satisfy $\|x + y\|^2 = \|x\|^2 + \|y\|^2$. Show that $x \perp y$. Is the result true for complex inner product spaces?

5.2 Suppose $X$ is an inner product space and show that the elements $x, y \in X$ satisfy $\|x + y\| = \|x\| + \|y\|$ iff there exists a scalar $\lambda > 0$ such that $x = \lambda y$.

5.3 If $X$ is a real inner product space and $x, y \in X$ satisfy $\|x\| = \|y\|$, then $(x - y) \perp (x + y)$. What does this mean geometrically? What does the assumption imply in case $X$ is a complex inner product space?

5.4 Suppose $x, y, z$ are elements of an inner product space $X$. Show that Apollonius' identity holds, to wit,
$$\|z - x\|^2 + \|z - y\|^2 = \|x - y\|^2/2 + 2\|z - (x + y)/2\|^2.$$

5.5 Can we obtain the norm $\|z\| = |z_1| + |z_2|$ in $C^2$ from an inner product?
5.6 Discuss under what conditions equality holds in the Cauchy-Schwarz inequality.

5.7 Show that if $y, x, x_n$ are elements of an inner product space $X$, $n = 1,2,\ldots$, and if $y \perp x_n$ for all $n$, and $\lim_{n\to\infty} x_n = x$, then $y \perp x$.

5.8 Show that in an inner product space $X$, $x \perp y$ iff $\|x + \lambda y\| = \|x - \lambda y\|$ for all scalars $\lambda$. Further, $x \perp y$ iff $\|x + \lambda y\| \ge \|x\|$ for all scalars $\lambda$.

5.9 Prove that the span of a subset $M$ of a Hilbert space $X$ is dense in $X$ iff $M^\perp = \{0\}$.
5.10 Let $M_1 \supseteq M_2$ be nonempty subsets of an inner product space $X$. Show that (a) $M_1 \subseteq M_1^{\perp\perp}$, (b) $M_1^\perp \subseteq M_2^\perp$, and (c) $M_1^{\perp\perp\perp} = M_1^\perp$. Further, show that a subspace $Y$ of a Hilbert space $X$ is closed iff $Y = Y^{\perp\perp}$.
5.11 Suppose $M$ is a closed subspace of a Hilbert space $X$ and let $x$ be an element of $X$. Prove that
$$\min\{\|x - y\| : y \in M\} = \max\{|(x, y)| : y \in M^\perp,\ \|y\| = 1\}.$$
5.12 Let $I = [0,1]$. Compute $\max \int_I x^3 f\,dx$, and $\min_{a,b} \int_I (x^5 - a - bx)^2\,dx$, subject to the conditions
$$\int_I x^k f(x)\,dx = 0,\ k = 0,1,2, \qquad\text{and}\qquad \int_I |f(x)|^2\,dx = 1.$$
5.13 The following extension of Theorem 1.8 is called the Lax-Milgram lemma. Let $X$ be a Hilbert space. Let $B(x,y)$ be a complex-valued functional defined on $X \times X$ which satisfies the following four conditions:
(i) $B(x_1 + \lambda x_2, y) = B(x_1, y) + \lambda B(x_2, y)$ for all $x_1, x_2, y$ in $X$ and scalars $\lambda$.
(ii) $B(x, y_1 + \lambda y_2) = B(x, y_1) + \bar{\lambda} B(x, y_2)$ for all $x, y_1, y_2$ in $X$ and scalars $\lambda$.
(iii) There is a positive constant $k$ such that $|B(x,y)| \le k\|x\|\|y\|$ for all $x, y$ in $X$.
(iv) There exists a positive constant $c$ such that $|B(x,x)| \ge c\|x\|^2$ for all $x \in X$.
Then for every $L \in X^*$ there exists a unique element $y \in X$ such that $Lx = B(x,y)$ for all $x \in X$. More precisely, there exists a uniquely determined bounded linear operator $T$ with a bounded inverse $T^{-1}$ such that $(x,y) = B(x,Ty)$ for all $x, y \in X$, and
$$\|T\| \le 1/c, \qquad \|T^{-1}\| \le k.$$
5.14 Let $X$ be a complex inner product space. A complex-valued functional $H(x,y)$ defined on $X \times X$ is said to be a Hermitian form provided that the following two conditions hold:
(i) $H(x_1 + \lambda x_2, y) = H(x_1, y) + \lambda H(x_2, y)$ for all $x_1, x_2, y$ in $X$, and scalars $\lambda$.
(ii) $H(x,y) = \overline{H(y,x)}$ for all $x, y \in X$.
$H$ is said to be positive semidefinite if
(iii) $H(x,x) \ge 0$ for all $x \in X$.
Show that if (i), (ii) and (iii) hold, then $H$ satisfies the following Cauchy-Schwarz inequality
$$|H(x,y)|^2 \le H(x,x)H(y,y), \qquad \text{all } x, y \in X.$$
Further, show that $p(x) = \sqrt{H(x,x)}$ defines a seminorm on $X$.

5.15 Let $X, Y$ be inner product spaces and $T: X \to Y$ be a bounded linear operator. Then, (a) $T = 0$ iff $(Tx, y) = 0$ for all $x$ in $X$ and $y$ in $Y$, and, (b) If $X = Y$ is a complex linear space, then $T = 0$ iff $(Tx, x) = 0$ for all $x$ in $X$.

5.16 Suppose $X$ is a Hilbert space, and let $T \in B(X)$. Show that
$$\|T\| = \sup_{\|x\| = \|y\| = 1} |(Tx, y)|.$$
5.17 Suppose $P: X \to X$ is a linear map that satisfies (2.1) in Proposition 2.1 and so that $P^2$ is a projection. Is $P$ a projection?

5.18 Let $I = [0,1]$ and consider the linear mapping $T: L^2(I) \to L^2(I)$ given by $Tf(x) = a(x)f(x)$, $f \in L^2(I)$. Find necessary and sufficient conditions on $a(x)$ for $T$ to be a projection.

5.19 Let $X$ be a Hilbert space and suppose $P_1, \ldots, P_n$ are projections on $X$. Find necessary and sufficient conditions for $P = P_1 + \cdots + P_n$ to be a projection. What does the subspace of $X$ onto which $P$ projects look like?

5.20 Let $M$ be a closed subspace of an infinite-dimensional Hilbert space $X$. Show that the projection $P_M$ is compact iff $M$ is finite dimensional.

5.21 (Gram-Schmidt) Given an arbitrary linearly independent sequence of elements $\{y_n\}$ in an inner product space $X$ there is an ONS of elements $\{x_n\}$ in $X$ such that
$$\operatorname{span}\{y_1, \ldots, y_n\} = \operatorname{span}\{x_1, \ldots, x_n\} \qquad \text{for every } n.$$
5.22 Suppose $\|x\| = 1$. Show that at most $1/\varepsilon^2$ of the Fourier coefficients of $x$ with respect to any ONS in $X$ exceed, in modulus, any $\varepsilon > 0$.

5.23 Suppose $X$ is a Hilbert space, show that $X$ has an orthonormal basis. Moreover, in case $X$ is separable, the existence of such a basis may be established without invoking Zorn's Lemma.

5.24 Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and let $y_n = x_n - x_{n+1}$, $n \ge 1$. Show that the system $\{y_n\}$, although not orthonormal, is nevertheless complete in $X$.
5.25 Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and $\{y_n\} \subseteq X$ is such that $\sum_n \|x_n - y_n\|^2 < 1$. Show that $\{y_n\}$ is complete in $X$. Is the same conclusion true if $\sum_n \|x_n - y_n\|^2 < \infty$ instead?

5.26 (Paley-Wiener) Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and suppose the sequence $\{y_n\} \subseteq X$ satisfies
$$\Big\|\sum_{m=1}^{n} \lambda_m (x_m - y_m)\Big\| \le \lambda \Big\|\sum_{m=1}^{n} \lambda_m x_m\Big\|,$$
for any scalars $\lambda_m$, $m \ge 1$, and a constant $\lambda$, $0 \le \lambda < 1$, independent of the choice of scalars and $n$. Show that each $x \in X$ can be written as $x = \sum_{n=1}^{\infty} c_n y_n$, where the $c_n$'s are scalars and the sum converges in $X$.

5.27 Suppose $I = [a,b]$ is an interval of the line. Show that an ONS {
I:~=1
(I[a,:l:j
5.28 Let {
f
OO
5.30 Referring to 5.14, if the Hermitian form there is bounded, i.e., if $|H(x,y)| \le k\|x\|\|y\|$ for all $x, y \in X$, prove that there exists a self-adjoint operator $T$ such that $H(x,y) = (Tx,y)$ for all $x, y \in X$.

5.31 If $H(x,y)$ is a Hermitian form, then $H(x) = H(x,x)$ is called a quadratic form. Prove that Hermitian and quadratic forms are related by
$$4H(x,y) = H(x + y) - H(x - y) + iH(x + iy) - iH(x - iy).$$
5.32 Prove, without invoking the Continuum Hypothesis, that the dimension of any infinite-dimensional Hilbert space is at least $c$.

5.33 Let $X$ be a Hilbert space, and $T, T_n \in B(X)$, $n = 1,2,\ldots$ Show that if $\lim_{n\to\infty}(T_n x, y) = (Tx, y)$ uniformly for $y \in X$, $\|y\| = 1$, then $\lim_{n\to\infty}\|T_n x - Tx\| = 0$ for each $x \in X$. The assumption concerning the uniform convergence means that the sequence $\{(T_n x, y)\}$
converges to $(Tx, y)$, uniformly over those $y \in X$ with $\|y\| = 1$ for each $x \in X$. Moreover, if $\lim_{n\to\infty}\|T_n x - Tx\| = 0$ uniformly for $x \in X$, $\|x\| = 1$, then $\lim_{n\to\infty}\|T_n - T\| = 0$ (in $B(X)$).

5.34 Let $I = [0,1]$ and $X = L^2(I)$. Show that the linear operator $T$ on $X$ given by $Tf(x) = \int_{[0,x]} f(y)\,dy$, $0 \le x \le 1$, is bounded, and that $\|T\| \le 1$. Furthermore, show that $T^* = -T + P$, where $P$ is the orthogonal projection onto the subspace of constant functions.

5.35 Let $I = [0,1]$ and consider the linear operator $T$ defined for $f$ in $L^2(I)$ by $Tf(x) = f(x) + 2\int_{[0,x]} e^{(x-t)} f(t)\,dt$, $0 \le x \le 1$. Show that $T \in B(L^2(I))$, and that if $\phi(x) = e^{x}$ and $\psi(x) = e^{-x}$, the following properties hold: (a) $T\phi = \psi$, (b) If $(f, \phi) = 0$, then $(Tf, \psi) = 0$ and $\|Tf\|_2 = \|f\|_2$, (c) If $(g, \psi) = 0$, then there exists $f \in L^2(I)$ such that $(f, \phi) = 0$ and $g = Tf$. Conclude from these observations that $T$ is one-to-one, onto, and that its inverse is continuous.
5.36 Let $X$ be a Hilbert space, and $T, T_1 \in B(X)$. Show that the adjoint mapping satisfies the following properties: (a) $\|T^*T\| = \|T\|^2$, (b) $(\lambda T + \lambda_1 T_1)^* = \bar{\lambda}T^* + \bar{\lambda}_1 T_1^*$, (c) $(TT_1)^* = T_1^*T^*$, and, (d) $T^{**} = T$.

5.37 Let $X$ be a Hilbert space, and $T \in B(X)$. If as usual $R(T)$ and $N(T)$ denote the range and the nullspace of $T$ respectively, show that: (a) $R(T)^\perp = N(T^*)$, and, (b) $N(T)^\perp = \overline{R(T^*)}$.

5.38 Show that if $X$ is a complex Hilbert space and $T$ is a bounded linear operator on $X$ such that $(Tx, x)$ is real for all $x$ in $X$, then $T$ is self-adjoint. This result is generally false if $X$ is a real Hilbert space.

5.39 Suppose $T_1, T_2$ are linear mappings defined on $X$ that, in addition, satisfy $(T_1 x, y) = (x, T_2 y)$ for all $x, y \in X$. Show that $T_1 \in B(X)$, and that $T_2 = T_1^*$.

5.40 Define a relation among the self-adjoint transformations on a Hilbert space $X$ by writing $T_1 \le T_2$, or $T_2 \ge T_1$, when $(T_2 x, x) \ge (T_1 x, x)$ for all $x \in X$. Show that this relation is a partial ordering. A mapping $T \ge 0$ is said to be positive. Prove that positive self-adjoint mappings satisfy the following generalized Cauchy-Schwarz inequality: $|(Tx, y)|^2 \le (Tx, x)(Ty, y)$.

5.41 Prove that if the sequence $\{T_n\}$ of self-adjoint operators on a Hilbert space $X$ satisfies $0 \le T_1 \le \cdots \le T_n \le \cdots \le I$, then there exists a bounded self-adjoint mapping $T$ on $X$ such that $\lim_{n\to\infty}\|T_n - T\| = 0$.
5.42 Let $I = [-1,1]$. Is there a bounded linear functional $L$ on $L^2(I)$ such that $Lf = f(0)$ for every $f \in C^1(I)$? Is there a bounded linear functional $L_1$ on $L^2(I)$ such that $L_1 f = f'(0)$ for every $f \in C^1(I)$?

5.43 Let $X$ be a Hilbert space, and $T, T_1, T_2, \ldots$ be bounded linear operators on $X$. Assume that $\lim_{n\to\infty}(T_n x, y) = (Tx, y)$ for all $x, y \in X$. Show that for some constant $c$, $\|T_n\| \le c$ for all $n$.

5.44 Let $I = [0,1]$ and suppose $X$ is a linear subspace of $C(I)$ which is closed with respect to the norm of $L^2(I)$. Prove that $X$ is also closed in $C(I)$, and, as a consequence of this, show that there is a constant $c$ such that $\|f\|_2 \le \|f\|_\infty \le c\|f\|_2$ for every $f \in X$. Furthermore, show that there is a function $k(\cdot, y) \in L^2(I)$ such that
$$f(y) = \int_I k(x,y) f(x)\,dx, \qquad \text{all } f \in X,\ y \in I.$$
Finally, show that the dimension of $X$ cannot exceed $c^2$.

5.45 Assume that $\{x_n\}$ is a sequence of elements of a Hilbert space $X$ such that $\|x_n\| = 1$ for $n = 1,2,\ldots$ If the $x_n$'s converge weakly to an element $x \in X$, does it necessarily follow that $\lim_{n\to\infty}\|x_n - x\| = 0$? Similarly, if the sequence $\{x_n\}$ converges weakly to $x \in X$, and if the convergence is uniform over $y \in X$ with $\|y\| = 1$, does it follow that $\lim_{n\to\infty}\|x - x_n\| = 0$? The assumption concerning the uniform convergence means that the sequence $\{(x_n, y)\}$ converges to $(x, y)$, uniformly over those $y \in X$ with $\|y\| = 1$.

5.46 Let $X$ be a Hilbert space and suppose that $\{x_n\} \subseteq X$ converges weakly to $x \in X$. Prove that there is a subsequence $\{x_{n_k}\}$ such that its arithmetic means $(x_{n_1} + \cdots + x_{n_k})/k$ converge to $x$ in the norm of $X$.

5.47 Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and $\lambda_n \to 0$ is a sequence of scalars. Prove that the mapping $T$ on $X$ given by $Tx = \sum_{n=1}^{\infty} \lambda_n (x, x_n) x_n$ is compact.

5.48 Prove that if $T_1, T_2$ are compact operators on a Hilbert space $X$ and $\lambda$ is a scalar, then $T_1T_2$ and $T_1 + \lambda T_2$ are also compact. Further, if $T$ is bounded, then also $TT_1$ and $T_1T$ are compact.

5.49 Show that if $\{T_n\} \subseteq B(X)$ is a sequence of compact operators on a Hilbert space $X$ and $\lim_{n\to\infty} T_n = T$ (in $B(X)$), then $T$ is also compact.

5.50 Prove that if $T$ is a compact mapping on a Hilbert space $X$, then so is its adjoint $T^*$. Is the converse true?
CHAPTER
XVII
Fourier Series
In this section we discuss how the different results we have covered thus far fit into the theory of Fourier series.

1. THE DIRICHLET KERNEL
A trigonometric polynomial $p(x)$ is an expression of the form
$$p(x) = \sum_{|k| \le n} c_k e^{ikx}, \qquad |c_{-n}| + |c_n| \neq 0. \tag{1.1}$$
$n$ is the degree of $p$ and the $c_k$'s are (possibly) complex constants. Thus $p$ is a continuous function of period $2\pi$ and is therefore determined by its values on $T = (-\pi, \pi]$ or any other interval of length $2\pi$ for that matter. Given a trigonometric polynomial $p$ of degree $\le n$, we can easily compute the constants $c_k$ by means of
$$c_k = \frac{1}{2\pi}\int_T p(x) e^{-ikx}\,dx, \qquad |k| \le n.$$
This observation follows at once from the fact that
$$\frac{1}{2\pi}\int_T e^{ikx}\,dx = \begin{cases} 0 & \text{if } k \neq 0 \\ 1 & \text{if } k = 0. \end{cases}$$
A trigonometric series is an expression of the form
$$\sum_{k=-\infty}^{\infty} c_k e^{ikx}. \tag{1.2}$$
Since we make no assumption concerning the convergence of this series, (1.2) only formally represents a function of period $2\pi$. A Fourier series is a trigonometric series for which there is a periodic Lebesgue integrable function $f$ such that
$$c_k = c_k(f) = \frac{1}{2\pi}\int_T f(t) e^{-ikt}\,dt, \qquad k = 0, \pm 1, \pm 2, \ldots \tag{1.3}$$
In this case we call the constants $c_k$ the Fourier coefficients of $f$ and denote this correspondence by
$$f \sim \sum_k c_k e^{ikx}. \tag{1.4}$$
Still, in the case of Fourier series no assumption concerning the convergence of the series (1.4) is made. More precisely, if $s_n(f, x)$ denotes the trigonometric polynomial of degree $\le n$ corresponding to the symmetric partial sums of (1.4) of order $n$, i.e.,
$$s_n(f, x) = \sum_{|k| \le n} c_k e^{ikx}, \tag{1.5}$$
then nothing is known or assumed concerning the existence of $\lim_{n\to\infty} s_n(f, x)$ for any $x$ in $T$. Now, by (1.3) it is clear that $|c_k(f)| \le \|f\|_1$ for all $k$, and there is more we can say in this direction.
Theorem 1.1 (Riemann-Lebesgue). Let $f$ be a periodic integrable function on $T$. Then $c_k(f) \to 0$ as $|k| \to \infty$.
Proof. We invoke the fact that trigonometric polynomials are dense in $L(T)$; a proof of this is given in Corollary 2.3 below. Now, given $\varepsilon > 0$, we show that $|c_k| \le \varepsilon$ provided $|k| > k_0$ is large enough. Let $p$ be a trigonometric polynomial such that $\|f - p\|_1 \le \varepsilon$, and let $k_0$ = degree of $p$. Then for $|k| > k_0$ we have
$$c_k(f) = \frac{1}{2\pi}\int_T (f(x) - p(x)) e^{-ikx}\,dx,$$
and consequently,
$$|c_k(f)| \le \|f - p\|_1 \le \varepsilon. \qquad ∎$$

Now that there is some hope that the Fourier series of an integrable function converges, we take a closer look at $s_n(f, x)$. It can also be written
as
$$\sum_{|k| \le n}\Big(\frac{1}{2\pi}\int_T f(t) e^{-ikt}\,dt\Big) e^{ikx} = \frac{1}{\pi}\int_T f(t)\Big(\frac{1}{2}\sum_{|k| \le n} e^{ik(x-t)}\Big)dt = \frac{1}{\pi}\int_T f(t) D_n(x - t)\,dt, \tag{1.6}$$
where we have denoted by
$$D_n(x) = \frac{1}{2}\sum_{|k| \le n} e^{ikx}, \qquad n = 0, 1, \ldots \tag{1.7}$$
the Dirichlet kernel of order $n$. In other words, $s_n(f, x)$ is essentially the convolution of $f$ with the Dirichlet kernel $D_n$ of order $n$. We list some properties of these kernels. In the first place by summing the geometric series in (1.7) we get that $D_n(x)$ equals
$$\frac{1}{2} e^{-inx}\,\frac{e^{i(2n+1)x} - 1}{e^{ix} - 1} = \frac{1}{2}\,\frac{e^{i(n+1)x} - e^{-inx}}{e^{ix/2}(e^{ix/2} - e^{-ix/2})} = \frac{1}{2}\,\frac{e^{i(n+1/2)x} - e^{-i(n+1/2)x}}{e^{ix/2} - e^{-ix/2}} = \frac{\sin((n + 1/2)x)}{2\sin(x/2)}, \qquad n = 0, 1, \ldots \tag{1.8}$$
Thus $D_n$ is an even function, and integrating (1.7) over $T$ it readily follows that
$$\frac{1}{\pi}\int_T D_n(x)\,dx = \frac{2}{\pi}\int_{[0,\pi]} D_n(x)\,dx = 1, \qquad \text{all } n. \tag{1.9}$$
It is also possible to estimate $D_n$. In fact, by (1.7),
$$|D_n(x)| \le \frac{1}{2}\sum_{|k| \le n} |e^{ikx}| = \frac{2n+1}{2} = n + \frac{1}{2}, \qquad \text{all } n. \tag{1.10}$$
Moreover, since as is readily seen
$$1/(2\sin(x/2)) \le \pi/2x, \qquad \text{for } 0 < x < \pi, \tag{1.11}$$
by (1.8) it follows that
$$|D_n(x)| \le \pi/2|x|, \qquad 0 < |x| < \pi, \ \text{all } n. \tag{1.12}$$
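The properties just listed can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates $D_n$ both from the defining sum (1.7) and from the closed form (1.8), and tests the bounds (1.10) and (1.12) on a grid in $(0,\pi)$. The function names and the grid size are illustrative choices.

```python
import cmath
import math


def dirichlet_sum(n, x):
    """D_n(x) from the defining sum (1.7): half the sum of e^{ikx}, |k| <= n."""
    total = sum(cmath.exp(1j * k * x) for k in range(-n, n + 1))
    return 0.5 * total.real  # the imaginary parts cancel in pairs


def dirichlet_closed(n, x):
    """D_n(x) from the closed form (1.8); requires sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / (2.0 * math.sin(x / 2.0))


def check_dirichlet(n, samples=200):
    """Return (largest gap between (1.7) and (1.8), whether (1.10) and (1.12) hold)."""
    worst = 0.0
    bounds_ok = True
    tol = 1e-9  # slack for floating-point rounding
    for j in range(1, samples):
        x = math.pi * j / samples  # grid point with 0 < x < pi
        d_sum = dirichlet_sum(n, x)
        worst = max(worst, abs(d_sum - dirichlet_closed(n, x)))
        bounds_ok = bounds_ok and abs(d_sum) <= n + 0.5 + tol
        bounds_ok = bounds_ok and abs(d_sum) <= math.pi / (2.0 * x) + tol
    return worst, bounds_ok
```

Calling `check_dirichlet(7)`, say, returns the largest discrepancy between the two expressions together with a flag for the two bounds; the discrepancy should be at the level of rounding error.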
Because of these properties, the $D_n$'s look like an approximate identity parametrized by $n$ rather than $\varepsilon$. The natural question is, then, how closely does the behaviour of the Dirichlet kernels resemble that of an approximate identity. In particular, do the convolutions in (1.6) tend to $f$ a.e.? Or, in other words, do the partial sums $s_n(f, \cdot)$ tend to $f$ a.e.?

We begin by considering the question of the convergence for continuous functions $f$, and to do so we introduce the Lebesgue constants $L_n$. They are given by
$$L_n = \frac{1}{\pi}\int_T |D_n(x)|\,dx, \qquad n = 0, 1, \ldots$$
This is one reason why: In Chapter XIII we have seen that in addition to (1.9) a uniform bound on the $L_n$'s is an important ingredient in establishing the convergence of the approximate identities to the function in question. There is yet another way to see this: From (1.6) it is clear that
$$|s_n(f, 0)| \le \frac{1}{\pi}\int_T |f(t)||D_n(t)|\,dt \le \|f\|_\infty L_n.$$
Moreover, setting $f_n(x) = \operatorname{sgn} D_n(x)$, we readily see that $\|f_n\|_\infty = 1$ for all $n$ and consequently,
$$\sup_{f \in L^\infty(T),\ \|f\|_\infty \le 1} |s_n(f, 0)| = L_n, \qquad \text{all } n.$$
Further, since the function $f_n$ is real-valued and discontinuous at a finite number of points, it is easy to modify its values in small neighbourhoods of those points to obtain that also for continuous functions
$$\sup_{f \in C(T),\ \|f\|_\infty \le 1} |s_n(f, 0)| = L_n. \tag{1.13}$$
Proposition 1.2. $L_n \sim (4/\pi^2)\ln n$, as $n \to \infty$.

Proof. Since $D_n$ is even and $\sin(t/2) > 0$ for $0 < t < \pi$, we have that
$$L_n = \frac{2}{\pi}\int_{[0,\pi]} |\sin((n + 1/2)t)|\Big(\frac{1}{2\sin(t/2)} - \frac{1}{t}\Big)dt + \frac{2}{\pi}\int_{[0,\pi]} \frac{|\sin((n + 1/2)t)|}{t}\,dt = A_n + B_n,$$
say. It is clear that the integrand in $A_n$ is bounded, independently of $n$, and consequently the $A_n$'s are bounded. As for the $B_n$'s, the change of variables $(n + 1/2)t = s$ gives
$$B_n = \frac{2}{\pi}\int_{[0,(n+1/2)\pi]} \frac{|\sin s|}{s}\,ds = \frac{2}{\pi}\Big(\int_{[0,n\pi]} + \int_{(n\pi,(n+1/2)\pi]}\Big)\frac{|\sin s|}{s}\,ds = B_n' + B_n'',$$
say. Clearly $B_n'' \le k$ for all $n$. Thus we will be done once we show that $B_n' \sim (4/\pi^2)\ln n$ for large $n$. To see this rewrite
$$B_n' = \frac{2}{\pi}\int_{[0,\pi]} \sin t\,\Big\{\sum_{k=0}^{n-1}\frac{1}{k\pi + t}\Big\}\,dt.$$
The expression in $\{\cdot\}$ in the above integral can be estimated below and above, uniformly for $t \in (0, \pi]$, by
$$\frac{1}{\pi}\sum_{k=0}^{n-1}\frac{1}{k+1} = \frac{1}{\pi}\sum_{k=1}^{n}\frac{1}{k} \qquad\text{and}\qquad \frac{1}{t} + \frac{1}{\pi}\sum_{k=1}^{n-1}\frac{1}{k},$$
respectively. Since $\int_{[0,\pi]} \sin t\,dt = 2$, these estimates show that
$$B_n' \sim \frac{4}{\pi^2}\sum_{k=1}^{n}\frac{1}{k} \sim \frac{4}{\pi^2}\ln n, \qquad \text{large } n. \qquad ∎$$
As a consequence of this result we now know that, on account of (1.13), for each (large) $n$ there is a continuous function $f$, $|f(x)| \le 1$ for all $x$ in $T$, such that $|s_n(f, 0)| \sim (4/\pi^2)\ln n$. But, do there exist continuous functions whose Fourier series have large partial sums at 0? Assuming that no such functions exist we will reach a contradiction. Indeed, in the notation of Theorem 3.2 in Chapter XV, let $X = C(T)$, $Y = C$, and put
$$T_n f = s_n(f, 0) = \frac{1}{\pi}\int_T f(x) D_n(x)\,dx, \qquad n = 0, 1, 2, \ldots \tag{1.14}$$
It is clear that the $T_n$'s form a sequence of bounded linear mappings from $X$ into $Y$, and by Proposition 1.2 and (1.13) it follows that
$$\|T_n\| = \sup_{f \in C(T),\ \|f\|_\infty \le 1} |T_n f| \sim \frac{4}{\pi^2}\ln n, \qquad \text{all } n. \tag{1.15}$$
Suppose now that the Fourier series of every continuous function converges at 0; in particular, the partial sums will be bounded there, i.e., $|T_n f| = |s_n(f, 0)| \le c_f < \infty$ for each $f \in C(T)$ and all $n$. Then by the Uniform Boundedness Principle there is a constant $c$ such that
$$\|T_n\| \sim (4/\pi^2)\ln n \le c, \qquad \text{all } n,$$
which is impossible. In fact, by (3.3) in Chapter XV, the set of those continuous functions $f$ whose Fourier series converges at the origin is of first category in $C(T)$.

As for explicit examples, in 1876 du Bois-Reymond (1831-1889) constructed a continuous function with a nonconvergent Fourier series at a single point. This led to an example where there is no convergence at each point of a dense set of points in $T$; this set nevertheless is of Lebesgue measure 0. In 1926 Kolmogorov produced an example of an integrable function whose Fourier series diverges a.e. in $T$. Until 1966 it was not known whether or not there exists a continuous function with that property. In that year Carleson proved that "Lusin's conjecture" is true, to wit, the Fourier series of every $L^2(T)$ function converges a.e. Shortly after, Hunt extended Carleson's argument to prove that the Fourier series of $L^p(T)$ functions converge a.e. in $T$ provided $p > 1$.

The Principle of Condensation of Singularities allows us to construct functions whose Fourier series do not converge at many points.
Example 1.3. There exists a real-valued continuous function $f(x)$ of period $2\pi$ such that the partial sums $s_n(f, x)$ of its Fourier series expansion satisfy the condition
$$\limsup_{n\to\infty} |s_n(f, x)| = \infty,$$
for those $x$'s on a set $B \subset T$ which may be taken to contain any sequence $\{x_m\} \subset T$. Indeed, from (1.7) it is clear that for a given fixed $x \in T$, $s_n(f, x)$ is a bounded linear functional on $C(T)$ with norm $L_n \to \infty$ as $n \to \infty$. Therefore, if we take a sequence $\{x_m\} \subset T$, by Theorem 3.3 in Chapter XV, the set
$$\mathcal{B} = \{f \in C(T): \limsup_{n\to\infty} |s_n(f, x)| = \infty \text{ for } x = x_1, x_2, \ldots\}$$
is of second category in $C(T)$. Hence, by the completeness of $C(T)$, the (bad) set $\mathcal{B}$ is nonempty. In fact, there is more we can say about functions in $\mathcal{B}$: If we take $\{x_m\}$ to be a dense sequence of points in $T$, then for any $f \in \mathcal{B}$ the set
$$B = \{x \in T: \limsup_{n\to\infty} |s_n(f, x)| = \infty\}$$
is uncountable. To see this, given $f \in \mathcal{B}$, let
$$F_{m,n} = \{x \in T: |s_n(f, x)| \le m\}, \qquad F_m = \bigcap_{n=1}^{\infty} F_{m,n}.$$
By the continuity of $f$ it readily follows that the $F_{m,n}$'s and hence $F_m$ are closed subsets of $T$. If we can show that $\bigcup_{m=1}^{\infty} F_m$ is of first category in $T$, then the set $B = T \setminus \bigcup_{m=1}^{\infty} F_m \supseteq \{x_1, x_2, \ldots\}$ would be of second category in $T$, and so uncountable. So, finally we prove that each $F_m$ is of first category in $T$. Suppose some $F_{m_0}$ is of second category in $T$. Then the closed set $F_{m_0}$ must contain a closed subinterval $[a, b]$ of $T$. This implies that
$$\sup_{n \ge 1} |s_n(f, x)| \le m_0, \qquad \text{all } x \in [a, b],$$
contradicting the fact that $B$ contains a dense subset of $T$.
2. CESARO SUMMABILITY

In the previous section we saw that the notion of pointwise convergence is not the ideal one for dealing with Fourier series of continuous, or integrable, functions. Now we address the convergence question from the more general point of view of the arithmetic means. We begin by defining the notion of Cesaro $(C, 1)$ summability. Given a sequence of complex numbers $\{c_k\}$, we say that it converges to $L$ in the Cesaro $(C, 1)$ sense, and we write
$$\lim_{k\to\infty} c_k = L \ (C, 1)$$
provided that $\lim_{k\to\infty} C_k = L$, where $C_k$ is the average $(c_1 + \cdots + c_k)/k$. It would be reassuring to know that convergent sequences also converge $(C, 1)$ to the same limit.
Proposition 2.1. Let $\lim_{k\to\infty} c_k = L$; then $\lim_{k\to\infty} c_k = L\ (C, 1)$.

Proof. We may assume that $L = 0$ by replacing $c_k$ by $c_k - L$ if necessary. Observe that the $c_k$'s have the following properties: (i) $|c_k| \le M$, all $k$. (ii) Given $\varepsilon > 0$, there is a $k_0$ such that $|c_k| \le \varepsilon$ provided $k \ge k_0$. It is now a simple matter to estimate the averages $C_k$. Indeed,
$$|C_k| \le \frac{1}{k}\sum_{j=1}^{k_0} |c_j| + \frac{1}{k}\sum_{j=k_0+1}^{k} |c_j| \le \frac{k_0 M}{k} + \varepsilon, \qquad k \ge k_0.$$
Therefore, by first picking $\varepsilon$ and thus fixing $k_0$, and then letting $k \to \infty$, we see that $|C_k|$ can be made arbitrarily small for $k$ large. ∎
Note that the oscillating sequence $c_k = (-1)^k$ has limit 0 in the $(C, 1)$ sense. In a similar vein we define the Cesaro summability of series. Given a sequence $(c_k)$ of complex numbers, put
$$s_n = \sum_{k=1}^{n} c_k, \qquad\text{and}\qquad \sigma_n = \frac{1}{n}\sum_{k=1}^{n} s_k.$$
If $\lim_{n\to\infty} \sigma_n = s$, then we say that the series $\sum c_k$ is $C_1$-summable to $s$, and we write $\sum c_k = s\ (C, 1)$. By Proposition 2.1 it follows that if $\lim_{n\to\infty} s_n = s$, then $\sum c_k = s\ (C, 1)$. On the other hand, if $c_k = z^k$, $z \neq 1$ a complex number with $|z| = 1$, then $\sum z^k$ does not converge, yet
$$\sum_{k=0}^{\infty} z^k = \frac{1}{1 - z}\ (C, 1).$$
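Both observations, the $(C,1)$ limit of the sequence $(-1)^k$ and the $(C,1)$ sum of the geometric series on the unit circle, are easy to reproduce numerically. The sketch below is illustrative only; the helper names are not from the text, and the choice $z = e^{i}$ is an arbitrary point of the unit circle different from 1.

```python
import cmath
from itertools import accumulate


def running_averages(values):
    """Return the averages (v_1 + ... + v_n)/n for n = 1, 2, ..., len(values)."""
    sums = accumulate(values)
    return [s / n for n, s in enumerate(sums, start=1)]


def cesaro_means_of_series(terms):
    """Return the (C,1) means of a series: the running averages of its partial sums."""
    partial_sums = list(accumulate(terms))
    return running_averages(partial_sums)


# The oscillating sequence (-1)^k, k = 1, ..., 2000: its averages tend to 0.
averages = running_averages([(-1) ** k for k in range(1, 2001)])

# The geometric series with z = e^{i}: the partial sums keep oscillating,
# but their averages approach 1/(1 - z).
z = cmath.exp(1j)
means = cesaro_means_of_series([z ** k for k in range(5000)])
gap = abs(means[-1] - 1 / (1 - z))
```

Here `gap` is small because the averages differ from $1/(1-z)$ by $\frac{1}{n}\sum_{N \le n} z^N/(1-z)$, a bounded geometric sum divided by $n$.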
We pass now to explore how the notion of Cesaro summability applies to Fourier series. As before we begin by determining the integral, or convolution, representation of the Cesaro means $\sigma_n(f, x)$ corresponding to (the Fourier series of) $f$. More precisely, given $f \sim \sum c_k e^{ikx}$, let
$$\sigma_n(f, x) = \frac{s_0(f, x) + \cdots + s_n(f, x)}{n + 1}, \qquad n = 0, 1, \ldots \tag{2.1}$$
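It is fairly easy to obtain an explicit expression of $\sigma_n(f, x)$. Indeed, note that the numerator of (2.1) is the sum of
$$\begin{array}{c} c_0 \\ c_{-1}e^{-ix} + c_0 + c_1 e^{ix} \\ \vdots \\ c_{-n}e^{-inx} + \cdots + c_0 + \cdots + c_n e^{inx} \end{array}$$
which equals $c_{-n}e^{-inx} + 2c_{-n+1}e^{-i(n-1)x} + \cdots + (n + 1)c_0 + \cdots + c_n e^{inx}$. Thus, by dividing by $(n + 1)$ we get that
$$\sigma_n(f, x) = \sum_{|k| \le n}\Big(1 - \frac{|k|}{n + 1}\Big) c_k e^{ikx}. \tag{2.2}$$
As for the integral representation alluded to above, by (1.6) we readily see that
$$\sigma_n(f, x) = \frac{1}{\pi}\int_T f(t)\Big(\frac{1}{n + 1}\sum_{k=0}^{n} D_k(x - t)\Big)dt = \frac{1}{\pi}\int_T f(t) K_n(x - t)\,dt,$$
say, where
$$K_n(x) = \frac{1}{n + 1}\sum_{k=0}^{n} D_k(x) \tag{2.3}$$
is the Fejer kernel of order $n$. It is not hard to compute $K_n(x)$. In the first place, the numerator in (2.3) equals
$$\sum_{k=0}^{n} \frac{\sin((k + 1/2)x)}{2\sin(x/2)} = \frac{1}{2\sin(x/2)}\,\Im\Big(\sum_{k=0}^{n} e^{i(k+1/2)x}\Big).$$
By summing the geometric series we see that the imaginary part of the above sum equals
$$\Im\Big(e^{ix/2}\,\frac{1 - e^{i(n+1)x}}{1 - e^{ix}}\Big) = \Im\Big(e^{i(n+1)x/2}\,\frac{e^{-i(n+1)x/2} - e^{i(n+1)x/2}}{e^{-ix/2} - e^{ix/2}}\Big) = \frac{\sin((n + 1)x/2)}{\sin(x/2)}\,\Im\big(e^{i(n+1)x/2}\big) = \frac{\sin^2((n + 1)x/2)}{\sin(x/2)}.$$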
Whence
$$K_n(x) = \frac{1}{2(n + 1)}\Big(\frac{\sin((n + 1)x/2)}{\sin(x/2)}\Big)^2. \tag{2.4}$$
The following properties of $K_n(x)$ are readily verified: It is a positive even function and by (2.3) above
$$\frac{1}{\pi}\int_T K_n(x)\,dx = \frac{2}{\pi}\int_{[0,\pi]} K_n(x)\,dx = 1. \tag{2.5}$$
Thus, in contrast to the Dirichlet kernels $D_n(x)$, the Fejer kernels have uniformly bounded $L^1$ norms. It is also immediate to estimate $K_n(x)$. By (1.10),
$$K_n(x) \le \frac{1}{n + 1}\sum_{k=0}^{n} |D_k(x)| \le \frac{1}{n + 1}\sum_{k=0}^{n} (k + 1/2) = \frac{1}{n + 1}\Big(\frac{n(n + 1)}{2} + \frac{n + 1}{2}\Big) = \frac{n + 1}{2}. \tag{2.6}$$
Similarly, by (1.11),
$$K_n(x) \le \frac{\pi^2}{2(n + 1)x^2}, \qquad 0 < |x| < \pi. \tag{2.7}$$
This is all we need to know about these kernels. Concerning the representation of $\sigma_n(f, x)$, since $K_n(x)$ is even we have that $f * 2K_n(x)$ equals either
$$\frac{1}{\pi}\int_T f(x + t) K_n(t)\,dt \tag{2.8}$$
or,
$$\frac{1}{\pi}\int_{[0,\pi]} (f(x + t) + f(x - t)) K_n(t)\,dt. \tag{2.9}$$
The first result we consider is how the convolution with the Fejer kernel behaves at the points of continuity of a function. In fact, a slightly more general result is

Theorem 2.2 (Fejer). Suppose $f$ is a periodic integrable function defined on $T$, $f \sim \sum c_k e^{ikx}$. If the limits $f(x \pm 0)$ exist, then
$$\lim_{n\to\infty} \sigma_n(f, x) = \frac{f(x + 0) + f(x - 0)}{2}. \tag{2.10}$$
In particular, if $f$ is continuous at every point of an interval $I \subseteq T$, then the convergence is uniform over $I$.
Proof. We may, and do, assume that $f(x) = (f(x + 0) + f(x - 0))/2$. On account of (2.9) and (2.5) we have
$$\sigma_n(f, x) - f(x) = \frac{2}{\pi}\int_{[0,\pi]}\Big(\frac{f(x + t) + f(x - t)}{2} - f(x)\Big) K_n(t)\,dt. \tag{2.11}$$
Thus, for $0 < \eta < \pi$, we see that
$$|\sigma_n(f, x) - f(x)| \le \frac{2}{\pi}\Big(\int_{[0,\eta)} + \int_{[\eta,\pi]}\Big)\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big| K_n(t)\,dt = A + B, \tag{2.12}$$
say. We estimate $A$ first. By assumption, given $\varepsilon > 0$, there exists $\delta > 0$ such that
$$\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big| \le \varepsilon$$
provided $0 \le t < \delta$. We set $\eta = \delta$ in (2.12) and note that
$$A \le \varepsilon\,\frac{2}{\pi}\int_{[0,\delta)} K_n(t)\,dt \le \varepsilon.$$
To bound $B$ we introduce the quantity $M_n(\delta) = \sup_{\delta \le t \le \pi} K_n(t)$. By (2.7), $M_n(\delta) \to 0$ as $n \to \infty$ for each $\delta > 0$, and consequently
$$B \le M_n(\delta)\,\frac{1}{\pi}\int_{[0,\pi]} (|f(x + t)| + |f(x - t)| + 2|f(x)|)\,dt \le c_f M_n(\delta) \to 0, \qquad \text{as } n \to \infty. \tag{2.13}$$
This completes the proof of the first assertion. If now $f$ is continuous on a closed interval $I \subseteq T$, then (2.13) holds uniformly over $I$, and the theorem is completely proved. ∎

An interesting application of Fejer's theorem is the familiar Weierstrass theorem that asserts that continuous functions are the uniform limit of polynomials.

Corollary 2.3 (Weierstrass). Let $f \in C(T)$. Then, given $\varepsilon > 0$, there is a trigonometric polynomial $p$ such that $|f(x) - p(x)| \le \varepsilon$ for all $x \in T$.
Proof. The natural candidate for $p(x)$ is $\sigma_n(f, x)$, with $n$ sufficiently large. By Fejer's theorem this choice works. ∎
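The contrast between partial sums and Cesaro means can be seen on a concrete function. The following sketch is an illustration under stated assumptions, not an example from the text: it takes $f(x) = \operatorname{sgn} x$ on $(-\pi, \pi)$, whose Fourier series is $\frac{4}{\pi}\sum_{k \text{ odd}} \sin(kx)/k$, and evaluates both $s_n(f, x)$ and the means $\sigma_n(f, x)$ obtained by damping each term with the weight $1 - k/(n+1)$. The function names and grid are hypothetical choices.

```python
import math


def square_wave_partial_sum(n, x):
    """s_n(f, x) for f = sgn on (-pi, pi): only odd sine terms appear."""
    return (4.0 / math.pi) * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))


def square_wave_fejer_mean(n, x):
    """sigma_n(f, x): the same terms, damped by the weights 1 - k/(n + 1)."""
    return (4.0 / math.pi) * sum(
        (1.0 - k / (n + 1)) * math.sin(k * x) / k for k in range(1, n + 1, 2)
    )


def max_on_grid(func, n, points=2000):
    """Largest value of func(n, x) over a uniform grid of x in (0, pi)."""
    return max(func(n, math.pi * j / points) for j in range(1, points))
```

On the grid the partial sums overshoot the value 1 near the jump at 0, whereas the Cesaro means never exceed 1: since $K_n \ge 0$ and (2.5) holds, $|\sigma_n(f, x)| \le \sup |f|$. At the point of continuity $x = \pi/2$ the means approach $f(\pi/2) = 1$.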
As for the convergence in norm, we have

Theorem 2.4. Let $1 \le p < \infty$. Then
$$\lim_{n\to\infty} \|\sigma_n(f) - f\|_p = 0, \qquad \text{all } f \in L^p(T). \tag{2.14}$$

Proof. Let
$$F(t) = \Big(\frac{1}{2\pi}\int_T |f(x + t) - f(x)|^p\,dx\Big)^{1/p};$$
by (v) in Section 1 of Chapter XII, $F(t)$ is continuous at $t = 0$. Thus Theorem 2.2 gives that $\lim_{n\to\infty} \sigma_n(F, 0) = F(0) = 0$. Now, by (2.5) and (2.8), we have
$$\sigma_n(f, x) - f(x) = \frac{1}{\pi}\int_T (f(x + t) - f(x)) K_n(t)\,dt,$$
and consequently, by Minkowski's integral inequality, 4.23 in Chapter XIII, we get
$$\|\sigma_n(f) - f\|_p \le \frac{1}{\pi}\int_T \|f(\cdot + t) - f(\cdot)\|_p K_n(t)\,dt = \frac{1}{\pi}\int_T F(t) K_n(t)\,dt = \sigma_n(F, 0).$$
Whence
$$\limsup \|\sigma_n(f) - f\|_p \le \limsup \sigma_n(F, 0) = 0. \qquad ∎$$

In particular, referring to a question left open in Chapter XVI, the trigonometric polynomials are dense in $L^2(T)$, and consequently, the ONS $\{\frac{1}{\sqrt{2\pi}} e^{ikx}\}$ is complete in $L^2(T)$. The stronger version of (2.14) is also true, to wit,
$$\lim_{n\to\infty} \|s_n(f) - f\|_p = 0, \qquad \text{all } f \in L^p(T),\ 1 < p < \infty.$$
This result is known as the M. Riesz theorem and it is much harder to prove. Another consequence of Theorem 2.2 is the uniqueness of the Fourier expansion.
Corollary 2.5. Let $f \in L(T)$, $f \sim \sum c_k e^{ikx}$. If $c_k = 0$ for all $k$, then $f = 0$ a.e.

Proof. Note that $\sigma_n(f, x) = 0$ for all $n$, and then apply Theorem 2.4. ∎
3. POINTWISE CONVERGENCE

We close our discussion on a positive note, namely, we show that the Cesaro means of the Fourier series of an integrable function converge to the function a.e. This result was proved by Lebesgue, and it represents one of the early successes of the new concept of Lebesgue integral.

Theorem 3.1. Suppose $f \in L(T)$. Then $\lim_{n\to\infty} \sigma_n(f, x) = f(x)$ at each point $x$ in the Lebesgue set of $f$. In particular this statement is true a.e. in $T$.

Proof. We proceed as in the proof of Theorem 2.2, but in a more deliberate fashion. By (2.12) of that theorem
$$|\sigma_n(f, x) - f(x)| \le \frac{2}{\pi}\Big(\int_{[0,\eta)} + \int_{[\eta,\pi]}\Big)\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big| K_n(t)\,dt = A + B,$$
say. Further, by (2.7)
$$B \le \frac{c}{n + 1}\int_{[\eta,\pi]}\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big|\,\frac{dt}{t^2} \le \frac{c}{(n + 1)\eta^2}\,c_f,$$
where $c_f$ depends, of course, on $f$. So this term can be made small, but there must be a balance between $\eta$ and $n$. Choosing $\eta = 1/n^{1/4}$, for instance, we get at once that $B$ can be made arbitrarily small for sufficiently large $n$. To bound $A$ is a more delicate pursuit. We split $A$ into two integrals,
$$\frac{2}{\pi}\Big(\int_{[0,1/n]} + \int_{[1/n,1/n^{1/4})}\Big) = A_1 + A_2,$$
say. Now, by estimate (2.6) it readily follows that
$$A_1 \le c\,n\int_{[0,1/n)}\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big|\,dt.$$
Further, by 3.24 in Chapter VIII, the right-hand side of the above estimate tends to 0 as $n \to \infty$ at each point $x$ in the Lebesgue set of $f$. Thus, it only remains to deal with $A_2$. Again, by (2.7) it is dominated by
$$\frac{c}{n}\int_{[1/n,1/n^{1/4})}\Big|\frac{f(x + t) + f(x - t)}{2} - f(x)\Big|\,\frac{dt}{t^2}. \tag{3.1}$$
Let now
$$F_x(t) = F(t) = \int_{[0,t)}\Big|\frac{f(x + s) + f(x - s)}{2} - f(x)\Big|\,ds.$$
As $F$ is absolutely continuous, we may rewrite (3.1) as
$$\frac{c}{n}\int_{[1/n,1/n^{1/4})} F'(t)\,\frac{dt}{t^2}.$$
Whence integrating by parts we see that
$$A_2 \le \frac{c}{n}\,\frac{F(t)}{t^2}\Big]_{1/n}^{1/n^{1/4}} + \frac{2c}{n}\int_{[1/n,1/n^{1/4}]} F(t)\,\frac{dt}{t^3}. \tag{3.2}$$
Since by 3.24 in Chapter VIII $\lim_{t\to 0} F(t)/t = 0$ at each Lebesgue point $x$ of $f$, the integrated term in (3.2) above tends to 0 as $n \to \infty$. The same is true for the integral in (3.2) since, given $\varepsilon > 0$, we may first choose $n$ large enough so that $F(t)/t \le \varepsilon$ in $[1/n, 1/n^{1/4}]$, and observe that the integral there does not exceed
$$\frac{c\varepsilon}{n}\int_{[1/n,\infty)} \frac{dt}{t^2} = c\varepsilon.$$
The combination of all these estimates leads to the desired conclusion and we have finished. ∎
CHAPTER
XVIII
Remarks on Problems and Questions 'DON'T PANIC!'
These remarks consist of hints and comments to some of the problems and questions included in the text.
CHAPTER I

5.17 $A$ contains a subset $A_n$ of $n$ elements for each $n$; consider the subset $B = \bigcup_n A_n$.

5.25 A continuous real-valued function on $[0,1]$ is determined by the values it takes on $Q \cap [0,1]$.

5.29 A subset of $R^2$ consisting of a disk and any part of its boundary is convex.

CHAPTER II

5.10 Suppose not and consider the first element of the nonempty set $\{m \in M: P(m) \text{ is not true}\}$.

5.14 Given a set $A$, let $f$ be a selection function for $\mathcal{P}(A) \setminus \{\emptyset\}$. We may now construct a well-ordering on $A$ as follows: Put $a_1 = f(A)$, $a_2 = f(A \setminus \{a_1\})$, $a_3 = f(A \setminus \{a_1, a_2\})$, and in general, define by transfinite
induction $a_b = f(A \setminus \bigcup_{d < b} \{a_d\})$ ... 0, let $m'_a = m_b$, where $b$ is the least ordinal such that $m_b$ is an upper bound of the chain $C = \{m'_d : d < a\}$ and $m_b \notin C$. Note that $\{m'_d : d < a\}$ is always a chain, and that $m_b$ exists unless $m'_{a-1}$ is a maximal element of $M$. This construction eventually comes to an end, and we obtain a maximal element of $M$.
5.28 Pick $y \in \overline{f^{-1}(\{0\})} \setminus f^{-1}(\{0\})$ and a sequence $\{y_n\} \subseteq f^{-1}(\{0\})$ such that $\lim_{n \to \infty} y_n = y$. Now, given $x \in R$, write it as
$$x = \left(x - \frac{f(x)}{f(y)}(y - y_n)\right) + \frac{f(x)}{f(y)}(y - y_n) = x_1 + x_2,$$
say, and observe that $f(x_1) = 0$ and $x_2 \to 0$.
5.33 Suppose $B$ is an at most countable subset of $A$. Then the set $B^* = \bigcup_{b \in B} I_b$ is also an at most countable, and consequently, proper subset of $A$. In fact, $B^*$ is an initial segment $I_a$ of $A$, say, and $a$ is an upper bound of $B$.
CHAPTER III
4.2 Imagine a rising sun on the $x$-axis at $\infty$. Then $\{(x,y) \in R^2 : y \ge r(x)\}$ is illuminated by the sun whereas $\{(x,y) \in R^2 : y < r(x)\}$ lies in the shadow.
4.4 If $D = \{r_1, r_2, \dots\}$ is a countable subset of $[a,b]$, define functions $f_1, \dots, f_n, \dots$ by
$$f_n(x) = \begin{cases} -1/n^2 & \text{if } x < r_n \\ 0 & \text{if } x = r_n \\ 1/n^2 & \text{if } x > r_n, \end{cases}$$
and let $f(x) = \sum_{n=1}^{\infty} f_n(x)$. Note that since $|f_n(x)| \le 1/n^2$ for all $x$, the series converges uniformly. Further, since each $f_n$ is increasing, so is $f$. Let now $x \notin D$. Then each function $f_1 + \dots + f_n$ is continuous at $x$ and so is the uniform limit, $f$. If, on the other hand, $x \in D$, then $x = r_N$ for some $N$ and $f = f_N + \sum_{n \ne N} f_n$. By the above argument the sum is continuous at $x$, but $f_N$ is not.
4.14 Choose a partition $P_1$ consisting of points $a = x_0 < \dots < x_n = x$ so that $\sum_{\text{over } P_1} |\Delta_k f| > V(x) - \varepsilon$ and using the uniform continuity of $f$ on $[a,x]$ find $\delta > 0$ with the property that $|f(x) - f(x')| < \varepsilon/4(n+1)$ if $|x - x'| < \delta$. Show now that any partition $P$ with norm less than or equal to $\eta = \min(\delta, \min_k \Delta_k x)$ will do.
4.16 By the mean value theorem we get $\sum_{j=1}^{n} |f(x_j) - f(x_{j-1})| = \sum_{j=1}^{n} |f'(t_j)| \Delta_j x$ for appropriate $t_j \in (x_{j-1}, x_j)$. Hence by 4.15 and Theorem 2.6 we have
$$V(x) = \lim_{\mathrm{norm}(P) \to 0} \sum_{j=1}^{n} |f'(t_j)| \Delta_j x = \int_a^x |f'(t)|\,dt.$$
Moreover, by 4.15,
$$2P(x) = V(x) + (f(x) - f(a)) = \int_a^x |f'(t)|\,dt + \int_a^x f'(t)\,dt.$$
4.23 If $f(b) = f(a)$ the identity clearly holds. Otherwise, if $f(b) \ne f(a)$, since all upper and lower sums lie between $m(f(b) - f(a))$ and $M(f(b) - f(a))$, so does the integral. Thus, the quotient $c = \int_a^b g\,df \big/ \int_a^b df$ lies between $m$ and $M$.
4.25 Note that $h(x) - h(y) = h(x) - h(y)$ whenever $x, y \in D$. Let $\varepsilon > 0$ be given. Since $g$ is uniformly continuous on $I$, there is a $\delta > 0$ such that $|g(x) - g(y)| \le \varepsilon/(h(b) - h(a) + 1)$ whenever $|x - y| < \delta$. Because $D$ is dense in $R$ there is a partition $P$ of $I$ consisting of points in $D$ so that the distance between no two successive points exceeds $\delta$. The stage is now set to show that $\left|\int_a^b g\,dh - \int_a^b g\,dh\right| \le \varepsilon$.
4.27 There are two ways to go about this problem: Either integrate by parts and invoke 4.23 or note that by 4.22 the function $\int_a^b g\,df - g(a)\int_a^x df - g(b)\int_x^b df$ is continuous and invoke the mean value theorem.
CHAPTER IV
4.14 Let $X = \{a_1, \dots, a_6\}$, $A_1 = \{a_1, a_2, a_3\}$ and $A_2 = \{a_3, a_4, a_5\}$; $S(\{A_1, A_2\})$ consists of exactly 16 sets.
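The count in 4.14 can be confirmed by brute force. The following is a minimal sketch, assuming that on a finite set the generated $\sigma$-algebra is obtained by closing the generators under complements and finite unions; the function name `generated_sigma_algebra` and the string labels for the points are illustrative choices, not notation from the text.

```python
from itertools import combinations

def generated_sigma_algebra(universe, generators):
    """Close a family of subsets of a finite set under complement and union."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Add missing complements.
        for a in current:
            comp = universe - a
            if comp not in family:
                family.add(comp)
                changed = True
        # Add missing pairwise unions (intersections follow via complements).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family

X = {"a1", "a2", "a3", "a4", "a5", "a6"}
A1 = {"a1", "a2", "a3"}
A2 = {"a3", "a4", "a5"}
sigma = generated_sigma_algebra(X, [A1, A2])
print(len(sigma))
```

The four atoms are $A_1 \setminus A_2$, $A_1 \cap A_2$, $A_2 \setminus A_1$ and $X \setminus (A_1 \cup A_2)$, and every set in the family is a union of atoms, which accounts for the $2^4$ sets.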
4.16 Partially order $\mathcal{A}$ by set inclusion; by Zorn's Lemma there exists a maximal chain $\mathcal{C}$, say, in $\mathcal{A}$. Furthermore, since $\mathcal{A}$ is infinite, so is $\mathcal{C}$. Indeed, suppose there are only finitely many elements $C_0 = \emptyset \subset C_1 \subset \dots \subset C_n = X$, say, in $\mathcal{C}$. Then, let $A \in \mathcal{A} \setminus \mathcal{C}$, and let $n_0$ be the largest integer less than $n$ such that $A \setminus C_{n_0} \ne \emptyset$. Since $A \ne C_{n_0+1}$ it follows that $\emptyset \subset \dots \subset C_{n_0} \subset (C_{n_0} \cup ((A \setminus C_{n_0}) \cap C_{n_0+1})) \subset C_{n_0+1} \subset \dots \subset C_n$ is a chain of elements of $\mathcal{A}$ that properly contains $\mathcal{C}$, contrary to its assumed maximality. So, $\mathcal{C}$ is an infinite maximal chain. If $\mathcal{C}$ has no "first" nonempty element, then there exists an infinite sequence $\{A_k\} \subseteq \mathcal{C}$ such that $A_1 \supset \dots \supset A_n \supset \dots \supset \emptyset$. Otherwise designate the first nonempty element of $\mathcal{C}$ by $B_1$ and take a look at $\mathcal{C} \setminus \{B_1\}$. If this family has no first nonempty element, pick as before an infinite decreasing sequence, and if it has a first element $B_2$, say, note that $B_1 \subset B_2$, and consider $\mathcal{C} \setminus \{B_1, B_2\}$. Proceeding in this fashion we either obtain an infinite decreasing sequence of nonempty elements of $\mathcal{C}$, or else a sequence $B_1 \subset \dots \subset B_n \subset \dots$ In the latter case, by looking instead at $X \setminus B_1 \supset \dots \supset X \setminus B_n \supset \dots$ we also obtain an infinite decreasing sequence of elements of $\mathcal{C}$. Thus, in any case, there exists an infinite decreasing sequence $\{A_k\} \subseteq \mathcal{C}$, say, and the sequence $A'_k = A_k \setminus A_{k+1}$, $k = 1, \dots$, is composed of infinitely many pairwise disjoint nonempty elements of $\mathcal{A}$.
4.17 Divide $R^n$ into a mesh of congruent nonoverlapping closed intervals each of measure 1. Next discard any interval in the mesh which does not intersect the open set $O$ and separate those intervals that are totally contained in $O$. Next subdivide each of the remaining intervals into $2^n$ congruent nonoverlapping closed intervals by bisecting the sides. Once again discard any interval in this new mesh which does not intersect $O$ and separate those intervals which are totally contained in $O$.
As for those intervals that are left, they intersect both $O$ and $R^n \setminus O$. Keep going.
4.29 The restriction that $\mu_2$ is a finite measure is not necessary. Indeed, when $\mu_2(X) = \infty$ we may define
$$\mu_3(E) = \sup\{\mu_1(B) - \mu_2(B) : B \subset E,\ \mu_2(B) < \infty\}, \qquad E \in M.$$
By the way, $\mu_3$ is uniquely determined if $\mu_2$ is $\sigma$-finite.
4.31 If $X$ is uncountable and $\mu(E) = 0$ if $E \subset X$ is at most countable and $\mu(E) = \infty$ if $E \subseteq X$ is uncountable, then $\mu$ is not semifinite.
4.32 Let $A \in M$, with $\mu(A) = \infty$. If the assertion is false, then
$$0 < \sup\{\mu(B) : B \subset A,\ B \in M,\ \mu(B) < \infty\} = \eta < \infty.$$
Since $\eta$ is finite we can find a nondecreasing sequence $\{B_n\} \subseteq M$ such that $\mu(B_{n+1}) \ge \mu(B_n) > \eta - 1/n$. Let $B = \bigcup_{n=1}^{\infty} B_n$; $B \subset A$, $B \in M$ and $\mu(B) = \eta$. Since $\mu(A) = \infty$, also $\mu(A \setminus B) = \infty$. The fact that $\mu$ is semifinite now leads to a contradiction.
4.33 Given $A \in M$, let
$$\mu_1(A) = \sup\{\mu(B) : B \subset A,\ B \in M,\ \mu(B) < \infty\},$$
and
$$\mu_2(A) = \sup\{\mu(B) - \mu_1(B) : B \subset A,\ B \in M,\ \mu_1(B) < \infty\}.$$
CHAPTER V
3.15 Consider the function $f(x) = 10^{-N(x)}$, where $N(x)$ is the number of zeros and nines after the decimal point in the expansion of $x$; the convention $10^{-\infty} = 0$ is in effect.
3.24 Suppose that $E \subseteq [0,1]$ and that, to the contrary, $E \cap (E + r) = \emptyset$ for every $r \in [0,1]$. Then note that the sets $E + r_n$, $r_n \in [0,1] \cap Q$, are measurable, pairwise disjoint and contained in $[0,2]$.
3.25 3.24 is useful here.
3.26 Let $B_1$, $B_2$ be the sets of real numbers
$$B_1 = \{r \in R : r = c_0 + c_2/2^2 + c_4/2^4 + \cdots\}$$
and
$$B_2 = \{r \in R : r = c_1/2 + c_3/2^3 + \cdots\},$$
where $c_0$ is an arbitrary integer and $c_k = 0$ or 1 for all $k$. Show now that $|B_1| = |B_2| = 0$. Furthermore, $B_1 \cup B_2$ is also null, and it contains a Hamel basis for $R$.
3.28 Let $D = C \times C \subset R^2$ and observe that for any number $-1 \le r \le 1$, the line $y = x + r$ meets $D$ in at least one point. Indeed, since $C$ is obtained by a sequence of removals of "middle thirds", $D$ can be thought of as the intersection of a countable family $D_1, D_2, \dots$ of closed subsets of $[0,1] \times [0,1]$ obtained as follows: The set $D_1$ consists of four 1/3 by 1/3 closed squares located in the corners of $[0,1] \times [0,1]$, the set $D_2$ consists of sixteen 1/9 by 1/9 squares located by fours in the corners of the squares of $D_1$, and so on. For any given $r \in [-1,1]$, the line $y = x + r$ meets at least one of the four squares in $D_1$, at least one of the four squares of $D_2$ that lies in $D_1$ and so on. Now note that there is a point $(x,y)$, say, that belongs to every square of this sequence. This point therefore belongs to $D$, and consequently, we have $y = x + r$, with $x, y \in C$.
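The nested-squares argument of 3.28 is constructive, and a finite number of its steps can be carried out exactly. The sketch below is an illustration under stated assumptions, not part of the text: the helper names `corner_square_on_line` and `is_cantor_endpoint` are hypothetical, and the selection rule uses the observation that the line $y = x + r$ meets the closed square with lower-left corner $(x_0, y_0)$ and side $s$ exactly when $|r - (y_0 - x_0)| \le s$.

```python
from fractions import Fraction

def corner_square_on_line(r, depth):
    """Pick nested corner squares of D_1, D_2, ... that meet the line y = x + r.

    Returns (x, y, s): lower-left corner and side of the square chosen at
    stage `depth`, computed in exact rational arithmetic.
    """
    r = Fraction(r)
    if abs(r) > 1:
        raise ValueError("r must lie in [-1, 1]")
    x, y, s = Fraction(0), Fraction(0), Fraction(1)
    for _ in range(depth):
        shift, s = 2 * s / 3, s / 3  # both use the side of the parent square
        for dx, dy in ((0, 0), (0, 1), (1, 0), (1, 1)):
            nx, ny = x + dx * shift, y + dy * shift
            # The line meets [nx, nx+s] x [ny, ny+s] iff |r - (ny - nx)| <= s.
            if abs(r - (ny - nx)) <= s:
                x, y = nx, ny
                break
        else:
            raise AssertionError("no corner square meets the line")
    return x, y, s

def is_cantor_endpoint(q, depth):
    """True if q has a ternary expansion of length <= depth with digits 0 and 2 only."""
    q = Fraction(q)
    for _ in range(depth):
        q *= 3
        digit = q.numerator // q.denominator
        if digit not in (0, 2):
            return False
        q -= digit
    return q == 0

x, y, s = corner_square_on_line(Fraction(1, 2), 12)
print(x, y, s)
```

The corners returned lie in $C \times C$ and satisfy $|(y - x) - r| \le 3^{-\text{depth}}$; the point of $D$ on the line is the limit of these corners, which the code approximates but does not compute.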
3.30 Let $r_1, r_2, \dots$ denote the sequence of all the rational numbers in $(0,1)$. Start out as in the construction of the Cantor set but extend the open interval to be removed in the first step so that its endpoints are irrational and so that $r_1$ belongs to this interval. At the second step remove from each of the two remaining closed intervals an open "middle third" so that the endpoints are irrational and $r_2$ is removed. If this process is carried out a Cantor-like set consisting entirely of irrational numbers remains.
3.34 Let $A_1$ be a Cantor subset of $[0,1]$ of measure 1/2. In each interval contiguous with $A_1$, i.e., in every connected component of $[0,1] \setminus A_1$ in $[0,1]$, construct a Cantor set the measure of which is half that of this interval, and denote by $A_2$ the union of these compact sets; clearly $|A_2| = 1/4$. Once $A_1, \dots, A_n$ have been defined, $A_{n+1}$ is constructed as follows: Choose in every interval contiguous to the compact set $K_n = A_1 \cup \dots \cup A_n$ a Cantor set of measure half of that interval, and let $A_{n+1}$ denote the union of those compact sets. Note that $|A_{n+1}| = 1/2^{n+1}$. Finally put $E = \bigcup_{n=0}^{\infty} A_{2n+1}$, and show that for every nondegenerate interval $I \subseteq [0,1]$ we have $0 < |E \cap I| < |I|$.
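The measure bookkeeping in 3.34 can be written out as follows. This is a sketch assuming the construction exactly as described, so that at each stage the new sets fill half of what is left of $[0,1]$; the value of $|E|$ at the end is a consequence of these assumptions and is not stated in the text.

```latex
|K_n| = \sum_{j=1}^{n} 2^{-j} = 1 - 2^{-n},
\qquad
\bigl|[0,1] \setminus K_n\bigr| = 2^{-n},
\qquad
|A_{n+1}| = \tfrac{1}{2}\cdot 2^{-n} = 2^{-(n+1)},
\qquad
|E| = \sum_{n=0}^{\infty} 2^{-(2n+1)} = \frac{1/2}{1 - 1/4} = \frac{2}{3}.
```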
CHAPTER VI
4.30 By an argument analogous to Proposition 3.3, there exists a subsequence $\{f_{n_k}\}$ which is Cauchy $\mu$-a.e. This subsequence determines a measurable function $f$ such that $\lim_{k \to \infty} f_{n_k} = f$ $\mu$-a.e., and consequently also $\lim_{k \to \infty} f_{n_k} = f$ in probability. To complete the argument observe that given $\varepsilon > 0$, it follows that $\{|f_n - f| > \varepsilon\} \subseteq \{|f_n - f_{n_k}| > \varepsilon/2\} \cup \{|f_{n_k} - f| > \varepsilon/2\}$. The first set on the right-hand side above may be handled because the sequence is Cauchy in measure and the second because the subsequence converges $\mu$-a.e. Choose $b_n$ such that $\mu(X \setminus A_n) \le 2^{-n}$, where $A_n = \{|f_n| < b_n\}$, put $a_n = 1/nb_n$, and consider the set $\liminf A_n$.
4.35
A proof by pictures works. By Lusin's theorem there is a closed subset $K$ of $[0,1]$ such that $|B| = |[0,1] \setminus K| < \varepsilon$ and $f|_K$ is continuous on $K$ in the relative topology of $K$. Now, $B$ is open and it can be written as an at most countable union of disjoint open intervals $B = \bigcup_n (a_n, b_n)$, say. Since $a_n, b_n \in K$ for all $n$, it is now clear how to extend $f$ linearly to a continuous function $F$ defined on $[0,1]$ with the desired properties.
4.36
Let $A_1, A_2, \dots$ be the sequence of nonmeasurable sets constructed in Chapter V, put $B_n = \bigcup_{i=n}^{\infty} A_i$, and let $\{f_n\}$ be the sequence of bounded functions on $[0,1]$ given by $f_n = \chi_{B_n}$, $n = 1, 2, \dots$ The conclusion of Egorov's theorem fails for this sequence.
4.38
CHAPTER VII
4.5 It is false. However, show that for each $\eta > 0$ it is true that $|\{|x| > \lambda, |f| > \eta\}| \to 0$ as $\lambda \to \infty$.
4.10 Feel free to use Proposition 2.4 in Chapter XIII.
4.22 Suppose not. Then, passing to a subsequence if necessary, we may assume there exists $\varepsilon > 0$ such that $\mu(E_n) \ge \varepsilon$ for all $n$. Let $G_k = \{0 < f < 1/k\}$, $k = 1, 2, \dots$ By assumption $\lim G_k = \emptyset$, and since $\mu(G_1) < \infty$, it readily follows that there exists an index $k_0$ such that $\mu(G_{k_0}) \le \varepsilon/2$, and consequently, $\mu(E_n \setminus G_{k_0}) \ge \varepsilon/2$ for all $n$. Thus, $\int_{E_n} f\,d\mu \ge \int_{E_n \setminus G_{k_0}} f\,d\mu \ge \mu(E_n \setminus G_{k_0})/k_0 \ge \varepsilon/2k_0$ for all $n$, contradicting the assumption that $\lim_{n \to \infty} \int_{E_n} f\,d\mu = 0$.
4.24 Since $|f_n - f| \le 2|f_n| + 2|f|$, we have
$$2|f| + 2|f_n| - |f_n - f| \ge 0.$$
Thus Fatou's Lemma may be applied to these nonnegative functions. Since $f \in L(\mu)$, any expression involving $\int_X |f|\,d\mu$ may be cancelled, and the conclusion obtains. Note that the weaker assumption $\liminf \int_X |f_n|\,d\mu \le \int_X |f|\,d\mu$ is all that is needed to complete the argument. The idea of this proof is due to Novinger, Proc. Amer. Math. Soc. 34 (1972), 627-628. Also note that in $(R, \mathcal{L}, |\cdot|)$, the sequence $f_n = \chi_{[-n,n]}$, $n = 1, 2, \dots$, tends to 1 everywhere, $\lim \int_R f_n = \int_R 1 = \infty$ but $\int_R |f_n - 1| \not\to 0$.
4.30 and 4.32 You may want to use 4.46 to do these problems.
4.36 No, but try with the additional assumption that $\lim_{n \to \infty} g_n = g$ $\mu$-a.e.
4.37 The inequality
$$\frac{|a - b|}{1 + |a - b|} \le \frac{|a - c|}{1 + |a - c|} + \frac{|b - c|}{1 + |b - c|}$$
is useful in this context.
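A quick numerical sanity check of the inequality in 4.37, read here as the triangle inequality for the quantity $|a - b|/(1 + |a - b|)$, can be sketched as follows. The names `phi` and `triangle_gap` are illustrative only, and a random test of course proves nothing; the actual proof rests on $t \mapsto t/(1+t)$ being increasing and subadditive on $[0, \infty)$.

```python
import random

def phi(t):
    """The map t -> t / (1 + t) on [0, infinity)."""
    return t / (1.0 + t)

def triangle_gap(a, b, c):
    """Right-hand side minus left-hand side of the inequality quoted in 4.37."""
    return phi(abs(a - c)) + phi(abs(b - c)) - phi(abs(a - b))

random.seed(0)
worst = min(
    triangle_gap(random.uniform(-50, 50), random.uniform(-50, 50), random.uniform(-50, 50))
    for _ in range(10000)
)
print(worst >= -1e-12)
```

Equality occurs, for instance, when $c$ coincides with $a$ or with $b$.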
4.38 Let $N$ be the set of points $x \in X$ for which either $|f(x)| > g(x)$ or $|f_n(x)| > g(x)$ for some $n$; by assumption $\mu(N) = 0$. Now, by LDCT, we get $\lim_{n \to \infty} \int_{X \setminus N} |f_n - f|\,d\mu = 0$, and consequently by Chebychev's inequality, $\lim_{n \to \infty} \mu((X \setminus N) \cap \{|f_n - f| > \lambda\}) = 0$ for all $\lambda > 0$. Given $\varepsilon > 0$, we must find a measurable set $B \subset X$ such that $\mu(B) < \varepsilon$ and so that the sequence $\{f_n\}$ converges to 0 uniformly on $X \setminus B$. Let $E_k = \{g \ge 1/k\}$, $k = 1, 2, \dots$ Since $g \in L(X)$, $\mu(E_k) < \infty$, and also $E_k \subseteq E_{k+1}$ for all $k$. By Egorov's theorem there exist measurable sets $B_k \subset E_k$ such that $\mu(B_k) < \varepsilon/2^k$ and the $f_n$'s converge uniformly to 0 on $E_k \setminus B_k$. It only remains to check that if $B = \bigcup_{k=1}^{\infty} B_k$, $\mu(B) < \varepsilon$, then the $f_n$'s converge uniformly to 0 on $X \setminus B$.
4.40
4.41 Suppose $|f|, |f_n| \le \varphi \in L(\mu)$, $n = 1, 2, \dots$ Given $\varepsilon > 0$, note that
where $\eta$ is an arbitrary positive real number, independent of $n$. Thus, since the $f_n$'s converge to $f$ in measure, we get that $\limsup \int_X |f_n - f|\,d\mu \le 2\int_{\{\varphi \le \eta\}} \varphi\,d\mu + \varepsilon\|\varphi\|_1 = I + J$, say. The desired conclusion now follows since, by LDCT, $\lim_{\eta \to 0} I = 0$, and $\varepsilon$ is arbitrary. In fact, note that we have proved a stronger statement, to wit, $\lim_{n \to \infty} \int_X |f - f_n|\,d\mu = 0$.
4.43 The integral of $(f - f^2)^2$ over $X$ vanishes. By the way, the reader may be interested in proving that if $f$ is nonnegative, a similar statement is true for any consecutive three integers. How about a similar statement for any three integers?
4.46 For simplicity assume that $I = [-1,1] \times \dots \times [-1,1]$, and note that with the notation $rI = [-r,r] \times \dots \times [-r,r]$, we can write $I$ as the pairwise disjoint union $I = \bigcup_{k=0}^{\infty} (2^{-k}I \setminus 2^{-(k+1)}I) = \bigcup_{k=1}^{\infty} J_k$, say. Now, on each "ring" $J_k$ we have $|x|^{-\eta} \sim 2^{k\eta}$, where the constant for $\sim$ is a dimensional constant independent of $k$. Also $|J_k| \sim 2^{-kn}$, with a dimensional constant independent of $k$. Thus,
$$\int_I |x|^{-\eta}\,dx \sim \sum_k 2^{k\eta}\,2^{-kn},$$
which converges iff $\eta < n$.
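CHAPTER VIII
3.1 Given $a, b > 0$ and $c$, it is apparent that
$$ab \le \tfrac{1}{2}\left(c^2 a^2 + b^2/c^2\right).$$
Therefore,
$$\int_X |fg|\,d\mu \le \frac{c^2}{2}\int_X f^2\,d\mu + \frac{1}{2c^2}\int_X g^2\,d\mu.$$
Now, unless $\int_X f^2\,d\mu = 0$, and in this case $f = 0$ $\mu$-a.e. and (3.1) holds trivially, we may choose $c = \left(\int_X g^2\,d\mu \big/ \int_X f^2\,d\mu\right)^{1/4}$. This choice of $c$ gives (3.1) at once.
3.4 Let $r_1, r_2, \dots$ be an enumeration of the rational numbers in $I = [0,1]$ and put
$$f(x) = \sum_{n=1}^{\infty} \frac{1}{2^n}\,\frac{1}{\sqrt{|x - r_n|}}\,\chi_{[-1+r_n,\,1+r_n]}(x), \qquad x \in I.$$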
By Theorem 1.6, and 4.6 and 4.46 in Chapter VII, it follows that $f \in L(I)$ and consequently, $f$ is finite a.e. This function $f$ works for the interval $I$; to obtain a function that gives the example in $R$, just put together functions that look like $f$. By the way, not only is $f$ discontinuous at every point of $I$ and unbounded on $I$, but it remains so after modifying its values on any Lebesgue null subset of $I$. Also, $f^2$ is finite a.e. on $I$, but it is not integrable there.
3.7 Let $\{f_n\}$ be the sequence of simple integrable functions constructed in Theorem 1.12 in Chapter VI. Then for each $E \in M$ we have $\int_E |f|\,d\mu \le \int_E |f - f_n|\,d\mu + \int_E |f_n|\,d\mu = I + J$, say. By LDCT, $\lim_{n \to \infty} I = 0$, uniformly for $E \in M$. So, fix $n$ large enough so that $I < \varepsilon/2$, and since $f_n$ is bounded by $c$, say, observe that $\int_E |f_n|\,d\mu \le c\mu(E) < \varepsilon/2$ provided that $\mu(E) < \delta = \varepsilon/2c$.
3.8 If $F(n) = \int_{\{|f| > n\}} |f|\,d\mu$ for $n = 1, 2, \dots$, it follows that $\int_X |f|\,d\mu \ge \sum_{n=1}^{\infty} n(F(n) - F(n+1))$. What we need now is to "sum by parts," and this is achieved through the following relation, known as Abel's transformation: $\sum_{n=1}^{k} u_n v_n = \sum_{n=1}^{k-1} U_n(v_n - v_{n+1}) + U_k v_k$, where $U_n = u_1 + \dots + u_n$ for $n = 1, 2, \dots, k$.
3.10 Since as is readily seen $\int_X |f_n|\,d\mu \le c$ for all $n$, by Fatou's Lemma, $f$ is integrable. Put $h_n = |f - f_n| \ge 0$. Then the sequence $\{h_n\}$ is also uniformly integrable and $\lim_{n \to \infty} h_n = 0$ $\mu$-a.e. Further, if $h_{n,r} = h_n \chi_{\{h_n \le r\}}$ denotes the sequence of cut-offs of the $h_n$'s at level $r$, by LDCT we have $\lim_{n \to \infty} \int_X h_{n,r}\,d\mu = 0$. Now, $\int_X h_n\,d\mu = \int_{\{h_n > r\}} h_n\,d\mu + \int_X h_{n,r}\,d\mu = I + J$, say. Given $\varepsilon > 0$, choose first $r$ so that $I \le \varepsilon/2$ for all $n$, and then pick $N$ so that $J \le \varepsilon/2$ for all $n \ge N$.
3.15 Assume first that $f$ vanishes off a bounded interval $I$ of $R^n$. In this case $f(y) + f(y + h)$ equals either $f(y)$, $f(y + h)$ or zero provided that $|h|$ is sufficiently large, the exact value depends on $I$. For these values of $h$, and by 4.6 in Chapter VII, we have $\int_{R^n} |f(y + h) + f(y)|\,dy = \int_{R^n} |f(y + h)|\,dy + \int_{R^n} |f(y)|\,dy = 2\int_{R^n} |f(y)|\,dy$, and the limit in question equals this last quantity. For arbitrary integrable functions consider a limiting argument working with truncates.
3.16 By passing to a subset if necessary, we may assume that $0 < |A| < \infty$. Let now $\phi(x) = \int_{R^n} \chi_A(y)\chi_A(x + y)\,dy$, $x \in R^n$. By the continuity of the translates of integrable functions it follows that $\phi$ is continuous, and since $\phi(0) = |A| > 0$, there exists $\delta > 0$ such that $\phi(x) > 0$ provided that $|x| < \delta$. This in particular means that the integrand of the integral defining $\phi(x)$ does not vanish identically, and consequently, there is $y \in A$ such that $x + y \in A$, or $x \in A - A$.
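Abel's transformation, quoted in 3.8 above, is a finite algebraic identity and can be checked exactly on any sample data. The sketch below assumes the identity in the form $\sum_{n=1}^{k} u_n v_n = \sum_{n=1}^{k-1} U_n(v_n - v_{n+1}) + U_k v_k$ with $U_n = u_1 + \dots + u_n$; the function name `abel_sides` and the sample sequences are illustrative choices.

```python
from fractions import Fraction

def abel_sides(u, v):
    """Return (left, right) sides of Abel's transformation for u_1..u_k, v_1..v_k."""
    k = len(u)
    if k == 0 or len(v) != k:
        raise ValueError("u and v must be nonempty and of equal length")
    partial_sums, running = [], Fraction(0)
    for term in u:
        running += term
        partial_sums.append(running)  # U_n = u_1 + ... + u_n
    left = sum(u[n] * v[n] for n in range(k))
    right = sum(partial_sums[n] * (v[n] - v[n + 1]) for n in range(k - 1))
    right += partial_sums[k - 1] * v[k - 1]
    return left, right

u = [Fraction(n) for n in (1, 2, 3, 4)]
v = [Fraction(1, n) for n in (1, 2, 3, 4)]
left, right = abel_sides(u, v)
print(left, right)
```

With $u_n = n$ and $v_n = 1/n$ each product $u_n v_n$ equals 1, so both sides equal 4. In 3.8 the roles are played by $u_n = 1$ and $v_n = F(n)$ after rearranging, which is what turns the weighted differences $n(F(n) - F(n+1))$ into a plain sum of the $F(n)$.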
3.17 We show first that $f$ is bounded in a neighbourhood of the origin. Indeed, let $k$ be so large that $|A| = |\{x \in R : |f(x)| \le k\}| > 0$. Then, by 3.16, $A - A$ contains a neighbourhood of the origin, and this gives our assertion. Next note that if $f$ is bounded on $(-\varepsilon, \varepsilon)$, say, we have $\int_{[0,\varepsilon]} f(x + y)\,dy = \varepsilon f(x) + \int_{[0,\varepsilon]} f(y)\,dy$. But this implies that
$$\varepsilon f(x) = \int_{[x,x+\varepsilon]} f(y)\,dy - \int_{[0,\varepsilon]} f(y)\,dy = \int_{[0,x+\varepsilon]} f(y)\,dy - \int_{[0,x]} f(y)\,dy - \int_{[0,\varepsilon]} f(y)\,dy$$
$$= \int_{[\varepsilon,x+\varepsilon]} f(y)\,dy - \int_{[0,x]} f(y)\,dy = \int_{[0,x]} (f(y + \varepsilon) - f(y))\,dy = f(\varepsilon)x.$$
It is also possible to avoid the first part of the argument by considering $\cos(f(x))$ instead; in this case the integral relation is a bit more complicated.
3.22 The Wiener covering argument works with balls in place of intervals.
3.23 We may assume that $f \in L(R^n)$ is nonnegative since replacing $f$ by $|f|$ does not change $Mf$. Consider next a sequence $\{f_k\}$ of nonnegative integrable functions that vanish off a compact set of $R^n$ with the following two properties: (i) $f_k \le f$ a.e., and, (ii) The $f_k$'s increase to $f$ a.e. By Theorem 2.1 there is a constant $c$ independent of $\lambda > 0$ and $k$ such that $\lambda|\{Mf_k > \lambda\}| \le c\int_{R^n} f_k\,dy \le c\int_{R^n} f\,dy$. Now, since the $f_k$'s increase to $f$ it follows that $Mf_k$ increases to $Mf$ everywhere and consequently, $\lim_{k \to \infty} |\{Mf_k > \lambda\}| = |\{Mf > \lambda\}|$.
3.24 Let $r_1, r_2, \dots$ be an enumeration of the rational numbers, and for each $r_k$ let $N_k$ be the subset of those points $x$ in $R^n$ where the relation
$$\lim_{r \to 0} \frac{1}{|I(x,r)|}\int_{I(x,r)} |f(y) - r_k|\,dy = |f(x) - r_k|$$
fails to hold. Since $|f(y) - r_k|$ is integrable on bounded sets, by (a straightforward variant of) Theorem 2.2, $|N_k| = 0$ for all $k$. Let $N = \bigcup_k N_k$; $N$ is also a null subset of $R^n$. Now, for any $x$, $r$, and $r_k$ we have
$$\frac{1}{|I(x,r)|}\int_{I(x,r)} |f(y) - f(x)|\,dy \le \frac{1}{|I(x,r)|}\int_{I(x,r)} |f(y) - r_k|\,dy + \frac{1}{|I(x,r)|}\int_{I(x,r)} |f(x) - r_k|\,dy$$
$$= \frac{1}{|I(x,r)|}\int_{I(x,r)} |f(y) - r_k|\,dy + |f(x) - r_k|.$$
Therefore, if $x \notin N$ and if $f(x)$ is finite,
$$\limsup_{r \to 0} \frac{1}{|I(x,r)|}\int_{I(x,r)} |f(y) - f(x)|\,dy \le 2|f(x) - r_k|,$$
where the right-hand side of the above inequality can be made arbitrarily small since $r_k$ is any rational number.
3.29 Divide $I$ into two nonoverlapping closed intervals, each of equal length. If $J$ denotes one of these intervals, then either $\frac{1}{|J|}\int_J f\,dy \le \lambda$, or else $\frac{1}{|J|}\int_J f\,dy > \lambda$. In the latter case we separate $J$ and it becomes one of the $I_k$'s in the conclusion. Then (i) clearly holds since $\frac{1}{|J|}\int_J f\,dy \le \frac{2}{|I|}\int_I f\,dy \le 2\lambda$. On the other hand, in the former case we proceed with subdividing $J$, and repeat this process until we are in the second case; if we are never forced into the second case we just keep subdividing. We denote by $\bigcup_k I_k$ the union of the pairwise nonoverlapping closed intervals obtained from the second case, and we claim that (ii) holds. Indeed, by the Lebesgue differentiation theorem, for almost every point $x$ of $I \setminus \bigcup_k I_k$ we have $f(x) = \lim_{|J| \to 0} \frac{1}{|J|}\int_J f\,dy$, where the limit is taken over those $J$'s which satisfy the first case and which contain $x$; thus (ii) also holds. Finally, $|\bigcup_k I_k| = \sum_k |I_k| \le \frac{1}{\lambda}\sum_k \int_{I_k} f\,dy = \frac{1}{\lambda}\int_{\bigcup_k I_k} f\,dy$.
3.31 If $f$ is integrable and $\lambda > 0$ is given, decompose $R$ into a sequence of nonoverlapping closed intervals $I$ each of equal length such that $\frac{1}{|I|}\int_I f\,dy \le \lambda$ for each $I$. Now apply 3.29 to each $I$ separately.
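The stopping-time subdivision of 3.29 can be tried on a discrete model. The sketch below is an illustration under assumptions that are not in the text: the function is replaced by a list of nonnegative integers on $2^m$ equal cells, averages are exact rationals, and the name `dyadic_selection` is hypothetical. An interval is selected the first time its average exceeds $\lambda$; otherwise it is bisected again.

```python
from fractions import Fraction

def dyadic_selection(values, lam):
    """Select dyadic index ranges (lo, hi) whose average first exceeds lam."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    def average(lo, hi):
        return Fraction(sum(values[lo:hi]), hi - lo)

    if average(0, n) > lam:
        raise ValueError("the average over the whole interval must not exceed lam")

    selected = []

    def subdivide(lo, hi):
        if hi - lo == 1:
            return  # a single cell that was never selected: its value is <= lam
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            if average(a, b) > lam:
                selected.append((a, b))  # second case: separate this interval
            else:
                subdivide(a, b)          # first case: keep subdividing

    subdivide(0, n)
    return selected

values = [0, 0, 0, 8, 1, 1, 0, 0]
lam = 2
picked = dyadic_selection(values, lam)
print(picked)
```

On each selected range the average lies in $(\lambda, 2\lambda]$ because the parent had average at most $\lambda$ and is twice as long, every unselected cell carries a value at most $\lambda$, and the total length of the selected ranges is at most $\lambda^{-1}$ times the total mass, mirroring the three conclusions discussed in 3.29.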
3.32 We begin by decomposing $R^n$ into a mesh of nonoverlapping closed intervals $I$, each of equal size, and whose diameter is so large that $\frac{1}{|I|}\int_I f\,dy \le \lambda$ for every $I$ in this mesh. We consider now one interval at a time, as follows: If $I$ is in the mesh we divide it into $2^n$ congruent nonoverlapping closed intervals by bisecting each of the sides of $I$. Let $J$ be one of these new intervals and note that either $\frac{1}{|J|}\int_J f\,dy \le \lambda$, or else $\frac{1}{|J|}\int_J f\,dy > \lambda$. In the latter case we separate $J$ and note that $\frac{1}{|J|}\int_J f\,dy \le \frac{2^n}{|I|}\int_I f\,dy \le 2^n\lambda$; in the former case, we subdivide. Keep going.
CHAPTER IX
3.1 Given a family $\mathcal{C}$ of subsets of $X$, let $\mathcal{C}^*$ denote the class of subsets of $X$ consisting of the at most countable unions of differences of sets in $\mathcal{C}$, i.e., $\mathcal{C}^* = \{\bigcup_n (A_n \setminus B_n) : A_n, B_n \in \mathcal{C}\}$. It is clear that if $\mathcal{C} \subseteq S(\mathcal{A})$, then $\mathcal{C}^* \subseteq S(\mathcal{A})$. Also, if $\emptyset$ and $X$ are in $\mathcal{C}$, then $\emptyset$ and $X$ are in $\mathcal{C}^*$, and $\mathcal{C} \subseteq \mathcal{C}^*$. In addition, by 5.30 and 5.31 in Chapter I, if $\mathrm{card}\,\mathcal{C} \le c$, then $\mathrm{card}\,\mathcal{C}^* \le c$. Let now $\mathcal{C}$ be the countable family consisting of open intervals in the line with rational endpoints, and intervals with one endpoint $\infty$ or $-\infty$ and the other endpoint rational, and $(-\infty, \infty)$; clearly $\emptyset$ and $(-\infty, \infty)$ are in $\mathcal{C}$. Let $A$ be the well-ordered uncountable set with ordinal $\Omega$ constructed in Chapter II; we now show that to each ordinal $\alpha \in A$, there corresponds a class $\mathcal{C}_\alpha$ of subsets of $R$ which satisfies the following two properties: (a) If the ordinals $\alpha, \beta \in A$ satisfy $\alpha < \beta$, then $\mathcal{C} \subseteq \mathcal{C}_\alpha \subseteq \mathcal{C}_\beta \subseteq S(\mathcal{C})$, and (b) $\mathrm{card}\,\mathcal{C}_\alpha \le c$ for all $\alpha$. This is how we go about it: To the first element 1 of $A$ we associate the class $\mathcal{C}_1 = \mathcal{C}$. Assuming now that $\mathcal{C}_\alpha$ has been assigned to each ordinal $\alpha < \beta$ and that (a) and (b) hold, let $\mathcal{C}_\beta = (\bigcup_{\alpha < \beta} \mathcal{C}_\alpha)^*$ ... above, and by the principle of transfinite induction, cf. 5.10 in Chapter II, a class $\mathcal{C}_\alpha$ exists for each $\alpha \in A$ satisfying (a) and (b) above. Furthermore, we claim that $M = \bigcup_{\alpha \in A} \mathcal{C}_\alpha$ is a $\sigma$-algebra that coincides with $S(\mathcal{C})$. To see this, let $\{A_n\}$ be a sequence of sets in $M$ and observe that to each $n$ there corresponds a class $\mathcal{C}_{\alpha_n}$ so that $A_n \in \mathcal{C}_{\alpha_n}$. Let $\alpha = \sup \alpha_n$; by 5.33 in Chapter II, $\alpha \in A$. Since $\mathcal{C}_{\alpha_n} \subseteq \mathcal{C}_\alpha$ for all $n$, we have $\{A_n\} \subseteq \mathcal{C}_\alpha$ and $\bigcup_n A_n = \bigcup_n (A_n \setminus \emptyset) \in \mathcal{C}_{\alpha+1} \subseteq M$. In a similar fashion we show that if $A_1, A_2 \in M$, then $A_1 \setminus A_2 \in M$. Finally, since $\mathrm{card}\,\mathcal{C}_\alpha \le c$ for all $\alpha$ and $\mathrm{card}\,A \le c$, it follows that $\mathrm{card}\,\mathcal{B}_1 = \mathrm{card}\,S(\mathcal{C}) = \mathrm{card}\,M \le c$. That $\mathrm{card}\,\mathcal{B}_1 = c$ follows combining this result with 4.16 in Chapter IV. Also, in the construction following Theorem 1.7 in Chapter VI, the set $A$ there is Lebesgue but not Borel measurable.
3.5 Since $\mu$ is regular and $\mu(\{x\}) = 0$ for each $x$, given $\varepsilon > 0$, there exists $\delta(x) > 0$ such that the open ball $B(x, \delta(x))$ satisfies $\mu(B(x, \delta(x))) < \varepsilon/2$. Now, by the Heine-Borel Theorem there exists a finite sequence $x_1, \dots, x_N$, say, such that $X \subseteq \bigcup_{n=1}^{N} B(x_n, \delta(x_n))$. The choice $\delta = \min\{\delta(x_1), \dots, \delta(x_N)\}/2$ will do.
3.7
Theorem 2.3 in Chapter IV is useful here.
3.14 The set $A = \{x : \mu(\{x\}) > 0\}$ is at most countable; let $x_1, x_2, \dots$ be an enumeration of $A$. If $A = \emptyset$ put $\mu_d = 0$ and $\mu_c = \mu$. Otherwise put $\mu_d = \sum_k \mu(\{x_k\})\delta_{x_k}$, and note that since $\mu_d(E) = \sum_{x \in E} \mu(\{x\}) \le \mu(E)$ for each Borel set $E$, then by 4.29 in Chapter IV there exists a unique measure $\mu_c$, say, such that $\mu_c(E) = \mu(E) - \mu_d(E)$ for all $E \in \mathcal{B}$.
3.15 Let $x_1 < y_1$ satisfy $0 < \mu((-\infty, x_1]) < \eta < \mu((-\infty, y_1]) < \infty$. Having chosen points $x_1 \le \dots \le x_{n-1} < y_{n-1} \le \dots \le y_1$ such that $\mu((-\infty, x_{n-1}]) < \eta < \mu((-\infty, y_{n-1}])$, divide the interval $[x_{n-1}, y_{n-1}]$ into two equal nonoverlapping closed subintervals $[x_{n-1}, r]$ and $[r, y_{n-1}]$. If $\mu((-\infty, r]) = \eta$ we are done. Otherwise, if $\mu((-\infty, r]) < \eta$ set $x_n = r$ and $y_n = y_{n-1}$, and if $\mu((-\infty, r]) > \eta$ set $x_n = x_{n-1}$ and $y_n = r$. Keep going.
3.16 Let $\mathcal{V} = \{V : V \text{ is open and } \mu(V) = 0\}$, and put $G = \bigcup_{V \in \mathcal{V}} V$; $G$ is open. We show next that $\mu(G) = 0$. To see this, let $K$ be a compact subset of $G$. Then there exist $V_1, \dots, V_m$ in $\mathcal{V}$ such that $K \subset \bigcup_{k=1}^{m} V_k$, and consequently, $\mu(K) = 0$; by regularity, also $\mu(G) = 0$. Now put $C = R^n \setminus G$; clearly (a) holds. Also, if $O$ is an open set such that $C \cap O \ne \emptyset$, and if $\mu(C \cap O) = 0$, then we also have $\mu(O) = \mu(O \cap C) + \mu(O \cap (R^n \setminus C)) = 0$, implying that $O \subseteq (R^n \setminus C)$, contrary to the fact that $C \cap O \ne \emptyset$. Finally, if $C_1$ is another closed subset that satisfies (a) and (b), then from (a) it follows that $R^n \setminus C_1 \subseteq G$, and so $C = R^n \setminus G \subseteq C_1$. Moreover, since $\mu(G \cap C_1) = 0$, from (b) it follows that $G \cap C_1 = \emptyset$. Hence, $C_1 \subseteq (R^n \setminus G) = C$. Finally, if $K$ is a compact subset of $R$, let $x_1, x_2, \dots$ be a dense subset of $K$, put $\mu = \sum_n 2^{-n}\delta_{x_n}$, and verify that $\mathrm{supp}\,\mu = K$. What if $K$ is only closed?
3.21 To compare the integrals we need to integrate by parts. Since it is not apparent how to do this, we sum by parts instead using Abel's transformation, cf. 3.8 above.
3.25 To prove the necessity, choose continuity points $x_0 < \dots < x_k$ of $F$ such that $F(x_0) < \varepsilon$, $F(x_k) > 1 - \varepsilon$, and $x_i - x_{i-1} < \varepsilon$ for $i = 1, \dots, k$. If $n$ is sufficiently large, then $|F(x_i) - F_n(x_i)| < \varepsilon/2$ for all $i$. Suppose now that $x_{i-1} \le x \le x_i$. Then $F_n(x) \le F_n(x_i) \le F(x_i) + \varepsilon/2 \le F(x + \varepsilon) + \varepsilon/2$; the inequality going the other direction is established along similar lines.
3.31 (i) implies (ii) follows from the usual convergence results once we change variables, cf. 3.24. Also, if $f = \chi_E$, then the set $D_f$ of discontinuities of $f$ is $\partial E$, and from $\mu(\partial E) = 0$ and (i) it follows that $\mu_n(E) = \int_R f\,d\mu_n \to \int_R f\,d\mu = \mu(E)$. Thus, (i) also implies (iii). Furthermore, since $\partial(-\infty, x] = \{x\}$, (iii) implies (i). To deduce (i) from (ii), consider the corresponding distribution functions $F_n$, $F$, assume that $x < y$, and let $f(t)$ be the function equal to 1 for $t \le x$, 0 for $t \ge y$ and linear on $[x,y]$, i.e., $f(t) = (y - t)/(y - x)$ for $x \le t \le y$. Since $F_n(x) \le \int_R f\,d\mu_n$ and $\int_R f\,d\mu \le F(y)$, it follows from (ii) that $\limsup F_n(x) \le F(y)$; whence letting $y$ decrease to $x$ we get $\limsup F_n(x) \le F(x)$. Similarly we show that $\lim_{u \to x,\, u < x} F(u) \le \liminf F_n(x)$, and this implies the convergence at continuity points.
3.32 The relations $2f(x/3) = f(x)$ and $2f(2/3 + x/3) - 1 = f(x)$ are useful here.
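The two relations in 3.32 are the self-similarity identities of the Cantor-Lebesgue function, taking $f$ there to be that function, which is an assumption about the problem statement. They can be verified exactly at points with short ternary expansions. In the sketch below the name `cantor_function` and the digit-based evaluation are illustrative: the value is read off from the ternary digits of $x$, stopping at the first digit 1.

```python
from fractions import Fraction

def cantor_function(x, digits=40):
    """Evaluate the Cantor-Lebesgue function at x from its ternary digits.

    Exact when x has a ternary expansion shorter than `digits`; otherwise
    the result is accurate to within 2**(-digits).
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(digits):
        x *= 3
        digit = x.numerator // x.denominator  # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return value + weight  # x lies in a removed middle third
        value += weight * (digit // 2)
        weight /= 2
    return value

sample = Fraction(7, 27)
print(cantor_function(sample), 2 * cantor_function(sample / 3))
```

Prepending the ternary digit 0 to $x$ halves the value, and prepending the digit 2 halves it and adds 1/2; these are exactly the two relations.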
CHAPTER X
4.9 By considering if necessary the functions $f_k - f_k(a)$ we may assume that the $f_k$'s are nonnegative. Thus $f = \sum_k f_k$ is nonnegative and nondecreasing and, by Theorem 2.1, $f'$ exists a.e. in $I$. Consider next the partial sums $s_n = f_1 + \dots + f_n$, and let $N$ be a null subset of $I$ so that $f'$ and $s'_n$ exist off $N$, $n = 1, 2, \dots$ For any $x \in (a,b)$ and $h > 0$ with $x + h \in (a,b)$, we have
$$\frac{f(x + h) - f(x)}{h} \ge \frac{s_n(x + h) - s_n(x)}{h},$$
and consequently, $s'_n(x) \le f'(x)$ for $x \in I \setminus N$. Moreover, since $s'_n(x) \le s'_{n+1}(x) \le f'(x)$, $\lim_{n \to \infty} s'_n(x)$ exists for $x \in I \setminus N$ and it does not exceed $f'(x)$. But, is it equal to $f'(x)$? To show that this is the case it suffices to find a subsequence $s'_{n_k}$ that converges to $f'$ a.e. Let $n_1 < n_2 < \dots < n_k < \dots$ be so that $\sum_{k=1}^{\infty} (f(b) - s_{n_k}(b)) < \infty$. This implies that for each $n_k$ and all $x \in (a,b)$ we have $0 \le f(x) - s_{n_k}(x) \le f(b) - s_{n_k}(b)$ and consequently the series $\sum_k (f(x) - s_{n_k}(x))$ converges. Since the terms of this series are monotone functions that have finite derivatives a.e., the above argument gives that $\sum_k (f'(x) - s'_{n_k}(x))$ converges a.e. and $\lim_{k \to \infty} (f'(x) - s'_{n_k}(x)) = 0$ a.e.
4.12 Let $f$ denote the Cantor-Lebesgue function on $[0,1]$, and extend it to be 0 for $x \le 0$ and 1 for $x \ge 1$. Let $\{[a_n, b_n]\}$ be an enumeration of the family of closed subintervals of $[0,1]$ with rational endpoints, and put $f_n(x) = f((x - a_n)/(b_n - a_n))$. Then $g(x) = \sum_{n=1}^{\infty} 2^{-n} f_n(x)$ is continuous and strictly increasing on $[0,1]$ and, by 4.8, $g' = 0$ a.e.
4.22 Let $\{O_n\}$ be a decreasing sequence of open sets containing $N$ such that $|O_n| < 1/2^n$, $n = 1, 2, \dots$, let $\phi_n = \sum_{k=1}^{n} \chi_{O_k}$, and put $\phi = \lim_{n \to \infty} \phi_n$. It is not hard to verify that $\phi \in L(I)$ and that $\int_I \phi < 1$. Further, put $\psi_n(x) = \int_{[0,x]} \phi_n(y)\,dy$ and $\psi(x) = \int_{[0,x]} \phi(y)\,dy$. Now, if $x \in I$ and $[x, x + h] \subset O_n$, we have
$$\frac{\psi(x + h) - \psi(x)}{h} \ge \frac{1}{h}\int_{[x,x+h]} \phi_n(y)\,dy = n.$$
Thus the right-hand side derivative of $\psi$ at $x$ is $\infty$; similarly for the left-hand side derivative.
4.23 If $E$ is the set constructed in 3.34 in Chapter V, consider the function $f(x) = \int_{[0,x]} (\chi_E - \chi_{I \setminus E})\,dy$, $0 < x < 1$.
CHAPTER XI
3.4 The Closed Graph Theorem, which is discussed in Chapter XV, is relevant to answer the question.
3.6 We observe first that sets $E$ with $\nu(E) > -\infty$ enjoy the following property: Given $\varepsilon > 0$, there is a measurable subset $B$ of $E$ such that (a) $\nu(B) \ge \nu(E)$, and, (b) $\nu(B') > -\varepsilon$ for each measurable subset $B'$ of $B$. Indeed, if $\nu(E) > -\varepsilon$ and the same is true for each of its measurable subsets, we are done. Otherwise let $B_1$ be a measurable subset of $E$ with $\nu(B_1) \le -\varepsilon$. Since $\nu(E)$ is finite, we have $\nu(E \setminus B_1) \ge \nu(E)$, and we may repeat the above argument with $E \setminus B_1$ in place of $E$. If $\nu(E \setminus B_1) > -\varepsilon$ and the same is true for each of its measurable subsets, we are done. Otherwise let $B_2 \subseteq E \setminus B_1$ be such that $\nu(B_2) \le -\varepsilon$; note that $B_2$ is disjoint with $B_1$ and that $\nu(E \setminus (B_1 \cup B_2)) \ge \nu(E \setminus B_1) \ge \nu(E)$. We proceed in this fashion recursively and either stop after a finite number of steps after obtaining a set $B_k$ with the desired properties, or else we get a pairwise disjoint family $B_1, B_2, \dots$ of measurable subsets of $E$ such that $\nu(B_k) \le -\varepsilon$ for all $k$. In the latter case put $B = \bigcup_k B_k$, and note that $\nu(E \setminus B) = \nu(E) - \sum_k \nu(B_k) = \infty$, which contradicts our assumption on $\nu$. To complete the proof, put $E_1 = E$, and define $E_n$ inductively as follows: Having defined $E_1, \dots, E_{n-1}$, by the above remark we can find a measurable subset $E_n$ of $E_{n-1}$ such that $\nu(E_n) \ge \nu(E_{n-1}) \ge \nu(E)$ and $\nu(B) \ge -1/n$ for each measurable subset $B$ of $E_n$. Finally put $A = \bigcap_n E_n \subseteq E_1 = E$, and verify that $A$ is positive with respect to $\nu$. By the way, this result can be used to show that arbitrary signed measures $\nu$ on $(X, M)$ admit a Hahn decomposition. The proof runs along these lines: We may assume that $\nu$ does not take the value $\infty$ for otherwise we work with $-\nu$. Put $\eta = \sup\{\nu(A) : A \text{ is positive with respect to } \nu\} \ge 0$, let $\{A_n\}$ be a sequence of positive sets such that $\lim_{n \to \infty} \nu(A_n) = \eta$, and set $A = \bigcup_n A_n$. $A$ is also positive with respect to $\nu$ and $\nu(A) = \eta < \infty$. Put now $B = X \setminus A$; we claim that $B$ is negative with respect to $\nu$. For, if this is not the case, then $B$ contains a measurable subset of measure $\varepsilon > 0$, say, and consequently, also a subset $A_1$ with $\nu(A_1) \ge \varepsilon$ which is positive with respect to $\nu$. In particular, $A_1$ and $A$ are disjoint, and $\eta \ge \nu(A \cup A_1) = \nu(A) + \nu(A_1) \ge \eta + \varepsilon$.
3.33 By considering either $\mu_m - \mu$ or $\mu - \mu_m$, we may assume that $\mu_m(A)$ decreases to 0 for each $A \in \mathcal{B}_n$. Thus for each interval $I$ we have $\mu_m(I)/|I| \ge \mu_{m+1}(I)/|I|$, and consequently $D\mu_m \ge D\mu_{m+1}$ and $\{D\mu_m\}$ is a nonincreasing sequence of functions. Let $A_\eta = \{x \in R^n : D\mu_m(x) \ge \eta \text{ for all } m\}$. By 3.32 it follows that $\mu_m(A_\eta) \ge \eta|A_\eta|$. If we let $m \to \infty$ we get that $0 \ge \eta|A_\eta|$ and so $A_\eta$ is null. Now we must show that $D\mu_m \to 0$ a.e. Indeed, if $A = \{x \in R^n : \lim_{m \to \infty} D\mu_m(x) > 0\}$, then $A = \bigcup_k A_{1/k}$, and hence $|A| = 0$.
CHAPTER XII
4.11 To show the necessity, let $f \in L^p(X) \setminus L^q(X)$, and set $E_n = \{|f| \ge n\}$. Thus, $\mu(E_n) < \infty$ for all $n$, and $\lim_{n \to \infty} \mu(E_n) = 0$. Furthermore, if $\mu(E_n) = 0$ for some $n$, then $f \in L^\infty(X)$, and by 4.6 it follows that $f \in L^q(X)$, a contradiction. Conversely, let us first show that there exists a pairwise disjoint sequence $\{E_n\}$ of measurable sets such that $0 < \mu(E_n) < 1/2^n$. By assumption there exists a sequence $\{A_n\}$ of measurable sets such that $0 < \mu(A_n) \le 1/2^n$ and $\mu(A_{n+1}) \le \mu(A_n)/4$; the sequence $E_n = A_n \setminus \bigcup_{k=n+1}^{\infty} A_k$ has the desired properties. If $q < \infty$ put $f = \sum_{n=1}^{\infty} \mu(E_n)^{-1/q}\chi_{E_n}$, and if $q = \infty$ put $f = \sum_{n=1}^{\infty} n^{1/p}\chi_{E_n}$.
4.16 Note that $|f_n - f|^p \le 2^p|f_n|^p + 2^p|f|^p$, and proceed as in 4.24 in Chapter VII.
4.19 To prove the necessity of (a), given $\varepsilon > 0$, choose an integer $n_0$ so that $\|f - f_n\|_p < \varepsilon$ for all $n \ge n_0$, and then pick $B_\varepsilon, C_\varepsilon \in M$ of finite measure such that $\int_{X \setminus B_\varepsilon} |f|^p\,d\mu < \varepsilon$ and $\int_{X \setminus C_\varepsilon} |f_n|^p\,d\mu < \varepsilon$ for $n = 1, \dots, n_0$; now put $A_\varepsilon = B_\varepsilon \cup C_\varepsilon$. (b) follows along similar lines using 3.7 in Chapter VIII. As for the sufficiency, by (a), Fatou's Lemma and Theorem 1.2, we may reduce the problem to one where $\mu$ is a finite measure. Given $\varepsilon > 0$, let $\delta > 0$ be chosen so that (b) holds. By Egorov's theorem there is a measurable subset $B$ with $\mu(B) < \delta$ such that $f_n$ converges uniformly to $f$ on $X \setminus B$, and by Fatou's Lemma it follows that $\int_B |f|^p\,d\mu < \varepsilon$. By Theorem 1.2 we then see that $\int_X |f - f_n|^p\,d\mu < 3^p\varepsilon$ for all sufficiently large $n$. Thus we conclude that $f = (f - f_n) + f_n \in L^p(\mu)$, and $\|f - f_n\|_p \to 0$ as $n \to \infty$.
4.20 If $f$ is a nonnegative simple function, $f = \sum_{n=1}^{N} c_n\chi_{A_n}$, $\mu(A_n) = a_n < \infty$ for all $n$, and $c_1 > c_2 > \cdots > c_N$, then the identity in question follows for $f$ from the properties established in 4.11-4.12 of Chapter VI. For arbitrary $L^p$ functions, take limits. A simpler proof follows from Fubini's theorem, to be discussed in Chapter XIII, and the identity $|f(x)|^p = \int_{[0,|f(x)|]} p\,t^{p-1}\,dt$.
4.22 If $O_t = \{|f| > t\}$, try $g = f\chi_{O_t}$.
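As a sanity check on 4.20, and assuming that the identity in question there is the familiar formula $\int_X |f|^p\,d\mu = \int_{[0,\infty)} p\,t^{p-1}\mu(\{|f| > t\})\,dt$ (the form that the Fubini argument produces), both sides can be computed for a simple function. The following Python sketch is only an illustration; the function names are ours and do not come from the text.

```python
def lp_integral_direct(values, measures, p):
    """Integral of |f|^p for f = sum c_n * chi_{A_n} with pairwise disjoint A_n.

    values[n] is c_n and measures[n] is mu(A_n) (assumed finite).
    """
    return sum(abs(c) ** p * a for c, a in zip(values, measures))


def distribution(values, measures, t):
    """mu({|f| > t}) for the same simple function."""
    return sum(a for c, a in zip(values, measures) if abs(c) > t)


def lp_integral_layer_cake(values, measures, p):
    """Integral over [0, inf) of p t^(p-1) mu({|f| > t}) dt, piece by piece.

    The distribution function is constant between consecutive levels |c_n|,
    and the integral of p t^(p-1) over [lo, hi] equals hi^p - lo^p.
    """
    levels = sorted({abs(c) for c in values} | {0.0})
    total = 0.0
    for lo, hi in zip(levels, levels[1:]):
        # On [lo, hi) the distribution function equals its value at lo.
        total += (hi ** p - lo ** p) * distribution(values, measures, lo)
    return total
```

For instance, with values (3, 2, 0.5) on sets of measure (0.25, 1, 2) and $p = 2.5$, the two functions agree up to rounding, as the summation by parts behind the simple-function case predicts.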
4.24 Given $f \in L^p(R^n)$ and $t > 0$, let $O_t$ be as in 4.22 and write $f = f_t + f_\infty$, where $f_t = f\chi_{O_{t/2}}$, and $f_\infty \in L^\infty(R^n)$, $\|f_\infty\|_\infty \le t/2$. Note that since $\{Mf > t\} \subseteq \{Mf_t > t/2\} \cup \{Mf_\infty > t/2\}$ and $\|Mf_\infty\|_\infty \le \|f_\infty\|_\infty \le t/2$, by the Hardy-Littlewood maximal theorem it readily follows that
$$|\{Mf > t\}| \le |\{Mf_t > t/2\}| \le c\,(t/2)^{-1}\|f_t\|_1 = c\,t^{-1}\int_{\{|f|>t/2\}} |f|\,dx.$$
Whence from 4.20 and Fubini's theorem, to be discussed in Chapter XIII, we get that
$$\|Mf\|_p^p \le c\int_{[0,\infty)} p\,t^{p-1}\,t^{-1}\int_{\{|f|>t/2\}} |f(x)|\,dx\,dt \le c\int_{R^n} |f(x)|\left(\int_{[0,2|f(x)|]} p\,t^{p-2}\,dt\right)dx = c\int_{R^n} |f(x)|\,|f(x)|^{p-1}\,dx.$$
The constant $c$ above depends, of course, on $n$ and $p$. It is surprising that if the Hardy-Littlewood maximal function is instead taken with respect to balls, as introduced in 3.22 in Chapter VIII, then the corresponding $L^p$ estimate holds with a constant $c$ that depends only on $p$ and is independent of $n$. The proof of this interesting result has been given by Stein and Strömberg, cf. Ark. Mat. 21 (1983), 259-269.
4.27 To show the sufficiency observe that $\nu(E) > 0$ implies $\mu(E) > 0$, and consequently, $\nu \ll \mu$. Let $f = d\nu/d\mu$, and for a sequence $(b_n)$ of nonnegative real numbers that increases from $0$ to $\infty$, let $E_n = \{b_n \le f < b_{n+1}\}$. Then the $E_n$'s form a measurable partition of $X$, and, since $\nu(E_n) \ge b_n\mu(E_n)$, the condition gives the required bound, in terms of $c$, for the norm of $f$.
4.28 Let $q$ be the conjugate index to $p$, and $\varepsilon > 0$. By Young's inequality we obtain an estimate that holds for all $n$.
4.32 We have $\|f_ng_n - fg\|_p \le \|(f_n - f)g_n\|_p + \|f(g_n - g)\|_p = I + J$, say. By assumption, $\lim_{n\to\infty} I = 0$. Now, given $\varepsilon > 0$, let $\delta$ be chosen so that $\int_E |f|^p\,d\mu < \varepsilon$ provided that $\mu(E) < \delta$, and pick a measurable set $A$, $\mu(A) < \infty$, such that $\int_{X\setminus A}|f|^p\,d\mu < \varepsilon$. Since $f$ is finite $\mu$-a.e. on $A$, by (an extension of) Proposition 2.1 in Chapter VI, there exist a measurable subset $B$ of $A$ and a constant $M$ such that $\mu(B) \le \min\{\mu(A)/2, \delta\}$ and $|f| \le M$ on $A \setminus B$. If $k$ denotes the uniform bound for the $g_n$'s and $g$, then $J^p \le (2k)^p\int_{X\setminus A}|f|^p\,d\mu + (2k)^p\int_B |f|^p\,d\mu + M^p\int_{A\setminus B}|g_n - g|^p\,d\mu$.
4.33 Part (a) of the necessity follows from the Uniform Boundedness Principle, to be discussed in Chapter XV; also cf. Proposition 4.1 in Chapter XVI. As for the sufficiency, from (b) it follows that $\lim_{n\to\infty}\int_X f_n\phi\,d\mu = \int_X f\phi\,d\mu$ for all $\phi$'s in the dense subset of $L^{p'}(\mu)$ consisting of the simple functions. The result for arbitrary $\phi \in L^{p'}(\mu)$, $1/p + 1/p' = 1$, follows by a limiting argument.
4.34 If $\liminf\|f_n\|_p \le k$, by Fatou's Lemma $\|f\|_p \le k$ as well. Thus, by Hölder's inequality, $|\int_X fg\,d\mu|$, $\limsup|\int_X f_ng\,d\mu| < \infty$. Now, given $\varepsilon > 0$ and $g \in L^q$, $1/p + 1/q = 1$, by Young's inequality we have
$$|f_ng| = \varepsilon|f_n|\,\frac{|g|}{\varepsilon} \le \frac{\varepsilon^p|f_n|^p}{p} + \frac{|g|^q}{q\,\varepsilon^q},$$
and consequently, $\varepsilon^p|f_n|^p/p + |g|^q/(q\varepsilon^q) - f_ng \ge 0$. Applying Fatou's Lemma to this nonnegative function it follows that
$$\frac{\varepsilon^p\|f\|_p^p}{p} + \frac{\|g\|_q^q}{q\,\varepsilon^q} - \int_X fg\,d\mu \le \frac{\varepsilon^pk^p}{p} + \frac{\|g\|_q^q}{q\,\varepsilon^q} - \limsup\int_X f_ng\,d\mu.$$
Unraveling this expression we get $\limsup\int_X f_ng\,d\mu \le \int_X fg\,d\mu + \varepsilon^pk^p/p$, which implies, since $\varepsilon$ is arbitrary, that $\limsup\int_X f_ng\,d\mu \le \int_X fg\,d\mu$. The opposite inequality follows from the estimate $\int_X fg\,d\mu \le \liminf\int_X f_ng\,d\mu$, obtained by considering the sequence $\{-f_n\}$.
4.35 Assume that the sequence $f_n$ does not converge weakly to $f$, choose $g \in L^{p'}$ such that $\limsup|\int_X f_ng\,d\mu - \int_X fg\,d\mu| = \eta > 0$, and find a subsequence $n_1 < n_2 < \cdots$ such that $\lim_{k\to\infty}|\int_X (f_{n_k} - f)g\,d\mu| = \eta$. Next invoke (an appropriate version of) Proposition 3.3 in Chapter VI to conclude that there is a further subsequence of the $n_k$'s, which we call $n_k$ again for simplicity, such that $\lim_{k\to\infty} f_{n_k} = f$ $\mu$-a.e. This leads to a contradiction.
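4.36 We begin by observing that given $0 < p < \infty$ and $0 < \varepsilon < 1$, there exist constants $c$, independent of $\varepsilon$, and $c_\varepsilon$ such that
$$\big|\,|a+b|^p - |a|^p\,\big| \le c\,\varepsilon|a|^p + c_\varepsilon|b|^p, \quad \text{all real } a, b. \tag{1.1}$$
We may prove this by considering two cases, to wit, when $a, b$ are of the same sign, and when they are not. For instance, in the former case, suppose that $a, b > 0$, and note that when $b \le \varepsilon a$, by the mean value theorem, the left-hand side of (1.1) is dominated by $a^p((1+\varepsilon)^p - 1) \le c\,\varepsilon a^p$, and when $b > \varepsilon a$, by $((b/\varepsilon) + b)^p = ((1/\varepsilon) + 1)^pb^p$. With (1.1) out of the way, note that by Fatou's Lemma $f \in L^p(\mu)$, and that
$$\big|\,|f_n - f|^p - |f_n|^p + |f|^p\,\big| \le \big|\,|f_n - f|^p - |f_n|^p\,\big| + |f|^p \le c\,\varepsilon|f_n|^p + c_\varepsilon|f|^p + |f|^p = c\,\varepsilon|f_n|^p + c_\varepsilon|f|^p,$$
where in the last expression $c_\varepsilon$ denotes a (larger) constant of the same kind. Whence, $c\,\varepsilon|f_n|^p + c_\varepsilon|f|^p - \big|\,|f_n - f|^p - |f_n|^p + |f|^p\,\big| \ge 0$, and, by applying Fatou's Lemma to this sequence of nonnegative functions we get
$$(c\,\varepsilon + c_\varepsilon)\int_X |f|^p\,d\mu \le c\,\varepsilon\liminf\int_X |f_n|^p\,d\mu + c_\varepsilon\int_X |f|^p\,d\mu - \limsup\int_X \big|\,|f_n - f|^p - |f_n|^p + |f|^p\,\big|\,d\mu.$$
Thus, $\limsup\int_X \big|\,|f_n - f|^p - |f_n|^p + |f|^p\,\big|\,d\mu \le c\,\varepsilon k$, and the conclusion follows since $\varepsilon$ is arbitrary. This result is due to Brezis and Lieb, Proc. Amer. Math. Soc. 88 (1983), 486-490.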
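The elementary inequality (1.1) can be probed numerically. The sketch below uses one admissible, by no means optimal, choice of constants: $m(\varepsilon) = \max\{(1+\varepsilon)^p - 1,\ 1 - (1-\varepsilon)^p\}$ in front of $|a|^p$ and $(1 + 1/\varepsilon)^p$ in front of $|b|^p$. These come from splitting into the cases $|b| \le \varepsilon|a|$ and $|b| > \varepsilon|a|$; they are our choice for illustration and not necessarily the constants intended in the text, and the function names are hypothetical.

```python
def brezis_lieb_constants(p, eps):
    """Return (m, big): coefficients of |a|^p and |b|^p in a version of (1.1).

    Assumes p > 0 and 0 < eps < 1.  m tends to 0 as eps tends to 0.
    """
    m = max((1.0 + eps) ** p - 1.0, 1.0 - (1.0 - eps) ** p)
    big = (1.0 + 1.0 / eps) ** p
    return m, big


def inequality_holds(p, eps, a, b):
    """Check | |a+b|^p - |a|^p | <= m |a|^p + big |b|^p for real a, b."""
    m, big = brezis_lieb_constants(p, eps)
    lhs = abs(abs(a + b) ** p - abs(a) ** p)
    rhs = m * abs(a) ** p + big * abs(b) ** p
    # A tiny slack absorbs floating-point rounding.
    return lhs <= rhs * (1.0 + 1e-12) + 1e-12
```

If $|b| \le \varepsilon|a|$ the left-hand side is at most $m(\varepsilon)|a|^p$, and otherwise it is at most $(|a| + |b|)^p \le (1 + 1/\varepsilon)^p|b|^p$, so the check should succeed for every admissible input.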
CHAPTER XIII
4.2 Consider $\sum_{k=-m^2}^{m^2} f((k-1)/m, y)\,\chi_{[(k-1)/m,\,k/m]}(x)$.
4.9 First, since $\delta(y) \to \delta(x)$ as $y \to x$, we get that $M_\lambda(x,f) = \infty$ if $x \notin F$. Next, note that since $\delta = 0$ in $F$, the integral defining $M_\lambda$ can be restricted to $I \setminus F$. Thus, by Tonelli's theorem, $\int_F M_\lambda(x,f)\,dx = \int_{I\setminus F}\int_F \big(\delta^\lambda(y)/|x-y|^{1+\lambda}\big)\,dx\,dy$. In order to estimate the integral over $F$, fix $y \in I \setminus F$, and note that for $x \in F$, $|x - y| \ge \delta(y) > 0$. Thus, the integral in question is dominated by
$$\int_{\{|x-y|\ge\delta(y)\}} \frac{dx}{|x-y|^{1+\lambda}} = 2\int_{[\delta(y),\infty)} t^{-(1+\lambda)}\,dt = 2\lambda^{-1}\delta(y)^{-\lambda}.$$
This implies that $\int_F M_\lambda(x,f)\,dx \le 2\lambda^{-1}|I \setminus F|$.
4.13 Hölder's inequality and the continuity of translates of $L^p(R^n)$ functions should do the job.
4.14 Suppose $f \in L^p(R^n)$, $g \in L^q(R^n)$, and $\phi \in L^{r'}(R^n)$, where $1/r + 1/r' = 1$, are nonnegative. Young's convolution theorem follows then from the estimate
$$\int_{R^n}\int_{R^n} \phi(x)f(x-y)g(y)\,dx\,dy \le \|f\|_p\|g\|_q\|\phi\|_{r'}.$$
If $\alpha_1, \alpha_2$ are nonnegative numbers such that $\alpha_1 + \alpha_2 = 1$, and similarly for $\beta_1, \beta_2$ and $\gamma_1, \gamma_2$, then the left-hand side of the above inequality may be rewritten $\int_{R^n}\int_{R^n} \phi(x)^{\alpha_1+\alpha_2}f(x-y)^{\beta_1+\beta_2}g(y)^{\gamma_1+\gamma_2}\,dx\,dy$. By 4.4 in Chapter XII, if $1/s + 1/t + 1/u = 1$, this expression is dominated by the product of the three integrals $\big(\int_{R^n}\int_{R^n} \phi(x)^{\alpha_1s}f(x-y)^{\beta_1s}\,dx\,dy\big)^{1/s}$, $\big(\int_{R^n}\int_{R^n} f(x-y)^{\beta_2t}g(y)^{\gamma_1t}\,dx\,dy\big)^{1/t}$ and $\big(\int_{R^n}\int_{R^n} \phi(x)^{\alpha_2u}g(y)^{\gamma_2u}\,dx\,dy\big)^{1/u}$. Clearly we will be done once we choose our parameters so that $\alpha_1s = r'$, $\beta_1s = p$, $\beta_2t = p$, $\gamma_1t = q$, $\alpha_2u = r'$, and $\gamma_2u = q$. It is a simple arithmetic task to verify that if $1/p + 1/p' = 1$, and $1/q + 1/q' = 1$, then the choice $s = q'$, $t = r$, and $u = p'$ works.
4.15 We do the case $p = 1$. We look carefully at the proof of Theorem 3.7 and note that the term $A$ there is also estimated by $\eta$. As for the $B$ term, it does not exceed
$$\int_{\{|y|\ge M\}} |f(x-y)|\,\phi_\varepsilon(y)\,dy + |f(x)|\int_{\{|y|\ge M\}} \phi_\varepsilon(y)\,dy = B_1 + B_2,$$
say. By (2.10), $\lim_{\varepsilon\to 0} B_2 = 0$. Write $\phi(y) = \psi(y)/|y|^n$, where $\lim\psi(y) = 0$ as $|y| \to \infty$. Thus,
$$B_1 = \int_{\{|y|\ge M\}} |f(x-y)|\,\psi(y/\varepsilon)\,|y|^{-n}\,dy \le M^{-n}\Big(\sup_{\{|y|\ge M\}}\psi(y/\varepsilon)\Big)\|f\|_1,$$
where the sup above goes to $0$ with $\varepsilon$.
4.17 The formula to be proved reduces to a familiar formula of integration by parts when $n = 1$. Writing $f = f^+ - f^-$, we note that $F$ is the difference of two bounded increasing functions; hence $F$ is BV on $[r,R]$, and the integral $\int_r^R \phi(\rho)\,dF(\rho)$ is well-defined. Assume now that $f$ is nonnegative, let $I = \int_{\{r\le|x|\le R\}} f(x)\phi(|x|)\,dx$, and let $\{r = \rho_0 < \cdots < \rho_k = R\}$ be a partition of $[r,R]$. If $m_i$ and $M_i$ denote the minimum and maximum values of $\phi$ on $[\rho_{i-1},\rho_i]$ respectively, $1 \le i \le k$, it readily follows that
$$\sum_{i=1}^{k} m_i\big(F(\rho_i) - F(\rho_{i-1})\big) \le I \le \sum_{i=1}^{k} M_i\big(F(\rho_i) - F(\rho_{i-1})\big),$$
where the extreme terms above converge to $\int_r^R \phi(\rho)\,dF(\rho)$.
4.18 We do the case $p = 1$. Let $x_0$ be a Lebesgue point of $f$; by considering the function $f(x_0 + x)$ we may assume that $x_0 = 0$. Now, if $f$ is continuous at $0$, we are done by 4.11. Hence, subtracting from $f$ a continuous function that vanishes outside a bounded interval and equals $f(0)$ at $0$, we may also suppose that $f(0) = 0$. The assumptions on $\phi$ can be combined into a single estimate, to wit, $\phi(x) \le c/(1+|x|)^{n+\alpha}$. Therefore, the quantity to be estimated is dominated by a constant multiple of
$$\int_{R^n} |f(x)|\,\frac{\varepsilon^{\alpha}}{(\varepsilon + |x|)^{n+\alpha}}\,dx, \tag{1.2}$$
and it suffices to show that this integral tends to $0$. With the notation of 4.17, let $F(\rho) = \int_{\{|x|\le\rho\}} |f(x)|\,dx$. Our assumptions imply that given $\eta > 0$, there exists $\delta > 0$ such that $F(\rho) < \eta\rho^n$ provided that $\rho \le \delta$. Writing the integral in (1.2) as a sum of integrals $A + B$, say, where $A$ extends over $|x| \le \delta$ and $B$ extends over $|x| > \delta$, with $\phi(\rho) = \varepsilon^{\alpha}/(\varepsilon+\rho)^{n+\alpha}$, with some work, by 4.17, it follows that $A = \int_0^\delta \big(\varepsilon^{\alpha}/(\varepsilon+\rho)^{n+\alpha}\big)\,dF(\rho)$. Integration by parts gives now that $\limsup_{\varepsilon\to 0} A \le c\,\eta$. As for $B$, note that if $|x| > \delta$, then $\varepsilon + |x| > \delta$, so that $B$ does not exceed $\varepsilon^{\alpha}\|f\|_1/\delta^{n+\alpha}$. Hence $\lim_{\varepsilon\to 0} B = 0$. The result now follows since $\eta$ is arbitrary.
4.23 Use Theorem 2.4 in Chapter XII.
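Returning to 4.14, the "simple arithmetic task" of checking the exponents can be carried out in exact rational arithmetic. The sketch below assumes the usual relation $1/r = 1/p + 1/q - 1$ of Young's convolution theorem (the remark does not restate it) and, to keep every exponent finite, that $1 < p, q$ and $1/p + 1/q > 1$; the function names are ours.

```python
from fractions import Fraction


def conjugate(x):
    """Conjugate index x' defined by 1/x + 1/x' = 1 (requires x > 1)."""
    return x / (x - 1)


def young_parameters(p, q):
    """Exponents used in 4.14 for the choice s = q', t = r, u = p'."""
    p, q = Fraction(p), Fraction(q)
    inv_r = 1 / p + 1 / q - 1
    if not (p > 1 and q > 1 and inv_r > 0):
        raise ValueError("this sketch assumes 1 < p, q and 1/p + 1/q > 1")
    r = 1 / inv_r
    s, t, u = conjugate(q), r, conjugate(p)
    r_conj = conjugate(r)
    return {
        "r": r, "s": s, "t": t, "u": u,
        # alpha_1 s = r', alpha_2 u = r'
        "alpha": (r_conj / s, r_conj / u),
        # beta_1 s = p, beta_2 t = p
        "beta": (p / s, p / t),
        # gamma_1 t = q, gamma_2 u = q
        "gamma": (q / t, q / u),
    }
```

For $p = 3/2$ and $q = 2$, say, one can confirm that $1/s + 1/t + 1/u = 1$ and that each of the pairs $\alpha$, $\beta$, $\gamma$ sums to $1$, which is exactly what the application of 4.4 in Chapter XII requires.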
CHAPTER XIV
4.6 Since $M$ is a closed set properly contained in $X$ there is an element $x \in X$ such that $d(x,M) = \eta > 0$. Also, there exists $y \in M$ such that $\|x - y\| < \eta/(1-\varepsilon)$. Put $x_0 = \frac{1}{\|x-y\|}(x - y)$, and verify that $\|x_0\| = 1$ and $\|x_0 - z\| \ge 1 - \varepsilon$ for all $z \in M$. Note that $M$ has to be closed. Indeed, if $J = [0,1]$, $X = C(J)$, and $M$ is the subspace of polynomials on $J$, then $\overline{M} = X$ and the conclusion does not follow. The element $x_0$ is "almost perpendicular" to $M$, and Riesz's lemma does not always assure that a perpendicular element may be found. To see this consider the following setting: Let $J = [0,1]$, let $X$ be the closed subspace of $C(J)$ consisting of those functions $f$ such that $f(0) = 0$, and consider $M = \{f \in X : \int_J f\,dy = 0\}$; $M$ is a proper closed subspace of $X$. Suppose there is an element $g \in X \setminus M$ of norm $1$ such that $\inf_{f\in M}\|g - f\| \ge 1$. If $h \in X \setminus M$ and $a(h) = \int_J g/\int_J h$, then $\int_J (g(x) - a(h)h(x))\,dx = 0$ and $g - a(h)h \in M$. Therefore, we have $\|g - (g - a(h)h)\| = \|a(h)h\| \ge 1$. Let $h_n(x) = x^{1/n}$, $n = 1,2,\ldots$; $h_n \in X \setminus M$ and hence $\|a(h_n)h_n\| \ge 1$. But $a(h_n) = ((n+1)/n)\int_J g$, and since $\|h_n\| = 1$ it follows that $|\int_J g| \ge n/(n+1)$ for each positive integer $n$. This in particular means that $|\int_J g| \ge 1$. Now, since $g(0) = 0$ and $\|g\| = 1$, $|\int_J g| < 1$, which is a contradiction.
4.8 Let $\eta = |\lambda_1| + \cdots + |\lambda_n|$; if $\eta = 0$ there is nothing to prove. Otherwise, dividing both sides of the inequality by $\eta$, it follows that it is equivalent to prove that for some constant $c > 0$, we have
$$\|\mu_1x_1 + \cdots + \mu_nx_n\| \ge c, \quad \text{whenever } \sum_{i=1}^{n} |\mu_i| = 1.$$
Suppose this last statement is false. Then for each integer $k$ there exists a choice of constants $\mu_{1,k},\ldots,\mu_{n,k}$, with $\sum_{i=1}^{n}|\mu_{i,k}| = 1$, such that $\|\mu_{1,k}x_1 + \cdots + \mu_{n,k}x_n\| \le 1/k$. Now, by the Bolzano-Weierstrass Theorem, we can find a subsequence $k_m \to \infty$ with the property that $\lim_{m\to\infty}\mu_{i,k_m} = \mu_i$ exists for $1 \le i \le n$. It now readily follows that not all the $\mu_i$'s are zero, and that $\mu_1x_1 + \cdots + \mu_nx_n = 0$, contradicting the linear independence of the $x_i$'s.
4.10 All the norm properties for the intersection are obvious except for the completeness. If $\{f_n\}$ is Cauchy in $L^p(\mu) \cap L^q(\mu)$, then it is also Cauchy in $L^p(\mu)$ and in $L^q(\mu)$, and if $\lim f_n = f$ (in $L^p(\mu)$), and $\lim f_n = g$ (in $L^q(\mu)$), then $f = g$ $\mu$-a.e., and $\lim f_n = f$ (in $L^p(\mu) \cap L^q(\mu)$). For the sum there are two properties that require some thought: (a) That the norm of $f = 0$ implies $f = 0$ $\mu$-a.e., and, (b) Completeness. To show (a) note that there are sequences $\{g_n\} \subseteq L^p(\mu)$ and $\{h_n\} \subseteq L^q(\mu)$ such that $f = g_n + h_n$ for all $n$, and $\lim\|g_n\|_p = \lim\|h_n\|_q = 0$. Then $g_n + h_n = g_1 + h_1$, and $g_n - g_1 = h_1 - h_n$ converges to $-g_1$ (in $L^p(\mu)$) and to $h_1$ (in $L^q(\mu)$); this implies that $f = g_1 + h_1 = 0$. To show (b) it suffices to check that if $\sum\|f_n\| < \infty$, then $\lim_{N\to\infty}\sum_{n=1}^{N} f_n$ exists in $L^p(\mu) + L^q(\mu)$. To characterize the functionals on these spaces it is helpful to think about the relationship between the norm in $L^p(\mu) \cap L^q(\mu)$ and $\int_{[0,\infty)} \mu(\{|f| > \lambda\})\,d\max(\lambda^p,\lambda^q)$, and the relationship between the norm in $L^p(\mu) + L^q(\mu)$ and $\int_{[0,\infty)} \mu(\{|f| > \lambda\})\,d\min(\lambda^p,\lambda^q)$.
4.11 A Banach space $B$ need not have a Schauder basis. If $B$ is a Banach space over the reals and $\{x_n\}$ is a Schauder basis for $B$, then the at most countable set of finite sums $\sum_n r_nx_n$, where the $r_n$'s are arbitrary rational numbers, is dense in $B$.
4.16 If $x_0 \in X$, $\|x_0\| \le 1$, then also $e^{i\vartheta}x_0 \in X$, $\|e^{i\vartheta}x_0\| = \|x_0\|$ and $L(e^{i\vartheta}x_0) = e^{i\vartheta}Lx_0$.
4.23 To prove the sufficiency observe that if a linear combination $\lambda_1x_1 + \cdots + \lambda_nx_n = 0$, then also $\sum\lambda_kL_0x_k = 0$. For an arbitrary element $x = \sum\lambda_kx_k$ in the span of $Y$ put $Lx = \sum\lambda_kL_0x_k$. Our previous observation implies that, in spite of the possible multiplicity of the representations of $x$ as an element in the span of $Y$, the value of the functional $Lx$ is uniquely determined. The linearity and boundedness of $L$ on the span of $Y$ are readily established. To extend $L$ to all of $X$, invoke the Hahn-Banach Theorem.
4.24 Given intervals $I, J$ of $R^n$ of equal measure, let $f = \chi_I - \chi_J$. Then $f \in D$, and consequently, $\int_I g\,dx = \int_J g\,dx$. By the Lebesgue Differentiation Theorem it readily follows that $g$ coincides with a constant $c$, say, a.e. Further, since $g \in L^q(R^n)$, $c$ must be $0$. Next, suppose that $D$ is not dense in $L^p(R^n)$. Then, by Proposition 3.1, there exist a function $\phi \in L^p(R^n) \setminus D$ and a bounded linear functional $L$ on $L^p(R^n)$ such that $L\phi \ne 0$ and $Lf = 0$ for all $f \in D$. Whence, by Riesz's theorem, there exists $g \in L^q(R^n)$ such that $Lf = \int_{R^n} fg\,dx$ for all $f \in L^p(R^n)$, and by the first part of the argument $g = 0$ a.e. Thus $L$ is the zero functional, a contradiction. As for $L^p(I)$, it is not hard to see that in this case $D$ is a closed proper subspace of $L^p(I)$. For instance, $g(x) = 1$ for all $x \in I$ is a nonzero element of $L^q(I)$ that satisfies all the conditions. As for the case $p = 1$, again $D$ is the null space of a nontrivial bounded functional on $L(R^n)$.
4.25 4.15 is useful here.
4.36 A function $f$ which is BV on $I = [0,1]$ is said to be normalized if $f(0) = 0$ and $f$ is right-continuous at each point $x$ in the open interval $(0,1)$; the collection of normalized BV functions of $I$ is denoted by NBV. The dual space of $C(I)$ can be identified with NBV; 4.26 in Chapter III is relevant here.
4.37 Given an open subset $O$ of $I$ begin by defining the set function $\psi(O) = \sup\{Lf : f \in C(I) \text{ and } 0 \le f \le \chi_K, \text{ where } K \text{ is a compact subset of } O\}$. Next set
$$\mu^*(E) = \inf\{\psi(O) : O \supseteq E,\ O \text{ an open subset of } I\}, \quad \text{all } E \subseteq I.$$
The measure $\mu$ is now constructed by invoking 3.40 in Chapter V.
CHAPTER XV
6.3 Given $\varepsilon > 0$, consider $Y_n = \{y \in Y : \|y\| < 1/n\}$ and $A_n = \{x \in X : |L(x,y)| \le \varepsilon \text{ for all } y \in Y_n\}$. Since $L$ is continuous in $x$, each $A_n$ is closed, and since $L$ is continuous in $y$, $X = \bigcup_n A_n$. By the Baire Category Theorem we get that $|L(x,y)| \le 2\varepsilon$ if $x \in O$ and $y \in Y_m$, where $O$ is a neighbourhood of the origin in $X$, and $m$ is some integer.
6.5 A Hamel basis will get the job done.
6.11 Let $\eta = \inf_n \|T^n\|^{1/n}$; we begin by showing that actually $\eta = \lim_{n\to\infty}\|T^n\|^{1/n}$. Given $\varepsilon > 0$, choose $m$ such that $\|T^m\|^{1/m} < \eta + \varepsilon$. Also let $M = \max\{1, \|T\|, \ldots, \|T^{m-1}\|\}$. Now consider any integer $n$ and write it in the form $n = k_nm + l_n$, where $k_n$ is a nonnegative integer and $0 \le l_n \le m - 1$. Then, since $\|T^n\| \le \|T\|^n$, it follows that $\|T^n\|^{1/n} \le \big(\|T^{l_n}\|\,\|T^m\|^{k_n}\big)^{1/n} \le M^{1/n}\|T^m\|^{k_n/n} < M^{1/n}(\eta+\varepsilon)^{(n-l_n)/n}$. Furthermore, since $\lim_{n\to\infty} M^{1/n}(\eta+\varepsilon)^{(n-l_n)/n} = \eta + \varepsilon$, there exists an integer $N_\varepsilon$ such that for all $n \ge N_\varepsilon$, we have $M^{1/n}(\eta+\varepsilon)^{(n-l_n)/n} < \eta + 2\varepsilon$. Therefore, for those values of $n$ we have $\eta \le \|T^n\|^{1/n} \le \eta + 2\varepsilon$, and consequently, as asserted, $\lim_{n\to\infty}\|T^n\|^{1/n}$ exists and it equals $\eta$. We now deduce that the series in question converges or diverges by applying the Cauchy test for the convergence of series to $\sum_{n=0}^{\infty}\|T^n\|$.
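The convergence of $\|T^n\|^{1/n}$ to its infimum asserted in 6.11 can be watched on a small matrix. The sketch below is an illustration under assumptions of our own: $T$ acts on $R^2$, the operator norm is the maximum absolute row sum (induced by the sup norm on $R^2$, hence submultiplicative), and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_sum_norm(a):
    """Operator norm induced by the sup norm: the maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def root_norms(t, n_max):
    """Return the list of ||T^n||^(1/n) for n = 1, ..., n_max."""
    power = t
    out = []
    for n in range(1, n_max + 1):
        out.append(row_sum_norm(power) ** (1.0 / n))
        power = mat_mul(power, t)
    return out
```

For the Jordan block with rows $(1/2, 1)$ and $(0, 1/2)$ one has $\|T^n\| = 2^{-n}(1 + 2n)$ in this norm, so the values returned by `root_norms` stay above $1/2$ and approach it slowly, even though $\|T\| = 3/2$.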
6.14 Let $x \in \overline{M}$, and suppose $\{x_n\}$ is a sequence of elements of $M$ that converges to $x$. Consider now the sequence $\{T_0x_n\} \subset Y$; since $\|T_0x_n - T_0x_m\| \le \|T_0\|\,\|x_n - x_m\|$, this is a Cauchy sequence, and since $Y$ is complete, it converges to a limit that is independent of the choice of sequences of elements of $M$ that converge to $x$. Now set $Tx = \lim_{n\to\infty} T_0x_n$, where $\{x_n\}$ is any sequence of elements of $M$ that converges to $x$. Obviously $T$ is linear, and, taking limits in the inequality $\|T_0x_n\| \le \|T_0\|\,\|x_n\|$, we find that $\|Tx\| \le \|T_0\|\,\|x\|$; that is, $T$ is bounded with norm not exceeding $\|T_0\|$. The opposite inequality, as well as the uniqueness of $T$, are not hard to establish.
6.20 We construct a sequence $\{T_n\}$ of bounded linear operators on $X$ which satisfies $\|T_nx\| \le k_x$ for $n = 1,2,\ldots$, and yet for no constant $c$ is it true that $\|T_n\| \le c$ for all $n$. We begin by writing polynomials as $x(t) = \sum_{n=0}^{\infty} a_nt^n$, where to each $x$ there corresponds an integer $N_x$ so that $a_n = 0$ for all $n \ge N_x$. As the sequence $T_n$ we take the sequence of functionals $L_n$ given by $L_n0 = 0$ and, for $x \ne 0$, $L_nx = a_0 + \cdots + a_{n-1}$. Clearly each $L_n$ is linear, and, by the choice of norms, also bounded, with norm less than or equal to $n$. Now, for each $x$ we have $|L_nx| \le (N_x+1)\max$ [...]
6.24 [...] $\{x \in X : \|x\|$ [...] $n_1$. After $x_{n_1},\ldots,x_{n_k}$ have been chosen, with $n_k > \cdots > n_1$, choose $x_{n_{k+1}}$ satisfying $\|x - \sum_{i=1}^{k+1} 2^{-i+1}x_{n_i}\| < 2^{-(k+1)}$, and $n_{k+1} > n_k$. Choose now a sequence $(t_n) \in \ell^1$ as follows: Put $t_n = 0$ if $n \ne n_k$ for all $k$, and $t_n = 2^{-k+1}$ if $n = n_k$; clearly $T((t_n)) = x$. If $N$ is the null space of $T$, the linear map $T$ induces an isomorphism from $\ell^1/N$ onto $B$.
6.29 The proof follows essentially along the lines of that of the Open Mapping Theorem. In the notation of that theorem, we introduce the sequence $\{x_n\}$, and note that if $z_k = \sum_{n=1}^{k} x_n$, then $\lim_{k\to\infty} z_k = x$ exists, and, by (4.7), $\lim_{k\to\infty} Tz_k = y$. Since $T$ is closed, $x$ is in the domain of $T$, and $Tx = y$.
6.33 Suppose that $g, f_n$ are $L^p(I)$ functions, $n = 1,2,\ldots$, such that $\lim_{n\to\infty}\|f_n\|_p = 0$ and $\lim_{n\to\infty}\|Tf_n - g\|_p = 0$. Replacing these sequences with subsequences if necessary, by 4.12 in Chapter XII, we may assume that also $\lim_n f_n = 0$ a.e. and that the $f_n$'s are dominated in such a way that the LDCT applies. Thus, by the LDCT, for almost every $x \in I$ we have $\lim_{n\to\infty} Tf_n(x) = 0$, which in turn implies that $g = 0$ a.e. The result now follows by invoking the Closed Graph Theorem.
6.37 (iii) implies (i) requires no proof. As for (i) implies (ii), let $T^*$ be a closed linear extension of $T$. If $y \in Y \setminus \{0\}$, then $(0,y) \notin G(T^*) \supseteq G(T)$. Moreover, since $G(T^*)$ is closed in $X \times Y$, $(0,y) \notin \overline{G(T)}$. Finally, we consider (ii) implies (iii): Define $T^*$ as the linear operator whose graph is precisely $\overline{G(T)}$, i.e., $D(T^*) = \{x \in X : (x,z) \in \overline{G(T)} \text{ for some } z \in Y\}$. It is, then, not hard to check that $T^*$ is a well-defined closed linear, minimal, extension of $T$.
CHAPTER XVI
5.21 The first element is $x_1 = (1/\|y_1\|)y_1$. Now, $y_2 = (y_2,x_1)x_1 + z_2$, where $z_2$ is not the zero vector since the $y_n$'s are linearly independent. Also $z_2 \perp x_1$, and so we can take $x_2 = (1/\|z_2\|)z_2$. Having chosen orthonormal vectors $x_1,\ldots,x_{n-1}$, note that the vector $z_n = y_n - \sum_{k=1}^{n-1}(y_n,x_k)x_k$ is nonzero and orthogonal to $x_1,\ldots,x_{n-1}$. Pick now $x_n = (1/\|z_n\|)z_n$.
5.26 Since the sum $\sum_{m=1}^{\infty}\lambda_m(x_m - y_m)$ converges in $X$ when the sum $\sum_{m=1}^{\infty}\lambda_mx_m$ converges, the mapping $T$ given by $T(\sum_m \lambda_mx_m) = \sum_m \lambda_m(x_m - y_m)$ is well-defined, has norm $\|T\| \le \lambda < 1$, and consequently, $T - I$ is invertible.
5.27 The sum in question is equal to $\big(\chi_{[a,x]}, \sum_n(\chi_{[a,x]},\phi_n)\phi_n\big)$. The necessity follows then from Theorem 3.5. As for the sufficiency, note that the above identity, together with the case of equality in the Cauchy-Schwarz inequality, implies that $\chi_{[a,x]} = \sum_n(\chi_{[a,x]},\phi_n)\phi_n$, where the equality is in the sense of $L^2(I)$. Pass now to the limit.
5.28 Let $\eta > 0$ be chosen so that $1 - M^2\eta > 0$. By Egorov's theorem there is a subset $G$ of $I$ such that $|I \setminus G| < \eta$ and $\varepsilon_n = \sup_{x\in G}|\lambda_n\phi_n(x)| \to 0$ as $n \to \infty$. Thus $\lambda_n^2 = \int_I (\lambda_n\phi_n(x))^2\,dx = \big(\int_G + \int_{I\setminus G}\big)(\lambda_n\phi_n(x))^2\,dx \le \varepsilon_n^2 + M^2\eta\lambda_n^2$, and consequently, $(1 - M^2\eta)\lambda_n^2 \le \varepsilon_n^2 \to 0$ as $n \to \infty$.
5.32 Since by the Riesz-Fischer theorem $\ell^2$ embeds into every infinite-dimensional Hilbert space, a copy of the collection of sequences of the form $\{\lambda, \lambda^2, \lambda^3, \ldots\}$, with $0 < \lambda < 1$, sits inside every infinite-dimensional Hilbert space. Since this collection has cardinality $c$, it only remains to verify that its members are linearly independent.
5.35 Put $T = I + U$, and note that $U^*f(x) = 2\int_{[x,1]} e^{(t-x)}f(t)\,dt$. Using this observation it follows that if $f$ is orthogonal to $\phi$, then $U^*Uf = -Uf - U^*f$, and consequently, $T^*Tf = (I + U + U^* + U^*U)f = f$. Whence, $(Tf,\psi) = (Tf,T\phi) = (T^*Tf,\phi) = (f,\phi) = 0$, and $\|Tf\|_2^2 = \|f\|_2^2$, and (b) holds. As for (c), note that if $g$ is orthogonal to $\psi$ it follows that $UU^*g = -Ug - U^*g$, and consequently in this case we have $TT^*g = g$. Setting $f = T^*g$ we get that $f$ is orthogonal to $\phi$ and $g = Tf$. Thus $T$ is isometric as a map from $\{\phi\}^\perp$ onto $\{\psi\}^\perp$, and $T$ is bijective. Indeed, if $f$ is orthogonal to $\phi$ and $T(f + \lambda\phi) = Tf + \lambda\psi = 0$, then $Tf = 0$, $\lambda = 0$, and since $\|f\|_2 = \|Tf\|_2 = 0$, also $f + \lambda\phi = 0$. Given $g \in L^2(I)$, write it as $g = (g - c\psi) + c\psi$ with $c = (g,\psi)/(\psi,\psi)$; setting $f = T^*(g - c\psi) + c\phi$ we have $Tf = g$, and further manipulations give that $f = T^*g - 2(g,\psi)\phi$. This last relation may be written $f(x) = g(x) - 2\int_{[0,x]} e^{-(x-t)}g(t)\,dt$, which essentially corresponds to the inversion formula of $g(x) = f(x) + 2\int_{[0,x]} e^{(x-t)}f(t)\,dt$.
5.39 The Closed Graph Theorem does the job.
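Returning to 5.35, the pair of formulas $g(x) = f(x) + 2\int_{[0,x]} e^{(x-t)}f(t)\,dt$ and $f(x) = g(x) - 2\int_{[0,x]} e^{-(x-t)}g(t)\,dt$ can be sanity-checked numerically. The sketch below assumes $I = [0,1]$ and discretizes both integrals with the trapezoid rule on a uniform grid; the discretization and the function names are ours, so agreement is only expected up to the quadrature error.

```python
import math


def cumulative_trapezoid(ys, xs):
    """Running trapezoid-rule integral of the samples ys over the grid xs."""
    out = [0.0]
    for i in range(1, len(xs)):
        out.append(out[-1] + 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1]))
    return out


def forward_transform(f_vals, xs):
    """g(x) = f(x) + 2 * integral over [0, x] of e^(x - t) f(t) dt."""
    inner = cumulative_trapezoid(
        [math.exp(-t) * v for t, v in zip(xs, f_vals)], xs)
    return [v + 2.0 * math.exp(x) * c for x, v, c in zip(xs, f_vals, inner)]


def inverse_transform(g_vals, xs):
    """f(x) = g(x) - 2 * integral over [0, x] of e^(-(x - t)) g(t) dt."""
    inner = cumulative_trapezoid(
        [math.exp(t) * v for t, v in zip(xs, g_vals)], xs)
    return [v - 2.0 * math.exp(-x) * c for x, v, c in zip(xs, g_vals, inner)]
```

For $f \equiv 1$ the first formula gives $g(x) = 2e^x - 1$ in closed form, and applying `inverse_transform` to the output of `forward_transform` should return the original samples to within the accuracy of the grid.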
5.41 Since for $m \ge n$ we have that $0 \le T_m - T_n \le I$, and hence $\|T_m - T_n\| \le 1$, a computation using the generalized Cauchy-Schwarz inequality gives an estimate for $\|(T_m - T_n)x\|$ in terms of $(T_mx,x) - (T_nx,x)$. But since the sequence $\{(T_nx,x)\}$ is bounded and nondecreasing, and hence convergent, this inequality implies that $\lim_{n\to\infty} T_nx = Tx$, say, exists for all $x \in X$. This mapping $T$ is obviously linear and selfadjoint.
5.44 The estimate $\|f\|_2 \le \|f\|_\infty$ holds for every $f \in L^\infty(I)$. Thus, if $f$ belongs to the $C(I)$ closure of $X$, it also belongs to the $L^2(I)$ closure of $X$, and since $X$ is closed in $L^2(I)$, it follows that $f \in X$. That $X$ is closed under these norms implies, in particular, that the restrictions of these norms turn $X$ into a Banach space, and that the identity map from $X$ normed by the $C(I)$ norm onto $X$ normed by the $L^2(I)$ norm is continuous; the existence of $c$ follows now from Corollary 4.3 in Chapter XV. As for the dimension, let $\{f_n\}_{n=1}^{N}$ be an ONS of $X$, and note that by the Pythagorean theorem it follows that $\|\sum_{n=1}^{N}\lambda_nf_n\|_\infty \le c\,\|\sum_{n=1}^{N}\lambda_nf_n\|_2 \le c$, for any choice of scalars $\lambda_1,\ldots,\lambda_N$ such that $\sum_{n=1}^{N}\lambda_n^2 = 1$. Thus, for almost every $x \in I$ we have $|\sum_{n=1}^{N}\lambda_nf_n(x)| \le c$, and, taking the sup over the $\{\lambda_n\}$'s, we get $\big(\sum_{n=1}^{N} f_n(x)^2\big)^{1/2} \le c$. Squaring this inequality and integrating over $I$ it readily follows that $N = \sum_{n=1}^{N}\|f_n\|_2^2 \le c^2$.
5.46 We may assume that $x = 0$. Let $n_1 = 1$, and choose $n_2$ so that $|(x_{n_1},x_{n_2})| \le 1$; this choice is possible since the $x_n$'s converge weakly to $0$. In general, having chosen $n_1 < \cdots < n_k$ and $x_{n_1},\ldots,x_{n_k}$, choose $n_{k+1}$ and $x_{n_{k+1}}$ such that $|(x_{n_1},x_{n_{k+1}})|,\ldots,|(x_{n_k},x_{n_{k+1}})| \le 1/k$. Furthermore, since the norms $\|x_n\|$ are uniformly bounded by $c$, say, it follows that $\|(x_{n_1} + \cdots + x_{n_k})/k\|^2 \le (c^2 + 2)/k \to 0$ as $k \to \infty$.
5.50 Assume that the sequence $\{x_n\}$ converges weakly to $x$ in $X$; we must show that $\|T^*x_n - T^*x\| \to 0$ as $n \to \infty$. Now, on account of the continuity of $T^*$ we know that the sequence $\{T^*x_n\}$ converges weakly to $T^*x$, so, by Proposition 4.2, it suffices to verify that $\lim_{n\to\infty}\|T^*x_n\| = \|T^*x\|$. Write $\|T^*x_n\|^2 = (T^*x_n,T^*x_n) = (x_n,T(T^*x_n - T^*x)) + (x_n,TT^*x) = I + J$, say. Since the $x_n$'s converge weakly, the sequence $\|x_n\|$ is bounded, and since $\{T^*x_n - T^*x\}$ converges weakly to $0$ and $T$ is compact, $\{T(T^*x_n - T^*x)\}$ converges in norm to $0$. Whence $\lim_{n\to\infty} I = 0$ and $\lim_{n\to\infty} J = (x,TT^*x) = \|T^*x\|^2$.
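Returning to the Gram-Schmidt construction of 5.21, the recursion $z_n = y_n - \sum_{k<n}(y_n,x_k)x_k$, $x_n = z_n/\|z_n\|$ can be sketched in a few lines. The version below is an illustration under our own assumptions: the vectors live in $R^d$ with the usual dot product, the input is a finite list, and the function names are hypothetical.

```python
import math


def inner(u, v):
    """Usual dot product on R^d."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(ys):
    """Orthonormalize linearly independent vectors y_1, ..., y_N as in 5.21."""
    xs = []
    for y in ys:
        z = list(y)
        for x in xs:
            # Subtract the component of y_n along the earlier vector x_k.
            coeff = inner(y, x)
            z = [zi - coeff * xi for zi, xi in zip(z, x)]
        norm = math.sqrt(inner(z, z))
        if norm == 0.0:
            # Crude test; in floating point a small threshold would be safer.
            raise ValueError("input vectors must be linearly independent")
        xs.append([zi / norm for zi in z])
    return xs
```

Applied to $(1,1,0)$, $(1,0,1)$, $(0,1,1)$, the output vectors have mutual inner products equal to $0$ and norms equal to $1$, up to rounding.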
Index Abel's transformation, 380 AC function, 144, 171, 180 Banach-Zarecki theorem, '174 Algebra of sets, 45 Approximate identity, 252 norm convergence, 252 pointwise convergence, 254, 255, 256, 263, 264 Approximate unit, 268 Axiom of Choice, 18, 25 Baire Category Theorem, 297 Banach space, 269 quotient, 293 Schauder basis, 294 Beppo Levi's theorem, 105 Borel measure, 149 atom, 160 atomic or discrete, 160 continuous, 160 differentiable at x, 202 distribution function, 153 regular, 149 support, 161 weak convergence, 163 BV function, 28 Jordan decomposition, 31 Lebesgue decomposition, 177 negative variation, 42 norm, 270 positive variation, 42 variation, 29
C(I), 13 dual of, 285, 296 norm, 268
$C_0(R^n)$, 33
C~(I), 146 norm, 268 Cr(R n ),146 Cantor's diagonal process, 14 Cantor set, 73, 76 Cardinal number, 8 addition, 9 No, 8 Nl,23 comparison, 8, 21 exponentiation, 9 multiplication, 9 Cauchy-Schwarz inequality, 143, 318 Cesaro summability, 363 Chebychev's inequality, 115 Closed Graph Theorem, 310 Compact mapping, 340 eigenvalue, 343 eigenvector, 343 integral, 342 spectral decomposition, 347 Conditional expectation, 206 Content, 38 Continuum Hypothesis, 23 Convolution, 246 Young's theorem, 263 Covering in the sense of Vitali, 165 Covering in the sense of Wiener, 139 Curve, 27 length, 27 rectifiable, 28
Dini numbers, 167 Distribution function, 61, 153 discrete, 160 Levy distance, 163 weak convergence, 162
Essential infimum, 236 Essential supremum, 216 Fatou's Lemma, 113, 117, 126 Fejer kernel, 263, 373 First category, 297 Fourier series, 337, 358 Cesaro summability, 363 Dirichlet kernel, 359 Fejer kernel, 373 Fejer's theorem, 366 Fourier coefficients, 333, 357 Lebesgue constants, 360 Riemann-Lebesgue theorem, 358 M. Riesz theorem, 368 symmetric partial sums, 358 Fubini's lemma, 179 Fubini's theorem, 242, 261 property F, 238 Fubini-Tonelli's theorem, 265 Function, 3 absolutely continuous or AC, 144, 171 BV,28 Cantor-Lebesgue, 156 characteristic, 4 composition, 3 Dirichlet, 33 distribution, 61, 153 domain, 3 extension, 4 identity, 4 inverse, 4 Lipschitz, 41 measurable, 79 nonincreasing equimeasurable rearrangement, 99 one-to-one or injective, 4 onto or surjective, 4 range, 3 restriction, 4 Riemann integrable, 33 Riemann-Stieltjes integrable, 32 saltus, 41 section, 258 simple, 88, 105 singular, 176 step, 28 truncation, 98
Functional, 218, 267 bounded, 218, 271, 294 continuous, 218, 278 linear, 218, 271 M.M.Day's theorem, 219 Minkowski, 272 norm, 220, 267 seminorm, 267 Gauss-Weierstrass kernel, 263 Hahn-Banach Theorem, 273, 279 complex version, 276 Hamel basis, 20, 25, 76, 278, 353 Hardy-Littlewood maximal theorem, 139, 146 in LP(R"), 235 Hilbert space, 319 completely continuous or compact mapping, 340 direct sum, 328 existence of the minimizing element, 321 geometry of subspaces, 326 Gram-Schmidt process, 353 Lax-Milgram lemma, 352 ONS, 332 orthogonal complement, 322, 328 parallelogram law, 320 projections, 326 Pythagorean theorem, 320 reflexivity, 325 F. Riesz theorem, 314 weak convergence, 338 Holder's inequality, 212, 233 Inner product space, 319 Cauchy-Schwarz inequality, 318 Hermitian form, 352 Hilbert, 319 pre-Hilbert, 319 quadratic form, 354 Integral, 105, 108, 114, 124, 288 absolute continuity, 144 Beppo Levi's theorem, 109 change of variable, 44, 162, 181 Chebychev's inequality, 115 Fatou's lemma, 113, 117, 126 improper Riemann, 122
integration by parts, 37, 181 LDCT, 120 lower Riemann-Stieltjes, 32 MCT, 110 Riemann, 33 upper Riemann-Stieltjes, 32 Vitali-Carathéodory theorem, 130
L(It), 114 complete metric space, 131 complex-valued, 130 dense subset, 135 uniform integrability, 144 L(R"), 133 Co(R") deD.sity, 133 Calder6n-Zygmund decomposition, 148 continuity of translates, 135 Hardy-Littlewood maximal function, 137 Hardy-Littlewood maximal theorem, 139 Lebesgue Differentiation Theorem, 141 Lebesgue point, 147 Lebesgue set, 147 LO(It), 215
LP(It), 209 approximate unit, 264 conjugate indices, 211 converse Holder's inequality, 222 dense class, 215 F. Riesz-Fischer duality theorem, 214 functional, 218 Holder's inequality, 212, 233 Minkowski's inequality, 213 Minkowski's integral inequality, 264 separability, 234 Vitali's convergence theorem, 234 weak convergence, 231 LP(R"), 215 approximate identity, 255 C~(R") density, 254 continuity of translates, 215 dense classes, 215 Hardy-Littlewood maximal theorem, 235 separability, 234
L OO (It),216 complete metric space, 217 dual of, 292 Lebesgue Differentiation Theorem, 141 LDCT,120 Lebesgue measure, 71 Caratheodory's theorem, 72 outer, 65 regularity, 77 Lebesgue's theorem, 167 Linear space, 267 Banach,269 conjugate or dual, 271 convex subset, 272 Minkowski functional, 272 normed,267 open mapping, 306 Marcinkiewicz function, 262 Marcinkiewicz weak-L 1 class, 139 MCT,110 Measurable functions, 79 almost uniform convergence, 95 bounded in probability, 100 complex-valued, 104 convergence It-a.e., 92, 101, 103 convergence in It-measure, 94, 101, 129 convergence in probability, 94, 101, 128 Egorov's theorem, 97, 103, 129 Lusin's theorem, 90, 103 Measure space, 50 It-almost everywhere, 81 Measure, 50 Borel, 149 Borel-Cantelli Lemma, 55 Caratheodory's extension, 78 complete, 56 counting, 52 Dirac, 52 distribution function, 61 doubling, 178 extension, 56 finite, 50 Lebesgue, 71 Lebesgue-Stieltjes, 156 null set, 56 probability, 51
product, 260 restriction, 56 semifinite, 62 O'-finite, 51 signed, 184 Minkowski's inequality, 213 Normed linear space, 269 B(X, Y), 300 bounded functional, 271 bounded mapping, 300 BV,271 en, 268 closed mapping, 309 conjugate or dual, 271 continuous mapping, 300 linear mapping, 300 quotient, 293 . Rn, 268 weak convergence, 296 Nowhere dense, 297 ONS,332 basis, 337 Bessel's inequality, 333 Fourier coefficients, 333 maximal or complete, 335 Parseval's identity, 335 Plancherel's equality, 335 Open Mapping Theorem, 306 Order type, 16 .", 16 A, 16 multiplication, 16 n, 16 w, 16
$\omega^*$, 16
sum, 16 Order, 16 chain, 19 equivalent, 16 first element, 17 last element, 17 lexicographic, 17 lower bound, 17 maximal element, 17 minimal element, 17 partial, 15 total, 16
type, 16 upper bound, 17 well-ordering, 17 Ordinal number, 17 Burali-Forti paradox, 26 comparison, 22 fl,23 Partition, 27 lower sum, 32 norm, 28 upper sum, 32 Poisson kernel, 263 Principle of the Condensation of Singularities, 305 Principle of transfinite induction, 24 Relation, 4 equivalence, 4 equivalence class, 4 partial order, 15 Resonance or Banach-Steinhaus Theorem, 304 F. Riesz Rising Sun Lemma, 40 F. Riesz Representation Theorem, 296 Second category, 297 Sets, 1 at most countable, 5 Cantor, 73, 76 Cartesian product, 3 complement, 2 countable, 5 empty, 2 equivalent,S finite,S FiT, 67 G6,67
infinite,S intersection, 2 Lebesgue measurable, 68 lim, 46 liminf, 46 lim sup, 46 null, 56 point of density, 147 point of dispersion, 147 Russell's Paradox, 14 SchrOder-Bernstein theorem, 7
section, 241, 257 subset, 2 uncountable, 5 union, 2 universal, 1 well-defined density, 77 Set function, 49 additive, 49 bounded, 59 Jordan decomposition, 60 negative variation, 59 positive variation, 59 $\sigma$-additive or measure, 50 total variation, 59 $\sigma$-algebra, 46 Borel, 48 generated by C, 48 measurable sets, 50 monotone classes, 60 product, 48, 61 Signed measure, 183 absolutely continuous, 184 complex-valued, 207 differentiable at $x$, 202 Hahn decomposition theorem, 191, 386
Lebesgue decomposition theorem, 199
negative part, 191 negative set, 204 null set, 204 positive part, 191 positive set, 204 Radon-Nikodym Theorem, 200, 201 singular, 194 Tonelli's theorem, 244 Trigonometric polynomial, 357 Trigonometric series, 357 Fourier series, 358 Uniform Boundedness Principle, 302 Vitali's covering lemma, 165, 178 doubling measures, 178 Weak convergence, 229, 296 in Hilbert space, 338 in V(I'), 229
Schur's theorem, 229 weak limits, 296 Weierstrass theorem, 367 Well-ordering, 17 initial segment, 21 ordinal number, 17 transfinite induction, 24 Zermelo's theorem, 18, 24, 25 Young's inequality, 212 Young's convolution theorem, 263 Zorn's Lemma, 19, 25