JEWGENI H. DSHALALOW
Real Analysis An Introduction to the Theory of Real Functions and Integration
CHAPMAN & HALL/CRC
Studies in Advanced Mathematics
Series Editor
STEVEN G. KRANTZ, Washington University in St. Louis
Editorial Board
R. Michael Beals, Rutgers University
Dennis de Turck, University of Pennsylvania
Ronald DeVore, University of South Carolina
Lawrence C. Evans, University of California at Berkeley
Gerald B. Folland, University of Washington
William Helton, University of California at San Diego
Norberto Salinas, University of Kansas
Michael E. Taylor, University of North Carolina
Titles Included in the Series
Steven R. Bell, The Cauchy Transform, Potential Theory, and Conformal Mapping
John J. Benedetto, Harmonic Analysis and Applications
John J. Benedetto and Michael W. Frazier, Wavelets: Mathematics and Applications
Albert Boggess, CR Manifolds and the Tangential Cauchy-Riemann Complex
Goong Chen and Jianxin Zhou, Vibration and Damping in Distributed Systems, Vol. 1: Analysis, Estimation, Attenuation, and Design. Vol. 2: WKB and Wave Methods, Visualization, and Experimentation
Carl C. Cowen and Barbara D. MacCluer, Composition Operators on Spaces of Analytic Functions
John P. D'Angelo, Several Complex Variables and the Geometry of Real Hypersurfaces
Lawrence C. Evans and Ronald F. Gariepy, Measure Theory and Fine Properties of Functions
Gerald B. Folland, A Course in Abstract Harmonic Analysis
José García-Cuerva, Eugenio Hernández, Fernando Soria, and José-Luis Torrea, Fourier Analysis and Partial Differential Equations
Peter B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah-Singer Index Theorem, 2nd Edition
Alfred Gray, Modern Differential Geometry of Curves and Surfaces with Mathematica, 2nd Edition
Eugenio Hernández and Guido Weiss, A First Course on Wavelets
Steven G. Krantz, Partial Differential Equations and Complex Analysis
Steven G. Krantz, Real Analysis and Foundations
Kenneth L. Kuttler, Modern Analysis
Michael Pedersen, Functional Analysis in Applied Mathematics and Engineering
Clark Robinson, Dynamical Systems: Stability, Symbolic Dynamics, and Chaos, 2nd Edition
John Ryan, Clifford Algebras in Analysis and Related Topics
Xavier Saint Raymond, Elementary Introduction to the Theory of Pseudodifferential Operators
Robert Strichartz, A Guide to Distribution Theory and Fourier Transforms
André Unterberger and Harald Upmeier, Pseudodifferential Analysis on Symmetric Cones
James S. Walker, Fast Fourier Transforms, 2nd Edition
James S. Walker, Primer on Wavelets and their Scientific Applications
Gilbert G. Walter, Wavelets and Other Orthogonal Systems with Applications
Kehe Zhu, An Introduction to Operator Algebras
CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.
Library of Congress Cataloging-in-Publication Data
Dshalalow, Jewgeni H. Real analysis : an introduction to the theory of real functions and integration / Jewgeni H. Dshalalow. p. cm. -- (Studies in advanced mathematics) Includes bibliographical references and index. ISBN 1-58488-073-2 (alk. paper) 1. Mathematical analysis. I. Title. II. Series. 2. Biology-molecular. I. McLachlan, Alan. II. Title. QA300 .D74 2000 515--dc21
00-058593
CIP
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher. The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying. Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.
© 2001 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 1-58488-073-2
Library of Congress Card Number 00-058593
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
To my Lord and Redeemer Who made the supreme sacrifice for me and Who will come again
Preface
This book is intended to be an introductory two-semester course in abstract analysis, which includes topology, measure theory, and integration, an assemblage of topics traditionally gathered under the name "Real Analysis," a label more common in the United States. Most North American schools offer this as a graduate one- to two-semester course for mathematics, physics, and engineering majors. Many European schools, to the best of my knowledge, do not have such a course; they have instead a sequence of separate courses such as Topology, Measure and Integration, and Functional Analysis. In some countries, such as Russia and former Soviet Republics, they additionally have a Real Variables course, which is somewhat similar to Real Analysis but is more specialized, and its profile and rigor vary from college to college. A very good reason for learning real analysis is that not only is it a core course for all mathematical disciplines, but it is absolutely mandatory for statistics and probability, operations research, physics, and some engineering majors as well. Hence, rephrasing an old adage, all routes of science and technology go through real analysis. This text predominantly targets first year graduate students of mathematical science majors as well as first and second year graduate students of engineering, physics, and operations research majors. A stronger senior undergraduate mathematics student can also benefit from the course. Some less theoretically oriented programs or those with weaker mathematics course curricula may find it reasonable to use the book for a three-semester course, with the first two semesters of basics and the third semester of advanced topics. The course can always be shortened to two semesters in such schools with the option to cover the first seven chapters, which are also quite sufficient for technical majors. This book is intended primarily as a textbook and its purpose as a reference is secondary.
The reason for such a claim is a rather thorough elaboration of major theorems, notions, and constructions, very often supplied with a blueprint and sometimes a less formal introduction. The latter are then succeeded by detailed treatments. For instance, the Radon-Nikodym Theorem is first introduced in Chapter 6, with a minimum of proofs and formalities, but with a number of examples and exercises. It is then followed by a more abstract version later, in Chapter 8.
The first three chapters of the book (Part I) include preliminaries on set theory and basics of metric spaces and topology. I have been using these three chapters for many years, teaching a bilevel topology course at Florida Tech during our quarter system. However, I would not be able to cover the present version of the three chapters in one quarter, and one semester would be a more appropriate term for the current program at our school. Hence, the first three chapters can easily serve as a separate one quarter to one semester topology course for senior undergraduate or beginning graduate students. Chapters 4-7 (Part II) present basics of measure and integration and, again, they can be offered as a separate measure theory (and integration) course. Consequently, Parts I and II can become appealing to those programs with separate named courses and, in particular, to European students. Part III (Chapters 8 and 9) includes a more elaborate and abstract version of measure and integration, along with their applications to functional analysis ($L^p$ spaces and the Riesz Representation Theorem for locally compact Hausdorff spaces), probability theory (conditional expectation, uniform integrability, Lebesgue-Stieltjes integrals, decomposition of distribution functions, stochastic convergence, and convergence of Radon measures), and conventional analysis on the real line (monotone and absolutely continuous functions, functions of bounded variation, and major theorems of calculus). Part III can be utilized for advanced topics, as well as an enlarged variant of measure and integration. While the reader would be better off having studied Part I prior to Part II and the first six sections of Chapter 8, the latter can also be used as independent material with sufficient basics of topology drawn from any generic advanced analysis course.
The book can also be used as a reference source for researchers in mathematical and engineering sciences, and especially operations research (such as applied stochastic processes, queueing theory, and reliability). The reader should understand, however, that the book is not intended to become an encyclopedia of mathematics or to be any kind of a broad reference. I had to suppress my temptation to include some written chapters on Hilbert spaces, functional analysis, and Fourier transforms, because of my aim to compile the main topics of what constitutes real analysis and to design a text by spending more time on details (within the framework of the book size imposed by the publisher and buyers' affordability). This text may be well suited for independent studies with or without instructors, for which an abundance of examples and over 600 exercises provide pertinent support. While a solution manual is in preparation and will become available soon (and it would be an additional studying aid), the publisher and I have agreed on providing this manual only to university instructors upon adoption of the book for the course. The reader may also find the new terms subsections (at the end of each section) useful, especially considering the plethora of new definitions and notations, which not only can be intimidating but can also create an additional memory burden and thereby slow down learning of the main concepts.
Most of my thanks are due to my wife Irina for her ample support, encouragement, and overwhelming sacrifice. I would like to express my deep appreciation to Mr. Jürgen Becker, for his constant guidance and countless ideas, Mr. Donald Konwinski for his enormous editorial work on earlier versions of my manuscript, Professors Gerald B. Folland and Ryszard Syski for their numerous and very constructive remarks, as well as the kind assistance of Professors S.G. Deo, Jean-B. Lassere, Jordan Stoyanov, Mr. Gary Russell, the project editor, Mr. David Alliot, and anonymous reviewers who thoroughly read my manuscript and made many helpful suggestions. My thanks are also due to the publisher, Mr. Robert Stern, for his help and extreme patience.
Jewgeni H. Dshalalow
Melbourne, Florida
Contents
Preface

Part I. An Introduction to General Topology

Chapter 1. Set-Theoretic and Algebraic Preliminaries
1. Sets and Basic Notation
2. Functions
3. Set Operations under Maps
4. Relations and Well-Ordering Principle
5. Cartesian Product
6. Cardinality
7. Basic Algebraic Structures

Chapter 2. Analysis of Metric Spaces
1. Definitions and Notations
2. The Structure of Metric Spaces
3. Convergence in Metric Spaces
4. Continuous Mappings in Metric Spaces
5. Complete Metric Spaces
6. Compactness
7. Linear and Normed Linear Spaces

Chapter 3. Elements of Point Set Topology
1. Topological Spaces
2. Bases and Subbases for Topological Spaces
3. Convergence of Sequences in Topological Spaces and Countability
4. Continuity in Topological Spaces
5. Product Topology
6. Notes on Subspaces and Compactness
7. Function Spaces and Ascoli's Theorem
8. Stone-Weierstrass Approximation Theorem
9. Filter and Net Convergence
10. Separation
11. Functions on Locally Compact Spaces

Part II. Basics of Measure and Integration

Chapter 4. Measurable Spaces and Measurable Functions
1. Systems of Sets
2. System's Generators
3. Measurable Functions

Chapter 5. Measures
1. Set Functions
2. Extension of Set Functions to a Measure
3. Lebesgue and Lebesgue-Stieltjes Measures
4. Image Measures
5. Extended Real-Valued Measurable Functions
6. Simple Functions

Chapter 6. Elements of Integration
1. Integration on $\mathcal{C}^{-1}(\Omega, \Sigma)$
2. Main Convergence Theorems
3. Lebesgue and Riemann Integrals on $\mathbb{R}$
4. Integration with Respect to Image Measures
5. Measures Generated by Integrals. Absolute Continuity. Orthogonality
6. Product Measures of Finitely Many Measurable Spaces and Fubini's Theorem
7. Applications of Fubini's Theorem

Chapter 7. Calculus in Euclidean Spaces
1. Differentiation
2. Change of Variables

Part III. Further Topics in Integration

Chapter 8. Analysis in Abstract Spaces
1. Signed and Complex Measures
2. Absolute Continuity
3. Singularity
4. $L^p$ Spaces
5. Modes of Convergence
6. Uniform Integrability
7. Radon Measures on Locally Compact Hausdorff Spaces
8. Measure Derivatives

Chapter 9. Calculus on the Real Line
1. Monotone Functions
2. Functions of Bounded Variation
3. Absolute Continuous Functions
4. Singular Functions

Bibliography
Part I An Introduction to General Topology
Chapter 1 Set-Theoretic and Algebraic Preliminaries
Set theory is not just one of the main tools in mathematics; it is the very root of mathematics, from which all mathematical disciplines stem. The great German mathematician Georg Ferdinand Cantor is considered to be the sole founder of set theory, which he developed in a series of papers, the first of which appeared in 1874. Although the Czech Bernard Bolzano (1781-1848) made one of the first attempts to formalize set theory, in particular in his 1851 work Paradoxien des Unendlichen, by considering the one-to-one correspondence between two sets (later developed by Cantor into what we now know as cardinals), neither he nor anyone else was really a predecessor to Cantor's creation. Ernst Zermelo (1871-1953) was another German who, among his numerous contributions to set theory, is the author of the first axiom for set theory (of 1908) and undoubtedly the primary axiom of the whole of mathematics. This chapter presents only the essentials of set theory and abstract algebra needed throughout the book.
1. SETS AND BASIC NOTATION
Cantor defined a set as a collection $M$ into a whole of definite, distinct objects (that are called elements of $M$) of our thought. In other words, we bind objects (perhaps of different nature) in our mind into a single entity and call that entity a set. We will denote sets by capital letters, and their elements by lower case letters. For instance, a set $A$ has elements $a, b, c$, or $a_1, a_2, \ldots$. To abbreviate the expression "$a$ is an element of the set $A$," we will write $a \in A$. The expression "$a \notin A$" reads "$a$ is not an element of $A$." Observe that the notion of a set is relatively simple if we deal with such frequently encountered sets as sets of integers, rational numbers, real numbers or continuous functions. In some rare situations, thoughtless use of this notion can lead to contradictions, like Bertrand Russell's paradox. Russell posed the following set dilemma. Let $\mathfrak{R}$ be the set of all sets that are not elements of themselves. Clearly, $\mathfrak{R}$ is not empty. For instance, the set of all real numbers is not an element of itself (for it is not a real number), thus it belongs to $\mathfrak{R}$. The question arises: Is $\mathfrak{R}$ an element of itself? If $\mathfrak{R} \in \mathfrak{R}$ then, by definition of $\mathfrak{R}$, it should not belong to $\mathfrak{R}$, which is a contradiction. Thus, $\mathfrak{R} \notin \mathfrak{R}$. But then, by definition, it must belong to $\mathfrak{R}$, which is impossible. In this case, we have put the definition of an object ahead of its existence. The concept of a set must be supported by axioms of set theory, just as the main axioms of plane geometry define the shape of lines.
1.1 Definitions.
(i) A set $A$ is said to be a subset of a set $B$ (in notation, $A \subseteq B$) if all elements of $A$ are also elements of $B$. If $A$ is a subset of $B$, we call $B$ a superset of $A$ (in notation, $B \supseteq A$). A set that contains exactly one element, say $a$, is called a singleton (set) and it is denoted by $\{a\}$. If $a \in A$, then we can alternatively write $\{a\} \subseteq A$. Any set is obviously a subset of itself: $A \subseteq A$.
(ii) The unique set with no elements is called the empty set and is denoted $\emptyset$. Clearly, $\emptyset$ is a subset of any set, including itself.
(iii) $A = B$ (read "set $A$ equals set $B$") if and only if $A \subseteq B$ and $B \subseteq A$; otherwise, we will write $A \neq B$. Occasionally, we will be using the symbol "$\subset$" applied to the situation where one set is a subset of another set but the sets are not equal. $A \subset B$ reads "$A$ is a proper subset of $B$." In this case, $B$ is a proper superset of $A$ (in notation, $B \supset A$).
We postulate the existence of a set that is a superset of all other sets in the framework of a certain mathematical model. This set is usually called a universal set or just universe. We will also make use of the word "carrier" as a synonym for the universe and reserve for it the Greek letter $\Omega$. Sometimes, we will denote it by $X$, $Y$ or $Z$. A universe (as a base for some mathematical model or problem) is generally defined to contain all considered sets and it varies from model to model. For example, if $C^n_{[a,b]}$ denotes the set of all $n$-times differentiable functions on the interval $[a,b]$, it contains, as a subset, the set of possible solutions of an ordinary differential equation of the $n$th order. Thus, $\Omega = C^n_{[a,b]}$ is a relevant universe within which the problem is posed. One could also take for $\Omega$ the set $C_{[a,b]}$ of all continuous functions on $[a,b]$ or even the set of all real-valued functions on $[a,b]$. However, these are too "vast" to serve as universes and they are impractical for this concrete problem.
Set theory is also a basic ingredient of probability theory, which always begins with elements of set theory under a slightly modified lexicon. For instance, a universe is referred to as a sample space. Subsets of the sample space are called events; specifically, singletons are called elementary events. The concept of the universe is most vivid when used in probability theory. Let us consider the experiment that consists of tossing a coin until the first appearance of the head on the upper face of the coin. Denoting by $H$ the outcome of a head and by $T$ the outcome of a tail when tossing the coin, we may define $\{(T,T,\ldots,T,H)\}$ as an elementary event of the sample space $\Omega = \{(H), (T,H), (T,T,H), \ldots\}$. The universe $\Omega$ contains, as elements, all possible outcomes of tossing the coin until the "first success," that is, the first appearance of the head. For instance, in the language of probability theory, the event $\{(H), (T,H), (T,T,H)\}$ corresponds to the "success in at most three tosses."
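The coin-tossing sample space and the subset relations of Definitions 1.1 can be illustrated with a small sketch. It assumes outcomes are encoded as Python tuples of the strings "H" and "T"; the names `sample_space`, `at_most_three`, and `elementary_event` are hypothetical and introduced only for this illustration.

```python
from itertools import islice

def sample_space():
    """Yield the outcomes (H), (T,H), (T,T,H), ... of tossing until the first head."""
    tails = 0
    while True:
        yield ("T",) * tails + ("H",)
        tails += 1

# The event "success in at most three tosses": a finite subset of the sample space.
at_most_three = set(islice(sample_space(), 3))

# An elementary event is a singleton subset of the sample space.
elementary_event = {("T", "H")}

print(elementary_event <= at_most_three)   # subset test
print(elementary_event < at_most_three)    # proper subset test
print(set() <= elementary_event)           # the empty set is a subset of any set
```

Since the sample space is infinite, the sketch represents it as a generator and only ever materializes finite events drawn from it.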
1.2 Notations. Throughout the whole book we will be using the following notation.

(i) Logical symbols:
$\forall$ means "for all"
$\exists$ means "there is" or "there are" or "there exists"
$\Rightarrow$ means "implies" or "from ... it follows that ..."
$\Leftrightarrow$ means "if and only if"
$\wedge$ (&) means "and"
$\vee$ means "or"
: means "such that" (primarily used for definition of sets)

(ii) Frequently used sets:
$\mathbb{N}$: the set of all positive integers
$\mathbb{N}_0$: the set of all nonnegative integers
$\mathbb{Z}$: the set of all integers
$\mathbb{Q}$: the set of all rational numbers
$\mathbb{Q}^c$: the set of all irrational numbers
$\mathbb{R}$: the set of all real numbers
$\mathbb{C}$: the set of all complex numbers
$\mathbb{R}_+$: the set of all nonnegative real numbers
$\mathbb{R}_-$: the set of all negative real numbers

(iii) Denotation of sets:
List: The elements are listed inside a pair of braces [for instance, $\{a,b,c\}$ or $\{a_1, a_2, \ldots\}$].
Condition: A description of the elements with a condition following a colon (that in this case reads "such that"), again with braces enclosing the set [for instance, the set of odd integers is $\{n \in \mathbb{Z}: n = 2k+1,\ k \in \mathbb{Z}\}$].

(iv) Main set operations:
Union: $A \cup B = \{x \in \Omega: x \in A \vee x \in B\}$
Intersection: $A \cap B = \{x \in \Omega: x \in A \wedge x \in B\}$
Two subsets $A, B \subseteq \Omega$ are called disjoint if $A \cap B = \emptyset$.
Difference: $A \setminus B = \{x \in \Omega: x \in A \wedge x \notin B\}$ [$A \setminus B$ is also called the complement of $B$ with respect to $A$, with the alternative notation $A - B$ or $B^c_A$.]
Symmetric Difference: $A \,\Delta\, B = (A \setminus B) \cup (B \setminus A)$
Complement (with respect to the universe $\Omega$): $A^c = A^c_\Omega = \Omega \setminus A$

(v) General notation:
":=" reads "set by definition."
$\Box$ indicates the end of a proof, remarks, examples, etc.

A set-algebraic expression is a set in the form of some defined sets connected through set operations. Any transformation of a set-algebraic expression into another expression would require a set-theoretic manipulation, which we call a set-algebraic transformation. All basic set-algebraic transformations over basic set-algebraic expressions are known as Laws of Algebra (or Calculus) of Sets. $\Box$
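The main set operations listed above correspond directly to operators on Python's built-in `set` type. The following is a minimal sketch on a small finite universe; the universe `omega` and the sets `A` and `B` are arbitrary choices made for illustration.

```python
omega = set(range(10))      # hypothetical finite universe {0, 1, ..., 9}
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

union = A | B                       # elements in A or in B
intersection = A & B                # elements in A and in B
difference = A - B                  # elements in A but not in B
symmetric_difference = A ^ B        # (A \ B) united with (B \ A)
complement_A = omega - A            # complement with respect to the universe

# "Condition" notation: the odd integers of the universe, {n in omega : n = 2k + 1}.
odds = {n for n in omega if n % 2 == 1}

print(union, intersection, difference, symmetric_difference, complement_A, odds)
```

The set comprehension mirrors the "condition" way of denoting a set, while the literal `{0, 1, 2, 3}` mirrors the "list" way.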
1.3 Remark. One of the standard tools of the algebra of sets is the so-called pick-a-point process applied to, say, showing that $A \subseteq B$ or $A = B$. It is based on the following

Axiom of Extent: For each set $A$ and each set $B$, it is true that $A = B$ if and only if for every $x \in \Omega$, $x \in A$ when and only when $x \in B$.

Axiom's modification: If every element of $A$ is an element of $B$, then $A \subseteq B$.

Thus, for the modification, the pick-a-point process consists of selecting an arbitrary point $x$ of $A$ (picking a point $x$) and then proving that $x$ also belongs to $B$. The identities below can be verified easily by the reader using pick-a-point techniques.
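1.4 Theorem (Laws of Algebra of Sets).

(i) Commutative Laws:
$A \cup B = B \cup A$
$A \cap B = B \cap A$

(ii) Associative Laws:
$(A \cup B) \cup C = A \cup (B \cup C)$
$(A \cap B) \cap C = A \cap (B \cap C)$

(iii) Distributive Laws:
$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
$(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$

(iv) Idempotence of
complement: $(A^c)^c = A$
union: $A \cup A = A$
intersection: $A \cap A = A$

(v) $A \cap A^c = \emptyset$

(vi) $A \cup A^c = \Omega$

(vii) DeMorgan's Laws:
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$

(viii) $A \cup \emptyset = A$

(ix) $A \cap \emptyset = \emptyset$

(x) $\Omega^c = \emptyset$ and $\emptyset^c = \Omega$.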
1.5 Example. Show the validity of the first distributive law.
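As a sanity check rather than a proof, the first distributive law and DeMorgan's laws (in their usual form) can be tested exhaustively over all subsets of a small universe. The sketch below imitates the pick-a-point idea by comparing membership of each point $x$ on both sides; the universe `OMEGA` and the helper names are hypothetical choices for this illustration.

```python
from itertools import combinations, product

OMEGA = frozenset(range(4))  # small hypothetical universe

def subsets(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def first_distributive_law_holds(a, b, c):
    """Pick-a-point check: x lies in (A u B) n C exactly when it lies in (A n C) u (B n C)."""
    return all(
        ((x in a or x in b) and x in c) == ((x in a and x in c) or (x in b and x in c))
        for x in OMEGA
    )

def de_morgan_holds(a, b):
    """Check both DeMorgan identities with complements taken inside OMEGA."""
    comp = lambda s: OMEGA - s
    return (comp(a | b) == comp(a) & comp(b)
            and comp(a & b) == comp(a) | comp(b))

all_subsets = subsets(OMEGA)
print(all(first_distributive_law_holds(a, b, c)
          for a, b, c in product(all_subsets, repeat=3)))
print(all(de_morgan_holds(a, b) for a, b in product(all_subsets, repeat=2)))
```

An exhaustive check on one finite universe does not establish the laws in general; the pick-a-point argument does, because it reasons about an arbitrary point of an arbitrary universe.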
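1.6 Remark. The concepts of union and intersection can be extended to an arbitrary family of sets. For instance,

$\bigcup_{i \in I} A_i = \{x \in \Omega: \exists\, i \in I,\ x \in A_i\}.$

The distributive laws and DeMorgan's laws hold for arbitrary families (subject to Problem 1.1(b)):

$\left( \bigcup_{i \in I} A_i \right) \cap B = \bigcup_{i \in I} (A_i \cap B)$

$\left( \bigcap_{i \in I} A_i \right) \cup B = \bigcap_{i \in I} (A_i \cup B)$

$\left( \bigcup_{i \in I} A_i \right)^c = \bigcap_{i \in I} A_i^c$

$\left( \bigcap_{i \in I} A_i \right)^c = \bigcup_{i \in I} A_i^c$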
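For a finite indexed family, unions and intersections over the index set can be sketched with `functools.reduce`. The family below (multiples of $i$ inside a small universe) and the names `big_union` and `big_intersection` are hypothetical, chosen only to illustrate the generalized distributive and DeMorgan's laws; an empty intersection is taken to be the whole universe, which is an assumption of this sketch.

```python
from functools import reduce

OMEGA = frozenset(range(12))  # hypothetical universe
# Hypothetical indexed family: A_i = multiples of i in OMEGA, for i in I = {2, 3, 4}.
family = {i: frozenset(x for x in OMEGA if x % i == 0) for i in (2, 3, 4)}
B = frozenset({0, 1, 2, 3})

def big_union(sets):
    """Union of an iterable of frozensets (empty union = empty set)."""
    return reduce(frozenset.union, sets, frozenset())

def big_intersection(sets, universe=OMEGA):
    """Intersection of an iterable of frozensets (empty intersection = universe)."""
    return reduce(frozenset.intersection, sets, universe)

members = list(family.values())

distributive_1 = (big_union(members) & B) == big_union(a & B for a in members)
distributive_2 = (big_intersection(members) | B) == big_intersection(a | B for a in members)
de_morgan_1 = (OMEGA - big_union(members)) == big_intersection(OMEGA - a for a in members)
de_morgan_2 = (OMEGA - big_intersection(members)) == big_union(OMEGA - a for a in members)

print(distributive_1, distributive_2, de_morgan_1, de_morgan_2)
```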
1.7 Definitions.
(i) An indexed family $\mathcal{F} = \{A_i \subseteq \Omega : i \in I\}$ of sets is called (pairwise) disjoint if for all $i \neq j$, $A_i \cap A_j = \emptyset$. Throughout this book, the union of a pairwise disjoint family of sets will be denoted for convenience by $\sum_{i \in I} A_i$. Specifically, $A + B$ means $A \cup B$ when $A$ and $B$ are disjoint.

(ii) A decomposition of a set $A$ is any representation of $A$ as the union of a disjoint family of sets, $A = \sum_{i \in I} A_i$. The family $\{A_i;\ i \in I\}$ is referred to as a partition of $A$. [There is another use of the term partition, applied to a different construction in a narrower sense. Namely, $P$ is a partition of a closed interval $[a,b] \subseteq \mathbb{R}$ if $P$ is any ordered finite set of points $\{a_0, \ldots, a_n\} \subseteq [a,b]$ with $a = a_0 < a_1 < \cdots < a_n = b$.]

(iii) Let $\Omega$ be a fixed set. The family of all subsets of $\Omega$ is called the power set of $\Omega$ and it is denoted by $\mathcal{P}(\Omega)$.

(iv) A sequence $\{A_n : n = 1, 2, \ldots\}$ of sets is said to be monotone nondecreasing (nonincreasing) if

$A_1 \subseteq A_2 \subseteq \cdots$ $\quad (A_1 \supseteq A_2 \supseteq \cdots)$.

To specify the type of convergence, we will write $\{A_n\} \uparrow A$ ($\{A_n\} \downarrow A$), where $A$ denotes the union (respectively, the intersection) of all sets of the sequence. A sequence $\{A_n\}$ of sets is said to be monotone vanishing if it is monotone nonincreasing and $\{A_n\} \downarrow \emptyset$.

(v) Let $\{A_n\}$ be an arbitrary sequence of sets. Denote

(a) $\liminf_{n \to \infty} A_n$ (or just $\underline{\lim}\, A_n$) $= \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m$. This limit is called the limit inferior.

(b) $\limsup_{n \to \infty} A_n$ (or just $\overline{\lim}\, A_n$) $= \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m$. This limit is called the limit superior.

If $\underline{\lim}\, A_n = \overline{\lim}\, A_n$ then we denote this common limit as $\lim_{n \to \infty} A_n$. In this case, the limit of $\{A_n\}$ is said to exist and equal $\lim_{n \to \infty} A_n$.
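The limit inferior and limit superior involve infinitely many sets, so they cannot be computed by brute force in general. For a periodic sequence, however, any window of consecutive sets at least one period long already contains every set of the sequence, so finite windows give the exact answer. The sketch below relies on that assumption; the functions `lim_inf` and `lim_sup`, the parameters `horizon` and `tail`, and the alternating example are all hypothetical.

```python
from functools import reduce

def window(seq, n, tail):
    """Return the sets A_n, A_{n+1}, ..., A_{n+tail-1} of the sequence `seq`."""
    return [seq(m) for m in range(n, n + tail)]

def lim_inf(seq, horizon, tail):
    """Union over n = 1..horizon of the intersection of a window starting at n.

    Exact only if `tail` is at least one full period of a periodic sequence.
    """
    result = frozenset()
    for n in range(1, horizon + 1):
        result |= reduce(frozenset.intersection, window(seq, n, tail))
    return result

def lim_sup(seq, horizon, tail):
    """Intersection over n = 1..horizon of the union of a window starting at n."""
    unions = [reduce(frozenset.union, window(seq, n, tail))
              for n in range(1, horizon + 1)]
    return reduce(frozenset.intersection, unions)

E = frozenset({1, 2, 3})
F = frozenset({3, 4})

def alternating(n):
    """A_n = E for odd n and A_n = F for even n (period 2)."""
    return E if n % 2 == 1 else F

print(lim_inf(alternating, horizon=5, tail=2))  # equals E & F
print(lim_sup(alternating, horizon=5, tail=2))  # equals E | F
```

Here the two limits differ, so the limit of this alternating sequence does not exist in the sense of Definition (v).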
PROBLEMS

1.1 a) Prove Theorem 1.4, the laws of algebra of sets, by using the pick-a-point process.
b) Prove the generalized distributive laws and DeMorgan's laws stated in Remark 1.6.

1.2 Show that:

1.3 Show that $A \setminus B = A \cap B^c$.

1.4 Let $|A| = n$ (i.e., the set $A$ contains $n$ elements). Show that $|\mathcal{P}(A)| = 2^n$.

1.5 Prove that:

1.6 For each of the following, justify with a proof or give a counterexample.

1.7 Give an example of a monotone vanishing sequence of sets.

1.8 Let $\{A_n : n = 1, 2, \ldots\}$ be an arbitrary sequence of sets. Define $A_* = \bigcap_{n=1}^{\infty} A_n$ and $A^* = \bigcup_{n=1}^{\infty} A_n$.
a) Construct a monotone nonincreasing sequence of sets $\{B_n\}$ such that $\{B_n\} \downarrow A_*$.
b) Construct a monotone nondecreasing sequence of sets $\{C_n\}$ such that $\{C_n\} \uparrow A^*$.
c) Given $\{C_n\} \uparrow A^*$, construct a pairwise disjoint sequence $\{D_n\}$ such that $\sum_{n=1}^{\infty} D_n = A^*$.

1.9 In the condition of Problem 1.8, show that $A_* \subseteq \underline{\lim}\, A_n \subseteq \overline{\lim}\, A_n \subseteq A^*$.

1.10 Let $\Omega$ be an arbitrary set. Find a sequence $\{E_n\}$ of subsets of $\Omega$ such that $\underline{\lim}\, E_n = \emptyset$ and $\overline{\lim}\, E_n = \Omega$.
NEW TERMS: set, element of a set, Russell's paradox, subset, superset, singleton, empty set, proper subset, proper superset, universe, carrier, sample space, events, elementary events, union, intersection, disjoint sets, difference, symmetric difference, complement, set-algebraic expression, set-algebraic transformation, pick-a-point process, axiom of extent, commutative laws, associative laws, distributive laws, idempotence, DeMorgan's laws, pairwise disjoint sets, disjoint family of sets, decomposition of a set, partition of a set, partition of an interval, power set, monotone nondecreasing sequence of sets, monotone nonincreasing sequence of sets, monotone vanishing sequence of sets, limit inferior, limit superior, limit of a sequence
2. FUNCTIONS
The word "function" was introduced by Gottfried von Leibnitz in 1694, initially as a term to denote any quantity related to a curve, such as its slope, the radius of curvature, etc. The notion of the function was refined subsequently by Johann Bernoulli, Leonhard Euler, Joseph Fourier, and finally, by Lejeune Dirichlet in the middle of the nineteenth century, with a formulation pretty close to what we are using at the present time and which a mathematics or engineering student meets in an introductory calculus course. Dirichlet introduced a variable as a symbol that represents a set of numbers. Two variables $x$ and $y$ are related if, whenever $x$ takes on a value, there is a value $y$ assigned to $x$ by some rule of correspondence. In this case $y$ (a dependent variable) was said to be a function of $x$ (an independent variable). In this section we introduce a more contemporary notion of a function. For functions operating with sets (rather than with points), we will be using a nontraditional notation of $f_*$ and $f^*$ (instead of just $f$), previously used by MacLane and Birkhoff [1993], which we found very appealing, as it brings more order within functions acting on collections of sets (such as topologies and sigma-algebras) and simplifies many proofs.
2.1 Definitions.

(i) Let $X$ and $Y$ be two sets. The set $\{(x,y): x \in X,\ y \in Y\}$ of all ordered pairs of elements of $X$ and $Y$ is called the Cartesian or direct product of $X$ and $Y$ and it is denoted by $X \times Y$. If $X = Y$ then we shall write $X \times X = X^2$. Similarly, the Cartesian product of $n$ sets is the set of all ordered $n$-tuples.

(ii) Any subset $f$ of $X \times Y$ is called a binary relation.

(iii) A binary relation $f \subseteq X \times Y$ is called a (single-valued) function if whenever $(x,y_1)$ and $(x,y_2)$ are elements of $f$, then $y_1 = y_2$. We also say that the function $f$ is a map (or mapping) from $X$ to $Y$ and denote this most frequently by the triple $[X,Y,f]$ or by $f: X \to Y$ or by $(x, f(x))$ or by $f(x) = y$ or by $x \mapsto f(x)$.

(iv) For a function $f$ (as a subset of $X \times Y$), denote

$$D_f = \{x \in X : \exists\, y \in Y,\ (x,y) \in f\}$$

and call it the domain of $f$. When a function $[X,Y,f]$ is given we will agree that $X$ is the domain of $f$. If a domain is not specified, we agree to regard as $D_f$ the largest possible set where $f$ is defined. The latter requires a more rigorous motivation. For instance, let

$$f(x) = \frac{1}{\sqrt{x-1}}.$$

This function is defined for all $x \in (1,\infty)$. On the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty, -\infty\}$, we allow $x \in [1,\infty]$. And finally, it is not wrong to have $x$ be any real (or even complex) number, if $f$ will take on values in $Y \subseteq \mathbb{C}$ (or $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$).

(v) Another component of a function is its range,

$$R_f = \{y \in Y : \exists\, x \in D_f,\ f(x) = y\}.$$

A superset of $R_f$ (such as $Y$) is referred to as a codomain. In other words, $R_f$ is the subset of all those elements of $Y$ that take part in the relation $f \subseteq D_f \times Y$.

(vi) If $x \in D_f$, then $f(x)$ $(\in R_f)$ is called the image of $x$ under $f$. By the above definition, for every $x$ there is a unique image. [Note that an "extended" concept of a function allows more than one image of each point $x$ under $f$. Any such function $f$ is called multi-valued. The reader is surely acquainted with principles of complex analysis where such functions are common. It is also known that in this case the range of a multi-valued function can be partitioned into pairwise disjoint subsets, such that the function is then split into a number of single-valued functions called branches.]

(vii) If $D \subseteq D_f$ then the set of the images of all points of $D$ under $f$ is called the image of $D$ under $f$ and, following the notation of most analysis textbooks, it can be denoted

$$f(D).$$

However, for the upcoming constructions, it is convenient to distinguish images of points of a set from images of subsets of $X$ under $f$. In other words, we introduce the function

$$f_*: \mathcal{P}(X) \to \mathcal{P}(Y),$$

where for $D \in \mathcal{P}(X)$ we denote $f_*(D) = \{y \in Y : \exists\, x \in D,\ f(x) = y\}$.
Specifically, $R_f = f_*(D_f)$. We agree to set $f_*(\{x\}) = \emptyset$ $\forall x \notin D_f$. However, unless specified, we will always assume that in $[X,Y,f]$, $X$ is the domain of function $f$. [In particular, this agreement excludes such an inconsistency as having $f(x) = \emptyset$ whenever $x \notin D_f$, since $f(x)$ is supposed to be a point and not a set.]

(viii) Let $[X,Y,f]$ be a function. Define the function

$$f^*: \mathcal{P}(R_f) \to \mathcal{P}(X)$$

and call it the inverse of $f_*$. In other words, for each $B \in \mathcal{P}(R_f)$, $f^*(B) = \{x \in X : f(x) \in B\}$. The set $f^*(B)$ is called the inverse image of $B$ under $f$, or the pre-image of $B$ under $f$.

Another construction related to $f^*$ is $f^{-1}$, defined as $\{(y,x) \in Y \times X : (x,y) \in f\}$ and called the inverse of $f$. Unlike $f^*$, in general, $f^{-1}$ is not a single-valued function (in other words, it is a binary relation or multi-valued function). Consider, for instance, the function $[\mathbb{R},\mathbb{R},f]$ such that $f(x) = x^2$. Clearly, $R_f = \mathbb{R}_+$ and the inverse $\sqrt{\ } = f^{-1}$ of $f$ is a two-valued function with domain $D_{\sqrt{\ }} = \mathbb{R}_+$ and with range equal to $\mathbb{R}$, which can be decomposed as $\mathbb{R} = (-\infty,0) + [0,\infty)$. Accordingly, we have two branches $[\mathbb{R}_+, (-\infty,0), \sqrt{\ }_1]$ and $[\mathbb{R}_+, \mathbb{R}_+, \sqrt{\ }_2]$ of $\sqrt{\ }$.

(ix) Observe that it is legitimate that $f(x_1) = f(x_2)$ and $x_1 \neq x_2$. However, if $f$ is such that $f(x_1) = f(x_2)$ if and only if $x_1 = x_2$, then $f$ is called one-to-one (or injective or invertible). If $f$ is one-to-one, $f^{-1}$ is a single-valued function too. Since $f^{-1}$ in general is not a single-valued function, we will agree to regard $f^{-1}(y)$ as a set (which in particular can be a singleton or the empty set), with the alternative notation $f^*(\{y\})$.

(x) Let $[X,Y,f]$ be a function. Generally, $f_*(X) = R_f \subseteq Y$. In this case, we say the map $f$ is from $X$ into $Y$. When $f_*(X) = Y$, we say the map $f$ is from $X$ onto $Y$ or surjective. We call $f$ bijective if $f$ is surjective (onto) and injective (one-to-one).

(xi) Let $f \subseteq X \times Y$ and $g \subseteq Y \times Z$ be binary relations. Then the composition of $f$ with $g$ is defined as

$$g \circ f = \{(x,z) \in X \times Z : \exists\, y \in Y,\ (x,y) \in f,\ (y,z) \in g\}.$$

The composition of $f$ with $g$ is most frequently used when $[X,Y,f]$ and $[R_f \cap D_g, Z, g]$ are functions and, consequently, it is defined as

$$(g \circ f)(x) = g(f(x)).$$
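2.2 Example. For a fixed subset $A \subseteq X$, define the indicator function $[X,\mathbb{R},\mathbf{1}_A]$ as

$$\mathbf{1}_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \in A^c. \end{cases}$$

Then, $[X,\mathbb{R},\mathbf{1}_A]$ is an into map, while $[X,\{0,1\},\mathbf{1}_A]$ is an onto map.

2.3 Definition. Let $f: X \to Y$ and let $A \subseteq X$. Then define

$$[A, Y, \mathrm{Res}_A f], \qquad \mathrm{Res}_A f(x) = f(x), \quad x \in A.$$

This function is called the restriction of $f$ to $A$. On the other hand, the function $f$ is called an extension of the function $\mathrm{Res}_A f$ from $A$ to $X$.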
2.4 Example. Consider $[\mathbb{R},[-1,1],\sin]$, which is surjective (i.e., onto) but not injective (one-to-one). Take a restriction of the function $[\mathbb{R},[-1,1],\sin]$ to one of the largest subsets $A$ of $\mathbb{R}$ where $[\mathbb{R},[-1,1],\sin]$ is monotone increasing. It is plausible to set $A = [-\frac{\pi}{2},\frac{\pi}{2}]$, since it is also symmetric about the $Y$-axis. Then $[A,[-1,1],\mathrm{Res}_A \sin]$ is obviously bijective and its inverse is the well-known function $[[-1,1],A,\arcsin]$.
2.5 Remark. Let $[X,Y,f]$ be a single-valued function such that for some $y \in R_f$, $f^*(\{y\}) = \{x_1,x_2,x_3\} \subseteq X$. Consider the composition $f_* \circ f^*$ and find that

$$f_* \circ f^*(\{y\}) = f_*(\{x_1,x_2,x_3\}) = \{y\}.$$

Thus, if $f$ is single-valued, the restriction of $f \circ f^{-1}$ to $R_f$ is the identity function (denoted $I$, with the domain $D_{f \circ f^{-1}} = R_f$). However, $f^{-1} \circ f$ need not be a single-valued function at all (show it). $f^{-1} \circ f$ is the identity function only when $f$ is injective.
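For a function on a finite set, the operators $f_*$ and $f^*$ and the two compositions discussed in Remark 2.5 can be computed directly. The following is only a sketch under our own assumptions: the function is stored as a Python dictionary (a finite set of pairs), and the names `image` and `preimage` are ours, not notation from the text.

```python
def image(f, D):
    """f_*(D): the values f takes on D; points outside the domain contribute nothing."""
    return {f[x] for x in D if x in f}


def preimage(f, B):
    """f^*(B): all points of the domain that f sends into B."""
    return {x for x in f if f[x] in B}


# Hypothetical example: f(x) = x^2 on X = {-2, ..., 2}.
f = {x: x * x for x in range(-2, 3)}

fiber = preimage(f, {4})                 # the two-point pre-image of 4
back = image(f, fiber)                   # f_* o f^* applied to {4}
enlarged = preimage(f, image(f, {2}))    # f^* o f_* applied to {2}

print(fiber, back, enlarged)
```

Here `back` returns to `{4}`, while `enlarged` is strictly larger than `{2}` because this $f$ is not injective.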
PROBLEMS

2.1 Find the image of $[-3,5)$ under $\mathbf{1}_{(1,2]}$.

2.2 Find the inverse image of (&4] under ~1.

2.3 Composition:
a) Show that the composition operator is associative.
b) Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
c) Show that $D_{g \circ f} = D_f \cap f^*(D_g)$.
2.4 Show the equivalence of the following statements:
a) $f$ is one-to-one.
b) $f_*(A \cap B) = f_*(A) \cap f_*(B)$.
c) For every pair $A$ and $B$ of disjoint sets, $f_*(A) \cap f_*(B) = \emptyset$.

In the following problems we assume that $f$ is a map from $X$ into $Y$.

2.5 Show that $A \subseteq X \Rightarrow A \subseteq f^* \circ f_*(A)$.

2.6 Show that $\forall B \subseteq Y$, $f_* \circ f^*(B) \subseteq B$.

2.7 Show that $[X,Y,f]$ is onto if and only if $f_* \circ f^*(B) = B$ holds $\forall B \subseteq Y$.
NEW TERMS: Cartesian (direct) product, binary relation, function, map, mapping, domain, range, codomain, image of a point, multi-valued function, image of a set, branch of a function, inverse image of function $f_*$, pre-image, inverse of function $f$, one-to-one (injective, invertible) map, into map, onto (surjective) map, bijective (onto and one-to-one) map, composition of binary relations, composition of maps, indicator function, restriction of a map, extension of a map, identity function
3. SET OPERATIONS UNDER MAPS The most remarkable property of the inverse of a function is that it "preserves" all set operations. The function itself, as we shall see, does not have such a quality. The main theorems in this section will be proved for special cases of surjective maps; the rest will be left for the reader.
3.1 Theorem. Let $[X,Y,f]$ be a surjective map and let $B \subseteq Y$. Then

$$f^*(B^c) = [f^*(B)]^c.$$

Proof. We prove an equivalent statement, $f^*(B) + f^*(B^c) = X$, i.e., we show that (i) $f^*(B)$ and $f^*(B^c)$ are disjoint and (ii) $f^*(B)$ complements $f^*(B^c)$ up to $X$. We start with:

(i) Suppose $f^*(B)$ and $f^*(B^c)$ have a common point $x$. Then there is $y_1 \in B$ such that $f(x) = y_1$ and $y_2 \in B^c$ such that $f(x) = y_2$. Thus, $y_1 \neq y_2$ and $f$ is not a single-valued function. (See Figure 3.1.)

Figure 3.1

(ii) If $f^*(B)$ does not complement $f^*(B^c)$ up to $X$, there will be at least one point $x$ which does not belong to either of these sets (for they are disjoint as shown above). This is an obvious contradiction, since it follows that $f(x) \notin Y$. (See Figure 3.2 below.)

Figure 3.2

3.2 Example. Let $[X,Y,f]$ be a function. Then $[f^*(Y)]^c = X^c = \emptyset$. On the other hand, setting $B = Y$, by Problem 3.1, we obtain

$$f^*(\emptyset) = f^*(Y^c) = [f^*(Y)]^c = \emptyset.$$
3.3 Theorem. Let $[X,Y,f]$ be a surjective map. Then $B_1 \subseteq B_2 \subseteq Y$ implies that $f^*(B_1) \subseteq f^*(B_2)$.

Proof. Suppose that $f^*(B_1)$ is not a subset of $f^*(B_2)$. This implies the existence of a point $x$ which belongs to $f^*(B_1)$ and does not belong to $f^*(B_2)$. Therefore, there is exactly one point $y \in B_1$ with $f(x) = y$. On the other hand, since $x \notin f^*(B_2)$, $f(x)$ cannot belong to $B_2$. But it must, since $f(x) = y \in B_1 \subseteq B_2$. (See Figure 3.3 below.) Hence, our assumption above was wrong.

Figure 3.3
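3.4 Theorem. Let $f: X \to Y$ be an onto map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$. Then,

$$f^*\Big(\bigcup_{i \in I} B_i\Big) = \bigcup_{i \in I} f^*(B_i).$$

Proof.

(i) We prove that

$$\bigcup_{i \in I} f^*(B_i) \subseteq f^*\Big(\bigcup_{i \in I} B_i\Big).$$

Let $x \in \bigcup_{i \in I} f^*(B_i)$. Then there is an index $i_0 \in I$ such that $x \in f^*(B_{i_0})$. Since $B_{i_0} \subseteq \bigcup_{i \in I} B_i$, by Theorem 3.3, $f^*(B_{i_0}) \subseteq f^*(\bigcup_{i \in I} B_i)$, which implies that $x \in f^*(\bigcup_{i \in I} B_i)$.

(ii) We show the validity of the inverse inclusion,

$$f^*\Big(\bigcup_{i \in I} B_i\Big) \subseteq \bigcup_{i \in I} f^*(B_i).$$

Let $x \in f^*(\bigcup_{i \in I} B_i)$. Then $f(x) \in \bigcup_{i \in I} B_i$. Therefore, there is an index $i_0 \in I$ such that $f(x) \in B_{i_0}$, that is, $\{f(x)\} \subseteq B_{i_0}$. By Theorem 3.3, it follows that $f^*(\{f(x)\}) \subseteq f^*(B_{i_0})$. Since $x \in f^*(\{f(x)\})$, we have

$$x \in f^*(B_{i_0}) \subseteq \bigcup_{i \in I} f^*(B_i).$$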
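The statements of this section can be checked exhaustively for a small finite map. The sketch below is an illustration, not a proof: the map $f(x) = x \bmod 3$, the sets `X` and `Y`, and the helper names are our own assumptions. It verifies that the inverse image of a complement is the complement of the inverse image (the two facts established in the proof of Theorem 3.1), that unions are preserved as in Theorem 3.4 (for families of two sets), and that the same holds for intersections. It then exhibits a pair of sets for which the image $f_*$ fails to preserve the intersection.

```python
from itertools import combinations


def image(f, D):
    """f_*(D) for a function stored as a dict."""
    return {f[x] for x in D if x in f}


def preimage(f, B):
    """f^*(B) for a function stored as a dict."""
    return {x for x in f if f[x] in B}


def subsets(S):
    """All subsets of a finite set S."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Hypothetical onto map from X = {0, ..., 5} to Y = {0, 1, 2}.
X, Y = set(range(6)), {0, 1, 2}
f = {x: x % 3 for x in X}

for B in subsets(Y):
    # complement is preserved by the inverse image
    assert preimage(f, Y - B) == X - preimage(f, B)
    for C in subsets(Y):
        # union and intersection are preserved by the inverse image
        assert preimage(f, B | C) == preimage(f, B) | preimage(f, C)
        assert preimage(f, B & C) == preimage(f, B) & preimage(f, C)

# The image does not preserve intersections: 0 and 3 have the same value.
A1, A2 = {0}, {3}
lhs = image(f, A1 & A2)             # image of the empty set
rhs = image(f, A1) & image(f, A2)   # both images contain 0
print(lhs, rhs)
```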
PROBLEMS

3.1 Prove Theorem 3.1 under the condition that $f$ is an into map.

3.2 Prove Theorem 3.3 under the condition that $f$ is an into map.

3.3 Generalize Theorem 3.4 when $f$ is an into map.

3.4 Let $[X,Y,f]$ be an into map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$.
a) Prove that $f^*(\bigcap_{i \in I} B_i) = \bigcap_{i \in I} f^*(B_i)$.
b) If $\{B_i : i \in I\}$ is a pairwise disjoint family, show that

$$f^*\Big(\sum_{i \in I} B_i\Big) = \sum_{i \in I} f^*(B_i).$$

3.5 Show that $f^*(A \setminus B) = f^*(A) \setminus f^*(B)$.

3.6 The results above prove that all set operations are closed under the inverses of maps. Show that not all set operations are closed under maps as per the following.
a) Show that maps preserve inclusions.
b) Show that maps preserve unions.
c) Show that maps do not preserve intersections; specifically, show that

$$f_*(A \cap B) \subseteq f_*(A) \cap f_*(B)$$

and that the inverse inclusion need not hold. Explain the latter without a counterexample.
d) Do maps preserve the difference?
3.7 Let $[X,Y,f]$ be a map and let $A \subseteq Y$. Show that
3.8 Prove the following properties of the indicator function defined on a nonempty set $\Omega$:

(i) $\mathbf{1}_{A \cap B} = \min\{\mathbf{1}_A, \mathbf{1}_B\} = \mathbf{1}_A \mathbf{1}_B$.

(iii) $\mathbf{1}_{A+B} = \mathbf{1}_A + \mathbf{1}_B$.

(vi) $A \subseteq B \Rightarrow \mathbf{1}_A \le \mathbf{1}_B$.

(vii) $\mathbf{1}_{\bigcup_{i \in I} A_i} = \sup\{\mathbf{1}_{A_i} : i \in I\}$, $\mathbf{1}_{\bigcap_{i \in I} A_i} = \inf\{\mathbf{1}_{A_i} : i \in I\}$.

3.9 Let $\{A_n\}$ be a sequence of subsets of $\Omega$. Show that the function $\varliminf \mathbf{1}_{A_n}$ is the indicator function of the set $\varliminf A_n$ and that the function $\varlimsup \mathbf{1}_{A_n}$ is the indicator function of the set $\varlimsup A_n$.

3.10 Prove that $\lim_{n \to \infty} A_n$ exists if and only if $\lim_{n \to \infty} \mathbf{1}_{A_n}$ exists. [Hint: Use Problem 3.9.]

3.11 Let $[X,X',F]$ be a bijective map and let $\tau$ and $\tau'$ be respective collections of subsets of $X$ and $X'$ such that $F^{**}(\tau') \subseteq \tau$ and $F_{**}(\tau) \subseteq \tau'$. Show that $F^{**}(\tau') = \tau$ and $F_{**}(\tau) = \tau'$.
4. RELATIONS AND WELL-ORDERING PRINCIPLE In Definition 2.1 (ii) we introduced the concept of a binary relation $R$ as an arbitrary subset of $A \times B$. In the special case when $R \subseteq A \times B$ and $A = B$, we call $R$ a binary relation on $A$. We will sometimes use the notation $aRb$ instead of $(a,b) \in R$. This notation makes sense, for instance, if $R$ is stipulated by $<$ or $\le$ on some set. In addition, we will also say that a pair $(A,R)$ is a binary relation, where in fact $R$ is a binary relation on a set $A$ (a carrier). Now we consider some special relations.
4.1 Definitions. Let $R$ be a binary relation on $S$.

(i) $R$ is called reflexive if $\forall a \in S$, $(a,a) \in R$ [$aRa$].

(ii) $R$ is called symmetric if $(a,b) \in R \Rightarrow (b,a) \in R$ [$aRb \Rightarrow bRa$].

(iii) $R$ is called antisymmetric if $(a,b), (b,a) \in R \Rightarrow a = b$ [$aRb \wedge bRa \Rightarrow a = b$].

(iv) $R$ is called transitive if $(a,b), (b,c) \in R \Rightarrow (a,c) \in R$ [$aRb \wedge bRc \Rightarrow aRc$].

(v) $R$ is called an equivalence on $S$ (denoted by the symbol $\approx$ or $E$) if it is reflexive, symmetric and transitive.

[Observe that the equivalence $E$ on $S$ partitions $S$ into mutually disjoint subsets, called equivalence classes. A partition of $S$ is a family of disjoint subsets of $S$ whose union is $S$ (a decomposition of $S$). The elements of $S$ "communicate" only within these classes. Therefore, every equivalence relation generates mutually disjoint classes. The converse is also true: an arbitrary partition of the carrier $S$ generates an equivalence relation.]

(vi) $R$ is called a partial order (denoted by the symbol $\preceq$) if it is reflexive, antisymmetric and transitive.

(vii) If $\preceq$ is a partial order, it is called linear or total if every two elements of $S$ are comparable, i.e., $\forall a,b \in S$ either $a \preceq b$ or $b \preceq a$.

(viii) Let $S$ be an arbitrary set and let $\approx$ ($E$) be an equivalence relation on $S$. For $t \in S$ denote

$$[t]_{\approx}\ (= [t]_E) = \{s \in S : s \approx t\}$$

and call it an equivalence class modulo $\approx$ ($E$). The set of all equivalence classes

$$S|_{\approx} = \{[t]_{\approx} : t \in S\}$$

is said to be the quotient (or factor) set of $S$ modulo $\approx$. It is easily seen that a quotient set of $S$ is also a partition of $S$. Note that $x \mapsto [x]_{\approx}$ is a function assigning to each $x \in S$ an equivalence class $[x]_{\approx}$. We will denote this function by $\pi_E$ (or $\pi_{\approx}$) and call it the projection of $S$ on its quotient by $E$ (or $\approx$).
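On a finite carrier, each property in Definitions 4.1 is a finite check, so a relation stored as a set of pairs can be classified mechanically. The sketch below makes its own assumptions: the carrier $S = \{1,\ldots,8\}$, the two sample relations (divisibility and congruence modulo 3), and all function names are ours.

```python
def is_reflexive(S, R):
    return all((a, a) in R for a in S)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def classes(S, R):
    """Equivalence classes [t] = {s : s R t}; meaningful only if R is an equivalence on S."""
    return {frozenset(s for s in S if (s, t) in R) for t in S}


S = range(1, 9)
divides = {(a, b) for a in S for b in S if b % a == 0}      # a partial order
cong3 = {(a, b) for a in S for b in S if (a - b) % 3 == 0}  # an equivalence

print(is_reflexive(S, divides), is_antisymmetric(divides), is_transitive(divides))
print(classes(S, cong3))
```

The quotient set of $S$ modulo `cong3` consists of three classes, and the classes are pairwise disjoint with union $S$, as the remark following (v) asserts.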
4.2 Examples.

(i) $(\mathbb{R}, =)$ is an equivalence relation. Therefore, every real number as a singleton represents an equivalence class.

(ii) $(\mathbb{R}, \le)$ is a linear order.

(iii) Congruent triangles on a plane offer an equivalence relation on the set of all triangles. [Two sets $A$ and $B$ are called congruent if there exists an "isometric" bijective map $f: A \to B$, i.e., $f$ must preserve the "distance" for every pair of points $a,b \in A$ and their images $f(a), f(b) \in B$.]

(iv) $(\mathbb{R}^2, \preceq)$ is not a linear order if we define "$\preceq$" as $(a_1,b_1) \preceq (a_2,b_2)$ if and only if $a_1 \le a_2 \wedge b_1 \le b_2$. To make this relation a linear order we can define, for instance, $(a_1,b_1) \preceq (a_2,b_2)$ if and only if $\|(a_1,b_1)\| \le \|(a_2,b_2)\|$, where $\|(a,b)\|$ is the distance of point $(a,b)$ from the origin.

(v) Let $|$ be the relation on $\mathbb{N}$ such that $n \mid m$ if and only if $n$ divides $m$ (without a remainder). It can be shown that $(\mathbb{N}, |)$ is a partial order but not a linear order. (See Problem 4.5.)

(vi) Let $p$ be a fixed integer greater than or equal to 2. Two integers $a$ and $b$ are called congruent modulo $p$ if $a - b$ is divisible by $p$ (without remainder); in notation we write $p \mid a - b$ or $a \equiv b \pmod p$. The number $p$ is called the modulus of congruence. Let

$$[m]_p = \{n \in \mathbb{Z} : m \equiv n \pmod p\} \quad (m \in \mathbb{Z}).$$

In other words,

$$[m]_p = \{m + kp : k \in \mathbb{Z}\}.$$

Then any two integers $m$ and $n$ are related in terms of $[\cdot]_p$ if and only if $n \in [m]_p$. This is an equivalence relation. (Show it; see Problem 4.1.)

(vii) Let $S$ be a nonempty set and $R \subseteq S \times S$ be a binary relation. Taking for $R$ the diagonal $D = \{(s,s) : s \in S\}$ we have with $(S,D)$ the "smallest" (by the contents of elements of $S \times S$) equivalence relation on $S$, where each element forms a singleton-class, and $D$ partitions $S$ into $\{s\}$-classes. The "largest" equivalence relation on $S$ is obviously $R = S \times S$ itself and it consists of the single class.

(viii) Any function $[X,Y,f]$ generates an equivalence relation on its domain $X$ partitioning $X$ into disjoint subsets. Define the binary relation $E_f$ ($\approx_f$) on $X$ as

$$x\, E_f\, x' \iff f(x) = f(x').$$

Then, it is readily seen that $E_f$ is an equivalence relation on $X$, referred to as the equivalence kernel of the function $f$. Formally, for every point $y \in f_*(X)$, the pre-image $f^{-1}(y)$ is an equivalence class in $X$ and $\{[f^{-1}(y)]_{E_f} : y \in f_*(X)\}$ is the quotient set of $X$ modulo $E_f$ (or $\approx_f$). Furthermore,

$$X = \sum_{y \in f_*(X)} f^{-1}(y)$$

is a decomposition of $X$.

For instance, the function $f(x) = x^2$ generates a partition of $\mathbb{R}$ into a collection of subsets of the form $\{-a,a\}$, for $a > 0$, along with $\{0\}$, which is a factor set of $\mathbb{R}$ modulo $E_{x^2}$.

Another example is the function $[X,\mathbb{R},\tan]$, where $X$ denotes the domain of $\tan$. Let

$$A_y = \tan^{-1}(y) = \{\arctan y + \pi n : n \in \mathbb{Z}\} = [\arctan y]_{E_{\tan}}.$$

Then, $E_{\tan}$ is the equivalence kernel of the function $\tan$,

$$X|_{E_{\tan}} = \{\tan^{-1}(y) : y \in \mathbb{R}\}$$

(the quotient set of $X$ modulo $E_{\tan}$) and

$$X = \sum_{y \in \mathbb{R}} A_y.$$
The last discussion about the equivalence relation generated by a function yields some important results and notions we would like to use in the upcoming material of Chapters 6 and 8. While we demonstrated in Example 4.2 (viii) that any function on $X$ generates an equivalence relation, the following proposition states that the converse is also true; namely, that any equivalence relation $E$ is the equivalence kernel of some function.

4.3 Proposition. Let $E$ be an equivalence relation on a nonempty set $X$. Then the projection $[X, X|_E, \pi_E]$ is an onto map with $E$ as the equivalence kernel.

Proof. From the definition of $\pi_E$ it follows that $\pi_E$ is surjective. To claim that $E$ is the equivalence kernel of $\pi_E$, we need to show that $\pi_E(x) = \pi_E(y)$ if and only if $xEy$. Let $\pi_E(x) = \pi_E(y)$. Since $xEx$, $x \in [x]_E$ and therefore, by the assumption ($\pi_E(x) = \pi_E(y)$), $x \in [y]_E$. This proves that $xEy$. Now let $xEz$. If $y \in [x]_E$, then $yEx$ and thus, by transitivity, $yEz$, i.e., $y \in [z]_E$. Therefore, $[x]_E \subseteq [z]_E$. The inverse inclusion, and thus the equality, is due to the symmetry of $E$. Hence, $\pi_E(x) = \pi_E(z)$.

Proposition 4.3 asserts that the projection $\pi_E$ is a trivial example of an onto function defined on $X$ and with the range $X|_E$. Now suppose $E$ is an equivalence relation on a set $X$ and $[X,Y,f]$ is any function whose equivalence kernel is $E$. The following theorem claims that there is a unique "mediator" $\tilde f$ between the quotient set $X|_E$ and the codomain $Y$ of $f$.
4.4 Theorem. Let $E$ be an equivalence relation on a nonempty set $X$ and $[X,Y,f]$ be a function whose equivalence kernel is $E$. Then there is a unique function $[X|_E, Y, \tilde f]$ such that $f = \tilde f \circ \pi_E$.

The reader should be able to take care of this theorem (Problem 4.10) as well as of Corollaries 4.5 and 4.6 (Problems 4.11 and 4.12).

4.5 Corollary. In the condition of Theorem 4.4, if $f$ is onto, then $\tilde f$ is bijective.

4.6 Corollary. Let $[X,Y,f]$ be a function and let $E_f$ denote its equivalence kernel. Then, there is a unique one-to-one function $[X|_{E_f}, Y, \tilde f]$ such that $f$ can be represented as a composition

$$f = \tilde f \circ \pi_{E_f}.$$

Furthermore, $\tilde f$ is bijective if $f$ is surjective (onto).
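The factorization of a function through the quotient by its equivalence kernel can be traced on a finite example. The following sketch assumes $f(x) = x^2$ on $X = \{-3,\ldots,3\}$; the names `kernel_classes`, `projection`, `mediator` and `f_tilde` are hypothetical and chosen only for this illustration.

```python
def kernel_classes(X, f):
    """Quotient of X modulo the equivalence kernel of f: points grouped by their value f(x)."""
    groups = {}
    for x in X:
        groups.setdefault(f(x), set()).add(x)
    return {frozenset(c) for c in groups.values()}


def projection(quotient):
    """The projection x -> [x], returning the class that contains x."""
    return lambda x: next(c for c in quotient if x in c)


def mediator(f):
    """The induced map on classes: a class is sent to the common value of f on it."""
    return lambda c: f(next(iter(c)))


X = range(-3, 4)
f = lambda x: x * x

Q = kernel_classes(X, f)
pi, f_tilde = projection(Q), mediator(f)

# f factors through the quotient, and the induced map is one-to-one on classes.
factors = all(f_tilde(pi(x)) == f(x) for x in X)
injective = len({f_tilde(c) for c in Q}) == len(Q)
print(sorted(sorted(c) for c in Q), factors, injective)
```

The induced map is well defined precisely because $f$ is constant on each class of its kernel, which is why picking any representative with `next(iter(c))` is harmless here.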
Now, we turn to a discussion on the partial order relation and all relevant notions and theorems, which we are going to apply throughout the book.
4.7 Definitions. Let $(A, \preceq)$ be a partial order and let $B \subseteq A$. Clearly, $(B, \preceq)$ is also a partial order.

(i) The partial order $(B, \preceq)$ is called a chain in $(A, \preceq)$ if it is linear.

(ii) An element $b_0 \in B$ is called a minimal element of $B$ (relative to $\preceq$) if for each $b \in B$ with $b \preceq b_0$, $b = b_0$ (compared with the smallest element $b_0$, which is $\preceq b$ for all $b \in B$).

(iii) An element $b_1 \in B$ is called a maximal element of $B$ (relative to $\preceq$) if for each $b \in B$ with $b_1 \preceq b$, it holds true that $b = b_1$ (compared with the largest element $b_1$, which is such that $b \preceq b_1$ $\forall b \in B$). [Observe that the difference between a minimal element and the smallest element of a set is as follows. A minimal element $b_0$ is $\preceq b \in B$ whenever $b_0$ is comparable with some $b$. In addition, the smallest element is comparable with all elements of $B$.]

(iv) An element $u \in A$ is said to be an upper bound of $B$ if $b \preceq u$ $\forall b \in B$. An element $l \in A$ is said to be a lower bound of $B$ if $l \preceq b$ $\forall b \in B$. If $B$ has lower and upper bounds then $B$ is called bounded (or $\preceq$-bounded).

(v) If the set of upper bounds of $B$ has a smallest element $u_0$ then this element is called the least upper bound of set $B$ (abbreviated lub($B$)) or supremum (sup($B$)). Similarly, if the set of all lower bounds has a largest element $l_0$ then it is called the greatest lower bound of the set $B$ (in notation glb($B$)) or infimum (inf($B$)). [For instance, 0 is the glb((0,1)) or inf(0,1) in $(\mathbb{R}, \le)$, while a lub of the set $[1,\sqrt{2}) \cap \mathbb{Q}$ does not exist in $(\mathbb{Q}, \le)$.]

(vi) Let $B$ contain at least two points. The partial order $(B, \preceq)$ is called a lattice if every two-element subset of $B$ has a supremum and an infimum and they are also elements of $B$. [In notation: if $B = \{x,y\}$, then $x \vee y = \sup\{x,y\}$ and $x \wedge y = \inf\{x,y\}$.]
4.8 Examples.
(i) Let $B = \{1,3,3^2,\ldots,3^n,\ldots\}$. Then $(B, |)$ (where $|$ is the relation in Example 4.2 (v)) is a chain in $(\mathbb{N}, |)$.

(ii) Let $B = \{2,3,4,\ldots\}$ and consider the relation $|$ on $B$. In terms of this relation, the set of all prime numbers $\{2,3,5,7,11,\ldots\}$ is the set of all minimal elements, while there is no smallest element in $B$, since there is no minimal element related to all other elements. $B$ does not have a maximal element either.

(iii) Consider the partial order $(\mathcal{P}(\Omega), \subseteq)$. It is obvious that for an arbitrary subcollection $\mathcal{A} = \{A_i \subseteq \Omega : i \in I\} \subseteq \mathcal{P}(\Omega)$, it is true that

$$\sup \mathcal{A} = \bigcup_{i \in I} A_i \in \mathcal{P}(\Omega) \quad\text{and}\quad \inf \mathcal{A} = \bigcap_{i \in I} A_i \in \mathcal{P}(\Omega).$$

In particular, it holds true for pairs of subsets. Thus, $(\mathcal{P}(\Omega), \subseteq)$ is a lattice.

4.9 Definition. A linear order $(A, \preceq)$ is said to be well-ordered if every nonempty subset of $A$ has a smallest element in the sense of the same order $\preceq$.
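The distinction between minimal and smallest elements made in Definitions 4.7 and in Example 4.8 (ii) can be observed on a finite truncation of $B = \{2,3,4,\ldots\}$. The cut-off at 30 and the helper names below are our own assumptions, so this is only an illustration of the definitions on a finite piece of the example.

```python
def minimal_elements(B, leq):
    """Elements b0 of B such that b <= b0 (in the given order) forces b == b0."""
    return [b0 for b0 in B if all(b == b0 for b in B if leq(b, b0))]


def smallest_element(B, leq):
    """The element lying below every element of B, or None if there is none."""
    for b0 in B:
        if all(leq(b0, b) for b in B):
            return b0
    return None


divides = lambda a, b: b % a == 0     # a | b
usual = lambda a, b: a <= b           # the usual order on integers

B = list(range(2, 31))

print(minimal_elements(B, divides))   # the primes up to 30
print(smallest_element(B, divides))   # no smallest element under divisibility
print(smallest_element(B, usual))     # under the usual order there is one
```

Under divisibility the minimal elements are exactly the primes in the range and no smallest element exists, whereas under the usual order the same set has a smallest element, in line with Definition 4.9.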
4.10 Example. Let $\mathbb{R}$ be the set of all real numbers and consider the relation $(\mathbb{R}, \le)$, which is clearly a linear order. However, $\mathbb{R}$ is not well-ordered by $\le$, for there are nonempty subsets containing no smallest element, such as $(0,1)$. But $(\mathbb{N}, \le)$ is well-ordered.
Can all sets be well-ordered? This is one of the fundamental questions in set theory posed by Georg Cantor in the 1870's. Cantor considered it obvious that every set can indeed be well-ordered. At that time set theory was not well-postulated yet. In 1908, Ernst Zermelo formulated his axiom of choice and showed in his paper, Untersuchungen über die Grundlagen der Mengenlehre, that the axiom of choice is equivalent to the "well-ordering principle." The axiom of choice was included in an axiom scheme for set theory that was later (1922) strengthened by A. Fraenkel in his paper, Zu den Grundlagen der Cantor-Zermeloschen Mengenlehre. Zermelo and Fraenkel introduced the following notions. Let $\mathcal{S}$ be a collection of nonempty sets. A function $c$ defined on $\mathcal{S}$ is called a choice function if for each $S \in \mathcal{S}$, $c(S) \in S$. In other words, $c$ assigns to each set exactly one element of the set. Or, less formally, we can choose exactly one element from each set. Observe that if $\mathcal{S}$ is an indexed set, i.e., $\mathcal{S} = \{S_i : i \in I\}$, then we have $f(i) = c(S_i) \in S_i$. The axiom of choice is formulated in this way: Every system of nonempty sets has a choice function.
Zermelo proved that a nonempty set $A$ can be well-ordered if and only if its power set $\mathcal{P}(A)$ has a choice function. [There will be a short discussion of the axiom of choice in the upcoming sections.]

4.11 Theorem (Zermelo). The axiom of choice is equivalent to the well-ordering principle.

4.12 Examples.
(i) To illustrate a use of the axiom of choice, consider the following example. Let $[X,Y,f]$ be an onto map. We show that there exists a subset $A \subseteq X$ such that $\mathrm{Res}_A f : A \to Y$ is bijective. Let $c$ be a choice function for the factor set $\{f^{-1}(y) : y \in Y\}$ of $X$ modulo $E_f$. Then the set

$$A = \{c(f^{-1}(y)) : y \in Y\}$$

has the desired property. In other words, we choose one $x$ from $f^{-1}(y)$ for each $y$ and the collection of all these $x$'s is $A$.

(ii) Let $A = \{c(\tan^{-1}(y)) = \arctan y : y \in \mathbb{R}\}$. Then $A = (-\frac{\pi}{2}, \frac{\pi}{2})$ and hence $[A,\mathbb{R},\mathrm{Res}_A \tan]$ is a function such that it is one-to-one and

$$(\mathrm{Res}_A \tan)^{-1} = \arctan.$$

One of the central results in set theory is Zorn's Lemma [1935], which is widely used in set theory and which is also equivalent to the axiom of choice.

4.13 Lemma (Zorn). If each chain in a partially ordered set $A$ has an upper bound, then $A$ has a maximal element.
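The construction of Example 4.12 (i) can be imitated for a finite onto map, where no appeal to the axiom of choice is needed because an explicit rule of choice is available. In the sketch below the map $f(x) = |x|$ on $X = \{-3,\ldots,3\}$ and the rule `choose` (take the least element) are our own assumptions, made only to show how the set $A$ of chosen representatives arises.

```python
def fibers(X, f):
    """The pre-images f^{-1}(y), y in the range of f, as a dict y -> set of points."""
    out = {}
    for x in X:
        out.setdefault(f(x), set()).add(x)
    return out


def choose(S):
    """A hypothetical choice function for finite sets of integers: pick the least element."""
    return min(S)


X = range(-3, 4)
f = lambda x: abs(x)                      # onto Y = {0, 1, 2, 3}
Y = {0, 1, 2, 3}

A = {choose(S) for S in fibers(X, f).values()}
restricted = {x: f(x) for x in A}         # the restriction of f to A, as a dict

print(sorted(A), restricted)
```

Each value in $Y$ is attained exactly once on $A$, so the restriction of $f$ to $A$ is bijective onto $Y$.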
PROBLEMS

4.1 Show that the relation in Example 4.2 (vi) is an equivalence relation on $\mathbb{Z}$. Give the equivalence classes for $p = 4$.

4.2 Classify the following binary relations.

a) Let $\Omega$ be a nonempty set. Define the relation $(\mathcal{P}(\Omega), \subseteq)$.
b ) Let R = W2\(s,0). Define R: (a,b)R(c,d)e~ ad = bc. 4.3
The following theorem is a statement of the principle of mathematical induction:

Let $S(n)$ be a statement which is true or false, for $n = 1,2,\ldots$. Let $S(1)$ be true and let $S(n)$'s being true imply that $S(n+1)$ is true, $n = 1,2,\ldots$. Then $S(n)$ is true for all $n$.

Prove it. [Hint: Use the well-ordering principle.]

4.4 Prove that

$$\sum_{i=1}^{n} i^2 = \tfrac{1}{6}\, n(n+1)(2n+1).$$

4.5 Show that $(\mathbb{N}, |)$ in Example 4.2 (v) is a partial order relation. Is $(\mathbb{N}, |)$ a lattice?

4.6 Is $(\mathbb{R}, \le)$ a lattice?

4.7 Is $((1,3), \le)$ a lattice?

4.8 Is the set of all continuous real-valued functions a lattice?

4.9 Is the set of all real-valued polynomials a lattice?

4.10 Prove Theorem 4.4.

4.11 Prove Corollary 4.5.

4.12 Prove Corollary 4.6.
NEW TERMS: binary relation on a set (reflexive, symmetric, antisymmetric, transitive, equivalence, partial order, linear (total) order), comparable elements, equivalence class modulo $\approx$ ($E$), quotient (factor) set, projection of a set on its quotient, congruence, congruence modulo $p$, modulus of congruence, equivalence classes generated by a function, equivalence kernel of a function, chain, minimal element, smallest element, maximal element, largest element, upper bound, lower bound, bounded set, least upper bound (supremum), greatest lower bound (infimum), lattice, well-ordered set, well-ordering principle, choice function, axiom of choice, Zermelo's Theorem, Zorn's Lemma, principle of mathematical induction
5. CARTESIAN PRODUCT The idea of the Cartesian product (or, equivalently, direct product) primarily belongs to René Descartes, who introduced this notion for two sets $X$ and $Y$ as a set of all ordered pairs $\{(x,y) : x \in X \text{ and } y \in Y\}$. Descartes was also the one who introduced the widely used Cartesian coordinate system related to the Cartesian product. In Definition 2.1, we introduced the notion of the Cartesian product of finitely many sets. We are going to extend this definition to arbitrarily many sets. We begin with sequences of sets.
5.1 Definitions.

(i) Let $(Y_i : i = 1,2,\ldots)$ be a sequence of arbitrary sets. Then the Cartesian product of this sequence is the set of all sequences

$$\prod_{i=1}^{\infty} Y_i = \{(y_1,y_2,\ldots) : y_i \in Y_i,\ i = 1,2,\ldots\}$$

of elements from $Y_1, Y_2, \ldots$.

(ii) In the general case, let $\{Y_x : x \in X\}$ be an indexed family of sets. Then the Cartesian product (see Figure 5.1)

$$\prod_{x \in X} Y_x = \Big\{f : X \to \bigcup_{x \in X} Y_x \ :\ f(x) \in Y_x,\ x \in X\Big\}$$

is the collection of all functions defined on the index set $X$ and valued in $Y_x$. Each such function is a choice function for the family $\{Y_x : x \in X\}$.

Figure 5.1

5.2 Remarks.
(i) One of the basic questions that arises is this: when is the Cartesian product nonempty? Obviously, if at least one set $Y_k = \emptyset$, then $\prod_{x \in X} Y_x = \emptyset$. But if all $Y_x \neq \emptyset$, is the Cartesian product $\prod_{x \in X} Y_x$ necessarily nonempty? Although the answer may seem obvious, we must turn to the axiom of choice. In other words, the Cartesian product of a family of sets is nonempty if and only if there exists at least one choice function for this family.

(ii) We said that the Cartesian product of the family of sets $\{Y_x : x \in X\}$ is the collection of all functions from $X$ to $Y_x$, $x \in X$. In particular, if $Y_x = Y$ for all $x \in X$, then the Cartesian product is the collection of all functions from $X$ to $Y$ and is naturally denoted by $Y^X$. Alternatively, the set $Y^X$ is also denoted by $\mathcal{F}(X;Y)$.

(iii) Let $X$ be an arbitrary set. Then every subset $A \subseteq X$ can be associated with its indicator function $\mathbf{1}_A$. Conversely, $A = \{x \in X : \mathbf{1}_A(x) = 1\}$. Therefore, we can set a one-to-one correspondence between $\mathcal{P}(X)$ and the set of all indicator functions indexed with all subsets of $X$. On the other hand, the set of all such indicator functions is in fact the set of all (binary) functions of type $f: X \to \{0,1\}$. [Indeed, if $f$ is a binary function, $\{x \in X : f(x) = 1\} = B$ is a subset of $X$. Thus, $f = \mathbf{1}_B$.] This set, by the above definition, is the Cartesian product of the family of sets $Y_x = \{0,1\}$, where the index $x$ runs over $X$, in notation, $\{0,1\}^X$. So, we have shown that $\mathcal{P}(X)$ is "equipotent" (i.e., in a bijective correspondence) with the set $\{0,1\}^X$ of all functions $f: X \to \{0,1\}$.
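Remarks 5.2 can be made concrete for a small index set. The sketch below assumes a three-element set $X$ and represents functions as Python dictionaries; the names `indicator`, `support` and `cartesian` are ours. It enumerates $\{0,1\}^X$, matches it with $\mathcal{P}(X)$ through indicator functions, and shows that a product with an empty factor has no elements, that is, no choice function.

```python
from itertools import combinations, product

X = ['a', 'b', 'c']

# {0,1}^X: all functions from X to {0,1}, each stored as a dict.
binary_functions = [dict(zip(X, bits)) for bits in product([0, 1], repeat=len(X))]

# P(X): all subsets of X.
power_set = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def indicator(A):
    """The indicator function of A as a dict x -> 0 or 1."""
    return {x: 1 if x in A else 0 for x in X}


def support(f):
    """The subset {x : f(x) = 1} attached to a binary function f."""
    return frozenset(x for x in X if f[x] == 1)


def cartesian(family):
    """All choice functions of an indexed family {index: set}, each stored as a dict."""
    keys = list(family)
    return [dict(zip(keys, values)) for values in product(*(family[k] for k in keys))]


print(len(power_set), len(binary_functions))
print(cartesian({'a': {1, 2}, 'b': {3}}))
print(cartesian({'a': {1, 2}, 'b': {3}, 'c': set()}))   # an empty factor empties the product
```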
5.3 Definitions.

(i) Let $\{Y_x : x \in X\}$ be a collection of sets. The map

$$\pi_a : \prod_{x \in X} Y_x \to Y_a$$

for each $a \in X$ is called the $a$th projection map if $\pi_a(f) = f(a)$, where $f \in \prod_{x \in X} Y_x$, $f(a) \in Y_a$. The point $f(a)$ is called the $a$th coordinate of $f$ and the space $Y_a$ is called the $a$th factor space. (See Figure 5.2.) [Observe that $\pi_a^*(\{f(a)\}) \neq \{f\}$ but it contains $\{f\}$. For instance, if $X = \{1,\ldots,n\}$ is finite,

$$\pi_a^*(\{f(a)\}) = Y_1 \times \cdots \times Y_{a-1} \times \{f(a)\} \times Y_{a+1} \times \cdots \times Y_n.$$

In general, $\pi_a^*(\{f(a)\}) = \prod_{x \in X} \tilde Y_x$, where $\tilde Y_x = Y_x$ for $x \neq a$, and $\tilde Y_a = \{f(a)\}$.]

Figure 5.2

(ii) Let $X = \{1,\ldots,n\}$ and let $A_i \subseteq Y_i$, $i = 1,\ldots,n$. The set $\prod_{i=1}^{n} A_i$ is called a rectangle or parallelepiped and it can be expressed in the form

$$\prod_{i=1}^{n} A_i = \bigcap_{i=1}^{n} \pi_i^*(A_i). \qquad (5.1)$$

(See Figure 5.3 below.) The notion of a parallelepiped can also be extended when the index set $X$ is arbitrary. Given $A_x \subseteq Y_x$, $x \in X$, the set $\prod_{x \in X} A_x$ is a parallelepiped with the alternative representation (5.1).

Figure 5.3

(iii) Now we introduce a more general notion of a projection map. Let $\{Y_x : x \in X\}$ be an arbitrary indexed family of sets and let $A \subseteq X$. Define

$$\pi_A : \prod_{x \in X} Y_x \to \prod_{a \in A} Y_a$$

and call it the $A$-projection map if $\pi_A(f) = f_*(A)$. Specifically, if $A = \{a\}$ we have $\pi_{\{a\}}(f) = f_*(\{a\})$ which, in contrast with definition (i), is a singleton.

Let $B \subseteq \prod_{a \in A} Y_a$. Then call $\pi_A^*(B)$ an $A$-cylinder with base $B$. An $A$-cylinder is called a rectangular cylinder if $B$ is a rectangle. If, in addition, $A$ is a finite set then the rectangular cylinder is called simple. A simple $A$-cylinder is called a unit cylinder if $A$ is a singleton. (See Figures 5.4-5.7.)
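5.4 Example. Let $A = \{a_1,a_2,\ldots,a_n\}$. Then,

$$\pi_{\{a_1,\ldots,a_n\}}(f) = f_*(\{a_1,\ldots,a_n\}) = \{f(a_1),\ldots,f(a_n)\},$$

and hence,

$$\pi^*_{\{a_1,\ldots,a_n\}}(\{f(a_1),\ldots,f(a_n)\})$$

is a $\{a_1,\ldots,a_n\}$-simple cylinder with base $\{f(a_1),\ldots,f(a_n)\}$.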
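For a product of finitely many finite sets, the coordinate projections and their inverse images can be enumerated, which gives a way to see why the inverse image of a single coordinate value is larger than one point and how a rectangle arises as an intersection of such inverse images. The sketch assumes three small factor spaces of our own choosing and stores points of the product as tuples; `proj` and `proj_preimage` are hypothetical names.

```python
from itertools import product

# Hypothetical factor spaces Y_1, Y_2, Y_3 indexed by X = {1, 2, 3}.
Y = {1: {0, 1}, 2: {0, 1, 2}, 3: {0, 1}}
index = sorted(Y)

# Points of the product are tuples (f(1), f(2), f(3)).
Z = set(product(*(Y[i] for i in index)))


def proj(a, f):
    """The a-th coordinate f(a) of the point f."""
    return f[index.index(a)]


def proj_preimage(a, A):
    """All points of the product whose a-th coordinate lies in A."""
    return {f for f in Z if proj(a, f) in A}


f = (1, 2, 0)
unit = proj_preimage(2, {proj(2, f)})     # contains f, but is larger than {f}

# A rectangle A_1 x A_2 x A_3 and the intersection of the coordinate inverse images.
A = {1: {0}, 2: {1, 2}, 3: {0, 1}}
rectangle = set(product(*(A[i] for i in index)))
as_intersection = set.intersection(*(proj_preimage(i, A[i]) for i in index))

print(len(Z), len(unit), rectangle == as_intersection)
```

In this example the two descriptions of the rectangle coincide, and the inverse image of the second coordinate of `f` consists of every point sharing that coordinate.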
Figure 5.4

Figure 5.5

Figure 5.6

Figure 5.7. A-cylinder with base R
PROBLEMS

5.1 Let $Z = \prod_{x \in X} Y_x$, let $A_x \subseteq Y_x$ and let $A = \prod_{x \in X} A_x$, where $A_x = Y_x$ except for finitely many values of the index $x$, say $1,\ldots,n \in X$. Show that

$$A = \bigcap_{k=1}^{n} \pi_k^*(A_k).$$
Let Z =
n Y,, and let A,, 00
Y,, where for each x = 1,2,..., the
x=1 . . -
sequence of sets {A,,}
C
...) with sup{A,,:
is monotone nondecreasing (i.e. A,,
n = 1,2,. ..) =
00
U A,, n=l
E A2,
= Y , for x = 2,3,. .. .
.
Also assume that A,, = A2, = A,, = .. = A,. Show that
n
00
sup{
x=l
A,,:
.)
n=1,2,. . = .;(A,).
5.3 a) Draw $\pi_A(f)$ for $f(x) = x^2$.
b) Draw $\pi_A^*(B)$ for $B = (0,1) \times (0,1)$.

5.4 Let $\{Y_x : x \in X\}$ and $\{Z_x : x \in X\}$ be two families of sets. Show that

5.5 Let $m,n \in \mathbb{N}$ and $Y \neq \emptyset$.
a) For $m \le n$, find an injective map $[Y^m, Y^n, f]$.
b) Find an injective map $[Y^n, Y^{\mathbb{N}}, f]$.
c) Find a bijective map $[Y^n \times Y^{\mathbb{N}}, Y^{\mathbb{N}}, f]$.
d) Find a bijective map $[Y^{\mathbb{N}} \times Y^{\mathbb{N}}, Y^{\mathbb{N}}, f]$.
e) For $A \subseteq X$, find an injective map $[Y^A, Y^X, f]$.
NEW TERMS: Cartesian product of a sequence, Cartesian product of an indexed family of sets, projection map, coordinate, factor space, rectangle, parallelepiped, A-projection map, cylinder, rectangular cylinder, simple cylinder, unit cylinder
6. CARDINALITY One of the main perplexities in the theory of sets is finding a criterion for their "powers." We can overcome this difficulty when considering the class of "finite" sets. (We frequently operate with the term "finite", though we did not give any rigorous definition.) We can easily define an equivalence relation in this class, for example, by introducing $C_n$ as the class of all $n$-element sets for every $n \in \mathbb{N}_0$. A partial order relation in this class would act as an appropriate comparison among sets from various classes. Sets $A$ and $B$ are said to be compared, in notation $A \preceq B$, when and only when $A \in C_n$, $B \in C_s$ and $n \le s$. Then we could assign to set $A$ the number $n$ and call it the cardinal number of $A$. Doing this, however, we would experience real difficulties when introducing "countable" and "uncountable" sets. Specifically, we would fail to operate with cardinal numbers as numbers in the usual sense. (Pursuing this philosophy we readily encounter contradictions, the most frequent phenomenon in set theory.) The basic principles of the formalism of cardinality belong to Georg Cantor, who was the first to introduce a well-structured concept of "infinity" in his pioneering work done in the 1870's and 1880's. We will present a rather informal version of cardinality, sufficient for the analysis presented in this book. A curious reader is referred to special monographs on set theory. We will start with comparison ideas based on finite sets, ideas that enable us to deal with infinite sets as well.
6.1 Definitions.
(i) Two sets $A$ and $B$ are said to be equipotent if there is a bijective function $f: A \to B$. In this case we write $|A| = |B|$ (or $A \approx B$) and also say that $A$ and $B$ have equal cardinality.
(ii) If there exists a one-to-one function $f: A \to B$, then we say that the cardinality of $A$ is less than or equal to the cardinality of $B$, in notation $|A| \le |B|$ or $A \preceq B$. If $|A| \le |B|$ and $|A| \ne |B|$ we shall write $|A| < |B|$ or $A \prec B$.
(iii) A cardinal number is an equivalence class containing all sets that are "$\approx$-comparable." [For some cardinal numbers we will be using the same notation as for regular numbers.]
(iv) Let 0 denote the cardinal number of the empty set $\emptyset$ (the only representative of this class). Note that 0 is not a number but the class containing $\emptyset$. Thus, $|\emptyset| = 0$.
(v) Similarly, the cardinal number $n$ is the equivalence class containing the set $\{1,\ldots,n\}$. Therefore, a set $A$ is finite if it is equipotent with some set of cardinal number $n$, such that the integer number $n$ is an element of $\mathbb{N}$, i.e., $|A| = |\{1,\ldots,n\}| = n$. A set that is not finite is called infinite. [One can easily show that $\mathbb{N}$ is infinite.]
(vi) A set $A$ is said to be countable or denumerable if it is equipotent with $\mathbb{N}$, and in this case we write $|A| = \aleph_0$ (pronounced aleph nought). A set $A$ is called at most countable if $|A| \le |\mathbb{N}|$ or $A \preceq \mathbb{N}$.
(vii) An infinite set which is not countable is called uncountable.
(viii) A set $A$ is said to have the cardinality of continuum if it is equipotent with the set $\mathbb{R}$ of real numbers, and we write $|A| = \mathfrak{c}$. [We show below that $\aleph_0 < \mathfrak{c}$.] □
6.2 Remark. For every set $\Omega$, the property $|A| = |B|$ induces an equivalence relation on the power set $\mathcal{P}(\Omega)$, while $|A| \le |B|$ induces a partial order on $\mathcal{P}(\Omega)$ (see Problem 6.1). □
(i) If sets $A$ and $B$ have only finitely many elements, then $A \approx B$ if and only if they have the same number of elements.
(ii) In contrast with finite sets, an infinite set can be equipotent with a proper subset of itself. Consider $A = \{1,3,5,\ldots\} \subseteq \mathbb{N}$ and define $f(n) = 2n - 1$, $n \in \mathbb{N}$. Then $f: \mathbb{N} \to A$ is bijective and $\mathbb{N} \approx A$.
(iii) $\mathbb{N}_0 \approx \mathbb{N}_0 \times \mathbb{N}_0$. Indeed, the function
$$f(k,n) = 2^k(2n+1) - 1$$
is bijective from $\mathbb{N}_0 \times \mathbb{N}_0$ to $\mathbb{N}_0$. Similarly, $\mathbb{N} \approx \mathbb{N} \times \mathbb{N}$.
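The bijectivity claimed in (iii) can be checked on a finite range. The following is a minimal sketch, not part of the original text; the function names `pair` and `unpair` are illustrative choices. It relies on the fact that every positive integer factors uniquely as a power of 2 times an odd number.

```python
def pair(k, n):
    """Map (k, n) in N0 x N0 to 2^k (2n + 1) - 1."""
    return 2**k * (2 * n + 1) - 1


def unpair(m):
    """Inverse map: write m + 1 as 2^k * (2n + 1) and return (k, n)."""
    m += 1
    k = 0
    while m % 2 == 0:   # strip factors of 2
        m //= 2
        k += 1
    return k, (m - 1) // 2   # the remaining odd factor is 2n + 1


# Round trips in both directions on a finite range.
onto_ok = all(pair(*unpair(m)) == m for m in range(1000))
one_to_one_ok = all(unpair(pair(k, n)) == (k, n)
                    for k in range(10) for n in range(10))
```

The two flags test surjectivity and injectivity separately; a finite check of this kind illustrates the claim but does not replace the proof.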
(iv) Let $\{A_1, A_2, \ldots\}$ be a countable family of countable sets. Then its union $A = \bigcup_{n=1}^{\infty} A_n$ is countable. To construct an appropriate bijective map we first represent $A$ as a countable union of disjoint sets. Let
$$B_1 = A_1,\quad B_2 = A_2 \setminus A_1,\ \ldots,\ B_n = A_n \setminus \bigcup_{k=1}^{n-1} A_k \ (\text{for } n > 1),\ \ldots.$$
Then, clearly, $A = \sum_{n=1}^{\infty} B_n$. Without loss of generality, we assume that each set $B_n$ is countable (in general, any set $B_n$ may also be at most countable) and, therefore, can be enumerated as $B_n = \{b_{n1}, b_{n2}, \ldots\}$, $n = 1,2,\ldots$. We can place these sets in the form of a matrix:
$$\begin{array}{cccc} b_{11} & b_{12} & b_{13} & \cdots \\ b_{21} & b_{22} & b_{23} & \cdots \\ b_{31} & b_{32} & b_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{array}$$
Now the desired bijective map is $f(1) = b_{11}$, $f(2) = b_{12}$, $f(3) = b_{21}$, $f(4) = b_{31}$, $f(5) = b_{22}$, $f(6) = b_{13}, \ldots$, from $\mathbb{N}$ to $A$.
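The enumeration above walks the matrix along its anti-diagonals, reversing direction on each one. A minimal sketch of that index order follows; the generator name `zigzag` is an illustrative choice, and the pairs it yields are the index pairs $(n, j)$ of the entries $b_{nj}$.

```python
from itertools import islice


def zigzag():
    """Yield index pairs (n, j) in the order (1,1), (1,2), (2,1), (3,1), (2,2), (1,3), ..."""
    d = 2  # n + j is constant on each anti-diagonal
    while True:
        rows = range(1, d)          # n = 1, ..., d - 1
        if d % 2 == 0:
            rows = reversed(rows)   # even diagonals are walked upward
        for n in rows:
            yield (n, d - n)
        d += 1


first_six = list(islice(zigzag(), 6))
first_block = list(islice(zigzag(), 55))  # all pairs with n + j <= 11
```

Every pair $(n, j)$ appears exactly once, on the diagonal $d = n + j$, which is why the walk defines a bijection from $\mathbb{N}$ onto the index set.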
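(v) The set $\mathbb{Q}$ of rational numbers is countable. Indeed, writing each nonnegative rational in lowest terms as $\frac{m}{n}$ (and assigning $(0,0)$ to zero), the function $f\left(\frac{m}{n}\right) = (m,n)$ is one-to-one from the nonnegative rationals to $(\mathbb{N} \times \mathbb{N}) \cup \{(0,0)\}$. The latter is countable by (iii). The negative rationals are treated in the same way, and $\mathbb{Q}$ is then countable by (iv).
(vi) We can show that $\aleph_0 < \mathfrak{c}$. Clearly, $\aleph_0 \le \mathfrak{c}$. Then it is sufficient to show that $\mathbb{N} \prec [0,1]$, since $[0,1] \approx \mathbb{R}$ (see Problem 6.5). If a bijective function $f: \mathbb{N} \to [0,1]$ exists, then $f(n)$ is of type $0.a_{n1}a_{n2}\ldots$. Now define the number $0.b_1b_2\ldots$ such that
$$b_i = 3 \ \text{ if } a_{ii} \ne 3 \quad\text{and}\quad b_i = 5 \ \text{ if } a_{ii} = 3, \quad i = 1,2,\ldots.$$
Then the number $b := 0.b_1b_2\ldots$ cannot appear among the values of $f(n)$, for it differs from $f(n)$ at the $n$th place. On the other hand, $b \in [0,1]$, which contradicts the assumption that $f$ is onto. Thus $\aleph_0 < \mathfrak{c}$. Observe that some rational numbers (those with a terminating decimal expansion) have two representations, e.g. 0.1 and 0.0999.... That means we have to be careful about different numbers above; since $b$ uses only the digits 3 and 5, its expansion is unique, and differing in one digit indeed yields a different number. □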
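The diagonal construction in (vi) can be illustrated on a finite list of digit strings. This is only a sketch on made-up sample data (the rows below are arbitrary and hypothetical); the real argument concerns infinite expansions.

```python
def diagonal(rows):
    """Return digits b_1 b_2 ... with b_i = 3 if a_ii != 3 and b_i = 5 if a_ii == 3."""
    return "".join("5" if row[i] == "3" else "3" for i, row in enumerate(rows))


# Row n plays the role of the expansion 0.a_n1 a_n2 ... of f(n).
rows = ["1415926", "3333333", "7182818", "0000000",
        "5353535", "9999999", "1234567"]
b = diagonal(rows)

# b differs from the n-th row in the n-th digit, so it equals none of them.
differs_everywhere = all(b[i] != rows[i][i] for i in range(len(rows)))
```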
The following theorem is one of the central results in set theory.
6.4 Theorem (Cantor). $A \prec \mathcal{P}(A)$ for every set $A$.
Proof. The result holds trivially for any finite set $A$ (see Problem 1.4). Specifically, for the empty set, $|\emptyset| = 0$, while $|\mathcal{P}(\emptyset)| = 1$. Since $\mathcal{P}(A)$ contains all singletons, it immediately follows that $A \preceq \mathcal{P}(A)$. To show that $|A| \ne |\mathcal{P}(A)|$, we assume that $A \approx \mathcal{P}(A)$ and derive a contradiction. By our assumption, there exists a bijective map $f: A \to \mathcal{P}(A)$. Then each element $a$ in $A$ is assigned a subset $f(a)$ of $A$, and $a$ may belong to $f(a)$ or may not. We then define $B = \{a \in A: a \notin f(a)\}$. $B$ is nonempty, since there exists at least one element $a_0 \in A$ assigned to $\emptyset$, and then $a_0 \notin f(a_0)$. Since $f$ is onto, we can pick a point $b \in A$ such that $f(b) = B$. By the definition of $B$, $b \in B \Leftrightarrow b \notin f(b) = B$, and this is a contradiction.
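For a small finite set the key step of the proof can be verified exhaustively. The sketch below assumes $A = \{0,1,2\}$ as sample data, runs through all $8^3 = 512$ maps $f: A \to \mathcal{P}(A)$, and confirms that the set $B = \{a \in A : a \notin f(a)\}$ is never a value of $f$. The helper names are illustrative.

```python
from itertools import combinations, product


def powerset(A):
    """All subsets of A as frozensets."""
    A = list(A)
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]


def cantor_set(A, f):
    """B = {a in A : a not in f(a)} for a map f: A -> P(A) given as a dict."""
    return frozenset(a for a in A if a not in f[a])


A = [0, 1, 2]
P = powerset(A)

# images[i] is f(A[i]); B must be missing from the image of every such f.
missed = all(
    cantor_set(A, dict(zip(A, images))) not in images
    for images in product(P, repeat=len(A))
)
```

Note that the check uses the same set $B$ as the proof and does not need $f$ to be injective, so no map from $A$ to $\mathcal{P}(A)$ is onto.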
(i) In Remark 5.2 (iii) we showed that the power set $\mathcal{P}(X)$ of a set $X$ is equipotent with the set $\{0,1\}^X$ of all functions $f: X \to \{0,1\}$. Note that 2 is the cardinal number of the set $\{0,1\}$. Thus, we conclude that $|\mathcal{P}(X)| = 2^{|X|}$ (where we set $|B|^{|A|} = |B^A|$). In particular, if $|X| = |\mathbb{N}|$ then $|\mathcal{P}(\mathbb{N})| = 2^{\aleph_0}$. An interesting fact is that $2^{\aleph_0} = \mathfrak{c}$, the proof of which is left for the reader as an exercise (see Problem 6.6).
(ii) The continuum hypothesis (in its generalized form) states that if $\mathfrak{A}$ is an infinite cardinal, then there is no cardinal $\mathfrak{B}$ such that $\mathfrak{A} < \mathfrak{B} < 2^{\mathfrak{A}}$. This was conjectured by Cantor for $\mathfrak{A} = \aleph_0$. In 1900 David Hilbert included the "continuum problem" as Problem #1 in his famous list of open problems in mathematics. In 1940 Kurt Gödel proved that the continuum hypothesis is consistent with (i.e. does not contradict) the axioms of set theory (axiom of existence, axiom of choice, etc.). In 1963 Paul Cohen [1966] showed that the continuum hypothesis is independent of the axioms.
(iii) The cardinal number $2^{\mathfrak{c}}$ is called the hypercontinuum. For example, the set $\mathcal{P}(\mathbb{R})$ has the hypercontinuum cardinal.
Supplementary Historical Note. Modern set theory was founded by Georg Cantor in a sequence of several articles that appeared between 1870 and 1880. One of these articles, Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen, appeared in Crelle's Journal in 1874, and is said to have given birth to set theory. Georg Cantor was born of Danish parents (both of Jewish descent) in St. Petersburg, Russia, in 1845, and lived there until 1856, when his parents moved to Frankfurt, Germany. Cantor began his university studies at Zürich in 1862. After one semester at Zürich he moved to Berlin University, where he attended lectures of Weierstrass, Kummer and Kronecker. Leopold Kronecker later became Cantor's main opponent, criticizing his concept of infinity and regarding it as theology and not as mathematics. (Cantor, whose mother was a Catholic and father a Protestant, was a devoted Protestant and an active theologian. The latter became a major target of attacks by Cantor's liberal opponents at Berlin University.) In 1867 Cantor received his Ph.D. (in number theory) from Berlin University. His dream of getting a teaching position at Berlin University never came true, primarily due to the opposition of Kronecker. In 1869 Cantor was appointed at Halle University, where he remained until his retirement in 1913.
Cantor died in a mental hospital in Halle in 1918. In 1925 David Hilbert recognized Cantor's concept of infinity. He said, "No one can drive us from the paradise that Cantor created for us."
PROBLEMS.
6.1 Show the validity of the statement in Remark 6.2.
6.2 Prove the Schröder-Bernstein Theorem: If $A \preceq B$ and $B \preceq A$, then $A \approx B$.
6.3 We call an algebraic number any root of a polynomial with integer coefficients. What is the cardinal number of all algebraic numbers? [Hint: Use Problem 6.2.]
6.4 Prove that every subset of a countable set is at most countable. [Hint: Use the well-ordering principle.]
6.5 Show that $\mathbb{R} \approx [0,1]$. [Hint: Show that $[0,1] \approx (0,1)$.]
6.6 Show that $2^{\aleph_0} = \mathfrak{c}$.
6.7 Let $[X,Y,f]$ be a surjective map. Show that there is a subset of $X$ equipotent with $Y$.
6.8 Let $[X,Y,f]$ be an injective map, where $Y$ is countable, and let $f^{*}(\{y\})$ be a countable set for each $y \in Y$. Must $X$ be countable?
6.9 Let $A$ be an uncountable set and let $B \subseteq A$ be countable. Show that $A \setminus B$ is uncountable.
6.10 Prove the statement: Every infinite set contains a countable subset.
6.11 What is the cardinal number of all polynomials whose coefficients are algebraic numbers?
6.12 Show that the set of all finite subsets of $\mathbb{N}$ is countable.
6. Cardinality
NEW TERMS: cardinal number 40 equipotent sets 40 finite set 41 countable (denumerable) set 41 $\aleph_0$ 41 at most countable set 41 uncountable set 41 continuum 41 Cantor's Theorem 42 continuum hypothesis 43 hypercontinuum 43 Schröder-Bernstein Theorem 44 algebraic number 44
7. BASIC ALGEBRAIC STRUCTURES
Algebra is a mathematical discipline that studies algebraic structures. The most rudimentary algebraic operations with natural and positive rational numbers were already encountered in ancient mathematical texts. The famous book, "Arithmetics," by the Greek Diophantos (of Alexandria) in the third century A.D., had a significant influence on the development of algebraic formalism. The term "algebra" stems from the text Al-jabr wa'l-mukhabala (by Muhammad al-Khowarismi in the ninth century A.D.), which dealt with solution techniques for various problems reducing to first and second order algebraic equations. Until the end of the fifteenth century, when the common symbols for the algebraic operations $+$, $-$, $\times$, power, roots and parentheses were introduced, one used cumbersome phrases and descriptions of algebraic expressions. François Viète, by the end of the sixteenth century, was the first to use letters to denote unknowns and parameters. The algebraic symbolism, as we know it now, has been used only since the middle of the seventeenth century. The Elementary Algebra (which deals with basic arithmetic operations on real numbers, first to fourth order algebraic equations, the binomial formula, and Diophantine equations) was completed by the middle of the eighteenth century. Leonhard Euler's Introduction to Algebra was one of the most prominent texts then. In the early nineteenth century algebra became furnished with five basic (commutative, associative and distributive) laws with respect to two algebraic operations, $+$ (addition) and $\cdot$ (multiplication). On the strength of Dirichlet's definition of a function, these operations were later declared to be binary operations based on the following definition. An operation on a set $A$ is a rule that assigns to each ordered subset $A_n \subseteq A$ of $n$ elements a uniquely defined element of the same set $A$. For $n = 1, 2,$ and 3, the operation is called unary, binary, and ternary, respectively.
The algebraic structures were formalized by the Britons George Peacock in 1830, Duncan Gregory in 1840, and Augustus De Morgan, and further refined by the Germans Hermann Hankel and Hermann Grassmann. Abstract algebra is regarded as having been born in 1846, when Joseph Liouville published Galois' theory (of solvability of polynomial equations) based on the group concept, which has been spreading within mathematics ever since. In 1872, the German Felix Klein published a program in which he proposed to formulate all of geometry as the study of invariants under groups of transformations. In 1883, the Norwegian mathematician Marius Sophus Lie published his fundamental work on continuous groups of transformations used in studies of continuous functions. Group theory, which is at the heart of contemporary abstract algebra, made prominent contributions to geometry, topology, and even physics in the 20th century.
In this section, we review some familiar algebraic structures. These will provide a basis for analysis shifting it to more abstract settings in the upcoming chapters.
7.1 Definitions.
(i) A set $\mathcal{G}$ with a binary algebraic operation $*$ (frequently called addition $+$ or multiplication $\cdot$) from $\mathcal{G} \times \mathcal{G}$ into $\mathcal{G}$ is called a semigroup, in notation $(\mathcal{G},*)$, if $*$ is associative. [Note that even though $+$ or $\cdot$ may denote addition and multiplication, they need not mean the conventional algebraic operations known for numbers.]
(ii) A semigroup $(\mathcal{G},*)$ is called a monoid if there is an element $I \in \mathcal{G}$ (called a two-sided identity) such that for all $x \in \mathcal{G}$, $x * I = I * x = x$.
(iii) A monoid $(\mathcal{G},*)$ is called a group if for each $x \in \mathcal{G}$ there is an inverse $x'$ such that $x * x' = x' * x = I$.
If $*$ is commutative (for a semigroup, monoid, or group), $(\mathcal{G},*)$ is called commutative or Abelian. If we use for $*$ the symbol $+$ or $\cdot$, then $(\mathcal{G},+)$ or $(\mathcal{G},\cdot)$ is referred to as additive or multiplicative, respectively.
If $(\mathcal{G},+)$ is additive, the element $I$, denoted by 0, is called zero, and the element $x'$, denoted by $-x$, is said to be an additive inverse of $x$. If $(\mathcal{G},\cdot)$ is multiplicative, the element $I$ is called the unity and denoted by 1. The element $x'$ is denoted by $x^{-1}$ and is said to be a multiplicative inverse of $x$.
(iv) A set $\mathcal{R}$ with addition $+$ and multiplication $\cdot$ from $\mathcal{R} \times \mathcal{R}$ into $\mathcal{R}$, i.e. a triple $(\mathcal{R},+,\cdot)$, is called a ring if:
a) $(\mathcal{R},+)$ is an Abelian group;
b) $\cdot$ is associative;
c) $\forall a,b,x \in \mathcal{R}$, $x \cdot (a+b) = x \cdot a + x \cdot b$ (called the left distributive law) and $(a+b) \cdot x = a \cdot x + b \cdot x$ (called the right distributive law).
Observe that multiplication need not be commutative in a ring. However, if this is the case, the ring is called commutative. A ring need not have a unity either; consequently, a ring equipped with a unity is called a ring with unity. [For instance, the set of all $n \times n$ matrices ($n \ge 2$) is a noncommutative ring with unity (the unit matrix).]
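(v) Let $(\mathcal{G},*)$ and $(\tilde{\mathcal{G}},\tilde{*})$ be two groups and let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be a map preserving the algebraic operations $*$ and $\tilde{*}$, i.e. such that
$$f(x * y) = f(x)\,\tilde{*}\,f(y), \quad x,y \in \mathcal{G}.$$
Then $f$ is called a (group) homomorphism of $\mathcal{G}$ into $\tilde{\mathcal{G}}$.
If $[\mathcal{G},\tilde{\mathcal{G}},f]$ is bijective then it is called an isomorphism. In this case, the groups $(\mathcal{G},*)$ and $(\tilde{\mathcal{G}},\tilde{*})$ are called isomorphic. If $[\mathcal{G},\tilde{\mathcal{G}},f]$ is a homomorphism, and in addition $(\mathcal{G},*) = (\tilde{\mathcal{G}},\tilde{*})$, then $[\mathcal{G},\tilde{\mathcal{G}},f]$ is said to be an endomorphism. If $[\mathcal{G},\tilde{\mathcal{G}},f]$ is an endomorphism and an isomorphism then it is called an automorphism. □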
A homomorphism preserves some (but not all) structural properties of groups, as the following theorem states.
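7.2 Theorem. Let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be a homomorphism. Then
(i) for each $x \in \mathcal{G}$, $f(x') = [f(x)]'$, and
(ii) $f(I) = \tilde{I}$, where $\tilde{I}$ is the identity of $\tilde{\mathcal{G}}$.
(See Problem 7.1.)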
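Both statements of the theorem can be tested on a small example. The sketch below uses the additive groups of integers modulo 6 and modulo 3 with the map $f(x) = x \bmod 3$; this choice of groups and all helper names are assumptions made for illustration and do not come from the text.

```python
def add_mod(n):
    """Addition modulo n as a binary operation."""
    return lambda x, y: (x + y) % n


G, op_G = range(6), add_mod(6)   # (Z_6, +), identity 0
H, op_H = range(3), add_mod(3)   # (Z_3, +), identity 0


def f(x):
    return x % 3


def inverse(x, elements, op, identity):
    """Two-sided inverse of x, found by exhaustive search."""
    return next(y for y in elements
                if op(x, y) == identity and op(y, x) == identity)


is_hom = all(f(op_G(x, y)) == op_H(f(x), f(y)) for x in G for y in G)
preserves_inverse = all(
    f(inverse(x, G, op_G, 0)) == inverse(f(x), H, op_H, 0) for x in G
)
preserves_identity = f(0) == 0
```

The map is a homomorphism because 3 divides 6, so reducing modulo 6 first does not change the residue modulo 3.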
7.3 Definition. Let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be a homomorphism. Define
$$\operatorname{Ker} f = f^{*}(\{\tilde{I}\})$$
and call it the kernel of $f$.
7.4 Examples.
(i) The space of all continuous functions with operation $+$ forms an Abelian group. The same space is not a group with the operation of multiplication.
(ii) All polynomials with operation $+$ form an Abelian group.
(iii) $(\mathbb{Z},+)$, $(\mathbb{R},+)$ and $((0,\infty),\cdot)$ are Abelian groups; $(\mathbb{Z},\cdot)$ is an Abelian monoid.
(iv) The space $\mathbb{C} \setminus \{(0,0)\}$ with the operation "complex multiplication" is obviously an Abelian group, and $(\mathbb{C},+,\cdot)$ is a ring.
(v) Let $\mathcal{C}^n_{[a,b]} = \mathcal{C}^{(n)}([a,b];\mathbb{R})$ denote the space of all $n$ times continuously differentiable real-valued functions on $[a,b] \subset \mathbb{R}$. Then $(\mathcal{C}^n_{[a,b]},+)$ is a commutative group. If $D^n f$ denotes the $n$th derivative of a function $f$, then $[\mathcal{C}^n_{[a,b]}, \mathcal{C}_{[a,b]}, D^n]$ is a homomorphism of $(\mathcal{C}^n_{[a,b]},+)$ into $(\mathcal{C}_{[a,b]},+)$. Replacing $\mathcal{C}^n_{[a,b]}$ by the space $\mathcal{P}$ of all polynomials on $[a,b]$, we have $[\mathcal{P},\mathcal{P},D^n]$ as an endomorphism.
(vi) Consider the two groups $(\mathbb{R},+)$ and $((0,\infty),\cdot)$ and the function $f(x) = e^x$. Then $[\mathbb{R},\mathbb{R}_+,f]$ (with $\mathbb{R}_+ = (0,\infty)$ here) is an isomorphism. Indeed, $f(x+y) = f(x) \cdot f(y)$. In addition, $[\mathbb{R},\mathbb{R}_+,f]$ is bijective.
(vii) Let $\mathcal{F} = \mathcal{F}(X;Y) = Y^X$ be the space of all functions from $X$ into $Y$. Then $(\mathcal{F},\cdot)$ is a multiplicative monoid. For any nonnegative integer $n$ and $f \in \mathcal{F}$, define the unary operation power $f^n$ on $\mathcal{F}$ as: $f^0 = 1$, $f^{n+1} = f \cdot f^n$. The power has the properties $f^n f^k = f^{n+k}$ and $(f^n)^k = f^{nk}$. Note that the power can be defined on an arbitrary multiplicative monoid with the above properties.
(viii) A function $T$ from $\bar{\mathbb{C}}$ onto $\bar{\mathbb{C}}$ (where $\bar{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$) is called a bilinear transformation if
$$T(z) = \frac{az+b}{cz+d} \quad\text{with}\quad \begin{vmatrix} a & b \\ c & d \end{vmatrix} \ne 0.$$
Let $\mathcal{T}$ denote the set of all bilinear transformations. Then $(\mathcal{T},\circ)$, where $\circ$ stands for composition, is a (multiplicative) group where $1 = T$ with $a = d = 1$, $b = c = 0$. Indeed, it is readily seen that $T_1 \circ T_2$ and $T^{-1}$ are bilinear transformations, that $T \circ T^{-1} = T^{-1} \circ T = 1$, and that $\circ$ is associative.
(ix) Let $\mathcal{F} = \{[X,X,f_t];\ t \ge 0\}$ be an indexed family of functions and let $*$ be some binary operation defined on $\mathcal{F}$. $(\mathcal{F},*)$ is called a semigroup (of functions) if $f_0 = 1$ and for all $s,t \ge 0$, $f_s * f_t = f_{s+t}$. Obviously, the semigroup $(\mathcal{F},*)$ is a commutative monoid.
(x) Let $l^p$ ($\subseteq \mathbb{C}^{\mathbb{N}}$) be the space of all sequences such that for each $x = (x_0,x_1,\ldots)$, $\sum_{n=0}^{\infty} |x_n|^p < \infty$, where $p \in [1,\infty)$. Define the following operation on $l^p$. For $x$ and $y$, let $z = (z_0,z_1,\ldots) = x * y$ be such that
$$z_n = \sum_{k=0}^{n} x_k y_{n-k}$$
(called discrete convolution). The operation $*$ is commutative and associative and it is closed in $l^p$ (see Problem 7.11). Obviously, $1 = (1,0,0,\ldots)$ is the unity of $(l^p,*)$ and thus $(l^p,*)$ is an Abelian monoid. Let $x = (x_0,x_1,\ldots) \in l^p$ be such that $x_0 \ne 0$. Define $y = (y_0,y_1,\ldots)$ such that $y_0 = \frac{1}{x_0}$. For $n \ge 1$, $y_n$ can be determined recursively from the equations $\sum_{k=0}^{n} x_k y_{n-k} = 0$. For instance,
$$y_1 = -\frac{x_1 y_0}{x_0} = -\frac{x_1}{x_0^2}.$$
In conclusion, for each $x$ with $x_0 \ne 0$, there is a unique element $y = x^{-1}$. On the other hand, if $l_0^p$ denotes the subset of all elements $x \in l^p$ with $x_0 = 0$, then $l_0^p$ and its complement $l^p \setminus l_0^p$ relative to $l^p$ are two equivalence classes induced by $*$. This implies that $(l^p \setminus l_0^p, *)$ is a commutative group. Obviously, the triple $(l^p,+,*)$ is a commutative ring with unity.
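The recursion for the convolution inverse is easy to run on a truncated sequence. The following is a minimal sketch under stated assumptions: sequences are represented by their first few terms, exact rational arithmetic is used to avoid rounding, and the function names and the sample sequence $x = (1, -\tfrac12, 0, 0, \ldots)$ are illustrative choices.

```python
from fractions import Fraction


def convolve(x, y):
    """First min(len(x), len(y)) terms of z_n = sum_{k=0}^{n} x_k y_{n-k}."""
    n_terms = min(len(x), len(y))
    return [sum(x[k] * y[n - k] for k in range(n + 1)) for n in range(n_terms)]


def conv_inverse(x, terms):
    """First `terms` entries of y with x * y = (1, 0, 0, ...).

    Requires x[0] != 0 and len(x) >= terms.
    """
    y = [Fraction(1) / x[0]]
    for n in range(1, terms):
        # Solve x_0 y_n + sum_{k=1}^{n} x_k y_{n-k} = 0 for y_n.
        s = sum(x[k] * y[n - k] for k in range(1, n + 1))
        y.append(-s / x[0])
    return y


x = [Fraction(1), Fraction(-1, 2)] + [Fraction(0)] * 6
y = conv_inverse(x, 8)
unit = convolve(x, y)
```

For this sample the recursion gives $y_n = 2^{-n}$, and convolving back returns the first eight terms of the unity $(1,0,0,\ldots)$.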
Now, let $\mathcal{A}$ be the space of all complex-valued functions analytic at zero and not equal to zero at the origin. This space is closed with respect to multiplication. Hence, $(\mathcal{A},\cdot)$ is an Abelian group. Indeed, $u \equiv 1$ is the unity and for each $x \in \mathcal{A}$, $\frac{1}{x}$ is analytic at zero and it is a two-sided inverse of $x$. Obviously, each $x \in \mathcal{A}$ can be expanded in a Taylor series at zero, such that $x$ is uniquely associated with the sequence of its Taylor coefficients.
If $F$ is defined as $F(z) = x$ and $F(1) = u$, then $[l^p \setminus l_0^p, \mathcal{A}, F]$ is a group homomorphism.
Notice that $F^{-1}(z) = x$ need not be an element of $l^p \setminus l_0^p$ ($\sum_n |x_n|^p$ may be a divergent series).
(xi) Let $L^p$ ($p \ge 1$) denote the class of all real-valued functions $\{[\mathbb{R},\mathbb{R},f]\}$ such that $\int_{-\infty}^{\infty} |f|^p < \infty$. Define on $L^p$ the operation $*$ as follows.
The operation $*$ is closed in $L^p$ and it is commutative and associative (see Problem 7.12). Define the function
$$f(u,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{u^2}{2\sigma^2}\right\} \quad\text{for } \sigma > 0 \text{ and } u \in \mathbb{R}.$$
This function is the well-known probability density function of a normal random variable with mean 0 and variance $\sigma^2$. Consequently,
$$\int_{-\infty}^{\infty} f(u,\sigma)\,du = 1.$$
From the theory of probability, it is also known that the lion's share of the integral under the curve $f$ (over 99%) is concentrated over the interval $(-3\sigma,3\sigma)$. The function $f$ has its maximum value at 0, equal to $\frac{1}{\sigma\sqrt{2\pi}} \approx \frac{0.399}{\sigma}$. Now, if we let $\sigma \to 0+$, the resulting function is called the (Dirac) delta function, in notation, $\delta$. It is readily seen that the delta equals 0 on $\mathbb{R} \setminus \{0\}$ and $\infty$ at 0, and that $\int_{-\infty}^{\infty} \delta = 1$. There is an alternative integral representation of the delta function. Recall that the Fourier transform of $f$ is
and that $f$ can be restored by applying the inverse Fourier transform to its image as follows:
Again, letting $\sigma \to 0$, we arrive at
By using this integral representation it will be easy to show that $\delta$ is the unity of the $*$ operation:
Since the expression in parentheses is $\hat{x}(\theta)$, which denotes the Fourier transform of $x$, the rest is the inverse Fourier operator, which should restore $x$ at $u$. So, $x * \delta = x$. According to Problem 7.1, $\delta$ is the unique unity of the operation $*$. Since $\delta \ge 0$ and because
$\delta$ is an element of $L^p$. This all implies that $(L^p,*)$ is a commutative monoid and, therefore, $(L^p,+,*)$ is a commutative ring with unity.
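The numerical facts quoted for the normal density (mass over 99% within three standard deviations, peak value about 0.399 for unit variance, and a peak that grows without bound as the standard deviation shrinks) can be checked with the standard library. This is a small sketch; the function names are illustrative, and `sigma` stands for the standard deviation.

```python
import math


def normal_pdf(u, sigma):
    """Density of a normal random variable with mean 0 and standard deviation sigma."""
    return math.exp(-u * u / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def mass_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


peak = normal_pdf(0.0, 1.0)   # 1 / sqrt(2 pi)
mass = mass_within(3)         # does not depend on sigma

# The peak value scales like 1 / sigma, which is what drives the delta limit.
peaks = {sigma: normal_pdf(0.0, sigma) for sigma in (1.0, 0.1, 0.01)}
```

The mass within three standard deviations is the same for every value of the standard deviation, so as it tends to zero the whole integral concentrates in an arbitrarily small neighborhood of the origin.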
(xii) As an application of the last example, consider the discrete indexed family of functions $\{f^{n*};\ n = 0,1,\ldots\}$ defined as follows:
Then, $f^{(n+k)*} = f^{n*} * f^{k*}$, and therefore $(\{f^{n*};\ n = 0,1,\ldots\},*)$ is a semigroup.
(xiii) Let $\mathcal{F}_b = \mathcal{F}_b(\mathbb{R};\mathbb{R})$ denote the space of all bounded real-valued functions. For a function $A \in \mathcal{F}_b$, define
$$e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!}$$
in agreement with Example (vii). Obviously, for each $u$, the above series converges absolutely, since there is a positive constant $M$ such that $|A(u)| \le M$ for all $u$, so that $e^A$ is again an element of $\mathcal{F}_b$. For a fixed $A$, define the family of functions $f_t = e^{tA}$, $t \ge 0$. From the above definition of $e^A$ it follows that $f_0 = 1$. It is easy to show that $e^{sA} e^{tA} = e^{(s+t)A}$. Indeed,
The last expression yields $e^{(s+t)A}$ on letting $n \to \infty$. Consequently, $(\{e^{tA}\},\cdot)$ is a semigroup as defined in (ix). This example can be generalized for operators, for instance, square matrices. To discuss such cases rigorously, one would require the concept of the "norm" of operators treated in upcoming chapters. □
7.5 Definitions.
(i) Let $\mathbb{F}$ be a nonempty set with two binary operations, addition ($\alpha + \beta$) and multiplication ($\alpha \cdot \beta$) [in many instances, especially for the elements of $\mathbb{F}$, we will drop the conventional multiplication symbol $\cdot$]. $(\mathbb{F},+,\cdot)$ is called a field if it is a commutative ring with unity and if for every $\alpha \ne 0$ there is a multiplicative inverse $\alpha^{-1}$.
In other words, $\mathbb{F}$ is a field if for all $\alpha,\beta,\gamma \in \mathbb{F}$,
1) (commutative law) $\alpha + \beta = \beta + \alpha$, $\alpha\beta = \beta\alpha$
2) (associative law) $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$, $(\alpha\beta)\gamma = \alpha(\beta\gamma)$
3) (zero) there is an element $0 \in \mathbb{F}$ such that $\alpha + 0 = \alpha$
4) (additive inverse) there is an element $-\alpha \in \mathbb{F}$ such that $\alpha + (-\alpha) = 0$
5) (distributive law) $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
6) (unity) there is an element $1 \in \mathbb{F}$ such that $1\alpha = \alpha$
7) (multiplicative inverse) for every $\alpha \ne 0$, there is $\alpha^{-1} \in \mathbb{F}$ such that $\alpha\alpha^{-1} = 1$.
The elements of a field are called scalars.
(ii) Let $\mathbb{F}$ be as above with the exception that $\mathbb{F}$ does not have additive inverses. Then $\mathbb{F}$ is called a semifield. We will denote a semifield by $\mathbb{F}_+$. [The set of all rational numbers, $\mathbb{Q}$, the set of real numbers, $\mathbb{R}$, and the set of all complex numbers, $\mathbb{C}$, are typical examples of fields. The set of all nonnegative rational or real numbers and the set of complex numbers $z \in \mathbb{C}$ with $\mathrm{Re}(z) \ge 0$ are examples of semifields.]
(iii) A linear or vector space $X$ over a field $\mathbb{F}$ is a nonempty set with the binary operations addition ($+$) on $X \times X$ into $X$ and multiplication ($\cdot$) on $\mathbb{F} \times X$ into $X$ such that
1) $+$ is commutative and associative;
2) there exists an element (called an origin of $X$), $\theta \in X$, such that $0 \cdot x = \theta$, $\forall x \in X$;
3) $1 \cdot x = x$, $\forall x \in X$;
4) $\alpha(x + y) = \alpha x + \alpha y$, $(\alpha + \beta)x = \alpha x + \beta x$, $\forall \alpha,\beta \in \mathbb{F}$, $\forall x,y \in X$;
5) $\alpha(\beta x) = (\alpha\beta)x$, $\forall \alpha,\beta \in \mathbb{F}$, $\forall x \in X$.
(iv) Elements of $X$ are frequently called vectors. If $\mathbb{F} = \mathbb{R}$ then $X$ is called a real linear space. If $\mathbb{F} = \mathbb{C}$ then $X$ is called a complex linear space. If in (iii) a semifield $\mathbb{F}_+$ is taken, then we call $X$ a semi-linear space.
(v) Any subset of a linear space which itself is a linear space is referred to as a subspace.
(vi) A ring $(A,+,\cdot)$ is called an algebra over a field $\mathbb{F}$ if its additive (Abelian) group $(A,+)$ is a linear space over $\mathbb{F}$. An algebra over a field $\mathbb{F}$ will be denoted by $(A;\mathbb{F})$. If $(A;\mathbb{F})$ is an algebra, a pair $(A';\mathbb{F}')$ is called a subalgebra (of $(A;\mathbb{F})$) if $A' \subseteq A$, $\mathbb{F}' \subseteq \mathbb{F}$, and $(A';\mathbb{F}')$ is also an algebra. The above characteristics of commutative rings and rings with unities are hereditary for algebras.
(vii) A partially ordered linear space which is also a lattice is called a vector lattice. □
7.6 Properties of Linear Spaces.
(i) By Definition 7.5 (iii), 2) and 3), we have $\theta + x = 0 \cdot x + 1 \cdot x = (0 + 1) \cdot x = x$. Therefore, the origin $\theta$ is zero and, by Problem 7.1, it is unique.
(ii) For every $x \in X$, there exists $-x$ such that $x + (-x) = \theta$. Indeed, by Definition 7.5 (iii), 2) and 4), we have
$$x + (-1)x = 1 \cdot x + (-1) \cdot x = (1 + (-1)) \cdot x = 0 \cdot x = \theta.$$
We call $(-1)x$ the additive inverse of $x$ and denote it by $-x$. Properties (i) and (ii) imply that $(X,+)$ is an Abelian group.
(iii) $\forall \alpha \in \mathbb{F}$, $\alpha\theta = \alpha(0 \cdot x) = (\alpha 0) \cdot x = 0 \cdot x = \theta$. □
7.7 Notation. Let $X$ be a vector lattice over a field $\mathbb{F}$. Then $\forall x \in X$,
7.8 Examples.
(i) $\{\theta\}$ is a subspace, since by Property 7.6 (iii), $\alpha \cdot \theta = \theta$.
(ii) Any field is a linear space over itself.
(iii) $\mathbb{R}^n$ is a real linear space with $\theta = (0,\ldots,0)$ over $\mathbb{R}$.
(iv) The $l^1$ space of all real sequences over the field $\mathbb{R}$ whose series are absolutely convergent is a linear space.
(v) The $l^p$ space over the field $\mathbb{C}$ of all sequences such that for each $x = (x_1,x_2,\ldots) \in l^p$, $\sum_{n=1}^{\infty} |x_n|^p < \infty$, where $p \in [1,\infty)$, is a linear space. (See Problems 7.9 and 7.10.)
(vi) The $\mathcal{C}[a,b]$ space of all continuous functions on $[a,b]$ is a real linear space.
(vii) The $\mathcal{C}^n_{[a,b]}$ space of all $n$-times differentiable functions on $[a,b]$ is a real linear space.
(viii) The $\mathcal{C}^{(\infty)}$ space of all analytic (entire) functions is a complex linear space.
(ix) In Example 7.4 (x), $(l^p \setminus l_0^p \cup \{\theta\}, +, *)$, where $\theta = (0,0,\ldots)$, is a field, since elements of $l^p \setminus l_0^p$ have multiplicative inverses. $(\mathbb{C},+,\cdot)$ is another example of a field.
(x) The space $\mathbb{R}^X$ of all real-valued functions on a set $X$ is a commutative algebra over $\mathbb{R}$ with unity. $\mathbb{R}^X$ is also a vector lattice.
(xi) The subspace $\mathcal{F}_b(X;\mathbb{R}) \subseteq \mathbb{R}^X$ of all bounded real-valued functions on a set $X$ is a commutative subalgebra with unity and a vector lattice.
(xii) The subspace $\mathcal{C}(X;\mathbb{R})$ of all continuous functions is also a commutative subalgebra over $\mathbb{R}$ with unity and a vector lattice.
(xiii) The subspace $\mathcal{C}_b(X;\mathbb{R})$ of all bounded continuous functions is a commutative subalgebra of $\mathcal{C}(X;\mathbb{R})$ and a vector lattice.
(xiv) The subspace $\mathcal{C}^n(\mathbb{R};\mathbb{R})$ of all $n$-times differentiable functions is a commutative subalgebra with unity but not a lattice ($\sup\{x,-x\} = |x| \notin \mathcal{C}^n(\mathbb{R};\mathbb{R})$).
(xv) The space $\mathcal{C}^{(\infty)}(\mathbb{C};\mathbb{C})$ of all entire functions over $\mathbb{C}$ is a commutative algebra with unity but not a lattice.
(xvi) The space $\mathcal{P}$ of all polynomials with real coefficients is a commutative subalgebra over $\mathbb{R}$ with unity but not a lattice.
(xvii) The space of all polynomials with rational coefficients is a commutative subalgebra over the field of rational numbers with unity but not a lattice. □
PROBLEMS.
7.1 Show that each monoid has exactly one identity.
7.2 Let $(\mathcal{G},*)$ be a group. Show that for each two elements $x,y \in \mathcal{G}$, there are $l,r \in \mathcal{G}$ such that $l * x = y$ and $x * r = y$.
7.3 An operation $*$ is called reducible if $x * y = x * z$ implies that $y = z$ for all $x,y,z$. Show that if $(\mathcal{G},*)$ is a group, then $*$ is reducible. In particular, show that for each $x \in \mathcal{G}$, its inverse is unique.
7.4 Prove Theorem 7.2.
7.5 Let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be an isomorphism. Show that $[\tilde{\mathcal{G}},\mathcal{G},f^{-1}]$ is also an isomorphism.
7.6 Let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be an isomorphism. Find $\operatorname{Ker} f$.
7.7 Let $[\mathcal{G},\tilde{\mathcal{G}},f]$ be a mapping such that $\mathcal{G} = \tilde{\mathcal{G}} = \mathbb{R}$ with operation $+$ and let $f(x) = [x]$ (i.e. the greatest integer less than or equal to $x$). Is $[\mathcal{G},\tilde{\mathcal{G}},f]$ an endomorphism?
7.8 Let $(\mathcal{G},\cdot)$ be the set of all $2 \times 2$ real matrices with determinant equal to 1.
a) Show that $(\mathcal{G},\cdot)$ is a group.
b) Let $B$ be any $2 \times 2$ nonsingular matrix. Define the map $[\mathcal{G},\mathcal{G},f]$ such that $f(A) = B^{-1}AB$. Show that $[\mathcal{G},\mathcal{G},f]$ is an automorphism.
7.9 Show that, $\forall a,b \ge 0$ and $p \in [1,\infty)$, $(a+b)^p \le 2^{p-1}(a^p + b^p)$. [Hint: For $p > 1$, work with the auxiliary function $f(x) = (a+x)^p - 2^{p-1}(a^p + x^p)$, $x \ge 0$.]
7.10 Show that $l^p$ is a linear space; specifically show that $x,y \in l^p \Rightarrow x + y \in l^p$. [Hint: Apply the inequality in Problem 7.9 in the form $|x_n + y_n|^p \le 2^{p-1}(|x_n|^p + |y_n|^p)$.]
7.11 Show that the operation $*$ in Example 7.4 (x) is commutative and associative and it is closed in $l^p$.
7.12 Show that the operation $*$ in Example 7.4 (xi) is commutative and associative.
7.13 Show that $\circ$ defined in Example 7.4 (viii) is associative and that $T \circ T^{-1} = T^{-1} \circ T = 1$.
7.14 Is $(\mathcal{T},+,\circ)$ (where $(\mathcal{T},\circ)$ is defined in Example 7.4 (viii)) a ring?
7.15 Let $S$ be a subset of $\mathbb{C}$. Argue for what cases $S$ is a subspace of $\mathbb{C}$ over $\mathbb{R}$.
a) $S$ is a closed unit disc centered at zero, i.e., $S = \{z \in \mathbb{C}: |z| \le 1\}$.
b) $S = \{z \in \mathbb{C}: \{|\mathrm{Re}(z)| \le 1\} \times \{|\mathrm{Im}(z)| \le 1\}\}$.
c) $S = \{z \in \mathbb{C}: \{\mathrm{Im}(z) = 0\} \times \{|\mathrm{Im}(z)| \le 1\}\}$.
d) $S = \{z \in \mathbb{C}: \mathrm{Im}(z) \ge 0 \text{ and } \mathrm{Re}(z) \ge 0\} \cup \{z \in \mathbb{C}: \mathrm{Im}(z) \le 0 \text{ and } \mathrm{Re}(z) \le 0\}$.
7.16 Prove in Definition 7.7, for functions, that $x = x^+ - x^-$ and $|x| = x^+ + x^-$.
7. Basic Algebraic Structures
NEW TERMS: algebra 46 algebraic operation 47 semigroup 47 associative algebraic operation 47 monoid 47 two-sided identity 47 group 47 inverse 47 commutative algebraic operation 47 abelian group 47 additive group 47 multiplicative group 47 zero 47 additive inverse 47 unity 47 multiplicative inverse 47 ring 47 left distributive law 47 right distributive law 47 commutative ring 47 ring with unity 47 group homomorphism 48 group isomorphism 48 group endomorphism 48 group automorphism 48 kernel 48 space of all n times differentiable functions 48 power 49 bilinear transformation 49 semigroup of functions 49 discrete convolution 49 $l^p$ space 49 $L^p$ space 50 normal probability density function 50 Dirac delta function 50 Dirac delta function, Fourier transform of 51 field 52 scalar 53 semifield 53 linear space (vector space) over a field 53 vector 53 real linear space 53
complex linear space 53 semi-linear space 53 subspace 53 algebra over a field 53 subalgebra 53 vector lattice 53
Chapter 2
Analysis of Metric Spaces
Metric spaces were introduced and studied by the French mathematician Maurice René Fréchet (in his doctoral dissertation published in 1906), and developed later by the German Felix Hausdorff (in his book Grundzüge der Mengenlehre of 1914). It was apparent that toward the end of the nineteenth century the mathematical world (partly inspired by Cantor's fundamental work in set theory) was eager to structure more general sets than the conventional $\mathbb{R}^n$. On the other hand, the needs of complex analysis and the rapid development of differential equations sped up this process. Typical examples are uniform convergence in function spaces, approximation of continuous functions by polynomials, and the Riemann mapping theorem. After 1920, the theory of metric spaces, especially fundamental work on normed spaces and their applications to functional analysis, was further developed by the Pole Stefan Banach and his school. In tribute to their achievements and those of their fellow countrymen and followers, an important subclass of metric spaces was named "Polish." A series of studies of metric spaces was further undertaken in the late 1920s by the Russian school of analysis. At that time, metric spaces were generalized to topological spaces. In this chapter we introduce the main principles of metric spaces and their special case: normed linear spaces. This part of analysis traditionally precedes the more general theory of topology and functional analysis.
1. DEFINITIONS AND NOTATIONS
The concept of "metric" (measuring distances in space) is at the root of mathematical (geometric) thinking. Starting with that concept, we will see how the notions of limits of sequences and continuity of functions extend by metrization to those in more general spaces than the Euclidean spaces introduced in calculus. Recall that a point $x$ is a limit of a sequence $\{x_n\}$ if all terms of the sequence numbered with $k, k+1, \ldots$ for some $k$ are sufficiently "close" to $x$. The closeness of these points to $x$ is defined in terms of the Euclidean distance $|x - x_k|$, which determines the specific structure employed on the "carrier" $\mathbb{R}$. In many applications, the carrier is more general than $\mathbb{R}$ or even $\mathbb{R}^n$. So, the question arises, "How do we construct the analysis in the general space?" Since the distance was crucial in the formation of analysis on the real line, we will introduce this notion also for the general space, emphasizing the main properties of the distance with which we have had experience. Once a distance (or metric) between any two points of a set is defined, the set becomes "well-structured" or metrized, and then is ranked as a space, more precisely, a metric space.
1.1 Definitions. Let $X$ be a nonempty set.

(i) A metric $d$ (or distance) on $X$ is any nonnegative function $d: X^2 \to \mathbb{R}_+$ such that:

(a) $\forall x,y \in X$, $d(x,y) = 0 \Leftrightarrow x = y$.
(b) $\forall x,y \in X$, $d(x,y) = d(y,x)$.
(c) $\forall x,y,z \in X$, $d(x,y) \le d(x,z) + d(z,y)$ (triangle inequality).

The pair $(X,d)$ is called a metric space. We will refer to the set $X$ as a carrier. Sometimes, for brevity, the carrier $X$ itself will be called the metric space.

(ii) If for $x, y \in X$, $x = y$ implies $d(x,y) = 0$, but the converse does not hold (i.e., $d(x,y) = 0$ does not necessarily yield $x = y$), and if (b) and (c) hold, then $d$ is called a pseudo-metric. Correspondingly, the pair $(X,d)$ is called a pseudo-metric space. (Any pseudo-metric can be made a metric by introducing the equivalence classes "generated by metric $d$," in such a way that $x$ and $y$ belong to one and the same class whenever $d(x,y) = 0$.) □
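1.2 Remark. By the triangle inequality we have

$$d(x,y) - d(z,y) \le d(x,z), \qquad (1.2a)$$

which holds for all $x,y,z \in X$. Then, interchanging $x$ and $z$ in the last inequality (and using the symmetry of $d$), we arrive at

$$d(z,y) - d(x,y) \le d(x,z). \qquad (1.2b)$$

Inequalities (1.2a) and (1.2b) yield

$$|d(x,y) - d(z,y)| \le d(x,z).$$

Let $Y \subseteq X$. Then the pair $(Y,d)$ is also a metric space, called a subspace of $(X,d)$.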
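1.3 Examples (of metric spaces).

(i) The discrete metric is defined on a nonempty set $X$ as

$$d(x,y) = \begin{cases} 1, & x \ne y, \\ 0, & x = y. \end{cases}$$

The triangle inequality does not hold if and only if $d(x,y) = 1$ and $d(x,z) = d(z,y) = 0$. However, this would only be possible for $x = z = y$. Hence, $d(x,y)$ cannot equal 1.

(ii) Let $X = (0,\infty)$ and $d(x,y) = \left|\frac{1}{x} - \frac{1}{y}\right|$. The triangle inequality follows from

$$d(x,y) = \left|\frac{1}{x} - \frac{1}{y}\right| = \left|\frac{1}{x} - \frac{1}{z} + \frac{1}{z} - \frac{1}{y}\right| \le \left|\frac{1}{x} - \frac{1}{z}\right| + \left|\frac{1}{z} - \frac{1}{y}\right|.$$

(iii) Let $X$ consist of all sequences $\{x_n\} \subseteq \mathbb{R}$. Such a carrier $X$ is denoted by $\mathbb{R}^{\mathbb{N}}$. Recall that a subset of $\mathbb{R}^{\mathbb{N}}$ is the $l^1$ space if it contains only absolutely convergent sequences, i.e., those with

$$\sum_{n=1}^{\infty} |x_n| < \infty.$$

Let us define the function $d$ on $l^1$ as $d(x,y) = \sum_{n=1}^{\infty} |x_n - y_n|$. Then

$$d(x,y) = \sum_{n=1}^{\infty} |x_n - y_n| \le \sum_{n=1}^{\infty} |x_n - z_n| + \sum_{n=1}^{\infty} |z_n - y_n| = d(x,z) + d(z,y).$$

Thus, $d$ is a metric on $l^1$, since the other properties of $d$ as a metric are obvious.

(iv) Let $C[a,b]$ denote the set of all continuous functions on the interval $[a,b] \subset \mathbb{R}$. Let us define

$$d(x,y) = \sup\{|x(t) - y(t)| : t \in [a,b]\},$$

called the supremum metric. Because any continuous function on a closed and bounded interval assumes maximum and minimum values, the definition of $d$ makes sense. Since the inequalities

$$|x(t) - y(t)| \le |x(t) - z(t)| + |z(t) - y(t)| \le d(x,z) + d(z,y)$$

hold for all $t \in [a,b]$, we have

$$d(x,y) \le d(x,z) + d(z,y),$$

which is exactly the triangle inequality. Hence $d$ is a metric on $C[a,b]$.

(v) Now, define another metric on $C[a,b]$:

$$d(x,y) = \int_a^b |x(t) - y(t)|\,dt.$$

It is easy to see that $d(x,y) = 0$ if and only if $x(t) = y(t)$ for all $t \in [a,b]$ (why?). The triangle inequality is obvious.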
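The axioms of Definition 1.1 can be probed numerically on a finite sample of points. The following is only a sketch: the helper `is_metric_on`, the sample points, and the tolerance are illustrative assumptions that do not appear in the text. A passing check on a sample does not prove that a function is a metric, but a failing check does exhibit a counterexample.

```python
import itertools
import math


def discrete(x, y):
    """Discrete metric of Example 1.3 (i)."""
    return 0.0 if x == y else 1.0


def reciprocal(x, y):
    """Metric of Example 1.3 (ii) on (0, oo): |1/x - 1/y|."""
    return abs(1.0 / x - 1.0 / y)


def squared(x, y):
    """(x - y)^2: symmetric and zero only on the diagonal, but not a metric."""
    return (x - y) ** 2


def is_metric_on(points, d, tol=1e-12):
    """Check axioms (a)-(c) of Definition 1.1 on a finite sample (hypothetical helper)."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # nonnegativity
        if (d(x, y) == 0) != (x == y):
            return False                      # axiom (a)
        if not math.isclose(d(x, y), d(y, x), abs_tol=tol):
            return False                      # axiom (b)
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False                      # axiom (c), triangle inequality
    return True


sample = [0.5, 1.0, 2.0, 3.0, 10.0]
print(is_metric_on(sample, discrete))
print(is_metric_on(sample, reciprocal))
print(is_metric_on(sample, squared))  # fails: d(1,3) = 4 > d(1,2) + d(2,3) = 2
```

The third function shows why the triangle inequality has to be checked separately: it satisfies (a) and (b) on the sample but violates (c).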
PROBLEMS

1.1 Let $X = \mathbb{R}$ and $d(x,y) = \sin^2(x - y)$. Is $(X,d)$ a metric space?

1.2 Let $X = \mathbb{R}$ and $d(x,y) = \sqrt{|x - y|}$. Is $(X,d)$ a metric space?

1.3 Let $X = \mathbb{R}^n$. Define on $X$, $d(x,y) = \max\{|x_k - y_k| : k = 1,\ldots,n\}$, $\forall x = (x_1,\ldots,x_n)$, $y = (y_1,\ldots,y_n)$. Show that $(X,d)$ is a metric space.

1.4 Let $d$ be a metric on $X$. Define $\rho(x,y) = \dfrac{d(x,y)}{1 + d(x,y)}$. Show that $\rho$ is a metric on $X$.

1.5 Two real numbers $p > 1$ and $q > 1$ are called conjugate exponents if $\frac{1}{p} + \frac{1}{q} = 1$. Show that for all $x, y \in \mathbb{R}_+$ and for conjugate exponents $p$ and $q$, the following inequality holds:

$$x^{1/p}\, y^{1/q} \le \frac{x}{p} + \frac{y}{q}.$$

[Hint: Work with the function $f(z) = \frac{1}{q} + \frac{z}{p} - z^{1/p}$ and then substitute $z = \frac{x}{y}$.]

1.6 Prove Hölder's inequality (for finite sums): for conjugate exponents $p > 1$ and $q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$, $a_1,\ldots,a_n \ge 0$, and $b_1,\ldots,b_n \ge 0$,

$$\sum_{i=1}^{n} a_i b_i \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}.$$

[Hint: Apply Problem 1.5 to $x = (a_i/A)^p$ and $y = (b_i/B)^q$, where $A = \left[\sum_{i=1}^{n} a_i^p\right]^{1/p}$ and $B = \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}$.]

1.7 a) Prove Minkowski's inequality (for finite sums): for $p \ge 1$, $a_1,\ldots,a_n \ge 0$, and $b_1,\ldots,b_n \ge 0$, it holds true that

$$\left[\sum_{i=1}^{n} (a_i + b_i)^p\right]^{1/p} \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} + \left[\sum_{i=1}^{n} b_i^p\right]^{1/p}.$$

[Hint: Make use of $(a+b)^p = a(a+b)^{p-1} + b(a+b)^{p-1}$ and then apply Hölder's inequality.]

b) Generalize Minkowski's inequality for infinite sums.

1.8 The Euclidean metric or Euclidean distance is defined in $\mathbb{R}^n$ by

$$d_e(x,y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}. \qquad (P1.8)$$

(Specifically, if $n = 1$, we have $d(x,y) = \sqrt{(x-y)^2} = |x - y|$.) Show that $d_e$ is indeed a metric. [Hint: Apply Minkowski's inequality.]

[In Problem 1.8 we defined the Euclidean metric on $\mathbb{R}^n$ by equation (P1.8). This metric can be regarded as

$$d_P(x,y) = \sqrt{\sum_{k=1}^{n} d_k^2(x_k,y_k)}, \qquad (P1.8a)$$

where $d_k(x_k,y_k)$ is the one-dimensional Euclidean metric on the $k$th coordinate axis ($k$th factor space). We can extend this notion and define a metric on the $n$-times Cartesian product set $Y = Y_1 \times Y_2 \times \cdots \times Y_n$ by formula (P1.8a). The proposition in Problem 1.9 states that such $d_P$ is indeed a metric on $Y$. We call this metric the product metric and the corresponding metric space $(Y,d_P)$ the product space. In notation, $(Y,d_P) = \times\{(Y_k,d_k) : k = 1,\ldots,n\}$.]

1.9 Prove the statement: Let $(Y_k, d_k)$, $k = 1,\ldots,n$, be a collection of metric spaces and let $Y$ be the Cartesian product of $Y_1,\ldots,Y_n$. Then the function $d_P$ on $Y \times Y$ defined by (P1.8a) is a metric on $Y$.

1.10 Show that the function $\rho(x,y) = \sum_{k=1}^{n} d_k(x_k,y_k)$ is also a metric on $Y = Y_1 \times Y_2 \times \cdots \times Y_n$.
NEW TERMS: metrization, carrier, metric, distance, triangle inequality, metric space, pseudo-metric, pseudo-metric space, subspace, discrete metric, $l^1$-space, supremum metric, conjugate exponents, Hölder's inequality, Minkowski's inequality, Euclidean metric, Euclidean distance, product metric, product space
2. THE STRUCTURE OF METRIC SPACES

The structural properties of metric spaces stem from the notion of the open ball, with the aid of which we shall be able to introduce open and closed sets, interior, closure, and accumulation points. Open balls, determined by a particular metric, generate convergence and continuity, the principles of any analysis, which we explore in this chapter and Chapter 3.
2.1 Definition. Let $(X,d)$ be a metric space and let $x \in X$ and $r > 0$. The subset of $X$, $B(x,r) = \{y \in X : d(x,y) < r\}$, is called the open ball centered at $x$ with radius $r$ (with respect to metric $d$). [If we need to emphasize that the ball is with respect to metric $d$, we will write $B_d(x,r)$. This notation makes sense whenever more than one metric on $X$ is considered.]
2.2 Examples.

(i) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}, d_e)$ is the open interval $(x - r, x + r)$.

(ii) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}^2, d_e)$ is the open disc centered at $x$ with radius $r$ in the usual sense.

(iii) Different choices of metric on a given carrier give rise to different spaces and, as a result, to different open balls. In metric spaces other than Euclidean, the shape of open balls may be quite surprising compared with our usual perception of them. Consider, for instance, an open ball $B(x,r)$ in $(\mathbb{R}^2,d)$, where $d$ is the supremum metric defined as in Problem 1.3, for $n = 2$, i.e.,

$$d(x,y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}.$$

It is easy to see that the open ball $B(x,r)$ is of square shape and that the corresponding open ball $B_e(x,r)$ with respect to the Euclidean metric in $\mathbb{R}^2$ is inscribed in this square (see Figure 2.1 below).

(iv) Let $(X,d)$ be a discrete metric space with the metric defined in Example 1.3 (i). Then, for any $x \in X$, an open ball centered at $x$ is

$$B(x,r) = \begin{cases} \{x\}, & r \le 1, \\ X, & r > 1. \end{cases}$$

[Figure 2.1]

[Figure 2.2]

(v) Let $(X,d)$ be the metric space defined in Example 1.3 (iv), where $X = C[a,b]$, and

$$d(x,y) = \sup\{|x(t) - y(t)| : t \in [a,b]\}.$$

Then the open ball $B(x,r)$ has a shape as depicted in Figure 2.2 above. □
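Example 2.2 (iii) can be explored with random points in the plane. The sketch below makes several assumptions that are not part of the text: the center is the origin, the radius is 1, and the function names are illustrative only. It tests ball membership under both metrics and checks that every sampled point of the Euclidean ball also lies in the supremum-metric ball.

```python
import math
import random


def d_euclid(x, y):
    """Euclidean metric on R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def d_sup(x, y):
    """Supremum (maximum) metric on R^2, as in Problem 1.3 with n = 2."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def in_open_ball(d, center, r, point):
    """Membership test for B(center, r) = {y : d(center, y) < r}."""
    return d(center, point) < r


random.seed(0)  # fixed seed so that the sample is reproducible
center, r = (0.0, 0.0), 1.0
points = [(random.uniform(-1.5, 1.5), random.uniform(-1.5, 1.5))
          for _ in range(10_000)]

euclid_ball = [p for p in points if in_open_ball(d_euclid, center, r, p)]
sup_ball = [p for p in points if in_open_ball(d_sup, center, r, p)]

# Every sampled point of the disc lies in the square, but not conversely.
inscribed = all(in_open_ball(d_sup, center, r, p) for p in euclid_ball)
corner = (0.9, 0.9)  # near a corner of the square, outside the disc

print(inscribed)
print(in_open_ball(d_sup, center, r, corner),
      in_open_ball(d_euclid, center, r, corner))
```

The point near the corner belongs to the square-shaped ball but not to the disc, which is the picture behind Figure 2.1.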
2.3 Definition. Let $(X,d)$ be a metric space. A subset $A$ of the carrier $X$ is called a $d$-open set (or just an open set) if every point $x$ of $A$ can serve as the center of an open ball inscribed in $A$, i.e., there is an $r > 0$ such that $B(x,r) \subseteq A$. □

2.4 Examples.

(i) Every open ball is itself an open set. Indeed, if $x_1 \in B(x,r)$ then $r - d(x,x_1) > 0$. Take $r_1 = r - d(x,x_1)$ and show that $B(x_1,r_1) \subseteq B(x,r)$. For every $z \in B(x_1,r_1)$, by the triangle inequality,

$$d(x,z) \le d(x,x_1) + d(x_1,z) < d(x,x_1) + r_1 = r.$$

Thus $z \in B(x,r)$ (see Figure 2.3).

[Figure 2.3]

(ii) The set $[a,b)$, for $a < b$, in $(\mathbb{R}, d_e)$ is not open, since there is no open ball $B(a,r) \subseteq [a,b)$.

(iii) The carrier $X$ is obviously open.

(iv) A set $A$ is not open if there is at least one point $x \in A$ such that there is no ball $B(x,r)$ that can be inscribed in $A$. Since the empty set does not have any point, it is reasonable to assign it to the class of open sets.

(v) In the Euclidean space $(\mathbb{R},d_e)$, $\mathbb{R}$ is an open set but not an open ball (why?). □
2.5 Theorem. For every metric space $(X,d)$, the following statements hold true:

(i) Arbitrary unions of open sets are open sets.

(ii) Finite intersections of open sets are open sets.

Proof.

(i) Let $\{A_k : k \in I\}$ be an indexed family of open sets in $X$ and let $A = \bigcup_{k \in I} A_k$. If $x \in A$ then there is an index $i$ such that $x \in A_i$. Since $A_i$ is open, there is an $r > 0$ such that

$$B(x,r) \subseteq A_i \subseteq A.$$

Therefore, $A$ is open.

(ii) Let $A_1,\ldots,A_n$ be open subsets of $X$ and let $A = \bigcap_{k=1}^{n} A_k$. If $x \in A$ then $x \in A_k$, $k = 1,\ldots,n$. It follows that there are $r_1,\ldots,r_n > 0$ such that $B(x,r_k) \subseteq A_k$, $k = 1,\ldots,n$. Let $r = \min\{r_1,\ldots,r_n\}$. Then, obviously, $B(x,r) \ne \emptyset$ and $B(x,r) \subseteq A_k$, $k = 1,\ldots,n$. Thus, $B(x,r) \subseteq A$ and $A$ is open. □
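2.6 Remark. The intersection of infinitely many open sets need not be open. The reason is that the infimum of the radii $\{r_k : k \in I\}$ can be zero. For example, let

$$A_n = \left(1 - \tfrac{1}{n},\ 1 + \tfrac{1}{n}\right), \quad n = 1,2,\ldots.$$

Then $1 \in A_n$, $n = 1,2,\ldots$, which implies that $1 \in \bigcap_{n=1}^{\infty} A_n$; in fact,

$$\bigcap_{n=1}^{\infty} A_n = \{1\}.$$

However, the set $\{1\}$ is not open in $(\mathbb{R},d_e)$.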
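The shrinking of the admissible radius in Remark 2.6 can be made concrete. The sketch below assumes the nested intervals $A_n = (1 - 1/n,\ 1 + 1/n)$, for which the largest ball around the point 1 contained in $A_n$ has radius $1/n$; the function names are hypothetical. It computes the common radius $\min\{r_1,\ldots,r_N\}$ used in the proof of Theorem 2.5 (ii) with exact rational arithmetic.

```python
from fractions import Fraction


def largest_radius(n):
    """Largest r with B(1, r) contained in A_n = (1 - 1/n, 1 + 1/n) (assumed sets)."""
    return Fraction(1, n)


def common_radius(count):
    """min{r_k : k = 1, ..., count}, as in the proof of Theorem 2.5 (ii)."""
    return min(largest_radius(n) for n in range(1, count + 1))


for count in (1, 10, 1000):
    print(count, common_radius(count))
```

For every finite family the common radius is positive, so the finite intersection is open. As the number of sets grows the radius tends to zero, and no ball around 1 survives in the infinite intersection.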
2.7 Example. Let $(X,d)$ be a discrete metric space. Then the power set $\mathcal{P}(X)$ coincides with the set of all open sets. Indeed, in Example 2.2 (iv), we showed that in any discrete metric space, every singleton $\{x\}$ and the carrier $X$ are open balls. In addition, $\emptyset$ is an open set. Since any subset $A$ of $X$ can be represented as the union of its singletons, by Theorem 2.5 (i), it follows that $A$ is also open. Specifically, in $\mathbb{R}$ endowed with the discrete metric, all singletons are open, while in Euclidean space $(\mathbb{R},d_e)$ they are not. □

2.8 Definitions.
(i) A point $x \in A \subseteq X$ is called an interior point of $A$ if there exists an open ball $B(x,r) \subseteq A$. The set of all interior points of the set $A$ is denoted by $\mathring{A}$ or $\mathrm{Int}(A)$ and called the interior of $A$.

[Clearly, $\mathring{A}$ is the largest open subset of $A$, which yields that $A$ is open if and only if $A = \mathring{A}$. Indeed, let $C \subseteq A$ be an open set, larger than $\mathring{A}$. Then there is an $x \in C$ such that $x \notin \mathring{A}$. But this is a contradiction, since $x$ must be an interior point of $A$.]

(ii) A subset $A$ of $X$ is called closed if its complement $A^c$ is open. [Specifically, the carrier $X$ and the empty set $\emptyset$ are both closed.]

(iii) A point $x \in X$ is called a closure point of $A \subseteq X$ if every open ball centered at $x$ contains at least one element of $A$ (including $x$ if $x \in A$). We will also say, "if every open ball centered at $x$ meets $A$." The set of all closure points of $A$ is denoted by $\bar{A}$ or by $\mathrm{Cl}(A)$ and called the closure of $A$.

[For example, let $A = [0,2) \cup \{5\}$. The point 5 is one of the closure points, since $B(5,r)$ contains 5 for all $r > 0$. Thus, $\bar{A} = [0,2] \cup \{5\}$.]

2.9 Proposition. Arbitrary intersections or finite unions of closed sets are closed sets.
Proof. The statements follow by applying DeMorgan's laws.

2.10 Examples.

(i) From Definition 2.8 (iii) it follows that $A \subseteq \bar{A}$.

(ii) Since the set of all open subsets of a discrete metric space $(X,d)$ coincides with its power set, the set of all closed subsets is also the power set. In particular, in a discrete metric space all subsets are simultaneously open and closed.

2.11 Proposition. For any subset $A$ of $X$, $\bar{A}$ is the smallest closed superset of $A$.
Proof. (i) We show first that $\bar{A}$ is a closed set, i.e. that $(\mathrm{Cl}\,A)^c$ is open. Let $x \in (\mathrm{Cl}\,A)^c$. Then there exists an open ball $B(x,r)$ such that $B(x,r) \cap A = \emptyset$ (since, otherwise, $x$ would belong to $\bar{A}$ by the definition). However, we have not proved yet that $B(x,r) \cap \bar{A} = \emptyset$, which would immediately imply that $(\mathrm{Cl}\,A)^c$ is open. Now we show that no point of $B(x,r)$ is a closure point of $A$. Take an arbitrary point $t \in B(x,r)$. Since $B(x,r)$ is an open set, there is an open ball $B(t,r_t) \subseteq B(x,r)$, which is therefore also disjoint from $A$. By the definition of a closure point, this means that $t \notin \bar{A}$. Since $t$ was an arbitrary point of $B(x,r)$, $B(x,r) \subseteq (\mathrm{Cl}\,A)^c$.

(ii) Now we show that the closure of $A$ is the smallest closed set containing $A$. Let $B$ be an arbitrary closed set such that $A \subseteq B$. We prove that $B^c \subseteq (\bar{A})^c$. Since $B^c$ is open, for each $x \in B^c$ there is an open ball $B(x,r) \subseteq B^c$. This implies that $B(x,r) \cap B = \emptyset$ and that

$$B(x,r) \cap A = \emptyset.$$

Thus $x \notin \bar{A}$ (by the definition of a closure point), which is equivalent to $x \in (\mathrm{Cl}\,A)^c$. Therefore, we have proved that $x \in B^c$ yields that $x \in (\mathrm{Cl}\,A)^c$, i.e. $B^c \subseteq (\mathrm{Cl}\,A)^c$. The latter is obviously equivalent to $\bar{A} \subseteq B$. □
2.12 Corollary. A set $A$ is closed if and only if $A = \bar{A}$. (See Problem 2.1.)

2.13 Remark. Consider the set $C(x,r) = \{y \in X : d(x,y) \le r\}$. It can be easily shown that $C$ is a closed set. (See Problem 2.4.) Such $C$ is called a closed ball centered at $x$ with radius $r$. Evidently, $B(x,r) \subseteq C(x,r)$ implies that $\overline{B(x,r)} \subseteq C(x,r)$, since $\bar{B}$ is the smallest closed set containing $B$. However, we observe that $C(x,r)$ does not necessarily coincide with the closure of the corresponding open ball $B(x,r)$. For instance, let $(X,d)$ be a discrete metric space, where any open ball is both a closed and an open set, i.e. $\overline{B(x,r)} = B(x,r)$. Because

$$B(x,r) = \begin{cases} \{x\}, & r \le 1, \\ X, & r > 1, \end{cases} \qquad C(x,r) = \begin{cases} \{x\}, & r < 1, \\ X, & r \ge 1, \end{cases}$$

we have $B(x,r) = C(x,r) = X$ for $r > 1$ or $B(x,r) = C(x,r) = \{x\}$ for $r < 1$. For $r = 1$, $B(x,r) = \{x\} \subset C(x,r) = X$, unless $X$ is a singleton. □

2.14 Examples.

(i) In the Euclidean metric space $(\mathbb{R},d_e)$, for each $x \in \mathbb{R}$, $\{x\}$ is closed. Indeed, $\{x\}^c = (-\infty,x) \cup (x,\infty)$ is open.

(ii) The set of all rational numbers $\mathbb{Q}$ is neither open nor closed. Indeed, it is known that each irrational point $x$ is a limit of a sequence of rational points $\{x_n\}$. Therefore, there is no open ball $B(x,r)$ that does not contain rational points. This implies that $\mathbb{Q}^c$ is not open, or equivalently, $\mathbb{Q}$ is not closed. On the other hand, $\mathbb{Q}$ cannot be open, since otherwise every rational point $q$ could be the center of an open ball (interval) containing just rational numbers. This is absurd, since any interval has the cardinality of the continuum while $\mathbb{Q}$ is countable. Therefore, the set of all rational numbers is neither open nor closed. It also follows that the set of all irrational numbers is neither open nor closed. □
2.15 Definition. A point $x \in X$ is called an accumulation point of a set $A \subseteq X$ if $\forall r > 0$, $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$. [Observe that $x$ need not be an element of $A$.] The set of all accumulation points of $A$ is called the derived set of $A$ and is denoted by $A'$.

Unlike a closure point, an accumulation point must be "close" to $A$. If $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$, then $B(x,r) \cap A \ne \emptyset$, and, consequently, $x \in A'$ yields that $x \in \bar{A}$, or $A' \subseteq \bar{A}$.

2.16 Examples.

(i) Notice that not every closure point is an accumulation point. For instance, let $A = (0,1) \cup \{2\} \subseteq (\mathbb{R},d_e)$. Then 2 is obviously a closure point of $A$. However, 2 is not an accumulation point of $A$, since $B(2,\tfrac{1}{2}) \cap (0,1) = \emptyset$. On the other hand, 0 is both an accumulation point and a closure point of $A$.

(ii) Let $A = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\} \subseteq (\mathbb{R},d_e)$. Since 0 is the limit of the sequence $\{\tfrac{1}{n}\}$ (in terms of Euclidean distance), it is also an accumulation point of $A$: any open ball at 0 contains at least one point of $A$. This is the only accumulation point of $A$. Note that $A$ is not closed, for 0 is a closure point of $A$ that does not belong to $A$. So we have $A' = \{0\}$, $\bar{A} = A \cup \{0\}$.
In the previous section we introduced the notion of the product metric. We wonder what the shape of open sets in the product metric space is. A remarkable property of this metric is given by the following theorem.
2.17 Theorem. Let $\{(Y_k,d_k) : k = 1,\ldots,n\}$ be a finite family of metric spaces and let $(Y,d) = \times\{(Y_k,d_k) : k = 1,\ldots,n\}$ be the product space. Then $O \subseteq (Y,d)$ is open if and only if $O$ is the union of sets of the form $\times\{O_i : i = 1,\ldots,n\}$, where each $O_i$ is open in $(Y_i,d_i)$.

A proof of this theorem in a more general form is given in Chapter 3.
PROBLEMS

2.1 Prove Corollary 2.12.

2.2 Is it true that $A \subseteq B \Rightarrow \bar{A} \subseteq \bar{B}$?

2.3 Show that [formula illegible in the source].

2.4 Prove that a closed ball $C(x,r)$ is a closed set.

2.5 Show that in $(\mathbb{R}^n,d_e)$, $\overline{B(x,r)} = C(x,r)$.

2.6 Show that $\bar{A} = A \cup A'$.

2.7 Let $A \subseteq (X,d)$, where $X$ is an infinite set. Show that, if $x$ is an accumulation point of $A$, then every open set containing $x$ contains infinitely many points of $A$.

2.8 Give an example of a closed set with the cardinality of the continuum that does not have any accumulation point.

2.9 Find the shape of open balls in the metric space $(X,d)$ introduced in Example 1.3 (ii).

2.10 Show that the set $[1,\infty)$ is closed in the metric space in Problem 2.9.
NEW TERMS: open ball, radius of an open ball, supremum metric, open ball with respect to the Euclidean metric, open ball with respect to the supremum metric, open ($d$-open) set, interior point, interior of a set, closed set, closure point, closure of a set, closed ball, accumulation point, derived set
3. CONVERGENCE IN METRIC SPACES

This section introduces the reader to one of the central notions in the analysis of metric spaces: convergence. Among other things, we will discuss the relation between limit points and closure points.
3.1 Definitions.

(i) Recall that a function $[\mathbb{N}, X, f]$ is called a sequence, and its most commonly used notation is $\{x_n\} = f$, with $x_n = f(n)$. Let $\{x_n\} \subseteq (X,d)$ be a sequence and let $x \in X$. A subsequence $Q_N = \{x_N, x_{N+1}, \ldots\}$ is called an $N(x,\varepsilon)$-tail of $\{x_n\}$ if there are $N \ge 1$ and $\varepsilon > 0$ such that $Q_N \subseteq B(x,\varepsilon)$. The sequence $\{x_n\}$ is said to converge to a point $x \in X$ if for every $\varepsilon > 0$, there is an $N(x,\varepsilon)$-tail. In notation,

$$\lim_{n \to \infty} d(x_n,x) = 0$$

(also $d\text{-}\lim_{n \to \infty} x_n = x$ or just $x_n \to x$). The point $x$ is called a limit point of the sequence $\{x_n\}$. A sequence is convergent if it converges to at least one limit point that belongs to $X$.

(ii) A point $x$ is said to be a limit point of a set $A$ if there is a sequence $\{x_n\} \subseteq A$ convergent to $x$.

(iii) A sequence $\{x_n\}$ is called a Cauchy sequence, in notation

$$\lim_{n,m \to \infty} d(x_n,x_m) = 0,$$

if for each $\varepsilon > 0$, there is an $N$ such that $d(x_n,x_m) < \varepsilon$ for $n,m > N$.

(iv) A metric space $(X,d)$ is called complete if every Cauchy sequence in $X$ is convergent.

(v) A sequence $\{x_n\}$ is called bounded if for every $n$, $d(x_1,x_n) \le M$, where $M$ is a positive real number. □
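The notion of an $N(x,\varepsilon)$-tail in Definition 3.1 (i) can be illustrated with the sequence $x_n = 1/n$ in $(\mathbb{R},d_e)$ and the limit $x = 0$. The sketch below is an assumption-laden illustration: the choice of sequence and the function name `tail_index` are not from the text. It returns the smallest $N$ for which the tail $\{x_N, x_{N+1}, \ldots\}$ lies in $B(0,\varepsilon)$, using the fact that $1/n$ is decreasing.

```python
def tail_index(eps):
    """Smallest N >= 1 such that 1/n < eps for every n >= N (sequence x_n = 1/n)."""
    n = 1
    while 1.0 / n >= eps:
        n += 1
    # Since 1/n decreases in n, 1/N < eps implies 1/n < eps for all n >= N.
    return n


for eps in (0.5, 0.1, 0.01):
    N = tail_index(eps)
    print(eps, N)
```

Because such an index exists for every $\varepsilon > 0$, the sequence converges to 0 in the sense of the definition; the index grows as $\varepsilon$ shrinks.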
3.2 Remark. A sequence in a metric space can have at most one limit point. Indeed, let $x, y$ be limits of a sequence $\{x_n\} \subseteq (X,d)$ and let $\varepsilon > 0$ be arbitrary. Then, for a sufficiently large $N$, by the triangle inequality,

$$d(x,y) \le d(x,x_n) + d(x_n,y) < \varepsilon, \quad n \ge N$$

(i.e. $d(x,y)$ can be made arbitrarily small). Thus, $x = y$.
3.3 Theorem. Let $A \subseteq (X,d)$. Then a point $x$ is a closure point of the set $A$ if and only if $x$ is a limit point of $A$ (i.e. there is a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$).

Proof. (i) Let $x$ be a closure point of $A$. If $x \in A$ then the proof becomes trivial (take $x_n = x$, $n = 1,2,\ldots$). Let $x \in X \setminus A$. By the definition of a closure point, every open ball $B(x,r)$ meets $A$. Thus for every $n$, there is a point $x_n \in A \cap B(x,\tfrac{1}{n})$, so that $d(x,x_n) < \tfrac{1}{n}$. Therefore, $\{x_n\}$ is a desired sequence convergent to $x$.

(ii) Let $\{x_n\} \subseteq A$ be such that $\lim_{n \to \infty} x_n = x$. We prove that $x \in \bar{A}$. The convergence implies that for every $\varepsilon > 0$, there is an $N$ such that $d(x_n,x) < \varepsilon$ for all $n \ge N$. Thus $\forall \varepsilon > 0$, $B(x,\varepsilon) \cap A \ne \emptyset$, which yields that $x \in \bar{A}$. (In particular, if $x \in A' \setminus A \ne \emptyset$, then there exists a sequence $\{x_n\}$ with all distinct terms such that $x_n \to x$.) □
3.4 Corollary. A subset $A$ of a metric space $(X,d)$ is closed if and only if it contains all of its limit points.

Proof. (i) Let $A$ be closed and let $\{x_n\} \subseteq A$ be a convergent sequence. Then, by Theorem 3.3,

$$\lim_{n \to \infty} x_n = x \in \bar{A}.$$

Since $A$ is closed, $A = \bar{A}$ and $x \in A$. Thus, $A$ contains all of its limit points.

(ii) Let $A$ contain all of its limit points. Apply the pick-a-point process. Let $x \in \bar{A}$. Then, by Theorem 3.3, there is a sequence $\{x_n\} \subseteq A$ such that $\lim_{n \to \infty} x_n = x$. By our assumption, $x$ belongs to $A$ or, equivalently, $\bar{A} \subseteq A$, implying that $A = \bar{A}$ and hence $A$ is closed. □
3.5 Definitions.

(i) A subset $A \subseteq (X,d)$ is called dense in $X$ if $\bar{A} = X$. [By Theorem 3.3, $A$ is dense in $X$ if and only if the set of all limit points of $A$ coincides with $X$, or, in other words, if and only if for every $x \in X$, there exists a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$.]

(ii) A set $A \subseteq (X,d)$ is called nowhere dense if its closure has the empty set for its interior, i.e., if $\mathrm{Int}(\mathrm{Cl}(A)) = \emptyset$.

(iii) A point $x \in (X,d)$ is called a boundary point of $A$ if every open ball at $x$ contains points from $A$ and from $A^c$. The set of all boundary points of $A$ is called the boundary of $A$ and is denoted by $\partial A$. [Note that $\partial A = \partial A^c = \bar{A} \cap \overline{A^c}$.]

3.6 Examples.

(i) Since each irrational number can be represented as the limit of a sequence of rational numbers, $\mathbb{Q}$ is dense in $\mathbb{R}$ (in terms of the Euclidean metric).

(ii) $X$ and $\emptyset$ have no boundary points.

(iii) Let $A = [0,1) \cup \{2\}$. Then $\mathring{A} = (0,1)$, $\bar{A} = [0,1] \cup \{2\}$, $A' = [0,1]$, and $\partial A = \{0,1,2\}$ (since $A^c = (-\infty,0) \cup [1,2) \cup (2,\infty)$, $\overline{A^c} = (-\infty,0] \cup [1,\infty)$, and $\bar{A} \cap \overline{A^c} = \{0,1,2\}$).

(iv) Let $A = \{1,5,10\} \subseteq (\mathbb{R},d_e)$. Then $A$ is nowhere dense.

(v) $\{\tfrac{1}{n} : n = 1,2,\ldots\}$ is nowhere dense in $(\mathbb{R},d_e)$.
PROBLEMS

3.1 Show that every convergent sequence is a Cauchy sequence. Give an example in which the converse is not true.

3.2 Prove that $\bar{A} = \mathring{A} + \partial A$ (a disjoint union).

3.3 If $x \in \partial A$, must $x$ be an accumulation point?

3.4 Prove that a set $A \subseteq (X,d)$ is nowhere dense in $X$ if and only if the complement of its closure is dense in $X$.

3.5 Assuming that $(\mathbb{R}, d_e)$ is complete (a known fact from calculus), prove that $(\mathbb{R}^n,d_e)$ is also complete.

3.6 Show that any Cauchy sequence is bounded.

3.7 Show that in a discrete metric space any convergent sequence has at most finitely many distinct terms.

3.8 Show that any discrete metric space is complete.

3.9 Show that if $\{x_n\} \subseteq (X,d)$ is a Cauchy sequence and $\{x_{n_k}\}$ is a subsequence convergent to a point $a \in X$, then $x_n \to a$.
NEW TERMS: sequence, $N(x,\varepsilon)$-tail, convergent sequence, limit point of a sequence, limit point of a set, Cauchy sequence, complete metric space, bounded sequence, dense set, nowhere dense set, boundary point, boundary of a set
4. CONTINUOUS MAPPINGS IN METRIC SPACES

4.1 Definition. Let $(X,d)$ and $(Y,\rho)$ be two metric spaces. A function $f: (X,d) \to (Y,\rho)$ is called continuous at a point $x_0 \in X$ if for each $\varepsilon > 0$, there is a number $\delta > 0$ such that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$. The function $f$ is called continuous on $X$ or simply continuous if $f$ is continuous at every point of $X$. □
4.2 Remark. Since $x_0 \in f^*(\{f(x_0)\})$, $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$. However, in general, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. The continuity of the function $f$ at $x_0$ is equivalent to the statement that, for any $\varepsilon > 0$, $x_0$ is indeed an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In other words, $f$ is continuous at $x_0$ if and only if the inverse image under $f$ of any open ball centered at $f(x_0)$ contains $x_0$ as an interior point. (See Figure 4.1.) Consequently, there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$. In particular, this implies that: 1) such a positive $\delta$ exists, and 2) the image $f_*(B_d(x_0,\delta))$ of $B_d(x_0,\delta)$ under $f$ is a subset of $B_\rho(f(x_0),\varepsilon)$, which guarantees that $\rho(f(x),f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$.

[Figure 4.1]

However, if $f$ is not continuous at $x_0$, as depicted in Figure 4.2 below, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In this case, no ball $B_d(x_0,\delta)$ can be inscribed in $f^*(B_\rho(f(x_0),\varepsilon))$ or, equivalently, no positive $\delta$ exists to warrant $\rho(f(x),f(x_0))$ to be less than $\varepsilon$ for all $x$ with $d(x,x_0) < \delta$.

[Figure 4.2]

The following theorem is a generalization of the above principles of continuity.
4.3 Theorem. A function $f: (X,d) \to (Y,\rho)$ is continuous if and only if the inverse image of any open set in $(Y,\rho)$ under $f$ is open in $(X,d)$.

Proof. 1) As mentioned in Remark 4.2, we will begin the proof by showing the validity of the following assertion:

$f$ is continuous at $x_0$ if and only if $x_0$ is an interior point of the inverse image under $f$ of any open ball $B_\rho(f(x_0),\varepsilon)$.

Let $x_0$ be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. Then there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$, and hence (by Problems 3.6 (a) and 2.6 of Chapter 1),

$$f_*(B_d(x_0,\delta)) \subseteq f_*(f^*(B_\rho(f(x_0),\varepsilon))) \subseteq B_\rho(f(x_0),\varepsilon),$$

which yields continuity of $f$ at $x_0$. Now, let $f$ be continuous at $x_0$. Then the inclusion $f_*(B_d(x_0,\delta)) \subseteq B_\rho(f(x_0),\varepsilon)$ holds, which, along with Problem 2.5 (Chapter 1), leads to the following sequence of inclusions:

$$B_d(x_0,\delta) \subseteq f^*(f_*(B_d(x_0,\delta))) \subseteq f^*(B_\rho(f(x_0),\varepsilon)).$$

Because $x_0$ is the center of $B_d(x_0,\delta)$, it is an interior point of this ball and, due to the last inclusion, an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$.

2) Suppose $f$ is continuous on $X$. We show that for each open set $O \subseteq Y$, $f^*(O)$ is open in $(X,d)$. Pick a point $x_0 \in f^*(O)$. Then $f(x_0) \in f_*(f^*(O)) \subseteq O$ and, since $O$ is open, $f(x_0)$ is its interior point. Thus, $O$ is a superset of the open ball $B_\rho(f(x_0),\varepsilon)$, for some $\varepsilon$, and consequently,

$$f^*(B_\rho(f(x_0),\varepsilon)) \subseteq f^*(O). \qquad (4.3)$$

Since $f$ is continuous at $x_0$, by assertion 1), $x_0$ must be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$, and, by (4.3), an interior point of $f^*(O)$. Thus, $f^*(O)$ is open.

3) Let $f^*(O)$ be open in $(X,d)$ for every open subset $O$ of $Y$. Take $x_0 \in X$ and construct an open ball $B_\rho(f(x_0),\varepsilon)$. By our assumption, the set $f^*(B_\rho(f(x_0),\varepsilon))$ is open in $(X,d)$. Since $f(x_0) \in B_\rho(f(x_0),\varepsilon)$, we have that

$$f^*(\{f(x_0)\}) \subseteq f^*(B_\rho(f(x_0),\varepsilon)),$$

and, therefore, $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$ and it is an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. By 1), $f$ must then be continuous at $x_0$. □
There is yet another useful criterion of continuity.

4.4 Theorem. A function $f: (X,d) \to (Y,\rho)$ is continuous at $x \in X$ if and only if for every sequence $\{x_n\}$, $d$-convergent to $x$, its image sequence $\{f(x_n)\}$ is $\rho$-convergent to $f(x)$.

We will prove this theorem for a more general case in Chapter 3 (Theorems 4.9 and 4.10).
4.5 Definition. Let (X,d) be a metric space and ~ ( d be ) the collection ) just T) is of all open subsets of X with respect to metric d. Then ~ ( d (or said to be the topology on X generated b y d. Theorem 4.3 can now be reformulated as follows. 4.6 Theorem. Let f : (X,d) t (Y,p) be a function and let r ( d ) and ~ ( p be ) the topologies generated b y metrics d and p, respectively. Then f is continuous on X if and only i f f **(T(P)) E ~ ( d )[i.e., VO E ~ ( p ) , f * ~ E)~ ( d ) l * 0
4.7 Example. Let f: (W,d) (R,d,) be the Dirichlei function defined as f = l q , where Q is the set of rational numbers. If d = d, is the Euclidean metric then f is discontinuous a t every point. If d is the discrete metric, by Theorem 4.3, f is continuous on R, since the inverse image of any open set in (W,d,) under f is clearly an element of the power set coinciding with the "discrete topology" generated by the dis0 crete metric (see Example 2.7). We will further be interested in the conditions under which two different metrics on X generate one and the same topology. This property of metrics satisfies an equivalence relation on the set of all topologies on X and hence referred to as equivalence of metrics. In other words, topologies generated by metrics on a carrier induce an equivalence relation.
4.8 Definition. Two metrics dl and d2 on X are called equivalent if ~ ( d , )= r(dz) (in notation dl R d2).
4.9 Remark. Let $(X,d_1)$ and $(X,d_2)$ be two metric spaces and let $f: (X,d_1) \to (X,d_2)$ be the identity function ($f(x) = x$, $x \in X$). If $d_1$ and $d_2$ are equivalent and therefore $\tau(d_1) = \tau(d_2)$, then for every open set $O$ in $(X,d_2)$ (and in $(X,d_1)$), $f^*(O) \in \tau(d_1)$. According to Theorem 4.4, this is equivalent to the statement that

$$\lim_{n \to \infty} d_1(x_n,x) = 0$$

implying that

$$\lim_{n \to \infty} d_2(f(x_n),f(x)) = \lim_{n \to \infty} d_2(x_n,x) = 0.$$

Thus, assuming

(i) $\tau(d_1) = \tau(d_2)$,

we showed that

(ii) $\lim_{n \to \infty} d_1(x_n,x) = 0 \Leftrightarrow \lim_{n \to \infty} d_2(x_n,x) = 0$.

By Theorem 4.4, it follows that the converse is also true, i.e. that statement (ii) implies statement (i). Hence, we may call two metrics $d_1$ and $d_2$ on $X$ equivalent if (i) or (ii) holds. □

From Theorem 4.3, it also follows that the identity map above is continuous under equivalent metrics. However, an identity map need not be continuous if $d_1$ and $d_2$ are not equivalent.

4.10 Definitions.

(i) Let $A$ be a subset of a metric space $(X,d)$. The number

$$d(A) = \sup\{d(x,y) : x,y \in A\}$$

(more precisely, a real number or infinity) is called the diameter of $A$. The set $A$ is called $d$-bounded or just bounded if $d(A) < \infty$. In particular, the metric space $(X,d)$, or $d$, is called bounded if $X$ is bounded. $A$ is said to be unbounded if $d(A) = \infty$.

(ii) A subset $A$ of a metric space $(X,d)$ is called totally bounded if for every $\varepsilon > 0$, the set $A$ can be covered by finitely many $\varepsilon$-balls (i.e. balls with common radius $\varepsilon$). □
4.11 Example. According to Problem 1.4, the function

$$\rho(x,y) = \frac{d(x,y)}{1 + d(x,y)}$$

defined on a metric space $(X,d)$ is a metric on $X$. Obviously $\lim_{n \to \infty} d(x_n,x) = 0$ if and only if $\lim_{n \to \infty} \rho(x_n,x) = 0$ (due to $d = \frac{\rho}{1-\rho}$). Therefore, $d$ and $\rho$ are equivalent. Observe that $\rho$ is clearly bounded while $d$ is arbitrary. □
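The construction in Example 4.11 is easy to try numerically. The sketch below assumes the Euclidean metric on $\mathbb{R}$ as the starting metric $d$ and the sequence $x_n = 1/n$; the names `bounded` and `d_e` are illustrative only. It builds $\rho = d/(1+d)$ and shows that $\rho$ stays below 1 even for very distant points, while both distances to the limit shrink together.

```python
def bounded(d):
    """Return rho = d / (1 + d) for a given metric d (Problem 1.4)."""
    def rho(x, y):
        t = d(x, y)
        return t / (1.0 + t)
    return rho


def d_e(x, y):
    """Euclidean metric on the real line."""
    return abs(x - y)


rho = bounded(d_e)

# rho is bounded by 1 although d_e is unbounded.
print(d_e(0.0, 1e9), rho(0.0, 1e9))

# Along x_n = 1/n both d_e(x_n, 0) and rho(x_n, 0) decrease towards 0.
for n in (1, 10, 100, 1000):
    x_n = 1.0 / n
    print(n, d_e(x_n, 0.0), rho(x_n, 0.0))
```

Since $\rho \le d$ and $d = \rho/(1-\rho)$, each distance tends to zero exactly when the other does, which is statement (ii) of Remark 4.9.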
We finish this section with a short discussion of uniform continuity. This concept will be further developed in Section 6 and Chapter 3.

4.12 Definition. A function $f: (X,d) \to (Y,\rho)$ is called uniformly continuous on $X$ if for every $\varepsilon > 0$, there is a positive real number $\delta$ such that $d(x,y) < \delta$ implies that $\rho(f(x),f(y)) < \varepsilon$, for every $x,y \in X$.

Unlike continuity, uniform continuity guarantees the existence of such a positive $\delta$ (for every fixed $\varepsilon$) for all points of $X$ simultaneously. In the case of usual continuity, $\delta$ depends upon the particular point $x \in X$ where the continuity holds, so that a common $\delta$, good for all points $x \in X$, need not exist. Clearly, uniform continuity implies continuity. Uniform continuity can also be defined on some subset $A$ of $X$, so that in Definition 4.12, $X$ is replaced by $A$.

4.13 Examples.
(i) Consider $f: (\mathbb{R},d_e) \to (\mathbb{R},d_e)$ such that $f(x) = x^2$. Then $|x - x_0| < \delta$ implies that

$$|x + x_0| \le |x - x_0| + 2|x_0| < \delta + 2|x_0|$$

and

$$|x^2 - x_0^2| = |x - x_0|\,|x + x_0| < \delta(\delta + 2|x_0|).$$

Take $\delta(\delta + 2|x_0|)$ as $\varepsilon$. Then $\delta$ can be found explicitly as a function of $\varepsilon$ such that

$$\delta = \sqrt{x_0^2 + \varepsilon} - |x_0|.$$

Therefore, the function $x^2$ is $d_e$-continuous at every point $x_0 \in \mathbb{R}$. However, $x^2$ is not uniformly continuous on $\mathbb{R}$, since $\delta$ depends upon $x_0$ as well. Specifically, $\delta \to 0$ when $x_0 \to \infty$. Consequently, we cannot find a $\delta > 0$ good for all $x_0$.

(ii) Let $f(x) = x^2$ be given as

$$f: ([0,3],d_e) \to (\mathbb{R},d_e).$$

From the last inequality above we derive

$$|x^2 - x_0^2| < \delta(\delta + 6),$$

and thus $\delta = \sqrt{9 + \varepsilon} - 3$, where $\varepsilon = \delta(\delta + 6)$. Thus $d_e(f(x),f(x_0)) < \varepsilon$ whenever $d_e(x,x_0) < \delta = \sqrt{9 + \varepsilon} - 3$. Since $\delta$ is independent of $x_0$, $f(x)$ is uniformly continuous. Observe that $f$ has been given on a closed and bounded interval, which provides the uniform continuity. However, in this case $f$ would also be uniformly continuous if $f$ were defined on any bounded but not necessarily closed interval, for instance $(0,3)$ (why?).

(iii) A continuous function can be uniformly continuous over unbounded sets, as, for example, the functions $f(x) = \frac{1}{x}$, $x \in [1,\infty)$, and $f(x) = \sin x$, $x \in \mathbb{R}$.

There is an analytical result, known as the Heine-Borel Theorem, stating that any continuous function defined on a closed and bounded set in any Euclidean metric space is also uniformly continuous. The general form of this result will be discussed in Section 6 (Theorem 6.13).

4.14 Remark. It is known from calculus that the space of all real-valued continuous functions defined on $\mathbb{R}^n$ is closed under the formation of the main algebraic operations. What if the functions were defined on an arbitrary space $(X,d)$? We give here an informal discussion of this matter. Let $\mathbb{R}^X$ be the space of all real-valued functions defined on a set $X$ and let $f,g \in \mathbb{R}^X$. Define the following.

(i) $f \pm g$ is the function such that for each point $x \in X$, $(f \pm g)(x) = f(x) \pm g(x)$.

(ii) $f \cdot g$ is the function such that $\forall x \in X$, $(f \cdot g)(x) = f(x) \cdot g(x)$.

(iii) $+\infty$ and $-\infty$ are not real numbers. Consequently, $f/g$ is the function such that for all $x \in X$, $(f/g)(x) = f(x)/g(x)$, excluding $x \in X$ for which $g(x) = 0$. At all those values, the function $f/g$ is either undefined or can be specified.

(iv) As a special case, any real-valued function multiplied by a real number is a real-valued function too.

(v) The associative (relative to multiplication) and distributive laws of functions relative to the addition and multiplication defined in (i) and (ii) are the corresponding consequences of these laws for real numbers.

Bearing in mind these observations, we conclude that the space $\mathbb{R}^X$ is a commutative algebra over $\mathbb{R}$ with unity and a vector lattice (that was also mentioned in Example 7.7 (ix), Chapter 1). A subset $\mathcal{C}((X,d);(\mathbb{R},\rho))$ (of $\mathbb{R}^X$) of all continuous functions is a subalgebra characterized by the following properties:

(a) $f,g \in \mathcal{C} \Rightarrow af + bg \in \mathcal{C}$, $\forall a,b \in \mathbb{R}$.
(b) $f,g \in \mathcal{C} \Rightarrow fg \in \mathcal{C}$.
PROBLEMS
4.1 Show that if A is totally bounded then A is bounded. Give an example of a bounded set that is not totally bounded.
4.2 Prove that 𝒞 is indeed a subalgebra with properties (a) and (b) above.
4.3 Show that a continuous bounded function on a bounded interval need not be uniformly continuous.
In the problems below it is assumed that f and g are functions from (ℝ,d_e) to (ℝ,d_e).
4.4 Let f : ((−∞,0),d_e) → ((−∞,0),d_e) be the function given by f(x) = 1/x. Show that f is continuous. Explain why f is not uniformly continuous.
4.5 Let f : A → ℝ be a differentiable function such that its derivative f' is bounded over A, where A is an arbitrary (bounded or unbounded) interval. Show that f is uniformly continuous on A.
4.6 Show that if f and g are uniformly continuous on ℝ and bounded, then fg is uniformly continuous on ℝ too.
4.7 Which of the following functions are uniformly continuous?
a) f(x) = sin²x (x ∈ ℝ).
b) f(x) = x³ cos x (x ∈ ℝ).
c) f(x) = x sin x (x ∈ ℝ).
d) f(x) = ln x (x ∈ [1,∞)).
e) f(x) = x² ln x (x ∈ (1,100)).
4.8 Let f be a continuous function and g a uniformly continuous function on a set A such that |f| ≤ |g|. Is f then uniformly continuous?
4.9 Show that in (ℝⁿ,d_e), any bounded set is also totally bounded.
NEW TERMS: continuous function at a point, continuous function on a set, inverse image of an open set under f, continuity criteria, topology generated by a metric, Dirichlet function, equivalent metrics, diameter of a set, bounded set, d-bounded set, unbounded set, totally bounded set, uniformly continuous function, algebra of functions
5. COMPLETE METRIC SPACES
In this section we discuss the completeness of metric spaces as it was introduced in Definition 3.1 (iv).
5.1 Theorem. Let (X,d) be a complete metric space. Then a subspace (A,d) is complete if and only if A is closed.
Proof. Let A be closed and let {x_n} ⊆ A be any Cauchy sequence. Since (X,d) is complete, there is a point x ∈ X such that lim_{n→∞} x_n = x. Then, by Corollary 3.4, x ∈ A. Thus, (A,d) is complete. Now, let (A,d) be complete and let {x_n} be any sequence in A that converges in X. Then this sequence is also a Cauchy sequence and hence A contains its limit. Therefore, A is closed, again by Corollary 3.4. □
The reader should be aware of the difference between the notions of completeness and closedness of a subspace. (See Problem 5.3.)
5.2 Theorem. A metric space (X,d) is complete if and only if every nested sequence {C(x_n,r_n)} of closed balls, with r_n ↓ 0 as n → ∞, has a nonempty intersection.
Proof. Let {C(x_n,r_n)} be such a nested sequence. Because r_n ↓ 0, for any ε > 0 there is an integer ν such that r_ν < ε/2. Given that k > n ≥ ν, the centers x_k and x_n both belong to C(x_ν,r_ν) and, consequently,
\[ d(x_k, x_n) \le 2 r_\nu < \varepsilon. \]
Therefore, {x_n} is a Cauchy sequence. First assume that (X,d) is complete. Then {x_n} converges to a point, say x ∈ X. Since each ball C(x_n,r_n) contains the tail {x_n, x_{n+1}, ...} of the sequence and because it is closed, it must contain x. Thus,
\[ \bigcap_{n=1}^{\infty} C(x_n, r_n) \]
contains x and hence it is not empty.
Now, let any nested sequence of closed balls with radii decreasing to zero have a nonempty intersection and let {x_k} be a Cauchy sequence in X. By Definition 3.1 (iii), this implies the existence of an increasing sequence {ν_1, ν_2, ...} of indices of {x_k} such that for each n,
\[ d(x_s, x_{\nu_n}) < \frac{1}{2^{n+1}} \quad \text{for } s > \nu_n. \]
We show that the sequence of closed balls C_n = C(x_{ν_n}, 1/2^n), n = 1,2,..., is nested. Indeed, let y ∈ C_{n+1}. Then
\[ d(y, x_{\nu_{n+1}}) \le \frac{1}{2^{n+1}} \quad \text{and} \quad d(x_{\nu_n}, x_{\nu_{n+1}}) < \frac{1}{2^{n+1}}. \]
Therefore,
\[ d(y, x_{\nu_n}) \le d(y, x_{\nu_{n+1}}) + d(x_{\nu_{n+1}}, x_{\nu_n}) < \frac{1}{2^n}, \]
which yields that y is an interior point of C_n and thus C_n ⊇ C_{n+1}.
Since, by our assumption, the intersection ∩_{n=1}^∞ C_n ≠ ∅, there is at least one point, say x, that belongs to all balls. Furthermore, because the sequence {r_n} of their radii converges to zero, the subsequence {x_{ν_n}} of their centers must converge to x ∈ X and thus, by Problem 3.9, {x_k} also converges to x. □
5.3 Remark. Clearly, in the final phrase of the last theorem, the point x is the unique point of the intersection ∩_{n=1}^∞ C_n. The theorem below is a useful refinement of this statement due to Georg Cantor. Because of its similarity with Theorem 5.2, its proof is suggested as an exercise (Problem 5.8).
5.4 Theorem (Cantor). Let (X,d) be a complete metric space and let {A_n}↓ ⊆ X be a decreasing sequence of nonempty closed subsets with
\[ \lim_{n\to\infty} d(A_n) = 0. \]
Then
\[ \bigcap_{n=1}^{\infty} A_n \]
consists of exactly one element.
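To make Cantor's Theorem concrete, the following minimal sketch takes the closed intervals A_n = [−1/n, 1/n] in (ℝ,d_e) as an assumed example (it is not taken from the text). It checks that the first 1000 sets are nested, that their diameters shrink, and that the partial intersection still contains the point 0. The helper names are hypothetical.

```python
from fractions import Fraction

def interval(n):
    """Closed interval A_n = [-1/n, 1/n], stored as exact endpoints."""
    return (Fraction(-1, n), Fraction(1, n))

def diameter(iv):
    """Diameter of a closed interval: right endpoint minus left endpoint."""
    lo, hi = iv
    return hi - lo

def intersect(ivs):
    """Intersection of finitely many closed intervals, or None if empty."""
    lo = max(iv[0] for iv in ivs)
    hi = min(iv[1] for iv in ivs)
    return (lo, hi) if lo <= hi else None

sets = [interval(n) for n in range(1, 1001)]

# A_{n+1} is contained in A_n for every consecutive pair.
nested = all(a[0] <= b[0] and b[1] <= a[1] for a, b in zip(sets, sets[1:]))

# The intersection of the first 1000 sets is A_1000 itself.
partial = intersect(sets)
print(nested, partial, diameter(partial))
```

A finite computation can only suggest the limit behaviour: the diameters 2/n tend to zero, and 0 is the only point lying in every A_n.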
5.5 Definition. A function [X, (Y,d), f] is called d-bounded if Y is a linear space and there is a nonnegative real number M such that d(f(x), 0(x)) ≤ M, ∀x ∈ X, where 0 is the function identically equal to θ ∈ Y (the origin of Y). □
5.6 Examples.
(i) Let X be a nonempty set, (Y,d) a linear metric space, and let ℱ_b = ℱ_b(X;(Y,d)) be the set of all d-bounded functions from X to Y. For all f, g ∈ ℱ_b define
\[ \rho(f,g) = \sup\{ d(f(x), g(x)) : x \in X \}. \]
It can be shown (Problem 5.4) that ρ is a metric on ℱ_b, called a uniform (or supremum) metric. Consequently, the convergence in (ℱ_b,ρ) is called uniform convergence. A subset of functions ℱ ⊆ ℱ_b is said to be uniformly bounded on X if ℱ is ρ-bounded, i.e., diam ℱ ≤ M (a positive real number). We show that any Cauchy sequence in (ℱ_b,ρ) is uniformly bounded. We will make use of Problem 5.5. Let {f_n} be a Cauchy sequence in (ℱ_b,ρ). Therefore, for ε = 1, there is an N = N(1) such that ρ(f_n,f_k) < 1, n,k ≥ N. Let k = N(1). Then,
\[ \rho(f_n, 0) \le \rho(f_n, f_N) + \rho(f_N, 0) < 1 + M(f_N), \quad n \ge N, \]
where M(f_N) is a "ρ-bound" of the function f_N. If M(f_i) is a bound of f_i, then M, defined as max{M(f_1), ..., M(f_{N−1}), 1 + M(f_N)}, ρ-dominates the whole sequence {f_n}. By Problem 5.5, we have that {f_n} is ρ-bounded.
(ii) Assume that (Y,d) is a complete linear metric space. Let us show that then (ℱ_b,ρ) is complete too. Consider a Cauchy sequence {f_n} ⊆ (ℱ_b,ρ). It is obvious that for each fixed x ∈ X, the sequence {f_n(x)} is also Cauchy in (Y,d). Since (Y,d) is by our assumption complete, the "pointwise limit" of {f_n} exists. Denote it by f. In other words,
\[ \lim_{n\to\infty} d(f_n(x), f(x)) = 0, \quad \forall x \in X. \]
We need to show that f ∈ (ℱ_b,ρ). Since {f_n} is a Cauchy sequence, according to (i) it is uniformly bounded by a real number M. Thus we have
\[ d(f(x), 0(x)) \le d(f(x), f_n(x)) + d(f_n(x), 0(x)) \le d(f(x), f_n(x)) + M. \]
The last inequality holds for every x ∈ X and every n; letting n → ∞ yields
\[ d(f(x), 0(x)) \le M, \quad \text{for all } x \in X. \]
Consequently, ρ(f,0) ≤ M and hence f ∈ (ℱ_b,ρ). We only showed that f_n(x) → f(x), for each x ∈ X, and that f ∈ ℱ_b. The assertion that f_n → f in the metric ρ is the subject of Problem 5.6.
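The uniform metric of Example 5.6 can be illustrated numerically. The sketch below makes two assumptions that are not in the text: X = [0,1] is replaced by a finite grid of sample points, so the supremum becomes a maximum, and the sample functions f_n(x) = x/n are chosen only for illustration. The names `sup_metric`, `points` and `f` are hypothetical.

```python
def sup_metric(f, g, points):
    """Approximate rho(f, g) = sup |f(x) - g(x)| by a maximum over sample points."""
    return max(abs(f(x) - g(x)) for x in points)

# Sample grid standing in for X = [0, 1] (an assumption of this sketch).
points = [k / 100 for k in range(101)]

def zero(x):
    """The function identically equal to the origin of Y = R."""
    return 0.0

def f(n):
    """Sample sequence f_n(x) = x / n, which tends to the zero function."""
    return lambda x: x / n

# Distances rho(f_n, 0) for n = 1, 10, 100 shrink like 1/n.
dists = [sup_metric(f(n), zero, points) for n in (1, 10, 100)]
print(dists)
```

On this grid the maximum is attained at x = 1, so the computed distances equal 1/n, consistent with uniform convergence of {f_n} to the zero function.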
PROBLEMS
5.1 Using arguments similar to those in Example 5.6, show that the limit of any uniformly convergent sequence of continuous bounded functions from (X,d_0) to (Y,d) is a bounded and continuous function.
5.2
Let {C,} be a sequence of closed balls in (Wn,d,) such that each of the balls Cn is centered a t a point zo E En and has radius n=
a,
00
1,2,. .. . Find fl C,. n=l
5.3 Show that if a metric space (X,d) is not complete then a closed subspace (A,d) need not be complete either. [Hint: Consider the metric space in Problems 2.9 and 2.10.]
5.4 Show that ρ, defined in Example 5.6 (i), is a metric on ℱ_b.
5.5 Let ℱ ⊆ ℱ_b(X;(Y,d)), where Y is a linear space. Prove that ℱ is ρ-bounded if and only if there is a positive constant M such that ρ(f,0) ≤ M for all f ∈ ℱ.
5.6 Show that in Example 5.6 (ii) f_n → f in the metric ρ.
5.7 We can make use of the fact that the Euclidean and uniform metrics are equivalent to show completeness of (ℝⁿ,d_e). For n = 1, it is well known from calculus. Prove completeness of (ℝⁿ,d_e) for an arbitrary n. (See Problem 4.9.)
5.8 Prove Cantor's Theorem 5.4.
5.9 Let (X,d) be a metric space. A subset A ⊆ X is said to be of the first category if it can be represented as a countable union of nowhere dense sets. Otherwise, A is of the second category. Prove Baire's Category Theorem: A complete metric space is of the second category.
NEW TERMS: completeness criteria, Cantor's Theorem on intersection of closed sets, d-bounded function, bounded function, uniform metric, supremum metric, uniform convergence, uniformly bounded set of functions, ρ-bound of a function, bound of a function, pointwise limit, Baire's Category Theorem
6. COMPACTNESS
Compactness is one of the kernel concepts in real analysis. We develop it in the present section for metric spaces and then in Chapter 3 for general topological spaces. It stems from the fact, known in ℝ, that every bounded sequence has a convergent subsequence, which implies that any sequence in a closed bounded interval has a subsequence convergent to a point in this interval. In a general metric space, a subset A in which every sequence has a subsequence convergent to a point in A is called sequentially compact or compact. Although compactness and sequential compactness are distinct notions in general topological spaces (and they are defined differently), they are equivalent in metric spaces, as Theorem 6.3 states. Continuous functions defined on compact sets are uniformly continuous; continuous images of compact sets are compact (hence, closed and bounded), and this means that in normed linear spaces continuous functions on compact sets reach their maximum values. Further applications lead to the celebrated Ascoli and Ascoli-Arzelà theorems.
6.1 Definitions.
(i) A family of sets {A_i : i ∈ I} ⊆ (X,d) is called a cover of a set A ⊆ X if
\[ A \subseteq \bigcup_{i \in I} A_i. \]
Any subfamily of {A_i : i ∈ I} which covers A is called a subcover of A. If {A_i : i ∈ I} is a family of open sets, then the corresponding cover (or subcover) is called an open cover (or an open subcover).
(ii) A set A ⊆ (X,d) is called compact if any open cover of A has within itself a finite subcover of A, or we will also say that "any open cover of A can be reduced to a finite subcover of A." Correspondingly, (X,d) is a compact metric space if X is compact. [Notice that any finite subset is compact. Consequently, to avoid triviality, in all theorems below we will assume that the sets or spaces under consideration are infinite.]
(iii) A set A ⊆ (X,d) is called a Lindelöf set if any open cover of A contains an at most countable subcover of A (or "can be reduced to an at most countable subcover"). (X,d) is called a Lindelöf space if X is a Lindelöf set. □
A noteworthy property of Euclidean spaces is given in the following classical result.
6.2 Theorem (Lindelöf). (ℝⁿ,d_e) is a Lindelöf space.
(See Problem 6.7.)
6.3 Theorem. For a set A ⊆ (X,d), the following statements are equivalent.
(i) A is compact.
(ii) Every infinite subset of A has an accumulation point in A (in this case A is called Bolzano-Weierstrass compact).
(iii) Every sequence in A has a subsequence that converges in A (A is called sequentially compact).
The sequential compactness of a subspace implies its completeness. (See Problem 6.6.) The proofs of the above statements are left to the reader. (Problem 6.8.)
Definition 6.4. A metric space is called separable if it has a dense countable subset.
Example 6.5. The Euclidean metric space (ℝ,d_e) is separable. A relevant dense countable subset of ℝ would be ℚ, the set of rational numbers. Another example is the n-dimensional Euclidean metric space with the countable, dense subset ℚⁿ. □
Theorem 6.6. Any compact metric space is separable.
Proof. Let X be compact. It is easy to see that for each n ∈ ℕ, X can be covered by the family of open balls centered at every x ∈ X with radius 1/n. Since (X,d) is compact, this open cover can be reduced to a finite subcover, such that
\[ \bigcup_{x \in F_n} B(x, 1/n) \]
contains X, where F_n = {x_1^n, ..., x_{k_n}^n}. Denote
\[ F = \bigcup_{n=1}^{\infty} F_n, \]
which is obviously a countable subset of X. We show that F is dense in X, i.e., that the closure of F equals X. It is sufficient to prove that, for each y ∈ X and r > 0, the open ball B(y,r) contains at least one point of the set F, i.e., y is a closure point of F. Choosing such y and r, we take any n such that 1/n < r. Then, since
\[ y \in X \subseteq \bigcup_{x \in F_n} B(x, 1/n), \]
there is a point x_j^n ∈ F_n such that y ∈ B(x_j^n, 1/n). This implies that d(x_j^n, y) < 1/n < r and, therefore, x_j^n ∈ B(y,r). Consequently, B(y,r) ∩ F_n ≠ ∅ and B(y,r) ∩ F ≠ ∅. The proof of the statement is complete. □
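The construction in the proof of Theorem 6.6 can be traced in a simple case. The sketch below assumes X = [0,1] with the Euclidean metric and takes F_n = {0, 1/n, ..., 1} as the centers of a finite subcover by balls of radius 1/n; these choices and the function names are illustrative, not from the text. Given y and r, it picks n with 1/n < r and returns a point of F_n inside B(y,r).

```python
from fractions import Fraction
import math

def net(n):
    """Assumed finite set F_n of centers: the balls B(k/n, 1/n) cover [0, 1]."""
    return [Fraction(k, n) for k in range(n + 1)]

def witness(y, r):
    """Return a point of F = union of the F_n that lies in the open ball B(y, r).

    y is a Fraction in [0, 1] and r is a positive Fraction.
    """
    n = math.floor(1 / r) + 1  # n > 1/r, hence 1/n < r
    # The nearest center of F_n is within 1/(2n) of y, which is less than r.
    return min(net(n), key=lambda x: abs(x - y))

y, r = Fraction(1, 3), Fraction(1, 100)
w = witness(y, r)
print(w, abs(w - y) < r)
```

Every ball B(y,r) therefore meets the countable set F, which is the density claim of the proof in this special case.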
The following two theorems are among the central results in analysis.
Theorem 6.7. Let A ⊆ (X,d) be compact. Then A is closed and bounded.
Proof. 1) We show that A is bounded. Obviously A is covered by the family of open balls {B(x,1) : x ∈ A}. Since A is compact, this open cover can be reduced to a finite subcover, i.e.,
\[ A \subseteq \bigcup_{k=1}^{h} B(x_k, 1) \]
for some integer h. Let M = max{d(x_i,x_j) : i,j = 1,...,h}. Then M is finite. For any x, y ∈ A, there are x_i and x_j such that x ∈ B(x_i,1) and y ∈ B(x_j,1). The following holds due to the triangle inequality:
\[ d(x,y) \le d(x,x_i) + d(x_i,x_j) + d(x_j,y) < M + 2. \]
Therefore, A is bounded.
2) We show that A is closed, i.e., that A = Ā. Let x ∈ Ā. By Theorem 3.3, there exists a sequence {x_n} ⊆ A such that x_n → x. By Theorem 6.3, since A is compact, the sequence {x_n} ⊆ A has a subsequence that converges in A. By Problem 3.9, such a subsequence must have the same limit as {x_n}, i.e., x ∈ A. Therefore, A is closed. □
Theorem 6.8 (Heine-Borel). A set A ⊆ (ℝⁿ,d_e) is compact if and only if A is closed and bounded.
Proof. 1) If A is compact, it is closed and bounded as a special case of Theorem 6.7.
2) If A ⊆ (ℝⁿ,d_e) is closed and bounded, then d(x,y) ≤ M < ∞ ∀x,y ∈ A. Fix a y ∈ A and let a = (a_1,...,a_n) ∈ A. Then we have
\[ d(a,0) \le d(a,y) + d(y,0) \le M + d(y,0), \]
where 0 denotes the origin in ℝⁿ. Since y has a finite distance to the origin, every other point of A, like a, also has a finite distance to the origin, bounded by M + d(y,0). Note that even though d(a,0) < ∞ for unbounded sets, d(a,0) would not have a uniform bound unless A is bounded. Now we show that any d_e-bounded sequence in ℝⁿ has a convergent subsequence. The considerations below represent an appropriate selection procedure. Let {x_k} ⊆ A. Then {x_k^i} ⊆ ℝ is a bounded sequence of i-coordinates (the ith-component sequence), i = 1,...,n. A bounded sequence does not necessarily converge but does have a convergent subsequence.
For i = 1, let such a subsequence be {x^1_{r_1}, x^1_{r_2}, ...} with the limit point x^1.
Select from the 2nd-component sequence the subsequence with the same indices {x^2_{r_1}, x^2_{r_2}, ...}. This subsequence is also bounded and hence contains a convergent subsequence {x^2_{k_1}, x^2_{k_2}, ...} with a limit point x^2, so that the set of indices {k_1,k_2,...} ⊆ {r_1,r_2,...}. If we return to the subsequence {x^1_{r_1}, x^1_{r_2}, ...} and select from it the subsequence {x^1_{k_1}, x^1_{k_2}, ...}, then this subsequence is also convergent and has the same limit x^1. We can continue this process by taking the 3rd-component sequence, selecting the subsequence {x^3_{k_1}, x^3_{k_2}, ...} and from this sequence a convergent subsequence with a limit point x^3. Then the above 1st- and 2nd-component subsequences will be reduced to the ones with the indices from the third selection, and so on. Let x = (x^1,...,x^n) be the limit of the selected subsequence of {x_k}. Since A is closed, x must belong to A. Therefore, we have proven that an arbitrary sequence in A has a convergent subsequence in A, i.e., that A is sequentially compact. By Theorem 6.3, A is compact. □
6.9 Remark. The second part of the Heine-Borel Theorem does not hold for general metric spaces. That is, if A is closed and bounded, it need not be compact. For example, let X be an infinite set and let d be the discrete metric (which is finite) on X. Then X is closed and bounded. Now consider
\[ X = \bigcup_{x \in X} B(x,1), \quad \text{where } B(x,1) = \{x\}. \]
Since each of the balls covers just one point, the open cover {B(x,1) : x ∈ X} cannot be reduced to a finite subcover. Therefore, X is not compact.
6.10 Theorem. Let f : (X,d) → (Y,ρ) be a continuous, surjective function and let (X,d) be compact. Then the image f_*(X) = Y is compact. Consequently, the image of a compact set under a continuous function is compact.
Proof. Take any open cover {O_i : i ∈ I} of Y to have Y = f_*(X) ⊆ ∪_{i∈I} O_i. Then,
\[ X = f^*(Y) \subseteq f^*\Big( \bigcup_{i \in I} O_i \Big) = \bigcup_{i \in I} f^*(O_i). \]
Since f is continuous, f^*(O_i) is open, and because X is compact, there is a finite subcover of sets f^*(O_i), without loss of generality indexed by 1,...,n, i.e.,
\[ X \subseteq \bigcup_{k=1}^{n} f^*(O_k). \]
Therefore,
\[ f_*(X) \subseteq \bigcup_{k=1}^{n} f_*(f^*(O_k)) = \bigcup_{k=1}^{n} O_k. \]
(By Problem 2.7, Chapter 1, since f is surjective.) □
6.11 Remark. Let f : (X,d) → (ℝ,d_e) be a continuous map and let A ⊆ (X,d) be compact. Then by Theorem 6.10, f(A) is compact in (ℝ,d_e). By Theorem 6.7, f(A) is then closed and bounded, which means that the diameter of f(A) equals some M < ∞. As mentioned in part 2 of the proof of the Heine-Borel Theorem, this implies that all points of f(A) have a finite distance to the origin (i.e., are bounded by some M_0), or equivalently, |f(x)| ≤ M_0 for all x ∈ A. Since f(A) is also closed, it contains its infimum and supremum. We have therefore shown that a continuous real-valued map on a compact set assumes a minimum and a maximum value. □
6.12 Examples.
(i) In (ℝ,d_e), ℝ is closed but not bounded. Therefore, by the Heine-Borel Theorem, ℝ is not compact.
(ii) Take as A ⊆ (ℝ,d_e) the set (0,1], which is bounded but not closed and therefore is not compact. Consider the open cover of A given by the family of sets {(1/n, 2) : n = 1,2,...}. Obviously,
\[ (0,1] \subseteq \bigcup_{n=1}^{\infty} (1/n, 2). \]
It is not possible to select any finite subcover of A: the union of finitely many of these sets is (1/N, 2), where N is the largest index selected, and it misses the points of A in (0, 1/N]. Yet another argument that A is not compact (by Theorem 6.3) is that the sequence {1/n} ⊆ A has no subsequence that converges in A. □
A continuous function need not be uniformly continuous, unless it is defined on a compact set, as the following theorem states.
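6.13 Theorem. Let f : (X,d) → (Y,ρ) be a continuous function and let (X,d) be compact. Then f is uniformly continuous on X.
Proof. Let f be continuous at x. Then, for each ε > 0, there is a δ_x > 0 such that
\[ \rho(f(x), f(y)) < \varepsilon/2 \]
for all y with d(x,y) < δ_x. Since X is compact, after reduction of the open cover {B(x, δ_x/2) : x ∈ X}, there is an n-tuple of open balls such that
\[ X \subseteq \bigcup_{i=1}^{n} B(x_i, \delta_{x_i}/2). \]
Let δ = ½ min{δ_{x_1}, ..., δ_{x_n}} and let x, y be such that d(x,y) < δ. Then x ∈ B(x_i, δ_{x_i}/2) for some i, which implies that d(x,x_i) < δ_{x_i}/2 and
\[ d(y, x_i) \le d(y,x) + d(x,x_i) < \delta + \delta_{x_i}/2 \le \delta_{x_i}. \]
Thus, y belongs to the ball B(x_i, δ_{x_i}). Since y and x_i are within the distance of δ_{x_i}, due to continuity of f at x_i, given ε,
\[ \rho(f(x_i), f(y)) < \varepsilon/2. \]
Obviously, d(x,x_i) < δ_{x_i} yields ρ(f(x_i), f(x)) < ε/2 and, therefore,
\[ \rho(f(x), f(y)) \le \rho(f(x), f(x_i)) + \rho(f(x_i), f(y)) < \varepsilon. \]
□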
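A single δ that works at every point can be checked numerically for a sample function. The sketch below assumes f(x) = x² on the compact interval [0,3], an illustrative choice. Since |x² − y²| = |x − y||x + y| ≤ 6|x − y| there, δ = ε/6 serves for all points at once. The code verifies this on a grid with exact rational arithmetic and then shows that the same δ fails far from the origin. All names are hypothetical.

```python
from fractions import Fraction

def f(x):
    """Sample continuous function f(x) = x^2."""
    return x * x

eps = Fraction(1, 10)
delta = eps / 6  # candidate uniform delta on [0, 3], since |x + y| <= 6 there

# Grid of exact rationals 0, 1/100, ..., 3 standing in for the interval [0, 3].
grid = [Fraction(k, 100) for k in range(301)]

# Pairs closer than delta whose images are NOT closer than eps.
violations = [(x, y) for x in grid for y in grid
              if abs(x - y) < delta and abs(f(x) - f(y)) >= eps]

# The same delta fails far out on the real line, e.g. near x = 100.
big_x = Fraction(100)
big_y = big_x + Fraction(1, 100)
fails_far_out = abs(big_x - big_y) < delta and abs(f(big_x) - f(big_y)) >= eps

print(len(violations), fails_far_out)
```

The grid check is only a finite spot check. The last test shows that this particular δ stops working on an unbounded domain; it does not by itself prove that no δ works there.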
6.14 Theorem. A metric space (X,d) is compact if and only if it is complete and totally bounded.
Proof. 1) Let (X,d) be compact. Then by Problem 6.6, it is complete. Since X ⊆ ∪_{x∈X} B(x,ε) for any ε > 0, by compactness, the cover can be reduced to a finite subcover, which implies total boundedness.
2) Let (X,d) be complete and totally bounded. We will show that (X,d) is sequentially compact, which, by Theorem 6.3, would imply compactness. Let {x_n} be a sequence in X. We will construct a Cauchy subsequence. Since X is totally bounded, it can be covered by finitely many open balls of radius 1. Then at least one of the balls, for instance B_1, contains infinitely many terms, say {x_n^1}, of this sequence. Furthermore, cover X by finitely many balls of radius 1/2; again an infinite subsequence {x_n^2} ⊆ {x_n^1} (since B_1 will also be covered) is contained in one of the balls, which we label B_2, and so on, halving the radius at each step. The desired Cauchy sequence is formed by the selection of the first term from each subsequence. Indeed, by the construction, x_1^1 and x_1^2 belong to ball B_1. Thus, d(x_1^1, x_1^2) < 2. Likewise, x_1^2 and x_1^3 belong to ball B_2, which implies that d(x_1^2, x_1^3) < 1, and so on: for m > k, both x_1^k and x_1^m belong to B_k, so that d(x_1^k, x_1^m) < 2^{2−k}. Since (X,d) is complete, this Cauchy sequence is convergent, yielding sequential compactness of (X,d). □
PROBLEMS
6.1 Show that if {x_k} ⊆ (ℝⁿ,d_e) with d(x_k,0) ≤ 3, then {x_k} has a convergent subsequence.
6.2 Define ∀A,B ⊆ (X,d), d(A,B) = inf{d(a,b) : a ∈ A, b ∈ B}. Let A be compact. Show that ∀B ⊆ X, there is an x ∈ A such that d(x,B) = d(A,B). [Hint: Use the fact that A is sequentially compact.]
6.3 Let A,B ⊆ (X,d) be such that A is compact and B is closed. If A ∩ B = ∅, show that d(A,B) > 0.
6.4 Let A ⊆ (X,d). Show that if A is totally bounded then Ā is also totally bounded.
6.5 Generalize Theorem 6.6: Any Lindelöf metric space is separable.
6.6 Show that sequential compactness of a subspace implies its completeness.
6.7 Prove Theorem 6.2.
6.8 Prove Theorem 6.3.
NEW TERMS: cover, subcover, open cover, open subcover, compact set, compact metric space, Lindelöf set, Lindelöf space, compactness criteria, Bolzano-Weierstrass compactness, sequential compactness, separable metric space, Heine-Borel Theorem, compact set under a continuous function, uniform continuity criterion in compact space
7. LINEAR AND NORMED LINEAR SPACES
We have already mentioned that the Euclidean metric defines the length of a vector in n-dimensional Euclidean vector (linear) space. The following generalizes the notion of vector length in a linear space and reconciles it with the notion of a special metric defined on a linear space (initially discussed in Section 5).
7.1 Definition. Let (X,d) be a metric space such that X is a linear space over 𝔽, where 𝔽 is ℝ or ℂ. The metric d is said to be:
a) translation invariant if for all a, x, y ∈ X, d(x + a, y + a) = d(x,y).
b) homothetic if for all a ∈ 𝔽 and x, y ∈ X, d(ax,ay) = |a| d(x,y).
If d is translation invariant and homothetic we will abbreviate it by TIH.
If d is a metric on a linear space X, then we are able to measure the length of vectors, and thus compare them, by taking the distance from any point x ∈ X to one fixed point of X, the origin. If, in addition, d is TIH then we can use the properties of X as a linear space and, in some particular cases, employ even the geometry, thereby replicating the Euclidean space and preserving the generality needed in applications.
7.2 Definition. Let d be a TIH metric on a linear space X, with the origin 0, over 𝔽 (assuming that 𝔽 is ℝ or ℂ). Then for all x ∈ X, we call the distance d(x,0) the norm of the vector x and denote it by ‖x‖. We will also call ‖·‖ the norm on X induced by the TIH metric d. The pair (X, ‖·‖) is called a normed linear space (NLS). □
7.3 Theorem. Let ‖·‖ be a norm on X in Definition 7.2. Then the following properties of ‖·‖ hold true:
(i) ‖x‖ = 0 if and only if x = 0.
(ii) ‖ax‖ = |a| ‖x‖, ∀a ∈ 𝔽, ∀x ∈ X.
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀x,y ∈ X.
Proof. Property (i) is obvious. Property (ii) follows from homothety, ‖ax‖ = d(ax, a0) = |a| d(x,0), and property (iii) from translation invariance and the triangle inequality: ‖x + y‖ = d(x + y, 0) = d(x, −y) ≤ d(x,0) + d(0,−y) = ‖x‖ + ‖y‖.
Conversely, if ‖·‖ is a real-valued nonnegative function defined on a linear space X and has properties (i-iii) of Theorem 7.3, then ‖·‖ generates a TIH metric on X by setting d(x,y) = ‖x − y‖ (show it, see Problem 7.10).
If d in Definition 7.2 is a TIH pseudometric then the function ‖·‖ is called a semi-norm and, correspondingly, the pair (X, ‖·‖) is called a semi-normed linear space (SNLS). It is easy to show that the Euclidean metric d_e on ℝⁿ is TIH. The associated norm induced by d_e is called the Euclidean norm and it will be denoted ‖·‖_e.
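The passage from a norm to a TIH metric can be spot-checked numerically. The sketch below assumes the Euclidean norm on ℝ³ and random sample vectors. It tests translation invariance and homothety of d(x,y) = ‖x − y‖ up to floating-point tolerance. It is an illustration under these assumptions, not a proof, and the function names are hypothetical.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(t * t for t in v))

def d(x, y):
    """Metric generated by the norm: d(x, y) = ||x - y||."""
    return norm([s - t for s, t in zip(x, y)])

def looks_tih(dim=3, trials=200, seed=0):
    """Spot-check translation invariance and homothety on random samples."""
    rng = random.Random(seed)

    def vec():
        return [rng.uniform(-10, 10) for _ in range(dim)]

    for _ in range(trials):
        x, y, a = vec(), vec(), vec()
        c = rng.uniform(-5, 5)
        shifted = d([s + t for s, t in zip(x, a)], [s + t for s, t in zip(y, a)])
        scaled = d([c * s for s in x], [c * s for s in y])
        # d(x + a, y + a) should equal d(x, y).
        if not math.isclose(shifted, d(x, y), rel_tol=1e-9, abs_tol=1e-9):
            return False
        # d(cx, cy) should equal |c| d(x, y).
        if not math.isclose(scaled, abs(c) * d(x, y), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True

print(looks_tih())
```

The tolerances only absorb rounding error; the exact identities are the content of Problem 7.10.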
A very important class of NLS's is introduced below.
7.4 Definition. An NLS is called a Banach space if it is complete with respect to the metric induced by the norm (or the norm induced by a TIH metric).
7.5 Examples.
(i) The NLS (ℝⁿ, ‖·‖_e) over the field ℝ with
\[ \|x\|_e = \sqrt{\sum_{i=1}^{n} x_i^2} \]
is a Banach space with the Euclidean norm (see Problem 7.1).
(ii) The NLS
\[ l^p = \Big\{ x = (x_1, x_2, \ldots) : \sum_{n=1}^{\infty} |x_n|^p < \infty \Big\}, \quad 1 \le p < \infty, \]
over the field ℂ with the norm
\[ \|x\|_p = \Big[ \sum_{n=1}^{\infty} |x_n|^p \Big]^{1/p} \]
is a Banach space. Observe that ‖·‖_p indeed defines a norm (called the l^p norm). (See Problem 7.5.) Now let {x^{(n)}} be a Cauchy sequence. Then this sequence is uniformly bounded (show it in Problem 7.6), say, by some M ∈ ℝ_+. Let x = (x_1, x_2, ...) be the pointwise limit of the sequence {x^{(n)}}. This limit exists, since each x_i is the limit of the ith-component sequence in (ℂ,d_e), which is complete. We need to show that x is an element of l^p, i.e., ‖x‖_p < ∞, and that
\[ \lim_{n\to\infty} \|x^{(n)} - x\|_p = 0 \]
(i.e., {x^{(n)}} converges to x in l^p norm). We have
\[ \Big[ \sum_{k=1}^{r} |x_k|^p \Big]^{1/p} \le \Big[ \sum_{k=1}^{r} |x_k - x_k^{(n)}|^p \Big]^{1/p} + \Big[ \sum_{k=1}^{r} |x_k^{(n)}|^p \Big]^{1/p} \]
(by Minkowski's inequality with a_k = x_k − x_k^{(n)} and b_k = x_k^{(n)})
\[ \le \Big[ \sum_{k=1}^{r} |x_k - x_k^{(n)}|^p \Big]^{1/p} + M. \]
Now, letting n → ∞, we have
\[ \Big[ \sum_{k=1}^{r} |x_k|^p \Big]^{1/p} \le M, \]
which holds for all r = 1,2,... . Hence, we have ‖x‖_p ≤ M. Show that x^{(n)} → x in l^p norm (Problem 7.7). Thus, l^p is complete and therefore is a Banach space.
(iii) Let ℱ_b(Ω) be the space of all bounded functions on Ω valued in (ℝ,d_e) or (ℂ,d_e). One can show that ℱ_b is a linear space. The norm ‖f‖_u = sup{|f(ω)| : ω ∈ Ω} is called the supremum norm. ℱ_b is a Banach space with respect to this norm (see Problem 7.4).
(iv) Consider Cⁿ_{[a,b]} as the space of all n-times continuously differentiable real-valued functions on a compact interval [a,b]. It is easily seen that Cⁿ_{[a,b]} is a linear space. We introduce the following norm in Cⁿ_{[a,b]}:
Clearly, ‖·‖_{Cⁿ} is a norm in Cⁿ_{[a,b]}. We show that Cⁿ_{[a,b]} is a Banach space under this norm. Let {f_k} be a ‖·‖_{Cⁿ}-Cauchy sequence. Then, for every ε > 0, there is a positive integer N such that ∀k,j > N,
\[ \|f_k - f_j\|_{C^n} < \varepsilon, \]
which implies
\[ \sup\{ |f_k^{(i)}(x) - f_j^{(i)}(x)| : x \in [a,b] \} < \varepsilon, \quad i = 0,1,\ldots,n. \]
Therefore, by the well-known theorem from calculus (cf. Theorem 4.2, p. 508, in Fisher [1983]), there exists a function g_i : [a,b] → ℝ to which the sequence {f_j^{(i)} : j = 1,2,...} converges uniformly, and g_i is continuous, i = 0,1,...,n. On the other hand, it holds that
\[ f_k^{(i-1)}(x) - f_k^{(i-1)}(a) = \int_{[a,x]} f_k^{(i)}(u)\,du, \quad i = 1,\ldots,n. \]
Let k → ∞ in the above equation. Since the convergence is uniform, we may interchange the limit and the integral (a more rigorous motivation is due to the Lebesgue Dominated Convergence Theorem in Chapter 6) and have
\[ g_{i-1}(x) - g_{i-1}(a) = \int_{[a,x]} g_i(u)\,du, \quad i = 1,\ldots,n. \]
Consequently, we conclude that g_{i−1} is differentiable on [a,b] and g'_{i−1}(x) = g_i(x). Thus g_0 ∈ Cⁿ_{[a,b]}, implying that ‖f_k − g_0‖_{Cⁿ} → 0 and Cⁿ_{[a,b]} is
a Banach space. □
7.6 Definitions.
(i) Let X and Y be linear spaces over a field 𝔽. A map A : X → Y is called a linear operator (with respect to 𝔽) if
\[ A(ax + by) = aA(x) + bA(y), \quad \forall a,b \in \mathbb{F},\ \forall x,y \in X. \]
(ii) A linear map f : X → 𝔽 (where X is a linear space over a field 𝔽) is called a linear functional.
(iii) Replacing a field 𝔽 in (i) and (ii) by a semifield 𝔽_+, we have the notions of a semi-linear operator and a semi-linear functional, respectively.
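The defining identity of a linear operator can be checked on a concrete instance. The sketch below assumes a 2 × 3 integer matrix acting on ℝ³ by the usual matrix-vector product and verifies A(ax + by) = aA(x) + bA(y) for one choice of a, b, x, y. The data and helper names are illustrative only, and a single check does not replace the general argument.

```python
def apply(A, x):
    """Matrix-vector product: A is a list of rows, x a list of coordinates."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def comb(a, x, b, y):
    """Linear combination a*x + b*y of two vectors of equal length."""
    return [a * s + b * t for s, t in zip(x, y)]

A = [[1, 2, 0],
     [0, -1, 3]]
x = [1, 0, 2]
y = [-1, 4, 1]
a, b = 3, -2

lhs = apply(A, comb(a, x, b, y))            # A(ax + by)
rhs = comb(a, apply(A, x), b, apply(A, y))  # aA(x) + bA(y)
print(lhs, rhs)
```

Integer entries keep the comparison exact, so equality of the two sides can be tested directly.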
PROBLEMS
7.1 Show that (ℝⁿ, ‖·‖_e) defined in Example 7.5 (i) is an NLS and then show that it is a Banach space.
7.2 Define the space l^∞ as the set of all bounded sequences x = {x_1,x_2,...} ⊆ ℂ. Show that l^∞ is an NLS with the norm defined as ‖x‖ = sup{|x_i| : i = 1,2,...}.
7.3 Define the space c ⊆ l^∞ as the subset of all convergent sequences and let c_0 ⊆ c be the set of all sequences convergent to zero. Show that c and c_0 are normed linear subspaces of l^∞ with the same norm as that in Problem 7.2.
7.4 Let ℱ_b(Ω) be the space of all bounded real-valued functions on Ω. Show that ℱ_b is a linear space. Let ‖f‖_u = sup{|f(ω)| : ω ∈ Ω} be the supremum norm defined in Example 7.5 (iii). Show that the supremum norm in ℱ_b is indeed a norm and show that ℱ_b is a Banach space with respect to this norm.
7.5 Show that ‖·‖_p in Example 7.5 (ii) is a norm.
7.6 Show that the Cauchy sequence {x^{(n)}} in Example 7.5 (ii) is uniformly bounded.
7.7 Show that the pointwise limit x of the sequence {x^{(n)}} in Example 7.5 (ii) is also an l^p-limit.
7.8 Show that the differential operator d/dx : Cⁿ_{[a,b]} → Cⁿ⁻¹_{[a,b]} is linear with respect to ℝ.
7.9 Let A be an n × m matrix. Show that A : ℝᵐ → ℝⁿ is a linear operator with respect to ℝ.
7.10 Let ‖·‖ be a real-valued nonnegative function defined on a linear space X over a field 𝔽 (which is ℝ or ℂ) and let it have properties (i-iii) of Theorem 7.3. Show that ‖·‖ generates a TIH metric on X by d(x,y) = ‖x − y‖.
NEW TERMS: translation invariant metric, homothetic metric, TIH metric, norm, normed linear space (NLS), semi-norm, semi-normed linear space (SNLS), Euclidean norm, Banach space, l^p-norm, supremum norm, Cⁿ-norm, linear operator, linear functional, semi-linear operator, semi-linear functional
Chapter 3
Elements of Point Set Topology
1. TOPOLOGICAL SPACES
In Definition 4.5, Chapter 2, we called the collection τ(d) of all open sets of a metric space (X,d) the topology induced by a metric. We recall that this collection of open sets, or topology, is closed with respect to the formation of arbitrary unions and finite intersections. We understand that the topology of a metric space carries the main information about its structural quality. For instance, equivalent metrics possess the same topology. In addition, through the topology we can establish the continuity of a function (see Theorem 4.6, Chapter 2) without need of a metric. This all leads to the idea of defining a structure on a set more general than distance, a structure that preserves convergence and continuity.
Mathematics historians are not in complete agreement about the roots of topology and about who should get full credit for being its initiator. Most consider that topology, as the theory of structures, has its basis in the work of the German mathematician Felix Hausdorff, who published his fundamental monograph, Grundzüge der Mengenlehre (Principles of Set Theory), in Leipzig, in 1914. It was "immediately" preceded by Maurice Fréchet's 1906 pioneering introduction to metric spaces. (Notice that contemporary topology has branched out into several specialized areas, such as general topology, algebraic topology, and combinatorial topology. The very topology founded by Hausdorff was what we now refer to as general topology, also called point set topology, which is deeply bound to classical analysis.) Bourbaki [1994] regarded the work of the German Bernhard Georg Riemann from 1851 to 1857 (his doctoral and habilitation theses and a paper on abelian functions) as revolutionary and qualified him as the creator of topology, since he was the first to recognize where topological ideas were needed.
In 1870, Georg Cantor (apparently inspired by Riemann's work), in connection with the representation of real-valued functions by Fourier series, was concerned with the characterization of sets on which the function's value can be altered leaving the series invariant. This yielded more advanced concepts of topological accumulation point (earlier introduced by Karl Weierstrass), derived set, closed set, connected set, dense set and others that further led to the topological big bang. The word topology was introduced for the first time in 1836 by the German Johann B. Listing, who used it as the notion of a "new analysis."
Topology has evolved further ever since. Most of the fundamental results in general topology were developed in works by the Germans Felix Hausdorff, Heinrich Hopf, and Hermann Weyl, the Russians Pavel Alexandrov and Pavel Urysohn, the Poles Stefan Banach, Kazimierz Kuratowski, and Wacław Sierpiński, the Americans Eliakim H. Moore and James Alexander, and the Bourbaki group of French mathematicians.
1.1 Definition. Let X ≠ ∅. A collection τ of subsets of X is called a topology on X or a family of open sets, if:
(i) X, ∅ ∈ τ.
(ii) {O_i : i ∈ I} ⊆ τ ⇒ ∪_{i∈I} O_i ∈ τ.
(iii) τ is ∩-stable, i.e., O_1, O_2 ∈ τ ⇒ O_1 ∩ O_2 ∈ τ.
[Observe that property (iii) implies inductively that the intersection of any finite collection of open subsets will also be open.] A carrier X endowed with a topology τ is said to be a topological space. The topological space is denoted by (X,τ). □
1.2 Examples.
(i) Let (X,d) be a metric space and let τ(d) be the topology generated by the metric d (see Definition 4.5, Chapter 2). Due to Theorem 2.5, Chapter 2, the collection of all open sets generated by the metric d contains all arbitrary unions and finite intersections of its members. Moreover, ∅ and X are also open, so that τ(d) is indeed a topology as it was defined above. For instance, the topology in ℝⁿ generated by the Euclidean metric d_e is called the usual (or standard or natural) topology and it is denoted by τ_e.
(ii) Let X be a nonempty set. Then the pair {X, ∅} = τ_0 is a trivial example of a topology. It is obviously the smallest topology on X, and it is called the indiscrete topology. Another trivial example of a topology is 𝒫(X), the collection of all subsets of X. This is the largest possible topology on X, and it is called the discrete topology.
(iii) For A ⊆ X, τ_1 = {X, ∅, A} is a topology "induced by set A."
(iv) Let X = ℝ̄ = ℝ ∪ {−∞} ∪ {+∞} be the extended real line. Let τ̄ ⊆ 𝒫(X) be the following collection of sets: O ∈ τ̄ if and only if
1) O ∩ ℝ ∈ τ_e;
2) if +∞ ∈ O or −∞ ∈ O, then there is an a ∈ ℝ such that (a,+∞] ⊆ O or an a ∈ ℝ such that [−∞,a) ⊆ O, respectively.
Then τ̄ is a topology on ℝ̄ (see Problem 1.1).
1. Topological Spaces
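(v) Let $(X,\tau)$ be a topological space and let $Y \subseteq X$. Define the system of subsets $\tau_Y = \{O \cap Y : O \in \tau\}$. We show that $\tau_Y$ is a topology on $Y$. Indeed, $Y$ and $\emptyset$ obviously belong to $\tau_Y$. Let $\{U_i : i \in I\} \subseteq \tau_Y$. Then, for each $i \in I$, there is $O_i \in \tau$ such that $O_i \cap Y = U_i \in \tau_Y$. Now $\bigcup_{i \in I} O_i \in \tau$ and therefore $Y \cap \bigcup_{i \in I} O_i \in \tau_Y$. On the other hand, due to the distributive law,
$$Y \cap \bigcup_{i \in I} O_i = \bigcup_{i \in I} (Y \cap O_i) = \bigcup_{i \in I} U_i ,$$
so that $\bigcup_{i \in I} U_i \in \tau_Y$. It can similarly be shown that $\tau_Y$ is closed with respect to the formation of all finite intersections. Therefore, $\tau_Y$ is a topology on $Y \subseteq X$, called the relative topology of $\tau$ on $Y$. The pair $(Y,\tau_Y)$ is called a subspace. In some older textbooks, the topology $\tau_Y$ is also called the trace of $Y$ in $\tau$. For instance, take the Euclidean metric space $(\mathbb{R},d_e)$ and let $Y = [0,1]$. Then the set $(\tfrac{1}{2},1]$ is open in $(Y,\tau_Y)$, since $(\tfrac{1}{2},1] = (\tfrac{1}{2},2) \cap Y$. □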
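For a finite carrier, the axioms of Definition 1.1 and the relative topology of Example 1.2 (v) can be checked mechanically. The following is only a minimal sketch under that finiteness assumption; the function names `is_topology` and `relative_topology` and the three-point carrier are illustrative choices, not notation from the text.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check Definition 1.1 for a finite carrier X and a finite family tau."""
    X = frozenset(X)
    tau = {frozenset(O) for O in tau}
    if X not in tau or frozenset() not in tau:        # axiom (i)
        return False
    # For a finite family, closure under pairwise unions and intersections
    # yields axioms (ii) and (iii) by induction.
    return all((U | V) in tau and (U & V) in tau
               for U, V in combinations(tau, 2))

def relative_topology(tau, Y):
    """tau_Y = {O ∩ Y : O in tau}, as in Example 1.2 (v)."""
    Y = frozenset(Y)
    return {frozenset(O) & Y for O in tau}

X = {1, 2, 3}
tau_0 = [set(), X]               # indiscrete topology
tau_A = [set(), {1}, X]          # topology induced by A = {1}
broken = [set(), {1}, {2}, X]    # fails (ii): the union {1, 2} is missing

print(is_topology(X, tau_0), is_topology(X, tau_A), is_topology(X, broken))

tau_Y = relative_topology(tau_A, {1, 2})
print(sorted(map(sorted, tau_Y)), is_topology({1, 2}, tau_Y))
```

The pairwise check is adequate only because the family is finite; for infinite families, axiom (ii) genuinely requires arbitrary unions.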
1.3 Definitions.
(i) Let $X$ be a nonempty set and let $\tau$ and $\tau'$ be two topologies on $X$. If $\tau \subseteq \tau'$, then we say $\tau$ is weaker (or smaller or coarser) than $\tau'$. We also say that $\tau'$ is stronger (or larger or finer) than $\tau$. As follows from Examples 1.2 (ii) and (iii), $\tau_0 \subseteq \tau_A \subseteq \mathcal{P}(X)$. The indiscrete topology is, therefore, the coarsest topology on $X$, while $\mathcal{P}(X)$ is the finest topology on $X$.
(ii) If $(X,d)$ is a metric space and $\tau(d)$ is the topology induced by the metric $d$ (also called the metric topology), then $(X,\tau(d))$ is said to be a metrizable (topological) space. Therefore, a metrizable space is a topological space with a topology that comes from some metric.
1.4 Definition. Let $(X,\tau)$ be a topological space. A subset $A \subseteq X$ is called $\tau$-closed or just closed if $A^c \in \tau$. □

As in the case of metric spaces, we can easily prove that $X$ and $\emptyset$ are closed, finite unions of closed sets are closed, and arbitrary intersections of closed sets are closed. In Definitions 1.5 below we introduce some important notions for topological spaces. It will be advantageous to support these definitions by examples immediately after the notions are introduced. To reference the examples, we assign them the letter D followed by the number of the definition.

1.5 Definitions.
(i) Let $(X,\tau)$ be a topological space. A subset $A \subseteq X$ is called a neighborhood of a point $x \in X$ if $x$ belongs to some open subset of $A$. Specifically, if $A \in \tau$ then $A$ is called an open neighborhood of $x$.
[Example D1.5(i). Let $X = \mathbb{R}$ and $\tau = \{\mathbb{R}, \emptyset, \{1\}, (3,4], \{1\} \cup (3,4]\}$. Then $\{1\}$ is an open neighborhood of 1, $[3,5]$ is a neighborhood of $3\tfrac{1}{2}$, $(-2,0)$ is not a neighborhood of $-1$, and $\mathbb{R}$ is the only neighborhood of $-1$.]
(ii) A point $x$ is called an interior point of a set $A$ if $A$ is a neighborhood of $x$. The set of all points interior to $A$ is called the interior of $A$ and is denoted by $\mathring{A}$ or by $\mathrm{Int}(A)$.
[Example D1.5(ii). In Example D1.5(i), 1 is the interior point of the set $\{1\}$. The interior of the set $A = [3,5]$ is $\mathring{A} = (3,4]$.]
(iii) The collection of all neighborhoods of a point $x \in X$ is called the neighborhood system at $x$ and it is denoted by $\mathfrak{U}_x$. An arbitrary subcollection $\mathfrak{B}_x \subseteq \mathfrak{U}_x$ is called a neighborhood base at $x$ (or a fundamental system of neighborhoods of $x$), if every neighborhood $U \in \mathfrak{U}_x$ is a superset of at least one $B \in \mathfrak{B}_x$. Any element $B \in \mathfrak{B}_x$ is called a base neighborhood. Clearly, $\mathfrak{U}_x$ itself is a neighborhood base at $x$. Obviously, $\mathfrak{B}_x$ is a neighborhood base at $x$ if and only if there is another neighborhood base $\mathfrak{D}_x$ such that every base neighborhood $D_x \in \mathfrak{D}_x$ is a superset of at least one base neighborhood $B$ from $\mathfrak{B}_x$.
[Example D1.5(iii). Let $\{B(x,\tfrac{1}{n}) : n = 1,2,\ldots\}$ be the sequence of open balls centered at a point $x \in \mathbb{R}^n$. Clearly, it is a fundamental system of neighborhoods of $x$. Another neighborhood base at $x$, which contains the above neighborhood base, is the system of all open balls with rational radii, centered at $x$. We can also take the system of all open balls with positive real radii, centered at $x$. This system contains the first two neighborhood bases.]

A neighborhood base $\mathfrak{B}_x$ at $x$ is in general a more "economical system" of neighborhoods than the whole neighborhood system $\mathfrak{U}_x$; and, as will be shown, it is as informative about the structure of the space in the vicinity of $x$ as $\mathfrak{U}_x$ is. Technically, in various proofs it is of greater advantage to use a base neighborhood than an arbitrary neighborhood. As follows from the definition, an arbitrary set $A$ need not be a neighborhood of all of its points. For instance, $[0,1]$ is not a neighborhood of the points 0 and 1 in the usual topology $(\mathbb{R},\tau_e)$. More about the nature of neighborhoods is contained in the following propositions that the reader can easily verify.
1.6 Proposition. $A \subseteq X$ is a neighborhood of all of its points if and only if $A$ is open. □
(See Problem 1.4.)

1.7 Proposition. $\mathring{A}$ is the largest open set contained in $A$. □
(See Problem 1.5.)

In particular, it follows that $A$ is open if and only if $A = \mathring{A}$.
1.8 Definitions.
(i) $x \in X$ is called a closure point for a set $A$ if any neighborhood of $x$ has a nonempty intersection with $A$. We also say that any neighborhood of $x$ meets $A$. The set of all closure points of $A$ is called the closure of $A$ and it is denoted by $\bar{A}$. [Sometimes, when working with relative topologies, it is necessary to emphasize that the closure of $A$ is taken with respect to the carrier $X$; it is then advisable to use the notation $\mathrm{Cl}_X A$. However, for brevity we shall still use the notation $\bar{A}$ whenever $X$ is the only carrier under consideration.]
[Example D1.8(i). In the topology introduced in Example D1.5(i), let us take $A = (-2,0)$. Then we have
$$\bar{A} = \mathbb{R} \setminus (\{1\} \cup (3,4]) = (-\infty,1) \cup (1,3] \cup (4,\infty),$$
while $\mathring{A} = \emptyset$. Indeed, for any $x \in (-\infty,1)$ (and likewise for any $x \in (1,3] \cup (4,\infty)$), $\mathbb{R}$ is the only neighborhood of $x$; thus $\mathbb{R} \cap (-2,0) \neq \emptyset$. Observe that 1 is not a closure point of $A$, since $\{1\}$ is a neighborhood (of 1) such that $\{1\} \cap A = \emptyset$; similarly, $(3,4]$ is a neighborhood of each of its points that does not meet $A$. For the set $B = \{-1\}$ we have
$$\bar{B} = (-\infty,1) \cup (1,3] \cup (4,\infty).]$$
(ii) A subset $A \subseteq X$ is said to be dense in $X$ if $\bar{A} = X$. $A \subseteq X$ is said to be nowhere dense if $\mathrm{Int}(\bar{A}) = \emptyset$.
[Example D1.8(ii). Consider Example D1.8(i). For $A = (-2,0)$,
$$\bar{A} = (-\infty,1) \cup (1,3] \cup (4,\infty),$$
while $\mathrm{Int}(\bar{A}) = \emptyset$, i.e. $A$ is nowhere dense. The set $C = \{-1\} \cup \{1\} \cup (3,4]$ is dense in $X$.]
(iii) A point $x \in X$ is called an accumulation point (or cluster point) of a set $A$ if every neighborhood of $x$ contains at least one point of $A$ other than $x$. The set of all accumulation points is called the derived set and is denoted by $A'$.
[Example D1.8(iii). In Example D1.8(i), $A' = \bar{A}$.]
(iv) A point $x \in X$ is called a boundary point of a set $A$ if every neighborhood of $x$ contains at least one point of $A$ and at least one point of $A^c$. The set of all boundary points of $A$ is denoted by $\partial A$ and called the boundary of $A$.
[Example D1.8(iv). In Example D1.8(i), $\mathring{A} = \emptyset$ and $\partial A = \bar{A}$.]
(The closure of $A$ is evidently the smallest closed set containing $A$; and $A$ is closed if and only if $A = \bar{A}$. See Problem 1.6.)
(v) A topological space $(X,\tau)$ is called separable if there exists an at most countable, dense subset of $X$.
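The notions of Definitions 1.5 and 1.8 can be explored on a finite analogue of Example D1.5(i). The sketch below is an illustration under assumptions of our own: the carrier $\{0,1,2,3,4\}$ stands in for $\mathbb{R}$, the open points 1 and 3 play the roles of $\{1\}$ and $(3,4]$, and the helper names are hypothetical. Only open neighborhoods are tested, which suffices because every neighborhood contains an open one.

```python
def interior(A, tau):
    """Union of all open sets contained in A (compare Proposition 1.7)."""
    A = frozenset(A)
    result = frozenset()
    for O in tau:
        if frozenset(O) <= A:
            result |= frozenset(O)
    return result

def closure(A, X, tau):
    """Points whose every open neighborhood meets A (Definition 1.8 (i))."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(frozenset(O) & A for O in tau if x in O))

def derived_set(A, X, tau):
    """Accumulation points of A (Definition 1.8 (iii))."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all((frozenset(O) & A) - {x} for O in tau if x in O))

def boundary(A, X, tau):
    """Points whose every neighborhood meets both A and its complement."""
    complement = frozenset(X) - frozenset(A)
    return closure(A, X, tau) & closure(complement, X, tau)

# Finite stand-in for Example D1.5(i): 1 and 3 are the only "small" open sets.
X = {0, 1, 2, 3, 4}
tau = [set(), {1}, {3}, {1, 3}, X]
A = {0, 2}            # plays the role of (-2, 0)
C = {0, 1, 3}         # plays the role of {-1} ∪ {1} ∪ (3, 4]

print(sorted(closure(A, X, tau)), sorted(interior(A, tau)))
print(sorted(derived_set(A, X, tau)), sorted(boundary(A, X, tau)))
print(closure(C, X, tau) == frozenset(X))   # is C dense?
```

In this toy space the computed sets mirror Examples D1.8(i)-(iv): the interior of `A` is empty, while its closure, derived set, and boundary coincide.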
PROBLEMS
1.1 Show that the collection $\bar{\tau}$ of sets introduced in Example 1.2 (iv) is a topology in $\bar{\mathbb{R}}$.
1.2 Let $X$ be a nonempty set and $\tau = \{X, \emptyset, C^c : C \subseteq X \text{ and } C \text{ is finite}\}$. Show that $\tau$ is a topology on $X$. $\tau$ is called the cofinite (or finite complement) topology on $X$.
1.3 Let $X = \mathbb{R}$ and let $\tau = \{X, \emptyset, (-\infty,1], [1,\infty), (3,10]\}$. Is $\tau$ a topology on $\mathbb{R}$? If not, supplement $\tau$ by some subsets to a topology (and be reasonable).
1.4 Prove Proposition 1.6.
1.5 Prove Proposition 1.7. [Hint: Show that $\mathring{A}$ contains all open sets that are contained in $A$ and use Proposition 1.6.]
1.6 Show that the closure of $A$ is the smallest closed set containing $A$; and $A$ is closed if and only if $A = \bar{A}$.
1.7 Show that (a) $A \subseteq B \Rightarrow \bar{A} \subseteq \bar{B}$, (b) $\overline{A \cup B} = \bar{A} \cup \bar{B}$, (c) $\overline{A \cap B} \subseteq \bar{A} \cap \bar{B}$ and $\mathrm{int}(A \cap B) = \mathring{A} \cap \mathring{B}$. Is $\mathrm{int}\,\ldots = \ldots$?
1.8 Show that $\bar{A} = A \cup \partial A$.
1.9 For $X$ being an infinite set, define $\tau = \{X, \emptyset, C^c : C \text{ is at most countable}\}$. Show that $\tau$ is a topology on $X$. We call such a topology cocountable (or the countable complement topology).
1.10 Show that $\bar{A} = \mathring{A} + \partial A$. [Hint: Proceed in the same way as in Problem 3.2, Chapter 2, and work with a neighborhood instead of a ball.]
1.11 Prove that a subset of a topological space is closed if and only if it contains all of its accumulation points.
1.12 Let $\tilde{\tau} = \{\mathbb{R}, (-1,1], [0,5), \{0\}, \{10\}\}$.
a) Extend $\tilde{\tau}$ to the smallest topology $\tau$ in $\mathbb{R}$ generated by $\tilde{\tau}$.
b) Let $A = (-7,-5]$, $B = (0,7]$, and $C = [-k,20)$. Find the sets $\bar{A}, \bar{B}, \bar{C}, \mathring{A}, \mathring{B}, \mathring{C}, A', B', C', \partial A, \partial B$, and $\partial C$. Determine whether $A$, $B$ and $C$ are dense in $\mathbb{R}$.
1.13 Show that $\partial A = \emptyset$ if and only if $A$ is open and closed.
1.14 Show that (2)' C p. 1.15
Show that the inverse inclusion in the previous problem holds if and only if A is closed and open.
1.16 This provides an equivalent definition of a closure point. Show that $x \in \bar{A}$ if and only if $\forall U_x \in \mathfrak{U}_x$, $U_x \cap \bar{A} \neq \emptyset$.
NEW TERMS: topology; open sets; $\cap$-stable family of sets; topological space; usual topology; standard (natural) topology; indiscrete topology; discrete topology; topology induced by a set; topology on the extended real line; relative topology (subspace); subspace; trace of a set in a topology; weaker (coarser, smaller) topology; coarser topology; stronger (finer, larger) topology; finer topology; metric topology; metrizable topological space; closed set; neighborhood of a point; open neighborhood of a point; interior point; interior of a set; neighborhood system at a point; neighborhood base at a point; fundamental system of neighborhoods at a point; base neighborhood; closure point for a set; neighborhood of a point that meets a set; closure of a set; dense set; nowhere dense set; accumulation (cluster) point; cluster (accumulation) point; derived set; boundary point of a set; boundary of a set; separable topological space; cofinite (finite complement) topology; cocountable (countable complement) topology
2. BASES AND SUBBASES FOR TOPOLOGICAL SPACES

In the previous section, we introduced the notion of a collection of open sets, called a topology. In many applications, describing an entire topology on a carrier is difficult and sometimes even impossible. This predicament is manageable if one deals instead with a sort of "pre-topology," a smaller collection of sets, which is not a topology, but which generates a topology and thereby can be extended to a topology. It was with a similar idea that we introduced neighborhood bases. Take, for example, a metric space. While the family of all open balls does not form a topology, every open set, as we know, is the union of some subcollection of open balls; consequently, this family leads to a topology and gives rise to the notion of a base for a topology.
2.1 Definition. Let $(X,\tau)$ be a topological space. A subcollection $\mathfrak{B}$ of open sets is called a base for $\tau$ if every open set is a union of some elements of $\mathfrak{B}$. (Specifically, it follows that $\emptyset$ must be an element of $\mathfrak{B}$.) The elements of $\mathfrak{B}$ are called base sets. □

With no major difficulty (and with hints provided), the reader can establish a very useful criterion for a base for $\tau$, which is the subject of Problem 2.2. An important relation between bases and neighborhood bases is given in the following theorem.

2.2 Theorem. $\mathfrak{B}$ is a base for $\tau$ if and only if $\emptyset \in \mathfrak{B}$ and, for every point $x \in X$, there is a neighborhood base $\mathfrak{B}_x$ consisting of open sets such that $\mathfrak{B}_x \subseteq \mathfrak{B}$.

Proof. We have to show that $\mathfrak{B}$ is a base for $\tau$ if and only if, for every $x \in X$ and each neighborhood $U_x$ of $x$, there is a base neighborhood $B_x \in \mathfrak{B}$ such that $B_x \subseteq U_x$.
(i) Let $\mathfrak{B}$ be a base for $\tau$ and let $U_x$ be a neighborhood of a point $x \in X$. Without loss of generality we assume that $U_x$ is open. (Otherwise, take any open neighborhood $O_x \subseteq U_x$ of $x$ and work with $O_x$ instead.) If $U_x$ is open, there exists a subcollection of $\mathfrak{B}$ whose union equals $U_x$. Thus, at least one set of this subcollection, say $B_x$ $(\in \mathfrak{B})$, must contain $x$, and $B_x \subseteq U_x$. Observe that by Definition 1.5 (iii), $B_x$ is then an element of a neighborhood base, and the collection $\mathfrak{B}_x = \{B_x\}$ of all such sets (as $U_x$ runs over the neighborhoods of $x$) forms a neighborhood base of $x$. Therefore, for each neighborhood system $\mathfrak{U}_x$ of $x$ there is at least one neighborhood base $\mathfrak{B}_x$ of $x$ such that $\mathfrak{B}_x \subseteq \mathfrak{B}$ and each $U_x \in \mathfrak{U}_x$ is a superset of at least one $B_x \in \mathfrak{B}_x$.
(ii) Let $\mathfrak{B} \subseteq \tau$ and assume that for every $x \in X$, there is a neighborhood base $\mathfrak{B}_x \subseteq \mathfrak{B}$. Let $O$ be an arbitrary open set. Then, by our assumption and by the definition of a neighborhood base, for any point $x \in O$ (since $O$ is a neighborhood of $x$), there is a base neighborhood $B_x \in \mathfrak{B}_x$ such that $B_x \subseteq O$. Thus
$$O = \bigcup_{x \in O} B_x$$
(union of all such $B_x \in \mathfrak{B}$). Hence, every open set $O \in \tau$ can be composed of a union of some elements of $\mathfrak{B}$, or equivalently, $\mathfrak{B}$ is a base for $\tau$. □
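On a finite space, both sides of the equivalence in Theorem 2.2 can be tested directly. The sketch below is illustrative only: `is_base` follows Definition 2.1, `has_neighborhood_bases` follows the pointwise condition used in the proof, and both names, like the three-point carrier, are assumptions of this example. One deliberate simplification: the empty set is treated as an empty union, so the code does not insist that $\emptyset$ belongs to the base, as the text does.

```python
from itertools import combinations

def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1)
            for c in combinations(X, r)]

def is_base(base, tau):
    """Definition 2.1: every open set is a union of base sets."""
    base = {frozenset(B) for B in base}
    tau = {frozenset(O) for O in tau}
    if not base <= tau:                       # base sets must be open
        return False
    for O in tau:
        # Union of all base sets contained in O (empty union gives ∅).
        covered = frozenset().union(*(B for B in base if B <= O))
        if covered != O:
            return False
    return True

def has_neighborhood_bases(base, X, tau):
    """Pointwise condition: x in O open implies x in B ⊆ O for some base set B."""
    base = [frozenset(B) for B in base]
    return all(any(x in B and B <= frozenset(O) for B in base)
               for x in X for O in tau if x in O)

X = {1, 2, 3}
discrete = powerset(X)
singletons = [set(), {1}, {2}, {3}]
too_coarse = [set(), {1, 2}, {3}]

for candidate in (singletons, too_coarse):
    print(is_base(candidate, discrete),
          has_neighborhood_bases(candidate, X, discrete))
```

The two predicates agree on both candidates, which is what the theorem predicts for families of open sets.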
2.3 Examples.
(i) Let $\{\mathfrak{B}_x : x \in X\}$ be an arbitrary collection of open neighborhood bases at all points. Then $\bigcup_{x \in X} \mathfrak{B}_x$ can be regarded as an example of a base for $\tau$. Indeed, as in Theorem 2.2, take a point $x$ of any open set $O$. Then, $O$ is a neighborhood of $x$ and thus it belongs to the neighborhood system at $x$. By the definition, the neighborhood base $\mathfrak{B}_x \in \{\mathfrak{B}_x : x \in X\}$ is such that there is at least one base neighborhood $B_x$ of $x$ included in $O$. Collecting all such neighborhoods of all points of $O$, we can represent $O$ as the union $\bigcup_{x \in O} B_x$. Hence, $\bigcup_{x \in X} \mathfrak{B}_x$ is a base for the topology $\tau$.
(ii) As mentioned at the beginning of this section, in any metric space $(X,d)$, the collection of all open balls is a trivial example of a base for the corresponding metrizable topological space. Indeed, by Definition 2.3, Chapter 2, for each open neighborhood $O_x$ of $x \in X$, there exists an open ball $B(x,\varepsilon) \subseteq O_x$. Earlier (in Example D1.5 (iii)), we showed that $B(x,r)$ is a base neighborhood at $x$. Thus by Theorem 2.2, the system $\{B(x,\varepsilon) : x \in X, \varepsilon > 0\}$ is a base for $\tau(d)$. As in Example D1.5(iii), a neighborhood base at $x$ can be reduced to the system $\mathfrak{B}_x = \{B(x,q) : q \in \mathbb{Q}, q > 0\}$ of all balls with rational radii. Consequently, by Theorem 2.2, the collection of all open balls with rational radii is a base for $\tau(d)$. [Note that these balls are centered at all $x \in X$, so this base need not be countable.]
(iii) We give a rather informal definition of an open parallelepiped in $(\mathbb{R}^n, \tau_e)$. More formalism is brought in Section 5. A set
$$P = O^{(1)} \times \cdots \times O^{(n)}$$
is called an open parallelepiped (or rectangle) in $\mathbb{R}^n$ if each $O^{(i)}$ is an open set in $\mathbb{R}$. An open parallelepiped is said to be base (or simple) if each $O^{(i)}$ is an open interval. Let $\mathcal{P}$ be the system of all base parallelepipeds in $(\mathbb{R}^n,\tau_e)$ along with the empty set $\emptyset$. Let $x \in \mathbb{R}^n$ and let $O_x$ be any open neighborhood of $x$. Then, there is an open ball $B(x,r) \subseteq O_x$. On the other hand, there obviously is a base parallelepiped $P_x$ "centered" at $x$ that can be inscribed into this ball, and this implies that $P_x \subseteq O_x$. Therefore, the system $\mathcal{P}_x$ of all open base parallelepipeds centered at $x$ is a neighborhood base at $x$; and again by Theorem 2.2, $\mathcal{P} = \{\emptyset\} \cup \bigcup_{x \in \mathbb{R}^n} \mathcal{P}_x$ is a base for $(\mathbb{R}^n,\tau_e)$. Observe that the system of all "rational" parallelepipeds (i.e. those base ones with rational coordinates) is also a base for $(\mathbb{R}^n,\tau_e)$.
(iv) The collection of all singletons $\{x\} \in \mathcal{P}(X)$, along with $\emptyset$, is a base for the discrete topology on $X$. □
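2.4 Remarks.
(i) Let $\tau_1$ and $\tau_2$ be two topologies on $X$ and let $\mathfrak{B}_1$ be a base for $\tau_1$. If $\mathfrak{B}_1 \subseteq \tau_2$ then $\tau_1 \subseteq \tau_2$. [Observe that $\mathfrak{B}_1$ need not be a base for $\tau_2$.] Indeed, by the definition of a base, each $O^1 \in \tau_1$ can be represented as $O^1 = \bigcup_i B_i$ with $B_i \in \mathfrak{B}_1$. However, $B_i \in \tau_2$ implies that $\bigcup_i B_i = O^1 \in \tau_2$.
(ii) Let $\tau_1$ and $\tau_2$ be two topologies on $X$ with a common base $\mathfrak{B}$. Then, by (i), $\tau_1 \subseteq \tau_2$ and $\tau_2 \subseteq \tau_1$, and thus $\tau_1 = \tau_2$. In other words, a base uniquely defines a topology. Note that although one topology may have different bases, a base cannot be shared by different topologies.
(iii) Let $\tau_1 \subseteq \tau_2$ and let $\mathfrak{B}_2$ be a base for $\tau_2$. It does not follow that $\mathfrak{B}_2$ is a base for $\tau_1$. In fact, $\mathfrak{B}_2$ need not even be a subcollection of $\tau_1$. However, if in addition $\mathfrak{B}_2 \subseteq \tau_1$, then by (i), $\tau_2 \subseteq \tau_1$; together with $\tau_1 \subseteq \tau_2$ this implies that $\tau_1 = \tau_2$. □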
In the construction of a topology on a carrier, it is often very helpful to start with a collection yet smaller and more rudimentary than a base. Such collections prove even more rewarding in the formation of product topologies and in quick and convenient continuity criteria for functions. Recall that a function $f$ between two metric spaces $X$ and $Y$ is continuous if and only if inverse images under $f$ of open sets in $Y$ are open in $X$. Remarkably, continuity of $f$ can be verified on a (frequently) much smaller collection of subbase sets in $Y$. This will be established and elaborated in Section 4 for topological spaces. We begin with the following:

2.5 Definition. Let $\mathcal{S} \subseteq \mathcal{P}(X)$ be such that
$$\bigcup_{A \in \mathcal{S}} A = X.$$
If there exists the weakest topology containing $\mathcal{S}$, then it is called the topology generated by $\mathcal{S}$, and the collection $\mathcal{S}$ is called a subbase on $X$. [Note that $\mathcal{S}$ can directly restore only $X$, while a base $\mathfrak{B}$ restores all open sets, including $\emptyset$. Clearly, a base $\mathfrak{B}$ for a topology $\tau$, besides $\tau$ itself, offers a trivial example of a subbase on $X$.]

To justify Definition 2.5 we need:

2.6 Proposition. The weakest topology generated by a subbase exists.
Proof. Clearly, there exists a topology containing $\mathcal{S}$ (for instance, $\mathcal{P}(X)$). Then define $\tau(\mathcal{S})$ as the intersection of all topologies containing $\mathcal{S}$. We show that $\tau(\mathcal{S})$ is a topology on $X$.
(i) $X$ and $\emptyset$ belong to all topologies containing $\mathcal{S}$. Therefore $X$ and $\emptyset \in \tau(\mathcal{S})$.
(ii) Let $O_1, O_2, \ldots, O_n \in \tau(\mathcal{S})$. Then $O_1, O_2, \ldots, O_n$ are elements of every topology containing $\mathcal{S}$. This implies that $\bigcap_{k=1}^{n} O_k$ belongs to all topologies containing $\mathcal{S}$, and thus it belongs to $\tau(\mathcal{S})$.
(iii) By similar arguments, $\tau(\mathcal{S})$ is closed relative to the formation of arbitrary unions.
Obviously, $\tau(\mathcal{S})$ is the weakest topology containing $\mathcal{S}$. □

The following theorem shows that the weakest topology $\tau(\mathcal{S})$ over a collection $\mathcal{S}$ of "primitive" sets, or a subbase, can also be obtained constructively, by extending this collection to one closed with respect to the formation of finite intersections and arbitrary unions. [It seems plausible to supplement $\mathcal{S}$ by $X$, $\emptyset$, and all unions and finite intersections of elements of $\mathcal{S}$.] In addition, the theorem shows that the extension of a subbase to a $\cap$-stable supercollection makes a base for the weakest topology $\tau(\mathcal{S})$.

2.7 Theorem. Let $\mathcal{S}$ be an arbitrary subcollection of $\mathcal{P}(X)$ with
$$\bigcup_{A \in \mathcal{S}} A = X,$$
and let $\mathfrak{B}$ be the collection that consists of $\emptyset$ and all finite intersections of elements of $\mathcal{S}$. Then $\mathfrak{B}$ is a base for $\tau(\mathcal{S})$.
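Proof. Let
$$\tau' = \Big\{ \bigcup_{i \in I} B_i : B_i \in \mathfrak{B} \Big\},$$
where $\mathfrak{B}$ is defined in the condition of this statement. We show that $\tau'$ is a topology on $X$. It is sufficient to show that $\tau'$ contains all finite intersections; the other properties of $\tau'$ as a topology are obvious. Also, for brevity in notation, we show this for the case of the intersection of two open sets. Let $U$ and $V$ be two elements of $\tau'$. By the definition of $\mathfrak{B}$,
$$U = \bigcup_{i \in I} U_i \quad \text{and} \quad V = \bigcup_{j \in J} V_j \qquad (U_i, V_j \in \mathfrak{B}),$$
where
$$U_i = \bigcap_{k=1}^{n_i} S_k^i \quad \text{and} \quad V_j = \bigcap_{s=1}^{m_j} S_s^j \qquad (S_k^i, S_s^j \in \mathcal{S}).$$
Then
$$U \cap V = \bigcup_{i \in I} \bigcup_{j \in J} (U_i \cap V_j) = \bigcup_{i \in I} \bigcup_{j \in J} \Big( \bigcap_{k=1}^{n_i} S_k^i \cap \bigcap_{s=1}^{m_j} S_s^j \Big) \in \tau',$$
since each $U_i \cap V_j$ is again a finite intersection of elements of $\mathcal{S}$. Now, since obviously $\mathfrak{B}$ is a base for $\tau'$ and $\mathfrak{B} \subseteq \tau(\mathcal{S}) \subseteq \tau'$, by Remark 2.4 (iii), identifying $\tau(\mathcal{S})$ as $\tau_1$, $\tau'$ as $\tau_2$, and $\mathfrak{B}$ as $\mathfrak{B}_2$, we have $\tau(\mathcal{S}) = \tau'$. In particular, we see that $\mathfrak{B}$ is a base for $\tau(\mathcal{S})$. □

2.8 Examples.
(i) In Example 2.3 (iii), it was shown that the system $\mathcal{P}$ of all base parallelepipeds is a base for $(\mathbb{R}^n,\tau_e)$. On the other hand, it is easily seen that $\mathcal{P}$ is closed relative to the formation of all finite intersections (recall that $\emptyset$ is also in $\mathcal{P}$). Thus, $\mathcal{P}$ is a base for $\tau(\mathcal{P})$, according to Theorem 2.7. Furthermore, $\mathcal{P}$ is a base for $\tau_e$. Thus, by Remark 2.4 (ii), $\tau_e$ and $\tau(\mathcal{P})$ coincide. In other words, the natural topology $\tau_e$ on $\mathbb{R}^n$ is generated by the system of all base parallelepipeds. In another situation, we can take for $\mathcal{S}$ the system of all open parallelepipeds with rational coordinates, which is certainly closed relative to all finite intersections. Then, $\tau_e$ would also be generated by the system of all rational parallelepipeds. [Recall that the metric $d_e$ and the supremum metric are equivalent in $\mathbb{R}^n$. No wonder that $\tau_e$ and $\tau(\mathcal{P})$ coincide.]
(ii) In another scenario of $(\mathbb{R}^n,\tau_e)$, the collection of open parallelepipeds of types $\pi_i^{*}((a_i,b_i)) = \mathbb{R} \times \cdots \times \mathbb{R} \times (a_i,b_i) \times \mathbb{R} \times \cdots \times \mathbb{R}$, where the $(a_i,b_i)$'s are open intervals in $\mathbb{R}$, $i = 1,\ldots,n$, forms a subbase for $\tau_e$. [Note that none of the $\pi_i^{*}((a_i,b_i))$ is a base parallelepiped.] This collection can be extended to a base $\mathfrak{B}$ for $\tau_e$ by including in $\mathfrak{B}$ the empty set $\emptyset$ and all finite intersections of the subbase parallelepipeds. The base $\mathfrak{B}$ evidently contains $\mathcal{P}$ (why?). □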
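The two-step construction of Theorem 2.7 (finite intersections first, then arbitrary unions) can be carried out explicitly when the carrier is finite. The following sketch assumes a three-point carrier and a two-element subbase chosen purely for illustration; `base_from_subbase` and `topology_from_base` are hypothetical helper names, and the brute-force enumeration is exponential, so it is only meant for very small families.

```python
from itertools import combinations

def base_from_subbase(S):
    """The empty set together with all finite intersections of elements of S."""
    S = [frozenset(A) for A in S]
    base = {frozenset()}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            base.add(frozenset.intersection(*combo))
    return base

def topology_from_base(base):
    """All unions of subfamilies of the base (the family tau' of the proof)."""
    base = list(base)
    tau = set()
    for r in range(len(base) + 1):
        for combo in combinations(base, r):
            tau.add(frozenset().union(*combo))
    return tau

# Subbase on X = {1, 2, 3}: the two sets cover X, as Definition 2.5 requires.
S = [{1, 2}, {2, 3}]
B = base_from_subbase(S)
tau = topology_from_base(B)

print(sorted(map(sorted, B)))
print(sorted(map(sorted, tau)))
```

Here the intersection $\{2\}$ enters the base although it is not in the subbase, and the resulting family of unions is the generated topology $\tau(\mathcal{S})$ for this toy example.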
PROBLEMS
2.1 Let $(X,\tau)$ be a topological space and let $\mathfrak{B} \subseteq \tau$. Show that $\mathfrak{B}$ is a base for $\tau$ if and only if for every open set $O \in \tau$ and each point $x \in O$, there is a subset $U$ of $O$ such that $x \in U \in \mathfrak{B}$.
2.2 Show that $\mathfrak{B} \subseteq \mathcal{P}(X)$ is a base for a topology on $X$ if and only if
(i) each $x \in X$ belongs to at least one set $B \in \mathfrak{B}$ (or equivalently, $X = \bigcup_{B \in \mathfrak{B}} B$), and
(ii) $\forall B_1, B_2 \in \mathfrak{B}$ and $\forall x \in B_1 \cap B_2$, $\exists B \in \mathfrak{B}$ such that $x \in B \subseteq B_1 \cap B_2$.
[Hint: Use the steps that follow. 1) If $\mathfrak{B}$ is a base, then apply Theorem 2.2 to verify (ii). 2) Let
$$\tau = \Big\{ \bigcup_{B \in \mathfrak{B}'} B : \mathfrak{B}' \subseteq \mathfrak{B} \Big\}.$$
Show that $\tau$ is a topology on $X$ and that $\mathfrak{B}$ is a base for $\tau$.]
2.3 Let $\mathfrak{B}$ be a base for a topology $\tau$ on $X$. Since $\mathfrak{B}$, in particular, is a subbase on $X$, it also generates the weakest topology $\tau(\mathfrak{B})$ and hence $\tau(\mathfrak{B}) \subseteq \tau$. Is $\tau(\mathfrak{B}) = \tau$?
2.4 Let $\tau_l$ denote the topology on the real line generated by all semi-open intervals of type $[a,b)$ where $a,b \in \mathbb{R}$. This topology is called the lower limit topology. Show that $\{[a,b) : a,b \in \mathbb{R}\}$ is a base for $\tau_l$ and that $\tau_l$ is strictly finer than $\tau_e$, the usual topology on the real line.
2.5 Let $\mathfrak{B} = \{[a,b) : a,b \in \mathbb{Q}\}$. Show that $\mathfrak{B}$ is a base for the topology $\tau$ that $\mathfrak{B}$ generates and that $\tau$ is strictly coarser than the lower limit topology $\tau_l$ of Problem 2.4.
2.6 Show that the collection of all sets on the real line of types $(a,\infty)$ and $(-\infty,b)$ is a subbase for the usual topology $(\mathbb{R},\tau_e)$.
2.7 Show that any base and subbase parallelepipeds in Example 2.3 (iii) and Example 2.8 (ii), respectively, are open sets.
NEW TERMS: pre-topology; base for a topology; base sets; base for a topology, criterion for; open parallelepiped (rectangle); rectangle; base (simple) parallelepiped (rectangle); simple parallelepiped; rational parallelepiped; subbase; topology generated by a subbase; base, a construction of; subbase parallelepiped; lower limit topology
3. CONVERGENCE OF SEQUENCES IN TOPOLOGICAL SPACES AND COUNTABILITY

Convergence of sequences introduced in this section generalizes that of Section 3, Chapter 2, for metric spaces, and it is preparatory for the more general type of convergence of nets and filters to be treated in Section 9.
3.1 Definition. Let $\{x_n : n = 1,2,\ldots\} \subseteq (X,\tau)$ be a sequence and let $A$ be a set. A subsequence $Q_N = \{x_n : n = N, N+1, \ldots\}$ is called an $N(A)$-tail of $\{x_n\}$ for some $N \geq 1$ if $Q_N \subseteq A$. A sequence $\{x_n : n = 1,2,\ldots\} \subseteq X$ is said to converge to a point $x \in X$ if for every neighborhood $U_x$ of $x$, there is an $N(U_x)$-tail of $\{x_n\}$. The point $x$ is said to be a limit point of the sequence. A point $x$ is said to be a limit point of a set $A$ if $x$ is a limit point of some sequence $\{x_n\} \subseteq A$.

Unlike in metric spaces, a sequence in a topological space can have more than one limit, as the following example shows.

3.2 Example. Let $X = \mathbb{R}$, let $\tau = \{\mathbb{R}, \emptyset, (-2,3], [-1,2]\}$ and let $x_n = \tfrac{1}{n}$, $n = 1,2,\ldots$. Then, $\{\tfrac{1}{n}\}$ converges to all points of the set $[-1,2]$, since for each point $x \in [-1,2]$, its open neighborhoods are $\mathbb{R}$, $(-2,3]$, and $[-1,2]$, each one of which contains the whole sequence.

In most applications we will deal with topological spaces in which every convergent sequence has exactly one limit. An important representative of this class is introduced in the definition below.
3.3 Definition. A topological space $(X,\tau)$ is said to be Hausdorff (or separated or $T_2$) if every two distinct points $x, y \in X$ possess disjoint neighborhoods.

$T_2$ is often referred to as the second separation axiom. Other separation axioms will be introduced and discussed in Section 10. As was mentioned, the following proposition (which will hardly be a challenge for the reader) states a basic property of Hausdorff spaces.

3.4 Proposition. Let $(X,\tau)$ be a Hausdorff topological space, let $\lim_{n \to \infty} x_n = x$, and let $\lim_{n \to \infty} x_n = y$. Then $x = y$.
(See Problem 3.1.)

3.5 Example. Let $(X,d)$ be a metric space and let $(X,\tau(d))$ be the corresponding metrizable topological space. With $x_1$ and $x_2$ being distinct points of $X$, construct two open balls, $B(x_1,r)$ and $B(x_2,r)$, with $r = \tfrac{1}{2} d(x_1,x_2)$. It follows that the balls are neighborhoods of $x_1$ and $x_2$, respectively, and that $B(x_1,r) \cap B(x_2,r) = \emptyset$ (by the triangle inequality). This immediately implies that any metrizable topological space is Hausdorff.
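The contrast between non-unique limits (Example 3.2) and the Hausdorff property (Definition 3.3) can be seen on a finite carrier. The sketch below is a finite analogue chosen for illustration, not the real-line example of the text; it restricts attention to constant sequences, and the names `limits_of_constant_sequence` and `is_hausdorff` are hypothetical.

```python
from itertools import combinations

def limits_of_constant_sequence(p, X, tau):
    """Limits of the sequence p, p, p, ... in the sense of Definition 3.1:
    all x such that every open set containing x also contains p."""
    return {x for x in X if all(p in O for O in tau if x in O)}

def is_hausdorff(X, tau):
    """Definition 3.3: distinct points possess disjoint open neighborhoods."""
    tau = [frozenset(O) for O in tau]
    return all(
        any(x in U and y in V and not (U & V) for U in tau for V in tau)
        for x, y in combinations(sorted(X), 2)
    )

X = {1, 2, 3}
coarse = [set(), {1, 2}, {3}, X]    # 1 and 2 cannot be separated
discrete = [set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, X]

print(limits_of_constant_sequence(1, X, coarse), is_hausdorff(X, coarse))
print(limits_of_constant_sequence(1, X, discrete), is_hausdorff(X, discrete))
```

In the coarse topology the constant sequence at 1 has two limits and the space is not Hausdorff; in the discrete topology the limit is unique, in line with Proposition 3.4.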
3.6 Remarks.
(i) In metric spaces (see Corollary 3.4, Chapter 2), a point is a closure point of a set $A$ if and only if it is a limit point of $A$. This does not apply to general topological spaces. More specifically, a limit point is always a closure point, but the converse is not true. Let $x$ be a closure point of $A$. If $x \in A$, then setting $x_n = x$, we have a sequence convergent to $x$. If $x \notin A$ then, by the definition of a closure point, for each neighborhood $U_x$ of $x$, $U_x \cap A \neq \emptyset$. In this case, however, it is not clear how to choose a sequence convergent to $x$, i.e., how to ensure that for each $U_x$ there is an $N(U_x)$-tail, for we do not have the flexibility of metric spaces with balls like $B(x,\tfrac{1}{n})$ of Theorem 3.3, Chapter 2. In Remark (ii) below we will demonstrate an example of a topology where a set $A$ contains all of its limit points and yet is not closed, or, in other words, some closure points of $A$ are not its limit points. However, if $x$ is a limit point of $A$, then it is always a closure point. Indeed, if $\{x_n\} \subseteq A$ is a sequence convergent to $x$, then for every neighborhood $U_x$ of the point $x$, there is a tail $\{x_N, x_{N+1}, \ldots\}$ contained in $U_x$, and hence $U_x$ meets $A$.
(ii) Consider the cocountable topology $\tau$ on $\mathbb{R}$ introduced earlier in Problem 1.9. Take $A = (a,b)$ where $a < b$. Let $\{x_n\} \subseteq A$ be a sequence. Then, by the definition of $\tau$, the complement of $\{x_n\}$ is open (and disjoint from $\{x_n\}$). If this sequence has a limit $x \in A^c$, then this limit should belong to the open set $\{x_n\}^c$ (since $\{x_n\} \subseteq A \Rightarrow A^c \subseteq \{x_n\}^c$), which can serve as an open neighborhood of $x$. This neighborhood does not contain a single element of the sequence and, therefore, $x$ cannot be its limit; or equivalently, this sequence cannot converge to any point of $A^c$. Therefore, every limit point of $A$ belongs to $A$. However, $A$ is not closed. To see this, take the endpoint $a$ of the set $A = (a,b)$. Let $O$ be any open set of the form $\mathbb{R} \setminus$ (any sequence not containing $a$). Then $O$ is a neighborhood of $a$, any such neighborhood $O$ meets $A$, and $a$ is an accumulation point of $A$. Thus, $A$ is not closed, for otherwise, by Problem 1.11, it would contain all of its accumulation points. An alternative argument shows that the only convergent sequences in a cocountable topology are those with constant tails. In other words, any sequence $\{x_n\}$ convergent to $x$ has an $N$-tail of the form $\{x_n = x : n \geq N\}$. Indeed, the complement of $\{x_n : x_n \neq x\}$ is an open set containing $x$. Therefore, every set contains all limits of its convergent sequences, but the only closed sets are the countable ones and the carrier.
(iii) Consequently, there arises the question: Under what condition does a topological space have the property that metric spaces have, namely,
$$x \in \bar{A} \Leftrightarrow \exists \text{ a sequence in } A \text{ whose limit is } x? \qquad (3.6)$$
(With no additional condition, this result is valid for metric spaces; see Theorem 3.3, Chapter 2.) In other words, when is a set closed if and only if it contains all of its limit points? We also raise another question: When can Proposition 3.4 be reversed, i.e. when does the uniqueness of limits imply that the space $(X,\tau)$ is Hausdorff? These two questions are closely related. To see this, assume that in a topological space $(X,\tau)$ property (3.6) holds and, in addition, all limits are unique. Assume that we can prove that (3.6) also holds in the "product topology" $\tau_p$ on $X^2 = X \times X$ generated by the base $\mathfrak{B} = \tau \times \tau$ that consists of all open parallelepipeds $\{O_1 \times O_2 : O_1, O_2 \in \tau\}$. Pick an arbitrary point $(x,y)$ from $\bar{D}$, which stands for the closure of the diagonal $D = \{(x,x) : x \in X\}$. If (3.6) holds in $(X,\tau)$ (and eventually in $(X^2,\tau_p)$), then the point $(x,y)$ is a limit of some sequence $\{(x_n,y_n)\} \subseteq D$. Since $(x_n,y_n) \in D$, we have that $x_n = y_n$; and, in accordance with our above assumption, by uniqueness of limits, $x = y$. Thus $(x,y) \in D$, i.e. $D = \bar{D}$, or $D$ is closed in the product topology $\tau_p$. The latter implies that any point $(x,y)$ with distinct coordinates is an interior point of $D^c$ and hence it is contained in some base neighborhood $O_x \times O_y \subseteq D^c$. This implies that $O_x \cap O_y = \emptyset$, i.e. $(X,\tau)$ is Hausdorff. □
If (3.6) is so crucial for $(X,\tau)$ to be Hausdorff, what then is a prerequisite for (3.6)? The answer is provided in the upcoming Theorems 3.8 and 3.10. Before that we introduce the following important notions.

3.7 Definitions.
(i) A topological space $(X,\tau)$ is said to satisfy the first axiom of countability (or to be first countable), if each point $x \in X$ has an at most countable neighborhood base.
(ii) A topological space $(X,\tau)$ is said to satisfy the second axiom of countability (or to be second countable), if $(X,\tau)$ has a countable base.

As mentioned, a noteworthy attribute of topological spaces emulating metric spaces is the subject of Theorem 3.8, combined with the reader's efforts in Problem 3.7.

3.8 Theorem. Let $(X,\tau)$ be first countable and let $A$ be a subset of $X$. Then a point $x$ is a closure point of $A$ if and only if there exists a sequence $\{x_n\}$ $(\subseteq A)$ which converges to $x$.

3.9 Remark. In what follows, we will advance to the notion of the product topology to be rigorously constructed in Section 5 of this chapter. We will call the topology on the Cartesian product $X \times X$ generated by all open parallelepipeds, $O_1 \times O_2 \in \tau \times \tau$, the product topology and denote it by $\tau_p$. The reason why $\tau \times \tau$ is a generator for $\tau_p$ is that $\tau \times \tau$ is a subbase and base for $\tau_p$ (in light of Theorem 2.7). Obviously, $\tau_p$ is first countable if $\tau$ is; show it (see Problem 3.12).
The statement below builds promised bridges between uniqueness of limits of sequences, Hausdorff spaces, and closeness of the diagonal in rp. The same result will be generalized and applied to filter and nets in Section 10 (Theorem 10.22).
3.10 Theorem. Let (X,T) be a topological space. Then the following are equivalent.
( i ) ( X , r ) is HausdorPf. (ii) All convergent sequences in ( X , r ) have unique limit points. (iii) The diagonal D = {(x,x) E is closed in the product topology rp on
x2.
x2]
Proof. (i) ⇒ (ii) holds according to Proposition 3.4 (Problem 3.1). For (ii) ⇒ (iii) we assume that all limits of sequences in (X,τ) are unique. If D is not closed, then there is a sequence {(x_n,x_n)} ⊆ D such that (x_n,x_n) → (x,y) with x ≠ y (by Theorem 3.8 applied in (X²,τ_p), which presupposes first countability; see Remark 3.9). This immediately contradicts assumption (ii), since then x_n → x and x_n → y. For (iii) ⇒ (i) we assume that the diagonal D is closed in (X²,τ_p). Let x ≠ y ∈ X. Then (x,y) ∈ D^c. Since D^c is open, it can be represented as a union of base open sets, i.e., as a union of open parallelepipeds. Then at least one of these parallelepipeds, say O_x × O_y ⊆ D^c, must contain the point (x,y), i.e., x ∈ O_x and y ∈ O_y. Thus O_x and O_y are open neighborhoods of x and y, respectively. They are disjoint, since O_x × O_y ⊆ D^c. Hence, (X,τ) is Hausdorff.
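The equivalence of (i) and (iii) can be checked by brute force on finite spaces. The following is only a minimal sketch in Python under that finiteness assumption; the function names is_hausdorff and diagonal_is_closed and the two sample topologies are hypothetical and chosen purely for illustration. The diagonal is tested by computing the union of all open parallelepipeds contained in its complement.

```python
from itertools import product

def is_hausdorff(points, opens):
    """Distinct points must have disjoint open neighborhoods."""
    return all(
        any(x in U and y in V and U.isdisjoint(V) for U in opens for V in opens)
        for x in points for y in points if x != y
    )

def diagonal_is_closed(points, opens):
    """D is closed iff its complement is a union of open parallelepipeds U x V."""
    diagonal = {(x, x) for x in points}
    complement = set(product(points, repeat=2)) - diagonal
    interior = set()
    for U in opens:
        for V in opens:
            rectangle = set(product(U, V))
            if rectangle <= complement:      # rectangle misses the diagonal
                interior |= rectangle
    return interior == complement

points = {0, 1}
# Discrete topology on {0, 1}: Hausdorff.
discrete = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
# Sierpinski-type topology on {0, 1}: not Hausdorff.
sierpinski = [frozenset(), frozenset({0}), frozenset({0, 1})]

for name, opens in [("discrete", discrete), ("sierpinski", sierpinski)]:
    print(name, is_hausdorff(points, opens), diagonal_is_closed(points, opens))
```

In both sample spaces the two tests agree, as Theorem 3.10 predicts.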
PROBLEMS 3.1
Prove Proposition 3.4.
3.2
Show that any one-point set in a Hausdorff space is closed.
3.3
Show that any metric space is first countable.
3.4
Prove that any separable metric space is second countable.
3.5
Is it true that any first countable topological space is also second countable?
3.6
Prove that if a topological space is second countable, then it is separable and first countable.
3.7
Prove Theorem 3.8.
3.8
Let O ⊆ X be open. Show that ∀x ∈ O and ∀ sequence x_n → x, there is an N(O)-tail of this sequence. Prove the converse of this statement assuming that X is first countable.
3.9
While Corollary 3.4, Chapter 2, claims that in a metric space a set A is closed if and only if it contains all its limit points, Remark 3.6 (ii) asserts that in a general topological space a set A could contain all its limit points and still not be closed. However, for any set A of a first countable space, the former property does hold. Show that a set F is closed in X if and only if each convergent sequence in F converges to a point in F.
3.10
Show that subspaces of second countable spaces are second countable.
3.11
Show that τ_1 × τ_2, which consists of all open parallelepipeds {O_1 × O_2 : O_1 ∈ τ_1, O_2 ∈ τ_2}, is ∩-stable.
3.12
Show that τ_p in Remark 3.9 is first countable if τ is first countable.
NEW TERMS: N(A)-tail of a sequence 122 convergent sequence 122 limit point of a sequence 122 limit point of a set 122 Hausdorff (separated, T_2) topological space 122 separated topological space 122 T_2 space 122 Second Separation Axiom 122 product topology 124 diagonal 124 First Axiom of Countability 122 first countable topological space 124 Second Axiom of Countability 124 second countable topological space 124 closure point, criterion of 124 Hausdorff space, criterion of 125
4. CONTINUITY IN TOPOLOGICAL SPACES
Except for a brief introduction of sequences (being a rather vague manifestation of functions) in the previous section, functions appear in the present section for the first time in conjunction with topologies. Naturally, the first quality we look into is continuity. After a first acquaintance with continuity in metric spaces (Section 4, Chapter 2), the reader will be well prepared for its "surprising" variant for topological spaces and a striking similarity between Theorem 4.2 below and Theorem 4.3, Chapter 2, with respect to a key continuity criterion. Again, we will observe some other continuity properties, typical for metric spaces and holding for special topological spaces, yet more general than metric spaces. One of them deals with an important relationship between convergence of sequences and continuity of functions initiated in Chapter 2 (formulated as Theorem 4.4 and pledged to be proved in this section).
4.1 Definitions.
(i) A function f: (X,τ) → (Y,τ_1) is said to be continuous at a point a ∈ X if, for every neighborhood W_{f(a)}, there is a neighborhood U_a such that f_*(U_a) ⊆ W_{f(a)}. This is obviously equivalent to the following definition: f is continuous at a if, for every neighborhood W_{f(a)}, f*(W_{f(a)}) is a neighborhood of a (see Problem 4.1).
(ii) The function f is said to be continuous on X (or simply continuous) if it is continuous at each point a ∈ X.
4.2 Theorem. Let f: (X,τ) → (Y,τ_1) be a function. Then the following are equivalent.
(i) f is continuous.
(ii) The inverse image under f of any open set H ∈ τ_1 is open, i.e., is an element of τ.
Proof. (i) ⇒ (ii). Let H ∈ τ_1. For each point a ∈ f*(H), f(a) ∈ H and therefore f(a) is an interior point of H. Specifically, H is a neighborhood of f(a). Since f is continuous at a, there is a neighborhood U_a such that f_*(U_a) ⊆ H. Because the inclusion is preserved under the inverse, we have
U_a ⊆ f*(f_*(U_a)) ⊆ f*(H),
which implies that f*(H) contains a neighborhood for each of its points.
Hence, f*(H) is itself a neighborhood of all of its points. Therefore, by Proposition 1.6, f*(H) is open, i.e., is an element of τ.
(ii) ⇒ (i). Let a ∈ X and let W_{f(a)} be a neighborhood of f(a). Then there exists an open set H ∈ τ_1 such that f(a) ∈ H ⊆ W_{f(a)}. By assumption (ii), f*(H) is an element of τ. Since obviously a ∈ f*(H) ⊆ f*(W_{f(a)}), f*(H) is a neighborhood of a and thus f*(W_{f(a)}) is also a neighborhood of a. Consequently, we have continuity of f at a.
Let (X,τ) be a topological space. Denote the collection of all closed sets O^c such that O ∈ τ by τ^c.
4.3 Proposition. A function f: (X,τ) → (Y,τ_1) is continuous on X if and only if the inverse image under f of any closed set O^c ∈ τ_1^c is closed in (X,τ).
(See Problem 4.2.)
4.4 Proposition. Let (X,τ), (Y,τ_1), and (Z,τ_2) be topological spaces and let f: X → Y and g: Y → Z be continuous functions. Then the function g ∘ f: X → Z is continuous. (See Problem 4.3.)
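For finite spaces, criterion (ii) of Theorem 4.2 and Propositions 4.3 and 4.4 can be verified mechanically. The sketch below is an illustration only: maps are encoded as Python dictionaries, and the names preimage, is_continuous, is_continuous_closed, as well as the sample spaces, are hypothetical choices and not part of the text.

```python
def preimage(f, subset, domain):
    """Inverse image f*(subset) of a map f given as a dict on a finite domain."""
    return frozenset(x for x in domain if f[x] in subset)

def is_continuous(f, domain, opens_dom, opens_cod):
    """Theorem 4.2 (ii): every open set has an open inverse image."""
    opens_dom = set(opens_dom)
    return all(preimage(f, H, domain) in opens_dom for H in opens_cod)

def is_continuous_closed(f, domain, codomain, opens_dom, opens_cod):
    """Proposition 4.3: every closed set has a closed inverse image."""
    closed_dom = {frozenset(domain) - O for O in opens_dom}
    closed_cod = [frozenset(codomain) - O for O in opens_cod]
    return all(preimage(f, C, domain) in closed_dom for C in closed_cod)

X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]
Y = frozenset({"a", "b"})
tau1 = [frozenset(), frozenset({"a"}), Y]
Z = frozenset({0, 1})
tau2 = [frozenset(), frozenset({0}), Z]

f = {1: "a", 2: "a", 3: "b"}      # continuous: f*({"a"}) = {1, 2} is open
bad = {1: "b", 2: "a", 3: "a"}    # not continuous: bad*({"a"}) = {2, 3}
g = {"a": 0, "b": 1}              # continuous map from Y to Z
g_after_f = {x: g[f[x]] for x in X}

print(is_continuous(f, X, tau, tau1), is_continuous(bad, X, tau, tau1))
print(is_continuous(g_after_f, X, tau, tau2))
```

Both criteria (open sets and closed sets) give the same verdict on each sample map, and the composition of the two continuous maps is again continuous.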
4.5 Definition. Let (X,τ) be a topological space and let [X,Y,f] be a function. Define
τ_q = {B ⊆ Y : f*(B) ∈ τ},
i.e., f**(τ_q) ⊆ τ. By the arguments below (Remarks 4.6), τ_q is a topology and it contains any topology relative to which f is continuous. τ_q is called the quotient topology induced on Y by f. [Recall that f* is defined on 𝒫(Y); consequently, we denote by f** the corresponding function acting on 𝒫(𝒫(Y)).]
4.6 Remarks.
(i) τ_q is indeed a topology:
1) Y, ∅ ∈ τ_q, since f*(Y) = X ∈ τ and f*(∅) = ∅ ∈ τ.
2) Let B_1,...,B_n ∈ τ_q. Then f*(∩_{k=1}^n B_k) = ∩_{k=1}^n f*(B_k) ∈ τ (as a finite intersection of open sets) ⇒ ∩_{k=1}^n B_k ∈ τ_q.
3) A similar consideration can be used to show that τ_q contains all unions.
(ii) τ_q is the largest topology on Y relative to which f is continuous. This follows directly from Definition 4.5.
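On a finite carrier the quotient topology can be computed by enumerating all subsets of Y and keeping those whose inverse image is open. The following sketch assumes finite sets and a map given as a dictionary; the helper names (powerset, quotient_topology, is_topology) and the sample data are hypothetical and serve only to illustrate Definition 4.5 and Remarks 4.6.

```python
from itertools import combinations

def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def preimage(f, subset, domain):
    return frozenset(x for x in domain if f[x] in subset)

def quotient_topology(f, X, tau, Y):
    """All B in P(Y) whose inverse image under f belongs to tau."""
    tau = set(tau)
    return {B for B in powerset(Y) if preimage(f, B, X) in tau}

def is_topology(carrier, opens):
    """Finite case: pairwise unions and intersections suffice."""
    opens = set(opens)
    if frozenset() not in opens or frozenset(carrier) not in opens:
        return False
    return all((A & B) in opens and (A | B) in opens
               for A in opens for B in opens)

X = frozenset({1, 2, 3, 4})
tau = [frozenset(), frozenset({1, 2}), X]
Y = frozenset({"a", "b", "c"})
f = {1: "a", 2: "a", 3: "b", 4: "b"}   # the point "c" is not attained by f

tau_q = quotient_topology(f, X, tau, Y)
# Another topology on Y for which f is continuous; it must sit inside tau_q.
coarser = [frozenset(), frozenset({"a"}), Y]
print(sorted(sorted(B) for B in tau_q))
```

Here τ_q has six elements, and the coarser topology {∅, {a}, Y}, for which f is also continuous, is contained in it, in line with Remark 4.6 (ii).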
4.7 Example. Let X = ℝ, τ = {ℝ, ∅, (−1,2], [0,3), [0,2], (−1,3), (−1,1)} and let f(x) = x² be defined as f: ℝ → ℝ = Y. It is clear that ℝ, ∅, and [0,1) are the only subsets of Y whose inverse images are in τ. Therefore, {ℝ, ∅, [0,1)} is the quotient topology on Y.
By Theorem 4.2, f: (X,τ) → (Y,τ_1) is continuous if and only if f**(τ_1) ⊆ τ. However, if we know a generator 𝒮 of τ_1, then condition (ii) of Theorem 4.2 can be weakened, as the following theorem shows.
4.8 Theorem. Let f: (X,τ) → (Y,τ(𝒮)) (where τ(𝒮) is the topology generated by a subbase 𝒮). Then f is continuous if and only if f**(𝒮) ⊆ τ.
Proof. If f is continuous, then, in particular, f**(𝒮) ⊆ τ. Assume that f**(𝒮) ⊆ τ and introduce the quotient topology τ_q induced by f. Thus, 𝒮 ⊆ τ_q, which implies that τ(𝒮) ⊆ τ_q, for τ(𝒮) is the smallest topology containing 𝒮. Then since f**(τ_q) ⊆ τ, we have
f**(τ(𝒮)) ⊆ f**(τ_q) ⊆ τ,
i.e., f is continuous.
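Theorem 4.8 says that it is enough to test inverse images of subbase elements. A small finite illustration follows; it is a sketch only, and generated_topology, the sample subbase, and the sample map are hypothetical names and data introduced for this purpose. The topology τ(𝒮) is built by closing the subbase (together with the carrier) under finite intersections and then under unions.

```python
def preimage(f, subset, domain):
    return frozenset(x for x in domain if f[x] in subset)

def generated_topology(carrier, subbase):
    """Topology generated by a subbase on a finite carrier."""
    base = {frozenset(carrier)} | {frozenset(S) for S in subbase}
    changed = True
    while changed:                       # close under finite intersections
        changed = False
        for A in list(base):
            for B in list(base):
                if (A & B) not in base:
                    base.add(A & B)
                    changed = True
    opens = {frozenset()} | base
    changed = True
    while changed:                       # close under unions
        changed = False
        for A in list(opens):
            for B in list(opens):
                if (A | B) not in opens:
                    opens.add(A | B)
                    changed = True
    return opens

def inverse_images_open(f, domain, opens_dom, family):
    opens_dom = set(opens_dom)
    return all(preimage(f, S, domain) in opens_dom for S in family)

Y = frozenset({0, 1, 2})
subbase = [frozenset({0, 1}), frozenset({1, 2})]
tau_S = generated_topology(Y, subbase)   # five open sets

X = frozenset({"p", "q", "r"})
f = {"p": 0, "q": 1, "r": 2}
tau_fine = [frozenset(), frozenset({"q"}), frozenset({"p", "q"}),
            frozenset({"q", "r"}), X]
tau_coarse = [frozenset(), frozenset({"p", "q"}), X]

for tau in (tau_fine, tau_coarse):
    # The test on the subbase and the test on the whole topology agree.
    print(inverse_images_open(f, X, tau, subbase),
          inverse_images_open(f, X, tau, tau_S))
```

For each of the two topologies on X, checking the two subbase elements gives the same answer as checking all five open sets of τ(𝒮).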
4.9 Theorem. Let f: (X,τ) → (Y,τ_1) be a map continuous at some point x ∈ X. If {x_n} is a sequence convergent to x, then the sequence {f(x_n)} is convergent to f(x). (See Problem 4.10.)
Theorems 4.8 and 4.9 and the next theorem form an analog to Theorem 4.6, Chapter 2, which was only valid for metric spaces. The statement in Theorem 4.9 has no restriction as to the nature of topological spaces (X,τ) and (Y,τ_1), while its converse needs to be strengthened by the condition that (X,τ) is first countable.
4.10 Theorem. Let f: (X,τ) → (Y,τ_1) be a map and let (X,τ) be first countable. If for any sequence {x_n} convergent to a point x ∈ X, the sequence {f(x_n)} converges to f(x), then f is continuous at x.
Proof. To prove this theorem, we assume that f is not continuous at x, then select a sequence {x_n} convergent to x such that {f(x_n)} does not converge to f(x). The assumption that (X,τ) is first countable is essential in the selection of a convergent sequence {x_n}, which otherwise need not exist. If f is not continuous at x, there is a neighborhood W_{f(x)} such that f*(W_{f(x)}) is not a neighborhood of x, or equivalently, there is no neighborhood U_x such that f_*(U_x) ⊆ W_{f(x)}. [Otherwise, if f_*(U_x) ⊆ W_{f(x)}, then
U_x ⊆ f*(f_*(U_x)) ⊆ f*(W_{f(x)}),
so that f*(W_{f(x)}) would be a neighborhood of x. This would contradict our assumption. (See Figure 4.1.)]
Figure 4.1
Specifically, it follows that, for each base neighborhood B ∈ ℬ_x, f_*(B) is not a subset of W_{f(x)}. Since (X,τ) is first countable, there is a countable neighborhood base ℬ_x = {B_1,B_2,...} which can always be assumed to be monotone decreasing (why?). Now, each B_i contains at least one point, say x_i, such that f(x_i) ∉ W_{f(x)}, which immediately yields that the sequence {f(x_n)} is not in W_{f(x)} and, thus, does not converge to f(x). However, x_n → x. Indeed, for every neighborhood V_x, there is an element B_N ∈ ℬ_x such that B_N ⊆ V_x, which implies that B_k ⊆ V_x, ∀k ≥ N (since ℬ_x is monotone decreasing). Thus, {x_N, x_{N+1},...} is the N(V_x)-tail of {x_n}.
Theorem 4.10 leads to some useful applications.
4.11 Lemma. Let f,g: (X,τ) → (Y,τ_1) be two continuous maps. If (Y,τ_1) is Hausdorff, then the set S = {x ∈ X : f(x) = g(x)} is closed in (X,τ).
Proof. Since f and g are continuous, clearly the map (f,g): X → Y × Y, x ↦ (f(x),g(x)), is continuous relative to the product topology on Y × Y. Since by the assumption, (Y,τ_1) is Hausdorff, by Theorem 3.10, the diagonal D in Y × Y is closed. Hence, the set S, as the inverse image of the diagonal D under the continuous map (f,g), must be closed.
4.12 Proposition. Let f,g: (X,τ) → (Y,τ_1) be two continuous maps that coincide on some dense set in X. If (X,τ) is first countable and if (Y,τ_1) is T_2, then f = g on X.
Thus, a continuous function is completely determined by its values on a dense set. The proof of this proposition is the subject of Problem 4.11.
4.13 Example. If f,g: (ℝⁿ,τ_e) → (ℝⁿ,τ_e) are continuous maps that coincide on the set ℚⁿ of all vectors with rational coordinates, then f and g are identical on ℝⁿ. This fact takes into account that (ℝⁿ,τ_e) is Hausdorff and first countable.
4.14 Definition. Let (X,τ) and (Y,τ_1) be two topological spaces. A bijective map [X,Y,f] is called a homeomorphism if both f and f⁻¹ are continuous. The topological spaces (X,τ) and (Y,τ_1) are then called homeomorphic. We write X ≅ Y. If f fails to be surjective, then f is called an embedding of X into Y. X is also said to be embedded in Y by f.
4.15 Remark. It is not hard to see that the homeomorphic property applied to a collection of topological spaces on fixed carriers X and Y offers an equivalence relation (show it, Problem 4.12).
PROBLEMS 4.1
Show that f is continuous at a point a if and only if, for every neighborhood W_{f(a)}, f*(W_{f(a)}) is a neighborhood of a.
4.2
Prove Proposition 4.3.
4.3
Prove Proposition 4.4.
4.4
Let f: (X,τ) → (Y,τ_1) be a function such that f(x) = x, X = Y = ℝ, τ = {ℝ, ∅, {1}, [1,3)} and τ_1 = {ℝ, ∅, {2}, [2,4)}. Is f continuous?
4.5
Under the conditions of Problem 4.4, set f(x) = x + 1. Is f continuous?
4.6
Let f: (X,τ) → (Y,τ_1) be a map. Show that f is continuous at a point x ∈ X if and only if, for any base neighborhood B_{f(x)} of the point f(x), f*(B_{f(x)}) is a neighborhood of x.
4.7
Under the condition of Problem 4.6, assume that (Y,τ_1) is a metrizable topological space, i.e., τ_1 = τ(d).
a) Show that f is continuous at x ∈ X if and only if the inverse image under f of any open ball B_d(f(x),ε) is a neighborhood of x.
b) Show that, for each open ball B_d(f(x),ε), there is a neighborhood U_x(ε) such that f_*(U_x(ε)) ⊆ B_d(f(x),ε).
4.8
Let f: (X,τ) → (Y, ‖·‖_d) be a map, where Y is an NLS over a field F, and let ‖·‖_d be the norm generated by a TIH metric d. Show that f is continuous at x ∈ X if and only if, for every ε > 0, there is a neighborhood U_x(ε) ∈ 𝒰_x such that, for each y ∈ U_x(ε), ‖f(x) − f(y)‖_d < ε.
4.9
Prove the following statement: Let f: (X,τ) → (ℝⁿ,d_e), where (X,τ) is a topological space. Then f is continuous at a point x ∈ X if and only if, for every ε > 0, there is a neighborhood U_x(ε) ∈ 𝒰_x such that, for all y ∈ U_x(ε), ‖f(x) − f(y)‖ < ε. (‖·‖ denotes the Euclidean norm.)
4.10
Prove Theorem 4.9.
4.11
Prove Proposition 4.12.
4.12
Prove the statement posed in Remark 4.15.
4.13
Show that (ℝ,τ_e) is homeomorphic to (−1,1) with the corresponding relative topology on (−1,1).
4.14
Is (ℝ,τ_e) homeomorphic to [−1,1]?
NEW TERMS: function continuous at a point 128 function continuous at a point, criterion of 128 continuous function 128 continuity of a function, criterion of 128, 129, 130 composition of continuous functions 129 quotient topology 129 continuous function on a dense set 131 homeomorphism 132 homeomorphic topological spaces 132 embedding 132 embedded set 132
5. PRODUCT TOPOLOGY
Let (Y_1,τ_1),...,(Y_n,τ_n) be topological spaces. One of the reasonable ways to define a topology on the Cartesian product Y = Y_1 × ... × Y_n is to take the collection
ℬ = τ_1 × ... × τ_n = {O_1 × ... × O_n : O_i ∈ τ_i, i = 1,...,n}
for a family of "open" parallelepipeds and declare it as a base for the topology it generates. ℬ is obviously closed relative to the formation of all finite intersections [show it], and therefore, by Proposition 2.7, is a base for τ(ℬ), which consists of all unions of elements of ℬ. We wish to call τ(ℬ) the product topology on Y and denote it by τ_p. The following is an attempt to reduce the base ℬ for τ_p.
5.1 Proposition. Let
ℬ′ = ℬ_1 × ... × ℬ_n = {B_1 × ... × B_n : B_i ∈ ℬ_i, i = 1,...,n},
where ℬ_i is a base for τ_i, i = 1,...,n. Then ℬ′ is also a base for τ_p.
(See Problem 5.1.) Any element of ℬ′ is called a base parallelepiped.
5.2 Proposition. Let
𝒮 = 𝒮_1 × ... × 𝒮_n = {S_1 × ... × S_n : S_i ∈ 𝒮_i, i = 1,...,n},
where 𝒮_i is a subbase for τ_i, i = 1,...,n. Then 𝒮 is a subbase for τ_p.
(See Problem 5.2.) Any element of 𝒮 is called a subbase parallelepiped.
5.3 Proposition. Let 𝒮′ = {π_i*(S_i) : S_i ∈ 𝒮_i, i = 1,...,n}, where 𝒮_i is a subbase for τ_i. Then 𝒮′ ⊆ 𝒮 is a subbase for τ_p.
(See Problem 5.3.) Observe that any element of 𝒮′ is a unit cylinder.
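For a product of two finite spaces, the base of open parallelepipeds and the subbase of unit cylinders can be compared directly. The sketch below is illustrative only and rests on assumptions not in the text: the carrier of each space is treated as the empty intersection of subbase elements, and all names (close_under, topology_from_subbase, the sample spaces Y1 and Y2) are hypothetical.

```python
from itertools import product

def close_under(family, op):
    """Smallest family containing `family` and closed under the binary op."""
    family = set(family)
    changed = True
    while changed:
        changed = False
        for A in list(family):
            for B in list(family):
                C = op(A, B)
                if C not in family:
                    family.add(C)
                    changed = True
    return family

def topology_from_subbase(carrier, subbase):
    """Finite intersections of subbase elements, then all unions."""
    base = close_under({frozenset(carrier)} | set(subbase), frozenset.intersection)
    return close_under({frozenset()} | base, frozenset.union)

Y1, S1 = frozenset({0, 1}), [frozenset({0})]
Y2, S2 = frozenset({"a", "b"}), [frozenset({"a"}), frozenset({"b"})]
tau1 = topology_from_subbase(Y1, S1)     # {empty, {0}, Y1}
tau2 = topology_from_subbase(Y2, S2)     # discrete topology on Y2
Y = frozenset(product(Y1, Y2))

# Product topology from the base of open parallelepipeds U x V.
rectangles = {frozenset(product(U, V)) for U in tau1 for V in tau2}
tau_p = close_under({frozenset()} | rectangles, frozenset.union)

# Topology generated by the unit cylinders over subbase elements.
cylinders = ({frozenset(product(S, Y2)) for S in S1}
             | {frozenset(product(Y1, S)) for S in S2})
tau_cyl = topology_from_subbase(Y, cylinders)

# Each projection is continuous: the inverse image of an open set is a cylinder.
proj1_continuous = all(frozenset(product(U, Y2)) in tau_p for U in tau1)
proj2_continuous = all(frozenset(product(Y1, V)) in tau_p for V in tau2)
print(len(tau_p), tau_p == tau_cyl, proj1_continuous, proj2_continuous)
```

In this example both constructions produce the same nine open sets, which is the finite content of Proposition 5.3.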
5.4 Example. As was mentioned in Example 2.8 (i), the usual topology τ_e on ℝⁿ coincides with the product topology τ_p on ℝⁿ = ℝ × ... × ℝ generated by the base ℬ of all open parallelepipeds (as the n-times Cartesian product of open sets in ℝ). The base parallelepipeds are of the form ∏_{i=1}^n (a_i,b_i), where (a_i,b_i) ⊆ ℝ, and they are elements of a base for (ℝⁿ,τ_e). In particular, the system of all rational parallelepipeds is also a base for τ_p = τ_e. The system 𝒮′ of all unit cylinders {π_i*((a_i,b_i)) : (a_i,b_i) ⊆ ℝ, i = 1,...,n} is a subbase for τ_e. (See Example 2.8 (ii).)
It is apparent that the projection maps are continuous relative to the product topology. Furthermore,
5.5 Theorem. Let Y = ∏_{i=1}^n Y_i and let π_j: Y → Y_j be the jth projection map, j = 1,...,n. Then the product topology τ_p on Y is the weakest topology for which each projection is continuous.
Proof. Let τ be a topology on Y for which each projection is continuous, i.e., π_j**(τ_j) ⊆ τ. Then for every choice of sets O_j ∈ τ_j, j = 1,...,n,
O = ∩_{j=1}^n π_j*(O_j) = O_1 × ... × O_n ∈ τ.
But O is known to belong to τ_p, where O is a base set of τ_p. Thus, if ℬ is a base for τ_p such that ℬ ⊆ τ, then by Remark 2.4 (i), τ_p ⊆ τ.
We extend the notion of product topology of finitely many factor spaces to that on the Cartesian product of arbitrarily many factor spaces. We therefore assume that {(Y_x,τ_x) : x ∈ X} is an arbitrary indexed family of topological spaces. Let us consider two different models of topologies on the Cartesian product Y = ∏_{x∈X} Y_x. One of them, called the box topology (in notation τ_b), is subject to the following construction. We take for a base for τ_b the system of box parallelepipeds,
{∏_{x∈X} O_x : O_x ∈ τ_x},
or even a weaker base,
ℬ_b = {∏_{x∈X} B_x : B_x ∈ ℬ_x},
where ℬ_x is a base for τ_x. Hence, the introduced box topology τ_b is not different from its version for finitely many factor spaces. There is another, "more economical" topology on Y, which also preserves continuity of projection maps and, in addition, leads to a tame formation of the widely used "pointwise topology" (which the box topology does not).
5.6 Definition. Let us define the topology τ_p on Y through the base
ℬ = {∏_{x∈X} O_x : O_x ∈ τ_x},     (5.6)
where O_x = Y_x, except for finitely many indices x ∈ X. In other words, all elements of ℬ are simple cylinders (see Definition 5.3, Chapter 1). The topology τ_p generated by such a base is called the product or Tychonov topology on Y.
Obviously base (5.6) for τ_p can be further reduced if each O_x is selected from a base ℬ_x for τ_x.
5.7 Remarks.
(i) Let 𝒮_x be a subbase for τ_x. One can show that the collection 𝒮 = {π_x*(S) : S ∈ 𝒮_x, x ∈ X} of unit subbase cylinders is a subbase for τ_p, just as it is for the case of finite products. (See Problem 5.7.)
(ii) We will always prefer to deal with the smallest possible base or subbase for τ_p, provided that we have the knowledge of bases or subbases for each τ_x. For instance, as a rule of thumb, we can take {π_x*(O_x) : O_x ∈ τ_x, x ∈ X} as a subbase for τ_p, unless more is known about the nature of the τ_x's.
5.8 Examples.
(i) Let {(Y_x,τ_x), x ∈ X} be a collection of metrizable topological spaces and let Y = ∏_{x∈X} Y_x. According to Example 2.3 (i), the collection of all open balls B_x(y_x,r), y_x ∈ Y_x, constitutes a base for (Y_x,τ_x(d_x)). Now, the set of all simple cylinders of the form
π_{x_1}*(B_{x_1}(y_1,r_1)) ∩ π_{x_2}*(B_{x_2}(y_2,r_2)) ∩ ... ∩ π_{x_k}*(B_{x_k}(y_k,r_k)), x_i ∈ X, y_i ∈ Y_{x_i}, k = 1,2,...,     (5.8)
is a base for τ_p, whereas the collection of all unit cylinders of the form π_x*(B_x(y_x,r_x)) is a subbase for τ_p.
(ii) Let Y = ℝ^ℝ be the collection of all real-valued functions on ℝ that are regarded as the Cartesian product of ℝ's, with each ℝ equipped with the usual topology. We select an open neighborhood U_f of a point f ∈ Y. First of all, according to (5.8), a simple cylinder with base (y_1 − ε_1, y_1 + ε_1) × ... × (y_k − ε_k, y_k + ε_k) has the form
∩_{i=1}^k π_{a_i}*((y_i − ε_i, y_i + ε_i)),     (5.8a)
where y_i is a point in Y_{a_i} = ℝ. In order that this cylinder be a neighborhood of f, we need to replace y_i by the corresponding traces f(a_i) of f in the factor spaces Y_{a_1},...,Y_{a_k}:
U_f = ∩_{i=1}^k π_{a_i}*((f(a_i) − ε_i, f(a_i) + ε_i)) = {g ∈ ℝ^ℝ : |g(a_i) − f(a_i)| < ε_i, i = 1,...,k}.     (5.8b)
(See Figure 5.1.)
Figure 5.1
5.9 Remark. Let {g_x : x ∈ X} be a family of functions g_x: Ω → Y_x, where each Y_x is endowed with a topology τ_x. Recall that g_x**(τ_x), ∀x ∈ X, is a topology on Ω, and that each function g_x is continuous relative to this topology. The union of all these topologies,
𝒮 = ∪_{x∈X} g_x**(τ_x),
need not be a topology, for it does not necessarily preserve unions and intersections. But we can extend it to a topology, say τ(𝒮), regarding 𝒮 as a subbase. This topology is the weakest one for which all functions of the above family are continuous. τ(𝒮) is called the weak topology generated by the family {g_x}. Now, taking ∏_{x∈X} Y_x for Ω and π_x (the xth projection map) for g_x, we deduce that the Tychonov topology τ_p is the weakest topology for which all projections are continuous. Consequently, τ_p turns out to be the weak topology generated by the projection maps. (Of course, we need to show that τ_p = τ(𝒮); see Problem 5.7.) By the way, this offers another (equivalent) definition of the Tychonov topology on ∏_{x∈X} Y_x.
5.10 Example. Recall that a sequence {x_n} ⊆ Ω converges to a point x ∈ Ω if, for every neighborhood U_x, there is an N(U_x)-tail of {x_n}. In the product space Ω = ℝ^ℝ, a sequence of points {f_n} is convergent to a point f ∈ Ω if and only if f_n(x) → f(x) for all x ∈ ℝ. To see this we note (see Example 5.8 (ii)) that a base neighborhood U_f of f in (5.8b) is of the form
U_f = {g ∈ ℝ^ℝ : |g(x_i) − f(x_i)| < ε_i, i = 1,...,k}.
In other words, f_n → f if it is close to f on each finite set {x_1,...,x_k} ⊆ ℝ, specifically on singletons {x} ⊆ ℝ.
Example 5.10 motivates the following notion.
5.11 Definition. Let {(Y_x,τ_x), x ∈ X} be a family of topological spaces and let (∏_{x∈X} Y_x, τ_p) be the Tychonov product topology. Recall that if Y_x = Y and τ_x = τ, for each x ∈ X, then we denoted ∏_{x∈X} Y_x by Y^X and called it the set of functions from X to Y. Now the special Tychonov product topology (Y^X,τ_p) is called the topology of pointwise convergence.
As a generalization of Example 5.10, the following proposition can help solidify our understanding of the topology of pointwise convergence.
5.12 Proposition. Let {f_n} be a sequence in Y^X. Then f_n → f ∈ Y^X (in the topology of pointwise convergence) if and only if f_n(x) → f(x), ∀x ∈ X (in the topology (Y_x,τ_x)).
Proof. Recall that π_x: Y^X → Y is the x-projection map defined as π_x(f) = f(x) (see Section 5, Chapter 1).
(i) First assume that f_n → f in (Y^X,τ_p). By Theorem 5.5, π_x is continuous for every x. Thus, by Theorem 4.9, π_x(f_n) → π_x(f). This yields that f_n(x) → f(x) in (Y_x,τ_x).
(ii) Let f_n(x) → f(x) in (Y_x,τ_x), ∀x ∈ X. Let U_f be a neighborhood of f in (Y^X,τ_p). Clearly, U_f contains some base neighborhood B_f. Since by Theorem 2.2, B_f ∈ ℬ_f ⊆ ℬ (for τ_p), it follows that B_f is of the form
B_f = ∏_{x∈X} O_{f(x)},
where all O_{f(x)}'s but finitely many (O_{f(x_1)},...,O_{f(x_k)}) are Y_x's and, for each i = 1,...,k, O_{f(x_i)} contains f(x_i). Thus the base neighborhood B_f is a simple cylinder
B_f = ∩_{i=1}^k π_{x_i}*(O_{f(x_i)}).
Now, f_n → f if and only if for every base neighborhood B_f, there is an N(B_f)-tail of {f_n}. By our assumption, f_n(x_i) → f(x_i), which implies the existence of an N_i(O_{f(x_i)})-tail, i = 1,...,k. Let N = max{N_1,...,N_k}. (Note that this is exactly the place where we take advantage of the Tychonov product topology, for otherwise, in the case of the box topology, a base neighborhood of f could not be represented by a simple cylinder. The latter would be an obstacle in finding a finite maximum of infinitely many N_i's, which would finally imply that {f_n} does not converge to f in this box topology.) Then, for each x_i, i = 1,...,k, we have the N(O_{f(x_i)})-tail of {f_n(x_i)}, which yields that
f_n(x_i) ∈ O_{f(x_i)}, for all n ≥ N, i = 1,...,k.
Therefore, we have
f_n ∈ ∩_{i=1}^k π_{x_i}*({f_n(x_i)}) ⊆ ∩_{i=1}^k π_{x_i}*(O_{f(x_i)}) = B_f, for all n ≥ N.
The latter tells us that an N(B_f)-tail of {f_n} exists, and therefore, f_n → f in (Y^X,τ_p).
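The parenthetical remark in the proof, on why the box topology fails here, can be made concrete by a standard worked example. It is offered only as an illustration, with the index set X = ℕ and the factor spaces Y_x = ℝ chosen as assumptions for this purpose.

```latex
% Worked example (assumed setting: X = \mathbb{N}, Y_x = \mathbb{R} with the usual topology).
Let $f_n \in \mathbb{R}^{\mathbb{N}}$ be the constant sequence $f_n(k) = \tfrac{1}{n}$,
$k \in \mathbb{N}$, and let $f \equiv 0$.

\textbf{Tychonov topology.} For each fixed $k$, $f_n(k) = \tfrac{1}{n} \to 0 = f(k)$.
By Proposition 5.12, $f_n \to f$ in $(\mathbb{R}^{\mathbb{N}}, \tau_p)$.

\textbf{Box topology.} The box parallelepiped
\[
  B = \prod_{k \in \mathbb{N}} \left( -\tfrac{1}{k}, \tfrac{1}{k} \right)
\]
is a $\tau_b$-open neighborhood of $f$. For every $n$ we have
$f_n(n) = \tfrac{1}{n} \notin \left( -\tfrac{1}{n}, \tfrac{1}{n} \right)$, hence $f_n \notin B$.
Thus no tail of $\{f_n\}$ lies in $B$, and $\{f_n\}$ does not converge to $f$ in
$(\mathbb{R}^{\mathbb{N}}, \tau_b)$, although it converges pointwise.
```

The obstruction is exactly the one described in the proof: the neighborhood B restricts infinitely many coordinates at once, so no finite maximum N of the indices N_k is available.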
PROBLEMS 5.1
Prove Proposition 5.1.
5.2
Prove Proposition 5.2.
5.3
Prove Proposition 5.3. [Hint: Apply Theorem 2.7.]
5.4
A map f: (X,τ) → (Y,τ_1) is said to be open if f_**(τ) ⊆ τ_1. Show that in the product topology each projection map is open. [Hint: Use the fact that, according to Problem 3.3, Chapter 1, maps preserve unions.]
5.5
Let f: (Ω,τ) → (X = ∏_{i=1}^n X_i, τ_p). Show that the function f is continuous if and only if each π_i ∘ f is continuous. [Hint: Show that f*(S) ∈ τ, for every subbase element S of τ_p, and then apply Theorem 4.8.]
5.6
Let (X_i,τ_i) be a Hausdorff space, i = 1,...,n. Prove that (∏_{i=1}^n X_i, τ_p) is Hausdorff.
5.7
Show that 𝒮 in Remark 5.9 is a subbase for the Tychonov topology.
5.8
Show that all major properties of the product topology of finitely many spaces can be reformulated and can hold for the Tychonov topology (Problems 5.4-5.6).
5.9
Let (X = ∏_{i∈I} X_i, τ_p) be the Tychonov topology and assume that each factor space is first countable. Is (X,τ_p) first countable if:
a) |I| = ℵ_0?
b) 111 ? & ? 5.10
Generalize Theorem 5.5 for the case of Tychonov's topology.
NEW TERMS: product topology for finitely many factor spaces 135 base parallelepiped 135 subbase parallelepiped 135 continuity of projection maps 136 product topology for arbitrarily many factor spaces 136 box topology 136 box parallelepiped 136 Tychonov topology 137 weak topology (generated by a family of functions) 138 topology of pointwise convergence 139 pointwise convergence, criterion of 139 open map 140
6. NOTES ON SUBSPACES AND COMPACTNESS
It has been mentioned that subspaces of topological spaces (i.e., relative topologies) inherit certain qualities of the original spaces. In this section we consider this notion more systematically. We will be concerned with such topological properties as separability, countability, and compactness and their effect on subspaces.
6.1 Definition. A property of a space is referred to as hereditary if every subspace has this property. A property is said to be weakly hereditary if it is inherited by every subspace whose carrier is closed in the original space. A property is vaguely hereditary if it is inherited by every subspace whose carrier is open in the original space. [The last notion is restricted to use in this textbook.]
6.2 Example. Second countability is hereditary. (See Problem 3.10.)
6.3 Remark. In Section 1 we denoted by Ā the closure of some subset A of a topological space (X,τ), understanding that this is the closure relative to the topology τ. As was mentioned in Definition 1.8 (i), in the case of subspaces we may need to deal with closures of subsets with respect to some relative topology, say (Y,τ_Y). To make the distinction clear we will then write Cl_Y A. However, we will still use Ā, having in mind the closure relative to the original space (X,τ).
6.4 Example. The property of density of a set is not hereditary and not weakly hereditary, i.e., if D is dense in (X,τ), its trace in a subspace (Y,τ_Y) need not be dense. Let (X,τ) = (ℝ,τ_e) and Y = ℝ_+ ∪ {−√2}. Then, obviously the set ℚ_+ = ℚ ∩ Y is not dense in (Y,τ_Y). It is easily seen that {−√2} is an open neighborhood of the point −√2 that does not meet ℚ_+. Thus Cl_Y ℚ_+ ≠ Y. Since Y is closed in (ℝ,τ_e), the density property is not weakly hereditary either.
6.5 Theorem. Separability is vaguely hereditary, but not (weakly) hereditary.
Proof. (i) Let (X,τ) be separable and let (Y,τ_Y) be a subspace of (X,τ) such that Y ∈ τ. We show that (Y,τ_Y) is separable. Let D be a countable, dense set in (X,τ). We need to prove that Cl_Y(D ∩ Y) = Y; specifically, we need to show that Y ⊆ Cl_Y(D ∩ Y), for the inverse inclusion holds trivially. Let y be any point of Y and let U′_y be any open neighborhood of y in τ_Y. Since Y is open in X, U′_y is also a neighborhood of the point y in τ. [It is easy to show the following: if Y ∈ τ, then U′_y ∈ 𝒰_y in τ.] Therefore, U′_y meets D and, consequently, U′_y meets D ∩ Y (as a subset of Y). Observe that if Y is not open in X, U′_y need not be a neighborhood of y in X. [For instance, let Y = (0,2] and U′_2 = (1,2]. Clearly, U′_2 is not a neighborhood of 2 in (ℝ,τ_e), but it is a neighborhood of 2 in (Y,τ_Y).]
(ii) As a counterexample of separability as a hereditary property, we consider the topology (X,τ) known as the Moore plane. Let X = ℝ × [0,∞) (the upper half-plane and the horizontal axis). The topology on X is described by the following base sets. At each point (x,y) ∈ ℝ × (0,∞), the neighborhood base is the collection of all open balls {B((x,y),r) : r < y}, where B(z,r) = {z′ ∈ X : d(z,z′) < r}. At each point (x,0), the neighborhood base consists of all open balls touching the horizontal axis at (x,0), and the point (x,0) is attached to these balls. Take the union of all neighborhood bases and construct a base for the Moore plane topology in light of Theorem 2.2. This topological space is separable with the dense countable subset D = ℚ² ∩ X. Indeed, let (x,y) ∈ ℝ × (0,∞). Then any neighborhood of (x,y) contains points with rational coordinates (a property inherited from the Euclidean space). As for the points (x,0), any open ball bordering (x,0) also contains points with rational coordinates. Now, for a subspace of the Moore plane, consider the one with the horizontal axis as the carrier Y. Clearly, all singletons are traces of base neighborhoods at (x,0) in Y, yielding the discrete topology as the relative topology on Y. According to Problem 6.2, any discrete topology with a noncountable carrier is not separable. Observe that Y^c is obviously open in X, i.e., Y is closed. Hence separability is not weakly hereditary.
6.6 Definition. A subset A of a topological space (X,τ) is said to be compact (Lindelöf) if every open cover of A contains a finite (at most countable) subcover. We also say that A is finitely (countably) reducible. Specifically, if X is compact (Lindelöf), (X,τ) is called a compact topological space (Lindelöf space).
6.7 Example. Compactness in metrizable topological spaces obviously coincides with that for the corresponding metric spaces. In this case, we may use the tools and criteria of compactness for metric spaces. For instance, the interval [a,b] for a,b ∈ ℝⁿ is compact in the sense of the Euclidean metric; therefore, it is compact in (ℝⁿ,τ_e), while (a,b) is not compact in (ℝⁿ,τ_e), since it is not closed.
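The failure of compactness for an open interval can also be seen directly from Definition 6.6. The following worked example, given for the particular interval (0,1) ⊆ ℝ as an illustrative assumption, exhibits an open cover with no finite subcover.

```latex
% Worked example (assumed special case n = 1, (a,b) = (0,1)).
Let $O_n = \left( \tfrac{1}{n}, 1 \right)$, $n = 2, 3, \ldots$ Each $O_n$ is open in
$(\mathbb{R}, \tau_e)$, and
\[
  \bigcup_{n=2}^{\infty} O_n = (0,1),
\]
since every $x \in (0,1)$ satisfies $x > \tfrac{1}{n}$ for all sufficiently large $n$.
Hence $\{O_n : n \geq 2\}$ is an open cover of $(0,1)$.

Any finite subfamily $\{O_{n_1}, \ldots, O_{n_m}\}$ with $N = \max\{n_1, \ldots, n_m\}$ has union
\[
  \bigcup_{j=1}^{m} O_{n_j} = O_N = \left( \tfrac{1}{N}, 1 \right),
\]
which misses the point $\tfrac{1}{2N} \in (0,1)$. Therefore the cover is not finitely
reducible, and $(0,1)$ is not compact.
```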
6.8 Theorem. Let f: (X,τ) → (Y,τ_1) be a continuous function. Then the image of any compact subset of X is compact.
One can use the same method of proof for Theorem 6.8 as that of Theorem 6.10, Chapter 2.
6.9 Theorem. Compactness is weakly hereditary (i.e., a closed subset of a compact topological space is compact).
Proof. Let (X,τ) be compact and let B ⊆ X be closed. Let {O_i : i ∈ I} be an open cover of B. Since B^c is open, {B^c, O_i : i ∈ I} is an open cover of X. Since X is compact, there exists a finite open subcover of X, say {B^c,O_1,...,O_n}, which is also a finite open cover of B. Since B^c does not meet B, {O_1,...,O_n} already covers B. Hence, B is compact.
Hausdorff topological spaces possess an important property with respect to compactness.
6.10 Theorem. Every compact subset of a Hausdorff space is closed.
Proof. Let A be a compact subset of the Hausdorff space (X,τ). We show that A^c is open. Take x ∈ A^c. The family of neighborhoods of all points y ∈ A covers A. We extract a particular subfamily of these neighborhoods. Since (X,τ) is Hausdorff, for each y ∈ A, there is a neighborhood U_x(y) of x and a neighborhood V_y(x) of y such that U_x(y) ∩ V_y(x) = ∅. Without loss of generality we may assume that the family {V_y(x) : y ∈ A} is an open cover of A. (Otherwise, for each y ∈ A we can select open subsets O_y(x) ⊆ V_y(x) such that y ∈ O_y(x).) Since A is compact, there exists a finite open subcover {V_{y_k}(x) : k = 1,...,n} of A. Obviously, V_{y_k}(x) ∩ U_x(y_k) = ∅. Select open sets {O_x(y_k) ⊆ U_x(y_k), k = 1,...,n} containing x, whose intersection (denoted by O_x), being finite, is open and nonempty. Therefore, O_x is an open neighborhood of x ∈ A^c with O_x ∩ A = ∅, which means that x is an interior point of A^c. Thus, A^c is open, or equivalently, A is closed.
6.11 Remark. In Theorem 6.3, Chapter 2, we stated and proved (Problem 6.8) the equivalence of the conditions: (i) A ⊆ (X,d) is compact; (ii) every infinite subset of A has an accumulation point in A (Bolzano-Weierstrass compactness); (iii) every sequence in A has a convergent subsequence (sequential compactness). This equivalence does not hold for topological spaces, where (i) and (iii) are in general distinct properties, and compactness just implies Bolzano-Weierstrass compactness, as the reader will prove (see Problem 6.6).
Recall that second countable spaces are first countable and separable (see Problem 3.6). In addition, they are Lindelöf spaces, as the following theorem asserts.
6.12 Theorem. Any second countable topological space is Lindelöf.
Proof. Let ℬ be a countable base of a topological space (X,τ) and let 𝒪 = {O_i : i ∈ I} be an arbitrary open cover of X. Let x ∈ X. Then x belongs to some O_x ∈ 𝒪. Since ℬ is a base for τ, by Theorem 2.2, there is a neighborhood base ℬ_x ⊆ ℬ. Then there is a base neighborhood B_x ∈ ℬ_x such that B_x ⊆ O_x. The collection of all distinct B_x's for all x ∈ X is at most countable. On the other hand, this collection obviously covers X. Consequently, the collection of open supersets {O_x}, with one superset chosen for each distinct B_x, is also at most countable and it covers X. Thus (X,τ) is indeed Lindelöf.
The result below is in the spirit of Theorem 4.8, where, for continuity of a function f from (X,τ) to (Y,τ(𝒮)), it was sufficient to verify that f**(𝒮) ⊆ τ. Here we claim that, if 𝒮 is a subbase for τ, then X is compact whenever every cover of X by elements from 𝒮 can be reduced to a finite subcover.
6.13 Theorem (Alexander Subbase Theorem). Let $(X,\tau)$ be a topological space and let $\sigma$ be a subbase for $\tau$. If every open cover of $X$ by elements of $\sigma$ is finitely reducible, i.e. if every such cover can be reduced to a finite subcover, then $X$ is compact.

Proof. We prove the equivalent statement: If $X$ is not compact, then there exists an open cover by elements of $\sigma$ that is not finitely reducible. Assume that $X$ is not compact. We will prove this assertion in four steps, which we outline as follows:

(i) Let $\mathcal{O}$ be the collection of all open covers of $X$ that cannot be reduced to finite subcovers. We will show that $\mathcal{O}$ has a maximal element; call it $\mathcal{A}$.

(ii) We will show that for every $x \in X$, there is an open set $M_x \in \mathcal{A}$ and a finite tuple of open sets $\{S_1(x),\ldots,S_n(x)\} \subseteq \sigma$ such that

$$x \in \bigcap_{k=1}^{n} S_k(x) \subseteq M_x.$$

(iii) We will show that at least one of the sets $S_1(x),\ldots,S_n(x)$, denote it $S(x)$, belongs to $\mathcal{A}$.

(iv) We will recognize that for each $x \in X$, $S(x) \in \sigma$ and $S(x) \in \mathcal{A}$. In particular, the latter will imply that $\{S(x) : x \in X\}$ is an open cover of $X$ which is not finitely reducible. On the other hand, since we will have $\{S(x) : x \in X\} \subseteq \sigma$, the proof will be complete.

We now treat each of the above steps in detail.

Step (i): Since $X$ is not compact, $\mathcal{O}$ is not empty. Introduce on $\mathcal{O}$ the partial order relation in terms of inclusion. (In other words, two open covers $\mathcal{C}_1$ and $\mathcal{C}_2$ of $X$ from $\mathcal{O}$ are related as $\mathcal{C}_1 \preceq \mathcal{C}_2$ if and only if $\mathcal{C}_1$ is a subcover of the cover $\mathcal{C}_2$.) Let $\mathfrak{C} \subseteq \mathcal{O}$ be any chain, and let $\mathcal{U}$ be the union of all elements of $\mathfrak{C}$. Clearly, $\mathcal{U}$ is a cover of $X$ that cannot be reduced to a finite subcover, since any finite subfamily of $\mathcal{U}$ is contained in a single member of the chain $\mathfrak{C}$, which is not finitely reducible. Thus $\mathcal{U}$ must belong to $\mathcal{O}$, and $\mathcal{U}$ is an upper bound of $\mathfrak{C}$. By Zorn's Lemma 4.13, Chapter 1, there is at least one maximal element in $\mathcal{O}$; denote it by $\mathcal{A}$.

Step (ii): Let $x \in X$ and let $M_x \in \mathcal{A}$ be such that $x$ is an interior point of $M_x$ (such an $M_x$ exists, for $\mathcal{A} \in \mathcal{O}$ is an open cover). On the other hand, by Theorem 2.7, the collection $\mathcal{B}$ of all finite intersections of elements of the subbase $\sigma$ is a base for $\tau$. Thus, as an open set, $M_x$ is a union of base elements, each of which is a finite intersection of elements of $\sigma$; i.e., $x$ belongs to one of the base elements $B_x$, represented by a finite intersection $\bigcap_{k=1}^{n} S_k(x)$ of subbase elements. In other words, there is a tuple $\{S_1(x),\ldots,S_n(x)\} \subseteq \sigma$ such that $x \in \bigcap_{k=1}^{n} S_k(x) \subseteq M_x$.
Step (iii): Assume that for the given set $M_x$ there is no element of the tuple $\{S_1(x),\ldots,S_n(x)\}$ which is an element of $\mathcal{A}$. In this case, for each $S_k(x) \in \{S_1(x),\ldots,S_n(x)\}$, there is a finite subcollection $\{M_1(k),\ldots,M_{j_k}(k)\}$ from $\mathcal{A}$ that supplements $S_k(x)$ to a cover of $X$. If no such finite subcollection were to exist, then $\{S_k(x)\} \cup \mathcal{A}$ would be an element of $\mathcal{O}$, i.e. $\{S_k(x)\} \cup \mathcal{A}$ would not be finitely reducible, and $\mathcal{A}$ would not be a maximal element of $\mathcal{O}$. Hence, $\{M_1(k),\ldots,M_{j_k}(k), S_k(x)\}$, $k = 1,\ldots,n$, is a finite open cover of $X$. By Problem 6.9,

$$\Big\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } \bigcap_{k=1}^{n} S_k(x)\Big\}$$

is also a finite open cover of $X$. Since $\bigcap_{k=1}^{n} S_k(x) \subseteq M_x$, this implies that

$$\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } M_x\}$$

is a finite cover of $X$, and is also a finite subcover of $\mathcal{A}$, which contradicts the property that $\mathcal{A}$ is an element of $\mathcal{O}$. Thus, our assumption about $\{S_1(x),\ldots,S_n(x)\}$ was wrong and there is at least one set $S(x) \in \{S_1(x),\ldots,S_n(x)\}$ which belongs to $\mathcal{A}$.

Step (iv): It follows that, for each $x \in X$, there is an element $S(x)$ common to $\sigma$ and $\mathcal{A}$ with $x \in S(x)$, and thus the collection $\{S(x) : x \in X\}$ is an open cover of $X$ that cannot be reduced to a finite subcover, being a subfamily of $\mathcal{A}$. [Observe that the assumption that $X$ is not compact implies that $X$ cannot be finite.]

The Alexander Subbase Theorem leads to the following important result due to Tychonov.
6.14 Theorem (Tychonov). A nonempty Tychonov product is compact if and only if each factor space is compact.

Proof. (i) If the Tychonov product is compact, then compactness of the factor spaces follows from continuity of the projection maps (see Theorems 5.5 and 6.8).

(ii) Let $(X,\tau_p)$ be the Tychonov product of compact spaces $(X_i,\tau_i)$, $i \in I$. Take for a subbase for $\tau_p$ the collection of all simple cylinders $\mathcal{S} = \{\pi_i^*(O_i) : O_i \in \tau_i,\ i \in I\}$ (see Proposition 5.3). If $X$ is not compact, then, by the Alexander Subbase Theorem, some subcollection $\mathcal{C} \subseteq \mathcal{S}$ covering $X$ cannot be reduced to a finite subcover. For each $i \in I$, let $\mathcal{C}_i = \{O_i \in \tau_i : \pi_i^*(O_i) \in \mathcal{C}\}$. If, for some $i$, $\mathcal{C}_i$ were an open cover of $X_i$, then it could be reduced to a finite subcover $\{O_i(1),\ldots,O_i(k_i)\}$ of $X_i$, and $\{\pi_i^*(O_i(1)),\ldots,\pi_i^*(O_i(k_i))\}$ would be a finite open cover of $X$ that is a finite subcover of $\mathcal{C}$, which is impossible. Hence, for each $i \in I$, there is a point $x_i \in X_i$ not covered by $\mathcal{C}_i$, and the point $x = (x_i)_{i \in I}$ belongs to no element of $\mathcal{C}$. This contradicts the assumption that $\mathcal{C}$ covers $X$, so $X$ is compact.
PROBLEMS

6.1 Is the property of density of a set vaguely hereditary? (Consider Example 6.4.)

6.2 Let $(X,\tau(X))$ be a discrete topological space with an uncountable carrier. Show that the space is not separable.

6.3 Show that the topological space in Problem 6.2 is not second countable.

6.4 Show that first countability is hereditary.

6.5 Prove that the Moore plane is not second countable.

6.6 Prove the statement: Every compact topological space is Bolzano-Weierstrass compact.

6.7 Let $\tau$ be the cofinite topology on an arbitrary nonempty set $X$. Show that $(X,\tau)$ is compact.

6.8 Show that any bijective continuous map $f: (X,\tau) \to (Y,\tau_1)$, where $(X,\tau)$ is compact and $(Y,\tau_1)$ is Hausdorff, is a homeomorphism. [Hint: Make use of Proposition 4.3.]

6.9 Let $\{M_1(k),\ldots,M_{j_k}(k), S_k\}$ be a cover of a set $X$ for each $k = 1,\ldots,n$. Show that $\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } \bigcap_{k=1}^{n} S_k\}$ is also a cover of $X$.

6.10 Let $(X,\tau)$ be a Hausdorff topological space. Show that an arbitrary intersection of compact sets in $X$ is compact.

6.11 Show that compactness is weakly hereditary.

6.12 Let $O$ be an open set in $\mathbb{R}^n$. Show that there is a monotone increasing sequence $\{O_k\}$ of open bounded subsets of $O$ such that $\{O_k\} \uparrow O$.

6.13 Is the property of a space to be Hausdorff hereditary of any kind?

6.14 Let $(X,\tau)$ be a Hausdorff space and let $C$ and $K$ be disjoint compact sets. Show that there are disjoint open supersets, $U$ and $V$, of $C$ and $K$, respectively.

6.15 A topological space $(X,\tau)$ is called countably compact if any countable open cover of $X$ has a finite subcover. Prove that the following are equivalent:

(i) $(X,\tau)$ is countably compact.
(ii) Each countable family of closed sets in $(X,\tau)$ with the finite intersection property (i.e. the intersection of any finite subfamily is nonempty) has a nonempty intersection.
(iii) Every countably infinite subset $A$ of $X$ has a point $x$ with the property that each neighborhood of $x$ contains infinitely many points of $A$.
(iv) Every sequence in $X$ has a closure point.
NEW TERMS: hereditary property; weakly hereditary property; vaguely hereditary property; Moore plane; compact set; Lindelöf set; compact topological space; Lindelöf topological space; compact set under a continuous map; compact sets in Hausdorff spaces; Bolzano-Weierstrass compactness; sequential compactness; Alexander Subbase Theorem; Tychonov's Theorem; countably compact topological space.
7. FUNCTION SPACES AND ASCOLI'S THEOREM

7.1 Remarks. (i) Earlier (Example 5.2 (i), Chapter 2), we introduced the space $\mathcal{F}^*(X;\mathbb{R})$ of all bounded real-valued functions on a set $X$, and the metric

$$\rho(f,g) = \sup\{|f(x) - g(x)| : x \in X\}.$$

We called it the uniform metric. This metric is TIH and thus induces the corresponding norm, which we called the supremum norm. (Since $\mathcal{F}^*(X;\mathbb{R})$ is a vector space over $\mathbb{R}$, the norm induced by $\rho$ is legitimate.) It was also shown that $\rho$ is complete and, therefore, $\mathcal{F}^*(X;\mathbb{R})$ is Banach. A generalization of the above metric space is the linear space of all bounded vector functions $f: X \to \mathbb{R}^n$ with the corresponding uniform metric,

$$\rho(f,g) = \sup\{d_e(f(x),g(x)) : x \in X\} \tag{7.1}$$

(TIH too). We will be concerned with a similar metric space of all continuous vector functions $f: X \to \mathbb{R}^n$ defined on a compact topological space $(X,\tau)$. By Theorem 6.8, the image of $X$ under a continuous function $f$ is compact in the corresponding image space. Since this image space is $(\mathbb{R}^n,\tau_e)$, by the Heine-Borel theorem, $f_*(X)$ is bounded. Thus the uniform metric (7.1) is a valid metric here too. We denote this metric space by $(\mathcal{C}(X;\mathbb{R}^n),\rho)$.

(ii) Observe that metric (7.1) can be generalized to the space of all continuous functions defined on a compact topological space $(X,\tau)$ and valued in an arbitrary metric space $(Y,d)$. Again, the continuous image of $X$ is compact in $(Y,\tau(d))$ and, according to Theorem 6.7, Chapter 2, it is closed and bounded. Hence, we are able to define the uniform metric (induced by the metric $d$) by

$$\rho(f,g) = \sup\{d(f(x),g(x)) : x \in X\}. \tag{7.1a}$$
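As a rough numerical illustration of the uniform metric (7.1a), the supremum can be approximated from below by a maximum over a finite sample of $X$. The sketch below assumes $X = [0,1]$ and $Y = \mathbb{R}$ with $d(u,v) = |u - v|$; the names `uniform_distance` and `grid` are hypothetical helpers introduced only for this illustration, and a finite grid yields only a lower bound for $\rho$.

```python
import math

def uniform_distance(f, g, points, d=lambda u, v: abs(u - v)):
    """Lower bound for rho(f, g) = sup{d(f(x), g(x)) : x in X},
    obtained by taking the maximum over a finite sample of X."""
    return max(d(f(x), g(x)) for x in points)

# Assumed setting: X = [0, 1], sampled on a uniform grid.
grid = [k / 1000 for k in range(1001)]

# sin is increasing on [0, 1], so rho(sin, 0) = sin(1); the grid contains x = 1.
dist = uniform_distance(math.sin, lambda x: 0.0, grid)
```

For functions whose largest deviation is attained at a grid point, as here, the sampled maximum coincides with $\rho$; in general, a finer grid or an analytic argument is needed.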
Specifically, if $d$ is TIH on a linear space $Y$ over a field $\mathbb{F}$, then $\rho$ defines a norm,

$$\|f\|_\rho = \sup\{\|f(x)\|_d : x \in X\},$$

on $\mathcal{C}(X;Y)$ (where $\|\cdot\|_d$ is the norm generated by the TIH metric $d$ and $\|\cdot\|_\rho$ is the norm generated by the metric $\rho$), i.e. the supremum norm.

(iii) If $(X,\tau)$ is not compact, instead of $\mathcal{C}(X;Y)$, we consider the subspace $\mathcal{C}^*(X;Y)$ of all continuous, bounded functions and define the uniform metric on $\mathcal{C}^*(X;Y)$.

7.2 Definition. Let $(X,\tau)$ be a compact topological space and let $(Y,\tau(d))$ be a metrizable space. Denote by $\mathcal{C}(X;Y)$ the (linear) space of all continuous functions from $(X,\tau)$ to $(Y,\tau(d))$. A sequence $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ is said to converge uniformly to a function $f \in (\mathcal{C}(X;Y),\rho)$ if $\rho(f_n,f) \to 0$ as $n \to \infty$. ($\rho$ is the uniform metric defined by (7.1a).) If $d$ is TIH, then the corresponding condition becomes

$$\|f_n - f\|_\rho \to 0 \quad \text{as } n \to \infty.$$

The following lemma generalizes the classical result in analysis that the limit of a uniformly convergent sequence of continuous functions is continuous. Below we assume that $(Y,\|\cdot\|)$ is an NLS over a field $\mathbb{F}$, which is $\mathbb{R}$ or $\mathbb{C}$.
7.3 Lemma. Let $(X,\tau)$ be a compact topological space and let $(Y,\|\cdot\|)$ be an NLS over a field $\mathbb{F}$. Let $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ converge uniformly to a function $f$. Then $f \in \mathcal{C}(X;Y)$. □

Although the burden of proving this lemma and many other statements in this section is passed on to the reader, the associated problems are mostly provided with detailed hints. (See Problem 7.1.)

7.4 Theorem. Let $Y$ be a linear space over a field $\mathbb{F}$ and let $\|\cdot\|_d$ be a norm on $Y$ induced by a translation-invariant metric $d$ on $Y$. If $(Y,\|\cdot\|_d)$ is a Banach space, then the space $(\mathcal{C}(X;Y),\|\cdot\|_\rho)$ is also Banach.
(See Problem 7.2.)
7.5 Example. The following special case frequently occurs in applications. Since $Y = \mathbb{R}^n$, with the Euclidean norm $\|\cdot\|_e$, is a Banach space, by Theorem 7.4, $(\mathcal{C}(X;\mathbb{R}^n),\|\cdot\|_\rho)$ is also a Banach space. □

Theorem 7.4 can be generalized as follows.

7.6 Theorem. Let $(X,\tau)$ be a compact topological space and let $(Y,d)$ be a complete metric space. Then the uniform metric space $(\mathcal{C}(X;Y),\rho)$ is also complete. □

(See Problem 7.3.)

In many problems related to differential equations or complex analysis, it is of interest to have a criterion under which a closed and bounded subset of $(\mathcal{C}(X;Y),\rho)$ is sequentially (or Bolzano-Weierstrass) compact.
Hence, we need to know whether this subset is compact under the uniform metric. While a set may often be closed and bounded, this is insufficient for compactness, in contrast with the Heine-Borel Theorem for Euclidean spaces. Below, an additional condition, known as equicontinuity, is used to characterize compactness of sets of continuous functions. Equicontinuity was introduced by the Italian mathematician Giulio Ascoli in 1884 and it is regarded as one of the fundamental concepts in the theory of real functions. Ascoli's Theorem was generalized by his fellow countryman Cesare Arzelà in 1889, and it led to a very practical sequential compactness criterion for sets of functions, often referred to as the Ascoli-Arzelà Theorem.
7.7 Definition. Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a metric space. A subset of functions $\mathcal{F} \subseteq \mathcal{C}(X;Y)$ is said to be ($d$-)equicontinuous at $x_0 \in X$ if, for each $\varepsilon > 0$, there is a neighborhood $U_{x_0}$ of $x_0$ such that for each $x \in U_{x_0}$ and $f \in \mathcal{F}$, $d(f(x),f(x_0)) < \varepsilon$. The subset $\mathcal{F}$ is called ($d$-)equicontinuous if it is equicontinuous at each point of $X$. □
7.8 Theorem (Ascoli). Let $(X,\tau)$ be a compact topological space and let $(\mathcal{C}(X;\mathbb{R}^n),\rho)$ be the function space endowed with the uniform metric $\rho$. A subset $\mathcal{F} \subseteq (\mathcal{C}(X;\mathbb{R}^n),\rho)$ is compact if and only if it is closed, bounded and $d_e$-equicontinuous.

The proof of Ascoli's theorem is based on the following two lemmas.

7.9 Lemma. Let $(X,\tau)$ be a compact topological space and let $(Y,d)$ be a metric space. If a subset $\mathcal{F} \subseteq (\mathcal{C}(X;Y),\rho)$ is totally bounded in $(\mathcal{C}(X;Y),\rho)$, then $\mathcal{F}$ is $d$-equicontinuous on $X$. (See Problem 7.4.)

7.10 Lemma. Let $(X,\tau)$ be a compact topological space, $(Y,d)$ be a totally bounded metric space, and $\mathcal{F} \subseteq \mathcal{C}(X;Y)$ be any $d$-equicontinuous subset. Then $\mathcal{F}$ is totally bounded. (See Problem 7.5.)
Proof of Ascoli's Theorem. (i) If $\mathcal{F}$ is compact, it is closed and bounded by Theorem 6.7, Chapter 2, with no further restrictions. In this case, we have to prove that $\mathcal{F}$ is $d_e$-equicontinuous. We first show that since $\mathcal{F}$ is bounded, there is a compact subset $Y \subseteq \mathbb{R}^n$ such that, for all $x \in X$ and for all $f \in \mathcal{F}$, $f(x) \in Y$. Let $f_0 \in \mathcal{F}$. Since $f_0$ is continuous, by Theorem 6.8, $f_{0*}(X)$ is a compact subset of $\mathbb{R}^n$. In other words, $f_{0*}(X)$ is closed and bounded. Hence, there is an open ball $B_{d_e}(\theta, R)$, $\theta = (0,\ldots,0)$, such that $f_{0*}(X) \subseteq B_{d_e}(\theta,R)$. On the other hand, since $\mathcal{F}$ is bounded, there is an $M \geq 0$ such that $\rho(f_0,f) < M$, $\forall f \in \mathcal{F}$. Thus for all $f \in \mathcal{F}$ and $x \in X$,

$$d_e(f(x),\theta) \leq d_e(f(x),f_0(x)) + d_e(f_0(x),\theta) < M + R,$$

and now the closed ball $\bar{B}_{d_e}(\theta, R+M)$ can be taken for $Y$. Hence, $(Y,d_e)$ is a compact subspace of $(\mathbb{R}^n,d_e)$ such that each $f \in \mathcal{F}$ is valued in $Y$. By compactness, $\mathcal{F}$ is totally bounded (see Theorem 6.14, Chapter 2), and we conclude that $\mathcal{F}$ is $d_e$-equicontinuous by Lemma 7.9.

(ii) Let $\mathcal{F}$ be closed, bounded, and $d_e$-equicontinuous. As a closed subset of the complete metric space $(\mathcal{C}(X;\mathbb{R}^n),\rho)$ (Example 7.5), $(\mathcal{F},\rho)$ is complete (see Theorem 5.1, Chapter 2). Since $\mathcal{F}$ is bounded, by the above argument in (i), all functions of $\mathcal{F}$ are valued in a compact subspace $Y$ of $(\mathbb{R}^n,d_e)$. Now, $X$ and $Y$ are compact and $\mathcal{F}$, by the assumption, is $d_e$-equicontinuous. By Lemma 7.10, we conclude that $\mathcal{F}$ is totally bounded. Finally, being complete and totally bounded, $\mathcal{F}$ is compact by Theorem 6.14, Chapter 2. □
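7.11 Examples. For the following examples we denote by $\mathcal{C}^{(1)}(X;Y)$ the space of all differentiable functions with uniformly bounded derivatives.

(i) Let $X = Y = \mathbb{R}$. Then $\mathcal{C}^{(1)}(\mathbb{R};\mathbb{R})$ is an equicontinuous family. Indeed, for every $f \in \mathcal{C}^{(1)}(\mathbb{R};\mathbb{R})$, $|f'| \leq M$. Let $\varepsilon > 0$ and $x \in \mathbb{R}$. Then for all $y \in \mathbb{R}$ such that $|x - y| < \varepsilon/M$, we have, by the mean value theorem,

$$|f(x) - f(y)| = |f'(\xi)|\,|x - y| \leq M\,|x - y| < \varepsilon$$

for some $\xi$ between $x$ and $y$.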
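The mean value theorem argument of Example (i) can be checked numerically on a concrete family. The sketch below assumes the hypothetical family $f(x) = \sin(x + s)$, $s \in \mathbb{R}$, whose derivatives are bounded by $M = 1$; the helper names `family` and `delta_for` are introduced only for this illustration. The point it shows is that $\delta = \varepsilon/M$ depends neither on the function nor on the point $x$.

```python
import math
import random

M = 1.0  # common bound for |f'| over the whole (assumed) family

def family(shift):
    """Member f(x) = sin(x + shift); |f'(x)| = |cos(x + shift)| <= M."""
    return lambda x: math.sin(x + shift)

def delta_for(eps, bound=M):
    """The delta of the example: it depends on eps and M only."""
    return eps / bound

random.seed(0)
eps = 1e-2
delta = delta_for(eps)
worst = 0.0
for _ in range(10_000):
    f = family(random.uniform(-10.0, 10.0))
    x = random.uniform(-100.0, 100.0)
    y = x + 0.999 * random.uniform(-delta, delta)  # keeps |x - y| < delta
    worst = max(worst, abs(f(x) - f(y)))
```

After the loop, `worst` stays below `eps` for every sampled function and point, as the estimate $|f(x) - f(y)| \leq M|x - y|$ predicts. A random sample is of course only a sanity check, not a proof.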
(ii) Let $X = [a,b]$, $Y = \mathbb{R}$, and let $\mathcal{F}$ be the subspace of $\mathcal{C}^{(1)}(X;Y)$ consisting of all uniformly bounded functions. We wish to show that $(\mathcal{F},\rho)$ is compact. By Example (i), $\mathcal{F}$ is equicontinuous. Clearly, $(\mathcal{F},\rho)$ is bounded, since the diameter of $\mathcal{F}$ is

$$\operatorname{diam} \mathcal{F} = \sup\{\rho(f,g) : f,g \in \mathcal{F}\} \leq 2N,$$

where $N$ is defined as the common bound for all $f \in \mathcal{F}$. Furthermore, we argue that $(\mathcal{F},\rho)$ is closed. Since a subset of a metric space is closed if and only if it contains all of its limit points, we select an arbitrary convergent sequence $\{f_n\} \subseteq \mathcal{F}$ and show that its limit is a function which

1) is differentiable,
2) is bounded by $N$,
3) has its derivative bounded by $M$.

The first statement follows from the known fact in analysis that a uniformly convergent sequence $\{f_n\}$ of differentiable functions has as its limit a differentiable function $f$ with $\rho\text{-}\lim f_n' = f'$, provided the derivatives $f_n'$ converge uniformly as well. The other two statements can be easily verified.

There is another version of Ascoli's Theorem frequently used in applications. It is based on the result of Problem 7.14: If $(\mathcal{F},\rho) \subseteq (\mathcal{C}(X;\mathbb{R}^n),\rho)$ is equicontinuous and bounded, then its closure $(\bar{\mathcal{F}},\rho)$ is also equicontinuous and bounded. We will need another definition. Any subset of a topological space is called relatively compact if its closure is compact. For instance, if $\mathcal{F}$ is a sequence of continuous functions (which need not be closed), we might be interested in whether or not it has a convergent subsequence, i.e. if $\mathcal{F}$ is sequentially compact or, equivalently, if $\mathcal{F}$ is relatively compact. Now, with the use of Problem 7.14, the following version of Ascoli's Theorem holds.

7.12 Theorem (Ascoli). Let $\mathcal{F}$ be a subset of a uniform metric space $(\mathcal{C}(X;\mathbb{R}^n),\rho)$. Then $\mathcal{F}$ is relatively compact if and only if $\mathcal{F}$ is bounded and equicontinuous.
A more general version of Ascoli's Theorem for a subset $\mathcal{F} \subseteq \mathcal{C}(X;Y)$, where $Y$ is a Banach space, requires a finer condition imposed on $\mathcal{F}$.

7.13 Theorem (Ascoli). Let $\mathcal{F}$ be a subset of a uniform normed linear space $(\mathcal{C}(X;Y),\sup\|\cdot\|)$, where $(Y,\|\cdot\|)$ is a Banach space (over $\mathbb{F}$). Then $\mathcal{F}$ is relatively compact with respect to $\sup\|\cdot\|$ if and only if $\mathcal{F}$ is equicontinuous and, for every $x \in X$, the set

$$\mathcal{F}(x) = \{f(x) \in Y : f \in \mathcal{F}\}$$

is relatively compact in $(Y,\|\cdot\|)$. □

(See Problem 7.16.) There are many other versions of Ascoli's Theorem, known from textbooks and research papers, that have led to special applications. For instance, consider Arzelà's Theorem (see Problem 7.15). To work with some of the problems below we need the notion of pointwise boundedness.
7.14 Definition. A collection $\mathcal{F} \subseteq \mathcal{C}(X;(Y,d))$ of functions is called pointwise bounded if, for every $x \in X$, the set $\mathcal{F}(x) = \{f(x) : f \in \mathcal{F}\}$ is bounded, i.e. for each $x \in X$, there is a positive real number $M_x$ such that $d(f(x),g(x)) \leq M_x$ for each pair $f, g \in \mathcal{F}$. □

Recall that a collection $\mathcal{F} \subseteq (\mathcal{C}(X;Y),\rho)$ is uniformly bounded if there is a positive real number $M$ such that $\rho(f,g) \leq M$, $\forall f, g \in \mathcal{F}$.
PROBLEMS

7.1 Prove Lemma 7.3. [Hint: Make use of the inequality

$$\|f(x) - f(x_0)\| \leq \|f(x) - f_k(x)\| + \|f_k(x) - f_k(x_0)\| + \|f_k(x_0) - f(x_0)\|$$

and continuity of $f_k$ in the form of Problem 4.8.]

7.2 Prove Theorem 7.4. [Hint: Use Problems 4.6-4.9 and apply Lemma 7.3.]

7.3 Prove Theorem 7.6. [Hint: Show first the validity of the statement similar to Lemma 7.3: Under the conditions of Theorem 7.6, if a sequence $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ converges uniformly to a function $f$, then $f \in \mathcal{C}(X;Y)$.]

7.4 Prove Lemma 7.9. [Hint: Let $\mathcal{F}$ be totally bounded; show that $\mathcal{F}$ is equicontinuous at any fixed point $x_0 \in X$.

1) Choose any $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon \geq 2\delta_1 + \delta_2$.

2) Cover $\mathcal{F}$ by balls $B_\rho(f_i,\delta_1)$, $i = 1,\ldots,n$ [call the $n$-tuple $\{f_1,\ldots,f_n\}$ a $\delta_1$-net].

3) Use continuity of each $f_i$ at $x_0$ in the form of Problem 4.7 b): for each $\delta_2 > 0$, there is a neighborhood $U^{(i)}_{x_0}$ of $x_0$ with $d(f_i(x),f_i(x_0)) < \delta_2$ for all $x \in U^{(i)}_{x_0}$.

4) Choose a neighborhood $U_{x_0}$ good for all $f_i$'s.

5) Let $f$ be any function in $\mathcal{F}$; thus $f$ falls into one of the balls in 2), say $B_\rho(f_i,\delta_1)$.

6) Use the estimate

$$d(f(x),f(x_0)) \leq d(f(x),f_i(x)) + d(f_i(x),f(x_0)),$$

where the first term of the right-hand side of the inequality is less than $\delta_1$ (why?), and the second term is dominated by

$$d(f_i(x),f_i(x_0)) + d(f_i(x_0),f(x_0)) < \delta_2 + \delta_1.$$

(The estimate needed then follows.)]

7.5 Prove Lemma 7.10. [Hint: Choose $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$. Show that there exists an $\varepsilon$-net $\{f_1,\ldots,f_N\} \subseteq \mathcal{F}$. Use the steps that follow.

1) Use equicontinuity of $\mathcal{F}$ and compactness of $X$ to show that, for every $\delta_1 > 0$, there is a finite open cover (by neighborhoods) $\{U_{x_1}(\delta_1),\ldots,U_{x_n}(\delta_1)\}$ of $X$, such that for any $f \in \mathcal{F}$ and for any $y$ that falls into a neighborhood $U_{x_i}(\delta_1)$, $d(f(y),f(x_i)) < \delta_1$.

2) Cover $Y$ by a finite collection $\{B(j)\}$ of $d$-balls, such that $B(j) = B_d(y_j,\delta_2)$, $j = 1,\ldots,m$.

3) Let $\Gamma$ be the collection of all integer functions $\gamma: \{1,\ldots,n\} \to \{1,\ldots,m\}$. Let $\Gamma'$ be a subset of $\Gamma$ with the following property: an element $\gamma \in \Gamma$ belongs to $\Gamma'$ if and only if there is a function $f \in \mathcal{F}$ such that $f(x_i) \in B(\gamma(i))$, $i = 1,\ldots,n$. Let $|\Gamma'| = N$. Then order the elements of $\Gamma'$ and the functions assigned to $\Gamma'$ by $\{1,\ldots,N\}$, so that $\Gamma' = \{\gamma_1,\ldots,\gamma_N\}$ and $\mathcal{F}' = \{f_1,\ldots,f_N\}$. Show that $\mathcal{F}'$ is a relevant $\varepsilon$-net.

4) Let $f \in \mathcal{F}$. Show that for this $f$ there is an element of $\Gamma'$, say $\gamma_j$, such that if $f(x_i) \in B(\gamma_j(i))$, $i = 1,\ldots,n$, then $d(f(x_i),f_j(x_i)) < \delta_2$, $i = 1,\ldots,n$.

5) Show that for all $x \in X \setminus \{x_1,\ldots,x_n\}$,

$$d(f(x),f_j(x)) < 2\delta_1 + \delta_2,$$

by using the triangle inequality and the inequality in 1).

6) Show that the inequality in 5) implies the desired inequality $\rho(f,f_k) < \varepsilon$ for some $k \in \{1,\ldots,N\}$ and therefore $\{f_1,\ldots,f_N\}$ is indeed an $\varepsilon$-net in $\mathcal{F}$.]

7.6 Prove the following: Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a complete metric space. If $\mathcal{C}^*(X;Y)$ is the subspace of continuous bounded functions, then $(\mathcal{C}^*(X;Y),\rho)$ is a uniform complete metric space.

7.7 Prove the statement: If $\mathcal{F} \subseteq \mathcal{C}(X;Y)$ is an equicontinuous family, then so is its uniform closure $\bar{\mathcal{F}}$.

7.8 Prove Dini's Theorem: Let $(X,\tau)$ be a compact topological space. Consider the space $(\mathcal{C}(X;\mathbb{R}),\rho)$. Let $\{f_n\}$ be a monotone sequence from $\mathcal{C}(X;\mathbb{R})$ such that $\{f_n\}$ converges to a continuous function $f \in \mathcal{C}(X;\mathbb{R})$ in the topology of pointwise convergence. Then $\{f_n\}$ converges to $f$ in $\rho$ also.

7.9 Let $\mathcal{P}^n$ be the set of all polynomials defined on $[0,1]$ with degrees less than or equal to $n$ and with all real coefficients bounded by a positive constant. Show that $(\mathcal{P}^n,\rho)$ is compact.

7.10 Let $\mathcal{F} \subseteq \mathcal{C}(X;Y)$, where $X$ is a compact topological space and $Y$ is a metric space. Show that if $\mathcal{F}$ is equicontinuous and pointwise bounded, then it is uniformly bounded in $(\mathcal{C}(X;Y),\rho)$.

7.11 Let $\mathcal{F} \subseteq \mathcal{C}(X;\mathbb{R}^n)$ and let $X$ be compact. Show that $\mathcal{F}$ is relatively compact if and only if it is equicontinuous and pointwise bounded.

7.12 Let $\mathcal{F}$ be the set of functions

Show that the set $(\mathcal{F},\rho)$ is sequentially compact.

7.13 Let $\mathcal{F}$ be a sequence of functions with $f_n(x) = b_n \cos x$, $b_n = 1 + \frac{1}{n}$, $n = 1,2,\ldots$, and $f_0(x) = \cos x$. Show that $(\mathcal{F},\rho)$ is compact.

7.14 Let $\mathcal{F}$ be a subset of $(\mathcal{C}(X;\mathbb{R}^n),\rho)$. Show that the uniform closure $(\bar{\mathcal{F}},\rho)$ is equicontinuous and bounded if and only if $(\mathcal{F},\rho)$ is equicontinuous and bounded.

7.15 Prove Arzelà's Theorem: Let $X$ be compact and let $\{f_k\} \subseteq \mathcal{C}(X;\mathbb{R}^n)$ be a pointwise bounded and equicontinuous sequence of functions. Show that $(\{f_k\},\rho)$ is sequentially compact.

7.16 Prove Theorem 7.13.
NEW TERMS: uniform metric; uniform metric space; supremum norm; space of all continuous functions; space of all continuous bounded functions; uniform convergence; uniform convergence, criterion of; completeness of a uniform metric space; equicontinuity at a point; equicontinuity on a set; Ascoli's Theorem; equicontinuity, criterion of; total boundedness, criterion of; relative compactness; pointwise bounded set of functions; uniformly bounded set of functions; Dini's Theorem; Arzelà's Theorem.
8. STONE-WEIERSTRASS APPROXIMATION THEOREM

Let $(\Omega,\tau)$ be a topological space, $X$ be a compact subset, and let $\mathcal{A} \subseteq \mathcal{C}(X;\mathbb{R})$ be a subspace (over $\mathbb{R}$) of real-valued continuous functions on $X$ that also contains products $f \cdot g$ of functions from $\mathcal{A}$. Each continuous function on a compact set is bounded, as we know from Theorem 6.8. We will use the uniform metric $\rho$ introduced in the previous section:

$$\rho(f,g) = \sup\{|f(x) - g(x)| : x \in X\}.$$

Since $\mathcal{C}(X;\mathbb{R})$ is complete (Example 7.5 or Theorem 7.4), $\bar{\mathcal{A}} \subseteq \mathcal{C}(X;\mathbb{R})$. We ask under what condition $\bar{\mathcal{A}} = \mathcal{C}(X;\mathbb{R})$, i.e., under what condition each continuous function can be "uniformly approximated" by elements of $\mathcal{A}$. For instance, if $\mathcal{A}$ is the set of all polynomials, can a continuous function be uniformly approximated by a sequence of polynomials? It is known from calculus that every function analytic at a point can be uniformly approximated in a vicinity of this point by a sequence of polynomials (Taylor's theorem). In 1885, the German mathematician Karl Weierstrass established a more general result (also known from calculus), which states that every continuous function defined on a compact interval $X$ can be uniformly approximated by polynomials. Finally, in 1937, the American mathematician Marshall H. Stone generalized the classical Weierstrass Theorem, allowing $X$ to be a compact topological space with some minor restriction on the subspace $\mathcal{A}$. For all necessary preliminaries the reader is referred to the beginning of Section 7, Chapter 1. We will start with some auxiliary results, rendered in a few steps (Lemmas 8.4 and 8.5), that lead to the Stone-Weierstrass Approximation Theorem.

8.1 Remark. Compactness of the topological space $(X,\tau)$ discussed above is not a mandatory prerequisite for defining the uniform metric, if we consider $\mathcal{C}^*(X;Y)$, the subspace of all $d$-bounded continuous functions from $(X,\tau)$ to a complete metric space $(Y,d)$. The uniform metric $\rho$ is also well-defined on $\mathcal{C}^*(X;Y)$. Completeness of $(\mathcal{C}^*(X;Y),\rho)$ is then due to Theorem 7.6 (where only boundedness of the functions of $\mathcal{C}(X;Y)$ on the compact space $X$ is essential).

8.2 Definitions.
(i) Let $\mathcal{G}$ be a family of functions defined on a set $X$. Then $\mathcal{G}$ separates points of $X$ if for each $x$ and $y$ from $X$ such that $x \neq y$, there is a function $f \in \mathcal{G}$ such that $f(x) \neq f(y)$.

(ii) Let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be an arbitrary nonempty subcollection of continuous, bounded functions on $X$ and let $\mathcal{A}$ be any subalgebra of $\mathcal{C}^*(X;\mathbb{R})$ containing $\mathcal{G}$. The intersection of all subalgebras containing $\mathcal{G}$ is obviously a subalgebra (see Problem 8.1); moreover, it is the smallest subalgebra containing $\mathcal{G}$. It is denoted by $\mathcal{A}(\mathcal{G})$ and is called the subalgebra generated by $\mathcal{G}$. The subcollection $\mathcal{G}$ is called the generator of this subalgebra. □
8.3 Theorem (Stone-Weierstrass). Let $X$ be a compact subset of a topological space $(\Omega,\tau)$ and let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$. If $\mathcal{G}$ separates points and contains the unity 1 (i.e. the function identically equal to 1), then the subalgebra $\mathcal{A}(\mathcal{G})$ generated by $\mathcal{G}$ is dense in $\mathcal{C}^*(X;\mathbb{R})$ relative to the uniform metric $\rho$. [Observe that, if needed, the condition "$\mathcal{G}$ separates points" can be replaced by the weaker condition "$\mathcal{A}(\mathcal{G})$ separates points."]
A few lemmas will precede the proof of Theorem 8.3.
8.4 Lemma. For each $\varepsilon > 0$, there is a polynomial $P(t)$ such that

$$\big|\,P(t) - |t|\,\big| < \varepsilon \quad \text{for all } t \in [-1,1].$$

Proof. Let $\sum_{n=0}^{\infty} b_n z^n$ be the binomial expansion of the function $(1+z)^\alpha$ for $\alpha \in \mathbb{Q}$ and $z \in \mathbb{C}$. Recall that this function can be expanded in the binomial series, where the coefficient $b_n$ is given by the formula

$$b_n = \alpha(\alpha-1)\cdots(\alpha-n+1)/n!,\ n \geq 1, \quad \text{and} \quad b_0 = 1. \tag{8.4}$$

The binomial series is convergent in the open ball $B(0,1) \subseteq \mathbb{C}$, and at the point $z = (-1,0)$, for $\alpha > 0$, it is absolutely convergent as a special case of a hypergeometric series. Thus, by the Weierstrass M-test (since $|b_n x^n| \leq |b_n|$ for $|x| \leq 1$), the series $\sum_{n=0}^{\infty} b_n x^n$ with coefficients given by (8.4) is uniformly convergent to the function $(1+x)^\alpha$, at least for all $x \in [-1,0]$. Letting $\alpha = \frac{1}{2}$ and replacing $x$ by $-x$, we arrive at the series $\sum_{n=0}^{\infty} b'_n x^n$, which is uniformly convergent to $\sqrt{1-x}$, $\forall x \in [0,1]$, where $b'_n = (-1)^n b_n$. The statement now follows if we set $x = 1 - t^2$, where $t \in [-1,1]$. The series

$$\sum_{n=0}^{\infty} b'_n (1 - t^2)^n$$

converges uniformly to $|t|$, $\forall t \in [-1,1]$, with $b'_n = (-1)^n b_n$; and the partial sums of the series are polynomials.
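8.5 Lemma. Let $(X,\tau)$ be a topological space, $\mathcal{A} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be a subalgebra, and $(\mathcal{C}^*(X;\mathbb{R}),\rho)$ be a uniform metric space. Then the closure $\bar{\mathcal{A}}$ relative to $\rho$ is a subalgebra and, in addition, $\bar{\mathcal{A}}$ is a vector lattice, i.e., $\forall f, g \in \bar{\mathcal{A}}$, $f \wedge g$ and $f \vee g \in \bar{\mathcal{A}}$.

Proof. By Problem 8.2 a), $\bar{\mathcal{A}}$ is a subalgebra. We need to show that $\bar{\mathcal{A}}$ is a lattice. Because of

$$a \wedge b = \frac{(a+b) - |a-b|}{2} \quad \text{and} \quad a \vee b = \frac{(a+b) + |a-b|}{2},$$

it suffices to show that with $f \in \bar{\mathcal{A}}$, $|f| \in \bar{\mathcal{A}}$. Since $f$ is a continuous, bounded function on $X$, $|f| \leq M$ for some $M > 0$ and $|g| \leq 1$, where

$$g = \frac{f}{M}.$$

Then, by Lemma 8.4, for every $\varepsilon > 0$, there is a polynomial $P$ such that for all $x \in X$,

$$\big|\,P(g(x)) - |g(x)|\,\big| < \varepsilon. \tag{8.5}$$

Since $\bar{\mathcal{A}}$ is an algebra, $P(g) \in \bar{\mathcal{A}}$. From inequality (8.5), we have that for each $\varepsilon > 0$, the ball $B_\rho(|g|,\varepsilon)$ meets $\bar{\mathcal{A}}$, implying that $|g| \in \bar{\mathcal{A}}$. Hence, $|f| = M|g| \in \bar{\mathcal{A}}$ (see Problem 1.16). Finally, the statement of the lemma follows from the linearity of $\bar{\mathcal{A}}$.

Now we return to the Stone-Weierstrass Theorem.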
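Proof (of the Stone-Weierstrass Theorem). We will show that each function $f \in \mathcal{C}^*(X;\mathbb{R})$ can be approximated by functions from $\mathcal{A} = \mathcal{A}(\mathcal{G})$ relative to the uniform metric. By the assumption, $\mathcal{G}$ separates points, i.e., $\forall x_1 \neq x_2 \in X$, there exists a function $g \in \mathcal{G}$ such that $g(x_1) \neq g(x_2)$. Define, for fixed $\alpha, \beta \in \mathbb{R}$, the auxiliary function

$$h(x) = \alpha + (\beta - \alpha)\,\frac{g(x) - g(x_1)}{g(x_2) - g(x_1)}, \quad x \in X,$$

which belongs to $\mathcal{A}$, because $1 \in \mathcal{A}$. Thus, $\forall x_1 \neq x_2 \in X$ and $\forall \alpha, \beta \in \mathbb{R}$, there is an $h \in \mathcal{A}$ such that $h(x_1) = \alpha$ and $h(x_2) = \beta$. Let $f \in \mathcal{C}^*(X;\mathbb{R})$. Then by the above argument, $\forall x \neq y$, there is an $h_{xy} \in \mathcal{A}$ with the property that

$$h_{xy}(x) = \alpha \quad \text{and} \quad h_{xy}(y) = \beta,$$

where

$$\alpha = f(x) \quad \text{and} \quad \beta = f(y).$$

Fix an $x$ and let $y$ be arbitrary. Since $f - h_{xy}$ is continuous at $y$ and $f(y) - h_{xy}(y) = 0$, $\forall \varepsilon > 0$, there is an open neighborhood $B(y)$ of $y$ such that

$$|f(z) - h_{xy}(z)| < \varepsilon, \quad \forall z \in B(y). \tag{$*$}$$

Now, we cover $X$ by $\{B(y) : y \in X\}$ and, by compactness of $X$, reduce this cover to a finite subcover $\{B(y_1),\ldots,B(y_n)\}$. Let the associated functions, with the above properties in vicinities of $y_1,\ldots,y_n$, be

$$h_{xy_1},\ldots,h_{xy_n},$$

respectively, and let $h_x = \min\{h_{xy_1},\ldots,h_{xy_n}\}$ on $X$. By Lemma 8.5, $h_x \in \bar{\mathcal{A}}$. By ($*$), $\forall \varepsilon > 0$,

$$h_{xy_i}(z) < f(z) + \varepsilon, \quad \forall z \in B(y_i),\ i = 1,\ldots,n,$$

which implies that

$$h_x(z) < f(z) + \varepsilon, \quad \forall z \in X. \tag{$**$}$$

Observe that the above inequalities, along with their parameters, depend upon a fixed $x \in X$. Notice that $h_x$ does not really approximate $f$ on $X$; it just approximates $f$ in a vicinity of the point $x$. Indeed, $f(x) = h_x(x)$, and $h_x$ and $f$ satisfy inequality ($**$). By continuity of $f - h_x$, for each $\varepsilon > 0$, there is an open neighborhood $B(x)$ of $x$ such that $|f(z) - h_x(z) - 0| < \varepsilon$, $\forall z \in B(x)$. Again, let us cover $X$ by the collection $\{B(x) : x \in X\}$ and then reduce the latter to a finite subcover $\{B(x_1),\ldots,B(x_k)\}$. Correspondingly, $\forall \varepsilon > 0$,

$$f(z) - \varepsilon < h_{x_i}(z), \quad \forall z \in B(x_i),\ i = 1,\ldots,k.$$

Then $h = \max\{h_{x_1},\ldots,h_{x_k}\} \in \bar{\mathcal{A}}$ by similar considerations, and hence

$$f(z) - \varepsilon < h(z), \quad \forall z \in X.$$

Furthermore, ($**$) yields that

$$h(z) < f(z) + \varepsilon, \quad \forall z \in X,$$

and

$$|f(z) - h(z)| < \varepsilon, \quad \forall z \in X.$$

From the last inequality we have that any function $f \in \mathcal{C}^*(X;\mathbb{R})$ is approximated by elements of $\bar{\mathcal{A}}$, i.e. $\forall \varepsilon > 0$, $B_\rho(f,\varepsilon) \cap \bar{\mathcal{A}} \neq \emptyset$ which, due to Problem 1.16, implies that $B_\rho(f,\varepsilon) \cap \mathcal{A} \neq \emptyset$. □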
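The first step of the proof, producing a function in the algebra with prescribed values at two distinct points, can be sketched in a few lines. The sketch assumes the standard choice $h = \alpha + (\beta - \alpha)(g - g(x_1))/(g(x_2) - g(x_1))$, which is a linear combination of $g$ and the constant function 1; the name `two_point_interpolant` and the sample function $g(x) = x^3$ are hypothetical.

```python
def two_point_interpolant(g, x1, x2, alpha, beta):
    """Return h = alpha*1 + (beta - alpha) * (g - g(x1)*1) / (g(x2) - g(x1)),
    so that h(x1) = alpha and h(x2) = beta."""
    g1, g2 = g(x1), g(x2)
    if g1 == g2:
        # g must separate the two points for the construction to work
        raise ValueError("g does not separate x1 and x2")
    return lambda x: alpha + (beta - alpha) * (g(x) - g1) / (g2 - g1)

# Assumed example: g(x) = x^3 separates the points 0.5 and 2.0.
h = two_point_interpolant(lambda x: x ** 3, 0.5, 2.0, alpha=-1.0, beta=4.0)
```

The check `g1 == g2` mirrors the role of the hypothesis that the generator separates points: without it, the denominator vanishes and no such $h$ can be built from $g$ and 1.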
8.6 Corollary (K. Weierstrass). Every real-valued continuous function defined on a compact interval $[a,b]$ can be approximated uniformly by polynomials. (In other words, the algebra $\mathcal{C}([a,b];\mathbb{R})$ of all continuous functions on $[a,b]$ is the closure of the subalgebra $\mathcal{A}$ of all polynomials on $[a,b]$.)

Proof. The subalgebra of all polynomials on $[a,b]$ has $\mathcal{G} = \{1, x\}$ as a generator, which contains 1 and separates points (see Problem 8.3). Therefore, the hypotheses of the Stone-Weierstrass theorem are satisfied. □

In the proof of the classical version of the Stone-Weierstrass theorem we essentially needed a subalgebra $\mathcal{A}(\mathcal{G})$. Indeed, in Lemma 8.5 we made use of the fact that $g \in \bar{\mathcal{A}}$ and $P(g) \in \bar{\mathcal{A}}$ to show that $|g| \in \bar{\mathcal{A}}$ and to claim that $\bar{\mathcal{A}}$ is a vector lattice. Had we assumed that $\mathcal{G}$ is already a vector lattice separating points and containing 1, we would be able to prove the Stone-Weierstrass theorem (special version) without Lemmas 8.4 and 8.5.
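Corollary 8.6 is an existence statement. For a numerical illustration one may use Bernstein polynomials, a different, constructive technique that is not the method of this section. The sketch below assumes the interval $[0,1]$ and the test function $f(x) = |x - \frac{1}{2}|$; the names `bernstein` and `sup_error` are hypothetical, and the supremum is only sampled on a grid.

```python
import math

def bernstein(f, n):
    """n-th Bernstein polynomial of f on [0, 1]:
    B_n f(x) = sum_k f(k/n) C(n, k) x^k (1 - x)^(n - k)."""
    values = [f(k / n) for k in range(n + 1)]
    def B(x):
        return sum(v * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k, v in enumerate(values))
    return B

f = lambda x: abs(x - 0.5)                   # continuous, not differentiable at 1/2
grid = [k / 200 for k in range(201)]         # sample of [0, 1]

def sup_error(n):
    """Sampled uniform distance between f and its n-th Bernstein polynomial."""
    B = bernstein(f, n)
    return max(abs(B(x) - f(x)) for x in grid)

errors = [sup_error(n) for n in (10, 40, 160)]
```

The sampled errors decrease as the degree grows, in line with uniform approximability, although the convergence of Bernstein polynomials is known to be slow.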
8.7 Theorem (Stone-Weierstrass, special version). Let $(X,\tau)$ be a compact topological space and let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be a vector lattice that separates points and contains 1. Then $\mathcal{G}$ is dense in $\mathcal{C}^*(X;\mathbb{R})$.

8.8 Example. Let $\mathcal{G}$ be the collection of all continuous piecewise linear functions on $[0,1]$. Then $\mathcal{G}$ satisfies the hypotheses of Theorem 8.7 and $\bar{\mathcal{G}} = \mathcal{C}([0,1];\mathbb{R})$. In other words, every continuous function on $[0,1]$ can be uniformly approximated by piecewise linear functions.
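Example 8.8 can be illustrated by interpolating a continuous function at equally spaced nodes. This is a minimal sketch: the choice of $f(x) = \sqrt{x}$, the uniform nodes $k/n$, and the name `piecewise_linear` are assumptions made for the illustration, and the supremum is again only sampled on a grid.

```python
import math

def piecewise_linear(f, n):
    """Continuous piecewise linear interpolant of f at the nodes k/n on [0, 1]."""
    nodes = [k / n for k in range(n + 1)]
    values = [f(x) for x in nodes]
    def g(x):
        k = min(int(x * n), n - 1)           # index of a subinterval containing x
        t = (x - nodes[k]) / (nodes[k + 1] - nodes[k])
        return (1 - t) * values[k] + t * values[k + 1]
    return g

f = lambda x: math.sqrt(x)                   # continuous on [0, 1]
grid = [k / 1000 for k in range(1001)]       # sample of [0, 1]
errors = []
for n in (4, 16, 64):
    g = piecewise_linear(f, n)
    errors.append(max(abs(g(x) - f(x)) for x in grid))
```

Refining the partition reduces the sampled uniform error, even though $\sqrt{x}$ has an unbounded derivative at 0, where the deviation is largest.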
PROBLEMS

8.1 Show that $\mathcal{A}(\mathcal{G})$ in Definition 8.2 (ii) is a subalgebra.

8.2 Let $\bar{\mathcal{A}}$ be the closure of a subset $\mathcal{A} \subseteq \mathcal{C}^*(X;\mathbb{R})$ relative to the uniform metric $\rho$. Show that

a) if $\mathcal{A}$ is an algebra, then $\bar{\mathcal{A}}$ is also an algebra;
b) if $\mathcal{A}$ is a vector lattice, then $\bar{\mathcal{A}}$ is also a vector lattice.

8.3 Let $\mathcal{G} = \{f(x) = 1,\ g(x) = x\} \subseteq \mathcal{C}([a,b];\mathbb{R})$, for $a < b \in \mathbb{R}$. Show that $\mathcal{A}(\mathcal{G})$ is the subalgebra of all polynomials on $[a,b]$.

8.4 Let $\mathcal{G}$ be the collection of all continuous, piecewise linear functions on $[0,1]$. Show that $\mathcal{G} \subseteq \mathcal{C}([0,1];\mathbb{R})$ is a vector lattice but not a subalgebra.

8.5 Let $X$ be a compact subset of $\mathbb{R}$. Show that $(\mathcal{C}(X;\mathbb{R}),\rho)$ is separable.

8.6 Let $(X,\tau(d))$ be a compact metrizable topological space. Show that $(\mathcal{C}(X;\mathbb{R}),\rho)$ is separable. [Hint: Use the steps that follow.

1) Let $D = \{d_1,d_2,\ldots\}$ be a countable, dense set in $(X,d)$ (why?). Define $f_n(x) = d(x,d_n)$, $\forall x \in X$.

2) Show that $f_n \in \mathcal{C}(X;\mathbb{R})$.

3) Show that $\{f_n\}$ separates points.

4) Show that the algebra generated by $\{f_n : n = 0,1,\ldots\}$, with $f_0 = 1$, is dense in $\mathcal{C}(X;\mathbb{R})$.]

8.7 Prove the following: Let $X$ be a compact subset of $\mathbb{R}^n$. Then every real-valued continuous function on $X$ can be approximated uniformly by polynomial functions of $n$ variables.

8.8 Can continuous functions on a compact interval be approximated by polynomials with rational coefficients?

8.9 Show that each continuous function on a compact interval can be approximated by a differentiable function.

8.10 Can continuous functions on a compact interval be approximated by polynomials with integer coefficients? Can we apply the Stone-Weierstrass theorem?

8.11 A continuous function $f$ defined on a compact interval $[a,b]$ is called a parabolic spline if there is a partition $\{a_0 = a, a_1, \ldots, a_n = b\}$ of $[a,b]$ (cf. Definition 1.7 (ii), Chapter 1) such that $f$ is a second degree polynomial on each subinterval $[a_i,a_{i+1}]$, $i = 0,\ldots,n-1$. Can continuous functions on $[a,b]$ be approximated by parabolic splines? If so, what version of the Stone-Weierstrass theorem should be applied?

8.12 Consider a subcollection $\mathcal{F}$ of "rational" parabolic splines on $[a,b]$, i.e. piecewise second degree polynomials with rational coefficients. Can continuous functions on $[a,b]$ be approximated by elements of $\mathcal{F}$?
NEW TERMS: set of functions that separates points; subalgebra generated by continuous functions; generator of a subalgebra; Stone-Weierstrass Theorem; binomial series; Weierstrass Theorem; piecewise linear function; subalgebra of polynomials; parabolic spline; rational parabolic spline
9. FILTER AND NET CONVERGENCE
In this section we generalize the concept of convergence of sequences introduced in Section 3. Many problems in topological spaces allow significantly weaker conditions on the linear order of terms in sequences while retaining the principles of convergence. This gives rise to the notion of a net, which is a set indexed by another (partially ordered) set, so that the usual linear order is largely relaxed. One of the prominent applications of convergence of nets is the notion of the Riemann integral, which is known to have inspired the American mathematician Eliakim H. Moore, in his widely cited 1915 paper, Definition of limit in general integral analysis, and in the 1922 paper, A general theory of limits, co-authored with H. L. Smith, to develop the general concept of a net. Filters offer another very useful type of convergence in topological spaces, such as convergence of neighborhoods to a point. The theory of filters was developed in the thirties by the famous Bourbaki group of French mathematicians.
9.1 Definitions.
(i) Let $X$ be a set and let $\mathcal{F} \subseteq \mathcal{P}(X)$ be a nonempty collection of sets. $\mathcal{F}$ is said to be a filter on $X$ if:
a) $\emptyset \notin \mathcal{F}$,
b) for each two sets $F_1, F_2 \in \mathcal{F}$, $F_1 \cap F_2 \in \mathcal{F}$ (in particular, no two elements of $\mathcal{F}$ are disjoint),
c) if $F_1 \in \mathcal{F}$, then any superset $F_2$ of $F_1$ is also an element of $\mathcal{F}$. Clearly, $X \in \mathcal{F}$.
(ii) A collection of subsets $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base on $X$ if:
a) $\emptyset \notin \mathcal{F}_b$,
b) for each two sets $F_1, F_2 \in \mathcal{F}_b$, there is a set $F \in \mathcal{F}_b$ such that $F \subseteq F_1 \cap F_2$ (clearly, $F_1 \cap F_2 \neq \emptyset$).
(iii) Let $\mathcal{F}$ be a filter on $X$. A collection of subsets $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base for the filter $\mathcal{F}$ if:
a) $\mathcal{F}_b \subseteq \mathcal{F}$,
b) each $F \in \mathcal{F}$ is a superset of some $F_b \in \mathcal{F}_b$.
(iv) A filter $\mathcal{F}$ on $X$ is called an ultrafilter if for each subset $A$ of $X$, either $A$ or $A^c$ is in $\mathcal{F}$.
9.2 Remarks.
(i) A filter is obviously a filter base, since we can take $F_1 \cap F_2$ for $F$ in Definition 9.1 (ii) b), with $\mathcal{F}_b = \mathcal{F}$.
(ii) Let $\mathcal{F}_b$ be a filter base on $X$. We can extend $\mathcal{F}_b$ to a filter $\mathcal{F}$ by including in $\mathcal{F}$ additionally all supersets of each $F_b \in \mathcal{F}_b$. Indeed,
a) Let $F_1, F_2 \in \mathcal{F}$. Then there are $F_b^1, F_b^2 \in \mathcal{F}_b$ such that $F_b^1 \subseteq F_1$ and $F_b^2 \subseteq F_2$. Thus, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F_b^1 \cap F_b^2$ $(\subseteq F_1 \cap F_2)$. By definition, $\mathcal{F}$ contains all supersets of elements of $\mathcal{F}_b$; in particular, $F_1 \cap F_2$ is a superset of $F_b$. Consequently, $F_1 \cap F_2 \in \mathcal{F}$.
b) Let $F \in \mathcal{F}$. Then there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F$. Now, $\mathcal{F}$ contains all supersets of $F_b$, thus all supersets of $F$.
Since $\emptyset \notin \mathcal{F}_b$, no element of $\mathcal{F}$ is empty. Therefore, $\mathcal{F}$ is a filter. Note that the above filter $\mathcal{F}$ is the smallest filter containing the filter base $\mathcal{F}_b$ (show it in Problem 9.1); in general, there are other, larger filters containing $\mathcal{F}_b$. Consequently, it is called the filter generated by the filter base and it is denoted by $\mathcal{F}(\mathcal{F}_b)$. Thus a filter base on $X$ is a filter base for a filter on $X$, namely for the filter generated by the filter base.
(iii) We showed that a filter base on $X$ is a filter base for a filter. The converse is also true: a filter base for a filter is a filter base (show it in Problem 9.2).
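On a finite set, the construction of Remark 9.2 (ii) can be carried out exhaustively. The sketch below is an illustration only: the set `X`, the filter base `Fb`, and all function names are assumptions chosen for the example, and the checks simply transcribe the conditions of Definition 9.1 (i) and (ii).

```python
from itertools import combinations

def powerset(X):
    """All subsets of X as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter_base(Fb):
    """Definition 9.1 (ii): nonempty, no empty member, and any two members
    contain a third member inside their intersection."""
    return (len(Fb) > 0 and frozenset() not in Fb and
            all(any(F <= F1 & F2 for F in Fb) for F1 in Fb for F2 in Fb))

def is_filter(F, X):
    """Definition 9.1 (i) for a collection F of subsets of the finite set X."""
    return (len(F) > 0 and frozenset() not in F and
            all(F1 & F2 in F for F1 in F for F2 in F) and
            all(S in F for F1 in F for S in powerset(X) if F1 <= S))

def generated_filter(Fb, X):
    """Remark 9.2 (ii): all supersets (within X) of members of Fb."""
    return {S for S in powerset(X) if any(B <= S for B in Fb)}

X = {1, 2, 3, 4}
Fb = {frozenset({1, 2}), frozenset({1, 2, 3}), frozenset({1, 2, 4})}
F = generated_filter(Fb, X)
print(is_filter_base(Fb), is_filter(F, X), len(F))
```

Here the generated filter consists of the four supersets of {1, 2}. The full power set fails the check because it contains the empty set.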
9.3 Examples.
(i) The neighborhood system $\mathcal{U}_x$ at a point $x \in (X,\tau)$ is a filter on $X$, called the neighborhood filter.
(ii) A neighborhood base $\mathcal{B}_x$ at a point $x \in (X,\tau)$ is a filter base on $X$.
(iii) Let $x_0 \in X = \mathbb{R}$. Then the following collections of sets are filter bases:
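9.4 Lemma. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters that contain a filter $\mathcal{F}_0$ on $X$. Let $\subseteq$ be the partial order of inclusion on $\mathbb{F}(\mathcal{F}_0)$. A filter $\mathcal{F} \in \mathbb{F}$ is an ultrafilter if and only if $\mathcal{F}$ is a maximal filter in $\mathbb{F}$.
Proof. 1) Let $\mathcal{F}$ be a maximal filter in $\mathbb{F}(\mathcal{F}_0)$ and let $A \subseteq X$. Each element of $\mathcal{F}$ intersects $A$ or $A^c$. Assume that one such $F$ meets $A$. Then, by Problem 9.4, $\mathcal{F}$ meets $A$. By Problems 9.5-9.7, $\mathcal{F}_A := \{F \cap A: F \in \mathcal{F}\}$ is a filter base for
$$\mathcal{F}' := \mathcal{F} \cup \Bigl(\bigcup_{B \supseteq A} \mathcal{F}_B\Bigr),$$
which is equal to $\mathcal{F}(\mathcal{F} \cup \{A\})$, i.e., the filter generated by the collection $\mathcal{F} \cup \{A\}$. $\mathcal{F}'$ is finer than $\mathcal{F}$ and it contains $A$. Since $\mathcal{F}$ is a maximal filter, it follows that $\mathcal{F} = \mathcal{F}'$. Thus, $\mathcal{F}$ contains $A$. The same result holds if $F$ meets $A^c$. Therefore, $\mathcal{F}$ is an ultrafilter.
2) Let $\mathcal{F}$ be an ultrafilter and let $A \subseteq X$ be such that $A \in \mathcal{F}$. We show that $\mathcal{F}$ is maximal. Let $\mathcal{F}'$ be any filter in $\mathbb{F}$ such that $\mathcal{F} \subsetneq \mathcal{F}'$. Then there is $F' \in \mathcal{F}' \setminus \mathcal{F}$. Since $\mathcal{F}$ is an ultrafilter and $F' \notin \mathcal{F}$, we have that $(F')^c \in \mathcal{F}$ and hence $(F')^c \in \mathcal{F}'$. However, this is impossible, for the two disjoint sets $F'$ and $(F')^c$ cannot belong to the same filter. □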
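9.5 Proposition. For each filter $\mathcal{F}_0$, there is an ultrafilter $\mathcal{U} \supseteq \mathcal{F}_0$.
Proof. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters finer than $\mathcal{F}_0$ and let $\mathfrak{C}(\mathcal{F}_0)$ be any chain in $\mathbb{F}(\mathcal{F}_0)$. Then it is easy to see that
$$\bigcup \{\mathcal{F}: \mathcal{F} \in \mathfrak{C}(\mathcal{F}_0)\}$$
is again a filter and that it contains every filter of $\mathfrak{C}(\mathcal{F}_0)$. Specifically, it is an upper bound for $\mathfrak{C}(\mathcal{F}_0)$. Then, by Zorn's Lemma 4.13, Chapter 1, $\mathbb{F}(\mathcal{F}_0)$ has a maximal element, which by Lemma 9.4 is an ultrafilter. □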
9.6 Definitions.
(i) A filter $\mathcal{F}$ on a topological space $(X,\tau)$ is said to converge to an $x \in X$ (in notation $\mathcal{F} \to x$) if it is finer than or equal to the neighborhood system $\mathcal{U}_x$, i.e. if $\mathcal{U}_x \subseteq \mathcal{F}$. $x$ is said to be a limit point of the filter $\mathcal{F}$. Clearly, every neighborhood system $\mathcal{U}_x$ converges to $x$.
(ii) A filter base $\mathcal{F}_b$ is said to converge to $x$ ($\mathcal{F}_b \to x$) if for every neighborhood $U_x \in \mathcal{U}_x$, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq U_x$. Consequently, each neighborhood base $\mathcal{B}_x$ converges to $x$.
(iii) A point $x \in (X,\tau)$ is said to be an accumulation point of the filter $\mathcal{F}$ (filter base $\mathcal{F}_b$) if for each $F \in \mathcal{F}$ ($F_b \in \mathcal{F}_b$) and for each $U_x \in \mathcal{U}_x$, $F \cap U_x \neq \emptyset$.
(iv) Let $\mathcal{F}_b$ be a filter base on $X$ and let $f: X \to (Y,\tau')$ (a topological space). The function $f$ is said to converge to $l \in Y$ ($f \to l$) along the filter base $\mathcal{F}_b$ if for every neighborhood $V_l$ of $l$, there is an $F \in \mathcal{F}_b$ such that $f_*(F) \subseteq V_l$. □
9.7 Examples.
(i) Let $X = \mathbb{N}$ and let $\mathcal{F}_b = \{\{n, n+1, \ldots\}: n \in \mathbb{N}\}$ be a filter base on $\mathbb{N}$. Now consider a map $f: \mathbb{N} \to (Y,\tau)$, which, in fact, is a sequence in the space $Y$. Then, Definition 9.6 (iv) in this case reduces to the conventional definition of the limit of the sequence $\{f(n) = y_n\}$ (cf. Definition 3.1).
(ii) Let $(X,\tau)$, $(Y,\tau')$ be topological spaces, $f: X \to Y$, $a \in X$, $l \in Y$, and let $\mathcal{F}_b = \mathcal{U}_a$ (the neighborhood filter at $a$). Now, the expression $f \to l$ along $\mathcal{U}_a$ means: for each neighborhood $V_l$, there is a neighborhood $U_a \in \mathcal{U}_a$ such that for each $x \in U_a$, $f(x) \in V_l$ (or, equivalently, $f_*(U_a) \subseteq V_l$), in notation,
$$\lim_{x \to a} f(x) = l. \tag{9.7}$$
Observe that as long as $\mathcal{U}_a$ is declared, and since it is unique with respect to the point $a$ and topology $\tau$, we need not specify along which filter base $f$ converges to $l$. Should $\mathcal{U}_a$ be replaced by a specific neighborhood base $\mathcal{B}_a$ (also a filter base), then we can write
$$\lim_{x \to a;\, \mathcal{B}_a} f(x) = l. \tag{9.7a}$$
Now, let $\mathcal{B}_a$ be a neighborhood base at $a$ with (9.7a) holding. Then, by Definition 9.6 (iv), for each neighborhood $V_l$ of $l$, there is a neighborhood $B_a \in \mathcal{B}_a$ of $a$ such that $f_*(B_a) \subseteq V_l$. Since $\mathcal{B}_a \subseteq \mathcal{U}_a$, (9.7a) then implies (9.7). Conversely, if (9.7) holds, then for each $V_l$, there is a neighborhood $U_a$ from the neighborhood system $\mathcal{U}_a$ with $f_*(U_a) \subseteq V_l$. Because each $U_a$ is, by Definition 1.5 (iii), a superset of at least one $B_a \in \mathcal{B}_a$ ($\mathcal{B}_a$ being an arbitrary neighborhood base at $a$), (9.7a) must hold. Consequently, (9.7) and (9.7a) are equivalent, even though (9.7a) is related to a specific neighborhood base of $a$. We therefore see that the limit does not depend on the choice of a neighborhood base of $a$, and (9.7) can be used with no specification of any neighborhood base. Consequently, (9.7) can be used for the notion of convergence of a function $f$ at a point $a$. Notice that $f$ acts between two topological spaces. Interestingly enough, we could alternatively use a definition of convergence similar to that of continuity in Definition 4.1, i.e. with no explicit mention of a filter base. This would read:
A function $f$ is said to have a limit $l$ at a point $a$ if for each neighborhood $V_l$ of $l$ in $(Y,\tau')$, there is a neighborhood $U_a$ of $a$ in $(X,\tau)$ such that $f_*(U_a) \subseteq V_l$, or equivalently, if $f^*(V_l)$ is a neighborhood of $a$.
In particular, if $(X,\tau)$ is first countable (which is the case of metric spaces and many other applications), we can have $f$ converge to $l$ along any monotone decreasing countable neighborhood base of the point $a$, say, $\{B_n^a\}$. If we now select from each $B_n^a$ an arbitrary point $x_n$ (cf. the proof of Theorem 4.10), then $x_n \to a$ in the usual sense and, consequently, we can write
$$\lim_{x_n \to a} f(x) = l, \tag{9.7b}$$
which has a double meaning. For one, it goes back to notation (9.7)-(9.7a), and limit (9.7b) is a limit of $f$ along the filter base $\{B_n^a\}$. On the other hand, it coincides with our conventional definition of the limit of $f$ at a point $a$ along the sequence $\{x_n\}$. Finally, if the limit in (9.7b) is the same along any sequence $\{x_n\}$ that converges to $a$, then, by arguments as in Theorem 4.10, we can show that $l$ is a limit of $f$ along the filter base $\{B_n^a\}$ and therefore along any neighborhood base of $a$. The uniqueness of $l$ is subject to Example (iv) below, and we will see that this is the case if $(Y,\tau')$ is Hausdorff. For instance, if we consider as $f$ the function
$$f(x) = \frac{g(x) - g(a)}{x - a},$$
then the function $[\mathbb{R},\mathbb{R},g]$ is differentiable at $a$ if and only if the limit
$$\lim_{x \to a} f(x) = l$$
exists, where $l = g'(a)$, and now we can say that the function $g$ is differentiable at $a$ if and only if this limit exists along any sequence $\{x_n\}$ convergent to $a$ in the sense of notation (9.7b). This idea is frequently used in analysis whenever convergence along a sequence is a plausible (if not the only) option for us.
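(iii) Consider some special cases of limits along the filter bases from Example 9.3 (iii). Let $X = Y = \mathbb{R}$ and $f: X \to (Y,\tau_e)$.
a) If $\mathcal{F}_b$ on $X$ is $\mathcal{F}_b = \{(a - \varepsilon, a + \varepsilon): \varepsilon > 0\}$, then the concept of limit introduced in Definition 9.6 (iv) reduces to the conventional definition of the limit of a function known from calculus, with the usual notation
$$\lim_{x \to a} f(x) = l.$$
b) Similarly, with $\mathcal{F}_b = \{[a, a + \varepsilon): \varepsilon > 0\}$ we obtain
$$\lim_{x \to a+} f(x) = l.$$
c) With $\mathcal{F}_b = \{[b,\infty): b \in \mathbb{R}\}$, we have
$$\lim_{x \to +\infty} f(x) = l.$$
(iv) Let $f: X \to (Y,\tau)$, let $\mathcal{F}_b$ be a filter base on $X$, and let $(Y,\tau)$ be Hausdorff. We show that if $f$ has a limit along $\mathcal{F}_b$, then it is unique. Assume that $l_1$ and $l_2$ are two different limits along $\mathcal{F}_b$. Since $Y$ is Hausdorff, there are two disjoint neighborhoods of $l_1$ and $l_2$: $V_{l_1}$ and $V_{l_2}$. By the definition of the limit along $\mathcal{F}_b$, there are two sets $U_1, U_2 \in \mathcal{F}_b$ such that
$$f_*(U_1) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_2) \subseteq V_{l_2}.$$
By the definition of $\mathcal{F}_b$ as a filter base, there is $U \in \mathcal{F}_b$ such that $U \subseteq U_1 \cap U_2$. Since
$$f_*(U) \subseteq f_*(U_1 \cap U_2) \subseteq f_*(U_1) \cap f_*(U_2),$$
we have $f_*(U) \subseteq V_{l_1} \cap V_{l_2} = \emptyset$. This is absurd, for $U \neq \emptyset$. □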
When introducing convergence of a function $f: X \to (Y,\tau')$ along a filter base $\mathcal{F}_b$ on $X$ in Definition 9.6 (iv), we did not need to assume any topology on $X$. Now, if we define a topology $\tau$ on $X$ and take for $\mathcal{F}_b$ the neighborhood filter $\mathcal{U}_{x_0}$ at a point $x_0 \in X$, then, by Definition 9.6 (iv) (applied to $\mathcal{U}_{x_0} = \mathcal{F}_b$) and taking $l \in Y$ as $f(x_0)$, we arrive at the definition of continuity of $f$ at $x_0$ that agrees with Definition 4.1: A function $f: (X,\tau) \to (Y,\tau')$ is called continuous at a point $x_0$ if
$$\lim_{x \to x_0} f(x) = f(x_0).$$
Now, we consider another very useful type of convergence: convergence along nets. As we will see, filter and net convergence are closely related.
9.8 Definitions.
(i) A set $\Lambda$ is called directed if there exists a relation (denoted $\preceq$) on $\Lambda$ such that:
a) (R) for each $\lambda \in \Lambda$, $\lambda \preceq \lambda$;
b) (T) $\lambda_1 \preceq \lambda_2$ and $\lambda_2 \preceq \lambda_3$ imply that $\lambda_1 \preceq \lambda_3$;
c) (SL - superlativity) for each pair $\lambda_1, \lambda_2 \in \Lambda$, there is $\lambda \in \Lambda$ such that $\lambda_1 \preceq \lambda$ and $\lambda_2 \preceq \lambda$.
(ii) A net is, roughly speaking, a set indexed by a directed set, and it is a generalization of a sequence. More formally: a net in $X$ induced by $\Lambda$ is any function $f: \Lambda \to X$ where $\Lambda$ is a directed set. The point $f(\lambda)$ is denoted by $x_\lambda$ and we will then denote the net by $\{x_\lambda\} = \{x_\lambda: \lambda \in \Lambda\}$. Observe that since $f$ need not be surjective, $\{x_\lambda\}$ is in general a proper subset of $X$.
(iii) If $\{x_\lambda\}$ is a net, then $\{x_\lambda: \lambda_0 \preceq \lambda\}$ is called a $\lambda_0$-tail of $\{x_\lambda\}$.
(iv) Let $A \subseteq X$. A $\lambda_0$-tail of a net $\{x_\lambda\}$ is called a $\lambda_0(A)$-tail of $\{x_\lambda\}$ if the $\lambda_0$-tail is a subset of $A$.
(v) A net $\{x_\lambda\}$ is said to be cofinally in $A \subseteq X$ if for each $\lambda_0 \in \Lambda$, there is $\lambda \succeq \lambda_0$ such that $x_\lambda \in A$.
(vi) A point $x \in X$ is said to be an accumulation point of a net $\{x_\lambda\}$ if $\{x_\lambda\}$ is cofinally in each neighborhood $U_x \in \mathcal{U}_x$.
(vii) Let $\{x_\lambda\}$ be a net in $X$. $\{x_\lambda\}$ is said to converge to a point $x \in X$ (in notation $x_\lambda \to x$) if for each neighborhood $U_x$ of $x$, there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$. $x$ is called a limit point of the net $\{x_\lambda\}$.
(viii) A net $\{x_\lambda\}$ is called an ultranet if for every subset $A \subseteq X$, there is a $\lambda_0(A)$-tail of $\{x_\lambda\}$ or a $\lambda_0(A^c)$-tail of $\{x_\lambda\}$.
9.9 Examples.
(i) An example of a directed set $\Lambda$ is $\mathbb{R}^n$ with $\lambda = (\lambda_1, \ldots, \lambda_n) \preceq \mu = (\mu_1, \ldots, \mu_n)$ if and only if $\lambda_i \leq \mu_i$, for all $i = 1, \ldots, n$.
(ii) A neighborhood base $\mathcal{B}_x$ at $x$, or, an even more trivial case, the neighborhood system $\mathcal{U}_x$, with the relation $U_1 \preceq U_2$ if and only if $U_1 \supseteq U_2$ for their elements, is a directed set.
(iii) Let $X$ be an arbitrary continuum set and let $\{x_\lambda\}$ be the net in $X$ induced by $\Lambda$ defined in (i). Now, a $\lambda_0$-tail involves only those $x \in X$ whose indices $\lambda$ satisfy $\lambda_0 \preceq \lambda$.
(iv) Let $(X,\tau)$ be a topological space, $x \in X$, and let $\mathcal{B}_x$ be any neighborhood base of $x$ directed as in (ii). Now, we index a subset of $X$ as follows. For each neighborhood $B \in \mathcal{B}_x$, we pick a point $y \in B$ and index it by $B$, and so we obtain a net $\{y_B: B \in \mathcal{B}_x\}$ in $X$. Observe that the same points of $X$ can be indexed by different neighborhoods, but for each neighborhood $B \in \mathcal{B}_x$, exactly one point (of this neighborhood) is assigned. Any such net $\{y_B\}$ will be called a net generated by the neighborhood base $\mathcal{B}_x$. It is understood that there is in general more than one net generated by a neighborhood base. If $B_0$ is any neighborhood from $\mathcal{B}_x$, then the $B_0$-tail is the collection of all $y_B$ of the net with all those $B \in \mathcal{B}_x$ such that $B_0 \preceq B$, or equivalently, $B \subseteq B_0$.
(v) Consider a net $\{y_B\}$ in $(X,\tau)$ from Example (iv) generated by the neighborhood base $\mathcal{B}_x$. We show that $y_B \to x$. Indeed, if $U_x$ is any neighborhood of $x$ then, by the definition of a neighborhood base, there is $B_0 \in \mathcal{B}_x$ such that $B_0 \subseteq U_x$. On the other hand, the $B_0$-tail is the subcollection of points of the net indexed by all $B \subseteq B_0$ $(\subseteq U_x)$. However, since each $y_B \in B$, the $B_0$-tail is a subset of $B_0$, specifically, of $U_x$. Thus, $y_B \to x$.
(vi) Let $\mathcal{P}$ be the collection of all finite partitions of the compact interval $[a,b]$. Recall (Definition 1.7 (ii), Chapter 1) that $P \in \mathcal{P}$ is a partition of $[a,b]$ if $P$ is any ordered finite set of points $\{a_0, \ldots, a_n\} \subseteq [a,b]$ with $a = a_0 < a_1 < \cdots < a_n = b$. Let $P$ and $P'$ be two partitions in $\mathcal{P}$. We say that $P'$ is finer than $P$ if $P$ is a subset of $P'$. $P'$ is also said to be a refinement of $P$. We direct $\mathcal{P}$ as follows: for every pair of partitions $P, P' \in \mathcal{P}$, $P \preceq P'$ if and only if $P'$ is finer than $P$ ($P \subseteq P'$). Let $f$ be a bounded real-valued function defined on $[a,b]$. Then for a partition $P$, with $m_i = \inf\{f(x): x \in [a_{i-1},a_i]\}$ and $M_i = \sup\{f(x): x \in [a_{i-1},a_i]\}$, define:
$$L_P := \sum_{i=1}^{n} m_i (a_i - a_{i-1}) \quad \text{(Darboux lower sum indexed by } P\text{)},$$
$$U_P := \sum_{i=1}^{n} M_i (a_i - a_{i-1}) \quad \text{(Darboux upper sum indexed by } P\text{)}.$$
Consequently, $\{L_P\}$ and $\{U_P\}$ are two nets in $(\mathbb{R},\tau_e)$, and if each of them converges to the same real number $I$, we call this number the Riemann integral of $f$ and denote it by
$$\int_a^b f(x)\,dx.$$
Indeed, let $U$ be a neighborhood of $I$. If $L_P \to I$, then there is a partition $P_0$ such that the $P_0$-tail of $\{L_P\}$ is in $U$, or equivalently, all Darboux lower sums indexed by the partitions finer than $P_0$ must be in the $\varepsilon$-range of $I$. Observe that the "naive" definition of the Riemann integral is based on the existence of a limit of a sequence of lower sums over any sequence of subsequently refined partitions. The definition in this example is just the same, since the net convergence involves in fact the existence of such a limit over all appropriate sequences of partitions. As mentioned, this motivated E. H. Moore and H. L. Smith to develop the general concept of a net.
(vii) If an ultranet $\{x_\lambda\}$ is cofinally in $A \subseteq X$, then there exists a $\lambda_0(A)$-tail of $\{x_\lambda\}$. Indeed, by the definition of an ultranet, there is either a $\lambda_0(A)$-tail or a $\lambda_0(A^c)$-tail. The latter contradicts the assumption that $\{x_\lambda\}$ is cofinally in $A$.
(viii) If $\{x_\lambda\}$ is an ultranet and $x$ is its accumulation point, then $x$ is also a limit point. Indeed, if $x$ is an accumulation point of $\{x_\lambda\}$, then for every neighborhood $U_x$, $\{x_\lambda\}$ is cofinally in $U_x$, and thus, by Example (vii) above, there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$, which implies that $x$ is a limit point of $\{x_\lambda\}$.
(ix) An example of a trivial ultranet. Let $\Lambda$ be a directed set; then any function $f: \Lambda \to \{x_0\}$ ($x_0 \in X$) onto $\{x_0\}$ is an ultranet. □
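The nets of Darboux sums in Example 9.9 (vi) can be followed along one chain of refinements. The sketch below is only an illustration under stated assumptions: the integrand `f`, the midpoint refinement `refine`, and the sampled estimate of the infimum and supremum are choices made here, and a single chain of partitions does not exhaust the directed set of all partitions.

```python
def darboux_sums(f, partition, samples=200):
    """Approximate lower and upper Darboux sums of f over a partition.
    The inf and sup on each subinterval are estimated on a sample grid
    that includes both endpoints, which is exact for monotone f."""
    lower = upper = 0.0
    for left, right in zip(partition, partition[1:]):
        values = [f(left + (right - left) * k / samples)
                  for k in range(samples + 1)]
        lower += min(values) * (right - left)
        upper += max(values) * (right - left)
    return lower, upper

def refine(partition):
    """A finer partition: add the midpoint of every subinterval."""
    mids = [(a + b) / 2 for a, b in zip(partition, partition[1:])]
    return sorted(set(partition) | set(mids))

f = lambda x: x * x   # monotone on [0, 1]; its Riemann integral is 1/3
P = [0.0, 1.0]
history = []
for _ in range(8):
    history.append(darboux_sums(f, P))
    P = refine(P)
for low, up in history:
    print(f"{low:.6f} {up:.6f}")
```

Along this chain the lower sums increase and the upper sums decrease, squeezing the value 1/3 between them, in line with the tail description of convergence in the example.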
9.10 Proposition. Let $A \subseteq (X,\tau)$. Then $x \in \overline{A}$ if and only if there is a net $\{x_\lambda\}$ in $A$ such that $x_\lambda \to x$.
Proof. 1) In Example 9.9 (v) we have shown that for each $x \in (X,\tau)$, any net generated by a neighborhood base $\mathcal{B}_x$ converges to $x$. Thus, it is sufficient to show that there is such a net located in $A$. If $x \in \overline{A}$, then, by Definition 9.6 (iii), each neighborhood $U_x$ meets $A$ in at least one point. Specifically, each neighborhood taken from a neighborhood base $\mathcal{B}_x$ has a nonempty intersection with $A$. Therefore, a desired net $\{y_B\}$ generated by $\mathcal{B}_x$ is any net whose terms are picked from these intersections.
2) Conversely, if $\{y_B\}$ is a net in $A$ convergent to $x$, then for each neighborhood $U_x$, there is a $B_0$-tail of the net that is included in $U_x$. On the other hand, as a subset of the net, the $B_0$-tail is a subset of $A$, which implies that the $B_0$-tail is a subset of $U_x \cap A$. Consequently, $U_x \cap A \neq \emptyset$ and thus $x \in \overline{A}$. □
9.11 Remark. Let $f: X \to Y$ be any function and let $\{x_\lambda\} \subseteq X$ be any net in $X$. Then, clearly, $\{f(x_\lambda)\}$ is a net in $Y$.
9.12 Definition. A function $f: X \to Y$ is said to converge to $l \in Y$ along the net $\{x_\lambda\}$ if $f(x_\lambda) \to l$. □
The theorem below refines Theorem 4.9 and modifies Theorem 4.10.
9.13 Theorem. Let $f: (X,\tau) \to (Y,\tau')$ be a map. Then $f$ is continuous at a point $x_0$ if and only if for each net $\{x_\lambda\}$ in $X$ such that $x_\lambda \to x_0$, it follows that $f(x_\lambda) \to f(x_0)$.
Proof. 1) Let $f$ be continuous at $x_0$ and let $\{x_\lambda\}$ be a net in $X$ convergent to $x_0$. Let $W$ be a neighborhood of $f(x_0)$. Since $f$ is continuous at $x_0$, $f^*(W)$ is a neighborhood of $x_0$. The convergence $x_\lambda \to x_0$ guarantees the existence of a $\lambda_0$-tail of $\{x_\lambda\}$ included in $f^*(W)$. Thus, $f_*(\lambda_0\text{-tail}) \subseteq f_* \circ f^*(W) \subseteq W$, implying that $f(x_\lambda) \to f(x_0)$.
2) Let $f$ be not continuous at $x_0$. The negation of continuity means that there is a neighborhood $W$ of $f(x_0)$ for which $f^*(W)$ is not a neighborhood of $x_0$, or equivalently, there is no neighborhood $U_{x_0}$ with $U_{x_0} \subseteq f^*(W)$. Therefore, $f_*(U_{x_0})$ is not a subset of $W$. This fact implies that each neighborhood $B$ from $\mathcal{B}_{x_0}$ has at least one representative, say $x_B$, such that $f(x_B) \in W^c$. Since $\{x_B\}$ is a net generated by $\mathcal{B}_{x_0}$, it converges to $x_0$, while the net $\{f(x_B)\}$ cannot converge to $f(x_0)$, for it is separated from a neighborhood $W$ of $f(x_0)$. □
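9.14 Theorem. A net $\{x_\lambda\}$ in the product topological space $(X = \prod_{i \in I} X_i, \tau_\pi)$ is convergent to a point $x \in X$ if and only if for each $i \in I$, $\pi_i(x_\lambda) \to \pi_i(x) = x_i \in X_i$ (where $\pi_i$ denotes the $i$th projection map).
Proof. 1) Let $x_\lambda \to x$ in $X$. Then, since each projection map $\pi_i$ is continuous in the product topology, by Theorem 9.13, $\pi_i(x_\lambda) \to \pi_i(x)$, $\forall i \in I$.
2) Suppose that $\pi_i(x_\lambda) \to \pi_i(x)$, $\forall i \in I$. If $U_x$ is an element of a neighborhood base of the point $x$ in the product topology, then $U_x$ can be represented as
$$U_x = \bigcap_{k=1}^{n} \pi_{i_k}^*(U_{x_{i_k}}).$$
Then for each $k = 1, \ldots, n$, there is a $\lambda_k$ such that for all $\lambda \succeq \lambda_k$, $\pi_{i_k}(x_\lambda) \in U_{x_{i_k}}$, i.e. the $\lambda_k$-tail is in $U_{x_{i_k}}$. Since there are only finitely many such $k$'s, by superlativity of $\Lambda$, there exists a $\lambda_0 \succeq \lambda_k$, $k = 1, \ldots, n$, such that each $\lambda_0$-tail of $\{\pi_{i_k}(x_\lambda)\}$ is in $U_{x_{i_k}}$, $k = 1, \ldots, n$. Hence, $\pi_{i_k}^*(\lambda_0\text{-tail of } \{\pi_{i_k}(x_\lambda)\})$ is contained in $\pi_{i_k}^*(U_{x_{i_k}})$. Consequently, the $\lambda_0$-tail of $\{x_\lambda\}$ is in $\pi_{i_k}^*(U_{x_{i_k}})$, $k = 1, \ldots, n$, and
$$x_\lambda \in \bigcap_{k=1}^{n} \pi_{i_k}^*(U_{x_{i_k}}) = U_x, \quad \text{for all } \lambda \succeq \lambda_0.$$
In other words, $x_\lambda \to x$. □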
9.15 Remark. We revisit Example 9.7 (i), which treats a special case of the convergence of a function on $\mathbb{N}$ (a sequence) along the filter base $\mathcal{F}_b = \{\{n, n+1, \ldots\}: n \in \mathbb{N}\}$ in a discrete topological space. Since any sequence is a net, the filter base $\mathcal{F}_b$ in this case obviously contains all $n_0$-tails of this net, and the convergence of $f$ along $\mathcal{F}_b$ is equivalent to the convergence of the net $\{f(n)\}$. We ask what the connection is between filter and net convergence, and in which cases they are equivalent. We start with the natural generalization of this case.
9.16 Proposition. Let $\{x_\lambda\}$ be a net in $X$. Then the collection of all tails of $\{x_\lambda\}$ is a filter base on $X$. (See Problem 9.11.)
9.17 Definition. Let $\{x_\lambda\}$ be a net in $X$. The filter base in Proposition 9.16 is said to be the filter base generated by the net $\{x_\lambda\}$ and it is denoted by $\mathcal{F}_\Lambda$. Correspondingly, the filter $\mathcal{F}(\mathcal{F}_\Lambda)$ generated by this particular filter base is called the filter generated by the net $\{x_\lambda\}$. □
The following two criteria form a bridge between filter and net convergence.
9.18 Theorem. A net $\{x_\lambda\} \to x$ if and only if the filter $\mathcal{F}(\mathcal{F}_\Lambda)$ generated by this net converges to $x$.
9.19 Theorem. $x$ is an accumulation point of a net $\{x_\lambda\}$ if and only if $x$ is an accumulation point of the filter $\mathcal{F}(\mathcal{F}_\Lambda)$ generated by this net.
The proofs of both theorems are left to the reader as Problems 9.12 and 9.13.
9.20 Remark. Let $\mathcal{F}$ be a filter on $X$. Denote $\Lambda_{\mathcal{F}} = \{(x,F): x \in F \in \mathcal{F}\}$ and introduce the relation $\preceq$ on $\Lambda_{\mathcal{F}}$ by
$$(x_1,F_1) \preceq (x_2,F_2) \iff F_2 \subseteq F_1.$$
Note that from each $F$, each time we select exactly one point $x$. Consequently, we pair all elements of $F$ with $F$. Then $(\Lambda_{\mathcal{F}}, \preceq)$ is a directed set (show it as Problem 9.14) and the projection map $\pi: \Lambda_{\mathcal{F}} \to X$ (assigning $x$ to $(x,F)$) is a net in $X$. This net is called the net based on $\mathcal{F}$. So, the net based on $\mathcal{F}$ is just $\{x_\lambda\}$ where $\lambda = (x,F)$, and this particular $x$ is labeled by $\lambda$ or by $F$. This is somewhat similar to the labeling of a net generated by a neighborhood base. However, in this case, we select all elements $x$ of $F$ and, in addition, we deal with a filter instead of a filter base.
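For a filter on a finite set, the index set of Remark 9.20 and its order can be checked directly. The following is a small sketch under assumptions: the filter `F` on {1, 2, 3} and the helper names are chosen for illustration, and the three conditions tested are (R), (T) and (SL) of Definition 9.8 (i).

```python
from itertools import product

def index_set(F):
    """All pairs (x, S) with x in S and S in F, as in Remark 9.20."""
    return [(x, S) for S in F for x in S]

def precedes(l1, l2):
    """(x1, F1) precedes (x2, F2) iff F2 is a subset of F1."""
    return l2[1] <= l1[1]

def is_directed(L):
    """Check reflexivity, transitivity and superlativity on a finite list."""
    reflexive = all(precedes(l, l) for l in L)
    transitive = all(precedes(a, c) for a, b, c in product(L, repeat=3)
                     if precedes(a, b) and precedes(b, c))
    upper_bounds = all(any(precedes(a, u) and precedes(b, u) for u in L)
                       for a, b in product(L, repeat=2))
    return reflexive and transitive and upper_bounds

def tail(L, l0):
    """Points of the net (x, S) -> x whose index follows l0."""
    return {x for (x, S) in L if precedes(l0, (x, S))}

# Filter on {1, 2, 3} consisting of all supersets of {1, 2}.
F = [frozenset({1, 2}), frozenset({1, 2, 3})]
L = index_set(F)
print(is_directed(L))
print(tail(L, (3, frozenset({1, 2, 3}))), tail(L, (1, frozenset({1, 2}))))
```

In this example the tail starting at an index (x, F0) is exactly the set F0, which is the mechanism the proof of Theorem 9.21 relies on.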
9.21 Theorem. A filter $\mathcal{F}$ converges to $x$ if and only if the net based on $\mathcal{F}$ converges to $x$.
Proof. 1) Suppose that $\mathcal{F} \to x$. Then, by Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathcal{F}$. Let $U_x \in \mathcal{U}_x$. Then $U_x \in \mathcal{F}$. Let $x_{\lambda_0} \in U_x$. Then $\lambda_0 := (x_{\lambda_0},U_x) \in \Lambda_{\mathcal{F}}$. By superlativity of $\Lambda_{\mathcal{F}}$, there is $\lambda \succeq \lambda_0$. Hence, there is an $F$ $(\in \mathcal{F})$ $\subseteq U_x$, $\lambda_0 \preceq \lambda = (x_\lambda,F)$, and $x_\lambda \in F$. The collection of all such $x_\lambda$'s is the $\lambda_0$-tail and it is a subset of $U_x$, which is an arbitrary neighborhood of $x$. Therefore, $x_\lambda \to x$.
2) Let $\{x_\lambda\}$ be the net based on a filter $\mathcal{F}$ such that $x_\lambda \to x \in X$. We need to show that $\mathcal{U}_x \subseteq \mathcal{F}$. Since $x_\lambda \to x$, for each $U_x$, there is $\lambda_0 \in \Lambda_{\mathcal{F}}$ such that the $\lambda_0$-tail is in $U_x$, i.e., for some $\lambda_0 = (x_{\lambda_0},F_0)$, all $x_\lambda \in U_x$, with $\lambda \succeq \lambda_0$, or equivalently, with $x_\lambda \in F_\lambda \subseteq F_0$. Furthermore, $F_0$ must be contained in $U_x$. If this is not the case, then at least $F_0$ and $U_x$ are not disjoint (it follows from the above inclusions). Since by our assumption $F_0 \setminus U_x \neq \emptyset$, there is a $y \in F_0 \setminus U_x$, and then the pair $(y,F_0)$, marked with some $\lambda$, is obviously in the $\lambda_0$-tail. Thus $y_\lambda$ must belong to $U_x$, which contradicts the assumption. Another reason why $F_0 \subseteq U_x$ is that if some $x_\lambda \in F$ belongs to $U_x$, then all other elements of $F$ belong to $U_x$, for they participate in the relation $(x_\lambda,F) \preceq (y,F)$ and thus belong to the $\lambda_0$-tail. So, we have shown that an arbitrary neighborhood $U_x$ is a superset of some $F_0 \in \mathcal{F}$. By the definition of a filter, $U_x \in \mathcal{F}$. □
9.22 Example. If $\mathcal{F} = \mathcal{U}_x$, then such a filter always converges to $x$. By Theorem 9.21, the net $\{x_\lambda\}$ based on $\mathcal{U}_x$ converges to $x$. A $\lambda_0$-tail of this net would consist of all points $y$ indexed with all neighborhoods $U \in \mathcal{U}_x$ which are included in the "$\lambda_0$-neighborhood" $U_{\lambda_0}$. □
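9.23 Remark. The following considerations are similar to those in Remark 9.20. Let $\mathcal{F}_b$ be a filter base on $X$. Denote
$$\Lambda_{\mathcal{F}_b} = \{(x,F): x \in F \in \mathcal{F}_b\}$$
and set the relation $\preceq$ in $\Lambda_{\mathcal{F}_b}$ by
$$(x_1,F_1) \preceq (x_2,F_2) \iff F_2 \subseteq F_1.$$
Then $(\Lambda_{\mathcal{F}_b}, \preceq)$ is a directed set (show it in Problem 9.15). Now, the projection map $\pi: \Lambda_{\mathcal{F}_b} \to X$ is a net in $X$. This net is called the net based on the filter base $\mathcal{F}_b$.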
9.24 Theorem. A filter base $\mathcal{F}_b$ converges to $x$ if and only if the net based on $\mathcal{F}_b$ converges to $x$. □
The proof of this theorem is similar to that of Theorem 9.21 and it is subject to Problem 9.16.
9.25 Example. Let $\mathcal{F}_b = \mathcal{B}_x$ be an arbitrary neighborhood base of a point $x \in X$. Then, as mentioned, $\mathcal{B}_x$ converges to $x$. By Theorem 9.24, the net $\{x_\lambda\}$ based on $\mathcal{B}_x$ also converges to $x$. A typical $\lambda_0$-tail is similar to that in Example 9.22. □
The theorem below is a refinement of Theorem 3.10, which was stated for sequences.
9.26 Theorem. The following statements are equivalent:
(i) $(X,\tau)$ is $T_2$.
(ii) All limits in $(X,\tau)$ along nets or filters are unique.
(iii) The diagonal $\{(x,x) \in X^2: x \in X\}$ is closed in the product topology of $X^2$.
Proof. (i) $\Rightarrow$ (ii): Let $(X,\tau)$ be $T_2$ and let $\mathcal{F}$ be any filter on $X$ with $\mathcal{F} \to x$ and $\mathcal{F} \to y$. By Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathcal{F}$ and $\mathcal{U}_y \subseteq \mathcal{F}$. Thus, $\forall\, U_x, U_y \in \mathcal{F}$, $U_x \cap U_y \neq \emptyset$ (by the definition of a filter). Consequently, either $x = y$ or $(X,\tau)$ is not Hausdorff. If now $\{x_\lambda\}$ is any net in $X$ with $x_\lambda \to x$, then by Theorem 9.18, the filter $\mathcal{F}(\mathcal{F}_\Lambda)$ generated by this net converges to the same point $x$. If $y$ were another point such that $x_\lambda \to y \neq x$, then by the same Theorem 9.18, it would mean that $\mathcal{F}(\mathcal{F}_\Lambda) \to y$ as well, which is impossible, for in $T_2$, any filter, as proved, converges to at most one point.
(ii) $\Rightarrow$ (iii): Assume that all limits in $(X,\tau)$ are unique along any nets. Therefore, the net based on a filter $\mathcal{F}$ converges to $x$ and to no other point of $X$. By Theorem 9.21, it follows that $\mathcal{F}$ also converges to $x$ and to no other point of $X$. Let $D := \{(x,x) \in X^2: x \in X\}$. Then the diagonal $D$ will contain all nets $(x_\lambda,x_\lambda)$. By Proposition 9.10, a point $(x,y) \in \overline{D}$ if and only if there is a net $(x_\lambda,x_\lambda)$ in $D$ with $(x_\lambda,x_\lambda) \to (x,y)$. Thus, if we show that $x = y$, it would imply that $\overline{D} = D$. By Theorem 9.14, $x_\lambda \to x$ and $x_\lambda \to y$, so the statement $x = y$ easily follows from the uniqueness of limits along nets. Therefore, each point of $\overline{D}$ has the form $(x,x)$ and is the limit of a net $(x_\lambda,x_\lambda) \to (x,x)$. The latter yields $\overline{D} = D$.
(iii) $\Rightarrow$ (i): It can be directly taken from (iii) $\Rightarrow$ (i) of Theorem 3.11. □
The next two results are analogous to Lemma 4.11 and Proposition 4.12 and are left to students as exercises.
9.27 Lemma. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous functions and let $(Y,\tau')$ be $T_2$. Then the set $S := \{x: f(x) = g(x)\}$ is closed in $X$.
9.28 Proposition. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous maps and let $(Y,\tau')$ be $T_2$. If $f$ and $g$ coincide on some dense set $D \subseteq X$, then $f = g$ on $X$.
PROBLEMS
9.1 Show that the filter $\mathcal{F}$ in Remark 9.2 (ii) is the smallest filter containing the filter base $\mathcal{F}_b$.
9.2 Show that a filter base for a filter is a filter base.
9.3 Let $X$ be a set and $A \subseteq X$. Define $\mathcal{F} := \{F \in \mathcal{P}(X): A \subseteq F\}$. Show that $\mathcal{F}$ is a filter on $X$. Give the smallest filter base $\mathcal{F}_b$ on $X$ containing the set $A$.
For Problems 9.4-9.7, let $\mathcal{F}$ be a filter on $X$ and $A \subseteq X$.
9.4 Show that if one element $F \in \mathcal{F}$ meets $A$, then $A$ meets all other elements of $\mathcal{F}$. In this case we say $\mathcal{F}$ meets $A$.
9.5 Let $\mathcal{F}$ meet $A$. Show that $\mathcal{F}_A := \{F \cap A: F \in \mathcal{F}\}$ is a filter on $A$, called the trace of the filter $\mathcal{F}$ on $A$.
9.6 Show that $\mathcal{F}' := \mathcal{F} \cup \Bigl(\bigcup_{B \supseteq A} \mathcal{F}_B\Bigr)$ is the smallest filter containing $\mathcal{F} \cup \{A\}$.
9.7 Show that $\mathcal{F}_A$ is a filter base for $\mathcal{F}'$.
9.8 Show that $x$ is an accumulation point of a filter $\mathcal{F}$ if and only if $x \in \bigcap \{\overline{F}: F \in \mathcal{F}\}$.
9.9 Show that if a filter $\mathcal{F}$ converges to $x$, then $x$ is an accumulation point of $\mathcal{F}$.
9.10 Let $(X,d)$, $(Y,\rho)$ be metric spaces, $x_0 \in X$, $l \in Y$. Show that the following statements are equivalent:
(i) $\lim_{x \to x_0} f(x) = l$ (in the sense of Definition 9.6 (iv) and Example 9.7 (ii)).
(ii) For each $\varepsilon > 0$, there is a $\delta > 0$ such that for all $x \in X$ with $d(x,x_0) < \delta$, $\rho(f(x),l) < \varepsilon$.
[Hint: Work with the system of open balls as a filter base.]
9.11 Prove Proposition 9.16.
9.12 Prove Theorem 9.18.
9.13 Prove Theorem 9.19.
9.14 Show that $(\Lambda_{\mathcal{F}}, \preceq)$ is a directed set.
9.15 Show that $(\Lambda_{\mathcal{F}_b}, \preceq)$ is a directed set.
9.16 Prove Theorem 9.24.
9.17 Show that the net based on an ultrafilter is an ultranet.
9.18 Show that the filter generated by an ultranet is an ultrafilter.
9.19 Generalize Theorem 3.11, replacing condition (ii) by the condition: each net or filter in $(X,\tau)$ converges to no more than one point.
9.20 Prove Lemma 9.27.
9.21 Prove Proposition 9.28.
NEW TERMS: filter; filter base; filter base for a filter; ultrafilter; filter generated by a filter base; neighborhood filter; maximal filter; convergence of a filter; limit point of a filter; convergence of a filter base; accumulation point of a filter; accumulation point of a filter base; convergence of a function along a filter base; limit of a function at a point; continuity of a function at a point; directed set; net; net induced by a directed set; $\lambda_0$-tail of a net; net, cofinally in a set; accumulation point of a net; convergence of a net to a point; limit point of a net; ultranet; net generated by a neighborhood base; partition of an interval; refinement of a partition; Darboux lower sum; Darboux upper sum; Riemann integral; function convergent along a net; continuity of a function, criterion of; convergence of a net to a point, criterion of; filter base generated by a net; accumulation point of a net, criterion of; convergence of a filter to a point, criterion of; convergence of a filter base to a point, criterion of; uniqueness of limits along nets and filters, criteria of; filter that meets a set; trace of a filter on a set
10. SEPARATION
In this section we will see that the fineness of a topology is characterized by its ability to separate points and sets. We will treat some special types of topological spaces that have qualities somewhat similar to those of Hausdorff spaces, introduced in Section 3 and given here in weaker or stronger forms. In addition to countability, this is another attempt to arrive at various classes of topological spaces that share properties with metric spaces and yet are considerably more general.
10.1 Definitions. Let $(X,\tau)$ be a topological space.
(i) $(X,\tau)$ is called a $T_0$ space if for each pair of points $x \neq y \in X$, there is a neighborhood of $x$, $U_x$, such that $y \in U_x^c$.
(ii) $(X,\tau)$ is called a $T_1$ space if for each pair $x \neq y \in X$, there are $U_x$ and $U_y$ such that $y \in U_x^c$ and $x \in U_y^c$.
(iii) $(X,\tau)$ is called a $T_2$ space (or Hausdorff) if $\forall x \neq y \in X$, $\exists\, U_x, U_y$: $U_x \cap U_y = \emptyset$.
(iv) $(X,\tau)$ is called regular if for every closed set $F \subseteq X$ and for every point $x \in F^c$, there are disjoint open sets $O_x$ and $O$ such that $F \subseteq O$ and $x \in O_x$.
(v) $(X,\tau)$ is called a $T_3$ space if it is regular and it is a $T_1$ space.
(vi) $(X,\tau)$ is called completely regular if every closed set $F \subseteq X$ and every point $x \in F^c$ can be separated by a continuous function, i.e. if there is a continuous function $f: (X,\tau) \to ([0,1],\tau_e)$ with $f(x) = 0$ and $f_*(F) = \{1\}$.
(vii) $(X,\tau)$ is called Tychonov if it is completely regular and a $T_1$ space.
(viii) $(X,\tau)$ is called normal if any two disjoint closed sets have disjoint open supersets.
(ix) $(X,\tau)$ is called a $T_4$ space if it is normal and a $T_1$ space.
(x) $(X,\tau)$ is called locally compact if every point of $X$ has at least one compact neighborhood.
10.2 Lemma. The following are equivalent:
(i) $(X,\tau)$ is $T_1$.
(ii) Each one-point set is closed.
(iii) Every subset of $X$ equals the intersection of all open sets containing this set.
Proof. (i) $\Rightarrow$ (ii): Let $(X,\tau)$ be $T_1$ and let $x \in X$. Then by the definition, each $y$ ($\neq x$) has a neighborhood disjoint from $\{x\}$; for instance, $X\setminus\{x\}$ is one such neighborhood. By the definition of a neighborhood, there is an open neighborhood, say $O_y \subseteq X\setminus\{x\}$. Thus, $y$ is an interior point of $X\setminus\{x\}$. Since $y \in X\setminus\{x\}$ was an arbitrary choice, it follows that $X\setminus\{x\}$ is an open set. [Observe that Hausdorff spaces have the same property, cf. Problem 3.2.]

(ii) $\Rightarrow$ (iii): Assume that each singleton in $(X,\tau)$ is closed. Let $A \subseteq X$. Then
$$A = \bigcap_{x \in A^c} (X\setminus\{x\}).$$
Now, the statement follows from the fact that $X\setminus\{x\}$ is open and that $A \subseteq X\setminus\{x\}$, $\forall x \in A^c$.

(iii) $\Rightarrow$ (i): Assume that every subset $A \subseteq X$ is the intersection of all open sets containing $A$. Let $A = \{x\}$. Then $\{x\}$ is the intersection of all open neighborhoods $O_x$ of $x$, i.e. $\{x\} = \bigcap O_x$. Let $y$ be a point such that there is no open set $O_x$ that does not contain $y$. This implies that $y \in O_x$ for every $O_x$ and hence $y \in \bigcap O_x$ and $y = x$. Consequently, every $y \neq x$ is excluded from some open neighborhood of $x$, which is the $T_1$ property. □
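The separation axioms and the criterion of Lemma 10.2 can be checked mechanically on a finite space. The following is a minimal sketch, assuming a topology is given as a collection of frozensets; the function names and the two sample topologies (the Sierpinski space and the discrete space on two points) are illustrative choices and do not come from the text.

```python
from itertools import combinations

def is_T0(points, opens):
    # Some open set contains exactly one point of each pair.
    return all(any((x in O) != (y in O) for O in opens)
               for x, y in combinations(points, 2))

def is_T1(points, opens):
    # For every ordered pair x != y some open set contains x but not y.
    return all(any(x in O and y not in O for O in opens)
               for x in points for y in points if x != y)

def is_T2(points, opens):
    # Every pair of distinct points has disjoint open neighborhoods.
    return all(any(x in U and y in V and not (U & V)
                   for U in opens for V in opens)
               for x, y in combinations(points, 2))

def singletons_closed(points, opens):
    # {x} is closed exactly when its complement is open (Lemma 10.2 (ii)).
    X = frozenset(points)
    return all(X - {x} in opens for x in points)

X = {0, 1}
sierpinski = {frozenset(), frozenset({1}), frozenset({0, 1})}
discrete = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}

print(is_T0(X, sierpinski), is_T1(X, sierpinski), singletons_closed(X, sierpinski))
print(is_T1(X, discrete), is_T2(X, discrete), singletons_closed(X, discrete))
```

In both sample spaces `is_T1` and `singletons_closed` agree, as the equivalence (i) $\Leftrightarrow$ (ii) of the lemma predicts.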
10.3 Proposition. If $(X,\tau)$ is a $T_i$ space then the following diagram holds:
$$T_4 \Rightarrow T_3 \Rightarrow T_2 \Rightarrow T_1 \Rightarrow T_0.$$

Proof. Indeed, $T_2 \Rightarrow T_1 \Rightarrow T_0$ is obvious. Since a $T_3$ space is $T_1$, by Lemma 10.2 we may take $F = \{y\}$, which is closed, to get $T_2$. Similarly, by taking one of the two disjoint closed sets to be the singleton $\{x\}$, which is closed by Lemma 10.2, we have $T_4 \Rightarrow T_3$. □

10.4 Example. Let $X$ be any infinite set equipped with the cocountable topology $\tau = \{X, \emptyset, C^c : |C| \leq |\mathbb{N}|\}$ (introduced in Problem 1.7). Thus, by the definition, all at most countable sets are closed; specifically, all singletons are closed. Thus, by Lemma 10.2, $\tau$ must be $T_1$. Similarly, any cofinite topology (cf. Problem 1.1) is $T_1$.

Now let $O_1$ and $O_2$ be any two open sets in a cofinite topology with an infinite carrier. We show that $O_1$ and $O_2$ cannot be disjoint unless $O_1$ or $O_2$ is empty. If they are disjoint and nontrivial then $O_1 \subseteq O_2^c$, which is impossible, for $O_2^c$ must be finite and $O_1$ is infinite. Thus no cofinite topology on an infinite carrier can be $T_2$. Similarly, no cocountable topology on a carrier whose cardinal number is greater than $\aleph_0$ can be $T_2$.
10.5 Theorem. The following are equivalent for a topological space $(X,\tau)$:

(i) $X$ is regular.
(ii) If $O_x$ is an open neighborhood of $x$ then there exists an open set $U$ which contains $x$ and such that $\bar{U} \subseteq O_x$.
(iii) Each $x \in X$ has a neighborhood base consisting of closed sets.

Proof. (i) $\Rightarrow$ (ii). Suppose $X$ is regular. Let $x \in O \in \tau$. Then $O^c$ is closed and $x \notin O^c$, and by regularity of $X$ there are disjoint open sets $U$ and $W$ such that $x \in U$ and $O^c \subseteq W$. Clearly, $W^c$ is closed and $U \subseteq W^c \subseteq O$. Furthermore, $\bar{U} \subseteq W^c \subseteq O$.

(ii) $\Rightarrow$ (iii). If $\mathfrak{B}_x$ is a neighborhood base at $x$, then for each $B \in \mathfrak{B}_x$ there is an open subset $O$ of $B$ containing $x$ and, if (ii) applies, there is an open subset $U$ of $O$ containing $x$ whose closure is in $O$. This way, we can form a neighborhood base at $x$ which consists of closed sets.

(iii) $\Rightarrow$ (i). Let $F$ be a closed set such that $x \in F^c$. Then, if (iii) applies, there is a closed neighborhood $B$ of $x$ such that $B \subseteq F^c$. As a neighborhood of $x$, $B$ is such that $\mathrm{Int}\,B \neq \emptyset$ and $\mathrm{Int}\,B$ is an open neighborhood of $x$ (for there is an open subset of $B$ that is a neighborhood of $x$). Now we have that $\mathrm{Int}\,B$ is disjoint with $B^c$, $x \in \mathrm{Int}\,B$, and $F \subseteq B^c$. Hence, $X$ is regular. □
10.6 Proposition. A compact Hausdorff space is regular.

Proof. Let $F$ be a closed subset of a compact Hausdorff space $(X,\tau)$ and let $x \in F^c$. For each $a \in F$, there are open neighborhoods $V_a$ and $U_a$ of $a$ and $x$, respectively, which are disjoint. Because $F$ is closed, by Theorem 6.9 it is also compact, and therefore there is a finite open subcover $\{V_{a_1},\dots,V_{a_n}\}$ of $F$ reduced from $\{V_a : a \in F\}$. If
$$V = \bigcup_{k=1}^{n} V_{a_k} \quad \text{and} \quad U = \bigcap_{k=1}^{n} U_{a_k},$$
then $U$ and $V$ are disjoint open sets such that $x \in U$ and $F \subseteq V$. □
10.7 Corollary. A compact Hausdorff space is normal.

Proof. Let $A$ and $B$ be disjoint closed subsets of $(X,\tau)$. Since $(X,\tau)$ is regular, for each $a \in A$ there are disjoint open sets $U_a$ and $V_a$ such that $U_a$ is a neighborhood of $a$ and $V_a$ is a superset of $B$. Because $A$ is compact, $\{U_a\}$ is reduced to a finite subcover $\{U_{a_1},\dots,U_{a_n}\}$ whose union is $U$. Let
$$V = \bigcap_{k=1}^{n} V_{a_k}.$$
Then $B \subseteq V$, which is open, and $U$ and $V$ are disjoint. □

The class of locally compact Hausdorff spaces that we explore below will be useful in Chapter 8 when dealing with measures and integration.

10.8 Examples.

(i) Observe that by Theorem 6.9, a compact topological space $(X,\tau)$ is also locally compact [i.e. $\bar{U}_x \subseteq X$ must be compact].

(ii) The space $(\mathbb{R}^n,\tau_e)$ is not compact but locally compact: every point $x \in \mathbb{R}^n$ has a compact neighborhood $[x-\delta, x+\delta]$. □
10.9 Theorem. Each locally compact Hausdorff space $(X,\tau)$ has the property that each point of $X$ has a neighborhood base consisting of open sets whose closures are compact.

Proof. Let $(X,\tau)$ be a locally compact Hausdorff space. Choose a point $x \in X$. Let $U$ be any neighborhood of $x$ and $K$ be a compact neighborhood of $x$, which is guaranteed by Definition 10.1 (x). Denote $O = \mathrm{Int}(K \cap U)$. As a closed subset of $K$ ($O \subseteq K \Rightarrow \bar{O} \subseteq \bar{K} = K$), by Theorem 6.9, $\bar{O}$ is compact in $\tau \cap K$. By Proposition 10.6, as a compact Hausdorff subspace, $\bar{O}$ is regular. As an open neighborhood of $x$ in $(X,\tau)$ and a subset of $\bar{O}$, $O$ is also open in $\tau \cap \bar{O}$. By Theorem 10.5, there is an open neighborhood $W$ of $x$ in $\tau \cap \bar{O}$ such that its closure in $\tau \cap \bar{O}$ satisfies $\bar{W} \subseteq O$. (It is easily seen that $W$ is also open in $\tau$.) Since $\bar{O}$ is a compact subspace, $\bar{W}$ is compact in $\bar{O}$.

We need to show that $\bar{W}$ is also compact in $(X,\tau)$. Let $\{V_s\}$ be an open cover of $\bar{W}$ in $\tau$. Then $\{V_s \cap \bar{O}\}$ is obviously an open cover of $\bar{W}$ in $\tau \cap \bar{O}$. This cover can be reduced to a finite subcover $\{V_1 \cap \bar{O},\dots,V_k \cap \bar{O}\}$ and therefore $\{V_1,\dots,V_k\}$ is a finite subcover of $\bar{W}$ in $\tau$.

In a nutshell, we showed that an arbitrary neighborhood $U$ of $x$ has an open subneighborhood $W$ whose closure is compact. Hence, a neighborhood base at $x$ yields a neighborhood base consisting of open sets whose closures are compact. In particular, every point of $X$ possesses a neighborhood base consisting of compact sets. □
10.10 Proposition. Let $(X,\tau)$ be a locally compact Hausdorff space and let $U$ be an open neighborhood of a point $x$. Then there is an open neighborhood $O_x$ of $x$ such that $\bar{O}_x \subseteq U$ and $\bar{O}_x$ is compact.

(See Problem 10.6.)

10.11 Proposition. Let $K$ be a compact set in a locally compact Hausdorff space $(X,\tau)$ and $W$ be an open superset of $K$. Then there is an open superset $U$ of $K$ such that $\bar{U} \subseteq W$ and $\bar{U}$ is compact.

Proof. By Proposition 10.10, each point $x$ of $K$ has an open neighborhood $U_x$ whose closure is compact and included in $W$. If we cover $K$ by all $U_x$'s, then, because of compactness of $K$, this cover can be reduced to a finite subcover, say $U_1,\dots,U_n$. If $U = U_1 \cup \dots \cup U_n$, then clearly
$$K \subseteq U \subseteq \bar{U} = \bar{U}_1 \cup \dots \cup \bar{U}_n \subseteq W.$$
As a finite union of compact sets, $\bar{U}$ is compact. □

The next result is a small and useful consequence of Proposition 10.11 (whose proof we assign to Problem 10.8). It states that every locally compact Hausdorff space is "weakly" normal. Recall that a space is normal if every two disjoint closed sets can be separated, i.e. they have disjoint open supersets. In a locally compact Hausdorff space, the same property applies to compact sets, which, as we know (cf. Theorem 6.10), are closed in Hausdorff spaces. In other words, any two disjoint compact sets can be separated by disjoint open supersets.
10.12 Corollary. In a locally compact Hausdorff space any two disjoint compact sets have disjoint open supersets.

The theorem below is quite famous and is known as Urysohn's Lemma. Given two disjoint closed sets $A$ and $B$ in a normal space $(X,\tau)$, the lemma asserts the existence of a real-valued continuous function $f$ on $X$ that "separates" the two given sets, i.e. $f: X \to [0,1]$ such that $f_*(A) = 0$ and $f_*(B) = 1$. (The original proof guarantees the existence of a function $f$ from $X$ onto $[0,1]$, but with a simple transformation, the range of $f$ can be made $[a,b]$.) Whenever we talk about real-valued functions from $X$ to $\mathbb{R}$, we will mean the usual topology in $\mathbb{R}$.

The following short biographical note on Pavel S. Urysohn will add to the prominence of his widely cited lemma. Pavel Samuilovich Urysohn (born in 1898 in Odessa, Russia), according to Pavel S. Alexandrov, was the founder of the Russian school of topology. He studied mathematics under Nikolai N. Lusin at Moscow State University, from which he was awarded a doctoral degree in 1921. He tragically died by drowning in Brittany, France (at the early age of 26), while visiting one of the mathematical conferences. Among the significant results Urysohn obtained during his less than four years of academic work was his treatment of one of the central problems in topology, the dimension of arbitrarily complex geometrical figures.
10.13 Theorem (Urysohn's Lemma). A space $(X,\tau)$ is normal if and only if whenever $A$ and $B$ are disjoint closed sets in $X$, there is a continuous function $f: X \to [0,1]$ such that $f_*(A) = 0$ and $f_*(B) = 1$.

Proof.

1. Necessity. We assume that $(X,\tau)$ is normal and that $A$ and $B$ are disjoint closed sets. By normality of $(X,\tau)$ and Problem 10.7, there is an open superset $U_{1/2}$ of $A$ such that
$$\bar{U}_{1/2} \cap B = \emptyset.$$
Now, the sets $A$ and $(U_{1/2})^c$ are disjoint and closed. By normality, there are open supersets $U_{1/4}$ and $V$ of $A$ and $(U_{1/2})^c$, respectively, such that
$$U_{1/4} \cap V = \emptyset.$$
Therefore, $U_{1/4} \subseteq V^c \subseteq U_{1/2}$, and this yields that $\bar{U}_{1/4} \subseteq V^c \subseteq U_{1/2}$. Since $B$ and $\bar{U}_{1/2}$ are disjoint and closed, by Problem 10.7 there is an open superset $U_{3/4}$ of $\bar{U}_{1/2}$ such that $\bar{U}_{3/4} \cap B = \emptyset$. In summary,
$$A \subseteq U_{1/4} \subseteq \bar{U}_{1/4} \subseteq U_{1/2} \subseteq \bar{U}_{1/2} \subseteq U_{3/4} \quad \text{and} \quad \bar{U}_{3/4} \cap B = \emptyset.$$

For convenience, we display one more step. Repeating the above arguments, there are open sets
$$U_{1/8},\ U_{3/8},\ U_{5/8},\ U_{7/8}$$
that are embedded in the following way:
$$A \subseteq U_{1/8} \subseteq \bar{U}_{1/8} \subseteq U_{1/4} \subseteq \bar{U}_{1/4} \subseteq U_{3/8} \subseteq \dots \subseteq U_{3/4} \subseteq \bar{U}_{3/4} \subseteq U_{7/8} \quad \text{and} \quad \bar{U}_{7/8} \cap B = \emptyset.$$
Continuing the same process, we define sets $U_{i/2^n}$, $i = 1,\dots,2^n - 1$, which are embedded as
$$A \subseteq U_{1/2^n} \subseteq \bar{U}_{1/2^n} \subseteq U_{2/2^n} \subseteq \dots \subseteq U_{(2^n-1)/2^n} \quad \text{and} \quad \bar{U}_{(2^n-1)/2^n} \cap B = \emptyset.$$

Let $D_0$ denote the set of all dyadic rationals belonging to $[0,1]$, i.e. those numbers of the form $i/2^n$ where $i = 0,1,\dots,2^n$ and $n = 0,1,\dots$, and let $D$ be the subset of dyadic rationals from $(0,1)$, i.e. $D_0\setminus\{0,1\}$. It is easy to show that $D_0$ is dense in $[0,1]$. By induction, we can construct the countable family $\{U_d : d \in D\}$ of open sets indexed by the elements of $D$ such that for each pair $p,q \in D$ with $p < q$,
$$A \subseteq U_p \subseteq \bar{U}_p \subseteq U_q \quad \text{and} \quad \bar{U}_q \cap B = \emptyset.$$

Let $U$ denote the union of all $U_d$'s. Now, we introduce the function
$$f(w) = \begin{cases} \inf\{p : w \in U_p\}, & \text{if } w \text{ belongs to some } U_p, \\ 1, & w \in X\setminus U, \end{cases}$$
on $X$. Clearly, $f_*(A) = 0$ and $f_*(B) = 1$, and the values of $f$ lie in $[0,1]$. We prove that $f$ is continuous at each point $w$ of $X$. Continuity rests on the following observations. It is easy to show that:

if $w \in \bar{U}_p$ then $f(w) \leq p$;
if $w \notin U_p$ then $f(w) \geq p$;
hence, if for $p < q$, $w \in U_q\setminus\bar{U}_p$, then $p \leq f(w) \leq q$.

By Definition 4.1, $f$ is continuous at $w$ if for every neighborhood $W_{f(w)}$ there is a neighborhood $V_w$ such that $f_*(V_w) \subseteq W_{f(w)}$. Let $f(w) \in (0,1)$ and let $(a,b) = W_{f(w)}$ be any open subinterval of $[0,1]$ containing $f(w)$. Because $D$ is dense in $[0,1]$, there is a pair of dyadic rationals $p,q \in D$ such that
$$a < p < f(w) < q < b.$$
Now, the open set $V_w = U_q\setminus\bar{U}_p$ is a neighborhood of $w$ such that $f_*(V_w) \subseteq (a,b)$. It is a rather routine procedure to verify the continuity of $f$ at points where it takes the values 0 and 1. This completes the necessity of the statement.

2. Sufficiency. Assume that for any two closed disjoint sets $A$ and $B$, there is a continuous function $f: X \to [0,1]$ such that $f_*(A) = 0$ and $f_*(B) = 1$. Since $f$ is continuous, for any $\varepsilon \in (0,1)$ the sets $f^*([0,\varepsilon))$ and $f^*((\varepsilon,1])$ are open in $(X,\tau)$ and they contain $A$ and $B$, respectively. They are also disjoint, and hence $(X,\tau)$ is normal. □
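The infimum formula for $f$ can be tried out on a concrete nested family. The sketch below is an illustration under assumed data, not the construction of the proof itself: it takes $X = [0,3]$, $A = [0,1]$, $B = [2,3]$ and the hypothetical family $U_p = [0, 1+p)$, which satisfies $A \subseteq U_p \subseteq \bar{U}_p \subseteq U_q$ for $p < q$ and $\bar{U}_q \cap B = \emptyset$. The infimum is approximated by using only dyadic rationals with denominator $2^{10}$.

```python
from fractions import Fraction

def dyadics(level):
    """Dyadic rationals i / 2**level lying strictly between 0 and 1."""
    n = 2 ** level
    return [Fraction(i, n) for i in range(1, n)]

def in_U(w, p):
    # Assumed nested family on X = [0, 3]: U_p = [0, 1 + p).
    return 0 <= w < 1 + p

def urysohn(w, level=10):
    # inf{p : w in U_p}, restricted to dyadics of the given level;
    # the value is 1 when w lies in no U_p.
    hits = [p for p in dyadics(level) if in_U(w, p)]
    return min(hits) if hits else Fraction(1)

for w in (Fraction(1, 2), Fraction(3, 2), Fraction(5, 2)):
    print(w, float(urysohn(w)))
```

For this family the exact function is $f(w) = \max(0, w - 1)$ on $[0,2)$ and $f(w) = 1$ on $[2,3]$; the finite-level approximation differs from it by at most $2^{-10}$.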
10.14 Corollary. A $T_4$ space is Tychonov.

Proof. Let $(X,\tau)$ be a $T_4$ space. By Lemma 10.2, as in any $T_1$ space, each singleton in $(X,\tau)$ is closed. Since a $T_4$ space is normal, given an $x$ and a closed set $F$ to which $x$ does not belong, by Urysohn's Lemma there is a continuous function $f$ with values in $[0,1]$ which separates $\{x\}$ and $F$. Hence, $(X,\tau)$ is completely regular. In addition, $(X,\tau)$ is a $T_1$ space. □

10.15 Corollary (Urysohn). Let $K$ and $W$ be compact and open sets, respectively, in a locally compact Hausdorff space $(X,\tau)$ such that $K \subseteq W$. Then there is a continuous function $[X,[0,1],f]$ such that $f_*(K) = \{1\}$ and $f_*(G) = \{0\}$, where $G^c$ is a compact subset of $W$ containing $K$.

Proof. By Proposition 10.11, there is an open superset $U$ of $K$ whose closure $\bar{U}$ is compact and is contained in $W$. Since the subspace $(\bar{U}, \tau \cap \bar{U})$ is compact Hausdorff, by Corollary 10.7 it is normal. Then, by Urysohn's Lemma, for any two disjoint closed subsets $A$ and $B$ of $\bar{U}$, there is a continuous function $[\bar{U},[0,1],\varphi]$ such that $\varphi_*(A) = \{0\}$ and $\varphi_*(B) = \{1\}$. Now, if we take $A = \bar{U}\setminus U$ and $B = K$, we have two disjoint closed subsets of $\bar{U}$ (see Theorem 6.10) in the scenario of Urysohn's Lemma. Next, we extend the function $\varphi$ to $X$ by letting $f_*(X\setminus\bar{U}) = 0$, where $f$ denotes the extension of $\varphi$ from $\bar{U}$ to $X$. Hence,
$$f_*(U^c) = \{0\},$$
in particular, on its subset $G = (\bar{U})^c$.

It remains to show that $f$ is continuous. Let $C$ be any closed subset of $[0,1]$. If $C$ does not contain 0, then $f^*(C) = \varphi^*(C)$ is closed in $\bar{U}$ and, therefore, it is closed in $(X,\tau)$ (as the traces of all $\tau$-closed sets on $\bar{U}$ are all closed sets in $\tau \cap \bar{U}$ and they are closed in $\tau$). If $0 \in C$, then
$$f^*(C) = f^*(C \cup \{0\}) = \varphi^*(C) \cup U^c$$
is also closed in $(X,\tau)$. □
10.16 Definition and Notation. Let $(X,\tau)$ be a topological space. Any at most countable intersection of open sets is denoted by $G_\delta$. Any at most countable union of closed sets is denoted by $F_\sigma$. A set is referred to as $\sigma$-compact, in notation $K_\sigma$, if it is an at most countable union of compact sets. □

10.17 Proposition. Let $(X,\tau)$ be a second countable locally compact Hausdorff space. Then each open set is an $F_\sigma$- and $K_\sigma$-set and each closed set is a $G_\delta$.

Proof. Let $\mathfrak{B}$ be a countable basis for $\tau$ and let $U \in \tau$. By Proposition 10.10, each point $x \in U$ has an open neighborhood $O_x$ such that $\bar{O}_x \subseteq U$ and $\bar{O}_x$ is compact. On the other hand, $O_x$ can be represented as a union of some sets from $\mathfrak{B}$. Let $B_x$ be one such subset of $O_x$ that contains $x$. Then $\bar{B}_x \subseteq \bar{O}_x$, $\bar{B}_x$ is compact, and $\bar{B}_x \subseteq U$. Consequently, $U$ can be represented as
$$U = \bigcup_{x \in U} B_x = \bigcup_{x \in U} \bar{B}_x.$$
Since all $B_x$'s are elements of $\mathfrak{B}$, which is countable, the family $\{B_x : x \in U\}$ automatically reduces to a countable cover of $U$ and so does $\{\bar{B}_x : x \in U\}$. In other words, $U$ is an $F_\sigma$- and $K_\sigma$-set.

Let $F$ be a closed set. Then $F^c$ is an $F_\sigma$- and $K_\sigma$-set. Thus,
$$F^c = \bigcup_{n=1}^{\infty} C_n$$
for some closed sets $C_n$. Obviously,
$$F = \bigcap_{n=1}^{\infty} C_n^c$$
is a $G_\delta$-set. □
10.18 Corollary. Every second countable locally compact Hausdorff space is $\sigma$-compact.

10.19 Examples.

(i) (Topology on $\bar{\mathbb{R}}$). In Example 1.2 (iv) we constructed a topology on the extended real line. There is another way to do it. Define the map $f$ of $A = [-\frac{\pi}{2}, \frac{\pi}{2}]$ onto $\bar{\mathbb{R}}$ as follows:
$$f(x) = \begin{cases} -\infty, & x = -\frac{\pi}{2}, \\ \tan x, & -\frac{\pi}{2} < x < \frac{\pi}{2}, \\ +\infty, & x = \frac{\pi}{2}. \end{cases}$$
Now let us define a topology on $\bar{\mathbb{R}}$. First of all, we consider the relative topology $\tau_A := A \cap \tau_e$ on $A$, i.e. the topology relative to the usual topology $\tau_e$ on $\mathbb{R}$, and then define the topology $\bar{\tau}$ on $\bar{\mathbb{R}}$ as $\bar{\tau} := f(\tau_A)$. Since $f$ is evidently bijective, $\bar{\tau}$ is indeed a topology on $\bar{\mathbb{R}}$. Furthermore, $f$ is a homeomorphism and the spaces $(A,\tau_A)$ and $(\bar{\mathbb{R}},\bar{\tau})$ are homeomorphic. Since $A$ is compact, by Theorem 4.3, we conclude that $\bar{\mathbb{R}}$ is also compact. This example shows that by adding two more points to $\mathbb{R}$ we made a compact space from a non-compact one. We observe that $\mathbb{R} \subset \bar{\mathbb{R}}$, as well as $\tau_e \subset \bar{\tau}$. Such a process is called a compactification.
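The bijection between $[-\frac{\pi}{2}, \frac{\pi}{2}]$ and the extended real line used in Example (i) can be written out directly. This is a minimal sketch using floating-point infinities to stand for $\pm\infty$; the names `f`, `f_inv` and `HALF_PI` are illustrative.

```python
import math

HALF_PI = math.pi / 2

def f(x):
    """Map [-pi/2, pi/2] onto the extended real line."""
    if x == -HALF_PI:
        return -math.inf
    if x == HALF_PI:
        return math.inf
    if not -HALF_PI < x < HALF_PI:
        raise ValueError("x must lie in [-pi/2, pi/2]")
    return math.tan(x)

def f_inv(y):
    """Inverse map; math.atan sends +inf and -inf to +pi/2 and -pi/2."""
    return math.atan(y)

for x in (-HALF_PI, -1.0, 0.0, 1.0, HALF_PI):
    print(x, f(x), f_inv(f(x)))
```

The endpoints are handled separately because `math.tan(math.pi / 2)` evaluates to a large finite number in floating-point arithmetic rather than to infinity.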
(ii) Let $(X,\tau)$ be a compact Hausdorff space and let $x \in X$. From Problem 3.2 we have that $X\setminus\{x\}$ is open. Then by the previous theorem, $X\setminus\{x\}$, with the relative topology on it, is locally compact. Consider now the inverse process, where we take $X\setminus\{x\}$ and then give the point $x$ back, which makes the space compact again. This is a very special case of a one-point compactification, unlike the two-point compactification discussed in Example (i).

This example inspires a more general approach to the one-point compactification of a locally compact space. Let $(X,\tau)$ be a locally compact Hausdorff topological space and let $\omega \notin X$. Define $X' := X \cup \{\omega\}$. Now we construct a new topology $\tau'$ on $X'$ containing $\tau$ and the sets of type $(X\setminus C) \cup \{\omega\}$, where $C$ is a $\tau$-compact subset of $X$. We are going to prove the following result, which essentially belongs to Pavel S. Alexandrov.
Theorem 10.20. The following hold true.

a) $(X',\tau')$ is a topological space.
b) $(X',\tau')$ is Hausdorff.
c) $(X',\tau')$ is compact.

Proof. We just show c). Let $\{G_i;\ i \in I\}$ be an open cover of $X'$. Then there is an index $i_0 \in I$ such that $\omega \in G_{i_0}$. Hence,
$$G_{i_0} = (X\setminus C) \cup \{\omega\},$$
where $C$ is $\tau$-compact; specifically, $\bigcup_{k=1}^{n} G_k$ is an open subcover of $C$ (without loss of generality, we took $1,\dots,n$ as the relevant indices for the finite subcover). Now,
$$X' = (X'\setminus C) \cup C = (X\setminus C) \cup \{\omega\} \cup C \subseteq G_{i_0} \cup \bigcup_{k=1}^{n} G_k.$$
Therefore, $X'$ is compact. □

10.21 Definition. The point $\omega$ of the compactification is called the point at infinity. The one-point compactification process described above is called the Alexandrov compactification.
PROBLEMS

10.1 Show that $T_2$ is hereditary, i.e. every subspace (relative topology) of a $T_2$ space is $T_2$.

10.2 Show that the Tychonov product topology of $T_2$ factor spaces is $T_2$.

10.3 Show that if the Tychonov product topology is $T_2$ then each factor space is also $T_2$.

10.4 Show that local compactness is weakly hereditary.

10.5 Show that local compactness is vaguely hereditary.

10.6 Prove Proposition 10.10. [Hint: Use Problem 6.14.]

10.7 Let $(X,\tau)$ be a normal space. Let $A$ and $B$ be two disjoint closed sets in $(X,\tau)$. Show that there is an open superset $U$ of $A$ such that $\bar{U} \cap B = \emptyset$.

10.8 Prove Corollary 10.12.

10.9 Prove that a product of Hausdorff spaces is Hausdorff.

10.10 Show that regularity is hereditary. Show that a subspace of a normal space need not be normal.

10.11 Show that a product of regular spaces is regular. Show that a product of normal spaces need not be normal.

10.12 Prove that every metrizable space is normal.

10.13 Prove that every regular space with a countable base is normal.

10.14 Prove that in every $\sigma$-compact and locally compact Hausdorff space $(X,\tau)$ there is a sequence $\{K_n\}$ of compact sets such that $X = \bigcup_{n=1}^{\infty} K_n$ and
NEW TERMS:

$T_0$ space 182; $T_1$ space 182; $T_2$ space (Hausdorff) 182; Hausdorff space 182; regular space 183; $T_3$ space 183; completely regular space 183; Tychonov space 183; normal space 183; $T_4$ space 183; locally compact space 183; $T_1$ space, criterion of 183; $T$ spaces, diagram of 184; regularity of a space, criteria of 184, 185; compact Hausdorff space 185; normality of a space, criteria of 185, 187; locally compact Hausdorff space, properties of 186, 187, 189, 191, 193; Urysohn's Lemma 187; Urysohn's Corollary 189; $\sigma$-compact space 190; $\sigma$-compactness, criteria of 190, 191, 193; $G_\delta$-set 190; $F_\sigma$-set 190; $K_\sigma$-set 190; compactification 191; one-point compactification 191; Alexandrov compactification 191-192
11. FUNCTIONS ON LOCALLY COMPACT SPACES

In this section we will utilize a version of Urysohn's Theorem for locally compact Hausdorff spaces in connection with a very important subclass of continuous functions that vanish outside compact sets. This will lead to one of the central results in analysis, the so-called Riesz Representation Theorem, explored in Chapter 8.

11.1 Definitions and Notation.

(i) Let $(X,\tau)$ be a topological space. For a real-valued function $[X,\mathbb{R},f]$, the set $\mathrm{Cl}(f^*(\mathbb{R}\setminus\{0\}))$ is called the support, in notation $\mathrm{supp}\,f$ or $\mathrm{supp}(f)$. □

(ii) Given a topological space $(X,\tau)$, the real vector space of all continuous real-valued functions will be denoted by $C(X,\tau;\mathbb{R})$ or, shortly, by $C(X)$. The symbols $C_b(X)$ and $C_c(X)$ denote the subspaces of continuous bounded functions and continuous functions with compact support, respectively. [Obviously, $C_c(X) \subseteq C_b(X) \subseteq C(X)$ and the inclusions can be replaced by equalities if $(X,\tau)$ is compact.]

(iii) Let $K$ and $W$ be a compact and an open subset of $X$, respectively, and $f \in C_c(X)$ such that $0 \leq f \leq 1$. We will denote $K \prec f$ if $f_*(K) = 1$ (hence, $\mathbf{1}_K \leq f$), and $f \prec W$ if $\mathrm{supp}\,f \subseteq W$. In this case we will say that $f$ is subordinate to $W$. If $K \prec f$ and $f \prec W$, we will write
$$K \prec f \prec W$$
and say that $f$ is subordinate to $W$ with respect to $K$. Clearly, if $K \prec f$, then $K \subseteq \mathrm{supp}\,f$. Notice that the use of the symbol "$\prec$" for $f$ with $K$ or $W$ always requires that $0 \leq f \leq 1$.

(iv) Let $\{W_1,\dots,W_n\}$ be a finite open cover of a compact set $K$. An $n$-tuple $\{f_1,\dots,f_n\} \subseteq C_c(X)$ is said to be a partition of unity for $K$ subordinate to (or dominated by) $\{W_1,\dots,W_n\}$ if:

(a) $f_i \prec W_i$, $i = 1,\dots,n$;
(b) $K \prec \sum_{i=1}^{n} f_i$.
196
CHAPTER 3. ELEMENTS O F POINT SET TOPOLOGY
Urysohn's Corollary, given a compact set K and its open superset W in a locally compact Hausdorff space ( X IT), there is an open set U with the compact closure, which are LLsqueezed"between K and W:
To this quadruple, there is a continuous function [X,[O,l],f ] with f ,(UC) = 0 and f ,(K) = 1. Consequently,
Since f is continuous and (0,1] is open in {f # 0) is an open subset of U and therefore,
K
r,n
[O,l], it follows that
E suppf E u.
As a closed subset of a compact set, suppf is compact and hence f E Cc(X) in the scenario of Corollary 10.15. Furthermore,
11.3 Theorem. Let $(X,\tau)$ be a locally compact Hausdorff space and $K$ be a compact set. Then, for any finite open cover of $K$ there is a partition of unity subordinate to this open cover.

Proof. Since $K$ is compact there is at least one finite open cover of $K$, say $\{W_1,\dots,W_n\}$. Let $x \in K$. Then $x$ belongs to at least one of the $W_i$'s, say $W_1$. By Proposition 10.10, there is an open neighborhood $O_x$ of $x$ whose closure $\bar{O}_x$ is compact and such that $\bar{O}_x \subseteq W_1$. The open cover $\{O_x : x \in K\}$ of $K$ can be reduced to a finite subcover, say $\{O_{x_1},\dots,O_{x_k}\}$.

Now, for each $i = 1,\dots,n$, let $H_i$ be the union of those $O_{x_j}$'s for which $\bar{O}_{x_j} \subseteq W_i$, or else set $H_i = \emptyset$ if no such inclusion is available. Obviously, each $H_i$ is an open subset of $W_i$ whose closure, in notation $K_i$, is compact and included in $W_i$. Furthermore, $\{H_1,\dots,H_n\}$ covers $K$. In light of Remark 11.2 applied to the pair of sets $K_i$ and $W_i$, there is a continuous function $[X,[0,1],g_i]$ with compact support such that $(g_i)_*(K_i) = 1$ and $(g_i)_*(U_i^c) = 0$, where $U_i$ is an open superset of $K_i$ whose closure is compact and is contained in $W_i$ and, in terms of (11.2),
$$K_i \prec g_i \prec W_i.$$
Applying Remark 11.2 again, now to the pair of sets $K$ and $\bigcup_{i=1}^{n} H_i$, there is a continuous function $[X,[0,1],g]$ with compact support such that $g_*(K) = 1$ and $g_*(U^c) = 0$, where $U$ is an open superset of $K$ whose closure is compact and contained in $\bigcup_{i=1}^{n} H_i$. (In particular, $g_*\big((\bigcup_{i=1}^{n} H_i)^c\big) = 0$.) In terms of (11.2), we have
$$K \prec g \prec \bigcup_{i=1}^{n} H_i.$$
In summary, we have
$$K \subseteq U \subseteq \bar{U} \subseteq \bigcup_{i=1}^{n} H_i \subseteq \bigcup_{i=1}^{n} K_i \subseteq \bigcup_{i=1}^{n} U_i \subseteq \bigcup_{i=1}^{n} W_i.$$
Let
$$f = \sum_{i=1}^{n} g_i + (1 - g).$$
It is a routine procedure to verify that $f > 0$ on $X$:

$f(x) \geq 1$ for all $x$ between $K$ and $\bigcup_{i=1}^{n} K_i$;
$f(x) = 1$ for all $x$ outside $\bigcup_{i=1}^{n} U_i$;
and $f(x) \geq 1$ for all $x$ between $\bigcup_{i=1}^{n} K_i$ and $\bigcup_{i=1}^{n} U_i$.

This allows us to define the continuous functions
$$f_i = \frac{g_i}{f}, \quad i = 1,\dots,n.$$
It is readily seen that $f_i \geq 0$, $K \prec \sum_{i=1}^{n} f_i$, and that $0 \leq \sum_{i=1}^{n} f_i \leq 1$. Since
$$\mathrm{supp}\,f_i \subseteq \mathrm{supp}\,g_i \subseteq W_i,$$
or in terms of the above notation, $f_i \prec W_i$. Hence, the tuple $\{f_1,\dots,f_n\}$ meets the requirements of the above assertion in terms of Definition 11.1 (iv). □
Proof. The statement follows from Theorem 11.5 immediately for n = 1. 0
198
CHAPTER 3. ELEMENTS O F POINT SET TOPOLOGY
In particular, if W = X I we have 11.5 Corollary. Let K be a compact set in a locally compact Hausdorff space (X,T). Then there is a continuous function [ X I[O,l],f] with compact support such that K 4 f and K S suppf.
Under the condition of Corollary 11.4, let K =
a. Then,
11.6 Corollary. Given an open set W in a locally compact Hausdorff space (X,T), there is a continuous function [X,[O,l],f ] with compact support such that f + W .
We complete this section by the widely referred to Tietze's Extension Theorem.
11.7 Definition. Let (X,T) be a locally compact Hausdorff space and K C U be compact and open sets, respectively. Let C(X;C) and C(K;C) denote the spaces of all continuous complex-valued functions on X and K , respectively. A function F E C(X;C) is said to be a Tietze's extension of a function f E C(K;C) with respect to U, if:
11.8 Theorem (Tietze's Extension). Let (X,T) be a locally compact Hausdorff space, K C U be compact and open sets, respectively. Then for every function f E C(K;C) there is a Tietze's extension with respect t o U. The proof of this theorem is offered a s an exercise in several steps (Problems 11.1-3).
PROBLEMS

11.1 Use Proposition 10.11 to obtain an open set $V$ such that $K \subseteq V \subseteq \bar{V} \subseteq U$ and $\bar{V}$ is compact. Let $\mathcal{E}$ be the subfamily of all continuous, real-valued functions admitting Tietze's extensions with respect to $U$. Show that $\mathcal{E}$ is a subalgebra. Use Proposition 10.10 and Corollary 10.15 to show that $\mathcal{E}$ separates points and that it also contains constant functions.

11.2 Construct an extension $F$ of $f \in \mathcal{E}$ from $K$ to $X$ and show that $\|f\|_u = \|F\|_u$.

11.3 Use the Stone-Weierstrass Theorem 8.3 to prove that the closure of $\mathcal{E}$ with respect to the uniform norm equals $C(K;\mathbb{R})$ and extend the result to complex-valued functions.
NEW TERMS: support of a function 195 space of all continuous real-valued functions 195 subordinance 195 partition of unity for a compact set 195 dominance of functions by sets 195 locally compact Hausdorff space, criteria of 196, 197, 198 Tietze's extension of continuous functions 198 Tietze's Extension Theorem 198
Part II Basics of Measure and Integration

Chapter 4 Measurable Spaces and Measurable Functions

In the previous chapter we studied general topological spaces. A topology was defined as a collection of sets (on a carrier) that is closed with respect to the formation of arbitrary unions and finite intersections. In the present chapter, we introduce various classes of sets similar to topologies but serving other purposes. One of them prepares the student for another part of analysis: integration. Beyond the familiar integration we experienced in calculus, we will need to measure much more general sets than those which are used for the Riemann integral. For instance, we will consider abstract sets that are encountered in the theory of probability. In addition, we will largely extend the existing class of integrable functions.

If we try to measure the length (or area) of all sets, set theory forces us into certain contradictions or paradoxes. Therefore, we have to restrict attention to measuring a (large) subclass of sets. It stands to reason that we would want the collection of "measurable" sets to be closed under certain operations such as union, complementation, and intersection. Thus we seek a collection of sets satisfying certain algebraic properties under the binary operations of union, intersection, and set-theoretic difference. This leads to the concept of a sigma-algebra.

As with topological spaces, where base (or sometimes also subbase) sets were most convenient to study, in measure theory it is also useful to start with more primitive collections of sets called generators of sigma-algebras. For instance, if we need to measure a flat closed figure, one of the reasonable ways to do it is to approximate the figure by a number of (various) disjoint rectangles whose measures we already know. Such a natural way of measuring more complex sets by "base" sets gives rise to the extension of measure from the collection of "abstract rectangles" to the set of all figures formed from rectangles under countable set operations. This method of extension was generalized by the German mathematician Constantin Carathéodory in 1918.

This chapter is just preparation for the next two, where we will be concerned with various classes of sets on which measure will be defined and then extended. Generators of these classes, in particular topologies, which have found other applications in this part of analysis, are of special interest.
CHAPTER 4. MEASURABLE SPACES
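1. SYSTEMS OF SETS

1.1 Definition. Let $\Omega$ be an arbitrary set and let $\Sigma$ be some collection of subsets of $\Omega$.

(i) $\Sigma$ is called a $\sigma$-algebra (pronounced "sigma-algebra") or $\sigma$-field (sigma-field) if:
(a) $\Omega \in \Sigma$.
(b) $A \in \Sigma \Rightarrow A^c \in \Sigma$.
(c) for any sequence $\{A_n\}$ of sets of $\Sigma$, $\bigcup_{n=1}^{\infty} A_n \in \Sigma$.

(ii) $\Sigma$ is called an algebra (or field), denoted by $\mathcal{A}$ (i.e. $\Sigma = \mathcal{A}$), if:
(a) $\Omega \in \mathcal{A}$.
(b) $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$.
(c) $A, B \in \mathcal{A} \Rightarrow A \cup B \in \mathcal{A}$.

(iii) $\Sigma$ is called a Dynkin system, denoted by $\mathcal{D}$ ($\Sigma = \mathcal{D}$), if:
(a) $\Omega \in \mathcal{D}$.
(b) $A \in \mathcal{D} \Rightarrow A^c \in \mathcal{D}$.
(c) for every sequence $\{A_n\}$ of pairwise disjoint sets of $\mathcal{D}$, $\bigcup_{n=1}^{\infty} A_n \in \mathcal{D}$.

(iv) $\Sigma$ is called a ring, denoted by $\mathcal{R}$ ($\Sigma = \mathcal{R}$), if:
(a) $\emptyset \in \mathcal{R}$.
(b) $A, B \in \mathcal{R} \Rightarrow A\setminus B \in \mathcal{R}$.
(c) $A, B \in \mathcal{R} \Rightarrow A \cup B \in \mathcal{R}$.

(v) $\Sigma$ is called a semi-ring, denoted by $\mathcal{S}$ ($\Sigma = \mathcal{S}$), if:
(a) $\emptyset \in \mathcal{S}$.
(b) $A, B \in \mathcal{S} \Rightarrow A \cap B \in \mathcal{S}$.
(c) for $A, B \in \mathcal{S}$, there is a finite tuple $C_1,\dots,C_k$ of pairwise disjoint sets from $\mathcal{S}$ such that $A\setminus B$ can be represented as the union $\bigcup_{n=1}^{k} C_n$.

(vi) $\Sigma$ is called a monotone system, denoted by $\mathcal{M}$ ($\Sigma = \mathcal{M}$), if:
(a) for every monotone nondecreasing sequence $\{A_n\}\uparrow$ of sets of $\mathcal{M}$, $\bigcup_{n=1}^{\infty} A_n \in \mathcal{M}$.
(b) for every monotone nonincreasing sequence $\{A_n\}\downarrow$ of sets of $\mathcal{M}$, $\bigcap_{n=1}^{\infty} A_n \in \mathcal{M}$.

(vii) $\Sigma$ is called $\cap$-stable (pronounced "intersection-stable") if $A, B \in \Sigma \Rightarrow A \cap B \in \Sigma$.

(viii) A pair $(\Omega,\Sigma)$, where $\Sigma$ is a $\sigma$-algebra in $\Omega$, is called a measurable space, while elements of $\Sigma$ are called measurable sets. [Compare these with topological space and open sets, respectively.]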
(i) Let G? be a n arbitrary set. Then (c-algebra).
{a,@}is
the smallest algebra
(ii) ?(a), the power set, is always a c-algebra. It is the largest aalgebra in n. (iii) The smallest c-algebra containing a set A is obviously If&@,A,ACl. (iv) Let G? = Rn and let Y be the system of all n-dimensional halfopen intervals (or rectangles) of type (a,b], for a,b € Rn. The intersection of two intervals is either 0 or again an interval. The difference of two intervals need not be an interval, but it can be represented as a finite union of pairwise disjoint intervals (see Figure 1.1). Hence, !f is a semi-ring.
Figure 1.1
206
CHAPTER 4. MEASURABLE SPACES
Every a-algebra C has three properties directly following from Definition 1.1 (i): (v)
a ) @ E C (since
@ = RE).
n 00
b ) For every sequence {A,) C 6, An E C. This holds n=l because
Observe that by applying DeMorgan's law we can similarly show that this property and property (c) in Definition 1.1 (i) are equivalent, i.e., in 1.1 (i, c) a countable intersection can be replaced by a countable union. c) Finally we have
A,B E C 3 A \ B E C (due to A \ B = A n BC). One can say that any u-algebra is closed with respect to the formation of all countable set operations.
(vi) Every algebra $\mathcal{A}$ has the same properties as $\sigma$-algebras in (v), except that it is closed only under finite unions and intersections. Hence, any algebra is closed relative to the formation of all finite set operations.
(vii) Let $\Omega$ be an arbitrary set and let $\Sigma$ be the system of all subsets $A$ of $\Omega$ such that either $A$ or $A^c$ is at most countable. Then $\Sigma$ is a $\sigma$-algebra. Indeed, $\emptyset$ is at most countable. Thus, $\Omega \in \Sigma$. If $A \in \Sigma$ then obviously $A^c \in \Sigma$. Now let $\{A_n\} \subseteq \Sigma$. Then either there is at least one at most countable set $A_i$ or else all $A_n$ are uncountable but their complements are at most countable. In the first case, $\bigcap_{n=1}^{\infty} A_n$ is clearly at most countable. In the second case, the set $\bigcup_{n=1}^{\infty} A_n^c$ is at most countable and, therefore, it belongs to $\Sigma$, along with its complement. The latter, by DeMorgan's law, is $\bigcap_{n=1}^{\infty} A_n$. Consequently, $\Sigma$ is a $\sigma$-algebra.
(viii) Let $f: \Omega \to \Omega'$ be a map and $\Sigma'$ a $\sigma$-algebra in $\Omega'$. Then $\Sigma = f^{**}(\Sigma')$, defined as the system of all sets $\{f^{*}(A'): A' \in \Sigma'\}$, is a $\sigma$-algebra. This property is due to the fact that the inverse of a map preserves all set operations. For instance, if $A \in \Sigma$ then it follows from the definition of $\Sigma$ that $A$ is the inverse image of some $A' \in \Sigma'$. Thus $(A')^c \in \Sigma'$, yielding $A^c = (f^{*}(A'))^c = f^{*}((A')^c) \in \Sigma$. The proof that the union of a sequence from $\Sigma$ belongs to $\Sigma$ is analogous (show it).
(ix) A monotone system need not be an algebra. For instance, let $\mathcal{B}$ be the set of all convex subsets of $\mathbb{R}^2$. Then it is easily seen that $\mathcal{B}$ is a monotone system. However, the union $A \cup B$ of two convex sets need not be convex. The difference of two convex sets is not necessarily convex either. An algebra need not be a monotone system either, for it is not closed under countable set operations.
(x) The collection of all finite subsets of an infinite set $\Omega$ is a ring in $\Omega$.
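On a finite set the defining properties of these systems can be checked mechanically. The following Python sketch is only an illustration and not part of the formal development. The helper names `power_set`, `is_algebra` and `is_ring` are ours; a ring is assumed here to mean a system that contains the empty set and is closed under differences and finite unions; and on a finite set an algebra is automatically a $\sigma$-algebra.

```python
from itertools import combinations

def power_set(omega):
    """All subsets of the finite set omega, as frozensets."""
    items = list(omega)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_algebra(omega, system):
    """Check: omega is in the system, which is closed under
    complements and pairwise unions (assumed definition)."""
    omega = frozenset(omega)
    if omega not in system:
        return False
    if any(omega - a not in system for a in system):
        return False
    return all(a | b in system for a in system for b in system)

def is_ring(system):
    """Check: the empty set is in the system, which is closed under
    differences and pairwise unions (assumed definition)."""
    if frozenset() not in system:
        return False
    return all(a - b in system and a | b in system
               for a in system for b in system)

omega = frozenset(range(4))
A = frozenset({0, 1})
trivial = {frozenset(), omega}                      # Example (i)
smallest_with_A = {frozenset(), A, omega - A, omega}  # Example (iii)
ring_only = {frozenset(), A}                        # a ring that misses omega
```

Here `trivial`, `power_set(omega)` and `smallest_with_A` pass `is_algebra`, while `ring_only` is a ring but not an algebra because it does not contain `omega`.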
PROBLEMS
1.1
Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $\Omega'$ be a subset of $\Omega$. Define $\Sigma'$ as $\Sigma \cap \Omega' = \{A \cap \Omega': A \in \Sigma\}$. Then $\Sigma'$ (called the trace of $\Sigma$ in $\Omega'$) is a $\sigma$-algebra in $\Omega'$. Prove it.
1.2
Let $\mathcal{R}$ be a ring in $\Omega$. Show that with any two sets $A, B \in \mathcal{R}$, their intersection also belongs to $\mathcal{R}$.
1.3
Let $\mathcal{R}$ be a collection of subsets of $\Omega$ with the properties: $\emptyset \in \mathcal{R}$, and $A, B \in \mathcal{R}$ implies that $A \cap B \in \mathcal{R}$ and $A \cup B \in \mathcal{R}$. Is $\mathcal{R}$ a ring?
1.4
Let $\mathcal{A}$ be an algebra in $\Omega$. Prove that $\mathcal{A}$ is a $\sigma$-algebra in $\Omega$ if and only if $\mathcal{A}$ is also a monotone system. [Hint: Show that any $\sigma$-algebra is a monotone system and any algebra, which is a monotone system, is a $\sigma$-algebra.]
1.5
The flow chart below reflects the relations between some systems of sets.
[Figure: flow chart of the relations between systems of sets]
Demonstrate that each relationship holds.
1.6
Give an infinite collection of subsets of $\mathbb{R}$ that contains $\emptyset$ and $\mathbb{R}$ and which is closed under countable unions and intersections but is not a $\sigma$-algebra.
1.7
Let $\Omega$ be a finite set with $|\Omega| = 2n$. Let $\mathcal{D}$ be the system of all subsets $D$ of $\Omega$ such that $|D| = 2q$, $q = 0, 1, \dots, n$. Show that $\mathcal{D}$ is a Dynkin system.
1.8
Give a Dynkin system that is not a $\sigma$-algebra.
1.9
Show that if $\mathcal{D}$ is a Dynkin system then $D, E \in \mathcal{D}$ and $E \subseteq D$ imply $D \setminus E \in \mathcal{D}$.
1.10
Prove the statement: A Dynkin system $\mathcal{D}$ is a $\sigma$-algebra if and only if $\mathcal{D}$ is $\cap$-stable.
1.11
Show that the inverse image of a ring $\mathcal{R}$ under the map $f: \Omega_1 \to \Omega$ is a ring in $\Omega_1$.
1.12
Show the equivalence of two definitions of a semi-ring if property c) in Definition 1.1 (v) is replaced by c') Let $A, A_1, \dots, A_n \in \mathcal{S}$. Then there is a finite tuple $C_1, \dots, C_k$ of disjoint sets from $\mathcal{S}$ such that
$$A \setminus \bigcup_{i=1}^{n} A_i = \sum_{s=1}^{k} C_s .$$
NEW TERMS: $\sigma$-algebra ($\sigma$-field), algebra (field), Dynkin system, ring, semi-ring, monotone system, $\cap$-stable (intersection-stable) system, measurable space, measurable sets, smallest $\sigma$-algebra, $\sigma$-algebra containing (generated by) a set, half-open interval (rectangle), systems of sets (diagram of)
2. SYSTEMS' GENERATORS
2.1 Theorem. The intersection of arbitrarily many $\sigma$-algebras (algebras, monotone systems, rings) in $\Omega$ is a $\sigma$-algebra (an algebra, a monotone system, a ring).
(See Problem 2.1.)
2.2 Remark. Let $\mathcal{G}$ be an arbitrary collection of subsets of $\Omega$. There is obviously a $\sigma$-algebra containing $\mathcal{G}$, for instance, the power set $\mathcal{P}(\Omega)$. If we collect all $\sigma$-algebras that contain $\mathcal{G}$ and find their intersection, it must contain $\mathcal{G}$ and, due to Theorem 2.1, it is a $\sigma$-algebra too. This $\sigma$-algebra is clearly the smallest one containing $\mathcal{G}$. It is called the $\sigma$-algebra generated by $\mathcal{G}$ and it is denoted by $\Sigma(\mathcal{G})$. The system of sets $\mathcal{G}$ is called the generator of $\Sigma(\mathcal{G})$.
It is worthwhile to recall the analogous notions of a subbase and a base and their role as generators of the smallest topology containing them. While, as we saw, the classes of generators in topology are quite limited in their practical use, their counterparts for $\sigma$-algebras form a significantly richer inventory filled with such prominent collections as semi-rings, rings, Dynkin systems, monotone systems, and topologies themselves. Among them, rings and semi-rings will often be used as generators of $\sigma$-algebras (throughout this book, especially in Chapter 5) in the Carathéodory construction of measures. Another frequently used generator is a topology, which we will see in action when characterizing regular and Radon measures and in the calculus of Lebesgue-Stieltjes measures. The smallest $\sigma$-algebra containing a topology $\tau$ as a generator is called a Borel $\sigma$-algebra and it is denoted by $\mathcal{B}(\tau)$ or by $\mathcal{B}(\Omega)$ or just by $\mathcal{B}$ whenever the nature of $\tau$ is specified. Many of the Borel $\sigma$-algebras we are going to come across will be generated by the usual topology.
2.3 Example. Let $\mathcal{G}$ be the system of sets containing only one subset $A$ of $\Omega$. As mentioned in Example 1.2 (iii), the smallest $\sigma$-algebra containing $\mathcal{G}$ is $\{\Omega, \emptyset, A, A^c\}$.
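For a finite $\Omega$ the intersection of all $\sigma$-algebras containing a generator can be reached from below, by closing the generator under complements and unions until nothing new appears. The sketch below is a minimal illustration under that finiteness assumption; the function name `generate_sigma_algebra` is ours, and the approach does not carry over to infinite sets, where no such finite iteration is available in general.

```python
def generate_sigma_algebra(omega, generator):
    """Smallest sigma-algebra on the finite set omega containing the
    generator, obtained by repeated closure under complement and union."""
    omega = frozenset(omega)
    system = {frozenset(), omega} | {frozenset(g) for g in generator}
    while True:
        new = {omega - a for a in system}
        new |= {a | b for a in system for b in system}
        if new <= system:          # nothing new: the system is closed
            return system
        system |= new

omega = frozenset(range(6))
A = frozenset({0, 1})
B = frozenset({1, 2, 3})

sigma_A = generate_sigma_algebra(omega, [A])       # Example 2.3
sigma_AB = generate_sigma_algebra(omega, [A, B])   # atoms {0},{1},{2,3},{4,5}
```

With a single generating set the result has the four elements named in Example 2.3. With the two sets above, the four atoms produce $2^4 = 16$ measurable sets.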
Problem 1.10 states that a Dynkin system is a $\sigma$-algebra if and only if it is $\cap$-stable. The proposition below generalizes it by allowing the Dynkin system to have just a $\cap$-stable generator.
2.4 Proposition. A Dynkin system is a $\sigma$-algebra if and only if it has a $\cap$-stable generator.
Proof. Let $\mathcal{G}$ be a $\cap$-stable system of subsets of $\Omega$. We claim that $\Sigma(\mathcal{G}) = \mathcal{D}(\mathcal{G})$. Since every $\sigma$-algebra is a Dynkin system and $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$, we have $\mathcal{D}(\mathcal{G}) \subseteq \Sigma(\mathcal{G})$. The inverse relation remains to be shown:
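Let $D \in \mathcal{D}(\mathcal{G})$ and let $\mathcal{D}_D = \{Q \in \mathcal{P}(\Omega): Q \cap D \in \mathcal{D}(\mathcal{G})\}$.
a) We show that $\mathcal{D}_D$ is a Dynkin system.
If $A \in \mathcal{D}_D$ then $A \cap D \in \mathcal{D}(\mathcal{G})$ and $A^c \cap D = D \setminus A = D \setminus (A \cap D) \in \mathcal{D}(\mathcal{G})$ (see Problem 1.9). This yields $A^c \in \mathcal{D}_D$. Similarly, let $\{A_n\} \subseteq \mathcal{D}_D$ be a sequence of pairwise disjoint sets. Then $A_n \cap D \in \mathcal{D}(\mathcal{G})$, for $n = 1, 2, \dots$. Obviously, $\{A_n \cap D\}$ is a sequence of pairwise disjoint sets and
$$\Big( \sum_{n=1}^{\infty} A_n \Big) \cap D = \sum_{n=1}^{\infty} (A_n \cap D) \in \mathcal{D}(\mathcal{G}),$$
implying that $\sum_{n=1}^{\infty} A_n \in \mathcal{D}_D$. Therefore, $\mathcal{D}_D$ is a Dynkin system.
b) We prove that for every $D \in \mathcal{D}(\mathcal{G})$, $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$. Let $G \in \mathcal{G}$. Then $G \in \mathcal{D}(\mathcal{G})$. Since $\mathcal{G}$ is $\cap$-stable, it follows directly from the definition of $\mathcal{D}_G$ that $\mathcal{G} \subseteq \mathcal{D}_G$. Thus $\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_G$, since $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$ and $\mathcal{D}_G$ is just a Dynkin system containing $\mathcal{G}$. Now let $D \in \mathcal{D}(\mathcal{G})$. Then $D \in \mathcal{D}_G$ and $G \cap D \in \mathcal{D}(\mathcal{G})$, implying that $G \in \mathcal{D}_D$, or $\mathcal{G} \subseteq \mathcal{D}_D$. This yields $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$, since again $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$.
c) We show that $\mathcal{D}(\mathcal{G})$ is $\cap$-stable.
Let $C, D \in \mathcal{D}(\mathcal{G})$. Then $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$. Thus, $C \cap D \in \mathcal{D}(\mathcal{G})$ (by the definition of $\mathcal{D}_D$), and therefore $\mathcal{D}(\mathcal{G})$ is $\cap$-stable.
Finally, by Problem 1.10, $\mathcal{D}(\mathcal{G})$ must be a $\sigma$-algebra. Then, as the smallest $\sigma$-algebra, $\Sigma(\mathcal{G}) \subseteq \mathcal{D}(\mathcal{G})$. This is the desired inverse relation. $\square$
In the next lemma and theorem we present a construction of the ring generated by an arbitrary semi-ring.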
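As a finite sanity check of Proposition 2.4, one can compute $\mathcal{D}(\mathcal{G})$ and $\Sigma(\mathcal{G})$ on a four-point set. The sketch below is only an illustration: the function `closure` is ours, and it assumes that on a finite set a Dynkin system is a system containing $\Omega$ that is closed under complements and unions of disjoint pairs.

```python
def closure(omega, generator, disjoint_unions_only):
    """Close the generator under complements and (disjoint) unions.
    disjoint_unions_only=True gives the Dynkin system D(G),
    disjoint_unions_only=False gives the sigma-algebra Sigma(G)."""
    omega = frozenset(omega)
    system = {omega} | {frozenset(g) for g in generator}
    while True:
        new = {omega - a for a in system}
        new |= {a | b for a in system for b in system
                if not disjoint_unions_only or not (a & b)}
        if new <= system:
            return system
        system |= new

omega = frozenset(range(4))
A, B = frozenset({0, 1}), frozenset({1, 2})

G = [A, B]                 # not intersection-stable: A & B is missing
G_stable = [A, B, A & B]   # intersection-stable generator

dynkin_G = closure(omega, G, True)
sigma_G = closure(omega, G, False)
dynkin_G_stable = closure(omega, G_stable, True)
sigma_G_stable = closure(omega, G_stable, False)
```

For the generator `G`, which is not $\cap$-stable, the Dynkin system has 6 elements while the $\sigma$-algebra has 16. Once $A \cap B$ is added to the generator, the two systems coincide, in agreement with the proposition.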
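2.5 Lemma. Let $\mathcal{S}$ be a semi-ring in $\Omega$ and let $\mathcal{R}$ be the system of all finite unions of elements from $\mathcal{S}$. Then any element of $\mathcal{R}$ can be represented as a finite union of pairwise disjoint sets from $\mathcal{S}$, in notation, $\sum_{k=1}^{n} C_k$, $C_k \in \mathcal{S}$.
Proof. Let $R \in \mathcal{R}$. Then by the definition of $\mathcal{R}$, $R = \bigcup_{k=1}^{n} S_k$, where $S_k \in \mathcal{S}$. We now construct a decomposition of $R$ by elements of $\mathcal{S}$ using the $S_k$. Let
$$R_1 = S_1, \qquad R_k = S_k \setminus \bigcup_{j=1}^{k-1} S_j = \bigcap_{j=1}^{k-1} (S_k \setminus S_j), \quad k = 2, \dots, n.$$
Since $S_k \setminus S_j = \sum_i C_{ikj}$ is a finite union of elements from $\mathcal{S}$, it follows that
$$R_k = \bigcap_{j=1}^{k-1} \sum_i C_{ikj} \ \text{(as a finite union of finite intersections)} \ = \sum_i D_{ik},$$
where $D_{ik}$ are elements of $\mathcal{S}$. It is easily seen that
$$R = \bigcup_{k=1}^{n} S_k = \bigcup_{k=1}^{n} R_k,$$
where the $R_k$ are pairwise disjoint. This leads to $R = \sum_{k=1}^{n} \sum_i D_{ik}$ and the lemma is proved. $\square$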
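2.6 Theorem. Let $\mathcal{S}$ be a semi-ring in $\Omega$. Then the system of all finite unions of sets in $\mathcal{S}$ is the (smallest) ring $\mathcal{R}(\mathcal{S})$ generated by $\mathcal{S}$.
Proof. 1) We show that $\mathcal{R}$ described above is a ring. Since $\mathcal{S} \subseteq \mathcal{R}$, we have $\emptyset \in \mathcal{R}$. Let $R_1, R_2 \in \mathcal{R}$. Then, by the definition of $\mathcal{R}$,
$$R_1 = \bigcup_{k=1}^{n} S_k^{1}, \qquad R_2 = \bigcup_{i=1}^{m} S_i^{2}, \qquad S_k^{1}, S_i^{2} \in \mathcal{S}.$$
Therefore,
$$R_1 \cup R_2 = \bigcup_{k=1}^{n} \bigcup_{i=1}^{m} (S_k^{1} \cup S_i^{2}) \in \mathcal{R}.$$
By Lemma 2.5 and by Problem 1.2(c) (Chapter 1), we have
$$R_1 \setminus R_2 = \sum_{k=1}^{n} C_k \setminus \sum_{i=1}^{m} D_i = \sum_{k=1}^{n} \bigcap_{i=1}^{m} (C_k \setminus D_i).$$
Since $C_k$ and $D_i$, as elements of $\mathcal{S}$, are semi-ring sets, the sets $C_k \setminus D_i = \sum_{s=1}^{r_{ik}} E_s^{ik}$, where the $E_s^{ik}$ are also elements of $\mathcal{S}$. Therefore,
$$R_1 \setminus R_2 = \sum_{k=1}^{n} \bigcap_{i=1}^{m} \sum_{s=1}^{r_{ik}} E_s^{ik} = \sum_{k=1}^{n} \sum_{(s_1, \dots, s_m)} \bigcap_{i=1}^{m} E_{s_i}^{ik} \in \mathcal{R},$$
where the inner sum runs over all tuples with $1 \le s_i \le r_{ik}$ and each finite intersection belongs to $\mathcal{S}$.
2) Now we show that $\mathcal{R} = \mathcal{R}(\mathcal{S})$. Let $\mathcal{S} \subseteq \mathcal{R}'$, where $\mathcal{R}'$ is any ring in $\Omega$. As a ring, $\mathcal{R}'$ is closed with respect to the formation of finite unions of sets from $\mathcal{R}'$. Specifically, it is closed under finite unions of sets from $\mathcal{S}$; hence, it includes $\mathcal{R}$. Consequently, $\mathcal{R}$ is the smallest ring generated by $\mathcal{S}$. $\square$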
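The construction in Lemma 2.5 can be made concrete for the semi-ring of half-open intervals $(a, b]$ on the real line. The following sketch is an illustration under our own conventions: an interval $(a, b]$ with $a < b$ is stored as the pair `(a, b)`, and the function names `difference` and `disjointify` are hypothetical helpers, not notation from the text.

```python
def difference(i, j):
    """(a, b] minus (c, d] as a list of at most two disjoint
    half-open intervals; assumes a < b and c < d."""
    a, b = i
    c, d = j
    pieces = []
    if min(b, c) > a:              # part of (a, b] to the left of (c, d]
        pieces.append((a, min(b, c)))
    if max(a, d) < b:              # part of (a, b] to the right of (c, d]
        pieces.append((max(a, d), b))
    return pieces

def disjointify(intervals):
    """Rewrite S_1 u ... u S_n as a union of pairwise disjoint intervals,
    using R_k = S_k minus (S_1 u ... u S_{k-1}) as in Lemma 2.5."""
    result = []
    for k, s in enumerate(intervals):
        pieces = [s]
        for earlier in intervals[:k]:
            pieces = [p for piece in pieces
                      for p in difference(piece, earlier)]
        result.extend(pieces)
    return result

covering = [(0, 2), (1, 3), (5, 6), (0, 6)]
parts = disjointify(covering)
total_length = sum(b - a for a, b in parts)
```

For the overlapping intervals above, the union is $(0, 6]$ and the pieces returned are pairwise disjoint with total length 6.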
In Remark 2.2 we defined a Borel $\sigma$-algebra as a $\sigma$-algebra generated by a topology. We will show below that the smallest $\sigma$-algebra $\Sigma(\mathcal{S}(\mathbb{R}^n))$ generated by the semi-ring of all half-open intervals $(a,b]$ in $\mathbb{R}^n$ coincides with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$.
2.7 Theorem. Let $\tau$, $\tau^c$, and $\gamma$ denote respectively the systems of all open, closed, and compact subsets of $(\mathbb{R}^n, \tau_e)$. Then the following relations hold:
$$\mathcal{B}(\mathbb{R}^n) = \Sigma(\tau) = \Sigma(\tau^c) = \Sigma(\gamma) = \Sigma(\mathcal{S}).$$
Proof. Since all compact sets in $(\mathbb{R}^n, \tau_e)$ are closed and bounded, it follows that $\gamma \subseteq \tau^c \subseteq \Sigma(\tau^c)$, and thus
$$\Sigma(\gamma) \subseteq \Sigma(\tau^c). \qquad (*)$$
On the other hand, every closed set $F$ can be represented as a countable union of compact sets $C_k \in \gamma$, $k = 1, 2, \dots$. For instance, if $C(c,k)$ denotes the compact ball centered at some point $c$ and with radius $k \in \mathbb{N}$, then we may choose $C_k = F \cap C(c,k)$, implying that $F = \bigcup_{k=1}^{\infty} C_k$. Therefore, all closed sets belong to the $\sigma$-algebra $\Sigma(\gamma)$ (since this $\sigma$-algebra contains countable unions); i.e., $\tau^c \subseteq \Sigma(\gamma)$, which yields
$$\Sigma(\tau^c) \subseteq \Sigma(\gamma). \qquad (**)$$
Both inclusions (*) and (**) lead to $\Sigma(\tau^c) = \Sigma(\gamma)$. Since open sets are complementary to closed sets, it follows that $\mathcal{B} = \Sigma(\tau) = \Sigma(\tau^c) = \Sigma(\gamma)$.
Now we show that $\mathcal{B} = \Sigma(\mathcal{S})$. Any half-open interval $(a,b]$ in $\mathbb{R}^n$ can be represented as the intersection of a sequence of bounded open intervals of type $(a, b_n)$ (or, as we called them earlier, open parallelepipeds) with $b_n \downarrow b$. Therefore, the collection $\mathcal{S}$ of all half-open intervals is contained in the $\sigma$-algebra $\Sigma(\tau)$, which implies that $\Sigma(\mathcal{S}) \subseteq \Sigma(\tau)$. On the other hand, any open bounded interval can be represented as the union of a sequence of half-open intervals of $\mathcal{S}$; and any open set is a countable union of bounded open intervals as base sets (recall that $(\mathbb{R}^n, \tau_e)$ is second countable). Therefore, any open set is the union of countably many half-open intervals from $\mathcal{S}$ and we have $\tau \subseteq \Sigma(\mathcal{S})$, implying $\Sigma(\tau) \subseteq \Sigma(\mathcal{S})$. The two containments give us $\Sigma(\mathcal{S}) = \mathcal{B}$. $\square$
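The two representations of intervals used in the last part of the proof can be written out explicitly. The choice of the sequences below (shifting the right endpoint by $\tfrac{1}{m}$ in every coordinate) is our assumption; any sequence decreasing, respectively increasing, to $b$ would serve equally well. Here $\mathbf{1} = (1, \dots, 1) \in \mathbb{R}^n$ and inequalities between vectors are understood componentwise.

```latex
% half-open interval as a countable intersection of open intervals
(a, b] \;=\; \bigcap_{m=1}^{\infty} \Bigl( a,\; b + \tfrac{1}{m}\mathbf{1} \Bigr),
% open interval as a countable union of half-open intervals
% (terms with b - (1/m) 1 not componentwise greater than a are empty)
\qquad
(a, b) \;=\; \bigcup_{m=1}^{\infty} \Bigl( a,\; b - \tfrac{1}{m}\mathbf{1} \Bigr].
```

The first identity places every half-open interval in $\Sigma(\tau)$, and the second places every bounded open interval in $\Sigma(\mathcal{S})$.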
PROBLEMS 2.1
Prove Theorem 2.1.
2.2
Show that an intersection of semi-rings in $\Omega$ need not be a semi-ring.
2.3
Show that a union of $\sigma$-algebras in $\Omega$ need not be a $\sigma$-algebra.
2.4
Let $A$ and $B$ be subsets of $\Omega$ and let $\mathcal{G} = \{A, B\}$. Find $\mathcal{D}(\mathcal{G})$ and $\Sigma(\mathcal{G})$. Show that $\mathcal{D}(\mathcal{G})$ and $\Sigma(\mathcal{G})$ are identical if and only if at least one of the sets $A \cap B$, $A \cap B^c$, $A^c \cap B$, $A^c \cap B^c$ is empty. [Hint: Use Problem 1.10.]
2.5
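Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $B \subseteq \Omega$. Show that the $\sigma$-algebra generated by $\mathcal{G} = \Sigma \cup \{B\}$ is of the form
$$\Sigma' = \{ (A_1 \cap B) \cup (A_2 \cap B^c) : A_1, A_2 \in \Sigma \}.$$
[Hint: 1) Show that $\sigma(\Sigma') = \sigma(\Sigma \cup \{B\})$. 2) Show that $\Sigma'$ is a $\sigma$-algebra in $\Omega$.]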
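2.6
Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be systems of sets in $\Omega$. Show that $\mathcal{G}_1 \subseteq \mathcal{G}_2$ implies that $\Sigma(\mathcal{G}_1) \subseteq \Sigma(\mathcal{G}_2)$.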
2.7
Let $\Omega$ be an arbitrary non-empty set and let $A, B \subseteq \Omega$. Construct for $\tau = \{\Omega, \emptyset, A, B, A \cap B, A \cup B\}$ the Borel $\sigma$-algebra $\mathcal{B}(\tau)$.
2.8
Construct the Borel $\sigma$-algebra generated by the cocountable topology $\tau = \{\Omega, \emptyset, A^c: A \text{ is at most countable}\}$ (see Problem 1.7, Chapter 3, where $\Omega$ is an uncountable set).
2.9
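Let $\mathcal{M}$ be a monotone system in $\Omega$ and let $\mathcal{A}$ be an algebra in $\Omega$ such that $\mathcal{A} \subseteq \mathcal{M}$. Prove that the $\sigma$-algebra $\Sigma(\mathcal{A})$ generated by $\mathcal{A}$ is a subcollection of $\mathcal{M}$. [Hint: Let $\mathcal{M}_0 = \mathcal{M}(\mathcal{A})$ be the monotone system generated by $\mathcal{A}$. Furthermore: 1) Let $A$ be a fixed element of $\mathcal{M}_0$. Define
$$\mathcal{M}_A = \dots$$
Show that $\mathcal{M}_A = \mathcal{M}_0$. 2) Show that $\mathcal{M}_0$ is thus an algebra. 3) Show that $\mathcal{M}_0 = \Sigma(\mathcal{A})$.]
2.10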
Show that any open set in $\mathbb{R}^n$ can be represented as an at most countable union of disjoint semi-open cubes.
NEW TERMS: $\sigma$-algebra generated by a collection of sets, generator, Borel $\sigma$-algebra, $\sigma$-algebra generated by a set, semi-ring (property of), ring generated by a semi-ring, Borel $\sigma$-algebra (criterion of), $\sigma$-algebra extended by a set, Borel $\sigma$-algebra generated by a cocountable topology
3. MEASURABLE FUNCTIONS
3.1 Definition. Let $(\Omega, \Sigma)$ and $(\Omega', \Sigma')$ be two measurable spaces. A function $[\Omega, \Omega', f]$ is called measurable if $f^{**}(\Sigma') \subseteq \Sigma$, i.e. if $f^{*}(A') \in \Sigma$ for all $A' \in \Sigma'$. The collection of all measurable functions from $(\Omega, \Sigma)$ to $(\Omega', \Sigma')$ will be denoted by $\mathcal{C}^{-1}(\Omega, \Sigma; \Omega', \Sigma')$. Notice that the symbol $\mathcal{C}^{-1}$ is a natural extrapolation of the common notation in analysis, where $\mathcal{C}^n$ stands for the space of all $n$-times continuously differentiable functions, with $\mathcal{C}^0$ (or simply $\mathcal{C}$) being used for the space of all continuous functions. So, not only has $\mathcal{C}^{-1}$ been vacant, but it also agrees with the existing linear order $(\mathcal{C}^n, n = -1, 0, 1, \dots; \supseteq)$.
3.2 Remark. In Example 1.2 (viii), we saw that $f^{**}(\Sigma')$ is a $\sigma$-algebra in $\Omega$. We wish to call it the $\sigma$-algebra generated by the function $f$. This is the smallest $\sigma$-algebra relative to which $f$ is measurable.
3.3 Examples.
(i) The identity function $f: (\Omega, \Sigma) \to (\Omega, \Sigma')$ is measurable if and only if $\Sigma' \subseteq \Sigma$.
(ii) Let $f: (\Omega, \Sigma) \to (\Omega', \Sigma')$ be a constant function, i.e. $f(\omega) = c \in \Omega'$ for all $\omega \in \Omega$. Then $f^{*}(\{c\}) = \Omega$ and $f^{*}(\{c\}^c) = \emptyset$, which yields that for each $A' \in \Sigma'$, $f^{*}(A')$ is either $\Omega$ or $\emptyset$. The latter implies that $f$ is measurable with respect to the smallest $\sigma$-algebra $\{\Omega, \emptyset\}$ in $\Omega$. Thus, $f$ is always measurable.
(iii) Let $f(\omega) = 1_A(\omega)$ for some $A \subseteq \Omega$. Let $\Sigma'$ be an arbitrary $\sigma$-algebra in $\mathbb{R}$ (for instance, the Borel $\sigma$-algebra). It is easily seen that the inverse image under $f$ of any subset of $\mathbb{R}$ (specifically, of any element of $\Sigma'$) is one of the elements of the set $\Sigma = \{\Omega, \emptyset, A, A^c\}$. Therefore, $\Sigma$ is the smallest $\sigma$-algebra with respect to which $f$ is measurable. On the other hand, if $\Sigma$ is a $\sigma$-algebra in $\Omega$, then $1_A$ is measurable if and only if $A \in \Sigma$.
There is a noteworthy parallel between continuity and measurability of functions and their relationships with topologies and $\sigma$-algebras. Recall that a function $[\Omega, \Omega', f]$ is continuous on $\Omega$ if there are two topologies $\tau$ and $\tau'$ declared on $\Omega$ and $\Omega'$, respectively, and $f^{**}(\tau') \subseteq \tau$. If, in addition, $\tau'$ is known to be induced by a subbase $\mathcal{G}'$, then the condition $f^{**}(\tau') \subseteq \tau$ can be relaxed to $f^{**}(\mathcal{G}') \subseteq \tau$. This analogy with measurability is utilized by the following proposition.
3.4 Proposition. Let $(\Omega, \Sigma)$ and $(\Omega', \Sigma')$ be two measurable spaces and let $\mathcal{G}'$ be a generator of $\Sigma'$. Then a function $f: \Omega \to \Omega'$ is measurable if and only if $f^{**}(\mathcal{G}') \subseteq \Sigma$.
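Proof. Let $\bar{\Sigma} = \{Q' \in \mathcal{P}(\Omega'): f^{*}(Q') \in \Sigma\}$. Obviously, $\bar{\Sigma}$ is a $\sigma$-algebra (show it, see Problem 3.1). Now let $f^{**}(\mathcal{G}') \subseteq \Sigma$. Then it follows that $\mathcal{G}' \subseteq \bar{\Sigma}$ and hence $\Sigma' = \Sigma(\mathcal{G}') \subseteq \bar{\Sigma}$. Therefore, $f$ is measurable. The converse is trivial. $\square$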
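Proposition 3.4 says that measurability only needs to be tested on a generator. On finite sets this can be verified directly. The sketch below is an illustration with hypothetical helper names (`generate_sigma_algebra`, `preimage`, `is_measurable`) and a sample map $f(\omega) = \omega \bmod 3$ chosen by us; it also computes the $\sigma$-algebra $f^{**}(\Sigma')$ generated by $f$ from Remark 3.2.

```python
def generate_sigma_algebra(omega, generator):
    """Smallest sigma-algebra on a finite set containing the generator."""
    omega = frozenset(omega)
    system = {frozenset(), omega} | {frozenset(g) for g in generator}
    while True:
        new = {omega - a for a in system}
        new |= {a | b for a in system for b in system}
        if new <= system:
            return system
        system |= new

def preimage(f, domain, target_set):
    """The inverse image f*(A') as a subset of the domain."""
    return frozenset(w for w in domain if f(w) in target_set)

def is_measurable(f, domain, sigma, target_sets):
    """True if the inverse image of every listed set belongs to sigma."""
    return all(preimage(f, domain, s) in sigma for s in target_sets)

omega = frozenset(range(6))
omega_prime = frozenset(range(3))
f = lambda w: w % 3

generator_prime = [frozenset({0}), frozenset({1})]
sigma_prime = generate_sigma_algebra(omega_prime, generator_prime)

# sigma-algebra generated by f: all inverse images of sets in sigma_prime
sigma_f = {preimage(f, omega, s) for s in sigma_prime}

# a coarser sigma-algebra on omega, relative to which f is not measurable
coarse = generate_sigma_algebra(omega, [frozenset({0, 1, 2})])
```

Testing the two generator sets gives the same verdict as testing all eight sets of `sigma_prime`, and `f` fails to be measurable relative to `coarse` because the inverse image of $\{0\}$ is $\{0, 3\}$.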
3.5 Example. Let $f: (\Omega, \tau) \to (\Omega', \tau')$ be a continuous function on a topological space $(\Omega, \tau)$. Then $f^{**}(\tau') \subseteq \tau \subseteq \mathcal{B}(\tau)$ (the Borel $\sigma$-algebra generated by $\tau$). By Proposition 3.4, the function $f$ is then $\mathcal{B}(\tau)$-$\mathcal{B}(\tau')$-measurable. We call $f$ a Borel measurable function. $\square$
Measurability, like continuity, is preserved under composition.
3.6 Proposition. Let $f_1: (\Omega_1, \Sigma_1) \to (\Omega_2 = f_{1*}(\Omega_1), \Sigma_2)$ and $f_2: (\Omega_2, \Sigma_2) \to (\Omega_3, \Sigma_3)$ be measurable functions. Then the composition $f_2 \circ f_1: (\Omega_1, \Sigma_1) \to (\Omega_3, \Sigma_3)$ is measurable. $\square$
(See Problem 3.2.)
3.7 Remark. Let $\{(\Omega_i, \Sigma_i): i \in I\}$ be an arbitrary collection of measurable spaces and let $\{f_i: \Omega \to \Omega_i : i \in I\}$ be a collection of functions defined on a set $\Omega$. Every function $f_i$ of this family is clearly $f_i^{**}(\Sigma_i)$-$\Sigma_i$-measurable. We are interested in constructing the minimal $\sigma$-algebra in $\Omega$ relative to which all functions of the family are measurable. Since $\bigcup_{i \in I} f_i^{**}(\Sigma_i)$ is not, in general, a $\sigma$-algebra, it is reasonable to regard it as the generator of the $\sigma$-algebra generated by the family $\{f_i; i \in I\}$, in notation, $\Sigma(f_i; i \in I)$.
3.8 Lemma. Let $\{g_i: (\Omega, \Sigma) \to (\Omega_i, \Sigma_i)\}$ be a collection of functions on $\Omega$ and let $f: (\Omega_0, \Sigma_0) \to (\Omega, \Sigma)$ be a function on $\Omega_0$. The function $f$ is $\Sigma_0$-$\Sigma(g_i; i \in I)$-measurable if and only if each of the functions $g_i \circ f$ is $\Sigma_0$-$\Sigma_i$-measurable.
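Proof.
1) Let $g_k \circ f$ be $\Sigma_0$-$\Sigma_k$-measurable for all $k \in I$. Then for all $A_k \in \Sigma_k$, $(g_k \circ f)^{*}(A_k) = f^{*} \circ g_k^{*}(A_k) \in \Sigma_0$, where $g_k^{*}(A_k) \in \bigcup_{i \in I} g_i^{**}(\Sigma_i)$. Taking $A_k$ from $\Sigma_k$ for each respective $k \in I$, we run through the whole set $\bigcup_{i \in I} g_i^{**}(\Sigma_i)$, whose elements are further transferred by $f^{*}$ into $\Sigma_0$. In other words, we have
$$f^{**}\Big( \bigcup_{i \in I} g_i^{**}(\Sigma_i) \Big) \subseteq \Sigma_0. \qquad (3.8)$$
Since $\bigcup_{i \in I} g_i^{**}(\Sigma_i)$ is a generator of $\Sigma(g_i; i \in I)$, by Proposition 3.4, inclusion (3.8) is sufficient for $f^{**}(\Sigma(g_i; i \in I)) \subseteq \Sigma_0$. Therefore, $f$ is indeed $\Sigma_0$-$\Sigma(g_i; i \in I)$-measurable.
2) Let $f$ be $\Sigma_0$-$\Sigma(g_i; i \in I)$-measurable. This implies that for all $E \in \mathcal{G} = \bigcup_{i \in I} g_i^{**}(\Sigma_i)$, $f^{*}(E) \in \Sigma_0$. Besides, for all $A_k \in \Sigma_k$, $g_k^{*}(A_k) \in \bigcup_{i \in I} g_i^{**}(\Sigma_i)$. Thus, for all $A_k \in \Sigma_k$, $(g_k \circ f)^{*}(A_k) \in \Sigma_0$, which means that $g_k \circ f$ is $\Sigma_0$-$\Sigma_k$-measurable. $\square$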
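PROBLEMS
3.1
Show that $\bar{\Sigma}$ in the proof of Proposition 3.4 is a $\sigma$-algebra.
3.2
Prove Proposition 3.6.
3.3
Let $f: \Omega \to \Omega'$ be a function and let $\mathcal{G}' \subseteq \mathcal{P}(\Omega')$. Show that
$$\Sigma(f^{**}(\mathcal{G}')) = f^{**}(\Sigma(\mathcal{G}')).$$
[Hint: Let $\bar{\Sigma} = \{A' \in \mathcal{P}(\Omega'): f^{*}(A') \in \Sigma(f^{**}(\mathcal{G}'))\}$. Show that $\bar{\Sigma}$ is a $\sigma$-algebra. Then show that $\Sigma(\mathcal{G}') \subseteq \bar{\Sigma}$.]
3.4
Let $[O, O_1, F]$ be a homeomorphism, with $O$ and $O_1$ being open sets in topological spaces $(X, \tau)$ and $(X, \tau_1)$, respectively, and let $\mathcal{B}(\tau_O)$ and $\mathcal{B}(\tau_{O_1})$ be the Borel $\sigma$-algebras generated by the relative topologies $\tau_O$ and $\tau_{O_1}$. Prove that $[\mathcal{B}(\tau_O), \mathcal{B}(\tau_{O_1}), F_{*}]$ is bijective.
3.5
Let $[O, O_1, F]$ be a homeomorphism, with $O$ and $O_1$ being open sets in topological spaces $(X, \tau)$ and $(X, \tau_1)$, respectively, and let $\mathcal{B}(\tau_O)$ and $\mathcal{B}(\tau_{O_1})$ be the corresponding Borel $\sigma$-algebras generated by the relative topologies $\tau_O$ and $\tau_{O_1}$. Suppose $B \subseteq O$. Show that if $F_{*}(B)$ is Borel, then $B$ is also Borel.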
NEW TERMS: measurable function, $\mathcal{C}^{-1}$-space, $\sigma$-algebra generated by a function, measurability of a function (criterion of), Borel measurable function, composition of measurable functions, $\sigma$-algebra generated by a collection of functions, homeomorphisms and Borel $\sigma$-algebras (relationship between)
Chapter 5 Measures
This chapter is a precursor to the general theory of integration, which is a significant advance over the Riemann integration known from calculus. Although many applications in the natural sciences triggered the development of general integration and measure theory, the theory of probability became the primary client of abstract measure even prior to integration.
An early notion of measure was introduced by the Italian Giuseppe Peano in 1883. For a simple set in the plane, Peano used two types of polygons: those that contain the given set and those that are included in it. The areas of the polygons of the former type have a greatest lower bound, and those of the latter type a least upper bound. If these bounds coincide, their common value is said to be the area of the set. If they differ, the concept of area does not apply. Apparently, Cantor's development of set theory greatly influenced Peano's concept of area for arbitrary sets in his 1887 monograph, Applicazioni geometriche del calcolo infinitesimale. He generalized his original idea of inner and outer measures of sets by polygons to two- and three-dimensional Euclidean sets. Peano emphasized the close connection between measure and integral.
In 1892, the Frenchman Camille Jordan arrived at a more advanced concept of measure as a countably additive set function, considered first for positive and then for signed set functions. The latter led to the prominent Jordan decomposition of a signed measure into two positive measures, which we will study in Chapter 8. Jordan's motivation for the concept of measurable sets apparently stemmed from the theory of double integration, which naturally arises when introducing integrals on arbitrary plane sets. However, Jordan's approach to the measurement of sets was restricted to the finite covers of sets by intervals or rectangles that were common at that time.
The most revolutionary steps were undertaken by the Frenchman Emile Borel in his famous 1898 monograph, Leçons sur la théorie des fonctions, where he introduced the idea of countable, instead of finite, covers, thereby significantly extending the classes of measurable sets. In 1905 Borel also pointed out the possibility of using measure theory in probability, which was successfully accomplished by the Russian Andrey Kolmogorov, though not until 1933. However, in his Leçons, Borel did not bother to connect measure and integration.
In 1902, another Frenchman, Henri Lebesgue, further refined measure theory by combining the ideas of Camille Jordan on finite contents with the countably additive measure notion of Emile Borel. Lebesgue called sets in $\mathbb{R}^n$ measurable whenever their inner and outer measures are equal. This led to the completion of the concept of measure and gave rise to the general theory of integration, so significantly enlarging the class of integrable functions that it made Lebesgue say in 1902: "I know no function that is not summable and I do not know if any such exists." (However, the Italian Giuseppe Vitali showed the existence of such a function in 1905.) Lebesgue also established several central theorems in the theory of integration; one of them is the famous Lebesgue Dominated Convergence Theorem.
Finally, the Austrian Johann Radon, in his 1913 Habilitation work, began to study abstract measures and integrals more general than those of Lebesgue in $\mathbb{R}^n$. Radon is also the author of the well-known Radon and Radon-Stieltjes integrals. The latter is most frequently referred to as the Lebesgue-Stieltjes integral. Radon's ideas led not only to the abstract theory of measure and integration but also to its applications in boundary value problems in the theory of logarithmic potentials (developed by Radon himself) and in the contemporary theory of probability and stochastic processes.
1. SET FUNCTIONS
1.1 Definitions.
(i) Let $\Sigma$ be a system of subsets of $\Omega$ including the empty set $\emptyset$. A numerical function $\mu: \Sigma \to \overline{\mathbb{R}}$ such that $\mu(\emptyset) = 0$ is called a set function. In this chapter we will only consider nonnegative set functions $\mu: \Sigma \to \overline{\mathbb{R}}_+$. In the definitions below we assume that the corresponding sets are elements of $\Sigma$.
(ii) A set function $\mu$ is called finitely additive or just additive if for any $n$-tuple $A_1, \dots, A_n$ of mutually disjoint sets with $\sum_{k=1}^{n} A_k \in \Sigma$,
$$\mu\Big( \sum_{k=1}^{n} A_k \Big) = \sum_{k=1}^{n} \mu(A_k).$$
(iii) A set function $\mu$ is called $\sigma$-additive if for any sequence $A_1, A_2, \dots$ of mutually disjoint sets with $\sum_{n=1}^{\infty} A_n \in \Sigma$, it holds that
$$\mu\Big( \sum_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} \mu(A_n).$$
(iv) A set function $\mu$ on $\Sigma$ is called continuous from below on $\Sigma$ if for every monotone nondecreasing sequence $\{A_n\}\uparrow$ such that $\bigcup_{n=1}^{\infty} A_n \in \Sigma$ it holds that
$$\lim_{n \to \infty} \mu(A_n) = \mu\Big( \bigcup_{n=1}^{\infty} A_n \Big).$$
If this condition is known to hold for a particular monotone nondecreasing sequence $\{A_n\}\uparrow$, then $\mu$ is said to be continuous from below on $\{A_n\}$.
(v) Let $\{A_n\}$ be a monotone nonincreasing sequence and $\bigcap_{n=1}^{\infty} A_n \in \Sigma$. A set function $\mu$ on $\Sigma$ is said to be continuous from above on $\{A_n\}$ if
$$\lim_{n \to \infty} \mu(A_n) = \mu\Big( \bigcap_{n=1}^{\infty} A_n \Big). \qquad (1.1)$$
The set function $\mu$ is called continuous from above on $\Sigma$ if (1.1) holds for every monotone nonincreasing sequence $\{A_n\}\downarrow$ such that $\bigcap_{n=1}^{\infty} A_n \in \Sigma$. In particular, if $\{A_n\} \downarrow \emptyset$, (1.1) reduces to
$$\lim_{n \to \infty} \mu(A_n) = 0$$
and this is referred to as continuity from above at the empty set or, shortly, $\emptyset$-continuity of $\mu$.
(vi) A set function $\mu$ is called finite on $\Sigma$ if $\mu(A) < \infty$ for all $A \in \Sigma$.
(vii) A set function $\mu$ is called $\sigma$-finite on $\Sigma$ if $\Sigma$ contains a sequence $\{A_n\}$ monotonically increasing to $\Omega$ (i.e. $\bigcup_{n=1}^{\infty} A_n = \Omega$) and $\mu$ is finite on $\{A_n\}$. In this case, we also say that $\mu$ is $\sigma$-finite on $\{A_n\}$.
(viii) An additive set function $\mu$ defined on a semi-ring $\mathcal{S}$ (in $\Omega$) is called an elementary content (on $\mathcal{S}$).
(ix) An additive set function $\mu$ defined on a ring $\mathcal{R}$ (in $\Omega$) is called a content (on $\mathcal{R}$).
(x) A $\sigma$-additive set function $\mu$ defined on a ring $\mathcal{R}$ or algebra $\mathcal{A}$ (in $\Omega$) is called a premeasure (on $\mathcal{R}$ or $\mathcal{A}$).
(xi) A $\sigma$-additive set function $\mu$ defined on a $\sigma$-algebra $\Sigma$ (in $\Omega$) is called a measure (on $\Sigma$). If, in addition, $\mu(\Omega) = 1$, then $\mu$ is called a probability measure.
(xii) Let $(\Omega, \Sigma)$ be a measurable space and let $\mu$ be a measure on $\Sigma$. Then the triple $(\Omega, \Sigma, \mu)$ is called a measure space. If $\mu$ is a probability measure, then the triple $(\Omega, \Sigma, \mu)$ is called a probability space.
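1.2 Examples.
(i) Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$ be a fixed point. Define the following set function $\varepsilon_a$ on $\Sigma$. For each $A \in \Sigma$, we set $\varepsilon_a(A) = 1$ if $a \in A$ and $\varepsilon_a(A) = 0$ if $a \notin A$. Then $\varepsilon_a$ is a measure on $\Sigma$. It is clear that $\varepsilon_a(\emptyset) = 0$ and that $\varepsilon_a \ge 0$. Let $\{A_n\}$ be a sequence of pairwise disjoint sets from $\Sigma$. Then $a$ can either belong to exactly one set, $A_j$ (for some $j$), or to no set of the sequence. In the first case,
$$\varepsilon_a\Big( \sum_{n=1}^{\infty} A_n \Big) = 1,$$
and in the second case,
$$\varepsilon_a\Big( \sum_{n=1}^{\infty} A_n \Big) = 0.$$
On the other hand, in the first case,
$$\sum_{n=1}^{\infty} \varepsilon_a(A_n) = \varepsilon_a(A_j) = 1,$$
and in the second case,
$$\sum_{n=1}^{\infty} \varepsilon_a(A_n) = 0.$$
Therefore, $\varepsilon_a$ is $\sigma$-additive. The measure $\varepsilon_a$ is called a point mass or Dirac measure.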
(ii) Let $\Omega = \mathbb{R}^n$. In Example 1.2 (iv), Chapter 4, we introduced the system $\mathcal{S}$ of all half-open intervals $(a,b] \subseteq \mathbb{R}^n$, which was shown to be a semi-ring on $\mathbb{R}^n$. For $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ ($a_i < b_i$), we define
$$\lambda^0((a,b]) = \prod_{k=1}^{n} (b_k - a_k) \quad \text{and} \quad \lambda^0(\emptyset) = 0.$$
Then $\lambda^0$ is obviously an elementary content on $\mathcal{S}$. $\lambda^0$ is said to be the Lebesgue elementary content.
(iii) Let $[\mathbb{R}, \mathbb{R}, f]$ be a bounded, monotone nondecreasing, right-continuous function that vanishes at $-\infty$. Any such function is called a distribution function. There is also a variant, the so-called extended distribution function $[\mathbb{R}, \mathbb{R}, f]$, which is monotone nondecreasing and right-continuous, but need not be bounded over unbounded sets and need not vanish at $-\infty$. (As a monotone function, an extended distribution function is bounded over bounded sets though.) Since the set $\{f > a\}$ (for any real number $a$) is either $\emptyset$ or an interval, every distribution or extended distribution function is Borel. Both types of functions are subclasses of monotone functions (which will be introduced and studied in Chapter 9). Let $\mathcal{S}$ be the semi-ring of half-open intervals in $\mathbb{R}$. We define the set function $\mu_f^0$ on $\mathcal{S}$ as
$$\mu_f^0((a,b]) = f(b) - f(a), \qquad \mu_f^0(\emptyset) = 0.$$
$\mu_f^0$ is clearly additive on $\mathcal{S}$ and therefore is an elementary content on $\mathcal{S}$. $\mu_f^0$ is called the Lebesgue-Stieltjes elementary content.
(iv) Let $\Omega$ be an uncountable set and let $\Sigma = \{A \in \mathcal{P}(\Omega) : \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$, which is a $\sigma$-algebra on $\Omega$ (see Example 1.2 (vii), Chapter 4). Then, for all $A \in \Sigma$, define $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = 1$ if $A^c$ is at most countable. We show that $\mu$ is a measure on $\Sigma$. First observe that $\mu \ge 0$ and $\mu(\emptyset) = 0$. Let $\{A_n : n = 1, 2, \dots\}$ be a sequence of pairwise disjoint sets of $\Sigma$. If the union $\sum_{n=1}^{\infty} A_n$ is at most countable, then each $A_n$ is at most countable, and thus
$$\mu\Big( \sum_{n=1}^{\infty} A_n \Big) = 0 = \sum_{n=1}^{\infty} \mu(A_n).$$
If $\big( \sum_{n=1}^{\infty} A_n \big)^c$ is at most countable, then we argue that there is exactly one set of the sequence $\{A_n\}$ with an at most countable complement. (At least one such set exists, for otherwise every $A_n$, and hence the union, would be at most countable.) Suppose that there is yet another set from this sequence with this property. Then both complements, say $A_i^c$ and $A_k^c$, are at most countable and hence $A_i^c \cup A_k^c$ is also at most countable. Since $A_i \cap A_k = \emptyset$, we have $A_i^c \cup A_k^c = \Omega$, which is a contradiction, for $\Omega$ is uncountable. Therefore, we have exactly one set $A_i$ such that $A_i^c$ is at most countable. Then $\sum_{n \ne i} A_n \subseteq A_i^c$ is at most countable and
$$\sum_{n=1}^{\infty} \mu(A_n) = \mu(A_i) = 1.$$
On the other hand,
$$\mu\Big( \sum_{n=1}^{\infty} A_n \Big) = 1.$$
This yields $\sigma$-additivity of $\mu$.
(v) Let $\{(\Omega, \Sigma, \mu_i) : i = 1, 2, \dots\}$ be a sequence of measure spaces and let $\{a_i : i = 1, 2, \dots\}$ be a nonnegative numerical sequence. Define $\mu$ on $\Sigma$ as
$$\mu = \sum_{i=1}^{\infty} a_i \mu_i. \qquad (1.2)$$
Then $\mu$ is a measure on $\Sigma$. (See Problem 1.3.)
(vi) Consider the special case in the last example with $\mu_n = \varepsilon_{b_n}$, $n = 1, 2, \dots$, where $\{b_n\} \subseteq \Omega$. In other words, $\mu_n$ is a point mass, which was introduced in (i). Then the measure $\mu$ defined in (1.2) is called an atomic (sometimes also discrete) measure. A further special case is of interest. With the given $\mu_n$, we also assume that the sequence $\{a_n\}$ has the property
$$\sum_{n=1}^{\infty} a_n = 1.$$
It is readily seen that the measure $\mu$ is a probability measure; more specifically, it is an atomic probability measure.
(vii) Let $\Omega$ be an uncountable set and let $\Sigma = \{A \in \mathcal{P}(\Omega) : \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$ as in (iv). For all $A \in \Sigma$, define $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = \infty$ if $A^c$ is at most countable. Then $\mu$ is a measure on $\Sigma$.
(viii) Let $\Omega$ be an arbitrary set and $\Sigma$ be a $\sigma$-algebra on $\Omega$. Define the following set function $\mu$ on $\Sigma$. For each $A \in \Sigma$, $\mu(A) = |A|$ (i.e. the number of elements in $A$) if $A$ is finite, and $\mu(A) = \infty$ otherwise. Then, it is easy to verify that $\mu$ is a measure on $\Sigma$. It is called a counting measure and the corresponding triple $(\Omega, \Sigma, \mu)$ will be referred to as a counting measure space. Note that if $\Omega$ is an at most countable set, a counting measure $\mu$ can also be expressed in terms of the measure introduced in (1.2), with $a_n = 1$ and $\mu_n = \varepsilon_{b_n}$.
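The point mass, the atomic measure (1.2) with point masses, and the counting measure are easy to evaluate on finite sets. The sketch below is a small illustration only: the function names `dirac`, `atomic` and `counting` and the sample weights are ours, sets are restricted to finite Python sets, and exact fractions are used so that additivity can be checked without rounding.

```python
from fractions import Fraction

def dirac(a):
    """Point mass at a: returns 1 if a is in A, else 0."""
    return lambda A: 1 if a in A else 0

def atomic(weights):
    """Atomic measure mu = sum_n a_n * eps_{b_n},
    given as a dict {b_n: a_n} of atoms and nonnegative weights."""
    return lambda A: sum(w for b, w in weights.items() if b in A)

def counting(A):
    """Counting measure of a finite set."""
    return len(A)

eps_2 = dirac(2)
mu = atomic({1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)})

omega = {1, 2, 3, 4}
A, B = {1}, {2, 3}          # disjoint sets
```

Since the weights sum to 1, `mu(omega)` equals 1, so `mu` is an atomic probability measure on this finite set, and `mu(A | B)` equals `mu(A) + mu(B)` for the disjoint sets above.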
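1.3 Proposition. Let $\mu$ be an elementary content on a semi-ring $\mathcal{S}$. Then the following holds true:
(i) For any two sets $A$ and $B$ from $\mathcal{S}$ with $A \subseteq B$, $\mu(A) \le \mu(B)$ (monotonicity).
(ii) Let $A_1, A_2, \dots$ be a sequence of pairwise disjoint sets from $\mathcal{S}$ and $A_n \subseteq A \in \mathcal{S}$ for each $n$. Then,
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(A).$$
(iii) Let $A, A_1, A_2, \dots \in \mathcal{S}$ and $A \subseteq \bigcup_{n=1}^{\infty} A_n$. Then there is a countable decomposition $\sum_{k=1}^{\infty} C_k$ of $A$ with $C_k$'s $\in \mathcal{S}$ such that
$$\sum_{k=1}^{\infty} \mu(C_k) \le \sum_{n=1}^{\infty} \mu(A_n). \qquad (1.3a)$$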
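Proof.
(i) $B = A + B \setminus A = A + \sum_{s=1}^{n} C_s$, where the $C_s$'s $\in \mathcal{S}$. Hence, by additivity of $\mu$,
$$\mu(B) = \mu(A) + \sum_{s=1}^{n} \mu(C_s),$$
and the statement follows.
(ii) By Problem 1.12, Chapter 4,
$$A \setminus \sum_{k=1}^{n} A_k = \sum_{s=1}^{r} C_s, \qquad C_s \in \mathcal{S}.$$
Hence, because of the assumption that $A_n \subseteq A$, $A = \sum_{k=1}^{n} A_k + \sum_{s=1}^{r} C_s$. Thus, by additivity of $\mu$ on $\mathcal{S}$,
$$\mu(A) = \sum_{k=1}^{n} \mu(A_k) + \sum_{s=1}^{r} \mu(C_s),$$
which implies that $\mu(A) \ge \sum_{k=1}^{n} \mu(A_k)$, good for all $n$. Consequently, $\mu(A) \ge \sum_{k=1}^{\infty} \mu(A_k)$.
(iii) With $B_n = A_n \cap A$ ($\in \mathcal{S}$) we have that $\bigcup_{n=1}^{\infty} B_n = A$. Denote
$$D_1 = B_1, \qquad D_{n+1} = B_{n+1} \setminus \bigcup_{j=1}^{n} B_j, \quad n = 1, 2, \dots.$$
Then, $A = \sum_{k=1}^{\infty} D_k$ and, by Problem 1.12, Chapter 4, $D_k = \sum_{s=1}^{r_k} C_{sk}$ with $C_{sk}$'s $\in \mathcal{S}$. Since $D_k \subseteq B_k$, by (ii), $\sum_{s=1}^{r_k} \mu(C_{sk}) \le \mu(B_k)$ for each $k$. Now, due to monotonicity of $\mu$ and because $B_k \subseteq A_k$, we have that
$$\sum_{k=1}^{\infty} \sum_{s=1}^{r_k} \mu(C_{sk}) \le \sum_{k=1}^{\infty} \mu(B_k) \le \sum_{k=1}^{\infty} \mu(A_k),$$
which yields (1.3a). $\square$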
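Additivity and monotonicity of an elementary content can be checked numerically for the Lebesgue elementary content on half-open boxes and for a Lebesgue-Stieltjes elementary content on the line. The sketch below is illustrative only: a box $(a, b]$ is represented by its two corner tuples, the helper names `content`, `split` and `stieltjes_content` are ours, and the function $f(x) = x^3$ is our choice of an extended distribution function.

```python
from math import prod

def content(a, b):
    """Lebesgue elementary content of the box (a, b] in R^n:
    the product of the side lengths b_k - a_k."""
    return prod(bk - ak for ak, bk in zip(a, b))

def split(a, b, axis, cut):
    """Split (a, b] along the hyperplane x_axis = cut into two
    disjoint half-open boxes whose union is (a, b]."""
    left_upper = tuple(cut if k == axis else bk for k, bk in enumerate(b))
    right_lower = tuple(cut if k == axis else ak for k, ak in enumerate(a))
    return (a, left_upper), (right_lower, b)

def stieltjes_content(f, a, b):
    """Lebesgue-Stieltjes elementary content of (a, b] on the line."""
    return f(b) - f(a)

a, b = (0, 0), (3, 2)
left, right = split(a, b, axis=0, cut=1)

f = lambda x: x ** 3     # nondecreasing and continuous (our choice)
```

The box $(a, b]$ has content 6, and the two pieces produced by `split` have contents 2 and 4. Likewise `stieltjes_content(f, 0, 2)` equals the sum of the contents of $(0, 1]$ and $(1, 2]$.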
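1.4 Corollaries.
(i) If $\sum_{m=1}^{\infty} E_m$ is a decomposition of $A$ by elements $E_m$'s from $\mathcal{S}$, then (from Proposition 1.3 (iii)):
$$\sum_{k=1}^{\infty} \mu(C_k) \le \sum_{m=1}^{\infty} \mu(E_m), \qquad (1.4)$$
i.e., $\sum_{k=1}^{\infty} C_k$ is a "$\mu$-minimal decomposition" of $A$.
(ii) Let $A, A_1, A_2, \dots \in \mathcal{S}$ and $A \subseteq \bigcup_{n=1}^{\infty} A_n$. If $\mu$ is $\sigma$-additive on $\mathcal{S}$ (i.e., for any sequence of mutually disjoint sets $\{A_n\} \subseteq \mathcal{S}$ with $\sum_{n=1}^{\infty} A_n \in \mathcal{S}$, $\mu(\sum_{n=1}^{\infty} A_n) = \sum_{n=1}^{\infty} \mu(A_n)$), then (from (1.4)):
$$\mu(A) \le \sum_{n=1}^{\infty} \mu(A_n).$$
In particular, if $A = \bigcup_{n=1}^{\infty} A_n \in \mathcal{S}$,
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \mu(A_n), \qquad (1.4b)$$
which is known as the $\sigma$-subadditivity property. Inequality (1.4b) (originally applied to a semi-ring $\mathcal{S}$ and elementary content $\mu$) implies:
(iii) If $\mu$ is a content on a ring $\mathcal{R}$ then for any $A_1, \dots, A_n \in \mathcal{R}$, $\mu\big( \bigcup_{k=1}^{n} A_k \big) \le \sum_{k=1}^{n} \mu(A_k)$ (finite subadditivity).
(iv) If $\mu$ is a measure on a $\sigma$-algebra $\Sigma$ then the $\sigma$-subadditivity property (1.4b) is valid for any sequence $\{A_n\} \subseteq \Sigma$.
(v) Because of the monotonicity property of an elementary content, due to Proposition 1.3 (i), the definition (1.1 (vi)) of finiteness can be relaxed for elementary contents and all set functions "above" them by requiring merely $\mu(\Omega) < \infty$.
There are two more minor properties of contents left for an exercise: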
1.5 Proposition. Every content $\mu$ on a ring $\mathcal{R}$ has the following properties:
(i) $A \subseteq B \Rightarrow \mu(B \setminus A) = \mu(B) - \mu(A)$ (provided that $\mu(A) < \infty$).
(ii) $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$.
(See Problem 1.2.)
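One possible route to both properties uses only additivity on disjoint decompositions. The following derivation is a sketch under the extra assumption that the subtracted quantities are finite ($\mu(A) < \infty$ in the first line, $\mu(A \cap B) < \infty$ in the second); it is one way to approach the exercise, not necessarily the intended one.

```latex
\begin{align*}
% (i): B is the disjoint union of A and B \setminus A when A \subseteq B
B = A + (B \setminus A)
  &\;\Longrightarrow\; \mu(B) = \mu(A) + \mu(B \setminus A)
  \;\Longrightarrow\; \mu(B \setminus A) = \mu(B) - \mu(A), \\
% (ii): A \cup B is the disjoint union of A and B \setminus (A \cap B)
A \cup B = A + \bigl(B \setminus (A \cap B)\bigr)
  &\;\Longrightarrow\; \mu(A \cup B) = \mu(A) + \mu\bigl(B \setminus (A \cap B)\bigr)
  = \mu(A) + \mu(B) - \mu(A \cap B).
\end{align*}
```

The last equality in the second line applies (i) to the pair $A \cap B \subseteq B$.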
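1.6 Lemma. Let $\mu$ be a content on a ring $\mathcal{R}$. Then $\mu$ is a premeasure (i.e., $\sigma$-additive) on $\mathcal{R}$ if and only if $\mu$ is continuous from below.
Proof. 1) Let $\mu$ be a premeasure on $\mathcal{R}$ and $\{A_n\}$ a monotone nondecreasing sequence in $\mathcal{R}$ such that $A := \bigcup_{n=1}^{\infty} A_n \in \mathcal{R}$. If $\mu(A_n) = \infty$ for some $n$, then by monotonicity, $\mu(A_k) = \infty$ for all $k \ge n$ and $\mu(A) = \infty$. Then continuity from below follows immediately. Assume that $\mu(A_n) < \infty$ for all $n$. Denote $B_n = A_n \setminus A_{n-1}$, $n = 1, 2, \dots$, $A_0 = \emptyset$. Then $A_n = \sum_{k=1}^{n} B_k$ and $\{B_n\}$ forms a pairwise disjoint sequence from $\mathcal{R}$ with
$$A = \sum_{n=1}^{\infty} B_n.$$
Since $\mu(A_n) < \infty$ (by the above assumption), $\mu(B_n)$ is well defined and, therefore, due to $\sigma$-additivity of $\mu$ (as a premeasure),
$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_{n \to \infty} \sum_{k=1}^{n} \mu(B_k) = \lim_{n \to \infty} \mu(A_n),$$
which yields continuity from below.
2) Now let $\mu$ be a content on $\mathcal{R}$ which is continuous from below, i.e., suppose that for every monotone increasing sequence of sets in $\mathcal{R}$, $\{A_n\} \uparrow A := \bigcup_{n=1}^{\infty} A_n$, such that $A \in \mathcal{R}$, it holds that
$$\lim_{n \to \infty} \mu(A_n) = \mu(A).$$
Let $\{C_n\}$ be a sequence of pairwise disjoint sets from $\mathcal{R}$ with $B := \sum_{n=1}^{\infty} C_n \in \mathcal{R}$. By setting
$$B_n = \sum_{k=1}^{n} C_k$$
we get $\{B_n\}\uparrow \subseteq \mathcal{R}$, with $\bigcup_{n=1}^{\infty} B_n = B$ and hence $\lim_{n \to \infty} \mu(B_n) = \mu(B)$, which is equivalent to
$$\sum_{n=1}^{\infty} \mu(C_n) = \mu\Big( \sum_{n=1}^{\infty} C_n \Big).$$
This is the desired $\sigma$-additivity of $\mu$. $\square$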
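1.7 Theorem.
(i) If $\mu$ is a premeasure on a ring $\mathcal{R}$ and if $\{A_n\}\downarrow \subseteq \mathcal{R}$ is such that $\mu(A_1) < \infty$ and $A := \bigcap_{n=1}^{\infty} A_n \in \mathcal{R}$, then $\mu$ is ∅-continuous (or continuous from above) on the sequence $\{A_n\}$. In particular, if $\mu$ is a finite premeasure on a ring $\mathcal{R}$, then $\mu$ is ∅-continuous (or continuous from above) on $\mathcal{R}$.
(ii) If $\mu$ is a finite content on a ring $\mathcal{R}$, then ∅-continuity implies that $\mu$ is a premeasure.
Proof.
(i) Since $\{A_n\}$ is monotone nonincreasing, $\{A_1 \setminus A_n\}\uparrow \subseteq \mathcal{R}$. It can easily be shown that $\bigcup_{n=1}^{\infty} (A_1 \setminus A_n) = A_1 \setminus \left(\bigcap_{n=1}^{\infty} A_n\right)$. Now we apply Lemma 1.6 to the sequence $\{A_1 \setminus A_n\}$ to arrive at
$$\lim_{n\to\infty} \mu(A_1 \setminus A_n) = \mu(A_1 \setminus A). \tag{1.7}$$
Since $\mu$ is finite on $\{A_n\}$, by Proposition 1.5 (i), equation (1.7) reduces to
$$\mu(A_1) - \lim_{n\to\infty} \mu(A_n) = \mu(A_1) - \mu(A)$$
and thereby yields assertion (i).
(ii) We show that ∅-continuity implies continuity from below and that Lemma 1.6 in turn yields (ii). Let $\{A_n\} \subseteq \mathcal{R}$ be such that $\{A_n\} \uparrow A \in \mathcal{R}$. Then $\{A \setminus A_n\} \downarrow \emptyset$ and ∅-continuity yield $\lim_{n\to\infty} \mu(A \setminus A_n) = 0$. Since the content $\mu$ is assumed to be finite on $\mathcal{R}$, by Proposition 1.5 (i), the last expression leads to $\lim_{n\to\infty} \mu(A_n) = \mu(A)$. Applying Lemma 1.6 (2), we have that $\mu$ is a premeasure on $\mathcal{R}$. □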
1.8 Examples.
(i) We show that on some measure spaces, while $\{A_n\} \downarrow \emptyset$, $\mu(A_n)$ need not converge to 0. Let $(\Omega, \Sigma, \mu)$ be such that $\Omega = \mathbb{N}$, $\Sigma = \mathcal{P}(\Omega)$, $\mu(\{n\}) = \frac{1}{n}$. Let $A_n = \{n, n+1, \dots\}$. Clearly, $\{A_n\}$ is a monotone decreasing vanishing sequence. However, $\mu(A_n) = \infty$ for each $n$ and thus $\mu(A_n) \to \infty$. So ∅-continuity does not apply, since we violated the condition $\mu(A_1) < \infty$ of Theorem 1.7, which, as we see, is essential.
(ii) Consider in $\Omega$ the σ-algebra $\Sigma = \{\Omega, \emptyset, A, A^c\}$ and define
$$\mu(A) = p, \quad \mu(A^c) = 1 - p, \quad \mu(\Omega) = 1, \quad \mu(\emptyset) = 0,$$
where $p \in (0,1)$. Then $\mu$ is a (probability) measure on $\Sigma$, called a Bernoulli measure. Notice that for the traditional Bernoulli measure $\Omega = \{0,1\}$ and $A = \{1\}$.
(iii) Consider the following atomic measure on the Borel σ-algebra $\mathcal{B}(\mathbb{R})$:
$$\mu = \sum_{k=0}^{n} a_k \varepsilon_k, \quad \text{where } a_k = \binom{n}{k} p^k (1-p)^{n-k}, \; p \in (0,1), \; k = 0,\dots,n.$$
Clearly, $\mu$ is a probability measure. It is called the binomial measure and it is denoted by $\beta_{n,p}$.
(iv) Consider another atomic measure on $\mathcal{B}(\mathbb{R})$:
$$\mu = \sum_{n=0}^{\infty} a_n \varepsilon_n, \quad \text{where } a_n = e^{-\lambda} \frac{\lambda^n}{n!}, \; \lambda > 0, \; n = 0,1,\dots.$$
$\mu$ is also a probability measure, called the Poisson measure, in notation, $\pi_\lambda$.
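The claim in Examples 1.8 (iii) and (iv) that the atomic weights define probability measures amounts to the weights summing to 1. The following minimal sketch checks this numerically; the parameter values ($n = 10$, $p = 0.3$, $\lambda = 2.5$) and the truncation of the Poisson series at 60 terms are assumptions made purely for illustration.

```python
from math import comb, exp, factorial


def binomial_weights(n, p):
    """Weights a_k = C(n, k) p^k (1 - p)^(n - k), k = 0, ..., n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def poisson_weights(lam, terms):
    """First `terms` weights a_n = exp(-lam) lam^n / n! of the Poisson measure."""
    return [exp(-lam) * lam**k / factorial(k) for k in range(terms)]


# Total mass of the binomial measure (a finite sum).
total_binomial = sum(binomial_weights(10, 0.3))

# Total mass of the Poisson measure, truncated; the neglected tail is tiny.
total_poisson = sum(poisson_weights(2.5, 60))

print(total_binomial, total_poisson)
```

Both totals agree with 1 up to floating-point rounding; for the binomial weights this is the binomial theorem applied to $(p + (1-p))^n$, and for the Poisson weights it is the power series of $e^{\lambda}$.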
PROBLEMS
1.1
Let $\mathcal{S}$ be the semi-ring of half-open intervals on the real line and $\lambda_0$ be the Lebesgue elementary content. Take $A = (0,1]$, $B = (1,2]$, and $C = (3,4]$. While $\lambda_0(A + B) = \lambda_0(A) + \lambda_0(B)$, we cannot state that $\lambda_0(A + C) = \lambda_0(A) + \lambda_0(C)$, since $A + C$ is not an interval and, therefore, the left-hand side of the last equation is not defined. Hence, $\lambda_0$ is not additive on $\mathcal{S}$. True or false?
1.2
Prove Proposition 1.5 (i,ii).
1.3
Show that for a content on a ring, the notions of continuity from below and ∅-continuity are equivalent.
1.4
Prove that $\mu$ is a measure on $\Sigma$ in Example 1.2 (v).
1.5
Let $(\Omega, \Sigma, \mu)$ be a measure space and let $\{A_n : n = 1,2,\dots\} \subseteq \Sigma$ with $\sum_{n=1}^{\infty} \mu(A_n) < \infty$. Show that $\mu\left(\overline{\lim}_{n\to\infty} A_n\right) = 0$. [Hint: Apply continuity from above of the measure $\mu$ to the monotone nonincreasing sequence $\left\{\bigcup_{k=n}^{\infty} A_k : n = 1,2,\dots\right\}$.]
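1.6
A subset $\mathcal{R}_1$ of a ring $\mathcal{R}$ in $\Omega$ is called an ideal in $\mathcal{R}$ if it has the following properties: […] Let $\mu$ be a content on a ring $\mathcal{R}$. Define $\mathcal{R}_\mu = \{R \in \mathcal{R} : \mu(R) = 0\}$. Show that $\mathcal{R}_\mu$ is an ideal in $\mathcal{R}$.
1.7
A subset of a ring $\mathcal{R}$ is called a σ-ideal in $\mathcal{R}$ if, in addition to properties a-c) of Problem 1.6, it is closed with respect to the formation of at most countable unions. Let $\mu$ be a premeasure on $\mathcal{R}$. With the same definition of $\mathcal{R}_\mu$ as in 1.6, prove that $\mathcal{R}_\mu$ is a σ-ideal in $\mathcal{R}$.
1.8
Let $\mathcal{R}_1$ be a σ-ideal in a ring $\mathcal{R}$. Show that there exists a premeasure $\mu$ on $\mathcal{R}$ such that $\mathcal{R}_\mu = \mathcal{R}_1$, defined in Problems 1.7 and 1.8.
1.9
Let $\mu$ be a finite content on a ring $\mathcal{R}$. Show that $d(A,B) := \mu(A \,\Delta\, B)$, for all $A, B \in \mathcal{R}$, is a pseudo-metric on $\mathcal{R}$ (i.e., that $d$ possesses all properties of a metric except that $d(A,B) = 0$ need not yield $A = B$).
1.10
Let $(\Omega, \Sigma, \mu)$ be a measure space and $\{A_n\} \subseteq \Sigma$ be a sequence of sets with $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) < \infty$ and $\inf\{\mu(A_n) : n = 1,2,\dots\} = a \ge 0$. Show that $\mu\left(\overline{\lim}_{n\to\infty} A_n\right) \ge a$.
1.11
Let $(\Omega, \Sigma, \mu)$ be a measure space and for a number $0 < a \le \infty$ define $\mathcal{G}_a = \{G \in \Sigma : \mu(G) < a\}$ and
$$\Sigma_a = \{Q \subseteq \Omega : Q \cap G \in \Sigma, \ \forall G \in \mathcal{G}_a\}.$$
Show that $\Sigma_a$ is a σ-algebra.
1.12
In the condition of Problem 1.11, let $a = \infty$ and $\mu$ be a finite measure. Show that $\Sigma_a = \Sigma$.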
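1.13
Let $(\Omega, \Sigma, \mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, and $\Sigma_\infty = \{Q \subseteq \Omega : Q \cap G \in \Sigma, \ \forall G \in \mathcal{G}_\infty\}$. (Notice that $\Sigma \subseteq \Sigma_\infty$.) Define the set function $\mu_\infty$ on $\Sigma_\infty$ as […] Show that $\mu_\infty$ is a measure on $\Sigma_\infty$.
1.14
Argue that for any probability space $(\Omega, \Sigma, P)$, the axiom $P(\emptyset) = 0$ is redundant. Is it also true for any measure?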
NEW TERMS: set function; additive set function; σ-additive set function; continuity from below on a σ-algebra; continuity from below on a sequence of sets; continuity from above on a sequence of sets; continuity from above on a σ-algebra; continuity from above at the empty set; ∅-continuity; finite set function; σ-finite set function on a system of sets; σ-finite set function on a sequence of sets; elementary content; content; premeasure; measure; probability measure; measure space; probability space; point mass (Dirac measure); Dirac measure (point mass); Lebesgue elementary content; distribution function; extended distribution function; Lebesgue-Stieltjes elementary content; atomic (discrete) measure; discrete (atomic) measure; atomic probability measure; counting measure; monotonicity; μ-minimal decomposition of a set; σ-subadditivity; finite subadditivity; continuity from below, criterion of; continuity from above, criterion of; Bernoulli measure; binomial measure; Poisson measure; ideal; σ-ideal
2. EXTENSION OF SET FUNCTIONS TO A MEASURE
We begin this section with the introduction of a set function that is not exactly a measure, as it is not even additive, but which is at the heart of the formation of measures extended from more primitive set functions. A prominent example of such a construction yields the Lebesgue measure. It is initially defined on rectangular figures, and the measurement of a more arbitrary figure is then accomplished by approximation with rectangles inscribed into the figure or with rectangles that cover the figure. The latter leads to the notion of an "outer measure," which was initially proposed by Lebesgue at the turn of the twentieth century and later refined by Carathéodory. Carathéodory's approach is essentially preserved in the contemporary constructions.
The principal idea of the extension begins with measuring an arbitrary set by sequences of rudimentary sets, which should cover the set and whose measure is previously defined. The total "measure" of the cover is then minimized over all available cover-sequences of basal sets (such as rectangles in Euclidean space). As it turns out, this way we can measure all subsets by the resulting set function, i.e., the outer measure, but the latter fails to be additive, although it preserves some rather useful properties of a measure, such as subadditivity and monotonicity. Having proved this, we will notice that some of the additivity can be regained; namely, there are sets, including the basal sets, each of which, along with its complement, forms a two-set partition of any other set on which the outer measure becomes additive. The collection of all such "separating" sets forms a σ-algebra, which, as we will notice, contains the basal sets. This is generally not the smallest σ-algebra over the basic collection, but this σ-algebra of separating sets can be reduced further.
Our procedure, however, will be different from the more intuitive way described above. Rather than having a particular generator (such as a semi-ring along with an elementary content) in mind, we will try to develop the whole extension in general. In the beginning, we will define an outer measure as a set function with monotonicity and subadditivity and show that the subcollection of all separating sets is a σ-algebra and, in addition, that the outer measure on this subcollection is a measure. All this will initially be rendered without assuming that the outer measure was generated by a "formatter" (i.e., some collection of sets and a set function). Then, we take an arbitrary formatter and create a more specific outer measure by applying the above construction with countable covers.
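2.1 Definition. Let $\Omega$ be a nonempty set and $\mu^*$ be a set function defined on $\mathcal{P}(\Omega)$. $\mu^*$ is called an outer measure if:
a) $\mu^*(\emptyset) = 0$. (2.1a)
b) $A \subseteq B \Rightarrow \mu^*(A) \le \mu^*(B)$ (monotonicity). (2.1b)
c) $\mu^*\left(\bigcup_{n=1}^{\infty} Q_n\right) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$ for any sequence $\{Q_n\} \subseteq \mathcal{P}(\Omega)$ (σ-subadditivity). (2.1c)
Although axiom a) is redundant, since $\mu^*(\emptyset) = 0$ as a set function in general, we find it to be a useful reminder. □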
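2.2 Definition. Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$. A subset $M \subseteq \Omega$ is said to be $\mu^*$-measurable if for any $Q \subseteq \Omega$,
$$\mu^*(Q) = \mu^*(Q \cap M) + \mu^*(Q \cap M^c). \tag{2.2}$$
We will also say that $M$ separates $Q$. □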
The following is what essentially constitutes the widely referred to Carathéodory Extension Theorem. For convenience, we will break it up into several theorems. The idea of outer measures and the construction below belong to the German mathematician (of Greek origin) Constantin Carathéodory and appeared in his 1914 paper, Über das lineare Maß von Punktmengen - eine Verallgemeinerung des Längenbegriffs (in Göttinger Nachrichten), and in his famous 1918 book, Vorlesungen über reelle Funktionen (Teubner, Leipzig).
2.3 Theorem. The collection $\Sigma^*$ of all $\mu^*$-measurable subsets forms a σ-algebra in $\Omega$. The restriction of $\mu^*$ from $\mathcal{P}(\Omega)$ to $\Sigma^*$, in notation $\mu_0^*$, is a measure.
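Proof. Since throughout the proof of this theorem we will largely use equation (2.2) or prove its validity, we first notice that, due to σ-subadditivity of $\mu^*$, as an outer measure, the inequality
$$\mu^*(Q) \le \mu^*(Q \cap M) + \mu^*(Q \cap M^c) \tag{2.3}$$
holds true for all subsets $Q$ and $M$ of $\Omega$. Our proof will consist of the following steps.
a) $\Omega$ is obviously an element of $\Sigma^*$, as it satisfies (2.2). If $M \in \Sigma^*$, then $M^c \in \Sigma^*$, by their symmetry in (2.2).
b) We show that $\Sigma^*$ is closed with respect to the formation of finite unions, i.e., we show that with $A, B \in \Sigma^*$, $A \cup B \in \Sigma^*$. Since $B \in \Sigma^*$, it follows that for each $Q' \in \mathcal{P}(\Omega)$,
$$\mu^*(Q') = \mu^*(Q' \cap B) + \mu^*(Q' \cap B^c). \tag{2.3a}$$
Specifically, (2.3a) is valid for $Q' = Q \cap A$ and $Q' = Q \cap A^c$, $Q \in \mathcal{P}(\Omega)$. Hence,
$$\mu^*(Q \cap A) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c)$$
and
$$\mu^*(Q \cap A^c) = \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c).$$
Summing up the last two equations and taking into account that $A \in \Sigma^*$, i.e., that $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \cap A^c)$, we have
$$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B) + \mu^*(Q \cap A^c \cap B^c). \tag{2.3b}$$
Now replacing $Q$ in (2.3b) with $Q \cap (A \cup B)$, and noticing that $Q \cap (A \cup B) \cap A^c \cap B^c = \emptyset$, we also have
$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap B^c) + \mu^*(Q \cap A^c \cap B). \tag{2.3c}$$
Substituting (2.3c) into (2.3b) and using $A^c \cap B^c = (A \cup B)^c$, we get
$$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap (A \cup B)^c),$$
which shows that $A \cup B \in \Sigma^*$. The above assertions a) and b) imply that $\Sigma^*$ is an algebra in $\Omega$.
c) Now we prove that $\Sigma^*$ is a σ-algebra in $\Omega$. Since $\Sigma^*$, as an algebra, is ∩-stable, it is sufficient to show that $\Sigma^*$ is a Dynkin system. (See Problem 1.10 of Chapter 4.) Let $\{A_n\} \subseteq \Sigma^*$ be a sequence of disjoint sets. Take $A_1, A_2 \in \{A_n\}$. Substituting $A_1 = A$ and $A_2 = B$ into (2.3c), taking $A$ and $B$ in (2.3c) disjoint, and then noticing that $A \cap B^c = A$ and $B \cap A^c = B$, we arrive at
$$\mu^*(Q \cap (A_1 + A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2). \tag{2.3d}$$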
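If $A_1, \dots, A_n$ is an $n$-tuple of mutually disjoint elements of $\Sigma^*$, then, by induction, from (2.3d),
$$\mu^*(Q \cap S_n) = \sum_{k=1}^{n} \mu^*(Q \cap A_k), \tag{2.3e}$$
where $S_n = \sum_{k=1}^{n} A_k$. Denote $S = \sum_{n=1}^{\infty} A_n$. Because of $S_n \subseteq S$, $(Q \cap S^c) \subseteq (Q \cap S_n^c)$, and by monotonicity of $\mu^*$,
$$\mu^*(Q \cap S^c) \le \mu^*(Q \cap S_n^c). \tag{2.3f}$$
Since $\Sigma^*$ is an algebra, it follows that $S_n \in \Sigma^*$ and hence it is $\mu^*$-measurable, i.e., it separates $Q$, which, combined with (2.3e) and (2.3f), yields
$$\mu^*(Q) = \mu^*(Q \cap S_n) + \mu^*(Q \cap S_n^c) \ge \sum_{k=1}^{n} \mu^*(Q \cap A_k) + \mu^*(Q \cap S^c).$$
Therefore,
$$\mu^*(Q) \ge \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \cap S^c), \tag{2.3g}$$
that, by σ-subadditivity, is
$$\ge \mu^*(Q \cap S) + \mu^*(Q \cap S^c). \tag{2.3h}$$
Inequalities (2.3) and (2.3g-2.3h) lead to
$$\mu^*(Q) = \mu^*(Q \cap S) + \mu^*(Q \cap S^c),$$
concluding that $S = \sum_{n=1}^{\infty} A_n$ indeed separates any $Q \in \mathcal{P}(\Omega)$ and thus is an element of $\Sigma^*$. The latter supports the claim that $\Sigma^*$ is a Dynkin system and, consequently, that $\Sigma^*$ is a σ-algebra.
d) We show that $\mu_0^*$ is a measure on $\Sigma^*$. Substituting the set $S = \sum_{n=1}^{\infty} A_n$ for $Q$ in (2.3g), we have
$$\mu^*(S) \ge \sum_{n=1}^{\infty} \mu^*(A_n),$$
which, due to σ-subadditivity of $\mu^*$, leads to equality and thereby to σ-additivity of $\mu_0^*$. Therefore, we have proved that $\mathrm{Res}_{\Sigma^*}\mu^*$, denoted by $\mu_0^*$, is a measure. The proof is, therefore, completed. □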
2.4 Examples.
(i) Let $\Omega = \{a,b,c\}$, $A = \{a\}$, $A^c = \{b,c\}$, $P = \{b\}$, $Q = \{c\}$, $R = \{a,b\}$, $S = \{a,c\}$. Define the following set function $\mu^*$ on $\mathcal{P}(\Omega)$: […]
One can easily verify that $\mu^*$ is an outer measure on $\mathcal{P}(\Omega)$, as it satisfies axioms (2.1a-2.1c), but $\mu^*$ is not a measure, because it is not additive. We can see that only the sets $\emptyset$, $\Omega$, $A$, and $A^c$ $\mu^*$-separate all subsets of $\Omega$ and, consequently, $\{\emptyset, \Omega, A, A^c\}$ is the σ-algebra $\Sigma^*$. Clearly, $\mu_0^*$, as the restriction of $\mu^*$ to $\Sigma^*$, is a measure.
(ii) Let $\Omega$ be an infinite set. Define the set function $\gamma$ on $\mathcal{P}(\Omega)$ by $\gamma(Q) = 0$ if $Q$ is a finite set and $\gamma(Q) = 1$ if $Q$ is infinite. Let $Q = \bigcup_{n=1}^{\infty} \{\omega_n\}$ be the union of a sequence of distinct singletons. Then,
$$\sum_{n=1}^{\infty} \gamma(\{\omega_n\}) = 0,$$
while $\gamma(Q) = 1$. Thus, $\gamma$ is not σ-subadditive and not an outer measure.
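The situation of Example 2.4 (i) can be reproduced by brute force. The sketch below is an assumption-laden illustration: the values assigned to `MU_STAR` are hypothetical (chosen here only so that the conclusions of the example hold) and should not be read as the table used in the text. The code checks axioms (2.1a-2.1c), which on a finite set reduce to monotonicity and finite subadditivity, and then collects all sets that satisfy the separation condition (2.2).

```python
from itertools import combinations

OMEGA = frozenset("abc")


def powerset(s):
    """All subsets of s as frozensets."""
    pts = sorted(s)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


SUBSETS = powerset(OMEGA)

# Hypothetical outer measure on the power set of {a, b, c} (illustrative values).
MU_STAR = {
    frozenset(): 0,
    frozenset("a"): 1,
    frozenset("b"): 1,
    frozenset("c"): 1,
    frozenset("ab"): 2,
    frozenset("ac"): 2,
    frozenset("bc"): 1,
    frozenset("abc"): 2,
}


def is_outer_measure(m):
    """Check mu*(empty) = 0, monotonicity, and (finite) subadditivity."""
    if m[frozenset()] != 0:
        return False
    for A in SUBSETS:
        for B in SUBSETS:
            if A <= B and m[A] > m[B]:
                return False
            if m[A | B] > m[A] + m[B]:
                return False
    return True


def separates(M, m):
    """True if M satisfies the Caratheodory condition (2.2) for every Q."""
    return all(m[Q] == m[Q & M] + m[Q - M] for Q in SUBSETS)


SIGMA_STAR = {M for M in SUBSETS if separates(M, MU_STAR)}

print(is_outer_measure(MU_STAR))
print(sorted("".join(sorted(M)) for M in SIGMA_STAR))
```

With these assumed values, $\mu^*$ is not additive (for instance, $\mu^*(\{b\}) + \mu^*(\{c\}) = 2$ while $\mu^*(\{b,c\}) = 1$), yet its restriction to the four separating sets is additive, in line with Theorem 2.3.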
Recall that a restriction of a function $[X,Y,f]$ is a function $[X_0,Y_0,f_0]$ defined on a contracted domain $X_0 \subseteq X$ with $f = f_0$ on $X_0$ and $Y_0 \subseteq Y$. (In notation, $f_0 = \mathrm{Res}_{X_0} f$.) From Theorem 2.3, we learned that the set function $[\Sigma^*, [0,\infty], \mu_0^*]$ is a restriction of an outer measure $[\mathcal{P}(\Omega), [0,\infty], \mu^*]$. If $\bar X$ and $\bar Y$ are supersets of $X$ and $Y$, respectively, a function $[\bar X, \bar Y, \bar f]$ is called an extension of $f$ (from $X$ to $\bar X$) if $[X,Y,f]$ is the restriction of $\bar f$ to $X$. (In notation, $\bar f = \mathrm{Ext}_{\bar X} f$.) We will apply this notion to extend a set function $\gamma$ defined on a collection $\mathcal{G}$ of subsets of $\Omega$ to a set function $\bar\gamma$ on an expanded family $\bar{\mathcal{G}}(\mathcal{G})$ of subsets of $\Omega$. For instance, in Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. We can extend the Lebesgue elementary content $\lambda_0$ to a (unique) content $\lambda$ on $\mathcal{R}(\mathcal{S})$ (see Problem 2.2), which turns out to be a premeasure on $\mathcal{R}(\mathcal{S})$ (verified in Theorem 3.1).
The primary goal in this section is to construct an extension of a set function, such as a premeasure given on a ring, to a measure on the smallest σ-algebra generated by this ring. Although this is the main objective, other extensions, such as the "completion" of a measure, will also be a focus of our discussions.
2.5 Definitions.
(i) Let $(\Omega, \Sigma, \mu)$ be a measure space. A set $N \in \Sigma$ is called a μ-null set (or just null set) if $\mu(N) = 0$. We denote the set of all μ-null sets by $\mathcal{N}_\mu$. A set $E$ is called μ-negligible (or just negligible) if there is a measurable null superset of $E$. The measure space is called complete if for each null set $N \in \mathcal{N}_\mu$, $\mathcal{P}(N) \subseteq \Sigma$, i.e., if all negligible sets are measurable.
(ii) Consider a measure space $(\Omega, \Sigma, \mu)$. Let $\bar\Sigma$ be the collection of all sets of type $A \cup M$, where $A \in \Sigma$ and $M$ is any negligible set. According to Problem 2.8, $\bar\Sigma$ is a σ-algebra. We extend $\mu$ to $\bar\mu$ on $\bar\Sigma$ by setting
$$\bar\mu(A \cup M) = \mu(A).$$
The extension $(\bar\Sigma, \bar\mu)$ of $(\Sigma, \mu)$, or just $\bar\mu$, is then said to be the completion of the measure $\mu$ and, due to Problem 2.7, $(\Omega, \bar\Sigma, \bar\mu)$ is a measure space, called the completion of the measure space $(\Omega, \Sigma, \mu)$.
2.6 Example. Let $\Omega = \mathbb{R}$, $\Sigma = \{A \in \mathcal{P}(\mathbb{R}) : \text{either } A \text{ or } A^c \text{ is at most countable}\}$, which is a σ-algebra on $\mathbb{R}$ (see Example 1.2 (vii), Chapter 4), and let $\varepsilon_1$ be the point mass. Both $A = \{\frac{1}{n} : n = 1,2,\dots\}$ and $A^c$ are elements of $\Sigma$ and $\varepsilon_1(A^c) = 0$. Obviously, $E = [2,\infty)$, as a subset of $A^c$, is negligible, but not measurable. Therefore, the measure space $(\mathbb{R}, \Sigma, \varepsilon_1)$ is not complete. (See a more general case in Problem 2.14.)
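The completion procedure of Definition 2.5 (ii) can be carried out mechanically on a finite measure space. The following sketch assumes a hypothetical three-point space whose σ-algebra has only four members (these sets and values are illustrative and not taken from the text): it collects the negligible sets, forms all unions $A \cup M$, and sets $\bar\mu(A \cup M) = \mu(A)$, asserting along the way that the value does not depend on the chosen representation.

```python
from itertools import combinations

OMEGA = frozenset("abc")

# Hypothetical measure space: sigma-algebra {0, {a}, {b,c}, Omega} with mu({b,c}) = 0.
SIGMA = {frozenset(): 0, frozenset("a"): 1, frozenset("bc"): 0, OMEGA: 1}


def powerset(s):
    """All subsets of s as frozensets."""
    pts = sorted(s)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def completion(sigma):
    """Return (negligible sets, completed measure) following Definition 2.5."""
    # Null sets are measurable sets of measure zero.
    null_sets = [N for N, value in sigma.items() if value == 0]
    # Negligible sets are all subsets of null sets.
    negligible = {M for N in null_sets for M in powerset(N)}
    completed = {}
    for A, value in sigma.items():
        for M in negligible:
            E = A | M
            # mu_bar(A u M) = mu(A); the assert checks that this is well defined.
            assert completed.setdefault(E, value) == value
    return negligible, completed


NEGLIGIBLE, MU_BAR = completion(SIGMA)

print(len(SIGMA), len(MU_BAR))
```

In this assumed example the original space is not complete, since $\{b\}$ is negligible but not measurable, while the completed σ-algebra is the whole power set of $\Omega$.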
The proposition below is a paradigm of a complete measure space.
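2.7 Proposition. The restriction $\mu_0^*$ of an outer measure $\mu^*$ to the σ-algebra $\Sigma^*$ of all $\mu^*$-measurable subsets of $\Omega$ is complete and $(\Omega, \Sigma^*, \mu_0^*)$ is a complete measure space.
Proof. Since $\mu^*$ is defined on the whole of $\mathcal{P}(\Omega)$, for any $\mu_0^*$-negligible subset $N \subseteq \Omega$, due to (2.1b), $\mu^*(N) = 0$ and, therefore, it is sufficient to show that $N$ is $\mu^*$-measurable. Let $Q \subseteq \Omega$. Due to monotonicity of the outer measure, $\mu^*(Q \cap N) = 0$ and $\mu^*(Q \cap N^c) \le \mu^*(Q)$ and this, along with (2.3), yields
$$\mu^*(Q) \le \mu^*(Q \cap N) + \mu^*(Q \cap N^c) \le \mu^*(Q)$$
and, hence, that $N \in \Sigma^*$. □
The following will be a construction of an outer measure by an arbitrary set function $\gamma$ defined on an arbitrary subcollection of sets $\mathcal{G} \subseteq \mathcal{P}(\Omega)$. As usual, we only assume that $\mathcal{G}$ contains the empty set and that $\gamma$, as a set function, is such that $\gamma(\emptyset) = 0$. This construction lies at the basis of the Carathéodory extension of the set function $\gamma$ to a measure on the σ-algebra $\Sigma(\mathcal{G})$. For any subset $Q \subseteq \Omega$, denote by $\mathcal{C}_Q(\mathcal{G})$ the collection of all at most countable covers of the set $Q$ by elements of $\mathcal{G}$. (Unless there is another subcollection, besides $\mathcal{G}$, under consideration, we will for brevity drop $\mathcal{G}$ in $\mathcal{C}_Q(\mathcal{G})$.) Therefore, if $\mathcal{C}_Q \ne \emptyset$, for any $\{G_n\} \in \mathcal{C}_Q$, we have $Q \subseteq \bigcup_{n=1}^{\infty} G_n$.
2.8 Proposition. The set function $\mu^*$ defined on $\mathcal{P}(\Omega)$ as
$$\mu^*(Q) = \begin{cases} \inf\left\{\sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathcal{C}_Q\right\}, & \mathcal{C}_Q \ne \emptyset, \\ \infty, & \mathcal{C}_Q = \emptyset, \end{cases} \tag{2.8}$$
is an outer measure.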
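Proof. We need to verify the above properties (2.1a-2.1c) of $\mu^*$ as an outer measure:
a) Since $\emptyset \in \mathcal{G}$ and $\gamma(\emptyset) = 0$, it follows that $\mu^*(\emptyset) = 0$.
b) We assume that both $\mu^*(A)$ and $\mu^*(B)$ are finite, since otherwise the proof is obvious. If $A \subseteq B$, then $\mathcal{C}_B \subseteq \mathcal{C}_A$ and we can reach on $\mathcal{C}_A$ a possibly smaller infimum than that on $\mathcal{C}_B$. Therefore,
$$\mu^*(A) \le \mu^*(B),$$
which proves monotonicity.
c) Let $\{Q_n\} \subseteq \mathcal{P}(\Omega)$ and $Q = \bigcup_{n=1}^{\infty} Q_n$. If for at least one $n$, $\mathcal{C}_{Q_n} = \emptyset$, then also $\mathcal{C}_Q = \emptyset$ and subadditivity immediately follows. We assume that for all $n$, $\mathcal{C}_{Q_n} \ne \emptyset$ and $\mu^*(Q_n) < \infty$ (otherwise, subadditivity is trivial), and choose an $\varepsilon > 0$. From
$$\mu^*(Q_n) = \inf\left\{\sum_{i=1}^{\infty} \gamma(G_i) : \{G_i\} \in \mathcal{C}_{Q_n}\right\}$$
and by the definition of an infimum, it follows that for $\varepsilon 2^{-n}$, there is a cover $\{G_{in}, i = 1,2,\dots\} \in \mathcal{C}_{Q_n}$ such that
$$\sum_{i=1}^{\infty} \gamma(G_{in}) \le \mu^*(Q_n) + \varepsilon 2^{-n}.$$
Now, clearly $\{G_{in}, i,n = 1,2,\dots\} \in \mathcal{C}_Q$. Thus,
$$\mu^*(Q) \le \sum_{n=1}^{\infty} \sum_{i=1}^{\infty} \gamma(G_{in}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon,$$
and, since $\varepsilon$ is arbitrary, σ-subadditivity follows. □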
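Formula (2.8) can be evaluated exhaustively when the formatter is finite. The sketch below uses a hypothetical formatter on $\Omega = \{1,2,3,4\}$ (the sets and the values of `GAMMA` are assumptions chosen for illustration). Because the family is finite and $\gamma \ge 0$, an at most countable cover can never be cheaper than the subfamily of distinct sets it uses, so it suffices to minimize over subfamilies.

```python
from itertools import combinations
from math import inf

# Hypothetical formatter (G, gamma): a few subsets of {1, 2, 3, 4} with their values.
GAMMA = {
    frozenset(): 0,
    frozenset({1, 2}): 1,
    frozenset({2, 3}): 1,
    frozenset({1, 2, 3}): 3,
}


def mu_star(Q):
    """Outer measure (2.8): infimum of total gamma over covers of Q; inf if none."""
    Q = frozenset(Q)
    members = list(GAMMA)
    best = inf
    for r in range(len(members) + 1):
        for family in combinations(members, r):
            covered = frozenset().union(*family)
            if Q <= covered:
                best = min(best, sum(GAMMA[G] for G in family))
    return best


print(mu_star({1, 2, 3}))  # covered more cheaply by {1,2} and {2,3} than by itself
print(mu_star({4}))        # no cover exists
```

In this assumed example $\mu^*(\{1,2,3\}) = 2$ although $\gamma(\{1,2,3\}) = 3$, and $\mu^*(\{4\}) = \infty$ because no cover exists. The first value shows that the outer measure generated by a formatter need not agree with $\gamma$ on the generating collection unless further conditions are imposed.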
We will call the couple $(\mathcal{G}, \gamma)$ (a subset $\mathcal{G}$ of $\mathcal{P}(\Omega)$ and a set function $\gamma$ on $\mathcal{G}$) a formatter of the outer measure $\mu^*$ defined by (2.8). As has been shown, the formatter and, subsequently, the outer measure, induce the σ-algebra $\Sigma^*$, on which $\mu^*$ is a complete measure. When constructing a measure space $(\Omega, \Sigma^*, \mu_0^*)$ by $(\mathcal{G}, \gamma)$, the major goal is to extend $\gamma$ from $\mathcal{G}$ to a measure, say $\mu$, acting on the smallest σ-algebra $\Sigma(\mathcal{G})$ generated by $\mathcal{G}$. This can be achieved by restricting $(\Sigma^*, \mu_0^*)$ to $(\Sigma(\mathcal{G}), \mu)$, given that $(\Sigma^*, \mu_0^*)$ itself is an extension of $(\mathcal{G}, \gamma)$. The latter, however, is not guaranteed by the above construction unless we impose some restrictions on the formatter $(\mathcal{G}, \gamma)$, for even though $(\mathcal{G}, \gamma)$ produces $(\Omega, \Sigma^*, \mu_0^*)$, the elements of $\mathcal{G}$ need not all be $\mu^*$-measurable. In other words, $\mathcal{G}$ need not be a subset of $\Sigma^*$. In addition, $\mu_0^*$ need not coincide with $\gamma$ on $\mathcal{G}$. For example, if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then, according to Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $\sum_{n=1}^{\infty} C_n$ is a decomposition of $G$ and
$$\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n).$$
Hence, in order that $\mu^*(G) = \gamma(G)$, $\gamma$ must be σ-additive on $\mathcal{G}$, which, in general, it is not. Consequently, we call $(\Sigma^*, \mu_0^*)$ (produced by $(\mathcal{G}, \gamma)$ in (2.8)) the complete Carathéodory extension of $(\mathcal{G}, \gamma)$ if $\mathcal{G} \subseteq \Sigma^*$ and $\mathrm{Res}_{\mathcal{G}}\, \mu_0^* = \gamma$. If $(\Sigma^*, \mu_0^*)$ is the complete Carathéodory extension of $(\mathcal{G}, \gamma)$, then the formatter $(\mathcal{G}, \gamma)$ is said to be extendible and the corresponding restriction of $(\Sigma^*, \mu_0^*)$ to $(\Sigma(\mathcal{G}), \mu)$ is referred to as the Carathéodory extension of $(\mathcal{G}, \gamma)$.
As mentioned above, one of the most important questions is what the formatter $(\mathcal{G}, \gamma)$ should be in order to be extendible and, consequently, to generate the Carathéodory extension. By now, we have a fairly large choice of systems of sets and set functions on them, ranging from semi-rings to σ-algebras and from elementary contents to measures. The idea is, however, to select a formatter $(\mathcal{G}, \gamma)$ that is as rudimentary as possible, that is tame and suited to most common practical applications and constructions, and such that $(\Sigma^*, \mu_0^*)$ is an extension of $(\mathcal{G}, \gamma)$. In particular, this means that the elements of $\mathcal{G}$ have to be $\mu^*$-measurable. The theorem below, which is a crucial step in the whole extension procedure, implies that $(\mathcal{G}, \gamma)$ can be a ring and a premeasure to serve as a reasonable extendible formatter.
2.9 Theorem.
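Let $(\mathcal{G}, \gamma)$ be a semi-ring and an elementary content, respectively, in $\Omega$, which produce the outer measure $\mu^*$ and the σ-algebra $\Sigma^*$ of $\mu^*$-measurable subsets of $\Omega$. Then:
(i) $\mathcal{G} \subseteq \Sigma^*$.
(ii) If, in addition, $\gamma$ is σ-additive on $\mathcal{G}$, then $\gamma = \mathrm{Res}_{\mathcal{G}}\, \mu^*$ and therefore $(\Sigma^*, \mu_0^*)$ is an extension of $(\mathcal{G}, \gamma)$.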
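Proof. (i) We have to show that $\mathcal{G} \subseteq \Sigma^*$, i.e., that any element $G \in \mathcal{G}$ $\mu^*$-separates all subsets of $\Omega$. Take any subset $Q \subseteq \Omega$ with $\mathcal{C}_Q \ne \emptyset$, since, otherwise, the proof would be trivial, and let $C = \{C_n\}$ be any (countable) cover of $Q$ from $\mathcal{C}_Q$. For a $G \in \mathcal{G}$ and $C_n \in C$,
$$C_n = (C_n \cap G) + (C_n \setminus G). \tag{2.9}$$
Since $\mathcal{G}$ is a semi-ring, $C_n \cap G$ is an element of $\mathcal{G}$ and $C_n \setminus G$ can be represented as a finite union of pairwise disjoint elements of $\mathcal{G}$, say $\sum_{j=1}^{N_n} S_{jn}$. Consequently, (2.9) can be rewritten as
$$C_n = (C_n \cap G) + \sum_{j=1}^{N_n} S_{jn}$$
and, by finite additivity of $\gamma$,
$$\gamma(C_n) = \gamma(C_n \cap G) + \sum_{j=1}^{N_n} \gamma(S_{jn}). \tag{2.9a}$$
Now, suppose $\sum_{n=1}^{\infty} \gamma(C_n) < \infty$. Then, summing up all equations in (2.9a) over $n$ gives
$$\sum_{n=1}^{\infty} \gamma(C_n) = \sum_{n=1}^{\infty} \gamma(C_n \cap G) + \sum_{n=1}^{\infty} \gamma(S_n), \tag{2.9b}$$
where $\{S_n\}$ is the reordered sequence $\{S_{jn}, j = 1,\dots,N_n, n = 1,2,\dots\}$. As $Q = (Q \cap G) + (Q \cap G^c)$, obviously, $\{C_n \cap G\} \subseteq \mathcal{G}$ and $\{S_n\} \subseteq \mathcal{G}$ are covers of $Q \cap G$ and $Q \cap G^c$, respectively. Consequently,
$$\sum_{n=1}^{\infty} \gamma(C_n \cap G) \ge \mu^*(Q \cap G) \quad \text{and} \quad \sum_{n=1}^{\infty} \gamma(S_n) \ge \mu^*(Q \cap G^c),$$
and then by (2.9b),
$$\sum_{n=1}^{\infty} \gamma(C_n) \ge \mu^*(Q \cap G) + \mu^*(Q \cap G^c).$$
Since this inequality holds for every cover $C$ of $Q$, it should also hold for the infimum to yield
$$\mu^*(Q) \ge \mu^*(Q \cap G) + \mu^*(Q \cap G^c). \tag{2.9c}$$
If $\sum_{n=1}^{\infty} \gamma(C_n) = \infty$, then the equation symbol in (2.9b) must be replaced by "$\ge$" to yield (2.9c) again. The inverse inequality is due to (2.3). Therefore, $G$ separates all subsets of $\Omega$ and, consequently, $\mathcal{G} \subseteq \Sigma^*$.
(ii) By Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$. Hence, if $\gamma$ is σ-additive, $\mu^*$ coincides with $\gamma$ on $\mathcal{G}$. These two facts warrant that $(\Sigma^*, \mu_0^*)$ is an extension of $(\mathcal{G}, \gamma)$. □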
2.10 Remarks.
(i) One should bear in mind that, while $(\mathcal{G}, \gamma)$ can be an extendible formatter for the outer measure $\mu^*$, $\mathcal{G}$ is not really a generator for $\Sigma^*$, as the latter need not be the smallest σ-algebra containing $\mathcal{G}$. We would like to make a clear distinction between these two terms. Recall that a family $\mathcal{G} \subseteq \mathcal{P}(\Omega)$ is said to be a generator of another family $\bar{\mathcal{G}}(\mathcal{G}) \subseteq \mathcal{P}(\Omega)$ with a property P if $\bar{\mathcal{G}}$ is the intersection of all supercollections of $\mathcal{G}$ on each of which property P holds. In our case, $\Sigma^*$ will eventually contain the smallest σ-algebra $\Sigma = \Sigma(\mathcal{G})$ and, in general, $\mu^*$ needs to be further restricted to this σ-algebra. From Theorem 2.9, we conclude that any elementary content $\gamma$ on a semi-ring $\mathcal{G}$ that is σ-additive can be extended to a measure $\mu = \mathrm{Res}_{\Sigma(\mathcal{G})}\, \mu^*$ (acting on the smallest σ-algebra $\Sigma$ generated by $\mathcal{G}$). In other words, if $\gamma$ is a σ-additive elementary content on a semi-ring $\mathcal{G}$, then there exists at least one extension, namely, Carathéodory's extension.
(ii) From the proof of Theorem 2.9, it is obvious that a semi-ring with a σ-additive elementary content on it is one of the most economical systems good for the Carathéodory extension. However, it is often more prudent to work with premeasures on rings. In practice, to start with, one can first extend a semi-ring with an elementary content to the smallest ring with the content using the procedure of Theorem 2.5 (Chapter 4) and Proposition 2.11 below.
(iii) Another reasonable question arises: in how many different ways can a formatter $(\mathcal{G}, \gamma)$ be extended to a measure on $\Sigma(\mathcal{G})$? Theorem 2.13 below states that, with some relatively minor restriction (given in Remark 2.12) on the set function $\gamma$, the uniqueness of Carathéodory's extension is guaranteed. □
We will begin with one useful extension of an elementary content on a semi-ring to a content on the smallest ring containing the semi-ring.
2.11 Proposition. There is exactly one content on $\mathcal{R}(\mathcal{S})$ which coincides with the elementary content on $\mathcal{S}$. (See Problem 2.3.)
2.12 Remark. In Definition 1.1 (vii) we introduced the notion of σ-finiteness of a set function. Sometimes it is more convenient to use another definition of σ-finiteness, which is equivalent to 1.1 (vii) for a large class of set functions. Namely, the condition of having a monotone increasing sequence $\{G_n\} \uparrow \Omega$ from $\mathcal{G}$ with $\gamma(G_n) < \infty$ for all $n$ can be replaced by the equivalent condition that there is an at most countable partition $\{\Omega_1, \Omega_2, \dots\} \subseteq \mathcal{G}$ of $\Omega$ ($\Omega = \sum_{n=1}^{\infty} \Omega_n$) such that $\gamma(\Omega_n) < \infty$ for all $n$. For instance, rings with contents clearly provide a basis for such equivalence. For a semi-ring with an elementary content, the first definition yields the second one, as we can arrange from $\{G_n\} \uparrow \Omega$ a countable decomposition; the converse is not true. Another related notion we are going to use in the sequel is σ-finiteness of a set. Let $(\Omega, \Sigma, \mu)$ be a measure space. A measurable set $A$ is said to be σ-finite if $\mathrm{Res}_{\Sigma \cap A}\, \mu$ is σ-finite. □
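2.13 Theorem. Let $\mathcal{G}$ be a ∩-stable generator of the σ-algebra $\Sigma(\mathcal{G})$ in $\Omega$ such that $\mathcal{G}$ contains a monotone increasing sequence $\{B_n\} \uparrow \Omega$. Let $\mu_1$ and $\mu_2$ be two measures on $\Sigma(\mathcal{G})$ which are σ-finite on $\{B_n\}$ and which coincide on $\mathcal{G}$. Then $\mu_1 = \mu_2$ on $\Sigma(\mathcal{G})$.
Proof. Let $A \in \mathcal{G}$ be such that $\mu_1(A) = \mu_2(A) < \infty$ and let $\mathcal{D}_A = \{B \in \Sigma(\mathcal{G}) : \mu_1(A \cap B) = \mu_2(A \cap B)\}$. We show that $\mathcal{D}_A$ is a Dynkin system:
a) Since $\mu_1(A \cap \Omega) = \mu_1(A) = \mu_2(A) = \mu_2(A \cap \Omega)$, we have $\Omega \in \mathcal{D}_A$.
b) Let $D \in \mathcal{D}_A$. Then $A \cap D^c = A \setminus D = A \setminus (A \cap D)$, which implies that
$$\mu_1(A \cap D^c) = \mu_1(A) - \mu_1(A \cap D) = \mu_2(A) - \mu_2(A \cap D) = \mu_2(A \cap D^c)$$
and this leads to $D^c \in \mathcal{D}_A$.
c) Let $\{D_n\}$ be a sequence of disjoint sets from $\mathcal{D}_A$. Then
$$\mu_1\left(A \cap \sum_{n=1}^{\infty} D_n\right) = \sum_{n=1}^{\infty} \mu_1(A \cap D_n) = \sum_{n=1}^{\infty} \mu_2(A \cap D_n) = \mu_2\left(A \cap \sum_{n=1}^{\infty} D_n\right).$$
Hence $\sum_{n=1}^{\infty} D_n \in \mathcal{D}_A$, and therefore $\mathcal{D}_A$ is a Dynkin system.
Since $\mathcal{G}$ is ∩-stable and $\mu_1$, $\mu_2$ coincide on $\mathcal{G}$, obviously $\mathcal{G} \subseteq \mathcal{D}_A$, and it follows that $\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A$. Also, since $\mathcal{G}$ is ∩-stable, it follows that $\mathcal{D}(\mathcal{G})$ is a σ-algebra. Hence, we have
$$\Sigma(\mathcal{G}) = \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A \subseteq \Sigma(\mathcal{G}),$$
leading to $\mathcal{D}_A = \Sigma(\mathcal{G})$. In particular, we proved that $\mu_1(A \cap B) = \mu_2(A \cap B)$ for all $B \in \Sigma(\mathcal{G})$. Now let $\{B_n\}$ be a monotone increasing sequence of sets from $\mathcal{G}$ convergent to $\Omega$. Thus $\Sigma(\mathcal{G}) = \mathcal{D}_{B_n}$ for all $n = 1,2,\dots$, and for all $B \in \Sigma(\mathcal{G})$,
$$\mu_1(B \cap B_n) = \mu_2(B \cap B_n).$$
Since $\{B_n \cap B\} \uparrow B$ and since $\mu_i(B \cap B_n) < \infty$, by Lemma 1.6,
$$\mu_1(B) = \lim_{n\to\infty} \mu_1(B \cap B_n) = \lim_{n\to\infty} \mu_2(B \cap B_n) = \mu_2(B). \qquad \square$$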
Now, by means of Theorem 2.13 we easily deduce the following significant statement.
2.14 Corollary. Let $\gamma$ be a σ-finite and σ-additive elementary content on a semi-ring $\mathcal{G}$. Then the Carathéodory extension of $\gamma$ to a measure on the σ-algebra $\Sigma(\mathcal{G})$ is a unique extension.
The lemmas below will be used for various purposes and, in particular, will lead to a relationship between the completion $(\Omega, \bar\Sigma, \bar\mu)$ of a measure space $(\Omega, \Sigma, \mu)$ and the σ-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.
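2.15 Lemma. Let $(\Omega, \mathcal{G}, \gamma)$ be an extendible formatter of the outer measure $\mu^*$ and $\mathcal{G}_\sigma$ the collection of all at most countable unions of elements from $\mathcal{G}$. Then, for each $Q \subseteq \Omega$ and each $\varepsilon > 0$, there is a set $G_\varepsilon \in \mathcal{G}_\sigma$ such that $G_\varepsilon \supseteq Q$ and
$$\mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon. \tag{2.15}$$
Proof. Because $\mu^*$ is generated by $(\mathcal{G}, \gamma)$,
$$\mu^*(Q) = \inf\left\{\sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathcal{C}_Q\right\}. \tag{2.15a}$$
If $\mu^*(Q) = \infty$, then inequality (2.15) holds trivially. Suppose $\mu^*(Q) < \infty$. Then, by the definition of an infimum and from (2.15a), for every $\varepsilon > 0$, there is $\{G_n\} \in \mathcal{C}_Q$ such that, for every $k$,
$$\mu^*(Q) + \varepsilon \ge \sum_{n=1}^{\infty} \gamma(G_n) \ge \sum_{n=1}^{k} \gamma(G_n). \tag{2.15b}$$
Now, we make use of the fact that $(\mathcal{G}, \gamma)$ is an extendible formatter. This implies that not only $\mathcal{G} \subseteq \Sigma^*$ and $\gamma = \mu^*$ on $\mathcal{G}$, but also $\mathcal{G}_\sigma \subseteq \Sigma^*$. By finite subadditivity, $\sum_{n=1}^{k} \gamma(G_n) = \sum_{n=1}^{k} \mu^*(G_n) \ge \mu^*\left(\bigcup_{n=1}^{k} G_n\right)$. Since $\bigcup_{n=1}^{k} G_n$ is monotone increasing in $k$ and $\mu^*\left(\bigcup_{n=1}^{k} G_n\right) < \infty$ for all $k$, by continuity from below (Lemma 1.6),
$$\lim_{k\to\infty} \mu^*\left(\bigcup_{n=1}^{k} G_n\right) = \mu^*\left(\bigcup_{n=1}^{\infty} G_n\right).$$
Passing to the limit in (2.15b), which holds true for all $k$, we prove (2.15) with $G_\varepsilon = \bigcup_{n=1}^{\infty} G_n$ being the desired set. □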
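Lemma 2.16. Let $\mu^*$ be an outer measure, $\Sigma^*$ the σ-algebra of all $\mu^*$-measurable sets, and $A$ any subset of $\Omega$. If there is a $\mu^*$-measurable set $B$ such that $B \supseteq A$ and $\mu^*(B \setminus A) = 0$, then $A \in \Sigma^*$.
Proof. Since $B \in \Sigma^*$, it should $\mu^*$-separate $Q$:
$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \cap B^c). \tag{2.16}$$
Now, because $A \subseteq B$, we can easily show that
$$Q \cap A^c = (Q \cap B^c) + (Q \cap (B \setminus A)). \tag{2.16a}$$
From $Q \cap (B \setminus A) \subseteq B \setminus A$, it follows that $\mu^*(Q \cap (B \setminus A)) = 0$. From (2.16a),
$$\mu^*(Q \cap B^c) \le \mu^*(Q \cap A^c) \le \mu^*(Q \cap B^c) + \mu^*(Q \cap (B \setminus A)) = \mu^*(Q \cap B^c).$$
Consequently, we can replace $\mu^*(Q \cap B^c)$ in (2.16) by $\mu^*(Q \cap A^c)$. Finally, noticing that $Q \cap B \supseteq Q \cap A$, we have that
$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \cap A^c) \ge \mu^*(Q \cap A) + \mu^*(Q \cap A^c),$$
and this is the desired inequality. □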
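Lemma 2.17. Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the σ-algebra of all $\mu^*$-measurable sets, $\mu_0^*$ be $\mathrm{Res}_{\Sigma^*}\mu^*$, and let $\Sigma(\mathcal{G})$ be the σ-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$ such that $\mu_0^*(A^*) < \infty$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu_0^*(B \setminus A^*) = 0$.
Proof. Since $\mu_0^*(A^*) < \infty$, $\mathcal{C}_{A^*} \ne \emptyset$. From Lemma 2.15, for every $\varepsilon > 0$, say $\varepsilon = \frac{1}{k}$, there is a $G_k^\sigma = \bigcup_{n=1}^{\infty} G_n^k \supseteq A^*$ such that $\mu_0^*(G_k^\sigma) \le \mu_0^*(A^*) + \varepsilon$. The latter yields that $\mu_0^*(G_k^\sigma \setminus A^*) \le \frac{1}{k}$. Obviously, $\bigcap_{k=1}^{m} G_k^\sigma$ is still a superset of $A^*$ and, since $\bigcap_{k=1}^{m} G_k^\sigma \subseteq G_m^\sigma$, it follows that
$$\mu_0^*(D_m) \le \frac{1}{m}, \tag{2.17}$$
where $D_m = \left(\bigcap_{k=1}^{m} G_k^\sigma\right) \setminus A^* \in \Sigma^*$. The sequence $\{D_m\}$ is clearly monotone nonincreasing and $\mu_0^*(D_1) < \infty$. Therefore, by continuity from above (see Theorem 1.7 (i)) of $\mu_0^*$ and because of (2.17),
$$\mu_0^*\left(\left(\bigcap_{k=1}^{\infty} G_k^\sigma\right) \setminus A^*\right) = \lim_{m\to\infty} \mu_0^*(D_m) = 0.$$
The set $\bigcap_{k=1}^{\infty} G_k^\sigma$ obviously meets the requirements on the set $B$ "promised" in the statement and we are done with the proof. □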
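Corollary 2.18. Let $\mu^*$ be the outer measure generated by a σ-finite extendible formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the σ-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the σ-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu^*(B \setminus A^*) = 0$.
Proof. Since $(\mathcal{G}, \gamma)$ is σ-finite, there is a partition $\{H_1, H_2, \dots\} \subseteq \mathcal{G}$ of $\Omega$ such that $\gamma(H_k) < \infty$. If $A^* \in \Sigma^*$, then $\{A_k^* = A^* \cap H_k, k = 1,2,\dots\}$ is a $\mu^*$-measurable partition of $A^*$, with $\mu^*(A_k^*) < \infty$ for every $k$, and to each of its members we can apply Lemma 2.17 and have a set $B_k \in \Sigma(\mathcal{G})$, with $B_k \supseteq A_k^*$ and $\mu^*(B_k \setminus A_k^*) = 0$. Notice that since
$$\left(\bigcup_{k=1}^{\infty} B_k\right) \setminus A^* \subseteq \bigcup_{k=1}^{\infty} (B_k \setminus A_k^*),$$
it holds true that
$$\mu^*\left(\left(\bigcup_{k=1}^{\infty} B_k\right) \setminus A^*\right) \le \sum_{k=1}^{\infty} \mu^*(B_k \setminus A_k^*) = 0.$$
The statement follows after setting $B = \bigcup_{k=1}^{\infty} B_k$ ($\in \Sigma(\mathcal{G})$). □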
Now, with the aid of the above propositions, we can finally answer the question about the relationship between the completion $(\Omega, \bar\Sigma, \bar\mu)$ of a measure space $(\Omega, \Sigma, \mu)$ and the σ-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.
2.19 Theorem. Let $(\mathcal{G}, \gamma)$ be an extendible formatter for $(\Omega, \Sigma^*, \mu_0^*)$ and a generator for the measure space $(\Omega, \Sigma = \Sigma(\mathcal{G}), \mu = \mathrm{Res}_{\Sigma}\mu^*)$ whose completion is $(\Omega, \bar\Sigma, \bar\mu)$. Then:
(i) $\bar\Sigma \subseteq \Sigma^*$.
(ii) If $(\mathcal{G}, \gamma)$ is σ-finite, then $\bar\Sigma = \Sigma^*$ and $\bar\mu = \mu_0^*$.
Proof. (i) Obviously, $\bar\Sigma \subseteq \Sigma^*$ if and only if any element $\bar A$ of $\bar\Sigma$, which is of the form $A \cup N$, where $A \in \Sigma$ and $N$ is μ-negligible, is $\mu^*$-measurable. According to Lemma 2.16, $A \cup N$ would be $\mu^*$-measurable if there is a $\mu^*$-measurable set $B$ such that $B \supseteq A \cup N$ and $\mu^*(B \setminus (A \cup N)) = 0$. By Definition 2.5 (i) of a μ-negligible set, $N$ must have a Σ-measurable μ-null superset, say $N_0$. (Note that even though, by Problem 2.10, $\mu^*(N) = 0$ and $\mu^*(A \cup N) = \mu^*(A)$, this does not warrant that $A \cup N \in \Sigma^*$.) Since $A \cup N_0$ is a superset of $A \cup N$ and, by Problem 2.11, $(A \cup N_0) \setminus (A \cup N)$ is a $\mu^*$-null set, $B = A \cup N_0$ meets all prerequisites of Lemma 2.16, which makes $A \cup N$ indeed $\mu^*$-measurable. This proves part (i) of the theorem.
(ii) Because of part (i), we need to show that $\Sigma^* \subseteq \bar\Sigma$, i.e., that each $A^*$ can be represented as the union of a μ-measurable set and a μ-negligible set. By Problem 2.12, for any $A^* \in \Sigma^*$, there is a Σ-measurable subset $B$ of $A^*$ such that $\mu^*(A^* \setminus B) = 0$. Obviously, $A^*$ can be decomposed as $B$ and the $\mu^*$-measurable null set $A^* \cap B^c$. It only remains to show that $A^* \cap B^c$ is μ-negligible.
By Corollary 2.18, for $A^*$, there is a set $C \in \Sigma$ such that $C \supseteq A^*$ and $\mu^*(C \setminus A^*) = 0$. The set-difference $C \setminus B = (C \setminus A^*) + (A^* \setminus B)$, as the union of two $\mu^*$-null sets, is a $\mu^*$-null set and, therefore, a μ-null set (as $C \setminus B \in \Sigma$). This proves that $A^* \cap B^c$ is μ-negligible.
Now, we show that $\bar\mu = \mu_0^*$. (Recall that they are equal on $\Sigma$.) Since $\bar\Sigma = \Sigma^*$, $A^* = A \cup N$, where $A \in \Sigma$ and $N$ is μ-negligible, and
$$\bar\mu(A^*) = \mu(A) = \mu^*(A). \tag{2.19}$$
On the other hand, there is a μ-null superset of $N$ to yield $\mu^*(N) = 0$ due to monotonicity of $\mu^*$. Finally, from the inequalities
$$\mu^*(A^*) \le \mu^*(A) + \mu^*(N) = \mu^*(A)$$
and
$$\mu^*(A^* = A \cup N) \ge \mu^*(A),$$
it follows that $\mu^*(A^*) = \mu^*(A)$ and this, along with (2.19), yields that $\bar\mu(A^*) = \mu^*(A^*)$ for each $A^* \in \Sigma^* = \bar\Sigma$. □
Example 2.20. If $(\Omega, \Sigma, \mu)$ is a probability space, it follows from Theorem 2.19 that the completion of $(\Sigma, \mu)$ coincides with $(\Sigma^*, \mu_0^*)$ produced by $(\Sigma, \mu)$ or by a "smaller generator" $(\mathcal{G}, \gamma)$ of $(\Sigma, \mu)$. □
A noteworthy question arises: if we have a semi-ring and o-additive elementary content, would it make any difference, if we first extend them to the smallest ring and premeasure, according to Proposition 2.11, and then use the Carathiodory extension to arrive a t the smallest c-algebra and a measure on it, or apply the Carathdodory extension directly to that semi-ring and o-additive elementary content. The same question applies, say, to a ring with a premeasure and the generated cr-algebra with a measure. The difference, if any, can apparently take place a t the expense of two outer measures, induced by a formatter and its extension. 2.21 Theorem. Let (Sl,Cj,70) be an extendible formatter of outer measure p* and o-algebra C* of p*-measurable sets and let (8 = 8(Cj),7 ) be an extension of ( C ~ , Y and ~ ) an extendible formatter of outer measure v* and %*, such that 8 C C* and 7 = Resgp*. Then, v* = p* on T(R) and C* = %*.
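Proof. Let $Q \subseteq \Omega$. Since $\mathcal{C}_Q(\mathcal{G}) \subseteq \mathcal{C}_Q(\mathcal{E})$ (the countable covers of $Q$ by sets from $\mathcal{G}$ and from $\mathcal{E}$, respectively), obviously

$$\nu^*(Q) \le \mu^*(Q), \tag{2.21}$$

which yields the equation $\nu^* = \mu^*$ on the subcollection of sets $Q \in \mathcal{P}(\Omega)$ with $\nu^*(Q) = \infty$. Suppose $\nu^*(Q) < \infty$. Then, for every $\varepsilon > 0$, there is a cover $\{E_n\} \in \mathcal{C}_Q(\mathcal{E})$ with

$$\sum_{n=1}^{\infty} \gamma(E_n) \le \nu^*(Q) + \frac{\varepsilon}{2}. \tag{2.21a}$$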
2. Extension of Set Functions to a Measure
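Since $\gamma = \mu^*$ on $\mathcal{E}$ and $\gamma(E_n) = \mu^*(E_n) < \infty$, for each $n$ there is a cover $\{G_{nk},\ k = 1,2,\dots\} \in \mathcal{C}_{E_n}(\mathcal{G})$ such that

$$\sum_{k=1}^{\infty} \gamma_0(G_{nk}) \le \mu^*(E_n) + \varepsilon\, 2^{-n-1}. \tag{2.21b}$$

Because $\{G_{nk},\ n,k = 1,2,\dots\} \in \mathcal{C}_Q(\mathcal{G})$, from (2.21a) and (2.21b),

$$\mu^*(Q) \le \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} \gamma_0(G_{nk}) \le \sum_{n=1}^{\infty} \gamma(E_n) + \frac{\varepsilon}{2} \le \nu^*(Q) + \varepsilon. \tag{2.21c}$$

Finally, letting $\varepsilon \downarrow 0$ in (2.21c) leads to the inverse of inequality (2.21) and proves that $\mu^* = \nu^*$ on $\mathcal{P}(\Omega)$. Since the outer measure alone determines the $\sigma$-algebra of separating sets, $\mu^* = \nu^*$ yields that $\Sigma^* = \mathcal{E}^*$, which completes the proof. $\square$

An important consequence of Theorem 2.21 is the following.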
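2.22 Corollary. Let $(\mathcal{S}, \gamma_0)$ be a semi-ring and a $\sigma$-additive and $\sigma$-finite elementary content in $\Omega$, and let $(\mathcal{E} = \mathcal{E}(\mathcal{S}), \gamma)$ be an extension of $(\mathcal{S}, \gamma_0)$ such that $\mathcal{E} \subseteq \Sigma(\mathcal{S})$. Suppose $(\Sigma_s = \Sigma(\mathcal{S}), \mu_s)$ and $(\Sigma_e = \Sigma(\mathcal{E}), \mu_e)$ are the Carathéodory extensions with their respective outer measures $\mu_s^*$ and $\mu_e^*$ and $\sigma$-algebras $\Sigma_s^*$ and $\Sigma_e^*$ of measurable sets. Then the following hold true:

1) $\Sigma = \Sigma_s = \Sigma_e$.
2) $\mu_s = \mu_e$ on $\Sigma$.
3) $\mu_s^* = \mu_e^* = \mu^*$.
4) $\Sigma^* = \Sigma_s^* = \Sigma_e^*$.

Proof. 1) From $\mathcal{E} \supseteq \mathcal{S}$, we have $\Sigma_e \supseteq \Sigma_s$. From $\mathcal{S} \subseteq \mathcal{E} \subseteq \Sigma_s$, it follows that $\Sigma_e \subseteq \Sigma_s$.

2) Now the measures $\mu_s$ and $\mu_e$ act on the same $\sigma$-algebra $\Sigma$ and coincide on the semi-ring $\mathcal{S}$. Since $\gamma_0$ is $\sigma$-finite on $\mathcal{S}$, by Corollary 2.14, $\mu_s = \mu_e$ on $\Sigma$.

3) With $\mathcal{E} \subseteq \Sigma \subseteq \Sigma_s^*$ and, consequently, $\gamma = \mathrm{Res}_{\mathcal{E}}\,\mu_s = \mathrm{Res}_{\mathcal{E}}\,\mu_s^*$, we meet all conditions of Theorem 2.21 to have $\mu_s^* = \mu_e^* = \mu^*$.

4) $\Sigma^* = \Sigma_s^* = \Sigma_e^*$ also by Theorem 2.21. $\square$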
For instance, $\mathcal{E}$ can be the ring generated by $\mathcal{S}$, with $\gamma$ the extension of the elementary content $\gamma_0$ in accordance with Proposition 2.11; or $\mathcal{E}$ can be an algebra with $\gamma$ as a premeasure; or $\mathcal{E}$ can even be the $\sigma$-algebra $\Sigma(\mathcal{S})$. In particular, it follows that, once the Carathéodory extension from $(\mathcal{S}, \gamma_0)$ to $(\Sigma(\mathcal{S}), \mu)$ is rendered, another Carathéodory extension of $(\mathcal{E}, \gamma)$ would be redundant. Another consequence of Theorem 2.21 is the uniqueness of outer measures generated by measures.
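Corollary 2.23. Let $\mu$ be a measure on a $\sigma$-algebra $\Sigma$, which produces the outer measure $\mu^*$ with the $\sigma$-algebra $\Sigma^*$ of measurable sets. If there is another outer measure $\tilde\mu^*$ (produced by a measure $\tilde\mu$ as identified in the proof below), then $\tilde\mu^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\Sigma^* = \tilde\Sigma^*$.

Proof. This is a direct application of Theorem 2.21 with the following identification of the above characteristics:
1) Let $\tilde\mu$ be a measure on $\tilde\Sigma$ such that $\mathrm{Res}_{\Sigma}\,\tilde\mu = \mu$. Then $\tilde\mu$ can serve as an extension of $(\Sigma, \mu)$.
2) $\tilde\Sigma \subseteq \Sigma^*$.
3) $\tilde\mu = \mathrm{Res}_{\tilde\Sigma}\,\mu^*$. $\square$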
Remark 2.24. Corollary 2.23 is useful in various applications of Carathéodory's extension. Suppose $\gamma_1$ and $\gamma_2$ are two elementary contents coinciding on a $\sigma$-finite semi-ring $\mathcal{S}$ (i.e. they are $\sigma$-finite on $\mathcal{S}$). By Corollary 2.14, their respective Carathéodory extensions $\mu_1$ and $\mu_2$ must coincide on $\Sigma(\mathcal{S})$. Let $\mu_1^*$ and $\mu_2^*$ be the corresponding outer measures which, according to Corollary 2.22, are the same whether produced by $\gamma_1$ and $\gamma_2$ or by $\mu_1$ and $\mu_2$. By Corollary 2.23, $\mu_1^* = \mu_2^*$ on $\mathcal{P}(\Omega)$ and $\Sigma_1^* = \Sigma_2^*$.
As in Theorem 2.21, by comparing two measures generated by a set function acting on a collection of sets and by its extension, we ended up comparing the two corresponding outer measures. It seems reasonable to raise another question: what if an outer measure produces another outer measure? Would this make any difference? More specifically, can the restriction $\mu_0^*$ of an outer measure $\mu^*$ to $\Sigma^*$ become a formatter of another outer measure, different from $\mu^*$? Note that this is a different scenario from the one considered in Theorem 2.21, since here $\mu^*$ is not supposed to be generated by a formatter and it "acts on its own." The following example shows this distinction.

2.25 Example. Consider $\mathcal{P}(\Omega)$, $\mu^*$, $\Sigma^*$, and $\mu_0^*$ in Example 2.4 (i):

$$\Omega = \{a,b,c\},\quad A = \{a\},\quad A^c = \{b,c\},\quad P = \{b\},$$

$\Sigma^* = \{\emptyset, \Omega, A, A^c\}$, and $\mu_0^* = \mathrm{Res}_{\Sigma^*}\,\mu^*$. Then, generate the outer measure $\nu^*$ by $(\Sigma^*, \mu_0^*)$. So, we have: $\mu^* = \nu^*$ on $\Sigma^*$ and

$$\nu^*(P) = \mu^*(A^c) = 3 \ \ (> \mu^*(P) = 2), \qquad \nu^*(S) = \mu^*(\Omega) = 4 \ \ (> \mu^*(S) = 3). \qquad \square$$

As we see, $\nu^*$ can be strictly greater than $\mu^*$ on sets of $\mathcal{P}(\Omega)$ outside $\Sigma^*$.
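The values of $\nu^*$ in Example 2.25 can be checked by brute force, since on a finite collection every countable cover reduces to a finite subfamily. The following is a minimal Python sketch under stated assumptions: the weight $\mu_0^*(A) = 1$ is inferred from the additivity of $\mu_0^*$ on $\Sigma^*$ ($4 - 3$), the set $S$ is taken to be $\{a, b\}$ because Example 2.4 (i) is not reproduced here, and the function name `induced_outer_measure` is hypothetical.

```python
from itertools import combinations

OMEGA = frozenset("abc")
A = frozenset("a")
A_C = OMEGA - A

# Restriction mu_0^* of the outer measure to Sigma^* = {empty, Omega, A, A^c}.
# Assumption: mu_0^*(A) = 1, inferred from additivity (4 - 3).
MU_0 = {frozenset(): 0, A: 1, A_C: 3, OMEGA: 4}


def induced_outer_measure(q):
    """Return nu*(q): the cheapest total weight of a cover of q by sets of Sigma^*."""
    q = frozenset(q)
    sets = list(MU_0)
    best = None
    # On a finite collection it suffices to try subfamilies without repetition,
    # because repeating a set can only increase the cost of a cover.
    for r in range(1, len(sets) + 1):
        for cover in combinations(sets, r):
            if q <= frozenset().union(*cover):
                cost = sum(MU_0[s] for s in cover)
                best = cost if best is None else min(best, cost)
    return best


if __name__ == "__main__":
    for name in ("b", "ab", "a"):
        print(sorted(name), induced_outer_measure(name))
```

The set $P = \{b\}$ can only be covered through $A^c$ or $\Omega$, which is why $\nu^*(P)$ jumps to $\mu^*(A^c)$ even though $\mu^*(P)$ itself is smaller.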
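As we learn from Problem 2.1, if $\nu^*$ is an outer measure induced by $\mu_0^*$, then always $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$. The equality $\mu^* = \nu^*$ requires some restrictions, such as those in the following proposition.

2.26 Proposition. Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \mathrm{Res}_{\Sigma^*}\,\mu^*$, and $\nu^*$ be the outer measure produced by $(\Sigma^*, \mu_0^*)$. The equation $\mu^* = \nu^*$ holds true on $\mathcal{P}(\Omega)$ if and only if for every $Q \in \mathcal{P}(\Omega)$, there exists a set $A^* \in \Sigma^*$ such that $A^* \supseteq Q$ and $\mu^*(A^*) = \mu^*(Q)$. $\square$

(See Problem 2.17.)

2.27 Remark. If $\mu^*$ is generated by an extendible formatter $(\mathcal{G}, \gamma)$, then clearly $\mu^* = \nu^*$, due to Theorem 2.21, as $(\Sigma^*, \mu_0^*)$ can serve as an extension of $(\mathcal{G}, \gamma)$. Alternatively, if $Q \in \mathcal{P}(\Omega)$, according to Lemma 2.15, for each positive $\varepsilon$, there is a set $G_\varepsilon \in \mathcal{G}_\sigma$ (the collection of all countable unions of elements from $\mathcal{G}$) such that $G_\varepsilon \supseteq Q$ and $\mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon$. We assume that $\nu^*$ is the outer measure generated by $\mu_0^*$. Since $\mu^* = \nu^*$ on $\Sigma^*$ and $G_\varepsilon \in \Sigma^*$, we have $\mu^*(G_\varepsilon) = \nu^*(G_\varepsilon)$ and, by monotonicity, $\nu^*(G_\varepsilon) \ge \nu^*(Q)$. Thus, we have

$$\nu^*(Q) \le \nu^*(G_\varepsilon) = \mu^*(G_\varepsilon) \le \mu^*(Q) + \varepsilon,$$

which yields $\nu^*(Q) \le \mu^*(Q)$. The inverse inequality, $\mu^* \le \nu^*$, is the statement of Problem 2.1 recalled above.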
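2.28 Theorem. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\Sigma = \Sigma(\mathcal{S})$ with $\mathcal{S}$ being a semi-ring, and let $\mu$ be $\sigma$-finite on $\mathcal{S}$. Then, given $A \in \Sigma(\mathcal{S})$ and $\varepsilon > 0$, there is a disjoint countable cover $\{S_n\} \subseteq \mathcal{S}$ of $A$ that "$\varepsilon$-approximates" $A$, i.e. such that $A \subseteq \sum_{n=1}^{\infty} S_n$ and

$$\mu\Big(\Big(\sum_{n=1}^{\infty} S_n\Big) \setminus A\Big) < \varepsilon.$$

Proof. Let $\gamma = \mathrm{Res}_{\mathcal{S}}\,\mu$ and let $\mu^*$ be the outer measure produced by $(\mathcal{S}, \gamma)$. Then $\mu$ is the unique Carathéodory extension of $\gamma$ from $\mathcal{S}$ to $\Sigma(\mathcal{S})$, according to Corollary 2.14, and $\mu = \mathrm{Res}_{\Sigma}\,\mu^*$.

Case 1. Let $\mu(A) = \mu^*(A) < \infty$. Then, by (2.8) (of Proposition 2.8), for each $\varepsilon > 0$, there is a sequence $\{G_n\} \subseteq \mathcal{S}$ covering $A$ such that

$$\sum_{n=1}^{\infty} \mu(G_n) < \mu(A) + \varepsilon.$$

Since $\mu\big(\bigcup_{n=1}^{\infty} G_n\big) \le \sum_{n=1}^{\infty} \mu(G_n)$, we have that

$$\mu\Big(\Big\{\bigcup_{n=1}^{\infty} G_n\Big\} \setminus A\Big) < \varepsilon.$$

Case 2. $\mu(A)$ is arbitrary. By $\sigma$-finiteness of $\mu$ on $\mathcal{S}$, there is an at most countable decomposition $\sum_{k=1}^{\infty} \Omega_k$ of $\Omega$ by $\{\Omega_k\} \subseteq \mathcal{S}$ such that $\mu(\Omega_k) < \infty$ and hence $\mu(A \cap \Omega_k) < \infty$. Now, we apply the above arguments to $A \cap \Omega_k$ and $\varepsilon\, 2^{-k}$. Hence, there is a sequence $\{G_{nk}\} \subseteq \mathcal{S} \cap \Omega_k$ such that

$$\mu\big(G_k \setminus (A \cap \Omega_k)\big) < \frac{\varepsilon}{2^{k}}, \qquad\text{where } G_k := \bigcup_{n=1}^{\infty} G_{nk}.$$

This leads to

$$\sum_{k=1}^{\infty} \mu\big(G_k \setminus (A \cap \Omega_k)\big) < \varepsilon$$

and thus, because $\big(\bigcup_k G_k\big) \setminus A \subseteq \bigcup_k \big(G_k \setminus (A \cap \Omega_k)\big)$, to

$$\mu\Big(\Big\{\bigcup_{k=1}^{\infty} G_k\Big\} \setminus A\Big) < \varepsilon.$$

Finally, it remains to form a disjoint sequence of semi-ring sets as stated in the theorem. The latter can be rendered in the same way as in Lemma 2.5 of Chapter 4. $\square$

2.29 Corollary. If, in the condition of Theorem 2.28, $\mu(A) < \infty$, then $A$ can be approximated by just a finite tuple of disjoint semi-ring sets.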
Proof. Because $\mu(A) < \infty$, by Case 1 of Theorem 2.28, so is $\mu\big(\sum_{n=1}^{\infty} G_n\big) < \infty$. Then, by continuity from above, for each $\varepsilon > 0$, there is an $N$ such that

$$\mu\Big(\sum_{k=n+1}^{\infty} G_k\Big) < \varepsilon \quad\text{for all } n > N,$$

thereby leading to the approximation of $A$ by the finite tuple $G_1, \dots, G_n$.
PROBLEMS

2.1 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \mathrm{Res}_{\Sigma^*}\,\mu^*$, and $\nu^*$ be the outer measure induced by $\mu_0^*$. Show that $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$.

2.2 Let $(\mathcal{G}, \gamma)$ be a formatter of the outer measure $\mu^*$ defined by (2.8). Show that if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then for each $G \in \mathcal{G}_\sigma$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$.

2.3 Prove Proposition 2.11.

2.4 Let $\mu$ be a finite measure on $(\Omega, \Sigma)$ and let $\mathcal{G}$ be any subcollection of $\Sigma$. Show that, for any fixed subset $Q \subseteq \Omega$, it is true that

2.5 Show that the original definition of $\sigma$-finiteness 1.1 (vii) implies the second definition of $\sigma$-finiteness for semi-rings and elementary contents mentioned in Remark 2.12.

2.6 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $\{A_n\}$ a sequence of disjoint $\mu^*$-measurable sets. Show that for any $Q \subseteq \Omega$,

2.7 Let $N \in \mathcal{N}_\mu$ (i.e. a $\mu$-null set) and let $B \in \Sigma$. Show that $\mu(N \cup B) = \mu(B \setminus N) = \mu(B)$.

2.8 Show that $\bar\Sigma$ defined in Definition 2.5 (ii) is a $\sigma$-algebra, $\bar\mu$ is a measure, that this extension does not depend upon representations of sets of $\bar\Sigma$, and that $(\Omega, \bar\Sigma, \bar\mu)$ is complete.

2.9 Show that the measure space defined in Example 1.2 (iv) is complete.

2.10 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $N \subseteq \Omega$ be such that $\mu^*(N) = 0$. Show that for any subset $Q \subseteq \Omega$, $\mu^*(Q \cup N) = \mu^*(Q)$.

2.11 Show that $(A \cup N_0) \setminus (A \cup N)$ in part (i) of Theorem 2.19 is a $\mu^*$-null set.

2.12 Let $\mu^*$ be the outer measure generated by an extendible $\sigma$-finite formatter $(\mathcal{G}, \gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \subseteq A^*$ and $\mu^*(A^* \setminus B) = 0$.

2.13 Let $(\Omega, \Sigma_0, \mu_0)$ be a completion of a measure space $(\Omega, \Sigma, \mu)$. Define for each $A \subseteq \Omega$

$$\underline{\mu}(A) = \sup\{\mu(B) : B \in \Sigma,\ B \subseteq A\} \quad\text{and}\quad \overline{\mu}(A) = \inf\{\mu(B) : B \in \Sigma,\ A \subseteq B\}.$$

Show that
a) if $A \in \Sigma_0$, then $\underline{\mu}(A) = \overline{\mu}(A) = \mu_0(A)$;
b) if $\underline{\mu}(A) = \overline{\mu}(A) < \infty$, then $A \in \Sigma_0$.

2.14 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$. Show that for $\{a\} \in \Sigma$ the measure space $(\Omega, \Sigma, \varepsilon_a)$ is complete if and only if $\Sigma = \mathcal{P}(\Omega)$.

2.15 (Generalization of Problem 1.12.) Let $(\Omega, \Sigma, \mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, $\Sigma_\infty = \{Q \subseteq \Omega : Q \cap G \in \Sigma,\ \forall G \in \mathcal{G}_\infty\}$, and $\mu$ be $\sigma$-finite. Show that $\Sigma_\infty = \Sigma$.

2.16 In the condition of Problem 1.13, show that if $\mu$ is complete, then so is $\mu_\infty$.

2.17 Prove Proposition 2.26.

2.18 Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G}, \gamma)$ on a non-empty set $\Omega$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that a subset $N \subseteq \Omega$ is negligible if and only if $\mu^*(N) = 0$.
NEW TERMS:
outer measure
monotonicity of outer measure
subadditivity of outer measure
$\mu^*$-measurable set
$\mu^*$-separability
Carathéodory's Extension Theorem
$\mu_0^*$-measure
$\Sigma^*$-$\sigma$-algebra
restriction of a function
extension of a function
$\mu$-null (null) set
null ($\mu$-null) set
$\mathcal{N}_\mu$ set
$\mu$-negligible (negligible) set
negligible ($\mu$-negligible) set
extension of a measure
completion of a measure
completion of a measure space
restriction of outer measure to $\Sigma^*$-algebra
Carathéodory's extension
formatter of an outer measure
complete Carathéodory's extension
extendible formatter
extendibility of a formatter, criterion of
$\sigma$-finiteness of a set function
Carathéodory's extension, uniqueness of
3. LEBESGUE AND LEBESGUE-STIELTJES MEASURES

In this section, we will use the results of the previous section for the construction of Lebesgue and Lebesgue-Stieltjes measures. We have learned that, to warrant the Carathéodory extension, a given formatter should be at least a semi-ring with a $\sigma$-additive elementary content, which applies to some special cases of formatters in Euclidean spaces. In Theorem 3.1 below, we will show that the Lebesgue content is $\sigma$-additive on the ring $\mathcal{R}(\mathbb{R}^n)$, which will clearly yield that the Lebesgue elementary content is also $\sigma$-additive on the semi-ring of half-open intervals. Although it is possible to prove this statement directly (cf. Problem 3.25, with no prior extension and no $\emptyset$-continuity arguments as in Theorem 3.1), we prefer first to extend the elementary content to the ring, as we want to exploit the equivalence of $\emptyset$-continuity and $\sigma$-additivity. The latter, as we know, can be observed only on set families no smaller than rings.
Theorem 3.1. The Lebesgue content $\lambda_c$ on the ring $\mathcal{R}(\mathbb{R}^n)$ is $\sigma$-additive, i.e. a premeasure.

Proof. Since the Lebesgue content $\lambda_c$ is finite on $\mathcal{R}$, by Proposition 1.7 (ii), $\lambda_c$ is a premeasure if it is $\emptyset$-continuous. We shall be using an equivalent version of $\emptyset$-continuity: for every monotone decreasing sequence $\{A_n\} \subseteq \mathcal{R}$ with $\lambda_c(A_1) < \infty$, the assumption that $\lim_{n\to\infty} \lambda_c(A_n)$ (which clearly exists) is strictly positive must yield that $\bigcap_{n=1}^{\infty} A_n \neq \emptyset$.

Let $\{A_n\}$ be any such monotone decreasing sequence with

$$\varepsilon = \lim_{n\to\infty} \lambda_c(A_n) > 0. \tag{3.1}$$

It is readily seen that (3.1) implies that for each $n$, $A_n \neq \emptyset$, and therefore, by Cantor's Theorem 5.4, Chapter 2, $\bigcap_{n=1}^{\infty} \bar A_n \neq \emptyset$. However, the nonempty intersection of the closures of the $A_n$'s need not yield that the intersection $\bigcap_{n=1}^{\infty} A_n$ is nonempty too. To overcome this difficulty we will construct a sequence of compact subsets of the $A_n$'s with the desired property.

Now, since the $A_n$'s belong to $\mathcal{R}$, each $A_n$ can be represented as a finite union of disjoint half-open parallelepipeds, say $\sum_{s=1}^{r} P_s$ (for brevity let us drop the index $n$), such that $\lambda_c(P_s) > 0$. Then for each value of $\varepsilon$ and for every $P_s$, there is a half-open parallelepiped $D_s$ whose closure is a proper subset of $P_s$ and such that

$$\lambda_c(P_s) - \lambda_c(D_s) \le \frac{\varepsilon}{r\, 2^{n}}. \tag{3.1a}$$
3. Lebesgue and Lebesgue-Stieltjes Measures
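Bound (3.1a) yields that

$$\lambda_c(B_n) \ge \lambda_c(A_n) - \frac{\varepsilon}{2^{n}}, \tag{3.1b}$$

where $B_n = \sum_{s} D_s$. Obviously, $\bar B_n \subseteq A_n$. It seems like we are done with the sequence $\{\bar B_n\}$. However, the claim that $\bigcap_{n=1}^{\infty} \bar B_n \neq \emptyset$ is unwarranted, as $\{B_n\}$ need not be monotone decreasing. Therefore, we define

$$C_n = B_1 \cap \dots \cap B_n,$$

which forms a monotone nonincreasing sequence of sets term-wise dominated by $\{A_n\}$. Now, we need to show that $C_n \neq \emptyset$. We shall be able to prove a much stronger statement that $\lambda_c(C_n) > 0$ for all $n$. Namely, we will prove that

$$\lambda_c(C_n) \ge \lambda_c(A_n) - \varepsilon\Big(1 - \frac{1}{2^{n}}\Big), \tag{3.1c}$$

which, because of $\lambda_c(A_n) \ge \varepsilon$, would yield the desired

$$\lambda_c(C_n) \ge \frac{\varepsilon}{2^{n}}. \tag{3.1d}$$

We prove (3.1c) by induction. For $n = 1$, (3.1c) holds true, since from (3.1b), $\lambda_c(C_1) = \lambda_c(B_1) \ge \lambda_c(A_1) - \frac{\varepsilon}{2}$. Now we assume that (3.1c) holds for some $n \ge 1$ and show the validity of (3.1c) for $n + 1$. Because of $C_{n+1} = B_{n+1} \cap C_n$ and Proposition 1.5 (ii),

$$\lambda_c(B_{n+1} \cup C_n) = \lambda_c(B_{n+1}) + \lambda_c(C_n) - \lambda_c(C_{n+1}). \tag{3.1e}$$

Due to (3.1e), the inequality $\lambda_c(B_{n+1}) \ge \lambda_c(A_{n+1}) - 2^{-(n+1)}\varepsilon$ (from (3.1b) for $n + 1$), and the assumption that (3.1c) holds true for some fixed $n$, we have

$$\lambda_c(C_{n+1}) \ge \lambda_c(A_{n+1}) - \frac{\varepsilon}{2^{n+1}} + \lambda_c(A_n) - \varepsilon\Big(1 - \frac{1}{2^{n}}\Big) - \lambda_c(B_{n+1} \cup C_n).$$

Since obviously $B_{n+1} \cup C_n \subseteq A_n$, we have $\lambda_c(A_n) \ge \lambda_c(B_{n+1} \cup C_n)$, and hence

$$\lambda_c(C_{n+1}) \ge \lambda_c(A_{n+1}) - \varepsilon\Big(1 - \frac{1}{2^{n+1}}\Big).$$

This proves (3.1c) and (3.1d) and thereby yields that $\{\bar C_n\}$ is a monotone nonincreasing sequence of nonempty compact sets; hence, by Cantor's Theorem 5.4, Chapter 2,

$$\emptyset \neq \bigcap_{n=1}^{\infty} \bar C_n \subseteq \bigcap_{n=1}^{\infty} \bar B_n \subseteq \bigcap_{n=1}^{\infty} A_n.$$

Consequently, it shows that $\lambda_c$ is indeed a premeasure on the ring $\mathcal{R}$. $\square$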
3.2 Remarks and Definitions.

(i) Theorem 3.1 states that the Lebesgue content on $\mathcal{R}(\mathcal{S})$ in $\mathbb{R}^n$ is $\sigma$-additive. This, obviously, implies that the Lebesgue elementary content is also $\sigma$-additive on $\mathcal{S}$.

(ii) In Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. Now, by the use of Proposition 2.11, Corollary 2.14, and Theorem 3.1, we can have the couple $(\mathcal{R}, \lambda_c)$ or, in light of Remark (i), even $(\mathcal{S}, \lambda_0)$ as an extendible formatter of the outer measure $\lambda^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Lebesgue outer measure. The $\sigma$-algebra $\Sigma^* \subseteq \mathcal{P}(\mathbb{R}^n)$ of all $\lambda^*$-measurable sets, in notation $\mathcal{L}^*$, called the Lebesgue $\sigma$-algebra of measurable sets, along with $\lambda_0^* = \mathrm{Res}_{\mathcal{L}^*}\,\lambda^*$, called the Lebesgue measure, will form a complete measure space, according to Proposition 2.7. The further restriction $\lambda = \mathrm{Res}_{\Sigma(\mathcal{S})}\,\lambda^*$ of the Lebesgue outer measure to the smallest $\sigma$-algebra generated by $\mathcal{S}$ (which, according to Theorem 2.7, Chapter 4, is identical to the smallest $\sigma$-algebra generated by the usual topology) or, equivalently, by $\mathcal{R}$, known as the Borel $\sigma$-algebra $\mathcal{B}$ on $\mathbb{R}^n$, is referred to as the Borel-Lebesgue measure. By noticing that there exists a monotone increasing sequence $\{(-k,k]^n\} \uparrow \mathbb{R}^n$ of half-open squares with

$$\lambda_0\big((-k,k]^n\big) = (2k)^n < \infty,$$

we conclude that $\lambda_0$ is $\sigma$-finite on $\mathcal{S}$ and, therefore, by Corollary 2.14, the Borel-Lebesgue measure $\lambda$ is unique on $\mathcal{B}$. By Remark 2.24, the Lebesgue outer measure $\lambda^*$ and hence the Lebesgue measure $\lambda_0^*$ are also unique on $\mathcal{P}(\mathbb{R}^n)$ and $\mathcal{L}^*$, respectively. Finally, by Theorem 2.19 (ii), the completion of the Borel-Lebesgue measure $\lambda$ coincides with the Lebesgue measure $\lambda_0^*$ on $\mathcal{L}^*$ and the corresponding completion of the Borel $\sigma$-algebra coincides with the $\sigma$-algebra $\mathcal{L}^*$ of Lebesgue measurable sets.

Both Lebesgue and Borel-Lebesgue measures have their strengths and weaknesses. The Borel-Lebesgue measure acts on the Borel $\sigma$-algebra, which stems from the usual topology and preserves some topological properties. The Borel-Lebesgue measure is also an element of a very important class of Borel measures. However, unlike the Lebesgue measure, the Borel-Lebesgue measure is not complete.
3.3 Definitions.

(i) Let $\mathcal{B}$ be a Borel $\sigma$-algebra in $\Omega$. Any measure $\mu$ on $\mathcal{B}$ is called a Borel measure and the triple $(\Omega, \mathcal{B}, \mu)$ is called a Borel measure space.

(ii) A Borel measure $\mu$ on $(\mathbb{R}^n, \mathcal{B})$ is said to be a Borel-Lebesgue-Stieltjes measure if $\mu(B) < \infty$ for any $d_e$-bounded Borel set $B$. Clearly, any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite.

(iii) Let $\mu$ be a Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathcal{B})$. Now, in light of Carathéodory's construction, we can use the couple $(\mathcal{B}, \mu)$ as an extendible formatter of the outer measure $\mu^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Borel outer measure. The $\sigma$-algebra $\Sigma^* \subseteq \mathcal{P}(\mathbb{R}^n)$ of all $\mu^*$-measurable sets will be denoted by $\mathcal{B}_\mu^*$ and called the Lebesgue-Stieltjes $\sigma$-algebra of measurable sets. The corresponding restriction $\mu_0^* = \mathrm{Res}_{\mathcal{B}_\mu^*}\,\mu^*$ will be called the Lebesgue-Stieltjes measure. In the literature on measure theory, Lebesgue-Stieltjes measures are often confused with Borel-Lebesgue-Stieltjes measures.

In addition to the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}$ on $\mathbb{R}^n$, we present another construction of a Borel-Lebesgue-Stieltjes measure, for simplicity letting the dimension $n = 1$. In Example 1.2 (iii) we introduced the Lebesgue-Stieltjes elementary content $\mu_f^0$ on the semi-ring $\mathcal{S}$ of all half-open intervals $(a,b] \subseteq \mathbb{R}$, by means of an extended distribution function (i.e., a monotone nondecreasing, right-continuous function) $f : (\mathbb{R}, d_e) \to (\mathbb{R}, d_e)$, as $\mu_f^0((a,b]) = f(b) - f(a)$. Observe that $\mu_f^0$ reduces to the Lebesgue elementary content if $f(x) = x$. According to Proposition 2.11, $\mu_f^0$ can uniquely be extended to the Lebesgue-Stieltjes content $\mu_f$ on the ring $\mathcal{R}(\mathcal{S})$ of "figures." The following is to show that $\mu_f$ is $\sigma$-additive.
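The definition $\mu_f^0((a,b]) = f(b) - f(a)$ and its extension to figures can be illustrated numerically. The following is a minimal Python sketch, assuming a figure is passed as a list of disjoint pairs `(a, b)` standing for the intervals $(a,b]$; the names `ls_content`, `identity`, and `step_plus_linear` are hypothetical and chosen only for this illustration.

```python
def ls_content(f, figure):
    """Lebesgue-Stieltjes content of a figure: the sum of f(b) - f(a) over its
    disjoint half-open intervals (a, b]."""
    return sum(f(b) - f(a) for a, b in figure)


def identity(x):
    """f(x) = x reproduces the Lebesgue elementary content (interval length)."""
    return x


def step_plus_linear(x):
    """A hypothetical nondecreasing, right-continuous f with a unit jump at 0."""
    return x + (1 if x >= 0 else 0)


if __name__ == "__main__":
    # Lengths of (0, 1] and (2, 5] add up to 4.
    print(ls_content(identity, [(0, 1), (2, 5)]))
    # (-1, 0] contains the jump point 0, so it carries the extra unit mass.
    print(ls_content(step_plus_linear, [(-1, 0)]))
    # (0, 1] does not contain 0, so only the linear part contributes.
    print(ls_content(step_plus_linear, [(0, 1)]))
```

Because the intervals are open on the left and closed on the right, the jump of a right-continuous $f$ at a point is charged to the interval that contains the point as an interior point or as its right endpoint, which is the reason the semi-ring of intervals $(a,b]$ is paired with right-continuous functions.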
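3.4 Theorem. Let $\mu_f$ be a Lebesgue-Stieltjes content on the ring $\mathcal{R}(\mathcal{S})$ induced by a monotone nondecreasing right-continuous function $f$. Then $\mu_f$ is a premeasure.

Proof. Since $\mu_f$ is finite on $\mathcal{R}(\mathcal{S})$, as in Theorem 3.1, it is sufficient to show that $\mu_f$ is $\emptyset$-continuous. Let $\{R_n\}$ be a sequence of sets from $\mathcal{R}(\mathcal{S})$ monotonically decreasing to $\emptyset$. We prove that $\lim_{n\to\infty} \mu_f(R_n) = 0$.

We assume that $R_n \subseteq C$, $n = 1,2,\dots$, where $C$ is a compact set in $(\mathbb{R}, \tau_e)$. A set $R_n \in \mathcal{R}$ is a figure if it is a finite union of disjoint intervals of type $(a,b]$. Because of the right-continuity of $f$, it can be easily shown that, for each fixed $\varepsilon > 0$ and for any figure $R_n$, there is a subfigure $B_n \subseteq R_n$ such that $\bar B_n \subseteq R_n$ and such that

$$\mu_f(R_n) - \mu_f(B_n) < \varepsilon\, 2^{-n}.$$

It also follows that $\bigcap_{n=1}^{\infty} \bar B_n = \emptyset$. We claim that there is an $r$ such that $\bigcap_{k=1}^{r} \bar B_k = \emptyset$. To see this, observe that $\{C \setminus \bar B_n = C \cap (\bar B_n)^c;\ n = 1,2,\dots\}$ is an open cover of $C$ in the relative topology $(C, \tau_e \cap C)$. Since compactness is weakly hereditary and $C$ is closed, it follows that $C$ is also compact in $\tau_e \cap C$. Thus, the above cover reduces to a finite subcover, for example, $C \setminus \bar B_1, \dots, C \setminus \bar B_r$, yielding that

$$C = \bigcup_{k=1}^{r} (C \setminus \bar B_k) = C \setminus \bigcap_{k=1}^{r} \bar B_k.$$

Thus, $\bigcap_{k=1}^{r} \bar B_k = \emptyset$ and $\bigcap_{k=1}^{r} B_k = \emptyset$. Now, for all $n > r$,

$$\bigcap_{k=1}^{n} B_k = \emptyset \quad\text{and}\quad R_n = R_n \setminus \bigcap_{k=1}^{n} B_k = \bigcup_{k=1}^{n} (R_n \setminus B_k).$$

Since $\{R_n\}$ is monotone decreasing, it follows that

$$R_n \subseteq \bigcup_{k=1}^{n} (R_k \setminus B_k).$$

Observe that this is the desired inclusion implying the estimate $\mu_f(R_n) < \varepsilon$. This inclusion is due to the inclusion $R_n \setminus B_k \subseteq R_k \setminus B_k$, which holds for all $k \le n$ (as long as $n < \infty$). Hence the above countable intersection reduces to a finite intersection of the sets $B_k$, $k = 1, \dots, r$. Thus we have

$$\mu_f(R_n) \le \sum_{k=1}^{n} \mu_f(R_k \setminus B_k) < \sum_{k=1}^{n} \varepsilon\, 2^{-k} < \varepsilon,$$

which shows that $\mu_f(R_n) \to 0$. $\square$

Notice that it can alternatively be shown (Problem 3.26) that the Lebesgue-Stieltjes elementary content is $\sigma$-additive with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.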
3.5 Remarks.

(i) Using the same arguments as in Remarks and Definitions 3.2, we will extend the Lebesgue-Stieltjes elementary content $\mu_f^0$ (or content $\mu_f$) from the semi-ring $\mathcal{S}$ (or ring $\mathcal{R}(\mathcal{S})$, respectively) to the Lebesgue-Stieltjes measure $\mu_f^*$ on the $\sigma$-algebra $\Sigma^*$ ($= \mathcal{B}_f^*$) of Lebesgue-Stieltjes measurable sets and then reduce it to the unique measure $\mu_f$, which is clearly a Borel-Lebesgue-Stieltjes measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$.

(ii) When dealing with Borel measures, it is common to observe a certain property of a $\sigma$-finite Borel measure $\mu_1$ on the semi-ring $\mathcal{S}$ in $\mathbb{R}^n$ and extend this property of $\mu_1$ from $\mathcal{S}$ to the Borel $\sigma$-algebra $\mathcal{B}$, arriving at another Borel measure $\mu_2$. Since $\mu_1$ and $\mu_2$ coincide on $\mathcal{S}$, by Corollary 2.14, $\mu_1 = \mu_2$ on $\mathcal{B}$. Consequently, by Remark 2.24, the corresponding outer measures $\mu_1^*$ and $\mu_2^*$ must coincide on $\mathcal{P}(\mathbb{R}^n)$, as well as their restrictions on $\mathcal{B}_1^* = \mathcal{B}_2^*$. Note, however, that $\mathcal{B}_\mu^*$ is not a general notation like $\mathcal{B}$ is, for it is not induced by the usual topology and it is related to a particular Borel measure $\mu$ on $\mathcal{B}$.

(iii) We have learned that if $f$ is an extended distribution function (see the definition in Example 1.2 (iii)), then it induces a Borel-Lebesgue-Stieltjes measure on $\mathcal{B}$. Conversely, a Borel-Lebesgue-Stieltjes measure $\mu$ generates an extended distribution function. If $\mu$ is a finite Borel-Lebesgue-Stieltjes measure on $\mathcal{B}$, then we can set $f(x) = \mu((-\infty, x])$ and such an $f$ is a distribution function. Indeed, take a sequence $x_1 > x_2 > \dots \to x$. Then $f(x_n) - f(x) = \mu((x, x_n]) \to 0$, by $\emptyset$-continuity of $\mu$ (Theorem 1.7 (i)), which shows that $f$ is right-continuous. Since $\mu$ is a finite measure, $f$ is bounded. Finally, if $\{x_n\}$ is any monotone decreasing sequence convergent to $-\infty$ (such as $\{-n\}$), then, again by $\emptyset$-continuity of $\mu$, it follows that $\mu((-\infty, x_n]) \to 0$ and thus $f(x_n) \to 0$. If $\mu$ is an arbitrary Borel-Lebesgue-Stieltjes measure, we can define $f(0) = 0$ and

$$f(x) = \begin{cases} \mu((0, x]), & x > 0, \\ -\mu((x, 0]), & x < 0. \end{cases} \tag{$*$}$$

Similarly, one can show that $f$ is an extended distribution function. (See Problem 3.3.) If $\mathfrak{B} = \mathfrak{B}(\mathbb{R}, \mathcal{B})$ denotes the set of all Borel-Lebesgue-Stieltjes measures on $(\mathbb{R}, \mathcal{B})$, then it can be shown that any two extended distribution functions $f_1$ and $f_2$ that induce $\mu \in \mathfrak{B}$ can differ only in an additive constant (see Problem 3.4). The latter generates an equivalence relation, say $E$. Therefore, if $\mathcal{D}_e$ denotes the set of all extended distribution functions, for each $\mu \in \mathfrak{B}$, there is a unique equivalence class $\{f; \mu\}$ of all such extended distribution functions that induce $\mu$, and $\{f; \mu\} = \{f + c : c \in \mathbb{R}\}$. Let $\mathcal{D}_e|_E := \{\{f; \mu\} : \mu \in \mathfrak{B}\}$ be the corresponding quotient set of $\mathcal{D}_e$. Then, there is a bijective map $\Phi$ from the set $\mathfrak{B}$ onto the set $\mathcal{D}_e|_E$.
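The passage from a finite measure to its distribution function in Remark 3.5 (iii) can be tried out on a purely atomic measure, where everything is computable exactly. The following is a minimal Python sketch under stated assumptions: the atoms and weights in `ATOMS` are invented for illustration, and the names `mu_half_open` and `distribution` are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite atomic measure: point masses (weights) at the given atoms.
ATOMS = {
    Fraction(0): Fraction(1, 2),
    Fraction(1): Fraction(1, 3),
    Fraction(3): Fraction(1, 6),
}


def mu_half_open(a, b):
    """mu((a, b]) for the atomic measure above."""
    return sum(w for x, w in ATOMS.items() if a < x <= b)


def distribution(x):
    """f(x) = mu((-oo, x]), the distribution function generated by mu."""
    return sum(w for t, w in ATOMS.items() if t <= x)


if __name__ == "__main__":
    for a, b in [(-1, 0), (0, 1), (Fraction(1, 2), 3)]:
        # The measure of (a, b] is recovered as f(b) - f(a).
        print((a, b), mu_half_open(a, b), distribution(b) - distribution(a))
```

Replacing `distribution` by `distribution(x) + c` for any constant `c` leaves every difference $f(b) - f(a)$ unchanged, which is the reason the remark works with equivalence classes $\{f + c : c \in \mathbb{R}\}$ rather than with single functions.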
As regards the subset $\mathfrak{B}_0$ of all finite Borel-Lebesgue-Stieltjes measures, each one of them obviously generates a unique distribution function, and there is a bijective map between $\mathfrak{B}_0$ and the set $\mathcal{D}$ ($\subseteq \mathcal{D}_e$) of all distribution functions. To make all distinctions between distribution and extended distribution functions lucid, the reader may find it expedient to go over Problem 3.9.

We will return to the Lebesgue measure $\lambda_0^*$ on $\mathcal{L}^*$. First, we prove a lemma about negligible sets. One of the interesting consequences of this result is that in $\mathbb{R}^n$, all Borel sets having a dimension less than $n$ are null sets.
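3.6 Lemma. A set $N \subseteq \mathbb{R}^n$ is $\lambda$-negligible if and only if for each $\varepsilon > 0$ there is a countable cover of $N$ by semi-open intervals $\{I_k\} \subseteq \mathcal{S}$ such that

$$\sum_{k=1}^{\infty} \lambda(I_k) < \varepsilon.$$

Proof. Let $N$ be $\lambda$-negligible. Then, by Problem 2.18, $\lambda^*(N) = 0$ and

$$\lambda^*(N) = \inf\Big\{\sum_{k=1}^{\infty} \lambda_0(I_k) : \{I_k\} \in \mathcal{C}_N\Big\},$$

where $\mathcal{C}_N$ is the set of all countable covers of $N$ by semi-open intervals; it is not empty, since otherwise $\lambda^*(N)$ would equal $\infty$. By the definition of an infimum, for each $\varepsilon > 0$, there is a cover $\{I_k\} \in \mathcal{C}_N$ such that

$$\sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon,$$

which proves the first part of the statement. Conversely, let $\varepsilon > 0$ and let $\{I_k\} \subseteq \mathcal{S}$ be a countable cover of $N$ with the property that $\sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon$. Then,

$$\lambda^*(N) \le \sum_{k=1}^{\infty} \lambda_0(I_k) < \varepsilon,$$

so that $\lambda^*(N) = 0$ because $\varepsilon$ is arbitrary, and hence, by Problem 2.18, $N$ is a $\lambda$-negligible set. $\square$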
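The covering criterion of Lemma 3.6 is what makes countable sets null (cf. Problem 3.2): the $k$th point gets an interval of length $\varepsilon / 2^{k+1}$, and the lengths sum to less than $\varepsilon$. The following is a minimal Python sketch in dimension $n = 1$, assuming the points are supplied as exact fractions and only a finite initial segment of the enumeration is processed; the names `cover_points` and `total_length` are hypothetical.

```python
from fractions import Fraction


def cover_points(points, eps):
    """Cover the k-th point x_k by the half-open interval (x_k - eps/2**(k+1), x_k]."""
    cover = []
    for k, x in enumerate(points, start=1):
        length = Fraction(eps) / 2 ** (k + 1)
        cover.append((x - length, x))
    return cover


def total_length(cover):
    """Sum of the lengths b - a of the intervals (a, b] in the cover."""
    return sum(b - a for a, b in cover)


if __name__ == "__main__":
    # An initial segment of an enumeration (with repetitions) of rationals in [0, 1].
    pts = [Fraction(p, q) for q in range(1, 6) for p in range(0, q + 1)]
    eps = Fraction(1, 100)
    cover = cover_points(pts, eps)
    # The total length stays below eps / 2 however many points are covered.
    print(total_length(cover) < eps / 2)
```

The bound does not depend on the number of points, since $\sum_{k \ge 1} \varepsilon\, 2^{-(k+1)} = \varepsilon / 2$, so the same construction covers the whole countable set.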
3.7 Lemma. Let $f : \mathbb{R} \to \mathbb{R}$ be an additive function, continuous at zero. Then $f$ is linear.

Proof. First note that

$$f(0) = f(0 + 0) = f(0) + f(0). \tag{3.7a}$$

This yields that $f(0) = 0$. Then, from

$$0 = f(0) = f(x + (-x)) = f(x) + f(-x) \tag{3.7b}$$

it follows that $f(x) = -f(-x)$ and thus $f$ is odd. Now, let $n$ be any positive integer. Then, since $f$ is additive,

$$f(nx) = n f(x). \tag{3.7c}$$

If $n$ is a negative integer, then, from (3.7b-c),

$$f(nx) = -f(-nx) = -(-n) f(x) = n f(x).$$

Hence, for each $n \in \mathbb{Z}$,

$$f(nx) = n f(x), \tag{3.7d}$$

which yields that

$$f\Big(\frac{x}{n}\Big) = \frac{1}{n} f(x), \quad n \neq 0. \tag{3.7f}$$

Combining (3.7d) and (3.7f) we have that for each integer $m$,

$$f\Big(\frac{m}{n} x\Big) = \frac{m}{n} f(x).$$

In other words, for each rational number $q$,

$$f(qx) = q f(x). \tag{3.7g}$$

Since $f$ is continuous at zero and because $f$ is additive and odd, we have from

$$f(x - y) = f(x) + f(-y) = f(x) - f(y)$$

that $f$ is continuous on $\mathbb{R}$. Now, let $r \in \mathbb{R}$. Then, there is a sequence $\{q_n\}$ of rationals convergent to $r$. Due to the continuity of $f$,

$$\lim_{n\to\infty} f(q_n) = f(r). \tag{3.7h}$$

On the other hand, $f(q_n \cdot 1) = q_n f(1)$ and (3.7g) lead to

$$\lim_{n\to\infty} f(q_n) = r f(1).$$

This shows that $f$ is a linear function $f(x) = cx$, where $c = f(1)$. $\square$

3.8 Corollary. Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous at zero and additive for each variable separately. Then $f(x_1, \dots, x_n) = c\, x_1 \cdots x_n$, where $c = f(1, \dots, 1)$.

Proof. If $x_2, \dots, x_n$ are fixed, then by Lemma 3.7,

$$f(x_1, x_2, \dots, x_n) = x_1 f(1, x_2, \dots, x_n).$$

Applying the same procedure successively to the other variables we have the statement. $\square$

3.9 Definition. A Borel measure $\mu$ on $\mathcal{B}(\mathbb{R}^n)$ is said to be translation-invariant, if for each Borel set $B \in \mathcal{B}(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$, $\mu(B + x) = \mu(B)$, where $B + x = \{x + y : y \in B\}$.
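We will see in Section 4 that the Borel-Lebesgue and Lebesgue measures are translation-invariant. The following theorem states that any translation-invariant Borel measure is a multiple of the Borel-Lebesgue measure.

3.10 Theorem. Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$. Then $\mu = c\lambda$, where $\lambda$ is the Borel-Lebesgue measure on $\mathcal{B}(\mathbb{R}^n)$ and $c = \mu(C)$ ($C$ stands for a unit cube).

Proof. For each $x \in \mathbb{R}$, define

$$I_x = \begin{cases} [0, x), & x \ge 0, \\ [x, 0), & x < 0, \end{cases} \qquad\text{and}\qquad \operatorname{sgn} x = \begin{cases} 1, & x > 0, \\ -1, & x < 0. \end{cases}$$

Denote

$$f(x_1, \dots, x_n) = \operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big)\, \mu(I_{x_1} \times \dots \times I_{x_n}). \tag{3.10}$$

We show that $f$ defined in (3.10) is additive and continuous in each variable separately. Without loss of generality, we show it with respect to $x_1$. Let $x_1 = x + y$.

Case 1. Suppose $x > 0$ and $y > 0$. Then,

$$I_{x+y} = [0, x) + [x, x+y)$$

and

$$I_{x+y} \times I_{x_2} \times \dots \times I_{x_n} = R_1 + R_2,$$

where

$$R_1 = I_x \times I_{x_2} \times \dots \times I_{x_n} \quad\text{and}\quad R_2 = [x, x+y) \times I_{x_2} \times \dots \times I_{x_n}.$$

Since $x$, $y$, and $x + y$ are all positive,

$$\operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big) = \operatorname{sgn}(x \cdot x_2 \cdots x_n) = \operatorname{sgn}(y \cdot x_2 \cdots x_n). \tag{3.10a}$$

From (3.10a),

$$f(x + y, x_2, \dots, x_n) = \operatorname{sgn}(x \cdot x_2 \cdots x_n)\, \mu(R_1) + \operatorname{sgn}(y \cdot x_2 \cdots x_n)\, \mu(R_2),$$

and since $\mu$ is translation-invariant, $\mu(R_2) = \mu(I_y \times I_{x_2} \times \dots \times I_{x_n})$, so that

$$f(x + y, x_2, \dots, x_n) = f(x, x_2, \dots, x_n) + f(y, x_2, \dots, x_n).$$

Case 2. Suppose $x + y > 0$ and $x > 0$, $y < 0$. Then,

$$\operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big) = \operatorname{sgn}(x \cdot x_2 \cdots x_n) = -\operatorname{sgn}(y \cdot x_2 \cdots x_n). \tag{3.10b}$$

Since $I_{x+y} = [0, x+y) = [0, x) \setminus [x+y, x)$, $\lambda([x+y, x)) = \lambda([y, 0))$, and because $\mu$ is translation-invariant, using (3.10b) we have that

$$f(x + y, x_2, \dots, x_n) = f(x, x_2, \dots, x_n) + f(y, x_2, \dots, x_n).$$

The other combinations of $x$ and $y$ are left for the reader. (See Problem 3.20.)

Now, we prove continuity of $f$ at zero. Let $\{a_k\}$ be a sequence convergent to zero from the right. Then, $\{a_k\} \subseteq \mathbb{R}_+$ and the sequence of sets $\{I_{a_k}\}$ is such that

$$I_{a_k} \downarrow I_0.$$

The latter yields that

$$\bigcap_{k=1}^{\infty} \{I_{a_k} \times I_{x_2} \times \dots \times I_{x_n}\} = I_0 \times I_{x_2} \times \dots \times I_{x_n}.$$

By the definition, $I_0 = \emptyset$, and by continuity from above of $\mu$, we have that

$$\lim_{k\to\infty} f(a_k, x_2, \dots, x_n) = 0.$$

Similarly, by continuity from below of $\mu$, we have that

$$\lim_{k\to\infty} f(a_k, x_2, \dots, x_n) = 0$$

for $\{a_k\} \uparrow 0$. In addition, $f(0, x_2, \dots, x_n) = 0$ is by the definition of $f$. By Corollary 3.8,

$$f(x_1, \dots, x_n) = f(1, \dots, 1)\, x_1 \cdots x_n = \mu(C)\, x_1 \cdots x_n, \tag{3.10c}$$

where $C = [0,1) \times \dots \times [0,1)$. On the other hand,

$$f(x_1, \dots, x_n) = \operatorname{sgn}\Big(\prod_{i=1}^{n} x_i\Big)\, \mu(I_{x_1} \times \dots \times I_{x_n}),$$

which, along with (3.10c), gives

$$\mu(I_{x_1} \times \dots \times I_{x_n}) = \mu(C)\, |x_1 \cdots x_n|. \tag{3.10d}$$

Note that

$$\lambda(I_{x_1} \times \dots \times I_{x_n}) = |x_1 \cdots x_n|. \tag{3.10e}$$

Equations (3.10d) and (3.10e) tell us that for any rectangle $R$ all of whose sides lie on the corresponding coordinate axes,

$$\mu(R) = \mu(C)\, \lambda(R). \tag{3.10f}$$

For an arbitrarily positioned rectangle $R$ all of whose sides are parallel to the corresponding coordinate axes, (3.10f) still holds true due to the translation invariance of $\mu$.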
By $\mu_0 = \mu(C)\lambda_0$ we define an elementary content on the semi-ring $\mathcal{S}$ of half-open rectangles. Then, by $\tilde\mu = \mu(C)\lambda$ we also have a Borel measure on $\mathcal{B}$. Now, we have three Borel measures on $\mathcal{B}$: $\mu$, $\tilde\mu$, and the (unique, as $\mathcal{S}$ is $\cap$-stable) extension of $\mu_0$ from $\mathcal{S}$ to $\mathcal{B}$. All three coincide on $\mathcal{S}$ and therefore must be equal on $\mathcal{B}$. $\square$

3.11 Example (Cantor ternary set). Consider the following family of subsets of $[0,1]$. Let

$$\Omega = C_0 = [0,1], \quad G_1 = \big(\tfrac13, \tfrac23\big), \quad C_1 = C_0 \setminus G_1, \quad G_2 = \big(\tfrac19, \tfrac29\big) \cup \big(\tfrac79, \tfrac89\big), \quad C_2 = C_1 \setminus G_2,$$

as depicted in Figure 3.1 below:

Figure 3.1

Therefore, each $C_n$ is the union of $2^n$ closed intervals, while each $G_n$ is the union of $2^{n-1}$ open intervals. Also,

$$C_n = C_{n-1} \setminus G_n, \quad n = 1, 2, \dots,$$

and $\{C_n\}$ is a monotone decreasing sequence of sets. The Cantor set is defined as

$$C = \bigcap_{n=0}^{\infty} C_n$$

and it can be characterized as follows.

1) $C$ is closed as the intersection of closed sets.

2) Each $C_n$ contains $2^n$ closed disjoint intervals $F_1(n), \dots, F_{2^n}(n)$. Each of these intervals is a term of a monotone decreasing sequence $\{F_k(n)\}\!\downarrow$ with $d_e(F_k(n)) = \lambda(F_k(n)) \downarrow 0$, $n \to \infty$. By applying Cantor's Theorem 5.4, Chapter 2, we conclude that each such sequence has an intersection $\bigcap_{n=1}^{\infty} F_k(n)$ consisting of exactly one point. In other words, $C$ contains no interval of positive length and, being closed, is therefore nowhere dense.

5) The Lebesgue measure of $G_n$ is $\lambda(G_n) = \frac12 \big(\frac23\big)^n$, since $G_n$ consists of $2^{n-1}$ disjoint intervals, each of length $3^{-n}$.

6) Thus

$$\lambda\Big(\sum_{n=1}^{\infty} G_n\Big) = \sum_{n=1}^{\infty} \frac12 \Big(\frac23\Big)^n = 1.$$

Hence $\lambda(C) = 1 - 1 = 0$ and therefore $C$ is a Borel $\lambda$-null set.

7) $C$ is not empty, since $C$ contains all boundary points of the sets $C_n$, which are $0, 1, \frac13, \frac23, \frac19, \frac29, \frac79, \frac89, \dots$. The boundary points have the following ternary representations:

$1 = 1.0$ (or) $= 0.22222\ldots$ (in triadic representation)

$\frac13 = 0.1$ (or) $= 0.02222\ldots$

Each set $C_n$ has exactly $2^n$ left boundary points, whose triadic representations run through all $n$-tuples of digits $0$ or $2$. Observe that $2^n = |\mathcal{P}(A)|$ where $A$ is an $n$-element set. Therefore, $C$ is equivalent to the set of all subsets of natural numbers, which has the cardinality of the continuum. In other words, $C \sim \mathbb{R}$. Therefore, the Cantor set is an example of an uncountable Borel $\lambda$-null set.
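The counts and lengths used in Example 3.11 can be verified with exact rational arithmetic. The following is a minimal Python sketch, assuming each level $C_n$ is represented as a list of closed intervals given by their endpoints; the names `cantor_step`, `cantor_level`, and `total_length` are hypothetical.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of each closed interval [a, b]."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out


def cantor_level(n):
    """Return the closed intervals whose union is C_n, starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals


def total_length(intervals):
    """Lebesgue measure of a disjoint union of intervals."""
    return sum(b - a for a, b in intervals)


if __name__ == "__main__":
    for n in range(1, 6):
        level = cantor_level(n)
        # Length removed at step n, i.e. the measure of G_n.
        removed = total_length(cantor_level(n - 1)) - total_length(level)
        print(n, len(level), total_length(level), removed)
```

Each row shows $2^n$ intervals of total length $(2/3)^n$, and the removed length at step $n$ equals $\frac12 (2/3)^n$, in agreement with items 5) and 6); the total lengths tend to $0$, which is the statement $\lambda(C) = 0$.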
3. Lebesgue and Lebesgue-Stieltjes Measures PROBLEMS 3.1
Let H = ( x = (xl,. . .,en)E Rn: xi = a E R) be a hyperplane orthogonal to the ith coordinate axis. Show that H is a A-null Borel set. [Hint: 1 ) Show that H is closed in (Wn,re)and hence Borel, 2 ) Find a relevant countable cover of H by rectangles from Y and apply Lemma 3.6.1
3.2
Show that each countable subset of W n is a Borel A-null set. [Hint: Use Problem 3.11.
3.3
Show that f defined by (*) in Remark 3.5 (iii) is an extended distribution function.
3.4
Let f l and f 2 are two extended distribution functions and let p1 and p2 are the corresponding Borel-Lebesgue-Stieltjes measures induced by these functions. Show that p1 = p2 if and only if f - f = C , where c is a constant function.
3.5
Let %Ie be the set of all extended distribution functions. Show that 9,is a semilinear space over W + .
3.6
Let f and f 2 be two extended distribution functions. If pl and p2 are the corresponding Borel-Lebesgue-St ieltjes measures induced by f l and f 2 , show that for any nonnegative scalars al and a 2 , 9 ( a l p l u 2 p 2 )= { a l f a2f 2;p} , where '3 is defined in Remark 3.5 (iii).
+
3.7
+
Let f : R+R be an extended distribution function and let p f be the corresponding Borel-Lebesgue-Stieltjes measure on %(R). Show that
a ) p f ( ( a ) b ) )= f ( b -
-f (a)
b ) ~ f ( [ a , b l=) f ( b ) - f ( a - )
4
= f ( b - ) -f (a- )
~ f ( [ " l b ) )
d ) f is continuous if and only if p f ( { x ) ) = 0 , x E R.
3.8
Let f be the extended distribution function on R given by
CHAPTER 5. MEASURES
and let pf be the corresponding Borel-Lebesgue-Stieltjes measure. Evaluate the measure of the following sets:
3.9
Let f be a distribution function and let pf denote the BorelLebesgue-Stieltjes measure induced by f . Justify with a proof or give a counter-argument: a) Must f be an extended distribution function?
b) Suppose g is a function defined by (*) of Remark 3.5 (iii). Is g a distribution function? If your answer is yes, is g = f ? 3.10
Let p be an atomic measure ( =
i=o
ai ob i)a
1 ) Is p always a Borel-Lebesgue-Stieltjes measure? If it is not, give a condition under which p is a Borel-Lebesgue-Stieltjes measure. 2 ) Find in this case { f ;p). 3 ) Plot one such f .
3.11
Consider the Borel a-algebra 93 = 3B([W)" generated by the usual topology. Show that, for any Borel set B E 93 and any point xeRn, B + x = ( r ~ R " : z = y + x : Y E B ) E % . [Hznt: Show that C , = {A E 3: A + x E 3 ) is a a-algebra.]
3.12
Let (52,bB)p)be a Borel measure space, such that the Borel aalgebra 4B is generated by a Hausdorff topological space T , and p is a finite Borel measure. For any subset Q E f l denote by %(Q) the collection of all compact subsets of Q. Show that a subcollection J% C 93 of all sets B E 4B such that
is a monotone system in f l , i.e. a subcollection of those Borel sets that can be approximated "from below" by compact subsets is a monotone system.
3.13
Let $(\Omega, \mathcal{B}, \mu)$ be a special case of the Borel measure space introduced in Problem 3.12, namely, let $\Omega = \mathbb{R}^n$ and let the Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R}^n)$ be generated by the usual topology $\tau_e$. Again assume that $\mu$ is a finite Borel measure. Show that in this case every Borel set $B$ can be approximated from below by a compact subset $K \subseteq B$; i.e. for every $\varepsilon > 0$ there is a compact subset $K(B) \subseteq B$, such that
$$\mu(B) - \mu(K(B)) < \varepsilon.$$
3.14
Generalize Problem 3.13 allowing $\mu$ to be a $\sigma$-finite Borel measure.
3.15
Let $(\mathbb{R}^n, \mathcal{B}, \mu)$ be a Borel measure space, where the Borel $\sigma$-algebra $\mathcal{B}$ is generated by the usual topology $\tau_e$ in $\mathbb{R}^n$ and $\mu$ is a finite Borel measure on $\mathcal{B}$. Show that every Borel set can be "approximated from above" by an open set, i.e. if $\mathcal{O}(B)$ is the collection of all open supersets of $B$, then
$$\mu(B) = \inf\{\mu(O) : O \in \mathcal{O}(B)\}.$$
3.16
Show that there is a non-Borel set in $\mathcal{P}(\mathbb{R})$.
[Hints: We call $x, y \in \mathbb{R}$ equivalent ($x \sim y$) if and only if $x - y \in \mathbb{Q}$ (rational numbers). For every real number $x \in \mathbb{R}$ we assign another real number $y$ to the class $A_x$ if and only if $x - y \in \mathbb{Q}$.
1) Show that $\sim$ is indeed an equivalence relation on $\mathbb{R}$.
Let $\mathbb{R}|_{\sim}$ be the quotient set of $\mathbb{R}$ modulo $\sim$. Using the Axiom of Choice we select one element from each class of $\mathbb{R}|_{\sim}$ that belongs to the set $(0,1]$. Denote by $A$ the collection of all such elements.
2) Show that such a selection is possible taking into account the Axiom of Choice; i.e. it can be shown that $\forall x \in \mathbb{R}$, $A_x \cap (0,1] \neq \emptyset$.
3) Show that set $A$ has the following properties:
$\mathbb{R}$ can be restored from $A$ as $\mathbb{R} = \bigcup_{q \in \mathbb{Q}} (q + A)$ (a disjoint union).
4) Finally, let $\tilde{Q} = \mathbb{Q} \cap (0,1]$. Then $\bigcup_{q \in \tilde{Q}} (q + A) \subseteq (0,2]$. If $A$ is a Borel set (and this is the assumption that will lead to a contradiction), then by Problem 3.11, $x + A$ is a Borel set too; and by the translation-invariance of the Borel-Lebesgue measure $\lambda$, $\lambda(x + A) = \lambda(A)$, implying that
$$\lambda\Big(\bigcup_{q \in \tilde{Q}} (q + A)\Big) = \sum_{q \in \tilde{Q}} \lambda(q + A) \le \lambda((0,2]) = 2.$$
Thus the above series is finite; and since the values $\lambda(q + A)$ are equal for all $q \in \tilde{Q}$, each of them must be zero, which implies that $\lambda(q + A) = 0$, $\forall q \in \mathbb{Q}$. But
$$\mathbb{R} = \bigcup_{q \in \mathbb{Q}} (q + A) \ \Rightarrow\ \lambda(\mathbb{R}) = \sum_{q \in \mathbb{Q}} \lambda(q + A) = 0,$$
which is an absurdity. Thus, our assumption that $A$ is a Borel set was wrong.]
3.17
Let $\lambda$ denote the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$. Show that for each Borel set $B$ and $\varepsilon > 0$, there is a countable cover of $B$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty} \lambda_0(C_k) - \lambda(B) < \varepsilon.$$
In particular,
[Hint: Use Problem 3.15.]
3.18
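Show that if $N$ is a negligible set in $(\mathbb{R}^n, \mathcal{B}, \lambda)$, then for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty} \lambda_0(C_k) < \varepsilon.$$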
3.19
Show that if $N$ is a subset of $\mathbb{R}^n$, and for each $\varepsilon > 0$, there is a countable cover of $N$ by semi-open (not necessarily disjoint) cubes $\{C_k\}$ such that $\sum_{k=1}^{\infty} \lambda_0(C_k) < \varepsilon$, then $N$ is negligible.
3.20
Show additivity of f in Theorem 3.10 for the other combinations of x and y.
3.21
Let p be a translation invariant Borel measure on 93(Rn) and let p* be the outer measure produced by (%(Rn),p) and %; be the corresponding a-algebra of p*-measurable sets. Show that (i) p* = p(C)X* on T(Wn) and p; = p(C)X; on 3;.
(ii) 93; = &*. 3.22
Let $(\mathbb{R}^n, \mathcal{B}, \lambda)$ be the Borel-Lebesgue measure space and let $B$ be a compact set. Show that for each $\varepsilon > 0$, there is a finite cover of $B$ by semi-open rectangles $D_1, \ldots, D_N$ such that
3.23
For any $\varepsilon > 0$, construct an open set $D$ in $(\mathbb{R}, \tau_e)$, which is dense in $\mathbb{R}$ and with $\lambda(D) \le \varepsilon$.
3.24
Is every $\sigma$-finite Borel measure on $\mathcal{B}(\mathbb{R}^n)$ also a Borel-Lebesgue-Stieltjes measure?
3.25
Give a direct, alternative to Theorem 3.1, proof that the Lebesgue elementary content X0 is u-additive on the semi-ring 3 of half open intervals in Rn, not using any prior extension to the Lebesgue content A, on %(Rn), as Theorem 3.1 does.
3.26
Give a direct (alternative to Theorem 3.4) proof that the Lebesgue-Stieltjes elementary content is $\sigma$-additive, with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.
NEW TERMS:
Lebesgue outer measure
Lebesgue $\sigma$-algebra
Lebesgue measure
Borel-Lebesgue measure
Borel measure
Borel measure space
Borel-Lebesgue-Stieltjes measure
Borel outer measure
Lebesgue-Stieltjes measure
Lebesgue-Stieltjes content
distribution function
extended distribution function
translation-invariant Borel measure
Cantor ternary set
measure of a hyperplane
non-Borel set
4. IMAGE MEASURES
In Remark 3.5 (iii) we saw how Borel-Lebesgue-Stieltjes measures can be generated by measurable functions belonging to the class of so-called extended distribution functions. In this section we will also generate measures by elements of the far more general $\mathcal{C}^{-1}$-class of measurable functions. The process of generating the measure is entirely different from that in Remark 3.5 (iii), and the two notions should not be confused with each other. Section 3, Chapter 4, is a relevant prerequisite to this material.
4.1 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f\colon (\Omega, \Sigma) \to (\Omega', \Sigma')$ be a measurable function. Then the set function $A' \mapsto \mu f^*(A') = \mu(f^*(A'))$ on $\Sigma'$ is a measure. (See Problem 4.1.)
4.2 Definition and Notation. The measure $\mu f^*$ in Proposition 4.1 induced by a measurable function $f$ is called an image measure. Notice that directly from Definition 2.1 (viii), $\mu f^*(A')$ can alternatively be viewed as $\mu\{\omega \in \Omega : f(\omega) \in A'\}$ or, shortly, as $\mu\{f \in A'\}$.
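As an illustration of Definition 4.2, the following is a minimal sketch of an image measure for a discrete measure on a finite set. The representation of a measure as a dictionary of point masses, and the names `image_measure` and `measure_of`, are assumptions made for this example only and do not come from the text.

```python
from fractions import Fraction

def image_measure(mu, f):
    """Push a discrete measure mu (dict: point -> mass) forward along f."""
    pushed = {}
    for omega, mass in mu.items():
        y = f(omega)
        pushed[y] = pushed.get(y, Fraction(0)) + mass
    return pushed

def measure_of(mu, subset):
    """Measure of a set under a discrete measure given by point masses."""
    return sum((mass for point, mass in mu.items() if point in subset), Fraction(0))

# Hypothetical example: uniform measure on {0,...,5} and f(w) = w mod 3.
mu = {w: Fraction(1, 6) for w in range(6)}
f = lambda w: w % 3
mu_f = image_measure(mu, f)

# The image measure of A' equals the measure of the preimage of A'.
A_prime = {0, 2}
preimage = {w for w in mu if f(w) in A_prime}
assert measure_of(mu_f, A_prime) == measure_of(mu, preimage)
```

The final assertion is the defining identity $\mu f^*(A') = \mu(f^*(A'))$ checked on one set.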
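4.3 Proposition. Let $L\colon \mathbb{R}^n \to \mathbb{R}^n$ be such that $L(x) = \alpha x + b$, where $\alpha \in \mathbb{R} \setminus \{0\}$ and $b \in \mathbb{R}^n$. Then the Borel-Lebesgue measure $\lambda$ on $\mathcal{B}(\mathbb{R}^n)$ has the property $\lambda L^* = \frac{1}{|\alpha|^n}\lambda$. Specifically, if $\alpha = 1$ we have $\lambda L^* = \lambda$, which shows that the Borel-Lebesgue measure is translation-invariant.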
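Proof.
1) Let $f(x) = \alpha x$ (called a homothetic function), where $x \in \mathbb{R}^n$ and $\alpha\ (\neq 0) \in \mathbb{R}$. Let $\lambda$ be the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathcal{B}$. We show that
$$\lambda f^* = \frac{1}{|\alpha|^n}\lambda.$$
Take $(a,b] \in \mathcal{S}$. Then,
$$f^*((a,b]) = \prod_{i=1}^{n} \left(\frac{a_i}{\alpha}, \frac{b_i}{\alpha}\right] \text{ for } \alpha > 0 \quad \text{and} \quad f^*((a,b]) = \prod_{i=1}^{n} \left[\frac{b_i}{\alpha}, \frac{a_i}{\alpha}\right) \text{ for } \alpha < 0,$$
which implies that
$$\lambda f^*((a,b]) = \frac{1}{\alpha^n}\lambda((a,b]) \text{ for } \alpha > 0$$
and
$$\lambda f^*((a,b]) = \frac{1}{\alpha^n}(-1)^n \lambda((a,b]) \text{ for } \alpha < 0,$$
and thus
$$\lambda f^*((a,b]) = \frac{1}{|\alpha|^n}\lambda((a,b]).$$
As a continuous map relative to the usual topology, $f$ is Borel and, consequently, $\lambda f^*$ is a Borel measure on $\mathcal{B}$. Obviously, $\frac{1}{|\alpha|^n}\lambda$ is also a Borel measure on $\mathcal{B}$. Since $\lambda f^*$ and $\frac{1}{|\alpha|^n}\lambda$ are $\sigma$-finite on $\mathcal{S}$ and coincide on $\mathcal{S}$ (being a $\cap$-stable generator of $\mathcal{B}$), and since $\frac{1}{|\alpha|^n}\lambda$ is $\sigma$-additive, by Corollary 2.14, they should also coincide on $\mathcal{B}$.
2) Let $g(x) = x + b$. Similarly, we can show (see Problem 4.2) that $\lambda g^*$ is a Borel measure on $\mathcal{B}$ and that $\lambda g^* = \lambda$. Therefore, $\lambda$ is translation-invariant. Finally, since $L = g \circ f$ and hence $L^* = f^* \circ g^*$,
$$\lambda L^*(B) = \lambda f^*(g^*(B)) = \frac{1}{|\alpha|^n}\lambda(g^*(B)) = \frac{1}{|\alpha|^n}\lambda(B), \quad B \in \mathcal{B}.$$
The proposition is therefore proved.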
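The scaling rule of Proposition 4.3 can be checked on half-open boxes, where both sides are elementary volumes. The sketch below uses exact rational arithmetic; the helper names `box_volume`, `preimage_box` and `check` are hypothetical and are introduced only for this illustration.

```python
from fractions import Fraction

def box_volume(a, b):
    """Lebesgue content of the half-open box (a, b] in R^n."""
    vol = Fraction(1)
    for lo, hi in zip(a, b):
        vol *= max(Fraction(0), Fraction(hi) - Fraction(lo))
    return vol

def preimage_box(a, b, alpha, shift):
    """Endpoints of the preimage of (a, b] under L(x) = alpha * x + shift."""
    lo = [(Fraction(x) - s) / alpha for x, s in zip(a, shift)]
    hi = [(Fraction(x) - s) / alpha for x, s in zip(b, shift)]
    # For alpha < 0 the endpoints are interchanged; min/max restores the order.
    lower = [min(p, q) for p, q in zip(lo, hi)]
    upper = [max(p, q) for p, q in zip(lo, hi)]
    return lower, upper

def check(a, b, alpha, shift):
    """True if the preimage of (a, b] has volume lambda((a, b]) / |alpha|^n."""
    n = len(a)
    lower, upper = preimage_box(a, b, alpha, shift)
    return box_volume(lower, upper) == box_volume(a, b) / abs(alpha) ** n
```

For instance, `check([0, 0], [1, 2], Fraction(-3), [5, 7])` compares a preimage of volume $2/9$ with $2/|{-3}|^2$. This only tests the identity on the generator of boxes; the extension to all Borel sets is what Corollary 2.14 supplies in the proof.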
4.4 Remark. Proposition 4.3 tells us that the Borel-Lebesgue measure is invariant under translation, which is a sort of motion defined as $T_b(x) = x + b$. In the two-dimensional Euclidean space we know another form of motion, called rotation. A figure under rotation $R$ and subsequent translation $T_b$ is transformed into a congruent one. We can show that an arbitrary Borel set in $\mathbb{R}^n$ rotated and then translated preserves its volume. (See Corollary 2.3, Chapter 7.) In the $n$-dimensional Euclidean space, instead of rotation, we use an orthogonal transformation. More precisely, in $\mathbb{R}^n$ an orthogonal transformation is in the form of an orthogonal $n \times n$ matrix; recall that an $n \times n$ nonsingular matrix $R$ is orthogonal if $RR^T = R^TR = I$ (the identity matrix). The composition $M = T_b \circ R$ is an example of a motion. Generally, a bijective map $M$ from one metric space $(X, d)$ onto another metric space $(Y, \rho)$ is called an isometry if it preserves the distance, i.e. if for every pair $x, y \in X$, $d(x,y) = \rho(M(x), M(y))$. Two such metric spaces are said to be isometric. A motion is a special case of isometry when $Y = X$, $\rho = d$. In the Euclidean space $(\mathbb{R}^n, d_e)$, a motion $M$ can be represented by the composition $T_b \circ R$, where $T_b$ is a translation map and $R$ is an orthogonal matrix. As a continuous map, the motion is also Borel. It can be shown (see Problem 4.10) that the Borel-Lebesgue measure is motion-invariant, i.e. $\lambda M^* = \lambda$.
4.5 Examples.
(i) Let $(\Omega, \Sigma, \mu)$ be a probability space and let $(\Omega', \Sigma')$ be a measurable space. Then any $\Sigma$-$\Sigma'$-measurable function $f\colon \Omega \to \Omega'$ is called a random variable. The corresponding image measure $\mu f^*$ is called the probability distribution (of the random variable $f$). Observe that in probability theory, a probability measure is denoted by $P$ and a random variable is denoted by upper case letters like $X$, $Y$ or $Z$. In most applications, $\Omega'$ is the numeric set $\mathbb{R}^n$ or a subset of $\mathbb{R}^n$, and $\Sigma'$ is the corresponding Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$ or its trace on the subset. We would like to emphasize that a measurable function, say $X$, can only be a random variable if it is associated with a particular probability measure $P$, along with which it induces the probability distribution. The latter specifies the random variable. In other words, measurable functions may share the same measurable space, but as far as probability theory goes, they differ if they induce different probability distributions (or more precisely, different classes of probability distributions categorized by their parameters).
(ii) Let $(\Omega, \Sigma, P)$ be a probability space and $X\colon \Omega \to \{0, 1, \ldots\}$ be a random variable such that $PX^*$ is a Poisson measure $\pi_\lambda$. Then the random variable $X$ is called a Poisson random variable. Similarly, a random variable $X\colon \Omega \to \{0, \ldots, n\}$ is called binomial, if $PX^*$ is a binomial measure $\beta_{n,p}$. A random variable $X$ is called (discrete) uniformly distributed if $PX^* = \frac{1}{n+1}\sum_{k=0}^{n} \varepsilon_k$. As it was pointed out in (i), $X\colon \Omega \to \{0, \ldots, n\}$ is just a measurable function (which can be uniform or binomial), and it becomes a random variable upon specification of its distribution $PX^*$ or even earlier, the probability measure $P$. These are examples of so-called discrete random variables. The construction of probability distributions of continuous random variables (i.e., those whose ranges are continuums) requires integration and the concept of a density. The latter will be developed in Chapters 6 and 8.
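To make Example 4.5 (ii) concrete, the sketch below builds a binomial distribution as an image measure. The choice of $\Omega = \{0,1\}^n$ with the product (coin-flip) probability and of $X$ as the number of ones is an assumption made for this example; the text does not fix a particular probability space.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binomial_as_image(n, p):
    """Distribution P X^* of X = number of ones on Omega = {0,1}^n."""
    dist = {}
    for omega in product((0, 1), repeat=n):
        k = sum(omega)                          # value X(omega)
        prob = p ** k * (1 - p) ** (n - k)      # probability of one outcome
        dist[k] = dist.get(k, Fraction(0)) + prob
    return dist

n, p = 4, Fraction(1, 3)
dist = binomial_as_image(n, p)

# Compare with the closed-form binomial weights.
expected = {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}
assert dist == expected
```

The same function $X$ with a different $P$ on $\Omega$ would induce a different distribution, which is the point made in part (i) of the example.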
PROBLEMS
4.1
Prove Proposition 4.1.
4.2
Prove part 2 of Proposition 4.3.
4.3
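Let $(\Omega, \Sigma, \mu)$ be a measure space with $\Omega = \mathbb{R}$, $\Sigma = \{A \subseteq \mathbb{R} : \text{either } A \text{ or } A^c \text{ is at most countable}\}$, and let $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = 1$ if $A^c$ is at most countable. Let $\Omega' = \{0, 1\}$, $\Sigma' = \mathcal{P}(\Omega')$. Define $[\Omega, \Omega', f]$ as
$$f(x) = \begin{cases} 0, & \text{if } x \text{ is rational} \\ 1, & \text{if } x \text{ is irrational.} \end{cases}$$
Prove that $f$ is $\Sigma$-$\Sigma'$-measurable and determine $\mu f^*$.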
4.4
What are the traces of Borel $\sigma$-algebras on $\Omega' = \{0, 1, \ldots\}$ and $\Omega' = \{0, 1, \ldots, n\}$ introduced in Example 4.5 (ii)?
4.5
Let $\mathcal{M}(\mathbb{R}^n)$ be the collection of all motions on $(\mathbb{R}^n, d_e)$. Show that $(\mathcal{M}(\mathbb{R}^n), \circ)$, where $\circ$ is the composition operator, forms a group with unity.
4.6
Let f be a homothetic function (f (x) = ax) defined in Proposition 4.3, part 1, A - the Borel-Lebesgue measure on G;B(Rn), A* - the Lebesgue outer measure, L* - the rr-algebra of Lebesgue measurable sets, and A; - the Lebesgue measure on L*. Let p* be the outer measure generated by the image measure A f *, '3; - the rr-algebia of all p*-measurable sets, and p; = ResB,p*. that: P
Show
c) 93; = L*.
4.7
Generalize Problem 4.6 by letting $f$ be the affine map $f(x) = \alpha x + b$, $\alpha \neq 0$, $b \in \mathbb{R}^n$.
4.8
Show that the Lebesgue measure A; on L* is translation-invariant.
4.9
Let p be a translation-invariant Borel measure on 3B(Rn) and let p* be the outer measure produced by (B(Rn),p). Show that: a) p* = p(C)A*, where C is the unit cube. b) 3; = e*.
4.10
Show that the Borel-Lebesgue measure is motion-invariant. (See also Chapter 7.)
NEW TERMS:
$\mathcal{C}^{-1}$-class of functions
image measure
homothetic function
orthogonal transformation
isometry
isometric metric spaces
motion
motion-invariant measure
random variable
probability distribution
Poisson random variable
binomial random variable
discrete random variable
translation-invariance of Lebesgue measure
5. EXTENDED REAL-VALUED MEASURABLE FUNCTIONS
5.1 Definitions and Notations.
(i) Recall (Section 3, Chapter 4) that $\mathcal{C}^{-1}(\Omega, \Sigma; \Omega', \Sigma')$ denotes the collection of all measurable functions from a measurable space $(\Omega, \Sigma)$ to a measurable space $(\Omega', \Sigma')$. If $\Omega' = \mathbb{R}$ and $\Sigma' = \mathcal{B}(\mathbb{R})$, then $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ will denote the class of all real-valued measurable functions on a measurable space $(\Omega, \Sigma)$. The class of all complex-valued measurable functions will be denoted by $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C}) = \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C}, \mathcal{B}(\mathbb{C}))$. Using the notion of product measures (Section 6, Chapter 6) we can show that a function $f = u + iv \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ if and only if $u, v \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$.
(ii) In Examples 1.2 (iv) and 10.19 (i), Chapter 3, we constructed a topology on the extended real line $\overline{\mathbb{R}}$ via "two-point compactification." The resulting topological space $(\overline{\mathbb{R}}, \bar{\tau}_e)$ included all open sets of $(\mathbb{R}, \tau_e)$ and, in addition, open sets of types
where $O \in \tau_e$. The corresponding Borel $\sigma$-algebra $\mathcal{B}(\overline{\mathbb{R}}) = \Sigma(\bar{\tau}_e)$, therefore, consists of all sets of $\mathcal{B}(\mathbb{R})$ and combinations of unions of Borel sets with the sets $\{+\infty\}$ and $\{-\infty\}$. In this section, we will be concerned with the class of all extended real-valued functions $f\colon \Omega \to \overline{\mathbb{R}}$ which are $\Sigma$-$\mathcal{B}(\overline{\mathbb{R}})$-measurable, where $\Sigma$ is a $\sigma$-algebra in $\Omega$. We denote such a class by $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$ (or sometimes shortly by $\mathcal{C}^{-1}$ if a measurable space $(\Omega, \Sigma)$ is previously specified).
We give a simple criterion for a function to belong to $\mathcal{C}^{-1}$.
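5.2 Proposition. A function $f\colon \Omega \to \overline{\mathbb{R}}$ is measurable (i.e. belongs to $\mathcal{C}^{-1}$) if and only if, for every real value $a$, the set $\{\omega \in \Omega : f(\omega) \le a\} = f^*([-\infty, a])$, in notation $\{f \le a\}$, is measurable, i.e., is an element of $\Sigma$.
Proof. We shall show that the collection of sets $\{[-\infty, a] : a \in \mathbb{R}\}$ is a generator of $\mathcal{B}(\overline{\mathbb{R}})$. Then the statement will follow directly from Proposition 3.4, Chapter 4. Denote $\mathcal{B}' = \Sigma(\{[-\infty, a] : a \in \mathbb{R}\})$. Then,
$$(a, b] = [-\infty, b] \setminus [-\infty, a] \in \mathcal{B}',$$
and hence $\mathcal{S} \subseteq \mathcal{B}'$, which implies that $\mathcal{B}(\mathbb{R}) \subseteq \mathcal{B}'$. Since
$$\{+\infty\} = \bigcap_{k=1}^{\infty} [k, +\infty] \quad \text{and} \quad \{-\infty\} = \bigcap_{k=1}^{\infty} [-\infty, -k],$$
we have that $\{+\infty\}, \{-\infty\} \in \mathcal{B}'$. Thus $\mathcal{B}(\overline{\mathbb{R}}) \subseteq \mathcal{B}'$. Also,
$$[-\infty, a] = \{-\infty\} \cup (-\infty, a] \in \mathcal{B}(\overline{\mathbb{R}}).$$
Therefore, $\{[-\infty, a] : a \in \mathbb{R}\} \subseteq \mathcal{B}(\overline{\mathbb{R}})$, which yields that $\mathcal{B}' \subseteq \mathcal{B}(\overline{\mathbb{R}})$ and, finally, $\mathcal{B}' = \mathcal{B}(\overline{\mathbb{R}})$.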
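The criterion of Proposition 5.2 can be tried out on a finite set, where a $\sigma$-algebra generated by a partition can be listed explicitly. The following is a minimal sketch under that assumption; the function names are hypothetical.

```python
from itertools import combinations

def sigma_algebra_from_partition(blocks):
    """All unions of blocks of a finite partition (a sigma-algebra on their union)."""
    blocks = [frozenset(b) for b in blocks]
    sets = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            sets.add(frozenset().union(*chosen))
    return sets

def is_measurable(f, omega, sigma):
    """Check that {f <= a} lies in sigma for every a.

    On a finite space it suffices to test a in the range of f, because
    {f <= a} only changes when a passes one of those values.
    """
    return all(
        frozenset(w for w in omega if f(w) <= a) in sigma
        for a in {f(w) for w in omega}
    )

omega = {1, 2, 3, 4}
sigma = sigma_algebra_from_partition([{1, 2}, {3, 4}])
f = lambda w: 0 if w <= 2 else 5   # constant on each block
g = lambda w: w                    # separates points inside a block
```

Here `is_measurable(f, omega, sigma)` holds while `is_measurable(g, omega, sigma)` fails, since $\{g \le 1\} = \{1\}$ is not a union of blocks.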
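Proposition 5.2 can be extended to a number of modifications of conditions equivalent to measurability.
5.3 Corollary. A function $f\colon \Omega \to \overline{\mathbb{R}}$ is measurable if and only if any of the following conditions holds.
(i) $\{f \ge a\} \in \Sigma$, $\forall a \in \mathbb{R}$.
(ii) $\{f > a\} \in \Sigma$, $\forall a \in \mathbb{R}$.
(iii) $\{f < a\} \in \Sigma$, $\forall a \in \mathbb{R}$.
Proof. (iii)
$$\{f < a\} = f^*([-\infty, a)) = f^*\Big(\bigcup_{n=1}^{\infty} \big[-\infty, a - \tfrac{1}{n}\big]\Big) = \bigcup_{n=1}^{\infty} f^*\big(\big[-\infty, a - \tfrac{1}{n}\big]\big),$$
i.e. $\{f \le b\} \in \Sigma$, $\forall b \in \mathbb{R}$, implies that $\{f < a\} \in \Sigma$. Similarly,
$$\{f \le a\} = f^*([-\infty, a]) = f^*\Big(\bigcap_{n=1}^{\infty} \big[-\infty, a + \tfrac{1}{n}\big)\Big) = \bigcap_{n=1}^{\infty} f^*\big(\big[-\infty, a + \tfrac{1}{n}\big)\big),$$
i.e. $\{f < b\} \in \Sigma$, $\forall b \in \mathbb{R}$, implies that $\{f \le a\} \in \Sigma$.
Therefore, the statements $\{f \le a\} \in \Sigma$, $\forall a \in \mathbb{R}$, and $\{f < a\} \in \Sigma$, $\forall a \in \mathbb{R}$, are equivalent, which along with Proposition 5.2 yields statement (iii). (For the proof of (i) and (ii), see Problem 5.1.)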
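5.4 Proposition. Let $f, g \in \mathcal{C}^{-1}$. Then $\{f < g\} \in \Sigma$.
Proof. We show that
$$\{f < g\} = \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\}$$
by using the pick-a-point process. We exclude the trivial case when $\{f < g\} = \emptyset$. If $\omega_0 \in \{f(\omega) < g(\omega)\}$, then equivalently, $f(\omega_0) < g(\omega_0)$, implying that there exists an $r_0 \in \mathbb{Q}$ such that $f(\omega_0) < r_0 < g(\omega_0)$. Hence,
$$\omega_0 \in \{f < r_0\} \cap \{g > r_0\}$$
and
$$\{f < g\} \subseteq \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\}.$$
Conversely, if
$$\omega_0 \in \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\},$$
then there exists an $r_0 \in \mathbb{Q}$ such that $\omega_0 \in \{f(\omega) < r_0\} \cap \{g(\omega) > r_0\}$ and $f(\omega_0) < r_0 < g(\omega_0)$, implying that $f(\omega_0) < g(\omega_0)$. Thus $\omega_0 \in \{f(\omega) < g(\omega)\}$. Now the statement follows from Corollary 5.3 (ii, iii), since $\Sigma$ is closed under countable unions.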
5.5 Definitions. In the situations below we will deal with spaces of measurable functions that have not occurred before. We discuss the following constructions.
(i) Let $\mathbb{F}$ be a field and let $\mathfrak{X}$ be a vector lattice over $\mathbb{F}$ and a commutative ring with unity. Observe that $(\mathfrak{X}, \mathbb{F})$ is an algebra and $(\mathfrak{X}, \cdot)$ is a multiplicative Abelian semi-group with unity (i.e. a group that perhaps fails to have multiplicative inverses); call it shortly an $\mathfrak{X}$-space over $\mathbb{F}$. Throughout the remainder of this book, as an $\mathfrak{X}$-space, we shall consider a class of functions (extended real- or complex-valued over the field $\mathbb{R}$ or $\mathbb{C}$). For instance, the space of all continuous functions is an $\mathfrak{X}$-space over $\mathbb{R}$. [Note that the term $\mathfrak{X}$-space is not common in real analysis literature and is restricted to the use in this book.]
(ii) Let $\overline{\mathbb{R}}^{\Omega}$ be the set of all functions from $\Omega$ to $\overline{\mathbb{R}}$ (as we defined it in Section 5, Chapter 1), and let $(\overline{\mathbb{R}}^{\Omega}, \tau_p)$ be the topology of pointwise convergence (cf. Definition 5.11, Chapter 3) generated by the compact topology $(\overline{\mathbb{R}}, \bar{\tau}_e)$ in each of the factor spaces. Let us call $(\overline{\mathbb{R}}^{\Omega}, \tau_p)$ the extended topology of pointwise convergence. Let $\mathfrak{X}$ be a subset of $(\overline{\mathbb{R}}^{\Omega}, \tau_p)$ such that it is an $\mathfrak{X}$-space over $\mathbb{R}$. We call $\mathfrak{X}$ a closed $\mathfrak{X}$-space if $(\mathfrak{X}, \tau_p)$ contains the limits of all $\tau_p$-convergent sequences. In other words, it contains the limit of every pointwise convergent sequence (observe that since $(\overline{\mathbb{R}}, \bar{\tau}_e)$ is Hausdorff, any pointwise limit is unique). For instance, the space of all continuous functions is not a closed $\mathfrak{X}$-space.
(iii) Consider the subspace $(\mathcal{C}^{-1}, \tau_p)$ of all measurable functions structured in terms of the extended topology of pointwise convergence. The next proposition shows that $(\mathcal{C}^{-1}, \tau_p)$ is a closed $\mathfrak{X}$-space; it is the widest class of functions met so far, second only to $\overline{\mathbb{R}}^{\Omega}$.
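5.6 Proposition. $(\mathcal{C}^{-1}, \tau_p)$ is a closed $\mathfrak{X}$-space over $\mathbb{R}$, that is, for any $f, g \in \mathcal{C}^{-1}$ and for $\{f_n : n = 1, 2, \ldots\} \subseteq \mathcal{C}^{-1}$:
(i) $af + b \in \mathcal{C}^{-1}$ for all $a, b \in \mathbb{R}$.
(ii) $f \pm g \in \mathcal{C}^{-1}$.
(iii) $f \cdot g \in \mathcal{C}^{-1}$.
(iv) $\sup\{f_n\} \in (\mathcal{C}^{-1}, \tau_p)$ and $\inf\{f_n\} \in (\mathcal{C}^{-1}, \tau_p)$; specifically, it follows that $\mathcal{C}^{-1}$ is a lattice, and thus with any $f \in \mathcal{C}^{-1}$, also $|f| \in \mathcal{C}^{-1}$.
(v) $\overline{\lim} f_n \in \mathcal{C}^{-1}$ and $\underline{\lim} f_n \in \mathcal{C}^{-1}$.
(vi) if $f_n \to f$ in the extended topology of pointwise convergence, then $f \in (\mathcal{C}^{-1}, \tau_p)$.
Proof.
(i) is obvious.
(ii) By (i), $a - g \in \mathcal{C}^{-1}$, implying that $\{f + g \le a\} = \{f \le a - g\}$. Therefore, by Problem 5.2 (i), $f + g \in \mathcal{C}^{-1}$. The case $f - g$ follows by applying this to $-g$.
(iii) We have
$$\{f^2 > a\} = \Omega \ (a < 0) \ \Rightarrow\ \{f^2 > a\} \in \Sigma,$$
$$\{f^2 > a\} = \{f > \sqrt{a}\} \cup \{f < -\sqrt{a}\} \ (a \ge 0) \ \Rightarrow\ f^2 \in \mathcal{C}^{-1}.$$
The statement follows from the representation $fg = \frac{1}{4}(f + g)^2 - \frac{1}{4}(f - g)^2$.
(iv) We show that
$$\{\sup\{f_n\} \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\}.$$
Indeed, $\omega_0 \in \{\sup\{f_n\} \le a\}$ if and only if $\sup\{f_n(\omega_0)\} \le a$ or, equivalently, $f_n(\omega_0) \le a$ for all $n$ or, equivalently, $\omega_0 \in \bigcap_{n=1}^{\infty} \{f_n \le a\}$. The latter implies that
$$\{\sup\{f_n\} \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\} \in \Sigma.$$
Let $\{f_n\} \subseteq \mathcal{C}^{-1}$. Then $\{-f_n\} \subseteq \mathcal{C}^{-1}$. The statement for the infimum follows from
$$\inf\{f_n\} = -\sup\{-f_n\}.$$
Now if $f \in \mathcal{C}^{-1}$, it implies that $|f| = \sup\{f, -f\} \in \mathcal{C}^{-1}$.
(v) This statement directly follows from (iv).
(vi) $\lim_{n \to \infty} f_n = f$ if and only if $\overline{\lim} f_n = \underline{\lim} f_n = f$, and the statement follows from (v).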
PROBLEMS
5.1
Prove Corollary 5.3 (i) and (ii).
5.2
Prove that for $f, g \in \mathcal{C}^{-1}$,
(i) $\{f \le g\} \in \Sigma$.
(ii) $\{f = g\} \in \Sigma$.
(iii) $\{f \neq g\} \in \Sigma$.
5.3
Let f , g ~ ~ - ' . S h o w t h a t w ~ c o s ( f ~ ( ~ ) + 4 g ( w ) ) ~ ~ - ' .
5.4
Show that if $f^3 \in \mathcal{C}^{-1}$ then $f \in \mathcal{C}^{-1}$. Show that if $f^2 \in \mathcal{C}^{-1}$ then $f$ need not be in $\mathcal{C}^{-1}$.
5.5
Let $f, g \in \mathcal{C}^{-1}$ and let $A \in \Sigma$. Show that
5.6
Let $f\colon (a,b] \to \mathbb{R}$ be a
a) monotone function
b) convex function
c) function with at most countably many discontinuities.
Show that in each case, $f$ is $\mathcal{B}((a,b])$-$\mathcal{B}(\mathbb{R})$-measurable.
5.7
Prove the statement: $f \in \mathcal{C}^{-1}$ if and only if $\{f > d\} \in \Sigma$ for all $d \in D$, where $D \subseteq \mathbb{R}$ is any dense set in $\mathbb{R}$.
5.8
Show that if $f$ has a derivative at each point of $\mathbb{R}$, then this derivative is Borel-measurable.
NEW TERMS:
extended real-valued function
measurability of an extended real-valued function, criteria of
$\mathfrak{X}$-space
extended topology of pointwise convergence
closed $\mathfrak{X}$-space
6. SIMPLE FUNCTIONS
The present section is a direct precursor to integration, which we develop in the next chapter. The integral itself will be first defined for simple functions valued in a finite set of nonnegative reals.
6.1 Definition. We consider the following subclass of functions from $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$, which we call nonnegative simple functions, and denote this subclass by $\Psi_+(\Omega, \Sigma) = \Psi_+(\Omega, \Sigma; \mathbb{R})$. An element $s$ is said to belong to $\Psi_+$ or to be nonnegative simple if:
c) $s$ takes on only finitely many real values.
6.2 Remarks.
(i) Let $s \in \Psi_+(\Omega, \Sigma)$. If there is an $n$-tuple of nonnegative real numbers $\{a_1, \ldots, a_n\}$ and a finite decomposition $\sum_{k=1}^{n} A_k$ of $\Omega$ such that $s(\omega) = a_k$ for all $\omega \in A_k$, then the function $s$ (as in Figure 6.1) can obviously be represented as
$$s = \sum_{k=1}^{n} a_k \mathbf{1}_{A_k}. \qquad (6.1)$$
Figure 6.1
In some cases we may need to deal with different decompositions of $\Omega$. Consequently, there are in general different finite representations or expansions of $s \in \Psi_+$ of type (6.1). However, there is obviously a unique one where (6.1) contains all different values $\{a_1, \ldots, a_n\}$ of $s$. We wish to call such a representation (expansion) canonic.
(ii) For the upcoming material we will need some modifications of $\mathfrak{X}$-spaces introduced in Definition 5.5. Let $\mathbb{F}$ be a field. Recall that $\mathbb{F}_+ \subseteq \mathbb{F}$ is called the semifield if all axioms of the field hold except for #4 (the existence of additive inverses, see Definition 7.5, Chapter 1). If $\mathfrak{X}$ is a linear space over $\mathbb{F}$, the corresponding restriction $(\mathfrak{X}; \mathbb{F}_+)$ is called a semi-linear space. If, in addition, $(\mathfrak{X}; \mathbb{F})$ is a vector lattice, then $(\mathfrak{X}; \mathbb{F}_+)$ is called a semi-linear lattice. Similarly, we can define corresponding restrictions of rings and algebras over $\mathbb{F}_+$, calling them quasirings and quasialgebras. If $\mathfrak{X}$ is a semi-linear lattice over a semifield $\mathbb{F}_+$ and a commutative quasiring with unity over $\mathbb{F}_+$, then we call the pair $(\mathfrak{X}; \mathbb{F}_+)$ a semi-$\mathfrak{X}$-space.
(iii) In Chapter 8 (Section 4), we will also be using the notion of a simple function, which is just as in Definition 6.1, except that it is not necessarily nonnegative. The set of all such simple functions will be denoted by $\Psi(\Omega, \Sigma) = \Psi(\Omega, \Sigma; \mathbb{R})$.
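The passage from an arbitrary finite representation to the canonic one in Remark 6.2 (i) amounts to merging the sets that carry the same value. A minimal sketch follows, assuming a representation is stored as a list of pairs (value, set) with pairwise disjoint sets; this data layout and the names `canonical` and `evaluate` are assumptions of the example.

```python
def canonical(representation):
    """Merge a finite representation [(a_k, A_k)] with disjoint A_k into the
    canonic one, in which all values a_k are different."""
    merged = {}
    for value, block in representation:
        merged[value] = merged.get(value, frozenset()) | frozenset(block)
    return merged

def evaluate(representation, w):
    """Value s(w) of s = sum_k a_k 1_{A_k} at the point w."""
    return sum(value for value, block in representation if w in block)

# Hypothetical simple function on {1, 2, 3, 4} with a repeated value.
rep = [(2, {1}), (5, {2, 3}), (2, {4})]
canon = canonical(rep)   # value 2 on {1, 4}, value 5 on {2, 3}

# Both representations define the same function.
assert all(
    evaluate(rep, w) == evaluate(list(canon.items()), w) for w in (1, 2, 3, 4)
)
```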
6.3 Proposition. $(\Psi_+(\Omega, \Sigma); \mathbb{R}_+; \cdot)$ is a semi-$\mathfrak{X}$-space. In other words, if $s, t \in \Psi_+$, then:
(i) $as + bt \in \Psi_+$ for all $a, b \in \mathbb{R}_+$.
(ii) $s \cdot t \in \Psi_+$.
(iii) $\sup(s, t) \in \Psi_+$.
(iv) $\inf(s, t) \in \Psi_+$.
(See Problem 6.1.)
We denote by $(\overline{\Psi}_+(\Omega, \Sigma), \tau_p)$ the subspace of all extended real-valued nonnegative functions $f \in \mathcal{C}^{-1}$ for each of which there exists a monotone nondecreasing sequence $\{s_n\} \subseteq \Psi_+$ of nonnegative simple functions such that $f = \sup\{s_n\}$ in the topology of pointwise convergence. By Proposition 5.6 (iv), $\overline{\Psi}_+ \subseteq \mathcal{C}^{-1}$, i.e. $\overline{\Psi}_+$ consists of only measurable functions. The following proposition asserts that $\overline{\Psi}_+$ is a semi-$\mathfrak{X}$-space (and it is the closure of $\Psi_+$ with respect to the topology of pointwise convergence).
6.4 Proposition. $\overline{\Psi}_+(\Omega, \Sigma)$ is a semi-$\mathfrak{X}$-space over $\mathbb{R}_+$, i.e. if $f, g \in \overline{\Psi}_+(\Omega, \Sigma)$, then:
(i) $af + bg \in \overline{\Psi}_+$ for all $a, b \in \mathbb{R}_+$.
(ii) $f \cdot g \in \overline{\Psi}_+$.
(iii) $\sup(f, g) \in \overline{\Psi}_+$.
(iv) $\inf(f, g) \in \overline{\Psi}_+$.
Proof.
(i) Let $f = \sup\{s_n\}$, $g = \sup\{t_n\}$. Then
$$af + bg = a \sup\{s_n\} + b \sup\{t_n\} = \sup\{as_n\} + \sup\{bt_n\}$$
and $as_n, bt_n \in \Psi_+$. Furthermore, $\{as_n\}$ and $\{bt_n\}$ are monotone nondecreasing, so that $af + bg = \sup\{as_n + bt_n\}$.
(ii) The proof is similar to that for (i).
(iii) Let $w_n = \sup(s_n, t_n)$. Then obviously, $\sup\{w_n\}$ exists and equals $\sup\{f, g\}$ (why?).
(iv) The proof is similar to that for (iii).
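6.5 Theorem. Let $\mathcal{C}^{-1}_+ = \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}}_+)$, i.e., the subclass of all nonnegative extended real-valued measurable functions. Then $\mathcal{C}^{-1}_+ = \overline{\Psi}_+$ and it is a closed semi-$\mathfrak{X}$-space.
Proof. Evidently, $\overline{\Psi}_+ \subseteq \mathcal{C}^{-1}_+$. Therefore, we are left to prove that $\mathcal{C}^{-1}_+ \subseteq \overline{\Psi}_+$. We will show that, for every $f \in \mathcal{C}^{-1}_+$, there is a monotone nondecreasing sequence $\{s_n\}$ of nonnegative simple functions from $\Psi_+$ such that $\sup\{s_n\} = f$. The latter is at the heart of the following construction. Let
$$A_i(n) = \Big\{\frac{i}{2^n} \le f < \frac{i+1}{2^n}\Big\}, \ i = 0, \ldots, n2^n - 1, \qquad A_{n2^n}(n) = \{f \ge n\}.$$
For instance, for $n = 1$ we have $A_0(1) = \{0 \le f < \frac{1}{2}\}$, $A_1(1) = \{\frac{1}{2} \le f < 1\}$ and $A_2(1) = \{f \ge 1\}$. In other words, for each $n$ the range $[0, \infty]$ is split into $n2^n$ dyadic intervals of length $2^{-n}$ and the ray $[n, \infty]$, and the sets $A_i(n)$ are their preimages under $f$.
Therefore, all sets $A_i(n)$, $i = 0, \ldots, n2^n$, are disjoint and obviously $\Sigma$-measurable. Let us define
$$s_n = \sum_{i=0}^{n2^n} \frac{i}{2^n} \mathbf{1}_{A_i(n)}.$$
Both $f$ and $s_n$ are depicted in Figure 6.2.
Figure 6.2
Clearly $s_{n+1} \ge s_n$. Besides, $s_n(\omega) \le f(\omega) < s_n(\omega) + 2^{-n}$, $\forall \omega \in \Omega$: $f(\omega) < n$, and $s_n(\omega) = n$, $\forall \omega \in \Omega$: $f(\omega) = \infty$. Functions $s_n$ and $s_{n+1}$ are drawn in Figure 6.3.
Figure 6.3
Thus there exists $\sup\{s_n\} = f$ (pointwise $\forall \omega \in \Omega$), and therefore $f \in \overline{\Psi}_+$, implying that $\mathcal{C}^{-1}_+ \subseteq \overline{\Psi}_+$. This proves that $\mathcal{C}^{-1}_+ = \overline{\Psi}_+$.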
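The approximation used in the proof of Theorem 6.5 can be sketched pointwise. The code below assumes the standard dyadic construction, in which $s_n$ equals $\lfloor 2^n f \rfloor / 2^n$ where $f < n$ and equals $n$ where $f \ge n$; whether this matches the book's sets $A_i(n)$ symbol for symbol cannot be confirmed from the damaged source, so it should be read as an illustration only.

```python
import math

def s_n(f_value, n):
    """n-th dyadic approximation of a value f(w) in [0, +inf].

    Below level n the value is rounded down to a multiple of 2^-n;
    at or above level n (including +inf) the approximation is n.
    """
    if f_value >= n:
        return float(n)
    return math.floor(f_value * 2 ** n) / 2 ** n

# Approximations of f(w) = 2.75 increase to 2.75; those of +inf grow like n.
approximations = [s_n(2.75, n) for n in range(1, 6)]
at_infinity = [s_n(math.inf, n) for n in range(1, 6)]
```

The first list is nondecreasing and reaches $2.75$ as soon as $n > 2.75$, which reflects the bound $s_n \le f < s_n + 2^{-n}$ on $\{f < n\}$.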
PROBLEMS
6.1
Prove Proposition 6.3.
6.2
Let $\Omega$ be an uncountable set and let $\Sigma = \{A \subseteq \Omega : A \text{ or } A^c \text{ is at most countable}\}$. Show that $f \in \mathcal{C}^{-1}(\Omega, \Sigma)$ if and only if $f$ is constant everywhere except on an at most countable subset of $\Omega$.
NEW TERMS:
nonnegative simple functions
canonic representation (expansion)
semi-linear space
semi-linear lattice
quasiring
quasialgebra
semi-$\mathfrak{X}$-space
simple function
closed semi-$\mathfrak{X}$-space
Chapter 6 Elements of Integration
The historical significance of the development of measure theory is that it created a base for a generalization of the classical Riemann notion of the definite integral (which since 1854 was considered to be the most general theory of integration). Riemann defined a bounded function over an interval $[a,b]$ to be integrable if and only if the Darboux (or Cauchy) sums $\sum_{i=1}^{n} f(t_i)\lambda(I_i)$, where $\sum_{i=1}^{n} I_i = [a,b]$ is a finite decomposition of $[a,b]$ into subintervals, approach a unique limiting value whenever the length of the largest interval goes to zero. A French mathematician, Henri Lebesgue (1875-1941), assumed that the above intervals $I_i$ may be substituted by more general measurable sets and that the class of Riemann integrable functions can be enlarged to the class of measurable functions. In this case, we arrive at a more solid theory of integration, which is better suited for dealing with various limit processes and which greatly contributed to the contemporary theory of probability and stochastic processes. Although many results existed prior to Lebesgue's major work between 1901 and 1910, Lebesgue's construction appeared to be the most efficient. After 1910, a large number of mathematicians began to engage in work initiated by Lebesgue. Some of the most significant contributions were made by the Frenchman Pierre Fatou (1878-1929), Italian Guido Fubini (1879-1943), Hungarian Frigyes (Frédéric) Riesz (1880-1956), Pole Otto Nikodym (1887-1974), and Austrian Johann Radon (1887-1956), who developed the Lebesgue-Stieltjes integral and whose work led to the modern abstract theory of measure and integration. In this chapter, we will first be concerned with the main principles of integration with respect to arbitrary measures. We will be using standard techniques developed for Lebesgue integration but without sacrificing the generality. Then various applications of the integral will be considered.
We will look at the integral as a measure (and later, in Chapter 8, in the general case, as a "signed measure"), at Radon-Nikodym derivatives, at decomposition of measures and decomposition of absolutely continuous functions, and at "multiple integration." Other applications of integration (including uniform integrability) and various principles of convergence will be developed in Chapter 8.
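1. INTEGRATION ON $\mathcal{C}^{-1}(\Omega, \Sigma)$
We begin the theory of integration with integrals of nonnegative simple functions, which we introduced in Section 6, Chapter 5. Prior to the definition of the rudimentary integral, the lemma below states that integrals of nonnegative simple functions do not depend on their representations.
1.1 Lemma. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $s \in \Psi_+(\Omega, \Sigma)$ have two representations:
$$s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i} = \sum_{k=1}^{m} b_k \mathbf{1}_{B_k}.$$
Then it holds that
$$\sum_{i=1}^{n} a_i \mu(A_i) = \sum_{k=1}^{m} b_k \mu(B_k).$$
Proof. The above representations are due to the two decompositions of $\Omega$:
$$\Omega = \sum_{i=1}^{n} A_i = \sum_{k=1}^{m} B_k.$$
Then
$$A_i = \sum_{k=1}^{m} A_i \cap B_k \quad \text{and} \quad B_k = \sum_{i=1}^{n} A_i \cap B_k,$$
which implies that
$$\sum_{i=1}^{n} a_i \mu(A_i) = \sum_{i=1}^{n} \sum_{k=1}^{m} a_i \mu(A_i \cap B_k)$$
and
$$\sum_{k=1}^{m} b_k \mu(B_k) = \sum_{k=1}^{m} \sum_{i=1}^{n} b_k \mu(A_i \cap B_k).$$
By noticing that $a_i = b_k$ on $A_i \cap B_k$ whenever this set is nonempty, we are done with the proof.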
1.2 Definition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $s \in \Psi_+(\Omega, \Sigma)$ with the representation
$$s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i}.$$
Then the number
$$\sum_{i=1}^{n} a_i \mu(A_i)$$
is called the integral of $s$ with respect to $\mu$, and it is denoted by one of the symbols:
$$\int s(\omega)\,d\mu(\omega) \quad \text{or} \quad \int s(\omega)\,\mu(d\omega) \quad \text{or, shortly,} \quad \int s\,d\mu.$$
Since the value of the integral of a function $s$ does not depend upon its representation, this definition is consistent. In other words, the integral $s \mapsto \int s\,d\mu$ defines a functional on $\Psi_+$ valued in $\overline{\mathbb{R}}_+$.
1.3 Proposition (Properties of the integral).
(i) For each measurable set $A \in \Sigma$,
$$\int \mathbf{1}_A\,d\mu = \mu(A).$$
(ii) The integral $\int$ is a nonnegative linear functional, i.e.,
$$\int (as + bt)\,d\mu = a \int s\,d\mu + b \int t\,d\mu, \quad \text{where } s, t \in \Psi_+ \text{ and } a, b \in \mathbb{R}_+.$$
(iii) For any two nonnegative simple functions $s, t \in \Psi_+$ such that $s \le t$, it holds that $\int s\,d\mu \le \int t\,d\mu$ (monotonicity).
(See Problem 1.1.)
1.4 Example. Let $f$ be the Dirichlet function defined as $f = \mathbf{1}_{\mathbb{Q}}$ (earlier introduced in Example 4.7, Chapter 2), where $\mathbb{Q}$ is the set of all rational numbers (hence a Borel set). Thus $f \in \Psi_+(\mathbb{R}, \mathcal{B})$. By Proposition 1.3 (i), the integral of $f$ with respect to Lebesgue measure $\lambda$ is
$$\int f\,d\lambda = \int \mathbf{1}_{\mathbb{Q}}\,d\lambda = \lambda(\mathbb{Q}) = 0.$$
For the upcoming definitions and statements we will denote a monotone nondecreasing sequence of functions by $\{f_n\}\uparrow$ and a monotone nonincreasing sequence of functions by $\{f_n\}\downarrow$.
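The integral of a nonnegative simple function, and the fact that it does not depend on the chosen representation, can be tried out on a finite measure space. The sketch below assumes a discrete measure stored as a dictionary of point masses and a representation stored as a list of pairs (value, set); both conventions and the name `integral_simple` are introduced only for this example.

```python
from fractions import Fraction

def integral_simple(representation, mu):
    """Integral of s = sum_i a_i 1_{A_i} with respect to a discrete measure mu
    (dict: point -> mass), computed as sum_i a_i * mu(A_i)."""
    total = Fraction(0)
    for a_i, A_i in representation:
        total += a_i * sum((mu[w] for w in A_i), Fraction(0))
    return total

mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

# Two representations of the same simple function.
rep_1 = [(3, {1, 2}), (7, {3, 4})]
rep_2 = [(3, {1}), (3, {2}), (7, {3}), (7, {4})]
assert integral_simple(rep_1, mu) == integral_simple(rep_2, mu)

# The integral of an indicator function is the measure of the set.
assert integral_simple([(1, {2, 3})], mu) == mu[2] + mu[3]
```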
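1.5 Lemma. Let $\{s_n\}\uparrow \subseteq \Psi_+$ and $s \in \Psi_+$ be such that $s \le \sup\{s_n\}$. Then
$$\int s\,d\mu \le \sup\Big\{\int s_n\,d\mu\Big\}.$$
Proof. Let $s = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i}$ and let $\varepsilon > 0$ be any small number. Denote
$$B_n = \{s_n \ge (1 - \varepsilon)s\}.$$
Thus $s_n \ge s(1 - \varepsilon)\mathbf{1}_{B_n}$. By Proposition 1.3 (ii, iii),
$$\int s_n\,d\mu \ge (1 - \varepsilon) \int s\mathbf{1}_{B_n}\,d\mu.$$
By the definition of $\{s_n\}$, it follows that $\{B_n\}\uparrow \Omega$, which implies that $\{A_i \cap B_n\}\uparrow A_i$. Therefore, by continuity from below of $\mu$ (Lemma 1.6, Chapter 5),
$$\int s\,d\mu = \sum_{i=1}^{m} a_i \mu(A_i) = \sum_{i=1}^{m} a_i \lim_{n \to \infty} \mu(A_i \cap B_n) = \lim_{n \to \infty} \sum_{i=1}^{m} a_i \mu(A_i \cap B_n) = \lim_{n \to \infty} \int s\mathbf{1}_{B_n}\,d\mu.$$
The last equation is due to the relationship
$$s\mathbf{1}_{B_n} = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i \cap B_n}.$$
Thus,
$$\sup\Big\{\int s_n\,d\mu\Big\} = \lim_{n \to \infty} \int s_n\,d\mu \ge (1 - \varepsilon) \int s\,d\mu,$$
which proves the statement because the inequality holds for each $\varepsilon > 0$.
1.6 Corollary. For $\{s_n\}\uparrow, \{t_n\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$, it holds that
$$\sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}.$$
(See Problem 1.2.)
Let us now turn to the integral of the functions from the more general class $\mathcal{C}^{-1}_+ = \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}}_+)$, which we became familiar with first in Theorem 6.5, Chapter 5.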
1.7 Definition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f \in \mathcal{C}^{-1}_+$. By Theorem 6.5, Chapter 5, there is a monotone nondecreasing sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Hence, it is plausible to define
$$\int f\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\}$$
and call it the integral of (an extended real-valued nonnegative function) $f$ with respect to measure $\mu$. By Corollary 1.6, the value of the integral $\int f\,d\mu$ is unique. Analogous to Proposition 1.3 (ii, iii), we have:
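1.8 Proposition. The integral introduced in Definition 1.7 is a positive, linear, monotone nondecreasing functional on $\mathcal{C}^{-1}_+$.
Proof. Let $f, g \in \mathcal{C}^{-1}_+$ and $a, b \in \mathbb{R}_+$. Then
$$f = \sup\{s_n\}, \quad g = \sup\{t_n\}, \quad af + bg = \sup\{as_n + bt_n\}$$
yield that
$$\int (af + bg)\,d\mu = \sup\Big\{\int (as_n + bt_n)\,d\mu\Big\},$$
which, by Proposition 1.3 (ii), equals
$$a \sup\Big\{\int s_n\,d\mu\Big\} + b \sup\Big\{\int t_n\,d\mu\Big\} = a \int f\,d\mu + b \int g\,d\mu.$$
Now let $f \le g$. Then we have $\sup\{s_n\} \le \sup\{t_n\}$; hence $s_k \le \sup\{t_n\}$ for each $k$. Thus, by Lemma 1.5,
$$\int s_k\,d\mu \le \sup\Big\{\int t_n\,d\mu\Big\},$$
and finally,
$$\int f\,d\mu = \sup\Big\{\int s_k\,d\mu\Big\} \le \sup\Big\{\int t_n\,d\mu\Big\} = \int g\,d\mu.$$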
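Definition 1.7 can be illustrated numerically for $f(x) = x^2$ on $[0,1]$ with Lebesgue measure. The sketch below assumes the dyadic simple functions $s_n$ that take the value $i/2^n$ on $\{i/2^n \le f < (i+1)/2^n\}$; for this particular $f$ the level sets are intervals, so their measures are differences of square roots. The function name is hypothetical, and floating-point arithmetic is used, so the values are approximate.

```python
import math

def integral_of_dyadic_approximation(n):
    """Integral of s_n for f(x) = x^2 on [0, 1] with Lebesgue measure.

    The level set {i/2^n <= f < (i+1)/2^n} is the interval
    [sqrt(i/2^n), sqrt((i+1)/2^n)), whose measure is a difference of roots.
    The single point x = 1, where f = 1, has measure zero and is ignored.
    """
    scale = 2 ** n
    total = 0.0
    for i in range(scale):
        measure = math.sqrt((i + 1) / scale) - math.sqrt(i / scale)
        total += (i / scale) * measure
    return total

values = [integral_of_dyadic_approximation(n) for n in range(1, 11)]
```

The computed values increase with $n$ and stay within $2^{-n}$ of $1/3$, in line with the supremum in Definition 1.7 and the monotonicity in Proposition 1.8.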
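1.9 Examples.
(i) Let $\varepsilon_a$ be a point mass on a measurable space $(\Omega, \Sigma)$ for some $a \in \Omega$ and let $s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i} \in \Psi_+(\Omega, \Sigma)$ be such that $s(a) = a_{i_0}$, for some $i_0 \in \{1, \ldots, n\}$. Then
$$\int s\,d\varepsilon_a = \sum_{i=1}^{n} a_i \varepsilon_a(A_i) = a_{i_0} = s(a).$$
Now let $f \in \mathcal{C}^{-1}_+(\Omega, \Sigma)$. Then there is a sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Thus
$$\int f\,d\varepsilon_a = \sup\Big\{\int s_n\,d\varepsilon_a\Big\} = \sup\{s_n(a)\} = f(a).$$
Similarly, if $\mu = c\varepsilon_a$ (for some $c > 0$), $\int f\,d\mu = cf(a)$.
(ii) Let $\mu = \sum_{i=0}^{n} c_i \varepsilon_{a_i}$. By Problem 1.3,
$$\int f\,d\mu = \sum_{i=0}^{n} c_i f(a_i).$$
Specifically, if $a_i = i$ and $c_i = \binom{n}{i} p^i (1 - p)^{n-i}$, then $\mu$ is the binomial measure $\beta_{n,p}$. (See Example 1.8 (iii), Chapter 5.) Furthermore, if $f(x) = e^{tx}$, for $t \in \mathbb{R}$, then the transform of the binomial measure
$$\int e^{tx}\,\beta_{n,p}(dx) = \sum_{i=0}^{n} c_i e^{ti} = (1 + pe^t - p)^n$$
is a function in $t$ and is referred to as the moment generating function. In the general definition, $t$ is allowed to run the complex plane $\mathbb{C}$.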
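(iii) Let $(\Omega, \Sigma, \mu)$ be the measure space with $\Omega = [0,1]$, $\Sigma = \mathcal{B}([0,1])$, and $\mu = \lambda$ (Borel-Lebesgue measure on $\mathcal{B}([0,1])$). Let $C$ be the Cantor set and $G_n$ be the open sets of Borel-Lebesgue measure $\frac{1}{2}\left(\frac{2}{3}\right)^n$ (introduced in Example 3.11, Chapter 5). Let us define the function
We are going to evaluate the integral $\int_{[0,1]} f(x)\,\lambda(dx)$ (with respect to the Borel-Lebesgue measure). First of all, we have to identify the function $f$, which can be represented in the form $f = \sup\{s_n\}$, where
Clearly, $s_n \in \Psi_+([0,1], \mathcal{B} \cap [0,1])$ and $f(x) = \sup\{s_n(x)\}$. Thus $f \in \mathcal{C}^{-1}_+([0,1], \mathcal{B}([0,1]))$ and hence
(iv) Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega, \Sigma)$. Then $\mu = \sum_{n=1}^{\infty} \mu_n$ is a measure on $\Sigma$; and for an $A \in \Sigma$, the integral of the indicator function $\mathbf{1}_A$ is
$$\int \mathbf{1}_A\,d\mu = \mu(A) = \sum_{n=1}^{\infty} \mu_n(A) = \sum_{n=1}^{\infty} \int \mathbf{1}_A\,d\mu_n.$$
Let $s = \sum_{k=1}^{r} a_k \mathbf{1}_{A_k} \in \Psi_+(\Omega, \Sigma)$. Then
$$\int s\,d\mu = \sum_{k=1}^{r} a_k \mu(A_k) = \sum_{k=1}^{r} a_k \sum_{n=1}^{\infty} \mu_n(A_k) = \sum_{n=1}^{\infty} \sum_{k=1}^{r} a_k \mu_n(A_k) = \sum_{n=1}^{\infty} \int s\,d\mu_n.$$
Now, for $f \in \mathcal{C}^{-1}_+$, we have $f = \sup\{s_j\}$ such that $\{s_j\}\uparrow \subseteq \Psi_+$. Let
$$b_{jn} = \sum_{i=1}^{n} \int s_j\,d\mu_i.$$
Since $\{b_{jn}\}$ is monotone nondecreasing in each of the indices $j$ and $n$,
$$\sup_j \sup_n b_{jn} = \sup_n \sup_j b_{jn},$$
which yields that
$$\sup_j \sum_{n=1}^{\infty} \int s_j\,d\mu_n = \sum_{n=1}^{\infty} \sup_j \int s_j\,d\mu_n.$$
Therefore,
$$\int f\,d\mu = \sup_j \int s_j\,d\mu = \sup_j \sum_{n=1}^{\infty} \int s_j\,d\mu_n = \sum_{n=1}^{\infty} \int f\,d\mu_n.$$
Thus we showed that
$$\int f\,d\Big(\sum_{n=1}^{\infty} \mu_n\Big) = \sum_{n=1}^{\infty} \int f\,d\mu_n.$$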
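For Example 1.9 (ii), the integral with respect to a finite atomic measure and the closed form of the binomial moment generating function can be compared numerically. This is a sketch with hypothetical names (`integral_discrete`, `binomial_mgf`); it assumes atoms $a_i = i$ and weights $c_i = \binom{n}{i} p^i (1-p)^{n-i}$.

```python
from math import comb, exp, isclose

def integral_discrete(f, weights, atoms):
    """Integral of f with respect to mu = sum_i c_i * eps_{a_i},
    i.e. the finite sum sum_i c_i * f(a_i)."""
    return sum(c * f(a) for c, a in zip(weights, atoms))

def binomial_mgf(n, p, t):
    """Integral of exp(t x) with respect to the binomial measure on {0,...,n}."""
    atoms = range(n + 1)
    weights = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in atoms]
    return integral_discrete(lambda x: exp(t * x), weights, atoms)

n, p, t = 5, 0.3, 0.7
closed_form = (1 + p * exp(t) - p) ** n
assert isclose(binomial_mgf(n, p, t), closed_form, rel_tol=1e-12)
```

At $t = 0$ the integrand is identically $1$, so the integral returns the total mass of the measure, which is $1$ for the binomial measure.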
Now we further enlarge the class of integrable functions by considering arbitrary extended real-valued measurable functions of $\mathcal{C}^{-1}(\Omega, \Sigma)$. For each $f \in \mathcal{C}^{-1}$ and $0$, being the function identically equal to zero on $\Omega$, denote
$$f^+ = \sup\{f, 0\} \quad \text{and} \quad f^- = -\inf\{f, 0\} = (-f)^+$$
(cf. Definition 7.7, Chapter 1). Clearly (see also Problem 7.16, Chapter 1),
$$f = f^+ - f^- \quad \text{and} \quad |f| = f^+ + f^-.$$
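By Proposition 6.6, Chapter 5, $f^+$ and $f^-$ are also elements of $\mathcal{C}^{-1}$ (more precisely, elements of $\mathcal{C}_+^{-1}$) if and only if $f \in \mathcal{C}^{-1}$.

1.10 Definitions. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$).

(i) If at least one of the integrals, $\int f^+ d\mu$ or $\int f^- d\mu$, is finite, we say that the integral of $f$ with respect to measure $\mu$ exists and denote this integral by
$$\int f\,d\mu = \int f^+ d\mu - \int f^- d\mu. \tag{1.10}$$
We also denote
$$L(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{ f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}) : \int f\,d\mu \text{ exists} \Big\}. \tag{1.10a}$$
If both of the integrals of the functions $f^+$ and $f^-$ are finite, we say that the function $f$ is $\mu$-integrable and again denote the integral of $f$ by formula (1.10). The subset of $\mathcal{C}^{-1}$ of all $\mu$-integrable functions is denoted by $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, i.e.
$$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{ f \in \mathcal{C}^{-1}(\Omega,\Sigma) : \int f^+ d\mu < \infty \text{ and } \int f^- d\mu < \infty \Big\}. \tag{1.10b}$$
Note that
$$\int |f|\,d\mu = \int f^+ d\mu + \int f^- d\mu, \tag{1.10c}$$
which is due to $|f| = f^+ + f^-$ and Proposition 1.8. In light of (1.10c), (1.10b) can be rewritten as
$$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{ f \in \mathcal{C}^{-1}(\Omega,\Sigma) : \int |f|\,d\mu < \infty \Big\}.$$
If a measure space is specified, the notation $f \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ or $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ will be shortened to $f \in L(\mu)$ or $f \in L^1(\mu)$.

(ii) If $\Omega = \mathbb{R}^n$, $\Sigma = \mathcal{B}$, and $\mu$ is the Borel-Lebesgue measure $\lambda$, and if the integral of the function $f$ in (1.10) exists, it is called the Lebesgue integral of $f$. If $f$ is $\lambda$-integrable, we write $f \in L^1(\lambda)$.

(iii) If $\Omega = \mathbb{R}$, $\Sigma = \mathcal{B}$ and $\mu = \mu_F$ (a Borel-Lebesgue-Stieltjes measure induced by an extended distribution function $F$), and if $g \in L(\Omega,\Sigma,\mu_F;\overline{\mathbb{R}})$, then the integral in (1.10) is called the Lebesgue-Stieltjes integral of $g$; and we will write $g \in L^1(\mu_F)$ if $g$ is $\mu_F$-integrable.

(iv) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the space of all extended real-valued random variables on a probability space $(\Omega,\Sigma,\mathbb{P})$. From Example 4.5 (i), Chapter 5, we recall that for any random variable $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$ on $(\Omega,\Sigma,\mathbb{P})$, the image measure $\mathbb{P}X^*$ is the probability distribution of $X$. If $X \in L^1(\Omega,\Sigma,\mathbb{P};\overline{\mathbb{R}})$, then the numeric value $\int X\,d\mathbb{P}$ is called the expectation of the random variable $X$, in notation, $\mathbb{E}[X]$. Observe that $\mathbb{E}[X]$ makes sense only if $X$ is $\mathbb{P}$-integrable, i.e., if $\int |X|\,d\mathbb{P} < \infty$. [It is now becoming clear why in textbooks on probability, the expectation $\mathbb{E}[X]$ is defined only when $\mathbb{E}[|X|] < \infty$.] □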
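As an illustration of Definition 1.10, the following sketch computes $\int f\,d\mu$ through $f^+$ and $f^-$ for a purely atomic measure $\mu = \sum_i w_i \varepsilon_{a_i}$ with finitely many atoms. The dictionary representation of the measure and all function names are assumptions made for this example, not notation from the text; the convention $\infty \cdot 0 = 0$ is built into the helper `times`.

```python
import math

def times(weight: float, value: float) -> float:
    """Product with the measure-theoretic convention inf * 0 = 0."""
    return 0.0 if weight == 0 or value == 0 else weight * value

def parts(f, atoms):
    """Return (integral of f+, integral of f-) for the measure sum_a atoms[a] * eps_a."""
    pos = sum(times(w, max(f(a), 0.0)) for a, w in atoms.items())
    neg = sum(times(w, max(-f(a), 0.0)) for a, w in atoms.items())
    return pos, neg

def integral(f, atoms):
    """Integral in the sense of (1.10); None if both parts are infinite."""
    pos, neg = parts(f, atoms)
    if math.isinf(pos) and math.isinf(neg):
        return None  # the integral does not exist
    return pos - neg

def is_integrable(f, atoms) -> bool:
    """True if f is integrable, i.e. both parts are finite."""
    pos, neg = parts(f, atoms)
    return math.isfinite(pos) and math.isfinite(neg)

if __name__ == "__main__":
    prob = {0: 0.5, 1: 0.25, 2: 0.25}          # a probability measure on {0, 1, 2}
    print(integral(lambda w: w - 1, prob))     # expectation of X(w) = w - 1
    heavy = {0: math.inf, 1: math.inf}         # two atoms of infinite mass
    print(integral(lambda a: 1.0 if a == 0 else -1.0, heavy))
```

With the probability measure above, the integral is the expectation of the random variable $X(\omega) = \omega - 1$; with the two infinite atoms, both $\int f^+ d\mu$ and $\int f^- d\mu$ are infinite, so the integral does not exist.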
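1.11 Proposition. The integral is a linear, monotone nondecreasing functional on the space $L^1(\Omega,\Sigma,\mu)$. □

(See Problem 1.6.)

1.12 Proposition.

(i) $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is a vector lattice over $\mathbb{R}$, i.e. for every pair $f, g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, $\sup\{f,g\}$ and $\inf\{f,g\}$ are elements of $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

(ii) If $f \in L^1$, then $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.

Proof. (i) $|\sup\{f,g\}| \le |f| + |g|$ and $|\inf\{f,g\}| \le |f| + |g|$. The statement is now due to Problems 1.7 and 1.8.

(ii) Obviously, $|f| \ge f$ and $|f| \ge -f$. Thus, by Proposition 1.11,
$$\int |f|\,d\mu \ge \int f\,d\mu \quad\text{and}\quad \int |f|\,d\mu \ge -\int f\,d\mu,$$
and hence
$$\int |f|\,d\mu \ge \left|\int f\,d\mu\right|.$$

1.13 Notations. Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $A \in \Sigma$. Then, we denote
$$\int_A f\,d\mu = \int f 1_A\,d\mu.$$
Specifically, it follows that
$$\int_\Omega f\,d\mu = \int f\,d\mu.$$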
Now we will need the notion of "properties that hold almost everywhere."
1.14 Definitions and Remarks. (i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A property $\Pi$ (of points of $\Omega$) is said to hold almost everywhere (a.e.) or $\mu$-almost everywhere ($\mu$-a.e.) if there is a ($\mu$-null) set $N \in \mathcal{N}_\mu$ (see Definition 2.5 (i), Chapter 5) such that $\Pi$ holds for all points of $N^c$. Notice that this definition does not preclude property $\Pi$ from holding on $N$ or on a subset of $N$. It merely says that $\Pi$ can fail only on a subset of the negligible set $N$.

(ii) Two measurable functions $f$ and $g$ are said to be equal ($\mu$-)a.e. if $f = g$ on the complement of a $\mu$-null set $N$. Observe that $\{f \ne g\} \subseteq N$. Recall that, by Problem 5.2 (iii), Chapter 5, the set $\{f \ne g\}$ is measurable. Therefore, if $f = g$ a.e., then the set $\{f \ne g\} \in \mathcal{N}_\mu$, i.e., it is $\mu$-null.

(iii) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the set of all measurable functions on $\Omega$ and let $\mu$ be a measure on $\Sigma$. Let $[f]_\mu$ denote the set of all measurable functions that are equal to $f$ $\mu$-a.e. on $\Omega$. Specifically, $[0]_\mu$ denotes the set of all measurable functions which equal zero $\mu$-a.e. on $\Omega$. Clearly, the $\mu$-almost everywhere equality of functions induces an equivalence relation (say $E$) on the set $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. The corresponding quotient set $\{[f]_\mu : f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\}$ is called the quotient set modulo $\mu$. In light of these considerations, any two functions $f$ and $g$ such that $f = g$ $\mu$-a.e. on $\Omega$ are also said to be equal modulo $\mu$ and we will write $f = g \pmod{\mu}$, or $f \in [g]_\mu$, or equivalently, $f - g \in [0]_\mu$. □
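1.15 Lemma. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Then $\int f\,d\mu = 0$ if and only if $f \in [0]_\mu$.

Proof. Denote $N = \{f > 0\}$ (which is an element of $\Sigma$).

(i) Let $f \in [0]_\mu$. Then $N \in \mathcal{N}_\mu$. Let $s_n = n 1_N$ ($\in \Psi_+$), $n = 1,2,\dots$. Therefore, $\int s_n\,d\mu = n\mu(N) = 0$, for all $n$. Denote $s = \sup\{s_n\}$. Then, by Theorem 6.5, Chapter 5, $s \in \mathcal{C}_+^{-1}$ and
$$\int s\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\} = 0.$$
Finally, $f = s_n = 0$ on $N^c$. While $f$ is arbitrary on $N$ and, in particular, not necessarily $\infty$, we have that $s_n \uparrow \infty$ on $N$. Consequently, $f \le s$ on $\Omega$ which, by monotonicity (Proposition 1.8), yields
$$0 \le \int f\,d\mu \le \int s\,d\mu = 0$$
and hence $\int f\,d\mu = 0$.

(ii) Now let $\int f\,d\mu = 0$. Denote
$$N_n = \{ f \ge \tfrac{1}{n} \}, \quad n = 1,2,\dots.$$
Obviously, $N_n \in \Sigma$ and $N_n \uparrow N$, where
$$N = \bigcup_{n=1}^{\infty} N_n = \{f > 0\}.$$
By continuity from below of $\mu$,
$$\lim_{n\to\infty}\mu(N_n) = \mu(N). \tag{1.15}$$
Clearly, $f \ge \frac{1}{n} 1_{N_n}$. Again, by monotonicity (Proposition 1.8), we have that
$$0 = \int f\,d\mu \ge \int \tfrac{1}{n} 1_{N_n}\,d\mu = \tfrac{1}{n}\mu(N_n),$$
which leads to $\mu(N_n) = 0$, $n = 1,2,\dots$. From (1.15) it follows that $\mu(N) = 0$ and hence $N \in \mathcal{N}_\mu$. Therefore, $f \in [0]_\mu$. □

1.16 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f = g \pmod{\mu}$. Then
$$\int f\,d\mu = \int g\,d\mu.$$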
Proof. By Problem 5.2 (iii), Chapter 5, we have that $N = \{f \ne g\} \in \Sigma$. Therefore, by the above assumption regarding $f$ and $g$, $N \in \mathcal{N}_\mu$ and the functions $f 1_N$ and $g 1_N$ are elements of $[0]_\mu$. By Lemma 1.15, it follows that
$$\int f 1_N\,d\mu = \int g 1_N\,d\mu = 0.$$
On the other hand, if $A = N^c$, then
$$\int f\,d\mu = \int f 1_A\,d\mu + \int f 1_N\,d\mu = \int f 1_A\,d\mu.$$
Similarly,
$$\int g\,d\mu = \int g 1_A\,d\mu.$$
The statement follows from $f 1_A(\omega) = g 1_A(\omega)$, $\forall \omega \in \Omega$. Indeed, on the set $N$, $f 1_A = g 1_A = 0$, while on $N^c$ we have that $f = g$. □
1.17 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $|f| \le g$ a.e. Then $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ implies that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

Proof. Let $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then by Proposition 6.6, Chapter 5, we have that
$$g' = \sup\{g, |f|\} \in \mathcal{C}^{-1}.$$
Clearly, $|f| \le g'$ everywhere and $g' = g \pmod{\mu}$ (show it), and by Problem 1.17, $g' \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then, by Problem 1.8, $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. □

1.18 Proposition. Let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ and $f$ or $g \in L^1(\Omega,\Sigma,\mu)$. Then
$$\int_A f\,d\mu = \int_A g\,d\mu, \text{ for each } A \in \Sigma, \tag{1.18}$$
yields that $f = g \pmod{\mu}$.

(See Problem 1.27.) Theorem 1.19 and Corollary 1.20 modify and, to some extent, refine Proposition 1.18.
1.19 Theorem. If $\mu$ is $\sigma$-finite, $f, g \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and
$$\int_A f\,d\mu \le \int_A g\,d\mu, \text{ for each } A \in \Sigma, \tag{1.19}$$
then $f \le g$ $\mu$-a.e. on $\Omega$.
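Proof. a) Let $\mu$ be finite. Denote
$$A_n = \{ f \ge g + \tfrac{1}{n},\ |g| \le n \}, \quad n = 1,2,\dots.$$
Then, since by our assumption, $\int_A f\,d\mu \le \int_A g\,d\mu$ for each $A \in \Sigma$, we have
$$L := \int_{A_n} g\,d\mu \ge \int_{A_n} f\,d\mu \ge \int_{A_n}\big(g + \tfrac{1}{n}\big)\,d\mu = M := \int_{A_n} g\,d\mu + \tfrac{1}{n}\mu(A_n). \tag{1.19a}$$
On the other hand,
$$\Big|\int_{A_n} g\,d\mu\Big| \le n\,\mu(A_n) < \infty.$$
Therefore, from (1.19a) and because $\int_{A_n} g\,d\mu$ is finite, $L \ge M$, which yields that $\mu(A_n) = 0$, for each $n$. Thus,
$$\mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = 0.$$
On the other hand, from
$$\bigcup_{n=1}^{\infty} A_n = \{f > g\} \cap \{g \text{ is finite}\}$$
we conclude that $\mu\{f > g : g \text{ is finite}\} = 0$. Hence, it remains to consider the set on which $g = -\infty$. Letting $B_n = \{g = -\infty,\ f \ge -n\}$ we have
$$-\infty\cdot\mu(B_n) = \int_{B_n} g\,d\mu \ge \int_{B_n} f\,d\mu \ge -n\,\mu(B_n)$$
and therefore,
$$-\infty\cdot\mu(B_n) \ge -n\,\mu(B_n)$$
or, equivalently, $n\,\mu(B_n) \ge \infty\cdot\mu(B_n)$. This holds true if and only if $\mu(B_n) = 0$ (as a consequence of the agreement that $\infty\cdot 0 = 0$). Thus,
$$\mu\{f > g,\ g = -\infty\} = \mu\Big(\bigcup_{n=1}^{\infty} B_n\Big) = 0.$$
In summary, we proved that $\mu\{f > g\} = 0$, which implies that $f \le g$ $\mu$-a.e. on $\Omega$.

b) Now, let $\mu$ be $\sigma$-finite, let $\{\Omega_n\}$ be a measurable partition of $\Omega$ with $\mu(\Omega_n) < \infty$, and let $\mu_n = \mathrm{Res}_{\Omega_n}\mu$. Then
$$\int_A f\,d\mu_n \le \int_A g\,d\mu_n, \text{ for each } A \in \Sigma,$$
and hence $f \le g$ $\mu$-a.e. on $\Omega_n$. The rest of this case is obvious. □

The reader can easily conclude the following.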
1.20 Corollary. If $\mu$ is $\sigma$-finite, $f, g \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and
$$\int_A f\,d\mu = \int_A g\,d\mu, \text{ for each } A \in \Sigma, \tag{1.20}$$
then $f = g$ $\mu$-a.e. on $\Omega$.

(For a pertinent discussion, see Problem 1.28.) Finally, we would like to formulate the proposition below that will be often cited in the sequel and whose proof we assign to the reader as Problem 1.19.

1.21 Proposition. Each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$. □
PROBLEMS

1.1 Prove Proposition 1.3.

1.2 Prove Corollary 1.6, i.e., for $\{s_n\}\uparrow, \{t_n\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$ it holds that
$$\sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}.$$
[Hint: Use the fact that $s_k \le \sup\{t_n\}$ and $t_k \le \sup\{s_n\}$.]

1.3 Show that for $\mu = \sum_{i=0}^{n} c_i\varepsilon_{a_i}$, the corresponding value of the integral of any bounded measurable function $f$ is
$$\int f\,d\mu = \sum_{i=0}^{n} c_i f(a_i).$$

1.4 Let $\pi_\lambda$ be a Poisson measure and let $f \in \mathcal{C}_+^{-1}(\mathbb{R},\mathcal{B};\overline{\mathbb{R}})$. Show that

1.5 Under the condition of Problem 1.4 assume that
and find in each case the integral of $f$ with respect to measure $\pi_\lambda$.

1.6 Prove Proposition 1.11.

1.7 Let $Q$ be a non-Borel subset of $\mathbb{R}$ (such as one in Problem 3.16, Chapter 5) and let $C$ denote the Cantor ternary set. Define the function
Is $f$ Lebesgue measurable, i.e. $f \in \mathcal{C}^{-1}(\mathbb{R}, L^*, \lambda)$?

1.8 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if there exists $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ such that $|f| \le g$.

1.9 Show that $L^1$ is a linear space over $\mathbb{R}$.

1.10 Show that
Show that { ~ l ( n , z , ~ , ; R )a: E 0) = (L'(R,c,E,;R):
a E R).
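1.12 Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $g: \Omega \to \overline{\mathbb{R}}$ is an extended real-valued function. Show that if $g = f \pmod{\mu}$, then $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. [Hint: Show that $\{g < c\} \in \Sigma$, $\forall c \in \mathbb{R}$.]

1.13 Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $\lim f_n$ exists and $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$, where $f$ is an extended real-valued function. Show that $f \in \mathcal{C}^{-1}$.

1.14 Prove that $f = g \pmod{\mu}$ if and only if $f^+ = g^+ \pmod{\mu}$ and $f^- = g^- \pmod{\mu}$.

1.15 Show that $f \in [0]_\mu$ if and only if $f^+, f^- \in [0]_\mu$.

1.16 Show that if $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ then $f \in [0]_\mu$ yields that $\int f\,d\mu = 0$. Does the converse hold true?

1.17 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f - g \in [0]_\mu$. Show that $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ and that $\int f\,d\mu = \int g\,d\mu$.

1.18 Generalize Proposition 1.16 assuming that $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and that $f \in L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ (i.e., that $\int f\,d\mu$ exists).

1.19 Show that each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$. [Hint: Let $A = \{|f| = \infty\}$. Show that $a\mu(A) < \infty$, $\forall a \in \mathbb{R}_+$, and then show that $\lim_{n\to\infty} n\mu(A) < \infty$ implies that $\mu(A) = 0$.]

1.20 Show that for $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, $\int_A f\,d\mu = 0$ for each $A \in \Sigma$ if and only if $f \in [0]_\mu$.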
Show by a counterexample that L' is not an %-space.
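1.22 Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if for each $\varepsilon > 0$, there is a function $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ such that

1.23 Let $(\Omega,\Sigma,\mu)$ be a measure space and $c > 0$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,
$$\int f\,d(c\mu) = c\int f\,d\mu.$$

1.24 Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$, $\{c_n\}$ be a sequence of positive real numbers, and let $\mu = \sum_{n=1}^{\infty} c_n\mu_n$, which is a measure on $(\Omega,\Sigma)$. Show that for every $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,
$$\int f\,d\mu = \sum_{n=1}^{\infty} c_n\int f\,d\mu_n.$$

1.25 Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) \cap L^1(\Omega,\Sigma,\nu;\overline{\mathbb{R}})$, the integral $\int f\,d(\nu - \mu)$ makes sense, $f \in L^1(\Omega,\Sigma,\nu-\mu;\overline{\mathbb{R}})$, and that
$$\int f\,d(\nu - \mu) = \int f\,d\nu - \int f\,d\mu.$$

1.26 Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$, $\int f\,d\mu \le \int f\,d\nu$.

1.27 Prove Proposition 1.18, i.e., show that if $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $f$ or $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, then
$$\int_A f\,d\mu = \int_A g\,d\mu \text{ for each } A \in \Sigma \tag{P1.27}$$
yields that $f = g \pmod{\mu}$.

1.28 Show by a counterexample that dropping the condition $f$ or $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ in Problem 1.27 need not yield $f = g \pmod{\mu}$ even if $f$ and $g$ are nonnegative.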
NEW TERMS: integral of a nonnegative simple function; Dirichlet function; integral of an extended nonnegative function; moment generating function; integral of an extended real-valued function; $\mu$-integrable function; $L(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$-space; $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$-space; Lebesgue integral; Lebesgue-Stieltjes integral; expectation of a random variable; property that holds almost everywhere ($\mu$-a.e.); equality of functions modulo $\mu$; $[0]_\mu$ set; quotient set modulo $\mu$; $[f]_\mu$ class.
2. MAIN CONVERGENCE THEOREMS

The following result is one of the basic convergence theorems, a special case of which (Corollary 2.2) was originally proved by the Italian mathematician Beppo Levi (1875-1961).

2.1 Theorem (of Monotone Convergence). Let $\{f_n\}\uparrow \subseteq \mathcal{C}_+^{-1}$. Then
$$\int \sup\{f_n\}\,d\mu = \sup\Big\{\int f_n\,d\mu\Big\}. \tag{2.1}$$

Proof. Let $f = \sup\{f_n\}$. Then, by Proposition 5.6 (iv), Chapter 5, $\sup\{f_n\} \in \mathcal{C}_+^{-1}$. Thus, the integral on the left-hand side of (2.1) makes sense. On the other hand, for each element $f_n \in \mathcal{C}_+^{-1}$, there exists a monotone nondecreasing sequence of nonnegative simple functions $\{s_k^{(n)}\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_k^{(n)} : k = 1,2,\dots\} = f_n$. Let
$$t_k = \sup\{s_k^{(1)},\dots,s_k^{(k)}\}.$$
Since $\Psi_+$ is a lattice, it follows that $t_k \in \Psi_+$, $k = 1,2,\dots$. Furthermore, $\{t_k\}$ is monotone nondecreasing. Since $\{f_n\}$ is monotone nondecreasing, we have
$$s_k^{(1)} \le f_1 \le f_k,\quad s_k^{(2)} \le f_2 \le f_k,\ \dots,\ s_k^{(k)} \le f_k,$$
and hence $s_k^{(i)} \le f_k$, $i = 1,\dots,k$, which leads to
$$t_k \le f_k \tag{2.1a}$$
and
$$\sup\{t_k\} \le f. \tag{2.1b}$$
On the other hand, $t_k \ge s_k^{(n)}$ for $k \ge n$; this yields
$$\sup\{t_k\} \ge \sup\{s_k^{(n)} : k = 1,2,\dots\} = f_n$$
and, consequently,
$$\sup\{f_n\} = f \le \sup\{t_k\}. \tag{2.1c}$$
Thus, by (2.1b) and (2.1c),
$$f = \sup\{t_k\}.$$
Now the facts that $f = \sup\{t_k\}$ and that $\{t_k\}$ is monotone nondecreasing imply that
$$\int f\,d\mu = \sup\Big\{\int t_k\,d\mu\Big\}.$$
Since $t_k \le f_k$ by (2.1a), we have by Proposition 1.8
$$\int t_k\,d\mu \le \int f_k\,d\mu$$
which yields
$$\int f\,d\mu = \sup\Big\{\int t_k\,d\mu\Big\} \le \sup\Big\{\int f_k\,d\mu\Big\}.$$
Finally, the inverse inequality holds due to $f_n \le f$ and Proposition 1.8. □

2.2 Corollary (Beppo Levi). Let $\{f_n\} \subseteq \mathcal{C}_+^{-1}$. Then
$$\int \sum_{n=1}^{\infty} f_n\,d\mu = \sum_{n=1}^{\infty}\int f_n\,d\mu.$$
(See Problem 2.1.)

The Monotone Convergence Theorem can be generalized for an arbitrary monotone sequence under a minor constraint.
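2.3 Theorem (Generalized Monotone Convergence Theorem). Let $\{f_n\}\uparrow \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f_n \ge g$ for all $n$ and $\int g\,d\mu > -\infty$. Then,
$$\sup\Big\{\int f_n\,d\mu\Big\} = \int \sup\{f_n\}\,d\mu.$$
(See Problem 2.2.)

2.4 Lemma (Fatou). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then
$$\int \underline{\lim}\, f_n\,d\mu \le \underline{\lim}\int f_n\,d\mu.$$

Proof. By Proposition 5.6 (v), Chapter 5, $\underline{\lim}\, f_n \in \mathcal{C}_+^{-1}$ and by Proposition 5.6 (iv), Chapter 5,
$$g_n = \inf\{f_k : k \ge n\} \in \mathcal{C}_+^{-1}.$$
Clearly, the sequence $\{g_n\}$ is monotone nondecreasing and hence $\sup\{g_n\} = \underline{\lim}\, f_n$, and $g_n \le f_k$, for all $k \ge n$. By monotonicity of the integral,
$$\int g_n\,d\mu \le \int f_k\,d\mu, \quad k \ge n,$$
which implies that
$$\int g_n\,d\mu \le \inf\Big\{\int f_k\,d\mu : k \ge n\Big\}.$$
Finally, by the Monotone Convergence Theorem,
$$\int \underline{\lim}\, f_n\,d\mu = \sup\Big\{\int g_n\,d\mu\Big\} \le \sup_n \inf_{k\ge n}\int f_k\,d\mu = \underline{\lim}\int f_n\,d\mu. \qquad\Box$$

2.5 Definition. Let $f, \{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. The sequence $\{f_n\}$ is said to converge to $f$ in mean if
$$\lim_{n\to\infty}\int |f - f_n|\,d\mu = 0.$$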
We now formulate and prove one of the central results in the theory of integration. As with the Monotone Convergence Theorem, the following theorem enables us to interchange the limit and the integral for a pointwise convergent sequence of functions. However, it does not require that the sequence be monotone nondecreasing and nonnegative. On the other hand, the sequence needs an integrable dominating function, and thus it is not a generalization of the Monotone Convergence Theorem.

2.6 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ be a (pointwise) a.e. convergent sequence. Suppose that there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that $|f_n| \le g$, $n = 1,2,\dots$. Then the following are true.

(i) There exists at least one function $f \in \mathcal{C}^{-1}$, such that $|f| < \infty$, to which the sequence $\{f_n\}$ converges a.e. in the topology of pointwise convergence.

(ii) $f \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$ and $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

(iii) The sequence $\{f_n\}$ converges to $f$ in mean, i.e.,
$$\lim_{n\to\infty}\int |f - f_n|\,d\mu = 0.$$

(iv) $\lim_{n\to\infty}\int f_n\,d\mu = \int f\,d\mu$.
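Proof.

(i) By our assumption, there is a negligible set $\Pi$ such that $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in \Pi^c$ and there is a $\mu$-null set $N_1 \supseteq \Pi$. Therefore, $N_1^c \subseteq \Pi^c$ and $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in N_1^c$. Since $g \in L^1(\Omega,\Sigma,\mu)$, by Proposition 1.21, it follows that $g$ is finite $\mu$-a.e. on $\Omega$, i.e. there is a $\mu$-null set $N_2$ such that $g(\omega) < \infty$ for all $\omega \in N_2^c$. Define the function
$$f(\omega) = \begin{cases} \lim_{n\to\infty} f_n(\omega), & \omega \in A, \\ 0, & \omega \in A^c, \end{cases} \tag{2.6}$$
where $A = (N_1 \cup N_2)^c$. Clearly, $f_n$ converges to $f$ pointwise $\mu$-a.e. on $\Omega$ and hence, by Proposition 5.6 (iii) and (vi), Chapter 5, $f \in \mathcal{C}^{-1}$. Indeed, since $f_n$ and $1_A \in \mathcal{C}^{-1}$, it follows that $f_n 1_A \in \mathcal{C}^{-1}$ and that $f_n 1_A \to f$ in the topology of pointwise convergence; the latter implies that $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$.

(ii) From (2.6) it follows that on the set $A$, $\lim_{n\to\infty} f_n = f$; in addition, $\{f_n\}$ is dominated by a finite function $g$ on $A$. Thus, $|f| \le g$ on $A$ and, due to (2.6), $f = 0$ on $A^c$. Hence,
$$|f| \le g \text{ on } \Omega \quad\text{and}\quad |f| < \infty.$$
By Proposition 1.17 and since $|f| < \infty$, $f \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$. Also by Proposition 1.17, $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu)$.

(iii) We prove that $f_n$ is convergent in mean to $f$, i.e.,
$$\lim_{n\to\infty}\int |f - f_n|\,d\mu = 0.$$
Let $g_n = |f - f_n|$ ($\in \mathcal{C}_+^{-1}(\Omega,\Sigma)$, why?). Then,
$$0 \le g_n \le |f| + g.$$
Since $|f| + g \in L^1(\Omega,\Sigma,\mu)$, it follows that $g_n \in L^1(\Omega,\Sigma,\mu)$, again by Problem 1.8. [Observe that since linearity of the integral holds just on $L^1$, we do need to show that $g_n \in L^1$, which would lead to
$$\int (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu - \int g_n\,d\mu.]$$
Applying Fatou's lemma to the sequence $\{|f| + g - g_n\}$, we have:
$$\int \underline{\lim}\,(|f| + g - g_n)\,d\mu \le \underline{\lim}\int (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu - \overline{\lim}\int g_n\,d\mu. \tag{2.6a}$$
Since $f_n \to f$ a.e., then $g_n \to 0$ a.e., and hence $|f| + g - g_n \to |f| + g$ a.e., which implies that
$$\underline{\lim}\,(|f| + g - g_n) = |f| + g \quad \text{a.e.}$$
By Proposition 1.16,
$$\int \underline{\lim}\,(|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu,$$
which, together with inequality (2.6a), yields
$$\int (|f| + g)\,d\mu \le \int (|f| + g)\,d\mu - \overline{\lim}\int g_n\,d\mu$$
or, equivalently,
$$\overline{\lim}\int g_n\,d\mu \le 0. \tag{2.6b}$$
Because $g_n \ge 0$, (2.6b) reveals that
$$\lim_{n\to\infty}\int g_n\,d\mu = 0$$
and thus $\lim_{n\to\infty}\int |f - f_n|\,d\mu = 0$, which proves (iii). Now (iv) follows from Problem 2.6. □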
2.7 Examples. (i) We evaluate
$$\lim_{n\to\infty}\int_0^1 nx(1-x)^n\,dx.$$
First observe that the sequence $\{nx(1-x)^n\}$ is convergent to the function $0$ pointwise on $[0,1]$. However, it is an easy exercise to show that the sequence $\{nx(1-x)^n\}$ does not converge to $0$ uniformly. Otherwise, we could interchange the limit and the integral. (See Problem 3.12 of the next section.) Fortunately, the functions $nx(1-x)^n$ are uniformly bounded by 1. Therefore, the function 1 can be taken as a pertinent integrable majorant function in the Lebesgue Dominated Convergence Theorem. This enables us to interchange the limit and the integral and conclude that
$$\lim_{n\to\infty}\int_0^1 nx(1-x)^n\,dx = 0.$$
(We can verify this result by direct computation of the integral
$$\int_0^1 nx(1-x)^n\,dx = \frac{n}{(n+1)(n+2)}$$
and then passing to the limit.)

(ii) Calculate
$$\lim_{n\to\infty}\int_0^n \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,\lambda(dx).$$
Clearly,
$$\Big(1 + \frac{x}{n}\Big)^n 1_{[0,n]}(x)\,e^{-2x} \le e^{x}e^{-2x} = e^{-x}, \quad x \ge 0.$$
Hence, by the Lebesgue Dominated Convergence Theorem,
$$\lim_{n\to\infty}\int_0^n \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,\lambda(dx) = \int \lim_{n\to\infty}\Big(1 + \frac{x}{n}\Big)^n 1_{[0,n]}(x)\,e^{-2x}\,\lambda(dx) = \int_0^\infty e^{-x}\,\lambda(dx) = 1.$$
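Both limits in Example 2.7 can be sanity-checked numerically. The following is only a rough sketch: it uses a plain midpoint rule (an assumption of this illustration, not a method from the text), and the helper names are hypothetical.

```python
import math

def midpoint(f, a: float, b: float, m: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))

def example_i(n: int) -> float:
    """Approximates the integral of n*x*(1-x)^n over [0, 1]."""
    return midpoint(lambda x: n * x * (1 - x) ** n, 0.0, 1.0)

def example_i_exact(n: int) -> float:
    """Closed form n / ((n+1)(n+2)) of the same integral."""
    return n / ((n + 1) * (n + 2))

def example_ii(n: int) -> float:
    """Approximates the integral of (1 + x/n)^n * exp(-2x) over [0, n]."""
    return midpoint(lambda x: (1 + x / n) ** n * math.exp(-2 * x), 0.0, float(n))

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, example_i(n), example_i_exact(n), example_ii(n))
```

As $n$ grows, the first two columns shrink toward 0 while the last column approaches 1 from below, in agreement with the dominated-convergence argument.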
2.8 Remark. Note that we treated $\int_0^1 nx(1-x)^n\,dx$ in Example 2.7 (i) informally both as a Lebesgue (L) and a Riemann (R) integral (since they are identical in this case), although the formal relationship between the two will be developed and discussed in Section 3. The same applies to Example 2.7 (ii). In Problems 2.9-2.11 we will also assume that the Lebesgue integrals are equal to Riemann integrals.

Another useful application of Lebesgue's Dominated Convergence Theorem 2.6 leads to the possibility of interchanging the derivative and the integral whenever we need to differentiate a function under the integral sign. The only obstacle in using Theorem 2.6 is that it is formulated for sequences, while a derivative is defined as a limit along nets or filters. To overcome this predicament we will utilize the arguments of Example 9.7 (ii), Chapter 3, where the limit of a function, originally introduced along a filter base, reduces to the topological limit along countable neighborhood bases whenever we deal with first countable spaces (which we frequently do when derivatives are taken in metric spaces and, in particular, in Euclidean spaces). This enables us to treat derivatives as limits along sequences (as was pointed out in that example) and finally apply the Lebesgue Dominated Convergence Theorem. This is the subject of Theorem 2.9, which the reader should be able to prove. (See Problem 2.14.)
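2.9 Theorem. Let $f \in \mathcal{C}^{-1}(\Omega\times[a,b],\Sigma';\mathbb{R})$ ($a < b \in \mathbb{R}$) be a Borel measurable function and for each $t \in [a,b]$, $f(\cdot,t) \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$.

(i) If there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that $|f(\omega,t)| \le g(\omega)$, $t \in [a,b]$, $\omega \in \Omega$, and if the function $t \mapsto f(\cdot,t)$ is continuous at some $\xi \in [a,b]$ uniformly for all $\omega$, then the integral of parameter
$$I(t) = \int f(\omega,t)\,\mu(d\omega)$$
is continuous at $\xi$, i.e. $\lim_{t\to\xi} I(t) = I(\xi)$. (In other words, the limit and integral are interchangeable.)

(ii) If the partial derivative $\frac{\partial f}{\partial t}(\omega,t)$ exists and there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that
$$\Big|\frac{\partial f}{\partial t}(\omega,t)\Big| \le g(\omega), \quad t \in [a,b],\ \omega \in \Omega,$$
then $I$ is differentiable and
$$\frac{dI}{dt}(t) = \int \frac{\partial f}{\partial t}(\omega,t)\,\mu(d\omega).$$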
The following are analogs of the main convergence theorems (Monotone Convergence Theorem, Fatou's Lemma, and Lebesgue's Dominated Convergence Theorem) for measures, which are often needed in probability and control theory. The theorems are essentially based on the recent results of Onésimo Hernández-Lerma and Jean B. Lasserre [2000], which are established under weaker conditions than in previous texts and papers.
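2.10 Lemma (Fatou). Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$ and $\{\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on $\Sigma$ such that for each $A \in \Sigma$, $\underline{\lim}\,\mu_n(A) \ge \mu(A)$. Then
$$\int f\,d\mu \le \underline{\lim}\int f\,d\mu_n.$$

Proof. Let $\{s_k\}\uparrow \subseteq \Psi_+(\Omega,\Sigma)$ such that $s_k \uparrow f$ and
$$s_k = \sum_{i} a_i 1_{A_i}.$$
Hence, for each $k = 1,2,\dots$,
$$\int s_k\,d\mu = \sum_i a_i\,\mu(A_i) \le \underline{\lim}\sum_i a_i\,\mu_n(A_i) = \underline{\lim}\int s_k\,d\mu_n \le \underline{\lim}\int f\,d\mu_n.$$
The statement now follows from the definition of integral. □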
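2.11 Theorem (Dominated Convergence). Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$ and $\{\nu,\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on $\Sigma$ such that for each $A \in \Sigma$, $\mu_n(A) \to \mu(A)$, $\mu_n \le \nu$ and $\int f\,d\nu < \infty$. Then
$$\lim_{n\to\infty}\int f\,d\mu_n = \int f\,d\mu.$$

Proof. Since $\mu_n \le \nu$, it is easy to verify that $\nu - \mu_n$ is a measure on $\Sigma$. Due to Problem 1.25,
$$\int f\,d(\nu - \mu_n) = \int f\,d\nu - \int f\,d\mu_n.$$
Furthermore,
$$\lim_{n\to\infty}(\nu - \mu_n)(A) = \nu(A) - \lim_{n\to\infty}\mu_n(A) = \nu(A) - \mu(A) = (\nu - \mu)(A).$$
The last equality holds true, because obviously $\mu \le \nu$ and hence $\nu - \mu$ is a measure on $\Sigma$. Now, all conditions of Fatou's Lemma 2.10 are met for the sequences $\{\nu - \mu_n\}$ and $\{\mu_n\}$ and therefore,
$$\int f\,d\nu - \int f\,d\mu = \int f\,d(\nu - \mu) \le \underline{\lim}\int f\,d(\nu - \mu_n) = \int f\,d\nu - \overline{\lim}\int f\,d\mu_n$$
and
$$\int f\,d\mu \le \underline{\lim}\int f\,d\mu_n.$$
Combining both inequalities we have
$$\overline{\lim}\int f\,d\mu_n \le \int f\,d\mu \le \underline{\lim}\int f\,d\mu_n$$
and hence, the statement. □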
To prove the Theorem of Monotone Convergence for measures we need the notion of setwise convergence.

2.12 Definition. Let $(\Omega,\Sigma)$ be a measurable space and $\{\mu_n\}$ be a sequence of measures on $(\Omega,\Sigma)$. We will say that $\{\mu_n\}$ converges to a set function $\mu$ setwise if $\lim_{n\to\infty}\mu_n(A) = \mu(A)$ exists for each $A \in \Sigma$. The set function $\mu$ will be called the setwise limit of $\{\mu_n\}$. □

2.13 Proposition. The setwise limit $\mu$ of $\{\mu_n\}$ has the following properties:

(i) $\mu$ is monotone and additive.

(ii) Let $\{A_1,A_2,\dots\}$ be a sequence of pairwise disjoint sets from $\Sigma$ and $\bigcup_{n=1}^{\infty} A_n \subseteq A \in \Sigma$. Then
$$\sum_{n=1}^{\infty}\mu(A_n) \le \mu(A). \tag{2.13}$$

Proof. (i) is trivial.

(ii) It can be verified directly from the definition of the setwise limit by using monotonicity and additivity or just due to Proposition 1.3 (ii), Chapter 5. □

We now ask what condition imposed on a sequence $\{\mu_n\}$ makes its setwise limit a measure. For instance, if the sequence $\{\mu_n\}$ is monotone nonincreasing, then the limiting set function $\mu$ need not be $\sigma$-additive, as we learn from Problem 2.12.

2.14 Theorem. Let a sequence $\{\mu_n\}$ of measures on a measurable space $(\Omega,\Sigma)$ be convergent to a set function $\mu$ setwise. Then $\mu$ is a measure if one of the following conditions holds.

(i) $\{\mu_n\}$ is a monotone nondecreasing sequence.

(ii) $\mu$ is finite.

Proof. Let $\{A_k\}$ be a sequence of pairwise disjoint measurable sets with $A$ as its union.

(i) Since $\{\mu_k\}$ is monotone nondecreasing, for each $m = 1,2,\dots$,
$$\mu_m(A) = \sum_{k=1}^{\infty}\mu_m(A_k) \le \sum_{k=1}^{\infty}\mu(A_k),$$
which, combined with (2.13), yields the statement.

(ii) Since $\mu$ is finite, by Theorem 1.7 (ii), Chapter 5, if $\mu$ is not $\sigma$-additive (which we are going to assume), it would not be $\emptyset$-continuous. In other words, there is a monotone nonincreasing sequence $\{A_k\}\downarrow\emptyset$ of measurable sets such that $\lim_{k\to\infty}\mu(A_k) = \varepsilon > 0$. Let $a_1 = b_1 = 1$ and suppose $a_j$ and $b_j$ are positive integers defined for all $j \le n$. Furthermore, let $a_{n+1} > a_n$ such that
(If there is no such a, =& that limk,,p(Ak)
(Such a bn
+
then it would surely contradict our assumption > 0.) Now, let 6, + > b, such that
should exists, because pa
+ 1, we
have that pa
n+l
n+l
(B,)
is @-continuous.) For B,: =
$. Therefore
for j being odd
2. Main Convergence Theorems and j > k 3 1, n even:n 2 k
Then, for k >_ 1,
n even: n
_> k
We can easily verify that the last inequality holds true also for all odd values of n. Consequently, for all k 2 1,
The latter contradicts the assumption that $\lim_{k\to\infty}\mu(A_k) = \varepsilon > 0$. □

2.15 Theorem (of Monotone Convergence). Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$ and $\{\mu_1,\mu_2,\dots\}$ be a monotone nondecreasing sequence of measures on a measurable space $(\Omega,\Sigma)$. Then there is a measure $\mu$ on $(\Omega,\Sigma)$ such that $\mu_n(A) \to \mu(A)$ for all $A$ of $\Sigma$ and
$$\lim_{n\to\infty}\int f\,d\mu_n = \int f\,d\mu.$$

Proof. Since $\{\mu_n\}$ is monotone nondecreasing, by Theorem 2.14 (i), the setwise limit $\mu$ of $\{\mu_n\}$ exists and it is a measure on $(\Omega,\Sigma)$. Since $f$ is nonnegative and $\mu_n \uparrow \mu$, the sequence $\{\int f\,d\mu_n\}$ is monotone nondecreasing and hence
$$\overline{\lim}\int f\,d\mu_n = \lim_{n\to\infty}\int f\,d\mu_n \le \int f\,d\mu. \tag{2.15}$$
The last inequality holds because of $\int f\,d\mu_n \le \int f\,d\mu$, which, in turn, is due to Problem 1.26. On the other hand, from Fatou's Lemma 2.10 applied to our case,
$$\int f\,d\mu \le \underline{\lim}\int f\,d\mu_n$$
that, combined with (2.15), yields the statement. □
The convergence theorems below are for sequences of functions and measures at once.
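2.16 Lemma (Fatou). Let $\{\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$ and let $\{f_n\} \subseteq \mathcal{C}_+^{-1}(\Omega,\Sigma)$ such that for each $A \in \Sigma$, $\underline{\lim}\,\mu_n(A) \ge \mu(A)$. Then
$$\int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n,$$
where
$$f(\omega) := \underline{\lim}\, f_n(\omega), \quad \omega \in \Omega. \tag{2.16a}$$

Proof. First assume that $\{f_n\} \subseteq \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then, for every fixed positive integer $N$ and for every $n \ge N$,
$$\int f_n\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu_n. \tag{2.16b}$$
Applying the version of Fatou's Lemma 2.10 to the right-hand side of (2.16b) we have
$$\underline{\lim}\int f_n\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu.$$
Since $\{\inf\{f_m : m \ge N\}\}_N \uparrow f$ defined in (2.16a), applying the standard Monotone Convergence Theorem 2.1, we arrive at
$$\underline{\lim}\int f_n\,d\mu_n \ge \sup_N \int \inf\{f_m : m \ge N\}\,d\mu = \int f\,d\mu. \qquad\Box$$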
The following generalization of Fatou's Lemma 2.16 applies to arbitrary measurable functions $\{f_n\}$; its proof is left to the reader. (Problem 2.13.)

2.17 Lemma (Fatou). In the condition of Fatou's Lemma 2.16, let $\{g,f_1,f_2,\dots\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ such that for all $n$, $f_n \ge g$ and
$$\lim_{n\to\infty}\int g\,d\mu_n = \int g\,d\mu > -\infty.$$
Then,
$$\int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n,$$
where $f(\omega) := \underline{\lim}\, f_n(\omega)$, $\omega \in \Omega$.
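2.18 Theorem (Lebesgue's Dominated Convergence Theorem). Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$, $g \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$, and $\{\nu,\mu,\mu_1,\mu_2,\dots\}$ be a sequence of measures on the measurable space $(\Omega,\Sigma)$ such that:

(i) $\mu_n \le \nu$.

(ii) $f_n$ converges to a function $f$ in the topology of pointwise convergence.

(iii) $\mu_n$ converges to $\mu$ setwise.

(iv) $\int g\,d\nu < \infty$.

(v) $|f_n| \le g$.

Then,
$$\lim_{n\to\infty}\int f_n\,d\mu_n = \int f\,d\mu.$$

Proof. Consider Theorem 2.11 for which we use the conditions (i), (iii) and (iv). Then, applying Theorem 2.11 to $g$ we have that
$$\lim_{n\to\infty}\int g\,d\mu_n = \int g\,d\mu.$$
Now, since $g \pm f_n \ge 0$ for all $n$, we have from Fatou's Lemma 2.17,
$$\int (g \pm f)\,d\mu \le \underline{\lim}\int (g \pm f_n)\,d\mu_n = \int g\,d\mu + \underline{\lim}\Big(\pm\int f_n\,d\mu_n\Big).$$
On the other hand, since $\int g\,d\mu \le \int g\,d\nu < \infty$, the finite term $\int g\,d\mu$ can be cancelled on both sides, which gives
$$\overline{\lim}\int f_n\,d\mu_n \le \int f\,d\mu \le \underline{\lim}\int f_n\,d\mu_n$$
that yields the assertion. □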
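PROBLEMS

2.1 Prove Corollary 2.2.

2.2 Generalize the Monotone Convergence Theorem: Let $\{f_n\}\uparrow \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \ge g$ for all $n$ and suppose that $\int g\,d\mu > -\infty$. Prove that
$$\sup\Big\{\int f_n\,d\mu\Big\} = \int \sup\{f_n\}\,d\mu.$$

2.3 Show that if $\int g\,d\mu = -\infty$, the Generalized Monotone Convergence Theorem need not hold.

2.4 Let $\{f_n\}\downarrow \subseteq \mathcal{C}^{-1}$ and $g \in \mathcal{C}^{-1}$ such that $f_n \le g$ for all $n$. If $\int g\,d\mu < \infty$, show that
$$\int \inf\{f_n\}\,d\mu = \inf\Big\{\int f_n\,d\mu\Big\}.$$

2.5 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{A_n\} \subseteq \Sigma$. Prove that
$$\mu(\underline{\lim}\,A_n) \le \underline{\lim}\,\mu(A_n)$$
and if $\mu < \infty$ that
$$\mu(\overline{\lim}\,A_n) \ge \overline{\lim}\,\mu(A_n).$$
[Hint: Apply Fatou's Lemma 2.4 to the sequence of functions $\{1_{A_n}\}$ and use Problem 3.8, Chapter 1; then apply DeMorgan's law to prove the second inequality.]

2.6 Show that if $f_n \to f$ in mean then
$$\lim_{n\to\infty}\int f_n\,d\mu = \int f\,d\mu.$$

2.7 Generalize Fatou's Lemma 2.4 in the following way. Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $g \le f_n$ for all $n$. Let $\int g^-\,d\mu < \infty$. Show that
$$\int \underline{\lim}\, f_n\,d\mu \le \underline{\lim}\int f_n\,d\mu.$$

2.8 Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \le g$ for all $n$. Let $\int g^+\,d\mu < \infty$. Show that
$$\overline{\lim}\int f_n\,d\mu \le \int \overline{\lim}\, f_n\,d\mu.$$

2.9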
2.9
Let
S
Show that f n Explain why
-t
such that f n
5 g for
all
0 A-a.e. in the topology of pointwise convergence.
S limn,, 2.10
Let
$$f_n(x) = \begin{cases} n^2 x, & 0 \le x \le \frac{1}{n}, \\ -n^2\big(x - \frac{2}{n}\big), & \frac{1}{n} \le x \le \frac{2}{n}, \\ 0, & \frac{2}{n} \le x \le 1. \end{cases}$$
Show that
$$\int \underline{\lim}\, f_n\,\lambda(dx) < \underline{\lim}\int f_n\,\lambda(dx).$$

2.11 Use Lebesgue's Dominated Convergence Theorem 2.6 to prove that for all $a > 0$,
$$\lim_{n\to\infty}\frac{n!\,n^{a-1}}{a(a+1)\cdots(a+n-1)} = \Gamma(a),$$
where $\Gamma(a)$ is known to be the gamma function and it is expressed as the improper Riemann integral
$$\Gamma(a) = \int_0^\infty x^{a-1}e^{-x}\,dx. \tag{P2.11a}$$

2.12 Give an example of a monotone nonincreasing sequence of measures convergent to a set function $\mu$ setwise such that $\mu$ is not a measure.

2.13 Prove Fatou's Lemma 2.17.

2.14 Prove Theorem 2.9. [Hint: Use Theorem 2.6, the Mean Value Theorem, and Example 9.7 (ii), Chapter 3.]
NEW TERMS: Monotone Convergence Theorem for functions; Beppo Levi's Corollary; Monotone Convergence Theorem, Generalized; Fatou's Lemma for functions; convergence in mean; Lebesgue's Dominated Convergence Theorem for functions; interchanging derivative and integral; Fatou's Lemma for measures; Lebesgue's Dominated Convergence Theorem for measures; setwise convergence of measures; setwise limit of measures; setwise convergence, criterion of; Monotone Convergence Theorem for measures; Fatou's Lemma for measures and nonnegative functions; Fatou's Lemma for measures and functions; Lebesgue's Dominated Convergence Theorem for measures and functions; gamma function.
3. LEBESGUE AND RIEMANN INTEGRALS ON $\mathbb{R}$

In this section we will develop integration techniques in $L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ (see Definition 1.10 (ii)). The principal idea is to reduce the Lebesgue integral to the Riemann integral, whenever this is possible, in combination with the main convergence theorems. The Riemann notion of an integral, a refinement of the notion originated by Cauchy in 1823, was introduced in 1854. We begin with the concept of the Riemann integral of a bounded function on a compact interval suggested by the Frenchman Gaston (in some sources, Jean-Gaston) Darboux (1842-1917) in 1875. Although the construction below is self-contained, the reader is encouraged to go back to Example 9.9 (vi), Chapter 3, for topological preliminaries of this construction.

Let $[a,b]$ be a compact interval in $\mathbb{R}$. By Definition 1.7 (ii), Chapter 1 (see also Example 9.9 (vi), Chapter 3), a partition of $[a,b]$ is any ordered $(n+1)$-tuple $P = P(n) = P(a_0,\dots,a_n)$ with
$$P = \{a_0,\dots,a_n \in [a,b] : a = a_0 < a_1 < \dots < a_n = b\}.$$
Let $P_1$ and $P_2$ be two partitions of $[a,b]$. We say $P_2$ is finer than $P_1$ if $P_1 \subseteq P_2$. $P_2$ is also said to be a refinement of $P_1$ (in notation $P_1 \preceq P_2$). Thus, if $\mathcal{P}$ is the set of all partitions on $[a,b]$, $\preceq$ is a partial order on $\mathcal{P}$. Denote by $\mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b])) = \mathcal{C}_b^{-1}([a,b],\mathcal{B}\cap[a,b];\mathbb{R})$ the set of all real-valued, Borel-measurable, bounded functions on $[a,b]$. Let $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$ and let $P(n) = \{a_0,\dots,a_n\}$ be a partition of $[a,b]$. With $A_i$ denoting the $i$th subinterval of the partition (with endpoints $a_{i-1}$ and $a_i$), we introduce the following notation:
$$m_i = \inf\{f(x) : x \in A_i\}, \quad M_i = \sup\{f(x) : x \in A_i\},$$
$$L(f,P) = \sum_{i=1}^{n} m_i(a_i - a_{i-1}) \quad \text{(the Darboux lower sum)},$$
$$U(f,P) = \sum_{i=1}^{n} M_i(a_i - a_{i-1}) \quad \text{(the Darboux upper sum)},$$
$$l(f,P) = \sum_{i=1}^{n} m_i 1_{A_i}, \quad u(f,P) = \sum_{i=1}^{n} M_i 1_{A_i}.$$
Clearly, the jump functions $l$ and $u$ are elements of $\Psi_+([a,b],\mathcal{B}([a,b]))$, i.e., are nonnegative simple Borel-measurable functions. Thus, $L(f,P)$ and $U(f,P)$ can be interpreted in terms of Lebesgue integrals as
$$L(f,P) = \int l(f,P)\,d\lambda = \int_{[a,b]} l(f,P)\,d\lambda$$
and
$$U(f,P) = \int u(f,P)\,d\lambda = \int_{[a,b]} u(f,P)\,d\lambda$$
(in agreement with Notation 1.13 (ii)). Now let $\{P(n) = P(a_0,\dots,a_n);\ n = 1,2,\dots\}$ be a sequence of partitions of $[a,b]$ such that $\{P(n),\preceq\}$ is a chain. Denote by $|P(n)|$ the Lebesgue measure of the largest subinterval of $P(n)$ and call it the mesh of this partition. A chain $\{P(n),\preceq\}$ is said to be canonic if $\{|P(n)|\}$ is a monotone nonincreasing sequence vanishing for $n \to \infty$. Let $l_n = l(f,P(n))$ and $u_n = u(f,P(n))$ denote the lower and the upper jump functions corresponding to a partition $P(n)$ in a canonic chain. Then it can be easily verified that
$$l_1 \le l_2 \le \dots \le f \le \dots \le u_2 \le u_1.$$
Let $U_n = U(f,P(n)) = \int u_n\,d\lambda$ and $L_n = L(f,P(n)) = \int l_n\,d\lambda$. By monotonicity of the Lebesgue integral, we have
$$L_1 \le L_2 \le \dots \le U_2 \le U_1.$$
Since $f$ is bounded, there exist $U_- = \inf U_n = \lim U_n$ (called the upper Darboux integral) and $L_+ = \sup L_n = \lim L_n$ (called the lower Darboux integral).
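The behavior of the Darboux sums along a canonic chain can be illustrated with a short computation. The sketch below makes two simplifying assumptions that are not part of the text: the chain consists of uniform dyadic partitions (each one refines the previous one and the mesh $2^{-n}(b-a)$ vanishes), and $f$ is nondecreasing and continuous, so that $m_i$ and $M_i$ are attained at the left and right endpoints of each subinterval. The function name is hypothetical.

```python
def darboux_sums(f, a: float, b: float, n: int):
    """Lower and upper Darboux sums of a nondecreasing continuous f on [a, b]
    for the uniform partition with 2**n subintervals."""
    m = 2 ** n
    pts = [a + (b - a) * i / m for i in range(m + 1)]
    # For nondecreasing f: inf on a subinterval is the left value, sup the right value.
    lower = sum(f(pts[i - 1]) * (pts[i] - pts[i - 1]) for i in range(1, m + 1))
    upper = sum(f(pts[i]) * (pts[i] - pts[i - 1]) for i in range(1, m + 1))
    return lower, upper

if __name__ == "__main__":
    for n in range(1, 11):
        lo, up = darboux_sums(lambda x: x * x, 0.0, 1.0, n)
        print(n, lo, up, up - lo)
```

For $f(x) = x^2$ on $[0,1]$ the lower sums increase, the upper sums decrease, and their difference equals $2^{-n}$, so both sequences squeeze the common value $1/3$.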
3.1 Definition. If $U^- = L^+$, then their common value, $R(f,[a,b])$, is called the Riemann integral of the (bounded) function $f$ over $[a,b]$, and the function $f$ is called Riemann integrable. $R(f,[a,b])$ is also denoted by the symbol $(R)\int_{[a,b]} f(x)\,dx$.

Sometimes, to tell a Lebesgue integral from a Riemann integral, we will write the former as $(L)\int_{[a,b]} f\,d\lambda$.
For notational consistency, most often we shall be using the $d\lambda$ symbol within the Lebesgue integral (rather than an "L" in front of it). However, many textbooks and papers routinely use the same symbol $dx$ in Lebesgue integrals as in Riemann integrals, which we do not believe should cause any serious confusion (and it makes $\lambda$ available for other notation).
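3.2 Theorem. Let $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$. If $f$ is Riemann integrable on $[a,b]$, then $f \in L^1([a,b],\mathcal{B}([a,b]),\lambda;\mathbb{R})$, i.e., it is Lebesgue integrable on $[a,b]$. In this case, the Riemann integral of $f$ equals the Lebesgue integral of $f$.

Proof. Let $f$ be Riemann integrable. Then $\lim_{n\to\infty}(U_n - L_n) = 0$. Applying Fatou's Lemma 2.4 to the sequence $\{u_n - l_n\} \subseteq \mathcal{C}_+^{-1}([a,b],\mathcal{B}([a,b]))$, we have
$$0 \le \int \liminf_{n\to\infty}\,(u_n - l_n)\,d\lambda \le \liminf_{n\to\infty}\int (u_n - l_n)\,d\lambda = \lim_{n\to\infty}(U_n - L_n) = 0.$$
Because of this inequality, Lemma 1.15, and the fact that $u_n - f$ and $f - l_n$ are elements of $\mathcal{C}_+^{-1}$, we have that
$$u = \lim_{n\to\infty} u_n = l = \lim_{n\to\infty} l_n = f \quad \text{a.e.} \tag{3.2}$$
Also, since $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$ and $l_1 \le l_n \le f$, it follows that the sequence $\{l_n\}$ is dominated by a bounded, and hence $\lambda$-integrable, function on $[a,b]$. Now we can apply Lebesgue's Dominated Convergence Theorem 2.6 to the sequence $\{l_n\}$ with respect to its a.e.-limit-function $f$ to have
$$\lim_{n\to\infty}\int_{[a,b]} l_n\,d\lambda = \lim_{n\to\infty} L_n = (R)\int_{[a,b]} f(x)\,dx = \int_{[a,b]} f\,d\lambda.$$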
3.3 Remarks.

(i) The functions $u = \inf u_n$ and $l = \sup l_n$ are called the upper and the lower Baire functions. Therefore, $L^+$ is the Lebesgue integral of the lower Baire function $l$.

(ii) The above construction of the Riemann integral, which is now common in mathematical analysis courses, is due to Gaston Darboux in his work Mémoire sur la théorie des fonctions discontinues of 1875. The original construction of the Riemann integral, which goes back to Augustin-Louis Cauchy (1789-1857) in 1823 (and was later generalized by Riemann in his Habilitationsschrift of 1854), is as follows. Given a function $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$ and a partition $P$ from a canonic sequence of partitions $\{P(n) = P(a_0,\ldots,a_n);\ n = 1,2,\ldots\}$ of the interval $[a,b]$, define the Cauchy sum as
$$C(f,P) = \sum_{i=1}^{n} f(\xi_i)\,(a_i - a_{i-1}),$$
where $\xi_i$ is any point of $A_i$. Note that, unlike the Darboux sums, the Cauchy sum is not uniquely specified, because the $\xi_i$'s are arbitrary. If the limit
$$\lim_{n\to\infty} C(f,P(n)) \tag{3.3}$$
exists as a unique number, then $f$ is called Cauchy integrable on $[a,b]$ and the value of this limit is denoted by $(C)\int_a^b f(x)\,dx$. Clearly,
$$L(f,P) \le C(f,P) \le U(f,P),$$
and therefore the Cauchy integral exists if $L^+ = U^-$. In 1875, Darboux proved that this is also a necessary condition for the existence of $C$ and in this case, $C = R$. Darboux's theorem and his approach are subjects in most standard texts in mathematical analysis, while Riemann's concept of the integral is more common in calculus classes, as it leads to a quicker and more lucid interpretation. As a sufficient condition for the existence of the Riemann integral, Cauchy required that $f$ be continuous on $[a,b]$. Riemann relaxed Cauchy's integrability condition by requiring that for each $\varepsilon > 0$, there is a partition $P$ of $[a,b]$ such that $U(f,P) - L(f,P) < \varepsilon$. However, Riemann did not specify the class of functions that are subject to integration (although he pointed out that a function can be discontinuous on a dense set and nevertheless integrable), as Lebesgue did in his Theorem 3.5, which is to follow.
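The sandwich relation between the Cauchy sum and the two Darboux sums can be observed in a small experiment. The sketch below is only an illustration under our own assumptions: the function $f(x) = x^2$ on $[0,1]$, a uniform partition into 1000 subintervals, and tags drawn at random with a fixed seed. The names `tagged_sum`, `points`, and `tags` are hypothetical.

```python
import random


def f(x):
    return x * x


def tagged_sum(points, tags):
    """Cauchy sum: the sum of f(tag_i) * (a_i - a_{i-1}) over all subintervals."""
    return sum(f(t) * (right - left)
               for left, right, t in zip(points, points[1:], tags))


n = 1000
points = [i / n for i in range(n + 1)]      # uniform partition of [0, 1]
rng = random.Random(0)                      # fixed seed, reproducible tags
tags = [rng.uniform(left, right) for left, right in zip(points, points[1:])]

# f is increasing on [0, 1], so left endpoints give the lower Darboux sum
# and right endpoints give the upper Darboux sum.
lower = tagged_sum(points, points[:-1])
upper = tagged_sum(points, points[1:])
cauchy = tagged_sum(points, tags)
print(lower, cauchy, upper)
```

Whatever tags are chosen, the printed Cauchy sum lies between the two Darboux sums, which differ here by $1/n$.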
3.4 Example. Let $f$ be the Dirichlet jump function introduced in Example 1.4. Consider its modification $f = \mathbf{1}_{\mathbb{Q}\cap[0,1]}$, the indicator function of the rationals in $[0,1]$. The Lebesgue integral of $f$ exists and equals zero. The Riemann integral of $f$, however, does not exist, since for every partition, the lower Baire function equals 0 ($l = 0$) and the upper Baire function equals 1 ($u = 1$). Therefore, the lower Darboux integral $L^+ = 0$, and the upper Darboux integral $U^- = 1$.
3.5 Theorem (H. Lebesgue). Let $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$. Then $f$ is Riemann integrable on $[a,b]$ if and only if $f$ is continuous $\lambda$-a.e. on $[a,b]$.

Proof. (i) Observe that if $f$ is continuous on $[a,b]$, then it is uniformly continuous on $[a,b]$. This implies that for each $\varepsilon > 0$, there is a $\delta > 0$, such that for each partition $P$ whose mesh is less than $\delta$,
$$U(f,P) - L(f,P) < \varepsilon. \tag{3.5}$$
(Show it, see Problem 3.1.) This leads to Riemann integrability.

(ii) Let $f$ be bounded, Borel-measurable and $\lambda$-a.e. continuous on $[a,b]$. If $f$ is not continuous everywhere, but is bounded, it can have only discontinuities of finite magnitude. From the nature of the lower and the upper Baire functions, $l$ and $u$, it follows that $l$ and $u$ coincide with $f$ at all points of continuity of $f$. (A rigorous proof of this statement, known as Baire's theorem, is contained in many standard analysis textbooks.) At the points of discontinuity of $f$, $l$ assumes the smallest values and $u$ takes the largest values (this can be shown by elementary methods). (See Figure 3.1.)
Figure 3.1

Then, if $f$ is discontinuous on a negligible set $S$, it should equivalently follow that $u$ and $l$ differ on the same set $S$. By the above condition, $S \subseteq N$, where $N$ is a measurable null set. Since $f$ is bounded, $u_n$ and $l_n$ are measurable, bounded jump functions, and $U_n$ and $L_n$ exist. By Lebesgue's Dominated Convergence Theorem, $U^- - L^+ = 0$, which implies that $f$ is Riemann integrable. Indeed,
$$U^- - L^+ = \lim_{n\to\infty} U_n - \lim_{n\to\infty} L_n = \lim_{n\to\infty}\int u_n\,d\lambda - \lim_{n\to\infty}\int l_n\,d\lambda = \int (u - l)\,d\lambda = 0$$
by Lemma 1.15, since $u = l$ on $N^c$, i.e., a.e.

(iii) Let $f$ be Riemann integrable. Then, by (3.2), $u = l = f$ $\lambda$-a.e.
Furthermore, $f$ is bounded. We repeat the above arguments. From the nature of $u$ and $l$, it follows that, in this case, $u$, $l$, and $f$ coincide wherever $f$ is continuous. At all points of discontinuity, while $f$ assumes one of these values, the smallest values of $f$ will be assigned to $l$ and the largest ones to $u$. Therefore, the set on which the function $f$ is discontinuous equals the set on which $u$ and $l$ differ. This proves that $f$ is continuous $\lambda$-a.e.

3.6 Remarks.
(i) By employing a canonic chain of partitions on the X-axis in the construction of the Riemann integral, we sometimes face the problem that the sequence of the corresponding lower jump functions $\{l_n\}$ converges to the lower Baire function $l$, but it does not converge to $f$, as it turns out for the Dirichlet function. Consequently, the lower Darboux integral gives a "wrong" value. In contrast, the construction of the Lebesgue integral literally sets up partitions on the Y-axis whose canonic chains form monotone increasing sequences of "lower" jump functions. The latter, due to Theorem 5.5, Chapter 5, always converge to $f$. Consequently, the corresponding lower Darboux integral $L^+$ equals the Lebesgue integral $\int f\,d\lambda$.

(ii) Although Riemann and Darboux enlarged the previously existing class of integrable functions, the Riemann integral has a plethora of limitations, one of which goes back to the fundamental theorem of calculus in the form
$$(R)\int_a^b f'(x)\,dx = f(b) - f(a).$$
This formula becomes meaningless when the derivative $f'$ of a differentiable function $f$ is not integrable. On the other hand, the classical proof of the formula
$$\frac{d}{dx}\int_a^x f(u)\,du = f(x)$$
was originally based on the continuity assumption for $f$. The new concept of integration suggested by Henri Lebesgue in 1902 in his doctoral work restored the generality of the fundamental theorem to its current status. Furthermore, the class of Lebesgue integrable functions is significantly enlarged. Notice that from Theorem 6.5, Chapter 5, it follows that, in contrast with the Cauchy-Riemann-Darboux formation of partitions of $[a,b]$, essentially leading to Definition (3.3), the Lebesgue construction of the integral of an (initially nonnegative) function $f$ suggests partitions of the interval $[0,\sup f]$ on the Y-axis instead. The latter leads to a notion of a sequence of nonnegative simple functions $\{s_n\}$ approximating $f$ from below, a very elegant and lucid definition of the integral of a nonnegative simple function, and, as a consequence, the definition of the integral $\int f\,d\lambda$ as $\sup\{\int s_n\,d\lambda\}$. The function $f$ need not be $\lambda$-a.e. continuous, nor need it even be bounded.

(iii) As we mentioned, in order that a function be Lebesgue integrable, it need not be bounded. The class of Riemann-integrable functions, as is known, can be "extended" to unbounded functions by the use of the "improper integral." Another need for the improper integral arises when
the interval of integration is unbounded. In the latter case, the integral is constructed as usual on a compact interval $[a,b]$, and then its values are taken for $a \to -\infty$ or $b \to \infty$. This is a "trick" rather than a proper integral construction. That is why such integrals are called improper.

(iv) Unlike this type of improper integration over infinite intervals, there is another way to integrate functions, with the conventional approach of constructing an integral via uniform "partitions" of the infinite interval. Consider as an example a bounded Borel measurable function $f$ on an interval $[a,\infty)$ and a partition of this interval by the sequence $\{a_n\}$, where $a_n = a + \delta n$, $n = 0,1,\ldots$, for some positive $\delta$. Then on each of the intervals $A_n = [a_n, a_n + \delta)$ consider
$$m_n = \inf\{f(x): x \in A_n\} \quad\text{and}\quad M_n = \sup\{f(x): x \in A_n\}.$$
Since the Lebesgue measure of each interval $A_n$ equals $\delta$, we have again the lower Darboux sum,
$$L(f,\delta) = \delta\sum_{n=0}^{\infty} m_n,$$
and the upper Darboux sum,
$$U(f,\delta) = \delta\sum_{n=0}^{\infty} M_n.$$
If $\lim_{\delta\downarrow 0} L(f,\delta) = \lim_{\delta\downarrow 0} U(f,\delta)$, then the common value is called the direct Riemann integral. The function $f$ is then said to be directly Riemann integrable. Direct integrability is used in probability, specifically in renewal theory, where such a notion is introduced for a class of nonnegative functions bounded over finite intervals.
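The two sums $L(f,\delta)$ and $U(f,\delta)$ are easy to tabulate for a concrete function. The sketch below is an illustration only, under our own assumptions: $f(x) = e^{-x}$ on $[0,\infty)$, the series truncated once the neglected tail is below $e^{-50}$, and the hypothetical helper name `direct_riemann_sums`.

```python
import math


def direct_riemann_sums(delta):
    """Truncated sums L(f, delta) and U(f, delta) for f(x) = exp(-x) on [0, inf)."""
    # Enough subintervals that the neglected tail of each series is below exp(-50).
    terms = math.ceil(50.0 / delta)
    # f is decreasing, so on A_n = [n*delta, (n+1)*delta) its supremum is
    # f(n*delta) and its infimum is f((n+1)*delta).
    upper = delta * sum(math.exp(-n * delta) for n in range(terms))
    lower = delta * sum(math.exp(-(n + 1) * delta) for n in range(terms))
    return lower, upper


results = {delta: direct_riemann_sums(delta) for delta in (1.0, 0.1, 0.01, 0.001)}
for delta, (lower, upper) in results.items():
    print(delta, lower, upper)
```

As $\delta$ decreases, the two columns close in on each other from both sides of 1, which is the value of the improper Riemann integral of $e^{-x}$ over $[0,\infty)$.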
3.7 Examples.

(i) Let $\Omega = [0,1]$ and let $f(x) = x^2\,\mathbf{1}_A(x) + \sin x\,\mathbf{1}_{A^c}(x)$, where $A^c$ is the Cantor ternary set. The function $f$ is a bounded Borel-measurable function on $[0,1]$ and obviously $\lambda$-a.e. continuous on $[0,1]$. Thus, $f$ is Lebesgue as well as Riemann integrable and $f(x) = x^2$ $\lambda$-a.e. on $[0,1]$. Furthermore,
$$\int_{[0,1]} f\,d\lambda = (R)\int_0^1 x^2\,dx = \frac{1}{3}.$$

(ii) Let $\Omega = [1,2]$ and $f(x) = (x-1)^{-1/3}$. We wish to evaluate $\int_{[1,2]} f(x)\,\lambda(dx)$. Since $f$ is no longer bounded (on $[1,2]$), we cannot apply the same techniques as discussed above. Consequently, we introduce an auxiliary sequence of functions, $\{f_n\}$, obtained by truncating $f$ near its singularity at $x = 1$ (see Figure 3.2). It is easily seen that $\{f_n\}$ is a monotone increasing sequence of continuous functions contained in $\mathcal{C}_+^{-1}([1,2],\mathcal{B}([1,2]))$ with $\sup\{f_n\} = f$.

Figure 3.2

By Proposition 5.6 (iv), Chapter 5, $f \in \mathcal{C}_+^{-1}$. By the Monotone Convergence Theorem,
$$\int_{[1,2]} f\,d\lambda = \lim_{n\to\infty}\int_{[1,2]} f_n\,d\lambda.$$
On the other hand, each $f_n$ is continuous, so that its Lebesgue integral equals its Riemann integral (Theorem 3.2). Thus,
$$\int_{[1,2]} f\,d\lambda = \lim_{n\to\infty}(R)\int_1^2 f_n(x)\,dx = \frac{3}{2}.$$
Observe that the improper integration technique for unbounded functions could also be applied to this function.
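The limit in Example 3.7 (ii) can be traced numerically. The sketch below is an illustration under our own assumption about the auxiliary sequence: we take $f_n(x) = \min\{(x-1)^{-1/3},\, n^{1/3}\}$, which is continuous, increases with $n$, and has supremum $f$; the definition used in the text may differ. For this choice an elementary computation gives $\int f_n\,d\lambda = \frac{3}{2} - \frac{1}{2}n^{-2/3}$, which the midpoint rule reproduces. The names `truncated_integral` and `closed_form` are hypothetical.

```python
def truncated_integral(n, cells=100_000):
    """Midpoint-rule approximation of the integral over [1, 2] of
    f_n(x) = min((x - 1) ** (-1/3), n ** (1/3))  (our assumed truncation)."""
    cap = n ** (1.0 / 3.0)
    width = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = 1.0 + (i + 0.5) * width          # midpoint of the i-th cell
        total += min((x - 1.0) ** (-1.0 / 3.0), cap)
    return total * width


def closed_form(n):
    """Exact value of the same integral: 3/2 - (1/2) * n ** (-2/3)."""
    return 1.5 - 0.5 * n ** (-2.0 / 3.0)


for n in (1, 8, 64, 1000):
    print(n, truncated_integral(n), closed_form(n))
```

The printed values increase with $n$ and approach $3/2$ from below, as the Monotone Convergence Theorem requires.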
3.8 Remark. The Lebesgue integrable functions constitute a much wider class in comparison to the Riemann integrable functions. It should also be mentioned that an $L^1(\mathbb{R},\mathcal{B},\lambda)$-function $f$ can be integrated over arbitrary Borel sets, while the Riemann integral is defined just on intervals. With all these advantages, however, the Lebesgue integral does not have the same elegance and analytical tractability the Riemann integral has, due to its "Newton-Leibniz bridge" to derivatives and a huge inventory of integration techniques. In many cases, whenever possible, the Lebesgue integral is just reduced to a Riemann integral. In addition, the class of Riemann integrable functions is traditionally enlarged to include those functions which are Riemann integrable in an improper sense. These are functions with discontinuities of an infinite magnitude and functions defined on intervals of type $[a,\infty)$ or $(-\infty,b]$ or $(-\infty,\infty)$. In Example 3.7 (ii) we examined a Lebesgue integral of an unbounded function. In a certain sense, the approach used there reminds us of Riemann integration of unbounded functions. In the proposition below we will state that in most cases, when integration over an infinite interval is needed, we can use Riemann integration in the improper sense and equate its values to those of Lebesgue integrals. This fact makes the Riemann improper integral more legitimate.

3.9 Proposition. Let $f \in \mathcal{C}_+^{-1}(\mathbb{R},\mathcal{B};\mathbb{R}_+)$ and let $f$ be Riemann integrable on any compact interval. Then $f \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R}_+)$ if and only if the improper Riemann integral of $f$,
$$R = \lim_{a\to-\infty,\ b\to\infty}\,(R)\int_{[a,b]} f(x)\,dx,$$
exists. (We say that $f \in \mathcal{R}(\mathbb{R})$, where $\mathcal{R}(\mathbb{R})$ is the class of all functions on $\mathbb{R}$ Riemann integrable in the improper sense.) In this case $R = \int f\,d\lambda$.

Proof. Denote
$$R_{nk} = (R)\int_{B_{nk}} f(x)\,dx, \quad\text{where } B_{nk} = [-k,n].$$
Then, since $f$ is Riemann integrable, $R_{nk} = \int f\,\mathbf{1}_{B_{nk}}\,d\lambda$. Observing that $f\,\mathbf{1}_{B_{nk}} \uparrow f$ as $n,k \to \infty$, we have, by the Monotone Convergence Theorem,
$$\lim_{n,k\to\infty} R_{nk} = \int f\,d\lambda.$$

3.10 Remark. The special case treated in the above proposition for nonnegative Borel measurable functions can easily be extended to arbitrary functions of $\mathcal{C}^{-1}$ by our noticing that $f \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ if and only if $|f| \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$. Therefore, using Proposition 3.9, we conclude that $|f|$ must be an element of $\mathcal{R}(\mathbb{R})$. In this case, evidently,
$$\int f\,d\lambda = (R)\int_{-\infty}^{\infty} f(x)\,dx.$$
3.11 Examples.

(i) Consider the function $f(x) = \dfrac{x\sin x}{k^2 + x^2}$ (where $k \ne 0$). We show that this function is Riemann integrable in the improper sense but not Lebesgue integrable over $\mathbb{R}_+$. We apply the Dirichlet criterion:

Let $g$ and $h$ be two real-valued functions defined on $[a,\infty)$. If $g$ is monotonically vanishing at $\infty$ and
$$\left|(R)\int_a^b h(x)\,dx\right| < C$$
for each $b > a$ and some positive real number $C$, i.e., the integral of $h$ is uniformly bounded in $b$, then the improper integral $(R)\int_a^\infty gh$ is convergent.

In our case, the function $\dfrac{x}{k^2+x^2}$ can be taken for $g$ and $\sin x$ can represent $h$. Then, the conditions in Dirichlet's criterion are met for $a = 0$ (the function $g$ decreases monotonically to 0 once $x \ge |k|$, which is all that convergence at $\infty$ requires), and consequently, $(R)\int_0^\infty f$ converges. On the other hand, $f \in L^1(\mathbb{R}_+,\mathcal{B},\lambda;\mathbb{R})$ if and only if $|f| \in L^1(\mathbb{R}_+,\mathcal{B},\lambda;\mathbb{R})$, which, by Proposition 3.9, is equivalent to the convergence of the integral $(R)\int_0^\infty |f| = I$. It will be shown that $I = \infty$. Indeed,
$$I = \sum_{n=0}^{\infty}\int_{n\pi}^{(n+1)\pi}\frac{|\sin x|\,x}{k^2+x^2}\,dx = \sum_{n=0}^{\infty}\int_0^{\pi}\frac{\sin t\,(t+n\pi)}{k^2+(n\pi+t)^2}\,dt \ge \sum_{n=0}^{\infty}\frac{n\pi\int_0^{\pi}\sin t\,dt}{k^2+\pi^2(n+1)^2}$$
(the last estimate is due to the inequality $n\pi \le n\pi + t \le (n+1)\pi$, for $t \in [0,\pi]$). Thus
$$I \ge 2\pi\sum_{n=0}^{\infty}\frac{n}{k^2+\pi^2(n+1)^2} = \infty.$$

(ii) The function $f(x) = \sin x\,\exp(-\tfrac{x^2}{2})$ is an element of $\mathcal{C}^{-1}$ and it is Lebesgue integrable, because $|f(x)| \le g(x) = \exp(-\tfrac{x^2}{2})$, $g(x) \ge 0$, and because
$$\int g\,d\lambda = \sqrt{2\pi} < \infty.$$
Observe that $x \mapsto \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$, $x \in \mathbb{R}$, is the normal density function of the standard normal distribution. (See Example 5.10 (iii).)
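The contrast in Example 3.11 (i) between the convergent signed integral and the divergent integral of $|f|$ shows up clearly in a numerical experiment. The sketch below is an illustration under our own assumptions: $k = 1$, a midpoint rule with 200 cells on each interval $[n\pi,(n+1)\pi]$, and the hypothetical helper names `integrate`, `signed_and_absolute`, and `lower_bound`. It uses the fact that $f$ keeps a constant sign on each such interval, so the integral of $|f|$ over the interval is the absolute value of the integral of $f$.

```python
import math

K = 1.0  # assumed value of the parameter k


def f(x):
    return x * math.sin(x) / (K * K + x * x)


def integrate(func, a, b, cells):
    """Plain midpoint rule on [a, b]."""
    width = (b - a) / cells
    return width * sum(func(a + (i + 0.5) * width) for i in range(cells))


def signed_and_absolute(periods, cells_per_period=200):
    """Integrals of f and of |f| over [0, periods * pi]."""
    signed = 0.0
    absolute = 0.0
    for n in range(periods):
        piece = integrate(f, n * math.pi, (n + 1) * math.pi, cells_per_period)
        signed += piece
        absolute += abs(piece)   # f does not change sign on [n*pi, (n+1)*pi]
    return signed, absolute


def lower_bound(periods):
    """Partial sum of the divergent series bounding I from below in the text."""
    return sum(2.0 * math.pi * n / (K * K + math.pi ** 2 * (n + 1) ** 2)
               for n in range(periods))


signed_small, absolute_small = signed_and_absolute(200)
signed_large, absolute_large = signed_and_absolute(2000)
print(signed_small, signed_large)       # the signed integrals settle down
print(absolute_small, absolute_large)   # the integrals of |f| keep growing
print(lower_bound(200), lower_bound(2000))
```

The signed values stabilize, while the integrals of $|f|$ and the lower bounds from the text keep growing, roughly like the logarithm of the upper limit of integration.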
PROBLEMS

3.1 Prove (3.5) in Theorem 3.5.

3.2 In Example 3.4, we showed that the Dirichlet function $f$ on $[0,1]$ is Lebesgue integrable, but not Riemann integrable. Since the rationals have Lebesgue measure 0, the function $f$ is equal to 0 (a constant) for $\lambda$-almost all points of $[0,1]$, and therefore, it is continuous almost everywhere on $[0,1]$. By Theorem 3.5, $f$ must be Riemann integrable. This is just the opposite of the result of Example 3.4. What is wrong with this reasoning?
3.3 Is the function $f(x) = \ldots$ on $[0,1]$ Borel-measurable and $\lambda$-integrable?

3.4 Show that the function $f$, such that $f(x) = \ldots\cos(\ldots)$ on $(0,1]$ and $f(0) = 0$, is Borel-measurable and not $\lambda$-integrable.

3.5 Let $f: [0,1] \to \mathbb{R}$ be defined as $\ldots$ Show that $f$ is improperly Riemann integrable but not Lebesgue integrable.

3.6 Let $f$ be a monotone increasing differentiable function on $[a,b]$ and let $\varphi$ be its inverse function on $[f(a),f(b)]$. Prove that $\ldots$

3.7 Investigate if the function $f(x) = \ldots\sin\ldots\,\mathbf{1}_{\{x\ne 0\}}(x)$ (where $0 < \ldots < 1$) is improperly Riemann and Lebesgue integrable over $\mathbb{R}_+$.

3.8 Let $G$ be a nonempty open subset of $[a,b]$ and let $f$ be a Borel measurable function on $[a,b]$, discontinuous at each point of $G$. Can $f$ be Riemann integrable?

3.9 Show that the functional $\|f\|_{L^1} = \int_{[a,b]} |f|\,d\lambda$ defines a semi-norm on $L^1([a,b],\mathcal{B}([a,b]),\lambda)$. How can $\|\cdot\|_{L^1}$ become a norm?

3.10 Let $s \in \Psi_+([a,b],\mathcal{B}([a,b]))$. Show that for each $\varepsilon > 0$, there is a continuous function $h \in \mathcal{C}([a,b])$ such that $\|s - h\|_{L^1} < \varepsilon$.

3.11 Show that the space $\mathcal{C}([a,b])$ of all continuous functions on the interval $[a,b]$ is dense in $(L^1([a,b],\mathcal{B}([a,b]),\lambda), \|\cdot\|_{L^1})$.
3.12 Use Lebesgue's Theorem 3.5 to show that the limit of a uniformly convergent sequence $\{f_n\}$ of bounded Riemann integrable functions on $[a,b]$ is Riemann integrable on $[a,b]$. Prove that under this condition,
$$\lim_{n\to\infty}(R)\int_a^b f_n(x)\,dx = (R)\int_a^b \lim_{n\to\infty} f_n(x)\,dx. \tag{P3.12}$$

3.13 Let $A$ be a closed negligible subset of $[a,b]$. Is the function $\mathbf{1}_A$ Riemann integrable?

3.14 Let $A$ be a subset of $[a,b]$ whose closure is negligible. Is $\mathbf{1}_A$ Riemann integrable?

3.15 Let $\{f_n\}$ be a sequence of bounded, Borel measurable, nonnegative functions on $A \subseteq \mathbb{R}$. Suppose $(L)\int_A f_n\,d\lambda \to 0$ as $n \to \infty$. Is it true that $f_n \to 0$ $\lambda$-a.e. on $A$?
NEW TERMS: partition, refinement, Borel-measurable bounded functions, Darboux lower sum, Darboux upper sum, mesh of a partition, canonic chain of partitions, upper Darboux integral, lower Darboux integral, Riemann integral, Riemann integrable function, upper Baire function, lower Baire function, Cauchy sum, Cauchy integrable function, Dirichlet function, Lebesgue's Theorem of Riemann integrability, improper Riemann integral, direct Riemann integral, direct Riemann integrability, Dirichlet's criterion
4. INTEGRATION WITH RESPECT TO IMAGE MEASURES

As one of the extensions of major integration techniques, we will study integration with respect to an image measure $\mu F^*$ (where $F$ is a measurable mapping), nicknamed change of variables, as it resembles the prominent method for the Riemann integral. In this section we will restrict our attention to the abstract integral. A more specific approach to a change of variables for Lebesgue integrals in Euclidean spaces will be treated separately in Chapter 7.
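4.1 Theorem (Change of Variables). Let $(\Omega_0,\Sigma_0,\mu)$ be a measure space, $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, and $F: (\Omega_0,\Sigma_0) \to (\Omega,\Sigma)$ be a measurable map (such that $\mu F^*$ is an image measure on the measurable space $(\Omega,\Sigma)$). Then, the following formula holds true:
$$\int f\,d\mu F^* = \int f\circ F\,d\mu. \tag{4.1}$$
Specifically, if $f = g\,\mathbf{1}_A$, where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$, then (4.1) reduces to
$$\int_A g(\omega)\,d\mu F^*(\omega) = \int_{F^*(A)} g\circ F(\omega_0)\,d\mu(\omega_0). \tag{4.1a}$$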
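Proof.

(i) Let $s \in \Psi_+(\Omega,\Sigma)$ be just an indicator function $s = \mathbf{1}_A$. By Problem 3.7, Chapter 1, we have that
$$\mathbf{1}_A\circ F = \mathbf{1}_{F^*(A)}.$$
Therefore,
$$\int \mathbf{1}_A\circ F\,d\mu = \mu(F^*(A)) = \mu F^*(A) = \int \mathbf{1}_A\,d\mu F^*.$$

(ii) Let $s$ be a nonnegative simple function with the representation
$$s = \sum_{i=1}^{n} a_i\,\mathbf{1}_{A_i}.$$
Then,
$$s\circ F = \sum_{i=1}^{n} a_i\,\mathbf{1}_{F^*(A_i)}$$
and
$$\int s\circ F\,d\mu = \sum_{i=1}^{n} a_i\,\mu F^*(A_i) = \int s\,d\mu F^*.$$

(iii) Let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then there exists $\{s_n\}\uparrow\ \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. For $s_n$ we have, according to (ii):
$$\int s_n\circ F\,d\mu = \int s_n\,d\mu F^*.$$
Observe that $\{s_n\circ F\}\uparrow\ \subseteq \Psi_+(\Omega_0,\Sigma_0)$ and, by Proposition 5.6 (iv),
$$\sup\{s_n\circ F\} = f\circ F \in \mathcal{C}_+^{-1}(\Omega_0,\Sigma_0).$$
Therefore, we have that
$$\int f\circ F\,d\mu = \sup\left\{\int s_n\circ F\,d\mu\right\} = \sup\left\{\int s_n\,d\mu F^*\right\} = \int f\,d\mu F^*.$$

(iv) Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$. Then, $f = f^+ - f^-$ and, according to Problem 4.1, $(f\circ F)^+ = f^+\circ F$ and $(f\circ F)^- = f^-\circ F$. Therefore,
$$\int (f\circ F)^{\pm}\,d\mu = \int f^{\pm}\circ F\,d\mu,$$
and this, along with (iii), implies that
$$\int f\circ F\,d\mu = \int f\,d\mu F^*.$$

(v) Let $f = g\,\mathbf{1}_A$, where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$. Then we have
$$\int_A g\,d\mu F^* = \int g\,\mathbf{1}_A\,d\mu F^* = \int (g\circ F)(\mathbf{1}_A\circ F)\,d\mu = \int (g\circ F)\,\mathbf{1}_{F^*(A)}\,d\mu = \int_{F^*(A)} g\circ F\,d\mu.$$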
4.2 Corollary. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $F$ be a bijective transformation which is $\Sigma$-$\Sigma$ measurable along with its inverse $F^*$. Then, for each $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, the following formula holds true.

(See Problem 4.2.)

4.3 Examples.
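(i) Let $f \in \mathcal{C}^{-1}(\mathbb{R}^n,\mathcal{B}^n)$ and $L(x) = ax + b$ for $a \in \mathbb{R}$, $a \ne 0$, and $b \in \mathbb{R}^n$. Then,
$$I = \int_B f(ax+b)\,\lambda(dx) = \frac{1}{|a|^n}\int_{aB+b} f(x)\,\lambda(dx),$$
where $B$ is any Borel set and $\lambda$ is the Lebesgue measure. Indeed, let $A = L_*(B)$. Representing the Borel set $B$ as
$$B = L^*\circ L_*(B) = L^*(A) = \tfrac{1}{a}(A - b),$$
we have
$$I = \int_{L^*(A)} f\circ L(x)\,\lambda(dx) \overset{\text{(4.1a)}}{=} \int_A f(x)\,\lambda L^*(dx).$$
By Proposition 4.3, Chapter 5, $\lambda L^* = \frac{1}{|a|^n}\lambda$, implying that
$$I = \frac{1}{|a|^n}\int_A f(x)\,\lambda(dx), \tag{4.3}$$
where $A = aB + b$. (4.3) is due to Problem 1.23, i.e., due to the fact that $\int f\,d(c\mu) = c\int f\,d\mu$, where $c > 0$.

(ii) Let $(\Omega,\Sigma,P)$ be a probability space and let $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$ be a random variable. Recall that $X$ induces the image measure $PX^*$, or, equivalently, the probability distribution on the measurable space $(\mathbb{R},\mathcal{B})$, thereby generating the new probability space $(\mathbb{R},\mathcal{B},PX^*)$. The functional of $X$, $\int X(\omega)\,P(d\omega)$, was called (in Definition 1.10 (iv)) the expectation of the random variable $X$ and denoted by the symbol $\mathbb{E}[X]$. Let $g \in \mathcal{C}^{-1}(\mathbb{R},\mathcal{B})$. Then, $g\circ X$ is also a random variable whose expectation is
$$\mathbb{E}[g\circ X] = \int g\circ X(\omega)\,P(d\omega).$$
By formula (4.1), we have
$$\mathbb{E}[g\circ X] = \int g(x)\,PX^*(dx). \tag{4.3a}$$
Specifically, if $g(x) = x$, we have $\mathbb{E}[X] = \int x\,PX^*(dx)$. If $g = \mathbf{1}_A$, then from (4.3a), Notation 1.13, and Definition 4.2, Chapter 5,
$$\mathbb{E}[\mathbf{1}_A\circ X] = \int_A PX^*(dx) = PX^*(A) = P\{X \in A\}. \tag{4.3b}$$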
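Formulas (4.3a) and (4.3b) can be verified by hand on a finite probability space, where all integrals are finite sums. The following sketch is an illustration under our own assumptions: the space consists of the 36 equally likely outcomes of two fair dice, $X$ is the sum of the faces, $g(x) = (x-7)^2$, and $A = \{7, 11\}$. All names in the code are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed finite probability space: outcomes of two fair dice.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}


def X(w):
    """Random variable: the sum of the two faces."""
    return w[0] + w[1]


def g(x):
    return (x - 7) ** 2


# Image measure PX*(B) = P(X in B), stored through its point masses.
PX = defaultdict(Fraction)
for w, p in P.items():
    PX[X(w)] += p

# Left side of (4.3a): the integral of g∘X with respect to P.
expectation_on_omega = sum(g(X(w)) * p for w, p in P.items())
# Right side of (4.3a): the integral of g with respect to PX*.
expectation_on_image = sum(g(x) * q for x, q in PX.items())

# Formula (4.3b) with A = {7, 11}.
A = {7, 11}
indicator_expectation = sum(p for w, p in P.items() if X(w) in A)
image_measure_of_A = sum(PX[x] for x in A)

print(expectation_on_omega, expectation_on_image)
print(indicator_expectation, image_measure_of_A)
```

Both pairs of numbers agree exactly, since the image measure merely regroups the outcomes $\omega$ according to the value $X(\omega)$.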
PROBLEMS

4.1 Show that $(f\circ F)^+ = f^+\circ F$ and $(f\circ F)^- = f^-\circ F$.

4.2 Prove Corollary 4.2.

4.3 Simplify $\int_A f(e^{2x})\,\lambda(dx)$, where $f \in \mathcal{C}^{-1}(\mathbb{R},\mathcal{B};\mathbb{R})$ and $A = [1,2]$.

4.4 Use the change of variables formula to evaluate the integral $\int_A f(2x+1)\,\lambda(dx)$, where $\ldots$ and $A = [1,3]$.
NEW TERMS: change of variables, change of variables for a bijective transformation, expectation of a random variable, expectation of a function of a random variable
5. MEASURES GENERATED BY INTEGRALS. ABSOLUTE CONTINUITY. ORTHOGONALITY

In this section we will learn that the integral $\int_A f\,d\mu$, as a set function $A \mapsto \nu(A)$, turns out to be a measure. Hence the two measures, $\mu$ (the original measure) and $\nu$ (generated by the integral), are related through the given integrand-function $f$, which is referred to as a density. Now, under what condition imposed on two arbitrary given measures can a density function exist? The question raised leads to one of the central results in measure theory and integration, known as the Radon-Nikodym Theorem, which specifies exactly that condition. This section gives a very brief and informal acquaintance with the Radon-Nikodym Theorem and its ramifications, needed to advance to the upcoming material and serving as an introduction. A more elaborate and general version of the Radon-Nikodym Theorem will be treated in Section 2, Chapter 8. Let $(\Omega,\Sigma,\mu)$ be a measure space. Consider the integral
$$A \mapsto \int_A f\,d\mu = \int f\,\mathbf{1}_A\,d\mu$$
as a set function on $\Sigma$. If $f \ge 0$, then, as the following proposition states, we have a measure on $\Sigma$.
5.1 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then, the set function $\nu(A) = \int_A f\,d\mu$ is a measure on $\Sigma$.

(See Problem 5.1.)

5.2 Definition. According to Proposition 5.1, $\nu$ is the measure generated by the integral $\int f\,d\mu$; $\nu$ is also called the indefinite integral of $f$ with respect to $\mu$. The function $f$ is called a (Radon-Nikodym) density function of $\nu$ relative to $\mu$.
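5.3 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space.

(i) If $f$ and $g \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$, and $\nu$ is the measure generated by the integral $\int f\,d\mu$, then
$$\int g\,d\nu = \int g f\,d\mu. \tag{5.3}$$

(ii) In the condition of (i), let $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$. Then $g \in L^1(\Omega,\Sigma,\nu;\mathbb{R})$ if and only if $gf \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$, and in this case (5.3) holds too.

Proof. (i) As usual, we begin with $g \in \Psi_+(\Omega,\Sigma)$ as a nonnegative simple function $g = \sum_{i=1}^{n} b_i\,\mathbf{1}_{A_i}$ to get (5.3):
$$\int g\,d\nu = \sum_{i=1}^{n} b_i\,\nu(A_i) = \sum_{i=1}^{n} b_i\int f\,\mathbf{1}_{A_i}\,d\mu = \int f g\,d\mu. \tag{5.3a}$$
For $g \in \mathcal{C}_+^{-1}$, there is $\{s_n\}\uparrow\ \subseteq \Psi_+$ such that $g = \sup\{s_n\}$. By (5.3a),
$$\int s_n\,d\nu = \int s_n f\,d\mu.$$
Since $\{s_n f\}\uparrow\ \subseteq \mathcal{C}_+^{-1}$, by the Monotone Convergence Theorem,
$$\int g\,d\nu = \sup\left\{\int s_n\,d\nu\right\} = \sup\left\{\int s_n f\,d\mu\right\} = \int g f\,d\mu.$$

(ii) Now let $g \in \mathcal{C}^{-1}$. Thus
$$\int g\,d\nu = \int g^+\,d\nu - \int g^-\,d\nu = \int g^+ f\,d\mu - \int g^- f\,d\mu = \int g f\,d\mu.$$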
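On a finite set, formula (5.3) reduces to an identity between two finite sums, which makes it easy to check. The sketch below is an illustration only: the four-point space, the weights of $\mu$, and the functions $f$ and $g$ are arbitrary values chosen by us, and the helper name `integral` is hypothetical.

```python
from fractions import Fraction

# Assumed four-point measure space; mu is given by its point masses.
points = ["a", "b", "c", "d"]
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0), "d": Fraction(1, 2)}
f = {"a": Fraction(3), "b": Fraction(0), "c": Fraction(5), "d": Fraction(4)}    # density, f >= 0
g = {"a": Fraction(-1), "b": Fraction(7), "c": Fraction(2), "d": Fraction(1, 3)}


def integral(h, measure, subset=None):
    """Integral of h over `subset` (default: the whole space) for a measure
    on a finite set, i.e. the sum of h(w) * measure({w})."""
    subset = points if subset is None else subset
    return sum(h[w] * measure[w] for w in subset)


# The measure nu generated by the integral of f: nu({w}) = f(w) * mu({w}).
nu = {w: f[w] * mu[w] for w in points}

left = integral(g, nu)                      # integral of g with respect to nu
fg = {w: g[w] * f[w] for w in points}
right = integral(fg, mu)                    # integral of g*f with respect to mu
print(left, right)
print(nu)
```

In this example the point `c` is a $\mu$-null set and is automatically a $\nu$-null set, whereas the point `b` is $\nu$-null without being $\mu$-null, because the density vanishes there.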
The following example motivates the Radon-Nikodym Theorem.
5.4 Example. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\mu$ be $\sigma$-finite. Then, there exists a sequence $\{A_n\}\uparrow\Omega$ such that $\mu(A_n) < \infty$. Define the sequence $\{a_n\} \subseteq \mathbb{R}_+\setminus\{0\}$ as $a_n = \min\{\ldots\}$, $n = 1,2,\ldots$. Let
$$g_n = \sum_{i=1}^{n} a_i\,\mathbf{1}_{A_i} \quad\text{and}\quad g = \sup\{g_n\} = \sum_{i=1}^{\infty} a_i\,\mathbf{1}_{A_i}.$$
Then,
$$\int g\,d\mu = \sum_{i=1}^{\infty} a_i\,\mu(A_i) < \infty.$$
Therefore, if $\mu$ is $\sigma$-finite, there always exists a positive element $g$ of $L^1(\Omega,\Sigma,\mu)$. Conversely, let $g > 0$ and $g \in L^1(\Omega,\Sigma,\mu)$. Then
$$A_n = \{g \ge \tfrac{1}{n}\} \in \Sigma \quad\text{and}\quad g\,n \ge \mathbf{1}_{A_n}.$$
Thus
$$\mu(A_n) = \int \mathbf{1}_{A_n}\,d\mu \le n\int g\,d\mu < \infty,$$
which implies that $\mu(A_n) < \infty$. Since $g > 0$, it follows that $A_n\uparrow\Omega$.

We have shown that $\sigma$-finiteness of $\mu$ is equivalent to the existence of a positive integrable function $g$. In other words, there is a positive "Radon-Nikodym density" $g$ such that the measure $\nu$ generated by the integral is finite. Another noteworthy observation is that if
$$\nu(A) = \int_A g\,d\mu = 0,$$
then $g\,\mathbf{1}_A \in [0]_\mu$. Since $g > 0$, $A \in \mathcal{N}_\mu$, i.e., from $\nu(A) = 0$ it follows that $\mu(A) = 0$. Should $\mu(A) = 0$, then $g\,\mathbf{1}_A \in [0]_\mu$ and $\nu(A) = 0$. Thus, $\nu(A) = 0$ if and only if $\mu(A) = 0$. In other words, $\nu$ and $\mu$ possess the same null sets. It is clear that, if $g$ is just nonnegative, $\nu(A) = 0$ does not necessarily imply that $\mu(A) = 0$. But from $\mu(A) = 0$, it follows anyway that $\nu(A) = 0$ (why?).

If $\nu$ has a density relative to $\mu$, then a $\mu$-null set is also a $\nu$-null set. Is the converse of the statement true? (I.e., would this relation between the measures guarantee the existence of a density?) The answer will be given in the Radon-Nikodym Theorem below.
5.5 Definition. Let $\mu$ and $\nu$ be two measures on a measurable space $(\Omega,\Sigma)$. The measure $\nu$ is called (absolutely) continuous (with respect to $\mu$) if every $\mu$-null set is also a $\nu$-null set. If $\nu$ is continuous relative to $\mu$, then we write $\nu \ll \mu$. Any Borel measure continuous with respect to the Lebesgue measure is just called continuous.

The use of the word "continuity" is basically due to the following proposition.

5.6 Proposition. Let $\nu$ be a finite measure on $(\Omega,\Sigma)$ and let $\mu$ be another measure on $(\Omega,\Sigma)$. Then the following are equivalent:

(A) $\nu \ll \mu$.

(B) For all $\varepsilon > 0$, there is $\delta > 0$, such that for each $A \in \Sigma$ with $\mu(A) < \delta$, the inequality $\nu(A) < \varepsilon$ holds.

Proof. (i) Suppose statement (B) is true. Choose an $\varepsilon$. Denote by $\mathcal{A}$ the set of all $A \in \Sigma$ for which $\mu(A) < \delta$. Then $\mathcal{N}_\mu \subseteq \mathcal{A}$ (where $\mathcal{N}_\mu$ denotes the set of all $\mu$-null sets). Then, for all $N \in \mathcal{N}_\mu$, $0 = \mu(N) < \delta$ and $\nu(N) < \varepsilon$. Since $\varepsilon$ can be made arbitrarily small, we conclude that $\nu(N) = 0$ and thus $\nu \ll \mu$.

(ii) Suppose now that statement (B) is not true. That means, for some $\varepsilon > 0$ and for any $\delta > 0$ there is a set $A(\delta) \in \Sigma$ such that $\mu(A(\delta)) < \delta$ and yet $\nu(A(\delta)) \ge \varepsilon$. We now define the sequence of $\delta$'s as $\delta_n = 2^{-n}$, $n = 1,2,\ldots$, and construct the corresponding sequence of $A$'s such that $A(\delta_n) = A_n$ with the above property, i.e., $\{A_n\}$ is a $\mu$-monotone decreasing sequence but "$\nu$-resistant." Let $A = \limsup A_n$. Then $A \subseteq \bigcup_{m=n}^{\infty} A_m$ and
$$\mu(A) \le \sum_{m=n}^{\infty}\mu(A_m) < \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1} \to 0 \quad (n\to\infty).$$
Therefore, $\mu(A) = 0$. However, by Problem 2.5, since $\nu$ is finite,
$$\nu(A) = \nu(\limsup A_n) \ge \limsup \nu(A_n) \ge \varepsilon > 0$$
and thus $\nu$ is not $\mu$-continuous. Hence (A) is not true either.

The most general version of the celebrated Radon-Nikodym Theorem was proved by the Pole Otto Nikodym in his paper, Sur une généralisation des intégrales de M. J. Radon, of 1930. Another prominent Pole, Stanisław Saks, suggested the name of this theorem, perhaps meaning Nikodym's Theorem on Radon Integrals, although Radon himself proved a much more special case. The idea of Radon-Nikodym's result had its inception in an 1894 paper by Thomas Stieltjes, in which he introduced the new concept of a density function in connection with his famous "Stieltjes integral" (in its present version known as the Riemann-Stieltjes integral), initially applied to very restricted classes of functions. In 1909, Frédéric Riesz proved in his widely cited Representation Theorem that the most general continuous linear functionals on the space of continuous functions on $[a,b]$ are represented by Stieltjes integrals (a more general version of which we will explore in Section 7, Chapter 8). Riesz's result yielded many generalizations, of which the most productive was by Johann Radon in his 1913 paper, Theorie und Anwendungen der absolut additiven Mengenfunktionen. In this paper, Radon, combining the ideas of Lebesgue and Riesz, introduced an integral with respect to Borel measures on the Borel $\sigma$-algebra of $\mathbb{R}^n$ rather than the Borel-Lebesgue measure used by Lebesgue. Among other things, Radon showed the existence of a Radon-Nikodym density function with respect to this integral for a measure absolutely continuous with respect to the Borel-Lebesgue measure, significantly generalizing the earlier theorem by Lebesgue about the existence of an almost everywhere differentiable density. Right after the appearance of Radon's paper, Maurice Fréchet noticed that Radon's result can be generalized to arbitrary measures, rather than Borel measures on $\mathbb{R}^n$. This led Nikodym to his 1930 generalization of Radon's theorem in a form very close to the present version. Consequently, a significant gap in integral theory existed between 1913 and 1930. Soon thereafter, in 1933, Nikodym's generalization led to the birth of measure-theoretic probability theory (in Andrey Kolmogorov's famous monograph, Grundbegriffe der Wahrscheinlichkeitsrechnung), the concept of conditional expectation, and an introduction to the theory of stochastic processes. Still, many consider Radon the father of the modern theory of integration.

Otto Nikodym, who is at the heart of one of the most important results ever obtained in mathematics, was born on August 13, 1887, in eastern Poland, then belonging to the Russian empire. In 1919 he was among the 16 mathematicians who founded the Polish Mathematical Society. Shortly after World War II, Nikodym's family moved to Belgium and then to France, where Nikodym was invited by the Institute of H. Poincaré to work on the mathematical foundations of quantum mechanics. (He published his results in numerous papers, and his monograph, The Mathematical Apparatus for Quantum Theories, was published by Springer-Verlag in 1966.) In 1948 he accepted a position in the United States at Kenyon College, Gambier, Ohio, where he stayed until his retirement. He died in 1974.

We introduce some preliminaries on the Radon-Nikodym Theorem (further to be embellished in Chapter 8).
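5.7 Notation. Let $\mathfrak{M} = \mathfrak{M}(\Omega,\Sigma)$ be the set of all measures on $(\Omega,\Sigma)$. For a fixed measure $\mu \in \mathfrak{M}$, denote $\mathfrak{M}_{\ll\mu} = \{\nu \in \mathfrak{M}: \nu \ll \mu\}$. (This set is not empty, since $\mu \in \mathfrak{M}_{\ll\mu}$.) Define on $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ a mapping $\int_\mu$, such that for each $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$,
$$\textstyle\int_\mu f = \int_{(\cdot)} f\,d\mu.$$
By Problem 1.20, $\int_\mu$ is valued in $\mathfrak{M}_{\ll\mu}$.

Now the Radon-Nikodym Theorem states that if $\mu$ is $\sigma$-finite, for each $\nu \in \mathfrak{M}_{\ll\mu}$, there exists a unique (up to the equivalence class modulo $\mu$) Radon-Nikodym density $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ of $\nu$ relative to $\mu$. This needs some clarification:

1) Given a function $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, $\int_\mu f$ defines a measure, which is absolutely continuous with respect to $\mu$. As noticed above, $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+),\ \mathfrak{M}_{\ll\mu},\ \int_\mu]$ is thereby a mapping.

2) Recall (Definitions and Remarks 1.14 (iii)) that the $\mu$-almost everywhere property of equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and thus on $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, as a subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Consequently, $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_E$ is a quotient set, "inherited" from (1.14). On the other hand, by Corollary 1.20, the mapping $\int_\mu$ "agrees" with this equivalence relation $E$, i.e., adopts $E$ as its equivalence kernel. Then, by Theorem 4.4, Chapter 1, there is a unique function, say
$$\textstyle\widetilde{\int}_\mu : \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_E \to \mathfrak{M}_{\ll\mu},$$
such that
$$\textstyle\int_\mu = \widetilde{\int}_\mu\circ\pi_E,$$
where $\pi_E$ stands for the projection of $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ on its quotient $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_E$ by $E$. (See Section 4, Chapter 1.) Therefore, $\int_\mu$ literally turns into the injective mapping $\widetilde{\int}_\mu$ that now acts on the quotient set $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_E$.

3) The major claim (existence) of the Radon-Nikodym Theorem is that the mapping $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_E,\ \mathfrak{M}_{\ll\mu},\ \widetilde{\int}_\mu]$ is surjective. In other words, for each measure $\nu \in \mathfrak{M}_{\ll\mu}$ (i.e., absolutely continuous with respect to $\mu$), there is an equivalence class $[f]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.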
A compact version of the above arguments is as follows: 5.8 Theorem (Radon-Nikodym). Let p E m(R, 23) be a 1~-finite measure. Then [L(R, C, p; IR + ) 1 , ,],l is a bijective map.
id
As mentioned, the uniqueness of the Radon-Nikodym density class is due to Corollary 1.20. The rest of the proof of Theorem 5.8 (existence) will be rendered in Section 2, Chapter 8, for more general classes of signed measures. 0 By Radon-Nikodym's Theorem, the map .J' and its inverse, denoted by symbol L(n, E , p) I
,. Thus, for any v E 'IR>
d,
,, is therefore invertible
is also a map valued in
)
, there
is a nonempty equivalence
class [f], of Radon-Nikodym densities of u relative to
Y ,
and, for a futed
v E !Ill> , we will write
and call it the Radon-Nikodym derivative of the measure $\nu$ relative to the measure $\mu$. It should be clear that, unlike a Radon-Nikodym density, the Radon-Nikodym derivative $\frac{d\nu}{d\mu}$ is the $\mu$-equivalence class of all Radon-Nikodym densities of the measure $\nu$.

5.9 Proposition (Chain Rule). Let $f \in \frac{d\nu}{d\mu}$ and $g \in \frac{d\tau}{d\nu}$. Then, $fg \in \frac{d\tau}{d\mu}$.
CHAPTER 6. ELEMENTS O F INTEGRATION
Proof. For $A \in \Sigma$, $\tau(A) = \int g 1_A\,d\nu$ and, by Proposition 5.3, $\tau(A) = \int f g 1_A\,d\mu$, which implies that $fg$ is a density of $\tau$ relative to $\mu$, i.e., $fg \in \frac{d\tau}{d\mu}$. □
5.10 Examples.
(i) Let $\Omega$ be an uncountable set; let $\Sigma = \{A \in \mathcal{P}(\Omega)$: either $A$ or $A^c$ is countable$\}$; and let $\nu(A) = 0$ if $A$ is countable and $\nu(A) = \infty$ if $A^c$ is countable. Let $\mu$ be a counting measure on $\Sigma$, i.e., let $\mu(A) = |A|$ if $A$ is finite; otherwise, $\mu(A) = \infty$. Since the only $\mu$-null set available is $\emptyset$, it immediately follows that $\nu \ll \mu$. However, we show that $\nu$ cannot have a density relative to $\mu$. Assume the opposite. Let $g \in \mathcal{C}_+^{-1}$ be such that $g \in \frac{d\nu}{d\mu}$. Then, for all $\omega \in \Omega$, we have that:
$$0 = \nu(\{\omega\}) = \int g 1_{\{\omega\}}\,d\mu = g(\omega).$$
This implies that $g \equiv 0$, which, in turn, yields that $\nu \equiv 0$. This is a contradiction. (For a further discussion see Problem 5.3.)

(ii) Let $\varepsilon_a$ be a point mass on $(\mathbb{R}^n,\mathfrak{B})$. We are interested in whether or not $\varepsilon_a \ll \lambda$. Let $B = \{a\} \in \mathfrak{B}$. Then, $\lambda(\{a\}) = 0$ and $\varepsilon_a(\{a\}) = 1$. Hence $\varepsilon_a$ is not absolutely continuous relative to $\lambda$. Therefore, by the Radon-Nikodym Theorem, $\varepsilon_a$ does not have any density relative to $\lambda$.

(iii) Let $\lambda$ be the Borel-Lebesgue measure on $(\mathbb{R}^n,\mathfrak{B})$ and let $\nu$ be a probability measure on the same measurable space such that $\nu \ll \lambda$. Since $\lambda$ is $\sigma$-finite, by the Radon-Nikodym Theorem, there exists a density $f$, called the probability density of $\nu$, such that $\nu = \int f\,d\lambda$. Should $\nu$ be a probability distribution, say $PX^*$, induced by a random variable $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$, then $f$ is referred to as the probability density function (p.d.f. or pdf) of $X$. Let $X: (\Omega,\Sigma) \to (\mathbb{R},\mathfrak{B})$ be a random variable with $PX^* = \int f(a,\sigma^2)\,d\lambda$, where
$$f(a,\sigma^2)(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-a)^2}{2\sigma^2}\right\},\quad x \in \mathbb{R}.$$
Then, $X$ is called a normal random variable with parameters $(a,\sigma^2)$; the corresponding probability density function of $X$, $f(a,\sigma^2)$, is called the normal density.
5.11 Definition. Let $\mu_1$ and $\mu_2$ be two measures on a measurable space $(\Omega,\Sigma)$. $\mu_1$ is said to be singular relative to $\mu_2$ (or orthogonal to $\mu_2$) if there is an $A \in \Sigma$ such that
$$\mu_1(A) = \mu_2(A^c) = 0.$$
In this case we write $\mu_1 \perp \mu_2$. It should be clear that the orthogonality relation is symmetric.

5.12 Examples. (i) Let $\varepsilon_a$ be a point mass on $(\mathbb{R}^n,\mathfrak{B})$ with $a \in \mathbb{R}^n$, and let $A = \{a\}^c$. Then, $\lambda(\{a\}) = \varepsilon_a(A) = 0$. Therefore, $\varepsilon_a \perp \lambda$. In Example 5.10 (ii), it was shown that $\varepsilon_a$ is not absolutely continuous relative to $\lambda$. Now we have established another relation between $\varepsilon_a$ and $\lambda$.

(ii) Let $\mu = c\varepsilon_a + b\lambda$, where $c,b \in \mathbb{R}_+\setminus\{0\}$. It is clear that $c\varepsilon_a \perp \lambda$ and $b\lambda \ll \lambda$, which implies that the measure $\mu$ is a sum of an absolutely continuous and a singular component relative to $\lambda$. In the general case, if $\mu$ and $\nu$ are measures on $(\Omega,\Sigma)$ such that $\nu$ is $\sigma$-finite, there is a unique decomposition $\nu = \nu_a + \nu_s$, where $\nu_a$ is an absolutely continuous measure relative to $\mu$ and $\nu_s$ is a singular measure relative to $\mu$. This fact is due to the well-known Lebesgue decomposition theorem. □
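On a finite set every measure is a list of point masses, so densities, the chain rule of Proposition 5.9, absolute continuity, and singularity (Definition 5.11) can all be checked by direct computation. The following is only a minimal sketch under that assumption; the sample space, the masses, and all function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction

# Hypothetical four-point space; a measure is a dict point -> mass.
omega = ["a", "b", "c", "d"]
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1), "d": Fraction(0)}

def measure_from_density(density, base):
    """Pointwise form of A -> integral of density * 1_A with respect to base."""
    return {w: density[w] * base[w] for w in base}

def measure_of(m, subset):
    """m(A) for a finite set A."""
    return sum(m[w] for w in subset)

def is_absolutely_continuous(small, big):
    """small << big: every big-null point is small-null."""
    return all(small[w] == 0 for w in big if big[w] == 0)

def singular_witness(m1, m2):
    """Return a set A with m1(A) = 0 and m2(A^c) = 0, or None if none exists."""
    A = {w for w in m1 if m1[w] == 0}      # largest m1-null set
    complement = set(m1) - A
    return A if measure_of(m2, complement) == 0 else None

# f is a density of nu relative to mu, g a density of tau relative to nu.
f = {"a": Fraction(1, 2), "b": Fraction(3), "c": Fraction(0), "d": Fraction(5)}
g = {"a": Fraction(4), "b": Fraction(1, 3), "c": Fraction(7), "d": Fraction(2)}
nu = measure_from_density(f, mu)
tau = measure_from_density(g, nu)

# Chain rule: f*g should be a density of tau relative to mu.
fg = {w: f[w] * g[w] for w in omega}
tau_via_chain = measure_from_density(fg, mu)

# Point mass at "d": singular relative to mu, not absolutely continuous.
eps_d = {"a": Fraction(0), "b": Fraction(0), "c": Fraction(0), "d": Fraction(1)}
```

Note that `f` and `fg` are free to take any value at the $\mu$-null point `"d"`, which is the finite-space shadow of a density being determined only up to a $\mu$-equivalence class.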
PROBLEMS

5.1 Prove Proposition 5.1.

5.2 Let $\mu$ and $\nu$ be measures on $(\Omega,\Sigma)$ such that $\nu(A) \le \mu(A)$, $\forall A \in \Sigma$, and let $\mu$ be $\sigma$-finite. Show that there exists $f \in \frac{d\nu}{d\mu}$ such that $0 \le f \le 1$. [Hint: If $f$ is a Radon-Nikodym density, show that either $A = \{f > 1\} = \emptyset$ (there is a sequence $A_n \uparrow A$ such that $\nu(A_n) > \mu(A_n)$, if $\mu(A_n) > 0$) or $A \in \mathcal{N}_\mu$ (if $\mu(A_n) = 0$). In the latter case set $g = f 1_{A^c}$.]

5.3 Let $\mu$ and $\nu$ be measures on $(\Omega,\Sigma)$ such that $\nu \ll \mu$ and let $\nu$ be finite and $g \in \frac{d\nu}{d\mu}$. Denote $A = \{\omega \in \Omega: g(\omega) \ne 0\}$. Show that the restriction of $\mu$ on $\Sigma \cap A$ is $\sigma$-finite. Give an example where $\mu(A)$ is not finite.
5.4
Let r, be a Poisson measure on (W,$). Investigate whether r, is absolutely continuous or singular relative to A.
5.5
Let $\mu_1$, $\mu_2$, and $\mu$ be measures on $(\Omega,\Sigma)$ such that $\mu_1 \perp \mu$ and $\mu_2 \perp \mu$. Show that $\mu_1 + \mu_2 \perp \mu$.

5.6 Let $\mu_1 \ll \mu$ and $\mu_2 \perp \mu$. Show that $\mu_1 \perp \mu_2$.

5.7 Prove that $\mu_1 \ll \mu_2$ and $\mu_1 \perp \mu_2$ imply that $\mu_1 \equiv 0$.

5.8 Let $\mu$ and $\nu$ be $\sigma$-finite measures on $(\Omega,\Sigma)$. Show that $\mu$ and $\nu$ possess densities $f$ and $g$, respectively, relative to $\rho = \mu + \nu$.

5.9 Is orthogonality transitive?
NEW TERMS: density; measure generated by an integral; indefinite integral; Radon-Nikodym density; absolutely continuous measure; continuous measure; $\mathfrak{M}_{\ll\mu}$-set; Radon-Nikodym Theorem; Radon-Nikodym derivative; chain rule; probability density; probability density function; normal random variable; normal density; singularity of a measure; orthogonal measures
6. PRODUCT MEASURES OF FINITELY MANY MEASURABLE SPACES AND FUBINI'S THEOREM

The present section will extend the results on integration to Cartesian products. It will discuss the formation of product $\sigma$-algebras (which has some resemblance to the product topology) and product measures on them. This leads to the main result of this section, the celebrated Fubini's Theorem, which allows one to evaluate multiple integrals as iterated integrals and is the measure-theoretic analog of the corresponding result for multiple Riemann integrals.

Many textbooks in analysis and on the history of mathematics adopt "Fubini's Theorem" as a generic name for a class of theorems establishing the identity of multiple integrals with iterated integrals. In the mainstream of the evolution of calculus, when integrating a function $f$ on the rectangle $R = [a,b] \times [c,d]$, the question was raised: under what condition does the existence of the double integral $\iint_R f\,d(x,y)$ guarantee the existence of either of the iterated integrals, $\int_a^b\left(\int_c^d f(x,y)\,dy\right)dx$ and $\int_c^d\left(\int_a^b f(x,y)\,dx\right)dy$, and will they all be equal?

Fubini's Theorem, in one of its earlier forms, was proved by Augustin-Louis Cauchy in the early nineteenth century and applied to continuous functions. In 1904, Henri Lebesgue extended this result to bounded measurable functions. In 1906 Beppo Levi conjectured that $f$ need not be bounded, but just integrable. The Italian Guido Fubini (1879-1943) proved this statement in 1907. Namely, he proved that, given that the function $f$ is integrable on $R$, the functions $x \mapsto f(x,y)$ and $y \mapsto f(x,y)$ are integrable for almost all $y$ and $x$, respectively. In addition, the functions $y \mapsto \int_a^b f(x,y)\,dx$ and $x \mapsto \int_c^d f(x,y)\,dy$ are integrable and
$$\iint_R f\,d(x,y) = \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx = \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy.$$
Fubini, however, imposed some unnecessary condition on the integrand function. This was corrected and refined independently by the Italian Leonida Tonelli (1885-1946) in 1909, and by the Briton Ernest W. Hobson (1856-1933) and the Belgian Charles J.G.N. (Baron) de la Vallée-Poussin (1866-1962) in 1910, who rendered proofs entirely different from that of Fubini.

The notion of multiple integrals goes back to as early as the middle of the 18th century, first in the form of an indefinite integral. Later on, by 1770, Leonhard Euler formalized the double integral on a bounded domain and applied the above formula for iterated integrals by justifying it in terms of Riemann sums. Functions to be integrated were assumed to be continuous and the area of integration was not too complicated. This approach began to run into serious difficulties as soon as more general cases were considered. Not until Lebesgue published his famous thesis in 1902 did it become possible to tackle other classes of functions, which all led to Fubini's Theorem as of 1910. More general versions of Fubini's Theorem (which we are going to explore in this section), applied to abstract measures and integrals, became possible after the Austrian Johann Radon's extension of the Lebesgue integral in 1913 (mentioned in Section 5).
6.1 Definition. Let $(\Omega_i,\Sigma_i)$ be a measurable space for $i = 1,\ldots,n$, and let $\Omega = \prod_{i=1}^n \Omega_i$. Given arbitrary measurable sets $A_i \in \Sigma_i$, we call $A = \prod_{i=1}^n A_i$ a measurable rectangle in $\Omega$. The $\sigma$-algebra generated by all measurable rectangles is called the product $\sigma$-algebra and it is denoted by $\bigotimes_{i=1}^n \Sigma_i = \Sigma_1 \otimes \Sigma_2 \otimes \cdots \otimes \Sigma_n$.

A stronger definition of the product $\sigma$-algebra will follow. Let $\pi_i: \Omega \to \Omega_i$ be the projection map (or projection operator), $i = 1,\ldots,n$ (see Section 5, Chapter 1). Recall that $\pi_i^*(A_i)$ is a cylinder (with base $A_i$), which can also be represented as the Cartesian product
$$\pi_i^*(A_i) = \Omega_1 \times \cdots \times \Omega_{i-1} \times A_i \times \Omega_{i+1} \times \cdots \times \Omega_n.$$
In terms of projection operators, a rectangle can be expressed as the intersection of $n$ cylinders with bases $A_1,\ldots,A_n$, i.e.
$$A_1 \times \cdots \times A_n = \bigcap_{i=1}^n \pi_i^*(A_i).$$
Now recall that the inverse projection $\pi_i^*(\Sigma_i)$ is a $\sigma$-algebra on $\Omega$, namely the $\sigma$-algebra generated by the map $\pi_i$. This is in our case the $\sigma$-algebra generated by all measurable cylinders with bases $A_i \in \Sigma_i$. The union of all these $\sigma$-algebras for $i = 1,\ldots,n$ need not be a $\sigma$-algebra, and therefore the smallest $\sigma$-algebra generated by this union is to be considered.
6.2 Definition. The $\sigma$-algebra $\Sigma\left(\bigcup_{k=1}^n \pi_k^*(\Sigma_k)\right)$ induced by the projection operators $\pi_1,\ldots,\pi_n$ is called the product $\sigma$-algebra and it is denoted by $\bigotimes_{i=1}^n \Sigma_i$ or sometimes shortly by $\Sigma^{\otimes}$.

The lemma below reveals the nature of $\Sigma^{\otimes}$. Consider one more notation. Let $\mathcal{G}_i$ be an arbitrary subset of $\Sigma_i$. Let us denote by
$$\bigodot_{i=1}^n \mathcal{G}_i$$
the set of all measurable rectangles $G_1 \times \cdots \times G_n$ where the $G_i$'s are picked from $\mathcal{G}_i$.
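6.3 Lemma. Let $\mathcal{G}_i$ be a generator of $\Sigma_i$ containing a sequence $\{G_{ik}: k = 1,2,\ldots\}$ of sets, $i = 1,\ldots,n$, monotonically increasing to $\Omega_i$. Then the product $\sigma$-algebra $\Sigma^{\otimes}$ coincides with the $\sigma$-algebra $\Sigma = \Sigma\left(\bigodot_{i=1}^n \mathcal{G}_i\right)$ (generated by $\mathcal{G} = \bigodot_{i=1}^n \mathcal{G}_i$).

Proof.

(i) Because $\mathcal{G} \subseteq \Sigma^{\otimes}$, it follows that $\Sigma \subseteq \Sigma^{\otimes}$. Indeed, every $A_i \in \mathcal{G}_i$ is also an element of $\Sigma_i$ and therefore $\pi_i^*(A_i) \in \pi_i^*(\Sigma_i) \subseteq \Sigma^{\otimes}$, which implies that
$$A_1 \times \cdots \times A_n = \bigcap_{i=1}^n \pi_i^*(A_i) \in \Sigma^{\otimes}.$$

(ii) Now we show the inverse inclusion $\Sigma^{\otimes} \subseteq \Sigma$. We prove that each $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable. By our assumption, each generator of $\Sigma_i$ contains a sequence $\{G_{ik}\} \uparrow \Omega_i$. Consider
$$G_k = G_{1k} \times \cdots \times G_{i-1,k} \times A_i \times G_{i+1,k} \times \cdots \times G_{nk},$$
where $A_i \in \mathcal{G}_i$. Observe that
$$G_k = \pi_1^*(G_{1k}) \cap \cdots \cap \pi_i^*(A_i) \cap \cdots \cap \pi_n^*(G_{nk}).$$
Therefore,
$$\pi_i^*(A_i) = \bigcup_{k=1}^\infty G_k \in \Sigma$$
(since we took the union of elements of $\Sigma$). Hence, we proved that the inverse image of an arbitrary element of $\mathcal{G}_i$ (which is a generator of $\Sigma_i$) under $\pi_i$ belongs to $\Sigma$. According to Proposition 3.4, Chapter 4, we claim that the same inclusion holds for an arbitrary element of $\Sigma_i$, or that $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable. Since $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable for all $i = 1,\ldots,n$, it follows that $\Sigma$ contains all cylinders, i.e. a generator of $\Sigma^{\otimes}$. Thus $\Sigma$ contains $\Sigma^{\otimes}$.

Observe that for $\mathcal{G}_i = \Sigma_i$, we thereby reconciled two definitions of product $\sigma$-algebras: Definitions 6.1 and 6.2.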
6.4 Remark. Now we see that in light of the above lemma, $\Sigma^{\otimes} = \bigotimes_{i=1}^n \Sigma_i$ is generated by a more "economical" generator than that given in Definition 6.1, i.e. by all rectangles from $\Sigma_i$'s. In some cases, when we fail to indicate this generator, we do consider $\Sigma^{\otimes}$ as the $\sigma$-algebra generated by all rectangles as it follows from Lemma 6.3. □
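6.5 Examples. To tell sets and set collections in $\mathbb{R}$ from those in $\mathbb{R}^n$ we will attach to the latter the superscript $n$.

(i) Let $\Omega_i = \mathbb{R}$, $\Sigma_i = \mathfrak{B}_i = \mathfrak{B}$, $i = 1,\ldots,n$. Then $\Omega = \mathbb{R}^n$ and $\Sigma^{\otimes} = \bigotimes_{i=1}^n \mathfrak{B}_i$. We also know that there is another $\sigma$-algebra in $\mathbb{R}^n$, i.e. the Borel $\sigma$-algebra $\mathfrak{B}^n = \mathfrak{B}(\mathbb{R}^n)$. What is the relation between $\mathfrak{B}^n$ and $\bigotimes_{i=1}^n \mathfrak{B}_i$? Recall that $\mathfrak{B}^n$ was generated by the semi-ring of $n$-dimensional semi-open intervals (which we for convenience denote by $\mathcal{J}^n$). Observe that
$$\mathcal{J}^n = \bigodot_{i=1}^n \mathcal{J}$$
and that each $\mathcal{J}$ contains a sequence monotonically increasing to $\mathbb{R}$. Thus, by Lemma 6.3, $\mathfrak{B}^n$ and $\bigotimes_{i=1}^n \mathfrak{B}_i$ must coincide.

(ii) Recall that the Borel-Lebesgue measure $\lambda^n$ on $\mathfrak{B}^n$ was extended from the Lebesgue elementary content $\lambda_0^n$, defined on $\mathcal{J}^n$, for an interval $I = I_1 \times \cdots \times I_n$ whose sides $I_i$ have endpoints $a_i \le b_i$, as
$$\lambda_0^n(I) = \prod_{i=1}^n (b_i - a_i).$$
We know that $\lambda^n$ is the unique extension of that content on $\bigotimes_{i=1}^n \mathfrak{B}_i$. We can look at a more general problem. Let us now consider an $n$-tuple of measure spaces $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,\ldots,n$. We wonder if there exists a unique measure $\mu$ on the measurable space $\left(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i\right)$ such that for each rectangle $A = A_1 \times \cdots \times A_n$,
$$\mu(A) = \prod_{i=1}^n \mu_i(A_i).$$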
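6.6 Definition. Let $B$ be an arbitrary subset of $\Omega = \prod_{i=1}^n \Omega_i$. We define for a point $a_i \in \Omega_i$ the $a_i$-section of $B$ as
$$B_{a_i} = \{(\omega_1,\ldots,\omega_{i-1},\omega_{i+1},\ldots,\omega_n): (\omega_1,\ldots,\omega_{i-1},a_i,\omega_{i+1},\ldots,\omega_n) \in B\}.$$
If $a_i$ is such that $a_i \notin \pi_i(B)$, then $(\omega_1,\ldots,\omega_{i-1},a_i,\omega_{i+1},\ldots,\omega_n) \notin B$ and $B_{a_i} = \emptyset$. (See Figure 6.1.)

Figure 6.1. Here $\pi_{ij}$ was defined so that $\pi_{ij}: \Omega \to \Omega_i \times \Omega_j$.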
6.7 Lemma. Let $A$ be an arbitrary element of $\Sigma_1 \otimes \Sigma_2$ and let $a_i \in \Omega_i$, $i = 1,2$. Then the corresponding sections $A_{a_1}$ and $A_{a_2}$ are measurable in the way that $A_{a_1} \in \Sigma_2$ and $A_{a_2} \in \Sigma_1$.

Proof. Denote $\Sigma' = \{A \in \Sigma_1 \otimes \Sigma_2: A_{a_1} \in \Sigma_2\}$. We show that:

(a) $\Sigma'$ is a $\sigma$-algebra in $\Omega_1 \times \Omega_2$;

(b) any rectangle is an element of $\Sigma'$.

This would imply that $\Sigma'$ contains $\Sigma_1 \otimes \Sigma_2$. On the other hand, by the above definition, $\Sigma_1 \otimes \Sigma_2$ contains $\Sigma'$; therefore, $\Sigma'$ would coincide with $\Sigma_1 \otimes \Sigma_2$. By Problem 6.2, we have that the section is commutative with all set operations. Finally observe that
$$(A_1 \times A_2)_{a_1} = \begin{cases} A_2, & a_1 \in A_1, \\ \emptyset, & a_1 \notin A_1, \end{cases} \qquad (6.7)$$
i.e. any $a_1$-section of a measurable rectangle belongs to $\Sigma_2$, which proves assertion (b) above. Assertion (a) becomes an easy exercise for the reader (Problem 6.3). □
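6.8 Example. Recall that the Lebesgue measure $\lambda^*$ is complete on $\mathfrak{L}^*$. Consider $\lambda^*$ on $\mathfrak{L}^*(\mathbb{R})$. We show that $\lambda^* \otimes \lambda^*$ is not complete. Let $Q$ be any subset of $\mathbb{R}$ which does not belong to $\mathfrak{L}^*(\mathbb{R})$. Then, $\{0\} \times Q \notin \mathfrak{L}^* \otimes \mathfrak{L}^*$ or else, by Problem 6.4, $Q$ would be an element of $\mathfrak{L}^*$. On the other hand, $\{0\} \times Q$ is a (proper) subset of $\{0\} \times \mathbb{R}$. The latter, by Problem 3.1, Chapter 5, is clearly a measurable Borel null set. Hence, the Lebesgue space $(\mathbb{R}^n, \mathfrak{L}^*(\mathbb{R}^n), \lambda_n^*)$, as a complete measure space, does not coincide with the product measure space $\left(\mathbb{R}^n, \bigotimes_{i=1}^n \mathfrak{L}_i^*(\mathbb{R}), \bigotimes_{i=1}^n \lambda_i^*\right)$, in contrast with its Borel-Lebesgue counterpart.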
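If $\mu_i$ is a measure on $\Sigma_i$, $i = 1,2$, then according to Lemma 6.7, for an arbitrary set $A \in \Sigma_1 \otimes \Sigma_2$, $A_{a_1} \in \Sigma_2$ and $A_{a_2} \in \Sigma_1$, and thus $\mu_1(A_{a_2})$ and $\mu_2(A_{a_1})$ are defined terms. For a fixed $A$, $\mu_1(A_{a_2})$ and $\mu_2(A_{a_1})$ are functions of $a_2$ and $a_1$, respectively. The proposition below states that under some restrictions they are even measurable.

6.9 Lemma. Let $\mu_1$ and $\mu_2$ be $\sigma$-finite measures on $\Sigma_1$ and $\Sigma_2$, respectively. Then for a fixed set $A \in \Sigma_1 \otimes \Sigma_2$, the function
$$a_1 \mapsto \mu_2(A_{a_1}) \qquad (a_2 \mapsto \mu_1(A_{a_2}))$$
is $\Sigma_1$-$\overline{\mathfrak{B}}_+$-measurable ($\Sigma_2$-$\overline{\mathfrak{B}}_+$-measurable).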
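Proof. Throughout, $f_A$ denotes the function $f_A(a_1) = \mu_2(A_{a_1})$.

(i) We prove this proposition under the assumption that $\mu_2$ is finite. Let $\mathcal{D} = \{A \in \Sigma_1 \otimes \Sigma_2: f_A \text{ is } \Sigma_1\text{-}\mathfrak{B}_+\text{-measurable}\}$. We show that $\mathcal{D} = \Sigma_1 \otimes \Sigma_2$.

a) For $\Omega = \Omega_1 \times \Omega_2$ and $f_\Omega(a_1) = \mu_2(\Omega_2) = \text{const}$ (a real positive number), we have that $f_\Omega$ is $\Sigma_1$-$\mathfrak{B}_+$-measurable and hence $\Omega \in \mathcal{D}$. Observe that the finiteness of $\mu_2$ is essential, since otherwise, we can only arrive at the weaker result that $f_A$ is $\Sigma_1$-$\overline{\mathfrak{B}}_+$-measurable.

b) Let $A \in \mathcal{D}$. Then $A^c \in \Sigma_1 \otimes \Sigma_2$, and
$$f_{A^c}(a_1) = \mu_2((A^c)_{a_1}) = \mu_2((A_{a_1})^c) = \mu_2(\Omega_2) - \mu_2(A_{a_1})$$
is measurable as the difference of two measurable functions. This implies that $A^c \in \mathcal{D}$.

c) Let $\{A_n\}$ be a sequence of disjoint sets of $\mathcal{D}$, and write $\sum_{n=1}^\infty A_n$ for their (disjoint) union. Then,
$$f_{\sum A_n}(a_1) = \mu_2\left(\left(\sum_{n=1}^\infty A_n\right)_{a_1}\right) = \sum_{n=1}^\infty \mu_2((A_n)_{a_1}) = \sum_{n=1}^\infty f_{A_n}(a_1)$$
is clearly measurable. This implies that $\sum_{n=1}^\infty A_n \in \mathcal{D}$ and thus $\mathcal{D}$ is a Dynkin system.

d) Now let $A = A_1 \times A_2$ be a rectangle. Then by equation (6.7), $f_A(a_1) = \mu_2(A_2)$, if $a_1 \in A_1$; $f_A(a_1) = 0$, if $a_1 \notin A_1$; and therefore,
$$f_A = \mu_2(A_2)\,1_{A_1}$$
is $\Sigma_1$-$\mathfrak{B}_+$-measurable as the product of a constant function $\mu_2(A_2)$ and an indicator function of a measurable set $A_1 \in \Sigma_1$. Thus, $\mathcal{D}$ contains the set $\mathcal{G}_0$ of all measurable rectangles, which is $\cap$-stable. Therefore, $\Sigma_1 \otimes \Sigma_2$ is a Dynkin system generated by a $\cap$-stable generator, which implies $\Sigma_1 \otimes \Sigma_2 \subseteq \mathcal{D}$ (as $\mathcal{D}$ is also a Dynkin system containing $\mathcal{G}_0$). However, by the definition, $\mathcal{D} \subseteq \Sigma_1 \otimes \Sigma_2$. Therefore, $\mathcal{D}$ is a $\sigma$-algebra and
$$\mathcal{D} = \Sigma_1 \otimes \Sigma_2.$$

(ii) Now we assume that $\mu_2$ is $\sigma$-finite. Then there exists a sequence $\{\Omega_2^k\} \subseteq \Sigma_2$ such that $\Omega_2^k \uparrow \Omega_2$ and $\mu_2(\Omega_2^k)$ is finite, and we can apply the above result to the trace $\sigma$-algebras $\Sigma_2 \cap \Omega_2^k$, and state that $f_A^k$ is $\Sigma_1$-$\mathfrak{B}_+$-measurable, where
$$f_A^k(a_1) = \mu_2(A_{a_1} \cap \Omega_2^k) \quad \text{and} \quad A \in \Sigma_1 \otimes \Sigma_2.$$
Consequently, by continuity from below, the function
$$f_A = \sup_k f_A^k$$
is $\Sigma_1$-$\overline{\mathfrak{B}}_+$-measurable. □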
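6.10 Theorem. Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be two measure spaces and let $\mu_i$ be $\sigma$-finite measures. Then there exists a measure $\mu$ on $\Sigma_1 \otimes \Sigma_2$ such that, for each $A \in \Sigma_1 \otimes \Sigma_2$, $\mu(A)$ is determined by the formula
$$\mu(A) = \int \mu_2(A_{a_1})\,\mu_1(da_1) = \int \mu_1(A_{a_2})\,\mu_2(da_2). \qquad (6.10)$$
Specifically, for $A = A_1 \times A_2$,
$$\mu(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2). \qquad (6.10a)$$

Proof. Let $f_A(a_1) = \mu_2(A_{a_1})$. Then by Lemma 6.9, $f_A$ is a nonnegative extended-valued $\Sigma_1$-$\overline{\mathfrak{B}}_+$-measurable function. Therefore, the integral
$$\int f_A(a_1)\,\mu_1(da_1) \qquad (6.10b)$$
is defined for any $\Sigma_1 \otimes \Sigma_2$-measurable set $A$ and it is a nonnegative set function, which we denote by $\mu(A)$. We will prove that $\mu$ is a measure on $\Sigma_1 \otimes \Sigma_2$. Clearly $\mu(\emptyset) = 0$. In the proof of Lemma 6.9 we showed that
$$f_{\sum A_n} = \sum_{n=1}^\infty f_{A_n}$$
for $\{A_n\}$ as a sequence of measurable disjoint sets of $\Sigma_1 \otimes \Sigma_2$. Applying this equation to (6.10b), we have, by Corollary 2.2,
$$\mu\left(\sum_{n=1}^\infty A_n\right) = \int \sum_{n=1}^\infty f_{A_n}\,d\mu_1 = \sum_{n=1}^\infty \int f_{A_n}\,d\mu_1 = \sum_{n=1}^\infty \mu(A_n).$$
In particular, for $A = A_1 \times A_2$,
$$\mu(A_1 \times A_2) = \int \mu_2(A_2)\,1_{A_1}\,d\mu_1 = \mu_1(A_1)\,\mu_2(A_2).$$
The second equation in (6.10) is obtained by interchanging the roles of the two coordinates. □

The measure $\mu$ on $\Sigma_1 \otimes \Sigma_2$ is called a product measure and it is denoted by $\mu_1 \otimes \mu_2$. Theorem 6.10 is readily extendible to the product measure $\bigotimes_{i=1}^n \mu_i$ on finitely many measure spaces.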
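For finite sets, formula (6.10) can be checked by hand, since each integral reduces to a finite sum over point masses. The sketch below is only an illustration under that assumption; the two small measures and all function names are hypothetical.

```python
# Hypothetical finite measure spaces; a measure is a dict point -> mass.
mu1 = {"x": 0.5, "y": 1.5}
mu2 = {0: 2.0, 1: 1.0, 2: 0.25}

def section_at_a1(A, a1):
    """a1-section of A: all w2 with (a1, w2) in A."""
    return {w2 for (w1, w2) in A if w1 == a1}

def section_at_a2(A, a2):
    """a2-section of A: all w1 with (w1, a2) in A."""
    return {w1 for (w1, w2) in A if w2 == a2}

def measure(m, subset):
    """m(B) for a finite set B."""
    return sum(m[w] for w in subset)

def product_measure_first(A):
    """Integrate a1 -> mu2(A_{a1}) with respect to mu1."""
    return sum(measure(mu2, section_at_a1(A, a1)) * mu1[a1] for a1 in mu1)

def product_measure_second(A):
    """Integrate a2 -> mu1(A_{a2}) with respect to mu2."""
    return sum(measure(mu1, section_at_a2(A, a2)) * mu2[a2] for a2 in mu2)

# A non-rectangular set and a rectangle A1 x A2.
A = {("x", 0), ("x", 2), ("y", 1)}
rectangle = {(w1, w2) for w1 in {"x"} for w2 in {0, 1}}
```

Here both iterated sums give 2.625 for `A`, and for the rectangle they give $\mu_1(\{x\})\,\mu_2(\{0,1\}) = 0.5 \cdot 3.0 = 1.5$, in line with (6.10a).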
The uniqueness of the product measure is subject to the following.
6.11 Theorem. Let $(\Omega_i,\Sigma_i,\mu_i)$ be measure spaces and let $\mathcal{G}_i$ be $\cap$-stable generators of $\Sigma_i$, $i = 1,\ldots,n$. If the $\mu_i$'s are $\sigma$-finite, then the product measure $\bigotimes_{i=1}^n \mu_i$ on $\bigotimes_{i=1}^n \Sigma_i$ is unique.

Proof. Denote $\mathcal{G} = \bigodot_{i=1}^n \mathcal{G}_i$, which, according to Lemma 6.3, is a generator of $\bigotimes_{i=1}^n \Sigma_i$. It is easily seen that $\mathcal{G}$ is $\cap$-stable. Indeed,
$$(G_1 \times \cdots \times G_n) \cap (H_1 \times \cdots \times H_n) = (G_1 \cap H_1) \times \cdots \times (G_n \cap H_n) \in \mathcal{G},$$
since each $\mathcal{G}_i$ is $\cap$-stable. Since $\mu_i$ is $\sigma$-finite for each $i$, there are sequences $G_{ik} \uparrow \Omega_i$, $i = 1,\ldots,n$, and $k = 1,2,\ldots$, such that $\mu_i(G_{ik}) < \infty$. In Lemma 6.3 we have shown that the rectangle
$$G_k = G_{1k} \times \cdots \times G_{nk} \in \mathcal{G}$$
and, in addition, that
$$G_k \uparrow \Omega.$$
Now suppose there is a product measure $\bigotimes_{i=1}^n \mu_i$ with the property
$$\bigotimes_{i=1}^n \mu_i\,(A_1 \times \cdots \times A_n) = \prod_{i=1}^n \mu_i(A_i)$$
for every measurable rectangle from $\bigotimes_{i=1}^n \Sigma_i$. Specifically, the property holds for every rectangle selected from the generator $\mathcal{G}$. If there is another measure on $\bigotimes_{i=1}^n \Sigma_i$ with the same property on $\mathcal{G}$, i.e. that coincides with the original product measure on $\mathcal{G}$, then, by Corollary 2.14, Chapter 5 (on the uniqueness of the extension of a measure), these two measures must coincide on $\bigotimes_{i=1}^n \Sigma_i$. This proves the uniqueness of the product measure. Observe that another measure on $\bigotimes_{i=1}^n \Sigma_i$ may exist. However, it cannot be a product measure. □
6.12 Definition. Let $f: \Omega_1 \times \Omega_2 \to \Omega_3$ be a map, $\Omega = \Omega_1 \times \Omega_2 \times \Omega_3$, and let $\pi_1$, $\pi_2$ and $\pi_3$ be the corresponding projection operators in $\Omega$. Let $\pi_{ij}$ be the projection operator defined as $\pi_{ij}(\omega_1,\omega_2,\omega_3) = (\omega_i,\omega_j)$. For $a_1 \in \Omega_1$, the $a_1$-section of $f$ is the map $f_{a_1}: \Omega_2 \to \Omega_3$ for which
$$f_{a_1}(\omega_2) = f(a_1,\omega_2).$$
The $a_2$-section is defined equivalently.

6.13 Proposition. Let $(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ be a measurable product space, $(\Omega_3,\Sigma_3)$ be a measurable space, and let $f: \Omega_1 \times \Omega_2 \to \Omega_3$ be a $\Sigma_1 \otimes \Sigma_2$-$\Sigma_3$-measurable function. Then the section $f_{a_1}$ is $\Sigma_2$-$\Sigma_3$-measurable and $f_{a_2}$ is $\Sigma_1$-$\Sigma_3$-measurable.

(See Problem 6.10.)

The theorems below resemble some of the original versions of Fubini's Theorem. Theorem 6.14 (which is by many referred to as Tonelli's Theorem) states in essence that if the integral of a nonnegative extended-valued $\Sigma_1 \otimes \Sigma_2$-measurable function is finite, then it can be calculated by using iterated integrals of formula (6.14) below. To check whether or not an arbitrary $\Sigma_1 \otimes \Sigma_2$-measurable function $f$ is integrable one can also apply Theorem 6.14 directly to $|f|$ by using formula (6.13).
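6.14 Theorem (Tonelli). Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces and let $f \in \mathcal{C}_+^{-1}(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$. Then the functions
$$a_1 \mapsto \int f_{a_1}(\omega_2)\,d\mu_2(\omega_2) \quad \text{and} \quad a_2 \mapsto \int f_{a_2}(\omega_1)\,d\mu_1(\omega_1)$$
are $\Sigma_1$- and $\Sigma_2$-measurable, respectively, and
$$\int f\,d(\mu_1 \otimes \mu_2) = \int\left(\int f_{a_1}\,d\mu_2\right)d\mu_1 = \int\left(\int f_{a_2}\,d\mu_1\right)d\mu_2. \qquad (6.14)$$

Proof.

(i) As usual, we begin with nonnegative simple functions. Let $s = \sum_{i=1}^n \alpha_i 1_{A_i}$, $A_i \in \Sigma_1 \otimes \Sigma_2$. By Problem 6.11, $s_{a_1}(\omega_2) = \sum_{i=1}^n \alpha_i 1_{(A_i)_{a_1}}(\omega_2)$, and the latter is measurable and
$$\int s_{a_1}(\omega_2)\,d\mu_2(\omega_2) = \sum_{i=1}^n \alpha_i\,\mu_2((A_i)_{a_1}),$$
which is $\Sigma_1$-$\overline{\mathfrak{B}}_+$-measurable by Lemma 6.9. Hence we may integrate it with respect to the measure $\mu_1$:
$$\int\left(\int s_{a_1}(\omega_2)\,d\mu_2(\omega_2)\right)d\mu_1(a_1) = \sum_{i=1}^n \alpha_i \int \mu_2((A_i)_{a_1})\,d\mu_1(a_1). \qquad (6.14a)$$
By Theorems 6.10 and 6.11, equation (6.14a) reduces to
$$\int\left(\int s_{a_1}\,d\mu_2\right)d\mu_1 = \sum_{i=1}^n \alpha_i\,(\mu_1 \otimes \mu_2)(A_i) = \int s\,d(\mu_1 \otimes \mu_2).$$

(ii) Let $f \in \mathcal{C}_+^{-1}(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$. Then there is a nondecreasing sequence $\{s_n\}$ of nonnegative simple $\Sigma_1 \otimes \Sigma_2$-measurable functions such that $f = \sup\{s_n\}$. Thus, by the Monotone Convergence Theorem,
$$\int f\,d(\mu_1 \otimes \mu_2) = \sup_n \int s_n\,d(\mu_1 \otimes \mu_2) = \sup_n \int\left(\int (s_n)_{a_1}\,d\mu_2\right)d\mu_1 = \int\left(\int \sup_n (s_n)_{a_1}\,d\mu_2\right)d\mu_1.$$
By Problem 6.12, finally,
$$\int f\,d(\mu_1 \otimes \mu_2) = \int\left(\int f_{a_1}\,d\mu_2\right)d\mu_1.$$
Now it is obvious how to complete the proof of formula (6.14). □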
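6.15 Theorem (Fubini). Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces and let $f \in L^1(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2, \mu_1 \otimes \mu_2; \mathbb{R})$. Then the functions
$$f_{a_1} \quad \text{and} \quad f_{a_2}$$
are $\mu_2$-integrable $\mu_1$-a.e., and $\mu_1$-integrable $\mu_2$-a.e., respectively; the functions
$$a_1 \mapsto \int f_{a_1}\,d\mu_2 \quad \text{and} \quad a_2 \mapsto \int f_{a_2}\,d\mu_1$$
are $\mu_1$-a.e. and $\mu_2$-a.e. defined, and they are $\mu_1$- and $\mu_2$-integrable, respectively, and the formula
$$\int f\,d(\mu_1 \otimes \mu_2) = \int\left(\int f_{a_1}\,d\mu_2\right)d\mu_1 = \int\left(\int f_{a_2}\,d\mu_1\right)d\mu_2 \qquad (6.15)$$
holds.

Proof. Since by the condition of the theorem,
$$\int |f|\,d(\mu_1 \otimes \mu_2) < \infty,$$
by Problem 6.13 and then by Theorem 6.14 applied to $|f|$, we have that
$$\int\left(\int |f_{a_1}|\,d\mu_2\right)d\mu_1 = \int\left(\int |f_{a_2}|\,d\mu_1\right)d\mu_2 = \int |f|\,d(\mu_1 \otimes \mu_2) < \infty.$$
By Proposition 1.21, the functions $a_1 \mapsto \int |f_{a_1}(\omega_2)|\,d\mu_2(\omega_2)$ and $a_2 \mapsto \int |f_{a_2}(\omega_1)|\,d\mu_1(\omega_1)$ are $\mu_1$- and $\mu_2$-a.e. finite, respectively. In other words, for almost all $a_1 \in \Omega_1$ and $a_2 \in \Omega_2$, $|f_{a_1}|$ and $|f_{a_2}|$ are $\mu_2$- and $\mu_1$-integrable, respectively. It is easy to verify that the same applies to the pairs of functions $(f_{a_1})^+$, $(f_{a_1})^-$ and $(f_{a_2})^+$, $(f_{a_2})^-$. By Theorem 6.14, the pairs of functions
$$a_1 \mapsto \int (f_{a_1})^{\pm}\,d\mu_2$$
and
$$a_2 \mapsto \int (f_{a_2})^{\pm}\,d\mu_1$$
are $\Sigma_1$- and $\Sigma_2$-measurable, and they are $\mu_1$-a.e. and $\mu_2$-a.e. defined, respectively. Finally, applying formula (6.14) to $f^+$, $(f_{a_1})^+$, and $(f_{a_2})^+$ and then to $f^-$, $(f_{a_1})^-$, and $(f_{a_2})^-$, we arrive at formula (6.15). □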
6.16 Remarks. (i) Fubini's theorem generalizes Theorem 6.10, since for $f = 1_A$ and $A \in \Sigma_1 \otimes \Sigma_2$ the result of Theorem 6.10 immediately follows. That is why Theorem 6.10 is also called Fubini's theorem. (ii) Observe that Tonelli's and Fubini's Theorems differ not only in that they are applied to nonnegative and arbitrary measurable functions, respectively, but also in that membership of a function on a product space in the $L^1$-space is a conclusion in Tonelli's Theorem while being a hypothesis in Fubini's Theorem. (iii) The above results (including Fubini's theorem) can naturally be extended from the case of two spaces to the case of any finitely many spaces. This involves a relatively straightforward notational routine and will not be discussed further except by way of examples. □
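Remark (ii) can be made concrete with a double series, i.e. with a function on $\mathbb{N} \times \mathbb{N}$ integrated against the product of two counting measures. The array below is a standard textbook counterexample and is not taken from this section; it is offered only as a sketch, with hypothetical function names, of what goes wrong when the integrability hypothesis of Fubini's Theorem fails: the two iterated sums exist but disagree.

```python
def a(m, n):
    """Entry of the array: +1 on the diagonal, -1 just right of it, else 0."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

def inner_sum_over_n(m):
    # Row m is nonzero only at n = m and n = m + 1, so this finite sum is exact.
    return sum(a(m, n) for n in range(m + 2))

def inner_sum_over_m(n):
    # Column n is nonzero only at m = n - 1 and m = n, so this finite sum is exact.
    return sum(a(m, n) for m in range(n + 1))

N = 200
# Sum over n first, then over m: every row sums to 0.
sum_n_then_m = sum(inner_sum_over_n(m) for m in range(N))
# Sum over m first, then over n: column 0 sums to 1, all others to 0.
sum_m_then_n = sum(inner_sum_over_m(n) for n in range(N))

# The absolute values are not summable: each row contributes 2.
abs_partial = sum(abs(a(m, n)) for m in range(50) for n in range(51))
```

The row-first iterated sum is 0 and the column-first one is 1 for every truncation level `N`, while the partial sums of $|a(m,n)|$ grow without bound, so the array is not integrable with respect to the product counting measure.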
6.17 Definition. Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,\ldots,n$, be measure spaces with $\sigma$-finite measures. Then the triple $\left(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i, \bigotimes_{i=1}^n \mu_i\right)$ is called the product (measure) space.
6.18 Examples. (i) Let $C_n(x_0,r)$ be a closed ball in $\mathbb{R}^n$ (with the Euclidean metric). Denote $V_n = \lambda^n(C_n(\theta,1))$. We wish to show that $\lambda^n(C_n(x_0,r)) = V_n r^n$, where
$$V_{2k-1} = \frac{2^k \pi^{k-1}}{1 \cdot 3 \cdots (2k-1)} \qquad (6.18)$$
and
$$V_{2k} = \frac{\pi^k}{k!}, \quad k = 1,2,\ldots \qquad (6.18a)$$
A closed ball is clearly a Borel set. Let $L_n(r,x_0): \mathbb{R}^n \to \mathbb{R}^n$ be the bijective map $L_n(r,x_0)(x) = rx + x_0$, where $x, x_0 \in \mathbb{R}^n$ and $r > 0$. Then, $C_n(\theta,1) = L_n^*(r,x_0)(C_n(x_0,r))$. Observe that $L_n^*(r,x_0) = E(\tfrac{1}{r}) \circ M(-x_0)$, where $E$ means expansion with factor $\tfrac{1}{r}$ and $M$ stands for the parallel motion (here with the shift $-x_0$). On the other hand,
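the $(x_1,x_2)$-section of the unit ball is
$$\left(C_n(\theta,1)\right)_{(x_1,x_2)} = \left\{(x_3,\ldots,x_n): \sum_{i=3}^n x_i^2 \le 1 - x_1^2 - x_2^2\right\} = L_{n-2}^*\left((1 - x_1^2 - x_2^2)^{-1/2},\theta\right)\left(C_{n-2}(\theta,1)\right),$$
where $(x_1,x_2) \in C_2(\theta,1)$. This yields
$$1_{C_n(\theta,1)}(x_1,\ldots,x_n) = 1_{C_2(\theta,1)}(x_1,x_2)\,1_{(C_n(\theta,1))_{(x_1,x_2)}}(x_3,\ldots,x_n). \qquad (6.18b)$$
Now, by Fubini's theorem and by (6.18b),
$$V_n = \int 1_{C_n(\theta,1)}\,d\lambda^n = \int_{C_2(\theta,1)}\left(\int 1_{(C_n(\theta,1))_{(x_1,x_2)}}\,d\lambda^{n-2}\right)d\lambda^2(x_1,x_2).$$
The interior (second) integral is, due to Proposition 5.3, 1), Chapter 5, and the above observation, equal to
$$\lambda^{n-2}\left(L_{n-2}^*\left((1 - x_1^2 - x_2^2)^{-1/2},\theta\right)\left(C_{n-2}(\theta,1)\right)\right).$$
By Proposition 5.3, 2), Chapter 5, and by Theorem 4.1, the last integral equals $(1 - x_1^2 - x_2^2)^{(n-2)/2}\,V_{n-2}$. Therefore, by Fubini's theorem,
$$V_n = V_{n-2}\int_{C_2(\theta,1)} (1 - x_1^2 - x_2^2)^{(n-2)/2}\,d\lambda^2(x_1,x_2). \qquad (6.18c)$$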
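This is a Lebesgue integral of a continuous function on the unit ball and it can be reduced to a Riemann integral by using conventional techniques for Riemann integrals. For example, in polar coordinates the double integral above is then
$$\int_0^{2\pi}\int_0^1 (1 - \rho^2)^{(n-2)/2}\,\rho\,d\rho\,d\varphi = \frac{2\pi}{n},$$
and thus
$$V_n = V_{n-2}\,\frac{2\pi}{n}, \quad n = 2,3,\ldots. \qquad (6.18d)$$
Let $V_0 = 1$. Then, $V_1 = 2$ (as the Lebesgue measure of the interval $[-1,1]$). By (6.18c), $V_2 = \pi$ (that agrees with the definition of $V_0$). Further,
$$V_3 = V_1\,\frac{2\pi}{3} = \frac{4\pi}{3} \quad \text{and} \quad V_4 = V_2\,\frac{2\pi}{4} = \frac{\pi^2}{2}.$$
The validity of formulas (6.18) and (6.18a) is then easily shown by induction and the use of (6.18d).

(ii) We show that Fubini's theorem need not hold when at least one of the measures, $\mu_1$ or $\mu_2$, is not $\sigma$-finite. Let $(\Omega_i,\Sigma_i) = ([0,1],\mathfrak{B}([0,1]))$, $i = 1,2$, $\mu_1 = \mathrm{Res}_{[0,1]}\lambda$, and $\mu_2(A) = |A|$, if $A$ is finite and $\mu_2(A) = \infty$, if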
The validity of formulas (6.18) and (6.18a) is then easily shown by induction and the use of (6.18d). (ii) We show that Fubini's theorem need not hold when a t least one of the measures, p1 or p 2 is not c-finite. Let (Ri,Ci)= ([0,1],%([0,1])), i = 1,2, p1 = Res[o,llA, and p2(A) = IAl, if A is finite and p2(A) = oo, if
A is infinite, where A E C2.Denote
the diagonal of the square (see Figure 6.2).
6. Product Measures and Fubini's Theorem
Figure 6.2 We show that D E C1 8 C2 = %2([0,1]2). Let
and
n 00
Then D E C1 8 C2 for D = An. Now we find n=l
I
p2(Dx)X(dx) = X([O,l]) = 1 (since p2(Dx) = 1). [o, 11 On the other hand,
So as we see, Fubini's theorem or more precisely, the second equation in (6.10) of Theorem 6.10, does not hold. (iii) Let (N,T(N), y) be the counting measure space introduced in Example 1.2 (viii), Chapter 5, for more general measure spaces. We will consider a sequence isn) of nonnegative simple functions on N as
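where $(a_k)$ is a nonnegative sequence of reals, so that
$$\int s_n\,d\gamma = \sum_{k=0}^n a_k.$$
Hence the integral of $a = \sup\{s_n\}$ will turn to a series:
$$\int a\,d\gamma = \sum_{n=0}^\infty a_n. \qquad (6.18e)$$
This is readily extendible to a series with real-valued terms. In other words, the integral of a sequence $\{a_n\} \subseteq \mathbb{R}$ with respect to the counting measure $\gamma$ is represented by the series in (6.18e).

Let $\{f_n\}$ be a sequence of nonnegative functions of $\mathcal{C}_+^{-1}(\Omega,\Sigma)$ and let $\mu$ be a $\sigma$-finite measure on $\Sigma$. Since the above counting measure $\gamma$ is $\sigma$-finite, the function $f$ (where $f(n,\omega) = f_n(\omega)$) obviously meets the conditions of Tonelli's Theorem 6.14: $f \in \mathcal{C}_+^{-1}(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma)$. Consequently, the functions
$$n \mapsto \int f_n\,d\mu \quad \text{and} \quad \omega \mapsto \sum_{n=0}^\infty f_n(\omega)$$
are $\mathcal{P}(\mathbb{N})$- and $\Sigma$-measurable, respectively, and
$$\sum_{n=0}^\infty \int f_n\,d\mu = \int \sum_{n=0}^\infty f_n\,d\mu. \qquad (6.18f)$$
(6.18f) is a nice illustration of Tonelli's Theorem. However, it is a slightly weaker alternative to Beppo Levi's Corollary 2.2, since the latter does not require $\mu$ to be $\sigma$-finite.

Now, let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$. To use an analog of Fubini's Theorem, we need to make sure that $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, or, alternatively, apply the above procedure initially to the sequence $\{|f_n|\}$ instead. Then, from (6.18f) we can get
$$\int |f|\,d(\gamma \otimes \mu) = \sum_{n=0}^\infty \int_\Omega |f_n|\,d\mu = \int_\Omega \sum_{n=0}^\infty |f_n|\,d\mu. \qquad (6.18g)$$
Should now $\int |f|\,d(\gamma \otimes \mu)$ or their equivalents, $\sum_{n=0}^\infty \int_\Omega |f_n|\,d\mu$ or $\int_\Omega \sum_{n=0}^\infty |f_n|\,d\mu$, be finite, then it would yield that $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$ and therefore, Fubini's formula (6.18f) would hold true, now for an arbitrary sequence of measurable functions $\{f_n\}$. Notice that, since
$$\int |f|\,d(\gamma \otimes \mu) < \infty$$
is a necessary condition for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, it automatically implies that
$$\sum_{n=0}^\infty \int_\Omega |f_n|\,d\mu < \infty \quad \text{or} \quad \int_\Omega \sum_{n=0}^\infty |f_n|\,d\mu < \infty \qquad (6.18h)$$
would be alternative necessary conditions for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$ (although the latter is by no means a necessary condition for Fubini's Theorem). This version of Fubini's Theorem can compete with the Generalized Monotone Convergence Theorem 2.4 and Lebesgue's Dominated Convergence Theorem 2.6 in some applications.

(iv) As an illustration of the last application of Fubini's Theorem, consider a random variable $X$ on a probability space $(\Omega,\Sigma,P)$. The function $m(\theta) = \mathbb{E}[e^{\theta X}]$ (normally, complex-valued) is known to be the moment generating function of $X$. If we expand $e^{\theta X}$ in the Maclaurin series,
$$e^{\theta X} = \sum_{n=0}^\infty \frac{\theta^n X^n}{n!},$$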
we will have, with
$$m(\theta) = \mathbb{E}\left[\sum_{n=0}^\infty \frac{\theta^n X^n}{n!}\right],$$
a scenario of the application of Fubini's Theorem discussed in Example (iii). Hence we have to make sure that, in light of (6.18h), the series
$$\sum_{n=0}^\infty \frac{|\theta|^n}{n!}\,\mathbb{E}[|X|^n]$$
is convergent in some vicinity of $\theta = 0$. [The latter holds for many practical cases, provided that $\mathbb{E}[|X|^n] < \infty$ for all $n$.] Assuming that all absolute moments of $X$ exist and the above series converges, the application of Fubini's Theorem (6.18f) yields that
$$m(\theta) = \sum_{n=0}^\infty \frac{\theta^n}{n!}\,\mathbb{E}[X^n]$$
as a Taylor series expansion of $m(\theta)$ in terms of all moments of the random variable $X$, and consequently that
$$\mathbb{E}[X^n] = m^{(n)}(0), \quad n = 0,1,\ldots.$$

(v) Consider the Borel-Lebesgue measure $\lambda^2 = \lambda \otimes \lambda$ on the Borel $\sigma$-algebra $\mathfrak{B}^2$. Let $A = \mathbb{Q} \times \mathbb{R}$. According to Problem 3.1, Chapter 5, $A$ is a countable union of Borel-null sets. Thus,
$$\int 1_A\,d\lambda^2 = \lambda^2(A) = 0.$$
On the other hand, the section $(1_A)_{a_1}$ is not $\lambda$-integrable for all $a_1 \in \mathbb{Q}$. This is, however, in agreement with Fubini's Theorem that the function $(1_A)_{a_1}$ is $\lambda$-integrable only for almost all $a_1 \in \mathbb{R}$.

(vi) Now we discuss yet another application often occurring in probability theory. Let $\mu_F$ and $\mu_G$ be finite Borel-Lebesgue-Stieltjes measures induced by distribution functions $F$ and $G$ (see Remark 3.5 (iii), Chapter 5). Recall that
$$\lim_{x \to -\infty} F(x) = \lim_{x \to -\infty} G(x) = 0.$$
From Problem 3.7, Chapter 5, given a compact interval $I = [a,b]$, we have that

Let $T_u = \{(x,y) \in [a,b]^2: y > x\}$ and $T_l = \{(x,y) \in [a,b]^2: y \le x\}$, which are the upper and lower triangles of the square $I^2$, respectively. Now we calculate the measure of $I^2$ under $\mu_F \otimes \mu_G$ by using Theorem 6.10 in terms of Lebesgue-Stieltjes integrals:

Equating (6.18i) and (6.18j) we arrive at

Interchanging the roles of $F$ and $G$ we have

Hence, from (6.18k) and (6.18l) we establish the integration by parts formula (6.18m) for Lebesgue-Stieltjes integrals.
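Returning to Example 6.18 (i), the recursion (6.18d) with the starting values $V_0 = 1$ and $V_1 = 2$ is easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed form $\pi^{n/2}/\Gamma(n/2+1)$ used for the cross-check is the standard expression for the volume of the unit ball, which is not stated in this section.

```python
import math

def unit_ball_volume(n):
    """V_n from the recursion V_n = V_{n-2} * 2*pi/n with V_0 = 1, V_1 = 2."""
    if n < 0:
        raise ValueError("dimension must be nonnegative")
    if n % 2 == 0:
        volume, start = 1.0, 2   # even dimensions start from V_0
    else:
        volume, start = 2.0, 3   # odd dimensions start from V_1
    for k in range(start, n + 1, 2):
        volume *= 2.0 * math.pi / k
    return volume

def unit_ball_volume_closed_form(n):
    """Standard closed form pi^(n/2) / Gamma(n/2 + 1), used only as a check."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def ball_volume(n, r):
    """Volume of a closed ball of radius r in R^n, i.e. V_n * r^n."""
    return unit_ball_volume(n) * r ** n
```

For instance, `unit_ball_volume(3)` reproduces $4\pi/3$ and `unit_ball_volume(4)` reproduces $\pi^2/2$, the values of $V_3$ and $V_4$ obtained from (6.18d).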
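PROBLEMS

6.1 Let $\mathcal{G}_0$ be the set of all measurable rectangles $A = A_1 \times \cdots \times A_n$, $A_i \in \Sigma_i$. Denote by $\mathcal{A}(\mathcal{G}_0)$ the algebra generated by $\mathcal{G}_0$ and by $\mathcal{U}(\mathcal{G}_0)$ the collection of all finite unions of disjoint rectangles of $\mathcal{G}_0$. Show that $\mathcal{A}(\mathcal{G}_0) = \mathcal{U}(\mathcal{G}_0)$.

6.2 Prove that the section is commutative with respect to all set operations.

6.3 Show the validity of assertion a) in Lemma 6.7.

6.4 Show that a rectangle $R_1 \times R_2 \in \Sigma_1 \otimes \Sigma_2$, where $R_1$ and $R_2$ are not empty, if and only if $R_1 \in \Sigma_1$ and $R_2 \in \Sigma_2$.

6.5 Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces. Show that the product measure $\mu_1 \otimes \mu_2$ is $\sigma$-finite.

6.6 Let $(\mathbb{R}^n,\mathfrak{B}^n,\lambda^n)$ and $(\mathbb{R}^k,\mathfrak{B}^k,\lambda^k)$ be the Borel-Lebesgue measure spaces. Show that $\lambda^n \otimes \lambda^k = \lambda^{n+k}$.

6.7 Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be measure spaces with $\sigma$-finite measures and let $A \in \Sigma_1 \otimes \Sigma_2$. Show that the following statements are equivalent:

6.8 Let $A \subseteq \Omega_1 \times \Omega_2$ and let $a_1 \in \Omega_1$. Show that $(1_A)_{a_1} = 1_{A_{a_1}}$.

6.9 Show that $f_{a_1}^*(A_3) = (f^*(A_3))_{a_1}$, $A_3 \subseteq \Omega_3$.

6.10 Prove Proposition 6.13. [Hint: Apply Lemma 6.7 and Problem 6.9.]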
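6.11 Let $A, B \subseteq \Omega_1 \times \Omega_2$ be two disjoint sets and let $\alpha,\beta \in \mathbb{R}$. Show that
$$(\alpha 1_A + \beta 1_B)_{a_1} = \alpha 1_{A_{a_1}} + \beta 1_{B_{a_1}}.$$

6.12 Let $f \in \mathcal{C}_+^{-1}(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ and let $\{s_n\}$ be a sequence of nonnegative simple $\Sigma_1 \otimes \Sigma_2$-measurable functions such that $f = \sup\{s_n\}$. Show that $f_{a_1} = \sup\{(s_n)_{a_1}\}$. [Hint: Apply Theorem 6.5, Chapter 5, and Problem 6.10.]

6.13 Show that $|f|_a = |f_a|$, $(f^+)_a = (f_a)^+$, and $(f^-)_a = (f_a)^-$.

6.14 Let $\Sigma_1$ and $\Sigma_2$ be $\sigma$-algebras on $\Omega_1$ and $\Omega_2$, respectively. Show that $\Sigma_1 \odot \Sigma_2$ is a semi-ring. Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be semi-rings on $\Omega_1$ and $\Omega_2$, respectively. Is $\mathcal{S}_1 \odot \mathcal{S}_2$ also a semi-ring?

6.15 What will the smallest algebra generated by $\Sigma_1 \odot \Sigma_2$ from Problem 6.14 look like?

6.16 Let $\mu_i$ and $\nu_i$ be finite measures on a measurable space $(\Omega_i,\Sigma_i)$, $i = 1,2$. Show that if $\mu_i \ll \nu_i$, $i = 1,2$, then $\mu_1 \otimes \mu_2 \ll \nu_1 \otimes \nu_2$.

6.17 Let $(\Omega,\Sigma,\mu)$ be a $\sigma$-finite measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Prove that
$$\int f\,d\mu = \int_0^\infty \mu\{f > t\}\,dt \qquad (P6.17)$$
by using Theorem 6.10.

6.18 Generalization of (P6.17). In the condition of Problem 6.17, let $g: \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous monotone nondecreasing function such that $g(0) = 0$ and which is continuously differentiable on $(0,\infty)$. Show that
$$\int g \circ f\,d\mu = \int_0^\infty g'(t)\,\mu\{f > t\}\,dt.$$

6.19 Show that if $F$ and $G$ in Example 6.18 (vi) have no common discontinuities, then formula (6.18m) reduces to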
NEW TERMS: measurable rectangle; product $\sigma$-algebra; measurable cylinder; section of a set; $a_i$-section of a set; section of a function; $a_i$-section of a function; Tonelli's Theorem; Fubini's Theorem; product measure space; closed ball in $\mathbb{R}^n$, Borel-Lebesgue measure of; integral with respect to the counting measure; moment generating function; integration by parts formula for Lebesgue-Stieltjes integrals
7. APPLICATIONS OF FUBINI'S THEOREM Product measures and Fubini's theorem find some of their finest applications in probability theory. One of them has to do with independence of random variables, a popular topic in statistics and stochastic processes.
7.1 Definitions. Let (R,C,P) be a probability space. (i) Let g C C be an arbitrary (indexed) family of events (i.e. measurable subsets of R). Cj is called P-independent (or just independent) if, for any finite subcollection {A .,Ai } of n > 2 events from g, the n following relation holds true: P{Ai n . . . n A i ) = P ( A i l ) - - - P ( A i ) . n
1
(7. l a )
n
Observe that, if $\mathcal{G}$ is an independent family of events, then the Dynkin system generated by $\mathcal{G}$ is also independent (see Problem 7.1). If, in addition, $\mathcal{G}$ is $\cap$-stable, then $\mathcal{D}(\mathcal{G})$ is an independent σ-algebra.
(ii) Let $\mathfrak{G} = \{\mathcal{G}_i;\ i \in I\}$ be an indexed collection of families of events $\mathcal{G}_i \subseteq \Sigma$. $\mathfrak{G}$ is called independent if, for any finite subset $\{i_1,\dots,i_n\} \subseteq I$, $n \ge 2$, and for any choice of $A_{i_k} \in \mathcal{G}_{i_k}$, $k = 1,\dots,n$, the events $A_{i_1},\dots,A_{i_n}$ are independent.
(iii) Let $\mathfrak{X} = \{X_i;\ i \in I\}$ be an indexed collection of random variables on $(\Omega, \Sigma, P)$. $\mathfrak{X}$ is called independent if the corresponding collection $\{\sigma(X_i);\ i \in I\}$ of σ-algebras generated by these random variables is independent.
(iv) Let $X_i: \Omega \to \Omega_i$, $i = 1,\dots,n$, be $\Sigma$-$\Sigma_i$ measurable random variables on $(\Omega, \Sigma, P)$. Then we denote
$$\bigotimes_{i=1}^n X_i = (X_1,\dots,X_n): \Omega \to \Omega_1 \times \dots \times \Omega_n$$
and call it the product map.
(v) It appears (Problem 7.2) that the product map $\bigotimes_{i=1}^n X_i$ is $\Sigma$-$\bigotimes_{i=1}^n \Sigma_i$-measurable. Therefore, by letting
$$P_{\otimes X_i} = P\Big(\bigotimes_{i=1}^n X_i\Big)^{*}$$
we can define a probability measure on $\big(\prod_{i=1}^n \Omega_i,\ \bigotimes_{i=1}^n \Sigma_i\big)$ and call it the joint distribution of random variables $X_1,\dots,X_n$. □
Let $P_{X_i} = P X_i^{*}$ be the distribution of the random variable $X_i$, $i = 1,\dots,n$. This is a probability measure on $\Sigma_i$. Then, according to the previous section, we can construct the triple
$$\Big(\prod_{i=1}^n \Omega_i,\ \bigotimes_{i=1}^n \Sigma_i,\ \bigotimes_{i=1}^n P_{X_i}\Big).$$
On the other hand, we already have another measure $P_{\otimes X_i}$ on $\big(\prod_{i=1}^n \Omega_i,\ \bigotimes_{i=1}^n \Sigma_i\big)$, which, in general, need not be a product measure. The following statement clarifies the matter.
7.2 Proposition. The joint probability distribution $P_{\otimes X_i}$ is a product measure and equals $\bigotimes_{i=1}^n P_{X_i}$ if and only if the random variables $X_1,\dots,X_n$ are independent. □
(See Problem 7.3.) Note that the treatment of the product $P_{\otimes X_i}$ of more than finitely many independent random variables is more complicated; such a treatment involves the product of infinitely many σ-algebras and measures. Another important application of product measures and Fubini's theorem is the notion of "convolution" of measures.
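Proposition 7.2 can be illustrated on a finite probability space, where a measure on the product space is a product measure exactly when it factorizes on singletons. The following Python sketch is only an illustration under that assumption; the two-dice space and the helper names (`joint_distribution`, `marginal`, `is_product_measure`) are hypothetical and not part of the text.

```python
from fractions import Fraction
from itertools import product

def joint_distribution(omega_probs, X, Y):
    """Image measure of P under the product map (X, Y) on a finite space."""
    joint = {}
    for omega, p in omega_probs.items():
        key = (X(omega), Y(omega))
        joint[key] = joint.get(key, Fraction(0)) + p
    return joint

def marginal(joint, axis):
    """Distribution of one coordinate of the product map."""
    m = {}
    for key, p in joint.items():
        m[key[axis]] = m.get(key[axis], Fraction(0)) + p
    return m

def is_product_measure(joint):
    """True if the joint distribution equals the product of its marginals."""
    px, py = marginal(joint, 0), marginal(joint, 1)
    return all(joint.get((a, b), Fraction(0)) == px[a] * py[b]
               for a, b in product(px, py))

# Illustrative probability space: two fair dice with the uniform measure.
omega_probs = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

def first(w):
    return w[0]

def second(w):
    return w[1]

def total(w):
    return w[0] + w[1]

# The two coordinates are independent, so their joint distribution factorizes.
independent_case = is_product_measure(joint_distribution(omega_probs, first, second))
# The first die and the total are dependent, so the joint distribution does not.
dependent_case = is_product_measure(joint_distribution(omega_probs, first, total))
```

Exact rational arithmetic is used so that the equality test in `is_product_measure` is not affected by rounding.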
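7.3 Definition. Let $\mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$ be the set of all finite Borel measures on $\mathcal{B}^k = \mathcal{B}(\mathbb{R}^k)$. Clearly, $\mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}(\mathbb{R}^k))$ is a semi-linear space over the semifield $\mathbb{R}_+$. Let $\mu = \bigotimes_{i=1}^n \mu_i$, where $\mu_i \in \mathfrak{B}_+$ (it is easily seen that $\mu \in \mathfrak{B}_+(\mathbb{R}^{kn}, \mathcal{B}^{kn})$). Consider the linear measurable map $L_n: \mathbb{R}^{kn} \to \mathbb{R}^k$ defined as
$$L_n(x_1,\dots,x_n) = \sum_{i=1}^n x_i.$$
Then the image measure $\mu L_n^{*}$ is called the convolution of measures $\mu_1,\dots,\mu_n$ and it is denoted by $\mu_1 * \dots * \mu_n$.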
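7.4 Properties of Convolution.
(i) Let $f \in \mathcal{C}_+^{-1}(\mathbb{R}^k, \mathcal{B}^k)$ and let $\mu_1,\mu_2 \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$. Then
$$\int f\,d(\mu_1 * \mu_2) = \int f \circ L_2(x_1,x_2)\,d\mu_1\otimes\mu_2(x_1,x_2) = \int\!\!\int f(x_1 + x_2)\,\mu_1(dx_1)\,\mu_2(dx_2) \quad \text{(by Fubini's theorem)}. \tag{7.4a}$$
For $f \in \mathcal{C}^{-1}(\mathbb{R}^k, \mathcal{B}^k)$ and $\mu_1,\mu_2 \in \mathfrak{B}_+$, we require that $f = f^+ - f^-$ be $\mu_1 * \mu_2$- (or $\mu_2 * \mu_1$-) integrable to have (7.4a) valid. Specifically, let $f = 1_A$, $A \in \mathcal{B}^k$. Then,
$$\mu_1 * \mu_2(A) = \int\!\!\int 1_A(x_1 + x_2)\,\mu_1(dx_1)\,\mu_2(dx_2) = \int \mu_1(A - x_2)\,\mu_2(dx_2) \tag{7.4b}$$
(since $1_A \circ T = 1_{A - x_2}$, where $T(x_1) = x_1 + x_2$ and $T^{-1}(y) = y - x_2$).
Applying Fubini's Theorem to (7.4a) (i.e. interchanging the integration) we also get
$$\mu_1 * \mu_2(A) = \int \mu_2(A - x_1)\,\mu_1(dx_1).$$
But the expression on the right is exactly $(\mu_2 * \mu_1)(A)$, which implies commutativity of the convolution.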
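(ii) If $\mu_1,\mu_2,\nu \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$, then $\mu_1 + \mu_2 \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$, and we have by (7.4b) that
$$(\mu_1 + \mu_2) * \nu = \mu_1 * \nu + \mu_2 * \nu.$$
Thus, the convolution is distributive.
(iii) Let $\{\nu, \mu_n\} \subseteq \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$ such that $\sum_{n=1}^{\infty} \mu_n \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$. Then, by the same argument as in (ii), we get
$$\Big(\sum_{n=1}^{\infty} \mu_n\Big) * \nu = \sum_{n=1}^{\infty} (\mu_n * \nu),$$
i.e. the convolution is also σ-distributive.
(iv) Let $\mu_1$ and $\mu_2$ be as above and let $\beta \in \mathbb{R}_+ \setminus \{0\}$. Then, it is easily seen that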
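7.5 Examples.
(i) Let $\mu_1 = \varepsilon_a$ and $\mu_2 = \varepsilon_b \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$. Then by (2.2),
$$\varepsilon_a * \varepsilon_b(A) = \int \varepsilon_a(A - x)\,\varepsilon_b(dx) = \varepsilon_a(A - b) = \varepsilon_{a+b}(A),$$
since $\varepsilon_a(A - b) = 1_{A-b}(a) = 1_A(a + b)$ for fixed $a, b, A$. We can therefore write
$$\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}. \tag{7.5a}$$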
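(ii) Let $\beta_{n,p}$ and $\beta_{m,p}$ be binomial measures introduced in Example 1.8 (iii), Chapter 5. We find the convolution of these measures by applying (7.4c)-(7.5a) (with $q = 1 - p$):
$$\beta_{m,p} * \beta_{n,p} = \sum_{i=0}^{m}\sum_{k=0}^{n} \binom{m}{i}\binom{n}{k} p^{i+k} q^{m+n-i-k}\,\varepsilon_{i+k}.$$
Denoting $i + k = j$ and renumbering the second sum, we have
$$\beta_{m,p} * \beta_{n,p} = \sum_{j=0}^{m+n} \Big[\sum_{i} \binom{m}{i}\binom{n}{j-i}\Big] p^{j} q^{m+n-j}\,\varepsilon_{j}.$$
The middle sum is $\binom{m+n}{j}$ by a known combinatorial identity. Therefore,
$$\beta_{m,p} * \beta_{n,p} = \beta_{m+n,p}.$$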
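(iii) Convolution of atomic measures. Let $\mu_1 = \sum_{i=0}^{\infty} \alpha_i \varepsilon_i$ and $\mu_2 = \sum_{j=0}^{\infty} \beta_j \varepsilon_j \in \mathfrak{B}_+$. Then
$$\mu_1 * \mu_2 = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \alpha_i \beta_j\,\varepsilon_{i+j}.$$
Substituting $k$ for $i + j$ we have
$$\mu_1 * \mu_2 = \sum_{k=0}^{\infty} \Big(\sum_{i=0}^{k} \alpha_i \beta_{k-i}\Big)\varepsilon_k. \tag{7.5b}$$
The expression $\sum_{i=0}^{k} \alpha_i \beta_{k-i} = \gamma_k$ is known as the convolution in the product of power series.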
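(iv) Consider the following special case of (iii) by taking $\mu_1 = \pi_a$ and $\mu_2 = \pi_b$, Poisson measures with parameters $a$ and $b$ (introduced in Example 1.8 (iv), Chapter 5). By formula (7.5b), we therefore have
$$\pi_a * \pi_b = \sum_{k=0}^{\infty} \exp(-(a + b))\,\frac{1}{k!}\sum_{i=0}^{k} \binom{k}{i} a^{i} b^{k-i}\,\varepsilon_k = \sum_{k=0}^{\infty} \exp(-(a + b))\,\frac{(a + b)^k}{k!}\,\varepsilon_k = \pi_{a+b}.$$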
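The identities of Examples 7.5 can be checked numerically for atomic measures carried by the nonnegative integers, where formula (7.5b) is the Cauchy product of the weight sequences. The sketch below is an illustration only; it represents a measure by a finite list of weights (a truncation in the Poisson case), and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, factorial, isclose

def convolve(alpha, beta):
    """Weights gamma_k = sum_i alpha_i * beta_(k-i), as in formula (7.5b)."""
    gamma = [0] * (len(alpha) + len(beta) - 1)
    for i, a in enumerate(alpha):
        for j, b in enumerate(beta):
            gamma[i + j] += a * b
    return gamma

def binomial_weights(n, p):
    """Weights of the binomial measure with parameters (n, p) at 0, 1, ..., n."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

def poisson_weights(a, size):
    """First `size` weights of the Poisson measure with parameter a."""
    return [exp(-a) * a**k / factorial(k) for k in range(size)]

# (i) Point masses: epsilon_2 * epsilon_1 = epsilon_3.
point_mass_sum = convolve([0, 0, 1], [0, 1])

# (ii) Binomial measures with a common p, compared exactly with rationals.
p = Fraction(1, 3)
binom_ok = convolve(binomial_weights(4, p), binomial_weights(5, p)) == binomial_weights(9, p)

# (iv) Poisson measures: the k-th weight of the convolution only involves
# weights with index <= k, so the first N weights are exact up to rounding.
N, a, b = 30, 1.5, 2.5
lhs = convolve(poisson_weights(a, N), poisson_weights(b, N))[:N]
rhs = poisson_weights(a + b, N)
poisson_ok = all(isclose(x, y, rel_tol=1e-9) for x, y in zip(lhs, rhs))
```

The same `convolve` routine also gives the distribution of the sum of two independent integer-valued random variables once their weight sequences are known.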
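7.6 Remark. Let $f \ge 0$ be an element of $L^1(\mathbb{R}^k, \mathcal{B}^k, \lambda^k)$ and let $\mu$ be the measure generated by $\int f\,d\lambda^k$. Therefore, $\mu \in \mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$. Let $\nu$ be another measure from $\mathfrak{B}_+(\mathbb{R}^k, \mathcal{B}^k)$. We wonder about the convolution $\mu * \nu$. We have by (7.4a) and with $L_2(x,y) = x + y$ that:
$$\mu * \nu(A) = \int\!\!\int 1_A(x + y) f(x)\,\lambda^k(dx)\,\nu(dy) = \int\!\!\int 1_A(x) f(x - y)\,\lambda^k(dx)\,\nu(dy) = \int_A \varphi(x)\,\lambda^k(dx) \quad \text{[by Fubini's theorem]},$$
where the second equality uses the translation invariance of $\lambda^k$ and where $\varphi(x) = \int f(x - y)\,\nu(dy) \ge 0$ is, by Fubini's theorem, $\mathcal{B}^k$-measurable and $\int \varphi\,d\lambda^k$ is obviously finite. Therefore, $\varphi \in L^1(\lambda^k)$ and we conclude that $\mu * \nu$ has a density relative to $\lambda^k$. We rewrite the above equation in the form
$$\mu * \nu = \int \varphi\,d\lambda^k. \tag{7.6a}$$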
7.7 Definition. The function $\varphi$ in (7.6a) is called the convolution of the function $f$ and the measure $\nu$ and it will be denoted by $\varphi = f * \nu$. □
Observe that, since $\mu * \nu = \nu * \mu$, we have that $f * \nu = \nu * f$.
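7.8 Remark. Let $f, g \in L^1(\lambda^k)$ and $f, g \ge 0$. Let $\mu = \int f\,d\lambda^k$ and $\nu = \int g\,d\lambda^k$. Then $\mu, \nu \in \mathfrak{B}_+$, and we obtain, by using Proposition 5.3, that
$$f * \nu(x) = \int f(x - y)\,\nu(dy) = \int f(x - y) g(y)\,\lambda^k(dy).$$
Now we have from (7.6a),
$$\mu * \nu = \int (f * g)\,d\lambda^k,$$
where we denoted
$$f * g(x) = \int f(x - y) g(y)\,\lambda^k(dy). \tag{7.8a}$$
The function $f * g$ is obviously integrable and we call it the convolution of $f$ and $g$. □
The definition of the convolution for functions can be extended from (7.8a) to real-valued integrable functions. However, we shall refrain from connecting it with the convolution for measures, since it will require a background on signed measures (in Chapter 8).
7.9 Example. In probability theory, the convolution finds its application for the distribution of the sum of independent random variables. Let $X_1,\dots,X_n$ be independent random variables valued in $\mathbb{R}^k$ with their distributions $P_{X_i}$, $i = 1,\dots,n$. Let $S = \sum_{i=1}^n X_i$ and $L_n$ be as in Definition 7.3. Then
$$S = L_n \circ \Big(\bigotimes_{i=1}^n X_i\Big)$$
and thus
$$P_S = P_{\otimes X_i} L_n^{*}.$$
Since $X_1,\dots,X_n$ are independent, by Proposition 7.2,
$$P_{\otimes X_i} = \bigotimes_{i=1}^n P_{X_i}$$
and, following Definition 7.3, we obtain that
$$P_S = P_{X_1} * \dots * P_{X_n}.$$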
PROBLEMS
7.1
Let $(\Omega, \Sigma, P)$ be a probability space and let $\mathcal{G}$ be an independent family of events. Show that the Dynkin system $\mathcal{D}(\mathcal{G})$ generated by $\mathcal{G}$ is also independent.
7.2
Show that the product map $\bigotimes_{i=1}^n X_i$ introduced in Definition 7.1 (iv) is $\Sigma$-$\bigotimes_{i=1}^n \Sigma_i$-measurable.
7.3
Prove Proposition 7.2.
7.4
Let L+ denote the space of all real-valued nonnegative x ~ integrable functions. Show that (L + ,*), with the binary operator * defined by (7.8a) is a semi-%-space (over the semifield W + ) in light of Remark 6.2 (ii), Chapter 5.
7.5
Let $\gamma_{a,\sigma^2}$ denote the probability distribution of a normal random variable $X$ with density $f(a,\sigma^2)$ defined in Example 5.10 (iii) and let $\gamma_{b,\delta^2}$ be the probability distribution of another normal random variable $Y$. Show that, if $X$ and $Y$ are independent, then
$$P_{X+Y} = \gamma_{a+b,\ \sigma^2+\delta^2}.$$
7. Applications of Fubini's Theorem NEW TERMS: independent random variables 378 independent family of random variables 378 product map 378 joint distribution of random variables 378 convolution of measures 379 convolution of measures, properties of 379 convolution of point masses 380 convolution of binomial measures 381 convolution of atomic measures 381 convolution of Poisson measures 382 convolution of a function and measure 382, 383 convolution of functions 383 sum of independent random variables 383, 384 sum of independent normal random variables 384
Chapter 7 Calculus in Euclidean Spaces
This chapter is about Lebesgue integration in Euclidean spaces and deals primarily with change of variables techniques. As a mandatory preliminary and for consistency, it begins (in Section 1) with differentiation. Any standard analysis textbook can serve as an alternative refresher. Although the Euclidean space is the chief application in this chapter, for didactical purposes, we allow ourselves to introduce certain concepts for Banach spaces.
1. DIFFERENTIATION
1.1 Definition. Let $X$ and $X'$ be Banach spaces. A function $F: X \to X'$ is said to satisfy a Lipschitz condition on a subset $O \subseteq X$, if there is a constant $K$, called the Lipschitz constant on $O$, such that $\|F(x) - F(y)\| \le K\|x - y\|$ for all $x, y \in O$.
Clearly, a function $F$ that satisfies a Lipschitz condition on a set $O$ is uniformly continuous on $O$. If the Lipschitz constant is zero, then $F = \text{const}$ on $O$.
1.2 Remarks. Let $X = \mathbb{R}^n$ and $X' = \mathbb{R}^m$ both endowed with Euclidean norms and let $L$ be a linear operator from $\mathbb{R}^n$ into $\mathbb{R}^m$. It is known that any linear operator can be expressed by a matrix. Conversely, any $m \times n$ matrix, say $A$, represents a linear operator, so that $L(x) = Ax$, for each $x \in \mathbb{R}^n$. The Euclidean or Frobenius norm of matrix $A = (a_{ij})$ is defined as
$$|||A|||_e = \Big(\sum_{i=1}^m \sum_{j=1}^n a_{ij}^2\Big)^{1/2}.$$
There are a few other norms we are going to use in the sequel. Before we introduce them, note that the notation $|||\cdot|||$ for a matrix norm is used when, besides the usual properties of the norm, a matrix norm is submultiplicative, i.e., if
$$|||AB||| \le |||A|||\,|||B|||. \tag{1.2a}$$
One can show (Problem 1.1) that the Frobenius norm is submultiplicative. The matrix supremum norm is defined as
$$\|A\|_u = \max_{i,j} |a_{ij}|.$$
It is not submultiplicative. (A counterexample is $\|A^2\|_u > \|A\|_u^2$, where $A$ is the $2 \times 2$ matrix with $a_{ij} = 1$ for all $i$ and $j$.) The maximum row sum matrix norm is defined as
$$|||A|||_r = \max_{i} \sum_{j=1}^n |a_{ij}|.$$
One can show (Problem 1.2) that the maximum row sum matrix norm is submultiplicative. We will outline the following properties of matrix-vector norms. We assume that $A$ is an $m \times n$ matrix and $x \in \mathbb{R}^n$.
(i)
$$\|Ax\|_e \le |||A|||_e\,\|x\|_e. \tag{MN.1}$$
Indeed, with the aid of Hölder's inequality ($p = q = 2$),
$$\|Ax\|_e^2 = \sum_{i=1}^m \Big(\sum_{k=1}^n a_{ik}x_k\Big)^2 \le \sum_{i=1}^m \sum_{k=1}^n a_{ik}^2 \sum_{j=1}^n x_j^2 = |||A|||_e^2\,\|x\|_e^2.$$
(ii)
$$\|Ax\|_u \le n\,\|A\|_u\,\|x\|_u. \tag{MN.2}$$
This is easy to verify as well as the other inequalities (see Problem 1.3):
(iii)
$$\|Ax\|_u \le |||A|||_r\,\|x\|_u. \tag{MN.3}$$
III A IIIu 5 II A II us
(MN.5)
If $L$ is a linear operator expressed by an $m \times n$ matrix $A$, then from (MN.1-3) it follows that
$$\|L(x) - L(y)\| = \|A(x - y)\| \le K\,\|x - y\|,$$
where $\|\cdot\|$ is the Euclidean or supremum norm and $K$ is either $|||A|||_e$, or $n\|A\|_u$, or $|||A|||_r$, respectively. Consequently, $L$ satisfies a Lipschitz condition on $\mathbb{R}^n$ with the respective Lipschitz constants. In particular, it follows that $L$ is uniformly continuous on $\mathbb{R}^n$ with respect to the Euclidean and supremum norms.
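The statements of Remarks 1.2 about the three matrix norms can be tried out on small matrices. The following Python sketch is an illustration only: it uses the $2 \times 2$ all-ones matrix of the counterexample above and one arbitrarily chosen $2 \times 3$ matrix, and all function names are hypothetical.

```python
from math import sqrt

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def frobenius(A):
    """Euclidean (Frobenius) matrix norm."""
    return sqrt(sum(a * a for row in A for a in row))

def sup_norm(A):
    """Matrix supremum norm: largest absolute entry."""
    return max(abs(a) for row in A for a in row)

def row_sum_norm(A):
    """Maximum row sum matrix norm."""
    return max(sum(abs(a) for a in row) for row in A)

def vec_e(x):
    return sqrt(sum(xi * xi for xi in x))

def vec_u(x):
    return max(abs(xi) for xi in x)

# Counterexample from the text: the supremum norm is not submultiplicative.
ones = [[1, 1], [1, 1]]
ones_sq = matmul(ones, ones)
sup_fails = sup_norm(ones_sq) > sup_norm(ones) ** 2
frob_ok = frobenius(ones_sq) <= frobenius(ones) ** 2 + 1e-12
row_ok = row_sum_norm(ones_sq) <= row_sum_norm(ones) ** 2

# Inequalities (MN.1)-(MN.3) for one sample matrix and vector (n = 3).
A = [[1, -2, 3], [0, 4, -1]]
x = [1, -1, 2]
Ax = matvec(A, x)
mn1 = vec_e(Ax) <= frobenius(A) * vec_e(x)
mn2 = vec_u(Ax) <= len(x) * sup_norm(A) * vec_u(x)
mn3 = vec_u(Ax) <= row_sum_norm(A) * vec_u(x)
```

A single example does not prove the inequalities, of course; it only makes the roles of the three Lipschitz constants concrete.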
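1.3 Lemma. Let $F: O\ (\subseteq \mathbb{R}^n) \to \mathbb{R}^n$ be a map that satisfies a Lipschitz condition on $O$. If $N$ is a λ-negligible subset of $O$, then so is $F_*(N)$.
Proof. Let $C$ be a semi-open cube in $\mathbb{R}^n$ and $F_*(C)$ its image under $F$. Since $F$ is continuous on $O$, $F_*(C)$ is bounded and therefore there is a compact cube $C^*$ with each side equal to $\operatorname{diam} F_*(C)$, which contains $F_*(C)$. Let $K$ stand for the Lipschitz constant on $O$. Then, it is easily seen that $\operatorname{diam} F_*(C) \le K \operatorname{diam} C$ with $\operatorname{diam} C = r\sqrt{n}$ and $r$ being the cube edge length. Clearly,
$$\lambda^*(F_*(C)) \le \lambda(C^*) = (\operatorname{diam} F_*(C))^n \le (K r \sqrt{n})^n, \tag{1.3}$$
where $\lambda^*$ denotes the Lebesgue outer measure on $\mathcal{P}(\mathbb{R}^n)$. Now, since $C$ is a cube, $\lambda(C) = r^n$ and
$$\lambda^*(F_*(C)) \le (K\sqrt{n})^n\,\lambda(C). \tag{1.3a}$$
If $N$ is a negligible set, then according to Problem 3.18, Chapter 5, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty} \lambda(C_k) < \varepsilon.$$
Therefore, $N \subseteq \sum_{k=1}^{\infty} C_k$ and, since maps preserve inclusions and unions,
$$F_*(N) \subseteq \bigcup_{k=1}^{\infty} F_*(C_k).$$
The latter, along with (1.3) and (1.3a), yield that:
$$\lambda^*(F_*(N)) \le \sum_{k=1}^{\infty} \lambda^*(F_*(C_k)) \le (K\sqrt{n})^n \sum_{k=1}^{\infty} \lambda(C_k) < (K\sqrt{n})^n\,\varepsilon.$$
Since $\varepsilon$ is arbitrary, we showed that $F_*(N)$ can be covered by countably many cubes with the sum of their volumes as small as we wish. By Lemma 3.6 of Chapter 5, $F_*(N)$ is negligible. □
The following concept of the derivative was given by Fréchet in 1903, which we first formulate for Banach spaces.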
1.4 Definitions.
(i) Let $\Omega$ and $\Omega'$ be Banach spaces and let $O$ be an open set in $\Omega$. A map $F: O \to \Omega'$ is said to be differentiable at a point $x \in O$ if there is a continuous linear operator $L_{F(x)}: \Omega \to \Omega'$ and a map $o: \Omega \to \Omega'$ such that
$$F(x + h) = F(x) + L_{F(x)}(h) + o(h) \quad \text{and} \quad \lim_{h \to \theta} \frac{o(h)}{\|h\|} = \theta'. \tag{1.4}$$
It is easy to show that if a map $F$ has such an operator $L_{F(x)}$, then it is unique given $F$ and $x$ (Problem 1.4). The operator $L_{F(x)}$ is usually denoted by $F'(x)$ or $DF_x$ and is called the derivative (or Fréchet derivative) of $F$ at $x$. Consequently, from (1.4),
$$\lim_{h \to \theta} \frac{F(x + h) - F(x)}{\|h\|} = \lim_{h \to \theta} \frac{DF_x(h)}{\|h\|}. \tag{1.4a}$$
If the function $F$ is differentiable at every point of $O$, it is said to be differentiable on $O$. Then $x \mapsto DF_x$ is evidently a function itself, which is obtained by the application of the operator $D$ to $F$.
(ii) Consider the special case of $\Omega$ and $\Omega'$ being Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Then, at every $x = (x_1,\dots,x_n)^T \in \mathbb{R}^n$, $F(x) = (f_1(x),\dots,f_m(x))^T$. In the above definition, the linear operator $L_{F(x)}$, as any linear operator on $\mathbb{R}^n$ (recall it is also continuous), is known to be represented by an $m \times n$ matrix, say $M_x$. Therefore, the derivative of $F$ at $x$ is, in this case, a matrix, called the Jacobian matrix, in notation $\mathcal{J}F(x)$. Then, (1.4) and (1.4a) can be rewritten as
$$F(x + h) = F(x) + \mathcal{J}F(x)h + o(h)$$
and
$$\lim_{h \to \theta} \frac{F(x + h) - F(x)}{\|h\|} = \lim_{h \to \theta} \frac{\mathcal{J}F(x)h}{\|h\|}.$$
For $m = n$, the determinant of $\mathcal{J}F(x)$ is denoted by $JF(x)$ and is called the Jacobian.
1.5 Examples.
(i) If $F$ itself is a continuous linear map, then $F(x + h) - F(x) = F(h)$ and taking $o = 0$ (zero function), we get $L_{F(x)}(h) = DF_x(h) = F(h)$. Therefore, $F$ is everywhere differentiable and for all $x$, $DF_x = F$, i.e., $DF_x$ does not depend on $x$ and $F$ coincides with its derivative. In particular, if $F$ acts in the Euclidean space and thus is represented by an $m \times n$ matrix, say $M$, then the Jacobian matrix $\mathcal{J}F(x)$ equals $M$.
(ii) Let $\Omega = \Omega' = \mathcal{C}([0,1], \mathbb{R})$ with norm $\|x\| = \sup\{|x(t)| : t \in [0,1]\}$ and let $O = \{x : \|x\| < r\}$ for some $r > 0$. Define the operator $F: O \to \Omega$ as
as v ) where K(t,s) is continuous on [0,112 and the partial derivative %(u, (defined on the set R = [0,1] x R) exists and is uniformly continuous on R. Then we can show that
;
= J K(t,s)
as (s, ~ ( s ) ) h ( sdx ) +v ( x , ~ ) 1
where
Thus, F is differentiable a t x and its derivative satisfies (Fi(x) h)(t) =
2
J ~ ( t , s ) (s, x(s))h(s) dx .
(1.5a)
1.6 Proposition. Let $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}^m, F = (f_1,\dots,f_m)^T]$ be a function. $F$ is differentiable at an interior point $x$ of $O$ if and only if each component function $f_1,\dots,f_m$ is differentiable at $x$ and in this case
$$F'(x) = (f_1'(x),\dots,f_m'(x))^T.$$
Proof.
(i) Suppose $F$ is differentiable at $x$. Then,
$$F(x + h) - F(x) = \mathcal{J}F(x)h + o(h) = \big(\mathcal{J}_1F(x)h,\dots,\mathcal{J}_mF(x)h\big)^T + o(h), \tag{1.6}$$
where $\mathcal{J}_iF(x)$ is the $i$th row vector of $\mathcal{J}F(x)$. The right-hand side of (1.6) can also be written in the form
$$\big(\mathcal{J}_1F(x)h + o_1(h),\dots,\mathcal{J}_mF(x)h + o_m(h)\big)^T,$$
which yields that
$$f_i(x + h) - f_i(x) = \mathcal{J}_iF(x)h + o_i(h)$$
and, hence, $f_i$ is differentiable at $x$ and its derivative $f_i'(x)$ is expressed by a $1 \times n$ Jacobian matrix $\mathcal{J}f_i(x)$. Consequently, we have that $F'(x) = (f_1'(x),\dots,f_m'(x))^T$.
(ii) The converse of the statement is obvious. □
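1.7 Definitions.
(i) Suppose $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}, f]$ is a function. If $f$ is differentiable at $x \in O$ "along the segment $[x, x + te_k]$" parallel to the $x_k$-axis, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, i.e., the limit
$$\lim_{t \to 0} \frac{f(x + te_k) - f(x)}{t}$$
exists, it is called the partial derivative of $f$ with respect to its $k$th coordinate, in notation $\frac{\partial f}{\partial x_k}(x)$. [Note that by fixing all components of vector $x$ except for $x_k$, in the above limit, the partial derivative $\frac{\partial f}{\partial x_k}$ is nothing else but the usual Newton-Leibniz derivative.]
(ii) We can analogously define the $k$th partial derivative of a vector function $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}^m, F = (f_1,\dots,f_m)^T]$ as
$$\frac{\partial F}{\partial x_k}(x) = \lim_{t \to 0} \frac{F(x + te_k) - F(x)}{t},$$
if the limit on the right exists. In light of Proposition 1.6 ($h = te_k$), the $k$th partial derivative $\frac{\partial F}{\partial x_k}(x)$ of $F$ is
$$\frac{\partial F}{\partial x_k}(x) = \Big(\frac{\partial f_1}{\partial x_k}(x),\dots,\frac{\partial f_m}{\partial x_k}(x)\Big)^T \tag{1.7}$$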
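and it exists if and only if the corresponding partial derivatives of all its component functions exist.
Suppose $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}, f]$ is a function differentiable at a point $x \in O$. Therefore, $f'(x)$ exists and from (1.4a),
$$\lim_{h \to \theta} \frac{f(x + h) - f(x)}{\|h\|} = \lim_{h \to \theta} \frac{f'(x)(h)}{\|h\|}.$$
In particular, if $h = te_k$, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, $h$ is the increment of $x$ taken along the segment of a line parallel to the $x_k$-axis. Then, $\|h\| = |t|$ and, since $f'(x)$ is linear,
$$\frac{\partial f}{\partial x_k}(x) = \lim_{t \to 0} \frac{f(x + te_k) - f(x)}{t} = \lim_{t \to 0} \frac{f'(x)(te_k)}{t} = f'(x)(e_k) = \mathcal{J}f(x)\,e_k. \tag{1.8a}$$
From (1.8a) it follows that $\frac{\partial f}{\partial x_k}(x)$ equals the scalar product of $f$'s Jacobian matrix $\mathcal{J}f(x)$ and the $k$th basis vector $e_k$. If $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}^m, F]$ is a vector function differentiable at an interior point $x$ of $O$, then Proposition 1.6 and (1.8a) yield
$$\frac{\partial F}{\partial x_k}(x) = \mathcal{J}F(x)\,e_k. \tag{1.8b}$$
Thus, if $F$ is differentiable at $x$, all its partial derivatives exist and are determined by formula (1.8b). In particular, (1.8b) reveals the nature of the Jacobian matrix $\mathcal{J}F(x)$. Namely, from (1.8b) and (1.7) it follows that the $k$th column of $\mathcal{J}F(x)$ is $\frac{\partial F}{\partial x_k}(x)$ and therefore,
$$\mathcal{J}F(x) = \Big(\frac{\partial F}{\partial x_1}(x),\dots,\frac{\partial F}{\partial x_n}(x)\Big) = \Big(\frac{\partial f_i}{\partial x_k}(x)\Big)_{i=1,\dots,m;\ k=1,\dots,n}.$$
The above can be summarized as the following theorem.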
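1.8 Theorem. Let $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}^m, F]$ be a function differentiable at a point $x \in O$ (an interior point). Then, all its partial derivatives exist and its Jacobian matrix $\mathcal{J}F(x)$ is equal to
$$\mathcal{J}F(x) = \Big(\frac{\partial f_i}{\partial x_k}(x)\Big)_{i=1,\dots,m;\ k=1,\dots,n}.$$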
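Theorem 1.8 suggests a simple way to approximate a Jacobian matrix: build it column by column from difference quotients along the basis vectors, as in (1.8b). The Python sketch below is an illustration under assumptions of our own: the sample map `F`, the step size `h`, and the use of central (rather than one-sided) difference quotients are choices made for the example and are not taken from the text.

```python
from math import sin, cos

def numerical_jacobian(F, x, h=1e-6):
    """Approximate the m x n Jacobian matrix of F at x, one column per coordinate."""
    m, n = len(F(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for k in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h            # x + h * e_k
        x_minus[k] -= h           # x - h * e_k
        F_plus, F_minus = F(x_plus), F(x_minus)
        for i in range(m):
            # k-th column: central difference quotient of the i-th component
            J[i][k] = (F_plus[i] - F_minus[i]) / (2 * h)
    return J

def F(x):
    """Sample map F(x1, x2) = (x1^2 * x2, sin(x1) + x2) (hypothetical example)."""
    return [x[0] ** 2 * x[1], sin(x[0]) + x[1]]

def exact_jacobian(x):
    """Matrix of partial derivatives of the sample map, computed by hand."""
    return [[2 * x[0] * x[1], x[0] ** 2],
            [cos(x[0]), 1.0]]

point = [0.7, -1.3]
approx = numerical_jacobian(F, point)
exact = exact_jacobian(point)
max_error = max(abs(approx[i][k] - exact[i][k])
                for i in range(2) for k in range(2))
```

For this smooth map the two matrices agree to well within the tolerance one expects from the chosen step size.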
1.9 Definition. Let $O$ be an open set in $\mathbb{R}^n$. A function $[O, \mathbb{R}^m, F]$ is said to be continuously differentiable on $O$ or a $\mathcal{C}^1(O, \mathbb{R}^m)$-function if $F$ is differentiable on $O$, and all of its partial derivatives $\frac{\partial F}{\partial x_1},\dots,\frac{\partial F}{\partial x_n}$ exist and are continuous on $O$. Note that $F$ is a $\mathcal{C}^1$-map if and only if $F$ is differentiable and $F'$ is continuous on $O$.
1.10 Examples.
(i) If $F \in \mathcal{C}^1(O, \mathbb{R}^m)$ and $m = n$, then the Jacobian $JF$ is obviously a continuous function on $O$.
(ii) It can be easily verified that
is a $\mathcal{C}^1(\{(x,y) \in \mathbb{R}^2 : x = y\}^c, \mathbb{R}^2)$-function.
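The following is the chain rule holding in Banach spaces.
1.11 Theorem (Chain Rule). Let $\Omega$, $\Omega_1$, and $\Omega_2$ be Banach spaces and let $H: O\ (\subseteq \Omega) \to \Omega_1$ and $G: O_1\ (\subseteq \Omega_1) \to \Omega_2$ be maps such that $H(O) \subseteq O_1$. Let $H$ be differentiable at $x \in O$ and $G$ be differentiable at $H(x)$. Then the composed map $G \circ H$ is a differentiable function at $x$ and
$$D(G \circ H)_x = DG_{H(x)}\,DH_x. \tag{1.11}$$
Proof. By the assumption of differentiability,
$$H(x + h) = H(x) + DH_x(h) + o_H(h)$$
and
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}\big(H(x + h) - H(x)\big) + o_G\big(H(x + h) - H(x)\big).$$
Substituting the expression for $H(x + h) - H(x) = DH_x(h) + o_H(h)$ we have that
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}\big(DH_x(h) + o_H(h)\big) + o_G\big(H(x + h) - H(x)\big).$$
By linearity of $DG_{H(x)}$,
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}DH_x(h) + DG_{H(x)}(o_H(h)) + o_G\big(H(x + h) - H(x)\big).$$
Now, by continuity of $H$, $H(x + h) - H(x) \to \theta_1$ when $h \to \theta$, and by linearity and continuity of $DG_{H(x)}$,
$$\lim_{h \to \theta} \frac{DG_{H(x)}(o_H(h))}{\|h\|} = \lim_{h \to \theta} DG_{H(x)}\Big(\frac{o_H(h)}{\|h\|}\Big) = \theta_2.$$
Therefore,
$$G \circ H(x + h) = G \circ H(x) + DG_{H(x)}DH_x(h) + o_{G \circ H}(h). \qquad □$$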
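1.12 Corollary. In the condition of Theorem 1.11, let $\Omega = \mathbb{R}^n$, $\Omega_1 = \mathbb{R}^m$, and $\Omega_2 = \mathbb{R}^l$. Then,
$$\mathcal{J}(G \circ H)(x) = \mathcal{J}G(H(x))\,\mathcal{J}H(x).$$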
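In Euclidean spaces the chain rule says that the Jacobian matrix of a composition is the product of the Jacobian matrices of its factors. The following Python sketch checks this numerically for one pair of maps; the maps `H` and `G`, the point `x0`, and the finite-difference approximation of the Jacobian matrices are all assumptions made for the illustration.

```python
from math import sin

def numerical_jacobian(F, x, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of F at x."""
    m, n = len(F(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for k in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        F_plus, F_minus = F(x_plus), F(x_minus)
        for i in range(m):
            J[i][k] = (F_plus[i] - F_minus[i]) / (2 * h)
    return J

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def H(x):
    """Hypothetical sample map from R^2 to R^3."""
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]

def G(y):
    """Hypothetical sample map from R^3 to R^2."""
    return [y[0] + y[1] * y[2], sin(y[0])]

def G_after_H(x):
    return G(H(x))

x0 = [0.5, 1.2]
# Left-hand side: Jacobian matrix (2 x 2) of the composition at x0.
lhs = numerical_jacobian(G_after_H, x0)
# Right-hand side: (2 x 3) Jacobian of G at H(x0) times (3 x 2) Jacobian of H at x0.
rhs = matmul(numerical_jacobian(G, H(x0)), numerical_jacobian(H, x0))
chain_rule_error = max(abs(lhs[i][j] - rhs[i][j])
                       for i in range(2) for j in range(2))
```

Note the order of the factors: the Jacobian of the outer map is evaluated at $H(x)$, not at $x$.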
1.13 Theorem (The Mean Value Theorem). Let $F: \mathbb{R}^n \to \mathbb{R}$ be differentiable on a convex set $O$. Then, for any $x$ and $y \in O$, there is a point $q$, which belongs to the line segment $S(x,y)$ between $x$ and $y$, such that
$$F(y) - F(x) = F'(q)(y - x). \tag{1.13}$$
Proof. Let $x, y \in O$. Denote $g(t) = ty + (1 - t)x$ for $0 \le t \le 1$. Then, the function $g$ represents the segment $S(x,y)$ and $F \circ g$ will let the function $F$ run over the segment $S(x,y)$. By the chain rule, the function $\Phi = F \circ g$ is evidently differentiable on the segment $[0,1]$ and by (1.11),
$$\Phi'(t) = F'(g(t))\,g'(t) = F'(g(t))(y - x).$$
Now, applying to $\Phi: [0,1] \to \mathbb{R}$ the Mean Value Theorem known from standard analysis we conclude that there is a point $c \in (0,1)$ such that $\Phi'(c) = \Phi(1) - \Phi(0) = F(y) - F(x)$. Taking $q = g(c)$, we prove the above statement. □
1.14 Corollary. Let $O \subseteq \mathbb{R}^n$ be a convex open set and $F \in \mathcal{C}^1(O, \mathbb{R}^m)$. Then $F$ satisfies a Lipschitz condition on any convex compact subset $B$ of $O$ with Lipschitz constant $K = \sup\{|||\mathcal{J}F(z)|||_e : z \in B\}$.
Proof. From (1.13) and (MN.1),
In particular,
The following result will also be useful in the sequel.
1.15 Corollary. Let $B$ be a convex compact subset of an open set $O \subseteq \mathbb{R}^n$ and $F \in \mathcal{C}^1(O, \mathbb{R}^m)$. Then $F$ satisfies a Lipschitz condition on $B$ with respect to the supremum norm and with Lipschitz constant $K = \sqrt{n}\,\sup\{|||\mathcal{J}F(z)|||_e : z \in B\}$.
Proof. Let $F = (f_1,\dots,f_m)^T$. By (1.14) and then by (MN.4) for $m = 1$,
which obviously yields that
and thereby the statement of this corollary. □
The following is a modification of Lemma 1.3.
1.16 Lemma. Let $F: O\ (\subseteq \mathbb{R}^n) \to \mathbb{R}^n$ be a $\mathcal{C}^1$-map and $O$ an open set. If $N$ is a negligible subset of $O$, then so is $F_*(N)$.
Proof. Since $(\mathbb{R}^n, \tau_e)$ is second countable and since open rectangles with rational coordinates are a countable base for $(\mathbb{R}^n, \tau_e)$ (see Example 2.8 (i), Chapter 3), $O$ can be represented by a union of such rectangles $R_k$. Because $F'$ is continuous on $O$ and each $R_k$ is convex and bounded, it follows from (1.14a) that $F$ satisfies a Lipschitz condition on $R_k$ with $K_k = \sup\{|||F'(z)|||_e : z \in \bar{R}_k\}$ being a Lipschitz constant on $R_k$. Since $N \cap R_k$ is negligible, by Lemma 1.3, $F_*(N \cap R_k)$ is also negligible. This yields that $F_*(N)$ is negligible as the countable union of the sets $F_*(N \cap R_k)$. □
1.17 Definition. Let $O$ and $O'$ be open subsets of $\mathbb{R}^n$ and $[O, O', F]$ be a $\mathcal{C}^1$-map. $F$ is called a diffeomorphism or diffeomorphic or $\mathcal{C}^1$-invertible, if:
(i) $F$ is bijective.
(ii) $[O', O, F^{-1}]$ is a $\mathcal{C}^1$-map.
The following is a version of the Inverse Mapping Theorem, which can be found in many standard analysis books, such as one by Tom Apostol [1974].
1.18 Theorem (Inverse Mapping Theorem). Let $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ be a $\mathcal{C}^1$-map and let $JF(x) \ne 0$ for some $x \in O$. Then:
(i) there are open sets $U \subseteq O$ and $V \subseteq \mathbb{R}^n$ such that $x \in U$, $F(x) \in V$, and $[U, V, F]$ is bijective;
(ii) $[V, U, F^{-1}]$ is a $\mathcal{C}^1$-map.
1.19 Remarks. The Inverse Mapping Theorem tells us that:
(i) $[U, V, F]$ is a diffeomorphism.
(ii) If $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ is a $\mathcal{C}^1$-map and $JF(x) \ne 0$ on $O$, then $[O, F_*(O), F]$ is a diffeomorphism.
1.20 Example. Let $[O \subseteq \mathbb{R}^n, O' \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. We show that for each $x \in O$,
$$(F^{-1})'(F(x))(F'(x)) = D(F^{-1})_{F(x)}\,DF_x = I. \tag{1.20}$$
As the identity map, $I = F^{-1} \circ F$,
$$DI_x = D(F^{-1} \circ F)_x = I \quad \text{(see Example 1.5 (i))}.$$
On the other hand, by the chain rule (Theorem 1.11),
$$D(F^{-1} \circ F)_x = D(F^{-1})_{F(x)}\,DF_x = I. \tag{1.20a}$$
In terms of Jacobian matrices the same results read
$$\mathcal{J}(F^{-1})(F(x))\,\mathcal{J}F(x) = I. \tag{1.20b}$$
The latter yields the following:
$$(\mathcal{J}F(x))^{-1} = \mathcal{J}(F^{-1})(F(x)) \tag{1.20c}$$
and thus,
$$\frac{1}{JF(x)} = J(F^{-1})(F(x)). \tag{1.20d}$$
(1.20b-d) imply that if $[O \subseteq \mathbb{R}^n, O' \subseteq \mathbb{R}^n, F]$ is a diffeomorphism, then both $JF \ne 0$ on $O$ and $JF^{-1} \ne 0$ on $O'$. □
1.21 Proposition. Let $[O, O', F]$ be a diffeomorphism in $\mathbb{R}^n$ and $A$ be a subset of $O$. Then, $A$ is Lebesgue measurable if and only if $F_*(A)$ is also.
Proof. (i) Let $\mathcal{L}^*_O$ be the trace σ-algebra of Lebesgue measurable sets on $O$ and $\mathcal{L}^*_{O'}$ the corresponding trace Lebesgue σ-algebra in $O'$. If $A \in \mathcal{L}^*_O$, then, by Corollary 2.18, Chapter 5, there is a Borel superset $B$ of $A$ from the trace Borel σ-algebra $\mathcal{B} \cap O$, such that $\lambda^*(B \setminus A) = 0$, i.e., $B \setminus A$ is λ-negligible. Since $F$ is a $\mathcal{C}^1$-map, by Lemma 1.16, $F_*(B \setminus A)$ is also negligible. Therefore, since the Lebesgue measure on $\mathcal{L}^*_{O'}$ is complete, $F_*(B \setminus A) \in \mathcal{L}^*_{O'}$. On the other hand, since $F$ is a homeomorphism, it preserves all set operations and
By Problem 3.5, Chapter 4, F,(B) is Borel, thus, F,(A) is a Lebesgue measurable set and we have that
F**(eg) G &;I.
(1.21)
(ii) Because F is diffeomorphic, F**(L;,)
L; and, in additisn, F,,
Consequently, from (12la), L yields F,,(L;) and thus the assertion.
F
0
F**= I,, (identity).
(1.21a)
) and this, along with (1.21) = L&
(1.21b)
0
(i) Let F be a homeomorphic map. From Problem 3.5, Chapter 4, F,,(%) = 93 is a Borel c-algebra in Rn and, therefore, the image measure p = AF, is a Borel measure. For B, being a compact set, F,(B) is also compact and thus p is a Borel-Lebesgue-Stieltjes measure. (ii) If F is diffeomorphic, then from Lemma 1.16 and (i), it follows
that p << A.
PROBLEMS
1.1
Show that the Frobenius norm is submultiplicative.
1.2
Show that the maximum row sum matrix norm is submultiplicative.
1.3
Prove inequalities (MN.2-MN.5).
In the problems below, unless specified otherwise, we assume that a function $F: \Omega \to \Omega'$ has Banach spaces as the domain and codomain.
1.4
Show that given $F$ and $x$, the linear operator $L_{F(x)}$ in (1.4) is unique.
1.5
Let $F: \Omega \to \Omega'$ be a constant function. Show that $F$ is differentiable everywhere on $\Omega$ and that for all $x \in \Omega$, $DF_x = 0$.
1.6
Show that if $F: \Omega \to \Omega'$ is differentiable at $x$, then it is continuous at $x$.
1.7
Show that the derivative is a linear operator acting on the set of all differentiable functions $F: \Omega \to \Omega'$ at a point $x$.
1.8
Let $\Omega$, $\Omega_1$, $\Omega_2$, and $\Omega'$ be Banach spaces, $F: O \to \Omega_1$, $G: O \to \Omega_2$ be functions differentiable at $x \in O$ (where $O$ is an open set), $F \otimes G$: $\Omega \times \Omega \to \Omega'$. Show that the product function $F \otimes G$ is differentiable at $x$ and
$$D(F \otimes G)_x = DF_x\,G(x) + F(x)\,DG_x.$$
1.9
and
1.10
Let [ 0 Wn,Wm, F] be a el-function, where 0 is an open set, and x0 E 0. Prove that for each E > 0, there is an open ball Be(xo,G) E 0 or Bu(xo,6) C 0 such that
for all x E Be(xo,6) and h E Wn
for all x E B,(xo,S) and h E Wn, respectively. 1.11
In the conditions of Problem 1.10, let 0 be a convex set. Prove that for each E > 0, there is an open ball Be(xo,6) E 0 or B,(xo)6) C 0 such that
for all x E B,(xo,6) and h E Wn such that x
+ h E Be(xo,6)
for all x E BU(xo,6)and h E Rn such that x + h E Bu(xo,6), respectively.
1.12
Let [ 0 & Wn,Wn,F] be a el-function, where 0 is an open set, and xo E 0 such that the Jacobian JF(xO) # 0. Prove that there is an open ball Be(xo,6) E 0 such that for all y E Be(xOr6),
# O for all x E B,(xo,6)
(P1.12)
[B,(xo,6) ,Wn, F ] is one-to-one.
(P1.12a)
JF(x)
1.13
Let [0 C Rn,Rn,F] be a diffeomorphism. Show that for each x, E 0, DFx 0D(F
- l)F(xO)= I
or, equivalently, F1(x0)(F-')'(F(X~)) = 1. 1.14
Show that if [Rn,Wm,F] is differentiable, then {x E Wn:l < a) is an open set in Wn.
1.15
Under the condition of Problem 1.14, is {x E Wn: an open set?
11 F1(x) 11 1.
11 F1(x) (1, < a}
1. Differentiation
NEW TERMS: Lipschitz condition 387 Lipschitz constant 387 Euclidean (Frobenius) norm of a matrix 387 Frobenius (Euclidean) norm of a matrix 387 submultiplicative property of a matrix norm 387 matrix supremum norm 388 maximum row sum matrix norm 388 differentiable map 390 derivative of a map 390 FrCchet derivative 390 Jacobian matrix 390 Jacobian 390 partial derivative 392 continuously differentiable function 394 chain rule in Banach spaces 394 chain rule in Euclidean spaces 395 Mean Value Theorem 395 diffeomorphism 396 Inverse Mapping Theorem 397
2. CHANGE OF VARIABLES
2.1 Lemma. Let $L$ be a linear operator from $\mathbb{R}^n$ to $\mathbb{R}^n$ expressed by a regular matrix $M$ and $C$ be the compact unit cube spanned by the basis vectors in $\mathbb{R}^n$. Then, it holds true that
$$\lambda(L_*(C)) = |\det M|. \tag{2.1}$$
Proof. (i) We will refer to the linear operator $L$ as elementary, if the corresponding matrix $M$ is regular and of one of the following three types:
Type 1. $M$ is derived from the $n \times n$ identity matrix $I$ whose $i$th element on the main diagonal is replaced by a nonzero real number $c$.
Type 2. $M$ is obtained from the $n \times n$ identity matrix $I$, in which the columns $i$ and $j$ are interchanged.
Type 3. $M$ is obtained from the $n \times n$ identity matrix $I$ in such a way that in its column $i$, the element $e_{ji} = 0$ is replaced by the element $m_{ji} = 1$.
In all types above we assume $i, j = 1,\dots,n$ and $i \ne j$. Clearly, if $x \in \mathbb{R}^n$ is a column vector, then $L(x) = Mx$ stipulates the rules of the following transformation of $x$:
For type 1, the $i$th entry of $x$ is multiplied by $c$ and the rest of the entries are left unchanged.
For type 2, the entries $x_i$ and $x_j$ are interchanged and the rest of the entries remain unchanged.
For type 3, entry $x_j$ is replaced by $x_j + x_i$ and the other entries are left unchanged.
(ii) We first show that $\mu(C) = \lambda L_*(C) = |\det M|$, if $L$ is an elementary operator. Remember that $C$ is the closed unit cube spanned by the basis vectors $e_1,\dots,e_n$ and expressed as the Cartesian product $[0,1]^n$. Consequently, it is obvious that when mapping $C$ by $L$, we apply $L$ to each of its points $x = t_1e_1 + \dots + t_ne_n$, where $t_i \in [0,1]$. Therefore, by the above rules we have:
Type 1. $L_*(C) = [0,1] \times \dots \times [0,c] \times \dots \times [0,1]$ (with $[0,c]$ as the $i$th edge), if $c > 0$, and $L_*(C) = [0,1] \times \dots \times [c,0] \times \dots \times [0,1]$, if $c < 0$. The edges of $C$, from $e_1,\dots,e_n$, are transformed onto $e_1,\dots,ce_i,\dots,e_n$, and the volume $\lambda(L_*(C))$ equals $|c|$. This is the same value as that of $|\det M|$.
Type 2. In this case, the edges $e_i$ and $e_j$ are interchanged, and therefore, the shape of the cube remains the same. The volume $\lambda(L_*(C))$ is the same as that of $\lambda(C) = 1 = |-\det I| = |\det M|$.
Type 3. The edges of $C$ will be transformed onto $(e_1,\dots,e_i + e_j,\dots,e_n)$ (with $e_i + e_j$ as the $i$th edge), which will span a parallelotope whose sides parallel to the $x_ix_j$-plane are rhombi and the other sides are squares. Taking, for convenience sake, $i = 1$ and $j = 2$, the volume of $L_*(C)$ can be calculated by using Fubini's theorem as follows:
$$\lambda(L_*(C)) = \int_{\text{rhomb}} d\lambda^2(x_1,x_2) \int_{[0,1]^{n-2}} d\lambda^{n-2}(x_3,\dots,x_n).$$
This reduces to 1 as it is easy to see. On the other hand, it is also the same quantity as $|\det M| = \det(e_1,\dots,e_i + e_j,\dots,e_n)$.
(iii) Now, if instead of a cube, we have a compact rectangle $R$, i.e. a parallelotope with its edges spanned by the coordinate axes and possibly translated, by similar arguments as in (i-ii) we obtain that
$$\lambda(L_*(R)) = |\det M|\,\lambda(R), \tag{2.1a}$$
if $L$ is an elementary linear operator. (See Problem 2.1 where the validity of (2.1a) is to be shown.)
(iv) Let P be a compact paralleletop in Rn. Since the boundary aP of P consists of parallelograms each of which have a dimension less than 0
n, X(8P) = 0 and, therefore, X(P) = X(P). By Problem 2.10, Chapter 4, 0
as an open set, P can be represented as a countable union of disjoint semi-open cubes:
0
Therefore, X(P) = X(P) = C j = 1X(Cj) there is an N E N such that
< ca and hence for each
E
>0
On the other hand, by Problem 3.22, Chapter 5, for each E > 0, there is a finite cover of. P by disjoint semi-open rectangles R1,. .,R, such that
.
C I= i'(Ri)
- 6 '(P) 6
C 7r = l h(Ri)*
(2. lc)
Equations (2.lb) and (2.1~)yield
zL1X(RJ - 5 < X(P) c z=;
lX(Cj)
+$
(2. l d )
Therefore, from (2.16) we have that (2. le) Now L,(C) = P is a compact paralleletop with the property that for each E > 0, there is a finite cover of P by semi-open disjoint rectangles and a finite tuple of semi-open disjoint rectangles that can "approximate" P from above and below,
In terms of the Lebesgue measure A, this is in accordance with (2.1~2.lf). (v) Suppose L is an elementary linear operator. Then, applying L to (2.lf) and evaluating the Lebesgue measure of the resulting inclusion we have
From (2.la), the last inequality can be rewritten as
I detM I
cY- - lA(Cj) < AL,(P) < I detM I X I
r = l
or, with notation C =
A(Ri),
C 3 = 1 C3. and % = C I= lRi, in the form = I detM I A(%).
On the other hand, replacing
E
in (2.le) by
A(L*(%)\L*(C))
E
(2-lg)
I detM 1 we get
<
(2. l h )
We conclude that, if L is an elementary operator applied to a compact paralleletop P, for each E > 0, there are a subset C and a superset % of P whose images under L, satisfy inequalities (2.lg-2.lh) and
and
A(L,(C)) = [ detM 1 A(C)
(2. li)
A(L,(%)) = I detM I A(%).
(2. l j)
Equations (2.lg-2. lj) yield that A(L,(P)) = I detM I A(P).
(2. lk)
(vi) If $L$ is a regular linear operator, then, as it is known from linear algebra, $L$ can be expressed as a composition of finitely many elementary operators or, equivalently, $M = M_1 \cdots M_s$, where the $M_i$'s are elementary matrices. (One of the arguments is the Gauss-Jordan algorithm for derivation of the matrix inverse.) The application of $L_* = (L_1 \circ \dots \circ L_s)_*$ or of any partial composition of $L_1 \circ \dots \circ L_s$ to $C$ makes it a compact parallelotope such as $P$ above. Consequently,
$$\lambda(L_*(C)) = \lambda\big((L_1)_*\big((L_2 \circ \dots \circ L_s)_*(C)\big)\big)$$
and because of (2.1k),
$$\lambda(L_*(C)) = |\det M_1|\,\lambda\big((L_2 \circ \dots \circ L_s)_*(C)\big),$$
which finally yields
$$\lambda(L_*(C)) = |\det M_1| \cdots |\det M_s|\,\lambda(C) = |\det M|. \qquad □$$
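In the plane, Lemma 2.1 can be checked directly: the image of the unit square under a regular matrix $M$ is a parallelogram, whose area can be computed from its vertices and compared with $|\det M|$. The Python sketch below does this for one elementary matrix of each type and for their composition; the particular matrices and the use of the shoelace formula for the area are choices made for the illustration.

```python
from fractions import Fraction

def apply(M, v):
    """Image of the point v under the 2 x 2 matrix M."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def image_area(M):
    """Area of the image of the unit square, by the shoelace formula."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
    poly = [apply(M, c) for c in corners]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))
    return Fraction(abs(twice_area), 2)

scaling = [[-3, 0], [0, 1]]   # type 1 with c = -3
swap = [[0, 1], [1, 0]]       # type 2
shear = [[1, 0], [1, 1]]      # type 3: the first edge e_1 becomes e_1 + e_2
composite = matmul2(shear, matmul2(scaling, swap))

matrices = {"scaling": scaling, "swap": swap, "shear": shear, "composite": composite}
checks = {name: image_area(M) == abs(det2(M)) for name, M in matrices.items()}
all_match = all(checks.values())
```

The composite case reflects step (vi) of the proof: the absolute values of the determinants of the elementary factors multiply.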
2.2 Theorem. L e t L: Rn -IRn be a l i n e a r o p e r a t o r specified b y
matrix M . Then, for every Lebesgue measurable set E,
Ag(L,(E)) = I detM [ Ag(E).
P.2)
Proof. (i) If M is a singular matrix, then L maps the (n-dimensional) set E into Rm, where m < n and, therefore, L,(E) becomes A-negligible. On the other hand, detM = 0 and thus equation (2.2) is valid.
(ii) Suppose M is regular. Then L is diffeomorphic on Rn and, due to Proposition 1.21, L,(E) E L*.Denote
Then pg is a measure on ZB* and the restriction of p; to p (which evidently is AL,) from ZB* to the Bore1 c-algebra 4B is a Borel-LebesgueStieltjes measure. For every a E Rn and E E 93, the set E + a E 93 (why?) and L,(E a) = L,(E) Ma. Since (Proposition 4.3, Chapter 5) the Lebesgue measure A is translation invariant, we have that
+
+
which makes p also translation invariant on 93. Therefore, by Theorem 3.10 (Chapter 5), p = p(C)X, where C is the unit cube in Rn with the edges along the coordinate axes. By Problem 4.9 (Chapter 5), the outer measure p* (generated by p) will obey the same relation p* = p(C)A* on T(fl), where A* is the Lebesgue outer measure,
and p; = p(C)A; on 93* and 93' = A*. The statement of the theorem follows from Lemma 2.1: p(C) = x(L,(c)) = I detM I . 0
2.3 Corollary. (Generalization of Proposition 4.3, Chapter 5.) If L is an affine transformation ($L(x) = Mx + b$, where M is an n x n matrix and $b \in \mathbb{R}^n$), then for every Lebesgue measurable set E,
$$\lambda_0^*(L_*(E)) = |\det M|\,\lambda_0^*(E).$$
2.4 Lemma. Let $[O\ (\subseteq \mathbb{R}^n), \mathbb{R}^n, F]$ be a $C^1$-map, O be an open set, $C \subseteq O$ be a compact cube with its edges parallel to the coordinate axes, and $\varepsilon > 0$ be such that for all $x \in C$, $|||F'(x) - I|||_e < \varepsilon$, where I is the identity matrix. Then it holds true that
$$\lambda(F_*(C)) \le (1 + \varepsilon\sqrt{n})^n\,\lambda(C). \tag{2.4}$$
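Proof. Denote $\Phi(x) = F(x) - x$. Clearly, $\Phi$ is also a $C^1$-map. By Corollary 1.15, $\Phi$ satisfies a Lipschitz condition on C with respect to the supremum norm:
$$\|\Phi(x) - \Phi(x_0)\|_u \le K\,\|x - x_0\|_u, \tag{2.4a}$$
where $x, x_0 \in C$ and, obviously, $\Phi'(x) = F'(x) - I$. From (2.4a),
$$\|F(x) - F(x_0)\|_u \le \|\Phi(x) - \Phi(x_0)\|_u + \|x - x_0\|_u \le (K + 1)\,\|x - x_0\|_u. \tag{2.4b}$$
If $x_0$ is the center of the cube C and 2r is the length of its edge, then $\|x - x_0\|_u \le r$ and
$$\|F(x) - F(x_0)\|_u \le r(K + 1). \tag{2.4c}$$
The last inequality tells us that F(x) belongs to the compact cube centered at $F(x_0)$ with edge $2r(K + 1)$, that is, to the ball of radius $r(K + 1)$ with respect to the supremum norm, in notation $B_u(F(x_0), r(K + 1))$. In other words, (2.4c) yields that
$$F_*(C) \subseteq B_u(F(x_0), r(K + 1))$$
(see Figure 2.1), and because $F_*(C)$ is a Borel set,
$$\lambda(F_*(C)) \le (2r(K + 1))^n = (K + 1)^n\,\lambda(C).$$
Now, if $|||F'(x) - I|||_e \le \varepsilon$ for all $x \in C$, then $K \le \varepsilon\sqrt{n}$ and (2.4) follows. □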
2.5 Proposition. Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. Suppose for some b > 0,
$$|\det F'(x)| \le b \tag{2.5}$$
for all $x \in B$, where B is a Borel subset of O. Then,
$$\lambda(F_*(B)) \le b\,\lambda(B). \tag{2.5a}$$
Proof. (i) Suppose B is an open and bounded set such that $\bar{B} \subseteq O$. We prove (2.5a) under the assumption that (2.5) holds true for all $x \in \bar{B}$. Denote $\Phi(x) = (F^{-1})'(F(x))$. If $F^{-1}(y) = (g_1(y), \ldots, g_n(y))^T$, then obviously
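Since $\Phi(x_0)F(x)$ represents a linear map applied to F(x), by the chain rule,
$$(\Phi(x_0)F)'(x) = \Phi(x_0)F'(x).$$
By Example 1.20, $(F^{-1})'(F(x_0))\,F'(x_0) = I$. Thus,
$$(\Phi(x_0)F)'(x) - I = \Phi(x_0)F'(x) - \Phi(x_0)F'(x_0) = \Phi(x_0)\,(F'(x) - F'(x_0)),$$
and this turns out to be the product of the matrices $(F^{-1})'(F(x_0))$ and $F'(x) - F'(x_0)$. Since the Frobenius norm is submultiplicative (see (1.2a)),
$$|||(\Phi(x_0)F)'(x) - I|||_e \le |||\Phi(x_0)|||_e\;|||F'(x) - F'(x_0)|||_e. \tag{2.5b}$$
Since $\Phi$ is continuous and $\bar{B}$ is compact, $\Phi$ is bounded on $\bar{B}$ (in terms of the Frobenius norm) and so it is on B. Hence, there is an M > 0 such that
$$|||\Phi(x)|||_e \le M \quad \text{for all } x \in B. \tag{2.5c}$$
As a $C^1$-map, F' is continuous on $\bar{B}$ and, because $\bar{B}$ is compact, F' is therefore uniformly continuous, i.e., for every $\varepsilon > 0$, there is a $\delta > 0$ such that, for all $x, y \in \bar{B}$ with $\|x - y\|_u < \delta$,
$$|||F'(x) - F'(y)|||_e < \varepsilon / M. \tag{2.5d}$$
Combining (2.5c) and (2.5d) we have from (2.5b) that
$$|||(\Phi(x_0)F)'(x) - I|||_e < \varepsilon \quad \text{given } \|x - x_0\|_u < \delta. \tag{2.5e}$$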
By Problem 2.10, Chapter 4, B, as an open set, can be represented as an at most countable union of disjoint semi-open cubes $\{C_k\}$ with edges parallel to the coordinate axes. Obviously, we can assume that the edge of each cube does not exceed $2\delta$ or, otherwise, we can subdivide the edges accordingly. Now, if $x_0$ is the center of such a cube, then $\|x - x_0\|_u < \delta$ for any x from the cube. From Problem 1.13,
Since $\Phi(x_0)F$ is a diffeomorphism (as a composition of a regular linear map and a diffeomorphism), $\Phi(x_0)F_*(C_k)$ is a Borel set. Since $F'(x_0)$ is a linear operator, by Theorem 2.2, and from (2.5f),
By our assumption, $|\det F'(x)| \le b$ on B. By Lemma 2.4, applied to $\Phi(x_0)F$,
$$\lambda(\Phi(x_0)F_*(C_k)) \le (1 + \varepsilon\sqrt{n})^n\,\lambda_0(C_k).$$
Hence,
$$\lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n\,\lambda_0(C_k). \tag{2.5h}$$
Inequality (2.5h) holds for any cube. Now, since $B = \sum_{k=1}^{\infty} C_k$, we have that
$$\lambda(F_*(B)) = \sum_{k=1}^{\infty} \lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n \sum_{k=1}^{\infty} \lambda_0(C_k)$$
and thus
$$\lambda(F_*(B)) \le b\,(1 + \varepsilon\sqrt{n})^n\,\lambda(B).$$
Since the latter holds for every $\varepsilon > 0$, we have that
$$\lambda(F_*(B)) \le b\,\lambda(B).$$
Hence, given that (2.5) holds true on an open and bounded set B, (2.5a) is valid.
(ii) Now we suppose that (2.5) holds true on O. Note that O is open but not necessarily bounded. By Problem 6.12, Chapter 3, there is a monotone sequence $\{O_k\}$ of bounded open subsets of O, increasing to O. By Part (i), for each $O_k$,
$$\lambda(F_*(O_k)) \le b\,\lambda(O_k).$$
Since $F_*(O) = \bigcup_{k=1}^{\infty} F_*(O_k)$, by continuity from below,
$$\lambda(F_*(O)) = \lim_{k \to \infty} \lambda(F_*(O_k)) \le \lim_{k \to \infty} b\,\lambda(O_k) = b\,\lambda(O).$$
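(iii) Finally, let B be a Borel subset of O on which (2.5) holds true. By regularity of $\lambda$, Problem 3.15 (Chapter 5), for each $\varepsilon > 0$, there is an open superset $O_\varepsilon$ of B such that $\lambda(O_\varepsilon \setminus B) < \varepsilon$ or $\lambda(O_\varepsilon) < \lambda(B) + \varepsilon$. We assume that $O_\varepsilon \subseteq O$ or, otherwise, we take $O \cap O_\varepsilon$ instead. Denote
$$\tilde{O} = O_\varepsilon \cap \{x \in \mathbb{R}^n : |\det F'(x)| < b + \varepsilon\}.$$
$\tilde{O}$ has the following properties:
1) Since $|\det F'(x)| \le b$ on B, $B \subseteq \tilde{O}$.
2) Since $\tilde{O} = O_\varepsilon \cap \{x \in \mathbb{R}^n : |\det F'(x)| < b + \varepsilon\}$, by Problem 1.14, $\tilde{O}$ is open.
So, we have that $B \subseteq \tilde{O} \subseteq O_\varepsilon$. Thus,
$$\lambda(F_*(B)) \le \lambda(F_*(\tilde{O})) \le (b + \varepsilon)\,\lambda(\tilde{O}) \le (b + \varepsilon)(\lambda(B) + \varepsilon).$$
This holds true for any $\varepsilon > 0$. Hence it yields the statement. □
2.6 Proposition. Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. Then for each Borel subset B of O,
$$\lambda(F_*(B)) = \int_B |J_F(x)|\,\lambda(dx). \tag{2.6}$$
Proof. (i) Let B be a Borel subset of O such that $\lambda(B) < \infty$. Define for each k = 1, 2, ... and a fixed positive integer m,
$$B_{mk} = \Big\{x \in B : \tfrac{k-1}{m} \le |J_F(x)| < \tfrac{k}{m}\Big\}.$$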
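From Proposition 2.5,
$$\lambda(F_*(B_{mk})) \le \tfrac{k}{m}\,\lambda(B_{mk}). \tag{2.6a}$$
From Example 1.20, (1.20d),
$$J_{F^{-1}}(F(x)) = \frac{1}{J_F(x)}. \tag{2.6b}$$
For all $x \in B_{mk}$, $\tfrac{k-1}{m} \le |J_F(x)|$ and hence, from (2.6b), $|J_{F^{-1}}(F(x))| \le \tfrac{m}{k-1}$ (for $k \ge 2$; for k = 1 the lower bound below is trivial). If we apply Proposition 2.5 to $F^{-1}$ we will have that
$$\lambda(B_{mk}) \le \tfrac{m}{k-1}\,\lambda(F_*(B_{mk})),$$
which along with (2.6a) yields
$$\tfrac{k-1}{m}\,\lambda(B_{mk}) \le \lambda(F_*(B_{mk})) \le \tfrac{k}{m}\,\lambda(B_{mk}). \tag{2.6c}$$
For all $x \in B_{mk}$,
$$\tfrac{k-1}{m} \le |J_F(x)| < \tfrac{k}{m}. \tag{2.6d}$$
Integrating (2.6d) we have
$$\tfrac{k-1}{m}\,\lambda(B_{mk}) \le \int_{B_{mk}} |J_F(x)|\,\lambda(dx) \le \tfrac{k}{m}\,\lambda(B_{mk}). \tag{2.6e}$$
Combining (2.6c) and (2.6e) leads to
$$\Big|\lambda(F_*(B_{mk})) - \int_{B_{mk}} |J_F(x)|\,\lambda(dx)\Big| \le \tfrac{1}{m}\,\lambda(B_{mk}).$$
Because $B = \sum_{k=1}^{\infty} B_{mk}$, we have that
$$\Big|\lambda(F_*(B)) - \int_B |J_F(x)|\,\lambda(dx)\Big| \le \tfrac{1}{m}\,\lambda(B). \tag{2.6f}$$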
Since by our assumption $\lambda(B) < \infty$, we have from (2.6f) the validity of (2.6) by letting $m \to \infty$.
(ii) If B is an arbitrary Borel set, we can make a countable decomposition $B = \sum_{s=1}^{\infty} B_s$ such that $\lambda(B_s) < \infty$ and get (2.6) by summing up the equations
$$\lambda(F_*(B_s)) = \int_{B_s} |J_F(x)|\,\lambda(dx)$$
over s. □
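Formula (2.6) can be illustrated numerically on a familiar map. The sketch below assumes the planar polar map $F(r, \theta) = (r\cos\theta, r\sin\theta)$ with $|J_F| = r$ and a hypothetical rectangle $B = (1,2) \times (0, \pi/2)$, whose image is a quarter annulus; the midpoint rule and all names are illustrative choices, not part of the text.

```python
import math


def polar_jacobian(r, theta):
    """|J_F(r, theta)| for the polar map F(r, theta) = (r cos theta, r sin theta)."""
    return r


def integrate_midpoint(f, r_lo, r_hi, t_lo, t_hi, n=200):
    """Midpoint-rule approximation of the integral of f over a rectangle."""
    hr = (r_hi - r_lo) / n
    ht = (t_hi - t_lo) / n
    total = 0.0
    for i in range(n):
        r = r_lo + (i + 0.5) * hr
        for j in range(n):
            t = t_lo + (j + 0.5) * ht
            total += f(r, t)
    return total * hr * ht


# Right-hand side of (2.6): integral of |J_F| over B = (1, 2) x (0, pi/2).
integral_of_jacobian = integrate_midpoint(polar_jacobian, 1.0, 2.0, 0.0, math.pi / 2)

# Left-hand side of (2.6): elementary area of the quarter annulus F_*(B).
area_of_image = (math.pi / 4) * (2.0 ** 2 - 1.0 ** 2)
```

Both quantities equal $3\pi/4$ up to rounding, in line with Proposition 2.6.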
2.7 Remark. Formula (2.6) can be alternatively expressed as follows. Let $B_1$ be a Borel subset of $O_1$ and $B = F^*(B_1)$. Then B is also Borel and $B_1 = F_*(B)$. Applying Proposition 2.6 to such a B, we have that
$$\lambda(B_1) = \int_{F^*(B_1)} |J_F(x)|\,\lambda(dx).$$
2.8 Theorem. (Change of Variables.) Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism, let A be a Borel subset of O and $A_1 = F_*(A)$. Then for each Borel measurable function $[O_1, \mathbb{R}, g]$,
$$\int_{A_1} g\,d\lambda = \int_A g(F(x))\,|J_F(x)|\,\lambda(dx). \tag{2.8}$$
Proof. Let $g = 1_{B_1}$ for some Borel subset $B_1$ of $A_1$ and $B = F^*(B_1)$. Then, by (2.6),
$$\int_{A_1} 1_{B_1}\,d\lambda = \lambda(B_1) = \lambda(F_*(B)) = \int_B |J_F(x)|\,\lambda(dx) = \int_A 1_{B_1}(F(x))\,|J_F(x)|\,\lambda(dx). \tag{2.8a}$$
Thus (2.8) holds true for g being an indicator function. Let g be a simple function, i.e., $g = \sum_{i=1}^{k} a_i 1_{B_i}$, where $\{B_i, i = 1, \ldots, k\}$ is a measurable partition of $A_1$. From (2.8a) and linearity of the integral, (2.8) holds true for such a g.
The rest of the proof follows the standard procedure of passing to the class of $\mathcal{C}_+^{-1}$-functions and then to $g = g^+ - g^-$. □
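2.9 Examples. (Spherical Coordinate Transformation).
(i) Let O be an open subset of $\mathbb{R}^3$ defined as
$$O = (0, \infty) \times (0, 2\pi) \times (0, \pi)$$
and let $[O, \mathbb{R}^3, F]$ be defined as
$$(x(r,\theta,\varphi) = r\cos\theta\sin\varphi,\; y(r,\theta,\varphi) = r\sin\theta\sin\varphi,\; z(r,\theta,\varphi) = r\cos\varphi)^T. \tag{2.9}$$
The transformation has the range $\mathbb{R}^3 \setminus D$, where $D = \{(x,y,z) \in \mathbb{R}^3 : x \ge 0,\ y = 0,\ z \in \mathbb{R}\}$. One can easily see that F is a $C^1$-map on O and its Jacobian, $J_F(r,\theta,\varphi) = -r^2\sin\varphi \ne 0$ on O. By Remark 1.19 (ii), $[O, F_*(O) = \mathbb{R}^3 \setminus D, F]$ is a diffeomorphism. Such a map transforms the rectangle $[0,\rho] \times [0,2\pi) \times [0,\pi]$ onto the ball $B_e(0,\rho)$, but there it obviously fails to be a diffeomorphism. On the other hand, if we take
$$R = (0, \rho) \times (0, 2\pi) \times (0, \pi)$$
instead as the domain of F, it will transform the open rectangle R onto the open ball $B_e(0,\rho)$ with the deleted sector
$$S = \{(x,y,z) \in B_e(0,\rho) : x \ge 0,\ y = 0\}.$$
The transformation $[R, B_e(0,\rho) \setminus S, F]$, with F defined by (2.9), is clearly a diffeomorphism.
(ii) Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let g be defined as
$$g(x,y,z) = h\big(\sqrt{x^2 + y^2 + z^2}\big).$$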
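Let $B_e(0,\rho)$ be an open ball in $\mathbb{R}^3$. We will show that
$$\int_{B_e(0,\rho)} g\,d\lambda = 4\pi \int_0^{\rho} h(r)\,r^2\,dr. \tag{2.9b}$$
Consider the transformation $[R, B_e(0,\rho) \setminus S, F]$ from (i). Since S is a two-dimensional set, its Lebesgue measure in $\mathbb{R}^3$ is zero and, consequently,
$$\int_{B_e(0,\rho)} g\,d\lambda = \int_{B_e(0,\rho) \setminus S} g\,d\lambda.$$
Now we are going to apply formula (2.8):
$$\int_{A_1} g\,d\lambda = \int_A g(F(p))\,|J_F(p)|\,\lambda(dp),$$
with $A_1 = B_e(0,\rho) \setminus S$, $A = F^*(A_1) = R = (0,\rho) \times (0,2\pi) \times (0,\pi)$, $p = (r,\theta,\varphi)$, and $|J_F(p)| = r^2\sin\varphi$. Clearly, $g(F(p)) = h(r)$, which by Fubini's Theorem leads to
$$\int_0^{\rho} h(r)\,r^2\,dr \int_0^{2\pi} d\theta \int_0^{\pi} \sin\varphi\,d\varphi.$$
The last expression reduces to a Riemann integral and this further reduces to $4\pi \int_0^{\rho} h(r)\,r^2\,dr$. □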
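Formula (2.9b) can be sanity-checked numerically. The sketch below assumes the hypothetical choice $h(r) = r^2$ and $\rho = 1$, for which the right-hand side of (2.9b) has the closed form $4\pi\rho^5/5$; the three one-dimensional midpoint sums mirror the factorization obtained from Fubini's Theorem. The helper names are illustrative only.

```python
import math


def midpoint_rule(f, lo, hi, n=2000):
    """Midpoint-rule approximation of a one-dimensional integral."""
    step = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * step) for i in range(n)) * step


def h(r):
    """Hypothetical radial profile chosen for illustration."""
    return r ** 2


rho = 1.0

# Iterated integral of h(r) * r^2 * sin(phi) over R = (0,rho) x (0,2pi) x (0,pi).
radial_part = midpoint_rule(lambda r: h(r) * r ** 2, 0.0, rho)
theta_part = midpoint_rule(lambda t: 1.0, 0.0, 2 * math.pi)
phi_part = midpoint_rule(math.sin, 0.0, math.pi)
iterated_integral = radial_part * theta_part * phi_part

# Closed form of 4*pi * integral_0^rho r^4 dr.
closed_form = 4 * math.pi * rho ** 5 / 5
```

The angular factors contribute $2\pi \cdot 2 = 4\pi$, so the iterated integral agrees with the closed form up to the discretization error of the midpoint rule.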
PROBLEMS
2.1 Show the validity of (2.1a).
2.2 Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let $E(0; a_1, a_2, a_3)$ denote the open ellipsoid
Show that
2.3 Show that the volume of the ellipsoid in Problem 2.2 is $\tfrac{4}{3}\pi a_1 a_2 a_3$.
2.4 Evaluate the integral
where $B_e(0,\rho)$ is a ball in $\mathbb{R}^3$.
NEW TERMS:
Borel-Lebesgue measure of a cube under a linear map
Lebesgue measure of a set under an affine map
Borel-Lebesgue measure of a Borel set under a diffeomorphism
change of variables in Euclidean spaces
spherical coordinate transformation
volume of an ellipsoid
Further Topics in Integration
Chapter 8 Analysis in Abstract Spaces
This chapter (which is the least focused of the entire text) continues the study of integration started in Chapter 6 and combines seemingly diverse topics from measure, integration, functional analysis, and topology. Having learned about absolute continuity of positive measures, briefly introduced in Chapter 6, Section 5 (which may be sufficient for a first acquaintance), we render a more thorough analysis of the Radon-Nikodym theory (Section 2) from the position of signed and complex measures (the subject of Section 1). Singularity and Lebesgue decomposition of signed measures are also treated here (Section 3) in a more rigorous fashion. The reader will definitely benefit from having a first look at Chapter 6, Section 5, even though many of its formalities are suppressed. The results on signed measures are then applied to the analysis of $L^p$ spaces (a traditional topic of functional analysis) and to a generalization of the Lebesgue Dominated Convergence Theorem (Section 4), followed by convergence of measures (Section 5) and uniform integrability (Section 6). In Section 7, we return to locally compact Hausdorff spaces (started in Sections 10 and 11, Chapter 3) in connection with regularity of Radon measures and the general proof of the Riesz Representation Theorem. The chapter concludes with derivatives of measures (Section 8), which make traditional calculus on the real line (Chapter 9) very powerful. Besides the Radon-Nikodym Theorem (initially discussed in Chapter 6), $L^p$ spaces and the Riesz Representation Theorem are among the main topics of this chapter. $L^p$ spaces (and their duals) were introduced and studied by the Hungarian Frigyes (Frédéric) Riesz (one of the major figures in early functional analysis), who presented in 1910 a fully developed theory of these spaces, operators on them, and their spectral theory.
His widely cited 1909 Representation Theorem (of continuous linear functionals through integrals), initiated by Jacques Hadamard in 1903, was his other major accomplishment, even though he proved this theorem only for the special case of Riemann-Stieltjes integrals on [a,b]. Consequently, Riesz used no measure theory, although his work made a huge impact on the development of measure theory and integration and, in particular, led Johann Radon to his revolutionary 1913 work.
CHAPTER 8. ANALYSIS IN ABSTRACT SPACES
1. SIGNED AND COMPLEX MEASURES
The situation below motivates the study of a more general class of set functions than those we called "measures." Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f \in L^1(\Omega, \Sigma, \mu; \mathbb{R})$. Define the following set function on $\Sigma$:
$$\nu(A) = \int_A f\,d\mu = \nu^+(A) - \nu^-(A),$$
where $\nu^+(A) = \int_A f^+\,d\mu$ and $\nu^-(A) = \int_A f^-\,d\mu$.
The set function $\nu$ has all the properties of a measure ($\sigma$-additivity follows by Lebesgue's Dominated Convergence Theorem) except for being positive. However, in the above decomposition $\nu = \nu^+ - \nu^-$, the set function $\nu$ is represented by the difference of two measures. We will study this type of set function, which we call a signed measure. We give a formal definition below, without saying anything about a decomposition, which is to follow later.
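The motivating construction can be made concrete on a finite space, where integrals are finite sums. The following is only a sketch: the three-point space, the weights `mu`, and the density `f` are hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical positive measure on a three-point space, given by atom weights.
mu = {"a": 2, "b": 1, "c": 3}
# Hypothetical integrable density f taking both signs.
f = {"a": 1, "b": -4, "c": 0}


def nu(A):
    """nu(A) = integral of f over A with respect to mu."""
    return sum(f[w] * mu[w] for w in A)


def nu_plus(A):
    """nu^+(A) = integral of the positive part f^+ over A."""
    return sum(max(f[w], 0) * mu[w] for w in A)


def nu_minus(A):
    """nu^-(A) = integral of the negative part f^- over A."""
    return sum(max(-f[w], 0) * mu[w] for w in A)


def all_subsets(space):
    """All subsets of a finite set, as frozensets."""
    items = sorted(space)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)


# nu = nu^+ - nu^- holds on every measurable set (here: every subset).
decomposition_holds = all(
    nu(A) == nu_plus(A) - nu_minus(A) for A in all_subsets(mu)
)
```

Here $\nu$ takes the negative value $-2$ on the whole space, so it is not a positive measure, while $\nu^+$ and $\nu^-$ are.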
1.1 Definitions.
(i) Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is called a signed measure if:
a) $\nu(\emptyset) = 0$;
b) for each $A \in \Sigma$, the value of $\nu(A)$ is well defined, i.e. it is either finite or $+\infty$ or $-\infty$;
c) $\nu$ is $\sigma$-additive.
To tell signed measures from nonnegative measures, we will refer to the latter as positive measures. $\mathfrak{S}(\Omega, \Sigma)$ will denote the set of all signed measures on the measurable space $(\Omega, \Sigma)$.
(ii) The signed measure is called finite if its range is a subset of $\mathbb{R}$. Otherwise, it is called infinite. The triple $(\Omega, \Sigma, \nu)$ is called the signed measure space. According to the type of the signed measure, the signed measure space is referred to as finite or infinite. The signed measure $\nu$ is called $\sigma$-finite if $\Omega$ admits a countable measurable partition $\{\Omega_n\}$ of $\nu$-finite sets.
(iii) Sometimes, we will need the notion of a finite set under $\nu$ (or just a $\nu$-finite set). This is a measurable set A with $|\nu(A)| < \infty$. A measurable set P is called $\nu$-positive (or just positive) if $\nu(P \cap A) \ge 0$ for all $A \in \Sigma$. A measurable set N is called $\nu$-negative (or just negative) if $\nu(N \cap A) \le 0$ for all $A \in \Sigma$. Obviously, P (N) is positive (negative) if and only if for any measurable subset E of P (N), $\nu(E) \ge 0$ ($\le 0$).
(iv) A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is called continuous from below if for every monotone nondecreasing sequence $\{A_n\} \uparrow\ \subseteq \Sigma$ it holds that
$$\lim_{n \to \infty} \nu(A_n) = \nu\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$
(v) Let $\{A_n\}$ be a monotone nonincreasing sequence of sets from $\Sigma$ of which at least one is $\nu$-finite. A set function $\nu: \Sigma \to \bar{\mathbb{R}}$ is said to be continuous from above on $\{A_n\}$ if
$$\lim_{n \to \infty} \nu(A_n) = \nu\Big(\bigcap_{n=1}^{\infty} A_n\Big). \tag{1.1}$$
The set function $\nu$ is continuous from above on $\Sigma$ if (1.1) holds for every monotone nonincreasing sequence $\{A_n\} \downarrow\ \subseteq \Sigma$ with at least one $\nu$-finite set. In particular, if $\{A_n\} \downarrow \emptyset$, (1.1) reduces to
$$\lim_{n \to \infty} \nu(A_n) = 0$$
and this is referred to as continuity from above at the empty set or, shortly, $\emptyset$-continuity of $\nu$.
(vi) Any signed measure on the Borel $\sigma$-algebra is called a signed Borel measure. In particular, a signed Borel measure on $(\mathbb{R}^n, \mathfrak{B})$, finite on $d_e$-bounded Borel sets, is said to be a signed Borel-Lebesgue-Stieltjes measure.
1.2 Remarks. (i) Notice that (ii) and (iii) imply that if $A_1 + A_2$ is any decomposition of a measurable set A and if $\nu(A_1) = -\infty$, then so is $\nu(A)$ (and $\nu(A_2)$ must not equal $+\infty$) and $\nu(A_2)$ is either finite or $-\infty$; and it is not possible for any subset of A to have its signed measure be $+\infty$, as this would yield $\nu(A) = +\infty$. If A is a finite set under $\nu$, then any finite or countable decomposition of A consists of finite subsets under $\nu$.
(ii) If a sequence of mutually disjoint measurable sets $\{A_n\}$ is such that its union is finite under $\nu$, i.e. $|\nu(\sum_{n=1}^{\infty} A_n)| < \infty$, then $\sigma$-additivity of $\nu$ ($|\nu(\sum_{n=1}^{\infty} A_n)| = |\sum_{n=1}^{\infty} \nu(A_n)|$) implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent. This, as we know, does not hold true for series in the general case. The reader is encouraged to explain this phenomenon. (See Problem 1.1.)
We will start with a few introductory properties of signed measures.
1.3 Proposition. Let $(\Omega, \Sigma, \nu)$ be a signed measure space.
(i) If A and B are measurable sets such that B is $\nu$-finite and $A \subseteq B$, then A is also $\nu$-finite and $\nu(B \setminus A) = \nu(B) - \nu(A)$.
(ii) The signed measure $\nu$ is continuous from below and from above.
(See Problems 1.2 and 1.3.) The converse of Proposition 1.3 (ii) is as follows.
1.4 Proposition. Let $\nu$ be a finitely additive set function on the measurable space $(\Omega, \Sigma)$ such that $\nu(\emptyset) = 0$. If $\nu$ is continuous from below, then it is $\sigma$-additive. If $\nu$ is finite, then the continuity from above implies that $\nu$ is $\sigma$-additive.
(See Problem 1.4.)
So= sup{v(C): G
Bo = A),
which, by Proposition 1.3 (i), is finite and by our assumption about E is also positive. Hence, for every E , there is a set C1 E A such that v(Cl) E 2 SO> 0. Let E = $So. Then, C1 is such that v(C1) 2 $So. Now, if B, = A\C, is v-negatiue, then we are done with the proof. Indeed, v(B1) = v(A) - v(C1), by Proposition 1.3 (i), and because v(C1) > 0, v(Bl) < v(A). Otherwise, there is a t least one subset of B1 whose measure is strictly positive. Continuing with the same procedure, a t step n we arrive a t set ~.
+
which is either a v-negative set satisfying v(B,) < v(A) or it admits a t least one subset with a positive value under v. This again leads to a positive real number
and v(C, set
the +
such that existence of a nontrivial set C, + 2 > 0. If for no n, B, defined above is negative, then we
;s,
We show that N is a negative subset of A claimed in the statement of
1. Signed and Complex Measures
the lemma. From
we see that both v(N) and C F = lv(C,) are finite. The latter implies that v(C ) and, consequently, S,, dominated by v(C,), are vanishing. 1 (Notice that, because Y ( C ~ lCn) = > 0, N # 0.)This in turn yields that N is negative. Indeed, from the definition of S,, for every measurable subset E of B,, v(E) S., Since B, E N , it follows that for every measurable set D, v ( N n D ) 5 S, 10. Finally, that v(N) < v(A) is obvious.
<
The following theorem states that there is an (essentially unique) decomposition of the carrier set $\Omega$ into a positive and a negative set relative to a given signed measure $\nu$. This decomposition, referred to as a Hahn decomposition, leads to the upcoming Jordan decomposition of $\nu$ into the difference of two positive measures mentioned in the beginning of this section.
1.6 Theorem (Hahn Decomposition Theorem). Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then $\Omega$ can be partitioned into two sets, P and N, of which P is a positive and N is a negative set, referred to as a Hahn decomposition of $\Omega$ with respect to $\nu$, in notation (P,N). A Hahn decomposition is unique in the following sense. If there is another Hahn decomposition (P',N'), then $P \,\Delta\, P'$ and $N \,\Delta\, N'$ are $\nu$-null sets and therefore all Hahn decompositions form a unique equivalence class.
Proof. We assume without loss of generality that $\nu$ does not take the value $-\infty$. If $\emptyset$ is the only negative set of $\nu$, then for each $A \in \Sigma$, $\nu(A) \ge 0$. (If there is a set A such that $\nu(A) < 0$, then by Lemma 1.5 there would be a nonempty, negative subset of A.) Therefore, $(\Omega, \emptyset)$ is the "trivial" Hahn decomposition and we are done with the proof. Let $I = \inf\{\nu(E) : E \in \Sigma \text{ and } E \text{ is } \nu\text{-negative}\}$. Clearly, $I \le 0$. Then, there is a sequence $\{N_n\}$ of negative sets with $\lim_{n \to \infty} \nu(N_n) = I$. Because of Problem 1.5, $N := \bigcup_{n=1}^{\infty} N_n$ is also a negative set. Regarding $B_n$ as $\bigcup_{k=1}^{n} N_k$, we have $\{B_n\}$ as a monotone nondecreasing sequence of negative sets $\uparrow N$ and hence, by Proposition 1.3 (ii), $\lim_{n \to \infty} \nu(B_n) = \nu(N)$. Furthermore, since $B_n \setminus N_n \subseteq B_n$ and $B_n$ is negative, $\nu(B_n \setminus N_n) \le 0$. On the other hand, $\nu(B_n \setminus N_n) = \nu(B_n) - \nu(N_n)$ and thus $\nu(B_n) \le \nu(N_n)$. The latter yields that $\nu(N) \le I$. On the other hand, as for a negative set, $\nu(N) \ge I$, and thus $\nu(N) = I$.
Now we show that $P = N^c$ is a $\nu$-positive set. If this is not the case, then there is at least one measurable subset A of P with $-\infty < \nu(A) < 0$ and then, by Lemma 1.5, there is a measurable, negative subset B of A with $\nu(B) \le \nu(A)$; hence $\nu(B) < 0$. Then, $B + N$ makes a negative set such that $\nu(B + N) = \nu(B) + \nu(N) < \nu(N) = I$, which contradicts the fact that I is the infimum of $\nu$ over all negative sets. The uniqueness of the Hahn decomposition is left for an exercise. (See Problem 1.7.) □
While the Hahn decomposition is a decomposition of the carrier $\Omega$ (with respect to the signed measure $\nu$), the Jordan decomposition below is of the signed measure itself. It states that each signed measure is the difference of two positive measures.
1.7 Corollary (Jordan Decomposition). Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then $\nu$ can be represented as the difference of two positive measures, of which at least one is finite, and this representation is unique (in the sense that it does not depend on the choice of a Hahn decomposition).
Proof. Let (P,N) be a Hahn decomposition of $\Omega$ relative to $\nu$ and define the set functions $\nu^+$ and $\nu^-$ on $\Sigma$ as follows:
$$\nu^+(A) = \nu(A \cap P) \quad \text{and} \quad \nu^-(A) = -\nu(A \cap N). \tag{1.7}$$
It follows from the definition of $\nu^+$ and $\nu^-$ that both are positive measures on $\Sigma$. It is also obvious why only one of them can be infinite. Hence, $\nu = \nu^+ - \nu^-$ is the Jordan decomposition induced by the Hahn decomposition (P,N). Suppose that $\mu^+ - \mu^-$ is yet another Jordan decomposition of $\nu$ induced by the Hahn decomposition (P',N'). Then, it can be easily shown (and it is left for an exercise; see Problem 1.8) that $\nu^+ = \mu^+$ and $\nu^- = \mu^-$. □
1.8 Definition. The Jordan decomposition of a signed measure $\nu$ defined in Corollary 1.7, due to its uniqueness, suggests the following terms:
$\nu^+$ is called the positive variation of $\nu$;
$\nu^-$ is called the negative variation of $\nu$;
$|\nu| = \nu^+ + \nu^-$ is called the total variation of $\nu$.
(As the sum of two positive measures, $|\nu|$ is a positive measure itself.)
One of the remarkable properties of the Hahn-Jordan decomposition of a signed measure $\nu$ is that $\nu$ attains its maximum and minimum values on two disjoint measurable subsets of $\Omega$, as stated by the following proposition.
1.9 Proposition. Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then the positive, negative and total variations of $\nu$ can be represented as follows. Given any measurable set $A \in \Sigma$,
(i) $\nu^+(A) = \sup\{\nu(E) : E \in \Sigma \cap A\}$
(ii) $\nu^-(A) = \sup\{-\nu(E) : E \in \Sigma \cap A\} = -\inf\{\nu(E) : E \in \Sigma \cap A\}$
(iii) $|\nu|(A) = \sup\{\sum_{k=1}^{n} |\nu(E_k)| : \{E_1, \ldots, E_n\} \subseteq \Sigma \text{ and } \sum_{k=1}^{n} E_k = A\}$.
Proof. Denote by (P,N) a Hahn decomposition of $\Omega$ with respect to $\nu$ and let
$$\nu_{\sup}(A) = \sup\{\nu(E) : E \in \Sigma \cap A\}$$
and
$$\nu_{\inf}(A) = \inf\{\nu(E) : E \in \Sigma \cap A\} = -\sup\{-\nu(E) : E \in \Sigma \cap A\}.$$
(i) Clearly, $\nu^+(A) = \nu(A \cap P) \le \nu_{\sup}(A)$. To prove the inverse inequality we notice that because $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\,\nu)$ is a positive measure space, $\mathrm{Res}_{\Sigma \cap P}\,\nu$ is monotone and hence, for each $E \in \Sigma \cap A$,
$$\nu(E) = \nu(E \cap P) + \nu(E \cap N) \le \nu(E \cap P) \le \nu(A \cap P).$$
This yields the desired inverse inequality
$$\nu_{\sup}(A) \le \nu(A \cap P) = \nu^+(A)$$
and thereby proves part (i) of the proposition.
(ii) Because P and N interchange their roles for $-\nu$, we have
$$\nu^-(A) = (-\nu)^+(A) = \sup\{-\nu(E) : E \in \Sigma \cap A\}$$
and therefore $\nu^- = -\nu_{\inf}$. We leave part (iii) for an exercise (Problem 1.9). □
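On a finite space, parts (i) and (ii) of Proposition 1.9 can be verified by brute force. The sketch below assumes a hypothetical signed measure given by point weights on a four-point space; the Hahn decomposition is read off from the signs of the weights, and the supremum and infimum are taken over all subsets. All names and values are illustrative.

```python
from itertools import combinations

# Hypothetical signed measure on a four-point space: nu({w}) for each atom w.
weights = {"a": 2, "b": -4, "c": 0, "d": 5}


def nu(E):
    """Signed measure of a set E of atoms."""
    return sum(weights[w] for w in E)


def subsets(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)


# A Hahn decomposition (P, N): atoms of nonnegative weight go to P.
P = frozenset(w for w, x in weights.items() if x >= 0)
N = frozenset(weights) - P


def nu_plus(A):
    """Positive variation via the Hahn decomposition, as in (1.7)."""
    return nu(A & P)


def nu_minus(A):
    """Negative variation via the Hahn decomposition, as in (1.7)."""
    return -nu(A & N)


def sup_nu(A):
    """sup of nu(E) over all measurable subsets E of A."""
    return max(nu(E) for E in subsets(A))


def inf_nu(A):
    """inf of nu(E) over all measurable subsets E of A."""
    return min(nu(E) for E in subsets(A))


omega = frozenset(weights)
part_i_holds = all(nu_plus(A) == sup_nu(A) for A in subsets(omega))
part_ii_holds = all(nu_minus(A) == -inf_nu(A) for A in subsets(omega))
```

The zero-weight atom `c` could equally be placed in N; this reflects the fact that Hahn decompositions are unique only up to $\nu$-null sets, while $\nu^+$ and $\nu^-$ are unaffected.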
1.10 Remark. In summary of the Hahn-Jordan decomposition, we have that
$$\nu^+(A) = \sup\{\nu(E) : E \in \Sigma \cap A\} = \nu(A \cap P),$$
$$\nu^-(A) = -\inf\{\nu(E) : E \in \Sigma \cap A\} = -\nu(A \cap N),$$
and
$$\nu(A) = \nu^+(A) - \nu^-(A).$$
This has an obvious interpretation. The signed measure $\nu$ attains its maximum and minimum values on two measurable disjoint subsets of A: $A \cap P$ and $A \cap N$, respectively; and the entire measure of A is the sum of these two values. In particular, it follows that P and N are the $\nu$-maximal and $\nu$-minimal subsets of $\Omega$ (in notation, P = S and N = I) on which $\nu$ attains maximum and minimum values, respectively. This is due to the fact that $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\,\nu)$ is a positive measure space and hence $\mathrm{Res}_{\Sigma \cap P}\,\nu$ is monotone. A similar argument explains why $\nu$ attains a minimum value on N.
Let us consider a few examples.
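1.11 Examples.
(i) Let $(\Omega, \Sigma, \nu)$ be a signed measure space. If $\nu$ is a positive measure, then, obviously, $S = \Omega$ and $I = \emptyset$. Consider the case with $\nu = \int f\,d\mu$, where $\mu$ is a positive measure on $(\Omega, \Sigma)$ and $f \in L^1(\Omega, \Sigma, \mu)$. Then,
$$\nu(A) = \int_A f\,d\mu = \int_{A \cap \{f \ge 0\}} f\,d\mu + \int_{A \cap \{f < 0\}} f\,d\mu.$$
Therefore,
$$\nu^+(A) = \int_{A \cap \{f \ge 0\}} f\,d\mu \quad \text{and} \quad \nu^-(A) = -\int_{A \cap \{f < 0\}} f\,d\mu,$$
and thus $S = \{f \ge 0\}$ and $I = \{f < 0\}$.
(ii) If $\mu$ and $\rho$ are two positive measures (of which at least one is finite), the difference $\nu = \mu - \rho$ is a signed measure. However, it is not, in general, the Jordan decomposition of $\nu$. Let $\nu$ be a signed measure on the measurable space $(\Omega, \Sigma)$. Denote by $\nu_E = \mathrm{Res}_{\Sigma \cap E}\,\nu$, where E is a measurable set. To obtain the Jordan decomposition of $\nu = \mu - \rho$, we need any Hahn decomposition of $\Omega$ with respect to $\nu$. Say, (P,N) is one. Then, from Corollary 1.7,
$$\nu^+(A) = \nu(A \cap P) = (\mu - \rho)(A \cap P) \tag{1.11}$$
and
$$\nu^-(A) = -\nu(A \cap N) = (\rho - \mu)(A \cap N). \tag{1.11a}$$
We can also make use of formulas (i) and (ii) of Proposition 1.9 to determine the positive and negative variations.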
(iii) Let $\varepsilon_0$ be the point mass and P a probability measure on $(\mathbb{R}, \mathfrak{B})$. We find a Hahn decomposition of the signed measure $\nu = P - \varepsilon_0$. We show that $I = \{0\}$ is a $\nu$-minimal (and negative) set discussed in Remark 1.10. For an $A \in \mathfrak{B}$, $\nu(A \cap I^c) = P(A \cap I^c) \ge 0$, and either
$$\nu(A \cap I) = P(\{0\}) - \varepsilon_0(\{0\}) = P(\{0\}) - 1, \quad \text{with } 0 \in A,$$
or
$$\nu(A \cap I) = 0, \quad \text{with } 0 \notin A,$$
which implies that $\nu(A \cap I) \le 0$. Using relations (1.11) and (1.11a), we have the Jordan decomposition of $\nu$: $\nu^+(A) = \nu(A \cap I^c) = P(A \cap I^c)$, and $\nu^-(A) = 1_A(0)\,(1 - P(\{0\}))$. Note that $\{0\} = I$ is the set where $\nu$ attains its minimum.
(iv) Let $\nu = \lambda - \mu$, where $\lambda$ is the Lebesgue measure on $(\mathbb{R}, \mathfrak{B})$ and $\mu$ is the geometric measure defined as
Clearly, $N = \{1, 2, \ldots\}$ is a negative set relative to $\nu$, whereas $P = N^c$ is a positive set. Thus, (P,N) is a Hahn decomposition of $\mathbb{R}$ relative to $\nu$ and, consequently, for every Borel set A,
$$\nu^+(A) = (\lambda - \mu)(A \cap \{1, 2, \ldots\}^c)$$
and
$$\nu^-(A) = (\mu - \lambda)(A \cap \{1, 2, \ldots\})$$
represent the Jordan decomposition of $\nu$. Since N is a $\lambda$-null set, the latter reduces to
$$\nu^-(A) = \mu(A \cap \{1, 2, \ldots\}).$$
Therefore, $\nu$ attains its minimum at N and its value is $-1$, while the maximum value of $\nu$ is $\infty$ and it is attained at $N^c$. □
The next embellishment of the notion of measure is a complex measure.
1.12 Definition. Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu$ on $\Sigma$ is said to be a complex measure if:
(i) $\nu$ is valued in $\mathbb{C}$. [Notice that being valued in $\mathbb{C}$, $\nu$ must not have infinite values, and therefore, of signed or positive measures only finite ones can qualify as complex measures.]
(ii) $\nu(\emptyset) = 0$.
(iii) $\nu$ is $\sigma$-additive. [Analogously to the signed measures (see Remark 1.2 (ii)), $\sigma$-additivity of $\nu$,
$$\Big|\nu\Big(\sum_{n=1}^{\infty} A_n\Big)\Big| = \Big|\sum_{n=1}^{\infty} \nu(A_n)\Big|$$
(where $|\cdot|$ stands for the two-dimensional Euclidean norm), implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent.]
The triple $(\Omega, \Sigma, \nu)$ is referred to as a complex measure space. Now, we use a concept similar to that of Proposition 1.9 (iii) to define the total variation of a complex measure.
1.13 Definitions.
(i) Given a complex measure space $(\Omega, \Sigma, \nu)$, the complex measure $\nu$ can be represented as $\nu = \nu_1 + i\nu_2$, where $\nu_1$ and $\nu_2$ are finite signed measures on $\Sigma$. Hahn decompositions should then be applied to $\nu_1$ and $\nu_2$ and their corresponding Jordan decompositions will yield
$$\nu = \nu_1^+ - \nu_1^- + i(\nu_2^+ - \nu_2^-), \tag{1.13}$$
with $\nu_1^+$, $\nu_1^-$, $\nu_2^+$ and $\nu_2^-$ being positive finite measures. We will call (1.13) the Jordan decomposition of the complex measure $\nu$.
(ii) For each measurable set A, the total variation $|\nu|(A)$ of a complex measure $\nu$ is defined as
$$|\nu|(A) = \sup\Big\{\sum_{k=1}^{n} |\nu(A_k)| \ \text{over all finite measurable partitions } \{A_1, \ldots, A_n\} \text{ of } A\Big\}.$$
1.14 Proposition. The total variation of a complex measure $\nu$ on $(\Omega, \Sigma)$ is a finite positive measure on $(\Omega, \Sigma)$.
Proof. Let $\{A_1, \ldots, A_n\}$ be a measurable partition of a set $A \in \Sigma$. Because
$$\sqrt{(a - b)^2 + (c - d)^2} \le a + b + c + d$$
for nonnegative real numbers a, b, c, d, and due to Proposition 1.9 (iii) we have
$$\sum_{k=1}^{n} |\nu(A_k)| \le \sum_{k=1}^{n} \big(|\nu_1|(A_k) + |\nu_2|(A_k)\big) = |\nu_1|(A) + |\nu_2|(A)$$
and therefore,
$$|\nu|(A) \le |\nu_1|(A) + |\nu_2|(A). \tag{1.14}$$
Consequently, the total variation of any measurable set is a real nonnegative number. Obviously, $|\nu|(\emptyset) = 0$. Now we show that $|\nu|$ is an additive set function. Let A and B be disjoint measurable sets and let $\{E_1, \ldots, E_n\}$ be a measurable partition of A + B. Then,
$$E_k = (E_k \cap A) + (E_k \cap B), \quad k = 1, \ldots, n,$$
and the triangle inequality of the Euclidean norm yield that:
$$\sum_{k=1}^{n} |\nu(E_k)| \le \sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| \le |\nu|(A) + |\nu|(B)$$
and therefore,
$$|\nu|(A + B) \le |\nu|(A) + |\nu|(B). \tag{1.14a}$$
The inverse inequality is due to the following. Given a measurable partition $\{E_1, \ldots, E_n\}$ of A + B, it holds true that
Applying the supremum twice to the left-hand side of the above inequality we arrive at the desired inverse to inequality (1.14a). Hence, we showed that the total variation of $\nu$ is a finite content on $\Sigma$. Finally, by Proposition 1.4, $|\nu|$ is $\sigma$-additive if it is $\emptyset$-continuous. This readily follows from (1.14) and the fact that $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$, as positive measures, are $\emptyset$-continuous. □
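For a complex measure concentrated on finitely many atoms, the supremum in Definition 1.13 (ii) can be computed by enumerating all partitions. The following is a sketch under that assumption; the atom values are hypothetical and the recursive partition generator is a standard construction, not taken from the text.

```python
def partitions(items):
    """Yield all partitions of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or let it start a block of its own.
        yield [[first]] + part


# Hypothetical complex measure on a three-point space: nu({w}) for each atom w.
atoms = {"a": 3 + 4j, "b": -1j, "c": -2 + 0j}


def nu(E):
    """Complex measure of a set E of atoms."""
    return sum(atoms[w] for w in E)


def total_variation(A):
    """sup of sum |nu(A_k)| over all finite partitions {A_k} of A."""
    return max(
        sum(abs(nu(block)) for block in part)
        for part in partitions(sorted(A))
    )


tv_omega = total_variation(atoms.keys())

# Total variations of the real and imaginary parts nu_1 and nu_2.
tv_real = sum(abs(z.real) for z in atoms.values())
tv_imag = sum(abs(z.imag) for z in atoms.values())
```

In this atomic case the supremum is attained on the finest partition, so $|\nu|(\Omega) = 5 + 1 + 2 = 8$, which respects the bound (1.14) by $|\nu_1|(\Omega) + |\nu_2|(\Omega) = 10$ and is additive over disjoint sets.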
1.15 Remarks. (i) Notice that there is a slight difference in the definition of the total variation of a signed measure and a complex measure, but according to Problem 1.13, they agree in the case of finite signed measures.
(ii) While the set $\mathfrak{S}(\Omega, \Sigma)$ of all signed measures is not a linear space (the sum of two signed measures need not be a signed measure, as we can arrive at $\infty - \infty$), the space $\mathfrak{C}(\Omega, \Sigma)$ of all complex measures (over the field $\mathbb{C}$) is. It is easy to verify that $\|\nu\|$ defined as $|\nu|(\Omega)$ is a norm and therefore upgrades $\mathfrak{C}(\Omega, \Sigma)$ to a normed linear space. It can be shown (Problem 1.14) that $(\mathfrak{C}(\Omega, \Sigma), \|\cdot\|)$ is even a Banach space.
(iii) In the subspace $\mathfrak{S}_f(\Omega, \Sigma)$ of finite signed measures we can introduce the partial order relation as $\nu_1 \le \nu_2$ if and only if $\nu_1(A) \le \nu_2(A)$ for each measurable set A. The lattice operations are defined as follows:
$$(\nu \vee \mu)(A) = \sup\{\nu(E) + \mu(A \setminus E) : E \in \Sigma \cap A\},$$
$$(\nu \wedge \mu)(A) = \inf\{\nu(E) + \mu(A \setminus E) : E \in \Sigma \cap A\}.$$
It is obvious that $\nu \wedge \mu \le \nu \le \nu \vee \mu$ and $\nu \wedge \mu \le \mu \le \nu \vee \mu$. It can also be readily shown (Problem 1.15) that
and therefore, $(\mathfrak{S}_f(\Omega, \Sigma), \|\cdot\|, \le)$ is a Banach lattice.
The following is an embellishment of the integral notion of real- and complex-valued functions with respect to signed and complex measures.
1.16 Definitions. (i) Let $[\Omega, \mathbb{C}, f = (u,v)]$ be a complex-valued function. Given a $\sigma$-algebra $\Sigma$ in $\Omega$, f is measurable if for every Borel set $B \in \mathfrak{B}(\mathbb{R}^2)$, $f^*(B) \in \Sigma$, as usual. By using the projection operators one can easily show that f is $\Sigma$-$\mathfrak{B}(\mathbb{R}^2)$ measurable if and only if u and v are $\Sigma$-$\mathfrak{B}(\mathbb{R})$ measurable. Now, given a positive measure $\mu \in \mathfrak{M}(\Omega, \Sigma)$, we say that $f \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$ or f is $\mu$-integrable if $|f| \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$. Since $|f| \le |u| + |v| \le 2|f|$, f is integrable if and only if both u and v are elements of $L^1(\Omega, \Sigma, \mu; \mathbb{R})$ and in this case we will write
$$\int f\,d\mu = \int u\,d\mu + i \int v\,d\mu$$
and therefore, $L^1(\Omega, \Sigma, \mu; \mathbb{C})$ is a linear space with the integral being a linear functional on $L^1(\Omega, \Sigma, \mu; \mathbb{C})$. All major theorems of integration (cf. Sections 1 and 2, Chapter 6) hold true with very minor notational modifications.
(ii) Let $\nu \in \mathfrak{S}(\Omega, \Sigma)$ with its Jordan decomposition $\nu = \nu^+ - \nu^-$. Denote
$$L^1(\Omega, \Sigma, \nu; \mathbb{R}) = L^1(\Omega, \Sigma, \nu^+; \mathbb{R}) \cap L^1(\Omega, \Sigma, \nu^-; \mathbb{R}).$$
The integral of a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ relative to the signed measure $\nu$ is defined as
$$\int f\,d\nu = \int f\,d\nu^+ - \int f\,d\nu^-. \tag{1.16b}$$
The function f is said to be integrable with respect to the signed measure $\nu$ if $f \in L^1(\Omega, \Sigma, \nu; \mathbb{R})$. The value of the integral $\int f\,d\nu$ is therefore finite. [Notice that the full decomposition of (1.16b) is
$$\int f\,d\nu = \int f^+\,d\nu^+ - \int f^-\,d\nu^+ - \int f^+\,d\nu^- + \int f^-\,d\nu^- \tag{1.16c}$$
and thus $L^1(\Omega, \Sigma, \nu; \mathbb{R})$ could be enlarged by including those f's for which just one of the parts (negative or positive) in (1.16c) is finite.]
(iii) Let us now define integrals of complex-valued functions with respect to complex measures. Let $\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$ denote the linear space of all $\Sigma$-measurable complex-valued bounded functions ($\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{R})$ is the corresponding subspace of real-valued bounded functions). Let $\nu = \nu_1 + i\nu_2$ be a complex measure on $(\Omega, \Sigma)$ and let $f = u + iv \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$. Define
$$\int f\,d\nu = \int u\,d\nu_1 - \int v\,d\nu_2 + i\Big(\int u\,d\nu_2 + \int v\,d\nu_1\Big), \tag{1.16d}$$
where each of the four integrals of (1.16d) relative to the signed measures $\nu_1$ and $\nu_2$ is subject to representation (1.16b). Obviously, it makes a vector of linear combinations of eight finite positive integrals and therefore the integral of (1.16d) exists and it is complex-valued.
1.17 Example. The integral in (1.16d), as a functional of f, is clearly linear. Replacing f by $f 1_A$ we see that (1.16d) also defines a complex measure and therefore $\nu \mapsto \int f\,d\nu$ is a linear operator from $\mathfrak{C}(\Omega, \Sigma)$ to $\mathfrak{C}(\Omega, \Sigma)$. Now we define a norm on $\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$. Given a $\nu \in \mathfrak{C}(\Omega, \Sigma)$, and s being a simple function,
$$s = \sum_{k=1}^{n} a_k 1_{A_k},$$
we have that
$$\Big|\int s\,d\nu\Big| = \Big|\sum_{k=1}^{n} a_k \nu(A_k)\Big|$$
If $f \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{R}_+)$, then there is a sequence $\{s_n\}$ of simple functions with $s_n \uparrow f$. Hence,
which leads to
and obviously, to the same inequality for $f \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$.
PROBLEMS
1.1 Why is the series $\sum_{n=1}^{\infty} \nu(A_n)$ in Remark 1.2 (ii) absolutely convergent?
1.2 Prove Proposition 1.3 (i).
1.3 Prove Proposition 1.3 (ii).
1.4 Prove Proposition 1.4.
1.5 Show that the families of positive and negative sets are closed with respect to at most countable unions and intersections.
1.6 Let $(\Omega, \Sigma, \nu)$ be a signed measure space and let $A \in \Sigma$ be such that $0 < \nu(A) < \infty$. Show that there is a positive subset P of A, with $\nu(A) \le \nu(P) < \infty$.
1.7 Show that all Hahn decompositions form a unique equivalence class.
1.8 Prove that the Jordan decomposition of a signed measure (from Corollary 1.7) is invariant of any Hahn decomposition.
1.9 Prove part (iii) of Proposition 1.9.
1.10 Let $\nu$ be a signed measure on $(\Omega, \Sigma)$ represented as a difference of two positive measures $\nu = \mu_1 - \mu_2$. Show that $\mu_1 \ge \nu^+$ and $\mu_2 \ge \nu^-$.
1.11 Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Show that $|\nu|(A) = 0$, for an $A \in \Sigma$, if and only if $\nu(S) = 0$ for each $S \in \Sigma \cap A$. Show by example that $\nu(A) = 0$ is not sufficient for $|\nu|(A) = 0$.
1.12 Let $(\Omega, \Sigma, \mu)$ be a positive measure space and let $\nu$ be the indefinite integral $\nu = \int f\,d\mu$ generated by $f \in L^1(\Omega, \Sigma, \mu; \mathbb{R})$ and $\mu$. Show that $\nu$ is a signed measure, $\nu^+ = \int f^+\,d\mu$, $\nu^- = \int f^-\,d\mu$ and $|\nu| = \int |f|\,d\mu$.
1.13 Show that the total variation of a finite signed measure agrees with its total variation as for a complex measure.
1.14 Prove that $(\mathfrak{C}(\Omega, \Sigma), \|\cdot\|)$ (of Remark 1.15 (ii)) is a Banach space.
1.15 Prove that the subspace $(\mathfrak{S}_f(\Omega, \Sigma), \|\cdot\|, \le)$ of all finite signed measures is a Banach lattice, i.e. show the validity of equations (1.15).
1.16 Show that $[\Omega, \mathbb{C}, f = (u,v)]$ (of Definition 1.16 (i)) is $\Sigma$-$\mathfrak{B}(\mathbb{R}^2)$ measurable if and only if u and v are $\Sigma$-$\mathfrak{B}(\mathbb{R})$ measurable.
1.17 Show that for each $f \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$, $|\int f\,d\mu| \le \int |f|\,d\mu$.
1.18 Modify the Lebesgue Dominated Convergence Theorem of Section 2, Chapter 6, and prove it for complex-valued functions.
1.19 Let $(\Omega, \Sigma, \nu)$ be a complex measure space. Prove that for each $f \in L^1(\Omega, \Sigma, \nu; \mathbb{C})$, $|\nu|(A) = \int_A |f|\,d\nu$, for all $A \in \Sigma$.
1.20 Show that, given a signed measure $\nu$, it holds true that $L^1(\Omega, \Sigma, \nu; \mathbb{R}) = L^1(\Omega, \Sigma, |\nu|; \mathbb{R})$.
1.21 Show that any signed Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathfrak{B})$ is $\sigma$-finite.
NEW TERMS: signed measure; positive measure; finite signed measure; infinite signed measure; $\sigma$-finite signed measure; finite ($\nu$-finite) set; positive ($\nu$-positive) set; negative ($\nu$-negative) set; continuous from below set functions; continuous from above set functions; continuity from above at the empty set; $\emptyset$-continuity of a signed measure; signed Borel measure; signed Borel-Lebesgue-Stieltjes measure; Hahn decomposition; Hahn Decomposition Theorem; Jordan decomposition of a signed measure; positive variation of a signed measure; negative variation of a signed measure; total variation of a signed measure; $\nu$-maximal set; $\nu$-minimal set; geometric measure; complex measure; complex measure space; Jordan decomposition of a complex measure; total variation of a complex measure; measurability of a complex-valued function; integrability of a complex-valued function; integral of a real-valued function relative to a signed measure
2. Absolute Continuity
2. ABSOLUTE CONTINUITY
2.1 Definition and Notation. Absolute continuity of signed measures is formulated in the same way as that of positive measures. Let $\mu$ and $\nu$ be positive and signed measures, respectively, on a measurable space $(\Omega,\Sigma)$. We call the signed measure $\nu$ absolutely continuous with respect to (the positive measure) $\mu$ (in notation, $\nu \ll \mu$) if every measurable $\mu$-null set $A$ is also a $\nu$-null set. A signed measure that is absolutely continuous with respect to the Lebesgue measure is called continuous. Recall that $\mathfrak{S}(\Omega,\Sigma)$ denotes the set of all signed measures on $(\Omega,\Sigma)$ and $\mathfrak{M}(\Omega,\Sigma)$ is the subset of all positive measures. Given $\mu \in \mathfrak{M}$, denote $\mathfrak{S}_\mu^{\ll} = \{\nu \in \mathfrak{S}(\Omega,\Sigma) : \nu \ll \mu\}$. Define on $L(\Omega,\Sigma,\mu;\mathbb{R})$ the map $S_\mu$ such that for each $g \in L(\Omega,\Sigma,\mu;\mathbb{R})$,
$$S_\mu(g) = \nu = \int g\,d\mu, \tag{2.1}$$
which, according to Problem 1.12, is valued in $\mathfrak{S}$ and, as is easily seen, satisfies $\nu \ll \mu$. Consequently, $[L(\Omega,\Sigma,\mu;\mathbb{R}), \mathfrak{S}_\mu^{\ll}, S_\mu]$ is an into map.
From Definitions and Remarks 1.14 (iii), Chapter 6, we recall that $\mu$-almost everywhere equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ and thus on $L(\Omega,\Sigma,\mu;\mathbb{R})$, as a subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$. Consequently,
$$L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$$
is also a quotient set. On the other hand, by Corollary 1.20, Chapter 6, the map $S_\mu$ agrees with the equivalence relation $E$. In other words, $S_\mu$ adopts $E$ as its equivalence kernel. Then, by Theorem 4.4, Chapter 1, there is a unique function, denoted by
$$I_\mu : L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu \to \mathfrak{S}_\mu^{\ll},$$
such that
$$S_\mu = I_\mu \circ \pi_E,$$
where $\pi_E$ is the projection of $L(\Omega,\Sigma,\mu;\mathbb{R})$ on its quotient $L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$ by $E$. (See Section 4, Chapter 1, for a refresher.) Therefore, the map $S_\mu$ "becomes" an injection $I_\mu$, with the domain $L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$.
Recall that any function $g$ from the quotient class $[g]_\mu$ of $L$-functions (generating the signed measure $\nu$ in (2.1)) is referred to as a Radon-Nikodym density of (the signed measure) $\nu$ relative to (the positive measure) $\mu$. The chief component of the Radon-Nikodym Theorem below is that the map $[L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu, \mathfrak{S}_\mu^{\ll}, I_\mu]$ is surjective. In other words, for each measure $\nu \in \mathfrak{S}_\mu^{\ll}$, there is an equivalence class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.
2.2 Theorem (Radon-Nikodym). Let $\mu \in \mathfrak{M}(\Omega,\Sigma)$ be a $\sigma$-finite measure. Then $[L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu, \mathfrak{S}_\mu^{\ll}, I_\mu]$ is a bijective map.
Proof. The proof of the theorem includes three objectives:
1) Show that, given $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$, $I_\mu([g]_\mu) \in \mathfrak{S}_\mu^{\ll}$, i.e., that for each $g \in [g]_\mu$, $\nu = \int g\,d\mu$ defines a signed measure, absolutely continuous with respect to $\mu$. This is readily done. Since $g \in L$, $\nu$ is a signed measure. The proof that $\nu \ll \mu$ is trivial. Therefore, $I_\mu$ is an into map.
2) Show that
$$I_\mu : L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu \to \mathfrak{S}_\mu^{\ll}$$
is a surjective (onto) map, i.e., that for every signed measure $\nu \in \mathfrak{S}_\mu^{\ll}$, absolutely continuous with respect to a positive $\sigma$-finite measure $\mu$, there is an equivalence class $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. This is the key part of the theorem and it is referred to as the "existence of a Radon-Nikodym density."
3) Show that the map $I_\mu$ is injective (one-to-one), i.e. that the above equivalence class $[g]_\mu$ is unique. This is due to Corollary 1.20, Chapter 6.
It therefore remains to prove the existence of the Radon-Nikodym density, i.e., that given a signed measure $\nu$, absolutely continuous relative to a $\sigma$-finite positive measure $\mu$, there is an equivalence class $[g]_\mu$ of $L$-functions such that for each element $g$ of this class $\nu = \int g\,d\mu$. We break up the proof of existence into five parts, starting with the case of two finite measures $\mu$ and $\nu$ and extending it to the case when $\mu$ and $\nu$ are $\sigma$-finite positive and signed measures, respectively.
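Case 1. $\mu$ and $\nu$ are two finite positive measures.
Abbreviate $L^1_+ = L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)$ and introduce the subset of functions
$$\Phi = \Big\{ f \in L^1_+ : \int_A f\,d\mu \le \nu(A) \ \text{for all } A \in \Sigma \Big\}.$$
Since $0 \in \Phi$, it is not empty. Furthermore, $\Phi$ is closed under finite suprema. Indeed, let $f, g \in \Phi$ and $A \in \Sigma$. Denote by
$$E = \{\omega \in A : f(\omega) \ge g(\omega)\} \quad\text{and}\quad G = \{\omega \in A : f(\omega) < g(\omega)\}.$$
Then $E + G = A$ and
$$\int_A f \vee g\,d\mu = \int_E f\,d\mu + \int_G g\,d\mu \le \nu(E) + \nu(G) = \nu(A).$$
Now, let
$$S = \sup\Big\{ \int \varphi\,d\mu : \varphi \in \Phi \Big\}.$$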
Then, there is a sequence $\{\varphi_n\} \subseteq \Phi$ such that $\sup\{\int \varphi_n\,d\mu\} = S$. (Indeed, since $S < \infty$, for each $n = 1,2,\ldots$, there is a function $\varphi_n$ such that $S - \frac{1}{n} < \int \varphi_n\,d\mu \le S$.) By setting $f_n = \bigvee_{i=1}^{n} \varphi_i$ we form the new sequence $\{f_n\}$, which is monotone increasing and has $S$ as $\lim_{n\to\infty} \int f_n\,d\mu$. By using the Monotone Convergence Theorem we have that $g := \lim_{n\to\infty} f_n$, in the topology of pointwise convergence, is an $L^1_+$-function that also belongs to $\Phi$. The function $g$ we arrived at is an $\int$-maximal element of $\Phi$ and it is an element of the equivalence class $[g]_\mu$ of $\int$-maximal elements.
Now, we will show that $[g]_\mu$ is an equivalence class of Radon-Nikodym densities of $\nu$ relative to $\mu$, i.e. for each $g \in [g]_\mu$, $\int_A g\,d\mu = \nu(A)$ for all $A \in \Sigma$.
Because $\nu \ll \mu$ and for all $A \in \Sigma$
$$\int_A g\,d\mu \le \nu(A),$$
the set function
$$\rho(A) = \nu(A) - \int_A g\,d\mu$$
is a finite positive measure, absolutely continuous relative to measure $\mu$.
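If $g$ is not a Radon-Nikodym density of $\nu$ with respect to $\mu$, then $\rho \not\equiv 0$ and $\rho(\Omega) > 0$. Thus for some positive $\varepsilon$,
$$\mu(\Omega) - \varepsilon\rho(\Omega) < 0. \tag{2.2a}$$
Consider the (finite) signed measure $\gamma = \mu - \varepsilon\rho$. By Theorem 1.6, there is a Hahn decomposition $(P,N)$ of $\Omega$ such that $\gamma(A \cap P) \ge 0$ and $\gamma(A \cap N) \le 0$ for all measurable sets $A$, i.e.,
$$\varepsilon\rho(A \cap P) \le \mu(A \cap P) \tag{2.2b}$$
and
$$\mu(A \cap N) \le \varepsilon\rho(A \cap N). \tag{2.2c}$$
If $\mu(N) = 0$ then, because $\rho \ll \mu$, $\rho(N) = 0$ and thus $\gamma(N) = 0$. On the other hand, from (2.2b), by setting $A = \Omega$ we have that
$$\gamma(P) \ge 0. \tag{2.2d}$$
Furthermore, since by the above assumption about $\mu$, $N$ turns out to be a $\gamma$-null set, it follows from (2.2a) that $\gamma(P) = \gamma(\Omega) < 0$. This contradicts inequality (2.2d). Hence, $\mu(N)$ must be positive. Now we have from (2.2c) that
$$\frac{1}{\varepsilon}\mu(A \cap N) \le \rho(A \cap N) = \nu(A \cap N) - \int_{A \cap N} g\,d\mu,$$
or, equivalently,
$$\int_A \Big( \frac{1}{\varepsilon} 1_N + g \Big) d\mu \le \nu(A \cap N) + \int_{A \cap N^c} g\,d\mu \le \nu(A).$$
Thus, the function $\frac{1}{\varepsilon} 1_N + g \in \Phi$. But, since $\mu(N) > 0$, it holds true that
$$\int \Big( \frac{1}{\varepsilon} 1_N + g \Big) d\mu = \frac{1}{\varepsilon}\mu(N) + S > S.$$
This contradicts that $g$ is an $\int$-maximal element of $\Phi$. The contradiction is due to the wrong assumption about $\rho$. Thus $\rho \equiv 0$, or, in other words,
$$\nu(A) = \int_A g\,d\mu$$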
for all $A \in \Sigma$, which proves the statement of the theorem for this special case. Notice that because $\nu$ is finite and therefore every Radon-Nikodym density $g$ is an $L^1$-function, by Proposition 1.21, Chapter 6, $g$ is finite $\mu$-a.e. If it is "occasionally" infinite, we can redefine $g$ so as to make it finite. Therefore, within the equivalence class $[g]_\mu$ of Radon-Nikodym densities there is a subclass of finite ones. In summary of case 1, given two finite positive measures $\mu$ and $\nu$ such that $\nu \ll \mu$, there is a unique (nonempty) equivalence class $[g]_\mu \in L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)|_\mu$ of Radon-Nikodym densities (of measure $\nu$ relative to measure $\mu$), of which a nonempty subclass consists of finite densities.
Case 2. $\mu$ and $\nu$ are finite and $\sigma$-finite positive measures, respectively.
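If $\nu$ is $\sigma$-finite then there is an at most countable decomposition $\Omega = \sum_{n=1}^{\infty} \Omega_n$ such that $\nu(\Omega_n) < \infty$ for all $n = 1,2,\ldots$. Let
$$\nu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\,\nu.$$
Then $\nu_n$ is a finite measure on $\Sigma \cap \Omega_n$ and from case 1 it follows that there is a measurable nonnegative function $g_n : \Omega_n \to \mathbb{R}$ such that $\nu_n(A \cap \Omega_n) = \int_{A \cap \Omega_n} g_n\,d\mu$ for each $A \in \Sigma$, $n = 1,2,\ldots$. Now by the Monotone Convergence Theorem applied to the sequence $\big\{\sum_{n=1}^{k} g_n 1_{\Omega_n}\big\}$ we have that
$$\nu(A) = \sum_{n=1}^{\infty} \nu_n(A \cap \Omega_n) = \sum_{n=1}^{\infty} \int_A g_n 1_{\Omega_n}\,d\mu = \int_A \sum_{n=1}^{\infty} g_n 1_{\Omega_n}\,d\mu.$$
It only remains to set $g = \sum_{n=1}^{\infty} g_n 1_{\Omega_n}$ to complete this part of the theorem.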
Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is finite, $\nu$ is $\sigma$-finite and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R}_+)|_\mu$ of Radon-Nikodym densities (whose integral is not necessarily finite).
Case 3. $\mu$ is a finite positive and $\nu$ is an arbitrary positive measure.
Denote by
$$\Gamma = \{B \in \Sigma : \mathrm{Res}_{\Sigma \cap B}\,\nu \text{ is } \sigma\text{-finite}\}.$$
Since $\emptyset \in \Gamma$, it follows that $\Gamma \ne \emptyset$. Let
$$S = \sup\{\mu(B) : B \in \Gamma\}$$
and let $\{E_n\} \subseteq \Gamma$ be such that $\mu(E_n) \to S$. (Since $S \le \mu(\Omega) < \infty$, for each $n = 1,2,\ldots$, there is a set $E_n$ such that $S - \frac{1}{n} \le \mu(E_n) \le S$.) Clearly,
$$E = \bigcup_{n=1}^{\infty} E_n \in \Gamma.$$
Hence, $S \ge \mu(E) \ge \mu(E_n)$ and $\mu(E) = S$.
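Now since $\nu$ is $\sigma$-finite on $E$, from case 2 it follows that there is a $\Sigma \cap E$-$\mathcal{B}_+$-measurable $L$-function $[E, \mathbb{R}_+, \tilde g]$ such that
$$\nu(A \cap E) = \int_{A \cap E} \tilde g\,d\mu$$
for all $A \in \Sigma$. Fix an $A \in \Sigma$.
a) Let $\mu(A \cap E^c) > 0$. If $\nu(A \cap E^c) < \infty$, then $A \cap E^c \in \Gamma$, and thus $E \cup (A \cap E^c) \in \Gamma$. The latter yields that
$$\mu(E \cup (A \cap E^c)) = \mu(E) + \mu(A \cap E^c) > \mu(E) = S,$$
and this contradicts the definition of $S$. Thus $\nu(A \cap E^c) = \infty$.
b) Let $\mu(A \cap E^c) = 0$. Then since $\nu \ll \mu$, it holds true that $\nu(A \cap E^c) = 0$.
The above cases a) and b) can be combined in the following compact equation
$$\nu(A \cap E^c) = \int \infty \cdot 1_{A \cap E^c}\,d\mu$$
by agreeing that $\infty \cdot 0 = 0$. Furthermore,
$$\nu(A) = \nu(A \cap E) + \nu(A \cap E^c) = \int_A g\,d\mu,$$
where $g = \tilde g\,1_E + \infty\,1_{E^c}$. Notice that $g$ is measurable, since $\tilde g$ is $\Sigma \cap E$-measurable and $E \in \Sigma$.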
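Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is finite, $\nu$ is arbitrary, and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R}_+)|_\mu$ of Radon-Nikodym densities.
Case 4. $\mu$ is a $\sigma$-finite and $\nu$ is an arbitrary positive measure.
Let $\Omega = \sum_{n=1}^{\infty} \Omega_n$ be such that $\mu(\Omega_n) < \infty$ for all $n \ge 1$. Due to case 3, for each $n$, there is a $\Sigma \cap \Omega_n$-$\mathcal{B}_+$-measurable function $[\Omega_n, \mathbb{R}_+, \tilde g_n]$ such that
$$\nu(A \cap \Omega_n) = \int_{A \cap \Omega_n} \tilde g_n\,d\mu$$
for all $A \in \Sigma$. Denoting $g_n = \tilde g_n 1_{\Omega_n}$ we have
$$\nu(A \cap \Omega_n) = \int_A g_n\,d\mu$$
and thus
$$\nu(A) = \sum_{n=1}^{\infty} \int_A g_n\,d\mu = \int_A g\,d\mu,$$
where, by the Monotone Convergence Theorem, $g = \sum_{n=1}^{\infty} g_n$. Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is $\sigma$-finite, $\nu$ is arbitrary, and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R}_+)|_\mu$ of (nonnegative) Radon-Nikodym densities.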
Case 5. $\mu$ is a $\sigma$-finite positive measure and $\nu$ is a signed measure.
Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition of $\nu$, where, for instance, $\nu^-$ is supposed to be finite. By case 4, there are functions $[\Omega, \mathbb{R}_+, g_i]$, $i = 1,2$, such that
$$\nu^+(A) = \int_A g_1\,d\mu \quad\text{and}\quad \nu^-(A) = \int_A g_2\,d\mu, \quad A \in \Sigma.$$
Since by our assumption $\nu^-$ is finite, $g_2$ is $\mu$-integrable and $g_2 < \infty$ $\mu$-a.e. This leads to
$$\nu(A) = \int_A (g_1 - g_2)\,d\mu, \quad A \in \Sigma.$$
In summary of case 5, given a $\sigma$-finite positive measure $\mu$ and a signed measure $\nu$ with $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.
Case 5a (special case of 5, with $\nu$ being a finite signed measure). In this case, clearly, given a $\sigma$-finite positive measure $\mu$ and a finite signed measure $\nu$ with $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L^1(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.
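Case 5b (special case of 5, with $\nu$ being a $\sigma$-finite signed measure). Since $\nu$ is $\sigma$-finite, there is a countable decomposition $\Omega = \sum_{n=1}^{\infty} \Omega_n$ such that $\nu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\,\nu$ is finite for every $n$. By case 5a, since
$$\nu_n \ll \mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\,\mu,$$
there is a unique equivalence class $[g_n]_{\mu_n} \in L^1(\Omega_n, \Sigma \cap \Omega_n, \mu_n;\mathbb{R})|_{\mu_n}$ of Radon-Nikodym densities of $\nu_n$ relative to $\mu_n$. Now, letting
$$g = \sum_{n=1}^{\infty} g_n 1_{\Omega_n}$$
for every $g_n \in [g_n]_{\mu_n}$, we define the class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. As a sum of countably many integrable functions supported on the disjoint sets $\Omega_n$, $g$ is clearly $\mu$-a.e. finite. □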
The proof of the theorem is now complete. By the Radon-Nikodym Theorem, the map $I_\mu$ is therefore invertible and its inverse $(I_\mu)^{-1}$ is also a map, valued in $L(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$. In other words, for any $\nu \in \mathfrak{S}_\mu^{\ll}$, under $(I_\mu)^{-1}$, there is a nonempty equivalence class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. We denote $(I_\mu)^{-1}$ by the symbol $\frac{d}{d\mu}$ and for a fixed $\nu \in \mathfrak{S}_\mu^{\ll}$ we set
$$\frac{d\nu}{d\mu} = (I_\mu)^{-1}(\nu) = [g]_\mu$$
and call it the Radon-Nikodym derivative of measure $\nu$ relative to measure $\mu$. We would like to emphasize that a Radon-Nikodym derivative is not the same as a Radon-Nikodym density (as the term is routinely used in colloquial language); it is an equivalence class of Radon-Nikodym densities.
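To make the distinction between a density and the derivative concrete, here is a minimal sketch on a hypothetical finite space, where measures are given by point masses and a density can be computed pointwise. The function names `rn_density` and `integrate` and the sample data are assumptions made for illustration only. On a $\mu$-null point the value of the density is arbitrary, which is exactly why the derivative is an equivalence class rather than a single function.

```python
from fractions import Fraction as F

def rn_density(mu, nu):
    """Return one density g with nu({w}) = g(w) * mu({w}) for every point w.

    mu: dict of nonnegative point masses; nu: dict of signed point masses
    over the same points. Raises ValueError if nu is not absolutely
    continuous with respect to mu.
    """
    g = {}
    for w, m in mu.items():
        n = nu[w]
        if m == 0:
            if n != 0:
                raise ValueError("nu is not absolutely continuous w.r.t. mu")
            g[w] = F(0)  # any value would do on a mu-null point
        else:
            g[w] = n / m
    return g

def integrate(g, mu, A):
    """Integral of g over the set A with respect to mu."""
    return sum(g[w] * mu[w] for w in A)

# Illustrative data: mu is a finite positive measure, nu a finite signed one.
mu = {"a": F(1, 2), "b": F(1, 4), "c": F(0), "d": F(1, 4)}
nu = {"a": F(3), "b": F(-1), "c": F(0), "d": F(0)}

g = rn_density(mu, nu)
print(g)
print(integrate(g, mu, {"a", "b"}) == nu["a"] + nu["b"])
```

Changing `g["c"]` to any other number yields another density in the same class, since the point `"c"` carries no $\mu$-mass.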
2.3 Example. Let $X$ $(\in L(\Omega,\Sigma,P;\mathbb{R}))$ be a random variable defined on a probability space $(\Omega,\Sigma,P)$. Recall that $X$ induces the image measure $PX^*$, referred to as the probability distribution and yielding the probability space $(\mathbb{R},\mathcal{B},PX^*)$. Given a (positive) Borel measure $\mu$ such that $PX^* \ll \mu$, we have, according to case 1 of the Radon-Nikodym Theorem, a nonempty equivalence class
$$\frac{dPX^*}{d\mu}$$
of Radon-Nikodym densities (probability density functions) such that for any $g \in \frac{dPX^*}{d\mu}$, it holds true that $PX^* = \int g\,d\mu$. For instance, if $\mu = \lambda$ is the Borel-Lebesgue measure on $(\mathbb{R},\mathcal{B})$, then the probability distribution $PX^*$ can be represented by the Lebesgue integral, and a density $g$ can often be reduced to the usual Newton-Leibniz derivative of the probability distribution function $x \mapsto PX^*(-\infty,x]$. A random variable $X$ whose probability distribution $PX^*$ is absolutely continuous with respect to the Borel-Lebesgue measure (or, as we agreed to call it, just "continuous") is said to be continuous. In probability theory, it is common to specify a probability density function and (as one of the consequences of the Radon-Nikodym Theorem) it uniquely defines a random variable. □
As another application of the Radon-Nikodym Theorem, we formulate the following result.
2.4 Corollary. Let $[\Omega, \mathbb{R}_+, f]$ be a $\Sigma$-$\mathcal{B}_+$-measurable function and let $\mu$ be a finite measure on $\Sigma$. Then for each sub-$\sigma$-algebra $\Sigma_0 \subseteq \Sigma$, there exists a unique equivalence class
$$[f_0]_\mu \in L(\Omega,\Sigma_0,\mu;\mathbb{R}_+)|_\mu$$
such that
$$\int_{A_0} f_0\,d\mu = \int_{A_0} f\,d\mu$$
for each $f_0 \in [f_0]_\mu$ and for all $A_0 \in \Sigma_0$.
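Proof. Let $\mu_0 = \mathrm{Res}_{\Sigma_0}\,\mu$ and let $\nu = \int f\,d\mu$. Then, for any $A_0 \in \Sigma_0$,
$$\nu(A_0) = \int_{A_0} f\,d\mu \tag{2.4a}$$
and therefore $\nu \ll \mu_0$. By the Radon-Nikodym Theorem (case 3), there is an equivalence class $\frac{d\nu}{d\mu_0}$ of nonnegative $\Sigma_0$-$\mathcal{B}_+$-measurable Radon-Nikodym densities such that for each $f_0 \in \frac{d\nu}{d\mu_0}$,
$$\nu(A_0) = \int_{A_0} f_0\,d\mu_0, \quad A_0 \in \Sigma_0. \tag{2.4b}$$
The statement of the corollary now follows from (2.4a) and (2.4b). □
Corollary 2.4 can be generalized as follows.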
2.5 Proposition. Let $[\Omega, \mathbb{R}, f]$ be a $\Sigma$-$\mathcal{B}$-measurable function from $L(\Omega,\Sigma,\mu;\mathbb{R})$ and let $\mu$ be a finite measure on $\Sigma$. Then, for each sub-$\sigma$-algebra $\Sigma_0 \subseteq \Sigma$, there exists a unique quotient class $[f_0]_\mu \in L|_\mu$ such that
$$\int_{A_0} f_0\,d\mu = \int_{A_0} f\,d\mu$$
for each $f_0 \in [f_0]_\mu$ and for all $A_0 \in \Sigma_0$.
(See Problem 2.3.)
The above propositions find an important application in probability theory.
2.6 Definitions.
(i) Let $X$ be a random variable on a probability space $(\Omega,\Sigma,P)$ valued in $\mathbb{R}$ and let $\Sigma_0$ be any sub-$\sigma$-algebra of $\Sigma$. Then, in light of Proposition 2.5, there exists a class of $P$-integrable random variables $[X_0]_P$ such that for each $X_0 \in [X_0]_P$ the equation
$$\int_{A_0} X_0\,dP = \int_{A_0} X\,dP \tag{2.6a}$$
holds true for all $A_0 \in \Sigma_0$. The class $[X_0]_P$ of $P$-equivalent random variables is called the conditional expectation of $X$ given $\sigma$-hypothesis $\Sigma_0$, in notation,
$$[X_0]_P = \mathbb{E}[X \mid \Sigma_0] = \mathbb{E}^{\Sigma_0}[X]. \tag{2.6b}$$
Any random variable $X_0$ from the class $[X_0]_P$ is called a version of the conditional expectation $\mathbb{E}^{\Sigma_0}[X]$.
(ii) For a measurable set (event) $A \in \Sigma$ take $X = 1_A$. Then, for a sub-$\sigma$-algebra $\Sigma_0 \subseteq \Sigma$, $\mathbb{E}^{\Sigma_0}[1_A]$ is called the conditional probability of event $A$ given $\sigma$-hypothesis $\Sigma_0$ and it is denoted by $P(A \mid \Sigma_0)$ or by $P^{\Sigma_0}(A)$.
The following construction explains why $[X_0]_P$ is called the "conditional expectation."
2.7 Examples.
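(i) Let $X$ be a random variable on a probability space $(\Omega,\Sigma,P)$ and let $\Omega = \sum_{n=1}^{\infty} H_n$ be a measurable decomposition such that $P(H_n) > 0$. Then for each $n = 1,2,\ldots$, the conditional probability
$$P_{H_n}(A) = P(A \mid H_n) = \frac{P(A \cap H_n)}{P(H_n)}$$
defines the probability measure $P_{H_n}$ on the new measurable space $(\Omega, \Sigma \cap H_n)$, where
$$\Sigma \cap H_n = \{A \cap H_n : A \in \Sigma\}.$$
Thus, the expected value of $X$ with respect to measure $P_{H_n}$ is
$$\mathbb{E}[X \mid H_n] = \int X\,dP_{H_n} = \frac{1}{P(H_n)} \int_{H_n} X\,dP, \tag{2.7}$$
which is called the conditional expectation of $X$ given the hypothesis $H_n$. Observe that the value $\mathbb{E}[X \mid H_n]$ is a constant (random variable). Now consider the random variable
$$X_0 = \sum_{n=1}^{\infty} \mathbb{E}[X \mid H_n]\,1_{H_n}, \tag{2.7a}$$
which is $\Sigma_0$-$\mathcal{B}$-measurable, where $\Sigma_0 = \sigma(\{H_n\})$ is the $\sigma$-algebra generated by the sequence of hypotheses $\{H_n\}$. Obviously, $\Sigma_0 = \{\Omega, \emptyset, A = \sum_{i \in I} H_i : I \subseteq \mathbb{N}\}$.
Hence, for every $A \in \Sigma_0$ (which is a union of some $H_i$'s):
$$\int_A X_0\,dP = \sum_{i \in I} \mathbb{E}[X \mid H_i]\,P(H_i) = \sum_{i \in I} \int_{H_i} X\,dP = \int_A X\,dP.$$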
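The random variable $X_0$ is then a version of the conditional expectation $\mathbb{E}[X \mid \Sigma_0]$ that belongs to the class $[X_0]_P$.
(ii) We consider a special case of the above example. Let $\Omega = [0,1)$, $\Sigma = \mathcal{B} \cap [0,1)$ and $P = \mathrm{Res}_\Sigma\,\lambda$ (where $\lambda$ denotes the Borel-Lebesgue measure). As decomposition, take
$$\Omega = \sum_{k=1}^{n} H_k, \quad\text{where } H_k = \Big[\frac{k-1}{n}, \frac{k}{n}\Big).$$
Let $X(\omega) = \omega$ for all $\omega \in \Omega$. Then,
$$P(H_k) = \frac{1}{n}$$
and
$$\int_{H_k} X\,dP = \int_{(k-1)/n}^{k/n} \omega\,d\omega = \frac{2k-1}{2n^2}.$$
Thus, from (2.7),
$$\mathbb{E}[X \mid H_k] = \frac{2k-1}{2n},$$
and from (2.7a),
$$X_0 = \sum_{k=1}^{n} \frac{2k-1}{2n}\,1_{H_k}$$
as a version of the conditional expectation $\mathbb{E}[X \mid \Sigma_0]$, where $\Sigma_0 = \sigma(H_1,\ldots,H_n)$.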
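The computation in Example 2.7 (ii) can be reproduced with exact rational arithmetic. The sketch below is an illustration under the assumptions of that example ($X(\omega) = \omega$ on $[0,1)$ with Lebesgue measure and $n$ equal cells); the function names `cond_exp_uniform_partition` and `version` are hypothetical.

```python
from fractions import Fraction as F

def cond_exp_uniform_partition(n):
    """Values E[X | H_k], k = 1..n, for X(w) = w on [0, 1) with Lebesgue
    measure and cells H_k = [(k-1)/n, k/n)."""
    values = []
    for k in range(1, n + 1):
        a, b = F(k - 1, n), F(k, n)
        integral = (b * b - a * a) / 2   # integral of w over [a, b)
        prob = b - a                     # P(H_k) = 1/n
        values.append(integral / prob)   # equals (2k - 1) / (2n)
    return values

def version(n, w):
    """Evaluate the step function X0(w) = sum_k E[X | H_k] 1_{H_k}(w),
    a version of the conditional expectation, at a point 0 <= w < 1."""
    k = int(w * n) + 1                   # index of the cell containing w
    return cond_exp_uniform_partition(n)[k - 1]

print(cond_exp_uniform_partition(4))
print(version(4, F(1, 2)))
```

Averaging the cell values against $P(H_k) = 1/n$ returns $\mathbb{E}[X] = 1/2$, in agreement with defining equation (2.6a) taken over $A_0 = \Omega$.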
(iii) Let $X$ and $Y$ be two random variables on a probability space $(\Omega,\Sigma,P)$. Then, $\Sigma_0 = \sigma(Y)$ is a sub-$\sigma$-algebra of $\Sigma$ generated by $Y$. The corresponding conditional expectation of $X$ given $\Sigma_0$ is denoted $\mathbb{E}[X \mid Y]$ or $\mathbb{E}^{Y}[X]$. □
2.8 Remarks.
(i) Observe that from (2.6a) and (2.6b) it does not follow that $\mathbb{E}^{\Sigma_0}[X] = X$ (mod $P$), because $X$ need not be $\Sigma_0$-measurable. However, $\mathbb{E}^{\Sigma_0}[X] = X$ (mod $P$) if $X$ is $\Sigma_0$-measurable (see Problem 2.10).
(ii) Note that if two random variables $X$ and $Y$ belong to the same equivalence class, we would normally write $X = Y$ (mod $P$) or $X = Y$ $P$-a.e. on $\Omega$. In probability, however, the latter is usually denoted by $X = Y$ $P$-a.s. on $\Omega$ or just a.s. (read: almost surely).
After a short break from the Radon-Nikodym Theorem for signed measures, we return to this theme with a version of the Radon-Nikodym Theorem for complex measures. This is readily done as follows. Firstly, given a $\sigma$-finite positive measure $\mu \in \mathfrak{M}(\Omega,\Sigma)$, we will denote by $\mathfrak{C}_\mu^{\ll}$ the set of all complex measures on $(\Omega,\Sigma)$ that are absolutely continuous with respect to $\mu$.
Let $\nu \in \mathfrak{C}_\mu^{\ll}$ and let $\nu = \nu_1 + i\nu_2$. Since $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$, and $\nu_1$ and $\nu_2$ are finite signed measures, according to case 5a of the Radon-Nikodym Theorem there are two equivalence classes $[g_1]_\mu$ and $[g_2]_\mu$ of Radon-Nikodym densities from the factor space $L^1(\Omega,\Sigma,\mu;\mathbb{R})|_\mu$ so that, for all elements $g_1$ and $g_2$ of their respective classes,
$$\nu_1(A) = \int_A g_1\,d\mu \quad\text{and}\quad \nu_2(A) = \int_A g_2\,d\mu, \quad\text{for each } A \in \Sigma,$$
thereby making $[g]_\mu = [g_1]_\mu \times [g_2]_\mu \in L^1(\Omega,\Sigma,\mu;\mathbb{C})|_\mu$ (see Definition 1.16) the desired Radon-Nikodym derivative. The uniqueness of $[g]_\mu$ is based on that for signed measures. Summarizing the above arguments we have:
2.9 Theorem (Radon-Nikodym for complex measures). Let $\mu \in \mathfrak{M}(\Omega,\Sigma)$ be a $\sigma$-finite measure. Then $[L^1(\Omega,\Sigma,\mu;\mathbb{C})|_\mu, \mathfrak{C}_\mu^{\ll}, I_\mu]$ is a bijective map.
Finally, with the reader's help (Problem 2.1) we will establish a small but useful result.
2.10 Proposition. Let $\nu$ be a signed measure and $\mu$ be a positive measure. Then $\nu \ll \mu$ if and only if $|\nu| \ll \mu$. □
PROBLEMS
2.1 Prove Proposition 2.10.
2.2 Consider, in case 1 of the Radon-Nikodym Theorem, the partial order $\preceq$ on $\Phi$ defined by $f \preceq g$ if and only if $f \le g$ $\mu$-a.e. Show that any chain in $\Phi$ has an upper bound and thus, by Zorn's Lemma, 4.13, Chapter 1, $\Phi$ has a maximal element.
2.3 Prove Proposition 2.5.
2.4 Let $\mu \in \mathfrak{M}(\Omega,\Sigma)$ (i.e. a positive measure) and let $\nu$ be a $\sigma$-finite signed measure such that $\nu = \int f\,d\mu$. Show that if $\int_A f\,d\mu = \int_A g\,d\mu$ for all $A \in \Sigma$, then $f = g$ (mod $\mu$).
2.5 Let $(\Omega,\Sigma,\nu)$ be a complex measure space. Show that the Radon-Nikodym derivative $\frac{d\nu}{d|\nu|}$ satisfies $\big|\frac{d\nu}{d|\nu|}\big| = 1$ $|\nu|$-a.e. on $\Omega$. [Hint: Use Problem 1.19.]
2.6 In the condition of Problem 2.5, show that for each $f \in \mathcal{C}_b^{-1}(\Omega,\Sigma;\mathbb{C})$ (see Definition 1.16 (iii)),
[Hint: Use Problem 2.5.]
2.7 Let $(\Omega,\Sigma,\nu)$ be a $\sigma$-finite signed measure space and let $\mu$ and $\rho$ be two positive $\sigma$-finite measures on $(\Omega,\Sigma)$ with $\nu \ll \mu$ and $\mu \ll \rho$. Show that $\nu \ll \rho$ and prove the chain rule
$$\frac{d\nu}{d\rho} = \frac{d\nu}{d\mu}\,\frac{d\mu}{d\rho}.$$
If, in addition, $\rho \ll \mu$, then
$$\frac{d\mu}{d\rho} = \Big(\frac{d\rho}{d\mu}\Big)^{-1}.$$
2.8 Show that $\mathbb{E}\big[\mathbb{E}^{\Sigma_0}[X]\big] = \mathbb{E}[X]$.
2.9 Show that $\mathbb{E}^{\Sigma_0}[aX + bY] = a\,\mathbb{E}^{\Sigma_0}[X] + b\,\mathbb{E}^{\Sigma_0}[Y]$ a.s.
2.10 Show that if $X$ is $\Sigma_0$-measurable, then $\mathbb{E}^{\Sigma_0}[X] = X$ a.s.
2.11 Show that if $X \le Y$ a.s. then $\mathbb{E}^{\Sigma_0}[X] \le \mathbb{E}^{\Sigma_0}[Y]$ a.s.
2.12 Let $Y$ be a $\Sigma_0$-measurable and $P$-integrable random variable and let $X$ be a $\Sigma$-measurable random variable such that $XY \in L^1(P)$. Show that $\mathbb{E}^{\Sigma_0}[XY] = Y\,\mathbb{E}^{\Sigma_0}[X]$ a.s.
2.13 Show that $\mathfrak{C}_\mu^{\ll}$ is a linear space over the field $\mathbb{C}$. Does the same
NEW TERMS: absolutely continuous signed measure; Radon-Nikodym density of a signed measure; Radon-Nikodym Theorem for a signed measure; Radon-Nikodym derivative of a signed measure; probability density function; probability distribution function; continuous random variable; conditional expectation given a $\sigma$-hypothesis; version of the conditional expectation; conditional probability of an event given a $\sigma$-hypothesis; conditional expectation given a random variable; almost sure equality; Radon-Nikodym Theorem for a complex measure; chain rule
3. SINGULARITY
Singularity (which we introduced in Section 5, Chapter 6, for positive measures) is a sort of opposite notion to absolute continuity.
3.1 Definition and Notation. Let $\nu$ and $\mu$ be two signed or complex measures on a measurable space $(\Omega,\Sigma)$. $\nu$ is said to be singular with respect (or orthogonal) to $\mu$, in notation $\nu \perp \mu$, if there is a measurable partition $(\Omega_1,\Omega_2)$ of $\Omega$ such that $|\nu|(\Omega_1) = |\mu|(\Omega_2) = 0$. Clearly, $(\mathfrak{S},\perp)$ is a symmetric relation. Therefore, $\nu$ and $\mu$ are to be called mutually singular or just singular. [Because the total variations of complex measures coincide with those of finite signed measures (Problem 1.13) and the total variations of signed and positive measures are equal, the above definition of singularity agrees with that for positive measures.] A signed or complex measure orthogonal to the Lebesgue measure is called just singular. Given a signed measure space $(\Omega,\Sigma,\nu)$, we will denote by $\mathfrak{S}_\nu^{\perp}(\Omega,\Sigma)$ the subset of all signed measures of $\mathfrak{S}(\Omega,\Sigma)$ orthogonal to $\nu$. We establish a few major properties of singular measures.
3.2 Proposition. Let $\mu$ be a positive measure and let $\nu$ and $\rho$ be signed measures on the measurable space $(\Omega,\Sigma)$. The following hold true:
(i) If $\nu = \nu^+ - \nu^-$ is the Jordan decomposition, then $\nu^+ \perp \nu^-$.
(ii) If $\nu \in \mathfrak{S}_\mu^{\perp}$ and $\rho \in \mathfrak{S}_\mu^{\perp}$, then $\nu + \rho,\ \nu - \rho \in \mathfrak{S}_\mu^{\perp}$.
(iii) $\nu \perp \mu$ if and only if $\nu^+ \perp \mu$ and $\nu^- \perp \mu$.
(iv) If $\nu \ll \mu$ and $\rho \in \mathfrak{S}_\mu^{\perp}$, then $\nu \perp \rho$.
(v) If $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu = 0$.
Proof. We leave (i) for the reader (Problem 3.1).
(ii) By the definition, there are two measurable sets $A$ and $B$ such that $\mu(A) = \mu(B) = 0$ and $|\nu|(A^c) = |\rho|(B^c) = 0$. Then, by Problem 1.11, $\nu \equiv 0$ on $\Sigma \cap A^c$ and $\rho \equiv 0$ on $\Sigma \cap B^c$. Consequently, $\nu$, $\rho$, $\nu + \rho$, and $\nu - \rho$ are identically zero, each one on $\Sigma \cap (A^c \cap B^c)$. Again applying Problem 1.11, we see that the measures $|\nu + \rho|$ and $|\nu - \rho|$ attain zero on the set $A^c \cap B^c$. On the other hand, obviously, $\mu(A \cup B) = 0$.
(iii) a) $\nu \perp \mu$ implies that $|\nu|(A) = \nu^+(A) + \nu^-(A) = \mu(A^c) = 0$ for some $A$, and therefore $\nu^+(A) = \nu^-(A) = 0$.
b) If $\nu^+ \perp \mu$ and $\nu^- \perp \mu$, then by (ii), $|\nu| = \nu^+ + \nu^- \perp \mu$.
(iv) Since $\rho \perp \mu$, there is a set $A \in \Sigma$ such that $\mu(A) = |\rho|(A^c) = 0$. By Proposition 2.10, $|\nu| \ll \mu$. In other words, $|\nu|(A) = 0$, which proves the statement.
(v) Replacing $\rho$ in (iv) by $\nu$ we have, in the condition of (iv), that $\nu \perp \nu$. Therefore, there is an $A \in \Sigma$ such that $|\nu|(A) = |\nu|(A^c) = 0$ and, since $|\nu|$ is a positive measure, $|\nu|(\Omega) = 0$ and $|\nu| = \nu^+ = \nu^- = 0$. □
3.3 Definition. Let $\mu$ be a positive measure and $\nu$ a signed measure. If $\nu$ has a decomposition into two signed measures in the form
$$\nu = \nu_a + \nu_s, \quad\text{where } \nu_a \ll \mu \text{ and } \nu_s \perp \mu,$$
then it is called a Lebesgue decomposition of $\nu$ with respect to $\mu$. The measures $\nu_a$ and $\nu_s$ are said to be the absolutely continuous and singular components of $\nu$ with respect to $\mu$. □
3.4 Theorem (Lebesgue Decomposition Theorem). Let $\mu$ be a $\sigma$-finite positive measure and $\nu$ be a $\sigma$-finite signed measure, both on a measurable space $(\Omega,\Sigma)$. Then, there is a unique Lebesgue decomposition of $\nu$ with respect to $\mu$.
Proof. Let $\nu$ be a $\sigma$-finite positive measure. Obviously, $\mu + \nu$ is a $\sigma$-finite positive measure and both $\mu$ and $\nu$ are absolutely continuous with respect to $\mu + \nu$. By the Radon-Nikodym Theorem, case 4, there is a unique equivalence class $[f]_{\mu+\nu} \in L(\Omega,\Sigma,\mu+\nu;\mathbb{R}_+)|_{\mu+\nu}$ of (nonnegative) Radon-Nikodym densities of $\mu$ with respect to $\mu + \nu$. Let $f$ be one such density. Denote $E = \{f > 0\}$ and define two measures:
$$\nu_a = \mathrm{Res}_{\Sigma \cap E}\,\nu \quad\text{and}\quad \nu_s = \mathrm{Res}_{\Sigma \cap E^c}\,\nu.$$
Obviously, $\nu_a + \nu_s = \nu$. Let $\mu(A) = 0$ for some $A \in \Sigma$. Then, since
$$0 = \mu(A) = \int 1_A f\,d(\mu+\nu)$$
and $f \ge 0$, it follows that $1_A f \in [0]_{\mu+\nu}$. On the other hand, because $f > 0$ on $E \cap A$, the set $E \cap A$ is $(\mu+\nu)$-null and, therefore, $\nu$-null as well. Consequently, $\nu_a(A) = 0$ or, in other words, $\nu_a \ll \mu$. To show that $\nu_s \perp \mu$, observe that $\nu_s(E) = 0$, whereas
$$\mu(E^c) = \int_{E^c} f\,d(\mu+\nu) = 0.$$
Now, let $\nu$ be a $\sigma$-finite signed measure with its Jordan decomposition $\nu = \nu^+ - \nu^-$. Applying the above arguments to $\nu^+$ and $\nu^-$ (with respect to the same set $E$) we have that $\nu_a^+ \ll \mu$ and $\nu_a^- \ll \mu$, which makes $\nu_a = \nu_a^+ - \nu_a^- \ll \mu$. The same applies to $|\nu_s| = \nu_s^+ + \nu_s^- \perp \mu$ in the proof of $\nu_s \perp \mu$.
Now, we prove uniqueness. Let $\nu$ be a finite signed measure and suppose that
$$\nu = \nu_a + \nu_s = \nu_1 + \nu_2 \quad\text{with } \nu_a, \nu_1 \in \mathfrak{S}_\mu^{\ll} \text{ and } \nu_s, \nu_2 \in \mathfrak{S}_\mu^{\perp}.$$
Then, because $\nu$ is finite, by Problem 2.13, $\nu_a - \nu_1 = \nu_2 - \nu_s$ is a signed measure, absolutely continuous with respect to $\mu$ and, by Proposition 3.2 (ii), orthogonal to $\mu$. Thus, by Proposition 3.2 (v), the signed measure $\nu_a - \nu_1 = \nu_2 - \nu_s$ must be 0. If $\nu$ is $\sigma$-finite, then there is a countable measurable partition $\{\Omega_n\}$ of $\Omega$ so that $\nu$ is finite on each $\Omega_n$. Then, by the above arguments, the restriction of the Lebesgue decomposition of $\nu$ to each $(\Omega_n, \Sigma \cap \Omega_n)$ is unique, which obviously yields uniqueness of the Lebesgue decomposition of $\nu$ on the entire $(\Omega,\Sigma)$. □
Next, we consider yet another decomposition of a measure into two mutually singular components.
3.5 Definitions. Let $(\Omega,\Sigma,\nu)$ be a signed measure space such that for each $\omega \in \Omega$, $\{\omega\} \in \Sigma$.
(i) A point $a \in \Omega$ is said to be a $\nu$-atom (or just an atom) if $|\nu|(\{a\}) > 0$. In this case, we also say that $\nu$ has an atom at $\{a\}$. $\nu$ is called atomic (or discrete) if it is concentrated on an at most countable set of atoms, i.e. there is a countable set $A \subseteq \Omega$ of atoms such that $|\nu|(A^c) = 0$.
(ii) $\nu$ is called continuous if $|\nu|(\{\omega\}) = 0$ for all $\omega$'s.
Notice that if $(\Omega,\Sigma,\nu)$ is an atomic measure space with respect to a countable set $A$ on which $\nu$ is concentrated, then $\nu$ can be represented as
$$\nu(B) = \sum_{a \in A \cap B} \nu(\{a\}), \quad B \in \Sigma.$$
Apparently, if $\nu$ and $\rho$ are signed measures on $(\Omega,\Sigma)$, as in Definition 3.5, such that $\nu$ is continuous and $\rho$ is atomic, then $\nu \perp \rho$. It seems plausible that a signed measure $\nu$ on $(\Omega,\Sigma)$ is, in general, of mixed type and that it permits a decomposition $\nu = \nu_c + \nu_d$ into a continuous and a discrete component. Of course, in contrast with the Lebesgue decomposition, there is no "third party measure" involved. We start with positive measures.
3.6 Theorem. Let $(\Omega,\Sigma)$ be a measurable space such that for each $\omega \in \Omega$, $\{\omega\} \in \Sigma$, and let $\mu$ be a $\sigma$-finite positive measure on $(\Omega,\Sigma)$. Then there is a unique decomposition $\mu = \mu_c + \mu_d$ into a continuous and a discrete component such that $\mu_c \perp \mu_d$.
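Proof. Assume that $\mu$ is finite. Let $C$ be any countable subset of $\Omega$. Then $C$ is measurable and
$$\sum_{\omega \in C} \mu(\{\omega\}) = \mu(C) \le \mu(\Omega) < \infty. \tag{3.6}$$
Obviously,
$$\sum_{\omega \in \Omega} \mu(\{\omega\}) = \sup\Big\{ \sum_{\omega \in C} \mu(\{\omega\}) : C \subseteq \Omega \text{ countable} \Big\}.$$
From (3.6) we have that $\sum_{\omega \in \Omega} \mu(\{\omega\}) < \infty$. Thus, $\sum_{\omega \in \Omega} \mu(\{\omega\})$ can have at most countably many positive terms. In other words, the set $A$ of all $\mu$-atoms is at most countable. Denote
$$\mu_d(B) = \mu(A \cap B), \quad B \in \Sigma.$$
Then, $\mu_d$ is an atomic measure. We will show that the set function $\mu_c = \mu - \mu_d$ is a positive measure. It clearly suffices to show that $\mu_c \ge 0$. Let $B$ be a measurable set. Then,
$$\mu(B) = \mu(A \cap B) + \mu(A^c \cap B) \ge \mu(A \cap B).$$
Because $\mu_d(A \cap B) = \mu_d(B)$,
$$\mu_c(B) = \mu(B) - \mu_d(B) = \mu(A^c \cap B) \ge 0.$$
Clearly, $\mu_c$ is continuous and, as mentioned previously, $\mu_c \perp \mu_d$. Consequently, $\mu = \mu_c + \mu_d$ is the desired decomposition. Now suppose that $\mu$ is $\sigma$-finite and let $\{\Omega_n\}$ be a countable measurable partition of $\Omega$ such that
$$\mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\,\mu$$
is finite for each $n$. Applying the above arguments to every $\mu_n$, we arrive at the decomposition $\mu_n = \mu_{c,n} + \mu_{d,n}$ relative to the set $A_n$ of the atoms of $\mu_n$. Then,
$$A = \sum_{n=1}^{\infty} A_n$$
is the set of all atoms of $\mu$ and
$$\mu = \sum_{n=1}^{\infty} \mu_{c,n} + \sum_{n=1}^{\infty} \mu_{d,n} = \mu_c + \mu_d$$
is the desired decomposition of $\mu$, with $\mu_c \perp \mu_d$ with respect to $A$. It now remains to prove the uniqueness of the decomposition. Let
$$\mu = \mu_c + \mu_d = \mu_c' + \mu_d'. \tag{3.6a}$$
Since the set $A$ of all atoms of $\mu$ is unique, both $\mu_d$ and $\mu_d'$ are concentrated on $A$, which makes them clearly equal. If $B$ is a $\mu$-finite measurable set, then $\mu_d = \mu_d'$ and (3.6a) immediately imply that $\mu_c(B) = \mu_c'(B)$. Otherwise, let $B_n = B \cap D_n$, where $\{D_n\} \uparrow \Omega$ and
$$\mu(D_n) < \infty, \quad n = 1,2,\ldots.$$
Then, $\mu_c(B_n) = \mu_c'(B_n)$ and continuity from below lead to
$$\mu_c(B) = \lim_{n\to\infty} \mu_c(B_n) = \lim_{n\to\infty} \mu_c'(B_n) = \mu_c'(B)$$
and to the equality of $\mu_c$ and $\mu_c'$. □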
3.7 Theorem. Let $(\Omega,\Sigma)$ be a measurable space as in Theorem 3.6 and let $\nu$ be a $\sigma$-finite signed measure on $(\Omega,\Sigma)$. Then, given a $\sigma$-finite positive measure $\mu$ on $(\Omega,\Sigma)$, there is a unique decomposition
$$\nu = \nu_{ca} + \nu_{cs} + \nu_d$$
with respect to $\mu$ into three $\sigma$-finite signed measures, of which the first one is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$, and the third one is atomic. Furthermore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$.
Proof. Let $\nu = \nu^+ - \nu^-$ be its Jordan decomposition. Then, by Theorem 3.6, $\nu^+$ and $\nu^-$ can be decomposed as
$$\nu^+ = \nu_c^+ + \nu_d^+ \quad\text{and}\quad \nu^- = \nu_c^- + \nu_d^-$$
relative to the sets $A^+$ and $A^-$ of atoms of $\nu^+$ and $\nu^-$, respectively. Consequently,
$$\nu = (\nu_c^+ - \nu_c^-) + (\nu_d^+ - \nu_d^-) = \nu_c + \nu_d$$
is the corresponding decomposition of the signed measure $\nu$ into its continuous $\nu_c$ and atomic $\nu_d$ components with respect to the set $A = A^+ \cup A^-$ of atoms of $\nu$. This representation is obviously unique.
Now, given a $\sigma$-finite positive measure $\mu$, let $\nu = \nu_c + \nu_d$ be the decomposition (with respect to the set $A$ of atoms of $\nu$). According to Theorem 3.4, there is a unique Lebesgue decomposition $\nu_c = \nu_{ca} + \nu_{cs}$ with respect to $\mu$. Therefore, $\nu = \nu_{ca} + \nu_{cs} + \nu_d$ is a unique decomposition of $\nu$ with respect to $\mu$ into three $\sigma$-finite signed measures, of which the first is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$, and the third one is atomic. Furthermore, we have that $\nu_{ca}(A) = \nu_{cs}(A) = \nu_d(A^c) = 0$. Therefore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$. □
3.8 Corollary. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n,\mathcal{B})$ and let $\lambda$ be the Borel-Lebesgue measure. Then, there is a unique decomposition
$$\nu = \nu_a + \nu_{sc} + \nu_d \tag{3.8}$$
with respect to the Borel-Lebesgue measure $\lambda$ such that $\nu_a \ll \lambda$, $\nu_{sc} \perp \lambda$, and $\nu_{sc} \perp \nu_d$.
Proof. Because any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite, by Theorem 3.7, $\nu$ can uniquely be decomposed as
$$\nu = \nu_{ca} + \nu_{cs} + \nu_d,$$
where $\nu_{ca} \ll \lambda$. Since, obviously, $\nu_d \perp \lambda$, by Proposition 3.2 (ii),
$$\nu_{cs} + \nu_d \perp \lambda.$$
Because the Lebesgue decomposition is unique, it follows that $\nu_{ca}$ is the absolutely continuous and $\nu_{cs} + \nu_d$ is the singular component in the Lebesgue decomposition of $\nu$. In particular, it follows that $\nu_a = \nu_{ca}$ is also continuous. □
3.9 Definition. The singular components $\nu_{sc}$ and $\nu_d$ of $\nu$ in decomposition (3.8) are said to be singular-continuous and singular-discrete (or just discrete), respectively. □
We are going to continue our discussion of singularity of measures in Section 4, Chapter 9.
PROBLEMS
3.1 Prove part (i) of Proposition 3.1.
3.2 Generalize Proposition 3.2 for complex measures replacing signed measures.
CHAPTER 8. ANALYSIS IN ABSTRACT SPACES
3.3 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ in Theorem 3.4.
3.4 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.
3.5 Prove a version of the Lebesgue Decomposition Theorem with a $\sigma$-finite positive measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.
3.6 Let $\nu_a$ and $\nu_s$ be the absolutely continuous and singular components of a complex measure $\nu$ with respect to a positive measure $\mu$. Show that $|\nu| = |\nu_a| + |\nu_s|$.
NEW TERMS:
singularity (orthogonality) of a signed measure
orthogonality (singularity) of a signed measure
Lebesgue decomposition of a signed measure
absolutely continuous component of a signed measure
component of a signed measure
Lebesgue Decomposition Theorem
atom ($\mu$-atom)
atom of a singular measure
continuous singular measure
decomposition of a positive measure
decomposition of a $\sigma$-finite signed measure
singular components of a signed measure
singular-continuous component of a signed measure
singular-discrete component of a signed measure
4. $L^p$ SPACES
This section deals with the so-called $L^p$ spaces and studies them more systematically as metric spaces.
4.1 Notation. Let $(\Omega, \Sigma, \mu)$ be a (positive) measure space. Then, for $0 < p < \infty$, we denote by $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ the set of all measurable complex-valued functions $f$ such that $|f|^p \in L^1(\Omega,\Sigma,\mu;\mathbb{C})$. In particular, if $\mu$ is the counting measure on $(\Omega,\Sigma)$ with $\Omega = \{1,2,\ldots\}$, then the set $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ reduces to the familiar $l^p$ space of all $p$-summable sequences. We will occasionally abbreviate $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ as $L^p(\Omega,\Sigma,\mu)$ or just $L^p$. One more notation we are going to use throughout is $L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, the set of all $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$-functions $f$ with $|f|^p \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$.
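4.2 Proposition. $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is a linear space over the field $\mathbb{C}$.
Proof. Let $a, b \ge 0$; then
$$(a+b)^p \le (2\max\{a,b\})^p \le 2^p(a^p + b^p). \tag{4.2}$$
Now, for $f, g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$, due to (4.2), we have
$$|f+g|^p \le (|f| + |g|)^p \le 2^p(|f|^p + |g|^p),$$
from which we see that $f + g \in L^p$. The other linear space properties are obvious.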
Notice that $L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is only quasi-linear over $\overline{\mathbb{R}}$. Due to (4.2) and the homogeneity, it is "linear" when the scalars are restricted to $\mathbb{R}$ but, of course, not for scalars from $\overline{\mathbb{R}}$. Consequently, endowing $L^p$ with a norm should be done with care and with respect to the accepted terminology. We now introduce a semi-norm on $L^p$.
4.3 Theorem. The real-valued function $\|\cdot\|_p : L^p(\Omega,\Sigma,\mu;\mathbb{C}) \to \mathbb{R}_+$ defined as
$$\|f\|_p = \Big(\int |f|^p\,d\mu\Big)^{1/p}$$
is a semi-norm.
Theorem 4.3, whose proof will follow, essentially reduces to the triangle inequality, which we show in two steps below. Recall (Problem 1.5, Chapter 2) that two real numbers $p > 1$ and $q > 1$ are said to be conjugate exponents if
$$\frac{1}{p} + \frac{1}{q} = 1.$$
Now we prove the Hölder inequality for the semi-norm $\|\cdot\|_p$ on $L^p(\Omega,\Sigma,\mu;\mathbb{C})$.
4.4 Proposition (Hölder's Inequality). Let $1 < p < \infty$ and $q$ be its conjugate exponent, and let $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ and $g \in L^q(\Omega,\Sigma,\mu;\mathbb{C})$. Then, $fg \in L^1$ and
$$\|fg\|_1 \le \|f\|_p\,\|g\|_q. \tag{4.4}$$
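Proof. By Problem 1.5, Chapter 2,
$$|fg| \le \frac{1}{p}|f|^p + \frac{1}{q}|g|^q.$$
Hence, $|fg|$ is bounded by integrable functions and
$$\|fg\|_1 \le \frac{1}{p}\|f\|_p^p + \frac{1}{q}\|g\|_q^q. \tag{4.4a}$$
If one of the values $\|f\|_p$ or $\|g\|_q$ vanishes or is infinity (or any combination), then (4.4) holds. Assume that neither of them is zero or infinity. Then (4.4a) still holds with $f$ replaced by $f/\|f\|_p$ and $g$ replaced by $g/\|g\|_q$, in which case its right-hand side equals $\frac{1}{p} + \frac{1}{q} = 1$. This yields (4.4). □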
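As a quick numerical sanity check of Hölder's inequality (4.4), the following sketch takes $\mu$ to be the counting measure on a finite index set, so that every integral becomes a finite sum. The helper names `p_norm` and `holder_gap` are illustrative assumptions and do not come from the text.

```python
import random

def p_norm(values, p):
    """||f||_p for the counting measure on a finite set: (sum |f_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - ||fg||_1, which Hoelder's inequality says is >= 0."""
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    product_l1 = sum(abs(a * b) for a, b in zip(f, g))
    return p_norm(f, p) * p_norm(g, q) - product_l1

rng = random.Random(0)  # fixed seed so that the run is reproducible
f = [rng.uniform(-1, 1) for _ in range(50)]
g = [rng.uniform(-1, 1) for _ in range(50)]
gaps = [holder_gap(f, g, p) for p in (1.5, 2.0, 3.0, 7.0)]
print(all(gap >= -1e-12 for gap in gaps))
```

For $p = q = 2$ and $g = f$ the gap vanishes up to rounding, which matches the equality case of the Cauchy-Schwarz inequality.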
Observe that for the special case $p = q = 2$, Hölder's inequality reduces to the frequently used Cauchy-Schwarz inequality. (In addition to (4.4), we have $fg \in L^1$ and $f, g \in L^2$.) Now, we are ready to prove the triangle inequality, known as Minkowski's inequality.
4.5 Proposition (Minkowski's Inequality). Let $1 \le p < \infty$ and $f, g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$. Then $f + g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ and
$$\|f+g\|_p \le \|f\|_p + \|g\|_p. \tag{4.5}$$
Proof. For $p = 1$, (4.5) reduces to the known triangle inequality for the $L^1$ space. Assume that $1 < p < \infty$ and denote by $q$ its conjugate exponent. We have
$$|f+g|^p \le |f|\,|f+g|^{p-1} + |g|\,|f+g|^{p-1}. \tag{4.5a}$$
Since obviously $pq - q = p$ and because the space $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is linear, $f + g \in L^p$ and hence
$$\big(|f+g|^{p-1}\big)^q = |f+g|^p \in L^1.$$
Consequently, $|f+g|^{p-1} \in L^q$.
Now we apply the Hölder inequality to $f, g \in L^p$ and to $|f+g|^{p-1} \in L^q$ to have $|f|\,|f+g|^{p-1}$ and $|g|\,|f+g|^{p-1}$ as $L^1$-functions and
$$\big\|\,|f|\,|f+g|^{p-1}\big\|_1 = \int |f|\,|f+g|^{p-1}\,d\mu \le \|f\|_p \Big(\int |f+g|^{(p-1)q}\,d\mu\Big)^{1/q} = \|f\|_p\,\|f+g\|_p^{p/q} \quad (\text{since } pq - q = p), \tag{4.5b}$$
with
$$\big\|\,|g|\,|f+g|^{p-1}\big\|_1 \le \|g\|_p\,\|f+g\|_p^{p/q}. \tag{4.5c}$$
Applying the norm (integral operator) to (4.5a) and using (4.5b) and (4.5c), we have
$$\|f+g\|_p^p \le \big(\|f\|_p + \|g\|_p\big)\,\|f+g\|_p^{p/q}.$$
Dividing both sides of the last inequality by $\|f+g\|_p^{p/q}$ (of course, we assume $\|f+g\|_p > 0$, or else the triangle inequality holds true immediately) and due to $p - (p/q) = 1$, we have the above assertion. □
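Minkowski's inequality can be checked numerically in the same spirit. The sketch below again assumes the counting measure on a finite set and uses hypothetical helper names (`p_norm`, `minkowski_gap`). It also shows a well-known fact that the text does not state explicitly: for $0 < p < 1$ the triangle inequality can fail, which is why the range $p \ge 1$ matters.

```python
import random

def p_norm(values, p):
    """(sum |f_i|^p)^(1/p); a genuine semi-norm only for p >= 1."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def minkowski_gap(f, g, p):
    """Return ||f||_p + ||g||_p - ||f + g||_p (nonnegative when p >= 1)."""
    total = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(total, p)

rng = random.Random(1)
f = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
g = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
gaps = {p: minkowski_gap(f, g, p) for p in (1.0, 1.5, 2.0, 4.0)}

# Two disjoint unit "spikes": for p = 1/2 the triangle inequality is violated.
failure = minkowski_gap([1.0, 0.0], [0.0, 1.0], 0.5)
print(gaps, failure)
```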
Proof of Theorem 4.3. Notice that $\|\alpha f\|_p = |\alpha|\,\|f\|_p$ satisfies property (ii) of the norm in Theorem 7.3, Chapter 2. Property (iii) of the same theorem is subject to the Minkowski inequality. And finally, $f = 0$ implies $\|f\|_p = 0$. The converse however gives a weaker condition: $\|f\|_p = 0$ yields $f = 0$ $\mu$-a.e. Theorem 4.3 is therefore proved. □
4.6 Remark. To make $(L^p, \|\cdot\|_p)$ a normed space we will pass to equivalence classes in the same way as in Sections 1 and 5 of Chapter 6 and Section 2 of the present chapter. Recall that the $\mu$-almost everywhere equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ and thus on $L^p$. Consequently,
is also a quotient set. Then, $[0]_\mu$ is a linear subspace and
is the (quotient) space, with the origin $0 = [0]_\mu$, generated by $E$, and $\|\cdot\|_p$ is now a norm on $L^p(\Omega,\Sigma,\mu;\mathbb{C})|_\mu$. Indeed, by Lemma 1.15, Chapter 6, we see that $\|f\|_p = 0$ implies that $f \in [0]_\mu$.
4.7 Definition. A sequence $\{f_n\} \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is said to converge in the $p$th mean to a function $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ (or just $L^p$-converge to $f$) if
$$\lim_{n\to\infty}\|f_n - f\|_p = 0.$$
We will also denote it by $f_n \xrightarrow{L^p} f$. Problems 4.2 and 4.3 (which are essentially due to Riesz) state that if an $L^p$-sequence $\{f_n\}$ converges $\mu$-a.e. pointwise to an $L^p$-function $f$, then the convergence of $\{\|f_n\|_p\}$ to $\|f\|_p$ is equivalent to the convergence of $\{f_n\}$ to $f$ in the $p$th mean. Below we state and prove a more general version of the Lebesgue Dominated Convergence Theorem than Theorem 2.6, Chapter 6, for the $(L^1(\Omega,\Sigma,\mu), \|\cdot\|_1)$-space.
4.8 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega,\Sigma,\mu)$ be a measure space and $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$) be an a.e. convergent sequence, a.e. dominated by an $L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$-function $g$; more precisely, $|f_n| \le g$ for each $n$ $\mu$-a.e. Then the following are true:
(ii) there is an $L^p(\Omega,\Sigma,\mu;\mathbb{C})$-function $f$ such that $\{f_n\}$ converges to $f$ a.e. in the topology of pointwise convergence;
(iii) $f_n \xrightarrow{L^p} f$;
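Proof. As usual, denote by $\mathcal{N} = \mathcal{N}_\mu$ the subfamily of all measurable $\mu$-null sets. Since $\{f_n\}$ is a.e. convergent pointwise, there is $M \in \mathcal{N}$ such that $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in M^c$. Denote by $L(\omega)$ the value of this limit. Since $g^p \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, by Proposition 1.21, Chapter 6, there is $N \in \mathcal{N}$ such that $g(\omega) < \infty$ on $N^c$. Furthermore, for each $n$ there is a set $O_n \in \mathcal{N}$ such that $|f_n| \le g$ for all $\omega \in O_n^c$. Let $O = \bigcup_{n=1}^{\infty} O_n$. Then, clearly $O \in \mathcal{N}$. Denote $A = M^c \cap N^c \cap O^c$ and $f = L\,\mathbf{1}_A$. Then, $f_n \to f$ $\mu$-a.e. and $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. Because $|f_n| \le g < \infty$ on $A$, we have $|f| \le g$ a.e., $|f| < \infty$, and hence $f$ is $\mathbb{C}$-valued. By Proposition 1.17, Chapter 6, we have that
$$\int |f|^p\,d\mu \le \int g^p\,d\mu < \infty$$
and, consequently, $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$. Let $g_n = |f_n - f|^p$ and $h = (|f| + g)^p$. Then, the sequence $\{g_n\}$ is nonnegative and is dominated by $h$. Since $|f| + g \in L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, we have $h \in L^1$. Therefore, $g_n \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$. Applying Fatou's Lemma to $h - g_n$ we have
$$\int \varliminf_{n\to\infty} (h - g_n)\,d\mu \le \varliminf_{n\to\infty} \int (h - g_n)\,d\mu = \int h\,d\mu - \varlimsup_{n\to\infty} \int g_n\,d\mu. \tag{4.8}$$
Since $g_n \to 0$ a.e., $h - g_n \to h$ a.e. and therefore $\varliminf (h - g_n) = h$ a.e. This and (4.8) yield
$$\varlimsup_{n\to\infty} \int g_n\,d\mu \le 0.$$
Because $g_n \ge 0$, we have
$$\lim_{n\to\infty} \int g_n\,d\mu = \lim_{n\to\infty} \|f_n - f\|_p^p = 0.$$
Finally, $\|f_n\|_p \to \|f\|_p$ is due to Problem 4.2. □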
We are going to show that the space $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is complete with respect to the semi-norm $\|\cdot\|_p$ and hence the quotient space $L^p(\Omega,\Sigma,\mu;\mathbb{C})|_\mu$ is Banach.
4.9 Theorem (Riesz-Fischer). Let $\{f_n\} \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$ (or $L^p(\Omega,\Sigma,\mu;\mathbb{R})$) be a Cauchy sequence with respect to the semi-norm $\|\cdot\|_p$. Then, there exists $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ such that $f_n \xrightarrow{L^p} f$.
Proof. Let $\{f_n\}$ be an $L^p$-Cauchy sequence. Then, given $\varepsilon = 2^{-k}$, there is an $N_k$ such that for all indices $n_k, n_{k+1} \ge N_k$,
$$\|f_{n_{k+1}} - f_{n_k}\|_p < 2^{-k}. \tag{4.9}$$
Hence, there is a subsequence $\{f_{n_k}\}$ whose terms satisfy (4.9). Denote
$$g_k = f_{n_{k+1}} - f_{n_k}, \qquad g = \sum_{k=1}^{\infty} |g_k|,$$
and apply the inequality of Problem 4.1 to the sequence $\{|g_k|\}$. Then we have from (4.9):
$$\|g\|_p \le \sum_{k=1}^{\infty} \|g_k\|_p \le \sum_{k=1}^{\infty} 2^{-k} = 1. \tag{4.9a}$$
Thus, $g \in L^p$ or, equivalently, $g^p \in L^1$. By Proposition 1.21, Chapter 6, $g^p$ and, therefore, $g$ is finite $\mu$-a.e. The latter implies that the partial sums
$$f_{n_1} + \sum_{i=1}^{k-1} g_i = f_{n_k}$$
and hence the subsequence $\{f_{n_k}\}$ converge $\mu$-a.e. on $\Omega$. Furthermore,
$$|f_{n_k}| \le |f_{n_1}| + g,$$
and since (due to (4.9a)) $g \in L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, the subsequence $\{f_{n_k}\}$ is dominated by an $L^p$-integrable nonnegative function $|f_{n_1}| + g$. All other conditions of the Lebesgue Dominated Convergence Theorem 4.8 (applied to the subsequence $\{f_{n_k}\}$) are met. Consequently, there is a function $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ to which $\{f_{n_k}\}$ converges both $\mu$-a.e. in the topology of pointwise convergence and in the $p$th mean. Finally, $\{f_n\}$, being an $L^p$-Cauchy sequence, by Problem 3.9, Chapter 2, must converge to the same limit function $f$ (as its subsequence $\{f_{n_k}\}$) in the $p$th mean. □
Notice that the function $f$ to which $\{f_n\}$ converges in the $p$th mean is defined uniquely $\mu$-a.e. Therefore, the Riesz-Fischer theorem states that the quotient space $L^p(\Omega,\Sigma,\mu;\mathbb{C})|_\mu$ is Banach. As a by-product, the theorem provides a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges to $f$ $\mu$-a.e. in the topology of pointwise convergence. The theorem does not state, however, that $\{f_n\}$ itself also converges to $f$ $\mu$-a.e. pointwise. (The reader is encouraged to provide a counterexample showing that this need not be the case; see Problem 4.6.) Below is what we can afford.
4.10 Proposition. If an $L^p(\Omega,\Sigma,\mu;\mathbb{C})$-Cauchy sequence $\{f_n\}$ converges $\mu$-a.e. pointwise to a function $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$, then $f \in L^p$ and $f_n \xrightarrow{L^p} f$.
Proof. By the Riesz-Fischer Theorem 4.9, there is an $L^p$-function $\tilde{f}$ such that $f_n \xrightarrow{L^p} \tilde{f}$, and there is a subsequence $\{f_{n_k}\} \subseteq \{f_n\}$ such that $f_{n_k} \to \tilde{f}$ a.e. pointwise. On the other hand, by our assumption, $f_{n_k} \to f$ a.e. pointwise. Therefore, $f \in [\tilde{f}]_\mu$ and the rest of the statement is again due to the Riesz-Fischer Theorem. □
4.11 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space such that $\mu$ is finite, and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. If $1 \le p \le q < \infty$, then
$$\|f\|_p \le \|f\|_q\,[\mu(\Omega)]^{\frac{1}{p} - \frac{1}{q}} \tag{4.11}$$
and therefore $L^q(\Omega,\Sigma,\mu;\mathbb{C}) \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$.
Proof. We assume that $p < q$, or else (4.11) is trivial. Denote $a = q/p$ and $b = a/(a-1) = q/(q-p)$. Then, $a$ and $b$ are conjugate exponents with $a > 1$. Since $\mu$ is finite, the constant function $1 \in L^b(\Omega,\Sigma,\mu;\mathbb{R})$. Now apply Hölder's inequality to $|f|^p$ and to $1$ with respect to the conjugate exponents $a$ and $b$:
$$\big\|\,|f|^p \cdot 1\big\|_1 \le \big\|\,|f|^p\big\|_a\,\|1\|_b$$
or, equivalently,
$$\|f\|_p^p \le \Big[\int |f|^{pa}\,d\mu\Big]^{1/a}[\mu(\Omega)]^{1/b} = \|f\|_q^p\,[\mu(\Omega)]^{1 - \frac{p}{q}}$$
(since $pa = q$, $1/a = p/q$ and $1/b = 1 - p/q$), which proves (4.11). □
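The bound obtained in this proof, $\|f\|_p \le \|f\|_q\,[\mu(\Omega)]^{1/p - 1/q}$, can be probed numerically on a finite measure space. The sketch below assumes a finite set whose points carry positive masses `weights` (so $\mu(\Omega)$ is their sum); the function names are hypothetical and chosen only for illustration.

```python
import random

def weighted_p_norm(values, weights, p):
    """||f||_p on a finite set whose i-th point has mass weights[i]."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def bound_411(values, weights, p, q):
    """Right-hand side of the bound: ||f||_q * mu(Omega)^(1/p - 1/q)."""
    total_mass = sum(weights)
    return weighted_p_norm(values, weights, q) * total_mass ** (1.0 / p - 1.0 / q)

rng = random.Random(2)
weights = [rng.uniform(0.1, 2.0) for _ in range(30)]
values = [rng.uniform(-5, 5) for _ in range(30)]
checks = [
    weighted_p_norm(values, weights, p) <= bound_411(values, weights, p, q) + 1e-12
    for p, q in [(1, 2), (1, 3), (2, 5), (1.5, 1.5)]
]
print(all(checks))
```

For a constant function both sides coincide, so the exponent $\frac{1}{p} - \frac{1}{q}$ on $\mu(\Omega)$ cannot be improved in general.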
4.12 Examples.
(i) Consider an important special case. If $\mu$ is a probability measure in Proposition 4.11, then the result applied to a random variable $X$ can be interpreted as follows: the existence of the moment of $n$th order implies the existence of all lower moments of $X$.
(ii) The statement of Proposition 4.11 that, for $p < q$,
$$L^q \subseteq L^p$$
need not hold if $\mu$ is not finite. For example, let $\Omega = [1,\infty)$ and let $\mu$ be the counting measure concentrated on the set $\{1,2,\ldots\}$, i.e. $\mu = \sum_{n=1}^{\infty} \varepsilon_n$, where $\varepsilon_n$ is the point mass at $n$. Let $f(x) = \frac{1}{x}$. Then,
$$\int f^2\,d\mu = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$
and thus $f \in L^2$. However, it is easily seen that $f \notin L^1$. □
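The failure of the inclusion for an infinite measure can be made concrete with partial sums. The sketch below assumes the function $f(x) = 1/x$ under the counting measure on $\{1, 2, \ldots\}$ (a standard choice for this kind of example), so that $\int |f|^p\,d\mu$ is the series $\sum 1/n^p$. The helper `partial_sum` is a hypothetical name.

```python
import math

def partial_sum(power, n_terms):
    """sum_{n=1}^{n_terms} (1/n)^power: the integral of |f|^power over {1, ..., n_terms}."""
    return math.fsum((1.0 / n) ** power for n in range(1, n_terms + 1))

# The p = 2 sums stay below pi^2/6, while the p = 1 sums grow like log(n).
for n_terms in (10, 1_000, 100_000):
    print(n_terms, partial_sum(2, n_terms), partial_sum(1, n_terms))
```

The second column stabilizes while the third keeps growing, which is the numerical face of $f \in L^2$ but $f \notin L^1$.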
The theorem below states that the space of all real-valued integrable "extended" simple functions is dense in $L^p$. We need the following notation. Let $\Psi^p(\Omega,\Sigma,\mu;\mathbb{R}) = \Psi(\Omega,\Sigma;\mathbb{R}) \cap L^p(\Omega,\Sigma,\mu;\mathbb{R})$ denote the subset of all real-valued simple $L^p$-integrable functions. (See Remark 6.2 (iii), Chapter 5, on simple functions.)
4.13 Theorem. The real subspace $\Psi^p$ is dense in $(L^p, \|\cdot\|_p)$.
Proof. $\Psi^p \subseteq L^p$, by the definition. Now, given an $f \in L^p$, by Theorem 6.5, Chapter 5, for $f^+$ and $f^-$ there are monotone nondecreasing sequences $\{s_n^+\} \uparrow f^+$ and $\{s_n^-\} \uparrow f^-$ of simple functions. Since $f \in L^p$, $f^+, f^- \in L^p$ and, consequently, so are $\{s_n^+\}, \{s_n^-\} \subseteq L^p$ and $\{s_n = s_n^+ - s_n^-\} \subseteq L^p$. By (4.2),
$$|f - s_n|^p \le 2^p\big(|f|^p + |s_n|^p\big) \le 2^{p+1}|f|^p,$$
and since $f \in L^p$, we have that $\{f - s_n\} \subseteq L^p$. Therefore, the sequence $\{|f - s_n|^p\}$ is dominated by the $L^1$-function $2^{p+1}|f|^p$. We also know that $\{f - s_n\}$ converges to the function $0$ pointwise. Hence, the sequence $\{f - s_n\}$ meets all criteria of the Lebesgue Dominated Convergence Theorem. As a result, there is an $L^p(\Omega,\Sigma,\mu;\mathbb{R})$-function, say $f^*$, to which $\{f - s_n\}$ converges a.e. pointwise and in the $p$th mean. Hence $f^* \in [0]_\mu$ and, by setting $f^* = 0$, we have $\lim_{n\to\infty}\|f - s_n\|_p = 0$ or that $s_n \xrightarrow{L^p} f$. In other words, $\Psi^p$ is dense in $L^p$. □
4.14 Remarks.
(i) Given an $L^p$-function $f$, we proved the existence of an "extended" sequence $\{s_n\}$ of simple functions such that $\{|s_n|\}$ is monotone increasing to $|f|$ and $\{s_n\}$ converges to $f$ pointwise.
(ii) Notice that not only is $\Psi$ dense in $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ in the topology of pointwise convergence but, as we showed, the subspace $\Psi^p$ of $\Psi$ is dense in $(L^p, \|\cdot\|_p)$.
(iii) A minor adjustment to Theorem 4.13 allows us to claim that the subspace $\Psi^p(\Omega,\Sigma,\mu;\mathbb{C}) = \Psi(\Omega,\Sigma;\mathbb{C}) \cap L^p(\Omega,\Sigma,\mu;\mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ (Problem 4.8).
The following topic, on $\mu$-a.e. bounded measurable functions or "$L^\infty$-functions", occurs often in applications and is explored next. We will also see how the $L^\infty$-space fits in the $L^p$-family.
4.15 Definition. Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. A positive real number $M$ is said to be an essential bound for $f$ if $|f| \le M$ $\mu$-a.e. on $\Omega$. If $f$ has an essential bound it is naturally called essentially bounded. □
We would like to point out the difference between $\mu$-a.e. finite and essentially bounded functions. For instance, a function such as $x \mapsto 1/|x|$ (with the value $\infty$ at $0$), regarded as an element of $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$, is finite $\lambda$-a.e. on $\mathbb{R}$, i.e. everywhere except for $0$, whereas it is not essentially bounded. Moreover, its "repaired" version, redefined to take a finite value at $0$, becomes finite (and an element of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$), but is still not essentially bounded.
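4.16 Definition and Notation. If a measurable function $f$ on $(\Omega,\Sigma,\mu)$ is essentially bounded, then the infimum of all essential bounds for $f$ is called the ($\mu$-) essential supremum of $f$ and it is denoted by $\|f\|_\infty$ or by $\operatorname{ess\,sup}\{|f|\}$. More formally,
$$\|f\|_\infty = \inf\{M > 0 : |f| \le M\ \mu\text{-a.e. on } \Omega\}.$$
The subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$) of all essentially bounded functions is denoted by $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ (or $L^\infty(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, resp.). (Of course, if $f$ is not essentially bounded, it would make sense to set $\|f\|_\infty = \infty$. However, since we are going to use $\|\cdot\|_\infty$ as the norm within $L^\infty$, we do not need such an extension.) □
It is easy to see that $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ is a vector space over the field $\mathbb{C}$, while $L^\infty(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is a "quasi"-vector space over $\overline{\mathbb{R}}$. The properties below justify $\|\cdot\|_\infty$ as a semi-norm on $L^\infty$.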
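4.17 Proposition. Given two measurable functions $f$ and $g$ on $(\Omega,\Sigma,\mu)$ and a scalar $\alpha \in \mathbb{C}$, the following are valid:
(i) $|f| \le \|f\|_\infty$ $\mu$-a.e. on $\Omega$.
(ii) $\|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.
(iii) $|f| \le |g|$ $\mu$-a.e. on $\Omega$ implies that $\|f\|_\infty \le \|g\|_\infty$.
(iv) $f \in [g]_\mu \Rightarrow \|f\|_\infty = \|g\|_\infty$.
(v) $\|\alpha f\|_\infty = |\alpha|\,\|f\|_\infty$.
(vi) $\|f\|_\infty = 0 \Leftrightarrow f \in [0]_\mu$.
(vii) $\|\alpha\|_\infty = |\alpha|$.
(viii) $\|fg\|_\infty \le \|f\|_\infty\,\|g\|_\infty$.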
Proof.
(i) Given $\varepsilon = \frac{1}{n}$, there is an essential bound $M_n$ such that
$$M_n < \|f\|_\infty + \tfrac{1}{n}.$$
Hence, the set $\{|f| > \|f\|_\infty + \frac{1}{n}\} \in \mathcal{N}_\mu$, and along with this, the set
$$\{|f| > \|f\|_\infty\} = \bigcup_{n=1}^{\infty}\{|f| > \|f\|_\infty + \tfrac{1}{n}\} \in \mathcal{N}_\mu.$$
(ii) $|f+g| \le |f| + |g| \le \|f\|_\infty + \|g\|_\infty$ $\mu$-a.e. Hence, $\|f\|_\infty + \|g\|_\infty$ is an essential bound for $f + g$. Thus, $\|f+g\|_\infty$, as the infimum of all essential bounds, does not exceed $\|f\|_\infty + \|g\|_\infty$.
(iii) Because of (i) and our assumption, we have $|f| \le \|g\|_\infty$ $\mu$-a.e. Therefore, $\|g\|_\infty$ is an essential bound for $f$ and $\|f\|_\infty \le \|g\|_\infty$, as $\|f\|_\infty$ is the infimum of all essential bounds.
(iv) The validity of this statement follows from (iii) applied twice.
(v) Because of $|\alpha f| = |\alpha|\,|f|$, it follows that $|\alpha|\,\|f\|_\infty$ is the essential supremum of $\alpha f$.
(vi) Let $\|f\|_\infty = 0$. From (i), it follows that $|f| \le 0$ $\mu$-a.e. on $\Omega$. The converse of this statement is trivial.
(vii) Follows from $\|1\|_\infty = 1$ and (v).
(viii) $|fg| = |f|\,|g| \le \|f\|_\infty\,\|g\|_\infty$ $\mu$-a.e. by (i), which, by (iii) and (vii), implies $\|fg\|_\infty \le \|f\|_\infty\,\|g\|_\infty$. □
The above properties yield that $(L^\infty, \|\cdot\|_\infty)$ is a semi-normed linear space, and it can be made an NLS by passing to the usual factor space $L^\infty|_\mu$. We will establish a few more properties, such as Hölder's inequality and completeness of $L^\infty|_\mu$, to have $L^\infty$ be a part of the $L^p$ family. First, a few examples.
4.18 Examples.
(i) Let $A = \{\frac{1}{n} : n = 1,2,\ldots\}$ and $B = [0,1] \setminus A$. Define the measurable function $f$ on
([0,1],% n [0,1], p = Res% ,[ 1 1 ~as) f(x) = sinxlg(z) = c - InlA(x).
Clearly, the function $f$ is not bounded and therefore $\sup|f| = \infty$. However, since $\mu(A) = 0$, $\|f\|_\infty = \sin 1$.
(ii) In the condition of Example (i), let
Then, $\sup|g| = 1$, while $\|g\|_\infty = \sin 1 < 1$. □
4.19 Proposition (Hölder's Inequality for $L^\infty$ spaces). Let $f \in L^1$ and $g \in L^\infty$. Then, $fg \in L^1$ and the following inequality holds true:
$$\|fg\|_1 \le \|f\|_1\,\|g\|_\infty.$$
The proof is left as an exercise (Problem 4.10).
4.20 Notation. Given a sequence $\{f_n\} \subseteq L^\infty$, we will write
4.21 Theorem. The space $(L^\infty(\Omega,\Sigma,\mu;\mathbb{C}), \|\cdot\|_\infty)$ is Banach.
Proof. Let $\{f_n\}$ be a Cauchy sequence. Then, by Problem 4.13, there is a set $A \in \mathcal{N}_\mu$ such that $f_n - f_m \to 0$ uniformly on $A^c$. Consequently, there is a function $[A^c, \mathbb{C}, f_0]$ to which $\{f_n\}$ converges uniformly on $A^c$. It is readily justified (cf. Proposition 5.6 (vi), Chapter 5) that $f_0$ is measurable on $A^c$. Thus, the function $f = f_0\mathbf{1}_{A^c} \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. Clearly, $f$ is essentially bounded. Since $f_n \to f$ uniformly $\mu$-a.e. on $\Omega$, by Problem 4.12, $\{f_n\}$ converges to $f$ in $L^\infty$. □
4.22 Definition and Notation. As we see from the above analysis, the $L^\infty$ spaces become a natural extension of the $L^p$ spaces in the following way. The two versions of the Hölder Inequality can be combined into one after upgrading the notion of the conjugate exponent. Two extended real numbers $1 \le p \le \infty$ and $1 \le q \le \infty$ are said to be conjugate exponents if they satisfy the equation
$$\frac{1}{p} + \frac{1}{q} = 1$$
with the usual agreement that $\frac{1}{\infty} = 0$. The generalization below of conjugate exponents allows a modification of the Hölder Inequality. The extended real numbers $1 \le q_i \le \infty$, $i = 1,\ldots,n$, are said to be conjugate exponents if they satisfy the equation
$$\sum_{i=1}^{n} \frac{1}{q_i} = 1. \tag{4.22}$$
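4.23 Proposition (Generalized Hölder's Inequality; Version 1). Given $n$ conjugate exponents of (4.22), let $g_i \in L^{q_i}(\Omega,\Sigma,\mu;\mathbb{C})$, $i = 1,\ldots,n$. Then, $g_1 \cdots g_n \in L^1$ and
$$\|g_1 \cdots g_n\|_1 \le \prod_{i=1}^{n} \|g_i\|_{q_i}.$$
The following is a modification of the Hölder Inequality.
4.24 Proposition (Generalized Hölder's Inequality; Version 2). Given $n + 1$ extended real numbers $0 < p_i \le \infty$, $i = 0,\ldots,n$, such that
$$\frac{1}{p_0} = \sum_{j=1}^{n} \frac{1}{p_j},$$
and functions $f_j \in L^{p_j}(\Omega,\Sigma,\mu;\mathbb{C})$, $j = 1,\ldots,n$, it holds true that $f_1 \cdots f_n \in L^{p_0}$ and
$$\|f_1 \cdots f_n\|_{p_0} \le \prod_{j=1}^{n} \|f_j\|_{p_j}.$$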
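A small numerical experiment can make Version 1 of the generalized inequality tangible. The sketch assumes the counting measure on a finite set, for which the essential supremum is simply the maximum of $|g|$; the names `p_norm` and `generalized_holder_gap` are illustrative and not part of the text.

```python
import math
import random

def p_norm(values, p):
    """||g||_p for the counting measure on a finite set; p may be math.inf."""
    if math.isinf(p):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def generalized_holder_gap(functions, exponents):
    """Return prod_i ||g_i||_{q_i} - ||g_1 ... g_n||_1 for conjugate exponents q_i."""
    reciprocal_sum = sum(0.0 if math.isinf(q) else 1.0 / q for q in exponents)
    if abs(reciprocal_sum - 1.0) > 1e-12:
        raise ValueError("exponents are not conjugate: reciprocals must sum to 1")
    product_l1 = sum(abs(math.prod(column)) for column in zip(*functions))
    bound = math.prod(p_norm(g, q) for g, q in zip(functions, exponents))
    return bound - product_l1

rng = random.Random(3)
gs = [[rng.uniform(-2, 2) for _ in range(25)] for _ in range(3)]
gap_236 = generalized_holder_gap(gs, (2.0, 3.0, 6.0))         # 1/2 + 1/3 + 1/6 = 1
gap_22inf = generalized_holder_gap(gs, (2.0, 2.0, math.inf))  # uses 1/inf = 0
print(gap_236 >= 0.0, gap_22inf >= 0.0)
```

The second call exercises the convention $\frac{1}{\infty} = 0$, which is what lets the $L^\infty$ case sit in the same statement as the finite exponents.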
It can be verified (Problem 4.14) that the two versions are equivalent. The proof of one of them is left as an exercise (Problem 4.15).
PROBLEMS
4.1 Let $\{f_n\} \subseteq L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, where $1 \le p < \infty$. Show that the following inequality holds:
$$\Big\|\sum_{n=1}^{\infty} f_n\Big\|_p \le \sum_{n=1}^{\infty} \|f_n\|_p.$$
4.2 Let $\{f_n\} \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$ be a sequence of functions such that $f_n \to f \in L^p$ $\mu$-a.e. pointwise. Prove that if $f_n \xrightarrow{L^p} f$ (i.e. $\|f_n - f\|_p \to 0$), then $\|f_n\|_p \to \|f\|_p$.
4.3 Prove the converse to Problem 4.2. [Hint: Apply Fatou's Lemma 2.4, Chapter 6.]
4.4 Show that if $\{f_n\} \subseteq L^p$ is a Cauchy sequence, then it is uniformly bounded.
4.5 [...] $g_n \to g \in L^q$, and let $p$ and $q$ be conjugate exponents. Prove that [...]
4.6 Show that in the Riesz-Fischer Theorem, $\{f_n\}$ need not converge to $f$ $\mu$-a.e. in the topology of pointwise convergence.
4.7 Show that $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is a lattice, i.e., if $f, g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ then also $f \vee g,\ f \wedge g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$.
4.8 Prove that the set $\Psi^p(\Omega,\Sigma,\mu;\mathbb{C}) = \Psi(\Omega,\Sigma;\mathbb{C}) \cap L^p(\Omega,\Sigma,\mu;\mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega,\Sigma,\mu;\mathbb{C})$.
4.9 Show that $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ is a lattice.
4.10 Prove the Hölder Inequality for $L^\infty$ spaces (Proposition 4.19).
4.11 Let $\{f, f_n\} \subseteq L^\infty(\Omega,\Sigma,\mu)$ be such that $f_n \xrightarrow{L^\infty} f$. Show that there is an $A \in \mathcal{N}_\mu$ such that $f_n \to f$ uniformly on $A^c$.
4.12 Prove the converse of the statement in Problem 4.11: Given $\{f, f_n\} \subseteq L^\infty(\Omega,\Sigma,\mu)$, suppose there is an $A \in \mathcal{N}_\mu$ such that $f_n \to f$ uniformly on $A^c$. Show that $f_n \xrightarrow{L^\infty} f$.
4.13 Prove that $\{f_n\} \subseteq L^\infty(\Omega,\Sigma,\mu)$ is a Cauchy sequence if and only if there is an $A \in \mathcal{N}_\mu$ such that $f_n - f_m \to 0$ uniformly on $A^c$.
4.14 Show that the two versions (Propositions 4.23 and 4.24) of the generalized Hölder Inequality are equivalent.
4.15 Prove one of the versions of the generalized Hölder Inequality.
4.16 Let $(\Omega,\Sigma,\mu)$ be as follows: $\Omega = \mathbb{R}_+$, $\Sigma = \mathfrak{B}_+$, $\mu = \mathrm{Res}_{\mathfrak{B}_+}\lambda$, and let
where $A = [0,\frac{1}{n}]$, $B = (n, n^2]$, $n \ge 2$. Investigate whether the sequence $\{f_n\}$ is $L^\infty$-convergent and, if the answer is yes, give a version of its $L^\infty$-limit. Repeat the same investigation with respect to the $L^1$ space.
NEW TERMS:
$L^p(\Omega,\Sigma,\mu;\mathbb{C})$ space
$L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ space
semi-norm
conjugate exponents
Hölder's inequality for $L^p$ spaces
Cauchy-Schwarz inequality
Minkowski's inequality
convergence in the $p$th mean ($L^p$-convergence)
$L^p$-convergence (convergence in the $p$th mean)
Lebesgue Dominated Convergence Theorem for $L^p$ spaces
Riesz-Fischer Theorem
integrable simple function
essential bound
essentially bounded function
essential supremum of a function
$L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ space
$L^\infty(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ space
$\|\cdot\|_\infty$ semi-norm
Hölder's inequality for $L^\infty$ spaces
generalized Hölder's inequalities
5. MODES OF CONVERGENCE
In this section, we explore other forms of convergence initiated in Section 2, Chapter 6. Many of them find frequent application in analysis and probability.
5.1 Definitions. Let $\{f_n\}$ be a $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$-sequence and let $\mu$ be a measure on $(\Omega,\Sigma)$.
(i) $\{f_n\}$ is said to converge to a function $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ in measure if for each $\varepsilon > 0$,
$$\lim_{n\to\infty}\mu(\{|f_n - f| \ge \varepsilon\}) = 0,$$
in notation, $f_n \xrightarrow{\mu} f$. The function $f$ is called a $\mu$-limit of $\{f_n\}$.
(ii) $\{f_n\}$ is said to be Cauchy in measure if for each $\varepsilon > 0$,
$$\lim_{n,m\to\infty}\mu(\{|f_n - f_m| \ge \varepsilon\}) = 0.$$
(iii) $\{f_n\}$ is said to converge almost uniformly to $f$ (in notation, $f_n \to f$ $\mu$-a.u.) if for each $\varepsilon > 0$ there is a set $A\ (= A(\varepsilon)) \in \Sigma$ such that $\mu(A) < \varepsilon$ and $f_n \to f$ uniformly on $A^c$.
We begin with the statement that "almost uniform convergence implies convergence in measure," which is quite obvious; its proof is left as an exercise (Problem 5.1).
5.2 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $f_n \to f$ $\mu$-a.u. on $\Omega$. Then $f_n \xrightarrow{\mu} f$.
Next, we will need the following
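5.3 Lemma (Chebyshev's Inequality). If $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ and $0 < p < \infty$, then for each $\varepsilon > 0$,
$$\mu(\{|f| \ge \varepsilon\}) \le \frac{1}{\varepsilon^p}\int |f|^p\,d\mu. \tag{5.3}$$
Proof. Let $A := \{|f| \ge \varepsilon\}$. Then,
$$|f|^p = |f|^p\mathbf{1}_A + |f|^p\mathbf{1}_{A^c}$$
and thus
$$\int |f|^p\,d\mu \ge \int_A |f|^p\,d\mu \ge \varepsilon^p\mu(A).$$
If $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ for $p \ge 1$, then from (5.3) it follows that
$$\mu(\{|f| \ge \varepsilon\}) \le \frac{1}{\varepsilon^p}\|f\|_p^p.$$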
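Chebyshev's inequality, in the standard form $\mu(\{|f| \ge \varepsilon\}) \le \varepsilon^{-p}\int |f|^p\,d\mu$, is easy to test on a finite weighted set. The following sketch assumes such a discrete measure; `chebyshev_sides` is a hypothetical helper that returns both sides of the inequality.

```python
import random

def chebyshev_sides(values, weights, p, eps):
    """Return (mu{|f| >= eps}, eps^-p * integral of |f|^p) for a weighted finite set."""
    measure_of_set = sum(w for v, w in zip(values, weights) if abs(v) >= eps)
    bound = sum(w * abs(v) ** p for v, w in zip(values, weights)) / eps ** p
    return measure_of_set, bound

rng = random.Random(4)
weights = [rng.uniform(0.0, 1.0) for _ in range(100)]
values = [rng.gauss(0.0, 2.0) for _ in range(100)]
results = [
    chebyshev_sides(values, weights, p, eps)
    for p in (0.5, 1.0, 2.0)
    for eps in (0.5, 1.0, 3.0)
]
print(all(lhs <= rhs + 1e-12 for lhs, rhs in results))
```

Applied to $f_n - f$, the same two quantities show why a small $\|f_n - f\|_p$ forces the set where $|f_n - f| \ge \varepsilon$ to have small measure.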
Another noteworthy consequence of Chebyshev's Inequality is
5.4 Proposition. Let $\{f_n\}, f \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$, for $1 \le p < \infty$. If $f_n \xrightarrow{L^p} f$, then $f_n \xrightarrow{\mu} f$.
Proof. The statement follows directly from Chebyshev's Inequality applied to $f_n - f$. □
The converse of the last proposition does not hold, as we learn from the following example.
5.5 Examples.
(i)
1
Let f n = F(0,n)' i l Then, f,
-t
0. Let
E
E (0,l). Then,
and thus
that yields $\lim_{n\to\infty}\lambda(\{|f_n - 0| \ge \varepsilon\}) = 0$ for all $\varepsilon \in (0,1)$.
However,
(ii) Pointwise convergence does not imply convergence in measure. Let $f_n = \mathbf{1}_{(n,n+1)}$. Then, $\{f_n\}$ converges to $0$ pointwise. However, for every $\varepsilon \in (0,1)$, $\{f_n \ge \varepsilon\} = (n, n+1)$ and $\lambda(\{f_n \ge \varepsilon\}) = 1$ for all $n$. The $L^p$-convergence does not hold either in this case. □
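The "escaping bump" behaviour can be observed numerically. The sketch below assumes $f_n$ is the indicator of $(n, n+1)$ on $[0,\infty)$ and approximates Lebesgue measure by a midpoint grid on a bounded window; the window size, the step, and the function names are illustrative assumptions.

```python
def f(n, x):
    """Indicator function of the open interval (n, n + 1)."""
    return 1.0 if n < x < n + 1 else 0.0

def approx_measure(n, eps, upper=50.0, step=0.001):
    """Approximate lambda({f_n >= eps}) on [0, upper] by counting grid midpoints."""
    count = int(upper / step)
    hits = sum(1 for i in range(count) if f(n, (i + 0.5) * step) >= eps)
    return hits * step

# At a fixed point the sequence is eventually 0 (pointwise convergence) ...
print([f(n, 3.5) for n in range(1, 8)])
# ... yet the set where f_n >= 1/2 keeps measure (approximately) 1.
print([round(approx_measure(n, 0.5), 3) for n in (1, 7, 30)])
```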
5.6 Theorem. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\}, f \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. If $f_n \to f$ $\mu$-a.u., then $f_n \to f$ $\mu$-a.e. pointwise.
Proof. Almost uniform convergence means that for each $k$, there is a
measurable set Ak such that p(Ak) < Denote
and f,+ f uniformly on A%.
Then p(Bk) < and f,
-. f
p-a.u. on Bi. Hence
f n +f
pointwise on
(but not necessarily uniformly on BC).On the other hand, since p(Bk) is finite, by continuity from above,
The converse of this statement, as we know from analysis, does not hold true unless $\mu$ is finite, as the following widely cited theorem states.
5.7 Theorem (Egorov). Let $(\Omega,\Sigma,\mu)$ be a finite measure space and $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $\{f_n\}$ converges to $f$ $\mu$-a.e. pointwise. Then $\{f_n\}$ converges to $f$ $\mu$-a.u.
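Proof. By the assumption, there is a $\mu$-null set $N$ such that $\{f_n\}$ converges to $f$ pointwise on $N^c$. Define
$$A_{jn} = \bigcup_{i=n}^{\infty}\Big\{\omega \in N^c : |f(\omega) - f_i(\omega)| \ge \tfrac{1}{j}\Big\}.$$
Clearly, the sequence $\{A_{jn} : n = 1,2,\ldots\}$ is monotone nonincreasing for each $j$, and since $\{f_n\}$ converges to $f$ pointwise on $N^c$, for every $j$ we have that $\{A_{jn}\}_n \downarrow \emptyset$. Since $\mu$ is finite, by $\emptyset$-continuity,
$$\lim_{n\to\infty}\mu(A_{jn}) = 0.$$
Let $\varepsilon > 0$ be chosen and let $n_j$ be such that $\mu(A_{jn_j}) < \frac{\varepsilon}{2^j}$. Denote
$$A = N \cup \bigcup_{j=1}^{\infty} A_{jn_j}.$$
Then, $\mu(A) < \varepsilon$, and if $\omega \notin A$, it follows from the definition of $A_{jn}$ and $A$ that for every $j$,
$$|f_i(\omega) - f(\omega)| < \tfrac{1}{j} \quad \text{for all } i \ge n_j,$$
and therefore $\{f_n\}$ converges to $f$ uniformly on $A^c$. □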
5.8 Proposition. Let $(\Omega,\Sigma,\mu)$ be a finite measure space and let $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be such that $f_n \to f$ $\mu$-a.e. pointwise on $\Omega$. Then, $f_n \xrightarrow{\mu} f$.
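Proof. Let
$$A_n(\varepsilon) := \{|f_n - f| \ge \varepsilon\}, \qquad E_n(\varepsilon) := \bigcup_{k=n}^{\infty} A_k(\varepsilon), \qquad A(\varepsilon) := \varlimsup_{n\to\infty} A_n(\varepsilon) = \bigcap_{n=1}^{\infty} E_n(\varepsilon).$$
Then,
$$\{f_n \not\to f\} = \bigcup_{\varepsilon > 0} A(\varepsilon) = \lim_{\varepsilon \downarrow 0} A(\varepsilon). \tag{5.8}$$
Indeed, $\omega \in \{f_n \not\to f\}$ if and only if there is a $\delta > 0$ and a subsequence $\{f_{n_j}(\omega)\}$ such that for all $\varepsilon \le \delta$,
$$|f_{n_j}(\omega) - f(\omega)| \ge \varepsilon, \quad j = 1,2,\ldots.$$
The latter is equivalent to $\omega \in A(\varepsilon)$ for all $\varepsilon \le \delta$. Finally, (5.8) is due to the fact that $\{A(\varepsilon)\}\uparrow$ for $\varepsilon \downarrow 0$. Consequently, for each $\varepsilon > 0$,
$$\mu(A(\varepsilon)) \le \mu(\{f_n \not\to f\}) = 0. \tag{5.8a}$$
Since, by our assumption, $\mu$ is finite, due to (5.8a) and by continuity from above,
$$\lim_{n\to\infty}\mu(E_n(\varepsilon)) = 0. \tag{5.8b}$$
Finally, (5.8b) and the inclusion $A_n(\varepsilon) \subseteq E_n(\varepsilon)$ yield that for each $\varepsilon > 0$,
$$\lim_{n\to\infty}\mu(A_n(\varepsilon)) = 0,$$
and thus the convergence of $\{f_n\}$ to $f$ in measure. □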
The converse of this proposition is a much weaker statement: convergence of $\{f_n\}$ to $f$ in measure guarantees just the existence of a subsequence of $\{f_n\}$ convergent to $f$ $\mu$-a.e.
5.9 Theorem. Let $(\Omega,\Sigma,\mu)$ be a measure space and $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be a Cauchy sequence in measure. Then:
(i) there exists a measurable function $f$ to which the sequence $\{f_n\}$ converges in measure;
(ii) there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ that converges to $f$ $\mu$-a.e. in the topology of pointwise convergence;
(iii) if $f_n \xrightarrow{\mu} g$, then $g \in [f]_\mu$.
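Proof. Since $\{f_n\}$ is Cauchy in $\mu$, for each $\varepsilon > 0$ and $\delta > 0$, there is an $N$ such that
$$\mu(\{|f_n - f_m| \ge \varepsilon\}) < \delta \quad \text{for all } n, m \ge N.$$
In terms of $\varepsilon = \delta = \frac{1}{2^k}$ the above can be reformulated as follows: for each $k$ there is a tail $T_k = \{f_n : n \ge N_k\}$, with $N_1 \le N_2 \le \cdots$, such that $\mu(\{|u - v| \ge \frac{1}{2^k}\}) < \frac{1}{2^k}$ for all $u, v \in T_k$.
Now, choose one $h_k := f_{n_k} \in T_k$ for each $k$, with $n_1 < n_2 < \cdots$. Since $\{T_k\}$ is monotone nonincreasing, $h_k$ and $h_{k+1}$ are elements of $T_k$, and thus the subsequence $\{h_k\}$ of $\{f_n\}$ is such that for each $k = 1,2,\ldots$,
$$\mu(A_k) < \frac{1}{2^k}, \quad \text{where } A_k := \Big\{|h_k - h_{k+1}| \ge \frac{1}{2^k}\Big\}. \tag{5.9}$$
Let
$$B_s := \bigcup_{j=s}^{\infty} A_j.$$
We will show that for each $s$, $\{h_k\}$ is Cauchy, uniformly on $B_s^c$. First notice that since $|h_i(\omega) - h_{i+1}(\omega)| < \frac{1}{2^i}$ for each $\omega \in A_i^c$, we have for $m > k$
$$|h_k(\omega) - h_m(\omega)| \le \sum_{i=k}^{m-1}|h_i(\omega) - h_{i+1}(\omega)| < \sum_{i=k}^{\infty}\frac{1}{2^i} = \frac{1}{2^{k-1}} \quad \text{for } \omega \notin \bigcup_{i=k}^{\infty} A_i. \tag{5.9a}$$
In other words, given a $\delta > 0$, there is an $N > s$ such that for all $m > k > N$,
$$|h_k(\omega) - h_m(\omega)| < \delta, \quad \text{good for all } \omega \in B_s^c.$$
Consequently, $\{h_k\}$ is Cauchy on $A := \bigcup_{s=1}^{\infty} B_s^c$ in the topology of pointwise convergence. Furthermore, since the sequence $\{B_s\}$ is monotone nonincreasing and, from (5.9), $\mu(B_s) \le \frac{1}{2^{s-1}}$ $(< \infty)$, by continuity from above,
$$\mu(A^c) = \lim_{s\to\infty}\mu(B_s) = 0, \tag{5.9b}$$
and thus $\{h_k\}$ is pointwise Cauchy on $A$, i.e. $\mu$-a.e. Define $f = \mathbf{1}_A\lim_{k\to\infty} h_k$. Clearly, $f$ exists and is finite for each $\omega$, and, by Theorem 5.9 (vi), Chapter 5, $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. From (5.9a) it follows that, for $m \to \infty$,
$$|f(\omega) - h_k(\omega)| \le \frac{1}{2^{k-1}} \quad \text{for all } \omega \in B_s^c \text{ and } k \ge s,$$
and hence, because of (5.9b), $h_k \xrightarrow{\mu} f$. Moreover, since
$$\{|f_n - f| \ge \varepsilon\} \subseteq \{|f_n - h_k| \ge \tfrac{\varepsilon}{2}\} \cup \{|h_k - f| \ge \tfrac{\varepsilon}{2}\} \tag{5.9c}$$
and because $\{f_n\}$ was assumed to be Cauchy in measure, the measure of each of the sets on the right of inclusion (5.9c) converges to zero. Therefore, $f_n \xrightarrow{\mu} f$.
Finally, let $g$ be yet another $\mu$-limit of $\{f_n\}$. Then, from
$$\{|f - g| \ge \varepsilon\} \subseteq \{|f - f_n| \ge \tfrac{\varepsilon}{2}\} \cup \{|f_n - g| \ge \tfrac{\varepsilon}{2}\}$$
we have $\{|f - g| \ge \varepsilon\} \in \mathcal{N}_\mu$, good for all $\varepsilon > 0$, and thus $g = f$ (mod $\mu$). □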
From Proposition 5.4 and Theorem 5.9 we arrive at
5.10 Corollary. Let $\{f_n\}, f \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$ such that $f_n \xrightarrow{L^p} f$. Then there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ that converges to $f$ $\mu$-a.e. pointwise.
The following proposition is a partial converse of Proposition 5.4 (that $L^p$-convergence implies convergence in measure), valid under one additional condition.
5.11 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, \{f_n\} \subseteq L^p(\Omega,\Sigma,\mu;\mathbb{C})$ such that $f_n \xrightarrow{\mu} f$, and suppose there is an $L^p(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$-function $g$ such that $|f_n| \le g$. Then $f_n \xrightarrow{L^p} f$.
Proof. Since $f_n \xrightarrow{\mu} f$, according to Theorem 5.9, there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges to $f$ $\mu$-a.e. on $\Omega$ in the topology of pointwise convergence. Since $\{f_{n_k}\}$ is dominated by $g$, by Lebesgue's Dominated Convergence Theorem 4.8, $f_{n_k} \xrightarrow{L^p} f$ and $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$.
Suppose that $\{f_n\}$ does not converge to $f$ in $L^p$. Then there is a positive $\varepsilon$ and a subsequence $h_j := f_{n_j}$ of $\{f_n\}$ such that for all $j$ it holds true that
$$\|h_j - f\|_p \ge \varepsilon. \tag{*}$$
On the other hand, since $h_j \xrightarrow{\mu} f$, there is a subsequence $\{h_{j_i}\}$ (of $\{h_j\}$) convergent to $f$ $\mu$-a.e. on $\Omega$ (and also dominated by $g$) and thus, by the Lebesgue Dominated Convergence Theorem, $h_{j_i} \xrightarrow{L^p} f$, thereby directly contradicting (*). □
5.12 Proposition. Let $f_n \xrightarrow{\mu} f$. Then, every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$.
Proof. By the assumption, every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ must converge to $f$ in measure. Then, by Theorem 5.9, $\{f_{n_k}\}$ must have at least one subsequence, say $\{f_{n_{k_j}}\}$, that converges to $f$ $\mu$-a.e. on $\Omega$. □
The converse of Proposition 5.12 requires the finiteness of $\mu$.
5.13 Proposition. Let $\{f_n\}$ be a sequence of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$-functions on a finite measure space $(\Omega,\Sigma,\mu)$. Suppose that every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$. Then, $f_n \xrightarrow{\mu} f$.
Proof. Since $\mu$ is finite, by Proposition 5.8, $f_{n_{k_j}} \xrightarrow{\mu} f$. Therefore, given an $\varepsilon > 0$, every subsequence $\{a_{n_k}\}$ of the numeric sequence $a_n := \mu(\{|f_n - f| \ge \varepsilon\})$ has a subsequence $\{a_{n_{k_j}}\}$ that converges to $0$. Therefore, the numeric sequence $\{a_n\}$ is sequentially compact and (cf. Theorem 6.3, Chapter 2 or Problem 3.9, Chapter 2) converges to $0$ itself. □
The following chart (Figure 5.1) gives an overview of the major convergence modes and their relations and summarizes the theorems and propositions above.
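Figure 5.1 (chart relating the convergence modes: $L^p$-convergence, convergence in measure, $\mu$-a.u. convergence, $\mu$-a.e. convergence, and a.e. convergence of subsequences of every subsequence; some implications are marked as holding only when $\mu$ is finite or when $|f_n| \le g \in L^p$)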
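5.14 Proposition. Let $(\Omega,\Sigma,\mu)$ be a finite measure space and $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $f_n \xrightarrow{\mu} f$. Suppose a function $\varphi : \mathbb{C} \to \mathbb{C}$ is continuous. Then, $\varphi \circ f_n \xrightarrow{\mu} \varphi \circ f$.
Proof. Since $\varphi$ is continuous, $\varphi \circ f, \{\varphi \circ f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. By Proposition 5.12, each subsequence of $\{f_n\}$ has a subsequence, say $\{f_{n_{k_j}}\}$, convergent to $f$ $\mu$-a.e. on $\Omega$. Hence, by continuity of $\varphi$, also $\{\varphi \circ f_{n_{k_j}}\}$ converges to $\varphi \circ f$ $\mu$-a.e. on $\Omega$. Since $\mu$ is finite, the statement is due to Proposition 5.13. □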
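5.15 Proposition. Let $\{f_n\}, \{g_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be two sequences on a measure space $(\Omega,\Sigma,\mu)$ convergent in measure to measurable functions $f$ and $g$, respectively. Then, for any two complex numbers $a$ and $b$,
$$af_n + bg_n \xrightarrow{\mu} af + bg.$$
Proof. From
$$\{|(f_n + g_n) - (f + g)| \ge \varepsilon\} \subseteq \{|f_n - f| \ge \tfrac{\varepsilon}{2}\} \cup \{|g_n - g| \ge \tfrac{\varepsilon}{2}\}$$
we have that
$$\mu(\{|(f_n + g_n) - (f + g)| \ge \varepsilon\}) \le \mu(\{|f_n - f| \ge \tfrac{\varepsilon}{2}\}) + \mu(\{|g_n - g| \ge \tfrac{\varepsilon}{2}\}).$$
Therefore, $f_n + g_n \xrightarrow{\mu} f + g$. Since, for $a \ne 0$, $\{|af_n - af| \ge \varepsilon\} = \{|f_n - f| \ge \varepsilon/|a|\}$, we also have $af_n \xrightarrow{\mu} af$, and the statement follows. □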
5.16 Proposition. Let $\{f_n\}, \{g_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be two sequences on a finite measure space $(\Omega,\Sigma,\mu)$ convergent in measure to measurable functions $f$ and $g$, respectively. Then, $f_ng_n \xrightarrow{\mu} fg$.
Proof. By Proposition 5.12, every subsequence of $\{f_n\}$ contains a subsequence convergent to $f$ $\mu$-a.e. on $\Omega$. Let $\{f_{n_k}\}$ be any subsequence of $\{f_n\}$ and $\{f_{n_{k_j}}\}$ be a subsequence of $\{f_{n_k}\}$ convergent to $f$ $\mu$-a.e. on $\Omega$. Then the subsequence $\{g_{n_{k_j}}\}$ of $\{g_n\}$ has a subsequence $\{G_i := g_{n_{k_{j_i}}}\}$ convergent to $g$ $\mu$-a.e. on $\Omega$. Therefore, the sequence $\{F_iG_i\}$ (where $F_i := f_{n_{k_{j_i}}}$) converges to $fg$ $\mu$-a.e. on $\Omega$.
In summary, we showed that an arbitrary subsequence of $\{f_ng_n\}$ has a subsequence $\{F_iG_i\}$ that converges to $fg$ $\mu$-a.e. on $\Omega$. The statement now follows by Proposition 5.13. □
5.17 Examples.

(i) Let $\Omega = [0,1]$ and let $f_n = e^n \mathbf{1}_{A_n}$, where $A_n = [0, \frac{1}{n}]$. Obviously, $f_n \to 0$ $\lambda$-a.e. pointwise. Therefore, by Proposition 5.8, $f_n \xrightarrow{\lambda} f = 0$. Since $\lambda$ is finite on $\Omega$, by Egorov's Theorem, $f_n \to f = 0$ a.u. However,

$$\|f_n\|_p^p = \int |f_n|^p \, d\lambda = \frac{e^{np}}{n} \to \infty, \quad \text{for } n \to \infty \ (0 < p < \infty).$$

So, the $L^p$-convergence of $\{f_n\}$ does not hold. The same applies to $L^\infty$: $\|f_n\|_\infty = e^n \to \infty$, for $n \to \infty$.

(ii) Let $\Omega = \mathbb{R}_+$, $\Sigma = \mathfrak{B}_+$, $\mu = \operatorname{Res}_{\mathfrak{B}_+} \lambda$, $f_n = \mathbf{1}_{A_n}$, and $A_n = [n, n + \frac{1}{n}]$. Clearly, $f_n \to 0$ $\lambda$-a.e. pointwise, and hence, by Proposition 5.8, $f_n \xrightarrow{\lambda} 0$. Furthermore,

$$\|f_n\|_p^p = \int |f_n|^p \, d\lambda = \frac{1}{n} \to 0, \quad n \to \infty \ (0 < p < \infty).$$

Since $\|f_n\|_\infty = 1$ for all $n$, $f_n \not\to 0$ in $L^\infty(\Omega, \Sigma, \mu; \mathbb{R})$. However, $\{f_n\}$ does not converge almost uniformly on $\Omega$. Assume the opposite, i.e., suppose there is an $A \in \Sigma$ such that $\lambda(A) < \varepsilon$ and $\{f_n\}$ converges uniformly to 0 on $A^c$. Clearly, then $f_n$ should be less than one on $A^c$ for sufficiently large $n$, which implies that, for sufficiently large $n$, $A^c \cap \bigcup_{i=n}^{\infty} A_i = \emptyset$. Thus, $\bigcup_{i=n}^{\infty} A_i \subseteq A$ and $\lambda(A) \ge \sum_{i=n}^{\infty} \lambda(A_i) = \infty$ (since the $A_i$'s are disjoint), which is a contradiction.

(iii) The following is an application of two major convergence modes to probability. Let $\{X_n\}$ be a sequence of $L^1(\Omega, \Sigma, P; \mathbb{R})$-random variables. Construct the sequence

$$f_n = \frac{1}{n} \sum_{k=1}^{n} \left( X_k - \mathbb{E}[X_k] \right) \tag{5.17}$$

and denote $f := 0$. If $f_n \xrightarrow{P} f$ in measure, we say that $\{f_n\}$ converges to $f$ in probability (also called stochastic convergence) and, in this particular case, we say that the sequence $\{X_n\}$ obeys the Weak Law of Large Numbers. If the sequence in (5.17) is such that $f_n \to f$ $P$-a.e. on $\Omega$ (more precisely, $P$-almost surely or $P$-a.s.), then $\{X_n\}$ is said to obey the Strong Law of Large Numbers. Due to Proposition 5.8, the Strong Law of Large Numbers implies the Weak Law of Large Numbers, thereby justifying their names. In the special case when the random variables $\{X_n\}$ share a common mean, say $m$, the convergence of $f_n$ to 0 means that the average value

$$\mu_n := \frac{1}{n} \sum_{k=1}^{n} X_k$$

of the sequence converges to $m$ (weakly or strongly) and therefore becomes constant in the limit. This is often used in statistics to estimate the unknown mean $m$ of a population (by $\mu_n$). Notice that the Central Limit Theorem is also applied as a practical tool to estimate the sample size within a given significance level. Finally, the reader is referred to standard textbooks in probability for various sufficient conditions under which the Weak and Strong Laws of Large Numbers hold.
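As a numerical companion to Examples 5.17 (i) and (ii), the following minimal Python sketch evaluates the relevant quantities in closed form. It assumes the readings $f_n = e^n \mathbf{1}_{[0,1/n]}$ for (i) and $f_n = \mathbf{1}_{[n,\,n+1/n]}$ for (ii); all function names are illustrative and do not come from the text.

```python
import math


def lp_norm_example_i(n: int, p: float) -> float:
    """||f_n||_p for f_n = e^n * 1_[0, 1/n] on [0, 1] (assumed reading of (i))."""
    # The integral of |f_n|^p is e^{np} * lambda([0, 1/n]) = e^{np} / n.
    return (math.exp(n * p) / n) ** (1.0 / p)


def measure_where_large_i(n: int, eps: float) -> float:
    """lambda{|f_n| >= eps} in (i): the whole support [0, 1/n] once e^n >= eps."""
    return 1.0 / n if math.exp(n) >= eps else 0.0


def lp_norm_example_ii(n: int, p: float) -> float:
    """||f_n||_p for f_n = 1_[n, n + 1/n] on R_+ (assumed reading of (ii))."""
    # The integral of |f_n|^p is lambda(A_n) = 1/n for every p > 0.
    return (1.0 / n) ** (1.0 / p)


def tail_measure_example_ii(n: int, m: int) -> float:
    """lambda(A_n u ... u A_m) = sum of 1/i, since the A_i overlap at most in one point."""
    return sum(1.0 / i for i in range(n, m + 1))
```

In (i) the measure of the set where $f_n$ is large shrinks like $1/n$ while the $L^p$ norm grows; in (ii) the $L^p$ norm shrinks, but the tail unions have harmonic (hence unbounded) measure, which is what rules out almost uniform convergence.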
PROBLEMS

5.1 Prove Proposition 5.2.

5.2 Show that $f_n \xrightarrow{\mu} f$ implies that $\{f_n\}$ is Cauchy in measure.

5.3 Give an example of a sequence convergent in measure but not in $L^p$.

5.4 Let $L^p(\Omega, \Sigma, \mu; \mathbb{R})$ be as follows: $\Omega = [0,1]$, $\Sigma = \mathfrak{B} \cap [0,1]$, $\mu = \operatorname{Res}_{\Sigma} \lambda$, and $p \ge 1$. Define a sequence $\{f_n\}$ in $L^p$ as $f_n(x) = n \mathbf{1}_{A_n}(x)$, $A_n := [0, \frac{1}{n})$. Show that the $\mu$-limit of $\{f_n\}$ is 0, but the $L^p$-limit of $\{f_n\}$ is not, for all $p \ge 1$ (including $\infty$).

5.5 Let $L^p(\Omega, \Sigma, \mu; \mathbb{R})$ be as follows: $\Omega = \mathbb{R}$, $\Sigma = \mathfrak{B}$, $\mu = \lambda$, $p > 0$. Define $f_n(x) := \frac{1}{n} \mathbf{1}_{A_n}(x)$, $A_n := [0, e^n]$. Find $\|f_n\|_\infty$. Show that $f_n \to 0$ uniformly on $\mathbb{R}$, $f_n \to 0$ in $L^\infty$, $f_n \to 0$ $\lambda$-a.e., and $\mu\text{-}\lim f_n = 0$. However, show that $f_n$ fails to converge in $L^p$ $(0 < p < \infty)$.

5.6 Define $\Omega = [0,1]$, $\Sigma = \mathfrak{B} \cap [0,1]$, $\mu = \operatorname{Res}_{\Sigma} \lambda$, $p > 0$. Define $f_{nm} = \mathbf{1}_{A_{nm}}$, where $A_{nm} = [\frac{m-1}{n}, \frac{m}{n}]$, $m = 1, \ldots, n$, $n = 1, 2, \ldots$. Show that the sequence $\{f_{nm};\ m = 1, \ldots, n,\ n = 1, 2, \ldots\}$ converges to 0 in the $p$th mean but does not converge $\lambda$-a.e., not a.u., and not in $L^\infty$.
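The behaviour asked for in Problem 5.6 can be explored with a small exact-arithmetic sketch. It assumes the reading $f_{nm} = \mathbf{1}_{[(m-1)/n,\, m/n]}$ on $[0,1]$ with Lebesgue measure; the helper names are hypothetical and are not taken from the text.

```python
from fractions import Fraction


def interval(n, m):
    """Endpoints of A_{nm} = [(m-1)/n, m/n] as exact fractions."""
    return Fraction(m - 1, n), Fraction(m, n)


def f(n, m, x):
    """Value of the indicator f_{nm} at the point x."""
    a, b = interval(n, m)
    return 1 if a <= x <= b else 0


def pth_mean(n, m, p):
    """Integral of |f_{nm}|^p: an indicator raised to any p > 0 is unchanged,
    so the result is lambda(A_{nm}) = 1/n regardless of p."""
    a, b = interval(n, m)
    return b - a


def rows_hitting(x, n_max):
    """Rows n <= n_max in which at least one f_{nm} equals 1 at x."""
    return [n for n in range(1, n_max + 1)
            if any(f(n, m, x) for m in range(1, n + 1))]
```

Every row $n$ covers $[0,1]$, so each point is hit in every row while most functions of that row vanish there. The values at a fixed point therefore oscillate between 0 and 1, although the $p$th means $1/n$ tend to 0.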
NEW TERMS:
convergence in measure
sequence of functions Cauchy in measure
almost uniform convergence of a sequence of functions
Chebyshev's inequality
Egorov's Theorem
convergence in probability (stochastic convergence)
stochastic convergence (convergence in probability)
Weak Law of Large Numbers
Strong Law of Large Numbers
6. UNIFORM INTEGRABILITY

Uniform integrability bears some resemblance to equicontinuity, as it applies to a family of functions. Recall that Problem 1.22, Chapter 6, states that a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a measure space $(\Omega, \Sigma, \mu)$ is integrable if and only if for each $\varepsilon > 0$, there is $g \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$ such that

$$\int_{\{|f| \ge g\}} |f| \, d\mu \le \varepsilon.$$

This is a motivation for the notion of uniform integrability of a family of integrable functions, for all of which such a function $g$ exists, given any positive $\varepsilon$.

6.1 Definition. A family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ of functions is said to be uniformly integrable with respect to a measure $\mu$ on $(\Omega, \Sigma)$ if for each $\varepsilon > 0$, there is $g \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$ such that for every $f \in \Phi$,

$$\int_{\{|f| \ge g\}} |f| \, d\mu \le \varepsilon. \tag{6.1}$$

The function $g$ is said to be an $\varepsilon$-bound of $\Phi$.
6.2 Remark. If $(\Omega, \Sigma, \mu)$ is a finite measure space, then Problem 1.22 of Chapter 6 can be restated as: a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a finite measure space $(\Omega, \Sigma, \mu)$ is integrable if and only if for each $\varepsilon > 0$, there is a nonnegative number $N$ such that

$$\int_{\{|f| \ge N\}} |f| \, d\mu \le \varepsilon. \tag{6.2}$$

Consequently, a family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ is uniformly integrable with respect to a finite measure $\mu$ if for every $\varepsilon > 0$, there is a nonnegative number $N$ such that for every $f \in \Phi$, (6.2) holds true. This second variant of uniform integrability was originally introduced in connection with martingale theory in probability. Definition 6.1 is therefore more general.

6.3 Examples.

(i) A finite set $\Phi = \{f_1, \ldots, f_n\}$ of $L^1$-functions forms a uniformly integrable family. Indeed, given an $\varepsilon > 0$, by Problem 1.22, Chapter 6, each $f_i$ has an $\varepsilon$-bound $g_i$. Therefore, $g = g_1 \vee \ldots \vee g_n$ is an $\varepsilon$-bound of $\Phi$. More generally, replacing $f_i$ by a uniformly integrable family $\Phi_i$ of functions, we deduce that the finite union of uniformly integrable families of functions is uniformly integrable.
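(ii) In the Lebesgue Dominated Convergence Theorem, a sequence $\{f_n\}$, dominated by a nonnegative $L^1$-function $g$, is uniformly integrable. Indeed, since for each $n$, $|f_n| \le g$ a.e., we have that

$$\int_{\{|f_n| \ge 2g\}} |f_n| \, d\mu \le \int_{\{g \ge 2g\}} g \, d\mu = \int_{\{g = 0\}} g \, d\mu = 0,$$

so that $2g$ is an $\varepsilon$-bound of $\{f_n\}$ for every $\varepsilon > 0$.

However, it is not true that a uniformly integrable family must be dominated by an integrable function. Consider a finite measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ such that $\mu(\{n\}) = 2^{-n}$, $n = 1, 2, \ldots$, and a sequence $\{f_n\}$ of measurable functions defined as

$$f_n = \frac{2^n}{n} \mathbf{1}_{\{n\}}.$$

We will show that $\{f_n\}$ is uniformly integrable by using the definition of Remark 6.2. Let $N \ge 0$. Then, since, obviously,

$$\mathbf{1}_{\{|f_n| \ge N\}}(k) = \begin{cases} 1, & \frac{2^n}{n} \ge N,\ k = n, \\ 0, & \text{otherwise} \end{cases}$$

holds for $N > 0$, we have $\mathbf{1}_{\{|f_n| \ge N\}} |f_n| \le \mathbf{1}_{\{n\}} |f_n|$ for every $N \ge 0$, leading to

$$\int_{\{|f_n| \ge N\}} |f_n| \, d\mu \le \int \mathbf{1}_{\{n\}} |f_n| \, d\mu = \frac{2^n}{n} \, 2^{-n} = \frac{1}{n}.$$

Consequently, given an $\varepsilon > 0$, (6.2) holds with any $N \ge 0$ for all $n > \frac{1}{\varepsilon}$, i.e., the tail $\{f_n : n > \frac{1}{\varepsilon}\}$ satisfies (6.2) for this $\varepsilon$. Since the remaining finitely many functions are integrable, by Example 6.3 (i) the whole sequence $\{f_n\}$ is uniformly integrable. On the other hand, $g(k) = \frac{2^k}{k}$ is evidently the smallest function of those dominating the sequence $\{f_n\}$, and it is not $\mu$-integrable. Indeed,

$$g = \sup_n \sum_{k=1}^{n} \frac{2^k}{k} \mathbf{1}_{\{k\}}$$

and

$$\int g \, d\mu = \sup_n \sum_{k=1}^{n} \int \frac{2^k}{k} \mathbf{1}_{\{k\}}(\omega) \, \mu(d\omega) = \sum_{k=1}^{\infty} \frac{1}{k} = \infty.$$

Therefore, there is no integrable dominating function for $\{f_n\}$.

We immediately observe that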
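The counterexample of Example 6.3 (ii) lends itself to an exact check. The sketch below assumes the sequence is $f_n = \frac{2^n}{n} \mathbf{1}_{\{n\}}$ on $\mathbb{N}$ with $\mu(\{n\}) = 2^{-n}$, which is the reading suggested by the minimal dominating function and the divergent harmonic series in the example; the function names are illustrative only.

```python
from fractions import Fraction


def mu_point(k):
    """Mass of the atom {k}: mu({k}) = 2^{-k}."""
    return Fraction(1, 2 ** k)


def f(n, k):
    """Assumed sequence f_n = (2^n / n) * 1_{{n}}, evaluated at k."""
    return Fraction(2 ** n, n) if k == n else Fraction(0)


def tail_integral(n, N):
    """Integral of |f_n| over {|f_n| >= N}; only the atom {n} can contribute."""
    value = f(n, n)
    return value * mu_point(n) if value >= N else Fraction(0)


def dominating_partial_integral(K):
    """Integral over {1, ..., K} of g(k) = 2^k / k = sup_n f_n(k)."""
    return sum(Fraction(2 ** k, k) * mu_point(k) for k in range(1, K + 1))
```

The tail integrals never exceed $1/n$, whatever the level $N$, while the partial integrals of the smallest dominating function are the harmonic partial sums and so grow without bound.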
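6.4 Proposition. If a family $\Phi$ is uniformly integrable, then

$$\sup\left\{ \int |f| \, d\mu : f \in \Phi \right\} < \infty.$$

Proof. Indeed, given an $\varepsilon > 0$, let $g$ be an $\varepsilon$-bound of $\Phi$. Then, for every $f \in \Phi$,

$$\int |f| \, d\mu = \int_{\{|f| < g\}} |f| \, d\mu + \int_{\{|f| \ge g\}} |f| \, d\mu \le \int g \, d\mu + \varepsilon < \infty.$$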
The following is a useful criterion of uniform integrability for a sequence of functions on a finite measure space. We start with

6.5 Definition. A sequence $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ on a measure space $(\Omega, \Sigma, \mu)$ is said to be uniformly continuous in $L^1$ if $\int_A |f_n| \, d\mu \to 0$ with $\mu(A) \to 0$ uniformly in $n$.
6.6 Theorem. Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ be a sequence of functions on a finite measure space $(\Omega, \Sigma, \mu)$. $\{f_n\}$ is uniformly integrable if and only if it is uniformly continuous in $L^1$ and the integrals $\int |f_n| \, d\mu$ are uniformly bounded.

Proof. 1. Let $\{f_n\}$ be uniformly continuous in $L^1$ and the integrals $\int |f_n| \, d\mu$ be uniformly bounded. Then, by Chebyshev's Inequality (Lemma 5.3) and due to uniform boundedness,

$$\mu\{|f_n| \ge N\} \le \frac{1}{N} \|f_n\|_1 \to 0, \quad \text{for } N \to \infty, \text{ for all } n.$$

Hence, $\mu\{|f_n| \ge N\} \to 0$ as $N \to \infty$, uniformly in $n$, and this implies that

$$\int_{\{|f_n| \ge N\}} |f_n| \, d\mu \to 0, \quad \text{for all } n,$$

by uniform continuity. The latter leads to uniform integrability of $\{f_n\}$.

2. Let $\{f_n\}$ be uniformly integrable. Then, for each measurable set $A$ and each $N \ge 0$,

$$\int_A |f_n| \, d\mu \le N \mu(A) + \int_{\{|f_n| \ge N\}} |f_n| \, d\mu. \tag{6.6}$$

By uniform integrability of $\{f_n\}$, $N$ can be chosen such that

$$\int_{\{|f_n| \ge N\}} |f_n| \, d\mu < \frac{\varepsilon}{2}, \quad \text{for all } n.$$

If $\mu(A) < \frac{\varepsilon}{2N}$, then from (6.6) we have that $\int_A |f_n| \, d\mu < \varepsilon$ and thus $\{f_n\}$ is uniformly continuous in $L^1$. The uniform boundedness is due to Proposition 6.4.

Now, we prove another criterion of uniform integrability for arbitrary measures generalizing Theorem 6.6.
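6.7 Theorem. A family $\Phi \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ is uniformly $\mu$-integrable if and only if the following two conditions hold:

A) There is a constant $M < \infty$ such that $\int |f| \, d\mu \le M$ for all $f \in \Phi$.

B) For each $\varepsilon > 0$, there is a nonnegative $L^1$-function $\varphi$ and $\delta > 0$ such that for each measurable set $A$ with $\int_A \varphi \, d\mu \le \delta$, $\int_A |f| \, d\mu \le \varepsilon$ uniformly for all $f \in \Phi$.

Proof. 1. Suppose conditions A) and B) are met. For each $c > 0$ and $f \in \Phi$,

$$\int_{\{|f| \ge c\varphi\}} \varphi \, d\mu \le \frac{1}{c} \int_{\{|f| \ge c\varphi\}} |f| \, d\mu \le \frac{1}{c} \int |f| \, d\mu.$$

Since by A), $\int |f| \, d\mu \le M$, $c$ can be chosen large enough to have $\int_A \varphi \, d\mu \le \delta$ with $A = \{|f| \ge c\varphi\}$. Then, by B), $\int_A |f| \, d\mu \le \varepsilon$ for all $f$ and thus $c\varphi$ is an $\varepsilon$-bound for $\Phi$.

2. Conversely, let $\Phi$ be uniformly integrable. Since, for each measurable set $A$ and each nonnegative $g$,

$$\int_A |f| \, d\mu = \int_{A \cap \{|f| \ge g\}} |f| \, d\mu + \int_{A \cap \{|f| < g\}} |f| \, d\mu,$$

we have

$$\int_A |f| \, d\mu \le \int_{\{|f| \ge g\}} |f| \, d\mu + \int_A g \, d\mu. \tag{6.7}$$

If $g$ is an $\varepsilon$-bound of $\Phi$, then (6.7) with $A = \Omega$ yields

$$\int |f| \, d\mu \le \varepsilon + \int g \, d\mu < \infty$$

and thus condition A). Taking $\varphi = g$ and $\delta = \varepsilon$, for each measurable set $A$ with $\int_A g \, d\mu < \varepsilon$, we have from (6.7) that $\int_A |f| \, d\mu \le 2\varepsilon$ and thereby condition B).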
6.8 Proposition. Let $\Phi \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ for some $1 \le p < \infty$ and suppose the family $\Phi^p := \{|f|^p : f \in \Phi\}$ is uniformly integrable. Then, for any fixed $a, b \in \mathbb{C}$, the family

$$\{|a f + b g|^p : f, g \in \Phi\}$$

is also uniformly integrable.

Proof. For any $f \in L^p$ and $A \in \Sigma$, $f \mathbf{1}_A \in L^p$. By Minkowski's inequality,

$$\|(f_1 + f_2) \mathbf{1}_A\|_p^p \le \left( \|f_1 \mathbf{1}_A\|_p + \|f_2 \mathbf{1}_A\|_p \right)^p. \tag{6.8}$$

Now, let $f_1 = a f$ and $f_2 = b g$, for some $f, g \in \Phi$. Then, from (6.8) and subsequently, by (4.2),

$$\int_A |a f + b g|^p \, d\mu \le 2^{p-1} \left( |a|^p \int_A |f|^p \, d\mu + |b|^p \int_A |g|^p \, d\mu \right).$$

Therefore, by Theorem 6.7, conditions A) and B) for $|f|^p$ imply those for $|a f + b g|^p$.

By Proposition 5.4, $f_n \xrightarrow{L^p} f$ implies that $f_n \xrightarrow{\mu} f$. The converse of this holds true if $\{|f_n|^p\}$, in addition, is uniformly integrable. The following two versions of the converse are left for the reader.
6.9 Theorem. Let $f, \{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ be a sequence on a finite measure space $(\Omega, \Sigma, \mu)$ such that $f_n \xrightarrow{\mu} f$. If $\{|f_n|^p\}$ is uniformly integrable for some $p > 0$, then $f_n \xrightarrow{L^p} f$.

6.10 Theorem. For each sequence $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$, the following are equivalent:

(i) $\{f_n\}$ is $L^p$-convergent.

(ii) $\{f_n\}$ is convergent in measure and $\{|f_n|^p\}$ is uniformly integrable.
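To see why uniform integrability cannot be dropped from Theorems 6.9 and 6.10, it may help to look at the sequence $f_n = n \mathbf{1}_{[0, 1/n)}$ on $[0,1]$ with Lebesgue measure (the sequence of Problem 5.4, with $p = 1$). The following sketch uses closed-form values only; the function names are hypothetical.

```python
from fractions import Fraction


def measure_large(n, eps):
    """lambda{|f_n| >= eps} for f_n = n * 1_[0, 1/n) and eps > 0."""
    return Fraction(1, n) if n >= eps else Fraction(0)


def l1_norm(n):
    """Integral of |f_n|: height n times length 1/n."""
    return Fraction(n) * Fraction(1, n)


def tail_integral(n, N):
    """Integral of |f_n| over {|f_n| >= N}: all of the mass once n >= N."""
    return l1_norm(n) if n >= N else Fraction(0)


def sup_tail(N, n_max):
    """Largest tail integral among f_1, ..., f_{n_max} at level N."""
    return max(tail_integral(n, N) for n in range(1, n_max + 1))
```

The sequence tends to 0 in measure and is bounded in $L^1$, yet for every level $N$ some tail integral equals 1, so condition (6.2) fails for $\varepsilon < 1$. This is consistent with the fact that $\|f_n - 0\|_1 = 1$ does not tend to 0.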
PROBLEMS

6.1 Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega, \Sigma, \mu)$. (Using Fatou's Lemma) show that

$$\int \varliminf f_n \, d\mu \le \varliminf \int f_n \, d\mu.$$

6.2 Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega, \Sigma, \mu)$. If $f_n \to f$ $\mu$-a.e. on $\Omega$ or in measure, then $f$ is integrable.

6.3 Prove Theorem 6.9.

6.4 Prove Theorem 6.10.

NEW TERMS:
uniformly integrable family of functions
$\varepsilon$-bound of a family of functions
uniformly integrable sequence of functions, a criterion of
7. RADON MEASURES ON LOCALLY COMPACT HAUSDORFF SPACES

We will assume that $(X, \tau)$ is a locally compact Hausdorff topological space, $\mathfrak{B}(X)$ is the Borel $\sigma$-algebra generated by $\tau$, and $\mathfrak{M} = \mathfrak{M}(X, \mathfrak{B}(X))$ is the family of all positive Borel measures on $\mathfrak{B}(X)$. Let $\mathfrak{F} = \mathfrak{F}(X)$ and $\mathfrak{K} = \mathfrak{K}(X)$ be the families of closed and compact sets in $(X, \tau)$, respectively. Unless specified otherwise, by a Borel measure we will understand a positive Borel measure.

7.1 Definitions.

(i) Let $\mu \in \mathfrak{M}$. A Borel set $A$ is called:
a) $\mu$-outer regular if $\mu(A) = \inf\{\mu(G) : G \supseteq A,\ G \in \tau\}$.
b) $\mu$-inner regular if $\mu(A) = \sup\{\mu(K) : K \subseteq A,\ K \in \mathfrak{K}\}$.

(ii) A Borel measure $\mu$ is said to be outer (inner) regular on a subfamily $\mathcal{G} \subseteq \mathfrak{B}(X)$ if all elements of $\mathcal{G}$ are $\mu$-outer (-inner) regular.

(iii) A Borel measure $\mu$ is called weakly regular or Radon if:
a) $\mu$ is finite on $\mathfrak{K}(X)$ (compact sets).
b) $\mu$ is outer regular on $\mathfrak{B}(X)$ (Borel sets).
c) $\mu$ is inner regular on $\tau$ (open sets).

(iv) A Borel measure $\mu$ is called regular if $\mu$ is Radon and it is inner regular on $\mathfrak{B}(X)$.

Denote by $\mathfrak{R} = \mathfrak{R}(X)$ the subfamily of Radon measures on $\mathfrak{B}(X)$.

As we recall, the Radon-Nikodym Theorem inferred that, given two measures $\mu$ and $\nu$ in the relation $\nu \ll \mu$, there is a unique equivalence class of density functions $[f]_\mu$ such that, for each $f \in [f]_\mu$, $\nu = \int f \, d\mu$. Therefore, the integral $\int (\cdot) \, d\mu$ "represents" a function ($f$); more precisely, a class of functions. We will be interested in another representation of the integral. From Section 1 of Chapter 6, we learned that given a measure $\mu$, the integral $f \mapsto \int f \, d\mu =: I(f)$ is a linear functional on $L^1(\Omega, \Sigma, \mu)$. Can a general linear functional be represented by an integral with respect to a particular measure? Rephrasing the latter, can a given linear functional $I$ on a function space $\mathcal{F}$ be associated with some measure, say $\mu$, so that, for each $f \in \mathcal{F}$, $I(f)$ can be (uniquely) represented by the integral with respect to this measure $\mu$? If the answer is yes, this functional will thereby induce a measure ($\mu$) (however, not in the same sense as the "Radon-Nikodym" integral does regarding the measure $\nu$). This can be answered positively if we restrict the space of measurable functions to the vector space $C_c(X)$. (Recall that $C_c(X)$ denotes the subspace of all continuous functions with compact support. We suggest that the reader turn to Section 11 of Chapter 3 for a refresher and notation.) More specifically, given a positive linear functional $I$ on $C_c(X)$, there exists a unique Radon measure $\mu$ on $\mathfrak{B}(X)$ such that $I(f) = \int f \, d\mu$ holds true for all $f \in C_c(X)$.

7.2 Definition and Remark. Let $\mathcal{F} = \mathcal{F}(X; \mathbb{R})$ be the vector space of all real-valued functions on $X$. An operator $[\mathcal{F}, \mathbb{R}, I]$ is referred to as a positive linear functional if $I$ is linear on $\mathcal{F}$ and $I(f) \ge 0$ whenever $f \ge 0$. As a linear and positive functional, $I$ is monotone. Indeed, for $f \le g$, $g - f \ge 0$, and hence, $I(g) = I(g - f) + I(f) \ge I(f)$.
7.3 Notation. Let $I$ be a positive linear functional on $C_c(X) = C_c(X, \tau; \mathbb{R})$. Define a set function $\gamma$ on $\tau$ as

$$\gamma(U) = \sup\{I(f) : f \in C_c(X) \text{ and } f \prec U\} \tag{7.2}$$

and extend it from $\tau$ to $\mathcal{P}(X)$ by introducing

$$\mu^*(Q) = \inf\{\gamma(U) : U \supseteq Q \text{ and } U \text{ open}\}. \tag{7.2a}$$

Recall (see Definition 11.1 (iii), Chapter 3) that a function $f$ with compact support is subordinate to an open set $U$, in notation $f \prec U$, if $0 \le f \le 1$ and $\operatorname{supp} f \subseteq U$.

Furthermore, $f \prec \emptyset$ if and only if $f = 0$. Thus $\mu^*(\emptyset) = \gamma(\emptyset) = 0$. Consequently, (7.2) and (7.2a) define nonnegative set functions on $\tau$ and $\mathcal{P}(X)$, respectively. As we will see, $\mu^*$ is an outer measure induced by $\gamma$ through (7.2a). The latter is not the traditional Carathéodory construction of an outer measure from a pair $(\mathcal{G}, \gamma)$, where $\mathcal{G}$ was a semi-ring and $\gamma$ was $\sigma$-additive. The above extension is rather of a topological nature. Notice that (7.2a) defines outer regularity of $\mu^*$ on $\mathcal{P}(X)$.
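The construction of $\gamma$ in (7.2) can be made concrete in the simplest possible setting. The sketch below assumes a finite discrete space $X = \{0, \ldots, \text{size}-1\}$, where every set is open and compact and every real function belongs to $C_c(X)$, and a positive linear functional given by hypothetical nonnegative weights; none of these names come from the text.

```python
from itertools import combinations


def make_functional(weights):
    """Positive linear functional I(f) = sum_x weights[x] * f[x] (weights >= 0)."""
    def I(f):
        return sum(w * f[x] for x, w in enumerate(weights))
    return I


def gamma(I, U, size):
    """gamma(U) = sup{I(f) : 0 <= f <= 1, supp f inside U}.

    I is linear, so the supremum over the cube [0, 1]^U is attained at a
    vertex, i.e. at the indicator of some subset of U; we scan all of them.
    """
    best = 0
    points = sorted(U)
    for r in range(len(points) + 1):
        for S in combinations(points, r):
            f = [1 if x in S else 0 for x in range(size)]
            best = max(best, I(f))
    return best


def induced_measure(I, size):
    """Point masses mu({x}) = gamma({x}) of the measure induced by I."""
    return [gamma(I, {x}, size) for x in range(size)]
```

In this toy case $\gamma$ recovers the weights, and $I(f)$ coincides with the finite sum $\sum_x f(x)\,\mu(\{x\})$, a discrete instance of the integral representation $I(f) = \int f \, d\mu$.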
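7.4 Proposition. The set function $\mu^*$ defined by (7.2-7.2a) is an outer measure on $\mathcal{P}(X)$.

Proof. If $U$ and $V$ are two open sets such that $U \subseteq V$, then $f \prec U$ yields that $f \prec V$ and therefore,

$$\{f \in C_c(X) : f \prec U\} \subseteq \{f \in C_c(X) : f \prec V\},$$

which yields the monotonicity of $\gamma$ and hence of $\mu^*$. It remains to prove $\sigma$-subadditivity of $\mu^*$. (See Definition 2.1, Chapter 5.) Let $\{Q_k\}$ be a sequence of subsets of $X$, with union $Q$, such that

$$\sum_{k=1}^{\infty} \mu^*(Q_k) < \infty,$$

or else, the inequality

$$\mu^*(Q) \le \sum_{k=1}^{\infty} \mu^*(Q_k)$$

holds true trivially. Given $\varepsilon > 0$, for each $k = 1, 2, \ldots$, there is an open superset $U_k$ of $Q_k$ such that $\gamma(U_k) < \mu^*(Q_k) + \varepsilon / 2^k$. By Corollary 11.6, Chapter 3, there is an $f \in C_c(X)$ such that

$$f \prec U := \bigcup_{k=1}^{\infty} U_k.$$

Then, $K = \operatorname{supp} f \subseteq U$. Since $K$ is compact, $\{U_1, U_2, \ldots\}$ can be reduced to a finite subcover of $K$, say $\{U_1, \ldots, U_n\}$. We can now apply Theorem 11.3, Chapter 3, to $\{U_1, \ldots, U_n\}$ and $K$ on the partition of unity subordinate to this cover. In other words, there is an $n$-tuple $\{f_1, \ldots, f_n\} \subseteq C_c(X)$ subordinate to the cover $\{U_1, \ldots, U_n\}$ for $K$, i.e.,

$$f_i \prec U_i, \ i = 1, \ldots, n, \quad \text{and} \quad K \prec \sum_{i=1}^{n} f_i.$$

Since $\sum_{i=1}^{n} f_i = 1$ on $K$, we have that $f = \sum_{i=1}^{n} f f_i$ and $f f_i \prec U_i$, and hence

$$I(f) = \sum_{i=1}^{n} I(f f_i) \le \sum_{i=1}^{n} \gamma(U_i) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon.$$

The inequality

$$I(f) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon$$

holds true for every $f \in C_c(X)$ such that $f \prec U$, given $U = \bigcup_{k=1}^{\infty} U_k$. Hence,

$$\gamma(U) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon.$$

However, since $\mu^*$ is monotone and $Q \subseteq U$, we have

$$\mu^*(Q) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon,$$

good for all $\varepsilon > 0$. This yields the desired $\sigma$-subadditivity.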
As an outer measure on $\mathcal{P}(X)$, in accordance with Theorem 2.3, Chapter 5, $\mu^*$ generates the $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable sets that "separate" all other subsets of $X$. (See Definition 2.2, Chapter 5.) By the same theorem, $\mu_0^* := \operatorname{Res}_{\Sigma^*} \mu^*$ is a measure on $\Sigma^*$. We are going to show, among other things, that all open sets are $\mu^*$-measurable, which would yield that $\mathfrak{B}(X) \subseteq \Sigma^*$. Therefore, the further restriction of $\mu_0^*$ from $\Sigma^*$ to $\mathfrak{B}(X)$ will make $\mu_0^*$ a Borel measure $\mu$ which, in addition, will turn out to be weakly regular. The latter will be followed by the unique integral representation $I(f) = \int f \, d\mu$ valid for all $f \in C_c(X)$ with respect to the Radon measure $\mu$ induced by $I$. All of this essentially forms the Riesz Representation Theorem, which we will break up into several smaller propositions and theorems. Notice that in the sequence of statements below we shall be using $\mu^*$ whenever applied to sets other than open sets (for which we use its restriction $\gamma$ on $\tau$), as we do not know yet that they belong to $\Sigma^*$.
7.5 Proposition. Let $K$ be a compact set in $(X, \tau)$. Then, there exists a nonnegative function $g \in C_c(X)$ such that $K \prec g$ and $\mu^*(K) \le I(g)$. In particular, $\mu^*$ is finite on $\mathfrak{K}(X)$.

Proof. In accordance with Theorem 10.9, Chapter 3, any compact set in a locally compact Hausdorff space can be covered by finitely many open sets whose closures are compact. Hence, for any compact set $K$, there is an open superset of $K$, say $U$, whose closure $\overline{U}$ is compact. By Corollary 11.5, Chapter 3, given $\overline{U}$, there is a function $g \in C_c(X)$ such that $\mathbf{1}_{\overline{U}} \le g \le 1$. On the other hand, by Corollary 11.4, Chapter 3, there is another continuous function $f$ with compact support such that $K \prec f \prec U$. In particular, $f \le g$ and, by Remark 7.2, $I(f) \le I(g)$ for all such $f$'s. Hence, $\gamma(U) \le I(g)$. Finally, by monotonicity of $\mu^*$,

$$\mu^*(K) \le \gamma(U) \le I(g) < \infty.$$

A very similar result is formulated as follows.

7.6 Proposition. Let $K$ be a compact set in $(X, \tau)$ and $g \in C_c(X)$ such that $g \ge 0$ and $g = 1$ on $K$ (i.e., $g_*(K) = \{1\}$). Then $\mu^*(K) \le I(g)$ and $\mu^*$ is finite on $\mathfrak{K}(X)$.

Notice that, unlike Proposition 7.5, the function $g$ is given and it does not dominate $K$.

Proof. Let $0 < \alpha < 1$ and $U_\alpha = \{x \in X : g(x) > \alpha\}$. Then $U_\alpha$ is an open set containing $K$. By Corollary 11.6, Chapter 3, there is $h \in C_c(X)$ such that $h \prec U_\alpha$. It is readily seen that $\alpha^{-1} g \ge h$. (It is strictly greater on $U_\alpha$ and greater than or equal to elsewhere.) It follows that $\alpha^{-1} I(g) \ge I(h)$, good for all $h \prec U_\alpha$, and therefore for $\sup\{I(h) : h \prec U_\alpha\} = \gamma(U_\alpha)$. From this and by monotonicity of $\mu^*$,

$$\mu^*(K) \le \gamma(U_\alpha) \le \alpha^{-1} I(g).$$

The above inequality holds true for all $\alpha \uparrow 1$. Finally, given $K \in \mathfrak{K}$, by Corollary 11.5, Chapter 3, there is $g \in C_c(X)$ such that $K \prec g$, which yields that $\mu^*(K)$ is finite.
7.7 Lemma. $\mu^*$ is finitely additive on $\mathfrak{K}$.

Proof. Let $K_1$ and $K_2$ be two disjoint compact sets. By Corollary 10.12, Chapter 3, in a locally compact Hausdorff space, $K_1$ and $K_2$ can be separated by two disjoint open supersets, say $U$ and $V$, respectively. Now, for each $\varepsilon > 0$, there is an open superset $W$ of $K_1 + K_2$ such that

$$\mu^*(K_1 + K_2) + \varepsilon > \gamma(W). \tag{7.7}$$

Since $(U + V) \cap W$ covers $K_1 + K_2$, the open sets $U_1 = U \cap W$ and $U_2 = V \cap W$ cover $K_1$ and $K_2$, respectively. By monotonicity of $\gamma$,

$$\gamma(W) \ge \gamma(U_1 + U_2).$$

On the other hand, by Corollary 11.4, Chapter 3, there are $f_1, f_2 \in C_c(X)$ such that $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$. Therefore, by Proposition 7.6,

$$\mu^*(K_1) + \mu^*(K_2) \le I(f_1) + I(f_2) = I(f_1 + f_2). \tag{7.7a}$$

Obviously, in our case, $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$ if and only if $K_1 + K_2 \prec f_1 + f_2 \prec U_1 + U_2$, and hence from (7.7a),

$$\mu^*(K_1) + \mu^*(K_2) \le I(f_1 + f_2) \le \gamma(U_1 + U_2) \le \gamma(W).$$

The latter, combined with (7.7) for $\varepsilon \downarrow 0$, yields

$$\mu^*(K_1) + \mu^*(K_2) \le \mu^*(K_1 + K_2).$$

The inverse inequality is due to subadditivity of $\mu^*$.
7.8 Theorem. $\mu^*$ is inner regular on $\tau$.

Proof. We need to prove that

$$\gamma(U) = \mu^*(U) = \sup\{\mu^*(K) : K \subseteq U,\ K \in \mathfrak{K}\}. \tag{7.8}$$

Given an $\varepsilon > 0$ and $U$ open with $\gamma(U) < \infty$, let $a \in \mathbb{R}$ be such that $\gamma(U) = a + \varepsilon$. By Corollary 11.6, Chapter 3, and (7.2), there is $f \prec U$ such that

$$I(f) > \gamma(U) - \varepsilon.$$

Hence, $I(f) > a$. Let $K = \operatorname{supp} f$. Then, by Problem 7.1,

$$\mu^*(K) \ge I(f) > a$$

and

$$\mu^*(K) > \gamma(U) - \varepsilon. \tag{7.8a}$$

Thus, we showed that, given $\varepsilon > 0$, there is a compact set $K \subseteq U$ with (7.8a) holding. This yields (7.8).

Now, let $\gamma(U) = \infty$. Since $\gamma(U) = \sup\{I(f) : f \prec U\}$, for any $M > 0$ (arbitrarily large), there is $f \in C_c(X)$ with $f \prec U$ such that $I(f) > M$. Given $K = \operatorname{supp} f$, by Problem 7.1, $\mu^*(K) > M$. Hence, we showed that, given $U$ with $\gamma(U) = \infty$ and $M > 0$, arbitrarily large, there is a compact subset $K \subseteq U$ such that $\mu^*(K) > M$. Therefore, $\sup\{\mu^*(K) : K \subseteq U,\ K \in \mathfrak{K}\} = \infty$.
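7.9 Theorem. $\tau \subseteq \Sigma^*$. Consequently, $\mathfrak{B}(X) \subseteq \Sigma^*$.

Proof. We need to show that for each $Q \subseteq X$ and $U \in \tau$,

$$\mu^*(Q) = \mu^*(Q \cap U) + \mu^*(Q \cap U^c). \tag{7.9}$$

1. First, let $Q \in \tau$. Then, $Q \cap U \in \tau$ and (7.9) reads

$$\gamma(Q) = \gamma(Q \cap U) + \mu^*(Q \cap U^c).$$

Hence, for each $\varepsilon > 0$, by Corollary 11.6, Chapter 3, there is an $f \prec Q \cap U$ such that

$$I(f) > \gamma(Q \cap U) - \varepsilon.$$

Because $Q \cap (\operatorname{supp} f)^c$ is an open set, there is $g \prec Q \cap (\operatorname{supp} f)^c$ such that

$$I(g) > \gamma(Q \cap (\operatorname{supp} f)^c) - \varepsilon.$$

Clearly, $f + g \prec Q$. Consequently,

$$\gamma(Q) \ge I(f) + I(g) > \gamma(Q \cap U) + \gamma(Q \cap (\operatorname{supp} f)^c) - 2\varepsilon. \tag{7.9a}$$

On the other hand,

$$Q \cap (\operatorname{supp} f)^c \supseteq Q \cap (U \cap Q)^c = Q \cap U^c,$$

which leads to

$$\gamma(Q \cap (\operatorname{supp} f)^c) \ge \mu^*(Q \cap U^c).$$

The latter, along with (7.9a), yields

$$\gamma(Q) > \gamma(Q \cap U) + \mu^*(Q \cap U^c) - 2\varepsilon$$

and hence,

$$\gamma(Q) \ge \gamma(Q \cap U) + \mu^*(Q \cap U^c).$$

The inverse inequality is, as usual, due to subadditivity of $\mu^*$.

2. Let $Q \subseteq X$. If $\mu^*(Q) = \infty$, then the separation is due to subadditivity. Let $\mu^*(Q) < \infty$. Then, since

$$\mu^*(Q) = \inf\{\gamma(V) : V \supseteq Q,\ V \in \tau\},$$

for each $\varepsilon > 0$, there is an open superset $V$ of $Q$ such that, by case 1,

$$\mu^*(Q) + \varepsilon > \gamma(V) = \gamma(V \cap U) + \mu^*(V \cap U^c) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c).$$

For $\varepsilon \downarrow 0$,

$$\mu^*(Q) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c)$$

and the inverse inequality follows from subadditivity of $\mu^*$. Thus we showed that $\tau \subseteq \Sigma^*$. This immediately implies that all Borel sets are $\mu^*$-measurable.

From now on, the restriction of $\mu^*$ from $\Sigma^*$ (actually, $\mu_0^*$) to $\mathfrak{B}(X)$ will be denoted by $\mu$. The last two theorems finalize the most significant feature of $\mu^*$, besides its integral representation: that its restriction from $\mathcal{P}(X)$ to $\mathfrak{B}(X)$ is a Radon measure. Indeed, Theorem 7.8 states that $\mu^*$ is inner regular on $\tau$. Proposition 7.5 states that $\mu^*$ is finite on compact sets. Theorem 7.9 states that $\operatorname{Res}_{\mathfrak{B}(X)} \mu^* = \mu$ is a Borel measure. And, finally, $\mu^*$ is outer regular, by definition, on $\mathcal{P}(X)$, and therefore, on $\mathfrak{B}(X)$.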
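7.10 Theorem (Riesz's Representation Theorem). For any positive linear functional $I$ on $C_c(X)$ there is a Radon measure $\mu$ such that for all $f \in C_c(X)$,

$$I(f) = \int f \, d\mu. \tag{7.10}$$

Proof. We have shown in the above theorems that, through formulas (7.2) and (7.2a), $I$ induces a Radon measure on the Borel $\sigma$-algebra $\mathfrak{B}(X)$. We need to prove that (7.10) holds true. Let $f \in C_c(X)$ and $K = \operatorname{supp} f$. Since $f$ is bounded, there is an $M < \infty$ such that $\|f\|_u < M$ (where $\|\cdot\|_u$ stands for the supremum norm). Given $\varepsilon > 0$, let $\{t_0, \ldots, t_n\}$ be a partition of the interval $[-M, M]$ with

$$-M = t_0 < t_1 < \ldots < t_n = M$$

such that the mesh of the partition is less than $\varepsilon$. Denote

$$E_i = \{x \in K : t_{i-1} < f(x) \le t_i\}, \quad i = 1, \ldots, n,$$

and

$$W_i = \{x \in X : t_i - \varepsilon < f(x) < t_i + \varepsilon\}.$$

By outer regularity of $\mu$, for each $\varepsilon > 0$, there is an open superset $V_i$ of $E_i$ such that

$$\mu(V_i) < \mu(E_i) + \frac{\varepsilon}{n}. \tag{7.10a}$$

Since $E_i \subseteq W_i$, we have that $E_i \subseteq W_i \cap V_i$, and therefore, (7.10a) still holds when $V_i$ is replaced by $U_i = W_i \cap V_i$. Because $E_i \subseteq U_i \subseteq W_i$, we have that

$$K = \sum_{i=1}^{n} E_i \subseteq \bigcup_{i=1}^{n} U_i.$$

Thus, $\{U_1, \ldots, U_n\}$ is an open cover of $K$ and, by Theorem 11.3, Chapter 3, there is a partition $\{g_1, \ldots, g_n\} \subseteq C_c(X)$ of unity for $K$ subordinate to this open cover, i.e., $g_i \prec U_i$ and $K \prec \sum_{i=1}^{n} g_i$. Because $f \le t_i + \varepsilon$ on $W_i$, it holds on any subset of $W_i$, and thus

$$f g_i \le (t_i + \varepsilon) g_i. \tag{7.10b}$$

Since $\sum_{i=1}^{n} g_i = 1$ on $K$,

$$f = \sum_{i=1}^{n} f g_i. \tag{7.10c}$$

Note also that $I(g_i) \le \gamma(U_i) = \mu(U_i) < \mu(E_i) + \frac{\varepsilon}{n}$ and, by Proposition 7.6, $\sum_{i=1}^{n} I(g_i) \ge \mu(K) = \sum_{i=1}^{n} \mu(E_i)$. The latter, along with (7.10b) and (7.10c), and the fact that $t_i + \varepsilon + M > 0$, yield:

$$I(f) = \sum_{i=1}^{n} I(f g_i) \le \sum_{i=1}^{n} (t_i + \varepsilon + M) I(g_i) - M \sum_{i=1}^{n} I(g_i) \le \sum_{i=1}^{n} (t_i + \varepsilon) \mu(E_i) + \frac{\varepsilon}{n} \sum_{i=1}^{n} (t_i + \varepsilon + M).$$

Hence,

$$I(f) - \int f \, d\mu = \sum_{i=1}^{n} I(f g_i) - \sum_{i=1}^{n} \int_{E_i} f \, d\mu \le 2\varepsilon \mu(K) + \varepsilon (2M + \varepsilon)$$

(since $f \ge t_i - \varepsilon$ on $W_i$ and thus on $E_i$, so that $\int_{E_i} f \, d\mu \ge (t_i - \varepsilon) \mu(E_i)$).

Letting $\varepsilon \downarrow 0$ we arrive at $I(f) \le \int f \, d\mu$. Now, the equality is reached when replacing $f$ by $-f$.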
7.11 Proposition. The Radon measure in equation (7.10) is unique.

Proof. Suppose $\nu$ is another Radon measure induced by $I$ for which equation (7.10) holds. Let $K$ be a compact set. Then, by the outer regularity, for each $\varepsilon > 0$, there is an open set $U \supseteq K$ such that

$$\mu(U) < \mu(K) + \varepsilon.$$

By Corollary 11.4, Chapter 3, there exists $f \in C_c(X)$ such that $K \prec f \prec U$, yielding that $\mathbf{1}_K \le f \le \mathbf{1}_U$ and hence

$$\nu(K) \le \int f \, d\nu = I(f) = \int f \, d\mu \le \mu(U) < \mu(K) + \varepsilon.$$

Thus, $\nu(K) \le \mu(K)$. Interchanging the roles of $\mu$ and $\nu$ we arrive at $\mu = \nu$ on $\mathfrak{K}$. Inner regularity allows us to state that also $\mu = \nu$ on $\tau$, and outer regularity finally yields $\mu = \nu$ on $\mathfrak{B}(X)$.
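7.12 Theorem. Any Radon measure $\mu$ is inner regular on $\mu$-finite Borel sets.

Proof. Let $B \in \mathfrak{B}(X)$ be such that $\mu(B) < \infty$. We need to show that

$$\mu(B) = \sup\{\mu(K) : K \subseteq B,\ K \in \mathfrak{K}\}$$

or, equivalently, that for each $\varepsilon > 0$, there is a compact subset $K$ of $B$ such that $\mu(K) + \varepsilon > \mu(B)$. Choose $\varepsilon > 0$. Since $B$ is $\mu$-outer regular, there is an open set $U \supseteq B$ such that

$$\mu(U) < \mu(B) + \frac{\varepsilon}{4}. \tag{7.12}$$

Since $U$ is $\mu$-inner regular, there is a compact set $C$ such that $C \subseteq U$ and

$$\mu(C) > \mu(U) - \frac{\varepsilon}{2}. \tag{7.12a}$$

Since $U \setminus B$ (as a Borel set) is $\mu$-outer regular, there is an open superset $V \supseteq U \setminus B$ such that, along with (7.12),

$$\mu(V) < \mu(U \setminus B) + \frac{\varepsilon}{4} < \frac{\varepsilon}{2}. \tag{7.12b}$$

Since $U \setminus B \subseteq V$, we have that $V^c \subseteq U^c \cup B$. Hence,

$$C \cap V^c \subseteq C \cap (U^c \cup B) \subseteq U \cap (U^c \cup B) = B$$

(the second inclusion since $C \subseteq U$, the equality since $B \subseteq U$). We see that $C \setminus V$ is a compact subset of $B$ with:

$$\mu(C \setminus V) = \mu(C) - \mu(C \cap V) > \mu(U) - \frac{\varepsilon}{2} - \frac{\varepsilon}{2} \ge \mu(B) - \varepsilon$$

(by (7.12a) and since $\mu(C \cap V) \le \mu(V) < \frac{\varepsilon}{2}$ by (7.12b)).

The reader can rather easily conclude that:

7.13 Corollary. If $B$ is a $\sigma$-finite Borel set, then $B$ is $\mu$-inner regular. (See Problem 7.4.)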
7.14 Proposition. Let $\mu$ be a $\sigma$-finite Radon measure and $B \in \mathfrak{B}(X)$. Then for each $\varepsilon > 0$, there is a closed subset $F$ of $B$ and an open superset $U$ of $B$ such that $\mu(U \setminus F) < \varepsilon$.

Proof. Let $\{B_n;\ n = 1, 2, \ldots\}$ be a partition of $B$ such that $\mu(B_n) < \infty$. Since each $B_n$ is $\mu$-outer regular, there is an open superset $U_n$ of $B_n$ such that

$$\mu(U_n) < \mu(B_n) + \frac{\varepsilon}{2^{n+1}}.$$

Let $U = \bigcup_{n=1}^{\infty} U_n$. Then, $B \subseteq U$ and

$$U \setminus B \subseteq \bigcup_{n=1}^{\infty} (U_n \setminus B_n).$$

Therefore,

$$\mu(U \setminus B) \le \sum_{n=1}^{\infty} \mu(U_n \setminus B_n) < \frac{\varepsilon}{2}.$$

Now, we apply to $B^c$ the same arguments as above to have an open superset $V$ of $B^c$ with $\mu(V \setminus B^c) < \frac{\varepsilon}{2}$. Then, $F := V^c$ is closed and $F \subseteq B$. Finally, because $B \setminus F = V \setminus B^c$,

$$\mu(U \setminus F) = \mu(U \setminus B) + \mu(B \setminus F) < \varepsilon.$$

The following proposition is an easy consequence of Corollary 7.13 and Proposition 7.14 and is offered as a small challenge for the reader as Problem 7.7.

7.15 Proposition. Let $\mu$ be a Radon measure on $\mathfrak{B}(X)$, where $X$ is a locally compact Hausdorff space. If $B$ is a $\sigma$-finite Borel set, then for each $\varepsilon > 0$, there are a compact set $K$ and an open set $U$ such that $K \subseteq B \subseteq U$ and $\mu(U \setminus K) < \varepsilon$.
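7.16 Proposition. Let $\mu$ be a $\sigma$-finite Radon measure. Then for any $B \in \mathfrak{B}(X)$, there is an $F_\sigma$ subset of $B$ and a $G_\delta$ superset of $B$ such that $\mu(G_\delta \setminus F_\sigma) = 0$.

Proof. Let $B$ be a Borel set. By Proposition 7.14, for each $\varepsilon > 0$, there are closed and open sets, $F$ and $U$, respectively, such that $F \subseteq B \subseteq U$ and $\mu(U \setminus F) < \varepsilon$. In particular, for $\varepsilon = \frac{1}{n}$, there are $F_n$ (closed) and $U_n$ (open) such that

$$F_n \subseteq B \subseteq U_n \quad \text{and} \quad \mu(U_n \setminus F_n) < \frac{1}{n}.$$

Then, with the notation $\mathcal{F}_n = \bigcup_{k=1}^{n} F_k$ and $\mathcal{U}_n = \bigcap_{k=1}^{n} U_k$, we have that

$$\mathcal{U}_n \setminus \mathcal{F}_n \subseteq U_n \setminus F_n \quad \text{and thus} \quad \mu(\mathcal{U}_n \setminus \mathcal{F}_n) < \frac{1}{n}. \tag{7.16}$$

In addition, clearly, the sequence $\{\mathcal{U}_n \setminus \mathcal{F}_n\}$ is monotone nonincreasing and

$$\bigcap_{n=1}^{\infty} (\mathcal{U}_n \setminus \mathcal{F}_n) = \bigcap_{n=1}^{\infty} U_n \setminus \bigcup_{n=1}^{\infty} F_n,$$

yielding that, with $G_\delta := \bigcap_{n=1}^{\infty} U_n$ and $F_\sigma := \bigcup_{n=1}^{\infty} F_n$,

$$F_\sigma \subseteq B \subseteq G_\delta \quad \text{and} \quad \mathcal{U}_n \setminus \mathcal{F}_n \downarrow G_\delta \setminus F_\sigma.$$

It therefore remains to show that

$$\mu(G_\delta \setminus F_\sigma) = \lim_{n \to \infty} \mu(\mathcal{U}_n \setminus \mathcal{F}_n) \tag{7.16a}$$

by using continuity from above in light of Theorem 1.7 (i), Chapter 5, which requires that $\mu(\mathcal{U}_1 \setminus \mathcal{F}_1) < \infty$. From (7.16), we have that

$$\mu(\mathcal{U}_1 \setminus \mathcal{F}_1) < 1. \tag{7.16b}$$

Now, the assertion that $\mu(G_\delta \setminus F_\sigma) = 0$ follows from (7.16-7.16b).

As we remember, a regular Borel measure on a Borel $\sigma$-algebra, generated by a locally compact Hausdorff space $X$, has a number of properties, one of which is its finiteness on compact sets. The following is an interesting fact: in some subclasses of locally compact Hausdorff spaces, for a Borel measure to be regular it is sufficient to be finite on compact sets. Namely, second countability or just $\sigma$-compactness of all open subsets of $X$ is such an add-on. (Recall that, according to Corollary 10.18, Chapter 3, a second countable locally compact Hausdorff space is also $\sigma$-compact.)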
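7.17 Theorem. Let $(X, \tau)$ be a locally compact Hausdorff space, in which every open set is $\sigma$-compact. Then, every Borel measure on $\mathfrak{B}(X)$, finite on compact sets, is regular.

Proof. Let $\mu$ be a Borel measure such that $\mu(K) < \infty$ for all $K \in \mathfrak{K}(X)$. Then, $C_c(X) \subseteq L^1(X, \mathfrak{B}(X), \mu)$. Let $I$ denote the positive linear functional on $C_c(X)$ defined as $I(f) = \int f \, d\mu$ and let $\nu$ be the Radon measure induced by $I$. By the assumption, for any $U \in \tau$, there is a sequence $\{C_n\}$ of compact sets such that $U = \bigcup_{n=1}^{\infty} C_n$. Then, given $C_1$ and $U$, there is an $f_1 \in C_c(X)$ such that $C_1 \prec f_1 \prec U$. For $n = 2$, there is an $f_2 \in C_c(X)$ such that

$$C_1 \cup C_2 \cup \operatorname{supp} f_1 \prec f_2 \prec U.$$

Consequently, for $n \ge 2$, recursively, there is an $f_n \in C_c(X)$ such that

$$\bigcup_{k=1}^{n} C_k \cup \bigcup_{k=1}^{n-1} \operatorname{supp} f_k \prec f_n \prec U.$$

Because $\left\{ \bigcup_{k=1}^{n} C_k \right\} \uparrow U$, obviously, $\{f_n\} \uparrow \mathbf{1}_U$, and

$$\mu(U) = \int \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu = \lim_{n \to \infty} I(f_n) = \lim_{n \to \infty} \int f_n \, d\nu = \nu(U)$$

(by the Monotone Convergence Theorem, applied to both $\mu$ and $\nu$). Therefore, $\mu = \nu$ on $\tau$.

Now, let $B$ be a Borel set. Since $\nu$, according to Problem 7.6, is $\sigma$-finite, by Proposition 7.14, given an $\varepsilon > 0$, there are closed and open sets, $F$ and $W$, such that $F \subseteq B \subseteq W$ and $\nu(W \setminus F) < \varepsilon$. Since $W \setminus F \in \tau$ and $\mu = \nu$ on $\tau$, so $\mu(W \setminus F) < \varepsilon$ also. Consequently, for $\mu(F) < \infty$, $\mu(W \setminus F) = \mu(W) - \mu(F) < \varepsilon$, and

$$\mu(W) < \mu(F) + \varepsilon \le \mu(B) + \varepsilon. \tag{7.17}$$

Hence, $\mu$ is outer regular on $\mathfrak{B}(X)$. Furthermore, from (7.17),

$$\mu(F) > \mu(W) - \varepsilon \ge \mu(B) - \varepsilon. \tag{7.17a}$$

On the other hand, $F$ can be $\mu$-approximated by a compact set. Indeed, since $X$ is $\sigma$-compact, so $F$ is also, i.e., $F$ can be represented as a countable union of compact sets. Alternatively, $F$ can also be represented as the union of a monotone nondecreasing sequence $\{K_n\}$ of compact sets, and therefore, by continuity from below,

$$\mu(F) = \lim_{n \to \infty} \mu(K_n).$$

Consequently, for each $\varepsilon > 0$, there is an $n$ such that

$$\mu(K_n) > \mu(F) - \varepsilon. \tag{7.17b}$$

Combining (7.17a) and (7.17b), we have that $\mu(K_n) > \mu(B) - 2\varepsilon$, which shows the $\mu$-inner regularity of $B$ and hence, regularity of $\mu$. In particular, $\mu$ is Radon, and because of the uniqueness of the Radon measure representing $I$, we have that $\mu = \nu$.

7.18 Remark. Notice that, since $\mathbb{R}^n$ with the usual topology is a $\sigma$-compact and locally compact Hausdorff space, any Borel-Lebesgue-Stieltjes measure, according to Theorem 7.17, is regular.

Another very useful result is as follows.

7.19 Theorem. Let $\mu$ be a Radon measure on the Borel $\sigma$-algebra $\mathfrak{B}(X)$ generated by a locally compact Hausdorff space. Then $C_c(X, \tau; \mathbb{C})$ is dense in $L^p$, i.e., $\overline{C_c(X, \tau; \mathbb{C})} = L^p(X, \mathfrak{B}(X), \mu; \mathbb{C})$, for all $1 \le p < \infty$.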
Proof. Since, by Problem 4.8, the space of all simple complex-valued functions in $L^p$ is dense in $L^p$, it is sufficient to prove that for each Borel set $B$ with $\mu(B) < \infty$, the function $\mathbf{1}_B$ can be approximated in the $L^p$ norm by elements of $C_c(X)$. Given an $\varepsilon > 0$, by Proposition 7.15 (for which it is sufficient that $B$ be $\sigma$-finite, i.e., $\operatorname{Res}_{\Sigma \cap B}\, \mu$ is $\sigma$-finite; see Remark 2.12, Chapter 5), there are a compact set $K$ and an open set $U$ such that $K \subseteq B \subseteq U$ and $\mu(U \setminus K) < \varepsilon$. By Corollary 11.4, Chapter 3, there is an $f \in C_c(X)$ valued in $[0,1]$ such that $K \prec f \prec U$ and $K \subseteq \operatorname{supp} f$. Furthermore,

$$|\mathbf{1}_B - f| \le \mathbf{1}_{U \setminus K}$$

and hence

$$\|\mathbf{1}_B - f\|_p \le \mu(U \setminus K)^{1/p} < \varepsilon^{1/p}.$$

The following frequently cited theorem (holding for locally compact Hausdorff spaces and Radon measures) states that a measurable function vanishing outside a set of finite measure can be approximated by a continuous function with compact support.

7.20 Theorem (Lusin). Let $(X, \tau)$ be a locally compact Hausdorff space, $\mu$ be a Radon measure on $\mathfrak{B}(X)$, and let $f \in \mathcal{C}^{-1}(X, \mathfrak{B}(X); \mathbb{C})$. Assume that the set $E = \{x \in X : f(x) \ne 0\}$ is $\mu$-finite. Then, for each $\varepsilon > 0$, there is a function $F \in C_c(X, \tau; \mathbb{C})$ such that $\mu\{F \ne f\} < \varepsilon$.
Proof. 1. We first assume that f is bounded. Since E is assumed to be pfinite, f E ~ l ( x , % ( X ) , p ; C ) .Then, by Theorem 7.19, there is a sequence { f ,) C_ C,(X,r;C) that converges to f in mean. By Riesz-Fischer Theorem 4.9, there is a subsequence {hk: = f ) of (fn) that converges "k
7. Radon Measures on Locally Compact Hausdorff Spaces
507
to f p-a.e. in the topology of pointwise convergence. Since (E,C: = G$(X) n E,ResCnEp) is a finite measure space, by Egorov's Theorem 5.7, {hk) converges to f p-a.e. , i.e., for each E > 0, there is a Bore1 set A E such that p(E\A) < and {hk} converges to f uniformly on A. Furthermore, by Proposition 7.15, there are pairs of compact sets K and C and open sets V and U such that
5
KCACV,C&E&U and p(V\K)
< 5 and p(U\C) < 5.
But U\E
U\C and A\K C V\K
yield that
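Since $h_k \to f$ uniformly on $A$, and thus uniformly on $K$, the function $\operatorname{Res}_K f$ is continuous. By Tietze's Extension Theorem 11.7, Chapter 3, there is a function $F \in \mathcal{C}_c(X)$, an extension $\operatorname{Ext}_X(\operatorname{Res}_K f)$ with respect to $K$ and $U$, such that $F$ vanishes on $U^c$ and $\operatorname{supp} F \subseteq U$. Since $f = 0$ outside $E$ and $F$ is an extension of $f$ from $K$ to $X$ such that $F = 0$ outside $U$,
$$\{x: f(x) \neq F(x)\} \subseteq U \setminus K \quad\text{and}\quad \mu(U \setminus K) < \varepsilon,$$
i.e., $f$ and $F$ differ on a set of measure less than $\varepsilon$. We can also say that $F = f\mathbf{1}_E$ except on a set of measure less than $\varepsilon$.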
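2. Let $f$ be unbounded. Denote
$$E_n = \{x \in X: 0 < |f(x)| \le n\}.$$
Then $\{E_n\} \uparrow E$, so that $E \setminus E_n \downarrow \emptyset$, and therefore, by continuity from above (taking into account that $\mu(E) < \infty$), we have $\mu(E \setminus E_n) \to 0$. Hence, given an $\varepsilon > 0$, there is an $N$ such that for all $n \ge N$,
$$\mu(E \setminus E_n) < \tfrac{\varepsilon}{2}.$$
By case 1, applied to the bounded function $f\mathbf{1}_{E_n}$, there is an $F \in \mathcal{C}_c(X)$ such that $F = f\mathbf{1}_{E_n}$ everywhere except on a set of measure less than $\frac{\varepsilon}{2}$. Thus, $F = f\mathbf{1}_E$ except on a set of measure less than $\varepsilon$. □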
CHAPTER 8. ANALYSIS IN ABSTRACT SPACES
PROBLEMS
7.1 Show that for any function $[X, f, [0,1]] \in \mathcal{C}_c(X)$, $\mu^*(\operatorname{supp} f) \ge I(f)$.

7.2 Show that for every compact set $K$ it holds true that $\mu^*(K) = \inf\{I(f): f \in \mathcal{C}_c(X) \text{ and } f \ge \mathbf{1}_K\}$. [Hint: Apply Proposition 7.6.]

7.3 Can the uniqueness of the Radon measure induced by a positive linear functional be established by means of Theorem 2.13, Chapter 5, at least in part?

7.4 Prove Corollary 7.13.

7.5 Show that if $(X,\tau)$ is a locally compact Hausdorff space, then every $\sigma$-finite Radon measure on $\mathfrak{B}(X)$ is regular.

7.6 Prove that if $(X,\tau)$ is a locally compact and $\sigma$-compact Hausdorff space, then any Radon measure on $\mathfrak{B}(X)$ is $\sigma$-finite and regular.

7.7 Prove Proposition 7.15.

7.8 In Lusin's Theorem 7.20, prove that if $f$ is bounded, then the choice of such an $F$ can be restricted to those with $\|F\|_u \le \|f\|_u$, where $\|\cdot\|_u$ is the usual supremum norm.

7.9 Prove the statement: Let $[\mathcal{C}_c(X,\tau;\mathbb{R}), \mathbb{R}, I]$ be a positive linear functional and $K$ be a compact subset of $X$. Then there is a nonnegative real constant $C_K$ such that, for all $f \in \mathcal{C}_c(X)$ with $\operatorname{supp} f \subseteq K$, $|I(f)| \le C_K \|f\|_u$ (where $\|\cdot\|_u$ is the supremum norm).
NEW TERMS: Borel measure; $\mu$-outer regular set; $\mu$-inner regular set; outer regular Borel measure; inner regular Borel measure; weakly regular Borel measure (Radon measure); Radon measure (weakly regular Borel measure); positive linear functional; Radon measure induced by a positive linear functional; Riesz's Representation Theorem; Lusin's Theorem
8. MEASURE DERIVATIVES

In this section we consider an alternative approach to the abstract notion of the Radon-Nikodym derivative, in Euclidean spaces, of signed Borel-Lebesgue-Stieltjes measures with respect to the Borel-Lebesgue measure. The idea of differentiating measures as a "pointwise limit," which we explore throughout, is analogous to the conventional concept of a derivative; it offers insight into the Radon-Nikodym derivative and has applications to the differentiation of functions.
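8.1 Definition. Denote by $\mathcal{C}(x,\delta)$ the collection of all open cubes in $\mathbb{R}^n$ (whose edges are parallel to the coordinate axes) of diameter less than or equal to $\delta$ and containing a point $x$. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on the Borel $\sigma$-algebra $\mathfrak{B}^n$ (cf. Definition 1.1 (vi)). For each $x \in \mathbb{R}^n$, define the functions
$$\overline{D}\nu(x) = \lim_{\delta \downarrow 0} \sup\left\{\frac{\nu(C)}{\lambda(C)}: C \in \mathcal{C}(x,\delta)\right\} \tag{8.1}$$
and
$$\underline{D}\nu(x) = \lim_{\delta \downarrow 0} \inf\left\{\frac{\nu(C)}{\lambda(C)}: C \in \mathcal{C}(x,\delta)\right\}. \tag{8.1a}$$
Since the functions
$$\delta \mapsto \sup\left\{\frac{\nu(C)}{\lambda(C)}: C \in \mathcal{C}(x,\delta)\right\} \quad\text{and}\quad \delta \mapsto \inf\left\{\frac{\nu(C)}{\lambda(C)}: C \in \mathcal{C}(x,\delta)\right\}$$
are, for every fixed $x$, monotone nondecreasing and nonincreasing in $\delta$, respectively, the limits in (8.1) and (8.1a) exist (though they can be $\infty$ or $-\infty$). The numbers $\overline{D}\nu(x)$ and $\underline{D}\nu(x)$ (satisfying $\underline{D}\nu \le \overline{D}\nu$) are called, respectively, the upper and the lower derivatives of the measure $\nu$ (with respect to the Borel-Lebesgue measure $\lambda$). If they are equal and finite, we denote their common value by $D\nu(x)$, and we say that $\nu$ is differentiable at $x$ (with respect to $\lambda$) and call $D\nu(x)$ the (measure) derivative of $\nu$ at $x$ (with respect to $\lambda$).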
Notice that if $\nu \ll \lambda$, then $\nu = \int f\,d\lambda$ (with respect to some Radon-Nikodym density $f$), and since
$$\frac{\nu(C(x,d))}{\lambda(C(x,d))}$$
represents the mean value of the function $f$ on the cube $C(x,d)$ (of diameter $d$ and containing the point $x$), $D\nu$, if it exists, seems to be equal to $f$ $\lambda$-a.e. in a vicinity of $x$. This idea (which gives practical insight into the Radon-Nikodym derivative) will be explored rigorously through several statements below.
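The mean-value heuristic above can be illustrated numerically on the real line, where open "cubes" are open intervals and the diameter is the length. The following is only a sketch under stated assumptions: the density $f(t) = t^2$, the point $x = 2$, and the helper names `nu` and `ratio_range` are chosen here for illustration and do not come from the text. It samples intervals containing $x$ of length at most $\delta$ and reports the smallest and largest ratio $\nu(C)/\lambda(C)$; both should approach $f(2) = 4$ as $\delta$ decreases.

```python
# Sketch: ratios nu(C)/lambda(C) over small open intervals C containing x.
# The density f(t) = t**2 and all helper names are assumptions for illustration.


def nu(a, b):
    """nu((a, b)) = integral of t**2 over (a, b), in closed form."""
    return (b ** 3 - a ** 3) / 3.0


def ratio_range(x, delta, samples=50):
    """Min and max of nu(C)/lambda(C) over sampled open intervals C that
    contain x and have length (the diameter in R^1) at most delta."""
    ratios = []
    for i in range(1, samples + 1):
        length = delta * i / samples            # lengths in (0, delta]
        for j in range(1, samples):
            left = x - length * j / samples     # x lies strictly inside the interval
            right = left + length
            ratios.append(nu(left, right) / (right - left))
    return min(ratios), max(ratios)


if __name__ == "__main__":
    for delta in (1.0, 0.1, 0.01, 0.001):
        lo, hi = ratio_range(2.0, delta)
        print(f"delta={delta:<6} inf ~ {lo:.6f}  sup ~ {hi:.6f}")
```

The sampled infimum and supremum only approximate the quantities in (8.1) and (8.1a), since finitely many intervals are tested; the point of the sketch is the squeezing of both toward the density value.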
8.2 Remark. One interpretation of the measure derivative is as follows. If $D\nu$ exists at a point $x_0$ (and therefore coincides with the upper and lower derivatives there), then
$$\lim \frac{\nu(C)}{\lambda(C)}$$
exists for $\delta \downarrow 0$ along any pertinent net of open cubes. Therefore, for any $\varepsilon > 0$, there is a $\delta > 0$ such that for any open cube $C$ containing $x_0$, of diameter less than or equal to $\delta$,
$$\left|\frac{\nu(C)}{\lambda(C)} - D\nu(x_0)\right| < \varepsilon.$$
As a relevant net of cubes, we can take those centered at $x_0$ and even reduce that net to a sequence of cubes of diameters $\{\frac{1}{n}\}$.
8.3 Lemma. Let $C_1,\ldots,C_r$ be open cubes in $\mathbb{R}^n$. Then there is a subcollection $C_{k_1},\ldots,C_{k_s}$ of pairwise disjoint cubes among $C_1,\ldots,C_r$ such that
$$\lambda\Big(\bigcup_{i=1}^r C_i\Big) \le 3^n \sum_{j=1}^s \lambda(C_{k_j}).$$

Proof. Let $\delta_i$ be the diameter of $C_i$. Rearranging the cubes, we can assume that $\delta_1 \ge \delta_2 \ge \cdots \ge \delta_r$. Set $k_1 = 1$ and let $k_2$ be the smallest index greater than 1 such that the cube with this index is disjoint from $C_{k_1}$. If there is no such cube available, then we are done. Otherwise, set $k_3$ to be the smallest index greater than $k_2$ such that $C_{k_3}$ is disjoint from $C_{k_1} \cup C_{k_2}$. Continue this process until the formation of all disjoint cubes $C_{k_1},\ldots,C_{k_s}$ is finished. Let $S_{k_j}$ be the cube with the same center as $C_{k_j}$ but with a diameter three times as large. Each $C_i$ intersects some $C_{k_j}$ with $i \ge k_j$ (otherwise $C_i$ would itself have been selected, contrary to the assumption that the set of disjoint cubes is complete), and $d(C_i) \le d(C_{k_j})$; it follows that $C_i \subseteq S_{k_j}$. Hence,
$$\lambda\Big(\bigcup_{i=1}^r C_i\Big) \le \lambda\Big(\bigcup_{j=1}^s S_{k_j}\Big) \le \sum_{j=1}^s \lambda(S_{k_j}) = 3^n \sum_{j=1}^s \lambda(C_{k_j}).$$ □
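The greedy selection in the proof of Lemma 8.3 can be sketched in one dimension, where cubes are open intervals and the constant $3^n$ becomes 3. This is an illustrative sketch only: the function names `select_disjoint` and `union_length` and the sample intervals are assumptions made here, not part of the text.

```python
# Sketch of the greedy selection from the proof of Lemma 8.3, for n = 1.
# Open intervals are given as (left, right) pairs; names are illustrative.


def select_disjoint(intervals):
    """Order the intervals by decreasing length and keep each one that is
    disjoint from all intervals kept so far."""
    ordered = sorted(intervals, key=lambda ab: ab[1] - ab[0], reverse=True)
    chosen = []
    for a, b in ordered:
        # Open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a.
        if all(b <= c or d <= a for c, d in chosen):
            chosen.append((a, b))
    return chosen


def union_length(intervals):
    """Lebesgue measure of the union of finitely many intervals."""
    total, end = 0.0, float("-inf")
    for a, b in sorted(intervals):
        if b > end:
            total += b - max(a, end)
            end = b
    return total


if __name__ == "__main__":
    cubes = [(0.0, 2.0), (1.0, 3.0), (2.5, 3.5), (5.0, 6.0)]
    picked = select_disjoint(cubes)
    bound = 3 * sum(b - a for a, b in picked)
    print(picked, union_length(cubes), "<=", bound)
```

In this sample the union has length 4.5 while the selected disjoint intervals have total length 4, so the bound of the lemma holds with room to spare.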
8.4 Lemma. Let $\mu$ be a positive Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}^n$ and let $N \in \mathcal{N}_\mu$. Then $D\mu$ exists $\lambda$-a.e. on $N$ and $\mathbf{1}_N D\mu \in [0]_\lambda$.

Proof. Because $\mu$ is a positive measure, $0 \le \underline{D}\mu \le \overline{D}\mu$; thus we need to show that for each positive $a$,
$$\lambda(N \cap \{\overline{D}\mu > a\}) = 0.$$
Let
$$A = N \cap \{x \in \mathbb{R}^n: \overline{D}\mu(x) > a\}$$
for some $a > 0$. Then $A$ is Borel (Problem 8.4) and, by regularity of $\mu$ (see Theorem 7.17 and Remark 7.18), for any $\varepsilon > 0$ there is an open superset $U$ of $A$ such that $\mu(U \setminus A) < \varepsilon$. Since $A$ is a $\mu$-null set, we can make $\mu(U)$ arbitrarily small. We will show that the latter, times a positive constant, dominates $\lambda(K)$, where $K$ is a compact subset of $A$, and hence $\lambda(A)$, taking into account the regularity of $\lambda$.

Let $K \subseteq A$ be a compact set and $U \supseteq A$ be an open set. Given $x \in K$, by Problem 8.2, there is an open cube $C$ of any prescribed diameter, say $d$, that contains $x$ and is such that $\lambda(C) < \frac{1}{a}\mu(C)$. From Problem 8.2, we can make $d$ small enough to ensure $C \subseteq U$. We can cover $K$ by all such cubes and, due to compactness, reduce this open cover (dominated by $U$) to a finite subcover, say $C_1,\ldots,C_r$. Then, by Lemma 8.3, there is a subcollection $C_{k_1},\ldots,C_{k_s}$ of pairwise disjoint cubes among $C_1,\ldots,C_r$ such that
$$\lambda(K) \le 3^n\sum_{j=1}^s \lambda(C_{k_j}) < \frac{3^n}{a}\sum_{j=1}^s \mu(C_{k_j}) \le \frac{3^n}{a}\mu(U).$$
As mentioned above, due to regularity of $\mu$, given an $\varepsilon > 0$, $U$ can be selected so that $\mu(A) + \varepsilon\frac{a}{3^n} > \mu(U)$. Hence, since $\mu(A) = 0$,
$$\lambda(K) < \frac{3^n}{a}\cdot\varepsilon\frac{a}{3^n} = \varepsilon.$$
On the other hand, by regularity of $\lambda$, for each $\varepsilon > 0$, $K$ can be selected so that $\lambda(K) + \varepsilon > \lambda(A)$. The latter, along with $\lambda(K) < \varepsilon$, gives $\lambda(A) < 2\varepsilon$. □
8.5 Corollary. Let $\nu$ be a singular signed Borel-Lebesgue-Stieltjes measure, i.e., $\nu \perp \lambda$. Then the measure derivative $D\nu$ exists $\lambda$-a.e. and $D\nu = 0$ $\lambda$-a.e.

Proof. Since $\nu \perp \lambda$, by Proposition 3.2 (iii), $\nu^+, \nu^- \perp \lambda$ and there is a Borel set $B$ such that $|\nu|(B) = \nu^+(B) = \nu^-(B) = \lambda(B^c) = 0$. Hence, by Lemma 8.4, $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $B$, and since $B^c \in \mathcal{N}_\lambda$, we have that $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $\mathbb{R}^n$. Because $D$ is a linear operator on the set of all signed Borel-Lebesgue-Stieltjes measures, we have that $D\nu = 0$ $\lambda$-a.e. □

Since any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite, by Theorem 3.4, there is a unique Lebesgue decomposition of a signed Borel-Lebesgue-Stieltjes measure $\nu$ with respect to the Borel-Lebesgue measure $\lambda$, as $\nu_a + \nu_s$, where $\nu_a \ll \lambda$ and $\nu_s \perp \lambda$. Absolute continuity of $\nu_a$ (with respect to $\lambda$) provides a $\lambda$-equivalence class $\frac{d\nu_a}{d\lambda}$ of Radon-Nikodym densities, which is referred to as the Radon-Nikodym derivative. The theorem below states that $\nu_a$ is $\lambda$-almost everywhere differentiable and that its derivative coincides $\lambda$-a.e. with any Radon-Nikodym density of the class $\frac{d\nu_a}{d\lambda}$. We therefore formulate the theorem for an absolutely continuous signed Borel-Lebesgue-Stieltjes measure.

8.6 Theorem. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}^n$ such that $\nu \ll \lambda$. Then $D\nu$ exists on some set $A$ such that $A^c \in \mathcal{N}_\lambda$ and $\mathbf{1}_A D\nu \in \frac{d\nu}{d\lambda}$.
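Proof. Let $f \in \frac{d\nu}{d\lambda}$. Given a real number $a$, denote
$$\mu(B) = \int_{B \cap \{f \ge a\}} (f - a)\,d\lambda, \quad B \in \mathfrak{B}^n.$$
Then $\mu$ is a positive Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}^n$. Let $B$ be a $d_e$-bounded Borel set. Then
$$\mu(B) \le |\nu|(B) + |a|\,\lambda(B)$$
is obviously finite. From
$$\nu(B) - a\lambda(B) = \int_B (f - a)\,d\lambda \le \mu(B)$$
it follows that
$$\frac{\nu(C)}{\lambda(C)} \le a + \frac{\mu(C)}{\lambda(C)},$$
and since the latter holds true for any open cube $C$, we have that
$$\overline{D}\nu \le a + \overline{D}\mu. \tag{8.6}$$
Let $N = \{f < a\}$. Then $N \in \mathcal{N}_\mu$ and, by Lemma 8.4, $D\mu$ exists $\lambda$-a.e. on $N$ and $\mathbf{1}_N D\mu \in [0]_\lambda$. This, applied to (8.6), yields that
$$\overline{D}\nu \le a \quad \lambda\text{-a.e. on } N.$$
Denote $S_a = \{f < a < \overline{D}\nu\}$. Then $S_a \subseteq N$ and therefore $S_a \in \mathcal{N}_\lambda$ for every $a$. Since
$$E := \{f < \overline{D}\nu\} = \bigcup_{a \in \mathbb{Q}} S_a,$$
we have that $E \in \mathcal{N}_\lambda$ and therefore $\overline{D}\nu \le f$ $\lambda$-a.e. Now, replacing $\nu$ by $-\nu$ and $f$ by $-f$, we arrive at $\underline{D}\nu \ge f$ $\lambda$-a.e. Since $\nu$ is clearly $\sigma$-finite, by the Radon-Nikodym Theorem 2.2 (case 5b), $f$ is $\lambda$-a.e. finite, and because $\underline{D}\nu \le \overline{D}\nu$, we have that $D\nu = f$ $\lambda$-a.e., and thereby the statement is proved. □

Combining Corollary 8.5 and Theorem 8.6 we arrive at: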
8.7 Corollary. Any signed Borel-Lebesgue-Stieltjes measure $\nu$ is $\lambda$-a.e. differentiable, and if $\nu_a$ is its absolutely continuous component in the Lebesgue decomposition, then $D\nu$ is $\lambda$-a.e. identical to any Radon-Nikodym density of $\nu_a$ with respect to $\lambda$.
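Proof. If $\nu$ is a signed Borel-Lebesgue-Stieltjes measure and $\nu = \nu_a + \nu_s$ is its Lebesgue decomposition, then by Theorem 8.6, $\nu_a$ is differentiable $\lambda$-a.e. More precisely, its derivative $D\nu_a$ exists on some set $A$ such that $A^c \in \mathcal{N}_\lambda$ and
$$\mathbf{1}_A D\nu_a \in \frac{d\nu_a}{d\lambda}.$$
By Corollary 8.5, $\nu_s$ is differentiable $\lambda$-a.e. and its derivative $D\nu_s = 0$ $\lambda$-a.e. In other words, there is a set $B$ such that $B^c \in \mathcal{N}_\lambda$ and such that $\mathbf{1}_B D\nu_s = 0$. Consequently, with $E = A \cap B$, the set $E^c = A^c \cup B^c \in \mathcal{N}_\lambda$ and $\mathbf{1}_E D\nu = \mathbf{1}_E D\nu_a \in \frac{d\nu_a}{d\lambda}$, and therefore $\nu$ is $\lambda$-a.e. differentiable. □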
PROBLEMS

8.1 Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure and $A = \{x \in \mathbb{R}^n: \overline{D}\nu(x) > a\} \neq \emptyset$, for some real $a$. Show that for each $x \in A$ there is a cube $C$ containing $x$ such that $\frac{\nu(C)}{\lambda(C)} > a$.

8.2 Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure and $A = \{x \in B \subseteq \mathbb{R}^n: \overline{D}\nu(x) > a\} \neq \emptyset$, for some real $a$ and $B$ being a Borel set. Show that, given a positive real number $\delta$, there is a cube $C(x,\delta)$ such that $\frac{\nu(C(x,\delta))}{\lambda(C(x,\delta))} > a$.

8.3 Show that
is an open set.

8.5 Let $F$ be an extended distribution function induced by a positive Borel-Lebesgue-Stieltjes measure $\mu$ on $(\mathbb{R},\mathfrak{B})$. Show that if $\mu$ is differentiable at $x_0$, then $F$ is continuous at $x_0$.
NEW TERMS: lower derivative of a measure; upper derivative of a measure; measure differentiable at a point; measure derivative
Chapter 9

Calculus on the Real Line

In this chapter we utilize the theorems on absolute continuity, singularity, and measure derivatives of Chapter 8. We will see a close connection between signed measures and functions of bounded variation (to be introduced) and their decompositions. The treatment will be entirely devoted to the real line, with topics belonging to traditional analysis and probability. However, some more advanced methods of Chapter 8 will be applied for quicker and more elegant results that lead to the calculus of Lebesgue and Lebesgue-Stieltjes integrals. While the Riemann-Stieltjes integral would fit perfectly into this chapter, it will not be the subject of our discussion, mostly because it is readily available in numerous advanced calculus texts, although the close relationship between Riemann-Stieltjes and Lebesgue-Stieltjes integrals makes this topic very tempting to explore.
1. MONOTONE FUNCTIONS

1.1 Definition and Notation. Unless specified otherwise, we will consider real-valued functions $[\mathbb{R},\mathbb{R},f]$, bounded over bounded intervals. A function $f$ is monotone nondecreasing (nonincreasing) if $f(x) \le f(y)$ ($f(x) \ge f(y)$) whenever $x < y$. A function is monotone if it is of either type. The jump $\delta_f(x)$ of a function $f$ at a point $x$ is $f(x+) - f(x-)$. The latter is clearly a finite number at any real point $x$. A point $x$ is a jump discontinuity of $f$ if $\delta_f(x) \neq 0$. [Note that the function $\frac{1}{x}$ does not fall into this category of monotone functions, as it is not bounded over bounded intervals around zero.] Note that monotone functions are measurable. Indeed, if $f$ is monotone nondecreasing, then for any real number $a$, the set $\{f > a\}$ is either empty or an interval.
1.2 Theorem. The set $D$ of all jump discontinuities of a monotone function $[\mathbb{R},\mathbb{R},f]$ is at most countable, and if $f$ is defined on a compact interval $[a,x]$ and $D_{(a,x)} = \{x_1, x_2, \ldots\}$ is the set of all discontinuities of $f$ on $(a,x)$ $(a < x)$, then
Proof. We assume that f is monotone nondecreasing. Otherwise, we will deal with - f . Because
and
it is sufficient to prove that $f$ has at most countably many points of discontinuity on any compact interval $[a,x]$. First observe that for an $n$-tuple $a < x_1 < \cdots < x_n < x$ of points it is true that
Indeed, if $t_0 \in (a,x_1)$, $t_1 \in (x_1,x_2)$, ..., $t_n \in (x_n,x)$ are arbitrarily selected points, then by summing up the inequalities
we have (1.2a). From inequality (1.2a), it also follows that if $D_\varepsilon$ is the set of all jump discontinuities of $f$ on $[a,x]$ at which the jumps are greater than an $\varepsilon > 0$, and if $x_1,\ldots,x_n \in D_\varepsilon$, then $n\varepsilon \le f(x) - f(a)$ and therefore $D_\varepsilon$ is finite. Let $D_{[a,x]}$ denote the set of all jump discontinuities of $f$ on $[a,x]$ and let $D_{1/k}$ stand for $D_\varepsilon$ with $\varepsilon = \frac{1}{k}$, $k = 1,2,\ldots$. Then, it is readily seen that
$$D_{[a,x]} = \bigcup_{k=1}^\infty D_{1/k},$$
and since each $D_{1/k}$ is finite, $D_{[a,x]}$ is at most countable, i.e., $D_{[a,x]} = \{x_1, x_2, \ldots\}$. The latter and (1.2a) yield (1.2). □

Observe that if the function $f$ is defined on $[a,x]$, then $f(a+) - f(a) = \delta_f(a)$ and $f(x) - f(x-) = \delta_f(x)$ can be taken for jumps of $f$ at
the ends of the interval $[a,x]$. With $A_f([a,a]) = 0$, equation (1.2) still holds. On the other hand, if $f$ is really defined on $\mathbb{R}$, then from (1.2) it follows directly that $A_f([a,a]) = \delta_f(a)$. Now, if in $A_f([a,x])$ we take $a$ as a fixed constant and let $x$ vary in $[a,b]$, then $A_f([a,x])$ in (1.2) turns into a function of $x$, in new notation $A_f(x)$, which is monotone nondecreasing on $[a,b]$. The "step" function $A_f(x)$ is referred to as the cumulative jump function of $f$. While it is almost obvious how to turn a monotone function into a continuous one, we would like to formalize it as follows:

1.3 Proposition. Let $[[a,b],\mathbb{R},f]$ be a monotone nondecreasing function. Then the function $f - A_f$ is monotone nondecreasing and continuous on $[a,b]$.

Proof. Let $x < y$ be any two points from $[a,b]$. Then, from (1.2) applied to $[x,y]$, the left-hand side of (1.2) is bounded by $f(y) - f(x)$; this is inequality (1.3). By adding and subtracting $f(x-)$ to the left-hand side of inequality (1.3) and then rearranging terms we arrive at
Therefore,
which yields that $f - A_f$ is indeed monotone nondecreasing. By letting $y \downarrow x$ we obtain from (1.3b) that
On the other hand, from (1.3a) and (1.3b),
Letting $y \downarrow x$ in the latter, we have that
which, along with (1.3c), yields that
$$A_f(x+) - f(x+) = A_f(x) - f(x).$$
Analogously, we can show that $A_f(x-) - f(x-) = A_f(x) - f(x)$. □
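Proposition 1.3 can be illustrated on a concrete function. The sketch below is hedged in two ways. First, it assumes that the cumulative jump function on $[a,x]$ is the jump $f(a+) - f(a)$ at the left end, plus the interior jumps $f(p+) - f(p-)$ for $a < p < x$, plus the jump $f(x) - f(x-)$ at the right end, as described in the remarks preceding the proposition. Second, the test function $f(x) = x + \lfloor x \rfloor$ on $[0,3]$, the list `JUMPS`, and the numerical one-sided limits are choices made here for illustration only.

```python
import math

# Sketch: cumulative jump function of f(x) = x + floor(x) on [0, 3].
# The formula for the cumulative jump function is an assumption (see lead-in).

JUMPS = [1.0, 2.0, 3.0]   # discontinuities of f in (0, 3], known in advance
H = 1e-9                  # step used to approximate one-sided limits


def f(x):
    return x + math.floor(x)


def right_limit(x):
    return f(x + H)


def left_limit(x):
    return f(x - H)


def cumulative_jump(x, a=0.0):
    """Approximate cumulative jump function of f on [a, x]."""
    if x == a:
        return 0.0
    total = right_limit(a) - f(a)                  # jump at the left end
    for p in JUMPS:
        if a < p < x:
            total += right_limit(p) - left_limit(p)  # interior jumps
    total += f(x) - left_limit(x)                  # jump at the right end
    return total


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.7, 2.0, 2.5, 3.0):
        print(x, f(x), round(cumulative_jump(x), 6), round(f(x) - cumulative_jump(x), 6))
```

For this particular $f$, removing the cumulative jumps leaves (up to the small numerical error of the one-sided limits) the continuous nondecreasing function $x \mapsto x$.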
Recall that extended distribution functions fall into the category of monotone functions, and that there is a bijective map between the factor space of equivalence classes of extended distribution functions that differ by constants and the Borel-Lebesgue-Stieltjes measures they induce. (See Example 1.2 and Remark 3.5 (iii), Chapter 5, for a refresher.) It is intuitively clear that the measure derivative as a "pointwise" limit, if it exists, is identical to the function derivative. This is the subject of the following theorem.

1.4 Theorem. Let $f$ $(\in \mathfrak{D}_e)$ be an extended distribution function and let $\mu_f$ be the positive Borel-Lebesgue-Stieltjes measure induced by $f$. Then $f$ is differentiable at a point $x_0$ if and only if $\mu_f$ is differentiable at $x_0$, and in this case,
$$f'(x_0) = D\mu_f(x_0). \tag{1.4}$$
Proof. Let $f$ be differentiable at $x_0$. Then, for each positive $\varepsilon$, there is a positive $\delta$ such that
If $x > x_0$, then by Problem 3.7 a), Chapter 5,
and if $x < x_0$, since $f$ is continuous at $x_0$,
Therefore, if $x > x_0$,
and if $x < x_0$,
The latter is not a significant difference from (1.4), since $f$, and therefore $F$, are continuous at $x_0$. Furthermore, because $f$ can have at most countably many discontinuities, there are continuity points of $f$ and $F$ arbitrarily close to $x_0$. In other words, the selection can be made so as to warrant $F(x) = F(x-)$. Then, by (1.4a) and Remark 8.2, $D\mu_f(x_0)$ exists and (1.4) holds. The converse follows by similar arguments once $f'(x_0)$ is replaced by $D\mu_f(x_0)$ in the expression for $F(x)$. □

Corollary 8.7 and Theorem 1.4 combined immediately yield:
1.5 Corollary. Every extended distribution function $f \in \mathfrak{D}_e$ is differentiable $\lambda$-a.e. and
$$f' = D\mu_f = g \quad \lambda\text{-a.e.},$$
where $\mu_f$ is the Borel-Lebesgue-Stieltjes measure induced by $f$ and $g$ is a Radon-Nikodym density of the absolutely continuous component of $\mu_f$ in its Lebesgue decomposition.

1.6 Corollary. Every monotone function bounded over bounded intervals is differentiable $\lambda$-a.e.

Proof. Let $g$ be a monotone nondecreasing function (otherwise, we consider $-g$). Define
$$f(x) = g(x+)$$
to have $f \in \mathfrak{D}_e$. Then $f$ is differentiable $\lambda$-a.e., due to Corollary 1.5, and so is $g$, which, by Theorem 1.2, has at most countably many discontinuities and hence equals $f$ $\lambda$-a.e. □
1.7 Theorem (Fubini). Let $\{F_n\}$ be a sequence of monotone nondecreasing functions such that the series $\sum_{n=1}^\infty F_n$ converges to a function $F$ in the topology of pointwise convergence. Then:

(i) Both $F_n$ and $F$ are differentiable $\lambda$-a.e.

(ii) $F'(x) = \sum_{n=1}^\infty F_n'(x)$, $\lambda$-a.e.

Proof. Assume that for each $n$, $F_n$ is a distribution function and $F$ is bounded. Let $\mu_{F_n}$ be the corresponding finite Borel-Lebesgue-Stieltjes
measure. The set function $\mu_F = \sum_{n=1}^\infty \mu_{F_n}$ is a positive measure. Then, $F$ is clearly a distribution function, and
$$\mu_F((-\infty,x]) = \sum_{n=1}^\infty \mu_{F_n}((-\infty,x]) = \sum_{n=1}^\infty F_n(x) = F(x).$$
It follows by elementary arguments that $\mu_F$ is a finite Borel-Lebesgue-Stieltjes measure induced by $F$. Let
$$\mu_{F_n} = \mu_n^a + \mu_n^s$$
denote the Lebesgue decomposition of $\mu_{F_n}$ and let $f_n$ be a Radon-Nikodym density of its absolutely continuous component. We show that
$$\mu_F = \sum_{n=1}^\infty \mu_n^a + \sum_{n=1}^\infty \mu_n^s$$
is the Lebesgue decomposition of $\mu_F$ and that $f := \sum_{n=1}^\infty f_n$ is a Radon-Nikodym density of its absolutely continuous component. Since $\mu_n^s \perp \lambda$, there is a $\lambda$-null set $N_n$ such that $\lambda(N_n) = \mu_n^s(N_n^c) = 0$. Let
$$N = \bigcup_{n=1}^\infty N_n.$$
Then, because $N \supseteq N_n$ for each $n$ (and thus $N^c \subseteq N_n^c$),
$$\lambda(N) = 0 \quad\text{and}\quad \sum_{n=1}^\infty \mu_n^s(N^c) = 0.$$
On the other hand, $\sum_{n=1}^\infty \mu_n^a$ is the absolutely continuous component of $\mu_F$, since by the Monotone Convergence Theorem,
$$\sum_{n=1}^\infty \mu_n^a(B) = \sum_{n=1}^\infty \int_B f_n\,d\lambda = \int_B f\,d\lambda, \quad B \in \mathfrak{B}.$$
As a finite measure, $\sum_{n=1}^\infty \mu_n^a$ $(\le \mu_F)$ provides that $f$ is an $L^1$-function and, by the Radon-Nikodym Theorem, $f$ is a unique, modulo $\lambda$, Radon-Nikodym density of $\sum_{n=1}^\infty \mu_n^a$ with respect to the Lebesgue measure. Since $F$ is a distribution function, by Corollary 1.5, $F'$ exists $\lambda$-a.e. and
$$F' = D\mu_F = f \quad \lambda\text{-a.e.}$$
On the other hand, applying the same argument to $F_n$, we have that $F_n' = D\mu_{F_n} = f_n$ $\lambda$-a.e., and the two equations yield $F' = \sum_{n=1}^\infty F_n'$ $\lambda$-a.e.

The general case of the theorem, when $F$ is a monotone nondecreasing function bounded over bounded intervals, is left as an exercise (Problem 1.1). □

The following statement is an interesting partial confirmation of the revered Newton-Leibniz theorem applied to a class of monotone functions. The latter are differentiable $\lambda$-a.e. Unless specified otherwise, we will extend the derivative of such a function $f$ by setting $f' = 0$ on the set $N \in \mathcal{N}_\lambda$, where $N^c$ is the set on which $f'$ exists.
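1.8 Theorem. Let $f$ be a bounded monotone nondecreasing function on the compact interval $[a,b]$. Then $f'$ is measurable and
$$\int_a^b f'\,d\lambda \le f(b) - f(a).$$

Proof. Let us (continuously) extend $f$ through $(b, b+1]$ by setting $f(x) = f(b)$ on this interval. Then, at every point $x$ where the derivative of $f$ exists, it can be represented as the limit
$$f'(x) = \lim_{n\to\infty} n\left[f\left(x + \tfrac{1}{n}\right) - f(x)\right]$$
of a convergent sequence of measurable functions. Furthermore, $f'$ exists on a measurable subset of $[a,b]$ whose complement is a $\lambda$-null set on which $f'$ is set to equal zero. Thus, $f'$ is well defined on $[a,b]$; it is nonnegative and therefore its Lebesgue integral exists. By Fatou's Lemma, then
$$\int_a^b f'\,d\lambda \le \liminf_{n\to\infty}\, n\int_a^b \left[f\left(x+\tfrac{1}{n}\right) - f(x)\right]\lambda(dx).$$
By the change of variables,
$$\int_a^b f\left(x+\tfrac{1}{n}\right)\lambda(dx) = \int_{a+1/n}^{b+1/n} f(x)\,\lambda(dx)$$
and thus:
$$\int_a^b f'\,d\lambda \le \liminf_{n\to\infty}\left[n\int_b^{b+1/n} f\,d\lambda - n\int_a^{a+1/n} f\,d\lambda\right] \le f(b) - f(a),$$
since $f = f(b)$ on $(b, b+1]$ and $f \ge f(a)$ on $[a,b]$. □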
The above statement seems to fall surprisingly short of the familiar Newton-Leibnitz equation. Moreover, as we will learn from the example below, the result of Theorem 1.8 can deliver a strict inequality.
1.9 Example (Cantor function). Let $G_n$, $n = 1,2,\ldots$, be the open sets removed from $[0,1]$ to form the Cantor ternary set (see Example 3.11, Chapter 5). Recall that each $G_n$ is the union of $2^{n-1}$ disjoint open intervals. Now, the set $\bigcup_{k=1}^n G_k$ is the union of $2^n - 1$ (as the result of the summation $1 + 2 + \cdots + 2^{n-1}$) open intervals denoted by
$$A_1(n), A_2(n), \ldots, A_{2^n-1}(n)$$
and arranged in the order of their location in $[0,1]$. For each $n$, define the function $F_n: [0,1] \to [0,1]$ as follows. Let
$$F_n(0) = 0, \quad F_n(x) = \frac{k}{2^n} \ \text{ for } x \in A_k(n),\ k = 1,\ldots,2^n - 1,$$
and $F_n(1) = 1$. Then, interpolate $F_n$ by connecting the ends of the corresponding segments of $F_n$ on $A_k(n)$. For instance,
$$G_1 + G_2 + G_3 = A_1(3) + A_2(3) + \cdots + A_7(3).$$
The graphs of $F_1$ and $F_3$ are drawn in Figure 1.1 below.

Figure 1.1

Observe that $A_k(n) = A_{2k}(n+1)$, and that $F_n(x) = F_{n+1}(x) = k/2^n$ for $x \in A_k(n) = A_{2k}(n+1)$, $k = 1,\ldots,2^n - 1$. It is easily seen that $F_n$ is a monotone nondecreasing, continuous function on $[0,1]$, and it is also clear that $|F_n(x) - F_{n+1}(x)| < \frac{1}{2^n}$ for all $x \in [0,1]$. Thus $F_n(x)$ converges uniformly to a function $F(x)$, which is called the Cantor function, and $F$ is also continuous and monotone nondecreasing (as the uniform limit of a sequence of monotone nondecreasing, continuous functions). Therefore, since $F(x) = F_n(x) = k/2^n$ for $x \in A_k(n)$, we have that $F'(x) = 0$ for $x \in A_k(n)$, $k = 1,2,\ldots,2^n - 1$, $n = 1,2,\ldots$. Hence, $F'(x) = 0$ on $\bigcup_{n=1}^\infty G_n$. The latter is the complement of the Cantor set $C$. Consequently, $F' \in [0]_\lambda$ on $[0,1]$. Therefore,
$$\int_0^1 F'\,d\lambda = 0, \quad\text{while}\quad F(1) - F(0) = 1.$$
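The conclusion of Example 1.9 can be checked numerically. The following is a minimal sketch and not the construction via $F_n$ used above: it evaluates the Cantor function through the ternary expansion of $x$, which is a standard equivalent description. The function name `cantor` and the `depth` parameter are choices made here for illustration. The sketch shows that the function is constant on a removed interval (so the difference quotient vanishes there) while $F(1) - F(0) = 1$.

```python
# Sketch: Cantor function via the ternary expansion of x (illustrative names).


def cantor(x, depth=40):
    """Approximate value of the Cantor function at x in [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            # x lies in a removed middle third: the function is constant there.
            return value + scale
        value += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2.0
    return value


if __name__ == "__main__":
    h = 0.01
    quotient = (cantor(0.5 + h) - cantor(0.5 - h)) / (2 * h)
    print("difference quotient on (1/3, 2/3):", quotient)
    print("F(1) - F(0) =", cantor(1.0) - cantor(0.0))
```

The same check works on any removed interval, for example around 0.2 (where the value is 1/4) or around 0.8 (where the value is 3/4).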
PROBLEMS

1.1 Complete the proof of Fubini's Theorem 1.7 for the general case when $F$ is a monotone nondecreasing function, bounded over bounded intervals.

1.2 Let $f$ be a monotone nondecreasing function on $[a,b]$ and $F$ be a monotone function on $[A,B]$. Is the composition $F \circ f: [a,b] \to \mathbb{R}$ monotone?

1.3 Let $f$ and $F$ be the functions of Problem 1.2 and suppose the function $f$ has a jump discontinuity at $x_0 \in (a,b)$. Must $F \circ f$ be discontinuous at $x_0$?

1.4 Show that if $f$ is continuous on $[a,b]$, then the functions $m(x) := \inf\{f(t): t \in [a,x]\}$ and $M(x) := \sup\{f(t): t \in [a,x]\}$ are continuous and monotone on $[a,b]$.

1.5 Give an example of two monotone nondecreasing functions whose product is not monotone.

1.6 Give a monotone increasing function $[\mathbb{R},\mathbb{R},f]$ discontinuous at each rational point.

1.7 Prove that if a function $[(a,b),\mathbb{R},f]$ is monotone, bounded, and continuous, then it is uniformly continuous.

1.8 Does the statement of Problem 1.7 still hold if the interval $(a,b)$ is replaced by $\mathbb{R}$?
NEW TERMS: monotone nondecreasing function; monotone nonincreasing function; monotone function; jump discontinuity; cumulative jump function; Fubini's Theorem for monotone functions; Cantor's ternary function
2. FUNCTIONS OF BOUNDED VARIATION

Now we will introduce the class of functions of "bounded variation," which play the same role for signed measures as distribution functions do for generating positive Borel-Lebesgue-Stieltjes measures.

2.1 Definition. Let $[a,b]$ be a compact interval in $\mathbb{R}$ and let $P = \{a_0 = a, \ldots, a_n = b\}$ be a partition of $[a,b]$. Let $f$ be a measurable bounded real-valued function defined on $[a,b]$. Denote
$$V(P) = \sum_{k=1}^n |f(a_k) - f(a_{k-1})|$$
and let $\mathcal{P}$ be the set of all partitions of $[a,b]$. Then we call $\sup\{V(P): P \in \mathcal{P}\}$ the variation of $f$ on $[a,b]$ and denote it by $V_f[a,b]$. The function $f$ is said to be of bounded variation on $[a,b]$ if $V_f[a,b] < \infty$.
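Definition 2.1 translates directly into a computation of $V(P)$ for a given partition. The sketch below is illustrative only: the helper names `partition_variation` and `uniform_partition` and the choice $f = \sin$ on $[0, 2\pi]$ are assumptions made here. Each $V(P)$ is a lower bound for the variation, so the printed values can only approximate the supremum from below; for the sine function on $[0, 2\pi]$ the variation is 4.

```python
import math

# Sketch: V(P) = sum of |f(a_k) - f(a_{k-1})| over a partition P (names are illustrative).


def partition_variation(f, points):
    """V(P) for the partition given by the increasing list `points`."""
    return sum(abs(f(right) - f(left)) for left, right in zip(points, points[1:]))


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]


if __name__ == "__main__":
    for n in (3, 10, 100, 1000):
        v = partition_variation(math.sin, uniform_partition(0.0, 2 * math.pi, n))
        print(f"n={n:<5} V(P) = {v:.6f}")
```

Uniform partitions are used only for convenience; the supremum in the definition ranges over all partitions.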
2.2 Example. Consider the function
and make the partition P = {0 < xn < ... < xl < 1) such that
Then,
and hence
Consequently, $V_f[0,1] = \infty$. We will leave as an exercise (Problems 2.1-2.14) the following properties of functions of bounded variation.
2.3 Theorem. Let $[[a,b],\mathbb{R},f]$ be a bounded function. The following hold true:

(i) If $f$ is monotone, then it is of bounded variation.

(ii) If $f$ satisfies a Lipschitz condition, then $f \in \mathcal{V}[a,b]$.

(iii) Let $f \in \mathcal{V}[a,b]$. Then $x \mapsto V_f[a,x]$ is a monotone nondecreasing function on $[a,b]$.

(iv) The set $\mathcal{V}[a,b]$ of all functions of bounded variation on $[a,b]$ is a vector space over the field $\mathbb{R}$ and it is closed with respect to multiplication.

(v) Let $f, g \in \mathcal{V}[a,b]$ be such that $g \ge \delta > 0$. Then $\frac{f}{g} \in \mathcal{V}[a,b]$.

(vi) If $f \in \mathcal{V}[a,b]$ and $a < c < b$, then $V_f[a,b] = V_f[a,c] + V_f[c,b]$.

(vii) If $P = \{a = a_0 < a_1 < \cdots < a_n = b\}$ is a partition of $[a,b]$ such that on each of the subintervals $[a_i, a_{i+1}]$ $f$ is monotone, then $f \in \mathcal{V}[a,b]$.

(viii) If $f \in \mathcal{V}[a,b]$ and $[a,b] = [a,c] + (c,b]$, then $f \in \mathcal{V}[a,c]$ and $f \in \mathcal{V}[c,b]$.

(ix) $f \in \mathcal{V}[a,b]$ if and only if $f$ can be represented as the difference of two monotone nondecreasing functions.

(x) If $f \in \mathcal{V}[a,b]$, then $f$ is differentiable $\lambda$-a.e. on $[a,b]$.

(xi) The set of all jump discontinuities of any function $f \in \mathcal{V}[a,b]$ is at most countable.

(xii) Any $f \in \mathcal{V}[a,b]$ can be represented as the sum of its jump function $A_f$ and a continuous function of bounded variation on $[a,b]$.

(xiii) Let $f \in \mathcal{V}[a,b]$. If $f$ is continuous at $x_0 \in (a,b)$, then so is $x \mapsto V_f[a,x]$. If $f$ is right-continuous, then so is $V_f[a,x]$.

(xiv) Any continuous function $f \in \mathcal{V}[a,b]$ can be represented as the difference of two continuous monotone functions. □

2.4 Definition. Let $[\mathbb{R},\mathbb{R},f]$ be a bounded function. The limits
$$V_f(-\infty,b] = \lim_{a\to-\infty} V_f[a,b], \quad V_f[a,\infty) = \lim_{b\to\infty} V_f[a,b], \quad V_f(\mathbb{R}) = \lim_{a\to-\infty,\ b\to\infty} V_f[a,b]$$
are said to be the variation of $f$ on $(-\infty,b]$, the variation of $f$ on $[a,\infty)$, and the total variation of $f$, respectively. The function $f$ is said to be of bounded variation on $(-\infty,b]$, $[a,\infty)$, or $\mathbb{R}$ if the above respective limits are finite, in notation, $f \in \mathcal{V}(-\infty,b]$, $f \in \mathcal{V}[a,\infty)$, or $f \in \mathcal{V}(\mathbb{R})$, respectively.

2.5 Theorem.

(i) For any two real numbers $a < b$,
$$V_f(-\infty,b] = V_f(-\infty,a] + V_f[a,b]$$
and
$$V_f[a,\infty) = V_f[a,b] + V_f[b,\infty).$$

(ii) If $f \in \mathcal{V}(\mathbb{R})$, then
$$\lim_{x\to-\infty} V_f(-\infty,x] = 0.$$

(iii) $f \in \mathcal{V}(\mathbb{R})$ if and only if $f$ can be represented as the difference of two monotone nondecreasing bounded functions. If, in addition, $f$ is a distribution function, then the latter representation is of two distribution functions. □

Proof. Parts (i) and (ii) are left for the reader (Problem 2.21). (iii) Denote $v_f(x) = V_f(-\infty,x]$ and
$$F := v_f + f \quad\text{and}\quad G := v_f - f.$$
Clearly, $v_f$ is a monotone nondecreasing and bounded function. Let $x < y$. Then, because $|f(y) - f(x)| \le V_f[x,y]$,
$$F(y) - F(x) = V_f[x,y] + f(y) - f(x) \ge 0.$$
The proof that $G$ is monotone nondecreasing is analogous. Now,
$$f = \tfrac{1}{2}F - \tfrac{1}{2}G$$
is a pertinent representation. Finally, if $f$ is a distribution function, then so is $v_f$, due to part (ii) and Theorem 2.3 (xiii). □
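The decomposition used in the proof of part (iii) can be tried out on a grid. The sketch below rests on assumptions made here for illustration: it works on the compact interval $[0, 2\pi]$ rather than on $\mathbb{R}$, replaces the variation function by running sums of increments along a fixed grid, and uses $f = \sin$. The names `cumulative_variation` and `is_nondecreasing` are hypothetical helpers. It checks that the discrete analogues of $F = v_f + f$ and $G = v_f - f$ are nondecreasing and that $f = \frac{1}{2}F - \frac{1}{2}G$.

```python
import math

# Sketch: discrete analogue of F = v_f + f and G = v_f - f on a grid (illustrative).


def cumulative_variation(values):
    """Running sums of |f(t_k) - f(t_{k-1})| along the grid."""
    running = [0.0]
    for prev, cur in zip(values, values[1:]):
        running.append(running[-1] + abs(cur - prev))
    return running


def is_nondecreasing(seq, tol=1e-12):
    """True if the sequence never drops by more than a rounding tolerance."""
    return all(b >= a - tol for a, b in zip(seq, seq[1:]))


grid = [2 * math.pi * k / 400 for k in range(401)]
f_vals = [math.sin(t) for t in grid]
v_vals = cumulative_variation(f_vals)
F_vals = [v + f for v, f in zip(v_vals, f_vals)]
G_vals = [v - f for v, f in zip(v_vals, f_vals)]

if __name__ == "__main__":
    print("F nondecreasing:", is_nondecreasing(F_vals))
    print("G nondecreasing:", is_nondecreasing(G_vals))
    print("max |f - (F - G)/2| =",
          max(abs(f - (F - G) / 2) for f, F, G in zip(f_vals, F_vals, G_vals)))
```

Although $\sin$ itself is not monotone, both combinations are, which is exactly the mechanism behind the representation of a function of bounded variation as a difference of monotone functions.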
2.6 Definition. A function $f \in \mathcal{V}(\mathbb{R})$ is said to be a signed distribution function (in notation, $f \in \mathfrak{D}_s$) if it is right-continuous and vanishes at $-\infty$.

2.7 Theorem. Let $\mathfrak{D}$ be the operator defined on the set $\mathfrak{S}_f(\mathbb{R},\mathfrak{B})$ of all finite signed measures by
$$f(x) = \mathfrak{D}(\nu)(x) = \nu((-\infty,x]), \quad x \in \mathbb{R}. \tag{2.7}$$
Then $f$ is a signed distribution function and $[\mathfrak{S}_f(\mathbb{R},\mathfrak{B}), \mathfrak{D}_s, \mathfrak{D}]$ is a bijection.

Proof. 1) Given $\nu \in \mathfrak{S}_f(\mathbb{R},\mathfrak{B})$, let $f = \mathfrak{D}(\nu)$, in accordance with (2.7). We will show that $f \in \mathfrak{D}_s$. If $x_0 < x_1 < \cdots < x_n$ are real numbers, then from
$$\sum_{k=1}^n |f(x_k) - f(x_{k-1})| = \sum_{k=1}^n |\nu((x_{k-1},x_k])| \le |\nu|(\mathbb{R}) = \|\nu\|$$
it follows that the total variation of $f$ is bounded by $\|\nu\|$ and therefore $f \in \mathcal{V}(\mathbb{R})$. Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition. Then $\nu^+$ and $\nu^-$ are positive finite Borel-Lebesgue-Stieltjes measures that induce two distribution functions $f^+$ and $f^-$, respectively. The latter, because of (2.7), yield
$$f = f^+ - f^-,$$
perfectly in agreement with Theorem 2.5 (iii). Consequently, $f$ is right-continuous and vanishes at $-\infty$, because $f^+$ and $f^-$ do. Hence, $f \in \mathfrak{D}_s$.

2) Let $f \in \mathfrak{D}_s$. Then, by Theorem 2.5 (iii), $f$ can be represented as the difference $f = f_1 - f_2$ of two distribution functions. Let $\mu_1$ and $\mu_2$ be two finite positive Borel-Lebesgue-Stieltjes measures induced by $f_1$ and $f_2$, respectively. Define $\nu = \mu_1 - \mu_2$. Then, obviously, $\nu \in \mathfrak{S}_f(\mathbb{R},\mathfrak{B})$. Furthermore,
$$\nu((-\infty,x]) = \mu_1((-\infty,x]) - \mu_2((-\infty,x]) = f_1(x) - f_2(x) = f(x)$$
agrees with (2.7), thereby justifying that $\mathfrak{D}(\nu) = f$ and that $\nu$ is an element of $\mathfrak{S}_f(\mathbb{R},\mathfrak{B})$ induced by $f$ or, equivalently, that
$$\nu \in \mathfrak{D}^*(\{f\}).$$
Hence, we showed that $[\mathfrak{S}_f(\mathbb{R},\mathfrak{B}), \mathfrak{D}_s, \mathfrak{D}]$ is surjective. It remains to prove that $[\mathfrak{S}_f(\mathbb{R},\mathfrak{B}), \mathfrak{D}_s, \mathfrak{D}]$ is also injective. The need for this is as follows. Since the representation $f = f_1 - f_2$ is not unique (it is easily seen that if $f_1 - f_2$ is a decomposition of $f$, then $(f_1 + \varphi) - (f_2 + \varphi)$, with $\varphi$ ranging over distribution functions, gives a class of decompositions of $f$ into two distribution functions), $\mathfrak{D}^{-1}(f)$ may potentially yield more than one signed measure in accordance with the procedure specified in part 2). On the other hand, it is obvious that any representation of $f$ will induce signed measures, all of which will coincide on the family $\{(-\infty,x]: x \in \mathbb{R}\}$ and thus must be equal. Notice, however, that we do not have such a concept as uniqueness of the Caratheodory extension for signed measures, and we must proceed by using the Jordan decomposition instead.
3) Suppose that $\mathfrak{D}^*(\{f\}) = \{\nu,\mu\}$ and let $f_\nu^+$, $f_\nu^-$, $f_\mu^+$, and $f_\mu^-$ be the distribution functions corresponding to the respective Jordan decompositions of $\nu$ and $\mu$. Then, we have
$$f = f_\nu^+ - f_\nu^- = f_\mu^+ - f_\mu^-,$$
which yields that
$$f_\nu^+ + f_\mu^- = f_\mu^+ + f_\nu^-.$$
On the other hand, $\nu^+ + \mu^-$ and $\mu^+ + \nu^-$, which correspond to the distribution functions $f_\nu^+ + f_\mu^-$ and $f_\mu^+ + f_\nu^-$, respectively, are positive, finite Borel-Lebesgue-Stieltjes measures, which must be equal because of $f_\nu^+ + f_\mu^- = f_\mu^+ + f_\nu^-$ and the uniqueness of Borel-Lebesgue-Stieltjes measures (induced by identical distribution functions). This leads to the equality $\nu = \mu$ and therefore completes the proof of the theorem. □
Clearly, $\mathfrak{D}_s \subseteq \mathcal{V}(\mathbb{R})$. Now, let $g \in \mathcal{V}(\mathbb{R})$. Then, by Theorem 2.5 (iii), $g$ can be represented as $g = g^+ - g^-$, where $g^+$ and $g^-$ are two bounded monotone nondecreasing functions vanishing at $-\infty$. We can convert them to distribution functions by letting $f^+(x) = g^+(x+)$ and $f^-(x) = g^-(x+)$, hence making $f^+$ and $f^-$ elements of $\mathfrak{D}$. Therefore, there is a bijective operator acting from $\mathfrak{D}_s$ to $\mathcal{V}(\mathbb{R})$, and Theorem 2.7 can be restated as follows.

2.8 Theorem. There is a bijective map between the set $\mathfrak{S}_f(\mathbb{R},\mathfrak{B})$ of all finite signed Borel-Lebesgue-Stieltjes measures and the set $\mathcal{V}(\mathbb{R})$ of all functions of bounded variation on $\mathbb{R}$. □
PROBLEMS
2.1 Prove part (i) of Theorem 2.3.

2.2 Prove part (ii) of Theorem 2.3.

2.3 Prove part (iii) of Theorem 2.3.

2.4 Prove part (iv) of Theorem 2.3.

2.5 Prove part (v) of Theorem 2.3.

2.6 Prove part (vi) of Theorem 2.3.

2.7 Prove part (vii) of Theorem 2.3.

2.8 Prove part (viii) of Theorem 2.3.

2.9 Prove part (ix) of Theorem 2.3.

2.10 Prove part (x) of Theorem 2.3.

2.11 Prove part (xi) of Theorem 2.3.

2.12 Prove part (xii) of Theorem 2.3.

2.13 Prove part (xiii) of Theorem 2.3.

2.14 Prove part (xiv) of Theorem 2.3.

2.15 Find the total variation of the function
defined on $[0,1]$.

2.16 Find the total variation of the function
defined on $[0,2]$.

2.17 Show that the function
is of bounded variation on $[0,1]$.

2.18 Prove that a differentiable function on $[a,b]$ with a bounded derivative is a function of bounded variation.

2.19 Must a uniformly convergent series on $[a,b]$ of functions of bounded variation be of bounded variation?

2.20 If $f$ has a Riemann integrable derivative on $[a,b]$, prove that its total variation is
$$V_f[a,b] = \int_a^b |f'(x)|\,dx.$$

2.21 Prove parts (i) and (ii) of Theorem 2.5.
NEW TERMS: variation of a function on a bounded interval; function of bounded variation; variation of a function on an unbounded interval; total variation of a function; signed distribution function
3. ABSOLUTELY CONTINUOUS FUNCTIONS

Below we introduce the concept of absolute continuity of functions and establish its connection with absolute continuity of measures.
3.1 Definition. A function $[\mathbb{R},\mathbb{R},f]$ is called absolutely continuous on a compact interval $[a,b]$ (in notation, $f \in \mathcal{A}[a,b]$), if for each $\varepsilon > 0$ there is a $\delta > 0$ such that for any finitely many bounded disjoint open subintervals $(a_i,b_i)$, $i = 1,\ldots,n$, of $[a,b]$ with
$$\sum_{i=1}^n (b_i - a_i) < \delta \tag{3.1}$$
it holds true that
$$\sum_{i=1}^n |f(b_i) - f(a_i)| < \varepsilon. \tag{3.1a}$$
A function $f$ is called absolutely continuous on $\mathbb{R}$ or just absolutely continuous (in notation, $f \in \mathcal{A}(\mathbb{R})$), if for each $\varepsilon > 0$, there is a $\delta > 0$ such that for any finitely many bounded disjoint open intervals $(a_i,b_i)$, $i = 1,\ldots,n$, satisfying (3.1), inequality (3.1a) holds. □
Clearly, the absolute continuity of a function $f$ on $\mathbb{R}$ or an interval implies uniform continuity (the converse is not true), which in turn makes $f$ measurable.
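As a numerical illustration of condition (3.1)-(3.1a), the following Python sketch tests a Lipschitz function. It assumes $f = \sin$ (Lipschitz constant 1), for which the choice $\delta = \varepsilon$ works; the helper names `disjoint_intervals`, `increment_sum`, and `worst_increment_sum` are hypothetical and not part of the text. Random sampling can only fail to find a counterexample, so this is a sanity check under stated assumptions rather than a proof.

```python
import math
import random

def disjoint_intervals(n, delta, rng, lo=0.0, hi=10.0):
    """Return n disjoint open intervals (a, b) in [lo, hi] with total length < delta."""
    pts = sorted(rng.uniform(lo, hi) for _ in range(2 * n))
    pairs = [(pts[2 * i], pts[2 * i + 1]) for i in range(n)]
    total = sum(b - a for a, b in pairs)
    # Shrink each interval towards its left endpoint; shrinking keeps them disjoint.
    scale = min(1.0, 0.9 * delta / total) if total > 0 else 1.0
    return [(a, a + scale * (b - a)) for a, b in pairs]

def increment_sum(f, intervals):
    """Left-hand side of (3.1a): sum of |f(b_i) - f(a_i)|."""
    return sum(abs(f(b) - f(a)) for a, b in intervals)

def worst_increment_sum(f, delta, trials=200, seed=0):
    """Largest value of the sum in (3.1a) seen over random interval families obeying (3.1)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        intervals = disjoint_intervals(rng.randint(1, 20), delta, rng)
        worst = max(worst, increment_sum(f, intervals))
    return worst

eps = 1e-3
# sin is 1-Lipschitz, so delta = eps is an admissible threshold (assumption of this sketch).
print(worst_increment_sum(math.sin, delta=eps) < eps)
```

For a function that is merely uniformly continuous, no such threshold need exist, which is the content of the remark that the converse fails.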
3.2 Proposition. $\mathcal{A}[a,b] \subseteq \mathcal{V}[a,b]$.
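Proof. Let $f \in \mathcal{A}[a,b]$. Then, for $\varepsilon = 1$, there is a $\delta > 0$ such that for any $n$-tuple of disjoint open intervals $\{(a_k,b_k)\}_{k=1}^n$ with
$$\sum_{k=1}^n (b_k - a_k) < \delta$$
it holds true that
$$\sum_{k=1}^n |f(b_k) - f(a_k)| < 1.$$
Let us make a partition $P = \{a_0 = a < a_1 < \cdots < a_N = b\}$ of $[a,b]$ into subintervals with $\operatorname{mesh} P < \delta$. Then for each $[a_{k-1},a_k]$, its arbitrary decomposition
$$a_{k-1} = c_0 < c_1 < \cdots < c_m = a_k$$
yields
$$\sum_{j=1}^m (c_j - c_{j-1}) = a_k - a_{k-1} < \delta$$
and thus
$$\sum_{j=1}^m |f(c_j) - f(c_{j-1})| < 1.$$
Consequently, $V_f[a_{k-1},a_k] \le 1$ and thus $V_f[a,b] \le N$.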
3.3 Example. From Example 2.2 and Proposition 3.2, it immediately follows that although the function
is continuous, it is not absolutely continuous. □
3.4 Remark. Proposition 3.2, however, does not imply that $\mathcal{A}(\mathbb{R}) \subseteq \mathcal{V}(\mathbb{R})$. For instance, the identity function $f(x) = x$ is absolutely continuous, but not of bounded variation on $\mathbb{R}$. □
Let f be a signed distribution function that generates a finite signed Borel-Lebesgue-Stieltjes measure v. The theorem below states that v is absolutely continuous if f is, and vice versa. Prior to this, we need the following lemma.
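3.5 Lemma. If $f \in \mathcal{A}(\mathbb{R}) \cap \mathcal{V}(\mathbb{R})$, then $v_f(x) := V_f(-\infty,x] \in \mathcal{A}(\mathbb{R})$.
Proof. Since $f \in \mathcal{A}(\mathbb{R})$, given any $\varepsilon > 0$, there is a $\delta > 0$ such that for any $n$-tuple of disjoint open intervals $\{(a_k,b_k)\}_{k=1}^n$ with
$$\sum_{k=1}^n (b_k - a_k) < \delta,$$
it holds true that
$$\sum_{k=1}^n |f(b_k) - f(a_k)| < \frac{\varepsilon}{2}.$$
Since
$$V_f[a_k,b_k] = \sup\Big(\sum |f(\cdot) - f(\cdot)| \text{ over all finite partitions of } [a_k,b_k]\Big),$$
for each $\varepsilon > 0$ and each $k$, there is a partition $a_k = a_0^k < a_1^k < \cdots < a_{N_k}^k = b_k$ such that
$$V_f[a_k,b_k] - \frac{\varepsilon}{2n} \le \sum_{j=1}^{N_k} |f(a_j^k) - f(a_{j-1}^k)|.$$
Therefore,
$$\sum_{k=1}^n V_f[a_k,b_k] - \frac{\varepsilon}{2} \le \sum_{k=1}^n \sum_{j=1}^{N_k} |f(a_j^k) - f(a_{j-1}^k)|. \tag{3.5}$$
On the other hand, since
$$\sum_{k=1}^n \sum_{j=1}^{N_k} (a_j^k - a_{j-1}^k) = \sum_{k=1}^n (b_k - a_k) < \delta,$$
we have
$$\sum_{k=1}^n \sum_{j=1}^{N_k} |f(a_j^k) - f(a_{j-1}^k)| < \frac{\varepsilon}{2}.$$
Consequently, from (3.5),
$$\sum_{k=1}^n V_f[a_k,b_k] < \varepsilon. \tag{3.5a}$$
By Theorem 2.5 (i), and by our assumption that $f \in \mathcal{V}(\mathbb{R})$,
$$V_f[a_k,b_k] = v_f(b_k) - v_f(a_k),$$
which allows us to rewrite (3.5a) in the form
$$\sum_{k=1}^n |v_f(b_k) - v_f(a_k)| < \varepsilon$$
and thereby complete the proof. □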
3.6 Corollary. Let $f$ and $v_f$ be as in Lemma 3.5. Then, the functions $F := v_f + f$ and $G := v_f - f$ are absolutely continuous, bounded, and monotone nondecreasing. If $f \in \mathcal{A}(\mathbb{R}) \cap \mathcal{V}(\mathbb{R})$ and vanishes at infinity, then it can be represented as the difference,
$$f = \tfrac{1}{2}F - \tfrac{1}{2}G,$$
of two absolutely continuous distribution functions.
Proof. From Lemma 3.5, and linearity of $\mathcal{A}(\mathbb{R})$, it follows that $F$ and $G$ are elements of $\mathcal{A}(\mathbb{R})$ and they are obviously bounded. The rest is due to Theorem 2.5 (iii). □
3.7 Proposition. If $f \in \mathcal{A}[a,b]$, then $f$ can be represented as the difference of two distribution functions on $[a,b]$.
3.8 Theorem. Let $\nu \in \mathfrak{S}_f(\mathbb{R},\mathcal{B})$ and $f_\nu \in \mathfrak{D}_s$ be the corresponding signed distribution function generated by $\nu$. Then the following are equivalent:
(i) $\nu \ll \lambda$.
(ii) $f_\nu \in \mathcal{A}(\mathbb{R})$.
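Proof.
(i) Let $\nu \in \mathfrak{S}_f(\mathbb{R},\mathcal{B})$ with $\nu \ll \lambda$. Since $|\nu|$ is a positive finite measure and absolutely continuous too, by Proposition 5.6, Chapter 6, for each $\varepsilon > 0$, there is a positive $\delta$ such that for each Borel set $A$ with $\lambda(A) < \delta$, $|\nu|(A) < \varepsilon$. In particular, if $A = \bigcup_{k=1}^n (a_k,b_k)$ is a disjoint union with $\lambda(A) < \delta$, we have that
$$\sum_{k=1}^n |f_\nu(b_k) - f_\nu(a_k)| = \sum_{k=1}^n |\nu((a_k,b_k])| \le \sum_{k=1}^n |\nu|((a_k,b_k]) = |\nu|(A) < \varepsilon,$$
since $|\nu|(\{b_k\}) = 0$ for each $k$. This implies that $f_\nu \in \mathcal{A}(\mathbb{R})$.
(ii) Now, let $f \in \mathcal{A}(\mathbb{R}) \cap \mathfrak{D}_s$. Since $\nu = \nu_f$ is finite, the total variation of $f$ is finite. Therefore, $f \in \mathcal{V}(\mathbb{R}) \cap \mathcal{A}(\mathbb{R})$, and then, by Lemma 3.5, $v_f \in \mathcal{A}(\mathbb{R})$. By Corollary 3.6, the functions
$$F := \frac{v_f + f}{2} \quad\text{and}\quad G := \frac{v_f - f}{2}$$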
are absolutely continuous, bounded, monotone nondecreasing, and vanishing at $-\infty$. In particular, being absolutely continuous, $F$ and $G$ are elements of $\mathfrak{D}$. Let $\mu_F$ and $\mu_G$ be the corresponding finite Borel-Lebesgue-Stieltjes measures induced by $F$ and $G$, respectively. Because $F - G = f$, the signed measure $\tilde{\nu}_f := \mu_F - \mu_G$ is clearly an element of $\mathfrak{S}_f(\mathbb{R},\mathcal{B})$. Since, as we know from Theorem 2.7 (cf. the proof of part 3)), $\nu_f$ does not depend on the decomposition of $f$, we have that $\tilde{\nu}_f = \nu_f$. It remains to show that $\mu_F \ll \lambda$ and $\mu_G \ll \lambda$. We will again use Proposition 5.6, Chapter 6. Given $\varepsilon > 0$, let $B$ be a Borel set such that $\lambda(B) < \frac{\delta}{2}$, where $\delta$ is the "threshold" taken from the absolute continuity condition of the distribution function $F$. By regularity of $\lambda$ (see Theorem 7.17 and Remark 7.18), there is an open superset $U$ of $B$ such that
$$\lambda(U) < \lambda(B) + \frac{\delta}{2} < \delta. \tag{3.8}$$
On the other hand, by Problem 2.10, Chapter 4, $U$ can be represented as at most a countable union of disjoint semi-open intervals:
$$U = \bigcup_j I_j, \qquad I_j = (a_j,b_j],$$
so that, from (3.8),
$$\sum_j (b_j - a_j) = \lambda(U) < \delta. \tag{3.8a}$$
Now, by absolute continuity of $F$, for any finite subcollection of $\{I_j = (a_j,b_j]\}$, say $\{(a_j,b_j]\}_{j=1}^n$, because of (3.8a),
$$\mu_F\Big(\bigcup_{j=1}^n I_j\Big) = \sum_{j=1}^n \big(F(b_j) - F(a_j)\big) < \varepsilon.$$
By continuity from below of $\mu_F$, we have that $\mu_F(U) \le \varepsilon$. Since $B \subseteq U$, $\mu_F(B) \le \varepsilon$. In summary, we showed that for each $\varepsilon > 0$, there is a $\delta_0 = \frac{\delta}{2}$ such that for every Borel set $B$ with $\lambda(B) < \delta_0$, $\mu_F(B) \le \varepsilon$, and therefore $\mu_F \ll \lambda$. The same applies to $\mu_G$ and, consequently, to $\nu_f$. □
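3.9 Theorem. A function $f \in \mathcal{A}(\mathbb{R}) \cap \mathfrak{D}_s$ if and only if there is $g \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ such that
$$f(x) = \int_{-\infty}^x g(u)\,\lambda(du). \tag{3.9}$$
Proof.
(i) Suppose $f \in \mathcal{A}(\mathbb{R}) \cap \mathfrak{D}_s$. Then, since $f$ is a signed distribution function, by Theorem 2.7, there is a unique signed Borel-Lebesgue-Stieltjes measure $\nu_f \in \mathfrak{S}_f(\mathbb{R},\mathcal{B})$ induced by $f$. Because $f$ is absolutely continuous, by Theorem 3.8, $\nu_f \ll \lambda$. Therefore, by the Radon-Nikodym Theorem 2.2 (case 5a), $\nu_f$ has a Radon-Nikodym density $g \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ with respect to $\lambda$, i.e.,
$$\nu_f = \int g\,d\lambda. \tag{3.9a}$$
In particular,
$$f(x) = \nu_f((-\infty,x]) = \int_{-\infty}^x g(u)\,\lambda(du).$$
(ii) Conversely, let $g \in L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$. Define $\nu = \int g\,d\lambda$. Then, $\nu \in \mathfrak{S}_f(\mathbb{R},\mathcal{B})$ and thus, by Theorem 2.7, $\nu$ induces the signed distribution function $f_\nu$, defined as $f_\nu(x) = \nu((-\infty,x])$. On the other hand, since $\nu \ll \lambda$, by Theorem 3.8, $f_\nu \in \mathcal{A}(\mathbb{R})$. □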
3.10 Corollary (Lebesgue). Let $f$ be defined as
$$f(x) = \int_{-\infty}^x g(u)\,\lambda(du)$$
with $g \in L^1(\mathbb{R},\mathcal{B},\lambda)$. Then $f$ is differentiable $\lambda$-a.e. and $f' = g$ $\lambda$-a.e.
Proof. By Theorem 3.9, $f$ is a signed distribution function and $g$ is a Radon-Nikodym density of $\nu = \int g\,d\lambda$. Let $f = f^+ - f^-$, where $f^+ = \int g^+\,d\lambda$ and $f^- = \int g^-\,d\lambda$. Then, by Theorem 3.9, $f^+$ and $f^-$ are distribution functions. The statement is now due to Corollary 1.5. □
The following theorem clarifies the "ambiguity" caused by Theorem 1.8 and establishes a noteworthy criterion for the equality in (1.8).
3.11 Theorem. A function $f$ is absolutely continuous on an interval $[a,b]$ if and only if it is differentiable $\lambda$-a.e. and it can be represented as
$$f(x) = f(a) + \int_a^x f'(u)\,\lambda(du). \tag{3.11}$$
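Proof. Let $f \in \mathcal{A}[a,b]$. Then, by Proposition 3.2, $f \in \mathcal{V}[a,b]$. Now, denote
$$F(x) = \begin{cases} 0, & x < a, \\ f(x) - f(a), & a \le x \le b, \\ f(b) - f(a), & x > b, \end{cases}$$
which is defined on $\mathbb{R}$ and is clearly an element of $\mathcal{A}(\mathbb{R}) \cap \mathfrak{D}_s$. By formula (3.9a) of Theorem 3.9, there is a $g \in L^1(\mathbb{R},\mathcal{B},\lambda)$ such that
$$F(x) = \int_{-\infty}^x g(u)\,\lambda(du).$$
By Corollary 3.10, $F$, and therefore $f$, must be differentiable $\lambda$-a.e. on $[a,b]$ and $f' = g$ $\lambda$-a.e. on $[a,b]$. The converse is obvious. □
Theorem 3.11 immediately yields:
3.12 Corollary. Let $f \in \mathcal{A}[a,b]$ and $f' \in [0]_\lambda$. Then $f$ is a constant on $[a,b]$. □
A challenging exercise is to prove Corollary 3.12 without the use of Theorem 3.11.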
PROBLEMS
3.1
Prove Corollary 3.12 without use of Theorem 3.11: Let $f \in \mathcal{A}[a,b]$ and $f' \in [0]_\lambda$. Show that $f$ is a constant on $[a,b]$. [Hint: Use Vitali's Covering Theorem: Let $E$ be a subset of $\mathbb{R}$ and $\mathfrak{J}$ be a system of closed nonempty intervals. If for each $x \in E$ and $\varepsilon > 0$, there is an interval $I \in \mathfrak{J}$ such that $x \in I$ and $\lambda(I) < \varepsilon$, then system $\mathfrak{J}$ is said to be a Vitali covering of set $E$. Let $E$ be a bounded set and $\mathfrak{J}$ be its Vitali covering. Then there is an at most countable subfamily of disjoint intervals from $\mathfrak{J}$ that covers all points of $E$ except for possibly its $\lambda$-negligible subset.]
3.2
Show that a sum or a product of finitely many absolutely continuous functions on [a,b] is absolutely continuous.
3.3
Show that absolute continuity on [a,b] implies uniform continuity.
3.4
Let $f, g \in \mathcal{A}[a,b]$ and $g(x) \neq 0$, $x \in [a,b]$. Prove that $\frac{f}{g} \in \mathcal{A}[a,b]$.
3.5
Let $f \in \mathcal{A}[a,b]$ such that $f$ is also monotone nondecreasing. Suppose that $F \in \mathcal{A}[f(a),f(b)]$. Prove that $F \circ f \in \mathcal{A}[a,b]$.
NEW TERMS: absolutely continuous function on a compact interval; absolutely continuous function (on $\mathbb{R}$); Vitali's Covering Theorem
4. SINGULAR FUNCTIONS
We will continue our discussion on singularity of signed measures, started in Section 3, Chapter VIII, and connect this notion to that for distribution functions. Recall that a signed Borel-Lebesgue-Stieltjes measure $\nu$ is singular-continuous if $\nu$ is singular (i.e., $\nu \perp \lambda$) and $\nu$ is continuous (i.e., for each $x \in \mathbb{R}$, $\nu(\{x\}) = 0$). $\nu$ is atomic if there is an at most countable set $A$ of real numbers such that $\nu(\{a\}) > 0$ for each $a \in A$ and $\nu(A^c) = 0$. Since $\nu \perp \lambda$, it is also called singular-discrete. Binomial, geometric, and Poisson measures are examples of positive singular-discrete Borel-Lebesgue-Stieltjes measures.
4.1 Definition. A function $f$ is called singular-continuous if it is continuous, not a constant, $\lambda$-a.e. differentiable, and its derivative is zero $\lambda$-a.e. [Observe that by Corollary 3.12, a singular-continuous function is continuous but not absolutely continuous.]
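The observation in Definition 4.1 can be illustrated numerically with the Cantor ternary function of Example 1.9. The sketch below assumes the standard ternary-digit construction of that function; the names `cantor` and `level_intervals` are hypothetical. It covers the Cantor set by the $2^n$ intervals of the $n$th construction step: their total length $(2/3)^n$ falls below any $\delta$, while the increments of the function over them always add up to 1, so the condition of Definition 3.1 fails for every $\varepsilon \le 1$.

```python
from fractions import Fraction

def cantor(x, max_digits=200):
    """Cantor ternary function on [0, 1].

    Exact for triadic rationals (endpoints of construction intervals);
    for other inputs the ternary expansion is truncated after max_digits digits.
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        if x == 0:
            break
        if x < Fraction(1, 3):        # ternary digit 0
            x = 3 * x
        elif x > Fraction(2, 3):      # ternary digit 2
            value += weight
            x = 3 * x - 2
        else:                         # x lies in a removed middle third (or on its boundary)
            return value + weight
        weight /= 2
    return value

def level_intervals(n):
    """Endpoints of the 2**n intervals left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

for n in (1, 5, 10):
    ivs = level_intervals(n)
    total_length = sum(b - a for a, b in ivs)
    total_increment = sum(cantor(b) - cantor(a) for a, b in ivs)
    print(n, float(total_length), total_increment)
```

Exact rational arithmetic is used so that the two sums can be compared without rounding error.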
4.2 Example (Cantor Singular-Continuous Function). From Example 1.9, the Cantor ternary function $F$ is monotone nondecreasing and singular-continuous. Let $\mu_F$ be the corresponding Borel-Lebesgue-Stieltjes measure. Since $F$ is constant on $A_k^{(n)}$, it follows that $\mu_F(A_k^{(n)}) = 0$ and thus $\mu_F(C^c) = 0$. On the other hand, $\lambda(C) = 0$. Thus, $\mu_F \perp \lambda$. Furthermore, since $F$ is continuous, $\mu_F(\{x\}) = 0$ for all $x \in [0,1]$. Therefore, $\mu_F$ is a singular-continuous Borel-Lebesgue-Stieltjes measure induced by $F$. □
The above example gives rise to a seemingly close relation between singular-continuous distribution functions and singular-continuous Borel-Lebesgue-Stieltjes measures. We will start with the following:
4.3 Theorem. Let $\mu$ be a positive $\sigma$-finite singular-continuous Borel-Lebesgue-Stieltjes measure. Then the corresponding extended distribution function $f$ is singular-continuous.
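Proof. Let $\mu \perp \lambda$. Then, there is a Borel set $A$ such that $\mu(A) = \lambda(A^c) = 0$. Since $f := f_\mu$ is an extended distribution function, by Corollary 1.6, $f'$ exists $\lambda$-a.e. and clearly $f' \ge 0$ everywhere it exists. We will show that $E = \{x : f'(x) > 0\} \in \mathcal{N}_\lambda$.
(i) Since $f' = f'\mathbf{1}_A = f'\mathbf{1}_{A \cap E}$ $\lambda$-a.e., we will prove that $\int_A f'\,d\lambda = 0$. If so, $\int_E f'\,d\lambda = \int_A f'\,d\lambda = 0$ and thus $f' = 0$ $\lambda$-a.e.
By Theorem 1.8, for each compact interval $[a,b]$,
$$\int_a^b f'\,d\lambda \le f(b) - f(a) = \mu((a,b]). \tag{4.3}$$
Since $A$ is Borel and $\mu$ is $\sigma$-finite, by Theorem 2.28, Chapter 5, for each $\varepsilon > 0$, there is a disjoint sequence $\{I_n\}$ of semi-open intervals such that $A \subseteq \bigcup_{n=1}^\infty I_n$ and
$$\sum_{n=1}^\infty \mu(I_n) < \varepsilon.$$
(Notice that since $\mu(A) = 0$, the $\sigma$-finiteness of $\mu$ is not a necessary constraint to use Theorem 2.28.) Because of (4.3),
$$\int_A f'\,d\lambda \le \sum_{n=1}^\infty \int_{I_n} f'\,d\lambda \le \sum_{n=1}^\infty \mu(I_n) < \varepsilon.$$
Therefore,
$$\int_A f'\,d\lambda = 0.$$
(ii) $f$ is continuous, because $\mu$ is continuous, i.e., $\mu(\{x\}) = 0$ for all $x \in \mathbb{R}$. □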
4.4 Corollary. Let $\nu$ be a singular-continuous signed Borel-Lebesgue-Stieltjes measure and $f_\nu$ be the signed distribution function induced by $\nu$. Then, $f_\nu$ is singular-continuous.
Proof. Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition of $\nu$ and $f^+$ and $f^-$ be the corresponding distribution functions. Then, clearly $\nu^+$ and $\nu^-$ are singular-continuous finite positive Borel-Lebesgue-Stieltjes measures. The proof is complete after applying Theorem 4.3. □
4.5 Theorem. Let $f$ be an extended distribution function and $f' = 0$ $\lambda$-a.e. Then $\mu_f \perp \lambda$.
Proof. Denote $\mu = \mu_f$. Then, $\mu$ is a positive $\sigma$-finite Borel-Lebesgue-Stieltjes measure. By the Lebesgue Decomposition Theorem 3.4, Chapter 8, there is a unique decomposition $\mu = \mu_a + \mu_s$ such that $\mu_a \ll \lambda$ and $\mu_s \perp \lambda$. Assume first that $\mu$ is finite. Then, both $\mu_a$ and $\mu_s$ are finite. By Radon-Nikodym's Theorem 2.2 (case 1), there is a nonnegative $L^1$-function $g$ such that $\mu_a = \int g\,d\lambda$. By Lebesgue Corollary 3.10, the function
$$F(x) = \int_{-\infty}^x g(u)\,\lambda(du)$$
is differentiable $\lambda$-a.e. and $F' = g$ $\lambda$-a.e. On the other hand, $f = F + G$, where $G(x) := \mu_s((-\infty,x])$. By Theorem 4.3 (i), since $\mu_s \perp \lambda$, $G' = 0$ $\lambda$-a.e. and therefore, $F' = 0$ $\lambda$-a.e. and $g = 0$ $\lambda$-a.e. Consequently,
$$\mu_a = \int g\,d\lambda = 0$$
and it leaves $\mu = \mu_s \perp \lambda$. Now, if $\mu$ is $\sigma$-finite, let $\{R_n\}$ be a countable measurable partition of $\mathbb{R}$ so that $\mu_n = \mathrm{Res}_{\mathcal{B} \cap R_n}\,\mu$ is a finite Borel-Lebesgue-Stieltjes measure, which, according to the above arguments, is orthogonal to $\lambda$, i.e., there is a set $A_n \subseteq R_n$ such that $\mu_n(A_n) = \lambda(R_n \setminus A_n) = 0$. Therefore, the set
$$A = \bigcup_n A_n$$
is such that $\mu(A) = \lambda(A^c) = 0$. □
4.6 Corollary. Let $f$ be a singular-continuous signed distribution function and let $\nu_f$ be the signed Borel-Lebesgue-Stieltjes measure induced by $f$. Then, $\nu_f$ is a singular-continuous signed measure.
Proof. In the decomposition $f = F - G$ into two distribution functions, each one is singular-continuous. This, as we know, yields the decomposition $\nu_f = \mu_F - \mu_G$ into two finite positive Borel-Lebesgue-Stieltjes measures each one of which is singular-continuous due to Theorem 4.5. □
4.7 Definitions.
(i) An extended distribution function $D$ is said to be discrete if it is a monotone nondecreasing step function on any compact interval and it can be represented as
$$D = \sum_n d_n \mathbf{1}_{A_n}, \tag{4.7}$$
where $\{d_n\} \subseteq \mathbb{R}$ and $\bigcup_n A_n = \mathbb{R}$ is a countable decomposition of $\mathbb{R}$ into semi-open intervals. Due to Theorem 1.2, an extended discrete distribution function can also be defined as a piecewise constant monotone nondecreasing function. If $D = D_1 - D_2$ is a signed distribution function, with $D_i$ being discrete distribution functions, then $D$ is said to be a discrete signed distribution function. Since any discrete signed distribution function $D$ is almost everywhere constant, its derivative $D'$ exists $\lambda$-a.e. and $D' = 0$ $\lambda$-a.e. Unlike its singular-continuous counterpart, a discrete signed distribution function is not continuous and thus we can alternatively call it singular-discrete.
(ii) Any singular-discrete or singular-continuous signed distribution function is referred to as singular.
4.8 Remark. If $D$ is an extended discrete distribution function given by (4.7), it increases only at points $\{a_n\}$ of an at most countable set $A$ and it induces the following atomic measure
$$\mu = \sum_n \delta_n \varepsilon_{a_n},$$
where $\varepsilon_{a_n}$ denotes the point mass at $a_n$ and $\delta_n = d_n - d_{n-1} = \mu(\{a_n\}) > 0$. Correspondingly, any signed singular-discrete distribution function induces a unique signed singular-discrete Borel-Lebesgue-Stieltjes measure. Conversely, any signed singular-discrete Borel-Lebesgue-Stieltjes measure generates a unique signed singular-discrete distribution function. □
4.9 Theorem. Any signed distribution function $f$ can uniquely be decomposed as
$$f = f_a + f_{cs} + f_d, \tag{4.9}$$
where $f_a$, $f_{cs}$, $f_d$ are its absolutely continuous, singular-continuous, and discrete components, respectively. Furthermore, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e.
Proof. By Corollary 3.8, Chapter 8, any signed Borel-Lebesgue-Stieltjes measure $\nu$ can uniquely be decomposed as
$$\nu = \nu_a + \nu_{cs} + \nu_d$$
such that $\nu_a \ll \lambda$, $\nu_{cs}, \nu_d \perp \lambda$, and $\nu_{cs} \perp \nu_d$, where $\nu_{cs}$ and $\nu_d$ are singular-continuous and singular-discrete components of $\nu$. By the above theorems and propositions, each of the three components of $\nu$ induces a unique signed distribution function of its respective type and therefore, the signed distribution function $f_\nu$ (induced by $\nu$) is of the form
$$f_\nu = f_a + f_{cs} + f_d.$$
This representation is clearly unique. Conversely, if $f$ is a signed distribution function, it generates a unique signed Borel-Lebesgue-Stieltjes measure $\nu$, which by the above decomposition, in turn yields the corresponding unique decomposition
$$f = f_a + f_{cs} + f_d$$
of signed distribution functions. Finally, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e. □
The following provides a practical method for determining the decomposition of a distribution function. By Proposition 1.3, any monotone nondecreasing function $f$ can be represented as the sum of the monotone nondecreasing continuous function $f - Af$ and the step function (cumulative jump function of $f$) $Af$. The theorem below states how a continuous function of bounded variation can uniquely be represented as a sum of an absolutely continuous and singular-continuous function.
4.10 Theorem. Let $f \in \mathcal{V}[a,b] \cap \mathcal{C}[a,b]$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique.
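Proof.
(i) Existence. Since $f$ is differentiable $\lambda$-a.e. on $[a,b]$ we can define
$$\alpha(x) = f(a) + \int_a^x f'(u)\,\lambda(du) \quad\text{and}\quad \sigma = f - \alpha.$$
Since $f \in \mathcal{V}[a,b]$, it is bounded and it can be decomposed as the difference of two monotone nondecreasing functions. Hence, applying Theorem 1.8 to each of them we conclude that $f' \in L^1$. Then, by Theorem 3.9, $\alpha \in \mathcal{A}[a,b]$. As regards $\sigma$, it appears to be a linear combination of two $\mathcal{V}[a,b]$-functions, and therefore, its derivative $\sigma'$ exists $\lambda$-a.e. and wherever it exists, it is equal to $f' - \alpha' = 0$. Of course, $\sigma \in \mathcal{C}[a,b]$. Therefore, $\sigma$ is singular-continuous.
(ii) Uniqueness. Suppose $f = \alpha + \sigma = \tilde{\alpha} + \tilde{\sigma}$. Thus $\alpha - \tilde{\alpha} = \tilde{\sigma} - \sigma$. Since $\sigma' = \tilde{\sigma}' = 0$ $\lambda$-a.e., $(\alpha - \tilde{\alpha})' \in [0]_\lambda$. Furthermore, $\alpha - \tilde{\alpha} \in \mathcal{A}[a,b]$, and therefore, by Corollary 3.12, $\alpha - \tilde{\alpha} = \text{const}$. On the other hand,
$$\alpha(a) - \tilde{\alpha}(a) = f(a) - f(a) = 0.$$
The latter shows that $\alpha$ is identical to $\tilde{\alpha}$ and thus, $\sigma$ is identical to $\tilde{\sigma}$. □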
4.11 Corollary. Let $f \in \mathfrak{D}_s \cap \mathcal{C}(\mathbb{R})$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique.
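4.12 Proposition. If $f$ is a distribution function, then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous distribution function.
Proof. Let $f$ be defined on $[a,b]$. Since $f' \ge 0$, $\alpha(x) = f(a) + \int_a^x f'\,d\lambda$ is monotone nondecreasing. Furthermore, from Theorem 1.8, for $x \le y$,
$$\int_x^y f'\,d\lambda \le f(y) - f(x),$$
and hence $\alpha(y) - \alpha(x) \le f(y) - f(x)$ or in the form
$$\sigma(x) = f(x) - \alpha(x) \le f(y) - \alpha(y) = \sigma(y),$$
so that $\sigma$ is monotone nondecreasing too. Now, suppose that the domain of $f$ is $\mathbb{R}$. Set
$$\mu((-\infty,x]) = \int_{-\infty}^x f'\,d\lambda < \infty. \tag{4.12}$$
Since $f(x) \to 0$ for $x \to -\infty$, we have for $a \to -\infty$,
$$\alpha(x) = \int_{-\infty}^x f'\,d\lambda = \mu((-\infty,x]),$$
and $\alpha(x) \downarrow 0$ for $x \to -\infty$ by $\emptyset$-continuity of $\mu$. This also implies that $\sigma(x) \to 0$ for $x \to -\infty$. □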
4.13 Example. Consider the following distribution function:
We can decompose F as the sum of an absolutely continuous component,
Fdx)={
0,
x
x 2 1 ,
1 5 x 5 2
31
212,
and a discrete component,
The corresponding Borel-Lebesgue-Stieltjes measures are as follows:
PI =
r
w1(1121 ( x ) 25 X(d x)
and 1 p2 = 5Eg
1 + 2E1 -k 2 ~ 2 .
The form of p1 is due to (4.12).
□
4.14 Definition. Let $X$ be a random variable on a probability space $(\Omega,\Sigma,\mathbb{P})$ and valued in $(\mathbb{R},\mathcal{B})$, and let $\mathbb{P}_X = \mathbb{P}X^*$ be its probability distribution. The random variable is called:
a) continuous if $\mathbb{P}_X \ll \lambda$.
b) discrete if $\mathbb{P}_X$ is an atomic probability measure.
c) singular-continuous if $\mathbb{P}_X$ is a singular-continuous probability measure.
d) of a mixed type if $\mathbb{P}_X$ is a convex combination of at least two types from a, b or c.
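A minimal sketch of case d), assuming a hypothetical random variable whose distribution is $0.3\,\varepsilon_0 + 0.7\,U$, with $\varepsilon_0$ the point mass at 0 and $U$ the uniform distribution on $(0,1)$. Exact rational arithmetic is used to check that the distribution function splits into a discrete and an absolutely continuous component; the weights and all function names are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

P_ATOM = Fraction(3, 10)  # assumed weight of the point mass at 0 (illustrative)

def F(x):
    """Distribution function of 0.3 * (point mass at 0) + 0.7 * Uniform(0, 1)."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    return P_ATOM + (1 - P_ATOM) * x

def F_discrete(x):
    """Discrete component: a single jump of size P_ATOM at x = 0."""
    return P_ATOM if Fraction(x) >= 0 else Fraction(0)

def density(x):
    """Density (with respect to Lebesgue measure) of the absolutely continuous component."""
    return 1 - P_ATOM if 0 < Fraction(x) < 1 else Fraction(0)

def F_abs_cont(x):
    """Integral of the density over (-inf, x], written in closed form."""
    x = min(max(Fraction(x), Fraction(0)), Fraction(1))
    return (1 - P_ATOM) * x

# The two components add up to F on a grid of rational points.
grid = [Fraction(k, 8) for k in range(-8, 17)]
print(all(F(x) == F_abs_cont(x) + F_discrete(x) for x in grid))

# Away from 0 and 1 the slope of the absolutely continuous component equals the density.
h = Fraction(1, 16)
for x in (Fraction(-1, 2), Fraction(1, 2), Fraction(3, 2)):
    slope = (F_abs_cont(x + h) - F_abs_cont(x)) / h
    print(x, slope == density(x))
```

The singular-continuous component is absent in this example; adding a multiple of the Cantor function would give a distribution mixing all three types.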
PROBLEMS
4.1
Let $f$ be a continuous distribution function. Suppose that given $\varepsilon \ge 0$, $A = \{x \in [a,b] : f'(x) \text{ exists and } f'(x) \ge \varepsilon\}$. Prove that,
4.2
Let $f$ be a continuous distribution function. Suppose that given $\varepsilon \ge 0$, $A = \{x \in [a,b] : f'(x) \text{ exists and } f'(x) \le \varepsilon\}$. Show that
4.3
Let $\mu$ be a singular-continuous measure. Using Problems 4.1 and 4.2 prove that $f_\mu$ is a singular-continuous distribution function.
4.4
Give an alternative proof of Corollary 4.6 by using the Vitali Covering Theorem (see Problem 3.1).
4.5
Let $F$ be given as
Find the decomposition of $F$ such that $F = F_1 + F_2$, where $F_1$ is absolutely continuous and $F_2$ is a discrete distribution function and where the corresponding Lebesgue-Stieltjes measures $\mu_1$ and $\mu_2$ have the sum $\mu_F$.
NEW TERMS: singular-continuous function; Cantor singular-continuous function; discrete extended distribution function; discrete signed distribution function; singular-discrete distribution function; singular distribution function; decomposition of a signed distribution function; decomposition of a continuous signed distribution function; continuous random variable; discrete random variable; singular-continuous random variable; random variable of a mixed type
Index A-projection map 34 arsection of a function 363 arsection of a set 359 abelian group 47 absolutely continuous component of a signed measure 453 absolutely continuous function 535 absolutely continuous signed measure 437 absolutely continuous measure 348 accumulation point 71, 111 accumulation point of a filter base 169 accumulation point of a filter 169 accumulation point of a net, criterion for 177 accumulation point of a net 173 additive inverse 47 additive set function 222 additive group 47 N, 41 Alexander Subbase Theorem 146 Alexandrov compactification 191-192 algebra 46 algebra over a field 53 algebra of functions 84 algebra (field) 204 algebraic number 44 algebraic operation 47 al-Khowarismi, Muhammad 46 almost surely equality 448 almost uniform convergence of a sequence 474
antisymmetric binary relation 22 Arzeli's Theorem 158 Ascoli, Giulio 153 Ascoli's Theorem 153, 155 associative algebraic operation 47 associative laws 6 a t most countable set 41 atom (v-atom) 454 atom of a singular measure 454 atomic probability measure 226 atomic (discrete) measure 226 Axiom of Choice 27 Axiom of Extent 6 Bairejs Category Theorem 90 Banach, Stefan 59 Banach space 101 base parallelepiped 135 base, a construction of 118 base (simple) parallelepiped 116 base for a topology criterion for 115, 119 base sets 115 base for a topology 115 base neighborhood 110 Beppo Levi's Corollary 311 Bernoulli measure 231 bijective (onto and one-to-one) map 13 bilinear transformation 49 binary relation on a set 22 binary relation, 11 binomial series 161 binomial random variable 197 binomial measure 231
INDEX
Bolzano-Weierstrass compactness 93, 145 Borel, Emile 221 Borel-Lebesgue measure 260 Borel-Lebesgue measure of a cube under a linear map 411 Borel-Lebesgue-Stieltjes measure 261 Borel-measurable bounded functions 327 Borel measurable function 217 Borel measure 261, 493 Borel measure space 261 Borel outer measure 261 Borel u-algebra 210 Borel u-algebra, criterion of 213 bound of a function 89 boundary point 76, 111 boundary of a set 76, 112 bounded function 88 bounded set 26, 82 bounded sequence 74 box parallelepiped 136 box topology 136 branch of a function 12 Canonic chain of partitions 328 canonic representation (expansion) 29 1 Cantor, Georg I?. 3, 43, 107 Cantor singular-continuous function 543 Cantor ternary set 269 Cantor's ternary function 524, 525 Cantor's Theorem 42, 88 Carathdodory, Constantin 235, 236 Carathkodory's extension 241, 242
Carathdodory's Extension Theorem 236 Carathdodory's extension, uniqueness of 245, 246 cardinal number 40 carrier 4, 60 Cartesian product 11 Cartesian product of a sequence 31 Cartesian product of indexed family of sets 32 Cauchy , Augistin-Louis 329 Cauchy in measure sequence of functions 474 Cauchy integrable function 330 Cauchy-Schwarz inequality 461 Cauchy sequence 74 Cauchy sum 329 chain 25 chain rule for positive measures 351 chain rule for signed measures 449 chain rule in Banach spaces 394 chain rule in Euclidean spaces 395 change of variables for abstract integrals 341 change of variables for a bijective transformation 343 change of variables in Euclidean spaces 4 13 Chebyshev's inequality 474 choice function 27 closed set 69 closed ball in Rn, BorelLebesgue measure of 368 closed ball 70 closed set 109 closed semi-%-space 290
INDEX
closed %-space 284 closure of a set 69, 111 closure point 69, 111 closure point, criterion for 124 cluster point 111 C-'-space 216, 277 C - '(a, E;C) space 282 C - '(a, E ;R ) space 282 coarser topology 109 cocountable (count able complement) topology 112 codomain 12 cofinite (finite complement) topology 112 commutative algebraic operation 47 commutative laws 6 commutative ring 47 compact Hausdorff space 185 compact metric space 92 compact set 92, 144 compact sets in Hausdorff spaces 145 compact set under a continuous function 95 compact set under a continuous map 144 compact topological space 144 compactification 191 compactness, criteria of 93, 97 comparable elements 22 complement 6 complete Carathdodory 's extension 242 complete metric space 74 completely regular space 183 completeness criteria 87 completeness of a uniform metric space 152 completion of a measure 240
completion of a measure space 240 complex linear space 53 complex measure 430 complex measure space 430 component of a signed measure 453 composition of binary relations 13 composition of continuous functions 129 composition of maps 13 composition of measurable functions 217 conditional expectation given a random variable 448 conditional expectation given a a-hypothesis 446 conditional probability 446 congruence 23 congruence modulo z 24 conjugate exponents 62, 460 content 223 continuity criteria 79, 81 continuity from above a t the empty set 223, 423 continuity from above, criterion for 230 continuity from above on a sequence of sets 223 continuity from above on a a-algebra 223 continuity from below, criterion for 229 continuity from below on a sequence of sets 223 continuity from below on a c-algebra 222 continuity of a function a t a point 78, 172 continuity of a function, criteria for 128, 129, 130, 175 continuity of projection
INDEX
maps 136 continuous from above set function 423 continuous from below set function 423 continuous function 78, 128 continuous functions on a dense set 131 continuous measure 348 continuous random variable 445, 548 continuous singular measure 454 continuously differen tiable function 394 continuum 41 continuum hypothesis 43 convergence in mean 312 convergence in measure of a sequence of functions 472 convergence in probability (stochastic convergence) 483 convergence in the pth mean (LP-convergence) 463 convergence of a filter 169 convergence of a filter base 169 convergence of a filter base to a point, criterion for 178 convergence of a filter to a point, criterion for 177 convergence of a function along a filter base 169 convergence of a net to a point, criteria for 173, 176, 177 convergent sequence 74, 122 convolution of atomic measures 381 convolution of binomial measures 381 convolution of functions 383 convolution of a function and
measure 382, 383 convolution of measures 379 convolution of measures, properties for 379 convolution of point masses 380 convolution of Poisson measures 382 coordinate 32 countable (denumerable) set 41 coun tably compact topological space 149 counting measure 226 cover 92 cumulative jump function 519 cylinder 34 Darboux, Gaston 327, 329 Darboux lower sum 174, 327 Darboux upper sum 174, 327 d-bounded function 88 d-bounded set 82 decomposition of a continuous signed distribution function 547 decomposition of a positive measure 454 decomposition of a set 8 decomposition of a a-finite signed measure 456 decomposition of a signed distribution function 546 DeMorgan's laws 7 dense set 75, 111 density 346 derivative of a map 390 derived set 71, 111 diagonal 124 diameter of a set 82 diffeomorphism 396 difference 6 differentiable map 390
INDEX
Dini's Theorem 158 Diophantos 46 Dirac delta function 50 Dirac measure (point mass) 224 Dirac delta function, Fourier transform of 51 direct integrability 333 direct integral 333 directed set 172 Dirichlet's criterion 336 Dirichlet function 81, 297, 330 discrete convolution 49 discrete extended distribution function 543 discrete (atomic) measure 226 discrete metric 60 discrete random variable 279, 549 discrete signed distribution function 545 discrete topology 108 disjoint family of sets 8 disjoint sets 6 distance 60 distribution function 224, 263 distributive laws 7 domain 11 dominance of functions by sets 195 Dynkin system 204 Egorov9sTheorem 476 element of a set 3 elementary content 223 elementary events 4 embedded set 132 embedding 132 empty set (@) 4 @-continuity of measure 223 @-continuity of signed measure 423 &-bound of a family of
functions 486 equality of functions modulo p 304 equicontinuity 153 equipotent sets 40 equivalence kernel of a function 24 equivalence relation 22 equivalence class modulo M (E) 22 equivalent classes generated by a function 24 equivalent metrics 81 essential bound 467 essential supremum of a function 468 essentially bounded function 467 Euclidean metric (distance) 63 Euclidean (Frobenius) norm of a matrix 385 Euclidean norm 101 event 4 expectation of a function of a random variable 343 expectation of a random variable 303, 343 extended distribution function 224, 263 extended real-valued function 282 extended topology of pointwise convergence 284 extendibility of a formatter, criterion for 243 extendible formatter 242 extension of a function 14, 240 extension of a measure 240 Factor space 32 Fatou's Lemma for functions 313
557
INDEX
Fatou's Lemma for measures and functions 322 Fatou's Lemma for measures 318 Fatou's Lemma for measures and nonnegative functions 321 field 52, 204 filter 167 filter base 167 filter base generated by a net 177 filter base for a filter 167 filter generated by a filter base 168 filter that meets a set 180 finer topology 109 finite set 41 finite (Y-finite) set 422 finite set function 223 finite signed measure 422 finite subadditivity 228 first countable topological space 124 First Axiom of Countability 122 [f],-class 304 formatter of an outer measure 242 Frhchet, Maurice 59, 107 Frkchet derivative 390 Frobenius (Euclidean) norm of a matrix 387 F,-set 190 Fubini, Guido 356 Fubini's Theorem 367 Fubini's Theorem for monotone functions 52 1 function 11 function continuous a t a point, criterion for 128 function continuous at a point 128 function convergent along a
net 175 function of bounded variation 528, 529 fundamental system of neighborhoods 110 G a m m a function 324, 325 generalized HGlder's inequalities 471 generator of a subalgebra 161 geometric measure 429 greatest lower bound (infimum) 26 group 47 group automorphism 48 group endomorphism 48 group homomorphism 48 group isomorphism 48 G,-set 190 Hahn's Decomposition Theorem 425 Hahn decomposition 425 half-open interval (rectangle) 205 Hausdorff, Felix 59, 107 Hausdorff space, criterion for 125 Hausdorff (T2or separated) topological space 122, 182 Heine-Bore1 Theorem 94 hereditary property 143 Holder's inequality 62 Holder's inequality for Lw spaces 470 Holder's inequality for LP spaces 46 1 homeomorphic topological spaces 132 homeomorphism 132 homeomorphisms and Bore1 a-algebras, relationship between 218 homothetic function 277
INDEX
homothetic metric 100 hypercontinuum 43 Ideal 232 idempotence 7 identity function 14 image of a point 12 image measure 277 image of a set 12 improper Riemann integral 333 indefinite integral 346 independent family of random variables 378 independent random variables 378 indicator function 14 indiscrete topology 108 n-stable (intersection-stable) system 205 infinite signed measure 422 inner regular Bore1 measure 493 integrability of a complexvalued function 432 integrable simple function 466 integral of an extended nonnegative function 298 integral of a real-valued function relative to a signed measure 433 integral of an extended realvalued function 302 integral of a nonnegative simple function 296 integral with respect to the counting measure 371 integration by parts formula for Le besgue-St ieltj es integrals 375, 376 interchanging derivative and integral 317 interior of a set 69, 110
interior point 69, 110 intersection 6 ∩-stable family of sets 108 into map 13 inverse of function 13 inverse image of function 13 inverse image of an open set under a function 79 inverse 47 Inverse Mapping Theorem 397 isometric metric spaces 278 isometry 196 Jacobian 390 Jacobian matrix 390 joint distribution of random variables 378 Jordan, Camille 221 Jordan decomposition of a complex measure 430 Jordan decomposition of a signed measure 426 jump discontinuity 517 Kernel 48 Kσ-set 190 Largest element 26 λ-tail of a net 173 lattice 26 least upper bound (supremum) 26 Lebesgue, Henri 221, 295 Lebesgue decomposition of a signed measure 453 Lebesgue Decomposition Theorem 453 Lebesgue measure of a set under an affine map 406 Lebesgue Dominated Convergence Theorem
for L^p spaces 463 Lebesgue integral 302 Lebesgue-Stieltjes integral 302 Lebesgue measure 260 Lebesgue-Stieltjes content 261 Lebesgue-Stieltjes measure 261 Lebesgue σ-algebra 260 Lebesgue elementary content 224 Lebesgue outer measure 260 Lebesgue-Stieltjes elementary content 225 Lebesgue's Dominated Convergence Theorem for functions 314 Lebesgue's Dominated Convergence Theorem for measures 318 Lebesgue's Dominated Convergence Theorem for measures and functions 322 Lebesgue's Theorem of Riemann integrability 330 left distributive law 47 limit inferior 8 limit point of a filter 169 limit of a function at a point 170 limit of a sequence 8 limit point of a sequence 74, 122 limit point of a net 173 limit point of a set 74, 122 limit superior 8 Lindelöf set 92, 144 Lindelöf space 92 Lindelöf topological space 144 linear functional 103 linear operator 103 linear (total) order 22
linear space (vector space) over a field 53 Lipschitz condition 387 Lipschitz constant 387 locally compact Hausdorff space, properties for 186, 187, 189, 191, 193 locally compact Hausdorff space, criteria for 196, 197, 198 locally compact space 183 IL space 302 L^∞(Ω, Σ, μ; ℂ) space 468 L^∞(Ω, Σ, μ; ℝ) space 468 I' space 61 'L space 302 l^p space 49 l^p norm 101 L^p space 50 L^p(Ω, Σ, μ; ℂ) space 460 L^p(Ω, Σ, μ; ℝ) 460 L^p-convergence (convergence in the pth mean) 463 lower Baire function 329 lower bound 26 lower Darboux integral 328 lower derivative of a measure 510 lower limit topology 120 Lusin's Theorem 506
mapping 11 matrix supremum norm 388 maximal element 26 maximal filter 169 maximum row sum matrix norm 388 Mean Value Theorem 395 measurability of a complex-valued function 432 measurability of an extended real-valued function, criteria for 282, 283
measurability of a function, criterion for 216 measurable cylinder 357 measurable function 216 measurable rectangle 357 measurable sets 205 measurable space 205 measure 223 measure derivative 510 measure differentiable at a point 510 measure generated by an integral 346 measure of a Borel set under a diffeomorphism 411 measure of a hyperplane 271 measure space 223 mesh of a partition 328 metric 60 metric space 60 metric topology 109 metrizable topological space 109 metrization, 60 -set 350 minimal element 25 Minkowski's inequality 63, 461 modulus of congruence 24 moment generating function 298, 373 monoid 47 Monotone Convergence Theorem for functions 312 generalization 313 for measures 321 monotone function 517 monotone nondecreasing function 517 monotone nondecreasing sequence of sets 8 monotone nonincreasing function 517
monotone nonincreasing sequence of sets 8 monotone system 205 monotone vanishing sequence of sets 8 monotonicity 226 monotonicity of outer measure 236 Moore plane 144 motion 278 motion-invariant measure 278 μ-a.e. property 304 μ-inner regular set 493 μ-integrable function 302 μ-minimal decomposition of a set 228 μ-negligible (negligible) set 240 μ-null (null) set 240 μ-outer regular set 493 μ* measure 236 μ*-measurable set 236 μ*-separability 236 multiplicative group 47 multiplicative inverse 47 multi-valued function 12 N(A)-tail of a sequence 122 N,-set 240 N(x, ε)-tail 74 negative (ν-negative) set 422 negative variation of a signed measure 426 negligible (μ-negligible) set 240 neighborhood base at a point 110 neighborhood filter 168 neighborhood of a point 109 neighborhood of a point that meets a set 111 neighborhood system at a point 110 net 172
net cofinally in a set 173 net generated by a neighborhood base 173 net induced by a directed set 172 Nikodym, Otto 295, 350 NLS 100 non-Borel set 273 nonnegative simple functions 288 norm 100 normal probability density function 50, 353 normal random variable 352 normal space 183 normality of a space, criteria for 185, 187 normed linear space (NLS) 100 nowhere dense set 75, 111 ν-maximal set 428 ν-minimal set 428 null (μ-null) set 240
One-point compactification 191 one-to-one (injective, invertible) map 13 onto (surjective) map 13 open ball 65 open ball with respect to the Euclidean metric 66 open ball with respect to the supremum metric 66 open cover 92 open map 140 open neighborhood of a point 109 open parallelepiped (rectangle) 116 open set 67, 108 open subcover 92 orthogonal measures 351 orthogonal transformation 278
orthogonality (singularity) of a signed measure 452 outer measure 235 outer regular Borel measure 493 Pairwise disjoint sets 8 parabolic spline 165 parallelepiped 33 partial derivative 392 partial order relation 22 partition of an interval 8, 174, 327 partition of a set 8 partition of unity for a compact set 195 pick-a-point process 6 piecewise linear function 164 point mass (Dirac measure) 224 pointwise bounded set of functions 155 pointwise convergence, criterion for 139 pointwise limit 89 Poisson measure 231 Poisson random variable 197 positive linear functional 494 positive measure 422 positive (ν-positive) set 422 positive variation of a signed measure 426 power 49 power set 8 pre-image 13 premeasure 223 principle of mathematical induction 28 probability density 352, 444 probability distribution 196 probability distribution function 444 probability measure 223
probability space 223 product map 376 product measure space 368 product metric 63 product σ-algebra 357 product space 63 product topology 124 product topology for arbitrarily many factor spaces 136 product topology for finitely many factor spaces 135 projection map 32 projection of a set on its quotient 23 proper subset 4 proper superset 4 property that holds almost everywhere (μ-a.e.) 304 pseudo-metric 60 pseudo-metric space 60 Quasialgebra 289 quasiring 289 quotient (factor) set 23 quotient set modulo μ 304 quotient topology 129 Radius of an open ball 65 Radon, Johann 222, 295 Radon measure 493 Radon measure induced by a positive linear functional 496 Radon-Nikodym density 348 Radon-Nikodym density of a signed measure 438 Radon-Nikodym derivative 351 Radon-Nikodym derivative of a signed measure 444 Radon-Nikodym Theorem for complex measures 448 Radon-Nikodym Theorem for
positive measures 351 Radon-Nikodym Theorem for signed measures 438 random variable 279 random variable of a mixed type 550 range 12 rational parabolic spline 165 rational parallelepiped 116 real linear space 53 rectangle 33, 116, 205 rectangular cylinder 34 refinement 327 refinement of a partition 174 reflexive binary relation 22 regular space 183 regularity of a space, criteria for 184, 185 relative compactness 155 relative topology subspace 109 restriction of a function 240 restriction of a map 14 restriction of outer measure to Σ*-algebra 241 ρ-bound of a function 89 Riemann, B. Georg 107, 329 Riemann integrable function 328 Riemann integral 174, 328 Riesz, Frigyes 421 Riesz-Fischer Theorem 464 Riesz's Representation Theorem 500 right distributive law 47 ring 47 ring (as a collection of sets) 204 ring generated by a semi-ring 212 ring with unity 47 Russell's paradox 3 Sample space 4
scalar 53 Schröder-Bernstein Theorem 44 Second Axiom of Countability 124 second countable topological space 124 Second Separation Axiom 122 section of a function 365 section of a set 359 semi-linear functional 103 semi-linear lattice 289 semi-linear operator 103 semi-linear space 53, 289 semi-norm 101, 460 semi-normed linear space 101 semi-ring 204 semi-ring, property for 211 semi-%space 289 semifield 53 semigroup 47 semigroup of functions 49 separable metric space 93 separable topological space 112 sequence 74 sequential compactness 93, 145 set 3 set-algebraic expression 6 set-algebraic transformation 6 set function 222 separating points set of functions 160 setwise convergence, criterion for 320 setwise convergence of measures 319 setwise limit of measures 319 :4 -set 437 σ-additive set function 222 σ-algebra (σ-field) 204 σ-algebra extended by a
set 214 σ-algebra generated by a collection of functions 217 σ-algebra generated by a collection of sets 210 σ-algebra generated by a function 216 σ-algebra generated by a set 205, 210 σ-compact space 190 σ-compactness, criteria of 190, 191, 193 σ-field 204 σ-finiteness of a set function 245 σ-finite set function on a sequence of sets 223 σ-finite set function on a system of sets 223 σ-finite signed measure 420 Σ*-σ-algebra 236 σ-subadditivity 228 σ-ideal 232 E n o r m 102 signed Borel measure 423 signed Borel-Lebesgue-Stieltjes measure 423 signed distribution function 530 signed measure 422 simple cylinder 34 simple function 289 simple parallelepiped 116 singleton 4 singular-continuous component of a signed measure 457 singular components of a signed measure 457 singular distribution function 545 singular-continuous function 543 singular-continuous random
variable 550 singular-discrete component of a signed measure 457 singular-discrete distribution function 545 singularity of a measure 353 singularity (orthogonality) of a signed measure 452 smallest element 26 smallest σ-algebra 205 SNLS 101 space of all continuous bounded functions 152 space of all continuous functions 151 space of all continuous real-valued functions 195 space of all n-times differentiable functions 48 spherical coordinate transformation 414 standard (natural) topology 108 stochastic convergence (convergence in probability) 483 Stone, Marshall H. 160 Stone-Weierstrass Theorem 161, 164 Strong Law of Large Numbers 483 stronger (finer, larger) topology 109 subadditivity of outer measure 236 subalgebra 53 subalgebra generated by continuous functions 161 subalgebra of polynomials 164 subbase 116 subbase parallelepiped 119, 135 subcover 92
submultiplicative property of a matrix norm 387 subordinance 195 subset 4 subspace 53, 60, 109 sum of independent normal random variables 383, 384 superset 4 support of a function 195 supremum metric 61, 65, 88 supremum norm 102, 151 symmetric binary relation 22 symmetric difference 6 systems of sets, diagram of 207 Tietze's extension of continuous functions 198 Tietze's Extension Theorem 198 TIH metric 100 Tonelli's Theorem 365 topological space 108 topology 108 topology generated by a metric 81 topology generated by a subbase 116 topology induced by a set 108 topology of pointwise convergence 139 topology on the extended real line 108 T spaces, diagram of 184 T0 space 182 T1 space 182 T1 space, criterion for 183 T2 space (Hausdorff) 122, 182 T3 space 183 T4 space 183 total variation of a complex measure 430 total variation of a
function 529 total variation of a signed measure 426 totally bounded set 82 total boundedness, criterion for 153 trace of a filter on a set 180 trace of a set in a topology 109 transitive binary relation 22 translation-invariance of Lebesgue measure 280 translation-invariant Borel measure 266 translation invariant metric 100 triangle inequality 60 two-sided identity 47 Tychonov space 183 Tychonov's Theorem 148 Tychonov topology 137 Ultrafilter 167 ultranet 173 unbounded set 82 uncountable set 41 uniform continuity criterion in compact space 96 uniform convergence 89, 152 uniform convergence, criterion for 152 uniform metric 88, 151 uniform metric space 151 uniformly bounded set of functions 89, 156 uniformly continuous function 82 uniformly integrable family of functions 486 uniformly integrable sequence of functions, criterion for 491 union 6 uniqueness of limits along
nets and filters, criteria for 178 unit cylinder 34 unity 47 universe 4 upper Baire function 329 upper bound 26 upper Darboux integral 328 upper derivative of a measure 510 Urysohn, Pavel 187 Urysohn's Corollary 189 Urysohn's Lemma 187 usual topology 108 Vaguely hereditary property 143 variation of a function on a bounded interval 528 variation of a function on an unbounded interval 529 vector 53 vector lattice 53 version of the conditional expectation 446 Vitali's Covering Theorem 541 volume of an ellipsoid 416 Weak Law of Large Numbers 483 Weak (weaker, coarser, smaller) topology 109 Weak topology generated by a family of functions 138 weakly hereditary property 143 weakly regular Borel measure (Radon measure) 493 Weierstrass, Karl 107, 160 Weierstrass Theorem 163 well-ordered set 27 well-ordering principle 27
Zermelo, Ernst 3 Zermelo's Theorem 27 zero 47 Zorn's Lemma 28
Real Analysis: An Introduction to the Theory of Real Functions and Integration

Real Analysis: An Introduction to the Theory of Real Functions and Integration illuminates the principal topics that constitute real analysis. Self-contained, with coverage of topology, measure theory, and integration, it offers a thorough elaboration of the major theorems, notions, and constructions needed not only by those interested in the mathematical sciences but also by those pursuing careers in statistics and probability, operations research, physics, and engineering. Structured logically and flexibly through the author's many years of teaching experience, the material is presented in three main sections:

Part I covers the preliminaries of set theory and the fundamentals of metric spaces and topology and forms an ideal introduction to topology.

Part II details the basics of measure and integration, offering a solid background in measure theory.

Part III addresses more advanced topics, including elaborated and abstract versions of measure and integration along with their applications to functional analysis, probability theory, and conventional analysis on the real line.

Analysis lies at the core of all mathematical disciplines and is essential to a range of scientific and engineering fields. Real Analysis: An Introduction to the Theory of Real Functions and Integration offers the perfect vehicle for building the foundation needed for more advanced studies.

Jewgeni H. Dshalalow is a Professor of Mathematics at the Florida Institute of Technology in Melbourne, Florida, and the Principal Editor of the Journal of Applied Mathematics and Stochastic Analysis. He is also on the editorial boards of three other applied mathematics journals.
ISBN 1-58488-073-2