Studies in Advanced Mathematics

Series Editor
STEVEN G. KRANTZ
Washington University in St. Louis

Editorial Board
R. Michael Beals, Rutgers University
Dennis de Turck, University of Pennsylvania
Ronald DeVore, University of South Carolina
Lawrence C. Evans, University of California at Berkeley
Gerald B. Folland, University of Washington
William Helton, University of California at San Diego
Norberto Salinas, University of Kansas
Michael E. Taylor, University of North Carolina
Titles Included in the Series
Steven R. Bell, The Cauchy Transform, Potential Theory, and Conformal Mapping
John J. Benedetto, Harmonic Analysis and Applications
John J. Benedetto and Michael W. Frazier, Wavelets: Mathematics and Applications
Albert Boggess, CR Manifolds and the Tangential Cauchy-Riemann Complex
Goong Chen and Jianxin Zhou, Vibration and Damping in Distributed Systems, Vol. 1: Analysis, Estimation, Attenuation, and Design. Vol. 2: WKB and Wave Methods, Visualization, and Experimentation
Carl C. Cowen and Barbara D. MacCluer, Composition Operators on Spaces of Analytic Functions
John P. D'Angelo, Several Complex Variables and the Geometry of Real Hypersurfaces
Lawrence C. Evans and Ronald F. Gariepy, Measure Theory and Fine Properties of Functions
Gerald B. Folland, A Course in Abstract Harmonic Analysis
José García-Cuerva, Eugenio Hernández, Fernando Soria, and José-Luis Torrea, Fourier Analysis and Partial Differential Equations
Peter B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah-Singer Index Theorem, 2nd Edition
Alfred Gray, Modern Differential Geometry of Curves and Surfaces with Mathematica, 2nd Edition
Eugenio Hernández and Guido Weiss, A First Course on Wavelets
Steven G. Krantz, Partial Differential Equations and Complex Analysis
Steven G. Krantz, Real Analysis and Foundations
Kenneth L. Kuttler, Modern Analysis
Michael Pedersen, Functional Analysis in Applied Mathematics and Engineering
Clark Robinson, Dynamical Systems: Stability, Symbolic Dynamics, and Chaos, 2nd Edition
John Ryan, Clifford Algebras in Analysis and Related Topics
Xavier Saint Raymond, Elementary Introduction to the Theory of Pseudodifferential Operators
Robert Strichartz, A Guide to Distribution Theory and Fourier Transforms
André Unterberger and Harald Upmeier, Pseudodifferential Analysis on Symmetric Cones
James S. Walker, Fast Fourier Transforms, 2nd Edition
James S. Walker, Primer on Wavelets and their Scientific Applications
Gilbert G. Walter, Wavelets and Other Orthogonal Systems with Applications
Kehe Zhu, An Introduction to Operator Algebras
JEWGENI H. DSHALALOW

Real Analysis
An Introduction to the Theory of Real Functions and Integration

CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.
Library of Congress Cataloging-in-Publication Data

Dshalalow, Jewgeni H.
Real analysis : an introduction to the theory of real functions and integration / Jewgeni H. Dshalalow.
p. cm. -- (Studies in advanced mathematics)
Includes bibliographical references and index.
ISBN 1-58488-073-2 (alk. paper)
1. Mathematical analysis. I. Title. II. Series.
2. Biology-molecular. I. McLachlan, Alan. II. Title.
QA300 .D74 2000
515--dc21 00-058593 CIP
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

© 2001 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 1-58488-073-2
Library of Congress Card Number 00-058593
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
To my Lord and Redeemer Who made the supreme sacrifice for me and Who will come again
Preface

This book is intended as an introductory two-semester course in abstract analysis, which includes topology, measure theory, and integration, an assemblage of topics traditionally gathered under the cognomen "Real Analysis," which is more common in the United States. Most North American schools offer this as a one- to two-semester graduate course for mathematics, physics, and engineering majors. Many European schools, to the best of my knowledge, do not have such a course; they have instead a sequence of separate courses such as Topology, Measure and Integration, and Functional Analysis. Some countries, such as Russia and the former Soviet republics, additionally have a Real Variables course, which is somewhat similar to Real Analysis but is more specialized, and whose profile and rigor vary from college to college.
A very good reason for learning real analysis is that not only is it a core course for all mathematical disciplines, but it is absolutely mandatory for statistics and probability, operations research, physics, and some engineering majors as well. Hence, rephrasing an old adage, all routes of science and technology go through real analysis.

This text predominantly targets first-year graduate students of mathematical science majors as well as first- and second-year graduate students of engineering, physics, and operations research majors. A stronger senior undergraduate mathematics student can also benefit from the course. Some less theoretically oriented programs, or those with weaker mathematics curricula, may find it reasonable to use the book for a three-semester course, with the first two semesters devoted to basics and the third semester to advanced topics. The course can always be shortened to two semesters in such schools, with the option to cover the first seven chapters, which are also quite sufficient for technical majors.

This book is intended primarily as a textbook, and its purpose as a reference is secondary. The reason for such a claim is a rather thorough elaboration of major theorems, notions, and constructions, very often supplied with a blueprint and sometimes a less formal introduction. The latter are then succeeded by detailed treatments. For instance, the Radon-Nikodym Theorem is first introduced in Chapter 6, with a minimum of proofs and formalities, but with a number of examples and exercises. It is then followed by a more abstract version later, in Chapter 8.
The first three chapters of the book (Part I) include preliminaries on set theory and basics of metric spaces and topology. I have been using these three chapters for many years in teaching a bilevel topology course at Florida Tech during our quarter system. However, I would not be able to cover the present version of the three chapters in one quarter, and one semester would be a more appropriate term for the current program at our school. Hence, the first three chapters can easily serve as a separate one-quarter to one-semester topology course for senior undergraduate or beginning graduate students.

Chapters 4-7 (Part II) present basics of measure and integration and, again, they can be offered as a separate measure theory (and integration) course. Consequently, Parts I and II can become appealing to those programs with separately named courses and, in particular, to European students.

Part III (Chapters 8 and 9) includes a more elaborate and abstract version of measure and integration, along with their applications to functional analysis ($L^p$ spaces and the Riesz Representation Theorem for locally compact Hausdorff spaces), probability theory (conditional expectation, uniform integrability, Lebesgue-Stieltjes integrals, decomposition of distribution functions, stochastic convergence, and convergence of Radon measures), and conventional analysis on the real line (monotone and absolutely continuous functions, functions of bounded variation, and major theorems of calculus). Part III can be utilized for advanced topics, as well as an enlarged variant of measure and integration. While the reader would be better off having studied Part I prior to Part II and the first six sections of Chapter 8, the latter can also be used as independent material with sufficient basics of topology drawn from any generic advanced analysis course.

The book can also be used as a reference source for researchers in mathematical and engineering sciences, and especially operations research (such as applied stochastic processes, queueing theory, and reliability). The reader should understand, however, that the book is not intended to become an encyclopedia of mathematics or to be any kind of a broad reference. I had to suppress my temptation to include some written chapters on Hilbert spaces, functional analysis, and Fourier transforms, because of my motives to compile the main topics of what constitutes real analysis and to design a text by spending more time on details (within the framework of the book size imposed by the publisher and buyers' affordability).

This text may be well suited for independent studies with or without instructors, for which an abundance of examples and over 600 exercises provide pertinent support. While a solution manual is in preparation and will become available soon (and it would be an additional studying aid), the publisher and I have agreed on honoring only university instructors with this manual upon adoption of the book for the course. The reader may also find the "new terms" subsections (at the end of each section) useful, especially considering the plethora of new definitions and notations, which not only can be intimidating, but can also create an additional memory burden and thereby slow down learning of the main concepts.
Most of my thanks are due to my wife Irina for her ample support, encouragement, and overwhelming sacrifice. I would like to express my deep appreciation to Mr. Jürgen Becker for his constant guidance and countless ideas; Mr. Donald Konwinski for his enormous editorial work on earlier versions of my manuscript; and Professors Gerald B. Folland and Ryszard Syski for their numerous and very constructive remarks. I am also grateful for the kind assistance of Professors S.G. Deo, Jean-B. Lassere, and Jordan Stoyanov, Mr. Gary Russell, the project editor, Mr. David Alliot, and the anonymous reviewers who thoroughly read my manuscript and made many helpful suggestions. My thanks are also due to the publisher, Mr. Robert Stern, for his help and extreme patience.

Jewgeni H. Dshalalow
Melbourne, Florida
Contents

Preface

Part I. An Introduction to General Topology

Chapter 1 Set-Theoretic and Algebraic Preliminaries
1. Sets and Basic Notation
2. Functions
3. Set Operations under Maps
4. Relations and Well-Ordering Principle
5. Cartesian Product
6. Cardinality
7. Basic Algebraic Structures

Chapter 2 Analysis of Metric Spaces
1. Definitions and Notations
2. The Structure of Metric Spaces
3. Convergence in Metric Spaces
4. Continuous Mappings in Metric Spaces
5. Complete Metric Spaces
6. Compactness
7. Linear and Normed Linear Spaces

Chapter 3 Elements of Point Set Topology
1. Topological Spaces
2. Bases and Subbases for Topological Spaces
3. Convergence of Sequences in Topological Spaces and Countability
4. Continuity in Topological Spaces
5. Product Topology
6. Notes on Subspaces and Compactness
7. Function Spaces and Ascoli's Theorem
8. Stone-Weierstrass Approximation Theorem
9. Filter and Net Convergence
10. Separation
11. Functions on Locally Compact Spaces

Part II. Basics of Measure and Integration

Chapter 4 Measurable Spaces and Measurable Functions
1. Systems of Sets
2. System's Generators
3. Measurable Functions

Chapter 5 Measures
1. Set Functions
2. Extension of Set Functions to a Measure
3. Lebesgue and Lebesgue-Stieltjes Measures
4. Image Measures
5. Extended Real-Valued Measurable Functions
6. Simple Functions

Chapter 6 Elements of Integration
1. Integration on $\mathcal{C}^{-1}(\Omega, \Sigma)$
2. Main Convergence Theorems
3. Lebesgue and Riemann Integrals on $\mathbb{R}$
4. Integration with Respect to Image Measures
5. Measures Generated by Integrals. Absolute Continuity. Orthogonality
6. Product Measures of Finitely Many Measurable Spaces and Fubini's Theorem
7. Applications of Fubini's Theorem

Chapter 7 Calculus in Euclidean Spaces
1. Differentiation
2. Change of Variables

Part III. Further Topics in Integration

Chapter 8 Analysis in Abstract Spaces
1. Signed and Complex Measures
2. Absolute Continuity
3. Singularity
4. $L^p$ Spaces
5. Modes of Convergence
6. Uniform Integrability
7. Radon Measures on Locally Compact Hausdorff Spaces
8. Measure Derivatives

Chapter 9 Calculus on the Real Line
1. Monotone Functions
2. Functions of Bounded Variation
3. Absolute Continuous Functions
4. Singular Functions

BIBLIOGRAPHY

INDEX
Part I. An Introduction to General Topology

Chapter 1 Set-Theoretic and Algebraic Preliminaries

Set theory is not just one of the main tools in mathematics; it is the very root of mathematics, from which all mathematical disciplines stem. The great German mathematician Georg Ferdinand Cantor is considered to be the sole founder of set theory, which he developed in a series of papers, the first of which appeared in 1874. Although the Czech Bernard Bolzano (1781-1848) made one of the first attempts to formalize set theory, in particular in his 1851 work Paradoxien des Unendlichen, by considering the one-to-one correspondence between two sets (later developed by Cantor into what we now know as cardinals), neither he nor anyone else was really a predecessor to Cantor's creation. Ernst Zermelo (1871-1953) was another German who, among his numerous contributions to set theory, is the author of the first axiom for set theory (of 1908), undoubtedly the primary axiom of the whole of mathematics. This chapter presents only the essentials of set theory and abstract algebra needed throughout the book.

1. SETS AND BASIC NOTATION
Cantor defined a set as a collection $M$ into a whole of definite, distinct objects (that are called elements of $M$) of our thought. In other words, we bind objects (perhaps of different nature) in our mind into a single entity and call that entity a set. We will denote sets by capital letters, and their elements by lowercase letters. For instance, a set $A$ has elements $a, b, c$, or $a_1, a_2, \ldots$. To abbreviate the expression "$a$ is an element of the set $A$," we will write "$a \in A$." The expression "$a \notin A$" reads "$a$ is not an element of $A$."

Observe that the notion of a set is relatively simple if we deal with such frequently encountered sets as sets of integers, rational numbers, real numbers or continuous functions. In some rare situations, thoughtless use of this notion can lead to contradictions, like Bertrand Russell's paradox. Russell posed the following set dilemma. Let $\mathcal{R}$ be the set of all sets which are not elements of themselves. Clearly, $\mathcal{R}$ is not empty. For instance, the set of all real numbers is not an element of itself (for it is not a real number), thus it belongs to $\mathcal{R}$. The question arises: Is $\mathcal{R}$ an element of itself? If $\mathcal{R} \in \mathcal{R}$ then, by definition of $\mathcal{R}$, it should not belong to $\mathcal{R}$, which is a contradiction. Thus, $\mathcal{R} \notin \mathcal{R}$. But then, by definition, it must belong to $\mathcal{R}$, which is impossible. In this case, we have put the definition of an object ahead of its existence. The concept of a set must be supported by axioms of set theory, just as the main axioms of plane geometry define the shape of lines.

1.1 Definitions.
(i) A set $A$ is said to be a subset of a set $B$ (in notation, $A \subseteq B$) if all elements of $A$ are also elements of $B$. If $A$ is a subset of $B$, we call $B$ a superset of $A$ (in notation, $B \supseteq A$). A set that contains exactly one element, say $a$, is called a singleton (set) and it is denoted by $\{a\}$. If $a \in A$, then we can alternatively write $\{a\} \subseteq A$. Any set is obviously a subset of itself: $A \subseteq A$.
(ii) The unique set with no elements is called the empty set and is denoted $\emptyset$. Clearly, $\emptyset$ is a subset of any set, including itself.
(iii) $A = B$ (read "set $A$ equals set $B$") if and only if $A \subseteq B$ and $B \subseteq A$; otherwise, we will write $A \neq B$. Occasionally, we will be using the symbol "$\subset$" applied to the situation where one set is a subset of another set but the sets are not equal. $A \subset B$ reads "$A$ is a proper subset of $B$." In this case, $B$ is a proper superset of $A$ (in notation, $B \supset A$). □

We postulate the existence of a set that is a superset of all other sets in the framework of a certain mathematical model. This set is usually called a universal set or just universe. We will also make use of the word "carrier" as a synonym for the universe and reserve for it the Greek letter $\Omega$. Sometimes, we will denote it by $X$, $Y$ or $Z$. A universe (as a base for some mathematical model or problem) is generally defined to contain all considered sets and it varies from model to model. For example, if $C^n[a,b]$ denotes the set of all $n$-times differentiable functions on the interval $[a,b]$, it contains, as a subset, the set of possible solutions of an ordinary differential equation of the $n$th order. Thus, $\Omega = C^n[a,b]$ is a relevant universe within which the problem is posed. One could also take for $\Omega$ the set $C[a,b]$ of all continuous functions on $[a,b]$ or even the set of all
real-valued functions on $[a,b]$. However, these are too "vast" to serve as universes and they are impractical for this concrete problem.

Set theory is also a basic ingredient of probability theory, which always begins with elements of set theory under a slightly modified lexicon. For instance, a universe is referred to as a sample space. Subsets of the sample space are called events; specifically, singletons are called elementary events. The concept of the universe is most vivid when used in probability theory. Let us consider the experiment that consists of tossing a coin until the first appearance of the head on the upper face of the coin. Denoting by $H$ an outcome of the head and by $T$ an outcome of the tail when tossing the coin, we may define $\{(T, T, \ldots, T, H)\}$ as an elementary event of the sample space $\Omega = \{(H), (T,H), (T,T,H), \ldots\}$. The universe $\Omega$ contains, as elements, all possible outcomes of tossing the coin until the "first success," that is, the first appearance of the head. For instance, in the language of probability theory, the event $\{(H), (T,H), (T,T,H)\}$ corresponds to "success in at most three tosses."

1.2 Notations. Throughout the whole book we will be using the following notation.
(i) Logical symbols:
$\forall$ means "for all"
$\exists$ means "there is" or "there are" or "there exists"
$\Rightarrow$ means "implies" or "from ... it follows that ..."
$\Leftrightarrow$ means "if and only if"
$\wedge$ (&) means "and"
$\vee$ means "or"
: means "such that" (primarily used for definition of sets)

(ii) Frequently used sets:
$\mathbb{N}$: the set of all positive integers
$\mathbb{N}_0$: the set of all nonnegative integers
$\mathbb{Z}$: the set of all integers
$\mathbb{Q}$: the set of all rational numbers
$\mathbb{Q}^c$: the set of all irrational numbers
$\mathbb{R}$: the set of all real numbers
$\mathbb{C}$: the set of all complex numbers
$\mathbb{R}_+$: the set of all nonnegative real numbers
$\mathbb{R}_-$: the set of all negative real numbers

(iii) Denotation of sets:
List: The elements are listed inside a pair of braces [for instance, $\{a, b, c\}$ or $\{a_1, a_2, \ldots\}$].
Condition: A description of the elements with a condition following a colon (that in this case reads "such that"), again with braces enclosing the set [for instance, the set of odd integers is $\{n \in \mathbb{Z}: n = 2k+1, k \in \mathbb{Z}\}$].
(iv) Main set operations:
Union: $A \cup B = \{x \in \Omega: x \in A \vee x \in B\}$
Intersection: $A \cap B = \{x \in \Omega: x \in A \wedge x \in B\}$
Two subsets $A, B \subseteq \Omega$ are called disjoint if $A \cap B = \emptyset$.
Difference: $A \setminus B = \{x \in \Omega: x \in A \wedge x \notin B\}$ [$A \setminus B$ is also called the complement of $B$ with respect to $A$, with the alternative notation $A - B$ or $B^c_A$.]
Symmetric Difference: $A \Delta B = (A \setminus B) \cup (B \setminus A)$
Complement (with respect to the universe $\Omega$): $A^c = A^c_\Omega = \Omega \setminus A$

(v) General notation:
":=" reads "set by definition."
□ indicates the end of a proof, remarks, examples, etc.

A set-algebraic expression is a set in the form of some defined sets connected through set operations. Any transformation of a set-algebraic expression into another expression would require a set-theoretic manipulation which we call a set-algebraic transformation. All basic set-algebraic transformations over basic set-algebraic expressions are known as Laws of Algebra (or Calculus) of Sets. □

1.3 Remark. One of the standard tools of the algebra of sets is the so-called pick-a-point process applied to, say, showing that $A \subseteq B$ or $A = B$. It is based on the following

Axiom of Extent: For each set $A$ and each set $B$, it is true that $A = B$ if and only if for every $x \in \Omega$, $x \in A$ when and only when $x \in B$.

Axiom's modification: If every element of $A$ is an element of $B$, then $A \subseteq B$.

Thus, for the modification, the pick-a-point process consists of selecting an arbitrary point $x$ of $A$ (picking a point $x$) and then proving that $x$ also belongs to $B$. The identities below can be verified easily by the reader using pick-a-point techniques. □

1.4 Theorem (Laws of Algebra of Sets).

(i) Commutative Laws:
$A \cup B = B \cup A$
$A \cap B = B \cap A$

(ii) Associative Laws:
$(A \cup B) \cup C = A \cup (B \cup C)$
$(A \cap B) \cap C = A \cap (B \cap C)$

(iii) Distributive Laws:
$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
$(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$

(iv) Idempotence of
complement: $(A^c)^c = A$
union: $A \cup A = A$
intersection: $A \cap A = A$

(v) $A \cap A^c = \emptyset$

(vi) $A \cup A^c = \Omega$

(vii) DeMorgan's Laws:
$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$

(viii) $A \cup \emptyset = A$

(ix) $A \cap \emptyset = \emptyset$

(x) $\Omega^c = \emptyset$ and $\emptyset^c = \Omega$. □
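The laws of Theorem 1.4 can also be checked mechanically on a small finite universe. The following Python sketch is an illustration only, not a proof: the three-point universe `OMEGA` and the helper names `subsets` and `laws_hold` are assumptions introduced here, and only a selection of the laws is tested.

```python
from itertools import combinations

# Hypothetical three-point universe, chosen only for illustration.
OMEGA = frozenset({1, 2, 3})


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


def laws_hold(universe=OMEGA):
    """Check several laws of Theorem 1.4 on all subsets of `universe`."""
    empty = frozenset()

    def comp(s):
        # Complement with respect to the universe.
        return universe - s

    all_sets = subsets(universe)
    for a in all_sets:
        # (iv)-(vi): double complement, A n A^c, A u A^c
        if comp(comp(a)) != a or a & comp(a) != empty or a | comp(a) != universe:
            return False
        for b in all_sets:
            # (i) commutative laws
            if a | b != b | a or a & b != b & a:
                return False
            # (vii) DeMorgan's laws
            if comp(a | b) != comp(a) & comp(b) or comp(a & b) != comp(a) | comp(b):
                return False
            for c in all_sets:
                # (iii) distributive laws
                if (a | b) & c != (a & c) | (b & c):
                    return False
                if (a & b) | c != (a | c) & (b | c):
                    return False
    return True
```

Calling `laws_hold()` runs through all triples of subsets of the universe. A finite check of this kind can expose a wrongly stated identity, but the general statement still requires the pick-a-point argument.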
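1.5 Example. Show the validity of the first distributive law.

$$x \in (A \cup B) \cap C \Leftrightarrow x \in (A \cup B) \wedge x \in C \Leftrightarrow [x \in A \wedge x \in C] \vee [x \in B \wedge x \in C]$$
$$\Leftrightarrow [x \in A \cap C] \vee [x \in B \cap C] \Leftrightarrow x \in (A \cap C) \cup (B \cap C).$$ □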
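The chain of equivalences in Example 1.5 depends only on whether the picked point $x$ belongs to each of $A$, $B$, and $C$. As a hedged illustration, the Python sketch below runs through all eight membership patterns; the function name `first_distributive_law_holds` and the Boolean variables are assumptions made here for the example.

```python
from itertools import product


def first_distributive_law_holds():
    """Check Example 1.5 for every membership pattern of a point x.

    in_a, in_b, in_c stand for the statements x in A, x in B, x in C.
    """
    for in_a, in_b, in_c in product([False, True], repeat=3):
        left = (in_a or in_b) and in_c              # x in (A u B) n C
        right = (in_a and in_c) or (in_b and in_c)  # x in (A n C) u (B n C)
        if left != right:
            return False
    return True
```

Because the two sides agree for every pattern, any point belongs to the left-hand set exactly when it belongs to the right-hand set, which is the content of the pick-a-point argument.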
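1.6 Remark. The concepts of union and intersection can be extended to an arbitrary family of sets. For instance,

$$\bigcup_{i \in I} A_i = \{x \in \Omega: \exists\, i \in I,\ x \in A_i\}.$$

The distributive laws and DeMorgan's laws hold for arbitrary families (subject to Problem 1.1 b)):

$$\Big(\bigcup_{i \in I} A_i\Big) \cap B = \bigcup_{i \in I} (A_i \cap B)$$
$$\Big(\bigcap_{i \in I} A_i\Big) \cup B = \bigcap_{i \in I} (A_i \cup B)$$
$$\Big(\bigcup_{i \in I} A_i\Big)^c = \bigcap_{i \in I} A_i^c$$
$$\Big(\bigcap_{i \in I} A_i\Big)^c = \bigcup_{i \in I} A_i^c.$$ □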
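The generalized laws of Remark 1.6 can be tried out on a finite indexed family. The sketch below is only an illustration under stated assumptions: the universe `OMEGA`, the family `FAMILY` with its index labels, and the helper names are all hypothetical and do not come from the text.

```python
# Hypothetical universe and indexed family {A_i : i in I}, for illustration only.
OMEGA = frozenset(range(10))
FAMILY = {
    "i1": frozenset({0, 1, 2, 3}),
    "i2": frozenset({2, 3, 4, 5}),
    "i3": frozenset({3, 5, 7, 9}),
}


def union_of(family):
    """Union over the index set: points lying in at least one member."""
    return frozenset(x for x in OMEGA if any(x in a for a in family.values()))


def intersection_of(family):
    """Intersection over the index set: points lying in every member."""
    return frozenset(x for x in OMEGA if all(x in a for a in family.values()))


def complement(a):
    """Complement with respect to OMEGA."""
    return OMEGA - a


def generalized_laws_hold(family, b):
    """Check the four identities of Remark 1.6 for `family` and a set `b`."""
    comps = {i: complement(a) for i, a in family.items()}
    meets = {i: a & b for i, a in family.items()}
    joins = {i: a | b for i, a in family.items()}
    return (
        union_of(family) & b == union_of(meets)
        and intersection_of(family) | b == intersection_of(joins)
        and complement(union_of(family)) == intersection_of(comps)
        and complement(intersection_of(family)) == union_of(comps)
    )
```

The functions `union_of` and `intersection_of` follow the quantifier form of the definition ("there exists an index" and "for all indices") rather than repeated pairwise operations, which mirrors how the family versions are stated.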
1.7 Definitions.
An indexed family
To specify the type of convergence, we will write $\{A_n\} \uparrow A$ ($\{A_n\} \downarrow A$). A sequence $\{A_n\}$ of sets is said to be monotone vanishing if it is monotone nonincreasing and $\{A_n\} \downarrow \emptyset$.
(v) Let $\{A_n\}$ be an arbitrary sequence of sets. Denote
(a) $\liminf_{n \to \infty} A_n$ (or just $\underline{\lim}\, A_n$) $= \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m$. This limit is called the limit inferior.
(b) $\limsup_{n \to \infty} A_n$ (or just $\overline{\lim}\, A_n$) $= \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m$. This limit is called the limit superior.
If $\underline{\lim}\, A_n = \overline{\lim}\, A_n$ then we denote this common limit as $\lim_{n \to \infty} A_n$. In this case, the limit of $\{A_n\}$ is said to exist and equal $\lim_{n \to \infty} A_n$. □
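For a periodic sequence of sets the infinite unions and intersections in the limit inferior and limit superior reduce to finite ones, since every tail repeats the same sets. The Python sketch below relies on that assumption; the alternating sequence `a`, the constant `PERIOD`, and the function names are hypothetical and chosen only to illustrate the two definitions.

```python
def a(n):
    """Hypothetical sequence: A_n = {1, 2} for odd n and {2, 3} for even n."""
    return {1, 2} if n % 2 == 1 else {2, 3}


PERIOD = 2  # a(n + PERIOD) == a(n) for every n


def tail_intersection(seq, n, period):
    """Intersection of seq(m) over all m >= n, valid for a periodic sequence."""
    result = set(seq(n))
    for m in range(n + 1, n + period):
        result &= seq(m)
    return result


def tail_union(seq, n, period):
    """Union of seq(m) over all m >= n, valid for a periodic sequence."""
    result = set(seq(n))
    for m in range(n + 1, n + period):
        result |= seq(m)
    return result


def lim_inf(seq, period):
    """Union over n of the tail intersections (one period of n suffices)."""
    result = set()
    for n in range(1, period + 1):
        result |= tail_intersection(seq, n, period)
    return result


def lim_sup(seq, period):
    """Intersection over n of the tail unions (one period of n suffices)."""
    result = tail_union(seq, 1, period)
    for n in range(2, period + 1):
        result &= tail_union(seq, n, period)
    return result
```

For this alternating sequence the limit inferior collects the points that belong to all but finitely many $A_n$, while the limit superior collects those that belong to infinitely many $A_n$. The two sets differ here, so the limit of the sequence does not exist.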
PROBLEMS

1.1 a) Prove Theorem 1.4, the laws of algebra of sets, by using the pick-a-point process.
b) Prove the generalized distributive laws and DeMorgan's laws stated in Remark 1.6.

1.2 Show that:
a) $(A \cup B) \setminus C = (A \setminus C) \cup (B \setminus C)$.
b) $(A \cap B) \setminus C = (A \setminus C) \cap (B \setminus C)$.
c) $C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$.
d) $C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$.

1.3 Show that $A \setminus B = A \cap B^c$.

1.4 Let $|A| = n$ (i.e., the set $A$ contains $n$ elements). Show that $|\mathcal{P}(A)| = 2^n$.

1.5 Prove that:
a) $(A \setminus B)^c = A^c \cup B$.
b) $[(A^c \cup B)^c \cup (A \cup B^c)]^c = B \setminus A$.
c) $(A \cap B) \cup (A \cap B^c) \cup (A^c \cap B) = A \cup B$.

1.6 For each of the following, justify with a proof or give a counterexample.
a) $A \cup C = B \cup C \Rightarrow A = B$.
b) $(A \cup B) \setminus B = A$.
c) $A \setminus B = C \setminus B \Rightarrow A = C$.
d) $(A \setminus B)^c = (A \cap B^c)^c$.

1.7 Give an example of a monotone vanishing sequence of sets.

1.8 Let $\{A_n: n = 1, 2, \ldots\}$ be an arbitrary sequence of sets. Define
$$A_0 = \bigcap_{n=1}^{\infty} A_n \quad \text{and} \quad A_\infty = \bigcup_{n=1}^{\infty} A_n.$$
a) Construct a monotone nonincreasing sequence of sets $\{B_n\}$ such that $\{B_n\} \downarrow A_0$.
b) Construct a monotone nondecreasing sequence of sets $\{C_n\}$ such that $\{C_n\} \uparrow A_\infty$.
c) Given $\{C_n\} \uparrow A_\infty$, construct a pairwise disjoint sequence $\{D_n\}$ such that $\sum_{n=1}^{\infty} D_n = A_\infty$.

1.9 In the condition of Problem 1.8, show that

1.10 Let $\Omega$ be an arbitrary set. Find a sequence $\{E_n\}$ of subsets of $\Omega$ such that
NEW TERMS:

set, element of a set, Russell's paradox, subset, superset, singleton, empty set, proper subset, proper superset, universe, carrier, sample space, events, elementary events, union, intersection, disjoint sets, difference, symmetric difference, complement, set-algebraic expression, set-algebraic transformation, pick-a-point process, axiom of extent, commutative laws, associative laws, distributive laws, idempotence, DeMorgan's laws, pairwise disjoint sets, disjoint family of sets, decomposition of a set, partition of a set, partition of an interval, power set, monotone nondecreasing sequence of sets, monotone nonincreasing sequence of sets, monotone vanishing sequence of sets, limit inferior, limit superior, limit of a sequence
2. FUNCTIONS
The word "function" was introduced by Gottfried Leibniz in 1694, initially as a term to denote any quantity related to a curve, such as its slope, the radius of curvature, etc. The notion of the function was refined subsequently by Johann Bernoulli, Leonhard Euler, Joseph Fourier, and finally, by Lejeune Dirichlet in the middle of the nineteenth century, with a formulation pretty close to what we are using at the present time and which a mathematics or engineering student meets in an introductory calculus course. Dirichlet introduced a variable as a symbol that represents a set of numbers. If two variables $x$ and $y$ are so related that whenever $x$ takes on a value, there is a value $y$ assigned to $x$ by some rule of correspondence, then $y$ (a dependent variable) was said to be a function of $x$ (an independent variable).

In this section we introduce a more contemporary notion of a function. For functions operating with sets (rather than with points), we will be using the nontraditional notation $f_*$ and $f^*$ (instead of just $f$), previously used by MacLane and Birkhoff [1993], which we found very appealing, as it brings more order within functions acting on collections of sets (such as topologies and sigma-algebras) and simplifies many proofs.
2.1 Definitions.
(i) Let $X$ and $Y$ be two sets. The set $\{(x, y): x \in X, y \in Y\}$ of all ordered pairs of elements of $X$ and $Y$ is called the Cartesian or direct product of $X$ and $Y$ and it is denoted by $X \times Y$. If $X = Y$ then we shall write $X \times X = X^2$. Similarly, the Cartesian product of $n$ sets is the set of all ordered $n$-tuples.
(ii) Any subset $f$ of $X \times Y$ is called a binary relation.
(iii) A binary relation $f \subseteq X \times Y$ is called a (single-valued) function if whenever $(x, y_1)$ and $(x, y_2)$ are elements of $f$, then $y_1 = y_2$. We also say that the function $f$ is a map (or mapping) from $X$ to $Y$ and denote this most frequently by the triple $[X, Y, f]$ or by $f: X \to Y$ or by $(x, f(x))$ or by $f(x) = y$ or by $x \mapsto f(x)$.
(iv) For a function $f$ (as a subset of $X \times Y$), denote
$$D_f = \{x \in X: (x, y) \in f\}$$
and call it the domain of $f$. When a function $[X, Y, f]$ is given we will
agree that X is the domain of f. If a domain is not specified, we agree to regard as D f the largest possible set where f is defined. The latter re quires a more rigorous motivation. For instance, let
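$$f(x) = \frac{1}{\sqrt{x-1}}.$$
This function is defined for all $x \in (1,\infty)$. On the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$, we allow $x \in [1,\infty]$. And finally, it is not wrong to have $x$ be any real (or even complex) number, if $f$ will take on values in $Y \subseteq \mathbb{C}$ (or $\overline{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$).
(v) Another component of a function is its range,
$$R_f = \{y \in Y : (x,y) \in f \text{ for some } x \in D_f\}.$$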
A superset of $R_f$ (such as $Y$) is referred to as a codomain. In other words, $R_f$ is the subset of all those elements of $Y$ which take part in the relation $f \subseteq D_f \times Y$.
(vi) If $x \in D_f$, then $f(x)$ $(\in R_f)$ is called the image of $x$ under $f$. By the above definition, for every $x$ there is a unique image. [Note that an "extended" concept of a function allows more than one image of each point $x$ under $f$. Any such function $f$ is called multi-valued. The reader is surely acquainted with the principles of complex analysis, where such functions are common. It is also known that in this case the range of a multi-valued function can be partitioned into pairwise disjoint subsets, such that the function is then split into a number of single-valued functions called branches.]
(vii) If $D \subseteq D_f$ then the set of the images of all points of $D$ under $f$ is called the image of $D$ under $f$ and, following the notation of most analysis textbooks, it can be denoted
$$f(D) = \{y \in Y : \exists x \in D,\ f(x) = y\}.$$
However, for the upcoming constructions, it is convenient to distinguish images of points of a set from images of subsets of $X$ under $f$. In other words, we introduce the function
$$[\mathcal{P}(X), \mathcal{P}(Y), f_*],$$
where for $D \in \mathcal{P}(X)$ we denote
$$f_*(D) = \{y \in Y : \exists x \in D,\ f(x) = y\}.$$
Specifically, $R_f = f_*(D_f)$. We agree to set $f_*(\{x\}) = \emptyset$ $\forall x \notin D_f$. However, unless specified, we will always assume that in $[X,Y,f]$, $X$ is the domain of function $f$. [In particular, this agreement excludes such an inconsistency as having $f(x) = \emptyset$ whenever $x \notin D_f$, since $f(x)$ is supposed to be a point and not a set.]
(viii) Let $[X,Y,f]$ be a function. Define the function
$$[\mathcal{P}(Y), \mathcal{P}(X), f^*]$$
and call it the inverse of $f_*$. In other words, for each $B \in \mathcal{P}(Y)$, $f^*(B) = \{x \in X : f(x) \in B\}$. The set $f^*(B)$ is called the inverse image of $B$ under $f$, or the pre-image of $B$ under $f$. Another construction related to $f^*$ is $f^{-1}$, defined as $\{(y,x) \in Y \times X : (x,y) \in f\}$ and called the inverse of $f$. Unlike $f^*$, in general, $f^{-1}$ is not a single-valued function (in other words, it is a binary relation or multi-valued function). Consider, for instance, the function $[\mathbb{R},\mathbb{R},f]$ such that $f(x) = x^2$. Clearly, $R_f = \mathbb{R}_+$ and the inverse $\sqrt{\ } = f^{-1}$ of $f$ is a two-valued function with domain $D_{f^{-1}} = \mathbb{R}_+$ and with range equal to $\mathbb{R}$, which can be decomposed as $\mathbb{R} = (-\infty,0) + [0,\infty)$. Accordingly, we have two branches $[\mathbb{R}_+, (-\infty,0), \sqrt{\ }]$ and $[\mathbb{R}_+, \mathbb{R}_+, \sqrt{\ }]$ of $\sqrt{\ }$.
(ix) Observe that it is legitimate that $f(x_1) = f(x_2)$ and $x_1 \neq x_2$. However, if $f$ is such that $f(x_1) = f(x_2)$ if and only if $x_1 = x_2$, then $f$ is called one-to-one (or injective or invertible). If $f$ is one-to-one, $f^{-1}$ is a single-valued function too. Since $f^{-1}$ in general is not a single-valued function, we will agree to regard $f^{-1}(y)$ as a set (which in particular can be a singleton or the empty set), with the alternative notation $f^*(\{y\})$.
(x) Let $[X,Y,f]$ be a function. Generally, $f_*(X) = R_f \subseteq Y$. In this case, we say the map $f$ is from $X$ into $Y$. When $f_*(X) = Y$, we say the map $f$ is from $X$ onto $Y$ or surjective. We call $f$ bijective if $f$ is surjective (onto) and injective (one-to-one).
(xi) Let $f \subseteq X \times Y$ and $g \subseteq Y \times Z$ be binary relations. Then the composition of $f$ with $g$ is defined as
$$g \circ f = \{(x,z) \in X \times Z : \exists y : (x,y) \in f,\ (y,z) \in g\}.$$
The composition of $f$ with $g$ is most frequently used when $[X,Y,f]$ and $[R_f \cap D_g, Z, g]$ are functions and, consequently, it is defined as
$$[X, R_f \cap D_g, Z, g \circ f].$$
□
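The constructions of Definition 2.1 can be tried out on finite sets. The following is a minimal sketch, assuming a binary relation is stored as a Python set of pairs; the helper names (`is_function`, `image`, `inverse_image`, `inverse`, `compose`) and the sample relations `square` and `succ` are illustrative choices, not notation from the text.

```python
# A binary relation between X and Y is modelled as a set of (x, y) pairs.

def is_function(rel):
    """True if no x is paired with two different y values (single-valued)."""
    seen = {}
    for x, y in rel:
        if seen.setdefault(x, y) != y:
            return False
    return True

def domain(rel):
    """D_f: all first entries of the pairs."""
    return {x for x, _ in rel}

def range_of(rel):
    """R_f: all second entries of the pairs."""
    return {y for _, y in rel}

def image(rel, D):
    """f_*(D): all y related to some x in D."""
    return {y for x, y in rel if x in D}

def inverse_image(rel, B):
    """f^*(B): all x whose value lies in B."""
    return {x for x, y in rel if y in B}

def inverse(rel):
    """f^{-1}: the relation with every pair reversed."""
    return {(y, x) for x, y in rel}

def compose(g, f):
    """g o f = {(x, z): there is y with (x, y) in f and (y, z) in g}."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}

X = {-2, -1, 0, 1, 2}
square = {(x, x * x) for x in X}           # f(x) = x^2 on X
succ = {(y, y + 1) for y in range(0, 5)}   # g(y) = y + 1 on {0, ..., 4}

print(is_function(square))                 # single-valued
print(is_function(inverse(square)))        # not single-valued: 1 goes to -1 and 1
print(sorted(image(square, {1, 2})))
print(sorted(inverse_image(square, {4})))
print(sorted(compose(succ, square)))       # pairs (x, x^2 + 1)
```

Storing a function as a set of pairs keeps the code close to the definition; the inverse relation and the composition then need no special treatment for multi-valued cases.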
2.2 Example. For a fixed subset $A \subseteq X$, define the indicator function $[X,\mathbb{R},1_A]$ as
$$1_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \in A^c. \end{cases}$$
Then, $[X,\mathbb{R},1_A]$ is an into map, while $[X,\{0,1\},1_A]$ is an onto map (provided $\emptyset \neq A \neq X$). □
2.3 Definition. Let $f: X \to Y$ and let $A \subseteq X$. Then define
$$\mathrm{Res}_A f = \{(x,y) \in (A \times Y) \cap f\}.$$
This function is called the restriction of $f$ to $A$. On the other hand, the function $f$ is called an extension of the function $\mathrm{Res}_A f$ from $A$ to $X$. □
2.4 Example. Consider $[\mathbb{R},[-1,1],\sin]$, which is surjective (i.e., onto) but not injective (one-to-one). Take a restriction of the function $[\mathbb{R},[-1,1],\sin]$ to one of the largest subsets $A$ of $\mathbb{R}$ where $[\mathbb{R},[-1,1],\sin]$ is monotone increasing. It is plausible to set $A = [-\frac{\pi}{2}, \frac{\pi}{2}]$, since it is also symmetric about the $Y$-axis. Then $[A,[-1,1],\mathrm{Res}_A \sin]$ is obviously bijective and its inverse is the well-known function $[[-1,1],A,\arcsin]$. □
2.5 Remark. Let $[X,Y,f]$ be a single-valued function such that for some $y \in R_f$, $f^*(\{y\}) = \{x_1,x_2,x_3\} \subseteq X$. Consider the composition $f_* \circ f^*$ and find that
$$f_* \circ f^*(\{y\}) = f_*(\{x_1,x_2,x_3\}) = \{y\}.$$
Thus, if $f$ is single-valued, the restriction of $f \circ f^{-1}$ to $R_f$ is the identity function (denoted $I$, with the domain $D_{f \circ f^{-1}} = R_f$). However, $f^{-1} \circ f$ need not be a single-valued function at all (show it). $f^{-1} \circ f$ is the identity function only when $f$ is injective. □
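Remark 2.5 can be checked on a small finite case. The sketch below assumes relations stored as sets of pairs, as in Definition 2.1; the names `compose`, `restrict` and `is_function`, and the choice of $f(x) = x^2$ on $\{-2,\dots,2\}$, are illustrative assumptions.

```python
def compose(g, f):
    """g o f for relations given as sets of pairs."""
    return {(x, z) for x, y in f for y2, z in g if y == y2}

def restrict(f, A):
    """Res_A f: keep only the pairs whose first entry lies in A."""
    return {(x, y) for x, y in f if x in A}

def is_function(rel):
    """Single-valued iff every first entry occurs in exactly one pair."""
    return len({x for x, _ in rel}) == len(rel)

X = {-2, -1, 0, 1, 2}
f = {(x, x * x) for x in X}           # f(x) = x^2, not injective on X
f_inv = {(y, x) for x, y in f}        # the inverse relation f^{-1}

f_after_inv = compose(f, f_inv)       # f o f^{-1}: identity on R_f
inv_after_f = compose(f_inv, f)       # f^{-1} o f: pairs x with -x as well

A = {0, 1, 2}                         # f restricted to A is injective
g = restrict(f, A)
g_inv = {(y, x) for x, y in g}

print(sorted(f_after_inv))
print(is_function(inv_after_f))
print(sorted(compose(g_inv, g)))      # identity on A
```

Restricting to a set on which $f$ is injective is exactly what turns $f^{-1} \circ f$ back into an identity map, in line with Example 2.4.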
PROBLEMS
2.1 Find the image of $[-3,5)$ under $1_{(1,2]}$.
2.2 Find the inverse image of (�,4] under $1_{(1,2]}$.
2.3 Composition:
a) Show that the compose operator is associative.
b) Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
c) Show that $D_{g \circ f} = D_f \cap f^*(D_g)$.
2.4 Show the equivalence of the following statements:
a) $f$ is one-to-one.
b) $f_*(A \cap B) = f_*(A) \cap f_*(B)$.
c) For every pair $A$ and $B$ of disjoint sets, $f_*(A) \cap f_*(B) = \emptyset$.
In the following problems we assume that $f$ is a map from $X$ into $Y$.
2.5 Show that $A \subseteq X \Rightarrow A \subseteq f^* \circ f_*(A)$.
2.6 Show that $\forall B \subseteq Y$, $f_* \circ f^*(B) \subseteq B$.
2.7 Show that $[X,Y,f]$ is onto if and only if $f_* \circ f^*(B) = B$ holds $\forall B \subseteq Y$.
NEW TERMS:
Cartesian (direct) product, binary relation, function, map, mapping, domain, range, codomain, image of a point, multi-valued function, image of a set, branch of a function, inverse image of function $f$, pre-image, inverse of function $f$, one-to-one (injective, invertible) map, into map, onto (surjective) map, bijective (onto and one-to-one) map, composition of binary relations, composition of maps, indicator function, restriction of a map, extension of a map, identity function
3. SET OPERATIONS UNDER MAPS
The most remarkable property of the inverse of a function is that it "preserves" all set operations. The function itself, as we shall see, does not have such a quality. The main theorems in this section will be proved for special cases of surjective maps; the rest will be left for the reader.
3.1 Theorem. Let $[X,Y,f]$ be a surjective map and let $B \subseteq Y$. Then
$$[f^*(B)]^c = f^*(B^c).$$
Proof. We prove an equivalent statement,
$$f^*(B) + f^*(B^c) = X,$$
i.e., we show that (i) $f^*(B)$ and $f^*(B^c)$ are disjoint and (ii) $f^*(B)$ complements $f^*(B^c)$ up to $X$. We start with:
(i) Suppose $f^*(B)$ and $f^*(B^c)$ have a common point $x$. Then there is $y_1 \in B$ such that $f(x) = y_1$ and $y_2 \in B^c$ such that $f(x) = y_2$. Thus, $y_1 \neq y_2$ and $f$ is not a single-valued function, which is a contradiction. (See Figure 3.1.)
Figure 3.1
(ii) If $f^*(B)$ does not complement $f^*(B^c)$ up to $X$, there will be at least one point $x$ which does not belong to either of these sets (for they are disjoint as shown above). This is an obvious contradiction, since it follows that $f(x) \notin Y$. (See Figure 3.2 below.) □
Figure 3.2
3.2 Example. Let $[X,Y,f]$ be a function. Then $[f^*(Y)]^c = X^c = \emptyset$. On the other hand, setting $B = Y$, by Problem 3.1, we obtain
$$[f^*(Y)]^c = f^*(Y^c) = f^*(\emptyset),$$
i.e. $f^*(\emptyset) = \emptyset$. □
3.3 Theorem. Let $[X,Y,f]$ be a surjective map. Then $B_1 \subseteq B_2 \subseteq Y$ implies that $f^*(B_1) \subseteq f^*(B_2)$.
Proof. Suppose that $f^*(B_1)$ is not a subset of $f^*(B_2)$. This implies the existence of a point $x$ which belongs to $f^*(B_1)$ and does not belong to $f^*(B_2)$. Therefore, there is exactly one point $y \in B_1$ with $f(x) = y$. On the other hand, since $x \notin f^*(B_2)$, $f(x)$ cannot belong to $B_2$. But it must, since $f(x) = y \in B_1 \subseteq B_2$. (See Figure 3.3 below.) Hence, our assumption above was wrong. □
Figure 3.3
3.4 Theorem. Let $f: X \to Y$ be an onto map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$. Then,
$$f^*\Big(\bigcup_{i \in I} B_i\Big) = \bigcup_{i \in I} f^*(B_i).$$
Proof.
(i) We prove that $\bigcup_{i \in I} f^*(B_i) \subseteq f^*\big(\bigcup_{i \in I} B_i\big)$. Let $x \in \bigcup_{i \in I} f^*(B_i)$. Then there is an index $i_0 \in I$ such that $x \in f^*(B_{i_0})$. Since $B_{i_0} \subseteq \bigcup_{i \in I} B_i$, by Theorem 3.3, $f^*(B_{i_0}) \subseteq f^*\big(\bigcup_{i \in I} B_i\big)$, which implies that $x \in f^*\big(\bigcup_{i \in I} B_i\big)$.
(ii) We show the validity of the inverse inclusion, $f^*\big(\bigcup_{i \in I} B_i\big) \subseteq \bigcup_{i \in I} f^*(B_i)$. Let $x \in f^*\big(\bigcup_{i \in I} B_i\big)$. Then $f(x) \in \bigcup_{i \in I} B_i$. Therefore, there is an index $i_0 \in I$ such that $f(x) \in B_{i_0}$, or equivalently $\{f(x)\} \subseteq B_{i_0}$. By Theorem 3.3, it follows that $f^*(\{f(x)\}) \subseteq f^*(B_{i_0})$. Since $x \in f^*(\{f(x)\})$, we have
$$\{x\} \subseteq f^*(\{f(x)\}) \subseteq f^*(B_{i_0}) \subseteq \bigcup_{i \in I} f^*(B_i).$$
□
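The contrast between inverse images and images can be observed on a small finite map. The sketch below assumes a map stored as a Python dictionary; the helper names `image` and `preimage` and the sample map are illustrative. A finite check is of course only an illustration and not a proof.

```python
def image(f, A):
    """f_*(A) for a map stored as a dict."""
    return {f[x] for x in A}

def preimage(f, B):
    """f^*(B) = {x : f(x) in B}."""
    return {x for x in f if f[x] in B}

X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}   # an onto map from X to Y

B1, B2 = {"a", "b"}, {"b", "c"}

# Inverse images commute with union, intersection and complement.
union_ok = preimage(f, B1 | B2) == preimage(f, B1) | preimage(f, B2)
inter_ok = preimage(f, B1 & B2) == preimage(f, B1) & preimage(f, B2)
compl_ok = preimage(f, Y - B1) == X - preimage(f, B1)

# Images need not commute with intersection: 1 and 2 share the value "a".
A1, A2 = {1, 3}, {2, 3}
lhs = image(f, A1 & A2)                # image of the intersection
rhs = image(f, A1) & image(f, A2)      # intersection of the images

print(union_ok, inter_ok, compl_ok)
print(sorted(lhs), sorted(rhs))
```

The failure for images comes from two distinct points sharing one value, which cannot happen for an injective map.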
PROBLEMS
3.1 Prove Theorem 3.1 under the condition that $f$ is an into map.
3.2 Prove Theorem 3.3 under the condition that $f$ is an into map.
3.3 Generalize Theorem 3.4 when $f$ is an into map.
3.4 Let $[X,Y,f]$ be an into map and let $\{B_i : i \in I\}$ be an indexed family of subsets of $Y$.
a) Prove that $f^*\big(\bigcap_{i \in I} B_i\big) = \bigcap_{i \in I} f^*(B_i)$.
b) If $\{B_i : i \in I\}$ is a pairwise disjoint family, show that $f^*\big(\sum_{i \in I} B_i\big) = \sum_{i \in I} f^*(B_i)$.
3.5 Show that $f^*(A \setminus B) = f^*(A) \setminus f^*(B)$.
3.6 The results above prove that all set operations are closed under the inverses of maps. Show that not all set operations are closed under maps, as per the following.
a) Show that maps preserve inclusions.
b) Show that maps preserve unions.
c) Show that maps do not preserve intersections; specifically, show that
$$f_*\Big(\bigcap_{i \in I} A_i\Big) \subseteq \bigcap_{i \in I} f_*(A_i)$$
and that the inverse inclusion need not hold. Explain the latter without a counterexample.
d) Do maps preserve the difference?
3.7 Let $[X,Y,f]$ be a map and let $A \subseteq Y$. Show that
3.8 Prove the following properties of the indicator function defined on a nonempty set $\Omega$:
(i) $1_{A \cap B} = \min\{1_A, 1_B\} = 1_A 1_B$.
(ii) $1_{A \cup B} = \max\{1_A, 1_B\}$.
(iii) $1_{A+B} = 1_A + 1_B$.
(iv) $1_{\sum_{i \in I} A_i} = \sum_{i \in I} 1_{A_i}$.
(v) $1_{A^c} = 1 - 1_A$.
(vi) $A \subseteq B \Rightarrow 1_A \leq 1_B$.
(vii) $1_{\bigcup_{i \in I} A_i} = \sup\{1_{A_i} : i \in I\}$, $1_{\bigcap_{i \in I} A_i} = \inf\{1_{A_i} : i \in I\}$.
3.9 Let $\{A_n\}$ be a sequence of subsets of $\Omega$. Show that the function $\overline{\lim}\, 1_{A_n}$ is the indicator function of the set $\overline{\lim}\, A_n$ and that the function $\underline{\lim}\, 1_{A_n}$ is the indicator function of the set $\underline{\lim}\, A_n$.
3.10 Prove that $\lim_{n \to \infty} A_n$ exists if and only if $\lim_{n \to \infty} 1_{A_n}$ exists. [Hint: Use Problem 3.9.]
3.11 Let $[X,X',F]$ be a bijective map and let $\tau$ and $\tau'$ be respective collections of subsets of $X$ and $X'$ such that $F^{**}(\tau') \subseteq \tau$ and $F_{**}(\tau) \subseteq \tau'$. Show that $F^{**}(\tau') = \tau$ and $F_{**}(\tau) = \tau'$.
4. RELATIONS AND WELL-ORDERING PRINCIPLE
In Definition 2.1 (ii) we introduced the concept of a binary relation $R$ as an arbitrary subset of $A \times B$. In the special case when $R \subseteq A \times B$ and $A = B$, we call $R$ a binary relation on $A$. We will sometimes use the notation $aRb$ instead of $(a,b) \in R$. This notation makes sense, for instance, if $R$ is stipulated by $\leq$ or $<$ on some set. In addition, we will also say that a pair $(A,R)$ is a binary relation, where in fact $R$ is a binary relation on a set $A$ (a carrier). Now we consider some special relations.
4.1 Definitions. Let $R$ be a binary relation on $S$.
(i) $R$ is called reflexive if $\forall a \in S$, $(a,a) \in R$ $[aRa]$.
(ii) $R$ is called symmetric if $(a,b) \in R \Rightarrow (b,a) \in R$ $[aRb \Rightarrow bRa]$.
(iii) $R$ is called antisymmetric if $(a,b),(b,a) \in R \Rightarrow a = b$ $[aRb \wedge bRa \Rightarrow a = b]$.
(iv) $R$ is called transitive if $(a,b),(b,c) \in R \Rightarrow (a,c) \in R$ $[aRb \wedge bRc \Rightarrow aRc]$.
(v) $R$ is called an equivalence on $S$ (denoted by the symbol $\sim$ or $E$) if it is reflexive, symmetric and transitive. [Observe that the equivalence $E$ on $S$ partitions $S$ into mutually disjoint subsets, called equivalence classes. A partition of $S$ is a family of disjoint subsets of $S$ whose union is a decomposition of $S$. The elements of $S$ "communicate" only within these classes. Therefore, every equivalence relation generates mutually disjoint classes. The converse is also true: an arbitrary partition of the carrier $S$ generates an equivalence relation.]
(vi) $R$ is called a partial order (denoted by the symbol $\preceq$) if it is reflexive, antisymmetric and transitive.
(vii) If $\preceq$ is a partial order, it is called linear or total if every two elements of $S$ are comparable, i.e. $\forall a,b \in S$ either $a \preceq b$ or $b \preceq a$.
(viii) Let $S$ be an arbitrary set and let $\sim$ $(E)$ be an equivalence relation on $S$. For $t \in S$ denote
$$[t]_\sim\ (= [t]_E) = \{s \in S : s \sim t\}$$
and call it an equivalence class modulo $\sim$ $(E)$. The set of all equivalence classes
$$\{[t]_\sim\} = S|_\sim \quad (\text{or } S|_E \text{ or } S/E)$$
is said to be the quotient (or factor) set of $S$ modulo $\sim$. It is easily seen that a quotient set of $S$ is also a partition of $S$. Note that $x \mapsto [x]$ is a function assigning to each $x \in S$ an equivalence class $[x]$. We will denote this function by $\pi_E$ (or $\pi_\sim$) and call it the projection of $S$ on its quotient by $E$ (or $\sim$). □
4.2 Examples.
(i) $(\mathbb{R}, =)$ is an equivalence relation. Therefore, every real number as a singleton represents an equivalence class.
(ii) $(\mathbb{R}, \leq)$ is a linear order.
(iii) Congruent triangles on a plane offer an equivalence relation on the set of all triangles. [Two sets $A$ and $B$ are called congruent if there exists an "isometric" bijective map $f: A \to B$, i.e., $f$ must preserve the "distance" for every pair of points $a,b \in A$ and their images $f(a),f(b) \in B$.]
(iv) $(\mathbb{R}^2, \leq)$ is not a linear order if we define $\leq$ as $(a_1,b_1) \leq (a_2,b_2)$ if and only if $a_1 \leq a_2 \wedge b_1 \leq b_2$. To make this relation a linear order we can define, for instance, $(a_1,b_1) \leq (a_2,b_2)$ if and only if $\|(a_1,b_1)\| \leq \|(a_2,b_2)\|$, where $\|(a,b)\|$ is the distance of point $(a,b)$ from the origin.
(v) Let $|$ be the relation on $\mathbb{N}$ such that $n \mid m$ if and only if $n$ divides $m$ (without a remainder). It can be shown that $(\mathbb{N}, |)$ is a partial order but not a linear order. (See Problem 4.5.)
(vi) Let $p$ be a fixed integer greater than or equal to 2. Two integers $a$ and $b$ are called congruent modulo $p$ if $a - b$ is divisible by $p$ (without remainder); in notation we write $p \mid a - b$ or $a \equiv b \pmod{p}$. The number $p$ is called the modulus of congruence. Let
$$[m]_p = \{n \in \mathbb{Z} : m \equiv n \pmod{p}\} \quad (m \in \mathbb{Z}).$$
In other words, $[m]_p = \{n \in \mathbb{Z} : \exists k \in \mathbb{Z} : n = kp + m\}$. Then any two integers $m$ and $n$ are related in terms of $[\,\cdot\,]_p$ if and only if $n \in [m]_p$. This is an equivalence relation. (Show it; see Problem 4.1.)
(vii) Let $S$ be a nonempty set and $R \subseteq S \times S$ be a binary relation. Taking for $R$ the diagonal $D = \{(s,s) : s \in S\}$ we have with $(S,D)$ the "smallest" (by the contents of elements of $S \times S$) equivalence relation on $S$, where each element forms a singleton-class, and $D$ partitions $S$ into $\{s\}_{s \in S}$ classes. The "largest" equivalence relation on $S$ is obviously $R = S \times S$ itself and it consists of the single class.
(viii) Any function $[X,Y,f]$ generates an equivalence relation on its domain $X$, partitioning $X$ into disjoint subsets. Define the binary relation $E_f$ $(\sim_f)$ on $X$ as
$$x\,E_f\,y \iff f(x) = f(y).$$
Then, it is readily seen that $E_f$ is an equivalence relation on $X$, referred to as the equivalence kernel of the function $f$. Formally, for every point $y \in f_*(X)$, the pre-image $f^{-1}(y)$ is an equivalence class in $X$ and $\{[f^{-1}(y)]_{E_f} : y \in f_*(X)\}$ is the quotient set of $X$ modulo $E_f$ (or $\sim_f$). Furthermore, $\sum_{y \in f_*(X)} f^{-1}(y)$ is a decomposition of $X$. For instance, the function $f(x) = x^2$ generates a partition of $\mathbb{R}$ into a collection of subsets of the form $\{-a,a\}$, for $a > 0$, along with $\{0\}$, which is a factor set of $\mathbb{R}$ modulo $E_{x^2}$. Another example is the function
$$\Big[X = \mathbb{R} \setminus \big\{\tfrac{\pi(2n-1)}{2} : n \in \mathbb{Z}\big\},\ \mathbb{R},\ \tan\Big].$$
Let
$$A_y = \tan^{-1}(y) = \{\arctan y + \pi n : n \in \mathbb{Z}\} = [\arctan y]_{E_{\tan}}.$$
Then, $E_{\tan}$ is the equivalence kernel of the function $\tan$, and
$$X|_{E_{\tan}} \ (\text{the quotient set of } X \text{ modulo } E_{\tan}) = \{\tan^{-1}(y) : y \in \mathbb{R}\}.$$
□
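The properties in Definitions 4.1 and the classes of Examples 4.2 can be computed directly on finite carriers. The sketch below assumes relations stored as sets of pairs; the function names (`is_reflexive`, `quotient`, `kernel`, and so on) and the finite truncations of $\mathbb{N}$ and $\mathbb{Z}$ are illustrative assumptions, chosen only so that the checks terminate.

```python
from itertools import product

def is_reflexive(S, R):
    return all((a, a) in R for a in S)

def is_symmetric(R):
    return all((b, a) in R for a, b in R)

def is_antisymmetric(R):
    return all(a == b for a, b in R if (b, a) in R)

def is_transitive(R):
    return all((a, d) in R for a, b in R for c, d in R if b == c)

def quotient(S, R):
    """Set of equivalence classes [t] = {s in S : s R t}."""
    return {frozenset(s for s in S if (s, t) in R) for t in S}

def kernel(S, f):
    """Equivalence kernel E_f = {(x, y) : f(x) = f(y)}."""
    return {(x, y) for x, y in product(S, repeat=2) if f(x) == f(y)}

S = range(1, 13)                      # finite stand-in for N
divides = {(n, m) for n, m in product(S, repeat=2) if m % n == 0}
cong4 = {(a, b) for a, b in product(S, repeat=2) if (a - b) % 4 == 0}

T = range(-3, 4)                      # finite stand-in for R
E_sq = kernel(T, lambda x: x * x)     # kernel of f(x) = x^2

print(is_antisymmetric(divides), is_symmetric(divides))
print(sorted(sorted(c) for c in quotient(S, cong4)))
print(sorted(sorted(c) for c in quotient(T, E_sq)))
```

Using `frozenset` for the classes lets the quotient itself be a set, so equal classes generated by different representatives collapse automatically.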
The last discussion about the equivalence relation generated by a function yields some important results and notions we would like to use in the upcoming material of Chapters 6 and 8. While we demonstrated in Example 4.2 (viii) that any function on $X$ generates an equivalence relation, the following proposition states that the converse is also true; namely, that any equivalence relation $E$ is the equivalence kernel of some function.
4.3 Proposition. Let $E$ be an equivalence relation on a nonempty set $X$. Then the projection $[X, X|_E, \pi_E]$ is an onto map with $E$ as the equivalence kernel.
Proof. From the definition of $\pi_E$ it follows that $\pi_E$ is surjective. To claim that $E$ is the equivalence kernel of $\pi_E$, we need to show that $\pi_E(x) = \pi_E(y)$ if and only if $xEy$. Let $\pi_E(x) = \pi_E(y)$. Since $xEx$, $x \in [x]_E$ and therefore, by the assumption ($\pi_E(x) = \pi_E(y)$), $x \in [y]_E$. This proves that $xEy$. Now let $xEz$. If $y \in [x]_E$, then $yEx$ and thus, by transitivity, $yEz$, i.e. $y \in [z]_E$. Therefore, $[x]_E \subseteq [z]_E$. The inverse inclusion, and thus the equality, is due to the symmetry of $E$. Hence, $\pi_E(x) = \pi_E(z)$. □
Proposition 4.3 asserts that the projection $\pi_E$ is a trivial example of an onto function defined on $X$ and with the range $X|_E$. Now suppose $E$ is an equivalence relation on a set $X$ and $[X,Y,f]$ is any function whose equivalence kernel is $E$. The following theorem claims that there is a unique "mediator" $\tilde{f}$ between the quotient set $X|_E$ and the codomain $Y$ of $f$.
4.4 Theorem. Let $E$ be an equivalence relation on a nonempty set $X$ and $[X,Y,f]$ be a function whose equivalence kernel is $E$. Then there is a unique function $[X|_E, Y, \tilde{f}]$ such that $f = \tilde{f} \circ \pi_E$. □
The reader shall be able to take care of this theorem (Problem 4.10) as well as of Corollaries 4.5 and 4.6 (Problems 4.11 and 4.12).
4.5 Corollary. In the condition of Theorem 4.4, if $f$ is onto, then $\tilde{f}$ is bijective. □
4.6 Corollary. Let $[X,Y,f]$ be a function and let $E_f$ denote its equivalence kernel. Then, there is a unique one-to-one function $[X|_{E_f}, Y, \tilde{f}]$ such that $f$ can be represented as a composition $f = \tilde{f} \circ \pi_{E_f}$. Furthermore, $\tilde{f}$ is bijective if $f$ is surjective (onto). □
Now, we turn to a discussion on the partial order relation and all relevant notions and theorems, which we are going to apply throughout the book.
4.7 Definitions. Let $(A, \preceq)$ be a partial order and let $B \subseteq A$. Clearly, $(B, \preceq)$ is also a partial order.
(i) The partial order $(B, \preceq)$ is called a chain in $(A, \preceq)$ if it is linear.
(ii) An element $b_0 \in B$ is called a minimal element of $B$ (relative to $\preceq$) if for each $b \in B$ with $b \preceq b_0$, $b = b_0$ (compared with the smallest element $b_0$, which is $\preceq b$ for all $b \in B$).
(iii) An element $b_\infty \in B$ is called a maximal element of $B$ (relative to $\preceq$), if for each $b \in B$ with $b_\infty \preceq b$, it holds true that $b = b_\infty$ (compared with the largest element $b_\infty$, which is such that $b \preceq b_\infty$ $\forall b \in B$). [Observe that the difference between a minimal element and the smallest element of a set is as follows. A minimal element $b_0$ is $\preceq b \in B$ whenever $b_0$ is comparable with some $b$. In addition, the smallest element is comparable with all elements of $B$.]
(iv) An element $u \in A$ is said to be an upper bound of $B$ if $b \preceq u$ $\forall b \in B$. An element $l \in A$ is said to be a lower bound of $B$ if $l \preceq b$ $\forall b \in B$. If $B$ has lower and upper bounds then $B$ is called bounded (or $\preceq$-bounded).
(v) If the set of upper bounds of $B$ has a smallest element $u_0$ then this element is called the least upper bound of set $B$ (abbreviated $\mathrm{lub}(B)$) or supremum ($\sup(B)$). Similarly, if the set of all lower bounds has a largest element $l_\infty$ then it is called the greatest lower bound of the set $B$ (in notation $\mathrm{glb}(B)$) or infimum ($\inf(B)$). [For instance, 0 is the $\mathrm{glb}((0,1))$ or $\inf(0,1)$ in $(\mathbb{R}, \leq)$, while a lub of the set $[1,\sqrt{2}] \cap \mathbb{Q}$ does not exist in $(\mathbb{Q}, \leq)$.]
(vi) Let $B$ contain at least two points. The partial order $(B, \preceq)$ is called a lattice if every two-element subset of $B$ has a supremum and an infimum and they are also elements of $B$. [In notation: if $B = \{x,y\}$, then
$$x \vee y = \sup\{x,y\} = \sup(B) \quad \text{and} \quad x \wedge y = \inf\{x,y\} = \inf(B).]$$
□
4.8 Examples.
(i) Let $B = \{1, 3, 3^2, \ldots, 3^n, \ldots\}$. Then $(B, |)$ (where $|$ is the relation in Example 4.2 (v)) is a chain in $(\mathbb{N}, |)$.
(ii) Let $B = \{2,3,4,\ldots\}$ and consider the relation $|$ on $B$. In terms of this relation, the set of all prime numbers $\{2,3,5,7,11,\ldots\}$ is the set of all minimal elements, while there is no smallest element in $B$, since there is no minimal element related to all other elements. $B$ does not have a maximal element either.
(iii) Consider the partial order $(\mathcal{P}(\Omega), \subseteq)$. It is obvious that for an arbitrary subcollection $\mathcal{A} = \{A_i \subseteq \Omega : i \in I\} \subseteq \mathcal{P}(\Omega)$, it is true that
$$\sup \mathcal{A} = \bigcup_{i \in I} A_i \in \mathcal{P}(\Omega) \quad \text{and} \quad \inf \mathcal{A} = \bigcap_{i \in I} A_i \in \mathcal{P}(\Omega).$$
In particular, it holds true for pairs of subsets. Thus, $(\mathcal{P}(\Omega), \subseteq)$ is a lattice. □
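Examples 4.8 (i) and (ii) can be explored numerically. The sketch below assumes a finite truncation $\{2,\ldots,30\}$ of the set $B$ from Example 4.8 (ii); the helper names are illustrative. Because of the truncation, the finite set does have maximal elements, unlike the infinite $B$, so only minimal and smallest elements are examined here.

```python
def divides(a, b):
    """The partial order a | b on positive integers."""
    return b % a == 0

def is_chain(B, leq):
    """True if every two elements of B are comparable."""
    return all(leq(a, b) or leq(b, a) for a in B for b in B)

def minimal_elements(B, leq):
    """Elements b0 such that b <= b0 forces b == b0."""
    return {b0 for b0 in B if all(b == b0 for b in B if leq(b, b0))}

def smallest_element(B, leq):
    """The element below all others, or None if there is none."""
    for b0 in B:
        if all(leq(b0, b) for b in B):
            return b0
    return None

B = set(range(2, 31))                 # finite stand-in for {2, 3, 4, ...}

print(is_chain({1, 3, 9, 27}, divides))
print(is_chain({2, 3, 4}, divides))
print(sorted(minimal_elements(B, divides)))   # the primes up to 30
print(smallest_element(B, divides))
```

Adding the number 1 to `B` would change the picture: 1 divides everything, so it would become the smallest element and the only minimal one.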
4.9 Definition. A linear order $(A, \preceq)$ is said to be well-ordered if every nonempty subset of $A$ has a smallest element in the sense of the same order $\preceq$. □
4.10 Example. Let $\mathbb{R}$ be the set of all real numbers and consider the relation $(\mathbb{R}, \leq)$, which is clearly a linear order. However, $\mathbb{R}$ is not well-ordered by $\leq$, for there are nonempty subsets containing no smallest element, such as $(0,1)$. But $(\mathbb{N}, \leq)$ is well-ordered. □
Can all sets be well-ordered? This is one of the fundamental questions in set theory posed by Georg Cantor in the 1870's. Cantor considered it obvious that every set can indeed be well-ordered. At that time set theory was not well-postulated yet. In 1908, Ernst Zermelo formulated his axiom of choice and showed in his paper, Untersuchungen über die Grundlagen der Mengenlehre, that the axiom of choice is equivalent to the "well-ordering principle." The axiom of choice was included in an axiom scheme for set theory that was later (1922) strengthened by A. Fraenkel in his paper, Zu den Grundlagen der Cantor-Zermeloschen Mengenlehre.
Zermelo and Fraenkel introduced the following notions. Let $\mathcal{S}$ be a collection of sets. A function $c$ defined on $\mathcal{S}$ is called a choice function, if for each $S \in \mathcal{S}$, $c(S) \in S$. In other words, $c$ assigns to each set exactly one element of the set. Or less formally, we can choose exactly one element from each set. Observe that if $\mathcal{S}$ is an indexed set, i.e. $\mathcal{S} = \{S_i : i \in I\}$, then we have $f(i) = c(S_i) \in S_i$. The axiom of choice is formulated in this way:
Every system of nonempty sets has a choice function.
Zermelo proved that a nonempty set $A$ can be well-ordered if and only if its power set $\mathcal{P}(A)$ has a choice function. [There will be a short discussion of the axiom of choice in the upcoming sections.]
4.11 Theorem (Zermelo). The axiom of choice is equivalent to the well-ordering principle.
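For a finite family of finite sets of integers a choice function can be written down explicitly, and no axiom is needed. The sketch below assumes the rule "pick the minimum" as the choice function `choice`; this rule, the helper `fibres`, and the sample map $f(x) = x^2$ are illustrative assumptions. Applying the choice function to the pre-images of an onto map selects a subset on which the map is bijective.

```python
def choice(S):
    """A choice function for nonempty finite sets of integers: pick the minimum."""
    return min(S)

def fibres(f, X):
    """Group the points of X by their value: y -> f^{-1}(y)."""
    out = {}
    for x in X:
        out.setdefault(f(x), set()).add(x)
    return out

def f(x):
    return x * x

X = range(-3, 4)
fib = fibres(f, X)                    # pre-images of 0, 1, 4, 9
A = {choice(s) for s in fib.values()} # one point chosen from each pre-image

values_on_A = {f(a) for a in A}

print(sorted(A))
print(sorted(values_on_A))
```

Any other rule that returns a member of each set, such as `max`, would serve equally well; the point is only that one element is selected from every pre-image.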
4.12 Examples.
(i) To illustrate a use of the axiom of choice, consider the following example. Let $[X,Y,f]$ be an onto map. We show that there exists a subset $A \subseteq X$ such that $\mathrm{Res}_A f : A \to Y$ is bijective. Let $c$ be a choice function for the factor set $\{[f^{-1}(y)] : y \in Y\}$ of $X$ modulo $E_f$. Then the set $A = \{c(f^{-1}(y)) : y \in Y\}$ has the desired property. In other words, we choose one $x$ from $f^{-1}(y)$ for each $y$ and the collection of all these $x$'s is $A$.
(ii) Let $A = \{c(\tan^{-1} y) = \arctan y : y \in \mathbb{R}\}$. Then $A = (-\frac{\pi}{2}, \frac{\pi}{2})$ and hence $[A, \mathbb{R}, \mathrm{Res}_A \tan]$ is a function such that it is one-to-one and $(\mathrm{Res}_A \tan)^{-1} = \arctan$. □
One of the central results in set theory is Zorn's Lemma [1935], which is widely used in set theory and which is also equivalent to the axiom of choice.
4.13 Lemma (Zorn). If each chain in a partially ordered set $A$ has an upper bound, then $A$ has a maximal element.
PROBLEMS
4.1 Show that the relation in Example 4.2 (vi) is an equivalence relation on $\mathbb{Z}$. Give the equivalence classes for $p = 4$.
4.2 Classify the following binary relations.
a) Let $\Omega$ be a nonempty set. Define the relation $(\mathcal{P}(\Omega), \subseteq)$.
b) Let $\Omega = \mathbb{R}^2 \setminus (x,0)$. Define $R$: $(a,b)R(c,d) \iff ad = bc$.
4.3 The following theorem is a statement of the principle of mathematical induction: Let $S(n)$ be a statement which is true or false, for $n = 1,2,\ldots$. Let $S(1)$ be true and let $S(n)$'s being true imply that $S(n+1)$ is true, $n = 1,2,\ldots$. Then $S(n)$ is true for all $n$. Prove it. [Hint: Use the well-ordering principle.]
4.4 Prove that $\sum_{i=1}^{n} i^2 = \frac{1}{6}n(n+1)(2n+1)$.
4.5 Show that $(\mathbb{N}, |)$ in Example 4.2 (v) is a partial order relation.
4.6 Is $(\mathbb{N}, |)$ a lattice? Is $(\mathbb{R}, \leq)$ a lattice?
4.7 Is $((1,3), \leq)$ a lattice?
4.8 Is the set of all continuous real-valued functions a lattice?
4.9 Is the set of all real-valued polynomials a lattice?
4.10 Prove Theorem 4.4.
4.11 Prove Corollary 4.5.
4.12 Prove Corollary 4.6.
NEW TERMS:
binary relation on a set (reflexive, symmetric, antisymmetric, transitive, equivalence, partial order, linear (total) order), comparable elements, equivalence class modulo $\sim$ $(E)$, quotient (factor) set, projection of a set on its quotient, congruence, congruence modulo $p$, modulus of congruence, equivalence classes generated by a function, equivalence kernel of a function, chain, minimal element, smallest element, maximal element, largest element, upper bound, lower bound, bounded set, least upper bound (supremum), greatest lower bound (infimum), lattice, well-ordered set, well-ordering principle, choice function, axiom of choice, Zermelo's Theorem, Zorn's Lemma, principle of mathematical induction
5. CARTESIAN PRODUCT
The idea of the Cartesian product (or, equivalently, direct product) primarily belongs to René Descartes, who introduced this notion for two sets $X$ and $Y$ as a set of all ordered pairs $\{(x,y) : x \in X \text{ and } y \in Y\}$. Descartes was also the one who introduced the widely used Cartesian coordinate system related to the Cartesian product. In Definition 2.1, we introduced the notion of the Cartesian product of finitely many sets. We are going to extend this definition to arbitrarily many sets. We begin with sequences of sets.
5.1 Definitions.
(i) Let $\{Y_i : i = 1,2,\ldots\}$ be a sequence of arbitrary sets. Then the Cartesian product of this sequence is the set of all sequences
$$\prod_{n=1}^{\infty} Y_n = \{(a_1,a_2,\ldots) : a_i \in Y_i,\ i = 1,2,\ldots\}$$
of elements from $Y_1, Y_2, \ldots$.
(ii) In the general case, let $\{Y_x : x \in X\}$ be an indexed family of sets.
Figure 5.1
Then the Cartesian product (see Figure 5.1 above)
$$\prod_{x \in X} Y_x = \Big\{f : X \to \bigcup_{x \in X} Y_x \ :\ f(x) \in Y_x,\ x \in X\Big\}$$
is the collection of all functions defined on the index set $X$ and valued in $Y_x$. Each such function is a choice function for the family $\{Y_x : x \in X\}$. □
5.2 Remarks.
(i) One of the basic questions that arises is this: when is the Cartesian product nonempty? Obviously, if at least one set $Y_k = \emptyset$, then $\prod_{x \in X} Y_x = \emptyset$. But if all $Y_x \neq \emptyset$, is the Cartesian product necessarily nonempty? Although the answer may seem obvious, we must turn to the axiom of choice. In other words, the Cartesian product of a family of sets is nonempty if and only if there exists at least one choice function for this family.
(ii) We said that the Cartesian product of the family of sets $\{Y_x : x \in X\}$ is the collection of all functions from $X$ to $Y_x$, $x \in X$. In particular, if $Y_x = Y$ for all $x \in X$, then the Cartesian product is the collection of all functions from $X$ to $Y$ and is naturally denoted by $Y^X$. Alternatively, the set $Y^X$ is also denoted by �(X;Y).
(iii) Let $X$ be an arbitrary set. Then every subset $A \subseteq X$ can be associated with its indicator function $1_A$. Conversely, $A = \{x \in X : 1_A(x) = 1\}$. Therefore, we can set a one-to-one correspondence between the subsets of $X$ and the indicator functions on $X$.
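For a finite index set, the product of Definition 5.1 (ii) can be enumerated as a list of functions. The sketch below assumes each function is stored as a Python dictionary from indices to values; the helper name `cartesian_product` and the sample families are illustrative assumptions.

```python
from itertools import product

def cartesian_product(family):
    """All functions f on the index set with f[x] in family[x], as dicts."""
    index = sorted(family)
    factors = [sorted(family[x]) for x in index]
    return [dict(zip(index, values)) for values in product(*factors)]

family = {"p": {0, 1}, "q": {"a", "b", "c"}}
prod = cartesian_product(family)                            # 2 * 3 functions
empty = cartesian_product({"p": {0, 1}, "q": set()})        # one empty factor

# Y^X with Y = {0, 1}: every element is the indicator function of a subset.
X = ["x1", "x2", "x3"]
powers = cartesian_product({x: {0, 1} for x in X})
subsets = [frozenset(x for x in X if g[x] == 1) for g in powers]

print(len(prod), len(empty))
print(len(powers), len(set(subsets)))
```

The last two counts agree because distinct indicator functions describe distinct subsets, which is the correspondence noted in Remark 5.2 (iii).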
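5.3 Definitions.
(i) Let $\{Y_x : x \in X\}$ be a collection of sets. The map
$$\Big[\prod_{x \in X} Y_x,\ Y_\alpha,\ \pi_\alpha\Big]$$
for each $\alpha \in X$ is called the $\alpha$th projection map if $\pi_\alpha(f) = f(\alpha)$, where $f \in \prod_{x \in X} Y_x$, $f(\alpha) \in Y_\alpha$. The point $f(\alpha)$ is called the $\alpha$th coordinate of $f$ and the space $Y_\alpha$ is called the $\alpha$th factor space. (See Figure 5.2.) [Observe that $\pi_\alpha^*(\{f(\alpha)\}) \neq \{f\}$ but it contains $\{f\}$. For instance, if $X = \{1,\ldots,n\}$ is finite,
$$\pi_\alpha^*(\{f(\alpha)\}) = Y_1 \times \cdots \times Y_{\alpha-1} \times \{f(\alpha)\} \times Y_{\alpha+1} \times \cdots \times Y_n.$$
In general, $\pi_\alpha^*(\{f(\alpha)\}) = \prod_{x \in X} \tilde{Y}_x$, where $\tilde{Y}_x = Y_x$ for $x \neq \alpha$, and $\tilde{Y}_\alpha = \{f(\alpha)\}$.]
Figure 5.2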
(ii) Let $X = \{1,\ldots,n\}$ and let $A_i \subseteq Y_i$, $i = 1,\ldots,n$. The set $\prod_{i=1}^{n} A_i$ is called a rectangle or parallelepiped and it can be expressed in the form
$$\prod_{i=1}^{n} A_i = \bigcap_{i=1}^{n} \pi_i^*(A_i). \tag{5.1}$$
(See Figure 5.3 below.) The notion of a parallelepiped can also be extended when index set $X$ is arbitrary. Given $A_x \subseteq Y_x$, $x \in X$, the set $\prod_{x \in X} A_x$ is a parallelepiped with the alternative representation (5.1).
Figure 5.3
(iii) Now we introduce a more general notion of a projection map. Let $\{Y_x : x \in X\}$ be an arbitrary indexed family of sets and let $A \subseteq X$. Define
$$\Big[\prod_{x \in X} Y_x,\ \prod_{\alpha \in A} Y_\alpha,\ \pi_A\Big]$$
and call it the $A$-projection map if $\pi_A(f) = f_*(A)$. Specifically, if $A = \{\alpha\}$ we have $\pi_{\{\alpha\}}(f) = f_*(\{\alpha\})$ which, in contrast with definition (i), is a singleton. Let $\mathcal{A} \subseteq \prod_{\alpha \in A} Y_\alpha$. Then call $\pi_A^*(\mathcal{A})$ an $A$-cylinder with base $\mathcal{A}$. An $A$-cylinder is called a rectangular cylinder if $\mathcal{A}$ is a rectangle. If, in addition, $A$ is a finite set then the rectangular cylinder is called simple. A simple $A$-cylinder is called a unit cylinder if $A$ is a singleton. (See Figures 5.4-5.7.) □
5.4 Example. Let $A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$. Then,
$$\pi_{\{\alpha_1,\ldots,\alpha_n\}}(f) = f_*(\{\alpha_1,\ldots,\alpha_n\}) = \{f(\alpha_1),\ldots,f(\alpha_n)\}$$
and hence,
$$\pi^*_{\{\alpha_1,\ldots,\alpha_n\}}(\{f(\alpha_1),\ldots,f(\alpha_n)\}) = \bigcap_{i=1}^{n} \pi^*_{\alpha_i}(\{f(\alpha_i)\})$$
is a $\{\alpha_1,\ldots,\alpha_n\}$-simple cylinder with base $\{f(\alpha_1),\ldots,f(\alpha_n)\}$. □
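Projection maps and representation (5.1) can be verified on a small finite product. The sketch below assumes the index set $\{0,1,2\}$ and stores each point of the product as a tuple; the factor sets `Y`, the sets `A`, and the helper names `pi` and `pi_star` are illustrative assumptions.

```python
from itertools import product

Y = [{0, 1, 2}, {"a", "b"}, {"u", "v"}]   # factor spaces Y_0, Y_1, Y_2
Z = set(product(*Y))                       # the product; points are tuples f

def pi(a, f):
    """The a-th projection map: pi_a(f) = f(a)."""
    return f[a]

def pi_star(a, B):
    """Inverse image pi_a^*(B): all points whose a-th coordinate lies in B."""
    return {f for f in Z if f[a] in B}

# pi_a^*({f(a)}) contains f but is much larger than {f}.
f = (1, "a", "u")
fibre = pi_star(1, {pi(1, f)})

# Formula (5.1): a rectangle is the intersection of inverse images.
A = [{0, 1}, {"a"}, {"u", "v"}]
rectangle = set(product(*A))
intersection = set(Z)
for i, A_i in enumerate(A):
    intersection &= pi_star(i, A_i)

print(len(Z), len(fibre))
print(rectangle == intersection)
```

Here `fibre` fixes only one coordinate, so it is a unit cylinder in the sense of Definition 5.3 (iii), while `rectangle` constrains every coordinate at once.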
Figure 5.4
Figure 5.5
Figure 5.6 (the projection $\pi_A(f) = f_*(A)$)
Figure 5.7 ($A$-cylinder with base $\mathcal{A}$)
PROBLEMS
5.1 Let $Z = \prod_{x \in X} Y_x$, let $A_x \subseteq Y_x$ and let $A = \prod_{x \in X} A_x$, where $A_x = Y_x$ except for finitely many values of the index $x$, say $1,\ldots,n \in X$. Show that $A = \bigcap_{k=1}^{n} \pi_k^*(A_k)$.
5.2 Let $Z = \prod_{x=1}^{\infty} Y_x$, and let $A_{nx} \subseteq Y_x$, where for each $x = 1,2,\ldots$, the sequence of sets $\{A_{nx}\}$ is monotone nondecreasing (i.e. $A_{1x} \subseteq A_{2x} \subseteq \cdots$) with $\sup\{A_{nx} : n = 1,2,\ldots\} = \bigcup_{n=1}^{\infty} A_{nx} = Y_x$ for $x = 2,3,\ldots$. Also assume that $A_{11} = A_{21} = A_{31} = \cdots = A_1$. Show that
$$\sup\Big\{\prod_{x=1}^{\infty} A_{nx} : n = 1,2,\ldots\Big\} = \pi_1^*(A_1).$$
5.3 Let $Y_x = \mathbb{R}$, for all $x \in \mathbb{R}$, $A = (0,2)$.
a) Draw $\pi_A(f)$ for $f(x) = x^2$.
b) Draw $\pi_A^*(\mathcal{A})$ for $\mathcal{A} = (0,1) \times (0,1)$.
5.4 Let $\{Y_x ; x \in X\}$ and $\{Z_x ; x \in X\}$ be two families of sets. Show that
a) $\big(\prod_{x \in X} Y_x\big) \cap \big(\prod_{x \in X} Z_x\big) = \prod_{x \in X} (Y_x \cap Z_x)$.
b) $\big(\prod_{x \in X} Y_x\big) \cup \big(\prod_{x \in X} Z_x\big) \subseteq \prod_{x \in X} (Y_x \cup Z_x)$.
5.5 Let $m,n \in \mathbb{N}$ and $Y \neq \emptyset$.
a) For $m < n$, find an injective map $[Y^m, Y^n, f]$.
b) Find an injective map $[Y^n, Y^{\mathbb{R}}, f]$.
c) Find a bijective map $[Y^n \times Y^{\mathbb{R}}, Y^{\mathbb{R}}, f]$.
d) Find a bijective map $[Y^{\mathbb{R}} \times Y^{\mathbb{R}}, Y^{\mathbb{R}}, f]$.
e) For $A \subseteq X$, find an injective map $[Y^A, Y^X, f]$.
NEW TERMS:

Cartesian product of a sequence
Cartesian product of an indexed family of sets
projection map
coordinate
factor space
rectangle
parallelepiped
A-projection map
cylinder
rectangular cylinder
simple cylinder
unit cylinder
6. CARDINALITY

One of the main perplexities in the theory of sets is finding a criterion for comparing their "powers." We can overcome this difficulty when considering the class of "finite" sets. (We frequently operate with the term "finite," though we have not given a rigorous definition.) We can easily define an equivalence relation in this class, for example, by introducing $\mathcal{C}_n$ as the class of all n-element sets for every $n \in \mathbb{N}_0$. A partial order relation in this class would act as an appropriate comparison among sets from various classes. Sets A and B are said to be compared, in notation $A \preceq B$, when and only when $A \in \mathcal{C}_n$, $B \in \mathcal{C}_s$ and $n \le s$. Then we could assign to the set A the number n and call it the cardinal number of A. Doing this, however, we would experience real difficulties when introducing "countable" and "uncountable" sets. Specifically, we would fail to operate with cardinal numbers as numbers in the usual sense. (Pursuing this philosophy we readily encounter contradictions, the most frequent phenomenon in set theory.) The basic principles of the formalism of cardinality belong to Georg Cantor, who was the first to introduce a well-structured concept of "infinity" in his pioneering work done in the 1870s and 1880s. We will present a rather informal version of cardinality, sufficient for the analysis presented in this book. A curious reader is referred to special monographs on set theory. We will start with comparison ideas based on finite sets, ideas that enable us to deal with infinite sets as well.

6.1 Definitions.
(i) Two sets A and B are said to be equipotent if there is a bijective function $f: A \to B$. In this case we write $|A| = |B|$ (or $A \approx B$) and also say that A and B have equal cardinality.

(ii) If there exists a one-to-one function $f: A \to B$, then we say that the cardinality of A is less than or equal to the cardinality of B, in notation $|A| \le |B|$ or $A \preceq B$. If $|A| \le |B|$ and $|A| \ne |B|$ we shall write $|A| < |B|$ or $A \prec B$.

(iii) A cardinal number is an equivalence class containing all sets that are "$\approx$-comparable." [For some cardinal numbers we will be using the same notation as for regular numbers.]

(iv) Let 0 denote the cardinal number of the empty set $\emptyset$ (the only representative of this class). Note that 0 is not a number but the class containing $\emptyset$. Thus, $|\emptyset| = 0$.

(v) Similarly, the cardinal number n is the equivalence class containing the set $\{1, \ldots, n\}$. Therefore, a set A is finite if it is equipotent with some set of cardinal number n, such that the integer n is an element of $\mathbb{N}$, i.e., $|A| = |\{1, \ldots, n\}| = n$. A set that is not finite is called infinite. [One can easily show that $\mathbb{N}$ is infinite.]

(vi) A set A is said to be countable or denumerable if it is equipotent with $\mathbb{N}$, and in this case we write $|A| = \aleph_0$ (pronounced aleph nought). A set A is called at most countable if $|A| \le |\mathbb{N}|$ or $A \preceq \mathbb{N}$.

(vii) An infinite set that is not countable is called uncountable.

(viii) A set A is said to have the cardinality of continuum if it is equipotent with the set $\mathbb{R}$ of real numbers, and we write $|A| = \mathfrak{c}$. [We show below that $\aleph_0 < \mathfrak{c}$.]
(i) If sets A and B have only finitely many elements, then $A \approx B$ if and only if they have the same number of elements.

(ii) In contrast with finite sets, an infinite set can be equipotent with a proper subset of itself. Consider $A = \{1, 3, 5, \ldots\} \subset \mathbb{N}$ and define $f(n) = 2n - 1$, $n \in \mathbb{N}$. Then $f: \mathbb{N} \to A$ is bijective and $\mathbb{N} \approx A$.

(iii) $\mathbb{N}_0 \approx \mathbb{N}_0 \times \mathbb{N}_0$. Indeed, the function
$$f(k, n) = 2^k (2n + 1) - 1$$
is bijective from $\mathbb{N}_0 \times \mathbb{N}_0$ to $\mathbb{N}_0$. Similarly, $\mathbb{N} \approx \mathbb{N} \times \mathbb{N}$.

(iv) Let $\{A_1, A_2, \ldots\}$ be a countable family of countable sets. Then its union $A = \bigcup_{n=1}^{\infty} A_n$ is countable. To construct an appropriate bijective map we first represent A as a countable union of disjoint sets. Let
$$B_1 = A_1,\quad B_2 = A_2 \setminus A_1,\ \ldots,\ B_n = A_n \setminus \bigcup_{k=1}^{n-1} A_k \ \ (\text{for } n > 1),\ \ldots$$
Then, clearly, $A = \sum_{n=1}^{\infty} B_n$ (a union of disjoint sets). Without loss of generality, we assume that each set $B_n$ is countable (in general, any of these sets may be only at most countable) and, therefore, can be enumerated as $B_n = \{b_{n1}, b_{n2}, \ldots\}$, $n = 1, 2, \ldots$. We can place these sets in the form of a matrix:
$$\begin{matrix} b_{11} & b_{12} & b_{13} & \cdots \\ b_{21} & b_{22} & b_{23} & \cdots \\ b_{31} & b_{32} & b_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{matrix}$$
Now the desired bijective map from $\mathbb{N}$ to A is $f(1) = b_{11}$, $f(2) = b_{12}$, $f(3) = b_{21}$, $f(4) = b_{31}$, $f(5) = b_{22}$, $f(6) = b_{13}, \ldots$.

(v) The set $\mathbb{Q}$ of rational numbers is countable, for the function $f(m/n) = (m, n)$ is one-to-one from $\mathbb{Q}$ to $(\mathbb{N} \times \mathbb{N}) \cup \{(0,0)\}$. The latter is countable by (iii).

(vi) We can show that $\aleph_0 < \mathfrak{c}$. Clearly, $\aleph_0 \le \mathfrak{c}$. Then it is sufficient to show that $\mathbb{N} \prec [0,1]$, since $[0,1] \approx \mathbb{R}$ (see Problem 6.5). If a bijective function $f: \mathbb{N} \to [0,1]$ exists, then $f(n)$ is of the type $0.a_{n1}a_{n2}\ldots$. Now define the number $0.b_1b_2\ldots$ such that $b_i = 3$ if $a_{ii} \ne 3$ and $b_i = 5$ if $a_{ii} = 3$, $i = 1, 2, \ldots$. Then the number $b := 0.b_1b_2\ldots$ cannot appear among the values of $f(n)$, for it differs from $f(n)$ at the nth place. On the other hand, $b \in [0,1]$, which contradicts the assumption that f is onto. Thus $\aleph_0 < \mathfrak{c}$. Observe that a rational number with a terminating decimal expansion has two representations, e.g. 0.1 and 0.0999.... That means we have to be careful about distinguishing the numbers above. □

The following theorem is one of the central results in set theory.
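Before turning to the theorem, the bijection $f(k, n) = 2^k(2n + 1) - 1$ of Example (iii) can be checked on a finite range. The Python sketch below is an illustration only: the function names `pair` and `unpair` are hypothetical, and a finite check of this kind supports the claim without proving it.

```python
def pair(k: int, n: int) -> int:
    """f(k, n) = 2^k (2n + 1) - 1 on nonnegative integers."""
    return 2**k * (2 * n + 1) - 1

def unpair(m: int) -> tuple:
    """Inverse of pair: factor m + 1 as 2^k times an odd number 2n + 1."""
    m += 1
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return k, (m - 1) // 2

# Each m in a finite range has its own preimage, and pair inverts unpair.
values = [unpair(m) for m in range(1000)]
all_distinct = len(set(values)) == 1000
round_trip = all(pair(k, n) == m for m, (k, n) in enumerate(values))

print(all_distinct, round_trip)
```

The inverse rests on the fact that every positive integer $m + 1$ factors uniquely as a power of 2 times an odd number, which is the reason the map is one-to-one and onto.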
6.4 Theorem (Cantor). $A \prec \mathcal{P}(A)$ for every set A.

Proof. The result holds trivially for any finite set A (see Problem 1.4). Specifically, for the empty set, $|\emptyset| = 0$, while $|\mathcal{P}(\emptyset)| = 1$. Since $\mathcal{P}(A)$ contains all singletons, it immediately follows that $A \preceq \mathcal{P}(A)$. To show that $|A| \ne |\mathcal{P}(A)|$, we assume that $A \approx \mathcal{P}(A)$ and derive a contradiction. By our assumption, there exists a bijective map $f: A \to \mathcal{P}(A)$. Each element a of A is thus assigned a subset $f(a)$ of A, and a may belong to $f(a)$ or may not. We then define $B = \{a \in A : a \notin f(a)\}$. B is nonempty, since there exists at least one element $a_0 \in A$ assigned to $\emptyset$. We pick a point $b \in A$ such that $f(b) = B$. By definition of B, $b \in B \iff b \notin f(b) = B$, and this is a contradiction. □
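The diagonal set $B = \{a \in A : a \notin f(a)\}$ used in the proof can be examined exhaustively on a small finite set. The Python sketch below is an illustration under assumptions: the set `A = {1, 2, 3}` and the helper names `power_set` and `diagonal_set` are hypothetical. It enumerates every map $f: A \to \mathcal{P}(A)$ and counts those for which B is not a value of f.

```python
from itertools import combinations, product

def power_set(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(A, f):
    """B = {a in A : a not in f(a)} for a map f: A -> P(A) given as a dict."""
    return frozenset(a for a in A if a not in f[a])

A = {1, 2, 3}
subsets = power_set(A)
elements = sorted(A)

# Enumerate every map f: A -> P(A) and count those that miss the diagonal set B.
missed = 0
for images in product(subsets, repeat=len(A)):
    f = dict(zip(elements, images))
    if diagonal_set(A, f) not in f.values():
        missed += 1

total = len(subsets) ** len(A)
print(missed == total)
```

Every one of the maps misses its own diagonal set, so none of them is onto, in line with the argument of the proof.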
6.5 Remarks.

(i) In Remark 5.2 (iii) we showed that the power set $\mathcal{P}(X)$ of a set X is equipotent with the set $\{0,1\}^X$ of all functions $f: X \to \{0,1\}$. Note that 2 is the cardinal number of the set $\{0,1\}$. Thus, we conclude that $|\mathcal{P}(X)| = 2^{|X|}$ (where we set $|B|^{|A|} = |B^A|$). In particular, if $|X| = |\mathbb{N}|$ then $|\mathcal{P}(\mathbb{N})| = 2^{\aleph_0}$. An interesting fact is that $2^{\aleph_0} = \mathfrak{c}$, the proof of which is left for the reader as an exercise (see Problem 6.6).

(ii) The continuum hypothesis states that if $\mathfrak{n}$ is an infinite cardinal, then there is no cardinal $\mathfrak{m}$ such that $\mathfrak{n} < \mathfrak{m} < 2^{\mathfrak{n}}$. This was conjectured by Cantor for $\mathfrak{n} = \aleph_0$. In 1900 David Hilbert included the "continuum problem" as Problem #1 in his famous list of open problems in mathematics. In 1940 Kurt Gödel proved that the continuum hypothesis is consistent with (i.e. does not contradict) the axioms of set theory (axiom of existence, axiom of choice, etc.). In 1963 Paul Cohen [1966] showed that the continuum hypothesis is independent of the axioms.

(iii) The cardinal number $2^{\mathfrak{c}}$ is called the hypercontinuum. For example, the set $\mathcal{P}(\mathbb{R})$ has the hypercontinuum cardinal. □

Supplementary Historical Note. Modern set theory was founded by Georg Cantor in a sequence of several articles that appeared between 1870 and 1880. One of these articles, Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen, appeared in Crelle's Journal in 1874, and is said to have given birth to set theory. Georg Cantor was born of Danish parents (both of Jewish descent) in St. Petersburg, Russia, in 1845, and lived there until 1856, when his parents moved to Frankfurt, Germany. Cantor began his university studies at Zürich in 1862. After one semester at Zürich he moved to Berlin University, where he attended lectures of Weierstrass, Kummer and Kronecker. Leopold Kronecker later became Cantor's main opponent, criticizing his concept of infinity and regarding it as theology and not as mathematics. (Cantor, whose mother was a Catholic and father a Protestant, was a devoted Protestant and an active theologian. The latter became a major target of attacks by Cantor's liberal opponents at Berlin University.) In 1867 Cantor received his Ph.D. (in number theory) from Berlin University. His dream of getting a teaching position at Berlin University never came true, primarily due to the opposition of Kronecker. In 1869 Cantor was appointed at Halle University, where he remained until his retirement in 1913. Cantor died in a mental hospital in Halle in 1918. In 1925 David Hilbert recognized Cantor's concept of infinity. He said, "No one can drive us from the paradise that Cantor created for us." □
PROBLEMS

6.1 Show the validity of the statement in Remark 6.2.

6.2 Prove the Schröder-Bernstein Theorem: If $A \preceq B$ and $B \preceq A$, then $A \approx B$.

6.3 We call an algebraic number any root of a polynomial with integer coefficients. What is the cardinal number of the set of all algebraic numbers? [Hint: Use Problem 6.2.]

6.4 Prove that every subset of a countable set is at most countable. [Hint: Use the well-ordering principle.]

6.5 Show that $\mathbb{R} \approx [0,1]$. [Hint: Show that $[0,1] \approx (0,1)$.]

6.6 Show that $2^{\aleph_0} = \mathfrak{c}$.

6.7 Let $[X, Y, f]$ be a surjective map. Show that there is a subset of X equipotent with Y.

6.8 Let $[X, Y, f]$ be an injective map, where Y is countable, and let $f^{-1}(y)$ be a countable set for each $y \in Y$. Must X be countable?

6.9 Let A be an uncountable set and let $B \subseteq A$ be countable. Show that $A \setminus B$ is uncountable.

6.10 Prove the statement: Every infinite set contains a countable subset.

6.11 What is the cardinal number of the set of all polynomials whose coefficients are algebraic numbers?

6.12 Show that the set of all finite subsets of $\mathbb{N}$ is countable.
NEW TERMS:

cardinal number
equipotent sets
finite set
countable (denumerable) set
$\aleph_0$
at most countable set
uncountable set
continuum
Cantor's Theorem
continuum hypothesis
hypercontinuum
Schröder-Bernstein Theorem
algebraic number
7. BASIC ALGEBRAIC STRUCTURES

Algebra is a mathematical discipline that studies algebraic structures. The most rudimentary algebraic operations with natural and positive rational numbers were already encountered in ancient mathematical texts. The famous book "Arithmetics," by the Greek Diophantos (of Alexandria) in the third century A.D., had a significant influence on the development of algebraic formalism. The term "algebra" stems from the text Al-jabr wa 'l-mukhabala (by Muhammad al-Khowarismi in the ninth century A.D.), which dealt with solution techniques for various problems reducing to first and second order algebraic equations. Until the end of the fifteenth century, when the common algebraic operations $+$, $-$, $\times$, power, roots and parentheses were introduced, one used cumbersome phrases and descriptions of algebraic expressions. François Viète, by the end of the sixteenth century, was the first to use letters to denote unknowns and parameters. The algebraic symbolism, as we know it now, has been used only since the middle of the seventeenth century. The Elementary Algebra (which deals with basic arithmetic operations on real numbers, first to fourth order algebraic equations, the binomial formula, and Diophantine equations) was completed by the middle of the eighteenth century. Leonhard Euler's Introduction to Algebra was one of the most prominent texts then. In the early nineteenth century algebra became furnished with five basic (commutative, associative, and distributive) laws with respect to two algebraic operations, $+$ (addition) and $\cdot$ (multiplication). On the strength of Dirichlet's definition of a function, these operations were later declared to be binary operations based on the following definition. An operation on a set A is a rule that assigns to each ordered subset $A_n \subseteq A$ of n elements a uniquely defined element of the same set A. For n = 1, 2, and 3, the operation is called unary, binary, and ternary, respectively.

The algebraic structures were formalized by the Britons George Peacock in 1830, Duncan Gregory in 1840, and Augustus De Morgan, and further refined by the Germans Hermann Hankel and Hermann Grassmann. Abstract algebra is regarded as having been born in 1846, when Joseph Liouville published Galois' theory (of solvability of polynomial equations) based on the group concept, which has spread within mathematics ever since. In 1872, the German Felix Klein published a program in which he proposed to formulate all of geometry as the study of invariants under groups of transformations. In 1883, the Norwegian mathematician Marius Sophus Lie published his fundamental work on continuous groups of transformations used in studies of continuous functions. Group theory, which is at the heart of contemporary abstract algebra, made prominent contributions to geometry, topology, and even physics in the 20th century.

In this section, we review some familiar algebraic structures. These will provide a basis for analysis, shifting it to more abstract settings in the upcoming chapters.

7.1 Definitions.
(i) A set $\mathcal{Y}$ with a binary algebraic operation $*$ (frequently called addition $+$ or multiplication $\cdot$) from $\mathcal{Y} \times \mathcal{Y}$ into $\mathcal{Y}$ is called a semigroup, in notation $(\mathcal{Y}, *)$, if $*$ is associative. [Note that even though $+$ or $\cdot$ may denote addition and multiplication, they need not mean the conventional algebraic operations known for numbers.]

(ii) A semigroup $(\mathcal{Y}, *)$ is called a monoid if there is an element $I \in \mathcal{Y}$ (called a two-sided identity) such that for all $x \in \mathcal{Y}$, $x * I = I * x = x$.

(iii) A monoid $(\mathcal{Y}, *)$ is called a group if for each $x \in \mathcal{Y}$ there is a $*$-inverse $x'$ such that $x * x' = x' * x = I$. If $*$ is commutative, then $(\mathcal{Y}, *)$ (semigroup, monoid, or group) is called commutative or Abelian. If we use for $*$ the symbol $+$ or $\cdot$, then $(\mathcal{Y}, +)$ is referred to as additive or $(\mathcal{Y}, \cdot)$ as multiplicative, respectively.

If $(\mathcal{Y}, +)$ is additive, the element I, denoted by $\theta$, is called zero, and the element $x'$, denoted by $-x$, is said to be an additive inverse of x. If $(\mathcal{Y}, \cdot)$ is multiplicative, the element I is called the unity and denoted by 1. The element $x'$ is denoted by $x^{-1}$ and is said to be a multiplicative inverse of x.

(iv) A set $\mathcal{R}$ with addition $+$ and multiplication $\cdot$ from $\mathcal{R} \times \mathcal{R}$ into $\mathcal{R}$, i.e. a triple $(\mathcal{R}, +, \cdot)$, is called a ring if:
a) $(\mathcal{R}, +)$ is an Abelian group;
b) $\cdot$ is associative;
c) for all $a, b, x \in \mathcal{R}$,
$x \cdot (a + b) = x \cdot a + x \cdot b$ (called the left distributive law),
$(a + b) \cdot x = a \cdot x + b \cdot x$ (called the right distributive law).
Observe that multiplication $\cdot$ need not be commutative in a ring. However, if this is the case, the ring is called commutative. A ring need not have a unity either; consequently, a ring equipped with a unity is called a ring with unity. [For instance, the set of all $n \times n$ matrices is a noncommutative ring with unity (the unit matrix).]

(v) Let $(\mathcal{Y}, *)$ and $(\mathcal{Z}, \star)$ be two groups and let $[\mathcal{Y}, \mathcal{Z}, f]$ be a map preserving the algebraic operations $*$ and $\star$, i.e. such that
$$f(x * y) = f(x) \star f(y).$$
Then f is called a (group) homomorphism of $\mathcal{Y}$ into $\mathcal{Z}$. If $[\mathcal{Y}, \mathcal{Z}, f]$ is bijective then it is called an isomorphism. In this case, the groups $(\mathcal{Y}, *)$ and $(\mathcal{Z}, \star)$ are called isomorphic. If $[\mathcal{Y}, \mathcal{Z}, f]$ is a homomorphism, and in addition $(\mathcal{Y}, *) = (\mathcal{Z}, \star)$, then $[\mathcal{Y}, \mathcal{Z}, f]$ is said to be an endomorphism. If $[\mathcal{Y}, \mathcal{Z}, f]$ is an endomorphism and an isomorphism then it is called an automorphism. □

A homomorphism preserves some (but not all) structural properties of groups, as the following theorem states.
7.2 Theorem. Let $[\mathcal{Y}, \mathcal{Z}, f]$ be a homomorphism. Then (i) for each $x \in \mathcal{Y}$, $f(x') = [f(x)]'$, and (ii) $f(I) = I$. (See Problem 7.1.)

7.3 Definition. Let $[\mathcal{Y}, \mathcal{Z}, f]$ be a homomorphism. Define
$$\operatorname{Ker} f = f^*(\{I\})$$
and call it the kernel of f. □
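The homomorphism property, the two claims of Theorem 7.2, and the kernel can be checked by brute force on a small finite example. The Python sketch below is an illustration only; the groups $\mathbb{Z}_{12}$ and $\mathbb{Z}_4$ under addition and the map $f(x) = x \bmod 4$ are assumptions chosen for the sketch and do not appear in the text.

```python
# Hypothetical example (not from the text): the additive groups Z_12 and Z_4
# with the map f(x) = x mod 4.
G = range(12)

def add_G(x, y):
    return (x + y) % 12

def add_H(x, y):
    return (x + y) % 4

def inv_G(x):
    return (-x) % 12

def inv_H(x):
    return (-x) % 4

def f(x):
    return x % 4

# Homomorphism property: f(x * y) equals f(x) combined with f(y) in the target group.
is_hom = all(f(add_G(x, y)) == add_H(f(x), f(y)) for x in G for y in G)

# Theorem 7.2: f maps inverses to inverses and the identity to the identity.
preserves_inverses = all(f(inv_G(x)) == inv_H(f(x)) for x in G)
preserves_identity = f(0) == 0

# Definition 7.3: the kernel is the preimage of the identity of the target group.
kernel = [x for x in G if f(x) == 0]

print(is_hom, preserves_inverses, preserves_identity, kernel)
```

In this example the kernel consists of the multiples of 4 in $\mathbb{Z}_{12}$, so f is a homomorphism but not an isomorphism.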
7.4 Examples.

(i) The space of all continuous functions with operation $+$ forms an Abelian group. The same space is not a group with the operation of multiplication.

(ii) All polynomials with operation $+$ form an Abelian group.

(iii) $(\mathbb{Z}, +)$, $(\mathbb{R}, +)$ and $((0,\infty), \cdot)$ are Abelian groups; $(\mathbb{Z}, \cdot)$ is an Abelian monoid.

(iv) The space $\mathbb{C} \setminus \{(0,0)\}$ with the operation "complex multiplication" is obviously an Abelian group and $(\mathbb{C}, +, \cdot)$ is a ring.

(v) Let $\mathcal{C}^{n}_{[a,b]} = \mathcal{C}^{(n)}([a,b];\mathbb{R})$ denote the space of all n times continuously differentiable real-valued functions on $[a,b] \subset \mathbb{R}$. Then $(\mathcal{C}^{n}_{[a,b]}, +)$ is a commutative group. If $D^n f$ denotes the nth derivative of a function f, then $[\mathcal{C}^{n}_{[a,b]}, \mathcal{C}^{0}_{[a,b]}, D^n]$ is a homomorphism of $(\mathcal{C}^{n}_{[a,b]}, +)$ into $(\mathcal{C}^{0}_{[a,b]}, +)$. Replacing $\mathcal{C}^{n}_{[a,b]}$ by the space $\mathcal{P}$ of all polynomials on $[a,b]$, we have $[\mathcal{P}, \mathcal{P}, D^n]$ as an endomorphism.

(vi) Consider two groups $(\mathbb{R}, +)$ and $((0,\infty), \cdot)$ and the function $f(x) = e^x$. Then $[\mathbb{R}, \mathbb{R}_+, f]$ is an isomorphism. Indeed, $f(x + y) = f(x) f(y)$. In addition, $[\mathbb{R}, \mathbb{R}_+, f]$ is bijective.

(vii) Let $\mathcal{F} = \mathcal{F}(X;Y) = Y^X$ be the space of all functions from X into Y. Then $(\mathcal{F}, \cdot)$ is a multiplicative monoid. For any nonnegative integer n and $f \in \mathcal{F}$, define the unary operation power $f^n$ on $\mathcal{F}$ as: $f^0 = 1$, $f^{n+1} = f \cdot f^n$. The power has the properties $f^i f^k = f^{i+k}$ and $(f^i)^k = f^{ik}$. Note that the power can be defined on an arbitrary multiplicative monoid with the above properties.

(viii) A function T from $\bar{\mathbb{C}}$ onto $\bar{\mathbb{C}}$ (where $\bar{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$) is called a bilinear transformation if
$$T(z) = \frac{az + b}{cz + d} \quad \text{with } ad - bc \ne 0.$$
Let $\mathcal{T}$ denote the set of all bilinear transformations. Then $(\mathcal{T}, \circ)$, with $\circ$ the composition of transformations, is a group (see Problem 7.13).

(ix) Let $\mathcal{F} = \{f_t ; t \ge 0\}$ be an indexed family of functions and let $*$ be some binary operation defined on $\mathcal{F}$. $(\mathcal{F}, *)$ is called a semigroup (of functions) if $f_0 = 1$ and for all $s, t \ge 0$, $f_s * f_t = f_{s+t}$. Obviously, the semigroup $(\mathcal{F}, *)$ is a commutative monoid.

(x) Let $\ell^p$ ($\subset \mathbb{R}^{\mathbb{N}}$) be the space of all sequences such that for each $x = (x_0, x_1, \ldots) \in \ell^p$, $\sum_{n=0}^{\infty} |x_n|^p < \infty$, where $p \in [1, \infty)$. Define the following operation on $\ell^p$. For x and y, let $z = (z_0, z_1, \ldots) = x * y$ be such that
$$z_n = \sum_{k=0}^{n} x_k y_{n-k}$$
(called discrete convolution). The operation $*$ is commutative and associative and it is closed in $\ell^p$ (see Problem 7.11). Obviously, $1 = (1, 0, 0, \ldots)$ is the unity of $(\ell^p, *)$ and thus $(\ell^p, *)$ is an Abelian monoid. Let $x = (x_0, x_1, \ldots) \in \ell^p$ be such that $x_0 \ne 0$. Define $y = (y_0, y_1, \ldots)$ such that $y_0 = \frac{1}{x_0}$. For $n \ge 1$, $y_n$ can be determined recursively from the equations $\sum_{k=0}^{n} x_k y_{n-k} = 0$. For instance,
$$y_1 = -\frac{x_1}{x_0^2}, \qquad y_2 = \frac{x_1^2}{x_0^3} - \frac{x_2}{x_0^2}.$$
In conclusion, for each x with $x_0 \ne 0$, there is a unique element $y = x^{-1}$. On the other hand, if $\ell^p_0$ denotes the subset of all elements $x \in \ell^p$ with $x_0 = 0$, then $\ell^p_0$ and its complement $\ell^p \setminus \ell^p_0$ relative to $\ell^p$ are two equivalence classes induced by $*$. This implies that $(\ell^p \setminus \ell^p_0, *)$ is a commutative group. Obviously, the triple $(\ell^p, +, *)$ is a commutative ring with unity.
Now, let $\mathfrak{A}$ be the space of all complex-valued functions analytic at zero and not equal to zero at the origin. This space is closed with respect to multiplication. Hence, $(\mathfrak{A}, \cdot)$ is an Abelian group. Indeed, $u \equiv 1$ is the unity and for each $x \in \mathfrak{A}$, $\frac{1}{x}$ is analytic at zero and it is a two-sided inverse of x. Obviously, each $x \in \mathfrak{A}$ can be expanded in a Taylor series at zero, such that x is uniquely associated with the sequence
$$\tilde{x} = \Big\{ x_n = \frac{1}{n!} x^{(n)}(0);\ n = 0, 1, \ldots \Big\}.$$
If F is defined as $F(\tilde{x}) = x$ and $F(1) = u$, then $[\ell^p \setminus \ell^p_0, \mathfrak{A}, F]$ is a group homomorphism such that
$$F(\tilde{x} * \tilde{y}) = F(\tilde{x}) F(\tilde{y}).$$
Notice that $F^{-1}(x) = \tilde{x}$ need not be an element of $\ell^p \setminus \ell^p_0$, for $\sum_{n=0}^{\infty} |x_n|^p$ may be a divergent series.

(xi) Let $L^p$ ($p \ge 1$) denote the class of all real-valued functions $\{[\mathbb{R}, \mathbb{R}, f]\}$ such that $\int_{-\infty}^{\infty} |f|^p < \infty$. Define on $L^p$ the operation $*$ as follows:
$$x * y(u) = \int_{-\infty}^{\infty} x(u - v) y(v)\,dv.$$
The operation $*$ is closed in $L^p$ and it is commutative and associative (see Problem 7.12). Define the function
$$f(u, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big\{-\frac{u^2}{2\sigma^2}\Big\}, \quad \text{for } \sigma > 0 \text{ and } u \in \mathbb{R}.$$
This function is the well-known probability density function of a normal random variable with mean 0 and variance $\sigma^2$. Consequently,
$$\int_{-\infty}^{\infty} f(u, \sigma)\,du = 1.$$
From the theory of probability, it is also known that the lion's share of the integral under the curve f (over 99%) is concentrated over the interval $(-3\sigma, 3\sigma)$. The function f attains its maximum value at 0, approximately equal to $0.399/\sigma$. Now, if we let $\sigma \to 0+$, the resulting function is called the (Dirac) delta function, in notation, $\delta$. It is readily seen that the delta equals 0 on $\mathbb{R} \setminus \{0\}$ and $\infty$ at 0, and that $\int_{-\infty}^{\infty} \delta = 1$. There is an alternative integral representation of the delta function. Recall that the Fourier transform of f is
$$\hat{f}(\theta, \sigma) = \int_{-\infty}^{\infty} e^{i\theta u} f(u, \sigma)\,du = \exp\Big\{-\frac{(\sigma\theta)^2}{2}\Big\}$$
and that f can be restored by applying the inverse Fourier transform to its image as follows:
$$f(u, \sigma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{-i\theta u\} \exp\Big\{-\frac{(\sigma\theta)^2}{2}\Big\}\,d\theta.$$
Again, letting $\sigma \to 0$, we arrive at
$$\delta(u) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\{-i\theta u\}\,d\theta.$$
By using this integral representation it will be easy to show that $\delta$ is the unity of the $*$ operation:
$$x * \delta(u) = \int_{v \in \mathbb{R}} x(u - v)\delta(v)\,dv = \int_{t \in \mathbb{R}} x(t)\delta(u - t)\,dt = \int_{t \in \mathbb{R}} x(t)\,\frac{1}{2\pi} \int_{\theta \in \mathbb{R}} e^{-i\theta(u - t)}\,d\theta\,dt = \frac{1}{2\pi} \int_{\theta \in \mathbb{R}} e^{-i\theta u} \Big( \int_{t \in \mathbb{R}} x(t) e^{i\theta t}\,dt \Big) d\theta.$$
Since the expression in parentheses is $\hat{x}(\theta)$, the Fourier transform of x, the rest is the inverse Fourier operator, which should restore x at u. So, $x * \delta = x$. According to Problem 7.1, $\delta$ is the unique unity of the operation $*$. Since $f \ge 0$ and because
$$\int_{-\infty}^{\infty} |f(u, \sigma)|^p\,du = \int_{-\infty}^{\infty} f(u, \sigma)^p\,du = \frac{1}{\sqrt{p}\,(\sqrt{2\pi}\,\sigma)^{p-1}} \int_{-\infty}^{\infty} \frac{\sqrt{p}}{\sqrt{2\pi}\,\sigma} \exp\Big\{-\frac{p u^2}{2\sigma^2}\Big\}\,du = \frac{1}{\sqrt{p}\,(\sqrt{2\pi}\,\sigma)^{p-1}} < \infty,$$
$\delta$ is an element of $L^p$. This all implies that $(L^p, *)$ is a commutative monoid and, therefore, $(L^p, +, *)$ is a commutative ring with unity.

(xii) As an application of the last example, consider the discrete indexed family of functions $\{f^{n*} ; n = 0, 1, \ldots\}$ defined as follows:
$$f^{0*} = \delta, \quad f^{1*} = f, \quad f^{(n+1)*} = f^{n*} * f.$$
Then $f^{(n+k)*} = f^{n*} * f^{k*}$, and therefore $(\{f^{n*} ; n = 0, 1, \ldots\}, *)$ is a semigroup.
(xiii) Let $\mathcal{B}^* = \mathcal{B}^*(\mathbb{R};\mathbb{R})$ denote the space of all bounded real-valued functions. For a function $A \in \mathcal{B}^*$, define
$$u \mapsto e^{A}(u) = \sum_{k=0}^{\infty} \frac{[A(u)]^k}{k!}$$
in agreement with Example (vii). Obviously, for each u, the above series converges absolutely, since there is a positive constant M such that
$$\sum_{k=0}^{\infty} \frac{|A(u)|^k}{k!} \le \sum_{k=0}^{\infty} \frac{M^k}{k!} = e^M,$$
so that $e^A$ is again an element of $\mathcal{B}^*$. For a fixed A, define the family of functions $f_t = e^{tA}$, $t \ge 0$. From the above definition of $e^A$ it follows that $f_0 = 1$. It is easy to show that $e^{sA} e^{tA} = e^{(s+t)A}$. Indeed,
$$\sum_{i=0}^{n} \frac{s^i A^i}{i!} \sum_{k=0}^{n} \frac{t^k A^k}{k!} = \sum_{r=0}^{2n} A^r \sum_{\{0 \le i, k \le n:\ i + k = r\}} \frac{s^i t^k}{i!\,k!},$$
and, since the series converge absolutely, letting $n \to \infty$ turns the right-hand side into
$$\sum_{r=0}^{\infty} \frac{A^r}{r!} \sum_{i=0}^{r} \frac{r!}{(r - i)!\,i!}\, s^i t^{r-i} = \sum_{r=0}^{\infty} \frac{(s + t)^r A^r}{r!} = e^{(s+t)A}.$$
Consequently, $(e^{tA}, \cdot)$ is a semigroup as defined in (ix). This example can be generalized for operators, for instance, square matrices. To discuss such cases rigorously, one would require the concept of the "norm" of operators treated in upcoming chapters. □

7.5 Definitions.
(i) Let $\mathbb{F}$ be a nonempty set with two binary operations, addition ($\alpha + \beta$) and multiplication ($\alpha\beta$) [in many instances, especially for the elements of $\mathbb{F}$, we will drop the conventional multiplication symbol $\cdot$]. $(\mathbb{F}, +, \cdot)$ is called a field if it is a commutative ring with unity and if for every $\alpha \ne 0$ there is a multiplicative inverse $\alpha^{-1}$. In other words, $\mathbb{F}$ is a field if for all $\alpha, \beta, \gamma \in \mathbb{F}$,
1) (commutative law) $\alpha + \beta = \beta + \alpha$, $\alpha\beta = \beta\alpha$
2) (associative law) $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$, $(\alpha\beta)\gamma = \alpha(\beta\gamma)$
3) (zero) there is an element $0 \in \mathbb{F}$ such that $\alpha + 0 = \alpha$
4) (additive inverse) there is an element $-\alpha \in \mathbb{F}$ such that $\alpha + (-\alpha) = 0$
5) (distributive law) $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
6) (unity) there is an element $1 \in \mathbb{F}$ such that $1\alpha = \alpha$
7) (multiplicative inverse) for every $\alpha \ne 0$, there is $\alpha^{-1} \in \mathbb{F}$ such that $\alpha\alpha^{-1} = 1$.
The elements of a field are called scalars.

(ii) Let $\mathbb{F}$ be as above with the exception that $\mathbb{F}$ does not have additive inverses. Then $\mathbb{F}$ is called a semifield. We will denote a semifield by $\mathbb{F}_+$. [The set of all rational numbers, $\mathbb{Q}$, the set of real numbers, $\mathbb{R}$, and the set of all complex numbers, $\mathbb{C}$, are typical examples of fields. The set of all nonnegative rational or real numbers and the set of complex numbers $z \in \mathbb{C}$ with $\operatorname{Re}(z) \ge 0$ are examples of semifields.]

(iii) A linear or vector space X over a field $\mathbb{F}$ is a nonempty set with the binary operations addition ($+$) on $X \times X$ into X and multiplication ($\cdot$) on $\mathbb{F} \times X$ into X such that
1) $+$ is commutative and associative;
2) there exists an element $\theta \in X$ (called an origin of X) such that $0 \cdot x = \theta$, $\forall x \in X$;
3) $1 \cdot x = x$, $\forall x \in X$;
4) $\alpha(x + y) = \alpha x + \alpha y$, $(\alpha + \beta)x = \alpha x + \beta x$, $\forall \alpha, \beta \in \mathbb{F}$, $\forall x, y \in X$;
5) $\alpha(\beta x) = (\alpha\beta)x$, $\forall \alpha, \beta \in \mathbb{F}$, $\forall x \in X$.

(iv) Elements of X are frequently called vectors. If $\mathbb{F} = \mathbb{R}$ then X is called a real linear space. If $\mathbb{F} = \mathbb{C}$ then X is called a complex linear space. If in (iii) a semifield $\mathbb{F}_+$ is taken, then we call X a semi-linear space.

(v) Any subset of a linear space, which itself is a linear space, is referred to as a subspace.

(vi) A ring $(A, +, \cdot)$ is called an algebra over a field $\mathbb{F}$ if its additive (Abelian) group $(A, +)$ is a linear space over $\mathbb{F}$. An algebra over a field $\mathbb{F}$ will be denoted by $(A;\mathbb{F})$. If $(A;\mathbb{F})$ is an algebra, a pair $(A';\mathbb{F}')$ is called a subalgebra (of $(A;\mathbb{F})$) if $A' \subseteq A$, $\mathbb{F}' \subseteq \mathbb{F}$, and $(A';\mathbb{F}')$ is also an algebra. The above characteristics of commutative rings and rings with unities are hereditary for algebras.

(vii) A partially ordered linear space, which is also a lattice, is called a vector lattice. □
7.6 Properties of Linear Spaces.

(i) By Definition 7.5 (iii), 2) and 3), we have $\theta + x = 0 \cdot x + 1 \cdot x = (0 + 1) \cdot x = x$. Therefore, the origin $\theta$ is zero and, by Problem 7.1, it is unique.

(ii) For every $x \in X$, there exists $-x$ such that $x + (-x) = \theta$. Indeed, by Definition 7.5 (iii), 2) and 4), we have
$$\theta = 0 \cdot x = [1 + (-1)] \cdot x = x + (-1) \cdot x.$$
We call $(-1)x$ the additive inverse of x and denote it by $-x$. Properties (i) and (ii) imply that $(X, +)$ is an Abelian group.

(iii) $\forall \alpha \in \mathbb{F}$, $\alpha\theta = \alpha(0 \cdot x) = (\alpha 0) \cdot x = 0 \cdot x = \theta$. □
7.7 Notation. Let X be a vector lattice over a field $\mathbb{F}$. Then for all $x \in X$,
$|x| := x \vee (-x)$ ($\in X$),
$x^+ := x \vee \theta$ ($\in X$),
$x^- := (-x) \vee \theta$ ($\in X$). □
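For functions with pointwise order, the three lattice operations of Notation 7.7 reduce to pointwise maxima. The Python sketch below is an illustration under assumptions: a real-valued function on a hypothetical four-point domain is stored as a tuple of its values, and the helper names `sup`, `neg`, `add`, and `sub` are not from the text.

```python
# Hypothetical finite domain X = {0, 1, 2, 3}; a function x: X -> R is a tuple of values.
def sup(x, y):
    """Pointwise supremum x v y."""
    return tuple(max(a, b) for a, b in zip(x, y))

def neg(x):
    """Additive inverse -x."""
    return tuple(-a for a in x)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

theta = (0, 0, 0, 0)          # the origin (zero function)
x = (3, -2, 0, -7)

abs_x = sup(x, neg(x))        # |x| = x v (-x)
x_plus = sup(x, theta)        # x+ = x v theta
x_minus = sup(neg(x), theta)  # x- = (-x) v theta

print(abs_x, x_plus, x_minus)
print(sub(x_plus, x_minus) == x, add(x_plus, x_minus) == abs_x)
```

For this sample function the decompositions $x = x^+ - x^-$ and $|x| = x^+ + x^-$ hold pointwise; a numerical check of this kind is not a substitute for the general argument.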
7.8 Examples.

(i) $\{\theta\}$ is a subspace, since by Property 7.6 (iii), $\alpha \cdot \theta = \theta$.

(ii) Any field is a linear space over itself.

(iii) $\mathbb{R}^n$ is a real linear space with $\theta = (0, \ldots, 0)$ over $\mathbb{R}$.

(iv) The $\ell^1$ space of all real sequences over the field $\mathbb{R}$ whose series are absolutely convergent is a linear space.

(v) The $\ell^p$ space over the field $\mathbb{C}$ of all sequences such that for each $x = (x_1, x_2, \ldots) \in \ell^p$, $\sum_{n=1}^{\infty} |x_n|^p < \infty$, where $p \in [1, \infty)$, is a linear space. (See Problems 7.9 and 7.10.)

(vi) The space $\mathcal{C}_{[a,b]}$ of all continuous functions on $[a,b]$ is a real linear space.

(vii) The space $\mathcal{C}^{n}_{[a,b]}$ of all n-times differentiable functions on $[a,b]$ is a real linear space.

(viii) The space $\mathcal{C}^{(\infty)}$ of all analytic (entire) functions is a complex linear space.

(ix) In Example 7.4 (x), $((\ell^p \setminus \ell^p_0) \cup \{\theta\}, +, *)$, where $\theta = (0, 0, \ldots)$, is a field, since elements of $\ell^p \setminus \ell^p_0$ have multiplicative inverses. $(\mathbb{C}, +, \cdot)$ is another example of a field.

(x) The space $\mathbb{R}^X$ of all real-valued functions on a set X is a commutative algebra over $\mathbb{R}$ with unity. $\mathbb{R}^X$ is also a vector lattice.

(xi) The subspace $\mathcal{B}^*(X;\mathbb{R}) \subseteq \mathbb{R}^X$ of all bounded real-valued functions on a set X is a commutative subalgebra with unity and a vector lattice.

(xii) The subspace $\mathcal{C}(X;\mathbb{R})$ of all continuous functions is also a commutative subalgebra over $\mathbb{R}$ with unity and a vector lattice.

(xiii) The subspace $\mathcal{C}^*(X;\mathbb{R})$ of all bounded continuous functions is a commutative subalgebra of $\mathcal{C}(X;\mathbb{R})$ and a vector lattice.

(xiv) The subspace $\mathcal{C}^n(\mathbb{R};\mathbb{R})$ of all n-times differentiable functions is a commutative subalgebra with unity but not a lattice ($\sup\{x, -x\} = |x| \notin \mathcal{C}^n(\mathbb{R};\mathbb{R})$).

(xv) The space $\mathcal{C}^{(\infty)}(\mathbb{C};\mathbb{C})$ of all entire functions over $\mathbb{C}$ is a commutative algebra with unity but not a lattice.

(xvi) The space $\mathcal{P}$ of all polynomials with real coefficients is a commutative subalgebra over $\mathbb{R}$ with unity but not a lattice.

(xvii) The space $\mathcal{Q}$ of all polynomials with rational coefficients is a commutative subalgebra over the field of rational numbers with unity but not a lattice. □
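Examples 7.4 (x) and 7.8 (ix) rely on the discrete convolution $z_n = \sum_{k=0}^{n} x_k y_{n-k}$ and on the recursive construction of the inverse. The Python sketch below illustrates both on finitely many leading terms with exact rational arithmetic. It is only a sketch under assumptions: the helper names `convolve` and `inverse` and the sample sequence are hypothetical, and truncation to finitely many terms says nothing about membership of the inverse in $\ell^p$.

```python
from fractions import Fraction

def convolve(x, y):
    """Leading terms z_n = sum_{k=0}^n x_k y_{n-k}, up to the shorter input length."""
    n_terms = min(len(x), len(y))
    return [sum(x[k] * y[n - k] for k in range(n + 1)) for n in range(n_terms)]

def inverse(x):
    """Leading terms of y with x * y = (1, 0, 0, ...), assuming x[0] != 0."""
    if x[0] == 0:
        raise ValueError("x_0 must be nonzero")
    y = [Fraction(1) / x[0]]
    for n in range(1, len(x)):
        # Solve x_0 y_n + sum_{k=1}^n x_k y_{n-k} = 0 for y_n.
        y.append(-sum(x[k] * y[n - k] for k in range(1, n + 1)) / x[0])
    return y

x = [Fraction(v) for v in (2, 1, 3, 0, 5)]
y = inverse(x)
unity = [Fraction(1)] + [Fraction(0)] * (len(x) - 1)

print(convolve(x, y) == unity)
print(y[1] == -x[1] / x[0] ** 2, y[2] == x[1] ** 2 / x[0] ** 3 - x[2] / x[0] ** 2)
```

The second line of output compares the recursively computed $y_1$ and $y_2$ with the closed forms given in Example 7.4 (x). Since $z_n$ depends only on the first $n + 1$ terms of each sequence, the truncated computation is exact for the terms shown.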
PROBLEMS

7.1 Show that each monoid has exactly one identity.

7.2 Let $(\mathcal{Y}, *)$ be a group. Show that for each two elements $x, y \in \mathcal{Y}$, there are $l, r \in \mathcal{Y}$ such that $l * x = y$ and $x * r = y$.

7.3 An operation $*$ is called reducible if $x * y = x * z$ implies that $y = z$ for all x, y, z. Show that if $(\mathcal{Y}, *)$ is a group, then $*$ is reducible. In particular, show that for each $x \in \mathcal{Y}$, its inverse is unique.

7.4 Prove Theorem 7.2.

7.5 Let $[\mathcal{Y}, \mathcal{Z}, f]$ be an isomorphism. Show that $[\mathcal{Z}, \mathcal{Y}, f^{-1}]$ is also an isomorphism.

7.6 Let $[\mathcal{Y}, \mathcal{Z}, f]$ be an isomorphism. Find $\operatorname{Ker} f$.

7.7 Let $[\mathcal{Y}, \mathcal{Z}, f]$ be a mapping such that $\mathcal{Y} = \mathcal{Z} = \mathbb{R}$ with operation $+$ and let $f(x) = [x]$ (i.e. the greatest integer less than or equal to x). Is $[\mathcal{Y}, \mathcal{Z}, f]$ an endomorphism?

7.8 Let $(\mathcal{Y}, \cdot)$ be the set of all $2 \times 2$ real matrices with determinant equal to 1.
a) Show that $(\mathcal{Y}, \cdot)$ is a group.
b) Let B be any $2 \times 2$ nonsingular matrix. Define the map $[\mathcal{Y}, \mathcal{Y}, f]$ such that $f(A) = B^{-1}AB$. Show that $[\mathcal{Y}, \mathcal{Y}, f]$ is an automorphism.

7.9 Show that, $\forall a, b > 0$ and $p \in [1, \infty)$, $(a + b)^p \le 2^{p-1}(a^p + b^p)$. [Hint: For $p > 1$, work with the auxiliary function $f(x) = (a + x)^p - 2^{p-1}(a^p + x^p)$, $x > 0$.]

7.10 Show that $\ell^p$ is a linear space; specifically show that $x, y \in \ell^p \Rightarrow x + y \in \ell^p$. [Hint: Apply the inequality in Problem 7.9 in the form $|x_n + y_n|^p \le 2^{p-1}(|x_n|^p + |y_n|^p)$.]

7.11 Show that the operation $*$ in Example 7.4 (x) is commutative and associative and it is closed in $\ell^p$.

7.12 Show that the operation $*$ in Example 7.4 (xi) is commutative and associative.

7.13 Show that $\circ$ defined in Example 7.4 (viii) is associative and that $T \circ T^{-1} = T^{-1} \circ T = 1$.

7.14 Is $(\mathcal{T}, +, \circ)$ (where $(\mathcal{T}, \circ)$ is defined in Example 7.4 (viii)) a ring?

7.15 Let S be a subset of $\mathbb{C}$. Argue for what cases S is a subspace of $\mathbb{C}$ over $\mathbb{R}$.
a) S is a closed unit disc centered at zero, i.e., $S = \{z \in \mathbb{C} : |z| \le 1\}$.
b) $S = \{z \in \mathbb{C} : \{|\operatorname{Re}(z)| < 1\} \times \{|\operatorname{Im}(z)| < 1\}\}$.
c) $S = \{z \in \mathbb{C} : \{\operatorname{Im}(z) = 0\} \times \{|\operatorname{Im}(z)| < 1\}\}$.
d) $S = \{z \in \mathbb{C} : \operatorname{Im}(z) > 0 \text{ and } \operatorname{Re}(z) > 0\} \cup \{z \in \mathbb{C} : \operatorname{Im}(z) < 0 \text{ and } \operatorname{Re}(z) < 0\}$.

7.16 Prove in Definition 7.7, for functions, that $x = x^+ - x^-$ and $|x| = x^+ + x^-$.
NEW TERMS:

algebra
algebraic operation
semigroup
associative algebraic operation
monoid
two-sided identity
group
inverse
commutative algebraic operation
Abelian group
additive group
multiplicative group
zero
additive inverse
unity
multiplicative inverse
ring
left distributive law
right distributive law
commutative ring
ring with unity
group homomorphism
group isomorphism
group endomorphism
group automorphism
kernel
space of all n times differentiable functions
power
bilinear transformation
semigroup of functions
discrete convolution
$\ell^p$ space
$L^p$ space
normal probability density function
Dirac delta function
Dirac delta function, Fourier transform of
field
scalar
semifield
linear space (vector space) over a field
vector
real linear space
complex linear space
semi-linear space
subspace
algebra over a field
subalgebra
vector lattice
Chapter 2 Analysis of Metric Spaces
Metric spaces were introduced and studied by the French mathematician Maurice René Fréchet (in his doctoral dissertation published in 1906) and developed later by the German mathematician Felix Hausdorff (in his book Grundzüge der Mengenlehre of 1914). By the end of the nineteenth century, the mathematical world (partly inspired by Cantor's fundamental work in set theory) was evidently eager to structure more general sets than the conventional $\mathbb{R}^n$. In addition, the needs of complex analysis and the rapid development of differential equations sped up this process. Typical examples are uniform convergence in function spaces, approximation of continuous functions by polynomials, and the Riemann mapping theorem. After 1920, the theory of metric spaces, especially the fundamental work on normed spaces and their applications to functional analysis, was further developed by the Polish mathematician Stefan Banach and his school. In tribute to their achievements and to those of their fellow countrymen, an important subclass of metric spaces was named "Polish." A series of studies of metric spaces was further undertaken in the late 1920s by the Russian school of analysis. By that time, metric spaces had been generalized to topological spaces. In this chapter we introduce the main principles of metric spaces and their special case: normed linear spaces. This part of analysis traditionally precedes the more general theory of topology and functional analysis.

1. DEFINITIONS AND NOTATIONS
The concept of "metric" (measuring distances in space) is at the root of mathematical (geometric) thinking. Starting with that concept, we will see how the notions of limits of sequences and continuity of functions extend, by metrization, from the Euclidean spaces introduced in calculus to more general spaces. Recall that a point $x$ is a limit of a sequence $\{x_n\}$ if all terms of the sequence numbered $k, k+1, \dots$, for some $k$, are sufficiently "close" to $x$. The closeness of these points to $x$ is defined in terms of the Euclidean distance $|x - x_k|$, which determines the specific structure employed on the "carrier" $\mathbb{R}$. In many applications, the carrier is more general than $\mathbb{R}$ or even $\mathbb{R}^n$. So the question arises: how do we construct analysis in a general space? Since distance was crucial in the formation of analysis on the real line, we will introduce this notion also for the general space, emphasizing the main properties of the distance with which we have had experience. Once a distance (or metric) between any two points of a set is defined, the set becomes "well structured," or metrized, and is then ranked as a space, more precisely, a metric space.
1.1 Definitions.
(i) Let $X$ be a nonempty set. A metric $d$ (or distance) on $X$ is any nonnegative function $d: X^2 \to \mathbb{R}_+$ such that:
(a) $\forall x, y \in X$, $d(x,y) = 0 \Leftrightarrow x = y$.
(b) $\forall x, y \in X$, $d(x,y) = d(y,x)$.
(c) $\forall x, y, z \in X$, $d(x,y) \le d(x,z) + d(z,y)$ (triangle inequality).
The pair $(X,d)$ is called a metric space. We will refer to the set $X$ as a carrier. Sometimes, for brevity, the carrier $X$ itself will be called the metric space.
(ii) If for $x, y \in X$, $x = y$ implies $d(x,y) = 0$, but the converse does not hold (i.e., $d(x,y) = 0$ does not necessarily yield $x = y$), and if (b) and (c) hold, then $d$ is called a pseudo-metric. Correspondingly, the pair $(X,d)$ is called a pseudo-metric space. (Any pseudo-metric can be made a metric by introducing the equivalence classes "generated by metric $d$," in such a way that $x$ and $y$ belong to one and the same class whenever $d(x,y) = 0$.) □
1.2 Remark. By the triangle inequality we have

$d(x,y) - d(z,y) \le d(x,z)$, (1.2a)

which holds for all $x, y, z \in X$. Then, interchanging $x$ and $z$ in the last inequality, we arrive at

$d(x,y) - d(z,y) \ge -d(x,z)$. (1.2b)

Inequalities (1.2a) and (1.2b) yield

$|d(x,y) - d(y,z)| \le d(x,z)$, $\forall x, y, z \in X$. (1.2c) □

Let $Y \subseteq X$. Then the pair $(Y,d)$ is also a metric space, called a subspace of $(X,d)$.
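The three axioms of a metric can be explored numerically. The following is a minimal sketch, not part of the original text: the helper name `check_metric_axioms`, the sample points, and the three candidate functions are all illustrative assumptions. Passing the check on a finite sample does not prove that a function is a metric; a failure, however, is a genuine counterexample.

```python
from itertools import product

def check_metric_axioms(d, points, tol=1e-12):
    """Return the labels of the metric axioms (a), (b), (c) that fail on `points`."""
    failed = set()
    for x, y in product(points, repeat=2):
        # (a) nonnegativity, and d(x, y) = 0 exactly when x = y
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            failed.add("a")
        # (b) symmetry
        if abs(d(x, y) - d(y, x)) > tol:
            failed.add("b")
    for x, y, z in product(points, repeat=3):
        # (c) triangle inequality
        if d(x, y) > d(x, z) + d(z, y) + tol:
            failed.add("c")
    return sorted(failed)

# Hypothetical sample of real numbers and three candidate distance functions.
sample = [-2.0, -0.5, 0.0, 1.0, 3.0]
absolute = lambda x, y: abs(x - y)             # usual distance on the real line
discrete = lambda x, y: 0.0 if x == y else 1.0
squared = lambda x, y: (x - y) ** 2            # symmetric, but not a metric

print(check_metric_axioms(absolute, sample))
print(check_metric_axioms(discrete, sample))
print(check_metric_axioms(squared, sample))
```

The squared difference fails only the triangle inequality: for instance, $(-2 - 3)^2 = 25$ exceeds $(-2 - 0)^2 + (0 - 3)^2 = 13$.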
1.3 Examples (of metric spaces).
(i) The discrete metric is defined on a nonempty set $X$ as

$d(x,y) = \begin{cases} 1, & x \ne y \\ 0, & x = y. \end{cases}$

The triangle inequality does not hold if and only if $d(x,y) = 1$ and $d(x,z) = d(z,y) = 0$. However, this would only be possible for $x = z = y$. Hence, $d(x,y)$ cannot equal 1.
(ii) Let $X = (0,\infty)$ and $d(x,y) = \left|\frac{1}{x} - \frac{1}{y}\right|$. The triangle inequality follows from

$d(x,y) = \left|\frac{1}{x} - \frac{1}{y}\right| = \left|\frac{1}{x} - \frac{1}{z} + \frac{1}{z} - \frac{1}{y}\right| \le \left|\frac{1}{x} - \frac{1}{z}\right| + \left|\frac{1}{z} - \frac{1}{y}\right| = d(x,z) + d(z,y)$.

(iii) Let $X$ consist of all sequences $\{x_n\} \subset \mathbb{R}$. Such a carrier $X$ is denoted by $\mathbb{R}^{\mathbb{N}}$. Recall that a subset of $\mathbb{R}^{\mathbb{N}}$ is the $l^1$ space if it contains only absolutely convergent sequences, i.e., those with $\sum_{n=1}^{\infty} |x_n| < \infty$. Let us define the function $d$ on $l^1$ as $d(x,y) = \sum_{n=1}^{\infty} |x_n - y_n|$. Then

$d(x,y) = \sum_{n=1}^{\infty} |x_n - z_n + z_n - y_n| \le \sum_{n=1}^{\infty} |x_n - z_n| + \sum_{n=1}^{\infty} |z_n - y_n| = d(x,z) + d(z,y)$.

Thus, $d$ is a metric on $l^1$, since the other properties of $d$ as a metric are obvious.
(iv) Let $C[a,b]$ denote the set of all continuous functions on the interval $[a,b] \subset \mathbb{R}$. Let us define $d(x,y) = \sup\{|x(t) - y(t)| : t \in [a,b]\}$, called the supremum metric. Because any continuous function on a closed and bounded interval assumes maximum and minimum values, the definition of $d$ makes sense. Since the inequalities

$|x(t) - y(t)| \le |x(t) - z(t)| + |z(t) - y(t)| \le \sup |x(t) - z(t)| + \sup |z(t) - y(t)|$

hold for all $t \in [a,b]$, we have

$\sup |x(t) - y(t)| \le \sup |x(t) - z(t)| + \sup |z(t) - y(t)|$,

which is exactly the triangle inequality. Hence $d$ is a metric on $C[a,b]$.
(v) Now, define another metric on $C[a,b]$:

$d(x,y) = \int_a^b |x(t) - y(t)| \, dt$.

It is easy to see that $d(x,y) = 0$ if and only if $x(t) = y(t)$ for all $t \in [a,b]$ (why?). The triangle inequality is obvious. □

PROBLEMS
1.1 Let $X = \mathbb{R}$ and $d(x,y) = \sin^2(x - y)$. Is $(X,d)$ a metric space?
1.2 Let $X = \mathbb{R}$ and $d(x,y) = \sqrt{|x - y|}$. Is $(X,d)$ a metric space?
1.3 Let $X = \mathbb{R}^n$. Define on $X$, $d(x,y) = \max\{|x_k - y_k| : k = 1,\dots,n\}$, $\forall x = (x_1,\dots,x_n)$, $y = (y_1,\dots,y_n)$. Show that $(X,d)$ is a metric space.
1.4 Let $d$ be a metric on $X$. Define $\rho(x,y) = \dfrac{d(x,y)}{1 + d(x,y)}$. Show that $\rho$ is a metric on $X$.
1.5 Two real numbers $p > 1$ and $q > 1$ are called conjugate exponents if $\frac{1}{p} + \frac{1}{q} = 1$. Show that for all $x, y \in \mathbb{R}_+$ and for conjugate exponents $p$ and $q$, the following inequality holds:

$xy \le \dfrac{x^p}{p} + \dfrac{y^q}{q}$.

[Hint: Work with the function $f(z) = \frac{1}{q} + \frac{z}{p} - z^{1/p}$ and then substitute $z = \dfrac{x^p}{y^q}$.]
1.6 Prove Hölder's inequality (for finite sums): for conjugate exponents $p > 1$ and $q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$, $a_1,\dots,a_n \ge 0$, and $b_1,\dots,b_n \ge 0$,

$\sum_{i=1}^{n} a_i b_i \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}$.

[Hint: Apply Problem 1.5 to $x = a_i/A$ and $y = b_i/B$, where $A = \left[\sum_{i=1}^{n} a_i^p\right]^{1/p}$ and $B = \left[\sum_{i=1}^{n} b_i^q\right]^{1/q}$.]
1.7 a) Prove Minkowski's inequality (for finite sums): for $p \ge 1$, $a_1,\dots,a_n \ge 0$, and $b_1,\dots,b_n \ge 0$, it holds true that

$\left[\sum_{i=1}^{n} (a_i + b_i)^p\right]^{1/p} \le \left[\sum_{i=1}^{n} a_i^p\right]^{1/p} + \left[\sum_{i=1}^{n} b_i^p\right]^{1/p}$.

[Hint: Make use of $(a + b)^p = a(a + b)^{p-1} + b(a + b)^{p-1}$ and then apply Hölder's inequality.]
b) Generalize Minkowski's inequality for infinite sums.
1.8 The Euclidean metric or Euclidean distance is defined in $\mathbb{R}^n$ by

$d_e(x,y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}$, $x = (x_1,\dots,x_n)$, $y = (y_1,\dots,y_n)$. (P 1.8)

(Specifically, if $n = 1$, we have $d(x,y) = \sqrt{(x - y)^2} = |x - y|$.) Show that $d_e$ is indeed a metric. [Hint: Apply Minkowski's inequality.]

[In Problem 1.8 we defined the Euclidean metric on $\mathbb{R}^n$ by equation (P 1.8). This metric can be regarded as

$d_P(x,y) = \sqrt{\sum_{k=1}^{n} d_k^2(x_k, y_k)}$, (P 1.8a)

where $d_k(x_k, y_k)$ is the one-dimensional Euclidean metric on the $k$th coordinate axis ($k$th factor space). We can extend this notion and define a metric on the $n$-times Cartesian product set $Y = Y_1 \times Y_2 \times \dots \times Y_n$ by formula (P 1.8a). The proposition in Problem 1.9 states that such $d_P$ is indeed a metric on $Y$. We call this metric the product metric and the corresponding metric space $(Y, d_P)$ the product space. In notation, $(Y, d_P) = \times\{(Y_k, d_k) : k = 1,\dots,n\}$.]

1.9 Let $(Y_k, d_k)$, $k = 1,\dots,n$, be a collection of metric spaces and let $Y$ be the Cartesian product of $Y_1,\dots,Y_n$. Then the function $d_P$ on $Y \times Y$ defined by (P 1.8a) is a metric on $Y$. Prove the statement.
1.10 Show that the function $\rho(x,y) = \sum_{k=1}^{n} d_k(x_k, y_k)$ is also a metric on $Y = Y_1 \times Y_2 \times \dots \times Y_n$.
NEW TERMS:
metrization, carrier, metric, distance, triangle inequality, metric space, pseudo-metric, pseudo-metric space, subspace, discrete metric, $l^1$-space, supremum metric, conjugate exponents, Hölder's inequality, Minkowski's inequality, Euclidean metric, Euclidean distance, product metric, product space
2. THE STRUCTURE OF METRIC SPACES
The structural properties of metric spaces stem from the notion of the open ball, with the aid of which we shall be able to introduce open and closed sets, interior, closure, and accumulation points. Open balls, due to a particular metric, generate convergence and continuity, the principles of any analysis, which we explore in this chapter and Chapter 3.

2.1 Definition. Let $(X,d)$ be a metric space and let $x \in X$ and $r > 0$. The subset of $X$,

$B(x,r) = \{y \in X : d(x,y) < r\}$,

is called the open ball centered at $x$ with radius $r$ (with respect to metric $d$). [If we need to emphasize that the ball is with respect to metric $d$, we will write $B_d(x,r)$. This notation makes sense whenever more than one metric on $X$ is considered.] □

2.2 Examples.
(i) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}, d_e)$ is the open interval $(x - r, x + r)$.
(ii) The open ball $B(x,r)$ in Euclidean space $(\mathbb{R}^2, d_e)$ is the open disc centered at $x$ with radius $r$ in the usual sense.
(iii) Different choices of metric on a given carrier give rise to different spaces and, as a result, to different open balls. In metric spaces other than Euclidean, the shape of open balls may be quite surprising to our usual perception. Consider, for instance, an open ball $B(x,r)$ in $(\mathbb{R}^2, d)$, where $d$ is the supremum metric defined as in Problem 1.3, for $n = 2$, i.e.,

$d(x,y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$.

It is easy to see that the open ball $B(x,r)$ is a square and that the corresponding open ball $B_e(x,r)$ with respect to the Euclidean metric in $\mathbb{R}^2$ is inscribed in this square (see Figure 2.1 below).
(iv) Let $(X,d)$ be a discrete metric space with the metric defined in Example 1.3 (i). Then, for any $x \in X$, an open ball centered at $x$ is

$B(x,r) = \begin{cases} \{x\}, & r \le 1 \\ X, & r > 1. \end{cases}$
Figure 2.1 [figure not reproduced; recoverable labels: $x_1 - r$, $x_2 + r$]

Figure 2.2 [figure not reproduced; recoverable labels: $x(t) + r$, $x(t)$, $y(t)$, $x(t) - r$, $a$, $b$]
(v) Let $(X,d)$ be the metric space defined in Example 1.3 (iv), where $X = C[a,b]$ and $d(x,y) = \sup\{|x(t) - y(t)| : t \in [a,b]\}$. Then the open ball $B(x,r)$ has the shape depicted in Figure 2.2 above. □

2.3 Definition. Let $(X,d)$ be a metric space. A subset $A$ of the carrier $X$ is called a $d$-open set (or just open set) if every point $x$ of $A$ can serve as the center of an open ball inscribed in $A$, i.e., there is an $r > 0$ such that $B(x,r) \subseteq A$. □

2.4 Examples.
(i) Every open ball is itself an open set. Indeed, if $x_1 \in B(x,r)$, then $r - d(x,x_1) > 0$. Take $r_1 = r - d(x,x_1)$ and show that $B(x_1,r_1) \subseteq B(x,r)$. For every $z \in B(x_1,r_1)$, by the triangle inequality,

$d(x,z) \le d(x,x_1) + d(x_1,z) < d(x,x_1) + r_1 = r$.

Thus $z \in B(x,r)$ (see Figure 2.3).

Figure 2.3 [figure not reproduced]
(ii) The set $[a,b)$, for $a < b$, in $(\mathbb{R}, d_e)$ is not open, since there is no open ball $B(a,r) \subseteq [a,b)$.
(iii) The carrier $X$ is obviously open.
(iv) A set $A$ is not open if there is at least one point $x \in A$ such that no ball $B(x,r)$ can be inscribed in $A$. Since the empty set does not have any points, it is reasonable to assign it to the class of open sets.
(v) In the Euclidean space $(\mathbb{R}, d_e)$, $\mathbb{R}$ is an open set but not an open ball (why?). □

2.5 Theorem. For every metric space $(X,d)$, the following statements hold true:
(i) Arbitrary unions of open sets are open sets.
(ii) Finite intersections of open sets are open sets.
Proof.
(i) Let $\{A_k : k \in I\}$ be an indexed family of open sets in $X$ and let $A = \bigcup_{k \in I} A_k$. If $x \in A$, then there is an index $i$ such that $x \in A_i$. Since $A_i$ is open, there is an $r > 0$ such that

$B(x,r) \subseteq A_i \subseteq A$.

Therefore, $A$ is open.
(ii) Let $A_1,\dots,A_n$ be open subsets of $X$ and let $A = \bigcap_{k=1}^{n} A_k$. If $x \in A$, then $x \in A_k$, $k = 1,\dots,n$. It follows that there are $r_1,\dots,r_n > 0$ such that $B(x,r_k) \subseteq A_k$, $k = 1,\dots,n$. Let $r = \min\{r_1,\dots,r_n\}$. Then, obviously, $B(x,r) \ne \emptyset$ and $B(x,r) \subseteq A_k$, $k = 1,\dots,n$. Thus, $B(x,r) \subseteq A$ and $A$ is open. □

2.6 Remark. The intersection of infinitely many open sets need not be open. The reason is that $r = \inf\{r_k : k \in I\}$ can be zero. For example, let $A_n$, $n = 1,2,\dots$, be open intervals shrinking to the point 1, say

$A_n = \left(1 - \frac{1}{n},\, 1 + \frac{1}{n}\right)$.

Then $1 \in A_n$, $n = 1,2,\dots$, which implies that $1 \in \bigcap_{n=1}^{\infty} A_n$ and hence

$\{1\} = \bigcap_{n=1}^{\infty} A_n$.

However, the set $\{1\}$ is not open in $(\mathbb{R}, d_e)$. □
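The phenomenon in Remark 2.6 can be traced numerically. The sketch below assumes the particular family $A_n = (1 - 1/n, 1 + 1/n)$, chosen here only for illustration; the function names are hypothetical. It computes the largest radius of a ball around 1 that fits into the first $N$ sets and shows that this radius shrinks as $N$ grows.

```python
from fractions import Fraction

def largest_radius_at_one(n):
    """Largest r with B(1, r) contained in A_n = (1 - 1/n, 1 + 1/n)."""
    return Fraction(1, n)

def common_radius(N):
    """Largest r such that B(1, r) lies in the finite intersection A_1, ..., A_N."""
    return min(largest_radius_at_one(n) for n in range(1, N + 1))

for N in (1, 10, 1000):
    print(N, common_radius(N))
```

For every finite $N$ the common radius $1/N$ is positive, in agreement with Theorem 2.5 (ii), but its infimum over all $N$ is zero, so no ball around 1 fits into the infinite intersection.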
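2.7 Example. Let $(X,d)$ be a discrete metric space. Then the power set of $X$ is exactly the collection of all open subsets of $X$. In particular, in a space endowed with the discrete metric, all singletons are open, while in Euclidean space $(\mathbb{R}, d_e)$ they are not. □

2.8 Definitions.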
(i) A point $x \in A \subseteq X$ is called an interior point of $A$ if there exists an open ball $B(x,r) \subseteq A$. The set of all interior points of set $A$ is denoted by $A^\circ$ or $\mathrm{Int}(A)$ and called the interior of $A$. [Clearly, $A^\circ$ is the largest open subset of $A$, which yields that $A$ is open if and only if $A = A^\circ$. Indeed, let $C \subseteq A$ be an open set larger than $A^\circ$. Then there is an $x \in C$ such that $x \notin A^\circ$. But this is a contradiction, since $x$ must be an interior point of $A$.]
(ii) A subset $A$ of $X$ is called closed if its complement $A^c$ is open. [Specifically, the carrier $X$ and the empty set $\emptyset$ are both closed.]
(iii) A point $x \in X$ is called a closure point of $A \subseteq X$ if every open ball centered at $x$ contains at least one element of $A$ (including $x$ if $x \in A$). We will also say, "if every open ball centered at $x$ meets $A$." The set of all closure points of $A$ is denoted by $\bar{A}$ or by $\mathrm{Cl}(A)$ and called the closure of $A$. [For example, let $A = [0,2) \cup \{5\}$. The point 5 is one of the closure points, since $B(5,r)$ contains 5 for all $r > 0$. Thus, $\bar{A} = [0,2] \cup \{5\}$.]

2.9 Proposition.
Arbitrary intersections or finite unions of closed sets are closed sets.

Proof. The statements follow by applying De Morgan's laws. □
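The appeal to De Morgan's laws can be spelled out as follows; the index set $I$ and the names $F_k$ for the closed sets are introduced here only for illustration.

```latex
\[
\Bigl(\bigcap_{k\in I} F_k\Bigr)^{c} \;=\; \bigcup_{k\in I} F_k^{\,c},
\qquad
\Bigl(\bigcup_{k=1}^{n} F_k\Bigr)^{c} \;=\; \bigcap_{k=1}^{n} F_k^{\,c}.
\]
```

Each $F_k^c$ is open because $F_k$ is closed. By Theorem 2.5, the union on the left-hand identity and the finite intersection on the right-hand identity are open, so the complements of $\bigcap_{k \in I} F_k$ and $\bigcup_{k=1}^{n} F_k$ are open, i.e., both sets are closed.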
2.10 Examples.
(i) From Definition 2.8 (iii) it follows that $A \subseteq \bar{A}$.
(ii) Since the set of all open subsets of a discrete metric space $(X,d)$ coincides with its power set, the set of all closed subsets is also the power set. Particularly, in a discrete metric space all subsets are simultaneously open and closed. □

2.11 Proposition. For any subset $A$ of $X$, $\bar{A}$ is the smallest closed superset of $A$.
Proof.
(i) We show first that $\bar{A}$ is a closed set, i.e., that $(\mathrm{Cl}(A))^c$ is open. Let $x \in (\mathrm{Cl}(A))^c$. Then there exists an open ball $B(x,r)$ such that $B(x,r) \cap A = \emptyset$ (since, otherwise, $x$ would belong to $\bar{A}$ by the definition). However, we have not yet proved that $B(x,r) \cap \bar{A} = \emptyset$, which would immediately imply that $(\mathrm{Cl}(A))^c$ is open. So we now show that no point of $B(x,r)$ is a closure point of $A$. Take an arbitrary point $t \in B(x,r)$. Since $B(x,r)$ is an open set, there is an open ball $B(t,r_t) \subseteq B(x,r)$, which is also disjoint from $A$. By the definition of a closure point, this means that $t \notin \bar{A}$. Since $t$ was an arbitrary point of $B(x,r)$, $B(x,r) \subseteq (\mathrm{Cl}(A))^c$.
(ii) Now we show that the closure of $A$ is the smallest closed set containing $A$. Let $B$ be an arbitrary closed set such that $A \subseteq B$. We prove that $B^c \subseteq (\bar{A})^c$. Since $B^c$ is open, for each $x \in B^c$, there is an open ball $B(x,r) \subseteq B^c$. This implies that $B(x,r) \cap B = \emptyset$ and that

$B(x,r) \cap A = \emptyset$.

Thus $x \notin \bar{A}$ (by the definition of a closure point), which is equivalent to $x \in (\mathrm{Cl}\,A)^c$. Therefore, we have proved that $x \in B^c$ yields $x \in (\mathrm{Cl}\,A)^c$, i.e., $B^c \subseteq (\mathrm{Cl}\,A)^c$. The latter is obviously equivalent to $\bar{A} \subseteq B$. □

2.12 Corollary. A set $A$ is closed if and only if $A = \bar{A}$.

(See Problem 2.1.)

2.13 Remark. Consider the set $C(x,r) = \{y \in X : d(x,y) \le r\}$. It can be easily shown that $C$ is a closed set. (See Problem 2.4.) Such $C$ is called a closed ball centered at $x$ with radius $r$. Evidently, $B(x,r) \subseteq C(x,r)$ implies that $\overline{B(x,r)} \subseteq C(x,r)$, since $\bar{B}$ is the smallest closed set containing $B$. However, we observe that $C(x,r)$ does not necessarily coincide with the closure of the corresponding open ball $B(x,r)$. For instance, let $(X,d)$ be a discrete metric space, where any open ball is both a closed and an open set, i.e., $\overline{B(x,r)} = B(x,r)$. Because

$C(x,r) = \begin{cases} \{x\}, & r < 1 \\ X, & r \ge 1, \end{cases}$

we have $\overline{B(x,r)} = C(x,r) = X$ for $r > 1$ or $\overline{B(x,r)} = C(x,r) = \{x\}$ for $r < 1$. For $r = 1$, $\overline{B(x,r)} = \{x\} \subset C(x,r) = X$, unless $X$ is a singleton. □

2.14 Examples.
(i) In the Euclidean metric space $(\mathbb{R}, d_e)$, for each $x \in \mathbb{R}$, $\{x\}$ is closed. Indeed, $\{x\}^c = (-\infty, x) \cup (x, \infty)$ is open.
(ii) The set of all rational numbers $\mathbb{Q}$ is neither open nor closed. Indeed, it is known that each irrational point $x$ is a limit of a sequence of rational points $\{x_n\}$. Therefore, there is no open ball $B(x,r)$ that does not contain rational points. This implies that $\mathbb{Q}^c$ is not open, or equivalently, $\mathbb{Q}$ is not closed. On the other hand, $\mathbb{Q}$ cannot be open, since otherwise every rational point $q$ could be the center of an open ball (interval) containing only rational numbers. This is absurd, since any interval has the cardinality of the continuum, whereas $\mathbb{Q}$ is countable. Therefore, the set of all rational numbers is neither open nor closed. It also follows that the set of all irrational numbers is neither open nor closed. □

2.15 Definition. A point $x \in X$ is called an accumulation point of a set $A \subseteq X$ if $\forall r > 0$, $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$. [Observe that $x$ need not be an element of $A$.] The set of all accumulation points of $A$ is called the derived set of $A$ and is denoted by $A'$. □

Unlike a closure point, an accumulation point must be "close" to $A$. If $B(x,r) \cap (A \setminus \{x\}) \ne \emptyset$, then $B(x,r) \cap A \ne \emptyset$, and, consequently, $x \in A'$ yields that $x \in \bar{A}$, or $A' \subseteq \bar{A}$.

2.16 Examples.
(i) Notice that not every closure point is an accumulation point. For instance, let $A = (0,1) \cup \{2\} \subset (\mathbb{R}, d_e)$. Then 2 is obviously a closure point of $A$. However, 2 is not an accumulation point of $A$, since $B(2, \tfrac{1}{2}) \cap (0,1) = \emptyset$. On the other hand, 0 is both an accumulation point and a closure point of $A$.
(ii) Let $A = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \dots\} \subset (\mathbb{R}, d_e)$. Since 0 is the limit of the sequence $\{\tfrac{1}{n}\}$ (in terms of Euclidean distance), it is also an accumulation point of $A$: any open ball at 0 contains at least one point of $A$. This is the only accumulation point of $A$. By the way, $A$ is not closed, for 0 is a closure point of $A$ that does not belong to $A$. So we have $A' = \{0\}$, $\bar{A} = A \cup \{0\}$. □

In the previous section we introduced the notion of the product metric. We wonder what the shape of open sets in the product metric space is. A remarkable property of this metric is given by the following theorem.

2.17 Theorem. Let $\{(Y_k, d_k) : k = 1,\dots,n\}$ be a finite family of metric spaces and let $(Y,d) = \times\{(Y_k, d_k) : k = 1,\dots,n\}$ be the product space. Then $O \subseteq (Y,d)$ is open if and only if $O$ is the union of sets of the form $\times\{O_i : i = 1,\dots,n\}$, where each $O_i$ is open in $(Y_i, d_i)$.

A proof of this theorem in a more general form is given in Chapter 3.
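The claims of Example 2.16 (ii) about $A = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \dots\}$ can be made explicit. In the sketch below (the function names are hypothetical and exact rational arithmetic is an implementation choice), the first function exhibits, for any radius $r > 0$, a point of $A$ inside $B(0,r)$, which is why 0 is an accumulation point; the second gives a radius around $1/k$ that excludes every other point of $A$, which is why no point of $A$ is itself an accumulation point.

```python
from fractions import Fraction
import math

def point_of_A_in_ball(r):
    """Return a point 1/n of A = {1, 1/2, 1/3, ...} with 0 < 1/n < r."""
    n = math.floor(1 / r) + 1      # n > 1/r, hence 1/n < r
    return Fraction(1, n)

def isolating_radius(k):
    """A radius r such that B(1/k, r) contains no point of A other than 1/k."""
    # The nearest other point of A is 1/(k+1); the open ball excludes it.
    return Fraction(1, k) - Fraction(1, k + 1)

print(point_of_A_in_ball(Fraction(1, 1000)))
print(isolating_radius(4))
```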
PROBLEMS
2.1 Prove Corollary 2.12.
2.2 Is it true that $A \subseteq B \Rightarrow \bar{A} \subseteq \bar{B}$?
2.3 Show that $[\overline{A^c}]^c \subseteq A$.
2.4 Prove that a closed ball $C(x,r)$ is a closed set.
2.5 Show that in $(\mathbb{R}^n, d_e)$, $\overline{B(x,r)} = C(x,r)$.
2.6 Show that $\bar{A} = A \cup A'$.
2.7 Let $A \subseteq (X,d)$, where $X$ is an infinite set. Show that, if $x$ is an accumulation point of $A$, then every open set containing $x$ contains infinitely many points of $A$.
2.8 Give an example of a continuum closed set that does not have any accumulation point.
2.9 Find the shape of open balls in the metric space $(X,d)$ introduced in Example 1.3 (ii).
2.10 Show that the set $[1,\infty)$ is closed in the metric space in Problem 2.9.

NEW TERMS:
open ball, radius of an open ball, supremum metric, open ball with respect to the Euclidean metric, open ball with respect to the supremum metric, open ($d$-open) set, interior point, interior of a set, closed set, closure point, closure of a set, closed ball, accumulation point, derived set
3. CONVERGENCE IN METRIC SPACES
This section introduces the reader to one of the central notions in the analysis of metric spaces: convergence. Among other things, we will discuss the relation between limit and closure points.

3.1 Definitions.
(i) Recall that a function $[\mathbb{N}, X, f]$ is called a sequence, and its most commonly used notation is $\{x_n\} = f$, with $x_n = f(n)$. Let $\{x_n\} \subseteq (X,d)$ be a sequence and let $x \in X$. A subsequence $Q_N = \{x_N, x_{N+1}, \dots\}$ is called an $N(x,\varepsilon)$-tail of $\{x_n\}$ if there are $N \ge 1$ and $\varepsilon > 0$ such that $Q_N \subseteq B(x,\varepsilon)$. The sequence $\{x_n\}$ is said to converge to a point $x \in X$ if for every $\varepsilon > 0$, there is an $N(x,\varepsilon)$-tail. In notation,

$\lim_{n \to \infty} d(x_n, x) = 0$ (also $d$-$\lim_{n \to \infty} x_n = x$ or just $x_n \to x$).

$x$ is called a limit point of the sequence $\{x_n\}$. A sequence is convergent if it is convergent to at least one limit point that belongs to $X$.
(ii) A point $x$ is said to be a limit point of a set $A$ if there is a sequence $\{x_n\} \subseteq A$ convergent to $x$.
(iii) A sequence $\{x_n\}$ is called a Cauchy sequence, in notation

$\lim_{n,m \to \infty} d(x_n, x_m) = 0$,

if for each $\varepsilon > 0$, there is an $N$ such that $d(x_n, x_m) < \varepsilon$ for $n, m > N$.
(iv) A metric space $(X,d)$ is called complete if every Cauchy sequence in $X$ is convergent.
(v) A sequence $\{x_n\}$ is called bounded if there is a positive real number $M$ such that $d(x_1, x_n) \le M$ for every $n$. □

3.2 Remark. A sequence in a metric space can have at most one limit point. Indeed, let $x, y$ be limits of a sequence $\{x_n\} \subseteq (X,d)$ and let $\varepsilon > 0$ be arbitrary. Then, for all sufficiently large $n$, by the triangle inequality,

$d(x,y) \le d(x,x_n) + d(x_n,y) < \varepsilon + \varepsilon = 2\varepsilon$

(i.e., $d(x,y)$ can be made arbitrarily small). Thus, $x = y$. □

3.3 Theorem. Let $A \subseteq (X,d)$. Then a point $x$ is a closure point of the set $A$ if and only if $x$ is a limit point of $A$ (i.e., there is a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$).

Proof.
(i) Let $x$ be a closure point of $A$. If $x \in A$, then the proof becomes trivial (take $x_n = x$, $n = 1,2,\dots$). Let $x \in \bar{A} \setminus A$. By the definition of a closure point, every open ball $B(x,r)$ meets $A$. Thus for every $n$, there is a point $x_n \in A \cap B(x, \tfrac{1}{n})$, so that $d(x,x_n) < \tfrac{1}{n}$. Therefore, $\{x_n\}$ is a desired sequence convergent to $x$.
(ii) Let $\{x_n\} \subseteq A$ be such that $\lim_{n \to \infty} x_n = x$. We prove that $x \in \bar{A}$. The convergence implies that for every $\varepsilon > 0$, there is an $N$ such that

$d(x,x_n) < \varepsilon$ for all $n > N$.

Thus $\forall \varepsilon > 0$, $B(x,\varepsilon) \cap A \ne \emptyset$, which yields that $x \in \bar{A}$. (Particularly, if $x \in A' \setminus A$, then there exists a sequence $\{x_n\}$ with all distinct terms such that $x_n \to x$.) □
3.4 Corollary. A subset $A$ of a metric space $(X,d)$ is closed if and only if it contains all of its limit points.

Proof.
(i) Let $A$ be closed and let $\{x_n\} \subseteq A$ be a convergent sequence. Then, by Theorem 3.3, $\lim_{n \to \infty} x_n = x \in \bar{A}$. Since $A$ is closed, $A = \bar{A}$ and $x \in A$. Thus, $A$ contains all of its limit points.
(ii) Let $A$ contain all of its limit points. Apply the pick-a-point process. Let $x \in \bar{A}$. Then, by Theorem 3.3, there is a sequence $\{x_n\} \subseteq A$ such that $\lim_{n \to \infty} x_n = x$. By our assumption, $x$ belongs to $A$ or, equivalently, $\bar{A} \subseteq A$, implying that $A = \bar{A}$ and hence $A$ is closed. □

3.5 Definitions.
(i) A subset $A \subseteq (X,d)$ is called dense in $X$ if $\bar{A} = X$. [By Theorem 3.3, $A$ is dense in $X$ if and only if the set of all limit points of $A$ coincides with $X$, or, in other words, if and only if for every $x \in X$, there exists a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$.]
(ii) A set $A \subseteq (X,d)$ is called nowhere dense if its closure has the empty set for its interior, i.e., if $\mathrm{Int}(\mathrm{Cl}(A)) = \emptyset$.
(iii) A point $x \in (X,d)$ is called a boundary point of $A$ if every open ball at $x$ contains points from $A$ and from $A^c$. The set of all boundary points of $A$ is called the boundary of $A$ and is denoted by $\partial A$. [Note that $\partial A = \partial A^c = \bar{A} \cap \overline{A^c}$.] □
3.6 Examples.
(i) Since each irrational number can be represented as the limit of a sequence of rational numbers, $\mathbb{Q}$ is dense in $\mathbb{R}$ (in terms of the Euclidean metric).
(ii) $X$ and $\emptyset$ have no boundary points.
(iii) Let $A = [0,1) \cup \{2\}$. Then $A^\circ = (0,1)$, $\bar{A} = [0,1] \cup \{2\}$, $A' = [0,1]$, $\partial A = \{0,1,2\}$ (since $A^c = (-\infty,0) \cup [1,2) \cup (2,\infty)$, $\overline{A^c} = (-\infty,0] \cup [1,\infty)$, and $\bar{A} \cap \overline{A^c} = \{0,1,2\}$).
(iv) Let $A = \{1,5,10\} \subseteq (\mathbb{R}, d_e)$. Then $A$ is nowhere dense.
(v) $\{\tfrac{1}{n} : n = 1,2,\dots\}$ is nowhere dense in $(\mathbb{R}, d_e)$.
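Example 3.6 (i) rests on the fact that every irrational number is the limit of a sequence of rationals. The following sketch constructs such a sequence for $\sqrt{2}$ by decimal truncation; the choice of $\sqrt{2}$, the function name, and the use of exact rational arithmetic are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
from math import isqrt

def rational_approximation(n):
    """n-digit decimal truncation of sqrt(2), returned as an exact rational."""
    scale = 10 ** n
    # isqrt(2 * scale**2) is the integer part of sqrt(2) * 10**n.
    return Fraction(isqrt(2 * scale * scale), scale)

for n in range(6):
    x_n = rational_approximation(n)
    # x_n < sqrt(2) < x_n + 10**(-n), so d_e(x_n, sqrt(2)) < 10**(-n).
    print(n, x_n, float(x_n))
```

Each term $x_n$ is rational and satisfies $d_e(x_n, \sqrt{2}) < 10^{-n}$, so $x_n \to \sqrt{2}$ and $\sqrt{2}$ is a limit point (hence, by Theorem 3.3, a closure point) of $\mathbb{Q}$.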
PROBLEMS
3.1 Show that every convergent sequence is a Cauchy sequence. Give an example when the converse is not true.
3.2 Prove that $\bar{A} = A^\circ + \partial A$.
3.3 If $x \in \partial A$, must $x$ be an accumulation point?
3.4 Prove that a set $A \subseteq (X,d)$ is nowhere dense in $X$ if and only if the complement of its closure is dense in $X$.
3.5 Assuming that $(\mathbb{R}, d_e)$ is complete (a known fact from calculus), prove that $(\mathbb{R}^n, d_e)$ is also complete.
3.6 Show that any Cauchy sequence is bounded.
3.7 Show that in a discrete metric space any convergent sequence has at most finitely many distinct terms.
3.8 Show that any discrete metric space is complete.
3.9 Show that if $\{x_n\} \subseteq (X,d)$ is a Cauchy sequence and $\{x_{n_k}\}$ is a subsequence convergent to a point $a \in X$, then $x_n \to a$.

NEW TERMS:
sequence, $N(x,\varepsilon)$-tail, convergent sequence, limit point of a sequence, limit point of a set, Cauchy sequence, complete metric space, bounded sequence, dense set, nowhere dense set, boundary point, boundary of a set
4. CONTINUOUS MAPPINGS IN METRIC SPACES
4.1 Definition. Let $(X,d)$ and $(Y,\rho)$ be two metric spaces. A function $f: (X,d) \to (Y,\rho)$ is called continuous at a point $x_0 \in X$ if for each $\varepsilon > 0$, there is a number $\delta > 0$ such that $\rho(f(x), f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$. The function $f$ is called continuous on $X$, or simply continuous, if $f$ is continuous at every point of $X$. □

4.2 Remark. Since $x_0 \in f^*(\{f(x_0)\}) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$, we have $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$. However, in general, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. The continuity of the function $f$ at $x_0$ is equivalent to the statement that, for any $\varepsilon > 0$, $x_0$ is indeed an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In other words, $f$ is continuous at $x_0$ if and only if the inverse image under $f$ of any open ball centered at $f(x_0)$ contains $x_0$ as an interior point. (See Figure 4.1.) Consequently, there is an open ball $B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$. In particular, this implies that: 1) such a positive $\delta$ exists, and 2) the image of $B_d(x_0,\delta)$ under $f$ is a subset of $B_\rho(f(x_0),\varepsilon)$, which guarantees that $\rho(f(x), f(x_0)) < \varepsilon$ for all $x$ with $d(x,x_0) < \delta$.

Figure 4.1 [figure not reproduced; recoverable axis labels: $X$, $Y$]

However, if $f$ is not continuous at $x_0$, as depicted in Figure 4.2 below, $x_0$ need not be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. In this case, no ball $B_d(x_0,\delta)$ can be inscribed in $f^*(B_\rho(f(x_0),\varepsilon))$ or, equivalently, no positive $\delta$ exists to warrant $\rho(f(x), f(x_0))$ to be less than $\varepsilon$ for all $x$ with $d(x,x_0) < \delta$. □
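The $\varepsilon$-$\delta$ condition can be probed numerically for a concrete function. The sketch below takes $f(x) = x^2$ on $(\mathbb{R}, d_e)$, an example chosen here for illustration, together with the candidate $\delta = \min\{1, \varepsilon/(2|x_0| + 1)\}$. This choice works because $|x - x_0| < \delta \le 1$ gives $|x + x_0| < 2|x_0| + 1$, so $|x^2 - x_0^2| < \delta(2|x_0| + 1) \le \varepsilon$. The helper names and the random sampling scheme are assumptions of the sketch; sampling can refute a bad $\delta$ but cannot prove a good one.

```python
import random

def f(x):
    return x * x

def delta_for_square(x0, eps):
    """Candidate delta for f(x) = x**2 at x0: min(1, eps / (2|x0| + 1))."""
    return min(1.0, eps / (2.0 * abs(x0) + 1.0))

def violates(x0, eps, delta, trials=10_000, seed=0):
    """Search the ball B(x0, delta) for a point x with |f(x) - f(x0)| >= eps."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = x0 + delta * rng.uniform(-0.99, 0.99)
        if abs(f(x) - f(x0)) >= eps:
            return True
    return False

x0, eps = 3.0, 0.1
print(violates(x0, eps, delta_for_square(x0, eps)))  # candidate delta from the estimate
print(violates(x0, eps, 0.5))                        # a delta that is too large
```

Note that the candidate $\delta$ depends on $x_0$ as well as on $\varepsilon$: the larger $|x_0|$, the smaller the ball that is needed.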
Figure 4.2 [figure not reproduced; recoverable labels: $B_\rho(f(x_0),\varepsilon)$, $f(x_0)$, $x_0$ (non-interior point), axes $X$ and $Y$]

The following theorem is a generalization of the above principles of continuity.

4.3 Theorem.
A function $f: (X,d) \to (Y,\rho)$ is continuous if and only if the inverse image of any open set in $(Y,\rho)$ under $f$ is open in $(X,d)$.

Proof.
1) As mentioned in Remark 4.2, we begin the proof by showing the validity of the following assertion: $f$ is continuous at $x_0$ if and only if $x_0$ is an interior point of the inverse image under $f$ of any open ball $B_\rho(f(x_0),\varepsilon)$.
Let $x_0$ be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. Then there is an open ball

$B_d(x_0,\delta) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$

and hence (by Problems 3.6 (a) and 2.6 of Chapter 1),

$f_*(B_d(x_0,\delta)) \subseteq f_*(f^*(B_\rho(f(x_0),\varepsilon))) \subseteq B_\rho(f(x_0),\varepsilon)$,

which yields continuity of $f$ at $x_0$. Now, let $f$ be continuous at $x_0$. Then the inclusion $f_*(B_d(x_0,\delta)) \subseteq B_\rho(f(x_0),\varepsilon)$ holds, which, along with Problem 2.5 (Chapter 1), leads to the following sequence of inclusions:

$B_d(x_0,\delta) \subseteq f^*(f_*(B_d(x_0,\delta))) \subseteq f^*(B_\rho(f(x_0),\varepsilon))$.

Because $x_0$ is the center of $B_d(x_0,\delta)$, it is an interior point of this ball and, due to the last inclusion, an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$.
2) Suppose $f$ is continuous on $X$. We show that for each open set $O \subseteq Y$, $f^*(O)$ is open in $(X,d)$. Pick a point $x_0 \in f^*(O)$. Then $f(x_0) \in f_*(f^*(O)) \subseteq O$ and, since $O$ is open, $f(x_0)$ is its interior point. Thus, $O$ is a superset of the open ball $B_\rho(f(x_0),\varepsilon)$, for some $\varepsilon$, and consequently,

$f^*(B_\rho(f(x_0),\varepsilon)) \subseteq f^*(O)$. (4.3)

Since $f$ is continuous at $x_0$, by assertion 1), $x_0$ must be an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$ and, by (4.3), an interior point of $f^*(O)$. Thus, $f^*(O)$ is open.
3) Let $f^*(O)$ be open in $(X,d)$ for every open subset $O$ of $Y$. Take $x_0 \in X$ and construct an open ball $B_\rho(f(x_0),\varepsilon)$. By our assumption, the set $f^*(B_\rho(f(x_0),\varepsilon))$ is open in $(X,d)$. Since $f(x_0) \in B_\rho(f(x_0),\varepsilon)$, we have that $x_0 \in f^*(B_\rho(f(x_0),\varepsilon))$ and, therefore, it is an interior point of $f^*(B_\rho(f(x_0),\varepsilon))$. By 1), $f$ must then be continuous at $x_0$. □
There is yet another useful criterion of continuity.

4.4 Theorem. A function $f: (X,d) \to (Y,\rho)$ is continuous at $x \in X$ if and only if for every sequence $\{x_n\}$, $d$-convergent to $x$, its image sequence $\{f(x_n)\}$ is $\rho$-convergent to $f(x)$.

We will prove this theorem for a more general case in Chapter 3 (Theorems 4.9 and 4.10).

4.5 Definition. Let $(X,d)$ be a metric space and $\tau(d)$ be the collection of all open subsets of $X$ with respect to metric $d$. Then $\tau(d)$ (or just $\tau$) is said to be the topology on $X$ generated by $d$. □

Theorem 4.3 can now be reformulated as follows.

4.6 Theorem. Let $f: (X,d) \to (Y,\rho)$ be a function and let $\tau(d)$ and $\tau(\rho)$ be the topologies generated by metrics $d$ and $\rho$, respectively. Then $f$ is continuous on $X$ if and only if $f^{**}(\tau(\rho)) \subseteq \tau(d)$ [i.e., $\forall O \in \tau(\rho)$, $f^*(O) \in \tau(d)$]. □

4.7 Example. Let $f: (\mathbb{R},d) \to (\mathbb{R},d_e)$ be the Dirichlet function defined as $f = \mathbf{1}_{\mathbb{Q}}$, where $\mathbb{Q}$ is the set of rational numbers. If $d = d_e$ is the Euclidean metric, then $f$ is discontinuous at every point. If $d$ is the discrete metric, then, by Theorem 4.3, $f$ is continuous on $\mathbb{R}$, since the inverse image of any open set in $(\mathbb{R},d_e)$ under $f$ is clearly an element of the power set, which coincides with the "discrete topology" generated by the discrete metric (see Example 2.7). □

We will further be interested in the conditions under which two different metrics on $X$ generate one and the same topology. This property defines an equivalence relation on the set of all metrics on $X$ and is hence referred to as equivalence of metrics. In other words, topologies generated by metrics on a carrier induce an equivalence relation.

4.8 Definition. Two metrics $d_1$ and $d_2$ on $X$ are called equivalent if $\tau(d_1) = \tau(d_2)$ (in notation, $d_1 \sim d_2$). □

4.9 Remark. Let $(X,d_1)$ and $(X,d_2)$ be two metric spaces and let $f: (X,d_1) \to (X,d_2)$ be the identity function ($f(x) = x$, $x \in X$). If $d_1$ and $d_2$ are equivalent, and therefore $\tau(d_1) = \tau(d_2)$, then for every open set $O$ in $(X,d_2)$ (and in $(X,d_1)$), $f^*(O) \in \tau(d_1)$. According to Theorem 4.4, this is equivalent to the statement that

$\lim_{n \to \infty} d_1(x_n,x) = 0$

implies that

$\lim_{n \to \infty} d_2(f(x_n),f(x)) = \lim_{n \to \infty} d_2(x_n,x) = 0$.
Thus, assuming (i) $\tau(d_1) = \tau(d_2)$, we showed that

(ii) $\lim_{n\to\infty} d_1(x_n,x) = 0 \iff \lim_{n\to\infty} d_2(x_n,x) = 0$

(the reverse implication follows by exchanging the roles of $d_1$ and $d_2$).

By Theorem 4.4, it follows that the converse is also true, i.e. that statement (ii) implies statement (i). Hence, we may call two metrics $d_1$ and $d_2$ on $X$ equivalent if (i) or (ii) holds. □

From Theorem 4.3, it also follows that the identity map above is continuous under equivalent metrics. However, an identity map need not be continuous if $d_1$ and $d_2$ are not equivalent.

4.10 Definitions.
(i) Let $A$ be a subset in a metric space $(X,d)$. The number
$$d(A) = \sup\{d(x,y) : x,y \in A\}$$
(more precisely, a real number or infinity) is called the diameter of $A$. The set $A$ is called $d$-bounded or just bounded if $d(A) < \infty$. In particular, the metric space $(X,d)$ or $d$ is called bounded if $X$ is bounded. $A$ is said to be unbounded if $d(A) = \infty$.

(ii) A subset $A$ in a metric space $(X,d)$ is called totally bounded if for every $\varepsilon > 0$, the set $A$ can be covered by finitely many $\varepsilon$-balls (i.e. balls with common radius $\varepsilon$). □

4.11 Example. According to Problem 1.4, the function
$$\rho(x,y) = \frac{d(x,y)}{1 + d(x,y)}$$
defined on a metric space $(X,d)$ is a metric on $X$. Obviously $\lim_{n\to\infty} d(x_n,x) = 0$ if and only if $\lim_{n\to\infty} \rho(x_n,x) = 0$ (due to $d = \frac{\rho}{1-\rho}$). Therefore, $d$ and $\rho$ are equivalent. Observe that $\rho$ is clearly bounded while $d$ is arbitrary. □

We finish this section with a short discussion of uniform continuity. This concept will be further developed in Section 6 and Chapter 3.

4.12 Definition. A function $f : (X,d) \to (Y,\rho)$ is called uniformly continuous on $X$ if for every $\varepsilon > 0$, there is a positive real number $\delta$ such that
$$d(x,y) < \delta \ \text{ implies that } \ \rho(f(x),f(y)) < \varepsilon, \quad \text{for every } x,y \in X.$$ □
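As a numerical illustration of Definition 4.12 (a sketch, not part of the original text), take $f(x) = x^2$ on $(\mathbb{R},d_e)$. Since $|x^2 - x_0^2| \le |x - x_0|(|x - x_0| + 2|x_0|)$, solving $\delta(\delta + 2|x_0|) = \varepsilon$ gives $\delta = \sqrt{x_0^2 + \varepsilon} - |x_0|$. The helper names `delta_for_square` and `works` are hypothetical. The computed values shrink as $|x_0|$ grows, which suggests why no single $\delta$ serves all of $\mathbb{R}$, while on a bounded set such as $[0,3]$ the value obtained at $|x_0| = 3$ works at every point.

```python
import math

def delta_for_square(x0: float, eps: float) -> float:
    """Positive root of delta * (delta + 2|x0|) = eps."""
    return math.sqrt(x0 * x0 + eps) - abs(x0)

def works(x0: float, eps: float, delta: float, samples: int = 1000) -> bool:
    """Check |x^2 - x0^2| < eps on sample points with |x - x0| < delta."""
    for k in range(1, samples):
        t = delta * k / samples
        for x in (x0 - t, x0 + t):
            if abs(x * x - x0 * x0) >= eps:
                return False
    return True

eps = 0.1
# The admissible delta depends on the point x0 and decreases as |x0| grows.
deltas = [delta_for_square(x0, eps) for x0 in (0.0, 1.0, 10.0, 100.0)]
# On [0, 3] the delta computed at the worst point x0 = 3 is good everywhere.
delta_on_0_3 = delta_for_square(3.0, eps)
```

The same `delta_on_0_3` fails far from the origin, for example at $x_0 = 100$, which matches the discussion of uniform continuity on bounded versus unbounded sets.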
Unlike continuity, uniform continuity guarantees the existence of such a positive $\delta$ (for every fixed $\varepsilon$) for all points of $X$ simultaneously. In the case of usual continuity, $\delta$ depends upon the particular point $x \in X$ at which continuity holds, so that a common $\delta$, good for all points $x \in X$, need not exist. Clearly, uniform continuity implies continuity. Uniform continuity can also be defined on some subset $A$ of $X$, so that in Definition 4.12, $X$ will be replaced by $A$.

4.13 Examples.
(i) Consider $f : (\mathbb{R},d_e) \to (\mathbb{R},d_e)$ such that $f(x) = x^2$. Then $|x - x_0| < \delta$ implies that
$$|x + x_0| = |x - x_0 + 2x_0| \le |x - x_0| + 2|x_0| < \delta + 2|x_0|$$
and
$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0| \cdot |x + x_0| < \delta(\delta + 2|x_0|).$$
Take $\delta(\delta + 2|x_0|)$ as $\varepsilon$. Then $\delta$ can be found explicitly as a function of $\varepsilon$ and $x_0$:
$$\delta = \sqrt{x_0^2 + \varepsilon} - |x_0|.$$
Therefore, the function $x^2$ is $d_e$-continuous at every point $x_0 \in \mathbb{R}$. However, $x^2$ is not uniformly continuous on $\mathbb{R}$, since $\delta$ depends upon $x_0$ as well. Specifically, $\delta \to 0$ when $x_0 \to \infty$. Indeed, for any fixed $\delta > 0$ and $x = x_0 + \delta/2$ with $x_0 > 0$ we have $|f(x) - f(x_0)| > \delta x_0$, which exceeds any given $\varepsilon$ for large $x_0$. Consequently, we cannot find a $\delta > 0$ good for all $x_0$.

(ii) Let $f(x) = x^2$ be given on the closed interval $[0,3]$, i.e. $f : ([0,3],d_e) \to (\mathbb{R},d_e)$. From the last inequality above we derive
$$|f(x) - f(x_0)| < \delta(\delta + 2|x_0|) \le \delta(\delta + 6),$$
and thus $\delta = \sqrt{9 + \varepsilon} - 3$, where $\varepsilon = \delta(\delta + 6)$. Thus $d_e(f(x),f(x_0)) < \varepsilon$ whenever $d_e(x,x_0) < \delta = \sqrt{9 + \varepsilon} - 3$. Since $\delta$ is independent of $x_0$, $f(x)$ is uniformly continuous. Observe that $f$ has been given on a closed and bounded interval, which provides the uniform continuity. However, in this case $f$ would also be uniformly continuous if $f$ were defined on any bounded but not necessarily closed interval, for instance $(0,3)$ (why?).

(iii) A continuous function can be uniformly continuous over unbounded sets, as for example, the functions $f(x) = \frac{1}{x}$, $x \in [1,\infty)$, and $f(x) = \sin x$, $x \in \mathbb{R}$. □

There is an analytical result, known as the Heine-Borel Theorem, stating that any continuous function defined on a closed and bounded set in any Euclidean metric space is also uniformly continuous. The general form of this result will be discussed in Section 6 (Theorem 6.13).

4.14 Remark. It is known from calculus that the space of all real-valued continuous functions defined on $\mathbb{R}^n$ is closed under the formation of the main algebraic operations. What if the functions were defined on an arbitrary space $(X,d)$? We give here some informal discussion on this matter. Let $\mathbb{R}^X$ be the space of all real-valued functions defined on a set $X$ and let $f,g \in \mathbb{R}^X$. Define the following.
(i) $f \pm g$ is the function such that for each point $x \in X$, $(f \pm g)(x) = f(x) \pm g(x)$.

(ii) $fg$ is the function such that $\forall x \in X$, $(fg)(x) = f(x) \cdot g(x)$.

(iii) $+\infty$ and $-\infty$ are not real numbers. Consequently, $f/g$ is the function such that for all $x \in X$, $(f/g)(x) = f(x)/g(x)$, excluding $x \in X$ for which $g(x) = 0$. At all those values, the function $f/g$ is either undefined or can be specified.

(iv) As a special case, any real-valued function multiplied by a real number is a real-valued function too.

(v) The associative (relative to multiplication) and distributive laws for functions, relative to the addition and multiplication defined in (i) and (ii), are consequences of the corresponding laws for real numbers.

Bearing in mind these observations, we conclude that the space $\mathbb{R}^X$ is a commutative algebra over $\mathbb{R}$ with unity and a vector lattice (that was also mentioned in Example 7.7 (ix), Chapter 1). A subset $\mathcal{C}((X,d);(\mathbb{R},\rho))$ (of $\mathbb{R}^X$) of all continuous functions is a subalgebra characterized by the following properties:

(a) $f,g \in \mathcal{C} \Rightarrow af + bg \in \mathcal{C}$, $\forall a,b \in \mathbb{R}$.

(b) $f,g \in \mathcal{C} \Rightarrow fg \in \mathcal{C}$.
PROBLEMS

4.1 Show that if $A$ is totally bounded then $A$ is bounded. Give an example where a bounded set is not totally bounded.

4.2 Prove that $\mathcal{C}$ is indeed a subalgebra with properties (a) and (b) above.

4.3 Show that a continuous bounded function on a bounded interval need not be uniformly continuous.

In the problems below it is assumed that $f$ and $g$ are functions from $(\mathbb{R},d_e)$ to $(\mathbb{R},d_e)$.

4.4 Let $f : ((-\infty,0),d_e) \to ((-\infty,0),d_e)$ be a function given by $f(x) = \frac{1}{x}$. Show that $f$ is continuous. Explain why $f(x)$ is not uniformly continuous.

4.5 Let $f : A \to \mathbb{R}$ be a differentiable function such that its derivative $f'$ is bounded over $A$, where $A$ is an arbitrary (bounded or unbounded) interval. Show that $f$ is uniformly continuous on $A$.

4.6 Show that if $f$ and $g$ are uniformly continuous on $\mathbb{R}$ and bounded then $fg$ is uniformly continuous on $\mathbb{R}$ too.

4.7 Which of the following functions are uniformly continuous?
a) $f(x) = \sin^2 x$ $(x \in \mathbb{R})$.
b) $f(x) = x^3 \cos x$ $(x \in \mathbb{R})$.
c) $f(x) = x \sin x$ $(x \in \mathbb{R})$.
d) $f(x) = \ln^2 x$ $(x \in [1,\infty))$.
e) $f(x) = x \ln x$ $(x \in (1,100))$.

4.8 Let $f$ be a continuous function and $g$ a uniformly continuous function on a set $A$ such that $|f| \le |g|$. Is $f$ then uniformly continuous?

4.9 Show that in $(\mathbb{R}^n,d_e)$, any bounded set is also totally bounded.
NEW TERMS:

function continuous at a point
continuous function on a set
inverse image of an open set under $f$
continuity criteria
topology generated by a metric
Dirichlet function
equivalent metrics
diameter of a set
bounded set
$d$-bounded set
unbounded set
totally bounded set
uniformly continuous function
algebra of functions
5. COMPLETE METRIC SPACES
In this section we will discuss the completeness of metric spaces as it was introduced in Definition 3. 1 (iv).
5.1 Theorem. Let $(X,d)$ be a complete metric space. Then a subspace $(A,d)$ is complete if and only if $A$ is closed.

Proof. Let $A$ be closed and let $\{x_n\} \subseteq A$ be any Cauchy sequence. Since $(X,d)$ is complete, there is a point $x \in X$ such that $\lim_{n\to\infty} x_n = x$. Then, by Corollary 3.4, $x \in A$. Thus, $(A,d)$ is complete. Now, let $(A,d)$ be complete and $\{x_n\}$ be any convergent sequence in $A$. Then this sequence is also a Cauchy sequence and hence $A$ contains its limit. Therefore, $A$ is closed, again by Corollary 3.4. □

The reader should be aware of the differences between the notions of completeness and closedness of a subspace. (See Problem 5.3.)

5.2 Theorem. A metric space $(X,d)$ is complete if and only if every nested sequence $\{C(x_n,r_n)\}$ of closed balls, with $r_n \downarrow 0$ as $n \to \infty$, has a nonempty intersection.

Proof. Let $\{C(x_n,r_n)\}$ be a nested sequence of closed balls with $r_n \downarrow 0$. Because $r_n \downarrow 0$, for any $\varepsilon > 0$, there is an integer $\nu$ such that $r_\nu < \frac{1}{2}\varepsilon$. Given that $k > n > \nu$,
$$x_k \in C(x_k,r_k) \subseteq C(x_n,r_n)$$
and, consequently,
$$d(x_k,x_n) \le r_n \le r_\nu < \tfrac{1}{2}\varepsilon.$$
Therefore, $\{x_n\}$ is a Cauchy sequence.

First assume that $(X,d)$ is complete. Then $\{x_n\}$ converges to a point, say $x \in X$. Since each ball $C(x_n,r_n)$ contains the tail of the sequence $\{x_n\}$ and because it is closed, it must contain $x$. Thus, $\bigcap_{n=1}^{\infty} C(x_n,r_n)$ contains $x$ and hence it is not empty.

Now, let any nested sequence of closed balls have a nonempty intersection and let $\{x_k\}$ be a Cauchy sequence in $X$. By Definition 3.1 (iii), this implies the existence of an increasing sequence $\{\nu_1,\nu_2,\ldots\}$ of indices of $\{x_k\}$ such that for each $n$,
$$d(x_j,x_k) < \frac{1}{2^{n+1}} \quad \text{for all } j,k \ge \nu_n.$$
We show that the sequence $\{C_n = C(x_{\nu_n}, \frac{1}{2^n})\}$ is nested. Indeed, let $y \in C_{n+1}$. Then
$$d(y,x_{\nu_{n+1}}) \le \frac{1}{2^{n+1}} \quad \text{and} \quad d(x_{\nu_n},x_{\nu_{n+1}}) < \frac{1}{2^{n+1}}.$$
Therefore,
$$d(y,x_{\nu_n}) < \frac{1}{2^n},$$
which yields that $y$ is an interior point of $C_n$ and thus $C_n \supseteq C_{n+1}$. Since by our assumption, the intersection $\bigcap_{n=1}^{\infty} C_n \ne \emptyset$, there is at least one point, say $x$, that belongs to all balls. Furthermore, because the sequence $\{r_n\}$ of their radii is convergent to zero, the subsequence $\{x_{\nu_n}\}$ of their centers must converge to $x \in X$ and thus, by Problem 3.9, $\{x_k\}$ also converges to $x$. □

5.3 Remark. Clearly, in the final phrase of the last theorem, the point $x$ is the unique point of the intersection $\bigcap_{n=1}^{\infty} C_n$. The theorem below is a useful refinement of this statement due to Georg Cantor. Because of its similarity with Theorem 5.2, its proof is suggested as an exercise (Problem 5.8). □

5.4 Theorem (Cantor). Let $(X,d)$ be a complete metric space and let $\{A_n\}\!\downarrow\ \subseteq X$ be a decreasing sequence of nonempty closed subsets with
$$\lim_{n\to\infty} d(A_n) = 0.$$
Then $\bigcap_{n=1}^{\infty} A_n$ consists of exactly one element. □
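Cantor's theorem can be watched at work in $(\mathbb{R},d_e)$, which is complete. The sketch below (an illustration with hypothetical names, not taken from the text) bisects $[1,2]$ so that every interval contains the positive solution of $x^2 = 2$. The intervals are closed and nested, and their diameters halve at each step, so the theorem leaves exactly one point in the intersection, namely $\sqrt{2}$. Exact rational arithmetic is used so that the nesting and the diameters can be checked without rounding.

```python
from fractions import Fraction

def nested_intervals(steps: int):
    """Closed intervals [lo, hi] with lo^2 <= 2 <= hi^2, halved at each step."""
    lo, hi = Fraction(1), Fraction(2)
    out = [(lo, hi)]
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid        # keep the right half
        else:
            hi = mid        # keep the left half
        out.append((lo, hi))
    return out

intervals = nested_intervals(30)
# Diameter d(A_n) of the n-th interval; it tends to zero as n grows.
diameters = [hi - lo for lo, hi in intervals]
```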
5.5 Definition. A function $[X,(Y,d),f]$ is called $d$-bounded if $Y$ is a linear space and there is a nonnegative real number $M$ such that
$$d(f(x),0(x)) \le M, \quad \forall x \in X,$$
where $0$ is the function identically equal to $\theta \in Y$ (the origin of $Y$). □

5.6 Examples.

(i) Let $X$ be a nonempty set, $(Y,d)$ a linear metric space, and let $\mathcal{B}^* = \mathcal{B}^*(X;(Y,d))$ be the set of all $d$-bounded functions from $X$ to $Y$. For all $f,g \in \mathcal{B}^*$ define
$$\rho(f,g) = \sup\{d(f(x),g(x)) : x \in X\}.$$
It can be shown (Problem 5.4) that $\rho$ is a metric on $\mathcal{B}^*$, called a uniform (or supremum) metric. Consequently, the convergence in $(\mathcal{B}^*,\rho)$ is called uniform convergence. A subset of functions $\mathcal{B} \subseteq \mathcal{B}^*$ is said to be uniformly bounded on $X$ if $\mathcal{B}$ is $\rho$-bounded, i.e., $\operatorname{diam}\mathcal{B} \le M$ (a positive real number). We show that any Cauchy sequence in $(\mathcal{B}^*,\rho)$ is uniformly bounded. We will make use of Problem 5.5. Let $\{f_n\}$ be a Cauchy sequence in $(\mathcal{B}^*,\rho)$. Therefore, for $\varepsilon = 1$, there is an $N = N(1)$ such that $\rho(f_n,f_k) < 1$, $n,k \ge N$. Let $k = N(1)$. Then, for $n \ge N$,
$$\rho(f_n,0) \le \rho(f_n,f_N) + \rho(f_N,0) < 1 + M(f_N),$$
where $M(f_N)$ is a "$\rho$-bound" of the function $f_N$. If $M(f_i)$ is a bound of $f_i$, then $M$, defined as $\max\{M(f_1),\ldots,M(f_{N-1}),1 + M(f_N)\}$, $\rho$-dominates the whole sequence $\{f_n\}$. By Problem 5.5, we have that $\{f_n\}$ is $\rho$-bounded.

(ii) Assume that $(Y,d)$ is a complete linear metric space. Let us show that $(\mathcal{B}^*,\rho)$ is then complete too. Consider a Cauchy sequence $\{f_n\} \subseteq (\mathcal{B}^*,\rho)$. It is obvious that for each fixed $x \in X$, the sequence $\{f_n(x)\}$ is also Cauchy in $(Y,d)$. Since $(Y,d)$ is by our assumption complete, the "pointwise limit" of $\{f_n\}$ exists. Denote it by $f$. In other words,
$$\lim_{n\to\infty} d(f_n(x),f(x)) = 0, \quad \forall x \in X.$$
We need to show that $f \in (\mathcal{B}^*,\rho)$. Since $\{f_n\}$ is a Cauchy sequence, according to (i) it is uniformly bounded by a real number $M$. Thus we have
$$d(f(x),0(x)) \le d(f(x),f_n(x)) + d(f_n(x),0(x)) \le d(f(x),f_n(x)) + \rho(f_n,0),$$
where $\rho(f_n,0) \le M$, i.e.
$$d(f(x),0(x)) \le d(f(x),f_n(x)) + M.$$
The last inequality holds for every $x \in X$ and every $n$; letting $n \to \infty$ yields
$$d(f(x),0(x)) \le M, \quad \text{for all } x \in X.$$
Consequently, $\rho(f,0) \le M$ and hence $f \in (\mathcal{B}^*,\rho)$. We only showed that $f_n(x) \to f(x)$ for each $x \in X$, and that $f \in \mathcal{B}^*$. The assertion $f_n \xrightarrow{\rho} f$ is subject to Problem 5.6. □
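To make the uniform metric of Example 5.6 (i) concrete, the sketch below takes $X$ to be a finite grid in $[0,1]$ (an assumption that turns the supremum into a maximum) and $Y = \mathbb{R}$ with the Euclidean metric. The functions $f_n(x) = x/n$ and the names `rho`, `f_n`, `zero` are illustrative choices, not taken from the text. The distances $\rho(f_n,0)$ come out as $1/n$, so this sequence converges to the zero function uniformly.

```python
from fractions import Fraction

# Finite sample of [0, 1]; on a finite set the supremum is a maximum.
X = [Fraction(k, 100) for k in range(101)]

def rho(f, g, points=X):
    """Uniform (supremum) distance between f and g over a finite set."""
    return max(abs(f(x) - g(x)) for x in points)

def f_n(n: int):
    """The function x -> x / n."""
    return lambda x: x / n

def zero(x):
    """The function identically equal to the origin of Y = R."""
    return Fraction(0)

# rho(f_n, 0) for n = 1, ..., 5
distances = [rho(f_n(n), zero) for n in range(1, 6)]
```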
PROBLEMS

5.1 Using similar arguments as in Example 5.6, show that the limit of any uniformly convergent sequence of continuous bounded functions from $(X,d_0)$ to $(Y,d)$ is a bounded and continuous function.

5.2 Let $\{C_n\}$ be a sequence of closed balls in $(\mathbb{R}^n,d_e)$ such that each of the balls $C_n$ is centered at a point $x_0 \in \mathbb{R}^n$ and has radius $\frac{1}{n}$, $n = 1,2,\ldots$. Find $\bigcap_{n=1}^{\infty} C_n$.

5.3 Show that if a metric space $(X,d)$ is not complete then a closed subspace $(A,d)$ need not be complete either. [Hint: Consider the metric space in Problems 2.9 and 2.10.]

5.4 Show that $\rho$, defined in Example 5.6 (i), is a metric on $\mathcal{B}^*$.

5.5 Let $\mathcal{B} \subseteq \mathcal{B}^*(X;(Y,d))$, where $Y$ is a linear space. Prove that $\mathcal{B}$ is $\rho$-bounded if and only if there is a positive constant $M$ such that for all $f \in \mathcal{B}$, $\rho(f,0) \le M$.

5.6 Show that in Example 5.6 (ii) $f_n \xrightarrow{\rho} f$.

5.7 We can make use of the fact that the Euclidean and uniform metrics are equivalent to show completeness of $(\mathbb{R}^n,d_e)$. For $n = 1$, it is well known from calculus. Prove completeness of $(\mathbb{R}^n,d_e)$ for an arbitrary $n$. (See Problem 4.9.)

5.8 Prove Cantor's Theorem 5.4.

5.9 Let $(X,d)$ be a metric space. A subset $A \subseteq X$ is said to be of the first category if it can be represented as a countable union of nowhere dense sets. Otherwise, $A$ is of the second category. Prove Baire's Category Theorem: A complete metric space is of the second category.
NEW TERMS:

completeness criteria
Cantor's Theorem on intersection of closed sets
$d$-bounded function
bounded function
uniform metric
supremum metric
uniform convergence
uniformly bounded set of functions
$\rho$-bound of a function
bound of a function
pointwise limit
Baire's Category Theorem
6. COMPACTNESS
Compactness is one of the core concepts in real analysis. We develop it in the present section for metric spaces and then in Chapter 3 for general topological spaces. It stems from the fact known in $\mathbb{R}$ that every bounded sequence has a convergent subsequence, which implies that any sequence in a closed bounded interval has a subsequence convergent to a point in this interval. In a general metric space, a subset $A$ in which every sequence has a subsequence convergent to a point in $A$ is called sequentially compact or compact. Although compactness and sequential compactness are distinct notions in general topological spaces (and they are defined differently), they are equivalent in metric spaces, as Theorem 6.3 states. Continuous functions defined on compact sets are uniformly continuous; continuous images of compact sets are compact (hence, closed and bounded), and this means that in normed linear spaces continuous functions on compact sets reach their maximum values. Further applications lead to the celebrated Ascoli and Ascoli-Arzelà theorems.

6.1 Definitions.
(i) A family of sets $\{A_i : i \in I\} \subseteq (X,d)$ is called a cover of a set $A \subseteq X$ if
$$A \subseteq \bigcup_{i \in I} A_i.$$
Any subfamily of $\{A_i : i \in I\}$ which covers $A$ is called a subcover of $A$. If $\{A_i : i \in I\}$ is a family of open sets, then the corresponding cover (or subcover) is called an open cover (or an open subcover).

(ii) A set $A \subseteq (X,d)$ is called compact if any open cover of $A$ has within itself a finite subcover of $A$, or we will also say that "any open cover of $A$ can be reduced to a finite subcover of $A$." Correspondingly, $(X,d)$ is a compact metric space if $X$ is compact. [Notice that any finite subset is compact. Consequently, to avoid triviality, in all theorems below we will assume that the sets or spaces under consideration are infinite.]

(iii) A set $A \subseteq (X,d)$ is called a Lindelöf set if any open cover of $A$ contains an at most countable subcover of $A$ (or "can be reduced to an at most countable subcover"). $(X,d)$ is called a Lindelöf space if $X$ is a Lindelöf set. □

A noteworthy property of Euclidean spaces is given in the following classical result.

6.2 Theorem (Lindelöf). $(\mathbb{R}^n,d_e)$ is a Lindelöf space. (See Problem 6.7.)

6.3 Theorem. For a set $A \subseteq (X,d)$, the following statements are equivalent.

(i) $A$ is compact.

(ii) Every infinite subset of $A$ has an accumulation point in $A$ (in this case $A$ is called Bolzano-Weierstrass compact).

(iii) Every sequence in $A$ has a subsequence that converges in $A$ ($A$ is called sequentially compact). □

The sequential compactness of a subspace implies its completeness. (See Problem 6.6.) The proofs of the above statements are left for the reader. (Problem 6.8.)

Definition 6.4. A metric space is called separable if it has a dense countable subset. □

Example 6.5. The Euclidean metric space $(\mathbb{R},d_e)$ is separable. A relevant dense countable subset of $\mathbb{R}$ would be $\mathbb{Q}$, the set of rational numbers. Another example is the $n$-dimensional Euclidean metric space with the countable, dense subset $\mathbb{Q}^n$. □

Theorem 6.6. Any compact metric space is separable.
Proof. Let $X$ be compact. It is easy to see that for each $n \in \mathbb{N}$, $X$ can be covered by the family of open balls centered at every $x \in X$ with radius $\frac{1}{n}$. Since $(X,d)$ is compact, this open cover can be reduced to a finite subcover, such that $\bigcup_{x \in F_n} B(x,\frac{1}{n})$ contains $X$, where $F_n = \{x_1^n,\ldots,x_{k_n}^n\}$. Denote $F = \bigcup_{n=1}^{\infty} F_n$, which is obviously a countable subset of $X$. We show that $F$ is dense in $X$, i.e., $\overline{F} = X$. It is sufficient to prove that, for each $y \in X$ and $r > 0$, the open ball $B(y,r)$ contains at least one point of the set $F$, i.e., $y$ is a closure point of $F$. Choosing such $y$ and $r$ we take any $n$ such that $\frac{1}{n} < r$. Since
$$y \in X \subseteq \bigcup_{x \in F_n} B(x,\tfrac{1}{n}),$$
there is a point $x_j^n \in F_n$ such that $y \in B(x_j^n,\frac{1}{n}) \subseteq B(x_j^n,r)$. This implies that $d(x_j^n,y) < r$ and, therefore, $x_j^n \in B(y,r)$. Consequently, $B(y,r) \cap F_n \ne \emptyset$ and $B(y,r) \cap F \ne \emptyset$. The proof of the statement is complete. □

The following two theorems belong to the central results in analysis.

Theorem 6.7.
Let $A \subseteq (X,d)$ be compact. Then $A$ is closed and bounded.

Proof.

1) We show that $A$ is bounded. Obviously $A$ is covered by the family of open balls $\{B(x,1) : x \in A\}$. Since $A$ is compact, this open cover can be reduced to a finite subcover, i.e. $A \subseteq \bigcup_{k=1}^{h} B(x_k,1)$ for some integer $h$. Let $M = \max\{d(x_i,x_j) : i,j = 1,\ldots,h\}$. Then $M$ is finite. For any $x,y \in A$, there are $x_i$ and $x_j$ such that $x \in B(x_i,1)$ and $y \in B(x_j,1)$. The following holds due to the triangle inequality:
$$d(x,y) \le d(x,x_i) + d(x_i,x_j) + d(x_j,y) < M + 2.$$
Therefore, $A$ is bounded.

2) We show that $A$ is closed, i.e. that $\overline{A} = A$. Let $x \in \overline{A}$. By Theorem 3.3, there exists a sequence $\{x_n\} \subseteq A$ such that $x_n \to x$. By Theorem 6.3, if $A$ is compact, every sequence $\{x_n\} \subseteq A$ has a subsequence that converges in $A$. By Problem 3.9, such a subsequence must have the same limit as $\{x_n\}$, i.e., $x \in A$. Therefore, $A$ is closed. □

Theorem 6.8 (Heine-Borel). A set $A \subseteq (\mathbb{R}^n,d_e)$ is compact if and only if $A$ is closed and bounded.

Proof.

1) If $A$ is compact it is closed and bounded as a special case of Theorem 6.7.

2) If $A \subseteq (\mathbb{R}^n,d_e)$ is closed and bounded, $d(x,y) \le M < \infty$ $\forall x,y \in A$. Fix a $y \in A$ and take any $a = (a_1,\ldots,a_n) \in A$. Then we have
$$|a_i| = \sqrt{a_i^2} \le \sqrt{\sum_{k=1}^{n} (a_k - 0)^2} = d(a,\theta) \le d(a,y) + d(y,\theta) \le M + d(y,\theta), \quad i = 1,\ldots,n,$$
where $\theta$ denotes the origin in $\mathbb{R}^n$. Since $y$ has a finite distance to the origin, every other point of $A$, like $a$, also has a finite distance to the origin, bounded by $M + d(y,\theta)$. Note that even though $d(a,\theta) < \infty$ for points of unbounded sets as well, $d(a,\theta)$ would not have a uniform bound unless $A$ is bounded.

Now we show that any $d_e$-bounded sequence in $\mathbb{R}^n$ has a convergent subsequence. The considerations below represent an appropriate selection procedure. Let $\{x_k\} \subseteq A$. Then $\{x_k^i\} \subseteq \mathbb{R}$ is a bounded sequence of $i$-coordinates (the $i$th-component sequence), $i = 1,\ldots,n$. A bounded sequence does not necessarily converge but does have a convergent subsequence.
For $i = 1$, let such a subsequence be $\{x_{r_1}^1,x_{r_2}^1,\ldots\}$ with the limit point $x^1$. Select from the 2nd-component sequence the subsequence with the same indices, $\{x_{r_1}^2,x_{r_2}^2,\ldots\}$. This subsequence is also bounded and hence contains a convergent subsequence $\{x_{k_1}^2,x_{k_2}^2,\ldots\}$ with a limit point $x^2$, so that the set of indices $\{k_1,k_2,\ldots\} \subseteq \{r_1,r_2,\ldots\}$. If we return to the subsequence $\{x_{r_1}^1,x_{r_2}^1,\ldots\}$ and select from it the subsequence $\{x_{k_1}^1,x_{k_2}^1,\ldots\}$, then this subsequence is also convergent and has the same limit $x^1$. We can continue this process by taking the 3rd-component sequence, selecting the subsequence $\{x_{k_1}^3,x_{k_2}^3,\ldots\}$ and from this sequence a convergent subsequence with a limit point $x^3$. Then the above 1st- and 2nd-component subsequences will be reduced to the ones with the indices from the third selection, and so on. Let $x = (x^1,\ldots,x^n)$ be the limit of the selected subsequence of $\{x_k\}$. Since $A$ is closed, $x$ must belong to $A$. Therefore, we have proven that an arbitrary sequence in $A$ has a convergent subsequence in $A$, i.e. that $A$ is sequentially compact. By Theorem 6.3, $A$ is compact. □

6.9 Remark. The second part of the Heine-Borel Theorem does not hold for general metric spaces. That is, if $A$ is closed and bounded, it need not be compact. For example, let $X$ be an infinite set and let $d$ be the discrete metric (which is finite) on $X$. Then $X$ is closed and bounded. Now consider
$$X \subseteq \bigcup_{x \in X} B(x,1).$$
Since each of the balls covers just one point, the open cover $\{B(x,1) : x \in X\}$ cannot be reduced to a finite subcover. Therefore, $X$ is not compact. □

6.10 Theorem. Let $f : (X,d) \to (Y,\rho)$ be a continuous, surjective function and let $(X,d)$ be compact. Then the image $f_*(X) = Y$ is compact. Consequently, the image of a compact set under a continuous function is compact.

Proof. Take any open cover $\{O_i : i \in I\}$ of $Y$ to have
$$Y = f_*(X) \subseteq \bigcup_{i \in I} O_i.$$
Then,
$$X = f^*(f_*(X)) \subseteq \bigcup_{i \in I} f^*(O_i).$$
Since $f$ is continuous, $f^*(O_i)$ is open, and because $X$ is compact, there is a finite subcover of sets $f^*(O_i)$, without loss of generality indexed by $1,\ldots,n$, i.e.,
$$X \subseteq \bigcup_{k=1}^{n} f^*(O_k).$$
Therefore,
$$f_*(X) \subseteq \bigcup_{k=1}^{n} f_*(f^*(O_k)) = \bigcup_{k=1}^{n} O_k.$$
(By Problem 2.7, Chapter 1, since $f$ is surjective.) □

6.11 Remark. Let $f : (X,d) \to (\mathbb{R},d_e)$ be a continuous map and let $A \subseteq (X,d)$ be compact. Then by Theorem 6.10, $f(A)$ is compact in $(\mathbb{R},d_e)$. By Theorem 6.7, $f(A)$ is then closed and bounded, which means that the diameter of $f(A)$ equals some $M < \infty$. As mentioned in part 2 of the proof of the Heine-Borel Theorem, this implies that all points of $f(A)$ have a finite distance (i.e., are bounded by some $M_0$) to the origin, or equivalently, $|f(x)| \le M_0$ for all $x \in A$. Moreover, since $f(A)$ is closed, it contains its supremum and its infimum. We have therefore shown that a continuous real-valued map on a compact set assumes a minimum and a maximum value. □

6.12 Examples.

(i) In $(\mathbb{R},d_e)$, $\mathbb{R}$ is closed but not bounded. Therefore, by the Heine-Borel Theorem, $\mathbb{R}$ is not compact.

(ii) Take as $A \subseteq (\mathbb{R},d_e)$ the set $(0,1]$, which is bounded but not closed and therefore is not compact. Consider the open cover of $A$ given by the family of sets $\{(\frac{1}{n},2) : n = 1,2,\ldots\}$. Obviously,
$$\bigcup_{n=1}^{\infty} \left(\tfrac{1}{n},2\right) = (0,2) \supseteq A.$$
It is not possible to select any finite subcover of $A$, for the union of any finite subfamily is an interval $(\frac{1}{N},2)$, which misses the points of $(0,\frac{1}{N}]$. Yet another argument that $A$ is not compact (by Theorem 6.3) is that the sequence $\{\frac{1}{n}\}$ has no subsequence that converges in $A$. □

A continuous function need not be uniformly continuous, unless it is defined on a compact set, as the following theorem states.
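A concrete case shows why compactness matters here. The sketch below is an illustration under assumed choices, not an example from the text: it takes $f(x) = 1/x$ on the non-compact set $(0,1]$ of Example 6.12 (ii) and, for any proposed $\delta > 0$, returns two points of $(0,1]$ closer than $\delta$ whose images are exactly 1 apart. Hence no $\delta$ works for $\varepsilon = 1$, although $f$ is continuous at every point. The function name `witness` is hypothetical.

```python
from fractions import Fraction

def witness(delta: Fraction):
    """Points x, y in (0, 1] with |x - y| < delta but |1/x - 1/y| = 1."""
    n = 1
    # |1/n - 1/(n+1)| = 1/(n(n+1)); increase n until this gap is below delta.
    while Fraction(1, n * (n + 1)) >= delta:
        n += 1
    return Fraction(1, n), Fraction(1, n + 1)

# One witnessing pair for each proposed delta.
pairs = {d: witness(d) for d in (Fraction(1, 10), Fraction(1, 1000))}
```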
6.13 Theorem. Let $f : (X,d) \to (Y,\rho)$ be a continuous function and let $(X,d)$ be compact. Then $f$ is uniformly continuous on $X$.

Proof. Let $f$ be continuous at $x$. Then, for each $\varepsilon > 0$, there is a $\delta_x > 0$ such that
$$\rho(f(x),f(y)) < \frac{\varepsilon}{2}$$
for all $y$ with $d(x,y) < \delta_x$. Since $X$ is compact, after reduction, there is an $n$-tuple of open balls such that
$$X \subseteq \bigcup_{i=1}^{n} B(x_i,\delta_{x_i}/2).$$
Let $\delta = \frac{1}{2}\min\{\delta_{x_1},\ldots,\delta_{x_n}\}$ and let $x,y$ be such that $d(x,y) < \delta$. Then $x \in B(x_i,\delta_{x_i}/2)$ for some $i$, which implies that
$$d(x,x_i) < \frac{\delta_{x_i}}{2}$$
and
$$d(y,x_i) \le d(y,x) + d(x,x_i) < \delta + \frac{\delta_{x_i}}{2} \le \delta_{x_i}.$$
Thus, $y$ belongs to the ball $B(x_i,\delta_{x_i})$. Since $y$ and $x_i$ are within the distance of $\delta_{x_i}$, due to continuity of $f$ at $x_i$, given $\varepsilon$,
$$\rho(f(x_i),f(y)) < \frac{\varepsilon}{2}.$$
Obviously, $d(x,x_i) < \delta_{x_i}$ yields $\rho(f(x_i),f(x)) < \frac{\varepsilon}{2}$ and, therefore,
$$\rho(f(x),f(y)) \le \rho(f(x_i),f(x)) + \rho(f(x_i),f(y)) < \varepsilon.$$ □

6.14 Theorem. A metric space $(X,d)$ is compact if and only if it is complete and totally bounded.
Proof.

1) Let $(X,d)$ be compact. Then by Problem 6.6, it is complete. Since $X \subseteq \bigcup_{x \in X} B(x,\varepsilon)$ for every $\varepsilon > 0$, by compactness, the cover $\{B(x,\varepsilon) : x \in X\}$ can be reduced to a finite subcover, which implies total boundedness.

2) Let $(X,d)$ be complete and totally bounded. We will show that $(X,d)$ is sequentially compact, which, by Theorem 6.3, would imply compactness. Let $\{x_n\}$ be a sequence in $X$. We will construct a Cauchy subsequence. Since $X$ is totally bounded, it can be covered by finitely many open balls of radius 1. Then at least one of the balls, for instance $B_1$, contains infinitely many terms, say $\{x_n^1\}$, of this sequence. Furthermore, cover $X$ by balls of radius $\frac{1}{2}$; again an infinite subsequence $\{x_n^2\} \subseteq \{x_n^1\}$ (since $B_1$ will also be covered) is contained in one of the balls, which we label $B_2$, and so on. The desired Cauchy sequence is formed by the selection of the first term from each subsequence. Indeed, by the construction, $x_1^1$ and $x_1^2$ belong to the ball $B_1$. Thus, $d(x_1^1,x_1^2) < 2$. Similarly, $x_1^2$ and $x_1^3$ belong to the ball $B_2$, which implies that $d(x_1^2,x_1^3) < 1$, and so on. In general, all terms selected from the $k$th step onward lie in $B_k$, so their mutual distances are less than twice its radius. Since $(X,d)$ is complete, this Cauchy sequence is convergent, yielding sequential compactness of $(X,d)$. □
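Total boundedness, used in part 2 of the proof, asks for a finite cover by $\varepsilon$-balls for every $\varepsilon > 0$. As an illustration (a sketch with hypothetical names, assuming the set in question is the unit square in $(\mathbb{R}^2,d_e)$), a square grid with step $h \le \varepsilon$ has every point of the square within $h/\sqrt{2} < \varepsilon$ of some grid node, so the finitely many $\varepsilon$-balls centered at the nodes cover the square. The random sample only spot-checks this bound; it is not a proof.

```python
import math
import random

def epsilon_net(eps: float):
    """Centers of finitely many eps-balls covering [0, 1] x [0, 1]."""
    m = math.ceil(1.0 / eps)          # grid step 1/m is at most eps
    return [(i / m, j / m) for i in range(m + 1) for j in range(m + 1)]

def distance_to_net(p, net):
    """Euclidean distance from the point p to the nearest center."""
    return min(math.dist(p, c) for c in net)

random.seed(0)
net = epsilon_net(0.25)
samples = [(random.random(), random.random()) for _ in range(2000)]
# Largest observed distance from a sample point to the net.
worst = max(distance_to_net(p, net) for p in samples)
```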
PROBLEMS

6.1 Show that if $\{x_k\} \subseteq (\mathbb{R}^n,d_e)$ with $d(x_k,\theta) < 3$, then $\{x_k\}$ has a convergent subsequence.

6.2 Define $\forall A,B \subseteq (X,d)$, $d(A,B) = \inf\{d(a,b) : a \in A, b \in B\}$. Let $A$ be compact. Show that $\forall B \subseteq X$, there is an $x \in A$ such that $d(x,B) = d(A,B)$. [Hint: Use the fact that $A$ is sequentially compact.]

6.3 Let $A,B \subseteq (X,d)$ such that $A$ is compact and $B$ is closed. If $A \cap B = \emptyset$, show that $d(A,B) > 0$.

6.4 Let $A \subseteq (X,d)$. Show that if $A$ is totally bounded then $\overline{A}$ is also totally bounded.

6.5 Generalize Theorem 6.6: Any Lindelöf metric space is separable.

6.6 Show that sequential compactness of a subspace implies its completeness.

6.7 Prove Theorem 6.2.

6.8 Prove Theorem 6.3.
NEW TERMS:

cover
subcover
open cover
open subcover
compact set
compact metric space
Lindelöf set
Lindelöf space
compactness, criteria of
Bolzano-Weierstrass compactness
sequential compactness
separable metric space
Heine-Borel Theorem
compact set under a continuous function
uniform continuity criterion in compact space
7. LINEAR AND NORMED LINEAR SPACES
We have already mentioned that the Euclidean metric defines the length of a vector in $n$-dimensional Euclidean vector (linear) space. The following generalizes the notion of vector length in a linear space and reconciles it with the notion of a special metric defined on a linear space (initially discussed in Section 5).

7.1 Definition. Let $(X,d)$ be a metric space such that $X$ is a linear space over $\mathbb{F}$ ($\mathbb{R}$ or $\mathbb{C}$). The metric $d$ is said to be:

a) translation invariant if for all $a,x,y \in X$, $d(x + a,y + a) = d(x,y)$.

b) homothetic if for all $\alpha \in \mathbb{F}$ and $x,y \in X$, $d(\alpha x,\alpha y) = |\alpha|\, d(x,y)$.

If $d$ is translation invariant and homothetic we will abbreviate it by TIH. □

If $d$ is a metric on a linear space $X$, then we are able to measure the length of vectors, and thus compare them, by taking the distance from any point $x \in X$ to one fixed point of $X$, the origin. If, in addition, $d$ is TIH then we can use the properties of $X$ as a linear space, and in some particular cases, employ even the geometry, thereby replicating the Euclidean space and preserving the generality needed in applications.

7.2 Definition. Let $d$ be a TIH metric on a linear space $X$, with the origin $\theta$, over $\mathbb{F}$ (assuming that $\mathbb{F}$ is $\mathbb{R}$ or $\mathbb{C}$). Then for all $x \in X$, we call the distance $d(x,\theta)$ the norm of the vector $x$ and denote it by $\|x\|$. We will also call $\|\cdot\|$ the norm on $X$ induced by the TIH metric $d$. The pair $(X,\|\cdot\|)$ is called a normed linear space (NLS). □

7.3 Theorem. Let $\|\cdot\|$ be a norm on $X$ in Definition 7.2. Then the following properties of $\|\cdot\|$ hold true:

(i) $\|x\| = 0 \iff x = \theta$.

(ii) $\|\alpha x\| = |\alpha|\,\|x\|$, $\forall \alpha \in \mathbb{F}$, $\forall x \in X$.

(iii) $\|x + y\| \le \|x\| + \|y\|$, $\forall x,y \in X$.

Proof.
Property (i) is obvious.

(ii) $\|\alpha x\| = d(\alpha x,\theta) = d(\alpha x,\alpha\theta) = |\alpha|\, d(x,\theta) = |\alpha|\,\|x\|$.

(iii) $\|x + y\| = d(x + y,\theta) = d(x,-y) \le d(x,\theta) + d(\theta,-y) = \|x\| + |-1|\,\|y\| = \|x\| + \|y\|$. □

Conversely, if $\|\cdot\|$ is a real-valued nonnegative function defined on a linear space $X$ and has properties (i)-(iii) of Theorem 7.3, then $\|\cdot\|$ generates a TIH metric on $X$ by setting
$$d(x,y) = \|x - y\|$$
(show it, see Problem 7.10). If $d$ in Definition 7.2 is a TIH pseudometric then the function $\|\cdot\|$ is called a semi-norm and correspondingly, the pair $(X,\|\cdot\|)$ is called a semi-normed linear space (SNLS).

It is easy to show that the Euclidean metric $d_e$ on $\mathbb{R}^n$ is TIH. The associated norm induced by $d_e$ is called the Euclidean norm and it will be denoted $\|\cdot\|_e$. A very important class of NLS's is introduced below.

7.4 Definition. An NLS is called a Banach space if it is complete with respect to the metric induced by the norm (or the norm induced by a TIH metric). □
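The two defining properties of a TIH metric in Definition 7.1 can be spot-checked numerically. The sketch below (illustrative names; the random sample vectors in $\mathbb{R}^3$ and the scalar $\alpha = -2.5$ are assumptions) checks on one sample that the Euclidean metric is translation invariant and homothetic, whereas the discrete metric is translation invariant but not homothetic. Only the former therefore induces a norm through $\|x\| = d(x,\theta)$.

```python
import math
import random

def d_e(x, y):
    """Euclidean metric on R^n."""
    return math.dist(x, y)

def d_discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0.0 if x == y else 1.0

def add(x, a):
    return tuple(xi + ai for xi, ai in zip(x, a))

def scale(alpha, x):
    return tuple(alpha * xi for xi in x)

def is_translation_invariant(d, x, y, a, tol=1e-9):
    """Check d(x + a, y + a) = d(x, y) on the given sample."""
    return abs(d(add(x, a), add(y, a)) - d(x, y)) <= tol

def is_homothetic(d, x, y, alpha, tol=1e-9):
    """Check d(alpha x, alpha y) = |alpha| d(x, y) on the given sample."""
    return abs(d(scale(alpha, x), scale(alpha, y)) - abs(alpha) * d(x, y)) <= tol

def random_vector(n=3):
    return tuple(random.uniform(-5, 5) for _ in range(n))

random.seed(1)
x, y, a, alpha = random_vector(), random_vector(), random_vector(), -2.5
theta = (0.0, 0.0, 0.0)
norm_x = d_e(x, theta)     # the norm induced by the TIH metric d_e
```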
7.5 Examples.

(i) The NLS $(\mathbb{R}^n,\|\cdot\|_e)$ over the field $\mathbb{R}$ with
$$\|x\|_e = \sqrt{\sum_{i=1}^{n} x_i^2}$$
is a Banach space with the Euclidean norm (see Problem 7.1).

(ii) The NLS $l^p$ over the field $\mathbb{C}$ with the norm
$$\|x\|_p = \left[\sum_{n=1}^{\infty} |x_n|^p\right]^{1/p}$$
is a Banach space. Observe that $\|\cdot\|_p$ indeed defines a norm (called the $l^p$ norm). (See Problem 7.5.) Now let $\{x^{(n)}\}$ be a Cauchy sequence. Then this sequence is uniformly bounded (show it in Problem 7.6), say, by some $M \in \mathbb{R}_+$. Let $x = (x_1,x_2,\ldots)$ be the pointwise limit of the sequence $\{x^{(n)}\}$. This limit exists, since each $x_i$ is the limit of the $i$th-component sequence in $(\mathbb{C},d_e)$, which is complete. We need to show that $x$ is an element of $l^p$, i.e. $\|x\|_p < \infty$, and that $x^{(n)} \xrightarrow{l^p} x$ (i.e. $\{x^{(n)}\}$ converges to $x$ in the $l^p$ norm). We have, by Minkowski's inequality with $a_k = x_k - x_k^{(n)}$ and $b_k = x_k^{(n)}$,
$$\left[\sum_{k=1}^{r} |x_k|^p\right]^{1/p} = \left[\sum_{k=1}^{r} |x_k - x_k^{(n)} + x_k^{(n)}|^p\right]^{1/p} \le \left[\sum_{k=1}^{r} |x_k - x_k^{(n)}|^p\right]^{1/p} + \left[\sum_{k=1}^{r} |x_k^{(n)}|^p\right]^{1/p}$$
$$\le \left[\sum_{k=1}^{r} |x_k - x_k^{(n)}|^p\right]^{1/p} + \|x^{(n)}\|_p \le \left[\sum_{k=1}^{r} |x_k - x_k^{(n)}|^p\right]^{1/p} + M.$$
Now, letting $n \to \infty$, we have
$$\left[\sum_{k=1}^{r} |x_k|^p\right]^{1/p} \le M,$$
which holds for all $r = 1,2,\ldots$. Hence, we have $\|x\|_p \le M$. Show that $x^{(n)} \to x$ in the $l^p$ norm (Problem 7.7). Thus, $l^p$ is complete and therefore is a Banach space.

(iii) Let $\mathcal{B}^*(\Omega)$ be the space of all bounded functions on $\Omega$ valued in $(\mathbb{R},d_e)$ or $(\mathbb{C},d_e)$. One can show that $\mathcal{B}^*$ is a linear space. The norm $\|f\|_u = \sup\{|f(\omega)| : \omega \in \Omega\}$ is called the supremum norm. $\mathcal{B}^*$ is a Banach space with respect to this norm (see Problem 7.4).

(iv) Consider $\mathcal{C}^n[a,b]$ as the space of all $n$-times differentiable real-valued functions on a compact interval $[a,b]$. It is easily seen that $\mathcal{C}^n[a,b]$ is a linear space. We introduce the following norm in $\mathcal{C}^n[a,b]$:
(
a, b
Clearly, II . II E is a norm in e ra , b ] • We show that e ra , b ] is a Banach space under this norm. Let { f k } be a I I II E-Cauchy sequence. Then, for every £ > 0, there is a positive integer N such that \lk,j > N, ·
which implies

$$\|f_j^{(i)} - f_k^{(i)}\|_u = \sup\{|f_j^{(i)}(t) - f_k^{(i)}(t)| : t \in [a,b]\} < \varepsilon, \quad i = 0, 1, \ldots, n.$$

Therefore, by the well-known theorem from calculus (cf. Theorem 4.2, p. 508, in Fisher [1983]), there exists a function $g_i : [a,b] \to \mathbb{R}$ to which the sequence $\{f_j^{(i)} : j = 1, 2, \ldots\}$ converges uniformly, and $g_i$ is continuous, $i = 0, 1, \ldots, n$. On the other hand, it holds that

$$f_k^{(i-1)}(x) - f_k^{(i-1)}(a) = \int_{[a,x]} f_k^{(i)}(u)\,du, \quad i = 1, \ldots, n, \quad k = 1, 2, \ldots.$$

Let $k \to \infty$ in the above equation. Since the convergence is uniform, we may interchange the limit and the integral (a more rigorous justification is due to the Lebesgue Dominated Convergence Theorem in Chapter 6) and have

$$g_{i-1}(x) - g_{i-1}(a) = \int_{[a,x]} g_i(u)\,du, \quad i = 1, \ldots, n.$$

Consequently, we conclude that $g_{i-1}$ is differentiable on $[a,b]$ and $g_{i-1}'(x) = g_i(x)$. Thus $g_0 \in \mathcal{C}^n[a,b]$ and $\|f_k - g_0\|_E \to 0$, implying that $\mathcal{C}^n[a,b]$ is a Banach space. □
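The two inequalities that drive the $l^p$ completeness argument in Example 7.5 (ii) can be checked numerically on finite truncations. The following is only a minimal sketch: the helper names `lp_norm` and `sup_norm` and the sample vectors are illustrative assumptions, not part of the text.

```python
def lp_norm(x, p):
    """p-norm of a finite sequence x (a truncation of an element of l^p), p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sup_norm(x):
    """Supremum norm of a finite sequence, as in Example 7.5 (iii)."""
    return max(abs(t) for t in x)

# Hypothetical sample data, chosen only for illustration.
x = [1.0, -2.0, 0.5, 3.0]
y = [0.5, 4.0, -1.0, -2.0]
p = 3

# Minkowski's inequality: ||x + y||_p <= ||x||_p + ||y||_p.
lhs = lp_norm([a + b for a, b in zip(x, y)], p)
rhs = lp_norm(x, p) + lp_norm(y, p)
minkowski_holds = lhs <= rhs + 1e-12

# The r-term partial norms never decrease with r and never exceed the full
# norm, which is why a bound valid for every r passes to ||x||_p.
partial = [lp_norm(x[:r], p) for r in range(1, len(x) + 1)]

print(minkowski_holds, partial[-1] == lp_norm(x, p))
```

A floating-point tolerance is added to the comparison because the sums are computed in finite precision; the inequality itself is exact.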
7.6 Definitions.

(i) Let $X$ and $Y$ be linear spaces over a field $\mathbb{F}$. A map $A : X \to Y$ is called a linear operator (with respect to $\mathbb{F}$) if $A(\alpha x + \beta y) = \alpha A(x) + \beta A(y)$ for all $x, y \in X$ and all $\alpha, \beta \in \mathbb{F}$.

(ii) A linear map $f : X \to \mathbb{F}$ (where $X$ is a linear space over a field $\mathbb{F}$) is called a linear functional.

(iii) Replacing a field $\mathbb{F}$ in (i) and (ii) by a semifield $\mathbb{F}_+$, we have the notions of a semi-linear operator and a semi-linear functional, respectively.

PROBLEMS

7.1 Show that $(\mathbb{R}^n, \|\cdot\|)$ defined in Example 7.5 (i) is an NLS and then show that it is a Banach space.

7.2 Define the space $l^\infty$ as the set of all bounded sequences $x = \{x_1, x_2, \ldots\} \subset \mathbb{C}$. Show that $l^\infty$ is an NLS with the norm defined as $\|x\| = \sup\{|x_i| : i = 1, 2, \ldots\}$.

7.3 Define the space $c \subset l^\infty$ as the subset of all convergent sequences and let $c_0 \subset c$ be the set of all sequences convergent to zero. Show that $c$ and $c_0$ are normed linear subspaces of $l^\infty$ with the same norm as that in Problem 7.2.

7.4 Let $\mathcal{B}(\Omega)$ be the space of all bounded real-valued functions on $\Omega$. Show that $\mathcal{B}(\Omega)$ is a linear space. Let $\|f\|_u = \sup\{|f(\omega)| : \omega \in \Omega\}$ be the supremum norm defined in Example 7.5 (iii). Show that the supremum norm in $\mathcal{B}(\Omega)$ is indeed a norm and show that $\mathcal{B}(\Omega)$ is a Banach space with respect to this norm.

7.5 Show that $\|\cdot\|_p$ in Example 7.5 (ii) is a norm.

7.6 Show that the Cauchy sequence $\{x^{(n)}\}$ in Example 7.5 (ii) is uniformly bounded.

7.7 Show that the pointwise limit $x$ of the sequence $\{x^{(n)}\}$ in Example 7.5 (ii) is also an $l^p$-limit.

7.8 Show that the differential operator acting on $\mathcal{C}^n[a,b]$ is linear with respect to $\mathbb{R}$.

7.9 Let $A$ be an $n \times m$ matrix. Show that $A : \mathbb{R}^m \to \mathbb{R}^n$ is a linear operator with respect to $\mathbb{R}$.

7.10 Let $\|\cdot\|$ be a real-valued nonnegative function defined on a linear space $X$ over a field $\mathbb{F}$ (which is $\mathbb{R}$ or $\mathbb{C}$) and let it have properties (i-iii) of Theorem 7.3. Show that $\|\cdot\|$ generates a TIH metric on $X$ by $d(x,y) = \|x - y\|$.
NEW TERMS:

translation invariant metric; homothetic metric; TIH metric; norm; normed linear space (NLS); NLS; semi-norm; semi-normed linear space (SNLS); SNLS; Euclidean norm; Banach space; $l^p$-norm; supremum norm; $E$-norm; linear operator; linear functional; semi-linear operator; semi-linear functional
Chapter 3
Elements of Point Set Topology

1. TOPOLOGICAL SPACES

In Definition 4.5, Chapter 2, we called the collection of all open sets $\tau(d)$ of a metric space $(X,d)$ the topology induced by a metric. We recall that this collection of open sets or topology is closed with respect to the formation of arbitrary unions and finite intersections. We understand that the topology of a metric space carries the main information about its structural quality. For instance, equivalent metrics possess the same topology. In addition, through the topology we can establish the continuity of a function (see Theorem 4.6, Chapter 2) without need of a metric. This all leads to the idea of defining a structure more general than distance on a set, a structure that preserves convergence and continuity.

Mathematics historians are not in complete agreement about the roots of topology and who should get full credit for being its initiator. Most consider that topology, as the theory of structures, has its basis in the work of the German mathematician Felix Hausdorff, who published his fundamental monograph, Grundzüge der Mengenlehre (Principles of Set Theory), in Leipzig, in 1914. It was "immediately" preceded by Maurice Fréchet's 1906 pioneering introduction to metric spaces. (Notice that contemporary topology has branched out into several specialized areas, such as general topology, algebraic topology, and combinatorial topology. The very topology founded by Hausdorff was what we now refer to as general topology, also called point set topology, which is deeply bound to classical analysis.) Bourbaki [1994] regarded the work of the German Bernhard Georg Riemann (his doctoral and habilitation theses and a paper on abelian functions) from 1851 to 1857 as revolutionary and qualified him as the creator of topology, since he was the first to recognize where topological ideas were needed.

In 1870, Georg Cantor (apparently inspired by Riemann's work), in connection with the representation of real-valued functions by Fourier series, was concerned with the characterization of sets on which the function's value can be altered leaving the series invariant. This yielded more advanced concepts of topological accumulation point (earlier introduced by Karl Weierstrass), derived set, closed set, connected set, dense set and others that further led to the topological big bang. The word topology was introduced for the first time in 1836 by the German Johann B. Listing, who used this as the notion of a "new analysis."
Topology has evolved further ever since. Most of the fundamental results in general topology were developed in works by the Germans Felix Hausdorff, Heinrich Hopf, and Hermann Weyl, the Russians Pavel Alexandrov and Pavel Urysohn, the Poles Stefan Banach, Kazimierz Kuratowski, and Wacław Sierpiński, the Americans Eliakim H. Moore and James Alexander, and the Bourbaki group of French mathematicians.

1.1 Definition. Let $X \ne \emptyset$. A collection $\tau$ of subsets of $X$ is called a topology on $X$ or a family of open sets, if:

(i) $X, \emptyset \in \tau$.
(ii) $\{O_i : i \in I\} \subset \tau \Rightarrow \bigcup_{i \in I} O_i \in \tau$.
(iii) $\tau$ is $\cap$-stable, i.e., $O_1, O_2 \in \tau \Rightarrow O_1 \cap O_2 \in \tau$.

[Observe that property (iii) implies inductively that the intersection of any finite collection of open subsets will also be open.] A carrier $X$ endowed with a topology $\tau$ is said to be a topological space. The topological space is denoted by $(X, \tau)$. □

1.2 Examples.
(i) Let $(X,d)$ be a metric space and let $\tau(d)$ be the topology generated by the metric $d$ (see Definition 4.5, Chapter 2). Due to Theorem 2.5, Chapter 2, the collection of all open sets generated by metric $d$ contains all arbitrary unions and finite intersections. Moreover, $\emptyset$ and $X$ are also open, so that $\tau(d)$ is indeed a topology as it was defined above. For instance, the topology in $\mathbb{R}^n$ generated by the Euclidean metric $d_e$ is called the usual (or standard or natural) topology and it is denoted by $\tau_e$.

(ii) Let $X$ be a nonempty set. Then the pair $\{X, \emptyset\} = \tau_0$ is a trivial example of a topology. It is obviously the smallest topology on $X$, and it is called the indiscrete topology. Another trivial example of a topology is $\mathcal{P}(X)$, the collection of all subsets of $X$. This is the largest possible topology on $X$, and it is called the discrete topology.

(iii) For $A \subset X$, $\tau_1 = \{X, \emptyset, A\}$ is a topology "induced by set $A$."

(iv) Let $X = \overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty\} \cup \{+\infty\}$ be the extended real line. Let $\tau \subset \mathcal{P}(X)$ be the following collection of sets: $O \in \tau$ if and only if

1) $O \cap \mathbb{R} \in \tau_e$;
2) if $\infty \in O$ or $-\infty \in O$, then there is an $a \in \mathbb{R}$ such that $(a, \infty] \subset O$ or an $a \in \mathbb{R}$ such that $[-\infty, a) \subset O$, respectively.

Then $\tau$ is a topology on $\overline{\mathbb{R}}$ (see Problem 1.1).

(v) Let $(X, \tau)$ be a topological space and let $Y \subset X$. Define the system of subsets $\tau_Y = \{O \cap Y : O \in \tau\}$. We show that $\tau_Y$ is a topology on $Y$. Indeed, $Y$ and $\emptyset$ obviously belong to $\tau_Y$. Let $\{U_i : i \in I\} \subset \tau_Y$. Then, for every $i \in I$, there is $O_i \in \tau$ such that $O_i \cap Y = U_i \in \tau_Y$. Now $\bigcup_{i \in I} O_i \in \tau$ and therefore $Y \cap \bigcup_{i \in I} O_i \in \tau_Y$. On the other hand, due to the distributive law,

$$Y \cap \bigcup_{i \in I} O_i = \bigcup_{i \in I} (Y \cap O_i) = \bigcup_{i \in I} U_i \in \tau_Y.$$

It can similarly be shown that $\tau_Y$ is closed with respect to the formation of all finite intersections. Therefore, $\tau_Y$ is a topology on $Y \subset X$, called the relative topology of $\tau$ on $Y$. The pair $(Y, \tau_Y)$ is called a subspace. In some older textbooks, the topology $\tau_Y$ is also called the trace of $Y$ in $\tau$. For instance, take the Euclidean metric space $(\mathbb{R}, d_e)$ and let $Y = [0,1]$. Then the set $(\tfrac{1}{2}, 1]$ is open in $(Y, \tau_Y)$. □
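On a finite carrier the axioms of Definition 1.1 and the relative topology of Example 1.2 (v) can be verified mechanically. The sketch below is an illustration under that finiteness assumption; the function names `is_topology` and `relative_topology` and the three-point carrier are hypothetical choices, not taken from the text.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check Definition 1.1 for a family tau of subsets of a FINITE carrier X."""
    X = frozenset(X)
    tau = {frozenset(O) for O in tau}
    if X not in tau or frozenset() not in tau:
        return False
    # tau is finite, so closure under pairwise unions and intersections
    # already gives closure under arbitrary unions and finite intersections.
    for U, V in combinations(tau, 2):
        if U | V not in tau or U & V not in tau:
            return False
    return True

def relative_topology(tau, Y):
    """Relative topology tau_Y = {O ∩ Y : O in tau} of Example 1.2 (v)."""
    Y = frozenset(Y)
    return {frozenset(O) & Y for O in tau}

X = {1, 2, 3}
indiscrete = [set(), X]                  # Example 1.2 (ii)
induced_by_A = [set(), {1}, X]           # Example 1.2 (iii) with A = {1}
not_a_topology = [set(), {1}, {2}, X]    # the union {1, 2} is missing

Y = {1, 2}
tau_Y = relative_topology(induced_by_A, Y)

print(is_topology(X, indiscrete), is_topology(X, induced_by_A),
      is_topology(X, not_a_topology), is_topology(Y, tau_Y))
# → True True False True
```

The pairwise check is only adequate because the family is finite; for infinite families, closure under arbitrary unions cannot be reduced to pairs.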
1.3 Remarks.

(i) Let $X$ be a non-empty set and let $\tau$ and $\tau'$ be two topologies on $X$. If $\tau \subset \tau'$, then we say $\tau$ is weaker (or smaller or coarser) than $\tau'$. We also say that $\tau'$ is stronger (or larger or finer) than $\tau$. As it follows from Examples 1.2 (ii) and (iii), $\tau_0 \subset \tau_1 \subseteq \mathcal{P}(X)$. The indiscrete topology is, therefore, the coarsest topology on $X$, while $\mathcal{P}(X)$ is the finest topology on $X$.

(ii) If $(X,d)$ is a metric space and $\tau(d)$ is the topology induced by metric $d$ (also called the metric topology), then $(X, \tau(d))$ is said to be a metrizable (topological) space. Therefore, a metrizable space is a topological space with a topology that comes from some metric. □

1.4 Definition. Let $(X, \tau)$ be a topological space. A subset $A \subset X$ is called $\tau$-closed or just closed if $A^c \in \tau$. □

As in the case of metric spaces, we can easily prove that $X$ and $\emptyset$ are closed, finite unions of closed sets are closed, and arbitrary intersections of closed sets are closed.

In Definitions 1.5 below we introduce some important notions for topological spaces. It will be advantageous to support these definitions by examples immediately after the notions are introduced. To reference the examples, we assign them the letter D followed by the number of the definition.

1.5 Definitions.

(i) Let $(X, \tau)$ be a topological space. A subset $A \subset X$ is called a neighborhood of a point $x \in X$ if $x$ belongs to some open subset of $A$. Specifically, if $A \in \tau$ then $A$ is called an open neighborhood of $x$.
[Example D1.5 (i). Let $X = \mathbb{R}$ and $\tau = \{\mathbb{R}, \emptyset, \{1\}, (3,4], \{1\} \cup (3,4]\}$. Then $\{1\}$ is an open neighborhood of 1, $[3,5]$ is a neighborhood of $3\tfrac{1}{2}$, $(-2,0)$ is not a neighborhood of $-1$, and $\mathbb{R}$ is the only neighborhood of $-1$.]

(ii) A point $x$ is called an interior point of a set $A$ if $A$ is a neighborhood of $x$. The set of all points interior to $A$ is called the interior of $A$ and is denoted by $\mathring{A}$ or by $\mathrm{Int}(A)$.

[Example D1.5 (ii). In Example D1.5 (i), 1 is the interior point of the set $\{1\}$. The interior of the set $A = [3,5]$ is $\mathring{A} = (3,4]$.]

(iii) The collection of all neighborhoods of a point $x \in X$ is called the neighborhood system at $x$ and it is denoted by $\mathcal{U}_x$. An arbitrary subcollection $\mathcal{B}_x \subset \mathcal{U}_x$ is called a neighborhood base at $x$ (or a fundamental system of neighborhoods of $x$) if every neighborhood $U \in \mathcal{U}_x$ is a superset of at least one $B \in \mathcal{B}_x$. Any element $B \in \mathcal{B}_x$ is called a base neighborhood. Clearly $\mathcal{U}_x$ itself is a neighborhood base at $x$. Obviously, $\mathcal{B}_x$ is a neighborhood base at $x$ if and only if there is another neighborhood base $\mathcal{D}_x$ such that every base neighborhood $D_x \in \mathcal{D}_x$ is a superset of at least one base neighborhood $B$ from $\mathcal{B}_x$.

[Example D1.5 (iii). Let $\{B(x, \tfrac{1}{n}),\ n = 1, 2, \ldots\}$ be the sequence of $d_e$-open balls centered at a point $x \in \mathbb{R}^n$. Clearly, it is a fundamental system of neighborhoods of $x$. Another neighborhood base at $x$, which contains the above neighborhood base, is the system of all open balls with rational radii, centered at $x$. We can also take the system of all open balls with positive real radii, centered at $x$. This system contains the first two neighborhood bases.] □

A neighborhood base $\mathcal{B}_x$ at $x$ is in general a more "economical system" of neighborhoods than the whole neighborhood system $\mathcal{U}_x$; and, as it will be shown, it is as informative about the structure of the space in the vicinity of $x$ as $\mathcal{U}_x$ is. Technically, it is of greater advantage in various proofs for us to use a base neighborhood than to use an arbitrary neighborhood.

As it follows from the definition, an arbitrary set $A$ need not be a neighborhood of all of its points. For instance, $[0,1]$ is not a neighborhood for points 0 and 1 in the usual topology $(\mathbb{R}, \tau_e)$. More about the nature of neighborhoods is contained in the following propositions that the reader can easily verify.

1.6 Proposition. $A \subset X$ is a neighborhood for all of its points if and only if $A$ is open. □

(See Problem 1.4.)

1.7 Proposition. $\mathring{A}$ is the largest open set contained in $A$. □

(See Problem 1.5.) In particular, it follows that $A$ is open if and only if $A = \mathring{A}$.
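Neighborhoods and interiors can be computed directly from Definitions 1.5 on a finite space, and Propositions 1.6 and 1.7 can then be confirmed by exhausting all subsets. The five-set topology below is a hypothetical finite analogue of Example D1.5 (i), in which $\{1\}$ and $\{2, 3\}$ play the roles of $\{1\}$ and $(3,4]$; it is a sketch, not the example from the text.

```python
from itertools import chain, combinations

X = frozenset({0, 1, 2, 3})
# Assumed finite analogue of the topology in Example D1.5 (i).
tau = {frozenset(), frozenset({1}), frozenset({2, 3}),
       frozenset({1, 2, 3}), X}

def is_neighborhood(A, x, tau):
    """A is a neighborhood of x if x lies in some open subset of A."""
    A = frozenset(A)
    return any(x in O and O <= A for O in tau)

def interior(A, tau):
    """Set of all points of which A is a neighborhood."""
    A = frozenset(A)
    return frozenset(x for x in A if is_neighborhood(A, x, tau))

def subsets(X):
    """All subsets of a finite set X."""
    items = list(X)
    families = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in families]

# Proposition 1.6: A is open iff A is a neighborhood of each of its points.
prop_1_6 = all(
    (A in tau) == all(is_neighborhood(A, x, tau) for x in A)
    for A in subsets(X))

# Proposition 1.7: the interior is the union of all open subsets of A,
# hence the largest open set contained in A.
def largest_open_subset(A, tau):
    return frozenset().union(*[O for O in tau if O <= A])

prop_1_7 = all(
    interior(A, tau) == largest_open_subset(A, tau)
    and largest_open_subset(A, tau) in tau
    for A in subsets(X))

print(prop_1_6, prop_1_7, sorted(interior({0, 1}, tau)))
# → True True [1]
```

An exhaustive check of this kind confirms the propositions only for the chosen space; it is a sanity test, not a proof.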
1.8 Definitions.

(i) A point $x \in X$ is called a closure point for a set $A$ if any neighborhood of $x$ has a nonempty intersection with $A$. We also say that any neighborhood of $x$ meets $A$. The set of all closure points of $A$ is called the closure of $A$ and it is denoted by $\bar{A}$. [Sometimes, when working with relative topologies, it is necessary to emphasize that the closure of $A$ is with respect to the carrier $X$; it is then advisable to use the notation $\mathrm{Cl}_X A$. However, for brevity we shall still use the notation $\bar{A}$ whenever $X$ is the only carrier under consideration.]

[Example D1.8 (i). In the topology introduced in Example D1.5 (i), let us take $A = (-2,0)$. Then we have

$$\bar{A} = (-\infty, 1) \cup (1, 3] \cup (4, \infty),$$

while $\mathring{A} = \emptyset$. Indeed, for any $x \in (-\infty, 1)$, $\mathbb{R}$ is the only neighborhood of $x$; thus $\mathbb{R} \cap (-2,0) \ne \emptyset$. Observe that 1 is not a closure point of $A$, since $\{1\}$ is a neighborhood (of 1) such that $\{1\} \cap A = \emptyset$. For the set $B = \{-1\}$ we have $\bar{B} = (-\infty, 1) \cup (1, 3] \cup (4, \infty) = \bar{A}$.]

(ii) A subset $A \subset X$ is said to be dense in $X$ if $\bar{A} = X$. $A \subset X$ is said to be nowhere dense if $\mathrm{Int}(\bar{A}) = \emptyset$.

[Example D1.8 (ii). Consider Example D1.8 (i). For $A = (-2,0)$, $(\bar{A})^c = \{1\} \cup (3,4]$, while $\mathrm{Int}(\bar{A}) = \emptyset$, i.e. $A$ is nowhere dense. The set $C = \{-1\} \cup \{1\} \cup (3,4]$ is dense in $X$.]

(iii) A point $x \in X$ is called an accumulation point (or cluster point) of a set $A$ if every neighborhood of $x$ contains at least one point of $A$ other than $x$. The set of all accumulation points is called the derived set and is denoted by $A'$.

[Example D1.8 (iii). In Example D1.8 (i), $A' = \bar{A}$.]

(iv) A point $x \in X$ is called a boundary point of a set $A$ if every neighborhood of $x$ contains at least one point of $A$ and at least one point of $A^c$. The set of all boundary points of $A$ is denoted by $\partial A$ and called the boundary of $A$.

[Example D1.8 (iv). In Example D1.8 (i), $\mathring{A} = \emptyset$ and $\partial A = \bar{A}$.]

(The closure of $A$ is evidently the smallest closed set containing $A$; and $A$ is closed if and only if $A = \bar{A}$. See Problem 1.6.)

(v) A topological space $(X, \tau)$ is called separable if there exists an at most countable, dense subset of $X$. □
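The notions of Definitions 1.8 translate into short set computations on a finite space. The sketch below uses an assumed five-point analogue of the topology of Example D1.5 (i), with the points 0 and 4 standing in for the part of the real line that has $\mathbb{R}$ as its only neighborhood; all function names are hypothetical. Only open neighborhoods are tested, which suffices because every neighborhood contains an open one.

```python
X = frozenset({0, 1, 2, 3, 4})
# Assumed finite analogue: {1} and {2, 3} are the only proper nonempty
# "building blocks", as {1} and (3, 4] are in Example D1.5 (i).
tau = {frozenset(), frozenset({1}), frozenset({2, 3}),
       frozenset({1, 2, 3}), X}

def open_nbhds(x, tau):
    """All open sets containing x."""
    return [O for O in tau if x in O]

def closure(A, X, tau):
    """Points whose every neighborhood meets A."""
    A = frozenset(A)
    return frozenset(x for x in X if all(O & A for O in open_nbhds(x, tau)))

def interior(A, tau):
    """Points of A having an open neighborhood inside A."""
    A = frozenset(A)
    return frozenset(x for x in A
                     if any(O <= A for O in open_nbhds(x, tau)))

def derived_set(A, X, tau):
    """Points whose every neighborhood contains a point of A other than x."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(O & (A - {x}) for O in open_nbhds(x, tau)))

def boundary(A, X, tau):
    """Points whose every neighborhood meets both A and its complement."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(O & A and O & (X - A)
                            for O in open_nbhds(x, tau)))

A = frozenset({0})
C = frozenset({0, 1, 2})

print(sorted(closure(A, X, tau)), sorted(interior(A, tau)),
      sorted(derived_set(A, X, tau)), sorted(boundary(A, X, tau)))
# → [0, 4] [] [4] [0, 4]
print(closure(C, X, tau) == X)                        # C is dense
print(interior(closure(A, X, tau), tau) == frozenset())  # A is nowhere dense
```

As in Example D1.8, the set `A` has empty interior and its boundary coincides with its closure, while `C` meets every nonempty open set and is therefore dense.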
PROBLEMS

1.1 Show that the collection $\tau$ of sets introduced in Example 1.2 (iv) is a topology in $\overline{\mathbb{R}}$.

1.2 Let $X$ be a nonempty set and $\tau = \{X, \emptyset, C^c : C \subseteq X$ and $C$ is finite$\}$. Show that $\tau$ is a topology on $X$. $\tau$ is called the cofinite (or finite complement) topology on $X$.

1.3 Let $X = \mathbb{R}$ and let $\tau = \{X, \emptyset, (-\infty, 1], [1, \infty), (3, 10]\}$. Is $\tau$ a topology on $\mathbb{R}$? If not, supplement $\tau$ by some subsets to a topology (and be reasonable).

1.4 Prove Proposition 1.6.

1.5 Prove Proposition 1.7. [Hint: Show that $\mathring{A}$ contains all open sets that are contained in $A$ and use Proposition 1.6.]

1.6 Show that the closure of $A$ is the smallest closed set containing $A$; and $A$ is closed if and only if $A = \bar{A}$.

1.7 Show that (a) $A \subset B \Rightarrow \bar{A} \subset \bar{B}$, (b) $\overline{A \cup B} = \bar{A} \cup \bar{B}$, (c) $\overline{A \cap B} \subset \bar{A} \cap \bar{B}$ and $\mathrm{Int}(A \cap B) = \mathring{A} \cap \mathring{B}$. Is $\mathrm{Int}\,\bar{A} = \mathring{A}$?

1.8 Show that $\bar{A} = A \cup \partial A$.

1.9 For $X$ being an infinite set, define $\tau := \{X, \emptyset, C^c : C$ is at most countable$\}$. Show that $\tau$ is a topology on $X$. We call such a topology cocountable (or the countable complement topology).

1.10 Show that $\bar{A} = \mathring{A} + \partial A$. [Hint: Proceed in the same way as in Problem 3.2, Chapter 2, and work with a neighborhood instead of a ball.]

1.11 Prove that a subset of a topological space is closed if and only if it contains all of its accumulation points.

1.12 Let $\tau_0 = \{\mathbb{R}, (-1, 1], [0, 5), \{0\}, \{10\}\}$.
a) Extend $\tau_0$ to the smallest topology $\tau$ in $\mathbb{R}$ generated by $\tau_0$.
b) Let $A = (-7, -5]$, $B = (0, 7]$, and $C = [-\tfrac{1}{2}, 20)$. Find the sets $\mathring{A}, \mathring{B}, \mathring{C}, \bar{A}, \bar{B}, \bar{C}, A', B', C', \partial A, \partial B$, and $\partial C$. Determine whether $A$, $B$ and $C$ are dense in $\mathbb{R}$.

1.13 Show that $\partial A = \emptyset$ if and only if $A$ is open and closed.

1.14 Show that $(\bar{A})^c \subset \overline{A^c}$.

1.15 Show that the inverse inclusion in the previous problem holds if and only if $A$ is closed and open.

1.16 Show that $x \in \bar{A}$ if and only if for every $U_x \in \mathcal{U}_x$, $U_x \cap A \ne \emptyset$. This provides an equivalent definition of a closure point.
NEW TERMS:

topology; open sets; $\cap$-stable family of sets; topological space; usual topology; standard (natural) topology; indiscrete topology; discrete topology; topology induced by a set; topology on the extended real line; relative topology (subspace); subspace; trace of a set in a topology; weaker (coarser, smaller) topology; coarser topology; stronger (finer, larger) topology; finer topology; metric topology; metrizable topological space; closed set; neighborhood of a point; open neighborhood of a point; interior point; interior of a set; neighborhood system at a point; neighborhood base at a point; fundamental system of neighborhoods at a point; base neighborhood; closure point for a set; neighborhood of a point that meets a set; closure of a set; dense set; nowhere dense set; accumulation (cluster) point; cluster (accumulation) point; derived set; boundary point of a set; boundary of a set; separable topological space; cofinite (finite complement) topology; cocountable (countable complement) topology
2. BASES AND SUBBASES FOR TOPOLOGICAL SPACES

In the previous section, we introduced the notion of a collection of open sets, called a topology. In many applications, describing an entire topology on a carrier is difficult and sometimes even impossible. This predicament is manageable if one deals instead with a sort of "pre-topology," a smaller collection of sets, which is not a topology, but which generates a topology and thereby can be extended to a topology. With a similar idea, we came to introduce neighborhood bases. Take, for example, a metric space. While the family of all open balls does not yield a topology, every open set, as we know, can be made of the union of some subcollection of open balls, and consequently, it leads to a topology and gives rise to the notion of a base for a topology.

2.1 Definition. Let $(X, \tau)$ be a topological space. A subcollection $\mathcal{B}$ of open sets is called a base for $\tau$ if every open set is a union of some elements of $\mathcal{B}$. (Specifically, it follows that $\emptyset$ must be an element of $\mathcal{B}$.) □

The elements of $\mathcal{B}$ are called base sets. With no major difficulty (and with hints provided), the reader can establish a very useful criterion of a base for $\tau$, the subject of Problem 2.2. An important relation between bases and neighborhood bases is given in the following theorem.

2.2 Theorem. $\mathcal{B}$ is a base for $\tau$ if and only if $\emptyset \in \mathcal{B}$ and for every point $x \in X$, there is a neighborhood base $\mathcal{B}_x$ consisting of open sets such that $\mathcal{B}_x \subset \mathcal{B}$.

Proof. We have to show that $\mathcal{B}$ is a base for $\tau$ if and only if, for every $x \in X$ and each neighborhood $U_x$ of $x$, there is a base neighborhood $B_x \in \mathcal{B}$ such that $B_x \subset U_x$.

(i) Let $\mathcal{B}$ be a base for $\tau$ and let $U_x$ be a neighborhood of a point $x \in X$. Without loss of generality we assume that $U_x$ is open. (Otherwise, take any open neighborhood $O_x \subset U_x$ of $x$ and work with $O_x$ instead.) If $U_x$ is open, there exists a subcollection of $\mathcal{B}$ whose union equals $U_x$. Thus, at least one set of this subcollection, say $B_x$ $(\in \mathcal{B})$, must contain $x$, and $B_x \subset U_x$. Observe that by Definition 1.5 (iii), $B_x$ is then an element of a neighborhood base and $\mathcal{B}_x = \{B_x\}$, the collection of all such sets $B_x$, forms a neighborhood base of $x$. Therefore, each neighborhood system $\mathcal{U}_x$ of $x$ has at least one neighborhood base $\mathcal{B}_x$ of $x$ such that $\mathcal{B}_x \subset \mathcal{B}$ and each $U_x \in \mathcal{U}_x$ is a superset of at least one $B_x \in \mathcal{B}_x$.

(ii) Let $\mathcal{B} \subset \tau$ and assume that for every $x \in X$, there is a neighborhood base $\mathcal{B}_x \subset \mathcal{B}$. Let $O$ be an arbitrary open set. Then, by our assumption and by the definition of a neighborhood base, for any point $x \in O$ (since $O$ is a neighborhood of $x$), there is a base neighborhood $B_x \in \mathcal{B}_x$ such that $B_x \subset O$. Thus $O = \bigcup_{x \in O} B_x$ (union of all such $B_x \in \mathcal{B}$). Hence, every open set $O \in \tau$ can be composed of a union of some elements of $\mathcal{B}$, or equivalently, $\mathcal{B}$ is a base for $\tau$. □

2.3 Examples.
(i) Let $\{\mathcal{B}_x : x \in X\}$ be an arbitrary collection of open neighborhood bases at all points. Then, $\bigcup_{x \in X} \mathcal{B}_x$ can be regarded as an example of a base for $\tau$. Indeed, as in Theorem 2.2, take a point $x$ of any open set $O$. Then, $O$ is a neighborhood of $x$ and thus it belongs to the neighborhood system at $x$. By the definition, a neighborhood base $\mathcal{B}_x \in \{\mathcal{B}_x : x \in X\}$ is such that there is at least one base neighborhood $B_x$ of $x$ included in $O$. Collecting all such neighborhoods of all points of $O$, we can represent $O$ as the union $\bigcup_{x \in O} B_x$. Hence, $\bigcup_{x \in X} \mathcal{B}_x$ is a base for the topology $\tau$.

(ii) As mentioned at the beginning of this section, in any metric space $(X,d)$, the collection of all open balls is a trivial example of a base for the corresponding metrizable topological space. Indeed, by Definition 2.3, Chapter 2, for each open neighborhood $O_x$ of $x \in X$, there exists an open ball $B(x, \varepsilon) \subset O_x$. Earlier (in Example D1.5 (iii)), we showed that $B(x, r)$ is a base neighborhood at $x$. Thus by Theorem 2.2, the system $\{B(x, \varepsilon) : x \in X, \varepsilon > 0\}$ is a base for $\tau(d)$. As in Example D1.5 (iii), a neighborhood base at $x$ can be reduced to the system $\mathcal{B}_x = \{B(x, q) : q \in \mathbb{Q}, q > 0\}$ of all balls with rational radii. Consequently, by Theorem 2.2, the collection of all open balls with rational radii is a base for $\tau(d)$. [Note that these balls are centered at all $x \in X$, so consequently, this base need not be countable.]

(iii) We give a rather informal definition of an open parallelepiped in $(\mathbb{R}^n, \tau_e)$. More formalism is brought in Section 5. A set

$$P = O^{(1)} \times \cdots \times O^{(n)}$$

is called an open parallelepiped (or rectangle) in $\mathbb{R}^n$ if each $O^{(i)}$ is an open set in $\mathbb{R}$. An open parallelepiped is said to be base (or simple) if each $O^{(i)}$ is an open interval. Let $\mathfrak{P}$ be the system of all base parallelepipeds in $(\mathbb{R}^n, \tau_e)$ along with the empty set $\emptyset$. Let $x \in \mathbb{R}^n$ and let $O_x$ be any open neighborhood of $x$. Then, there is an open ball $B(x, r) \subset O_x$. On the other hand, there obviously is a base parallelepiped $P_x$ "centered" at $x$ that can be inscribed into this ball, and this implies that $P_x \subset O_x$. Therefore, the system $\mathfrak{P}_x$ of all open base parallelepipeds centered at $x$ is a neighborhood base at $x$; and again by Theorem 2.2, $\mathfrak{P}$ (which contains every $\mathfrak{P}_x$, $x \in \mathbb{R}^n$) is a base for $(\mathbb{R}^n, \tau_e)$. Observe that the system of all "rational" parallelepipeds (i.e. those base ones with rational coordinates) is also a base for $(\mathbb{R}^n, \tau_e)$.

(iv) The collection of all singletons $\{x\}$, $x \in X$, is a base for the discrete topology on $X$.
2.4 Remarks.
(i) Let $\tau_1$ and $\tau_2$ be two topologies on $X$ and let $\mathcal{B}_1$ be a base for $\tau_1$. If $\mathcal{B}_1 \subset \tau_2$ then $\tau_1 \subset \tau_2$. [Observe that $\mathcal{B}_1$ need not be a base for $\tau_2$.] Indeed, by the definition of a base, each $O_1 \in \tau_1$ can be represented as $O_1 = \bigcup_i B_i$ with $B_i \in \mathcal{B}_1$. However, $B_i \in \tau_2$ implies that $\bigcup_i B_i = O_1 \in \tau_2$.

(ii) Let $\tau_1$ and $\tau_2$ be two topologies on $X$ with a common base $\mathcal{B}$. Then, by (i), $\tau_1 \subset \tau_2$ and $\tau_2 \subseteq \tau_1$, and thus $\tau_1 = \tau_2$. In other words, a base uniquely defines a topology. Note that although one topology may have different bases, a base cannot be shared by different topologies.

(iii) Let $\tau_1 \subseteq \tau_2$ and let $\mathcal{B}_2$ be a base for $\tau_2$. It does not follow that $\mathcal{B}_2$ is a base for $\tau_1$. In fact, $\mathcal{B}_2$ need not even be a subcollection of $\tau_1$. However, if in addition, $\mathcal{B}_2 \subset \tau_1$, then by (i), $\tau_2 \subseteq \tau_1$ and therefore, $\tau_1 = \tau_2$. Indeed, $\mathcal{B}_2 \subseteq \tau_1 \subset \tau_2$ implies that $\tau_1 = \tau_2$. □
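Remark 2.4 (ii) can be illustrated on a finite carrier by forming all unions of base sets and comparing the results for two different bases. This is a sketch only: the function name `topology_from_base` and the sample bases are assumptions, and the empty union is taken to be $\emptyset$, so $\emptyset$ need not be listed in the base here, unlike the convention noted in Definition 2.1.

```python
from itertools import chain, combinations

def topology_from_base(base):
    """All unions of subfamilies of a finite base (empty union = empty set)."""
    base = [frozenset(B) for B in base]
    families = chain.from_iterable(
        combinations(base, r) for r in range(len(base) + 1))
    return {frozenset().union(*fam) for fam in families}

X = {1, 2, 3}
base_a = [{1}, {2}, {3}]              # singletons, as in Example 2.3 (iv)
base_b = [{1}, {2}, {3}, {1, 2}, X]   # a larger base for the same topology

tau_a = topology_from_base(base_a)
tau_b = topology_from_base(base_b)

# One topology may have different bases ...
print(tau_a == tau_b, len(tau_a))
# → True 8
```

Both bases produce the discrete topology on the three-point carrier, which has $2^3 = 8$ open sets. Enumerating all subfamilies is exponential in the size of the base, so this approach is only meant for very small examples.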
In a construction of a topology on a carrier, it is often very helpful to start with a collection yet smaller and more rudimentary than a base. Even more rewarding is its use in the formation of product topologies and in quick and tame continuity criteria for functions. Recall that a function $f$ between two metric spaces $X$ and $Y$ is continuous if and only if inverse images under $f$ of open sets in $Y$ are open in $X$. Remarkably, continuity of $f$ can be verified for a (frequently) much smaller community of subbase sets in $Y$. This will be established and elaborated in Section 4 for topological spaces. We begin with the following:

2.5 Definition. Let $\mathcal{S} \subset \mathcal{P}(X)$ be such that $\bigcup_{A \in \mathcal{S}} A = X$. If there exists the weakest topology containing $\mathcal{S}$, then it is called the topology generated by $\mathcal{S}$, and the collection $\mathcal{S}$ is called a subbase on $X$. [Note that $\mathcal{S}$ can directly restore only $X$, while $\mathcal{B}$ restores all open sets, including $\emptyset$. Clearly, a base $\mathcal{B}$ for a topology $\tau$, besides $\tau$ itself, offers a trivial example of a subbase on $X$.] □

To justify Definition 2.5 we need:

2.6 Proposition. The weakest topology generated by a subbase exists.

Proof. Clearly, there exists a topology containing $\mathcal{S}$ (for instance, $\mathcal{P}(X)$). Then define $\tau(\mathcal{S})$ as the intersection of all topologies containing $\mathcal{S}$. We show that $\tau(\mathcal{S})$ is a topology on $X$.

(i) $X$ and $\emptyset$ belong to all topologies containing $\mathcal{S}$. Therefore $X$ and $\emptyset \in \tau(\mathcal{S})$.

(ii) Let $O_1, O_2, \ldots, O_n \in \tau(\mathcal{S})$. Then $O_1, O_2, \ldots, O_n$ are elements of every topology containing $\mathcal{S}$. This implies that $\bigcap_{k=1}^{n} O_k$ belongs to all topologies containing $\mathcal{S}$, and thus it belongs to $\tau(\mathcal{S})$.

(iii) By similar arguments, $\tau(\mathcal{S})$ is closed relative to the formation of arbitrary unions. Obviously, $\tau(\mathcal{S})$ is the weakest topology containing $\mathcal{S}$. □

The following theorem shows that the way we generated the weakest topology $\tau(\mathcal{S})$ over a collection $\mathcal{S}$ of "primitive" sets or a subbase, by extending this collection to the one closed with respect to the formation of finite intersections and arbitrary unions, takes place in the construction of arbitrary topologies. [It seems plausible to supplement $\mathcal{S}$ by $X$, $\emptyset$, and all unions and finite intersections of elements of $\mathcal{S}$.] In addition, the theorem shows that the extension of a subbase to a $\cap$-stable supercollection makes a base for the weakest topology $\tau(\mathcal{S})$.

2.7 Theorem.
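Let $\mathcal{S}$ be an arbitrary subcollection of $\mathcal{P}(X)$ and let $\mathcal{B}$ be the collection made up of $\emptyset$ and all finite intersections of elements of $\mathcal{S}$. Then $\mathcal{B}$ is a base for $\tau(\mathcal{S})$.

Proof. Let

$$\tau' = \Big\{\bigcup_{B \in \mathcal{B}'} B : \mathcal{B}' \subseteq \mathcal{B}\Big\},$$

where $\mathcal{B}$ is defined in the condition of this statement. We show that $\tau'$ is a topology on $X$. It is sufficient to show that $\tau'$ contains all finite intersections; the other properties of $\tau'$ as a topology are obvious. Also, for brevity in notation, we show this for the case of the intersection of two open sets. Let $U$ and $V$ be two elements of $\tau'$. By the definition of $\mathcal{B}$,

$$U = \bigcup_{i \in I} U_i \quad \text{and} \quad V = \bigcup_{j \in J} V_j, \quad (U_i, V_j \in \mathcal{B}),$$

where

$$U_i = \bigcap_{k=1}^{n_i} S_k^i \quad \text{and} \quad V_j = \bigcap_{s=1}^{m_j} S_s^j, \quad (S_k^i, S_s^j \in \mathcal{S}).$$

Then

$$U \cap V = \Big(\bigcup_{i \in I} \bigcap_{k=1}^{n_i} S_k^i\Big) \cap \Big(\bigcup_{j \in J} \bigcap_{s=1}^{m_j} S_s^j\Big) = \bigcup_{(i,j) \in I \times J} \Big(\bigcap_{k=1}^{n_i} S_k^i \cap \bigcap_{s=1}^{m_j} S_s^j\Big) \in \tau',$$

since each set in the last union is a finite intersection of elements of $\mathcal{S}$ and hence belongs to $\mathcal{B}$.

Now, since obviously $\mathcal{B}$ is a base for $\tau'$ and $\mathcal{B} \subset \tau(\mathcal{S}) \subset \tau'$, by Remark 2.4 (iii), identifying $\tau(\mathcal{S})$ as $\tau_1$, $\tau'$ as $\tau_2$, and $\mathcal{B}$ as $\mathcal{B}_2$, we have $\tau(\mathcal{S}) = \tau'$. In particular, we see that $\mathcal{B}$ is a base for $\tau(\mathcal{S})$. □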
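The two-stage construction behind Theorem 2.7, finite intersections first and unions second, can be carried out explicitly when the carrier is finite. The following sketch assumes a small hypothetical subbase on a four-point set; the function names are illustrative and do not come from the text.

```python
from itertools import chain, combinations

def nonempty_subfamilies(sets):
    """All nonempty finite subfamilies of a finite list of sets."""
    return chain.from_iterable(
        combinations(sets, r) for r in range(1, len(sets) + 1))

def base_from_subbase(subbase):
    """The empty set together with all finite intersections of subbase sets."""
    S = [frozenset(A) for A in subbase]
    base = {frozenset()}
    for fam in nonempty_subfamilies(S):
        base.add(frozenset.intersection(*fam))
    return base

def topology_from_base(base):
    """All unions of subfamilies of the base."""
    B = list(base)
    tau = {frozenset()}
    for fam in nonempty_subfamilies(B):
        tau.add(frozenset().union(*fam))
    return tau

X = frozenset({1, 2, 3, 4})
S = [{1, 2, 3}, {2, 3, 4}]        # hypothetical subbase; its union is X

B = base_from_subbase(S)          # adds the intersection {2, 3} and ∅
tau = topology_from_base(B)       # adds the union X

print(sorted(sorted(O) for O in tau))
# → [[], [1, 2, 3], [1, 2, 3, 4], [2, 3], [2, 3, 4]]
```

Here the subbase alone is not closed under intersections, the base adds $\{2, 3\}$, and the unions of base sets add the carrier itself, giving a topology with five open sets.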
2.8 Examples.

(i) In Example 2.3 (iii), it was shown that the system $\mathfrak{P}$ of all base parallelepipeds is a base for $(\mathbb{R}^n, \tau_e)$. On the other hand, it is easily seen that $\mathfrak{P}$ is closed relative to the formation of all finite intersections (recall that $\emptyset$ is also in $\mathfrak{P}$). Thus, $\mathfrak{P}$ is a base for $\tau(\mathfrak{P})$, according to Theorem 2.7. Furthermore, $\mathfrak{P}$ is a base for $\tau_e$. Thus, by Remark 2.4 (ii), $\tau_e$ and $\tau(\mathfrak{P})$ coincide. In other words, the natural topology $\tau_e$ on $\mathbb{R}^n$ is generated by the system of all base parallelepipeds. In another situation, we can take for $\mathcal{S}$ the system of all open parallelepipeds with rational coordinates, which is certainly closed relative to all finite intersections. Then, $\tau_e$ would also be generated by the system of all rational parallelepipeds. [Recall that the metric $d_e$ and the supremum metric are equivalent in $\mathbb{R}^n$. No wonder that $\tau_e$ and $\tau(\mathcal{S})$ coincide.]

(ii) In another scenario of $(\mathbb{R}^n, \tau_e)$, the collection of open parallelepipeds of types

$$\pi_i((a_i, b_i)) = \mathbb{R} \times \cdots \times \mathbb{R} \times (a_i, b_i) \times \mathbb{R} \times \cdots \times \mathbb{R},$$

where the $(a_i, b_i)$'s are open intervals in $\mathbb{R}$, $i = 1, \ldots, n$, forms a subbase for $\tau_e$. [Note that none of the $\pi_i((a_i, b_i))$ is a base parallelepiped.] This collection can be extended to a base $\mathcal{B}$ for $\tau_e$ by including in $\mathcal{B}$ the empty set $\emptyset$ and all finite intersections of the subbase parallelepipeds. Base $\mathcal{B}$ evidently contains $\mathfrak{P}$ (why?). □

PROBLEMS

2.1 Let $(X, \tau)$ be a topological space and let $\mathcal{B} \subset \tau$. Show that $\mathcal{B}$ is a base for $\tau$ if and only if for every open set $O \in \tau$ and each point $x \in O$, there is a subset $U$ of $O$ such that $x \in U \in \mathcal{B}$.

2.2 Show that $\mathcal{B} \subset \mathcal{P}(X)$ is a base for a topology on $X$ if and only if
(i) each $x \in X$ belongs to at least one set $B \in \mathcal{B}$ (or equivalently, $X = \bigcup_{B \in \mathcal{B}} B$) and
(ii) for all $B_1, B_2 \in \mathcal{B}$ and all $x \in B_1 \cap B_2$, there exists $B \in \mathcal{B}$ such that $x \in B \subset B_1 \cap B_2$.
[Hint: Use the steps that follow. 1) If $\mathcal{B}$ is a base, then apply Theorem 2.2 (ii). 2) Let $\tau = \{\bigcup_{B \in \mathcal{B}'} B : \mathcal{B}' \subset \mathcal{B}\}$. Show that $\tau$ is a topology on $X$ and that $\mathcal{B}$ is a base for $\tau$.]

2.3 Let $\mathcal{B}$ be a base for a topology $\tau$ on $X$. Since $\mathcal{B}$, in particular, is a subbase on $X$, it also generates the weakest topology $\tau(\mathcal{B})$ and hence $\tau(\mathcal{B}) \subset \tau$. Is $\tau(\mathcal{B}) = \tau$?

2.4 Let $\tau_1$ denote the topology on the real line generated by all semi-open intervals of type $[a,b)$ where $a, b \in \mathbb{R}$. This topology is called the lower limit topology. Show that $\{[a,b) : a, b \in \mathbb{R}\}$ is a base for $\tau_1$ and that $\tau_1$ is strictly finer than $\tau_e$, the usual topology on the real line.

2.5 Let $\mathcal{B} = \{[a,b) : a, b \in \mathbb{Q}\}$. Show that $\mathcal{B}$ is a base for the topology $\tau$ that $\mathcal{B}$ generates and that $\tau$ is strictly coarser than the lower limit topology $\tau_1$ of Problem 2.4.

2.6 Show that the collection of all sets on the real line of types $(a, \infty)$ and $(-\infty, b)$ is a subbase for the usual topology $(\mathbb{R}, \tau_e)$.

2.7 Show that any base and subbase parallelepipeds in Example 2.3 (iii) and Example 2.8 (ii), respectively, are open sets.
NEW TERMS:

pre-topology; base for a topology; base sets; base for a topology, criterion for; open parallelepiped (rectangle); rectangle; base (simple) parallelepiped (rectangle); simple parallelepiped; rational parallelepiped; subbase; topology generated by a subbase; base, a construction of; subbase parallelepiped; lower limit topology
3. CONVERGENCE OF SEQUENCES IN TOPOLOGICAL SPACES AND COUNTABILITY
Convergence of sequences, introduced in this section, generalizes that of Section 3, Chapter 2, for metric spaces, and it prepares the ground for the more general convergence of nets and filters to be treated in Section 9.

3.1 Definition. Let $\{x_n : n = 1,2,\ldots\} \subseteq (X,\tau)$ be a sequence and let $A$ be a set. A subsequence $Q_N = \{x_n : n = N, N+1, \ldots\}$ is called an $N(A)$-tail of $\{x_n\}$ for some $N \geq 1$ if $Q_N \subseteq A$. A sequence $\{x_n : n = 1,2,\ldots\} \subseteq X$ is said to converge to a point $x \in X$ if for every neighborhood $U_x$ of $x$ there is an $N(U_x)$-tail of $\{x_n\}$. The point $x$ is said to be a limit point of the sequence. A point $x$ is said to be a limit point of a set $A$ if $x$ is a limit point of some sequence $\{x_n\} \subseteq A$. □

Unlike in metric spaces, a sequence in a topological space can have more than one limit, as the following example shows.

3.2 Example. Let $X = \mathbb{R}$, let $\tau = \{\mathbb{R}, \emptyset, (-2,3], [-1,2]\}$ and let $x_n = \frac{1}{n}$, $n = 1,2,\ldots$. Then $\{\frac{1}{n}\}$ converges to all points of the set $[-1,2]$, since for each point $x \in [-1,2]$ its open neighborhoods are $\mathbb{R}$, $(-2,3]$, and $[-1,2]$, each one of which contains the whole sequence. □

In most applications we will deal with topological spaces in which every convergent sequence has exactly one limit. An important representative of this class is introduced in the definition below.

3.3 Definition. A topological space $(X,\tau)$ is said to be Hausdorff (or separated or $T_2$) if every two distinct points $x, y \in X$ possess disjoint neighborhoods. □

$T_2$ is often referred to as the second separation axiom. Other separation axioms will be introduced and discussed in Section 10. As mentioned, the following proposition (which will hardly be a challenge for the reader) is a consequence of the Hausdorff property.

3.4 Proposition. Let $(X,\tau)$ be a Hausdorff topological space, let $\lim_{n\to\infty} x_n = x$, and let $\lim_{n\to\infty} x_n = y$. Then $x = y$. □

(See Problem 3.1.)

3.5 Example. Let $(X,d)$ be a metric space and let $(X,\tau(d))$ be the corresponding metrizable topological space. With $x_1$ and $x_2$ being distinct points of $X$, construct two open balls, $B(x_1,r)$ and $B(x_2,r)$, with $r = \frac{1}{2}d(x_1,x_2)$. It follows that the balls are neighborhoods of $x_1$ and $x_2$, respectively, and that $B(x_1,r) \cap B(x_2,r) = \emptyset$; indeed, a common point $z$ would give $d(x_1,x_2) \leq d(x_1,z) + d(z,x_2) < 2r = d(x_1,x_2)$. This immediately implies that any metrizable topological space is Hausdorff. □
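The contrast between Example 3.2 and Definition 3.3 can be checked mechanically on a finite space. The following is only a sketch under stated assumptions: the three-point carrier, the `nested` topology, and the helper names `open_neighborhoods`, `limits`, and `is_hausdorff` are hypothetical and do not come from the text. It uses the fact that in a finite space some tail of a sequence lies in a set exactly when all values taken infinitely often lie in it.

```python
from itertools import combinations

def open_neighborhoods(x, topology):
    """All open sets of the (finite) topology that contain x."""
    return [U for U in topology if x in U]

def limits(recurring, points, topology):
    """Limit points of a sequence in a finite space.

    `recurring` is the set of values the sequence takes infinitely often.
    Some tail lies inside U exactly when recurring <= U, so x is a limit
    iff every open neighborhood of x contains `recurring`.
    """
    recurring = frozenset(recurring)
    return {x for x in points
            if all(recurring <= U for U in open_neighborhoods(x, topology))}

def is_hausdorff(points, topology):
    """Definition 3.3, checked on open neighborhoods only."""
    return all(
        any(U.isdisjoint(V)
            for U in open_neighborhoods(x, topology)
            for V in open_neighborhoods(y, topology))
        for x, y in combinations(sorted(points), 2))

X = {0, 1, 2}
# Hypothetical nested topology, a finite stand-in for Example 3.2.
nested = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)}
# Discrete topology: every subset is open.
discrete = {frozenset(c) for r in range(4) for c in combinations(sorted(X), r)}

# Limits of the constant sequence 1, 1, 1, ... in both topologies.
print(limits({1}, X, nested), is_hausdorff(X, nested))
print(limits({1}, X, discrete), is_hausdorff(X, discrete))
```

In the nested topology the constant sequence has two limits, which is possible only because the space is not Hausdorff; in the discrete topology the limit is unique, in agreement with Proposition 3.4.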
3.6 Remarks.
(i) In metric spaces (see Corollary 3.4, Chapter 2), a point is a closure point of a set $A$ if and only if it is a limit point of $A$. This does not apply to general topological spaces. More specifically, a limit point is always a closure point, but the converse is not true. Let $x$ be a closure point of $A$. If $x \in A$, then setting $x_n = x$ we have a sequence convergent to $x$. If $x \notin A$ then, by the definition of a closure point, $U_x \cap A \neq \emptyset$ for each neighborhood $U_x$ of $x$. In this case, however, it is not clear how to choose a sequence convergent to $x$, i.e., how to ensure that for each $U_x$ there is an $N(U_x)$-tail, for we do not have the flexibility of metric spaces with balls like $B(x,\frac{1}{n})$ of Theorem 3.3, Chapter 2. In Remark (ii) below we will demonstrate an example of a topology where a set $A$ contains all of its limit points and yet is not closed, or, in other words, some closure points of $A$ are not its limit points. However, if $x$ is a limit point of $A$, then it is always a closure point. Indeed, if $\{x_n\} \subseteq A$ is a sequence convergent to $x$, then for every neighborhood $U_x$ of the point $x$ there is a tail $\{x_N, x_{N+1}, \ldots\}$ which is contained in $U_x$, and hence $U_x$ meets $A$.

(ii) Consider the cocountable topology $\tau$ on $\mathbb{R}$ introduced earlier in Problem 1.9. Take $A = (a,b)$ where $a < b$. Let $\{x_i\} \subseteq A$ be a sequence. Then, by the definition of $\tau$, the complement of $\{x_i\}$ is open (and disjoint from $\{x_i\}$). If this sequence has a limit $x \in A^c$, then this limit should belong to the open set $\{x_i\}^c$ (since $\{x_n\} \subseteq A \Rightarrow A^c \subseteq \{x_n\}^c$), which can serve as an open neighborhood of $x$. This neighborhood does not contain a single element of the sequence and, therefore, $x$ cannot be its limit; or equivalently, this sequence cannot converge to any point of $A^c$. Therefore, every limit point of $A$ belongs to $A$. However, $A$ is not closed. To see this, consider the endpoint $a$ of $A = (a,b)$. Let $O$ be any open set of the form $\mathbb{R} \setminus \{\text{any sequence not containing } a\}$. Then $O$ is a neighborhood of $a$, any such neighborhood $O$ meets $A$ (because $A$ is uncountable while $O^c$ is countable), and so $a$ is an accumulation point of $A$. Thus, $A$ is not closed, for otherwise, by Problem 1.11, it would contain all of its accumulation points.

An alternative argument shows that the only convergent sequences in a cocountable topology are those with constant tails. In other words, any sequence $\{x_n\}$ convergent to $x$ has an $N$-tail of the form $\{x_n = x : n \geq N\}$. Indeed, the complement of $\{x_n : x_n \neq x\}$ is an open set containing $x$, so it must contain a tail of the sequence. Therefore, every set contains all limits of its convergent sequences, but the only closed sets are the countable ones and the carrier.

(iii) Consequently, there arises the question: Under what condition does a topological space have the property metric spaces have, namely,

$$x \in \overline{A} \iff \exists \text{ a sequence in } A \text{ whose limit is } x? \tag{3.6}$$
(With no additional condition, this result is valid for metric spaces; see Theorem 3.3, Chapter 2.) In other words, when is a set closed if and only if it contains all of its limit points? We also raise another question: When can Proposition 3.4 be reversed, i.e. when does the uniqueness of limits imply that the space $(X,\tau)$ is Hausdorff?

These two questions are closely related. To see this, assume that in a topological space $(X,\tau)$ property (3.6) holds and, in addition, all limits are unique. Assume that we can prove that (3.6) also holds in the "product topology" $\tau_p$ on $X^2 = X \times X$ generated by the base $\mathfrak{B} = \tau \times \tau$ that consists of all open parallelepipeds $\{O_1 \times O_2 : O_1, O_2 \in \tau\}$. Pick an arbitrary point $(x,y)$ from $\overline{D}$, which stands for the closure of the diagonal $D = \{(x,x) : x \in X\}$. If (3.6) holds in $(X,\tau)$ (and eventually in $(X^2,\tau_p)$), then the point $(x,y)$ is a limit of some sequence $\{(x_n,y_n)\} \subseteq D$. Since $(x_n,y_n) \in D$, we have that $x_n = y_n$; and, in accordance with our above assumption, by uniqueness of limits, $x = y$. Thus $(x,y) \in D$, i.e. $\overline{D} = D$, or $D$ is closed in the product topology $\tau_p$. The latter implies that any point $(x,y)$ with distinct coordinates is an interior point of $D^c$ and hence it is contained in some base neighborhood $O_x \times O_y \subseteq D^c$. This implies that $O_x \cap O_y = \emptyset$, i.e. $(X,\tau)$ is Hausdorff.

If (3.6) is so crucial for $(X,\tau)$ to be Hausdorff, what then is a prerequisite for (3.6)? The answer is provided in the upcoming Theorems 3.8 and 3.10. Before that we introduce the following important notions.
3.7 Definitions.

(i) A topological space $(X,\tau)$ is said to satisfy the first axiom of countability (or to be first countable) if each point $x \in X$ has an at most countable neighborhood base.

(ii) A topological space $(X,\tau)$ is said to satisfy the second axiom of countability (or to be second countable) if $(X,\tau)$ has a countable base. □

As mentioned, a noteworthy attribute of topological spaces emulating metric spaces is the subject of Theorem 3.8, whose proof is left to the reader in Problem 3.7.

3.8 Theorem. Let $(X,\tau)$ be first countable and let $A$ be a subset of $X$. Then a point $x$ is a closure point of $A$ if and only if there exists a sequence $\{x_n\}$ $(\subseteq A)$ which converges to $x$. □

3.9 Remark. In what follows, we anticipate the notion of the product topology, to be rigorously constructed in Section 5 of this chapter. We will call the topology on the Cartesian product $X \times X$ generated by all open parallelepipeds $O_1 \times O_2 \in \tau \times \tau$ the product topology and denote it by $\tau_p$. The reason why $\tau \times \tau$ is a generator for $\tau_p$ is that $\tau \times \tau$ is a subbase and base for $\tau_p$ (in light of Proposition 2.7). Obviously, $\tau_p$ is first countable if $\tau$ is; show it (see Problem 3.12). □

The statement below builds the promised bridges between uniqueness of limits of sequences, Hausdorff spaces, and closedness of the diagonal in $\tau_p$. The same result will be generalized and applied to filters and nets in Section 10 (Theorem 10.22).

3.10 Theorem. Let $(X,\tau)$ be a topological space. Then the following are equivalent.

(i) $(X,\tau)$ is Hausdorff.

(ii) All convergent sequences in $(X,\tau)$ have unique limit points.

(iii) The diagonal $D = \{(x,x) : x \in X\}$ is closed in the product topology $\tau_p$ on $X^2$.
Proof. (i) $\Rightarrow$ (ii) holds according to Proposition 3.4 (Problem 3.1).

For (ii) $\Rightarrow$ (iii) we assume that all limits of sequences in $(X,\tau)$ are unique. If $D$ is not closed, then there is a sequence $\{(x_n,x_n)\} \subseteq D$ such that $(x_n,x_n) \to (x,y)$ with $x \neq y$. (Passing from a closure point of $D$ to such a sequence relies on property (3.6) in $(X^2,\tau_p)$, which by Theorem 3.8 and Remark 3.9 is available when $(X,\tau)$ is first countable.) But this immediately contradicts assumption (ii), since then $x_n \to x$ and $x_n \to y$.

For (iii) $\Rightarrow$ (i) we assume that the diagonal $D$ is closed in $(X^2,\tau_p)$. Let $x \neq y \in X$. Then $(x,y) \in D^c \subseteq X^2$. Since $D^c$ is open, it can be represented as a union of base open sets, i.e. as a union of open parallelepipeds. Then at least one of these parallelepipeds, say $O_x \times O_y \subseteq D^c$, must contain the point $(x,y)$, i.e., $x \in O_x$ and $y \in O_y$. Thus $O_x$ and $O_y$ are open neighborhoods of $x$ and $y$, respectively. They are disjoint, since $O_x \times O_y \subseteq D^c$. Hence, $(X,\tau)$ is Hausdorff. □
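For finite spaces the equivalence of the Hausdorff property and closedness of the diagonal can be verified by brute force. The sketch below is illustrative only: the carrier `X`, the three sample topologies, and the function names are assumptions made for the example. It tests closedness of $D$ by checking whether $D^c$ equals the union of the open rectangles $U \times V$ contained in it.

```python
from itertools import combinations, product

def is_hausdorff(points, topology):
    """Every pair of distinct points has disjoint open neighborhoods."""
    return all(
        any(x in U and y in V and U.isdisjoint(V)
            for U in topology for V in topology)
        for x, y in combinations(sorted(points), 2))

def diagonal_is_closed(points, topology):
    """True if D^c equals the union of the open rectangles U x V inside it."""
    complement = {(x, y) for x, y in product(points, repeat=2) if x != y}
    interior = set()
    for U, V in product(topology, repeat=2):
        rectangle = set(product(U, V))
        if rectangle <= complement:
            interior |= rectangle
    return interior == complement

X = {0, 1, 2}
# Hypothetical sample topologies on a three-point carrier.
examples = {
    "nested":    {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)},
    "partition": {frozenset(), frozenset({0}), frozenset({1, 2}), frozenset(X)},
    "discrete":  {frozenset(c) for r in range(4)
                  for c in combinations(sorted(X), r)},
}
for name, tau in examples.items():
    print(name, is_hausdorff(X, tau), diagonal_is_closed(X, tau))
```

In each sample the two tests agree, as the equivalence of (i) and (iii) predicts.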
PROBLEMS

3.1 Prove Proposition 3.4.

3.2 Show that any one-point set in a Hausdorff space is closed.

3.3 Show that any metric space is first countable.

3.4 Prove that any separable metric space is second countable.

3.5 Is it true that any first countable topological space is also second countable?

3.6 Prove that if a topological space is second countable, then it is separable and first countable.

3.7 Prove Theorem 3.8.

3.8 Let $O \subseteq X$ be open. Show that $\forall x \in O$ and $\forall$ sequence $x_n \to x$, there is an $N(O)$-tail of this sequence. Prove the converse of this statement assuming that $X$ is first countable.

3.9 While Corollary 3.4, Chapter 2, claims that in a metric space a set $A$ is closed if and only if it contains all its limit points, Remark 3.6 (ii) asserts that in a general topological space a set $A$ could contain all its limit points and still not be closed. However, for any set $A$ of a first countable space, the former property does hold. Show that a set $F$ is closed in $X$ if and only if each convergent sequence in $F$ converges to a point in $F$.

3.10 Show that subspaces of second countable spaces are second countable.

3.11 Show that $\tau_1 \times \tau_2$, which consists of all open parallelepipeds $\{O_1 \times O_2 : O_1 \in \tau_1, O_2 \in \tau_2\}$, is $\cap$-stable.

3.12 Show that $\tau_p$ in Remark 3.9 is first countable if $\tau$ is first countable.
NEW TERMS:

$N(A)$-tail of a sequence 122
convergent sequence 122
limit point of a sequence 122
limit point of a set 122
Hausdorff (separated, $T_2$) topological space 122
separated topological space 122
$T_2$ space 122
Second Separation Axiom 122
product topology 124
diagonal 124
First Axiom of Countability 122
first countable topological space 124
Second Axiom of Countability 124
second countable topological space 124
closure point, criterion of 124
Hausdorff space, criterion of 125

4. CONTINUITY IN TOPOLOGICAL SPACES
Except for a brief introduction of sequences (a rather vague manifestation of functions) in the previous section, functions appear in the present section for the first time in conjunction with topologies. Naturally, the first property we look into is continuity. After a first acquaintance with continuity in metric spaces (Section 4, Chapter 2), the reader will be well prepared for its "surprising" variant for topological spaces and for a striking similarity between Theorem 4.2 below and Theorem 4.3, Chapter 2, with respect to a key continuity criterion. Again, we will observe some other continuity properties, typical for metric spaces, that hold for special topological spaces more general than metric spaces. One of them deals with an important relationship between convergence of sequences and continuity of functions initiated in Chapter 2 (formulated as Theorem 4.4 and pledged to be proved in this section).
4.1 Definitions.

(i) A function $f : (X,\tau) \to (Y,\tau_1)$ is said to be continuous at a point $a \in X$ if, for every neighborhood $W_{f(a)}$, there is a neighborhood $U_a$ such that $f_*(U_a) \subseteq W_{f(a)}$. This is obviously equivalent to the following definition: $f$ is continuous at $a$ if, for every neighborhood $W_{f(a)}$, $f^*(W_{f(a)})$ is a neighborhood of $a$ (see Problem 4.1).

(ii) The function $f$ is said to be continuous on $X$ (or simply continuous) if it is continuous at each point $a \in X$. □

4.2 Theorem. Let $f : (X,\tau) \to (Y,\tau_1)$ be a function. Then the following are equivalent.

(i) $f$ is continuous.

(ii) The inverse image under $f$ of any open set $H \in \tau_1$ is open, i.e. is an element of $\tau$.
Proof.

(i) $\Rightarrow$ (ii). Let $H \in \tau_1$. For each point $a \in f^*(H)$, $f(a) \in H$ and therefore $f(a)$ is an interior point of $H$. Specifically, $H$ is a neighborhood of $f(a)$. Since $f$ is continuous at $a$, there is a neighborhood $U_a$ such that $f_*(U_a) \subseteq H$. Because the inclusion is preserved under the inverse, we have

$$U_a \subseteq f^*(f_*(U_a)) \subseteq f^*(H),$$

which implies that $f^*(H)$ contains a neighborhood for each of its points. Hence, $f^*(H)$ is itself a neighborhood for all of its points. Therefore, by Proposition 1.6, $f^*(H)$ is open, i.e. is an element of $\tau$.

(ii) $\Rightarrow$ (i). Let $a \in X$ and let $W_{f(a)}$ be a neighborhood of $f(a)$. Then there exists an open set $H \in \tau_1$ such that $f(a) \in H \subseteq W_{f(a)}$. By assumption (ii), $f^*(H)$ is an element of $\tau$. Since obviously $a \in f^*(H)$, $f^*(H)$ is a neighborhood of $a$ and thus $f^*(W_{f(a)}) \supseteq f^*(H)$ is also a neighborhood of $a$. Consequently, we have continuity of $f$ at $a$. □

Let $(X,\tau)$ be a topological space. Denote the collection of all closed sets $O^c$ such that $O \in \tau$ by $\tau^c$.

4.3 Proposition. A function $f : (X,\tau) \to (Y,\tau_1)$ is continuous on $X$ if and only if the inverse image under $f$ of any closed set $O^c \in \tau_1^c$ is closed in $(X,\tau)$. □

(See Problem 4.2.)

4.4 Proposition. Let $(X,\tau)$, $(Y,\tau_1)$ and $(Z,\tau_2)$ be topological spaces and let $f : X \to Y$ and $g : Y \to Z$ be continuous functions. Then the function $g \circ f : X \to Z$ is continuous. □

(See Problem 4.3.)

4.5 Definition. Let $(X,\tau)$ be a topological space and let $[X,Y,f]$ be a function. Define

$$\tau_q = \{B \subseteq Y : f^*(B) \in \tau\},$$

i.e., $f^{**}(\tau_q) \subseteq \tau$. By the arguments below (Remarks 4.6), $\tau_q$ is a topology and it contains any topology relative to which $f$ is continuous. $\tau_q$ is called the quotient topology induced on $Y$ by $f$. □

[Recall that $f^*$ is defined on $\mathcal{P}(Y)$; consequently, we denote by $f^{**}$ the corresponding function acting on $\mathcal{P}(\mathcal{P}(Y))$.]

4.6 Remarks.

(i) $\tau_q$ is indeed a topology:

1) $\emptyset, Y \in \tau_q$.

2) $B_1, \ldots, B_n \in \tau_q \Rightarrow f^*\big(\bigcap_{k=1}^n B_k\big) = \bigcap_{k=1}^n f^*(B_k) \in \tau$ (as the intersection of open sets) $\Rightarrow \bigcap_{k=1}^n B_k \in \tau_q$.

3) A similar consideration can be used to show that $\tau_q$ contains all unions.

(ii) $\tau_q$ is the largest topology on $Y$ relative to which $f$ is continuous. This follows directly from Definition 4.5. □

4.7 Example. Let $X = \mathbb{R}$, $\tau = \{\mathbb{R}, \emptyset, (-1,2], [0,3), [0,2], (-1,3), (-1,1)\}$ and let $f(x) = x^2$ defined as $f : \mathbb{R} \to \mathbb{R} = Y$. It is clear that $\mathbb{R}$, $\emptyset$ and $[0,1)$ are the only subsets of $Y$ whose inverse images are in $\tau$. Therefore, $\{\mathbb{R}, \emptyset, [0,1)\}$ is the quotient topology on $Y$. □

By Theorem 4.2, $f : (X,\tau) \to (Y,\tau')$ is continuous if and only if $f^{**}(\tau') \subseteq \tau$. However, if we know a generator $\mathcal{S}$ of $\tau'$, then condition (ii) of Theorem 4.2 can be weakened, as the following theorem shows.

4.8 Theorem. Let $f : (X,\tau) \to (Y,\tau(\mathcal{S}))$ (where $\tau(\mathcal{S})$ is the topology generated by a subbase $\mathcal{S}$). Then $f$ is continuous if and only if $f^{**}(\mathcal{S}) \subseteq \tau$.

Proof. If $f$ is continuous, then, in particular, $f^{**}(\mathcal{S}) \subseteq \tau$. Assume that $f^{**}(\mathcal{S}) \subseteq \tau$ and introduce the quotient topology $\tau_q$ induced by $f$. Thus, $\mathcal{S} \subseteq \tau_q$, which implies that $\tau(\mathcal{S}) \subseteq \tau_q$, for $\tau(\mathcal{S})$ is the smallest topology containing $\mathcal{S}$. Then, since $f^{**}(\tau_q) \subseteq \tau$, we have

$$f^{**}(\tau(\mathcal{S})) \subseteq f^{**}(\tau_q) \subseteq \tau,$$

and $f$ is continuous by Theorem 4.2. □
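The quotient topology of Definition 4.5 and the subbase criterion of Theorem 4.8 can both be computed directly on small finite spaces. What follows is a minimal sketch, not a general-purpose implementation: the carriers, the maps `f` and `g` (given as dictionaries), and all function names are hypothetical choices made for illustration.

```python
from itertools import combinations

def powerset(points):
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def preimage(f, domain, B):
    """f^*(B) for a map given as a dict on a finite domain."""
    return frozenset(x for x in domain if f[x] in B)

def is_continuous(f, domain, tau, family):
    """True if f^*(B) is in tau for every B in `family`."""
    return all(preimage(f, domain, B) in tau for B in family)

def quotient_topology(f, domain, tau, codomain):
    """Definition 4.5: all B in Y whose inverse image is open in X."""
    return {B for B in powerset(codomain) if preimage(f, domain, B) in tau}

def topology_from_subbase(subbase, whole):
    """Finite intersections of subbase sets, then all unions of those."""
    subbase = list(subbase)
    base = {frozenset(whole)}
    for r in range(1, len(subbase) + 1):
        for sets in combinations(subbase, r):
            base.add(frozenset.intersection(*sets))
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for sets in combinations(base, r):
            topology.add(frozenset.union(*sets))
    return topology

# Quotient topology induced by a surjection onto {'a', 'b'}.
X = {0, 1, 2}
tau = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(X)}
f = {0: 'a', 1: 'b', 2: 'b'}
tau_q = quotient_topology(f, X, tau, {'a', 'b'})

# Subbase criterion: test continuity on S only, then on all of tau(S).
Y = {'a', 'b', 'c'}
S = [frozenset({'a', 'b'}), frozenset({'b', 'c'})]
tau_X = {frozenset(), frozenset({1}), frozenset({0, 1}),
         frozenset({1, 2}), frozenset(X)}
g = {0: 'a', 1: 'b', 2: 'c'}
print(is_continuous(g, X, tau_X, S),
      is_continuous(g, X, tau_X, topology_from_subbase(S, Y)))
```

Testing the two subbase sets gives the same verdict as testing all five open sets of $\tau(\mathcal{S})$, which is the saving Theorem 4.8 provides.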
4.9 Theorem. Let $f : (X,\tau) \to (Y,\tau')$ be a map continuous at some point $x \in X$. If $\{x_n\}$ is a sequence convergent to $x$, then the sequence $\{f(x_n)\}$ is convergent to $f(x)$. □

(See Problem 4.10.)

Theorems 4.8 and 4.9 and the next theorem form an analog to Theorem 4.6, Chapter 2, which was only valid for metric spaces. The statement in Theorem 4.9 has no restriction as to the nature of topological spaces $(X,\tau)$ and $(Y,\tau')$, while its converse needs to be strengthened by the condition that $(X,\tau)$ is first countable.

4.10 Theorem. Let $f : (X,\tau) \to (Y,\tau')$ be a map and let $(X,\tau)$ be first countable. If for any sequence $\{x_n\}$ convergent to a point $x \in X$ the sequence $\{f(x_n)\}$ converges to $f(x)$, then $f$ is continuous at $x$.

Proof. We argue by contraposition: assuming that $f$ is not continuous at $x$, we select a sequence $\{x_n\}$ convergent to $x$ such that $\{f(x_n)\}$ does not converge to $f(x)$. The assumption that $(X,\tau)$ is first countable is essential in the selection of a convergent sequence $\{x_n\}$, which otherwise need not exist.

If $f$ is not continuous at $x$, there is a neighborhood $W_{f(x)}$ such that $f^*(W_{f(x)})$ is not a neighborhood of $x$, or equivalently, there is no neighborhood $U_x$ such that $f_*(U_x) \subseteq W_{f(x)}$. [Otherwise, if $f_*(U_x) \subseteq W_{f(x)}$, then

$$U_x \subseteq f^*(f_*(U_x)) \subseteq f^*(W_{f(x)}).$$

This would contradict our assumption. (See Figure 4.1.)]

[Figure 4.1]
Specifically, it follows that, for each base neighborhood $B \in \mathfrak{B}_x$, $f_*(B)$ is not a subset of $W_{f(x)}$. Since $(X,\tau)$ is first countable, there is a countable neighborhood base $\mathfrak{B}_x = \{B_1, B_2, \ldots\}$, which can always be assumed to be monotone decreasing (why?). Now, each $B_i$ contains at least one point, say $x_i$, such that $f(x_i) \notin W_{f(x)}$, which immediately yields that the sequence $\{f(x_n)\}$ is not in $W_{f(x)}$ and, thus, does not converge to $f(x)$. However, $x_n \to x$. Indeed, for every neighborhood $V_x$ there is an element $B_N \in \mathfrak{B}_x$ such that $B_N \subseteq V_x$, which implies that $B_k \subseteq V_x$ $\forall k \geq N$ (since $\mathfrak{B}_x$ is monotone decreasing). Thus, $\{x_N, x_{N+1}, \ldots\}$ is the $N(V_x)$-tail of $\{x_n\}$. □

Theorem 4.10 leads to some useful applications.

4.11 Lemma. Let $f, g : (X,\tau) \to (Y,\tau')$ be two continuous maps. If $(Y,\tau')$ is Hausdorff, then the set $S = \{x \in X : f(x) = g(x)\}$ is closed in $(X,\tau)$.

Proof. Since $f$ and g are continuous, the map $(f,g) : X \to Y \times Y$, $x \mapsto (f(x), g(x))$, is continuous relative to the product topology on $Y \times Y$ (see Problem 5.5). Since, by the assumption, $(Y,\tau')$ is Hausdorff, by Theorem 3.10 the diagonal $D$ in $Y \times Y$ is closed. Hence, the set $S$, as the inverse image of the diagonal $D$ under the continuous map $(f,g)$, must be closed. □

4.12 Proposition. Let $f, g : (X,\tau) \to (Y,\tau')$ be two continuous maps that coincide on some dense set in $X$. If $(X,\tau)$ is first countable and if $(Y,\tau')$ is $T_2$, then $f = g$ on $X$. □
Thus, it follows that, under the conditions of the proposition, a continuous function is uniquely determined by its values on a dense set. The proof of this proposition is the subject of Problem 4.11.

4.13 Example. If $f, g : (\mathbb{R}^n,\tau_e) \to (\mathbb{R}^n,\tau_e)$ are continuous maps that coincide on the set $\mathbb{Q}^n$ of all vectors with rational coordinates, then $f$ and $g$ are identical on $\mathbb{R}^n$. This fact takes into account that $(\mathbb{R}^n,\tau_e)$ is Hausdorff and first countable. □

4.14 Definition. Let $(X,\tau)$ and $(Y,\tau')$ be two topological spaces. A bijective map $[X,Y,f]$ is called a homeomorphism if both $f$ and $f^{-1}$ are continuous. The topological spaces $(X,\tau)$ and $(Y,\tau')$ are then called homeomorphic. We write $X \simeq Y$. If $f$ fails to be surjective (but is a homeomorphism of $X$ onto its image $f(X)$ with the relative topology), then $f$ is called an embedding of $X$ into $Y$. $X$ is also said to be embedded in $Y$ by $f$. □

4.15 Remark. It is not hard to see that the homeomorphic property applied to a collection of topological spaces on fixed carriers $X$ and $Y$ offers an equivalence relation (show it, Problem 4.12). □
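The Hausdorff hypothesis in Lemma 4.11 cannot simply be dropped. The sketch below illustrates this on a two-point carrier; the names `sierpinski`, `discrete`, `coincidence_set`, and the two maps are assumptions made for the example, not objects from the text. With the non-Hausdorff topology the coincidence set of two continuous maps fails to be closed, while with the discrete (Hausdorff) topology it is closed.

```python
def preimage(f, domain, B):
    """f^*(B) for a map given as a dict on a finite domain."""
    return frozenset(x for x in domain if f[x] in B)

def is_continuous(f, domain, tau_X, tau_Y):
    """Theorem 4.2 (ii): inverse images of open sets are open."""
    return all(preimage(f, domain, H) in tau_X for H in tau_Y)

def is_closed(A, points, topology):
    """A is closed iff its complement is open."""
    return frozenset(points) - frozenset(A) in topology

def coincidence_set(f, g, domain):
    """S = {x : f(x) = g(x)} as in Lemma 4.11."""
    return {x for x in domain if f[x] == g[x]}

P = {0, 1}
sierpinski = {frozenset(), frozenset({1}), frozenset(P)}   # not Hausdorff
discrete = {frozenset(), frozenset({0}), frozenset({1}), frozenset(P)}

f = {0: 0, 1: 1}   # identity map
g = {0: 1, 1: 1}   # constant map
S = coincidence_set(f, g, P)

for name, tau in (("sierpinski", sierpinski), ("discrete", discrete)):
    both = is_continuous(f, P, tau, tau) and is_continuous(g, P, tau, tau)
    print(name, both, is_closed(S, P, tau))
```

Both maps are continuous in either topology, so the only difference between the two runs is the separation property of the target space.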
PROBLEMS

4.1 Show that $f$ is continuous at a point $a$ if and only if for every neighborhood $W_{f(a)}$, $f^*(W_{f(a)})$ is a neighborhood of $a$.

4.2 Prove Proposition 4.3.

4.3 Prove Proposition 4.4.

4.4 Let $f : (X,\tau) \to (Y,\tau_1)$ be a function such that $f(x) = x$, $X = Y = \mathbb{R}$, $\tau = \{\mathbb{R}, \emptyset, \{1\}, [1,3)\}$ and $\tau_1 = \{\mathbb{R}, \emptyset, \{2\}, [2,4)\}$. Is $f$ continuous?

4.5 Under the conditions of Problem 4.4, set $f(x) = x + 1$. Is $f$ continuous?

4.6 Let $f : (X,\tau) \to (Y,\tau')$ be a map. Show that $f$ is continuous at a point $x \in X$ if and only if, for any base neighborhood $B_{f(x)}$ of the point $f(x)$, $f^*(B_{f(x)})$ is a neighborhood of $x$.

4.7 Under the condition of Problem 4.6, assume that $\tau' = \tau(d)$, i.e. $(Y,\tau')$ is a metrizable topological space.
a) Show that $f$ is continuous at $x \in X$ if and only if the inverse image under $f$ of any open ball $B_d(f(x),\varepsilon)$ is a neighborhood of $x$.
b) Show that, for each open ball $B_d(f(x),\varepsilon)$, there is a neighborhood $U_x(\varepsilon)$ such that $f_*(U_x(\varepsilon)) \subseteq B_d(f(x),\varepsilon)$.

4.8 Let $f : (X,\tau) \to (Y, \|\cdot\|_d)$ be a map, where $Y$ is an NLS over a field $\mathbb{F}$, and let $\|\cdot\|_d$ be the norm generated by a TIH metric $d$. Show that $f$ is continuous at $x \in X$ if and only if, for every $\varepsilon > 0$, there is a neighborhood $U_x(\varepsilon) \in \mathcal{U}_x$ such that for each $y \in U_x(\varepsilon)$, $\|f(x) - f(y)\|_d < \varepsilon$.

4.9 Prove the following statement: Let $f : (X,\tau) \to (\mathbb{R}^n, d_e)$, where $(X,\tau)$ is a topological space. Then $f$ is continuous at a point $x \in X$ if and only if, for every $\varepsilon > 0$, there is a neighborhood $U_x(\varepsilon) \in \mathcal{U}_x$ such that, for all $y \in U_x(\varepsilon)$, $\|f(x) - f(y)\|_e < \varepsilon$. ($\|\cdot\|_e$ denotes the Euclidean norm.)

4.10 Prove Theorem 4.9.

4.11 Prove Proposition 4.12.

4.12 Prove the statement posed in Remark 4.15.

4.13 Show that $(\mathbb{R},\tau_e)$ is homeomorphic to $(-1,1)$ with the corresponding relative topology on $(-1,1)$.

4.14 Is $(\mathbb{R},\tau_e)$ homeomorphic to $[-1,1]$?
NEW TERMS:

function continuous at a point 128
function continuous at a point, criterion of 128
continuous function 128
continuity of a function, criterion of 128, 129, 130
composition of continuous functions 129
quotient topology 129
continuous function on a dense set 131
homeomorphism 132
homeomorphic topological spaces 132
embedding 132
embedded set 132
5. PRODUCT TOPOLOGY
Let $(Y_1,\tau_1), \ldots, (Y_n,\tau_n)$ be topological spaces. One of the reasonable ways to define a topology on the Cartesian product $Y = Y_1 \times \ldots \times Y_n$ is to take the collection

$$\mathfrak{B} = \{O_1 \times \ldots \times O_n : O_i \in \tau_i,\ i = 1, \ldots, n\}$$

for a family of "open" parallelepipeds and declare it as a base for the topology it generates. $\mathfrak{B}$ is obviously closed relative to the formation of all finite intersections [show it], and therefore, by Proposition 2.7, is a base for $\tau(\mathfrak{B})$ that includes all unions of elements of $\mathfrak{B}$. We wish to call $\tau(\mathfrak{B})$ the product topology on $Y$ and denote it by $\tau_p$. The following is an attempt to reduce the base $\mathfrak{B}$ for $\tau_p$.

5.1 Proposition. Let

$$\mathfrak{B}' = \{B_1 \times \ldots \times B_n : B_i \in \mathfrak{B}_i,\ i = 1, \ldots, n\},$$

where $\mathfrak{B}_i$ is a base for $\tau_i$, $i = 1, \ldots, n$. Then $\mathfrak{B}'$ is also a base for $\tau_p$. □

(See Problem 5.1.) Any element of $\mathfrak{B}'$ is called a base parallelepiped.

5.2 Proposition. Let

$$\mathcal{S} = \{S_1 \times \ldots \times S_n : S_i \in \mathcal{S}_i,\ i = 1, \ldots, n\},$$

where $\mathcal{S}_i$ is a subbase for $\tau_i$, $i = 1, \ldots, n$. Then $\mathcal{S}$ is a subbase for $\tau_p$. □

(See Problem 5.2.) Any element of $\mathcal{S}$ is called a subbase parallelepiped.

5.3 Proposition. Let

$$\mathcal{S}' = \{\pi_i^*(S_i) : S_i \in \mathcal{S}_i,\ i = 1, \ldots, n\},$$

where $\mathcal{S}_i$ is a subbase for $\tau_i$. Then $\mathcal{S}' \subseteq \mathcal{S}$ is a subbase for $\tau_p$. □

(See Problem 5.3.) Observe that any element of $\mathcal{S}'$ is a unit cylinder.

5.4 Example. As was mentioned in Example 2.8 (i), the usual topology $\tau_e$ on $\mathbb{R}^n$ coincides with the product topology $\tau_p$ on $\mathbb{R}^n = \mathbb{R} \times \ldots \times \mathbb{R}$ generated by the base $\mathfrak{B}$ of all open parallelepipeds (as the $n$-times Cartesian product of open sets in $\mathbb{R}$). The base parallelepipeds are of the form $\prod_{i=1}^n (a_i,b_i)$, where the intervals $(a_i,b_i) \subseteq \mathbb{R}$ are elements of a base for $(\mathbb{R},\tau_e)$. In particular, the system of all rational parallelepipeds is also a base for $\tau_p = \tau_e$. The system $\mathcal{S}'$ of all unit cylinders $\{\pi_i^*((a_i,b_i)) : (a_i,b_i) \subseteq \mathbb{R},\ i = 1, \ldots, n\}$ is a subbase for $\tau_e$. (See Example 2.8 (ii).) □

It is apparent that the projection maps are continuous relative to the product topology. Furthermore,
5.5 Theorem. Let $Y = \prod_{i=1}^n Y_i$ and let $\pi_j : Y \to Y_j$ be the $j$th projection map, $j = 1, \ldots, n$. Then the product topology $\tau_p$ on $Y$ is the weakest topology for which each projection is continuous.

Proof. Let $\tau$ be a topology on $Y$ for which each projection is continuous, i.e. $\pi_j^{**}(\tau_j) \subseteq \tau$, $j = 1, \ldots, n$. Then for every choice of sets $O_j \in \tau_j$, $j = 1, \ldots, n$,

$$O = \bigcap_{k=1}^n \pi_k^*(O_k) \in \tau.$$

But $O = O_1 \times \ldots \times O_n$ is a base set of $\tau_p$, and every base set of $\tau_p$ is of this form. Thus the base $\mathfrak{B}$ for $\tau_p$ satisfies $\mathfrak{B} \subseteq \tau$, and by Remark 2.4 (i), $\tau_p \subseteq \tau$. □

We extend the notion of product topology of finitely many factor spaces to that on the Cartesian product of arbitrarily many factor spaces. We therefore assume that $\{(Y_x,\tau_x) : x \in X\}$ is an arbitrary indexed family of topological spaces. Let us consider two different models of topologies on the Cartesian product $Y = \prod_{x \in X} Y_x$. One of them, called the box topology (in notation $\tau_b$), is subject to the following construction. We take for a base for $\tau_b$ the system of box parallelepipeds,

$$\Big\{\prod_{x \in X} O_x : O_x \in \tau_x\Big\},$$

or even a weaker base,

$$\mathfrak{B}_b = \Big\{\prod_{x \in X} B_x : B_x \in \mathfrak{B}_x\Big\},$$

where $\mathfrak{B}_x$ is a base for $\tau_x$. Hence, the introduced box topology $\tau_b$ is not different from its version for finitely many factor spaces. There is another, "more economical" topology on $Y$, which also preserves continuity of projection maps and, in addition, leads to a tame formation of the widely used "pointwise topology" (which the box topology does not).

5.6 Definition. Let us define the topology $\tau_p$ on $Y$ through the base

$$\mathfrak{B} = \Big\{\prod_{x \in X} O_x : O_x \in \tau_x\Big\}, \tag{5.6}$$

where $O_x = Y_x$, except for finitely many indices $x \in X$. In other words, all elements of $\mathfrak{B}$ are simple cylinders (see Definition 5.3, Chapter 1). The topology $\tau_p$ generated by such a base is called the product or Tychonov topology on $Y$. □

Obviously, base (5.6) for $\tau_p$ can be further reduced if each $O_x$ is selected from a base $\mathfrak{B}_x$ for $\tau_x$.
5.7 Remarks.

(i) Let $\mathcal{S}_x$ be a subbase for $\tau_x$. One can show that the collection $\mathcal{S} = \{\pi_x^*(S_x) : S_x \in \mathcal{S}_x,\ x \in X\}$ of unit subbase cylinders is a subbase for $\tau_p$, just as it is for the case of finite products. (See Problem 5.7.)

(ii) We will always prefer to deal with the smallest possible base or subbase for $\tau_p$, provided that we have the knowledge of bases or subbases for each $\tau_x$. For instance, as a rule of thumb, we can take $\{\pi_x^*(O_x) : O_x \in \tau_x,\ x \in X\}$ as a subbase for $\tau_p$, unless more is known about the nature of the $\tau_x$'s. □
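For two finite factor spaces the product topology can be enumerated outright, which makes the roles of open parallelepipeds and unit cylinders concrete. The sketch below rests on assumptions chosen for illustration only: a two-point factor space with the topology `s`, and the helper names `rectangles` and `unions_of`. It builds $\tau_p$ from the base of rectangles and then checks that the projections are continuous, that the first projection is open, and that pairwise intersections of unit cylinders recover the base.

```python
from itertools import combinations, product

def rectangles(tau1, tau2):
    """The base of 'open parallelepipeds' U x V."""
    return {frozenset(product(U, V)) for U in tau1 for V in tau2}

def unions_of(base):
    """All unions of members of a finite base (the empty union included)."""
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for sets in combinations(base, r):
            topology.add(frozenset.union(*sets))
    return topology

Y = frozenset({0, 1})
s = {frozenset(), frozenset({1}), Y}     # hypothetical factor topology
tau_p = unions_of(rectangles(s, s))

# Unit cylinders pi_1^*(U) = U x Y and pi_2^*(V) = Y x V.
cyl1 = {frozenset(product(U, Y)) for U in s}
cyl2 = {frozenset(product(Y, V)) for V in s}

projections_continuous = cyl1 <= tau_p and cyl2 <= tau_p
first_projection_open = all(frozenset(p[0] for p in W) in s for W in tau_p)
cylinders_give_base = {a & b for a in cyl1 for b in cyl2} == rectangles(s, s)
print(len(tau_p), projections_continuous,
      first_projection_open, cylinders_give_base)
```

The last check is the finite-product version of the statement that the unit cylinders form a subbase: their finite intersections are exactly the base parallelepipeds.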
5.8 Examples.

(i) Let $\{(Y_x,\tau_x),\ x \in X\}$ be a collection of metrizable topological spaces and let $Y = \prod_{x \in X} Y_x$. According to Example 2.3 (i), the collection of all open balls $B_x(y_x,r)$, $y_x \in Y_x$, constitutes a base for $(Y_x,\tau_x(d_x))$. Now, the set of all simple cylinders of the form

$$\bigcap_{i=1}^{k} \pi_{x_i}^*\big(B_{x_i}(y_{x_i}, r_{x_i})\big) \tag{5.8}$$

is a base for $\tau_p$, whereas the collection of all unit cylinders of the form $\pi_x^*(B_x(y_x,r_x))$ is a subbase for $\tau_p$.

(ii) Let $Y = \mathbb{R}^{\mathbb{R}}$ be the collection of all real-valued functions on $\mathbb{R}$, regarded as the Cartesian product of $\mathbb{R}$'s, with each $\mathbb{R}$ equipped with the usual topology. We select an open neighborhood $U_f$ of a point $f \in Y$. First of all, according to (5.8), a simple cylinder with base $(y_1 - \varepsilon_1, y_1 + \varepsilon_1) \times \ldots \times (y_k - \varepsilon_k, y_k + \varepsilon_k)$ has the form

$$\pi_{a_1}^*\big[(y_1 - \varepsilon_1, y_1 + \varepsilon_1)\big] \cap \ldots \cap \pi_{a_k}^*\big[(y_k - \varepsilon_k, y_k + \varepsilon_k)\big], \tag{5.8a}$$

where $y_i$ is a point in $Y_{a_i} = \mathbb{R}$. In order that this cylinder be a neighborhood of $f$, we need to replace each $y_i$ by the corresponding trace $f(a_i)$ of $f$ in the factor spaces $Y_{a_1}, \ldots, Y_{a_k}$:

$$\pi_{a_1}^*\big[(f(a_1) - \varepsilon_1, f(a_1) + \varepsilon_1)\big] \cap \ldots \cap \pi_{a_k}^*\big[(f(a_k) - \varepsilon_k, f(a_k) + \varepsilon_k)\big]. \tag{5.8b}$$

(See Figure 5.1.)
[Figure 5.1]

5.9 Remark. Let $\{g_x : x \in X\}$ be a family of functions $g_x : \Omega \to Y_x$, where each $Y_x$ is endowed with a topology $\tau_x$. Recall that $g_x^{**}(\tau_x)$, $\forall x \in X$, is a topology on $\Omega$, and that each function $g_x$ is continuous relative to this topology. The union of all these topologies,

$$\mathcal{S} = \bigcup_{x \in X} g_x^{**}(\tau_x),$$

need not be a topology, for it does not necessarily preserve unions and intersections. But we can extend it to a topology, say $\tau(\mathcal{S})$, regarding $\mathcal{S}$ as a subbase. This topology is the weakest one for which all functions of the above family are continuous. $\tau(\mathcal{S})$ is called the weak topology generated by the family $\{g_x\}$. Now, taking $\prod_{x \in X} Y_x$ for $\Omega$ and $\pi_x$ (the $x$th projection map) for $g_x$, we deduce that the Tychonov topology $\tau_p$ is the weakest topology for which all projections are continuous. Consequently, $\tau_p$ turns out to be the weak topology generated by the projection maps. (Of course, we need to show that $\tau_p = \tau(\mathcal{S})$; see Problem 5.7.) By the way, this offers another (equivalent) definition of the Tychonov topology on $\prod_{x \in X} Y_x$. □
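The construction of a weak topology can be followed step by step on a finite carrier. In the sketch below the four-point set `omega`, the maps `g1` and `g2`, and all function names are hypothetical choices made for illustration. It shows that each pulled-back family is a topology, that their union is not, and that regarding the union as a subbase produces a topology containing it.

```python
from itertools import combinations

def pullback(g, omega, tau):
    """g^{**}(tau): inverse images under g of the sets of tau."""
    return {frozenset(w for w in omega if g(w) in O) for O in tau}

def is_topology(family, whole):
    """Finite case: empty set, whole set, pairwise unions and intersections."""
    return (frozenset() in family and frozenset(whole) in family
            and all(A | B in family and A & B in family
                    for A, B in combinations(family, 2)))

def topology_from_subbase(subbase, whole):
    """Finite intersections of subbase sets, then all unions of those."""
    subbase = list(subbase)
    base = {frozenset(whole)}
    for r in range(1, len(subbase) + 1):
        for sets in combinations(subbase, r):
            base.add(frozenset.intersection(*sets))
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for sets in combinations(base, r):
            topology.add(frozenset.union(*sets))
    return topology

def g1(w):
    return w % 2      # hypothetical map onto {0, 1}

def g2(w):
    return w // 2     # hypothetical map onto {0, 1}

omega = {0, 1, 2, 3}
two_point_discrete = {frozenset(), frozenset({0}),
                      frozenset({1}), frozenset({0, 1})}

S = (pullback(g1, omega, two_point_discrete)
     | pullback(g2, omega, two_point_discrete))
weak = topology_from_subbase(S, omega)
print(is_topology(S, omega), is_topology(weak, omega), len(weak))
```

Here the union fails to be a topology because, for instance, the intersection of $\{0,2\}$ and $\{0,1\}$ is missing from it; the generated weak topology repairs this and happens to be discrete.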
5.10 Example. Recall that a sequence $\{x_n\} \subseteq \Omega$ converges to a point $x \in \Omega$ if, for every neighborhood $U_x$, there is an $N(U_x)$-tail of $\{x_n\}$. In the product space $\Omega = \mathbb{R}^{\mathbb{R}}$, a sequence of points $\{f_n\}$ is convergent to a point $f \in \Omega$ if and only if $f_n(x) \to f(x)$ for all $x \in \mathbb{R}$. To see this we note (see Example 5.8 (ii)) that a base neighborhood $U_f$ of $f$ in (5.8b) is of the form

$$U_f = \{g \in \mathbb{R}^{\mathbb{R}} : |g(x_i) - f(x_i)| < \varepsilon_i,\ i = 1, \ldots, k\}.$$

In other words, $f_n \to f$ if it is close to $f$ on each finite set $\{x_1, \ldots, x_k\} \subseteq \mathbb{R}$, specifically on singletons $\{x\} \subseteq \mathbb{R}$. □

Example 5.10 motivates the following notion.

5.11 Definition. Let $\{(Y_x,\tau_x),\ x \in X\}$ be a family of topological spaces and let $\big(\prod_{x \in X} Y_x, \tau_p\big)$ be the Tychonov product topology. Recall that if $Y_x = Y$ and $\tau_x = \tau$ for each $x \in X$, then we denoted $\prod_{x \in X} Y_x$ by $Y^X$ and called it the set of functions from $X$ to $Y$. Now the special Tychonov product topology $(Y^X,\tau_p)$ is called the topology of pointwise convergence. □

As a generalization of Example 5.10, the following proposition can help solidify our understanding of the topology of pointwise convergence.

5.12 Proposition. Let $\{f_n\}$ be a sequence in $Y^X$. Then $f_n \to f \in Y^X$ (in the topology of pointwise convergence) if and only if $f_n(x) \to f(x)$ $\forall x \in X$ (in the topology $(Y_x,\tau_x)$).

Proof. Recall that $\pi_x : Y^X \to Y$ is the $x$-projection map defined as $\pi_x(f) = f(x)$ (see Section 5, Chapter 1).

(i) First assume that $f_n \to f$ in $(Y^X,\tau_p)$. By Theorem 5.5 (in its version for the Tychonov topology, Problem 5.10), $\pi_x$ is continuous for every $x$. Thus, by Theorem 4.9, $\pi_x(f_n) \to \pi_x(f)$. This yields that $f_n(x) \to f(x)$ in $(Y_x,\tau_x)$.

(ii) Let $f_n(x) \to f(x)$ in $(Y_x,\tau_x)$, $\forall x \in X$. Let $U_f$ be a neighborhood of $f$ in $(Y^X,\tau_p)$. Clearly, $U_f$ contains some base neighborhood $B_f$. Since by Theorem 2.2, $B_f \in \mathfrak{B}_f \subseteq \mathfrak{B}$ (for $\tau_p$), it follows that $B_f$ is of the form

$$B_f = \prod_{x \in X} O_{f(x)},$$

where all $O_{f(x)}$'s but finitely many $(O_{f(x_1)}, \ldots, O_{f(x_k)})$ are $Y_x$'s and, for each $i = 1, \ldots, k$, $O_{f(x_i)}$ contains $f(x_i)$. Thus the base neighborhood $B_f$ is a simple cylinder

$$B_f = \bigcap_{i=1}^{k} \pi_{x_i}^*\big(O_{f(x_i)}\big).$$

Now, $f_n \to f$ if and only if for every base neighborhood $B_f$ there is an $N(B_f)$-tail of $\{f_n\}$. By our assumption, $f_n(x_i) \to f(x_i)$, which implies the existence of an $N_i(O_{f(x_i)})$-tail, $i = 1, \ldots, k$. Let $N = \max\{N_1, \ldots, N_k\}$. (Note that this is exactly the place where we take advantage of the Tychonov product topology, for otherwise, in the case of the box topology, a base neighborhood of $f$ could not be represented by a simple cylinder. The latter would be an obstacle in finding a finite maximum of infinitely many $N_i$'s, so that $\{f_n\}$ need not converge to $f$ in the box topology.) Then, for each $x_i$, $i = 1, \ldots, k$, we have the $N(O_{f(x_i)})$-tail of $\{f_n(x_i)\}$, which yields that, for $n \geq N$,

$$f_n \in \pi_{x_i}^*\big(\{f_n(x_i)\}\big) \subseteq \pi_{x_i}^*\big(O_{f(x_i)}\big), \quad i = 1, \ldots, k.$$

Therefore, we have

$$f_n \in \bigcap_{i=1}^{k} \pi_{x_i}^*\big(\{f_n(x_i)\}\big) \subseteq \bigcap_{i=1}^{k} \pi_{x_i}^*\big(O_{f(x_i)}\big) = B_f, \quad \text{for all } n \geq N.$$

The latter tells us that an $N(B_f)$-tail of $\{f_n\}$ exists, and therefore, $f_n \to f$ in $(Y^X,\tau_p)$. □
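The role of the finite maximum $N = \max\{N_1, \ldots, N_k\}$ in the proof of Proposition 5.12, and its failure for box neighborhoods, can be illustrated with exact arithmetic. The sketch below makes several assumptions that are not in the text: the index set is taken to be the positive integers (so the space is $\mathbb{R}^{\mathbb{N}}$ rather than $\mathbb{R}^{\mathbb{R}}$), the sequence is $f_n = (\frac{1}{n}, \frac{1}{n}, \ldots)$ with pointwise limit $0$, and the function names are hypothetical.

```python
from fractions import Fraction

def f(n, k):
    """k-th coordinate of f_n = (1/n, 1/n, ...); it does not depend on k."""
    return Fraction(1, n)

def product_tail_index(constraints):
    """N such that |f_n(k)| < eps_k for all n >= N and all constrained k.

    `constraints` maps finitely many coordinates k to radii eps_k > 0,
    i.e. it describes a base neighborhood of 0 in the Tychonov topology.
    For each coordinate, N_k = floor(1/eps_k) + 1 gives 1/N_k < eps_k.
    """
    return max(int(1 / eps) + 1 for eps in constraints.values())

def box_witness(n):
    """A coordinate k at which f_n leaves the box prod_k (-1/k, 1/k)."""
    return n

constraints = {1: Fraction(1, 10), 5: Fraction(1, 3), 42: Fraction(2, 7)}
N = product_tail_index(constraints)
in_cylinder = all(abs(f(n, k)) < eps
                  for n in range(N, N + 50)
                  for k, eps in constraints.items())
escapes_box = all(not f(n, box_witness(n)) < Fraction(1, box_witness(n))
                  for n in range(1, 50))
print(N, in_cylinder, escapes_box)
```

Every base neighborhood of $0$ in the Tychonov topology constrains only finitely many coordinates, so a finite $N$ always exists; the box neighborhood $\prod_k (-\frac{1}{k}, \frac{1}{k})$ constrains all coordinates at once and no $f_n$ belongs to it.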
PROBLEMS

5.1 Prove Proposition 5.1.

5.2 Prove Proposition 5.2.

5.3 Prove Proposition 5.3. [Hint: Apply Theorem 2.7.]

5.4 A map $f : (X,\tau) \to (Y,\tau')$ is said to be open if $f_{**}(\tau) \subseteq \tau'$, i.e. if the image of every open set is open. Show that in the product topology each projection map is open. [Hint: Use the fact that, according to Problem 3.3, Chapter 1, maps preserve unions.]

5.5 Let $f : (\Omega,\tau) \to \big(X = \prod_{i=1}^n X_i, \tau_p\big)$. Show that the function $f$ is continuous if and only if each $\pi_j \circ f$ is continuous. [Hint: Show that $f^*(S) \in \tau$ for every subbase element $S$ of $\tau_p$, and then apply Theorem 4.8.]

5.6 Let $(X_i,\tau_i)$ be a Hausdorff space, $i = 1, \ldots, n$. Prove that $\big(\prod_{i=1}^n X_i, \tau_p\big)$ is Hausdorff.

5.7 Show that $\mathcal{S}$ in Remark 5.9 is a subbase for the Tychonov topology.

5.8 Show that all major properties of the product topology of finitely many spaces can be reformulated and can hold for the Tychonov topology (Problems 5.4-5.6).

5.9 Let $\big(X = \prod_{i \in I} X_i, \tau_p\big)$ be the Tychonov topology and assume that each factor space is first countable. Is $(X,\tau_p)$ first countable if: a) $|I| = \aleph_0$? b) $|I| \geq \mathfrak{c}$?

5.10 Generalize Theorem 5.5 for the case of Tychonov's topology.
142
CHAPTER 3. ELEMENTS O F P OINT SET T O P O LO GY
NEW TERMS:
product topology for finitely many factor spaces 135 base parallelepiped 135 subbase parallelepiped 135 continuity of projection maps 136 product topology for arbitrarily many factor spaces 136 box topology 136 box parallelepiped 136 Tychonov topology 137 weak topology (generated by a family of functions) 138 topology of pointwise convergence 139 pointwise convergence, criterion of 139 open map 140
6. NOTES ON SUBSPACES AND COMPACTNESS
It has been mentioned that subspaces of topological spaces (i.e. relative topologies) inherit certain qualities of the original spaces. In this section we consider this notion more systematically. We will be concerned with such topological properties as separability, countability, and compactness and their effect on subspaces.

6.1 Definition. A property of a space is referred to as hereditary if every subspace has this property. A property is said to be weakly hereditary if it is inherited by every subspace whose carrier is closed in the original space. A property is vaguely hereditary if it is inherited by every subspace whose carrier is open in the original space. [The last notion is restricted to use in this textbook.] □

6.2 Example. Second countability is hereditary. (See Problem 3.10.) □

6.3 Remark. In Section 1 we denoted by $\bar{A}$ the closure of a subset $A$ of a topological space $(X,\tau)$, understanding that this is the closure relative to the topology $\tau$. As was mentioned in Definition 1.8 (i), in the case of subspaces we may need to deal with closures of subsets with respect to a relative topology, say $(Y,\tau_Y)$. To make the distinction clear we will then write $\mathrm{Cl}_Y A$. However, we will still use $\bar{A}$ having in mind the closure relative to the original space $(X,\tau)$. □

6.4 Example. The property of density of a set is not hereditary and not weakly hereditary, i.e. if $D$ is dense in $(X,\tau)$, its trace in a subspace $(Y,\tau_Y)$ need not be dense. Let $(X,\tau) = (\mathbb{R},\tau_e)$ and $Y = \mathbb{R}_+ \cup \{-\sqrt{2}\}$. Then the set $\mathbb{Q}_+ = \mathbb{Q} \cap Y$ is not dense in $(Y,\tau_Y)$. Indeed, $\{-\sqrt{2}\}$ is an open neighborhood of the point $-\sqrt{2}$ in $(Y,\tau_Y)$ that does not meet $\mathbb{Q}_+$. Thus $\mathrm{Cl}_Y \mathbb{Q}_+ \neq Y$. Since $Y$ is closed in $(\mathbb{R},\tau_e)$, the density property is not weakly hereditary either. □

6.5 Theorem. Separability is vaguely hereditary, but not (weakly) hereditary.
Proof.

(i) Let $(X,\tau)$ be separable and let $(Y,\tau_Y)$ be a subspace of $(X,\tau)$ such that $Y \in \tau$. We show that $(Y,\tau_Y)$ is separable. Let $D$ be a countable, dense set in $(X,\tau)$. We need to prove that $\mathrm{Cl}_Y(D \cap Y) = Y$; specifically, we need to show that $Y \subseteq \mathrm{Cl}_Y(D \cap Y)$, for the inverse inclusion holds trivially. Let $y$ be any point of $Y$ and let $U_y^Y$ be any neighborhood of $y$ in $\tau_Y$. Since $Y$ is open in $X$, $U_y^Y$ is also a neighborhood of the point $y$ in $\tau$. [It is easy to show the following. Since $U_y^Y$ is a neighborhood of $y$ in $\tau_Y$, there is $O_y^Y \subseteq U_y^Y$ which is $\tau_Y$-open. But $O_y^Y = O_y \cap Y$, where $O_y \in \tau$. Since $Y$ is $\tau$-open, it follows that $O_y^Y$ is also $\tau$-open and, clearly, $U_y^Y \in \mathcal{U}_y$ in $\tau$.] Therefore, $U_y^Y$ meets $D$ and, consequently, $U_y^Y$ meets $D \cap Y$ (as a subset of $Y$).

Observe that if $Y$ is not open in $X$, $U_y^Y$ need not be a neighborhood of $y$ in $X$. [For instance, let $Y = (0,2]$ and $U_2 = (1,2]$. Clearly, $U_2$ is not a neighborhood of 2 in $(\mathbb{R},\tau_e)$ but it is a neighborhood of 2 in $(Y,\tau_Y)$.]
(ii) As a counterexample of separability as a hereditary property, we consider the topological space $(X,\tau)$ known as the Moore plane. Let $X = \mathbb{R} \times [0,\infty)$ (the upper half-plane together with the horizontal axis). The topology on $X$ is described by the following base sets. At each point $(x,y) \in \mathbb{R} \times (0,\infty)$, the neighborhood base is the collection of all open balls $\{B((x,y),r) : r < y\}$, where $B(z,r) = \{z' \in X : d_e(z,z') < r\}$. At each point $(x,0)$, the neighborhood base consists of all open balls touching the horizontal axis at $(x,0)$, with the point $(x,0)$ attached to these balls. Take the union of all neighborhood bases and construct a base for the Moore plane topology in light of Theorem 2.2.

This topological space is separable with the dense countable subset $D = \mathbb{Q}^2 \cap X$. Indeed, let $(x,y) \in \mathbb{R} \times (0,\infty)$. Then any neighborhood of $(x,y)$ contains points with rational coordinates (a property inherited from the Euclidean space). As for the points $(x,0)$, any open ball touching the axis at $(x,0)$ also contains points with rational coordinates.

Now, for a subspace of the Moore plane, consider the one with the horizontal axis as the carrier $Y$. Clearly, all singletons $\{(x,0)\}$ are traces in $Y$ of base neighborhoods at $(x,0)$, yielding the discrete topology as the relative topology on $Y$. According to Problem 6.2, any discrete topology with an uncountable carrier is not separable. Observe that $Y^c$ is obviously open in $X$, i.e. $Y$ is closed. Hence separability is not weakly hereditary. □

6.6 Definition. A subset $A$ of a topological space $(X,\tau)$ is said to be compact (Lindelöf) if every open cover of $A$ contains a finite (at most countable) subcover. We also say that $A$ is finitely (countably) reducible. Specifically, if $X$ is compact (Lindelöf), $(X,\tau)$ is called a compact topological space (Lindelöf space). □
6.7 Example. Compactness in metrizable topological spaces obviously coincides with that for the corresponding metric spaces. In this case, we may use the tools and criteria of compactness for metric spaces. For instance, the interval $[a,b]$ for $a,b \in \mathbb{R}^n$ is compact in the sense of the Euclidean metric; therefore, it is compact in $(\mathbb{R}^n,\tau_e)$, while $(a,b)$ is not compact in $(\mathbb{R}^n,\tau_e)$ since it is not closed. □
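The failure of compactness for an open interval in Example 6.7 can also be seen directly from Definition 6.6. The following is a minimal sketch, assuming the one-dimensional case $(0,1)$ and the open cover by the intervals $(1/n, 1)$, $n = 2, 3, \ldots$; the function names `uncovered_point` and `covers` are hypothetical and chosen for this illustration only. For any finite subfamily it exhibits a point of $(0,1)$ that the subfamily misses.

```python
from fractions import Fraction


def uncovered_point(indices):
    """Return a point of (0, 1) missed by the intervals (1/n, 1), n in indices."""
    # A finite union of intervals (1/n, 1) equals (1/N, 1), N being the largest index.
    largest = max(indices)
    # The point 1/(2N) lies in (0, 1) but to the left of 1/N.
    return Fraction(1, 2 * largest)


def covers(indices, x):
    """True if x belongs to at least one interval (1/n, 1) with n in indices."""
    return any(Fraction(1, n) < x < 1 for n in indices)


finite_subfamily = [2, 5, 17, 40]
x = uncovered_point(finite_subfamily)
print(x, covers(finite_subfamily, x))  # prints: 1/80 False
```

Exact rational arithmetic is used so that the comparison with the endpoints $1/n$ is not affected by rounding.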
6.8 Theorem. Let $f: (X,\tau) \to (Y,\tau')$ be a continuous function. Then the image of any compact subset of $X$ is compact.

One can use the same method of proof of Theorem 6.8 as that of Theorem 6.10, Chapter 2. □

6.9 Theorem. Compactness is weakly hereditary (i.e., a closed subset of a compact topological space is compact).

Proof. Let $(X,\tau)$ be compact and let $B \subseteq X$ be closed. Let $\{O_i : i \in I\}$ be an open cover of $B$. Since $B^c$ is open, $\{B^c, O_i : i \in I\}$ is an open cover of $X$. Since $X$ is compact, there exists a finite subcover of $X$, say $\{B^c, O_1,\ldots,O_n\}$. Because $B^c$ does not meet $B$, the family $\{O_1,\ldots,O_n\}$ is a finite subcover of $B$. Hence, $B$ is compact. □

Hausdorff topological spaces possess an important property with respect to compactness.
6.10 Theorem. Every compact subset of a Hausdorff space is closed.

Proof. Let $A$ be a compact subset of the Hausdorff space $(X,\tau)$. We show that $A^c$ is open. Take $x \in A^c$. The family of neighborhoods of all points $y \in A$ covers $A$. We extract a particular subfamily of these neighborhoods. Since $(X,\tau)$ is Hausdorff, for each $y \in A$, there is a neighborhood $U_x(y)$ of $x$ and a neighborhood $V_y(x)$ of $y$ such that $U_x(y) \cap V_y(x) = \emptyset$. Without loss of generality we may assume that the family $\{V_y(x) : y \in A\}$ is an open cover of $A$. (Otherwise, for each $y \in A$ we can select open subsets $O_y(x) \subseteq V_y(x)$ such that $y \in O_y(x)$.) Since $A$ is compact, there exists a finite subcover $\{V_{y_k}(x) : k = 1,\ldots,n\}$ of $A$. Obviously, $V_{y_k}(x) \cap U_x(y_k) = \emptyset$. Select open sets $\{O_x(y_k) \subseteq U_x(y_k),\ k = 1,\ldots,n\}$ containing $x$, whose intersection (denoted by $O_x$), being finite, is open and nonempty. Therefore, $O_x$ is an open neighborhood of $x \in A^c$ with $O_x \cap A = \emptyset$, which means that $x$ is an interior point of $A^c$. Thus, $A^c$ is open, or equivalently, $A$ is closed. □

6.11 Remark. In Theorem 6.3, Chapter 2, we stated and proved (Problem 6.8) the equivalence of the conditions: (i) $A \subseteq (X,d)$ is compact; (ii) every infinite subset of $A$ has an accumulation point in $A$ (Bolzano-Weierstrass compactness); (iii) every sequence in $A$ has a convergent subsequence (sequential compactness). This equivalence does not hold for topological spaces, where (i) and (iii) are in general distinct properties, and compactness just implies Bolzano-Weierstrass compactness, as the reader will prove (see Problem 6.6). □

Recall that second countable spaces are first countable and separable (see Problem 3.6). In addition, they are Lindelöf spaces, as the following theorem asserts.
6.12 Theorem. Any second countable topological space is Lindelöf.

Proof. Let $\mathcal{B}$ be a countable base of a topological space $(X,\tau)$ and let $\mathcal{O} = \{O_i : i \in I\}$ be an arbitrary open cover of $X$. Let $x \in X$. Then $x$ belongs to some $O_x \in \mathcal{O}$. Since $\mathcal{B}$ is a base for $\tau$, by Theorem 2.2, there is a neighborhood base $\mathcal{B}_x \subseteq \mathcal{B}$. Then there is a base neighborhood $B_x \in \mathcal{B}_x$ such that $B_x \subseteq O_x$. The collection of all distinct $B_x$'s for all $x \in X$ is at most countable. On the other hand, this collection obviously covers $X$. Consequently, the collection of open supersets $\{O_x\}$, one associated with each distinct $B_x$, is also at most countable and it covers $X$. Thus $(X,\tau)$ is indeed Lindelöf. □
The result below is in the spirit of Theorem 4.8, where, for continuity of a function $f$ from $(X,\tau)$ to $(Y,\tau(\mathcal{S}))$, it was sufficient to verify that $f^*(\mathcal{S}) \subseteq \tau$. Here we claim that, if $\mathcal{S}$ is a subbase for $\tau$, then $X$ is compact whenever every cover of $X$ by elements from $\mathcal{S}$ can be reduced to a finite subcover.
6.13 Theorem (Alexander Subbase Theorem). Let $(X,\tau)$ be a topological space and let $\sigma$ be a subbase for $\tau$. If every open cover of $X$ by elements of $\sigma$ is finitely reducible, i.e. can be reduced to a finite subcover, then $X$ is compact.

Proof. We prove the equivalent statement: If $X$ is not compact, then there exists an open cover by elements of $\sigma$ that is not finitely reducible. Assume that $X$ is not compact. We will prove this assertion in four steps, which we outline as follows:

(i) Let $\mathcal{O}$ be the collection of all open covers of $X$ that cannot be reduced to finite subcovers. We will show that $\mathcal{O}$ has a maximal element; call it $\mathcal{M}$.

(ii) We will show that for every $x \in X$, there is an open set $M_x \in \mathcal{M}$ and a finite tuple of open sets $\{S_1(x),\ldots,S_n(x)\} \subseteq \sigma$ such that $x \in \bigcap_{k=1}^{n} S_k(x) \subseteq M_x$.

(iii) We will show that at least one of the sets $\{S_1(x),\ldots,S_n(x)\}$, denote it $S(x)$, belongs to $\mathcal{M}$.

(iv) We will recognize that for each $x \in X$, $S(x) \in \sigma$ and $S(x) \in \mathcal{M}$. In particular, the latter will imply that $\{S(x) : x \in X\}$ is an open cover of $X$ which is not finitely reducible. On the other hand, since we will have $\{S(x) : x \in X\} \subseteq \sigma$, the proof will be complete.

We will be concerned with each of the above steps in detail.

Step (i): Since $X$ is not compact, $\mathcal{O}$ is not empty. Introduce on $\mathcal{O}$ the partial order relation in terms of the inclusion. (In other words, two open covers $\mathcal{C}_1$ and $\mathcal{C}_2$ of $X$ from $\mathcal{O}$ are related as $\mathcal{C}_1 \subseteq \mathcal{C}_2$ if and only if $\mathcal{C}_1$ is
a subcover of the cover $\mathcal{C}_2$.) Let $C \subseteq \mathcal{O}$ be any chain, and let $\mathcal{U}$ be the union of all elements of $C$. Clearly, $\mathcal{U}$ is a cover of $X$ that cannot be reduced to a finite subcover: any finite subfamily of $\mathcal{U}$ is contained in a single element of the chain $C$, and that element has no finite subcover. Thus, $\mathcal{U}$ must belong to $\mathcal{O}$, and $\mathcal{U}$ is an upper bound of $C$. By Zorn's Lemma 4.13, Chapter 1, there is at least one maximal element in $\mathcal{O}$; denote it by $\mathcal{M}$.

Step (ii): Let $x \in X$ and let $M_x \in \mathcal{M}$ be such that $x$ is an interior point of $M_x$ (which exists, for $\mathcal{M} \in \mathcal{O}$ is an open cover). On the other hand, by Theorem 2.7, the collection $\mathcal{B}$ of all finite intersections of elements of the subbase $\sigma$ is a base for $\tau$. Thus, as an open set, $M_x$ is a union of base elements, each one of which is a finite intersection of elements of $\sigma$; i.e., $x$ belongs to one of the base elements $B_x$, represented by a finite intersection $\bigcap_{k=1}^{n} S_k(x)$ of subbase elements. In other words, there is a tuple $\{S_1(x),\ldots,S_n(x)\} \subseteq \sigma$ such that $x \in \bigcap_{k=1}^{n} S_k(x) \subseteq M_x$.
Step (iii): Assume that for the given set $M_x$ there is no element of the tuple $\{S_1(x),\ldots,S_n(x)\}$ which is an element of $\mathcal{M}$. In this case, for each $S_k(x) \in \{S_1(x),\ldots,S_n(x)\}$, there is a finite subcollection $\{M_1(k),\ldots,M_{j_k}(k)\}$ from $\mathcal{M}$ that supplements $S_k(x)$ to a cover of $X$. If no such finite subcollection were to exist, then $\{S_k(x)\} \cup \mathcal{M}$ would be an element of $\mathcal{O}$, i.e. $\{S_k(x)\} \cup \mathcal{M}$ would not be finitely reducible, and $\mathcal{M}$ would not be a maximal element of $\mathcal{O}$. Hence, $\{M_1(k),\ldots,M_{j_k}(k), S_k(x)\}$, $k = 1,\ldots,n$, is a finite open cover of $X$. By Problem 6.9,

$$\Big\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } \bigcap_{k=1}^{n} S_k(x)\Big\}$$

is also a finite open cover of $X$. Since $\bigcap_{k=1}^{n} S_k(x) \subseteq M_x$, this implies that

$$\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } M_x\}$$

is a finite cover of $X$, and is also a finite subcover of $\mathcal{M}$, which contradicts the property that $\mathcal{M}$ is an element of $\mathcal{O}$. Thus, our assumption about $\{S_1(x),\ldots,S_n(x)\}$ was wrong and there is at least one set $S(x) \in \{S_1(x),\ldots,S_n(x)\}$ which belongs to $\mathcal{M}$.

Step (iv): It follows that, for each $x \in X$, there is an element $S(x)$ common to $\sigma$ and $\mathcal{M}$, and thus the collection $\{S(x) : x \in X\}$ is an open cover of $X$ that cannot be reduced to a finite subcover, since it is a subfamily of $\mathcal{M}$. [Observe that the assumption that $X$ is not compact implies that $X$ cannot be finite.] □

The Alexander Subbase Theorem leads to the following meaningful result by Tychonoff.
6.14 Theorem (Tychonov). A nonempty Tychonov product is compact if and only if each factor space is compact.

Proof.

(i) If the Tychonov product is compact, then compactness of the factor spaces follows from continuity of the projection maps (see Theorems 5.5 and 6.8).

(ii) Let $(X,\tau_p)$ be the Tychonov product of compact spaces $(X_i,\tau_i)$, $i \in I$. Take for a subbase for $\tau_p$ the collection of all simple cylinders $\mathcal{S} = \{\pi_i^*(O_i) : O_i \in \tau_i,\ i \in I\}$ (see Proposition 5.3). If $X$ is not compact, then, by the Alexander Subbase Theorem, some subcollection $\mathcal{C} \subseteq \mathcal{S}$ covering $X$ cannot be reduced to a finite subcover. For each $i \in I$, let $\mathcal{C}_i = \{O \in \tau_i : \pi_i^*(O) \in \mathcal{C}\}$. Suppose that, for some $i$, $\mathcal{C}_i$ is an open cover of $X_i$. Then it can be reduced to a finite subcover $\{O_i(1),\ldots,O_i(k_i)\}$ of $X_i$. Obviously, $\{\pi_i^*(O_i(1)),\ldots,\pi_i^*(O_i(k_i))\}$ is a finite open cover of $X$ that is a finite subcover of $\mathcal{C}$, which is impossible. Hence no $\mathcal{C}_i$ covers $X_i$, and for each $i \in I$ we can choose a point $x_i \in X_i$ that belongs to no element of $\mathcal{C}_i$. The point $x = (x_i)_{i \in I}$ then belongs to no element of $\mathcal{C}$, which contradicts the fact that $\mathcal{C}$ covers $X$. Therefore $X$ is compact. □
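The covering argument used in Step (iii) of the proof of Theorem 6.13 (the statement of Problem 6.9) can be checked on small finite sets. The sketch below is only an illustration on one hand-picked example, not a proof; the helper names `is_cover` and `combined_family` and the sample data are assumptions made for this example.

```python
def is_cover(family, X):
    """True if the union of the sets in `family` contains X."""
    covered = set()
    for member in family:
        covered |= member
    return X <= covered


def combined_family(covers):
    """covers: list of pairs (M_sets, S_k); return all M sets plus the intersection of the S_k."""
    all_m = [m for m_sets, _ in covers for m in m_sets]
    common = set.intersection(*(s for _, s in covers))
    return all_m + [common]


X = set(range(6))
# For each k, the sets M_1(k), ..., M_{j_k}(k) together with S_k cover X.
covers_k = [
    ([{0, 1}], {2, 3, 4, 5}),
    ([{2}, {3}], {0, 1, 4, 5}),
    ([{0, 2}], {1, 3, 4, 5}),
]
assert all(is_cover(m_sets + [s], X) for m_sets, s in covers_k)

# All the M sets together with the single set S_1 ∩ S_2 ∩ S_3 still cover X.
print(is_cover(combined_family(covers_k), X))  # prints: True
```

The reason behind the check is the one needed in Problem 6.9: a point missed by every $M_j(k)$ must lie in each $S_k$, hence in their intersection.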
PROBLEMS

6.1 Is the property of density of a set vaguely hereditary? (Consider Example 6.4.)

6.2 Let $(X,\mathcal{P}(X))$ be a discrete topological space with an uncountable carrier. Show that the space is not separable.

6.3 Show that the topological space in Problem 6.2 is not second countable.

6.4 Show that first countability is hereditary.

6.5 Prove that the Moore plane is not second countable.

6.6 Prove the statement: Every compact topological space is Bolzano-Weierstrass compact.

6.7 Let $\tau$ be the cofinite topology on an arbitrary nonempty set $X$. Show that $(X,\tau)$ is compact.

6.8 Show that any bijective continuous map $f: (X,\tau) \to (Y,\tau')$, where $(X,\tau)$ is compact and $(Y,\tau')$ is Hausdorff, is a homeomorphism. [Hint: Make use of Proposition 4.3.]

6.9 Let $\{M_1(k),\ldots,M_{j_k}(k), S_k\}$ be a cover of a set $X$ for each $k = 1,\ldots,n$. Show that $\{M_1(k),\ldots,M_{j_k}(k) : k = 1,\ldots,n, \text{ and } \bigcap_{k=1}^{n} S_k\}$ is also a cover of $X$.

6.10 Let $(X,\tau)$ be a Hausdorff topological space. Show that an arbitrary intersection of compact sets in $X$ is compact.

6.11 Show that compactness is weakly hereditary.

6.12 Let $O$ be an open set in $\mathbb{R}^n$. Show that there is a monotone increasing sequence $\{O_k\}\uparrow$ of open bounded subsets of $O$ such that $\{O_k\} \uparrow O$.

6.13 Is the property of a space to be Hausdorff hereditary of any kind?

6.14 Let $(X,\tau)$ be a Hausdorff space and $C$ and $K$ be disjoint compact sets. Show that there are disjoint open supersets, $U$ and $V$, of $C$ and $K$, respectively.

6.15 A topological space $(X,\tau)$ is called countably compact if any countable open cover of $X$ has a finite subcover. Prove that the following are equivalent:
(i) $(X,\tau)$ is countably compact.
(ii) Each countable family of closed sets in $(X,\tau)$ with the finite intersection property (i.e. the intersection of any finite subfamily is nonempty) has a nonempty intersection.
(iii) Every countably infinite subset $A$ of $X$ has a point $x$ with the property that each neighborhood of $x$ contains infinitely many points of $A$.
(iv) Every sequence in $X$ has a closure point.
NEW TERMS:

hereditary property
weakly hereditary property
vaguely hereditary property
Moore plane
compact set
Lindelöf set
compact topological space
Lindelöf topological space
compact set under a continuous map
compact sets in Hausdorff spaces
Bolzano-Weierstrass compactness
sequential compactness
Alexander Subbase Theorem
Tychonov's Theorem
countably compact topological space
7. FUNCTION SPACES AND ASCOLI'S THEOREM
7.1 Remarks.

(i) Earlier (Example 5.2 (i), Chapter 2), we introduced the space $\mathcal{F}^*(X;\mathbb{R})$ of all bounded real-valued functions on a set $X$, and the metric

$$\rho(f,g) = \sup\{|f(x) - g(x)| : x \in X\}.$$

We called it the uniform metric. This metric is TIH and thus induces the corresponding norm, which we called the supremum norm. (Since $\mathcal{F}^*(X)$ is a vector space over $\mathbb{R}$, the norm induced by $\rho$ is legitimate.) It was also shown that $\rho$ is complete and, therefore, $\mathcal{F}^*(X)$ is Banach. A generalization of the above metric space is the linear space of all bounded vector functions $f: X \to \mathbb{R}^n$ with the corresponding uniform metric,

$$\rho(f,g) = \sup\{d_e(f(x),g(x)) : x \in X\} \qquad (7.1)$$

(TIH too). We will be concerned with a similar metric space of all continuous $\mathbb{R}^n$-valued functions defined on a compact topological space $(X,\tau)$. By Theorem 6.8, the image of $X$ under a continuous function $f$ is compact in the corresponding image space. Since this image space is $(\mathbb{R}^n,\tau_e)$, by the Heine-Borel theorem, $f_*(X)$ is bounded. Thus uniform metric (7.1) is a valid metric too. So we denote this metric space by $(\mathcal{C}(X;\mathbb{R}^n),\rho)$.

(ii) Observe that metric (7.1) can be generalized for the space of all continuous functions defined on a compact topological space $(X,\tau)$ and valued in an arbitrary metric space $(Y,d)$. Again the continuous image of $X$ is compact in $(Y,\tau(d))$ and, according to Theorem 6.7, Chapter 2, it is closed and bounded. Hence, we are able to define the uniform metric (induced by metric $d$) by

$$\rho(f,g) = \sup\{d(f(x),g(x)) : x \in X\}. \qquad (7.1a)$$

Specifically, if $d$ is TIH on a linear space $Y$ over a field $\mathbb{F}$, then $\rho$ defines a norm,

$$\|f\|_u = \|f\|_\rho = \rho(f,\theta) = \sup\{\|f(x)\|_d : x \in X\},$$

on $\mathcal{C}(X;Y)$ (where $\|\cdot\|_d$ is the norm generated by the TIH metric $d$ and $\|\cdot\|_\rho$ is the norm generated by the metric $\rho$), i.e. the supremum norm.

(iii) If $(X,\tau)$ is not compact, instead of $\mathcal{C}(X;Y)$, we consider the subspace $\mathcal{C}^*(X;Y)$ of all continuous, bounded functions and define the uniform metric on $\mathcal{C}^*(X;Y)$. □

7.2 Definition. Let $(X,\tau)$ be a compact topological space and let $(Y,\tau(d))$ be a metrizable space. Denote by $\mathcal{C}(X;Y)$ the (linear) space of all continuous functions from $(X,\tau)$ to $(Y,\tau(d))$. A sequence $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ is said to converge uniformly to a function $f \in (\mathcal{C}(X;Y),\rho)$ if $\rho(f_n,f) \to 0$ as $n \to \infty$. ($\rho$ is the uniform metric defined by (7.1a).) If $d$ is TIH, then the corresponding condition becomes

$$\|f_n - f\|_\rho \to 0 \quad \text{for } n \to \infty.$$

□
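The difference between uniform convergence in the sense of Definition 7.2 and pointwise convergence can be explored numerically. The sketch below is an illustration under stated assumptions: the supremum in (7.1a) is replaced by a maximum over a finite grid (which only bounds the true supremum from below), the sample sequence is $f_n(x) = x^n$ with limit $0$, and the names `sup_distance`, `grid`, and `power` are hypothetical helpers.

```python
def sup_distance(f, g, points):
    """Grid approximation (a lower bound) of the uniform distance between f and g."""
    return max(abs(f(x) - g(x)) for x in points)


def grid(a, b, m):
    """m + 1 equally spaced points of [a, b]."""
    return [a + (b - a) * k / m for k in range(m + 1)]


def power(n):
    """The function x -> x**n."""
    return lambda x: x ** n


def zero(x):
    return 0.0


# On the compact interval [0, 0.9] the distance to the zero function is 0.9**n -> 0.
for n in (5, 20, 80):
    print(n, sup_distance(power(n), zero, grid(0.0, 0.9, 1000)))

# On [0, 1) the sequence still converges to 0 pointwise, but f_n(1 - 1/n) stays
# above 1/4, so the uniform distance to the zero function does not tend to 0.
for n in (3, 10, 1000):
    print(n, power(n)(1 - 1 / n))
```

The second loop corresponds to the situation of Remark 7.1 (iii): on the non-compact set $[0,1)$ the functions are bounded and continuous, yet the convergence is not uniform.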
The following lemma generalizes the classical result in analysis that states that the limit of a uniformly convergent sequence of continuous functions is continuous. Below we assume that $(Y,\|\cdot\|)$ is an NLS over a field $\mathbb{F}$, which is $\mathbb{R}$ or $\mathbb{C}$.

7.3 Lemma. Let $(X,\tau)$ be a compact topological space and let $(Y,\|\cdot\|)$ be an NLS over a field $\mathbb{F}$. Let $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ converge uniformly to a function $f$. Then $f \in \mathcal{C}(X;Y)$. □
Although the burden of proving this lemma and many other statements in this section will be passed on to the reader, the associated problems will chiefly be provided with detailed hints. (See Problem 7.1.)

7.4 Theorem. Let $Y$ be a linear space over a field $\mathbb{F}$ and let $\|\cdot\|_d$ be a norm on $Y$ induced by a translation-invariant metric $d$ on $Y$. If $(Y,\|\cdot\|_d)$ is a Banach space, then the space $(\mathcal{C}(X;Y),\|\cdot\|_\rho)$ is also Banach. □ (See Problem 7.2.)

7.5 Example. The following special case frequently occurs in applications. Since $Y = \mathbb{R}^n$, with the Euclidean norm $\|\cdot\|_e$, is a Banach space, by Theorem 7.4, $(\mathcal{C}(X;\mathbb{R}^n),\|\cdot\|_\rho)$ is also a Banach space. □

Theorem 7.4 can be generalized as follows.

7.6 Theorem. Let $(X,\tau)$ be a compact topological space and let $(Y,d)$ be a complete metric space. Then the uniform metric space $(\mathcal{C}(X;Y),\rho)$ is also complete. □ (See Problem 7.3.)
In many problems related to differential equations or complex analysis, it is of interest to have a criterion under which a closed and bounded subset of $(\mathcal{C}(X;Y),\rho)$ is sequentially (or Bolzano-Weierstrass) compact. Hence, we need to know whether this subset is compact under the uniform metric. While a set may often be closed and bounded, this is insufficient for compactness, in contrast with the Heine-Borel Theorem applied to Euclidean spaces. There will be an additional condition below, known as equicontinuity, characterizing compactness of some sets of continuous functions.
Equicontinuity was introduced by the Italian mathematician Giulio Ascoli in 1884 and it is regarded as one of the fundamental concepts in the theory of real functions. Ascoli's Theorem was generalized by his fellow countryman Cesare Arzela in 1889 and it led to a very practical sequential compactness criterion for sets of functions often referred to as the Ascoli-Arzela Theorem.

7.7 Definition. Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a metric space. A subset of functions $\mathcal{F} \subseteq \mathcal{C}(X;Y)$ is said to be ($d$-)equicontinuous at $x_0 \in X$ if, for each $\varepsilon > 0$, there is a neighborhood $U_{x_0}$ of $x_0$ such that for each $x \in U_{x_0}$ and $f \in \mathcal{F}$, $d(f(x),f(x_0)) < \varepsilon$. The subset $\mathcal{F}$ is called ($d$-)equicontinuous if it is equicontinuous at each point of $X$. □

7.8 Theorem (Ascoli). Let $(X,\tau)$ be a compact topological space and let $(\mathcal{C}(X;\mathbb{R}^n),\rho)$ be the function space endowed with the uniform metric $\rho$. A subset $\mathcal{F} \subseteq (\mathcal{C}(X;\mathbb{R}^n),\rho)$ is compact if and only if it is closed, bounded and $d_e$-equicontinuous.

The proof of Ascoli's theorem is based on the following two lemmas.

7.9 Lemma. Let $(X,\tau)$ be a compact topological space and let $(Y,d)$ be a metric space. If a subset $\mathcal{F} \subseteq (\mathcal{C}(X;Y),\rho)$ is totally bounded in $(\mathcal{C}(X;Y),\rho)$, then $\mathcal{F}$ is $d$-equicontinuous on $X$. (See Problem 7.4.)

7.10 Lemma. Let $(X,\tau)$ be a compact topological space, $(Y,d)$ be a totally bounded metric space, and $\mathcal{F} \subseteq \mathcal{C}(X;Y)$ be any $d$-equicontinuous subset. Then $\mathcal{F}$ is totally bounded. (See Problem 7.5.)
Proof of Ascoli's Theorem.

(i) If $\mathcal{F}$ is compact, it is closed and bounded by Theorem 6.7, Chapter 2, with no further restrictions. In this case, we have to prove that $\mathcal{F}$ is $d_e$-equicontinuous. We first show that, since $\mathcal{F}$ is bounded, there is a compact subset $Y \subseteq \mathbb{R}^n$ such that, for all $x \in X$ and for all $f \in \mathcal{F}$, $f(x) \in Y$. Let $f_0 \in \mathcal{F}$. Since $f_0$ is continuous, by Theorem 6.8, $(f_0)_*(X)$ is a compact subset of $\mathbb{R}^n$. In other words, $(f_0)_*(X)$ is closed and bounded. Hence, there is an open ball $B_{d_e}(\theta,R)$, $\theta = (0,\ldots,0)$, such that $(f_0)_*(X) \subseteq B_{d_e}(\theta,R)$. On the other hand, since $\mathcal{F}$ is bounded, there is an $M > 0$ such that $\rho(f_0,f) < M$ for all $f \in \mathcal{F}$. Thus for all $f \in \mathcal{F}$,

$$f_*(X) \subseteq B_{d_e}(\theta,R+M),$$

and now the closed ball $\bar{B}_{d_e}(\theta,R+M)$ can be taken for $Y$. Hence, $(Y,d_e)$ is a compact subspace of $(\mathbb{R}^n,d_e)$ such that each $f \in \mathcal{F}$ is valued in $Y$. By compactness, $\mathcal{F}$ is totally bounded (see Theorem 6.14, Chapter 2), and we conclude that $\mathcal{F}$ is $d_e$-equicontinuous by Lemma 7.9.

(ii) Let $\mathcal{F}$ be closed, bounded, and $d_e$-equicontinuous. As a closed subset of the complete metric space $(\mathcal{C}(X;\mathbb{R}^n),\rho)$ (Example 7.5), $(\mathcal{F},\rho)$ is complete (see Theorem 5.1, Chapter 2). Since $\mathcal{F}$ is bounded, by the above argument in (i), all functions of $\mathcal{F}$ are valued in a compact subspace $Y$ of $(\mathbb{R}^n,d_e)$. Now, $X$ and $Y$ are compact and $\mathcal{F}$, by the assumption, is $d_e$-equicontinuous. By Lemma 7.10, we conclude that $\mathcal{F}$ is totally bounded. Finally, we can make use of Theorem 6.14, Chapter 2, and have $\mathcal{F}$ compact. □

7.11 Examples.
For the following examples we denote by $\mathcal{C}^{(1)}(X;Y)$ the space of all differentiable functions with uniformly bounded derivatives.

(i) Let $X = Y = \mathbb{R}$. Then $\mathcal{C}^{(1)}(\mathbb{R};\mathbb{R})$ is an equicontinuous family. Indeed, for every $f \in \mathcal{C}^{(1)}(\mathbb{R};\mathbb{R})$, $|f'| \le M$. Let $\varepsilon > 0$ and $x \in \mathbb{R}$. Then for all $y \in \mathbb{R}$ such that $|x - y| < \varepsilon/M$, we have, by the mean value theorem,

$$|f(x) - f(y)| = |f'(c)|\,|x - y| < M \cdot \varepsilon/M = \varepsilon.$$

(ii) Let $X = [a,b]$, $Y = \mathbb{R}$, and $\mathcal{F}$ be the subspace of $\mathcal{C}^{(1)}(X;Y)$ consisting of all uniformly bounded functions. We wish to show that $(\mathcal{F},\rho)$ is compact. By Example (i), $\mathcal{F}$ is equicontinuous. Clearly, $(\mathcal{F},\rho)$ is bounded, since the diameter of $\mathcal{F}$ is

$$d(\mathcal{F}) = \sup\{\rho(f,g) : f,g \in \mathcal{F}\} \le 2N,$$

where $N$ is defined as the common bound for all $f \in \mathcal{F}$. Furthermore, it is easy to see that $(\mathcal{F},\rho)$ is closed. Since a subset of a metric space is closed if and only if it contains all of its limit points, we select an arbitrary convergent sequence $\{f_n\} \subseteq \mathcal{F}$ and show that its limit is a function which
1) is differentiable,
2) is bounded by $N$,
3) has its derivative bounded by $M$.
The first statement immediately follows from the known fact in analysis that a uniformly convergent sequence $\{f_n\}$ of differentiable functions has as the limit a differentiable function $f$, and that $\rho$-$\lim f_n' = f'$. The other two statements can be easily verified. □

There is another version of Ascoli's Theorem frequently used in applications. It is based on the result of Problem 7.14: If $(\mathcal{F},\rho) \subseteq (\mathcal{C}(X;\mathbb{R}^n),\rho)$ is equicontinuous and bounded, then its uniform closure $(\bar{\mathcal{F}},\rho)$ is also equicontinuous and bounded.
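The mean value estimate of Example 7.11 (i) says that a single $\delta = \varepsilon/M$ serves every member of the family at once. The following sketch tests this numerically on one sample family; the family $f_a(x) = \sin(ax)/a$, $a \in [1,50]$, with $|f_a'| \le M = 1$, the sampling ranges, and the names `member` and `worst_increment` are all assumptions introduced for this illustration.

```python
import math
import random

M = 1.0  # common bound for the derivatives cos(a * x) of the sample family


def member(a):
    """The function x -> sin(a * x) / a, whose derivative is bounded by M = 1."""
    return lambda x: math.sin(a * x) / a


def worst_increment(eps, trials=10000, seed=0):
    """Largest observed |f(x) - f(y)| over random f in the family and |x - y| <= eps / M."""
    rng = random.Random(seed)
    delta = eps / M
    worst = 0.0
    for _ in range(trials):
        f = member(rng.uniform(1.0, 50.0))
        x = rng.uniform(-10.0, 10.0)
        y = x + rng.uniform(-delta, delta)
        worst = max(worst, abs(f(x) - f(y)))
    return worst


eps = 1e-3
# The observed increments stay below eps (up to rounding), whatever member is drawn.
print(worst_increment(eps) <= eps + 1e-12)
```

A random search of this kind cannot prove equicontinuity; it only fails to find a counterexample, in agreement with the estimate $|f(x) - f(y)| \le M|x - y|$.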
We will need another definition. Any subset of a topological space is called relatively compact if its closure is compact. For instance, if $\mathcal{F}$ is a sequence of continuous functions (which need not be closed), we might be interested in whether or not it has a convergent subsequence, i.e. if $\mathcal{F}$ is sequentially compact or, equivalently, if $\mathcal{F}$ is relatively compact. Now, with the use of Problem 7.14, the following version of Ascoli's Theorem obviously holds.

7.12 Theorem (Ascoli). Let $\mathcal{F}$ be a subset in a uniform metric space $(\mathcal{C}(X;\mathbb{R}^n),\rho)$. Then $\mathcal{F}$ is relatively compact if and only if $\mathcal{F}$ is bounded and equicontinuous. □
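Boundedness alone is not enough in Theorem 7.12. A standard illustration, not taken from this text, is the sequence $f_n(x) = \sin(nx)$ on $X = [0,\pi]$: it is bounded by 1 in the uniform metric, but it is not equicontinuous, and distinct members stay at uniform distance at least 1 from each other (the mean square of $\sin mx - \sin nx$ over $[0,\pi]$ equals 1), so no subsequence can converge in $\rho$. The sketch below approximates these distances on a grid; the names `rho` and `min_pairwise_distance` and the grid size are assumptions of this example.

```python
import math
from itertools import combinations


def rho(m, n, points=2000):
    """Grid approximation of sup |sin(m x) - sin(n x)| over [0, pi]."""
    xs = (math.pi * k / points for k in range(points + 1))
    return max(abs(math.sin(m * x) - math.sin(n * x)) for x in xs)


def min_pairwise_distance(indices):
    """Smallest approximate uniform distance between distinct members sin(n x)."""
    return min(rho(m, n) for m, n in combinations(indices, 2))


# The members never get close to each other, so the sequence has no
# subsequence that is Cauchy in the uniform metric.
print(min_pairwise_distance(range(1, 9)))
```

Since the grid maximum can only underestimate the supremum, a printed value above 1 supports the claim that the sequence is bounded but not relatively compact.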
A more general version of Ascoli's Theorem for a subset $\mathcal{F} \subseteq \mathcal{C}(X;Y)$, where $Y$ is a Banach space, requires a finer condition imposed on $\mathcal{F}$.

7.13 Theorem (Ascoli). Let $\mathcal{F}$ be a subset in a uniform normed linear space $(\mathcal{C}(X;Y),\sup\|\cdot\|)$, where $(Y,\|\cdot\|)$ is a Banach space (over $\mathbb{F}$). Then $\mathcal{F}$ is relatively compact with respect to $\sup\|\cdot\|$ if and only if $\mathcal{F}$ is equicontinuous and, for every $x \in X$, the set

$$\mathcal{F}(x) = \{f(x) \in Y : f \in \mathcal{F}\}$$

is relatively compact in $(Y,\|\cdot\|)$. □
(See Problem 7.16.)

As mentioned earlier, there are many other versions of Ascoli's Theorem known from textbooks and research papers that led to special applications. For instance, consider Arzela's Theorem (see Problem 7.15). To work with some of the problems below we need the notion of pointwise boundedness.

7.14 Definition. A collection $\mathcal{F} \subseteq \mathcal{C}(X;(Y,d))$ of functions is called pointwise bounded if, for every $x \in X$, the set $\mathcal{F}(x) = \{f(x) : f \in \mathcal{F}\}$ is bounded, i.e. for each $x \in X$, there is a positive real number $M_x$ such that $d(f(x),g(x)) < M_x$ for each pair $f,g \in \mathcal{F}$. □

Recall that a collection $\mathcal{F} \subseteq (\mathcal{C}(X;Y),\rho)$ is uniformly bounded if there is a positive real number $M$ such that $\rho(f,g) < M$ for all $f,g \in \mathcal{F}$.
PROBLEMS

7.1 Prove Lemma 7.3. [Hint: Make use of the inequality

$$\|f(x) - f(y)\|_d \le \|f(x) - f_k(x)\|_d + \|f_k(x) - f_k(y)\|_d + \|f_k(y) - f(y)\|_d$$

and continuity of $f_k$ in the form of Problem 4.8.]

7.2 Prove Theorem 7.4. [Hint: Use Problems 4.6-4.9 and apply Lemma 7.3.]

7.3 Prove Theorem 7.6. [Hint: Show first the validity of the statement similar to Lemma 7.3: Under the conditions of Theorem 7.6, if a sequence $\{f_n\} \subseteq (\mathcal{C}(X;Y),\rho)$ converges uniformly to a function $f$, then $f \in \mathcal{C}(X;Y)$.]

7.4 Prove Lemma 7.9. [Hint: Let $\mathcal{F}$ be totally bounded; show that $\mathcal{F}$ is equicontinuous at any fixed point $x_0 \in X$.
1) Choose any $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$.
2) Cover $\mathcal{F}$ by balls $B_\rho(f_i,\delta_1)$, $i = 1,\ldots,n$ [call the $n$-tuple $\{f_1,\ldots,f_n\}$ a $\delta_1$-net].
3) Use continuity of each $f_i$ at $x_0$ in the form of Problem 4.7 b): for each $\delta_2 > 0$, there is a neighborhood $U_{x_0}^{(i)}$ of $x_0$ with $d(f_i(x),f_i(x_0)) < \delta_2$ for all $x \in U_{x_0}^{(i)}$, $i = 1,\ldots,n$.
4) Choose a neighborhood $U_{x_0}$, good for all $f_i$'s.
5) Let $f$ be any function in $\mathcal{F}$; thus $f$ falls into one of the balls in 2), say $B_\rho(f_i,\delta_1)$.
6) Use the estimate

$$d(f(x),f(x_0)) \le d(f(x),f_i(x)) + d(f_i(x),f(x_0)),$$

where the first term of the right-hand side of the inequality is less than $\delta_1$ (why?), and the second term is dominated by

$$d(f_i(x),f_i(x_0)) + d(f_i(x_0),f(x_0)) < \delta_2 + \delta_1.$$
(The estimate needed then follows.)]

7.5 Prove Lemma 7.10. [Hint: Choose $\varepsilon > 0$ and $\delta_1, \delta_2 > 0$ such that $\varepsilon > 2\delta_1 + \delta_2$. Show that there exists an $\varepsilon$-net $\{f_1,\ldots,f_N\} \subseteq \mathcal{F}$. Use the steps that follow.
1) Use equicontinuity of $\mathcal{F}$ and compactness of $X$ to show that, for every $\delta_1 > 0$, there is a finite open cover (by neighborhoods) $\{U_{x_1}(\delta_1),\ldots,U_{x_n}(\delta_1)\}$ of $X$, such that for any $f \in \mathcal{F}$ and for any $y$ that falls into a neighborhood $U_{x_i}(\delta_1)$, $d(f(y),f(x_i)) < \delta_1$.
2) Cover $Y$ by a finite collection $\{B(j)\}$ of $d$-balls, such that $B(j) = B_d(y_j,\delta_2)$, $j = 1,\ldots,m$.
3) Let $\Gamma$ be the collection of all integer functions $\gamma: \{1,\ldots,n\} \to \{1,\ldots,m\}$. Let $\Gamma'$ be a subset of $\Gamma$ with the following property: an element $\gamma \in \Gamma$ belongs to $\Gamma'$ if and only if there is a function $f \in \mathcal{F}$ such that $f(x_i) \in B(\gamma(i))$, $i = 1,\ldots,n$. Let $|\Gamma'| = N$. Then order the elements of $\Gamma'$ and the functions assigned to $\Gamma'$ by $\{1,\ldots,N\}$, so that $\Gamma' = \{\gamma_1,\ldots,\gamma_N\}$ and $\mathcal{F}' = \{f_1,\ldots,f_N\}$. Show that $\mathcal{F}'$ is a relevant $\varepsilon$-net.
4) Let $f \in \mathcal{F}$. Show that for this $f$ there is an element of $\Gamma'$, say $\gamma_j$, such that if $f(x_i) \in B(\gamma_j(i))$, $i = 1,\ldots,n$, then $d(f(x_i),f_j(x_i)) < \delta_2$, $i = 1,\ldots,n$.
5) Show that for all $x \in X \setminus \{x_1,\ldots,x_n\}$, $d(f(x),f_j(x)) < 2\delta_1 + \delta_2$, by using the triangle inequality and the inequality in 1).
6) Show that the inequality in 5) implies the desired inequality $\rho(f,f_k) < \varepsilon$ for some $k \in \{1,\ldots,N\}$ and therefore $\{f_1,\ldots,f_N\}$ is indeed an $\varepsilon$-net in $\mathcal{F}$.]

7.6 Prove the following: Let $(X,\tau)$ be a topological space and let $(Y,d)$ be a complete metric space. If $\mathcal{C}^*(X;Y)$ is the subspace of
7.6
7.7 7.8
7.9
continuous bounded functions, then ( e * (X;Y), p ) is a uniform complete metric space. Prove the statement: If GJ C e(X; Y) is an equicontinuous family, then so is its uniform closure GJ. Prove Dini's Theorem: Let (X, ) be a compact topological space. Consider the space ( e(X; lR), p ) . Let {f n } be a monotone sequence from e(X; IR) such that {f n } converges to a continuous function f E e ( X; lR) in the topology of pointwise convergence. Then {f n } converges to f in p also. Let � n be the set of all polynomials defined on [0, 1] with degrees r
less than or equal to n and with all real coefficients bounded by a positive constant. Show that (�", p ) is compact.
Let GJ C e(X;Y), where X is a compact topological space and Y is a metric space. Show that if GJ is equicontinuous and pointwise bounded, then it is uniformly bounded in (e ( X;Y), p ) . 7.11 Let GJ C e ( X;IR" ) and let X be compact. Show that GJ is relatively compact if and only if it is equicontinuous and pointwise bounded. 7. 12 Let GJ be the set of functions 7. 10
GJ =
{ f ( x ) = a sinx: a E [ - 2, 2]}.
Show that the set ( GJ, p ) is sequentially compact. 7.13 Let GJ be a sequence of functions with f n (x) = b n cosx , b n = 1 + � , n = 1, 2, . . . , and f0 ( x ) = cosx. Show that ( GJ, p ) is compact. 7.14 Let dJ be a subset of (e ( X;lR"), p ) . Show that the uniform closure ( GJ, p ) is equicontinuous and bounded if and only if ( GJ, p ) is equi continuous and bounded. 7. 15 Prove Arzela's Theorem: Let X be compact and let {f k } C 7.16
e(x; lR") be a pointwise bounded and equicontinuous sequence of functions. Show that ( {f k }, p ) is sequentially compact. Prove Theorem 7.13.
NEW TERMS:
uniform metric 151
uniform metric space 151
supremum norm 151
space of all continuous functions 151
space of all continuous bounded functions 152
uniform convergence 152
uniform convergence, criterion of 152
completeness of a uniform metric space 152
equicontinuity at a point 153
equicontinuity on a set 153
Ascoli's Theorem 153, 155
equicontinuity, criterion of 153
total boundedness, criterion of 153
relative compactness 155
pointwise bounded set of functions 155
uniformly bounded set of functions 156
Dini's Theorem 158
Arzela's Theorem 158
8. STONE-WEIERSTRASS APPROXIMATION THEOREM
Let $(\Omega, \tau)$ be a topological space, $X$ be a compact subset, and let $(\mathcal{A}, \mathbb{R}) \subseteq \mathcal{C}(X;\mathbb{R})$ be a subspace of the space of all real-valued continuous functions on $X$ that also contains the products $fg$ of functions from $\mathcal{A}$. Each continuous function on a compact set is bounded, as we know from Theorem 6.8. We will use the uniform metric $\rho$ introduced in the previous section:
$$\rho(f, g) = \sup\{|f(x) - g(x)| : x \in X\}.$$
Since $\mathcal{C}(X;\mathbb{R})$ is complete (Example 7.3 or Theorem 7.2), $\bar{\mathcal{A}} \subseteq \mathcal{C}(X;\mathbb{R})$. We wonder under what condition $\bar{\mathcal{A}} = \mathcal{C}(X;\mathbb{R})$, i.e., under what condition each continuous function can be "uniformly approximated" by elements of $\mathcal{A}$. For instance, if $\mathcal{A}$ is the set of all polynomials, can a continuous function be uniformly approximated by a sequence of polynomials?
It is known from calculus that every function analytic at a point can be uniformly approximated in a vicinity of this point by a sequence of polynomials (Taylor's theorem). In 1885, the German Karl Weierstrass established a more general result (also known from calculus), which states that every continuous function defined on a compact interval $X$ can be uniformly approximated by polynomials. Finally, the American Marshall H. Stone in 1937 generalized the classical Weierstrass Theorem, allowing $X$ to be a compact topological space with some minor restriction on the subspace $\mathcal{A}$. For all necessary preliminaries the reader is referred to the beginning of Section 7, Chapter 1.
We will start with some auxiliary results to be rendered in a few steps (Lemmas 8.4 and 8.5) that lead to the Stone-Weierstrass Approximation Theorem.
8.1 Remark. Compactness of the topological space $(X, \tau)$ discussed above is not a mandatory prerequisite for defining the uniform metric, if we consider $\mathcal{C}^*(X;Y)$ as a subspace of all $d$-bounded continuous functions from $(X, \tau)$ to a complete metric space $(Y, d)$. The uniform metric $\rho$ is also well defined on $\mathcal{C}^*(X;Y)$. Completeness of $(\mathcal{C}^*(X;Y), \rho)$ is then due to Theorem 7.6 (where only boundedness of $\mathcal{C}^*(X;Y)$ on the compact space $X$ is essential). □
8.2 Definitions.
(i) Let $\mathcal{G}$ be a family of functions defined on a set $X$. Then $\mathcal{G}$ separates points of $X$ if for each $x$ and $y$ from $X$ such that $x \neq y$, there is a function $f \in \mathcal{G}$ such that $f(x) \neq f(y)$.
(ii) Let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be an arbitrary nonempty subcollection of continuous, bounded functions on $X$ and let $\mathcal{A}$ be any subalgebra of $\mathcal{C}^*(X;\mathbb{R})$ containing $\mathcal{G}$. The intersection of all subalgebras containing $\mathcal{G}$ is obviously a subalgebra (see Problem 8.1); moreover, it is the smallest subalgebra containing $\mathcal{G}$. It is denoted by $\mathcal{A}(\mathcal{G})$ and is called the subalgebra generated by $\mathcal{G}$. The subcollection $\mathcal{G}$ is called the generator of this subalgebra. □
8.3 Theorem (Stone-Weierstrass). Let $X$ be a compact subset of a topological space $(\Omega, \tau)$ and let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$. If $\mathcal{G}$ separates points and contains the unity 1 (i.e. the function identically equal to 1), then the subalgebra $\mathcal{A}(\mathcal{G})$ generated by $\mathcal{G}$ is dense in $\mathcal{C}^*(X;\mathbb{R})$ relative to the uniform metric $\rho$.
[Observe that if needed, the condition "$\mathcal{G}$ separates points" can be replaced by the condition "$\mathcal{A}(\mathcal{G})$ separates points."]
A few lemmas will precede the proof of Theorem 8.3.
8.4 Lemma. For each $\varepsilon > 0$, there is a polynomial $P(t)$ such that $|P(t) - |t|| < \varepsilon$ for all $t \in [-1, 1]$.
Proof. Let $\sum_{n=0}^{\infty} b_n z^n$ be the binomial expansion of the function $(1 + z)^a$ for $a \in \mathbb{Q}$ and $z \in \mathbb{C}$. Recall that this function can be expanded in the binomial series, where the coefficient $b_n$ is given by the formula
$$b_n = \frac{a(a-1)\cdots(a-n+1)}{n!}, \quad n \geq 1, \quad \text{and} \quad b_0 = 1. \qquad (8.4)$$
The binomial series is uniformly convergent on compact subsets of the open ball $B(0, 1) \subseteq \mathbb{C}$, and at the point $z = (-1, 0)$, for $a > 0$, it is absolutely convergent as a special case of a hypergeometric series. Since $|b_n x^n| \leq |b_n|$ on $[-1, 0]$, the Weierstrass M-test shows that the series $\sum_{n=0}^{\infty} b_n x^n$ with coefficients given by (8.4) is uniformly convergent to the function $(1 + x)^a$, at least for all $x \in [-1, 0]$. Letting $a = \frac{1}{2}$ and replacing $x$ by $-x$ we arrive at the series $\sum_{n=0}^{\infty} b_n' x^n$, which is uniformly convergent to $(1 - x)^{1/2}$, $\forall x \in [0, 1]$, where $b_n' = (-1)^n b_n$. The statement now follows if we set $x = 1 - t^2$, where $t \in [-1, 1]$. The series $\sum_{n=0}^{\infty} b_n' (1 - t^2)^n$ converges uniformly to $|t|$, $\forall t \in [-1, 1]$, with $b_n' = (-1)^n b_n$; and the partial sums of the series are polynomials. □
8.5 Lemma. Let $(X, \tau)$ be a topological space, $\mathcal{A} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be a subalgebra, and $(\mathcal{C}^*(X;\mathbb{R}), \rho)$ be a uniform metric space. Then the closure $\bar{\mathcal{A}}$ relative to $\rho$ is a subalgebra and, in addition, $\bar{\mathcal{A}}$ is a vector lattice, i.e., $\forall f, g \in \bar{\mathcal{A}}$, $f \wedge g$ and $f \vee g \in \bar{\mathcal{A}}$.
Proof. By Problem 8.2 a), $\bar{\mathcal{A}}$ is a subalgebra. We need to show that $\bar{\mathcal{A}}$ is a lattice. Because of
$$a \wedge b = \frac{(a + b) - |a - b|}{2} \quad \text{and} \quad a \vee b = \frac{(a + b) + |a - b|}{2},$$
it suffices to show that $f \in \bar{\mathcal{A}}$ implies $|f| \in \bar{\mathcal{A}}$. Since $f$ is a continuous, bounded function on $X$, $|f| < M$ for some $M > 0$ and $|g| < 1$, where $g = \frac{f}{M} \in \bar{\mathcal{A}}$. Then, by Lemma 8.4, for every $\varepsilon > 0$, there is a polynomial $P$ such that for all $x \in X$,
$$|P(g(x)) - |g(x)|| < \varepsilon. \qquad (8.5)$$
Since $\bar{\mathcal{A}}$ is an algebra, $P(g) \in \bar{\mathcal{A}}$. From inequality (8.5), we have that for each $\varepsilon > 0$, the ball $B_\rho(|g|, \varepsilon)$ meets $\bar{\mathcal{A}}$, implying that $|g| \in \bar{\mathcal{A}}$. Hence, $|f| = M|g| \in \bar{\mathcal{A}}$ (see Problem 1.16). Finally, the statement of the lemma follows from the linearity of $\bar{\mathcal{A}}$. □
Now we return to the Stone-Weierstrass Theorem.
Proof (of the Stone-Weierstrass Theorem). We will show that each function $f \in \mathcal{C}^*(X;\mathbb{R})$ can be approximated by functions from $\mathcal{A} = \mathcal{A}(\mathcal{G})$ relative to the uniform metric. By the assumption, $\mathcal{G}$ separates points, i.e., $\forall x_1 \neq x_2 \in X$, there exists a function $g \in \mathcal{G}$ such that $g(x_1) \neq g(x_2)$. Define for fixed $\alpha, \beta \in \mathbb{R}$ the auxiliary function
$$h(x) = \alpha + (\beta - \alpha)\,\frac{g(x) - g(x_1)}{g(x_2) - g(x_1)},$$
which belongs to $\mathcal{A}$, because $1 \in \mathcal{A}$. Thus, $\forall x_1 \neq x_2 \in X$ and $\forall \alpha, \beta \in \mathbb{R}$, there is an $h \in \mathcal{A}$ such that $h(x_1) = \alpha$ and $h(x_2) = \beta$.
Let $f \in \mathcal{C}^*(X;\mathbb{R})$. Then by the above argument, $\forall x \neq y$, there is an $h_{xy} \in \mathcal{A}$ with the property that $h_{xy}(x) = f(x)$ and $h_{xy}(y) = f(y)$, where
$$h_{xy}(z) = f(x) + [f(y) - f(x)]\,\frac{g(z) - g(x)}{g(y) - g(x)}.$$
Fix an $x$ and let $y$ be arbitrary. Since $f - h_{xy}$ is continuous at $y$ and $f(y) - h_{xy}(y) = 0$, $\forall \varepsilon > 0$ there is an open neighborhood $B(y)$ of $y$ such that
$$|h_{xy}(z) - f(z)| < \varepsilon, \quad \forall z \in B(y). \qquad (*)$$
Now, we cover $X$ by $\{B(y) : y \in X\}$ and, by compactness of $X$, reduce this cover to a finite subcover $\{B(y_1), \dots, B(y_n)\}$. Let the associated functions, with the above properties in vicinities of $y_1, \dots, y_n$, be $h_{xy_1}, \dots, h_{xy_n}$, respectively, and let $h_x = \min\{h_{xy_1}, \dots, h_{xy_n}\}$ on $X$. By Lemma 8.5, $h_x \in \bar{\mathcal{A}}$. By $(*)$, $\forall \varepsilon > 0$,
$$|h_{xy_i}(z) - f(z)| < \varepsilon, \quad \forall z \in B(y_i), \quad i = 1, \dots, n,$$
which implies that
$$h_x(z) < f(z) + \varepsilon, \quad \forall z \in X. \qquad (**)$$
Observe that the above inequalities, along with their parameters, depend upon a fixed $x \in X$. Notice that $h_x$ does not really approximate $f$ on $X$; it just approximates $f$ in a vicinity of the point $x$. Indeed, $f(x) = h_x(x)$, since $h_{xy_i}(x) = f(x)$ for every $i$, and $h_x$ and $f$ satisfy inequality $(**)$. By continuity of $f - h_x$ at $x$, for each $\varepsilon > 0$, there is an open neighborhood $B(x)$ of $x$ such that
$$|f(z) - h_x(z)| < \varepsilon, \quad \forall z \in B(x).$$
Again, let us cover $X$ by the collection $\{B(x) : x \in X\}$ and then reduce the latter to a finite subcover $\{B(x_1), \dots, B(x_k)\}$. Correspondingly, $\forall \varepsilon > 0$,
$$|f(z) - h_{x_i}(z)| < \varepsilon, \quad \forall z \in B(x_i), \quad i = 1, \dots, k.$$
Then $h = \max\{h_{x_1}, \dots, h_{x_k}\} \in \bar{\mathcal{A}}$ by similar considerations, and hence
$$f(z) - \varepsilon < h(z), \quad \forall z \in X.$$
Furthermore, $(**)$ yields that $h(z) < f(z) + \varepsilon$, and
$$|f(z) - h(z)| < \varepsilon, \quad \forall z \in X.$$
From the last inequality we have that any function $f \in \mathcal{C}^*(X;\mathbb{R})$ is approximated by elements of $\bar{\mathcal{A}}$, i.e. $\forall \varepsilon > 0$, $B_\rho(f, \varepsilon) \cap \bar{\mathcal{A}} \neq \emptyset$ which, due to Problem 1.16, implies that $B_\rho(f, \varepsilon) \cap \mathcal{A} \neq \emptyset$. □
8.6 Corollary (K. Weierstrass). Every real-valued continuous function defined on a compact interval $[a, b]$ can be approximated uniformly by polynomials. (In other words, the algebra $\mathcal{C}([a, b];\mathbb{R})$ of all continuous functions on $[a, b]$ is the closure of the subalgebra $\mathcal{A}$ of all polynomials on $[a, b]$.)
Proof. The subalgebra of all polynomials on $[a, b]$ has $\mathcal{G} = \{1, x\}$ as a generator, which contains 1 and separates points (see Problem 8.3). Therefore, the hypotheses of the Stone-Weierstrass theorem are satisfied. □
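The construction in the proof of Lemma 8.4 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sampling grid are illustrative assumptions. It evaluates the partial sums $\sum_{n=0}^{N} (-1)^n b_n (1 - t^2)^n$ with $a = \frac{1}{2}$ and measures their sampled distance from $|t|$ on $[-1, 1]$.

```python
def binomial_coefficients(a, count):
    """Return b_0, ..., b_{count-1} with b_n = a(a-1)...(a-n+1)/n!, as in (8.4)."""
    coeffs = [1.0]
    for n in range(1, count):
        coeffs.append(coeffs[-1] * (a - n + 1) / n)
    return coeffs


def abs_approx(t, n_max):
    """Partial sum of sum_n (-1)^n b_n (1 - t^2)^n for n = 0..n_max, with a = 1/2."""
    x = 1.0 - t * t
    total, power = 0.0, 1.0
    for n, b in enumerate(binomial_coefficients(0.5, n_max + 1)):
        total += (-1) ** n * b * power
        power *= x
    return total


def sup_error(n_max, grid=201):
    """Largest sampled value of |P(t) - |t|| on an even grid over [-1, 1]."""
    points = [-1.0 + 2.0 * k / (grid - 1) for k in range(grid)]
    return max(abs(abs_approx(t, n_max) - abs(t)) for t in points)


if __name__ == "__main__":
    # The sampled error shrinks as more terms are kept (hypothetical sample sizes).
    for n_max in (25, 100, 400):
        print(n_max, sup_error(n_max))
```

The sampled error is largest near $t = 0$, where $1 - t^2$ reaches the endpoint of the interval of convergence, and it decreases slowly as more terms are added. The sketch only samples finitely many points, so it illustrates the lemma rather than proving it.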
In the proof of the classical version of the Stone-Weierstrass theorem we essentially needed a subalgebra $\mathcal{A}(\mathcal{G})$. Indeed, in Lemma 8.5 we made use of the fact that $g \in \bar{\mathcal{A}}$ and $P(g) \in \bar{\mathcal{A}}$ to show that $|g| \in \bar{\mathcal{A}}$ and to claim that $\bar{\mathcal{A}}$ is a vector lattice. Had we assumed that $\mathcal{G}$ is already a vector lattice separating points and containing 1, we would have been able to prove the Stone-Weierstrass theorem (special version) without Lemmas 8.4 and 8.5.
8.7 Theorem (Stone-Weierstrass, special version). Let $(X, \tau)$ be a compact topological space and let $\mathcal{G} \subseteq \mathcal{C}^*(X;\mathbb{R})$ be a vector lattice that separates points and contains 1. Then $\mathcal{G}$ is dense in $\mathcal{C}^*(X;\mathbb{R})$. □
8.8 Example. Let $\mathcal{G}$ be the collection of all continuous piecewise linear functions on $[0, 1]$. Then $\mathcal{G}$ satisfies the hypotheses of Theorem 8.7 and $\bar{\mathcal{G}} = \mathcal{C}([0, 1];\mathbb{R})$. In other words, every continuous function on $[0, 1]$ can be uniformly approximated by piecewise linear functions. □
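Example 8.8 can be illustrated with the simplest piecewise linear approximant, the interpolant on equally spaced nodes. This is a sketch under stated assumptions: the helper names, the test function, and the sampling resolution are hypothetical choices, and the example only samples the uniform distance rather than computing it exactly.

```python
import math


def piecewise_linear(f, n):
    """Continuous piecewise linear interpolant of f on [0, 1] with n equal pieces."""
    values = [f(k / n) for k in range(n + 1)]

    def g(x):
        k = min(int(x * n), n - 1)   # index of the subinterval containing x
        w = x * n - k                # relative position inside that subinterval
        return (1 - w) * values[k] + w * values[k + 1]

    return g


def sup_distance(f, g, samples=2001):
    """Sampled version of the uniform metric rho(f, g) on [0, 1]."""
    return max(abs(f(k / (samples - 1)) - g(k / (samples - 1)))
               for k in range(samples))


if __name__ == "__main__":
    f = lambda x: math.sin(2 * math.pi * x)
    for n in (10, 100):
        print(n, sup_distance(f, piecewise_linear(f, n)))
```

For a twice differentiable function the sampled distance shrinks roughly like $1/n^2$; Theorem 8.7 guarantees approximability for every continuous function, although without a rate.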
PROBLEMS
8.1 Show that $\mathcal{A}(\mathcal{G})$ in Definition 8.2 (ii) is a subalgebra.
8.2 Let $\bar{\mathcal{A}}$ be the closure of a subset $\mathcal{A} \subseteq \mathcal{C}^*(X;\mathbb{R})$ relative to the uniform metric $\rho$. Show that
a) if $\mathcal{A}$ is an algebra, then $\bar{\mathcal{A}}$ is also an algebra;
b) if $\mathcal{A}$ is a vector lattice, then $\bar{\mathcal{A}}$ is also a vector lattice.
8.3 Let $\mathcal{G} = \{f(x) = 1,\ g(x) = x\} \subseteq \mathcal{C}([a, b];\mathbb{R})$, for $a < b \in \mathbb{R}$. Show that $\mathcal{A}(\mathcal{G})$ is the subalgebra of all polynomials on $[a, b]$.
8.4 Let $\mathcal{G}$ be the collection of all continuous, piecewise linear functions on $[0, 1]$. Show that $\mathcal{G} \subseteq \mathcal{C}([0, 1];\mathbb{R})$ is a vector lattice but not a subalgebra.
8.5 Let $X$ be a compact subset of $\mathbb{R}$. Show that $(\mathcal{C}(X;\mathbb{R}), \rho)$ is separable.
8.6 Let $(X, \tau(d))$ be a compact metrizable topological space. Show that $(\mathcal{C}(X;\mathbb{R}), \rho)$ is separable. [Hint: Use the steps that follow.
1) Let $D = \{d_1, d_2, \dots\}$ be a countable, dense set in $(X, d)$ (why?). Define $f_n(x) = d(x, d_n)$, $\forall x \in X$.
2) Show that $f_n \in \mathcal{C}(X;\mathbb{R})$.
3) Show that $\{f_n\}$ separates points.
4) Show that the algebra generated by $\{f_n : n = 0, 1, \dots\}$, with $f_0 = 1$, is dense in $\mathcal{C}(X;\mathbb{R})$.]
8.7 Prove the following: Let $X$ be a compact subset of $\mathbb{R}^n$. Then every real-valued continuous function on $X$ can be approximated uniformly by polynomial functions of $n$ variables.
8.8 Can continuous functions on a compact interval be approximated by polynomials with rational coefficients?
8.9 Show that each continuous function on a compact interval can be approximated by a differentiable function.
8.10 Can continuous functions on a compact interval be approximated by polynomials with integer coefficients? Can we apply the Stone-Weierstrass theorem?
8.11 A continuous function $f$ defined on a compact interval $[a, b]$ is called a parabolic spline if there is a partition $\{a_0 = a, a_1, \dots, a_n = b\}$ of $[a, b]$ (cf. Definition 1.7 (ii), Chapter 1) such that $f$ is a second degree polynomial on each subinterval $[a_i, a_{i+1}]$, $i = 0, \dots, n - 1$. Can continuous functions on $[a, b]$ be approximated by parabolic splines? If so, what version of the Stone-Weierstrass theorem should be applied?
8.12 Consider a subcollection $\mathcal{F}$ of "rational" parabolic splines on $[a, b]$, i.e. piecewise second degree polynomials with rational coefficients. Can continuous functions on $[a, b]$ be approximated by elements of $\mathcal{F}$?
NEW TERMS:
set of functions that separates points 160
subalgebra generated by continuous functions 161
generator of a subalgebra 161
Stone-Weierstrass Theorem 161, 164
binomial series 161
Weierstrass Theorem 163
piecewise linear function 164
subalgebra of polynomials 164
parabolic spline 165
rational parabolic spline 165
9. FILTER AND NET CONVERGENCE
In this section we will generalize the concept of convergence of sequences introduced in Section 3. Many problems in topological spaces allow significantly weaker conditions imposed on the linear order of terms in sequences while retaining the principles of convergence. This gives rise to the notion of a net, which is a set indexed by another (partially ordered) set, in which the usual linear order is therefore largely relaxed. One of the prominent applications of convergence of nets is the notion of the Riemann integral, which is known to have inspired the American Eliakim H. Moore, in his widely cited 1915 paper, Definition of limit in general integral analysis, and in the 1922 paper, A general theory of limits, co-authored with H.L. Smith, to develop the general concept of a net. Filters offer another very useful type of convergence in topological spaces, such as convergence of neighborhoods to a point. The theory of filters was developed in the thirties by the famous Bourbaki group of French mathematicians.
9.1 Definitions.
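(i) Let $X$ be a set and $\mathcal{F} \subseteq \mathcal{P}(X)$ a nonempty collection of subsets of $X$. $\mathcal{F}$ is called a filter on $X$ if:
a) $\emptyset \notin \mathcal{F}$,
b) $F_1, F_2 \in \mathcal{F}$ implies $F_1 \cap F_2 \in \mathcal{F}$,
c) $F \in \mathcal{F}$ and $F \subseteq G \subseteq X$ imply $G \in \mathcal{F}$.
(ii) A nonempty collection $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base on $X$ if:
a) $\emptyset \notin \mathcal{F}_b$,
b) for each two sets $F_1, F_2 \in \mathcal{F}_b$, there is a set $F \in \mathcal{F}_b$ such that $F \subseteq F_1 \cap F_2$ (clearly, $F_1 \cap F_2 \neq \emptyset$).
(iii) Let $\mathcal{F}$ be a filter on $X$. A collection of subsets $\mathcal{F}_b \subseteq \mathcal{P}(X)$ is called a filter base for the filter $\mathcal{F}$ if:
a) $\mathcal{F}_b \subseteq \mathcal{F}$,
b) each $F \in \mathcal{F}$ is a superset of some $F_b \in \mathcal{F}_b$.
(iv) A filter $\mathcal{F}$ on $X$ is called an ultrafilter if for each subset $A$ of $X$, either $A$ or $A^c$ is in $\mathcal{F}$. □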
9.2 Remarks.
(i) A filter is obviously a filter base, since we can take $F_1 \cap F_2$ for $F$ to have $\mathcal{F}_b \subseteq \mathcal{F}$.
(ii) Let $\mathcal{F}_b$ be a filter base on $X$. We can extend $\mathcal{F}_b$ to a filter $\mathcal{F}$ by including in $\mathcal{F}$ additionally all supersets of each $F_b \in \mathcal{F}_b$. Indeed,
a) Let $F_1, F_2 \in \mathcal{F}$. Then there are $F_b', F_b'' \in \mathcal{F}_b$ such that $F_b' \subseteq F_1$ and $F_b'' \subseteq F_2$. Thus, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F_b' \cap F_b''$ $(\subseteq F_1 \cap F_2)$. By definition, $\mathcal{F}$ contains all supersets of elements of $\mathcal{F}_b$; in particular, $F_1 \cap F_2$ is one superset of $F_b$. Consequently, $F_1 \cap F_2 \in \mathcal{F}$.
b) Let $F \in \mathcal{F}$. Then there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq F$. Now, $\mathcal{F}$ should contain all supersets of $F_b$, thus all supersets of $F$.
Therefore, $\mathcal{F}$ is a filter. Note that the above filter $\mathcal{F}$ is the smallest filter containing the filter base $\mathcal{F}_b$ (show it in Problem 9.1). For instance, $\mathcal{P}(X)$ is another filter containing $\mathcal{F}_b$. Consequently, the filter $\mathcal{F}$ above is called the filter generated by the filter base and it is denoted by $\mathcal{F}(\mathcal{F}_b)$. Thus a filter base on $X$ is a filter base for a filter on $X$, namely for the filter generated by the filter base.
(iii) We showed that a filter base on $X$ is a filter base for a filter. The converse is also true: a filter base for a filter is a filter base (show it in Problem 9.2). □
9.3 Examples.
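(i) The neighborhood system $\mathcal{U}_x$ at a point $x \in (X, \tau)$ is a filter on $X$, called the neighborhood filter.
(ii) A neighborhood base $\mathcal{B}_x$ at a point $x \in (X, \tau)$ is a filter base on $X$.
(iii) Let $x_0 \in X = \mathbb{R}$. Then the following collections of sets are filter bases:
a) $\{(x_0 - \varepsilon, x_0 + \varepsilon) \subseteq \mathbb{R} : \varepsilon > 0\}$.
b) $\{[x_0, x_0 + \varepsilon) \subseteq \mathbb{R} : \varepsilon > 0\}$.
c) $\{(x_0, x_0 + \varepsilon) \subseteq \mathbb{R} : \varepsilon > 0\}$.
d) $\{(x_0 - \varepsilon, x_0] \subseteq \mathbb{R} : \varepsilon > 0\}$.
e) $\{(x_0 - \varepsilon, x_0) \subseteq \mathbb{R} : \varepsilon > 0\}$.
f) $\{(x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\} \subseteq \mathbb{R} : \varepsilon > 0\}$. □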
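On a finite set the passage from a filter base to the filter it generates (adding all supersets, as in Remark 9.2 (ii)) can be carried out exhaustively. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual axioms, namely that a filter is a nonempty family of nonempty sets closed under finite intersections and supersets.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter_base(base):
    """Nonempty, no empty member, and each pairwise intersection contains a member."""
    base = set(base)
    if not base or frozenset() in base:
        return False
    return all(any(f <= (f1 & f2) for f in base) for f1 in base for f2 in base)


def generated_filter(base, universe):
    """All subsets of the universe that are supersets of some member of the base."""
    return {s for s in powerset(universe) if any(b <= s for b in base)}


def is_filter(family, universe):
    """Check the assumed filter axioms by brute force."""
    family = set(family)
    if not family or frozenset() in family:
        return False
    meets = all((f1 & f2) in family for f1 in family for f2 in family)
    supersets = all(s in family for f in family
                    for s in powerset(universe) if f <= s)
    return meets and supersets


if __name__ == "__main__":
    universe = {1, 2, 3, 4}
    base = {frozenset({1, 2}), frozenset({1, 2, 3})}
    print(sorted(sorted(s) for s in generated_filter(base, universe)))
```

The brute-force checks are exponential in the size of the universe, so the sketch is only meant for very small examples.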
9.4 Lemma. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters that contain a filter $\mathcal{F}_0$ on $X$. Let $\subseteq$ be the partial order of inclusion on $\mathbb{F}(\mathcal{F}_0)$. A filter $\mathcal{F} \in \mathbb{F}(\mathcal{F}_0)$ is an ultrafilter if and only if $\mathcal{F}$ is a maximal filter in $\mathbb{F}(\mathcal{F}_0)$.
Proof.
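1) Let $\mathcal{F}$ be a maximal filter in $\mathbb{F}(\mathcal{F}_0)$ and let $A \subseteq X$. Each element of $\mathcal{F}$ intersects $A$ or $A^c$. Assume that one such $F$ meets $A$. Then, by Problem 9.4, $\mathcal{F}$ meets $A$. By Problems 9.5-9.7, $\mathcal{F}_A = \{F \cap A : F \in \mathcal{F}\}$ is a filter base for
$$\mathcal{F}' := \mathcal{F} \cup \{B \subseteq X : B \supseteq A' \text{ for some } A' \in \mathcal{F}_A\},$$
which is equal to $\mathcal{F}(\mathcal{F} \cup \{A\})$, i.e., the filter generated by the collection $\mathcal{F} \cup \{A\}$. $\mathcal{F}'$ is finer than $\mathcal{F}$ and it contains $A$. Since $\mathcal{F}$ is a maximal filter, it follows that $\mathcal{F} = \mathcal{F}'$. Thus, $\mathcal{F}$ contains $A$. The same result holds if $F$ meets $A^c$. Therefore, $\mathcal{F}$ is an ultrafilter.
2) Let $\mathcal{F}$ be an ultrafilter and let $A \subseteq X$ such that $A \in \mathcal{F}$. We show that $\mathcal{F}$ is maximal. Let $\mathcal{F}'$ be any filter in $\mathbb{F}(\mathcal{F}_0)$ such that $\mathcal{F} \subset \mathcal{F}'$. Then there is $F' \in \mathcal{F}' \setminus \mathcal{F}$. Since $\mathcal{F}$ is an ultrafilter and $F' \notin \mathcal{F}$, we have that $F'^c \in \mathcal{F}$ and hence $F'^c \in \mathcal{F}'$. However, this is impossible, for the two disjoint sets $F'$ and $F'^c$ cannot belong to the same filter. This is a contradiction. □
9.5 Proposition.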
For each filter $\mathcal{F}_0$, there is an ultrafilter $\mathcal{F} \supseteq \mathcal{F}_0$.
Proof. Let $\mathbb{F}(\mathcal{F}_0)$ be the collection of all filters finer than $\mathcal{F}_0$ and let $C(\mathcal{F}_0)$ be any chain in $\mathbb{F}(\mathcal{F}_0)$. Then it is easy to see that $\bigcup\{\mathcal{F} : \mathcal{F} \in C(\mathcal{F}_0)\}$ is again a filter, and it is finer than every filter in $C(\mathcal{F}_0)$. Specifically, it is an upper bound for $C(\mathcal{F}_0)$. Then, by Zorn's Lemma 4.13, Chapter 1, $\mathbb{F}(\mathcal{F}_0)$ has a maximal element, which by Lemma 9.4 is an ultrafilter. □
9.6 Definitions.
(i) A filter $\mathcal{F}$ on a topological space $(X, \tau)$ is said to converge to an $x \in X$ (in notation $\mathcal{F} \to x$) if it is finer than or equal to the neighborhood system $\mathcal{U}_x$, i.e. if $\mathcal{U}_x \subseteq \mathcal{F}$. $x$ is said to be a limit point of the filter $\mathcal{F}$. Clearly, every neighborhood system $\mathcal{U}_x$ converges to $x$.
(ii) A filter base $\mathcal{F}_b$ is said to converge to $x$ ($\mathcal{F}_b \to x$) if for every neighborhood $U_x \in \mathcal{U}_x$, there is an $F_b \in \mathcal{F}_b$ such that $F_b \subseteq U_x$. Consequently, each neighborhood base $\mathcal{B}_x$ converges to $x$.
(iii) A point $x \in (X, \tau)$ is said to be an accumulation point of the filter $\mathcal{F}$ (filter base $\mathcal{F}_b$) if for each $F \in \mathcal{F}$ ($F_b \in \mathcal{F}_b$) and for each $U_x \in \mathcal{U}_x$, $F \cap U_x \neq \emptyset$.
(iv) Let $\mathcal{F}_b$ be a filter base on $X$ and let $f: X \to (Y, \tau')$ (a topological space). The function $f$ is said to converge to $l \in Y$ ($f \to l$) along the filter base $\mathcal{F}_b$ if for every neighborhood $V_l$ of $l$, there is an $F \in \mathcal{F}_b$ such that $f_*(F) \subseteq V_l$. □
9.7 Examples.
(i) Let $X = \mathbb{N}$ and let $\mathcal{F}_b = \{\{n, n+1, \dots\} : n \in \mathbb{N}\}$ be a filter base on $\mathbb{N}$. Now consider a map $f: \mathbb{N} \to (Y, \tau_{discrete})$, which, in fact, is a sequence in the space $Y$. Then, Definition 9.6 (iv) in this case reduces to the conventional definition of the limit of the sequence $\{f(n) = y_n\}$ (cf. Definition 3.1).
(ii) Let $(X, \tau)$, $(Y, \tau')$ be topological spaces, $f: X \to Y$, $a \in X$, $l \in Y$, and let $\mathcal{F}_b = \mathcal{U}_a$ (the neighborhood filter on $X$). Now, the expression $f \to l$ along $\mathcal{U}_a$ means: for each neighborhood $V_l$, there is a neighborhood $U_a \in \mathcal{U}_a$ such that for each $x \in U_a$, $f(x) \in V_l$ (or, equivalently, $f_*(U_a) \subseteq V_l$), in notation,
$$\lim_{x \to a} f(x) = l. \qquad (9.7)$$
Observe that as long as $\mathcal{U}_a$ is declared, and since it is unique with respect to the point $a$ and topology $\tau$, we need not specify along which filter base $f$ converges to $l$. Should $\mathcal{U}_a$ be replaced by a specific neighborhood base $\mathcal{B}_a$ (also a filter base), then we can write
$$\lim_{x \to a} f(x) = l \quad \text{along } \mathcal{B}_a. \qquad (9.7a)$$
Now, let $\mathcal{B}_a$ be a neighborhood base at $a$ with (9.7a) holding. Then, by Definition 9.6 (iv), for each neighborhood $V_l$ of $l$, there is a neighborhood $B_a \in \mathcal{B}_a$ of $a$ such that $f_*(B_a) \subseteq V_l$. Since $\mathcal{B}_a \subseteq \mathcal{U}_a$, (9.7a) then implies (9.7). Conversely, if (9.7) holds, then for each $V_l$, there is a neighborhood $U_a$ from the neighborhood system $\mathcal{U}_a$ such that $f_*(U_a) \subseteq V_l$. Because each $U_a$ is, by Definition 1.5 (iii), a superset of at least one $B_a \in \mathcal{B}_a$ (being an arbitrary neighborhood base at $a$), (9.7a) must hold. Consequently, (9.7) and (9.7a) are equivalent, even though (9.7a) is related to a specific neighborhood base of $a$. We therefore see that the limit does not depend on the choice of a neighborhood base of $a$, and (9.7) can be sustained with no specification of any neighborhood base. Consequently, (9.7) can be used for the notion of convergence of a function $f$ at a point $a$. Notice that $f$ acts between two topological spaces.
Interestingly enough, we could alternatively use a definition of convergence similar to that of continuity in Definition 4.1, i.e. with no visible mention of a filter base. This would read: A function $f$ is said to have a limit $l$ at a point $a$ if for each neighborhood $V_l$ of $l$ in $(Y, \tau')$, there is a neighborhood $U_a$ of $a$ in $(X, \tau)$ such that $f_*(U_a) \subseteq V_l$, or equivalently, if $f^*(V_l)$ is a neighborhood of $a$.
In particular, if $(X, \tau)$ is first countable (which is the case of metric spaces and many other applications), we can have $f$ converge to $l$ along any monotone decreasing countable neighborhood base of the point $a$, say, $\{B_a^n\}$. If we now select from each $B_a^n$ an arbitrary point $x_n$ (as in the proof of Theorem 4.10), then $x_n \to a$ in the usual sense and, consequently, we can write
$$\lim_{x_n \to a} f(x_n) = l, \qquad (9.7b)$$
which has a double meaning. For one, it goes back to notation (9.7)-(9.7a), and limit (9.7b) is a limit of $f$ along the filter base $\{B_a^n\}$. On the other hand, it coincides with our conventional definition of the limit of $f$ at a point $a$ along the sequence $\{x_n\}$. Finally, if the limit in (9.7b) is consistent along any sequence $\{x_n\}$ that converges to $a$, then, by arguments as in Theorem 4.10, we can show that $l$ is a limit of $f$ along a filter base $\{B_a^n\}$ and therefore along any neighborhood base of $a$. The uniqueness of $l$ is subject to Example (iv) below, and we will see that this is the case if $(Y, \tau')$ is Hausdorff. For instance, if we consider as $f$ the function
$$f(x) = \frac{g(x) - g(a)}{x - a}, \quad x \neq a,$$
then the function $[\mathbb{R}, \mathbb{R}, g]$ is differentiable at $a$ if and only if the limit
$$\lim_{x \to a} f(x) = l$$
exists, where $l = g'(a)$, and now we can say that the function $g$ is differentiable at $a$ if and only if this limit exists along any sequence $\{x_n\}$ convergent to $a$ in the sense of notation (9.7b). This idea is frequently used in analysis whenever convergence along a sequence is a plausible (if not the only) option for us.
(iii) Consider some special cases of limits along the filter bases from Example 9.3 (iii). Let $X = Y = \mathbb{R}$ and $f: X \to (Y, \tau_e)$.
a) If $\mathcal{F}_b$ on $X$ is $\mathcal{F}_b = \{(a - \varepsilon, a + \varepsilon) : \varepsilon > 0\}$, then the concept of limit introduced in Definition 9.6 (iv) reduces to the conventional definition of the limit of a function known from calculus, with the usual notation $\lim_{x \to a} f(x) = l$.
b) Similarly, with $\mathcal{F}_b = \{[a, a + \varepsilon) : \varepsilon > 0\}$ we obtain $\lim_{x \to a+} f(x) = l$.
c) With $\mathcal{F}_b = \{[b, \infty) : b \in \mathbb{R}\}$, we have $\lim_{x \to +\infty} f(x) = l$.
(iv) Let $f: X \to (Y, \tau)$, let $\mathcal{F}_b$ be a filter base on $X$, and let $(Y, \tau)$ be Hausdorff. We show that if $f$ has a limit along $\mathcal{F}_b$, then it is unique. Assume that $l_1$ and $l_2$ are two different limits along $\mathcal{F}_b$. Since $Y$ is Hausdorff, there are two disjoint neighborhoods of $l_1$ and $l_2$: $V_{l_1}$ and $V_{l_2}$. By the definition of the limit along $\mathcal{F}_b$, there are two sets $U_1, U_2 \in \mathcal{F}_b$ such that
$$f_*(U_1) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_2) \subseteq V_{l_2}.$$
By the definition of $\mathcal{F}_b$ as a filter base, there is $U \in \mathcal{F}_b$ such that $U \subseteq U_1 \cap U_2$. Since
$$f_*(U_1 \cap U_2) \subseteq V_{l_1} \quad \text{and} \quad f_*(U_1 \cap U_2) \subseteq V_{l_2},$$
we have $f_*(U) \subseteq V_{l_1} \cap V_{l_2} = \emptyset$. This is absurd, for $U \neq \emptyset$. □
When introducing convergence of a function $f: X \to (Y, \tau)$ along a filter base $\mathcal{F}_b$ on $X$ in Definition 9.6 (iv), we did not need to assume any topology on $X$. Now if we define a topology on $X$ and take for $\mathcal{F}_b$ the neighborhood filter $\mathcal{U}_{x_0}$ at a point $x_0 \in X$, then, by Definition 9.6 (iv) (applied to $\mathcal{U}_{x_0} = \mathcal{F}_b$) and taking $l \in Y$ as $f(x_0)$, we arrive at the definition of continuity of $f$ at $x_0$ that agrees with Definition 4.1: A function $f: (X, \tau) \to (Y, \tau')$ is called continuous at a point $x_0$ if
$$\lim_{x \to x_0} f(x) = f(x_0).$$
Now, we consider another very useful type of convergence: convergence along nets. As we will see, filter and net convergence have a very close relationship.
9.8 Definitions.
(i) A set $\Lambda$ is called directed if there exists a relation (denoted $\leq$) on $\Lambda$ defined as:
a) (R) for each $\lambda \in \Lambda$, $\lambda \leq \lambda$.
b) (T) $\lambda_1 \leq \lambda_2$ and $\lambda_2 \leq \lambda_3$ imply that $\lambda_1 \leq \lambda_3$.
c) (SL - superlativity) for each pair $\lambda_1, \lambda_2 \in \Lambda$, there is $\lambda \in \Lambda$ such that $\lambda_1 \leq \lambda$ and $\lambda_2 \leq \lambda$.
(ii) A net is, roughly speaking, a set indexed by a directed set, and it is a generalization of a sequence. More formally: A net in $X$ induced by $\Lambda$ is any function $f: \Lambda \to X$ where $\Lambda$ is a directed set. The point $f(\lambda)$ is denoted by $x_\lambda$, and we will then instead denote the net by $\{x_\lambda\} = \{x_\lambda : \lambda \in \Lambda\}$. Observe that since $f$ need not be surjective, $\{x_\lambda\}$ is in general a proper subset of $X$.
(iii) If $\{x_\lambda\}$ is a net, then $\{x_\lambda : \lambda_0 \leq \lambda\}$ is called a $\lambda_0$-tail of $\{x_\lambda\}$.
(iv) Let $A \subseteq X$. A $\lambda_0$-tail of a net $\{x_\lambda\}$ is called a $\lambda_0(A)$-tail of $\{x_\lambda\}$ if the $\lambda_0$-tail is a subset of $A$.
(v) A net $\{x_\lambda\}$ is said to be cofinally in $A \subseteq X$ if for each $\lambda_0 \in \Lambda$, there is $\lambda \geq \lambda_0$ such that $x_\lambda \in A$.
(vi) A point $x \in X$ is said to be an accumulation point of a net $\{x_\lambda\}$ if $\{x_\lambda\}$ is cofinally in each neighborhood $U_x \in \mathcal{U}_x$.
(vii) Let $\{x_\lambda\}$ be a net in $X$. $\{x_\lambda\}$ is said to converge to a point $x \in X$ (in notation $x_\lambda \to x$) if for each neighborhood $U_x$ of $x$, there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$. $x$ is called a limit point of the net $\{x_\lambda\}$.
(viii) A net $\{x_\lambda\}$ is called an ultranet if for every subset $A \subseteq X$, there is a $\lambda_0(A)$-tail of $\{x_\lambda\}$ or a $\lambda_0(A^c)$-tail of $\{x_\lambda\}$. □
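The axioms (R), (T), (SL) and the notion of a tail can be checked by brute force on a small finite index set. The sketch below is only an illustration: the names are hypothetical, the index set is a finite grid of pairs ordered componentwise, and on a finite directed set every net is eventually constant in the sense of tails, so nothing here replaces the general definitions.

```python
from itertools import product


def leq(lam, mu):
    """Componentwise order on tuples: lam <= mu iff lam_i <= mu_i for all i."""
    return all(a <= b for a, b in zip(lam, mu))


def is_directed(index_set, relation):
    """Check reflexivity (R), transitivity (T) and superlativity (SL) exhaustively."""
    reflexive = all(relation(l, l) for l in index_set)
    transitive = all(relation(a, c)
                     for a in index_set for b in index_set for c in index_set
                     if relation(a, b) and relation(b, c))
    superlative = all(any(relation(a, u) and relation(b, u) for u in index_set)
                      for a in index_set for b in index_set)
    return reflexive and transitive and superlative


def tail(net, relation, lam0):
    """The lam0-tail: all net values x_lam with lam0 <= lam."""
    return {value for lam, value in net.items() if relation(lam0, lam)}


if __name__ == "__main__":
    index_set = list(product(range(5), repeat=2))
    # A net on the grid: x_(m, n) = 1/(m+1) + 1/(n+1).
    net = {(m, n): 1 / (m + 1) + 1 / (n + 1) for (m, n) in index_set}
    print(is_directed(index_set, leq))
    print(sorted(tail(net, leq, (3, 3))))
```

Here the $(3, 3)$-tail lies in the set $A = \{x : |x| < 0.6\}$, so it is a $\lambda_0(A)$-tail with $\lambda_0 = (3, 3)$, whereas the two-element index set $\{(0, 1), (1, 0)\}$ fails superlativity and is not directed.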
9.9 Examples.
(i) An example of a directed set $\Lambda$ is $\mathbb{R}^n$ with $\lambda = (\lambda_1, \dots, \lambda_n) \leq \mu = (\mu_1, \dots, \mu_n)$ if and only if $\lambda_i \leq \mu_i$ for all $i = 1, \dots, n$.
(ii) A neighborhood base $\mathcal{B}_x$ at $x$, or the even more trivial case, the neighborhood system $\mathcal{U}_x$, with the relation $U_1 \leq U_2$ if and only if $U_1 \supseteq U_2$ for their elements, is a directed set.
(iii) Let $X$ be an arbitrary continuum set and let $\{x_\lambda\}$ be the net in $X$ induced by $\Lambda$ defined in (i). Now, a $\lambda_0$-tail involves only those $x \in X$ whose indices are $\leq$-related.
(iv) Let $(X, \tau)$ be a topological space, $x \in X$, and let $\mathcal{B}_x$ be any neighborhood base of $x$ directed as in (ii). Now, we index a subset of $X$ as follows. For each neighborhood $B \in \mathcal{B}_x$, we pick a point $y \in B$ and index it by $B$, and so we obtain a net $\{y_B : B \in \mathcal{B}_x\}$ in $X$. Observe that the same points of $X$ can be indexed by different neighborhoods, but for each neighborhood $B \in \mathcal{B}_x$, exactly one point (of this neighborhood) is assigned. Any such net $\{y_B\}$ will be called a net generated by the neighborhood base $\mathcal{B}_x$. It is understood that there is in general more than one net generated by a neighborhood base. If $B_0$ is any neighborhood from
collection of sets from the net, with all $B \subseteq B_0$ $(\subseteq U_x)$. However, since each $y_B \in B$, the $B_0$-tail is a subset of $B_0$, specifically, of $U_x$. Thus, $y_B \to x$.
(vi) Let $\mathfrak{P}$ be the collection of all finite partitions of the compact interval $[a, b]$. Recall (Definition 1.7 (ii), Chapter 1) that $P \in \mathfrak{P}$ is a partition of $[a, b]$ if $P$ is any ordered finite set of points $\{a_0, \dots, a_n\} \subseteq [a, b]$ with $a = a_0 < a_1 < \dots < a_n = b$. Let $P$ and $P'$ be two partitions in $\mathfrak{P}$. We say that $P'$ is finer than $P$ if $P$ is a proper subset of $P'$. $P'$ is also said to be a refinement of $P$. We direct $\mathfrak{P}$ as follows: for every pair of partitions $P, P' \in \mathfrak{P}$, $P \leq P'$ if and only if $P'$ is finer than or equal to $P$ ($P \subseteq P'$). Let $f$ be a bounded real-valued function defined on $[a, b]$, and let $m_i$ and $M_i$ denote the infimum and the supremum of $f$ on $[a_{i-1}, a_i]$. Then for a partition $P$, define:
$$L_P := \sum_{i=1}^{n} m_i (a_i - a_{i-1}) \quad \text{(Darboux lower sum indexed by } P\text{)},$$
$$U_P := \sum_{i=1}^{n} M_i (a_i - a_{i-1}) \quad \text{(Darboux upper sum indexed by } P\text{)}.$$
Consequently, $\{L_P\}$ and $\{U_P\}$ are two nets in $(\mathbb{R}, \tau_e)$, and if each of them converges to the same real number $I$ we call this number the Riemann integral of $f$ and denote it by
$$I = \int_{[a, b]} f(x)\,dx.$$
Indeed, let $U$ be a neighborhood of $I$. If $L_P \to I$, then there is a partition $P_0$ such that the $P_0$-tail of $\{L_P\}$ is in $U$, or equivalently, all Darboux lower sums indexed by the partitions finer than $P_0$ must be in the $\varepsilon$-range of $I$.
Observe that the "naive" definition of the Riemann integral is based on the existence of a limit of a sequence of lower sums over any sequence of successively refined partitions. The definition in this example is just the same, since the net convergence involves in fact the existence of such a limit over all appropriate sequences of partitions. As mentioned, this motivated E.H. Moore and H.L. Smith to develop the general concept of a net.
(vii) If an ultranet $\{x_\lambda\}$ is cofinally in $A \subseteq X$, then there exists a $\lambda_0(A)$-tail of $\{x_\lambda\}$. Indeed, by the definition of an ultranet, there is either a $\lambda_0(A)$-tail or a $\lambda_0(A^c)$-tail. The latter contradicts the assumption that $\{x_\lambda\}$ is cofinally in $A$.
(viii) If $\{x_\lambda\}$ is an ultranet and $x$ is its accumulation point, then $x$ is also a limit point. Indeed, if $x$ is an accumulation point of $\{x_\lambda\}$, then for every neighborhood $U_x$, $\{x_\lambda\}$ is cofinally in $U_x$, and thus by above
example (vii), there is a $\lambda_0(U_x)$-tail of $\{x_\lambda\}$, which implies that $x$ is a limit point of $\{x_\lambda\}$.
(ix) An example of a trivial ultranet. Let $\Lambda$ be a directed set; then any function $f: \Lambda \to \{x_0\}$ ($x_0 \in X$) is an ultranet. □
9.10 Proposition. Let $A \subseteq (X, \tau)$. Then $x \in \bar{A}$ if and only if there is a net $\{x_\lambda\}$ in $A$ such that $x_\lambda \to x$.
1) In Example 9.9 (i ) we have shown that for each x E (X,r), any
net generated by a neighborhood base <:B x converges to x. Thus, it is suffi cient to show that there is such a net located in A. If x E A, then, by Definition 9 . 6 ( iii), each neighborhood U x meets A at at least one point. Specifically, each neighborhood taken from a neighborhood base <:B x has a nonempty intersection with A . Therefore, a desired net {y B } generated by � x is any net whose terms picked from this intersections. 2 ) Conversely, if {y B } is a net in A convergent to x, then for each neighborhood U x ' there is a Ba-tail of the net that is included in U x · On the other hand, as a subset of the net, the Ba- tai l C B, which implies that the B a-tail c u n A. Consequently, u n A f. (/J and thus X E A. D 9.11 Remark. Let f: X --+ Y be any function and let {x A } C X be any net in X. Then, clearly {f(x A )} is a net in Y. D 9. 12 Definition. A function f: X --+ Y is said to converge to l E Y along the net {x A } if f(x A ) --+ l. D The theorem below refines Theorem 4.9 and modifies Theorem 4. 1 0. 9.13 Theorem. Let f: (X, r ) --+ (Y, r ' ) be a map. Then f is continu ous at a point x0 if and only if for each net { x A } in X such that x A --+ xa it yields that .1{ x A ) .1{ x0) . X
X
--+
Proof.
1) Let f be continuous at xa and let { x A } be a net in X convergent
to x a . Let W be a neighborhood of f(x a ) · Since f is continuous at x a , f * (W) is a neighborhood of xa . The convergence x A --+ x a guarantees the existence of a Aa-tail of {x A } included in f * (W). Thus, /(A a tail) C f f * (W) C W implying that f(x A ) --+ f(x a )· 2) Let f be not continuous at xa · The negation of the continuity means that there is a neighborhood W of f(xa ) for which f * (W) is not a neighborhood of x a , or equivalently, there is no neighborhood U x with a U x C f * (W). Therefore, f(U x ) is not a subset of W. This fact implies a a o
176
CHAPTER
3 . ELEMENTS OF P O INT SET T O P O LO GY
that each neighborhood B from <:B x has at least one representative, say 0 XB such that f(xB) E we. Since {xB} is a net generated by � it con a verges to x 0 , while the net {f(xB)} cannot converge to f (x 0 ), for it is separated from a neighborhood W of f (x0 ) . D X
,
9.14 Theorem. A net $\{x_\lambda\}$ in the product topological space $(X = \prod_{i \in I} X_i, \tau_p)$ is convergent to a point $x \in X$ if and only if for each $i \in I$, $\pi_i(x_\lambda) \to \pi_i(x) = x_i \in X_i$ (where $\pi_i$ denotes the $i$th projection map).

Proof.
1) Let $x_\lambda \to x$ in $X$. Then, since each projection map $\pi_i$ is continuous in the product topology, by Theorem 9.13, $\pi_i(x_\lambda) \to \pi_i(x)$, $\forall i \in I$.

2) Suppose that $\pi_i(x_\lambda) \to \pi_i(x)$, $\forall i \in I$. If $U_x$ is an element of a neighborhood base of the point $x$ in the product topology, then $U_x$ can be represented as

$U_x = \bigcap_{k=1}^{n} \pi_{i_k}^{*}(U_{x_{i_k}})$,

where each $U_{x_{i_k}}$ is a neighborhood of $x_{i_k}$ in $X_{i_k}$. Then for each $k = 1, \dots, n$, there is a $\lambda_k$ such that for all $\lambda > \lambda_k$, $\pi_{i_k}(x_\lambda) \in U_{x_{i_k}}$, i.e. the $\lambda_k$-tail is in $U_{x_{i_k}}$. Since there are only finitely many such $k$'s, by superlativity of $\Lambda$, there exists a $\lambda_0 > \lambda_k$, $k = 1, \dots, n$, such that each $\lambda_0$-tail of $\{\pi_{i_k}(x_\lambda)\}$ is in $U_{x_{i_k}}$, $k = 1, \dots, n$. Hence, $\pi_{i_k}^{*}(\lambda_0\text{-tail of } \{\pi_{i_k}(x_\lambda)\})$ is contained in $\pi_{i_k}^{*}(U_{x_{i_k}})$. Consequently, the $\lambda_0$-tail of $\{x_\lambda\}$ is in $\pi_{i_k}^{*}(U_{x_{i_k}})$, $k = 1, \dots, n$, and

$x_\lambda \in \bigcap_{k=1}^{n} \pi_{i_k}^{*}(U_{x_{i_k}}) = U_x$ for all $\lambda > \lambda_0$.

In other words, $x_\lambda \to x$. □

9.15 Remark. We revisit Example 9.7 (i), which treats a special case of the convergence of a function on $\mathbb{N}$ (a sequence) along the filter base $\mathcal{F}_b = \{\{n, n+1, \dots\}: n \in \mathbb{N}\}$ in a discrete topological space. Since any sequence is a net, the filter base $\mathcal{F}_b$ in this case obviously contains all $n_0$-tails of this net, and the convergence of $f$ along $\mathcal{F}_b$ is equivalent to the convergence of the net $\{f(n)\}$. This raises the question of how filter and net convergence are connected, and in which cases they are equivalent. We will start with the natural generalization of this case. □
9.16 Proposition. Let $\{x_\lambda\}$ be a net in $X$. Then the collection of all tails of $\{x_\lambda\}$ is a filter base on $X$. (See Problem 9.11.)

9.17 Definition. Let $\{x_\lambda\}$ be a net in $X$. The filter base in Proposition 9.16 is said to be the filter base generated by the net $\{x_\lambda\}$ and it is denoted by $\mathcal{F}_\lambda$. Correspondingly, the filter $\mathcal{F}(\mathcal{F}_\lambda)$ generated by this particular filter base is called the filter generated by the net $\{x_\lambda\}$. □

The following two criteria form a bridge between filter and net convergence.

9.18 Theorem. A net $\{x_\lambda\} \to x$ if and only if the filter $\mathcal{F}(\mathcal{F}_\lambda)$ generated by this net converges to $x$. □

9.19 Theorem. $x$ is an accumulation point of a net $\{x_\lambda\}$ if and only if $x$ is an accumulation point of the filter $\mathcal{F}(\mathcal{F}_\lambda)$ generated by this net. □

The proofs of both theorems are left for the reader as Problems 9.12 and 9.13.

9.20 Remark. Let $\mathcal{F}$ be a filter on $X$. Denote $\Lambda_{\mathcal{F}} = \{(x,F): x \in F \in \mathcal{F}\}$ and introduce the relation $\le$ on $\Lambda_{\mathcal{F}}$ by

$(x_1,F_1) \le (x_2,F_2) \iff F_2 \subseteq F_1$.

Note that from each $F$, each time we select exactly one point $x$. Consequently, we pair all elements of $F$ with $F$. Then $(\Lambda_{\mathcal{F}}, \le)$ is a directed set (show it as Problem 9.14) and the projection map $\pi: \Lambda_{\mathcal{F}} \to X$ (assigning $x$ to $(x,F)$) is a net in $X$. This net is called the net based on $\mathcal{F}$. So, the net based on $\mathcal{F}$ is just $\{x_\lambda\}$ where $\lambda = (x,F)$ and this particular $x$ is labeled by $\lambda$ or by $F$. This is somewhat similar to the labeling of a net generated by a neighborhood base. However, in this case, we select all elements $x$ of $F$ and, in addition, we deal with a filter instead of a filter base. □

9.21 Theorem. A filter $\mathcal{F}$ converges to $x$ if and only if the net based on $\mathcal{F}$ converges to $x$.

Proof.
1) Suppose that $\mathcal{F} \to x$. Then by Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathcal{F}$. Let $U_x \in \mathcal{U}_x$. Then $U_x \in \mathcal{F}$. Let $x_{\lambda_0} \in U_x$. Then $\lambda_0 = (x_{\lambda_0}, U_x) \in \Lambda_{\mathcal{F}}$. By superlativity of $\Lambda_{\mathcal{F}}$, there is $\lambda > \lambda_0$. Hence, there is an $F \ (\in \mathcal{F}) \subseteq U_x$ with $\lambda_0 < \lambda = (x_\lambda, F)$ and $x_\lambda \in F$. The collection of all such $x_\lambda$'s is the $\lambda_0$-tail and it is a subset of $U_x$, which is an arbitrary neighborhood of $x$. Therefore, $x_\lambda \to x$.

2) Let $\{x_\lambda\}$ be the net based on a filter $\mathcal{F}$ such that $x_\lambda \to x \in X$. We need to show that $\mathcal{U}_x \subseteq \mathcal{F}$. Since $x_\lambda \to x$, for each $U_x$ there is $\lambda_0 \in \Lambda_{\mathcal{F}}$ such that the $\lambda_0$-tail is in $U_x$, i.e., for some $\lambda_0 = (x_{\lambda_0}, F_0)$, all $x_\lambda \in U_x$ with $\lambda > \lambda_0$, or equivalently, with $x_\lambda \in F_\lambda \subseteq F_0$. Furthermore, $F_0$ must be contained in $U_x$. If this is not the case, then at least $F_0$ and $U_x$ are not disjoint (it follows from the above inclusions). Since by our assumption $F_0 \setminus U_x \neq \emptyset$, there is a $y \in F_0 \setminus U_x$, and then the pair $(y, F_0)$, marked with some $\lambda$, is obviously in the $\lambda_0$-tail. Thus $y_\lambda$ must belong to $U_x$, which contradicts the assumption. Another reason why $F_0 \subseteq U_x$ is that if some $x_\lambda \in F_0$ belongs to $U_x$, then all other elements of $F_0$ belong to $U_x$, for they participate in the relation $(x_\lambda, F_0) \le (y, F_0)$ and thus belong to the $\lambda_0$-tail. So, we have shown that an arbitrary neighborhood $U_x$ is a superset of some $F_0 \in \mathcal{F}$. By the definition of a filter, $U_x \in \mathcal{F}$. □

9.22 Example. If $\mathcal{F} = \mathcal{U}_x$, then such a filter always converges to $x$. By Theorem 9.21, the net $\{x_\lambda\}$ based on $\mathcal{U}_x$ converges to $x$. A $\lambda_0$-tail of this net would consist of all points $y$ indexed with all neighborhoods $U \in \mathcal{U}_x$ which are included in the "$\lambda_0$-neighborhood" $U_{\lambda_0}$. □

9.23 Remark. The following considerations are similar to those in Remark 9.20. Let $\mathcal{F}_b$ be a filter base on $X$. Denote

$\Lambda_{\mathcal{F}_b} = \{(x,F): x \in F \in \mathcal{F}_b\}$

and set the relation $\le$ in $\Lambda_{\mathcal{F}_b}$ by $(x_1,F_1) \le (x_2,F_2) \iff F_2 \subseteq F_1$. Then $\Lambda_{\mathcal{F}_b}$ is a directed set (show it in Problem 9.15). Now, the projection map $\pi: \Lambda_{\mathcal{F}_b} \to X$ is a net in $X$. This net is called the net based on the filter base $\mathcal{F}_b$. □
9.24 Theorem. A filter base $\mathcal{F}_b$ converges to $x$ if and only if the net based on $\mathcal{F}_b$ converges to $x$. □

The proof of this theorem is similar to that of Theorem 9.21 and it is the subject of Problem 9.16.
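The index set of Remark 9.23 can be explored on a small finite example. The following is a minimal sketch, assuming a hypothetical four-point carrier and a hand-picked filter base (neither is taken from the text); the helper names `index_set`, `precedes`, `is_directed` and `tail` are illustrative only. It checks that $\Lambda_{\mathcal{F}_b}$ is directed and that the tail of the net $\pi(x,F) = x$ starting at $(x_0, F_0)$ is exactly $F_0$.

```python
from itertools import product

def index_set(filter_base):
    """All pairs (x, F) with x in F and F in the filter base (the set Lambda)."""
    return [(x, F) for F in filter_base for x in F]

def precedes(a, b):
    """(x1, F1) <= (x2, F2) if and only if F2 is a subset of F1."""
    return b[1] <= a[1]

def is_directed(filter_base):
    """Check reflexivity, transitivity and existence of common upper bounds."""
    idx = index_set(filter_base)
    reflexive = all(precedes(a, a) for a in idx)
    transitive = all(
        precedes(a, c)
        for a, b, c in product(idx, repeat=3)
        if precedes(a, b) and precedes(b, c)
    )
    bounded = all(
        any(precedes(a, c) and precedes(b, c) for c in idx)
        for a, b in product(idx, repeat=2)
    )
    return bool(idx) and reflexive and transitive and bounded

def tail(filter_base, start):
    """Values x of the net pi(x, F) = x over all indices that follow `start`."""
    return {x for (x, F) in index_set(filter_base) if precedes(start, (x, F))}

# Hypothetical filter base on X = {1, 2, 3, 4}: a decreasing chain of nonempty sets.
base = [frozenset({1, 2, 3}), frozenset({2, 3}), frozenset({2})]
# Not a filter base: the intersection {2} contains no member of the family.
not_base = [frozenset({1, 2}), frozenset({2, 3})]

print(is_directed(base))                    # True
print(is_directed(not_base))                # False
print(tail(base, (3, frozenset({2, 3}))))   # {2, 3}
```

Since every tail coincides with a member of the filter base, a neighborhood contains a tail of the net exactly when it contains a member of the base, which is the idea behind Theorem 9.24.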
9.25 Example. Let $\mathcal{F}_b = \mathcal{B}_x$ be an arbitrary neighborhood base of a point $x \in X$. Then, as mentioned, $\mathcal{B}_x$ converges to $x$. By Theorem 9.24, the net $\{x_\lambda\}$ based on $\mathcal{B}_x$ also converges to $x$. A typical $\lambda_0$-tail is similar to that in Example 9.22. □

The theorem below is a refinement of Theorem 3.10, initiated for sequences.

9.26 Theorem. The following statements are equivalent:
(i) $(X,\tau)$ is $T_2$.
(ii) All limits in $(X,\tau)$ along nets or filters are unique.
(iii) The diagonal $\{(x,x) \in X^2: x \in X\}$ is closed in the product topology of $X^2$.

Proof.
(i) $\Rightarrow$ (ii): Let $(X,\tau)$ be $T_2$ and let $\mathcal{F}$ be any filter on $X$ with $\mathcal{F} \to x$ and $\mathcal{F} \to y$. By Definition 9.6 (i), $\mathcal{U}_x \subseteq \mathcal{F}$ and $\mathcal{U}_y \subseteq \mathcal{F}$. Thus, $\forall\, U_x, U_y \in \mathcal{F}$, $U_x \cap U_y \neq \emptyset$ (by the definition of a filter). Consequently, either $x = y$ or $(X,\tau)$ is not Hausdorff. If now $\{x_\lambda\}$ is any net in $X$ with $x_\lambda \to x$, then by Theorem 9.18, the filter $\mathcal{F}(\mathcal{F}_\lambda)$ generated by this net converges to the same point $x$. If $y$ were another point such that $x_\lambda \to y \neq x$, then by the same Theorem 9.18, it would mean that $\mathcal{F}(\mathcal{F}_\lambda) \to y$ as well, which is impossible, for in $T_2$, any filter, as proved, converges to at most one point.

(ii) $\Rightarrow$ (iii): Assume that all limits in $(X,\tau)$ are unique along any nets. Therefore, the net based on a filter $\mathcal{F}$ converges to $x$ and to no other point of $X$. By Theorem 9.21, it follows that $\mathcal{F}$ also converges to $x$ and to no other point of $X$. Let $D := \{(x,x) \in X^2: x \in X\}$. Then the diagonal $D$ will contain all nets $(x_\lambda, x_\lambda)$. By Proposition 9.10, a point $(x,y) \in \bar{D}$ if and only if there is a net $(x_\lambda, x_\lambda) \subseteq D$ with $(x_\lambda, x_\lambda) \to (x,y)$. Thus, if we show that $x = y$, it would imply that $\bar{D} = D$. The statement $x = y$ easily follows from the uniqueness of limits along nets (by Theorem 9.14, $x_\lambda \to x$ and $x_\lambda \to y$). Therefore, for each point $(x,x) \in \bar{D}$, there is a net $(x_\lambda, x_\lambda) \to (x,x)$. The latter yields $\bar{D} = D$.

(iii) $\Rightarrow$ (i): It can be directly taken from (iii) $\Rightarrow$ (i) of Theorem 3.11. □
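The equivalence of (i) and (iii) in Theorem 9.26 can be checked by brute force on finite spaces. The sketch below is only an illustration under stated assumptions: the two topologies on the hypothetical carrier $\{0,1\}$ (discrete and Sierpinski) and the function names are chosen here and do not come from the text. The interior of the off-diagonal set is computed as the union of all basic rectangles $U \times V$ of the product topology that it contains.

```python
from itertools import product

def is_hausdorff(points, opens):
    """T2: any two distinct points have disjoint open neighborhoods."""
    for x, y in product(points, repeat=2):
        if x != y and not any(
            x in U and y in V and not (U & V) for U in opens for V in opens
        ):
            return False
    return True

def diagonal_is_closed(points, opens):
    """The diagonal is closed iff its complement is open in the product topology."""
    off_diag = {(x, y) for x in points for y in points if x != y}
    interior = set()
    for U in opens:
        for V in opens:
            rect = set(product(U, V))   # basic open set U x V
            if rect <= off_diag:
                interior |= rect
    return interior == off_diag

# Assumed examples on the carrier {0, 1}.
pts = frozenset({0, 1})
discrete = [frozenset(), frozenset({0}), frozenset({1}), pts]
sierpinski = [frozenset(), frozenset({1}), pts]

print(is_hausdorff(pts, discrete), diagonal_is_closed(pts, discrete))      # True True
print(is_hausdorff(pts, sierpinski), diagonal_is_closed(pts, sierpinski))  # False False
```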
The next two results are analogous to Lemma 4.11 and Proposition 4.12 and are left for students as exercises.

9.27 Lemma. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous functions and let $(Y,\tau')$ be $T_2$. Then the set $S := \{x: f(x) = g(x)\}$ is closed in $X$. □

9.28 Proposition. Let $f, g: (X,\tau) \to (Y,\tau')$ be continuous maps and let $(Y,\tau')$ be $T_2$. If $f$ and $g$ coincide on some dense set $D \subseteq X$, then $f = g$ on $X$. □
PROBLEMS

9.1 Show that the filter $\mathcal{F}$ in Remark 9.2 (ii) is the smallest filter containing the filter base $\mathcal{F}_b$.
9.2 Show that a filter base for a filter is a filter base.
9.3 Let $X$ be a set and $A \subseteq X$. Define $\mathcal{F} := \{F \in \mathcal{P}(X): A \subseteq F\}$. Show that $\mathcal{F}$ is a filter on $X$. Give the smallest filter base $\mathcal{F}_b$ on $X$ containing the set $A$.

For Problems 9.4-9.7, let $\mathcal{F}$ be a filter on $X$ and $A \subseteq X$.

9.4 Show that if one element $F \in \mathcal{F}$ meets $A$, then $A$ meets all other elements of $\mathcal{F}$. In this case we say $\mathcal{F}$ meets $A$.
9.5 Let $\mathcal{F}$ meet $A$. Show that $\mathcal{F}_A := \{F \cap A: F \in \mathcal{F}\}$ is a filter on $X$, called the trace of the filter $\mathcal{F}$ on $A$.
9.6 Show that $\mathcal{F}' := \mathcal{F} \cup \big( \bigcup_{B \supseteq A} \mathcal{F}_B \big)$ is the smallest filter containing $\mathcal{F} \cup \{A\}$.
9.7 Show that $\mathcal{F}_A$ is a filter base for $\mathcal{F}'$.
9.8 Show that $x$ is an accumulation point of a filter $\mathcal{F}$ if and only if $x \in \bigcap \{\bar{F}: F \in \mathcal{F}\}$.
9.9 Show that if a filter $\mathcal{F}$ converges to $x$ then $x$ is an accumulation point of $\mathcal{F}$.
9.10 Let $(X,d)$, $(Y,\rho)$ be metric spaces, $x_0 \in X$, $l \in Y$. Show that the following statements are equivalent:
(i) $\lim_{x \to x_0} f(x) = l$ (in the sense of Definition 9.6 (iv) and Example 9.7 (ii)).
(ii) For each $\varepsilon > 0$, there is a $\delta > 0$ such that for all $x \in X$ with $d(x,x_0) < \delta$, $\rho(f(x), l) < \varepsilon$.
[Hint: Work with the system of open balls as a filter base.]
9.11 Prove Proposition 9.16.
9.12 Prove Theorem 9.18.
9.13 Prove Theorem 9.19.
9.14 Show that $(\Lambda_{\mathcal{F}}, \le)$ is a directed set.
9.15 Show that $(\Lambda_{\mathcal{F}_b}, \le)$ is a directed set.
9.16 Prove Theorem 9.24.
9.17 Show that the net based on an ultrafilter is an ultranet.
9.18 Show that the filter generated by an ultranet is an ultrafilter.
9.19 Generalize Theorem 3.11 replacing condition (ii) by the condition: each net or filter in $(X,\tau)$ converges to no more than one point.
9.20 Prove Lemma 9.27.
9.21 Prove Proposition 9.28.
NEW TERMS:

filter; filter base; filter base for a filter; ultrafilter; filter generated by a filter base; neighborhood filter; maximal filter; convergence of a filter; limit point of a filter; convergence of a filter base; accumulation point of a filter; accumulation point of a filter base; convergence of a function along a filter base; limit of a function at a point; continuity of a function at a point; directed set; net; net induced by a directed set; $\lambda_0$-tail of a net; net, cofinally in a set; accumulation point of a net; convergence of a net to a point; limit point of a net; ultranet; net generated by a neighborhood base; partition of an interval; refinement of a partition; Darboux lower sum; Darboux upper sum; Riemann integral; function convergent along a net; continuity of a function, criterion of; convergence of a net to a point, criterion of; filter base generated by a net; accumulation point of a net, criterion of; convergence of a filter to a point, criterion of; convergence of a filter base to a point, criterion of; uniqueness of limits along nets and filters, criteria of; filter that meets a set; trace of a filter on a set
10. SEPARATION
In this section we will see that the fineness of a topology is characterized by its ability to separate points and sets. We will treat some special types of topological spaces whose properties are similar to those of the Hausdorff spaces introduced in Section 3, here given in weaker or stronger forms. In addition to countability, this is another attempt to arrive at various classes of topological spaces that share properties with metric spaces and yet are considerably more general.

10.1 Definitions. Let $(X,\tau)$ be a topological space.

(i) $(X,\tau)$ is called a $T_0$ space if for each pair of points $x \neq y \in X$, there is a neighborhood $U_x$ of one of them, say $x$, such that $y \in U_x^c$.

(ii) $(X,\tau)$ is called a $T_1$ space if for each pair $x \neq y \in X$, there are $U_x$ and $U_y$ such that $y \in U_x^c$ and $x \in U_y^c$.

(iii) $(X,\tau)$ is called a $T_2$ space (or Hausdorff) if $\forall x \neq y \in X$, $\exists\, U_x, U_y$: $U_x \cap U_y = \emptyset$.

(iv) $(X,\tau)$ is called regular if for every closed set $F \subseteq X$ and for every point $x \in F^c$ there are disjoint open sets $O_x$ and $O_F$ such that $F \subseteq O_F$ and $x \in O_x$.

(v) $(X,\tau)$ is called a $T_3$ space if it is regular and it is a $T_1$ space.

(vi) $(X,\tau)$ is called completely regular if every closed set $F \subseteq X$ and every point $x \in F^c$ can be separated by a continuous function, i.e. if there is a continuous function $f: (X,\tau) \to ([0,1],\tau_e)$ with $f(x) = 0$ and $f_{*}(F) = \{1\}$.

(vii) $(X,\tau)$ is called Tychonov if it is completely regular and a $T_1$ space.

(viii) $(X,\tau)$ is called normal if any two disjoint closed sets have disjoint open supersets.

(ix) $(X,\tau)$ is called a $T_4$ space if it is normal and a $T_1$ space.

(x) $(X,\tau)$ is called locally compact if every point of $X$ has at least one compact neighborhood. □

10.2 Lemma. The following are equivalent:
(i) $(X,\tau)$ is $T_1$.
(ii) Each one-point set is closed.
(iii) Every subset of $X$ equals the intersection of all open sets containing this set.
Proof.
(i) $\Rightarrow$ (ii): Let $(X,\tau)$ be $T_1$ and let $x \in X$. Then by the definition, each $y$ ($\neq x$) has a neighborhood disjoint from $\{x\}$; for instance, $X \setminus \{x\}$ is one. By the definition of a neighborhood, there is an open neighborhood, say $O_y \subseteq X \setminus \{x\}$. Thus, $y$ is an interior point of $X \setminus \{x\}$. Since $y \in X \setminus \{x\}$ was an arbitrary choice, it follows that $X \setminus \{x\}$ is an open set. [Observe that Hausdorff spaces have the same property, cf. Problem 3.2.]

(ii) $\Rightarrow$ (iii): Assume that each singleton in $(X,\tau)$ is closed. Let $A \subseteq X$. Then $A = \bigcap_{x \in A^c} (X \setminus \{x\})$. Now, the statement follows from the fact that $X \setminus \{x\}$ is open and that $A \subseteq X \setminus \{x\}$, $\forall x \in A^c$.

(iii) $\Rightarrow$ (i): Assume that every subset $A \subseteq X$ is the intersection of all open sets containing $A$. Let $A = \{x\}$. Then $\{x\}$ is the intersection of all open neighborhoods of $x$, i.e. $\{x\} = \bigcap O_x$. Let $y$ be a point such that there is no open set $O_x$ that does not contain $y$. This implies that $y \in O_x$ for every $O_x$, hence $y \in \bigcap O_x$ and $y = x$. □

10.3 Proposition. If $(X,\tau)$ is a $T_i$ space then the following diagram holds:

$T_4 \Rightarrow T_3 \Rightarrow T_2 \Rightarrow T_1 \Rightarrow T_0$.

Proof. Indeed: $T_2 \Rightarrow T_1 \Rightarrow T_0$ is obvious. Since $T_3$ is $T_1$, by Lemma 10.2, we take $F = \{y\}$, which is closed, to get $T_2$. Similarly, by letting $F_2 = \{x\}$ and applying Lemma 10.2 to the set $\{x\}$, we have $T_4 \Rightarrow T_3$. □

10.4 Example. Let $X$ be any infinite set equipped with the cocountable topology $\tau = \{X, \emptyset, C^c: |C| \le |\mathbb{N}|\}$ (introduced in Problem 1.7). Thus, by the definition, all at most countable sets are closed; specifically, all singletons are closed. Thus, by Lemma 10.2, $\tau$ must be $T_1$. Similarly, any cofinite topology (cf. Problem 1.1) is $T_1$. Now let $O_1$ and $O_2$ be any two open sets in a cofinite topology with an infinite carrier. We show that $O_1$ and $O_2$ cannot be disjoint unless $O_1$ or $O_2$ is empty. If they are disjoint and nontrivial, then $O_1 \subseteq O_2^c$, which is impossible, for $O_2^c$ must be finite and $O_1$ is infinite. Thus any cofinite topology on an infinite carrier cannot be $T_2$. Similarly, any cocountable topology on a carrier whose cardinal number is greater than $\aleph_0$ cannot be $T_2$. □

10.5 Theorem. The following are equivalent for a topological space $(X,\tau)$:
(i) $X$ is regular.
(ii) If $O_x$ is an open neighborhood of $x$ then there exists an open set $U$ which contains $x$ and such that $\bar{U} \subseteq O_x$.
(iii) Each $x \in X$ has a neighborhood base consisting of closed sets.

Proof.
(i) $\Rightarrow$ (ii). Suppose $X$ is regular. Let $x \in O \in \tau$. Then $O^c$ is closed and $x \notin O^c$, and by regularity of $X$, there are disjoint open sets $U$ and $W$ such that $x \in U$ and $O^c \subseteq W$. Clearly, $W^c$ is closed and $U \subseteq W^c \subseteq O$. Furthermore, $\bar{U} \subseteq W^c \subseteq O$.

(ii) $\Rightarrow$ (iii). If $\mathcal{B}_x$ is a neighborhood base at $x$, then for each $B \in \mathcal{B}_x$, there is an open subset $O$ of $B$ containing $x$ and, if (ii) applies, there is an open subset $U$ of $O$ containing $x$ whose closure is in $O$. This way, we can form a neighborhood base at $x$ which consists of closed sets.

(iii) $\Rightarrow$ (i). Let $F$ be a closed set such that $x \in F^c$. Then, if (iii) applies, there is a closed neighborhood $B$ of $x$ such that $B \subseteq F^c$. As a neighborhood of $x$, $B$ is such that $\overset{\circ}{B} \neq \emptyset$ and $\overset{\circ}{B}$ is an open neighborhood of $x$ (for there is an open subset of $B$ that is a neighborhood of $x$). Now we have that $\overset{\circ}{B}$ is disjoint from $B^c$, $x \in \overset{\circ}{B}$, and $F \subseteq B^c$. Hence, $X$ is regular. □
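On a finite space, conditions (i) and (ii) of Theorem 10.5 can be compared directly. The following minimal sketch assumes two hypothetical finite topologies chosen for illustration (a partition topology on $\{0,1,2\}$ and the Sierpinski topology on $\{0,1\}$); the function names are not from the text. Each topology is a list of frozensets that includes the empty set and the carrier.

```python
def closure(A, points, opens):
    """Intersection of all closed sets (complements of open sets) containing A."""
    result = set(points)
    for O in opens:
        C = points - O          # a closed set
        if A <= C:
            result &= C
    return frozenset(result)

def is_regular(points, opens):
    """Definition 10.1 (iv): a closed F and a point x outside F have disjoint open supersets."""
    for O in opens:
        F = points - O          # closed set; x ranges over its complement O
        for x in O:
            if not any(
                x in U and F <= W and not (U & W) for U in opens for W in opens
            ):
                return False
    return True

def satisfies_criterion_ii(points, opens):
    """Theorem 10.5 (ii): every open O containing x contains the closure of an open U containing x."""
    for O in opens:
        for x in O:
            if not any(x in U and closure(U, points, opens) <= O for U in opens):
                return False
    return True

# Assumed examples.
pts3 = frozenset({0, 1, 2})
partition = [frozenset(), frozenset({0}), frozenset({1, 2}), pts3]
pts2 = frozenset({0, 1})
sierpinski = [frozenset(), frozenset({1}), pts2]

print(is_regular(pts3, partition), satisfies_criterion_ii(pts3, partition))    # True True
print(is_regular(pts2, sierpinski), satisfies_criterion_ii(pts2, sierpinski))  # False False
```

The partition topology is regular without being $T_0$, which illustrates why Definition 10.1 (v) adds the $T_1$ requirement separately.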
10.6 Proposition. A compact Hausdorff space is regular.

Proof. Let $F$ be a closed subset of a compact Hausdorff space $(X,\tau)$ and let $x \in F^c$. For each $a \in F$, there are open neighborhoods $V_a$ and $U_x^a$ of $a$ and $x$, respectively, which are disjoint. Because $F$ is closed, by Theorem 6.9 it is also compact, and therefore, there is a finite open subcover $\{V_{a_1}, \dots, V_{a_n}\}$ of $F$ reduced from $\{V_a: a \in F\}$. If

$V = \bigcup_{k=1}^{n} V_{a_k}$ and $U = \bigcap_{k=1}^{n} U_x^{a_k}$,

then $U$ and $V$ are disjoint open sets such that $x \in U$ and $F \subseteq V$. □
10.7 Corollary. A compact Hausdorff space is normal.

Proof. Let $A$ and $B$ be disjoint closed subsets of $(X,\tau)$. Since $(X,\tau)$ is regular, for each $a \in A$ there are disjoint open sets $U_a$ and $V_a$ such that $U_a$ is a neighborhood of $a$ and $V_a$ is a superset of $B$. Because $A$ is compact, $\{U_a\}$ is reduced to a finite subcover $\{U_{a_1}, \dots, U_{a_n}\}$ whose union is $U$. Let $V = \bigcap_{k=1}^{n} V_{a_k}$. Then $B \subseteq V$, which is open, and $U$ and $V$ are disjoint. □

The class of locally compact Hausdorff spaces explored below will be useful in Chapter 8 when dealing with measures and integration.

10.8 Examples.
(i) Observe that by Theorem 6.9, a compact topological space $(X,\tau)$ is also locally compact [i.e. $\bar{U}_x \subseteq X$ must be compact].
(ii) The space $(\mathbb{R}^n, \tau_e)$ is not compact but locally compact: every point $x \in \mathbb{R}^n$ has a compact neighborhood $[x - \delta, x + \delta]$. □
10.9 Theorem. Each locally compact Hausdorff space $(X,\tau)$ has the property that each point of $X$ has a neighborhood base consisting of open sets whose closures are compact.

Proof. Let $(X,\tau)$ be a locally compact Hausdorff space. Choose a point $x \in X$. Let $U$ be any neighborhood of $x$ and $K$ be a compact neighborhood of $x$, which is guaranteed by Definition 10.1 (x). Denote $O = \operatorname{Int}(K \cap U)$. As a closed subset of $K$ ($O \subseteq K \Rightarrow \bar{O} \subseteq \bar{K} = K$), by Theorem 6.9, $\bar{O}$ is compact in $\tau \cap K$. By Proposition 10.6, as a compact and Hausdorff subspace, $\bar{O}$ is regular. As an open neighborhood of $x$ in $(X,\tau)$ and a subset of $\bar{O}$, $O$ is also open in $\tau \cap \bar{O}$. By Theorem 10.5, there is an open neighborhood $W$ of $x$ in $\tau \cap \bar{O}$ such that its closure in $\tau \cap \bar{O}$ satisfies $\bar{W} \subseteq O$. (It is easily seen that $W$ is also open in $\tau$.) Since $\bar{O}$ is a compact subspace, $\bar{W}$ is compact in $\bar{O}$. We need to show that $\bar{W}$ is also compact in $(X,\tau)$. Let $\{V_s\}$ be an open cover of $\bar{W}$ in $\tau$. Then $\{V_s \cap \bar{O}\}$ is obviously an open cover of $\bar{W}$ in $\tau \cap \bar{O}$. This cover can be reduced to a finite subcover $\{V_1 \cap \bar{O}, \dots, V_k \cap \bar{O}\}$ and therefore, $\{V_1, \dots, V_k\}$ is a finite subcover of $\bar{W}$ in $\tau$.

In a nutshell, we showed that an arbitrary neighborhood $U$ of $x$ has an open subneighborhood $W$ whose closure is compact. Hence, the sets $W$ obtained in this way form a neighborhood base at $x$ consisting of open sets whose closures are compact. In particular, it means that every point of $X$ possesses a neighborhood base consisting of compact sets. □
10.10 Proposition. Let $(X,\tau)$ be a locally compact Hausdorff space, $x$ a point of $X$, and let $U_x$ be an open neighborhood of $x$. Then there is an open neighborhood $O_x$ of $x$ such that $\bar{O}_x \subseteq U_x$ and $\bar{O}_x$ is compact. (See Problem 10.6.)

10.11 Proposition. Let $K$ be a compact set in a locally compact Hausdorff space $(X,\tau)$ and $W$ be an open superset of $K$. Then there is an open superset $U$ of $K$ such that $\bar{U} \subseteq W$ and $\bar{U}$ is compact.

Proof. By Proposition 10.10, each point $x$ of $K$ has an open neighborhood $U_x$ whose closure is compact and included in $W$. If we cover $K$ by all $U_x$'s, because of compactness of $K$, this cover can be reduced to a finite subcover, say $U_1, \dots, U_n$. If $U = U_1 \cup \dots \cup U_n$, then clearly

$K \subseteq U \subseteq \bar{U} = \bar{U}_1 \cup \dots \cup \bar{U}_n \subseteq W$.

As a finite union of compact sets, $\bar{U}$ is compact. □

The next result is a small and useful consequence of Proposition 10.11 (whose proof we assign to Problem 10.8). It states that every locally compact Hausdorff space is "weakly" normal. Recall that a space is normal if every two disjoint closed sets can be separated, i.e. they have disjoint open supersets. In a locally compact Hausdorff space, the same property applies to compact sets, which, as we know (cf. Theorem 6.10), are closed in Hausdorff spaces. In other words, any two disjoint compact sets can be separated by disjoint open supersets.

10.12 Corollary. In a locally compact Hausdorff space any two disjoint compact sets have disjoint open supersets. □
The theorem below is quite famous and is known as Urysohn's Lemma. Given two disjoint closed sets in a normal space $(X,\tau)$, the lemma asserts the existence of a real-valued continuous function $f$ on $X$ that "separates" the two given disjoint closed sets, i.e. $f: X \to [0,1]$ such that $f_{*}(A) = \{0\}$ and $f_{*}(B) = \{1\}$. (The original proof guarantees the existence of a function $f$ from $X$ onto $[0,1]$, but with a simple transformation, the range of $f$ can be made $[a,b]$.) Whenever we talk about real-valued functions from $X$ to $\mathbb{R}$, we will mean the usual topology in $\mathbb{R}$. The following short biographical note on Pavel S. Urysohn will add to the prominence of his widely cited lemma.

Pavel Samuilovich Urysohn (born in 1898 in Odessa, Russia), according to Pavel S. Alexandrov, was the founder of the Russian school of topology. He studied mathematics under Nikolai N. Lusin at Moscow State University, from which he was awarded a doctoral degree in 1921. He tragically died by drowning in Brittany, France (at the early age of 26), during a visit to one of the mathematical conferences. Among the significant results Urysohn obtained during his less than four years of academic work was his treatment of one of the central problems in topology: the dimension of arbitrarily complex geometrical figures.

10.13 Theorem (Urysohn's Lemma). A space $(X,\tau)$ is normal if and only if whenever $A$ and $B$ are disjoint closed sets in $X$, there is a continuous function $f: X \to [0,1]$ such that $f_{*}(A) = \{0\}$ and $f_{*}(B) = \{1\}$.

Proof.
Necessity. We assume that $(X,\tau)$ is normal and that $A$ and $B$ are disjoint closed sets. By normality of $(X,\tau)$ and Problem 10.8, there is an open superset $U_{1/2}$ of $A$ such that $\bar{U}_{1/2} \cap B = \emptyset$. Now, the sets $A$ and $(U_{1/2})^c$ are disjoint and closed. By normality, there are open supersets $U_{1/4}$ and $V$ of $A$ and $(U_{1/2})^c$, respectively, such that

$A \subseteq U_{1/4}$, $(U_{1/2})^c \subseteq V$ and $U_{1/4} \cap V = \emptyset$.

Therefore, $U_{1/4} \subseteq V^c \subseteq U_{1/2}$ and this yields that $\bar{U}_{1/4} \subseteq V^c \subseteq U_{1/2}$. Since $B$ and $\bar{U}_{1/2}$ are disjoint and closed, by Problem 10.8, there is an open superset $U_{3/4}$ of $\bar{U}_{1/2}$ such that $\bar{U}_{3/4} \cap B = \emptyset$. In summary,

$A \subseteq U_{1/4}$, $\bar{U}_{1/4} \subseteq U_{1/2}$, $\bar{U}_{1/2} \subseteq U_{3/4}$, and $\bar{U}_{3/4} \cap B = \emptyset$.

For convenience, we display one more step. Repeating the above arguments, there are open sets

$U_{1/8}, U_{1/4}, U_{3/8}, U_{1/2}, U_{5/8}, U_{3/4}$, and $U_{7/8}$

that are embedded in the following way:

$A \subseteq U_{1/8}$, $\bar{U}_{1/8} \subseteq U_{1/4}$, $\bar{U}_{1/4} \subseteq U_{3/8}$, $\bar{U}_{3/8} \subseteq U_{1/2}$, $\bar{U}_{1/2} \subseteq U_{5/8}$, $\bar{U}_{5/8} \subseteq U_{3/4}$, $\bar{U}_{3/4} \subseteq U_{7/8}$, with $\bar{U}_{7/8} \cap B = \emptyset$.

Continuing the same process, we define sets $U_{i/2^n}$, $i = 1, \dots, 2^n - 1$, which are embedded as

$A \subseteq U_{1/2^n}$, $\bar{U}_{1/2^n} \subseteq U_{2/2^n}$, $\dots$, $\bar{U}_{(2^n - 1)/2^n} \cap B = \emptyset$.

Let $D_0$ denote the set of all dyadic rationals belonging to $[0,1]$, i.e. those numbers of the form $i/2^n$ where $i = 0, 1, \dots, 2^n$ and $n = 0, 1, \dots$, and let $D$ be the subset of dyadic rationals from $(0,1)$, i.e. $D_0 \setminus \{0,1\}$. It is easy to show that $D_0$ is dense in $[0,1]$. By induction, we can construct the countable family $\{U_d;\ d \in D\}$ of open sets indexed by the elements of $D$ such that for each pair $p, q \in D$ with $p < q$,

$A \subseteq U_p$, $\bar{U}_p \subseteq U_q$ and $\bar{U}_q \cap B = \emptyset$.

Let $U$ denote the union of all $U_d$'s. Now, we introduce the function

$f(w) = \inf\{p: w \in U_p\}$ if $w$ belongs to some $U_p$, and $f(w) = 1$ if $w \in X \setminus U$,
on $X$. Clearly, $f_{*}(A) = \{0\}$ and $f_{*}(B) = \{1\}$, and $[0,1]$ is the range of $f$. We prove that $f$ is continuous at each point $w$ of $X$. Continuity is subject to the following arguments. It is easy to show that:

if $w \in \bar{U}_p$ then $f(w) \le p$;
if $w \notin U_p$ then $f(w) \ge p$.

By Definition 4.1, $f$ is continuous at $w$ if for every neighborhood $W_{f(w)}$, there is a neighborhood $V_w$ such that $f_{*}(V_w) \subseteq W_{f(w)}$. Let $f(w) \in (0,1)$ and let $(a,b) = W_{f(w)}$ be any open subinterval of $[0,1]$ containing $f(w)$. Because $D$ is dense in $[0,1]$, there is a pair of dyadic rationals $p, q \in D$ such that

$a < p < f(w) < q < b$.

Now, the open set $V_w = U_q \setminus \bar{U}_p$ is a neighborhood of $w$ such that $f_{*}(V_w) \subseteq (a,b)$. It is a rather routine procedure to verify the continuity of $f$ at $0$ and $1$. This completes the proof of necessity.

Sufficiency. Assume that for any two closed disjoint sets $A$ and $B$, there is a continuous function $f: X \to [0,1]$ such that $f_{*}(A) = \{0\}$ and $f_{*}(B) = \{1\}$. Since $f$ is continuous, $f^{*}([0,\tfrac{1}{2}))$ and $f^{*}((\tfrac{1}{2},1])$ are disjoint open sets in $(X,\tau)$ and they contain $A$ and $B$, respectively. □

10.14 Corollary. A $T_4$ space is Tychonov.

Proof. Let $(X,\tau)$ be a $T_4$ space. By Lemma 10.2, as in any $T_1$ space, each singleton in $(X,\tau)$ is closed. Since the $T_4$ space is normal, given an $x$ and a closed set $F$ to which $x$ does not belong, by Urysohn's Lemma there is a continuous function $f$ with the range $[0,1]$ which separates $\{x\}$ and $F$. Hence, $(X,\tau)$ is completely regular. In addition, $(X,\tau)$ is a $T_1$ space. □

10.15 Corollary (Urysohn). Let $K$ and $W$ be compact and open sets,
respectively, in a locally compact Hausdorff space $(X,\tau)$ such that $K \subseteq W$. Then there is a continuous function $[X,[0,1],f]$ such that $f_{*}(K) = \{1\}$ and $f_{*}(G) = \{0\}$, where $G^c$ is a compact subset of $W$ containing $K$.

Proof. By Proposition 10.11, there is an open superset $U$ of $K$ whose closure $\bar{U}$ is compact and is contained in $W$. Since the subspace $(\bar{U}, \tau \cap \bar{U})$ is compact Hausdorff, by Corollary 10.7, it is normal. Then, by Urysohn's Lemma, for any two disjoint closed subsets $A$ and $B$ of $\bar{U}$, there is a continuous function $[\bar{U},[0,1],\varphi]$ such that $\varphi_{*}(A) = \{0\}$ and $\varphi_{*}(B) = \{1\}$. Now, if we take $A = \bar{U} \setminus U$ and $B = K$, we have two disjoint closed subsets of $\bar{U}$ (see Theorem 6.10) in the scenario of Urysohn's Lemma. Next, we extend the function $\varphi$ to $X$ by letting $f_{*}(X \setminus U) = \{0\}$, where $f$ denotes the extension of $\varphi$ from $\bar{U}$ to $X$. Hence, in particular, $f$ vanishes on its subset $G = (\bar{U})^c$. It remains to show that $f$ is continuous. Let $C$ be any closed subset of $[0,1]$. If $C$ does not contain $0$, then $f^{*}(C) = \varphi^{*}(C)$ is closed in $\bar{U}$ and, therefore, it is closed in $(X,\tau)$ (as the traces of all $\tau$-closed sets on $\bar{U}$ are all the closed sets in $\tau \cap \bar{U}$ and they are closed in $\tau$). If $0 \in C$, then

$f^{*}(C) = f^{*}(C \cup \{0\}) = \varphi^{*}(C) \cup U^c$

is also closed in $(X,\tau)$. □
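The dyadic construction in the proof of Urysohn's Lemma (Theorem 10.13) can be made concrete in a very simple case. The sketch below is an illustration under assumptions that are not in the text: $X = \mathbb{R}$ with the usual topology, $A = (-\infty, 0]$, $B = [1, \infty)$, and the open sets $U_p = (-\infty, p)$ for dyadic $p \in (0,1)$, which satisfy $A \subseteq U_p$, $\bar{U}_p \subseteq U_q$ for $p < q$, and $\bar{U}_q \cap B = \emptyset$. The function `f_level` evaluates $\inf\{p : w \in U_p\}$ over the dyadics of a fixed level $n$ only, so it approximates the limiting function from above.

```python
from fractions import Fraction

def dyadics(n):
    """Dyadic rationals i / 2**n strictly between 0 and 1."""
    return [Fraction(i, 2**n) for i in range(1, 2**n)]

def in_U(w, p):
    """Membership in the assumed open set U_p = (-inf, p) of the real line."""
    return w < p

def f_level(w, n):
    """min{p of level n : w in U_p}, or 1 if w lies in no U_p of that level."""
    candidates = [p for p in dyadics(n) if in_U(w, p)]
    return min(candidates) if candidates else Fraction(1)

print(f_level(Fraction(1, 3), 3))   # 3/8
print(f_level(Fraction(-5), 3))     # 1/8
print(f_level(Fraction(2), 3))      # 1
```

For these choices the level-$n$ values decrease, as $n$ grows, to the function that equals $0$ on $A$, $1$ on $B$ and $w$ on $[0,1]$; on $A$ the level-$n$ value is $2^{-n}$, which shows why the infimum has to be taken over all of $D$ rather than over a single level.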
10.16 Definition and Notation. Let $(X,\tau)$ be a topological space. Any at most countable intersection of open sets is denoted by $G_\delta$. Any at most countable union of closed sets is denoted by $F_\sigma$. A set is referred to as $\sigma$-compact, in notation $K_\sigma$, if it is an at most countable union of compact sets. □

10.17 Proposition. Let $(X,\tau)$ be a second countable locally compact Hausdorff space. Then each open set is an $F_\sigma$- and $K_\sigma$-set and each closed set is a $G_\delta$.

Proof. Let $\mathcal{B}$ be a countable basis for $\tau$ and let $U \in \tau$. By Proposition 10.10, each point $x \in U$ has an open neighborhood $O_x$ such that $\bar{O}_x \subseteq U$ and $\bar{O}_x$ is compact. On the other hand, $O_x$ can be represented as a union of some sets from $\mathcal{B}$. Let $B_x$ be one such subset of $O_x$ that contains $x$. Then $\bar{B}_x \subseteq \bar{O}_x$, $\bar{B}_x$ is compact, and $\bar{B}_x \subseteq U$. Consequently, $U$ can be represented as

$U = \bigcup_{x \in U} \bar{B}_x$.

Since all $B_x$'s are elements of $\mathcal{B}$, which is countable, the family $\{B_x: x \in U\}$ automatically reduces to a countable cover of $U$ and so does $\{\bar{B}_x: x \in U\}$. In other words, $U$ is an $F_\sigma$- and $K_\sigma$-set. Let $F$ be a closed set. Then $F^c$ is an $F_\sigma$- and $K_\sigma$-set, say $F^c = \bigcup_n C_n$ with closed sets $C_n$. Thus, $F = \bigcap_n C_n^c$ is a $G_\delta$-set. □

Obviously,

10.18 Corollary. Every second countable locally compact Hausdorff space is $\sigma$-compact. □
10.19 Examples.
(i) (Topology on $\bar{\mathbb{R}}$.) In Example 1.2 (iv) we constructed a topology on the extended real line. There is another way to do it. Define the map $f$ of $A = [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$ onto $\bar{\mathbb{R}}$ as follows:

$f(x) = \tan x$ for $-\tfrac{\pi}{2} < x < \tfrac{\pi}{2}$, $f(x) = +\infty$ for $x = \tfrac{\pi}{2}$, and $f(x) = -\infty$ for $x = -\tfrac{\pi}{2}$.

Now let us define a topology on $\bar{\mathbb{R}}$. First of all, we consider the relative topology $\tau_A := A \cap \tau_e$ on $A$, i.e. the topology relative to the usual topology $\tau_e$ on $\mathbb{R}$, and then define the topology $\bar{\tau}_e$ on $\bar{\mathbb{R}}$ as $\bar{\tau}_e := f(\tau_A)$. Since $f$ is evidently bijective, $\bar{\tau}_e$ is indeed a topology on $\bar{\mathbb{R}}$. Furthermore, $f$ is a homeomorphism and the spaces $(A,\tau_A)$ and $(\bar{\mathbb{R}}, \bar{\tau}_e)$ are homeomorphic. Since $A$ is compact, by Theorem 4.3, we conclude that $\bar{\mathbb{R}}$ is also compact. This example shows that by adding two more points to $\mathbb{R}$ we made a compact space from a non-compact one. We observe that $\mathbb{R} \subseteq \bar{\mathbb{R}}$, as well as $\tau_e \subseteq \bar{\tau}_e$. Such a process is called a compactification.

(ii) Let $(X,\tau)$ be a compact Hausdorff space and let $x \in X$. From Problem 3.2 we have that $X \setminus \{x\}$ is open. Then by the previous theorem, $X \setminus \{x\}$ with the relative topology on it is locally compact. Consider now the inverse process, where we take $X \setminus \{x\}$ and then give the point $x$ back to $X$, which makes $X$ compact. This is a very special case of a one-point compactification, unlike the two-point compactification discussed in Example (i). This example inspires a more general approach to a one-point compactification of a locally compact space. □

Let $(X,\tau)$ be a locally compact Hausdorff topological space and let $\omega \notin X$. Define $X' := X \cup \{\omega\}$. Now we construct a new topology $\tau'$ on $X'$ containing $\tau$ and the sets of type $(X \setminus C) \cup \{\omega\}$ where $C$ is a $\tau$-compact subset of $X$. We are going to prove a result essentially due to Pavel S. Alexandrov.
10.20 Theorem. The following hold true.
a) $(X',\tau')$ is a topological space.
b) $(X',\tau')$ is Hausdorff.
c) $(X',\tau')$ is compact.

Proof. We just show c). Let $\{G_i;\ i \in I\}$ be an open cover of $X'$. Then there is an index $i_0 \in I$ such that $\omega \in G_{i_0}$. Hence, $G_{i_0} = (X \setminus C) \cup \{\omega\}$, where $C$ is $\tau$-compact; specifically, $\bigcup_{k=1}^{n} G_k$ is an open subcover of $C$ (without loss of generality, we took $1, \dots, n$ as the relevant indices for the finite subcover). Now,

$X' = (X' \setminus C) \cup C = (X \setminus C) \cup \{\omega\} \cup C = G_{i_0} \cup C = G_{i_0} \cup \Big[ \bigcup_{k=1}^{n} G_k \Big]$.

Therefore, $X'$ is compact. □
10.21 Definition. The point $\omega$ of the compactification is called the point at infinity. The one-point compactification process described above is called the Alexandrov compactification. □
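As a concrete illustration of the construction above (a standard example that is assumed here and not taken from the text), take $X = \mathbb{N} = \{1, 2, \dots\}$ with the discrete topology, which is locally compact and Hausdorff and whose compact subsets are exactly the finite ones. The one-point compactification and a homeomorphic copy of it inside $\mathbb{R}$ can then be written down explicitly:

```latex
% Assumed example: X = \mathbb{N} with the discrete topology.
\[
  X' = \mathbb{N} \cup \{\omega\}, \qquad
  \tau' = \mathcal{P}(\mathbb{N}) \cup
  \bigl\{ (\mathbb{N} \setminus C) \cup \{\omega\} : C \subseteq \mathbb{N} \text{ finite} \bigr\}.
\]
% The map h is a homeomorphism onto a compact subset of \mathbb{R}.
\[
  h \colon X' \to \{0\} \cup \{ 1/n : n \in \mathbb{N} \}, \qquad
  h(n) = \tfrac{1}{n}, \quad h(\omega) = 0 .
\]
```

In this picture the point at infinity $\omega$ corresponds to the limit $0$ of the sequence $1/n$, and every neighborhood of $\omega$ omits only finitely many points of $\mathbb{N}$.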
PROBLEMS
Show that T 2 is hereditary, i.e. every subspace (relative topology) of T2 is T2 • 10.2 Show that the Tychonov product topology of T 2 factor spaces is 10.1
T2 .
10.3 10.4 10.5 10.6 10.7
10.8 10.9 10. 10
Show that if the Tychonov product topology is T 2 then each factor space is also T2 • Show that local compactness is weakly hereditary. Show that local compactness is vaguely hereditary. Prove Proposition 10.10. [Hint : Use Problem 6.14.] Let ( X, r) be a normal space. Let A and B be two disjoint closed sets in ( X, r ) . Show that there is an open superset U of A such that U n B = (/J. Prove Corollary 10.12. Prove that a product of Hausdorff spaces is Hausdorff. Show that regularity is hereditary. Show that a subspace of a normal space need not be normal.
10. Separation
193
10. 11 A product of regular spaces is regular. Show that a product of
normal spaces need not be normal. 10.12 Prove that every metrizable space is normal. 10.13 Prove that every regular space with a countable base is normal. 10. 14 Prove that in every u-compact and locally compact Hausdorff space (X, r ) there is a sequence {% } of compact sets such that
" X = U %n 00
and
n =
1
NEW TERMS:
$T_0$ space; $T_1$ space; $T_2$ space (Hausdorff); Hausdorff space; regular space; $T_3$ space; completely regular space; Tychonov space; normal space; $T_4$ space; locally compact space; $T_1$ space, criterion of; $T$ spaces, diagram of; regularity of a space, criteria of; compact Hausdorff space; normality of a space, criteria of; locally compact Hausdorff space, properties of; Urysohn's Lemma; Urysohn's Corollary; $\sigma$-compact space; $\sigma$-compactness, criteria of; $G_\delta$-set; $F_\sigma$-set; $K_\sigma$-set; compactification; one-point compactification; Alexandrov compactification
11. FUNCTIONS ON LOCALLY COMPACT SPACES
In this section we will utilize a version of Urysohn's Theorem for locally compact Hausdorff spaces in connection with a very important subclass of continuous functions that vanish outside compact sets. This will lead to one of the central results in analysis, the so-called Riesz Representation Theorem, explored in Chapter 8.
11.1 Definitions and Notation.
(i) Let $(X,\tau)$ be a topological space. For a real-valued function $[X,\mathbb{R},f]$, the set $\mathrm{Cl}(f^*(\mathbb{R} \setminus \{0\}))$ is called the support, in notation, $\mathrm{supp} f$ or $\mathrm{supp}(f)$.
(ii) Given a topological space $(X,\tau)$, the real vector space of all continuous real-valued functions will be denoted by $\mathcal{C}(X,\tau;\mathbb{R})$ or, shortly, by $\mathcal{C}(X)$. The symbols $\mathcal{C}^*(X)$ and $\mathcal{C}_c(X)$ denote the subspaces of continuous bounded functions and continuous functions with compact support, respectively. [Obviously, $\mathcal{C}_c(X) \subseteq \mathcal{C}^*(X) \subseteq \mathcal{C}(X)$ and the inclusions can be replaced by equalities if $(X,\tau)$ is compact.]
(iii) Let $K$ and $W$ be a compact and an open subset of $X$, respectively, and let $f \in \mathcal{C}_c(X)$ be such that $0 \le f \le 1$. We will write $K \prec f$ if $f_*(K) = 1$ (hence, $1_K \le f \le 1$) and $f \prec W$ if $\mathrm{supp} f \subseteq W$. In the latter case we will say that $f$ is subordinate to $W$. If $K \prec f$ and $f \prec W$, we will write
$K \prec f \prec W$
and say that $f$ is subordinate to $W$ with respect to $K$. Clearly, if $K \prec f$, then $K \subseteq \mathrm{supp} f$. Notice that the use of the symbol "$\prec$" for $f$ with $K$ or $W$ always requires that $0 \le f \le 1$.
(iv) Let $\{W_1,\ldots,W_n\}$ be a finite open cover of a compact set $K$. An $n$-tuple $\{f_1,\ldots,f_n\} \subseteq \mathcal{C}_c(X)$ is said to be a partition of unity for $K$ subordinate to (or dominated by) $\{W_1,\ldots,W_n\}$ if:
a) $f_i \prec W_i$, $i = 1,\ldots,n$.
b) $K \prec \sum_{i=1}^{n} f_i$.
□
11.2 Remark. In the upcoming theorems we are going to use Urysohn's Corollary 10.15 and we would like to reformulate it in terms of the support of a function $f$ introduced above. Recall that, according to Urysohn's Corollary, given a compact set $K$ and its open superset $W$ in a locally compact Hausdorff space $(X,\tau)$, there is an open set $U$ with compact closure such that $U$ and $\overline{U}$ are "squeezed" between $K$ and $W$:
$K \subseteq U \subseteq \overline{U} \subseteq W$.
To this quadruple, there is a continuous function $[X,[0,1],f]$ with $f_*(U^c) = 0$ and $f_*(K) = 1$. Consequently,
$K \subseteq \{f \ne 0\} = \{f > 0\} = f^*((0,1]) \subseteq U$.
Since $f$ is continuous and $(0,1]$ is open in $\tau_e \cap [0,1]$, it follows that $\{f \ne 0\}$ is an open subset of $U$ and therefore,
$K \subseteq \mathrm{supp} f \subseteq \overline{U}$.
As a closed subset of a compact set, $\mathrm{supp} f$ is compact and hence $f \in \mathcal{C}_c(X)$ in the scenario of Corollary 10.15. Furthermore,
$K \prec f \prec W$. (11.2)
□
11.3 Theorem. Let $(X,\tau)$ be a locally compact Hausdorff space and $K$ be a compact set. Then, for any finite open cover of $K$ there is a partition of unity subordinate to this open cover.
Proof. Since $K$ is compact there is at least one finite open cover of $K$, say $\{W_1,\ldots,W_n\}$. Let $x \in K$. Then $x$ belongs to at least one of the $W_i$'s, say $W_1$. By Proposition 10.10, there is an open neighborhood $O_x$ of $x$ whose closure $\overline{O}_x$ is compact and such that $\overline{O}_x \subseteq W_1$. The open cover $\{O_x : x \in K\}$ of $K$ can be reduced to a finite subcover, say $\{O_{x_1},\ldots,O_{x_k}\}$. Now, for each $i = 1,\ldots,n$, let $H_i$ be the union of those $O_{x_j}$'s for which $\overline{O}_{x_j} \subseteq W_i$, or else set $H_i = \emptyset$ if no such inclusion is available. Obviously, each $H_i$ is an open subset of $W_i$ whose closure, in notation, $K_i$, is compact and included in $W_i$. Furthermore, $\{H_1,\ldots,H_n\}$ covers $K$.
In light of Remark 11.2 applied to the pair of sets $K_i$ and $W_i$, there is a continuous function $[X,[0,1],g_i]$ with compact support such that $g_{i*}(K_i) = 1$ and $g_{i*}(U_i^c) = 0$, where $U_i$ is an open superset of $K_i$ whose closure is compact and is contained in $W_i$ and, in terms of (11.2), $K_i \prec g_i \prec W_i$. Applying Remark 11.2 again, now to the pair of sets $K$ and $\bigcup_{i=1}^{n} H_i$, there is a continuous function $[X,[0,1],g]$ with compact support such that $g_*(K) = 1$ and $g_*(U^c) = 0$, where $U$ is an open superset of $K$ whose closure is compact and contained in $\bigcup_{i=1}^{n} H_i$. (In particular, $g_*\big((\bigcup_{i=1}^{n} H_i)^c\big) = 0$.) In terms of (11.2), we have
$K \prec g \prec \bigcup_{i=1}^{n} H_i$.
In summary, we have
$K \subseteq U \subseteq \overline{U} \subseteq \bigcup_{i=1}^{n} H_i \subseteq \bigcup_{i=1}^{n} K_i \subseteq \bigcup_{i=1}^{n} U_i \subseteq \bigcup_{i=1}^{n} \overline{U}_i \subseteq \bigcup_{i=1}^{n} W_i$.
Let
$f := \sum_{i=1}^{n} g_i + (1 - g)$.
It is a routine procedure to verify that $f > 0$ on $X$:
$f(x) \ge 1$ for all $x$ between $K$ and $\bigcup_{i=1}^{n} K_i$;
$f(x) = 1$ for all $x$ outside $\bigcup_{i=1}^{n} U_i$;
and
$f(x) \ge 1$ for all $x$ between $\bigcup_{i=1}^{n} K_i$ and $\bigcup_{i=1}^{n} U_i$.
This allows us to define the continuous functions
$f_i := g_i / f$, $i = 1,\ldots,n$.
It is readily seen that $f_i \ge 0$, $K \prec \sum_{i=1}^{n} f_i$ and that $0 \le \sum_{i=1}^{n} f_i \le 1$. Moreover, $\mathrm{supp} f_i = \mathrm{supp}\, g_i \subseteq W_i$ or, in terms of the above notation, $f_i \prec W_i$. Hence, the tuple $\{f_1,\ldots,f_n\}$ meets the requirements of the above assertion in terms of Definition 11.1 (iv). □
11.4 Corollary. Let $K$ be a compact set in a locally compact Hausdorff space $(X,\tau)$ and $W$ be an open superset of $K$. Then there is a continuous function $[X,[0,1],f]$ with compact support such that $K \prec f \prec W$ and $K \subseteq \mathrm{supp} f$.
Proof. The statement follows from Theorem 11.3 immediately for $n = 1$. □
In particular, if $W = X$, we have
11.5 Corollary. Let $K$ be a compact set in a locally compact Hausdorff space $(X,\tau)$. Then there is a continuous function $[X,[0,1],f]$ with compact support such that $K \prec f$ and $K \subseteq \mathrm{supp} f$. □
Under the condition of Corollary 11.4, let $K = \emptyset$. Then,
11.6 Corollary. Given an open set $W$ in a locally compact Hausdorff space $(X,\tau)$, there is a continuous function $[X,[0,1],f]$ with compact support such that $f \prec W$. □
We complete this section with the widely cited Tietze's Extension Theorem.
11.7 Definition. Let $(X,\tau)$ be a locally compact Hausdorff space and $K \subseteq U$ be compact and open sets, respectively. Let $\mathcal{C}(X;\mathbb{C})$ and $\mathcal{C}(K;\mathbb{C})$ denote the spaces of all continuous complex-valued functions on $X$ and $K$, respectively. A function $F \in \mathcal{C}(X;\mathbb{C})$ is said to be a Tietze's extension of a function $f \in \mathcal{C}(K;\mathbb{C})$ with respect to $U$, if:
a) $f = \mathrm{Res}_K F$.
b) $F \in \mathcal{C}_c(X)$.
c) $F_*(U^c) = \{0\}$.
□
11.8 Theorem (Tietze's Extension). Let $(X,\tau)$ be a locally compact Hausdorff space and $K \subseteq U$ be compact and open sets, respectively. Then for every function $f \in \mathcal{C}(K;\mathbb{C})$ there is a Tietze's extension with respect to $U$.
The proof of this theorem is offered as an exercise in several steps (Problems 11.1-3).
PROBLEMS
11.1 Use Proposition 10.11 to have an open set $V$ such that $K \subseteq V \subseteq \overline{V} \subseteq U$ and $\overline{V}$ is compact. Let $\mathfrak{C}$ be the subfamily of all continuous, real-valued functions admitting Tietze's extensions with respect to $U$. Show that $\mathfrak{C}$ is a subalgebra.
11.2 Use Proposition 10.10 and Corollary 10.15 to show that $\mathfrak{C}$ separates points and that it also contains constant functions. Construct an extension $F$ of $f \in \mathfrak{C}$ from $K$ to $X$ and show that $\| f \|_u = \| F \|_u$.
11.3 Use the Stone-Weierstrass Theorem 8.3 to prove that the closure of $\mathfrak{C}$ with respect to the uniform norm equals $\mathcal{C}(K;\mathbb{R})$ and extend the result to complex-valued functions.
NEW TERMS:
support of a function; space of all continuous real-valued functions; subordinance; partition of unity for a compact set; dominance of functions by sets; locally compact Hausdorff space, criteria of; Tietze's extension of continuous functions; Tietze's Extension Theorem
Part II Basics of Measure and Integration
Chapter 4 Measurable Spaces and Measurable Functions
In the previous chapter we studied general topological spaces. A topology was defined as a collection of sets (on a carrier) that is closed with respect to the formation of arbitrary unions and finite intersections. In the present chapter, we introduce various classes of sets similar to topological spaces but serving other purposes. One of them prepares the student for another part of analysis - integration. Beyond the familiar integration we experienced in calculus, we will need to measure much more general sets than those which are used for the Riemann integral. For instance, we will consider abstract sets that are encountered in the theory of probability. In addition, we will largely extend the existing class of integrable functions. If we try to measure the length (or area) of all sets, set theory forces us into certain contradictions or paradoxes. Therefore, we have to restrict attention to measuring a (large) subclass of sets. It stands to reason that we would want the collection of "measurable" sets to be closed under certain operations such as union, complementation, and intersection. Thus we seek a collection of sets satisfying certain algebraic properties under the binary operations of union, intersection, and set-theoretic difference. This leads to the concept of a sigma-algebra. As with topological spaces, where base (or sometimes also subbase) sets were most convenient to study, in measure theory it is also useful to start with more primitive collections of sets called generators of sigma-algebras. For instance, if we need to measure a flat closed figure, one of the reasonable ways to do it is to approximate the figure by a number of (various) disjoint rectangles whose measures we already know. Such a natural way of measuring more complex sets by "base" sets gives rise to the extension of measure from the collection of "abstract rectangles" to the set of all figures formed from rectangles under countable set operations.
This method of extension was generalized by the German mathematician Constantin Carathéodory in 1918. This chapter is just preparation for the next two, where we will be concerned with various classes of sets on which measure will be defined and then extended. Generators of these classes, in particular, topologies that have found other applications in this part of analysis, are of special interest.
1. SYSTEMS OF SETS
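1.1 Definition. Let $\Omega$ be an arbitrary set and let $\Sigma$ be some collection of subsets of $\Omega$.
(i) $\Sigma$ is called a $\sigma$-algebra (pronounced "sigma-algebra") or $\sigma$-field (sigma-field) if:
(a) $\Omega \in \Sigma$.
(b) $A \in \Sigma \Rightarrow A^c \in \Sigma$.
(c) for any sequence $\{A_n\}$ of sets of $\Sigma$, $\bigcup_{n=1}^{\infty} A_n \in \Sigma$.
(ii) $\Sigma$ is called an algebra (or field), denoted by $\mathcal{A}$ (i.e. $\Sigma = \mathcal{A}$), if:
(a) $\Omega \in \mathcal{A}$.
(b) $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$.
(c) $\{A_1,\ldots,A_k\} \subseteq \mathcal{A} \Rightarrow \bigcup_{n=1}^{k} A_n \in \mathcal{A}$.
(iii) $\Sigma$ is called a Dynkin system, denoted by $\mathcal{D}$ ($\Sigma = \mathcal{D}$), if:
(a) $\Omega \in \mathcal{D}$.
(b) $A \in \mathcal{D} \Rightarrow A^c \in \mathcal{D}$.
(c) for every sequence $\{A_n\}$ of pairwise disjoint sets of $\mathcal{D}$, $\sum_{n=1}^{\infty} A_n \in \mathcal{D}$.
(iv) $\Sigma$ is called a ring, denoted by $\mathcal{R}$ ($\Sigma = \mathcal{R}$), if:
(a) $\emptyset \in \mathcal{R}$.
(b) $A,B \in \mathcal{R} \Rightarrow A \setminus B \in \mathcal{R}$.
(c) $A,B \in \mathcal{R} \Rightarrow A \cup B \in \mathcal{R}$.
(v) $\Sigma$ is called a semi-ring, denoted by $\mathcal{S}$ ($\Sigma = \mathcal{S}$), if:
(a) $\emptyset \in \mathcal{S}$.
(b) $A,B \in \mathcal{S} \Rightarrow A \cap B \in \mathcal{S}$.
(c) for $A,B \in \mathcal{S}$, there is a finite tuple $C_1,\ldots,C_k$ of pairwise disjoint sets from $\mathcal{S}$ such that $A \setminus B$ can be represented as the union $\sum_{n=1}^{k} C_n$.
(vi) $\Sigma$ is called a monotone system, denoted by $\mathcal{M}$ ($\Sigma = \mathcal{M}$), if:
(a) for every $\{A_n\}\uparrow$ (i.e. monotone nondecreasing) sequence of sets of $\mathcal{M}$, $\bigcup_{n=1}^{\infty} A_n \in \mathcal{M}$.
(b) for every $\{A_n\}\downarrow$ (i.e. monotone nonincreasing) sequence of sets of $\mathcal{M}$, $\bigcap_{n=1}^{\infty} A_n \in \mathcal{M}$.
(vii) $\Sigma$ is called $\cap$-stable (pronounced "intersection-stable") if $A,B \in \Sigma \Rightarrow A \cap B \in \Sigma$.
(viii) A pair $(\Omega,\Sigma)$, where $\Sigma$ is a $\sigma$-algebra in $\Omega$, is called a measurable space, while elements of $\Sigma$ are called measurable sets. [Compare these with topological space and open sets, respectively.] □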
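On a finite carrier the axioms of Definition 1.1 can be checked mechanically, which may help to see how the systems differ. The following is a minimal sketch, not part of the text: the function names `is_algebra`, `is_dynkin` and `is_intersection_stable` are hypothetical, and it assumes a finite carrier, where closure under pairwise (disjoint) unions already gives closure under countable ones. The sample family consists of all subsets of even cardinality of a four-point set.

```python
from itertools import combinations


def is_algebra(omega, system):
    """Definition 1.1 (ii): contains omega, closed under complements and finite unions."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in system}
    if omega not in sets:
        return False
    if any(omega - a not in sets for a in sets):
        return False
    return all(a | b in sets for a, b in combinations(sets, 2))


def is_dynkin(omega, system):
    """Definition 1.1 (iii) on a finite carrier: unions are required only for disjoint sets."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in system}
    if omega not in sets:
        return False
    if any(omega - a not in sets for a in sets):
        return False
    return all(a | b in sets for a, b in combinations(sets, 2) if not (a & b))


def is_intersection_stable(system):
    """Definition 1.1 (vii): closed under pairwise intersections."""
    sets = {frozenset(s) for s in system}
    return all(a & b in sets for a, b in combinations(sets, 2))


# Assumed example: all subsets of even cardinality of a four-point carrier.
omega = {1, 2, 3, 4}
even = [set(c) for r in (0, 2, 4) for c in combinations(sorted(omega), r)]

print(is_dynkin(omega, even))            # True
print(is_algebra(omega, even))           # False: {1,2} | {2,3} has three points
print(is_intersection_stable(even))      # False: {1,2} & {2,3} is a singleton
```

The family fails to be an algebra exactly because it is not $\cap$-stable, while the Dynkin axioms never ask for unions of overlapping sets.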
1.2 Examples.
(i) Let $\Omega$ be an arbitrary set. Then $\{\Omega,\emptyset\}$ is the smallest algebra ($\sigma$-algebra).
(ii) $\mathcal{P}(\Omega)$, the power set, is always a $\sigma$-algebra. It is the largest $\sigma$-algebra in $\Omega$.
(iii) The smallest $\sigma$-algebra containing a set $A$ is obviously $\{\Omega,\emptyset,A,A^c\}$.
(iv) Let $\Omega = \mathbb{R}^n$ and let $\mathcal{J}$ be the system of all $n$-dimensional half-open intervals (or rectangles) of type $(a,b]$, for $a,b \in \mathbb{R}^n$. The intersection of two intervals is either $\emptyset$ or again an interval. The difference of two intervals need not be an interval, but it can be represented as a finite union of pairwise disjoint intervals (see Figure 1.1). Hence, $\mathcal{J}$ is a semi-ring.
Figure 1.1
(v) Every $\sigma$-algebra $\Sigma$ has three properties directly following from Definition 1.1 (i):
a) $\emptyset \in \Sigma$ (since $\emptyset = \Omega^c$).
b) For every sequence $\{A_n\} \subseteq \Sigma$, $\bigcap_{n=1}^{\infty} A_n \in \Sigma$. This holds because
$\bigcup_{n=1}^{\infty} A_n^c \in \Sigma \ \Rightarrow\ \left(\bigcup_{n=1}^{\infty} A_n^c\right)^c \in \Sigma \ \Rightarrow\ \bigcap_{n=1}^{\infty} A_n \in \Sigma$.
Observe that by applying DeMorgan's law we can similarly show that this property and property (c) in Definition 1.1 (i) are equivalent, i.e., in 1.1 (i, c) a countable union can be replaced by a countable intersection.
c) Finally we have $A,B \in \Sigma \Rightarrow A \setminus B \in \Sigma$ (due to $A \setminus B = A \cap B^c$).
One can say that any $\sigma$-algebra is closed with respect to the formation of all countable set operations.
(vi) Every algebra $\mathcal{A}$ has the same properties as $\sigma$-algebras in (v), except that it is closed only under finite intersections. Hence, any algebra is closed relative to the formation of all finite set operations.
(vii) Let $\Omega$ be an arbitrary set and let $\Sigma$ be the system of all subsets $A$ of $\Omega$ such that either $A$ or $A^c$ is at most countable. Then $\Sigma$ is a $\sigma$-algebra. Indeed, $\emptyset$ is at most countable. Thus, $\Omega \in \Sigma$. If $A \in \Sigma$ then obviously $A^c \in \Sigma$. Now let $\{A_n\} \subseteq \Sigma$. Then either there is at least one at most countable set $A_i$ or else all $A_n$ are not countable but their complements are at most countable. In the first case, $\bigcap_{n=1}^{\infty} A_n$ is clearly at most countable. In the second case, the set $\bigcup_{n=1}^{\infty} A_n^c$ is at most countable and, therefore, it belongs to $\Sigma$, along with its complement. The latter, by DeMorgan's law, is $\bigcap_{n=1}^{\infty} A_n$. Consequently, $\Sigma$ is a $\sigma$-algebra.
(viii) Let $f : \Omega \to \Omega'$ be a map and $\Sigma'$ a $\sigma$-algebra in $\Omega'$. Then $\Sigma = f^{**}(\Sigma')$, defined as the system of all sets $\{f^*(A') : A' \in \Sigma'\}$, is a $\sigma$-algebra. This property is due to the fact that the inverse of a map preserves all set operations. For instance, if $A \in \Sigma$ then it follows from the definition of $\Sigma$ that $A$ is the inverse image of some $A' \in \Sigma'$. Thus $(A')^c \in \Sigma'$, yielding $A^c = (f^*(A'))^c = f^*((A')^c) \in \Sigma$. The proof that the union of a sequence from $\Sigma$ belongs to $\Sigma$ is analogous (show it).
(ix) A monotone system need not be an algebra. For instance, let $\mathfrak{C}$ be the set of all convex subsets of $\mathbb{R}^2$. Then it is easily seen that $\mathfrak{C}$ is a monotone system. However, the union of two convex sets $A \cup B$ need not be convex. The difference of two convex sets is not necessarily convex either. An algebra need not be a monotone system either, for it is not closed under countable set operations.
(x) The collection of all finite subsets of an infinite set $\Omega$ is a ring in $\Omega$. □
PROBLEMS
1.1 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $\Omega'$ be a subset of $\Omega$. Define $\Sigma'$ as $\Sigma \cap \Omega' = \{A \cap \Omega' : A \in \Sigma\}$. Then $\Sigma'$ (called the trace of $\Sigma$ in $\Omega'$) is a $\sigma$-algebra in $\Omega'$. Prove it.
1.2 Let $\mathcal{R}$ be a ring in $\Omega$. Show that with any two sets $A,B \in \mathcal{R}$, their intersection also belongs to $\mathcal{R}$.
1.3 Let $\mathcal{R}$ be a collection of subsets of $\Omega$ with the properties: $\emptyset \in \mathcal{R}$, and $A,B \in \mathcal{R}$ implies that $A \cap B \in \mathcal{R}$ and $A \cup B \in \mathcal{R}$. Is $\mathcal{R}$ a ring?
1.4 Let $\mathcal{A}$ be an algebra in $\Omega$. Prove that $\mathcal{A}$ is a $\sigma$-algebra in $\Omega$ if and only if $\mathcal{A}$ is also a monotone system. [Hint: Show that any $\sigma$-algebra is a monotone system and any algebra, which is a monotone system, is a $\sigma$-algebra.]
1.5 The flow chart below reflects the relations between some systems of sets:
$\Sigma \Rightarrow \mathcal{A} \Rightarrow \mathcal{R} \Rightarrow \mathcal{S} \Rightarrow \cap$-stable
$\Downarrow$
$\mathcal{M}$
Demonstrate that each relationship holds.
1.6 Give an infinite collection of subsets of $\mathbb{R}$ that contains $\emptyset$ and $\mathbb{R}$ and which is closed under countable unions and intersections but is not a $\sigma$-algebra.
1.7 Let $\Omega$ be a finite set with $|\Omega| = 2n$. Let $\mathcal{D}$ be the system of all subsets $D$ of $\Omega$ such that $|D| = 2q$, $q = 0,1,\ldots,n$. Show that $\mathcal{D}$ is a Dynkin system.
1.8 Give a Dynkin system that is not a $\sigma$-algebra.
1.9 Show that if $\mathcal{D}$ is a Dynkin system then $D,E \in \mathcal{D}$ and $E \subseteq D$ $\Rightarrow$ $D \setminus E \in \mathcal{D}$.
1.10 Prove the statement: A Dynkin system $\mathcal{D}$ is a $\sigma$-algebra if and only if $\mathcal{D}$ is $\cap$-stable.
1.11 Show that the inverse image of a ring $\mathcal{R}$ under the map $f : \Omega_1 \to \Omega$ is a ring in $\Omega_1$.
1.12 Show the equivalence of two definitions of a semi-ring if property c) in Definition 1.1 (v) is replaced by
c') Let $A,A_1,\ldots,A_n \in \mathcal{S}$. Then there is a finite tuple $C_1,\ldots,C_k$ of disjoint sets from $\mathcal{S}$ such that $A \setminus \bigcup_{i=1}^{n} A_i = \sum_{j=1}^{k} C_j$.
NEW TERMS: u-algebra ( u-field) 204 u-field 204 algebra (field) 204 field 204 Dynkin system 204 ring 204 semi-ring 204 monotone system 205 n -stable (intersection-stable) system 205 measurable space 205 measurable sets 205 smallest u-algebra 205 u-algebra containing (generated by) a set 205 half-open interval (rectangle) 205 rectangle 205 systems of sets, diagram of 207
209
210
CHAPTER 4. MEASURABLE SPACES 2. SYSTEMS ' GENERATORS
2.1 Theorem. The intersection of arbitrarily many $\sigma$-algebras (algebras, monotone systems, rings) in $\Omega$ is a $\sigma$-algebra (an algebra, a monotone system, a ring). □
(See Problem 2.1.)
2.2 Remark. Let $\mathcal{G}$ be an arbitrary collection of subsets of $\Omega$. There is obviously a $\sigma$-algebra containing $\mathcal{G}$, for instance, the power set $\mathcal{P}(\Omega)$. If we collect all $\sigma$-algebras that contain $\mathcal{G}$ and find their intersection, it must contain $\mathcal{G}$ and, due to Theorem 2.1, it is a $\sigma$-algebra too. This $\sigma$-algebra is clearly the smallest one containing $\mathcal{G}$. It is called the $\sigma$-algebra generated by $\mathcal{G}$ and it is denoted by $\Sigma(\mathcal{G})$. The system of sets $\mathcal{G}$ is called the generator of $\Sigma(\mathcal{G})$. It is worthwhile to recall the analogous notions of a subbase or base and their role as generators of the smallest topology that contains them. While, as we saw, the classes of generators in topology are quite limited in their practical use, their counterparts for $\sigma$-algebras form a significantly richer inventory filled with such prominent collections as semi-rings, rings, Dynkin systems, monotone systems, and topologies themselves. Among them, rings and semi-rings will often be used as generators of $\sigma$-algebras (throughout this book, especially in Chapter 5) in the Carathéodory construction of measures. Another frequently used generator is a topology, which we will see in action when characterizing regular and Radon measures and in the calculus of Lebesgue-Stieltjes measures. The smallest $\sigma$-algebra containing a topology $\tau$ as a generator is called a Borel $\sigma$-algebra and it is denoted by $\mathcal{B}(\tau)$ or by $\mathcal{B}(\Omega)$ or just by $\mathcal{B}$ whenever the nature of $\tau$ is specified. Many of the Borel $\sigma$-algebras we are going to come across are generated by the usual topology. □
2.3 Example. Let $\mathcal{G}$ be the system of sets containing only one subset $A$ of $\Omega$. As mentioned in Example 1.2 (iii), the smallest $\sigma$-algebra containing $\mathcal{G}$ is $\{\Omega,\emptyset,A,A^c\}$. □
Problem 1.10 states that a Dynkin system is a $\sigma$-algebra if and only if it is $\cap$-stable. The proposition below generalizes it by allowing the Dynkin system to have just a $\cap$-stable generator.
2.4 Proposition. A Dynkin system is a $\sigma$-algebra if and only if it has an $\cap$-stable generator.
Proof. Let $\mathcal{G}$ be an $\cap$-stable system of subsets of $\Omega$. We show that $\Sigma(\mathcal{G}) = \mathcal{D}(\mathcal{G})$. Since every $\sigma$-algebra is a Dynkin system and $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$, we have $\mathcal{D}(\mathcal{G}) \subseteq \Sigma(\mathcal{G})$. The inverse relation remains to be shown. Let $D \in \mathcal{D}(\mathcal{G})$ and let
$\mathcal{D}_D = \{Q \in \mathcal{P}(\Omega) : Q \cap D \in \mathcal{D}(\mathcal{G})\}$.
a) We show that $\mathcal{D}_D$ is a Dynkin system. Clearly $\Omega \in \mathcal{D}_D$, since $\Omega \cap D = D$. If $A \in \mathcal{D}_D$ then $A \cap D \in \mathcal{D}(\mathcal{G})$ and $A^c \cap D = D \setminus A = D \setminus (A \cap D) \in \mathcal{D}(\mathcal{G})$ (see Problem 1.9). This yields $A^c \in \mathcal{D}_D$. Similarly, let $\{A_n\} \subseteq \mathcal{D}_D$ be a sequence of pairwise disjoint sets. Then $A_n \cap D \in \mathcal{D}(\mathcal{G})$, for $n = 1,2,\ldots$. Obviously, $\{A_n \cap D\}$ is a sequence of pairwise disjoint sets and
$\left(\sum_{n=1}^{\infty} A_n\right) \cap D = \sum_{n=1}^{\infty} (A_n \cap D) \in \mathcal{D}(\mathcal{G})$,
implying that $\sum_{n=1}^{\infty} A_n \in \mathcal{D}_D$. Therefore, $\mathcal{D}_D$ is a Dynkin system.
b) We prove that for every $D \in \mathcal{D}(\mathcal{G})$, $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$. Let $G \in \mathcal{G}$. Then $G \in \mathcal{D}(\mathcal{G})$. Since $\mathcal{G}$ is $\cap$-stable, it follows directly from the definition of $\mathcal{D}_G$ that $\mathcal{G} \subseteq \mathcal{D}_G$. Thus $\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_G$, since $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$ and $\mathcal{D}_G$ is just a Dynkin system containing $\mathcal{G}$. Now let $D \in \mathcal{D}(\mathcal{G})$. Then $D \in \mathcal{D}_G$ and $G \cap D \in \mathcal{D}(\mathcal{G})$, implying that $G \in \mathcal{D}_D$, or $\mathcal{G} \subseteq \mathcal{D}_D$. This yields that $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$, since again $\mathcal{D}(\mathcal{G})$ is the smallest Dynkin system containing $\mathcal{G}$.
c) We show that $\mathcal{D}(\mathcal{G})$ is $\cap$-stable. Let $C,D \in \mathcal{D}(\mathcal{G})$. Then $\mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_D$ and $C,D \in \mathcal{D}_D$. Thus, $C \cap D \in \mathcal{D}(\mathcal{G})$ (by the definition of $\mathcal{D}_D$), and therefore $\mathcal{D}(\mathcal{G})$ is $\cap$-stable.
Finally, by Problem 1.10, $\mathcal{D}(\mathcal{G})$ must be a $\sigma$-algebra. Then, since $\Sigma(\mathcal{G})$ is the smallest $\sigma$-algebra containing $\mathcal{G}$, $\Sigma(\mathcal{G}) \subseteq \mathcal{D}(\mathcal{G})$. This is the desired inverse relation. □
In the next lemma and theorem we present a construction of the ring generated by an arbitrary semi-ring.
2.5 Lemma. Let $\mathcal{S}$ be a semi-ring in $\Omega$ and let $\mathcal{R}$ be the system of all finite unions of elements from $\mathcal{S}$. Then any element of $\mathcal{R}$ can be represented as a finite union of pairwise disjoint sets from $\mathcal{S}$, in notation, $\sum_{k=1}^{n} C_k$, $C_k \in \mathcal{S}$.
Proof. Let $R \in \mathcal{R}$. Then by the definition of $\mathcal{R}$, $R = \bigcup_{k=1}^{n} S_k$, where $S_k \in \mathcal{S}$. We now construct a decomposition of $R$ by elements of $\mathcal{S}$ using $S_k$. Let
$R_1 = S_1$, $R_2 = S_2 \setminus S_1$, $R_3 = [S_3 \setminus S_1] \cap [S_3 \setminus S_2]$, $\ldots$, $R_k = \bigcap_{j=1}^{k-1} [S_k \setminus S_j]$, $k = 1,\ldots,n$.
Since $S_k \setminus S_j = \sum_{i} C_{ikj}$ is a finite union of elements from $\mathcal{S}$, it follows that
$R_k = \bigcap_{j=1}^{k-1} \sum_{i} C_{ikj} = \sum_{i} \bigcap_{j=1}^{k-1} C_{ikj}$ (as a finite union of finite intersections) $= \sum_{i} D_{ik}$,
where $D_{ik}$ are elements of $\mathcal{S}$. It is easily seen that $R = \bigcup_{k=1}^{n} S_k = \bigcup_{k=1}^{n} R_k$, where $R_k$ are pairwise disjoint. This leads to $R = \sum_{k=1}^{n} \sum_{i} D_{ik}$ and the lemma is proved.
□
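The construction in the proof of Lemma 2.5 can be traced on the semi-ring of half-open intervals $(a,b]$ of the real line. The following is a minimal sketch under that assumption; the names `subtract` and `disjointify` are hypothetical, and an interval $(a,b]$ is encoded as the pair `(a, b)` with `a < b`. Each $R_k = S_k \setminus (S_1 \cup \ldots \cup S_{k-1})$ is obtained by removing the earlier intervals one at a time, and every difference is again a finite disjoint union of half-open intervals.

```python
def subtract(interval, other):
    """Return (a, b] minus (c, d] as a list of disjoint half-open intervals."""
    a, b = interval
    c, d = other
    lo, hi = max(a, c), min(b, d)      # the overlap is (lo, hi] when lo < hi
    if lo >= hi:                       # no overlap: nothing is removed
        return [interval]
    pieces = []
    if a < lo:                         # part of (a, b] to the left of the overlap
        pieces.append((a, lo))
    if hi < b:                         # part of (a, b] to the right of the overlap
        pieces.append((hi, b))
    return pieces


def disjointify(intervals):
    """Rewrite a finite union of half-open intervals as a disjoint union."""
    result = []
    for k, s_k in enumerate(intervals):
        pieces = [s_k]                 # pieces of R_k, refined step by step
        for s_j in intervals[:k]:      # remove S_1, ..., S_{k-1} in turn
            pieces = [p for piece in pieces for p in subtract(piece, s_j)]
        result.extend(pieces)
    return result


print(disjointify([(0, 2), (1, 3), (2, 4)]))   # [(0, 2), (2, 3), (3, 4)]
print(disjointify([(1, 2), (0, 3)]))           # [(1, 2), (0, 1), (2, 3)]
```

The second call shows a difference that is not itself an interval: $(0,3] \setminus (1,2]$ splits into the two pieces $(0,1]$ and $(2,3]$, which is exactly why a semi-ring asks only for a finite disjoint decomposition.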
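2.6 Theorem. Let $\mathcal{S}$ be a semi-ring in $\Omega$. Then the system of all finite unions of sets in $\mathcal{S}$ is the (smallest) ring $\mathcal{R}(\mathcal{S})$ generated by $\mathcal{S}$.
Proof. 1) We show that $\mathcal{R}$ described above is a ring. Since $\mathcal{S} \subseteq \mathcal{R}$, we have $\emptyset \in \mathcal{R}$. Let $R_1, R_2 \in \mathcal{R}$. Then, by the definition of $\mathcal{R}$,
$R_1 = \bigcup_{k=1}^{n} S_{1k}$, $R_2 = \bigcup_{i=1}^{m} S_{2i}$, with $S_{1k}, S_{2i} \in \mathcal{S}$.
Therefore,
$R_1 \cup R_2 = \bigcup_{k=1}^{n} \bigcup_{i=1}^{m} (S_{1k} \cup S_{2i}) \in \mathcal{R}$.
By Lemma 2.5 and by Problem 1.2(c) (Chapter 1), we have
$R_1 \setminus R_2 = \left(\sum_{k=1}^{n} C_k\right) \setminus \left(\sum_{i=1}^{m} D_i\right) = \sum_{k=1}^{n} \bigcap_{i=1}^{m} (C_k \setminus D_i)$.
Since $C_k$ and $D_i$, as elements of $\mathcal{S}$, are semi-ring sets, we have $C_k \setminus D_i = \sum_{s=1}^{r_{ik}} E^{ik}_s$, where the sets $E^{ik}_s$ are also elements of $\mathcal{S}$. Therefore,
$R_1 \setminus R_2 = \sum_{k=1}^{n} \bigcap_{i=1}^{m} \sum_{s=1}^{r_{ik}} E^{ik}_s = \sum_{k=1}^{n} \sum_{s} \bigcap_{i=1}^{m} E^{ik}_s \in \mathcal{R}$.
2) Now we show that $\mathcal{R} = \mathcal{R}(\mathcal{S})$. Let $\mathcal{S} \subseteq \mathcal{R}'$, where $\mathcal{R}'$ is any ring in $\Omega$. As a ring, $\mathcal{R}'$ is closed with respect to the formation of finite unions of sets from $\mathcal{R}'$. Specifically, it is closed under finite unions of sets from $\mathcal{S}$; hence, it includes $\mathcal{R}$. Consequently, $\mathcal{R}$ is the smallest ring generated by $\mathcal{S}$. □
In Remark 2.2 we defined a Borel $\sigma$-algebra as a $\sigma$-algebra generated by a topology. We will show below that the smallest $\sigma$-algebra $\Sigma(\mathcal{S}(\mathbb{R}^n))$ generated by the semi-ring of all half-open intervals $(a,b]$ in $\mathbb{R}^n$ coincides with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$.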
2.7 Theorem. Let $\tau_e$, $\tau_e^c$, and $\mathcal{K}$ denote respectively the system of all open, closed, and compact subsets of $(\mathbb{R}^n,\tau_e)$. Then the following relations hold:
$\mathcal{B}(\mathbb{R}^n) := \Sigma(\tau_e) = \Sigma(\tau_e^c) = \Sigma(\mathcal{K}) = \Sigma(\mathcal{S}(\mathbb{R}^n))$.
Proof. Since all compact sets in $(\mathbb{R}^n,\tau_e)$ are closed and bounded, it follows that $\mathcal{K} \subseteq \tau_e^c \subseteq \Sigma(\tau_e^c)$, and thus
$\Sigma(\mathcal{K}) \subseteq \Sigma(\tau_e^c)$. (*)
On the other hand, every closed set $F$ can be represented as a countable union of compact sets $C_k \in \mathcal{K}$, $k = 1,2,\ldots$. For instance, if $C(c,k)$ denotes the compact ball centered at some point $c$ and with radius $k \in \mathbb{N}$, then we may choose $C_k = F \cap C(c,k)$, implying that $F = \bigcup_{k=1}^{\infty} C_k$. Therefore, all closed sets belong to the $\sigma$-algebra $\Sigma(\mathcal{K})$ (since this $\sigma$-algebra contains countable unions); i.e., $\tau_e^c \subseteq \Sigma(\mathcal{K})$, which yields
$\Sigma(\tau_e^c) \subseteq \Sigma(\mathcal{K})$. (**)
Both inclusions (*) and (**) lead to $\Sigma(\tau_e^c) = \Sigma(\mathcal{K})$. Since open sets are complementary to closed sets, it follows that $\mathcal{B} = \Sigma(\tau_e) = \Sigma(\tau_e^c) = \Sigma(\mathcal{K})$. Now we show that $\mathcal{B} = \Sigma(\mathcal{S})$. Any half-open interval $(a,b]$ in $\mathbb{R}^n$ can be represented as the intersection of a sequence of bounded open intervals of type $(a,b_n)$ (or as we called them earlier, open parallelepipeds) with $b_n \downarrow b$. Therefore, the collection $\mathcal{S}$ of all half-open intervals belongs to the $\sigma$-algebra $\Sigma(\tau_e)$, which implies that $\Sigma(\mathcal{S}) \subseteq \Sigma(\tau_e)$. On the other hand, any open bounded interval can be represented as the union of a sequence of half-open intervals of $\mathcal{S}$; and any open set is a countable union of bounded open intervals as base sets (recall that $(\mathbb{R}^n,\tau_e)$ is second countable). Therefore, any open set is the union of countably many half-open intervals from $\mathcal{S}$ and we have $\tau_e \subseteq \Sigma(\mathcal{S})$, implying $\Sigma(\tau_e) \subseteq \Sigma(\mathcal{S})$. Dual containment gives us $\Sigma(\mathcal{S}) = \mathcal{B}$. □
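The generated $\sigma$-algebra $\Sigma(\mathcal{G})$ of Remark 2.2 is defined as an intersection of $\sigma$-algebras, which is not directly computable. On a finite carrier, however, the same system can be reached from below by closing the generator under complements and pairwise unions. The following is a minimal sketch under that finiteness assumption; the name `generated_sigma_algebra` is hypothetical.

```python
def generated_sigma_algebra(omega, generator):
    """Smallest system containing the generator that is closed under
    complements and unions (a sigma-algebra, since omega is finite)."""
    omega = frozenset(omega)
    sigma = {frozenset(), omega} | {frozenset(g) for g in generator}
    changed = True
    while changed:                     # repeat until no new set appears
        changed = False
        current = list(sigma)
        for a in current:
            candidates = [omega - a] + [a | b for b in current]
            for c in candidates:
                if c not in sigma:
                    sigma.add(c)
                    changed = True
    return sigma


omega = {1, 2, 3, 4}

# Example 2.3: a single generating set A yields {omega, empty set, A, A^c}.
one_set = generated_sigma_algebra(omega, [{1}])
print(sorted(sorted(s) for s in one_set))      # [[], [1], [1, 2, 3, 4], [2, 3, 4]]

# Two generating sets: the atoms are {1}, {2}, {3, 4}, giving 2**3 = 8 sets.
print(len(generated_sigma_algebra(omega, [{1}, {2}])))   # 8
```

The loop terminates because the power set of a finite carrier is finite. For an infinite carrier such as $\mathbb{R}^n$ no such step-by-step closure is available in general, which is why the text works with the intersection definition.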
PROBLEMS
2.1 Prove Theorem 2.1.
2.2 Show that an intersection of semi-rings in $\Omega$ need not be a semi-ring.
2.3 Show that a union of $\sigma$-algebras in $\Omega$ need not be a $\sigma$-algebra.
2.4 Let $A$ and $B$ be subsets of $\Omega$ and let $\mathcal{G} = \{A,B\}$. Find $\mathcal{D}(\mathcal{G})$ and $\Sigma(\mathcal{G})$. Show that $\mathcal{D}(\mathcal{G})$ and $\Sigma(\mathcal{G})$ are identical if and only if one of the following conditions holds: $A \cap B$ or $A \cap B^c$ or $A^c \cap B$ or $A^c \cap B^c$ is the empty set. [Hint: Use Problem 1.10.]
2.5 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $B \subseteq \Omega$. Show that the $\sigma$-algebra generated by $\mathcal{G} = \Sigma \cup \{B\}$ is of the form
$\Sigma' = \{(A_1 \cap B) \cup (A_2 \cap B^c) : A_1, A_2 \in \Sigma\}$.
[Hint: 1) Show that $\sigma(\Sigma') = \sigma(\Sigma \cup \{B\})$. 2) Show that $\Sigma'$ is a $\sigma$-algebra in $\Omega$.]
2.6 Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be systems of sets in $\Omega$. Show that $\mathcal{G}_1 \subseteq \mathcal{G}_2$ implies that $\Sigma(\mathcal{G}_1) \subseteq \Sigma(\mathcal{G}_2)$.
2.7 Let $\Omega$ be an arbitrary non-empty set and let $A,B \subseteq \Omega$. Construct for $\tau = \{\Omega,\emptyset,A,B,A \cap B,A \cup B\}$ the Borel $\sigma$-algebra $\mathcal{B}(\tau)$.
2.8 Construct the Borel $\sigma$-algebra generated by the cocountable topology $\tau = \{\Omega,\emptyset,A^c : A$ is at most countable$\}$ (see Problem 1.7, Chapter 3, where $\Omega$ is an uncountable set).
2.9 Let $\mathcal{M}$ be a monotone system in $\Omega$ and let $\mathcal{A}$ be an algebra in $\Omega$ such that $\mathcal{A} \subseteq \mathcal{M}$. Prove that the $\sigma$-algebra $\Sigma(\mathcal{A})$ generated by $\mathcal{A}$ is a subcollection of $\mathcal{M}$. [Hint: Let $\mathcal{M}_0 = \mathcal{M}(\mathcal{A})$ be the monotone system generated by $\mathcal{A}$. Furthermore: 1) Let $A$ be a fixed element of $\mathcal{M}_0$. Define a suitable subsystem $\mathcal{M}_A \subseteq \mathcal{M}_0$. Show that $\mathcal{M}_A = \mathcal{M}_0$. 2) Show that $\mathcal{M}_A$ is thus an algebra. 3) Show that $\mathcal{M}_0 = \Sigma(\mathcal{A})$.]
2.10 Show that any open set in $\mathbb{R}^n$ can be represented as at most a countable union of disjoint semi-open cubes.
NEW TERMS:
$\sigma$-algebra generated by a collection of sets; generator; Borel $\sigma$-algebra; $\sigma$-algebra generated by a set; semi-ring, property of; ring generated by a semi-ring; Borel $\sigma$-algebra, criterion of; $\sigma$-algebra extended by a set; Borel $\sigma$-algebra generated by a cocountable topology
3. MEASURABLE FUNCTIONS
3.1 Definition. Let $(\Omega,\Sigma)$ and $(\Omega',\Sigma')$ be two measurable spaces. A function $[\Omega,\Omega',f]$ is called measurable if $f^{**}(\Sigma') \subseteq \Sigma$, i.e. if $\forall A' \in \Sigma'$, $f^*(A') \in \Sigma$. The collection of all measurable functions from $(\Omega,\Sigma)$ to $(\Omega',\Sigma')$ will be denoted by $\mathcal{C}^{-1}(\Omega,\Sigma;\Omega',\Sigma')$. Notice that the symbol $\mathcal{C}^{-1}$ is a natural extrapolation of the common notation in analysis, where $\mathcal{C}^n$ stands for the space of all $n$-times continuously differentiable functions, with $\mathcal{C}^0$ (or simply $\mathcal{C}$) being used for the space of all continuous functions. So, not only has $\mathcal{C}^{-1}$ been vacant, but it also agrees with the existing linear order $(\mathcal{C}^n, n = -1,0,1,\ldots; \supseteq)$. □
3.2 Remark. In Example 1.2 (viii), we saw that $f^{**}(\Sigma')$ is a $\sigma$-algebra in $\Omega$. We wish to call it the $\sigma$-algebra generated by the function $f$. This is the smallest $\sigma$-algebra relative to which $f$ is measurable. □
3.3 Examples.
(i) Each identity function $f : (\Omega,\Sigma) \to (\Omega,\Sigma')$ is measurable if and only if $\Sigma' \subseteq \Sigma$.
(ii) Let $f : (\Omega,\Sigma) \to (\Omega',\Sigma')$ be a constant function, i.e. $f(\omega) = c \in \Omega'$, $\forall \omega \in \Omega$. Therefore, $f^*(\{c\}) = \Omega$ and $f^*(\{c\}^c) = \emptyset$, which yields that for each $A' \in \Sigma'$, $f^*(A')$ is either $\Omega$ or $\emptyset$. The latter implies that $f$ is measurable with respect to the smallest $\sigma$-algebra $\{\Omega,\emptyset\}$ in $\Omega$. Thus, $f$ is always measurable.
(iii) Let $f(\omega) = 1_A(\omega)$ for some $A \subseteq \Omega$. Let $\Sigma'$ be an arbitrary $\sigma$-algebra in $\mathbb{R}$ (for instance, the Borel $\sigma$-algebra). It is easily seen that the inverse image under $f$ of any subset of $\mathbb{R}$ (specifically, of any element of $\Sigma'$) is one of the elements of the set $\Sigma = \{\Omega,\emptyset,A,A^c\}$. Therefore, $\Sigma$ is the smallest $\sigma$-algebra with respect to which $f$ is measurable. On the other hand, if $\Sigma$ is a $\sigma$-algebra in $\Omega$, then $1_A$ is measurable if and only if $A \in \Sigma$. □
There is a noteworthy parallel between continuity and measurability of functions and their relationships with topologies and $\sigma$-algebras. Recall that a function $[\Omega,\Omega',f]$ is continuous on $\Omega$ if there are two topologies $\tau$ and $\tau'$ declared on $\Omega$ and $\Omega'$, respectively, and $f^{**}(\tau') \subseteq \tau$. If, in addition, $\tau'$ is known to be induced by a subbase $\mathcal{G}'$, then the condition $f^{**}(\tau') \subseteq \tau$ can be relaxed to $f^{**}(\mathcal{G}') \subseteq \tau$. This analogy with measurability is utilized by
3.4 Proposition. Let $(\Omega,\Sigma)$ and $(\Omega',\Sigma')$ be two measurable spaces and let $\mathcal{G}'$ be a generator of $\Sigma'$. Then a function $f : \Omega \to \Omega'$ is measurable if and only if $f^{**}(\mathcal{G}') \subseteq \Sigma$.
Proof. Let $\tilde{\Sigma} = \{Q' \in \mathcal{P}(\Omega') : f^*(Q') \in \Sigma\}$. Obviously, $\tilde{\Sigma}$ is a $\sigma$-algebra (show it, see Problem 3.1). Now let $f^{**}(\mathcal{G}') \subseteq \Sigma$. Then it follows that $\mathcal{G}' \subseteq \tilde{\Sigma}$ and hence $\Sigma' = \Sigma(\mathcal{G}') \subseteq \tilde{\Sigma}$. Therefore, $f$ is measurable. The converse is trivial. □
3.5 Example. Let $f : (\Omega,\tau) \to (\Omega',\tau')$ be a continuous function on a topological space $(\Omega,\tau)$. Then $f^{**}(\tau') \subseteq \tau \subseteq \mathcal{B}(\tau)$ (the Borel $\sigma$-algebra generated by $\tau$). By Proposition 3.4, the function $f$ is then $\mathcal{B}(\tau)$-$\mathcal{B}(\tau')$-measurable. We call $f$ a Borel measurable function. □
Measurability, like continuity, is preserved under composition.
3.6 Proposition. Let $f_1 : (\Omega_1,\Sigma_1) \to (\Omega_2 = f_{1*}(\Omega_1),\Sigma_2)$ and $f_2 : (\Omega_2,\Sigma_2) \to (\Omega_3,\Sigma_3)$ be measurable functions. Then the composition $f_2 \circ f_1 : (\Omega_1,\Sigma_1) \to (\Omega_3,\Sigma_3)$ is measurable. □
(See Problem 3.2.)
3.7 Remark. Let $\{(\Omega_i,\Sigma_i) : i \in I\}$ be an arbitrary collection of measurable spaces and let $\{f_i : \Omega \to \Omega_i : i \in I\}$ be a collection of functions defined on a set $\Omega$. Every function $f_i$ of this family is clearly $f_i^{**}(\Sigma_i)$-$\Sigma_i$-measurable. We are interested in constructing the minimal $\sigma$-algebra in $\Omega$ relative to which all functions of the family are measurable. Since $\bigcup_{i \in I} f_i^{**}(\Sigma_i)$ is not, in general, a $\sigma$-algebra, it is reasonable to regard it as the generator of the $\sigma$-algebra generated by the family $\{f_i;\ i \in I\}$, in notation, $\Sigma(f_i;\ i \in I)$. □
3.8 Lemma. Let $\{g_i : (\Omega,\Sigma) \to (\Omega_i,\Sigma_i)\}$ be a collection of functions on $\Omega$ and let $f : (\Omega_0,\Sigma_0) \to (\Omega,\Sigma)$ be a function on $\Omega_0$. The function $f$ is $\Sigma_0$-$\Sigma(g_i;\ i \in I)$-measurable if and only if each of the functions $g_i \circ f$ is $\Sigma_0$-$\Sigma_i$-measurable.
Proof.
1) Let $g_k \circ f$ be $\Sigma_0$-$\Sigma_k$-measurable $\forall k \in I$. Then $\forall A_k \in \Sigma_k$, $(g_k \circ f)^*(A_k) = f^* \circ g_k^*(A_k) \in \Sigma_0$, where $g_k^*(A_k) \in \bigcup_{i \in I} g_i^{**}(\Sigma_i)$. Taking $A_k$ from $\Sigma_k$ for each respective $k \in I$ we run the whole set $\bigcup_{i \in I} g_i^{**}(\Sigma_i)$, whose elements are further transferred by $f^*$ into $\Sigma_0$. In other words, we have
$f^{**}\left(\bigcup_{i \in I} g_i^{**}(\Sigma_i)\right) \subseteq \Sigma_0$. (3.8)
Since $\bigcup_{i \in I} g_i^{**}(\Sigma_i)$ is a generator of $\Sigma(g_i;\ i \in I)$, by Proposition 3.4, inclusion (3.8) is sufficient for $f^{**}(\Sigma(g_i;\ i \in I)) \subseteq \Sigma_0$. Therefore, $f$ is indeed $\Sigma_0$-$\Sigma(g_i;\ i \in I)$-measurable.
2) Let $f$ be $\Sigma_0$-$\Sigma(g_i;\ i \in I)$-measurable. This implies that $\forall E \in \mathcal{G} = \bigcup_{i \in I} g_i^{**}(\Sigma_i)$, $f^*(E) \in \Sigma_0$. Besides, $\forall A_k \in \Sigma_k$, $g_k^*(A_k) \in \bigcup_{i \in I} g_i^{**}(\Sigma_i)$. Thus, $\forall A_k \in \Sigma_k$, $(g_k \circ f)^*(A_k) \in \Sigma_0$, which means that $g_k \circ f$ is $\Sigma_0$-$\Sigma_k$-measurable.
□
PROBLEMS

3.1 Show that $\mathcal{E}$ in the proof of Proposition 3.4 is a $\sigma$-algebra.

3.2 Prove Proposition 3.6.

3.3 Let $f: \Omega \to \Omega'$ be a function and let $\mathcal{G}' \subseteq \mathcal{P}(\Omega')$. Show that $f^{**}(\Sigma(\mathcal{G}')) = \Sigma(f^{**}(\mathcal{G}'))$. [Hint: Let $\mathcal{E} := \{A' \in \mathcal{P}(\Omega'): f^{*}(A') \in \Sigma(f^{**}(\mathcal{G}'))\}$. Show that $\mathcal{E}$ is a $\sigma$-algebra. Then show that $\Sigma(\mathcal{G}') \subseteq \mathcal{E}$.]

3.4 Let $[O, O_1, F]$ be a homeomorphism, with $O$ and $O_1$ being open sets in topological spaces $(X,\tau)$ and $(X,\tau_1)$, respectively, and let $\mathfrak{B}(\tau_O)$ and $\mathfrak{B}(\tau_{O_1})$ be the Borel $\sigma$-algebras generated by the relative topologies $\tau_O$ and $\tau_{O_1}$. Prove that $[\mathfrak{B}(\tau_O), \mathfrak{B}(\tau_{O_1}), F_{*}]$ is bijective.

3.5 Let $[O, O_1, F]$ be a homeomorphism, with $O$ and $O_1$ being open sets in topological spaces $(X,\tau)$ and $(X,\tau_1)$, respectively, and let $\mathfrak{B}(\tau_O)$ and $\mathfrak{B}(\tau_{O_1})$ be the corresponding Borel $\sigma$-algebras generated by the relative topologies $\tau_O$ and $\tau_{O_1}$. Suppose $B \subseteq O$. Show that if $F_{*}(B)$ is Borel, then $B$ is also Borel.
NEW TERMS:
measurable function
e - 1 -space
$\sigma$-algebra generated by a function
measurability of a function, criterion of
Borel measurable function
composition of measurable functions
$\sigma$-algebra generated by a collection of functions
homeomorphisms and Borel $\sigma$-algebras, relationship between
Chapter 5 Measures
This chapter is a precursor to the general theory of integration, which is a significant advancement from the Riemann integration known from calculus. Although many applications in natural sciences triggered the development of general integration and measure theory, the theory of probability has become the primary client of abstract measure even prior to integration.

An early notion of measure was introduced by the Italian Giuseppe Peano in 1883. For a simple set in the plane, Peano used two types of polygons: those that contain the given set and those that are included in it. The areas of the polygons of the former type have a greatest lower bound, and those of the latter type a least upper bound. If these bounds coincide, their common value is said to be the area of the set. If they differ, the concept of area does not apply. Apparently, Cantor's development of set theory greatly influenced Peano's concept of area for arbitrary sets in his 1887 monograph, Applicazioni geometriche del calcolo infinitesimale. He generalized his original idea of inner and outer measures of sets by polygons to two- and three-dimensional Euclidean sets. Peano emphasized the close connection between measure and integral.

In 1892, the Frenchman Camille Jordan arrived at a more advanced concept of measure as a countably additive set function applied first to positive and then to signed set functions. The latter led to the prominent Jordan decomposition of a signed measure into two positive measures, which we will study in Chapter 8. Jordan's motivation for the concept of measurable sets apparently stemmed from the theory of double integration, which naturally arises when introducing integrals on arbitrary plane sets. However, Jordan's approach to the measurement of sets was restricted to the finite covers of sets by intervals or rectangles that were common at that time.

The most revolutionary steps were undertaken by the Frenchman Émile Borel in his famous 1898 monograph, Leçons sur la théorie des fonctions, where he introduced the idea of countable, instead of finite, covers, thereby significantly extending the classes of measurable sets. In 1905, Borel also pointed out the possibility of using measure theory in probability, which was successfully accomplished by the Russian Andrey Kolmogorov only in 1933. However, in his Leçons, Borel did not bother to connect measure and integration.

In 1902, another Frenchman, Henri Lebesgue, further refined measure theory by combining the ideas of Camille Jordan on finite contents with the countably additive measure notion of Émile Borel. Lebesgue called sets in $\mathbb{R}^n$ measurable whenever their inner and outer measures are equal. This led to the completion of the concept of measure and gave rise to the general theory of integration, so significantly enlarging the class of integrable functions that it made Lebesgue say in 1902: "I know no function that is not summable and I do not know if any such exists." (However, the Italian Giuseppe Vitali showed the existence of such a function in 1905.) Lebesgue also established several central theorems in the theory of integration; one of them is the famous Lebesgue Dominated Convergence Theorem.

Finally, the Austrian Johann Radon, in his 1913 Habilitation work, began to study abstract measures and integrals more general than those of Lebesgue in $\mathbb{R}^n$. Radon is also the author of the well-known Radon and Radon-Stieltjes integrals. The latter is most frequently referred to as the Lebesgue-Stieltjes integral. Radon's ideas led not only to the abstract theory of measure and integration but also to its applications in boundary value problems in the theory of logarithmic potentials (developed by Radon himself) and in the contemporary theory of probability and stochastic processes.
1. SET FUNCTIONS
1.1 Definitions.
(i) Let $\mathcal{E}$ be a system of subsets of $\Omega$ including the empty set $\emptyset$. A numerical function $\mu: \mathcal{E} \to \overline{\mathbb{R}}$ such that $\mu(\emptyset) = 0$ is called a set function. In this chapter we will only consider nonnegative set functions $\mu: \mathcal{E} \to \overline{\mathbb{R}}_{+}$.

In the definitions below we assume that the corresponding sets are elements of $\mathcal{E}$.

(ii) A set function $\mu$ is called finitely additive or just additive if for any $n$-tuple $A_1,\ldots,A_n$ of mutually disjoint sets with $\sum_{k=1}^{n} A_k \in \mathcal{E}$,
$$\mu\Big(\sum_{k=1}^{n} A_k\Big) = \sum_{k=1}^{n} \mu(A_k).$$

(iii) A set function $\mu$ is called $\sigma$-additive if for any sequence $A_1, A_2, \ldots$ of mutually disjoint sets with $\sum_{n=1}^{\infty} A_n \in \mathcal{E}$, it holds that
$$\mu\Big(\sum_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n).$$

(iv) A set function $\mu$ on $\mathcal{E}$ is called continuous from below on $\mathcal{E}$ if for every monotone nondecreasing sequence $\{A_n\}\uparrow$ such that $\bigcup_{n=1}^{\infty} A_n \in \mathcal{E}$ it holds that
$$\lim_{n\to\infty} \mu(A_n) = \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$
If this condition is known to hold for a particular monotone nondecreasing sequence $\{A_n\}\uparrow$, then $\mu$ is said to be continuous from below on $\{A_n\}$.

(v) Let $\{A_n\}$ be a monotone nonincreasing sequence and $\bigcap_{n=1}^{\infty} A_n \in \mathcal{E}$. A set function $\mu$ on $\mathcal{E}$ is said to be continuous from above on $\{A_n\}$ if
$$\lim_{n\to\infty} \mu(A_n) = \mu\Big(\bigcap_{n=1}^{\infty} A_n\Big). \tag{1.1}$$
The set function $\mu$ is called continuous from above on $\mathcal{E}$ if (1.1) holds for every monotone nonincreasing sequence $\{A_n\}\downarrow$ such that $\bigcap_{n=1}^{\infty} A_n \in \mathcal{E}$. In particular, if $\{A_n\}\downarrow\emptyset$, (1.1) reduces to
$$\lim_{n\to\infty} \mu(A_n) = 0,$$
and this is referred to as continuity from above at the empty set or, shortly, $\emptyset$-continuity of $\mu$.

(vi) A set function $\mu$ is called finite on $\mathcal{E}$ if $\mu(A) < \infty$ for all $A \in \mathcal{E}$.

(vii) A set function $\mu$ is called $\sigma$-finite on $\mathcal{E}$ if $\mathcal{E}$ contains a sequence $\{A_n\}$ monotonically increasing to $\Omega$ (i.e., $\bigcup_{n=1}^{\infty} A_n = \Omega$) and $\mu$ is finite on $\{A_n\}$. In this case, we also say that $\mu$ is $\sigma$-finite on $\{A_n\}$.

(viii) An additive set function $\mu$ defined on a semi-ring $\mathcal{S}$ (in $\Omega$) is called an elementary content (on $\mathcal{S}$).

(ix) An additive set function $\mu$ defined on a ring $\mathcal{R}$ (in $\Omega$) is called a content (on $\mathcal{R}$).

(x) A $\sigma$-additive set function $\mu$ defined on a ring $\mathcal{R}$ or algebra $\mathcal{A}$ (in $\Omega$) is called a premeasure (on $\mathcal{R}$ or $\mathcal{A}$).

(xi) A $\sigma$-additive set function $\mu$ defined on a $\sigma$-algebra $\Sigma$ (in $\Omega$) is called a measure (on $\Sigma$). If, in addition, $\mu(\Omega) = 1$, then $\mu$ is called a probability measure.

(xii) Let $(\Omega,\Sigma)$ be a measurable space and let $\mu$ be a measure on $\Sigma$. Then the triple $(\Omega,\Sigma,\mu)$ is called a measure space. If $\mu$ is a probability measure, then the triple $(\Omega,\Sigma,\mu)$ is called a probability space. □
1.2 Examples.
(i) Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$ be a fixed point. Define the following set function $\varepsilon_a$ on $\Sigma$. For each $A \in \Sigma$, we set $\varepsilon_a(A) = 1$ if $a \in A$ and $\varepsilon_a(A) = 0$ if $a \notin A$. Then $\varepsilon_a$ is a measure on $\Sigma$. It is clear that $\varepsilon_a(\emptyset) = 0$ and that $\varepsilon_a \ge 0$. Let $\{A_n\}$ be a sequence of pairwise disjoint sets from $\Sigma$. Then $a$ can either belong to exactly one set, $A_j$ (for some $j$), or to no set of the sequence. In the first case,
$$\varepsilon_a\Big(\sum_{n=1}^{\infty} A_n\Big) = 1,$$
and in the second case,
$$\varepsilon_a\Big(\sum_{n=1}^{\infty} A_n\Big) = 0.$$
On the other hand, in the first case,
$$\sum_{n=1}^{\infty} \varepsilon_a(A_n) = \varepsilon_a(A_j) = 1,$$
and in the second case,
$$\sum_{n=1}^{\infty} \varepsilon_a(A_n) = 0.$$
Therefore, $\varepsilon_a$ is $\sigma$-additive. The measure $\varepsilon_a$ is called a point mass or Dirac measure.

(ii) Let $\Omega = \mathbb{R}^n$. In Example 1.2 (iv), Chapter 4, we introduced the system $\mathcal{S}$ of all half-open intervals $(a,b] \subseteq \mathbb{R}^n$, which was shown to be a semi-ring on $\mathbb{R}^n$. For $a = (a_1,\ldots,a_n)$ and $b = (b_1,\ldots,b_n)$ ($a_i < b_i$), we define
$$\lambda^0((a,b]) = \prod_{k=1}^{n} (b_k - a_k) \quad\text{and}\quad \lambda^0(\emptyset) = 0.$$
Then $\lambda^0$ is obviously an elementary content on $\mathcal{S}$. $\lambda^0$ is said to be the Lebesgue elementary content.

(iii) Let $[\mathbb{R},\mathbb{R},f]$ be a bounded, monotone nondecreasing, right-continuous function that vanishes at $-\infty$. Any such function is called a distribution function. There is also a variant, the so-called extended distribution function $[\mathbb{R},\mathbb{R},f]$, which is monotone nondecreasing and right-continuous, but need not be bounded over unbounded sets and need not vanish at $-\infty$. (As a right-continuous function, an extended distribution function is bounded over bounded sets though.) Since the set $\{f > a\}$ (for any real number $a$) is either $\emptyset$ or an interval, every distribution or extended distribution function is Borel. Both types of functions are subclasses of monotone functions (that will be introduced and studied in Chapter 9). Let $\mathcal{S}$ be the semi-ring of half-open intervals in $\mathbb{R}$. We define the set function $\mu_f^0$ on $\mathcal{S}$ as
$$\mu_f^0(\emptyset) = 0 \quad\text{and}\quad \mu_f^0((a,b]) = f(b) - f(a).$$
$\mu_f^0$ is clearly additive on $\mathcal{S}$ and therefore is an elementary content on $\mathcal{S}$. $\mu_f^0$ is called the Lebesgue-Stieltjes elementary content.

(iv) Let $\Omega$ be an uncountable set and let $\Sigma = \{A \in \mathcal{P}(\Omega): \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$, which is a $\sigma$-algebra on $\Omega$ (see Example 1.2 (vii), Chapter 4). Then, for all $A \in \Sigma$, define $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = 1$ if $A^c$ is at most countable. We show that $\mu$ is a measure on $\Sigma$. First observe that $\mu \ge 0$ and $\mu(\emptyset) = 0$. Let $\{A_n : n = 1,2,\ldots\}$ be a sequence of pairwise disjoint sets of $\Sigma$. If the union $\sum_{n=1}^{\infty} A_n$ is at most countable, then each $A_n$ is at most countable, and thus
$$\mu\Big[\sum_{n=1}^{\infty} A_n\Big] = 0 = \sum_{n=1}^{\infty} 0 = \sum_{n=1}^{\infty} \mu(A_n).$$
If $\big[\sum_{n=1}^{\infty} A_n\big]^c$ is at most countable, then we argue that there is exactly one set of the sequence $\{A_n\}$ with an at most countable complement. (At least one such set exists, since otherwise every $A_n$, and hence their union, would be at most countable, which would make $\Omega$ countable.) Suppose that there is yet another set from this sequence with this property. Then both complements, say $A_i^c$ and $A_k^c$, are at most countable and hence $A_i^c \cup A_k^c$ is also at most countable. Since $A_i \cap A_k = \emptyset$, $A_i^c \cup A_k^c = \Omega$, which is a contradiction, for $\Omega$ is uncountable. Therefore, we have exactly one set $A_i$ such that $A_i^c$ is at most countable. Then $\big[\sum_{n=1}^{\infty} A_n\big]^c$ is at most countable and
$$\mu\Big(\sum_{n=1}^{\infty} A_n\Big) = 1.$$
On the other hand,
$$\sum_{n=1}^{\infty} \mu(A_n) = \mu(A_1) + \cdots + \mu(A_{i-1}) + \mu(A_i) + \mu(A_{i+1}) + \cdots = 0 + \cdots + 0 + 1 + 0 + \cdots = 1.$$
This yields $\sigma$-additivity of $\mu$.

(v) Let $\{(\Omega,\Sigma,\mu_i): i = 1,2,\ldots\}$ be a sequence of measure spaces and let $\{a_i : i = 1,2,\ldots\}$ be a nonnegative numerical sequence. Define $\mu$ on $\Sigma$ as
$$\mu = \sum_{i=1}^{\infty} a_i \mu_i. \tag{1.2}$$
Then $\mu$ is a measure on $\Sigma$. (See Problem 1.3.)

(vi) Consider the special case in the last example with $\mu_n = \varepsilon_{b_n}$, $n = 1,2,\ldots$, where $\{b_n\} \subseteq \Omega$. In other words, $\mu_n$ is a point mass, which was introduced in (i). Then the measure $\mu$ defined in (1.2) is called an atomic (sometimes also discrete) measure. A further special case is of interest. With the given $\mu_n$, we also assume that the sequence $\{a_n\}$ has the property
$$\sum_{n=1}^{\infty} a_n = 1.$$
It is readily seen that the measure $\mu$ is a probability measure; more specifically, it is an atomic probability measure.

(vii) Let $\Omega$ be an uncountable set and let $\Sigma = \{A \in \mathcal{P}(\Omega): \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$ as in (iv). For all $A \in \Sigma$, define $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = \infty$ if $A^c$ is at most countable. Then $\mu$ is a measure on $\Sigma$.
(viii) Let $\Omega$ be an arbitrary set and $\Sigma$ be a $\sigma$-algebra on $\Omega$. Define the following set function $\mu$ on $\Sigma$. For each $A \in \Sigma$, $\mu(A) = |A|$ (i.e., the number of elements in $A$) if $A$ is finite, and $\mu(A) = \infty$ otherwise. Then it is easy to verify that $\mu$ is a measure on $\Sigma$. It is called a counting measure and the corresponding triple $(\Omega,\Sigma,\mu)$ will be referred to as a counting measure space. Note that if $\Omega$ is at most a countable set, a counting measure $\mu$ can also be expressed in terms of the measure introduced in (1.2), with $a_n = 1$ and $\mu_n = \varepsilon_{b_n}$. □

1.3 Proposition. Let $\mu$ be an elementary content on a semi-ring $\mathcal{S}$. Then the following holds true:

(i) For any two sets $A$ and $B$ from $\mathcal{S}$ with $A \subseteq B$, $\mu(A) \le \mu(B)$ (monotonicity).

(ii) Let $A_1, A_2, \ldots$ be a sequence of pairwise disjoint sets from $\mathcal{S}$ and $A_n \subseteq A \in \mathcal{S}$ for each $n$. Then,
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(A). \tag{1.3}$$

(iii) Let $A, A_1, A_2, \ldots \in \mathcal{S}$ and $A \subseteq \bigcup_{n=1}^{\infty} A_n$. Then there is a countable decomposition $\sum_{k=1}^{\infty} C_k$ of $A$ with $C_k$'s $\in \mathcal{S}$ such that
$$\sum_{k=1}^{\infty} \mu(C_k) \le \sum_{n=1}^{\infty} \mu(A_n). \tag{1.3a}$$
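Proof. (i) $B = A + B\setminus A = A + \sum_{s=1}^{m} C_s$, where $C_s$'s $\in \mathcal{S}$. Hence, by additivity of $\mu$,
$$\mu(B) = \mu(A) + \sum_{s=1}^{m} \mu(C_s) \ge \mu(A),$$
and the statement follows.

(ii) By Problem 1.12, Chapter 4,
$$A \setminus \sum_{k=1}^{n} A_k = \sum_{s=1}^{m} C_s, \quad\text{where } C_s \in \mathcal{S}.$$
Hence, because of the assumption that $A_n \subseteq A$,
$$A = \sum_{k=1}^{n} A_k + \sum_{s=1}^{m} C_s.$$
Thus, by additivity of $\mu$ on $\mathcal{S}$,
$$\mu(A) = \sum_{k=1}^{n} \mu(A_k) + \sum_{s=1}^{m} \mu(C_s),$$
which implies that
$$\sum_{k=1}^{n} \mu(A_k) \le \mu(A), \quad n = 1,2,\ldots.$$
Consequently, letting $n \to \infty$,
$$\sum_{k=1}^{\infty} \mu(A_k) \le \mu(A).$$

(iii) With $B_n = A_n \cap A$ ($\in \mathcal{S}$) we have that $\bigcup_{n=1}^{\infty} B_n = A$. Denote
$$D_1 = B_1, \quad D_{n+1} = B_{n+1} \setminus \bigcup_{j=1}^{n} B_j, \quad n = 1,2,\ldots.$$
Then, $A = \sum_{k=1}^{\infty} D_k$ and, by Problem 1.12, Chapter 4, $D_k = \sum_{s=1}^{m_k} C_{sk}$ with $C_{sk}$'s $\in \mathcal{S}$. Since $D_k \subseteq B_k$, by (ii),
$$\sum_{s=1}^{m_k} \mu(C_{sk}) \le \mu(B_k).$$
Now, due to monotonicity of $\mu$ and because $B_k \subseteq A_k$, we have that
$$\sum_{k=1}^{\infty} \sum_{s=1}^{m_k} \mu(C_{sk}) \le \sum_{k=1}^{\infty} \mu(B_k) \le \sum_{k=1}^{\infty} \mu(A_k), \tag{1.3b}$$
which yields (1.3a). □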
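The construction in part (iii) of the proof can be tried out on the real line. The following is a minimal sketch, assuming the semi-ring of half-open intervals $(a,b]$ in $\mathbb{R}$ with interval length as the elementary content; the helper names `intersect`, `subtract`, and `decompose` are illustrative and do not come from the text. An interval $(a,b]$ is stored as the pair `(a, b)`.

```python
def intersect(i, j):
    """Intersection of two half-open intervals (a, b]; None if it is empty."""
    a, b = max(i[0], j[0]), min(i[1], j[1])
    return (a, b) if a < b else None


def subtract(i, j):
    """(a, b] minus (c, d], returned as a list of disjoint half-open intervals."""
    a, b = i
    c, d = j
    pieces = []
    if c > a:                       # part of i lying to the left of j
        pieces.append((a, min(b, c)))
    if d < b:                       # part of i lying to the right of j
        pieces.append((max(a, d), b))
    return [p for p in pieces if p[0] < p[1]]


def length(i):
    """Lebesgue elementary content of (a, b] in one dimension."""
    return i[1] - i[0]


def decompose(A, cover):
    """Disjoint pieces C_k of A built as in the proof:
    B_n = A_n ∩ A,  D_1 = B_1,  D_{n+1} = B_{n+1} minus (B_1 ∪ ... ∪ B_n)."""
    earlier = []                    # B_1, ..., B_n seen so far
    pieces = []                     # the sets C_k
    for A_n in cover:
        B_n = intersect(A_n, A)
        if B_n is None:
            continue
        D_n = [B_n]
        for B_j in earlier:         # remove every earlier B_j from B_n
            D_n = [p for piece in D_n for p in subtract(piece, B_j)]
        pieces.extend(D_n)
        earlier.append(B_n)
    return pieces


A = (0, 3)
cover = [(0, 2), (1, 3), (2, 5)]    # A is contained in the union of these
pieces = decompose(A, cover)
total = sum(length(p) for p in pieces)
cover_total = sum(length(c) for c in cover)
print(pieces, total, cover_total)
```

For a finite cover the decomposition is finite, so the total content of the pieces equals the length of $A$ and cannot exceed the total content of the cover, in line with (1.3a).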
1.4 Corollaries.

(i) If $\sum_{m=1}^{\infty} E_m$ is a decomposition of $A$ by elements $E_m$'s from $\mathcal{S}$, then (from Proposition 1.3 (iii)):
$$\sum_{k=1}^{\infty} \mu(C_k) \le \sum_{m=1}^{\infty} \mu(E_m), \tag{1.4}$$
i.e., $\sum_{k=1}^{\infty} C_k$ is a "$\mu$-minimal decomposition" of $A$.

(ii) Let $A, A_1, A_2, \ldots \in \mathcal{S}$ and $A \subseteq \bigcup_{n=1}^{\infty} A_n$. If $\mu$ is $\sigma$-additive on $\mathcal{S}$ (i.e., $\mu\big(\sum_{n=1}^{\infty} A_n\big) = \sum_{n=1}^{\infty} \mu(A_n)$ for any sequence of mutually disjoint sets $\{A_n\} \subseteq \mathcal{S}$ with $\sum_{n=1}^{\infty} A_n \in \mathcal{S}$), then (from (1.4)):
$$\mu(A) = \sum_{k=1}^{\infty} \sum_{s=1}^{m_k} \mu(C_{sk}) \le \sum_{k=1}^{\infty} \mu(A_k). \tag{1.4a}$$
In particular, if $A = \bigcup_{n=1}^{\infty} A_n \in \mathcal{S}$,
$$\mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n=1}^{\infty} \mu(A_n), \tag{1.4b}$$
which is known as the $\sigma$-subadditivity property. Inequality (1.4b) (originally applied to a semi-ring $\mathcal{S}$ and elementary content $\mu$) implies:

(iii) If $\mu$ is a content on a ring $\mathcal{R}$ then for any $A_1,\ldots,A_n \in \mathcal{R}$,
$$\mu\Big(\bigcup_{k=1}^{n} A_k\Big) \le \sum_{k=1}^{n} \mu(A_k).$$

(iv) If $\mu$ is a measure on a $\sigma$-algebra $\Sigma$ then the $\sigma$-subadditivity property (1.4b) is valid for any sequence $\{A_n\} \subseteq \Sigma$.

(v) Because of the monotonicity property of an elementary content, due to Proposition 1.3 (i), the definition (1.1 (vi)) of finiteness can be relaxed for set functions from elementary contents and "above" by requiring merely $\mu(\Omega) < \infty$. □
There are two more minor properties of contents left for an exercise:

1.5 Proposition. Every content $\mu$ on a ring $\mathcal{R}$ has the following properties:

(i) $A \subseteq B \Rightarrow \mu(B\setminus A) = \mu(B) - \mu(A)$ (provided that $\mu(A) < \infty$).

(ii) $\mu(A \cup B) = \mu(A) + \mu(B) - \mu(A \cap B)$.

(See Problem 1.2.)

1.6 Lemma. Let $\mu$ be a content on a ring $\mathcal{R}$. Then $\mu$ is a premeasure (i.e., $\sigma$-additive) on $\mathcal{R}$ if and only if $\mu$ is continuous from below.
Proof.

1) Let $\mu$ be a premeasure on $\mathcal{R}$ and $\{A_n\}$ a monotone nondecreasing sequence in $\mathcal{R}$ such that $A := \bigcup_{n=1}^{\infty} A_n \in \mathcal{R}$. If $\mu(A_n) = \infty$ for some $n$, then, by monotonicity, $\mu(A_k) = \infty$ for all $k > n$ and $\mu(A) = \infty$. Then continuity from below follows immediately. Assume that $\mu(A_n) < \infty$ for all $n$. Denote
$$B_n = A_n \setminus A_{n-1}, \quad n = 1,2,\ldots, \quad A_0 = \emptyset.$$
Then $A_n = \sum_{k=1}^{n} B_k$ and $\{B_n\}$ forms a pairwise disjoint sequence from $\mathcal{R}$ with $\sum_{n=1}^{\infty} B_n = A$. Since $\mu(A_n) < \infty$ (by the above assumption), $\mu(B_n)$ is well defined and, therefore, due to $\sigma$-additivity of $\mu$ (as a premeasure),
$$\lim_{n\to\infty} \mu(A_n) = \lim_{n\to\infty} \sum_{k=1}^{n} \mu(B_k) = \sum_{k=1}^{\infty} \mu(B_k) = \mu\Big(\sum_{k=1}^{\infty} B_k\Big) = \mu(A),$$
which yields continuity from below.

2) Now let $\mu$ be a content on $\mathcal{R}$ which is continuous from below, i.e., suppose that for every monotone increasing sequence of sets in $\mathcal{R}$, $\{A_n\}\uparrow A := \bigcup_{n=1}^{\infty} A_n$, such that $A \in \mathcal{R}$, it holds that
$$\lim_{n\to\infty} \mu(A_n) = \mu(A).$$
Let $\{C_n\}$ be a sequence of pairwise disjoint sets from $\mathcal{R}$ with $B := \sum_{k=1}^{\infty} C_k \in \mathcal{R}$. By setting $B_n = \sum_{k=1}^{n} C_k$, we get $\{B_n\}\uparrow \subseteq \mathcal{R}$, with $\bigcup_{n=1}^{\infty} B_n = B$ and hence $\lim_{n\to\infty} \mu(B_n) = \mu(B)$, which is equivalent to
$$\sum_{k=1}^{\infty} \mu(C_k) = \mu\Big(\sum_{k=1}^{\infty} C_k\Big).$$
This is the desired $\sigma$-additivity of $\mu$. □
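1.7 Theorem.

(i) If $\mu$ is a premeasure on a ring $\mathcal{R}$ and if $\{A_n\}\downarrow \subseteq \mathcal{R}$ is such that $\mu(A_1) < \infty$ and $A := \bigcap_{n=1}^{\infty} A_n \in \mathcal{R}$, then $\mu$ is $\emptyset$-continuous (or continuous from above) on the sequence $\{A_n\}$. In particular, if $\mu$ is a finite premeasure on a ring $\mathcal{R}$ then $\mu$ is $\emptyset$-continuous (or continuous from above) on $\mathcal{R}$.

(ii) If $\mu$ is a finite content on a ring $\mathcal{R}$ then $\emptyset$-continuity implies that $\mu$ is a premeasure.

Proof.

(i) Since $\{A_n\}$ is monotone nonincreasing, $\{A_1 \setminus A_n\}\uparrow \subseteq \mathcal{R}$. It can easily be shown that $\bigcup_{n=1}^{\infty} (A_1 \setminus A_n) = A_1 \setminus \big(\bigcap_{n=1}^{\infty} A_n\big)$. Now we apply Lemma 1.6 to the sequence $\{A_1 \setminus A_n\}$ to arrive at
$$\lim_{n\to\infty} \mu(A_1 \setminus A_n) = \mu(A_1 \setminus A). \tag{1.7}$$
Since $\mu$ is finite on $\{A_n\}$, by Proposition 1.5 (i), equation (1.7) reduces to
$$\mu(A_1) - \lim_{n\to\infty} \mu(A_n) = \mu(A_1) - \mu(A)$$
and thereby yields assertion (i).

(ii) We show that $\emptyset$-continuity implies continuity from below, the hypothesis of Lemma 1.6, and that Lemma 1.6 in turn yields (ii). Let $\{A_n\}\uparrow \subseteq \mathcal{R}$ be such that $A := \bigcup_{n=1}^{\infty} A_n \in \mathcal{R}$. Then $\{A \setminus A_n\}\downarrow \emptyset$ and $\emptyset$-continuity yields
$$\lim_{n\to\infty} \mu(A \setminus A_n) = \mu(\emptyset) = 0.$$
Since the content $\mu$ is assumed to be finite on $\mathcal{R}$, by Proposition 1.5 (i), the last expression leads to
$$\lim_{n\to\infty} \mu(A_n) = \mu(A).$$
Applying Lemma 1.6 (2) we have that $\mu$ is a premeasure on $\mathcal{R}$. □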
(i) We show that on some measure spaces, while $\{A_n\}\downarrow\emptyset$, $\mu(A_n) \not\to 0$. Let $(\Omega,\Sigma,\mu)$ be such that $\Omega = \mathbb{N}$, $\Sigma = \mathcal{P}(\Omega)$, $\mu(\{n\}) = \frac{1}{n}$. Let $A_n = \{n, n+1, \ldots\}$. Clearly, $\{A_n\}$ is a monotone decreasing vanishing sequence. However, $\mu(A_n) = \infty$ for each $n$ and thus $\mu(A_n) \to \infty$. So $\emptyset$-continuity does not apply, since we violated the condition $\mu(A_1) < \infty$ of Theorem 1.7, which, as we see, is essential.

(ii) Consider in $\Omega$ the $\sigma$-algebra $\Sigma = \{\Omega, \emptyset, A, A^c\}$ and define $\mu: \Sigma \to \{1, 0, p, 1-p\}$, $p \in (0,1)$, that is, $\mu(\Omega) = 1$, $\mu(\emptyset) = 0$, $\mu(A) = p$, and $\mu(A^c) = 1 - p$. Then $\mu$ is a (probability) measure on $\Sigma$, called a Bernoulli measure. Notice that for the traditional Bernoulli measure $\Omega = \{0,1\}$ and $A = \{1\}$.

(iii) Consider the following atomic measure on the Borel $\sigma$-algebra $\mathfrak{B}(\mathbb{R})$:
$$\mu = \sum_{k=0}^{n} a_k \varepsilon_k,$$
where
$$a_k = \binom{n}{k} p^k (1-p)^{n-k}, \quad p \in (0,1), \quad k = 0,\ldots,n.$$
Clearly, $\mu$ is a probability measure. It is called the binomial measure and it is denoted by $\beta_{n,p}$.

(iv) Consider another atomic measure on $\mathfrak{B}(\mathbb{R})$:
$$\mu = \sum_{k=0}^{\infty} a_k \varepsilon_k, \quad\text{where}\quad a_k = e^{-\lambda} \frac{\lambda^k}{k!}, \quad \lambda > 0.$$
$\mu$ is also a probability measure, called the Poisson measure, in notation, $\pi_\lambda$. □
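The atomic measures above are weighted sums of point masses as in (1.2), so they are easy to experiment with numerically. The following is a minimal sketch, assuming a finite set of atoms and Python sets as the measurable sets; the function names `point_mass`, `atomic_measure`, and `binomial_measure` are illustrative and are not notation from the text.

```python
from math import comb


def point_mass(a):
    """Dirac measure at a: maps a set A to 1 if a is in A, and to 0 otherwise."""
    return lambda A: 1 if a in A else 0


def atomic_measure(weights):
    """Weighted sum of point masses, one per atom b with weight weights[b].

    `weights` maps each atom to a nonnegative weight (finitely many atoms here).
    """
    def mu(A):
        return sum(w * point_mass(b)(A) for b, w in weights.items())
    return mu


def binomial_measure(n, p):
    """Binomial measure: atoms 0..n with weights C(n,k) p^k (1-p)^(n-k)."""
    return atomic_measure(
        {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
    )


beta = binomial_measure(4, 0.5)
evens, odds = {0, 2, 4}, {1, 3}

# Total mass is 1 (probability measure), and the measure of a disjoint
# union equals the sum of the measures (additivity).
print(beta(evens | odds))
print(beta(evens) + beta(odds))
```

Evaluating `beta` on a set simply adds the weights of the atoms that fall into the set, which is why additivity over disjoint sets holds term by term.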
PROBLEMS

1.1 Let $\mathcal{S}$ be the semi-ring of half-open intervals on the real line and $\lambda^0$ be the Lebesgue elementary content. Take $A = (0,1]$, $B = (1,2]$, and $C = (3,4]$. While $\lambda^0(A + B) = \lambda^0(A) + \lambda^0(B)$, we cannot state that $\lambda^0(A + C) = \lambda^0(A) + \lambda^0(C)$, since $A + C$ is not an interval, and therefore, the left-hand side of the last equation is not defined. Hence, $\lambda^0$ is not additive on $\mathcal{S}$. True or false?

1.2 Prove Proposition 1.5 (i, ii).

1.3 Show that for a content on a ring, the notions of continuity from below and $\emptyset$-continuity are equivalent.

1.4 Prove that $\mu$ is a measure on $\Sigma$ in Example 1.2 (v).

1.5 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{A_n : n = 1,2,\ldots\} \subseteq \Sigma$ with $\sum_{n=1}^{\infty} \mu(A_n) < \infty$. Show that $\mu\big(\overline{\lim}_{n\to\infty} A_n\big) = 0$. [Hint: Apply continuity from above of the measure $\mu$ to the monotone nonincreasing sequence $\big\{\bigcup_{k=n}^{\infty} A_k : n = 1,2,\ldots\big\}$.]

1.6 A subset $\mathcal{R}_I$ of a ring $\mathcal{R}$ in $\Omega$ is called an ideal in $\mathcal{R}$ if it has the following properties:
a) $\emptyset \in \mathcal{R}_I$
b) $B \in \mathcal{R}_I$, $A \subseteq B$, $A \in \mathcal{R} \Rightarrow A \in \mathcal{R}_I$
c) $A, B \in \mathcal{R}_I \Rightarrow A \cup B \in \mathcal{R}_I$.
Let $\mu$ be a content on a ring $\mathcal{R}$. Define $\mathcal{R}_\mu = \{R \in \mathcal{R} : \mu(R) = 0\}$. Show that $\mathcal{R}_\mu$ is an ideal in $\mathcal{R}$.

1.7 A subset $\mathcal{R}_I^\sigma$ of a ring $\mathcal{R}$ is called a $\sigma$-ideal in $\mathcal{R}$ if, in addition to properties a-c) of Problem 1.6, it is closed with respect to the formation of at most countable unions. Let $\mu$ be a premeasure on $\mathcal{R}$. With the same definition of $\mathcal{R}_\mu$ as in 1.6, prove that $\mathcal{R}_\mu$ is a $\sigma$-ideal in $\mathcal{R}$.

1.8 Let $\mathcal{R}_I^\sigma$ be a $\sigma$-ideal in a ring $\mathcal{R}$. Show that there exists a premeasure $\mu$ on $\mathcal{R}$ such that $\mathcal{R}_I^\sigma = \mathcal{R}_\mu$, defined in Problems 1.7 and 1.8.

1.9 Let $\mu$ be a finite content on a ring $\mathcal{R}$. Show that $d(A,B) := \mu(A \Delta B)$, for all $A, B \in \mathcal{R}$, is a pseudo-metric on $\mathcal{R}$ (i.e., that $d$ possesses all properties of a metric except that $d(A,B) = 0$ need not yield $A = B$).

1.10 Let $(\Omega,\Sigma,\mu)$ be a measure space and $\{A_n\} \subseteq \Sigma$ be a sequence of sets with $\mu\big(\bigcup_{n=1}^{\infty} A_n\big) < \infty$ and $\inf\{\mu(A_n) : n = 1,2,\ldots\} = a > 0$. Show that $\mu\big(\overline{\lim}_{n\to\infty} A_n\big) \ge a$.

1.11 Let $(\Omega,\Sigma,\mu)$ be a measure space and for a number $0 < a < \infty$, define $\mathcal{G}_a = \{G \in \Sigma : \mu(G) < a\}$ and
$$\Sigma_a = \{Q \subseteq \Omega : Q \cap G \in \Sigma,\ \forall\, G \in \mathcal{G}_a\}.$$
Show that $\Sigma_a$ is a $\sigma$-algebra.

1.12 In the condition of Problem 1.11, let $a = \infty$ and $\mu$ be a finite measure. Show that $\Sigma_\infty = \Sigma$.

1.13 Let $(\Omega,\Sigma,\mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, and $\Sigma_\infty = \{Q \subseteq \Omega : Q \cap G \in \Sigma,\ \forall\, G \in \mathcal{G}_\infty\}$. Define the set function $\mu_\infty$ on $\Sigma_\infty$ as
$$\mu_\infty(Q) = \begin{cases} \mu(Q), & Q \in \Sigma, \\ \infty, & Q \in \Sigma_\infty,\ Q \notin \Sigma. \end{cases}$$
(Notice that $\Sigma \subseteq \Sigma_\infty$.) Show that $\mu_\infty$ is a measure on $\Sigma_\infty$.

1.14 Argue that for any probability space $(\Omega,\Sigma,\mu)$, the axiom $\mu(\emptyset) = 0$ is redundant. Is it also true for any measure?
NEW TERMS:
set function
additive set function
$\sigma$-additive set function
continuity from below on a $\sigma$-algebra
continuity from below on a sequence of sets
continuity from above on a sequence of sets
continuity from above on a $\sigma$-algebra
continuity from above at the empty set
$\emptyset$-continuity
finite set function
$\sigma$-finite set function on a system of sets
$\sigma$-finite set function on a sequence of sets
elementary content
content
premeasure
measure
probability measure
measure space
probability space
point mass (Dirac measure)
Dirac measure (point mass)
Lebesgue elementary content
distribution function
extended distribution function
Lebesgue-Stieltjes elementary content
atomic (discrete) measure
discrete (atomic) measure
atomic probability measure
counting measure
monotonicity
$\mu$-minimal decomposition of a set
$\sigma$-subadditivity
finite subadditivity
continuity from below, criterion of
continuity from above, criterion of
Bernoulli measure
binomial measure
Poisson measure
ideal
$\sigma$-ideal
2. EXTENSION OF SET FUNCTIONS TO A MEASURE
We begin this section with the introduction of a set function that is not exactly a measure, as it is not even additive, but which is at the heart of the formation of measures extended from some more primitive set functions. A prominent example of such a construction yields the Lebesgue measure. It is initially defined on rectangular figures, and then the measurement of a more arbitrary figure is accomplished by means of approximation by rectangles inscribed into the figure or by rectangles that cover the figure. The latter leads to the notion of an "outer measure," which was initially proposed by Lebesgue at the turn of the twentieth century and later refined by Carathéodory. Carathéodory's approach is essentially preserved in the contemporary constructions.

The principal idea of the extension begins with measuring an arbitrary set by sequences of rudimentary sets, which should cover the set and whose measure is previously defined. The total "measure" of the cover is then minimized over all available cover-sequences of basal sets (such as rectangles in Euclidean space). As it turns out, this way we can measure all subsets by the resulting set function, i.e., outer measure, but the latter fails to be additive, although it preserves some rather useful properties of measure, such as subadditivity and monotonicity. Having proved this, we will notice that some of the additivity can be regained; namely, there are sets, including the basal sets, each of which, along with its complement, forms a two-set partition of any other set on which the outer measure becomes additive. The collection of all such "separating" sets forms a $\sigma$-algebra, which, as we will notice, contains the basal sets. This is generally not the smallest $\sigma$-algebra over the basic collection, but this $\sigma$-algebra of separating sets can further be reduced.

Our procedure, however, will be different from the more intuitive way described above. Rather than having a particular generator (such as a semi-ring along with an elementary content) in mind, we will try to develop the whole extension in general. In the beginning, we will define an outer measure as a set function with monotonicity and subadditivity and show that the subcollection of all separating sets is a $\sigma$-algebra and, in addition, that the outer measure on this subcollection is a measure. All this will initially be rendered without assuming that the outer measure was generated by a "formatter" (i.e., some collection of sets and a set function). Then, we take an arbitrary formatter and create a more specific outer measure by applying the above construction with countable covers.

2.1 Definition. Let $\Omega$ be a nonempty set and $\mu^{*}$ be a set function defined on $\mathcal{P}(\Omega)$. $\mu^{*}$ is called an outer measure if:

a) $\mu^{*}(\emptyset) = 0$. (2.1a)
b) $A \subseteq B \Rightarrow \mu^{*}(A) \le \mu^{*}(B)$ (monotonicity). (2.1b)
c) $\{Q_n\} \subseteq \mathcal{P}(\Omega) \Rightarrow \mu^{*}\big(\bigcup_{n=1}^{\infty} Q_n\big) \le \sum_{n=1}^{\infty} \mu^{*}(Q_n)$ ($\sigma$-subadditivity). (2.1c)

Although axiom a) is redundant, since $\mu^{*}(\emptyset) = 0$ as a set function in general, we find it to be a useful reminder. □

2.2 Definition. Let $\mu^{*}$ be an outer measure on $\mathcal{P}(\Omega)$. A subset $M \subseteq \Omega$ is said to be $\mu^{*}$-measurable if for any $Q \subseteq \Omega$,
$$\mu^{*}(Q) = \mu^{*}(Q \cap M) + \mu^{*}(Q \cap M^c). \tag{2.2}$$
We will also say that $M$ separates $Q$. □

The following is what essentially constitutes the widely referred to Carathéodory Extension Theorem. For convenience, we will break it up into several theorems. The idea of outer measures and the construction below belong to the German mathematician (of Greek origin) Constantin Carathéodory; they appeared in his 1914 paper, Über das lineare Maß von Punktmengen: eine Verallgemeinerung des Längenbegriffs (in Göttinger Nachrichten), and in his famous 1918 book, Vorlesungen über reelle Funktionen (Teubner, Leipzig).

2.3 Theorem. The collection $\Sigma^{*}$ of all $\mu^{*}$-measurable subsets forms a $\sigma$-algebra in $\Omega$. The restriction of $\mu^{*}$ from $\mathcal{P}(\Omega)$ to $\Sigma^{*}$, in notation $\mu^{*}_{0}$, is a measure.
Proof. Since throughout the proof of this theorem we will largely use equation (2.2) or prove its validity, we first notice that, due to $\sigma$-subadditivity of $\mu^{*}$, as an outer measure, the inequality
$$\mu^{*}(Q) \le \mu^{*}(Q \cap M) + \mu^{*}(Q \cap M^c) \tag{2.3}$$
holds true for all subsets, $Q$ and $M$, of $\Omega$. Our proof will consist of the following steps.

a) $\Omega$ is obviously an element of $\Sigma^{*}$, as it satisfies (2.2). If $M \in \Sigma^{*}$, then $M^c \in \Sigma^{*}$, by their symmetry in (2.2).

b) We show that $\Sigma^{*}$ is closed with respect to the formation of finite unions, i.e., we show that with $A, B \in \Sigma^{*}$, $A \cup B \in \Sigma^{*}$. Since $B \in \Sigma^{*}$, it follows that for each $Q' \in \mathcal{P}(\Omega)$,
$$\mu^{*}(Q') = \mu^{*}(Q' \cap B) + \mu^{*}(Q' \cap B^c). \tag{2.3a}$$
Specifically, (2.3a) is valid for $Q' = Q \cap A$ and $Q' = Q \cap A^c$, $Q \in \mathcal{P}(\Omega)$. Hence,
$$\mu^{*}(Q \cap A) = \mu^{*}(Q \cap A \cap B) + \mu^{*}(Q \cap A \cap B^c)$$
and
$$\mu^{*}(Q \cap A^c) = \mu^{*}(Q \cap A^c \cap B) + \mu^{*}(Q \cap A^c \cap B^c).$$
Summing up the last two equations and taking into account that $A \in \Sigma^{*}$, we have
$$\mu^{*}(Q) = \mu^{*}(Q \cap A) + \mu^{*}(Q \cap A^c) = \mu^{*}(Q \cap A \cap B) + \mu^{*}(Q \cap A \cap B^c) + \mu^{*}(Q \cap A^c \cap B) + \mu^{*}(Q \cap A^c \cap B^c),$$
implying that
$$\mu^{*}(Q) = \mu^{*}(Q \cap A \cap B) + \mu^{*}(Q \cap A \cap B^c) + \mu^{*}(Q \cap A^c \cap B) + \mu^{*}(Q \cap (A \cup B)^c). \tag{2.3b}$$
Now replacing $Q$ in (2.3b) with $Q \cap (A \cup B)$ we also have
$$\mu^{*}(Q \cap (A \cup B)) = \mu^{*}(Q \cap (A \cup B) \cap A \cap B) + \mu^{*}(Q \cap (A \cup B) \cap A \cap B^c) + \mu^{*}(Q \cap (A \cup B) \cap B \cap A^c) + \mu^{*}(Q \cap (A \cup B) \cap A^c \cap B^c)$$
$$= \mu^{*}(Q \cap A \cap B) + \mu^{*}(Q \cap A \cap B^c) + \mu^{*}(Q \cap B \cap A^c) + \mu^{*}(Q \cap \emptyset).$$
The latter reduces to
$$\mu^{*}(Q \cap (A \cup B)) = \mu^{*}(Q \cap A \cap B) + \mu^{*}(Q \cap A \cap B^c) + \mu^{*}(Q \cap B \cap A^c). \tag{2.3c}$$
Substituting (2.3c) into (2.3b) we get
$$\mu^{*}(Q) = \mu^{*}(Q \cap (A \cup B)) + \mu^{*}(Q \cap (A \cup B)^c),$$
which shows that $A \cup B \in \Sigma^{*}$. The above assertions a) and b) imply that $\Sigma^{*}$ is an algebra in $\Omega$.

c) Now we prove that $\Sigma^{*}$ is a $\sigma$-algebra in $\Omega$. Since $\Sigma^{*}$, as an algebra, is $\cap$-stable, it is sufficient to show that $\Sigma^{*}$ is a Dynkin system. (See Problem 1.10 of Chapter 4.) Let $\{A_n\} \subseteq \Sigma^{*}$ be a sequence of disjoint sets. Take $A_1, A_2 \in \{A_n\}$. Substituting $A_1 = A$ and $A_2 = B$ into (2.3c), taking $A$ and $B$ in (2.3c) disjoint, and then noticing that $A \cap B^c = A$ and $B \cap A^c = B$, we arrive at
$$\mu^{*}[Q \cap (A + B)] = \mu^{*}(Q \cap A) + \mu^{*}(Q \cap B). \tag{2.3d}$$
If $A_1,\ldots,A_n$ is an $n$-tuple of mutually disjoint elements of $\Sigma^{*}$, then, by induction, from (2.3d),
$$\mu^{*}(Q \cap S_n) = \sum_{k=1}^{n} \mu^{*}(Q \cap A_k), \tag{2.3e}$$
where $S_n = \sum_{k=1}^{n} A_k$. Denote $S = \sum_{n=1}^{\infty} A_n$. Because of $S_n \subseteq S$, $(Q \cap S^c) \subseteq (Q \cap S_n^c)$, and by monotonicity of $\mu^{*}$,
$$\mu^{*}(Q \cap S^c) \le \mu^{*}(Q \cap S_n^c). \tag{2.3f}$$
Since $\Sigma^{*}$ is an algebra, it follows that $S_n \in \Sigma^{*}$ and hence it is $\mu^{*}$-measurable, i.e., it separates $Q$, which, combined with (2.3e) and (2.3f), yields
$$\mu^{*}(Q) = \mu^{*}(Q \cap S_n) + \mu^{*}(Q \cap S_n^c) \ge \sum_{k=1}^{n} \mu^{*}(Q \cap A_k) + \mu^{*}(Q \cap S^c), \quad n = 1,2,\ldots.$$
Therefore,
$$\mu^{*}(Q) \ge \sum_{k=1}^{\infty} \mu^{*}(Q \cap A_k) + \mu^{*}(Q \cap S^c), \tag{2.3g}$$
that, by $\sigma$-subadditivity, is
$$\ge \mu^{*}\Big(Q \cap \sum_{k=1}^{\infty} A_k\Big) + \mu^{*}(Q \cap S^c) = \mu^{*}(Q \cap S) + \mu^{*}(Q \cap S^c). \tag{2.3h}$$
Inequalities (2.3) and (2.3g-2.3h) lead to
$$\mu^{*}(Q) = \mu^{*}(Q \cap S) + \mu^{*}(Q \cap S^c),$$
concluding that $S = \sum_{n=1}^{\infty} A_n$ indeed separates any $Q \in \mathcal{P}(\Omega)$ and thus is an element of $\Sigma^{*}$. The latter supports the claim that $\Sigma^{*}$ is a Dynkin system and, consequently, that $\Sigma^{*}$ is a $\sigma$-algebra.

d) We show that $\mu^{*}_{0}$ is a measure on $\Sigma^{*}$. Substituting the set $S = \sum_{n=1}^{\infty} A_n$ for $Q$ in (2.3g), we have
$$\mu^{*}_{0}\Big(\sum_{n=1}^{\infty} A_n\Big) \ge \sum_{k=1}^{\infty} \mu^{*}_{0}(A_k),$$
which, due to $\sigma$-subadditivity of $\mu^{*}$, leads to the strict equality and thereby to $\sigma$-additivity of $\mu^{*}_{0}$. Therefore, we have proved that $\operatorname{Res}_{\Sigma^{*}} \mu^{*}$, denoted by $\mu^{*}_{0}$, is a measure. The proof is, therefore, completed. □

2.4 Examples.
(i) Let
$$\Omega = \{a,b,c\},\ A = \{a\},\ A^c = \{b,c\},\ P = \{b\},\ Q = \{c\},\ R = \{a,b\},\ S = \{a,c\}.$$
Define the following set function $\mu^{*}$ on $\mathcal{P}(\Omega)$:
$$\mu^{*}(\emptyset) = 0,\quad \mu^{*}(\Omega) = 4,\quad \mu^{*}(A) = 1,\quad \mu^{*}(A^c) = \mu^{*}(R) = \mu^{*}(S) = 3,\quad \mu^{*}(P) = \mu^{*}(Q) = 2.$$
One can easily verify that $\mu^{*}$ is an outer measure on $\mathcal{P}(\Omega) = \{\emptyset, \Omega, A, A^c, P, Q, R, S\}$, as it satisfies axioms (2.1a-2.1c), but $\mu^{*}$ is not a measure, because it is not additive: $\mu^{*}(P) + \mu^{*}(Q) = 4 \ne 3 = \mu^{*}(A^c)$. We can see that only the sets $\emptyset$, $\Omega$, $A$, and $A^c$ $\mu^{*}$-separate all subsets of $\Omega$ and, consequently, $\{\emptyset, \Omega, A, A^c\}$ is the $\sigma$-algebra $\Sigma^{*}$. Clearly, $\mu^{*}_{0}$, as the restriction of $\mu^{*}$ to $\Sigma^{*}$, is a measure.

(ii) Let $\Omega$ be an infinite set. Define the set function $\gamma$ on $\mathcal{P}(\Omega)$ by $\gamma(Q) = 0$ if $Q$ is a finite set and $\gamma(Q) = 1$ if $Q$ is infinite. Let $\{\{\omega_n\} : n = 1,2,\ldots\}$ be a sequence of distinct singletons and let $Q = \bigcup_{n=1}^{\infty} \{\omega_n\}$. Then,
$$\sum_{n=1}^{\infty} \gamma(\{\omega_n\}) = 0,$$
while $\gamma(Q) = 1$. Thus, $\gamma$ is not $\sigma$-subadditive and not an outer measure. □
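The claims of Example 2.4 (i) can be checked by brute force, since the power set of a three-point set has only eight elements. The following is a minimal sketch using the values of the outer measure listed in the example; the names `subsets`, `MU_STAR`, `is_outer_measure`, and `is_measurable` are illustrative and not from the text. On a finite set, $\sigma$-subadditivity reduces to subadditivity for pairs of sets, which is what the check below uses.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Values of the outer measure from Example 2.4 (i), keyed by subset.
MU_STAR = {
    frozenset(): 0,
    frozenset("a"): 1,      # A
    frozenset("b"): 2,      # P
    frozenset("c"): 2,      # Q
    frozenset("ab"): 3,     # R
    frozenset("ac"): 3,     # S
    frozenset("bc"): 3,     # complement of A
    frozenset("abc"): 4,    # Omega
}


def is_outer_measure():
    """Check axioms (2.1a)-(2.1c); pairwise subadditivity suffices here."""
    if MU_STAR[frozenset()] != 0:
        return False
    for P in subsets(OMEGA):
        for Q in subsets(OMEGA):
            if P <= Q and MU_STAR[P] > MU_STAR[Q]:
                return False        # monotonicity fails
            if MU_STAR[P | Q] > MU_STAR[P] + MU_STAR[Q]:
                return False        # subadditivity fails
    return True


def is_measurable(M):
    """Caratheodory criterion (2.2): M separates every subset Q of OMEGA."""
    return all(MU_STAR[Q] == MU_STAR[Q & M] + MU_STAR[Q - M]
               for Q in subsets(OMEGA))


measurable = {M for M in subsets(OMEGA) if is_measurable(M)}
print(is_outer_measure())
print(sorted(sorted(M) for M in measurable))
```

The set `measurable` plays the role of $\Sigma^{*}$. The sets $P$ and $Q$ fail criterion (2.2) already on the test set $A^c$, since $\mu^{*}(A^c) = 3$ while $\mu^{*}(P) + \mu^{*}(Q) = 4$.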
Recall that a restriction of a function $[X,Y,f]$ is a function $[X_0,Y_0,f_0]$ defined on a contracted domain $X_0 \subseteq X$ with $f = f_0$ on $X_0$ and $Y_0 \subseteq Y$. (In notation, $f_0 = \operatorname{Res}_{X_0} f$.) From Theorem 2.3, we learned that the set function $[\Sigma^{*},[0,\infty],\mu^{*}_{0}]$ is a restriction of an outer measure $[\mathcal{P}(\Omega),[0,\infty],\mu^{*}]$. If $\overline{X}$ and $\overline{Y}$ are supersets of $X$ and $Y$, respectively, a function $[\overline{X},\overline{Y},\overline{f}]$ is called an extension of $f$ (from $X$ to $\overline{X}$) if $[X,Y,f]$ is the restriction of $\overline{f}$ to $X$. (In notation, $\overline{f} = \operatorname{Ext}_{\overline{X}} f$.) We will apply this notion to extend a set function $\gamma$ defined on a collection $\mathcal{G}$ of subsets of $\Omega$ to a set function $\overline{\gamma}$ on an expanded family $g(\mathcal{G})$ of subsets of $\Omega$. For instance, in Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda^0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. We can extend the Lebesgue elementary content $\lambda^0$ to a (unique) content $\lambda^c$ on $\mathcal{R}(\mathcal{S})$ (see Problem 2.2), which turns out to be a premeasure on $\mathcal{R}(\mathcal{S})$ (verified in Theorem 3.1). The primary goal in this section is to construct an extension of a set function, such as a premeasure, given on a ring, to a measure on the smallest $\sigma$-algebra generated by this ring. Although this is the main objective, other extensions, such as the "completion" of a measure, will also be a focus of our discussions.

2.5 Definitions.
(i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A set $N \in \Sigma$ is called a $\mu$-null set (or just null set) if $\mu(N) = 0$. We denote the set of all $\mu$-null sets by $\mathcal{N}_\mu$. A set $E$ is called $\mu$-negligible (or just negligible), if there is a measurable null superset of $E$. The measure space is called complete, if for each null set $N \in \mathcal{N}_\mu$, $\mathcal{P}(N) \subseteq \Sigma$, i.e., if all negligible sets are measurable.

(ii) Consider a measure space $(\Omega,\Sigma,\mu)$. Let $\overline{\Sigma}$ be the collection of all sets of type $A \cup M$, where $A \in \Sigma$ and $M$ is any negligible set. According to Problem 2.8, $\overline{\Sigma}$ is a $\sigma$-algebra. We extend $\mu$ to $\overline{\mu}$ on $\overline{\Sigma}$ by setting

$$\overline{\mu}(A \cup M) = \overline{\mu}(A) = \mu(A).$$

The extension $(\overline{\Sigma},\overline{\mu})$ of $(\Sigma,\mu)$, or just $\overline{\mu}$, is then said to be the completion of measure $\mu$ and, due to Problem 2.7, $(\Omega,\overline{\Sigma},\overline{\mu})$ is a measure space, called the completion of measure space $(\Omega,\Sigma,\mu)$. □

2.6 Example. Let $\Omega = \mathbb{R}$, $\Sigma = \{A \in \mathcal{P}(\mathbb{R}) : \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$, which is a $\sigma$-algebra on $\Omega$ (see Example 1.2 (vii), Chapter 4), and let $\varepsilon_1$ be the point mass. Both $A = \{\tfrac{1}{n},\ n = 1,2,\ldots\}$ and $A^c$ are elements of $\Sigma$ and $\varepsilon_1(A^c) = 0$. Obviously, $E = [2,\infty)$, as a subset of $A^c$, is negligible, but not measurable. Therefore, the measure space $(\mathbb{R},\Sigma,\varepsilon_1)$ is not complete. (See a more general case in Problem 2.14.) □
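A finite analogue of Definition 2.5 (ii) and Example 2.6 may help fix the idea of completion. The sketch below is an illustration under stated assumptions, not the book's construction: the three-point space, the point mass at `a`, and the helper name `completion` are all hypothetical. It forms every set $A \cup M$ with $A$ measurable and $M$ negligible and checks that the assigned value does not depend on the representation.

```python
from itertools import combinations


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def completion(sigma, mu):
    """Return (sigma_bar, mu_bar) for a finite measure space.

    sigma: list of frozensets forming a sigma-algebra;
    mu: dict giving the measure of each member of sigma.
    """
    null_sets = [n for n in sigma if mu[n] == 0]
    # Negligible sets are all subsets of measurable null sets.
    negligible = {m for n in null_sets for m in powerset(n)}
    mu_bar = {}
    for a in sigma:
        for m in negligible:
            value = mu_bar.setdefault(a | m, mu[a])
            # The value must not depend on the representation A | M.
            assert value == mu[a]
    return set(mu_bar), mu_bar


# Hypothetical example: point mass at 'a' on a sigma-algebra that
# does not contain {b} or {c}, so the space is not complete.
SIGMA = [frozenset(), frozenset("a"), frozenset("bc"), frozenset("abc")]
MU = {SIGMA[0]: 0, SIGMA[1]: 1, SIGMA[2]: 0, SIGMA[3]: 1}

SIGMA_BAR, MU_BAR = completion(SIGMA, MU)
```

In this toy case the completed collection is the whole power set, in line with the statement of Problem 2.14 for point masses.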
The proposition below is a paradigm of a complete measure space.

2.7 Proposition. The restriction $\mu_0^*$ of an outer measure $\mu^*$ to the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable subsets of $\Omega$ is complete and $(\Omega,\Sigma^*,\mu_0^*)$ is a complete measure space.

Proof. Since $\mu^*$ is defined on the whole $\mathcal{P}(\Omega)$, for any $\mu^*$-negligible subset $N \subseteq \Omega$, due to (2.1b), $\mu^*(N) = 0$ and, therefore, it is sufficient to show that $N$ is $\mu^*$-measurable. Let $Q \subseteq \Omega$. Due to monotonicity of outer measure, $\mu^*(Q \cap N) = 0$ and $\mu^*(Q \cap N^c) \le \mu^*(Q)$ and this, along with (2.3), yields

$$\mu^*(Q) = \mu^*(Q \cap N) + \mu^*(Q \cap N^c)$$

and, hence, that $N \in \Sigma^*$. □

The following will be a construction of an outer measure by an arbitrary set function $\gamma$ defined on an arbitrary subcollection of sets $\mathcal{G} \subseteq \mathcal{P}(\Omega)$. As usual, we only assume that $\mathcal{G}$ contains the empty set and that $\gamma$, as a set function, is such that $\gamma(\emptyset) = 0$. This construction lies at the basis of the Caratheodory extension of the set function $\gamma$ to a measure on the $\sigma$-algebra $\Sigma(\mathcal{G})$. For any subset $Q \subseteq \Omega$, denote by $\mathfrak{C}_Q(\mathcal{G})$ the collection of all at most countable covers of the set $Q$ by elements of $\mathcal{G}$. (Unless there is another subcollection, besides $\mathcal{G}$, under consideration, we will for brevity drop $\mathcal{G}$ in $\mathfrak{C}_Q(\mathcal{G})$.) Therefore, if $\mathfrak{C}_Q \neq \emptyset$, for any $\{G_n\} \in \mathfrak{C}_Q$, we have $Q \subseteq \bigcup_{n=1}^{\infty} G_n$.

2.8 Proposition. The set function $\mu^*$ defined on $\mathcal{P}(\Omega)$ as

$$\mu^*(Q) = \begin{cases} \inf\left\{ \sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathfrak{C}_Q \right\}, & \mathfrak{C}_Q \neq \emptyset, \\ \infty, & \mathfrak{C}_Q = \emptyset \end{cases} \tag{2.8}$$

is an outer measure.

Proof. We need to verify the above properties (2.1a-2.1c) of $\mu^*$ as an outer measure:

a) Since $\emptyset \in \mathcal{G}$ and $\gamma(\emptyset) = 0$, it follows that $\mu^*(\emptyset) = 0$.

b) We assume that both $\mu^*(A)$ and $\mu^*(B)$ are finite, since otherwise, the proof is obvious. If $A \subseteq B$, then $\mathfrak{C}_B \subseteq \mathfrak{C}_A$ and we can reach on $\mathfrak{C}_A$ a possibly smaller infimum than that on $\mathfrak{C}_B$. Therefore, $\mu^*(A) \le \mu^*(B)$.
c) Let $\{Q_i\} \subseteq \mathcal{P}(\Omega)$ and $Q = \bigcup_{i=1}^{\infty} Q_i$. If for at least one $i$, $\mathfrak{C}_{Q_i} = \emptyset$, then also $\mathfrak{C}_Q = \emptyset$ and both sides of the subadditivity inequality are infinite. Otherwise, assuming each $\mu^*(Q_i)$ is finite, from (2.8) and by the definition of an infimum, it follows that for each $i$ and $\varepsilon 2^{-i}$ there is a cover $\{G_{in},\ n = 1,2,\ldots\} \in \mathfrak{C}_{Q_i}$ such that

$$\sum_{n=1}^{\infty} \gamma(G_{in}) < \mu^*(Q_i) + \varepsilon 2^{-i}.$$

Now, clearly $\{G_{in},\ i,n = 1,2,\ldots\} \in \mathfrak{C}_Q$. Thus,

$$\mu^*(Q) \le \sum_{i=1}^{\infty} \sum_{n=1}^{\infty} \gamma(G_{in}) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon,$$

which, since $\varepsilon$ is arbitrary, proves $\sigma$-subadditivity. □

We will call the couple $(\mathcal{G},\gamma)$ (a subset $\mathcal{G}$ of $\mathcal{P}(\Omega)$ and a set function $\gamma$ on $\mathcal{G}$) a formatter of the outer measure $\mu^*$ defined by (2.8). As has been shown, the formatter and, subsequently, the outer measure induce the $\sigma$-algebra $\Sigma^*$, on which $\mu^*$ is a complete measure. When constructing a measure space $(\Omega,\Sigma^*,\mu_0^*)$ by $(\mathcal{G},\gamma)$, the major goal is to extend $\gamma$ from $\mathcal{G}$ to a measure, say $\mu$, acting on the smallest $\sigma$-algebra $\Sigma(\mathcal{G})$ generated by $\mathcal{G}$. This can be achieved by restricting $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$, given that $(\Sigma^*,\mu_0^*)$ itself is an extension of $(\mathcal{G},\gamma)$. The latter, however, is not guaranteed by the above construction, unless we impose some restrictions on the formatter $(\mathcal{G},\gamma)$, for even though $(\mathcal{G},\gamma)$ produces $(\Omega,\Sigma^*,\mu_0^*)$, the elements of $\mathcal{G}$ need not all be $\mu^*$-measurable. In other words, $\mathcal{G}$ need not be a subset of $\Sigma^*$. In addition, $\mu_0^*$ need not coincide with $\gamma$ on $\mathcal{G}$. For example, if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then, according to Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $\sum_{n=1}^{\infty} C_n$ is a decomposition of $G$ and

$$\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n).$$

Hence, in order that $\mu^*(G) = \gamma(G)$, $\gamma$ must be $\sigma$-additive on $\mathcal{G}$, which, in general, it is not.

Consequently, we call $(\Sigma^*,\mu_0^*)$ (produced by $(\mathcal{G},\gamma)$ in (2.8)) the complete Caratheodory extension of $(\mathcal{G},\gamma)$ if $\mathcal{G} \subseteq \Sigma^*$ and $\mathrm{Res}_{\mathcal{G}}\, \mu_0^* = \gamma$. If $(\Sigma^*,\mu_0^*)$ is the complete Caratheodory extension of $(\mathcal{G},\gamma)$, then the formatter $(\mathcal{G},\gamma)$ is said to be extendible and the corresponding restriction of $(\Sigma^*,\mu_0^*)$ to $(\Sigma(\mathcal{G}),\mu)$ is referred to as the Caratheodory extension of $(\mathcal{G},\gamma)$.

As mentioned above, one of the most important questions is what the formatter $(\mathcal{G},\gamma)$ should be in order to be extendible and, consequently, generate the Caratheodory extension. By now, we have a fairly large choice of systems of sets and set functions on them, ranging from semi-rings to $\sigma$-algebras and from elementary contents to measures. The idea is, however, to select a possibly more rudimentary formatter $(\mathcal{G},\gamma)$, which is tame and suited to most common practical applications and constructions and such that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. In particular, this means that the elements of $\mathcal{G}$ have to be $\mu^*$-measurable. The theorem below, which is a crucial step in the whole extension procedure, implies that $(\mathcal{G},\gamma)$ can be a ring and premeasure to serve as a reasonable extendible formatter.

2.9 Theorem.
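(i) Let $(\mathcal{G},\gamma)$ be a semi-ring and elementary content, respectively, in $\Omega$, which produce the outer measure $\mu^*$ and the $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable subsets of $\Omega$. Then $\mathcal{G} \subseteq \Sigma^*$.

(ii) If, in addition, $\gamma$ is $\sigma$-additive on $\mathcal{G}$, then $\gamma = \mathrm{Res}_{\mathcal{G}}\, \mu^*$ and therefore $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$.

Proof.

(i) We have to show that $\mathcal{G} \subseteq \Sigma^*$, i.e., that any element $G \in \mathcal{G}$ $\mu^*$-separates all subsets of $\Omega$. Take any subset $Q \subseteq \Omega$ with $\mathfrak{C}_Q \neq \emptyset$, since, otherwise, the proof would be trivial, and let $C = \{C_n\}$ be any (countable) cover of $Q$ from $\mathfrak{C}_Q$. For a $G \in \mathcal{G}$ and $C_n \in C$,

$$C_n = (C_n \cap G) + (C_n \setminus G). \tag{2.9}$$

Since $\mathcal{G}$ is a semi-ring, $C_n \cap G$ is an element of $\mathcal{G}$ and $C_n \setminus G$ can be represented as a finite union of pairwise disjoint elements of $\mathcal{G}$, say $\sum_{i=1}^{N_n} S_{in}$. Consequently, (2.9) can be rewritten as

$$C_n = (C_n \cap G) + \sum_{i=1}^{N_n} S_{in}$$

and, by finite additivity of $\gamma$,

$$\gamma(C_n) = \gamma(C_n \cap G) + \sum_{i=1}^{N_n} \gamma(S_{in}). \tag{2.9a}$$

Now, suppose $\sum_{n=1}^{\infty} \gamma(C_n) < \infty$. Then, summing up all equations in (2.9a) over $n$ gives

$$\sum_{n=1}^{\infty} \gamma(C_n) = \sum_{n=1}^{\infty} \gamma(C_n \cap G) + \sum_{n=1}^{\infty} \gamma(S_n), \tag{2.9b}$$

where $\{S_n\}$ is the reordered sequence $\{S_{in},\ i = 1,\ldots,N_n,\ n = 1,2,\ldots\}$. As $Q = (Q \cap G) + (Q \cap G^c)$, obviously, $\{C_n \cap G\} \subseteq \mathcal{G}$ and $\{S_n\} \subseteq \mathcal{G}$ are covers of $Q \cap G$ and $Q \cap G^c$, respectively. Consequently,

$$\mu^*(Q \cap G) \le \sum_{n=1}^{\infty} \gamma(C_n \cap G) \quad\text{and}\quad \mu^*(Q \cap G^c) \le \sum_{n=1}^{\infty} \gamma(S_n),$$

and then by (2.9b),

$$\mu^*(Q \cap G) + \mu^*(Q \cap G^c) \le \sum_{n=1}^{\infty} \gamma(C_n).$$

Since this inequality holds for every cover $C$ of $Q$, it should also hold for the infimum to yield

$$\mu^*(Q \cap G) + \mu^*(Q \cap G^c) \le \mu^*(Q). \tag{2.9c}$$

If $\sum_{n=1}^{\infty} \gamma(C_n) = \infty$, then the equation symbol in (2.9b) must be replaced by $\ge$ to yield (2.9c) again. The inverse inequality is due to (2.3). Therefore, $G$ separates all subsets of $\Omega$ and, consequently, $\mathcal{G} \subseteq \Sigma^*$.

(ii) By Problem 2.2, for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$. Hence, if $\gamma$ is $\sigma$-additive, $\mu^*$ coincides with $\gamma$ on $\mathcal{G}$. These two facts warrant that $(\Sigma^*,\mu_0^*)$ is an extension of $(\mathcal{G},\gamma)$. □

2.10 Remarks.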
(i) One should bear in mind that, while $(\mathcal{G},\gamma)$ can be an extendible formatter for the outer measure $\mu^*$, $\mathcal{G}$ is not really a generator for $\Sigma^*$, as the latter need not be the smallest $\sigma$-algebra containing $\mathcal{G}$. We would like to make a clear distinction between these two terms. Recall that a family $\mathcal{G} \subseteq \mathcal{P}(\Omega)$ is said to be a generator of another family $\mathcal{G}_0$ ($\mathcal{G} \subseteq \mathcal{G}_0 \subseteq \mathcal{P}(\Omega)$) with a property P, if $\mathcal{G}_0$ is the intersection of all supercollections of $\mathcal{G}$ on each of which property P holds. In our case, $\Sigma^*$ will eventually contain the smallest $\sigma$-algebra $\Sigma = \Sigma(\mathcal{G})$ and, in general, $\mu^*$ needs to be further restricted to this $\sigma$-algebra. From Theorem 2.9, we conclude that any elementary content $\gamma$ on a semi-ring $\mathcal{G}$, which is $\sigma$-additive, can be extended to a measure $\mu = \mathrm{Res}_{\Sigma(\mathcal{G})}\, \mu^*$ (acting on the smallest $\sigma$-algebra $\Sigma$ generated by $\mathcal{G}$). In other words, if $\gamma$ is a $\sigma$-additive elementary content on a semi-ring $\mathcal{G}$, then there exists at least one extension, namely, Caratheodory's extension.

(ii) From the proof of Theorem 2.9, it is obvious that a semi-ring with a $\sigma$-additive elementary content on it is one of the most economical systems good for the Caratheodory extension. However, it is often more prudent to work with premeasures on rings. In practice, to start with, one can first extend a semi-ring with an elementary content to the smallest ring with the content using the procedure of Theorem 2.5 (Chapter 4) and Proposition 2.11 below.

(iii) Another reasonable question arises: in how many different ways can a formatter $(\mathcal{G},\gamma)$ be extended to a measure on $\Sigma(\mathcal{G})$? Theorem 2.13 below states that with some relatively minor restriction (given in Remark 2.12) on a set function $\gamma$, the uniqueness of Caratheodory's extension is guaranteed. □

We will begin with one useful extension of an elementary content on a semi-ring to a content on the smallest ring containing the semi-ring.

2.11 Proposition. There is exactly one content on $\mathcal{R}(\mathcal{S})$, which coincides with the elementary content on $\mathcal{S}$. (See Problem 2.3.)

2.12 Remark. In Definition 1.1 (vii) we introduced the notion of $\sigma$-finiteness of a set function. Sometimes it is more convenient to use another definition of $\sigma$-finiteness, which is equivalent to 1.1 (vii) for a large class of set functions. Namely, the condition of having a monotone increasing sequence $\{G_n\} \uparrow \Omega$ from $\mathcal{G}$ with $\gamma(G_n) < \infty$ for all $n$ can be replaced by the equivalent condition that there is an at most countable partition $\{\Omega_1,\Omega_2,\ldots\} \subseteq \mathcal{G}$ of $\Omega$ ($= \sum_{n=1}^{\infty} \Omega_n$) such that $\gamma(\Omega_n) < \infty$ for all $n$. For instance, rings with contents clearly provide a basis for such equivalence. For a semi-ring with elementary content, the first definition yields the second one, as we can arrange from $\{G_n\} \uparrow \Omega$ a countable decomposition; the converse is not true.

Another related notion we are going to use in the sequel is $\sigma$-finiteness of a set. Let $(\Omega,\Sigma,\mu)$ be a measure space. A measurable set $A$ is said to be $\sigma$-finite if $\mathrm{Res}_{\Sigma \cap A}\, \mu$ is $\sigma$-finite. □

2.13 Theorem. Let $\mathcal{G}$ be a $\cap$-stable generator of the $\sigma$-algebra $\Sigma(\mathcal{G})$ in $\Omega$ such that $\mathcal{G}$ contains a monotone increasing sequence $\{B_n\} \uparrow \Omega$.
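Let $\mu_1$ and $\mu_2$ be two measures on $\Sigma(\mathcal{G})$, which are $\sigma$-finite on $\{B_n\}$ and which coincide on $\mathcal{G}$. Then $\mu_1 = \mu_2$ on $\Sigma(\mathcal{G})$.

Proof. Let $A \in \mathcal{G}$ be such that $\mu_1(A) = \mu_2(A) < \infty$ and let $\mathcal{D}_A = \{B \in \Sigma : \mu_1(A \cap B) = \mu_2(A \cap B)\}$. We show that $\mathcal{D}_A$ is a Dynkin system:

a) $\Omega \in \mathcal{D}_A$, since $\mu_1(A \cap \Omega) = \mu_1(A) = \mu_2(A) = \mu_2(A \cap \Omega)$.

b) Let $D \in \mathcal{D}_A$. Then $A \cap D^c = A \setminus D = A \setminus (A \cap D)$, which implies that

$$\mu_1(A \cap D^c) = \mu_1(A) - \mu_1(A \cap D) = \mu_2(A) - \mu_2(A \cap D) = \mu_2(A \cap D^c),$$

and this leads to $D^c \in \mathcal{D}_A$.

c) Let $\{D_n\}$ be a sequence of disjoint sets from $\mathcal{D}_A$. Then

$$\mu_1\Big(A \cap \sum_{n=1}^{\infty} D_n\Big) = \mu_1\Big(\sum_{n=1}^{\infty} A \cap D_n\Big) = \sum_{n=1}^{\infty} \mu_1(A \cap D_n) = \sum_{n=1}^{\infty} \mu_2(A \cap D_n) = \mu_2\Big(\sum_{n=1}^{\infty} A \cap D_n\Big) = \mu_2\Big(A \cap \sum_{n=1}^{\infty} D_n\Big).$$

Hence $\sum_{n=1}^{\infty} D_n \in \mathcal{D}_A$, and therefore $\mathcal{D}_A$ is a Dynkin system. Since obviously $\mathcal{G} \subseteq \mathcal{D}_A$, it follows that $\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) \subseteq \mathcal{D}_A$. Also, since $\mathcal{G}$ is $\cap$-stable, it follows that $\mathcal{D}(\mathcal{G})$ is a $\sigma$-algebra. Hence, we have

$$\mathcal{G} \subseteq \mathcal{D}(\mathcal{G}) = \Sigma(\mathcal{G}) \subseteq \mathcal{D}_A \subseteq \Sigma(\mathcal{G}),$$

leading to

$$\mathcal{D}(\mathcal{G}) = \Sigma(\mathcal{G}) = \mathcal{D}_A.$$

In particular, we proved that $\mu_1(A \cap B) = \mu_2(A \cap B)$ for all $B \in \Sigma(\mathcal{G})$. Now let $\{B_n\}$ be a monotone increasing sequence of sets from $\mathcal{G}$ convergent to $\Omega$. Thus $\Sigma(\mathcal{G}) = \mathcal{D}_{B_n}$. Then, for all $n = 1,2,\ldots$ and all $B \in \Sigma(\mathcal{G})$, $\mu_1(B_n \cap B) = \mu_2(B_n \cap B)$. Since $\{B_n \cap B\} \uparrow B$ and since $\mu_i(B \cap B_n) < \infty$, by Lemma 1.6,

$$\mu_1(B) = \lim_{n \to \infty} \mu_1(B \cap B_n) = \lim_{n \to \infty} \mu_2(B \cap B_n) = \mu_2(B).$$ □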
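To see why $\cap$-stability of the generator matters in Theorem 2.13, one can look at a small finite case. The sketch below is an illustration that is not taken from the text: the four-point set, the family `GENERATOR`, and the two weight assignments are assumptions chosen for the purpose. The family generates the whole power set but is not $\cap$-stable, and two measures agree on it while differing on $\{2\}$.

```python
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})
# Hypothetical generator: {1,2} & {2,3} = {2} is missing, so the
# family is not closed under intersections.
GENERATOR = [frozenset({1, 2}), frozenset({2, 3}), OMEGA]


def measure_from_weights(weights):
    """Return the measure A -> sum of point weights over A."""
    return lambda a: sum(weights[x] for x in a)


MU1 = measure_from_weights({1: 1, 2: 0, 3: 1, 4: 0})
MU2 = measure_from_weights({1: 0, 2: 1, 3: 0, 4: 1})


def is_intersection_stable(family):
    """Check that pairwise intersections stay inside the family."""
    fam = set(family)
    return all((a & b) in fam for a, b in combinations(fam, 2))


AGREE_ON_GENERATOR = all(MU1(g) == MU2(g) for g in GENERATOR)
```

Both measures are finite and agree on every member of `GENERATOR`, yet `MU1(frozenset({2}))` and `MU2(frozenset({2}))` differ, so the conclusion of the theorem fails once the stability hypothesis is dropped.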
Now, by means of Theorem 2.13 we easily deduce the following significant statement.

2.14 Corollary. Let $\gamma$ be a $\sigma$-finite and $\sigma$-additive elementary content on a semi-ring $\mathcal{G}$. Then the Caratheodory extension of $\gamma$ to a measure on the $\sigma$-algebra $\Sigma(\mathcal{G})$ is a unique extension. □

The lemmas below will be used for various purposes and, in particular, will lead to a relationship between the completion $(\Omega,\overline{\Sigma},\overline{\mu})$ of a measure space $(\Omega,\Sigma,\mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

2.15 Lemma. Let $(\Omega,\mathcal{G},\gamma)$ be an extendible formatter of the outer measure $\mu^*$, and $\mathcal{G}_\sigma$ the collection of all at most countable unions of elements from $\mathcal{G}$. Then, for each $Q \subseteq \Omega$ and every $\varepsilon > 0$, there is a set $G_\sigma \in \mathcal{G}_\sigma$ such that $G_\sigma \supseteq Q$ and

$$\mu^*(G_\sigma) \le \mu^*(Q) + \varepsilon. \tag{2.15}$$

Proof. Because $\mu^*$ is generated by $(\mathcal{G},\gamma)$,

$$\mu^*(Q) = \begin{cases} \inf\left\{ \sum_{n=1}^{\infty} \gamma(G_n) : \{G_n\} \in \mathfrak{C}_Q \right\}, & \mathfrak{C}_Q \neq \emptyset, \\ \infty, & \mathfrak{C}_Q = \emptyset. \end{cases} \tag{2.15a}$$

If $\mu^*(Q) = \infty$, then inequality (2.15) holds trivially. Suppose $\mu^*(Q) < \infty$. Then, by definition of an infimum and from (2.15a), for every $\varepsilon > 0$, there is $\{G_n\} \in \mathfrak{C}_Q$ such that, for every $k$,

$$\mu^*(Q) + \varepsilon > \sum_{n=1}^{\infty} \gamma(G_n) \ge \sum_{n=1}^{k} \gamma(G_n) = \sum_{n=1}^{k} \mu^*(G_n) \ge \mu^*\Big(\bigcup_{n=1}^{k} G_n\Big). \tag{2.15b}$$

Now, we make use of the fact that $(\mathcal{G},\gamma)$ is an extendible formatter. This implies that not only $\mathcal{G} \subseteq \Sigma^*$, but also $\mathcal{G}_\sigma \subseteq \Sigma^*$. Since $\bigcup_{n=1}^{k} G_n$ is monotone increasing in $k$ and $\mu^*\big(\bigcup_{n=1}^{k} G_n\big) < \infty$ for all $k$, by continuity from below (Lemma 1.6),

$$\lim_{k \to \infty} \mu^*\Big(\bigcup_{n=1}^{k} G_n\Big) = \mu^*\Big(\bigcup_{n=1}^{\infty} G_n\Big).$$

Passing to the limit in (2.15b), which holds true for all $k$, we prove (2.15) with $G_\sigma = \bigcup_{n=1}^{\infty} G_n$ being the desired set. □
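Lemma 2.16. Let $\mu^*$ be an outer measure, $\Sigma^*$ the $\sigma$-algebra of all $\mu^*$-measurable sets, and $A$ any subset of $\Omega$. If there is a $\mu^*$-measurable set $B$ such that $B \supseteq A$ and $\mu^*(B \setminus A) = 0$, then $A \in \Sigma^*$.

Proof. Since $B \in \Sigma^*$, it should $\mu^*$-separate $Q$:

$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \cap B^c). \tag{2.16}$$

Now, because $A \subseteq B$, we can easily show that

$$Q \cap A^c = (Q \cap B^c) + (Q \cap (B \setminus A)). \tag{2.16a}$$

From $Q \cap (B \setminus A) \subseteq B \setminus A$, it follows that $\mu^*(Q \cap (B \setminus A)) = 0$. From (2.16a), by monotonicity and subadditivity,

$$\mu^*(Q \cap B^c) \le \mu^*(Q \cap A^c) \le \mu^*(Q \cap B^c) + \mu^*(Q \cap (B \setminus A)) = \mu^*(Q \cap B^c).$$

Consequently, we can replace $\mu^*(Q \cap B^c)$ in (2.16) by $\mu^*(Q \cap A^c)$. Finally, noticing that $Q \cap B \supseteq Q \cap A$, we have that

$$\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap A^c),$$

and this is the desired inequality. □

Lemma 2.17. Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G},\gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, $\mu_0^*$ be $\mathrm{Res}_{\Sigma^*}\, \mu^*$, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$ such that $\mu_0^*(A^*) < \infty$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu_0^*(B \setminus A^*) = 0$.

Proof. Since $\mu_0^*(A^*) < \infty$, $\mathfrak{C}_{A^*} \neq \emptyset$. From Lemma 2.15, for every $\varepsilon > 0$, say $\tfrac{1}{k}$, there is a $G_\sigma^k \supseteq A^*$ such that $\mu_0^*(G_\sigma^k) \le \mu_0^*(A^*) + \tfrac{1}{k}$. The latter yields that $\mu_0^*(G_\sigma^k \setminus A^*) \le \tfrac{1}{k}$. Obviously, $\bigcap_{k=1}^{m} G_\sigma^k$ is still a superset of $A^*$ and since $\bigcap_{k=1}^{m} G_\sigma^k \subseteq G_\sigma^m$, it follows that

$$\mu_0^*(D_m) \le \tfrac{1}{m}, \tag{2.17}$$

where $D_m = \big(\bigcap_{k=1}^{m} G_\sigma^k\big) \setminus A^* \in \Sigma^*$. The sequence $\{D_m\}$ is clearly monotone nonincreasing and $\mu_0^*(D_1) < \infty$. Therefore, by continuity from above (see Theorem 1.7 (i)) of $\mu_0^*$ and because of (2.17),

$$\lim_{m \to \infty} \mu_0^*(D_m) = \mu_0^*\Big\{\Big(\bigcap_{k=1}^{\infty} G_\sigma^k\Big) \setminus A^*\Big\} = 0.$$

The set $\bigcap_{k=1}^{\infty} G_\sigma^k$ obviously meets the requirements on the set $B$ "promised" in the statement and we are done with the proof. □

Corollary 2.18. Let $\mu^*$ be the outer measure generated by a $\sigma$-finite extendible formatter $(\mathcal{G},\gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Then, for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \supseteq A^*$ and $\mu^*(B \setminus A^*) = 0$.

Proof. Since $(\mathcal{G},\gamma)$ is $\sigma$-finite, there is a partition $\{H_1,H_2,\ldots\} \subseteq \mathcal{G}$ of $\Omega$ such that $\gamma(H_k) < \infty$. If $A^* \in \Sigma^*$, then $\{A_k^* = A^* \cap H_k,\ k = 1,2,\ldots\}$ is a $\mu^*$-measurable partition of $A^*$, with $\mu^*(A_k^*) < \infty$ for every $k$, and to each of which we can apply Lemma 2.17 and have a set $B_k \in \Sigma(\mathcal{G})$, with $B_k \supseteq A_k^*$ and $\mu^*(B_k \setminus A_k^*) = 0$.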
Notice that since

$$\Big(\bigcup_{k=1}^{\infty} B_k\Big) \setminus \Big(\sum_{n=1}^{\infty} A_n^*\Big) = \Big(\bigcup_{k=1}^{\infty} B_k\Big) \cap \Big(\bigcap_{n=1}^{\infty} (A_n^*)^c\Big) \subseteq \bigcup_{k=1}^{\infty} (B_k \setminus A_k^*),$$

it holds true that

$$\mu^*\Big[\Big(\bigcup_{k=1}^{\infty} B_k\Big) \setminus \Big(\sum_{n=1}^{\infty} A_n^*\Big)\Big] \le \mu^*\Big[\bigcup_{k=1}^{\infty} (B_k \setminus A_k^*)\Big] \le \sum_{n=1}^{\infty} \mu^*(B_n \setminus A_n^*) = 0.$$

The statement follows after setting $B = \bigcup_{k=1}^{\infty} B_k$ ($\in \Sigma(\mathcal{G})$). □

Now, with the aid of the above propositions, we can finally answer the question about the relationship between the completion $(\Omega,\overline{\Sigma},\overline{\mu})$ of a measure space $(\Omega,\Sigma,\mu)$ and the $\sigma$-algebra $\Sigma^*$ of all $\mu^*$-measurable sets.

2.19. Theorem. Let $(\mathcal{G},\gamma)$ be an extendible formatter for $(\Omega,\Sigma^*,\mu_0^*)$ and a generator for the measure space $(\Omega, \Sigma = \sigma(\mathcal{G}), \mu = \mathrm{Res}_{\Sigma}\, \mu^*)$ whose completion is $(\Omega,\overline{\Sigma},\overline{\mu})$.

(i) Then, $\overline{\Sigma} \subseteq \Sigma^*$.

(ii) If $(\mathcal{G},\gamma)$ is $\sigma$-finite, then $\overline{\Sigma} = \Sigma^*$ and $\overline{\mu} = \mu_0^*$.

Proof.

(i) Obviously, $\overline{\Sigma} \subseteq \Sigma^*$ if and only if any element $\overline{A}$ of $\overline{\Sigma}$, which is of the form $A \cup N$, where $A \in \Sigma$ and $N$ is $\mu$-negligible, is $\mu^*$-measurable. According to Lemma 2.16, $A \cup N$ would be $\mu^*$-measurable, if there is a $\mu^*$-measurable set $B$ such that $B \supseteq A \cup N$ and $\mu^*(B \setminus (A \cup N)) = 0$. By Definition 2.5 (i) of a $\mu$-negligible set, $N$ must have a $\Sigma$-measurable $\mu$-null superset, say $N_0$. (Note that even though, by Problem 2.10, $\mu^*(N) = 0$ and $\mu^*(A \cup N) = \mu^*(A)$, this does not warrant that $A \cup N \in \Sigma^*$.) Since $A \cup N_0$ is a superset of $A \cup N$ and, by Problem 2.11, $(A \cup N_0) \setminus (A \cup N)$ is a $\mu^*$-null set, $B = A \cup N_0$ meets all prerequisites of Lemma 2.16, which makes $A \cup N$ indeed $\mu^*$-measurable. This proves part (i) of the theorem.

(ii) Because of part (i), we need to show that $\Sigma^* \subseteq \overline{\Sigma}$, i.e., that each $A^*$ can be represented as the union of a $\mu$-measurable set and a $\mu$-negligible set. By Problem 2.12, for any $A^* \in \Sigma^*$, there is a $\Sigma$-measurable subset $B$ of $A^*$ such that $\mu^*(A^* \setminus B) = 0$. Obviously, $A^*$ can be decomposed as $B$ and the $\mu^*$-measurable null set $A^* \cap B^c$. It only remains to show that $A^* \cap B^c$ is $\mu$-negligible.

By Corollary 2.18, for $A^*$, there is a set $C \in \Sigma$ such that $C \supseteq A^*$ and $\mu^*(C \setminus A^*) = 0$. The set-difference $C \setminus B = (C \setminus A^*) + (A^* \setminus B)$, as the union of two $\mu^*$-null sets, is a $\mu^*$-null set and, therefore, a $\mu$-null set (as $C \setminus B \in \Sigma$). This proves that $A^* \cap B^c$, as a subset of $C \setminus B$, is $\mu$-negligible.

Now, we show that $\overline{\mu} = \mu_0^*$. (Recall that they are equal on $\Sigma$.) Since $\overline{\Sigma} = \Sigma^*$, $A^* = A \cup N$, where $A \in \Sigma$ and $N$ is $\mu$-negligible, and

$$\overline{\mu}(A^*) = \mu(A) = \mu^*(A). \tag{2.19}$$

On the other hand, there is a $\mu$-null superset of $N$ to yield $\mu^*(N) = 0$ due to monotonicity of $\mu^*$. Finally, from the inequalities

$$\mu^*(A^*) \le \mu^*(A) + \mu^*(N) = \mu^*(A) \quad\text{and}\quad \mu^*(A^* = A \cup N) \ge \mu^*(A),$$

it follows that $\mu^*(A^*) = \mu^*(A)$ and this, along with (2.19), yields that $\overline{\mu}(A^*) = \mu^*(A^*)$ for each $A^* \in \Sigma^* = \overline{\Sigma}$. □

Example 2.20. If $(\Omega,\Sigma,\mu)$ is a probability space, it follows from Theorem 2.19 that the completion of $(\Sigma,\mu)$ coincides with $(\Sigma^*,\mu_0^*)$ produced by $(\Sigma,\mu)$ or by a "smaller generator" $(\mathcal{G},\gamma)$ of $(\Sigma,\mu)$. □

A noteworthy question arises: if we have a semi-ring and a $\sigma$-additive elementary content, would it make any difference if we first extend them to the smallest ring and premeasure, according to Proposition 2.11, and then use the Caratheodory extension to arrive at the smallest $\sigma$-algebra and a measure on it, or apply the Caratheodory extension directly to that semi-ring and $\sigma$-additive elementary content? The same question applies, say, to a ring with a premeasure and the generated $\sigma$-algebra with a measure. The difference, if any, can apparently take place at the expense of two outer measures, induced by a formatter and its extension.

2.21 Theorem. Let $(\Omega,\mathcal{G},\gamma_0)$ be an extendible formatter of the outer measure $\mu^*$ and the $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable sets and let $(\overline{\mathcal{G}} = \overline{\mathcal{G}}(\mathcal{G}), \gamma)$ be an extension of $(\mathcal{G},\gamma_0)$ and an extendible formatter of the outer measure $\nu^*$ and $\mathfrak{B}^*$, such that $\overline{\mathcal{G}} \subseteq \Sigma^*$ and $\gamma = \mathrm{Res}_{\overline{\mathcal{G}}}\, \mu^*$. Then, $\nu^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\Sigma^* = \mathfrak{B}^*$.
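Proof. Since every cover of $Q$ by elements of $\mathcal{G}$ is also a cover by elements of $\overline{\mathcal{G}}$, and $\gamma = \gamma_0$ on $\mathcal{G}$,

$$\nu^*(Q) \le \mu^*(Q), \tag{2.21}$$

which yields the equation $\nu^* = \mu^*$ on the subcollection of sets $Q \in \mathcal{P}(\Omega)$ with $\nu^*(Q) = \infty$. Suppose $\nu^*(Q) < \infty$. Then, for every $\varepsilon > 0$, there is a cover $\{E_n\} \in \mathfrak{C}_Q(\overline{\mathcal{G}})$ with

$$\sum_{n=1}^{\infty} \gamma(E_n) < \nu^*(Q) + \tfrac{\varepsilon}{2}. \tag{2.21a}$$

Since $\gamma = \mu^*$ on $\overline{\mathcal{G}}$ and $\gamma(E_n) = \mu^*(E_n) < \infty$, for each $\varepsilon 2^{-n-1} > 0$, there is a cover $\{G_{nk},\ k = 1,2,\ldots\} \in \mathfrak{C}_{E_n}(\mathcal{G})$ such that

$$\sum_{k=1}^{\infty} \gamma_0(G_{nk}) < \mu^*(E_n) + \varepsilon 2^{-n-1} = \gamma(E_n) + \varepsilon 2^{-n-1}. \tag{2.21b}$$

Because $\{G_{nk},\ n,k = 1,2,\ldots\} \in \mathfrak{C}_Q(\mathcal{G})$, from (2.21a) and (2.21b),

$$\mu^*(Q) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \gamma_0(G_{nk}) < \nu^*(Q) + \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2}. \tag{2.21c}$$

Finally, since $\varepsilon > 0$ in (2.21c) is arbitrary, we arrive at the inverse of inequality (2.21), which proves that $\mu^* = \nu^*$ on $\mathcal{P}(\Omega)$.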
=
s
=
Proof.
1) From g C E5 we have E e C E5• From !f C b C E e it follows that E5 C Ee. 2) Now measures J.l.s and J.l.e act on the same u-algebra E and coincide on semi-ring !f. Since 'Yo is u-finite on !f, by Corollary 2. 14, J.1. 5 = J.l. e on E. 3)
With
and, consequently,
we meet all conditions of Theorem 2.21 to have $\mu_s^* = \mu_e^* = \mu^*$.

4) $\Sigma^* = \Sigma_s^* = \Sigma_e^*$ also by Theorem 2.21. □

For instance, $\overline{\mathcal{G}}$ can be a ring generated by $\mathcal{S}$ and $\gamma$ the extension of the elementary content $\gamma_0$ in accordance with Proposition 2.11; or $\overline{\mathcal{G}}$ can be an algebra with $\gamma$ as a premeasure; or $\overline{\mathcal{G}}$ can even be the $\sigma$-algebra $\Sigma(\mathcal{S})$. In particular, it follows that, once the Caratheodory extension from $(\mathcal{S},\gamma_0)$ to $(\Sigma(\mathcal{S}),\mu)$ is rendered, another Caratheodory extension of $(\overline{\mathcal{G}},\gamma)$ would be redundant.

Another consequence of Theorem 2.21 is the uniqueness of outer measures generated by measures.

Corollary 2.23. Let $\mu$ be a measure on a $\sigma$-algebra $\Sigma$, which produces the outer measure $\mu^*$ with the $\sigma$-algebra $\Sigma^*$ of measurable sets. If there is another outer measure $\tilde{\mu}^*$, then $\tilde{\mu}^* = \mu^*$ on $\mathcal{P}(\Omega)$ and $\tilde{\Sigma}^* = \Sigma^*$.

Proof. This is a direct application of Theorem 2.21 with the following identification of the above characteristics:

1) Let $\tilde{\mu}$ be a measure on $\Sigma$ such that $\tilde{\mu} = \mu$. Then $\tilde{\mu}$ can serve as an extension of $(\Sigma,\mu)$.

2) $\Sigma \subseteq \Sigma^*$.

3) $\mu = \tilde{\mu} = \mathrm{Res}_{\Sigma}\, \mu^*$. □

Remark 2.24. Corollary 2.23 is useful in various applications of Caratheodory's extension. Suppose $\gamma_1$ and $\gamma_2$ are two elementary contents coinciding on a $\sigma$-finite semi-ring $\mathcal{S}$ (i.e. they are $\sigma$-finite on $\mathcal{S}$). By Corollary 2.14, their respective Caratheodory extensions $\mu_1$ and $\mu_2$ must coincide on $\Sigma(\mathcal{S})$. Let $\mu_1^*$ and $\mu_2^*$ be the corresponding outer measures, according to Corollary 2.22, produced by $\gamma_1$ and $\gamma_2$ or $\mu_1$ and $\mu_2$ (regardless). By Corollary 2.23, $\mu_1^* = \mu_2^*$ on $\mathcal{P}(\Omega)$ and $\Sigma_1^* = \Sigma_2^*$. □

As in Theorem 2.21, by comparing two measures generated by a set function acting on a collection of sets and their extension, we ended up comparing two corresponding produced outer measures. It seems reasonable to raise another question: what if an outer measure produces another outer measure? Would this make any difference? More specifically, can the restriction $\mu_0^*$ of an outer measure $\mu^*$ on $\Sigma^*$ become a formatter of another outer measure, different from $\mu^*$? Note that this is a different scenario from the one considered in Theorem 2.21, since here $\mu^*$ is not supposed to be generated by a formatter and it "acts on its own." The following example shows us this distinction.

2.25 Example. Consider $\mathcal{P}(\Omega)$, $\mu^*$, $\Sigma^*$, and $\mu_0^*$ in Example 2.4 (i):
$\Omega = \{a,b,c\}$, $A = \{a\}$, $A^c = \{b,c\}$, $P = \{b\}$, $Q = \{c\}$, $R = \{a,b\}$, $S = \{a,c\}$,

$$\mu^*(\emptyset) = 0,\quad \mu^*(\Omega) = 4,\quad \mu^*(A) = 1,\quad \mu^*(A^c) = \mu^*(R) = \mu^*(S) = 3,\quad \mu^*(P) = \mu^*(Q) = 2.$$

Then, generate the outer measure $\nu^*$ by $(\Sigma^*,\mu_0^*)$. So, we have: $\mu^* = \nu^*$ on $\Sigma^*$ and

$\nu^*(P) = \mu^*(A^c) = 3$ $(> \mu^*(P) = 2)$,
v * ( Q) = J.L * (A c ) = 3 ( > J.L *(Q) = 2), v * (R) = J.L * (Q) = 4 ( > J.L * (R) = 3), v * (S) = J.L *(O) = 4 ( > J.L * (S) = 3). D As we see it, in most cases v * is strictly greater than J.L * on Q and J.L * (A) = J.L * ( Q ). D (See Problem 2. 17.) 2.27 Remark. If J.L * is generated by an extendible formatter (�, 1 ) , then clearly J.L * = v *, due to Theorem 2.21, as (J.L'Q,E * ) can serve as an extension of (y, 'Y )· Alternatively, if Q E Q and J.L * ( G u) < J.L * ( Q ) + t:. We assume that v * is the outer measure generated by J.Lo · Since J.L * = v* on �(n) and G u E E * , we have J.L * (Gu) = v * (Gu) and, by monotonicity, v * (Gu) > v * ( Q ). Thus, we have e: ,
which yields $\nu^*(Q) \le \mu^*(Q)$. The inverse inequality is due to Problem 2.1. □
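The values of $\nu^*$ listed in Example 2.25 can be reproduced by applying formula (2.8) to the formatter $(\Sigma^*,\mu_0^*)$. The sketch below is illustrative only; the names `generated_outer_measure`, `FORMATTER`, and `NU_STAR` are assumptions. It relies on the fact that, for a finite collection with nonnegative values, the infimum over countable covers is attained on a finite subfamily without repetitions.

```python
from itertools import combinations

OMEGA = frozenset("abc")


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def generated_outer_measure(formatter, omega):
    """Apply (2.8) on a finite set.

    formatter: dict mapping each member G of the collection to
    its nonnegative value gamma(G).
    """
    members = list(formatter)
    result = {}
    for q in powerset(omega):
        best = float("inf")  # value when Q has no cover at all
        for r in range(1, len(members) + 1):
            for cover in combinations(members, r):
                if q <= frozenset().union(*cover):
                    best = min(best, sum(formatter[g] for g in cover))
        result[q] = best
    return result


# The outer measure mu* of Example 2.4 (i).
MU_STAR = {
    frozenset(): 0, frozenset("a"): 1, frozenset("b"): 2,
    frozenset("c"): 2, frozenset("ab"): 3, frozenset("ac"): 3,
    frozenset("bc"): 3, frozenset("abc"): 4,
}

# The formatter (Sigma*, mu_0*): mu* restricted to the measurable sets.
SIGMA_STAR = [frozenset(), frozenset("a"), frozenset("bc"), OMEGA]
FORMATTER = {e: MU_STAR[e] for e in SIGMA_STAR}

NU_STAR = generated_outer_measure(FORMATTER, OMEGA)
```

Comparing `NU_STAR` with `MU_STAR` entry by entry shows the inequality of Problem 2.1, with equality on the measurable sets and strict inequality on $P$, $Q$, $R$, and $S$.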
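2.28 Theorem. Let $(\Omega,\Sigma,\mu)$ be a measure space such that $\Sigma = \Sigma(\mathcal{S})$ with $\mathcal{S}$ being a semi-ring, and let $\mu$ be $\sigma$-finite on $\mathcal{S}$. Then, given $A \in \Sigma(\mathcal{S})$ and $\varepsilon > 0$, there is a disjoint countable cover $\{S_n\} \subseteq \mathcal{S}$ of $A$ that "approximates" $A$, i.e. such that $A \subseteq \sum_{n=1}^{\infty} S_n$ and $\mu\big(\big(\sum_{n=1}^{\infty} S_n\big) \setminus A\big) < \varepsilon$.

Proof. Let $\gamma = \mathrm{Res}_{\mathcal{S}}\, \mu$ and $\mu^*$ be the outer measure produced by $(\mathcal{S},\gamma)$. Then, $\mu$ is the unique Caratheodory extension of $\gamma$ from $\mathcal{S}$ to $\Sigma(\mathcal{S})$, according to Corollary 2.14, and $\mu = \mathrm{Res}_{\Sigma}\, \mu^*$.

Case 1. Let $\mu(A) = \mu^*(A) < \infty$. Then, by (2.8) (of Proposition 2.8), for each $\varepsilon > 0$, there is a sequence $\{G_n\} \in \mathfrak{C}_A(\mathcal{S})$ such that

$$\sum_{n=1}^{\infty} \gamma(G_n) = \sum_{n=1}^{\infty} \mu(G_n) < \mu(A) + \varepsilon.$$

Since $\mu\big(\bigcup_{n=1}^{\infty} G_n\big) \le \sum_{n=1}^{\infty} \mu(G_n)$, we have that $\mu\big(\big\{\bigcup_{n=1}^{\infty} G_n\big\} \setminus A\big) < \varepsilon$.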
Case 2. J.L(A) is arbitrary. By u-finiteness of J.l on !f, there is at most a countable decomposition E � 1 nk of n by { Ok } c !f such that J.L(Ok) < and hence J.L(A n n k) < oo. Now, we apply the above atguments to A n n k and 2ek . Hence, there is a sequence { G nk } c !f n nk such that 00
This leads to and thus to where G n: = E � 1 G nk· Finally, it remains to form a disjoint sequence of semi-ring sets as stated in the theorem. The latter can be rendered in the same way as in 0 Lemma 2.5 of Chapter 4. 2.29 Corollary. If in the condition of Theorem 2. 28, J.L(A) < oo, then A can be approximated by just a finite tuple of disjoint semi-ring sets.
Because J.L(A) < oo, by Case 1 of Theorem 2.28, so is J.L(E�= 1 G n ) < oo. Then, by continuity from above, for each e > 0, there is an N such that Proof.
thereby leading to
1-{ A�( 2: � = 1 Sk)) < 2 c: .
0
PROBLEMS

2.1 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$, $\mu_0^* = \mathrm{Res}_{\Sigma^*}\, \mu^*$, and $\nu^*$ be the outer measure induced by $\mu_0^*$. Show that $\mu^* \le \nu^*$ on $\mathcal{P}(\Omega)$.

2.2 Let $(\mathcal{G},\gamma)$ be a formatter of the outer measure $\mu^*$ defined by (2.8). Show that if $\gamma$ is an elementary content and $\mathcal{G}$ is a semi-ring, then for each $G \in \mathcal{G}$, there is a cover $\{C_n\}$ of $G$ such that $G = \sum_{n=1}^{\infty} C_n$ and $\mu^*(G) = \sum_{n=1}^{\infty} \gamma(C_n)$.

2.3 Prove Proposition 2.11.

2.4 Let $\mu$ be a finite measure on $(\Omega,\Sigma)$ and let $\mathcal{E}$ be any subcollection of $\Sigma$. Show that, for any fixed subset $Q \subseteq \Omega$, it is true that $\inf\{\mu(A) : Q \subseteq A\} = \mu(\Omega) - \sup\{\mu(A^c) : Q \subseteq A\}$.

2.5 Show that the original definition of $\sigma$-finiteness 1.1 (vii) implies the second definition of $\sigma$-finiteness for semi-rings and elementary contents mentioned in Remark 2.12.

2.6 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $\{A_n\}$ a sequence of disjoint $\mu^*$-measurable sets. Show that for any $Q \subseteq \Omega$, $\mu^*\big(Q \cap \sum_{n=1}^{\infty} A_n\big) = \sum_{n=1}^{\infty} \mu^*(Q \cap A_n)$.

2.7 Let $N \in \mathcal{N}_\mu$ (i.e., a $\mu$-null set) and let $B \in \Sigma$. Show that $\mu(N \cup B) = \mu(B \setminus N) = \mu(B)$.

2.8 Show that $\overline{\Sigma}$ defined in Definition 2.5 (ii) is a $\sigma$-algebra, $\overline{\mu}$ is a measure, that this extension does not depend upon representations of sets of $\overline{\Sigma}$, and that $(\Omega,\overline{\Sigma},\overline{\mu})$ is complete.

2.9 Show that the measure space defined in Example 1.2 (iv) is complete.

2.10 Let $\mu^*$ be an outer measure on $\mathcal{P}(\Omega)$ and $N \subseteq \Omega$ be such that $\mu^*(N) = 0$. Show that for any subset $Q \subseteq \Omega$, $\mu^*(Q \cup N) = \mu^*(Q)$.

2.11 Show that $(A \cup N_0) \setminus (A \cup N)$ in part (i) of Theorem 2.19 is a $\mu^*$-null set.

2.12 Let $\mu^*$ be the outer measure generated by an extendible $\sigma$-finite formatter $(\mathcal{G},\gamma)$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and let $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that for every $A^* \in \Sigma^*$, there is a set $B \in \Sigma(\mathcal{G})$ with $B \subseteq A^*$ and $\mu^*(A^* \setminus B) = 0$.

2.13 Let $(\Omega,\Sigma_0,\mu_0)$ be a completion of a measure space $(\Omega,\Sigma,\mu)$. Define for each $A \subseteq \Omega$

$$\underline{\mu}(A) = \sup\{\mu(B) : B \in \Sigma,\ B \subseteq A\} \quad\text{and}\quad \overline{\mu}(A) = \inf\{\mu(B) : B \in \Sigma,\ A \subseteq B\}.$$

Show that
a) if $A \in \Sigma_0$, then $\underline{\mu}(A) = \overline{\mu}(A) = \mu_0(A)$;
b) if $\underline{\mu}(A) = \overline{\mu}(A) < \infty$, then $A \in \Sigma_0$.

2.14 Let $\Sigma$ be a $\sigma$-algebra in $\Omega$ and let $a \in \Omega$. Show that for $\{a\} \in \Sigma$ the measure space $(\Omega,\Sigma,\varepsilon_a)$ is complete if and only if $\Sigma = \mathcal{P}(\Omega)$.

2.15 (Generalization of Problem 1.12.) Let $(\Omega,\Sigma,\mu)$ be a measure space, $\mathcal{G}_\infty = \{G \in \Sigma : \mu(G) < \infty\}$, $\Sigma_\infty = \{Q \subseteq \Omega : Q \cap G \in \Sigma,\ \forall G \in \mathcal{G}_\infty\}$, and $\mu$ be $\sigma$-finite. Show that $\Sigma_\infty = \Sigma$.

2.16 In the condition of Problem 1.13, show that if $\mu$ is complete, then so is $\mu_\infty$.

2.17 Prove Proposition 2.26.

2.18 Let $\mu^*$ be the outer measure generated by an extendible formatter $(\mathcal{G},\gamma)$ on a non-empty set $\Omega$, $\Sigma^*$ be the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\Sigma(\mathcal{G})$ be the $\sigma$-algebra generated by $\mathcal{G}$. Show that a subset $N \subseteq \Omega$ is negligible if and only if $\mu^*(N) = 0$.
NEW TERMS:

outer measure
monotonicity of outer measure
subadditivity of outer measure
$\mu^*$-measurable set
$\mu^*$-separability
Caratheodory's Extension Theorem
$\mu_0^*$-measure
$\Sigma^*$-$\sigma$-algebra
restriction of a function
extension of a function
$\mu$-null (null) set
null ($\mu$-null) set
$\mathcal{N}_\mu$-set
$\mu$-negligible (negligible) set
negligible ($\mu$-negligible) set
extension of a measure
completion of a measure
completion of a measure space
restriction of outer measure to $\Sigma^*$-algebra
Caratheodory's extension
formatter of an outer measure
complete Caratheodory's extension
extendible formatter
extendibility of a formatter, criterion of
$\sigma$-finiteness of a set function
Caratheodory's extension, uniqueness of
3. LEBESGUE AND LEBESGUE-STIELTJES MEASURES

In this section, we will use the results of the previous section for the construction of Lebesgue and Lebesgue-Stieltjes measures. We have learned that to warrant the Caratheodory extension, a given formatter should be at least a semi-ring and $\sigma$-additive elementary content, which applies to some special cases of formatters in Euclidean spaces. In Theorem 3.1 below, we will show that the Lebesgue content is $\sigma$-additive on the ring $\mathcal{R}(\mathbb{R}^n)$, which will clearly yield that the Lebesgue elementary content is also $\sigma$-additive on the semi-ring of half-open intervals. Although it is possible to prove this statement directly (cf. Problem 3.25, with no prior extension and $\emptyset$-continuity arguments, as in Theorem 3.1), we prefer first to extend the elementary content to the ring, as we want to exploit the equivalence of $\emptyset$-continuity and $\sigma$-additivity. The latter, as we know, can be observed on set families not lesser than rings.

Theorem 3.1. The Lebesgue content $\lambda^c$ on the ring $\mathcal{R}(\mathbb{R}^n)$ is $\sigma$-additive, i.e. a premeasure.

Proof. Since the Lebesgue content $\lambda^c$ is finite on $\mathcal{R}$, by Proposition 1.7 (ii), $\lambda^c$ would be a premeasure if it were $\emptyset$-continuous. We shall be using an equivalent version of $\emptyset$-continuity: for every monotone decreasing sequence $\{A_n\} \downarrow\ \subseteq \mathcal{R}$ with $\lambda^c(A_1) < \infty$, the assumption that $\lim_{n \to \infty} \lambda^c(A_n)$ (which clearly exists) is strictly positive must yield that $\bigcap_{n=1}^{\infty} A_n \neq \emptyset$.
Let {An} be any such monotone decreasing sequence with (3. 1)
It is readily seen that (3. 1) implies that for each An f. 0, and there fore, by Cantor's Theorem 5.4, Chapter 2, n n= l An f. ct>. However, the nonempty intersection of the closures of An 's need not yield that the intersection n n= l An f. Q) either. To overcome this difficulty we will construct a subsequence of compact subsets of An 's with the desired above property. Now, since An 's E �, each An can be represented as a finite union of disjoint half open parallelepipeds, say E� = 1 P (for brevity let us drop index ) such that .,\c(P ) > 0. Then for each value of and for every P there is a half open parallelepiped II whose closure II is a proper subset of P and such that n,
00
n
£
5,
s
s
5
s
s
3. Lebesgue and Lebesgue-Stieltjes Measures
$$\lambda_c(P_s) < \lambda_c(\Pi_s) + \frac{\varepsilon}{2^n s_n}. \qquad (3.1a)$$
Bound (3.1a) yields that
$$\lambda_c(B_n) \geq \lambda_c(A_n) - \frac{\varepsilon}{2^n}, \qquad (3.1b)$$
where $B_n = \sum_{s=1}^{s_n} \Pi_s$. Obviously, $\bar{B}_n \subset A_n$. It seems like we are done with the sequence $\{\bar{B}_n\}$. However, the claim that $\bigcap_{n=1}^{\infty} \bar{B}_n \neq \emptyset$ is unwarranted, as $\{\bar{B}_n\}$ need not be monotone decreasing. Therefore, we define
$$C_n = \bigcap_{k=1}^{n} B_k,$$
which forms a monotone nonincreasing sequence of sets term-wise dominated by $\{A_n\}$. Now, we need to show that $C_n \neq \emptyset$. We shall be able to prove a much stronger statement that $\lambda_c(C_n) > 0$ for all $n$. Namely, we will prove that
$$\lambda_c(C_n) \geq \lambda_c(A_n) - \varepsilon\left(1 - \frac{1}{2^n}\right), \qquad (3.1c)$$
which, because of $\lambda_c(A_n) \geq \varepsilon$, would yield the desired
$$\lambda_c(C_n) \geq \frac{\varepsilon}{2^n} > 0. \qquad (3.1d)$$

We prove (3.1c) by induction. For $n = 1$, (3.1c) holds true, since from (3.1b), $\lambda_c(C_1) = \lambda_c(B_1) \geq \lambda_c(A_1) - \frac{\varepsilon}{2}$. Now we assume that (3.1c) holds for some $n \geq 1$ and show the validity of (3.1c) for $n + 1$. Because of $C_{n+1} = B_{n+1} \cap C_n$ and Proposition 1.5 (ii),
$$\lambda_c(C_{n+1}) = \lambda_c(B_{n+1}) + \lambda_c(C_n) - \lambda_c(B_{n+1} \cup C_n). \qquad (3.1e)$$

Due to (3.1e), the inequality $\lambda_c(B_{n+1}) \geq \lambda_c(A_{n+1}) - \frac{\varepsilon}{2^{n+1}}$ (from (3.1b) for $n + 1$), and the assumption that (3.1c) holds true for some fixed $n$, we have
$$\lambda_c(B_{n+1} \cup C_n) \geq \lambda_c(A_{n+1}) + \lambda_c(A_n) - \lambda_c(C_{n+1}) - \varepsilon\left(1 - \frac{1}{2^{n+1}}\right).$$
Since obviously $B_{n+1} \cup C_n \subset A_n$, we have $\lambda_c(A_n) \geq \lambda_c(B_{n+1} \cup C_n)$, and hence
CHAPTER 5. MEASURES
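$$\lambda_c(C_{n+1}) \geq \lambda_c(A_{n+1}) - \varepsilon\left(1 - \frac{1}{2^{n+1}}\right) + \lambda_c(A_n) - \lambda_c(B_{n+1} \cup C_n) \geq \lambda_c(A_{n+1}) - \varepsilon\left(1 - \frac{1}{2^{n+1}}\right).$$

This proves (3.1c) and (3.1d) and thereby yields that the closures $\{\bar{C}_n\}$ form a monotone nonincreasing sequence of nonempty compact sets; hence, by Cantor's Theorem 5.4, Chapter 2,
$$\bigcap_{n=1}^{\infty} A_n \supseteq \bigcap_{n=1}^{\infty} \bar{C}_n \neq \emptyset.$$
Consequently, it shows that $\lambda_c$ is indeed a premeasure on the ring $\mathcal{R}$.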
□
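The inner approximation step behind (3.1a) and (3.1b) can be illustrated numerically in dimension one. The following is only a sketch under stated assumptions: the helper names `content` and `shrink`, the sample figure `A`, and the particular choice of `delta` are hypothetical and are not taken from the text. Each piece $(a,b]$ is replaced by $(a+\delta,b]$, whose closure $[a+\delta,b]$ lies inside $(a,b]$, and the total loss of content stays below $\varepsilon/2^n$.

```python
from fractions import Fraction

def content(figure):
    """Lebesgue content of a figure given as disjoint half-open intervals (a, b]."""
    return sum(b - a for a, b in figure)

def shrink(figure, n, eps):
    """Replace each piece (a, b] by (a + delta, b], whose closure [a + delta, b]
    is contained in (a, b]; the total loss of content is below eps / 2**n."""
    s = len(figure)
    shrunk = []
    for a, b in figure:
        # delta is positive, keeps a piece of positive length, and the s deltas
        # add up to at most eps / 2**(n + 1), which is less than eps / 2**n
        delta = min((b - a) / 2, eps / (2 ** (n + 1) * s))
        shrunk.append((a + delta, b))
    return shrunk

eps = Fraction(1, 10)
A = [(Fraction(0), Fraction(1)), (Fraction(2), Fraction(5, 2))]  # plays the role of A_n, n = 3
B = shrink(A, 3, eps)                                            # plays the role of B_n
loss = content(A) - content(B)
```

Exact rational arithmetic is used here so that the strict inequality `loss < eps / 2**n` can be checked without rounding effects.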
3.2 Remarks and Definitions.

(i) Theorem 3.1 states that the Lebesgue content on $\mathcal{R}(\mathcal{S})$ in $\mathbb{R}^n$ is $\sigma$-additive. This, obviously, implies that the Lebesgue elementary content is also $\sigma$-additive on $\mathcal{S}$.

(ii) In Example 1.2 (ii) we defined the Lebesgue elementary content $\lambda_0$ on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$. Now, by the use of Proposition 2.11, Corollary 2.14, and Theorem 3.1, we can have the couple $(\mathcal{R},\lambda_c)$ or, in light of Remark (i), even $(\mathcal{S},\lambda_0)$ as an extendible formatter of the outer measure $\lambda^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Lebesgue outer measure. The $\sigma$-algebra $\Sigma^* \subset \mathcal{P}(\mathbb{R}^n)$ of all $\lambda^*$-measurable sets, in notation $L^*$, called the Lebesgue $\sigma$-algebra of measurable sets, along with $\lambda^*_0 = \operatorname{Res}_{L^*}\lambda^*$, called the Lebesgue measure, will form a complete measure space, according to Proposition 2.7. The further restriction $\lambda = \operatorname{Res}_{\Sigma(\mathcal{S})}\lambda^*$ of the Lebesgue outer measure to the smallest $\sigma$-algebra generated by $\mathcal{S}$ (which, according to Theorem 2.7, Chapter 4, is identical to the smallest $\sigma$-algebra generated by the usual topology) or, equivalently, by $\mathcal{R}$, known as the Borel $\sigma$-algebra $\mathfrak{B}$ on $\mathbb{R}^n$, is referred to as the Borel-Lebesgue measure. By noticing that there exists a monotone increasing sequence $(-k,k]^n \uparrow \mathbb{R}^n$ of half-open squares with
$$\lambda_0((-k,k]^n) = (2k)^n < \infty,$$
we conclude that $\lambda_0$ is $\sigma$-finite on $\mathcal{S}$ and, therefore, by Corollary 2.14, the Borel-Lebesgue measure $\lambda$ is unique on $\mathfrak{B}$. By Remark 2.24, the Lebesgue outer measure $\lambda^*$ and hence the Lebesgue measure $\lambda^*_0$ are also unique on $\mathcal{P}(\mathbb{R}^n)$ and $L^*$, respectively. Finally, by Theorem 2.19 (ii), the completion of the Borel-Lebesgue measure $\lambda$ coincides with the Lebesgue measure $\lambda^*_0$ on $L^*$ and the corresponding completion of the Borel $\sigma$-algebra coincides with the $\sigma$-algebra $L^*$ of Lebesgue measurable sets.

Both Lebesgue and Borel-Lebesgue measures have their strengths and weaknesses. The Borel-Lebesgue measure acts on the Borel $\sigma$-algebra, which stems from the usual topology and preserves some topological properties. The Borel-Lebesgue measure is also an element of a very important class of Borel measures. However, unlike Lebesgue measure, Borel-Lebesgue measure is not complete. □

3.3 Definitions.
(i) Let $\mathfrak{B}$ be a Borel $\sigma$-algebra in $\Omega$. Any measure $\mu$ on $\mathfrak{B}$ is called a Borel measure and the triple $(\Omega,\mathfrak{B},\mu)$ is called a Borel measure space.

(ii) A Borel measure $\mu$ on $(\mathbb{R}^n,\mathfrak{B})$ is said to be a Borel-Lebesgue-Stieltjes measure if $\mu(B) < \infty$ for any $d_e$-bounded Borel set $B$. Clearly, any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite.

(iii) Let $\mu$ be a Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n,\mathfrak{B})$. Now, in light of Caratheodory's construction we can use the couple $(\mathfrak{B},\mu)$ as an extendible formatter of the outer measure $\mu^*$ acting on $\mathcal{P}(\mathbb{R}^n)$ and call this set function the Borel outer measure. The $\sigma$-algebra $\Sigma^* \subset \mathcal{P}(\mathbb{R}^n)$ of all $\mu^*$-measurable sets will be denoted by $\mathfrak{B}^*_\mu$ and called the Lebesgue-Stieltjes $\sigma$-algebra of measurable sets. The corresponding restriction $\mu^*_0 = \operatorname{Res}_{\mathfrak{B}^*_\mu}\mu^*$ will be called the Lebesgue-Stieltjes measure. In the literature on measure theory, Lebesgue-Stieltjes measures are often confused with Borel-Lebesgue-Stieltjes measures. □

In addition to the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathfrak{B}$ on $\mathbb{R}^n$, we present another construction of a Borel-Lebesgue-Stieltjes measure, for simplicity letting the dimension $n = 1$. In Example 1.2 (iii) we introduced the Lebesgue-Stieltjes elementary content $\mu_f^0$ on the semi-ring $\mathcal{S}$ of all half-open intervals $(a,b] \subset \mathbb{R}$, by means of an extended distribution function (i.e., a monotone nondecreasing, right-continuous function) $f: (\mathbb{R},d_e) \to (\mathbb{R},d_e)$, as $\mu_f^0((a,b]) = f(b) - f(a)$. Observe that $\mu_f^0$ reduces to the Lebesgue elementary content if $f(x) = x$. According to Proposition 2.11, $\mu_f^0$ can uniquely be extended to the Lebesgue-Stieltjes content $\mu_f$ on the ring $\mathcal{R}(\mathcal{S})$ of "figures." The following is to show that $\mu_f$ is $\sigma$-additive.

3.4 Theorem. Let $\mu_f$ be a Lebesgue-Stieltjes content on the ring $\mathcal{R}(\mathcal{S})$ induced by a monotone nondecreasing right-continuous function $f$. Then $\mu_f$ is a premeasure.

Proof. Since $\mu_f$ is finite on $\mathcal{R}(\mathcal{S})$, as in Theorem 3.1, it is sufficient to show that $\mu_f$ is $\emptyset$-continuous. Let $\{R_n\}$ be a sequence of sets from $\mathcal{R}(\mathcal{S})$ monotonically decreasing to $\emptyset$. We prove that
$$\lim_{n\to\infty}\mu_f(R_n) = 0.$$
We assume that $R_n \subset C$, $n = 1,2,\dots$, where $C$ is a compact set in $(\mathbb{R},\tau_e)$. A set $R_n \in \mathcal{R}$ is a figure if it is a finite union of disjoint intervals of type $(a,b]$. Because of right-continuity of $f$, it can be easily shown that, for each fixed $\varepsilon > 0$ and for any figure $R_n$, there is a subfigure $B_n \subset R_n$ such that $\bar{B}_n \subset R_n$ and such that $\mu_f(R_n) - \mu_f(B_n) < \varepsilon 2^{-n}$. It also follows that $\bigcap_{n=1}^{\infty}\bar{B}_n = \emptyset$. We claim that there is an $r$ such that $\bigcap_{k=1}^{r}\bar{B}_k = \emptyset$. To see this, observe that $\{C\setminus\bar{B}_n;\ n = 1,2,\dots\}$ is an open cover of $C$ in the relative topology $(C,\tau_e \cap C)$. Since compactness is weakly hereditary and $C$ is closed, it follows that $C$ is also compact in $\tau_e \cap C$. Thus, the above cover reduces to a finite subcover, for example, $C\setminus\bar{B}_1,\dots,C\setminus\bar{B}_r$, yielding that
$$C = \bigcup_{k=1}^{r}(C\setminus\bar{B}_k) = C\setminus\bigcap_{k=1}^{r}\bar{B}_k.$$
Thus, $\bigcap_{k=1}^{r}\bar{B}_k = \emptyset$; that is, the above countable intersection reduces to a finite intersection of the sets $\bar{B}_k$, $k = 1,\dots,r$. Now, for all $n > r$,
$$\bigcap_{k=1}^{n}B_k = \emptyset \quad\text{and}\quad R_n = R_n\setminus\Big(\bigcap_{k=1}^{n}B_k\Big).$$
Since $\{R_n\}$ is monotone decreasing, it follows that
$$R_n\setminus\Big(\bigcap_{k=1}^{n}B_k\Big) = \bigcup_{k=1}^{n}(R_n\setminus B_k) \subset \bigcup_{k=1}^{n}(R_k\setminus B_k).$$
Observe that this is the desired inclusion implying the estimate $\mu_f(R_n) < \varepsilon$. This inclusion is due to the inclusion $R_n\setminus B_k \subset R_k\setminus B_k$, which holds for all $k \leq n$ (as long as $n < \infty$). Thus we have
$$\mu_f(R_n) \leq \mu_f\Big(\bigcup_{k=1}^{n}(R_k\setminus B_k)\Big) \leq \sum_{k=1}^{n}\mu_f(R_k\setminus B_k) < \varepsilon(1 - 2^{-n}),$$
which shows that $\mu_f(R_n) \to 0$. □

Notice that it can alternatively be shown (Problem 3.26) that the Lebesgue-Stieltjes elementary content is $\sigma$-additive with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.

3.5 Remarks.
(i) Using the same arguments as in Remarks and Definitions 3.2, we will extend the Lebesgue-Stieltjes elementary content $\mu_f^0$ (or content $\mu_f$) from the semi-ring $\mathcal{S}$ (or ring $\mathcal{R}(\mathcal{S})$, respectively) to the Lebesgue-Stieltjes measure $\mu_f^*$ on the $\sigma$-algebra $\Sigma^*$ $(= \mathfrak{B}^*_{\mu_f})$ of Lebesgue-Stieltjes measurable sets and then reduce it to the unique measure $\mu_f$, which is clearly a Borel-Lebesgue-Stieltjes measure on the Borel $\sigma$-algebra $\mathfrak{B}(\mathbb{R})$.

(ii) When dealing with Borel measures, it is common to observe a certain property of a $\sigma$-finite Borel measure $\mu_1$ on the semi-ring $\mathcal{S}$ in $\mathbb{R}^n$ and extend this property of $\mu_1$ from $\mathcal{S}$ to the Borel $\sigma$-algebra $\mathfrak{B}$, arriving at another Borel measure $\mu_2$. Since $\mu_1$ and $\mu_2$ coincide on $\mathcal{S}$, by Corollary 2.14, $\mu_1 = \mu_2$ on $\mathfrak{B}$. Consequently, by Remark 2.24, the corresponding outer measures $\mu_1^*$ and $\mu_2^*$ must coincide on $\mathcal{P}(\mathbb{R}^n)$, as well as their restrictions on $\mathfrak{B}^*_{\mu_1} = \mathfrak{B}^*_{\mu_2}$. Note, however, that $\mathfrak{B}^*_\mu$ is not a general notation like $\mathfrak{B}$ is, for it is not induced by the usual topology and it is related to a particular Borel measure $\mu$ on $\mathfrak{B}$.

(iii) We have learned that if $f$ is an extended distribution function (see the definition in Example 1.2 (iii)), then it induces a Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}$. Conversely, a Borel-Lebesgue-Stieltjes measure $\mu$ generates an extended distribution function. If $\mu$ is a finite Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}$, then we can set $f(x) = \mu((-\infty,x])$ and such an $f$ is a distribution function. Indeed, take a sequence $x_1 > x_2 > \dots \to x$. Then $f(x_n) - f(x) = \mu((x,x_n]) \to 0$, by $\emptyset$-continuity of $\mu$ (Theorem 1.7 (i)), which shows that $f$ is right-continuous. Since $\mu$ is a finite measure, $f$ is bounded. Finally, if $\{x_n\}$ is any monotone decreasing sequence convergent to $-\infty$ (such as $\{-n\}$), then, again by $\emptyset$-continuity of $\mu$, it follows that $\mu((-\infty,x_n]) \to 0$ and thus $f(x_n) \to 0$. If $\mu$ is an arbitrary Borel-Lebesgue-Stieltjes measure, we can define $f(0) = 0$ and
$$f(x) = \begin{cases} \mu((0,x]), & x > 0 \\ -\mu((x,0]), & x < 0. \end{cases} \qquad (*)$$
Similarly, one can show that $f$ is an extended distribution function. (See Problem 3.3.) If $\mathfrak{M} = \mathfrak{M}(\mathbb{R},\mathfrak{B})$ denotes the set of all Borel-Lebesgue-Stieltjes measures on $(\mathbb{R},\mathfrak{B})$, then it can be shown that any two extended distribution functions $f_1$ and $f_2$ that induce $\mu \in \mathfrak{M}$ can differ only in an additive constant (see Problem 3.4). The latter generates an equivalence relation, say $\sim$. Therefore, if $\mathfrak{D}_e$ denotes the set of all extended distribution functions, for each $\mu \in \mathfrak{M}$ there is a unique equivalence class $\{f;\mu\}$ of all such extended distribution functions that induce $\mu$, and $\{f;\mu\} = \{f + c: c \in \mathbb{R}\}$. Let $\mathfrak{D}_e|_\sim = \{\{f;\mu\}: \mu \in \mathfrak{M}\}$ be the corresponding quotient set of $\mathfrak{D}_e$. Then, there is a bijective map $\Phi$ from the set $\mathfrak{M}$ onto the set $\mathfrak{D}_e|_\sim$.

As regards the subset $\mathfrak{M}^*$ of all finite Borel-Lebesgue-Stieltjes measures, each one of them obviously generates a unique distribution function, and there is a bijective map between $\mathfrak{M}^*$ and the set $\mathfrak{D}$ $(\subset \mathfrak{D}_e)$ of all distribution functions.

To make all distinctions between distribution and extended distribution functions lucid, the reader may find it expedient to go over Problem 3.9. □
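The correspondence between extended distribution functions and the measures they induce on half-open intervals can be checked on a small example. This is a minimal sketch, assuming a hypothetical piecewise linear function `f` with one jump; the names `f`, `mu`, and `g` are illustrative and do not come from the text. The function `g` is rebuilt from the interval measure by formula (*) of Remark 3.5 (iii), and it differs from `f` only by the additive constant $f(0)$.

```python
from fractions import Fraction

def f(x):
    # Hypothetical extended distribution function: nondecreasing and
    # right-continuous, with a jump of size 3 at x = 1 and f(0) = 3.
    return 2 * x + 5 if x >= 1 else x + 3

def mu(a, b):
    # Lebesgue-Stieltjes elementary content of the half-open interval (a, b].
    return f(b) - f(a)

def g(x):
    # Formula (*): rebuild a distribution function from mu, normalized by g(0) = 0.
    if x > 0:
        return mu(0, x)
    if x < 0:
        return -mu(x, 0)
    return 0

points = [Fraction(-2), Fraction(-1, 2), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3)]
differences = {f(x) - g(x) for x in points}  # a single value means f - g is constant here
```

On these sample points the set `differences` contains the single value $f(0)$, in line with the statement that two extended distribution functions inducing the same measure differ only by an additive constant.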
We will return to Lebesgue measure $\lambda^*_0$ on $L^*$. First, we prove a lemma about negligible sets. One of the interesting consequences of this result is that in $\mathbb{R}^n$, all Borel sets having a dimension less than $n$ are null sets.

3.6 Lemma. A set $N \subset \mathbb{R}^n$ is $\lambda$-negligible if and only if for each $\varepsilon > 0$ there is a countable cover $\{I_k\} \subset \mathcal{S}$ of $N$ by semi-open intervals such that
$$\sum_{k=1}^{\infty}\lambda_0(I_k) < \varepsilon.$$

Proof. Let $N$ be $\lambda$-negligible. Then, by Problem 2.18, $\lambda^*(N) = 0$ and
$$\lambda^*(N) = \inf\Big\{\sum_{k=1}^{\infty}\lambda_0(I_k): \{I_k\} \in \mathcal{C}_N\Big\},$$
where $\mathcal{C}_N$ is the set of all countable covers of $N$ by semi-open intervals, and it is not empty, since otherwise $\lambda^*(N)$ would equal $\infty$. By the definition of the infimum, for each $\varepsilon > 0$, there is a cover $\{I_k\} \in \mathcal{C}_N$ such that
$$\sum_{k=1}^{\infty}\lambda_0(I_k) < \varepsilon,$$
which proves the first part of the statement.

Conversely, let $\varepsilon > 0$ and let $\{I_k\} \subset \mathcal{S}$ be a countable cover of $N$ with the property that $\sum_{k=1}^{\infty}\lambda_0(I_k) < \varepsilon$. Then,
$$\lambda^*(N) \leq \lambda^*\Big(\bigcup_{k=1}^{\infty}I_k\Big) \leq \sum_{k=1}^{\infty}\lambda_0(I_k) < \varepsilon$$
and hence, by Problem 2.18, $N$ is a $\lambda$-negligible set. □

3.7 Lemma. Let $f: \mathbb{R} \to \mathbb{R}$ be an additive function, continuous at zero. Then, $f$ is linear.

Proof. First note that
$$f(0) + f(0) = f(0 + 0) = f(0). \qquad (3.7a)$$
This yields that $f(0) = 0$. Then, from
$$0 = f(0) = f(x - x) = f(x) + f(-x) \qquad (3.7b)$$
it follows that $f(x) = -f(-x)$ and thus $f$ is odd. Now, let $n$ be any positive integer. Then, since $f$ is additive,
$$f(nx) = nf(x). \qquad (3.7c)$$
If $n$ is a negative integer, then, from (3.7b-c),
$$f(nx) = f((-n)(-1)x) = -f(-nx) = -(-n)f(x) = nf(x).$$
Hence, for each $n \in \mathbb{Z}$,
$$f(nx) = nf(x), \qquad (3.7d)$$
which yields that, for each nonzero integer $k$,
$$f\Big(\frac{x}{k}\Big) = \frac{1}{k}f(x). \qquad (3.7f)$$
Combining (3.7d) and (3.7f) we have that for each integer $m$,
$$f\Big(\frac{m}{k}x\Big) = mf\Big(\frac{x}{k}\Big) = \frac{m}{k}f(x).$$
In other words, for each rational number $q$,
$$f(qx) = qf(x).$$
Since $f$ is continuous at zero and because $f$ is additive and odd, we have from $f(x - y) = f(x) + f(-y) = f(x) - f(y)$ that $f$ is continuous on $\mathbb{R}$. Now, let $r \in \mathbb{R}$. Then, there is a sequence $\{q_n\}$ of rationals convergent to $r$. Due to continuity of $f$,
$$\lim_{n\to\infty}f(q_n) = f(r). \qquad (3.7g)$$
On the other hand, $f(q_n \cdot 1) = q_nf(1)$ and (3.7g) lead to
$$f(r) = \lim_{n\to\infty}f(q_n) = f(1)\lim_{n\to\infty}q_n = f(1)\,r.$$
This shows that $f$ is a linear function $f(x) = cx$, where $c = f(1)$. □

3.8 Corollary. Let $f: \mathbb{R}^n \to \mathbb{R}$ be continuous at zero and additive for each variable separately. Then
$$f(x_1,\dots,x_n) = c\,x_1\cdots x_n, \quad\text{where } c = f(1,\dots,1).$$

Proof. If $x_2,\dots,x_n$ are fixed, then by Lemma 3.7,
$$f(x_1,x_2,\dots,x_n) = f(1,x_2,\dots,x_n)\,x_1.$$
Applying the same procedure successively to the other variables we have the statement. □

3.9 Definition. A Borel measure $\mu$ on $\mathfrak{B}(\mathbb{R}^n)$ is said to be translation-invariant if for each Borel set $B \in \mathfrak{B}(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$,
$$\mu(B + x) = \mu(B),$$
where $B + x = \{x + y: y \in B\}$. □

We will see in Section 4 that the Borel-Lebesgue and Lebesgue measures are translation-invariant. The following theorem states that any translation-invariant Borel measure is a multiple of the Borel-Lebesgue measure.
3.10 Theorem. Let $\mu$ be a translation-invariant Borel measure on $\mathfrak{B}(\mathbb{R}^n)$. Then, $\mu = c\lambda$, where $\lambda$ is the Borel-Lebesgue measure on $\mathfrak{B}(\mathbb{R}^n)$ and $c = \mu(C)$ ($C$ stands for a unit cube).

Proof. For each $x \in \mathbb{R}$, define
$$I_x = \begin{cases} [x,0), & x < 0 \\ \emptyset, & x = 0 \\ [0,x), & x > 0 \end{cases} \quad\text{and}\quad \operatorname{sgn}x = \begin{cases} 1, & x \geq 0 \\ -1, & x < 0. \end{cases}$$
Denote
$$f(x_1,\dots,x_n) = \operatorname{sgn}\Big(\prod_{i=1}^{n}x_i\Big)\,\mu(I_{x_1}\times\dots\times I_{x_n}). \qquad (3.10)$$

We show that $f$ defined in (3.10) is additive and continuous in each variable separately. Without loss of generality, we show it with respect to $x_1$. Let $x_1 = x + y$.

Case 1. Suppose $x > 0$ and $y > 0$. Then, $I_{x+y} = [0,x+y) = [0,x) + [x,x+y)$ and
$$I_{x+y}\times I_{x_2}\times\dots\times I_{x_n} = R_1 + R_2,$$
where
$$R_1 = I_x\times I_{x_2}\times\dots\times I_{x_n} \quad\text{and}\quad R_2 = [x,x+y)\times I_{x_2}\times\dots\times I_{x_n}.$$
Since $x$, $y$, and $x + y$ are all positive,
$$\operatorname{sgn}\Big(\prod_{i=1}^{n}x_i\Big) = \operatorname{sgn}(x\,x_2\cdots x_n) = \operatorname{sgn}(y\,x_2\cdots x_n). \qquad (3.10a)$$
From (3.10a), and since $\mu$ is translation invariant,
$$f(x+y,x_2,\dots,x_n) = f(x,x_2,\dots,x_n) + f(y,x_2,\dots,x_n).$$

Case 2. Suppose $x + y > 0$ and $x > 0$, $y < 0$. Then,
$$\operatorname{sgn}((x+y)\,x_2\cdots x_n) = \operatorname{sgn}(x\,x_2\cdots x_n) = -\operatorname{sgn}(y\,x_2\cdots x_n). \qquad (3.10b)$$
Since
$$I_{x+y} = [0,x+y) = [0,x)\setminus[x+y,x), \qquad \lambda([x+y,x)) = \lambda([y,0)),$$
and because $\mu$ is translation invariant, using (3.10b) we have that
$$f(x+y,x_2,\dots,x_n) = f(x,x_2,\dots,x_n) + f(y,x_2,\dots,x_n).$$

Case 3. $x + y > 0$ and $x < 0$, $y > 0$ is the same as Case 2. The other combinations of $x$ and $y$ are left for the reader. (See Problem 3.20.)

Now, we prove continuity of $f$ at zero. Let $\{a_k\}$ be a sequence convergent to zero from the right. Then, $\{a_k\} \subset \mathbb{R}_+$ and the sequence of sets $\{I_{a_k}\}$ is such that
$$\bigcap_{k=1}^{\infty}I_{a_k} = \{0\}.$$
The latter yields that
$$\bigcap_{k=1}^{\infty}\{I_{a_k}\times I_{x_2}\times\dots\times I_{x_n}\} = I_0\times I_{x_2}\times\dots\times I_{x_n}.$$
By the definition, $I_0 = \emptyset$; and by continuity from above of $\mu$, we have that
$$\lim_{k\to\infty}f(a_k,x_2,\dots,x_n) = 0.$$
Similarly, by continuity from below of $\mu$, we have that
$$\lim_{k\to\infty}f(a_k,x_2,\dots,x_n) = 0 \quad\text{for } \{a_k\}\uparrow 0.$$
In addition, $f(0,x_2,\dots,x_n) = 0$ is by the definition of $f$.

By Corollary 3.8,
$$f(x_1,\dots,x_n) = f(1,\dots,1)\,x_1\cdots x_n = \operatorname{sgn}(1\cdots 1)\,x_1\cdots x_n\,\mu(C), \qquad (3.10c)$$
where $C = [0,1)\times\dots\times[0,1)$. On the other hand,
$$f(x_1,\dots,x_n) = \operatorname{sgn}\Big(\prod_{i=1}^{n}x_i\Big)\,\mu(I_{x_1}\times\dots\times I_{x_n}),$$
which, along with (3.10c), gives
$$\mu(I_{x_1}\times\dots\times I_{x_n}) = \frac{x_1\cdots x_n}{\operatorname{sgn}(x_1\cdots x_n)}\,\mu(C). \qquad (3.10d)$$
Note that
$$\lambda(I_{x_1}\times\dots\times I_{x_n}) = |x_1\cdots x_n| = \frac{x_1\cdots x_n}{\operatorname{sgn}(x_1\cdots x_n)}. \qquad (3.10e)$$
Equations (3.10d) and (3.10e) tell us that for any rectangle $R$ all of whose sides lie on the corresponding coordinate axes,
$$\mu(R) = \mu(C)\,\lambda(R). \qquad (3.10f)$$
For an arbitrarily positioned rectangle $R$ all of whose sides are parallel to the corresponding coordinate axes, (3.10f) still holds true due to the translation invariance of $\mu$.

By $\mu_0 = \mu(C)\lambda_0$ we define an elementary content on the semi-ring $\mathcal{S}$ of half-open rectangles. Then, by $\bar{\mu} = \mu(C)\lambda$ we also have a Borel measure on $\mathfrak{B}$. Now, we have three Borel measures on $\mathfrak{B}$: $\bar{\mu}$, $\mu$, and the (unique, as $\mathcal{S}$ is $\cap$-stable) extension of $\mu_0$ from $\mathcal{S}$ to $\mathfrak{B}$. All three coincide on $\mathcal{S}$ and therefore must be equal on $\mathfrak{B}$. □

3.11 Example. (Cantor ternary set). Consider the following family of subsets of $[0,1]$. Let $\Omega = C_0 = [0,1]$, $G_1 = (\tfrac{1}{3},\tfrac{2}{3})$, $C_1 = C_0\setminus G_1$,
$G_2 = (\tfrac{1}{9},\tfrac{2}{9}) \cup (\tfrac{7}{9},\tfrac{8}{9})$, $C_2 = C_1\setminus G_2$, as depicted in Figure 3.1 below:

[Figure 3.1: the stages $C_0$, $C_1$, $C_2$ of the construction, showing the removed open intervals $O_k(n)$ and the remaining closed intervals $F_k(n)$.]

Therefore, each $C_n$ is the union of $2^n$ closed intervals, while each $G_n$ is the union of $2^{n-1}$ open intervals. Also,
$$C_{n+1} = C_n\setminus G_{n+1} = C_n\setminus\bigcup_{k=1}^{2^n}O_k(n+1) = \bigcup_{k=1}^{2^{n+1}}F_k(n+1)$$
and $\{C_n\}$ is a monotone decreasing sequence of sets. The Cantor set is defined as
$$C = \bigcap_{n=1}^{\infty}C_n$$
and it can be characterized as follows.

1) $C$ is closed as the intersection of closed sets.

2) Each $C_n$ contains $2^n$ closed disjoint intervals $F_1(n),\dots,F_{2^n}(n)$. Each of these intervals is a term of a monotone decreasing sequence $\{F_k(n)\}$ with $d_e(F_k(n)) = \lambda(F_k(n)) \downarrow 0$, $n \to \infty$. By applying Cantor's Theorem 5.4, Chapter 2, we conclude that for all $k = 1,2,\dots$, $\bigcap_{n=1}^{\infty}F_k(n)$ consists of exactly one point. In other words, $C$ is a union of single points containing no interval and is therefore nowhere dense.

5) The Lebesgue measure of $G_n$ is $\lambda(G_n) = \tfrac{1}{2}\big(\tfrac{2}{3}\big)^n$, since $G_n$ consists of $2^{n-1}$ disjoint open intervals, each of length $3^{-n}$.

6) Thus
$$\lambda\Big(\bigcup_{n=1}^{\infty}G_n\Big) = \sum_{n=1}^{\infty}\frac{1}{2}\Big(\frac{2}{3}\Big)^n = 1.$$
Hence $\lambda(C) = 1 - 1 = 0$ and therefore $C$ is a Borel $\lambda$-null set.

7) $C$ is not empty, since $C$ contains all boundary points of the sets $C_n$, which are $0, 1, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{3^2}, \tfrac{2}{3^2}, \tfrac{7}{3^2}, \tfrac{8}{3^2}, \dots$ The boundary points have the following ternary representations:

$0 = 0.0000\ldots$
$1 = 1.0$ (or) $= 0.22222\ldots$ (in triadic representation)
$\tfrac{1}{3} = 0.1$ (or) $= 0.02222\ldots$
$\tfrac{2}{3} = 0.2 = 0.1222\ldots$
$\tfrac{1}{9} = 0.01 = 0.00222\ldots$
$\tfrac{2}{9} = 0.02 = 0.0122\ldots$
$\tfrac{7}{9} = 0.21 = 0.2022\ldots$
Each set C n has exactly 2 n boundary points, each of which has a unique triadic representation consisting of all n-tuples of digits 0 or 2. Observe that 2 = j
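The length computations in items 5) and 6) of Example 3.11 can be reproduced with exact arithmetic. This is a small sketch only; the function names `next_stage`, `cantor_stage`, and `length` are hypothetical helpers introduced for illustration. Each stage $C_n$ is stored as a list of closed intervals $[a,b]$, and the removed length at stage $n$ is compared with $\tfrac{1}{2}(\tfrac{2}{3})^n$.

```python
from fractions import Fraction

def next_stage(intervals):
    """Remove the open middle third from each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_stage(n):
    """Closed intervals F_1(n), ..., F_{2^n}(n) whose union is C_n."""
    stage = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        stage = next_stage(stage)
    return stage

def length(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(b - a for a, b in intervals)

C3 = cantor_stage(3)
# Length removed when passing from C_{n-1} to C_n, i.e. the measure of G_n.
removed = [length(cantor_stage(n - 1)) - length(cantor_stage(n)) for n in range(1, 4)]
```

The total length of $C_n$ computed this way is $(2/3)^n$, which tends to zero, in agreement with $\lambda(C) = 0$.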
PROBLEMS

3.1 Let $H = \{x = (x_1,\dots,x_n) \in \mathbb{R}^n: x_i = a \in \mathbb{R}\}$ be a hyperplane orthogonal to the $i$th coordinate axis. Show that $H$ is a $\lambda$-null Borel set. [Hint: 1) Show that $H$ is closed in $(\mathbb{R}^n,\tau_e)$ and hence Borel. 2) Find a relevant countable cover of $H$ by rectangles from $\mathcal{S}$ and apply Lemma 3.6.]

3.2 Show that each countable subset of $\mathbb{R}^n$ is a Borel $\lambda$-null set. [Hint: Use Problem 3.1.]

3.3 Show that $f$ defined by (*) in Remark 3.5 (iii) is an extended distribution function.

3.4 Let $f_1$ and $f_2$ be two extended distribution functions and let $\mu_1$ and $\mu_2$ be the corresponding Borel-Lebesgue-Stieltjes measures induced by these functions. Show that $\mu_1 = \mu_2$ if and only if $f_1 - f_2 = c$, where $c$ is a constant function.

3.5 Let $\mathfrak{D}_e$ be the set of all extended distribution functions. Show that $\mathfrak{D}_e$ is a semilinear space over $\mathbb{R}_+$.

3.6 Let $f_1$ and $f_2$ be two extended distribution functions. If $\mu_1$ and $\mu_2$ are the corresponding Borel-Lebesgue-Stieltjes measures induced by $f_1$ and $f_2$, show that for any nonnegative scalars $a_1$ and $a_2$, $\Phi(a_1\mu_1 + a_2\mu_2) = \{a_1f_1 + a_2f_2;\mu\}$, where $\Phi$ is defined in Remark 3.5 (iii).

3.7 Let $f: \mathbb{R} \to \mathbb{R}$ be an extended distribution function and let $\mu_f$ be the corresponding Borel-Lebesgue-Stieltjes measure on $\mathfrak{B}(\mathbb{R})$. Show that
a) $\mu_f((a,b)) = f(b-) - f(a)$
b) $\mu_f([a,b]) = f(b) - f(a-)$
c) $\mu_f([a,b)) = f(b-) - f(a-)$
d) $f$ is continuous if and only if $\mu_f(\{x\}) = 0$, $x \in \mathbb{R}$.

3.8 Let $f$ be the extended distribution function on $\mathbb{R}$ given by
$$f(x) = \begin{cases} -1, & x < -2 \\ 1 + x, & -2 \leq x < 0 \\ 2 + x^2, & 0 \leq x < 3 \\ 15, & 3 \leq x \end{cases}$$
and let $\mu_f$ be the corresponding Borel-Lebesgue-Stieltjes measure. Evaluate the measure of the following sets: a) $\{3\}$ b) $\{-2\}$ c) $[-1,3)$ d) $[0,$�$) \cup (3,5)$.

3.9 Let $f$ be a distribution function and let $\mu_f$ denote the Borel-Lebesgue-Stieltjes measure induced by $f$. Justify with a proof or give a counter-argument:
a) Must $f$ be an extended distribution function?
b) Suppose $g$ is a function defined by (*) of Remark 3.5 (iii). Is $g$ a distribution function? If your answer is yes, is $g = f$?

3.10 Let $\mu$ be an atomic measure $\big(\mu = \sum_{i=1}^{\infty}\alpha_i\varepsilon_{b_i}\big)$.
1) Is $\mu$ always a Borel-Lebesgue-Stieltjes measure? If it is not, give a condition under which $\mu$ is a Borel-Lebesgue-Stieltjes measure.
2) Find in this case $\{f;\mu\}$.
3) Plot one such $f$.

3.11 Consider the Borel $\sigma$-algebra $\mathfrak{B} = \mathfrak{B}(\mathbb{R}^n)$ generated by the usual topology. Show that, for any Borel set $B \in \mathfrak{B}$ and any point $x \in \mathbb{R}^n$, $B + x = \{z \in \mathbb{R}^n: z = y + x,\ y \in B\} \in \mathfrak{B}$. [Hint: Show that $\Sigma_x = \{A \in \mathfrak{B}: A + x \in \mathfrak{B}\}$ is a $\sigma$-algebra.]

3.12 Let $(\Omega,\mathfrak{B},\mu)$ be a Borel measure space, such that the Borel $\sigma$-algebra $\mathfrak{B}$ is generated by a Hausdorff topology $\tau$, and $\mu$ is a finite Borel measure. For any subset $Q \subseteq \Omega$ denote by $\mathcal{K}(Q)$ the collection of all compact subsets of $Q$. Show that the subcollection $\mathcal{M} \subseteq \mathfrak{B}$ of all sets $B \in \mathfrak{B}$ such that $\mu(B) = \sup\{\mu(K): K \in \mathcal{K}(B)\}$ is a monotone system in $\Omega$, i.e. the subcollection of those Borel sets that can be approximated "from below" by compact subsets is a monotone system.

3.13 Let $(\Omega,\mathfrak{B},\mu)$ be a special case of the Borel measure space introduced in Problem 3.12, namely, let $\Omega = \mathbb{R}^n$ and the Borel $\sigma$-algebra $\mathfrak{B} = \mathfrak{B}(\mathbb{R}^n)$ be generated by the usual topology $\tau_e$. Again assume that $\mu$ is a finite Borel measure. Show that in this case every Borel set $B$ can be approximated from below by a compact subset $K \subset B$; i.e. for every $\varepsilon > 0$ there is a compact subset $K(\varepsilon) \subset B$, such that $\mu(B) \leq \mu(K(\varepsilon)) + \varepsilon$.
3.14 Generalize Problem 3.13 allowing $\mu$ to be a $\sigma$-finite Borel measure.

3.15 Let $(\mathbb{R}^n,\mathfrak{B},\mu)$ be a Borel measure space, where the Borel $\sigma$-algebra $\mathfrak{B}$ is generated by the usual topology $\tau_e$ in $\mathbb{R}^n$ and $\mu$ is a finite Borel measure on $\mathfrak{B}$. Show that every Borel set can be "approximated from above" by an open set, i.e. if $\mathcal{O}(B)$ is the collection of all open supersets of $B$, then
$$\mu(B) = \inf\{\mu(O): O \in \mathcal{O}(B)\}.$$

3.16 Show that there is a non-Borel set in $\mathcal{P}(\mathbb{R})$. [Hints: We call $x, y \in \mathbb{R}$ equivalent ($x \sim y$) if and only if $x - y \in \mathbb{Q}$ (rational numbers). For every real number $x \in \mathbb{R}$ we assign another real number $y$ to the class $A_x$ if and only if $x - y \in \mathbb{Q}$.
1) Show that $(\mathbb{R},\sim)$ is indeed an equivalence relation. Let $\mathbb{R}|_\sim$ be the quotient set of $\mathbb{R}$ modulo $\sim$. Using the Axiom of Choice we select any element from each class of $\mathbb{R}|_\sim$ that belongs to the set $(0,1]$. Denote by $A$ the collection of all such elements.
2) Show that such a selection is possible taking into account the Axiom of Choice; i.e. it can be shown that $\forall x \in \mathbb{R}$, $A_x \cap (0,1] \neq \emptyset$.
3) Show that the set $A$ has the following properties:
$$\forall q \neq r \in \mathbb{Q}, \quad \{q + A\} \cap \{r + A\} = \emptyset. \qquad (P3.16)$$
$\mathbb{R}$ can be restored from $A$ as
$$\mathbb{R} = \bigcup_{q\in\mathbb{Q}}(q + A) = \sum_{q\in\mathbb{Q}}(q + A). \qquad (P3.16a)$$
4) Finally, let $\tilde{\mathbb{Q}} = \mathbb{Q} \cap (0,1]$. Then $\bigcup_{q\in\tilde{\mathbb{Q}}}(q + A) \subset (0,2]$. If $A$ is a Borel set (and this is the assumption that will lead to a contradiction), then by Problem 3.11, $x + A$ is a Borel set too; and by the translation-invariance of the Borel-Lebesgue measure $\lambda$, $\lambda(x + A) = \lambda(A)$, implying that
$$\lambda\Big(\bigcup_{q\in\tilde{\mathbb{Q}}}(q + A)\Big) = \sum_{q\in\tilde{\mathbb{Q}}}\lambda(q + A) \leq \lambda((0,2]) = 2.$$
Thus the above series is finite; and since the $\lambda(q + A)$ values are equal for all $q \in \tilde{\mathbb{Q}}$, each of them must be zero, which implies that $\lambda(q + A) = 0$, $\forall q \in \mathbb{Q}$. But
$$\mathbb{R} = \sum_{q\in\mathbb{Q}}(q + A) \ \Rightarrow\ \lambda(\mathbb{R}) = \sum_{q\in\mathbb{Q}}\lambda(q + A) = 0,$$
which is an absurdity. Thus, our assumption that $A$ is a Borel set was wrong.]

3.17 Let $\lambda$ denote the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathfrak{B}(\mathbb{R}^n)$. Show that for each Borel set $B$ and $\varepsilon > 0$, there is a countable cover of $B$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty}\lambda_0(C_k) < \lambda(B) + \varepsilon.$$
In particular,
$$\lambda(B) = \inf\Big\{\sum_{k=1}^{\infty}\lambda_0(C_k): \{C_k\} \in \mathcal{C}_B(\text{cubes})\Big\}.$$
[Hint: Use Problem 3.15.]

3.18 Show that if $N$ is a negligible set in $(\mathbb{R}^n,\mathfrak{B},\lambda)$, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty}\lambda_0(C_k) < \varepsilon.$$

3.19 Show that if $N$ is a subset of $\mathbb{R}^n$, and for each $\varepsilon > 0$, there is a countable cover of $N$ by semi-open (not necessarily disjoint) cubes $\{C_k\}$ such that
$$\sum_{k=1}^{\infty}\lambda_0(C_k) < \varepsilon,$$
then $N$ is negligible.

3.20 Show additivity of $f$ in Theorem 3.10 for the other combinations of $x$ and $y$.

3.21 Let $\mu$ be a translation-invariant Borel measure on $\mathfrak{B}(\mathbb{R}^n)$ and let $\mu^*$ be the outer measure produced by $(\mathfrak{B}(\mathbb{R}^n),\mu)$ and $\mathfrak{B}^*_\mu$ be the corresponding $\sigma$-algebra of $\mu^*$-measurable sets. Show that
(i) $\mu^* = \mu(C)\lambda^*$ on $\mathcal{P}(\mathbb{R}^n)$ and $\mu^*_0 = \mu(C)\lambda^*_0$ on $\mathfrak{B}^*_\mu$.
(ii) $\mathfrak{B}^*_\mu = L^*$.

3.22 Let $(\mathbb{R}^n,\mathfrak{B},\lambda)$ be the Borel-Lebesgue measure space and let $B$ be a compact set. Show that for each $\varepsilon > 0$, there is a finite cover of $B$ by semi-open rectangles $D_1,\dots,D_N$ such that
$$\sum_{k=1}^{N}\lambda(D_k) < \lambda(B) + \varepsilon. \qquad (P3.22)$$

3.23 For any $\varepsilon > 0$, construct an open set $D$ in $(\mathbb{R},\tau_e)$ which is dense in $\mathbb{R}$ and with $\lambda(D) < \varepsilon$.

3.24 Is every $\sigma$-finite Borel measure on $\mathfrak{B}(\mathbb{R}^n)$ also a Borel-Lebesgue-Stieltjes measure?

3.25 Give a direct proof (alternative to Theorem 3.1) that the Lebesgue elementary content $\lambda_0$ is $\sigma$-additive on the semi-ring $\mathcal{S}$ of half-open intervals in $\mathbb{R}^n$, not using any prior extension to the Lebesgue content $\lambda_c$ on $\mathcal{R}(\mathbb{R}^n)$, as Theorem 3.1 does.

3.26 Give a direct proof (alternative to Theorem 3.4) that the Lebesgue-Stieltjes elementary content is $\sigma$-additive, with no prior extension to the Lebesgue-Stieltjes content and bypassing $\emptyset$-continuity.
NEW TERMS:
Lebesgue outer measure
Lebesgue $\sigma$-algebra
Lebesgue measure
Borel-Lebesgue measure
Borel measure
Borel measure space
Borel-Lebesgue-Stieltjes measure
Borel outer measure
Lebesgue-Stieltjes measure
Lebesgue-Stieltjes content
distribution function
extended distribution function
translation-invariant Borel measure
Cantor ternary set
measure of a hyperplane
non-Borel set
4. IMAGE MEASURES
In Remark 3.5 (iii) we saw how Borel-Lebesgue-Stieltjes measures can be generated by measurable functions belonging to the class of so-called extended distribution functions. In this section we will also generate measures by elements of the far more general $\mathcal{C}^{-1}$-class of measurable functions. The very process of generation of a measure is totally different from that in Remark 3.5 (iii) and the two notions should not be confused with each other. Section 3, Chapter 4, is a relevant prerequisite to this material.

4.1 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f: (\Omega,\Sigma) \to (\Omega',\Sigma')$ be a measurable function. Then the set function
$$A' \mapsto \mu f^*(A') = \mu(f^*(A'))$$
on $\Sigma'$ is a measure. □

(See Problem 4.1.)

4.2 Definition and Notation. The measure $\mu f^*$ in Proposition 4.1 induced by a measurable function $f$ is called an image measure. Notice that directly from Definition 2.1 (viii), $\mu f^*(A')$ can alternatively be viewed as $\mu\{\omega \in \Omega: f(\omega) \in A'\}$ or, shortly, as $\mu\{f \in A'\}$. □

4.3 Proposition. Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be such that $L(x) = ax + b$, where $a \in \mathbb{R}\setminus\{0\}$ and $b \in \mathbb{R}^n$. Then the Borel-Lebesgue measure $\lambda$ on $\mathfrak{B}(\mathbb{R}^n)$ has the property $\lambda L^* = \frac{1}{|a|^n}\lambda$. Specifically, if $a = 1$ we have $\lambda L^* = \lambda$, which shows that the Borel-Lebesgue measure is translation-invariant.

Proof.

1) Let $f(x) = ax$ (called a homothetic function), where $x \in \mathbb{R}^n$ and $a\ (\neq 0) \in \mathbb{R}$. Let $\lambda$ be the Borel-Lebesgue measure on the Borel $\sigma$-algebra $\mathfrak{B}$. We show that
$$\lambda f^* = \frac{1}{|a|^n}\lambda.$$
Take $(\alpha,\beta] \in \mathcal{S}$, with $\alpha = (\alpha_1,\dots,\alpha_n)$ and $\beta = (\beta_1,\dots,\beta_n)$. Then, for $a > 0$,
$$f^*((\alpha,\beta]) = \prod_{i=1}^{n}\Big(\frac{\alpha_i}{a},\frac{\beta_i}{a}\Big]$$
(for $a < 0$ the endpoints of each factor are interchanged), which implies that
$$\lambda f^*((\alpha,\beta]) = \frac{1}{a^n}\lambda((\alpha,\beta]) \quad\text{for } a > 0$$
and
$$\lambda f^*((\alpha,\beta]) = \frac{(-1)^n}{a^n}\lambda((\alpha,\beta]) \quad\text{for } a < 0,$$
and thus
$$\lambda f^*((\alpha,\beta]) = \frac{1}{|a|^n}\lambda((\alpha,\beta]).$$
As a continuous map relative to the usual topology, $f$ is Borel and, consequently, $\lambda f^*$ is a Borel measure on $\mathfrak{B}$. Obviously, $\frac{1}{|a|^n}\lambda$ is also a Borel measure on $\mathfrak{B}$. Since $\lambda f^*$ and $\frac{1}{|a|^n}\lambda$ are $\sigma$-finite on $\mathcal{S}$ and coincide on $\mathcal{S}$ (being a $\cap$-stable generator of $\mathfrak{B}$), and since $\frac{1}{|a|^n}\lambda$ is $\sigma$-additive, by Corollary 2.14, they should also coincide on $\mathfrak{B}$.

2) Let $g(x) = x + b$. Similarly, we can show (see Problem 4.2) that $\lambda g^*$ is a Borel measure on $\mathfrak{B}$ and that $\lambda g^* = \lambda$. Therefore, $\lambda$ is translation-invariant. Finally, $L = g \circ f$ and
$$\lambda L^* = \lambda f^* \circ g^* = \frac{1}{|a|^n}\lambda g^* = \frac{1}{|a|^n}\lambda.$$
The proposition is therefore proved. □

4.4 Remark. Proposition 4.3 tells us that the Borel-Lebesgue measure is invariant under translation, which is a sort of motion defined as $T_a(x) = x + a$. In the two-dimensional Euclidean space we know another form of motion, called rotation. A figure under a rotation $R$ and a subsequent translation $T_a$ is transformed into a congruent one. We can show that an arbitrary Borel set in $\mathbb{R}^n$ rotated and then translated preserves its volume. (See Corollary 2.3, Chapter 7.)

In the $n$-dimensional Euclidean space, instead of rotation, we use an orthogonal transformation. More precisely, in $\mathbb{R}^n$ an orthogonal transformation is in the form of an orthogonal $n \times n$ matrix; recall that an $n \times n$ nonsingular matrix $R$ is orthogonal if $RR^T = R^TR = I$ (the identity matrix). The composition $M = T_a \circ R$ is an example of a motion. Generally, a bijective map $M$ from one metric space $(X,d)$ onto another metric space $(Y,\rho)$ is called an isometry if it preserves the distance, i.e. if for every pair $x, y \in X$, $d(x,y) = \rho(M(x),M(y))$. Two such metric spaces are said to be isometric. A motion is a special case of isometry when $Y = X$, $\rho = d$. In the Euclidean space $(\mathbb{R}^n,d_e)$, a motion $M$ can be represented by the composition $T_a \circ R$, where $T_a$ is a translation map and $R$ is an orthogonal matrix. As a continuous map, the motion is also Borel. It can be shown (see Problem 4.10) that the Borel-Lebesgue measure is motion-invariant, i.e. $\lambda M^* = \lambda$. □

4.5 Examples.

(i)
Let (O.,E,J.L) be a probability space and let (O.',E') be a measure
4.
Image Measures
279
space. Then any $\Sigma$-$\Sigma'$-measurable function $f\colon \Omega \to \Omega'$ is called a random variable. The corresponding image measure $\mu f^*$ is called the probability distribution (of the random variable $f$). Observe that in probability theory, a probability measure is denoted by $\mathbb{P}$ and a random variable is denoted by upper case letters like $X$, $Y$ or $Z$. In most applications, $\Omega'$ is the numeric set $\mathbb{R}^n$ or a subset of $\mathbb{R}^n$, and $\Sigma'$ is the corresponding Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$ or its trace on the subset. We would like to emphasize that a measurable function, say $X$, can only be a random variable if it is associated with a particular probability measure $\mathbb{P}$, along with which it induces the probability distribution. The latter specifies the random variable. In other words, measurable functions may share the same measurable space, but as far as probability theory goes, they differ if they induce different probability distributions (or more precisely, different classes of probability distributions categorized by their parameters).

(ii) Let $(\Omega,\Sigma,\mathbb{P})$ be a probability space and $X\colon \Omega \to \{0,1,\ldots\}$ be a random variable such that $\mathbb{P}X^*$ is a Poisson measure $\pi_\lambda$. Then the random variable $X$ is called a Poisson random variable. Similarly, a random variable $X\colon \Omega \to \{0,\ldots,n\}$ is called binomial if $\mathbb{P}X^*$ is a binomial measure $\beta_{n,p}$. A random variable $X$ is called (discrete) uniformly distributed if $\mathbb{P}X^* = \sum_{k=0}^{n} \frac{1}{n+1}\,\varepsilon_k$. As it was pointed out in (i), $X\colon \Omega \to \{0,\ldots,n\}$ is just a measurable function (which can be uniform or binomial), and it becomes a random variable upon specification of its distribution $\mathbb{P}X^*$ or, even earlier, the probability measure $\mathbb{P}$.

These are examples of so-called discrete random variables. The construction of probability distributions of continuous random variables (i.e., those whose ranges are continuums) requires integration and the concept of a density. The latter will be developed in Chapters 6 and 8. □

PROBLEMS

4.1 Prove Proposition 4.1.

4.2 Prove part 2 of Proposition 4.3.

4.3 Let $(\Omega,\Sigma,\mu)$ be a measure space with $\Omega = \mathbb{R}$, $\Sigma = \{A \subseteq \mathbb{R}\colon \text{either } A \text{ or } A^c \preceq \mathbb{N}\}$ and let $\mu(A) = 0$ for $A \preceq \mathbb{N}$ and $\mu(A) = 1$ for $A^c \preceq \mathbb{N}$. Let $\Omega' = \{0,1\}$, $\Sigma' = \mathcal{P}(\Omega')$. Define $[\Omega, \Omega', f]$ as
$$ f(x) = \begin{cases} 0, & \text{if } x \text{ is rational} \\ 1, & \text{if } x \text{ is irrational.} \end{cases} $$
Prove that $f$ is $\Sigma$-$\Sigma'$-measurable and determine $\mu f^*$.
4.4 What are the traces of Borel $\sigma$-algebras on $\Omega' = \{0,1,\ldots\}$ and $\Omega' = \{0,1,\ldots,n\}$ introduced in Example 4.5 (ii)?

4.5 Let $\mathcal{M}(\mathbb{R}^n)$ be the collection of all motions on $(\mathbb{R}^n, d_e)$. Show that $(\mathcal{M}(\mathbb{R}^n), \circ)$, where $\circ$ is the composition operator, forms a group with unity.

4.6 Let $f$ be a homothetic function ($f(x) = ax$) defined in Proposition 4.3, part 1, $\lambda$ the Borel-Lebesgue measure on $\mathcal{B}(\mathbb{R}^n)$, $\lambda^*$ the Lebesgue outer measure, $L^*$ the $\sigma$-algebra of Lebesgue measurable sets, and $\lambda_0^*$ the Lebesgue measure on $L^*$. Let $\mu^*$ be the outer measure generated by the image measure $\lambda f^*$, $\mathcal{B}_\mu^*$ the $\sigma$-algebra of all $\mu^*$-measurable sets, and $\mu_0^* = \mathrm{Res}_{\mathcal{B}_\mu^*}\,\mu^*$. Show that:
a) $\mu^* = |a|^{-n} \lambda^*$ on $\mathcal{P}(\mathbb{R}^n)$.
b) $\mu_0^* = |a|^{-n} \lambda_0^*$ on $\mathcal{B}_\mu^*$.
c) $\mathcal{B}_\mu^* = L^*$.

4.7 Generalize Problem 4.6 by letting $f$ be a special case of the affine map $f(x) = ax + b$, $a \neq 0$, $b \in \mathbb{R}^n$.

4.8 Show that the Lebesgue measure $\lambda_0^*$ on $L^*$ is translation-invariant.

4.9 Let $\mu$ be a translation-invariant Borel measure on $\mathcal{B}(\mathbb{R}^n)$ and let $\mu^*$ be the outer measure produced by $(\mathcal{B}(\mathbb{R}^n),\mu)$. Show that:
a) $\mu^* = \mu(C)\,\lambda^*$, where $C$ is the unit cube.
b) $\mathcal{B}_\mu^* = L^*$.

4.10 Show that the Borel-Lebesgue measure is motion-invariant. (See also Chapter 7.)
NEW TERMS:
$\mathcal{C}^{-1}$-class of functions
image measure
homothetic function
orthogonal transformation
isometry
isometric metric spaces
motion
motion-invariant measure
random variable
probability distribution
Poisson random variable
binomial random variable
discrete random variable
translation-invariance of Lebesgue measure

5. EXTENDED REAL-VALUED MEASURABLE FUNCTIONS
5.1 Definitions and Notations.

(i) Recall (Section 3, Chapter 4) that $\mathcal{C}^{-1}(\Omega,\Sigma;\Omega',\Sigma')$ denotes the collection of all measurable functions from a measurable space $(\Omega,\Sigma)$ to a measurable space $(\Omega',\Sigma')$. If $\Omega' = \mathbb{R}$ and $\Sigma' = \mathcal{B}(\mathbb{R})$, then $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ will denote the class of all real-valued measurable functions on a measurable space $(\Omega,\Sigma)$. The class of all complex-valued measurable functions will be denoted by $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ $(= \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C},\mathcal{B}(\mathbb{C})))$. Using the notion of product measures (Section 6, Chapter 6) we can show that a function $f = u + iv \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ if and only if $u, v \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$.

(ii) In Examples 1.2 (iv) and 10.19 (i), Chapter 3, we constructed a topology on the extended real line $\overline{\mathbb{R}}$ via "two-point compactification." The formed topological space $(\overline{\mathbb{R}},\overline{\tau})$ included all open sets of $(\mathbb{R},\tau)$ and, in addition, open sets of types
$$ O \cup \{+\infty\}, \quad O \cup \{-\infty\} \quad \text{and} \quad O \cup \{+\infty\} \cup \{-\infty\}, $$
where $O \in \tau$. The corresponding Borel $\sigma$-algebra $\mathcal{B}(\overline{\mathbb{R}}) = \Sigma(\overline{\tau})$, therefore, consists of all sets of $\mathcal{B}(\mathbb{R})$ and combinations of unions of Borel sets with the sets $\{+\infty\}$ and $\{-\infty\}$. In this section, we will be concerned with the class of all extended real-valued functions $f\colon \Omega \to \overline{\mathbb{R}}$ which are $\Sigma$-$\mathcal{B}(\overline{\mathbb{R}})$-measurable, where $\Sigma$ is a $\sigma$-algebra in $\Omega$. We denote such a class by $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ (or sometimes shortly by $\mathcal{C}^{-1}$ if a measurable space $(\Omega,\Sigma)$ is previously specified). □

We give a simple criterion for measurability of $\mathcal{C}^{-1}$-functions.

5.2 Proposition. A function $f\colon \Omega \to \overline{\mathbb{R}}$ is measurable (i.e., $f \in \mathcal{C}^{-1}$) if and only if, for every real value $a$, the set $\{\omega \in \Omega\colon f(\omega) \le a\} = f^*([-\infty,a])$, in notation $\{f \le a\}$, is measurable, i.e., is an element of $\Sigma$.

Proof. We shall show that the collection of sets $\{[-\infty,a]\colon a \in \mathbb{R}\}$ is a generator of $\mathcal{B}(\overline{\mathbb{R}})$. Then the statement will follow directly from Proposition 3.4, Chapter 4. Denote $\mathcal{B}' = \Sigma(\{[-\infty,a]\colon a \in \mathbb{R}\})$. Then,
$$ (a,b] = [-\infty,b] \setminus [-\infty,a] \in \mathcal{B}', $$
and hence the collection of all half-open intervals $(a,b]$ is contained in $\mathcal{B}'$, which implies that $\mathcal{B}(\mathbb{R}) \subseteq \mathcal{B}'$. Since
$$ \{+\infty\} = \bigcap_{k=1}^{\infty} [k,+\infty] \quad \text{and} \quad \{-\infty\} = \bigcap_{k=1}^{\infty} [-\infty,-k], $$
we have that $\{+\infty\}, \{-\infty\} \in \mathcal{B}'$. Thus $\mathcal{B}(\overline{\mathbb{R}}) \subseteq \mathcal{B}'$. Also,
$$ [-\infty,a] = \{-\infty\} \cup (-\infty,a) \cup \{a\} \in \mathcal{B}(\overline{\mathbb{R}}). $$
Therefore, $\{[-\infty,a]\colon a \in \mathbb{R}\} \subseteq \mathcal{B}(\overline{\mathbb{R}})$, which yields that $\mathcal{B}' \subseteq \mathcal{B}(\overline{\mathbb{R}})$ and, finally, $\mathcal{B}' = \mathcal{B}(\overline{\mathbb{R}})$. □

Proposition 5.2 can be extended to a number of modifications of conditions equivalent to measurability.
5.3 Corollary. A function $f\colon \Omega \to \overline{\mathbb{R}}$ is measurable (i.e., $f \in \mathcal{C}^{-1}$) if and only if any of the following conditions holds.
(i) $\{f \ge a\} \in \Sigma$, $\forall a \in \mathbb{R}$.
(ii) $\{f > a\} \in \Sigma$, $\forall a \in \mathbb{R}$.
(iii) $\{f < a\} \in \Sigma$, $\forall a \in \mathbb{R}$.

Proof. (iii)
$$ \{f < a\} = f^*([-\infty,a)) = f^*\Big(\bigcup_{n=1}^{\infty} \big[-\infty, a - \tfrac1n\big]\Big) = \bigcup_{n=1}^{\infty} f^*\big(\big[-\infty, a - \tfrac1n\big]\big) = \bigcup_{n=1}^{\infty} \big\{f \le a - \tfrac1n\big\}, $$
i.e. $\{f \le b\} \in \Sigma$, $\forall b \in \mathbb{R}$, implies that $\{f < a\} \in \Sigma$. Similarly,
$$ \{f \le a\} = f^*\Big(\bigcap_{n=1}^{\infty} \big[-\infty, a + \tfrac1n\big)\Big) = \bigcap_{n=1}^{\infty} f^*\big(\big[-\infty, a + \tfrac1n\big)\big) = \bigcap_{n=1}^{\infty} \big\{f < a + \tfrac1n\big\}, $$
i.e. $\{f < b\} \in \Sigma$, $\forall b \in \mathbb{R}$, implies that $\{f \le a\} \in \Sigma$. Therefore, the statements $\{f \le a\} \in \Sigma$, $\forall a \in \mathbb{R}$, and $\{f < a\} \in \Sigma$, $\forall a \in \mathbb{R}$, are equivalent, which along with Proposition 5.2 yields statement (iii). □

(For the proof of (i) and (ii), see Problem 5.1.)

5.4 Proposition. Let $f, g \in \mathcal{C}^{-1}$. Then $\{f < g\} \in \Sigma$.

Proof. We show that
$$ \{f < g\} = \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\} $$
by using the pick-a-point process. We exclude the trivial case when $\{f < g\} = \emptyset$. If $\omega_0 \in \{f(\omega) < g(\omega)\}$, then equivalently, $f(\omega_0) < g(\omega_0)$, implying that there exists an $r_0 \in \mathbb{Q}$ such that $f(\omega_0) < r_0 < g(\omega_0)$. Hence,
$$ \omega_0 \in \{f(\omega) < r_0\} \cap \{g(\omega) > r_0\} \quad \text{and} \quad \{f < g\} \subseteq \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\}. $$
Conversely, if
$$ \omega_0 \in \bigcup_{r \in \mathbb{Q}} \{f < r\} \cap \{g > r\}, $$
then there exists an $r_0 \in \mathbb{Q}$ such that $\omega_0 \in \{f(\omega) < r_0\} \cap \{g(\omega) > r_0\}$ and $f(\omega_0) < r_0 < g(\omega_0)$, implying that $f(\omega_0) < g(\omega_0)$. Thus $\omega_0 \in \{f(\omega) < g(\omega)\}$. Now the statement follows from Corollary 5.3 (ii, iii), since the union is countable. □

5.5 Definitions. In the situations below we will deal with spaces of measurable functions that have not occurred before. We discuss the following constructions.

(i) Let $\mathbb{F}$ be a field and let $\mathfrak{F}$ be a vector lattice over $\mathbb{F}$ and a commutative ring with unity. Observe that $(\mathfrak{F},\mathbb{F})$ is an algebra and $(\mathfrak{F},\cdot)$ is a multiplicative Abelian semigroup with unity (i.e. a group that perhaps fails to have multiplicative inverses); call it shortly an $\mathfrak{F}$-space over $\mathbb{F}$. Throughout the remainder of this book, as an $\mathfrak{F}$-space, we shall consider a class of functions (extended real- or complex-valued over the field $\mathbb{R}$ or $\mathbb{C}$). For instance, the space of all continuous functions is an $\mathfrak{F}$-space over $\mathbb{R}$. [Note that the term $\mathfrak{F}$-space is not common in real analysis literature and is restricted to the use in this book.]

(ii) Let $\overline{\mathbb{R}}^\Omega$ be the set of all functions from $\Omega$ to $\overline{\mathbb{R}}$ (as we defined it in Section 5, Chapter 1), and let $(\overline{\mathbb{R}}^\Omega,\tau_p)$ be the topology of pointwise convergence (cf. Definition 5.11, Chapter 3) generated by the compact topology $(\overline{\mathbb{R}},\overline{\tau})$ in each of the factor spaces. Let us call $(\overline{\mathbb{R}}^\Omega,\tau_p)$ the extended topology of pointwise convergence. Let $\mathfrak{F}$ be a subset of $(\overline{\mathbb{R}}^\Omega,\tau_p)$ such that it is an $\mathfrak{F}$-space over $\mathbb{R}$. We call $\mathfrak{F}$ a closed $\mathfrak{F}$-space if $(\mathfrak{F},\tau_p)$ contains the limits of all $\tau_p$-convergent sequences. In other words, it contains the limit of every pointwise convergent sequence (observe that since $(\overline{\mathbb{R}},\overline{\tau})$ is Hausdorff, any pointwise limit is unique). For instance, the space of all continuous functions is not a closed $\mathfrak{F}$-space.

(iii) Consider the subspace $(\mathcal{C}^{-1},\tau_p) \subseteq (\overline{\mathbb{R}}^\Omega,\tau_p)$ of all measurable functions structured in terms of the extended topology of pointwise convergence. The next proposition states that $(\mathcal{C}^{-1},\tau_p)$ is, so far, the widest known class of functions, second only to $\overline{\mathbb{R}}^\Omega$. □
5.6 Proposition. $(\mathcal{C}^{-1},\tau_p)$ is a closed $\mathfrak{F}$-space over $\mathbb{R}$, that is, for any $f, g \in \mathcal{C}^{-1}$ and for $\{f_n\colon n = 1,2,\ldots\} \subseteq \mathcal{C}^{-1}$:
(i) $af + b \in \mathcal{C}^{-1}$ $(a,b \in \mathbb{R})$.
(ii) $f \pm g \in \mathcal{C}^{-1}$.
(iii) $f \cdot g \in \mathcal{C}^{-1}$.
(iv) $\sup\{f_n\} \in (\mathcal{C}^{-1},\tau_p)$ and $\inf\{f_n\} \in (\mathcal{C}^{-1},\tau_p)$; specifically, it follows that $\mathcal{C}^{-1}$ is a lattice, and thus with any $f \in \mathcal{C}^{-1}$, also $|f| \in \mathcal{C}^{-1}$.
(v) $\overline{\lim}\, f_n = \inf_{n=1,2,\ldots} \sup\{f_m\colon m \ge n\} \in (\mathcal{C}^{-1},\tau_p)$ and $\underline{\lim}\, f_n = \sup_{n=1,2,\ldots} \inf\{f_m\colon m \ge n\} \in (\mathcal{C}^{-1},\tau_p)$.
(vi) If $f_n \to f$ in the extended topology of pointwise convergence, then $f \in (\mathcal{C}^{-1},\tau_p)$.

Proof.

(i) is obvious.

(ii) By (i), $a - g \in \mathcal{C}^{-1}$. Since
$$ \{f + g \le a\} = \{f \le a - g\}, $$
Problem 5.2 (i) implies that $\{f + g \le a\} \in \Sigma$ for every $a \in \mathbb{R}$. Therefore, $f + g \in \mathcal{C}^{-1}$.

(iii) $\{f^2 > a\} = \Omega$ $(a < 0)$, $\{f^2 > a\} = \{f > \sqrt{a}\} \cup \{f < -\sqrt{a}\}$ $(a \ge 0)$ $\Rightarrow$ $\{f^2 > a\} \in \Sigma$ $\Rightarrow$ $f^2 \in \mathcal{C}^{-1}$. The statement follows from the representation
$$ f \cdot g = \tfrac14 (f + g)^2 - \tfrac14 (f - g)^2. $$

(iv) We show that
$$ \{\sup\{f_n\} \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\}. $$
$\omega_0 \in \{\sup\{f_n\} \le a\}$ if and only if $\sup\{f_n(\omega_0)\} \le a$ or, equivalently, $f_n(\omega_0) \le a$ for all $n$, or, equivalently,
$$ \omega_0 \in \bigcap_{n=1}^{\infty} \{f_n(\omega) \le a\}. $$
The latter implies that
$$ \{\sup\{f_n\} \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\}. $$
Let $\{f_n\} \subseteq \mathcal{C}^{-1}$. Then $\{-f_n\} \subseteq \mathcal{C}^{-1}$. The statement follows from
$$ \inf\{f_n\} = -\sup\{-f_n\}. $$
Now if $f \in \mathcal{C}^{-1}$, it implies that $|f| = \sup\{f,-f\} \in \mathcal{C}^{-1}$.

(v) This statement directly follows from (iv).

(vi) $\lim_{n\to\infty} f_n = f$ if and only if $\overline{\lim}\, f_n = \underline{\lim}\, f_n = f$, and the statement follows from (v). □

PROBLEMS

5.1 Prove Corollary 5.3 (i) and (ii).

5.2 Prove that for $f, g \in \mathcal{C}^{-1}$,
(i) $\{f \le g\} \in \Sigma$
(ii) $\{f = g\} \in \Sigma$
(iii) $\{f \neq g\} \in \Sigma$.

5.3 Let $f, g \in \mathcal{C}^{-1}$. Show that $\omega \mapsto \cos(f^2(\omega) + 4g(\omega)) \in \mathcal{C}^{-1}$.

5.4 Show that if $f^3 \in \mathcal{C}^{-1}$ then $f \in \mathcal{C}^{-1}$. Show that if $f^2 \in \mathcal{C}^{-1}$ then $f$ need not be in $\mathcal{C}^{-1}$.

5.5 Let $f, g \in \mathcal{C}^{-1}$ and let $A \in \Sigma$. Show that $h(\omega) = f(\omega)\mathbf{1}_A(\omega) + g(\omega)\mathbf{1}_{A^c}(\omega) \in \mathcal{C}^{-1}$.

5.6 Let $f\colon (a,b] \to \mathbb{R}$ be a
a) monotone function
b) convex function
c) function with at most countably many discontinuities.
Show that in each case, $f$ is $\mathcal{B}((a,b])$-$\mathcal{B}(\mathbb{R})$-measurable.

5.7 Prove the statement: $f \in \mathcal{C}^{-1}$ if and only if $\{f > d\} \in \Sigma$ for all $d \in D$, where $D \subseteq \mathbb{R}$ is any dense set in $\mathbb{R}$.

5.8 Show that if $f$ has a derivative at each point of $\mathbb{R}$, then this derivative is Borel-measurable.
NEW TERMS:
$\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$-space
$\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$-space
extended real-valued function
measurability of an extended real-valued function, criteria of
$\mathfrak{F}$-space
extended topology of pointwise convergence
closed $\mathfrak{F}$-space

6. SIMPLE FUNCTIONS
The present section is a direct precursor to integration, which we develop in the next chapter. The integral itself will be first defined for simple functions valued in a finite set of nonnegative reals.

6.1 Definition. We consider the following subclass of functions from $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$, which we call nonnegative simple functions, and denote this subclass by $\Psi_+(\Omega,\Sigma) = \Psi_+(\Omega,\Sigma;\mathbb{R})$. An element $s$ is said to belong to $\Psi_+$ or to be nonnegative simple if:
a) $s \in \mathcal{C}^{-1}$.
b) $s(\omega) \ge 0$, $\forall \omega \in \Omega$.
c) $s$ takes on only finitely many real values. □

6.2 Remarks.

(i) Let $s \in \Psi_+(\Omega,\Sigma)$. If there is an $n$-tuple of nonnegative real numbers $\{a_1,\ldots,a_n\}$ and a finite decomposition $\sum_{k=1}^{n} A_k$ of $\Omega$ such that $s(\omega) = a_k$ for all $\omega \in A_k$, then the function $s$ (as in Figure 6.1) can obviously be represented as
$$ s = \sum_{k=1}^{n} a_k \mathbf{1}_{A_k}. \tag{6.1} $$

[Figure 6.1]

In some cases we may need to deal with different decompositions of $\Omega$. Consequently, there are in general different finite representations or expansions of $s \in \Psi_+$ of type (6.1). However, there is obviously a unique one where (6.1) contains all different values $\{a_1,\ldots,a_n\}$ of $s$. We wish to call such a representation (expansion) canonic.

(ii) For the upcoming material we will need some modifications of $\mathfrak{F}$-spaces introduced in Definition 5.5. Let $\mathbb{F}$ be a field. Recall that $\mathbb{F}_+ \subseteq \mathbb{F}$ is called the semifield if all axioms of the field hold except for #4 (the existence of additive inverses, see Definition 7.5, Chapter 1). If $\mathfrak{F}$ is a linear space over $\mathbb{F}$, the corresponding restriction $(\mathfrak{F};\mathbb{F}_+)$ is called a semi-linear space. If, in addition, $(\mathfrak{F};\mathbb{F})$ is a vector lattice, then $(\mathfrak{F};\mathbb{F}_+)$ is called a semi-linear lattice. Similarly, we can define corresponding restrictions of rings and algebras over $\mathbb{F}_+$, calling them quasirings and quasialgebras. If $\mathfrak{F}$ is a semi-linear lattice over a semifield $\mathbb{F}_+$ and a commutative quasiring with unity over $\mathbb{F}_+$, then we call the pair $(\mathfrak{F};\mathbb{F}_+)$ a semi-$\mathfrak{F}$-space.

(iii) In Chapter 8 (Section 4), we will also be using the notion of a simple function, which is just as in Definition 6.1, except that it is not necessarily nonnegative. The set of all such simple functions will be denoted by $\Psi(\Omega,\Sigma) = \Psi(\Omega,\Sigma;\mathbb{R})$. □

6.3 Proposition. $(\Psi_+(\Omega,\Sigma);\mathbb{R}_+)$ is a semi-$\mathfrak{F}$-space. In other words, if $s, t \in \Psi_+$, then:
(i) $as + bt \in \Psi_+$ $(a,b \in \mathbb{R}_+)$.
(ii) $s \cdot t \in \Psi_+$.
(iii) $\sup(s,t) \in \Psi_+$.
(iv) $\inf(s,t) \in \Psi_+$. □

(See Problem 6.1.)

We denote by $(\overline{\Psi}_+(\Omega,\Sigma),\tau_p)$ the subspace of all extended real-valued nonnegative functions $f \in \mathcal{C}^{-1}$ for each of which there exists a monotone nondecreasing sequence $\{s_n\} \subseteq \Psi_+$ of nonnegative simple functions such that $f = \sup\{s_n\}$ in the topology of pointwise convergence. By Proposition 5.6 (iv), $\overline{\Psi}_+ \subseteq \mathcal{C}^{-1}$, i.e. $\overline{\Psi}_+$ consists of only measurable functions. The following proposition asserts that $\overline{\Psi}_+$ is a semi-$\mathfrak{F}$-space (and it is the closure of $\Psi_+$ with respect to the topology of pointwise convergence).

6.4 Proposition. $\overline{\Psi}_+(\Omega,\Sigma)$ is a semi-$\mathfrak{F}$-space over $\mathbb{R}_+$, i.e. if $f, g \in \overline{\Psi}_+(\Omega,\Sigma)$, then:
(i) $af + bg \in \overline{\Psi}_+$ $(a,b \in \mathbb{R}_+)$.
(ii) $f \cdot g \in \overline{\Psi}_+$.
(iii) $\sup(f,g) \in \overline{\Psi}_+$.
(iv) $\inf(f,g) \in \overline{\Psi}_+$.

Proof.

(i) Let $f = \sup\{s_n\}$, $g = \sup\{t_n\}$. Then
$$ af + bg = \sup\{a s_n + b t_n\} $$
and $a s_n, b t_n \in \Psi_+$. Furthermore, $\{a s_n\}$ and $\{b t_n\}$ are monotone nondecreasing.

(ii) The proof is similar to that for (i).

(iii) Let $w_n = \sup(s_n,t_n)$. Then obviously, $\sup\{w_n\}$ exists and equals $\sup\{f,g\}$ (why?).

(iv) The proof is similar to that for (iii). □

6.5 Theorem. Let $\mathcal{C}_+^{-1} = \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}_+)$, i.e., the subclass of all nonnegative extended real-valued measurable functions. Then $\mathcal{C}_+^{-1} = \overline{\Psi}_+$ and it is a closed semi-$\mathfrak{F}$-space.

Proof. Evidently, $\overline{\Psi}_+ \subseteq \mathcal{C}_+^{-1}$. Therefore, we are left to prove that $\mathcal{C}_+^{-1} \subseteq \overline{\Psi}_+$. We will show that, for every $f \in \mathcal{C}_+^{-1}$, there is a monotone nondecreasing sequence $\{s_n\}$ of nonnegative simple functions from $\Psi_+$ such that $\sup\{s_n\} = f$. The latter is at the heart of the following construction. Let
$$ A_i(n) = \begin{cases} \big\{f \ge \tfrac{i}{2^n}\big\} \cap \big\{f < \tfrac{i+1}{2^n}\big\}, & i = 0,1,\ldots,n2^n - 1 \\ \{f \ge n\}, & i = n2^n. \end{cases} $$
For instance,
$$ A_0(n) = \{f \ge 0\} \cap \big\{f < \tfrac{1}{2^n}\big\} = f^*\big(\big[0,\tfrac{1}{2^n}\big)\big), \qquad A_1(n) = \big\{f \ge \tfrac{1}{2^n}\big\} \cap \big\{f < \tfrac{2}{2^n}\big\} = f^*\big(\big[\tfrac{1}{2^n},\tfrac{2}{2^n}\big)\big). $$
In other words,
$$ A_i(n) = f^*\big(\big[\tfrac{i}{2^n},\tfrac{i+1}{2^n}\big)\big), \quad i = 0,1,\ldots,n2^n - 1, $$
and $A_i(n) = f^*([n,\infty])$, $i = n2^n$. Therefore, all sets $A_i(n)$, $i = 0,\ldots,n2^n$, are disjoint and obviously $\Sigma$-measurable. Let us define
$$ s_n = \sum_{i=0}^{n2^n} \frac{i}{2^n}\, \mathbf{1}_{A_i(n)}. $$
Both $f$ and $s_n$ are depicted in Figure 6.2.
[Figure 6.2: the function $f$ and the simple function $s_n$]

Clearly $s_{n+1} \ge s_n$. Besides, $s_n(\omega) \le f(\omega) < s_n(\omega) + 2^{-n}$, $\forall \omega \in \Omega\colon f(\omega) < n$, and $s_n(\omega) = n$, $\forall \omega \in \Omega\colon f(\omega) = \infty$. Functions $s_n$ and $s_{n+1}$ are drawn in Figure 6.3.

[Figure 6.3: the simple functions $s_n$ and $s_{n+1}$]

Thus there exists $\sup\{s_n\} = f$ (pointwise $\forall \omega \in \Omega$), and therefore $f \in \overline{\Psi}_+$, implying that $\mathcal{C}_+^{-1} \subseteq \overline{\Psi}_+$. This proves that $\mathcal{C}_+^{-1} = \overline{\Psi}_+$. □

PROBLEMS

6.1 Prove Proposition 6.3.

6.2 Let $\Omega$ be an uncountable set and let $\Sigma = \{A \subseteq \Omega\colon A \text{ or } A^c \text{ is at most countable}\}$. Show that $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$ if and only if $f$ is constant everywhere except on an at most countable subset of $\Omega$.
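
The dyadic construction used in the proof of Theorem 6.5 can be tried out numerically. The following is a minimal sketch, not part of the original text: it evaluates $s_n$ pointwise, given only the value $f(\omega)$. The function name `dyadic_approx` and the sample values are assumptions made for illustration.

```python
import math

def dyadic_approx(value, n):
    """Return s_n(w) at a point w where f(w) = value (value >= 0, may be math.inf)."""
    if value >= n:
        # w lies in A_i(n) with i = n * 2**n, i.e. in {f >= n};
        # this branch also covers f(w) = +infinity
        return float(n)
    # otherwise f(w) lies in [i / 2**n, (i + 1) / 2**n) for exactly one i
    i = math.floor(value * 2**n)
    return i / 2**n

# sample values f(w) at a few hypothetical points w
samples = [0.0, 0.3, 2.75, math.pi, math.inf]
for n in (1, 2, 4, 8):
    print(n, [dyadic_approx(v, n) for v in samples])
```

For a fixed point, the printed values are nondecreasing in $n$ and stay within $2^{-n}$ of $f(\omega)$ as soon as $n > f(\omega)$; at a point with $f(\omega) = \infty$ they equal $n$ and so increase without bound.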
NEW TERMS:
nonnegative simple functions
canonic representation (expansion)
canonic expansion (representation)
semi-linear space
semi-linear lattice
quasiring
quasialgebra
semi-$\mathfrak{F}$-space
simple function
closed semi-$\mathfrak{F}$-space
Chapter 6
Elements of Integration

The historical significance of the development of measure theory is that it created a base for a generalization of the classical Riemann notion of the definite integral (which since 1854 was considered to be the most general theory of integration). Riemann defined a bounded function over an interval $[a,b]$ to be integrable if and only if the Darboux (or Cauchy) sums $\sum_{i=1}^{n} f(t_i)\lambda(I_i)$, where $\sum_{i=1}^{n} I_i$ is a finite decomposition of $[a,b]$ into subintervals, approach a unique limiting value whenever the length of the largest interval goes to zero. A French mathematician, Henri Lebesgue (1875-1941), assumed that the above intervals $I_i$ may be substituted by more general measurable sets and that the class of Riemann integrable functions can be enlarged to the class of measurable functions. In this case, we arrive at a more solid theory of integration, which is better suited for dealing with various limit processes and which greatly contributed to the contemporary theory of probability and stochastic processes.

Although many results existed prior to Lebesgue's major work between 1901 and 1910, Lebesgue's construction appeared to be the most efficient. After 1910, a large number of mathematicians began to engage in work initiated by Lebesgue. Some of the most significant contributions were made by the Frenchman Pierre Fatou (1878-1929), Italian Guido Fubini (1879-1943), Hungarian Frigyes (Frederic) Riesz (1880-1956), Pole Otto Nikodym (1887-1974), and Austrian Johann Radon (1887-1956), who developed the Lebesgue-Stieltjes integral and whose work led to the modern abstract theory of measure and integration.

In this chapter, we will first be concerned with the main principles of integration with respect to arbitrary measures. We will be using standard techniques developed for Lebesgue integration but without sacrificing generality. Then various applications of the integral will be considered. We will look at the integral as a measure (and later, in Chapter 8, in the general case, as a "signed measure"), at Radon-Nikodym derivatives, at decomposition of measures and decomposition of absolutely continuous functions, and at "multiple integration." Other applications of integration (including uniform integrability) and various principles of convergence will be developed in Chapter 8.

1. INTEGRATION ON $\mathcal{C}^{-1}(\Omega,\Sigma)$
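We begin the theory of integration with integrals of nonnegative simple functions, which we introduced in Section 6, Chapter 5. Prior to the definition of the rudimentary integral, the lemma below states that integrals of nonnegative simple functions are invariant of their representations.

1.1 Lemma. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $s \in \Psi_+(\Omega,\Sigma)$ have two representations:
$$ s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i} = \sum_{k=1}^{m} b_k \mathbf{1}_{B_k}. $$
Then it holds that
$$ \sum_{i=1}^{n} a_i \mu(A_i) = \sum_{k=1}^{m} b_k \mu(B_k). $$

Proof. The above representations are due to the two decompositions of $\Omega$:
$$ \Omega = \sum_{i=1}^{n} A_i = \sum_{k=1}^{m} B_k. $$
Then
$$ A_i = \sum_{k=1}^{m} A_i \cap B_k \quad \text{and} \quad B_k = \sum_{i=1}^{n} A_i \cap B_k, $$
which implies that
$$ \sum_{i=1}^{n} a_i \mu(A_i) = \sum_{i=1}^{n} \sum_{k=1}^{m} a_i \mu(A_i \cap B_k) $$
and
$$ \sum_{k=1}^{m} b_k \mu(B_k) = \sum_{k=1}^{m} \sum_{i=1}^{n} b_k \mu(A_i \cap B_k). $$
By noticing that $a_i = b_k$ on $A_i \cap B_k$ (whenever this set is nonempty), we are done with the proof. □

1.2 Definition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $s \in \Psi_+(\Omega,\Sigma)$ with the representation
$$ s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i}. $$
Then the number
$$ \sum_{i=1}^{n} a_i \mu(A_i) $$
is called the integral of $s$ with respect to $\mu$, and it is denoted by one of the symbols:
$$ \int s(\omega)\,d\mu(\omega) \quad \text{or} \quad \int s(\omega)\,\mu(d\omega) \quad \text{or, shortly,} \quad \int s\,d\mu. $$
□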
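
As a small illustration of Lemma 1.1 and Definition 1.2, the sketch below computes $\sum_i a_i \mu(A_i)$ for two representations of the same nonnegative simple function on a four-point set. The measure (given by point weights), the names `measure` and `integral_simple`, and the sample data are assumptions made for this example only.

```python
from fractions import Fraction

def measure(weights, subset):
    """mu(A) for a measure on a finite set, given by nonnegative point weights."""
    return sum(weights[w] for w in subset)

def integral_simple(representation, weights):
    """Integral of s = sum_i a_i 1_{A_i}: the number sum_i a_i * mu(A_i)."""
    return sum(a * measure(weights, A) for a, A in representation)

# hypothetical measure on Omega = {w1, w2, w3, w4}
weights = {"w1": Fraction(1, 2), "w2": Fraction(1, 3),
           "w3": Fraction(1, 6), "w4": Fraction(2)}

# two representations of the same s: s = 5 on {w1, w2} and s = 1 on {w3, w4}
canonic = [(5, {"w1", "w2"}), (1, {"w3", "w4"})]
refined = [(5, {"w1"}), (5, {"w2"}), (1, {"w3"}), (1, {"w4"})]

print(integral_simple(canonic, weights), integral_simple(refined, weights))
```

Both representations give the same number, as Lemma 1.1 asserts; exact rational arithmetic is used so that the comparison is not blurred by rounding.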
Since the value of the integral of a function $s$ does not depend upon its representation, this definition is consistent. In other words, the integral $s \mapsto \int s\,d\mu$ defines a functional on $\Psi_+$ valued in $\overline{\mathbb{R}}_+$.

1.3 Proposition (Properties of the integral).

(i) For each measurable set $A \in \Sigma$,
$$ \int \mathbf{1}_A\,d\mu = \mu(A). $$

(ii) The integral is a nonnegative linear functional, i.e.,
$$ \int (as + bt)\,d\mu = a \int s\,d\mu + b \int t\,d\mu, $$
where $s, t \in \Psi_+$ and $a, b \in \mathbb{R}_+$.

(iii) For any two nonnegative simple functions $s, t \in \Psi_+$ such that $s \le t$, it holds that $\int s\,d\mu \le \int t\,d\mu$ (monotonicity).

(See Problem 1.1.)

1.4 Example. Let $f$ be the Dirichlet function defined as $f = \mathbf{1}_{\mathbb{Q}}$ (earlier introduced in Example 4.7, Chapter 2), where $\mathbb{Q}$ is the set of all rational numbers (hence a Borel set). Thus $f \in \Psi_+(\mathbb{R},\mathcal{B})$. By Proposition 1.3 (i), the integral of $f$ with respect to Lebesgue measure $\lambda$ is
$$ \int f\,d\lambda = 1 \cdot \lambda(\mathbb{Q}) = \sum_{x \in \mathbb{Q}} \lambda(\{x\}) = 0. $$
□

For the upcoming definitions and statements we will denote a monotone nondecreasing sequence of functions by $\{f_n\}\uparrow$ and a monotone nonincreasing sequence of functions by $\{f_n\}\downarrow$.

1.5 Lemma. Let $\{s_n\}\uparrow \subseteq \Psi_+$ and $s \in \Psi_+$ such that $s \le \sup\{s_n\}$. Then
$$ \int s\,d\mu \le \sup\Big\{\int s_n\,d\mu\Big\}. $$

Proof. Let $s = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i}$ and let $\varepsilon \in (0,1)$ be any small number. Denote
$$ B_n = \{\omega\colon s_n \ge (1 - \varepsilon)s\} \quad (\in \Sigma). $$
Thus $s_n \ge s(1 - \varepsilon)\mathbf{1}_{B_n}$. By Proposition 1.3 (ii, iii),
$$ \int s_n\,d\mu \ge (1 - \varepsilon) \int s \mathbf{1}_{B_n}\,d\mu. $$
By the definition of $\{s_n\}$, it follows that $\{B_n\}\uparrow\Omega$, which implies that $\{A_i \cap B_n\}\uparrow A_i$. Therefore, by continuity from below of $\mu$ (Lemma 1.6, Chapter 5),
$$ \int s\,d\mu = \sum_{i=1}^{m} a_i \mu(A_i) = \sum_{i=1}^{m} a_i \lim_{n\to\infty} \mu(A_i \cap B_n) = \lim_{n\to\infty} \sum_{i=1}^{m} a_i \mu(A_i \cap B_n) = \lim_{n\to\infty} \int s \mathbf{1}_{B_n}\,d\mu. $$
The last equation is due to the relationship
$$ s \mathbf{1}_{B_n} = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i \cap B_n}. $$
Thus,
$$ \sup\Big\{\int s_n\,d\mu\Big\} = \lim_{n\to\infty} \int s_n\,d\mu \ge (1 - \varepsilon) \lim_{n\to\infty} \int s \mathbf{1}_{B_n}\,d\mu = (1 - \varepsilon) \int s\,d\mu, $$
which proves the statement because the inequality holds for each $\varepsilon > 0$. □

1.6 Corollary. For $\{s_n\}\uparrow, \{t_n\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$, it holds that
$$ \sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}. $$
□

(See Problem 1.2.)

Let us now turn to the integral of the functions from the more general class $\mathcal{C}_+^{-1} = \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}}_+)$, which we became familiar with first in Theorem 6.5, Chapter 5.

1.7 Definition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}$. By Theorem 6.5, Chapter 5, there is a monotone nondecreasing sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Hence, it is plausible to define
$$ \int f\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\} $$
and call it the integral of (an extended real-valued nonnegative function) $f$ with respect to measure $\mu$. By Corollary 1.6, the value of the integral, $\int f\,d\mu$, is unique, i.e., it does not depend on the choice of the sequence $\{s_n\}$. □

Analogous to Proposition 1.3 (ii, iii), we have:

1.8 Proposition. The integral introduced in Definition 1.7 is a positive, linear, monotone nondecreasing functional on $\mathcal{C}_+^{-1}$.

Proof. Let $f, g \in \mathcal{C}_+^{-1}$ and $a, b \in \mathbb{R}_+$. Then $f = \sup\{s_n\}$, $g = \sup\{t_n\}$ and $af + bg = \sup\{a s_n + b t_n\}$ yield that
$$ \int (af + bg)\,d\mu = \sup\Big\{\int (a s_n + b t_n)\,d\mu\Big\}, $$
which, by Proposition 1.3 (ii), equals
$$ \sup\Big\{a \int s_n\,d\mu + b \int t_n\,d\mu\Big\} = a \sup\Big\{\int s_n\,d\mu\Big\} + b \sup\Big\{\int t_n\,d\mu\Big\} = a \int f\,d\mu + b \int g\,d\mu. $$
Now let $f \le g$. Then we have, for each $n$,
$$ s_n \le f \le g = \sup\{t_k\}. $$
Thus, by Lemma 1.5,
$$ \int s_n\,d\mu \le \sup_k\Big\{\int t_k\,d\mu\Big\} = \int g\,d\mu, $$
and finally,
$$ \int f\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\} \le \int g\,d\mu. $$
□
1.9 Examples.

(i) Let $\varepsilon_a$ be a point mass on a measurable space $(\Omega,\Sigma)$ for some $a \in \Omega$ and let $s = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i} \in \Psi_+(\Omega,\Sigma)$ be such that $s(a) = a_{i_0}$, for some $i_0 \in \{1,\ldots,n\}$. Then
$$ \int s\,d\varepsilon_a = \sum_{i=1}^{n} a_i \varepsilon_a(A_i) = a_{i_0} \cdot 1 = s(a). $$
Now let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma)$. Then there is a sequence $\{s_n\}\uparrow \subseteq \Psi_+$ such that $f = \sup\{s_n\}$. Thus $\int f\,d\varepsilon_a = \sup\{s_n(a)\} = f(a)$. Similarly, if $\mu = c\,\varepsilon_a$ (for some $c > 0$), $\int f\,d\mu = c f(a)$.

(ii) Let
$$ \mu = \sum_{i=0}^{n} c_i \varepsilon_i, \quad c_i \ge 0. $$
By Problem 1.3,
$$ \int f\,d\mu = \sum_{i=0}^{n} c_i f(i). $$
Specifically, if $c_i = \binom{n}{i} p^i (1 - p)^{n-i}$, then $\mu$ is the binomial measure $\beta_{n,p}$. (See Example 1.8 (iii), Chapter 5.) Furthermore, if $f(x) = e^{tx}$, for $t \in \mathbb{R}$, then the transform of the binomial measure
$$ \int e^{tx}\,\beta_{n,p}(dx) = \sum_{i=0}^{n} \binom{n}{i} p^i (1 - p)^{n-i} e^{ti} = (1 - p + p e^t)^n $$
is a function in $t$ and is referred to as the moment generating function. In the general definition, $t$ is allowed to run the complex plane $\mathbb{C}$.

(iii) Let $(\Omega,\Sigma,\mu)$ be the measure space with $\Omega = [0,1]$, $\Sigma = \mathcal{B}([0,1])$, and $\mu = \lambda$ (Borel-Lebesgue measure on $\mathcal{B}([0,1])$). Let $C$ be the Cantor set and let $G_n$ be the open sets of Borel-Lebesgue measure $\frac12 \big(\frac23\big)^n$ (introduced in Example 3.11, Chapter 5). Let us define the function
$$ f(x) = \begin{cases} 1, & x \in C \\ \frac{1}{2^n}, & x \in G_n,\ n = 1,2,\ldots. \end{cases} $$
We are going to evaluate the integral $\int_{[0,1]} f(x)\,\lambda(dx)$ (with respect to the Borel-Lebesgue measure). First of all, we have to identify the function $f$, which can be represented in the form $f = \sup\{s_n\}$, where
$$ s_n(x) = \begin{cases} 1, & x \in C \\ 0, & x \in [0,1] \setminus (G_1 \cup \ldots \cup G_n \cup C) \\ \frac{1}{2^k}, & x \in G_k,\ k = 1,\ldots,n. \end{cases} $$
Clearly, $s_n \in \Psi_+([0,1],\mathcal{B}([0,1]))$ and $f(x) = \sup\{s_n(x)\}$. Thus $f \in \mathcal{C}_+^{-1}([0,1],\mathcal{B}([0,1]))$ and hence
$$ \int_{[0,1]} f(x)\,\lambda(dx) = \sup_n \int_{[0,1]} s_n(x)\,\lambda(dx) = \sup_n \Big[ 1 \cdot \lambda(C) + 0 \cdot \lambda\big([0,1] \setminus (G_1 \cup \ldots \cup G_n \cup C)\big) + \sum_{k=1}^{n} \frac{1}{2^k} \lambda(G_k) \Big] $$
$$ = \sup_n \sum_{k=1}^{n} \frac{1}{2^k} \cdot \frac12 \Big(\frac23\Big)^k = \frac12 \sum_{k=1}^{\infty} \frac{1}{3^k} = \frac14. $$
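
The evaluation in Example 1.9 (iii) can be checked with exact rational arithmetic. This is a minimal sketch under the assumption, read from the example, that $\lambda(G_k) = \frac12 (\frac23)^k$ and $\lambda(C) = 0$; the helper names `lambda_G` and `integral_s_n` are hypothetical.

```python
from fractions import Fraction

def lambda_G(k):
    """Assumed Borel-Lebesgue measure of G_k: (1/2) * (2/3)**k."""
    return Fraction(1, 2) * Fraction(2, 3) ** k

def integral_s_n(n):
    """Integral of s_n: the terms 1 * lambda(C) and 0 * lambda(...) vanish,
    leaving sum_{k=1}^{n} 2**(-k) * lambda(G_k)."""
    return sum(Fraction(1, 2 ** k) * lambda_G(k) for k in range(1, n + 1))

for n in (1, 2, 5, 20):
    value = integral_s_n(n)
    print(n, value, float(value))
```

The partial integrals increase with $n$ and agree with the closed form $\frac14 (1 - 3^{-n})$, so their supremum is $\frac14$. As a consistency check on the assumed measures, $\sum_{k=1}^{n} \lambda(G_k) = 1 - (\frac23)^n$, which tends to $1 = \lambda([0,1])$.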
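(iv) Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$. Then $\mu = \sum_{n=1}^{\infty} \mu_n$ is a measure on $\Sigma$; and for an $A \in \Sigma$, the integral of the indicator function $\mathbf{1}_A$ is
$$ \int \mathbf{1}_A\,d\mu = \mu(A) = \sum_{n=1}^{\infty} \mu_n(A) = \sum_{n=1}^{\infty} \int \mathbf{1}_A\,d\mu_n. $$
Let $s = \sum_{k=1}^{m} a_k \mathbf{1}_{A_k} \in \Psi_+(\Omega,\Sigma)$. Then
$$ \int s\,d\mu = \sum_{k=1}^{m} a_k \mu(A_k) = \sum_{k=1}^{m} a_k \sum_{n=1}^{\infty} \mu_n(A_k) = \sum_{n=1}^{\infty} \sum_{k=1}^{m} a_k \mu_n(A_k) = \sum_{n=1}^{\infty} \int s\,d\mu_n. \tag{1.9} $$
Now, for $f \in \mathcal{C}_+^{-1}$, we have $f = \sup\{s_j\}$ such that $\{s_j\}\uparrow \subseteq \Psi_+$. Let $b_{jn} = \sum_{i=1}^{n} \int s_j\,d\mu_i$. Since $\{b_{jn}\}$ is monotone increasing in both $j$ and $n$,
$$ \sup_j \sup_n b_{jn} = \sup_n \sup_j b_{jn}, $$
which yields that
$$ \sup_j \sum_{i=1}^{\infty} \int s_j\,d\mu_i = \sup_n \sum_{i=1}^{n} \int f\,d\mu_i = \sum_{i=1}^{\infty} \int f\,d\mu_i. \tag{1.9a} $$
Therefore,
$$ \int f\,d\mu = \sup_j \int s_j\,d\mu = \text{(by (1.9))} = \sup_j \sum_{i=1}^{\infty} \int s_j\,d\mu_i = \text{(by (1.9a))} = \sum_{i=1}^{\infty} \int f\,d\mu_i. $$
Thus we showed that
$$ \int f\,d\Big(\sum_{n=1}^{\infty} \mu_n\Big) = \sum_{n=1}^{\infty} \int f\,d\mu_n. $$
□

Now we further enlarge the class of integrable functions by considering arbitrary extended real-valued measurable functions of $\mathcal{C}^{-1}(\Omega,\Sigma)$. For each $f \in \mathcal{C}^{-1}$ and $0$ being the function identically equal to zero on $\Omega$, denote
$$ f^+ = \sup\{f,0\} \quad \text{and} \quad f^- = -\inf\{f,0\} = (-f)^+ $$
(cf. Definition 7.7, Chapter 1). Clearly (see also Problem 7.16, Chapter 1),
$$ f = f^+ - f^- \quad \text{and} \quad |f| = f^+ + f^-. $$
By Proposition 6.6, Chapter 5, $f^+$ and $f^-$ are also elements of $\mathcal{C}^{-1}$ (more precisely, elements of $\mathcal{C}_+^{-1}$) if and only if $f \in \mathcal{C}^{-1}$.

1.10 Definitions.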
(i) Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ (or $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$). If at least one of the integrals, $\int f^+\,d\mu$ or $\int f^-\,d\mu$, is finite, we say that the integral of $f$ with respect to measure $\mu$ exists and denote this integral by
$$ \int f\,d\mu = \int f^+\,d\mu - \int f^-\,d\mu. \tag{1.10} $$
We also denote
$$ \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\colon \int f\,d\mu \text{ exists}\Big\}. \tag{1.10a} $$
If both of the integrals of the functions $f^+$ and $f^-$ are finite, we say that the function $f$ is $\mu$-integrable and again denote the integral of $f$ by formula (1.10). The subset of $\mathcal{C}^{-1}$ of all $\mu$-integrable functions is denoted by $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, i.e.
$$ L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{f \in \mathcal{C}^{-1}(\Omega,\Sigma)\colon \int f^+\,d\mu < \infty \text{ and } \int f^-\,d\mu < \infty\Big\}. \tag{1.10b} $$
Note that
$$ \int |f|\,d\mu = \int f^+\,d\mu + \int f^-\,d\mu, \tag{1.10c} $$
which is due to $|f| = f^+ + f^-$ and Proposition 1.8. In light of (1.10c), (1.10b) can be rewritten as
$$ L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) = \Big\{f \in \mathcal{C}^{-1}(\Omega,\Sigma)\colon \int |f|\,d\mu < \infty\Big\}. \tag{1.10d} $$
If a measurable space is specified, the notation $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ or $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ will be shortened to $f \in \mathbb{L}(\mu)$ or $f \in L^1(\mu)$.

(ii) If $\Omega = \mathbb{R}^n$, $\Sigma = \mathcal{B}$, and $\mu$ is the Borel-Lebesgue measure $\lambda$ and if the integral of the function $f$ in (1.10) exists, it is called the Lebesgue integral of $f$. If $f$ is $\lambda$-integrable, we write $f \in L^1(\lambda)$.

(iii) If $\Omega = \mathbb{R}$, $\Sigma = \mathcal{B}$ and $\mu = \mu_F$ (a Borel-Lebesgue-Stieltjes measure induced by an extended distribution function $F$), and if $g \in \mathbb{L}(\Omega,\Sigma,\mu_F;\overline{\mathbb{R}})$, then the integral in (1.10) is called the Lebesgue-Stieltjes integral of $g$; and we will write $g \in L^1(\mu_F)$ if $g$ is $\mu_F$-integrable.

(iv) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the space of all extended real-valued random variables on a probability space $(\Omega,\Sigma,\mathbb{P})$. From Example 4.5 (i), Chapter 5, we recall that for any random variable $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$ on $(\Omega,\Sigma,\mathbb{P})$, the image measure $\mathbb{P}X^*$ is the probability distribution of $X$. If $X \in L^1(\Omega,\Sigma,\mathbb{P};\overline{\mathbb{R}})$, then the numeric value $\int X\,d\mathbb{P}$ is called the expectation of the random variable $X$, in notation, $\mathbb{E}[X]$. Observe that $\mathbb{E}[X]$ makes sense only if $X$ is $\mathbb{P}$-integrable, i.e., if $\int |X|\,d\mathbb{P} < \infty$. [It is now becoming clear why in textbooks on probability, the expectation $\mathbb{E}[X]$ is defined only when $\mathbb{E}[|X|] < \infty$.] □
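
To make Definition 1.10 concrete, the sketch below computes $f^+$, $f^-$, the integral (1.10), and an expectation on a finite probability space, where every function is simple and each integral reduces to a finite sum. The fair-die space, the random variable $X(\omega) = \omega - 3$, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def positive_part(f):
    """f+ = sup{f, 0}, computed pointwise."""
    return {w: max(v, 0) for w, v in f.items()}

def negative_part(f):
    """f- = -inf{f, 0} = (-f)+, computed pointwise."""
    return {w: max(-v, 0) for w, v in f.items()}

def integral_nonneg(f, weights):
    """Integral of a nonnegative function on a finite set: sum of f(w) * mu({w})."""
    return sum(v * weights[w] for w, v in f.items())

def integral(f, weights):
    """Integral of f as in (1.10): integral of f+ minus integral of f-."""
    return (integral_nonneg(positive_part(f), weights)
            - integral_nonneg(negative_part(f), weights))

# hypothetical probability space: a fair die, with X(w) = w - 3
P = {w: Fraction(1, 6) for w in range(1, 7)}
X = {w: w - 3 for w in P}
abs_X = {w: abs(v) for w, v in X.items()}

print(integral(X, P), integral_nonneg(abs_X, P))
```

Here the integral of $X^+$ is $1$ and that of $X^-$ is $\frac12$, so $\mathbb{E}[X] = \frac12$, while $\int |X|\,d\mathbb{P} = \frac32$ equals the sum of the two parts, in agreement with (1.10c).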
1.11 Proposition. The integral is a linear, monotone nondecreasing functional on the space $L^1(\Omega,\Sigma,\mu)$. □

(See Problem 1.6.)

1.12 Proposition.

(i) $L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is a vector lattice over $\mathbb{R}$, i.e. for every pair $f, g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, $\sup\{f,g\}, \inf\{f,g\} \in L^1(\Omega,\Sigma,\mu)$.

(ii) $\forall f \in L^1$, $\big|\int f\,d\mu\big| \le \int |f|\,d\mu$.

Proof.

(i) $|\sup\{f,g\}| \le |f| + |g|$ and $|\inf\{f,g\}| \le |f| + |g|$. The statement is now due to Problems 1.7 and 1.8.

(ii) Obviously, $|f| \ge f$ and $|f| \ge -f$. Thus, by Proposition 1.11, we have
$$ \int |f|\,d\mu \ge \int f\,d\mu \quad \text{and} \quad \int |f|\,d\mu \ge -\int f\,d\mu. $$
□

1.13 Notations. Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $A \in \Sigma$. Then, we denote
$$ \int_A f\,d\mu = \int f \mathbf{1}_A\,d\mu. $$
Specifically, it follows that
$$ \int_\Omega f\,d\mu = \int f\,d\mu. $$
□
Now we will need the notion of "properties that hold almost everywhere."

1.14 Definitions and Remarks.

(i) Let $(\Omega,\Sigma,\mu)$ be a measure space. A property $\Pi$ (of points of $\Omega$) is said to hold almost everywhere (a.e.) or $\mu$-almost everywhere ($\mu$-a.e.) if there is a ($\mu$-null) set $N \in \mathcal{N}_\mu$ (see Definition 2.5 (i), Chapter 5) such that $\Pi$ holds for all points of $N^c$. Notice that this definition does not preclude property $\Pi$ from holding on $N$ or on a subset of $N$. It merely says that $\Pi$ may fail only on a (negligible) subset of $N$.

(ii) Two measurable functions $f$ and $g$ are said to be equal ($\mu$-)a.e. if $f = g$ on the complement of a $\mu$-null set $N$. Observe that $\{f \neq g\} \subseteq N$. Recall that, by Problem 5.2 (iii), Chapter 5, the set $\{f \neq g\}$ is measurable. Therefore, if $f = g$ a.e., then the set $\{f \neq g\} \in \mathcal{N}_\mu$, i.e., is $\mu$-null.

(iii) Let $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be the set of all measurable functions on $\Omega$ and let $\mu$ be a measure on $\Sigma$. Let $[f]_\mu$ denote the set of all measurable functions that are equal to $f$ $\mu$-a.e. on $\Omega$. Specifically, $[0]_\mu$ denotes the set of all measurable functions which equal zero $\mu$-a.e. on $\Omega$. Clearly, the $\mu$-almost everywhere property of equality of functions induces an equivalence relation (say $E$) on the set $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Then
$$ \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\big|_E = \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\big|_\mu \tag{1.14} $$
denotes the quotient set $\{[f]_\mu\colon f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})\}$ and it is called the quotient set modulo $\mu$. In light of these considerations, any two functions $f$ and $g$ such that $f = g$ $\mu$-a.e. on $\Omega$ are also said to be equal modulo $\mu$ and we will write $f = g$ (mod $\mu$), or $f \in [g]_\mu$, or equivalently, $f - g \in [0]_\mu$. □

1.15 Lemma. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}_+^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Then $\int f\,d\mu = 0$ if and only if $f \in [0]_\mu$.

Proof. Denote $N = \{f > 0\}$ (which is an element of $\Sigma$).

(i) Let $f \in [0]_\mu$. Then $N \in \mathcal{N}_\mu$. Let $s_n = n\mathbf{1}_N$ $(\in \Psi_+)$, $n = 1,2,\ldots$. Therefore,
$$ \int s_n\,d\mu = n\mu(N) = 0, \quad \text{for all } n. $$
Denote $s = \sup\{s_n\}$. Then, by Theorem 6.5, Chapter 5, $s \in \mathcal{C}_+^{-1}$ and
$$ \int s\,d\mu = \sup\Big\{\int s_n\,d\mu\Big\} = 0. $$
Finally, $f = s_n = 0$ on $N^c$. While $f$ is arbitrary on $N$ and, in particular, not necessarily $\infty$, we have that $s_n \uparrow \infty$ on $N$. Consequently, $f \le s$ on $\Omega$, which, by monotonicity (Proposition 1.8), yields
$$ 0 \le \int f\,d\mu \le \int s\,d\mu = 0, $$
and hence $\int f\,d\mu = 0$.

(ii) Now let $\int f\,d\mu = 0$. Denote
$$ N_n = \big\{f \ge \tfrac1n\big\} \ \big(= f^*\big(\big[\tfrac1n,\infty\big]\big)\big), \quad n = 1,2,\ldots. $$
Obviously, $N_n \in \Sigma$ and $N_n \uparrow N$, where
$$ N = \bigcup_{n=1}^{\infty} N_n = \{f > 0\} \in \Sigma. $$
By continuity from below of $\mu$,
$$ \lim_{n\to\infty} \mu(N_n) = \mu(N). \tag{1.15} $$
Clearly, $f \ge \frac1n \mathbf{1}_{N_n}$. Again, by monotonicity (Proposition 1.8), we have that
$$ 0 = \int f\,d\mu \ge \frac1n \mu(N_n), $$
which leads to $\mu(N_n) = 0$, $n = 1,2,\ldots$. From (1.15) it follows that $\mu(N) = 0$ and hence $N \in \mathcal{N}_\mu$. Therefore, $f \in [0]_\mu$. □
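1.16 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f = g \pmod{\mu}$. Then
$$\int f\,d\mu = \int g\,d\mu.$$

Proof. By Problem 5.2 (iii), Chapter 5, we have that $N = \{f \ne g\} \in \Sigma$. Therefore, by the above assumption regarding $f$ and $g$, $N \in \mathcal{N}_\mu$ and the functions $f 1_N$ and $g 1_N$ are elements of $[0]_\mu$. By Lemma 1.15, it follows that
$$\int f 1_N\,d\mu = \int g 1_N\,d\mu = 0.$$
On the other hand, if $A = N^c$, then
$$\int f\,d\mu = \int f 1_A\,d\mu + \int f 1_N\,d\mu = \int f 1_A\,d\mu.$$
Similarly,
$$\int g\,d\mu = \int g 1_A\,d\mu.$$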
The statement follows from $f 1_A = g 1_A$, $\forall \omega \in \Omega$. Indeed, on the set $N$, $f 1_A = g 1_A = 0$, while on $N^c$ we have that $f = g$. □
1.17 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $|f| \le g$ a.e. Then $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ implies that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

Proof. Let $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then, by Proposition 6.6, Chapter 5, we have that
$$g_1 = \sup\{g, |f|\} \in \mathcal{C}^{-1}.$$
Clearly, $|f| \le g_1$ everywhere and $g_1 = g \pmod{\mu}$ (show it), and by Problem 1.17, $g_1 \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. Then, by Problem 1.8, $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. □
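1.18 Proposition. Let $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ and $f$ or $g \in L^1(\Omega,\Sigma,\mu)$. Then
$$\int_A f\,d\mu = \int_A g\,d\mu, \quad \text{for each } A \in \Sigma, \tag{1.18}$$
yields that $f = g \pmod{\mu}$. (See Problem 1.27.) □

Theorem 1.19 and Corollary 1.20 modify and, to some extent, refine Proposition 1.18.

1.19 Theorem. If $\mu$ is $\sigma$-finite, $f, g \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and
$$\int_A f\,d\mu \le \int_A g\,d\mu, \quad \text{for each } A \in \Sigma, \tag{1.19}$$
then $f \le g$ $\mu$-a.e. on $\Omega$.

Proof. a) Let $\mu$ be finite. Denote
$$A_n = \{f \ge g + \tfrac{1}{n}\} \cap \{|g| \le n\}, \quad n = 1,2,\ldots.$$
Then, since by our assumption $\int_A f\,d\mu \le \int_A g\,d\mu$ for each $A \in \Sigma$, we have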
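$$\int_{A_n} g\,d\mu \ge \int_{A_n} f\,d\mu \ge \int_{A_n} g\,d\mu + \tfrac{1}{n}\mu(A_n). \tag{1.19a}$$
On the other hand,
$$\Big|\int_{A_n} g\,d\mu\Big| \le n\mu(A_n) < \infty.$$
Therefore, from (1.19a) and because $\int_{A_n} g\,d\mu$ is finite, it follows that $\mu(A_n) = 0$, for each $n$. Thus,
$$\mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = 0.$$
On the other hand, from
$$\bigcup_{n=1}^{\infty} A_n = \Big(\bigcup_{n=1}^{\infty}\{f \ge g + \tfrac{1}{n}\}\Big) \cap \Big(\bigcup_{n=1}^{\infty}\{|g| \le n\}\Big) = \{f > g\} \cap \{g \text{ is finite}\}$$
we conclude that $\mu\{f > g : g \text{ is finite}\} = 0$. Hence,
$$\mu\{f > g : g > -\infty\} = 0.$$
Letting $B_n = \{g = -\infty,\ f > -n\}$ we have
$$-\infty\cdot\mu(B_n) = \int_{B_n} g\,d\mu \ge \int_{B_n} f\,d\mu \ge -n\mu(B_n),$$
which yields $-\infty\cdot\mu(B_n) \ge -n\mu(B_n)$ or, equivalently, $n\mu(B_n) \ge \infty\cdot\mu(B_n)$. This holds true if and only if $\mu(B_n) = 0$ (as a consequence of the agreement that $\infty\cdot 0 = 0$). Thus,
$$\mu\Big(\bigcup_{n=1}^{\infty} B_n\Big) = \mu\{f > g,\ g = -\infty\} = 0.$$
In summary, we proved that $\mu\{f > g\} = 0$, which implies that $f \le g$ $\mu$-a.e. on $\Omega$.

b) Now, let $\mu$ be $\sigma$-finite and let $\mu_n = \mathrm{Res}_{\Sigma\cap\Omega_n}\mu$. Then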
$$\int_A f\,d\mu_n = \int_{A\cap\Omega_n} f\,d\mu \le \int_{A\cap\Omega_n} g\,d\mu = \int_A g\,d\mu_n$$
and hence $f \le g$ $\mu$-a.e. on $\Omega_n$. The rest of this case is obvious. □

The reader can easily conclude that

1.20 Corollary. If $\mu$ is $\sigma$-finite, $f, g \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, and
$$\int_A f\,d\mu = \int_A g\,d\mu, \quad \text{for each } A \in \Sigma, \tag{1.20}$$
then $f = g$ $\mu$-a.e. on $\Omega$. □

(For a pertinent discussion, see Problem 1.28.) Finally, we would like to formulate the proposition below, which will often be cited in the sequel and whose proof we assign to the reader as Problem 1.19.

1.21 Proposition. Each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$. □
PROBLEMS
1.1 Prove Proposition 1.3.

1.2 Prove Corollary 1.6, i.e., for $\{s_n\}\uparrow$, $\{t_n\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_n\} = \sup\{t_n\}$ it holds that
$$\sup\Big\{\int s_n\,d\mu\Big\} = \sup\Big\{\int t_n\,d\mu\Big\}.$$
[Hint: Use the fact that $s_j \le \sup\{t_n\}$ and $t_k \le \sup\{s_n\}$.]

1.3 Show that for $\mu = \sum_{i=0}^{\infty} c_i\varepsilon_{a_i}$ the corresponding value of the integral of any bounded measurable function $f$ is
$$\int f\,d\mu = \sum_{i=0}^{\infty} c_i f(a_i).$$

1.4 Let $\pi_\lambda$ be a Poisson measure and let $f \in \mathcal{C}^{-1}_{+}(\mathbb{R},\mathcal{B};\overline{\mathbb{R}})$. Show that
$$\int f\,d\pi_\lambda = \sum_{n=0}^{\infty} e^{-\lambda}\frac{\lambda^n}{n!} f(n).$$

1.5 Under the condition of Problem 1.4 assume that
a) $f(x) = x^2$,
b) $f(x) = \Gamma(x+1) 1_{(0,\infty)}(x)$, where $\Gamma(x) = \int_0^{\infty} e^{-t}t^{x-1}\,dt$,
and find in each case the integral of $f$ with respect to the measure $\pi_\lambda$.

1.6 Prove Proposition 1.11.

1.7 Let $Q$ be a non-Borel subset of $\mathbb{R}$ (such as the one in Problem 3.16, Chapter 5) and let $C$ denote the Cantor ternary set. Define the function
$$f(x) = 1_{Q\cap C}(x)\sin x + 1_{(Q\cap C)^c}(x)\,x^2.$$
Is $f$ Lebesgue measurable, i.e., $f \in \mathcal{C}^{-1}(\mathbb{R},L^{*},\lambda)$?

1.8 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if there exists $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ such that $|f| \le g$.

1.9 Show that $L^1$ is a linear space over $\mathbb{R}$.

1.10 Show that
{ £ 1 ( 0,E, £a ; R ): a E !1}= {L 1 (0,E, £ a ;IR): a E 0} .

1.11 Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $g: \Omega \to \overline{\mathbb{R}}$ is an extended real-valued function. Show that if $g = f \pmod{\mu}$, then $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. [Hint: Show that $\{g < c\} \in \Sigma$, $\forall c \in \mathbb{R}$.]

1.12 Let $(\Omega,\Sigma,\mu)$ be a complete measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Suppose that $\lim_{n\to\infty} f_n$ exists and $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$, where $f$ is an extended real-valued function. Show that $f \in \mathcal{C}^{-1}$.

1.13 Prove that $f = g \pmod{\mu}$ if and only if $f^{+} = g^{+} \pmod{\mu}$ and $f^{-} = g^{-} \pmod{\mu}$.

1.14 Show that

1.15 Show that $f \in [0]_\mu$ if and only if $f^{+}, f^{-} \in [0]_\mu$.

1.16 Show that if $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$, then $f \in [0]_\mu$ yields that $\int f\,d\mu = 0$. Does the converse hold true?

1.17 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f - g \in [0]_\mu$. Show that $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ and that $\int f\,d\mu = \int g\,d\mu$.

1.18 Generalize Proposition 1.16 assuming that $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and that $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ (i.e., that $\int f\,d\mu$ exists).

1.19 Show that each function $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ is finite $\mu$-a.e. on $\Omega$.
[Hint: Let $A = \{|f| = \infty\}$. Show that $a\mu(A) < \infty$, $\forall a \in \mathbb{R}_+$, and then show that $\lim_{n\to\infty} n\mu(A) < \infty$ implies that $\mu(A) = 0$.]

1.20 Show that for $f \in \mathcal{C}^{-1}(\Omega,\Sigma)$, $\int_A f\,d\mu = 0$ for each $A \in \Sigma$ if and only if $f \in [0]_\mu$.

1.21 Show by a counterexample that $L^1$ is not an ffi-space.

1.22 Let $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Show that $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ if and only if for each $\varepsilon > 0$, there is a function $g \in L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)$ such that
$$\int_{\{|f| > g\}} |f|\,d\mu < \varepsilon.$$

1.23 Let $(\Omega,\Sigma,\mu)$ be a measure space and $c > 0$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,
$$\int f\,d(c\mu) = c\int f\,d\mu.$$

1.24 Let $\{\mu_n\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$, $\{c_n\}$ be a sequence of positive real numbers, and let $\mu = \sum_{n=1}^{\infty} c_n\mu_n$, which is a measure on $(\Omega,\Sigma)$. Show that for every $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$,
$$\int f\,d\mu = \sum_{n=1}^{\infty} c_n \int f\,d\mu_n.$$

1.25 Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}}) \cap L^1(\Omega,\Sigma,\nu;\overline{\mathbb{R}})$, the integral $\int f\,d(\nu-\mu)$ makes sense, $f \in L^1(\Omega,\Sigma,\nu-\mu;\overline{\mathbb{R}})$, and that
$$\int f\,d(\nu-\mu) = \int f\,d\nu - \int f\,d\mu.$$

1.26 Let $\mu$ and $\nu$ be two measures on $(\Omega,\Sigma)$ such that $\mu \le \nu$. Show that for each $f \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma;\overline{\mathbb{R}})$, $\int f\,d\mu \le \int f\,d\nu$.

1.27 Prove Proposition 1.18, i.e., show that if $f, g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $f$ or $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$, then
$$\int_A f\,d\mu = \int_A g\,d\mu \quad \text{for each } A \in \Sigma \tag{P1.27}$$
yields that $f = g \pmod{\mu}$.

1.28 Show by a counterexample that dropping the condition $f$ or $g \in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$ in Problem 1.27 need not yield $f = g \pmod{\mu}$ even if $f$ and $g$ are nonnegative.
NEW TERMS:
integral of a nonnegative simple function
Dirichlet function
integral of an extended nonnegative function
moment generating function
integral of an extended real-valued function
$\mu$-integrable function
$\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$-space
$L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$-space
Lebesgue integral
Lebesgue-Stieltjes integral
expectation of a random variable
property that holds almost everywhere ($\mu$-a.e.)
equality of functions modulo $\mu$
$[0]_\mu$-set
quotient set modulo $\mu$
$[f]_\mu$-class
2. MAIN CONVERGENCE THEOREMS
The following result is one of the basic convergence theorems, a special case of which (Corollary 2.2) was originally proved by the Italian mathematician Beppo Levi (1875-1961).

2.1 Theorem (of Monotone Convergence). Let $\{f_n\}\uparrow \subseteq \mathcal{C}^{-1}_{+}$. Then
$$\int \sup\{f_n\}\,d\mu = \sup\Big\{\int f_n\,d\mu\Big\}. \tag{2.1}$$

Proof. Let $f = \sup\{f_n\}$. Then, by Proposition 5.6 (iv), Chapter 5, $\sup\{f_n\} \in \mathcal{C}^{-1}_{+}$. Thus, the integral on the left-hand side of (2.1) makes sense. On the other hand, for each element $f_n \in \mathcal{C}^{-1}_{+}$, there exists a monotone nondecreasing sequence of nonnegative simple functions $\{s_k^{(n)}\}\uparrow \subseteq \Psi_+$ such that $\sup\{s_k^{(n)} : k = 1,2,\ldots\} = f_n$. Let
$$t_k = \max\{s_k^{(1)},\ldots,s_k^{(k)}\}.$$
Since $\Psi_+$ is a lattice, it follows that $t_k \in \Psi_+$, $k = 1,2,\ldots$. Furthermore, $\{t_k\}$ is monotone nondecreasing. Since $\{f_n\}$ is monotone nondecreasing, we have
$$s_k^{(1)} \le f_1 \le f_k,\quad s_k^{(2)} \le f_2 \le f_k,\ \ldots,\ s_k^{(k)} \le f_k,$$
and hence $s_k^{(i)} \le f_k$, $i = 1,\ldots,k$, which leads to
$$t_k \le f_k \tag{2.1a}$$
and
$$\sup\{t_k\} \le \sup\{f_k\} = f. \tag{2.1b}$$
On the other hand, $t_k \ge s_k^{(n)}$ for $k \ge n$; this yields
$$\sup\{s_k^{(n)} : k = 1,2,\ldots\} = f_n \le \sup\{t_k : k = 1,2,\ldots\},$$
and, consequently,
$$\sup\{f_n\} = f \le \sup\{t_k\}. \tag{2.1c}$$
Thus, by (2.1b) and (2.1c), $f = \sup\{t_k\}$. Now the facts that $f = \sup\{t_k\}$ and that $\{t_k\}$ is monotone nondecreasing imply that
$$\int f\,d\mu = \sup_k \int t_k\,d\mu.$$
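Since $t_k \le f_k$ by (2.1a), we have by Proposition 1.8
$$\int t_k\,d\mu \le \int f_k\,d\mu,$$
which yields
$$\int f\,d\mu = \sup_k \int t_k\,d\mu \le \sup_k \int f_k\,d\mu.$$
Finally, the inverse inequality holds due to $f_n \le f$ and Proposition 1.8. □

2.2 Corollary (Beppo Levi). Let $\{f_n\} \subseteq \mathcal{C}^{-1}_{+}$. Then
$$\int \Big[\sum_{n=1}^{\infty} f_n\Big]\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu.$$
(See Problem 2.1.) □

The Monotone Convergence Theorem can be generalized to an arbitrary monotone sequence under a minor constraint.

2.3 Theorem (Generalized Monotone Convergence Theorem). Let $\{f_n\}\uparrow \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ such that $f_n \ge g$ for all $n$ and $\int g\,d\mu > -\infty$. Then,
$$\sup\Big\{\int f_n\,d\mu\Big\} = \int \sup\{f_n\}\,d\mu.$$
(See Problem 2.2.) □

2.4 Lemma (Fatou). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$. Then
$$\int \underline{\lim}\, f_n\,d\mu \le \underline{\lim} \int f_n\,d\mu.$$

Proof. By Proposition 5.6 (v), Chapter 5, $\underline{\lim}\, f_n \in \mathcal{C}^{-1}_{+}$ and by Proposition 5.6 (iv), Chapter 5,
$$g_n = \inf\{f_k : k \ge n\} \in \mathcal{C}^{-1}_{+}.$$
Clearly, the sequence $\{g_n\}$ is monotone nondecreasing and hence
$$\underline{\lim}\, f_n = \sup\{g_n\}.$$
By monotonicity of the integral,
$$\int g_n\,d\mu \le \int f_k\,d\mu, \quad k \ge n,$$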
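which implies that
$$\int g_n\,d\mu \le \inf\Big\{\int f_k\,d\mu : k \ge n\Big\}.$$
Finally, by the Monotone Convergence Theorem,
$$\int \underline{\lim}\, f_n\,d\mu = \sup_n \int g_n\,d\mu \le \underline{\lim} \int f_n\,d\mu.$$
□

2.5 Definition. Let $f, \{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$. The sequence $\{f_n\}$ is said to converge to $f$ in mean if
$$\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0.$$
□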
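The inequality in Fatou's Lemma 2.4 can be strict, and pointwise convergence alone does not give convergence in mean in the sense of Definition 2.5. The following minimal sketch illustrates both points with exact rational arithmetic. It assumes Lebesgue measure $\lambda$ on $(0,\infty)$ and the sequence $f_n = n 1_{(0,1/n)}$; the helper names `f`, `integral_f`, and `first_zero_index` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def f(n, x):
    """f_n(x) = n on the open interval (0, 1/n) and 0 elsewhere on (0, infinity)."""
    return n if 0 < x < Fraction(1, n) else 0


def integral_f(n):
    """Integral of the simple function f_n = n * 1_(0, 1/n): n times the length 1/n."""
    return n * Fraction(1, n)


def first_zero_index(x, n_max=10_000):
    """Smallest n <= n_max with f_n(x) = 0; from that index on f_m(x) stays 0."""
    for n in range(1, n_max + 1):
        if f(n, x) == 0:
            return n
    return None


# Every f_n integrates to 1, so the lower limit of the integrals is 1 ...
integrals = [integral_f(n) for n in range(1, 51)]
# ... while at each fixed x > 0 the values f_n(x) are eventually 0,
# so the integral of the pointwise (lower) limit is 0 < 1.
x = Fraction(1, 7)
index = first_zero_index(x)
```

Since $\int |f_n - 0|\,d\lambda = 1$ for every $n$, the sequence converges to $0$ pointwise but not in mean.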
We now formulate and prove one of the central results in the theory of integration. As with the Monotone Convergence Theorem, the following theorem enables us to interchange the limit and the integral for a pointwise convergent sequence of functions. However, it does not require that the sequence be monotone nondecreasing and nonnegative. On the other hand, the sequence needs an integrable dominating function, and thus it is not a generalization of the Monotone Convergence Theorem.

2.6 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ be a (pointwise) a.e. convergent sequence. Suppose that there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that $|f_n| \le g$, $n = 1,2,\ldots$. Then the following are true.

(i) There exists at least one function $f \in \mathcal{C}^{-1}$, such that $|f| < \infty$, to which the sequence $\{f_n\}$ converges a.e. in the topology of pointwise convergence.

(ii) $f \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$ and $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$.

(iii) The sequence $\{f_n\}$ converges to $f$ in mean, i.e.,
$$\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0.$$

(iv) $\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu$.

Proof. (i) By our assumption, there is a negligible set $\Pi$ such that $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in \Pi^c$, and there is a $\mu$-null set $N_1 \supseteq \Pi$. Therefore, $N_1^c \subseteq \Pi^c$
and $\lim_{n\to\infty} f_n(\omega)$ exists for all $\omega \in N_1^c$. Since $g \in L^1(\Omega,\Sigma,\mu)$, by Proposition 1.21, it follows that $g$ is finite $\mu$-a.e. on $\Omega$, i.e., there is a $\mu$-null set $N_2$ such that $g(\omega) < \infty$ for all $\omega \in N_2^c$. Define the function
$$f = \lim_{n\to\infty} f_n 1_A, \tag{2.6}$$
where $A = (N_1 \cup N_2)^c$. Clearly, $f_n$ converges to $f$ pointwise $\mu$-a.e. on $\Omega$ and hence, by Proposition 5.6 (iii) and (vi), Chapter 5, $f \in \mathcal{C}^{-1}$. Indeed, since $f_n$ and $1_A \in \mathcal{C}^{-1}$, it follows that $f_n 1_A \in \mathcal{C}^{-1}$ and that $f_n 1_A \to f$ in the topology of pointwise convergence; the latter implies that $f_n \to f$ pointwise $\mu$-a.e. on $\Omega$.

(ii) From (2.6) it follows that on the set $A$, $\lim_{n\to\infty} f_n = f$; in addition, $\{f_n\}$ is dominated by a finite function $g$ on $A$. Thus, $|f| \le g$ on $A$ and, due to (2.6), $f = 0$ on $A^c$. Hence,
$$|f| < \infty \quad \text{and} \quad |f| \le g, \quad \forall \omega \in \Omega.$$
By Proposition 1.17 and since $|f| < \infty$, $f \in L^1(\Omega,\Sigma,\mu)$. Also by Proposition 1.17, $\{f_n\} \subseteq L^1(\Omega,\Sigma,\mu)$.

(iii) We prove that $f_n$ is convergent in mean to $f$, i.e.,
$$\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0.$$
Let $g_n = |f - f_n|$ ($\in \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$, why?). Then, since
$$0 \le g_n \le |f| + g,$$
it follows that $g_n \in L^1(\Omega,\Sigma,\mu)$, again by Problem 1.8. [Observe that since linearity of the integral holds just on $L^1$, we do need to show that $g_n \in L^1$, which would lead to
$$\int (|f| + g - g_n) = \int (|f| + g) + \int (-g_n).]$$
Applying Fatou's lemma to the sequence $\{|f| + g - g_n\}$, we have:
$$\int \underline{\lim}\,(|f| + g - g_n)\,d\mu \le \underline{\lim} \int (|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu + \underline{\lim} \int (-g_n)\,d\mu$$
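$$= \int (|f| + g)\,d\mu - \overline{\lim} \int g_n\,d\mu. \tag{2.6a}$$
Since $f_n \to f$ a.e., then $g_n \to 0$ a.e., and hence
$$|f| + g - g_n \to |f| + g \quad \text{a.e.},$$
which implies that
$$\underline{\lim}\,(|f| + g - g_n) = |f| + g \quad \text{a.e.}$$
By Proposition 1.16,
$$\int \underline{\lim}\,(|f| + g - g_n)\,d\mu = \int (|f| + g)\,d\mu,$$
which, together with inequality (2.6a), yields
$$\int (|f| + g)\,d\mu \le \int (|f| + g)\,d\mu - \overline{\lim} \int g_n\,d\mu$$
or, equivalently,
$$\overline{\lim} \int g_n\,d\mu \le 0. \tag{2.6b}$$
Because $g_n \ge 0$, (2.6b) reveals that
$$0 \le \underline{\lim} \int g_n\,d\mu \le \overline{\lim} \int g_n\,d\mu \le 0$$
and thus
$$\lim_{n\to\infty} \int |f - f_n|\,d\mu = 0,$$
which proves (iii). Now (iv) follows from Problem 2.6. □

2.7 Examples.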
(i) We evaluate $\lim_{n\to\infty} \int_0^1 nx(1-x)^n\,dx$. First observe that the sequence $\{nx(1-x)^n\}$ is convergent to the function $0$ pointwise on $[0,1]$. However, it is an easy exercise to show that the sequence $\{nx(1-x)^n\}$ does not converge to $0$ uniformly. Otherwise, we could interchange the limit and the integral. (See Problem 3.12 of the next section.) Fortunately, the functions $nx(1-x)^n$ are uniformly bounded by $1$. Therefore, the function $1$ can be taken as a pertinent integrable majorant function in the Lebesgue Dominated Convergence Theorem. This enables us to interchange the limit and the integral and conclude that
$$\lim_{n\to\infty} \int_0^1 nx(1-x)^n\,dx = 0.$$
(We can verify this result by direct computation of the integral
$$\int_0^1 nx(1-x)^n\,dx = \frac{n}{(n+1)(n+2)}$$
and then passing to the limit.)

(ii) Calculate $\lim_{n\to\infty} \int_0^n \big(1 + \tfrac{x}{n}\big)^n e^{-2x}\,\lambda(dx)$. Clearly,
$$\big(1 + \tfrac{x}{n}\big)^n 1_{[0,n]}(x)\,e^{-2x} \le e^{-x} \in L^1.$$
Hence, by the Lebesgue Dominated Convergence Theorem,
$$\lim_{n\to\infty} \int \big(1 + \tfrac{x}{n}\big)^n 1_{[0,n]}(x)\,e^{-2x}\,\lambda(dx) = \int \lim_{n\to\infty} \big(1 + \tfrac{x}{n}\big)^n 1_{[0,n]}(x)\,e^{-2x}\,\lambda(dx) = \int_0^{\infty} e^{-x}\,\lambda(dx) = 1.$$

2.8 Remark. Note that we treated $\int_0^1 nx(1-x)^n\,dx$ in Example 2.7 (i) informally both as Lebesgue (L) and Riemann (R) integrals (since they are identical in this case), although the formal relationship between the two will be developed and discussed in Section 3. The same applies to Example 2.7 (ii). In Problems 2.9-2.11 we will also assume that the Lebesgue integrals are equal to Riemann integrals. □

Another useful application of Lebesgue's Dominated Convergence Theorem 2.6 leads to the possibility of interchanging the derivative and the integral whenever we need to differentiate a function under the integral sign. The only obstacle in using Theorem 2.6 is that it is formulated for sequences, while the derivative is defined as a limit along nets or filters. To overcome this predicament we will utilize the arguments of Example 9.7 (ii), Chapter 3, where the limit of a function, originally introduced along a filter base, reduces to the topological limit along countable neighborhood bases whenever we deal with first countable spaces (which we frequently do when working with derivatives in metric spaces and, in particular, in Euclidean spaces). This enables us to treat derivatives as limits along sequences (as was pointed out in that example) and finally apply the Lebesgue Dominated Convergence Theorem. This is the subject of Theorem 2.9, which the reader should be able to prove. (See Problem 2.14.)
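The two limits of Example 2.7 can also be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a plain composite midpoint rule (not a method from the text), and the names `midpoint_integral`, `example_i`, `closed_form_i`, and `example_ii` are hypothetical helpers introduced here.

```python
import math


def midpoint_integral(func, lower, upper, steps=100_000):
    """Composite midpoint rule; adequate for the smooth integrands used here."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def example_i(n):
    """Numerical value of the integral of n*x*(1-x)**n over [0, 1]."""
    return midpoint_integral(lambda x: n * x * (1.0 - x) ** n, 0.0, 1.0)


def closed_form_i(n):
    """Closed form n / ((n+1)(n+2)) obtained by direct integration."""
    return n / ((n + 1) * (n + 2))


def example_ii(n):
    """Numerical value of the integral of (1 + x/n)**n * exp(-2x) over [0, n]."""
    return midpoint_integral(
        lambda x: (1.0 + x / n) ** n * math.exp(-2.0 * x), 0.0, float(n)
    )


# Example 2.7 (i): the integrals agree with the closed form and shrink toward 0.
values_i = [(n, example_i(n), closed_form_i(n)) for n in (10, 100, 1000)]
# Example 2.7 (ii): the integrals increase toward 1 as n grows.
values_ii = [(n, example_ii(n)) for n in (50, 200)]
```

In (ii) the integrand $\big(1 + \tfrac{x}{n}\big)^n 1_{[0,n]}(x) e^{-2x}$ increases with $n$ and stays below $e^{-x}$, so the computed values approach $1$ from below.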
2.9 Theorem. Let $f \in \mathcal{C}^{-1}(\Omega\times[a,b],\Sigma';\mathbb{R})$ ($a < b \in \mathbb{R}$) be a Borel measurable function and for each $t \in [a,b]$, $f(\cdot,t) \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$.

(i) If there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that $|f(\omega,t)| \le g(\omega)$, $t \in [a,b]$, $\omega \in \Omega$, and if the function $t \mapsto f(\cdot,t)$ is continuous at some $\xi \in [a,b]$ uniformly for all $\omega$, then the integral of parameter
$$I(t) = \int f(\omega,t)\,\mu(d\omega) \tag{2.9}$$
is continuous at $\xi$, i.e., $\lim_{t\to\xi} I(t) = I(\xi)$. (In other words, the limit and the integral are interchangeable.)

(ii) If the partial derivative $\frac{\partial f}{\partial t}$ exists and there is a $\mu$-integrable function $g$ ($\in L^1(\Omega,\Sigma,\mu;\overline{\mathbb{R}})$) such that $g \ge 0$, and that
$$\Big|\frac{\partial}{\partial t} f(\omega,t)\Big| \le g(\omega), \quad t \in [a,b],\ \omega \in \Omega,$$
then $I$ is differentiable and
$$I'(t) = \int \frac{\partial}{\partial t} f(\omega,t)\,\mu(d\omega). \tag{2.9a}$$
□

The following are analogs of the main convergence theorems (Monotone Convergence Theorem, Fatou's Lemma, and Lebesgue's Dominated Convergence Theorem) for measures, which are often needed in probability and control theory. The theorems are essentially based on the recent results of Onesimo Hernandez-Lerma and Jean B. Lasserre [2000], which are established under weaker conditions than in previous texts and papers.

2.10 Lemma (Fatou). Let $f \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$ and $\{\mu,\mu_1,\mu_2,\ldots\}$ be a sequence of measures on $\Sigma$ such that for each $A \in \Sigma$, $\underline{\lim}\,\mu_n(A) \ge \mu(A)$. Then
$$\int f\,d\mu \le \underline{\lim} \int f\,d\mu_n.$$

Proof. Let $\{s_k\}\uparrow \subseteq \Psi_+(\Omega,\Sigma)$ such that $s_k \uparrow f$ and
$$s_k = \sum_{j=1}^{m_k} \alpha_j^{(k)} 1_{A_{jk}}.$$
Hence, for each $k = 1,2,\ldots$,
$$\underline{\lim}_{n\to\infty} \int f\,d\mu_n \ge \underline{\lim}_{n\to\infty} \int s_k\,d\mu_n = \underline{\lim}_{n\to\infty} \sum_{j=1}^{m_k} \alpha_j^{(k)}\mu_n(A_{jk}) \ge \sum_{j=1}^{m_k} \alpha_j^{(k)}\,\underline{\lim}_{n\to\infty}\,\mu_n(A_{jk}) \ge \int s_k\,d\mu.$$
The statement now follows from the definition of the integral. □

2.11 Theorem (Dominated Convergence). Let $f \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$ and $\{\nu,\mu,\mu_1,\mu_2,\ldots\}$ be a sequence of measures on $\Sigma$ such that for each $A \in \Sigma$, $\mu_n(A) \to \mu(A)$, $\mu_n \le \nu$, and $\int f\,d\nu < \infty$. Then
$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu.$$
Proof. Since $\mu_n \le \nu$, it is easy to verify that $\nu - \mu_n$ is a measure on $\Sigma$. Due to Problem 1.25,
$$\int f\,d(\nu-\mu_n) = \int f\,d\nu - \int f\,d\mu_n.$$
Furthermore,
$$\lim_{n\to\infty} (\nu-\mu_n)(A) = \nu(A) - \lim_{n\to\infty} \mu_n(A) = \nu(A) - \mu(A) = (\nu-\mu)(A).$$
The last equality holds true, because obviously $\mu \le \nu$ and hence $\nu - \mu$ is a measure on $\Sigma$. Now, all conditions of Fatou's Lemma 2.10 are met for the sequences $\{\nu-\mu_n\}$ and $\{\mu_n\}$ and therefore,
$$\int f\,d(\nu-\mu) = \int f\,d\nu - \int f\,d\mu \le \underline{\lim} \int f\,d(\nu-\mu_n) = \underline{\lim}\Big(\int f\,d\nu - \int f\,d\mu_n\Big) = \int f\,d\nu - \overline{\lim} \int f\,d\mu_n$$
and
$$\int f\,d\mu \le \underline{\lim} \int f\,d\mu_n.$$
Combining both inequalities we have
$$\overline{\lim} \int f\,d\mu_n \le \int f\,d\mu \le \underline{\lim} \int f\,d\mu_n$$
and hence, the statement. □

To prove the Theorem of Monotone Convergence for measures we need the notion of setwise convergence.

2.12 Definition. Let $(\Omega,\Sigma)$ be a measurable space and $\{\mu_n\}$ be a sequence of measures on $(\Omega,\Sigma)$. We will say that $\{\mu_n\}$ converges to a set function $\mu$ setwise if $\lim_{n\to\infty} \mu_n(A) = \mu(A)$ exists for each $A \in \Sigma$. The set function $\mu$ will be called the setwise limit of $\{\mu_n\}$. □

2.13 Proposition. The setwise limit $\mu$ of $\{\mu_n\}$ has the following properties:

(i) $\mu$ is monotone and additive.

(ii) Let $\{A_1,A_2,\ldots\}$ be a sequence of pairwise disjoint sets from $\Sigma$ and $A_n \subseteq A \in \Sigma$. Then
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(A). \tag{2.13}$$
Proof. (i) is trivial. (ii) It can be verified directly from the definition of the setwise limit by using monotonicity and additivity, or it follows from Proposition 1.3 (ii), Chapter 5. □

We now ask what condition imposed on a sequence $\{\mu_n\}$ makes its setwise limit a measure. For instance, if the sequence $\{\mu_n\}$ is monotone nonincreasing, then the limiting set function $\mu$ need not be $\sigma$-additive, as we learn from Problem 2.12.

2.14 Theorem. Let a sequence $\{\mu_n\}$ of measures on a measurable space $(\Omega,\Sigma)$ be convergent to a set function $\mu$ setwise. Then $\mu$ is a measure if one of the following conditions holds.

(i) $\{\mu_n\}$ is a monotone nondecreasing sequence.

(ii) $\mu$ is finite.

Proof. Let $\{A_k\}$ be a sequence of pairwise disjoint measurable sets with $A$ as its union.

(i) Since $\{\mu_k\}$ is monotone nondecreasing, for each $m = 1,2,\ldots$,
$$\mu_m(A) = \sum_{k=1}^{\infty} \mu_m(A_k) \le \sum_{k=1}^{\infty} \mu(A_k),$$
so that $\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k)$, which, combined with (2.13), yields the statement.

(ii) Since $\mu$ is finite, by Theorem 1.7 (ii), Chapter 5, if $\mu$ is not $\sigma$-additive (which we are going to assume), it would not be $\emptyset$-continuous. In other words, there is a monotone nonincreasing sequence $\{A_k\}\downarrow\emptyset$ of measurable sets such that $\lim_{k\to\infty} \mu(A_k) = \varepsilon > 0$. Let $a_1 = b_1 = 1$ and suppose $a_j$ and $b_j$ are positive integers defined for all $j \le n$. Furthermore, let $a_{n+1} > a_n$ such that (If there is no such $a_{n+1}$, then it would surely contradict our assumption that $\lim_{k\to\infty} \mu(A_k) = \varepsilon > 0$.) Now, let $b_{n+1} > b_n$ such that
!.£ > an 1 ( A bn 1 ) . -r + + (Such a b n + 1 should exists, because J.L a n 1 is 0 -continuous. ) For B n : + Abn \A bn + 1 , we have that J.La n + 1 (B n ) > �E . Therefo re for j being odd 8
II
=
and $j > k \ge 1$,
·( n J.L ( n
f..La
Then, for k > 1 ,
J
) !c:. En > kB n ) > !c:. }:n Bn >k
even:
�
even :
We can easily verify that the last inequality holds true also for all odd values of n. Consequently, for all k > 1,
JL( Abk) = t{ E :;'
tB s) > �c:.
The latter contradicts the assumption that $\lim_{k\to\infty} \mu(A_k) = \varepsilon > 0$. □

2.15 Theorem (of Monotone Convergence). Let $f \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$ and $\{\mu_1,\mu_2,\ldots\}$ be a monotone nondecreasing sequence of measures on a measurable space $(\Omega,\Sigma)$. Then there is a measure $\mu$ on $(\Omega,\Sigma)$ such that $\mu_n(A) \to \mu(A)$ for all $A$ of $\Sigma$ and
$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu. \tag{2.15}$$

Proof. Since $\{\mu_n\}$ is monotone nondecreasing, by Theorem 2.14 (i), the setwise limit $\mu$ of $\{\mu_n\}$ exists and it is a measure on $(\Omega,\Sigma)$. Since $f$ is nonnegative and $\mu_n \uparrow \mu$, the sequence $\{\int f\,d\mu_n\}$ is monotone nondecreasing and hence
$$\lim_{n\to\infty} \int f\,d\mu_n = \sup_n \int f\,d\mu_n \le \int f\,d\mu. \tag{2.15a}$$
The last inequality holds because of $\int f\,d\mu_n \le \int f\,d\mu$, which, in turn, is due to Problem 1.26. On the other hand, from Fatou's Lemma 2.10 applied to our case,
$$\int f\,d\mu \le \underline{\lim} \int f\,d\mu_n = \lim_{n\to\infty} \int f\,d\mu_n,$$
which, combined with (2.15a), yields the statement. □

The convergence theorems below are for sequences of functions and measures at once.

2.16 Lemma (Fatou). Let $\{\mu,\mu_1,\mu_2,\ldots\}$ be a sequence of measures on a measurable space $(\Omega,\Sigma)$ and let $\{f_n\} \subseteq \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$ such that for each $A \in \Sigma$, $\underline{\lim}\,\mu_n(A) \ge \mu(A)$. Then
$$\int f\,d\mu \le \underline{\lim} \int f_n\,d\mu_n, \tag{2.16}$$
where
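$$f(\omega) := \underline{\lim}\, f_n(\omega), \quad \omega \in \Omega. \tag{2.16a}$$

Proof. By assumption, $\{f_n\} \subseteq \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$. Then, for every fixed positive integer $N$ and for every $n \ge N$,
$$\int f_n\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu_n. \tag{2.16b}$$
Applying the version of Fatou's Lemma 2.10 to the right-hand side of (2.16b) we have
$$\underline{\lim} \int f_n\,d\mu_n \ge \underline{\lim} \int \inf\{f_m : m \ge N\}\,d\mu_n \ge \int \inf\{f_m : m \ge N\}\,d\mu. \tag{2.16c}$$
Since $\{\inf\{f_m : m \ge N\}\}_N \uparrow f$ defined in (2.16a), applying the standard Monotone Convergence Theorem 2.1, we arrive at
$$\int f\,d\mu \le \underline{\lim} \int f_n\,d\mu_n.$$
□

The following generalization of Fatou's Lemma 2.16 applies to arbitrary measurable functions $\{f_n\}$; its proof is left to the reader. (Problem 2.13.)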
2.17 Lemma (Fatou). In the condition of Fatou's Lemma 2.16, let $\{g,f_1,f_2,\ldots\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ such that for all $n$, $f_n \ge g$ and $\lim_{n\to\infty} \int g\,d\mu_n = \int g\,d\mu > -\infty$. Then,
$$\int f\,d\mu \le \underline{\lim} \int f_n\,d\mu_n, \tag{2.17}$$
where $f(\omega) := \underline{\lim}\, f_n(\omega)$, $\omega \in \Omega$. □

2.18 Theorem (Lebesgue's Dominated Convergence Theorem). Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$, $g \in \mathcal{C}^{-1}_{+}(\Omega,\Sigma)$, and $\{\nu,\mu,\mu_1,\mu_2,\ldots\}$ be a sequence of measures on the measurable space $(\Omega,\Sigma)$ such that:

(i) $\mu_n \le \nu$.

(ii) $f_n$ converges to a function $f$ in the topology of pointwise convergence.

(iii) $\mu_n$ converges to $\mu$ setwise.

(iv) $\int g\,d\nu < \infty$.

(v) $|f_n| \le g$.
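Then,
$$\lim_{n\to\infty} \int f_n\,d\mu_n = \int f\,d\mu. \tag{2.18}$$

Proof. Consider Theorem 2.11, for which we use the conditions (i), (iii), and (iv). Then, applying Theorem 2.11 to $g$ we have that
$$\lim_{n\to\infty} \int g\,d\mu_n = \int g\,d\mu.$$
Now, since $g \pm f_n \ge 0$ for all $n$, we have from Fatou's Lemma 2.17,
$$\int (g \pm f)\,d\mu \le \underline{\lim} \int (g \pm f_n)\,d\mu_n.$$
On the other hand, since $\int g\,d\mu \le \int g\,d\nu < \infty$,
$$\overline{\lim} \int f_n\,d\mu_n \le \int f\,d\mu \le \underline{\lim} \int f_n\,d\mu_n,$$
which yields the assertion. □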
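PROBLEMS

2.1 Prove Corollary 2.2.

2.2 Generalize the Monotone Convergence Theorem: Let $\{f_n\}\uparrow \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \ge g$ for all $n$ and suppose that $\int g\,d\mu > -\infty$. Prove that
$$\sup\Big\{\int f_n\,d\mu\Big\} = \int \sup\{f_n\}\,d\mu.$$

2.3 Show that if $\int g\,d\mu = -\infty$, the Generalized Monotone Convergence Theorem need not hold.

2.4 Let $\{f_n\}\downarrow \subseteq \mathcal{C}^{-1}$ and $g \in \mathcal{C}^{-1}$ such that $f_n \le g$ for all $n$. If $\int g\,d\mu < \infty$, show that
$$\inf\Big\{\int f_n\,d\mu\Big\} = \int \inf\{f_n\}\,d\mu.$$

2.5 Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{A_n\} \subseteq \Sigma$. Prove that
$$\mu(\underline{\lim}\, A_n) \le \underline{\lim}\,\mu(A_n)$$
and if $\mu < \infty$ that
$$\overline{\lim}\,\mu(A_n) \le \mu(\overline{\lim}\, A_n).$$
[Hint: Apply Fatou's Lemma 2.4 to the sequence of functions $\{1_{A_n}\}$ and use Problem 3.8, Chapter 1; then apply DeMorgan's law to prove the second inequality.]

2.6 Show that if $f_n \to f$ in mean then
$$\lim_{n\to\infty} \int f_n\,d\mu = \int f\,d\mu.$$

2.7 Generalize Fatou's Lemma 2.4 in the following way. Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $g \le f_n$ for all $n$. Let $\int g^{-}\,d\mu < \infty$. Show that
$$\int \underline{\lim}\, f_n\,d\mu \le \underline{\lim} \int f_n\,d\mu.$$

2.8 Let $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega,\Sigma)$ and $g \in \mathcal{C}^{-1}(\Omega,\Sigma)$ such that $f_n \le g$ for all $n$. Let $\int g^{+}\,d\mu < \infty$. Show that
$$\overline{\lim} \int f_n\,d\mu \le \int \overline{\lim}\, f_n\,d\mu.$$

2.9 Let
$$f_n(x) = \begin{cases} n, & 0 < x < \tfrac{1}{n}, \\ 0, & \tfrac{1}{n} \le x < \infty. \end{cases}$$
Show that $f_n \to 0$ $\lambda$-a.e. in the topology of pointwise convergence. Explain why
$$\int \lim_{n\to\infty} f_n\,\lambda(dx) \ne \lim_{n\to\infty} \int f_n\,\lambda(dx).$$

2.10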
·
Let
x>2
n·
Show that

2.11 Use Lebesgue's Dominated Convergence Theorem 2.6 to prove that for all $a > 0$,
$$\lim_{n\to\infty} \frac{n!\,n^{a-1}}{a(a+1)\cdots(a+n-1)} = \Gamma(a), \tag{P2.11}$$
where $\Gamma(a)$ is known to be the gamma function and it is expressed as the improper Riemann integral
$$\Gamma(a) = \int_0^{\infty} e^{-x}x^{a-1}\,dx. \tag{P2.11a}$$

2.12 Give an example of a monotone nonincreasing sequence of measures convergent to a set function $\mu$ setwise such that $\mu$ is not a measure.

2.13 Prove Fatou's Lemma 2.17.

2.14 Prove Theorem 2.9. [Hint: Use Theorem 2.6, the Mean Value Theorem, and Example 9.7 (ii), Chapter 3.]
NEW TERMS:
Monotone Convergence Theorem for functions
Beppo Levi's Corollary
Monotone Convergence Theorem, Generalized
Fatou's Lemma for functions
convergence in mean
Lebesgue's Dominated Convergence Theorem for functions
interchanging derivative and integral
Fatou's Lemma for measures
Lebesgue's Dominated Convergence Theorem for measures
setwise convergence of measures
setwise limit of measures
setwise convergence, criterion of
Monotone Convergence Theorem for measures
Fatou's Lemma for measures and nonnegative functions
Fatou's Lemma for measures and functions
Lebesgue's Dominated Convergence Theorem for measures and functions
gamma function
3. LEBESGUE AND RIEMANN INTEGRALS ON IR
In this section we will develop integration techniques in $L^1(\mathbb{R},\mathcal{B},\lambda;\mathbb{R})$ (see Definition 1.10 (ii)). The principal idea is to reduce the Lebesgue integral to the Riemann integral whenever it is possible, in combination with the main convergence theorems. The Riemann notion of an integral, introduced in 1854, was a refinement of the notion that originated with Cauchy in 1823. We begin with the concept of the Riemann integral of a bounded function on a compact interval suggested by the Frenchman Gaston (in some sources, Jean-Gaston) Darboux (1842-1917) in 1875. Although the construction below is self-contained, the reader is encouraged to go back to Example 9.9 (vi), Chapter 3, for topological preliminaries of this construction.

Let $\Omega = [a,b]$ be a compact interval in $\mathbb{R}$. By Definition 1.7 (ii), Chapter 1 (see also Example 9.9 (vi), Chapter 3), a partition of $[a,b]$ is any ordered $(n+1)$-tuple $P = P(n) = P(a_0,\ldots,a_n)$ with
$$P = \{a_0,\ldots,a_n \in [a,b] : a = a_0 < a_1 < \ldots < a_n = b\}.$$
Let $P_1$ and $P_2$ be two partitions of $[a,b]$. We say $P_2$ is finer than $P_1$ if $P_1 \subseteq P_2$. $P_2$ is also said to be a refinement of $P_1$ (in notation, $P_1 \preceq P_2$). Thus, if $\mathcal{P}$ is the set of all partitions of $[a,b]$, $\preceq$ is a partial order on $\mathcal{P}$.

Denote by $\mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b])) = \mathcal{C}_b^{-1}([a,b],\mathcal{B}\cap[a,b];\mathbb{R})$ the set of all real-valued, Borel-measurable, bounded functions on $[a,b]$. Let $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$ and let $P(n) = \{a_0,\ldots,a_n\}$ be a partition of $[a,b]$. We introduce the following notation:

$A_i = (a_{i-1},a_i]$, $i = 2,\ldots,n$, $A_1 = [a_0,a_1]$,
$m_i = \inf\{f(x) : x \in A_i\}$,
$M_i = \sup\{f(x) : x \in A_i\}$,
$L(f,P) = \sum_{i=1}^{n} m_i(a_i - a_{i-1})$ (the Darboux lower sum),
$U(f,P) = \sum_{i=1}^{n} M_i(a_i - a_{i-1})$ (the Darboux upper sum),
$l(f,P) = \sum_{i=1}^{n} m_i 1_{A_i}$,
$u(f,P) = \sum_{i=1}^{n} M_i 1_{A_i}$.

Clearly, the jump functions $l$ and $u$ are elements of $\Psi_+([a,b],\mathcal{B}([a,b]))$, i.e., are nonnegative simple Borel-measurable functions. Thus, $L(f,P)$ and $U(f,P)$ can be interpreted in terms of Lebesgue integrals as
$$L(f,P) = \int l(f,P)\,d\lambda = \int_{[a,b]} l(f,P)\,d\lambda$$
and
$$U(f,P) = \int u(f,P)\,d\lambda = \int_{[a,b]} u(f,P)\,d\lambda$$
(in agreement with Notation 1.13 (ii)).

Now let $\{P(n) = P(a_0,\ldots,a_n);\ n = 1,2,\ldots\}$ be a sequence of partitions of $[a,b]$ such that $\{P(n),\preceq\}$ is a chain. Denote by $|P(n)|$ the Lebesgue measure of the largest subinterval of $P(n)$ and call it the mesh of this partition. A chain $\{P(n),\preceq\}$ is said to be canonic if $\{|P(n)|\}$ is a monotone nonincreasing sequence vanishing for $n \to \infty$. Let $l_n = l(f,P(n))$ and $u_n = u(f,P(n))$ denote the lower and the upper jump functions corresponding to a partition $P(n)$ in a canonic chain. Then it can be easily verified that
$$l_1 \le l_2 \le \ldots \le f \le \ldots \le u_2 \le u_1.$$
Let $U_n = U(f,P(n)) = \int u_n\,d\lambda$ and $L_n = L(f,P(n)) = \int l_n\,d\lambda$. By monotonicity of the Lebesgue integral, we have
$$L_1 \le L_2 \le \ldots \le U_2 \le U_1.$$
Since $f$ is bounded, there exist $U_{-} = \inf U_n = \lim U_n$ (called the upper Darboux integral) and $L_{+} = \sup L_n = \lim L_n$ (called the lower Darboux integral).

3.1 Definition. If $U_{-} = L_{+}$, then their common value, $R(f,[a,b])$, is called the Riemann integral of the (bounded) function $f$ over $[a,b]$, and the function $f$ is called Riemann integrable. $R(f,[a,b])$ is also denoted by the symbol
$$(R)\int_{[a,b]} f(x)\,dx.$$
Sometimes, to tell a Lebesgue integral from a Riemann integral, we will write the former as $(L)\int_{[a,b]} f(x)\,dx$.

For notational consistency, most often we shall be using the $d\lambda$ symbol within the Lebesgue integral (rather than an "L" in front of it). However, many textbooks and papers routinely use the same symbol $dx$ in Lebesgue integrals as in Riemann integrals, which we do not believe should cause any serious confusion (and it makes $\lambda$ available for other notation). □
on [a ,b], then f E L ( [ a,b],<:B([a,b]),A; IR ) , i. e., it is Lebesgue integrable on [ a,b ]. In this case, the Riemann integ ral of f equals the Lebesg ue integral
3.
Lebesgue and Riemann Integ rals on lR
329
of f. f
be Riemann integrable. Then, nlim ---.oo ( U n - L n ) = 0. Applying Fatou's Lemma 2.4 to the sequence { u n - l n } C e + 1 ([ a , b ] , <:B([ a,b ]) ) , we have Proof. Let
0 < I lim (u n - l n ) dA = J lim (u n - l n ) dJ.l
�
< lim J (u n - l n ) dA = lim(U n - L n )
Because of *), Lemma 1 . 15, and the fact that elements of e + , we have that u = lim u n = l = lim ln = f
Also , since
f E eb- 1 ( [a,b], <:B([a,b]))
and
l n < J,
=
0.
un - f and f - l n are
a. e.
(3.2)
it follows that
l n E L 1 ([ a ,b ], �([ a ,b ]) , A; lR). Now we can apply Lebesgue's Dominated Convergence Theorem 2.6 to the sequence { l n } with respect to its a.e.-limit-function f to have
nlim ---. oo [ aJ, b ] l n d A = nlim ---. oo L n = (R)[ aJ, b ] f (x)dx = [ aI, b ] f d A.
D
3.3 Remarks.
3.3 Remarks.

(i) The functions $u = \inf u_n$ and $l = \sup l_n$ are called the upper and the lower Baire functions. Therefore, $L_{+}$ is the Lebesgue integral of the lower Baire function $l$.

(ii) The above construction of the Riemann integral, which is now common in mathematical analysis courses, belongs to Gaston Darboux in his work Mémoire sur la théorie des fonctions discontinues of 1875. The original construction of the Riemann integral, which goes back to Augustin-Louis Cauchy (1789-1857) in 1823 (and was later generalized by Riemann in his Habilitationsschrift of 1854), is as follows. Given a function $f \in \mathcal{C}_b^{-1}([a,b],\mathcal{B}([a,b]))$ and a partition $P$ from a canonic sequence of partitions $\{P(n) = P(a_0,\ldots,a_n);\ n = 1,2,\ldots\}$ of the interval $[a,b]$, define the Cauchy sum as
$$C(f,P(n)) = \sum_{i=1}^{n} f(\xi_i)(a_i - a_{i-1}), \tag{3.3}$$
where $\xi_i$ is any point of $[a_{i-1},a_i]$. Note that, unlike the Darboux sum, the Cauchy sum is not specified because the $\xi_i$'s are arbitrary. If the limit
330
CHAPTER 6. ELEMENTS O F INTEGRATION
C == lim I P(n) I -+ 0 C( f , P ( n )) exists as a unique number, then f is called Cauchy inte g rable the value of this limit is denoted by (C) J :f (x) dx . Clearly,
on [a,b] and
and therefore the Cauchy integral exists if L + = U _ In 1875, Darboux proved that this is also a necessary condition for the existence of C and in this case, C = R. Darboux's theorem and his approach are subjects in most standard texts in mathematical analysis, while Riemann's concept of the integral is more common in calculus classes as it leads to a quicker and more lucid interpretation. As a sufficient condition of the existence of the Riemann integral, Cauchy required that f be continuous on [a,b]. Riemann relaxed Cauchy's integrability condition by requiring that for each e > 0, there is a partition P of [ a , b] such that U _ ( J P ) - L + ( f , P ) < e. However, Riemann did not specify the class of functions, which are subject to integration (although he pointed out that a function can be discontinuous on a dense set and nevertheless integrable), as Lebesgue did in his Theorem 3. 5 which is to follow. D 3.4 Example. Let f be the Dirichlet jump function introduced in Example 1.4. Consider its modification .
,
f (x) = lQ
n
[ o , 11 ( x ) E e + 1 ([0,1],<:B([0,1 ] )).
The Lebesgue integral of f exists and equals zero. The Riemann integral of J, however, does not exist, since for every partition, the lower Baire function equals 0 ( l = 0) and the upper Baire function equals 1 ( u = 1). Therefore, the lower Darboux integral L + = 0, and the upper Darboux integral U _ = 1. D 3.5 Tlieorem (H. Lebesgue). Let f E e b- 1 ([a,b], <:B([a,b])). Then f is Riemann integrable on [ a , b] if and only if f is continuous >. -a.e. on [ a,b ]. Proof.
(i) Observe that if $f$ is continuous on $[a,b]$, then it is uniformly continuous on $[a,b]$. This implies that for each $\varepsilon > 0$, there is a $\delta > 0$, such that for each partition $P$ whose mesh is less than $\delta$,

$$U(f,P) - L(f,P) < \varepsilon. \tag{3.5}$$

(Show it, see Problem 3.1.) This leads to Riemann integrability.

(ii) Let $f$ be bounded, Borel-measurable and $\lambda$-a.e. continuous on $[a,b]$. If $f$ is not continuous everywhere, but is bounded, it can have only discontinuities of finite magnitude. From the nature of the lower and the upper Baire functions, $l$ and $u$, it follows that $l$ and $u$ coincide with $f$ at all points of continuity of $f$. (A rigorous proof of this statement, known as Baire's theorem, is contained in many standard analysis textbooks.) At the points of discontinuity of $f$, $l$ assumes the smallest values and $u$ takes the largest values (this can be shown by elementary methods). (See Figure 3.1.)
Figure 3.1. The graphs of $u$ and $l$: $u(x) = l(x)$ for $x \ne x_0$, and the two functions separate at the point of discontinuity $x_0$.

Then, if $f$ is discontinuous on a negligible set $S$, it equivalently follows that $u$ and $l$ differ on the same set $S$. By the above condition, $S \subset N$ where $N$ is a measurable null set. Since $f$ is bounded, $u_n$ and $l_n$ are measurable, bounded jump functions, and $U_n$ and $L_n$ exist. By Lebesgue's Dominated Convergence Theorem, $U_- - L_+ = 0$, which implies that $f$ is Riemann integrable. Indeed,

$$U_- - L_+ = \lim_{n\to\infty} U_n - \lim_{n\to\infty} L_n = \int u\,d\lambda - \int l\,d\lambda = 0,$$

by Lemma 1.15, since $u = l$ on $N^c$, i.e., a.e.

(iii) Let $f$ be Riemann integrable. Then, by (3.2),

$$f = l = \lim_{n\to\infty} l_n = u = \lim_{n\to\infty} u_n \quad \text{a.e.}$$

Furthermore, $f$ is bounded. We repeat the above arguments. From the nature of $u$ and $l$, it follows that, in this case, $u$, $l$, and $f$ coincide wherever $f$ is continuous. At all points of discontinuity, while $f$ assumes one of these values, the smallest values of $f$ will be assigned to $l$ and the largest ones to $u$. Therefore, the set on which the function $f$ is discontinuous equals the set on which $u$ and $l$ differ. This proves that $f$ is continuous $\lambda$-a.e. □

3.6 Remarks.
(i) By employing a canonic chain of partitions on the X-axis in the construction of the Riemann integral, we sometimes face the problem that the sequence of the corresponding lower jump functions $\{l_n\}$ converges to the lower Baire function $l$, but it does not converge to $f$, as it turns out for the Dirichlet function. Consequently, the lower Darboux integral gives a "wrong" value. In contrast, the construction of the Lebesgue integral literally sets up partitions on the Y-axis whose canonic chains form monotone increasing sequences of lower jump functions. The latter, due to Theorem 5.5, Chapter 5, always converge to $f$. Consequently, the "lower Darboux integral" $L_+$ equals the Lebesgue integral $\int f\,d\mu$.

(ii) Although Riemann and Darboux enlarged the previously existing class of integrable functions, the Riemann integral has a plethora of limitations, one of which goes back to the fundamental theorem of calculus in the form

$$(R)\int_a^b f'(x)\,dx = f(b) - f(a).$$

This formula becomes meaningless when the derivative $f'$ of a differentiable function $f$ is not Riemann integrable. On the other hand, the classical proof of the formula

$$\frac{d}{dx}\int_a^x f(u)\,du = f(x)$$

was originally based on the continuity assumption for $f$. The new concept of integration suggested by Henri Lebesgue in 1902 in his doctoral work restored the generality of the fundamental theorem to its current status. Furthermore, the class of Lebesgue integrable functions is significantly enlarged. Notice that from Theorem 6.5, Chapter 5, it follows that, in contrast with the Cauchy-Riemann-Darboux formation of partitions of $[a,b]$, essentially leading to Definition (3.3), the Lebesgue construction of the integral of an (initially nonnegative) function $f$ suggests partitions of the interval $[0, \sup f]$ on the Y-axis instead. The latter leads to a notion of a sequence of nonnegative simple functions $\{s_n\}$ approximating $f$ from below, a very elegant and lucid definition of the integral of a nonnegative simple function, and, as a consequence, the definition of the integral $\int f\,d\lambda$ as $\sup\{\int s_n\,d\lambda\}$. The function $f$ need not be $\lambda$-a.e. continuous, nor need it even be bounded.

(iii) As we mentioned, in order that a function be Lebesgue integrable, it need not be bounded. The class of Riemann-integrable functions, as is known, can be "extended" to unbounded functions by the use of the "improper integral." Another need for the improper integral arises when the interval of integration is unbounded. In the latter case, the integral is constructed as usual on a compact interval $[a,b]$, and then its values are taken for $a \to -\infty$ or $b \to \infty$. This is a "trick" rather than a proper integral construction. That is why such integrals are called improper.

(iv) Unlike this type of improper integration over infinite intervals, there is another way to integrate functions with the conventional approach of constructing an integral via uniform "partitions" of the infinite interval. Consider as an example a bounded Borel measurable function $f$ on an interval $[a,\infty)$ and a partition of this interval by the sequence $\{a_n\}$, where $a_n = a + \delta n$, $n = 0,1,\ldots$, for some positive $\delta$. Then on each of the intervals $\Delta_n = [a_n, a_{n+1})$ consider

$$m_n = \inf\{f(x) : x \in \Delta_n\} \quad \text{and} \quad M_n = \sup\{f(x) : x \in \Delta_n\}.$$

Since the Lebesgue measure of each interval $\Delta_n$ equals $\delta$, we have again the lower Darboux sum,

$$L(f,\delta) = \delta \sum_{n=0}^{\infty} m_n,$$

and the upper Darboux sum,

$$U(f,\delta) = \delta \sum_{n=0}^{\infty} M_n.$$

If $\lim_{\delta \downarrow 0} L(f,\delta) = \lim_{\delta \downarrow 0} U(f,\delta)$, then the common value is denoted by

$$(D)\int_a^{\infty} f(x)\,dx$$

and called the direct Riemann integral. The function $f$ is then said to be directly Riemann integrable. The direct integrability is used in probability, specifically in renewal theory, where such a notion is introduced for a class of nonnegative functions bounded over finite intervals. □
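The direct Riemann integral of Remark 3.6 (iv) can be traced through on a concrete integrand. The following is a sketch under assumptions made here, not in the text: $f(x) = e^{-x}$ on $[0,\infty)$ (so $a = 0$), with the lower and upper Darboux sums taken as $\delta\sum m_n$ and $\delta\sum M_n$ over the cells $\Delta_n = [n\delta, (n+1)\delta)$.

```latex
% Assumed example: f(x) = e^{-x} on [0, infinity), a = 0, cells Delta_n = [n*delta, (n+1)*delta).
% f is decreasing, so m_n = e^{-(n+1)delta} (infimum, not attained) and M_n = e^{-n delta}.
\begin{align*}
L(f,\delta) &= \delta \sum_{n=0}^{\infty} e^{-(n+1)\delta}
             = \frac{\delta\, e^{-\delta}}{1 - e^{-\delta}}
             = \frac{\delta}{e^{\delta} - 1}, \\
U(f,\delta) &= \delta \sum_{n=0}^{\infty} e^{-n\delta}
             = \frac{\delta}{1 - e^{-\delta}}, \\
U(f,\delta) - L(f,\delta) &= \delta \longrightarrow 0 \quad (\delta \downarrow 0).
\end{align*}
% Since delta/(e^delta - 1) -> 1 as delta -> 0, both sums share the limit 1.
\[
(D)\int_0^{\infty} e^{-x}\,dx = 1 .
\]
```

The common limit agrees with the usual improper Riemann integral of $e^{-x}$ over $[0,\infty)$.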
3.7 Examples.

(i) Let $\Omega = [0,1]$ and let $f(x) = x^2\,\mathbf{1}_A(x) + \sin x\,\mathbf{1}_{A^c}(x)$, where $A^c$ is the Cantor ternary set. The function $f$ is a bounded Borel-measurable function on $[0,1]$ and obviously $\lambda$-a.e. continuous on $[0,1]$. Thus, $f$ is Lebesgue as well as Riemann integrable and $f(x) = x^2$ $\lambda$-a.e. on $[0,1]$. Furthermore,

$$(R)\int_{[0,1]} f(x)\,dx = (L)\int_{[0,1]} f(x)\,\lambda(dx) = (L)\int_{[0,1]} x^2\,\lambda(dx) = (R)\int_{[0,1]} x^2\,dx = \frac{1}{3}.$$

(ii) Let $\Omega = [1,2]$ and $f(x) = (x-1)^{-1/3}$. We wish to evaluate $\int_{[1,2]} f(x)\,\lambda(dx)$. Since $f$ is no longer bounded (on $[1,2]$) we cannot apply the same techniques as discussed above. Consequently, we introduce an auxiliary sequence of functions, $\{f_n\}$, defined as

$$f_n(x) = \begin{cases} n, & 1 \le x < 1 + \frac{1}{n^3}, \\ (x-1)^{-1/3}, & 1 + \frac{1}{n^3} \le x \le 2 \end{cases}$$

(see Figure 3.2). It is easily seen that $\{f_n\}$ is a monotone increasing sequence of continuous functions contained in $\mathcal{C}^{-1}_+([1,2], \mathcal{B}([1,2]))$ with $\sup\{f_n\} = f$.

Figure 3.2. The truncated functions $f_n$ on $[1,2]$.

By Proposition 5.6 (iv), Chapter 5, $f \in \mathcal{C}^{-1}_+$. By the Monotone Convergence Theorem,

$$\int_{[1,2]} f(x)\,\lambda(dx) = \sup_n \int_{[1,2]} f_n(x)\,\lambda(dx).$$

On the other hand,

$$\int_{[1,2]} f_n\,d\lambda = (R)\int_{[1,2]} f_n(x)\,dx = (R)\int_{[1,\,1+1/n^3]} n\,dx + (R)\int_{[1+1/n^3,\,2]} (x-1)^{-1/3}\,dx = \frac{3}{2} - \frac{1}{2}\cdot\frac{1}{n^2}.$$

Thus,

$$\sup_n \int_{[1,2]} f_n(x)\,\lambda(dx) = \frac{3}{2}.$$
Observe that the improper integration technique for unbounded functions could also be applied to this function. □

3.8 Remark. The Lebesgue integrable functions constitute a much wider class in comparison to the Riemann integrable functions. It should also be mentioned that an $L^1(\mathbb{R}, \mathcal{B}, \lambda)$-function $f$ can be integrated over arbitrary Borel sets, while the Riemann integral is defined just on intervals. With all these advantages, however, the Lebesgue integral does not have the same elegance and analytical tractability the Riemann integral has, due to its "Newton-Leibniz bridge" to derivatives and a huge inventory of integration techniques. In many cases, whenever possible, the Lebesgue integral is just reduced to a Riemann integral. In addition, the class of Riemann integrable functions is traditionally enlarged to include those functions which are Riemann integrable in an improper sense. These are functions with discontinuities of an infinite magnitude and functions defined on intervals of type $[a,\infty)$ or $(-\infty,b]$ or $(-\infty,\infty)$.

In Example 3.7 (ii) we examined a Lebesgue integral of an unbounded function. In a certain sense, the approach used there reminds us of Riemann integration of unbounded functions. In the proposition below we will state that in most cases, when the integration over an infinite interval is needed, we can use Riemann integration in the improper sense and equate its values to those of Lebesgue integrals. This fact makes the Riemann improper integral more legitimate. □

3.9 Proposition. Let $f \in \mathcal{C}^{-1}_+(\mathbb{R}, \mathcal{B}; \mathbb{R}_+)$ and let $f$ be Riemann integrable on any compact interval. Then $f \in L^1(\mathbb{R}, \mathcal{B}, \lambda; \mathbb{R}_+)$ if and only if the improper Riemann integral of $f$,

$$R = \lim_{\substack{a \to -\infty \\ b \to \infty}} (R)\int_{[a,b]} f(x)\,dx,$$

exists. (We say that $f \in \mathcal{R}(\mathbb{R})$, where $\mathcal{R}(\mathbb{R})$ is the class of all functions on $\mathbb{R}$ Riemann integrable in the improper sense.) In this case $R = \int f\,d\lambda$.

Proof. Denote

$$R_{nk} = (R)\int_{B_{nk}} f(x)\,dx, \quad \text{where } B_{nk} = [-k, n].$$

Then, since $f$ is Riemann integrable, $R_{nk} = \int f\,\mathbf{1}_{B_{nk}}\,d\lambda$. Observing that

$$f = \sup\{f\,\mathbf{1}_{B_{nk}} : n = 1,2,\ldots;\ k = 1,2,\ldots\},$$

we have, by the Monotone Convergence Theorem,

$$\int f\,d\lambda = \sup R_{nk} = R < \infty.$$

□
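The mechanism of this proof can be followed on a concrete function. The sketch below assumes $f(x) = e^{-|x|}$, a choice made here for illustration only; $R_{nk}$ and $B_{nk} = [-k,n]$ are as in the proof.

```latex
% Assumed example: f(x) = e^{-|x|}, nonnegative and continuous, hence
% Riemann integrable on every compact interval B_{nk} = [-k, n].
\begin{align*}
R_{nk} &= (R)\int_{-k}^{0} e^{x}\,dx + (R)\int_{0}^{n} e^{-x}\,dx
        = \bigl(1 - e^{-k}\bigr) + \bigl(1 - e^{-n}\bigr), \\
\sup_{n,k} R_{nk} &= \lim_{n,k \to \infty} \bigl(2 - e^{-k} - e^{-n}\bigr) = 2 .
\end{align*}
% By the Monotone Convergence Theorem the Lebesgue integral takes the same value.
\[
\int_{\mathbb{R}} e^{-|x|}\,\lambda(dx) = R = 2 < \infty .
\]
```

Since $R_{nk}$ increases in both indices, the supremum and the double limit coincide, which is what the Monotone Convergence Theorem step uses.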
3.10 Remark. The special case treated in the above proposition applied to nonnegative Borel measurable functions can easily be extended to arbitrary functions of $\mathcal{C}^{-1}$ by our noticing that $f \in L^1(\mathbb{R}, \mathcal{B}, \lambda; \mathbb{R})$ if and only if $|f| \in L^1(\mathbb{R}, \mathcal{B}, \lambda; \mathbb{R})$. Therefore, using Proposition 3.9, we conclude that $|f|$ must be an element of $\mathcal{R}(\mathbb{R})$. In this case, evidently,

$$(R)\int_{-\infty}^{\infty} f(x)\,dx = (R)\int_{-\infty}^{\infty} f^+(x)\,dx - (R)\int_{-\infty}^{\infty} f^-(x)\,dx = \int f^+\,d\lambda - \int f^-\,d\lambda = \int f\,d\lambda.$$

□

3.11 Examples.

(i) Consider the function $f(x) = \dfrac{x \sin x}{k^2 + x^2}$ (where $k \ne 0$). We show that this function is Riemann integrable in the improper sense but not Lebesgue integrable over $\mathbb{R}_+$. We apply the Dirichlet criterion:

Let $g$ and $h$ be two real-valued functions defined on $[a,\infty)$. If $g$ is monotonically vanishing at $\infty$ and $\left|(R)\int_a^b h(x)\,dx\right| \le C$ for each $b > a$ and some positive real number $C$, i.e., the integral of $h$ is uniformly bounded in $b$, then the improper integral $(R)\int_a^{\infty} g h$ is convergent.

In our case, the function $\dfrac{x}{k^2 + x^2}$ can be taken for $g$ and $\sin x$ can represent $h$, with $a = 0$, and consequently, $(R)\int_0^{\infty} f$ converges. On the other hand, $f \in L^1$ if and only if $|f| \in L^1$, and

$$\int_0^{\infty} |f|\,d\lambda = \sum_{n=0}^{\infty} \int_0^{\pi} \frac{(t + n\pi)\sin t}{k^2 + (n\pi + t)^2}\,dt \ge \sum_{n=0}^{\infty} \frac{\pi n}{k^2 + \pi^2 (n+1)^2} \int_0^{\pi} \sin t\,dt = \infty$$

(the second summation is due to the inequality $\pi n + t \le (n+1)\pi$, for $t \in [0,\pi]$). Thus, $f$ is not Lebesgue integrable over $\mathbb{R}_+$.
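The divergence of the lower-bound series in Example 3.11 (i) can be made explicit by comparison with the harmonic series. The two elementary inequalities used below are supplied here as a sketch and are not taken from the text.

```latex
% Uses \int_0^\pi \sin t \, dt = 2.  For n >= 1:
%   k^2 + \pi^2 (n+1)^2 <= (k^2 + \pi^2)(n+1)^2   because (n+1)^2 >= 1,
%   n/(n+1)^2 >= 1/(4n)                            because 2n >= n+1.
\begin{align*}
\sum_{n=1}^{\infty} \frac{2\pi n}{k^2 + \pi^2 (n+1)^2}
  &\ge \frac{2\pi}{k^2 + \pi^2} \sum_{n=1}^{\infty} \frac{n}{(n+1)^2} \\
  &\ge \frac{2\pi}{k^2 + \pi^2} \sum_{n=1}^{\infty} \frac{1}{4n}
   = \frac{\pi}{2\,(k^2 + \pi^2)} \sum_{n=1}^{\infty} \frac{1}{n} = \infty .
\end{align*}
```

Hence $\int_0^{\infty} |f|\,d\lambda = \infty$, although the improper Riemann integral of $f$ itself converges.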
(ii) The function $f(x) = \sin x\,\exp\left(-\frac{x^2}{2}\right)$ is an element of $\mathcal{C}^{-1}$ and it is Lebesgue integrable, because $|f(x)| \le g(x) = \exp\left(-\frac{x^2}{2}\right)$, $g(x) > 0$, and

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(x)\,dx = 1.$$

Observe that $x \mapsto \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$, $x \in \mathbb{R}$, is the density function of the standard normal distribution. (See Example 5.10 (iii).) □

PROBLEMS

3.1 Prove (3.5) in Theorem 3.5.

3.2 In Example 3.4, we showed that the Dirichlet function $f$ on $[0,1]$ is Lebesgue integrable, but not Riemann integrable. Since the rationals have Lebesgue measure 0, the function $f$ is equal to 0 (a constant) for $\lambda$-almost all points on $[0,1]$, and therefore, it is continuous almost everywhere on $[0,1]$. By Theorem 3.5, $f$ must be Riemann integrable. This is just the opposite of the result of Example 3.4. What is wrong with this reasoning?

3.3 Is the function $f(x) = \ldots$ on $[0,1]$ Borel-measurable and $\lambda$-integrable?

3.4 Show that the function $f$, such that $f(x) = \ldots \cos(\ldots)$ on $(0,1]$ and $f(0) = 0$, is Borel-measurable and not $\lambda$-integrable.

3.5 Let $f: [0,1] \to \mathbb{R}$ be defined as $f(x) = \ldots$ for $x \in (0,1]$ and $f(x) = 0$ for $x = 0$. Show that $f$ is improperly Riemann integrable but not Lebesgue integrable.

3.6 Let $f$ be a monotone increasing differentiable function on $[a,b]$ and let $\varphi$ be its inverse function on $[f(a), f(b)]$. Prove that
$$\int_a^b f(x)\,\lambda(dx) = \int_{f(a)}^{f(b)} y\,\varphi'(y)\,\lambda(dy).$$

3.7 Investigate if the function $f(x) = \dfrac{\sin x}{x^{\alpha}}\,\mathbf{1}_{\{x \ne 0\}}(x)$ (where $0 < \alpha < 1$) is improperly Riemann and Lebesgue integrable over $\mathbb{R}_+$.

3.8 Let $G$ be a nonempty open subset of $[a,b]$ and let $f$ be a Borel measurable function on $[a,b]$, discontinuous at each point of $G$. Can $f$ be Riemann integrable?

3.9 Show that the functional

$$\|f - g\|_{L^1} = \int_a^b |f - g|\,d\lambda$$

defines a semi-norm on $L^1([a,b], \mathcal{B}([a,b]), \lambda)$. How can $\|\cdot\|_{L^1}$ become a norm?

3.10 Let $s \in \Psi_+([a,b], \mathcal{B}([a,b]))$. Show that for each $\varepsilon > 0$, there is a continuous function $h \in \mathcal{C}([a,b])$ such that $\|s - h\|_{L^1} < \varepsilon$.

3.11 Show that the space $\mathcal{C}([a,b])$ of all continuous functions on the interval $[a,b]$ is dense in $(L^1([a,b], \mathcal{B}([a,b]), \lambda), \|\cdot\|_{L^1})$.

3.12 Use Lebesgue's Theorem 3.5 to show that the limit of a uniformly convergent sequence $\{f_n\}$ of bounded Riemann integrable functions on $[a,b]$ is Riemann integrable on $[a,b]$. Prove that under this condition,

$$(R)\int_a^b \lim_{n\to\infty} f_n(x)\,dx = \lim_{n\to\infty} (R)\int_a^b f_n(x)\,dx. \tag{P3.12}$$

3.13 Let $A$ be a closed negligible subset of $[a,b]$. Is the function $\mathbf{1}_A$ Riemann integrable?

3.14 Let $A$ be a subset of $[a,b]$ whose closure is negligible. Is $\mathbf{1}_A$ Riemann integrable?

3.15 Let $\{f_n\}$ be a sequence of bounded, Borel measurable, nonnegative functions on $A \subset \mathbb{R}$. Suppose $(L)\int_A f_n\,d\lambda \to 0$ for $n \to \infty$. Is it true that $f_n \to 0$ $\lambda$-a.e. on $A$?
NEW TERMS:

partition
refinement
Borel-measurable bounded functions
Darboux lower sum
Darboux upper sum
mesh of a partition
canonic chain of partitions
upper Darboux integral
lower Darboux integral
Riemann integral
Riemann integrable function
upper Baire function
lower Baire function
Cauchy sum
Cauchy integrable function
Dirichlet function
Lebesgue's Theorem of Riemann integrability
improper Riemann integral
direct Riemann integral
direct Riemann integrability
Dirichlet's criterion
4. INTEGRATION WITH RESPECT TO IMAGE MEASURES
As one of the extensions of major integration techniques, we will study integration with respect to an image measure $\mu F^*$ (where $F$ is a measurable mapping), nicknamed change of variables, as it resembles the prominent method for the Riemann integral. In this section we will restrict our attention to the abstract integral. A more specific approach to a change of variables for Lebesgue integrals in Euclidean spaces will be treated separately in Chapter 7.
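4.1 Theorem (Change of Variables). Let $(\Omega_0, \Sigma_0, \mu)$ be a measure space, $f \in \mathcal{C}^{-1}(\Omega, \Sigma)$, and $F: (\Omega_0, \Sigma_0) \to (\Omega, \Sigma)$ be a measurable map (such that $\mu F^*$ is an image measure on the measurable space $(\Omega, \Sigma)$). Then, the following formula holds true:

$$\int f \circ F\,d\mu = \int f\,d\mu F^*. \tag{4.1}$$

Specifically, if $f = g\,\mathbf{1}_A$, where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega, \Sigma)$, then (4.1) reduces to

$$\int_{F^*(A)} g \circ F\,d\mu = \int_A g\,d\mu F^*. \tag{4.1a}$$

Proof.

(i) Let $s \in \Psi_+(\Omega, \Sigma)$ be just an indicator function $s = \mathbf{1}_A$. By Problem 3.7, Chapter 1, we have that $\mathbf{1}_A \circ F = \mathbf{1}_{F^*(A)}$. Therefore,

$$\int \mathbf{1}_A \circ F(\omega_0)\,d\mu(\omega_0) = \int \mathbf{1}_{F^*(A)}(\omega_0)\,d\mu(\omega_0) = \mu(F^*(A)) = \mu F^*(A) = \int \mathbf{1}_A(\omega)\,d\mu F^*(\omega).$$

(ii) Let $s$ be a nonnegative simple function with the representation $s = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$. Then,

$$s \circ F = \sum_{i=1}^n a_i\,\mathbf{1}_{F^*(A_i)}$$

and

$$\int s \circ F\,d\mu = \sum_{i=1}^n a_i\,\mu F^*(A_i) = \int s\,d\mu F^*.$$

(iii) Let $f \in \mathcal{C}^{-1}_+(\Omega, \Sigma)$. Then there exists $\{s_n\}\uparrow\ \subset \Psi_+$ such that $f = \sup\{s_n\}$. For $s_n$ we have, according to (ii):

$$\int s_n \circ F\,d\mu = \int s_n\,d\mu F^*.$$

Observe that $\{s_n \circ F\}\uparrow\ \subset \Psi_+(\Omega_0, \Sigma_0)$ and, by Proposition 5.6 (iv),

$$\sup\{s_n \circ F\} = f \circ F \in \mathcal{C}^{-1}_+(\Omega_0, \Sigma_0).$$

Therefore, we have that

$$\int f \circ F\,d\mu = \sup\left\{\int s_n \circ F\,d\mu\right\} = \sup\left\{\int s_n\,d\mu F^*\right\} = \int f\,d\mu F^*.$$

(iv) Let $f \in \mathcal{C}^{-1}(\Omega, \Sigma)$. Then, $f = f^+ - f^-$ and, according to Problem 4.1,

$$(f \circ F)^+ = f^+ \circ F \quad \text{and} \quad (f \circ F)^- = f^- \circ F.$$

Therefore,

$$f \circ F = (f \circ F)^+ - (f \circ F)^- = f^+ \circ F - f^- \circ F,$$

and this, along with (iii), implies that

$$\int f \circ F\,d\mu = \int (f \circ F)^+\,d\mu - \int (f \circ F)^-\,d\mu = \int f^+ \circ F\,d\mu - \int f^- \circ F\,d\mu = \int f^+\,d\mu F^* - \int f^-\,d\mu F^* = \int f\,d\mu F^*.$$

(v) Let $f = g\,\mathbf{1}_A$ where $A \in \Sigma$ and $g \in \mathcal{C}^{-1}(\Omega, \Sigma)$. Then we have,

$$\int_{F^*(A)} g \circ F\,d\mu = \int (g \circ F)\,\mathbf{1}_{F^*(A)}\,d\mu = \int (g\,\mathbf{1}_A) \circ F\,d\mu = \int g\,\mathbf{1}_A\,d\mu F^* = \int_A g\,d\mu F^*.$$

□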
4.2 Corollary. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $F: (\Omega, \Sigma) \to (\Omega, \Sigma)$ be a bijective transformation which is $\Sigma$-$\Sigma$ measurable along with its inverse $F^*$. Then, for each $f \in \mathcal{C}^{-1}(\Omega, \Sigma)$, the following formula holds true:

$$\int_A f\,d\mu = \int_{F_*(A)} f \circ F^*\,d\mu F^*. \tag{4.2}$$

(See Problem 4.2.) □
4.3 Examples.

(i) Let $f \in \mathcal{C}^{-1}(\mathbb{R}^n, \mathcal{B}^n)$ and $L(x) = \alpha x + b$ for $\alpha \in \mathbb{R}$, $\alpha \ne 0$, and $b \in \mathbb{R}^n$. Then,

$$I = \int_B f(\alpha x + b)\,\lambda(dx) = \int_B f \circ L(x)\,\lambda(dx),$$

where $B$ is any Borel set and $\lambda$ is the Lebesgue measure. Let $A = L_*(B)$. Representing the Borel set $B$ as

$$B = L^* \circ L_*(B) = L^*(A) = \frac{1}{\alpha}(A - b),$$

we have

$$I = \int_{L^*(A)} f \circ L(x)\,\lambda(dx) \ \ (\text{by (4.1a)})\ = \int_A f(x)\,\lambda L^*(dx).$$

By Proposition 4.3, Chapter 5, $\lambda L^* = \left|\frac{1}{\alpha}\right|^n \lambda$, implying that

$$I = \left|\frac{1}{\alpha}\right|^n \int_A f(x)\,\lambda(dx), \quad \text{where } A = \alpha B + b. \tag{4.3}$$

(4.3) is due to Problem 1.23, i.e., due to the fact that $\int f\,d(c\mu) = c\int f\,d\mu$, where $c > 0$.

(ii) Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space and let $X \in \mathcal{C}^{-1}(\Omega, \Sigma)$ be a random variable. Recall that $X$ induces the image measure $\mathbb{P}X^*$, or, equivalently, the probability distribution on the measurable space $(\mathbb{R}, \mathcal{B})$, thereby generating the new probability space $(\mathbb{R}, \mathcal{B}, \mathbb{P}X^*)$. The functional of $X$, $\int X(\omega)\,\mathbb{P}(d\omega)$, was called (in Definition 1.10 (iv)) the expectation of the random variable $X$ and denoted by the symbol $\mathbb{E}[X]$. Let $g \in \mathcal{C}^{-1}(\mathbb{R}, \mathcal{B})$. Then, $g \circ X$ is also a random variable whose expectation is

$$\mathbb{E}[g \circ X] = \int g \circ X(\omega)\,\mathbb{P}(d\omega).$$

By formula (4.1), we have

$$\mathbb{E}[g \circ X] = \int g(x)\,\mathbb{P}X^*(dx). \tag{4.3a}$$

Specifically, if $g(x) = x$, we have

$$\mathbb{E}[X] = \int x\,\mathbb{P}X^*(dx).$$

If $g = \mathbf{1}_A$, then from (4.3a), Notation 1.13, and Definition 4.2, Chapter 5,

$$\mathbb{E}[\mathbf{1}_A \circ X] = \int_A \mathbb{P}X^*(dx) = \mathbb{P}X^*(A) = \mathbb{P}\{X \in A\}. \tag{4.3b}$$

□
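The scaling formula of Example 4.3 (i) can be checked by hand in one dimension. The sketch below assumes $n = 1$, $\alpha = 2$, $b = 1$, $B = [0,1]$ and $f(x) = x$; these values are chosen here for illustration and are not part of the text.

```latex
% Assumed data: n = 1, L(x) = 2x + 1, B = [0,1], f(x) = x.
% Then A = alpha*B + b = [1,3] and |1/alpha|^n = 1/2.
\begin{align*}
\text{left-hand side:}\quad
I &= \int_{[0,1]} (2x + 1)\,\lambda(dx) = \bigl[x^2 + x\bigr]_0^1 = 2, \\
\text{right-hand side:}\quad
\Bigl|\tfrac{1}{\alpha}\Bigr|^{n} \int_{A} f(x)\,\lambda(dx)
  &= \frac{1}{2}\int_{[1,3]} x\,\lambda(dx)
   = \frac{1}{2}\cdot\frac{9 - 1}{2} = 2 .
\end{align*}
```

Both sides equal 2, in agreement with the formula $I = |1/\alpha|^n \int_A f\,d\lambda$ with $A = \alpha B + b$.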
PROBLEMS

4.1 Show that $(f \circ F)^+ = f^+ \circ F$ and $(f \circ F)^- = f^- \circ F$.

4.2 Prove Corollary 4.2.

4.3 Simplify $\int_A f(e^{2x})\,\lambda(dx)$, where $f \in \mathcal{C}^{-1}(\mathbb{R}, \mathcal{B}; \mathbb{R})$ and $A = [1,2]$.

4.4 Use the change of variables formula to evaluate the integral $\int_A f(2x+1)\,\lambda(dx)$, where $f(x) = \ldots$ and $A = [1,3]$.
NEW TERMS:

change of variables
change of variables for a bijective transformation
expectation of a random variable
expectation of a function of a random variable
5. MEASURES GENERATED BY INTEGRALS. ABSOLUTE CONTINUITY. ORTHOGONALITY
In this section we will learn that the integral $\int_A f\,d\mu$, as a set function $A \mapsto \nu(A)$, turns out to be a measure. Hence the two measures, $\mu$ (the original measure) and $\nu$ (generated by the integral), are related through the given integrand function $f$, which is referred to as a density. Now, under what condition imposed on two arbitrary given measures can a density function exist? This question leads to one of the central results in measure theory and integration, known as the Radon-Nikodym Theorem, which specifies exactly that condition. This section gives a very brief and informal acquaintance with the Radon-Nikodym Theorem and its ramifications, needed to advance to the upcoming material and serving as an introduction. A more elaborate and general version of the Radon-Nikodym Theorem will be treated in Section 2, Chapter 8.

Let $(\Omega, \Sigma, \mu)$ be a measure space. Consider the integral

$$A \mapsto \int_A f\,d\mu = \int f\,\mathbf{1}_A\,d\mu$$

as a set function on $\Sigma$. If $f \ge 0$, then, as the following proposition states, we have a measure on $\Sigma$.

5.1 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $f \in \mathcal{C}^{-1}_+(\Omega, \Sigma)$. Then, the set function $\nu(A) = \int_A f\,d\mu$ is a measure on $\Sigma$. □

(See Problem 5.1.)

5.2 Definition. According to Proposition 5.1, $\nu$ is the measure generated by the integral $\int f\,d\mu$; $\nu$ is also called the indefinite integral of $f$ with respect to $\mu$. The function $f$ is called a (Radon-Nikodym) density function of $\nu$ relative to $\mu$. □

5.3 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space.

(i) If $f$ and $g \in \mathcal{C}^{-1}_+(\Omega, \Sigma)$, and $\nu$ is the measure generated by the integral $\int f\,d\mu$, then

$$\int g\,d\nu = \int g f\,d\mu. \tag{5.3}$$

(ii) In the condition of (i), let $g \in \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$. Then $g \in L^1(\Omega, \Sigma, \nu; \overline{\mathbb{R}})$ if and only if $gf \in L^1(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$, and in this case (5.3) holds too.
Proof.

(i) As usual, we begin with $g \in \Psi_+(\Omega, \Sigma)$ as a nonnegative simple function $g = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$ to get (5.3):

$$\int g\,d\nu = \sum_{i=1}^n a_i\,\nu(A_i) = \sum_{i=1}^n a_i \int_{A_i} f\,d\mu = \int f \sum_{i=1}^n a_i \mathbf{1}_{A_i}\,d\mu = \int f g\,d\mu. \tag{5.3a}$$

For $g \in \mathcal{C}^{-1}_+$, there is $\{s_n\}\uparrow\ \subset \Psi_+$ such that $g = \sup\{s_n\}$. By (5.3a),

$$\int s_n\,d\nu = \int s_n f\,d\mu.$$

Since $\{s_n f\}\uparrow\ \subset \mathcal{C}^{-1}_+$, by the Monotone Convergence Theorem,

$$\int g\,d\nu = \sup \int s_n\,d\nu = \sup \int s_n f\,d\mu = \int g f\,d\mu.$$

(ii) Now let $g \in \mathcal{C}^{-1}$. Thus

$$\int g\,d\nu = \int g^+\,d\nu - \int g^-\,d\nu = \int (fg)^+\,d\mu - \int (fg)^-\,d\mu = \int f g\,d\mu.$$

□
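Formula (5.3) can be verified on a small example. The sketch below assumes $\mu = \lambda$ on $[0,1]$, density $f(x) = 2x$ and integrand $g(x) = x$; these choices, and the use of the standard layer-cake identity $\int g\,d\nu = \int_0^{\infty} \nu\{g > t\}\,dt$ for nonnegative measurable $g$ as an independent cross-check, are assumptions made here and are not taken from the text.

```latex
% Assumed data: mu = lambda on [0,1], f(x) = 2x, g(x) = x.
% The generated measure satisfies nu((t,1]) = \int_t^1 2x dx = 1 - t^2 for 0 <= t <= 1.
\begin{align*}
\text{right-hand side of (5.3):}\quad
\int g f\,d\lambda &= \int_0^1 x \cdot 2x\,dx = \frac{2}{3}, \\
\text{left-hand side via layer cake:}\quad
\int g\,d\nu &= \int_0^1 \nu\{x : x > t\}\,dt
              = \int_0^1 \bigl(1 - t^2\bigr)\,dt = 1 - \frac{1}{3} = \frac{2}{3}.
\end{align*}
```

The two computations agree, as (5.3) predicts.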
The following example motivates the Radon-Nikodym Theorem.

5.4 Example. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $\mu$ be $\sigma$-finite. Then, there exists a sequence $\{A_n\} \uparrow \Omega$ such that $\mu(A_n) < \infty$. Define the sequence $\{a_n\} \subset \mathbb{R}_+ \setminus \{0\}$ as

$$a_n = \min\left\{\frac{1}{\mu(A_n)\,2^n},\ \frac{1}{n}\right\}, \quad n = 1,2,\ldots.$$

Let

$$g = \sum_{n=1}^{\infty} a_n \mathbf{1}_{A_n}.$$

Then,

$$\int g\,d\mu = \sum_{n=1}^{\infty} a_n\,\mu(A_n) \le \sum_{n=1}^{\infty} 2^{-n} = 1.$$

Therefore, if $\mu$ is $\sigma$-finite, there always exists a positive element $g$ of $L^1(\Omega, \Sigma, \mu)$. Conversely, let $g > 0$ and $g \in L^1(\Omega, \Sigma, \mu)$. Then let

$$A_n = \left\{g \ge \tfrac{1}{n}\right\},$$

and $g n \ge \mathbf{1}_{A_n}$. Thus

$$\mu(A_n) = \int \mathbf{1}_{A_n}\,d\mu \le n \int g\,d\mu,$$

which implies that $\mu(A_n) < \infty$. Since $g > 0$, it follows that $A_n \uparrow \Omega$. □

We have shown that $\sigma$-finiteness of $\mu$ is equivalent to the existence of a positive integrable function $g$. In other words, there is a positive "Radon-Nikodym density" $g$ such that the measure $\nu$ generated by the integral is finite. Another noteworthy observation is that if

$$\nu(A) = \int_A g\,d\mu = 0,$$

then $g\,\mathbf{1}_A \in [0]_\mu$. Since $g > 0$, $A \in \mathcal{N}_\mu$, i.e., from $\nu(A) = 0$ it follows that $\mu(A) = 0$. Should $\mu(A) = 0$, then $g\,\mathbf{1}_A \in [0]_\mu$ and $\nu(A) = 0$. Thus, $\nu(A) = 0$ if and only if $\mu(A) = 0$. In other words, $\nu$ and $\mu$ possess the same null sets.

It is clear that, if $g$ is just nonnegative, $\nu(A) = 0$ does not necessarily imply that $\mu(A) = 0$. But from $\mu(A) = 0$, it follows anyway that $\nu(A) = 0$ (why?). If $\nu$ has a density relative to $\mu$, then a $\mu$-null set is also a $\nu$-null set. Is the converse of the statement true? (That is, would this relation between the measures guarantee the existence of a density?) The answer will be given in the Radon-Nikodym Theorem below.

5.5 Definition. Let $\mu$ and $\nu$ be two measures on a measurable space $(\Omega, \Sigma)$. The measure $\nu$ is called (absolutely) continuous (with respect to $\mu$) if every $\mu$-null set is also a $\nu$-null set. If $\nu$ is continuous relative to $\mu$, then we write $\nu \ll \mu$. Any Borel measure continuous with respect to the Lebesgue measure is just called continuous. □

The use of the word "continuity" is basically due to the following proposition.

5.6 Proposition. Let $\nu$ be a finite measure on $(\Omega, \Sigma)$ and let $\mu$ be another measure on $(\Omega, \Sigma)$. Then the following are equivalent:

(A) $\nu \ll \mu$.

(B) For all $\varepsilon > 0$, there is $\delta > 0$, such that for each $A \in \Sigma$ with $\mu(A) < \delta$, the inequality $\nu(A) < \varepsilon$ holds.

Proof.
(i) Suppose statement (B) is true. Choose an $\varepsilon$. Denote by $\Delta$ the set of all $A \in \Sigma$ for which $\mu(A) < \delta$. Then $\mathcal{N}_\mu \subset \Delta$ (where $\mathcal{N}_\mu$ denotes the set of all $\mu$-null sets). Then, for all $N \in \mathcal{N}_\mu$, $0 = \mu(N) < \delta$ and $\nu(N) < \varepsilon$. Since $\varepsilon$ can be made arbitrarily small, we conclude that $\nu(N) = 0$ and thus $\nu \ll \mu$.

(ii) Suppose now that statement (B) is not true. That means, for some $\varepsilon > 0$ and for any $\delta > 0$ there is a set $A(\delta) \in \Sigma$ such that $\mu(A(\delta)) < \delta$ but $\nu(A(\delta)) \ge \varepsilon$. We now define the sequence of $\delta$'s as $\delta_n = \frac{1}{2^n}$, $n = 1,2,\ldots$, and construct the corresponding sequence of $A$'s such that $A(\delta_n) = A_n$ with the above property, i.e., $\{A_n\}$ is a "$\mu$-monotone decreasing" sequence but "$\nu$-resistant." Let $A = \overline{\lim}\,A_n$. Then $A \subset \bigcup_{m=n}^{\infty} A_m$ and

$$\mu(A) \le \mu\left(\bigcup_{m=n}^{\infty} A_m\right) \le \sum_{m=n}^{\infty} \mu(A_m) < \frac{1}{2^{n-1}}, \quad n = 1,2,\ldots.$$

Therefore, $\mu(A) = 0$. However, by Problem 2.5, since $\nu$ is finite,

$$\nu(A) = \nu\left(\overline{\lim}\,A_n\right) \ge \overline{\lim}\,\nu(A_n) \ge \varepsilon > 0,$$

and thus $\nu$ is not $\mu$-continuous. Hence (A) is not true either. □

The most general version of the celebrated Radon-Nikodym Theorem was proved by the Pole Otto Nikodym in his paper, Sur une généralisation des intégrales de M. J. Radon of 1930. Another prominent Pole, Stanisław Saks, suggested the name of this theorem, perhaps meaning Nikodym's Theorem on Radon Integrals, although Radon himself proved a much more special case. The idea of Radon-Nikodym's result had its inception in an 1894 paper by Thomas Stieltjes, in which he introduced the new concept of a density function in connection with his famous "Stieltjes integral" (in its present version known as the Riemann-Stieltjes integral), initially applied to very restricted classes of functions. In 1909, Frédéric Riesz proved in his widely cited Representation Theorem that the most general continuous linear functionals on $\mathcal{C}([a,b])$ are represented by Stieltjes integrals (a more general version of which we will explore in Section 7, Chapter 8).

Riesz's result yielded many generalizations, of which the most productive was by Johann Radon in his 1913 paper, Theorie und Anwendungen der absolut additiven Mengenfunktionen. In this paper, Radon, combining the ideas of Lebesgue and Riesz, introduced an integral with respect to Borel measures on the Borel $\sigma$-algebra of $\mathbb{R}^n$ rather than the Borel-Lebesgue measure used by Lebesgue. Among other things, Radon showed the existence of a Radon-Nikodym density function for a measure that is generated by this integral and is absolutely continuous with respect to the Borel-Lebesgue measure, significantly generalizing the earlier theorem by Lebesgue about the existence of an almost everywhere differentiable density.

Right after the appearance of Radon's paper, Maurice Fréchet noticed that Radon's result can be generalized to arbitrary measures, rather than Borel measures on $\mathbb{R}^n$. This led Nikodym to his 1930 generalization of Radon's theorem in a form very close to the present version. Consequently, a significant gap in integral theory existed between 1913 and 1930. Soon thereafter, in 1933, Nikodym's generalization led to the birth of measure-theoretic probability theory (in Andrey Kolmogorov's famous monograph, Grundbegriffe der Wahrscheinlichkeitsrechnung), the concept of conditional expectation, and an introduction to the theory of stochastic processes. Still, many consider Radon the father of the modern theory of integration.

Otto Nikodym, who is at the heart of one of the most important results ever obtained in mathematics, was born on August 13, 1887, in eastern Poland, then belonging to the Russian empire. In 1919 he was among 16 mathematicians to found the Polish Mathematical Society. Shortly after World War II, Nikodym's family moved to Belgium and then to France, where Nikodym was invited by the Institute of H. Poincaré to work on the mathematical foundations of quantum mechanics. (He published his results in numerous papers, and his monograph, The Mathematical Apparatus for Quantum Theories, was published by Springer-Verlag in 1966.) In 1948 he accepted a position in the United States at Kenyon College, Gambier, Ohio, where he stayed until his retirement. He died in 1974.

We introduce some preliminaries on the Radon-Nikodym Theorem (further to be embellished in Chapter 8).

5.7 Notation. Let $\mathfrak{M} = \mathfrak{M}(\Omega, \Sigma)$ be the set of all measures on $(\Omega, \Sigma)$. For a fixed measure $\mu \in \mathfrak{M}$, denote $\mathfrak{M}_{\ll\mu} = \{\nu \in \mathfrak{M} : \nu \ll \mu\}$. (This set is not empty, since $\mu \in \mathfrak{M}_{\ll\mu}$.) Define on $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$ a mapping $\int_\mu$ such that for each $f \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$,

$$\int_\mu f = \int_{(\cdot)} f\,d\mu = \nu(\cdot).$$
□

By Problem 1.20, $I_\mu$ is valued in $\mathfrak{M}_{\mu\ll}$. Now the Radon-Nikodym Theorem states that if $\mu$ is $\sigma$-finite, then for each $\nu \in \mathfrak{M}_{\mu\ll}$ there exists a unique (up to the equivalence class modulo $\mu$) Radon-Nikodym density $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ of $\nu$ relative to $\mu$. This needs some clarification:

1) Given a function $f \in \mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, $I_\mu f$ defines a measure that is absolutely continuous with respect to $\mu$; as noticed above, this has already been shown. Consequently, $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+), \mathfrak{M}_{\mu\ll}, I_\mu]$ is an into mapping.

2) Recall (Definitions and Remarks 1.14 (iii)) that $\mu$-almost everywhere equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$ and thus on $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$, as a subset of $\mathcal{C}^{-1}(\Omega,\Sigma;\overline{\mathbb{R}})$. Consequently, $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu$ is a quotient set, "inherited" from (1.14). On the other hand, by Corollary 1.20, the mapping $I_\mu$ "agrees" with this equivalence relation $E$, i.e. it adopts $E$ as its equivalence kernel. Then, by Theorem 4.4, Chapter 1, there is a unique function, say $\tilde{I}_\mu$, such that
$$I_\mu = \tilde{I}_\mu \circ \pi_E,$$
where $\pi_E$ stands for the projection of $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)$ on its quotient $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu$ by $E$. (See Section 4, Chapter 1.) Therefore, $I_\mu$ literally turns into the injective mapping $\tilde{I}_\mu$ that now acts on the quotient set $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu$.

3) The major claim (existence) of the Radon-Nikodym Theorem is that the mapping $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu, \mathfrak{M}_{\mu\ll}, \tilde{I}_\mu]$ is surjective. In other words, for each measure $\nu \in \mathfrak{M}_{\mu\ll}$ (i.e., absolutely continuous with respect to $\mu$), there is an equivalence class $[f]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.

A compact version of the above arguments is as follows:

5.8 Theorem (Radon-Nikodym). Let $\mu \in \mathfrak{M}(\Omega,\Sigma)$ be a $\sigma$-finite measure. Then $[\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu, \mathfrak{M}_{\mu\ll}, \tilde{I}_\mu]$ is a bijective map.

As mentioned, the uniqueness of the Radon-Nikodym density class is due to Corollary 1.20. The rest of the proof of Theorem 5.8 (existence) will be rendered in Section 2, Chapter 8, for more general classes of signed measures. □

By the Radon-Nikodym Theorem, the map $\tilde{I}_\mu$ is therefore invertible and its inverse, denoted by the symbol $\frac{d}{d\mu}$, is also a map valued in $\mathbb{L}(\Omega,\Sigma,\mu;\overline{\mathbb{R}}_+)|_\mu$. Thus, for any $\nu \in \mathfrak{M}_{\mu\ll}$, there is a nonempty equivalence class $[f]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$ and, for a fixed $\nu \in \mathfrak{M}_{\mu\ll}$, we will write
$$[f]_\mu = \frac{d\nu}{d\mu},$$
and call it the Radon-Nikodym derivative of the measure $\nu$ relative to the measure $\mu$. It should be clear that, unlike a Radon-Nikodym density, the Radon-Nikodym derivative $\frac{d\nu}{d\mu}$ is the $\mu$-equivalence class of all Radon-Nikodym densities of the measure $\nu$ relative to $\mu$.

5.9 Proposition (Chain Rule). Let $f \in \frac{d\nu}{d\mu}$ and $g \in \frac{d\pi}{d\nu}$. Then
$$fg \in \frac{d\pi}{d\mu}, \quad\text{i.e.,}\quad \frac{d\pi}{d\mu} = \frac{d\pi}{d\nu}\cdot\frac{d\nu}{d\mu}.$$

Proof. For $A \in \Sigma$, $\pi(A) = \int g\,1_A\,d\nu$ and, by Proposition 5.3, $\pi(A) = \int f g\,1_A\,d\mu$, which implies that $fg$ is a density of $\pi$ relative to $\mu$, i.e., $fg \in \frac{d\pi}{d\mu}$. □
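The chain rule can be checked by hand on a finite measure space, where every measure is a sum of point masses and every integral is a finite sum. The following Python sketch does this with exact rational arithmetic; the four-point space and the particular values of `mu`, `f`, and `g` are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

# Hypothetical finite space Omega = {0, 1, 2, 3} with Sigma = all subsets.
omega = [0, 1, 2, 3]
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2), 3: Fraction(3)}
f = {0: Fraction(2), 1: Fraction(0), 2: Fraction(4), 3: Fraction(1, 3)}  # f in dnu/dmu
g = {0: Fraction(1, 2), 1: Fraction(5), 2: Fraction(3), 3: Fraction(6)}  # g in dpi/dnu


def integral(h, m, A):
    """Integral of h * 1_A with respect to the point-mass measure m."""
    return sum((h[w] * m[w] for w in A), Fraction(0))


def subsets(xs):
    """All subsets of the list xs, each returned as a list."""
    out = [[]]
    for x in xs:
        out += [s + [x] for s in out]
    return out


# nu = I_mu f and pi = I_nu g, stored through their point masses.
nu = {w: f[w] * mu[w] for w in omega}
pi = {w: g[w] * nu[w] for w in omega}
fg = {w: f[w] * g[w] for w in omega}

# Chain rule: pi(A) equals the integral of f*g over A with respect to mu.
chain_rule_holds = all(
    sum((pi[w] for w in A), Fraction(0)) == integral(fg, mu, A)
    for A in subsets(omega)
)
```

Here `chain_rule_holds` compares $\pi(A)$ with $\int fg\,1_A\,d\mu$ on all sixteen subsets of the space.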
5.10 Examples.

(i) Let $\Omega$ be an uncountable set; let $\Sigma = \{A \in \mathcal{P}(\Omega)$: either $A$ or $A^c$ is countable$\}$; and let $\nu(A) = 0$ if $A$ is countable and $\nu(A) = \infty$ if $A^c$ is countable. Let $\mu$ be the counting measure on $\Sigma$, i.e., let $\mu(A) = |A|$ if $A$ is finite and $\mu(A) = \infty$ otherwise. Since the only $\mu$-null set is $\emptyset$, it immediately follows that $\nu \ll \mu$. However, we show that $\nu$ cannot have a density relative to $\mu$. Assume the opposite. Let $g \in \mathcal{C}^{-1}_+$ be such that $g \in \frac{d\nu}{d\mu}$. Then, for all $\omega \in \Omega$, we have that
$$0 = \nu(\{\omega\}) = \int_{\{\omega\}} g(x)\,\mu(dx) = \int 1_{\{\omega\}}(x)\,g(x)\,\mu(dx) = \int 1_{\{\omega\}}(x)\,g(\omega)\,\mu(dx) = g(\omega)\int 1_{\{\omega\}}(x)\,\mu(dx) = g(\omega)\,\mu(\{\omega\}) = g(\omega).$$
This implies that $g = 0$, which, in turn, yields that $\nu = 0$. This is a contradiction. (For a further discussion see Problem 5.3.)

(ii) Let $\varepsilon_a$ be a point mass on $(\mathbb{R}^n,\mathcal{B})$. We are interested in whether or not $\varepsilon_a \ll \lambda$. Let $B = \{a\} \in \mathcal{B}$. Then, $\lambda(\{a\}) = 0$ and $\varepsilon_a(\{a\}) = 1$. Therefore, $\varepsilon_a$ is not absolutely continuous relative to $\lambda$ and, by the Radon-Nikodym Theorem, $\varepsilon_a$ does not have any density relative to $\lambda$.

(iii) Let $\lambda$ be the Borel-Lebesgue measure on $(\mathbb{R}^n,\mathcal{B})$ and let $\nu$ be a probability measure on the same measurable space such that $\nu \ll \lambda$. Since $\lambda$ is $\sigma$-finite, by the Radon-Nikodym Theorem, there exists a density $f \in \frac{d\nu}{d\lambda}$, called the probability density of $\nu$, such that $\nu = \int f\,d\lambda$. Should $\nu$ be a probability distribution, say $\mathbb{P}X^*$, induced by a random variable $X \in \mathcal{C}^{-1}(\Omega,\Sigma)$, then $f$ is referred to as the probability density function (p.d.f. or pdf) of $X$. Let $X\colon (\Omega,\Sigma) \to (\mathbb{R},\mathcal{B})$ be a random variable with $\mathbb{P}X^* = \int f(a,\sigma^2)\,d\lambda$, where
$$f(a,\sigma^2;x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{(x-a)^2}{2\sigma^2}\Big\}, \quad x \in \mathbb{R}.$$
Then, $X$ is called a normal random variable with parameters $(a,\sigma^2)$; the corresponding probability density function of $X$, $f(a,\sigma^2)$, is called the normal density. □
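A quick numerical check confirms that the normal density of Example 5.10 (iii) is a probability density and that its parameters are the mean and the variance. The sketch below uses a plain trapezoid rule on a wide interval; the parameter values `a = 1.5`, `sigma = 0.7`, the cutoff at ten standard deviations, and the step count are arbitrary assumptions made for illustration.

```python
import math


def normal_pdf(x, a, sigma):
    """Normal density f(a, sigma^2; x) as written in Example 5.10 (iii)."""
    return math.exp(-(x - a) ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)


def trapezoid(func, lo, hi, steps):
    """Composite trapezoid rule on [lo, hi] with `steps` equal subintervals."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h


a, sigma = 1.5, 0.7                        # hypothetical parameters
lo, hi = a - 10 * sigma, a + 10 * sigma    # the tails beyond are negligible

total_mass = trapezoid(lambda x: normal_pdf(x, a, sigma), lo, hi, 20000)
mean = trapezoid(lambda x: x * normal_pdf(x, a, sigma), lo, hi, 20000)
variance = trapezoid(lambda x: (x - a) ** 2 * normal_pdf(x, a, sigma), lo, hi, 20000)
```

Up to quadrature error, `total_mass` approximates $\nu(\mathbb{R}) = 1$, while `mean` and `variance` approximate $a$ and $\sigma^2$.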
5.11 Definition. Let $\mu_1$ and $\mu_2$ be two measures on a measurable space $(\Omega,\Sigma)$. $\mu_1$ is said to be singular relative to $\mu_2$ (or orthogonal to $\mu_2$) if there is an $A \in \Sigma$ such that
$$\mu_1(A) = \mu_2(A^c) = 0.$$
In this case we write $\mu_1 \perp \mu_2$. It should be clear that the orthogonality relation is symmetric. □

(i) Let $\varepsilon_a$ be a point mass on $(\mathbb{R}^n,\mathcal{B})$ with $a \in \mathbb{R}^n$, and let $A = \{a\}^c$. Then, $\lambda(\{a\}) = \varepsilon_a(A) = 0$. Therefore, $\varepsilon_a \perp \lambda$. In Example 5.10 (ii), it was shown that $\varepsilon_a$ is not absolutely continuous relative to $\lambda$. Now we have established another relation between $\varepsilon_a$ and $\lambda$.

(ii) Let $\mu = c\,\varepsilon_a + b\lambda$, where $c, b \in \mathbb{R}_+\setminus\{0\}$. It is clear that $c\,\varepsilon_a \perp \lambda$ and $b\lambda \ll \lambda$, which implies that the measure $\mu$ is a sum of an absolutely continuous and a singular component relative to $\lambda$. In the general case, if $\mu$ and $\nu$ are measures on $(\Omega,\Sigma)$ such that $\nu$ is $\sigma$-finite, there is a unique decomposition $\nu = \nu_a + \nu_s$, where $\nu_a$ is an absolutely continuous measure relative to $\mu$ and $\nu_s$ is a singular measure relative to $\mu$. This fact is due to the well-known Lebesgue decomposition theorem. □
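On a finite set the Lebesgue decomposition can be written down explicitly: the absolutely continuous part of $\nu$ lives where $\mu$ charges points, and the singular part lives on the $\mu$-null points. The Python sketch below illustrates this under the assumption of a four-point space with made-up point masses; it is a toy illustration, not the general construction.

```python
from fractions import Fraction

# Hypothetical point masses on Omega = {a, b, c, d}.
omega = ["a", "b", "c", "d"]
mu = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(2), "d": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(0), "d": Fraction(5)}

# A = set of points charged by mu; its complement is a mu-null set.
A = {w for w in omega if mu[w] > 0}

# nu_a is nu restricted to A, nu_s is nu restricted to the complement of A.
nu_a = {w: (nu[w] if w in A else Fraction(0)) for w in omega}
nu_s = {w: (nu[w] if w not in A else Fraction(0)) for w in omega}

# A density of nu_a relative to mu (its value on the mu-null set is irrelevant).
density = {w: (nu_a[w] / mu[w] if mu[w] > 0 else Fraction(0)) for w in omega}

nu_s_on_A = sum((nu_s[w] for w in A), Fraction(0))
mu_off_A = sum((mu[w] for w in omega if w not in A), Fraction(0))
```

The two quantities `nu_s_on_A` and `mu_off_A` both vanish, which is exactly the singularity condition $\nu_s \perp \mu$ witnessed by the set `A`.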
PROBLEMS
5.1 Prove Proposition 5.1.

5.2 Let $\mu$ and $\nu$ be measures on $(\Omega,\Sigma)$ such that $\nu(A) \le \mu(A)$, $\forall A \in \Sigma$, and let $\mu$ be $\sigma$-finite. Show that there exists $f \in \frac{d\nu}{d\mu}$ such that $0 \le f \le 1$. [Hint: If $f$ is a Radon-Nikodym density, show that either $A = \{f > 1\} = \emptyset$ (there is a sequence $A_n \uparrow A$ such that $\nu(A_n) > \mu(A_n)$, if $\mu(A_n) > 0$) or $A \in \mathcal{N}_\mu$ (if $\mu(A_n) = 0$). In the latter case set $g = f\,1_{A^c}$.]

5.3 Let $\mu$ and $\nu$ be measures on $(\Omega,\Sigma)$ such that $\nu \ll \mu$, and let $\nu$ be finite and $g \in \frac{d\nu}{d\mu}$. Denote $A = \{\omega \in \Omega : g(\omega) \ne 0\}$. Show that the restriction of $\mu$ on $\Sigma \cap A$ is $\sigma$-finite. Give an example where $\mu(A)$ is not finite.

5.4 Let $\pi$ be a Poisson measure on $(\mathbb{R},\mathcal{B})$. Investigate whether $\pi$ is absolutely continuous or singular relative to $\lambda$.

5.5 Let $\rho_1$, $\rho_2$, and $\mu$ be measures on $(\Omega,\Sigma)$ such that $\rho_1 \perp \mu$ and $\rho_2 \perp \mu$. Show that $\rho_1 + \rho_2 \perp \mu$.

5.6 Let $\rho_1 \ll \mu$ and $\rho_2 \perp \mu$. Show that $\rho_1 \perp \rho_2$.

5.7 Prove that $\rho \ll \mu$ and $\rho \perp \mu$ imply that $\rho = 0$.

5.8 Let $\mu$ and $\nu$ be $\sigma$-finite measures on $(\Omega,\Sigma)$. Show that $\mu$ and $\nu$ possess densities $f$ and $g$, respectively, relative to $\rho = \mu + \nu$.

5.9 Is orthogonality transitive?
NEW TERMS:

density
measure generated by an integral
indefinite integral
Radon-Nikodym density
absolutely continuous measure
continuous measure
$\mathfrak{M}_{\mu\ll}$-set
Radon-Nikodym Theorem
Radon-Nikodym derivative
chain rule
probability density
probability density function
normal random variable
normal density
singularity of a measure
orthogonal measures
6. PRODUCT MEASURES OF FINITELY MANY MEASURABLE SPACES AND FUBINI'S THEOREM
The present section will extend the results on integration to Cartesian products. It will discuss the formation of product $\sigma$-algebras (which bears some resemblance to the product topology) and product measures on them. This leads to the main result of this section, the celebrated Fubini's Theorem, which allows one to evaluate multiple integrals as iterated integrals and is the measure-theoretic analog of the corresponding result for multiple Riemann integrals. Many textbooks in analysis and on the history of mathematics adopt "Fubini's Theorem" as a generic name for a class of theorems establishing the identity of multiple integrals with iterated integrals. In the mainstream of the evolution of calculus, when integrating a function $f$ on the rectangle $R = [a,b]\times[c,d]$, the question was raised: under what condition does the existence of the double integral $\iint_R f\,d(x,y)$ guarantee the existence of either of the iterated integrals, $\int_a^b\{\int_c^d f(x,y)\,dy\}\,dx$ and $\int_c^d\{\int_a^b f(x,y)\,dx\}\,dy$, and will they all be equal? Fubini's Theorem, in one of its earlier forms, was proved by Augustin-Louis Cauchy in the early nineteenth century and applied to continuous functions. In 1904, Henri Lebesgue extended this result to bounded measurable functions. In 1906, Beppo Levi conjectured that $f$ need not be bounded, but just integrable. The Italian Guido Fubini (1879-1943) proved this statement in 1907. Namely, he proved that if the function $f$ is integrable on $R$, then the functions $x \mapsto f(x,y)$ and $y \mapsto f(x,y)$ are integrable for almost all $y$ and $x$, respectively. In addition, the functions $y \mapsto \int_a^b f(x,y)\,dx$ and $x \mapsto \int_c^d f(x,y)\,dy$ are integrable and
$$\iint_R f\,d(x,y) = \int_a^b\Big\{\int_c^d f(x,y)\,dy\Big\}\,dx = \int_c^d\Big\{\int_a^b f(x,y)\,dx\Big\}\,dy.$$
Fubini, however, imposed some unnecessary condition on the integrand. This was corrected and refined independently by the Italian Leonida Tonelli (1885-1946) in 1909, and by the Briton Ernest W. Hobson (1856-1933) and the Belgian Charles J.G.N. (Baron) de la Vallée-Poussin (1866-1962) in 1910, who rendered proofs entirely different from that of Fubini. The notion of multiple integrals goes back to as early as the middle of the 18th century, first in the form of an indefinite integral. Later on, by 1770, Leonhard Euler formalized the double integral on a bounded domain and applied the above formula for iterated integrals, justifying it in terms of Riemann sums. Functions to be integrated were assumed to be continuous and the area of integration was not too complicated. This
approach began to run into serious difficulties as soon as more general cases were considered. Not until Lebesgue published his famous thesis in 1902 did it become possible to tackle other classes of functions, all of which led to Fubini's Theorem as of 1910. More general versions of Fubini's Theorem (which we are going to explore in this section), applied to abstract measures and integrals, became possible after the Austrian Johann Radon's extension of the Lebesgue integral in 1913 (mentioned in Section 5).

6.1 Definition. Let $(\Omega_i,\Sigma_i)$ be a measurable space for $i = 1,\ldots,n$. Given arbitrary measurable sets $A_i \in \Sigma_i$, we call $A = \prod_{i=1}^n A_i$ a measurable rectangle in $\Omega = \prod_{i=1}^n \Omega_i$. The $\sigma$-algebra generated by all measurable rectangles is called the product $\sigma$-algebra and it is denoted by
$$\bigotimes_{i=1}^n \Sigma_i = \Sigma_1 \otimes \Sigma_2 \otimes \cdots \otimes \Sigma_n.$$
□
A stronger definition of the product $\sigma$-algebra will follow. Let $\pi_i\colon \Omega \to \Omega_i$ be the projection map (or projection operator), $i = 1,\ldots,n$ (see Section 5, Chapter 1). Recall that $\pi_i^*(A_i)$ is a cylinder (with base $A_i$), which can also be represented as the Cartesian product
$$\pi_i^*(A_i) = \Omega_1 \times \cdots \times \Omega_{i-1} \times A_i \times \Omega_{i+1} \times \cdots \times \Omega_n.$$
In terms of projection operators, a rectangle can be expressed as the intersection of $n$ cylinders with bases $A_1,\ldots,A_n$, i.e.
$$A = \bigcap_{k=1}^n \pi_k^*(A_k).$$
Now recall that the inverse projection $\pi_i^*(\Sigma_i)$ is a $\sigma$-algebra on $\Omega$, namely the $\sigma$-algebra generated by the map $\pi_i$. In our case this is the $\sigma$-algebra generated by all measurable cylinders with bases $A_i \in \Sigma_i$. The union of all these $\sigma$-algebras for $i = 1,\ldots,n$ need not be a $\sigma$-algebra, and therefore the smallest $\sigma$-algebra generated by this union is to be considered.

6.2 Definition. The $\sigma$-algebra $\Sigma\big(\bigcup_{k=1}^n \pi_k^*(\Sigma_k)\big)$ induced by the projection operators $\pi_1,\ldots,\pi_n$ is called the product $\sigma$-algebra and it is denoted by $\bigotimes_{i=1}^n \Sigma_i$ or sometimes shortly by $\Sigma^{\otimes}$. □

The lemma below reveals the nature of $\Sigma^{\otimes} = \bigotimes_{i=1}^n \Sigma_i$. Consider one more notation. Let $\mathcal{G}_i$ be an arbitrary subset of $\Sigma_i$. Let us denote by
$$\mathcal{G} = \bigodot_{i=1}^n \mathcal{G}_i$$
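the set of all measurable rectangles $G_1 \times \cdots \times G_n$ where the $G_i$'s are picked from $\mathcal{G}_i$.

6.3 Lemma. Let $\mathcal{G}_i$ be a generator of $\Sigma_i$ containing a sequence $\{G_{ik} : k = 1,2,\ldots\}$ of sets, $i = 1,\ldots,n$, monotonically increasing to $\Omega_i$. Then the product $\sigma$-algebra $\Sigma^{\otimes}$ coincides with the $\sigma$-algebra $\Sigma = \Sigma(\mathcal{G})$ generated by $\mathcal{G}$.

Proof.

(i) Because $\mathcal{G} \subset \Sigma^{\otimes}$, it follows that $\Sigma \subset \Sigma^{\otimes}$. Indeed, every $A_i \in \mathcal{G}_i$ is also an element of $\Sigma_i$ and therefore $\pi_i^*(A_i) \in \pi_i^*(\Sigma_i) \subset \Sigma^{\otimes}$, which implies that
$$A_1 \times \cdots \times A_n = \bigcap_{i=1}^n \pi_i^*(A_i) \in \Sigma^{\otimes}.$$

(ii) Now we show the inverse inclusion $\Sigma^{\otimes} \subset \Sigma$. We prove that each $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable. By our assumption, each generator $\mathcal{G}_i$ of $\Sigma_i$ contains a sequence $\{G_{ik}\} \uparrow \Omega_i$. Consider
$$G_k = G_{1k} \times \cdots \times G_{i-1,k} \times A_i \times G_{i+1,k} \times \cdots \times G_{nk} \in \mathcal{G},$$
where $A_i \in \mathcal{G}_i$. Observe that
$$G_k \uparrow \Omega_1 \times \cdots \times \Omega_{i-1} \times A_i \times \Omega_{i+1} \times \cdots \times \Omega_n = \pi_i^*(A_i).$$
Therefore,
$$\sup_k\{G_k\} = \bigcup_{k=1}^\infty G_k = \pi_i^*(A_i) \in \Sigma$$
(since we took the union of elements of $\Sigma$). Hence, we proved that the inverse image of an arbitrary element of $\mathcal{G}_i$ (which is a generator of $\Sigma_i$) under $\pi_i$ belongs to $\Sigma$. According to Proposition 3.4, Chapter 4, we claim that the same inclusion holds for an arbitrary element of $\Sigma_i$, or that $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable. Since $\pi_i$ is $\Sigma$-$\Sigma_i$-measurable for all $i = 1,\ldots,n$, it follows that $\Sigma$ contains all cylinders, i.e. a generator of $\Sigma^{\otimes}$. Thus $\Sigma$ contains $\Sigma^{\otimes}$.

Observe that for $\mathcal{G}_i = \Sigma_i$, we thereby reconciled two definitions of product $\sigma$-algebras: Definitions 6.1 and 6.2. □

6.4 Remark. Now we see that, in light of the above lemma, $\Sigma^{\otimes} = \bigotimes_{i=1}^n \Sigma_i$ is generated by a more "economical" generator than the one given in Definition 6.1 (all rectangles with sides from the $\Sigma_i$'s). In some cases, when we fail to indicate this generator, we do consider $\Sigma^{\otimes}$ as the $\sigma$-algebra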
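generated by all rectangles, as follows from Lemma 6.3. □

6.5 Examples. To tell sets and set collections in $\mathbb{R}$ from those in $\mathbb{R}^n$ we will attach to the latter the superscript $n$.

(i) Let $\Omega_i = \mathbb{R}$ and $\Sigma_i = \mathcal{B}_i = \mathcal{B}$, $i = 1,\ldots,n$. Then $\Omega = \mathbb{R}^n$ and $\Sigma^{\otimes} = \bigotimes_{i=1}^n \mathcal{B}_i$. We also know that there is another $\sigma$-algebra in $\mathbb{R}^n$, i.e. the Borel $\sigma$-algebra $\mathcal{B}^n = \mathcal{B}(\mathbb{R}^n)$. What is the relation between $\mathcal{B}^n$ and $\bigotimes_{i=1}^n \mathcal{B}_i$? Recall that $\mathcal{B}^n$ was generated by the semi-ring of $n$-dimensional semi-open intervals (which we for convenience denote by $\mathcal{I}^n$). Observe that
$$\mathcal{I}^n = \mathcal{I} \odot \cdots \odot \mathcal{I} \quad (n \text{ factors}),$$
and that each $\mathcal{I}$ contains a sequence monotonically increasing to $\mathbb{R}$. Thus, by Lemma 6.3, $\mathcal{B}^n$ and $\bigotimes_{i=1}^n \mathcal{B}_i$ must coincide.

(ii) Recall that the Borel-Lebesgue measure $\lambda^n$ on $\mathcal{B}^n$ was extended from the Lebesgue elementary content $\lambda_0^n$, defined on $\mathcal{I}^n$ as
$$\lambda_0^n\big((a_1,b_1] \times \cdots \times (a_n,b_n]\big) = \prod_{i=1}^n (b_i - a_i).$$
We know that $\lambda^n$ is the unique extension of that content to $\bigotimes_{i=1}^n \mathcal{B}_i$. We can look at a more general problem. Let us now consider an $n$-tuple of measure spaces $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,\ldots,n$. We wonder if there exists a unique measure $\mu$ on the measurable space $\big(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i\big)$ such that for each rectangle $A = A_1 \times \cdots \times A_n$,
$$\mu(A) = \mu_1(A_1)\cdots\mu_n(A_n).$$
□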
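6.6 Definition. Let $B$ be an arbitrary subset of $\Omega = \prod_{i=1}^n \Omega_i$. We define for a point $a_i \in \Omega_i$ the $a_i$-section of $B$ as
$$B_{a_i} = \{(\omega_1,\ldots,\omega_{i-1},\omega_{i+1},\ldots,\omega_n) : (\omega_1,\ldots,\omega_{i-1},a_i,\omega_{i+1},\ldots,\omega_n) \in B\}.$$
If $a_i$ is such that $a_i \notin \pi_i(B)$, then $(\omega_1,\ldots,\omega_{i-1},a_i,\omega_{i+1},\ldots,\omega_n) \notin B$ and $B_{a_i} = \emptyset$. (See Figure 6.1.) □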
Figure 6.1

Here $\pi_{ij}$ was defined so that $\pi_{ij}\colon \Omega \to \Omega_i \times \Omega_j$.
6.7 Lemma. Let $A$ be an arbitrary element of $\Sigma_1 \otimes \Sigma_2$ and let $a_i \in \Omega_i$, $i = 1,2$. Then the corresponding sections $A_{a_1}$ and $A_{a_2}$ are measurable in the sense that $A_{a_1} \in \Sigma_2$ and $A_{a_2} \in \Sigma_1$.

Proof. Denote $\Sigma' = \{A \in \Sigma_1 \otimes \Sigma_2 : A_{a_1} \in \Sigma_2\}$. We show that:

(a) $\Sigma'$ is a $\sigma$-algebra in $\Omega_1 \times \Omega_2$.

(b) any rectangle is an element of $\Sigma'$.

This would imply that $\Sigma'$ contains $\Sigma_1 \otimes \Sigma_2$. On the other hand, by the above definition, $\Sigma_1 \otimes \Sigma_2$ contains $\Sigma'$; therefore, $\Sigma'$ would coincide with $\Sigma_1 \otimes \Sigma_2$. By Problem 6.2, we have that the section is commutative with all set operations. Finally observe that
$$(A_1 \times A_2)_{a_1} = \begin{cases} A_2, & a_1 \in A_1 \\ \emptyset, & a_1 \notin A_1, \end{cases} \tag{6.7}$$
i.e. any $a_1$-section of a measurable rectangle belongs to $\Sigma_2$, which proves assertion (b) above. Assertion (a) becomes an easy exercise for the reader (Problem 6.3). □

6.8 Example. Recall that the Lebesgue measure $\lambda^*$ is complete on $\mathcal{L}^*$. Consider $\lambda^*$ on $\mathcal{L}^*(\mathbb{R})$. We show that $\lambda^* \otimes \lambda^*$ is not complete. Let $Q$ be any subset of $\mathbb{R}$ which does not belong to $\mathcal{L}^*(\mathbb{R})$. Then, $\{0\} \times Q \notin \mathcal{L}^* \otimes \mathcal{L}^*$, or else, by Problem 6.4, $Q$ would be an element of $\mathcal{L}^*$. On the other hand, $\{0\} \times Q$ is a (proper) subset of $\{0\} \times \mathbb{R}$. The latter, by Problem 3.1, Chapter 5, is clearly a measurable Borel null set. Hence, the Lebesgue space $(\mathbb{R}^n, \mathcal{L}^*(\mathbb{R}^n), \lambda^*_n)$, as a complete measure space, does not coincide with the product measure space $\big(\mathbb{R}^n, \bigotimes_{i=1}^n \mathcal{L}^*_i(\mathbb{R}), \bigotimes_{i=1}^n \lambda^*_i\big)$, in contrast with its Borel-Lebesgue counterpart. □

If $\mu_i$ is a measure on $\Sigma_i$, $i = 1,2$, then according to Lemma 6.7, for an arbitrary set $A \in \Sigma_1 \otimes \Sigma_2$, $A_{a_1} \in \Sigma_2$ and $A_{a_2} \in \Sigma_1$, and thus $\mu_1(A_{a_2})$ and $\mu_2(A_{a_1})$ are defined terms. For a fixed $A$, $\mu_1(A_{a_2})$ and $\mu_2(A_{a_1})$ are functions of $a_2$ and $a_1$, respectively. The lemma below states that under some restrictions they are even measurable.

6.9 Lemma. Let $\mu_1$ and $\mu_2$ be $\sigma$-finite measures on $\Sigma_1$ and $\Sigma_2$, respectively. Then for a fixed set $A \in \Sigma_1 \otimes \Sigma_2$ the function
$$a_1 \mapsto f_A(a_1) = \mu_2(A_{a_1}) \qquad \big(a_2 \mapsto \mu_1(A_{a_2})\big)$$
is $\Sigma_1$-$\overline{\mathcal{B}}_+$-measurable ($\Sigma_2$-$\overline{\mathcal{B}}_+$-measurable).
( i)
We prove this proposition under the assumption that p 2 is
finite. Let E = { A E E 1 ® E 2 : / A is E 1 -<:B +-measurable } . We show that E = E1 ® E2 • a) For n = n1 X n2 and f n (al) = J.l2 ( f22 ) = const ( a real positive number ) , we have that / n is E 1 -<:B +-measurable and hence f2 E E. Observe that the finiteness of J.L 2 is essential, since otherwise, we can only arrive at the weaker result that f A is E 1 -<:B +-measurable. b) Let A E E. Then
362
CHA PTER 6 . ELEMENTS O F INTEG RATION
and
f A c (a1 ) = J.l2 (( Ac ) a1 ) = J.L2 (( A a1 ) c ) = J.L2 ( n2 )
- J.L2 ( Aa 1 ) = f n ( a 1 ) - f A ( a1 )
is measurable as the difference of two measurable functions. This implies that A c E E. c ) Let { An } be a sequence of disjoint sets of E. Then, oo
oo
= n'L/ 2 (( An ) al ) = n'L/ A n ( al ) = sup{
n
i">;/ A / at )}
00E An E E
is clearly measurable. This implies that and thus E is a n =1 Dynkin system. d) Now let A = A 1 x A 2 be a rectangle. Then by equation (6. 7),
fA ( a1 ) = 0, if a1 � A 1 ; and therefore,
fA ( a1 ) = P.2 ( A2 ) lA 1 ( a1 ) is E 1-� +-measurable as the product of a constant function J.L2 ( A 2 ) and an indicator function of a measurable set A 1 E E 1 . Thus, E contains the set y0 of all measurable rectangles which is n -stable. Therefore, E 1 ® E 2 is a Dynkin system generated by an n -stable generator, which implies E 1 ® E 2 C E (as E is also a Dynkin system containing y0 ). However, by the definition, E C E1 ® E 2 . Therefore, E is a u-algebra and
( ii) Now we assume that p. 2 is u-finite. Then there exists a sequence such that
{n� } c E2 n� i n2 and J.l2 ( n�)
is finite and we can apply the above result to the trace u-algebras E 1 n n�, and state that f � is
6.
where
Product Measures and Fubini 's Theorem
363
f � ( a1 ) = J.L 2 ( Aa 1 n Q�) and A E E 1 ® E2 .
Consequently, by continuity from below, the function
D
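6.10 Theorem. Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be two measure spaces and let $\mu_i$ be $\sigma$-finite measures. Then there exists a measure $\mu$ on $\Sigma_1 \otimes \Sigma_2$ such that, for each $A \in \Sigma_1 \otimes \Sigma_2$, $\mu(A)$ is determined by the formula
$$\mu(A) = \int \mu_2(A_{a_1})\,\mu_1(da_1) = \int \mu_1(A_{a_2})\,\mu_2(da_2). \tag{6.10}$$
Specifically, for $A = A_1 \times A_2$,
$$\mu(A) = \mu_1(A_1)\,\mu_2(A_2). \tag{6.10a}$$

Proof. Let $f_A(a_1) = \mu_2(A_{a_1})$. Then by Lemma 6.9, $f_A$ is a nonnegative extended-valued $\Sigma_1$-$\overline{\mathcal{B}}_+$-measurable function. Therefore, the integral
$$\int f_A\,d\mu_1 \tag{6.10b}$$
is defined for any $\Sigma_1 \otimes \Sigma_2$-measurable set $A$ and it is a nonnegative set function, which we denote by $\mu(A)$. We will prove that $\mu$ is a measure on $\Sigma_1 \otimes \Sigma_2$. Clearly $\mu(\emptyset) = 0$. In Lemma 6.9 we showed that
$$f_{\sum_{n=1}^\infty A_n} = \sum_{n=1}^\infty f_{A_n}$$
for $\{A_n\}$ a sequence of measurable disjoint sets of $\Sigma_1 \otimes \Sigma_2$. Applying this equation to (6.10b), we have, by Corollary 2.2,
$$\mu\Big(\sum_{n=1}^\infty A_n\Big) = \int \sum_{n=1}^\infty f_{A_n}\,d\mu_1 = \sum_{n=1}^\infty \int f_{A_n}\,d\mu_1 = \sum_{n=1}^\infty \mu(A_n).$$
In particular, for $A = A_1 \times A_2$,
$$\mu(A) = \int \mu_2(A_2)\,1_{A_1}\,d\mu_1 = \mu_1(A_1)\,\mu_2(A_2).$$
□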
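Formula (6.10) is easy to trace on finite spaces, where sections are computed by filtering pairs and integrals reduce to weighted sums. The following Python sketch evaluates both iterated expressions in (6.10) for one non-rectangular set and checks (6.10a) on a rectangle. The spaces `omega1`, `omega2` and the weights `mu1`, `mu2` are made-up illustrative data.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure spaces given by point masses.
omega1 = [0, 1, 2]
omega2 = ["x", "y"]
mu1 = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(1)}
mu2 = {"x": Fraction(3), "y": Fraction(1, 4)}


def section_a1(A, a1):
    """The a1-section of A: all w2 with (a1, w2) in A."""
    return {w2 for (w1, w2) in A if w1 == a1}


def section_a2(A, a2):
    """The a2-section of A: all w1 with (w1, a2) in A."""
    return {w1 for (w1, w2) in A if w2 == a2}


def measure(m, B):
    """Measure of the finite set B under the point-mass measure m."""
    return sum((m[w] for w in B), Fraction(0))


def product_measure_first(A):
    """First expression in (6.10): integrate mu2 of the a1-sections against mu1."""
    return sum((measure(mu2, section_a1(A, a1)) * mu1[a1] for a1 in omega1), Fraction(0))


def product_measure_second(A):
    """Second expression in (6.10): integrate mu1 of the a2-sections against mu2."""
    return sum((measure(mu1, section_a2(A, a2)) * mu2[a2] for a2 in omega2), Fraction(0))


A = {(0, "x"), (1, "y"), (2, "x"), (2, "y")}   # not a rectangle
rect = set(product([0, 2], ["y"]))             # the rectangle {0, 2} x {y}
```

Both `product_measure_first(A)` and `product_measure_second(A)` return the same value, and on `rect` they agree with $\mu_1(\{0,2\})\,\mu_2(\{y\})$.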
The measure $\mu$ on $\Sigma_1 \otimes \Sigma_2$ is called a product measure and it is denoted by $\mu_1 \otimes \mu_2$. Theorem 6.10 is readily extendible to the product measure $\bigotimes_{i=1}^n \mu_i$ on finitely many measure spaces.

The uniqueness of the product measure is subject to the following.

6.11 Theorem. Let $(\Omega_i,\Sigma_i,\mu_i)$ be measure spaces such that $\mathcal{G}_i$ are $\cap$-stable generators of $\Sigma_i$, $i = 1,\ldots,n$. If the $\mu_i$'s are $\sigma$-finite, then the product measure $\bigotimes_{i=1}^n \mu_i$ on $\bigotimes_{i=1}^n \Sigma_i$ is unique.

Proof. Denote $\mathcal{G} = \bigodot_{i=1}^n \mathcal{G}_i$, which, according to Lemma 6.3, is a generator of $\bigotimes_{i=1}^n \Sigma_i$. It is easily seen that $\mathcal{G}$ is $\cap$-stable. Indeed,
$$(G_1 \times \cdots \times G_n) \cap (H_1 \times \cdots \times H_n) = \Big(\bigcap_{k=1}^n \pi_k^*(G_k)\Big) \cap \Big(\bigcap_{k=1}^n \pi_k^*(H_k)\Big) = \bigcap_{k=1}^n \pi_k^*(G_k \cap H_k) = (G_1 \cap H_1) \times \cdots \times (G_n \cap H_n) \in \mathcal{G},$$
since each $\mathcal{G}_i$ is $\cap$-stable. Since $\mu_i$ is $\sigma$-finite for each $i$, there are sequences $G_{ik} \uparrow \Omega_i$, $i = 1,\ldots,n$, $k = 1,2,\ldots$, such that $\mu_i(G_{ik}) < \infty$. In Lemma 6.3 we have shown that the rectangle
$$G_k = G_{1k} \times \cdots \times G_{nk} \in \mathcal{G}$$
and, in addition, that
$$G_k \uparrow \Omega.$$
Now suppose there is a product measure $\bigotimes_{i=1}^n \mu_i$ with the property
$$\bigotimes_{i=1}^n \mu_i\,(R_1 \times \cdots \times R_n) = \mu_1(R_1)\cdots\mu_n(R_n)$$
for every measurable rectangle from $\bigotimes_{i=1}^n \Sigma_i$. Specifically, the property holds for every rectangle selected from the generator $\mathcal{G}$. If there is another measure on $\bigotimes_{i=1}^n \Sigma_i$ with the same property on $\mathcal{G}$, i.e. one that coincides with the original product measure on $\mathcal{G}$, then, by Corollary 2.14, Chapter 5 (on the uniqueness of the extension of a measure), these two measures must coincide on $\bigotimes_{i=1}^n \Sigma_i$. This proves the uniqueness of the product measure. Observe that another measure on $\bigotimes_{i=1}^n \Sigma_i$ may exist. However,
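it cannot be a product measure. □

6.12 Definition. Let $f\colon \Omega_1 \times \Omega_2 \to \Omega_3$ be a map, $\Omega = \Omega_1 \times \Omega_2 \times \Omega_3$, and let $\pi_1$, $\pi_2$ and $\pi_3$ be the corresponding projection operators in $\Omega$. Identifying $f$ with its graph in $\Omega$, let
$$f_{a_1} = \pi_{23}\big(\pi_1^*(\{a_1\}) \cap f\big)$$
and call it the $a_1$-section of $f$. In other words,
$$f_{a_1} = \{(\omega_2,\omega_3) \in \Omega_2 \times \Omega_3 : (a_1,\omega_2,\omega_3) \in f\},$$
i.e. $f_{a_1}(\omega_2) = f(a_1,\omega_2)$. The $a_2$-section is defined analogously. □

6.13 Proposition. Let $(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ be a measurable product space, $(\Omega_3,\Sigma_3)$ be a measurable space, and let $f\colon \Omega_1 \times \Omega_2 \to \Omega_3$ be a $\Sigma_1 \otimes \Sigma_2$-$\Sigma_3$-measurable function. Then the section $f_{a_1}$ is $\Sigma_2$-$\Sigma_3$-measurable and $f_{a_2}$ is $\Sigma_1$-$\Sigma_3$-measurable. (See Problem 6.10.)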
The theorems below resemble some of the original versions of Fubini's Theorem. Theorem 6.14 (which many refer to as Tonelli's Theorem) states in essence that if the integral of a nonnegative extended-valued $\Sigma_1 \otimes \Sigma_2$-measurable function is finite, then it can be calculated by using the iterated integrals of formula (6.14) below. To check whether or not an arbitrary $\Sigma_1 \otimes \Sigma_2$-measurable function $f$ is integrable one can also apply Theorem 6.14 directly to $|f|$ by using formula (6.13).

6.14 Theorem (Tonelli). Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces and let $f \in \mathcal{C}^{-1}_+(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$. Then the functions
$$a_1 \mapsto \int f_{a_1}(\omega_2)\,d\mu_2(\omega_2) \quad\text{and}\quad a_2 \mapsto \int f_{a_2}(\omega_1)\,d\mu_1(\omega_1)$$
are $\Sigma_1$- and $\Sigma_2$-measurable, respectively, and
$$\int f\,d\mu_1 \otimes \mu_2 = \int\Big(\int f_{a_1}(\omega_2)\,d\mu_2(\omega_2)\Big)d\mu_1(a_1) = \int\Big(\int f_{a_2}(\omega_1)\,d\mu_1(\omega_1)\Big)d\mu_2(a_2). \tag{6.14}$$
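Proof.

(i) As usual, we begin with nonnegative simple functions. Let
$$s = \sum_{i=1}^n \alpha_i 1_{A_i}, \quad A_i \in \Sigma_1 \otimes \Sigma_2.$$
By Problem 6.11, $s_{a_1} = \sum_{i=1}^n \alpha_i 1_{(A_i)_{a_1}}$ is $\Sigma_2$-measurable and
$$\int s_{a_1}(\omega_2)\,d\mu_2(\omega_2) = \sum_{i=1}^n \alpha_i\,\mu_2((A_i)_{a_1}),$$
which is $\Sigma_1$-$\overline{\mathcal{B}}_+$-measurable by Lemma 6.9. Hence we may integrate it with respect to the measure $\mu_1$:
$$\int\Big(\int s_{a_1}(\omega_2)\,d\mu_2(\omega_2)\Big)d\mu_1(a_1) = \sum_{i=1}^n \alpha_i \int \mu_2((A_i)_{a_1})\,d\mu_1(a_1). \tag{6.14a}$$
By Theorems 6.10 and 6.11, equation (6.14a) reduces to
$$\int\Big(\int s_{a_1}(\omega_2)\,d\mu_2(\omega_2)\Big)d\mu_1(a_1) = \sum_{i=1}^n \alpha_i\,\mu_1 \otimes \mu_2(A_i) = \int s\,d\mu_1 \otimes \mu_2. \tag{6.14b}$$

(ii) Let $f \in \mathcal{C}^{-1}_+(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$. Then there is a sequence $\{s_n\}\uparrow\ \subset \Psi_+(\Omega_1 \times \Omega_2, \mathcal{C}^{-1})$ such that $f = \sup\{s_n\}$. Thus
$$\int f\,d\mu_1 \otimes \mu_2 = \sup\Big\{\int s_n\,d\mu_1 \otimes \mu_2\Big\}$$
(by (6.14a) and (6.14b))
$$= \sup_n \int\Big(\int (s_n)_{a_1}(\omega_2)\,d\mu_2(\omega_2)\Big)d\mu_1(a_1)$$
(by the Monotone Convergence Theorem)
$$= \int\Big(\int \sup_n\{(s_n)_{a_1}(\omega_2)\}\,d\mu_2(\omega_2)\Big)d\mu_1(a_1).$$
By Problem 6.12, finally,
$$\sup_n\{(s_n)_{a_1}\} = f_{a_1}.$$
Now it is obvious how to complete the proof of formula (6.14). □

6.15 Theorem (Fubini). Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces and let $f \in L^1(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2, \mu_1 \otimes \mu_2; \mathbb{R})$. Then the functions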
$\omega_2 \mapsto f_{a_1}(\omega_2)$ and $\omega_1 \mapsto f_{a_2}(\omega_1)$ are $\mu_2$-integrable $\mu_1$-a.e. and $\mu_1$-integrable $\mu_2$-a.e., respectively; the functions
$$a_1 \mapsto \int f_{a_1}(\omega_2)\,d\mu_2(\omega_2) \quad\text{and}\quad a_2 \mapsto \int f_{a_2}(\omega_1)\,d\mu_1(\omega_1)$$
are $\mu_1$-a.e. and $\mu_2$-a.e. defined, and they are $\mu_1$- and $\mu_2$-integrable, respectively; and the formula
$$\int f\,d\mu_1 \otimes \mu_2 = \int\Big(\int f_{a_1}(\omega_2)\,d\mu_2(\omega_2)\Big)d\mu_1(a_1) = \int\Big(\int f_{a_2}(\omega_1)\,d\mu_1(\omega_1)\Big)d\mu_2(a_2) \tag{6.15}$$
holds.

Proof. Since, by the condition of the theorem,
$$\int |f|\,d\mu_1 \otimes \mu_2 < \infty,$$
by Problem 6.13 and then by Theorem 6.14 applied to $|f|$, we have that
$$\int |f|\,d\mu_1 \otimes \mu_2 = \int\Big(\int |f_{a_1}(\omega_2)|\,d\mu_2(\omega_2)\Big)d\mu_1(a_1) = \int\Big(\int |f_{a_2}(\omega_1)|\,d\mu_1(\omega_1)\Big)d\mu_2(a_2) < \infty.$$
By Proposition 1.21, the functions $a_1 \mapsto \int |f_{a_1}(\omega_2)|\,d\mu_2(\omega_2)$ and $a_2 \mapsto \int |f_{a_2}(\omega_1)|\,d\mu_1(\omega_1)$ are $\mu_1$- and $\mu_2$-a.e. finite, respectively. In other words, for almost all $a_1 \in \Omega_1$ and $a_2 \in \Omega_2$, $|f_{a_1}|$ and $|f_{a_2}|$ are $\mu_2$- and $\mu_1$-integrable, respectively. It is easy to verify that the same applies to the pairs of functions $(f_{a_1})^+$, $(f_{a_1})^-$ and $(f_{a_2})^+$, $(f_{a_2})^-$. By Theorem 6.14, the pairs of functions
$$a_1 \mapsto \int (f_{a_1})^{\pm}\,d\mu_2 \quad\text{and}\quad a_2 \mapsto \int (f_{a_2})^{\pm}\,d\mu_1$$
are $\Sigma_1$- and $\Sigma_2$-measurable, and they are $\mu_1$-a.e. and $\mu_2$-a.e. defined, respectively. Finally, applying formula (6.14) to $f^+$, $(f_{a_1})^+$, and $(f_{a_2})^+$ and then to $f^-$, $(f_{a_1})^-$, and $(f_{a_2})^-$, we arrive at formula (6.15). □

6.16 Remarks.
(i) Fubini's theorem generalizes Theorem 6.10, since for $f = 1_A$ and $A \in \Sigma_1 \otimes \Sigma_2$, the result of Theorem 6.10 immediately follows. That is why Theorem 6.10 is also called Fubini's theorem.

(ii) Observe that Tonelli's and Fubini's Theorems differ not only in that they are applied to nonnegative and arbitrary measurable functions, respectively, but also in that membership of the function in the $L^1$ space of the product space is a conclusion in Tonelli's Theorem while being a hypothesis in Fubini's Theorem.

(iii) The above results (including Fubini's theorem) can naturally be extended from the case of two spaces to the case of any finitely many spaces. This involves a relatively straightforward notational routine and will not be discussed further except by way of examples. □

6.17 Definition. Let $(\Omega_i,\Sigma_i,\mu_i)$, $i = 1,\ldots,n$, be measure spaces with $\sigma$-finite measures. Then the triple $\big(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i, \bigotimes_{i=1}^n \mu_i\big)$ is called the product (measure) space. □
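The integrability hypothesis in Fubini's Theorem (Remark 6.16 (ii)) cannot simply be dropped. A standard illustration, not taken from the text, uses the counting measure on $\{0,1,2,\ldots\}$ in both factors and the array with $+1$ on the diagonal and $-1$ immediately to its right: both iterated sums exist but differ, and the sum of absolute values diverges. The Python sketch below computes the inner sums exactly (each has at most two nonzero terms) and truncates only the outer sums at a hypothetical cutoff `N`.

```python
def a(m, n):
    """Entry of the illustrative array: +1 on the diagonal, -1 just right of it."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0


def row_sum(m):
    """Sum over n of a(m, n); only n = m and n = m + 1 contribute."""
    return a(m, m) + a(m, m + 1)


def col_sum(n):
    """Sum over m of a(m, n); only m = n and m = n - 1 (if n >= 1) contribute."""
    return a(n, n) + (a(n - 1, n) if n >= 1 else 0)


N = 1000  # hypothetical cutoff for the outer sums

rows_then_cols = sum(row_sum(m) for m in range(N))   # every row sums to 0
cols_then_rows = sum(col_sum(n) for n in range(N))   # column 0 sums to 1, the rest to 0
abs_mass = sum(abs(a(m, n)) for m in range(N) for n in (m, m + 1))  # equals 2 * N
```

The two iterated sums stay at 0 and 1 for every cutoff, while `abs_mass` grows without bound as `N` increases, so the array is not integrable with respect to the product of the counting measures.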
6.18 Examples.

(i) Let $C_n(x_0,r)$ be a closed ball in $\mathbb{R}^n$. Denote $V_n = \lambda^n(C_n(\theta,1))$, where $\theta$ is the origin. We wish to show that $\lambda^n(C_n(x_0,r)) = V_n \cdot r^n$, where
$$V_{2k} = \frac{\pi^k}{k!}, \quad k = 1,2,\ldots \tag{6.18}$$
and
$$V_{2k-1} = \frac{2^k\,\pi^{k-1}}{1\cdot 3\cdots(2k-1)}, \quad k = 1,2,\ldots. \tag{6.18a}$$
A closed ball is clearly a Borel set. Let $L_n(r,x_0)\colon \mathbb{R}^n \to \mathbb{R}^n$ be the bijective map
$$x \mapsto r\,x + x_0,$$
where $x, x_0 \in \mathbb{R}^n$ and $r > 0$. Then, $C_n(\theta,1) = L_n^*(r,x_0)(C_n(x_0,r))$. Observe that
$$L_n^*(r,x_0) = E(\tfrac{1}{r}) \circ M(-x_0),$$
where $E$ means expansion with factor $\tfrac{1}{r}$ and $M$ stands for the parallel motion (here with the shift $-x_0$). On the other hand,
$$C_n(\theta,1) = \{x \in \mathbb{R}^n : \|x\|^2 \le 1\} = \{x \in \mathbb{R}^n : x_1^2 + x_2^2 \le 1,\ x_3^2 + \cdots + x_n^2 \le 1 - x_1^2 - x_2^2\}. \tag{6.18b}$$
Now,
$$V_n = \int 1_{C_n(\theta,1)}\,d\lambda^n$$
(by Fubini's theorem and by (6.18b))
$$= \int 1_{C_2(\theta,1)}(x_1,x_2)\Big[\int 1_{C_{n-2}(\theta,\,(1-x_1^2-x_2^2)^{1/2})}\,d\lambda^{n-2}\Big]\,d\lambda^2(x_1,x_2).$$
The interior (second) integral is, due to Proposition 5.3, 1), Chapter 5, and the above observation, equal to
$$\lambda^{n-2}\big(C_{n-2}(\theta,\,(1-x_1^2-x_2^2)^{1/2})\big).$$
By Proposition 5.3, 2), Chapter 5, and by Theorem 4.1, the last integral equals
$$(1 - x_1^2 - x_2^2)^{(n-2)/2}\,V_{n-2}.$$
Therefore,
$$V_n = V_{n-2}\int (1 - x_1^2 - x_2^2)^{(n-2)/2}\,1_{C_2(\theta,1)}(x_1,x_2)\,d\lambda^2(x_1,x_2)$$
(by Fubini's theorem)
$$= V_{n-2}\iint (1 - x_1^2 - x_2^2)^{(n-2)/2}\,1_{C_2(\theta,1)}(x_1,x_2)\,d\lambda(x_1)\,d\lambda(x_2).$$
This is a Lebesgue integral of a continuous function on the unit ball and it can be reduced to a Riemann integral by using conventional techniques for Riemann integrals. For example, in polar coordinates the double integral above is
$$\int_{\rho \in [0,1]}\int_{\vartheta \in [0,2\pi]} \rho\,(1-\rho^2)^{(n-2)/2}\,d\vartheta\,d\rho = \frac{2\pi}{n}, \tag{6.18c}$$
and thus
$$V_n = V_{n-2}\,\frac{2\pi}{n}, \quad n = 2,3,\ldots. \tag{6.18d}$$
Let $V_0 = 1$. Then, $V_1 = 2$ (as the Lebesgue measure of the interval $[-1,1]$). By (6.18d), $V_2 = \pi$ (which agrees with the definition of $V_0$),
$$V_3 = V_1\,\frac{2\pi}{3} = \frac{4\pi}{3}, \quad\text{and}\quad V_4 = \frac{\pi^2}{2}.$$
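The recursion (6.18d) and the closed forms (6.18) and (6.18a) can be compared numerically. The short Python sketch below is only a check in floating-point arithmetic; the function names are illustrative choices, and the starting values $V_0 = 1$, $V_1 = 2$ are the ones stated above.

```python
import math


def unit_ball_volume(n):
    """V_n from the recursion V_n = V_{n-2} * 2*pi/n with V_0 = 1 and V_1 = 2."""
    if n == 0:
        return 1.0
    if n == 1:
        return 2.0
    return unit_ball_volume(n - 2) * 2.0 * math.pi / n


def closed_form(n):
    """V_n from (6.18) for even n = 2k and from (6.18a) for odd n = 2k - 1."""
    if n % 2 == 0:
        k = n // 2
        return math.pi ** k / math.factorial(k)
    k = (n + 1) // 2
    odd_product = math.prod(range(1, 2 * k, 2))   # 1 * 3 * ... * (2k - 1)
    return 2 ** k * math.pi ** (k - 1) / odd_product


volumes = [unit_ball_volume(n) for n in range(1, 13)]
```

For instance, `unit_ball_volume(3)` reproduces $4\pi/3$ and `unit_ball_volume(4)` reproduces $\pi^2/2$.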
The validity of formulas (6.18) and (6.18a) is then easily shown by induction and the use of (6.18d).

(ii) We show that Fubini's theorem need not hold when at least one of the measures, $\mu_1$ or $\mu_2$, is not $\sigma$-finite. Let $(\Omega_i,\Sigma_i) = ([0,1],\mathcal{B}([0,1]))$, $i = 1,2$, $\mu_1 = \mathrm{Res}_{[0,1]}\lambda$, and $\mu_2(A) = |A|$ if $A$ is finite and $\mu_2(A) = \infty$ if $A$ is infinite, where $A \in \Sigma_2$. Denote by
$$D = \{(x,y) \in [0,1]^2 : x = y\}$$
the diagonal of the square (see Figure 6.2).

Figure 6.2

We show that $D \in \Sigma_1 \otimes \Sigma_2 = \mathcal{B}^2([0,1]^2)$. Let
$$S_{nj} = \Big[\frac{j-1}{n}, \frac{j}{n}\Big], \quad j = 1,\ldots,n,$$
and
$$A_n = \bigcup_{j=1}^n S_{nj} \times S_{nj}.$$
Then $D \in \Sigma_1 \otimes \Sigma_2$, for $D = \bigcap_{n=1}^\infty A_n$. Now we find
$$\int_{[0,1]} \mu_2(D_x)\,\lambda(dx) = \int_{[0,1]} 1\,d\lambda = 1.$$
On the other hand,
$$\int_{[0,1]} \lambda(D_y)\,\mu_2(dy) = 0.$$
So as we see, Fubini's theorem or, more precisely, the second equation in (6.10) of Theorem 6.10, does not hold.

(iii) Let $(\mathbb{N},\mathcal{P}(\mathbb{N}),\gamma)$ be the counting measure space introduced in Example 1.2 (viii), Chapter 5, for more general measure spaces. We will consider a sequence $\{s_n\}$ of nonnegative simple functions on $\mathbb{N}$ as
$$s_n = \sum_{k=0}^n a_k 1_{\{k\}},$$
where $\{a_k\}$ is a nonnegative sequence of reals, so that
$$s_n \uparrow g, \quad g(k) = a_k.$$
Hence the integral of $g$ will turn to a series:
$$\int g\,d\gamma = \sum_{k=0}^\infty a_k. \tag{6.18e}$$
This is readily extendible to a series with real-valued terms. In other words, the integral of a sequence $\{a_n\} \subset \mathbb{R}$ with respect to the counting measure $\gamma$ is represented by the series in (6.18e). Let $\{f_n\}$ be a sequence of nonnegative functions of $\mathcal{C}^{-1}_+(\Omega,\Sigma)$ and let $\mu$ be a $\sigma$-finite measure on $\Sigma$. Since the above counting measure $\gamma$ is $\sigma$-finite, the function $f$ (where $f(n,\omega) = f_n(\omega)$) obviously meets the conditions of Tonelli's Theorem 6.14: $f \in \mathcal{C}^{-1}_+(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma)$. Consequently, the functions
$$n \mapsto \int f_n\,d\mu \quad\text{and}\quad \omega \mapsto \sum_{n=0}^\infty f_n(\omega)$$
are $\mathcal{P}(\mathbb{N})$- and $\Sigma$-measurable, respectively, and
$$\sum_{n=0}^\infty \int f_n\,d\mu = \int \sum_{n=0}^\infty f_n\,d\mu. \tag{6.18f}$$
(6.18f) is a nice illustration of Tonelli's Theorem. However, it is a slightly weaker alternative to Beppo Levi's Corollary 2.2, since the latter does not require $\mu$ to be $\sigma$-finite. Now, let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma)$. To use an analog of Fubini's Theorem, we need to make sure that $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, or, alternatively, apply the above procedure initially to the sequence $\{|f_n|\}$ instead. Then, from (6.18f) we can get
$$\int |f|\,d\gamma \otimes \mu = \sum_{n=0}^\infty \int |f_n|\,d\mu = \int \sum_{n=0}^\infty |f_n|\,d\mu. \tag{6.18g}$$
Should now $\int |f|\,d\gamma \otimes \mu$ or its equivalents, $\sum_{n=0}^\infty \int |f_n|\,d\mu$ or $\int \sum_{n=0}^\infty |f_n|\,d\mu$, be finite, then it would yield that
$$f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R}),$$
and therefore, Fubini's formula (6.18f) would hold true, now for an arbitrary sequence of measurable functions $\{f_n\}$. Notice that, since
$$\int |f|\,d\gamma \otimes \mu < \infty$$
is a necessary condition for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$, it automatically implies that
$$\sum_{n=0}^\infty \int |f_n|\,d\mu < \infty \tag{6.18h}$$
or
$$\int \sum_{n=0}^\infty |f_n|\,d\mu < \infty$$
would be alternative necessary conditions for $f \in L^1(\mathbb{N} \times \Omega, \mathcal{P}(\mathbb{N}) \otimes \Sigma, \gamma \otimes \mu; \mathbb{R})$ (although the latter is by no means a necessary condition for Fubini's Theorem). This version of Fubini's Theorem can compete with the Generalized Monotone Convergence Theorem 2.4 and Lebesgue's Dominated Convergence Theorem 2.6 in some applications.

(iv) As an illustration of the last application of Fubini's Theorem, consider a random variable $X$ on a probability space $(\Omega,\Sigma,\mathbb{P})$. The function $\theta \mapsto m(\theta) = \mathbb{E}[e^{\theta X}]$ (normally, complex-valued) is known as the moment generating function of $X$. If we expand $e^{\theta X}$ in the Maclaurin series,
$$e^{\theta X} = \sum_{n=0}^\infty \frac{\theta^n}{n!}X^n,$$
we will have, with
$$m(\theta) = \mathbb{E}\Big[\sum_{n=0}^\infty \frac{\theta^n}{n!}X^n\Big],$$
a scenario of the application of Fubini's Theorem discussed in Example (iii). Hence we have to make sure that, in light of (6.18h), the series
$$\sum_{n=0}^\infty \mathbb{E}\Big[\frac{|\theta|^n}{n!}|X|^n\Big] = \sum_{n=0}^\infty \frac{|\theta|^n}{n!}\,\mathbb{E}[|X|^n] < \infty$$
in some vicinity of $\theta = 0$. [The latter holds for many practical cases, provided that $\mathbb{E}[|X|^n] < \infty$ for all $n$.] Assuming that all absolute moments of $X$ exist and the above series converges, the application of Fubini's Theorem (6.18f) yields that
$$m(\theta) = \sum_{n=0}^\infty \frac{\theta^n}{n!}\,\mathbb{E}[X^n],$$
as a Taylor series expansion of $m(\theta)$ in terms of all moments of the random variable $X$, and consequently that
$$\mathbb{E}[X^n] = m^{(n)}(0), \quad n = 0,1,\ldots.$$
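For a random variable taking finitely many values, all moments exist and the series of absolute moments converges for every $\theta$, so the expansion of $m(\theta)$ in Example (iv) can be checked directly. The Python sketch below assumes a hypothetical three-point distribution; the values, probabilities, the number of series terms, and the finite-difference step are illustrative choices.

```python
import math

# Hypothetical discrete distribution: P(X = values[i]) = probs[i].
values = [-1.0, 0.5, 2.0]
probs = [0.2, 0.5, 0.3]


def moment(n):
    """E[X^n] for the discrete distribution above."""
    return sum(p * x ** n for x, p in zip(values, probs))


def mgf(theta):
    """m(theta) = E[exp(theta * X)], computed directly."""
    return sum(p * math.exp(theta * x) for x, p in zip(values, probs))


def mgf_series(theta, terms=60):
    """Partial sum of sum_n theta^n / n! * E[X^n]."""
    return sum(theta ** n / math.factorial(n) * moment(n) for n in range(terms))


theta = 0.8
direct = mgf(theta)
from_moments = mgf_series(theta)

# Central difference as a rough numerical stand-in for m'(0) = E[X].
h = 1e-5
first_derivative_at_zero = (mgf(h) - mgf(-h)) / (2.0 * h)
```

Here `direct` and `from_moments` agree to floating-point accuracy, and `first_derivative_at_zero` approximates the first moment `moment(1)`.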
v) Consider Borel-Lebesgue measure A 2 = A ® A on Borel u-algebra ( � 2 • Let A = Q x IR. According to Problem 3.1, Chapter 5, A is a count able union of Borel-null sets. Thus,
374
CHAPTER 6. ELEMENTS OF INTEG RATION
On the other hand, the section ( 1 A ) 1 is not A-integrable for all a 1 E Q. This is, however, in agreement with Fubini's Theorem that the function ( l A )a 1 is A-integrable only for almost all a1 E IR. (vi) Now we discuss yet another application often occurring in prob ability theory. Let J.L F and J.L a be finite Borel-Lebesgue-Stieltjes measures induced by distribution functions F,G E 9::> ( 1R, �) (see Remark 3.5 (iii), Chapter 5) . Recall that a
From Problem 3. 7, Chapter 5, given a compact interval I = have that
[a,b], we
(6. 18i) [F(b) - F(a - )][G(b) - G(a - )]. Let T = {(x,y) E [a,b] 2 : y > x} and T1 = {(x,y) E [a,b] 2 : y < x}, which are the upper and lower triangles of the square I , respectively. Now we calculate the measure of 1 2 under J.L p ® J.L a by using Theorem 6 . 1 0 in =
u
terms of Lebesgue-Stieltjes integrals:
2 ) = J.L p ® J.La (T ) + ® J.L a (T l f..L F J J.L a ([a,x])J.L p (dx) + J J.L p ([a,y))J.L a (dy)
f..L F ® J.L a ([a,b] =
=
I
u
)
I
J [G(x) - G(a - )] J.L p (dx) + J [F(x - ) - F(a - )] J.L a (dx). I
I
(6 . 18j)
Equating (6. 18i) and (6. 18j) we arrive at
J G(x)J.L p (dx) + J F(x - )J.L G (dx) I
I
=
F(b)G(b) - F(a - )G(a - ).
Interchanging the roles of F and
G we have
J G(x - )J.L p (dx) + J F(x)J.L G (dx)
I
I
=
(6. 18k)
F(b)G(b) - F(a - )G(a - ).
(6 . 181)
6. Pro du ct Measures and Fubini 's Theorem
375
Hence, from (6.18k) and (6.18l) we establish the following integration by parts formula for Lebesgue-Stieltjes integrals:
$$F(b)G(b) - F(a-)G(a-) = \int_I \tfrac{1}{2}\{F(x) + F(x-)\}\,\mu_G(dx) + \int_I \tfrac{1}{2}\{G(x) + G(x-)\}\,\mu_F(dx). \tag{6.18m}$$
□
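As an informal check of (6.18m), the sketch below evaluates both sides for two purely atomic measures that share discontinuities. The atoms, weights, interval, and helper names (`cdf`, `integral`) are illustrative assumptions, not taken from the text; exact rational arithmetic is used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def cdf(atoms, x, left=False):
    """F(x) = mu((-inf, x]) or, with left=True, F(x-) = mu((-inf, x))."""
    if left:
        return sum(w for p, w in atoms if p < x)
    return sum(w for p, w in atoms if p <= x)

def integral(f, atoms, a, b):
    """Integral of f over the closed interval [a, b] w.r.t. an atomic measure."""
    return sum(f(p) * w for p, w in atoms if a <= p <= b)

# Hypothetical atomic measures with common atoms at 1 and 2.
mu_F = [(0, Fraction(1, 2)), (1, Fraction(1, 3)), (2, Fraction(1, 6))]
mu_G = [(1, Fraction(1, 4)), (2, Fraction(1, 2)), (3, Fraction(1, 4))]
a, b = 1, 2

F = lambda x: cdf(mu_F, x)
F_left = lambda x: cdf(mu_F, x, left=True)
G = lambda x: cdf(mu_G, x)
G_left = lambda x: cdf(mu_G, x, left=True)

# Left-hand side of (6.18m).
lhs = F(b) * G(b) - F_left(a) * G_left(a)
# Right-hand side of (6.18m) with the symmetrized integrands.
rhs = (integral(lambda x: (F(x) + F_left(x)) / 2, mu_G, a, b)
       + integral(lambda x: (G(x) + G_left(x)) / 2, mu_F, a, b))
# Unsymmetrized version: only valid without common discontinuities.
naive = integral(F, mu_G, a, b) + integral(G, mu_F, a, b)

print(lhs, rhs, naive)
```

With these particular weights the two sides of (6.18m) agree, while the unsymmetrized sum differs, which is why the averages of $F(x)$ and $F(x-)$ appear in the formula when $F$ and $G$ jump at the same points.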
PROBLEMS

6.1 Let $\mathcal{G}_0$ be the set of all measurable rectangles $A = A_1 \times \dots \times A_n$, $A_i \in \Sigma_i$. Denote by $\Sigma(\mathcal{G}_0)$ the algebra generated by $\mathcal{G}_0$ and by $\mathcal{C}(\mathcal{G}_0)$ the collection of all finite unions of disjoint rectangles of $\mathcal{G}_0$. Show that $\Sigma(\mathcal{G}_0) = \mathcal{C}(\mathcal{G}_0)$.
6.2 Prove that the section is commutative with respect to all set operations.
6.3 Show the validity of assertion a) in Lemma 6.7.
6.4 Show that a rectangle $R_1 \times R_2 \in \Sigma \otimes \Sigma$, where $R_1$ and $R_2$ are not empty, if and only if $R_1 \in \Sigma$ and $R_2 \in \Sigma$.
6.5 Let $(\Omega_i, \Sigma_i, \mu_i)$, $i = 1,2$, be $\sigma$-finite measure spaces. Show that the product measure $\mu_1 \otimes \mu_2$ is $\sigma$-finite.
6.6 Let $(\mathbb{R}^n, \mathfrak{B}^n, \lambda^n)$ and $(\mathbb{R}^k, \mathfrak{B}^k, \lambda^k)$ be the Borel-Lebesgue measure spaces. Show that $\lambda^n \otimes \lambda^k = \lambda^{n+k}$.
6.7 Let $(\Omega_i, \Sigma_i, \mu_i)$, $i = 1,2$, be measure spaces with $\sigma$-finite measures and let $A \in \Sigma_1 \otimes \Sigma_2$. Show that the following statements are equivalent:
1) $\mu_1 \otimes \mu_2(A) = 0$;
2) $\mu_2(A_{a_1}) = 0$ $\mu_1$-a.e. on $\Omega_1$;
3) $\mu_1(A_{a_2}) = 0$ $\mu_2$-a.e. on $\Omega_2$.
6.8 Let $A \subseteq \Omega_1 \times \Omega_2$ and let $a_1 \in \Omega_1$. Show that $(1_A)_{a_1} = 1_{A_{a_1}}$.
6.9 Show that $f_{a_1}^*(A_3) = (f^*(A_3))_{a_1}$, $A_3 \subseteq \Omega_3$.
6.10 Prove Proposition 6.13. [Hint: Apply Lemma 6.7 and Problem 6.9.]
6.11 Let $A, B \subseteq \Omega_1 \times \Omega_2$ be two disjoint sets and let $\alpha, \beta \in \mathbb{R}$. Show that $(\alpha 1_A + \beta 1_B)_{a_1} = \alpha (1_A)_{a_1} + \beta (1_B)_{a_1}$.
6.12 Let $f \in \mathcal{C}_+^{-1}(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ and let $\{s_n\} \subseteq \Psi_+(\Omega_1 \times \Omega_2, \Sigma_1 \otimes \Sigma_2)$ such that $f = \sup\{s_n\}$. Show that $f_{a_1} = \sup\{(s_n)_{a_1}\}$. [Hint: Apply Theorem 6.5, Chapter 5, and Problem 6.10.]
6.13 Show that $|f|_a = |f_a|$, $(f^+)_a = (f_a)^+$, and $(f^-)_a = (f_a)^-$.
6.14 Let $\Sigma_1$ and $\Sigma_2$ be $\sigma$-algebras on $\Omega_1$ and $\Omega_2$, respectively. Show that $\Sigma_1 \odot \Sigma_2$ is a semi-ring. Let $\mathcal{G}_1$ and $\mathcal{G}_2$ be semi-rings on $\Omega_1$ and $\Omega_2$, respectively. Is $\mathcal{G}_1 \odot \mathcal{G}_2$ also a semi-ring?
6.15 What will the smallest algebra generated by $\Sigma_1 \odot \Sigma_2$ from Problem 6.14 look like?
6.16 Let $\mu_i$ and $\nu_i$ be finite measures on a measurable space $(\Omega_i, \Sigma_i)$, $i = 1,2$. Show that if $\mu_i \ll \nu_i$, $i = 1,2$, then $\mu_1 \otimes \mu_2 \ll \nu_1 \otimes \nu_2$.
6.17 Let $(\Omega, \Sigma, \mu)$ be a $\sigma$-finite measure space and let $f \in \mathcal{C}_+^{-1}(\Omega, \Sigma)$. Prove that
$$\int f\,d\mu = \int_{(0,\infty)} \mu(\{f > x\})\,\lambda(dx)$$
(P6.17) by using Theorem 6.10.
6.18 Generalization of (P6.17). In the condition of Problem 6.17, let $g: \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous monotone nondecreasing function such that $g(0) = 0$ and which is continuously differentiable on $(0,\infty)$. Show that
$$\int g(f)\,d\mu = (L)\int_{(0,\infty)} g'(x)\,\mu(\{f > x\})\,\lambda(dx) = (R)\int_0^\infty g'(x)\,\mu(\{f > x\})\,dx.$$
6.19 Show that if $F$ and $G$ in Example 6.18 (vi) have no common discontinuities, then formula (6.18m) reduces to
$$F(b)G(b) - F(a-)G(a-) = \int_I F(x)\,\mu_G(dx) + \int_I G(x)\,\mu_F(dx). \tag{P6.19}$$
NEW TERMS: measurable rectangle; product $\sigma$-algebra; measurable cylinder; section of a set; $a_i$-section of a set; section of a function; $a_i$-section of a function; Tonelli's Theorem; Fubini's Theorem; product measure space; closed ball in $\mathbb{R}^n$, Borel-Lebesgue measure of; integral with respect to the counting measure; moment generating function; integration by parts formula for Lebesgue-Stieltjes integrals
7. APPLICATIONS OF FUBINI'S THEOREM
Product measures and Fubini's theorem find some of their finest applications in probability theory. One of them has to do with independence of random variables, a popular topic in statistics and stochastic processes.
7.1 Definitions. Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space.
(i) Let $\mathcal{G} \subseteq \Sigma$ be an arbitrary (indexed) family of events (i.e. measurable subsets of $\Omega$). $\mathcal{G}$ is called $\mathbb{P}$-independent (or just independent) if, for any finite subcollection $\{A_{i_1}, \dots, A_{i_n}\}$ of $n \ge 2$ events from $\mathcal{G}$, the following relation holds true:
$$\mathbb{P}\{A_{i_1} \cap \dots \cap A_{i_n}\} = \mathbb{P}(A_{i_1}) \cdots \mathbb{P}(A_{i_n}). \tag{7.1a}$$
Observe that, if $\mathcal{G}$ is an independent family of events then the Dynkin system generated by $\mathcal{G}$ is also independent (see Problem 7.1). If, in addition, $\mathcal{G}$ is $\cap$-stable, then $\mathcal{D}(\mathcal{G})$ is an independent $\sigma$-algebra.
(ii) Let $\mathfrak{G} = \{\mathcal{G}_i;\ i \in I\} \subseteq \Sigma$ be an indexed collection of families of events. $\mathfrak{G}$ is called independent if, for any finite subset $\{i_1, \dots, i_n\} \subseteq I$, $n \ge 2$, and for any choice of $A_{i_k} \in \mathcal{G}_{i_k}$, $k = 1, \dots, n$, the events $A_{i_1}, \dots, A_{i_n}$ are independent.
(iii) Let $\mathfrak{X} = \{X_i;\ i \in I\}$ be an indexed collection of random variables on $(\Omega, \Sigma, \mathbb{P})$. $\mathfrak{X}$ is called independent if the corresponding collection $\{\sigma(X_i);\ i \in I\}$ of $\sigma$-algebras generated by these random variables is independent.
(iv) Let $X_i: \Omega \to \Omega_i$, $i = 1, \dots, n$, be $\Sigma$-$\Sigma_i$ random variables on $(\Omega, \Sigma, \mathbb{P})$. Then we denote
$$\bigotimes_{i=1}^n X_i = (X_1, \dots, X_n): \Omega \to \Omega_1 \times \dots \times \Omega_n$$
and call it the product map.
(v) It appears (Problem 7.2) that the product map $\bigotimes_{i=1}^n X_i$ is $\Sigma$-$\bigotimes_{i=1}^n \Sigma_i$-measurable. Therefore, by letting
$$\mathbb{P}_{\otimes X_i} = \mathbb{P}\Big(\bigotimes_{i=1}^n X_i\Big)^*,$$
we can define a probability measure on $\big(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i\big)$ and call it the joint distribution of random variables $X_1, \dots, X_n$.
□
Let $\mathbb{P}_{X_i} = \mathbb{P}X_i^*$ be the distribution of the random variable $X_i$, $i = 1, \dots, n$. This is a probability measure on $\Sigma_i$. Then, according to the previous section, we can construct the triple
$$\Big(\prod_{i=1}^n \Omega_i,\ \bigotimes_{i=1}^n \Sigma_i,\ \bigotimes_{i=1}^n \mathbb{P}_{X_i}\Big).$$
On the other hand, we already have another measure $\mathbb{P}_{\otimes X_i}$ on $\big(\prod_{i=1}^n \Omega_i, \bigotimes_{i=1}^n \Sigma_i\big)$, which, in general, need not be a product measure. The following statement clarifies the matter.
7.2 Proposition. The joint probability distribution $\mathbb{P}_{\otimes X_i}$ is a product measure and equals $\bigotimes_{i=1}^n \mathbb{P}_{X_i}$ if and only if the random variables $X_1, \dots, X_n$ are independent.
□
(See Problem 7.3.) Note that the treatment of the product $\mathbb{P}_{\otimes X_i}$ of more than finitely many independent random variables is more complicated; such a treatment involves the product of infinitely many $\sigma$-algebras and measures.
Another important application of product measures and Fubini's theorem is the notion of "convolution" of measures.
7.3 Definition. Let $\mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}(\mathbb{R}^k))$ be the set of all finite Borel measures on $\mathfrak{B}^k = \mathfrak{B}(\mathbb{R}^k)$. Clearly, $\mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}(\mathbb{R}^k))$ is a semi-linear space over the semifield $\mathbb{R}_+$. Let $\mu = \bigotimes_{i=1}^n \mu_i$ where $\mu_i \in \mathfrak{M}_*$ (it is easily seen that $\mu \in \mathfrak{M}_*(\mathbb{R}^{kn}, \mathfrak{B}^{kn})$). Consider the linear measurable map $L_n: \mathbb{R}^{kn} \to \mathbb{R}^k$ as $L_n(x_1, \dots, x_n) = \sum_{i=1}^n x_i$. Then the image measure $\mu L_n^*$ is called the convolution of measures $\mu_1, \dots, \mu_n$ and it is denoted by
$$\mathop{*}_{i=1}^{n} \mu_i = \mu_1 * \mu_2 * \cdots * \mu_n.$$
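For measures concentrated on finitely many points, Definition 7.3 can be carried out literally: form the product measure and push it forward under the sum map. The following sketch does this for weighted point masses stored as dictionaries; the function name `convolve` and the fair-die example are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def convolve(*measures):
    """Image of the product measure under (x_1, ..., x_n) -> x_1 + ... + x_n.

    Each measure is a dict {point: mass} describing a finite atomic measure.
    """
    result = defaultdict(Fraction)
    # Iterate over all atoms of the product measure.
    for combo in product(*(m.items() for m in measures)):
        point = sum(x for x, _ in combo)   # value of the sum map L_n
        mass = Fraction(1)
        for _, w in combo:                 # mass of the product atom
            mass *= w
        result[point] += mass
    return dict(result)

# Hypothetical example: distribution of one fair die.
die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = convolve(die, die)
three_dice = convolve(die, die, die)
print(two_dice[7], three_dice[3])
```

The design mirrors the definition rather than optimizing it: the cost grows with the number of atoms of the product measure, so iterated pairwise convolution would be preferable for many factors.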
7.4 Properties of Convolution.
(i) Let $f \in \mathcal{C}_+^{-1}(\mathbb{R}^k, \mathfrak{B}^k)$ and let $\mu_1, \mu_2 \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$. Then
$$\int f\,d(\mu_1 * \mu_2) = \iint f(x_1 + x_2)\,\mu_1(dx_1)\,\mu_2(dx_2). \tag{7.4a}$$
For $f \in \mathcal{C}^{-1}(\mathbb{R}^k, \mathfrak{B}^k)$ and $\mu_1, \mu_2 \in \mathfrak{M}_*$, we require that $f = f^+ - f^-$ be $\mu_1 * \mu_2$- (or $\mu_2 * \mu_1$-) integrable to have (7.4a) valid. Specifically, let $f = 1_A$, $A \in \mathfrak{B}^k$. Then, (since $1_A \circ L = 1_{T^*(A)}$ where $T(x_1) = x_1 + x_2$ and $T^*(y) = y - x_2$)
$$\mu_1 * \mu_2(A) = \iint 1_{A - x_2}(x_1)\,\mu_1(dx_1)\,\mu_2(dx_2) = \int \mu_1(A - x_2)\,\mu_2(dx_2). \tag{7.4b}$$
Applying Fubini's Theorem to (7.4a) (i.e. interchanging the integration) we also get
$$\mu_1 * \mu_2(A) = \int \mu_2(A - x_1)\,\mu_1(dx_1).$$
But the expression on the right is exactly $(\mu_2 * \mu_1)(A)$, which implies commutativity of the convolution.
(ii) If $\mu_1, \mu_2, \nu \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$, then $\mu_1 + \mu_2 \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$, and we have by (7.4b) that
$$\nu * (\mu_1 + \mu_2)(A) = \int (\mu_1 + \mu_2)(A - x)\,\nu(dx) = \int \mu_1(A - x)\,\nu(dx) + \int \mu_2(A - x)\,\nu(dx) = (\nu * \mu_1)(A) + (\nu * \mu_2)(A). \tag{7.4c}$$
Thus, the convolution is distributive.
(iii) Let $\{\nu, \mu_n\} \subseteq \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$ such that $\sum_{n=1}^\infty \mu_n \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$. Then, by the same argument as in (ii), we get
$$\nu * \Big(\sum_{n=1}^\infty \mu_n\Big) = \sum_{n=1}^\infty \nu * \mu_n, \tag{7.4d}$$
i.e. the convolution is also $\sigma$-distributive.
(iv) Let $\mu_1$ and $\mu_2$ be as above and let $a \in \mathbb{R}_+ \setminus \{0\}$. Then, it is easily seen that (7.4e)
□
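7.5 Examples.
(i) Let $\mu_1 = \varepsilon_a$ and $\mu_2 = \varepsilon_b \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$. Then by (2.2),
$$\varepsilon_a * \varepsilon_b(A) = \int \varepsilon_a(A - x)\,\varepsilon_b(dx) = \varepsilon_a(A - b) = \varepsilon_{a+b}(A),$$
since $\varepsilon_a(A - b) = 1_{A-b}(a) = 1_A(a + b)$ for fixed $a, b, A$. We can therefore write
$$\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}. \tag{7.5a}$$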
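(ii) Let $\beta_{n,p}$ and $\beta_{m,p}$ be binomial measures introduced in Example 1.8 (iii), Chapter 5. We find the convolution of these measures by applying (7.4c)-(7.5a):
$$\beta_{m,p} * \beta_{n,p} = \Big[\sum_{k=0}^{m} \binom{m}{k} p^k (1-p)^{m-k} \varepsilon_k\Big] * \Big[\sum_{i=0}^{n} \binom{n}{i} p^i (1-p)^{n-i} \varepsilon_i\Big] = \sum_{k=0}^{m} \sum_{i=0}^{n} \binom{m}{k}\binom{n}{i} p^{i+k} (1-p)^{m+n-i-k}\, \varepsilon_{i+k}.$$
Denoting $i + k = j$ and renumbering the second sum, we have
$$\beta_{m,p} * \beta_{n,p} = \sum_{j=0}^{m+n} \Big[\sum_{k=0}^{j} \binom{m}{k}\binom{n}{j-k}\Big] p^j (1-p)^{m+n-j}\, \varepsilon_j,$$
with the convention that $\binom{r}{s} = 0$ for $s > r$. The inner sum is $\binom{m+n}{j}$ by a known combinatorial identity. Therefore,
$$\beta_{m,p} * \beta_{n,p} = \beta_{m+n,p}.$$
(iii) Convolution of atomic measures. Let $\mu_1 = \sum_{i=0}^\infty \alpha_i \varepsilon_i$ and $\mu_2 = \sum_{j=0}^\infty \beta_j \varepsilon_j \in \mathfrak{M}_*$. Then by (7.4d)-(7.5a),
$$\mu_1 * \mu_2 = \sum_{i=0}^\infty \sum_{j=0}^\infty \alpha_i \beta_j\, \varepsilon_{i+j}.$$
Substituting $k$ for $i + j$ we have
$$\mu_1 * \mu_2 = \sum_{k=0}^\infty \Big(\sum_{i=0}^{k} \alpha_i \beta_{k-i}\Big) \varepsilon_k. \tag{7.5b}$$
The expression $\sum_{i=0}^{k} \alpha_i \beta_{k-i} = \gamma_k$ is known as the convolution in the product of power series.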
(iv) Consider the following special case of (iii) by taking $\mu_1 = \pi_a$ and $\mu_2 = \pi_b$, Poisson measures with parameters $a$ and $b$ (introduced in Example 1.8 (iv), Chapter 5). By formula (7.5b), we therefore have
$$\begin{aligned}
\pi_a * \pi_b &= \Big(\sum_{i=0}^\infty e^{-a} \frac{a^i}{i!}\, \varepsilon_i\Big) * \Big(\sum_{j=0}^\infty e^{-b} \frac{b^j}{j!}\, \varepsilon_j\Big) \\
&= \sum_{k=0}^\infty \sum_{i=0}^{k} e^{-(a+b)} \frac{a^i b^{k-i}}{i!\,(k-i)!}\, \varepsilon_k \\
&= \sum_{k=0}^\infty e^{-(a+b)} \frac{1}{k!} \Big(\sum_{i=0}^{k} \binom{k}{i} a^i b^{k-i}\Big) \varepsilon_k \\
&= \sum_{k=0}^\infty e^{-(a+b)} \frac{(a+b)^k}{k!}\, \varepsilon_k = \pi_{a+b}.
\end{aligned}$$
□
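The coefficient formula (7.5b) is easy to test numerically on the binomial and Poisson examples above. The sketch below is only an illustration: the parameter values and the truncation length for the Poisson weights are arbitrary choices, and the helper names are ours. Truncating both Poisson sequences at `size` terms leaves the first `size` coefficients of the convolution exact, since $\gamma_k$ involves only indices up to $k$.

```python
from math import comb, exp, factorial, isclose

def binomial(n, p):
    """Weights of the binomial measure on the atoms 0, ..., n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def poisson(a, size):
    """First `size` weights of the Poisson measure with parameter a."""
    return [exp(-a) * a**k / factorial(k) for k in range(size)]

def convolve(alpha, beta):
    """gamma_k = sum_i alpha_i * beta_{k-i}, as in (7.5b)."""
    gamma = [0.0] * (len(alpha) + len(beta) - 1)
    for i, a_i in enumerate(alpha):
        for j, b_j in enumerate(beta):
            gamma[i + j] += a_i * b_j
    return gamma

def close(u, v):
    return all(isclose(x, y, rel_tol=1e-9, abs_tol=1e-12) for x, y in zip(u, v))

# Binomial measures with a common p (values chosen for illustration).
binom_ok = close(convolve(binomial(3, 0.3), binomial(4, 0.3)), binomial(7, 0.3))

# Poisson measures, compared on the first 20 atoms only.
size = 20
poisson_ok = close(convolve(poisson(1.5, size), poisson(2.5, size))[:size],
                   poisson(4.0, size))
print(binom_ok, poisson_ok)
```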
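7.6 Remark. Let $f \ge 0$ be an element of $L^1(\mathbb{R}^k, \mathfrak{B}^k, \lambda^k)$ and let $\mu$ be the measure generated by $\int f\,d\lambda^k$. Therefore, $\mu \in \mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$. Let $\nu$ be another measure from $\mathfrak{M}_*(\mathbb{R}^k, \mathfrak{B}^k)$. We wonder about the convolution $\mu * \nu$. We have by (7.4a) and with $L_y(x) = x + y$ that:
$$\begin{aligned}
\mu * \nu(A) &= \int 1_A\,d(\mu * \nu) = \iint 1_A(x + y)\,\mu(dx)\,\nu(dy) \\
&= \int \Big(\int 1_A(x + y) f(x)\,\lambda^k(dx)\Big)\nu(dy) \\
&= \int \Big(\int 1_A(x + y) f(x)\,\lambda^k L_{-y}^*(dx)\Big)\nu(dy) \\
&= \int \Big(\int \psi_y \circ L_{-y}(x)\,\lambda^k(dx)\Big)\nu(dy) \quad [\text{where } \psi_y(x) = 1_A(x + y) f(x)] \\
&= \int \Big(\int \psi_y(x - y)\,\lambda^k(dx)\Big)\nu(dy) = \int \Big(\int 1_A(x) f(x - y)\,\lambda^k(dx)\Big)\nu(dy) \\
&= \int_A \Big(\int f(x - y)\,\nu(dy)\Big)\lambda^k(dx) = \int_A \varphi\,d\lambda^k \quad [\text{by Fubini's theorem}],
\end{aligned} \tag{7.6a}$$
where $\varphi(x) = \int f(x - y)\,\nu(dy)$ is, by Fubini's theorem, $\mathfrak{B}^k$-measurable and $\int \varphi\,d\lambda^k$ is obviously finite. Therefore, $\mu * \nu$ is the measure generated by $\int \varphi\,d\lambda^k$.
□
7.7 Definition. The function $\varphi$ of Remark 7.6 is called the convolution of the function $f$ and the measure $\nu$ and it will be denoted by $\varphi = f * \nu$.
□
Observe that, since $\mu * \nu = \nu * \mu$, we have that $f * \nu = \nu * f$.
7.8 Remark. Let $f, g \in L^1(\lambda^k)$ and $f, g \ge 0$. Let $\mu = \int f\,d\lambda^k$ and $\nu = \int g\,d\lambda^k$. Then $\mu, \nu \in \mathfrak{M}_*$; and we obtain, by using Proposition 5.3, that
$$f * \nu(x) = \int f(x - y)\,\nu(dy) = \int f(x - y) g(y)\,\lambda^k(dy).$$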
Now we have from (7.6a),
$$\mu * \nu = \Big(\int f\,d\lambda^k\Big) * \nu = \int f * \nu\,d\lambda^k = \int \Big(\int f(x - y) g(y)\,\lambda^k(dy)\Big)\lambda^k(dx) = \int f * g(x)\,\lambda^k(dx),$$
where we denoted
$$f * g(x) = \int f(x - y) g(y)\,\lambda^k(dy). \tag{7.8a}$$
The function $f * g$ is obviously integrable and we call it the convolution of $f$ and $g$.
□
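Formula (7.8a) can be approximated by a Riemann sum in dimension $k = 1$. As an illustration (our choice of example, not the text's), take $f = g$ to be the density of the uniform distribution on $[0,1]$; the convolution is then the triangular density equal to $x$ on $[0,1]$ and $2 - x$ on $[1,2]$. The integration window, step count, and function names below are assumptions of this sketch.

```python
def uniform_density(t):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def convolve_at(f, g, x, lo=-1.0, hi=3.0, steps=40000):
    """Midpoint-rule approximation of (f*g)(x) = int f(x - y) g(y) dy over [lo, hi]."""
    dy = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * dy
        total += f(x - y) * g(y)
    return total * dy

values = {x: convolve_at(uniform_density, uniform_density, x)
          for x in (0.5, 1.0, 1.5)}
print(values)  # should be close to the triangular density at these points
```

Because the integrand is discontinuous, the midpoint rule converges only at first order here; the example is meant to illustrate the formula, not to be an efficient quadrature.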
The definition of the convolution for functions can be extended from (7.8a) to real-valued integrable functions. However, we shall refrain from connecting it with the convolution for measures, since it will require a background on signed measures (in Chapter 8).
7.9 Example. In probability theory, the convolution finds its application for the distribution of the sum of independent random variables. Let $X_1, \dots, X_n$ be independent random variables valued in $\mathbb{R}^k$ with their distributions $\mathbb{P}_{X_i}$, $i = 1, \dots, n$. Let $S = \sum_{i=1}^n X_i$ and $L_n$ be as in Definition 7.3. Then
$$S = L_n \circ \Big(\bigotimes_{i=1}^n X_i\Big)$$
and thus
$$\mathbb{P}_S = \mathbb{P}S^* = \mathbb{P}\Big(\bigotimes_{i=1}^n X_i\Big)^* L_n^*.$$
Since $X_1, \dots, X_n$ are independent, by Proposition 7.2,
$$\mathbb{P}\Big(\bigotimes_{i=1}^n X_i\Big)^* = \mathbb{P}_{\otimes X_i} = \bigotimes_{i=1}^n \mathbb{P}_{X_i}$$
and, following Definition 7.3, we obtain that
$$\mathbb{P}_S = \Big(\bigotimes_{i=1}^n \mathbb{P}_{X_i}\Big) L_n^* = \mathop{*}_{i=1}^{n} \mathbb{P}_{X_i}.$$
□

PROBLEMS

7.1 Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space and let $\mathcal{G} \subseteq \Sigma$ be an independent family of events. Show that the Dynkin system $\mathcal{D}(\mathcal{G})$ generated by $\mathcal{G}$ is also independent.
7.2 Show that the product map $\bigotimes_{i=1}^n X_i$ introduced in Definition 7.1 (iv) is $\Sigma$-$\bigotimes_{i=1}^n \Sigma_i$-measurable.
7.3 Prove Proposition 7.2.
7.4 Let $L_+$ denote the space of all real-valued nonnegative integrable functions. Show that $(L_+, *)$, with the binary operator $*$ defined by (7.8a), is a semi-ffi-space (over the semifield $\mathbb{R}_+$) in light of Remark 6.2 (ii), Chapter 5.
7.5 Let $\gamma_{\alpha,\sigma^2}$ denote the probability distribution of a normal random variable $X$ with density $f(\alpha, \sigma^2)$ defined in Example 5.10 (iii) and let $\gamma_{\beta,\delta^2}$ be the probability distribution of another normal random variable $Y$. Show that, if $X$ and $Y$ are independent, then $\mathbb{P}_{X+Y} = \gamma_{\alpha+\beta,\,\sigma^2+\delta^2}$.
NEW TERMS: independent random variables; independent family of random variables; product map; joint distribution of random variables; convolution of measures; convolution of measures, properties of; convolution of point masses; convolution of binomial measures; convolution of atomic measures; convolution of Poisson measures; convolution of a function and measure; convolution of functions; sum of independent random variables; sum of independent normal random variables
Chapter 7 Calculus in Euclidean Spaces
This chapter is about Lebesgue integration in Euclidean spaces and deals primarily with change of variables techniques. As a mandatory preliminary and for consistency, it begins (in Section 1) with differentiation. Any standard analysis textbook can serve as an alternative refresher. Although the Euclidean space is the chief application in this chapter, for didactical purposes, we allow ourselves to introduce certain concepts for Banach spaces.

1. DIFFERENTIATION
1.1 Definition. Let $X$ and $X'$ be Banach spaces. A function $F: X \to X'$ is said to satisfy a Lipschitz condition on a subset $O \subseteq X$, if there is a constant $K$, called the Lipschitz constant on $O$, such that
$$\|F(x) - F(y)\| \le K \|x - y\| \quad \text{for all } x, y \in O.$$
□
Clearly, a function $F$ that satisfies a Lipschitz condition on a set $O$ is uniformly continuous on $O$. If the Lipschitz constant is zero, then $F$ = const on $O$.
1.2 Remarks. Let $X = \mathbb{R}^n$ and $X' = \mathbb{R}^m$ both be endowed with Euclidean norms and let $L$ be a linear operator from $\mathbb{R}^n$ into $\mathbb{R}^m$. It is known that any linear operator can be expressed by a matrix. Conversely, any $m \times n$ matrix, say $A$, represents a linear operator, so that $L(x) = Ax$, for each $x \in \mathbb{R}^n$. The Euclidean or Frobenius norm of matrix $A = (a_{ij})$ is defined as
$$|||A|||_e = \Big(\sum_{i=1}^m \sum_{j=1}^n a_{ij}^2\Big)^{1/2}. \tag{1.2}$$
There are a few other norms we are going to use in the sequel. Before we introduce them, note that the notation $|||\cdot|||$ for a matrix norm is used when, besides the usual properties of the norm, a matrix norm is submultiplicative, i.e., if
$$|||AB||| \le |||A|||\;|||B|||. \tag{1.2a}$$
One can show (Problem 1.1) that the Frobenius norm is submultiplicative. The matrix supremum norm is defined as
$$\|A\|_u = \max\{|a_{ij}| : 1 \le i \le m,\ 1 \le j \le n\}. \tag{1.2b}$$
It is not submultiplicative. (A counterexample is $\|A^2\|_u > \|A\|_u^2$, where $A$ is the $2 \times 2$ matrix with $a_{ij} = 1$ for all $i$ and $j$.) The maximum row sum matrix norm is defined as
$$|||A|||_{rs} = \max\Big\{\sum_{k=1}^n |a_{ik}| : 1 \le i \le m\Big\}. \tag{1.2c}$$
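The three norms (1.2), (1.2b), and (1.2c) are simple to compute, which makes the submultiplicativity claims easy to probe on examples. The sketch below uses plain nested lists; the function names and the test matrices `A` and `B` are illustrative choices, and a numerical check on two matrices is of course not a proof.

```python
from math import sqrt

def matmul(A, B):
    """Product of matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def frobenius(A):
    """Euclidean (Frobenius) norm (1.2)."""
    return sqrt(sum(a * a for row in A for a in row))

def sup_norm(A):
    """Matrix supremum norm (1.2b)."""
    return max(abs(a) for row in A for a in row)

def row_sum_norm(A):
    """Maximum row sum norm (1.2c)."""
    return max(sum(abs(a) for a in row) for row in A)

# The counterexample from the text: the 2 x 2 matrix of ones.
ones = [[1, 1], [1, 1]]
ones_sq = matmul(ones, ones)
print(sup_norm(ones_sq), sup_norm(ones) ** 2)   # 2 versus 1

# Submultiplicativity on a sample pair of matrices.
A = [[1, -2], [3, 4]]
B = [[0, 1], [-1, 2]]
AB = matmul(A, B)
print(frobenius(AB) <= frobenius(A) * frobenius(B))
print(row_sum_norm(AB) <= row_sum_norm(A) * row_sum_norm(B))
```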
One can show (Problem 1.2) that the maximum row sum matrix norm is submultiplicative. We will outline the following properties of matrix-vector norms. We assume that $A$ is an $m \times n$ matrix and $x \in \mathbb{R}^n$.

(i) $$\|Ax\|_e \le |||A|||_e\, \|x\|_e. \tag{MN.1}$$
Indeed, with the aid of Hölder's inequality ($p = q = 2$),
$$\|Ax\|_e^2 = \sum_{i=1}^m \Big(\sum_{k=1}^n a_{ik} x_k\Big)^2 \le \sum_{i=1}^m \Big(\sum_{k=1}^n a_{ik}^2\Big)\Big(\sum_{j=1}^n x_j^2\Big) = |||A|||_e^2\, \|x\|_e^2.$$
(ii) $$\|Ax\|_u \le n\, \|A\|_u\, \|x\|_u. \tag{MN.2}$$
(iii) $$\|Ax\|_u \le |||A|||_{rs}\, \|x\|_u. \tag{MN.3}$$
This is easy to verify as well as the other inequalities (see Problem 1.3):
(iv) $$|||A|||_e \le \sqrt{mn}\, \|A\|_u. \tag{MN.4}$$
(v) $$|||A|||_{rs} \le n\, \|A\|_u. \tag{MN.5}$$
If $L$ is a linear operator expressed by an $m \times n$ matrix $A$, then from (MN.1-3) it follows that
$$\|L(x) - L(y)\| = \|A(x - y)\| \le K \|x - y\|,$$
where $\|\cdot\|$ is the Euclidean or supremum norm and $K$ is either $|||A|||_e$, or $n\|A\|_u$ or $|||A|||_{rs}$, respectively. Consequently, $L$ satisfies a Lipschitz condition on $\mathbb{R}^n$ with the respective Lipschitz constants. In particular, it follows that $L$ is uniformly continuous on $\mathbb{R}^n$ with respect to the Euclidean and supremum norms.
□
1.3 Lemma. Let $F: O\,(\subseteq \mathbb{R}^n) \to \mathbb{R}^n$ be a map that satisfies a Lipschitz
condition on $O$. If $N$ is a $\lambda$-negligible subset of $O$, then so is $F_*(N)$.
Proof. Let $C \in \mathcal{S}$ be a semi-open cube in $\mathbb{R}^n$ and $F_*(C)$ its image under $F$. Since $F$ is continuous on $O$, $F_*(C)$ is $d$-bounded and therefore there is a compact cube $C^*$ with each side equal to $\operatorname{diam} F_*(C)$, which contains $F_*(C)$. Let $K$ stand for the Lipschitz constant on $O$. Then, it is easily seen that $\operatorname{diam} F_*(C) \le K \operatorname{diam} C$ with $\operatorname{diam} C = r\sqrt{n}$ and $r$ being the cube edge length. Clearly,
$$\lambda^*(F_*(C)) \le \lambda(C^*) = [\operatorname{diam} F_*(C)]^n \le (K \operatorname{diam} C)^n,$$
where $\lambda^*$ denotes the Lebesgue outer measure on $\mathcal{P}(\mathbb{R}^n)$. Now, since $C$ is a cube, $\lambda(C) = r^n$ and
$$(K \operatorname{diam} C)^n = (K\sqrt{n})^n r^n = (K\sqrt{n})^n \lambda(C). \tag{1.3}$$
If $N$ is a negligible set, then according to Problem 3.18, Chapter 5, for each $\varepsilon > 0$, there is a countable cover of $N$ by disjoint semi-open cubes $\{C_k\}$ such that
$$\sum_{k=1}^\infty \lambda(C_k) < \frac{\varepsilon}{(K\sqrt{n})^n}. \tag{1.3a}$$
Therefore, $N \subseteq \sum_{k=1}^\infty C_k$ and since maps preserve inclusions and unions,
$$F_*(N) \subseteq \bigcup_{k=1}^\infty F_*(C_k).$$
The latter, along with (1.3) and (1.3a), yield that:
$$\lambda^*(F_*(N)) \le \sum_{k=1}^\infty \lambda^*(F_*(C_k)) \le \sum_{k=1}^\infty \lambda(C_k^*) \le \sum_{k=1}^\infty (K \operatorname{diam} C_k)^n = \sum_{k=1}^\infty (K\sqrt{n})^n \lambda(C_k) = (K\sqrt{n})^n \sum_{k=1}^\infty \lambda(C_k) < \varepsilon.$$
We showed that for any $\varepsilon$, $F_*(N)$ can be covered by countably many half-open cubes with the sum of their volumes less than $\varepsilon$. By Lemma 3.6 of Chapter 5, $F_*(N)$ is negligible.
□
The following concept of the derivative was given by Fréchet in 1903, which we first formulate for Banach spaces.
1.4 Definitions.
(i) Let $\Omega$ and $\Omega'$ be Banach spaces and let $O$ be an open set in $\Omega$. A map $F: O \to \Omega'$ is said to be differentiable at a point $x \in O$ if there is a continuous linear operator $L_{(F,x)}: \Omega \to \Omega'$ and a map $o: \Omega \to \Omega'$ such that
$$\lim_{h \to \theta} \frac{o(h)}{\|h\|} = \theta'$$
and
$$F(x + h) = F(x) + L_{(F,x)}(h) + o(h), \quad x + h \in O. \tag{1.4}$$
It is easy to show that if a map $F$ has such an operator $L_{(F,x)}$, then it is unique given $F$ and $x$ (Problem 1.4). The operator $L_{(F,x)}$ is usually denoted by $F'(x)$ or $DF_x$ and is called the derivative (or Fréchet derivative) of $F$ at $x$. Consequently, from (1.4),
$$\lim_{h \to \theta} \frac{F(x + h) - F(x)}{\|h\|} = \lim_{h \to \theta} \frac{DF_x(h)}{\|h\|}. \tag{1.4a}$$
If the function $F$ is differentiable at every point of $O$, it is said to be differentiable on $O$. Then $x \mapsto DF_x$ is evidently a function itself, which is obtained by the application of the operator $D$ to $F$.
(ii) Consider the special case of $\Omega$ and $\Omega'$ being Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Then, at every $x = (u_1, \dots, u_n)^T \in \mathbb{R}^n$, $F(x) = (f_1(x), \dots, f_m(x))^T$. In the above definition, the linear operator $L_{(F,x)}$, as any linear operator in $\mathbb{R}^n$ (recall it is also continuous), is known to be represented by an $m \times n$ matrix, say $M_x$. Therefore, the derivative of $F$ at $x$ is, in this case, a matrix, called the Jacobian matrix, in notation $\partial_F(x)$. Then, (1.4) and (1.4a) can be rewritten as
$$F(x + h) = F(x) + \partial_F(x)\,h + o(h), \quad x \in O, \tag{1.4b}$$
and
$$\lim_{h \to \theta} \frac{F(x + h) - F(x)}{\|h\|} = \lim_{h \to \theta} \frac{\partial_F(x)\,h}{\|h\|}. \tag{1.4c}$$
For $m = n$, the determinant of $\partial_F(x)$ is denoted by $J_F(x)$ and is called the Jacobian.
□
( i ) If F itself is a continuous linear map, then F( x + h ) - F( x) = F( h) and taking o = 0 (zero funct ion) , we get L F ( x ) ( h) = DF x( h ) = F( h ). Therefore, F is everywhere differentiable and for all x, D Fx = F, i.e. , D F x does not depend on x and F coincides with its derivative. In particular, if F acts in the Euclidean space and thus is represented by an
1.
Differentiation
39 1
m x n matrix, say M, then the Jacobian matrix & F ( x ) equals M. ( ii ) Let n = Q' = e([0, 1] , !R ) with norm II X II = sup{ x ( t ) : t E [0, 1] } and let 0 = { X : I I X I I < r} for some r > 0. Define the operator F: 0 � n as F ( x )( t ) = y ( t ) + I K ( t ,s) g ( s , x (s)) d s , (1.5) where K( t ,s) is continuous on [0, 1] 2 and t he partial derivative ( u , v) ( defined on the set R = [0, 1] x !R ) exists and is uniformly con t inuous on R. Then we can show that
�
�!
F ( x + h )( t ) - F ( x )( t ) = I � K(t,s) [g ( s , x ( s) + h ( s)) - g ( s, x ( s))] d s 1 {} = I 0 K ( t , s) a vg ( s , x ( s)) h (s) dx + cp ( x , h ) '
where
I zm . h-.rJ ll cpII(xh, hII ) ll = 0 .
Thus, F is differentiable at x and its derivative satisfies 1 89 ( F '(x) h)( t ) = I 0 K( t , s) 8 (s , x( s ))h ( s) dx . v
(1.5a)
0
1.6 Proposition. Let $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}^m, F = (f_1, \dots, f_m)^T]$ be a function. $F$ is differentiable at an interior point $x$ of $O$ if and only if each component function $f_1, \dots, f_m$ is differentiable at $x$ and in this case
$$F'(x) = (f_1'(x), \dots, f_m'(x))^T.$$
Proof.
(i) Suppose $F$ is differentiable at $x$. Then,
$$F(x + h) - F(x) = (f_1(x + h) - f_1(x), \dots, f_m(x + h) - f_m(x))^T = DF_x(h) + o(h) = \partial_F(x)\,h + o(h) = (\partial_F^1(x), \dots, \partial_F^m(x))^T h + o(h), \tag{1.6}$$
where $\partial_F^i(x)$ is the $i$th row vector of $\partial_F(x)$. The right-hand side of (1.6) can also be written in the form
$$(\partial_F^1(x)\,h + o_1(h), \dots, \partial_F^m(x)\,h + o_m(h))^T,$$
which yields that
$$f_i(x + h) - f_i(x) = \partial_F^i(x)\,h + o_i(h)$$
and, hence, $f_i$ is differentiable at $x$ and its derivative $f_i'(x)$ is expressed by the $1 \times n$ Jacobian matrix $\partial_F^i(x)$. Consequently, we have that $F'(x) = (f_1'(x), \dots, f_m'(x))^T$.
(ii) The converse of the statement is obvious.
□
1.7 Definitions.
(i) Suppose $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}, f]$ is a function. If $f$ is differentiable at $x \in O$ "along the segment $[x, x + te_k]$" parallel to the $x_k$-axis, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, i.e., the limit
$$\lim_{t \to 0} \frac{f(x + te_k) - f(x)}{t}$$
exists, it is called the partial derivative of $f$ with respect to its $k$th coordinate, in notation $\frac{\partial f}{\partial e_k}(x)$. [Note that by fixing all components of vector $x$ except for $x_k$, in the above limit, the partial derivative $\frac{\partial f}{\partial e_k}(x)$ is nothing else but the usual Newton-Leibniz derivative.]
(ii) We can analogously define the $k$th partial derivative of a vector function $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}^m, F = (f_1, \dots, f_m)^T]$ as
$$\frac{\partial F}{\partial e_k}(x) = \lim_{t \to 0} \frac{F(x + te_k) - F(x)}{t},$$
if the limit on the right exists. In light of Proposition 1.6 ($h = te_k$), the $k$th partial derivative $\frac{\partial F}{\partial e_k}(x)$ of $F$ is
$$\frac{\partial F}{\partial e_k}(x) = \Big(\frac{\partial f_1}{\partial e_k}(x), \dots, \frac{\partial f_m}{\partial e_k}(x)\Big)^T \tag{1.7}$$
and it exists if and only if the corresponding partial derivatives of all its component functions exist.
□
Suppose $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}, f]$ is a function differentiable at a point $x \in O$. Therefore, $f'(x)$ exists and from (1.4a),
$$\lim_{h \to \theta} \frac{f(x + h) - f(x)}{\|h\|} = \lim_{h \to \theta} \frac{f'(x)(h)}{\|h\|}. \tag{1.8}$$
In particular, if $h = te_k$, where $t$ is a real scalar and $e_k$ is the $k$th basis vector of $\mathbb{R}^n$, $h$ is the increment of $x$ taken along the segment of a line
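parallel to the $x_k$-axis. Then, $\|h\| = |t|$ and, since $f'$ is linear,
$$\frac{\partial f}{\partial e_k}(x) = \lim_{t \to 0} \frac{f(x + te_k) - f(x)}{t} = f'(x)(e_k). \tag{1.8a}$$
From (1.8a) it follows that $\frac{\partial f}{\partial e_k}(x)$ equals the scalar product of $f$'s Jacobian matrix $\partial_f(x)$ and the $k$th basis vector $e_k$. If $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}^m, F]$ is a vector function differentiable at an interior point $x$ of $O$, then Proposition 1.6 and (1.8a) yield
$$\frac{\partial F}{\partial e_k}(x) = F'(x)(e_k) = \partial_F(x)\,e_k. \tag{1.8b}$$
Thus, if $F$ is differentiable at $x$, all its partial derivatives exist and are determined by formula (1.8b). In particular, (1.8b) reveals the nature of the Jacobian matrix $\partial_F(x)$. Namely, from (1.8b) and (1.7) it follows that
$$\partial_F(x) = \begin{pmatrix} \frac{\partial f_1}{\partial e_1}(x) & \cdots & \frac{\partial f_1}{\partial e_n}(x) \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial e_1}(x) & \cdots & \frac{\partial f_m}{\partial e_n}(x) \end{pmatrix}. \tag{1.8c}$$
The $k$th column of $\partial_F(x)$ is $\frac{\partial F}{\partial e_k}(x)$ and therefore,
$$\partial_F(x) = \Big(\frac{\partial F}{\partial e_1}(x), \dots, \frac{\partial F}{\partial e_n}(x)\Big). \tag{1.8d}$$
□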
The above can be summarized as the following theorem.
1.8 Theorem. Let $[O\,(\subseteq \mathbb{R}^n), \mathbb{R}^m, F]$ be a function differentiable at a point $x \in O$ (an interior point). Then, all its partial derivatives exist and its Jacobian matrix $\partial_F(x)$ is equal to
$$\Big(\frac{\partial f_i}{\partial e_k}(x);\ i = 1, \dots, m;\ k = 1, \dots, n\Big).$$
□
1.9 Definition. Let $O$ be an open set in $\mathbb{R}^n$. A function $[O, \mathbb{R}^m, F]$ is said to be continuously differentiable on $O$ or a $\mathcal{C}^1(O, \mathbb{R}^m)$-function if $F$ is differentiable on $O$, and all of its partial derivatives $\frac{\partial F}{\partial e_1}, \dots, \frac{\partial F}{\partial e_n}$ exist and are continuous on $O$. Note that $F$ is a $\mathcal{C}^1$-map if and only if $F$ is differentiable and $F'$ is continuous on $O$.
□
1.10 Examples.
(i) If $F \in \mathcal{C}^1(O, \mathbb{R}^m)$ and $m = n$, then the Jacobian $J_F$ is obviously a continuous function on $O$.
( ii) It can be easily verified that F (x , y )
=
( ':+�i )
is a $\mathcal{C}^1(\{(x,y) \in \mathbb{R}^2 : x = y\}^c, \mathbb{R}^2)$-function.
The following is the chain rule holding in Banach spaces.
1.11 Theorem (Chain Rule). Let $\Omega$, $\Omega_1$, and $\Omega_2$ be Banach spaces and let $H: O\,(\subseteq \Omega) \to \Omega_1$ and $G: O_1\,(\subseteq \Omega_1) \to \Omega_2$ be maps such that $H(O) \subseteq O_1$. Let $H$ be differentiable at $x \in O$ and $G$ be differentiable at $H(x)$. Then the composed map $G \circ H$ is a differentiable function at $x$ and
$$(G \circ H)'(x) = G'(H(x))(H'(x)). \tag{1.11}$$
Proof. By the assumption of differentiability,
$$H(x + h) = H(x) + DH_x(h) + o_H(h)$$
and
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}(H(x + h) - H(x)) + o_G(H(x + h) - H(x)).$$
Substituting the expression for $H(x + h) - H(x) = DH_x(h) + o_H(h)$ we have
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}(DH_x(h) + o_H(h)) + o_G(H(x + h) - H(x)).$$
By linearity of $DG_{H(x)}$,
$$G(H(x + h)) = G(H(x)) + DG_{H(x)}(DH_x(h)) + DG_{H(x)}(o_H(h)) + o_G(H(x + h) - H(x)).$$
Now, by continuity of $H$, $H(x + h) - H(x) \to \theta_1$ when $h \to \theta$, and by linearity and continuity of $DG_{H(x)}$,
$$\lim_{h \to \theta} \frac{DG_{H(x)}(o_H(h))}{\|h\|} = DG_{H(x)}\Big(\lim_{h \to \theta} \frac{o_H(h)}{\|h\|}\Big) = \theta_2.$$
Moreover, since $\|H(x + h) - H(x)\| \le (\|DH_x\| + 1)\|h\|$ for all sufficiently small $h$, we also have $o_G(H(x + h) - H(x))/\|h\| \to \theta_2$ as $h \to \theta$. Therefore,
$$G \circ H(x + h) = G \circ H(x) + DG_{H(x)} \circ DH_x(h) + o_{G \circ H}(h).$$
□
1.12 Corollary. In the condition of Theorem 1.11, let $\Omega = \mathbb{R}^n$, $\Omega_1 = \mathbb{R}^m$, and $\Omega_2 = \mathbb{R}^l$. Then,
$$\partial_{G \circ H}(x) = \partial_G(H(x))\,\partial_H(x). \tag{1.12}$$
□
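In Euclidean spaces the chain rule says that the Jacobian matrix of a composition is the product of the Jacobian matrices. The sketch below compares that product with a central-difference approximation for one pair of maps; the maps `H` and `G`, the evaluation point, and the step size are illustrative assumptions.

```python
def H(u, v):
    """Sample map R^2 -> R^3."""
    return (u * v, u + v, u * u)

def G(a, b, c):
    """Sample map R^3 -> R^2."""
    return (a + b * c, a * a)

def jac_H(u, v):
    return [[v, u], [1.0, 1.0], [2 * u, 0.0]]

def jac_G(a, b, c):
    return [[1.0, c, b], [2 * a, 0.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numeric_jacobian(f, x, step=1e-6):
    """Central differences; column k approximates the k-th partial derivative."""
    m = len(f(*x))
    J = [[0.0] * len(x) for _ in range(m)]
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += step
        xm[k] -= step
        fp, fm = f(*xp), f(*xm)
        for i in range(m):
            J[i][k] = (fp[i] - fm[i]) / (2 * step)
    return J

x = (1.0, 2.0)
chain = matmul(jac_G(*H(*x)), jac_H(*x))          # product of Jacobian matrices
numeric = numeric_jacobian(lambda u, v: G(*H(u, v)), x)
print(chain)
print(numeric)
```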
1.13 Theorem (The Mean Value Theorem). Let $F: \mathbb{R}^n \to \mathbb{R}^m$ be differentiable on a convex set $O$. Then, for any $x$ and $y \in O$, there is a point $\eta$, which belongs to the line segment $S(x,y)$ between $x$ and $y$, such that
$$F(y) - F(x) = F'(\eta)(y - x). \tag{1.13}$$
Proof. Let $x, y \in O$. Denote $g(t) = ty + (1 - t)x$ for $0 \le t \le 1$. Then, the function $g$ represents the segment $S(x,y)$ and $F \circ g$ will let the function $F$ run over the segment $S(x,y)$. By the chain rule, the function $\Phi = F \circ g$ is evidently differentiable on the segment $[0,1]$ and by (1.11),
$$\Phi'(t) = (F \circ g)'(t) = F'(g(t))(g'(t)) = F'(g(t))(y - x).$$
Now, applying to $\Phi: [0,1] \to \mathbb{R}$ the Mean Value Theorem known from standard analysis we conclude that there is a point $\xi \in (0,1)$ such that $\Phi'(\xi) = \Phi(1) - \Phi(0) = F(y) - F(x)$. Taking $\eta = g(\xi)$, we prove the above statement.
□
1.14 Corollary. Let $O \subseteq \mathbb{R}^n$ be a convex open set and $F \in \mathcal{C}^1(O, \mathbb{R}^m)$. Then $F$ satisfies a Lipschitz condition on any convex compact subset $B$ of $O$ with Lipschitz constant $K = \sup\{|||\partial_F(z)|||_e : z \in B\}$.
Proof. From (1.13) and (MN.1),
$$\|F(y) - F(x)\|_e \le \|y - x\|_e\, |||F'(\eta)|||_e, \quad \eta \in S(x,y) \subseteq B. \tag{1.14}$$
In particular,
$$\|F(y) - F(x)\|_e \le \sup\{|||\partial_F(z)|||_e : z \in B\}\, \|y - x\|_e. \tag{1.14a}$$
□
The following result will also be useful in the sequel.
1.15 Corollary. Let $B$ be a convex compact subset of an open set $O \subseteq \mathbb{R}^n$ and $F \in \mathcal{C}^1(O, \mathbb{R}^m)$. Then $F$ satisfies a Lipschitz condition on $B$ with respect to the supremum norm and with Lipschitz constant $K = \sqrt{n}\, \sup\{|||\partial_F(z)|||_e : z \in B\}$.
Proof. Let $F = (f_1, \dots, f_m)^T$. By (1.14) and then by (MN.4) for $m = 1$,
$$|f_i(y) - f_i(x)| \le \|f_i'(\eta)\|_e\, \|y - x\|_e \le \|y - x\|_e \Big[\sum_{j=1}^m \sum_{k=1}^n \Big(\frac{\partial f_j}{\partial e_k}(\eta)\Big)^2\Big]^{1/2} \le \sqrt{n}\, |||\partial_F(\eta)|||_e\, \|y - x\|_u, \quad \eta \in S(x,y) \subseteq B,$$
which obviously yields that
$$\|F(y) - F(x)\|_u \le \sqrt{n}\, \sup\{|||\partial_F(z)|||_e : z \in B\}\, \|y - x\|_u \tag{1.15}$$
and thereby the statement of this corollary.
□
The following is a modification of Lemma 1.3.
1.16 Lemma. Let $F: O\,(\subseteq \mathbb{R}^n) \to \mathbb{R}^n$ be a $\mathcal{C}^1$ map and $O$ an open set. If $N$ is a negligible subset of $O$, then so is also $F_*(N)$.
Proof. Since $(\mathbb{R}^n, \tau_e)$ is second countable and since open rectangles with rational coordinates are a countable base for $(\mathbb{R}^n, \tau_e)$ (see Example 2.8 (i), Chapter 3), $O$ can be represented by a union of such rectangles $R_k$. Because $F'$ is continuous on $O$ and each $R_k$ is convex and bounded, it follows from (1.14a) that $F$ satisfies a Lipschitz condition on $R_k$ with $K_k = \sup\{|||F'(z)|||_e : z \in R_k\}$ being a Lipschitz constant on $R_k$. Since $N \cap R_k$ is negligible, by Lemma 1.3, $F_*(N \cap R_k)$ is also negligible. This yields that $F_*(N)$ is negligible as the countable union of the sets $F_*(N \cap R_k)$.
□
1.17 Definition. Let $O$ and $O'$ be open subsets of $\mathbb{R}^n$ and $[O, O', F]$ be a $\mathcal{C}^1$-map. $F$ is called a diffeomorphism or diffeomorphic or $\mathcal{C}^1$-invertible,
if:
(i) $F$ is bijective.
(ii) $[O', O, F^{-1}]$ is a $\mathcal{C}^1$-map.
□
The following is a version of the Inverse Mapping Theorem, which can be found in many standard analysis books, such as one by Tom Apostol [1974].
1.18 Theorem (Inverse Mapping Theorem). Let $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ be a $\mathcal{C}^1$-map and let $J_F(x) \ne 0$ for some $x \in O$. Then:
(i) there are open sets $U \subseteq O$ and $V \subseteq \mathbb{R}^n$ such that $x \in U$, $F(x) \in V$, and $[U, V, F]$ is bijective;
(ii) $[V, U, F^{-1}]$ is a $\mathcal{C}^1$-map;
(iii) $\partial_{F^{-1}}(F(x))\, \partial_F(x) = I$.
□
1.19 Remarks. The Inverse Mapping Theorem tells us that:
(i) $[U, V, F]$ is a diffeomorphism.
(ii) If $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ is a $\mathcal{C}^1$-map and $J_F(x) \ne 0$ on $O$, then $[O, F_*(O), F]$ is a diffeomorphism.
□
1.20 Example. Let $[O \subseteq \mathbb{R}^n, O' \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. We show that for each $x \in O$,
$$(F^{-1})'(F(x))(F'(x)) = D(F^{-1})_{F(x)}\, DF_x = I. \tag{1.20}$$
As the identity map, $1 = F^{-1} \circ F$, $D1_x = D(F^{-1} \circ F)_x = I$ (see Example 1.5 (i)). On the other hand, by the chain rule (Theorem 1.11),
$$D(F^{-1} \circ F)_x = D(F^{-1})_{F(x)}\, DF_x. \tag{1.20a}$$
In terms of Jacobian matrices the same results read
$$\partial_{F^{-1}}(F(x))\, \partial_F(x) = I. \tag{1.20b}$$
The latter yields the following:
$$(\partial_F(x))^{-1} = \partial_{F^{-1}}(F(x)) \tag{1.20c}$$
and thus,
$$\frac{1}{J_F(x)} = J_{F^{-1}}(F(x)). \tag{1.20d}$$
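Relations (1.20b) and (1.20d) can be checked on a concrete diffeomorphism. The sketch below uses the polar-coordinate map $F(r,t) = (r\cos t, r\sin t)$ on $(0,\infty) \times (-\pi,\pi)$, whose inverse is $(x,y) \mapsto (\sqrt{x^2 + y^2}, \operatorname{atan2}(y,x))$; this choice of map, the evaluation point, and the helper names are assumptions made for illustration only.

```python
from math import cos, hypot, sin

def jac_F(r, t):
    """Jacobian matrix of F(r, t) = (r cos t, r sin t)."""
    return [[cos(t), -r * sin(t)], [sin(t), r * cos(t)]]

def jac_F_inv(x, y):
    """Jacobian matrix of the inverse map (x, y) -> (hypot(x, y), atan2(y, x))."""
    rho = hypot(x, y)
    rho2 = x * x + y * y
    return [[x / rho, y / rho], [-y / rho2, x / rho2]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

r, t = 2.0, 0.7
x, y = r * cos(t), r * sin(t)            # the image point F(r, t)
A = jac_F(r, t)
B = jac_F_inv(x, y)
identity = matmul(B, A)                   # (1.20b): should be the identity matrix
print(identity)
print(det(A), det(B))                     # (1.20d): reciprocal Jacobians
```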
(1.20b-d) imply that if $[O \subseteq \mathbb{R}^n, O' \subseteq \mathbb{R}^n, F]$ is a diffeomorphism, then both $J_F \ne 0$ on $O$ and $J_{F^{-1}} \ne 0$ on $O'$.
□
1.21 Proposition. Let $[O, O', F]$ be a diffeomorphism in $\mathbb{R}^n$ and $A$ be a subset of $O$. Then, $A$ is Lebesgue measurable if and only if $F_*(A)$ is also.
Proof.
(i) Let $\mathcal{L}_O$ be the trace $\sigma$-algebra of Lebesgue measurable sets on $O$ and $\mathcal{L}_{O'}$ the corresponding trace Lebesgue $\sigma$-algebra in $O'$. If $A \in \mathcal{L}_O$, then, by Corollary 2.18, Chapter 5, there is a Borel superset $B$ of $A$ from the trace Borel $\sigma$-algebra $\mathfrak{B} \cap O$, such that $\lambda(B \setminus A) = 0$, i.e., $B \setminus A$ is $\lambda$-negligible. Since $F$ is a $\mathcal{C}^1$ map, by Lemma 1.16, $F_*(B \setminus A)$ is also negligible. Therefore, since $(\mathcal{L}_{O'}, \lambda)$ is complete, $F_*(B \setminus A) \in \mathcal{L}_{O'}$. On the other hand, since $F$ is a homeomorphism, it preserves all set operations and
$$F_*(B \setminus A) = F_*(B) \setminus F_*(A).$$
By Problem 3.5, Chapter 4, $F_*(B)$ is Borel, thus, $F_*(A)$ is a Lebesgue measurable set and we have that
$$F_{**}(\mathcal{L}_O) \subseteq \mathcal{L}_{O'}. \tag{1.21}$$
(ii) Because $F$ is diffeomorphic, $F^{**}(\mathcal{L}_{O'}) \subseteq \mathcal{L}_O$ and, in addition,
$$F_{**} \circ F^{**} = 1^{**} \ \text{(identity)}. \tag{1.21a}$$
Consequently, from (1.21a), $\mathcal{L}_{O'} \subseteq F_{**}(\mathcal{L}_O)$ and this, along with (1.21), yields
$$F_{**}(\mathcal{L}_O) = \mathcal{L}_{O'} \tag{1.21b}$$
and thus the assertion.
□
1.22 Remarks.
(i) Let $F$ be a homeomorphic map. From Problem 3.5, Chapter 4, $F_{**}(\mathcal{B}) = \mathcal{B}$ is a Borel $\sigma$-algebra in $\mathbb{R}^n$ and, therefore, the image measure $\mu = \lambda F_*$ is a Borel measure. For $B$ being a compact set, $F_*(B)$ is also compact and thus $\mu$ is a Borel-Lebesgue-Stieltjes measure.
(ii) If $F$ is diffeomorphic, then from Lemma 1.16 and (i), it follows that $\mu \ll \lambda$. □
PROBLEMS

1.1 Show that the Frobenius norm is submultiplicative.
1.2 Show that the maximum row sum matrix norm is submultiplicative.
1.3 Prove inequalities (MN.2-MN.5).

In the problems below, unless specified otherwise, we assume that a function $F: \Omega \to \Omega'$ has Banach spaces as the domain and codomain.

1.4 Show that given $F$ and $x$, the linear operator $L_F(x)$ in (1.4) is unique.
1.5 Let $F: \Omega \to \Omega'$ be a constant function. Show that $F$ is differentiable everywhere on $\Omega$ and that for all $x \in \Omega$, $DF_x = 0$.
1.6 Show that if $F: \Omega \to \Omega'$ is differentiable at $x$, then it is continuous at $x$.
1.7 Show that the derivative is a linear operator acting on the set of all functions $F: \Omega \to \Omega'$ differentiable at a point $x$.
1.8 Let $\Omega$, $\Omega_1$, $\Omega_2$, and $\Omega'$ be Banach spaces, let $F: O \to \Omega_1$ and $G: O \to \Omega_2$ be functions differentiable at $x \in O$ (where $O$ is an open set), and let $F \otimes G: \Omega_1 \times \Omega_2 \to \Omega'$. Show that the product function $F \otimes G$ is differentiable at $x$ and $D(F \otimes G)_x = DF_x\, G(x) + F(x)\, DG_x$.
1.9 Let $[\mathbb{R}^n, \mathbb{R}^n, L]$ be a linear map given by a regular matrix $M$. Show that there is a positive real number $\alpha$ such that, for all $x \in \mathbb{R}^n$,
$\| Mx \|_e \le \alpha \| x \|_e,$  (P1.9)
$\| Mx \|_e \le \beta \| x \|_u,$  (P1.9a)
$\| Mx \|_e \ge \frac{1}{\alpha} \| x \|_e,$  (P1.9b)
and
$\| M^{-1}x \|_e \ge \frac{1}{\beta} \| x \|_u.$  (P1.9c)

1.10 Let $[O \subseteq \mathbb{R}^n, \mathbb{R}^m, F]$ be a $C^1$-function, where $O$ is an open set, and $x_0 \in O$. Prove that for each $\varepsilon > 0$, there is an open ball $B_e(x_0,\delta) \subseteq O$ or $B_u(x_0,\delta) \subseteq O$ such that

$\| (F'(x) - F'(x_0))(h) \|_e \le \varepsilon \| h \|_u$  (P1.10)
or

$\| (F'(x) - F'(x_0))(h) \|_u \le \varepsilon \| h \|_u,$  (P1.10a)

respectively.

1.11 In the conditions of Problem 1.10, let $O$ be a convex set. Prove that for each $\varepsilon > 0$, there is an open ball $B_e(x_0,\delta) \subseteq O$ or $B_u(x_0,\delta) \subseteq O$ such that

$\| F(x+h) - F(x) - DF_x(h) \|_e \le \varepsilon \| h \|_u$  (P1.11)

for all $x \in B_e(x_0,\delta)$ and $h \in \mathbb{R}^n$ such that $x + h \in B_e(x_0,\delta)$, or

$\| F(x+h) - F(x) - DF_x(h) \|_u \le \varepsilon \| h \|_u$  (P1.11a)

for all $x \in B_u(x_0,\delta)$ and $h \in \mathbb{R}^n$ such that $x + h \in B_u(x_0,\delta)$, respectively.

1.12 Let $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ be a $C^1$-function, where $O$ is an open set, and $x_0 \in O$ such that the Jacobian $J_F(x_0) \neq 0$. Prove that there is an open ball $B_e(x_0,\delta) \subseteq O$ such that for all $y \in B_e(x_0,\delta)$,  (P1.12)
$[B_e(x_0,\delta), \mathbb{R}^n, F]$ is one-to-one.  (P1.12a)

1.13 Let $[O \subseteq \mathbb{R}^n, \mathbb{R}^n, F]$ be a diffeomorphism. Show that for each $x_0 \in O$,

$DF_{x_0} \circ D(F^{-1})_{F(x_0)} = 1$

or, equivalently,

$F'(x_0)\,(F^{-1})'(F(x_0)) = 1.$

1.14 Show that if $[\mathbb{R}^n, \mathbb{R}^m, F]$ is differentiable, then $\{ x \in \mathbb{R}^n : ||| F'(x) |||_e < a \}$ is an open set in $\mathbb{R}^n$.
1.15 Under the condition of Problem 1.14, is $\{ x \in \mathbb{R}^n : ||| F'(x) |||_u < a \}$ an open set?
NEW TERMS:
Lipschitz condition
Lipschitz constant
Euclidean (Frobenius) norm of a matrix
Frobenius (Euclidean) norm of a matrix
submultiplicative property of a matrix norm
matrix supremum norm
maximum row sum matrix norm
differentiable map
derivative of a map
Frechet derivative
Jacobian matrix
Jacobian
partial derivative
continuously differentiable function
chain rule in Banach spaces
chain rule in Euclidean spaces
Mean Value Theorem
diffeomorphism
Inverse Mapping Theorem
2. CHANGE OF VARIABLES
2.1 Lemma. Let $L$ be a linear operator from $\mathbb{R}^n$ to $\mathbb{R}^n$ expressed by a regular matrix $M$ and let $C$ be the compact unit cube spanned by the basis vectors in $\mathbb{R}^n$. Then it holds true that

$\lambda(L_*(C)) = |\det M|.$  (2.1)

Proof.
(i) We will refer to the linear operator $L$ as elementary if the corresponding matrix $M$ is regular and of one of the following three types:

Type 1. $M$ is derived from the $n \times n$ unit matrix $I$ by replacing the $i$th element on the main diagonal with a nonzero real number $c$.
Type 2. $M$ is obtained from the $n \times n$ unit matrix $I$ by interchanging the columns $i$ and $j$.
Type 3. $M$ is obtained from the $n \times n$ unit matrix $I$ by replacing, in its column $i$, the element $e_{ji} = 0$ with the element $m_{ji} = 1$.

In all types above we assume $i,j = 1,\ldots,n$ and $i \neq j$. Clearly, if $x \in \mathbb{R}^n$ is a column vector, then $L(x) = Mx$ stipulates the following transformation rules for $x$:
For type 1, the $i$th entry of $x$ is multiplied by $c$ and the rest of the entries are left unchanged.
For type 2, the entries $x_i$ and $x_j$ are interchanged and the rest of the entries remain unchanged.
For type 3, the entry $x_i$ is replaced by $x_i + x_j$ and the other entries are left unchanged.

(ii) We first show that $\mu(C) = \lambda L_*(C) = |\det M|$ if $L$ is an elementary operator. Remember that $C$ is the closed unit cube spanned by the basis vectors $e_1,\ldots,e_n$ and expressed as the Cartesian product $[0,1]^n$. Consequently, when mapping $C$ by $L_*$ we apply $L$ to each of its points $x = t_1 e_1 + \cdots + t_n e_n$, where $t_i \in [0,1]$. Therefore, by the above rules we have:

Type 1.
$L_*(C) = [0,1] \times \cdots \times \underbrace{[0,c]}_{i\text{th edge}} \times \cdots \times [0,1]$, if $c > 0$,

and

$L_*(C) = [0,1] \times \cdots \times \underbrace{[c,0]}_{i\text{th edge}} \times \cdots \times [0,1]$, if $c < 0$.

The edges $e_1,\ldots,e_n$ of $C$ are transformed onto $e_1,\ldots,ce_i,\ldots,e_n$, and the volume $\lambda(L_*(C))$ equals $|c|$. This is the same value as that of $|\det M|$.

Type 2.
In this case, the edges $e_i$ and $e_j$ are interchanged, and therefore, the shape of the cube remains the same. The volume $\lambda(L_*(C))$ is the same as $\lambda(C) = 1 = |-\det I| = |\det M|$.

Type 3.
The edges of $C$ will be transformed onto $(e_1,\ldots,\underbrace{e_i + e_j}_{i\text{th edge}},\ldots,e_n)$, which span a parallelotope whose sides parallel to the $x_i x_j$-plane are rhombi and whose other sides are squares. Taking, for convenience, $i = 1$ and $j = 2$, the volume of $L_*(C)$ can be calculated by using Fubini's theorem as follows:

$\lambda(L_*(C)) = \int_{L_*(C)} d\lambda^n(x_1,\ldots,x_n) = \underbrace{\int_0^1 \cdots \int_0^1}_{n-2} \left( \int_0^1 \int_{x_2}^{x_2+1} dx_1\, dx_2 \right) dx_3 \cdots dx_n.$

This reduces to 1, as it is easy to see. On the other hand, it is also the same quantity as $|\det M| = \det(e_1,\ldots,e_i + e_j,\ldots,e_n)$.

(iii) Now, if instead of a cube we have a compact rectangle $R$, i.e. a parallelotope with its edges parallel to the coordinate axes and possibly translated, by similar arguments as in (i-ii) we obtain that

$\lambda L_*(R) = |\det M|\, \lambda(R),$  (2.1a)

if $L$ is an elementary linear operator. (See Problem 2.1, where the validity of (2.1a) is to be shown.)
(iv) Let $P$ be a compact parallelotope in $\mathbb{R}^n$. Since the boundary $\partial P$ of $P$ consists of parallelograms, each of which has dimension less than $n$, $\lambda(\partial P) = 0$ and, therefore, $\lambda(\mathring{P}) = \lambda(P)$. By Problem 2.10, Chapter 4, as an open set, $\mathring{P}$ can be represented as a countable union of disjoint semi-open cubes:

$\mathring{P} = \sum_{j=1}^{\infty} C_j.$

Therefore, $\lambda(P) = \lambda(\mathring{P}) = \sum_{j=1}^{\infty} \lambda(C_j) < \infty$ and hence for each $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that

$\lambda(P) - \sum_{j=1}^{N} \lambda(C_j) < \frac{\varepsilon}{2}.$  (2.1b)

On the other hand, by Problem 3.22, Chapter 5, for each $\varepsilon > 0$, there is a finite cover of $P$ by disjoint semi-open rectangles $R_1,\ldots,R_r$ such that

$\sum_{i=1}^{r} \lambda(R_i) - \frac{\varepsilon}{2} < \lambda(P) \le \sum_{i=1}^{r} \lambda(R_i).$  (2.1c)

Equations (2.1b) and (2.1c) yield

$\sum_{i=1}^{r} \lambda(R_i) - \frac{\varepsilon}{2} < \lambda(P) < \sum_{j=1}^{N} \lambda(C_j) + \frac{\varepsilon}{2}.$  (2.1d)

Therefore, from (2.1d) we have that

$\sum_{i=1}^{r} \lambda(R_i) - \sum_{j=1}^{N} \lambda(C_j) < \varepsilon.$  (2.1e)

Now $L_*(C) = P$ is a compact parallelotope with the property that for each $\varepsilon > 0$, there is a finite cover of $P$ by semi-open disjoint rectangles and a finite tuple of semi-open disjoint rectangles that can "approximate" $P$ from above and below,

$\sum_{j=1}^{N} C_j \subseteq P \subseteq \sum_{i=1}^{r} R_i.$  (2.1f)

In terms of the Lebesgue measure $\lambda$, this is in accordance with (2.1c-2.1f).

(v) Suppose $L$ is an elementary linear operator. Then, applying $L$ to (2.1f) and evaluating the Lebesgue measure of the resulting inclusion we have

$\lambda L_*\Big(\sum_{j=1}^{N} C_j\Big) \le \lambda L_*(P) \le \lambda L_*\Big(\sum_{i=1}^{r} R_i\Big).$
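From (2.1a), the last inequality can be rewritten as

$|\det M| \sum_{j=1}^{N} \lambda(C_j) \le \lambda L_*(P) \le |\det M| \sum_{i=1}^{r} \lambda(R_i)$

or, with notation $\mathcal{C} = \sum_{j=1}^{N} C_j$ and $\mathcal{R} = \sum_{i=1}^{r} R_i$, in the form

$\lambda(L_*(\mathcal{C})) = |\det M|\, \lambda(\mathcal{C}) \le \lambda L_*(P) \le \lambda(L_*(\mathcal{R})) = |\det M|\, \lambda(\mathcal{R}).$  (2.1g)

On the other hand, replacing $\varepsilon$ in (2.1e) by $\varepsilon / |\det M|$ we get

$|\det M|\, \big( \lambda(\mathcal{R}) - \lambda(\mathcal{C}) \big) < \varepsilon.$  (2.1h)

We conclude that, if $L$ is an elementary operator applied to a compact parallelotope $P$, for each $\varepsilon > 0$, there are a subset $\mathcal{C}$ and a superset $\mathcal{R}$ of $P$ whose images under $L_*$ satisfy inequalities (2.1g-2.1h) and

$\lambda(L_*(\mathcal{C})) = |\det M|\, \lambda(\mathcal{C})$  (2.1i)

and

$\lambda(L_*(\mathcal{R})) = |\det M|\, \lambda(\mathcal{R}).$  (2.1j)

Equations (2.1g-2.1j) yield that

$\lambda(L_*(P)) = |\det M|\, \lambda(P).$  (2.1k)

(vi) If $L$ is a regular linear operator, then, as it is known from linear algebra, $L$ can be expressed as a composition of finitely many elementary operators or, equivalently, $M = M_1 \cdots M_s$, where the $M_i$'s are elementary matrices. (One of the arguments is the Gauss-Jordan algorithm for derivation of the matrix inverse.) The application of $L_* = (L_1 \circ \cdots \circ L_s)_*$, or of any partial composition of $L_1 \circ \cdots \circ L_s$, to $C$ makes it a compact parallelotope such as $P$ above. Consequently,

$\lambda(L_*(C)) = \lambda\big( (L_1)_* ( (L_2 \circ \cdots \circ L_s)_*(C) ) \big)$

and because of (2.1k),

$\lambda(L_*(C)) = |\det M_1| \cdots |\det M_s|\, \lambda(C),$

which finally yields

$\lambda(L_*(C)) = |\det M|.$ □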
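The statement of Lemma 2.1 can be illustrated in the plane, where the image of the unit square under a regular matrix is a parallelogram whose area is computable exactly. The following is a minimal sketch, not part of the original proof: the matrices `SCALE`, `SWAP` and `SHEAR` are hypothetical examples of the three elementary types for $n = 2$, and the shoelace formula is used here as a stand-in for the Lebesgue measure of a polygon.

```python
def apply(M, v):
    """Apply the 2x2 matrix M to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def shoelace_area(poly):
    """Area of a simple polygon given by its vertices in order."""
    s = 0.0
    n = len(poly)
    for k in range(n):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# Vertices of the unit square C = [0,1]^2, listed counterclockwise
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

def image_area(M):
    """Area of L_*(C), the image of the unit square under x -> Mx."""
    return shoelace_area([apply(M, v) for v in UNIT_SQUARE])

SCALE = [[1.0, 0.0], [0.0, -2.5]]   # type 1: second entry multiplied by c = -2.5
SWAP = [[0.0, 1.0], [1.0, 0.0]]     # type 2: entries x1 and x2 interchanged
SHEAR = [[1.0, 1.0], [0.0, 1.0]]    # type 3: x1 replaced by x1 + x2

# A regular matrix built as a product of elementary ones, as in part (vi)
M = matmul(matmul(SCALE, SHEAR), SWAP)
```

For each of these matrices `image_area` agrees with the absolute value of `det`, and for the product `M` both equal 2.5, the product of the three elementary determinants in absolute value.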
2.2 Theorem. Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator specified by matrix $M$. Then, for every Lebesgue measurable set $E$,

$\lambda(L_*(E)) = |\det M|\, \lambda(E).$  (2.2)
Proof.
(i) If $M$ is a singular matrix, then $L$ maps the ($n$-dimensional) set $E$ into $\mathbb{R}^m$, where $m < n$ and, therefore, $L_*(E)$ becomes $\lambda$-negligible. On the other hand, $\det M = 0$ and thus equation (2.2) is valid.

(ii) Suppose $M$ is regular. Then $L$ is diffeomorphic on $\mathbb{R}^n$ and, due to Proposition 1.21, $L_*(E) \in \mathcal{L}^*$. Denote

$\mu_0 = \lambda L_*.$

Then $\mu_0$ is a measure on $\mathcal{B}^*$ and the restriction $\mu$ of $\mu_0$ (which evidently is $\lambda L_*$) from $\mathcal{B}^*$ to the Borel $\sigma$-algebra $\mathcal{B}$ is a Borel-Lebesgue-Stieltjes measure. For every $a \in \mathbb{R}^n$ and $E \in \mathcal{B}$, the set $E + a \in \mathcal{B}$ (why?) and $L_*(E + a) = L_*(E) + Ma$. Since (Proposition 4.3, Chapter 5) the Lebesgue measure $\lambda$ is translation invariant, we have that

$\mu(E + a) = \lambda(L_*(E) + Ma) = \lambda(L_*(E)) = \mu(E),$

which makes $\mu$ also translation invariant on $\mathcal{B}$. Therefore, by Theorem 3.10 (Chapter 5), $\mu = \mu(C)\lambda$, where $C$ is the unit cube in $\mathbb{R}^n$ with the edges along the coordinate axes. By Problem 4.9 (Chapter 5), the outer measure $\mu^*$ (generated by $\mu$) will obey the same relation and

$\mu^* = \mu(C)\lambda^*$ on $\mathcal{P}(\mathbb{R}^n)$,

where $\lambda^*$ is the Lebesgue outer measure, $\mu_0 = \mu(C)\lambda_0$ on $\mathcal{B}^*$ and $\mathcal{B}^* = \mathcal{L}^*$.

The statement of the theorem follows from Lemma 2.1: $\mu(C) = \lambda(L_*(C)) = |\det M|$. □

2.3 Corollary. (Generalization of Proposition 4.3, Chapter 5.) If $L$ is an affine transformation ($L(x) = Mx + b$, where $M$ is an $n \times n$ matrix and $b$ is a vector in $\mathbb{R}^n$), then for every Lebesgue measurable set $E$,

$\lambda(L_*(E)) = |\det M|\, \lambda(E).$  (2.3) □
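Formula (2.3) can be tested approximately for a set that is not a polygon. The sketch below rests on assumptions chosen purely for illustration: $E$ is the open unit disk, the matrix `M` and the shift `b` are arbitrary, and the Lebesgue measure is approximated by counting midpoints of a fine grid (the hypothetical helper `grid_area`), so the results are approximate rather than exact.

```python
import math

M = [[2.0, 1.0], [0.0, 3.0]]        # an arbitrary regular matrix
b = (0.5, -1.0)                     # an arbitrary translation vector
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]

def in_E(x, y):
    """Indicator of E, the open unit disk."""
    return x * x + y * y < 1.0

def preimage(u, v):
    """Solve M x + b = (u, v) for x, using the explicit 2x2 inverse."""
    u -= b[0]
    v -= b[1]
    x = (M[1][1] * u - M[0][1] * v) / det_M
    y = (-M[1][0] * u + M[0][0] * v) / det_M
    return x, y

def in_LE(u, v):
    """Indicator of L_*(E): (u, v) lies in the image iff its preimage lies in E."""
    return in_E(*preimage(u, v))

def grid_area(indicator, lo, hi, n):
    """Approximate the area of a set inside [lo, hi]^2 by counting grid midpoints."""
    h = (hi - lo) / n
    count = 0
    for i in range(n):
        u = lo + (i + 0.5) * h
        for j in range(n):
            v = lo + (j + 0.5) * h
            if indicator(u, v):
                count += 1
    return count * h * h

area_E = grid_area(in_E, -1.5, 1.5, 300)      # close to pi
area_LE = grid_area(in_LE, -5.0, 5.0, 500)    # close to |det M| * pi
```

With these choices `area_LE` and `abs(det_M) * area_E` should agree up to the discretization error of the grid, and the translation by `b` has no effect on the measured area.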
2.4 Lemma. Let $[O (\subseteq \mathbb{R}^n), \mathbb{R}^n, F]$ be a $C^1$-map, $O$ be an open set, $C \subseteq O$ be a compact cube with its edges parallel to the coordinate axes, and $\varepsilon > 0$ be such that for all $x \in C$, $||| d_F(x) - I |||_e < \varepsilon$, where $I$ is the identity matrix. Then it holds true that

$\lambda(F_*(C)) \le (1 + \varepsilon\sqrt{n})^n\, \lambda(C).$  (2.4)
Proof. Denote $\varphi(x) = F(x) - x$. Clearly, $\varphi$ is also a $C^1$-map. By Corollary 1.15, $\varphi$ satisfies a Lipschitz condition on $C$ with respect to the supremum norm:

$\| \varphi(x) - \varphi(x_0) \|_u \le K \| x - x_0 \|_u,$  (2.4a)

with $K = \sqrt{n}\, \sup\{ ||| d_\varphi(z) |||_e : z \in C \}$, where $x, x_0 \in C$ and, obviously, $d_\varphi(x) = d_F(x) - I$. From (2.4a),

$\| F(x) - F(x_0) \|_u \le \| \varphi(x) - \varphi(x_0) \|_u + \| x - x_0 \|_u \le (K + 1) \| x - x_0 \|_u.$  (2.4b)

If $x_0$ is the center of the cube $C$ and $2r$ is the length of its edge, then $\| x - x_0 \|_u \le r$ and

$\| F(x) - F(x_0) \|_u \le r(K + 1).$  (2.4c)

The last inequality tells us that $F(x)$ belongs to the compact cube centered at $F(x_0)$ with edge $2r(K+1)$ or, with respect to the supremum norm, to the ball with radius $r(K+1)$, in notation $B_u(F(x_0), r(K+1))$. In other words, (2.4c) yields that

$F_*(C) \subseteq B_u(F(x_0), r(K+1))$

(see Figure 2.1),
[Figure 2.1: the cube $C$ of half-edge $r$ centered at $x_0$, and its image $F_*(C)$ inside the cube of half-edge $r(K+1)$ centered at $F(x_0)$.]

and because $F_*(C)$ is a Borel set,

$\lambda(F_*(C)) \le \lambda_0(B_u(F(x_0), r(K+1))) = (2r)^n (K+1)^n = (K+1)^n \lambda_0(C).$

Now, if $||| d_F(x) - I |||_e < \varepsilon$ for all $x \in C$, then $K \le \varepsilon\sqrt{n}$ and (2.4) follows. □
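The inclusion behind (2.4c) can be observed on a concrete near-identity map. What follows is a rough numerical sketch under assumptions that are not taken from the text: the map `F`, the amplitude `A`, the cube center `x0` and half-edge `r` are arbitrary choices, and the supremum defining $K$ is replaced by a maximum over a finite sample grid, so `K` is only an approximation from below of the true constant.

```python
import math

A = 0.1                             # size of the perturbation of the identity

def F(x, y):
    """A near-identity C^1 map of the plane: F(x, y) = (x + A sin y, y + A sin x)."""
    return (x + A * math.sin(y), y + A * math.sin(x))

def frob_dF_minus_I(x, y):
    """Frobenius norm of d_F(x, y) - I = [[0, A cos y], [A cos x, 0]]."""
    return math.hypot(A * math.cos(y), A * math.cos(x))

n = 2                               # dimension
x0 = (0.3, -0.2)                    # center of the cube C
r = 0.5                             # half of the edge length of C
N = 50                              # grid resolution used to sample C

# Sample points of the compact cube C = [x0 - r, x0 + r]^2
pts = [(x0[0] - r + 2 * r * i / N, x0[1] - r + 2 * r * j / N)
       for i in range(N + 1) for j in range(N + 1)]

# Sampled stand-in for sup{ |||d_F(z) - I|||_e : z in C }
eps = max(frob_dF_minus_I(px, py) for px, py in pts)
K = math.sqrt(n) * eps              # Lipschitz constant of F(x) - x, as in (2.4a)

Fx0 = F(*x0)
# Largest supremum-norm distance from F(x0) to F(x) over the sample of C
max_dev = max(max(abs(F(px, py)[0] - Fx0[0]), abs(F(px, py)[1] - Fx0[1]))
              for px, py in pts)

bound = r * (K + 1)                 # radius of the enclosing cube in (2.4c)
volume_factor = (1 + eps * math.sqrt(n)) ** n   # the factor appearing in (2.4)
```

On this sample `max_dev` stays below `bound`, which is the content of (2.4c); `volume_factor` is then the constant by which the volume of the image can exceed that of $C$ at most.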
2.5 Proposition. Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. Suppose for some $b > 0$,

$|J_F(x)| = |\det d_F(x)| \le b,$  (2.5)

for all $x \in B$, where $B$ is a Borel subset of $O$. Then,

$\lambda(F_*(B)) \le b\, \lambda(B).$  (2.5a)

Proof.
(i) Suppose $B$ is an open and bounded set such that $\overline{B} \subseteq O$. We prove (2.5a) under the assumption that (2.5) holds true for all $x \in \overline{B}$. Denote
$\Phi(x) = (F^{-1})'(F(x)) = \left[ \dfrac{\partial g_i}{\partial \xi_j}(F(x)) \right]_{i,j=1}^{n},$

where $g_1,\ldots,g_n$ are the components of $F^{-1}$. Since $\Phi(x_0)F(x)$ represents a linear map applied to $F(x)$, by the chain rule,

$[\Phi(x_0)F(x) - Ix]' = \Phi(x_0)F'(x) - I = (F^{-1})'(F(x_0))F'(x) - I.$

By Example 1.20, $(F^{-1})'(F(x_0))(F'(x_0)) = I$. Thus,

$[\Phi(x_0)F(x) - Ix]' = (F^{-1})'(F(x_0))F'(x) - (F^{-1})'(F(x_0))(F'(x_0)) = (F^{-1})'(F(x_0))\,[F'(x) - F'(x_0)]$

and this turns out to be the product of the matrices $(F^{-1})'(F(x_0))$ and $F'(x) - F'(x_0)$. Since the Frobenius norm is submultiplicative (see (1.2a)),

$||| [\Phi(x_0)F(x) - Ix]' |||_e \le ||| (F^{-1})'(F(x_0)) |||_e\, ||| F'(x) - F'(x_0) |||_e = ||| \Phi(x_0) |||_e\, ||| F'(x) - F'(x_0) |||_e.$  (2.5b)

Since $\Phi$ is continuous and $\overline{B}$ is compact, $\Phi$ is bounded on $\overline{B}$ (in terms of the Frobenius norm) and so it is on $B$. Hence, there is an $M > 0$ such that

$||| \Phi(x) |||_e \le M$ for all $x \in \overline{B}$.  (2.5c)

As $F$ is a $C^1$-map, $F'$ is continuous on $\overline{B}$ and, because $\overline{B}$ is compact, $F'$ is therefore uniformly continuous, i.e., for every $\varepsilon > 0$, there is a $\delta > 0$ such that, for all $x, y \in \overline{B}$ with $\| x - y \|_e < \delta$,

$||| F'(x) - F'(y) |||_e < \dfrac{\varepsilon}{M}.$  (2.5d)

Combining (2.5c) and (2.5d) we have from (2.5b) that

$||| [\Phi(x_0)F(x) - Ix]' |||_e < \varepsilon$ given $\| x - x_0 \|_e < \delta$.  (2.5e)
By Problem 2.10, Chapter 4, $B$, as an open set, can be represented as at most a countable union of disjoint semi-open cubes $\{C_k\}$ with edges parallel to the coordinate axes. Obviously, we can assume that the edge of each cube does not exceed $2\delta$ or, otherwise, we can subdivide the edges accordingly. Now, if $x_0$ is the center of such a cube, then $\| x - x_0 \|_u < \delta$ for any $x$ from the cube. From Problem 1.13,

$F'(x_0)\,\Phi(x_0) = I.$

Hence,

$F = F'(x_0)\,\Phi(x_0)\,F.$  (2.5f)

Since $\Phi(x_0)F$ is diffeomorphic (as a composition of regular linear and diffeomorphic maps), $\Phi(x_0)F_*(C_k)$ is a Borel set. Since $F'(x_0)$ is a linear operator, by Theorem 2.2, and from (2.5f),

$\lambda(F_*(C_k)) = \lambda(F'(x_0)\Phi(x_0)F_*(C_k)) = |\det F'(x_0)|\, \lambda(\Phi(x_0)F_*(C_k)).$  (2.5g)

By our assumption, $|\det F'(x)| \le b$ on $B$. By Lemma 2.4, applied to $\Phi(x_0)F$,

$\lambda(\Phi(x_0)F_*(C_k)) \le (1 + \varepsilon\sqrt{n})^n\, \lambda_0(C_k).$

Hence,

$\lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n\, \lambda_0(C_k).$  (2.5h)

Inequality (2.5h) holds for any cube. Now, since $B = \sum_{k=1}^{\infty} C_k$, we have that

$F_*(B) = \sum_{k=1}^{\infty} F_*(C_k)$

and thus

$\lambda(F_*(B)) = \sum_{k=1}^{\infty} \lambda(F_*(C_k)) \le b\,(1 + \varepsilon\sqrt{n})^n \sum_{k=1}^{\infty} \lambda_0(C_k) = b\,(1 + \varepsilon\sqrt{n})^n\, \lambda(B).$

Since the latter holds for every $\varepsilon > 0$, we have that

$\lambda(F_*(B)) \le b\, \lambda(B).$

Hence, given that (2.5) holds true on an open and bounded set $B$, (2.5a) is valid.

(ii) Now we suppose that (2.5) holds true on $O$. Note that $O$ is open but not necessarily bounded. By Problem 6.12, Chapter 3, there is a monotone sequence $\{O_k\}$ of bounded open subsets of $O$, increasing to $O$. By Part (i), for each $O_k$,

$\lambda(F_*(O_k)) \le b\, \lambda(O_k).$

Since $F_*(O) = \bigcup_{k=1}^{\infty} F_*(O_k)$, by continuity from below,

$\lambda(F_*(O)) \le b\, \lambda(O).$
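(iii) Finally, let $B$ be a Borel subset of $O$ on which (2.5) holds true. By regularity of $\lambda$, Problem 3.15 (Chapter 5), for each $\varepsilon > 0$, there is an open superset $O_\varepsilon$ of $B$ such that $\lambda(O_\varepsilon \setminus B) < \varepsilon$ or $\lambda(O_\varepsilon) < \lambda(B) + \varepsilon$. We assume that $O_\varepsilon \subseteq O$, or, otherwise, we take $O_\varepsilon \cap O$ instead. Denote

$\widetilde{O} = O_\varepsilon \cap \{ x \in \mathbb{R}^n : \| F'(x) \| < b + \varepsilon \}.$

$\widetilde{O}$ has the following properties:
1) Since $|\det d_F(x)| \le b$ on $B$, $B \subseteq \widetilde{O}$.
2) Since $\widetilde{O} = O_\varepsilon \cap \{ x \in \mathbb{R}^n : \| F'(x) \| < b + \varepsilon \}$, by Problem 1.14, $\widetilde{O}$ is open.

So, we have that $B \subseteq \widetilde{O} \subseteq O_\varepsilon$. Thus,

$\lambda(F_*(B)) \le \lambda(F_*(\widetilde{O})) \le (b + \varepsilon)\lambda(\widetilde{O}) \le (b + \varepsilon)\lambda(O_\varepsilon) < (b + \varepsilon)[\lambda(B) + \varepsilon].$

This holds true for any $\varepsilon > 0$. Hence it yields the statement. □

2.6 Proposition. Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism. Then for each Borel subset $B$ of $O$,

$\lambda(F_*(B)) = \int_B |J_F(x)|\, \lambda(dx).$  (2.6)

Proof.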
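(i) Let $B$ be a Borel subset of $O$ such that $\lambda(B) < \infty$. Define for each $k = 1,2,\ldots$ and a fixed positive integer $m$,

$B_{mk} = \left\{ x \in B : \frac{k-1}{m} \le |J_F(x)| < \frac{k}{m} \right\}.$

From Proposition 2.5,

$\lambda(F_*(B_{mk})) \le \frac{k}{m}\, \lambda(B_{mk}).$  (2.6a)

From Example 1.20, (1.20d),

$\dfrac{1}{J_F(x)} = J_{F^{-1}}(F(x)).$  (2.6b)

For all $x \in B_{mk}$, $\frac{k-1}{m} \le |J_F(x)|$ $\left( < \frac{k}{m} \right)$ and hence, from (2.6b),

$\left( \frac{m}{k} < \right) |J_{F^{-1}}(F(x))| \le \frac{m}{k-1}$ or $\left( \frac{m}{k} < \right) |J_{F^{-1}}(y)| \le \frac{m}{k-1}$ for all $y \in F_*(B_{mk})$.

If we apply Proposition 2.5 to $F^{-1}$ we will have that

$\lambda(B_{mk}) \le \frac{m}{k-1}\, \lambda(F_*(B_{mk})),$

which along with (2.6a) yields

$\frac{k-1}{m}\, \lambda(B_{mk}) \le \lambda(F_*(B_{mk})) \le \frac{k}{m}\, \lambda(B_{mk}).$  (2.6c)

For all $x \in B_{mk}$,

$\frac{k-1}{m} \le |J_F(x)| < \frac{k}{m}.$  (2.6d)

Integrating (2.6d) we have

$\frac{k-1}{m}\, \lambda(B_{mk}) \le \int_{B_{mk}} |J_F(x)|\, \lambda(dx) \le \frac{k}{m}\, \lambda(B_{mk}).$  (2.6e)

Combining (2.6c) and (2.6e) leads to

$\left| \lambda(F_*(B_{mk})) - \int_{B_{mk}} |J_F(x)|\, \lambda(dx) \right| \le \frac{k}{m}\lambda(B_{mk}) - \frac{k-1}{m}\lambda(B_{mk}) = \frac{1}{m}\lambda(B_{mk}).$

Summing over $k$,

$\left| \sum_{k=1}^{\infty} \left\{ \lambda(F_*(B_{mk})) - \int_{B_{mk}} |J_F(x)|\, \lambda(dx) \right\} \right| = \left| \lambda(F_*(B)) - \int_B |J_F(x)|\, \lambda(dx) \right| \le \frac{1}{m}\lambda(B).$  (2.6f)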
Since by our assumption $\lambda(B) < \infty$, we have from (2.6f) the validity of (2.6) by letting $m \to \infty$.

(ii) If $B$ is an arbitrary Borel set, we can make a countable decomposition $B = \sum_{s=1}^{\infty} B_s$ such that $\lambda(B_s) < \infty$ and get (2.6) by summing up the equations

$\lambda(F_*(B_s)) = \int_{B_s} |J_F(x)|\, \lambda(dx)$

over $s$. □
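Formula (2.6) can be checked on a case where both sides are known in closed form. The sketch below is an illustration under assumptions not found in the text: $F$ is the polar-coordinate map, $B = (1,2) \times (0,\pi/2)$, so that $F_*(B)$ is a quarter of an annulus, and the integral on the right of (2.6) is approximated by a midpoint rule implemented in the hypothetical helper `midpoint_integral`.

```python
import math

def jac_abs(r, theta):
    """|J_F(r, theta)| for the polar map F(r, theta) = (r cos theta, r sin theta)."""
    return abs(r)

def midpoint_integral(f, a1, b1, a2, b2, n):
    """Midpoint-rule approximation of the integral of f over [a1,b1] x [a2,b2]."""
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    total = 0.0
    for i in range(n):
        u = a1 + (i + 0.5) * h1
        for j in range(n):
            v = a2 + (j + 0.5) * h2
            total += f(u, v)
    return total * h1 * h2

# Left side of (2.6): area of the quarter annulus with radii 1 and 2
lhs_exact = math.pi * (2.0 ** 2 - 1.0 ** 2) / 4.0

# Right side of (2.6): integral of |J_F| over B = (1,2) x (0, pi/2)
rhs = midpoint_integral(jac_abs, 1.0, 2.0, 0.0, math.pi / 2, 200)
```

Because the integrand is linear in $r$, the midpoint rule reproduces the integral up to rounding, and both sides equal $3\pi/4$.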
2.7 Remark. Formula (2.6) can be alternatively expressed as follows. Let $B_1$ be a Borel subset of $O_1$ and $B = F^*(B_1)$. Then $B$ is also Borel and $B_1 = F_*(B)$. Applying Proposition 2.6 to such a $B$, we have that

$\lambda(B_1) = \int_{F^*(B_1)} |J_F(x)|\, \lambda(dx).$  (2.7) □

2.8 Theorem. (Change of Variables.) Let $[O \subseteq \mathbb{R}^n, O_1 \subseteq \mathbb{R}^n, F]$ be a diffeomorphism, let $A$ be a Borel subset of $O$ and $A_1 = F_*(A)$. Then for each Borel measurable function $[O_1, \mathbb{R}, g]$,

$\int_{A_1} g(y)\, \lambda(dy) = \int_A g(F(x))\, |J_F(x)|\, \lambda(dx).$  (2.8)

Proof. Let $g = 1_{B_1}$ for some Borel subset $B_1$ of $A_1$ and $B = F^*(B_1)$. Then, by (2.6),

$\int_{A_1} g(y)\, \lambda(dy) = \int_{A_1} 1_{B_1}(y)\, \lambda(dy) = \lambda(B_1) = \lambda(F_*(B))$
$= \int_B |J_F(x)|\, \lambda(dx) = \int_A 1_B(x)\, |J_F(x)|\, \lambda(dx) = \int_A 1_{B_1}(F(x))\, |J_F(x)|\, \lambda(dx)$
$= \int_A g(F(x))\, |J_F(x)|\, \lambda(dx).$  (2.8a)

Thus (2.8) holds true for $g$ being an indicator function. Let $g$ be a simple function, i.e., $g = \sum_{i=1}^{k} a_i 1_{B_i}$, where $\{B_i,\ i = 1,\ldots,k\}$ is a measurable partition of $A_1$. From (2.8a),
$\int_{A_1} g(y)\, \lambda(dy) = \int_{A_1} \sum_{i=1}^{k} a_i 1_{B_i}(y)\, \lambda(dy) = \sum_{i=1}^{k} a_i \int_{A_1} 1_{B_i}(y)\, \lambda(dy)$
$= \sum_{i=1}^{k} a_i \int_A 1_{B_i}(F(x))\, |J_F(x)|\, \lambda(dx) = \int_A g(F(x))\, |J_F(x)|\, \lambda(dx).$
The rest of the proof follows the standard procedure of passing to the class of nonnegative measurable functions and then to $g = g^+ - g^-$. □

2.9 Examples. (Spherical Coordinate Transformation.)
(i) Let $O$ be an open subset of $\mathbb{R}^3$ defined as

$O = \{ (r,\theta,\varphi) \in \mathbb{R}^3 : r > 0,\ 0 < \theta < 2\pi,\ 0 < \varphi < \pi \}$

and let $[O, \mathbb{R}^3, F]$ be defined as

$F = \big( x(r,\theta,\varphi) = r\cos\theta\sin\varphi,\ \ y(r,\theta,\varphi) = r\sin\theta\sin\varphi,\ \ z(r,\theta,\varphi) = r\cos\varphi \big)^T.$  (2.9)

The transformation has the range $\mathbb{R}^3 \setminus D$, where $D = \{ (x,y,z) \in \mathbb{R}^3 : x \ge 0,\ y = 0,\ z \in \mathbb{R} \}$. One can easily see that $F$ is a $C^1$-map on $O$ and its Jacobian $J_F(r,\theta,\varphi) = -r^2\sin\varphi \neq 0$ on $O$. By Remark 1.19 (ii), $[O, F_*(O) = \mathbb{R}^3 \setminus D, F]$ is a diffeomorphism. Such a map transforms the rectangle $[0,\rho] \times [0,2\pi) \times [0,\pi]$ onto the ball $B_e(0,\rho)$, but on this rectangle it obviously fails to be a diffeomorphism. On the other hand, if we take $R = (0,\rho) \times (0,2\pi) \times (0,\pi)$ instead as the domain of $F$, it will transform the open rectangle $R$ onto the open ball $B_e(0,\rho)$ with the deleted sector

$S = \{ (x,y,z) \in \mathbb{R}^3 : x = r\sin\varphi,\ y = 0,\ z = r\cos\varphi,\ 0 < r < \rho,\ 0 < \varphi < \pi \} = \{ (x,y,z) \in \mathbb{R}^3 : x^2 + z^2 < \rho^2,\ 0 < x,\ y = 0 \}.$

The transformation $[R, B_e(0,\rho) \setminus S, F]$, with $F$ defined by (2.9), is clearly a diffeomorphism.

(ii) Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let $g$ be defined as
415
Change of Varia bles
(2.9 a) Let
Be(O , p) be an open ball in IR3 • We will show that
J
B e (O, p )
gdA
(2.9b )
p
= 47r J h(r)r2dr. 0
Consider the transformation [R , Be(O , p)\S ,F] from ( i ) . Since S is a two dimensional set, its Lebesgue measure in IR3 is zero and, consequently,
Now we are going to apply formula
( 2.8) :
J gdA = J g( F ( p )) I Jp (P) I A(d p ) , A A1 with A 1 = Be ( O, p) \S, A = F * (A 1 ) = R = (O, p) ( 0 , 27r ) (0,1r) , p = (r, B , cp) , and I J p ( P ) I = r 2 sincp. Clearly, g(F(p)) = h(r) , which by Fubini's X
X
Theorem leads to
p
=J
2J7r
7r
J
r = O 8 = 0 cp = O
h( r )r2sincpA( dr ) A( dO)A( dcp) .
The last expression reduces to a Riemann integral and this further p reduces to 47r J h(r)r2dr. 0 0
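Formula (2.9b) lends itself to a quick numerical comparison. The sketch below makes assumptions purely for illustration: it takes $h(r) = r^2$ and $\rho = 1$, for which the right-hand side of (2.9b) equals $4\pi/5$, and it evaluates the triple integral in spherical coordinates with a simple midpoint rule. The function names `spherical_integral` and `radial_integral` are hypothetical.

```python
import math

def h(r):
    """Radial profile; g(x, y, z) = h(sqrt(x^2 + y^2 + z^2)) as in (2.9a)."""
    return r * r

def spherical_integral(h, rho, n=40):
    """Midpoint rule for the triple integral of h(r) r^2 sin(phi) over the rectangle R."""
    dr = rho / n
    dth = 2 * math.pi / n
    dph = math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for _ in range(n):          # theta loop; the integrand does not depend on theta
            for k in range(n):
                ph = (k + 0.5) * dph
                total += h(r) * r * r * math.sin(ph)
    return total * dr * dth * dph

def radial_integral(h, rho, n=2000):
    """Midpoint rule for 4 pi times the integral of h(r) r^2 over (0, rho)."""
    dr = rho / n
    s = sum(h((i + 0.5) * dr) * ((i + 0.5) * dr) ** 2 for i in range(n))
    return 4 * math.pi * s * dr

rho = 1.0
triple = spherical_integral(h, rho)     # left side of (2.9b) after the change of variables
radial = radial_integral(h, rho)        # right side of (2.9b)
exact = 4 * math.pi * rho ** 5 / 5      # closed form for h(r) = r^2
```

The values `triple`, `radial` and `exact` should agree up to the discretization error of the midpoint rule; any other continuous `h` can be substituted, in which case only `triple` and `radial` remain comparable.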
PROBLEMS

2.1 Show the validity of (2.1a).
2.2 Let $[\mathbb{R}, \mathbb{R}, h]$ be a continuous function and let $E(0; a_1,a_2,a_3)$ denote the open ellipsoid
Show that
2.3 Show that the volume of the ellipsoid in Problem 2.2 is $\frac{4}{3}\pi a_1 a_2 a_3$.
2.4 Evaluate the integral

$\int_{B_e(0,\rho)} \exp\{ (x^2 + y^2 + z^2)^{3/2} \}\, d\lambda(x,y,z),$

where $B_e(0,\rho)$ is a ball in $\mathbb{R}^3$.

NEW TERMS:
Borel-Lebesgue measure of a cube under a linear map
Lebesgue measure of a set under an affine map
Borel-Lebesgue measure of a Borel set under a diffeomorphism
change of variables in Euclidean spaces
spherical coordinate transformation
volume of an ellipsoid
Part III Further Topics in Integration
Chapter 8
Analysis in Abstract Spaces

This chapter (which is the least focused of the entire text) continues the study of integration started in Chapter 6 and combines seemingly diverse topics from measure, integration, functional analysis, and topology. Having learned about absolute continuity of positive measures, briefly introduced in Chapter 6, Section 5 (which may be sufficient for a first acquaintance), we will render a more thorough analysis of the Radon-Nikodym theory (Section 2) from the position of signed and complex measures (the subject of Section 1). Singularity and Lebesgue decomposition of signed measures are also treated here (Section 3) in a more rigorous fashion. The reader will definitely benefit from having a first look at Chapter 6, Section 5, even though many of its formalities are suppressed. The results on signed measures are then applied to the analysis of $L^p$ spaces (a traditional topic of functional analysis) and a generalization of the Lebesgue Dominated Convergence Theorem (Section 4), followed by convergence of measures (Section 5) and uniform integrability (Section 6). In Section 7, we return to locally compact Hausdorff spaces (started in Sections 10 and 11, Chapter 3) in connection with regularity of Radon measures and the general proof of the Riesz Representation Theorem. The chapter concludes with derivatives of measures (Section 8), which make traditional calculus on the real line (Chapter 9) very powerful.

Besides the Radon-Nikodym Theorem (initially discussed in Chapter 6), $L^p$ spaces and the Riesz Representation Theorem are among the main topics of this chapter. $L^p$ spaces (and their duals) were introduced and studied by the Hungarian Frigyes (Frederic) Riesz (one of the major figures in early functional analysis), who presented in 1910 a fully developed theory of these spaces, operators on them, and their spectral theory.
His widely cited 1909 Representation Theorem (of continuous linear functionals through integrals), initiated by Jacques Hadamard in 1903, was his other major accomplishment, even though he proved this theorem for the special case of Riemann-Stieltjes integrals on $[a,b]$. Consequently, Riesz used no measure theory, although his work made a huge impact on the development of measure theory and integration and, in particular, led Johann Radon to his revolutionary work of 1913.
1. SIGNED AND COMPLEX MEASURES
The situation below motivates the study of a more general class of set functions than those we called "measures." Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f \in L^1(\Omega,\Sigma,\mu;\mathbb{R})$. Define the following set function on $\Sigma$:

$\nu(A) = \int_A f\, d\mu = \nu^+(A) - \nu^-(A),$

where

$\nu^+(A) = \int_A f^+\, d\mu$ and $\nu^-(A) = \int_A f^-\, d\mu.$

The set function $\nu$ has all the properties of a measure ($\sigma$-additivity follows by Lebesgue's Dominated Convergence Theorem) except for being positive. However, in the above decomposition $\nu = \nu^+ - \nu^-$, the set function $\nu$ is represented by the difference of two measures. We will study this type of set function, which we wish to call a signed measure. We give a formal definition below, without saying anything about a decomposition, which is to follow later.

1.1 Definitions.
(i) Let $(\Omega,\Sigma)$ be a measurable space. A set function $\nu: \Sigma \to \overline{\mathbb{R}}$ is called a signed measure if:
a) $\nu(\emptyset) = 0$;
b) for each $A \in \Sigma$, the value of $\nu(A)$ is well defined, i.e. it is either finite or $+\infty$ or $-\infty$;
c) $\nu$ is $\sigma$-additive.
To tell signed measures from nonnegative measures, we will refer to the latter as positive measures. $\mathfrak{S}(\Omega,\Sigma)$ will denote the set of all signed measures on the measurable space $(\Omega,\Sigma)$.

(ii) The signed measure is called finite if its range is a subset of $\mathbb{R}$. Otherwise, it is called infinite. The triple $(\Omega,\Sigma,\nu)$ is called the signed measure space. According to the type of the signed measure, the signed measure space is referred to as finite or infinite. The signed measure $\nu$ is called $\sigma$-finite if $\Omega$ admits a countable measurable partition $\{\Omega_n\}$ of $\nu$-finite sets.

(iii) Sometimes, we will need the notion of a finite set under $\nu$ (or just a $\nu$-finite set). This is a measurable set $A$ with $|\nu(A)| < \infty$. A measurable set $P$ is called $\nu$-positive (or just positive) if $\nu(P \cap A) \ge 0$ for all $A \in \Sigma$. A measurable set $N$ is called $\nu$-negative (or just negative) if $\nu(N \cap A) \le 0$ for all $A \in \Sigma$. Obviously, $P$ ($N$) is positive (negative) if and only if for any measurable subset $E$ of $P$ ($N$), $\nu(E) \ge 0$ ($\le 0$).

(iv) A set function $\nu: \Sigma \to \overline{\mathbb{R}}$ is called continuous from below if for every monotone nondecreasing sequence $\{A_n\}\uparrow\ \subseteq \Sigma$ it holds that

$\lim_{n\to\infty} \nu(A_n) = \nu\Big( \bigcup_{n=1}^{\infty} A_n \Big).$

(v) Let $\{A_n\}$ be a monotone nonincreasing sequence of sets from $\Sigma$ of which at least one is $\nu$-finite. A set function $\nu: \Sigma \to \overline{\mathbb{R}}$ is said to be continuous from above on $\{A_n\}$ if

$\lim_{n\to\infty} \nu(A_n) = \nu\Big( \bigcap_{n=1}^{\infty} A_n \Big).$  (1.1)

The set function $\nu$ is continuous from above on $\Sigma$ if (1.1) holds for every monotone nonincreasing sequence $\{A_n\}\downarrow\ \subseteq \Sigma$ with at least one $\nu$-finite set. In particular, if $\{A_n\} \downarrow \emptyset$, (1.1) reduces to

$\lim_{n\to\infty} \nu(A_n) = 0$

and this is referred to as continuity from above at the empty set or, shortly, $\emptyset$-continuity of $\nu$.

(vi) Any signed measure on the Borel $\sigma$-algebra is called a signed Borel measure. In particular, a signed Borel measure on $(\mathbb{R}^n, \mathcal{B})$, finite on $d_e$-bounded Borel sets, is said to be a signed Borel-Lebesgue-Stieltjes measure. □
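On a finite carrier set the definitions above can be checked by brute force. The sketch below is only a toy model under assumptions introduced here: $\Omega$ has four points, $\Sigma$ is the power set, and $\nu$ is given by summing a hypothetical weight function `f` (the discrete analogue of $\nu(A) = \int_A f\, d\mu$ with counting measure $\mu$).

```python
from itertools import chain, combinations

OMEGA = ('a', 'b', 'c', 'd')
# Hypothetical weights playing the role of the density f
f = {'a': 2.0, 'b': -1.5, 'c': 0.5, 'd': -3.0}

def nu(A):
    """Signed measure of A: the sum of the weights of its points."""
    return sum(f[w] for w in A)

def subsets(S):
    """All subsets of S, as tuples (the power set)."""
    S = tuple(S)
    return chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))

def is_positive(P):
    """P is nu-positive if every measurable subset of P has nonnegative measure."""
    return all(nu(E) >= 0 for E in subsets(P))

def is_negative(N):
    """N is nu-negative if every measurable subset of N has nonpositive measure."""
    return all(nu(E) <= 0 for E in subsets(N))
```

For instance, `('a', 'c')` is a positive set and `('b', 'd')` is a negative set, whereas `('a', 'b')` is neither, although `nu(('a', 'b'))` is positive: positivity of a set is a statement about all of its subsets, not only about the set itself.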
1.2 Remarks.
(i) Notice that (ii) and (iii) imply that if $A_1 + A_2$ is any decomposition of a measurable set $A$ and if $\nu(A_1) = -\infty$, then so is $\nu(A)$ (and $\nu(A_2)$ must not equal $+\infty$) and $\nu(A_2)$ is either finite or $-\infty$; and it is not possible for any subset of $A$ to have its signed measure be $+\infty$, as it would yield $\nu(A) = +\infty$. If $A$ is a finite set under $\nu$, then any finite or countable decomposition of $A$ consists of finite subsets under $\nu$.

(ii) If a sequence of mutually disjoint measurable sets $\{A_n\}$ is such that its union is finite under $\nu$, i.e. $|\nu(\sum_{n=1}^{\infty} A_n)| < \infty$, then $\sigma$-additivity of $\nu$ ($|\nu(\sum_{n=1}^{\infty} A_n)| = |\sum_{n=1}^{\infty} \nu(A_n)|$) implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent. This, as we know, does not hold true for series in the general case. The reader is encouraged to explain this phenomenon. (See Problem 1.1.) □

We will start with a few introductory properties of signed measures.

1.3 Proposition. Let $(\Omega,\Sigma,\nu)$ be a signed measure space.
(i) If $A$ and $B$ are measurable sets such that $B$ is $\nu$-finite and $A \subseteq B$, then $A$ is also $\nu$-finite and $\nu(B \setminus A) = \nu(B) - \nu(A)$.
(ii) The signed measure $\nu$ is continuous from below and from above.
(See Problems 1.2 and 1.3.)

The converse of Proposition 1.3 (ii) is as follows.

1.4 Proposition. Let $\nu$ be a finitely additive set function on the measurable space $(\Omega,\Sigma)$ such that $\nu(\emptyset) = 0$. If $\nu$ is continuous from below, then it is $\sigma$-additive. If $\nu$ is finite, then the continuity from above implies that $\nu$ is $\sigma$-additive. (See Problem 1.4.)

1.5 Lemma. Let $(\Omega,\Sigma,\nu)$ be a signed measure space and let $A \in \Sigma$ be such that $-\infty < \nu(A) < 0$. Then there is a negative subset $N$ of $A$ such that $\nu(N) \le \nu(A)$.

Proof. If $A$ does not contain at least one subset $E$ with $\nu(E) > 0$, $A$ is negative itself and the statement of the lemma is proved. Otherwise, let

$S_0 = \sup\{ \nu(C) : C \subseteq B_0 = A \},$

which, by Proposition 1.3 (i), is finite and by our assumption about $E$ is also positive. Hence, for every $\varepsilon > 0$, there is a set $C_1 \subseteq A$ such that $\nu(C_1) + \varepsilon > S_0 > 0$. Let $\varepsilon = \frac{1}{2}S_0$. Then, $C_1$ is such that $\nu(C_1) > \frac{1}{2}S_0$. Now, if $B_1 = A \setminus C_1$ is $\nu$-negative, then we are done with the proof. Indeed, $\nu(B_1) = \nu(A) - \nu(C_1)$, by Proposition 1.3 (i), and because $\nu(C_1) > 0$, $\nu(B_1) < \nu(A)$. Otherwise, there is at least one subset of $B_1$ whose measure is strictly positive. Continuing with the same procedure, at step $n$ we arrive at the set

$B_n = A \setminus \sum_{k=1}^{n} C_k,$

which is either a $\nu$-negative set satisfying $\nu(B_n) < \nu(A)$ or it admits at least one subset with a positive value under $\nu$. This again leads to a positive real number

$S_n = \sup\{ \nu(C) : C \subseteq B_n \}$

and the existence of a nontrivial set $C_{n+1} \subseteq B_n$ such that $\nu(C_{n+1}) > \frac{1}{2}S_n > 0$. If for no $n$, $B_n$ defined above is negative, then we set

$N = A \setminus \sum_{n=1}^{\infty} C_n = \bigcap_{n=1}^{\infty} B_n.$

We show that $N$ is a negative subset of $A$ claimed in the statement of
the lemma. From

$\nu(A) = \nu(N) + \sum_{n=1}^{\infty} \nu(C_n)$

we see that both $\nu(N)$ and $\sum_{n=1}^{\infty} \nu(C_n)$ are finite. The latter implies that $\nu(C_n)$ and, consequently, $S_n$, dominated by $2\nu(C_{n+1})$, are vanishing. (Notice that, because $\nu(\sum_{n=1}^{\infty} C_n) > 0$, $N \neq \emptyset$.) This in turn yields that $N$ is negative. Indeed, from the definition of $S_n$, for every measurable subset $E$ of $B_n$, $\nu(E) \le S_n$. Since $N \subseteq B_n$, it follows that for every measurable set $D$, $\nu(N \cap D) \le S_n \downarrow 0$. Finally, that $\nu(N) < \nu(A)$ is obvious. □

The following theorem states that there is an (essentially unique) decomposition of the carrier set $\Omega$ into a positive and a negative set relative to a given signed measure $\nu$. This decomposition, referred to as a Hahn decomposition, leads to the upcoming Jordan decomposition of $\nu$ into the difference of two positive measures mentioned in the beginning of this section.

1.6 Theorem (Hahn Decomposition Theorem). Let $(\Omega,\Sigma,\nu)$ be a signed measure space. Then $\Omega$ can be partitioned into two sets, $P$ and $N$, of which $P$ is a positive and $N$ is a negative set, referred to as a Hahn decomposition of $\Omega$ with respect to $\nu$, in notation $(P,N)$. A Hahn decomposition is unique in the following sense. If there is another Hahn decomposition $(P',N')$, then $P \Delta P'$ and $N \Delta N'$ are $\nu$-null sets and therefore all Hahn decompositions form a unique equivalence class.
Proof. We assume without loss of generality that $\nu$ does not take the value $-\infty$. If $\emptyset$ is the only negative set of $\nu$, then for each $A \in \Sigma$, $\nu(A) \ge 0$. (If there is a set $A$ such that $\nu(A) < 0$, then by Lemma 1.5 there would be a nonempty, negative subset of $A$.) Therefore, $(\Omega,\emptyset)$ is the "trivial" Hahn decomposition and we are done with the proof. Otherwise, let

$I = \inf\{ \nu(E) : E \in \Sigma \text{ and } E \text{ is } \nu\text{-negative} \}.$

Clearly, $I \le 0$. Then, there is a sequence $\{N_n\}$ of negative sets with $\lim_{n\to\infty} \nu(N_n) = I$. Because of Problem 1.5,

$N := \bigcup_{n=1}^{\infty} N_n$

is also a negative set. Regarding $B_n$ as $\bigcup_{k=1}^{n} N_k$, we have $\{B_n\}$ as a monotone nondecreasing sequence of negative sets $\uparrow N$ and hence, by Proposition 1.3 (ii), $\lim_{n\to\infty} \nu(B_n) = \nu(N)$. Furthermore, since $B_n \setminus N_n \subseteq B_n$ and $B_n$ is negative, $\nu(B_n \setminus N_n) \le 0$. On the other hand, $\nu(B_n \setminus N_n) = \nu(B_n) - \nu(N_n)$ and thus $\nu(B_n) \le \nu(N_n)$. The latter yields that $\nu(N) \le I$. On the other hand, as for a negative set, $\nu(N) \ge I$, and thus $\nu(N) = I$.

Now we show that $P = N^c$ is a $\nu$-positive set. If this is not the case, then there is at least one measurable subset $A$ of $P$ with $-\infty < \nu(A) < 0$ and then, by Lemma 1.5, there is a measurable, negative subset $B$ of $A$ with $\nu(B) \le \nu(A)$; hence $\nu(B) < 0$. Then, $B + N$ makes a negative set such that $\nu(B + N) = \nu(B) + \nu(N) < \nu(N) = I$, which contradicts the fact that $I$ is the infimum of $\nu$ over all negative sets. The uniqueness of the Hahn decomposition is left for an exercise. (See Problem 1.7.) □

While the Hahn decomposition is a decomposition of the carrier $\Omega$ (with respect to the signed measure $\nu$), the Jordan decomposition below is of the signed measure itself. It states that each signed measure is the difference of two positive measures.

1.7 Corollary (Jordan Decomposition). Let $(\Omega,\Sigma,\nu)$ be a signed measure space. Then $\nu$ can be represented as the difference of two positive measures, of which at least one is finite, and this representation is unique (in the sense that it is invariant of any Hahn decomposition).

Proof. Let $(P,N)$ be a Hahn decomposition of $\Omega$ relative to $\nu$ and define the set functions $\nu^+$ and $\nu^-$ on $\Sigma$ as follows:

$\nu^+(A) = \nu(A \cap P)$ and $\nu^-(A) = -\nu(A \cap N).$  (1.7)

It follows from the definition of $\nu^+$ and $\nu^-$ that both are positive measures on $\Sigma$. It is also obvious why only one of them can be infinite. Hence, $\nu = \nu^+ - \nu^-$ is the Jordan decomposition induced by the Hahn decomposition $(P,N)$.

Suppose that $\mu^+ - \mu^-$ is yet another Jordan decomposition of $\nu$, induced by the Hahn decomposition $(P',N')$. Then, it can be easily shown (and it is left for an exercise; see Problem 1.8) that $\nu^+ = \mu^+$ and $\nu^- = \mu^-$. □

1.8 Definition. The Jordan decomposition of a signed measure $\nu$ defined in Corollary 1.7, due to its uniqueness, suggests the following terms:
$\nu^+$ is called the positive variation of $\nu$;
$\nu^-$ is called the negative variation of $\nu$;
$|\nu| = \nu^+ + \nu^-$ is called the total variation of $\nu$. (As the sum of two positive measures, $|\nu|$ is a positive measure itself.) □

One of the remarkable properties of the Hahn-Jordan decomposition of a signed measure is that it attains its maximum and minimum values on two disjoint measurable subsets of $\Omega$, as stated by the following proposition.
1.9 Proposition. Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Then the positive, negative and total variations of $\nu$ can be represented as follows. Given any measurable set $A \in \Sigma$,

(i) $\nu^+(A) = \sup\{\nu(E) : E \in \Sigma \cap A\}$
(ii) $\nu^-(A) = \sup\{-\nu(E) : E \in \Sigma \cap A\} = -\inf\{\nu(E) : E \in \Sigma \cap A\}$
(iii) $|\nu|(A) = \sup\{\sum_{k=1}^{n} |\nu(E_k)| : \{E_1, \ldots, E_n\} \subseteq \Sigma \text{ and } \sum_{k=1}^{n} E_k \subseteq A\}$.

Proof. Denote by $(P, N)$ a Hahn decomposition of $\Omega$ with respect to $\nu$ and let

$$\nu_{\sup}(A) = \sup\{\nu(E) : E \in \Sigma \cap A\}$$

and

$$\nu_{\inf}(A) = \inf\{\nu(E) : E \in \Sigma \cap A\} = -\sup\{-\nu(E) : E \in \Sigma \cap A\}.$$

(i) Clearly, $\nu^+(A) = \nu(A \cap P) \le \nu_{\sup}(A)$. To prove the inverse inequality we notice that because $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\, \nu)$ is a positive measure space, $\mathrm{Res}_{\Sigma \cap P}\, \nu$ is monotone and hence, for each $E \in \Sigma \cap A$,

$$\nu(E) = \nu(E \cap P) + \nu(E \cap N) \le \nu(E \cap P) \le \nu(A \cap P) = \nu^+(A).$$

This yields the desired inverse inequality and thereby proves part (i) of the proposition.

(ii) Because $P$ and $N$ interchange their roles for $-\nu$, part (i) applied to $-\nu$ gives

$$\nu^-(A) = (-\nu)^+(A) = \sup\{-\nu(E) : E \in \Sigma \cap A\},$$

and therefore $\nu^- = -\nu_{\inf}$.

We leave part (iii) for an exercise (Problem 1.9). □

1.10 Remark. In summary of the Hahn-Jordan decomposition, we have that

$$\nu^+(A) = \sup\{\nu(E) : E \in \Sigma \cap A\} = \nu(A \cap P),$$

$$-\nu^-(A) = \inf\{\nu(E) : E \in \Sigma \cap A\} = \nu(A \cap N)$$

and

$$\nu(A) = \nu(A \cap P) + \nu(A \cap N).$$

This has an obvious interpretation. The signed measure $\nu$ attains its maximum and minimum values on two measurable disjoint subsets of $A$: $A \cap P$ and $A \cap N$, respectively; and the entire measure of $A$ is the sum of these two values. In particular, it follows that $P$ and $N$ are the $\nu$-maximal and $\nu$-minimal subsets of $\Omega$ (in notation, $P = S$ and $N = I$) on which $\nu$ attains maximum and minimum values, respectively. This is due to the fact that $(P, \Sigma \cap P, \mathrm{Res}_{\Sigma \cap P}\, \nu)$ is a positive measure space and hence $\mathrm{Res}_{\Sigma \cap P}\, \nu$ is monotone. A similar argument explains why $\nu$ attains a minimum value on $N$. □

Let us consider a few examples.

1.11 Examples.
(i) Let $(\Omega, \Sigma, \nu)$ be a signed measure space. If $\nu$ is a positive measure, then, obviously, $S = \Omega$ and $I = \emptyset$. Consider the case with $\nu = \int f\, d\rho$, where $\rho$ is a positive measure on $(\Omega, \Sigma)$ and $f \in L^1(\Omega, \Sigma, \rho)$. Then,

$$\nu(A) = \int_A f\, d\rho = \int_{A \cap \{f > 0\}} f\, d\rho + \int_{A \cap \{f < 0\}} f\, d\rho.$$

Therefore,

$$\int_{\{f < 0\}} f\, d\rho \le \nu(A) \le \int_{\{f > 0\}} f\, d\rho, \quad \forall A \in \Sigma,$$

and thus $S = \{f > 0\}$ and $I = \{f < 0\}$.

(ii) If $\mu$ and $\rho$ are two positive measures (of which at least one is finite), the difference $\nu = \mu - \rho$ is a signed measure. However, it is not, in general, the Jordan decomposition of $\nu$. Let $\nu$ be a signed measure on the measurable space $(\Omega, \Sigma)$. Denote by $\nu_E = \mathrm{Res}_{\Sigma \cap E}\, \nu$, where $E$ is a measurable set. To obtain the Jordan decomposition of $\nu = \mu - \rho$, we need any Hahn decomposition of $\Omega$ with respect to $\nu$. Say, $(P, N)$ is one. Then, from Corollary 1.7,

$$\nu^+ = \nu_P = \mu_P - \rho_P \qquad (1.11)$$

and

$$\nu^- = -\nu_N = \rho_N - \mu_N. \qquad (1.11a)$$

We can also make use of formulas (i) and (ii) of Proposition 1.9 to determine the positive and negative variations.
(iii) Let $\varepsilon_0$ be the point mass at $0$ and $\mathbb{P}$ a probability measure on $(\mathbb{R}, \mathfrak{B})$. We find a Hahn decomposition of the signed measure $\nu = \mathbb{P} - \varepsilon_0$. We show that $I = \{0\}$ is a $\nu$-minimal (and negative) set discussed in Remark 1.10. For an $A \in \mathfrak{B}$, either

$$\nu(A \cap I) = \mathbb{P}(\{0\}) - \varepsilon_0(\{0\}) = \mathbb{P}(\{0\}) - 1, \quad \text{with } 0 \in A,$$

or

$$\nu(A \cap I) = 0, \quad \text{with } 0 \notin A,$$

which implies that $\nu(A \cap I) \le 0$. Using relations (1.11) and (1.11a), we have the Jordan decomposition of $\nu$:

$$\nu^+(A) = \nu(A \cap I^c) = \mathbb{P}(A \cap I^c)$$

and

$$\nu^-(A) = 1_A(0)\,(1 - \mathbb{P}(\{0\})).$$
Note that $\{0\} = I$ is the set where $\nu$ attains its minimum.

(iv) Let $\nu = \lambda - \mu$, where $\lambda$ is the Lebesgue measure on $(\mathbb{R}, \mathfrak{B})$ and $\mu$ is the geometric measure, a probability measure concentrated on $\{1, 2, \ldots\}$. Clearly, $N = \{1, 2, \ldots\}$ is a negative set relative to $\nu$, whereas $P = N^c$ is a positive set. Thus, $(P, N)$ is a Hahn decomposition of $\mathbb{R}$ relative to $\nu$ and, consequently, for every Borel set $A$,

$$\nu^+(A) = (\lambda - \mu)(A \cap \{1, 2, \ldots\}^c)$$

and

$$\nu^-(A) = (\mu - \lambda)(A \cap \{1, 2, \ldots\})$$

represent the Jordan decomposition of $\nu$. Since $N$ is a $\lambda$-null set, the latter reduces to

$$\nu^-(A) = \mu(A \cap \{1, 2, \ldots\}).$$

Therefore, $\nu$ attains its minimum at $N$ and its value is $-1$, while the maximum value of $\nu$ is $\infty$ and it is attained at $N^c$. □

The next embellishment of the notion of measure is a complex measure.
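Before turning to complex measures, the Hahn-Jordan constructions above can be checked by brute force on a finite set. The sketch below is only an illustration under stated assumptions: the five-point set, its point weights, and all function names are hypothetical and not taken from the text. It builds one Hahn decomposition, forms $\nu^+$ and $\nu^-$ as in (1.7), and compares them with the supremum and infimum of Proposition 1.9 (i) and (ii).

```python
from itertools import combinations

# Hypothetical point weights of a signed measure nu on a five-point set Omega.
weights = {"a": 2.0, "b": -1.5, "c": 0.0, "d": 3.0, "e": -0.5}


def nu(subset):
    """Signed measure of a subset: the sum of its point weights."""
    return sum(weights[w] for w in subset)


def subsets(points):
    """All subsets of a finite collection of points, as tuples."""
    pts = list(points)
    for r in range(len(pts) + 1):
        yield from combinations(pts, r)


# One Hahn decomposition: P holds the points of nonnegative weight, N the rest.
P = {w for w, x in weights.items() if x >= 0}
N = set(weights) - P


def nu_plus(A):
    return nu(set(A) & P)        # nu^+(A) = nu(A ∩ P), as in (1.7)


def nu_minus(A):
    return -nu(set(A) & N)       # nu^-(A) = -nu(A ∩ N), as in (1.7)


A = {"a", "b", "d", "e"}
sup_nu = max(nu(E) for E in subsets(A))   # brute-force sup of Proposition 1.9 (i)
inf_nu = min(nu(E) for E in subsets(A))   # brute-force inf of Proposition 1.9 (ii)
total_variation = nu_plus(A) + nu_minus(A)
```

Moving the zero-weight point `"c"` from `P` to `N` gives a different Hahn decomposition but the same `nu_plus` and `nu_minus`, which is the invariance claimed in Corollary 1.7.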
1.12 Definition. Let $(\Omega, \Sigma)$ be a measurable space. A set function $\nu$ on $\Sigma$ is said to be a complex measure if:

(i) $\nu$ is valued in $\mathbb{C}$. [Notice that being valued in $\mathbb{C}$, $\nu$ must not have infinite values, and therefore, of those signed or positive measures only finite ones can be qualified as complex measures.]

(ii) $\nu(\emptyset) = 0$.

(iii) $\nu$ is $\sigma$-additive. [Analogously to the signed measures (see Remark 1.2 (ii)), $\sigma$-additivity of $\nu$,

$$\left| \nu\!\left( \sum_{n=1}^{\infty} A_n \right) \right| = \left| \sum_{n=1}^{\infty} \nu(A_n) \right| < \infty$$

(where $|\cdot|$ stands for the two-dimensional Euclidean norm), implies that the series $\sum_{n=1}^{\infty} \nu(A_n)$ is also absolutely convergent.] □

The triple $(\Omega, \Sigma, \nu)$ is referred to as a complex measure space. Now, we use a concept similar to that of Proposition 1.9 (iii) to define the total variation of a complex measure.
1.13 Definitions.
(i) Given a complex measure space $(\Omega, \Sigma, \nu)$, the complex measure $\nu$ can be represented as $\nu = \nu_1 + i\nu_2$, where $\nu_1$ and $\nu_2$ are finite signed measures on $\Sigma$. Hahn decompositions should then be applied for $\nu_1$ and $\nu_2$ and their corresponding Jordan decompositions will yield

$$\nu = \nu_1^+ - \nu_1^- + i(\nu_2^+ - \nu_2^-) \qquad (1.13)$$

with $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$ being positive finite measures. We will call (1.13) the Jordan decomposition of the complex measure $\nu$.

(ii) For each measurable set $A$, the total variation $|\nu|(A)$ of a complex measure $\nu$ is defined as

$$\sup\left\{ \sum_{k=1}^{n} |\nu(A_k)| : \text{over all finite measurable partitions } \{A_1, \ldots, A_n\} \text{ of } A \right\}.$$ □

1.14 Proposition. The total variation of a complex measure $\nu$ on $(\Omega, \Sigma)$ is a finite positive measure on $(\Omega, \Sigma)$.
Proof. Let $\{A_1, \ldots, A_n\}$ be a measurable partition of a set $A \in \Sigma$. Because

$$|(a - b) + i(c - d)| \le a + b + c + d$$

for nonnegative real numbers $a$, $b$, $c$, $d$, and due to Proposition 1.9 (iii), we have

$$\sum_{k=1}^{n} |\nu(A_k)| \le \sum_{k=1}^{n} |\nu_1|(A_k) + \sum_{k=1}^{n} |\nu_2|(A_k) \le |\nu_1|(A) + |\nu_2|(A),$$

and therefore,

$$|\nu|(A) \le |\nu_1|(A) + |\nu_2|(A) = (\nu_1^+ + \nu_1^- + \nu_2^+ + \nu_2^-)(A). \qquad (1.14)$$

Consequently, the total variation of any measurable set is a real nonnegative number. Obviously, $|\nu|(\emptyset) = 0$. Now we show that $|\nu|$ is an additive set function. Let $A$ and $B$ be disjoint measurable sets and let $\{E_1, \ldots, E_n\}$ be a measurable partition of $A + B$. Then,

$$\nu(E_k) = \nu(E_k \cap A) + \nu(E_k \cap B)$$

and the triangle inequality of the Euclidean norm yield that

$$\sum_{k=1}^{n} |\nu(E_k)| \le \sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| \le |\nu|(A) + |\nu|(B),$$

and therefore,

$$|\nu|(A + B) \le |\nu|(A) + |\nu|(B). \qquad (1.14a)$$

The inverse inequality is due to the following. Given a measurable partition $\{E_1, \ldots, E_n\}$ of $A + B$, it holds true that

$$\sum_{k=1}^{n} |\nu(E_k \cap A)| + \sum_{k=1}^{n} |\nu(E_k \cap B)| = \sum_{k=1}^{2n} |\nu(F_k)| \le |\nu|(A + B),$$

with

$$F_k = \begin{cases} E_k \cap A, & k = 1, \ldots, n \\ E_{k-n} \cap B, & k = n + 1, \ldots, 2n \end{cases}$$

as another partition of $A + B$. Applying the supremum twice to the left-hand side of the above inequality we arrive at the desired inverse to inequality (1.14a). Hence, we showed that the total variation of $\nu$ is a finite content on $\Sigma$.

Finally, by Proposition 1.7 (ii), $|\nu|$ is $\sigma$-additive if it is $\emptyset$-continuous. This readily follows from (1.14) and the fact that $\nu_1^+$, $\nu_1^-$, $\nu_2^+$, and $\nu_2^-$, as finite positive measures, are $\emptyset$-continuous. □

1.15 Remarks.
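(i) Notice that there is a slight difference in the definition of the total variation of a signed measure and a complex measure, but according to Problem 1.13, they agree in the case of finite signed measures.

(ii) While the set $\mathfrak{S}(\Omega, \Sigma)$ of all signed measures is not a linear space (the sum of two signed measures need not be a signed measure, as we can arrive at $\infty - \infty$), the space $\mathfrak{C}(\Omega, \Sigma)$ of all complex measures (over the field $\mathbb{C}$) is. It is easy to verify that $\|\nu\|$ defined as $|\nu|(\Omega)$ is a norm and therefore upgrades $\mathfrak{C}(\Omega, \Sigma)$ to a normed linear space. It can be shown (Problem 1.14) that $(\mathfrak{C}(\Omega, \Sigma), \|\cdot\|)$ is a Banach space. Denote by $\mathfrak{S}^*(\Omega, \Sigma)$ the subspace of all finite signed measures and, for $\nu, \rho \in \mathfrak{S}^*(\Omega, \Sigma)$, define

$$(\nu \vee \rho)(A) = \sup\{\nu(E) + \rho(A \setminus E) : E \in \Sigma \cap A\},$$

$$(\nu \wedge \rho)(A) = \inf\{\nu(E) + \rho(A \setminus E) : E \in \Sigma \cap A\}.$$

It is obvious that

$$\nu \wedge \rho \le \nu \le \nu \vee \rho \quad \text{and} \quad \nu \wedge \rho \le \rho \le \nu \vee \rho.$$

It can also be readily shown (Problem 1.15) that

$$\nu^+ = \nu \vee 0, \quad \nu^- = -(\nu \wedge 0), \quad \text{and} \quad |\nu| = \nu \vee (-\nu), \qquad (1.15)$$

and therefore, $(\mathfrak{S}^*(\Omega, \Sigma), \|\cdot\|, \le)$ is a Banach lattice. □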
The following is an embellishment of the integral notion of real- and complex-valued functions with respect to signed and complex measures.

1.16 Definitions.

(i) Let $[\Omega, \mathbb{C}, f = (u, v)]$ be a complex-valued function. Given a $\sigma$-algebra $\Sigma$ in $\Omega$, $f$ is measurable if for every Borel set $B \in \mathfrak{B}(\mathbb{R}^2)$, $f^*(B) \in \Sigma$, as usual. By using the projection operators one can easily show that $f$ is $\Sigma$-$\mathfrak{B}(\mathbb{R}^2)$ measurable if and only if $u$ and $v$ are $\Sigma$-$\mathfrak{B}(\mathbb{R})$ measurable. Now, given a positive measure $\mu \in \mathfrak{M}(\Omega, \Sigma)$, we say that $f \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$ or $f$ is $\mu$-integrable if $|f| \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$. Since $|f| \le |u| + |v| \le 2|f|$, $f$ is integrable if and only if both $u$ and $v$ are elements of $L^1(\Omega, \Sigma, \mu; \mathbb{R})$ and in this case we will write

$$\int f\, d\mu = \int u\, d\mu + i \int v\, d\mu, \qquad (1.16)$$

and therefore, $L^1(\Omega, \Sigma, \mu; \mathbb{C})$ is a linear space with the integral being a linear functional on $L^1(\Omega, \Sigma, \mu; \mathbb{C})$. All major theorems of integration (cf. Sections 1 and 2, Chapter 6) hold true with very minor notational modifications.
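(ii) Let $\nu \in \mathfrak{S}(\Omega, \Sigma)$ with its Jordan decomposition

$$\nu = \nu^+ - \nu^-. \qquad (1.16a)$$

Denote

$$L^1(\Omega, \Sigma, \nu; \mathbb{R}) = L^1(\Omega, \Sigma, \nu^+; \mathbb{R}) \cap L^1(\Omega, \Sigma, \nu^-; \mathbb{R}).$$

The integral of a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{R})$ relative to the signed measure $\nu$ is defined as

$$\int f\, d\nu = \int f\, d\nu^+ - \int f\, d\nu^-. \qquad (1.16b)$$

The function $f$ is said to be integrable with respect to the signed measure $\nu$ if $f \in L^1(\Omega, \Sigma, \nu; \mathbb{R})$. The value of the integral $\int f\, d\nu$ is therefore finite. [Notice that the full decomposition of (1.16b) is

$$\int f\, d\nu = \int f^+ d\nu^+ - \int f^- d\nu^+ - \int f^+ d\nu^- + \int f^- d\nu^- \qquad (1.16c)$$

and thus $L^1(\Omega, \Sigma, \nu; \mathbb{R})$ could be enlarged by including those $f$'s for which just one of the parts (negative or positive) in (1.16c) is finite.]

(iii) Let us now define integrals of complex-valued functions with respect to complex measures. Let $\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$ denote the linear space of all $\Sigma$-measurable complex-valued bounded functions ($\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{R})$ is the corresponding subspace of real-valued bounded functions). Let $\nu = \nu_1 + i\nu_2$ be a complex measure on $(\Omega, \Sigma)$ and let $f = u + iv \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$. Define

$$\int f\, d\nu = \int u\, d\nu_1 - \int v\, d\nu_2 + i\left( \int u\, d\nu_2 + \int v\, d\nu_1 \right), \qquad (1.16d)$$

where each of the four integrals of (1.16d) relative to the signed measures $\nu_1$ and $\nu_2$ is subject to representation (1.16b). Obviously, it makes a vector of linear combinations of eight finite positive integrals and therefore the integral of (1.16d) exists and it is complex-valued. □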
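1.17 Example. The integral in (1.16d), as a functional of $f$, is clearly linear. Replacing $f$ by $f 1_A$ we see that (1.16d) also defines a complex measure and therefore $\nu \mapsto \int f\, d\nu$ is a linear operator from $\mathfrak{C}(\Omega, \Sigma)$ to $\mathfrak{C}(\Omega, \Sigma)$.

Now we define a norm on $\mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$:

$$\|f\|_{\sup} = \sup\{|f(\omega)| : \omega \in \Omega\}.$$

Given a $\nu \in \mathfrak{C}(\Omega, \Sigma)$, and $s = \sum_{k=1}^{n} a_k 1_{A_k}$ being a simple function, we have that

$$\left| \int s\, d\nu \right| = \left| \sum_{k=1}^{n} a_k \nu(A_k) \right| \le \sum_{k=1}^{n} |a_k|\, |\nu(A_k)| \le \sum_{k=1}^{n} \|s\|_{\sup} |\nu(A_k)| \le \|s\|_{\sup} \|\nu\|.$$

If $f \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{R}_+)$, then there is a sequence $\{s_n\}$ of simple functions with $s_n \uparrow f$. Hence,

$$\left| \int s_n\, d\nu \right| \le \|s_n\|_{\sup} \|\nu\| \le \|f\|_{\sup} \|\nu\|,$$

which leads to

$$\left| \int f\, d\nu \right| \le \|f\|_{\sup} \|\nu\|$$

and, obviously, to the same inequality for $f \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{C})$. □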
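The total variation of Definition 1.13 (ii) and the norm estimate of Example 1.17 can be verified directly when $\Omega$ is finite. The following is a minimal sketch under stated assumptions: the four point masses, the bounded function `s`, and every function name are hypothetical choices made for illustration only. It takes the supremum over all partitions by enumeration and then checks $|\int s\, d\nu| \le \|s\|_{\sup} \|\nu\|$.

```python
import math

# Hypothetical point masses of a complex measure nu on a four-point set.
masses = {1: 1 + 1j, 2: -2j, 3: -0.5, 4: 3 - 4j}


def nu(subset):
    """Complex measure of a subset: the sum of its point masses."""
    return sum(masses[w] for w in subset)


def partitions(points):
    """Yield every partition of a list of points into nonempty blocks."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for part in partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + part


def total_variation(A):
    """Sup over finite partitions of A of sum |nu(A_k)|, as in Definition 1.13 (ii)."""
    return max(sum(abs(nu(block)) for block in part)
               for part in partitions(sorted(A)))


omega = set(masses)
norm_nu = total_variation(omega)                      # ||nu|| = |nu|(Omega)
pointwise_sum = sum(abs(m) for m in masses.values())  # value on the finest partition

# A hypothetical bounded function s on Omega and its integral against nu.
s = {1: 2 - 1j, 2: 0.5, 3: -3j, 4: 1 + 1j}
integral = sum(s[w] * masses[w] for w in omega)
sup_norm = max(abs(v) for v in s.values())
bound_holds = abs(integral) <= sup_norm * norm_nu + 1e-12
same_as_finest = math.isclose(norm_nu, pointwise_sum)
```

On a finite set the supremum is attained on the partition into singletons, because merging two blocks can only decrease the sum by the triangle inequality; `same_as_finest` records that check.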
PROBLEMS

1.1 Why is the series $\sum_{n=1}^{\infty} \nu(A_n)$ in Remark 1.2 (ii) absolutely convergent?

1.2 Prove Proposition 1.3 (i).

1.3 Prove Proposition 1.3 (ii).

1.4 Prove Proposition 1.4.

1.5 Show that the families of positive and negative sets are closed with respect to at most countable unions and intersections.

1.6 Let $(\Omega, \Sigma, \nu)$ be a signed measure space and let $A \in \Sigma$ be such that $0 < \nu(A) < \infty$. Show that there is a positive subset $P$ of $A$, with $\nu(A) \le \nu(P) < \infty$.

1.7 Show that all Hahn decompositions form a unique equivalence class.

1.8 Prove that the Jordan decomposition of a signed measure (from Corollary 1.7) is invariant of any Hahn decomposition.

1.9 Prove part (iii) of Proposition 1.9.

1.10 Let $\nu$ be a signed measure on $(\Omega, \Sigma)$ represented as a difference of two positive measures $\nu = \mu_1 - \mu_2$. Show that $\mu_1 \ge \nu^+$ and $\mu_2 \ge \nu^-$.

1.11 Let $(\Omega, \Sigma, \nu)$ be a signed measure space. Show that $|\nu|(A) = 0$, for an $A \in \Sigma$, if and only if $\nu(S) = 0$ for each $S \in \Sigma \cap A$. Show by example that $\nu(A) = 0$ is not sufficient for $|\nu|(A) = 0$.

1.12 Let $(\Omega, \Sigma, \mu)$ be a positive measure space and let $\nu$ be the indefinite integral $\nu = \int f\, d\mu$ generated by $f \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ and $\mu$. Show that $\nu$ is a signed measure.

1.13 Show that the total variation of a finite signed measure agrees with its total variation as for a complex measure.

1.14 Prove that $(\mathfrak{C}(\Omega, \Sigma), \|\cdot\|)$ (of Remark 1.15 (ii)) is a Banach space.

1.15 Prove that the subspace $(\mathfrak{S}^*(\Omega, \Sigma), \|\cdot\|, \le)$ of all finite signed measures is a Banach lattice, i.e. show the validity of equations (1.15).

1.16 Show that $[\Omega, \mathbb{C}, f = (u, v)]$ (of Definition 1.16 (i)) is $\Sigma$-$\mathfrak{B}(\mathbb{R}^2)$ measurable if and only if $u$ and $v$ are $\Sigma$-$\mathfrak{B}(\mathbb{R})$ measurable.

1.17 Show that for each $f \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$, $\left| \int f\, d\mu \right| \le \int |f|\, d\mu$.

1.18 Modify the Lebesgue Dominated Convergence Theorem of Section 2, Chapter 6, and prove it for complex-valued functions.

1.19 Let $(\Omega, \Sigma, \nu)$ be a complex measure space. Prove that for each $f \in L^1(\Omega, \Sigma, \nu; \mathbb{C})$ it holds true that $|\nu|(A) = \int_A |f|\, d\nu$, for all $A \in \Sigma$.

1.20 Show that, given a signed measure $\nu$, $L^1(\Omega, \Sigma, \nu; \mathbb{R}) = L^1(\Omega, \Sigma, |\nu|; \mathbb{R})$.

1.21 Show that any signed Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathfrak{B})$ is $\sigma$-finite.
NEW TERMS:

signed measure; positive measure; finite signed measure; infinite signed measure; $\sigma$-finite signed measure; finite ($\nu$-finite) set; positive ($\nu$-positive) set; negative ($\nu$-negative) set; continuous from below set functions; continuous from above set functions; continuity from above at the empty set; $\emptyset$-continuity of a signed measure; signed Borel measure; signed Borel-Lebesgue-Stieltjes measure; Hahn decomposition; Hahn Decomposition Theorem; Jordan decomposition of a signed measure; positive variation of a signed measure; negative variation of a signed measure; total variation of a signed measure; $\nu$-maximal set; $\nu$-minimal set; geometric measure; complex measure; complex measure space; Jordan decomposition of a complex measure; total variation of a complex measure; measurability of a complex-valued function; integrability of a complex-valued function; integral of a real-valued function relative to a signed measure
2. ABSOLUTE CONTINUITY
2.1 Definition and Notation.

Absolute continuity of signed measures is formulated in the same way as that of positive measures. Let $\mu$ and $\nu$ be positive and signed measures, respectively, on a measurable space $(\Omega, \Sigma)$. We call a signed measure $\nu$ absolutely continuous with respect to (a positive measure) $\mu$ (in notation, $\nu \ll \mu$) if every measurable $\mu$-null set $A$ is also a $\nu$-null set. A signed measure that is absolutely continuous with respect to the Lebesgue measure is called continuous. Recall that $\mathfrak{S}(\Omega, \Sigma)$ denotes the set of all signed measures on $(\Omega, \Sigma)$ and $\mathfrak{M}(\Omega, \Sigma)$ is the subset of all positive measures. Given $\mu \in \mathfrak{M}$, denote $\mathfrak{S}_{\mu\ll} = \{\nu \in \mathfrak{S}(\Omega, \Sigma) : \nu \ll \mu\}$. Define on $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ the map $\int_\mu$ such that for each $g \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$,

$$\int_\mu g = \nu, \quad \text{where } \nu(A) = \int_A g\, d\mu, \ A \in \Sigma, \qquad (2.1)$$

that, according to Problem 1.12, is valued in $\mathfrak{S}$ and, as easily seen, $\nu \ll \mu$. □

Consequently, $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}), \mathfrak{S}_{\mu\ll}, \int_\mu]$ is an into map. From Definitions and Remarks 1.14 (iii), Chapter 6, we remember that the $\mu$-almost everywhere property of equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$ and thus on $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$, as a subset of $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$. Consequently,

$$\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu := \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}) / E$$

is also a quotient set. On the other hand, by Corollary 1.20, Chapter 6, the map $\int_\mu$ agrees with the equivalence relation $E$. In other words, $\int_\mu$ adopts $E$ as its equivalence kernel. Then, by Theorem 4.4, Chapter 1, there is a unique function, denoted by

$$[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu],$$

such that

$$\int_\mu = J_\mu \circ \pi_E,$$

where $\pi_E$ is the projection of $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ on its quotient $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$ by $E$. (See Section 4, Chapter 1, for a refresher.) Therefore, the map $\int_\mu$ "becomes" an injection $J_\mu$ with the domain $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$.
Recall that any function $g$ from the quotient class $[g]_\mu$ of $\mathbb{L}$-functions (generating the signed measure $\nu$ in (2.1)) is referred to as a Radon-Nikodym density of (the signed measure) $\nu$ relative to (the positive measure) $\mu$.

The chief component of the Radon-Nikodym Theorem below is that the map $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu]$ is surjective. In other words, for each measure $\nu \in \mathfrak{S}_{\mu\ll}$, there is an equivalence class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.
2.2 Theorem (Radon-Nikodym). Let $\mu \in \mathfrak{M}(\Omega, \Sigma)$ be a $\sigma$-finite measure. Then $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu]$ is a bijective map.

Proof. The proof of the theorem includes three objectives:

1) Show that given $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$, $J_\mu [g]_\mu \in \mathfrak{S}_{\mu\ll}$, i.e., that for each $g \in [g]_\mu$, $\nu = \int g\, d\mu$ defines a signed measure, absolutely continuous with respect to $\mu$. This is readily done. Since $g \in \mathbb{L}$, $\nu$ is a signed measure. The proof that $\nu \ll \mu$ is trivial. Therefore, $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu]$ is an into map.

2) Show that $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu]$ is a surjective (an onto) map, i.e., that for every signed measure $\nu \in \mathfrak{S}_{\mu\ll}$, absolutely continuous with respect to a positive $\sigma$-finite measure $\mu$, there is an equivalence class $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. This is the key part of the theorem and it is referred to as the "existence of a Radon-Nikodym density."

3) Show that the map $[\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu, \mathfrak{S}_{\mu\ll}, J_\mu]$ is injective (one-to-one), i.e., that the above equivalence class $[g]_\mu$ is unique. This is due to Corollary 1.20, Chapter 6.

It therefore remains to prove the existence of the Radon-Nikodym density, i.e., given a signed measure $\nu$, absolutely continuous relative to a $\sigma$-finite positive measure $\mu$, there is an equivalence class $[g]_\mu$ of $\mathbb{L}$-functions such that for each element $g$ of this class $\nu = \int g\, d\mu$. We break up the proof of existence into five parts, starting with the case of two finite measures $\mu$ and $\nu$ and embellishing it to the case when $\mu$ and $\nu$ are $\sigma$-finite positive and signed measures, respectively.
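Case 1. $\mu$ and $\nu$ are two finite positive measures. Abbreviate $L^1_+ = L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)$ and introduce the subset of functions

$$\Phi = \left\{ f \in L^1_+ : \int_A f\, d\mu \le \nu(A), \ \forall A \in \Sigma \right\}.$$

Since $0 \in \Phi$, $\Phi \ne \emptyset$. $\Phi$ is closed under finite suprema. Indeed, let $f, g \in \Phi$ and, for $A \in \Sigma$, let

$$E = \{\omega \in A : f(\omega) > g(\omega)\} \quad \text{and} \quad G = \{\omega \in A : f(\omega) \le g(\omega)\}.$$

Then, $E + G = A$ and

$$\int_A f \vee g\, d\mu = \int_E f\, d\mu + \int_G g\, d\mu \le \nu(E) + \nu(G) = \nu(A).$$

Now, let

$$S := \sup\left\{ \int f\, d\mu : f \in \Phi \right\} \le \nu(\Omega) < \infty.$$

Then, there is a sequence $\{f_n\} \subseteq \Phi$ such that $\lim_{n \to \infty} \int f_n\, d\mu = S$. Let $g_n = f_1 \vee \cdots \vee f_n$. Then $g_n \in \Phi$ and $\{g_n\}$ is monotone nondecreasing, so that $g := \sup_n g_n$ satisfies, by the Monotone Convergence Theorem,

$$\int_A g\, d\mu = \lim_{n \to \infty} \int_A g_n\, d\mu \le \nu(A), \quad A \in \Sigma.$$

Hence $g \in \Phi$ and, since $\int f_n\, d\mu \le \int g\, d\mu \le S$ for all $n$, $\int g\, d\mu = S$, i.e., $g$ is an $\int$-maximal element of $\Phi$.

Now, we will show that $[g]_\mu$ is an equivalence class of Radon-Nikodym densities of $\nu$ relative to $\mu$, i.e., for each $g \in [g]_\mu$,

$$\int_A g\, d\mu = \nu(A), \quad \text{for all } A \in \Sigma.$$

Because $\nu \ll \mu$ and for all $A \in \Sigma$

$$\int_A g\, d\mu \le \nu(A),$$

the set function

$$\rho = \nu - \int g\, d\mu$$

is a finite positive measure, absolutely continuous relative to measure $\mu$.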
CHAPTER B . ANALYSffi rn ABSTRACT S P A CES
If g is not a Radon-Nikodym density of and p( O) > 0. Thus for some positive £ ,
v with
respect to
J.L ,
then
p '¥=. 0 ( 2.2a)
J.L( O) - cp( O) < 0.
Consider the ( finite ) signed measure 1 = J.L - e p . By Theorem 1.6, there is a Hahn decomposition (P,N ) of n such that 1 ( A n P ) > 0 and 1 ( A n N ) < 0 for all measurable sets A, i.e. , and
J.L( P n A) - cp( P n A) > 0
( 2.2b )
J.L( N n A) - c: p( N n A) < 0.
( 2.2c )
If J.L( N ) = 0 then, because of p � J.L , p( N ) = 0 and thus 1 (N) other hand, from ( 2.2b ) , by setting A = n we have that
J.L(P) - cp( P) = 1 (P) > 0 .
= 0.
On the
( 2.2d )
Furthermore, since by the above assumption about p , N turns out to be a 1-null set, it follows from ( 2.2a) that 1 ( P ) < 0. This contradicts inequality ( 2.2d ) . Hence, J.L( N ) must be positive. Now we have from ( 2.2c ) that
< p( N n A) < p( A )
= v(A) - I g d J.L A
equivalently,
J (i lN + g )d p. < v( A) . A Thus, the- function ! I N + g E l/>. But, since J.L( lV) > 0, or ,
it holds true that
This contradicts that g is an I -maximal element o f
v( A) for all case.
A E E,
=
Ig
A
dJ.L
which proves the statement of the theorem for this special
Notice that because $\nu$ is finite and therefore every Radon-Nikodym density $g$ is an $L^1$-function, by Proposition 1.21, Chapter 6, $g$ is finite $\mu$-a.e. If it is "occasionally" infinite, we can redefine $g$ so as to make it finite. Therefore, within the equivalence class $[g]_\mu$ of Radon-Nikodym densities there is a subclass of finite ones.

In summary of case 1, given two finite positive measures $\mu$ and $\nu$ such that $\nu \ll \mu$, there is a unique (nonempty) equivalence class $[g]_\mu \in L^1(\Omega, \Sigma, \mu; \mathbb{R}_+)|_\mu$ of Radon-Nikodym densities (of measure $\nu$ relative to measure $\mu$) of which a nonempty subclass is of finite densities.

Case 2. $\mu$ and $\nu$ are finite and $\sigma$-finite positive measures, respectively. If $\nu$ is $\sigma$-finite then there is an at most countable decomposition $\Omega = \sum_{n=1}^{\infty} \Omega_n$ such that $\nu(\Omega_n) < \infty$ for all $n = 1, 2, \ldots$. Let

$$\nu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\, \nu.$$

Then $\nu_n$ is a finite measure on $\Omega_n \cap \Sigma$ and from case 1 it follows that there is a measurable nonnegative function $g_n : \Omega_n \to \mathbb{R}$ such that

$$\nu_n(A \cap \Omega_n) = \int_{A \cap \Omega_n} g_n\, d\mu, \quad \text{for each } A \in \Sigma, \ n = 1, 2, \ldots.$$

Now by the Monotone Convergence Theorem applied to the sequence $\left\{ \sum_{k=1}^{n} g_k 1_{\Omega_k} \right\}$ we have that

$$\nu(A) = \sum_{n=1}^{\infty} \nu_n(A \cap \Omega_n) = \sum_{n=1}^{\infty} \int_{A \cap \Omega_n} g_n\, d\mu = \int_A \sum_{n=1}^{\infty} g_n 1_{\Omega_n}\, d\mu.$$

It only remains to set $g = \sum_{n=1}^{\infty} g_n 1_{\Omega_n}$ to complete this part of the theorem.
Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is finite, $\nu$ is $\sigma$-finite and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)|_\mu$ of Radon-Nikodym densities (whose integral is not necessarily finite).

Case 3. $\mu$ is a finite positive and $\nu$ is an arbitrary positive measure. Denote by

$$\Gamma = \{B \in \Sigma : \mathrm{Res}_{\Sigma \cap B}\, \nu \text{ is } \sigma\text{-finite}\}.$$

Since $\emptyset \in \Gamma$, it follows that $\Gamma \ne \emptyset$. Let

$$S = \sup\{\mu(B) : B \in \Gamma\}$$

and let $\{E_n\} \subseteq \Gamma$ be such that $\mu(E_n) \to S$. (Since $S \le \mu(\Omega) < \infty$, for each $n = 1, 2, \ldots$, there is a set $E_n$ such that $S - \tfrac{1}{n} < \mu(E_n) \le S$.) Clearly,

$$E := \bigcup_{n=1}^{\infty} E_n \in \Gamma.$$

Hence, $S \ge \mu(E) \ge \mu(E_n) \to S$ and $\mu(E) = S$. Now since $\nu$ is $\sigma$-finite on $E$, from case 2, it follows that there is a $\Sigma \cap E$-$\mathfrak{B}_+$-measurable function $[E, \overline{\mathbb{R}}_+, g]$, such that

$$\nu(A \cap E) = \int_{A \cap E} g\, d\mu$$

for all $A \in \Sigma$. Fix an $A \in \Sigma$.

a) Let $\mu(A \cap E^c) > 0$. If $\nu(A \cap E^c) < \infty$, then $A \cap E^c \in \Gamma$, and thus $E \cup (A \cap E^c) \in \Gamma$. The latter yields that

$$\mu(E \cup (A \cap E^c)) = \mu(E) + \mu(A \cap E^c) > S,$$

which contradicts the definition of $S$ as a supremum over $\Gamma$. Thus $\nu(A \cap E^c) = \infty$.

b) Let $\mu(A \cap E^c) = 0$. Then since $\nu \ll \mu$, it holds true that $\nu(A \cap E^c) = 0$.

The above cases a) and b) can be combined in the following compact equation

$$\nu(A \cap E^c) = \int_{A \cap E^c} \infty\, d\mu = \infty \cdot \mu(A \cap E^c),$$

by agreeing that $\infty \cdot 0 = 0$. Furthermore,

$$\nu(A) = \nu(A \cap E) + \nu(A \cap E^c) = \int_A \tilde{g}\, d\mu,$$

where $\tilde{g} = g 1_E + \infty 1_{E^c}$. Notice that $\tilde{g}$ is measurable, since $E \in \Sigma$ and $g$ is measurable on $E$.

Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is finite, $\nu$ is arbitrary, and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)|_\mu$ of Radon-Nikodym densities.

Case 4. $\mu$ is a $\sigma$-finite and $\nu$ is an arbitrary positive measure.
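Let $\Omega = \sum_{n=1}^{\infty} \Omega_n$ be such that $\mu(\Omega_n) < \infty$ for all $n \ge 1$. Due to case 3, for each $n$, there is a $\Sigma \cap \Omega_n$-$\mathfrak{B}_+$-measurable function $[\Omega_n, \overline{\mathbb{R}}_+, g_n]$, such that

$$\nu(A \cap \Omega_n) = \int_{A \cap \Omega_n} g_n\, d\mu$$

for all $A \in \Sigma$. Extending each $g_n$ to $\Omega$ by zero off $\Omega_n$ (that is, replacing $g_n$ by $g_n 1_{\Omega_n}$), we have

$$\nu(A \cap \Omega_n) = \int_A g_n\, d\mu$$

and thus

$$\nu(A) = \int_A g\, d\mu,$$

where, by the Monotone Convergence Theorem, $g = \sum_{n=1}^{\infty} g_n$.

Therefore, given two positive measures $\mu$ and $\nu$ such that $\mu$ is $\sigma$-finite, $\nu$ is arbitrary, and $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)|_\mu$ of (nonnegative) Radon-Nikodym densities.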
Case 5. $\mu$ is a $\sigma$-finite positive measure and $\nu$ is a signed measure.

Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition of $\nu$, where, for instance, $\nu^-$ is supposed to be finite. By case 4, there are functions $[\Omega, \overline{\mathbb{R}}_+, g_i]$, $i = 1, 2$, such that

$$\nu^+(A) = \int_A g_1\, d\mu \quad \text{and} \quad \nu^-(A) = \int_A g_2\, d\mu, \quad A \in \Sigma.$$

Since by our assumption $\nu^-$ is finite, $g_2$ is $\mu$-integrable and $g_2 < \infty$ $\mu$-a.e. This leads to

$$\nu(A) = \nu^+(A) - \nu^-(A) = \int_A (g_1 - g_2)\, d\mu.$$

In summary of case 5, given a $\sigma$-finite positive measure $\mu$ and signed measure $\nu$, with $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in \mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.

Case 5a (special case of 5, with $\nu$ being a finite signed measure).

In this case, clearly, given a $\sigma$-finite positive measure $\mu$ and a finite signed measure $\nu$, with $\nu \ll \mu$, there is a unique equivalence class $[g]_\mu \in L^1(\Omega, \Sigma, \mu; \mathbb{R})|_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$.

Case 5b (special case of 5, with $\nu$ being a $\sigma$-finite signed measure).

Since $\nu$ is $\sigma$-finite, there is a countable decomposition $\sum_{n=1}^{\infty} \Omega_n = \Omega$ such that $\nu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\, \nu$ is finite for every $n$. By case 5a, since $\nu_n \ll \mu_n = \mathrm{Res}_{\Sigma \cap \Omega_n}\, \mu$, for every $n$ there is a unique equivalence class $[g_n]_{\mu_n}$ of Radon-Nikodym densities of $\nu_n$ relative to $\mu_n$. Now, letting

$$g = \sum_{n=1}^{\infty} g_n 1_{\Omega_n}$$

for every $g_n \in [g_n]_{\mu_n}$, we define the class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. As a sum of countably many integrable functions supported on the disjoint sets $\Omega_n$, $g$ is clearly $\mu$-a.e. finite.

The proof of the theorem is now complete. □
By the Radon-Nikodym Theorem, the map $J_\mu$ is therefore invertible and its inverse $(J_\mu)^{-1}$ is also a map valued in $\mathbb{L}(\Omega, \Sigma, \mu; \overline{\mathbb{R}})|_\mu$. In other words, for any $\nu \in \mathfrak{S}_{\mu\ll}$, under $(J_\mu)^{-1}$, there is a nonempty equivalence class $[g]_\mu$ of Radon-Nikodym densities of $\nu$ relative to $\mu$. We denote $(J_\mu)^{-1}$ by the symbol $\frac{d}{d\mu}$ and for a fixed $\nu \in \mathfrak{S}_{\mu\ll}$, we set

$$\frac{d\nu}{d\mu} = (J_\mu)^{-1}(\nu) = [g]_\mu$$

and call it the Radon-Nikodym derivative of measure $\nu$ relative to measure $\mu$. We would like to emphasize that a Radon-Nikodym derivative is not the same as a Radon-Nikodym density (as the term is routinely used in colloquial language), but it is an equivalence class of Radon-Nikodym densities.
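The distinction between a density and the derivative (an equivalence class of densities) is easy to see on a finite set, where a density is just a ratio of point masses wherever $\mu$ is positive. The sketch below is a hypothetical illustration, not part of the proof: the four-point set, the two measures, and all function names are assumptions. It builds two densities that differ only on a $\mu$-null point and confirms that both reproduce $\nu$ on every subset.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical data: mu is a positive measure and nu a signed measure, both given
# by point masses on Omega = {0, 1, 2, 3}; the point 3 is mu-null.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}
nu = {0: Fraction(1), 1: Fraction(-1, 2), 2: Fraction(0), 3: Fraction(0)}


def measure(m, A):
    """Measure of a subset A under the point-mass measure m."""
    return sum((m[w] for w in A), Fraction(0))


def is_absolutely_continuous(nu, mu):
    """On a finite set, nu << mu iff nu vanishes at every mu-null point."""
    return all(nu[w] == 0 for w in mu if mu[w] == 0)


def density(nu, mu, value_on_null=Fraction(0)):
    """One Radon-Nikodym density: nu({w}) / mu({w}) wherever mu({w}) > 0."""
    return {w: (nu[w] / mu[w] if mu[w] != 0 else value_on_null) for w in mu}


def integral(g, m, A):
    """Integral of g over A with respect to the point-mass measure m."""
    return sum((g[w] * m[w] for w in A), Fraction(0))


def all_subsets(points):
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))


def is_density(g):
    """Check nu(A) = integral of g over A for every subset A of Omega."""
    return all(integral(g, mu, A) == measure(nu, A) for A in all_subsets(mu))


g1 = density(nu, mu)
g2 = density(nu, mu, value_on_null=Fraction(7))  # differs from g1 only on a mu-null set
```

Here `g1` and `g2` are two different functions in the same class $[g]_\mu$; the Radon-Nikodym derivative is the class, not either representative.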
2.3 Example. Let $X$ $(\in \mathbb{L}(\Omega, \Sigma, \mu; \mathbb{R}))$ be a random variable defined on a probability space $(\Omega, \Sigma, \mathbb{P})$. Recall that $X$ induces the image measure $\mathbb{P}X^*$ referred to as the probability distribution and yielding the probability space $(\mathbb{R}, \mathfrak{B}, \mathbb{P}X^*)$. Given a (positive) Borel measure $\mu$ such that $\mathbb{P}X^* \ll \mu$, we have, according to case 1 of the Radon-Nikodym Theorem, a nonempty equivalence class $\frac{d\mathbb{P}X^*}{d\mu}$ of Radon-Nikodym densities (probability density functions) such that for any $g \in \frac{d\mathbb{P}X^*}{d\mu}$, it holds true that $\mathbb{P}X^* = \int g\, d\mu$. For instance, if $\mu = \lambda$ is the Borel-Lebesgue measure on $(\mathbb{R}, \mathfrak{B})$, then the probability distribution $\mathbb{P}X^*$ can be represented by the Lebesgue integral and a density $g$ can often be reduced to the usual Newton-Leibniz derivative of the probability distribution function $x \mapsto \mathbb{P}X^*(-\infty, x]$. A random variable $X$, whose probability distribution $\mathbb{P}X^*$ is absolutely continuous with respect to the Borel-Lebesgue measure (or as we agreed to call it, just "continuous"), is said to be continuous. In probability theory, it is common to specify a probability density function and (as one of the consequences of the Radon-Nikodym Theorem) it uniquely defines the probability distribution of a random variable. □

As another application of the Radon-Nikodym Theorem, we formulate the following result.

2.4 Corollary. Let $[\Omega, \overline{\mathbb{R}}_+, f]$ be a $\Sigma$-$\mathfrak{B}_+$-measurable function and let $\mu$ be a finite measure on $\Sigma$. Then for each sub-$\sigma$-algebra $\Sigma_0 \subseteq \Sigma$, there exists a unique equivalence class $[f_0]_\mu \subseteq \mathbb{L}(\Omega, \Sigma_0, \mu; \overline{\mathbb{R}}_+)|_\mu$, such that

$$\int_{A_0} f_0\, d\mu = \int_{A_0} f\, d\mu$$

for each $f_0 \in [f_0]_\mu$ and for all $A_0 \in \Sigma_0$.

Proof. Let $\mu_0 = \mathrm{Res}_{\Sigma_0}\, \mu$ and let $\nu = \int f\, d\mu$. Then, for any $A_0 \in \Sigma_0$,

$$\nu(A_0) = \int_{A_0} f\, d\mu \qquad (2.4a)$$

and therefore $\nu \ll \mu_0$ on $\Sigma_0$. By the Radon-Nikodym Theorem (Case 3), there is a nonnegative $\Sigma_0$-$\mathfrak{B}_+$-measurable equivalence class $\frac{d\nu}{d\mu_0}$ of Radon-Nikodym densities such that for each $f_0 \in \frac{d\nu}{d\mu_0}$,

$$\nu(A_0) = \int_{A_0} f_0\, d\mu_0. \qquad (2.4b)$$

The statement of the corollary now follows from (2.4a) and (2.4b). □
Corollary 2.4 can be generalized as follows.

2.5 Proposition. Let $[\Omega, \mathbb{R}, f]$ be a $\Sigma$-$\mathfrak{B}$-measurable function from $\mathbb{L}(\Omega, \Sigma, \mu; \mathbb{R})$ and let $\mu$ be a finite measure on $\Sigma$. Then, for each sub-$\sigma$-algebra $\Sigma_0$ such that $\Sigma_0 \subseteq \Sigma$, there exists a unique quotient class $[f_0]_\mu \subseteq \mathbb{L}|_\mu$, such that

$$\int_{A_0} f_0\, d\mu = \int_{A_0} f\, d\mu$$

for each $f_0 \in [f_0]_\mu$ and for all $A_0 \in \Sigma_0$. (See Problem 2.3.) □
The above propositions find an important application in probability theory.

2.6 Definitions.

(i) Let $X$ be a random variable on a probability space $(\Omega, \Sigma, \mathbb{P})$ valued in $\mathbb{R}$ and let $\Sigma_0$ be any sub-$\sigma$-algebra of $\Sigma$. Then, in light of Proposition 2.5, there exists a class of $\mathbb{P}$-integrable random variables $[X_0]_{\mathbb{P}}$ such that for each $X_0 \in [X_0]_{\mathbb{P}}$ the equation

$$\int_{A_0} X_0\, d\mathbb{P} = \int_{A_0} X\, d\mathbb{P} \qquad (2.6a)$$

holds true for all $A_0 \in \Sigma_0$. The class $[X_0]_{\mathbb{P}}$ of $\mathbb{P}$-equivalent random variables is called the conditional expectation of $X$ given $\sigma$-hypothesis $\Sigma_0$, in notation,

$$[X_0]_{\mathbb{P}} = \mathbb{E}[X \mid \Sigma_0] = \mathbb{E}^{\Sigma_0}[X]. \qquad (2.6b)$$

Any random variable $X_0$ from the class $[X_0]_{\mathbb{P}}$ is called a version of the conditional expectation $\mathbb{E}^{\Sigma_0}[X]$.

(ii) For a measurable set (event) $A \in \Sigma$ take $X = 1_A$. Then, for a sub-$\sigma$-algebra $\Sigma_0 \subseteq \Sigma$, $\mathbb{E}^{\Sigma_0}[1_A]$ is called the conditional probability of event $A$ given $\sigma$-hypothesis $\Sigma_0$ and it is denoted by $\mathbb{P}(A \mid \Sigma_0)$ or by $\mathbb{P}^{\Sigma_0}(A)$.

The following construction explains why $[X_0]_{\mathbb{P}}$ is called the "conditional expectation."
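2.7 Examples.

(i) Let $X$ be a random variable on a probability space $(\Omega, \Sigma, \mathbb{P})$ and let $\Omega = \sum_{n=1}^{\infty} H_n$ be a measurable decomposition such that $\mathbb{P}(H_n) > 0$. Then for each $n = 1, 2, \ldots$, the conditional probability

$$\mathbb{P}_{H_n}(A) = \mathbb{P}(A \mid H_n) = \frac{\mathbb{P}(A \cap H_n)}{\mathbb{P}(H_n)}$$

defines the probability measure $\mathbb{P}_{H_n}$ on the new measurable space $(H_n, \Sigma \cap H_n)$, where

$$\mathbb{P}_{H_n} = \frac{1}{\mathbb{P}(H_n)} \mathrm{Res}_{\Sigma \cap H_n}\, \mathbb{P}.$$

Thus, the expected value of $X$ with respect to measure $\mathbb{P}_{H_n}$ is then

$$\mathbb{E}[X \mid H_n] = \int X\, d\mathbb{P}_{H_n} = \frac{1}{\mathbb{P}(H_n)} \int_{H_n} X\, d\mathbb{P}, \qquad (2.7)$$

which is called the conditional expectation of $X$ given the hypothesis $H_n$. Observe that the value $\mathbb{E}[X \mid H_n]$ is a constant (random variable). Now consider the random variable

$$X_0 = \sum_{n=1}^{\infty} \mathbb{E}[X \mid H_n]\, 1_{H_n}, \qquad (2.7a)$$

which is $\Sigma_0$-$\mathfrak{B}$-measurable, where $\Sigma_0 = \sigma(\{H_n\})$ is a $\sigma$-algebra generated by the sequence of hypotheses $\{H_n\}$. Obviously,

$$\Sigma_0 = \left\{ \Omega, \emptyset, A = \sum_{i \in I} H_i : I \subseteq \mathbb{N} \right\}.$$

Hence, for every $A = \sum_{i \in I} H_i \in \Sigma_0$ (which is a union of some $H_i$'s):

$$\int_A X_0\, d\mathbb{P} = \sum_{i \in I} \mathbb{E}[X \mid H_i]\, \mathbb{P}(H_i) = \sum_{i \in I} \int_{H_i} X\, d\mathbb{P} = \int_A X\, d\mathbb{P}.$$

The random variable $X_0$ is then a version of the conditional expectation $\mathbb{E}[X \mid \Sigma_0]$ that belongs to the class $[X_0]_{\mathbb{P}}$.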
(ii) We consider a special case of the above example. Let $\Omega = [0, 1)$, $\Sigma = \mathfrak{B} \cap [0, 1)$ and $\mathbb{P} = \operatorname{Res}_{\Sigma} \lambda$ (where $\lambda$ denotes the Borel-Lebesgue measure). As decomposition, take $\Omega = \sum_{k=1}^{n} H_k$, where $H_k = \left[\frac{k-1}{n}, \frac{k}{n}\right)$. Let $X(\omega) = \omega$ for all $\omega \in \Omega$. Then,
$\mathbb{P}(H_k) = \frac{1}{n}$
and
$\int_{H_k} X\,d\mathbb{P} = \int_{H_k} \omega\,\lambda(d\omega) = \frac{2k-1}{2n^2}.$
Thus, from (2.7),
$\mathbb{E}[X \mid H_k] = \frac{2k-1}{2n},$
and from (2.7a),
$X_0 = \sum_{k=1}^{n} \frac{2k-1}{2n}\, 1_{H_k},$
as a version of the conditional expectation $\mathbb{E}[X \mid \Sigma_0]$, where $\Sigma_0 = \sigma(H_1, \ldots, H_n)$.
(iii) Let $X$ and $Y$ be two random variables on a probability space $(\Omega, \Sigma, \mathbb{P})$. Then, $\Sigma_0 = \sigma(Y)$ is a sub-$\sigma$-algebra of $\Sigma$ generated by $Y$. The corresponding conditional expectation of $X$ given $\Sigma_0$ is denoted $\mathbb{E}[X \mid Y]$ or $\mathbb{E}^{Y}[X]$. □
2.8 Remarks.
(i) Observe that from (2.6a) and (2.6b) it does not follow that $\mathbb{E}^{\Sigma_0}[X] = X$ (mod $\mathbb{P}$), because $X$ need not be $\Sigma_0$-measurable. However, $\mathbb{E}^{\Sigma_0}[X] = X$ (mod $\mathbb{P}$) if $X$ is $\Sigma_0$-measurable (see Problem 2.10).
(ii) Note that if two random variables $X$ and $Y$ belong to the same equivalence class, we would normally write $X = Y$ (mod $\mathbb{P}$) or $X = Y$ $\mathbb{P}$-a.e. on $\Omega$. In probability, however, the latter is usually denoted by $X = Y$ $\mathbb{P}$-a.s. on $\Omega$ or just a.s. (read "almost surely"). □
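Returning to Example 2.7 (ii), the values of the conditional expectation on the cells $H_k$ can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the function names `conditional_expectation_values` and `version_x0` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def conditional_expectation_values(n):
    """Values E[X | H_k], k = 1..n, for X(w) = w on [0, 1) with Lebesgue measure,
    where H_k = [(k-1)/n, k/n)."""
    values = []
    for k in range(1, n + 1):
        a, b = Fraction(k - 1, n), Fraction(k, n)
        prob = b - a                      # P(H_k) = 1/n
        integral = (b * b - a * a) / 2    # integral of w over H_k
        values.append(integral / prob)    # formula (2.7)
    return values


def version_x0(w, n):
    """Evaluate the step function X0 = sum_k E[X | H_k] 1_{H_k} at a point w in [0, 1)."""
    k = int(w * n) + 1                    # index of the cell H_k that contains w
    return conditional_expectation_values(n)[k - 1]


vals = conditional_expectation_values(4)
print(vals)
print(version_x0(Fraction(3, 10), 4))
```

Averaging the values over the cells (each of probability $1/n$) returns $\mathbb{E}[X] = 1/2$, in agreement with Problem 2.8.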
After a short break from the Radon-Nikodym Theorem for signed measures, we return to this theme with a version of the Radon-Nikodym Theorem for complex measures. This is readily done as follows. Firstly, given a $\sigma$-finite positive measure $\mu \in \mathfrak{M}(\Omega, \Sigma)$, we will denote by
$\mathfrak{C}_{\mu\ll} = \{\nu \in \mathfrak{C}(\Omega, \Sigma) : \nu \ll \mu\}.$
Let $\nu \in \mathfrak{C}_{\mu\ll}$ and let $\nu = \nu_1 + i\nu_2$. Since $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$ and $\nu_1$ and $\nu_2$ are finite signed measures, according to case 5a of the Radon-Nikodym Theorem, there are two equivalence classes $[g_1]_\mu$ and $[g_2]_\mu$ of Radon-Nikodym densities from the factor space $L^1(\Omega, \Sigma, \mu; \mathbb{R})|_\mu$ so that, for all elements $g_1$ and $g_2$ of their respective classes,
$\nu(A) = \int_A g_1\,d\mu + i \int_A g_2\,d\mu = \int_A (g_1 + i g_2)\,d\mu, \quad A \in \Sigma,$
thereby making $[g]_\mu = [g_1]_\mu \times [g_2]_\mu \subseteq L^1(\Omega, \Sigma, \mu; \mathbb{C})$ (see Definition 1.16) the desired Radon-Nikodym derivative. The uniqueness of $[g]_\mu$ is based on that for signed measures. Summarizing the above arguments we have:
2.9 Theorem (Radon-Nikodym for complex measures). Let $\mu \in \mathfrak{M}(\Omega, \Sigma)$ be a $\sigma$-finite measure. Then $[L^1(\Omega, \Sigma, \mu; \mathbb{C})|_\mu,\ \mathfrak{C}_{\mu\ll},\ \int \cdot\, d\mu]$ is a bijective map. □
Finally, with the reader's help (Problem 2.1) we will establish a small, but useful result.
2.10 Proposition. Let $\nu$ be a signed measure and $\mu$ be a positive measure. Then $\nu \ll \mu$ if and only if $|\nu| \ll \mu$. □
PROBLEMS
2.1 Prove Proposition 2.10.
2.2 Consider in case 1 of the Radon-Nikodym Theorem, the partial order $\preceq$ on $\Phi$ by defining $f \preceq g$ if and only if $f \le g$ $\mu$-a.e. Show that any chain in $\Phi$ has an upper bound and thus, by Zorn's Lemma, 4.13, Chapter 1,
2.3 2.4 Let $\mu \in \mathfrak{M}(\Omega, \Sigma)$ (i.e. a positive measure) and $\nu$ be a $\sigma$-finite signed measure such that $\nu = \int f\,d\mu$. Show that if
$\int_A f\,d\mu = \int_A g\,d\mu$, for all $A \in \Sigma$,
then $f = g$ (mod $\mu$).
2.5 Let $(\Omega, \Sigma, \nu)$ be a complex measure space. Show that the Radon-Nikodym derivative $\frac{d\nu}{d|\nu|}$ satisfies $\left|\frac{d\nu}{d|\nu|}\right| = 1$ $|\nu|$-a.e. on $\Omega$. [Hint: Use Problem 1.19.]
2.6 In the condition of Problem 2.5, show that for each $f \in \mathcal{C}_b^{-1}(\Omega, \Sigma; \mathbb{R})$ (see Definition 1.16 (iii)),
$\int f\,d\nu = \int f\, \frac{d\nu}{d|\nu|}\, d|\nu|.$
[Hint: Use Problem 2.5.]
2.7 Let $(\Omega, \Sigma, \nu)$ be a $\sigma$-finite signed measure space and let $\mu$ and $\rho$ be two positive $\sigma$-finite measures on $(\Omega, \Sigma)$ with $\nu \ll \mu$ and $\mu \ll \rho$. Show that $\nu \ll \rho$ and prove the chain rule
$\frac{d\nu}{d\rho} = \frac{d\nu}{d\mu} \frac{d\mu}{d\rho}$ $\rho$-a.e. on $\Omega$.
If, in addition, $\rho \ll \mu$, then
$\frac{d\rho}{d\mu} = \left(\frac{d\mu}{d\rho}\right)^{-1}$ $\mu$- (or $\rho$-) a.e.
2.8 Show that $\mathbb{E}[\mathbb{E}^{\Sigma_0}[X]] = \mathbb{E}[X]$.
2.9 Show that $\mathbb{E}^{\Sigma_0}[aX + bY] = a\,\mathbb{E}^{\Sigma_0}[X] + b\,\mathbb{E}^{\Sigma_0}[Y]$ a.s.
2.10 Show that if $X$ is $\Sigma_0$-measurable, then $\mathbb{E}^{\Sigma_0}[X] = X$ a.s.
2.11 Show that if $X \le Y$ a.s. then $\mathbb{E}^{\Sigma_0}[X] \le \mathbb{E}^{\Sigma_0}[Y]$ a.s.
2.12 Let $Y$ be a $\Sigma_0$-measurable and $\mathbb{P}$-integrable random variable and let $X$ be a $\Sigma$-measurable random variable such that $XY \in L^1(\mathbb{P})$. Show that $\mathbb{E}^{\Sigma_0}[XY] = Y\,\mathbb{E}^{\Sigma_0}[X]$ a.s.
2.13 Show that $\mathfrak{C}_{\mu\ll}$ is a linear space over the field $\mathbb{C}$. Does the same apply to $\mathfrak{S}_{\mu\ll}$?
2. Absolute Continuity NEW TERMS:
absolutely continuous signed measure 437
Radon-Nikodym density of a signed measure 438
Radon-Nikodym Theorem for a signed measure 438
Radon-Nikodym derivative of a signed measure 444
probability density function 444
probability distribution function 444
continuous random variable 445
conditional expectation given a $\sigma$-hypothesis 446
version of the conditional expectation 446
conditional probability of an event given a $\sigma$-hypothesis 446
conditional expectation given a random variable 448
almost surely equality 448
Radon-Nikodym Theorem for a complex measure 448
chain rule 449
3. SINGULARITY
The singularity (which we introduced in Section 5, Chapter 6, for positive measures) is a sort of opposite notion to continuity.
3.1 Definition and Notation. Let $\nu$ and $\rho$ be two signed or complex measures on a measurable space $(\Omega, \Sigma)$. $\nu$ is said to be singular with respect (or orthogonal) to $\rho$, in notation, $\nu \perp \rho$, if there is a measurable partition $(\Omega_1, \Omega_2)$ of $\Omega$ such that $|\nu|(\Omega_1) = |\rho|(\Omega_2) = 0$. Clearly, $(\mathfrak{S}, \perp)$ is a symmetric relation. Therefore, $\nu$ and $\rho$ are to be called mutually singular or just singular. [Because the total variations of complex measures coincide with that for finite signed measures (Problem 1.13) and the total variations of signed and positive measures are equal, the above definition of singularity agrees with that for positive measures.] A signed or complex measure orthogonal to the Lebesgue measure is called just singular.
Given a signed measure space $(\Omega, \Sigma, \nu)$, we will denote by $\mathfrak{S}_{\nu\perp}(\Omega, \Sigma)$ the subset of all signed measures of $\mathfrak{S}(\Omega, \Sigma)$ orthogonal to $\nu$. □
We establish a few major properties of singular measures.
3.2 Proposition. Let $\mu$ be a positive measure and $\nu$ and $\rho$ be signed measures on the measurable space $(\Omega, \Sigma)$. The following hold true:
(i) If $\nu = \nu^+ - \nu^-$ is the Jordan decomposition, then $\nu^+ \perp \nu^-$.
(ii) If $\nu \in \mathfrak{S}_{\mu\perp}$ and $\rho \in \mathfrak{S}_{\mu\perp}$, then $\nu + \rho,\ \nu - \rho \in \mathfrak{S}_{\mu\perp}$.
(iii) $\nu \perp \mu$ if and only if $\nu^+ \perp \mu$ and $\nu^- \perp \mu$.
(iv) If $\nu \ll \mu$ and $\rho \in \mathfrak{S}_{\mu\perp}$, then $\nu \perp \rho$.
(v) If $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu = 0$.
Proof. We leave (i) for the reader. (Problem 3.1.)
(ii) By the definition, there are two measurable sets $A$ and $B$ such that $\mu(A) = \mu(B) = 0$ and $|\nu|(A^c) = |\rho|(B^c) = 0$. Then, by Problem 1.11, $\nu = 0$ on $\Sigma \cap A^c$ and $\rho = 0$ on $\Sigma \cap B^c$. Consequently, $\nu$, $\rho$, $\nu + \rho$, and $\nu - \rho$ are identically zero, each one on $\Sigma \cap (A^c \cap B^c)$. Again, applying Problem 1.11, we see that the measures $|\nu + \rho|$ and $|\nu - \rho|$ attain zero on the set $A^c \cap B^c$. On the other hand, obviously, $\mu(A \cup B) = 0$.
(iii) a) $\nu \perp \mu$ implies that $|\nu|(A) = \nu^+(A) + \nu^-(A) = \mu(A^c) = 0$ for some $A$ and therefore $\nu^+(A) = \nu^-(A) = 0$.
b) If $\nu^+ \perp \mu$ and $\nu^- \perp \mu$, then by (ii), $|\nu| = \nu^+ + \nu^- \perp \mu$.
(iv) Since $\rho \perp \mu$, there is a set $A \in \Sigma$ such that $\mu(A) = |\rho|(A^c) = 0$. By Proposition 2.10, $|\nu| \ll \mu$. In other words, $|\nu|(A) = 0$, which proves the statement.
(v) Replacing $\rho$ in (iv) by $\nu$ we have in the condition of (iv) that $\nu \perp \nu$. Therefore, there is an $A \in \Sigma$ such that $|\nu|(A) = |\nu|(A^c) = 0$ and, since $|\nu|$ is a positive measure, $|\nu|(\Omega) = 0$ and $|\nu| = \nu^+ = \nu^- = 0$. □
3.3 Definition. Let $\mu$ be a positive measure and $\nu$ a signed measure. If $\nu$ has a decomposition in two signed measures in the form
$\nu = \nu_a + \nu_s$, with $\nu_a \ll \mu$ and $\nu_s \perp \mu$,
then it is called a Lebesgue decomposition of $\nu$ with respect to $\mu$. The measures $\nu_a$ and $\nu_s$ are said to be the absolutely continuous and singular components of $\nu$ with respect to $\mu$. □
3.4 Theorem (Lebesgue Decomposition Theorem). Let $\mu$ be a $\sigma$-finite positive measure and $\nu$ be a $\sigma$-finite signed measure, both on a measurable space $(\Omega, \Sigma)$. Then, there is a unique Lebesgue decomposition of $\nu$ with respect to $\mu$.
Proof. Let $\nu$ be a $\sigma$-finite positive measure. Obviously, $\mu + \nu$ is a $\sigma$-finite positive measure and both $\mu$ and $\nu$ are absolutely continuous with respect to $\mu + \nu$. By the Radon-Nikodym Theorem, case 4, there is a unique equivalence class $[f]_{\mu+\nu}$ of (nonnegative) Radon-Nikodym densities of $\mu$ with respect to $\mu + \nu$. Let $f$ be one such density. Denote $E = \{f > 0\}$ and define two measures:
$\nu_a = \operatorname{Res}_{\Sigma \cap E}\, \nu$ and $\nu_s = \operatorname{Res}_{\Sigma \cap E^c}\, \nu$.
Obviously, $\nu_a + \nu_s = \nu$. Let $\mu(A) = 0$ for some $A \in \Sigma$. Then, since
$0 = \mu(A) = \int_A f\,d(\mu + \nu)$ and $f \ge 0$,
it follows that $1_A f \in [0]_{\mu+\nu}$. On the other hand, because $f > 0$ on $E \cap A$, the set $E \cap A$ is $(\mu + \nu)$-null and, therefore, $\nu$-null as well. Consequently, $\nu_a(A) = 0$ or, in other words, $\nu_a \ll \mu$. To show that $\nu_s \perp \mu$, observe that $\nu_s(E) = 0$, whereas
$\mu(E^c) = \int_{E^c} f\,d(\mu + \nu) = 0.$
Now, let $\nu$ be a $\sigma$-finite signed measure with its Jordan decomposition $\nu = \nu^+ - \nu^-$. Applying the above arguments to $\nu^+$ and $\nu^-$ (with respect to the same set $E$) we have that $\nu_a^+ \ll \mu$ and $\nu_a^- \ll \mu$, which makes $\nu_a = \nu_a^+ - \nu_a^- \ll \mu$. The same applies to $|\nu_s| = \nu_s^+ + \nu_s^- \perp \mu$ in the proof of $\nu_s \perp \mu$.
Now, we prove uniqueness. Let $\nu$ be a finite signed measure and suppose that
$\nu = \nu_a + \nu_s = \nu_1 + \nu_2$, with $\nu_1 \ll \mu$ and $\nu_2 \perp \mu$.
Then, because $\nu$ is finite, by Problem 2.13, $\nu_a - \nu_1 = \nu_2 - \nu_s$ is a signed measure, absolutely continuous with respect to $\mu$ and, by Proposition 3.2 (ii), orthogonal to $\mu$. Thus, by Proposition 3.2 (v), the signed measure $\nu_a - \nu_1 = \nu_2 - \nu_s$ must be 0. If $\nu$ is $\sigma$-finite, then there is a countable measurable partition $\{\Omega_n\}$ of $\Omega$ so that $\nu$ is finite on each $\Omega_n$. Then, by the above arguments, the restriction of the Lebesgue decomposition of $\nu$ on each $(\Omega_n, \Sigma \cap \Omega_n)$ is unique, which obviously yields uniqueness of the Lebesgue decomposition of $\nu$ on the entire $(\Omega, \Sigma)$. □
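On a finite set with point masses, the construction used in the proof of the Lebesgue Decomposition Theorem can be carried out by hand. The sketch below is only an illustration under that assumption (positive measures stored as dictionaries of point masses); the function name `lebesgue_decomposition` is hypothetical. It forms the density $f$ of $\mu$ with respect to $\mu + \nu$, takes $E = \{f > 0\}$, and restricts $\nu$ to $E$ and to $E^c$.

```python
def lebesgue_decomposition(mu, nu):
    """Split nu into (nu_a, nu_s) with nu_a << mu and nu_s singular to mu.

    mu, nu: dicts mapping points of a finite set to nonnegative point masses.
    """
    points = set(mu) | set(nu)
    nu_a, nu_s = {}, {}
    for w in points:
        m, v = mu.get(w, 0), nu.get(w, 0)
        total = m + v
        # f = d(mu)/d(mu + nu); where mu + nu vanishes the density is arbitrary, take 0
        f = m / total if total > 0 else 0
        if f > 0:          # w belongs to E = {f > 0}
            nu_a[w] = v
            nu_s[w] = 0
        else:              # w belongs to the complement of E
            nu_a[w] = 0
            nu_s[w] = v
    return nu_a, nu_s


mu = {"a": 1.0, "b": 2.0, "c": 0.0}
nu = {"a": 0.5, "b": 0.0, "c": 3.0, "d": 1.5}
nu_a, nu_s = lebesgue_decomposition(mu, nu)
print(nu_a)
print(nu_s)
```

Here $\nu_a$ charges only points that carry $\mu$-mass, while $\nu_s$ lives on the $\mu$-null set $\{c, d\}$.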
Next, we consider yet another decomposition of a measure into two mutually singular components.
3.5 Definitions. Let $(\Omega, \Sigma, \nu)$ be a signed measure space such that for each $\omega \in \Omega$, $\{\omega\} \in \Sigma$.
(i) A point $a \in \Omega$ is said to be a $\nu$-atom (or just an atom) if $|\nu|(\{a\}) > 0$. In this case, we also say that $\nu$ has an atom at $\{a\}$. $\nu$ is called atomic (or discrete) if the set of atoms of $\nu$ is at most countable, i.e. there is a countable set $A \subseteq \Omega$ of atoms such that $|\nu|(A^c) = 0$.
(ii) $\nu$ is called continuous if $|\nu|(\{\omega\}) = 0$ for all $\omega$'s.
Notice that if $(\Omega, \Sigma, \nu)$ is an atomic measure space with respect to a countable set $A$ on which $\nu$ is concentrated, then $\nu$ can be represented as
$\nu = \sum_{\omega \in A} \nu(\{\omega\})\, \varepsilon_\omega.$ (3.5)
□
Apparently, if $\nu$ and $\rho$ are signed measures on $(\Omega, \Sigma)$, as in Definition 3.5, such that $\nu$ is continuous and $\rho$ is atomic, then $\nu \perp \rho$. It seems plausible that a signed measure $\nu$ on $(\Omega, \Sigma)$ is, in general, of the mixed type and that it permits a decomposition $\nu = \nu_c + \nu_d$ into a continuous and discrete component. Of course, in contrast with the Lebesgue decomposition, there is no "third party measure" involved. We start with positive measures.
3.6 Theorem. Let $(\Omega, \Sigma)$ be a measurable space such that for each $\omega \in \Omega$, $\{\omega\} \in \Sigma$ and let $\mu$ be a $\sigma$-finite positive measure on $(\Omega, \Sigma)$. Then there is a unique decomposition $\mu = \mu_c + \mu_d$ into a continuous and discrete component such that $\mu_c \perp \mu_d$.
Proof. Assume that $\mu$ is finite. Let $C$ be any countable subset of $\Omega$. Then $C$ is measurable and
$\mu(C) = \sum_{\omega \in C} \mu(\{\omega\}) \le \mu(\Omega) < \infty.$ (3.6)
Obviously,
$\sum_{\omega \in \Omega} \mu(\{\omega\}) = \sup\{\mu(C) : C \in \Sigma \text{ and } C \preceq \mathbb{N}\}.$
From (3.6) we have that $\sum_{\omega \in \Omega} \mu(\{\omega\}) < \infty$. Thus, $\sum_{\omega \in \Omega} \mu(\{\omega\})$ can have at most countably many positive terms. In other words, the set $A$ of all $\mu$-atoms can be at most countable. Denote
$\mu_d(B) = \mu(A \cap B), \quad B \in \Sigma.$
Then, $\mu_d$ is an atomic measure. We will show that the set function $\mu_c = \mu - \mu_d$ is a positive measure. It clearly suffices to show that $\mu_c \ge 0$. Let $B$ be a measurable set. Then,
$\mu(B) = \mu(A \cap B) + \mu(A^c \cap B) = \mu_d(B) + \mu(A^c \cap B).$
Clearly, $\mu_c$ is continuous and, as mentioned previously, $\mu_c \perp \mu_d$. Consequently, $\mu = \mu_c + \mu_d$ is the desired decomposition.
Now suppose that $\mu$ is $\sigma$-finite and let $\{\Omega_n\}$ be a countable measurable partition of $\Omega$ such that
$\mu_n = \operatorname{Res}_{\Sigma \cap \Omega_n}\, \mu$
is finite for each $n$. Applying the above arguments to every $\mu_n$, we arrive at the decomposition $\mu_n = \mu_n^c + \mu_n^d$ relative to the set $A_n$ of the atoms of $\mu_n$. Then, $A = \sum_n A_n$ is the set of all atoms of $\mu$ and
$\mu = \sum_n \mu_n^c + \sum_n \mu_n^d = \mu_c + \mu_d$
is the desired decomposition of $\mu$ and $\mu_c \perp \mu_d$ with respect to $A$. It now remains to prove the uniqueness of the decomposition. Let
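$\mu = \mu_c + \mu_d = \rho_c + \rho_d.$ (3.6a)
Since the set $A$ of all atoms of $\mu$ is unique, both $\mu_d$ and $\rho_d$ are concentrated on $A$, which makes them clearly equal. If $B$ is a $\mu$-finite measurable set, then $\mu_d = \rho_d$ and (3.6a) immediately imply that $\mu_c(B) = \rho_c(B)$. Otherwise, let $B_n = B \cap \Omega_n$, where $\{\Omega_n\} \uparrow \Omega$ and $\mu_n = \operatorname{Res}_{\Sigma \cap \Omega_n}\, \mu < \infty$. Then, $\mu_c(B_n) = \rho_c(B_n)$ and continuity from below lead to
$\mu_c(B) = \lim_{n \to \infty} \mu_c(B_n) = \lim_{n \to \infty} \rho_c(B_n) = \rho_c(B)$
and to the equality of $\mu_c$ and $\rho_c$. □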
3.7 Theorem. Let $(\Omega, \Sigma)$ be a measurable space as in Theorem 3.6 and $\nu$ be a $\sigma$-finite signed measure on $(\Omega, \Sigma)$. Then, given a $\sigma$-finite positive measure $\mu$ on $(\Omega, \Sigma)$, there is a unique decomposition
$\nu = \nu_{ca} + \nu_{cs} + \nu_d$
with respect to $\mu$ into three $\sigma$-finite signed measures, of which the first one is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$, and the third one is atomic. Furthermore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$.
Proof. Let $\nu = \nu^+ - \nu^-$ be its Jordan decomposition. Then, by Theorem 3.6, $\nu^+$ and $\nu^-$ can be decomposed as
$\nu^+ = \nu_c^+ + \nu_d^+$ and $\nu^- = \nu_c^- + \nu_d^-$
relative to the sets $A^+$ and $A^-$ of atoms of $\nu^+$ and $\nu^-$, respectively. Consequently,
$\nu = (\nu_c^+ - \nu_c^-) + (\nu_d^+ - \nu_d^-) = \nu_c + \nu_d$
is the corresponding decomposition of the signed measure $\nu$ into its continuous $\nu_c$ and atomic $\nu_d$ components with respect to the set $A = A^+ \cup A^-$ of atoms of $\nu$. This representation is obviously unique.
Now, given a $\sigma$-finite positive measure $\mu$, let $\nu = \nu_c + \nu_d$ be the decomposition (with respect to the set $A$ of atoms of $\nu$). According to Theorem 3.4, there is a unique Lebesgue decomposition $\nu_c = \nu_{ca} + \nu_{cs}$ with respect to $\mu$. Therefore, $\nu = \nu_{ca} + \nu_{cs} + \nu_d$ is a unique decomposition of $\nu$ with respect to $\mu$ into three $\sigma$-finite signed measures of which the first is continuous and absolutely continuous with respect to $\mu$, the second is continuous and singular with respect to $\mu$ and the third one is atomic. Furthermore, we have that $\nu_{ca}(A) = \nu_{cs}(A) = \nu_d(A^c) = 0$. Therefore, $\nu_d \perp \nu_{ca}$ and $\nu_{cs} \perp \nu_d$. □
3.8 Corollary. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on $(\mathbb{R}^n, \mathfrak{B})$ and $\lambda$ be the Borel-Lebesgue measure. Then, there is a unique
decomposition
$\nu = \nu_a + \nu_{cs} + \nu_d$ (3.8)
with respect to the Borel-Lebesgue measure $\lambda$ such that $\nu_a \ll \lambda$, $\nu_{cs} + \nu_d \perp \lambda$, and $\nu_{cs} \perp \nu_d$.
Proof. Because any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite, by Theorem 3.7, $\nu$ can uniquely be decomposed as
$\nu = \nu_{ca} + \nu_{cs} + \nu_d,$
where $\nu_{ca} \ll \lambda$. Since obviously, $\nu_d \perp \lambda$, by Proposition 3.2 (ii),
$\nu_{cs} + \nu_d \perp \lambda.$
Because the Lebesgue decomposition is unique, it follows that $\nu_{ca}$ is the absolutely continuous and $\nu_{cs} + \nu_d$ is the singular component in the Lebesgue decomposition of $\nu$. In particular, it follows that $\nu_a = \nu_{ca}$ is also continuous. □
3.9 Definition. The singular components $\nu_{cs}$ and $\nu_d$ of $\nu$ in decomposition (3.8) are said to be singular-continuous and singular-discrete (or just discrete), respectively. □
We are going to continue our discussion of singularity of measures in Section 4, Chapter 9.
PROBLEMS
3.1 Prove part (i) of Proposition 3.2.
3.2 Generalize Proposition 3.2 for complex measures replacing signed measures.
3.3 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ in Theorem 3.4.
3.4 Prove a version of the Lebesgue Decomposition Theorem with a complex measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.
3.5 Prove a version of the Lebesgue Decomposition Theorem with a $\sigma$-finite positive measure replacing the signed measure $\nu$ and an arbitrary positive measure $\mu$ replacing the $\sigma$-finite positive measure $\mu$ in Theorem 3.4.
3.6 Let $\nu_a$ and $\nu_s$ be the absolutely continuous and singular components of a complex measure $\nu$ with respect to a positive measure $\mu$. Show that $|\nu| = |\nu_a| + |\nu_s|$.
3. Singularity NEW TERMS:
singularity (orthogonality) of a signed measure 452
orthogonality (singularity) of a signed measure 452
Lebesgue decomposition of a signed measure 453
absolutely continuous component of a signed measure 453
component of a signed measure 453
Lebesgue Decomposition Theorem 453
atom ($\nu$-atom) 454
atom of a singular measure 454
continuous singular measure 454
decomposition of a positive measure 454
decomposition of a $\sigma$-finite signed measure 456
singular components of a signed measure 457
singular-continuous component of a signed measure 457
singular-discrete component of a signed measure 457
4. $L^p$ SPACES
This section will deal with the so-called $L^p$-spaces and give more systematic studies of them as metric spaces.
4.1 Notation. Let $(\Omega, \Sigma, \mu)$ be a (positive) measure space. Then, for $0 < p < \infty$, we denote by $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ the set of all measurable complex-valued functions $f$ such that $|f|^p \in L^1(\Omega, \Sigma, \mu; \mathbb{C})$. In particular, if $\mu$ is the counting measure on $(\Omega, \Sigma)$ with $\Omega = \{1, 2, \ldots\}$, then the set $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ reduces to the familiar $l^p$ space of all $p$-summable sequences. We will occasionally abbreviate $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ as $L^p(\Omega, \Sigma, \mu)$ or just $L^p$. One more notation we are going to use throughout is $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ as the set of all $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$-functions with $|f|^p \in L^1(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$. □
4.2 Proposition. $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is a linear space over the field $\mathbb{C}$.
Proof. Let $a, b \ge 0$, then
$(a + b)^p \le [2(a \vee b)]^p \le 2^p (a^p + b^p).$ (4.2)
Now, for $f, g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$, due to (4.2), we have
$|f + g|^p \le (|f| + |g|)^p \le 2^p (|f|^p + |g|^p),$ (4.2a)
from which we see that $f + g \in L^p$. The other linear space properties are obvious. □
Notice that $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ is sort of quasi-linear over $\mathbb{R}$. Due to (4.2) and the homogeneity, the $L^p$ is "linear" restricted to the scalars from $\mathbb{R}$ but not $\overline{\mathbb{R}}$, of course. Consequently, endowing a norm on $L^p$ should be done with care and respect to the accepted terminology. We now introduce a semi-norm on $L^p$.
4.3 Theorem. The real-valued function $\|\cdot\|_p : L^p(\Omega, \Sigma, \mu; \mathbb{C}) \to \mathbb{R}_+$ defined as
$\|f\|_p = \left( \int |f|^p\,d\mu \right)^{1/p}, \quad 1 \le p < \infty,$
is a semi-norm.
Theorem 4.3, whose proof will follow, essentially reduces to the triangle inequality, which we show in two steps below. Recall (Problem 1.5, Chapter 2) that two real numbers $p > 1$ and $q > 1$ are said to be conjugate exponents if
$\frac{1}{p} + \frac{1}{q} = 1.$
Now we prove the Hölder inequality for the semi-norm $\|\cdot\|_p$ on $L^p(\Omega, \Sigma, \mu; \mathbb{C})$.
4.4 Proposition (Hölder's Inequality). Let $1 < p < \infty$ and $q$ be its conjugate exponent, and let $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ and $g \in L^q(\Omega, \Sigma, \mu; \mathbb{C})$. Then, $fg \in L^1$ and
$\|fg\|_1 \le \|f\|_p\, \|g\|_q.$ (4.4)
Proof. By Problem 1.5, Chapter 2,
$|fg| \le \frac{|f|^p}{p} + \frac{|g|^q}{q}.$
Hence, $|fg|$ is bounded by integrable functions and
$\|fg\|_1 \le \frac{\|f\|_p^p}{p} + \frac{\|g\|_q^q}{q}.$ (4.4a)
If one of the values $\|f\|_p$ or $\|g\|_q$ vanishes or is infinity (or any combination), then (4.4) holds. Assume that neither of them is zero or infinity. Then (4.4a) still holds with $f$ replaced by $f / \|f\|_p$ and $g$ by $g / \|g\|_q$. This yields (4.4). □
Observe that for the special case $p = q = 2$, Hölder's inequality reduces to the frequently used Cauchy-Schwarz inequality. (In addition to (4.4), we have $fg \in L^1$ and $f, g \in L^2$.) Now, we are ready to prove the triangle inequality, known as Minkowski's inequality.
4.5 Proposition (Minkowski's Inequality). Let $1 \le p < \infty$ and $f, g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$. Then $f + g \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ and
$\|f + g\|_p \le \|f\|_p + \|g\|_p.$ (4.5)
Proof. For $p = 1$, (4.5) reduces to the known triangle inequality for $L^1$-space. Assume that $1 < p < \infty$ and denote by $q$ its conjugate exponent. We have
$|f + g|^p = |f + g|\,|f + g|^{p-1} \le |f|\,|f + g|^{p-1} + |g|\,|f + g|^{p-1}.$ (4.5a)
Since obviously $pq - q = p$ and because the space $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is linear,
$\left( |f + g|^{p-1} \right)^q = |f + g|^p \in L^1$
and hence
$|f + g|^{p-1} \in L^q.$
Now we apply the Hölder inequality to $f, g \in L^p$ and to $|f + g|^{p-1} \in L^q$ to have $|f|\,|f + g|^{p-1}$ and $|g|\,|f + g|^{p-1}$ as $L^1$-functions and
$\left\| f\, |f + g|^{p-1} \right\|_1 = \int |f|\,|f + g|^{p-1}\,d\mu \le \|f\|_p \left[ \int \left( |f + g|^{p-1} \right)^q d\mu \right]^{1/q} = \|f\|_p\, \|f + g\|_p^{p/q}$ (since $pq - q = p$) (4.5b)
with
$\left\| g\, |f + g|^{p-1} \right\|_1 \le \|g\|_p\, \|f + g\|_p^{p/q}.$ (4.5c)
Applying the norm (integral operator) to (4.5a-c) we have
$\|f + g\|_p^p \le \|f\|_p\, \|f + g\|_p^{p/q} + \|g\|_p\, \|f + g\|_p^{p/q}.$
Dividing both sides of the last inequality by $\|f + g\|_p^{p/q}$ (of course, we assume $\|f + g\|_p > 0$, or else the triangle inequality holds true immediately) and due to $p - (p/q) = 1$ we have the above assertion. □
Proof of Theorem 4.3. Notice that $\|af\|_p = |a|\,\|f\|_p$ satisfies property (ii) of the norm in Theorem 7.3, Chapter 2. Property (iii) of
the same theorem is subject to the Minkowski inequality. And finally, $f = 0$ implies $\|f\|_p = 0$. The converse however gives a weaker condition: $\|f\|_p = 0$ yields $f = 0$ $\mu$-a.e. Theorem 4.3 is therefore proved. □
4.6 Remark. To make $(L^p, \|\cdot\|_p)$ a normed space we will pass to equivalence classes in the same way as in Sections 1 and 5 of Chapter 6 and Section 2 of the present chapter. Recall that the $\mu$-almost everywhere property of equality of measurable functions generates an equivalence relation $E$ on $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ and thus on $L^p$. Consequently, $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is also a quotient set. Then, $[0]_\mu$ is a linear subspace and
$L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu = L^p(\Omega, \Sigma, \mu; \mathbb{C}) / [0]_\mu$
is the (quotient) space, with the origin $\theta = [0]_\mu$, generated by $E$ and $\|\cdot\|_p$ is now a norm on $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$. Indeed, by Lemma 1.15, Chapter 6, we see that $\|f\|_p = 0$ implies that $f \in [0]_\mu$. □
4.7 Definition. A sequence $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is said to converge in the $p$th mean to a function $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or just $L^p$-converge to $f$) if
$\|f - f_n\|_p \to 0$, for $n \to \infty$.
We will also denote it by $f_n \xrightarrow{L^p} f$.
□
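Hölder's inequality (4.4) and Minkowski's inequality (4.5) can be checked numerically in the simplest setting of a finite set whose points carry positive masses, so that integrals become weighted sums. The following is a minimal sketch under that assumption; the helper `lp_norm` and the sample data are hypothetical and serve only as an illustration.

```python
def lp_norm(f, weights, p):
    """Seminorm ||f||_p for f on a finite set with point masses `weights`."""
    return sum(w * abs(x) ** p for x, w in zip(f, weights)) ** (1.0 / p)


f = [1.0, -2.0, 3.0, 0.5]
g = [0.5, 4.0, -1.0, 2.0]
weights = [1.0, 0.5, 2.0, 1.0]
p = 3.0
q = p / (p - 1.0)                 # conjugate exponent: 1/p + 1/q = 1

fg = [x * y for x, y in zip(f, g)]
f_plus_g = [x + y for x, y in zip(f, g)]

# Hoelder: ||fg||_1 <= ||f||_p ||g||_q
holder_lhs = lp_norm(fg, weights, 1.0)
holder_rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
mink_lhs = lp_norm(f_plus_g, weights, p)
mink_rhs = lp_norm(f, weights, p) + lp_norm(g, weights, p)

print(holder_lhs <= holder_rhs, mink_lhs <= mink_rhs)
```

Both comparisons hold for any choice of data; the sample values are arbitrary.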
Problems 4.2 and 4.3 (which are essentially due to Riesz) state that if an $L^p$-sequence $\{f_n\}$ converges to an $L^p$-function $f$, then the convergence of $\{\|f_n\|_p\}$ to $\|f\|_p$ is equivalent to the convergence of $\{f_n\}$ to $f$ in the $p$th mean. Below we state and prove a more general version of the Lebesgue Dominated Convergence Theorem than Theorem 2.6, Chapter 6, for the $(L^1(\Omega, \Sigma, \mu), \|\cdot\|_1)$-space.
4.8 Theorem (Lebesgue's Dominated Convergence Theorem). Let $(\Omega, \Sigma, \mu)$ be a measure space and $\{f_n\} \subseteq \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$) be an a.e. convergent sequence, a.e. dominated by an $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$ function $g$, more precisely, $|f_n| \le g$ for each $n$ $\mu$-a.e. Then the following are true:
(i) $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$);
(ii) there is an $L^p(\Omega, \Sigma, \mu; \mathbb{C})$-function $f$ such that $\{f_n\}$ converges to $f$ a.e. in the topology of pointwise convergence;
(iii) $f_n \xrightarrow{L^p} f$;
(iv) $\|f_n\|_p \to \|f\|_p$.
Proof. As usual, denote by $\mathcal{N} = \mathcal{N}_\mu$ the subfamily of all measurable $\mu$-null sets. Since $\{f_n\}$ is a.e. convergent pointwise, there is $M \in \mathcal{N}$ such that
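$\lim_{n \to \infty} f_n(\omega)$ exists for all $\omega \in M^c$.
Denote by $L(\omega)$ the value of this limit. Since $g^p \in L^1(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$, by Proposition 1.21, Chapter 6, there is $N \in \mathcal{N}$ such that $g(\omega) < \infty$ on $N^c$. Furthermore, for each $n$ there is a set $O_n \in \mathcal{N}$ such that $|f_n| \le g$ for all $\omega \in O_n^c$. Let $O = \bigcup_{n=1}^{\infty} O_n$. Then, clearly $O \in \mathcal{N}$. Denote $A = M^c \cap N^c \cap O^c$ and $f = L\, 1_A$. Then, $f_n \to f$ $\mu$-a.e., $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. Because $|f_n| \le g < \infty$ on $A$, $|f| \le g$ a.e., $|f| < \infty$ and hence $f$ is valued in $\mathbb{C}$. By Proposition 1.17, Chapter 6, we have that
$\int |f|^p\,d\mu \le \int g^p\,d\mu < \infty$
and, consequently, $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$. Let $g_n = |f_n - f|^p$ and $h = (|f| + g)^p$. Then, the sequence $\{g_n\}$ is nonnegative and is dominated by $h$. Since $|f| + g \in L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$, both $h$ and $g_n$ belong to $L^1(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$. Applying Fatou's Lemma to $h - g_n$ we have
$\int \underline{\lim}\,(h - g_n)\,d\mu \le \underline{\lim} \int (h - g_n)\,d\mu.$
Therefore,
$\int \underline{\lim}\,(h - g_n)\,d\mu \le \int h\,d\mu - \overline{\lim} \int g_n\,d\mu.$ (4.8)
Since $g_n \to 0$ a.e., $h - g_n \to h$ a.e. and therefore $\underline{\lim}\,(h - g_n) = h$ a.e. This and (4.8) yield
$\overline{\lim} \int g_n\,d\mu \le 0.$
Because $g_n \ge 0$, we have
$\lim_{n \to \infty} \int g_n\,d\mu = \lim_{n \to \infty} \|f_n - f\|_p^p = 0.$
Finally, $\|f_n\|_p \to \|f\|_p$ is due to Problem 4.2. □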
We are going to show that the space $L^p(\Omega, \Sigma, \mu; \mathbb{C})$ is complete with respect to the seminorm $\|\cdot\|_p$ and hence the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach.
4.9 Theorem (Riesz-Fischer). Let $\{f_n\} \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$) be a Cauchy sequence with respect to the seminorm $\|\cdot\|_p$. Then, there exists $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ such that $f_n \xrightarrow{L^p} f$.
Proof. Let $\{f_n\}$ be an $L^p$-Cauchy sequence. Then, given $\varepsilon = 2^{-k}$, there is an $N_k$ such that for all indices $n_k, n_{k+1} > N_k$,
$\|f_{n_k} - f_{n_{k+1}}\|_p < 2^{-k}.$ (4.9)
Hence, there is a subsequence $\{f_{n_k}\}$ whose terms satisfy (4.9). Denote
$g_k = f_{n_{k+1}} - f_{n_k}$ and $g = \sum_{k=1}^{\infty} |g_k|$
and apply the inequality of Problem 4.1 to the sequence $\{|g_k|\}$. Then we have from (4.9):
$\|g\|_p \le \sum_{k=1}^{\infty} \|g_k\|_p \le \sum_{k=1}^{\infty} 2^{-k} = 1.$ (4.9a)
Thus, $g \in L^p$ or, equivalently, $g^p \in L^1$. By Proposition 1.21, Chapter 6, $g^p$ and, therefore, $g$ is finite $\mu$-a.e. The latter implies that the partial
sums
$f_{n_1} + g_1 + \ldots + g_k = f_{n_{k+1}}$
and hence the subsequence $\{f_{n_k}\}$ converge $\mu$-a.e. on $\Omega$. Furthermore,
$|f_{n_{k+1}}| = |f_{n_1} + g_1 + \ldots + g_k| \le |f_{n_1}| + g,$
and since (due to (4.9a)) $g \in L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}}_+)$, the subsequence $\{f_{n_k}\}$ is dominated by an integrable nonnegative function $|f_{n_1}| + g$. All other conditions of the Lebesgue Dominated Convergence Theorem 4.8 (applied to the subsequence $\{f_{n_k}\}$) are met. Consequently, there is a function $f \in L^p(\Omega, \Sigma, \mu; \mathbb{C})$ to which $\{f_{n_k}\}$ converges $\mu$-a.e., both in the topology of pointwise convergence and in the $p$th mean.
Finally, $\{f_n\}$, being an $L^p$-Cauchy sequence, by Problem 3.9, Chapter 2, must converge to the same limit function $f$ (as its subsequence $\{f_{n_k}\}$) in the $p$th mean. □
Notice that the function $f$ to which $\{f_n\}$ converges in the $p$th mean is defined uniquely $\mu$-a.e. Therefore, the Riesz-Fischer theorem states that the quotient space $L^p(\Omega, \Sigma, \mu; \mathbb{C})|_\mu$ is Banach. As a byproduct, the theorem provides a subsequence $\{f_{n_k}\}$ of $\{f_n\}$, which converges to $f$ $\mu$-a.e. in the topology of pointwise convergence. The theorem does not state, however, that $\{f_n\}$ also converges to $f$ $\mu$-a.e. pointwise. (The reader is encouraged to provide a counterexample where such an option is not the case, see Problem 4.6.) Below is what we can afford.
4.10 Proposition. If an $L^p(\Omega, \Sigma, \mu; \mathbb{C})$-Cauchy sequence $\{f_n\}$ converges $\mu$-a.e. pointwise to a function $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$, then $f \in L^p$ and $f_n \xrightarrow{L^p} f$.
Proof. By Riesz-Fischer Theorem 4.9, there is an $L^p$-function $\tilde{f}$ such that $f_n \xrightarrow{L^p} \tilde{f}$ and there is a subsequence $\{f_{n_k}\} \subseteq \{f_n\}$ such that $f_{n_k} \to \tilde{f}$ a.e. pointwise. On the other hand, by our assumption, $f_{n_k} \to f$ a.e. pointwise. Therefore, $f \in [\tilde{f}]_\mu$ and the rest of the statement is again due to the Riesz-Fischer Theorem. □
4.11 Proposition. Let $(\Omega, \Sigma, \mu)$ be a measure space such that $\mu$ is finite and let $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$. If $1 \le p \le q < +\infty$, then
$\|f\|_p \le \|f\|_q\, [\mu(\Omega)]^{\frac{1}{p} - \frac{1}{q}}$ (4.11)
and therefore $L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$.
Proof. We assume that $p < q$ or else (4.11) is trivial. Then denote $a = q/p$ and $b = a/(a - 1) = q/(q - p)$. Then, $a$ and $b$ are conjugate exponents with $a > 1$. Since $\mu$ is finite, the constant function $1 \in L^b(\Omega, \Sigma, \mu; \mathbb{R})$. Now apply Hölder's inequality to $|f|^p$ and to $1$ with respect to the conjugate exponents $a$ and $b$:
$\|f\|_p^p = \int |f|^p \cdot 1\,d\mu \le \left( \int |f|^{pa}\,d\mu \right)^{1/a} [\mu(\Omega)]^{1/b}$
or, equivalently (since $pa = q$, $1/a = p/q$ and $1/b = 1 - p/q$),
$\|f\|_p^p \le \|f\|_q^p\, [\mu(\Omega)]^{1 - \frac{p}{q}},$
that proves (4.11). □
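Inequality (4.11) is easy to test on a finite measure space consisting of finitely many weighted points. The sketch below assumes such a space; the helper `lp_norm` and the sample data are hypothetical. It also checks that a constant function turns (4.11) into an equality, which shows that the factor $[\mu(\Omega)]^{1/p - 1/q}$ cannot be improved.

```python
import math


def lp_norm(f, weights, p):
    """Seminorm ||f||_p on a finite set whose points carry the masses `weights`."""
    return sum(w * abs(x) ** p for x, w in zip(f, weights)) ** (1.0 / p)


f = [1.0, -2.0, 3.0, 0.5]
weights = [1.0, 0.5, 2.0, 1.0]      # a finite measure with mu(Omega) = 4.5
total_mass = sum(weights)
p, q = 1.5, 4.0

lhs = lp_norm(f, weights, p)
rhs = lp_norm(f, weights, q) * total_mass ** (1.0 / p - 1.0 / q)
print(lhs, rhs, lhs <= rhs)

# For a constant function the inequality (4.11) becomes an equality.
const = [2.0] * len(weights)
lhs_const = lp_norm(const, weights, p)
rhs_const = lp_norm(const, weights, q) * total_mass ** (1.0 / p - 1.0 / q)
print(math.isclose(lhs_const, rhs_const))
```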
4.12 Examples.
(i) Consider an important special case. If $\mu$ is a probability measure in Proposition 4.11, then the result applied to a random variable $X$ can be interpreted as follows. The existence of the moment of $n$th order implies the existence of all lower moments of $X$.
(ii) The statement of Proposition 4.11 that, for $p < q$,
$L^q(\Omega, \Sigma, \mu; \mathbb{C}) \subseteq L^p(\Omega, \Sigma, \mu; \mathbb{C})$
need not hold if $\mu$ is not finite. For example, let $\Omega = [1, \infty)$ and let $\mu$ be the counting measure concentrated on the set $\{1, 2, \ldots\}$, i.e. $\mu = \sum_{n=1}^{\infty} \varepsilon_n$. Let $f(x) = \frac{1}{x}$. Then,
$\int f^2\,d\mu = \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$
and thus $f \in L^2$. However, it is easily seen that $f \notin L^1$.
□
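The behaviour in Example 4.12 (ii) can be observed on partial sums: the sums of $1/n^2$ stay bounded, while the sums of $1/n$ grow without bound (roughly like $\ln n$). The following is a small numerical sketch; the function name `partial_sum` is hypothetical, and a finite computation of course only illustrates the divergence rather than proving it.

```python
import math


def partial_sum(power, terms):
    """Partial sum of (1/n)**power for n = 1..terms, i.e. the integral of f**power
    over {1, ..., terms} with respect to the counting measure, for f(x) = 1/x."""
    return sum(1.0 / n ** power for n in range(1, terms + 1))


for terms in (10, 1000, 100000):
    # second column stays below pi**2 / 6, third column keeps growing
    print(terms, partial_sum(2, terms), partial_sum(1, terms))
```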
The theorem below states that the space of all real-valued integrable "extended" simple functions is dense in $L^p$. We need the following notation. Let $\Psi^p(\Omega, \Sigma, \mu; \mathbb{R}) = \Psi(\Omega, \Sigma; \mathbb{R}) \cap L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ denote the subset of all real-valued simple $L^p$-integrable functions. (See Remark 6.2 (iii), Chapter 5, on simple functions.)
4.13 Theorem. The real subspace $\Psi^p$ is dense in $(L^p, \|\cdot\|_p)$.
Proof. $\Psi^p \subseteq L^p$, by the definition. Now, given an $f \in L^p$, by Theorem 6.5, Chapter 5, for $f^+$ and $f^-$ there are monotone nondecreasing sequences $\{s_n^+\} \uparrow f^+$ and $\{s_n^-\} \uparrow f^-$. Since $f \in L^p$, so are $f^+, f^- \in L^p$ and, consequently, $\{s_n^+\}, \{s_n^-\} \subseteq \Psi^p$ and $s_n = s_n^+ - s_n^- \in \Psi^p$. By (4.2), and since $f \in L^p$, we have that $\{f - s_n\} \subseteq L^p$. Therefore, the sequence $\{|f - s_n|^p\}$ is dominated by an $L^1$-function $2^{p+1} |f|^p$. We also know that $\{f - s_n\}$ converges to function 0 pointwise. Hence, the sequence $\{f - s_n\}$ meets all criteria of the Lebesgue Dominated Convergence Theorem. As the result, there is an $L^p(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$-function, say $f^*$, to which $\{f - s_n\}$ converges a.e. pointwise. Hence $f^* \in [0]_\mu$ and by setting $f^* = 0$, we have $\lim_{n \to \infty} \|f - s_n\|_p = 0$ or that $s_n \xrightarrow{L^p} f$. In other words, $\Psi^p$ is dense in $L^p$.
□
4.14 Remarks.
(i) Given an $L^p$-function $f$, we proved the existence of an "extended" sequence $\{s_n\}$ of simple functions such that $\{|s_n|\}$ is monotone increasing to $|f|$ and $\{s_n\}$ converges to $f$ pointwise.
(ii) Notice that not only $\overline{\Psi} = \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$ in $(\mathcal{C}^{-1}(\Omega, \Sigma), \tau_\infty)$ (i.e., in the topology of pointwise convergence), but as we showed, the subspace $\Psi^p$ of $\Psi$ is dense in $(L^p, \|\cdot\|_p)$.
(iii) A minor adjustment to Theorem 4.13 allows us to claim that the subspace $\Psi^p(\Omega, \Sigma, \mu; \mathbb{C}) = \Psi(\Omega, \Sigma; \mathbb{C}) \cap L^p(\Omega, \Sigma, \mu; \mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega, \Sigma, \mu; \mathbb{C})$. (Problem 4.8.) □
The following topic on $\mu$-a.e. bounded measurable or "$L^\infty$-functions" occurs often in applications and is going to be explored. We will also see how the $L^\infty$-space fits in the $L^p$-family.
4.15 Definition. Let $f \in \mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ or $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$. A positive real number $M$ is said to be an essential bound for $f$ if $|f| \le M$ $\mu$-a.e. on $\Omega$. If $f$ has an essential bound it is naturally called essentially bounded. □
We would like to notice the difference between $\mu$-a.e. finite and essentially bounded functions. For instance the function $\frac{1}{x} \in \mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$ is finite $\lambda$-a.e. on $\mathbb{R}$, i.e. everywhere, except for 0, whereas it is not essentially bounded. Moreover, the "repaired" version of $\frac{1}{x}$,
$f(x) = \begin{cases} \frac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}$
becomes finite (and an element of $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$), but still not essentially bounded.
4.16 Definition and Notation. If a measurable function $f$ on $(\Omega, \Sigma, \mu)$ is essentially bounded, then the infimum of all essential bounds for $f$ is called the ($\mu$-) essential supremum of $f$ and it is denoted by $\|f\|_\infty$ or by $\operatorname{ess\,sup}\{|f|\}$. More formally,
$\|f\|_\infty = \inf\{M > 0 : \mu\{|f| > M\} = 0\}.$
The subset of $\mathcal{C}^{-1}(\Omega, \Sigma; \mathbb{C})$ (or $\mathcal{C}^{-1}(\Omega, \Sigma; \overline{\mathbb{R}})$) of all essentially bounded functions is denoted by $L^\infty(\Omega, \Sigma, \mu; \mathbb{C})$ (or $L^\infty(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$, resp.). (Of course, if $f$ is not essentially bounded, it would make sense to set $\|f\|_\infty = \infty$. However, since we are going to use $\|\cdot\|_\infty$ as the norm within $L^\infty$, we do not need such an extension.) □
It is easy to see that $L^\infty(\Omega, \Sigma, \mu; \mathbb{C})$ is a vector space over the field $\mathbb{C}$, while $L^\infty(\Omega, \Sigma, \mu; \overline{\mathbb{R}})$ is a "quasi"-vector space over $\mathbb{R}$. The properties below justify $\|\cdot\|_\infty$ as a semi-norm on $L^\infty$.
4.17 Proposition. Given two measurable functions $f$ and $g$ on $(\Omega, \Sigma, \mu)$ and a scalar $a \in \mathbb{C}$, the following are valid:
( i) ( ii) (iii) (iv)
( v) ( vi)
( vii) ( viii)
I t I < I I t I I oo J.L- a . e. on n. I I t + g I I oo < I I f I I oo + I I g I I oo · 1 t 1 < 1 g 1 J.L-a.e. on n implies that II t I I oo < I I g I I oo · / E [g] J.& => ll f ll oo = I I Y I I oo · l l a f ll oo = l a l ll f ll oo · II f I I 00 = 0 <=> f E [ O ] J.& . I I a I I oo = I a I . I I t g II oo < I I t I I oo I I g II oo ·
Proof.

(i) Given $\varepsilon = \frac{1}{n}$, there is an essential bound $M_n$ such that

$$|f| \le M_n < \|f\|_\infty + \tfrac{1}{n} \quad (\mu\text{-a.e.}).$$

Hence, the set $\{|f| > \|f\|_\infty + \frac{1}{n}\} \in \mathcal{N}_\mu$ and, along with this, the set

$$\{|f| > \|f\|_\infty\} = \bigcup_{n=1}^{\infty} \{|f| > \|f\|_\infty + \tfrac{1}{n}\} \in \mathcal{N}_\mu.$$

(ii) By (i),

$$|f + g| \le |f| + |g| \le \|f\|_\infty + \|g\|_\infty \quad \mu\text{-a.e.}$$

Hence, $\|f\|_\infty + \|g\|_\infty$ is an essential bound for $f + g$. Thus, as the infimum of all essential bounds, $\|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

(iii) Because of (i) and our assumption, we have $|f| \le |g| \le \|g\|_\infty$ $\mu$-a.e. Therefore, $\|g\|_\infty$ is an essential bound for $f$ and $\|f\|_\infty \le \|g\|_\infty$, as $\|f\|_\infty$ is the infimum of all essential bounds.

(iv) The validity of this statement follows from (iii) applied twice.

(v) Because $|af| = |a|\,|f|$, it follows that $|a|\,\|f\|_\infty$ is the essential supremum of $af$.

(vi) Let $\|f\|_\infty = 0$. From (i), it follows that $|f| \le 0$ $\mu$-a.e. on $\Omega$ $\Rightarrow$ $f \in [0]_\mu$. The converse of this statement is trivial.

(vii) Follows from $\|1\|_\infty = 1$ and (v).

(viii) By (i), $|fg| = |f|\,|g| \le \|f\|_\infty \|g\|_\infty$ $\mu$-a.e. $\Rightarrow$ by (iii) and (vii), $\|fg\|_\infty \le \|f\|_\infty \|g\|_\infty$. □
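The properties of Proposition 4.17 can be sanity-checked on a finite measure space, where the essential supremum is simply the largest value of $|f|$ over the atoms of positive measure. The following is a minimal sketch under that assumption; the measure `mu`, the functions `f`, `g`, and the helper `ess_sup` are hypothetical illustrations, not part of the text.

```python
from fractions import Fraction


def ess_sup(f, mu):
    """Essential supremum of |f| on a finite set with point masses mu[w] >= 0.

    Atoms of measure zero are ignored, which is what
    inf{M : mu(|f| > M) = 0} amounts to on a finite space.
    """
    values = [abs(f[w]) for w in mu if mu[w] > 0]
    return max(values, default=0)


# Hypothetical measure space: the atom "c" is a null set.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
f = {"a": 1, "b": -3, "c": 100}   # large only on the null atom
g = {"a": 2, "b": 1, "c": -7}

f_plus_g = {w: f[w] + g[w] for w in mu}
f_times_g = {w: f[w] * g[w] for w in mu}
scaled_f = {w: -4 * f[w] for w in mu}

sup_f = max(abs(v) for v in f.values())   # ordinary sup norm sees the null atom
norm_f = ess_sup(f, mu)                   # essential supremum ignores it
norm_g = ess_sup(g, mu)

print("sup norm of f:", sup_f, " essential sup of f:", norm_f)
print("(ii)  ", ess_sup(f_plus_g, mu), "<=", norm_f + norm_g)
print("(v)   ", ess_sup(scaled_f, mu), "==", 4 * norm_f)
print("(viii)", ess_sup(f_times_g, mu), "<=", norm_f * norm_g)
```

The gap between `sup_f` and `norm_f` is the same phenomenon as in Examples 4.18 below: values taken on a null set do not affect $\|\cdot\|_\infty$.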
The above properties yield that $(L^\infty, \|\cdot\|_\infty)$ is a semi-normed linear space, and it can be made an NLS by passing to the usual factor space of $L^\infty$ (mod $\mu$). We will establish a few more properties, such as Hölder's inequality and completeness of $L^\infty$, to make $L^\infty$ a part of the $L^p$ family. First, a few examples.
4.18 Examples.

(i) Let $A = \{\frac{1}{n};\ n = 1,2,\ldots\}$ and $B = [0,1] \setminus A$. Define the measurable function $f$ on $([0,1], \mathcal{B} \cap [0,1], \mu = \operatorname{Res}_{\mathcal{B} \cap [0,1]}\lambda)$ as

$$f(x) = \sin x\,\mathbf{1}_B(x) + \sum_{n=1}^{\infty} n\,\mathbf{1}_{\{1/n\}}(x).$$

Clearly, the function $f$ is not bounded and therefore $\|f\|_{\sup} = \infty$. However, since $\mu(A) = 0$, $\|f\|_\infty = \sin 1$.

(ii) Under the conditions of Example (i), let

$$g(x) = \sin x\,\mathbf{1}_B(x) + \sum_{n=1}^{\infty} \arctan n\,\mathbf{1}_{\{1/n\}}(x).$$

Then $\|g\|_{\sup} = \frac{\pi}{2}$, while $\|g\|_\infty = \sin 1 < 1$. □

4.19 Proposition (Hölder's Inequality for $L^\infty$ spaces). Let $f \in L^1$ and $g \in L^\infty$. Then $fg \in L^1$ and the following inequality holds true:

$$\|fg\|_1 \le \|f\|_1 \|g\|_\infty.$$

The proof is left as an exercise (Problem 4.10). □

4.20 Notation. Given a sequence $\{f_n\} \subset L^\infty$, we will write $f_n \xrightarrow{L^\infty} f$ if $\|f - f_n\|_\infty \to 0$ when $n \to \infty$. □

4.21 Theorem. The space $(L^\infty(\Omega,\Sigma,\mu;\mathbb{C}), \|\cdot\|_\infty)$ is Banach.

Proof. Let $\{f_n\}$ be a Cauchy sequence. Then, by Problem 4.13, there is a set $A \in \mathcal{N}_\mu$ such that $f_n - f_m \to 0$ uniformly on $A^c$. Consequently, there is a function $[A^c, \mathbb{C}, f_0]$ to which $\{f_n\}$ converges uniformly on $A^c$. It is readily justified (cf. Proposition 5.6 (vi), Chapter 5) that $f_0$ is measurable on $A^c$. Thus, the function $f = f_0 \mathbf{1}_{A^c} \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. Clearly, $f$ is essentially bounded. Since $f_n \to f$ uniformly on $A^c$, i.e. $\mu$-a.e. uniformly on $\Omega$, by Problem 4.12, $f_n \xrightarrow{L^\infty} f$. □
4.22 Definition and Notation. As we see from the above analysis of $L^\infty$ spaces, the latter become a natural extension of the $L^p$ spaces in the following way. The two versions of the Hölder Inequality can be combined into one after upgrading the notion of the conjugate exponent. Two extended real numbers $1 \le p \le \infty$ and $1 \le q \le \infty$ are said to be conjugate exponents if they satisfy the equation

$$\frac{1}{p} + \frac{1}{q} = 1,$$

with the usual agreement that $\frac{1}{\infty} = 0$. The generalization below of conjugate exponents allows modification of the Hölder Inequality. The extended real numbers $1 \le q_i \le \infty$, $i = 1,\ldots,n$, are said to be conjugate exponents if they satisfy the equation

$$\frac{1}{q_1} + \cdots + \frac{1}{q_n} = 1. \tag{4.22}$$

□

4.23 Proposition (Generalized Hölder's Inequality; Version 1). Given $n$ conjugate exponents of (4.22), let $g_i \in L^{q_i}(\Omega,\Sigma,\mu;\mathbb{C})$, $i = 1,\ldots,n$. Then $g_1 \cdots g_n \in L^1$ and

$$\|g_1 \cdots g_n\|_1 \le \|g_1\|_{q_1} \cdots \|g_n\|_{q_n}. \tag{4.23}$$

□

The following is a modification of the Hölder Inequality.

4.24 Proposition (Generalized Hölder's Inequality; Version 2). Given $n + 1$ extended real numbers $0 < p_i \le \infty$, $i = 0,\ldots,n$, such that

$$\frac{1}{p_0} = \frac{1}{p_1} + \cdots + \frac{1}{p_n}, \tag{4.24}$$

and functions $f_j \in L^{p_j}(\Omega,\Sigma,\mu;\mathbb{C})$, $j = 1,\ldots,n$, it holds true that $f_1 \cdots f_n \in L^{p_0}$ and

$$\|f_1 \cdots f_n\|_{p_0} \le \|f_1\|_{p_1} \cdots \|f_n\|_{p_n}. \tag{4.24a}$$

□
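Both versions of the generalized Hölder Inequality are easy to test numerically on a finite weighted measure space, where every integral is a finite sum. The sketch below is only an illustration: the weights, the functions `f1`, `f2`, `f3`, and the helper `lp_norm` are assumed for the example and do not come from the text.

```python
def lp_norm(f, weights, p):
    """(sum_w |f(w)|^p * weights[w])^(1/p) on a finite weighted space."""
    return sum(abs(x) ** p * m for x, m in zip(f, weights)) ** (1.0 / p)


# Hypothetical measure (point masses) and functions on a four-point space.
weights = [0.5, 1.0, 2.0, 0.25]
f1 = [1.0, -2.0, 0.5, 3.0]
f2 = [2.0, 1.0, -1.0, 0.5]
f3 = [0.5, 0.5, 2.0, -1.0]

# Version 1: conjugate exponents with 1/2 + 1/3 + 1/6 = 1.
q = [2.0, 3.0, 6.0]
product = [a * b * c for a, b, c in zip(f1, f2, f3)]
lhs_v1 = lp_norm(product, weights, 1.0)
rhs_v1 = (lp_norm(f1, weights, q[0])
          * lp_norm(f2, weights, q[1])
          * lp_norm(f3, weights, q[2]))

# Version 2: 1/p0 = 1/p1 + 1/p2 with p1 = 3 and p2 = 6, hence p0 = 2.
p0, p1, p2 = 2.0, 3.0, 6.0
lhs_v2 = lp_norm([a * b for a, b in zip(f1, f2)], weights, p0)
rhs_v2 = lp_norm(f1, weights, p1) * lp_norm(f2, weights, p2)

print("Version 1:", lhs_v1, "<=", rhs_v1)
print("Version 2:", lhs_v2, "<=", rhs_v2)
```

A numerical check of this kind is no substitute for the proof requested in Problem 4.15; it merely confirms that the exponents in (4.22) and (4.24) are combined correctly.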
It can be verified (Problem 4.14) that the two versions are equivalent. The proof of one of them is left as an exercise (Problem 4.15).

PROBLEMS

4.1 Let $\{f_n\} \subset L^p(\Omega,\Sigma,\mu;\mathbb{R}_+)$, where $1 \le p < \infty$. Show that the following inequality holds: [displayed inequality missing in the source]

4.2 Let $\{f_n\} \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$ be a sequence of functions such that $f_n \to f \in L^p$ $\mu$-a.e. pointwise. Prove that if $f_n \xrightarrow{L^p} f$ (i.e. $\|f_n - f\|_p \to 0$), then $\|f_n\|_p \to \|f\|_p$.

4.3 Prove the converse to Problem 4.2. [Hint: Apply Fatou's Lemma 2.4, Chapter 6.]

4.4 Show that if $\{f_n\} \subset L^p$ is a Cauchy sequence, then it is uniformly bounded.

4.5 Let $\{f_n\} \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$ and $\{g_n\} \subset L^q(\Omega,\Sigma,\mu;\mathbb{C})$, $f_n \xrightarrow{L^p} f \in L^p$, $g_n \xrightarrow{L^q} g \in L^q$, and let $p$ and $q$ be conjugate exponents. Prove that $f_n g_n \xrightarrow{L^1} fg$.

4.6 Show that in the Riesz-Fischer Theorem, $\{f_n\}$ need not converge to $f$ $\mu$-a.e. in the topology of pointwise convergence.

4.7 Show that $L^p(\Omega,\Sigma,\mu;\mathbb{C})$ is a lattice, i.e., if $f, g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$, then also $f \vee g,\ f \wedge g \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$.

4.8 Prove that the set $\Psi^p(\Omega,\Sigma,\mu;\mathbb{C}) = \Psi(\Omega,\Sigma;\mathbb{C}) \cap L^p(\Omega,\Sigma,\mu;\mathbb{C})$ of all complex-valued simple $L^p$-integrable functions is dense in $L^p(\Omega,\Sigma,\mu;\mathbb{C})$.

4.9 Show that $L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ is a lattice.

4.10 Prove the Hölder Inequality for $L^\infty$ spaces (Proposition 4.19).

4.11 Let $\{f, f_n\} \subset L^\infty(\Omega,\Sigma,\mu)$ be such that $f_n \xrightarrow{L^\infty} f$. Show that there is an $A \in \mathcal{N}_\mu$ such that $f_n \to f$ uniformly on $A^c$.

4.12 Prove the converse of the statement in Problem 4.11: Given $\{f, f_n\} \subset L^\infty(\Omega,\Sigma,\mu)$, suppose there is an $A \in \mathcal{N}_\mu$ such that $f_n \to f$ uniformly on $A^c$. Show that $f_n \xrightarrow{L^\infty} f$.

4.13 Prove that $\{f_n\} \subset L^\infty(\Omega,\Sigma,\mu)$ is a Cauchy sequence if and only if there is an $A \in \mathcal{N}_\mu$ such that $f_n - f_m \to 0$ uniformly on $A^c$.

4.14 Show that the two versions (Propositions 4.23 and 4.24) of the generalized Hölder Inequality are equivalent.

4.15 Prove one of the versions of the generalized Hölder Inequality.

4.16 Let $(\Omega,\Sigma,\mu)$ be as follows: $\Omega = \mathbb{R}_+$, $\Sigma = \mathcal{B}_+$, $\mu = \operatorname{Res}_\Sigma \lambda$, and let $f_n$ be given by [displayed formula missing in the source], where $A = [0, \frac{1}{n}]$, $B = (n, n^2]$, $n \ge 2$. Investigate whether the sequence $\{f_n\}$ is $L^\infty$-convergent and, if the answer is yes, give a version of its $L^\infty$-limit. Repeat the same investigation with respect to the $L^1$ space.
NEW TERMS:

$L^p(\Omega,\Sigma,\mu;\mathbb{C})$ space
$L^p(\Omega,\Sigma,\mu;\mathbb{R})$ space
semi-norm
conjugate exponents
Hölder's inequality for $L^p$ spaces
Cauchy-Schwarz inequality
Minkowski's inequality
convergence in the $p$th mean ($L^p$-convergence)
$L^p$-convergence (convergence in the $p$th mean)
Lebesgue Dominated Convergence Theorem for $L^p$ spaces
Riesz-Fischer Theorem
integrable simple function
essential bound
essentially bounded function
essential supremum of a function
$L^\infty(\Omega,\Sigma,\mu;\mathbb{C})$ space
$L^\infty(\Omega,\Sigma,\mu;\mathbb{R})$ space
$\|\cdot\|_\infty$ semi-norm
Hölder's inequality for $L^\infty$ spaces
generalized Hölder's inequalities
5. MODES OF CONVERGENCE
In this section, we explore other forms of convergence, initiated in Section 2, Chapter 6. Many of them find frequent application in analysis and probability.

5.1 Definitions. Let $\{f_n\}$ be a $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$-sequence and let $\mu$ be a measure on $(\Omega,\Sigma)$.

(i) $\{f_n\}$ is said to converge to a function $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ in measure if for each $\varepsilon > 0$,

$$\lim_{n\to\infty} \mu(\{|f_n - f| \ge \varepsilon\}) = 0,$$

in notation, $f_n \xrightarrow{\mu} f$. The function $f$ is called a $\mu$-limit of $\{f_n\}$.

(ii) $\{f_n\}$ is said to be Cauchy in measure if for each $\varepsilon > 0$,

$$\lim_{m,n\to\infty} \mu(\{|f_n - f_m| \ge \varepsilon\}) = 0.$$

(iii) $\{f_n\}$ is said to converge almost uniformly to $f$ (in notation, $f_n \to f$ $\mu$-a.u.) if for each $\varepsilon > 0$ there is a set $A\ (= A(\varepsilon)) \in \Sigma$ such that $\mu(A) < \varepsilon$ and $f_n \to f$ uniformly on $A^c$. □

We begin with the statement that "almost uniform convergence implies convergence in measure," which is quite obvious; its proof is left as an exercise (Problem 5.1).

5.2 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and $f, \{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $f_n \to f$ $\mu$-a.u. on $\Omega$. Then $f_n \xrightarrow{\mu} f$.

Next, we will need the following
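5.3 Lemma (Chebyshev's Inequality). If $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ and $0 < p < +\infty$, then for each $\varepsilon > 0$,

$$\mu(\{|f| \ge \varepsilon\}) \le \left(\frac{\|f\|_p}{\varepsilon}\right)^{p}. \tag{5.3}$$

Proof. Let $A := \{|f| \ge \varepsilon\}$. Then $|f|^p = |f|^p \mathbf{1}_A + |f|^p \mathbf{1}_{A^c}$ and thus

$$(\|f\|_p)^p \ge \int_A |f|^p\, d\mu \ge \varepsilon^p \int_A d\mu = \varepsilon^p \mu(A).$$

□

If $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$ for $p \ge 1$, then from (5.3) it follows that $\mu(\{|f| \ge \varepsilon\}) < \infty$ for each $\varepsilon > 0$.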
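Chebyshev's Inequality (5.3) can be observed directly on a finite weighted measure space. The following sketch computes both sides of (5.3) for several values of $p$ and $\varepsilon$; the weights, the function `f`, and the helper name `chebyshev_sides` are hypothetical choices made for the illustration only.

```python
def chebyshev_sides(f, weights, p, eps):
    """Return (mu{|f| >= eps}, (||f||_p / eps)^p) on a finite weighted space."""
    measure_of_big_set = sum(m for x, m in zip(f, weights) if abs(x) >= eps)
    bound = sum(abs(x) ** p * m for x, m in zip(f, weights)) / eps ** p
    return measure_of_big_set, bound


# Hypothetical point masses and function values on a four-point space.
weights = [0.1, 0.4, 0.3, 0.2]
f = [5.0, -0.5, 1.5, 0.0]

results = [
    (p, eps, *chebyshev_sides(f, weights, p, eps))
    for p in (0.5, 1.0, 2.0)
    for eps in (0.25, 1.0, 2.0)
]

for p, eps, lhs, rhs in results:
    print(f"p={p}, eps={eps}: mu(|f| >= eps) = {lhs:.3f} <= {rhs:.3f}")
```

Note that the bound is informative only when $\varepsilon$ is large relative to $\|f\|_p$; for small $\varepsilon$ the right-hand side may exceed the total mass of the space.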
Another noteworthy consequence of Chebyshev's Inequality is

5.4 Proposition. Let $\{f_n\}, f \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$, for $1 \le p < \infty$. If $f_n \xrightarrow{L^p} f$, then $f_n \xrightarrow{\mu} f$.

Proof. The statement follows directly from Chebyshev's Inequality applied to $f_n - f$. □
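The converse of the last proposition does not hold, as we learn from the following example.

5.5 Examples.

(i) Let $f_n = \frac{1}{n}\mathbf{1}_{(0,n)}$. Then $f_n \xrightarrow{\mu} 0$. Indeed, let $\varepsilon \in (0,1)$. Then

$$\{f_n \ge \varepsilon\} = \begin{cases} (0,n), & \text{if } n \le \frac{1}{\varepsilon}, \\ \emptyset, & \text{if } n > \frac{1}{\varepsilon}, \end{cases}$$

and thus

$$\lambda(\{f_n \ge \varepsilon\}) = \begin{cases} n, & \text{if } n \le \frac{1}{\varepsilon}, \\ 0, & \text{if } n > \frac{1}{\varepsilon}, \end{cases}$$

which yields

$$\lim_{n\to\infty} \lambda(\{|f_n - 0| \ge \varepsilon\}) = 0 \quad \text{for all } \varepsilon \in (0,1).$$

However, $\|f_n\|_1 = \frac{1}{n}\lambda((0,n)) = 1$ for all $n$, so that $\{f_n\}$ does not converge to 0 in $L^1$.

(ii) Pointwise convergence does not imply convergence in measure. Let $f_n = \mathbf{1}_{(n,n+1)}$. Then $\{f_n\}$ converges to 0 pointwise. However, for every $\varepsilon \in (0,1)$,

$$\{f_n \ge \varepsilon\} = (n, n+1) \quad \text{and} \quad \lambda(\{f_n \ge \varepsilon\}) = 1, \ \text{for all } n.$$

The $L^p$-convergence does not hold either in this case. □

5.6 Theorem. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $\{f_n\}, f \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. If $f_n \to f$ $\mu$-a.u., then $f_n \to f$ $\mu$-a.e. pointwise.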
Proof. Almost uniform convergence means that for each $k$, there is a measurable set $A_k$ such that $\mu(A_k) < \frac{1}{2^k}$ and $f_n \to f$ uniformly on $A_k^c$. Denote

$$B_k := \bigcup_{j=k}^{\infty} A_j.$$

Then $\mu(B_k) < \frac{1}{2^{k-1}}$ and $f_n \to f$ uniformly on $B_k^c$. Hence $f_n \to f$ pointwise on

$$B^c := \bigcup_{k=1}^{\infty} B_k^c$$

(but not necessarily uniformly on $B^c$). On the other hand, since $\{B_k\}$ is monotone nonincreasing and $\mu(B_k)$ is finite, by continuity from above, $\mu(B) = \lim_{k\to\infty} \mu(B_k) = 0$. □
The converse of this statement, as we know from analysis, does not hold true unless $\mu$ is finite, as the following widely cited theorem states.

5.7 Theorem (Egorov). Let $(\Omega,\Sigma,\mu)$ be a finite measure space and $f, \{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $\{f_n\}$ converges to $f$ $\mu$-a.e. pointwise. Then $\{f_n\}$ converges to $f$ $\mu$-a.u.

Proof. By the assumption, there is a $\mu$-null set $N$ such that $\{f_n\}$ converges to $f$ pointwise on $N^c$. Define

$$A_{jn} := N^c \cap \bigcup_{k=n}^{\infty} \left\{|f_k - f| \ge \tfrac{1}{j}\right\}.$$

Clearly, the sequence $\{A_{jn};\ n = 1,2,\ldots\}$ is monotone nonincreasing for each $j$, and since $\{f_n\}$ converges to $f$ pointwise on $N^c$, for every $j$ we have that $\{A_{jn}\}_n \downarrow \emptyset$. Because $\mu$ is finite, by $\emptyset$-continuity, $\lim_{n\to\infty} \mu(A_{jn}) = 0$. Let $\varepsilon > 0$ be chosen and let $n_j$ be such that $\mu(A_{jn_j}) < \frac{\varepsilon}{2^j}$. Denote

$$A := N \cup \bigcup_{j=1}^{\infty} A_{jn_j}.$$

Then $\mu(A) < \varepsilon$, and if $\omega \notin A$, it follows from the definition of $A_{jn}$ and $A$ that for every $j$,

$$|f_k(\omega) - f(\omega)| < \tfrac{1}{j} \quad \text{for all } k \ge n_j,$$

and therefore $\{f_n\}$ converges to $f$ uniformly on $A^c$. □

5.8 Proposition. Let $(\Omega,\Sigma,\mu)$ be a finite measure space and let $f, \{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be such that $f_n \to f$ $\mu$-a.e. pointwise on $\Omega$. Then $f_n \xrightarrow{\mu} f$.
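Proof. Let

$$A_n(\varepsilon) := \{|f_n - f| \ge \varepsilon\}, \qquad E_n(\varepsilon) := \bigcup_{k=n}^{\infty} A_k(\varepsilon), \qquad A(\varepsilon) := \bigcap_{n=1}^{\infty} E_n(\varepsilon).$$

Then,

$$\{f_n \not\to f\} = \bigcup_{\varepsilon > 0} A(\varepsilon). \tag{5.8}$$

Indeed, $\omega \in \{f_n \not\to f\}$ if and only if there is a $\delta > 0$ and a subsequence $\{f_{n_j}(\omega)\}$ such that for all $\varepsilon < \delta$, $|f_{n_j}(\omega) - f(\omega)| \ge \varepsilon$, $j = 1,2,\ldots$. The latter is equivalent to $\omega \in A(\varepsilon)$ for all $\varepsilon < \delta$. Finally, (5.8) is due to the fact that $\{A(\varepsilon)\} \uparrow$ for $\varepsilon \downarrow 0$. Indeed, $A(\varepsilon_2) \subseteq A(\varepsilon_1)$ whenever $\varepsilon_1 < \varepsilon_2$.

Consequently, for each $\varepsilon > 0$,

$$\mu(A(\varepsilon)) \le \mu(\{f_n \not\to f\}) = 0. \tag{5.8a}$$

Since by our assumption $\mu$ is finite, due to (5.8a) and by continuity from above,

$$\lim_{n\to\infty} \mu(E_n(\varepsilon)) = \mu(A(\varepsilon)) = 0. \tag{5.8b}$$

Finally, (5.8b) and $A_n(\varepsilon) \subseteq E_n(\varepsilon)$ yield that for each $\varepsilon > 0$,

$$\lim_{n\to\infty} \mu(A_n(\varepsilon)) = 0,$$

and thus convergence of $\{f_n\}$ to $f$ in measure. □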
The converse of this proposition is a much weaker statement: convergence of $\{f_n\}$ to $f$ in measure guarantees just the existence of a subsequence of $\{f_n\}$ convergent to $f$ $\mu$-a.e.

5.9 Theorem. Let $(\Omega,\Sigma,\mu)$ be a measure space and $\{f_n\} \subset \mathcal{C}^{-1}$ be a Cauchy sequence in measure. Then:

(i) there exists a measurable function $f$ to which the sequence $\{f_n\}$ converges in measure;
(ii) there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ that converges to $f$ $\mu$-a.e. in the topology of pointwise convergence;
(iii) if $f_n \xrightarrow{\mu} g$, then $g \in [f]_\mu$.
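Proof. Since $\{f_n\}$ is Cauchy in $\mu$, for each $\varepsilon > 0$ and $\delta > 0$, there is an $N_0$ such that

$$\mu(\{|f_n - f_m| \ge \varepsilon\}) < \delta, \quad \text{for all } m, n \ge N_0.$$

In terms of $\varepsilon = \delta = \frac{1}{2^k}$, the above can be reformulated as

$$\mu\left(\left\{|f_n - f_m| \ge \tfrac{1}{2^k}\right\}\right) < \tfrac{1}{2^k}, \quad \text{for all } f_m, f_n \in T_k = \{f_{N_k}, f_{N_k+1}, \ldots\}.$$

Now, choose one $h_k := f_{n_k} \in T_k$. Since $\{T_k\}$ is monotone nonincreasing, $h_k$ and $h_{k+1}$ are elements of $T_k$, and thus the subsequence $\{h_k\}$ of $\{f_n\}$ is such that for each $k = 1,2,\ldots$,

$$\mu(A_k) < \tfrac{1}{2^k}, \tag{5.9}$$

where $A_k := \{|h_k - h_{k+1}| \ge \frac{1}{2^k}\}$. Let

$$B_s := \bigcup_{j=s}^{\infty} A_j.$$

We will show that for each $s$, $\{h_k\}$ is Cauchy uniformly on $B_s^c$. First notice that, since $|h_k(\omega) - h_{k+1}(\omega)| < \frac{1}{2^k}$ for each $\omega \in A_k^c$, we have for $m > k \ge s$ and $\omega \in B_s^c$,

$$|h_k(\omega) - h_m(\omega)| \le \sum_{i=k}^{m-1} |h_i(\omega) - h_{i+1}(\omega)| < \sum_{i=k}^{m-1} \frac{1}{2^i} = \frac{1}{2^{k-1}} - \frac{1}{2^{m-1}} < \frac{1}{2^{k-1}}. \tag{5.9a}$$

In other words, given a $\delta > 0$, there is an $N \ge s$ such that for all $m > k \ge N$,

$$|h_k(\omega) - h_m(\omega)| < \delta,$$

good for all $\omega \in B_s^c$. Consequently, $\{h_k\}$ is Cauchy on $A := \bigcup_{s=1}^{\infty} B_s^c$ in the topology of pointwise convergence. Furthermore, since the sequence $\{B_s\}$ is monotone nonincreasing and, from (5.9), $\mu(B_s) < \frac{1}{2^{s-1}}$ $(< \infty)$, by continuity from above,

$$\lim_{s\to\infty} \mu(B_s) = \mu(A^c) \le \lim_{s\to\infty} \frac{1}{2^{s-1}} = 0, \tag{5.9b}$$

and thus $\{h_k\}$ is pointwise Cauchy on $A$, i.e. $\mu$-a.e. Define

$$f(\omega) := \begin{cases} \lim_{k\to\infty} h_k(\omega), & \omega \in A, \\ 0, & \omega \in A^c. \end{cases}$$

Clearly, $f$ exists and is finite for each $\omega$, and, by Theorem 5.9 (vi), Chapter 5, $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. From (5.9a) it follows that, for $m \to \infty$,

$$|f(\omega) - h_k(\omega)| \le \frac{1}{2^{k-1}} \quad \text{for all } \omega \in B_s^c \text{ and } k \ge s,$$

and hence, because of (5.9b), $h_k \to f$ $\mu$-a.u. By Theorem 5.6 and Proposition 5.2, $\{h_k\}$ then converges to $f$ also $\mu$-a.e. and in measure. Moreover, since

$$\{|f_n - f| \ge \varepsilon\} \subseteq \left\{|f_n - h_k| \ge \tfrac{\varepsilon}{2}\right\} \cup \left\{|h_k - f| \ge \tfrac{\varepsilon}{2}\right\} \tag{5.9c}$$

and because $\{f_n\}$ was assumed to be Cauchy in measure, the measure of each of the sets on the right of inclusion (5.9c) converges to zero. Therefore, $f_n \xrightarrow{\mu} f$.

Finally, let $g$ be yet another $\mu$-limit of $\{f_n\}$. Then, from

$$\{|f - g| \ge \varepsilon\} \subseteq \left\{|f - f_n| \ge \tfrac{\varepsilon}{2}\right\} \cup \left\{|f_n - g| \ge \tfrac{\varepsilon}{2}\right\},$$

$\{|f - g| \ge \varepsilon\} \in \mathcal{N}_\mu$, good for all $\varepsilon > 0$, and thus $g = f$ (mod $\mu$). □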
From Proposition 5.4 and Theorem 5.9 we arrive at

5.10 Corollary. Let $\{f_n\}, f \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$ such that $f_n \xrightarrow{L^p} f$. Then there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ that converges to $f$ $\mu$-a.e. pointwise. □
The following proposition is a sort of converse of Proposition 5.4 (that $L^p$-convergence implies convergence in measure) under one additional condition.

5.11 Proposition. Let $(\Omega,\Sigma,\mu)$ be a measure space and let $f, \{f_n\} \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$ be such that $f_n \xrightarrow{\mu} f$, and suppose there is an $L^p(\Omega,\Sigma,\mu;\mathbb{R}_+)$-function $g$ such that $|f_n| \le g$. Then $f_n \xrightarrow{L^p} f$.

Proof. Since $f_n \xrightarrow{\mu} f$, according to Theorem 5.9, there is a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ which converges to $f$ $\mu$-a.e. on $\Omega$ in the topology of pointwise convergence. Since $\{f_{n_k}\}$ is dominated by $g$, by Lebesgue's Dominated Convergence Theorem 4.8, $f_{n_k} \xrightarrow{L^p} f$ and $f \in L^p(\Omega,\Sigma,\mu;\mathbb{C})$. Suppose that $f_n \not\xrightarrow{L^p} f$. Then there is a positive $\varepsilon$ and a subsequence $\{h_j := f_{n_j}\}$ of $\{f_n\}$ such that for all $j$ it holds true that

$$\|h_j - f\|_p \ge \varepsilon. \tag{$*$}$$

On the other hand, since $h_j \xrightarrow{\mu} f$, there is a subsequence $\{h_{j_i}\}$ (of $\{h_j\}$) convergent to $f$ $\mu$-a.e. on $\Omega$ (and also dominated by $g$) and thus, by the Lebesgue Dominated Convergence Theorem, $h_{j_i} \xrightarrow{L^p} f$, thereby directly contradicting ($*$). □

5.12 Proposition. Let $f_n \xrightarrow{\mu} f$. Then every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$.

Proof. By the assumption, every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ must converge to $f$ in measure. Then, by Theorem 5.9, $\{f_{n_k}\}$ must have at least one subsequence, say $\{f_{n_{k_j}}\}$, that converges to $f$ $\mu$-a.e. on $\Omega$. □

The converse of Proposition 5.12 requires the finiteness of $\mu$.

5.13 Proposition. Let $\{f_n\}$ be a sequence of $\mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$-functions on a finite measure space $(\Omega,\Sigma,\mu)$. Suppose that every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ contains a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ $\mu$-a.e. on $\Omega$. Then $f_n \xrightarrow{\mu} f$.

Proof. Since $\mu$ is finite, by Proposition 5.8, $f_{n_{k_j}} \xrightarrow{\mu} f$. Therefore, given an $\varepsilon > 0$, every subsequence $\{a_{n_k}\}$ of $\{a_n := \mu(\{|f_n - f| \ge \varepsilon\});\ n = 1,2,\ldots\}$ has a subsequence $\{a_{n_{k_j}}\}$ that converges to 0. Therefore, the numeric sequence $\{a_n\}$ is sequentially compact and (cf. Theorem 6.3, Chapter 2 or Problem 3.9, Chapter 2) converges to 0 itself. □

The following chart (Figure 5.1) gives an overview of the major convergence modes and their relations and summarizes the theorems and propositions above.
[Chart relating the convergence modes: $L^p$-convergence, convergence in measure, $\mu$-a.u. convergence, $\mu$-a.e. convergence, and the criterion "every subsequence $\{f_{n_k}\}$ of $\{f_n\}$ has a subsequence $\{f_{n_{k_j}}\}$ such that $f_{n_{k_j}} \to f$ a.e." Arrows are labeled "$\mu$ is finite", "$|f_n| \le g \in L^p$ (LDCT)", and "$\mu$ is finite (Egorov)".]
Figure 5.1

5.14 Proposition. Let $(\Omega,\Sigma,\mu)$ be a finite measure space and $f, \{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ such that $f_n \xrightarrow{\mu} f$. Suppose a function $\varphi: \mathbb{C} \to \mathbb{C}$ is continuous. Then $\varphi \circ f_n \xrightarrow{\mu} \varphi \circ f$.

Proof. Since $\varphi$ is continuous, $\varphi \circ f, \{\varphi \circ f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$. By Proposition 5.12, each subsequence of $\{f_n\}$ has a subsequence, say $\{f_{n_{k_j}}\}$, convergent to $f$ $\mu$-a.e. on $\Omega$. Hence, by continuity of $\varphi$, also $\{\varphi \circ f_{n_{k_j}}\}$ converges to $\varphi \circ f$ $\mu$-a.e. on $\Omega$. Since $\mu$ is finite, the statement is due to Proposition 5.13. □
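5.15 Proposition. Let $\{f_n\}, \{g_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be two sequences on a measure space $(\Omega,\Sigma,\mu)$ convergent in measure to measurable functions $f$ and $g$, respectively. Then, for any two complex numbers $a$ and $b$, $af_n + bg_n \xrightarrow{\mu} af + bg$.

Proof. From

$$\{|(af_n + bg_n) - (af + bg)| \ge \varepsilon\} \subseteq \left\{|a|\,|f_n - f| \ge \tfrac{\varepsilon}{2}\right\} \cup \left\{|b|\,|g_n - g| \ge \tfrac{\varepsilon}{2}\right\}$$

we have that

$$\mu(\{|(af_n + bg_n) - (af + bg)| \ge \varepsilon\}) \le \mu\left(\left\{|a|\,|f_n - f| \ge \tfrac{\varepsilon}{2}\right\}\right) + \mu\left(\left\{|b|\,|g_n - g| \ge \tfrac{\varepsilon}{2}\right\}\right) \to 0.$$

Therefore, $af_n + bg_n \xrightarrow{\mu} af + bg$. □

5.16 Proposition. Let $\{f_n\}, \{g_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be two sequences on a finite measure space $(\Omega,\Sigma,\mu)$ convergent in measure to measurable functions $f$ and $g$, respectively. Then $f_n g_n \xrightarrow{\mu} fg$.

Proof. By Proposition 5.12, every subsequence of $\{f_n\}$ contains a subsequence convergent to $f$ $\mu$-a.e. on $\Omega$. Let $\{f_{n_k}\}$ be any subsequence of $\{f_n\}$ and $\{f_{n_{k_j}}\}$ be a subsequence of $\{f_{n_k}\}$ convergent to $f$ $\mu$-a.e. on $\Omega$. Then the subsequence $\{g_{n_{k_j}}\}$ of $\{g_n\}$ has a subsequence $\{G_i := g_{n_{k_{j_i}}}\}$ convergent to $g$ $\mu$-a.e. on $\Omega$. Furthermore, it is obvious that $\{F_i := f_{n_{k_{j_i}}}\}$ still converges to $f$ $\mu$-a.e. on $\Omega$. Therefore, the sequence $\{F_i G_i\}$ converges to $fg$ $\mu$-a.e. on $\Omega$.

In summary, we showed that an arbitrary subsequence $\{f_{n_k} g_{n_k}\}$ of $\{f_n g_n\}$ has a subsequence $\{F_i G_i\}$ that converges to $fg$ $\mu$-a.e. on $\Omega$. The statement now follows by Proposition 5.13. □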
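5.17 Examples.

(i) Let $\Omega = [0,1]$ and let $f_n = e^n \mathbf{1}_{A_n}$, where $A_n = [0, \frac{1}{n}]$. Obviously, $f_n \to 0$ $\lambda$-a.e. pointwise. Therefore, by Proposition 5.8, $f_n \xrightarrow{\mu} f = 0$. Since $\lambda$ is finite on $\Omega$, by Egorov's Theorem, $f_n \to f = 0$ a.u. However,

$$\|f_n\|_p^p = \int e^{np}\, \mathbf{1}_{A_n}\, d\lambda = \frac{e^{np}}{n} \to \infty$$

for $n \to \infty$ $(0 < p < \infty)$. So, the $L^p$-convergence of $\{f_n\}$ does not hold. The same applies to $L^\infty$: $\|f_n\|_\infty = e^n \to \infty$, for $n \to \infty$.

(ii) Let $\Omega = \mathbb{R}_+$, $\Sigma = \mathcal{B}_+$, $\mu = \operatorname{Res}_\Sigma \lambda$, $f_n = \mathbf{1}_{A_n}$, and $A_n = [n, n + \frac{1}{n}]$. Clearly, $f_n \to 0$ pointwise on $\Omega$. Since $\mu$ is not finite here, Proposition 5.8 does not apply directly, but $\lambda(\{f_n \ge \varepsilon\}) \le \frac{1}{n} \to 0$ for each $\varepsilon > 0$, and hence $f_n \xrightarrow{\mu} 0$. Furthermore,

$$\|f_n\|_p^p = \int |f_n|^p\, d\lambda = \frac{1}{n} \to 0, \quad n \to \infty \quad (0 < p < \infty).$$

Since $\|f_n\|_\infty = 1$ for all $n$, $f_n \not\to 0$ in $L^\infty(\Omega,\Sigma,\mu;\mathbb{R})$. Moreover, $\{f_n\}$ does not converge almost uniformly on $\Omega$. Assume the opposite, i.e. suppose that for some $\varepsilon > 0$ there is an $A \in \Sigma$ such that $\lambda(A) < \varepsilon$ and $\{f_n\}$ converges uniformly to 0 on $A^c$. Clearly, $f_n$ should then be less than one on $A^c$ for sufficiently large $n$, which implies that for sufficiently large $n$, $A^c \cap \bigcup_{i \ge n} A_i = \emptyset$. Thus $\bigcup_{i \ge n} A_i \subseteq A$ and $\lambda(A) \ge \sum_{i \ge n} \lambda(A_i) = \infty$ (since the $A_i$'s are pairwise disjoint up to $\lambda$-null sets), which is a contradiction.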
(iii) The following is an application of two major convergence modes to probability. Let $\{X_n\}$ be a sequence of $L^1(\Omega,\Sigma,\mathbb{P};\mathbb{R})$-random variables. Construct the sequence

$$f_n := \frac{1}{n}\sum_{k=1}^{n} \left(X_k - \mathbb{E}[X_k]\right) \tag{5.17}$$

and denote $f := 0$. If $f_n \xrightarrow{\mathbb{P}} f$ in measure, we say that $\{f_n\}$ converges to $f$ in probability (also called stochastic convergence), and in this particular case, we say that the sequence $\{X_n\}$ obeys the Weak Law of Large Numbers. If the sequence in (5.17) is such that $f_n \to f$ $\mathbb{P}$-a.e. on $\Omega$ (more precisely, $\mathbb{P}$-almost surely or $\mathbb{P}$-a.s.), then $\{X_n\}$ is said to obey the Strong Law of Large Numbers.

Due to Proposition 5.8, the Strong Law of Large Numbers implies the Weak Law of Large Numbers, thereby justifying their names. In the special case when the random variables $\{X_n\}$ share a common mean, say $m$, the convergence of $f_n$ to 0 means that the average value

$$\mu_n := \frac{1}{n}\sum_{k=1}^{n} X_k$$

of the sequence converges to $m$ (weakly or strongly) and therefore becomes asymptotically constant. This is often used in statistics to evaluate the unknown mean ($m$) of a population (by $\mu_n$). Notice that the Central Limit Theorem is also applied as a practical tool to estimate the sample size within a given significance level. Finally, the reader is referred to standard textbooks in probability to learn about various sufficient conditions for the Weak and Strong Laws of Large Numbers. □
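The quantities in Examples 5.17 can be tabulated to see the different convergence modes pull apart. The sketch below evaluates the closed-form norms from (i) and (ii) and runs a small seeded simulation in the spirit of (iii); the choice of uniform random variables on $(0,1)$ with common mean $m = \frac{1}{2}$, the seed, and all function names are assumptions made for the illustration.

```python
import math
import random


def norm_p_pow_example_i(n, p):
    """||f_n||_p^p for f_n = e^n * 1_[0, 1/n] on [0, 1], i.e. e^(n p) / n."""
    return math.exp(n * p) / n


def norm_p_pow_example_ii(n, p):
    """||f_n||_p^p for f_n = 1_[n, n + 1/n]: the interval length 1/n, for any p."""
    return 1.0 / n


def sample_mean(n, rng):
    """Average of n independent uniform(0, 1) draws (common mean m = 1/2)."""
    return sum(rng.random() for _ in range(n)) / n


for n in (5, 10, 20):
    print(n, norm_p_pow_example_i(n, 0.5), norm_p_pow_example_ii(n, 0.5))

rng = random.Random(0)  # fixed seed so that the run is reproducible
means = {n: sample_mean(n, rng) for n in (10, 1000, 100000)}
for n, value in means.items():
    print(f"n = {n}: sample mean = {value:.4f}")
```

In (i) the norms blow up although $f_n \to 0$ a.e. and a.u.; in (ii) they vanish although $\|f_n\|_\infty = 1$ for all $n$. The simulation only illustrates the behavior predicted by the Laws of Large Numbers for one sample path; it proves nothing about almost sure convergence.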
PROBLEMS

5.1 Prove Proposition 5.2.

5.2 Show that $f_n \xrightarrow{\mu} f$ implies that $\{f_n\}$ is Cauchy in measure.

5.3 Give an example of a sequence convergent in measure but not in $L^p$.

5.4 Let $L^p(\Omega,\Sigma,\mu;\mathbb{R})$ be as follows: $\Omega = (0,1]$, $\Sigma = \mathcal{B} \cap (0,1]$, $\mu = \operatorname{Res}_\Sigma \lambda$, and $p \ge 1$. Define a sequence $\{f_n\}$ in $L^p$ as $f_n(x) = n\,\mathbf{1}_{A_n}(x)$, $A_n := [0, \frac{1}{n})$. Show that the $\mu$-limit of $\{f_n\}$ is 0, but the $L^p$-limit of $\{f_n\}$ is not, for all $p \ge 1$ (including $\infty$).

5.5 Let $L^p(\Omega,\Sigma,\mu;\mathbb{R})$ be as follows: $\Omega = \mathbb{R}$, $\Sigma = \mathcal{B}$, $\mu = \lambda$, $p > 0$. Define $f_n(x) := \frac{1}{n}\mathbf{1}_{A_n}(x)$, $A_n := [0, e^n]$. Find $\|f_n\|_\infty$. Show that $f_n \to 0$ in $L^\infty$, $f_n \to 0$ uniformly on $\mathbb{R}$, $f_n \to 0$ $\lambda$-a.e., and $\lambda$-$\lim f_n = 0$. However, show that $f_n$ fails to converge in $L^p$ $(0 < p < \infty)$.

5.6 Define $\Omega = [0,1]$, $\Sigma = \mathcal{B} \cap [0,1]$, $\mu = \operatorname{Res}_\Sigma \lambda$, $p > 0$. Define $f_{nm} = \mathbf{1}_{A_{nm}}$, where $A_{nm} = (\frac{m-1}{n}, \frac{m}{n}]$, $m = 1,\ldots,n$, $n = 1,2,\ldots$. Show that the sequence $\{f_{nm},\ m = 1,\ldots,n,\ n = 1,2,\ldots\}$ converges to 0 in the $p$th mean but does not converge $\lambda$-a.e., not a.u., and not in $L^\infty$.
NEW TERMS:

convergence in measure of a sequence of functions
Cauchy in measure sequence of functions
almost uniform convergence of a sequence of functions
Chebyshev's inequality
Egorov's Theorem
convergence in probability (stochastic convergence)
stochastic convergence (convergence in probability)
Weak Law of Large Numbers
Strong Law of Large Numbers
6. UNIFORM INTEGRABILITY
Uniform integrability bears some resemblance to equicontinuity, as it applies to a family of functions. Recall that Problem 1.22, Chapter 6, states that a function $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ on a measure space $(\Omega,\Sigma,\mu)$ is integrable if and only if for each $\varepsilon > 0$, there is $g \in L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)$ such that

$$\int_{\{|f| > g\}} |f|\, d\mu < \varepsilon.$$

This motivates the notion of uniform integrability of a family of integrable functions, for all of which such a function $g$ exists, given any positive $\varepsilon$.

6.1 Definition. A family $\Phi \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ of functions is said to be uniformly integrable with respect to a measure $\mu$ on $(\Omega,\Sigma)$ if for each $\varepsilon > 0$, there is $g \in L^1(\Omega,\Sigma,\mu;\mathbb{R}_+)$ such that for every $f \in \Phi$,

$$\int_{\{|f| > g\}} |f|\, d\mu < \varepsilon. \tag{6.1}$$

The function $g$ is said to be an $\varepsilon$-bound of $\Phi$. □

6.2 Remark. If $(\Omega,\Sigma,\mu)$ is a finite measure space, then Problem 1.22 of Chapter 6 can be restated as: a function $f \in \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ on a finite measure space $(\Omega,\Sigma,\mu)$ is integrable if and only if for each $\varepsilon > 0$, there is a nonnegative number $N$ such that

$$\int_{\{|f| > N\}} |f|\, d\mu < \varepsilon. \tag{6.2}$$

Consequently, a family $\Phi \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ is uniformly integrable with respect to a finite measure $\mu$ if for every $\varepsilon > 0$, there is a nonnegative number $N$ such that for every $f \in \Phi$, (6.2) holds true. This second variant of uniform integrability was originally introduced in connection with martingale theory in probability. Definition 6.1 is therefore more general. □

6.3 Examples.
(i) A finite set $\Phi = \{f_1,\ldots,f_n\}$ of $L^1$-functions forms a uniformly integrable family. Indeed, given an $\varepsilon > 0$, by Problem 1.22, Chapter 6, each $f_i$ has an $\varepsilon$-bound $g_i$. Therefore, $g = g_1 \vee \cdots \vee g_n$ is an $\varepsilon$-bound of $\Phi$. More generally, replacing $f_i$ by a uniformly integrable family $\Phi_i$ of functions, we deduce that the finite union of uniformly integrable families of functions is uniformly integrable.

(ii) In the Lebesgue Dominated Convergence Theorem, a sequence $\{f_n\}$, dominated by a nonnegative $L^1$-function $g$, is uniformly integrable. Indeed, since for each $n$, $|f_n| \le g$ a.e., we have that

$$\int_{\{|f_n| > g\}} |f_n|\, d\mu = 0 < \varepsilon.$$

However, it is not true that every uniformly integrable family is dominated by an integrable function. Consider a finite measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ such that $\mu(\{n\}) = \frac{1}{2^n}$, $n = 1,2,\ldots$, and a sequence $\{f_n\}$ of measurable functions defined as

$$f_n(k) = \begin{cases} \frac{2^n}{n}, & k = n, \\ 0, & k \ne n. \end{cases}$$

We will show that $\{f_n\}$ is uniformly integrable by using the definition of Remark 6.2. Let $N > 0$. Then, since, obviously,

$$\{f_n > N\} = \begin{cases} \{n\}, & \frac{2^n}{n} > N, \\ \emptyset, & \frac{2^n}{n} \le N, \end{cases}$$

holds, $\mathbf{1}_{\{f_n > N\}} \le \mathbf{1}_{\{n\}}$, leading to

$$\int_{\{|f_n| > N\}} |f_n|\, d\mu \le \int_{\{n\}} |f_n|\, d\mu = \frac{2^n}{n}\,\mu(\{n\}) = \frac{1}{n}.$$

Consequently, given an $\varepsilon > 0$ and $n > \frac{1}{\varepsilon}$, the set $\{f_{n+1}, f_{n+2}, \ldots\}$ satisfies (6.2) for every $N$ and is therefore uniformly integrable. Since $f_1,\ldots,f_n$ are integrable, the whole sequence $\{f_n\}$ is uniformly integrable by Example (i).

On the other hand, $g(k) = \frac{2^k}{k}$ is evidently the smallest function of those dominating the sequence $\{f_n\}$, and it is not $\mu$-integrable. Indeed, $g = \sup_n f_n$ and therefore,

$$\int g\, d\mu = \sup_n \sum_{k=1}^{n} \frac{2^k}{k}\,\mu(\{k\}) = \sup_n \sum_{k=1}^{n} \frac{1}{k} = \infty.$$

Therefore, there is no integrable dominating function for $\{f_n\}$. □
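Example 6.3 (ii) lends itself to exact computation, since all integrals over $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ are sums of rationals. The sketch below, with hypothetical helper names, evaluates the tail integrals of Remark 6.2 and the partial integrals of the envelope $g(k) = \frac{2^k}{k}$; the particular values $\varepsilon = \frac{1}{10}$ and $N = 103$ are chosen for the illustration.

```python
from fractions import Fraction


def mu(k):
    """Point mass mu({k}) = 2^(-k)."""
    return Fraction(1, 2 ** k)


def f(n, k):
    """f_n(k) = 2^n / n if k == n, and 0 otherwise."""
    return Fraction(2 ** n, n) if k == n else Fraction(0)


def tail_integral(n, N):
    """Integral of |f_n| over {|f_n| > N}; only the atom k = n can contribute."""
    value = f(n, n)
    return value * mu(n) if value > N else Fraction(0)


def integral_of_envelope(K):
    """Integral over {1, ..., K} of g(k) = sup_n f_n(k) = 2^k / k."""
    return sum(f(k, k) * mu(k) for k in range(1, K + 1))


eps = Fraction(1, 10)
N = 103  # 2^n / n <= 102.4 for n <= 10, while larger n give tails of 1/n < eps
worst_tail = max(tail_integral(n, N) for n in range(1, 200))

print("largest tail integral for N = 103:", worst_tail, "<", eps)
print("partial integrals of the envelope:",
      [float(integral_of_envelope(K)) for K in (10, 100, 1000)])
```

The tail integrals stay below $\varepsilon$ uniformly in $n$, whereas the partial integrals of the envelope are the harmonic numbers and grow without bound, in agreement with the conclusion of the example.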
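We immediately observe that

6.4 Proposition. If a family $\Phi$ is uniformly integrable, then

$$\sup\left\{\int |f|\, d\mu : f \in \Phi\right\} < \infty.$$

Proof. Indeed, given an $\varepsilon > 0$, let $g$ be an $\varepsilon$-bound of $\Phi$. Then,

$$\int |f|\, d\mu = \int_{\{|f| > g\}} |f|\, d\mu + \int_{\{|f| \le g\}} |f|\, d\mu \le \varepsilon + \int g\, d\mu,$$

good for all $f \in \Phi$. □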
The following is a useful criterion of uniform integrability for a sequence of functions on a finite measure space. We start with

6.5 Definition. A sequence $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ on a measure space $(\Omega,\Sigma,\mu)$ is said to be uniformly continuous in $L^p$ if $\int_A |f_n|^p\, d\mu \to 0$ with $\mu(A) \to 0$ uniformly in $n$. □

6.6 Theorem. Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be a sequence of functions on a finite measure space $(\Omega,\Sigma,\mu)$. $\{f_n\}$ is uniformly integrable if and only if it is uniformly continuous in $L^1$ and the integrals $\int |f_n|\, d\mu$ are uniformly bounded.

Proof.

1. Let $\{f_n\}$ be uniformly continuous in $L^1$ and the integrals $\int |f_n|\, d\mu$ be uniformly bounded, say by $M$. Then, by Chebyshev's Inequality (Lemma 5.3) and due to uniform boundedness,

$$\mu\{|f_n| > N\} \le \frac{1}{N}\int |f_n|\, d\mu \le \frac{M}{N}.$$

Hence, $\mu\{|f_n| > N\} \to 0$ as $N \to \infty$, uniformly in $n$, and this implies that

$$\int_{\{|f_n| > N\}} |f_n|\, d\mu \to 0, \quad \text{for all } n,$$

by uniform continuity. The latter leads to uniform integrability of $\{f_n\}$.

2. Let $\{f_n\}$ be uniformly integrable. Then,

$$\int_A |f_n|\, d\mu = \int_{A \cap \{|f_n| \le N\}} |f_n|\, d\mu + \int_{A \cap \{|f_n| > N\}} |f_n|\, d\mu \le N\mu(A) + \int_{\{|f_n| > N\}} |f_n|\, d\mu. \tag{6.6}$$

By uniform integrability of $\{f_n\}$, $N$ can be chosen such that

$$\int_{\{|f_n| > N\}} |f_n|\, d\mu < \frac{\varepsilon}{2}, \quad \text{for all } n.$$

If $\mu(A) < \frac{\varepsilon}{2N}$, then from (6.6) we have that $\int_A |f_n|\, d\mu < \varepsilon$, and thus $\{f_n\}$ is uniformly continuous in $L^1$. The uniform boundedness is due to Proposition 6.4. □

Now, we prove another criterion of uniform integrability, for arbitrary measures, generalizing Theorem 6.6.

6.7 Theorem. A family $\Phi \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ is uniformly $\mu$-integrable if and only if the following two conditions hold:

A) $\sup\left\{\int |f|\, d\mu : f \in \Phi\right\} < \infty$.

B) For each $\varepsilon > 0$, there is a nonnegative $L^1$-function $\varphi$ and $\delta > 0$ such that for each measurable set $A$ with $\int_A \varphi\, d\mu < \delta$, $\int_A |f|\, d\mu < \varepsilon$ uniformly for all $f \in \Phi$.

Proof.

1. Suppose conditions A) and B) are met. For each $c > 0$ and $f \in \Phi$,

$$\int_{\{|f| \ge c\varphi\}} \varphi\, d\mu \le \frac{1}{c}\int_{\{|f| \ge c\varphi\}} |f|\, d\mu \le \frac{1}{c}\int |f|\, d\mu.$$

Since by A), $\int |f|\, d\mu \le M$, $c$ can be chosen large enough to have $\int_A \varphi\, d\mu < \delta$ with $A = \{|f| \ge c\varphi\}$. Then, by B), $\int_A |f|\, d\mu < \varepsilon$ for all $f$, and thus $c\varphi$ is an $\varepsilon$-bound for $\Phi$.

2. Conversely, let $\Phi$ be uniformly integrable. Since

$$\int_A |f|\, d\mu = \int_{A \cap \{|f| > g\}} |f|\, d\mu + \int_{A \cap \{|f| \le g\}} |f|\, d\mu,$$

we have

$$\int_A |f|\, d\mu \le \int_{\{|f| > g\}} |f|\, d\mu + \int_A g\, d\mu. \tag{6.7}$$

If $g$ is an $\varepsilon$-bound of $\Phi$, then (6.7) with $A = \Omega$ yields

$$\int |f|\, d\mu \le \varepsilon + \int g\, d\mu < \infty$$

and thus condition A). Taking $\varphi = g$ and $\delta = \varepsilon$, for each measurable set $A$ with $\int_A g\, d\mu < \varepsilon$ we have from (6.7) that $\int_A |f|\, d\mu < 2\varepsilon$, and thereby condition B). □
6.8 Proposition. Let $\Phi \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$ for some $1 \le p < \infty$ and suppose the family $\Phi^p := \{|f|^p : f \in \Phi\}$ is uniformly integrable. Then, for any fixed $a, b \in \mathbb{C}$, the family $\{|af + bg|^p : f, g \in \Phi\}$ is also uniformly integrable.

Proof. For any $f \in L^p$ and $A \in \Sigma$, $f\mathbf{1}_A \in L^p$. By Minkowski's inequality,

$$\|(f_1 + f_2)\mathbf{1}_A\|_p \le \|f_1\mathbf{1}_A\|_p + \|f_2\mathbf{1}_A\|_p. \tag{6.8}$$

Now, let $f_1 = af$ and $f_2 = bg$, for some $f, g \in \Phi$. Then, from (6.8) and subsequently, by (4.2),

$$\|(af + bg)\mathbf{1}_A\|_p^p \le \left(|a|\,\|f\mathbf{1}_A\|_p + |b|\,\|g\mathbf{1}_A\|_p\right)^p \le 2^p\left(|a|^p \|f\mathbf{1}_A\|_p^p + |b|^p \|g\mathbf{1}_A\|_p^p\right).$$

Therefore, by Theorem 6.7, conditions A) and B) for $|f|^p$ imply those for $|af + bg|^p$. □
By Proposition 5.4, $f_n \xrightarrow{L^p} f$ implies that $f_n \xrightarrow{\mu} f$. The converse holds true if, in addition, $\{|f_n|^p\}$ is uniformly integrable. The following two versions of the converse are left for the reader.

6.9 Theorem. Let $f, \{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{C})$ be a sequence on a finite measure space $(\Omega,\Sigma,\mu)$ such that $f_n \xrightarrow{\mu} f$. If $\{|f_n|^p\}$ is uniformly integrable for some $p > 0$, then $f_n \xrightarrow{L^p} f$.

6.10 Theorem. For each sequence $\{f_n\} \subset L^p(\Omega,\Sigma,\mu;\mathbb{C})$, the following are equivalent:

(i) $\{f_n\}$ is $L^p$-convergent.
(ii) $\{f_n\}$ is convergent in measure and $\{|f_n|^p\}$ is uniformly integrable.

PROBLEMS

6.1 Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega,\Sigma,\mu)$. (Using Fatou's Lemma) show that [displayed statement missing in the source]

6.2 Let $\{f_n\} \subset \mathcal{C}^{-1}(\Omega,\Sigma;\mathbb{R})$ be a uniformly integrable sequence on a measure space $(\Omega,\Sigma,\mu)$. Show that if $f_n \to f$ $\mu$-a.e. on $\Omega$ or in measure, then $f$ is integrable.

6.3 Prove Theorem 6.9.

6.4 Prove Theorem 6.10.
NEW TERMS:

uniformly integrable family of functions
$\varepsilon$-bound of a family of functions
uniformly integrable sequence of functions, a criterion of

7. RADON MEASURES ON LOCALLY COMPACT HAUSDORFF SPACES
We will assume that (X, r ) is a locally compact Hausdorff"-/topological space, <:B( X) is the Borel u-algebra generated by T , and m = m (X, <:B( X)) is the family of all positive Borel measures on <:B(X) . Let fi=' = fi='(X) and R = R(X) be the families of closed and compact sets in (X, r ) , res pectively. Unless specified otherwise, under a Borel m easure we will understand a positive Borel measure. "--
7.1 Definition.
(i) Let $\mu \in \mathfrak{M}$. A Borel set $A$ is called:
a) $\mu$-outer regular if $\mu(A) = \inf\{\mu(G) : G \supseteq A,\ G \in \tau\}$.
b) $\mu$-inner regular if $\mu(A) = \sup\{\mu(K) : K \subseteq A,\ K \in \mathcal{K}\}$.
(ii) A Borel measure $\mu$ is said to be outer (inner) regular on a subfamily $\mathcal{S} \subseteq \mathcal{B}(X)$ if all elements of $\mathcal{S}$ are $\mu$-outer (-inner) regular.
(iii) A Borel measure $\mu$ is called weakly regular or Radon if:
a) $\mu$ is finite on $\mathcal{K}(X)$ (compact sets).
b) $\mu$ is outer regular on $\mathcal{B}(X)$ (Borel sets).
c) $\mu$ is inner regular on $\tau$ (open sets).
(iv) A Borel measure $\mu$ is called regular if $\mu$ is Radon and it is inner regular on $\mathcal{B}(X)$.
Denote by $\mathfrak{R} = \mathfrak{R}(X)$ the subfamily of Radon measures on $\mathcal{B}(X)$. □

As we recall, the Radon-Nikodym Theorem states that, given two measures $\mu$ and $\nu$ in the relation $\nu \ll \mu$, there is a unique equivalence class of density functions $[f]_\mu$ such that, for each $f \in [f]_\mu$, $\nu = \int f\,d\mu$. Therefore, the integral $\int (\cdot)\,d\mu$ "represents" a function $f$; more precisely, a class of functions. We will be interested in another representation of the integral.
From Section 1 of Chapter 6, we learned that, given a measure $\mu$, the integral $f \mapsto \int f\,d\mu =: I(f)$ is a linear functional on $L^1(\Omega,\Sigma,\mu)$. Can a general linear functional be represented by an integral with respect to a particular measure? Rephrasing the latter, can a given linear functional $I$ on a function space $\Phi$ be associated with some measure, say $\mu$, so that, for each $f \in \Phi$, $I(f) = \int f\,d\mu$? As this section shows, the answer is affirmative for the space $\mathcal{C}_c(X)$. (Recall that $\mathcal{C}_c(X)$ denotes the subspace of all continuous functions with compact support. We suggest that the reader turn to Section 11 of Chapter 3 for a refresher and notation.) More specifically, given a positive linear functional $I$ on $\mathcal{C}_c(X)$, there exists a unique Radon measure $\mu \in \mathfrak{M}$ such that $I(f) = \int f\,d\mu$ holds true for all $f \in \mathcal{C}_c(X)$.
7.2 Definition and Remark. Let $\mathcal{F} = \mathcal{F}(X;\mathbb{R})$ be the vector space of all real-valued functions on $X$. An operator $[\mathcal{F},\mathbb{R},I]$ is referred to as a positive linear functional if $I$ is linear on $\mathcal{F}$ and $I(f) \ge 0$ whenever $f \ge 0$. As a linear and positive functional, $I$ is monotone. Indeed, for $f \le g$, we have $g - f \ge 0$, and hence, $I(g) = I(g - f) + I(f) \ge I(f)$. □
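A standard consequence of monotonicity, sketched here as an illustration that is not part of the original text: whenever both $f$ and $|f|$ lie in the domain of a positive linear functional $I$ (an assumption that holds, for instance, on $\mathcal{C}_c(X)$), one has $|I(f)| \le I(|f|)$.

```latex
% Assumption: f and |f| both belong to the domain of the positive linear functional I.
\begin{align*}
-|f| \le f \le |f|
  &\;\Longrightarrow\; -I(|f|) = I(-|f|) \le I(f) \le I(|f|)
  && \text{(linearity and monotonicity of } I\text{)} \\
  &\;\Longrightarrow\; |I(f)| \le I(|f|).
\end{align*}
```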
7.3 Notation. Let $I$ be a positive linear functional on $\mathcal{C}_c(X) = \mathcal{C}_c(X,\tau;\mathbb{R})$. Define a set function $\gamma$ on $\tau$ as
$$\gamma(U) = \sup\{I(f) : f \in \mathcal{C}_c(X) \text{ and } f \prec U\} \tag{7.2}$$
and extend it from $\tau$ to $\mathcal{P}(X)$ by introducing
$$\mu^*(Q) = \inf\{\gamma(U) : U \supseteq Q \text{ and } U \text{ open}\}. \tag{7.2a}$$
□
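To see formulas (7.2) and (7.2a) at work, here is a small worked example that is not in the original text. It assumes the evaluation functional $I(f) = f(x_0)$ at a fixed point $x_0 \in X$ (a hypothetical choice made only for illustration), and it uses the fact, quoted in this section from Corollary 11.4 of Chapter 3, that for a compact $K$ inside an open $U$ there is $f \in \mathcal{C}_c(X)$ with $K \prec f \prec U$.

```latex
% Assumption: I(f) = f(x_0) for a fixed point x_0 in X (illustrative choice).
\begin{align*}
x_0 \in U &: \quad \{x_0\} \prec f \prec U \text{ for some } f \in \mathcal{C}_c(X)
   \;\Longrightarrow\; \gamma(U) \ge f(x_0) = 1, \text{ and } \gamma(U) \le 1 \text{ since } f \le 1; \\
x_0 \notin U &: \quad f \prec U \;\Longrightarrow\; \operatorname{supp} f \subseteq U
   \;\Longrightarrow\; f(x_0) = 0 \;\Longrightarrow\; \gamma(U) = 0; \\
\mu^*(Q) &= \inf\{\gamma(U) : U \supseteq Q,\ U \in \tau\}
   = \begin{cases} 1, & x_0 \in Q, \\ 0, & x_0 \notin Q \end{cases}
   \qquad (\text{for } x_0 \notin Q \text{ take } U = X \setminus \{x_0\}).
\end{align*}
```

Under these assumptions $\mu^*$ is the point mass at $x_0$, and indeed $I(f) = f(x_0) = \int f\,d\mu^*$.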
Recall (see Definition 11.1 (iii), Chapter 3) that a function $f$ with compact support is subordinate to an open set $U$, in notation $f \prec U$, if $0 \le f \le 1$ and $\operatorname{supp} f \subseteq U$. Furthermore, $f \prec \emptyset$ if and only if $f = 0$. Thus $\mu^*(\emptyset) = \gamma(\emptyset) = 0$. Consequently, (7.2) and (7.2a) define nonnegative set functions on $\tau$ and $\mathcal{P}(X)$, respectively. As we will see, $\mu^*$ is an outer measure induced by $\gamma$ through (7.2a). The latter is not the traditional Carathéodory construction of an outer measure from a formatter $(\mathcal{S},\gamma)$, where $\mathcal{S}$ was a semi-ring and $\gamma$ was $\sigma$-additive. The above extension is rather of topological nature. Notice that (7.2a) defines outer regularity of $\mu^*$ on $\mathcal{P}(X)$.

7.4 Proposition. The set function $\mu^*$ defined by (7.2)-(7.2a) is an outer measure on $\mathcal{P}(X)$.
Proof. If $U$ and $V$ are two open sets such that $U \subseteq V$, then $f \prec U$ yields that $f \prec V$ and therefore,
$$\gamma(U) \le \gamma(V),$$
which yields the monotonicity of $\gamma$ and hence of $\mu^*$. It remains to prove $\sigma$-subadditivity of $\mu^*$. (See Definition 2.1, Chapter 5.) Let $\{Q_k\}$ be a sequence of subsets of $X$ with
$$\sum_{k=1}^{\infty} \mu^*(Q_k) < \infty,$$
or else, the inequality
$$\mu^*\Bigl(\bigcup_{k=1}^{\infty} Q_k\Bigr) \le \sum_{k=1}^{\infty} \mu^*(Q_k)$$
holds true trivially. Given $\varepsilon > 0$, for each $k = 1,2,\ldots$, there is an open superset $U_k$ of $Q_k$ such that $\gamma(U_k) < \mu^*(Q_k) + \varepsilon/2^k$. By Corollary 11.6, Chapter 3, there is an $f \in \mathcal{C}_c(X)$ such that
$$0 \le f \le 1 \quad\text{and}\quad f \prec U = \bigcup_{k=1}^{\infty} U_k.$$
Then, $K = \operatorname{supp} f \subseteq U$. Since $K$ is compact, $\{U_1,U_2,\ldots\}$ can be reduced to a finite subcover of $K$, say $\{U_1,\ldots,U_n\}$. We can now apply Theorem 11.3, Chapter 3, on the partition of unity, to $\{U_1,\ldots,U_n\}$ and $K$. In other words, there is an $n$-tuple $\{f_1,\ldots,f_n\} \subset \mathcal{C}_c(X)$ subordinate to the cover $\{U_1,\ldots,U_n\}$ of $K$, i.e.,
$f_i \prec U_i$ and $K \prec \sum_{i=1}^n f_i$ (of course, $0 \le \sum_{i=1}^n f_i \le 1$). Since $\sum_{i=1}^n f_i = 1$ on $K$, we have that $f = \sum_{i=1}^n f f_i$ and $f f_i \prec U_i$, and hence
$$I(f) = \sum_{i=1}^n I(f f_i) \le \sum_{i=1}^n \gamma(U_i) \le \sum_{i=1}^{\infty} \gamma(U_i) < \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon.$$
The inequality
$$I(f) < \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon$$
holds true for every $f \in \mathcal{C}_c(X)$ such that $f \prec U$, given $U = \bigcup_{k=1}^{\infty} U_k$. Hence,
$$\gamma(U) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon.$$
However, since $\mu^*$ is monotone and $Q := \bigcup_{k=1}^{\infty} Q_k \subseteq U$, we have
$$\mu^*(Q) \le \mu^*(U) = \gamma(U) \le \sum_{i=1}^{\infty} \mu^*(Q_i) + \varepsilon,$$
good for all $\varepsilon > 0$. This yields the desired $\sigma$-subadditivity. □
As an outer measure on $\mathcal{P}(X)$, in accordance with Theorem 2.3, Chapter 5, $\mu^*$ generates the $\sigma$-algebra $\Sigma^*$ of $\mu^*$-measurable sets that "separate" all other subsets of $X$. (See Definition 2.2, Chapter 5.) By the same theorem, $\mu_0^* = \operatorname{Res}_{\Sigma^*}\mu^*$ is a measure on $\Sigma^*$. We are going to show, among other things, that all open sets are $\mu^*$-measurable, which would yield that $\mathcal{B}(X) \subseteq \Sigma^*$. Therefore, the further restriction of $\mu_0^*$ from $\Sigma^*$ to $\mathcal{B}(X)$ will make $\mu_0^*$ a Borel measure $\mu$ which, in addition, will turn out to be weakly regular. The latter will be followed by the unique integral representation $I(f) = \int f\,d\mu$ valid for all $f \in \mathcal{C}_c(X)$ with respect to the Radon measure $\mu$ induced by $I$. All of this essentially forms the Riesz Representation Theorem, which we will break up into several smaller propositions and theorems.

Notice that in the sequence of statements below we shall be using $\mu^*$ whenever it is applied to sets other than open sets (for which we use its restriction $\gamma$ on $\tau$), as we do not know yet that they belong to $\Sigma^*$.
7.5 Proposition. Let $K$ be a compact set in $(X,\tau)$. Then, there exists a nonnegative function $g \in \mathcal{C}_c(X)$ such that $K \prec g$ and $\mu^*(K) \le I(g)$. In particular, $\mu^*$ is finite on $\mathcal{K}(X)$.

Proof. In accordance with Theorem 10.9, Chapter 3, any compact set in a locally compact Hausdorff space can be covered by finitely many open sets whose closures are compact. Hence, for any compact set $K$, there is an open superset of $K$, say $U$, whose closure $\overline{U}$ is compact. By Corollary 11.5, Chapter 3, given $\overline{U}$, there is a function $g \in \mathcal{C}_c(X)$ such that $1_{\overline{U}} \le g \le 1$; in particular, $K \prec g$. On the other hand, every $f \in \mathcal{C}_c(X)$ with $f \prec U$ (such functions exist, e.g. with $K \prec f \prec U$, by Corollary 11.4, Chapter 3) satisfies $f \le 1_U \le g$, and by Remark 7.2, $I(f) \le I(g)$ for all such $f$'s. Hence, $\gamma(U) \le I(g)$. Finally, by monotonicity of $\mu^*$,
$$\mu^*(K) \le \mu^*(U) = \gamma(U) \le I(g) < \infty.$$
□
A very similar result is formulated as follows.

7.6 Proposition. Let $K$ be a compact set in $(X,\tau)$ and $g \in \mathcal{C}_c(X)$ such that $g \ge 0$ and $g_*(K) = 1$ (i.e., $g = 1$ on $K$). Then $\mu^*(K) \le I(g)$ and $\mu^*$ is finite on $\mathcal{K}(X)$.
Notice that unlike Proposition 7.5, the function $g$ is given and it is not required to satisfy $g \le 1$.

Proof. Let $0 < \alpha < 1$ and $U_\alpha = \{x \in X : g(x) > \alpha\}$. Then $U_\alpha$ is an open set containing $K$. By Corollary 11.6, Chapter 3, there is $h \in \mathcal{C}_c(X)$ such that $h \prec U_\alpha$. It is readily seen that $\alpha^{-1} g \ge h$. (It is strictly greater on $U_\alpha$ and greater than or equal to $h$ elsewhere.) It follows that
$$\alpha^{-1} I(g) \ge I(h), \quad\text{good for all } h \prec U_\alpha,$$
and therefore for $\sup\{I(h) : h \prec U_\alpha\} = \gamma(U_\alpha)$. From this and by monotonicity of $\mu^*$,
$$\mu^*(K) \le \mu^*(U_\alpha) = \gamma(U_\alpha) \le \alpha^{-1} I(g).$$
The above inequality holds true for all $\alpha \uparrow 1$, which gives $\mu^*(K) \le I(g)$. Finally, given $K \in \mathcal{K}$, by Corollary 11.5, Chapter 3, there is $g \in \mathcal{C}_c(X)$ such that $K \prec g$, which yields that $\mu^*(K)$ is finite. □

7.7 Lemma. $\mu^*$ is finitely additive on $\mathcal{K}$.

Proof. Let $K_1$ and $K_2$ be two disjoint compact sets. By Corollary 10.12, Chapter 3, in a locally compact Hausdorff space, $K_1$ and $K_2$ can be separated by two disjoint open supersets, say $U$ and $V$, respectively. Now, for each $\varepsilon > 0$, there is an open superset $W$ of $K_1 + K_2$ such that
$$\gamma(W) < \mu^*(K_1 + K_2) + \varepsilon.$$
Since $(U + V) \cap W$ covers $K_1 + K_2$, the open sets $U_1 = U \cap W$ and $U_2 = V \cap W$ cover $K_1$ and $K_2$, respectively. By monotonicity of $\gamma$,
$$\mu^*(K_1 + K_2) = \inf\{\gamma(O) : K_1 + K_2 \subseteq O \in \tau\} > \gamma(W) - \varepsilon \ge \gamma(U_1 + U_2) - \varepsilon. \tag{7.7}$$
On the other hand, by Corollary 11.4, Chapter 3, there are $f_1, f_2 \in \mathcal{C}_c(X)$ such that $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$. Therefore, by Proposition 7.6,
$$\mu^*(K_1) + \mu^*(K_2) \le I(f_1) + I(f_2) = I(f_1 + f_2). \tag{7.7a}$$
Obviously, in our case, $K_1 \prec f_1 \prec U_1$ and $K_2 \prec f_2 \prec U_2$ if and only if $K_1 + K_2 \prec f_1 + f_2 \prec U_1 + U_2$, and hence from (7.7a),
$$\mu^*(K_1) + \mu^*(K_2) \le \gamma(U_1 + U_2).$$
The latter, combined with (7.7) for $\varepsilon \downarrow 0$, yields
$$\mu^*(K_1) + \mu^*(K_2) \le \mu^*(K_1 + K_2).$$
The inverse inequality is due to subadditivity of $\mu^*$. □
7.8 Theorem. $\mu^*$ is inner regular on $\tau$.
Proof. We need to prove that
$$\gamma(U) = \mu^*(U) = \sup\{\mu^*(K) : K \subseteq U,\ K \in \mathcal{K}\}. \tag{7.8}$$
Given an $\varepsilon > 0$ and $U$ open with $\gamma(U) < \infty$, let $a \in \mathbb{R}$ be such that $\gamma(U) = a + \varepsilon$. By Corollary 11.6, Chapter 3, there is $f \prec U$ such that $I(f) + \varepsilon > \gamma(U) = a + \varepsilon$. Hence, $I(f) > a$. Let $K = \operatorname{supp} f$. Then, by Problem 7.1, $\mu^*(K) \ge I(f) > a$ and
$$\mu^*(K) + \varepsilon > a + \varepsilon = \gamma(U). \tag{7.8a}$$
Thus, we showed that, given $\varepsilon > 0$, there is a compact set $K \subseteq U$ with (7.8a) holding. This yields (7.8). Now, let $\gamma(U) = \infty$. Since $\gamma(U) = \sup\{I(f) : f \prec U\}$, for any $M > 0$ (arbitrarily large), there is $f \in \mathcal{C}_c(X)$, $f \prec U$, such that $I(f) > M$. Given $K = \operatorname{supp} f$, by Problem 7.1, $\mu^*(K) > M$. Hence, we showed that, given $U$ with $\gamma(U) = \infty$ and $M > 0$, arbitrarily large, there is a compact subset $K \subseteq U$ such that $\mu^*(K) > M$. Therefore,
$$\sup\{\mu^*(K) : K \subseteq U,\ K \in \mathcal{K}\} = \infty.$$
□

7.9 Theorem. $\tau \subseteq \Sigma^*$. Consequently, $\mathcal{B}(X) \subseteq \Sigma^*$.

Proof. We need to show that for each $Q \subseteq X$ and $U \in \tau$,
$$\mu^*(Q) = \mu^*(Q \cap U) + \mu^*(Q \cap U^c). \tag{7.9}$$
1. First, let $Q \in \tau$. Then, $Q \cap U \in \tau$ and $\gamma(Q \cap U) = \sup\{I(f) : f \prec Q \cap U\}$. Hence, for each $\varepsilon > 0$, by Corollary 11.6, Chapter 3, there is an $f \prec Q \cap U$ such that $I(f) + \varepsilon > \gamma(Q \cap U)$. Because $Q \cap (\operatorname{supp} f)^c$ is an open set, there is $g \prec Q \cap (\operatorname{supp} f)^c$ such that $I(g) + \varepsilon > \gamma(Q \cap (\operatorname{supp} f)^c)$. Clearly, $f + g \prec Q$. Consequently,
$$\gamma(Q) \ge I(f) + I(g) > \gamma(Q \cap U) + \gamma(Q \cap (\operatorname{supp} f)^c) - 2\varepsilon. \tag{7.9a}$$
On the other hand, $Q \cap (\operatorname{supp} f)^c \supseteq Q \cap (U \cap Q)^c = Q \cap U^c$, which leads to
$$\gamma(Q \cap (\operatorname{supp} f)^c) = \mu^*(Q \cap (\operatorname{supp} f)^c) \ge \mu^*(Q \cap U^c).$$
The latter, along with (7.9a), yields
$$\gamma(Q) > \gamma(Q \cap U) + \mu^*(Q \cap U^c) - 2\varepsilon$$
and hence,
$$\gamma(Q) \ge \gamma(Q \cap U) + \mu^*(Q \cap U^c).$$
The inverse inequality is, as usual, due to subadditivity of $\mu^*$.

2. Let $Q \subseteq X$. If $\mu^*(Q) = \infty$, then the separation is due to subadditivity. Let $\mu^*(Q) < \infty$. Then, since
$$\mu^*(Q) = \inf\{\gamma(V) : Q \subseteq V \in \tau\},$$
for each $\varepsilon > 0$, there is an open superset $V$ of $Q$ such that
$$\mu^*(Q) + \varepsilon > \gamma(V) = \gamma(V \cap U) + \mu^*(V \cap U^c) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c),$$
where the equality holds by case 1. For $\varepsilon \downarrow 0$,
$$\mu^*(Q) \ge \mu^*(Q \cap U) + \mu^*(Q \cap U^c),$$
and the inverse inequality follows from subadditivity of $\mu^*$. Thus we showed that $\tau \subseteq \Sigma^*$. This immediately implies that all Borel sets are $\mu^*$-measurable. □
From now on, the restriction of $\mu^*$ from $\Sigma^*$ (actually, of $\mu_0^*$) to $\mathcal{B}(X)$ will be denoted by $\mu$. The last two theorems finalize the most significant feature of $\mu^*$, besides its integral representation: its restriction from $\mathcal{P}(X)$ to $\mathcal{B}(X)$ is a Radon measure. Indeed, Theorem 7.8 states that $\mu^*$ is inner regular on $\tau$. Proposition 7.5 states that $\mu^*$ is finite on compact sets. Theorem 7.9 states that $\operatorname{Res}_{\mathcal{B}(X)}\mu^* = \mu$ is a Borel measure. And, finally, $\mu^*$ is outer regular, by definition, on $\mathcal{P}(X)$, and therefore, on $\mathcal{B}(X)$.

7.10 Theorem (Riesz's Representation Theorem). For any positive linear functional $I$ on $\mathcal{C}_c(X)$ there is a Radon measure $\mu$ such that for all $f \in \mathcal{C}_c(X)$,
$$I(f) = \int f\,d\mu. \tag{7.10}$$
Proof. We have shown in the above theorems that, through formulas (7.2) and (7.2a), $I$ induces a Radon measure $\mu$ on the Borel $\sigma$-algebra $\mathcal{B}(X)$. We need to prove that (7.10) holds true. Let $f \in \mathcal{C}_c(X)$, $K = \operatorname{supp} f$, and let $U$ be an open set such that $K \subseteq U$ and $\mu(U) < \infty$. Since $f$ is bounded, there is an $M < \infty$ such that $\|f\|_u < M$ (where $\|\cdot\|_u$ stands for the supremum norm). Given $\varepsilon > 0$, let $\{t_0,\ldots,t_n\}$ be a partition of the interval $[-M,M]$ with
$$t_i = -M + i\,m$$
such that the mesh, $m = \frac{2M}{n}$, of the partition is less than $\varepsilon$. Denote
$$E_i = f^*((t_{i-1},t_i]) \cap K \quad\text{and}\quad W_i = f^*((t_i - \varepsilon,\, t_i + \varepsilon)) \cap U, \qquad i = 1,\ldots,n,$$
so that each $W_i$ is open and $t_i - \varepsilon < f < t_i + \varepsilon$ on $W_i$. By outer regularity of $\mu$, there is an open superset $V_i$ of $E_i$ such that
$$\mu(V_i) < \mu(E_i) + \frac{\varepsilon}{n}. \tag{7.10a}$$
Since $E_i \subseteq W_i$, we have that $E_i \subseteq W_i \cap V_i$, and therefore, (7.10a) still holds when $V_i$ is replaced by $U_i = W_i \cap V_i$. Because $E_i \subseteq U_i \subseteq W_i$, we have that $K \subseteq \bigcup_{i=1}^n U_i \subseteq U$. Thus, $\{U_1,\ldots,U_n\}$ is an open cover of $K$ and, by Theorem 11.3, Chapter 3, there is a partition $\{g_1,\ldots,g_n\} \subset \mathcal{C}_c(X)$ of unity for $K$ subordinate to this open cover, i.e., $g_i \prec U_i$ and $K \prec \sum_{i=1}^n g_i$. Because $f < t_i + \varepsilon$ on $W_i$, it holds on any subset of $W_i$, and thus
$$f g_i \le (t_i + \varepsilon)\, g_i. \tag{7.10b}$$
Since $\sum_{i=1}^n g_i = 1$ on $K$,
$$f = \sum_{i=1}^n f g_i. \tag{7.10c}$$
Also, note that $\sum_{i=1}^n E_i = K$ and
$$\int f\,d\mu = \int_K f\,d\mu = \sum_{i=1}^n \int_{E_i} f\,d\mu.$$
The latter, along with (7.10b) and (7.10c), yield:
$$I(f) - \int f\,d\mu = \sum_{i=1}^n I(f g_i) - \sum_{i=1}^n \int_{E_i} f\,d\mu$$
(since $f > t_i - \varepsilon$ on $W_i$ and thus on $E_i$)
$$\le \sum_{i=1}^n (t_i + \varepsilon)\, I(g_i) - \sum_{i=1}^n (t_i - \varepsilon)\,\mu(E_i)$$
$$\le \sum_{i=1}^n \bigl[(t_i + \varepsilon)\,\mu(U_i) - (t_i + \varepsilon - 2\varepsilon)\,\mu(E_i)\bigr]$$
$$= \sum_{i=1}^n (t_i + \varepsilon)\bigl[\mu(U_i) - \mu(E_i)\bigr] + 2\varepsilon \sum_{i=1}^n \mu(E_i)$$
$$< \sum_{i=1}^n (t_i + \varepsilon)\,\frac{\varepsilon}{n} + 2\varepsilon\,\mu(K) \le \varepsilon\,[M + \varepsilon + 2\mu(K)].$$
Letting $\varepsilon \downarrow 0$ we arrive at
$$I(f) \le \int f\,d\mu.$$
Now, the equality is reached when replacing $f$ by $-f$. □
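A concrete instance of the representation may help; the following worked computation is an illustration that is not part of the original text. It assumes $X = \mathbb{R}$ with the usual topology and $I(f) = \int_{-\infty}^{\infty} f(x)\,dx$ (the Riemann integral on $\mathcal{C}_c(\mathbb{R})$), and the trapezoidal functions $f_n$ are hypothetical test functions introduced only for this computation.

```latex
% Assumptions: X = R, I(f) = Riemann integral of f, U = (a,b), n > 2/(b-a).
% f_n = 1 on [a + 1/n, b - 1/n], f_n = 0 outside [a + 1/(2n), b - 1/(2n)], linear in between.
\begin{align*}
f \prec (a,b) &\;\Longrightarrow\; 0 \le f \le 1_{(a,b)}
   \;\Longrightarrow\; I(f) \le b - a
   \;\Longrightarrow\; \gamma((a,b)) \le b - a, \\
I(f_n) &= \Bigl(b - a - \frac{2}{n}\Bigr) + 2 \cdot \frac{1}{2}\cdot\frac{1}{2n}
        = b - a - \frac{3}{2n} \;\xrightarrow[n \to \infty]{}\; b - a, \\
\gamma((a,b)) &= \sup\{I(f) : f \prec (a,b)\} = b - a .
\end{align*}
```

So, under these assumptions, the measure induced by $I$ assigns to every bounded open interval its length, as the Borel-Lebesgue measure does.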
7.11 Proposition. The Radon measure in equation (7.10) is unique.

Proof. Suppose $\nu$ is another Radon measure induced by $I$ for which equation (7.10) holds. Let $K$ be a compact set. Then, by the outer regularity, for each $\varepsilon > 0$, there is an open set $U \supseteq K$ such that
$$\mu(K) + \varepsilon > \mu(U).$$
By Corollary 11.4, Chapter 3, there exists $f \in \mathcal{C}_c(X)$ such that $K \prec f \prec U$, yielding that $1_K \le f \le 1_U$ and hence
$$\nu(K) = \int 1_K\,d\nu \le \int f\,d\nu = \int f\,d\mu \le \int 1_U\,d\mu = \mu(U) < \mu(K) + \varepsilon.$$
Thus, $\nu(K) \le \mu(K)$. Interchanging the roles of $\mu$ and $\nu$ we arrive at $\mu = \nu$ on $\mathcal{K}$. Inner regularity allows us to state that also $\mu = \nu$ on $\tau$, and outer regularity finally yields $\mu = \nu$ on $\mathcal{B}(X)$. □

7.12 Theorem. Any Radon measure $\mu$ is inner regular on $\mu$-finite Borel sets.

Proof. Let $B \in \mathcal{B}(X)$ be such that $\mu(B) < \infty$. We need to show that
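$$\mu(B) = \sup\{\mu(K) : K \subseteq B,\ K \in \mathcal{K}(X)\}$$
or, equivalently, that for each $\varepsilon > 0$, there is a compact subset $K$ of $B$ such that $\mu(K) + \varepsilon > \mu(B)$. Choose $\varepsilon > 0$. Since $B$ is $\mu$-outer regular, there is an open set $U \supseteq B$ such that
$$\mu(B) + \frac{\varepsilon}{4} > \mu(U). \tag{7.12}$$
Since $U$ is $\mu$-inner regular, there is a compact set $C$ such that $C \subseteq U$ and
$$\mu(C) + \frac{\varepsilon}{2} > \mu(U). \tag{7.12a}$$
Since $U \setminus B$ (as a Borel set) is $\mu$-outer regular, there is an open superset $V \supseteq U \setminus B$ such that, along with (7.12),
$$\mu(V) < \mu(U \setminus B) + \frac{\varepsilon}{4} < \frac{\varepsilon}{2}. \tag{7.12b}$$
Since $U \setminus B \subseteq V$, we have that $V^c \subseteq U^c \cup B$. Hence,
$$C \setminus V = C \cap V^c \subseteq C \cap (U^c \cup B) = C \cap B \ \ (\text{as } C \subseteq U) \ \subseteq U \cap B = B \ \ (\text{since } B \subseteq U).$$
We see that $C \setminus V$ is a compact subset of $B$ with:
$$\mu(C \setminus V) = \mu(C) - \mu(C \cap V) > \mu(U) - \frac{\varepsilon}{2} - \frac{\varepsilon}{2} \ge \mu(B) - \varepsilon$$
(by (7.12a), since $\mu(C \cap V) \le \mu(V) < \frac{\varepsilon}{2}$ by (7.12b), and as $\mu(U) \ge \mu(B)$). □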
The reader can rather easily conclude that:

7.13 Corollary. If $B$ is a $\sigma$-finite Borel set, then $B$ is $\mu$-inner regular. (See Problem 7.4.)
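7.14 Proposition. Let $\mu$ be a $\sigma$-finite Radon measure and $B \in \mathcal{B}(X)$. Then for each $\varepsilon > 0$, there is a closed subset $F$ of $B$ and an open superset $U$ of $B$ such that $\mu(U \setminus F) < \varepsilon$.

Proof. Let $\{B_n;\ n = 1,2,\ldots\}$ be a partition of $B$ such that $\mu(B_n) < \infty$. Since each $B_n$ is $\mu$-outer regular, there is an open superset $U_n$ of $B_n$ such that
$$\mu(U_n \setminus B_n) < \frac{\varepsilon}{2^{n+1}}.$$
Let $U = \bigcup_{n=1}^{\infty} U_n$. Then, $B \subseteq U$ and
$$U \setminus B = \Bigl(\bigcup_{n=1}^{\infty} U_n\Bigr) \cap \Bigl(\sum_{k=1}^{\infty} B_k\Bigr)^c = \bigcup_{n=1}^{\infty}\Bigl[U_n \cap \Bigl(\bigcap_{k=1}^{\infty} B_k^c\Bigr)\Bigr] \subseteq \bigcup_{n=1}^{\infty} (U_n \setminus B_n).$$
Therefore,
$$\mu(U \setminus B) \le \sum_{n=1}^{\infty} \mu(U_n \setminus B_n) < \frac{\varepsilon}{2}.$$
Now, we apply to $B^c$ the same arguments as above to have an open superset $V$ of $B^c$ with $\mu(V \setminus B^c) < \frac{\varepsilon}{2}$. Then, $F := V^c$ is closed and $F \subseteq B$. Finally, because $B \setminus F = V \setminus B^c$,
$$\mu(U \setminus F) = \mu(U \setminus B) + \mu(B \setminus F) < \varepsilon.$$
□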
The following proposition is an easy consequence of Corollary 7. 13 and Proposition 7. 14 and is offered as a small challenge for the reader as Problem 7. 7.
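7.15 Proposition. Let $\mu$ be a Radon measure on $\mathcal{B}(X)$, where $X$ is a locally compact Hausdorff space. If $B$ is a $\sigma$-finite Borel set, then for each $\varepsilon > 0$, there are a compact set $K$ and an open set $U$ such that $K \subseteq B \subseteq U$ and $\mu(U \setminus K) < \varepsilon$.

7.16 Proposition. Let $\mu$ be a $\sigma$-finite Radon measure. Then for any $B \in \mathcal{B}(X)$, there is an $F_\sigma$ subset of $B$ and a $G_\delta$ superset of $B$ such that $\mu(G_\delta \setminus F_\sigma) = 0$.

Proof. Let $B$ be a Borel set. By Proposition 7.14, for each $\varepsilon > 0$, there are closed and open sets, $F$ and $U$, respectively, such that $F \subseteq B \subseteq U$ and $\mu(U \setminus F) < \varepsilon$. In particular, for $\varepsilon = \frac{1}{n}$, there are $F_n$ (closed) and $U_n$ (open) such that
$$F_n \subseteq B \subseteq U_n \quad\text{and}\quad \mu(U_n \setminus F_n) < \frac{1}{n}.$$
Then, with the notation
$$\mathfrak{F}_n = \bigcup_{k=1}^{n} F_k \quad\text{and}\quad \mathfrak{U}_n = \bigcap_{k=1}^{n} U_k,$$
we have that
$$\mathfrak{U}_n \setminus \mathfrak{F}_n \subseteq U_n \setminus F_n. \tag{7.16}$$
In addition, clearly, the sequence $\{\mathfrak{U}_n \setminus \mathfrak{F}_n\}$ is monotone nonincreasing and, with $F_\sigma = \bigcup_{n=1}^{\infty} F_n$ and $G_\delta = \bigcap_{n=1}^{\infty} U_n$,
$$\bigcap_{n=1}^{\infty} (\mathfrak{U}_n \setminus \mathfrak{F}_n) = \bigcap_{n=1}^{\infty} (\mathfrak{U}_n \cap \mathfrak{F}_n^c) = \Bigl(\bigcap_{n=1}^{\infty} \mathfrak{U}_n\Bigr) \cap \Bigl(\bigcap_{n=1}^{\infty} \mathfrak{F}_n^c\Bigr) = G_\delta \setminus F_\sigma,$$
yielding that
$$\mu\Bigl(\bigcap_{n=1}^{\infty} (\mathfrak{U}_n \setminus \mathfrak{F}_n)\Bigr) = \mu(G_\delta \setminus F_\sigma). \tag{7.16a}$$
It therefore remains to show that
$$\lim_{n \to \infty} \mu(\mathfrak{U}_n \setminus \mathfrak{F}_n) = \mu\Bigl(\bigcap_{n=1}^{\infty} (\mathfrak{U}_n \setminus \mathfrak{F}_n)\Bigr) \tag{7.16b}$$
by using continuity from above in light of Theorem 1.7 (i), Chapter 5, which requires that $\mu(\mathfrak{U}_1 \setminus \mathfrak{F}_1) < \infty$. From (7.16), we have that
$$\mu(\mathfrak{U}_n \setminus \mathfrak{F}_n) \le \mu(U_n \setminus F_n) < \frac{1}{n} \to 0.$$
Now, the assertion that $\mu(G_\delta \setminus F_\sigma) = 0$ follows from (7.16)-(7.16b). □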
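As an illustration of Proposition 7.16 (a sketch that is not part of the original text), take $X = \mathbb{R}$, $\mu = \lambda$ the Borel-Lebesgue measure, and $B = \mathbb{Q} \cap [0,1]$ with an assumed enumeration $q_1, q_2, \ldots$ The open sets $U_n$ below are one hypothetical choice among many.

```latex
% Assumptions: X = R, mu = lambda, B = {q_1, q_2, ...} = Q \cap [0,1].
\begin{align*}
F_\sigma &= \bigcup_{k=1}^{\infty} \{q_k\} = B, \qquad
U_n = \bigcup_{k=1}^{\infty} \Bigl(q_k - \tfrac{2^{-k}}{n},\; q_k + \tfrac{2^{-k}}{n}\Bigr), \qquad
G_\delta = \bigcap_{n=1}^{\infty} U_n \supseteq B, \\
\lambda(G_\delta \setminus F_\sigma) &\le \lambda(U_n)
  \le \sum_{k=1}^{\infty} \frac{2 \cdot 2^{-k}}{n} = \frac{2}{n}
  \quad \text{for every } n, \quad\text{hence}\quad \lambda(G_\delta \setminus F_\sigma) = 0 .
\end{align*}
```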
As we remember, a regular Borel measure on a Borel $\sigma$-algebra, generated by a locally compact Hausdorff space $X$, has a number of properties, one of which is its finiteness on compact sets. It is an interesting fact that in some subclasses of locally compact Hausdorff spaces, for a Borel measure to be regular it is sufficient to be finite on compact sets. Namely, second countability or just $\sigma$-compactness of all open subsets of $X$ is such an add-on. (Recall that, according to Corollary 10.18, Chapter 3, a second countable locally compact Hausdorff space is also $\sigma$-compact.)
7.17 Theorem. Let $(X,\tau)$ be a locally compact Hausdorff space, in which every open set is $\sigma$-compact. Then, every Borel measure on $\mathcal{B}(X)$, finite on compact sets, is regular.

Proof. Let $\mu$ be a Borel measure such that $\mu(K) < \infty$ for all $K \in \mathcal{K}(X)$. Then, $\mathcal{C}_c(X) \subseteq L^1(X,\mathcal{B}(X),\mu)$. Let $I$ denote the positive linear functional on $\mathcal{C}_c(X)$ defined as $I(f) = \int f\,d\mu$ and let $\nu$ be the Radon measure induced by $I$. By the assumption, for any $U \in \tau$, there is a sequence $\{C_n\}$ of compact sets such that $U = \bigcup_{n=1}^{\infty} C_n$. Then, given $C_1$ and $U$, there is an $f_1 \in \mathcal{C}_c(X)$ such that $C_1 \prec f_1 \prec U$. For $n = 2$, there is an $f_2 \in \mathcal{C}_c(X)$ such that
$$(\operatorname{supp} f_1) \cup C_1 \cup C_2 \prec f_2 \prec U.$$
Consequently, for $n \ge 2$, recursively, there is an $f_n \in \mathcal{C}_c(X)$ such that
$$\Bigl(\bigcup_{k=1}^{n-1} \operatorname{supp} f_k\Bigr) \cup \Bigl(\bigcup_{i=1}^{n} C_i\Bigr) \prec f_n \prec U.$$
Because $\bigl\{\bigcup_{k=1}^{n} C_k\bigr\} \uparrow U$, obviously, $\{f_n\} \uparrow 1_U$, and
$$\mu(U) = \int \lim_{n \to \infty} f_n\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu = \lim_{n \to \infty} I(f_n) = \lim_{n \to \infty} \int f_n\,d\nu = \nu(U)$$
(by the Monotone Convergence Theorem).
Therefore, $\mu = \nu$ on $\tau$. Now, let $B$ be a Borel set. Since $\nu$, according to Problem 7.6, is $\sigma$-finite, by Proposition 7.14, given an $\varepsilon > 0$, there are closed and open sets, $F$ and $W$, such that $F \subseteq B \subseteq W$ and $\nu(W \setminus F) < \varepsilon$. Since $W \setminus F \in \tau$ and $\mu = \nu$ on $\tau$, so $\mu(W \setminus F) < \varepsilon$ also. Consequently, $\mu(F) < \infty$, $\mu(W \setminus F) = \mu(W) - \mu(F) < \infty$, and
$$\mu(W) - \mu(B) \le \mu(W) - \mu(F) < \varepsilon. \tag{7.17}$$
Hence, $\mu$ is outer regular on $\mathcal{B}(X)$. Furthermore, from (7.17),
$$\mu(F) > \mu(B) - \varepsilon. \tag{7.17a}$$
On the other hand, $F$ can be $\mu$-approximated by a compact set. Indeed, since $X$ is $\sigma$-compact, so $F$ is also, i.e., $F$ can be represented as the union of compact sets. Alternatively, $F$ can also be represented as the union of a monotone nondecreasing sequence $\{K_n\}$ of compact sets, and therefore, by continuity from below,
$$\mu(F) = \lim_{n \to \infty} \mu(K_n).$$
Consequently, for each $\varepsilon > 0$, there is an $n$ such that
$$\mu(K_n) > \mu(F) - \varepsilon. \tag{7.17b}$$
Combining (7.17a) and (7.17b), we have that $\mu(K_n) > \mu(B) - 2\varepsilon$, which shows the $\mu$-inner regularity of $B$ and hence, regularity of $\mu$. In particular, $\mu$ is Radon, and because of the uniqueness, we have that $\mu = \nu$. □
7.18 Remark. Notice that, since $\mathbb{R}^n$ with the usual topology is a $\sigma$-compact and locally compact Hausdorff space, any Borel-Lebesgue-Stieltjes measure, according to Theorem 7.17, is regular. □
Another very useful result is as follows.

7.19 Theorem. Let $\mu$ be a Radon measure on the Borel $\sigma$-algebra $\mathcal{B}(X)$ generated by a locally compact Hausdorff space. Then $\overline{\mathcal{C}_c(X,\tau;\mathbb{C})} = L^p(X,\mathcal{B}(X),\mu;\mathbb{C})$ (closure in the $L^p$ norm), for all $1 \le p < \infty$.

Proof. Since by Problem 4.8, the space $\Psi^p$ of all simple complex-valued functions is dense in $L^p$, it is sufficient to prove that for each Borel set $B$ with $\mu(B) < \infty$, the function $1_B$ can be approximated in the $L^p$ norm by elements of $\mathcal{C}_c(X)$. Given an $\varepsilon > 0$, by Proposition 7.15 (for which it is sufficient that $B$ be $\sigma$-finite, i.e., $\operatorname{Res}_{\mathcal{B} \cap B}\,\mu$ is $\sigma$-finite; see Remark 2.12, Chapter 5), there are a compact set $K$ and an open set $U$ such that $K \subseteq B \subseteq U$ and $\mu(U \setminus K) < \varepsilon$. By Corollary 11.4, Chapter 3, there is an $f \in \mathcal{C}_c(X)$ valued in $[0,1]$ such that $K \prec f \prec U$. Furthermore,
$$1_K \le f \le 1_U \quad\text{and}\quad 1_K \le 1_B \le 1_U,$$
and hence
$$\|1_B - f\|_p \le \|1_U - 1_K\|_p = \|1_{U \setminus K}\|_p = [\mu(U \setminus K)]^{1/p} < \varepsilon^{1/p}.$$
□
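The approximation in the proof can be made explicit in a simple case. The computation below is an illustrative sketch that is not part of the original text: it assumes $X = \mathbb{R}$, $\mu = \lambda$, $B = [a,b]$, and uses a hypothetical trapezoidal function $f_\delta$ in the role of $f$.

```latex
% Assumptions: X = R, mu = lambda, B = [a,b], delta > 0.
% f_delta(x) = max{0, 1 - dist(x,[a,b])/delta} is continuous with support [a - delta, b + delta].
\begin{align*}
\|1_B - f_\delta\|_p^p
  &= \int_{a-\delta}^{a} f_\delta^{\,p}\,d\lambda + \int_{b}^{b+\delta} f_\delta^{\,p}\,d\lambda
   = 2\int_0^{\delta} \Bigl(1 - \frac{t}{\delta}\Bigr)^{p} dt
   = \frac{2\delta}{p+1}, \\
\|1_B - f_\delta\|_p &= \Bigl(\frac{2\delta}{p+1}\Bigr)^{1/p} \;\xrightarrow[\delta \downarrow 0]{}\; 0
  \qquad (1 \le p < \infty).
\end{align*}
```

Here $K = [a,b]$ and $U = (a-\delta, b+\delta)$ satisfy $\lambda(U \setminus K) = 2\delta$, in agreement with the bound $[\mu(U \setminus K)]^{1/p}$ obtained in the proof.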
The following often referred to theorem (holding for locally compact Hausdorff spaces and Radon measures) states that a measurable function vanishing outside a set of finite measure can be approximated by a function with compact support.

7.20 Theorem (Lusin). Let $(X,\tau)$ be a locally compact Hausdorff space, $\mu$ be a Radon measure on $\mathcal{B}(X)$, and let $f \in \mathcal{C}^{-1}(X,\mathcal{B}(X);\mathbb{C})$. Assume that the set $E = \{x \in X : f(x) \ne 0\}$ is $\mu$-finite. Then, for each $\varepsilon > 0$, there is a function $F \in \mathcal{C}_c(X,\tau;\mathbb{C})$ such that $\mu\{F \ne f\} < \varepsilon$.

Proof.
1. We first assume that $f$ is bounded. Since $E$ is assumed to be $\mu$-finite, $f \in L^1(X,\mathcal{B}(X),\mu;\mathbb{C})$. Then, by Theorem 7.19, there is a sequence $\{f_n\} \subset \mathcal{C}_c(X,\tau;\mathbb{C})$ that converges to $f$ in mean. By the Riesz-Fischer Theorem 4.9, there is a subsequence $\{h_k := f_{n_k}\}$ of $\{f_n\}$ that converges to $f$ $\mu$-a.e. in the topology of pointwise convergence. Since $(E, \Sigma_E := \mathcal{B}(X) \cap E, \operatorname{Res}_{\Sigma_E}\mu)$ is a finite measure space, by Egorov's Theorem 5.7, $\{h_k\}$ converges to $f$ almost uniformly, i.e., for each $\varepsilon > 0$, there is a Borel set $A \subseteq E$ such that $\mu(E \setminus A) < \frac{\varepsilon}{3}$ and $\{h_k\}$ converges to $f$ uniformly on $A$. Furthermore, by Proposition 7.15, there are pairs of compact sets $K$ and $C$ and open sets $V$ and $U$ such that
$$K \subseteq A \subseteq V \quad\text{and}\quad C \subseteq E \subseteq U$$
and
$$\mu(V \setminus K) < \frac{\varepsilon}{3} \quad\text{and}\quad \mu(U \setminus C) < \frac{\varepsilon}{3}.$$
But $U \setminus E \subseteq U \setminus C$ and $A \setminus K \subseteq V \setminus K$ yield that
$$\mu(A \setminus K) < \frac{\varepsilon}{3} \quad\text{and}\quad \mu(U \setminus E) < \frac{\varepsilon}{3}.$$
Since $h_k \to f$ uniformly on $A$, thus uniformly on $K$, the function $\operatorname{Res}_K f$ is continuous. By Tietze's Extension Theorem 11.7, Chapter 3, there is a function $F \in \mathcal{C}_c(X)$ as $\operatorname{Ext}_X(\operatorname{Res}_K f)$ with respect to $K$ and $U$ such that $F$ vanishes on $U^c$ and $\operatorname{supp} F \subseteq U$. Since $f = 0$ outside $E$ and $F$ is an extension of $f$ from $K$ to $X$ such that $F = 0$ outside $U$,
$$\{x : f(x) \ne F(x)\} \subseteq U \setminus K \quad\text{and}\quad \mu(U \setminus K) \le \mu(U \setminus E) + \mu(E \setminus A) + \mu(A \setminus K) < \varepsilon,$$
i.e., $f$ and $F$ differ on a set of measure less than $\varepsilon$. We can also say that $F = f 1_E$ except on a set of measure less than $\varepsilon$.

2. Let $f$ be unbounded. Denote
$$E_n = \{x \in X : 0 < |f(x)| < n\}.$$
Then $\{E_n\} \uparrow E$, so that $\{E \setminus E_n\} \downarrow \emptyset$, and therefore, by continuity from above (taking into account that $\mu(E) < \infty$), we have $\mu(E \setminus E_n) \to 0$. Hence, given an $\varepsilon > 0$, there is an $N$ such that for all $n \ge N$,
$$\mu(E \setminus E_n) < \frac{\varepsilon}{2}.$$
By case 1, applied to $f 1_{E_n}$, which is bounded, there is an $F \in \mathcal{C}_c(X)$ such that $F = f 1_{E_n}$ everywhere except on a set of measure less than $\frac{\varepsilon}{2}$. Thus, $F = f 1_E$ except on a set of measure less than $\varepsilon$. □
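A minimal worked instance of Lusin's Theorem, added here as an illustration that is not part of the original text: it assumes $X = \mathbb{R}$, $\mu = \lambda$, $f = 1_{[0,1]}$ and $0 < \varepsilon < 2$, and the piecewise linear function $F$ is one hypothetical choice.

```latex
% Assumptions: X = R, mu = lambda, f = 1_{[0,1]}, 0 < eps < 2.
\begin{align*}
F(x) &=
\begin{cases}
 4x/\varepsilon, & 0 \le x \le \varepsilon/4, \\
 1, & \varepsilon/4 \le x \le 1 - \varepsilon/4, \\
 4(1-x)/\varepsilon, & 1 - \varepsilon/4 \le x \le 1, \\
 0, & \text{otherwise},
\end{cases}
\qquad F \in \mathcal{C}_c(\mathbb{R}),\ \operatorname{supp} F = [0,1], \\
\{F \ne f\} &= [0, \varepsilon/4) \cup (1 - \varepsilon/4, 1],
\qquad \lambda\{F \ne f\} = \frac{\varepsilon}{2} < \varepsilon .
\end{align*}
```

Note that $\|F\|_u = 1 = \|f\|_u$ in this example, in line with the refinement asked for in Problem 7.8.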
PROBLEMS

7.1 Show that for any function $f \in \mathcal{C}_c(X)$ with values in $[0,1]$, $\mu^*(\operatorname{supp} f) \ge I(f)$.

7.2 Show that for all $K \in \mathcal{K}$, it holds true that
$$\mu^*(K) = \inf\{I(f) : f \in \mathcal{C}_c(X) \text{ and } f \ge 1_K\}.$$
[Hint: Apply Proposition 7.6.]

7.3 Can the uniqueness of the Radon measure induced by a positive linear functional be established by means of Theorem 2.13, Chapter 5, at least in part?

7.4 Prove Corollary 7.13.

7.5 Show that if $(X,\tau)$ is a locally compact Hausdorff space, then every $\sigma$-finite Radon measure on $\mathcal{B}(X)$ is regular.

7.6 Prove that if $(X,\tau)$ is a locally compact and $\sigma$-compact Hausdorff space, then any Radon measure on $\mathcal{B}(X)$ is $\sigma$-finite and regular.

7.7 Prove Proposition 7.15.

7.8 In Lusin's Theorem 7.20, prove that if $f$ is bounded, then the choice of such an $F$ can be restricted to those with $\|F\|_u \le \|f\|_u$, where $\|\cdot\|_u$ is the usual supremum norm.

7.9 Prove the statement: Let $[\mathcal{C}_c(X,\tau;\mathbb{R}),\mathbb{R},I]$ be a positive linear functional and $K$ be a compact subset of $X$. Then there is a nonnegative real constant $C_K$ such that, for all $f \in \mathcal{C}_c(X)$ with $\operatorname{supp} f \subseteq K$, $|I(f)| \le C_K \|f\|_u$ (where $\|\cdot\|_u$ is the supremum norm).
NEW TERMS:
Borel measure 493
$\mu$-outer regular set 493
$\mu$-inner regular set 493
outer regular Borel measure 493
inner regular Borel measure 493
weakly regular Borel measure (Radon measure) 493
Radon measure (weakly regular Borel measure) 493
positive linear functional 494
Radon measure induced by a positive linear functional 496
Riesz's Representation Theorem 500
Lusin's Theorem 506
8. MEASURE DERIVATIVES
In this section we will consider an alternative approach to the abstract notion of the Radon-Nikodym derivative in Euclidean spaces of signed Borel-Lebesgue-Stieltjes measures with respect to the Borel-Lebesgue measure. The idea of differentiation of measures as a "pointwise limit," which we explore throughout, is analogous to the conventional concept of a derivative; it also offers insight into the Radon-Nikodym derivative and has applications to the differentiation of functions.

8.1 Definition. Denote by $\mathcal{S}(x,\delta)$ the collection of all open cubes in $\mathbb{R}^n$ (whose edges are parallel to the coordinate axes) of diameter less than or equal to $\delta$ and containing a point $x$. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on the Borel $\sigma$-algebra $\mathcal{B}$ (cf. Definition 1.1 (vi)). For each $x \in \mathbb{R}^n$, define the functions
$$\overline{D}\nu(x) = \lim_{\delta \to 0} \sup\Bigl\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{S}(x,\delta)\Bigr\} \tag{8.1}$$
and
$$\underline{D}\nu(x) = \lim_{\delta \to 0} \inf\Bigl\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{S}(x,\delta)\Bigr\}. \tag{8.1a}$$
Since the functions
$$(x,\delta) \mapsto \sup\Bigl\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{S}(x,\delta)\Bigr\} \quad\text{and}\quad (x,\delta) \mapsto \inf\Bigl\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{S}(x,\delta)\Bigr\}$$
are, for every fixed $x$, monotone nondecreasing and nonincreasing in $\delta$, respectively, the limits in (8.1) and (8.1a) exist (though they can be $+\infty$ or $-\infty$). The numbers $\overline{D}\nu(x)$ and $\underline{D}\nu(x)$ (satisfying $\underline{D}\nu \le \overline{D}\nu$) are called respectively the upper and the lower derivatives of the measure $\nu$ (with respect to the Borel-Lebesgue measure $\lambda$). If they are equal and finite, we denote their common value by $D\nu(x)$, and we say that $\nu$ is differentiable at $x$ (with respect to $\lambda$) and call $D\nu(x)$ the (measure) derivative of $\nu$ at $x$ (with respect to $\lambda$). □
Notice that if $\nu \ll \lambda$, then $\nu = \int f\,d\lambda$ (with respect to some Radon-Nikodym density $f$) and since
$$\frac{\nu(C)}{\lambda(C)} = \frac{1}{\lambda(C)} \int_C f\,d\lambda$$
represents the mean value of the function $f$ on the cube $C(x,d)$ (of diameter $d$ and containing the point $x$), $D\nu$, if it exists, seems to be equal to $f$ $\lambda$-a.e. in a vicinity of $x$. This idea (which gives a practical insight into the Radon-Nikodym derivative) will be explored in a rigorous way through several statements below.
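The heuristic can be verified directly in the simplest situation. The estimate below is an illustrative sketch that is not part of the original text; it assumes that $f$ is locally $\lambda$-integrable and continuous at the point $x$.

```latex
% Assumptions: nu(C) = \int_C f d\lambda, f locally integrable and continuous at x.
\begin{align*}
\Bigl| \frac{\nu(C)}{\lambda(C)} - f(x) \Bigr|
  &= \Bigl| \frac{1}{\lambda(C)} \int_C \bigl(f(y) - f(x)\bigr)\,d\lambda(y) \Bigr|
   \le \sup_{y \in C} |f(y) - f(x)| \\
  &\le \sup\{|f(y) - f(x)| : |y - x| \le \delta\}
   \;\xrightarrow[\delta \to 0]{}\; 0
   \qquad \text{for all } C \in \mathcal{S}(x,\delta).
\end{align*}
```

Hence, under these assumptions, $\overline{D}\nu(x) = \underline{D}\nu(x) = f(x)$, i.e., $D\nu(x) = f(x)$ at every point of continuity of $f$.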
8.2 Remark. One interpretation of the measure derivative is that if $D\nu$ exists at a point $x_0$ (and therefore, coincides with its upper and lower derivatives), then
$$D\nu(x_0) = \lim_{\delta \to 0} \Bigl\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{S}(x_0,\delta)\Bigr\} \tag{8.2}$$
exists for $\delta \downarrow 0$ along any pertinent net of open cubes. Therefore, for any $\varepsilon > 0$, there is a $\delta > 0$ such that for any open cube $C$ containing $x_0$, of diameter less than or equal to $\delta$,
$$\Bigl|\frac{\nu(C)}{\lambda(C)} - D\nu(x_0)\Bigr| < \varepsilon. \tag{8.2a}$$
As a relevant net of cubes, we can take those centered at $x_0$ and even reduce that net to a sequence of cubes of diameters $\{\frac{1}{n}\}$. □
8.3 Lemma. Let $C_1,\ldots,C_m$ be open cubes in $\mathbb{R}^n$. Then there is a subcollection, $C_{k_1},\ldots,C_{k_s}$, of pairwise disjoint cubes among $C_1,\ldots,C_m$ such that
$$\lambda\Bigl(\bigcup_{i=1}^{m} C_i\Bigr) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}).$$

Proof. Let $\delta_i$ be the diameter of $C_i$. Rearranging the cubes, we can assume that $\delta_1 \ge \delta_2 \ge \ldots \ge \delta_m$. Set $k_1 = 1$ and let $k_2$ be the smallest index (of the cubes) greater than 1 and such that the cube with this index is disjoint from $C_{k_1}$. If there is no such cube available, then we are done. Otherwise, set $k_3$ to be the smallest index greater than $k_2$ and such that $C_{k_3}$ is disjoint from $C_{k_1} + C_{k_2}$. Continue this process until the formation of all disjoint cubes $C_{k_1},\ldots,C_{k_s}$ is finished. Suppose $S_{k_j}$ is a cube with the same center as $C_{k_j}$ but with a diameter three times as large. Since each $C_i$ intersects some $C_{k_j}$ with $i \ge k_j$ (it is impossible otherwise, as the set of the disjoint cubes is assumed to be complete) and $d(C_i) \le d(C_{k_j})$, it yields that $C_i \subseteq S_{k_j}$. Hence,
$$\lambda\Bigl(\bigcup_{i=1}^{m} C_i\Bigr) \le \lambda\Bigl(\bigcup_{j=1}^{s} S_{k_j}\Bigr) \le \sum_{j=1}^{s} \lambda(S_{k_j}) = 3^n \sum_{j=1}^{s} \lambda(C_{k_j}).$$
□
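The greedy selection in the proof can be traced on a small example. The following is an illustrative sketch that is not part of the original text; it assumes $n = 1$ (so cubes are open intervals and $3^n = 3$), and the four intervals are hypothetical data.

```latex
% Assumptions: n = 1, four open intervals already ordered by nonincreasing length.
\begin{align*}
& C_1 = (0,4), \quad C_2 = (3,6), \quad C_3 = (5,7), \quad C_4 = (6.5,\, 8),
  \qquad \delta_1 = 4 \ge \delta_2 = 3 \ge \delta_3 = 2 \ge \delta_4 = 1.5; \\
& k_1 = 1; \quad C_2 \cap C_1 \ne \emptyset \ (\text{skip}); \quad
  C_3 \cap C_1 = \emptyset \ \Rightarrow\ k_2 = 3; \quad
  C_4 \cap C_3 \ne \emptyset \ (\text{skip}); \\
& S_1 = (-4, 8) \supseteq C_1 \cup C_2, \qquad S_3 = (3, 9) \supseteq C_3 \cup C_4; \\
& \lambda\Bigl(\bigcup_{i=1}^{4} C_i\Bigr) = \lambda((0,8)) = 8
  \;\le\; 3\,\bigl(\lambda(C_1) + \lambda(C_3)\bigr) = 3\,(4 + 2) = 18 .
\end{align*}
```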
8.4 Lemma. Let $\mu$ be a positive Borel-Lebesgue-Stieltjes measure on $\mathcal{B}^n$ and let $N \in \mathcal{N}_\mu$ (i.e., $N$ is a $\mu$-null set). Then $D\mu$ exists $\lambda$-a.e. on $N$ and $1_N D\mu \in [0]_\lambda$.

Proof. Because $\mu$ is a positive measure, $0 \le \underline{D}\mu \le \overline{D}\mu$; and thus we need to show that for each positive $a$,
$$\{x \in N : \overline{D}\mu(x) > a\} \in \mathcal{N}_\lambda.$$
Let
$$A = N \cap \{x \in \mathbb{R}^n : \overline{D}\mu(x) > a\}, \quad\text{for some } a > 0.$$
Then, $A$ is Borel (Problem 8.4) and, by regularity of $\mu$ (see Theorem 7.17 and Remark 7.18), for any $\varepsilon > 0$, there is an open superset $U$ of $A$ such that $\mu(U \setminus A) < \varepsilon$. Since $A$ is a $\mu$-null set, we can make $\mu(U)$ arbitrarily small. We will show that the latter, times a positive constant, dominates $\lambda(K)$, where $K$ is a compact subset of $A$, and hence $\lambda(A)$, taking into account regularity of $\lambda$. Let $K \subseteq A$ be a compact set and $U \supseteq A$ be an open set. Given $x \in K$, by Problem 8.2, for any prescribed $d > 0$, there is an open cube $C$ of diameter at most $d$ that contains $x$ and such that $\lambda(C) < \frac{1}{a}\mu(C)$. From Problem 8.2, we can make $d$ small enough to ensure $C \subseteq U$. We can cover $K$ by all such cubes and, due to compactness, have this open cover (dominated by $U$) reduce to a finite subcover, say, $C_1,\ldots,C_m$. Then, by Lemma 8.3, there is a subcollection, $C_{k_1},\ldots,C_{k_s}$, of pairwise disjoint cubes, among $C_1,\ldots,C_m$, such that
$$\lambda\Bigl(\bigcup_{i=1}^{m} C_i\Bigr) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}).$$
As mentioned above, due to regularity of $\mu$, given an $\varepsilon > 0$, $U$ can be selected as
$$\mu(A) + \frac{\varepsilon a}{3^n} > \mu(U).$$
Hence,
$$\lambda(K) \le \lambda\Bigl(\bigcup_{i=1}^{m} C_i\Bigr) \le 3^n \sum_{j=1}^{s} \lambda(C_{k_j}) < \frac{3^n}{a} \sum_{j=1}^{s} \mu(C_{k_j}) \le \frac{3^n}{a}\,\mu(U) < \frac{3^n}{a}\Bigl(\mu(A) + \frac{\varepsilon a}{3^n}\Bigr) = \varepsilon,$$
since $\mu(A) = 0$. On the other hand, by regularity of $\lambda$, for each $\varepsilon > 0$, $K$ can be selected as
$$\lambda(K) + \varepsilon > \lambda(A).$$
The latter, along with $\lambda(K) < \varepsilon$, gives $\lambda(A) < 2\varepsilon$. □
8.5 Corollary. Let $\nu$ be a singular signed Borel-Lebesgue-Stieltjes measure. Then, the measure derivative $D\nu$ exists $\lambda$-a.e. and $D\nu = 0$ $\lambda$-a.e.

Proof. Since $\nu \perp \lambda$, by Proposition 3.2 (iii), $\nu^+ \perp \lambda$ and $\nu^- \perp \lambda$, and there is a Borel set $B$ such that $|\nu|(B) = \nu^+(B) = \nu^-(B) = \lambda(B^c) = 0$. Hence, by Lemma 8.4, $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $B$ and since $B^c \in \mathcal{N}_\lambda$, we have that $D|\nu| = D\nu^+ = D\nu^- = 0$ $\lambda$-a.e. on $\mathbb{R}^n$. Because $D$ is a linear operator on the set of all signed Borel-Lebesgue-Stieltjes measures, we have that $D\nu = 0$ $\lambda$-a.e. □
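The simplest singular measure shows what the corollary does and does not claim. The computation below is an illustrative sketch that is not part of the original text; it assumes $n = 1$ and $\nu = \delta_0$, the unit point mass at the origin, so that cubes are open intervals $C$ with $\lambda(C) = \operatorname{diam} C$.

```latex
% Assumptions: n = 1, nu = delta_0 (unit point mass at 0), lambda = Lebesgue measure on R.
\begin{align*}
x \ne 0,\ \delta < |x| &: \quad 0 \notin C \text{ for every } C \in \mathcal{S}(x,\delta)
   \;\Longrightarrow\; \frac{\nu(C)}{\lambda(C)} = 0
   \;\Longrightarrow\; D\nu(x) = 0; \\
x = 0 &: \quad \frac{\nu(C)}{\lambda(C)} = \frac{1}{\lambda(C)} \ge \frac{1}{\delta}
   \text{ for every } C \in \mathcal{S}(0,\delta)
   \;\Longrightarrow\; \underline{D}\nu(0) = \overline{D}\nu(0) = +\infty .
\end{align*}
```

Thus $\nu$ fails to be differentiable only at the origin, which is a $\lambda$-null set, and $D\nu = 0$ $\lambda$-a.e., as the corollary asserts.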
Since any Borel-Lebesgue-Stieltjes measure is $\sigma$-finite, by Theorem 3.4, there is a unique Lebesgue decomposition of a signed Borel-Lebesgue-Stieltjes measure $\nu$ with respect to the Borel-Lebesgue measure $\lambda$, as $\nu_a + \nu_s$, where $\nu_a \ll \lambda$ and $\nu_s \perp \lambda$. Absolute continuity of $\nu_a$ (with respect to $\lambda$) provides a $\lambda$-equivalence class $\frac{d\nu_a}{d\lambda}$ of Radon-Nikodym densities, which is referred to as the Radon-Nikodym derivative. The theorem below states that $\nu_a$ is $\lambda$-almost everywhere differentiable and its derivative coincides $\lambda$-a.e. with any Radon-Nikodym density of the class $\frac{d\nu_a}{d\lambda}$. We therefore formulate the theorem for an absolutely continuous signed Borel-Lebesgue-Stieltjes measure.
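8.6 Theorem. Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure on $\mathcal{B}^n$ such that $\nu \ll \lambda$. Then $D\nu$ exists on some set $A$ such that $A^c \in \mathcal{N}_\lambda$ and $1_A D\nu \in \frac{d\nu}{d\lambda}$.

Proof. Let $f \in \frac{d\nu}{d\lambda}$. Given a real number $a$, denote
$$\rho(B) = \int_{B \cap \{f > a\}} (f - a)\,d\lambda, \qquad B \in \mathcal{B}^n.$$
Then $\rho$ is a positive Borel-Lebesgue-Stieltjes measure on $\mathcal{B}^n$. Indeed, let $B$ be a bounded (in the Euclidean metric) Borel set. Then
$$\rho(B) = \nu(B \cap \{f > a\}) - a\lambda(B \cap \{f > a\})$$
is obviously finite. From
$$\nu(C) - a\lambda(C) = \int_C (f - a)\,d\lambda \le \rho(C)$$
it follows that
$$\frac{\nu(C)}{\lambda(C)} \le a + \frac{\rho(C)}{\lambda(C)},$$
and since the latter holds true for any open cube, we have that
$$\overline{D}\nu \le a + \overline{D}\rho. \tag{8.6}$$
Let $N = \{f < a\}$. Then, $N \in \mathcal{N}_\rho$ and by Lemma 8.4, $D\rho$ exists $\lambda$-a.e. on $N$ and $1_N D\rho \in [0]_\lambda$. This, applied to (8.6), yields that
$$\overline{D}\nu \le a \quad \lambda\text{-a.e. on } N.$$
Denote $S_a = \{f < a < \overline{D}\nu\}$. Then, $S_a \subseteq N$ and therefore $S_a \in \mathcal{N}_\lambda$ for every $a$. Since
$$E := \{\overline{D}\nu > f\} \subseteq \bigcup_{a \in \mathbb{Q}} S_a,$$
we have that $E \in \mathcal{N}_\lambda$ and therefore, $\overline{D}\nu \le f$ $\lambda$-a.e. Now, replacing $\nu$ by $-\nu$ and $f$ by $-f$, we arrive at $\underline{D}\nu \ge f$ $\lambda$-a.e. Since $\nu$ is clearly $\sigma$-finite, by the Radon-Nikodym Theorem 2.2 (case 5b), $f$ is $\lambda$-a.e. finite and because $\underline{D}\nu \le \overline{D}\nu$, we have that $D\nu = f$ $\lambda$-a.e. and thereby the statement is proved. □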
Combining Corollary 8.5 and Theorem 8.6 we arrive at:
8.7 Corollary. Any signed Borel-Lebesgue-Stieltjes measure $\nu$ is $\lambda$-a.e. differentiable and if $\nu_a$ is its absolutely continuous component in the Lebesgue decomposition, then $D\nu$ is $\lambda$-a.e. identical to any Radon-Nikodym density of $\nu_a$ with respect to $\lambda$.
Proof. If $\nu$ is a signed Borel-Lebesgue-Stieltjes measure and $\nu = \nu_a + \nu_s$ is its Lebesgue decomposition, then by Theorem 8.6, $\nu_a$ is differentiable $\lambda$-a.e. More precisely, its derivative $D\nu_a$ exists on some set $A$ such that $A^c \in \mathcal{N}_\lambda$ and
$$1_A D\nu_a \in \frac{d\nu_a}{d\lambda}.$$
By Corollary 8.5, $\nu_s$ is differentiable $\lambda$-a.e. and its derivative $D\nu_s = 0$ $\lambda$-a.e. In other words, there is a set $B$ such that $B^c \in \mathcal{N}_\lambda$ and such that $1_B D\nu_s = 0$. Consequently, with $E = A \cap B$, the set $E^c = A^c \cup B^c \in \mathcal{N}_\lambda$ and
$$1_E D\nu = 1_E D\nu_a \in \frac{d\nu_a}{d\lambda},$$
and therefore, $\nu$ is $\lambda$-a.e. differentiable.
□
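The statement of Corollary 8.7 can be probed numerically. The following is only a rough sketch under stated assumptions: the test measure (density $2u$ on $[0,1]$ plus a unit point mass at $0.5$) and the helper names `nu` and `ratio` are hypothetical and not taken from the text, and only intervals centered at $x$ are used, whereas the measure derivative in this chapter is defined through cubes containing $x$.

```python
def nu(a, b):
    """nu((a, b)) for an assumed test measure: density 2u on [0, 1] plus a unit point mass at 0.5."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    # Absolutely continuous part: integral of 2u over (a, b) intersected with [0, 1].
    absolutely_continuous = hi * hi - lo * lo if hi > lo else 0.0
    # Singular part: the point mass is counted only if 0.5 lies in the open interval.
    singular = 1.0 if a < 0.5 < b else 0.0
    return absolutely_continuous + singular


def ratio(x, r):
    """nu(C) / lambda(C) for the centered interval C = (x - r, x + r)."""
    return nu(x - r, x + r) / (2.0 * r)


for r in (1e-1, 1e-2, 1e-3):
    # Away from the atom the ratio settles at the density 2x; at the atom it grows without bound.
    print(r, ratio(0.25, r), ratio(0.5, r))
```

At $x = 0.25$ the ratios stay at the density value $2x$, in line with $D\nu = \frac{d\nu_a}{d\lambda}$ $\lambda$-a.e., while the single point $x = 0.5$, a $\lambda$-null set, is where the singular component prevents differentiability.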
PROBLEMS
8.1 Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure and $A = \{x \in \mathbb{R}^n : D\nu(x) > a\} \neq \emptyset$, for some real $a$. Show that there is a cube $C$ containing $x$ such that $\frac{\nu(C)}{\lambda(C)} > a$.
8.2 Let $\nu$ be a signed Borel-Lebesgue-Stieltjes measure and $A = \{x \in B \subseteq \mathbb{R}^n : D\nu(x) > a\} \neq \emptyset$, for some real $a$ and $B$ being a Borel set. Show that, given a positive real number $\delta$, there is a cube $C(x,\delta)$ such that $\frac{\nu(C(x,\delta))}{\lambda(C(x,\delta))} > a$.
8.3 Let
$$A = \Big\{x \in \mathbb{R}^n : \sup\Big\{\frac{\nu(C)}{\lambda(C)} : C \in \mathcal{F}(x,\delta)\Big\} > a\Big\}.$$
Show that $A$ is an open set.
8.4 Prove that $\underline{D}\mu$, $\overline{D}\mu$ and $D\mu \in \mathcal{C}^{-1}(\mathbb{R}^n, \mathcal{B}; \mathbb{R})$.
8.5 Let $F$ be an extended distribution function induced by a positive Borel-Lebesgue-Stieltjes measure $\mu$ on $(\mathbb{R}, \mathcal{B})$. Show that if $\mu$ is differentiable at $x_0$, then $F$ is continuous at $x_0$.
NEW TERMS:
lower derivative of a measure
upper derivative of a measure
measure differentiable at a point
measure derivative
Chapter 9
Calculus on the Real Line
In this chapter we utilize theorems on absolute continuity, singularity, and measure derivatives of Chapter 8. We will see a close connection between the signed measures and functions of bounded variation (to be introduced) and their decompositions. The underlying treatment will be entirely devoted to the real line, with topics belonging to traditional analysis and probability. However, some more advanced methods of Chapter 8 will be applied for quicker and more elegant results that lead to the calculus of Lebesgue and Lebesgue-Stieltjes integrals. While the Riemann-Stieltjes integral would fit perfectly into this chapter, it will not be the subject of our discussion, mostly because it is readily available in numerous advanced calculus texts, although a close relationship between Riemann-Stieltjes and Lebesgue-Stieltjes integrals makes this topic very tempting to explore.
1. MONOTONE FUNCTIONS
1.1 Definition and Notation. Unless specified otherwise, we will consider real-valued functions $[\mathbb{R},\mathbb{R},f]$, bounded over bounded intervals. A function $f$ is monotone nondecreasing (nonincreasing) if $f(x) \le f(y)$ ($f(x) \ge f(y)$) whenever $x < y$. A function is monotone if it is of either type. The jump $\delta_f(x)$ of a function $f$ at a point $x$ is $f(x+) - f(x-)$. The latter is clearly a finite number at any real point $x$. A point $x$ is a jump discontinuity of $f$ if $\delta_f(x) \neq 0$. [Note that the function $\frac{1}{x}$ does not fall into this category of monotone functions, as it is not bounded over bounded intervals around zero.]
Note that monotone functions are measurable. Indeed, if $f$ is monotone nondecreasing, for any real number $a$, the set $\{f > a\}$ is either empty or an interval. □
1.2 Theorem. The set $D$ of all jump discontinuities of a monotone function $[\mathbb{R},\mathbb{R},f]$ is at most countable, and if $f$ is defined on a compact interval $[a,x]$ and $D(a,x) = \{x_1, x_2, \dots\}$ is the set of all discontinuities of $f$ on $(a,x)$ $(a < x)$, then,
$$\Delta_f([a,x]) := f(a+) - f(a) + \sum_{x_k \in D(a,x)} \delta_f(x_k) + f(x) - f(x-) \le f(x) - f(a). \tag{1.2}$$
Proof. We assume that $f$ is monotone nondecreasing. Otherwise, we will deal with $-f$. Because
$$\mathbb{R} = \bigcup_{n=1}^{\infty} (-n,n) \quad\text{and}\quad (-n,n) = \bigcup_{k=1}^{\infty} \Big[-n + \tfrac{1}{k},\, n - \tfrac{1}{k}\Big],$$
it is sufficient to prove that $f$ has at most countably many points of discontinuity on any compact interval $[a,x]$. First observe that for an $n$-tuple, $a < x_1 < \dots < x_n < x$, of points it is true that
$$f(a+) - f(a) + \sum_{k=1}^{n} \delta_f(x_k) + f(x) - f(x-) \le f(x) - f(a). \tag{1.2a}$$
Indeed, if $t_0 \in (a,x_1)$, $t_1 \in (x_1,x_2)$, $\dots$, $t_n \in (x_n,x)$ are arbitrarily selected points, then by summing up the inequalities
$$f(a+) - f(a) \le f(t_0) - f(a),$$
$$\delta_f(x_k) \le f(t_k) - f(t_{k-1}), \quad k = 1, \dots, n,$$
$$f(x) - f(x-) \le f(x) - f(t_n),$$
we have (1.2a), since the right-hand sides telescope to $f(x) - f(a)$. From inequality (1.2a), it also follows that if $D_\varepsilon$ is the set of all jump discontinuities of $f$ on $[a,x]$ at which the jumps are greater than an $\varepsilon > 0$, and if $x_1, \dots, x_n \in D_\varepsilon$, then $n\varepsilon \le f(x) - f(a)$ and therefore $D_\varepsilon$ is finite. Let $D[a,x]$ denote the set of all jump discontinuities of $f$ on $[a,x]$ and let
$$D_{1/k} = \Big\{u \in [a,x] : \delta_f(u) > \tfrac{1}{k}\Big\}.$$
Then, it is readily seen that
$$D[a,x] = \bigcup_{k=1}^{\infty} D_{1/k},$$
and since each $D_{1/k}$ is finite, $D[a,x] \preceq \mathbb{N}$, i.e., $D[a,x] = \{x_1, x_2, \dots\}$. The latter and (1.2a) yield (1.2).
□
Observe that if the function $f$ is defined on $[a,x]$, then $f(a+) - f(a) = \delta_f(a)$ and $f(x) - f(x-) = \delta_f(x)$ can be taken for jumps of $f$ at the ends of the interval $[a,x]$. With $\Delta_f([a,a]) = 0$, equation (1.2) still holds. On the other hand, if $f$ is really defined on $\mathbb{R}$, then from (1.2) it follows directly that $\Delta_f([a,a]) = \delta_f(a)$. Now, if for $\Delta_f([a,x])$ we take $a$ as a fixed constant and if $x$ varies in $[a,b]$, $\Delta_f([a,x])$ in (1.2) turns into a function of $x$, in new notation, $\Delta_f(x)$, which is monotone nondecreasing on $[a,b]$. The "step" function $\Delta_f(x)$ is referred to as the cumulative jump function of $f$. While it is almost obvious how to turn a monotone function into a continuous one, we would like to formalize it as follows:
1.3 Proposition. Let $[[a,b],\mathbb{R},f]$ be a monotone nondecreasing function. Then the function $f - \Delta_f$ is monotone nondecreasing and continuous on $[a,b]$.
Proof. Let $x < y$ be any two points from $[a,b]$. Then, from (1.2) applied to $[x,y]$,
$$\Delta_f([x,y]) = f(x+) - f(x) + \sum_{x < x_k < y} \delta_f(x_k) + f(y) - f(y-) \le f(y) - f(x). \tag{1.3}$$
By adding and subtracting $f(x-)$ to the left-hand side of inequality (1.3) and then rearranging terms we arrive at
$$\Delta_f([x,y]) \pm f(x-) = f(x+) - f(x-) + \sum_{x < x_k < y} \delta_f(x_k) + f(y) - f(y-) - [f(x) - f(x-)]$$
$$= f(a+) - f(a) + \sum_{a < x_k < y} \delta_f(x_k) + f(y) - f(y-) - \Big\{ f(a+) - f(a) + \sum_{a < x_k < x} \delta_f(x_k) + f(x) - f(x-) \Big\}$$
$$= \Delta_f([a,y]) - \Delta_f([a,x]). \tag{1.3a}$$
Therefore,
$$\Delta_f(y) - \Delta_f(x) \le f(y) - f(x), \tag{1.3b}$$
which yields that $f - \Delta_f$ is indeed monotone nondecreasing. By letting $y \downarrow x$ we obtain from (1.3b) that
$$\Delta_f(x+) - \Delta_f(x) \le f(x+) - f(x). \tag{1.3c}$$
On the other hand, from (1.3a) and (1.3b),
$$\Delta_f(y) - \Delta_f(x) = \Delta_f([a,y]) - \Delta_f([a,x]) = f(x+) - f(x-) + \sum_{x < x_k < y} \delta_f(x_k) + f(y) - f(y-) - [f(x) - f(x-)]$$
$$= f(x+) - f(x) + \sum_{x < x_k < y} \delta_f(x_k) + f(y) - f(y-).$$
Letting $y \downarrow x$ in the latter, we have that
$$\Delta_f(x+) - \Delta_f(x) \ge f(x+) - f(x),$$
which, along with (1.3c), yields that
$$\Delta_f(x+) - f(x+) = \Delta_f(x) - f(x).$$
Analogously, we can show that
$$\Delta_f(x-) - f(x-) = \Delta_f(x) - f(x).$$
□
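The construction in Proposition 1.3 can be sketched on a concrete example. Everything below is an illustrative assumption rather than part of the text: the test function $f(x) = x + \lfloor x \rfloor$ on $[0,3]$, the helper names, and the hard-coded list of interior jump points. The sketch evaluates $\Delta_f([a,x])$ as in (1.2), with the convention $\Delta_f([a,a]) = 0$, and subtracts it from $f$.

```python
from math import floor


def f(x):
    """Assumed right-continuous nondecreasing test function with unit jumps at the integers."""
    return x + floor(x)


def f_left(x):
    """Left limit f(x-): it differs from f(x) only at integer points."""
    return f(x) - 1.0 if x == floor(x) else f(x)


def f_right(x):
    """Right limit f(x+); it equals f(x) because this particular f is right-continuous."""
    return f(x)


def cumulative_jump(x, a=0.0, interior_jumps=(1.0, 2.0)):
    """Delta_f([a, x]) as in (1.2), with the convention Delta_f([a, a]) = 0."""
    if x == a:
        return 0.0
    # Jumps at discontinuities strictly between a and x.
    interior = sum(f_right(t) - f_left(t) for t in interior_jumps if a < t < x)
    # One-sided jumps at the two endpoints of [a, x].
    return (f_right(a) - f(a)) + interior + (f(x) - f_left(x))


grid = [k / 8 for k in range(25)]  # sample points of [0, 3]
continuous_part = [f(x) - cumulative_jump(x) for x in grid]
```

For this choice of $f$ the difference $f - \Delta_f$ reduces to the identity function on the grid, which is continuous and nondecreasing, and $\Delta_f([0,x])$ never exceeds $f(x) - f(0)$, as (1.2) requires.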
Recall that extended distribution functions fall into the category of monotone functions and there is a bijective map between the factor space $\mathfrak{D}_e|_{\sim}$ of $\sim$-equivalence classes of all extended distribution functions that differ in constants and the Borel-Lebesgue-Stieltjes measures they induce. (See Example 1.2 (iii) and Remark 3.5 (iii), Chapter 5, for a refresher.) It is intuitively clear that the measure derivative as a "pointwise" limit, if it exists, is identical to the function derivative. This is subject to the following theorem.
1.4 Theorem. Let $f$ $(\in \mathfrak{D}_e)$ be an extended distribution function and let $\mu_f$ be the positive Borel-Lebesgue-Stieltjes measure induced by $f$. Then $f$ is differentiable at a point $x_0$ if and only if $\mu_f$ is differentiable at $x_0$ and in this case,
$$f'(x_0) = D\mu_f(x_0). \tag{1.4}$$
Proof. Let $f$ be differentiable at $x_0$. Then, for each positive $\varepsilon$, there is a positive $\delta$ such that
$$|F(x)| < \varepsilon, \quad 0 < |x - x_0| < \delta, \quad\text{where}\quad F(x) := \frac{f(x) - f(x_0)}{x - x_0} - f'(x_0). \tag{1.4a}$$
If $x > x_0$, then by Problem 3.7 a), Chapter 5,
$$\mu_f((x_0,x]) = f(x) - f(x_0),$$
and if $x < x_0$, since $f$ is continuous at $x_0$,
$$\mu_f((x,x_0)) = f(x_0) - f(x).$$
Therefore, if $x < x_0$,
$$F(x) = \frac{\mu_f((x,x_0))}{\lambda((x,x_0))} - f'(x_0)$$
and if $x > x_0$,
$$F(x) = \frac{\mu_f((x_0,x])}{\lambda((x_0,x])} - f'(x_0).$$
The latter is not a significant difference from (1.4), since $f$, and therefore $F$, are continuous at $x_0$. Furthermore, because $f$ can have at most countably many discontinuities, there is an interval around $x_0$ where $f$ and $F$ are continuous. In other words, the selection of $\delta$ can be made appropriate to warrant $F(x) = F(x-)$. Then, by (1.4a) and Remark 8.2, $D\mu_f(x_0)$ exists and (1.4) holds. The converse is subject to similar arguments after $f'(x_0)$ is replaced by $D\mu_f(x_0)$ in the expression for $F(x)$. □
Corollary 8.7 and Theorem 1.4 combined immediately yield:
1.5 Corollary. Every extended distribution function $f \in \mathfrak{D}_e$ is differentiable $\lambda$-a.e. and
$$f' = D\mu_f = g \quad \lambda\text{-a.e.},$$
where $\mu_f$ is the Borel-Lebesgue-Stieltjes measure induced by $f$ and $g$ is a Radon-Nikodym density of the absolutely continuous component of $\mu_f$ in its Lebesgue decomposition.
1.6 Corollary. Every monotone function bounded over bounded intervals is differentiable $\lambda$-a.e.
Proof. Let $g$ be a monotone nondecreasing function (otherwise, we consider $-g$). Define
$$f(x) := g(x+)$$
to have $f \in \mathfrak{D}_e$. Then $f$ is differentiable $\lambda$-a.e., due to Corollary 1.5, and so is $g$, which, by Theorem 1.2, has at most countably many discontinuities, and hence equals $f$ $\lambda$-a.e. □
1.7 Theorem (Fubini). Let $\{F_n\}$ be a sequence of monotone nondecreasing functions such that the series $\sum_{n=1}^{\infty} F_n$ converges to a function $F$ in the topology of pointwise convergence. Then:
(i) Both $F_n$ and $F$ are differentiable $\lambda$-a.e.
(ii) $F'(x) = \sum_{n=1}^{\infty} F_n'(x)$, $\lambda$-a.e.
Proof. Assume that for each $n$, $F_n$ is a distribution function and $F$ is bounded. Let $\mu_{F_n}$ be the corresponding finite Borel-Lebesgue-Stieltjes measure. The set function $\mu_F = \sum_{n=1}^{\infty} \mu_{F_n}$ is a positive measure. Then, $F$ is clearly a distribution function, and
$$\mu_F((-\infty,x]) = \sum_{n=1}^{\infty} \mu_{F_n}((-\infty,x]) = \sum_{n=1}^{\infty} F_n(x) = F(x).$$
It follows by elementary arguments that $\mu_F$ is a finite Borel-Lebesgue-Stieltjes measure induced by $F$. Let
$$\mu_{F_n} = \mu_n^a + \mu_n^s$$
denote the Lebesgue decomposition of $\mu_{F_n}$ and let $f_n$ be a Radon-Nikodym density of its absolutely continuous component. We show that
$$\mu_F = \sum_{n=1}^{\infty} \mu_n^a + \sum_{n=1}^{\infty} \mu_n^s$$
is the Lebesgue decomposition of $\mu_F$ and $f := \sum_{n=1}^{\infty} f_n$ is a Radon-Nikodym density of its absolutely continuous component. Since $\mu_n^s \perp \lambda$, there is a $\lambda$-null set $N_n$ such that $\lambda(N_n) = \mu_n^s(N_n^c) = 0$. Let
$$N = \bigcup_{n=1}^{\infty} N_n.$$
Then, because $N \supseteq N_n$ for each $n$ (and thus $N^c \subseteq N_n^c$),
$$\lambda(N) = 0 \quad\text{and}\quad \sum_{n=1}^{\infty} \mu_n^s(N^c) = 0,$$
so that $\sum_{n=1}^{\infty} \mu_n^s \perp \lambda$. On the other hand, $\sum_{n=1}^{\infty} \mu_n^a$ is the absolutely continuous component of $\mu_F$, since by the Monotone Convergence Theorem,
$$\sum_{n=1}^{\infty} \mu_n^a(B) = \sum_{n=1}^{\infty} \int_B f_n \, d\lambda = \int_B f \, d\lambda, \quad B \in \mathcal{B}.$$
As a finite measure, $\sum_{n=1}^{\infty} \mu_n^a$ $(\le \mu_F)$ provides that $f$ is an $L^1$-function and, by the Radon-Nikodym Theorem, $f$ is a unique, modulo $\lambda$, Radon-Nikodym density of $\sum_{n=1}^{\infty} \mu_n^a$ with respect to the Lebesgue measure. Since $F$ is a distribution function, by Corollary 1.5, $F'$ exists $\lambda$-a.e. and
$$F' = D\mu_F = f \quad \lambda\text{-a.e.}$$
On the other hand, applying the same argument to $F_n$, we have that
$$F_n' = D\mu_{F_n} = f_n \quad \lambda\text{-a.e.}$$
and the two equations yield
$$F' = \sum_{n=1}^{\infty} F_n' \quad \lambda\text{-a.e.}$$
The general case of the theorem, when $F$ is a monotone nondecreasing function, bounded over bounded intervals, is left for the exercise (Problem 1.1). □
The following statement is an interesting partial confirmation of the revered Newton-Leibnitz theorem applied to a class of monotone functions. The latter are differentiable $\lambda$-a.e. Unless specified otherwise, we will extend the derivative of such a function $f$ by setting $f' = 0$ on the set $N \in \mathcal{N}_\lambda$, where $N^c$ is the set on which $f'$ exists.
1.8 Theorem. Let $f$ be a bounded monotone nondecreasing function on the compact interval $[a,b]$. Then, $f'$ is measurable and
$$\int_a^b f' \, d\lambda \le f(b) - f(a). \tag{1.8}$$
Proof. Let us (continuously) extend $f$ through $(b,b+1]$ by setting $f(x) = f(b)$ on this interval. Then, at every point $x$ where the derivative of $f$ exists it can be represented as the limit
$$f'(x) = \lim_{n\to\infty} n\Big[f\Big(x + \tfrac{1}{n}\Big) - f(x)\Big]$$
of a convergent sequence of measurable functions. Furthermore, $f'$ exists on a measurable subset of $[a,b]$ whose complement is a $\lambda$-null set on which $f'$ is set to equal zero. Thus, $f'$ is well defined on $[a,b]$, it is nonnegative and therefore its Lebesgue integral exists. By Fatou's Lemma, then
$$\int_a^b f' \, d\lambda \le \sup_n \Big\{ n \int_a^b \Big[f\Big(x + \tfrac{1}{n}\Big) - f(x)\Big] \lambda(dx) \Big\}.$$
By the change of variables,
$$\int_a^b f\Big(x + \tfrac{1}{n}\Big) \lambda(dx) = \int_{a+\frac{1}{n}}^{b+\frac{1}{n}} f(x) \, \lambda(dx)$$
and thus:
$$\int_a^b \Big[f\Big(x + \tfrac{1}{n}\Big) - f(x)\Big] \lambda(dx) = \int_b^{b+\frac{1}{n}} f(x)\,\lambda(dx) - \int_a^{a+\frac{1}{n}} f(x)\,\lambda(dx)$$
$$= \tfrac{1}{n} f(b) - \int_a^{a+\frac{1}{n}} f(x)\,\lambda(dx) \le \tfrac{1}{n}\,[f(b) - f(a)].$$
□
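The key estimate of the proof, $n\int_a^b [f(x+\frac{1}{n}) - f(x)]\,\lambda(dx) \le f(b) - f(a)$, can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the test function $f(x) = x^2$ on $[0,1]$ (extended by $f(1)$ to the right, as in the proof), a midpoint rule in place of the Lebesgue integral, and the hypothetical helper name `scaled_increment_integral`.

```python
def f(x):
    """Assumed test function: x**2 on [0, 1], extended by the constant f(1) to the right of 1."""
    return min(x, 1.0) ** 2


def scaled_increment_integral(n, steps=20000):
    """Midpoint-rule value of n * (integral over [0, 1] of f(x + 1/n) - f(x))."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += f(x + 1.0 / n) - f(x)
    return n * total * h


# Values of the scaled integral for a few n; each should stay below f(1) - f(0) = 1.
values = {n: scaled_increment_integral(n) for n in (1, 2, 5, 10, 50)}
```

For this smooth $f$ a direct calculation gives $1 - \frac{1}{3n^2}$ for the scaled integral, so the bound is approached from below and (1.8) holds with equality; the Cantor function of Example 1.9 shows that the inequality can be strict.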
The above statement seems to fall surprisingly short of the familiar Newton-Leibnitz equation. Moreover, as we will learn from the example below, the result of Theorem 1.8 can deliver a strict inequality.
1.9 Example (Cantor function). Let $G_n$, $n = 1, 2, \dots$, be open sets removed from $[0,1]$ to form the Cantor ternary set (see Example 3.11, Chapter 5). Recall that each $G_n$ is the union of $2^{n-1}$ disjoint open intervals. Now, the set $\bigcup_{k=1}^{n} G_k$ is the union of $2^n - 1$ (as the result of the summation of $1 + \dots + 2^{n-1}$) open intervals denoted by
$$A_1(n), \dots, A_{2^n - 1}(n)$$
and arranged in the order of their location in $[0,1]$. For each $n$, define the function $F_n : [0,1] \to [0,1]$ as follows. Let
$$F_n(x) = k/2^n, \quad\text{if } x \in A_k(n), \; k = 1, 2, \dots, 2^n - 1,$$
and
$$F_n(0) = 0, \quad F_n(1) = 1.$$
Then, interpolate $F_n$ by connecting the ends of the corresponding segments of $F_n$ on $A_k(n)$. For instance,
$$\bigcup_{k=1}^{3} G_k = \Big(\tfrac{1}{27},\tfrac{2}{27}\Big) + \Big(\tfrac{1}{9},\tfrac{2}{9}\Big) + \Big(\tfrac{7}{27},\tfrac{8}{27}\Big) + \Big(\tfrac{1}{3},\tfrac{2}{3}\Big) + \Big(\tfrac{19}{27},\tfrac{20}{27}\Big) + \Big(\tfrac{7}{9},\tfrac{8}{9}\Big) + \Big(\tfrac{25}{27},\tfrac{26}{27}\Big).$$
The graphs of $F_1$ and $F_3$ are drawn in Figure 1.1 below.
[Figure 1.1: graphs of $F_1$ and $F_3$ over the intervals $A_k(3)$.]
Observe that $A_k(n) = A_{2k}(n+1)$, and that $F_n(x) = F_{n+1}(x) = k/2^n$, for $x \in A_k(n) = A_{2k}(n+1)$, $k = 1, \dots, 2^n - 1$. It is easily seen that $F_n$ is a monotone nondecreasing, continuous function on $[0,1]$, and it is also clear that $|F_n(x) - F_{n+1}(x)| < \frac{1}{2^n}$, $\forall x \in [0,1]$. Thus $F_n(x)$ converges uniformly to a function $F(x)$, which is called the Cantor function, and $F$ is also continuous and monotone nondecreasing (as the result of the uniform convergence of a sequence of monotone nondecreasing, continuous functions). Therefore, since $F(x) = F_n(x) = k/2^n$ for $x \in A_k(n)$, we have that $F'(x) = 0$, for $x \in A_k(n)$, $k = 1, 2, \dots, 2^n - 1$, $n = 1, 2, \dots$. Hence,
$$F'(x) = 0 \quad\text{on } \bigcup_{n=1}^{\infty} G_n.$$
The latter is the complement of the Cantor set $C$. Consequently, $F' \in [0]_\lambda$ on $[0,1]$. Therefore,
$$\int_0^1 F' \, d\lambda = 0, \quad\text{while}\quad F(1) - F(0) = 1.$$
□
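The behavior described in Example 1.9 can be sketched in code. The block below is a floating-point approximation, not the construction through $F_n$ used in the text: it reads off the ternary digits of $x$, which is a standard equivalent description of the Cantor function, and the name `cantor` and the `depth` parameter are illustrative assumptions.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function F on [0, 1] from the ternary digits of x."""
    if x >= 1.0:
        return 1.0
    if x <= 0.0:
        return 0.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)          # next ternary digit of x: 0, 1, or 2
        x -= digit
        if digit == 1:
            # x lies in a removed middle third, where F is constant.
            return value + scale
        value += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale *= 0.5
    return value


# On the removed interval (1/3, 2/3) the function is constant, so difference quotients vanish there.
flat_quotient = (cantor(0.6) - cantor(0.4)) / 0.2
total_rise = cantor(1.0) - cantor(0.0)
```

The difference quotient on a removed interval is zero, while the total rise over $[0,1]$ is one, which is the strict inequality in (1.8) that the example exhibits.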
PROBLEMS
1.1 Complete the proof of Fubini's Theorem 1.7 for the general case when $F$ is a monotone nondecreasing function, bounded over bounded intervals.
1.2 Let $f$ be a monotone nondecreasing function on $[a,b]$ and $F$ be a monotone function on $[A,B]$. Is the composition $F \circ f : [a,b] \to \mathbb{R}$ monotone?
1.3 Let $f$ and $F$ be the functions of Problem 1.2 and suppose the function $f$ has a jump discontinuity at $x_0 \in (a,b)$. Must $F \circ f$ be discontinuous at $x_0$?
1.4 Show that if $f$ is continuous on $[a,b]$, then the functions $m(x) := \inf\{f(t) : t \in [a,x]\}$ and $M(x) := \sup\{f(t) : t \in [a,x]\}$ are continuous and monotone on $[a,b]$.
1.5 Give an example of two monotone nondecreasing functions whose product is not monotone.
1.6 Give a monotone increasing function $[\mathbb{R},\mathbb{R},f]$ discontinuous at each rational point.
1.7 Prove that if a function $[(a,b),\mathbb{R},f]$ is monotone, bounded, and continuous, then it is uniformly continuous.
1.8 Does the statement of Problem 1.7 still hold if the interval $(a,b)$ is replaced by $\mathbb{R}$?
NEW TERMS:
monotone nondecreasing function
monotone nonincreasing function
monotone function
jump discontinuity
cumulative jump function
Fubini's Theorem for monotone functions
Cantor's ternary function
2. FUNCTIONS OF BOUNDED VARIATION
Now we will introduce the class of functions of "bounded variation," which play the same role for signed measures as distribution functions do for generating positive Borel-Lebesgue-Stieltjes measures.
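2.1 Definition. Let $[a,b]$ be a compact interval in $\mathbb{R}$ and let $P = \{a_0 = a, \dots, a_n = b\}$ be a partition of $[a,b]$. Let $f$ be a bounded measurable real-valued function defined on $[a,b]$. Denote
$$V(P) = V_f(P,[a,b]) = \sum_{i=1}^{n} |f(a_i) - f(a_{i-1})|$$
and let $\mathcal{P}$ be the set of all partitions of $[a,b]$. Then we call $\sup\{V(P) : P \in \mathcal{P}\}$ the variation of $f$ on $[a,b]$ and denote it by $V_f[a,b]$. The function $f$ is said to be of bounded variation on $[a,b]$ if $V_f[a,b] < \infty$. □
2.2 Example. Consider the function
$$f(x) = \begin{cases} 0, & x = 0 \\ x \sin\frac{1}{x}, & 0 < x \le 1 \end{cases}$$
and make the partition $P = \{0 < x_n < \dots < x_1 < 1\}$ such that
$$x_n = \frac{1}{\big(n + \frac{1}{2}\big)\pi}.$$
Then,
$$f(x_n) = \frac{(-1)^n}{\big(n + \frac{1}{2}\big)\pi}$$
and hence
$$V(P) \ge \sum_{k=1}^{n-1} |f(x_k) - f(x_{k+1})| = \frac{1}{\pi} \sum_{k=1}^{n-1} \Big( \frac{1}{k + \frac{1}{2}} + \frac{1}{k + \frac{3}{2}} \Big),$$
which grows without bound as $n \to \infty$. Consequently,
$$V_f[0,1] = \infty.$$
□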
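The divergence in Example 2.2 can be observed numerically. The sketch below assumes the reading $f(x) = x\sin\frac{1}{x}$ with partition points $x_k = \frac{1}{(k+\frac12)\pi}$; the function names are hypothetical. It evaluates $V(P)$ for the partition with $n$ interior points and compares it with a harmonic-type lower bound.

```python
from math import sin, pi


def f(x):
    """x * sin(1/x) on (0, 1], with f(0) = 0."""
    return 0.0 if x == 0.0 else x * sin(1.0 / x)


def partition_variation(n):
    """V(P) for P = {0 < x_n < ... < x_1 < 1} with x_k = 1 / ((k + 1/2) * pi)."""
    points = [0.0] + [1.0 / ((k + 0.5) * pi) for k in range(n, 0, -1)] + [1.0]
    return sum(abs(f(q) - f(p)) for p, q in zip(points, points[1:]))


def harmonic_lower_bound(n):
    """(1/pi) * sum of 1/(k+1), a crude lower bound for the interior terms of V(P)."""
    return sum(1.0 / (k + 1) for k in range(1, n)) / pi


for n in (10, 100, 1000):
    print(n, partition_variation(n), harmonic_lower_bound(n))
```

Because the partitions are nested, $V(P)$ increases with $n$, and the harmonic lower bound shows that it cannot stay bounded.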
We will leave for an exercise (Problems 2.1-2.14) the following properties of functions of bounded variation.
2.3 Theorem. Let $[[a,b],\mathbb{R},f]$ be a bounded function. The following hold true:
(i) If $f$ is monotone, then it is of bounded variation.
(ii) If $f$ satisfies a Lipschitz condition, then $f \in \mathcal{V}[a,b]$.
(iii) Let $f \in \mathcal{V}[a,b]$. Then $x \mapsto V_f[a,x]$ is a monotone nondecreasing function on $[a,b]$.
(iv) The set $\mathcal{V}[a,b]$ of all functions of bounded variation on $[a,b]$ is a vector space over the field $\mathbb{R}$ and it is closed with respect to multiplication.
(v) Let $f, g \in \mathcal{V}[a,b]$ such that $g \ge \delta > 0$. Then $\frac{f}{g} \in \mathcal{V}[a,b]$.
(vi) If $f \in \mathcal{V}[a,b]$, then $V_f[a,b] = V_f[a,c] + V_f[c,b]$ for any $c \in (a,b)$.
(vii) If $P = \{a = a_0 < a_1 < \dots < a_n = b\}$ is a partition of $[a,b]$ such that on each of the subintervals $[a_i, a_{i+1}]$ $f$ is monotone, then $f \in \mathcal{V}[a,b]$.
(viii) If $f \in \mathcal{V}[a,b]$ and $[a,b] = [a,c] + (c,b]$, then $f \in \mathcal{V}[a,c]$ and $f \in \mathcal{V}[c,b]$.
(ix) $f \in \mathcal{V}[a,b]$ if and only if $f$ can be represented as the difference of two monotone nondecreasing functions.
(x) If $f \in \mathcal{V}[a,b]$, then $f$ is differentiable $\lambda$-a.e. on $[a,b]$.
(xi) The set of all jump discontinuities of any function $f \in \mathcal{V}[a,b]$ is at most countable.
(xii) Any $f \in \mathcal{V}[a,b]$ can be represented as the sum of its jump function $\Delta_f$ and a continuous function of bounded variation on $[a,b]$.
(xiii) Let $f \in \mathcal{V}[a,b]$. If $f$ is continuous at $x_0 \in (a,b)$, then so is $x \mapsto V_f[a,x]$. If $f$ is right-continuous, then so is $V_f[a,x]$.
(xiv) Any continuous function $f \in \mathcal{V}[a,b]$ can be represented as the difference of two continuous monotone functions. □
2.4 Definition. Let $[\mathbb{R},\mathbb{R},f]$ be a bounded function. The limits
$$V_f(-\infty,b] = \lim_{a\to\infty} V_f[-a,b],$$
$$V_f[a,\infty) = \lim_{b\to\infty} V_f[a,b],$$
$$V_f(\mathbb{R}) = V_f(-\infty,+\infty) = \lim_{a\to\infty} V_f[-a,a]$$
are said to be the variation of $f$ on $(-\infty,b]$, the variation of $f$ on $[a,\infty)$, and the total variation of $f$, respectively. The function $f$ is said to be of bounded variation on $(-\infty,b]$, $[a,\infty)$, or $\mathbb{R}$, if the above respective limits are finite, in notation, $f \in \mathcal{V}(-\infty,b]$, $f \in \mathcal{V}[a,\infty)$, or $f \in \mathcal{V}(\mathbb{R})$, respectively. □
2.5 Theorem.
(i) For any two real numbers $a < b$,
$$V_f(-\infty,b] = V_f(-\infty,a] + V_f[a,b]$$
and
$$V_f[a,\infty) = V_f[a,b] + V_f[b,\infty).$$
(ii) If $f \in \mathcal{V}(\mathbb{R})$, then
$$\lim_{a\to\infty} V_f(-\infty,-a] = \lim_{a\to\infty} V_f[a,\infty) = 0.$$
(iii) $f \in \mathcal{V}(\mathbb{R})$ if and only if $f$ can be represented as the difference of two monotone nondecreasing bounded functions. If, in addition, $f$ is a distribution function, then the latter representation is of two distribution functions. □
Proof. Parts (i) and (ii) are left for the reader (Problem 2.21).
(iii) Denote $v_f(x) = V_f(-\infty,x]$ and
$$F := v_f + f \quad\text{and}\quad G = v_f - f.$$
Clearly, $v_f$ is a monotone nondecreasing and bounded function. Let $x < y$. Then, because $|f(y) - f(x)| \le V_f[x,y]$,
$$F(y) - F(x) = V_f[x,y] + f(y) - f(x) \ge 0.$$
The proof that $G$ is monotone nondecreasing is analogous. Now,
$$f = \tfrac{1}{2}F - \tfrac{1}{2}G$$
is a pertinent representation. Finally, if $f$ is a distribution function, then so is $v_f$, due to part (ii) and Theorem 2.3 (xiii). □
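The decomposition used in part (iii) can be illustrated on sampled data. The sketch below is a discrete stand-in under stated assumptions: it works on a finite grid over the compact interval $[0, 2\pi]$ rather than on $\mathbb{R}$, it replaces $v_f$ by the cumulative sum of absolute increments along the grid, and the test function $\sin x$ and the name `decompose` are illustrative choices not found in the text.

```python
from math import sin, pi


def decompose(samples):
    """Return the discrete variation function v and the pair F = v + f, G = v - f."""
    variation = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        # Running sum of |f(x_k) - f(x_{k-1})| along the grid.
        variation.append(variation[-1] + abs(cur - prev))
    F = [v + s for v, s in zip(variation, samples)]
    G = [v - s for v, s in zip(variation, samples)]
    return variation, F, G


grid = [2.0 * pi * k / 400 for k in range(401)]
samples = [sin(x) for x in grid]
variation, F, G = decompose(samples)

# Reconstruct f from the two nondecreasing sequences, as in f = F/2 - G/2.
reconstructed = [0.5 * a - 0.5 * b for a, b in zip(F, G)]
```

Each increment of $F$ equals $|\Delta f| + \Delta f \ge 0$ and each increment of $G$ equals $|\Delta f| - \Delta f \ge 0$, which is the discrete counterpart of the inequality $|f(y) - f(x)| \le V_f[x,y]$ used in the proof.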
2.6 Definition. A function $f \in \mathcal{V}(\mathbb{R})$ is said to be a signed distribution function (in notation, $f \in \mathfrak{D}_s$), if it is right-continuous and vanishes at $-\infty$. □
2.7 Theorem. Let $\mathcal{D}$ be the operator defined on the set $\mathfrak{S}^*(\mathbb{R},\mathcal{B})$ of all finite signed measures by
$$f(x) = \nu((-\infty,x]). \tag{2.7}$$
Then, $f$ is a signed distribution function and $[\mathfrak{S}^*(\mathbb{R},\mathcal{B}), \mathfrak{D}_s, \mathcal{D}]$ is a bijection.
Proof.
1) Given $\nu \in \mathfrak{S}^*(\mathbb{R},\mathcal{B})$, let $f = \mathcal{D}(\nu)$, in accordance with (2.7). We will show that $f \in \mathfrak{D}_s$. If $x_0 < x_1 < \dots < x_n$ are real numbers, then from
$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{n} |\nu((x_{k-1},x_k])| \le \sum_{k=1}^{n} |\nu|((x_{k-1},x_k]) \le |\nu|(\mathbb{R}) = \|\nu\|$$
it follows that the total variation of $f$ is bounded by $\|\nu\|$ and therefore, $f \in \mathcal{V}(\mathbb{R})$. Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition. Then, $\nu^+$ and $\nu^-$ are positive finite Borel-Lebesgue-Stieltjes measures that induce two distribution functions $f^+$ and $f^-$, respectively. The latter, because of (2.7), yield
$$f = f^+ - f^-,$$
perfectly in agreement with Theorem 2.5 (iii). Consequently, $f$ is right-continuous and it vanishes at $-\infty$, because $f^+$ and $f^-$ do. Hence, $f \in \mathfrak{D}_s$.
2) Let $f \in \mathfrak{D}_s$. Then, by Theorem 2.5 (iii), $f$ can be represented as the difference $f = f_1 - f_2$ of two distribution functions. Let $\mu_1$ and $\mu_2$ be two finite positive Borel-Lebesgue-Stieltjes measures induced by $f_1$ and $f_2$, respectively. Define $\nu = \mu_1 - \mu_2$. Then, obviously, $\nu \in \mathfrak{S}^*(\mathbb{R},\mathcal{B})$. Furthermore,
$$\nu((-\infty,x]) = \mu_1((-\infty,x]) - \mu_2((-\infty,x]) = f_1(x) - f_2(x) = f(x)$$
agrees with (2.7), thereby justifying that $\mathcal{D}(\nu) = f$ and that $\nu$ is an element of $\mathfrak{S}^*(\mathbb{R},\mathcal{B})$ induced by $f$ or, equivalently, that $\nu \in \mathcal{D}^*(\{f\})$.
Hence, we showed that $[\mathfrak{S}^*(\mathbb{R},\mathcal{B}), \mathfrak{D}_s, \mathcal{D}]$ is surjective. It remains to prove that $[\mathfrak{S}^*(\mathbb{R},\mathcal{B}), \mathfrak{D}_s, \mathcal{D}]$ is also injective. The need for this is as follows. Since the representation $f = f_1 - f_2$ is not unique (it is easily seen that if $f_1 - f_2$ is a decomposition of $f$, then $\{(f_1 + h) - (f_2 + h) : h \in \mathfrak{D}\}$ gives a class of decompositions of $f$ into two distribution functions), $\mathcal{D}^{-1}(f)$ may potentially yield more than one signed measure in accordance with the procedure specified in part 2). On the other hand, it is obvious that any representation of $f$ will induce signed measures, all of which will coincide on the family $\{(-\infty,x] : x \in \mathbb{R}\}$ and thus must be equal. Notice, however, that we do not have such a concept as uniqueness of the Carathéodory extension for signed measures and we must proceed by using the Jordan decomposition instead.
3) Suppose that $\mathcal{D}^*(\{f\}) = \{\nu, \rho\}$ and let $f_\nu^+$, $f_\nu^-$, $f_\rho^+$, and $f_\rho^-$ be distribution functions corresponding to the respective Jordan decompositions of $\nu$ and $\rho$. Then, we have
$$f = f_\nu^+ - f_\nu^- = f_\rho^+ - f_\rho^-,$$
which yields that
$$f_\nu^+ + f_\rho^- = f_\rho^+ + f_\nu^-.$$
On the other hand, $\nu^+ + \rho^-$ and $\rho^+ + \nu^-$ that correspond to the distribution functions $f_\nu^+ + f_\rho^-$ and $f_\rho^+ + f_\nu^-$, respectively, are positive, finite Borel-Lebesgue-Stieltjes measures, which must be equal, because of $f_\nu^+ + f_\rho^- = f_\rho^+ + f_\nu^-$ and the uniqueness of Borel-Lebesgue-Stieltjes measures (induced by identical distribution functions). This leads to the equality $\nu = \rho$ and therefore completes the proof of the theorem. □
Clearly, $\mathfrak{D}_s \subseteq \mathcal{V}(\mathbb{R})$. Now, let $g \in \mathcal{V}(\mathbb{R})$. Then, by Theorem 2.5 (iii), $g$ can be represented as $g = g^+ - g^-$, where $g^+$ and $g^-$ are two bounded monotone nondecreasing functions vanishing at $-\infty$. We can convert them to distribution functions by letting $f^+(x) = g^+(x+)$ and $f^-(x) = g^-(x+)$ and hence making $f^+$ and $f^-$ elements of $\mathfrak{D}$. Therefore, there is a bijective operator acting from $\mathfrak{D}_s$ to $\mathcal{V}(\mathbb{R})$ and Theorem 2.7 can be restated as follows.
2.8 Theorem. There is a bijective map between the set $\mathfrak{S}^*(\mathbb{R},\mathcal{B})$ of all finite signed Borel-Lebesgue-Stieltjes measures and the set $\mathcal{V}(\mathbb{R})$ of all functions of bounded variation on $\mathbb{R}$. □
PROBLEMS
2.1 Prove part (i) of Theorem 2.3.
2.2 Prove part (ii) of Theorem 2.3.
2.3 Prove part (iii) of Theorem 2.3.
2.4 Prove part (iv) of Theorem 2.3.
2.5 Prove part (v) of Theorem 2.3.
2.6 Prove part (vi) of Theorem 2.3.
2.7 Prove part (vii) of Theorem 2.3.
2.8 Prove part (viii) of Theorem 2.3.
2.9 Prove part (ix) of Theorem 2.3.
2.10 Prove part (x) of Theorem 2.3.
2.11 Prove part (xi) of Theorem 2.3.
2.12 Prove part (xii) of Theorem 2.3.
2.13 Prove part (xiii) of Theorem 2.3.
2.14 Prove part (xiv) of Theorem 2.3.
2.15 Find the total variation of the function
$$f(x) = \begin{cases} 0, & x = 0 \\ 1 - x, & 0 < x < 1 \\ 5, & x = 1 \end{cases}$$
defined on $[0,1]$.
2.16 Find the total variation of the function
$$f(x) = \begin{cases} x - 1, & x < 1 \\ 10, & x = 1 \\ x^2, & x > 1 \end{cases}$$
defined on $[0,2]$.
2.17 Show that the function
$$f(x) = \begin{cases} 0, & x = 0 \\ x^2 \cos\frac{1}{x}, & 0 < x \le 1 \end{cases}$$
is of bounded variation on $[0,1]$.
2.18 Prove that a differentiable function on $[a,b]$ with a bounded derivative is a function of bounded variation.
2.19 Must a uniformly convergent series on $[a,b]$ of functions of bounded variation be of bounded variation?
2.20 If $f$ has a Riemann integrable derivative on $[a,b]$, prove that its total variation is
$$V_f[a,b] = \int_a^b |f'(x)| \, dx.$$
2.21 Prove parts (i) and (ii) of Theorem 2.5.
NEW TERMS:
variation of a function on a bounded interval
function of bounded variation
variation of a function on an unbounded interval
total variation of a function
signed distribution function
3. ABSOLUTELY CONTINUOUS FUNCTIONS
Below we introduce the concept of absolute continuity and establish its connection with absolute continuity of measures.
3.1 Definition. A function $[\mathbb{R},\mathbb{R},f]$ is called absolutely continuous on a compact interval $[a,b]$ (in notation, $f \in A[a,b]$), if for each $\varepsilon > 0$ there is a $\delta > 0$ such that for any finitely many bounded disjoint open subintervals $(a_i,b_i)$, $i = 1, \dots, n$, with
$$\sum_{i=1}^{n} (b_i - a_i) < \delta \tag{3.1}$$
it holds true that
$$\sum_{i=1}^{n} |f(b_i) - f(a_i)| < \varepsilon. \tag{3.1a}$$
A function $f$ is called absolutely continuous on $\mathbb{R}$ or just absolutely continuous (in notation $f \in A(\mathbb{R})$), if for each $\varepsilon > 0$, there is a $\delta > 0$ such that for any finitely many bounded disjoint open intervals $(a_i,b_i)$, $i = 1, \dots, n$, satisfying (3.1), inequality (3.1a) holds. □
Clearly, the absolute continuity of a function $f$ on $\mathbb{R}$ or an interval implies uniform continuity (the converse is not true), which in turn makes $f$ measurable.
3.2 Proposition. $A[a,b] \subseteq \mathcal{V}[a,b]$.
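Proof. Let $f \in A[a,b]$. Then, for $\varepsilon = 1$, there is a $\delta$ such that for any $n$-tuple of disjoint open intervals, $\{(a_k,b_k)\}_{k=1}^{n}$ with
$$\sum_{k=1}^{n} (b_k - a_k) < \delta$$
it holds true that
$$\sum_{k=1}^{n} |f(b_k) - f(a_k)| < 1.$$
Let us make a partition $P = \{a_0 = a < a_1 < \dots < a_N = b\}$ of $[a,b]$ into subintervals with mesh $P < \delta$. Then for each $[a_{k-1},a_k]$ its arbitrary decomposition
$$a_{k-1} = c_0 < c_1 < \dots < c_m = a_k$$
yields
$$\sum_{j=1}^{m} |f(c_j) - f(c_{j-1})| < 1$$
and thus
$$V_f[a_{k-1},a_k] \le 1 \quad\text{and}\quad V_f[a,b] = \sum_{k=1}^{N} V_f[a_{k-1},a_k] \le N < \infty.$$
□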
3.3 Example. From Example 2.2 and Proposition 3.2, it immediately follows that although the function
$$f(x) = \begin{cases} 0, & x = 0 \\ x \sin\frac{1}{x}, & 0 < x \le 1 \end{cases}$$
is continuous, it is not absolutely continuous.
□
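Definition 3.1 can also be tested against the Cantor function of Example 1.9, which is a standard example of a continuous monotone function that fails to be absolutely continuous. The sketch below uses exact rational arithmetic; the ternary-digit evaluation of the Cantor function and the names `cantor` and `cantor_cover` are assumptions made for illustration and are not taken from the text.

```python
from fractions import Fraction


def cantor(x, depth=64):
    """Cantor function at a rational x in [0, 1], exact when x has a finite ternary expansion."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    if x <= 0:
        return Fraction(0)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit (floor, since x is nonnegative)
        x -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
        if x == 0:
            break
    return value


def cantor_cover(n):
    """The 2**n intervals of length 3**-n that remain after n removal steps."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals


cover = cantor_cover(10)
total_length = sum(b - a for a, b in cover)                   # equals (2/3)**10
total_increment = sum(cantor(b) - cantor(a) for a, b in cover)  # stays equal to 1
```

The open interiors of these intervals are disjoint and their total length $(2/3)^n$ can be made smaller than any $\delta$, yet the sum in (3.1a) remains equal to $1$, so no $\delta$ works for $\varepsilon < 1$.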
3.4 Remark. Proposition 3.2, however, does not imply that $A(\mathbb{R}) \subseteq \mathcal{V}(\mathbb{R})$. For instance, the identity function $f(x) = x$ is absolutely continuous, but not of bounded variation on $\mathbb{R}$. □
Let $f$ be a signed distribution function that generates a finite signed Borel-Lebesgue-Stieltjes measure $\nu$. The theorem below states that $\nu$ is absolutely continuous if $f$ is, and vice versa. Prior to this, we need the following lemma.
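3.5 Lemma. If $f \in A(\mathbb{R}) \cap \mathcal{V}(\mathbb{R})$, then $v_f(x) := V_f(-\infty,x] \in A(\mathbb{R})$.
Proof. Since $f \in A(\mathbb{R})$, given any $\varepsilon > 0$, there is a $\delta > 0$ such that for any $n$-tuple of disjoint open intervals, $\{(a_k,b_k)\}_{k=1}^{n}$ with $\sum_{k=1}^{n} (b_k - a_k) < \delta$, it holds true that $\sum_{k=1}^{n} |f(b_k) - f(a_k)| < \frac{\varepsilon}{2}$.
Since
$$V_f[a_k,b_k] = \sup\Big\{ \sum |f(\cdot) - f(\cdot)| \text{ over all finite partitions of } [a_k,b_k] \Big\},$$
for each $\frac{\varepsilon}{2n} > 0$, there is a partition $a_k = a_{0,k} < \dots < a_{N_k,k} = b_k$ such that
$$V_f[a_k,b_k] < \sum_{m_k=1}^{N_k} |f(a_{m_k,k}) - f(a_{m_k-1,k})| + \frac{\varepsilon}{2n}.$$
Therefore,
$$\sum_{k=1}^{n} V_f[a_k,b_k] < \sum_{k=1}^{n} \sum_{m_k=1}^{N_k} |f(a_{m_k,k}) - f(a_{m_k-1,k})| + \frac{\varepsilon}{2}. \tag{3.5}$$
On the other hand, since
$$\sum_{k=1}^{n} \sum_{m_k=1}^{N_k} (a_{m_k,k} - a_{m_k-1,k}) = \sum_{k=1}^{n} (b_k - a_k) < \delta,$$
we have
$$\sum_{k=1}^{n} \sum_{m_k=1}^{N_k} |f(a_{m_k,k}) - f(a_{m_k-1,k})| < \frac{\varepsilon}{2}.$$
Consequently, from (3.5),
$$\sum_{k=1}^{n} V_f[a_k,b_k] < \varepsilon. \tag{3.5a}$$
By Theorem 2.5 (i), and by our assumption that $f \in \mathcal{V}(\mathbb{R})$,
$$V_f[a_k,b_k] = v_f(b_k) - v_f(a_k),$$
which allows us to rewrite (3.5a) in the form
$$\sum_{k=1}^{n} |v_f(b_k) - v_f(a_k)| < \varepsilon$$
and thereby complete the proof. □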
3.6 Corollary. Let $f$ and $v_f$ be as in Lemma 3.5. Then, the functions
$$F := v_f + f \quad\text{and}\quad G = v_f - f$$
are absolutely continuous, bounded, and monotone nondecreasing. If $f \in A(\mathbb{R}) \cap \mathcal{V}(\mathbb{R})$ and vanishes at infinity, then it can be represented as the difference,
$$f = \tfrac{1}{2}F - \tfrac{1}{2}G,$$
of two absolutely continuous distribution functions.
Proof. From Lemma 3.5, and linearity of $A(\mathbb{R})$, it follows that $F$ and $G$ are elements of $A(\mathbb{R})$ and they are obviously bounded. The rest is due to Theorem 2.5 (iii). □
3.7 Proposition. If $f \in A[a,b]$, then $f$ can be represented as the difference of two distribution functions on $[a,b]$.
3.8 Theorem. Let $\nu \in \mathfrak{S}^*(\mathbb{R},\mathcal{B})$ and $f_\nu \in \mathfrak{D}_s$ be the corresponding signed distribution function generated by $\nu$. Then the following are equivalent:
(i) $\nu \ll \lambda$
(ii) $f_\nu \in A(\mathbb{R})$.
Proof.
(i)
Let v E 6 *�(1R, <:B). Since I v I is a positive finite measure and absolutely continuous too, by Proposition 5.6, Chapter 6 , for each £ > 0, there is a positive fJ such that for each Borel set A with >.(A) < 6, I v I (A) < £. In particular, if A = L: � = 1 (ak ,b k ) , we have that : I
I (A) = L: � 1 I v I (a k ,b k ) = L: � = 1 v + ((ak ,b k)) + v - ((ak ,b k )) > L: � 1 I v + ((ak ,b k)) - v - ((ak ,b k )) I v((ak ,b k)) e> Iv
=
=
= L: � 1 I t ,., ( b k ) - t ,., (a k ) I , =
implying that f v E A(IR).
(ii) Now, let $f \in A(\mathbb{R}) \cap \mathfrak{D}_s$. Since $\nu = \nu_f$ is finite, $f$ has finite total variation on $\mathbb{R}$. Therefore, $f \in \mathcal{V}(\mathbb{R}) \cap A(\mathbb{R})$, and by Lemma 3.5, its variation function $v_f \in A(\mathbb{R})$. By Corollary 3.6, the functions

$$F := \frac{v_f + f}{2} \quad \text{and} \quad G := \frac{v_f - f}{2}$$

are absolutely continuous, bounded, monotone nondecreasing, and vanishing at $-\infty$. In particular, being absolutely continuous, $F$ and $G$ are elements of $\mathfrak{D}$. Let $\mu_F$ and $\mu_G$ be the corresponding finite Borel-Lebesgue-Stieltjes measures induced by $F$ and $G$, respectively. Because $F - G = f$, the signed measure $\nu_1 := \mu_F - \mu_G$ is clearly an element of $\mathfrak{S}_*(\mathbb{R},\mathfrak{B})$. Since, as we know from Theorem 2.7 (cf. the proof of part 3)), $\nu_f$ does not depend on the decomposition of $f$, we have that $\nu_f = \nu_1$.

It remains to show that $\mu_F$ and $\mu_G$ are absolutely continuous with respect to $\lambda$. We will again use Proposition 5.6, Chapter 6. Let $B$ be a Borel set such that $\lambda(B) < \delta/2$, where $\delta$ is the "threshold" taken from the absolute continuity condition of the distribution function $F$. By regularity of $\lambda$ (see Theorem 7.17 and Remark 7.18), there is an open superset $U$ of $B$ such that

$$\lambda(B) + \frac{\delta}{2} > \lambda(U). \tag{3.8}$$
On the other hand, by Problem 2.10, Chapter 4, $U$ can be represented as an at most countable union of disjoint semi-open intervals:

$$U = \sum_{j} I_j, \qquad I_j = (a_j, b_j],$$

so that, from (3.8) and $\lambda(B) < \delta/2$,

$$\sum_{j} (b_j - a_j) = \lambda(U) < \delta. \tag{3.8a}$$

Now, by absolute continuity of $F$, for any finite subcollection of $\{I_j = (a_j,b_j]\}$, say $\{(a_j,b_j]\}_{j=1}^{n}$, because of (3.8a),

$$\sum_{j=1}^{n} \big[F(b_j) - F(a_j)\big] = \sum_{j=1}^{n} \mu_F((a_j,b_j]) = \mu_F\Big(\sum_{j=1}^{n} (a_j,b_j]\Big) < \varepsilon.$$

By continuity from below of $\mu_F$, we have that $\mu_F(U) \le \varepsilon$. Since $B \subset U$, $\mu_F(B) \le \varepsilon$. In summary, we showed that for each $\varepsilon > 0$, there is a $\delta_0 = \delta/2$ such that for every Borel set $B$ with $\lambda(B) < \delta_0$, $\mu_F(B) \le \varepsilon$, and therefore $\mu_F \ll \lambda$. The same applies to $\mu_G$ and, consequently, to $\nu_1$. $\Box$

3.9 Theorem. A function $f \in A(\mathbb{R}) \cap \mathfrak{D}_s$ if and only if there is $g \in L^1(\mathbb{R},\mathfrak{B},\lambda;\mathbb{R})$ such that

$$f(x) = \int_{-\infty}^{x} g(u)\,\lambda(du). \tag{3.9}$$
Proof.

(i) Suppose $f \in A(\mathbb{R}) \cap \mathfrak{D}_s$. Then, since $f$ is a signed distribution function, by Theorem 2.7, there is a unique signed Borel-Lebesgue-Stieltjes measure $\nu_f \in \mathfrak{S}_*(\mathbb{R},\mathfrak{B})$ induced by $f$. Because $f$ is absolutely continuous, by Theorem 3.8, $\nu_f \ll \lambda$. Therefore, by the Radon-Nikodym Theorem 2.2 (case 5a), $\nu_f$ has a Radon-Nikodym density $g \in L^1(\mathbb{R},\mathfrak{B},\lambda;\mathbb{R})$ with respect to $\lambda$, i.e.,

$$\nu_f(B) = \int_B g\,d\lambda, \quad B \in \mathfrak{B}. \tag{3.9a}$$

In particular,

$$f(x) = \nu_f((-\infty,x]) = \int_{-\infty}^{x} g(u)\,\lambda(du).$$

(ii) Conversely, let $g \in L^1(\mathbb{R},\mathfrak{B},\lambda;\mathbb{R})$. Define $\nu = \int g\,d\lambda$. Then, $\nu \in \mathfrak{S}_*(\mathbb{R},\mathfrak{B})$ and thus, by Theorem 2.7, $\nu$ induces the signed distribution function $f_\nu$, defined as $f_\nu(x) = \nu((-\infty,x])$. On the other hand, since $\nu \ll \lambda$, by Theorem 3.8, $f_\nu \in A(\mathbb{R})$. $\Box$
3.10 Corollary (Lebesgue). Let $f$ be defined as

$$f(x) = \int_{-\infty}^{x} g(u)\,\lambda(du),$$

with $g \in L^1(\mathbb{R},\mathfrak{B},\lambda)$. Then $f$ is differentiable $\lambda$-a.e. and $f' = g$ $\lambda$-a.e.

Proof. By Theorem 3.9, $f$ is a signed distribution function and $g$ is a Radon-Nikodym density of $\nu = \int g\,d\lambda$. Let $f = f^+ - f^-$, where $f^+ = \int g^+\,d\lambda$ and $f^- = \int g^-\,d\lambda$. Then, by Theorem 3.9, $f^+$ and $f^-$ are distribution functions. The statement is now due to Corollary 1.5. $\Box$
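As a numerical illustration of Corollary 3.10, the following sketch takes an integrable density $g$ with a single jump at $0$ and compares a difference quotient of $f(x) = \int_{-\infty}^{x} g\,d\lambda$ with $g$. The particular $g$, the closed form of $f$, and the step size `h` are our own illustrative choices, not taken from the text. At the jump point the one-sided derivatives disagree, which is consistent with $f' = g$ holding only $\lambda$-a.e.

```python
import math

def g(u):
    # Integrable density with a jump at u = 0 (an illustrative choice, not from the text).
    return math.exp(u) if u < 0 else 2.0 * math.exp(-u)

def f(x):
    # Closed form of f(x) = integral of g over (-inf, x].
    return math.exp(x) if x < 0 else 1.0 + 2.0 * (1.0 - math.exp(-x))

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Away from the jump, the difference quotient of f matches g.
points = [-2.0, -0.5, 0.3, 1.7]
max_error = max(abs(central_difference(f, x) - g(x)) for x in points)
print(max_error)

# At the jump u = 0 the one-sided derivatives differ, so f'(0) does not exist.
left = (f(0.0) - f(-1e-6)) / 1e-6
right = (f(1e-6) - f(0.0)) / 1e-6
print(left, right)
```

The exceptional set here is the single point $\{0\}$, which is $\lambda$-negligible.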
The following theorem clarifies the "ambiguity" caused by Theorem 1.8 and establishes a noteworthy criterion for the equality in (1.8).

3.11 Theorem. A function $f$ is absolutely continuous on an interval $[a,b]$ if and only if it is differentiable $\lambda$-a.e. and it can be represented as

$$f(x) = f(a) + \int_a^x f'(u)\,\lambda(du). \tag{3.11}$$
Proof. Let $f \in A[a,b]$. Then, by Proposition 3.2, $f \in \mathcal{V}[a,b]$. Now, denote

$$F(x) = \mathbf{1}_{[a,b]}(x)\,\big(f(x) - f(a)\big) + \mathbf{1}_{(b,\infty)}(x)\,\big[f(b) - f(a)\big],$$

which is defined on $\mathbb{R}$ and is clearly an element of $A(\mathbb{R}) \cap \mathfrak{D}_s$. By formula (3.9a) of Theorem 3.9, there is a $g \in L^1(\mathbb{R},\mathfrak{B},\lambda)$ such that

$$\nu_F((a,x]) = \int_a^x g(u)\,\lambda(du) = F(x) - F(a) = F(x) = \mathbf{1}_{[a,b]}(x)\,\big[f(x) - f(a)\big].$$

By Corollary 3.10, $F$ and therefore $f$ must be differentiable $\lambda$-a.e. on $[a,b]$ and $f' = g$ $\lambda$-a.e. on $[a,b]$. The converse is obvious. $\Box$
Theorem 3.11 immediately yields:

3.12 Corollary. Let $f \in A[a,b]$ and $f' \in [0]_\lambda$. Then $f$ is a constant on $[a,b]$. $\Box$

A challenging exercise is to prove Corollary 3.12 without the use of Theorem 3.11.
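Formula (3.11) can be checked numerically on a simple absolutely continuous function. The sketch below is only an illustration under our own assumptions: the function $f(x) = |x - 1/3| + x^2$ on $[0,1]$, the midpoint quadrature rule, and the number of subintervals `n` are all hypothetical choices. The derivative fails to exist at the kink $x = 1/3$, but a single point does not affect the integral.

```python
def f(x):
    # Absolutely continuous on [0, 1] with a kink at x = 1/3 (illustrative choice).
    return abs(x - 1.0 / 3.0) + x * x

def f_prime(u):
    # Derivative wherever it exists; the value assigned at the kink is irrelevant
    # because a single point is a null set.
    sign = 1.0 if u > 1.0 / 3.0 else -1.0
    return sign + 2.0 * u

def integral(func, a, x, n=20000):
    # Midpoint rule on n equal subintervals of [a, x].
    h = (x - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

a = 0.0
worst_gap = max(
    abs(f(a) + integral(f_prime, a, x) - f(x)) for x in [0.2, 0.5, 0.9, 1.0]
)
print(worst_gap)
```

The printed gap is of the order of the quadrature error, so both sides of (3.11) agree up to discretization for this particular $f$.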
PROBLEMS

3.1 Prove Corollary 3.12 without use of Theorem 3.11: Let $f \in A[a,b]$ and $f' \in [0]_\lambda$. Show that $f$ is a constant on $[a,b]$. [Hint: Use Vitali's Covering Theorem: Let $E$ be a subset of $\mathbb{R}$ and $\mathcal{J}$ be a system of closed nonempty intervals. If for each $x \in E$ and $\varepsilon > 0$, there is an interval $I \in \mathcal{J}$ such that $x \in I$ and $\lambda(I) < \varepsilon$, then the system $\mathcal{J}$ is said to be a Vitali covering of the set $E$. Let $E$ be a bounded set and $\mathcal{J}$ be its Vitali covering. Then there is an at most countable subfamily of disjoint intervals from $\mathcal{J}$ that covers all points of $E$ except for possibly its $\lambda$-negligible subset.]

3.2 Show that a sum or a product of finitely many absolutely continuous functions on $[a,b]$ is absolutely continuous.

3.3 Show that absolute continuity on $[a,b]$ implies uniform continuity.

3.4 Let $f, g \in A[a,b]$ and $g(x) \ne 0$, $x \in [a,b]$. Prove that $f/g \in A[a,b]$.

3.5 Let $f \in A[a,b]$ such that $f$ is also monotone nondecreasing. Suppose that $F \in A[f(a), f(b)]$. Prove that $F \circ f \in A[a,b]$.
NEW TERMS:

absolutely continuous function on a compact interval
absolutely continuous function (on $\mathbb{R}$)
Vitali's Covering Theorem
4. SINGULAR FUNCTIONS
We will continue our discussion on singularity of signed measures, started in Section 3, Chapter VIII, and connect this notion to that for distribution functions. Recall that a signed Borel-Lebesgue-Stieltjes measure $\nu$ is singular-continuous if $\nu$ is singular (i.e., $\nu \perp \lambda$) and $\nu$ is continuous (i.e., for each $x \in \mathbb{R}$, $\nu(\{x\}) = 0$). $\nu$ is atomic if there is an at most countable set $A$ of real numbers such that $\nu(\{a\}) > 0$ for each $a \in A$ and $\nu(A^c) = 0$. Since $\nu \perp \lambda$, it is also called singular-discrete. Binomial, geometric, and Poisson measures are examples of positive singular-discrete Borel-Lebesgue-Stieltjes measures.

4.1 Definition. A function $f$ is called singular-continuous if it is continuous, not a constant, $\lambda$-a.e. differentiable, and its derivative is zero $\lambda$-a.e. [Observe that by Corollary 3.12, a singular-continuous function is continuous but not absolutely continuous.] $\Box$
4.2 Example (Cantor Singular-Continuous Function). From Example 1.9, the Cantor ternary function $F$ is monotone nondecreasing and singular-continuous. Let $\mu_F$ be the corresponding Borel-Lebesgue-Stieltjes measure. Since $F$ is constant on $A_k(n)$, it follows that $\mu_F(A_k(n)) = 0$ and thus $\mu_F(C^c) = 0$. On the other hand, $\lambda(C) = 0$. Thus, $\mu_F \perp \lambda$. Furthermore, since $F$ is continuous, $\mu_F(\{x\}) = 0$ for all $x \in [0,1]$. Therefore, $\mu_F$ is a singular-continuous Borel-Lebesgue-Stieltjes measure induced by $F$. $\Box$
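The orthogonality $\mu_F \perp \lambda$ in the Cantor example can be made tangible with exact rational arithmetic. The sketch below is a minimal illustration under our own assumptions: the helper names `cantor` and `cantor_cover`, the digit-based evaluation of $F$, and the truncation depth are hypothetical choices, not the construction of Example 1.9. It covers the Cantor set by the $2^n$ closed intervals of generation $n$ and compares their total length with the total increase of $F$ over them.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Approximate the Cantor ternary function on [0, 1] via ternary digits."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    if x <= 0:
        return Fraction(0)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:          # first digit 1 reached: F takes the value below
            return value + scale
        value += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2
    return value

def cantor_cover(n):
    """Endpoints of the 2**n closed intervals of generation n covering the Cantor set."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))      # keep the left third
            nxt.append((b - third, b))      # keep the right third
        intervals = nxt
    return intervals

n = 8
cover = cantor_cover(n)
total_length = sum(b - a for a, b in cover)
total_increase = sum(cantor(b) - cantor(a) for a, b in cover)
print(float(total_length), float(total_increase))
```

The total length equals $(2/3)^n$ and can be made arbitrarily small, while the increments of $F$ over the same intervals always add up to $1$. This is exactly the failure of the $\varepsilon$-$\delta$ condition of absolute continuity, and it reflects that $\mu_F$ lives on a $\lambda$-null set.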
The above example gives rise to a seemingly close relation between singular-continuous distribution functions and singular-continuous Borel-Lebesgue-Stieltjes measures. We will start with the following:

4.3 Theorem. Let $\mu$ be a positive $\sigma$-finite singular-continuous Borel-Lebesgue-Stieltjes measure. Then the corresponding extended distribution function $f_\mu$ is singular-continuous.
Proof.

(i) Let $\mu \perp \lambda$. Then, there is a Borel set $A$ such that $\mu(A) = \lambda(A^c) = 0$. Since $f := f_\mu$ is an extended distribution function, by Corollary 1.6, $f'$ exists $\lambda$-a.e. and clearly $f' \ge 0$ everywhere it exists. We will show that $E = \{x : f'(x) > 0\} \in \mathcal{N}_\lambda$. Since $\lambda(A^c) = 0$,

$$\int_E f'\,d\lambda = \int_E \mathbf{1}_A f'\,d\lambda \le \int_A f'\,d\lambda.$$

We will prove that $\int_A f'\,d\lambda = 0$. If so, $\int_E f'\,d\lambda = \int_A f'\,d\lambda = 0$ and thus $f' = 0$ $\lambda$-a.e.

By Theorem 1.8, for each compact interval $[a,b]$,

$$\int_a^b f'\,d\lambda \le f(b) - f(a) = \mu((a,b]). \tag{4.3}$$

Since $A$ is Borel and $\mu$ is $\sigma$-finite, by Theorem 2.28, Chapter 5, for each $\varepsilon > 0$, there is a disjoint sequence $\{I_n\}$ of semi-open intervals such that $A \subset \sum_{n=1}^{\infty} I_n$ and

$$\mu\Big(\sum_{n=1}^{\infty} I_n \setminus A\Big) = \sum_{n=1}^{\infty} \mu(I_n) - \mu(A) = \sum_{n=1}^{\infty} \mu(I_n) < \varepsilon.$$

(Notice that since $\mu(A) = 0$, the $\sigma$-finiteness of $\mu$ is not a necessary constraint to use Theorem 2.28.) Because of (4.3),

$$\int_A f'\,d\lambda \le \sum_{n=1}^{\infty} \int_{I_n} f'\,d\lambda \le \sum_{n=1}^{\infty} \mu(I_n) < \varepsilon.$$

Therefore, $\int_A f'\,d\lambda = 0$.

(ii) $f$ is continuous, because $\mu$ is continuous, i.e., $\mu(\{x\}) = 0$ for all $x \in \mathbb{R}$. $\Box$
4.4 Corollary. Let $\nu$ be a singular-continuous signed Borel-Lebesgue-Stieltjes measure and $f_\nu$ be the signed distribution function induced by $\nu$. Then, $f_\nu$ is singular-continuous.

Proof. Let $\nu = \nu^+ - \nu^-$ be the Jordan decomposition of $\nu$ and $f^+$ and $f^-$ be the corresponding distribution functions. Then, clearly $\nu^+$ and $\nu^-$ are singular-continuous finite positive Borel-Lebesgue-Stieltjes measures. The proof is complete after applying Theorem 4.3. $\Box$
4.5 Theorem. Let $f \in \mathfrak{D}_e$ and $f' = 0$ $\lambda$-a.e. Then $\mu_f \perp \lambda$.

Proof. Denote $\mu = \mu_f$. Then, $\mu$ is a positive $\sigma$-finite Borel-Lebesgue-Stieltjes measure. By the Lebesgue Decomposition Theorem 3.4, Chapter 8, there is a unique decomposition $\mu = \mu_a + \mu_s$ such that $\mu_a \ll \lambda$ and $\mu_s \perp \lambda$. Assume first that $\mu$ is finite. Then, both $\mu_a$ and $\mu_s$ are finite. By Radon-Nikodym's Theorem 2.2 (case 1), there is a nonnegative $L^1$-function $g$ such that $\mu_a = \int g\,d\lambda$. By Lebesgue Corollary 3.10, the function

$$F(x) = \int_{-\infty}^{x} g(u)\,\lambda(du) = \mu_a((-\infty,x])$$

is differentiable $\lambda$-a.e. and $F' = g$ $\lambda$-a.e. On the other hand, $f = F + G$, where $G(x) := \mu_s((-\infty,x])$. By Theorem 4.3 (i), since $\mu_s \perp \lambda$, $G' = 0$ $\lambda$-a.e. and therefore, $F' = 0$ $\lambda$-a.e. and $g = 0$ $\lambda$-a.e. Consequently,

$$\mu_a = \int g\,d\lambda = 0$$

and it leaves $\mu \perp \lambda$.

Now, if $\mu$ is $\sigma$-finite, let $\{\Omega_n\}$ be a countable measurable partition of $\mathbb{R}$ so that $\mu_n = \operatorname{Res}_{\mathfrak{B} \cap \Omega_n}\mu$ (the restriction of $\mu$ to $\Omega_n$) is a finite Borel-Lebesgue-Stieltjes measure, which, according to the above arguments, is orthogonal to $\lambda$, i.e., there is a set $A_n \subset \Omega_n$ such that $\mu_n(A_n) = \lambda(\Omega_n \setminus A_n) = 0$. Therefore, the set

$$A = \sum_{n=1}^{\infty} A_n$$

is such that $\mu(A) = \lambda(A^c) = 0$. $\Box$
4.6 Corollary. Let $f$ be a singular-continuous signed distribution function and let $\nu_f$ be the signed Borel-Lebesgue-Stieltjes measure induced by $f$. Then, $\nu_f$ is a singular-continuous signed measure.

Proof. In the decomposition $f = F - G$ into two distribution functions, each one is singular-continuous. This, as we know, yields the decomposition $\nu_f = \mu_F - \mu_G$ into two finite positive Borel-Lebesgue-Stieltjes measures, each one of which is singular-continuous due to Theorem 4.5. $\Box$

4.7 Definitions.
(i) An extended distribution function $D$ is said to be discrete if it is a monotone nondecreasing step function on any compact interval and it can be represented as

$$D = \sum_{n=-\infty}^{\infty} d_n \mathbf{1}_{A_n}, \tag{4.7}$$

where $\{d_n\} \subset \mathbb{R}$ and $\sum_{n=-\infty}^{\infty} A_n = \mathbb{R}$ is a countable decomposition of $\mathbb{R}$ into semi-open intervals. Due to Theorem 1.2, an extended discrete distribution function can also be defined as a piecewise constant monotone nondecreasing function. If $D = D_1 - D_2$ is a signed distribution function, with $D_i$ being discrete distribution functions, then $D$ is said to be a discrete signed distribution function.

Since any discrete signed distribution function $D$ is almost everywhere constant, its derivative $D'$ exists $\lambda$-a.e. and $D' = 0$ $\lambda$-a.e. Unlike its singular-continuous counterpart, a discrete signed distribution function is not continuous and thus we can alternatively call it singular-discrete.

(ii) Any singular-discrete or singular-continuous signed distribution function is referred to as singular. $\Box$
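To make the notion of a discrete distribution function concrete, the following sketch builds the step function generated by the atoms of a binomial measure. It is a minimal illustration under our own assumptions: the helper names `binomial_atoms` and `make_step_function` and the parameters $n = 4$, $p = 1/2$ are hypothetical choices. The resulting $D$ is piecewise constant, right-continuous, and increases only at the atoms.

```python
from bisect import bisect_right
from math import comb

def binomial_atoms(n, p):
    """Atoms x_k = k with weights C(n, k) p^k (1 - p)^(n - k)."""
    return [(k, comb(n, k) * p**k * (1 - p) ** (n - k)) for k in range(n + 1)]

def make_step_function(atoms):
    """Return D with D(x) = total weight of the atoms located at or before x."""
    points = [x for x, _ in atoms]
    levels, running = [], 0.0
    for _, weight in atoms:
        running += weight
        levels.append(running)

    def D(x):
        i = bisect_right(points, x)      # number of atoms <= x
        return levels[i - 1] if i > 0 else 0.0

    return D

D = make_step_function(binomial_atoms(4, 0.5))
print([D(x) for x in (-1, 0, 0.5, 2, 2.999, 4, 10)])
```

Between two consecutive atoms the function is constant, so its derivative vanishes there, while the jump at each atom equals the weight of that atom.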
4.8 Remark. If $D$ is an extended discrete distribution function given by (4.7), it increases only at points $\{x_n\}$ of an at most countable set $A$ and it induces the following atomic measure

$$\mu = \sum_{n} \delta_n\,\varepsilon_{x_n}, \tag{4.8}$$

where $\varepsilon_{x_n}$ denotes the point mass at $x_n$ and $\delta_n = d_n - d_{n-1} = \mu(\{x_n\}) > 0$. Correspondingly, any signed singular-discrete distribution function induces a unique signed singular-discrete Borel-Lebesgue-Stieltjes measure. Conversely, any signed singular-discrete Borel-Lebesgue-Stieltjes measure generates a unique signed singular-discrete distribution function. $\Box$

4.9 Theorem. Any signed distribution function $f$ can uniquely be decomposed as

$$f = f_a + f_{cs} + f_d, \tag{4.9}$$

where $f_a$, $f_{cs}$, $f_d$ are its absolutely continuous, singular-continuous, and discrete components, respectively. Furthermore, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e.
Proof. By Corollary 3.8, Chapter 8, any signed Borel-Lebesgue-Stieltjes measure $\nu$ can uniquely be decomposed as

$$\nu = \nu_a + \nu_{cs} + \nu_d \tag{4.9a}$$

such that $\nu_a \ll \lambda$, $\nu_{cs} + \nu_d \perp \lambda$, and $\nu_{cs} \perp \nu_d$, where $\nu_{cs}$ and $\nu_d$ are the singular-continuous and singular-discrete components of $\nu$. By the above theorems and propositions, each of the three components of $\nu$ induces a unique signed distribution function of its respective type and therefore, the signed distribution function $f_\nu$ (induced by $\nu$) is of the form

$$f_\nu = f_a + f_{cs} + f_d.$$

This representation is clearly unique. Conversely, if $f$ is a signed distribution function, it generates a unique signed Borel-Lebesgue-Stieltjes measure $\nu$, which by the above decomposition, in turn yields the corresponding unique decomposition

$$f = f_a + f_{cs} + f_d$$

of signed distribution functions. Finally, $f'$ exists $\lambda$-a.e. and $f' = f_a'$ $\lambda$-a.e. $\Box$
The following provides a practical method for determining the decomposition of a distribution function. By Proposition 1.3, any monotone nondecreasing function $f$ can be represented as the sum of a monotone nondecreasing continuous function ($f$ minus its cumulative jump function) and a step function (the cumulative jump function of $f$). The theorem below states how a continuous function of bounded variation can uniquely be represented as a sum of an absolutely continuous and a singular-continuous function.

4.10 Theorem. Let $f \in \mathcal{V}[a,b] \cap \mathcal{C}[a,b]$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique.
Proof.

(i) Existence. Since $f$ is differentiable $\lambda$-a.e. on $[a,b]$, we can define

$$\alpha(x) = f(a) + \int_a^x f'\,d\lambda, \qquad \sigma = f - \alpha. \tag{4.10}$$

Since $f \in \mathcal{V}[a,b]$, it is bounded and it can be decomposed as the difference of two monotone nondecreasing functions. Hence, applying Theorem 1.8 to each of them, we conclude that $f' \in L^1$. Then, by Theorem 3.9, $\alpha \in A[a,b]$. As regards $\sigma$, it appears to be a linear combination of two $\mathcal{V}[a,b]$ functions, and therefore, its derivative $\sigma'$ exists $\lambda$-a.e. and wherever it exists, it is equal to $f' - \alpha' = 0$. Of course, $\sigma \in \mathcal{C}[a,b]$. Therefore, $\sigma$ is singular-continuous.

(ii) Uniqueness. Suppose $f = \alpha + \sigma = \tilde\alpha + \tilde\sigma$. Thus $\alpha - \tilde\alpha = \tilde\sigma - \sigma$. Since $\sigma' = \tilde\sigma' = 0$ $\lambda$-a.e., $(\alpha - \tilde\alpha)' \in [0]_\lambda$. Furthermore, $\alpha - \tilde\alpha \in A[a,b]$, and therefore, by Corollary 3.12, $\alpha - \tilde\alpha = \text{const}$. On the other hand,

$$\alpha(a) - \tilde\alpha(a) = f(a) - f(a) = 0.$$

The latter shows that $\alpha$ is identical to $\tilde\alpha$ and thus, $\sigma$ is identical to $\tilde\sigma$. $\Box$

4.11 Corollary. Let $f \in \mathfrak{D}_s \cap \mathcal{C}[a,b]$. Then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous function. With $\alpha(a) = f(a)$, this representation is unique. $\Box$

4.12 Proposition. If $f$ is a distribution function, then $f$ can be decomposed as the sum $\alpha + \sigma$, where $\alpha$ is an absolutely continuous and $\sigma$ is a singular-continuous distribution function.
Proof. Let $f$ be defined on $[a,b]$. Since $f' \ge 0$, $\alpha(x) = f(a) + \int_a^x f'\,d\lambda$ is monotone nondecreasing. Furthermore, from Theorem 1.8,

$$\int_x^y f'\,d\lambda \le f(y) - f(x) \quad (x < y)$$

and hence

$$\alpha(y) - \alpha(x) \le f(y) - f(x),$$

or in the form $\sigma(x) = f(x) - \alpha(x) \le \sigma(y) = f(y) - \alpha(y)$.

Now, suppose that the domain of $f$ is $\mathbb{R}$. Set

$$\mu((-\infty,x]) = \int_{-\infty}^{x} f'\,d\lambda < \infty.$$

Since $f(x) \to 0$ for $x \to -\infty$, we have for $a \to -\infty$,

$$\alpha(x) = \mu((-\infty,x]) = \int_{-\infty}^{x} f'\,d\lambda \tag{4.12}$$

and $\alpha(x) \to 0$ for $x \to -\infty$ by $\emptyset$-continuity of $\mu$. This also implies that $\sigma(x) \to 0$ for $x \to -\infty$. $\Box$

4.13 Example. Consider the following distribution function:
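$$F(x) = \begin{cases} 0, & x < 0 \\ \tfrac{1}{2}, & 0 \le x < 1 \\ x^2, & 1 \le x < 2 \\ 6, & x \ge 2. \end{cases}$$

We can decompose $F$ as the sum of an absolutely continuous component,

$$F_a(x) = \begin{cases} 0, & x < 1 \\ x^2 - 1, & 1 \le x < 2 \\ 3, & x \ge 2, \end{cases}$$

and a discrete component,

$$F_d(x) = F(x) - F_a(x) = \begin{cases} 0, & x < 0 \\ \tfrac{1}{2}, & 0 \le x < 1 \\ 1, & 1 \le x < 2 \\ 3, & x \ge 2. \end{cases}$$

The corresponding Borel-Lebesgue-Stieltjes measures are as follows:

$$\mu_a(B) = \int_B 2x\,\mathbf{1}_{[1,2)}(x)\,\lambda(dx)$$

and

$$\mu_d = \tfrac{1}{2}\,\varepsilon_0 + \tfrac{1}{2}\,\varepsilon_1 + 2\,\varepsilon_2,$$

where $\varepsilon_x$ denotes the point mass at $x$. The form of $\mu_a$ is due to (4.12). $\Box$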
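The decomposition in Example 4.13 can be verified mechanically. The sketch below is a minimal check under our own assumptions: it reads the distribution function of the example as $F = 0$ for $x < 0$, $1/2$ on $[0,1)$, $x^2$ on $[1,2)$, and $6$ for $x \ge 2$; the helper names `F_abs`, `F_disc`, `jump`, the step `h`, and the rounding of the numerically detected jumps are hypothetical choices made for illustration only.

```python
def F(x):
    # Distribution function as read off Example 4.13.
    if x < 0:
        return 0.0
    if x < 1:
        return 0.5
    if x < 2:
        return x * x
    return 6.0

def F_abs(x):
    # Absolutely continuous part: integral of F'(u) = 2u over [1, min(x, 2)].
    if x < 1:
        return 0.0
    if x < 2:
        return x * x - 1.0
    return 3.0

def jump(func, x, h=1e-9):
    # Approximate size of the jump of func at x (func is right-continuous).
    return func(x) - func(x - h)

# Atoms of the discrete part, located at the discontinuities of F.
atoms = {x: round(jump(F, x), 6) for x in (0.0, 1.0, 2.0)}

def F_disc(x):
    # Cumulative jump function: total mass of the atoms at or before x.
    return sum(w for point, w in atoms.items() if point <= x)

grid = [k / 100.0 for k in range(-100, 301)]
worst_gap = max(abs(F(x) - F_abs(x) - F_disc(x)) for x in grid)
print(atoms, worst_gap)
```

The jumps found at $0$, $1$, and $2$ carry all of the discrete mass, and subtracting the cumulative jump function from $F$ leaves exactly the integral of $F'$, with no singular-continuous remainder in this example.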
4.14 Definition. Let $X$ be a random variable on a probability space $(\Omega,\Sigma,\mathbb{P})$ and valued in $(\mathbb{R},\mathfrak{B})$, and let $\mathbb{P}_X = \mathbb{P}X^*$ be its probability distribution. The random variable is called:

a) continuous if $\mathbb{P}_X \ll \lambda$.

b) discrete if $\mathbb{P}_X$ is an atomic probability measure.

c) singular-continuous if $\mathbb{P}_X$ is a singular-continuous probability measure.

d) of a mixed type if $\mathbb{P}_X$ is a convex combination of at least two types from a, b, or c. $\Box$

PROBLEMS
4.1 Let $f$ be a continuous distribution function. Suppose that given $\varepsilon > 0$, $A = \{x \in [a,b] : f'(x) \text{ exists and } f'(x) > \varepsilon\}$. Prove that
(P4.1)

4.2 Let $f$ be a continuous distribution function. Suppose that given $\varepsilon > 0$, $A = \{x \in [a,b] : f'(x) \text{ exists and } f'(x) < \varepsilon\}$. Show that

4.3 Let $\mu$ be a singular-continuous measure. Using Problems 4.1 and 4.2, prove that $f_\mu$ is a singular-continuous distribution function.

4.4 Give an alternative proof of Corollary 4.6 by using the Vitali Covering Theorem (see Problem 3.1).
Let F be given as
F(x)
=
0,
X< -1
1,
-1 <x
x e,
2 3x , 20,
O<x<1 1<x<2 X>2.
Find the decomposition of F such that F = F 1 + F 2 , where F 1 is absolutely continuous and F 2 is a discrete distribution function and where the corresponding Lebesgue-Stieltjes measures J.L1 and J.L 2 have the sum J.L F ·
NEW TERMS:

singular-continuous function
Cantor singular-continuous function
discrete extended distribution function
discrete signed distribution function
singular-discrete distribution function
singular distribution function
decomposition of a signed distribution function
decomposition of a continuous signed distribution function
continuous random variable
discrete random variable
singular-continuous random variable
random variable of a mixed type
Index A -projection map 34 a1section of a function 363 a1section of a set 359 abelian group 4 7 absolutely continuous component of a signed measure 453 absolutely continuous function 535 absolutely continuous signed measure 437 absolutely continuous measure 348 accumulation point 7 1 , 1 1 1 accumulation point of a filter base 169 accumulation point of a filter 169 accumulation point of a net, criterion for 1 77 accumulation point of a net 173 additive in verse 4 7 additive set function 222 additive group 4 7 N0 41 Alexander Subbase Theorem 146 Alexandrov compactification 1 9 1- 192 algebra 46 algebra over a field 53 algebra of functions 84 algebra (field) 204 algebraic number 44 algebraic operation 4 7 al-Khowarismi, Muhammad 46 almost surely equality 448 almost uniform convergence of a sequence 4 7 4
an tisymmetric binary relation 22 Arzela's Theorem 1 58 Ascoli, Giulio 153 Ascoli's Theorem 153, 155 associative algebraic operation 4 7 associative Ia ws 6 at most countable set 41 atom (v-atom) 454 atom of a singular meas ure 454 atomic probability measure 226 atomic (discrete) measure 226 Axiom of Choice 27 Axiom of Extent 6 B aire's Category Theorem 90 Banach, Stefan 59 Banach space 1 0 1 base parallelepiped 1 3 5 base, a construction of 1 18 base (simple) parallelepiped 1 1 6 base for a topology criterion for 1 15, 1 1 9 base sets 1 15 base for a topology 1 1 5 base neighborhood 1 1 0 Beppo Levi's Corollary 3 1 1 Bernoulli measure 23 1 bijective (onto and one-to-one) map 13 bilinear transformation 49 binary relation on a set 22 binary relation, 1 1 binomial series 1 6 1 binomial random variable 1 97 binomial measure 23 1
554 Bolzano-Weierstrass compactness 93, 145 Borel, Emile 22 1 Borel-Lebesgue measure 260 Borel-Lebesgue measure of a cube under a linear map 4 1 1 Borel- Le besgue-S tiel tj es measure 26 1 Borel-measurable bounded functions 327 Borel measurable function 2 1 7 Borel measure 26 1 , 493 Borel measure space 26 1 Borel outer measure 26 1 Borel u-algebra 210 Borel u-algebra, criterion of 2 13 bound of a function 89 boundary point 76, 1 1 1 boundary of a set 76 , 1 12 bounded function 88 bounded set 26, 82 bounded sequence 74 box parallelepiped 136 box topology 136 branch of a function 12 C anonic chain of partitions 328 canonic representation (expansion) 29 1 Cantor, Georg F. 3, 43, 107 Can tor singular-continuous function 543 Cantor ternary set 269 Cantor's ternary function 524, 525 Cantor's Theorem 42 , 88 Caratheodory, Cons tan tin 235, 236 Caratheodory's exten sion 241 , 242
IND EX
Caratheodory 's Extension Theorem 236 Caratheodory 's extension, uniqueness of 245 , 246 cardinal number 40 carrier 4, 60 Cartesian product 1 1 Cartesian product of a sequence 3 1 Cartesian product of indexed family of sets 32 Cauchy, Augistin-Louis 329 Cauchy in measure sequence of functions 4 7 4 Cauchy integrable function 330 Cauchy-Schwarz inequality 46 1 Cauchy sequence 74 Cauchy sum 329 chain 25 chain rule for positive measures 35 1 chain rule for signed measures 449 chain rule in Banach spaces 394 chain rule in Euclidean spaces 395 change of variables for abstract integrals 34 1 change of variables for a bij ective transfor mation 343 change of variables in Euclidean spaces 413 Chebyshev's inequality 4 74 choice function 27 closed set 69 n closed ball in IR , BorelLebesgue measure of 368 closed ball 70 closed set 109 closed semi-ffi-space 290
IN D EX
closed ffi-space 284 closure of a set 69, 1 1 1 closure point 69, 1 1 1 closure point, criterion for 124 cluster point 1 1 1 e - 1 -space 2 1 6 , 277 e - 1 (n, E; C ) space 282 e - 1 (0, E; IR) space 282 coarser topology 109 cocountable (countable complement) topo logy 1 1 2 codomain 1 2 cofinite (finite complement) topology 1 1 2 commutative algebraic operation 4 7 commutative laws 6 commutative ring 47 compact Hausdorff space 185 compact metric space 92 compact set 92, 144 compact sets in Hausdorff spaces 145 compact set under a continuous function 95 compact set under a continuous map 144 compact topological space 144 compactification 191 compactness, criteria of 93, 97 comparable elements 22 complement 6 complete Caratheodory's extension 242 complete metric space 7 4 completely regular space 183 completeness criteria 87 completeness of a uniform metric space 152 completion of a measure 240
completion of a measure space 240 complex linear space 53 complex measure 430 complex measure space 430 component of a signed measure 453 composition of binary relations 13 composition of continuous functions 129 composition of maps 13 composition of measurable functions 2 1 7 conditional expectation given a random variable 448 conditional expectation given a u-hypothesis 446 conditional probability 446 congruence 23 congruence modulo � 24 conjugate exponents 62, 460 content 223 continuity criteria 7 9 , 8 1 continuity from above at the empty set 223, 423 continuity from above, criterion for 230 continuity from above on a sequence of sets 223 continuity from above on a u-algebra 223 continuity from below, criterion for 229 continuity from below on a sequence of sets 223 continuity from below on a u-algebra 222 continuity of a function at a point 78 , 172 continuity of a function, criteria for 128, 129, 130, 175 continuity of projection
555
556 maps 136 continuous from above set function 423 continuous from below set function 423 continuous function 78 , 128 continuous functions on a dense set 131 continuous measure 348 continuous random variable 445 , 548 continuous singular measure 454 continuously differentiable function 394 continuum 4 1 continuum hypothesis 43 convergence in mean 3 12 convergence in measure of a sequence of functions 472 convergence in probability (stochastic convergence) 483 convergence in the pth mean ( £P-convergence) 463 convergence of a filter 169 con vergence of a filter base 169 convergence of a filter base to a point, criterion for 1 78 convergence of a filter to a point, criterion for 1 77 convergence of a function along a filter base 169 convergence of a net to a point, criteria for 173, 176 , 177 convergent sequence 74, 122 convolution of atomic measures 381 con vol uti on of binomial measures 3 8 1 convolution of functions 383 convolution of a function and
INDEX
measure 382, 383 convolution of measures 3 79 convolution of measures, properties for 379 convolution of point masses 380 convolution of Poisson measures 382 coordinate 32 countable (denumerable) set 41 countably compact topological space 149 counting measure 226 cover 92 cumulative jump function 5 1 9 cylinder 34 D arboux, Gaston 327, 329 Darboux lower sum 1 74, 327 Darboux upper sum 1 7 4, 327 d-bounded function 88 d-bounded set 82 decomposition of a continuous signed distribution function 54 7 decomposition of a positive measure 454 decomposition of a set 8 decomposition of a u-finite signed measure 45 6 decomposition of a signed distribution function 546 DeMorgan's laws 7 dense set 75 , 1 1 1 density 346 derivative of a map 390 derived set 71, 1 1 1 diagonal 124 diameter of a set 82 diffeomorphism 396 difference 6 differentiable map 390
INDEX
Dini 's Theorem 158 Diophantos 46 Dirac delta function 50 Dirac measure (point mass) 224 Dirac delta function , Fourier transform of 5 1 direct integrability 333 direct integral 333 directed set 1 72 Dirichlet's criterion 336 Dirichlet function 8 1 , 297, 330 discrete convolution 49 discrete extended distribution function 543 discrete (atomic) measure 226 discrete metric 60 discrete random variable 279, 549 discrete signed distribution function 545 discrete topology 1 08 disj oint family of sets 8 disj oint sets 6 distance 60 distribution function 224, 263 distributive laws 7 domain 1 1 dominance of functions by sets 195 Dynkin system 204 E gorov's Theorem 476 element of a set 3 elementary content 223 elementary events 4 embedded set 132 embedding 132 empty set (C/J) 4 0-continuity of measure 223 ¢-continuity of signed measure 423 £-bound of a family of
functions 486 equality of functions modulo J.L 304 equicontinuity 153 equipotent sets 40 equivalence kernel of a function 24 equivalence relation 22 equivalence class modulo � (E) 22 equivalent classes generated by a function 24 equivalent metrics 8 1 essential bound 46 7 essential supremum of a function 468 essentially bounded function 467 Euclidean metric (distance) 63 Euclidean (Fro beni us) norm of a matrix 385 Euclidean norm 1 0 1 event 4 expectation of a function of a random variable 343 expectation of a random variable 303 , 343 extended distribution function 224, 263 extended real-valued function 282 extended topology of point wise convergence 284 extendibility of a formatter, criterion for 243 extendible formatter 242 extension of a function 14, 240 extension of a measure 240 F actor sp�ce 32 Fatou's Lemma for functions 3 13
557
558 Fatou 's Lemma for measures and functions 322 Fatou's Lemma for measures 3 18 Fatou's Lemma for meas ures and nonnegative functions 32 1 field 52, 204 filter 1 67 filter base 167 filter base generated by a net 177 filter base for a filter 167 filter generated by a filter base 168 filter that meets a set 180 finer topology 109 finite set 4 1 finite (v-finite) set 422 finite set function 223 finite signed measure 422 finite subadditivity 228 first countable topological space 124 First Axiom of Countability 122 [/] JJ-class 304 formatter of an outer measure 242 Frechet, Maurice 59, 107 Frechet derivative 390 Frobenius (Euclidean ) norm of a matrix 387 F u-set 190 Fubini , Guido 356 Fubini's Theorem 367 Fubini's Theorem for monotone functions 52 1 function 1 1 function continuous at a point, criterion for 128 function continuous at a point 128 function convergent along a
INDEX
net 1 75 function of bounded variation 528, 529 fundamental system of neigh borhoods 1 10 G amma function 324, 325 generalized Holder's inequalities 4 7 1 generator of a subalgebra 1 6 1 geometric measure 429 greatest lower bound (infimum ) 26 group 47 group automorphism 48 group endomorphism 48 group homomorphism 48 group isomorphism 48 G u-set 190 H ahn's Decomposition Theorem 425 Hahn decomposition 425 half-open interval (rectangle) 205 Hausdorff, Felix 59, 107 Hausdorff space, criterion for 125 Hausdorff (T2 or separated ) topological space 122, 1 82 Heine-Borel Theorem 94 hereditary property 143 Holder's inequality 62 Holder's inequality for L00 spaces 470 Holder's inequality for £P spaces 46 1 homeomorphic topological spaces 132 homeomorphism 132 homeomorphisms and Borel u-algebras, relationship between 2 18 homothetic function 277
INDEX
homothetic metric 100 hypercontinuum 43 Ideal 232 idempotence 7 identity function 14 image of a point 12 image measure 277 image of a set 12 improper Riemann integral 333 indefinite integral 346 independent family of random variables 378 independent random variables 378 indicator function 14 indiscrete topology 108 n -stable (intersection-stable) system 205 infinite signed measure 422 inner regular Borel measure 493 integrability of a complex valued function 432 integrable simple function 466 integral of an extended nonnegative function 298 integral of a real-valued function relative to a signed measure 433 integral of an extended realvalued function 302 integral of a nonnegative simple function 296 integral with respect to the counting measure 371 integration by parts formula for Lebesgue-Stieltjes integrals 375, 376 interchanging derivative and integral 3 17 interior of a set 69, 1 10
interior point 69, 1 10 intersection 6 n -stable family of sets 108 into map 13 inverse of function 1 3 in verse image of function 13 in verse image of an open set under a function 79 inverse 47 Inverse Mapping Theorem 397 isometric metric spaces 278 isometry 196 J acobian 390 Jacobian matrix 390 joint distribution of random variables 3 78 Jordan, Camille 22 1 Jordan decomposition of a complex measure 430 Jordan decomposition of a signed measure 426 jump discontinuity 5 17 K ernel 48 K u-set 190 L argest element 26 A0- tail of a net 1 73 lattice 26 least upper bound supremum 26 Lebesgue, Henri 22 1 , 295 Lebesgue decomposition of a signed measure 453 Lebesgue Decomposition Theorem 453 Lebesgue measure of a set under an affine map 406 Lebesgue Dominated Convergence Theorem
559
560 for spaces 463 Lebesgue integral 302 Lebesgue-Stiel tj es integral 302 Lebesgue measure 260 Lebesgue-Stieltjes content 26 1 Le besgue-S tiel tj es measure 26 1 Lebesgue u-algebra 260 Lebesgue elementary content 224 Lebesgue outer measure 260 Lebesgue-Stieltjes elementary content 225 Lebesgue's Dominated Convergence Theorem for functions 3 14 Lebesgue's Dominated Convergence Theorem for measures 3 1 8 Lebesgue's Dominated Con vergence Theorem for measures and functions 322 Lebesgue's Theorem of Riemann integrabi lity 330 left distributive law 4 7 limit inferior 8 limit point of a filter 169 limit of a function at a point 170 limit of a sequence 8 limit point of a sequence 74, 122 limit point of a net 1 73 limit point of a set 7 4, 122 limit superior 8 Lindelof set 92, 144 Lindelof space 92 Lindelof topological space 144 linear functional 103 linear operator 103 linear (total) order 22
INDEX
linear space (vector space) over a field 53 Lipschitz condition 3 87 Lipschitz constant 387 locally com pact Hausdorff space, properties for 186 , 1 87, 1 89, 19 1 , 193 locally compact Hausdorff space, criteria for 1 96 , 197, 198 locally compact space 183 IL space 302 L00 (f2, E, J.L; C) space 468 L00 (0, E, J.Li lR) space 468 1 1 space 6 1 L 1 space 302 fP space 49 fP norm 1 0 1 LP space 50 £P( n, E, p. ; C) space 460 £P(f2, E, J.L; lR) 460 LP-convergence (convergence in the pth mean) 463 lower Baire function 329 lower bound 26 lower Darboux integral 328 lower derivative of a measure 5 10 lower limit topology 120 Lusin's Theorem 506 M ap 1 1 mapping 1 1 matrix supremum norm 388 maximal element 26 maximal filter 169 maximum row sum matrix norm 388 Mean Value Theorem 395 measurability of a complex valued function 432 measurability of an extended real-valued function , criteria for 282, 283
INDEX
measurability of a function, criterion for 2 16 measurable cylinder 357 measurable function 2 16 measurable rectangle 357 measurable sets 205 measurable space 205 measure 223 measure derivative 510 measure differentiable at a point 510 measure generated by an integral 346 measure of a Borel set under a diffeomorphism 4 1 1 measure of a hyperplane 271 measure space 223 mesh of a partition 328 metric 60 metric space 60 metric topology 109 metrizable topological space 109 metrization, 60 !Dl 11< -set 350 minimal element 25 Minkowski 's inequality 63, 46 1 modulus of congruence 24 moment generating function 298, 373 monoid 47 Monotone Convergence Theorem for functions 3 1 2 generalization 3 13 for measures 321 monotone function 5 1 7 monotone nondecreasing function 5 17 monotone nondecreasing sequence of sets 8 monotone nonincreasing function 5 17
monotone nonincreasing sequence of sets 8
monotone system 205
monotone vanishing sequence of sets 8
monotonicity 226
monotonicity of outer measure 236
Moore plane 144
motion 278
motion-invariant measure 278
μ-a.e. property 304
μ-inner regular set 493
μ-integrable function 302
μ-minimal decomposition of a set 228
μ-negligible (negligible) set 240
μ-null (null) set 240
μ-outer regular set 493
μ° measure 236
μ*-measurable set 236
μ*-separability 236
multiplicative group 47
multiplicative inverse 47
multi-valued function 12
N(A)-tail of a sequence 122
N_μ-set 240
N(x,ε)-tail 74
negative (ν-negative) set 422
negative variation of a signed measure 426
negligible (μ-negligible) set 240
neighborhood base at a point 110
neighborhood filter 168
neighborhood of a point 109
neighborhood of a point that meets a set 111
neighborhood system at a point 110
net 172
net cofinally in a set 173
net generated by a neighborhood base 173
net induced by a directed set 172
Nikodym, Otto 295, 350
NLS 100
non-Borel set 273
nonnegative simple functions 288
norm 100
normal probability density function 50, 353
normal random variable 352
normal space 183
normality of a space, criteria for 185, 187
normed linear space (NLS) 100
nowhere dense set 75, 111
ν-maximal set 428
ν-minimal set 428
null (μ-null) set 240
One-point compactification 191
one-to-one (injective, invertible) map 13
onto (surjective) map 13
open ball 65
open ball with respect to the Euclidean metric 66
open ball with respect to the supremum metric 66
open cover 92
open map 140
open neighborhood of a point 109
open parallelepiped (rectangle) 116
open set 67, 108
open subcover 92
orthogonal measures 351
orthogonal transformation 278
orthogonality (singularity) of a signed measure 452
outer measure 235
outer regular Borel measure 493
Pairwise disjoint sets 8
parabolic spline 165
parallelepiped 33
partial derivative 392
partial order relation 22
partition of an interval 8, 174, 327
partition of a set 8
partition of unity for a compact set 195
pick-a-point process 6
piecewise linear function 164
point mass (Dirac measure) 224
pointwise bounded set of functions 155
pointwise convergence, criterion for 139
pointwise limit 89
Poisson measure 231
Poisson random variable 197
positive linear functional 494
positive measure 422
positive (ν-positive) set 422
positive variation of a signed measure 426
power 49
power set 8
pre-image 13
premeasure 223
principle of mathematical induction 28
probability density 352, 444
probability distribution 196
probability distribution function 444
probability measure 223
probability space 223
product map 376
product measure space 368
product metric 63
product σ-algebra 357
product space 63
product topology 124
product topology for arbitrarily many factor spaces 136
product topology for finitely many factor spaces 135
projection map 32
projection of a set on its quotient 23
proper subset 4
proper superset 4
property that holds almost everywhere (μ-a.e.) 304
pseudo-metric 60
pseudo-metric space 60
Quasialgebra 289
quasiring 289
quotient (factor) set 23
quotient set modulo μ 304
quotient topology 129
Radius of an open ball 65
Radon, Johann 222, 295
Radon measure 493
Radon measure induced by a positive linear functional 496
Radon-Nikodym density 348
Radon-Nikodym density of a signed measure 438
Radon-Nikodym derivative 351
Radon-Nikodym derivative of a signed measure 444
Radon-Nikodym Theorem for complex measures 448
Radon-Nikodym Theorem for positive measures 351
Radon-Nikodym Theorem for signed measures 438
random variable 279
random variable of a mixed type 550
range 12
rational parabolic spline 165
rational parallelepiped 116
real linear space 53
rectangle 33, 116, 205
rectangular cylinder 34
refinement 327
refinement of a partition 174
reflexive binary relation 22
regular space 183
regularity of a space, criteria for 184, 185
relative compactness 155
relative topology subspace 109
restriction of a function 240
restriction of a map 14
restriction of outer measure to Σ*-algebra 241
ρ-bound of a function 89
Riemann, B. Georg 107, 329
Riemann integrable function 328
Riemann integral 174, 328
Riesz, Frigyes 421
Riesz-Fischer Theorem 464
Riesz's Representation Theorem 500
right distributive law 47
ring 47
ring (as a collection of sets) 204
ring generated by a semi-ring 212
ring with unity 47
Russell's paradox 3
Sample space 4
scalar 53
Schröder-Bernstein Theorem 44
Second Axiom of Countability 124
second countable topological space 124
Second Separation Axiom 122
section of a function 365
section of a set 359
semi-linear functional 103
semi-linear lattice 289
semi-linear operator 103
semi-linear space 53, 289
semi-norm 101, 460
semi-normed linear space 101
semi-ring 204
semi-ring, property for 211
semi-ffi-space 289
semifield 53
semigroup 47
semigroup of functions 49
separable metric space 93
separable topological space 112
sequence 74
sequential compactness 93, 145
set 3
set-algebraic expression 6
set-algebraic transformation 6
set function 222
separating points set of functions 160
setwise convergence, criterion for 320
setwise convergence of measures 319
setwise limit of measures 319
<S 11< -set 437
σ-additive set function 222
σ-algebra (σ-field) 204
σ-algebra extended by a set 214
σ-algebra generated by a collection of functions 217
σ-algebra generated by a collection of sets 210
σ-algebra generated by a function 216
σ-algebra generated by a set 205, 210
σ-compact space 190
σ-compactness, criteria of 190, 191, 193
σ-field 204
σ-finiteness of a set function 245
σ-finite set function on a sequence of sets 223
σ-finite set function on a system of sets 223
σ-finite signed measure 420
Σ*-σ-algebra 236
σ-subadditivity 228
σ-ideal 232
Σ-norm 102
signed Borel measure 423
signed Borel-Lebesgue-Stieltjes measure 423
signed distribution function 530
signed measure 422
simple cylinder 34
simple function 289
simple parallelepiped 116
singleton 4
singular-continuous component of a signed measure 457
singular components of a signed measure 457
singular distribution function 545
singular-continuous function 543
singular-continuous random variable 550
singular-discrete component of a signed measure 457
singular-discrete distribution function 545
singularity of a measure 353
singularity (orthogonality) of a signed measure 452
smallest element 26
smallest σ-algebra 205
SNLS 101
space of all continuous bounded functions 152
space of all continuous functions 151
space of all continuous real-valued functions 195
space of all n-times differentiable functions 48
spherical coordinate transformation 414
standard (natural) topology 108
stochastic convergence (convergence in probability) 483
Stone, Marshall H. 160
Stone-Weierstrass Theorem 161, 164
Strong Law of Large Numbers 483
stronger (finer, larger) topology 109
subadditivity of outer measure 236
subalgebra 53
subalgebra generated by continuous functions 161
subalgebra of polynomials 164
subbase 116
subbase parallelepiped 119, 135
subcover 92
submultiplicative property of a matrix norm 387
subordinance 195
subset 4
subspace 53, 60, 109
sum of independent normal random variables 383, 384
superset 4
support of a function 195
supremum metric 61, 65, 88
supremum norm 102, 151
symmetric binary relation 22
symmetric difference 6
systems of sets, diagram of 207
Tietze's extension of continuous functions 198
Tietze's Extension Theorem 198
TIH metric 100
Tonelli's Theorem 365
topological space 108
topology 108
topology generated by a metric 81
topology generated by a subbase 116
topology induced by a set 108
topology of pointwise convergence 139
topology on the extended real line 108
T spaces, diagram of 184
T0 space 182
T1 space 182
T1 space, criterion for 183
T2 space (Hausdorff) 122, 182
T3 space 183
T4 space 183
total variation of a complex measure 430
total variation of a function 529
total variation of a signed measure 426
totally bounded set 82
total boundedness, criterion for 153
trace of a filter on a set 180
trace of a set in a topology 109
transitive binary relation 22
translation-invariance of Lebesgue measure 280
translation-invariant Borel measure 266
translation invariant metric 100
triangle inequality 60
two-sided identity 47
Tychonov space 183
Tychonov's Theorem 148
Tychonov topology 137
Ultrafilter 167
ultranet 173
unbounded set 82
uncountable set 41
uniform continuity criterion in compact space 96
uniform convergence 89, 152
uniform convergence, criterion for 152
uniform metric 88, 151
uniform metric space 151
uniformly bounded set of functions 89, 156
uniformly continuous function 82
uniformly integrable family of functions 486
uniformly integrable sequence of functions, criterion for 491
union 6
uniqueness of limits along nets and filters, criteria for 178
unit cylinder 34
unity 47
universe 4
upper Baire function 329
upper bound 26
upper Darboux integral 328
upper derivative of a measure 510
Urysohn, Pavel 187
Urysohn's Corollary 189
Urysohn's Lemma 187
usual topology 108
Vaguely hereditary property 143
variation of a function on a bounded interval 528
variation of a function on an unbounded interval 529
vector 53
vector lattice 53
version of the conditional expectation 446
Vitali's Covering Theorem 541
volume of an ellipsoid 416
Weak Law of Large Numbers 483
Weak (weaker, coarser, smaller) topology 109
Weak topology generated by a family of functions 138
weakly hereditary property 143
weakly regular Borel measure (Radon measure) 493
Weierstrass, Karl 107, 160
Weierstrass Theorem 163
well-ordered set 27
well-ordering principle 27
X $-space 284
Zermelo, Ernst 3
Zermelo's Theorem 27
zero 47
Zorn's Lemma 28
An Introduction to the Theory of Real Functions and Integration

Jewgeni H. Dshalalow

Real Analysis: An Introduction to the Theory of Real Functions and Integration illuminates the principal topics that constitute real analysis. Self-contained, with coverage of topology, measure theory, and integration, it offers a thorough elaboration of the major theorems, notions, and constructions needed not only by those interested in the mathematical sciences but also by those pursuing careers in statistics and probability, operations research, physics, and engineering.

Structured logically and flexibly through the author's many years of teaching experience, the material is presented in three main sections:

Part I covers the preliminaries of set theory and the fundamentals of metric spaces and topology and forms an ideal introduction to topology.

Part II details the basics of measure and integration, offering a solid background in measure theory.

Part III addresses more advanced topics, including elaborated and abstract versions of measure and integration along with their applications to functional analysis, probability theory, and conventional analysis on the real line.

Analysis lies at the core of all mathematical disciplines and is essential to a range of scientific and engineering fields. Real Analysis: An Introduction to the Theory of Real Functions and Integration offers the perfect vehicle for building the foundation needed for more advanced studies.

Jewgeni H. Dshalalow is a Professor of Mathematics at the Florida Institute of Technology in Melbourne, Florida, and the Principal Editor of the Journal of Applied Mathematics and Stochastic Analysis. He is also on editorial boards of three other applied mathematics journals.

CHAPMAN & HALL/CRC
ISBN 1-58488-073-2