Introduction to Measure and Integration


Author: S. J. Taylor


INTRODUCTION TO MEASURE AND INTEGRATION BY

S. J. TAYLOR Professor of Mathematics at Westfield College, University of London

CAMBRIDGE UNIVERSITY PRESS

CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi

Cambridge University Press, The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521098045

© Cambridge University Press 1966, 1973

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published as Chs. 1-9 of Kingman and Taylor, Introduction to Measure and Probability, 1966
Reprinted as Introduction to Measure and Integration 1973
Re-issued in this digitally printed version 2008

A catalogue record for this publication is available from the British Library

Library of Congress Catalogue Card Number: 73-84325
ISBN 978-0-521-09804-5 paperback

CONTENTS

Preface

1 Theory of sets
1.1 Sets
1.2 Mappings
1.3 Cardinal numbers
1.4 Operations on subsets
1.5 Classes of subsets
1.6 Axiom of choice

2 Point set topology
2.1 Metric space
2.2 Completeness and compactness
2.3 Functions
2.4 Cartesian products
2.5 Further types of subset
2.6 Normed linear space
2.7 Cantor set

3 Set functions
3.1 Types of set function
3.2 Hahn-Jordan decompositions
3.3 Additive set functions on a ring
3.4 Length, area and volume of elementary figures

4 Construction and properties of measures
4.1 Extension theorem; Lebesgue measure
4.2 Complete measures
4.3 Approximation theorems
4.4* Geometrical properties of Lebesgue measure
4.5 Lebesgue-Stieltjes measure

5 Definitions and properties of the integral
5.1 What is an integral?
5.2 Simple functions; measurable functions
5.3 Definition of the integral
5.4 Properties of the integral
5.5 Lebesgue integral; Lebesgue-Stieltjes integral
5.6* Conditions for integrability

6 Related spaces and measures
6.1 Classes of subsets in a product space
6.2 Product measures
6.3 Fubini's theorem
6.4 Radon-Nikodym theorem
6.5 Mappings of measure spaces
6.6* Measure in function space
6.7 Applications

7 The space of measurable functions
7.1 Point-wise convergence
7.2 Convergence in measure
7.3 Convergence in pth mean
7.4 Inequalities
7.5* Measure preserving transformations from a space to itself

8 Linear functionals
8.1 Dependence of $\mathcal{L}_2$ on the underlying $(\Omega, \mathcal{F}, \mu)$
8.2 Orthogonal systems of functions
8.3 Riesz-Fischer theorem
8.4* Space of linear functionals
8.5* The space conjugate to $\mathcal{L}_p$
8.6* Mean ergodic theorem

9 Structure of measures in special spaces
9.1 Differentiating a monotone function
9.2 Differentiating the indefinite integral
9.3 Point-wise differentiation of measures
9.4* The Daniell integral
9.5* Representation of linear functionals
9.6* Haar measure

Index of notation

General Index

PREFACE

There are many ways of developing the theory of measure and integration. In the present book measure is studied first as the primary concept and the integral is obtained later by extending its definition from the special case of 'simple' functions using monotone limits. The theory is presented for general measure spaces though at each stage Lebesgue measure and the Lebesgue integral in $\mathbf{R}^n$ are considered as the most important example, and the detailed properties are established for the Lebesgue case.

The book is designed for use either in the final undergraduate year at British universities or as a basic text in measure theory at the postgraduate level. Though the subject is developed as a branch of pure mathematics, it is presented in such a way that it has immediate application to any branch of applied mathematics which requires the basic theory of measure and integration as a foundation for its mathematical apparatus. In particular, our development of the subject is a suitable basis for modern probability theory - in fact this book first appeared as the initial section of the book Introduction to measure and probability (Cambridge University Press, 1966) written jointly with J. F. C. Kingman.

The book is largely self-contained. The first two chapters contain the essential parts of set theory and point set topology; these could well be omitted by a reader already familiar with these subjects. Chapters 3 and 4 develop the theory of measure by the usual process of extension from 'simple sets' to those of a larger class, and the properties of Lebesgue measure are obtained. The integral is defined in Chapter 5, again by extending its definition stage by stage, using monotone sequences. Chapter 6 includes a discussion of product measures and a definition of measure in function space. Convergence in function space is considered in Chapter 7, and Chapter 8 includes a treatment of complete orthonormal sets in Hilbert space. Chapter 9 deals with special spaces; differentiation theory for real functions of a real variable is developed and related to Lebesgue measure theory, and the Haar measure on a locally compact group is defined.

Starred sections contain more advanced material and can be omitted at a first reading.

It will be clear to any reader familiar with the standard treatises that this book owes much to what has gone before. I do not claim any particular originality for the treatment, but the form of presentation owes much to my experience of teaching this subject - at Birmingham University, Cornell University and the University of London - and I readily acknowledge the stimulus received from this source. I am grateful to Dr B. Fishel and Professor G. E. H. Reuter who made helpful criticisms of an early draft, and to a great number of students and colleagues who pointed out misprints and errors in the first edition. However my main debt of gratitude is to Professor J. F. C. Kingman who was co-author of the first edition of this book, and who was much involved in every detail of it.

S.J.T.
London
December 1972

1 THEORY OF SETS

1.1 Sets

We do not want to become involved in the logical foundations of mathematics. In order to avoid these we will adopt a rather naive attitude to set theory. This will not lead us into difficulties because in any given situation we will be considering sets which are all contained in (are subsets of) a fixed set or space or suitable collections of such sets. The logical difficulties which can arise in set theory only appear when one considers sets which are 'too big', like the set of all sets, for instance. We assume the basic algebraic properties of the positive integers, the real numbers, and Euclidean spaces and make no attempt to obtain these from more primitive set theoretic notions. However, we will give an outline development (in Chapter 2) of the topological properties of these sets.

In a space $X$ a set $E$ is well defined if there is a rule which determines, for each element (or point) $x$ in $X$, whether or not it is in $E$. We write $x \in E$ (read '$x$ belongs to $E$') whenever $x$ is an element of $E$, and the negation of this statement is written $x \notin E$. Given two sets $E$, $F$ we say that $E$ is contained in $F$, or $E$ is a subset of $F$, or $F$ contains $E$ and write $E \subset F$ if every element $x$ in $E$ also belongs to $F$. If $E \subset F$ and there is at least one element in $F$ but not in $E$, we say that $E$ is a proper subset of $F$. Two sets $E$, $F$ are equal if and only if they contain the same elements; i.e. if and only if $E \subset F$ and $F \subset E$. In this case we write $E = F$. This means that if we want to prove that $E = F$ we must prove both $x \in E \Rightarrow x \in F$ and $x \in F \Rightarrow x \in E$ (the symbol $\Rightarrow$ should be read 'implies').

Since a set is determined by its elements, one of the commonest methods of describing a set is by means of a defining sentence: thus $E$ is the set of all elements (of $X$) which have the property $P$ (usually delineated). The notation of 'braces' is often used in this situation

$$E = \{x : x \text{ has property } P\}$$

but when we use this notation we will always assume that only elements $x$ in some fixed set $X$ are being considered, as otherwise logical paradoxes can arise. When a set has only a finite number of elements we can write them down between braces, $E = \{x, y, z, a, b\}$. In particular $\{x\}$ stands for the set containing the single element $x$. One must always distinguish between the element $x$ and the set $\{x\}$; for example, the empty set $\varnothing$ defined below is not the same as the class $\{\varnothing\}$ containing the empty set.

Empty set (or null set)

The set which contains no elements is called the empty set and will be denoted by $\varnothing$. Clearly $\varnothing = \{x : x \neq x\}$, and $\varnothing \subset E$ for all sets $E$. In fact since $\varnothing$ contains no element, any statement made about the elements of $\varnothing$ is true (as well as its negative).

There are some sets which will be considered very frequently, and we consistently use the following notation:

$\mathbf{Z}$, for the set of positive integers,
$\mathbf{Q}$, for the set of rationals,
$\mathbf{R} = \mathbf{R}^1$, for the set of all real numbers,
$\mathbf{C}$, for the set of complex numbers,
$\mathbf{R}^n$, for Euclidean n-dimensional space, i.e. the set of ordered n-tuples $(x_1, x_2, \ldots, x_n)$ where all the $x_i$ are in $\mathbf{R}$.

We assume that the reader is familiar with the algebraic and order properties of these sets. In particular we will use the fact that $\mathbf{Z}$ is well ordered, that is, that every non-empty set of positive integers has a least member: this is equivalent to the principle of mathematical induction.

We frequently have to consider sets of sets, and occasionally sets of sets of sets. It is convenient to talk of classes of sets and collections of classes to distinguish these types of set, and we will use italic capitals $A, B, \ldots$ for sets, script capitals $\mathcal{A}, \mathcal{B}, \mathcal{C}, \ldots$ for classes and Greek capitals $\Delta, \Gamma, \ldots$ for collections. Thus $C \in \mathcal{C}$ is read 'the set $C$ belongs to the class $\mathcal{C}$'; and $\mathcal{A} \subset \mathcal{B}$ means that every set in the class $\mathcal{A}$ is also in the class $\mathcal{B}$.

Cartesian product

Given two sets $E$, $F$ we define the Cartesian (or direct) product $E \times F$ to be the set of all ordered pairs $(x, y)$ whose first element $x \in E$ and whose second element $y \in F$. This clearly extends immediately to the product $E_1 \times E_2 \times \ldots \times E_n$ of any finite number of sets. In particular it is immediate that $\mathbf{R}^n$, Euclidean n-space, is the Cartesian product of $n$ copies of $\mathbf{R}$. For an infinite indexed class $\{E_i, i \in I\}$ of sets, the product $\prod_{i \in I} E_i$ is the set of elements of the form $\{a_i, i \in I\}$ with $a_i \in E_i$ for each $i \in I$.
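The notation of this section can be tried out on finite sets. The following is a minimal Python sketch, not part of the original text; the space `X`, the property "is a perfect square" and the set `F` are assumptions chosen only for illustration.

```python
from itertools import product
from math import isqrt

# Hypothetical finite space X; every set below is a subset of it.
X = set(range(1, 21))

# E = {x in X : x has property P}, with P(x) = "x is a perfect square".
E = {x for x in X if isqrt(x) ** 2 == x}

# Cartesian product E x F as a set of ordered pairs (x, y).
F = {"a", "b"}
E_times_F = set(product(E, F))

# The element 4 and the one point set {4} are different objects.
is_element = 4 in E          # 4 belongs to E
is_subset = {4} <= E         # {4} is contained in E

# The empty set is contained in every set, and differs from the class {empty}.
empty = frozenset()
class_of_empty = {empty}
```

Restricting the comprehension to a fixed `X` mirrors the convention above that only elements of some fixed set are considered.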

Exercises 1.1

1. Describe in words the following sets:
(i) $\{t \in \mathbf{R} : 0 \le t \le 1\}$;
(ii) $\{(x, y) \in \mathbf{R}^2 : x^2 + y^2 \le 1\}$;
(iii) $\{k \in \mathbf{Z} : k = n^2 \text{ for some } n \in \mathbf{Z}\}$;
(iv) $\{k \in \mathbf{Z} : n \mid k \Rightarrow n = 1 \text{ or } k\}$;
(v)
(vi) $\{B : B \subset E\}$.

2. Show that the relation $\subset$ is reflexive and transitive, but not in general symmetric.

3. The sets $X \times (Y \times Z)$ and $(X \times Y) \times Z$ are different but there is a natural correspondence between them.

4. Suppose $x$ is an element of $X$ and $A = \{x\}$. Which of the following statements are correct: $x \in A$, $x \in X$, $x \subset A$, $x \subset X$, $A \in X$, $A \subset X$, $A \in x$?

5. Suppose $P(a)$ and $Q(a)$ are two propositions about the element $a$ such that $P(a) \Rightarrow Q(a)$. Show that $\{a : P(a)\} \subset \{a : Q(a)\}$.

1.2 Mappings

Suppose $A$ and $B$ are any two sets: a function from $A$ to $B$ is a rule which, for each element in $A$, determines a unique element in $B$. We talk of the function $f$ and use the notation $f: A \to B$ to denote a function $f$ defined on $A$ and taking values in $B$. For any $x \in A$, $f(x)$ means the value of the function $f$ at the point $x$ and is therefore an element of the set $B$: we therefore avoid the terminology (common in older text books) 'the function $f(x)$'. The words mapping and transformation are often used as synonyms for function. For a given function $f: A \to B$, we call $A$ the domain of $f$ and the subset of $B$ consisting of the set of values $f(x)$ for $x$ in $A$ is called the range of $f$ and may be denoted $f(A)$. When $f(A) = B$ we say that $f$ is a function from $A$ onto $B$. Given a function $f: A \to B$, by definition $f(x)$ is a uniquely determined element of $B$ for each $x \in A$; if in addition for each $y$ in $f(A)$ there is a unique $x \in A$ (we know there is at least one) with $y = f(x)$ we say that the function $f$ is (1, 1). Another shorter way of saying this is that $f: A \to B$ is (1, 1) if and only if for $x_1, x_2 \in A$,

$$x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2).$$

Given $f: A \to B$ there is an associated $f: \mathcal{A} \to \mathcal{B}$, where $\mathcal{A}$ is the class of all subsets of $A$ and $\mathcal{B}$ is the class of all subsets of $B$, defined by

$$f(E) = \{y \in B : \exists\, x \in E \text{ with } y = f(x)\}$$

for each $E \subset A$ (the symbol $\exists$ should be read 'there exists': i.e. the set described by $\{x \in E : y = f(x)\}$ is not empty). There is also a function $f^{-1}: \mathcal{B} \to \mathcal{A}$ defined by

$$f^{-1}(F) = \{x \in A : f(x) \in F\},$$

for each $F \subset B$. The set $f^{-1}(F)$ is called the inverse image of $F$ under $f$. Note that if $y \in B - f(A)$, then the inverse image $f^{-1}(\{y\})$ of the one point set $\{y\}$ is the empty set. If $f: A \to B$ is (1, 1) and $y \in f(A)$, then it is clear that $f^{-1}(\{y\})$ is a one point subset of $A$, so that in this case (only) we can think of $f^{-1}$ as a function from $f(A)$ to $A$. In particular, if $f: A \to B$ is (1, 1) and onto there is a function $f^{-1}: B \to A$ called the inverse function of $f$ such that $f^{-1}(y) = x$ if and only if $y = f(x)$.

Now suppose $f: A_1 \to B$, $g: A_2 \to B$ are functions such that $A_1 \supset A_2$ and $f(x) = g(x)$ for all $x$ in $A_2$: under these conditions we say that $f$ is an extension of $g$ (from $A_2$ to $A_1$) and $g$ is the restriction of $f$ (to $A_2$). For example, if

$$g(x) = \cos x \quad (x \in \mathbf{R});$$

$$f(x + iy) = \cos x \cosh y - i \sin x \sinh y \quad (x + iy \in \mathbf{C});$$

then $f: \mathbf{C} \to \mathbf{C}$ is an extension of $g: \mathbf{R} \to \mathbf{C}$ from $\mathbf{R}$ to $\mathbf{C}$, and the usual convention of designating both $f$ and $g$ by 'cos' obscures the differences in their domains.

If we have two functions $f: A \to B$, $g: B \to C$ the result of applying the rule for $g$ to the element $f(x)$ defines an element in $C$ for all $x \in A$. Thus we have defined a function $h: A \to C$ which is called the composition of $f$ and $g$ and denoted $g \circ f$ or $g(f)$. Thus, for $x \in A$,

$$h(x) = (g \circ f)(x) = g(f(x)) \in C.$$
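For finite sets the image $f(E)$, the inverse image $f^{-1}(F)$ and the composition $g \circ f$ can be computed directly from their definitions. The sketch below is an illustration only, not part of the original text; the domain `A` and the functions `f` and `g` are hypothetical.

```python
def image(f, E):
    """f(E) = {y : there exists x in E with y = f(x)}."""
    return {f(x) for x in E}

def inverse_image(f, A, F):
    """f^{-1}(F) = {x in A : f(x) in F}, where A is the domain of f."""
    return {x for x in A if f(x) in F}

def compose(g, f):
    """The composition g o f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))

A = {-2, -1, 0, 1, 2}       # hypothetical domain
f = lambda x: x * x         # f: A -> B, not (1, 1) on A
g = lambda y: y + 1         # g: B -> C
h = compose(g, f)           # h = g o f

range_of_f = image(f, A)                     # the range f(A)
preimage = inverse_image(f, A, {1, 4})       # inverse image of {1, 4}
empty_preimage = inverse_image(f, A, {3})    # 3 lies outside f(A)
```

Since this `f` is not (1, 1), the inverse image of a one point set such as $\{1\}$ contains two points, so $f^{-1}$ cannot be regarded as a function on points here.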

Note that, if $f: A \to B$ is (1, 1) and onto we could define the inverse function $f^{-1}: B \to A$ as the unique function from $B$ to $A$ such that $(f \circ f^{-1})(y) = y$ for all $y \in B$, $(f^{-1} \circ f)(x) = x$ for all $x \in A$.

Sequence

Given any set $X$ a finite sequence of $n$ points of $X$ is a function from $\{1, 2, \ldots, n\}$ to $X$. This is usually denoted by $x_1, x_2, \ldots, x_n$ where $x_i \in X$ is the value of the function at the integer $i$. Similarly, an infinite sequence in $X$ is a function from $\mathbf{Z}$ to $X$ (where $\mathbf{Z}$ is the set of positive integers). This is denoted $x_1, x_2, \ldots$, or $\{x_i\}$ $(i = 1, 2, \ldots)$, or just $\{x_i\}$ where $x_i$ is the value of the function at $i$, and is called the ith element of the sequence. Given a sequence $\{n_i\}$ of positive integers (that is, a function $f: \mathbf{Z} \to \mathbf{Z}$ where $f(i) = n_i$) such that $n_i > n_j$ for $i > j$, and a sequence $\{x_i\}$ of elements of $X$ (a function $g: \mathbf{Z} \to X$) it is clear that the composite function $g \circ f: \mathbf{Z} \to X$ is again a sequence. Such a sequence is called a subsequence of $\{x_i\}$ and is denoted $\{x_{n_i}\}$ $(i = 1, 2, \ldots)$. Thus $\{x_{n_i}\}$ is a subsequence of $\{x_i\}$ if $n_i \in \mathbf{Z}$ for all $i \in \mathbf{Z}$, and $i > j \Rightarrow n_i > n_j$.

We can think of a sequence as a point in the product space $\prod_{i=1}^{\infty} X_i$ where $X_i = X$ for all $i$. More generally a point in the product space $\prod_{i \in I} X_i$ with $X_i = X$ for $i \in I$ can be identified as a function $f: I \to X$.

Exercises 1.2

1. Suppose $f: \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = \sin x$. Describe each of the following sets:

$f^{-1}\{0\}$, $f^{-1}\{1\}$, $f^{-1}\{2\}$, $f^{-1}\{y : 0 \le y < \ldots\}$

2. Suppose $f: A \to B$ is any function. Prove
(i) $E \subset f^{-1}(f(E))$, for each $E \subset A$;
(ii) $F \supset f(f^{-1}(F))$, for each $F \subset B$;
and give examples in which there is not equality in (i), (ii).

3. Suppose $f: A \to B$, $g: B \to C$ are functions and $h = g \circ f$: show that $h^{-1}(E) = f^{-1}[g^{-1}(E)]$ for each $E \subset C$.

4. If $A \subset B \subset C$, $f: A \to X$, $g: B \to X$, $h: C \to X$ are such that $h$ is an extension of $g$ and $g$ is an extension of $f$, prove that $f$ is the restriction of $h$ to $A$.

5. Show that the restriction of a (1, 1) mapping is (1, 1).

6. Suppose $m, n \in \mathbf{Z}$, $A$ is a set with $m$ distinct elements and $B$ is a set with $n$ distinct elements. How many distinct functions are there from $A$ to $B$?

1.3 Cardinal numbers

If there is a mapping $f: A \to B$ which is (1, 1) and onto, then it is reasonable to say that there are the same number of elements in $A$ as there are in $B$. In fact, for finite sets, the elementary process of counting sets up such a mapping from the set being counted to the integers $\{1, 2, \ldots, n\}$, and from experience we know that if the same finite set of objects is counted in different ways we always end up with the same integer $n$. (This fact can also be deduced from primitive axioms about the integers.) We say that the set $A$ is equivalent to the set $B$, and write $A \sim B$ if there is a mapping $f: A \to B$ which is (1, 1) and onto. It is clear that $\sim$ is an equivalence relation between sets in the sense that it is reflexive, symmetric and transitive, and we can therefore form equivalence classes of sets with respect to this relation. Such an equivalence class of sets is called a cardinal number, but by noting that the equivalence class is determined by any one of its members, we see that the easiest way to specify a cardinal number is to specify a representative set. Thus any set which can be mapped (1, 1) onto the representative set will have the same cardinal. As is usual we shall use the following notation:

the cardinal of the empty set $\varnothing$ is $0$;
the cardinal of the set of integers $\{1, 2, \ldots, n\}$ is $n$;
the cardinal of the set $\mathbf{Z}$ of positive integers is $\aleph_0$;
the cardinal of the set $\mathbf{R}$ of real numbers is $c$.

Since $\mathbf{Z}$ is ordered we can clearly order the cardinals of finite sets by saying that $A$ has a smaller cardinal than $B$ if $A$ is equivalent to a proper subset of $B$. This definition does not work for infinite sets as the mappings

$$n \to 2n \quad \text{or} \quad n \to n^2$$

map $\mathbf{Z}$ onto a proper subset of $\mathbf{Z}$ and are (1, 1). Instead we say that the cardinal of a set $A$ is less than the cardinal of the set $B$ if there is a subset $B_1 \subset B$ such that $A \sim B_1$ but no subset $A_1 \subset A$ such that $A_1 \sim B$. From this definition of ordering we consider the following statements, where $m$, $n$, $p$ denote cardinals:

(i) $m < n$, $n < p \Rightarrow m < p$;
(ii) $m \le n$, $n \le m \Rightarrow m = n$;
(iii) at least one of the relations $m < n$, $m = n$, $n < m$ holds.

Now (i) follows easily from the definition, for let $M$, $N$, $P$ be sets with cardinals $m$, $n$, $p$ and suppose $N_1 \subset N$, $P_1 \subset P$ with $M \sim N_1$, $N \sim P_1$. The mapping $f: N \to P_1$ when restricted to $N_1$ gives an equivalence $N_1 \sim P_2 \subset P_1$ so that $M \sim P_2 \subset P$. Further if $P \sim M_1 \subset M$ the mapping $g: M \to N_1$ when restricted to $M_1$ shows $P \sim M_1 \sim N_2 \subset N$ which contradicts $n < p$. (ii) can also be deduced from the definition (see exercise 1.3 (5)), though this requires quite a complicated argument: (ii) is known as the Schröder-Bernstein theorem. However, the truth of (iii), that all cardinals are comparable, cannot be proved without the use of an additional axiom (known as the axiom of choice) which we will discuss briefly in § 1.6. If we assume the axiom of choice or something equivalent, then (iii) is also true.

A set of cardinal $\aleph_0$ is said to be enumerable. Thus such a set $A \sim \mathbf{Z}$ so that the elements of $A$ can be 'enumerated' as a sequence $a_1, a_2, \ldots$ in which each element of $A$ occurs once and only once. A set which has a cardinal $m \le \aleph_0$ is said to be countable. Thus $E$ is countable if there is a subset $A \subset \mathbf{Z}$ such that $E \sim A$, and a set is countable if it is either finite or enumerable. Given any infinite set $B$ we can choose, by induction, a sequence $\{b_i\}$ of distinct elements in $B$ and if $B_1$ is the set of elements in $\{b_i\}$ the cardinal of $B_1$ is $\aleph_0$. Hence if $m$ is an infinite cardinal we always have $m \ge \aleph_0$. By using the equivalence

$$b_i \leftrightarrow b_{2i}$$

between $B_1$ and the proper subset $B_2 \subset B_1$ where $B_2$ contains the even elements of $\{b_i\}$ and the identity mapping $b \leftrightarrow b$ for $b \in B - B_1$ we have an equivalence between $B = B_1 \cup (B - B_1)$ and $B_2 \cup (B - B_1)$, a proper subset of $B$. This shows that any infinite set $B$ contains a proper subset of the same cardinal.

In order to see that some infinite sets have cardinal $> \aleph_0$ it is sufficient to recall that the set $\{x \in \mathbf{R} : 0 < x < 1\}$ cannot be arranged as a sequence.† Now

$$f(x) = \frac{1}{\pi} \tan^{-1} x + \frac{1}{2}, \quad x \in \mathbf{R}$$

defines a mapping $f: \mathbf{R} \to (0, 1)$ which is (1, 1) and onto so that $\mathbf{R}$ has the same cardinal as the interval $(0, 1)$ and we have $c > \aleph_0$. It is worth remarking that a famous unsolved problem of mathematics concerns the existence or otherwise of cardinals $m$ such that $c > m > \aleph_0$. The axiom that no such exist, that is that $m > \aleph_0 \Rightarrow m \ge c$, is known as the continuum hypothesis.

The fact that there are infinitely many different infinite cardinals follows from the next theorem, which compares the cardinal of a set $E$ with the cardinal of the class of subsets of $E$.

Theorem 1.1. For any set $E$, the class $\mathcal{C} = \mathcal{C}(E)$ of all subsets of $E$ has a cardinal greater than that of $E$.

Proof. For sets $E$ of finite cardinal $n$, one can prove directly that the cardinal of $\mathcal{C}(E)$ is $2^n$, and an induction argument easily yields $n < 2^n$ for $n \in \mathbf{Z}$. However, the case of finite sets $E$ is included in the general proof, so there is nothing gained by this special argument.

Suppose $\mathcal{D}$ is the class of one point sets $\{x\}$ with $x \in E$. Then $\mathcal{D} \subset \mathcal{C}$ and $E \sim \mathcal{D}$ because of the mapping $x \mapsto \{x\}$. Therefore it is sufficient to prove by (ii) above, that $\mathcal{C}$ is equivalent to no subset $E_1 \subset E$. Suppose then that $\phi: \mathcal{C} \to E_1$ is (1, 1) and onto and let $\chi: E_1 \to \mathcal{C}$ denote the inverse function. Let $A$ be the subset of $E_1$ defined by

$$A = \{x \in E_1 : x \notin \chi(x)\}.$$

Then $A \in \mathcal{C}$ so that $\phi(A) = x_0 \in E_1$. Now if $x_0 \in A$, $\chi(x_0) = A$ does not contain $x_0$ which is impossible, while if $x_0 \notin A$, then $x_0$ is not in $\chi(x_0)$ so that $x_0 \in A$. In either case we have a contradiction.

† See, for example, J. C. Burkill, A First Course in Mathematical Analysis (Cambridge, 1962).

It is possible to build up systematically an arithmetic of cardinals. This will only be needed for finite cardinals and $\aleph_0$ in this book, so we restrict the results to these cases and discuss them in the next section.
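The diagonal set used in the proof of Theorem 1.1 can be checked by brute force on a small finite set. The sketch below is not from the original text and uses a simplified variant of the argument: instead of a map from a subset $E_1$ onto the class of subsets, it runs over every map $\chi$ from a hypothetical three-element set $E$ into its class of subsets and confirms that $A = \{x : x \notin \chi(x)\}$ is never a value of $\chi$.

```python
from itertools import combinations, product

def power_set(E):
    """All subsets of E, as frozensets."""
    items = sorted(E)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(E, chi):
    """A = {x in E : x not in chi(x)} for a map chi from E to subsets of E."""
    return frozenset(x for x in E if x not in chi[x])

E = {0, 1, 2}                  # hypothetical small set
subsets = power_set(E)         # the class of all subsets of E
elements = sorted(E)

missed_every_time = True
for values in product(subsets, repeat=len(elements)):
    chi = dict(zip(elements, values))     # one candidate map from E to subsets
    A = diagonal_set(E, chi)
    if A in chi.values():                 # would mean chi(x0) = A for some x0
        missed_every_time = False
```

A finite check of this kind proves nothing about infinite sets; it only makes the role of the set $A$ in the proof concrete.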

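Exercises 1.3

1. Show that $(0, 1] \sim (0, 1)$ by considering $f$, defined by

$$f(x) = \tfrac{3}{2} - x, \quad \text{for } \tfrac{1}{2} < x \le 1;$$
$$= \tfrac{3}{4} - x, \quad \text{for } \tfrac{1}{4} < x \le \tfrac{1}{2};$$
$$= \tfrac{3}{8} - x, \quad \text{for } \tfrac{1}{8} < x \le \tfrac{1}{4};$$
$$\ldots$$
$$= \tfrac{3}{2^n} - x, \quad \text{for } \tfrac{1}{2^n} < x \le \tfrac{1}{2^{n-1}}.$$

Deduce that all intervals $(a, b)$, $(a, b]$, $[a, b]$ or $[a, b)$ with $a$ ... $\mathbf{R}$ which is monotonic, i.e. $a < x_1 < x_2$ ... $d(x) = f(x+0) - f(x-0)$ satisfies $1/(n+1) < d(x) < 1/n$ and prove this is finite for all $n$ in $\mathbf{Z}$.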

3. Show that $\mathbf{R}^2 \sim \mathbf{R}$. Hint.

defines a (1, 1) mapping between pairs of decimal expansions and single expansions of numbers in $(0, 1)$. Modify this mapping to eliminate the difficulty caused by the fact that decimal expansions are not quite unique.

4. Prove that a finite set $E$ of cardinal $m$ has $2^m$ distinct subsets.

5. Suppose $A_1 \subset A$, $B_1 \subset B$, $A_1 \sim B$ and $A \sim B_1$. Construct a mapping to show that $A \sim B$.

Hint. Suppose $f: A \to B_1$, $g: B \to A_1$ are (1, 1) and onto. Say $x$ (in either $A$ or $B$) is an ancestor of $y$ if and only if $y$ can be obtained from $x$ by successive applications of $f$ and $g$. Decompose $A$ into 3 sets $A_o$, $A_e$, $A_i$ according as to whether the element $x$ has an odd, even or infinite number of ancestors and decompose $B$ similarly. Consider the mapping which agrees with $f$ on $A_e$ and $A_i$, and with $g^{-1}$ on $A_o$.

1.4 Operations on subsets

For two sets $A$, $B$ we define the union of $A$ and $B$ (denoted $A \cup B$) to be the set of elements in either $A$ or $B$ or both. The intersection of $A$ and $B$ (denoted $A \cap B$) is the set of elements in both $A$ and $B$.

If $A \subset X$, the complement of $A$ with respect to $X$ (denoted $X - A$) is the set of those elements in $X$ which are not in $A$. We also use $(A - B)$ to denote the set of elements in $A$ which are not in $B$ for arbitrary sets $A$, $B$. For any two sets $A$, $B$ the symmetric difference (denoted $A \,\triangle\, B$) is $(A - B) \cup (B - A)$, that is the set of elements which are in one of $A$, $B$ but not in both. Note that $A \,\triangle\, B = B \,\triangle\, A$.

These finite operations on sets are best illustrated by means of a Venn diagram. In this some figure (like a rectangle) denotes the whole space $X$ and suitable geometrical figures inside denote the subsets $A$, $B$, etc. It is well known that drawing does not prove a theorem, but the reader is advised to illustrate the results of the next paragraph by means of suitable Venn diagrams (see Figure 1).

Fig. 1

The operations $\cup$, $\cap$, $\triangle$ satisfy algebraic laws, some of which are listed below. We assume the reader is familiar with these, so proofs are omitted.

(i) $A \cup B = B \cup A$, $A \cap B = B \cap A$;
(ii) $(A \cup B) \cup C = A \cup (B \cup C)$, $(A \cap B) \cap C = A \cap (B \cap C)$;
(iii) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$;
(iv) $A \cup \varnothing = A$, $A \cap \varnothing = \varnothing$;
(v) if $A \subset X$, then $A \cup X = X$, $A \cap X = A$;
(vi) if $A \subset X$, $B \subset X$, then $X - (A \cup B) = (X - A) \cap (X - B)$, $X - (A \cap B) = (X - A) \cup (X - B)$;
(vii) $A \cup B = (A \,\triangle\, B) \,\triangle\, (A \cap B)$, $A - B = A \,\triangle\, (A \cap B)$.

A similarity between the laws satisfied by $\cap$, $\cup$ and the usual algebraic laws for multiplication and addition can be observed (in fact the older notation for these operations is product and sum) but the differences should also be noted: in particular the distributive laws, (iii) above, are different in the algebra of sets. (vi) above will be generalized and proved as a lemma; it is known as de Morgan's law.
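Although a computation proves nothing in general, the laws above can be spot-checked on randomly chosen finite sets. The following is a minimal sketch, not part of the original text; the space `X`, the sampling scheme and the number of trials are arbitrary assumptions. Python's `|`, `&`, `-` and `^` on sets play the roles of union, intersection, difference and symmetric difference.

```python
import random

random.seed(0)                       # reproducible sample
X = set(range(12))                   # hypothetical whole space

def random_subset(space):
    """Pick each point of the space independently with probability 1/2."""
    return {x for x in space if random.random() < 0.5}

def laws_hold(A, B, C):
    """Check laws (iii), (vi) and (vii) for subsets A, B, C of X."""
    return all([
        A & (B | C) == (A & B) | (A & C),      # (iii), first law
        A | (B & C) == (A | B) & (A | C),      # (iii), second law
        X - (A | B) == (X - A) & (X - B),      # (vi)
        X - (A & B) == (X - A) | (X - B),      # (vi)
        A | B == (A ^ B) ^ (A & B),            # (vii)
        A - B == A ^ (A & B),                  # (vii)
    ])

all_ok = all(
    laws_hold(random_subset(X), random_subset(X), random_subset(X))
    for _ in range(200)
)
```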

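Given a class $\mathcal{C}$ of subsets $A$, the union $\bigcup \{A ; A \in \mathcal{C}\}$ is the set of elements which are in at least one set $A$ belonging to $\mathcal{C}$ and the intersection $\bigcap \{A ; A \in \mathcal{C}\}$ is the set of elements which are in every set $A$ of $\mathcal{C}$. If the class $\mathcal{C}$ is indexed so that $\mathcal{C}$ consists precisely of the sets $A_\alpha$ $(\alpha \in I)$, then we use the notations $\bigcup_{\alpha \in I} A_\alpha$, $\bigcap_{\alpha \in I} A_\alpha$ for the union and intersection of the class. In particular when $\mathcal{C}$ is finite or enumerable it is usual to assume that it is indexed by $\{1, 2, \ldots, n\}$ or $\mathbf{Z}$ respectively and the notation is

$$\bigcup_{i=1}^{n} A_i, \quad \bigcap_{i=1}^{n} A_i, \quad \bigcup_{i=1}^{\infty} A_i, \quad \bigcap_{i=1}^{\infty} A_i.$$

When the class $\mathcal{C}$ is empty, that is $I = \varnothing$, we adopt the conventions

$$\bigcup_{\alpha \in I} E_\alpha = \varnothing, \quad \bigcap_{\alpha \in I} E_\alpha = X, \text{ the whole space.}$$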

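This ensures that certain identities are valid without restriction on $I$.

Lemma. Suppose $E_\alpha$, $\alpha \in I$ is a class of subsets of $X$, and $E_\beta$ is one set of the class, then

(i) $\bigcap_{\alpha \in I} E_\alpha \subset E_\beta \subset \bigcup_{\alpha \in I} E_\alpha$;
(ii) $X - \bigcup_{\alpha \in I} E_\alpha = \bigcap_{\alpha \in I} (X - E_\alpha)$;
(iii) $X - \bigcap_{\alpha \in I} E_\alpha = \bigcup_{\alpha \in I} (X - E_\alpha)$.

Proof. (i) This is immediate from the definition.

(ii) Suppose $x \in X - \bigcup_{\alpha \in I} E_\alpha$, then $x \in X$ and $x$ is not in $\bigcup_{\alpha \in I} E_\alpha$, that is $x$ is not in any $E_\alpha$, $\alpha \in I$ so that $x \in X - E_\alpha$ for every $\alpha$ in $I$, and $x \in \bigcap_{\alpha \in I} (X - E_\alpha)$. Conversely if $x \in \bigcap_{\alpha \in I} (X - E_\alpha)$, then for every $\alpha \in I$, $x$ is in $X$ but not in $E_\alpha$, so $x \in X$ but $x$ is not in $\bigcup_{\alpha \in I} E_\alpha$; that is, $x \in X - \bigcup_{\alpha \in I} E_\alpha$.

(iii) Similar to (ii).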

Two sets $A$, $B$ are said to be disjoint if they have no elements in common; that is, if $A \cap B = \varnothing$. A disjoint class is a class $\mathcal{C}$ of sets such that any two distinct sets of $\mathcal{C}$ are disjoint. The union of a disjoint class is sometimes called a disjoint union.

Lemma. Given a finite or enumerable union of sets $\bigcup_{i=1}^{p} E_i$ (where $p$ can be $+\infty$), there are subsets $F_i \subset E_i$ such that the sets $F_i$ are disjoint and

$$\bigcup_{i=1}^{p} E_i = \bigcup_{i=1}^{p} F_i.$$

Proof. We write out the details for $p = \infty$. Only obvious changes are needed for $p \in \mathbf{Z}$. Put $C = \bigcup_{i=1}^{\infty} E_i$ and define

$$F_1 = E_1, \quad F_n = E_n - \bigcup_{i=1}^{n-1} E_i \quad (n = 2, 3, \ldots).$$

Then $F_n \subset E_n$ for all $n$, and if $i > j$, $F_i$ and $E_j$ are disjoint, so that $F_i$, $F_j$ must be disjoint. Further if $x \in C$, and $n$ is the smallest integer (which exists because $\mathbf{Z}$ is well ordered) such that $x \in E_n$; then $x$ belongs to $E_n$ but not to $E_i$ for $i < n$. Thus $x \in F_n$ and so $x \in \bigcup_{i=1}^{\infty} F_i$. Thus $C \subset \bigcup_{i=1}^{\infty} F_i$, and the reverse inclusion is immediate.
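The construction in this proof is easy to carry out for a finite list of finite sets. The sketch below is an illustration only, not part of the original text; the function name `disjointify` and the example sets are hypothetical. It builds $F_n$ by removing from $E_n$ everything already covered by $E_1, \ldots, E_{n-1}$.

```python
def disjointify(sets):
    """Return F_1, F_2, ... with F_1 = E_1 and F_n = E_n minus (E_1 u ... u E_{n-1})."""
    seen = set()              # running union of the sets handled so far
    result = []
    for E in sets:
        result.append(set(E) - seen)
        seen |= set(E)
    return result

E = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]    # hypothetical example
F = disjointify(E)
```

Some of the $F_n$ may be empty, as the last one is here; the lemma only requires that they are disjoint subsets with the same union.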

Theorem 1.2. The union of a countable class of countable sets is a countable set.

Proof. By the process of the above lemma we can replace the countable union by a countable disjoint union of sets which are subsets of those in the original class, each of which is therefore countable. Each countable set can be enumerated as a finite or infinite sequence. So we have

$$C = \bigcup_{i=1}^{\infty} E_i \text{ a disjoint union}, \quad E_i = \{x_{ij}\} \quad (j = 1, 2, \ldots),$$

where the infinite union may be a finite one and some (or all) of the sequences $\{x_{ij}\}$ may be finite. Put $F_n = \{x_{ij} : i + j = n\}$, then $F_n$ is a finite set containing at most $(n + 1)$ elements. The sets $F_n$ are disjoint, and $C$ can be enumerated by first enumerating $F_1$, then $F_2$, and so on. If $F_n = \varnothing$ for $n > N$ then $C$ is finite; otherwise it is enumerable. ∎

Corollary. The set $\mathbf{Q}$ of rational numbers is enumerable.

Proof. $\mathbf{Q} = \bigcup_{n=1}^{\infty} E_n$, where $E_n$ is the set of real numbers of the form $p/n$ where $p$ is an integer. $E_n$ is enumerable since

$$0, +1, -1, +2, -2, \ldots, +p, -p, \ldots$$

is an enumeration of the set of integers. ∎

For a sequence $E_1, E_2, \ldots$ of sets, we put

$$\limsup E_i = \bigcap_{n=1}^{\infty} \Bigl( \bigcup_{i=n}^{\infty} E_i \Bigr), \quad \liminf E_i = \bigcup_{n=1}^{\infty} \Bigl( \bigcap_{i=n}^{\infty} E_i \Bigr),$$

and if $\{E_i\}$ is such that $\limsup E_i = \liminf E_i$ we say that the sequence converges to the set $E = \limsup E_i = \liminf E_i$. For any sequence $\{E_i\}$, $\limsup E_i$ is the set of those elements which are in $E_i$ for infinitely many $i$ and $\liminf E_i$ is the set of those elements which are in all but a finite number of the sets $E_i$.

A sequence $\{E_i\}$ is said to be increasing if, for each positive integer $n$, $E_n \subset E_{n+1}$; it is said to be decreasing if, for each positive integer $n$, $E_n \supset E_{n+1}$. A monotone sequence of sets is one which is either increasing or decreasing. Note that any monotone sequence converges to a limit for

(i) If $\{E_i\}$ is increasing,

$$\bigcup_{i=n}^{\infty} E_i = \bigcup_{i=1}^{\infty} E_i, \quad \bigcap_{i=n}^{\infty} E_i = E_n \quad \text{for all } n,$$

so that $\limsup E_i = \liminf E_i = \bigcup_{i=1}^{\infty} E_i$; while

(ii) If $\{E_i\}$ is decreasing,

$$\bigcup_{i=n}^{\infty} E_i = E_n, \quad \bigcap_{i=n}^{\infty} E_i = \bigcap_{i=1}^{\infty} E_i \quad \text{for all } n,$$

so that $\limsup E_i = \liminf E_i = \bigcap_{i=1}^{\infty} E_i$.
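The definitions of $\limsup$ and $\liminf$ can be evaluated exactly for a sequence of finite sets that repeats periodically after a finite start. The sketch below is not part of the original text and rests on that assumption: the sequence is given as a hypothetical `prefix` followed by a non-empty `cycle` repeated for ever, so each infinite tail contains only the sets in the rest of the prefix together with every set of the cycle.

```python
def tails(prefix, cycle):
    """For n = 1, ..., len(prefix) + 1, list the distinct sets occurring in E_n, E_{n+1}, ...

    Each list stands for an infinite tail: the rest of the prefix plus one
    full copy of the cycle. Later tails contain the same sets as the last one.
    """
    return [list(prefix[n:]) + list(cycle) for n in range(len(prefix) + 1)]

def lim_sup(prefix, cycle):
    """Intersection over n of the unions of the tails."""
    unions = [set().union(*tail) for tail in tails(prefix, cycle)]
    return set.intersection(*unions)

def lim_inf(prefix, cycle):
    """Union over n of the intersections of the tails."""
    inters = [set.intersection(*map(set, tail)) for tail in tails(prefix, cycle)]
    return set().union(*inters)

prefix = [{0, 9}]                    # hypothetical finite start
cycle = [{1, 2, 3}, {2, 3, 4}]       # these two sets then alternate for ever

upper = lim_sup(prefix, cycle)       # points in infinitely many E_i
lower = lim_inf(prefix, cycle)       # points in all but finitely many E_i
```

In this example the two limits differ, so the sequence does not converge, and the points 0 and 9 of the prefix appear in neither limit.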

Indicator function

Given a subset $E$ of a space $X$, the function $\chi_E: X \to \mathbf{R}$ defined by

$$\chi_E(x) = \begin{cases} 1 & \text{for } x \in E, \\ 0 & \text{for } x \in X - E \end{cases}$$

is called the indicator function of $E$ (many books use the term 'characteristic function' for $\chi_E$, but we will avoid this term because characteristic function has a different meaning in probability theory). The correspondence between subsets of $X$ and indicator functions is clearly (1, 1) for

$$E = \{x : \chi_E(x) = 1\},$$

and we will use indicator functions as a convenient tool for carrying out operations on sets.

Exercises 1.4

1. Prove each of the following set identities:

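$A - B = A - (A \cap B) = (A \cup B) - B$,
$A \cap (B - C) = (A \cap B) - (A \cap C)$,
$(A - B) - C = A - (B \cup C)$,
$A - (B - C) = (A - B) \cup (A \cap C)$,
$(A - B) \cap (C - D) = (A \cap C) - (B \cup D)$,
$E \,\triangle\, (F \,\triangle\, G) = (E \,\triangle\, F) \,\triangle\, G$,
$E \cap (F \,\triangle\, G) = (E \cap F) \,\triangle\, (E \cap G)$,
$E \,\triangle\, \varnothing = E$, $E \,\triangle\, X = X - E$,
$E \,\triangle\, E = \varnothing$, $E \,\triangle\, (X - E) = X$,
$E \,\triangle\, F = (E \cup F) - (E \cap F)$.

2. With respect to which of the operations $\triangle$, $\cup$, $\cap$ does the class of all subsets of $X$ form a group?

3. Show that $E \subset F$ if and only if $X - F \subset X - E$.

4. Prove that $A \,\triangle\, B = C \,\triangle\, D$ if and only if $A \,\triangle\, C = B \,\triangle\, D$, by showing that either equality is equivalent to the statement that every point of $X$ is in 0, 2 or 4 of the sets $A$, $B$, $C$, $D$.

5. Show that if $I_1 \subset I_2$, then

$$\bigcap_{\alpha \in I_1} E_\alpha \supset \bigcap_{\alpha \in I_2} E_\alpha, \quad \bigcup_{\alpha \in I_1} E_\alpha \subset \bigcup_{\alpha \in I_2} E_\alpha.$$

6. A real number is said to be algebraic if it is a zero of a polynomial $a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0$ where the coefficients $a_i$ are integers. Defining the 'height' of a polynomial to be the integer

$$h = n + |a_n| + |a_{n-1}| + \ldots + |a_0|,$$

show that there are only finitely many polynomials of height $h$, and deduce that the set of all algebraic real numbers is enumerable. Deduce that the set of transcendental numbers (real numbers which are not algebraic) has cardinal $> \aleph_0$.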

7. Show that in $R^n$, the set $Q^n$ of points $(x_1, x_2, \dots, x_n)$, where each coordinate $x_i$ is rational, is an enumerable set. Further, the class of all spheres with centres at points of $Q^n$ and rational radii is an enumerable class.

8. Show that any sequence of disjoint sets converges to $\emptyset$. Show that $\{E_n\}$ is a convergent sequence if and only if there is no point x of X such that each of $x \in E_n$, $x \in X - E_n$ holds for infinitely many n. Suppose

$$E_n = \begin{cases} \{x: 0 < x \le 1 - (1/n)\} & n \text{ odd}, \\ \{x: (1/n) \le x < 1\} & n \text{ even}; \end{cases}$$

show that $\{E_n\}$ converges but is not monotone.

9. For any sequence $\{E_n\}$ of sets prove
(i) $\limsup E_n$, $\liminf E_n$ are unaltered by the omission or alteration of any finite number of sets in the sequence;
(ii) for any set F,

$$F - \limsup E_n = \liminf (F - E_n), \qquad F - \liminf E_n = \limsup (F - E_n).$$

10. If $E_n = A$ for n even, $E_n = B$ for n odd, show that $\limsup E_n = A \cup B$, $\liminf E_n = A \cap B$.

11. Can an uncountable union of distinct sets be countable?

12. If $\{E_n\}$ is a sequence of sets and

$$D_1 = E_1, \qquad D_n = D_{n-1} \triangle E_n \quad \text{for } n = 2, 3, \dots,$$

show that the sequence $\{D_n\}$ converges to a limit if and only if $\lim E_n = \emptyset$.

13. Show that $\chi_E(x) \le \chi_F(x)$ for all x in X if and only if $E \subset F$. Suppose $A = E \cup F$, $B = E \cap F$, $C = E \triangle F$: show that

$$\chi_A = \chi_E + \chi_F - \chi_B, \qquad \chi_B = \chi_E \cdot \chi_F, \qquad \chi_C = |\chi_E - \chi_F|.$$

Generalise the first two of these identities to finite unions and intersections.

14. If $\chi_n$ is the indicator function of $E_n$ (n = 1, 2, ...) and $A = \limsup E_n$, $B = \liminf E_n$, show that, for all x in X,

$$\chi_A(x) = \limsup_{n \to \infty} \chi_n(x), \qquad \chi_B(x) = \liminf_{n \to \infty} \chi_n(x).$$
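The use of indicator functions as a tool for set operations, introduced in this section, can be tried out on a finite space. The sketch below assumes a small space `X` and two subsets `E`, `F` chosen purely for illustration; the helper `indicator` is a hypothetical name. It evaluates the arithmetic expressions $\chi_E + \chi_F - \chi_E \chi_F$, $\chi_E \chi_F$ and $|\chi_E - \chi_F|$ pointwise and compares them with the indicators of $E \cup F$, $E \cap F$ and $E \triangle F$; this is a numerical check on one example, not a proof.

```python
def indicator(E):
    """Return the indicator function of the set E as a Python function."""
    return lambda x: 1 if x in E else 0


X = set(range(6))          # illustrative finite space
E, F = {0, 1, 2}, {2, 3}   # illustrative subsets

chi_E, chi_F = indicator(E), indicator(F)

# Pointwise comparison of arithmetic on indicators with set operations.
union_ok = all(indicator(E | F)(x) == chi_E(x) + chi_F(x) - chi_E(x) * chi_F(x) for x in X)
meet_ok = all(indicator(E & F)(x) == chi_E(x) * chi_F(x) for x in X)
symdiff_ok = all(indicator(E ^ F)(x) == abs(chi_E(x) - chi_F(x)) for x in X)

# The (1, 1) correspondence: E is recovered as the set where its indicator is 1.
recovered = {x for x in X if chi_E(x) == 1}
```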

1.5 Classes of subsets

Up to the present our operations have been defined on the class $\mathcal{C}$ of all subsets of a given set X. This class is too large for many purposes and it is usual to restrict attention to subclasses of $\mathcal{C}$. However it is important that the subclasses considered have sufficient structure, and we now define various types of class starting with the simplest.

1. Semi-ring

A class $\mathcal{S}$ of subsets such that
(i) $\emptyset \in \mathcal{S}$;
(ii) $A, B \in \mathcal{S} \Rightarrow A \cap B \in \mathcal{S}$;
(iii) $A, B \in \mathcal{S} \Rightarrow A - B = \bigcup_{i=1}^{n} E_i$, where the $E_i$ are disjoint sets in $\mathcal{S}$,
is called a semi-ring. (Note that many authors, following Von Neumann, who first defined the concept, have an additional condition in the definition of a semi-ring: instead of (iii) they assume that if $A, B \in \mathcal{S}$ and $B \subset A$ there is a finite class $C_0, C_1, \dots, C_n$ of sets of $\mathcal{S}$ such that

$$B = C_0 \subset C_1 \subset \dots \subset C_n = A \quad \text{and} \quad D_i = C_i - C_{i-1} \in \mathcal{S} \quad \text{for } i = 1, 2, \dots, n.$$

This stronger condition causes complications and we weaken it since it is unnecessary.) An important example of a semi-ring of subsets of R is the class $\mathcal{P} = \mathcal{P}^1$ of finite intervals (a, b] which are open on the left and closed on the right. Similarly, $\mathcal{P}^n$ consisting of the rectangles in $R^n$ of the form $\{(x_1, x_2, \dots, x_n): a_i < x_i \le b_i\}$ is a semi-ring in $R^n$.
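Conditions (ii) and (iii) can be made concrete for the half-open intervals (a, b] in R. The following sketch represents an interval (a, b] with a < b as a Python tuple `(a, b)` and uses `None` for the empty set; these representations and the function names are assumptions made for the example only. The difference of two such intervals comes out as at most two disjoint intervals of the same kind, which is what condition (iii) requires.

```python
def intersect(i, j):
    """Intersection of the half-open intervals i = (a, b] and j = (c, d]."""
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo < hi else None      # None stands for the empty set


def difference(i, j):
    """(a, b] - (c, d] as a list of at most two disjoint half-open intervals.

    Assumes both arguments are genuine intervals, i.e. a < b and c < d.
    """
    (a, b), (c, d) = i, j
    pieces = [(a, min(b, c)),     # points of (a, b] that are <= c
              (max(a, d), b)]     # points of (a, b] that are > d
    return [(lo, hi) for lo, hi in pieces if lo < hi]


middle_removed = difference((0, 10), (3, 5))    # two pieces
right_removed = difference((0, 10), (5, 20))    # one piece
all_removed = difference((0, 10), (-1, 11))     # empty difference
adjacent = intersect((0, 1), (1, 2))            # (0,1] and (1,2] are disjoint
```

The same computation also shows why the intervals do not form a ring: `middle_removed` consists of two pieces and so is not itself a single interval.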

2. Ring

This is any non-empty class $\mathcal{R}$ of subsets such that

$$A, B \in \mathcal{R} \Rightarrow A \cap B \in \mathcal{R} \text{ and } A \triangle B \in \mathcal{R}.$$

Since $\emptyset = A \triangle A$, $A \cup B = (A \triangle B) \triangle (A \cap B)$, and $A - B = A \triangle (A \cap B)$ we see that a ring is a class of sets closed under the operations of union, intersection, and difference and $\emptyset \in \mathcal{R}$. Thus a ring is certainly also a semi-ring. As examples the system $\{\emptyset, X\}$ is a ring as is the class of all subsets of X. However, the class $\mathcal{P}$ of half-open intervals in R is not a ring, for it is not closed under the operation of difference.

3. Field (or algebra)

Any class $\mathcal{A}$ of subsets of X which is a ring and contains X is called a field. Thus a ring is a field if and only if it is closed under the operation of taking the complement. The class of all finite subsets of a space X is a ring, but is not a field unless X is finite. In R the class of all bounded subsets is a ring but not a field.

4. Sigma ring

A ring $\mathcal{R}$ is called a $\sigma$-ring if it is closed under countable unions, i.e. if

$$A_i \in \mathcal{R} \ (i = 1, 2, \dots) \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathcal{R}.$$

Now put $A = \bigcup_{i=1}^{\infty} A_i$ and use the identity $\bigcap_{i=1}^{\infty} A_i = A - \bigcup_{i=1}^{\infty} (A - A_i)$ to see that a $\sigma$-ring is also closed under countable intersections. Hence if $\mathcal{R}$ is a $\sigma$-ring and $\{A_n\}$ is a sequence of sets from $\mathcal{R}$ then $\limsup A_n$ and $\liminf A_n$ both belong to $\mathcal{R}$.

5. Sigma field ($\sigma$-field, Borel field, $\sigma$-algebra)

Any class $\mathcal{F}$ of sets which contains the whole space X and is a $\sigma$-ring is called a $\sigma$-field. Alternatively, a $\sigma$-field is a field which is closed under countable unions. For any space X, the class of all countable subsets will be a $\sigma$-ring, but will only be a $\sigma$-field if X is countable.

6. Monotone class

Any class $\mathcal{M}$ of subsets such that, for any monotone sequence $\{E_n\}$ of sets in $\mathcal{M}$ we have $\lim E_n \in \mathcal{M}$ is called a monotone class. It is clear that a $\sigma$-ring is a monotone class, and any monotone class which is a ring is also a $\sigma$-ring since

$$E_i \in \mathcal{M} \ (i = 1, 2, \dots) \Rightarrow \bigcup_{i=1}^{n} E_i \in \mathcal{M},$$

and $\bigcup_{i=1}^{n} E_i$ is monotone so that $\bigcup_{i=1}^{\infty} E_i = \lim_{n \to \infty} \bigcup_{i=1}^{n} E_i$ is in $\mathcal{M}$.

We now use the term z-class to denote any one of the types 2, 3, 4, 5, 6 above (but not a semi-ring), and we consider a collection of z-classes.

Lemma. If $\mathcal{C}_\alpha$, for $\alpha \in I$, is a z-class, then $\mathcal{C} = \bigcap_{\alpha \in I} \mathcal{C}_\alpha$ is a z-class.

Proof. Each of these z-classes is defined in terms of closure with respect to specified operations. Since each $\mathcal{C}_\alpha$ is closed with respect to the operations, the resulting subset will be in $\mathcal{C}_\alpha$ for all $\alpha \in I$ and therefore in $\mathcal{C}$, so that $\mathcal{C}$ is also a z-class.

Note. The intersection of a collection of semi-rings need not be a semi-ring.

Theorem 1.3. Given any class $\mathcal{C}$ of subsets of X there is a unique z-class $\mathcal{S}$ containing $\mathcal{C}$ such that, if $\mathcal{Q}$ is any other z-class containing $\mathcal{C}$ we must have $\mathcal{Q} \supset \mathcal{S}$.

Remark. The z-class $\mathcal{S}$ obtained in this theorem is called the z-class generated by $\mathcal{C}$. It is clearly the smallest z-class of subsets which contains $\mathcal{C}$.

Proof. The class of all subsets of X is a z-class containing $\mathcal{C}$. Put

$$\mathcal{S} = \bigcap \{\mathcal{Q}: \mathcal{Q} \supset \mathcal{C} \text{ and } \mathcal{Q} \text{ is a z-class}\}.$$

Then $\mathcal{S}$ is a z-class by the lemma and it clearly satisfies the conditions of the theorem.
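For a finite space the generated ring of Theorem 1.3 can actually be listed. The sketch below does not follow the proof (which intersects all z-classes containing the given class); instead it assumes X is finite and repeatedly adds intersections and symmetric differences until nothing new appears, which reaches the same smallest ring in that case. The function names are illustrative only.

```python
from itertools import combinations


def generated_ring(sets):
    """Smallest ring containing `sets`, found by closing under the two ring operations.

    Terminates because only finitely many subsets can be formed from the
    finitely many points involved.
    """
    ring = {frozenset(s) for s in sets}
    ring.add(frozenset())                      # a ring is non-empty and A symdiff A is empty
    while True:
        new = {c for a, b in combinations(ring, 2) for c in (a & b, a ^ b)} - ring
        if not new:
            return ring
        ring |= new


def generated_field(sets, space):
    """Smallest field on `space` containing `sets`: a ring that also contains the space."""
    return generated_ring(list(sets) + [space])


X = frozenset({1, 2, 3, 4})
A = frozenset({1, 2})

field_of_A = generated_field([A], X)
ring_of_two = generated_ring([{1, 2}, {2, 3}])
```

Here `field_of_A` consists of the four sets $\emptyset$, A, X - A, X, and `ring_of_two` consists of all unions of the three `atoms' {1}, {2}, {3}.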

In certain special cases one can specify the nature of the z-class generated by a given class.

Theorem 1.4. The ring $\mathcal{R}(\mathcal{S})$ generated by a semi-ring $\mathcal{S}$ consists precisely of the sets which can be expressed in the form

$$E = \bigcup_{k=1}^{n} A_k$$

of a finite disjoint union of sets of $\mathcal{S}$.

Proof. (i) The ring $\mathcal{R}(\mathcal{S})$ certainly must contain all sets of this form, since it has to be closed under finite unions.

(ii) To see that the system $\mathcal{Q}$ of sets of this type form a ring suppose

$$A = \bigcup_{k=1}^{n} A_k, \qquad B = \bigcup_{k=1}^{m} B_k$$

and put $C_{ij} = A_i \cap B_j \in \mathcal{S}$. Then since the sets $C_{ij}$ are disjoint and

$$A \cap B = \bigcup_{i=1}^{n} \bigcup_{j=1}^{m} C_{ij},$$

the system $\mathcal{Q}$ is closed under intersections. Now from the definition of a semi-ring, an induction argument shows that

$$A_i = \bigcup_{j=1}^{m} C_{ij} \cup \bigcup_{k=1}^{r_i} D_{ik} \quad (i = 1, \dots, n),$$

$$B_j = \bigcup_{i=1}^{n} C_{ij} \cup \bigcup_{k=1}^{s_j} E_{kj} \quad (j = 1, 2, \dots, m);$$

where the finite sequences $\{D_{ik}\}$ ($k = 1, \dots, r_i$) and $\{E_{kj}\}$ ($k = 1, \dots, s_j$) consist of disjoint sets in $\mathcal{S}$. It follows now that

$$A \triangle B = \bigcup_{i=1}^{n} \Big( \bigcup_{k=1}^{r_i} D_{ik} \Big) \cup \bigcup_{j=1}^{m} \Big( \bigcup_{k=1}^{s_j} E_{kj} \Big)$$

so that the system $\mathcal{Q}$ is also closed under the operation of taking the symmetric difference.

Example. We have already seen that $\mathcal{P}$, the class of intervals (a, b] in R, is a semi-ring. The generated ring is the class $\mathcal{E}$ of finite unions of disjoint half-open intervals. $\mathcal{E}$ is called the class of elementary figures in R. Similarly, the elementary figures in $R^n$ form the class $\mathcal{E}^n$ of finite disjoint unions of half-open rectangles from $\mathcal{P}^n$.

The next theorem is often important in proving that a given class is a $\sigma$-ring.

Theorem 1.5. If $\mathcal{R}$ is any ring, the monotone class $\mathcal{M}(\mathcal{R})$ generated by $\mathcal{R}$ is the same as the $\sigma$-ring $\mathcal{S}(\mathcal{R})$ generated by $\mathcal{R}$.

Corollary. Any monotone class $\mathcal{M}$ which contains a ring $\mathcal{R}$ contains the $\sigma$-ring $\mathcal{S}(\mathcal{R})$ generated by $\mathcal{R}$.

Proof. Since a $\sigma$-ring is always a monotone class and $\mathcal{S}(\mathcal{R}) \supset \mathcal{R}$ we must have $\mathcal{S}(\mathcal{R}) \supset \mathcal{M}(\mathcal{R})$, denoted by $\mathcal{M}$. Hence it is sufficient to show that $\mathcal{M}$ is a $\sigma$-ring, and this will follow if we can prove that $\mathcal{M}$ is a ring. For any set F, let $\mathcal{Q}(F)$ be the class of sets E for which $E - F$, $F - E$, $E \cup F$ are all in $\mathcal{M}$. Then if $\mathcal{Q}(F)$ is not empty it is easy to check that it is a monotone class. It is clear that $\mathcal{Q}(F) \supset \mathcal{R}$ for any $F \in \mathcal{R}$ so that $\mathcal{Q}(F) \supset \mathcal{M}$. Hence,

$$E \in \mathcal{M}, \ F \in \mathcal{R} \Rightarrow E \in \mathcal{Q}(F) \Rightarrow F \in \mathcal{Q}(E)$$

by the symmetry of the definition of the class $\mathcal{Q}$, and it follows that $\mathcal{M} \subset \mathcal{Q}(E)$ since $\mathcal{Q}(E)$ is a monotone class. But the truth of this for every $E \in \mathcal{M}$ implies that $\mathcal{M}$ is a ring.

In §1.2 we discussed mappings $f: X \to Y$ and saw that any such mapping induced a set mapping $f^{-1}$ on the class of all subsets of Y. If $f^{-1}$ is restricted to a special class $\mathcal{C}$ of subsets in Y, then the image of $\mathcal{C}$ under $f^{-1}$ will be a class of subsets in X. The interesting thing is that the structure of the class $\mathcal{C}$ is often preserved by such a mapping $f^{-1}$.

Theorem 1.6. Suppose $\mathcal{C}$ is a z-class of subsets of Y, $f: X \to Y$ is any mapping and $f^{-1}(\mathcal{C})$ denotes the class of subsets of X of the form $f^{-1}(E)$, $E \in \mathcal{C}$. Then $f^{-1}(\mathcal{C})$ is a z-class of subsets of X.

Proof. It is easy to check that the mapping $f^{-1}: \mathcal{C} \to f^{-1}(\mathcal{C})$ commutes with each of the set operations union, symmetric difference, countable union and monotone limit. The closure of $\mathcal{C}$ with respect to any of these operations therefore implies the closure of $f^{-1}(\mathcal{C})$ with respect to the same operation.

Exercises 1.5

1. Give an example of two semi-rings $\mathcal{S}_1$, $\mathcal{S}_2$ whose intersection is not a semi-ring.

2. Prove that any finite field is also a $\sigma$-field.

3. If $\mathcal{R}$ is a ring of sets and we define operations $\odot$ = multiplication and $\oplus$ = addition by

$$E \odot F = E \cap F, \qquad E \oplus F = E \triangle F$$

show that $\mathcal{R}$ becomes a ring in the algebraic sense.

4. If $\mathcal{R}$ is a ring and $\mathcal{C}$ is the class of all subsets E of X such that either E or (X - E) is in $\mathcal{R}$, show that $\mathcal{C}$ is a field.

5. What is the ring $\mathcal{R}(\mathcal{C})$ generated by each of the following classes: (i) for a single fixed E, $\mathcal{C} = \{E\}$; (ii) for a single E, $\mathcal{C}$ is the class of all subsets of E; (iii) $\mathcal{C}$ is the class of all sets with precisely 2 points?

6. Prove that if A is any subset of a space X, $A \ne \emptyset$ or X, then the $\sigma$-field $\mathcal{F}(A)$ generated by the set A is the class $\{\emptyset, A, X - A, X\}$.

7. If $\mathcal{C}$ is a non-empty class of sets show that every set in the $\sigma$-ring generated by $\mathcal{C}$ is a subset of a countable union of sets of $\mathcal{C}$.

8. For each of the following classes $\mathcal{C}$ describe the $\sigma$-field, $\sigma$-ring and monotone class generated by $\mathcal{C}$.
(i) P is any permutation of the points of X, i.e. any transformation from X to itself which is (1, 1) and onto, and $\mathcal{C}$ is the class of subsets of X left invariant by P.
(ii) X is $R^3$, Euclidean 3-space, $\mathcal{C}$ is the class of all cylinders in X, i.e. sets E such that $(x, y, z_1) \in E \Rightarrow (x, y, z_2) \in E$ for all $z_2 \in R$.
(iii) X = $R^2$, the plane, $\mathcal{C}$ is the class of all sets which are subsets of a countable union of horizontal lines.

9. Suppose X is the set of rational numbers in 0 < x < 1, and let $\mathcal{Q}$ be the set of intervals of the form $\{x \in X; a < x \le b\}$ where $0 \le a \le b < 1$; $a, b \in X$. Show that $\mathcal{Q}$ is a semi-ring and every set in $\mathcal{Q}$ is either empty or infinite. Show that the $\sigma$-ring generated by $\mathcal{Q}$ contains all subsets of X.

10. Given a function $f: X \to Y$, and a class of subsets $\mathcal{A}$ of X, $f(\mathcal{A})$ will denote the class of subsets of Y of the form f(A), $A \in \mathcal{A}$. What is the relation between f(A - B) and f(A) - f(B)? Give an example in which $f(A \cap B) \ne f(A) \cap f(B)$. Show that it is possible to have a ring $\mathcal{A}$ such that $f(\mathcal{A})$ is not a ring. Give an example of a mapping $f: X \to Y$ and a semi-ring $\mathcal{S}$ in Y such that $f^{-1}(\mathcal{S})$ is not a semi-ring. For any class $\mathcal{N}$ of sets in Y show that

$$\mathcal{R}(f^{-1}(\mathcal{N})) = f^{-1}(\mathcal{R}(\mathcal{N})), \qquad \mathcal{F}(f^{-1}(\mathcal{N})) = f^{-1}(\mathcal{F}(\mathcal{N})),$$

where $\mathcal{R}(\mathcal{C})$ is the ring generated by $\mathcal{C}$, and $\mathcal{F}(\mathcal{C})$ is the $\sigma$-field generated by $\mathcal{C}$.
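The claim in the proof of Theorem 1.6, that $f^{-1}$ commutes with the set operations, and the contrast with the direct image raised in exercise 10, can both be seen on a small example. The sketch below assumes a mapping given as a Python dictionary on a three-point set; the mapping, the sets and the helper names are all illustrative assumptions.

```python
from itertools import chain, combinations


def image(f, A):
    """Direct image f(A) of a subset A of the domain."""
    return frozenset(f[x] for x in A)


def preimage(f, E):
    """Inverse image of a subset E of the codomain."""
    return frozenset(x for x in f if f[x] in E)


def subsets(S):
    """All subsets of the finite set S."""
    S = list(S)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))]


f = {1: "a", 2: "a", 3: "b"}     # a mapping X -> Y which is not (1, 1)
Y = {"a", "b", "c"}

# The inverse image commutes with union and symmetric difference for every pair.
preimage_commutes = all(
    preimage(f, E | F) == preimage(f, E) | preimage(f, F)
    and preimage(f, E ^ F) == preimage(f, E) ^ preimage(f, F)
    for E in subsets(Y) for F in subsets(Y)
)

# The direct image need not commute with intersection.
A, B = {1}, {2}
image_of_meet = image(f, A & B)
meet_of_images = image(f, A) & image(f, B)
```

In this example A and B are disjoint, so `image_of_meet` is empty, while both images contain the point "a".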

1.6 Axiom of choice

Any non-empty set A contains at least one element x, and in the ordinary process of logic one can choose a particular element from a non-empty set. By using the principle of induction it follows that one can choose an element from each of a sequence of non-empty sets, but difficulty arises if one has to make the simultaneous choice of an element from each set of a non-countable class $\mathcal{C}$. The assumption that such a choice is possible can be formulated in the following equivalent forms, known as the axiom of choice:

(1) Given a non-empty class $\mathcal{C}$ of disjoint non-empty sets $E_\alpha$, there is a set $G \subset \bigcup \{E_\alpha: E_\alpha \in \mathcal{C}\}$ such that $G \cap E_\alpha$ is a single point set for each $E_\alpha \in \mathcal{C}$.

(2) For a non-empty class $\mathcal{C}$ of non-empty sets $E_\alpha$, there is a function (called a choice function) $f: \mathcal{C} \to \bigcup \{E_\alpha: E_\alpha \in \mathcal{C}\}$ such that, for each $E_\alpha$ in $\mathcal{C}$, $f(E_\alpha) \in E_\alpha$.

The difficulty in proofs using the axiom of choice is that only the existence of a choice function is postulated, and if $\mathcal{C}$ is uncountable, one has no information about its nature. However, we will find it convenient at times to use this axiom (or something equivalent). It has recently been shown that both the axiom of choice, and its negative, are consistent with the other axioms of set theory, so that one has to postulate this as an axiom. Although part of our theory will be valid without this axiom we will not trouble to discover how much and we will use the axiom of choice throughout when it is convenient.

There are a large number of other apparently different axioms which turn out to be logically equivalent to the axiom of choice. We will formulate just two of these, as they will be convenient later. Various new concepts will be needed before we can state them precisely.

Partial ordering

Suppose V is a set with elements a, b, ... and $\prec$ is a relation defined between some but not necessarily all pairs $a, b \in V$ such that
(i) $\prec$ is transitive, i.e. $a \prec b$, $b \prec c \Rightarrow a \prec c$;
(ii) $\prec$ is reflexive, i.e. $a \prec a$ for all a in V;
(iii) $a \prec b$, $b \prec a \Rightarrow a = b$;
then V is said to be partially ordered by the relation $\prec$. V is said to be simply (or totally) ordered if,
(iv) for each pair $a, b \in V$ at least one of $a \prec b$, $b \prec a$ is valid.

Any partial ordering in a set V induces automatically a partial ordering in every subset of V. If $W \subset V$ and the induced ordering in W is a simple ordering, then W is said to be a chain in V. For example, in R the usual $\le$ relation defines a total ordering of R. However, in $R^2$, if we say $(x_1, y_1) \prec (x_2, y_2)$ if and only if $y_1 \le y_2$ and $x_1 \le x_2$ we have an example of a partial ordering which is not simple. A more useful example is the class $\mathcal{C}$ of all subsets of a fixed set X with $A \prec B$ meaning $A \subset B$.
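Conditions (i) to (iv) can be tested mechanically on a finite set. The following sketch assumes a 3 by 3 grid of integer points standing in for part of $R^2$ with the coordinatewise ordering described above; the function names and the choice of grid are illustrative only. It confirms that the ordering is a partial ordering which is not simple, and that the diagonal of the grid is a chain.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity of `leq` on a finite set."""
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(a == b or not (leq(a, b) and leq(b, a))
                        for a, b in product(elements, repeat=2))
    transitive = all(leq(a, c) or not (leq(a, b) and leq(b, c))
                     for a, b, c in product(elements, repeat=3))
    return reflexive and antisymmetric and transitive


def is_chain(subset, leq):
    """Condition (iv): every pair of elements of `subset` is comparable."""
    return all(leq(a, b) or leq(b, a) for a, b in product(subset, repeat=2))


def coordinatewise(p, q):
    """(x1, y1) precedes (x2, y2) when x1 <= x2 and y1 <= y2."""
    return p[0] <= q[0] and p[1] <= q[1]


grid = list(product(range(3), repeat=2))
diagonal = [(0, 0), (1, 1), (2, 2)]

grid_is_partially_ordered = is_partial_order(grid, coordinatewise)
grid_is_simply_ordered = is_chain(grid, coordinatewise)
diagonal_is_chain = is_chain(diagonal, coordinatewise)
```

The points (0, 1) and (1, 0) are not comparable, which is why the whole grid fails condition (iv).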

A chain W in a partially ordered set V is called a maximal chain if it is not possible to obtain a larger chain by the addition of an element in (V - W). We can now state

Kuratowski's lemma. Every partially ordered set V contains a maximal chain.

This means that there is a totally ordered subset $W \subset V$ such that for every $x \in V - W$, there is some element $y \in W$ such that neither of $x \prec y$, $y \prec x$ is true.

For a partially ordered set V, the element a is said to be an upper bound for the subset $C \subset V$ if $c \prec a$ for every $c \in C$. The element a is said to be the least upper bound or supremum of the subset C if (i) a is an upper bound for C; (ii) if b is an upper bound for C, then $a \prec b$. It is easy to check that, in any partially ordered set V, it is impossible for two distinct elements $a_1$, $a_2$ to satisfy the above conditions (i), (ii) so that the supremum of a set C is unique when it exists. However, even when a set V is totally ordered, not all its subsets need have a supremum. With the usual ordering R has the property that any non-empty subset C which is bounded above has a supremum (this is known as the least upper bound axiom), but Q does not have this property. Finally, we say that the element $m \in V$ is a maximal element of V if $m \prec a \Rightarrow m = a$. We can now state

Zorn's lemma. If V is partially ordered and each chain W in V has a supremum, then V has a maximal element.

Both Zorn's lemma and Kuratowski's lemma can be deduced from the axiom of choice,† but we will not give the details as these are complicated and outside the mainstream of our argument. However, the next theorem shows that, if we assume Zorn's lemma as an additional axiom, then both the axiom of choice and Kuratowski's lemma will be valid. This means that, in our subsequent work, we will assume whichever of these three results happens to be most convenient.

† For a discussion of these and other axioms equivalent to axiom of choice see, for example, J. L. Kelley, General Topology (Van Nostrand, 1955).

Theorem 1.6. The statements of (A) Kuratowski's lemma and (B) Zorn's lemma are equivalent. Either of them implies (C) the axiom of choice.

Proof. (A) $\Rightarrow$ (B). Suppose W is a maximal chain in V, then by the hypothesis of (B) there is a supremum m for W so that $a \prec m$ for all $a \in W$. If m is not a maximal element of V, then there is a $b \in V$ such that $b \ne m$ and $m \prec b$. Then b is not in W as this would imply $b \prec m$,

and b = m. Hence we may add b to the chain W and the new set obtained is still a chain. This would contradict the fact that W is a maximal chain.

(B) $\Rightarrow$ (A). The chains in V form a class $\mathcal{C}$ which is partially ordered by inclusion. If now $\mathcal{W}$ is a chain in $\mathcal{C}$ with elements W (each of which is a chain in V), then the union $\bigcup \{W: W \in \mathcal{W}\}$ is a chain in V so that it is an element of $\mathcal{C}$ which can only be the supremum of $\mathcal{W}$. Hence by hypothesis $\mathcal{C}$ contains a maximal element, i.e. V contains a maximal chain.

(B) $\Rightarrow$ (C). We now suppose given a class $\mathcal{N}$ of sets $E_\alpha$. There are clearly some subsets (in fact any finite subset) $\mathcal{M} \subset \mathcal{N}$ on which it is possible to define a choice function $g: \mathcal{M} \to \bigcup \{E_\alpha: E_\alpha \in \mathcal{M}\}$ such that $g(E_\alpha) \in E_\alpha$. The set V of all such functions g is therefore non-empty and it is partially ordered if we say $g_1 \prec g_2$ if $g_1$ is defined on $\mathcal{M}_1$, $g_2$ is defined on $\mathcal{M}_2$, $\mathcal{M}_1 \subset \mathcal{M}_2$ and $g_1(E_\alpha) = g_2(E_\alpha)$ for $E_\alpha \in \mathcal{M}_1$ (i.e. $g_2$ is an extension of $g_1$). If now W is a chain in V containing functions $g_i$ defined on $\mathcal{M}_i$, the supremum of W is the function defined on $\bigcup \mathcal{M}_i$ which has the value $g_i(E_\alpha)$ on any set $E_\alpha \in \mathcal{M}_i$. If we now assume (B) it follows that the set V has a maximal element f. Then this function f must be defined on all the sets $E_\alpha$, for otherwise if f is not defined on $E_1$ we could choose an element $x_1 \in E_1$, put $f(E_1) = x_1$ and this would be a proper extension of f and therefore contradict the fact that f is maximal.

Exercises 1.6

1. Show that Z is partially ordered if $a \prec b$ means that a is a divisor of b.

2. Suppose $\alpha$ is a decomposition of the non-empty set X into disjoint subsets; $X = \bigcup A_i$, all the $A_i$ disjoint. Show that the collection of such decompositions is partially ordered if $\alpha \prec \beta$ means that $\beta$ is a refinement of $\alpha$, i.e. if $\beta$ is the decomposition $X = \bigcup B_j$ then each $B_j$ is a subset of some $A_i$.

3. A partially ordered set V is said to be well ordered if each non-empty subset $W \subset V$ has a least element, i.e. there is a $w_0 \in W$ such that $w_0 \prec w$ for all $w \in W$. Show that, if V is well ordered, then it is simply ordered, and by considering the natural ordering of R show that there exist simply ordered sets which are not well ordered.

4. Assuming Zorn's lemma, show that any set X can be well ordered. Hint. Consider the class $\mathcal{C}$ of well ordered subsets $V \subset X$ with the partial ordering $V_1 \prec V_2$ if: (i) $V_1 \subset V_2$, (ii) the ordering in $V_1$ is the same as that induced by the ordering in $V_2$, (iii) $V_1$ is an initial segment of $V_2$ in the sense that $a \in V_1$, $b \in V_2$, $b \prec a \Rightarrow b \in V_1$. Show that each chain in $\mathcal{C}$ has a supremum and show that the maximal element $V_0$ in $\mathcal{C}$ must be X.


2

POINT SET TOPOLOGY

2.1 Metric space

In the first chapter we were concerned with abstract sets where no structure in the set was assumed or used. In practice, most useful spaces do have a structure which can be described in terms of a class of subsets called `open'. By far the most convenient method of obtaining this class of open sets is to quantify the notion of nearness for each pair of points in the space. A non-empty set X together with a `distance' function $\rho: X \times X \to R$ is said to form a metric space provided that
(i) $\rho(y, x) = \rho(x, y) \ge 0$ for all $x, y \in X$;
(ii) $\rho(x, y) = 0$ if and only if x = y;
(iii) $\rho(x, y) \le \rho(x, z) + \rho(y, z)$ for all $x, y, z \in X$.

The real number $\rho(x, y)$ should be thought of as the distance from x to y. Note that it is possible to deduce conditions (i), (ii) and (iii) from a smaller set of axioms: this has little point as all the conditions agree with the intuitive notion of distance. Condition (iii) for $\rho$ is often called the triangle inequality because it says that the lengths of two sides of a triangle sum to at least that of the third. Condition (ii) ensures that $\rho$ distinguishes distinct points of X, and (i) says that the distance from y to x is the same as the distance from x to y. When we speak of a metric space X we mean the set X together with a particular $\rho$ satisfying conditions (i), (ii) and (iii) above. If there is any danger of ambiguity we will speak of the metric space $(X, \rho)$.

In the set R of real numbers, it is not difficult to check (i), (ii) and (iii) for the usual distance function

$$\rho(x, y) = |x - y|,$$

and similarly in $R^n$, with $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$ and

$$\rho(x, y) = \Big\{ \sum_{i=1}^{n} (x_i - y_i)^2 \Big\}^{1/2}$$

(one always assumes the positive square root) the conditions for a metric are satisfied. Thus R and $R^n$ are metric spaces with the usual Euclidean distance for $\rho$.
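Conditions (i), (ii) and (iii) can be checked by brute force on a finite sample of points, which is a useful sanity test though of course not a proof. The sketch below assumes a small grid of integer points in the plane and a tolerance `tol` for floating-point rounding; the names `is_metric`, `euclidean` and `discrete` are illustrative. The last example, the squared difference on R, shows a symmetric positive function that fails the triangle inequality.

```python
import math
from itertools import product


def euclidean(x, y):
    """Usual Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def discrete(x, y):
    """Distance 0 for equal points and 1 otherwise."""
    return 0.0 if x == y else 1.0


def is_metric(points, rho, tol=1e-12):
    """Check conditions (i), (ii), (iii) for rho on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if rho(x, y) < 0 or abs(rho(x, y) - rho(y, x)) > tol:    # condition (i)
            return False
        if (rho(x, y) == 0) != (x == y):                         # condition (ii)
            return False
    return all(rho(x, y) <= rho(x, z) + rho(y, z) + tol          # condition (iii)
               for x, y, z in product(points, repeat=3))


plane_sample = list(product(range(-1, 2), repeat=2))

euclidean_ok = is_metric(plane_sample, euclidean)
discrete_ok = is_metric(plane_sample, discrete)
squared_ok = is_metric([0, 1, 2], lambda x, y: (x - y) ** 2)   # fails (iii)
```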

Open sphere

In a metric space $(X, \rho)$, if $x \in X$, r > 0, then

$$S(x, r) = \{y: \rho(x, y) < r\},$$

the set consisting of those points of X whose distance from x is less than r, is called an open sphere (spherical neighbourhood) centre x, radius r. Clearly, in $R^n$, S(x, r) is the inside of the usual Euclidean n-sphere centre x, radius r (for n = 2, the `sphere' is the interior of a circle while for n = 1 it reduces to the interval (x - r, x + r)).

Open set

A subset E of a metric space X is said to be open if, for each point x in E there is an r > 0 such that the open sphere $S(x, r) \subset E$. Note that the open spheres defined above are examples of open sets since

$$y \in S(x, r) \Rightarrow \rho(x, y) = r_1 < r,$$

so that, for $0 < r_2 \le r - r_1$, $S(y, r_2) \subset S(x, r)$.

Theorem 2.1. In a metric space X, the class $\mathcal{G}$ of open sets satisfies
(i) $\emptyset, X \in \mathcal{G}$;
(ii) $A_1, A_2, \dots, A_n \in \mathcal{G} \Rightarrow \bigcap_{i=1}^{n} A_i \in \mathcal{G}$;
(iii) $A_\alpha \in \mathcal{G}$ for $\alpha$ in I $\Rightarrow \bigcup_{\alpha \in I} A_\alpha \in \mathcal{G}$.

Proof. (i) Since any statement about the elements of $\emptyset$ is true, $\emptyset \in \mathcal{G}$, and it is clear that $S(x, r) \subset X$ for any $x \in X$, r > 0 so certainly $X \in \mathcal{G}$.

(ii) If $x \in \bigcap_{i=1}^{n} A_i$, then $x \in A_i$ for i = 1, ..., n and each $A_i$ is open so there are real numbers $r_i > 0$ for which $S(x, r_i) \subset A_i$. If we put $r = \min_{1 \le i \le n} r_i$, then $0 < r \le r_i$ so that $S(x, r) \subset S(x, r_i) \subset A_i$ for i = 1, ..., n; and $S(x, r) \subset \bigcap_{i=1}^{n} A_i$.

(iii) For any $x \in \bigcup_{\alpha \in I} A_\alpha$, there must be a particular $\alpha$ in I such that $x \in A_\alpha$. Since this $A_\alpha$ is open, there is an r > 0 such that

$$S(x, r) \subset A_\alpha \subset \bigcup_{\alpha \in I} A_\alpha.$$

Remark. The condition (ii) says that $\mathcal{G}$ is closed for finite intersections, while (iii) says it is closed under arbitrary unions. One cannot extend (ii) to give closure for infinite intersections for, in R the intervals (0, 1 + (1/n)) are open sets, but

$$\bigcap_{n=1}^{\infty} \Big(0, 1 + \frac{1}{n}\Big) = \{x: 0 < x \le 1\} = (0, 1]$$

is not open as it contains no open sphere centre 1.

It is more general to start with a set X and a class $\mathcal{G}$ of subsets of X satisfying (i), (ii), (iii) of theorem 2.1 and to call these `the open sets' in X. Such a class $\mathcal{G}$ and set X are said to form a topological space, and $\mathcal{G}$ is said to determine the topology in X. A topological space $(X, \mathcal{G})$ is said to be metrisable if there is a distance function $\rho$ defined on it which determines the class $\mathcal{G}$ for its open sets. Most topological spaces $(X, \mathcal{G})$ of interest satisfy the rather weak conditions which are sufficient to ensure metrisability, so that little is lost by assuming in the first place that we have a metric space $(X, \rho)$. Of course two different metrics $\rho_1$, $\rho_2$ on a set X may define the same class $\mathcal{G}$ of open sets, so that even when a topological space is metrisable, the metric $\rho$ is not uniquely determined (see exercise 2.4 (1)). In this chapter we will define most of the further concepts which depend on the topology of X in terms of the class $\mathcal{G}$ of open sets in X: this means that the definitions will make sense either in a metric space $(X, \rho)$ or in a topological space $(X, \mathcal{G})$. However, when it simplifies the proof, we will assume that X has a metric $\rho$ determining $\mathcal{G}$ and use this metric, so that some theorems will be stated and proved for metric spaces even though they are true more generally.

Closed set

A subset E of X is said to be closed if (X - E) is open. If we apply this definition, with de Morgan's laws, to the conditions (i), (ii), (iii) of theorem 2.1 satisfied by the class of open sets, we see that the class $\mathcal{C}$ of closed sets satisfies
(i) $\emptyset, X \in \mathcal{C}$;
(ii) $A_1, A_2, \dots, A_n \in \mathcal{C} \Rightarrow \bigcup_{i=1}^{n} A_i \in \mathcal{C}$;
(iii) $A_\alpha \in \mathcal{C}$, $\alpha$ in I $\Rightarrow \bigcap_{\alpha \in I} A_\alpha \in \mathcal{C}$,
so that the class $\mathcal{C}$ is closed for finite unions and arbitrary intersections. In a metric space $(X, \rho)$, for $x \in X$, r > 0 the set

$$\bar{S}(x, r) = \{y: \rho(x, y) \le r\}$$

is called the closed sphere centre x, radius r. It is always a closed set according to our definition for

$$y \in G = X - \bar{S}(x, r) \Rightarrow \rho(x, y) = r_1 > r$$

so that $S(y, r_2) \subset G$ for $0 < r_2 \le r_1 - r$.

Neighbourhood

In a topological space $(X, \mathcal{G})$, any open set containing $x \in X$ is said to be a neighbourhood of x.

Limit point of a set

Given a subset E of X, a point $x \in X$ is said to be a limit point (or point of accumulation) of E if every neighbourhood of x contains a point of E other than x. Note that the point x may or may not be in E. In a metric space it is easy to see that x is a limit point of E only if every neighbourhood N of x contains infinitely many points of E: for, if N contains only the points $x_1, x_2, \dots, x_n$ of E (all different from x), then S(x, r) where $r = \min_{1 \le i \le n} \rho(x, x_i)$ is a neighbourhood of x which contains no point of E other than x.

Lemma. A set $E \subset X$ is closed if and only if E contains all its limit points.

Proof. Suppose E is closed, then X - E = G is open, so that if $x \in G$ there is a neighbourhood N of x with $N \subset G$. This means that N contains no point of E so that x is not a limit point of E. Conversely, if E is a set which contains its limit points and $x \in G = X - E$, then x is not a limit point of E so there is a neighbourhood $N_x$ of x containing no point of E. Since $N_x$ is open, so is $H = \bigcup_{x \in G} N_x$. But $N_x \subset G$ for all $x \in G$ so $H \subset G$, and every point x of G is in the corresponding $N_x$ so $H \supset G$. Thus H = G and G is open.

Closure

For any set $E \subset X$, the closure of E, denoted by $\bar{E}$, is the intersection of all the closed subsets of X which contain E. It is immediate that $\bar{E}$ is a closed set, and $E = \bar{E}$ if and only if E is closed. Further since a closed set contains its limit points, $\bar{E}$ must contain all the limit points of E: in fact

$$\bar{E} = E \cup E',$$

where $E'$ is the set of limit points of E, known as the derived set of E; for if $x \notin E \cup E'$, there is a neighbourhood N of x which contains no point of $E \cup E'$ so that (X - N) is a closed set containing E and $x \notin \bar{E}$.

Limit of a sequence

Given a sequence $\{x_i\}$ of points in a metric space $(X, \rho)$ we say that the sequence converges to the point $x \in X$ if each neighbourhood of x contains all but a finite number of points of the sequence. Thus $\{x_i\}$ converges to x if given $\epsilon > 0$, there is an integer N such that

$$i > N \Rightarrow \rho(x, x_i) < \epsilon.$$

We then write $x = \lim_{i \to \infty} x_i$ or $x = \lim x_i$ and say that x is the limit of the sequence $\{x_i\}$. Note that, in a metric space, the limit of a sequence is unique (see exercise 2.1 (7)).

In a metric space X, given a point x and a set E, the distance from x to E, denoted by d(x, E), is defined by

$$d(x, E) = \inf \{\rho(x, y): y \in E\}.$$

This is always defined since $\{\rho(x, y): y \in E\}$ is a set of non-negative real numbers. If $E \subset S(x, r)$ for some open sphere, we say that E is bounded and define the diameter of E, denoted diam (E), by

$$\mathrm{diam}\,(E) = \sup \{\rho(x, y): x, y \in E\}.$$

If E is not bounded then the set $\{\rho(x, y): x, y \in E\}$ is not bounded above and we put diam (E) = $+\infty$. Note that diam (E) is finite if and only if E is bounded. Finally, if E, F are two subsets of X, we define d(E, F) by

$$d(E, F) = \inf \{\rho(x, y): x \in E, y \in F\} = \inf \{d(x, E): x \in F\}$$

and call d(E, F) the distance from E to F. Note that if $E \cap F \ne \emptyset$, then d(E, F) = 0 but there is no converse to this statement.

Remark

Many readers will be familiar with the concepts of this section for R and $R^2$. Usually the proofs given in these special cases can be generalised to a general metric space, and often even to a topological space. The reader who has difficulty in working in an abstract situation should visualise the argument in the plane $R^2$, but not use any of the special properties of $R^2$.
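The three quantities d(x, E), diam (E) and d(E, F) defined in this section can be computed directly when the sets are finite, since the infimum and supremum are then a minimum and a maximum. The sketch below makes that assumption, takes the usual distance on R for $\rho$, and uses illustrative function names and sets.

```python
from itertools import product


def dist_point_set(rho, x, E):
    """d(x, E) for a finite non-empty set E (the infimum is attained)."""
    return min(rho(x, y) for y in E)


def diam(rho, E):
    """diam(E) for a finite non-empty set E (the supremum is attained)."""
    return max(rho(x, y) for x, y in product(E, repeat=2))


def dist_sets(rho, E, F):
    """d(E, F) for finite non-empty sets E and F."""
    return min(rho(x, y) for x, y in product(E, F))


def usual(x, y):
    """The usual distance |x - y| on R."""
    return abs(x - y)


E, F, G = {0, 1}, {1, 2}, {5}

d_point = dist_point_set(usual, 3, E)         # nearest point of E to 3 is 1
d_overlap = dist_sets(usual, E, F)            # E and F share the point 1
d_apart = dist_sets(usual, E, G)
same_by_points = min(dist_point_set(usual, x, E) for x in F)   # second form of d(E, F)
```

Here `d_overlap` is 0 although E and F are different sets, in line with the note above that d(E, F) = 0 whenever the sets intersect.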

Exercises 2.1

1. In any set X, the class $\mathcal{C}$ of all subsets of X satisfies the conditions for a class of open sets. Show that this topology can be generated by the metric $\rho(x, y) = 1$ for $x \ne y$, $x, y \in X$. (This is called the discrete topology in X.) At the opposite end of the scale, the indiscrete topology in X is that for which $\mathcal{G} = \{\emptyset, X\}$: in this case the space is non-metrisable if X contains at least two points.

2. Show that in a metric space X containing at least 2 points
(i) single point sets {x} are closed, but they are open only if d(x, X - {x}) > 0;
(ii) finite sets are closed;
(iii) any open set G is the union of the class of open spheres contained in G;
(iv) any open set G is the union of the class of closed spheres contained in G.

3. In the topological space X, given a set $E \subset X$ a point x is said to be an interior point of E if there is a neighbourhood N of x with $N \subset E$. Prove
(i) the set $E^\circ$ consisting of the interior points of E is open;
(ii) E is open if and only if $E = E^\circ$.

4. In a topological space X show that

$$\overline{A_1 \cup A_2 \cup \dots \cup A_n} = \bar{A}_1 \cup \bar{A}_2 \cup \dots \cup \bar{A}_n$$

but this does not extend to arbitrary unions. Give an example in which $\overline{E \cap F} \ne \bar{E} \cap \bar{F}$.

5. If X is a 2-point space $\{x_1, x_2\}$, $\rho(x_1, x_2) = 1$ show that $(X, \rho)$ is a metric space in which the closure $\overline{S(x_1, 1)}$ of the open sphere $S(x_1, 1)$ is not the same as the closed sphere $\bar{S}(x_1, 1)$. However, if X is a normed linear space (see §2.6) then $\overline{S(x, r)} = \bar{S}(x, r)$.

6. Suppose $A \subset R$ and A is closed and bounded below. Show that the infimum of A is an element of A.

7. In a metric space, suppose $\{x_n\}$ is a sequence converging to x and E is the set of points in this sequence. Show
(i) every subsequence converges to x;
(ii) either x is the only limit point of E or there is an integer N such that $x_n = x$ for n > N.
Deduce that a sequence $\{x_n\}$ cannot converge to two different limit points.

8. Suppose E is the set of points in a sequence $\{x_n\}$ in a metric space and x is a limit point of E. Show there is a subsequence $\{x_{n_i}\}$ which converges to x.

9. For any set E in a metric space, show that

$$\bar{E} = \{x: d(x, E) = 0\}.$$

10. If E, F are subsets of a metric space X, $x, y \in X$, show
(i) $\rho(x, y) \ge |d(x, E) - d(y, E)|$;
(ii) $\rho(x, y)$
(iii) $|\rho(x, y_1) - \rho(x, y_2)| \le \mathrm{diam}\,(E)$, if $y_1, y_2 \in E$;
(iv) d(E, F) = d(F, E).
Is d a metric on the space of subsets of X?

11. In R show that a bounded open set is uniquely expressible as a countable union of disjoint open intervals. Hint. For each $x \in E$, put $a = \inf \{y: (y, x) \subset E\}$, $b = \sup \{y: (x, y) \subset E\}$; and show that the open interval $I_x = (a, b)$ contains x, is contained in E and is such that any open interval I satisfying $x \in I \subset E$ satisfies $I \subset I_x$. Deduce that for $x_1, x_2 \in E$, either $I_{x_1} = I_{x_2}$ or $I_{x_1} \cap I_{x_2} = \emptyset$, so that $E = \bigcup_{x \in E} I_x$ is a disjoint union. Enumerate the intervals $I_x$ by considering those of length greater than 1/n (n = 1, 2, ...).

In $R^n$ ($n \ge 2$) show that a bounded open set can be expressed as a disjoint union of a countable number of half-open rectangles in $\mathcal{P}^n$ (but that this expression is never unique). Show that in general an open set in $R^n$ ($n \ge 2$) cannot be expressed as a disjoint union of open spheres, or of open rectangles.

2.2 Completeness and compactness

In a metric space $(X, \rho)$ a sequence $\{x_n\}$ is said to be a Cauchy sequence if given $\epsilon > 0$, there is an integer N such that $n, m \ge N \Rightarrow \rho(x_n, x_m) < \epsilon$. It is immediate that any sequence $\{x_n\}$ in a metric space which converges to a point $x \in X$, is a Cauchy sequence.

Complete metric space

A metric space $(X, \rho)$ is said to be complete if, for each Cauchy sequence $\{x_n\}$ in X, there is a point $x \in X$ such that $x = \lim x_n$. For example, the set Q of rationals is a metric space with the usual distance, but it is not complete for $\sqrt{2} \notin Q$, but one can easily define a Cauchy sequence $\{x_n\}$ of rationals which converges to $\sqrt{2}$ (in R), and this sequence cannot converge to any rational. One of the important properties of the space R is that it is complete. This property is equivalent to the assumption that, in the usual ordering, every nonvoid subset of R which is bounded above has a supremum or least upper bound. We now give a proof of the completeness of R by a method which will turn out to be useful in more complicated situations.
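One concrete way to produce a Cauchy sequence of rationals of the kind mentioned above is sketched below. The recurrence $x_{n+1} = \tfrac{1}{2}(x_n + 2/x_n)$ with $x_1 = 2$ (Newton's iteration for $x^2 = 2$) is a choice made here for illustration and is not taken from the text; exact rational arithmetic is used so that every term is visibly an element of Q. The function name is likewise illustrative.

```python
from fractions import Fraction


def sqrt2_approximations(count):
    """First `count` terms of x_1 = 2, x_{n+1} = (x_n + 2/x_n)/2, as exact rationals."""
    x = Fraction(2)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms


terms = sqrt2_approximations(6)

# Successive terms get closer together, as a Cauchy sequence requires.
gaps = [abs(b - a) for a, b in zip(terms, terms[1:])]

# No term squares exactly to 2, but the squares approach 2.
errors = [abs(t * t - 2) for t in terms]
```

Every term is rational and none has square 2, while the squares approach 2; in R the terms converge to $\sqrt{2}$, so within Q the sequence has no limit.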

Lemma. The space $R$ is complete.

Proof. Let $\{x_n\}$ be a Cauchy sequence in $R$. Define a sequence of integers $\{n_i\}$ by $n_0 = 1$; if $n_{i-1}$ is defined, let $n_i$ be such that $n_i > n_{i-1}$ and
$$n, m \ge n_i \implies |x_n - x_m| < 1/2^i.$$
Then the series
$$\sum_{i=1}^{\infty} (x_{n_i} - x_{n_{i-1}})$$
is absolutely convergent, and therefore convergent,† say to $y$. But
$$\sum_{i=1}^{p} (x_{n_i} - x_{n_{i-1}}) = x_{n_p} - x_1,$$
so the subsequence $\{x_{n_p}\}$ $(p = 1, 2, \ldots)$ must converge to $x = x_1 + y$. Given $\epsilon > 0$, choose integers $P$, $N$ such that
$$p \ge P \implies |x - x_{n_p}| < \tfrac12\epsilon, \qquad n, m \ge N \implies |x_n - x_m| < \tfrac12\epsilon.$$
Now, if $m \ge N$, we can take $n_p \ge N$ with $p \ge P$ to obtain
$$|x - x_m| \le |x - x_{n_p}| + |x_{n_p} - x_m| < \epsilon,$$
so the sequence $\{x_n\}$ must converge to $x$.

Covering systems

A class $\mathcal{C}$ of subsets of $X$ is said to cover the set $E \subset X$, or form a covering for $E$, if $E \subset \bigcup\{S : S \in \mathcal{C}\}$. If all the sets of $\mathcal{C}$ are open, and $\mathcal{C}$ covers $E$, then we say that $\mathcal{C}$ is an open covering of $E$.

Compact set

A subset $E$ of $X$ is said to be compact if, for each open covering $\mathcal{C}$ of $E$, there is a finite subclass $\mathcal{C}_1 \subset \mathcal{C}$ such that $\mathcal{C}_1$ covers $E$. For example, the celebrated Heine-Borel theorem states that any finite closed interval $[a, b]$ is compact. Though this is proved in most elementary text-books we include a proof which starts from the least upper bound property.

Lemma. If $a, b$ are real numbers, the closed interval $[a, b] = \{x : a \le x \le b\}$ is compact.

† ... non-empty subset of $R$ which is bounded above has a supremum. (See, for example, Burkill, A First Course in Mathematical Analysis, Cambridge, 1962.)


Proof. Let $\mathcal{C}$ be any open cover of $[a, b]$ and let $c$ be the supremum of the set of $x$ in $[a, b]$ for which some finite subfamily $\mathcal{C}_1 \subset \mathcal{C}$ covers $[a, x]$. (This set is non-void since it contains $a$.) Choose a set $G \in \mathcal{C}$ with $c \in G$ and choose a point $d \in (a, c)$ such that the closed interval $[d, c] \subset G$. Then there is a finite subfamily covering $[a, d]$, and the addition of $G$ to this family gives a finite subfamily covering $[a, c]$. But unless $c = b$, since $G$ is open, we have covered by a finite subfamily the interval $[a, e]$ for some $e > c$, which contradicts the choice of $c$.
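The idea of pushing the covered point $c$ to the right until it passes $b$ can be sketched as a greedy procedure. This is only an illustration under a strong simplifying assumption: the candidate open intervals are given as a finite Python list, whereas the lemma handles arbitrary open coverings. The name `finite_subcover` and the sample data are hypothetical.

```python
def finite_subcover(a, b, intervals):
    """Greedily pick open intervals (l, r) from `intervals` that cover [a, b].

    Mirrors the proof: keep a point c up to which [a, c) is already covered,
    pick an interval containing c, and move c to its right end-point.
    """
    chosen = []
    c = a
    while True:
        # open intervals that contain the current point c
        candidates = [(l, r) for (l, r) in intervals if l < c < r]
        if not candidates:
            raise ValueError(f"the point {c} is not covered")
        best = max(candidates, key=lambda iv: iv[1])  # reaches furthest right
        chosen.append(best)
        if best[1] > b:
            return chosen  # everything up to and including b is covered
        c = best[1]  # the right end-point itself is not yet covered

cover = [(-0.1, 0.3), (0.2, 0.6), (0.25, 0.5), (0.55, 1.2), (0.0, 0.1)]
print(finite_subcover(0.0, 1.0, cover))
```

Each step strictly increases the right end-point, so on a finite list the loop must stop; the lemma's supremum argument is what replaces this finiteness for an arbitrary covering.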

It is also possible to prove directly that any closed rectangle in Rn is compact, but we will be able to deduce this from theorem 2.6. We can use this to show that, in Rn, every closed bounded set is compact. This will follow from the following:

Lemma. If E is compact, and F is a closed subset of E, then F is compact.

Proof. Suppose $\mathcal{C}$ is an open covering for $F$. Then $\mathcal{C}$, together with $(X - F)$, which is open, forms an open covering for $E$. This has a finite subcovering $\mathcal{C}_1$ of $E$, and $\mathcal{C} \cap \mathcal{C}_1$ must be a finite subclass of $\mathcal{C}$ which covers $F$.

It is not true in a general metric space that closed bounded subsets are compact (see exercise 2.2 (3)). However, we can prove:

Lemma. In a metric space $X$, every compact subset is closed and bounded.

Proof. If $E$ is not closed, there is a point $x_0 \in X$ which is a limit point of $E$ but is not in $E$. For every $x \in E$, put $S_x = S(x, r)$ with $r = \tfrac12\rho(x, x_0)$. Then the collection of all such open spheres covers $E$, but every finite subclass $S(x_1, r_1), \ldots, S(x_n, r_n)$ has a void intersection with $S(x_0, r)$, where $r = \min_{1 \le i \le n} r_i$, and so cannot cover $E$, for $S(x_0, r)$ contains points of $E$. On the other hand, if $E$ is not bounded, the class of open spheres of radius 1 and centres in $E$ covers $E$, but no finite subclass can cover $E$.

Whenever the whole space $X$ is compact, we talk of a compact space. The above lemma shows that $R^n$ is not compact because it is not bounded. A space $X$ is said to be locally compact if every point $x$ in $X$ has a neighbourhood $N$ such that the closure $\bar{N}$ is compact. It is clear that $R^n$ is locally compact. There are various other properties in a topological space which are equivalent to compactness under suitable conditions.

Weierstrass property

A set E is said to have property (W) if every infinite subset of E has at least one limit point.


Finite intersection property

A class $\mathcal{A}$ of subsets of $E$ is said to have the finite intersection property if every finite intersection $\bigcap_{i=1}^{n} A_i$, where $A_i \in \mathcal{A}$ $(i = 1, 2, \ldots, n)$, is non-void.

Theorem 2.2. (i) A closed subset $E$ of a topological space $X$ is compact if and only if every class $\mathcal{A}$ of closed subsets of $E$ with the finite intersection property has a non-void intersection.

(ii) In a metric space $X$, a subset $E$ is compact if and only if it has property (W).

Remark. In a general topological space, property (W) is equivalent to sequential compactness, the property that every countable open covering has a finite subcovering. A space in which arbitrary open coverings can be replaced by countable subcoverings is called Lindelöf. Thus in any Lindelöf space, property (W) is equivalent to compactness.

Proof. (i) Suppose $E$ is compact and $F_\alpha$, $\alpha \in I$ is a class of closed subsets of $E$ with $\bigcap_{\alpha \in I} F_\alpha$ void. Then the class of sets $G_\alpha = X - F_\alpha$, $\alpha \in I$ is an open covering of $E$. Choose a finite subcovering $G_{\alpha_1}, G_{\alpha_2}, \ldots, G_{\alpha_n}$; then
$$\bigcap_{i=1}^{n} F_{\alpha_i} = \emptyset$$
so that the class of sets $F_\alpha$, $\alpha \in I$ has not got the finite intersection property. This proves that compactness implies that any class $\mathcal{A}$ of closed subsets with the finite intersection property has a non-void intersection. Conversely suppose a closed set $E$ is such that any class $\mathcal{A}$ of closed subsets with the finite intersection property has a non-void intersection, and suppose $G_\alpha$, $\alpha \in I$ is an open covering of $E$, so that $\bigcap_{\alpha \in I} (X - G_\alpha) \cap E = \emptyset$. Since $E$ is closed, $E \cap (X - G_\alpha)$, $\alpha \in I$ is a family of closed subsets of $E$, so there must be a finite set $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that
$$\bigcap_{i=1}^{n} (X - G_{\alpha_i}) \cap E = \emptyset.$$
This means that $G_{\alpha_1}, \ldots, G_{\alpha_n}$ form a finite subcovering for $E$.

(ii) Suppose first that E has not got property (W). Let A be an infinite subset of E with no limit point. If A is not enumerable, choose an enumerable subset B of A. Then B is closed and (X - B) is open.

Enumerate B as a sequence of distinct points {xi}, and for each xi choose a neighbourhood Ni which contains xi but no other point of B.


Then the sequence $\{N_i\}$ together with the set $(X - B)$ form an open covering of $E$, which has no finite subcovering as none of the open sets $N_i$ can be omitted without 'uncovering' the corresponding $x_i$.

Conversely suppose $E$ has property (W). Then for each positive integer $n$ there is a finite class $\mathcal{C}_n$ of spherical neighbourhoods of radius $1/n$ which covers $E$; for otherwise we could find an infinite subset of $E$ all of whose points were distant more than $1/n$ apart, and such a subset can have no limit point. Let $\mathcal{C} = \bigcup_{n=1}^{\infty} \mathcal{C}_n$, so that $\mathcal{C}$ is countable. Now if $G$ is any open set, for each $x$ in $G \cap E$ we can find a sphere $S \in \mathcal{C}$ containing $x$ with $S \subset G$: for we can first choose $\delta > 0$ so that $S(x, \delta) \subset G$, and if $n > 2/\delta$, the sphere of $\mathcal{C}_n$ which contains $x$ will be contained in $S(x, \delta)$. Given any open covering $\mathcal{D}$ of $E$, carry out the above process for each set $D$ of $\mathcal{D}$ which intersects $E$, and each $x$ in $D \cap E$, and let $\mathcal{C}' \subset \mathcal{C}$ be the countable collection of open spheres obtained. For each $S \in \mathcal{C}'$, choose one set $D \in \mathcal{D}$ with $D \supset S$, and let $\mathcal{D}'$ be the countable class of sets so obtained. Then, since $\mathcal{C}'$ covers $E$, the class $\mathcal{D}'$ is a countable subcovering. This means that, if we assume property (W), open coverings can be replaced by a countable subcovering.

Now suppose $E \subset \bigcup_{i=1}^{\infty} G_i$ where the sets $G_i$ are open. Then, if there is no finite subcovering, for each integer $n$ we can find a point
$$x_n \in E - \bigcup_{i=1}^{n} G_i$$
and the sequence $\{x_n\}$ must form an infinite set, so that there is a limit point $x_0 \in E$. But $x_0 \in G_k$ for some $k$, and $G_k$ is open and therefore is a neighbourhood of $x_0$; this means we can find an $n > k$ such that
$$x_n \in G_k \subset \bigcup_{i=1}^{n} G_i,$$
which is a contradiction.

Compactification

Many operations can be carried out more easily in compact spaces than in non-compact spaces. Given a non-compact space $X$ a useful trick is to enlarge it to a topological space $X^* \supset X$ which is compact and such that the system $\mathcal{G}$ of open sets in $X$ is obtained by taking the intersection $X \cap G$ with $X$ of sets $G$ which are open in $X^*$. This device is known as the compactification of $X$. For example, $R$ is not compact, but if we adjoin two points $+\infty$, $-\infty$ to form the space $R^*$ of extended


real numbers we can show that $R^*$ is compact if we call a set $E \subset R^*$ open if $E \cap R$ is open and if $+\infty \in E$, there is a neighbourhood $\{x : a < x \le +\infty\} \subset E$; if $-\infty \in E$, there is a neighbourhood $\{x : -\infty \le x < b\} \subset E$, where $a, b \in R$. Note that the extended real number system $R^*$ is simply ordered if we put $-\infty < x < +\infty$ for all $x \in R$.

In general, a non-compact topological space $X$ may be compactified in many different ways. The simplest method is to adjoin a single point $\infty$ (which can be thought of as a point at infinity) to give the space $X^* = X \cup \{\infty\}$ and say that a subset $G$ of $X^*$ is open if either (i) $G \subset X$ and $G$ is open in $X$; or (ii) $\infty \in G$ and $(X^* - G)$ is a closed compact subset of $X$. It is not difficult to verify that this collection of 'open' sets defines a topology in $X^*$ in which $X^*$ is compact. This process is called the one-point compactification of $X$. It is familiar in the theory of the complex plane, where it is usual to add a single point at infinity (with neighbourhoods of the form $|z| > R$) to make the resulting 'closed plane' compact. Note that, if $\mathcal{G}^*$ is the class of open sets in $X^*$, $\mathcal{G}$ is the class of sets of the form $X \cap E$, $E \in \mathcal{G}^*$.

There are other, more sophisticated, methods of compactifying a topological space $X$, but we will not require these in the present book.
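One way to picture the extended real numbers $R^*$ described above is through the order-preserving map $x \mapsto x/(1+|x|)$, which sends $R$ onto $(-1, 1)$ and can be extended by $\pm\infty \mapsto \pm 1$. The sketch below is an illustration only: the map and the names `squash` and `unsquash` are assumptions introduced here, not constructions from the text. With this identification $R^*$ corresponds to the closed interval $[-1, 1]$.

```python
import math

def squash(x):
    """Order-preserving bijection from R* (floats plus +/-inf) onto [-1, 1]."""
    if math.isinf(x):
        return math.copysign(1.0, x)  # +inf -> 1.0, -inf -> -1.0
    return x / (1.0 + abs(x))

def unsquash(t):
    """Inverse map from [-1, 1] back to R*."""
    if abs(t) == 1.0:
        return math.copysign(math.inf, t)
    return t / (1.0 - abs(t))

samples = [-math.inf, -10.0, -1.0, 0.0, 1.0, 10.0, math.inf]
images = [squash(x) for x in samples]
print(images)
```

The images increase with the samples, which reflects the simple ordering $-\infty < x < +\infty$ noted above.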

Exercises 2.2

1. If $(X, \rho)$ is a compact metric space, show that (i) $X$ is complete; (ii) for each $\epsilon > 0$, $X$ can be covered by a finite class of open spheres of radius $\epsilon$.

2. If $(X, \rho)$ is a complete metric space which, for each $\epsilon > 0$, can be covered by a finite number of spheres of radius $\epsilon$, show that $X$ is compact.

3. The open interval $X = (0, 1) \subset R$ with the usual metric $\rho(x_1, x_2) = |x_1 - x_2|$ is a metric space. In $(X, \rho)$ the set $X$ is closed and bounded. Show that $X$ is not compact (and therefore not complete by example 2).

4. Construct a covering of the closed interval $[0, 1]$ by a family of closed intervals such that there is no finite subcovering.

5. If $A, B$ are compact subsets of a metric space $X$, show that there are points $x_0 \in A$, $y_0 \in B$ for which
$$\rho(x_0, y_0) = d(A, B).$$


Hint. Take sequences $\{x_i\}$ in $A$, $\{y_i\}$ in $B$ with $\rho(x_i, y_i) < d(A, B) + 1/i$, and apply property (W) to find convergent subsequences.

6. Give the details of the proof that the process of adjoining a point $\infty$ used to give the one-point compactification does yield a compact set $X^*$. If this process is applied to a space $X$ which is already compact, show that the one point set $\{\infty\}$ is then both open and closed in $X^*$.

2.3 Functions

In Chapter 1 we defined the notion of a function $f: X \to Y$. When $X$ and $Y$ are topological spaces it is natural to enquire how the function $f$ is related to these topologies. In particular do points which are 'close' in $X$ map into points which are close in $Y$? We make this precise first for metric spaces.

Continuous function

If $(X, \rho_X)$, $(Y, \rho_Y)$ are metric spaces, a function $f: X \to Y$ is said to be continuous at $x = a$ if, given $\epsilon > 0$ there is a $\delta > 0$ such that
$$\rho_X(x, a) < \delta \implies \rho_Y(f(x), f(a)) < \epsilon.$$
If $E \subset X$, we say $f$ is continuous on $E$ if it is continuous at each point of $E$. In particular $f: X \to Y$ is said to be continuous (or continuous on $X$) if it is continuous at each point of $X$.

Lemma. If $(X, \rho_X)$, $(Y, \rho_Y)$ are metric spaces, a function $f: X \to Y$ is continuous if and only if $f^{-1}(G)$ is an open set in $X$ for each open set $G$ in $Y$.

Proof. Suppose first that $f$ is continuous and $G$ is an open set in $Y$. If $f^{-1}(G)$ is void, then it is open. Otherwise, let $a \in f^{-1}(G)$; then $f(a) \in G$, so that there is an $\epsilon > 0$ for which the sphere $S(f(a), \epsilon) \subset G$. But then we can find a $\delta > 0$ such that
$$\rho_X(x, a) < \delta \implies f(x) \in S(f(a), \epsilon) \subset G$$
so that the sphere $S(a, \delta) \subset f^{-1}(G)$. Conversely consider $f$ at a point $a$ of $X$. For each $\epsilon > 0$, $S(f(a), \epsilon) = H$ is an open set in $Y$, so that if $f^{-1}(H)$ is open, we can find a $\delta > 0$ for which $S(a, \delta) \subset f^{-1}(H)$, that is such that
$$\rho_X(x, a) < \delta \implies \rho_Y(f(x), f(a)) < \epsilon.$$

If $(X, \mathcal{G})$, $(Y, \mathcal{H})$ are topological spaces, the function $f: X \to Y$ is said to be continuous if $f^{-1}(H) \in \mathcal{G}$ for every $H$ in $\mathcal{H}$. The lemma just proved shows that when the topologies in $X$, $Y$ are determined by


metrics this definition agrees with the one first given for mappings from one metric space to another. Now if $f: X \to Y$ is continuous and $E$ is a closed subset of $Y$, it follows that $f^{-1}(E)$ is closed in $X$. One has to be careful about the implications in the reverse direction. In general, it is not true for a continuous $f: X \to Y$ that $A$ open in $X$ $\implies$ $f(A)$ open in $Y$. There is one important result of this kind which is valid:

Theorem 2.3. If $f: X \to Y$ is continuous, and $A$ is a compact subset of $X$, then $f(A)$ is compact in $Y$.

Proof. Suppose $G_\alpha$, $\alpha \in I$ forms an open covering of $f(A)$. Then $f^{-1}(G_\alpha)$ is open for each $\alpha$ and the class $f^{-1}(G_\alpha)$, $\alpha \in I$ must cover $A$. Since $A$ is compact, there is a finite subcovering $f^{-1}(G_{\alpha_1}), \ldots, f^{-1}(G_{\alpha_n})$ which covers $A$, and this implies that $G_{\alpha_1}, \ldots, G_{\alpha_n}$ cover $f(A)$.

Corollary. If $f: X \to R$ is continuous, and $A$ is compact, the set $f(A)$ is bounded and the function $f$ attains its bounds on $A$ at points in $A$.

Proof. $f(A)$ is compact, and so it must be closed and bounded. Hence $\sup\{x : x \in f(A)\}$ and $\inf\{x : x \in f(A)\}$ exist and belong to the set $f(A)$. Hence there are points $x_1, x_2 \in A$ for which $f(A) \subset [f(x_1), f(x_2)]$.

Remark. The reader will recognise this corollary as a generalisation of the elementary theorem that a continuous function $f: [a, b] \to R$ is bounded and attains its bounds.

It is important to notice that, in a metric space $X$, the distance function $\rho(x, y)$ is continuous for each fixed $y$ considered as a function from $X$ to $R$. Further, for a fixed set $A$, $d(x, A)$ defines a continuous function from $X$ to $R$ since
$$\rho(x_1, x_2) \ge |d(x_1, A) - d(x_2, A)|.$$
This means that if $E$ is compact and $F$ is any set, the function $d(x, F)$ for $x \in E$ attains its lower bound, so that there is an $x_0$ in $E$ with
$$d(x_0, F) = d(E, F).$$
Now, if $F$ is also compact,
$$d(x_0, F) = \inf\{\rho(x_0, y) : y \in F\}$$
is the lower bound on a compact set of another continuous function, so that there is a $y_0$ in $F$ such that
$$d(x_0, F) = \rho(x_0, y_0) = d(E, F).$$
Thus we have proved a further corollary to theorem 2.3, which could have been proved by a different argument (see exercise 2.2 (5)).


Corollary. If $E, F$ are two compact subsets of a metric space $(X, \rho)$, there are points $x_0 \in E$, $y_0 \in F$ such that
$$\rho(x_0, y_0) = d(E, F).$$
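For finite sets, which are always compact, the corollary can be checked by direct search. The sketch below is a minimal illustration, assuming points in the plane with the Euclidean metric; the function name `set_distance` and the sample sets are hypothetical. It returns both $d(E, F)$ and a pair of points attaining it.

```python
import math
from itertools import product

def set_distance(E, F):
    """Return (d, x0, y0) where d = d(E, F) = rho(x0, y0), x0 in E, y0 in F.

    E and F are finite, non-empty collections of points given as tuples.
    """
    return min(
        ((math.dist(x, y), x, y) for x, y in product(E, F)),
        key=lambda triple: triple[0],
    )

E = [(0.0, 0.0), (1.0, 0.0)]
F = [(3.0, 0.0), (4.0, 4.0)]
print(set_distance(E, F))
```

For infinite compact sets no such exhaustive search is available, and the existence of the minimising pair is exactly what the corollary supplies.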

Uniformly continuous function

A mapping $f: X \to Y$ from the metric space $(X, \rho_X)$ to the metric space $(Y, \rho_Y)$ is said to be uniformly continuous on the subset $A \subset X$ if given $\epsilon > 0$, there is a $\delta > 0$ for which
$$x, y \in A, \ \rho_X(x, y) < \delta \implies \rho_Y(f(x), f(y)) < \epsilon. \qquad (2.3.1)$$
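Condition (2.3.1) can be explored numerically by estimating, on a finite sample of $A$, the largest value of $|f(x) - f(y)|$ over pairs with $|x - y| \le \delta$. The sketch below is a rough illustration rather than a proof: the helper `modulus`, the sample grid and the two test functions are assumptions chosen here. On a sample of $(0, 1]$ the function $x^2$ has a small estimate, while $1/x$ does not.

```python
def modulus(f, points, delta):
    """Largest |f(x) - f(y)| over sample points x, y with |x - y| <= delta."""
    pts = sorted(points)
    worst = 0.0
    j = 0
    for i, x in enumerate(pts):
        # advance j until pts[j] lies within delta of x
        while x - pts[j] > delta:
            j += 1
        for y in pts[j:i]:
            worst = max(worst, abs(f(x) - f(y)))
    return worst

def square(x):
    return x * x

def recip(x):
    return 1.0 / x

grid = [k / 1000 for k in range(1, 1001)]  # finite sample of (0, 1]
print(modulus(square, grid, 0.01), modulus(recip, grid, 0.01))
```

A single $\delta$ that works across the whole sample is the numerical counterpart of the requirement that $\delta$ in (2.3.1) does not depend on $x$ or $y$.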

Clearly a function which is uniformly continuous on $A$ is certainly continuous at each point of $A$, but the point of the condition (2.3.1) is that one can make $f(x)$ close to $f(y)$ in $Y$ simply by making $x$ close to $y$ simultaneously for all $x, y \in A$. The choice of $\delta$ in (2.3.1) does not depend on $x$ or $y$. In general, uniform continuity does not follow from continuity, but there is an important case in which it does:

Theorem 2.4. If $X$, $Y$ are metric spaces, and $f: X \to Y$ is continuous on $A$ where $A$ is a compact subset of $X$, then $f$ is uniformly continuous on $A$.

Proof. Given $\epsilon > 0$, for each $x \in A$, there is a $\delta_x > 0$ such that
$$\xi \in S(x, \delta_x) \cap A \implies f(\xi) \in S(f(x), \tfrac12\epsilon).$$
For $x \in A$, the class of spheres $S_x = S(x, \tfrac12\delta_x)$ form an open covering of $A$. Choose a finite subcovering $S_{x_1}, \ldots, S_{x_n}$ and put $\delta = \tfrac12 \min(\delta_{x_1}, \ldots, \delta_{x_n})$. Then if $\rho_X(\xi, \eta) < \delta$, $\xi, \eta \in A$, there must be a sphere $S_{x_i}$ which contains $\xi$, and $S(x_i, \delta_{x_i})$ will then contain $\eta$. This implies
$$\rho_Y(f(\xi), f(\eta)) \le \rho_Y(f(\xi), f(x_i)) + \rho_Y(f(\eta), f(x_i)) < \epsilon.$$

Remark. The reader will recognise the above theorem as a generalisation of the result that a continuous function $f: [a, b] \to R$ is uniformly continuous.

Exercises 2.3

1. Consider the function $f: (0, \infty) \to R$ given by $f(x) = \min(1, 1/x)$, for $x > 0$. Show that it is continuous. Find the image $f(E)$ of

(i) the set E = (1,1); (ii) the set $Z$ of positive integers;

(i) shows that $E$ can be open, $f(E)$ not open; (ii) shows that $E$ can be closed, $f(E)$ not closed.


2. The function $f: (0, 1) \to R$ given by
$$f(x) = \frac{1}{x(1 - x)}$$
is continuous on $(0, 1)$, but not bounded and not uniformly continuous, so theorems 2.3, 2.4 fail if the set is not closed. To see they also fail if the set is closed but not compact, examine $g: R \to R$ given by $g(x) = \exp(x)$.

3. In the argument of the proof to theorem 2.4 why could we not have put $\delta = \inf\{\delta_x : x \in A\}$ before first restricting to a finite subset?

4. Suppose $A$ is compact and $\{f_n\}$ is a monotone sequence of continuous functions $f_n: A \to R$ converging to a continuous $f: A \to R$. Show that the convergence must be uniform, and give an example to show that the condition that $A$ be compact is essential.

5. Prove Lebesgue's covering lemma, which states that if $\mathcal{C}$ is an open cover of a compact set $A$ in a metric space $(X, \rho)$, then there is a $\delta > 0$ such that the sphere $S(x, \delta)$ is contained in a set of $\mathcal{C}$ for each $x \in A$.

2.4 Cartesian products

We have already defined the direct product of two arbitrary sets $X$, $Y$ as the set of ordered pairs $(x, y)$ with $x \in X$, $y \in Y$. If $(X, \mathcal{G})$, $(Y, \mathcal{H})$ are topological spaces, then there is a natural method of defining a topology in $X \times Y$. Let $\mathcal{R}$ be the class of rectangle sets $G \times H$ with $G \in \mathcal{G}$, $H \in \mathcal{H}$ and let $\mathcal{C}$ be the class of sets in $X \times Y$ which are unions of sets in $\mathcal{R}$ (finite or infinite unions): it is immediate that $\mathcal{C}$ satisfies the conditions (i), (ii), (iii) of theorem 2.1. This class $\mathcal{C}$ of 'open' sets in $X \times Y$ is said to define the product topology. This definition extends in an obvious manner to finite products $X_1 \times X_2 \times \cdots \times X_n$, and it is also possible to extend it to an arbitrary product of topological spaces, though we will not have occasion to consider a topology for infinite product spaces.

Theorem 2.5. If $(X_i, \rho_i)$ $(i = 1, \ldots, n)$ are metric spaces then
$$\rho((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \max_{1 \le i \le n} \rho_i(x_i, y_i)$$
defines a metric in the Cartesian product $X_1 \times \cdots \times X_n$ which generates the product topology.

Proof. It is clear that $\rho(x, y) = 0$ if and only if $\rho_i(x_i, y_i) = 0$ for each $i$


which implies $x_i = y_i$, $1 \le i \le n$, or $x = y$. Thus in order to show that $\rho$ is a metric it is sufficient to prove the triangle inequality. But
$$\rho_i(x_i, y_i) \le \rho_i(x_i, z_i) + \rho_i(y_i, z_i) \qquad (i = 1, \ldots, n)$$
since the $\rho_i$ are all metrics, so that
$$\max_{1 \le i \le n} \rho_i(x_i, y_i) \le \max_{1 \le i \le n} \{\rho_i(x_i, z_i) + \rho_i(y_i, z_i)\} \le \max_{1 \le i \le n} \rho_i(x_i, z_i) + \max_{1 \le i \le n} \rho_i(y_i, z_i).$$
Now in the product space, the open sphere centre $x$, radius $r$ has the form
$$\{(y_1, \ldots, y_n) : \rho_i(x_i, y_i) < r, \ 1 \le i \le n\} = S(x_1, r) \times S(x_2, r) \times \cdots \times S(x_n, r),$$
that is, it is the product of spheres in each of the component spaces. Thus the open spheres are open sets in the product topology and since every open set in the topology of a metric $\rho$ is a union of the open spheres contained in it, each such set must be open in the product topology. Conversely if $G$ is an open set in the product topology and $x \in G$, there must be open sets $G_i \subset X_i$ such that
$$x \in G_1 \times \cdots \times G_n \subset G.$$
Choose $r_i > 0$ such that $S(x_i, r_i) \subset G_i$ and put $r = \min_{1 \le i \le n} r_i$. Then $r > 0$ and
$$S(x, r) \subset S(x_1, r_1) \times \cdots \times S(x_n, r_n) \subset G.$$
Thus any set $G$ open in the product topology is also open in the topology of the metric $\rho$.

Remark. The metric $\rho$ defined in this theorem is by no means the only one which generates the product topology (see exercise 2.4 (1, 2)).

Theorem 2.6. If $X$, $Y$ are compact topological spaces, then $X \times Y$ is compact in the product topology.

Remark. The proof which follows extends immediately to finite Cartesian products of compact sets. Actually the theorem is true for arbitrary products, and in this more general form is due to Tychonoff.

Proof. Suppose first that $R_\alpha = G_\alpha \times H_\alpha$, $\alpha \in I$ is a covering of $X \times Y$ by open rectangles. Then if $x_0$ is a fixed point of $X$ and $I_{x_0}$ is the set of indices $\alpha$ for which $(x_0, y) \in G_\alpha \times H_\alpha$ for some $y \in Y$, the class $H_\alpha$, $\alpha \in I_{x_0}$ forms an open covering of the compact set $Y$. Hence, there is a finite set $J_{x_0} \subset I_{x_0}$ such that $R_\alpha$, $\alpha \in J_{x_0}$ covers the set $\{x_0\} \times Y$. But if we put $A_{x_0} = \bigcap_{\alpha \in J_{x_0}} G_\alpha$, then $A_{x_0}$ is open, contains $x_0$, and the finite class $R_\alpha$, $\alpha \in J_{x_0}$ must cover all of $A_{x_0} \times Y$. For each $x_0 \in X$, form such an open set $A_{x_0}$: the class of all sets of this form is an open covering of the


compact set $X$, and so has a finite subcovering $A_{x_1}, A_{x_2}, \ldots, A_{x_n}$. It follows that $R_\alpha$, $\alpha \in \bigcup_{i=1}^{n} J_{x_i}$ is a finite subcovering of $X \times Y$.

It remains to show that, in testing for compactness, it is sufficient to consider coverings by open rectangles $G \times H$ with $G$ open in $X$, $H$ open in $Y$. Suppose then that every covering of $X \times Y$ by open rectangles has a finite subcovering, and consider an arbitrary open covering $G_\alpha$, $\alpha \in I$. Each point $(x, y) \in X \times Y$ is an element of an open set $G_\alpha$ and therefore there is an open rectangle $R_{x,y}$ with
$$(x, y) \in R_{x,y} \subset G_\alpha.$$
The class of open rectangles $R_{x,y}$, $(x, y) \in X \times Y$ clearly covers $X \times Y$ and so we can find a finite subcovering $R_1, R_2, \ldots, R_n$. The corresponding sets $G_1, \ldots, G_n$ then form a finite subclass of the original covering class which covers $X \times Y$.

Corollary. Any bounded closed set in $R^n$ is compact.

Proof. The usual topology in $R^n$ is given by the distance function
$$\rho(x, y) = \Big(\sum_{i=1}^{n} (x_i - y_i)^2\Big)^{1/2}$$
while theorem 2.5 shows that the product topology is given by the distance function
$$\tau(x, y) = \max_{1 \le i \le n} |x_i - y_i|.$$
Now $\tau(x, y) \le \rho(x, y) \le \sqrt{n}\,\tau(x, y)$ for all $x, y \in R^n$ so that the topology of the usual metric $\rho$ is the product topology.
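The comparison between the maximum metric and the usual Euclidean metric, namely that the former never exceeds the latter and the latter never exceeds $\sqrt{n}$ times the former, can be spot-checked on random points. The sketch below is only a numerical sanity check, not a substitute for the one-line proof; the names `rho`, `tau` and `check_comparison`, the sampling range and the floating-point tolerance are assumptions made for the illustration.

```python
import math
import random

def rho(x, y):
    """Usual Euclidean distance in R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def tau(x, y):
    """Maximum metric of theorem 2.5 applied to n copies of R."""
    return max(abs(a - b) for a, b in zip(x, y))

def check_comparison(n, trials=1000, seed=0):
    """Test tau <= rho <= sqrt(n) * tau on random pairs of points in R^n."""
    rng = random.Random(seed)
    tol = 1e-12  # allowance for floating-point rounding
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        y = [rng.uniform(-5, 5) for _ in range(n)]
        t, r = tau(x, y), rho(x, y)
        if not (t <= r + tol and r <= math.sqrt(n) * t + tol):
            return False
    return True

print(check_comparison(3))
```

Because each metric is bounded by a constant multiple of the other, spheres of one metric contain spheres of the other about the same centre, which is why the two topologies coincide.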

If $E$ is bounded, there is a real number $K$ such that $E$ is a subset of the Cartesian product of $n$ intervals $[-K, K]$. Since each of these intervals is compact, the product is compact and therefore $E$ is compact if it is closed.

Exercises 2.4

1. If $\rho_1$, $\rho_2$ are two metrics in $X$ such that $c\rho_1(x, y) \le \rho_2(x, y) \le k\rho_1(x, y)$ for all $x, y \in X$ where $c > 0$, $k > 0$; show that $\rho_1$ and $\rho_2$ generate the same topology in $X$.

2. Show that, if $X \times Y$ has the product topology, then the projection function $p: X \times Y \to X$ defined by $p(x, y) = x$ is continuous. In a space $X$, if $\mathcal{G}_1$, $\mathcal{G}_2$ are two collections of 'open' sets defining topologies and $\mathcal{G}_1 \supset \mathcal{G}_2$ we say that the topology given by $\mathcal{G}_2$ is coarser than that given by $\mathcal{G}_1$. Show that the product topology in $X \times Y$ is the coarsest topology for which projections are continuous. (For an arbitrary Cartesian product $\prod_{\alpha \in I} X_\alpha$ of topological spaces $(X_\alpha, \mathcal{G}_\alpha)$ the projection $p_\beta$ can be defined as $p_\beta(x_\alpha, \alpha \in I) = x_\beta \in X_\beta$ for any $\beta \in I$. One method of defining the product


topology in the Cartesian product space is to say that it is the coarsest topology for which each of the projections is continuous.)

3. Suppose $X \times Y$ has the product topology and $A \subset X$, $B \subset Y$. Show that
$$\overline{A \times B} = \bar{A} \times \bar{B},$$
and prove that the product of closed sets is closed.

2.5 Further types of subset

In a topological space $X$, a subset $E \subset X$ is said to be nowhere dense if the closure $\bar{E}$ of $E$ contains no non-void open set. If $E$ is nowhere dense, and $G$ is any non-void open set, the intersection $G \cap (X - \bar{E})$ is a non-void open subset disjoint from $\bar{E}$, and therefore from $E$. Conversely if $\bar{E}$ contains a non-void open set $H$ then every non-void open subset of $H$ is a neighbourhood of each of its points, and therefore contains points of $E$. Thus $E$ is nowhere dense if and only if every non-void open set in $X$ contains a non-void open set disjoint from $E$.

Category

A subset $E \subset X$ is said to be of the first category (in $X$) if there is a sequence $\{E_n\}$ of nowhere dense subsets of $X$ such that $E = \bigcup_{i=1}^{\infty} E_i$. A set $E \subset X$ which cannot be expressed as a countable union of nowhere dense sets is said to be of the second category.

Before proving that complete metric spaces are necessarily of the second category, it is convenient to prove a lemma which again generalises a well-known result in $R$ (about a decreasing sequence of closed intervals).

Lemma (Cantor). In a complete metric space, given $\{A_n\}$ a decreasing sequence of non-empty closed sets such that $\operatorname{diam}(A_n) \to 0$, $\bigcap_{n=1}^{\infty} A_n$ is a one point set.

Proof. For each integer $n$, choose a point $x_n \in A_n$. Then given $\epsilon > 0$ we can choose $N$ so that
$$n \ge N \implies \operatorname{diam}(A_n) < \epsilon,$$
and, since $A_N \supset A_n$ for $n \ge N$,
$$n, m \ge N \implies x_n, x_m \in A_N \implies \rho(x_n, x_m) < \epsilon;$$
so that $\{x_n\}$ is a Cauchy sequence. Since the space is complete, there is a point $x_0$ such that $x_n \to x_0$ as $n \to \infty$. For each $n$, since $A_n$ is


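closed and $x_i \in A_n$ for $i \ge n$ we must have $x_0 \in A_n$, so that $x_0 \in \bigcap_{n=1}^{\infty} A_n$. Now
$$\operatorname{diam}\Big(\bigcap_{n=1}^{\infty} A_n\Big) \le \operatorname{diam}(A_n)$$
for each $n$, so that $\operatorname{diam}(\bigcap A_n) = 0$ and the set $\bigcap A_n$ cannot contain more than one point.

Theorem 2.7 (Baire). Every complete metric space $X$ is of the second category.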

Proof. It is sufficient to show that if $\{A_n\}$ is any sequence of nowhere dense sets there are points $x \in X - \bigcup_{n=1}^{\infty} A_n$. Starting with such a sequence $\{A_n\}$, since we can find a non-void open set disjoint from $A_1$ there is a sphere $S(x_1, r_1)$ with $0 < r_1 < 1$ such that $\bar{S}(x_1, r_1) \cap A_1 = \emptyset$. Suppose we have found spheres $S(x_1, r_1) \supset S(x_2, r_2) \supset \cdots \supset S(x_{n-1}, r_{n-1})$ with $0 < r_{n-1} < 1/(n - 1)$ such that $\bar{S}(x_i, r_i) \cap A_i = \emptyset$ for $1 \le i \le n - 1$. Then $S(x_{n-1}, r_{n-1})$ is a non-void open set in $X$ and $A_n$ is nowhere dense. There must therefore be an open subset disjoint from $A_n$, so that we can find a sphere $S(x_n, r_n)$ with
$$0 < r_n < 1/n, \quad \bar{S}(x_n, r_n) \subset S(x_{n-1}, r_{n-1}) \quad \text{and} \quad \bar{S}(x_n, r_n) \cap A_n = \emptyset.$$
By the last lemma, there is a unique point $x_0 \in \bigcap_{n=1}^{\infty} \bar{S}(x_n, r_n)$ and this point $x_0$ cannot be in $A_n$ for any integer $n$. This means that
$$x_0 \in X - \bigcup_{n=1}^{\infty} A_n.$$

Corollary. $R^n$ is of the second category. For $a < b$, the interval $[a, b] \subset R$ is of the second category.

Dense subset

If $A$, $E$ are subsets of a topological space $X$, we say that $A$ is dense in $E$ if $\bar{A} \supset E$. This means that any open set $G$ which contains a point of $E$ also contains a point of $A$. In particular $A$ is dense in $X$ if $\bar{A} = X$, that is, if every non-void open set contains a point of $A$.

Separable space

A topological space $X$ is said to be separable if there is a countable set $E \subset X$ such that $E$ is dense in $X$. This implies the existence of a sequence $\{x_n\}$ in $X$ such that every non-void open subset contains a point of the sequence. It is immediate that $R$ and $R^n$ are separable as the set of points with rational coordinates is dense and countable. Further, every compact metric space $X$ is separable, since $X$ can be


covered by a finite class $\mathcal{C}_n$ of open spheres of radius $1/n$ for $n = 1, 2, \ldots$ and the (countable) set consisting of the centres of all these spheres is clearly dense in $X$.

Borel sets and Borelian sets

In any topological space $X$, we will call the $\sigma$-field $\mathcal{B}$ generated by the open sets the class of Borel sets, and the $\sigma$-ring $\mathcal{K}$ generated by the compact sets the class of Borelian sets. (One must remark that some authors use the term Borel sets for $\mathcal{K}$.) In a metric space the compact sets are closed, and therefore in $\mathcal{B}$, so that $\mathcal{K} \subset \mathcal{B}$. If $X = \bigcup_{i=1}^{\infty} K_i$ is a countable union of compact sets (in this case we can say that $X$ is $\sigma$-compact), then $\mathcal{K} = \mathcal{B}$. In order to prove this it is sufficient to show that each open set $G \in \mathcal{K}$: but if $G$ is open, $E = X - G$ is closed and so $E \cap K_i$ is compact for each $i$, and this implies $E = \bigcup_{i=1}^{\infty} (E \cap K_i) \in \mathcal{K}$ and $G = X - E \in \mathcal{K}$. Now Euclidean $n$-space $R^n$ is the union of the closed spheres $\bar{S}(0, k)$ $(k = 1, 2, \ldots)$ each of which is compact, so $R^n$ is $\sigma$-compact. This means that, in $R^n$, the Borel sets and the Borelian sets are the same.

Note that, by our definition, the class $\mathcal{B}^n$ of Borel sets in $R^n$ is the $\sigma$-field generated by the open sets in $R^n$. It is convenient to see that $\mathcal{B}^n$ can also be obtained as the $\sigma$-field generated by a simpler class of sets.

Lemma. The class $\mathcal{P}^n$ of half-open intervals in $R^n$ generates the $\sigma$-field $\mathcal{B}^n$ of Borel sets in $R^n$.

Proof. Let $\mathcal{F}^n$ be the $\sigma$-field generated by $\mathcal{P}^n$. Each set in $\mathcal{P}^n$,
$$\{(x_1, x_2, \ldots, x_n) : a_i < x_i \le b_i, \ i = 1, 2, \ldots, n\},$$
can clearly be obtained as a countable intersection
$$\bigcap_{k=1}^{\infty} \{(x_1, \ldots, x_n) : a_i < x_i < b_i + 1/k, \ i = 1, 2, \ldots, n\}$$
of open rectangles, and is therefore in $\mathcal{B}^n$. Hence $\mathcal{B}^n \supset \mathcal{P}^n$ so that $\mathcal{B}^n \supset \mathcal{F}^n$. On the other hand, each open set $G$ in $R^n$ is a union of those rectangles of $\mathcal{P}^n$ whose boundary points $a_i$, $b_i$ are all rational. Since there are only countably many such sets, each $G$ is a countable union of sets in $\mathcal{P}^n$ and so $G \in \mathcal{F}^n$. This implies that $\mathcal{F}^n \supset \mathcal{B}^n$.

It is sometimes useful to be able to describe sets which can be obtained from a given class $\mathcal{C}$ by a countable operation. We say that


$E$ is a $\mathcal{C}_\sigma$ set if it is possible to find sets $E_1, E_2, \ldots$ in $\mathcal{C}$ such that $E = \bigcup_{i=1}^{\infty} E_i$; and $E$ is a $\mathcal{C}_\delta$ set if $E = \bigcap_{i=1}^{\infty} E_i$ for a sequence $\{E_n\}$ in $\mathcal{C}$. In particular, if $\mathcal{G}$ is the class of open sets in a space $X$, we see that $\mathcal{G}_\delta \supset \mathcal{G}$, $\mathcal{G}_\sigma = \mathcal{G}$. Similarly, if $\mathcal{F}$ is the class of closed sets, $\mathcal{F}_\sigma \supset \mathcal{F}$, and $\mathcal{F}_\delta = \mathcal{F}$.

Perfect set

A subset $E$ of a topological space $X$ is said to be perfect if $E$ is closed, and each point of $E$ is a limit point of $E$. For example, in $R^n$, any closed sphere $\bar{S}(x, r)$, $r > 0$ is perfect and, in particular, the closed interval $[a, b]$ is perfect in $R$ for any $a < b$. It is obvious that finite sets in a metric space cannot be perfect. In fact more is true (see exercise 2.5 (7)).

Exercises 2.5

1. Show that, in $R^n$, any countable set is of the first category. Give a category argument for the existence of irrational numbers.

2. Show that the class $\mathcal{N}$ of nowhere dense subsets of $X$ is a ring, and the class of all sets of the first category is the generated $\sigma$-ring.

3. Show that in a complete metric space, a set of the first category contains no non-empty open set. Deduce that every non-empty open set is of the second category.

4. If $G$ is an open set in a topological space, prove that $(\bar{G} - G)$ is nowhere dense.

5. In $R$ show that the class of half-open intervals with rational end-points generates the $\sigma$-field $\mathcal{B}$ of all Borel sets. Similarly in $R^n$, show that the corresponding class of half-open intervals with rational end-points generates the Borel sets $\mathcal{B}^n$.

6. Show that a set $E$ is perfect if and only if $E = E'$, where $E'$ is the set of limit points of $E$.

7. Show that any non-empty perfect subset of a complete metric space is non-countable. Hint. Use theorem 2.7 and the fact that a closed subset of a complete metric space is complete.

2.6 Normed linear space

There are many abstract sets which have an algebraic structure as well as a topology. Thus if, in the set $X$ there is a binary operation $+$ (called addition) and an operation in which elements of $X$ can be


multiplied by elements of the real number field $R$ to give elements in $X$ we say that $X$ is a real linear space if for all $x, y, z \in X$, $a, b \in R$;

(i) $x + y = y + x$;
(ii) $x + (y + z) = (x + y) + z$;
(iii) $x + y = x + z \implies y = z$;
(iv) $a(x + y) = ax + ay$;
(v) $(a + b)x = ax + bx$;
(vi) $a(bx) = (ab)x$;
(vii) $1 \cdot x = x$.

It follows from these axioms that $X$ has a unique zero element $0 = 0 \cdot y$ for all $y \in X$, and that subtraction can be defined in $X$ by
$$x - y = x + (-1)y.$$
In the present book we will only consider linear spaces over $R$. Most of our results can be extended, though sometimes with a little difficulty, to linear spaces over the number field $C$. We will not carry out this extension, nor do we consider any more general number fields.

It is immediate that $R^n$ is a real linear space with vector addition and scalar multiplication for the two operations. The properties of linear spaces are studied at length in elementary courses on linear algebra.† We will not require many of these, but will develop the properties of linear independence when they are needed in Chapter 8.

Norm

If in a real linear space $X$ there is a function $n: X \to R$ satisfying

(i) $n(0) = 0$, $n(x) > 0$ if $x \ne 0$;
(ii) $n(x + y) \le n(x) + n(y)$ for all $x, y \in X$;
(iii) $n(ax) = |a|\, n(x)$ for $a \in R$, $x \in X$,

we say that $X$ is a normed linear space. We will in this case use the usual notation $\|x\|$ for the value $n(x)$ of the norm function $n$ at $x$. In any normed linear space $X$,
$$\rho(x, y) = \|x - y\| = \rho(x - y, 0)$$
defines a metric, and in the topology determined by this metric, the algebraic operations are continuous in the sense that

(i) $x + y$ is continuous in the product topology of $X \times X$;
(ii) $ax$ is continuous in the product topology of $X \times R$.

† See, for example, G. Birkhoff and S. MacLane, A Survey of Modern Algebra (Macmillan, 1941).

It follows in particular that

(iii) $a \in R$, $\lim x_n = 0 \Rightarrow \lim (a x_n) = 0$;
(iv) $x \in X$, $a_n \in R$, $\lim a_n = 0 \Rightarrow \lim (a_n x) = 0$.

(The reader is advised to check (i)-(iv) using the axioms.) Special normed linear spaces will be studied in Chapters 7 and 8. At this stage we consider a few important examples of such spaces and examine the topological structure imposed by the norm.

M. Consider the set of those functions $x: [0, 1] \to R$ which are bounded. Define, for $t \in [0, 1]$,

$$(x + y)(t) = x(t) + y(t), \qquad (ax)(t) = a\,x(t),$$

and check that this makes $M$ a linear space. If we put

$$\|x\| = \sup_{0 \le t \le 1} |x(t)|,$$

it is not hard to check that the conditions for a norm are also satisfied, so we have a normed linear space.
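The norm conditions for $M$ can be spot-checked numerically. The following is only a rough sketch: the supremum over $[0, 1]$ is approximated by a maximum over a finite grid, and the names `sup_norm`, `add`, `scale` and the two sample functions are illustrative choices, not taken from the text.

```python
def sup_norm(x, grid):
    """Approximate ||x|| = sup |x(t)| by the maximum over a finite grid."""
    return max(abs(x(t)) for t in grid)

def add(x, y):
    """Pointwise sum (x + y)(t) = x(t) + y(t)."""
    return lambda t: x(t) + y(t)

def scale(a, x):
    """Pointwise scalar multiple (ax)(t) = a * x(t)."""
    return lambda t: a * x(t)

# A grid on [0, 1]; this only approximates the supremum.
grid = [k / 1000 for k in range(1001)]

x = lambda t: t * t - 0.5                 # bounded and continuous
y = lambda t: 1.0 if t < 0.3 else -2.0    # bounded but not continuous

lhs = sup_norm(add(x, y), grid)                  # ||x + y||
rhs = sup_norm(x, grid) + sup_norm(y, grid)      # ||x|| + ||y||
print(lhs <= rhs)
```

A grid check of this kind can suggest that the axioms hold but proves nothing; the proof for $M$ uses the properties of the supremum directly.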

C. The set of those functions $x: [0, 1] \to R$ which are continuous is a subset $C$ of $M$. Since this subset $C$ is closed under the operations of addition and scalar multiplication, it must be a normed linear space with the same norm

$$\|x\| = \sup_{0 \le t \le 1} |x(t)|.$$

s. The set of all sequences of real numbers $\{x_i\}$ is a linear space if we put

$$\{x_i\} + \{y_i\} = \{x_i + y_i\}, \qquad a\{x_i\} = \{a x_i\}.$$

Since for $x, y$ real we have

$$\frac{|x + y|}{1 + |x + y|} \le \frac{|x|}{1 + |x|} + \frac{|y|}{1 + |y|},$$

it follows that

$$\rho(\{x_i\}, \{y_i\}) = \sum_{i=1}^{\infty} \frac{1}{2^i}\, \frac{|x_i - y_i|}{1 + |x_i - y_i|}$$

defines a metric in $s$.
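As an illustration of the metric on $s$, the sketch below evaluates a truncated version of the series on finite lists standing in for the first few terms of two sequences. The function name `rho` and the sample data are assumptions made for the example; the true metric is the full infinite sum.

```python
def rho(x, y):
    """Truncated metric on s: sum over i of 2^-i * |x_i - y_i| / (1 + |x_i - y_i|).

    x and y are finite lists of equal length, read as the first terms of
    two real sequences (an assumption of this sketch).
    """
    return sum(
        abs(a - b) / (1 + abs(a - b)) / 2 ** i
        for i, (a, b) in enumerate(zip(x, y), start=1)
    )

x = [0, 1, -2, 1e6]
y = [3, 1, 5, -1e6]
z = [-1, 0.5, 0, 0]

print(rho(x, y))                          # stays below 1 however large the entries
print(rho(x, z) <= rho(x, y) + rho(y, z)) # triangle inequality on this sample
```

Each term is at most $2^{-i}$, which is why every distance in $s$ is less than 1.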

m. This is the set of all bounded sequences of real numbers with the same linear structure as $s$. However this time it is more convenient to use the norm

$$\|\{x_i\}\| = \sup_i |x_i|$$

to make $m$ into a normed linear space.

c. This is the set of convergent sequences of real numbers with the same norm and linear structure as $m$.

Each of the above spaces has a topology defined by the norm. We now obtain a few of the topological properties of these spaces, leaving the reader to determine the remainder.

Lemma. The space $C$ is complete.

Proof. If $\{x_n\}$ is a Cauchy sequence in $C$, then for each $t \in [0, 1]$, $\{x_n(t)\}$ is a Cauchy sequence in $R$ which must converge to a real number $x_0(t)$. For each $\epsilon > 0$, there is an integer $N$ such that, if

$$y_m(t) = |x_N(t) - x_m(t)| \quad (m \ge N),$$

then $\|y_m\| < \tfrac12\epsilon$; that is, $0 \le y_m(t) < \tfrac12\epsilon$ for each $t$ in $[0, 1]$. If we now let $m \to \infty$, it follows that

$$|x_N(t) - x_0(t)| \le \tfrac12\epsilon \quad \text{for all } t \text{ in } [0, 1],$$

so that, if $n \ge N$, $t \in [0, 1]$,

$$|x_n(t) - x_0(t)| \le |x_n(t) - x_N(t)| + |x_N(t) - x_0(t)| < \epsilon,$$

and $\sup_t |x_n(t) - x_0(t)| \to 0$ as $n \to \infty$. This means that $x_0$ is the uniform limit of a sequence of continuous functions and must therefore be continuous; that is, $x_0 \in C$. ∎

Lemma. The space $M$ is not separable.

Proof. For each $s \in [0, 1)$, let $x_s$ be the function given by

$$x_s(t) = \begin{cases} 0 & \text{for } 0 \le t \le s, \\ 1 & \text{for } s < t \le 1. \end{cases}$$

Then if $r, s \in [0, 1)$ and $r \neq s$ we must have $\|x_r - x_s\| = 1$. Now any dense set in $M$ has to contain a point $y_s$ such that $\|y_s - x_s\| < \tfrac12$ for each $s \in [0, 1)$; and we cannot have $y_r = y_s$ with $r \neq s$ for then

$$1 = \|x_r - x_s\| \le \|x_r - y_r\| + \|y_r - y_s\| + \|y_s - x_s\| < 1.$$

This means that any set dense in $M$ must contain at least $c$ points, and therefore $M$ is not separable. ∎

Lemma. The space $c$ is not locally compact.

Proof. A metric space $X$ is locally compact if, for each $z \in X$, there is an $\epsilon > 0$ such that the closed sphere $\bar{S}(z, \epsilon)$ is compact. Now put

$$x_i^k = \begin{cases} 1 & \text{for } i = k, \\ 0 & \text{for } i \neq k, \end{cases}$$

and for each integer $k$, $x^k = \{x_i^k\}$ $(i = 1, 2, \dots) \in c$ and

$$k \neq j \Rightarrow \|x^k - x^j\| = 1.$$

Given $z = \{z_i\} \in c$ and $\epsilon > 0$, put $z^k = z + \epsilon x^k$, and all the points $z^k$ are in $\bar{S}(z, \epsilon)$. But

$$\|z^k - z^j\| = \epsilon \quad \text{if } k \neq j,$$

so that $\{z^k\}$ $(k = 1, 2, \dots)$ forms an infinite set in $\bar{S}(z, \epsilon)$ with no limit point, and $\bar{S}(z, \epsilon)$ cannot be compact. ∎

Exercises 2.6

1. Show that $s$ is bounded but not compact. If $x = \{x_i\} \in s$, and $E = \{y: |y_i| \le |x_i|\}$, show that $E$ is compact in $s$, but show that $s$ is not locally compact.

2. Show that each of the spaces $M$, $c$, $m$, $s$ is complete.

3. Show that each of the spaces $C$, $c$, $s$ is separable, but that $m$ is not separable.
Hint for $C$. Consider the set of functions which take rational values at each of a finite set of rationals in $[0, 1]$ and are defined by linear interpolation between these points.

4. $C^*(X)$ denotes the space of functions $f: X \to R$ which are continuous and bounded. Show that $C^*(R)$ is not separable by considering continuous functions which take the values $+1$, $-1$ on disjoint subsets $Z_1$, $Z_2$ of the set $Z$ of positive integers and are defined elsewhere by linear interpolation. (The distance between any two such functions is 2, and there are $c$ of them.)
However, let $\mathscr{D}$ be the subset of $C^*(R)$ consisting of those functions $f$ for which $\lim_{x \to +\infty} f(x)$, $\lim_{x \to -\infty} f(x)$ both exist. Prove that $\mathscr{D}$ is separable.

5. Let $l_2$ be the subset of $s$ such that $\sum_{i=1}^{\infty} x_i^2$ converges. In the linear structure of $s$ show that $l_2$ is a linear space and that

$$\|x\| = \Big(\sum_{i=1}^{\infty} x_i^2\Big)^{1/2}$$

defines a norm. In the topology of this norm show that $l_2$ is separable. The subset $\{x: |x_i| \le 1/i\}$ of $l_2$ is known as the Hilbert cube: prove it is compact.
Hint. Starting with an infinite sequence in the Hilbert cube pick a subsequence in which the first coordinate converges, then successive subsequences in which the 2nd, 3rd, ..., $n$th coordinate converges. Show that the sequence to which these coordinates converge is in the Hilbert cube and is a limit point of the original set.

2.7 Cantor set

We now digress briefly from the study of general situations and consider the definition and properties of a special subset of $R$ first considered by Cantor. This set and associated functions will be useful in the sequel to provide counter-examples to several conjectures which are plausible but false. If we denote the open interval $((3r - 2)/3^n, (3r - 1)/3^n)$ by $E_{n,r}$, put

$$G_n = \bigcup_{r=1}^{3^{n-1}} E_{n,r}, \qquad G = \bigcup_{n=1}^{\infty} G_n;$$

it is clear that $G$ is an open subset of $[0, 1]$. Its complement

$$C = [0, 1] - G$$

is called the Cantor set. From its definition $C$ is closed.

Lemma. The Cantor set $C$ is nowhere dense and perfect.

Proof. If we express points $x \in [0, 1]$ in the form

$$x = \sum_{i=1}^{\infty} \frac{a_i}{3^i} \quad (a_i = 0, 1, 2), \tag{2.7.1}$$

then the set $G_n$ above is the set of $x$ for which $a_n = 1$. Hence, the set $C$ consists of precisely those real numbers which have a representation in the form (2.7.1) with each $a_i = 0$ or 2. Given a point $x_1 \in C$, altering the $n$th term $a_n$ (replacing 0, 2 by 2, 0 respectively) gives a new point $x_2$ in $C$ such that

$$|x_1 - x_2| = 2 \cdot 3^{-n}.$$

This shows that every point of $C$ is a limit point of $C$, so that $C$ is perfect. If $H$ is any open set with $H \cap [0, 1]$ not void then $H \cap [0, 1]$ contains an open interval $I$ of length $\delta > 0$. If $\delta > 3^{1-n}$, then $I$ must contain an interval $E_{n,r}$ so that $H$ contains an open set disjoint from $C$. This proves that $C$ is nowhere dense. ∎

From the above lemma and example 2.5 (7) we can deduce that $C$ is not countable. However, one can prove that $C$ must have the same cardinal as the continuum $[0, 1]$ by considering the following mapping:

if $x \in [0, 1]$, put

$$x = \sum_{i=1}^{\infty} \frac{b_i}{2^i} \quad (b_i = 0 \text{ or } 1),$$

where the sequence $\{b_i\}$ does not satisfy $b_i = 1$ for $i \ge N$. Put

$$f(x) = \sum_{i=1}^{\infty} \frac{a_i}{3^i}, \qquad a_i = 0 \text{ if } b_i = 0, \quad a_i = 2 \text{ if } b_i = 1.$$

Then $f: [0, 1] \to C$ is (1, 1) and maps $[0, 1]$ onto a (proper) subset of $C$. Since $C \subset [0, 1]$ the cardinal of $C$ is $c$. We can think of $f$ as a function on $[0, 1]$ to $[0, 1]$. It is easy to see that $f$ is monotonic, that is,

$$x_1 < x_2 \Rightarrow f(x_1) < f(x_2),$$

so that, for each $y \in [0, 1]$,

$$f^{-1}[0, y] = [0, z]. \tag{2.7.2}$$

If $z$ is defined by (2.7.2), then we say that $z = g(y)$. This defines $g: [0, 1] \to [0, 1]$ as a monotonic function which is clearly constant on each of the sets $E_{n,r}$. In fact

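if the $2^{n-1}$ intervals $E_{n,r}$ which are not contained in $G_1 \cup \dots \cup G_{n-1}$ are numbered $k = 1, 2, \dots, 2^{n-1}$ from left to right, then $g(y) = (2k - 1)/2^n$ on the closure of the $k$th of them; for example,

$$\tfrac13 \le y \le \tfrac23 \Rightarrow g(y) = \tfrac12.$$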

The function $g$ is continuous and monotonic increasing, for

$$0 \le y_1 - y_2 < 3^{-n-1} \Rightarrow 0 \le g(y_1) - g(y_2) < 2^{-n}.$$

Since the function $g$ is constant in each $E_{n,r}$ it follows that it is differentiable with zero derivative at each point of $G$. One can easily see that $g$ increases at each point of $C$, and in fact the 'upper derivative' at points of $C$ is $+\infty$. Note that there is nothing magical about the integer 3 used in the construction of $C$. Similar constructions using expansions to a different base will give sets with similar properties.
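The behaviour of $g$ can be explored with a small computation. The sketch below is one common way to evaluate such a function from ternary digits, offered as an illustration rather than as the construction used in the text: it reads off the ternary digits of $y$, stops at the first digit 1 (where $y$ lies in the closure of a removed interval), and otherwise halves each digit to obtain binary digits. The name `cantor_g` and the truncation parameter `depth` are assumptions of this sketch.

```python
from fractions import Fraction

def cantor_g(y, depth=40):
    """Approximate g(y) for y in [0, 1] from the first `depth` ternary digits."""
    y = Fraction(y)
    if y == 1:
        return Fraction(1)
    total = Fraction(0)
    for i in range(1, depth + 1):
        y *= 3
        digit = int(y)          # next ternary digit: 0, 1 or 2
        y -= digit
        if digit == 1:
            # y lies in the closure of a removed interval, where g is constant
            return total + Fraction(1, 2 ** i)
        total += Fraction(digit // 2, 2 ** i)   # ternary 0, 2 -> binary 0, 1
    return total

print(cantor_g(Fraction(1, 2)))   # value on the middle third
print(cantor_g(Fraction(7, 9)))   # value on the interval (7/9, 8/9)
```

Exact rational arithmetic is used so that points with terminating ternary expansions are handled without rounding; for other points the result is correct only to within $2^{-\text{depth}}$.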

Exercises 2.7

1. If $x = \sum_{i=1}^{\infty} c_i/10^i$ $(c_i = 0, 1, \dots, 9)$ is a decimal expansion of real numbers in $[0, 1]$ and $T$ is the set of such $x$ for which $c_i \neq 7$, show that $T$ is perfect and nowhere dense.

2. Construct a set which is dense in $[0, 1]$ and yet the union of a countable class of nowhere dense perfect sets.

3. Show that the function $g: [0, 1] \to [0, 1]$ defined above satisfies a Lipschitz condition of order $\alpha = \log 2/\log 3$, but not of any order $\beta > \alpha$. (A function $h: I \to R$ is said to satisfy a Lipschitz condition of order $\alpha$ at $x_0 \in I$ if $|h(x) - h(x_0)| \le K |x - x_0|^{\alpha}$ for $x \in I$ and some suitable $K \in R$.)

3

SET FUNCTIONS

3.1 Types of set function

We consider only functions $\mu: \mathscr{C} \to R^*$, where $\mathscr{C}$ is a non-empty class of sets. Thus $\mu$ is a rule which determines, for each $E \in \mathscr{C}$, a unique element $\mu(E)$ which is either a real number or $\pm\infty$. We always assume that $\mathscr{C}$ contains the empty set $\emptyset$. $R^*$ denotes the compactification of the real number field $R$ by the addition of two points $+\infty$, $-\infty$, while $R^+$ will denote the set of non-negative real numbers together with $+\infty$. It is not possible to arrange for $R^*$ to be an algebraic field extending $R$, though we will preserve as many of the algebraic properties of $R$ as possible by adopting the convention that, for any $a \in R$,

$$-\infty < a < +\infty;$$
$$a(+\infty) = +\infty, \quad a(-\infty) = -\infty \quad \text{if } a > 0;$$
$$a(\pm\infty) = 0 \quad \text{if } a = 0;$$
$$a(+\infty) = -\infty, \quad a(-\infty) = +\infty \quad \text{if } a < 0;$$
$$(+\infty)(+\infty) = +\infty, \quad (+\infty)(-\infty) = -\infty, \quad (-\infty)(-\infty) = +\infty.$$

Thus the operation of dividing by $(\pm\infty)$ is not allowed and $(+\infty) + (-\infty)$ is not defined in $R^*$. All these definitions are natural except for the convention $0(\pm\infty) = 0$.
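The convention $0(\pm\infty) = 0$ differs from ordinary floating-point arithmetic, where the same product is "not a number". The sketch below shows one way the conventions could be encoded; the helper names `ext_mul` and `ext_add` are hypothetical, and treating every sum other than $(+\infty) + (-\infty)$ by ordinary float addition is an assumption of the sketch.

```python
import math

INF = math.inf

def ext_mul(a, b):
    """Product in R* with the convention 0 * (+inf or -inf) = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b   # float arithmetic already gives the remaining sign rules

def ext_add(a, b):
    """Sum in R*; (+inf) + (-inf) is left undefined and raises an error."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("(+inf) + (-inf) is not defined in R*")
    return a + b

print(math.isnan(0 * INF))   # plain floats do not follow the convention
print(ext_mul(0, INF))       # the convention of the text
print(ext_mul(-2, INF))
```

Raising an error for the undefined sum mirrors the requirement, used repeatedly below, that sums of values of a set function must have a unique meaning in $R^*$.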

Arbitrary set functions $\mu: \mathscr{C} \to R^*$ are not of much interest. We adopt conditions on $\mu$ which correspond to our intuitive idea of mass for a physical object (we generalise the notion to allow negative masses). The first property we define corresponds to the notion that the mass of a pair of disjoint objects is the sum of the masses of the individual objects.

Additive set function

A set function $\mu: \mathscr{C} \to R^*$ is said to be (finitely) additive if

(i) $\mu(\emptyset) = 0$,
(ii) for every finite collection $E_1, E_2, \dots, E_n$ of disjoint sets of $\mathscr{C}$ such that $\bigcup_{i=1}^{n} E_i \in \mathscr{C}$ we have

$$\mu\Big(\bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n} \mu(E_i). \tag{3.1.1}$$

In this definition condition (i) is almost redundant for it will be implied by (ii) provided there is at least one $E \in \mathscr{C}$ with $\mu(E)$ finite. Note also that we do not in the definition assume that $\mathscr{C}$ is closed under finite disjoint unions so that, in testing a given set function $\mu: \mathscr{C} \to R^*$ to see whether or not it is additive, we can only use sets $E_i \in \mathscr{C}$ which are disjoint and have their union in $\mathscr{C}$. However, the definition is taken to imply that the right-hand side of (3.1.1) has a unique meaning in $R^*$ so that in particular there are no sets $E, F \in \mathscr{C}$ such that $E \cap F = \emptyset$, $E \cup F \in \mathscr{C}$, $\mu(E) = +\infty$, $\mu(F) = -\infty$. The natural domain of definition for an additive set function $\mu$ is a ring since, if $\mathscr{C}$ is a ring,

$$E_i \in \mathscr{C} \ (i = 1, 2, \dots, n) \Rightarrow \bigcup_{i=1}^{n} E_i \in \mathscr{C}.$$

For a ring $\mathscr{C}$ it is worth noticing that $\mu: \mathscr{C} \to R^*$ is additive if and only if $\mu(\emptyset) = 0$ and

$$E, F \in \mathscr{C}, \ E \cap F = \emptyset \Rightarrow \mu(E \cup F) = \mu(E) + \mu(F),$$

since in this case the general result (3.1.1) can be obtained from the result for two sets by a simple induction argument. When $\mathscr{C}$ is a finite class of sets it is easy to give examples of set functions defined on $\mathscr{C}$ which are additive. We now give a number of less trivial examples which will be useful for illustrating our later definitions. In each case the reader should check that $\mu: \mathscr{C} \to R^*$ is additive.

Example 1. $\Omega$ any space with infinitely many points, $\mathscr{C}$ the class of all subsets of $\Omega$. Define $\mu$ by
$\mu(E)$ = number of points in $E$, if $E$ is finite;
$\mu(E) = +\infty$, if $E$ is infinite.

Example 2. $\Omega$ any topological space, $\mathscr{C}$ the class of all subsets of $\Omega$. Put
$\mu(E) = 0$, if $E$ is of the first category in $\Omega$;
$\mu(E) = +\infty$, if $E$ is of the second category in $\Omega$.

Example 3. $\Omega = R$, $\mathscr{C}$ the class of all finite intervals of $R$. For $E = [a, b]$ or $[a, b)$ or $(a, b]$ or $(a, b)$, put

$$\mu(E) = b - a.$$

Example 4. $\Omega$ is any space with at least two distinct points $n$, $s$; $\mathscr{C}$ is the class of all subsets of $\Omega$.
$\mu(E) = 0$, if $E$ contains neither or both of $n$, $s$;
$\mu(E) = 1$, if $E$ contains $n$ but not $s$;
$\mu(E) = -1$, if $E$ contains $s$ but not $n$.

Example 5. $\Omega = (0, 1]$, the set of real numbers $x$ with $0 < x \le 1$, $\mathscr{C}$ the class of half-open intervals $(a, b]$ where $0 \le a < b \le 1$.
$\mu(a, b] = b - a$ if $a \neq 0$; $\mu(0, b] = +\infty$.

Example 6. $\Omega$ is any infinite space, $\mathscr{C}$ the class of all its subsets. Let $x_1, x_2, \dots, x_n, \dots$ be an enumerable sequence of distinct points of $\Omega$, and suppose $p_1, p_2, \dots$ is a sequence of real numbers such that $\sum_{i=1}^{\infty} p_i$ either converges absolutely or is properly divergent to $+\infty$ or $-\infty$ (the case $\sum p_i$ convergent, $\sum |p_i|$ divergent is not allowed: why?). Put

$$\mu(E) = \sum p_i,$$

where the sum extends over all integers $i = 1, 2, \dots$ for which $x_i \in E$. Any set function which can be defined as in example 6 is called discrete. Note that example (4) can be thought of as a special case of example (6).
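A discrete set function is easy to model when only finitely many points carry a weight, so that no question of convergence arises. The sketch below makes that simplifying assumption; the name `discrete_mu` and the dictionary representation of the weights $p_i$ are illustrative choices. It also exhibits example (4) as the special case with weights $+1$ at $n$ and $-1$ at $s$.

```python
def discrete_mu(weights):
    """Return mu with mu(E) = sum of weights[x] over weighted points x in E.

    `weights` maps finitely many points x_i to real numbers p_i
    (a simplification of example 6, which allows infinitely many).
    """
    def mu(E):
        return sum(p for x, p in weights.items() if x in E)
    return mu

# Example (4) as a special case: weight +1 at n, weight -1 at s.
mu4 = discrete_mu({"n": 1, "s": -1})

E = {"n", "a"}
F = {"s"}
print(mu4(E), mu4(F), mu4(E | F))   # additivity on the disjoint pair E, F
```

Additivity holds here because each weighted point of a disjoint union is counted in exactly one of the pieces.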

Although it is not sufficient to restrict our attention to set functions $\mu: \mathscr{C} \to R$ which are finite valued, the condition of additivity which is usually assumed prevents $\mu$ from taking both the values $+\infty$, $-\infty$ at least when $\mathscr{C}$ is a ring. This is one of the results in the next theorem.

Theorem 3.1. Suppose $\tau: \mathscr{C} \to R^*$ is an additive set function defined on a ring $\mathscr{C}$ and $E, F \in \mathscr{C}$. Then

(i) if $E \supset F$ and $\tau(F)$ is finite, $\tau(E - F) = \tau(E) - \tau(F)$;
(ii) if $E \supset F$ and $\tau(F)$ is infinite, $\tau(E) = \tau(F)$;
(iii) if $\tau(E) = +\infty$, then $\tau(F) \neq -\infty$.

Proof. (i) Since $\mathscr{C}$ is a ring, $E - F \in \mathscr{C}$ and additivity implies, since $F \cap (E - F) = \emptyset$,

$$\tau(E) = \tau(E - F) + \tau(F). \tag{3.1.2}$$

Subtracting the finite real number $\tau(F)$ gives the result.

(ii) If $\tau(F) = +\infty$, then (3.1.2) can only have a meaning if $\tau(E - F) \neq -\infty$, and this implies $\tau(E) = +\infty$. The case $\tau(F) = -\infty$ is similar.

(iii) Since $E \cap F$, $E - F$, $F - E$ are disjoint sets of $\mathscr{C}$,

$$\tau(E) = \tau(E \cap F) + \tau(E - F) = +\infty, \qquad \tau(F) = \tau(E \cap F) + \tau(F - E) = -\infty$$

could only have meaning if $\tau(E \cap F)$ is finite. But this would imply $\tau(E - F) = +\infty$, $\tau(F - E) = -\infty$, and then, since $E \,\triangle\, F \in \mathscr{C}$,

$$\tau(E \,\triangle\, F) = \tau(E - F) + \tau(F - E) = +\infty + (-\infty),$$

which is impossible. ∎

Our definition of additivity means that for $\mu: \mathscr{C} \to R^*$ to be additive any set $E_0 \in \mathscr{C}$ which can be split into a finite number of disjoint subsets in $\mathscr{C}$ must be such that $\mu(E_0)$ is the same as the sum of the values of $\mu$ on the 'pieces'. We often want this to be true for a dissection of $E_0$ into a countably infinite collection of subsets in $\mathscr{C}$.

$\sigma$-additive set function

A set function $\mu: \mathscr{C} \to R^*$ is said to be $\sigma$-additive (sometimes called completely additive, or countably additive) if

(i) $\mu(\emptyset) = 0$,
(ii) for any disjoint sequence $E_1, E_2, \dots$ of sets of $\mathscr{C}$ such that $E = \bigcup_{i=1}^{\infty} E_i \in \mathscr{C}$,

$$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i). \tag{3.1.3}$$

As before the condition (i) is redundant if $\mu$ takes any finite values.

Since we may assume that all but a finite number of the sequence $\{E_i\}$ are void it is clear that any set function which is $\sigma$-additive is also additive. To see that the converse is not true it is sufficient to consider example (5). Put

$$E = (0, 1], \qquad E_n = \Big(\frac{1}{n+1}, \frac{1}{n}\Big] \quad (n = 1, 2, \dots);$$

then $\{E_n\}$ is a disjoint sequence in $\mathscr{C}$ whose union $E$ is in $\mathscr{C}$ but

$$+\infty = \mu(E) \neq 1 = \sum_{n=1}^{\infty} \Big(\frac{1}{n} - \frac{1}{n+1}\Big) = \sum_{n=1}^{\infty} \mu(E_n).$$
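The series in this counter-example telescopes, which can be confirmed with exact rational arithmetic. In the sketch below the function names `mu` and `partial_sum` are illustrative; `mu` encodes the set function of example (5) on an interval $(a, b]$ given by its endpoints.

```python
from fractions import Fraction

def mu(a, b):
    """Example (5): mu(a, b] = b - a if a != 0, and +infinity if a == 0."""
    return float("inf") if a == 0 else b - a

def partial_sum(N):
    """Sum of mu(E_n) over E_n = (1/(n+1), 1/n] for n = 1, ..., N."""
    return sum(mu(Fraction(1, n + 1), Fraction(1, n)) for n in range(1, N + 1))

print(partial_sum(10))   # equals 1 - 1/11
print(mu(0, 1))          # the value on the whole interval (0, 1]
```

Every partial sum equals $1 - 1/(N+1)$, so the series converges to 1 while $\mu(0, 1] = +\infty$.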

Notice further that even when $\mathscr{C}$ is a ring it does not follow that

$$E_i \in \mathscr{C} \ (i = 1, 2, \dots) \Rightarrow E = \bigcup_{i=1}^{\infty} E_i \in \mathscr{C};$$

so that in testing (3.1.3) we can only use those sets $E \in \mathscr{C}$ which can be split into a countable sequence of disjoint subsets in $\mathscr{C}$. In particular if $\mathscr{C}$ is a finite class of sets then additivity for $\mu: \mathscr{C} \to R^*$ implies $\sigma$-additivity. We also interpret (3.1.3) to mean that the right-hand side is uniquely defined and independent of the order of the sets $E_i$; thus if $\mu$ is $\sigma$-additive and there is a decomposition $E = \bigcup_{i=1}^{\infty} E_i$, we cannot have $\mu(E_i) = +\infty$, $\mu(E_j) = -\infty$, nor can the series in (3.1.3) converge conditionally. It is easy to check that each of the set functions in examples (1), (2), (4), (6) is $\sigma$-additive, and the set function of example (3) is also $\sigma$-additive though the proof of this fact is non-trivial. This proof will be given in detail in §3.4, as it is an essential step in the definition of Lebesgue measure in $R$.

Measure

Any non-negative set function $\mu: \mathscr{C} \to R^+$ which is $\sigma$-additive is called a measure on $\mathscr{C}$ ($R^+ = \{x \in R^*: x \ge 0\}$).

We should remark that there is not general agreement in the literature as to which set functions ought to be called measures. According to our definition the set functions in examples (1), (2) and (3) are measures, those in (4) and (6) are not because $\mu$ can take negative values while the set function in (5) is not because it is not $\sigma$-additive.

The natural domain of definition of a measure, or indeed of any $\sigma$-additive set function, is a $\sigma$-ring since then

$$E_i \in \mathscr{C} \ (i = 1, 2, \dots) \Rightarrow \bigcup_{i=1}^{\infty} E_i \in \mathscr{C}.$$

However, we will not restrict our consideration to set functions already defined on a $\sigma$-ring. Given a set function $\mu: \mathscr{C} \to R^*$ where $\mathscr{C}$ is a ring it is usually quite easy to check whether or not $\mu$ is additive for one only has to check (3.1.1) for $n = 2$. In order to check that it is also $\sigma$-additive it is useful to have a characterisation of $\sigma$-additive $\mu$ in terms of a continuity condition for monotone sequences of sets. Since we have seen already (theorem 3.1) that such set functions cannot attain both values $+\infty$, $-\infty$ there will be no loss of generality in assuming that $-\infty < \mu(E) \le +\infty$ for all $E \in \mathscr{C}$.

Continuity

Suppose $\mathscr{R}$ is a ring and $\mu: \mathscr{R} \to R^*$ is additive with $\mu(E) > -\infty$ for all $E \in \mathscr{R}$. Then for any $E \in \mathscr{R}$ we say that:

(i) $\mu$ is continuous from below at $E$ if

$$\lim_{n \to \infty} \mu(E_n) = \mu(E) \tag{3.1.4}$$

for every monotone increasing sequence $\{E_n\}$ of sets in $\mathscr{R}$ which converges to $E$;

(ii) $\mu$ is continuous from above at $E$ if (3.1.4) is satisfied for any monotone decreasing sequence $\{E_n\}$ in $\mathscr{R}$ with limit $E$ which is such that $\mu(E_n) < \infty$ for some $n$;

(iii) $\mu$ is continuous at $E$ if it is continuous at $E$ from below and from above (when $E = \emptyset$ the first requirement is trivially satisfied).

Theorem 3.2. Suppose $\mathscr{R}$ is a ring and $\mu: \mathscr{R} \to R^*$ is additive with $\mu(E) > -\infty$ for all $E \in \mathscr{R}$.

(i) If $\mu$ is $\sigma$-additive, then $\mu$ is continuous at $E$ for all $E \in \mathscr{R}$;
(ii) if $\mu$ is continuous from below at every set $E \in \mathscr{R}$, then $\mu$ is $\sigma$-additive;
(iii) if $\mu$ is finite and continuous from above at $\emptyset$, then $\mu$ is $\sigma$-additive.

Proof. (i) If $\mu(E_n) = +\infty$ for $n = N$ and $\{E_n\}$ is monotone increasing then $\mu(E) = +\infty$ and $\mu(E_n) = +\infty$ for $n \ge N$ by theorem 3.1 (ii), where $E = \lim E_n$. Thus in this case $\mu(E_n) \to \mu(E)$ as $n \to \infty$. On the other hand, if $\mu(E_n) < \infty$ for all $n$ and $\{E_n\}$ increases to $E$, then

$$E = E_1 \cup \bigcup_{n=1}^{\infty} (E_{n+1} - E_n)$$

is a disjoint decomposition of $E$ and

$$\mu(E) = \mu(E_1) + \sum_{n=1}^{\infty} \mu(E_{n+1} - E_n) = \mu(E_1) + \lim_{N \to \infty} \sum_{n=1}^{N} \mu(E_{n+1} - E_n) = \lim_{N \to \infty} \mu(E_N),$$

since $\mu$ is additive on the ring $\mathscr{R}$. Thus $\mu$ is continuous from below at $E$. Now suppose $\{E_n\}$ decreases to $E$ and $\mu(E_N) < +\infty$. Put

$$F_n = E_N - E_n \quad \text{for } n \ge N.$$

Then, by theorem 3.1 (ii), $\mu(F_n) < \infty$ and the sequence $\{F_n\}$ is monotone increasing to $E_N - E$. Hence, as $n \to \infty$,

$$\mu(F_n) \to \mu(E_N - E) = \mu(E_N) - \mu(E).$$

But $\mu(F_n) = \mu(E_N) - \mu(E_n)$ so that $\mu(E_n) \to \mu(E)$ as $n \to \infty$, since $\mu(E_N)$ is finite, and $\mu$ is also continuous from above at $E$.

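(ii) Suppose $E \in \mathscr{R}$, $E_i \in \mathscr{R}$ $(i = 1, 2, \dots)$ are such that $E = \bigcup_{i=1}^{\infty} E_i$ and the sets $E_i$ are disjoint. Put

$$F_n = \bigcup_{i=1}^{n} E_i \in \mathscr{R} \quad (n = 1, 2, \dots),$$

and $\{F_n\}$ is a monotone increasing sequence of sets in $\mathscr{R}$ which converges to $E \in \mathscr{R}$. If $\mu$ is continuous from below at $E$,

$$\sum_{i=1}^{n} \mu(E_i) = \mu(F_n) \to \mu(E) \quad \text{as } n \to \infty,$$

so that

$$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i),$$

and $\mu$ is $\sigma$-additive.

(iii) In the notation of (ii) put

$$G_n = E - F_n \in \mathscr{R} \quad (n = 1, 2, \dots).$$

Then $\{G_n\}$ is a monotone decreasing sequence converging to $\emptyset$ and, for $n = 1, 2, \dots$,

$$\mu(E) = \sum_{i=1}^{n} \mu(E_i) + \mu(G_n).$$

If $\mu$ is finite and continuous from above at $\emptyset$ we must have $\mu(G_n) \to 0$ as $n \to \infty$, so that again

$$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i). \ ∎$$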

Remark 1. In our definition of continuity from above we only require to have $\mu(E_n) \to \mu(E)$ for those sequences $\{E_n\}$ which decrease to $E$ for which $\mu(E_n)$ is finite for some $n$. To see that we could not relax this finiteness condition, consider example (2), which we have already seen to be $\sigma$-additive, with $\Omega = (0, 1)$. Then if

$$E_n = (0, 1/n) \quad (n = 1, 2, \dots)$$

we have a sequence decreasing to $\emptyset$ such that $\mu(E_n) = +\infty$ for all $n$ since $E_n$ is of the second category.

Remark 2. The condition that $\mu$ be finite cannot be omitted in theorem 3.2 (iii). Consider example (5), which we saw was additive but not $\sigma$-additive. Actually the class $\mathscr{C}$ of sets on which $\mu$ is defined is a semi-ring rather than a ring, but its definition can easily be extended to the ring of finite disjoint unions of sets in $\mathscr{C}$ by using theorem 3.4. It is easy to check that it will remain continuous from above at $\emptyset$, but not $\sigma$-additive.

Part (iii) of theorem 3.2 will prove very useful in practice, especially for finite valued set functions $\mu: \mathscr{R} \to R^+$ which are non-negative and additive on a ring $\mathscr{R}$. In order to prove that such a $\mu$ is a measure it is sufficient to show that, if $\{E_n\}$ is any sequence of sets in $\mathscr{R}$ decreasing to $\emptyset$,

$$\mu(E_n) \to 0 \quad \text{as } n \to \infty. \tag{3.1.5}$$

If (3.1.5) is false for some such sequence $\{E_n\}$ then, since $\mu(E_n)$ is monotone decreasing we must have

$$\mu(E_n) \to \delta > 0 \quad \text{as } n \to \infty. \tag{3.1.6}$$

If we can establish a contradiction by assuming (3.1.6), then (3.1.5) will be proved and we will have deduced that $\mu$ is a measure.

When we come to consider particular set functions one of our objectives will be to define $\mu$ on as large a class $\mathscr{C}$ as possible. We will also want $\mu$ to be $\sigma$-additive. It would be desirable to define $\mu$ on the class of all subsets of $\Omega$, but unfortunately this is not possible if $\Omega$ is not countable and $\mu$ is to have an interesting structure. In particular it has been shown, using the continuum hypothesis, that it is impossible to define a measure $\mu$ on all subsets of the real line such that (a) sets consisting of a single point have zero measure (this eliminates discrete set functions like examples (1), (4), (6)); (b) every set of infinite measure has a subset of finite positive measure (this eliminates example (2)); (c) the measure of the whole space is not zero. In practice the method used is to define $\mu$ with desired properties on a restricted class of sets $\mathscr{C}$ (as in examples (3) or (5)) and then extend the definition to a larger class $\mathscr{D} \supset \mathscr{C}$.

Extension

Given two classes $\mathscr{C} \subset \mathscr{D}$ of subsets of $\Omega$ and set functions $\mu: \mathscr{C} \to R^*$, $\nu: \mathscr{D} \to R^*$ we say that $\nu$ is an extension of $\mu$ if, for all $E \in \mathscr{C}$, $\nu(E) = \mu(E)$; under the same conditions we say that $\mu$ is the restriction of $\nu$ to $\mathscr{C}$.

It is sometimes appropriate (as in probability theory) to work with set functions $\mu$ which are finite. However, most of the theorems which can be proved for finite $\sigma$-additive set functions can also be obtained with a condition slightly weaker than finiteness.

$\sigma$-finite set function

A set function $\mu: \mathscr{C} \to R^*$ is said to be $\sigma$-finite if, for each $E \in \mathscr{C}$, there is a sequence of sets $C_i$ $(i = 1, 2, \dots) \in \mathscr{C}$ such that $E \subset \bigcup_{i=1}^{\infty} C_i$ and $\mu(C_i)$ is finite for all $i$. In our examples, the set functions in (3), (4), and (5) are all $\sigma$-finite, (1) gives a $\sigma$-finite measure if and only if $\Omega$ is countable, (2) is not $\sigma$-finite if $\Omega$ is of the second category, (6) is finite if $\sum |p_i|$ converges and otherwise it is $\sigma$-finite.

Sometimes it is useful to relax the condition of additivity in order to be able to define $\mu$ on the class of all subsets. The most common example of this is in the concept of outer measure.

Outer measure

If $\mathscr{C}$ is the class of all subsets of $\Omega$, then $\mu: \mathscr{C} \to R^+$ is called an outer measure on $\Omega$ if

(i) $\mu(\emptyset) = 0$;
(ii) $\mu$ is monotone in the sense that $E \subset F \Rightarrow \mu(E) \le \mu(F)$;
(iii) $\mu$ is countably subadditive in the sense that for any sequence $\{E_i\}$ of sets,

$$E \subset \bigcup_{i=1}^{\infty} E_i \Rightarrow \mu(E) \le \sum_{i=1}^{\infty} \mu(E_i). \tag{3.1.7}$$

Note that every measure on the class of all subsets of a space $\Omega$ is an outer measure on $\Omega$. However, it is not difficult to give examples of outer measures which are not measures.

Example 7. $\Omega$ any space with more than one point. Put $\mu(\emptyset) = 0$, $\mu(E) = 1$ for all $E \neq \emptyset$.

In this book we do not study the properties of outer measures for their own sake, but we will use them as a tool to extend the definition of measures.

Exercises 3.1

1. If $\Omega = [0, 1)$ and $\mathscr{C}$ consists of the 6 sets

0,

Q,

[0, '),

[0, 1),

[0, 1),

li(o)=0, u[0,1)=2, p[0,1)=2, ,a[0,1) = 4,

fz[4,1) = 2,

$\mu(\Omega) = 4$, show that $\mu$ is additive on $\mathscr{C}$. Can $\mu$ be extended to an additive set function on the ring generated by $\mathscr{C}$?

2. Show that if $\mathscr{R}$ is any finite ring of subsets of $\Omega$ and $\mu$ is additive on $\mathscr{R}$ then $\mu$ is $\sigma$-additive on $\mathscr{R}$.

3. A set function $\mu: \mathscr{C} \to R^*$ is said to be monotone if $\mu(\emptyset) = 0$ and $E \subset F$, $E, F \in \mathscr{C} \Rightarrow \mu(E) \le \mu(F)$. Show that monotone set functions are non-negative, and if $\mathscr{C}$ is a ring, show that an additive non-negative set function is monotone. Of the set functions in examples (1)-(7), which are monotone?

4. $Z$ is the space of positive integers and $\sum_{n=1}^{\infty} a_n$ is a convergent series of positive terms. If $E$ is a finite subset of $Z$, put $\tau(E) = \sum_{n \in E} a_n$; if $E$ is an infinite subset of $Z$, put $\tau(E) = +\infty$. Show that $\tau$ is additive, but not $\sigma$-additive on the class of all subsets of $Z$.

5. $Z$ is the space of positive integers; for $E \subset Z$ let $r_n(E)$ be the number of integers in $E$ which are not greater than $n$. Let $\mathscr{C}$ be the class of subsets $E$ for which

$$\lim_{n \to \infty} \frac{r_n(E)}{n} = \tau(E)$$

exists. Show that $\tau$ is finitely additive, but not $\sigma$-additive on $\mathscr{C}$, but that $\mathscr{C}$ is not even a semi-ring.

6. If $\mu$ is finitely additive on a ring $\mathscr{R}$; $E, F, G \in \mathscr{R}$ show

$$\mu(E) + \mu(F) = \mu(E \cup F) + \mu(E \cap F),$$
$$\mu(E) + \mu(F) + \mu(G) + \mu(E \cap F \cap G) = \mu(E \cup F \cup G) + \mu(E \cap F) + \mu(F \cap G) + \mu(G \cap E).$$

State and prove a relationship of this kind for $n$ subsets of $\mathscr{R}$.

7. Suppose $\mathscr{S}$ is a $\sigma$-ring of subsets of $\Omega$, $\mu$ is a measure on $\mathscr{S}$. Show that the class of sets $E \in \mathscr{S}$ with $\mu(E)$ finite forms a ring, and the class with $\mu(E)$ $\sigma$-finite forms a $\sigma$-ring.

8. If $E$ is a set in $\mathscr{S}$ of $\sigma$-finite $\mu$-measure (where $\mu$ is a measure on $\mathscr{S}$) and $\mathscr{D} \subset \mathscr{S}$ where $\mathscr{D}$ is a class of disjoint subsets of $E$ show that the subclass of those $D \in \mathscr{D}$ for which $\mu(D) > 0$ is countable.

9. State and prove a version of theorem 3.2 (i) for set functions defined on a semi-ring $\mathscr{C}$.

10. To show that the finiteness condition in the definition of 'continuous from above' in theorem 3.2 (i) cannot be relaxed, consider any infinite space $\Omega$ and put
$\tau(E)$ = number of points in $E$, if $E$ is finite;
$\tau(E) = +\infty$, if $E$ is infinite.
Then $\tau$ is a measure on the class of all subsets of $\Omega$, but for any sequence $\{E_n\}$ of infinite sets which decreases to $\emptyset$, we do not have $\lim \tau(E_n) = 0$.

11. Suppose $\mathscr{P}$ is the semi-ring of half-open intervals $(a, b]$, $Q$ is the set of rationals in $(0, 1]$ and $\mathscr{P}_Q$ is the semi-ring of sets of the form $(a, b] \cap Q$. Put

$$\mu\{(a, b] \cap Q\} = b - a \quad \text{if } 0 \le a \le b \le 1.$$

Show that $\mu$ is additive on $\mathscr{P}_Q$ and is continuous above and below at every set in $\mathscr{P}_Q$, but is not $\sigma$-additive. This shows that theorem 3.2 (ii), (iii) is not true for semi-rings.

12. Show that if $\mu$ is an outer measure on $\Omega$, $E_0$ any fixed subset, then $\mu_0(E) = \mu(E \cap E_0)$ defines another outer measure on $\Omega$.

13. Show that if $\mu$, $\nu$ are outer measures on $\Omega$, so is $\tau$ defined by $\tau(E) = \max[\mu(E), \nu(E)]$.

14. Suppose $\{\phi_n\}$ is a sequence of $\sigma$-additive set functions defined on a $\sigma$-ring $\mathscr{S}$ and that $\lim \phi_n(E) = \phi(E)$ exists for all $E$ in $\mathscr{S}$. Show that $\phi$ is finitely additive on $\mathscr{S}$. If either (i) $\phi_n(E) \to \phi(E)$ uniformly on $\mathscr{S}$ with $\phi(E) > -\infty$ for all $E \in \mathscr{S}$; or (ii) $\phi_1(E) > -\infty$, $\phi_n(E)$ monotone increasing for all $E \in \mathscr{S}$; show that $\phi$ is $\sigma$-additive on $\mathscr{S}$.

3.2 Hahn-Jordan decompositions

When discussing $\sigma$-additive set functions we will usually restrict our attention to the non-negative ones (which we call measures). The present section justifies this procedure by showing that, under reasonable conditions, a 'signed' set function $\mu: \mathscr{C} \to R^*$ which is completely additive can be expressed as the difference of two measures. This means that properties of completely additive set functions can be deduced from the corresponding properties of measures. There are also versions of the decomposition theorem for finitely additive set functions, but we will not consider these.

We have already seen (theorem 3.1 (iii)) that an additive set function defined on a ring cannot take both the values $+\infty$, $-\infty$. If $\mathscr{S}$ is a $\sigma$-ring and $\mu: \mathscr{S} \to R^*$ is completely additive then for any sequence $\{E_i\}$ of disjoint sets in $\mathscr{S}$,

$$\mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} \mu(E_i).$$

Since $\bigcup_{i=1}^{\infty} E_i$ is independent of the order of the sets in the sequence, it follows that the series on the right-hand side must be either absolutely convergent or properly divergent. In the case of example (6) the set function

$$\mu(E) = \sum_{x_i \in E} p_i$$

can be decomposed so that

$$\mu(E) = \mu_+(E) - \mu_-(E),$$

where

$$\mu_+(E) = \sum_{x_i \in E} \max(0, p_i), \qquad \mu_-(E) = -\sum_{x_i \in E} \min(0, p_i)$$

are measures of which at least one is finite. Further if we put $P = \{x: \mu\{x\} > 0\}$, $N = \Omega - P$ we have

$$\mu_+(E) = \mu(P \cap E), \qquad \mu_-(E) = -\mu(N \cap E)$$

for all $E \subset \Omega$, so that the decomposition into the difference of two measures can also be obtained by splitting $\Omega$ into two subsets $P$, $N$ such that $\mu$ is non-negative on every subset of $P$ and non-positive on every subset of $N$. These two aspects of the decomposition are true in general, provided $\mathscr{F}$ is a $\sigma$-field.

Theorem 3.3. Given a completely additive $\tau: \mathscr{F} \to R^*$ defined on a $\sigma$-field $\mathscr{F}$, there are measures $\tau_+$ and $\tau_-$ defined on $\mathscr{F}$ and subsets $P$, $N$ in $\mathscr{F}$ such that $P \cup N = \Omega$, $P \cap N = \emptyset$ and for each $E \in \mathscr{F}$,

$$\tau_+(E) = \tau(E \cap P) \ge 0, \qquad \tau_-(E) = -\tau(E \cap N) \ge 0, \qquad \tau(E) = \tau_+(E) - \tau_-(E);$$

so that $\tau$ is the difference of two measures $\tau_+$, $\tau_-$ on $\mathscr{F}$. At least one of $\tau_+$, $\tau_-$ is finite and, if $\tau$ is finite or $\sigma$-finite so are both $\tau_+$, $\tau_-$.
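For a discrete set function with finitely many weighted points, the two forms of the decomposition can be computed directly. The sketch below assumes that finite setting; the name `jordan_parts` and the dictionary of weights are illustrative, with $P$ taken as the set of points of positive weight and $N$ as everything else.

```python
def jordan_parts(weights):
    """Return (P, tau, tau_plus, tau_minus) for a finitely supported discrete tau.

    `weights` maps points x_i to real numbers p_i. P is the set of points
    with p_i > 0; its complement plays the role of N.
    """
    P = {x for x, p in weights.items() if p > 0}

    def tau(E):
        return sum(p for x, p in weights.items() if x in E)

    def tau_plus(E):
        return sum(max(0, p) for x, p in weights.items() if x in E)

    def tau_minus(E):
        return -sum(min(0, p) for x, p in weights.items() if x in E)

    return P, tau, tau_plus, tau_minus

P, tau, tau_plus, tau_minus = jordan_parts({1: 2, 2: -3, 3: 0, 4: 5})
E = {1, 2, 4, 99}
print(tau(E), tau_plus(E), tau_minus(E))
print(tau_plus(E) == tau(E & P), tau_minus(E) == -tau(E - P))
```

The last line compares the Jordan form (sums of positive and negative parts) with the Hahn form (restriction of $\tau$ to $P$ and to its complement) on a sample set.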

Proof. Since $\tau$ can take at most one of the values $+\infty$, $-\infty$ we may assume without loss of generality that, for all $E \in \mathscr{F}$,

$$-\infty < \tau(E) \le +\infty.$$

We first prove that, if $E \in \mathscr{F}$ and

$$\lambda(E) = \inf_{B \subset E,\ B \in \mathscr{F}} \tau(B), \tag{3.2.1}$$

then $\lambda(\Omega) \neq -\infty$. If this is false then there is a set $B_1 \in \mathscr{F}$ for which $\tau(B_1) < -1$. At least one of $\lambda(B_1)$, $\lambda(\Omega - B_1)$ must be $-\infty$; since $\lambda(A \cup B) \ge \lambda(A) + \lambda(B)$ if $A$, $B$ are disjoint sets of $\mathscr{F}$. Put $A_1$ equal to $B_1$ if $\lambda(B_1) = -\infty$ and $(\Omega - B_1)$ otherwise. Proceed by induction. For each positive integer $n$, choose $B_{n+1} \subset A_n$ such that

$$\tau(B_{n+1}) < -(n + 1).$$

If $\lambda(B_{n+1}) = -\infty$, put $A_{n+1} = B_{n+1}$; otherwise put $A_{n+1} = A_n - B_{n+1}$. Then $\lambda(A_{n+1}) = -\infty$.

There are two possible cases:

(i) for infinitely many integers $n$, $A_n = A_{n-1} - B_n$;
(ii) for $n \ge n_0$, $A_n = B_n$.

In case (i) there is a subsequence $\{B_{n_i}\}$ of disjoint sets and

$$\tau\Big(\bigcup_{i=1}^{\infty} B_{n_i}\Big) = \sum_{i=1}^{\infty} \tau(B_{n_i}),$$

so that $\tau$ takes the value $-\infty$ on $E = \bigcup_{i=1}^{\infty} B_{n_i} \in \mathscr{F}$, contrary to assumption. In case (ii), the set

$$E = \bigcap_{n=n_0}^{\infty} B_n \in \mathscr{F}$$

and since $\{B_n\}$ is then a decreasing sequence of sets we have

$$\tau(E) = \lim_{n \to \infty} \tau(B_n) = -\infty,$$

again giving a contradiction.

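Since $\tau(\emptyset) = 0$, $\lambda(\Omega) \le 0$ so that $\lambda = \lambda(\Omega)$ is finite and we can find a sequence $\{C_n\}$ of sets in $\mathscr{F}$ for which

$$\tau(C_n) < \lambda + 2^{-n}.$$

Now consider the set $C_n \cap C_{n+1}$. By noting that

$$C_n \cup C_{n+1} = (C_n - C_n \cap C_{n+1}) \cup (C_{n+1} - C_n \cap C_{n+1}) \cup (C_n \cap C_{n+1})$$

is a decomposition into disjoint sets, it follows that

$$\tau(C_n \cap C_{n+1}) = \tau(C_n) + \tau(C_{n+1}) - \tau(C_n \cup C_{n+1}) < \lambda + 2^{-n} + \lambda + 2^{-n-1} - \lambda = \lambda + 2^{-n} + 2^{-n-1}.$$

This argument can be repeated for the pair of sets $(C_n \cap C_{n+1})$ and $C_{n+2}$: by induction it can be proved that

$$\tau\Big(\bigcap_{r=n}^{m} C_r\Big) < \lambda + \sum_{r=n}^{m} 2^{-r}.$$

If we put $D_n = \bigcap_{r=n}^{\infty} C_r$ we have $D_n \in \mathscr{F}$ and, by theorem 3.2 (i),

$$\tau(D_n) \le \lambda + 2^{1-n}.$$

But now $\{D_n\}$ is a monotone increasing sequence of sets in $\mathscr{F}$ so that

$$N = \lim_{n \to \infty} D_n = \liminf_{n \to \infty} C_n \in \mathscr{F},$$

and by theorem 3.2 (i),

$$\tau(N) = \lambda.$$

Finally, put $P = \Omega - N$. If $E \in \mathscr{F}$, $E \subset P$ we must have $\tau(E) \ge 0$ for, if $\tau(E) < 0$, then $\tau(E \cup N) = \tau(E) + \tau(N) < \lambda$. Also if $E \in \mathscr{F}$, $E \subset N$ we must have $\tau(E) \le 0$ for, if $\tau(E) > 0$, then $\tau(N - E) = \tau(N) - \tau(E) < \lambda$. If we now put

$$\tau_+(E) = \tau(E \cap P), \qquad \tau_-(E) = -\tau(E \cap N),$$

it is clear that all the conditions of the theorem are satisfied. ∎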

Remark. It is usual to call the decomposition $\tau = \tau_+ - \tau_-$, of $\tau$ into the difference of two measures, the Jordan decomposition while the decomposition of $\Omega$ into positive and negative sets $P$ and $N$ is called the Hahn decomposition. It is not difficult to show that the Jordan decomposition is unique while the sets $P$, $N$ are not uniquely determined by $\tau$ unless $\tau(E) \neq 0$ for all $E \in \mathscr{F}$ such that $\mu(E) \neq 0$ and $\mu(F) = 0$ or $\mu(E)$ for every $F \subset E$ with $F \in \mathscr{F}$. It is further clear that

$$\tau_-(E) = -\lambda(E), \qquad \tau_+(E) = \sup_{B \subset E,\ B \in \mathscr{F}} \tau(B) \tag{3.2.2}$$

under the conditions of theorem 3.3, where $\lambda(E)$ is given by (3.2.1). If $\tau$ is defined on a $\sigma$-ring $\mathscr{S}$ which is not a $\sigma$-field then it is not, in general, possible to obtain the Hahn decomposition, but the Jordan decomposition is still possible, using (3.2.1), (3.2.2) as the definition of $\tau_-$, $\tau_+$.

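Exercises 3.2

1. If $\phi: \mathcal{S} \to \mathbf{R}^*$ is $\sigma$-additive on a $\sigma$-ring $\mathcal{S}$, show that, for any $E \in \mathcal{S}$, there are sets $A \subset E$, $B \subset E$ with $A, B \in \mathcal{S}$ such that

$$\phi(A) = \inf_{C \subset E,\ C \in \mathcal{S}} \phi(C), \qquad \phi(B) = \sup_{C \subset E,\ C \in \mathcal{S}} \phi(C).$$

2. Show that, given a (finitely) additive $\mu: \mathcal{R} \to \mathbf{R}^*$ defined on a $\sigma$-ring $\mathcal{R}$ and taking finite values, there is a decomposition $\mu(E) = \mu_+(E) - \mu_-(E)$ of $\mu$ into the difference of two non-negative additive set functions on $\mathcal{R}$.

3. The set $E_0 \in \mathcal{C}$ is said to be an atom of a set function $\phi: \mathcal{C} \to \mathbf{R}^*$ if $\phi(E_0) \ne 0$ and, for every $E \subset E_0$ with $E \in \mathcal{C}$, $\phi(E) = 0$ or $\phi(E_0)$. Write down the atoms of the set functions of examples (4) and (6) on page 53.

4. A set function $\phi: \mathcal{C} \to \mathbf{R}^*$ is said to be non-atomic if it has no atoms. Suppose $\phi: \mathcal{F} \to \mathbf{R}^*$ is $\sigma$-additive, non-atomic, and finite valued on the $\sigma$-field $\mathcal{F}$. Show that for any $A \in \mathcal{F}$, $\phi$ takes every real value between $-\phi_-(A)$ and $\phi_+(A)$ for some subset $E \subset A$.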

3.3 Additive set functions on a ring

In order to simplify the arguments we now consider only non-negative set functions $\mu: \mathcal{C} \to \mathbf{R}^+$. It is often possible, for a given ring $\mathcal{R}$, to find a semi-ring $\mathcal{C} \subset \mathcal{R}$ such that $\mathcal{R}$ is the ring generated by $\mathcal{C}$. We saw (see § 1.5) that the sets of $\mathcal{R}$ can then be expressed in terms of the sets of $\mathcal{C}$, so it is natural to ask whether in these circumstances a set function $\mu: \mathcal{C} \to \mathbf{R}^+$ can be extended to $\mu: \mathcal{R} \to \mathbf{R}^+$. We now prove that, if $\mu$ is additive on $\mathcal{C}$, this is always possible and that the result is unique.

Theorem 3.4. If $\mu: \mathcal{C} \to \mathbf{R}^+$ is a non-negative additive set function defined on a semi-ring $\mathcal{C}$, then there is a unique additive set function $\nu$ defined on the generated ring $\mathcal{R} = \mathcal{R}(\mathcal{C})$ such that $\nu$ is an extension of $\mu$. $\nu$ is non-negative on $\mathcal{R}$, and is called the extension of $\mu$ from $\mathcal{C}$ to $\mathcal{R}(\mathcal{C})$.

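Proof. Suppose $A$ is any set of $\mathcal{R} = \mathcal{R}(\mathcal{C})$; then by theorem 1.4, $A = \bigcup_{k=1}^{n} E_k$ where the sets $E_k$ are disjoint and $E_k \in \mathcal{C}$. Define

$$\nu(A) = \sum_{k=1}^{n} \mu(E_k). \tag{3.3.1}$$

Since for any $a, b \in \mathbf{R}^+$, $a + b$ is always defined, the right-hand side of (3.3.1) defines a number in $\mathbf{R}^+$. $\nu$ is thus defined on $\mathcal{R}$ provided we can show that (3.3.1) gives the same result for any two decompositions of $A$ into disjoint subsets in $\mathcal{C}$. Suppose

$$A = \bigcup_{k=1}^{n} E_k = \bigcup_{j=1}^{m} F_j,$$

where the $F_j \in \mathcal{C}$ and are disjoint. Put $H_{kj} = E_k \cap F_j$. Then since $\mathcal{C}$ is a semi-ring the sets $H_{kj} \in \mathcal{C}$, are disjoint and

$$E_k = \bigcup_{j=1}^{m} H_{kj} \quad (k = 1, 2, \ldots, n); \qquad F_j = \bigcup_{k=1}^{n} H_{kj} \quad (j = 1, 2, \ldots, m);$$

so that, since $\mu$ is additive on $\mathcal{C}$,

$$\sum_{k=1}^{n} \mu(E_k) = \sum_{k=1}^{n} \Big( \sum_{j=1}^{m} \mu(H_{kj}) \Big) = \sum_{j=1}^{m} \Big( \sum_{k=1}^{n} \mu(H_{kj}) \Big) = \sum_{j=1}^{m} \mu(F_j)$$

and it makes no difference which decomposition is used with (3.3.1) to define $\nu(A)$.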
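The common-refinement step can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: the semi-ring is taken to be the half-open intervals $(a, b]$ of the line with $\mu(a, b] = b - a$, intervals are stored as pairs, and the two decompositions `E` and `F` are made-up examples of the same set.

```python
from fractions import Fraction

def mu(interval):
    """Length of a half-open interval (a, b], stored as the pair (a, b)."""
    a, b = interval
    return b - a

def nu(pieces):
    """Formula (3.3.1): the sum of mu over a disjoint decomposition."""
    return sum(mu(p) for p in pieces)

def common_refinement(E, F):
    """The non-empty sets H_kj, each the intersection of some E_k with some F_j."""
    H = []
    for (a, b) in E:
        for (c, d) in F:
            lo, hi = max(a, c), min(b, d)
            if lo < hi:  # (lo, hi] is non-empty
                H.append((lo, hi))
    return H

half = Fraction(1, 2)
# Two hypothetical decompositions of A = (0, 1] u (2, 4].
E = [(0, 1), (2, 4)]
F = [(0, half), (half, 1), (2, 3), (3, 4)]
H = common_refinement(E, F)

# Summing over E, over F, or over the refinement H gives the same value.
assert nu(E) == nu(H) == nu(F)
```

Exact rational arithmetic (`Fraction`) is used so that the equality test is not disturbed by floating-point rounding.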


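If $A_1$, $A_2$ are disjoint sets of $\mathcal{R}$, and

$$A_1 = \bigcup_{k=1}^{n} E_k, \qquad A_2 = \bigcup_{j=1}^{m} F_j,$$

then $A_1 \cup A_2$ is a set of $\mathcal{R}$ with a possible decomposition into disjoint subsets of $\mathcal{C}$ given by

$$A_1 \cup A_2 = \bigcup_{k=1}^{n} E_k \cup \bigcup_{j=1}^{m} F_j.$$

Hence

$$\nu(A_1 \cup A_2) = \sum_{k=1}^{n} \mu(E_k) + \sum_{j=1}^{m} \mu(F_j) = \nu(A_1) + \nu(A_2),$$

since $\nu$ is uniquely determined by (3.3.1). Thus $\nu$ is finitely additive on $\mathcal{R}$ (since $\mathcal{R}$ is a ring). It is obvious that $\nu$ is non-negative.

Now let $\tau$ be any extension of $\mu$ from $\mathcal{C}$ to $\mathcal{R}$ which is additive. If $A \in \mathcal{R}$ and $A = \bigcup_{k=1}^{n} E_k$ is a decomposition into disjoint sets of $\mathcal{C}$,

$$\tau(A) = \sum_{k=1}^{n} \tau(E_k), \text{ since } \tau \text{ is additive};$$

$$= \sum_{k=1}^{n} \mu(E_k), \text{ since } \tau \text{ is an extension};$$

$$= \nu(A) \text{ by (3.3.1)}.$$

Thus $\nu$ is the unique additive extension of $\mu$ from $\mathcal{C}$ to $\mathcal{R}$.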

If we start with a measure $\mu: \mathcal{C} \to \mathbf{R}^+$ on a semi-ring $\mathcal{C}$, then $\mu$ is clearly a non-negative finitely additive set function, and so possesses a unique additive extension to the generated ring $\mathcal{R}$. What can we say about this extension?

Theorem 3.5. If $\mu: \mathcal{C} \to \mathbf{R}^+$ is a measure defined on a semi-ring $\mathcal{C}$, then the (unique) additive extension of $\mu$ to the generated ring $\mathcal{R}(\mathcal{C})$ is also a measure.

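Proof. In the last theorem we discovered the form of the unique additive extension $\nu$ of $\mu$ from $\mathcal{C}$ to $\mathcal{R}$. It is sufficient to show that $\nu$ is $\sigma$-additive on $\mathcal{R}$. Suppose $E \in \mathcal{R}$, $E_k$ $(k = 1, 2, \ldots) \in \mathcal{R}$ and are disjoint, and $E = \bigcup_{k=1}^{\infty} E_k$. Put

$$E = \bigcup_{r=1}^{n} A_r, \quad A_r \text{ disjoint sets of } \mathcal{C};$$

$$E_k = \bigcup_{i=1}^{n_k} B_{ki}, \quad B_{ki} \text{ disjoint sets of } \mathcal{C}.$$

Put $C_{rki} = A_r \cap B_{ki}$; then $\{C_{rki}\}$ forms a disjoint collection of sets in $\mathcal{C}$, and

$$A_r = \bigcup_{k=1}^{\infty} \bigcup_{i=1}^{n_k} C_{rki}, \qquad B_{ki} = \bigcup_{r=1}^{n} C_{rki}$$

are disjoint decompositions into sets of $\mathcal{C}$. Since $\mu$ is additive on $\mathcal{C}$,

$$\mu(B_{ki}) = \sum_{r=1}^{n} \mu(C_{rki}),$$

and since it is $\sigma$-additive on $\mathcal{C}$,

$$\mu(A_r) = \sum_{k=1}^{\infty} \sum_{i=1}^{n_k} \mu(C_{rki}).$$

Since the order of summation of double series of non-negative terms makes no difference, we have

$$\nu(E) = \sum_{r=1}^{n} \mu(A_r) = \sum_{r=1}^{n} \sum_{k=1}^{\infty} \sum_{i=1}^{n_k} \mu(C_{rki}) = \sum_{k=1}^{\infty} \sum_{i=1}^{n_k} \Big( \sum_{r=1}^{n} \mu(C_{rki}) \Big) = \sum_{k=1}^{\infty} \sum_{i=1}^{n_k} \mu(B_{ki}) = \sum_{k=1}^{\infty} \nu(E_k).$$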

The above theorem gives one method of obtaining a measure on a ring: it is sufficient to define a measure on any semi-ring which generates the given ring. The extension to the generated ring is easily carried out, is unique, and gives a measure. There are circumstances in which one can define a set function $\mu$ directly on a ring so that it is easy to see that $\mu$ is non-negative and additive. Under these circumstances one can often use theorem 3.2 as a criterion for determining whether or not $\mu$ is a measure. Another useful criterion is given by the following theorem.

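Theorem 3.6. Suppose $\mu: \mathcal{R} \to \mathbf{R}^+$ is non-negative and additive on a ring $\mathcal{R}$. Then

(i) if $E \in \mathcal{R}$, and $\{E_i\}$ is a sequence of disjoint sets of $\mathcal{R}$ such that $E \supset \bigcup_{i=1}^{\infty} E_i$, then

$$\mu(E) \ge \sum_{i=1}^{\infty} \mu(E_i);$$

(ii) $\mu$ is a measure if and only if for any sequence $\{E_i\}$ of sets in $\mathcal{R}$ such that $\bigcup_{i=1}^{\infty} E_i \supset E \in \mathcal{R}$,

$$\mu(E) \le \sum_{i=1}^{\infty} \mu(E_i).$$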

Proof. (i) For each positive integer $n$,

$$\bigcup_{i=1}^{n} E_i \subset E$$

so that

$$\mu(E) \ge \mu\Big(\bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n} \mu(E_i),$$

since $\mu$ is additive. Hence

$$\mu(E) \ge \sum_{i=1}^{\infty} \mu(E_i).$$

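(ii) First, suppose that $\mu$ is a measure. Put $F_i = E \cap E_i$ $(i = 1, 2, \ldots)$; $G_1 = F_1$, and

$$G_n = F_n - \bigcup_{i=1}^{n-1} F_i \qquad (n = 2, 3, \ldots).$$

Then $\{G_n\}$ is a sequence of disjoint sets of $\mathcal{R}$ such that $E = \bigcup_{n=1}^{\infty} G_n$. Thus

$$\mu(E) = \mu\Big(\bigcup_{i=1}^{\infty} G_i\Big) = \sum_{i=1}^{\infty} \mu(G_i) \le \sum_{i=1}^{\infty} \mu(F_i) \le \sum_{i=1}^{\infty} \mu(E_i),$$

since $\mu$ is $\sigma$-additive and non-negative.

Conversely if it is known that $\mu$ is additive and $E = \bigcup_{i=1}^{\infty} E_i$ is a disjoint decomposition of $E \in \mathcal{R}$ into sets in $\mathcal{R}$, by (i)

$$\mu(E) \ge \sum_{i=1}^{\infty} \mu(E_i);$$

and if the condition in (ii) is satisfied,

$$\mu(E) \le \sum_{i=1}^{\infty} \mu(E_i)$$

so that we must have

$$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i)$$

and $\mu$ is a measure on $\mathcal{R}$.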

Exercises 3.3

1. If $\Omega = \{1, 2, 3, 4, 5\}$, show that $\mathcal{C}$ consisting of $\emptyset$, $\Omega$, $\{1\}$, $\{2, 3\}$, $\{1, 2, 3\}$, $\{4, 5\}$ is a semi-ring and that 0, 3, 1, 1, 2, 1 defines a set of values for an additive set function $\mu$ on $\mathcal{C}$. What is the ring $\mathcal{R}$ generated by $\mathcal{C}$? Find the additive extension of $\mu$ to $\mathcal{R}$, and show that it is a measure.

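2. Suppose $\mathcal{R}$ is any ring of subsets, $\phi: \mathcal{R} \to \mathbf{R}^+$ is non-negative, finitely additive on $\mathcal{R}$, and $\mu: \mathcal{R} \to \mathbf{R}^+$ is a measure on $\mathcal{R}$ such that, for any sequence $\{E_n\}$ of sets in $\mathcal{R}$, $\mu(E_n) \to 0 \Rightarrow \phi(E_n) \to 0$ as $n \to \infty$; show that $\phi$ is completely additive.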


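3. If $\mu: \mathcal{R} \to \mathbf{R}^+$ is finitely additive on a ring $\mathcal{R}$ and $E, F \in \mathcal{R}$ are such that $\mu(E \triangle F) = 0$, we say that $E \sim F$. Show that $\sim$ is an equivalence relation in $\mathcal{R}$ and that

$$E \sim F \Rightarrow \mu(E) = \mu(F) = \mu(E \cup F) = \mu(E \cap F).$$

Is the class of all sets $E \in \mathcal{R}$ for which $E \sim \emptyset$ a ring?

4. In the notation of question 3, put $\rho(E, F) = \mu(E \triangle F)$ and show that $\rho(E, F) \ge 0$, $\rho(E, F) = \rho(F, E)$, $\rho(E, F) \le \rho(E, G) + \rho(G, F)$. If $E_1 \sim E_2$, $F_1 \sim F_2$ are all sets in $\mathcal{R}$, show that $\rho(E_1, F_1) = \rho(E_2, F_2)$. Does $\rho$ define a metric in $\mathcal{R}$?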

3.4 Length, area and volume of elementary figures

In § 1.5 we saw that:

(i) In $\mathbf{R} = \mathbf{R}^1$ (Euclidean 1-space) the class $\mathcal{P} = \mathcal{P}^1$ of half-open intervals $(a, b]$ forms a semi-ring which generates the ring $\mathcal{E}$ of elementary figures (sets $E$ of the form $E = \bigcup_{i=1}^{n} (a_i, b_i]$ with the intervals $(a_i, b_i]$ disjoint).

(ii) In $\mathbf{R}^k$ the half-open intervals have the form $\{(x_1, x_2, \ldots, x_k) : a_i < x_i \le b_i,\ i = 1, 2, \ldots, k\}$ and they again form a semi-ring $\mathcal{P}^k$ which generates the ring $\mathcal{E}^k$ of elementary figures (sets which can be expressed as a finite union of disjoint sets of $\mathcal{P}^k$).

Instead of using the terms length (for $k = 1$), area (for $k = 2$) and volume (for $k \ge 3$) of an interval we will use the same word `length' in each case. Thus the `length' of an interval of $\mathcal{P}^k$ will be the product of the lengths of $k$ perpendicular edges:

$$\mu(a, b] = b - a,$$

$$\mu\{(x_1, \ldots, x_k) : a_i < x_i \le b_i,\ i = 1, 2, \ldots, k\} = \prod_{i=1}^{k} (b_i - a_i).$$

Thus for each $k$ we have defined a set function

$$\mu: \mathcal{P}^k \to \mathbf{R}^+$$

which has the usual physical meaning of length, area or volume. Historically this set function and its extension to a larger class of subsets of $\mathbf{R}^k$ was the first to be studied; it leads quickly to the definition of Lebesgue measure in $\mathbf{R}^k$. Our object in the present section is to show that the set function obtained by extending $\mu$ from $\mathcal{P}^k$ to $\mathcal{E}^k$ is a measure on $\mathcal{E}^k$. There are essentially two distinct methods of doing this, and both will work for each $k$. In both it is necessary to show that $\mu$ is additive on $\mathcal{P}^k$ so that it has a unique extension to an additive set function on $\mathcal{E}^k$. Then one can either make use of the continuity theorem 3.2 to show that $\mu: \mathcal{E}^k \to \mathbf{R}^+$ is a measure on $\mathcal{E}^k$, or one can prove directly that $\mu$ is a measure on $\mathcal{P}^k$ and appeal to theorem 3.5 to deduce that its extension is also a measure. We illustrate by applying the first method to the case $k = 1$, and the second method to the case $k = 2$.

$k = 1$

For each $(a, b] \in \mathcal{P}$ we put $\mu(a, b] = b - a$. It follows that $\mu$ is additive on $\mathcal{P}$ for if $(a, b] = \bigcup_{i=1}^{n} (a_i, b_i]$ and the $(a_i, b_i]$ are disjoint we may assume that these intervals are ordered so that $b_i \le a_{i+1}$ $(i = 1, 2, \ldots, n-1)$. It follows that we must have $a_1 = a$, $b_n = b$ and $b_i = a_{i+1}$ $(i = 1, 2, \ldots, n-1)$ so that, if $a_{n+1} = b_n$,

$$\sum_{i=1}^{n} \mu(a_i, b_i] = \sum_{i=1}^{n} (b_i - a_i) = \sum_{i=1}^{n} (a_{i+1} - a_i) = (b - a) = \mu(a, b].$$

By theorem 3.4 there is a unique additive extension $\mu: \mathcal{E} \to \mathbf{R}^+$ since $\mathcal{E}$ is the smallest ring containing the semi-ring $\mathcal{P}$. Since $\mu$ is finite on $\mathcal{E}$ it will follow from theorem 3.2 (iii) that $\mu$ is a measure, if we can prove that $\mu$ is continuous from above at $\emptyset$. Suppose this is false; then there is a monotone sequence $\{E_n\}$ of sets in $\mathcal{E}$ for which $\lim E_n = \emptyset$ but $\mu(E_n) \to \delta > 0$ as $n \to \infty$. Now $E_1$ consists of a finite number of intervals of $\mathcal{P}$. Let $F_1$ be a set of $\mathcal{E}$ obtained by taking away short half-open intervals of $\mathcal{P}$ from the left-hand end of each of the intervals of $E_1$ in such a way that

$$F_1 \subset \bar{F}_1 \subset E_1; \qquad \mu(F_1) > \mu(E_1) - \delta/2^2.$$

We now proceed by induction. Suppose we have obtained $F_n \in \mathcal{E}$ such that

$$F_n \subset \bar{F}_n \subset E_n \cap F_{n-1} \quad \text{and} \quad \mu(F_n) > \mu(E_n) - \delta \sum_{r=1}^{n} \frac{1}{2^{r+1}}. \tag{3.4.1}$$

Then $F_n \cap E_{n+1} \in \mathcal{E}$ and

$$\mu(F_n \cap E_{n+1}) \ge \mu(E_{n+1}) - \mu(E_n - F_n) \ge \mu(E_{n+1}) - \delta \sum_{r=1}^{n} \frac{1}{2^{r+1}}. \tag{3.4.2}$$

We can again remove small half-open intervals from the left-hand end of each interval of $F_n \cap E_{n+1}$ to give a set $F_{n+1} \in \mathcal{E}$ such that

$$\mu(F_{n+1}) > \mu(E_{n+1} \cap F_n) - \delta/2^{n+2} \quad \text{and} \quad F_{n+1} \subset \bar{F}_{n+1} \subset E_{n+1} \cap F_n. \tag{3.4.3}$$

By (3.4.2) and (3.4.3) we deduce that

$$\mu(F_{n+1}) > \mu(E_{n+1}) - \delta \sum_{r=1}^{n+1} \frac{1}{2^{r+1}}.$$

Thus by induction we can establish (3.4.1) for all $n$. Since $\mu(E_n) \ge \delta$ for all $n$, we have $\mu(F_n) > \tfrac{1}{2}\delta$ for all $n$, so that all the sets $F_n$ are non-void. Hence $\{\bar{F}_n\}$ is a decreasing sequence of non-empty bounded closed sets. Hence $\bigcap_{n=1}^{\infty} \bar{F}_n$ is not void. But

$$\bigcap_{n=1}^{\infty} \bar{F}_n \subset \bigcap_{n=1}^{\infty} E_n = \emptyset,$$

so we obtain a contradiction.

$k = 2$

Suppose $C = \{(x, y) : a < x \le b,\ c < y \le d\}$ is a set of $\mathcal{P}^2$, and $\mu(C) = (b - a)(d - c)$. In order to prove that $\mu$ is additive on $\mathcal{P}^2$, suppose first that $C = \bigcup_{i=1}^{n} C_i$ is a decomposition of $C$ into disjoint rectangles in each of which one of the sides (say $(c, d]$) remains the same. Then the other sides $(a_i, b_i]$ must be disjoint and satisfy

$$(a, b] = \bigcup_{i=1}^{n} (a_i, b_i]$$

so that by the corresponding result in $\mathcal{P}^1$, $\mu$ is additive in this case. More generally if

$$C = \bigcup_{i=1}^{n} C_i, \qquad C_i = \{(x, y) : a_i < x \le b_i,\ c_i < y \le d_i\}$$

is a decomposition of $C$ into a finite number of disjoint rectangles, use the infinite lines $x = a_i$, $x = b_i$ $(i = 1, 2, \ldots, n)$ to decompose each $C_i$ into a finite number of pieces $C_{ik}$ each with the same bounds for the $y$-coordinate. Hence

$$\sum_{i=1}^{n} \mu(C_i) = \sum_{i} \sum_{k} \mu(C_{ik}),$$

and we can sum the right-hand side by first summing over the rectangles whose $x$-coordinate is bounded by a pair of contiguous $a_i$, $b_j$ and then summing over these intervals in $x$. Thus by repeated application of additivity in $\mathcal{P}^1$ we get

$$\mu(C) = \sum_{i=1}^{n} \mu(C_i),$$

as required. (The reader should draw a picture.)
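In place of a picture, the slicing argument can be followed on a concrete decomposition. The Python sketch below is only an illustration: the rectangle `C`, its four `parts` and the helper names are assumptions chosen for the example. Each part is cut along the vertical lines through the $x$-endpoints, the pieces are grouped into vertical strips, and within each strip the $y$-sides are added up as in the one-dimensional case.

```python
from collections import defaultdict

def area(rect):
    """mu of the rectangle (a, b] x (c, d], stored as ((a, b), (c, d))."""
    (a, b), (c, d) = rect
    return (b - a) * (d - c)

def cut_vertically(rect, xs):
    """Split rect along those lines x = t, t in xs, that cross its interior."""
    (a, b), y_side = rect
    cuts = [a] + sorted(t for t in xs if a < t < b) + [b]
    return [((lo, hi), y_side) for lo, hi in zip(cuts, cuts[1:])]

# Hypothetical decomposition of C = (0, 3] x (0, 2] into disjoint rectangles.
C = ((0, 3), (0, 2))
parts = [((0, 2), (0, 1)), ((2, 3), (0, 2)), ((0, 1), (1, 2)), ((1, 2), (1, 2))]

# All x-endpoints a_i, b_i that occur in the decomposition.
xs = {t for (x_side, _) in parts for t in x_side}
pieces = [p for r in parts for p in cut_vertically(r, xs)]

# Group the pieces into vertical strips and add up their y-lengths.
strip_height = defaultdict(int)
for (x_side, (c, d)) in pieces:
    strip_height[x_side] += d - c

# Each strip is filled from c to d, so summing over strips recovers mu(C).
total = sum((hi - lo) * h for (lo, hi), h in strip_height.items())
assert total == area(C) == sum(area(r) for r in parts)
```

The sketch assumes that `parts` really is a disjoint decomposition of `C`; it does not verify this, it only replays the two summation steps of the proof.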

Now suppose $C = \bigcup_{i=1}^{\infty} C_i$ is an infinite decomposition of $C$ into disjoint sets of $\mathcal{P}^2$. We must show that $\mu$ is completely additive on $\mathcal{P}^2$. Since $\mathcal{P}^2$ is a semi-ring it follows by induction that, for each $n$,

$$C - \bigcup_{i=1}^{n} C_i$$

can be expressed as a finite union of disjoint sets of $\mathcal{P}^2$. Since $\mu$ is non-negative, this implies that

$$\mu(C) \ge \sum_{i=1}^{n} \mu(C_i) \quad \text{for all } n,$$

so that

$$\mu(C) \ge \sum_{i=1}^{\infty} \mu(C_i).$$

Suppose if possible that $\mu$ is not $\sigma$-additive; then there will be such a set $C$ for which

$$\mu(C) = \sum_{i=1}^{\infty} \mu(C_i) + 2\delta \quad (\delta > 0). \tag{3.4.4}$$

We now use another form of compactness argument to obtain a contradiction. Suppose $\epsilon > 0$ is small enough to ensure that, if

$$F_0 = \{(x, y) : a + \epsilon < x \le b,\ c + \epsilon < y \le d\},$$

then $\mu(F_0) > \mu(C) - \delta$; and $\epsilon_i > 0$ are small enough to ensure that, if

$$F_i = \{(x, y) : a_i < x \le b_i + \epsilon_i,\ c_i < y \le d_i + \epsilon_i\},$$

then

$$\mu(F_i) < \mu(C_i) + \delta 2^{-i} \quad (i = 1, 2, \ldots). \tag{3.4.5}$$

Then $\bar{F}_0 \subset C$ and $C_i \subset F_i^{\circ}$, the interior of $F_i$ $(i = 1, 2, \ldots)$; so that

$$\bar{F}_0 \subset \bigcup_{i=1}^{\infty} F_i^{\circ}.$$

Since $\bar{F}_0$ is compact and the sets $F_i^{\circ}$ are open it follows that, for some integer $n$, we have

$$\bar{F}_0 \subset \bigcup_{i=1}^{n} F_i^{\circ} \quad \text{so that} \quad F_0 \subset \bigcup_{i=1}^{n} F_i.$$

By the finite additivity of $\mu$ on $\mathcal{P}^2$ this implies

$$\mu(F_0) \le \sum_{i=1}^{n} \mu(F_i)$$

so that, by (3.4.5),

$$\mu(C) - \delta < \sum_{i=1}^{n} \mu(C_i) + \sum_{i=1}^{n} \delta 2^{-i} < \sum_{i=1}^{\infty} \mu(C_i) + \delta,$$

which contradicts (3.4.4).

Thus $\mu$ is a measure on $\mathcal{P}^2$ and by theorem 3.5 the unique additive extension $\mu: \mathcal{E}^2 \to \mathbf{R}^+$ is also a measure. Either form of argument clearly extends to the class of elementary figures in $\mathbf{R}^k$, so we have proved:

Theorem 3.7. Suppose $\mathcal{E}^k$ is the class of elementary figures in $\mathbf{R}^k$, that is, the class of those sets

$$E = \bigcup_{i=1}^{n} C_i$$

where the $C_i$ are disjoint half-open intervals in $\mathbf{R}^k$. If we put $\mu(C_i)$ = length of $C_i$ = product of lengths of the sides of the interval $C_i$, and

$$\mu(E) = \sum_{i=1}^{n} \mu(C_i),$$

then $\mu$ is uniquely defined on $\mathcal{E}^k$ and is a measure.
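The set function of theorem 3.7 is simple to evaluate once an elementary figure is given as a list of disjoint half-open intervals. The following Python sketch is an illustration only; the representation of an interval of $\mathbf{R}^k$ as a list of $k$ pairs, the function names and the sample figure are all assumptions made for the example.

```python
from math import prod

def box_length(box):
    """'Length' of a half-open interval in R^k: the product of its edge lengths.

    A box is a list of k pairs (a_i, b_i), standing for {x : a_i < x_i <= b_i}.
    """
    return prod(b - a for a, b in box)

def figure_measure(boxes):
    """mu(E) for an elementary figure given as a list of disjoint boxes.

    Disjointness is assumed, not checked.
    """
    return sum(box_length(c) for c in boxes)

# Hypothetical elementary figure in R^3 made of two disjoint boxes.
E = [
    [(0, 1), (0, 2), (0, 3)],   # edges of length 1, 2, 3
    [(1, 3), (0, 1), (0, 1)],   # edges of length 2, 1, 1
]
volume = figure_measure(E)
```

By the theorem, a different decomposition of the same figure into disjoint boxes would give the same value of `figure_measure`.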

4 CONSTRUCTION AND PROPERTIES OF MEASURES

4.1 Extension theorem; Lebesgue measure

Measure was defined as a non-negative $\sigma$-additive set function defined on a class of sets $\mathcal{C}$. In testing $\tau$ for $\sigma$-additivity we only needed

$$\tau(E) = \sum_{i=1}^{\infty} \tau(E_i)$$

for sequences $\{E_i\}$ of disjoint sets of $\mathcal{C}$ for which $E = \bigcup_{i=1}^{\infty} E_i \in \mathcal{C}$. This is an artificial restriction as the condition of additivity does not apply to a sequence $\{E_i\}$ unless the union set $\bigcup_{i=1}^{\infty} E_i$ happens to belong to $\mathcal{C}$. For this reason the natural domain of definition for a measure $\tau: \mathcal{C} \to \mathbf{R}^+$ is a $\sigma$-ring, and in practice most useful measures are defined on $\sigma$-rings.

In the last chapter we considered properties of measures defined on a ring $\mathcal{R}$, so our first objective in the present chapter will be to prove that these can always be extended to a measure on the $\sigma$-ring $\mathcal{S}$ generated by $\mathcal{R}$. This extension is unique provided the measure on $\mathcal{R}$ is $\sigma$-finite. We introduce an (unnecessary) simplifying assumption, namely that the generated $\sigma$-ring is also a $\sigma$-field, i.e. that it contains the whole space $\Omega$. Even with this simplification the main extension theorem is somewhat involved. The main idea is that of introducing an outer approximating set function, defined in terms of the measure on $\mathcal{R}$, and then restricting this to a class of sets on which it is $\sigma$-additive.

The relevant set function turns out to be an outer measure, so it is convenient first to obtain a theorem about all outer measures.

Measurable set

Suppose $\mu^*$ is an outer measure defined for all subsets of $\Omega$: that is, $\mu^*$ is non-negative, countably subadditive and monotone (see p. 59). A subset $E$ is said to be measurable with respect to $\mu^*$ if, for every set $A \subset \Omega$,

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A - E). \tag{4.1.1}$$

It is important to stress that the concept of measurability for a set depends on the outer measure $\mu^*$. The same set $E$ may well be measurable with respect to $\mu_1^*$ and non-measurable with respect to $\mu_2^*$.

It helps one's intuition to realise that (4.1.1) states that if one divides the set $A$ using $E$ and its complement, then the outer measure of the `pieces' adds up correctly. Thus a set $E$ is $\mu^*$-measurable if and only if it breaks up no set $A$ into two subsets on which $\mu^*$ is not additive. The measurability of $E$ depends on what the set $E$ does to the outer measure of all the other subsets.

The reader may find the above explanation of condition (4.1.1) still inadequate to provide the definition of measurability with much intuitive content. This is a case where the definition is justified by the result: it turns out that, for suitable outer measures, a wide class of sets is measurable and the class of measurable sets has the right kind of structural properties. The definition is therefore justified ultimately by the elegance and usefulness of the theory which results from it.

Note that, because of the subadditivity condition on outer measures, we always have

$$\mu^*(A) \le \mu^*(A \cap E) + \mu^*(A - E)$$

for all sets $A$, $E$. Hence $E$ is $\mu^*$-measurable if and only if

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A - E) \tag{4.1.2}$$

for every set $A \subset \Omega$. Since (4.1.2) is automatically satisfied for sets $A$ with $\mu^*(A) = +\infty$, $E$ is $\mu^*$-measurable if and only if (4.1.2) is satisfied for every $A \subset \Omega$ with $\mu^*(A) < \infty$.

It is worth remarking that many of the early discussions of measurability use the concept of inner measure. If $\mu^*(\Omega) < \infty$, this can be defined for all subsets $E$ by

$$\mu_*(E) = \mu^*(\Omega) - \mu^*(\Omega - E).$$

In this method of procedure a set $E$ is said to be measurable if $\mu_*(E) = \mu^*(E)$. This apparently weaker definition of measurability can be shown to be equivalent to the one we have adopted provided the outer measure $\mu^*$ is regular. (An outer measure is said to be regular if, for every $A \subset \Omega$, there is a measurable cover $E \supset A$ such that $\mu^*(E) = \mu^*(A)$.) This means in particular that, under these circumstances, it is sufficient to use the single test set $\Omega$ for $A$ in (4.1.1). We do not use the concept of inner measure in our development.

Theorem 4.1. Let $\mu^*$ be an outer measure on $\Omega$, and let $\mathcal{M}$ be the class of sets of $\Omega$ which are measurable with respect to $\mu^*$. Then $\mathcal{M}$ is a $\sigma$-field and the restriction of $\mu^*$ to $\mathcal{M}$ defines a measure on $\mathcal{M}$.
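On a finite space the test (4.1.1) can be carried out by brute force, which gives some feeling for what the class of measurable sets looks like. The Python sketch below is an illustration under stated assumptions: the three-point space, the partition `BLOCKS` and the toy outer measure (the number of blocks that a set meets) are made up for the example and do not come from the text.

```python
from itertools import chain, combinations

OMEGA = frozenset({1, 2, 3})
BLOCKS = [frozenset({1, 2}), frozenset({3})]  # hypothetical partition of OMEGA

def outer(E):
    """Toy outer measure: the number of blocks that E meets.

    It is non-negative, monotone and subadditive, with outer(empty) = 0.
    """
    return sum(1 for B in BLOCKS if E & B)

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def is_measurable(E):
    """Condition (4.1.1), tested against every subset A of OMEGA."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(OMEGA))

M = [E for E in subsets(OMEGA) if is_measurable(E)]

# M is closed under complements and unions, as theorem 4.1 asserts.
assert all(OMEGA - E in M for E in M)
assert all(E | F in M for E in M for F in M)
```

In this example the set $\{1\}$ fails the test with $A = \{1, 2\}$, since it splits a block into two pieces each of outer measure 1; the measurable sets are exactly the unions of blocks, and the toy outer measure is additive on them.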

Proof. We first show that any finite union of sets of $\mathcal{M}$ is in $\mathcal{M}$. It is clearly sufficient to prove that $E_1 \cup E_2 \in \mathcal{M}$ for any $E_1, E_2 \in \mathcal{M}$. For any set $A$, since $E_1$ is measurable,

$$\mu^*(A) = \mu^*(A \cap E_1) + \mu^*(A - E_1). \tag{4.1.3}$$

Now use $(A - E_1)$ as a test set for the measurable $E_2$:

$$\mu^*(A - E_1) = \mu^*((A - E_1) \cap E_2) + \mu^*(A - E_1 - E_2),$$

$$\mu^*(A - E_1) = \mu^*((A - E_1) \cap E_2) + \mu^*(A - (E_1 \cup E_2)). \tag{4.1.4}$$

But

$$[(A - E_1) \cap E_2] \cup (A \cap E_1) = A \cap (E_1 \cup E_2),$$

so that if we substitute (4.1.4) into (4.1.3) and use the subadditivity of $\mu^*$, we obtain

$$\mu^*(A) = \mu^*(A \cap E_1) + \mu^*((A - E_1) \cap E_2) + \mu^*(A - (E_1 \cup E_2)) \ge \mu^*(A \cap (E_1 \cup E_2)) + \mu^*(A - (E_1 \cup E_2))$$

so that, by (4.1.2), $E_1 \cup E_2 \in \mathcal{M}$.

Now, since $A \cap E = A - (\Omega - E)$, the equation for the measurability of $(\Omega - E)$ is the same as that for the measurability of $E$. Hence, $(\Omega - E)$ is measurable if and only if $E$ is measurable. Since

$$\bigcap_{i=1}^{n} E_i = \Omega - \bigcup_{i=1}^{n} (\Omega - E_i) \qquad \text{(see § 1.4)},$$

it follows that the class $\mathcal{M}$ is also closed under finite intersections so that $\mathcal{M}$ is a field. In order to show that $\mathcal{M}$ is a $\sigma$-field it is sufficient to show that $E = \bigcup_{k=1}^{\infty} E_k \in \mathcal{M}$ for any sequence $\{E_k\}$ of sets of $\mathcal{M}$. There is no loss in generality in assuming that the sets $E_k$ are disjoint for, since $\mathcal{M}$ is a ring, any countable union can be replaced by a countable disjoint union of subsets in $\mathcal{M}$. Put

$$F_n = \bigcup_{k=1}^{n} E_k \qquad (n = 1, 2, \ldots),$$

and let $\mathcal{H}_n$ be the hypothesis that, for any $A$,

$$\mu^*(A \cap F_n) = \sum_{k=1}^{n} \mu^*(A \cap E_k).$$

Clearly $\mathcal{H}_1$ is true. Use $A \cap F_{n+1}$ as a test set for the measurability of $F_n$: then

$$\mu^*(A \cap F_{n+1}) = \mu^*(A \cap F_n) + \mu^*(A \cap E_{n+1}).$$

Hence $\mathcal{H}_n \Rightarrow \mathcal{H}_{n+1}$ so that, by induction, $\mathcal{H}_n$ is true for all positive integers $n$.

Since $\mu^*$ is monotonic, for each $n$

$$\mu^*(A \cap E) \ge \mu^*(A \cap F_n) = \sum_{k=1}^{n} \mu^*(A \cap E_k),$$

so that

$$\mu^*(A \cap E) \ge \sum_{k=1}^{\infty} \mu^*(A \cap E_k),$$

and the subadditivity of $\mu^*$ now implies that

$$\mu^*(A \cap E) = \sum_{k=1}^{\infty} \mu^*(A \cap E_k). \tag{4.1.5}$$

Thus, for any $A$, and any $n$,

$$\mu^*(A) = \mu^*(A \cap F_n) + \mu^*(A - F_n) \ge \sum_{k=1}^{n} \mu^*(A \cap E_k) + \mu^*(A - E),$$

so that, letting $n \to \infty$ and using (4.1.5),

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A - E),$$

and this implies $E \in \mathcal{M}$ by (4.1.2).

Now the restriction of $\mu^*$ to $\mathcal{M}$ is a non-negative set function. Further (4.1.5) with $A$ replaced by $\Omega$ shows that $\mu^*$ is $\sigma$-additive on $\mathcal{M}$ and is therefore a measure on $\mathcal{M}$.

We can now prove the basic extension theorem. In order to simplify the formulation we will assume that the ring $\mathcal{R}$ of subsets of $\Omega$ is such that there is a sequence of sets $\{E_n\}$ in $\mathcal{R}$ such that $\Omega = \bigcup_{n=1}^{\infty} E_n$. We then say that $\Omega$ is $\sigma$-$\mathcal{R}$. This condition implies that the $\sigma$-ring generated by $\mathcal{R}$ is a $\sigma$-field. Theorem 4.2 is true without this restriction, but the proof would then require the consideration of outer measures defined on a suitable class of subsets of $\Omega$, rather than on all subsets. Since this generalisation also causes complications in the definition of the integral, and the extra generality is rarely needed, we will keep the condition that $\Omega$ be $\sigma$-$\mathcal{R}$.

Theorem 4.2. Suppose $\mathcal{R}$ is a ring of subsets of $\Omega$ such that $\Omega$ is $\sigma$-$\mathcal{R}$ and $\mu: \mathcal{R} \to \mathbf{R}^+$ is a measure defined on $\mathcal{R}$. Then there is an extension of $\mu$ to a measure $\nu$ defined on $\mathcal{S} = \mathcal{S}(\mathcal{R})$, the $\sigma$-ring generated by $\mathcal{R}$. If $\mu$ is $\sigma$-finite on $\mathcal{R}$, then the extension is unique, and is $\sigma$-finite on $\mathcal{S}$.

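Proof. Let $\mathcal{C}$ be the class of all subsets of $\Omega$. Since $\Omega$ is $\sigma$-$\mathcal{R}$, any $E \in \mathcal{C}$ can be covered by a countable sequence of sets of $\mathcal{R}$. Put

$$\mu^*(E) = \inf \sum_{i=1}^{\infty} \mu(F_i),$$

the infimum being taken over all sequences of sets $\{F_i\}$ in $\mathcal{R}$ such that $E \subset \bigcup_{i=1}^{\infty} F_i$. It is clear that $\mu^*: \mathcal{C} \to \mathbf{R}^+$ is non-negative, monotone and that $\mu^*(\emptyset) = 0$. Suppose now that $E \subset \bigcup_{i=1}^{\infty} E_i$. Then, if $\mu^*(E_i)$ is infinite for some $i$,

$$\mu^*(E) \le \sum_{i=1}^{\infty} \mu^*(E_i) \tag{4.1.6}$$

is immediate. If $\mu^*(E_i) < \infty$ for all $i$, then for any $\epsilon > 0$ choose sets $F_{ik}$ $(k = 1, 2, \ldots)$ in $\mathcal{R}$ such that

$$E_i \subset \bigcup_{k=1}^{\infty} F_{ik} \quad \text{and} \quad \sum_{k=1}^{\infty} \mu(F_{ik}) < \mu^*(E_i) + \frac{\epsilon}{2^i} \qquad (i = 1, 2, \ldots).$$

The countable collection $\{F_{ik}\}$ will now cover $E$, and

$$\mu^*(E) \le \sum_{i=1}^{\infty} \sum_{k=1}^{\infty} \mu(F_{ik}) < \sum_{i=1}^{\infty} \Big( \mu^*(E_i) + \frac{\epsilon}{2^i} \Big) = \sum_{i=1}^{\infty} \mu^*(E_i) + \epsilon.$$

Since $\epsilon$ is arbitrary, (4.1.6) now follows, and we have proved that $\mu^*: \mathcal{C} \to \mathbf{R}^+$ is an outer measure. Let $\mathcal{M}$ be the class of subsets of $\Omega$ which are measurable with respect to $\mu^*$.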

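We first want to show that $\mathcal{M} \supset \mathcal{R}$. If $E \in \mathcal{R}$ and $\mu^*(A) < \infty$ (the case $\mu^*(A) = +\infty$ is unimportant as (4.1.2) is then trivially satisfied), then for any $\epsilon > 0$ choose a sequence $\{E_i\}$ of sets of $\mathcal{R}$ such that $A \subset \bigcup_{i=1}^{\infty} E_i$ and

$$\mu^*(A) + \epsilon > \sum_{i=1}^{\infty} \mu(E_i) = \sum_{i=1}^{\infty} [\mu(E_i \cap E) + \mu(E_i - E)] \ge \mu^*(A \cap E) + \mu^*(A - E),$$

since the sets $E_i \cap E$ cover $A \cap E$ and the sets $E_i - E$ cover $A - E$. Since $\epsilon$ is arbitrary, we have again proved (4.1.2), so that $E \in \mathcal{M}$. By theorem 4.1, $\mathcal{M}$ is a $\sigma$-ring, so that $\mathcal{M} \supset \mathcal{S}$, the $\sigma$-ring generated by $\mathcal{R}$. But the restriction of $\mu^*$ to $\mathcal{M}$ is a measure, so that its further restriction $\nu$ to $\mathcal{S}$ is also a measure. If $E \in \mathcal{R}$ it is clear that $\mu^*(E) \ge \mu(E)$ because of theorem 3.6 (ii), and since $E$ is a covering of itself, $\mu^*(E) \le \mu(E)$. Hence, for all sets $E \in \mathcal{R}$, we have $\nu(E) = \mu^*(E) = \mu(E)$, so that $\nu$ is an extension of $\mu$ from $\mathcal{R}$ to $\mathcal{S}$.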
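The covering construction of $\mu^*$ can be followed step by step on a finite space. The Python sketch below is an illustration under stated assumptions: the four-point space, the ring `MU` (a dictionary from sets of the ring to their measures) and the helper names are invented for the example. Because the ring is finite, the infimum over countable covers reduces to a minimum over finite subfamilies.

```python
from itertools import chain, combinations

OMEGA = frozenset({1, 2, 3, 4})
# Hypothetical ring R = {empty, {1,2}, {3,4}, OMEGA} with a measure mu on it.
MU = {
    frozenset(): 0,
    frozenset({1, 2}): 1,
    frozenset({3, 4}): 0,   # a set of measure zero
    OMEGA: 1,
}

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def mu_star(E):
    """Infimum of sum(mu(F_i)) over covers of E by sets of R.

    R is finite, so repeating a set never lowers the cost and it is
    enough to search the finite subfamilies of R.
    """
    ring = list(MU)
    costs = [
        sum(MU[F] for F in family)
        for r in range(1, len(ring) + 1)
        for family in combinations(ring, r)
        if E <= frozenset().union(*family)
    ]
    return min(costs)

def is_measurable(E):
    """Condition (4.1.1) checked against every test set A."""
    return all(mu_star(A) == mu_star(A & E) + mu_star(A - E)
               for A in subsets(OMEGA))

M = [E for E in subsets(OMEGA) if is_measurable(E)]

# The outer measure agrees with mu on the ring, and every ring set is measurable.
assert all(mu_star(E) == MU[E] for E in MU)
assert all(E in M for E in MU)
```

In this example the measurable class is strictly larger than the ring: it also contains the subsets of the null set $\{3, 4\}$ and their unions with $\{1, 2\}$, while a set such as $\{1\}$, which splits $\{1, 2\}$, is not measurable.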

If we now assume that $\mu$ is $\sigma$-finite on $\mathcal{R}$, it follows that $\Omega = \bigcup_{i=1}^{\infty} E_i$ with $\{E_i\}$ an increasing sequence of sets in $\mathcal{R}$ and $\mu(E_i)$ finite, $i = 1, 2, \ldots$.

For a fixed integer $n$, consider the ring $\mathcal{R}_n$ consisting of sets of the form $E_n \cap E$ with $E \in \mathcal{R}$. Suppose $\mu_1$ and $\mu_2$ are any two extensions of $\mu$ from $\mathcal{R}_n$ to $\mathcal{S}_n = \mathcal{S}(\mathcal{R}_n)$. Then all the subsets in $\mathcal{S}_n$ are contained in the set $E_n$, so that $\mu_1$ and $\mu_2$ are finite on $\mathcal{S}_n$. Now let $\mathcal{T}_n$ be the subclass of those sets $E$ of $\mathcal{S}_n$ for which $\mu_1(E) = \mu_2(E)$. Since finite measures are continuous from above and below, it follows that $\mathcal{T}_n$ is a monotone class. By theorem 1.5, since $\mathcal{T}_n \supset \mathcal{R}_n$, it follows that $\mathcal{T}_n \supset \mathcal{S}_n$ and we must have $\mathcal{T}_n = \mathcal{S}_n$. Thus the extension of $\mu$ to $\mathcal{S}_n$ is unique for every $n$. But, for any $E \in \mathcal{S}$ we have

$$E = \lim_{n \to \infty} E \cap E_n$$

so that a further application of the continuity theorem shows that the extension of $\mu$ to $\mathcal{S}$ must be unique.

Theorem 4.2 can be applied to any measure defined on a ring $\mathcal{R}$. In § 3.4 we saw that the concept of length in $\mathbf{R}^1$, area in $\mathbf{R}^2$ and volume in $\mathbf{R}^k$ $(k \ge 3)$ could be precisely formulated on the ring $\mathcal{E}^k$ of elementary figures to define a measure on $\mathcal{E}^k$. It is clear that $\mathbf{R}^k$ is $\sigma$-$\mathcal{E}^k$, and the measure is actually finite on $\mathcal{E}^k$. The $\sigma$-ring generated by $\mathcal{E}^k$ is the class $\mathcal{B}^k$ of Borel sets in $\mathbf{R}^k$ (proved in § 2.5). Thus if we apply the statement of theorem 4.2 to this measure $\mu: \mathcal{E}^k \to \mathbf{R}^+$ we obtain a unique extension to a measure $\nu: \mathcal{B}^k \to \mathbf{R}^+$ which is $\sigma$-finite on $\mathcal{B}^k$. It is worth noticing that in the proof of theorem 4.2 the extension was actually carried out to a class of measurable sets containing $\mathcal{B}^k$. This class is denoted by $\mathcal{L}^k$ and can be shown to be larger than $\mathcal{B}^k$. A set $E \subset \mathbf{R}^k$ is said to be Lebesgue measurable if and only if it is in the class $\mathcal{L}^k$. In particular all Borel sets in $\mathbf{R}^k$ are Lebesgue measurable. The set function $\nu: \mathcal{L}^k \to \mathbf{R}^+$ is called Lebesgue measure in $k$-space and should be thought of as a generalisation of the notion of $k$-dimensional volume to a very wide class of sets. We will examine the properties of this set function in some detail in § 4.4, and it will then become clear that many of our intuitive ideas of length, area, and volume can be precisely formulated and remain valid for Lebesgue measure.

It is worth noticing that the outer measure obtained by covering as in theorem 4.2 is always a regular outer measure. For, if $\mu^*(E) < \infty$, choose sets $T_{n,r} \in \mathcal{R}$ $(r = 1, 2, \ldots)$ such that

$$E \subset \bigcup_{r=1}^{\infty} T_{n,r}, \qquad \mu^*(E) + \frac{1}{n} > \sum_{r=1}^{\infty} \mu(T_{n,r}).$$

Then

$$A = \bigcap_{n=1}^{\infty} \bigcup_{r=1}^{\infty} T_{n,r} \supset E, \qquad A \in \mathcal{S},$$

and $\mu^*(A) = \mu^*(E)$. This means that the approach through inner measure will lead to the same class of measurable sets and the same extension to this class. In particular the Lebesgue measure can be obtained by this method provided one considers subsets of a fixed bounded interval (of finite measure) in the first instance and then allows the interval to expand to the whole Euclidean space.

Exercises 4.1

1. Suppose $\mu^*$ is an outer measure on $\Omega = \lim E_k$ where $\{E_k\}$ is a monotone increasing sequence of sets. Show that if a set $E$ is such that $E \cap E_k$ is measurable ($\mu^*$) for all sufficiently large $k$, then $E$ is measurable ($\mu^*$).

2. Show that if $\mu^*$ is a regular outer measure on $\Omega$ and $\mu^*(\Omega) < \infty$, then a necessary and sufficient condition for $E$ to be measurable ($\mu^*$) is that $\mu^*(\Omega) = \mu^*(E) + \mu^*(\Omega - E)$.

3. In each of the following cases, show that $\mu^*$ is an outer measure, and determine the class of measurable sets:

(i) $\mu^*(\emptyset) = 0$, $\mu^*(E) = 1$ for all $E \ne \emptyset$.

(ii) $\mu^*(\emptyset) = 0$, $\mu^*(E) = 1$ for $E \ne \emptyset$ or $\Omega$, $\mu^*(\Omega) = 2$.

(iii) $\Omega$ is not countable; $\mu^*(E) = 0$ if $E$ is countable, $\mu^*(E) = 1$ if $E$ is not countable.

4. Show that any outer measure which is (finitely) additive is $\sigma$-additive.

5. Suppose $\mu^*$ is an outer measure on $\Omega$ and $E$, $F$ are two subsets, at least one of which is measurable ($\mu^*$). Show that

$$\mu^*(E) + \mu^*(F) = \mu^*(E \cup F) + \mu^*(E \cap F).$$

6. Suppose $\{E_n\}$ is a sequence of sets in a $\sigma$-ring $\mathcal{S}$, and $\mu$ is a measure on $\mathcal{S}$. Show that

(i) $\mu(\liminf E_n) \le \liminf \mu(E_n)$;

(ii) provided $\bigcup_{k=n}^{\infty} E_k$ has finite measure for some $n$, $\mu(\limsup E_n) \ge \limsup \mu(E_n)$.

If $\sum_{n=1}^{\infty} \mu(E_n) < \infty$, show that $\mu(\limsup E_n) = 0$.

7. Show that, if $\mu$ is a discrete measure on $\Omega$ (as in example (6) of § 3.1 with $p_i > 0$), then the operation of extending it to an outer measure and restricting this extension to the class of measurable sets as in theorem 4.2 yields nothing new.

8. Suppose $\mathcal{M}$ is the $\sigma$-ring of $\mu^*$-measurable sets in $\Omega$. Then if $\{E_n\}$ is a monotone increasing sequence of sets in $\mathcal{M}$ and $A$ is any set,

$$\mu^*\big(\lim_{n \to \infty} A \cap E_n\big) = \lim_{n \to \infty} \mu^*(A \cap E_n).$$

Prove a corresponding result for a decreasing sequence (which needs an additional condition).

9. If $\mu^*$ is a regular outer measure, show that $\mu^*(\lim A_n) = \lim \mu^*(A_n)$ for any increasing sequence $\{A_n\}$.

10. Suppose in theorem 4.2 that $\mu$ is known only to be finitely additive on $\mathcal{R}$; then the same procedure yields an outer measure $\mu^*$ and a restriction $\bar{\mu}$ of $\mu^*$ to the $\mu^*$-measurable sets. Show that $\bar{\mu}$ is a measure but is not necessarily an extension of $\mu$.

11. Suppose $\mathcal{R}$ is a ring of subsets of a countable set $\Omega$ such that every set in $\mathcal{R}$ is either empty or infinite, but the generated $\sigma$-ring $\mathcal{S}(\mathcal{R})$ contains all subsets of $\Omega$ (see exercise 1.5 (8)). Put $\mu_1(E)$ = number of points in $E$, $\mu_2(E) = 2\mu_1(E)$ for all subsets $E \subset \Omega$. Then $\mu_1$, $\mu_2$ agree on $\mathcal{R}$ but not on $\mathcal{S}(\mathcal{R})$ so that the uniqueness assertion of theorem 4.2 requires $\mu$ to be $\sigma$-finite.

12. Suppose $h(t)$ is any continuous monotonic increasing function defined on $(0, \gamma)$, $\gamma > 0$, with $\lim_{t \to 0+} h(t) = 0$. If $\Omega$ is any metric space, let

$$h\text{-}m^*(E) = \lim_{\delta \to 0} \Big[ \inf \sum_{i=1}^{\infty} h\{\operatorname{diam}(C_i)\} \Big],$$

where the infimum is taken over all sequences $\{C_i\}$ of sets of diameter $< \delta$ which cover $E$ (if there are no such coverings then the inf is $+\infty$). Show that $h\text{-}m^*(E)$ defines an outer measure in $\Omega$. (It is called the Hausdorff measure with respect to $h(t)$.)

4.2 Complete measures

If we again think of measure as a mass distribution in the space $\Omega$, it is clear that any subset of a set of zero mass should have the mass zero assigned to it. The present section seeks to make this notion precise.

Given a measure $\tau: \mathcal{C} \to \mathbf{R}^+$ we say that the class $\mathcal{C}$ is complete with respect to $\tau$ if

$$E \subset F,\ F \in \mathcal{C},\ \tau(F) = 0 \Rightarrow E \in \mathcal{C}.$$

(Note that it then follows that $\tau(E) = 0$.) If $\tau: \mathcal{C} \to \mathbf{R}^+$ is such that $\mathcal{C}$ is complete with respect to $\tau$ we also say that $\tau$ is a complete measure.

All measures $\mu$ which are obtained (as in theorem 4.1) by restricting an outer measure $\mu^*$ to the class $\mathcal{M}$ of sets which are measurable ($\mu^*$) are complete measures. For, since outer measures are monotone and non-negative,

$$E \subset F,\ \mu^*(F) = 0 \Rightarrow \mu^*(E) = 0,$$

and all sets $E$ of zero $\mu^*$-measure are measurable ($\mu^*$) by (4.1.2) since

$$\mu^*(A) \ge \mu^*(A - E) = \mu^*(A - E) + \mu^*(A \cap E).$$

In particular Lebesgue measure defined on the class $\mathcal{L}^k$ is a complete measure.

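Given any measure $\mu$ on a $\sigma$-ring $\mathcal{S}$, there is a simple method of extending it to a complete measure on a larger $\sigma$-ring, called the completion of $\mathcal{S}$ with respect to $\mu$.

Theorem 4.3. Given a measure $\mu$ on a $\sigma$-ring $\mathcal{S}$, let $\bar{\mathcal{S}}$ be the class of all sets of the form $E \triangle N$ where $E \in \mathcal{S}$ and $N \subset F \in \mathcal{S}$ with $\mu(F) = 0$. Then $\bar{\mathcal{S}}$ is a $\sigma$-ring and if we put

$$\bar{\mu}(E \triangle N) = \mu(E),$$

then $\bar{\mu}: \bar{\mathcal{S}} \to \mathbf{R}^+$ is a (uniquely) defined extension of $\mu$ from $\mathcal{S}$ to $\bar{\mathcal{S}}$, and $\bar{\mu}$ is a complete measure on $\bar{\mathcal{S}}$.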

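Proof. Let $E_0 = E \triangle N$, where $E \in \mathcal{S}$, $N \subset F \in \mathcal{S}$, $\mu(F) = 0$. Put $E_1 = E - F$; then $E_1 \subset E_0$, $E_1 \in \mathcal{S}$ and $\mu(E_1) = \mu(E)$. If $N_1 = E_0 - E_1$, then $E_1$, $N_1$ are disjoint and $E_0 = E_1 \cup N_1$. Further, since

$$E_0 \subset E \cup F = (E - F) \cup F,$$

we have $N_1 \subset F$ and $\mu(F) = 0$. Thus the class $\bar{\mathcal{S}}$ is the same as the class of sets $E \cup N$ with $E \in \mathcal{S}$, $N \subset F \in \mathcal{S}$, $\mu(F) = 0$ and $E \cap N = \emptyset$. A similar argument shows that $\bar{\mathcal{S}}$ is also the same as the class of sets of the form $E - N$ with $E \in \mathcal{S}$, $N \subset F \in \mathcal{S}$, $\mu(F) = 0$ and $N \subset E$.

It is now easy to check that $\bar{\mathcal{S}}$ is a ring. Suppose $E_1, E_2 \in \bar{\mathcal{S}}$; first express them as $E_1 = X_1 - N_1$, $E_2 = X_2 - N_2$, $N_1 \subset X_1$, $N_2 \subset X_2$ where $N_1 \subset F_1$, $N_2 \subset F_2$ and $\mu(F_1) = \mu(F_2) = 0$. Then

$$E_1 \cap E_2 = X_1 \cap X_2 - (N_1 \cup N_2),$$

and $X_1 \cap X_2 \in \mathcal{S}$, $N_1 \cup N_2 \subset F_1 \cup F_2 \in \mathcal{S}$, $\mu(F_1 \cup F_2) = 0$; so that $E_1 \cap E_2 \in \bar{\mathcal{S}}$. Now put

$$E_1 = X_3 \cup N_3, \quad E_2 = X_2 - N_2, \quad N_3 \cap X_3 = \emptyset, \quad N_3 \subset F_3 \text{ with } \mu(F_3) = 0.$$

Then

$$E_1 - E_2 = (X_3 - X_2) \cup (N_3 - X_2) \cup (N_2 \cap E_1) = (X_3 - X_2) \cup N_5,$$

where $N_5 \subset F_3 \cup F_2$ and $\mu(F_3 \cup F_2) = 0$. Finally

$$E_1 = X_3 \cup N_3, \qquad E_2 = X_4 \cup N_4,$$

where $X_4 \cap N_4 = \emptyset$, and $N_4 \subset F_4$ with $\mu(F_4) = 0$. Then

$$E_1 \cup E_2 = (X_3 \cup X_4) \cup (N_3 \cup N_4 - X_3 \cup X_4) = (X_3 \cup X_4) \cup N_6,$$

where $N_6 \subset F_3 \cup F_4$ and $\mu(F_3 \cup F_4) = 0$. Thus $\bar{\mathcal{S}}$ is closed under the finite operations of intersection, difference, union so it is a ring. To prove it is a $\sigma$-ring, put

$$E_i = X_i \cup N_i, \quad N_i \subset F_i, \quad \mu(F_i) = 0 \qquad (i = 1, 2, \ldots);$$

then

$$\bigcup_{i=1}^{\infty} E_i = \bigcup_{i=1}^{\infty} X_i \cup \bigcup_{i=1}^{\infty} N_i = X \cup N,$$

where

$$N \subset \bigcup_{i=1}^{\infty} F_i = F \quad \text{and} \quad \mu(F) = 0.$$

Hence $\bigcup_{i=1}^{\infty} E_i \in \bar{\mathcal{S}}$.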

To see that $\bar\mu$ is uniquely defined on $\bar{\mathscr{S}}$, let

$E_1 \,\triangle\, N_1 = E_2 \,\triangle\, N_2$

be two representations of the same set. Then (see exercise 1.4 (5))

$E_1 \,\triangle\, E_2 = N_1 \,\triangle\, N_2$

and $N_1 \,\triangle\, N_2 \subset F \in \mathscr{S}$ with $\mu(F) = 0$. Hence

$\mu(E_1 - E_2) = \mu(E_2 - E_1) = 0$,

and

$\mu(E_1) = \mu(E_1 \cap E_2) = \mu(E_2)$.

Thus if we define $\bar\mu$ on $\bar{\mathscr{S}}$ by $\bar\mu(E_0) = \mu(E_1)$ if $E_0 = E_1 \,\triangle\, N_1$, $\bar\mu$ is uniquely defined.

It only remains to show that $\bar{\mathscr{S}}$ is complete with respect to $\bar\mu$. Suppose $E$ is any set of $\bar{\mathscr{S}}$ with $\bar\mu(E) = 0$. Then $E = X \cup N$ where $X \in \mathscr{S}$, $\mu(X) = 0$, $N \subset F \in \mathscr{S}$, $\mu(F) = 0$. Thus, if $G \subset E$, we have $G \subset X \cup F$ with $\mu(X \cup F) = 0$ and $X \cup F \in \mathscr{S}$; so that $G = \emptyset \cup G \in \bar{\mathscr{S}}$, and $\bar\mu(G) = 0$. ∎

We already saw that if $\mu$ was a $\sigma$-finite measure defined on a ring $\mathscr{R}$, then it had only one extension to a measure on the generated $\sigma$-ring $\mathscr{S}$.

If we now complete $\mathscr{S}$ to obtain the measure $\bar\mu$ defined on $\bar{\mathscr{S}}$, so that $\bar{\mathscr{S}}$ is now complete with respect to the extension $\bar\mu$ of $\mu$, then we have extended $\mu$ from $\mathscr{R}$ to $\bar{\mathscr{S}}$. Since the extension from $\mathscr{S}$ to $\bar{\mathscr{S}}$ is also unique, it follows that there is only one extension of $\mu$ from $\mathscr{R}$ to $\bar{\mathscr{S}}$. There is a sense in which, in general, this is as far as one can get with extensions while still preserving uniqueness, though it may be possible to extend $\mu$ further to a larger $\sigma$-field; see theorem 6.11.
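The construction of theorem 4.3 can be followed by hand on a finite space. The sketch below is only an illustration under assumptions that are not in the text: a hypothetical four-point space whose $\sigma$-ring is generated by the atoms {1}, {2, 3}, {4}, with the atom {2, 3} given measure zero. It forms every set $E \,\triangle\, N$ and checks that the value assigned to it does not depend on the representation.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical toy space Omega = {1, 2, 3, 4} with atoms {1}, {2, 3}, {4}.
atoms = [frozenset({1}), frozenset({2, 3}), frozenset({4})]
weight = {atoms[0]: 1, atoms[1]: 0, atoms[2]: 2}   # mu({2, 3}) = 0

# S: all unions of atoms (a sigma-ring, since everything is finite).
S = {frozenset().union(*choice) for choice in powerset(atoms)}

def mu(E):
    """Measure of a set E in S: sum of the weights of the atoms it contains."""
    return sum(w for atom, w in weight.items() if atom <= E)

# Subsets N of the null sets F of S.
negligible = {N for F in S if mu(F) == 0 for N in powerset(F)}

# Completion: all sets E ^ N (symmetric difference), with mu_bar(E ^ N) = mu(E).
mu_bar = {}
for E in S:
    for N in negligible:
        # setdefault keeps the first value; the assert checks uniqueness.
        assert mu_bar.setdefault(E ^ N, mu(E)) == mu(E)
S_bar = set(mu_bar)
```

In this toy case the completion adds the sets that split the null atom, such as {2} and {1, 2}, and every subset of a set of measure zero now belongs to the enlarged class.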


It should also be noticed that in the extension theorem 4.2, the class $\mathscr{M}$ of $\mu^*$-measurable sets is none other than $\bar{\mathscr{S}}$, the completion of the $\sigma$-ring $\mathscr{S}$ with respect to $\mu$. For, in the first place, $\mathscr{M} \supset \mathscr{S}$ and $\mathscr{M}$ is complete, hence $\mathscr{M} \supset \bar{\mathscr{S}}$. Secondly, if $E$ is any set of $\mathscr{M}$ such that $\mu(E) < \infty$, we can cover it by $F \in \mathscr{S}$ such that $\mu^*(F) = \mu^*(E)$. Then $F - E \in \mathscr{M}$ and has zero measure, so that it can be covered by a $G \in \mathscr{S}$ with $\mu(G) = 0$, and

$E = (F - G) \cup (E \cap G) \in \bar{\mathscr{S}}$.

Since $\mu$ is $\sigma$-finite on $\mathscr{M}$, and $\bar{\mathscr{S}}$ is a $\sigma$-ring, it now follows that $\mathscr{M} \subset \bar{\mathscr{S}}$. In particular, Lebesgue measure on $\mathscr{L}^k$ is the unique extension of the concept of length from the semi-ring $\mathscr{P}^k$ to the $\sigma$-ring $\mathscr{L}^k$ which is the completion of $\mathscr{B}^k$.

Exercises 4.2

1. Suppose $\mu$ is a measure on a $\sigma$-ring $\mathscr{S}$ and $\bar\mu$ on $\bar{\mathscr{S}}$ is its completion. Show that if $A, B \in \mathscr{S}$ with $A \subset E \subset B$, $\mu(B - A) = 0$, then $E \in \bar{\mathscr{S}}$ and $\bar\mu(E) = \mu(A) = \mu(B)$.

2. Given a $\sigma$-finite measure $\mu$ on a ring $\mathscr{R}$, the extension given by theorem 4.2 yields a complete measure on the class $\mathscr{M}$ of $\mu^*$-measurable sets which is the completion of $\mathscr{S}$, the generated $\sigma$-ring. The following example shows that this is not true if the hypothesis of $\sigma$-finiteness is omitted: Let $\Omega$ be non-countable, $\mathscr{S}$ the ring (also a $\sigma$-ring) of all sets which are countable or have countable complements, $\mu(E)$ = number of points in $E$ for $E \in \mathscr{S}$. Then $\mathscr{S}$ is complete with respect to $\mu$, but applying theorem 4.2 yields a complete measure on the class of all subsets (as every subset is measurable).

4.3 Approximation theorems

We have seen how the definition of a measure can be extended from a ring $\mathscr{R}$ to the generated $\sigma$-ring $\mathscr{S}$, and its completion $\bar{\mathscr{S}}$. It is convenient to think of the sets of $\mathscr{R}$ as having a simple structure, so that it becomes interesting to see that the sets of $\bar{\mathscr{S}}$ can always be approximated in measure with arbitrary accuracy by sets in the original ring $\mathscr{R}$.

Theorem 4.4. Suppose $\mathscr{R}$ is a ring for which $\Omega$ is $\sigma$-$\mathscr{R}$, and the $\sigma$-finite measure $\mu : \mathscr{R} \to \mathbf{R}^+$ has been extended (uniquely) to the completion $\bar{\mathscr{S}}$ of the $\sigma$-ring $\mathscr{S}$ generated by $\mathscr{R}$. Then for any $\epsilon > 0$ and any set $E \in \bar{\mathscr{S}}$ with $\mu(E) < \infty$, there is a set $F \in \mathscr{R}$ such that

$\mu(E \,\triangle\, F) < \epsilon$.


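Proof. First, find a set $E_1 \in \mathscr{S}$ such that $\mu(E \,\triangle\, E_1) = 0$. Then $\mu(E_1) = \mu(E) < \infty$, so that by the construction of theorem 4.2, we have

$\mu(E_1) = \inf \sum_{i=1}^{\infty} \mu(T_i)$, the infimum being taken over coverings $E_1 \subset \bigcup_{i=1}^{\infty} T_i$ with $T_i \in \mathscr{R}$,

so that we can choose a sequence of disjoint sets $\{T_i\}$ of $\mathscr{R}$ such that

$E_1 \subset \bigcup_{i=1}^{\infty} T_i$ and $\mu(E_1) + \tfrac12\epsilon > \sum_{i=1}^{\infty} \mu(T_i)$.

Now choose a finite integer $n$ such that

$\sum_{i=n+1}^{\infty} \mu(T_i) < \tfrac12\epsilon$,

and put

$F = \bigcup_{i=1}^{n} T_i \in \mathscr{R}$.

Then

$E_1 - F \subset \bigcup_{i=n+1}^{\infty} T_i$, so that $\mu(E_1 - F) < \tfrac12\epsilon$;

and

$F - E_1 \subset \bigcup_{i=1}^{\infty} T_i - E_1$, so that $\mu(F - E_1) < \tfrac12\epsilon$.

Hence

$\mu(E \,\triangle\, F) = \mu(E_1 \,\triangle\, F) < \epsilon$. ∎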
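The choice of $n$ in the proof can be made concrete. The following sketch assumes a hypothetical set on the line, $E = \bigcup_i T_i$ with disjoint pieces $T_i = (i, i + 2^{-i}]$, so that $\mu(E) = 1$ and the tail $\sum_{i>n}\mu(T_i)$ equals $2^{-n}$; the function names are illustrative only. Because $E$ is itself the disjoint union, $F \subset E$ and only the tail contributes to $\mu(E \,\triangle\, F)$.

```python
from fractions import Fraction

def tail_index(eps):
    """Smallest n for which the tail sum_{i>n} 2**-i = 2**-n is below eps."""
    n = 0
    while Fraction(1, 2 ** n) >= eps:
        n += 1
    return n

def approximate(eps):
    """Return (n, mu(E delta F)) for the finite union F = T_1 u ... u T_n."""
    n = tail_index(eps)
    # E - F is the union of the T_i with i > n; F - E is empty since F is inside E.
    return n, Fraction(1, 2 ** n)

n, error = approximate(Fraction(1, 100))
```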

Remark. The condition $\mu(E) < \infty$ cannot be omitted from the above theorem, since it is possible for a finite measure $\mu$ on $\mathscr{R}$ to have an extension to $\mathscr{S}$ which is $\sigma$-finite but not finite (for example, Lebesgue measure).

It is also worth noticing that the sets $E$ of $\bar{\mathscr{S}}$ can be approximated exactly in measure by sets in $\mathscr{S}$, by theorem 4.3. We noticed earlier that the outer measure $\mu^*$ generated by the process of theorem 4.2 is always regular. This means that an arbitrary set $E \subset \Omega$ is always contained in a set $F \in \mathscr{S}$ for which $\mu^*(E) = \mu(F)$, so that every set can be approximated from the outside by a set of $\mathscr{S}$ of the same measure. If $E$ is not $\mu^*$-measurable (i.e. not in $\bar{\mathscr{S}}$) then two-sided approximation is not possible.

Up to the present we have only considered general approximation theorems valid in any abstract space. If the measure is defined in a topological space, then it is of interest to obtain approximation theorems which connect the measure properties to the topology of the space. We do not, however, discuss this problem in general: instead we consider Euclidean space with the usual topology, and Lebesgue measure.


Regular measure

Suppose $\mathscr{S}$ is a $\sigma$-field of subsets of a topological space $\Omega$ which includes the open and the closed subsets of $\Omega$, and $\mu : \mathscr{S} \to \mathbf{R}^+$ is a measure. Then the measure $\mu$ is said to be regular if, for each $\epsilon > 0$,

(i) given $E \in \mathscr{S}$, there is an open $G \supset E$ with $\mu(G - E) < \epsilon$;

(ii) given $E \in \mathscr{S}$, there is a closed $F \subset E$ with $\mu(E - F) < \epsilon$.

Since the class $\mathscr{B}$ of Borel sets in $\Omega$ is the $\sigma$-field generated by the open sets, the condition that $\mathscr{S}$ includes the open sets implies $\mathscr{S} \supset \mathscr{B}$. If $\mu$ is regular on $\mathscr{S}$, then $\mathscr{S} \subset \bar{\mathscr{B}}$, where $\bar{\mathscr{B}}$ denotes the completion of $\mathscr{B}$ with respect to $\mu$; for if $\delta_n$ is a sequence of positive numbers decreasing to zero one can find for any $E$ in $\mathscr{S}$ an open set $G_n$ and a closed set $F_n$ such that $\mu(G_n - F_n) < \delta_n$ and $G_n \supset E \supset F_n$, and

$G = \bigcap_{n=1}^{\infty} G_n$, $F = \bigcup_{n=1}^{\infty} F_n$

will then be Borel sets with $G \supset E \supset F$ and $\mu(G - F) = 0$.

Metric outer measure

An outer measure $\mu^*$ defined on a metric space $\Omega$ and such that $\mu^*$ is additive on separated sets, i.e.

$d(E, F) > 0 \Rightarrow \mu^*(E \cup F) = \mu^*(E) + \mu^*(F)$,

is said to be a metric outer measure. It can be proved that, for any metric outer measure, the class $\mathscr{M}$ of measurable sets contains the open sets (and therefore contains $\mathscr{B}$), and that, if $\mu^*$ is also $\sigma$-finite, the restriction of $\mu^*$ to $\mathscr{M}$ is regular. Since Lebesgue measure is generated by a metric outer measure, this general theory would allow us to deduce that Lebesgue measure is regular. However, we prefer instead to prove the result only for the special case of Lebesgue measure.

Theorem 4.5. Lebesgue $k$-dimensional measure, defined on the class $\mathscr{L}^k$ of Lebesgue measurable sets in $\mathbf{R}^k$, is a regular measure.

Proof. We give the details of the proof for $k = 1$; only obvious alterations are needed for general $k$. Suppose $E \in \mathscr{L} = \mathscr{L}^1$; then $E \cap [n, n+1) = E_n \in \mathscr{L}$ for every integer $n$, and $\mu(E_n) \le 1 < \infty$. By the construction of theorem 4.2, there is a countable covering $\{C_{ni}\}$ of $E_n$ by half-open intervals of $\mathscr{P}$ such that

$\mu(E_n) + \tfrac14\,\dfrac{\epsilon}{2^{|n|}} > \sum_{i=1}^{\infty} \mu(C_{ni})$.

Enlarge each of these intervals $C_{ni}$ to an open interval $G_{ni}$ such that

$\mu(G_{ni} - C_{ni}) < \tfrac14\,\dfrac{\epsilon}{2^{|n|+i}}$.

Then $Q_n = \bigcup_{i=1}^{\infty} G_{ni}$ is an open set which contains $E_n$ and satisfies

$\mu(Q_n - E_n) < \tfrac12\,\dfrac{\epsilon}{2^{|n|}}$.

If we now put $Q = \bigcup_{n=-\infty}^{\infty} Q_n$, then $Q$ is open, $Q \supset E$, and

$\mu(Q - E) \le \sum_{n=-\infty}^{\infty} \mu(Q_n - E_n) < \tfrac32\epsilon$,

since $\sum_{n=-\infty}^{\infty} 2^{-|n|} = 3$. As $\epsilon$ is arbitrary, this proves condition (i) for regularity.

For any $E \in \mathscr{L}$, $\Omega - E \in \mathscr{L}$, and we can apply the above argument to obtain an open $R \supset \Omega - E$ such that $\mu(R - (\Omega - E)) < \epsilon$. Then $F = \Omega - R$ is closed, $F \subset E$ and $\mu(E - F) = \mu(R \cap E) < \epsilon$, so that the second condition for regularity is also satisfied. ∎
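The device used in the proof, giving the $i$-th piece of a countable covering an allowance proportional to $\epsilon/2^i$, can be seen on the simplest null set. The sketch below is an illustration only and makes its own assumptions: it enumerates the first 50 rationals of $[0, 1]$ in a hypothetical order (by increasing denominator) and surrounds the $i$-th one by an open interval of length $\epsilon/2^{i+1}$. The total length stays below $\epsilon/2$ however many points are taken, which bounds the measure of the open union.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def open_cover(points, eps):
    """Open interval of length eps / 2**(i + 1) around the i-th point, i = 1, 2, ..."""
    cover = []
    for i, x in enumerate(points, start=1):
        half = eps / 2 ** (i + 2)          # half the length of the i-th interval
        cover.append((x - half, x + half))
    return cover

eps = Fraction(1, 10)
cover = open_cover(rationals_in_unit_interval(50), eps)
total_length = sum(b - a for a, b in cover)   # upper bound for the measure of the union
```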

Corollary. Given any set $E \in \mathscr{L}^k$, there is a $\mathscr{G}_\delta$-set $Q$ and an $\mathscr{F}_\sigma$-set $R$ such that

$Q \supset E \supset R$ and $\mu(Q - R) = 0$.

Proof. Note that $\mathscr{G}_\delta$ and $\mathscr{F}_\sigma$ sets were defined in § 2.5. For each integer $n$, take an open set $G_n \supset E$ and a closed set $F_n \subset E$ such that

$\mu(G_n - E) < \dfrac1n$, $\mu(E - F_n) < \dfrac1n$.

The sets

$Q = \bigcap_{n=1}^{\infty} G_n$ and $R = \bigcup_{n=1}^{\infty} F_n$

then satisfy the conditions of the corollary. ∎

This corollary strengthens the result that any set in $\mathscr{L}^k$ can be approximated exactly in measure by a set in $\mathscr{B}^k$, which follows from the fact that $\mathscr{L}^k$ is the completion of $\mathscr{B}^k$ with respect to Lebesgue measure.

Exercises 4.3

1. Suppose $\mathscr{S}$ is the $\sigma$-ring generated by a ring $\mathscr{R}$ and $\mu$, $\nu$ are two $\sigma$-finite measures on $\mathscr{R}$, extended to $\mathscr{S}$. Show that if $E \in \mathscr{S}$ is such that both $\mu(E)$, $\nu(E)$ are finite then, for any $\epsilon > 0$, there is a set $E_0 \in \mathscr{R}$ for which

$\mu(E \,\triangle\, E_0) < \epsilon$, $\nu(E \,\triangle\, E_0) < \epsilon$.

2. Suppose $\Omega$ is a metric space and $\mu^*$ is an outer measure on $\Omega$ such that every Borel set is $\mu^*$-measurable. Show that $\mu^*$ is a metric outer measure, i.e. that for $E_1, E_2 \subset \Omega$,

$d(E_1, E_2) > 0 \Rightarrow \mu^*(E_1 \cup E_2) = \mu^*(E_1) + \mu^*(E_2)$.

Hint. Take an open set $G \supset E_1$, $G \cap E_2 = \emptyset$ and use $E_1 \cup E_2$ as a test set for the measurability of $G$.

3. Suppose $\mu^*$ is a metric outer measure on a metric space $\Omega$. Show that if $E$ is a subset of an open subset $G$ and $E_n = E \cap \{x : d(x, \Omega - G) > 1/n\}$ then

$\lim_{n \to \infty} \mu^*(E_n) = \mu^*(E)$.

Hint. $\{E_n\}$ is a monotonic increasing sequence of sets whose limit is $E$. Put $E_0 = \emptyset$, $D_n = E_{n+1} - E_n$ and notice that if neither $D_{n+1}$ nor $E_n$ is empty then $d(D_{n+1}, E_n) > 0$, so that

$\mu^*(E_{2n+1}) \ge \sum_{i=1}^{n} \mu^*(D_{2i})$, $\mu^*(E_{2n}) \ge \sum_{i=1}^{n} \mu^*(D_{2i-1})$.

If either of these series diverges, then $\mu^*(E_n) \to \infty = \mu^*(E)$. If both converge, use

$\mu^*(E) \le \mu^*(E_{2n}) + \sum_{i=n}^{\infty} \mu^*(D_{2i}) + \sum_{i=n}^{\infty} \mu^*(D_{2i+1})$.

4. If $\mu^*$ is a metric outer measure, show that all open sets (and therefore all Borel sets) are $\mu^*$-measurable.

Hint. If $G$ is open and $A$ is any subset, use the notation of (3) applied to $E = A \cap G$. Then $d(E_n, A - G) > 0$, so

$\mu^*(A) \ge \mu^*\{E_n \cup (A - G)\} = \mu^*(E_n) + \mu^*(A - G)$.

4.4* Geometrical properties of Lebesgue measure

We have now defined Lebesgue measure in Euclidean space and considered some of its measure-theoretic properties. However, the justification for studying Lebesgue measure is that it makes precise our intuitive notion of length, area, volume in Euclidean space and generalises these notions to sets where our intuition breaks down. In the present section we want to show that Lebesgue measure has the properties which geometrical intuition would lead us to expect.

It is convenient to adopt the notation $|E|$ for the Lebesgue measure of any set $E \in \mathscr{L}^k$, so that for sets $E \in \mathscr{L}^1$, $|E|$ is a generalisation of length; for sets $E \in \mathscr{L}^2$, $|E|$ is a generalisation of area; for sets $E \in \mathscr{L}^k$ $(k \ge 3)$, $|E|$ is a generalisation of volume. Since the set consisting of a single point $x$ can be enclosed in an interval of $\mathscr{P}^k$ of arbitrarily small length, it follows that

$|\{x\}| = 0$ for $x \in \mathbf{R}^k$.

In particular, in $\mathbf{R}^1$, $|[a,b]| = |(a,b)| = |(a,b]| = |[a,b)| = b - a$, so that the Lebesgue measure of any interval on the line is its length. Any countable set in $\mathbf{R}^k$ is the union of its single points, and is therefore of zero measure. In particular the set of points in $\mathbf{R}^k$ with rational coordinates forms a set of zero measure (even though this set is dense in the whole space).

In $\mathbf{R}^k$ $(k \ge 2)$, any segment of length $l$ of a straight line can be covered by $[nl] + 1$ cubes of $\mathscr{P}^k$ of side $1/n$, so that the Lebesgue measure of such a segment must be less than ($[x]$ denotes the largest integer not greater than $x$)

$\left(\dfrac1n\right)^{k} \{[nl] + 1\} = O\!\left(\dfrac{1}{n^{k-1}}\right) \to 0$ as $n \to \infty$,

and so $|L| = 0$ for any segment $L$ of finite length. Any infinite straight line in $\mathbf{R}^k$, $k \ge 2$, is the countable union of segments of finite length, so that $|L| = 0$ for any straight line $L$ in $\mathbf{R}^k$ $(k \ge 2)$. It follows that, if we are calculating the measure of any geometrical figure in the plane which is bounded by a countable collection of straight lines, then the area will be the same whether all, some or none of the boundary lines are included in the set.
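The bound for a segment is easy to tabulate. The sketch below simply evaluates the total volume $\{[nl] + 1\}/n^k$ of the covering cubes; the segment length 3 and the dimension 2 are arbitrary choices made for illustration.

```python
from fractions import Fraction
from math import floor

def cover_bound(length, n, k):
    """Total volume of [n * length] + 1 cubes of side 1/n in R^k."""
    cubes = floor(n * length) + 1
    return cubes * Fraction(1, n) ** k

# Upper bounds for the measure of a segment of length 3 in the plane.
bounds = [cover_bound(Fraction(3), n, 2) for n in (1, 10, 100, 1000)]
```

The bounds decrease like $1/n$ here, in line with the estimate $O(1/n^{k-1})$ for $k = 2$.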

The above argument shows that there are sets $E$ in $\mathbf{R}^k$ $(k \ge 2)$ which are not countable, but such that $|E| = 0$. The question arises whether or not such sets exist in $\mathbf{R}^1$. This is easily answered by the Cantor set

$C = \bigcap_{n=0}^{\infty} F_n$,

defined in § 2.7, where $F_0 = [0, 1]$ and $F_n$ is obtained from $F_{n-1}$ by replacing each closed interval of $F_{n-1}$ by two closed intervals obtained by removing an open interval of one third its length from the centre. We proved that $C$ was perfect and therefore non-countable. But

$|F_n| = \tfrac23 |F_{n-1}| = (\tfrac23)^n |F_0| = (\tfrac23)^n$,

so that

$|C| = \lim_{n \to \infty} |F_n| = 0$.
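The recursion $|F_n| = \tfrac23|F_{n-1}|$ can be checked by building the stages explicitly. This is a minimal sketch using exact rational arithmetic; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

def next_stage(intervals):
    """Replace each closed interval [a, b] by its two outer closed thirds."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def stage(n):
    """Closed intervals making up F_n, starting from F_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = next_stage(intervals)
    return intervals

def total_length(intervals):
    """Lebesgue measure of a finite union of disjoint intervals."""
    return sum(b - a for a, b in intervals)
```

For example, `total_length(stage(n))` equals `Fraction(2, 3) ** n` for each `n`, while the number of intervals doubles at every stage.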

It is worth remarking that it is also possible for perfect nowhere dense sets in $\mathbf{R}$ to have positive measure; see exercises 4.4 (2, 3).

We now consider what happens to the Lebesgue measure of sets under elementary transformations of the space.

(i) Translation

Suppose $x \in \mathbf{R}^k$ and $E \subset \mathbf{R}^k$. Put

$E(x) = \{z : z = x + y,\ y \in E\}$.

For the intervals $I \in \mathscr{P}^k$, it is immediate that

$|I(x)| = |I|$,

so that the outer measure $\mu^*$ is invariant under translations, and Lebesgue measure must therefore also be invariant provided measurability is preserved. Suppose $E \in \mathscr{L}^k$, and $A$ is a test set for $E(x)$. Then since $E$ is measurable, using $A(-x)$ as a test set,

$\mu^*(A(-x)) = \mu^*(A(-x) \cap E) + \mu^*(A(-x) - E)$,

so that

$\mu^*(A) = \mu^*(A \cap E(x)) + \mu^*(A - E(x))$,

and $E(x)$ must also be measurable.

(ii) Reflexion in a plane perpendicular to an axis

(For $k = 1$ this means reflexion in a point, for $k = 2$ this means reflexion in a line parallel to an axis.) It is clear that $\mu^*$ is invariant under such a reflexion because the reflexion of the covering sets of $\mathscr{P}^k$ again gives half-open intervals of the same measure. A similar argument to that used in (i) shows that measurability is preserved, so that Lebesgue measure is invariant under such reflexions.

(iii) Uniform magnification

For $p > 0$, the transformation of $\mathbf{R}^k$ obtained by putting $y = px$ for all $x \in \mathbf{R}^k$ will be called a magnification by the factor $p$, and $pE$ denotes the result of applying this magnification to the set $E$. If $I \in \mathscr{P}^k$, then it is clear that

$pI \in \mathscr{P}^k$ and $|pI| = p^k |I|$.

Hence, if $\mu^*$ denotes the outer measure generated by Lebesgue measure on $\mathscr{P}^k$,

$\mu^*(pE) = p^k \mu^*(E)$

for all sets $E$. A similar argument to that used in (i) shows that measurability is preserved by magnification, so that if $E$ is Lebesgue measurable, so is $pE$ and

$|pE| = p^k |E|$.
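For intervals, the statements in (i) and (iii) reduce to arithmetic on the side lengths. The sketch below represents an interval of $\mathscr{P}^k$ as a list of pairs $(a_i, b_i)$; this representation and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def volume(box):
    """Lebesgue measure of the box with sides (a_i, b_i], given as a list of pairs."""
    v = Fraction(1)
    for a, b in box:
        v *= (b - a)
    return v

def translate(box, x):
    """The translated box I(x), moving side i by x[i]."""
    return [(a + t, b + t) for (a, b), t in zip(box, x)]

def magnify(box, p):
    """The magnified box pI for a factor p > 0."""
    return [(p * a, p * b) for a, b in box]

box = [(0, 1), (1, 3), (2, 5)]          # an interval in R^3 of volume 6
moved = translate(box, (7, -2, Fraction(1, 2)))
scaled = magnify(box, Fraction(1, 2))   # volume should be (1/2)**3 * 6
```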

(iv) Rotation about the origin

Lebesgue measure is invariant in this case also, but rather more work is needed to prove it. The key idea needed for the proof is that an open sphere centre 0 is invariant under rotation about 0. Suppose $I$ is a fixed interval of $\mathscr{P}^k$,

$I = \{x : a_i < x_i \le b_i,\ i = 1, 2, \dots, k\}$.

Then for any $x \in \mathbf{R}^k$ and $p > 0$, $(pI)(x)$ is an interval of $\mathscr{P}^k$ similar, and similarly situated, to $I$. If $\chi$ denotes the transformation of $\mathbf{R}^k$ consisting of a fixed rotation about 0, then

$\chi(pI)(x) = (p\chi I)(\chi x)$.

By (i) and (iii),

$|\chi(pI)(x)| = p^k |\chi I|$, $|(pI)(x)| = p^k |I|$,

so that

$|\chi(pI)(x)| = \dfrac{|\chi I|}{|I|}\, |(pI)(x)|$

for all $p > 0$, $x \in \mathbf{R}^k$. This means that, for a given $\chi$ and $I$, the effect on the measure is the same for all intervals of the form $(pI)(x)$.

Now any open set $G$ can be expressed as a countable union of disjoint sets of the form $(pI)(x)$. In particular the unit open sphere $S$, centre the origin, can be expressed this way:

$S = \bigcup_{i=1}^{\infty} (p_i I)(x_i)$, and $|S| = \sum_{i=1}^{\infty} |(p_i I)(x_i)|$.

But $\chi S = S$, so that

$|S| = |\chi S| = \sum_{i=1}^{\infty} |\chi (p_i I)(x_i)| = \dfrac{|\chi I|}{|I|} \sum_{i=1}^{\infty} |(p_i I)(x_i)| = \dfrac{|\chi I|}{|I|}\, |S|$,

and

$|\chi I| = |I|$.

This argument is valid for any interval $I \in \mathscr{P}^k$.

We can now use arguments similar to those in (i) to show that, for any set $E \subset \mathbf{R}^k$,

$\mu^*(\chi E) = \mu^*(E)$,

and measurability is preserved under $\chi$. Thus if $E \in \mathscr{L}^k$, $\chi E$ is also in $\mathscr{L}^k$ and

$|\chi E| = |E|$.

Note finally that reflexion in an arbitrary plane can be obtained by successively applying the operations (iv), (ii), (i), (iv). We have thus proved

Theorem 4.6. The class $\mathscr{L}^k$ of Lebesgue measurable subsets of $\mathbf{R}^k$, and Lebesgue measure on $\mathscr{L}^k$, are invariant under translations, reflexions and rotations. If $E$ and $F$ are two subsets of $\mathbf{R}^k$ which are congruent in the sense of Euclid and $E$ is measurable, then so is $F$ and

$|E| = |F|$.

For $p > 0$, if $pE$ denotes the set of vectors $x$ of the form $py$, $y \in E$, then

$E \in \mathscr{L}^k \Rightarrow pE \in \mathscr{L}^k$, and $|pE| = p^k |E|$.

If $k$, $l$, $r$ are positive integers and $k + l = r$, then the Euclidean space $\mathbf{R}^r$ can be thought of as a Cartesian product $\mathbf{R}^k \times \mathbf{R}^l$. We have defined Lebesgue measure independently in each dimension, but the measure of the primary sets $\mathscr{P}^r$ could have been obtained as a product of the measures of corresponding sets in $\mathscr{P}^k$, $\mathscr{P}^l$. It is therefore not surprising that this is true of a wider class of sets.

Theorem 4.7. If $E \in \mathscr{L}^k$, $F \in \mathscr{L}^l$ then the Cartesian product $E \times F \in \mathscr{L}^{k+l}$ and

$|E \times F| = |E| \cdot |F|$.

Proof. We use $\mu^*$ to denote the outer measure generated by Lebesgue measure in the space where the set lies. Suppose first that $E$, $F$ are bounded so that there are finite open intervals $J$, $K$ such that $E \subset J$, $F \subset K$. We can then cover $E$ and $F$ by countable collections of open intervals such that

$E \subset \bigcup_{i=1}^{\infty} Q_i \subset J$, $F \subset \bigcup_{j=1}^{\infty} R_j \subset K$,

$\sum_{i=1}^{\infty} |Q_i| < |E| + \epsilon$, $\sum_{j=1}^{\infty} |R_j| < |F| + \epsilon$.

Then $E \times F \subset \bigcup_{i,j} Q_i \times R_j$, so that

$\mu^*(E \times F) \le \sum_{i,j} |Q_i \times R_j| = \sum_{i,j} |Q_i|\,|R_j| = \sum_{i=1}^{\infty} |Q_i| \sum_{j=1}^{\infty} |R_j| < (|E| + \epsilon)(|F| + \epsilon)$.

Since $\epsilon$ is arbitrary, it follows that

$\mu^*(E \times F) \le |E| \cdot |F|$. (4.4.1)

But

$J \times K = E \times F \cup (J - E) \times F \cup E \times (K - F) \cup (J - E) \times (K - F)$,

and the subadditivity of $\mu^*$ gives, with (4.4.1),

$\mu^*(J \times K) \le |E| \cdot |F| + |J - E| \cdot |F| + |E| \cdot |K - F| + |J - E| \cdot |K - F|$.

But $J \times K$ is an open rectangle and therefore in $\mathscr{L}^{k+l}$, and

$\mu^*(J \times K) = |J| \cdot |K| = (|E| + |J - E|)(|F| + |K - F|)$.

It follows that all the inequalities of type (4.4.1) must be equalities. In particular

$\mu^*(E \times F) = |E| \cdot |F|$. (4.4.2)

By the corollary to theorem 4.5, we can find sequences $\{A_n\}$, $\{B_m\}$ of disjoint closed sets such that

$A = \bigcup_{n=1}^{\infty} A_n \subset E$, $B = \bigcup_{m=1}^{\infty} B_m \subset F$, $|E - A| = 0$, $|F - B| = 0$.

Since $A \times B$ is an $\mathscr{F}_\sigma$ set in $\mathbf{R}^{k+l}$ it is measurable and

$\mu^*(A \times B) = |A \times B| = |A| \cdot |B| = |E| \cdot |F|$.

But $A \times B \subset E \times F$, and Lebesgue measure is complete, so that we must have $E \times F$ measurable and

$|E \times F| = \mu^*(E \times F) = |E| \cdot |F|$.

In order to remove the restriction of boundedness, apply the above to $E \cap S_n$, $F \cap S_n'$, where $S_n$, $S_n'$ are spheres of radius $n$ centre the origin in $k$-space, $l$-space respectively. This shows that, for each $n$,

$(E \cap S_n) \times (F \cap S_n') \in \mathscr{L}^{k+l}$, $|(E \cap S_n) \times (F \cap S_n')| = |E \cap S_n| \cdot |F \cap S_n'|$,

and the result follows from the continuity of measures on letting $n \to \infty$. ∎

Non-measurable sets

We have now seen that Lebesgue measure can be defined on $\mathscr{L}^k$, a large class of subsets of $\mathbf{R}^k$, in such a way as to preserve the intuitive geometrical ideas of volume. We also remarked earlier that it is impossible to define such a measure on all subsets of $\mathbf{R}^k$, so we now demonstrate the existence of at least one subset which is not in $\mathscr{L}^k$. Again we carry out the construction for $k = 1$. Consider subsets $E \subset (0, 1]$ and for $x \in (0, 1]$ let $E(x)$ be the set of real numbers $z$ such that

$z = x + y$, $y \in E$ and $x + y \le 1$,

or

$z = x + y - 1$, $y \in E$ and $x + y > 1$;

that is, $E(x)$ is the result of translating $E$ a distance $x$ and then taking the non-integer part. From property (i), it follows immediately that

$E \in \mathscr{L} \Rightarrow E(x) \in \mathscr{L}$, $|E| = |E(x)|$.

Now let $Z$ be the set of rationals in $(0, 1]$. Two sets $Z(x_1)$, $Z(x_2)$ will be disjoint if $(x_1 - x_2)$ is irrational and identical if $(x_1 - x_2)$ is rational. Let $\mathscr{C}$ be the class of disjoint sets of the form $Z(x)$. By the axiom of choice (see § 1.6) there is a set $T$ containing precisely one point from each of the sets in $\mathscr{C}$. If $Z$ is the set $\{r_1, r_2, \dots\}$, we put

$Q_n = T(r_n)$ $(n = 1, 2, \dots)$.

Then

$\bigcup_{n=1}^{\infty} Q_n = (0, 1]$,

since every point $x \in (0, 1]$ is in $Z(x_1)$ for some $x_1$, and if $q \in Z(x_1) \cap T$, then $x - q$ is rational, so that $x = q + r_n$ or $x = q + r_n - 1$ for some $n$, and $x \in Q_n$. Also the sets $Q_n$ are disjoint, as $T$ contains only one point from each set in $\mathscr{C}$ and therefore cannot contain two points differing by a rational. If $T \in \mathscr{L}$, then $Q_n \in \mathscr{L}$ $(n = 1, 2, \dots)$ and $|T| = |Q_n|$ $(n = 1, 2, \dots)$. But then

$1 = |(0, 1]| = \sum_{n=1}^{\infty} |Q_n|$,

and this equation cannot possibly be satisfied either by $|Q_n| = 0$ or by $|Q_n| = c > 0$ for all $n$. The only possibility is that the set $T$ is not measurable.

It is worth remarking that there are many more Lebesgue sets than there are Borel sets. The number of sets in $\mathscr{L}^k$ is not more than the number of subsets of $\mathbf{R}^k$, i.e. not more than $2^c$. However it is at least $2^c$, for it contains all subsets of the Cantor set (perfect with $c$ points in it), so that the cardinality of $\mathscr{L}^k$ must be $2^c$. However the cardinality of the class $\mathscr{B}^k$ of Borel subsets of $\mathbf{R}^k$ is $c$, and $c < 2^c$ (see § 1.3), so that there must be some sets which are in $\mathscr{L}^k$ but not in $\mathscr{B}^k$; this means that the class $\mathscr{B}^k$ is not complete with respect to Lebesgue measure. In order actually to exhibit a set in $\mathscr{L}^k$ but not in $\mathscr{B}^k$ one has to work a bit harder, so we do not include such an example.

Exercises 4.4

1. Show that the set of points in $[0, 1]$ whose binary expansion has zero in all the even places is a Lebesgue measurable set of zero measure. Is it a Borel set?

2. By changing the lengths of the extracted intervals in the construction of the Cantor set, show how to obtain a nowhere dense perfect set of measure $\tfrac12$.

3. Generalise (2) to show that for any $\epsilon > 0$ there is a nowhere dense, perfect subset of $[0, 1]$ with measure greater than $1 - \epsilon$.

4. Consider a union of sets of (3) to obtain a subset of $[0, 1]$ of full measure which is of the first category, and another subset of $[0, 1]$ of zero measure which is of the second category.

5. Show that any bounded set in Euclidean space $\mathbf{R}^k$ has finite Lebesgue outer measure. Is the converse of this statement true?

6. Suppose $X$ is the circumference of a unit circle in $\mathbf{R}^2$. Show that there is a unique measure $\mu$ defined on Borel subsets of $X$ such that $\mu(X) = 1$ and $\mu$ is invariant under all rotations of $X$ into itself.

7. By considering suitable approximating polygons (finite unions of rectangles will do), show that the area of the plane region bounded by $x = 1$, $y = 0$, $y = x^3$ is $\tfrac14$. Generalise to the case $y = x^k$, where $k > 0$ but need not be an integer.

8. Show that a subset $E$ of a bounded interval $I \subset \mathbf{R}^k$ is measurable if, for any $\epsilon > 0$, there are elementary figures $Q_1, Q_2 \in \mathscr{E}^k$ such that

$Q_1 \supset E$, $Q_2 \supset I - E$ and $|Q_1| + |Q_2| < |I| + \epsilon$.

9. Suppose $X$ is the unit square $\{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$. If $E \subset [0, 1]$ put $\hat{E} = \{(x, y) : x \in E,\ 0 \le y \le 1\}$ and let $\mathscr{S}$ be the class of sets $\hat{E}$ such that $E$ is $\mathscr{L}^1$-measurable. Put $\mu(\hat{E}) = |E|$, and show that the subset $M = \{(x, y) : 0 \le x \le 1,\ y = \tfrac12\}$ is not measurable with respect to the outer measure $\mu^*$ generated by $\mu$ on the class of all subsets of $X$. Show that

$\mu^*(M) = 1$, $\mu^*(X - M) = 1$.

4.5 Lebesgue-Stieltjes measure

There are other measures in $\mathbf{R}^k$ which are of importance in probability theory. Suppose $F : \mathbf{R} \to \mathbf{R}$ is a monotone increasing real valued function of a real variable which is everywhere continuous on the right. Such a function is called a Stieltjes measure function. Put

$\mu_F(a, b] = F(b) - F(a)$

for each $(a, b] \in \mathscr{P}$. Then $\mu_F$ is non-negative and (finitely) additive on $\mathscr{P}$; the proof used for the length function in § 3.4 can be easily adapted to show this (the length function corresponds to $F(x) = x$). By applying theorem 3.4 we can extend $\mu_F$ uniquely to an additive set function on $\mathscr{E}$, the ring of elementary figures. As in § 3.4 we again have at least two methods of showing that $\mu_F$ is a measure on $\mathscr{E}$. By theorem 3.2 (iii), if $\mu_F$ is not a measure, then there is a monotone decreasing sequence $\{E_n\}$ of sets of $\mathscr{E}$ such that $\lim E_n = \emptyset$, but $\lim \mu_F(E_n) = \delta > 0$. The argument used in the Lebesgue case for $k = 1$ can be modified by using the fact that, for any $\epsilon > 0$, if $\mu_F(a, b] > 0$, we can always find a $y > 0$ such that

$(a + y, b] \subset [a + y, b] \subset (a, b]$ and $\mu_F(a + y, b] > \mu_F(a, b] - \epsilon$,

since $F$ is continuous on the right at $a$. This leads us to a contradiction which establishes that $\mu_F$ is a measure on $\mathscr{E}$.

For $k \ge 2$, we must start with a function $F : \mathbf{R}^k \to \mathbf{R}$ which is continuous on the right in each variable separately and such that, for $I \in \mathscr{P}^k$,

$\mu_F(I) = \sum_{i=1}^{2^k} \gamma_i F(V_i) \ge 0$, (4.5.1)

where $V_i$ are the $2^k$ vertices of the set $I \in \mathscr{P}^k$, and $\gamma_i = +1$ for the vertex in which each co-ordinate is largest and $\gamma_i = (-1)^r$ if the vertex $V_i$ is such that $r$ of its coordinates are at the lower bound (and $(k - r)$ at the upper bound). Any such function $F$ is called a $k$-dimensional Stieltjes measure function. With a little care it is not difficult to show that, under these conditions, $\mu_F$ is a non-negative additive set function on $\mathscr{P}^k$ and that it therefore has a unique extension to $\mathscr{E}^k$. Either of the arguments given in § 3.4 can now be modified to show that $\mu_F$ is a measure on $\mathscr{E}^k$.
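The vertex sum (4.5.1) is short to write out in code. The sketch below is an illustration with hypothetical helper names: `mu_F` forms the signed sum over the $2^k$ vertices, `area` is the product function $F(x_1, x_2) = x_1 x_2$, and `exercise_F` is the function of exercise 4.5 (1), which is monotonic in each variable yet gives a negative value on one rectangle.

```python
from itertools import product

def mu_F(F, lower, upper):
    """Vertex sum (4.5.1) for the interval {x : lower[i] < x_i <= upper[i]}."""
    k = len(lower)
    total = 0
    for choice in product((0, 1), repeat=k):   # 1 marks a coordinate at its lower bound
        vertex = [lower[i] if at_low else upper[i]
                  for i, at_low in enumerate(choice)]
        r = sum(choice)                        # number of coordinates at the lower bound
        total += (-1) ** r * F(*vertex)
    return total

def area(x1, x2):
    """F(x1, x2) = x1 * x2, which reproduces Lebesgue measure in the plane."""
    return x1 * x2

def exercise_F(x1, x2):
    """The function of exercise 4.5 (1): monotonic in each variable separately."""
    s = x1 + x2
    return max(0, s + 1) if s < 0 else 1

rectangle_area = mu_F(area, (0, 1), (2, 4))          # the rectangle (0, 2] x (1, 4]
bad_value = mu_F(exercise_F, (-1, 0), (0, 1))        # the rectangle (-1, 0] x (0, 1]
```

On these inputs the first value is the ordinary area of the rectangle, while the second is negative, so `exercise_F` fails condition (4.5.1).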

We can now apply theorem 4.2 to this measure $\mu_F$ to extend it to the $\sigma$-ring $\mathscr{B}^k$ of Borel sets in $\mathbf{R}^k$. As in the case of Lebesgue measure, this extension automatically defines $\mu_F$ on the completion $\mathscr{L}_F^k$ of $\mathscr{B}^k$ with respect to $\mu_F$. The class $\mathscr{L}_F^k$ is called the class of sets which are Lebesgue-Stieltjes measurable for the function $F$. The class clearly depends on the function $F$; for in the particular case $F \equiv c$, $\mathscr{L}_F^k$ contains all subsets of $\mathbf{R}^k$, as $\mu_F(\mathbf{R}^k) = 0$ and $\mu_F$ is complete; while if $F(x_1, x_2, \dots, x_k) = x_1 x_2 \cdots x_k$, then $\mu_F$ is the length function and $\mathscr{L}_F^k$ is the Lebesgue class $\mathscr{L}^k$.

Each of these measures $\mu_F : \mathscr{L}_F \to \mathbf{R}^+$ is regular. The proof given in theorem 4.5 can easily be modified to show this (we again do the case $k = 1$) by using the fact that, for any $\epsilon > 0$, if $(a, b] \in \mathscr{P}$, there is a $y > 0$ such that

$(a, b + y] \supset (a, b + y) \supset (a, b]$ and $\mu_F(a, b + y) \le \mu_F(a, b + y] < \mu_F(a, b] + \epsilon$,

to obtain economical coverings by open intervals.

Probability measure

Given a $\sigma$-field $\mathscr{F}$ of subsets of $\Omega$, any measure $P : \mathscr{F} \to \mathbf{R}^+$ such that $P(\Omega) = 1$ is called a probability measure on $\mathscr{F}$. If in addition $\mathscr{F}$ is complete with respect to $P$ we will say that the triple $(\Omega, \mathscr{F}, P)$ form a probability space.

Distribution function

A function $F : \mathbf{R} \to \mathbf{R}$ is called a distribution function if

(i) $F$ is monotonic increasing, continuous on the right;

(ii) $F(x) \to 0$ as $x \to -\infty$, $F(x) \to 1$ as $x \to +\infty$.

A function $F : \mathbf{R}^k \to \mathbf{R}$ is called a ($k$-dimensional) distribution function if

(i) $F$ is continuous on the right in each variable;

(ii) $\mu_F(I) \ge 0$ for all $I \in \mathscr{P}^k$, where $\mu_F$ is defined by (4.5.1);

(iii) $F(x_1, x_2, \dots, x_k) \to 0$ as any one of $x_1, x_2, \dots, x_k \to -\infty$, and $F(x_1, x_2, \dots, x_k) \to 1$ as $x_1, x_2, \dots, x_k$ all $\to +\infty$.

It is immediate from our definitions that any distribution function $F$ can be used to define a Lebesgue-Stieltjes measure $\mu_F$ on the $\sigma$-field $\mathscr{L}_F$. Further $\mu_F(\mathbf{R}^k) = 1$ and $\mu_F$ is complete, so that every distribution function determines a probability measure and $(\mathbf{R}^k, \mathscr{L}_F, \mu_F)$ is a probability space. There is a sense in which these are the only interesting probability measures on $\mathbf{R}^k$.

Theorem 4.8. Suppose $\mathscr{S}$ is a $\sigma$-field of sets in $\mathbf{R}$, $\mathscr{S}$ contains the open sets and $\mu : \mathscr{S} \to \mathbf{R}^+$ is a complete measure which is finite on bounded sets in $\mathscr{S}$. Then there is a Stieltjes measure function $F : \mathbf{R} \to \mathbf{R}$ such that $\mathscr{S} \supset \mathscr{L}_F$ and $\mu$ coincides with $\mu_F$ on $\mathscr{L}_F$. If $(\mathbf{R}, \mathscr{S}, \mu)$ is a probability space, then $F$ can be chosen to be a distribution function.

Proof. Since $\mathscr{S}$ contains the open sets and is a $\sigma$-field, it must contain $\mathscr{B}$, the Borel sets, and in particular $\mathscr{P}$, the class of half-open intervals. Define $F$ by

$F(x) = \mu(0, x]$ for $x \ge 0$, $F(x) = -\mu(x, 0]$ for $x < 0$.

Then $F : \mathbf{R} \to \mathbf{R}$ is clearly defined and is monotonic increasing for all real $x$ (note that $F(0) = 0$). By theorem 3.2 (i), if $\{x_n\}$ is any monotonic sequence decreasing to $x$, $\lim_{n \to \infty} F(x_n) = F(x)$; since if $x \ge 0$, $\lim (0, x_n] = (0, x]$, and if $x < 0$, $\lim (x_n, 0] = (x, 0]$. Thus $F$ is continuous on the right, and must therefore be a Stieltjes measure function. Now

if $a \ge 0$, $\mu(a, b] = \mu(0, b] - \mu(0, a] = F(b) - F(a)$;

if $a < 0 \le b$, $\mu(a, b] = \mu(a, 0] + \mu(0, b] = F(b) - F(a)$;

if $b < 0$, $\mu(a, b] = \mu(a, 0] - \mu(b, 0] = F(b) - F(a)$;

so that $\mu$ coincides with $\mu_F$ on $\mathscr{P}$. By uniqueness of the extension of a measure to the generated $\sigma$-field and its completion, we have $\mu = \mu_F$ on $\mathscr{L}_F$, and $\mathscr{S} \supset \mathscr{L}_F$.
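The definition of $F$ in the proof so far, and the three cases for $\mu(a, b]$, can be tried out on a simple measure. The sketch below assumes a hypothetical discrete measure made of three point masses; it is not an example from the text, and the names `atoms`, `mu` and `F` are illustrative.

```python
from fractions import Fraction

# Hypothetical discrete measure: point masses at three real numbers.
atoms = {Fraction(-3, 2): Fraction(1, 4),
         Fraction(0): Fraction(1, 4),
         Fraction(2): Fraction(1, 2)}

def mu(a, b):
    """Measure of the half-open interval (a, b]."""
    return sum(m for x, m in atoms.items() if a < x <= b)

def F(x):
    """F(x) = mu(0, x] for x >= 0 and -mu(x, 0] for x < 0, as in the proof."""
    return mu(0, x) if x >= 0 else -mu(x, 0)

# One pair (a, b) for each of the cases a >= 0, a < 0 <= b and b < 0.
cases = [(1, 3), (-2, 1), (-2, -1)]
differences = [(mu(a, b), F(b) - F(a)) for a, b in cases]
```

In each pair of `differences` the two entries agree, which is the statement that $\mu$ and $\mu_F$ coincide on half-open intervals; the jumps of `F` sit at the three atoms.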


If $\mu$ is a probability measure on $\mathscr{S}$, we must have

$\lim_{x \to +\infty} F(x) - \lim_{x \to -\infty} F(x) = \lim_{n \to \infty} \mu(-n, n] = 1$,

so that

$F_1(x) = F(x) - \lim_{y \to -\infty} F(y)$

will be a distribution function generating the same Stieltjes measure as $F$. ∎

Remark. The case where $\mu$ is a probability measure could have been done directly by defining $F_1(x) = \mu(-\infty, x]$. It is clear that this case extends immediately to $\mathbf{R}^k$, since if we put

$F(x_1, x_2, \dots, x_k) = \mu\{(\xi_1, \dots, \xi_k) : \xi_i \le x_i,\ i = 1, \dots, k\}$

it is easy to check that $F$ is a $k$-dimensional distribution function.

Discrete probability

There is a special case of a probability measure in which all the probability is concentrated on a countable set $E_0 \subset \Omega$. This can be defined by specialising example (6) of § 3.1. If $\{x_n\}$ is any sequence in $\Omega$, and $\{p_n\}$ is a sequence of positive real numbers with $\sum_{n=1}^{\infty} p_n = 1$, then it is clear that

$P(E) = \sum_{x_n \in E} p_n$

defines a probability measure on the class of all subsets of $\Omega$. When $\Omega = \mathbf{R}$, this measure can be obtained from the distribution function

$F(x) = \sum_{x_n \le x} p_n$,

so that, in $\mathbf{R}$ (or in $\mathbf{R}^k$ for that matter) a discrete probability measure can be expressed as the Lebesgue-Stieltjes measure of a suitable distribution function.

Exercises 4.5

1. To see that condition (4.5.1) for $k$-dimensional Stieltjes measure functions is not implied by the condition that $F$ be monotonic increasing in each variable separately, consider

$F(x_1, x_2) = \max(0, x_1 + x_2 + 1)$ for $x_1 + x_2 < 0$, $F(x_1, x_2) = 1$ for $x_1 + x_2 \ge 0$.

Does this condition (4.5.1) imply that $F$ is monotonic in each variable?


2. If $F : \mathbf{R} \to \mathbf{R}$ is a Stieltjes measure function, show that

$\mu_F(a, b) = F(b - 0) - F(a)$, $\mu_F[a, b] = F(b) - F(a - 0)$,

and determine $\mu_F$ for intervals of the form $[a, b)$, $(-\infty, a)$, $(a, \infty)$.

3. If $F$ is a Stieltjes measure function in $\mathbf{R}$ which generates the Stieltjes measure $\mu_F$, show that $F(x)$ is continuous if and only if $\mu_F\{x\} = 0$ for all single point sets $\{x\}$. What is the corresponding continuity condition in $\mathbf{R}^k$?

4. Consider Lebesgue measure on $\mathscr{L}^1$-subsets of $[0, 1]$ and let $E_0$ be a subset of $[0, 1]$ which is non-measurable, such that the Lebesgue outer measures of $E_0$ and $([0, 1] - E_0)$ are both 1. Let $\mathscr{Q}$ be the smallest $\sigma$-field of subsets of $[0, 1]$ containing $E_0$ and $\mathscr{L}^1$. Show that $\mathscr{Q}$ consists of sets of the form $E = (A \cap E_0) \cup (B \cap ([0, 1] - E_0))$ for $A, B \in \mathscr{L}^1$, and that $\mu(E) = |A \cap [0, 1]|$ defines a probability measure on the $\sigma$-field $\mathscr{Q}$. By applying theorem 4.8 to this probability measure show that, in general, it is not possible to deduce in theorem 4.8 that $\mathscr{S} = \mathscr{L}_F$.

5. Suppose

$F(x) = 0$ for $x < 0$, $F(x) = 1$ for $x \ge 0$.

Show that

$\mu_F(-1, 0) < F(0) - F(-1)$.

6. Give an example of a right-continuous monotone $F$ such that

$\mu_F(a, b) < F(b) - F(a) < \mu_F[a, b]$.

7. Show that, if $F$, $G$ are distribution functions in $\mathbf{R}^k$, then $aF + bG$ is a distribution function for any $a \ge 0$, $b \ge 0$, $a + b = 1$.

8. In $\mathbf{R}^2$, let

$F(x_1, x_2) = 1$ for $x_1 \ge 0$, $x_2 \ge 0$, $F(x_1, x_2) = 0$ for all other points.

Show that this $F$ is a distribution function describing a unit mass at 0.

9. State and prove an $n$-dimensional form of theorem 4.8.

10. We can obtain completely additive set functions in $\mathbf{R}^1$ which are not necessarily non-negative by the following method. Suppose $F : \mathbf{R} \to \mathbf{R}$ is continuous on the right everywhere and of bounded variation in each finite interval and $F(b) - F(a)$ is bounded below for all $a < b$, and define

$\tau_F(a, b] = F(b) - F(a)$.

Show that $\tau_F$ is additive on $\mathscr{P}$ and can be extended to $\mathscr{E}$. By an extension of theorem 4.2, $\tau_F$ can then be extended to a $\sigma$-additive set function on $\mathscr{B}$. Now apply theorem 3.3 to express $\tau_F$ as the difference of two measures. Finally, the argument of theorem 4.8 shows that $\tau_F$ is the difference of two Stieltjes measures.

5

DEFINITIONS AND PROPERTIES OF THE INTEGRAL

5.1 What is an integral?

Historically the concept of integration was first considered for real functions of a real variable where either the notion of 'the process inverse to differentiation' or the notion of 'area under a curve' was the starting point. In the first case a real number was obtained as the difference of two values of the 'indefinite' integral, while the second case corresponds immediately to the 'definite' integral. The so-called 'fundamental theorem of the integral calculus' provided the link between the two ideas. Our discussion of the operation of integration will start from the notion of a definite integral, though in the first instance the 'interval' over which the function is integrated will be the whole space. Thus, for 'suitable' functions $f : \Omega \to \mathbf{R}^*$ we want to define the integral $\mathscr{I}(f)$ as a real number. The 'suitable' functions will be called integrable and $\mathscr{I}(f)$ will be called the integral of $f$. Before defining such an operator $\mathscr{I}$, we examine the sort of properties $\mathscr{I}$ should have before we would be justified in calling it an 'integral'. Suppose then that $\mathscr{A}$ is a class of functions $f : \Omega \to \mathbf{R}^*$, and $\mathscr{I} : \mathscr{A} \to \mathbf{R}$ defines a real number for every $f \in \mathscr{A}$. Then we want $\mathscr{I}$ to satisfy:

(i) $f \in \mathscr{A}$, $f(x) \ge 0$ for all $x \in \Omega$ $\Rightarrow$ $\mathscr{I}(f) \ge 0$, that is, $\mathscr{I}$ preserves positivity;

(ii) $f, g \in \mathscr{A}$, $\alpha, \beta \in \mathbf{R}$ $\Rightarrow$ $\alpha f + \beta g \in \mathscr{A}$ and $\mathscr{I}(\alpha f + \beta g) = \alpha \mathscr{I}(f) + \beta \mathscr{I}(g)$, that is, $\mathscr{I}$ is linear on $\mathscr{A}$;

(iii) $\mathscr{I}$ is continuous on $\mathscr{A}$ in some sense; at least we would want to have $\mathscr{I}(f_n) \to 0$ as $n \to \infty$ for any sequence $\{f_n\}$ of functions in $\mathscr{A}$ which is monotone decreasing with $f_n(x) \to 0$ for all $x$ in $\Omega$.

These conditions are satisfied by the elementary integration process, but the Riemann integral does not satisfy the following strengthened form of (iii):

(iii)* If $\{f_n\}$ is an increasing sequence of functions in $\mathscr{A}$, and $f_n(x) \to f(x)$ for all $x \in \Omega$, then $f \in \mathscr{A}$ and $\mathscr{I}(f_n) \to \mathscr{I}(f)$ as $n \to \infty$.


This is the most serious limitation of the Riemann integral for, with this definition of integration, it is necessary in (iii)* to postulate $f_n(x) \to f(x)$ uniformly in $x$ before one can conclude that $f \in \mathcal{A}$ and $\mathcal{I}(f_n) \to \mathcal{I}(f)$. Now conditions about the continuity of $\mathcal{I}$ are really essential if the operation is to be a useful tool in analysis; there would not be much of analysis left if one could not carry out at least sequential limiting operations. One of our main objectives, therefore, is to define an operator $\mathcal{I}$ which satisfies (iii)*.

One method of studying integration theory (essentially due to P. J. Daniell) is to start with a restricted class $\mathcal{A}_0$ of functions with a simple structure, define $\mathcal{I}: \mathcal{A}_0 \to R$ to satisfy (i), (ii) and (iii) and then extend $\mathcal{A}_0$ and the functional $\mathcal{I}$ step by step until $\mathcal{I}: \mathcal{A} \to R$ is defined on a sufficiently large class while (i), (ii) and (iii)* are satisfied.

Using this approach one can deduce a measure on a suitable $\sigma$-ring of subsets of $\Omega$ by putting
$$\mu(E) = \mathcal{I}(\chi_E)$$
for those sets $E$ for which $\chi_E \in \mathcal{A}$. Condition (i) then implies that $\mu$ is non-negative, condition (ii) that it is additive and condition (iii) that it is $\sigma$-additive provided the domain of definition is a ring. We will give details of this approach in §9.4, but for the present we will regard the measure as the primary concept and define the integral in terms of a given measure. We will, however, obtain an operator $\mathcal{I}: \mathcal{A} \to R$ which has the above properties and moreover in defining $\mathcal{I}$ we will continually have these desired properties in mind. Thus out of many possible ways of obtaining the integral starting from a measure, we choose the method of definition by limits of monotone sequences of 'simple' functions.

5.2 Simple functions; measurable functions

We now assume given $(\Omega, \mathcal{F}, \mu)$ where $\Omega$ is a space, $\mathcal{F}$ a $\sigma$-field of subsets of $\Omega$ and $\mu$ a measure on $\mathcal{F}$. All the concepts we now define are relative to $(\Omega, \mathcal{F}, \mu)$. It is worth remarking that our definitions can be modified to apply to the case where $\mathcal{F}$ is a $\sigma$-ring rather than a $\sigma$-field, but this results in additional complications in proofs. The additional labour involved does not seem justified for the small gain in generality.

Our object is to define an operation, called integration, having the properties discussed in §5.1 on a suitable class of functions $f: \Omega \to R^*$. Ultimately we want this domain of definition for the integral to be as large as possible. In the present section we obtain the properties of certain classes of functions which will be important later.


Dissection

If $\Omega = \bigcup_{i=1}^{n} E_i$ and the sets $E_i$ are disjoint, then $E_1, E_2, \ldots, E_n$ are said to form a (finite) dissection of $\Omega$. They are said to form an $\mathcal{F}$-dissection if, in addition, $E_i \in \mathcal{F}$ $(i = 1, 2, \ldots, n)$.

Simple function

A function $f: \Omega \to R$ is called $\mathcal{F}$-simple if it can be expressed as
$$f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x),$$
where $E_1, E_2, \ldots, E_n$ form an $\mathcal{F}$-dissection of $\Omega$ and $c_i \in R$ $(i = 1, 2, \ldots, n)$.

Thus an $\mathcal{F}$-simple function is one which takes a constant value $c_i$ on the set $E_i$ where the sets $E_i$ are disjoint sets of $\mathcal{F}$. The additional condition implied by our definition that $\Omega = \bigcup_{i=1}^{n} E_i$ is not important (and is omitted by many authors), since if
$$E_{n+1} = \Omega - \bigcup_{i=1}^{n} E_i \neq \emptyset$$
we can always put $c_{n+1} = 0$ and write
$$f = \sum_{i=1}^{n+1} c_i \chi_{E_i}$$
to see that the function is $\mathcal{F}$-simple. If there is only one $\sigma$-field $\mathcal{F}$ under consideration we will talk of simple functions rather than $\mathcal{F}$-simple functions.

Lemma. The sum, difference and product of two simple functions is a simple function.

Proof. Suppose we have the representations
$$f = \sum_{i=1}^{n} c_i \chi_{E_i}, \quad g = \sum_{j=1}^{m} d_j \chi_{F_j};$$
then the sets $H_{ij} = E_i \cap F_j$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m)$ are in $\mathcal{F}$ and form a dissection of $\Omega$, with $\chi_{H_{ij}} = \chi_{E_i} \chi_{F_j}$. Further $f(x) = c_i$ and $g(x) = d_j$ for $x \in H_{ij}$, so that
$$(f \pm g)(x) = c_i \pm d_j, \quad (fg)(x) = c_i d_j \quad \text{for } x \in H_{ij},$$
and
$$f \pm g = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i \pm d_j) \chi_{H_{ij}}, \quad fg = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i d_j \chi_{H_{ij}}.$$


Note that the constant functions
$$f(x) = c \quad \text{for all } x \in \Omega$$
are simple, so that by this lemma it also follows that $cf$ is simple if $f$ is and the class of simple functions forms a linear space over the reals.
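The refinement argument of the lemma can be tried out on a small finite space. The following Python sketch is only an illustration under assumed data: the four-point space, the two dissections and the helper names `combine` and `value_at` are hypothetical and do not come from the text. A simple function is stored as a list of pairs $(c_i, E_i)$, and the sum and product are represented on the sets $E_i \cap F_j$.

```python
from itertools import product

def combine(f_rep, g_rep, op):
    """Represent op(f, g) on the common refinement E_i & F_j (empty pieces dropped)."""
    return [(op(c, d), E & F)
            for (c, E), (d, F) in product(f_rep, g_rep)
            if E & F]

def value_at(rep, x):
    """Value of the simple function with representation rep at the point x."""
    for c, E in rep:
        if x in E:
            return c
    raise ValueError("x is not covered by the dissection")

# Assumed example on Omega = {0, 1, 2, 3}: f = 1 on {0, 1}, 3 on {2, 3};
# g = 10 on {0, 2}, 20 on {1, 3}.
f_rep = [(1, frozenset({0, 1})), (3, frozenset({2, 3}))]
g_rep = [(10, frozenset({0, 2})), (20, frozenset({1, 3}))]

sum_rep = combine(f_rep, g_rep, lambda c, d: c + d)
prod_rep = combine(f_rep, g_rep, lambda c, d: c * d)

for x in range(4):
    print(x, value_at(sum_rep, x), value_at(prod_rep, x))
```

Each piece of the refinement carries a single constant value, which is all the lemma requires.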

One should regard simple functions as a generalisation of 'step' functions, but it is clear that they form a very restricted class since the image of $\Omega$ under a simple function is a finite subset of $R$.

In defining measurability we will want to consider functions $f: \Omega \to R^*$ with extended real values. It is possible to define a topology in $R^*$ and to define the class of Borel sets in $R^*$ in terms of this topology. However, we adopt the simpler procedure of defining the class $\mathcal{B}^*$ of Borel sets in $R^*$ directly. We say that a set $B \subset R^*$ is a Borel set in $R^*$ if it is the union of a set in $\mathcal{B}^1$ (the class of Borel sets in $R$) with any subset of $R^* - R = \{-\infty, +\infty\}$.

Measurable function

A function $f: \Omega \to R^*$ is said to be $\mathcal{F}$-measurable if and only if $f^{-1}(B) \in \mathcal{F}$ for every $B \in \mathcal{B}^*$. If there is only one $\mathcal{F}$ under discussion we may say that $f$ is a measurable function. From the definition it appears at first sight that one has to work hard to check that a given function is $\mathcal{F}$-measurable. However, in practice it is sufficient to check that $f^{-1}(E) \in \mathcal{F}$ for a suitable class of subsets which generates the $\sigma$-field $\mathcal{B}^*$. The most important such class is given by the next theorem.

Theorem 5.1. In order that $f: \Omega \to R^*$ be $\mathcal{F}$-measurable each of the following conditions is necessary and sufficient:

(i) $\{x: f(x) \leq c\} \in \mathcal{F}$ for all $c \in R$;
(ii) $\{x: f(x) > c\} \in \mathcal{F}$ for all $c \in R$;
(iii) $\{x: f(x) \geq c\} \in \mathcal{F}$ for all $c \in R$;
(iv) $\{x: f(x) < c\} \in \mathcal{F}$ for all $c \in R$.

Proof (i). Since $[-\infty, c] \in \mathcal{B}^*$, it is clear that the given condition is necessary. If we suppose that the condition is satisfied, and put
$$E_c = \{x: f(x) \leq c\} = f^{-1}[-\infty, c],$$
then $E_c \in \mathcal{F}$ for all $c \in R$. But the sets $I_c = [-\infty, c]$, $c \in R$ generate the $\sigma$-field $\mathcal{B}^*$, so that, for each $B \in \mathcal{B}^*$ (see exercise 1.5 (10)), the set $f^{-1}(B)$ is in the $\sigma$-field of subsets of $\Omega$ generated by the sets $E_c$, $c \in R$. Since $\mathcal{F}$ is a $\sigma$-field we have $f^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}^*$.

(ii), (iii) and (iv). A similar proof can be constructed for each case. Alternatively, it is easy to prove directly that each of (i), (ii), (iii), (iv) is equivalent to each of the others.

Corollary. Any $\mathcal{F}$-simple function is measurable.

Proof. If $f = \sum_{i=1}^{n} c_i \chi_{E_i}$, then $E_c = \{x: f(x) \leq c\}$ is the finite union of those sets $E_i$ $(\in \mathcal{F})$ for which $c_i \leq c$, and is therefore in $\mathcal{F}$. By condition (i) of the theorem, this implies that $f$ is measurable.

The next theorem examines further the relationship between simple functions and measurable functions. It is both important and somewhat surprising.

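Theorem 5.2. Any non-negative measurable function $f: \Omega \to R^+$ is the limit of a monotone increasing sequence of non-negative simple functions.

Proof. For each positive integer $s$, let
$$Q_{p,s} = \left\{x: \frac{p-1}{2^s} \leq f(x) < \frac{p}{2^s}\right\} \quad (p = 1, 2, \ldots, 2^{2s});$$
$$Q_{0,s} = \Omega - \bigcup_{p=1}^{2^{2s}} Q_{p,s} = \{x: f(x) \geq 2^s\}.$$
Then, since $f$ is $\mathcal{F}$-measurable, $Q_{p,s} \in \mathcal{F}$ and the sets $Q_{p,s}$ $(p = 0, 1, \ldots, 2^{2s})$ form an $\mathcal{F}$-dissection of $\Omega$. The function
$$f_s(x) = \frac{p-1}{2^s} \quad \text{for } x \in Q_{p,s} \ (p = 1, 2, \ldots, 2^{2s}); \qquad f_s(x) = 2^s \quad \text{for } x \in Q_{0,s}$$
is a simple function and it is immediate that $0 \leq f_s(x) \leq f(x)$ for all $x \in \Omega$. If $x \in Q_{p,s}$ with $p \geq 1$, then $x$ lies in either $Q_{2p-1,s+1}$ or $Q_{2p,s+1}$, so that either
$$f_s(x) = f_{s+1}(x) \quad \text{or} \quad f_s(x) + \frac{1}{2^{s+1}} = f_{s+1}(x).$$
Further, if $x \in Q_{0,s}$, then $f_s(x) = 2^s \leq f(x)$ so that either $x \in Q_{0,s+1}$, or $x \in Q_{p,s+1}$ for some $p \geq 2^{2s+1} + 1$; and in either case $f_{s+1}(x) \geq f_s(x)$. Thus for each integer $s$, $f_{s+1}(x) \geq f_s(x)$ for all $x \in \Omega$; and the sequence $\{f_s\}$ of simple functions is monotone increasing. If $x$ is such that $f(x)$ is finite, then, if $2^s > f(x)$ we have
$$0 \leq f(x) - f_s(x) < 2^{-s}$$
so that in this case $f_s(x) \to f(x)$ as $s \to \infty$. On the other hand, if $f(x) = +\infty$, then $f_s(x) = 2^s$, so that again $f_s(x) \to f(x)$ as $s \to \infty$.

For any function $f: \Omega \to R^*$ we define the positive and negative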

parts $f_+$, $f_-$ of $f$ by
$$f_+(x) = \max[0, f(x)], \quad f_-(x) = -\min[0, f(x)].$$
Then clearly for all $x$,
$$f(x) = f_+(x) - f_-(x), \quad |f(x)| = f_+(x) + f_-(x),$$
and each of the functions $f_+$, $f_-$ is non-negative. It follows immediately from theorem 5.1 that, for any measurable $f$, $f_+$ and $f_-$ are both measurable. An application of theorem 5.2 now shows that any measurable function can be expressed as a limit of simple functions.

Our next step is to show that the class of measurable functions $f: \Omega \to R^*$ is closed both for finite algebraic operations and for countable limiting operations. A minor difficulty arises in that $R^*$ is not an algebraic field so that, for example, $(f+g)(x)$ is not defined at points where $f(x) = +\infty$, $g(x) = -\infty$. In the following theorem therefore, we assume that the functions are such that the algebraic operations are possible.

Theorem 5.3. If $f$ and $g$ are measurable functions $\Omega \to R^*$ and $k \in R$, then each of the functions
$$f + k, \quad kf, \quad f + g, \quad f^2, \quad fg, \quad 1/f \ (\text{where } (1/f)(x) = +\infty \text{ if } f(x) = 0),$$
$$\max(f, g), \quad \min(f, g), \quad f_+, \quad f_-, \quad |f|,$$
which is defined, is measurable.

Proof. The measurability of the first two functions $f + k$, $kf$ follows immediately from any part of theorem 5.1. Consider now the function $(f + g)$. Let $\{r_i\}$ be a sequence containing all the rationals in $R$. Then, since $\{r_i\}$ is dense in $R$, for any $c \in R$,
$$\{x: f(x) + g(x) > c\} = \bigcup_{i=1}^{\infty} \{x: f(x) > r_i\} \cap \{x: g(x) > c - r_i\}.$$
By theorem 5.1 each of the sets on the right-hand side is in $\mathcal{F}$ so that, because $\mathcal{F}$ is a $\sigma$-field, the set on the left is also in $\mathcal{F}$, and $(f + g)$ must be measurable. Now
$$\{x: (f(x))^2 \leq c\} = \emptyset \quad \text{if } c < 0; \qquad \{x: (f(x))^2 \leq c\} = \{x: -\sqrt{c} \leq f(x) \leq \sqrt{c}\} \quad \text{if } c \geq 0,$$
and each of these sets is in $\mathcal{F}$, so that $f^2$ is measurable. Further
$$\{x: (1/f)(x) < c\} = \{x: 1/c < f(x) < 0\} \quad \text{if } c < 0,$$
$$\{x: (1/f)(x) < c\} = \{x: -\infty < f(x) < 0\} \quad \text{if } c = 0,$$
$$\{x: (1/f)(x) < c\} = \{x: f(x) < 0\} \cup \{x: f(x) > 1/c\} \quad \text{if } c > 0,$$
and each of these sets is in $\mathcal{F}$, so that $1/f$ is measurable.

We have already seen that $f_+$ and $f_-$ are measurable, so that $|f| = f_+ + f_-$ is also measurable. The measurability of the remaining functions now follows from the identities:
$$fg = \tfrac{1}{2}[(f+g)^2 - f^2 - g^2],$$
$$\max(f, g) = \tfrac{1}{2}[f + g + |f - g|], \quad \min(f, g) = f + g - \max(f, g).$$
It is clear that the above theorem extends immediately to functions obtained by carrying out a finite number of algebraic operations on any finite collection of measurable functions.
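The dyadic construction in the proof of theorem 5.2 can be examined numerically at a single point. The Python sketch below is only an illustration: the helper name `dyadic_approx` and the sample values are assumptions, not notation from the text. Given the value $f(x) \geq 0$, it returns $f_s(x)$, namely $2^s$ on $Q_{0,s}$ and $(p-1)/2^s$ on $Q_{p,s}$.

```python
import math

def dyadic_approx(value, s):
    """Value f_s(x) of the s-th simple function at a point where f(x) = value >= 0."""
    if value >= 2 ** s:
        # x lies in Q_{0,s}; this branch also covers value == +infinity
        return float(2 ** s)
    # x lies in Q_{p,s} with (p-1)/2^s <= value < p/2^s, so f_s(x) = (p-1)/2^s
    return math.floor(value * 2 ** s) / 2 ** s

# Assumed sample values of f(x), including the extended value +infinity.
for value in [0.0, 0.3, math.pi, 10.0, math.inf]:
    approximations = [dyadic_approx(value, s) for s in range(1, 8)]
    # the sequence f_s(x) never decreases as s grows
    assert all(a <= b for a, b in zip(approximations, approximations[1:]))
    print(value, approximations)
```

For a finite value the error is less than $2^{-s}$ as soon as $2^s$ exceeds the value, while at a point where $f(x) = +\infty$ the approximations $2^s$ grow without bound, matching the two cases of the proof.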

Theorem 5.4. Suppose $\{f_n\}$, $n = 1, 2, \ldots$ is a sequence of measurable functions $\Omega \to R^*$; then

(i) the functions $\sup_n f_n$ and $\inf_n f_n$ are measurable;
(ii) the functions $\limsup_{n\to\infty} f_n$, $\liminf_{n\to\infty} f_n$ are measurable;
(iii) if the sequence $\{f_n\}$ converges, and in particular if it is monotone, $\lim_{n\to\infty} f_n$ is measurable.

Proof. (i)
$$\{x: \sup_n f_n(x) > c\} = \bigcup_{n=1}^{\infty} \{x: f_n(x) > c\}.$$
Since $\mathcal{F}$ is a $\sigma$-field, an application of theorem 5.1 now shows that $\sup_n f_n$ is measurable. The case of $\inf_n f_n$ can be proved similarly or it may be deduced from
$$\inf_n f_n = -\sup_n (-f_n).$$
(ii) Suppose now that $\{f_n\}$ is monotone increasing; then
$$\lim_{n\to\infty} f_n = \sup_n f_n$$
and is therefore measurable. Similarly, if $\{f_n\}$ is decreasing, $\lim_{n\to\infty} f_n = \inf_n f_n$. If
$$g_n = \sup_{r \geq n} f_r, \quad h_n = \inf_{r \geq n} f_r,$$
then $\{g_n\}$, $\{h_n\}$ are monotone sequences, and
$$\limsup_{n\to\infty} f_n = \lim_{n\to\infty} g_n, \quad \liminf_{n\to\infty} f_n = \lim_{n\to\infty} h_n,$$
so that both are measurable.

(iii) If $\{f_n\}$ converges its limit will be measurable because it is the common value of the measurable functions $\limsup f_n$, $\liminf f_n$.

It should be remarked that the class of measurable functions is not closed for non-countable operations of the above type. Thus, if $A$ is non-countable and $f_a: \Omega \to R^*$ is measurable for each $a \in A$, there is no reason for
$$f(x) = \sup_{a \in A} f_a(x)$$
to be measurable. For example, let $A$ be a subset of $[0, 1]$ which is not $\mathcal{L}$-measurable (see §4.4), and put
$$f_a(x) = 1 \ \text{if } x = a; \qquad f_a(x) = 0 \ \text{if } x \neq a.$$
Then for each $a \in A$, $f_a$ is $\mathcal{L}$-measurable (it is actually $\mathcal{L}$-simple) but
$$\chi_A(x) = \sup_{a \in A} f_a(x)$$
is certainly not $\mathcal{L}$-measurable. In practice when one needs to consider non-countable suprema (as in the theory of stochastic processes with continuous time parameter) one tries to replace the index set $A$ by a countable subset giving the same supremum for the family (at least except for a special subset of $\Omega$ of zero measure). If this procedure is impossible for any reason, then there are very serious difficulties in using non-countable suprema.

In the special case where $\Omega$ is a topological space and $\mathcal{B}$ is the $\sigma$-field of Borel sets in $\Omega$, there is a special name for $\mathcal{B}$-measurable functions.

Borel measurability

If $\mathcal{B}$ is the class of Borel sets in $\Omega$ and $f: \Omega \to R^*$ is $\mathcal{B}$-measurable, then we say that $f$ is a Borel measurable function on $\Omega$.

Lemma. Any continuous function $f: \Omega \to R$ on a topological space $\Omega$ is Borel measurable.


Proof. Since, for continuous $f$, the inverse image of an open set in $R^*$ is open in $\Omega$ it follows that $\{x: f(x) < c\}$ is open for all $c \in R$ and is therefore in $\mathcal{B}$.

If $\mathcal{F}$, $\mathcal{G}$ are any two $\sigma$-fields of subsets of $\Omega$ such that $\mathcal{F} \supset \mathcal{G}$, it is immediate that any function $f: \Omega \to R^*$ which is $\mathcal{G}$-measurable is also $\mathcal{F}$-measurable. In particular if $\mathcal{F} \supset \mathcal{B}$, then any continuous function on a topological space $\Omega$ is $\mathcal{F}$-measurable. If $\Omega = R^k$ (Euclidean $k$-space) then we know that the class $\mathcal{L}^k$ of Lebesgue measurable sets, and the class $\mathcal{L}_F$ of sets which are measurable with respect to the Lebesgue-Stieltjes measure defined by $F$, each contain $\mathcal{B}^k$, the Borel sets in $R^k$. Hence, all continuous functions on $R^k$ are Borel measurable and therefore $\mathcal{L}_F$-measurable for any $F$ (in particular they are $\mathcal{L}^k$-measurable, which we call Lebesgue measurable).

Functions which normally occur in real analysis are usually obtainable from continuous functions and simple functions by operations of the following types:
(i) finite algebraic operations;
(ii) countable limiting operations;
(iii) composition.

We have already seen that operations of types (i) and (ii) preserve measurability so that we should consider whether composition operations can be carried out within the class of measurable functions.

Lemma. Suppose $f: R^* \to R^*$ is Borel measurable and $g: \Omega \to R^*$ is $\mathcal{F}$-measurable, then the composite function $f \circ g: \Omega \to R^*$ is $\mathcal{F}$-measurable.

Proof. If $A$ is any Borel set in $R^*$, then since $f$ is Borel measurable, the set $f^{-1}(A)$ is also a Borel set in $R^*$. Now
$$\{x: f(g(x)) \in A\} = \{x: g(x) \in B\} \in \mathcal{F}$$
since $B = f^{-1}(A) \in \mathcal{B}^*$.

Remark. In the above lemma, it is not sufficient to assume that $f: R^* \to R^*$ is Lebesgue measurable; see exercise 5.2 (9). This means that, for most of the functions which normally occur in analysis, it is immediately obvious that they are $\mathcal{L}_F$-measurable for every $F$, and in particular that they are Lebesgue measurable.

Almost everywhere (a.e.)

It is convenient to have a way of describing the behaviour of a function $f: \Omega \to R^*$ outside an (unspecified) set of zero measure. If $P$ is some property describing the behaviour of $f(x)$ at a particular point $x$, then we say that $f(x)$ has property $P$ almost everywhere with respect to $\mu$, if there is some set $E$ with $\mu(E) = 0$ such that $f(x)$ has property $P$ for all $x \in \Omega - E$. We then write
$$f(x) \text{ has property } P \text{ a.e. } (\mu);$$
and, if there is no ambiguity about the measure being considered, $(\mu)$ can be omitted.

Lemma. If $\mathcal{F}$ is complete with respect to $\mu$, and $f = g$ a.e., then $f$ is measurable if and only if $g$ is measurable.

Proof. For any $c \in R$ the set
$$\{x: f(x) < c\} \,\triangle\, \{x: g(x) < c\} \subset \{x: f(x) \neq g(x)\}$$
so that $\{x: f(x) < c\}$ differs from $\{x: g(x) < c\}$ by a subset of a set of zero measure. If $\mathcal{F}$ is complete with respect to $\mu$, all such sets are in $\mathcal{F}$ so that $\{x: f(x) < c\} \in \mathcal{F}$ if and only if $\{x: g(x) < c\} \in \mathcal{F}$.

Exercises 5.2

1. In theorem 5.1, show that the condition $\{x: f(x) \leq c\} \in \mathcal{F}$ for all rational $c$ is sufficient to imply that $f: \Omega \to R^*$ is $\mathcal{F}$-measurable.

2. Suppose $\{f_n\}$ is a sequence of functions $\Omega \to R^*$ each of which is finite a.e. Show that, for almost all $x$ in $\Omega$, $f_n(x)$ is finite for all $n$.

3. Suppose $G$ is an open set in $R$ and $\{f_n\}$ is a convergent sequence of functions $\Omega \to R$. Show that
$$\{x: \lim_{n\to\infty} f_n(x) \in G\} = \bigcup_{m=1}^{\infty} \bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} \{x: d(f_n(x), R - G) > 1/m\},$$
where $d(y, E)$ denotes the distance from $y$ to $E$ (defined in §2.1).

4. Show that, in theorem 5.2, the condition $f \geq 0$ can be deleted provided we do not require monotonicity for the sequence $\{f_n\}$ of simple functions converging to $f$. Show that if $f$ is unbounded above and below it is impossible to arrange for the sequence $\{f_n\}$ to be monotone.

5. An elementary function is one which assumes a countable set of values each on a measurable subset of $\Omega$. Show that, if $f: \Omega \to R$ is measurable then it is the uniform limit of a monotone sequence of elementary functions, but that if $f$ is not bounded it is not the uniform limit of simple functions.

6. If $\mathcal{F}$ is a finite field of subsets of $\Omega$, show that $f: \Omega \to R$ is $\mathcal{F}$-measurable if and only if it is $\mathcal{F}$-simple.

7. If $\Omega$ is a topological space, give examples to show that, for $f: \Omega \to R$, the condition
'$f$ is continuous a.e. in $\Omega$'
neither implies nor is implied by the condition
'there is a continuous $g: \Omega \to R$ for which $f = g$ a.e.'

8. Suppose $\Omega$ is a topological space, $\mathcal{F} \supset \mathcal{B}$ and $(\Omega, \mathcal{F}, \mu)$ is such that $\mathcal{F}$ is complete with respect to $\mu$. Show that any function $f$ which is continuous a.e. is $\mathcal{F}$-measurable. Give an example of a measurable function which cannot be made continuous by altering its values on any set of zero measure.

9. If $\mathcal{L}$ is the class of Lebesgue measurable sets in $R$, give an example of an $\mathcal{L}$-measurable function $f: R \to R$ and an $\mathcal{L}$-measurable set $E \subset R$ for which $f^{-1}(E)$ is not $\mathcal{L}$-measurable. Hint. Use a suitable subset of the Cantor set (see §2.7).

5.3 Definition of the integral

Our method is to define the operation of integration first for non-negative simple functions, and then extend the definition step-by-step showing at each stage that the desirable properties discussed in §5.1 are obtained. If we think of measure as a mass distribution in $\Omega$, and integration as a means of averaging a given function $f$ with respect to this mass distribution it is clear that there is only one reasonable definition for the integral of

(1) A non-negative simple function

If
$$f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x), \qquad (5.3.1)$$
where $c_i \geq 0$ $(i = 1, 2, \ldots, n)$ we define
$$\int f\,d\mu = \sum_{i=1}^{n} c_i \mu(E_i).$$
This sum is always defined since each of the terms is non-negative. It is called the integral of $f$ with respect to $\mu$. (Note that if $c_i = 0$, $\mu(E_i) = +\infty$ our convention is that $c_i \mu(E_i) = 0$.) Since the representation of a simple function in the form (5.3.1) is not unique we must first see that our definition of the integral does not depend on the particular representation used. Suppose
$$f = \sum_{i=1}^{n} c_i \chi_{E_i} = \sum_{j=1}^{m} d_j \chi_{F_j},$$
then since both systems of sets are dissections of $\Omega$
$$\mu(E_i) = \sum_{j=1}^{m} \mu(E_i \cap F_j) \quad \text{and} \quad \mu(F_j) = \sum_{i=1}^{n} \mu(E_i \cap F_j). \qquad (5.3.2)$$
Also if $E_i \cap F_j$ is not empty, it will contain a point $x$ and $f(x) = c_i = d_j$. Thus
$$\sum_{i=1}^{n} c_i \mu(E_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \mu(E_i \cap F_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \mu(E_i \cap F_j) = \sum_{j=1}^{m} d_j \mu(F_j).$$
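The independence of the representation can be checked by hand on a finite space, where every measure is a sum of point masses. The Python sketch below is a toy illustration only: the six-point space, the masses and the names `mu` and `integral_simple` are assumptions introduced for the example. It evaluates $\sum c_i \mu(E_i)$ for two different dissections representing the same function.

```python
from fractions import Fraction

# Assumed finite measure space: point masses on Omega = {0, 1, 2, 3, 4, 5}.
mass = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6),
        3: Fraction(2), 4: Fraction(0), 5: Fraction(1)}

def mu(E):
    """Measure of a subset E of Omega: the sum of its point masses."""
    return sum((mass[x] for x in E), Fraction(0))

def integral_simple(representation):
    """Sum of c_i * mu(E_i) for a representation given as (c_i, E_i) pairs."""
    return sum((c * mu(E) for c, E in representation), Fraction(0))

# Two representations of the same function: f = 2 on {0, 1, 2} and f = 5 on {3, 4, 5}.
coarse = [(Fraction(2), {0, 1, 2}), (Fraction(5), {3, 4, 5})]
fine = [(Fraction(2), {0}), (Fraction(2), {1, 2}),
        (Fraction(5), {3}), (Fraction(5), {4, 5})]

print(integral_simple(coarse), integral_simple(fine))
```

Exact rational arithmetic is used so that the two sums can be compared without rounding error.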

Now consider two non-negative simple functions
$$f = \sum_{i=1}^{n} c_i \chi_{E_i}, \quad g = \sum_{j=1}^{m} d_j \chi_{F_j}$$
and use the representations
$$f = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \chi_{E_i \cap F_j}, \quad g = \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \chi_{E_i \cap F_j}$$
in terms of the dissection $E_i \cap F_j$. Then the simple function $(f + g)$ has the representation
$$f + g = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i + d_j) \chi_{E_i \cap F_j},$$
and
$$\int (f + g)\,d\mu = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i + d_j) \mu(E_i \cap F_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \mu(E_i \cap F_j) + \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \mu(E_i \cap F_j)$$
$$= \sum_{i=1}^{n} c_i \mu(E_i) + \sum_{j=1}^{m} d_j \mu(F_j), \quad \text{using (5.3.2)},$$
$$= \int f\,d\mu + \int g\,d\mu.$$
It is now immediate from the definition that if $\alpha \geq 0$, $\beta \geq 0$ and $f$, $g$ are non-negative simple functions then
$$\int (\alpha f + \beta g)\,d\mu = \alpha \int f\,d\mu + \beta \int g\,d\mu$$
so that our operator is linear on the class of non-negative simple functions. It is also clear that it is order preserving; that is, if $f$, $g$ are simple functions and $f \geq g$ then $\int f\,d\mu \geq \int g\,d\mu$. These properties allow us to extend our definition to:

(2) Non-negative measurable functions

Given a measurable $f: \Omega \to R^+$, by theorem 5.2 there is a monotone increasing sequence $f_n$ of simple functions such that $f_n \to f$. Since $\int f_n\,d\mu$ is defined for all $n$, and is monotone increasing it has a limit in $R^+$ (which may be $+\infty$). We define
$$\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu. \qquad (5.3.3)$$

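Since there are many possible monotone sequences of simple functions which converge to a given non-negative measurable $f$, we must show that the integral $\int f\,d\mu$ defined in this way is independent of the particular sequence used. Suppose $\{f_n\}$ is an increasing sequence of non-negative simple functions and $f = \lim_{n\to\infty} f_n \geq g$, where $g$ is non-negative simple. The first (and main) step in showing that our definition (5.3.3) is proper is to show that, in these circumstances
$$\lim_{n\to\infty} \int f_n\,d\mu \geq \int g\,d\mu. \qquad (5.3.4)$$
Put
$$g = \sum_{i=1}^{k} c_i \chi_{E_i};$$
then if $\int g\,d\mu = +\infty$, there must be an integer $i$, $1 \leq i \leq k$, such that $c_i > 0$, $\mu(E_i) = +\infty$. Then for any fixed $\epsilon$ such that $0 < \epsilon < c_i$, the sequence of sets $\{A_n \cap E_i\}$ $(n = 1, 2, \ldots)$ is monotone increasing to $E_i$ where
$$A_n = \{x: f_n(x) + \epsilon > g(x)\}. \qquad (5.3.5)$$
Hence $\mu(A_n \cap E_i) \to +\infty$ as $n \to \infty$, by theorem 3.2. But
$$\int f_n\,d\mu \geq \int f_n \chi_{A_n \cap E_i}\,d\mu \geq (c_i - \epsilon)\,\mu(A_n \cap E_i) \to +\infty \quad \text{as } n \to \infty.$$
Thus (5.3.4) is established, if $\int g\,d\mu = +\infty$. Now assume that $\int g\,d\mu$ is finite and put
$$A = \{x: g(x) > 0\} = \bigcup_{c_i > 0} E_i.$$
Since $g$ is simple, $c = \min_{c_i > 0} c_i$ is positive, and $\mu(A) < \infty$. We now suppose $\epsilon > 0$ and again define $A_n$ by (5.3.5). Then
$$\int f_n\,d\mu \geq \int f_n \chi_{A_n \cap A}\,d\mu \geq \int (g - \epsilon) \chi_{A_n \cap A}\,d\mu = \int g \chi_{A_n \cap A}\,d\mu - \epsilon\,\mu(A_n \cap A) \geq \int g \chi_{A_n \cap A}\,d\mu - \epsilon\,\mu(A).$$
Since $\mu(A_n \cap E_i) \to \mu(E_i)$ for each $i$, we can evaluate the integrals as finite sums and find an integer $n_0 = n_0(\epsilon)$ such that
$$\int f_n\,d\mu \geq \int g \chi_A\,d\mu - \epsilon - \epsilon\,\mu(A) \quad \text{for } n \geq n_0,$$
so that we have established (5.3.4) also in the case $\int g\,d\mu < \infty$.

We can now suppose given two sequences of simple functions
$$0 \leq f_1 \leq f_2 \leq \ldots, \qquad 0 \leq g_1 \leq g_2 \leq \ldots,$$
each converging to $f$. For each fixed $m$,
$$f = \lim_{n\to\infty} f_n \geq g_m,$$
so that, by (5.3.4),
$$\lim_{n\to\infty} \int f_n\,d\mu \geq \int g_m\,d\mu.$$
If we now let $m \to \infty$,
$$\lim_{n\to\infty} \int f_n\,d\mu \geq \lim_{m\to\infty} \int g_m\,d\mu.$$
Since the situation is symmetrical, the opposite inequality is similarly proved and we must have
$$\lim_{n\to\infty} \int f_n\,d\mu = \lim_{m\to\infty} \int g_m\,d\mu.$$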

Thus the operation of integration is properly defined for non-negative measurable functions. Because of the corresponding result for non-negative simple functions, it now follows that, if $f$, $g$ are two non-negative measurable functions and $\alpha \geq 0$, $\beta \geq 0$ then
$$\int (\alpha f + \beta g)\,d\mu = \alpha \int f\,d\mu + \beta \int g\,d\mu.$$
By our definition, for $f \geq 0$ and measurable, $\int f\,d\mu$ may be finite or $+\infty$. A non-negative measurable function $f$ is said to be integrable with respect to the measure $\mu$ if $\int f\,d\mu$ is finite.

There are clearly two possible reasons for such an $f$ to fail to be integrable. Either there is a simple function $g \leq f$ for which $\int g\,d\mu = \infty$, which would imply the existence of a $c > 0$ for which $\mu\{x: f(x) > c\} = +\infty$, or alternatively it is possible that $\int g\,d\mu$ is finite for all simple functions $g \leq f$ (which implies $\mu\{x: f(x) > c\} < \infty$, all $c > 0$) but, for any sequence $g_n$ of simple functions converging to $f$, $\int g_n\,d\mu \to +\infty$ as $n \to \infty$.

We can now define the integral for:

(3) Integrable measurable functions

We know that if $f: \Omega \to R^*$ is measurable, then so are $f_+$, $f_-$, and $f = f_+ - f_-$. If both $f_+$ and $f_-$ are integrable, then we say that $f$ is integrable and define
$$\int f\,d\mu = \int f_+\,d\mu - \int f_-\,d\mu.$$
Thus our operation of integration is now well-defined on the class $\mathcal{A}$ of integrable functions. We will show in the next section that all the desirable properties discussed in §5.1 are satisfied by this operation. Finally, we define:

(4) Integral of a function $f$ over a set $A$

This can be considered only for sets $A$ in $\mathcal{F}$. Put
$$\int_A f\,d\mu = \int f \chi_A\,d\mu$$
provided $\int f \chi_A\,d\mu$ is defined. Thus $\int_A f\,d\mu$ will be defined if either (i) $f \chi_A$ is non-negative and measurable, or (ii) $f \chi_A$ is measurable and integrable. We say that $f$ is integrable over $A$ (with respect to $\mu$) if the function $f \chi_A$ is integrable. It is clear that $\int_\Omega f\,d\mu = \int f\,d\mu$ and we will usually continue to omit the set $\Omega$ when we are integrating over the whole space.
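Steps (3) and (4) can be followed on a finite space, where the integral of a non-negative function is a finite sum. The Python sketch below is a hedged illustration: the four-point space, its masses, the function values and the helper names `integral_nonneg` and `integral` are all assumptions made for the example. The integral over a set $A$ is computed as $\int f_+ \chi_A\,d\mu - \int f_- \chi_A\,d\mu$.

```python
from fractions import Fraction

# Assumed finite measure space (point masses) and an assumed signed function f.
mass = {"a": Fraction(1, 2), "b": Fraction(3, 2), "c": Fraction(2), "d": Fraction(1)}
f = {"a": Fraction(4), "b": Fraction(-2), "c": Fraction(1), "d": Fraction(-3)}

def integral_nonneg(h, A):
    """Integral over A of a non-negative function h: a finite sum of h(x) * mass(x)."""
    return sum((h[x] * mass[x] for x in A), Fraction(0))

def integral(h, A):
    """Integral over A, defined as the integral of h_+ minus the integral of h_-."""
    h_plus = {x: max(Fraction(0), v) for x, v in h.items()}
    h_minus = {x: -min(Fraction(0), v) for x, v in h.items()}
    return integral_nonneg(h_plus, A) - integral_nonneg(h_minus, A)

whole = integral(f, set(mass))          # integral over the whole space
left = integral(f, {"a", "b"})          # integral over A = {a, b}
right = integral(f, {"c", "d"})         # integral over the complement of A
print(whole, left, right)
```

In this example the integrals over the two disjoint pieces add up to the integral over the whole space, as one would expect of any reasonable definition.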

Note that, if $E \in \mathcal{F}$ and $\mu(E) = 0$, then any function $f: \Omega \to R^*$ is integrable over $E$ with
$$\int_E f\,d\mu = 0.$$

Exercises 5.3

1. Show that a simple function
$$f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x)$$
is integrable if and only if $c_i = 0$ for each integer $i$ such that $\mu(E_i) = +\infty$.

2. Let $\Omega$ be a finite set, $\mu(E)$ the number of points in $E$. Show that all functions on $\Omega$ are simple functions and that the theory of integration reduces to the theory of finite sums.

3. If $f: \Omega \to R^*$ is integrable (a) show that, for any $\epsilon > 0$,
$$\mu\{x: |f(x)| \geq \epsilon\} < \infty.$$

4. Suppose $\mu_1$ and $\mu_2$ are two measures defined on $\mathcal{F}$ and $\nu = \mu_1 + \mu_2$. Show that if $f$ is integrable with respect to $\mu_1$ and $\mu_2$ over a set $E$, then it is integrable with respect to $\nu$ and
$$\int_E f\,d\nu = \int_E f\,d\mu_1 + \int_E f\,d\mu_2.$$

5. Suppose $f: \Omega \to R^+$ is a non-negative measurable function. Show that
$$\int_E f\,d\mu = \sup \left[ \sum_{k=1}^{n} \mu(E_k) \inf\{f(x): x \in E_k\} \right],$$
where the supremum is taken over the collection of all finite classes of disjoint measurable sets with
$$E = \bigcup_{k=1}^{n} E_k.$$
(This is a possible way of defining $\int_E f\,d\mu$ which leads to the same class of integrable functions.)

6. Suppose $\mu(E) < \infty$ and $f: E \to R$ is a measurable finite-valued function defined on $E$. Put
$$S_n(E) = \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, \mu\left\{x: x \in E,\ \frac{k}{2^n} \leq f(x) < \frac{k+1}{2^n}\right\}.$$
Show that this series is absolutely convergent for all $n$ if it is absolutely convergent for any one $n \in Z$. Show that $f$ is integrable on $E$ if and only if the series converges absolutely for all $n$ and in this case
$$\int_E f\,d\mu = \lim_{n\to\infty} S_n(E).$$
Show that this is not valid if $\mu(E) = +\infty$. (This is another possible way of defining $\int_E f\,d\mu$.)

5.4 Properties of the integral

We have now defined the operation of integration with respect to a measure $\mu$ on a class of integrable functions. The first objective must be to show that our operation has the properties outlined in §5.1. These are of two types: those involving only a finite number of functions, and operations involving a countable class of functions. We will obtain various closure properties of the class $\mathcal{A}$ while we are examining the integration operation.

Theorem 5.5. Suppose $(\Omega, \mathcal{F}, \mu)$ is a measure space, $A$, $B$ are disjoint sets in $\mathcal{F}$ and $f: \Omega \to R^*$, $g: \Omega \to R^*$ are two functions integrable (over $\Omega$) with respect to $\mu$. Then $f$ is integrable over $A$, $f + g$ and $|f|$ are integrable (over $\Omega$) and

(i) $\int_{A \cup B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu$;
(ii) $f$ is finite a.e.;
(iii) $\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$;
(iv) $\left|\int f\,d\mu\right| \leq \int |f|\,d\mu$;
(v) for any $c \in R$, $cf$ is integrable and $\int cf\,d\mu = c \int f\,d\mu$;
(vi) $f \geq 0 \Rightarrow \int f\,d\mu \geq 0$; $f \geq g \Rightarrow \int f\,d\mu \geq \int g\,d\mu$;
(vii) if $f \geq 0$ and $\int f\,d\mu = 0$, then $f = 0$ a.e.;
(viii) $f = g$ a.e. $\Rightarrow \int f\,d\mu = \int g\,d\mu$;
(ix) if $h: \Omega \to R^*$ is $\mathcal{F}$-measurable and $|h| \leq f$, then $h$ is integrable.

Corollary. Any function $f: \Omega \to R^*$ which is bounded, $\mathcal{F}$-measurable, and zero outside a set $E$ in $\mathcal{F}$ of finite $\mu$-measure is integrable (over $\Omega$) with respect to $\mu$.

Proof. If $f: \Omega \to R^+$ is non-negative measurable and integrable (over $\Omega$) and $0 \leq g \leq f$ with $g: \Omega \to R^+$ measurable, it follows immediately from the definition of the integral of non-negative measurable functions that $g$ is integrable. Since for any $A \in \mathcal{F}$, $\chi_A$ is measurable,
$$0 \leq f_+ \chi_A \leq f_+ \quad \text{and} \quad 0 \leq f_- \chi_A \leq f_-,$$
and a function $f$ which is integrable over $\Omega$ is also integrable over any measurable set $A$.

(i) If $A$, $B$ are disjoint,
$$\chi_{A \cup B} = \chi_A + \chi_B,$$
so that
$$f_+ \chi_{A \cup B} = f_+ \chi_A + f_+ \chi_B, \quad f_- \chi_{A \cup B} = f_- \chi_A + f_- \chi_B;$$
and since the additivity property (iii) is already known for non-negative measurable functions we must have
$$\int_{A \cup B} f\,d\mu = \int f_+ \chi_{A \cup B}\,d\mu - \int f_- \chi_{A \cup B}\,d\mu = \int f_+ \chi_A\,d\mu - \int f_- \chi_A\,d\mu + \int f_+ \chi_B\,d\mu - \int f_- \chi_B\,d\mu;$$
and
$$\int_{A \cup B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu, \qquad (5.4.1)$$
since all the terms are finite.

(ii) If $f$ is not finite a.e., then at least one of the sets
$$A_1 = \{x: f(x) = +\infty\}, \quad A_2 = \{x: f(x) = -\infty\}$$
has positive measure. Suppose $\mu(A_1) > 0$. Then it follows immediately from the definition that $\int f_+\,d\mu = +\infty$ which means that $f$ is not integrable (over $\Omega$).

(iii) This has already been proved for non-negative functions $f$, $g$. If $f_1$, $f_2$ are non-negative and $f = f_1 - f_2$, then $f_1 + f_- = f_2 + f_+$ and applying (iii) for non-negative functions gives
$$\int f_1\,d\mu + \int f_-\,d\mu = \int f_2\,d\mu + \int f_+\,d\mu$$
so that
$$\int f\,d\mu = \int f_1\,d\mu - \int f_2\,d\mu.$$
Now the general result follows since, for finite $f$, $g$,
$$f + g = (f_+ + g_+) - (f_- + g_-),$$
so that
$$\int (f + g)\,d\mu = \int (f_+ + g_+)\,d\mu - \int (f_- + g_-)\,d\mu = \int f_+\,d\mu - \int f_-\,d\mu + \int g_+\,d\mu - \int g_-\,d\mu = \int f\,d\mu + \int g\,d\mu.$$
Finally apply (iii) to the function $|f| = f_+ + f_-$ to deduce that $|f|$ is integrable and
$$\int |f|\,d\mu = \int f_+\,d\mu + \int f_-\,d\mu.$$

(iv) This now follows immediately as
$$\left|\int f\,d\mu\right| = \left|\int f_+\,d\mu - \int f_-\,d\mu\right| \leq \int f_+\,d\mu + \int f_-\,d\mu.$$

(v) If $c = 0$, $cf = 0$ and $\int cf\,d\mu = 0 = c \int f\,d\mu$. If $c > 0$ then
$$(cf)_+ = c f_+, \quad (cf)_- = c f_-$$
and the result follows since it has already been proved for non-negative functions (§5.3). Similarly, if $c < 0$,
$$(cf)_+ = -c f_-, \quad (cf)_- = -c f_+,$$
$$\int cf\,d\mu = \int (cf)_+\,d\mu - \int (cf)_-\,d\mu = (-c) \int f_-\,d\mu + c \int f_+\,d\mu = c \int f\,d\mu.$$

(vi) The first statement follows from the definition. If $f \geq g$, then $f = g + (f - g)$, and $(f - g) \geq 0$. By (iii), we now have
$$\int f\,d\mu = \int g\,d\mu + \int (f - g)\,d\mu \geq \int g\,d\mu.$$

(vii) If $\{x: f(x) > 0\}$ has positive measure, then by theorem 3.2 there is an integer $n$ such that, if $A = \{x: f(x) > 1/n\}$, $\mu(A) > 0$. But $n^{-1} \chi_A \leq f \chi_A \leq f$, so that
$$\int f\,d\mu \geq \frac{1}{n} \int \chi_A\,d\mu = \frac{1}{n}\,\mu(A) > 0.$$
Hence, if $f \geq 0$ and $\int f\,d\mu = 0$, we must have $\mu\{x: f(x) > 0\} = 0$.

(viii) If $f = g$ a.e., then $f_+ = g_+$, $f_- = g_-$ a.e. In the construction of theorem 5.2, the sets $Q_{p,s}$ for the two functions $f_+$ and $g_+$ will all have the same measure. Hence, there are simple functions $f_n \to f_+$, $g_n \to g_+$ such that
$$\int f_n\,d\mu = \int g_n\,d\mu \quad (n = 1, 2, \ldots),$$
and it follows that $\int f_+\,d\mu = \int g_+\,d\mu$. Similarly $\int f_-\,d\mu = \int g_-\,d\mu$.

(ix) If $|h| \leq f$ then $0 \leq h_+ \leq f$, $0 \leq h_- \leq f$. From (vi) it now follows that each of $\int h_+\,d\mu$, $\int h_-\,d\mu$ is finite, and $h$ is therefore integrable.

Proof of corollary. If $|f| \leq K$, then the simple function $K \chi_E$ is integrable and the integrability of $f$ now follows from (ix).

Remark. If $\mathcal{F}$ is complete with respect to $\mu$, then (viii) can clearly be strengthened as follows: If $f: \Omega \to R^*$ is integrable, and $g: \Omega \to R^*$ is such that $f = g$ a.e., then $g$ is integrable and $\int f\,d\mu = \int g\,d\mu$. There is also a converse to this remark: if $f$ and $g$ are integrable functions such that
$$\int_E f\,d\mu = \int_E g\,d\mu \quad \text{for all } E \in \mathcal{F},$$
then $f = g$ a.e. For, suppose not, so that $\mu\{x: f(x) \neq g(x)\} > 0$. Then at least one of $\{x: f(x) > g(x)\}$, $\{x: f(x) < g(x)\}$ has positive measure; suppose it is the first. By theorem 3.2 there must be an integer $n$ such that, for
$$E_n = \left\{x: f(x) > g(x) + \frac{1}{n}\right\}, \quad \mu(E_n) > 0.$$
But then
$$\int_{E_n} f\,d\mu - \int_{E_n} g\,d\mu \geq \frac{1}{n}\,\mu(E_n) > 0,$$

which is a contradiction establishing the required result. We can now consider theorems about the continuity of the integration operator. Theorem 5.6. Suppose {fn} is a monotone increasing sequence of nonnegative measurable functions: t -> R+ and fn(x) -*f(x) for all xE S2: then

lim jfd dc = jfdiz,

n->ao

in the sense that, if f is integrable, the integrals f fndµ converge to f fdp; while if f is not integrable either fn is integrable for all n and f fndµ - . +oo as n -* oo, or there is an integer N such that fN is not integrable so that f fndµ = +oo for n > N. Proof. For each n = 1, 2,... choose an increasing sequence {fn.k}(k = 1,2,...) of non-negative simple functions converging to fn, and put 9k = maxUn.kln
Then {gk} is a non-decreasing sequence of non-negative simple func-

tions and

g = lim 9k k --)- ao

is non-negative measurable. But

fn,k<9k
for

n
(5.4.2)

fn < g < f; and, if we now let n - oo, we see that f = g. Using the order property (vi) of theorem 5.5 and (5.4.2) gives

so that

ffn.kd1u <

f gkdp < ffu for n < k.

For fixed n, let k -* co; from the definition of the integral,

ffndµ < f gd1u < lim ffkdP. k-+ OD

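If we now let $n \to \infty$, we obtain

$$\lim_{n\to\infty} \int f_n\,d\mu \le \int g\,d\mu \le \lim_{k\to\infty} \int f_k\,d\mu.$$

Since the two extremes of the inequality are the same, we must have

$$\lim_{n\to\infty} \int f_n\,d\mu = \int g\,d\mu = \int f\,d\mu.$$

∎

Corollary (absolute continuity). If $f$ is integrable (over $\Omega$) then, for $A \in \mathscr{F}$,

$$\int_A f\,d\mu \to 0 \quad \text{as } \mu(A) \to 0.$$

Proof. Put

$$f_n = \begin{cases} f & \text{if } |f| \le n, \\ n & \text{if } |f| > n. \end{cases}$$

Then $|f_n|$ is monotonic increasing to $|f|$ as $n \to \infty$. Since, by theorem 5.5, $|f|$ is integrable, we have

$$\int |f_n|\,d\mu \to \int |f|\,d\mu \quad \text{as } n \to \infty.$$

Given $\epsilon > 0$, choose $N$ such that

$$\int |f|\,d\mu < \int |f_n|\,d\mu + \tfrac{1}{2}\epsilon \quad \text{for } n \ge N.$$

Then if $A \in \mathscr{F}$ is such that $\mu(A) < \epsilon/2N$, we have, by theorem 5.5,

$$\left| \int_A f\,d\mu \right| \le \int_A |f|\,d\mu = \int_A |f_N|\,d\mu + \int_A (|f| - |f_N|)\,d\mu < \epsilon.$$

∎

Remark. The notion of absolute continuity for a set function $\nu\colon \mathscr{F} \to R$ will be considered more fully in § 6.4.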
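The monotone convergence in theorem 5.6 can be watched numerically. The following is only an illustrative sketch, not part of the text: it assumes Lebesgue measure on $(0, 1)$, the particular choice $f(x) = x^{-1/2}$ with truncations $f_n = \min(f, n)$, and a midpoint rule (the helper names are hypothetical) in place of the integral. For this choice an elementary calculation gives $\int f_n\,d\mu = 2 - 1/n$, which increases to $\int f\,d\mu = 2$.

```python
import math

def truncated(x, n):
    """f_n(x) = min(x**-0.5, n): bounded, and increasing in n towards x**-0.5."""
    return min(1.0 / math.sqrt(x), float(n))

def midpoint_integral(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation to the integral of func over (lo, hi)."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

def integral_of_truncation(n):
    """Approximate integral of f_n over (0, 1)."""
    return midpoint_integral(lambda x: truncated(x, n), 0.0, 1.0)

levels = [1, 2, 5, 10, 50]
integrals = [integral_of_truncation(n) for n in levels]
for n, value in zip(levels, integrals):
    # Second column: numerical value; third column: the closed form 2 - 1/n.
    print(n, round(value, 4), 2 - 1 / n)
```

The printed values should increase with $n$ and stay below 2, in line with the theorem; the agreement with $2 - 1/n$ is limited only by the accuracy of the midpoint rule.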

Theorem 5.7 (Fatou). If $\{f_n\}$ is a sequence of measurable functions which is bounded below by an integrable function, then

$$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$$

Remark. The operation lim inf picks out the small values of a sequence. This theorem says that if this is done pointwise and the result is then integrated, the answer will be not greater than the result of first integrating and then applying the operation.


Proof. Since $\{f_n\}$ is bounded below by an integrable function $g$ we may assume without loss of generality that $f_n \ge 0$ for all $n$. For

$$h_n = f_n - g \ge 0 \ \text{a.e.†} \quad \text{and} \quad \int h_n\,d\mu = \int f_n\,d\mu - \int g\,d\mu,$$

$$\liminf h_n = \liminf f_n - g \ \text{a.e.}$$

Put $g_n = \inf_{k \ge n} f_k$; then $\{g_n\}$ is an increasing sequence of measurable functions and

$$\lim_{n\to\infty} g_n = \liminf_{n\to\infty} f_n.$$

Since $f_n \ge g_n$ for all $n$,

$$\liminf_{n\to\infty} \int f_n\,d\mu \ge \lim_{n\to\infty} \int g_n\,d\mu = \int \lim_{n\to\infty} g_n\,d\mu = \int \liminf_{n\to\infty} f_n\,d\mu,$$

by theorem 5.6. ∎
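The inequality in theorem 5.7 can be strict. As an illustration (our own example, assuming Lebesgue measure $\mu$ on $R$), take $f_n$ to be $n$ times the indicator function of $(0, 1/n)$; each $f_n$ is non-negative, so the hypothesis of the theorem holds with $g = 0$.

```latex
\[
  f_n = n\,\chi_{(0,1/n)} \ge 0, \qquad
  \liminf_{n\to\infty} f_n(x) = 0 \quad \text{for every } x \in R,
  \qquad \int f_n \, d\mu = n \cdot \frac{1}{n} = 1,
\]
\[
  \int \liminf_{n\to\infty} f_n \, d\mu = 0
  \;<\; 1 = \liminf_{n\to\infty} \int f_n \, d\mu .
\]
```

The pointwise limit is zero because, for fixed $x \in (0, 1)$, $f_n(x) = 0$ as soon as $n \ge 1/x$, while $f_n(x) = 0$ for all $n$ when $x$ lies outside $(0, 1)$.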

Corollary. If $\{f_n\}$ is a sequence of measurable functions which is bounded above by an integrable function, then

$$\int \limsup_{n\to\infty} f_n\,d\mu \ge \limsup_{n\to\infty} \int f_n\,d\mu.$$

Proof. This can be proved directly by a method similar to that of theorem 5.7, or it can be deduced from that theorem by putting $g_n = -f_n$ $(n = 1, 2, \dots)$. ∎

Theorem 5.8 (Lebesgue). (i) If $g\colon \Omega \to R^+$ is integrable, $\{f_n\}$ is a sequence of measurable functions $\Omega \to R^*$ such that $|f_n| \le g$ $(n = 1, 2, \dots)$ and $f_n \to f$ as $n \to \infty$, then $f$ is integrable and

$$\int f_n\,d\mu \to \int f\,d\mu \quad \text{as } n \to \infty.$$

(ii) Suppose $g\colon \Omega \to R^+$ is integrable, $-\infty \le a < b \le +\infty$, and for each $t \in (a, b)$, $f_t$ is a measurable function $\Omega \to R^*$ such that $|f_t| \le g$. If $f_t \to f$ as $t \to a+$ or $t \to b-$, then $f$ is integrable and

$$\int f_t\,d\mu \to \int f\,d\mu.$$

† Since $g$ is integrable the set $\{x : |g(x)| = +\infty\}$ has zero measure, so that the operation $f_n(x) - g(x)$ can be carried out at least outside the set $\{x : |g(x)| = +\infty\}$. We put in a.e. to cover the possible exceptional set of zero measure where $(f_n - g)$ is not defined. By theorem 5.5 (viii) such exceptional sets do not affect the value of the integrals, and we could arbitrarily define $h_n$ to be zero at the points where $f_n = g = \pm\infty$.

Proof. (i) We first prove the special case of the theorem where $f_n \ge 0$ and $f_n \to 0$ as $n \to \infty$. In this case we can apply theorem 5.7 and corollary to give

$$0 = \int 0\,d\mu = \int \limsup f_n\,d\mu \ge \limsup \int f_n\,d\mu \ge \liminf \int f_n\,d\mu \ge \int \liminf f_n\,d\mu = 0.$$

Hence all the inequalities must be equalities, $\lim \int f_n\,d\mu$ exists, and has the value zero.

In the general case, put $h_n = |f_n - f|$; then $0 \le h_n \le 2g$, $2g$ is integrable and $h_n$ is measurable with $h_n \to 0$ as $n \to \infty$. But then

$$\left| \int f_n\,d\mu - \int f\,d\mu \right| \le \int |f_n - f|\,d\mu \to 0 \quad \text{as } n \to \infty,$$

and $f$ is integrable by theorem 5.5 (ix).

(ii) Suppose, for example, that $f_t \to f$ as $t \to a+$; then we can apply the sequence form of the theorem to $f_n = f_{t_n}$, where $\{t_n\}$ is any sequence in $(a, b)$ converging to $a$. Since $f = \lim f_n$ we must have

$$\int f_n\,d\mu \to \int f\,d\mu.$$

But the right-hand side is now independent of the particular sequence $\{t_n\}$ chosen so that $\int f_t\,d\mu$ must approach the limit $\int f\,d\mu$ as $t \to a$ through values in $(a, b)$. ∎

Exercises 5.4

1. Suppose $f\colon \Omega \to R$ is measurable, $A \in \mathscr{F}$, $\mu(A) < \infty$ and

$$f(x) = 0 \ \text{for } x \in \Omega - A, \qquad m \le f(x) \le M \ \text{for } x \in A,$$

where $m, M \in R$. Show that $f$ is integrable and

$$m\mu(A) \le \int f\,d\mu \le M\mu(A).$$

2. Prove that, if $f$ and $g$ are integrable functions,

$$\min\left[ \int f\,d\mu, \int g\,d\mu \right] \ge \int \min(f, g)\,d\mu.$$

If the two sides of this inequality are equal, what deduction can be made about the relation between $f$ and $g$?

3. Prove that, for any $\epsilon > 0$, if $f$ is integrable over $E$ there is a subset $E_0 \subset E$ such that $\mu(E_0) < \infty$, and

$$\left| \int_E f\,d\mu - \int_{E_0} f\,d\mu \right| < \epsilon.$$


4. Show that $f\colon \Omega \to R^*$ is integrable if and only if for any $\epsilon > 0$, there exist integrable functions $g$ and $h$ with $g \ge f \ge h$ and $\int (g - h)\,d\mu < \epsilon$.

5. If $E = \bigcup_{r=1}^{\infty} E_r$ is a countable union of disjoint sets of $\mathscr{F}$, and $f$ is integrable over $E$, then

$$\int_E f\,d\mu = \sum_{r=1}^{\infty} \int_{E_r} f\,d\mu$$

and the series converges absolutely.

6. Suppose $Z$ is the set of positive integers, $\mathscr{F}$ is the class of all subsets of $Z$ and $\mu(E)$ denotes the number of points in $E$. Show that any $f\colon Z \to R^*$ is $\mathscr{F}$-measurable and that $f$ is integrable if and only if $\sum_{n=1}^{\infty} f(n)$ converges absolutely. Deduce that the sum of an absolutely convergent series is unaffected by any rearrangement of the terms.

7. Suppose $\{f_n\}$ is a sequence of integrable functions and

$$\sum_{n=1}^{\infty} \int |f_n|\,d\mu < \infty.$$

Show that the series $\sum_{n=1}^{\infty} f_n(x)$ converges absolutely a.e. to an integrable function $f$ and that

$$\int f\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu.$$

8. Suppose $\{E_n\}$ is a sequence of sets in $\mathscr{F}$, $m$ is a fixed positive integer, and $G$ is the set of points which are in $E_n$ for at least $m$ integers $n$. Then $G$ is measurable and

$$m\mu(G) \le \sum_{n=1}^{\infty} \mu(E_n).$$

9. Show that a measurable function $f$ is integrable over a measurable set $E$ if and only if

$$\sum_{n=1}^{\infty} \mu[E \cap \{x : |f(x)| \ge n\}]$$

converges.

10. Suppose $f$ is measurable, $g$ is integrable and $\alpha, \beta \in R$ with $\alpha \le f(x) \le \beta$ a.e. Then there is a real $\gamma$ such that $\alpha \le \gamma \le \beta$ and

$$\int f\,|g|\,d\mu = \gamma \int |g|\,d\mu.$$

Show by an example that we cannot replace $|g|$ by $g$ in this equation.

11. Suppose $\mu$ is Lebesgue measure in $R$ and put

$$f_n(x) = \begin{cases} -n^2 & \text{for } x \in (0, 1/n), \\ 0 & \text{otherwise.} \end{cases}$$

Then $\liminf f_n = \lim f_n = 0$ for all $x$, but $\int f_n\,d\mu = -n$. This shows that theorem 5.7 is not valid without the restriction that $\{f_n\}$ be bounded below by an integrable $g$.

12. State and prove a version of Fatou's lemma (theorem 5.7) for a family $f_t$, $t \in (a, b)$, of non-negative measurable functions.

13. Is it true that, for measurable $f, g\colon \Omega \to R^*$,

$$f^2 \text{ and } g^2 \text{ integrable} \Rightarrow fg \text{ integrable?}$$

Show that, if

$$\left[ \int fg\,d\mu \right]^2 = \int f^2\,d\mu \int g^2\,d\mu,$$

then $f$ and $g$ are essentially proportional: that is, there is a real $\alpha$ such that $f = \alpha g$ a.e., or $g = 0$ a.e.

5.5 Lebesgue integral; Lebesgue-Stieltjes integral

We have defined the operation of integration on an abstract measure space $(\Omega, \mathscr{F}, \mu)$. Historically this method of integration was first defined on $(R, \mathscr{L}, \mu)$ where $\mu$ denotes Lebesgue measure on the $\sigma$-field $\mathscr{L}$ of Lebesgue measurable sets. We have made the definition in the general case since no more work is involved, but we must now specialise it to obtain the Lebesgue integral. If $E$ is a Lebesgue measurable set in $R$, $\mu$ denotes Lebesgue measure in $R$, and $f$ is $\mathscr{L}$-measurable, then it is usual to use the notation

$$\int_E f(x)\,dx \quad \text{for} \quad \int_E f\,d\mu.$$

In particular, if $E$ is an interval with end-points $a$, $b$ we use the notation

$$\int_a^b f(x)\,dx \quad \text{for} \quad \int_E f\,dx,$$

where $E = [a, b]$ or $(a, b)$ or $[a, b)$ or $(a, b]$. Note that, since the Lebesgue measure of a single point is zero, it makes no difference whether the interval is open or closed. In the above notation $a$ may be $-\infty$ and $b$ may be $+\infty$ so that $\int_{-\infty}^{\infty} f(x)\,dx$ means $\int_R f\,d\mu$. It is worth remarking that the integral over an infinite interval is defined directly (an infinite interval is a measurable set) and not as the limit of integrals over finite intervals.

In $R^k$ similar notations are used: $\int_E \cdots \int f(x)\,dx$ means a 'multiple integral' of $f$ over the set $E \in \mathscr{L}^k$ in Euclidean $k$-space with respect to Lebesgue measure.

If instead of using Lebesgue measure we use a Lebesgue-Stieltjes measure (defined in § 4.5) given a point function $F$ in $R^k$, this is equivalent to working in the measure space $(R^k, \mathscr{L}_F^k, \mu_F)$. We use the notation

$$\int_E f(x)\,dF(x) \quad \text{for} \quad \int_E f\,d\mu_F.$$

In this case we do not, in general, obtain the same result when we integrate over $E_1 = [a, b]$ and $E_2 = (a, b)$ so we will not use the notation

$$\int_a^b f(x)\,dF(x)$$

unless we know that $F$ is continuous for all $x$. (This condition is sufficient to imply that the $\mu_F$ measure of single point sets is zero, so that the integrals over $E_1$ and $E_2$ are the same.)

Because $\mathscr{L}^k$ is complete with respect to Lebesgue measure ($\mathscr{L}_F^k$ is complete with respect to $\mu_F$) we see that if $f\colon R^k \to R^*$ is integrable and $f = g$ a.e., then $g\colon R^k \to R^*$ is also integrable. The theorems of § 5.4 were proved for any measure space $(\Omega, \mathscr{F}, \mu)$ so they are true in particular for Lebesgue measure in $R^k$. Thus the Lebesgue integral is an order preserving linear operation on the class of Lebesgue integrable functions. It is also a continuous operator in the following senses.

Theorem 5.6 A. If $\{f_n\}$ $(n = 1, 2, \dots)$ is a monotone increasing sequence of non-negative Lebesgue measurable functions $R^k \to R^+$ and $f = \lim f_n$, then

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,dx.$$

Corollary. If $\{f_n\}$ is any sequence of non-negative Lebesgue measurable functions $R^k \to R^+$ and $f = \sum_{n=1}^{\infty} f_n$, then

$$\int f(x)\,dx = \sum_{n=1}^{\infty} \int f_n(x)\,dx.$$

Theorem 5.8 A. If $g$ is Lebesgue integrable and $\{f_n\}$ is a sequence of Lebesgue measurable functions $R^k \to R$ such that $f_n \to f$ a.e. as $n \to \infty$ and $|f_n| \le g$ a.e. for each $n$; then the functions $f_n$, $f$ are Lebesgue integrable and

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx.$$

Corollary. If $E \in \mathscr{L}^k$ and $|E|$ is finite, then for any sequence $\{f_n\}$ of $\mathscr{L}^k$-measurable functions $R^k \to R$ such that $|f_n(x)| \le \alpha < \infty$ for all $n$, all $x \in E$, and $f_n \to f$ a.e. in $E$, we have

$$\int_E f(x)\,dx = \lim_{n\to\infty} \int_E f_n(x)\,dx.$$
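A simple instance of the last corollary may help to fix ideas. The example is ours and is not taken from the text: take $E = [0, 1]$ in $R$ and $f_n(x) = x^n$, so that the bound $\alpha = 1$ will serve.

```latex
\[
  f_n(x) = x^n \quad (0 \le x \le 1), \qquad |f_n(x)| \le 1, \qquad
  f_n(x) \to f(x) =
  \begin{cases} 0 & (0 \le x < 1), \\ 1 & (x = 1), \end{cases}
\]
\[
  \int_0^1 f_n(x)\,dx = \frac{1}{n+1} \to 0 = \int_0^1 f(x)\,dx
  \quad \text{as } n \to \infty .
\]
```

Here $f = 0$ a.e., since the single point $x = 1$ has zero measure. The convergence $f_n \to f$ is not uniform on $[0, 1]$, so this is a case where the bounded convergence argument applies although uniform convergence is not available.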

It is clear that theorem 5.8 A can also be translated to give a corresponding result for series. It is also worth remarking that the theorems corresponding to theorems 5.6 A, 5.8 A for the Riemann integral can only be proved by using some additional assumption that ensures that $f$ is integrable: for example, it is sufficient to assume that $f_n \to f$ uniformly.

Exercises 5.5

1. From first principles calculate the Lebesgue integrals

(i) $\int_0^1 x^p\,dx$ $(p > -1)$;

(ii) $\int_1^{\infty} x^q\,dx$ $(q < -1)$;

(iii) $\int_S f\,d\mu$, where $\mu$ is Lebesgue measure in $R^2$, $f(x, y) = xy$ and $S$ is the unit square $0 \le x \le 1$, $0 \le y \le 1$.

2. Suppose $f\colon R \to R^*$ is Lebesgue integrable and

$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

Show that $F$ is a uniformly continuous function.

3. Show that if $\{f_n\}$ is a sequence of integrable functions $E \to R^*$ such that

$$\sum_{n=1}^{\infty} \int_E |f_n(x)|\,dx < \infty,$$

then $f_n(x) \to 0$ for almost all $x \in E$.

4. Show that if $|f_n(x)| \le 1/n^2$ for all integers $n$, $x \in E$, and each $f_n$ is measurable and $g$ is integrable over $E$, then

$$\sum_{n=1}^{\infty} \int_E f_n(x)\,g(x)\,dx = \int_E \sum_{n=1}^{\infty} f_n(x)\,g(x)\,dx.$$

5. Caratheodory defines the Lebesgue integral of a non-negative measurable function $f$ in $R$ as the Lebesgue measure of the ordinate set in $R^2$, that is, of the region between the $x$-axis and the graph of $f$. Show that this definition is equivalent to the one we have given.

6. Suppose $\{x_i\}$ is a sequence of points in $R$ and $p_i > 0$, $\sum p_i < \infty$. $F(x)$ is defined by

$$F(x) = \sum_{x_i \le x} p_i$$

and $\mu_F$ denotes the Lebesgue-Stieltjes measure with respect to $F$. Show that all functions $f\colon R \to R^*$ are measurable, and that $f$ is integrable if and only if $\sum_{i=1}^{\infty} p_i f(x_i)$ converges absolutely.

7. Show that the function $f(x) = 1/x^2$ is integrable with respect to Lebesgue measure over $[1, \infty)$, but not with respect to the Lebesgue-Stieltjes measure generated by $F(x) = x^3$.

8. Show that the function $f(x) = x^2$ is integrable with respect to the Lebesgue-Stieltjes measure generated by

$$F(x) = \begin{cases} 0, & x < 0, \\ 1 - \dfrac{1}{(x + 1)^4}, & x \ge 0. \end{cases}$$

9. Show that, for non-negative measurable functions $f\colon R \to R^+$, the Cauchy definition of the integral over an infinite interval

$$\int_0^{\infty} f\,d\mu = \lim_{t\to\infty} \int_0^t f(x)\,dx$$

is equivalent to the Lebesgue definition. By considering the function $f(x) = \sin x / x$, show that this equivalence does not extend to all measurable $f$.

10. Show that if $f\colon [a, b] \to R$ is continuous and $t \in (a, b)$,

$$\lim_{y \to t} \frac{1}{y - t} \left[ \int_a^y f(x)\,dx - \int_a^t f(x)\,dx \right] = f(t);$$

thus the Lebesgue indefinite integral can be differentiated at points where the integrand is continuous.

5.6* Conditions for integrability

The strength of the integration operator we have defined is that it works on a very wide class of functions. Provided the $\sigma$-field $\mathscr{F}$ is large, the restriction that $f$ has to be $\mathscr{F}$-measurable is not a serious one, for we have seen that in a topological space $\Omega$, if $\mathscr{F}$ contains the open sets, then any function which can be obtained from continuous functions or simple functions by countable operations will be $\mathscr{F}$-measurable. The only additional restriction for integrability of $f$ is on the size (that is, the measure) of the sets where $|f|$ is large. It should be emphasised that our operation could be called 'absolute integration', for $f$ is integrable if and only if $|f|$ is, and we do not allow the large negative values of $f$ to 'cancel out' the large positive values to give a finite integral unless each of $f_+$ and $f_-$ is separately integrable (see exercise 5.5 (9)).

If we restrict our consideration now to the Lebesgue integral on $R$, these general comments still apply, but here it is worth comparing the Lebesgue integral with the Riemann integral over finite intervals. Since we want to compare integration operators, for the present section (only) we will use

$$\mathscr{L}\!\int_a^b f(x)\,dx \ \text{to denote the Lebesgue integral,}$$

$$\mathscr{R}\!\int_a^b f(x)\,dx \ \text{to denote the Riemann integral.}$$

It is easy to give examples of functions which are $\mathscr{L}$-integrable but not $\mathscr{R}$-integrable. There are two kinds of bad behaviour which can prevent a function from being $\mathscr{R}$-integrable. These are illustrated by:

(1) Bounded functions which are badly discontinuous but still $\mathscr{L}$-measurable. For example

$$f(x) = \begin{cases} 1 & \text{when } x \text{ is rational,} \\ 0 & \text{when } x \text{ is irrational,} \end{cases}$$

is discontinuous everywhere. For any $a < b$, it is clear that $\mathscr{R}\!\int_a^b f(x)\,dx$ cannot exist. However, the set of rational points is countable, and therefore $\mathscr{L}$-measurable with zero measure, so that $f(x)$ is an $\mathscr{L}$-simple function and

$$\mathscr{L}\!\int_a^b f(x)\,dx = 0.$$

(2) Functions which are unbounded in $(a, b)$ cannot be $\mathscr{R}$-integrable even if they are continuous everywhere. For example, $f(x) = x^{-\alpha}$ $(0 < \alpha < 1)$ is not $\mathscr{R}$-integrable over $(0, 1)$, although an elementary calculation shows that it is $\mathscr{L}$-integrable. If the points of unboundedness of $f$ (as in the above case) are finite in number, it is sometimes possible to use the 'Cauchy-Riemann' process to define the integral. Thus

$$\lim_{\epsilon \to 0+} \mathscr{R}\!\int_{\epsilon}^{1} f(x)\,dx$$

is defined in the above case and could be used as a definition of $\int_0^1 f(x)\,dx$. Provided the Cauchy-Riemann integral of $|f(x)|$ exists, it is not difficult to show that, if the Cauchy-Riemann process for $f(x)$ works, then $f$ is $\mathscr{L}$-integrable to the same value. This is not true without the condition that the process works for $|f(x)|$, since the $\mathscr{L}$-integral is an absolute integral.

We know (corollary to theorem 5.5) that any function $f\colon [a, b] \to R$ which is $\mathscr{L}$-measurable and bounded is $\mathscr{L}$-integrable. For the existence of the $\mathscr{R}$-integral it is necessary for $f$ to be bounded, but the condition of measurability does not give sufficient smoothness. In fact the natural way of characterising functions which are $\mathscr{R}$-integrable over a finite interval is in terms of the measure of the set of points where the function is discontinuous.

Theorem 5.9. A bounded function $f\colon [a, b] \to R$ is Riemann integrable if and only if the set $E$ of points in $[a, b]$ at which $f$ is discontinuous satisfies $|E| = 0$. Any $f\colon [a, b] \to R$ which is Riemann integrable is Lebesgue integrable to the same value.

Proof. We use the following definition for the Riemann integral of $f$ over $[a, b]$ (this is not the usual one but can easily be seen to be equivalent by using the basic theory of the $\mathscr{R}$-integral). For any positive integer $n$, divide $I_0 = (a, b]$ into $2^n$ equal half-open intervals

$$I_{n,i} = (a_{n,i-1}, a_{n,i}] \quad (i = 1, 2, \dots, 2^n);$$

put

$$m_{n,i} = \inf\{f(x) : a_{n,i-1} < x \le a_{n,i}\}, \qquad M_{n,i} = \sup\{f(x) : a_{n,i-1} < x \le a_{n,i}\},$$

$$g_n(x) = \begin{cases} m_{n,i} & \text{for } x \in I_{n,i}, \\ 0 & \text{for } x \notin I_0; \end{cases} \qquad h_n(x) = \begin{cases} M_{n,i} & \text{for } x \in I_{n,i}, \\ 0 & \text{for } x \notin I_0. \end{cases}$$

Then for each integer $n$, $x \in I_0$,

$$g_n(x) \le f(x) \le h_n(x);$$

$\{g_n\}$ is a monotone increasing sequence of simple functions, and $\{h_n\}$ is a monotone decreasing sequence of simple functions. If we put

$$g = \lim_{n\to\infty} g_n, \qquad h = \lim_{n\to\infty} h_n,$$

then $g \le f \le h$. Further, by definition,

$$\mathscr{L}\!\int_a^b g(x)\,dx = \lim_{n\to\infty} \mathscr{L}\!\int_a^b g_n(x)\,dx = \lim_{n\to\infty} \frac{b - a}{2^n} \sum_{i=1}^{2^n} m_{n,i} = \lim_{n\to\infty} s_n, \ \text{say;}$$

$$\mathscr{L}\!\int_a^b h(x)\,dx = \lim_{n\to\infty} \mathscr{L}\!\int_a^b h_n(x)\,dx = \lim_{n\to\infty} \frac{b - a}{2^n} \sum_{i=1}^{2^n} M_{n,i} = \lim_{n\to\infty} S_n, \ \text{say.}$$

We say that $f$ is $\mathscr{R}$-integrable over $[a, b]$ if, and only if,

$$\lim_{n\to\infty} s_n = \lim_{n\to\infty} S_n,$$

and $\mathscr{R}\!\int_a^b f(x)\,dx$ is then the common value of the limit.

Now notice that if $f$ is continuous at $x \in (a, b)$ then $g(x) = h(x)$. Conversely if $g(x) = h(x)$ and $x$ is not a dyadic point (that is, $x \notin D$, where $D$ is the countable set of end-points of intervals $I_{n,i}$), then $f$ is continuous at $x$.

If $\mathscr{R}\!\int_a^b f(x)\,dx$ exists, since $g \le f \le h$,

$$\mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{R}\!\int_a^b f(x)\,dx = \mathscr{L}\!\int_a^b h(x)\,dx$$

so that, by theorem 5.5 (vii), $g = h$ a.e. Since the set $E$ of points where $f$ is discontinuous is contained in $D \cup \{x : g(x) \ne h(x)\}$ it follows that $|E| = 0$. Further, since Lebesgue measure is complete, $f$ is $\mathscr{L}$-measurable and, by theorem 5.5 (viii),

$$\mathscr{L}\!\int_a^b f(x)\,dx = \mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{R}\!\int_a^b f(x)\,dx.$$

Conversely if the set $E$ satisfies $|E| = 0$, this implies $g(x) = h(x)$ a.e., which gives, by theorem 5.5 (viii),

$$\mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{L}\!\int_a^b h(x)\,dx$$

so that $f$ is $\mathscr{R}$-integrable. ∎
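The dyadic sums $s_n$, $S_n$ used in the proof of theorem 5.9 are easy to compute in a simple case. The sketch below is an illustration only and rests on assumptions of our own: $f$ is taken to be continuous and increasing, so that the infimum over $(a_{n,i-1}, a_{n,i}]$ is $f(a_{n,i-1})$ and the supremum is $f(a_{n,i})$, and the function name `dyadic_sums` is hypothetical. Exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def dyadic_sums(f, a, b, n):
    """Return (s_n, S_n) for a continuous increasing f on (a, b].

    For such f the infimum over (a_{i-1}, a_i] equals f(a_{i-1}) and the
    supremum equals f(a_i), so both sums can be written down directly.
    """
    a, b = Fraction(a), Fraction(b)
    pieces = 2 ** n
    width = (b - a) / pieces
    points = [a + i * width for i in range(pieces + 1)]
    lower = width * sum(f(points[i - 1]) for i in range(1, pieces + 1))
    upper = width * sum(f(points[i]) for i in range(1, pieces + 1))
    return lower, upper

def square(x):
    return x * x

for n in (1, 4, 8):
    s_n, S_n = dyadic_sums(square, 0, 1, n)
    # The gap S_n - s_n telescopes to (b - a)(f(b) - f(a)) / 2**n.
    print(n, s_n, S_n, S_n - s_n)
```

For $f(x) = x^2$ on $(0, 1]$ the gap $S_n - s_n$ equals $2^{-n}$, so the two limits coincide and the common value $\tfrac{1}{3}$ is trapped between $s_n$ and $S_n$ for every $n$.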


Theorem 5.9 shows that $\mathscr{R}$-integrable functions have to be continuous at most points. We have many examples of $\mathscr{L}$-integrable functions which are continuous nowhere. However, there is a sense in which even $\mathscr{L}$-integrable functions have to be approximable by continuous functions; in fact by functions which are arbitrarily smooth, that is, functions that can be differentiated arbitrarily often.

Theorem 5.10. Given any $\mathscr{L}$-integrable function $f\colon R \to R^*$ and any $\epsilon > 0$ there is a finite interval $(a, b)$, and a bounded function $g\colon R \to R$ such that $g(x)$ vanishes outside $(a, b)$, is infinitely differentiable for all real $x$ and

$$\mathscr{L}\!\int |f(x) - g(x)|\,dx < \epsilon.$$

Proof. We carry out the approximation in 4 stages.

(i) First, find a finite interval $[a, b]$ and a bounded measurable function $f_1$ which vanishes outside $[a, b]$ and is such that

$$\mathscr{L}\!\int |f(x) - f_1(x)|\,dx < \tfrac{1}{4}\epsilon.$$

This can be done by considering the sequence of functions

$$g_n(x) = \begin{cases} f(x) & \text{if } x \in [-n, n] \text{ and } |f(x)| \le n, \\ n & \text{if } x \in [-n, n] \text{ and } f(x) > n, \\ -n & \text{if } x \in [-n, n] \text{ and } f(x) < -n, \\ 0 & \text{if } x \notin [-n, n]. \end{cases}$$

Then $g_n(x) \to f(x)$ for all $x$ and $|g_n| \le |f|$. By theorem 5.8 it follows that

$$\int |f(x) - g_n(x)|\,dx \to 0 \quad \text{as } n \to \infty$$

so that we can fix a sufficiently large $N$ and put $f_1(x) = g_N(x)$.

(ii) The next step is to approximate $f_1$ by an $\mathscr{L}$-simple function $f_2$ which vanishes outside $[a, b]$ and satisfies

$$\int |f_1(x) - f_2(x)|\,dx < \tfrac{1}{4}\epsilon.$$

This is clearly possible since we defined the integral as a limit of the integrals of simple functions.

(iii) Now a simple function is a finite sum of multiples of indicator functions. If each indicator function can be approximated by the indicator function of a finite number of disjoint intervals, then it will follow that $f_2$ can be approximated by $f_3$, a step function of the form

$$f_3(x) = \sum_{i=1}^{n} c_i\,\chi_{J_i}(x),$$

where each $J_i$ is a finite interval and

$$\int |f_2(x) - f_3(x)|\,dx < \tfrac{1}{4}\epsilon.$$

To see that this is possible start with a bounded $\mathscr{L}$-measurable set $E$ and $\eta > 0$. Find an open set $G \supset E$ such that $|G - E| < \tfrac{1}{2}\eta$ and from the countable union of disjoint open intervals making up $G$ pick a finite number to form $G_0$ such that $|G - G_0| < \tfrac{1}{2}\eta$. It will then follow that $|E \,\triangle\, G_0| < \eta$ so that

$$\int |\chi_E(x) - \chi_{G_0}(x)|\,dx < \eta.$$

(iv) In order to obtain the required infinitely differentiable function $g$ for which

$$\int |f_3(x) - g(x)|\,dx < \tfrac{1}{4}\epsilon$$

it is now sufficient to find a function for one of the components $\chi_{J_i}(x)$ of $f_3$.

Suppose $J = (a, b)$ and $0 < 2\eta < b - a$. Let $\phi_{c,\eta}$ denote a non-negative infinitely differentiable function which vanishes outside $(c - \eta, c + \eta)$ and has integral 1, and put

$$h(x) = \int_{-\infty}^{x} \{\phi_{a,\eta}(t) - \phi_{b,\eta}(t)\}\,dt.$$

It is easy to check that $h$ is infinitely differentiable and

$$\int |\chi_J(x) - h(x)|\,dx < 4\eta,$$

since $0 \le h(x) \le 1$ for all $x$ and $\{x : \chi_J(x) \ne h(x)\}$ is contained in the two intervals $(a - \eta, a + \eta)$ and $(b - \eta, b + \eta)$. ∎

Remark 1. We stated our approximation theorem in $R^1$. It is also true in $R^k$ for every $k$, and in this case we can require the approximating function to have partial derivatives of all orders everywhere. Our proof requires only minor modifications to give the corresponding theorem in $R^k$.

Remark 2. If $\Omega$ is a topological space, and $\mathscr{F}$ includes the Baire sets in $\Omega$, then theorem 5.10 can be generalized to $(\Omega, \mathscr{F}, \mu)$ to show that any integrable function can be approximated by a continuous function. (The Baire sets are the sets in the $\sigma$-ring generated by sets $\{x : f(x) > 0\}$ where $f\colon \Omega \to R$ is continuous and vanishes outside a compact set.)
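Stage (iv) of the proof of theorem 5.10 can be tried out numerically. The sketch below is an illustration under assumptions of our own, not the construction of the text: the smooth bump is taken to be a multiple of $\exp\{-1/(1 - u^2)\}$ with $u = (t - c)/\eta$, the integrals are replaced by a midpoint rule, and all function names are hypothetical. It builds $h$ as the difference of two smooth steps, one centred at $a$ and one at $b$, and estimates $\int |\chi_J - h|\,dx$ for $J = (0, 1)$.

```python
import math

def bump(t, c, eta):
    """Unnormalised smooth bump: positive on (c - eta, c + eta), zero elsewhere."""
    u = (t - c) / eta
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def midpoint_integral(func, lo, hi, steps=1000):
    """Midpoint-rule approximation to the integral of func over (lo, hi)."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

def smooth_step(x, c, eta):
    """Integral from -infinity to x of the bump normalised to total mass 1."""
    if x <= c - eta:
        return 0.0
    if x >= c + eta:
        return 1.0
    total = midpoint_integral(lambda t: bump(t, c, eta), c - eta, c + eta)
    part = midpoint_integral(lambda t: bump(t, c, eta), c - eta, x)
    return part / total

def h(x, a, b, eta):
    """Smooth substitute for the indicator of J = (a, b); needs 2 * eta < b - a."""
    return smooth_step(x, a, eta) - smooth_step(x, b, eta)

def indicator(x, a, b):
    return 1.0 if a < x < b else 0.0

def l1_distance(a, b, eta, steps=400):
    """Approximate integral of |indicator - h| over (a - eta, b + eta)."""
    return midpoint_integral(
        lambda x: abs(indicator(x, a, b) - h(x, a, b, eta)),
        a - eta, b + eta, steps)

print(l1_distance(0.0, 1.0, 0.1))
```

With $\eta = 0.1$ the computed distance should fall well inside the bound $4\eta$, since $h$ differs from $\chi_J$ only on two intervals of length $2\eta$ and the difference never exceeds 1 in absolute value.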


Exercises 5.6

1. In theorem 5.10 it was shown that any integrable function $f$ could be approximated by a step function $g$ in the sense that

$$\int |f(x) - g(x)|\,dx < \epsilon.$$

Show that in general it is not possible to arrange at the same time that g R* is integrable, then

f a function which is uniformly continuous and zero outside a bounded interval.

3. If $f_n(x) = e^{-nx} - 2e^{-2nx}$ show that $f_n$ is integrable over $[0, +\infty)$ but that

$$\int_0^{\infty} \Big( \sum_{n=1}^{\infty} f_n(x) \Big)\,dx \ne \sum_{n=1}^{\infty} \int_0^{\infty} f_n(x)\,dx.$$

4. Put

$$g(x) = \begin{cases} x^2 \sin(1/x^3) & (x \ne 0), \\ 0 & (x = 0), \end{cases}$$

and $g'(x) = f(x)$ for all $x \in R$. Show that $f(x)$ is finite for all $x$, but unbounded near $x = 0$. Show that $f(x)$ is not $\mathscr{R}$-integrable over $(0, 1)$, but that it is Cauchy-Riemann integrable (evaluate its integral). Is $f(x)$ $\mathscr{L}$-integrable over $(0, 1)$?

6

RELATED SPACES AND MEASURES

6.1 Classes of subsets in a product space

In the last few chapters we have defined all our concepts in a single abstract space $\Omega$ and usually we have at any time considered only one measure defined on a fixed class of subsets of $\Omega$. In applications one often requires to consider more than one measure, and the relationship between the spaces and measures involved becomes important. We first consider measures defined on the Cartesian product of two measure spaces. Before considering the definition of such measures we must examine, in the present section, the structure of the relevant classes of subsets.

In § 1.1 we defined the Cartesian product $X \times Y$ of two spaces $X$, $Y$ to be the set of all ordered pairs $(x, y)$ with $x \in X$, $y \in Y$.

Rectangle

Any set in $X \times Y$ of the form $E \times F$ with $E \subset X$, $F \subset Y$ is called a rectangle (set).

Product of classes

If $\mathscr{C}$, $\mathscr{D}$ denote classes of subsets in $X$, $Y$ respectively, then $\mathscr{C} \times \mathscr{D}$ denotes the class of all rectangles $E \times F$ with $E \in \mathscr{C}$, $F \in \mathscr{D}$.

Product ring, field, $\sigma$-field

If z-class again denotes any one of ring, field, $\sigma$-ring, $\sigma$-field and $\mathscr{C}$, $\mathscr{D}$ are z-classes in $X$, $Y$, respectively, then the product z-class is the z-class in $X \times Y$ generated by $\mathscr{C} \times \mathscr{D}$.

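Lemma. If $\mathscr{C}$, $\mathscr{D}$ are semi-rings in $X$, $Y$ respectively, then $\mathscr{C} \times \mathscr{D}$ is a semi-ring in $X \times Y$.

Proof. It is immediate that $\mathscr{C} \times \mathscr{D}$ is closed for finite intersections, so that we have only to prove that

$$E_1 \times F_1 - E_2 \times F_2$$

can be expressed as a union of disjoint sets of $\mathscr{C} \times \mathscr{D}$ for any $E_1, E_2 \in \mathscr{C}$; $F_1, F_2 \in \mathscr{D}$. Since $\mathscr{C}$, $\mathscr{D}$ are semi-rings we have

$$E_1 - E_2 = \bigcup_{i=3}^{n} E_i, \qquad F_1 - F_2 = \bigcup_{j=3}^{m} F_j,$$

where the sets $E_i$ $(i = 3, 4, \dots, n)$ are disjoint sets of $\mathscr{C}$, $F_j$ $(j = 3, 4, \dots, m)$ are disjoint sets of $\mathscr{D}$. Now

$$E_1 \times F_1 - E_2 \times F_2 = (E_1 \cap E_2) \times (F_1 - F_2) \cup (E_1 - E_2) \times (F_1 \cap F_2) \cup (E_1 - E_2) \times (F_1 - F_2)$$

$$= \bigcup_{j=3}^{m} (E_1 \cap E_2) \times F_j \ \cup\ \bigcup_{i=3}^{n} E_i \times (F_1 \cap F_2) \ \cup\ \bigcup_{i=3}^{n} \bigcup_{j=3}^{m} E_i \times F_j,$$

and these are all disjoint sets in $\mathscr{C} \times \mathscr{D}$. ∎

It is important to notice that this lemma does not extend to any of the z-classes as $\mathscr{C} \times \mathscr{D}$ is not closed under the operation of union. In particular, if $\mathscr{C}$, $\mathscr{D}$ are $\sigma$-fields then $\mathscr{C} \times \mathscr{D}$ will not be a $\sigma$-field. We will use the notation $\mathscr{C} * \mathscr{D}$ for the $\sigma$-field in $X \times Y$ generated by $\mathscr{C} \times \mathscr{D}$. We also need some operations which are effective in the opposite direction, from the product space to the components.

Section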

Given any set $E \subset X \times Y$ and any point $x \in X$, the subset

$$E_x = \{y : (x, y) \in E\}$$

of $Y$ is called the section of $E$ at $x$. Similarly, for $y \in Y$, the subset $E^y = \{x : (x, y) \in E\}$ of $X$ is the section of $E$ at $y$.

Projection

Given any set $E \subset X \times Y$ the sets $\{x : \text{there exists } y \text{ with } (x, y) \in E\}$, $\{y : \text{there exists } x \text{ with } (x, y) \in E\}$ are called the projections of $E$ into the spaces $X$, $Y$, respectively.
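For finite spaces the notions of rectangle, section and projection, and the decomposition used in the lemma above, can be checked directly. The following sketch is only an illustration with small finite sets chosen by us; the function names are hypothetical. It splits $E_1 \times F_1 - E_2 \times F_2$ into the three disjoint rectangles $(E_1 \cap E_2) \times (F_1 - F_2)$, $(E_1 - E_2) \times (F_1 \cap F_2)$ and $(E_1 - E_2) \times (F_1 - F_2)$.

```python
from itertools import product

def rectangle(E, F):
    """The rectangle E x F as a set of ordered pairs."""
    return set(product(E, F))

def section_at_x(S, x):
    """Section of S at x: all y with (x, y) in S."""
    return {y for (u, y) in S if u == x}

def section_at_y(S, y):
    """Section of S at y: all x with (x, y) in S."""
    return {x for (x, v) in S if v == y}

def projections(S):
    """Projections of S into the two component spaces."""
    return {x for (x, _) in S}, {y for (_, y) in S}

def rectangle_difference(E1, F1, E2, F2):
    """Split E1 x F1 - E2 x F2 into the three disjoint rectangles of the lemma."""
    pieces = [
        (E1 & E2, F1 - F2),
        (E1 - E2, F1 & F2),
        (E1 - E2, F1 - F2),
    ]
    # Empty rectangles contribute nothing and are dropped.
    return [(E, F) for (E, F) in pieces if E and F]

E1, F1 = {1, 2, 3}, {"a", "b", "c"}
E2, F2 = {2, 3, 4}, {"b", "c", "d"}
difference = rectangle(E1, F1) - rectangle(E2, F2)
pieces = rectangle_difference(E1, F1, E2, F2)
print(sorted(difference))
print(pieces)
```

In this example the difference is not itself a rectangle, but its sections are easily read off, and the three pieces returned are disjoint rectangles whose union is the whole difference.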

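Although the product $\sigma$-field $\mathscr{C} * \mathscr{D}$ of two $\sigma$-fields $\mathscr{C}$, $\mathscr{D}$ contains more than the rectangle sets $E \times F$, $E \in \mathscr{C}$, $F \in \mathscr{D}$, one can deduce an important restriction on the sections of its sets.

Theorem 6.1. If $\mathscr{F}$, $\mathscr{G}$ are $\sigma$-fields in $X$, $Y$, respectively, and $\mathscr{H}$ is the product $\sigma$-field in $X \times Y$, then all sets $E$ in $\mathscr{H}$ have the property that

$$E_x \in \mathscr{G} \ \text{for all } x \in X, \qquad E^y \in \mathscr{F} \ \text{for all } y \in Y.$$

Proof. Let $\mathscr{C}$ be the class of subsets $E$ of $X \times Y$ with the property that every section of $E$ is in the appropriate $\sigma$-field. Since rectangle sets certainly have this property, it is immediate that $\mathscr{C} \supset \mathscr{F} \times \mathscr{G}$. Moreover, it is not difficult to verify that $\mathscr{C}$ satisfies all the axioms of a $\sigma$-field. Hence $\mathscr{C} \supset \mathscr{H}$, the $\sigma$-field generated by $\mathscr{F} \times \mathscr{G}$. ∎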


Sections of functions

Given any function $f\colon X \times Y \to R^*$; for each fixed $x \in X$,

$$f_x(y) = f(x, y) \quad \text{for } y \in Y$$

defines a function $Y \to R^*$ and for each fixed $y \in Y$, $f^y(x) = f(x, y)$ for $x \in X$ defines a function on $X$ to $R^*$. These functions $f_x$, $f^y$ are called the sections of $f$ at $x$, $y$, respectively.

Corollary. Under the conditions of theorem 6.1, given any $\mathscr{H}$-measurable function $f\colon X \times Y \to R^*$, each of the sections $f_x(y)$ is $\mathscr{G}$-measurable and each of the sections $f^y(x)$ is $\mathscr{F}$-measurable.

Proof. Suppose $x_0$ is a fixed point in $X$ and $M$ is a Borel set in $R^*$; then

$$\{y : f_{x_0}(y) \in M\} = \{y : f(x_0, y) \in M\} = \{(x, y) : f(x, y) \in M\}_{x_0}$$

so that the test set is the section at $x_0$ of a set in $\mathscr{H}$. ∎

The results of theorem 6.1 and corollary can be extended in an obvious way to finite Cartesian products $X_1 \times X_2 \times \dots \times X_n$; there is no difficulty in making the required modifications to the definitions and proofs. It is not quite so immediate that they can also be extended to arbitrary Cartesian products $\prod_{i \in I} X_i$.

Let us recall that a point in $\prod_{i \in I} X_i$ can be thought of as a function $f\colon I \to \bigcup_{i \in I} X_i$ such that $f(i) \in X_i$ for each $i \in I$. Suppose then that we have a collection $\{X_i, i \in I\}$ of spaces and $\sigma$-fields $\mathscr{F}_i$ of subsets of $X_i$.

Cylinder set

If $i_1, i_2, \dots, i_n$ is any finite subset of $I$ and $E_{i_k} \in \mathscr{F}_{i_k}$, $k = 1, \dots, n$; the set of points $f \in \prod X_i$ such that

$$f(i_k) \in E_{i_k} \quad (k = 1, 2, \dots, n),$$

is said to be a (finite dimensional) cylinder set in $\prod X_i$. When we say that $f$ is in a cylinder set $C$, the values of $f$ are restricted only on a finite set of indices. The class of all such cylinder sets in $\prod X_i$ will be denoted $\mathscr{C}(\mathscr{F}_i, i \in I)$.

Lemma. The class $\mathscr{C}(\mathscr{F}_i, i \in I)$ of cylinder sets is a semi-ring of subsets of $\prod_{i \in I} X_i$.

Proof. We can think of $\mathscr{F}_i$ as a semi-ring in $X_i$ which contains the whole space $X_i$. Then if two sets

$$A = \{f : f(i) \in E_i, i \in J\}, \qquad B = \{f : f(i) \in F_i, i \in K\},$$

are in $\mathscr{C}(\mathscr{F}_i, i \in I)$, $J$ and $K$ must be finite subsets of $I$, and each of the sets $E_i$, $F_i$ must be in the relevant $\mathscr{F}_i$. The set $J \cup K = L$ is also a finite subset of $I$ and, if we put

$$E_i = X_i \ \text{for } i \in K - J, \qquad F_i = X_i \ \text{for } i \in J - K,$$

then

$$A = \{f : f(i) \in E_i, i \in L\}, \qquad B = \{f : f(i) \in F_i, i \in L\},$$

are now cylinder sets in which the same finite subset $L$ of indices are restricted. Since we know that any finite Cartesian product of semi-rings is a semi-ring, we can deduce that $A - B$ is a finite disjoint union of sets of this type and $A \cap B$ is a set of this type. Hence $\mathscr{C}(\mathscr{F}_i, i \in I)$ is a semi-ring. ∎
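The device used in this proof, of re-expressing two cylinder sets over the common index set $L = J \cup K$ by inserting the whole space $X_i$ at the unrestricted indices, can be sketched as follows. This is an illustration under assumptions of our own: the index set is finite, every $X_i$ is the same small finite set, a cylinder set is stored as a dictionary from restricted indices to the allowed sets, and the function names are hypothetical.

```python
from itertools import product

def extend(cylinder, indices, spaces):
    """Re-express a cylinder set over a larger finite index set,
    using the whole space X_i at every newly added index."""
    return {i: cylinder.get(i, spaces[i]) for i in indices}

def intersect(cyl_a, cyl_b, spaces):
    """Intersection of two cylinder sets, computed over L = J union K."""
    common = sorted(set(cyl_a) | set(cyl_b))
    a = extend(cyl_a, common, spaces)
    b = extend(cyl_b, common, spaces)
    return {i: a[i] & b[i] for i in common}

def contains(cylinder, point):
    """A point (indexed by I) lies in the cylinder set if every restricted
    coordinate falls in the corresponding allowed set."""
    return all(point[i] in allowed for i, allowed in cylinder.items())

# Index set I = {0, 1, 2, 3}; every X_i is {0, 1, 2}.
spaces = {i: frozenset({0, 1, 2}) for i in range(4)}
A = {0: frozenset({0, 1}), 2: frozenset({2})}   # restricted on J = {0, 2}
B = {2: frozenset({1, 2}), 3: frozenset({0})}   # restricted on K = {2, 3}
A_and_B = intersect(A, B, spaces)
print(A_and_B)
```

The result is again a cylinder set, restricted on $L = \{0, 2, 3\}$, and membership in it agrees with membership in both $A$ and $B$ for every point of the product space.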

Note. The case $I = Z$ is important. The cylinder sets in $\prod_{i=1}^{\infty} X_i$ then reduce to sets of the form

$$E_1 \times E_2 \times \dots \times E_n \times \prod_{i=n+1}^{\infty} X_i \quad \text{with } E_i \in \mathscr{F}_i \ (i = 1, 2, \dots, n).$$

The results corresponding to theorem 6.1 for $\prod_{i=1}^{\infty} X_i$ are formulated as examples for the reader to prove.

Exercises 6.1

1. If $\mathscr{R}$ is a ring of subsets of $X$, $\mathscr{S}$ is a ring of subsets of $Y$, show that the product ring consists of those sets in $X \times Y$ which are finite unions of disjoint rectangles in $\mathscr{R} \times \mathscr{S}$.

2. If $A_1, A_2 \subset X$, $B_1, B_2 \subset Y$ and $A_1 \times B_1 = A_2 \times B_2$ is not null, prove that $A_1 = A_2$, $B_1 = B_2$.

3. Suppose $E = A \times B$, $E_1 = A_1 \times B_1$ and $E_2 = A_2 \times B_2$ are all non-empty rectangles in $X \times Y$. Show that $E$ is a disjoint union of $E_1$ and $E_2$ if and only if either $A$ is a disjoint union of $A_1$, $A_2$ and $B = B_1 = B_2$, or $B$ is a disjoint union of $B_1$, $B_2$ and $A = A_1 = A_2$.

4. If $\mathscr{S}$, $\mathscr{T}$ are $\sigma$-rings in $X$, $Y$, respectively, then the product $\sigma$-ring in $X \times Y$ is a $\sigma$-field if and only if both $\mathscr{S}$ and $\mathscr{T}$ are $\sigma$-fields.

5. Show that the intersection of a class of rectangles is a rectangle.

6. Suppose $X = Y$ is any uncountable set and $\mathscr{S} = \mathscr{T}$ is the class of subsets which are either countable or have countable complement. Determine the product $\sigma$-field of $\mathscr{S}$ and $\mathscr{T}$. If $D = \{(x, y) : x = y\}$ is the diagonal in $X \times Y$ show that every section of $D$ is in $\mathscr{S}$ or $\mathscr{T}$ but $D$ is not in the product $\sigma$-field. This shows that theorem 6.1 has no converse.

7. Suppose $\mathscr{F}$, $\mathscr{G}$ are $\sigma$-fields in $X$, $Y$; then a rectangle set $E \times F$ is in the product $\sigma$-field if and only if $E \in \mathscr{F}$, $F \in \mathscr{G}$.

8. Suppose $\mathscr{H}$ is the product $\sigma$-field of two $\sigma$-fields $\mathscr{F}$, $\mathscr{G}$. Show that any function on $X \times Y \to R$ which is $\mathscr{H}$-simple has all its sections $\mathscr{F}$-simple or $\mathscr{G}$-simple.

9. Suppose $\mathscr{H}$ is the product $\sigma$-field of two $\sigma$-fields $\mathscr{F}_1$, $\mathscr{F}_2$. Show that the projection of a set in $\mathscr{H}$ on an axis need not be in $\mathscr{F}_1$, $\mathscr{F}_2$, respectively.

10. Suppose $\mathscr{F}_i$ is a $\sigma$-field in $X_i$ $(i = 1, 2, \dots)$ and the $\sigma$-field generated by cylinder sets $\mathscr{C}(\mathscr{F}_n, \mathscr{F}_{n+1}, \dots)$ in $\prod_{i=n}^{\infty} X_i$ is denoted by $\mathscr{S}_n$. Then given any set $E$ in $\prod_{i=1}^{\infty} X_i$ the (finite dimensional) section of $E$ at $x_1, x_2, \dots, x_k$ is the set (in $\prod_{i=k+1}^{\infty} X_i$) of points $(x_{k+1}, x_{k+2}, \dots)$ such that $(x_1, x_2, \dots) \in E$. Then if $E \in \mathscr{S}_1$, the product $\sigma$-field in $\prod_{i=1}^{\infty} X_i$, all its $k$-dimensional sections belong to $\mathscr{S}_{k+1}$.

6.2 Product measures

We now assume that $(X_1, \mathscr{F}_1, \mu_1)$ and $(X_2, \mathscr{F}_2, \mu_2)$ are measure spaces and $\mu_1$, $\mu_2$ are $\sigma$-finite measures. The product $\sigma$-field $\mathscr{F}_1 * \mathscr{F}_2$ in $X_1 \times X_2$ was defined as the smallest $\sigma$-field containing the class $\mathscr{F}_1 \times \mathscr{F}_2$, which is known to be a semi-ring since each of $\mathscr{F}_1$, $\mathscr{F}_2$ is a semi-ring. In Chapters 3 and 4 we developed a general method of extending a measure from a semi-ring to the generated $\sigma$-ring. Since the semi-ring $\mathscr{F}_1 \times \mathscr{F}_2$ contains the whole space $X_1 \times X_2$ this generated $\sigma$-ring must be a $\sigma$-field and is therefore $\mathscr{F}_1 * \mathscr{F}_2$, the product $\sigma$-field. Thus if we use theorems 3.5 and 4.2 we can extend any $\sigma$-finite measure on $\mathscr{F}_1 \times \mathscr{F}_2$ to a $\sigma$-finite measure on $\mathscr{F}_1 * \mathscr{F}_2$ in a unique way.

Suppose $E_1 \times E_2$ is any rectangle set in $\mathscr{F}_1 \times \mathscr{F}_2$ and put

$$\mu(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2),$$

with the usual convention that $0 \cdot \infty = \infty \cdot 0 = 0$. Then $\mu$ is a non-negative set function on $\mathscr{F}_1 \times \mathscr{F}_2$ which is easily seen to be $\sigma$-finite. Our first objective is to show that $\mu$ is a measure on the semi-ring $\mathscr{F}_1 \times \mathscr{F}_2$. First, suppose that

$$E \times F = \bigcup_{i=1}^{n} (E_i \times F_i)$$

with the sets $E_i \times F_i$ disjoint. Define the functions $f_i\colon X_1 \to R^+$ by $f_i(x) = \mu_2(F_i)\,\chi_{E_i}(x)$ $(i = 1, 2, \dots, n)$. Then $f_i$ is a non-negative simple function or possibly a function which takes the value $+\infty$ on a measurable set $E_i$ and zero outside it: in any case

$$\int f_i\,d\mu_1 = \mu_1(E_i)\,\mu_2(F_i) \quad (i = 1, 2, \dots, n).$$

PRODUCT MEASURES

6.2]

139

Similarly, if f(x) = ,u2(F) XE(x) we have

ffdi = Ia1(E)uu2(F) Now for each fixed x in X1 we have (E x F)x = U (E, x Fi)x i-1 with the sets (Ei x Fi)x disjoint. Since 1u2 is (finitely) additive it follows

that

n f(x) = Efi(x) i=1

If we now use (finite) additivity for integrals of non-negative simple functions we have lu1(E),a2(F) =

ffd,u1 = f Tidal = i=1 ffidu1 = L.i,a1(Ei)li2(Fi) i=1 ti=1

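This shows that the set function $\mu$ we have defined is finitely additive on $\mathcal{F}_1 \times \mathcal{F}_2$. The same argument extends without difficulty to countable unions of disjoint rectangles

$$\bigcup_{i=1}^{\infty} (E_i \times F_i) = E \times F$$

because all the functions $f_i(x)$ are non-negative measurable, so that the monotone convergence theorem 5.6 justifies the inversion of integration and summation. Thus $\mu$ is a measure on the semi-ring $\mathcal{F}_1 \times \mathcal{F}_2$. It can be extended uniquely by theorem 3.5 to the generated ring, and then, by theorem 4.2, to the generated σ-ring which is the product σ-field $\mathcal{F}_1 * \mathcal{F}_2$. The result is called the product measure on $\mathcal{F}_1 * \mathcal{F}_2$. This is the content of theorem 6.2: there is a unique measure $\mu$ on $\mathcal{F}_1 * \mathcal{F}_2$ such that

$$\mu(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2) \quad \text{for } E_1 \in \mathcal{F}_1,\ E_2 \in \mathcal{F}_2.$$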
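The finite additivity argument above can be checked by hand on finite spaces, where every measure is a sum of point masses. The following is a minimal sketch under that assumption; the spaces, the weights `mu1`, `mu2` and the particular dissection are hypothetical choices made for illustration and do not come from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite spaces with point masses (illustrative values only).
mu1 = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu2 = {0: Fraction(1, 4), 1: Fraction(3, 4)}

def measure(weights, subset):
    """Measure of a subset of a finite space: the sum of its point masses."""
    return sum((weights[p] for p in subset), Fraction(0))

def rect_measure(E, F):
    """mu(E x F) = mu1(E) * mu2(F) for a rectangle E x F."""
    return measure(mu1, E) * measure(mu2, F)

# A rectangle E x F and a dissection of it into three disjoint rectangles.
E, F = {"a", "b", "c"}, {0, 1}
pieces = [({"a"}, {0, 1}), ({"b", "c"}, {0}), ({"b", "c"}, {1})]

# The pieces must be disjoint and must cover E x F.
covered = [pt for Ei, Fi in pieces for pt in product(Ei, Fi)]
assert len(covered) == len(set(covered))
assert set(covered) == set(product(E, F))

total = rect_measure(E, F)
pieces_total = sum((rect_measure(Ei, Fi) for Ei, Fi in pieces), Fraction(0))
print(total, pieces_total)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than within a tolerance.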

The above theorem clearly extends immediately to any finite Cartesian product of measure spaces. Difficulties arise with the Cartesian product of an enumerable collection of measure spaces unless we arrange that the infinite products of real numbers occurring converge. The easiest way to ensure this is to restrict the discussion to countable products of measure spaces $(X_i, \mathcal{F}_i, \mu_i)$ with $\mu_i(X_i) = 1$.


It is possible to define product measures on arbitrary product spaces $\prod_{i \in I} X_i$ such that $\mu_i(X_i) = 1$ by exactly the method used below. We carry out the construction only for enumerable products as, in applications, it is not usually appropriate to consider the product measure for non-countable products. In § 6.6 we will give a general construction for a measure in $\prod_{i \in I} X_i$, an arbitrary product space; this construction could clearly be specialized to give the results of the remainder of this section, but it is simpler to deal with the case of product measures in countable product spaces first. We will set up our measure on the product σ-field by a slightly different procedure.

Let $\mathcal{C}$ be the semi-ring of cylinder sets in $\prod_{i=1}^{\infty} X_i$. We define $\mu$ on $\mathcal{C}$ by

$$\mu(E) = \mu_1(E_1)\,\mu_2(E_2) \cdots \mu_n(E_n),$$

if

$$E = E_1 \times E_2 \times \cdots \times E_n \times \prod_{i=n+1}^{\infty} X_i; \quad E_i \in \mathcal{F}_i \ (i = 1, 2, \ldots, n).$$

It is clear that $0 \le \mu(E) \le 1$ for all $E$ in $\mathcal{C}$. To see that $\mu$ is finitely additive on $\mathcal{C}$ it is sufficient to see that, in any finite collection of cylinder sets, only a finite number of coordinates are involved so that, if $C = \bigcup_{j=1}^{m} C_j$ is a dissection of $C \in \mathcal{C}$ into disjoint sets of $\mathcal{C}$, there is an integer $N$ such that $C$ and $C_j$ $(j = 1, \ldots, m)$ can all be expressed in the form

$$E_1 \times E_2 \times \cdots \times E_N \times \prod_{i=N+1}^{\infty} X_i.$$

We can then apply theorem 6.2 to the finite products to see that

$$\mu(C) = \sum_{j=1}^{m} \mu(C_j).$$

By theorem 3.4, $\mu$ has a unique additive extension to the ring $\mathcal{R}$ of finite unions of cylinder sets. In order to apply theorem 4.2 we must show that $\mu$ is a measure on $\mathcal{R}$. This can be done by using the continuity theorem 3.2. It is sufficient to show that any monotone decreasing sequence $\{A_n\}$ of sets in $\mathcal{R}$ such that $\mu(A_n) \ge \epsilon > 0$ for all $n$ has a non-void intersection.

Let $Y_n = \prod_{i=n+1}^{\infty} X_i$. Then by the above procedure we can define product set functions $\nu^{(n)}$ on the class $\mathcal{R}^{(n)}$ of finite unions of cylinder sets in $Y_n$. It is clear that, for each integer $n$, we can obtain $\mu$ on $\mathcal{C}$ by taking the product of the measures $\mu_i$ $(i = 1, 2, \ldots, n)$ and $\nu^{(n)}$. Let

$$A_n(x_1) = \{y : y \in Y_1,\ (x_1, y) \in A_n\}$$

be the section of $A_n$ at $x_1 \in X_1$. It is clear that, for each $x_1 \in X_1$, $A_n(x_1) \in \mathcal{R}^{(1)}$ and if

$$B_{n,1} = \{x_1 : \nu^{(1)}(A_n(x_1)) \ge \tfrac{1}{2}\epsilon\}$$

then $B_{n,1}$ is a finite union of sets in $\mathcal{F}_1$ and is therefore in $\mathcal{F}_1$: further we must have

$$\mu_1(B_{n,1}) + \tfrac{1}{2}\epsilon\,\bigl(1 - \mu_1(B_{n,1})\bigr) \ge \mu(A_n) \ge \epsilon,$$

by considering $A_n \cap (B_{n,1} \times Y_1)$ and $A_n \cap ((X_1 - B_{n,1}) \times Y_1)$. It follows that

$$\mu_1(B_{n,1}) \ge \tfrac{1}{2}\epsilon \quad (n = 1, 2, \ldots).$$

But $\{A_n\}$ is monotone decreasing so $\{B_{n,1}\}$ must also decrease with $n$ and

$$\mu_1\Bigl(\bigcap_{n=1}^{\infty} B_{n,1}\Bigr) \ge \tfrac{1}{2}\epsilon.$$

Since $\mu_1$ is a measure on $\mathcal{F}_1$, it follows that there must be at least one point $x_1 \in X_1$ for which

$$\nu^{(1)}(A_n(x_1)) \ge \tfrac{1}{2}\epsilon \quad \text{for all } n.$$

We now suppose $x_1$ is fixed as such a point in $\bigcap_{n} B_{n,1}$ and repeat the argument for the sequence of sets $\{A_n(x_1)\}$ in the space $Y_1$. This gives a point $x_2 \in X_2$ such that

$$\nu^{(2)}(A_n(x_1, x_2)) \ge \epsilon/2^2 \quad \text{for all } n.$$

By an induction argument we obtain a point $(x_1, x_2, \ldots)$ in $\prod_{i=1}^{\infty} X_i$ such that, for any $k$, $n$,

$$A_n(x_1, x_2, \ldots, x_k) \ne \emptyset.$$

But each set $A_n$ has only a finite number of coordinates restricted so the point $(x_1, x_2, \ldots)$ must be in $A_n$ for all $n$. This completes the proof that $\mu$ is continuous from above at $\emptyset$. Since $\mu$ is now seen to be a finite measure on the ring $\mathcal{R}$ it has a unique extension to the generated σ-ring which is also the product σ-field in $\prod_{i=1}^{\infty} X_i$. This extension is called the product measure. Thus we have proved

Theorem 6.3. If $(X_i, \mathcal{F}_i, \mu_i)$ are measure spaces with

$$\mu_i(X_i) = 1 \quad (i = 1, 2, \ldots);$$

then there is a unique measure $\mu$ defined on the product σ-field $\mathcal{F}$ of subsets of $X = \prod_{i=1}^{\infty} X_i$ which is generated by the cylinder sets of the form

$$E_1 \times E_2 \times \cdots \times E_n \times \prod_{i=n+1}^{\infty} X_i \quad (E_i \in \mathcal{F}_i,\ i = 1, 2, \ldots, n),$$

such that

$$\mu\Bigl(E_1 \times \cdots \times E_n \times \prod_{i=n+1}^{\infty} X_i\Bigr) = \mu_1(E_1)\,\mu_2(E_2) \cdots \mu_n(E_n).$$
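The defining formula for cylinder sets can be illustrated on a sequence of two-point spaces (a coin-tossing model). This is only a sketch: the spaces $X_i = \{0, 1\}$, the probabilities `p` and the helper names are assumptions made for the example. It checks that the value of $\mu$ does not depend on how many unrestricted coordinates are written out, and that $\mu$ is additive when a cylinder is dissected along one further coordinate.

```python
from fractions import Fraction

# Hypothetical coordinate spaces: X_i = {0, 1} with mu_i({1}) = p[i] (illustrative values).
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)]

def coord_measure(i, Ei):
    """mu_i(E_i) for a subset E_i of {0, 1}."""
    return sum((p[i] if x == 1 else 1 - p[i] for x in Ei), Fraction(0))

def cylinder_measure(sides):
    """mu(E_1 x ... x E_n x X_{n+1} x ...) = mu_1(E_1) ... mu_n(E_n)."""
    result = Fraction(1)
    for i, Ei in enumerate(sides):
        result *= coord_measure(i, Ei)
    return result

C = [{1}, {0, 1}, {0}]           # restricts the first and third coordinates only
same_C = C + [{0, 1}]            # the same set, written with one more coordinate
C0, C1 = C + [{0}], C + [{1}]    # dissection of C along the fourth coordinate

print(cylinder_measure(C), cylinder_measure(C0) + cylinder_measure(C1))
```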

Exercises 6.2

1. Given 3 σ-finite measure spaces $(X_i, \mathcal{F}_i, \mu_i)$ $(i = 1, 2, 3)$, let $\tau$ be the product measure of $\mu_1$, $\mu_2$ in $X_1 \times X_2$ and $\nu$ the product measure of $\mu_2$, $\mu_3$ in $X_2 \times X_3$. Show that, in the space $X_1 \times X_2 \times X_3$, the product measure of $\tau$ and $\mu_3$ is the same as the product measure of $\mu_1$ and $\nu$.

2. Suppose $(X_i, \mathcal{F}_i, \mu_i)$ $(i = 1, 2, \ldots)$ is a sequence of measure spaces with $\mu_i(X_i) = 1$. Let $\mu$ be the product measure of theorem 6.3 on $\prod_{i=1}^{\infty} X_i$ and suppose $\tau_n$ is the corresponding product measure on $\prod_{i=n+1}^{\infty} X_i$. Show that $\mu$ is the same as the product measure of $\mu_1, \mu_2, \ldots, \mu_n, \tau_n$ on the finite Cartesian product

$$X_1 \times X_2 \times \cdots \times X_n \times \prod_{i=n+1}^{\infty} X_i.$$

3. The product measure of two complete measures need not be complete. As an example take $X_1 = X_2 =$ unit interval with Lebesgue measure. Suppose $M$ is a non-measurable set in $X_1$, and consider the set $M \times \{y\}$; use exercise 6.1 (7).

4. Suppose $\prod_{i=1}^{\infty} X_i$ is a product space with $\mu_i(X_i) = 1$. Let $E_i \in \mathcal{F}_i$ $(i = 1, 2, \ldots)$. Then the set $E = \prod_{i=1}^{\infty} E_i$ is in the product σ-field and

$$\mu(E) = \prod_{i=1}^{\infty} \mu_i(E_i).$$

5. If a cylinder set $E_1 \times E_2 \times \cdots \times E_n \times \prod_{i=n+1}^{\infty} X_i$ is in the product σ-field $\mathcal{F}$ generated by $\mathcal{C}(\mathcal{F}_1, \mathcal{F}_2, \ldots)$, then it is in $\mathcal{C}$; in fact $E_i \in \mathcal{F}_i$ $(i = 1, 2, \ldots, n)$.


6.3 Fubini's theorem

Given two measure spaces $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ we have now seen how to define a product measure on the product σ-field in $X \times Y$. Given a function $f\colon X \times Y \to \mathbf{R}^*$ there are sections $f_x\colon Y \to \mathbf{R}^*$ defined for every $x \in X$. Our objective in the present section is to compare the integral of $f(x, y)$ with respect to the product measure with the iterated integral obtained by first integrating $f_x(y)$ with respect to $\nu$ for each fixed $x$, and then integrating the resulting function of $x$ with respect to the measure $\mu$. Because of our method of defining the integral the general result will follow easily from the special case of simple functions. The essential step towards this case is given by the next theorem.

Theorem 6.4. Given $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ two σ-finite measure spaces, let $\lambda$ be the product measure defined on the product σ-field $\mathcal{F} * \mathcal{G}$. Then for all $A \in \mathcal{F} * \mathcal{G}$, $\nu(A_x)$ is $\mathcal{F}$-measurable and $\mu(A^y)$ is $\mathcal{G}$-measurable; and

$$\lambda(A) = \int \mu(A^y) \, d\nu = \int \nu(A_x) \, d\mu.$$

Proof. Suppose first that $\mu(X)$, $\nu(Y)$ are both finite. Let $\mathcal{M}$ be the class of subsets of $X \times Y$ for which the conclusions of the theorem are valid. Then $\mathcal{M} \supset \mathcal{F} \times \mathcal{G}$ since if $A = E_1 \times E_2$, $E_1 \in \mathcal{F}$, $E_2 \in \mathcal{G}$, $\nu(A_x)$ is $\mathcal{F}$-simple as a function of $x$, $\mu(A^y)$ is $\mathcal{G}$-simple as a function of $y$, and both these functions integrate to $\lambda(A)$ by the definition of $\lambda$ on $\mathcal{F} \times \mathcal{G}$. It follows that $\mathcal{M}$ contains the ring $\mathcal{R}$ of finite unions of rectangle sets of $\mathcal{F} \times \mathcal{G}$. Since the limit of a monotone sequence of measurable functions is measurable, and theorem 5.6 applies to the integrals, it follows immediately that $\mathcal{M}$ is a monotone class. Hence, by theorem 1.5, $\mathcal{M}$ contains the σ-ring generated by $\mathcal{R}$. But $\mathcal{R}$ contains $X \times Y$, so that this σ-ring is a σ-field and $\mathcal{M} \supset \mathcal{F} * \mathcal{G}$. The restriction $\mu(X) < \infty$, $\nu(Y) < \infty$ can now be removed by the usual device of taking measurable sequences $\{A_n\}$ increasing to $X$ and $\{B_n\}$ increasing to $Y$ for which $\mu(A_n) < \infty$, $\nu(B_n) < \infty$ for all $n$, and considering the set $A \cap (A_n \times B_n)$ which increases to $A$ as $n \to \infty$. ∎

Corollary. Under the conditions of theorem 6.4, if $A \in \mathcal{F} * \mathcal{G}$, $\lambda(A) = 0$ if and only if $\nu(A_x) = 0$ for almost all $x$, and if and only if $\mu(A^y) = 0$ for almost all $y$.

This follows from the theorem using the fact that a non-negative measurable function can integrate to zero only if it is zero almost everywhere. ∎
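As a worked illustration of theorem 6.4, consider a triangle in the unit square. The choice of set, and of Lebesgue measure on $[0, 1]$ for both $\mu$ and $\nu$, is an assumption made purely for the example.

```latex
% Assumed setting: X = Y = [0,1], \mu = \nu = Lebesgue measure,
% A = \{(x,y) : 0 \le y < x \le 1\}, a Borel set and hence in the product sigma-field.
\begin{align*}
A_x &= [0, x), & \nu(A_x) &= x   && (0 < x \le 1),\\
A^y &= (y, 1], & \mu(A^y) &= 1-y && (0 \le y < 1),\\
\int \nu(A_x)\, d\mu &= \int_0^1 x \, dx = \tfrac12, &
\int \mu(A^y)\, d\nu &= \int_0^1 (1-y)\, dy = \tfrac12 .
\end{align*}
% Both iterated integrals give \lambda(A) = 1/2, the area of the triangle.
```

The two section functions are different functions on different spaces, but both integrate to the same value $\lambda(A)$.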


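Theorem 6.5. Given all the conditions of theorem 6.4, we write $\mathcal{H}$ for the product σ-field $\mathcal{F} * \mathcal{G}$.

(i) If $h\colon X \times Y \to \mathbf{R}^+$ is any non-negative $\mathcal{H}$-measurable function then

$$\int h \, d\lambda = \int \Bigl(\int h_x \, d\nu\Bigr) d\mu = \int \Bigl(\int h^y \, d\mu\Bigr) d\nu.$$

(ii) If $h\colon X \times Y \to \mathbf{R}^*$ is $\mathcal{H}$-measurable and $\lambda$-integrable, then $h_x\colon Y \to \mathbf{R}^*$ is $\nu$-integrable for almost all $x$ and $h^y\colon X \to \mathbf{R}^*$ is $\mu$-integrable for almost all $y$. Further

$$\int h \, d\lambda = \int f \, d\mu = \int g \, d\nu,$$

where

$$f(x) = \int h_x \, d\nu \text{ when } h_x \text{ is } \nu\text{-integrable}, \qquad g(y) = \int h^y \, d\mu \text{ when } h^y \text{ is } \mu\text{-integrable},$$

and $f$, $g$ are defined to be zero on the remaining null sets.

(iii) If $f\colon X \times Y \to \mathbf{R}^*$ is $\mathcal{H}$-measurable and $\int \bigl(\int |f_x| \, d\nu\bigr) d\mu$ is finite, then

$$\int f \, d\lambda = \int \Bigl(\int f^y \, d\mu\Bigr) d\nu = \int \Bigl(\int f_x \, d\nu\Bigr) d\mu.$$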

Proof (i). If $h$ is the indicator function of a set in $\mathcal{H}$ the result follows by theorem 6.4. Because of the linearity of the integration process it now follows for non-negative $\mathcal{H}$-simple functions (note that sections of an $\mathcal{H}$-simple function will be simple by theorem 6.1). If we now take a sequence $\{h^{(n)}\}$ of non-negative simple functions increasing to $h$, we will have the sections $\{h^{(n)}_x\}$, $\{h^{(n)y}\}$ increasing to $h_x$, $h^y$ respectively. Hence, as $n \to \infty$,

$$\int h^{(n)} \, d\lambda \to \int h \, d\lambda,$$

$$\int h^{(n)}_x \, d\nu \to \int h_x \, d\nu \quad \text{for all } x,$$

$$\int h^{(n)y} \, d\mu \to \int h^y \, d\mu \quad \text{for all } y,$$

and application of the monotone convergence theorem (5.6) now suffices to complete the proof.

(ii) Since $h$ is integrable, the positive and negative parts $h^+$, $h^-$ are integrable. Apply (i) to each of these functions. Then

$$f^+(x) = \int h^+_x \, d\nu$$

will always be defined, though it may take the value $+\infty$. Since $\int f^+(x) \, d\mu$ exists, we must have $f^+$ finite except for a set of zero $\mu$-measure. Similarly, $f^-$ is finite almost everywhere. If we put $f(x) = f^+(x) - f^-(x)$ when both $f^+$, $f^-$ are finite and $f(x) = 0$ otherwise, we see that

$$\int h \, d\lambda = \int h^+ \, d\lambda - \int h^- \, d\lambda = \int f^+ \, d\mu - \int f^- \, d\mu = \int f \, d\mu.$$

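(iii) Again split $f$ into positive and negative parts. Since $0 \le f^+ \le |f|$ and $0 \le f^- \le |f|$, part (i) applied to $|f|$ shows that $\int |f| \, d\lambda = \int \bigl(\int |f_x| \, d\nu\bigr) d\mu$ is finite, so that $f$ is $\lambda$-integrable and the result follows from (ii). ∎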

We have been careful to define the product measure $\lambda$ on the smallest σ-field $\mathcal{H}$ which contains $\mathcal{F} \times \mathcal{G}$. Some authors define product measure to be the completion of this $\lambda$ obtained by the process of theorem 4.3. If one uses this definition then some of our statements have to be modified to exclude possible subsets of zero measure, though the essential content of the results remains valid. In particular, given a function $f(x, y)$ which is measurable with respect to the completed σ-field $\overline{\mathcal{H}}$, one can only say that the section $f_x$ is $\mathcal{G}$-measurable for almost all $x$. However, provided $\mathcal{F}$ and $\mathcal{G}$ are complete with respect to their respective measures, theorem 6.5 remains valid as stated.

We can use our definition of product measure to give an alternative definition of the integral of a non-negative measurable function.

Theorem 6.6. Suppose $(\Omega, \mathcal{F}, \mu)$ is a σ-finite measure space, $(\mathbf{R}, \mathcal{L}, \nu)$ denotes the real line with Lebesgue measure on it and $\tau$ is the product measure $\mu \times \nu$ defined on the product σ-field $\mathcal{H}$ in $\Omega \times \mathbf{R}$. Then if $E \in \mathcal{F}$ and $f\colon E \to \mathbf{R}^+$ is non-negative, $f$ is $\mathcal{F}$-measurable over $E$ if and only if $Q(E, f) \in \mathcal{H}$, and in this case,

$$\int_E f \, d\mu = \tau(Q(E, f));$$

where $Q(E, f)$ is the ordinate set defined by

$$\{(x, y) : x \in E,\ y \in \mathbf{R},\ 0 \le y < f(x)\}.$$

Proof. Suppose first that $Q(E, f) \in \mathcal{H}$. Then by theorem 6.1 all its sections are in $\mathcal{F}$. But the section of $Q(E, f)$ at $y = a$ is the set

$$\{x : f(x) > a\},$$

so that by definition, $f$ is $\mathcal{F}$-measurable. Conversely, if $f$ is $\mathcal{F}$-measurable then there is a sequence $\{f_n\}$ of $\mathcal{F}$-simple non-negative functions which increases to $f$. Now for any $\mathcal{F}$-simple function $f_n$, $Q(E, f_n)$ is a finite union of measurable rectangles and is therefore in $\mathcal{H}$. Also $Q(E, f_n)$ increases monotonely to $Q(E, f)$ so we must have $Q(E, f) \in \mathcal{H}$. Further if

$$f_n = \sum_{i} c_{n,i}\, \chi_{E_{n,i}}$$

with $E_{n,i}$ a disjoint partition of $E$,

$$\int f_n \, d\mu = \sum_{i} c_{n,i}\, \mu(E_{n,i}) = \sum_{i} \tau\bigl(E_{n,i} \times [0, c_{n,i})\bigr) = \tau(Q(E, f_n)).$$

If we now let $n \to \infty$ we obtain the desired result. ∎

Corollary. If $f\colon \mathbf{R} \to \mathbf{R}^+$ is $\mathcal{L}$-measurable, then the ordinate set $\{(x, y) : a < x \le b,\ 0 \le y < f(x)\}$ is $\mathcal{L}^2$-measurable and has planar Lebesgue measure $\int_a^b f(x) \, dx$.
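The proof of theorem 6.6 computes $\tau(Q(E, f_n))$ for simple functions $f_n$ increasing to $f$. A minimal numerical sketch of this follows, under the assumption that $f(x) = x^2$ on $E = (0, 1]$ and that $f_n$ is the dyadic approximation $\lfloor 2^n f \rfloor / 2^n$; both choices, and the function name `lower_sum`, are illustrative and not taken from the text. Floating point arithmetic is used, so the values are approximate.

```python
import math

def lower_sum(n):
    """tau(Q(E, f_n)) for f(x) = x**2 on E = (0, 1], with f_n = floor(2**n * f) / 2**n.

    f_n takes the value c = i / 2**n on E_i = {x in E : c <= x**2 < c + 2**-n},
    an interval of Lebesgue measure sqrt(c + 2**-n) - sqrt(c); the single point
    x = 1 has measure zero and is ignored.
    """
    steps = 2 ** n
    total = 0.0
    for i in range(steps):
        c = i / steps
        total += c * (math.sqrt((i + 1) / steps) - math.sqrt(c))
    return total

# The measures of the ordinate sets of f_n increase with n towards the integral 1/3.
values = [lower_sum(n) for n in range(1, 13)]
print(values[0], values[-1])
```

Since $0 \le f - f_n < 2^{-n}$ on $E$, the value for each $n$ lies below $1/3$ by less than $2^{-n}$.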

In many elementary accounts of integration the notion of `area under the curve' is intuitively important. This last corollary makes this notion rigorous for the Lebesgue integral of non-negative functions.

It is possible to consider Euclidean $k$-dimensional space $\mathbf{R}^k$ as the Cartesian product of $k$ distinct spaces $\mathbf{R}$. Since we have a natural measure $(\mathbf{R}, \mathcal{L}, \nu)$ on each of these spaces we could form the product measure defined on $\mathcal{F}^k$, the product σ-field in $\mathbf{R}^k$, by the process of theorem 6.2. How does this measure compare with Lebesgue measure in $\mathbf{R}^k$? Since all the extension processes used are unique, and the two measures clearly coincide on $\mathcal{P}^k = \mathcal{P} \times \mathcal{P} \times \cdots \times \mathcal{P}$, the half-open rectangles in $\mathbf{R}^k$, it is clear that the two measures coincide whenever both are defined. However, $\mathcal{L}^k$ is complete with respect to Lebesgue measure while $\mathcal{F}^k$ is not. To see that $\mathcal{F}^k$ is not complete it is sufficient to consider the product of a linear set which is not measurable in $\mathbf{R}$ with $(k - 1)$ single point sets. This set cannot be in the product σ-field by exercise 6.1 (7), but it is a subset of a line in $\mathbf{R}^k$ and therefore it must be in $\mathcal{L}^k$. It follows that $\mathcal{L}^k$ is a larger σ-field than $\mathcal{F}^k$. Since $\mathcal{B}^k$, the class of Borel sets in $\mathbf{R}^k$, is the σ-field generated by $\mathcal{P}^k$, we also have $\mathcal{F}^k \supset \mathcal{B}^k$. If $E$ is any set in $\mathcal{L}^1$ but not in $\mathcal{B}^1$ the Cartesian product of $E$ with $(k - 1)$ whole lines $\mathbf{R}$ will be in $\mathcal{F}^k$ but not in $\mathcal{B}^k$, so that $\mathcal{F}^k$ is a larger σ-field than $\mathcal{B}^k$.


If we consider the case $k = 2$, a function $f(x, y)$ which is $\mathcal{L}^2$-measurable need not be $\mathcal{F}^2$-measurable. Thus we can only say that the function $f_x(y) = f(x, y)$ considered as a function of $y$ for fixed $x$ is measurable for almost all $x$. Thus in theorem 6.5 (ii), if $f(x, y)$ is Lebesgue integrable we can deduce that $\phi(x) = \int f(x, y) \, dy$ exists and is finite except for an exceptional set of $x$ of zero measure. As $\phi(x)$ is thus defined a.e. it can be integrated and

$$\iint f(x, y) \, dx \, dy = \int \phi(x) \, dx.$$

Exercises 6.3

1. Suppose $\Omega$ is any set of cardinal greater than $\aleph_0$, and $\mathcal{F}$ is the σ-field of sets in $\Omega$ which are either countable or have a countable complement. For $E \in \mathcal{F}$, put $\mu(E) = 0$ if $E$ is countable, $\mu(E) = 1$ if $(\Omega - E)$ is countable. Consider the Cartesian product of two copies of $\Omega$ and let $E$ be a set in $\Omega \times \Omega$ which has countable $x$-sections for every $x$ and $y$-sections whose complement is countable for every $y$. If $h$ is the indicator function of $E$, then

$$\int h^y(x) \, \mu(dx) = 1, \qquad \int h_x(y) \, \mu(dy) = 0.$$

Why does this not contradict theorem 6.4?

2. Suppose $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ are σ-finite measure spaces and $\lambda$ is the product measure on the product σ-field $\mathcal{H}$. Show that

(i) If $E, G \in \mathcal{H}$ are such that $\nu(E_x) = \nu(G_x)$ for almost all $x \in X$, then $\lambda(E) = \lambda(G)$.

(ii) If $f$, $g$ are integrable functions on $X$, $Y$ then $f(x)\,g(y)$ is integrable on $X \times Y$ and

$$\int f(x)\,g(y) \, d\lambda = \int f \, d\mu \int g \, d\nu.$$

3. $X = Y = [0, 1]$ and $\mathcal{F}$, $\mathcal{G}$ are the Borel subsets. Let $\mu(E)$ be the Lebesgue measure of $E$, $\nu(E)$ the number of points in $E$. Form the product measure $\mu \times \nu$ on Borel subsets of the unit square. Then if $D$ is the diagonal $\{(x, y) : x = y\}$, $D$ is measurable and

$$\int \nu(D_x) \, \mu(dx) = 1, \qquad \int \mu(D^y) \, \nu(dy) = 0.$$

Why does this not contradict theorem 6.4?

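4. If $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ show that

$$\int_0^1 \Bigl\{\int_0^1 f(x, y) \, dy\Bigr\} dx = \frac{\pi}{4}, \qquad \int_0^1 \Bigl\{\int_0^1 f(x, y) \, dx\Bigr\} dy = -\frac{\pi}{4},$$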


where all the integrals are taken in the Lebesgue sense. Thus theorem 6.5 (iii) is not valid without the modulus sign. Similarly, show that

$$\int_0^1 \Bigl\{\int_1^\infty (e^{-xy} - 2e^{-2xy}) \, dy\Bigr\} dx \ne \int_1^\infty \Bigl\{\int_0^1 (e^{-xy} - 2e^{-2xy}) \, dx\Bigr\} dy.$$

5. If $f(x, y) = xy/(x^2 + y^2)^2$, then

$$\int_{-1}^{+1} \Bigl(\int_{-1}^{+1} f(x, y) \, dx\Bigr) dy = \int_{-1}^{+1} \Bigl(\int_{-1}^{+1} f(x, y) \, dy\Bigr) dx$$

but the integral over the unit square in $\mathbf{R}^2$ does not exist.

6. Given a countable collection of probability spaces $(X_i, \mathcal{F}_i, \mu_i)$ and the product measure $\mu$ on the product σ-field, we can form the finite product measures $\tau_n = \mu_1 \times \mu_2 \times \cdots \times \mu_n$ and the product measure $\lambda_n$ on the product space $\prod_{i=n+1}^{\infty} X_i$. Then, if $f(x_1, x_2, \ldots)$ is any $\mu$-integrable function on $\prod_{i=1}^{\infty} X_i$ we have

$$\int f \, d\mu = \int \Bigl\{\int f(x_1, x_2, \ldots) \, d\lambda_n\Bigr\} d\tau_n.$$
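The first pair of iterated integrals in exercise 4 can be explored numerically. The sketch below is a rough check rather than a proof. It assumes the closed forms of the inner integrals, which come from the antiderivatives $y/(x^2 + y^2)$ in $y$ and $-x/(x^2 + y^2)$ in $x$, and applies a midpoint rule to the outer integral; the function names are illustrative.

```python
import math

def inner_dy(x):
    """Integral over 0 < y < 1 of (x^2 - y^2)/(x^2 + y^2)^2, from the antiderivative y/(x^2 + y^2)."""
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    """Integral over 0 < x < 1 of the same integrand, from the antiderivative -x/(x^2 + y^2)."""
    return -1.0 / (1.0 + y * y)

def midpoint(g, n=100_000):
    """Midpoint rule for the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

dy_then_dx = midpoint(inner_dy)   # integrate in y first, then in x
dx_then_dy = midpoint(inner_dx)   # integrate in x first, then in y
print(dy_then_dx, dx_then_dy, math.pi / 4)
```

The two orders of integration give values of opposite sign, which is possible only because $|f|$ is not integrable over the square.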

6.4 Radon-Nikodym theorem

We start with a definition.

Absolute continuity

Suppose $\mathcal{F}$ is a σ-ring of subsets of $\Omega$ and $\mu$ is a measure on $\mathcal{F}$. Then the set function $\nu\colon \mathcal{F} \to \mathbf{R}^*$ is said to be absolutely continuous with respect to $\mu$ if $\nu(E) = 0$ for every $E$ in $\mathcal{F}$ with $\mu(E) = 0$. In this case we write $\nu \ll \mu$. If $(\Omega, \mathcal{F}, \mu)$ is a measure space and $f\colon \Omega \to \mathbf{R}^*$ is $\mu$-integrable, then it is clear that

$$\nu(E) = \int_E f \, d\mu$$

defines a finite valued absolutely continuous set function $\nu$. In fact, in § 5.4 we proved that $\nu$ was σ-additive and that (corollary to theorem 5.6) given $\epsilon > 0$, there is a $\delta > 0$ such that for $E \in \mathcal{F}$,

$$\mu(E) < \delta \Rightarrow |\nu(E)| < \epsilon. \tag{6.4.1}$$

It is immediate that any set function $\nu$ which satisfies (6.4.1) is absolutely continuous with respect to $\mu$. The conditions are equivalent for finite measures, but not in general (see exercise 6.4 (4)). There is a partial converse given by:

Lemma. If $(\Omega, \mathcal{F}, \mu)$ is a measure space and $\nu\colon \mathcal{F} \to \mathbf{R}$ is finite valued, σ-additive and absolutely continuous with respect to $\mu$, then $\nu$ satisfies condition (6.4.1).


Proof. By the decomposition of § 3.2, any such $\nu$ is the difference of two finite measures, so it is sufficient to prove the result for a measure $\nu$. Then if (6.4.1) is false, there is an $\epsilon > 0$ and a sequence $\{E_n\}$ of sets of $\mathcal{F}$ such that $\nu(E_n) \ge \epsilon$ and $\mu(E_n) < 2^{-n}$. Put $E = \limsup E_n$. Then

$$\mu(E) \le \mu\Bigl(\bigcup_{r=n+1}^{\infty} E_r\Bigr) \le \sum_{r=n+1}^{\infty} \mu(E_r) < 2^{-n},$$

so that $\mu(E) = 0$ while

$$\nu(E) = \lim_{n \to \infty} \nu\Bigl(\bigcup_{r=n+1}^{\infty} E_r\Bigr) \ge \limsup \nu(E_r)$$

so that $\nu(E) \ge \epsilon$. This contradicts $\nu \ll \mu$. ∎

Thus we see that the indefinite integral of an integrable function defines an absolutely continuous set function. Our object in the present section is to obtain the converse of this statement under suitable conditions. It is convenient at the same time to consider a more general σ-additive set function and to decompose it into a maximal absolutely continuous component and a remainder which has to be concentrated on a $\mu$-null set. It is convenient to give a further definition.

Singular set function

Given a measure space $(\Omega, \mathcal{F}, \mu)$, a set function $\nu\colon \mathcal{F} \to \mathbf{R}^*$ is said to be singular with respect to $\mu$ if there is a set $E_0 \in \mathcal{F}$ for which $\mu(E_0) = 0$ and

$$\nu(E) = \nu(E \cap E_0), \quad \text{all } E \in \mathcal{F}. \tag{6.4.2}$$

This condition clearly means that the parts of $\Omega$ outside the null set $E_0$ make no contribution to $\nu$. In fact if $\nu$ is also a measure we see that $\Omega$ can be dissected into two sets $E_0, E_1 \in \mathcal{F}$ such that

$$E_0 \cap E_1 = \emptyset, \quad E_0 \cup E_1 = \Omega, \quad \mu(E_0) = 0, \quad \nu(E_1) = 0.$$

The symmetry of the relationship in this case is sometimes stressed by saying that $\mu$ and $\nu$ are mutually singular.

Theorem 6.7. Given a σ-finite measure space $(\Omega, \mathcal{F}, \mu)$ and a σ-additive, σ-finite set function $\nu$, then there is a unique decomposition

$$\nu = \nu_1 + \nu_2$$

into set functions $\nu_i$ which are σ-finite and such that $\nu_1$ is singular with respect to $\mu$ and $\nu_2 \ll \mu$. Further there is a finite valued measurable $f\colon \Omega \to \mathbf{R}$ such that

$$\nu_2(E) = \int_E f \, d\mu, \quad \text{all } E \in \mathcal{F}.$$


The function $f$ is unique in the sense that if we also have

$$\nu_2(E) = \int_E g \, d\mu$$

for all $E$ in $\mathcal{F}$, then $f(x) = g(x)$ except in a set of zero $\mu$-measure.

Corollary. Under the conditions of the theorem if $\nu \ll \mu$ then there is a finite valued $f\colon \Omega \to \mathbf{R}$ such that

$$\nu(E) = \int_E f \, d\mu \quad \text{for } E \in \mathcal{F}.$$

Note. The decomposition of $\nu$ into absolutely continuous and singular components is often called the Lebesgue decomposition, while the integral representation is called the Radon-Nikodym theorem.

Proof. Since we can express $\Omega$ as a union of a countable set of disjoint sets on each of which both $\mu$ and $\nu$ are finite, there is no loss in generality in assuming that they are both finite on $\Omega$. This applies to both the existence and uniqueness proofs. We first see that the decomposition is unique.

Let

$$\nu = \nu_1 + \nu_2 = \nu_3 + \nu_4,$$

where $\nu_1$, $\nu_3$ are singular and $\nu_2$, $\nu_4$ are absolutely continuous. Then

$$\nu_1 - \nu_3 = \nu_4 - \nu_2.$$

Taking the union of support sets of $\nu_1$, $\nu_3$ gives a set $E_0$ such that

$$(\nu_1 - \nu_3)(E) = (\nu_1 - \nu_3)(E \cap E_0), \quad \mu(E_0) = 0.$$

But $(\nu_4 - \nu_2)$ is absolutely continuous and therefore zero on any null set so that, for any $E \in \mathcal{F}$,

$$(\nu_4 - \nu_2)(E) = (\nu_1 - \nu_3)(E) = (\nu_1 - \nu_3)(E \cap E_0) = (\nu_4 - \nu_2)(E \cap E_0) = 0.$$

Thus $\nu_1(E) = \nu_3(E)$, $\nu_2(E) = \nu_4(E)$ for all $E$. The uniqueness of the integral representation of $\nu_2$ was proved in § 5.4. Thus it is sufficient to find any decomposition and integral representation.

By theorem 3.3 we can decompose $\nu$ into the difference of two measures. It is therefore sufficient to prove the theorem when $\nu$ is a measure. Now let $\mathcal{H}$ be the class of non-negative measurable $f\colon \Omega \to \mathbf{R}^+$ such that

$$\nu(E) \ge \int_E f \, d\mu \quad \text{for all } E \text{ in } \mathcal{F}$$

and put

$$\alpha = \sup\Bigl\{\int f \, d\mu : f \in \mathcal{H}\Bigr\}.$$

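Let $\{f_n\}$ be a sequence of functions in $\mathcal{H}$ such that

$$\int f_n \, d\mu > \alpha - \frac{1}{n}.$$

Put $g_n(x) = \max\{f_1(x), f_2(x), \ldots, f_n(x)\}$. Then if $E \in \mathcal{F}$ and $n$ is fixed we can decompose $E$ into a disjoint union $E_1 \cup E_2 \cup \cdots \cup E_n$ of sets of $\mathcal{F}$ such that $g_n = f_j$ on $E_j$. Hence

$$\int_E g_n \, d\mu = \sum_{j=1}^{n} \int_{E_j} g_n \, d\mu = \sum_{j=1}^{n} \int_{E_j} f_j \, d\mu \le \sum_{j=1}^{n} \nu(E_j) = \nu(E),$$

so that $g_n \in \mathcal{H}$ for all $n$. But $\{g_n\}$ is monotone increasing, and by the monotone convergence theorem, $f_0(x) = \lim g_n(x) \in \mathcal{H}$. Since $f_0(x) \ge f_n(x)$ for all $n$, we must have

$$\alpha = \int f_0(x) \, d\mu.$$

For each $E$ in $\mathcal{F}$, put

$$\nu_2(E) = \int_E f_0 \, d\mu, \qquad \nu_1(E) = \nu(E) - \nu_2(E).$$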

Then $\nu_2$ is absolutely continuous with respect to $\mu$, so it only remains to show that $\nu_1$ is singular. Consider the σ-additive set function

$$\lambda_n = \nu_1 - (1/n)\,\mu$$

and decompose $\Omega$, using theorem 3.3, into positive and negative sets $P_n$, $N_n$ such that $P_n \cup N_n = \Omega$, $P_n \cap N_n = \emptyset$, $E \subset P_n \Rightarrow \lambda_n(E) \ge 0$, $E \subset N_n \Rightarrow \lambda_n(E) \le 0$. Then, for $E \subset P_n$,

$$\nu(E) = \nu_1(E) + \nu_2(E) \ge \nu_2(E) + \frac{1}{n}\,\mu(E) = \int_E \Bigl(f_0 + \frac{1}{n}\Bigr) d\mu.$$

This shows that the function equal to $f_0$ on $N_n$ and $[f_0 + (1/n)]$ on $P_n$ is in $\mathcal{H}$. This will give a larger integral than $\alpha$ unless $\mu(P_n) = 0$. If $P = \bigcup_{n=1}^{\infty} P_n$, then $\mu(P) = 0$. Further $\Omega - P \subset N_n$ for all $n$ so that $\nu_1(\Omega - P) = 0$ and

$$\nu_1(E) = \nu_1(E \cap P) \quad \text{for all } E \text{ in } \mathcal{F},$$

that is, $\nu_1$ is $\mu$-singular.


In the case where $\nu \ll \mu$, by the uniqueness of the decomposition we must have $\nu = \nu_2$, and the integral representation of $\nu$ now follows. ∎
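On a finite space the Lebesgue decomposition and the Radon-Nikodym derivative can be written down directly, which may help to fix ideas. The following is a sketch under that assumption only: the five-point space and the weights `mu`, `nu` are hypothetical, and the construction (split along the set where `mu` vanishes, then divide point masses) is the elementary finite analogue, not the supremum argument used in the proof above.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical point masses on a five-point space (illustrative values only).
mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(0), 4: Fraction(1, 4), 5: Fraction(0)}
nu = {1: Fraction(1, 3), 2: Fraction(0), 3: Fraction(1, 6), 4: Fraction(1, 2), 5: Fraction(2)}

def total(weights, E):
    """Measure of a subset E: the sum of its point masses."""
    return sum((weights[x] for x in E), Fraction(0))

E0 = {x for x in mu if mu[x] == 0}   # mu-null set carrying the singular part
# One version of the derivative of nu2 with respect to mu; its values on E0 are arbitrary.
f = {x: (nu[x] / mu[x] if mu[x] > 0 else Fraction(0)) for x in mu}

def nu1(E):
    """Singular part: nu1(E) = nu(E ∩ E0)."""
    return total(nu, E & E0)

def nu2(E):
    """Absolutely continuous part, written as the integral of f over E with respect to mu."""
    return sum((f[x] * mu[x] for x in E), Fraction(0))

# Check nu = nu1 + nu2 and nu2 << mu on every subset of the space.
points = sorted(mu)
subsets = chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))
for E in map(set, subsets):
    assert nu1(E) + nu2(E) == total(nu, E)
    if total(mu, E) == 0:
        assert nu2(E) == 0
print(sorted(E0), f)
```

Changing `f` on the null set `E0` gives another valid derivative, in line with the uniqueness statement of theorem 6.7.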

Remark. In the statement of theorem 6.7 we do not assert that the function $f$ is integrable. A necessary and sufficient condition that $f$ be integrable is that $\nu$ be finite. However, the use of the symbol $\int_E f \, d\mu$ asserts that either $f^+$ or $f^-$ has a finite integral. This corresponds to the result of theorem 3.2 that $\nu$ cannot take both the values $\pm\infty$.

Derivative of a set function

If $(\Omega, \mathcal{F}, \mu)$ is a measure space and

$$\nu(E) = \int_E f \, d\mu \quad \text{for } E \text{ in } \mathcal{F},$$

then we write $f = d\nu/d\mu$ and call $f$ the Radon-Nikodym derivative of $\nu$ with respect to $\mu$.

One should emphasise that the derivative $d\nu/d\mu$ is not defined uniquely at any given point; it has to be considered as a function and then it becomes uniquely defined in the sense that any two functions representing the same derivative can differ only on a $\mu$-null set.

Exercises 6.4

1. Show that if $\mu$, $\nu$ are any two measures on a σ-ring $\mathcal{S}$, then $\nu \ll \mu + \nu$.

2. Suppose $F(x)$ is the Cantor function defined in § 2.7 and $\nu$ is the Lebesgue-Stieltjes measure with respect to $F$. Show that $\nu$ is singular with respect to Lebesgue measure.

3. Suppose $(\Omega, \mathcal{F}, \mu)$ is a measure space with $\mu(\Omega) < \infty$ and $\nu$ is a measure, $\nu \ll \mu$. Show there is a set $E$ such that $(\Omega - E)$ has σ-finite $\nu$-measure and for every measurable $F \subset E$, $\nu(F)$ is either $0$ or $\infty$.

4. Let $\Omega$ be the set of positive integers,

$$\mu(E) = \sum_{n \in E} 2^{-n}, \qquad \nu(E) = \sum_{n \in E} 2^{n};$$

then $\nu \ll \mu$, but (6.4.1) is not satisfied. This shows that (6.4.1) is a stronger condition than absolute continuity when $\nu$ is not finite.

5. Suppose $\Omega$ is an uncountable set, $\mathcal{F}$ is the class of sets which are either countable or have countable complements. For $E \in \mathcal{F}$, put $\mu(E) =$ the number of points in $E$, $\nu(E) = 0$ or $1$ according as $E$ is countable or not. Then clearly $\nu \ll \mu$, but no integral representation is possible. This shows that in the Radon-Nikodym theorem we cannot do without the condition that $\mu$ be σ-finite.


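6. If $\lambda$, $\mu$, $\nu$ are σ-finite measures on $\mathcal{F}$ and $\lambda \ll \mu$, $\mu \ll \nu$; show that $\lambda \ll \nu$ and

$$\frac{d\lambda}{d\nu} = \frac{d\lambda}{d\mu}\,\frac{d\mu}{d\nu}$$

except on a set of zero $\lambda$-measure.

7. $\lambda$, $\mu$ are σ-finite measures on $\mathcal{F}$ with $\mu \ll \lambda$. Then if $f$ is $\mu$-integrable

$$\int f \, d\mu = \int f \, \frac{d\mu}{d\lambda} \, d\lambda.$$

8. If $\lambda$, $\mu$ are σ-finite measures on $\mathcal{F}$ such that $\mu \ll \lambda$ and $\lambda \ll \mu$, then

$$\frac{d\lambda}{d\mu} = \Bigl(\frac{d\mu}{d\lambda}\Bigr)^{-1}$$

except for a set of zero $\lambda$-measure.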

9. If $\mu$, $\nu$ are σ-finite measures on $\mathcal{F}$ such that $\nu \ll \mu$, show that the set of points $x$ at which $d\nu/d\mu$ is zero has zero $\nu$-measure.

10. Suppose $\{\mu_i\}$ is a countable family of finite measures on a σ-field $\mathcal{F}$. Show that there exists a finite measure $\mu$ on $\mathcal{F}$ such that each of the $\mu_i$ is absolutely continuous with respect to $\mu$.

11. Suppose

$$\mu_n = \sum_{k=1}^{n} \mu'_k \to \mu, \qquad \nu_n = \sum_{k=1}^{n} \nu'_k \to \nu,$$

where all the $\mu$, $\nu$ with suffices are finite measures on a σ-field $\mathcal{F}$ and $\nu_n$ is $\mu_n$-continuous for all $n$. Show that

(i) $d\mu_1/d\mu_n \to d\mu_1/d\mu$ almost everywhere ($\mu$).

(ii) If each $\mu_n$ is $\nu$-continuous then $d\mu_n/d\nu \to d\mu/d\nu$ a.e. ($\nu$).

(iii) $\nu$ is $\mu$-continuous and $d\nu_n/d\mu_n \to d\nu/d\mu$ a.e. ($\mu$).

6.5 Mappings of measure spaces

In mathematical arguments one often needs to consider two spaces $X$, $Y$ with a mapping $f\colon X \to Y$. Such a mapping induces mappings on the classes of subsets of $X$ and $Y$: if $E \subset X$, $f(E)$ denotes the set of $y$ in $Y$ with $y = f(x)$ for some $x$ in $E$, and if $F \subset Y$, $f^{-1}(F)$ denotes the set of $x$ in $X$ with $f(x) \in F$; further if $\mathcal{C}$ is a class of subsets of $X$, $f(\mathcal{C})$ denotes the class of sets $f(E)$ with $E \in \mathcal{C}$, and similarly for $f^{-1}(\mathcal{D})$ where $\mathcal{D}$ is a class of subsets of $Y$. We saw (§ 1.5) that $f^{-1}$ preserves the structure of a class of subsets, so that if $\mathcal{S}$ is a σ-field in $Y$, $f^{-1}(\mathcal{S})$ is a σ-field in $X$. Sometimes the two spaces $X$, $Y$ already have classes of subsets defined, and one can then examine the relationship of the mapping $f$ to these.


Measurable transformation

Suppose $f$ is a mapping from $X$ into $Y$, $\mathcal{F}$ is a σ-field in $X$ and $\mathcal{G}$ is a σ-field in $Y$, then we say that $f$ is a measurable transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ if $f^{-1}(E) \in \mathcal{F}$ for every $E$ in $\mathcal{G}$. This condition can also be written $f^{-1}(\mathcal{G}) \subset \mathcal{F}$. In Chapter 5 we discussed `measurable functions'. In our new terminology these are measurable transformations from $(X, \mathcal{F})$ to $(\mathbf{R}^*, \mathcal{B})$ in which $\mathcal{B}$ is the σ-field of Borel sets in $\mathbf{R}^*$. Given mappings $f\colon X \to Y$, $g\colon Y \to Z$ we can consider the composition $g(f)\colon X \to Z$ defined by $g(f)(x) = g(f(x))$. In particular if $g\colon Y \to \mathbf{R}^*$ is an extended real-valued function on $Y$, then $g(f)$ defines an extended real function on $X$.

Lemma. If $f\colon X \to Y$ is a measurable transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ and $g\colon Y \to \mathbf{R}^*$ is $\mathcal{G}$-measurable as a function with extended real values, then the composition $g(f)$ is $\mathcal{F}$-measurable.

Proof. For any Borel set $B$ in $\mathbf{R}^*$ we have

$$\{x : g(f)(x) \in B\} = f^{-1}\{y : g(y) \in B\} = f^{-1}(E) \quad \text{for some } E \in \mathcal{G},$$

and is therefore in $\mathcal{F}$. ∎

Remark. We obtained a special case of this lemma when we proved that a Borel measurable function of a measurable function is measurable (see § 5.2).

If we start with a measure space $(X, \mathcal{F}, \mu)$ and $f$ is a measurable transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ it is natural to use $f$ to define a measure $\nu$ on $\mathcal{G}$ by putting

$$\nu(E) = \mu(f^{-1}(E)) \quad \text{for } E \in \mathcal{G}. \tag{6.5.1}$$

With this definition of $\nu$ it is immediate that $(Y, \mathcal{G}, \nu)$ is a measure space. If (6.5.1) holds we will write $\nu = \mu f^{-1}$. This allows us to carry out a `change of variable' in an integral.

Theorem 6.8. Suppose $f$ is a measurable transformation from a measure space $(X, \mathcal{F}, \mu)$ to $(Y, \mathcal{G})$ and $g\colon Y \to \mathbf{R}^*$ is $\mathcal{G}$-measurable: then

$$\int g \, d(\mu f^{-1}) = \int g(f) \, d\mu$$

in the sense that if either integral exists so does the other and the two are equal.
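The change of variable formula of theorem 6.8 becomes a rearrangement of a finite sum when $X$ is a finite set. The sketch below assumes such a space; the weights `mu`, the transformation `f` and the function `g` are hypothetical choices made only to illustrate the construction (6.5.1).

```python
from fractions import Fraction

# Hypothetical finite measure space X with point masses mu (illustrative values only).
mu = {-2: Fraction(1, 8), -1: Fraction(1, 4), 0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def f(x):
    """The transformation f: X -> Y = {0, 1, 4}."""
    return x * x

def g(y):
    """A real-valued function on Y."""
    return 3 * y + 1

# nu = mu f^{-1}: nu({y}) = mu({x : f(x) = y}), as in (6.5.1).
nu = {}
for x, w in mu.items():
    nu[f(x)] = nu.get(f(x), Fraction(0)) + w

lhs = sum(g(y) * w for y, w in nu.items())      # integral of g with respect to mu f^{-1}
rhs = sum(g(f(x)) * w for x, w in mu.items())   # integral of g(f) with respect to mu
print(nu, lhs, rhs)
```

The image measure collects the mass of all points of $X$ with the same image, so both sums run over the same terms grouped in different ways.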

Proof. It is clearly sufficient to consider non-negative functions $g\colon Y \to \mathbf{R}^+$. Suppose first that $g = \chi_E$, the indicator function of a set $E$ in $\mathcal{G}$. Then $g(f)(x) = 1$ if $x \in f^{-1}(E)$, $= 0$ if $x \notin f^{-1}(E)$; so that $g(f)$ is the indicator function of $f^{-1}(E)$, a set in $\mathcal{F}$. Thus, in this case, by (6.5.1)

$$\int g \, d(\mu f^{-1}) = \mu f^{-1}(E) = \mu(f^{-1}(E)) = \int g(f) \, d\mu.$$

By linearity, the result now follows for non-negative $\mathcal{G}$-simple functions $g$. If $\{g_n\}$ is an increasing sequence of non-negative simple functions converging to the measurable function $g$, then $g_n(f)$ will be an increasing sequence of simple functions converging to $g(f)$. The definition of the integral of a non-negative function now completes the proof. ∎

Sometimes in integration, when the variable is changed, one wants to integrate with respect to a new measure $\nu \ne \mu f^{-1}$. We can do this easily when $\mu f^{-1}$ is absolutely continuous with respect to $\nu$.

Theorem 6.9. Given σ-finite measure spaces $(X, \mathcal{F}, \mu)$ and $(Y, \mathcal{G}, \nu)$ and a measurable transformation $f$ from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ such that $\mu f^{-1}$ is absolutely continuous with respect to $\nu$, then

$$\int g(f) \, d\mu = \int g \cdot \phi \, d\nu,$$

where $\phi$ is the Radon-Nikodym derivative $d(\mu f^{-1})/d\nu$, for every measurable $g\colon Y \to \mathbf{R}^*$ in the sense that, if either integral exists, so does the other and the two are equal.

Corollary. If $q\colon \mathbf{R} \to \mathbf{R}^+$ is Lebesgue integrable, $F(x) = \int_{-\infty}^{x} q(t) \, dt$, and $\mu_F$ is the Lebesgue-Stieltjes measure generated by $F$, then

$$\int_A^B g(x) \, dx = \int_a^b g(F(t)) \, d\mu_F = \int_a^b g(F(t))\, q(t) \, dt$$

where $A = F(a)$, $B = F(b)$.

Proof. By theorem 6.8 we have

$$\int g(f) \, d\mu = \int g \, d(\mu f^{-1}).$$


Since $\mu f^{-1}$ is absolutely continuous with respect to $\nu$, there exists a measurable $\phi$ such that, for every $E \in \mathcal{G}$,

$$\int_E \phi \, d\nu = (\mu f^{-1})(E).$$

If $g$ is the indicator function of a measurable set $E$ it now follows that

$$\int g \, d(\mu f^{-1}) = (\mu f^{-1})(E) = \int g \cdot \phi \, d\nu$$

and the required result now follows by successive extension to functions $g$ which are: (i) non-negative, simple, (ii) non-negative, measurable, (iii) measurable.

Under the conditions of the corollary we consider the mapping $F\colon \mathbf{R} \to \mathbf{R}$ given by the measure function $F$ from the measure space $(\mathbf{R}, \mathcal{L}, \mu_F)$ to $(\mathbf{R}, \mathcal{B}, \mu)$. Theorem 6.8 then gives the first equality. If we define

$$\lambda(E) = \int_E q \, d\mu,$$

then $\lambda\colon \mathcal{L} \to \mathbf{R}$ is a measure which coincides with

$$\mu_F(E) = \int_E dF$$

for intervals of $\mathcal{P}$ and therefore for all sets $E \in \mathcal{L}$. Hence the measure $\mu_F$ is absolutely continuous with respect to Lebesgue measure $\mu$ and $q$ is a possible definition of the Radon-Nikodym derivative $d\mu_F/d\mu$. The second equality now follows from theorem 6.9. ∎

Remark. It is clear from the above that the function $\phi$ (or $q$ in the corollary) plays the part of the Jacobian (or rather the absolute value of the Jacobian) in the theory of transformations of multiple integrals. In general it is not easy to obtain an explicit value for the Radon-Nikodym derivative $d(\mu f^{-1})/d\nu$, but in important special cases this can be done. In particular, if both spaces are $(\mathbf{R}^k, \mathcal{L}^k, \mu)$ with $\mu$ Lebesgue measure, and $f\colon \mathbf{R}^k \to \mathbf{R}^k$ is a linear transformation given by a non-singular matrix $A$ so that $y = Ax$, one can prove that

$$|f(E)| = \|A\| \cdot |E|,$$

where $\|A\|$ denotes the absolute value of the determinant of $A$. (This can best be shown by expressing $A$ as a product of elementary transformations, and proving the result for each elementary transformation.) Applied to $f^{-1}$ this gives $\mu(f^{-1}(E)) = \|A\|^{-1}\,|E|$, so that, in this case, a possible Radon-Nikodym derivative $d(\mu f^{-1})/d\mu$ is the constant function $\|A\|^{-1}$.
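The relation $|f(E)| = \|A\| \cdot |E|$ can be checked directly when $E$ is the unit square in $\mathbf{R}^2$, since its image is a parallelogram whose area is given by the shoelace formula. This is a sketch for one hypothetical matrix only, not a proof; the matrix `A` and the helper names are assumptions made for the example.

```python
def det2(A):
    """Determinant of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    return a * d - b * c

def apply(A, v):
    """The image Av of the point v under the linear map with matrix A."""
    (a, b), (c, d) = A
    x, y = v
    return (a * x + b * y, c * x + d * y)

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon with vertices listed in order."""
    s = 0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2

A = ((2, 1), (-1, 3))                       # hypothetical non-singular matrix
square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # the unit square E
image = [apply(A, v) for v in square]       # f(E) is a parallelogram

print(polygon_area(image), abs(det2(A)) * polygon_area(square))
```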


Exercises 6.5

1. Show that the composition of two measurable transformations is measurable.

2. If $f$ is a measurable transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ and $\mu$, $\nu$ are two measures on $\mathcal{F}$ such that $\mu \ll \nu$, show that $\mu f^{-1} \ll \nu f^{-1}$.

3. (Integration by parts.) If $F(x)$, $G(x)$ are non-negative continuous functions satisfying the conditions of § 4.5 for a Stieltjes measure function and $E$ is any Borel set, then

$$\int_E F(x) \, dG(x) + \int_E G(x) \, dF(x) = \mu_{FG}(E),$$

where $\mu_{FG}$ denotes the Lebesgue-Stieltjes measure generated by $F(x)\,G(x)$. In particular if

$$F(x) = \int_a^x f(t) \, dt, \qquad G(x) = \int_a^x g(t) \, dt,$$

then

$$\int_a^b F(x)\,g(x) \, dx + \int_a^b f(x)\,G(x) \, dx = F(b)\,G(b) - F(a)\,G(a).$$

4. Suppose $A$ is a non-singular $k \times k$ matrix defining a mapping from $\mathbf{R}^k$ to $\mathbf{R}^k$, then this is a measurable transformation from $(\mathbf{R}^k, \mathcal{L}^k)$ to itself. If $\mu_k$ denotes Lebesgue measure in $\mathbf{R}^k$ show that

$$\int f \, d\mu_k = \|A\| \int f(A) \, d\mu_k,$$

for any Lebesgue measurable $f$, where $f(A)$ denotes the composite map $f(A)(x) = f(Ax)$.

6.6* Measure in function space

We saw that points in the product space $\prod_{i \in I} X_i$ can be thought of as functions $f\colon I \to \bigcup_{i \in I} X_i$ in which $f(i) \in X_i$. In the particular case where $X_i$ is the same space $X$ for all $i$, the space $\prod_{i \in I} X_i$ reduces to the set of functions $f\colon I \to X$. For this reason such a product space is often denoted by $X^I$. Since theorem 6.3 clearly extends to arbitrary product spaces we can produce a product measure in $X^I$ starting from any measure $\mu$ on $X$ with $\mu(X) = 1$. However, for non-countable $I$, such product measures are rarely of interest. In applications, the space $X^I$ usually describes a stochastic process (see Chapter 15), and the product measure in $X^I$ would correspond to complete independence (see Chapter 11) between the values in each of the coordinate spaces. Usually one wants to be able to define and use measures in $X^I$ which are not product measures.


In our account we restrict X to be the real line R (it is easy to extend the theory to the case X = C, but some restriction is needed for its validity), leaving the index set I completely arbitrary.

Borel sets in $R^I$

If we assume the usual topology in R, and denote the class of Borel sets in R by $\mathcal{B}$, then the class $\mathcal{C}$ of cylinder sets

$\{f \in R^I : f(i_k) \in B_k,\ k = 1, 2, \ldots, n\}$, $B_k \in \mathcal{B}$

is a semi-ring of subsets in $R^I$. The $\sigma$-field generated by $\mathcal{C}$ will be denoted by $\mathcal{B}^I$. If $\mathcal{B}^n$ denotes the class of Borel sets in $R^n$, it is immediate that $\mathcal{B}^I$ can also be generated by the class of sets of the form

$\{f \in R^I : a_k < f(i_k) \le b_k,\ k = 1, 2, \ldots, n\}$, (6.6.1)

or of the form

$\{f \in R^I : (f(i_1), f(i_2), \ldots, f(i_n)) \in B_n\}$, $B_n \in \mathcal{B}^n$. (6.6.2)

It is important to notice that no set in $\mathcal{B}^I$ can have restrictions on an uncountable set of coordinates. For, if E is a countable subset of I and F = I - E, a set of the form

$\{f \in R^I : f_E \in B_E\}$, (6.6.3)

where $f_E$ denotes the restriction of f to E and $B_E$ is a Borel set in $R^E$, contains functions f which are not restricted on F. The class of subsets of $R^I$ of the form (6.6.3) (for all possible countable sets $E \subset I$) is clearly a $\sigma$-field which contains the finite dimensional cylinder sets $\mathcal{C}$. Further, every set of the form (6.6.3) must be in $\mathcal{B}^I$, so that the Borel sets in $R^I$ are precisely the sets of the form (6.6.3).

Our object will be to extend a measure which is already defined on sets of the form (6.6.1) to the $\sigma$-field $\mathcal{B}^I$. For a fixed finite set $i_1, i_2, \ldots, i_n \in I$, the sets of the form (6.6.1) clearly generate a $\sigma$-field containing those sets of $R^I$ obtained by taking a Borel set in $R_{i_1} \times R_{i_2} \times \ldots \times R_{i_n}$ and forming the cylinder with this set as base. If we are to have $\mu(R^I) = 1$, then, for each fixed $i_1, i_2, \ldots, i_n$, our set function on sets of the form (6.6.2) must define a measure on the Borel sets of the Euclidean n-space $R_{i_1} \times \ldots \times R_{i_n}$ in which the whole space has measure 1. It is clear that the measures given in the various Euclidean spaces of this type have to satisfy various consistency relations, if there is to be any hope of extending to a single measure on the whole of $\mathcal{B}^I$. For such a measure on $\mathcal{B}^I$ must yield the original system on restriction to sets of the form (6.6.2). These consistency conditions can be stated


in terms of multidimensional distribution functions which generate the measures on sets (6.6.2), but we prefer to state them (equivalently) in terms of the measures. We assume then that for each finite set of distinct indices $i_1, i_2, \ldots, i_n$ we have a measure $\mu_{i_1 i_2 \ldots i_n}$ defined for the Borel sets in $R^n$ such that

(I) $\mu_{i_1 \ldots i_n i_{n+1}}(A \times R) = \mu_{i_1 \ldots i_n}(A)$, $A \in \mathcal{B}^n$.

(II) If $\pi$ is a permutation of (1, 2, ..., n) and $\phi: R^n \to R^n$ is the mapping $(x_1, \ldots, x_n) \mapsto (x_{\pi 1}, x_{\pi 2}, \ldots, x_{\pi n})$, then

$\mu_{i_{\pi 1} i_{\pi 2} \ldots i_{\pi n}} = \mu_{i_1 i_2 \ldots i_n} \phi^{-1}$.

The condition (I) says that putting on the additional condition $f(i_{n+1}) \in R$ at a new index cannot affect the measure of the set since it imposes no restriction, and condition (II) makes precise the notion that the order in which the index set $i_1, i_2, \ldots, i_n$ is written should not have any effect on the measure of the (same) set. Both these consistency conditions are clearly necessary if there is to be any hope of extending the measures $\mu_{i_1 \ldots i_n}$ to a single measure $\mu$ on $R^I$. The fact that they are also sufficient was proved by Daniell in 1918 and rediscovered by Kolmogorov in 1933. We state it as

Theorem 6.10. If I is any infinite index set, and for each finite set $i_1, i_2, \ldots, i_n$ of different indices in I there is a measure $\mu_{i_1 i_2 \ldots i_n}$ defined on the Borel subsets of $R^n$ such that the family of all such measures satisfies the consistency conditions (I) and (II), there is a unique measure $\mu$ defined on $\mathcal{B}^I$ in $R^I$ such that, for each $n \in Z$, $B_n \in \mathcal{B}^n$,

$\mu\{f \in R^I : (f(i_1), \ldots, f(i_n)) \in B_n\} = \mu_{i_1 i_2 \ldots i_n}(B_n)$.
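The consistency conditions (I) and (II) can be tested mechanically on a small family of finite dimensional distributions. The sketch below assumes a hypothetical family concentrated on the points 0 and 1 of R: a hidden fair coin is tossed once, and given its outcome the coordinates are independent with success probabilities depending on the index. The family is not a product family, and all function names are illustrative only.

```python
import itertools
import math

def success_prob(index, state):
    """Hypothetical P(f(index) = 1 | hidden state); index is a positive integer."""
    p = 1.0 / (index + 1)
    return p if state == 0 else 1.0 - p

def mu(indices, values):
    """Mass given by mu_{i_1...i_n} to the single point `values` of {0, 1}^n."""
    total = 0.0
    for state in (0, 1):               # hidden fair coin shared by all coordinates
        weight = 0.5
        for i, v in zip(indices, values):
            p = success_prob(i, state)
            weight *= p if v == 1 else 1.0 - p
        total += weight
    return total

def check_condition_I(indices, new_index):
    """Condition (I) on atoms: summing out the new coordinate changes nothing."""
    for values in itertools.product((0, 1), repeat=len(indices)):
        extended = sum(mu(indices + (new_index,), values + (v,)) for v in (0, 1))
        if not math.isclose(extended, mu(indices, values)):
            return False
    return True

def check_condition_II(indices, perm):
    """Condition (II) on atoms: permuting indices and coordinates together."""
    permuted_indices = tuple(indices[p] for p in perm)
    for values in itertools.product((0, 1), repeat=len(indices)):
        permuted_values = tuple(values[p] for p in perm)
        if not math.isclose(mu(permuted_indices, permuted_values),
                            mu(indices, values)):
            return False
    return True
```

For instance `check_condition_I((3, 1, 7), 5)` and `check_condition_II((3, 1, 7), (2, 0, 1))` both hold, while `mu((2, 3), (1, 1))` differs from `mu((2,), (1,)) * mu((3,), (1,))`, so the measure whose existence the theorem asserts for this family is not a product measure.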

Proof. Let $\mathcal{S}$ denote the semi-ring of sets in $R^I$ of the form (6.6.1) for some finite value of n. Let $\mathcal{R}$ denote the ring generated by $\mathcal{S}$, consisting of finite unions of disjoint sets in $\mathcal{S}$. Now $\mu_{i_1 i_2 \ldots i_n}$ defines the measure of the set

$\{f \in R^I : a_k < f(i_k) \le b_k,\ k = 1, \ldots, n\}$ (6.6.4)

and the consistency conditions (I) and (II) clearly ensure that the measure is uniquely defined and additive on $\mathcal{S}$ (for the sets of any finite class of sets in $\mathcal{S}$ can all be described by restrictions on the same finite set of coordinates, and therefore the measure can be given by a single measure of the family). It follows, by theorem 3.1, that there is a set function $\tau$ defined on the ring $\mathcal{R}$ which is additive and coincides with the measure $\mu_{i_1 \ldots i_n}$ on a set of the form (6.6.1). Further $\mathcal{B}^I$ is the $\sigma$-field generated by $\mathcal{R}$ and we can obtain the required measure $\mu$ on $\mathcal{B}^I$ by applying theorem 4.2 to the measure $\tau$, provided the conditions of that theorem are satisfied. It is immediate that $\tau$ is finite on $\mathcal{R}$, for $R^I \in \mathcal{R}$ and $\tau(R^I) = 1$; so that the only condition which requires proof is that $\tau$ is a measure on $\mathcal{R}$. The proof of this fact is an extension of the method used in §§ 3.4, 4.5.

If $\tau$ is not a measure on $\mathcal{R}$, we can find a decreasing sequence $\{E_n\}$ of sets in $\mathcal{R}$ such that $\bigcap_{n=1}^{\infty} E_n = \emptyset$, but $\tau(E_n) \ge \delta > 0$ for all n. Now given any set C of the form (6.6.4), and $\epsilon > 0$, we can choose $\eta > 0$ such that

$\tau(D) > \tau(C) - \epsilon$,

where

$D = \{f \in R^I : (f(i_1), f(i_2), \ldots, f(i_n)) \in P_\eta\}$ and $P_\eta = \{a_k + \eta < x_k \le b_k,\ k = 1, 2, \ldots, n\}$,

since $\mu_{i_1 i_2 \ldots i_n}$ is a measure. But now $\bar{P}_\eta \subset P_0$. This argument clearly extends to any non-empty subset in $\mathcal{R}$, and we can apply it by induction to the sequence $\{E_n\}$. Since in each of the sets $E_n$ the value of f at only a finite set of indices is restricted, there is no loss of generality in assuming that in the sets $E_1, E_2, \ldots, E_n$ there is a restriction on f only at the first n of the indices in the sequence $i_1, i_2, \ldots, i_n, \ldots$. (If this condition is not satisfied one need only add additional sets in $\mathcal{R}$ to the sequence $\{E_n\}$ to obtain a new sequence of which the original is a subsequence.)

Thus we may assume that

$E_n = \{f \in R^I : (f(i_1), \ldots, f(i_n)) \in Q_n\}$,

where $Q_n \in \mathcal{E}^n$, the class of elementary figures in $R^n$. The condition that $E_n$ be a decreasing sequence now means that $Q_{n+1} \subset Q_n \times R$. We apply the above procedure to each of the sets $E_n$ to give a sequence $\{D_n\}$ of sets

$D_n = \{f \in R^I : (f(i_1), \ldots, f(i_n)) \in P_n\}$

such that $\bar{P}_n \subset Q_n$, $P_n \in \mathcal{E}^n$ and

$\tau(D_n) > \tau(E_n) - \delta / 2^{n+1}$.

If we put

$V_n = D_1 \cap D_2 \cap \ldots \cap D_n$,

then

$\tau(V_n) = \tau(E_n) - \tau(E_n - V_n) \ge \tau(E_n) - \sum_{i=1}^{n} \tau(E_i - D_i) > \tfrac{1}{2}\delta$

so that the sets $\{V_n\}$ form a monotone decreasing sequence of non-empty sets. In each $V_n$ choose a point $f_n = \{f_n(i), i \in I\}$.

Now

$(f_{n+p}(i_1), f_{n+p}(i_2), \ldots, f_{n+p}(i_n))$ (p = 1, 2, ...)

defines a sequence of points in $R^n$ which is a subset of the bounded closed set

$(\bar{P}_1 \times R^{n-1}) \cap (\bar{P}_2 \times R^{n-2}) \cap \ldots \cap (\bar{P}_n) = F_n$.

We can therefore find a subsequence of $\{f_{n+p}\}$ which, evaluated at the first n indices, converges to a point of $F_n$. Since $\tau(V_n) > \tfrac{1}{2}\delta$, $V_n$ is not empty and $F_n$ is not empty since

$V_n \subset \{f \in R^I : (f(i_1), \ldots, f(i_n)) \in F_n\}$.

Further $F_n \times R \supset F_{n+1}$ (n = 1, 2, ...), and we can now employ a standard diagonalisation argument to obtain a point in $\bigcap_{n=1}^{\infty} E_n$.

Obtain successively, by induction, infinite increasing sequences of positive integers

$\nu_1 \supset \nu_2 \supset \ldots \supset \nu_k \supset \ldots$

such that $\{f_n\}$ restricted to the sequence $\nu_k$ gives a sequence whose values at $i_1, i_2, \ldots, i_k$ converge to a point in $F_k$. Form the sequence $\nu$ obtained by taking the kth integer in the sequence $\nu_k$. Then, for each k, $\nu$ is a subsequence of $\nu_k$ except for a finite number of terms at the beginning, so that $\{(f_n(i_1), f_n(i_2), \ldots, f_n(i_k))\}$, $n \in \nu$, must converge to a point in $F_k \subset Q_k$. If we put $q_k = \lim_{n \in \nu} f_n(i_k)$ (k = 1, 2, ...) the set

$H = \{f \in R^I : f(i_k) = q_k,\ k = 1, 2, \ldots\}$

is non-empty, and $H \subset V_n \subset E_n$ for all n. This contradicts $\bigcap_{n=1}^{\infty} E_n = \emptyset$.

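Remark. For a finite index set I, theorem 6.10 is still true, but lacks any content as the measure $\mu_{i_1 \ldots i_n}$ already is the required $\mu$ if $I = \{i_1, i_2, \ldots, i_n\}$.

Brownian motion

We can set up a mathematical model for Brownian motion by applying theorem 6.10 to a particular family of finite dimensional distributions. Use the index set $T = \{t \in R, t > 0\}$ which can be thought of as time and, for $0 < t_1 < t_2 < \ldots < t_n$, define

$$\mu_{t_1 \ldots t_n}\{f \in R^T : a_i < f(t_i) \le b_i,\ i = 1, \ldots, n\} = \frac{1}{(2\pi)^{n/2}\{t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})\}^{1/2}} \int_{a_n}^{b_n} \exp\left[-\frac{(\xi_n - \xi_{n-1})^2}{2(t_n - t_{n-1})}\right] d\xi_n \int_{a_{n-1}}^{b_{n-1}} \exp\left[-\frac{(\xi_{n-1} - \xi_{n-2})^2}{2(t_{n-1} - t_{n-2})}\right] d\xi_{n-1} \cdots \int_{a_2}^{b_2} \exp\left[-\frac{(\xi_2 - \xi_1)^2}{2(t_2 - t_1)}\right] d\xi_2 \int_{a_1}^{b_1} \exp\left(-\frac{\xi_1^2}{2 t_1}\right) d\xi_1,$$

the expression on the right being read as an iterated integral, in which the integration with respect to $\xi_n$ is carried out first and that with respect to $\xi_1$ last.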


The fact that this defines a consistent family of measures on $\mathcal{S}$ which can be extended to all sets of the form (6.6.2) can be proved directly (it will follow from the discussion of the multinormal distribution in Chapter 14). Hence, we can apply theorem 6.10 to give a measure $\mu$ on $\mathcal{B}^T$, the class of Borel sets in $R^T = \Omega$. This is called Wiener measure in the space of functions $f: T \to R$, and is an example of a stochastic process which will be discussed more fully in Chapter 15.
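The finite dimensional distributions of Wiener measure can be evaluated numerically, which gives a rough check of consistency condition (I) in one instance. The following sketch assumes a simple iterated midpoint rule and replaces the unrestricted coordinate by the wide interval (-12, 12); the function names, the step count and the truncation are assumptions made for illustration, not part of the theory.

```python
import math

def transition_density(x, y, dt):
    """Gaussian density of the increment y - x over a time step dt > 0."""
    return math.exp(-(y - x) ** 2 / (2.0 * dt)) / math.sqrt(2.0 * math.pi * dt)

def wiener_fdd(times, boxes, steps=200):
    """Approximate mu_{t_1...t_n}{a_i < f(t_i) <= b_i} by an iterated midpoint rule.

    times: increasing positive times t_1 < ... < t_n
    boxes: list of pairs (a_i, b_i) with finite end points
    """
    def integrate(k, prev_x, prev_t):
        a, b = boxes[k]
        h = (b - a) / steps
        total = 0.0
        for j in range(steps):
            x = a + (j + 0.5) * h                      # midpoint of the j-th cell
            weight = transition_density(prev_x, x, times[k] - prev_t) * h
            if k + 1 < len(times):
                weight *= integrate(k + 1, x, times[k])
            total += weight
        return total

    # The factor exp(-xi_1^2 / 2 t_1) is the transition from the value 0 at time 0.
    return integrate(0, 0.0, 0.0)

one_dim = wiener_fdd([1.0], [(-1.0, 1.0)])
# Condition (I), with (-12, 12) standing in for the unrestricted coordinate f(t_2).
two_dim = wiener_fdd([1.0, 2.0], [(-1.0, 1.0), (-12.0, 12.0)])
exact = math.erf(1.0 / math.sqrt(2.0))   # normal probability of (-1, 1] at t_1 = 1
```

Under these assumptions `one_dim`, `two_dim` and `exact` agree to several decimal places, which is what condition (I) predicts for this family.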

However, let us use the example of Wiener measure to illustrate the inadequacy of theorem 6.10. This follows from the fact that the $\sigma$-field $\mathcal{B}^T$ is too small to contain interesting sets; for we have seen that it contains no set in which a non-countable set of time coordinates is restricted. Even if $\mathcal{B}^T$ is completed with respect to $\mu$ to give a probability measure, the completed $\sigma$-field is still too small. For if $A \subset R^T$ is a set in which f is restricted at a non-countable set, the only set of $\mathcal{B}^T$ which is contained in A is the empty set. This means that A can only be measurable if it has measure zero. But the same argument applies to the complement of A, so that if both A and its complement involve restrictions on $f: T \to R$ at a non-countable set of indices, then the outer measure of A must be 1, and the inner measure must be 0. In particular the set

$\{f \in R^T : a < f(t) < b \text{ for all } t \in [t_1, t_2]\}$ (6.6.4)

is not measurable, and if C is the set of functions $f: T \to R$ which are continuous for all $t \in T$, C also has outer measure 1 (and inner measure 0). Various methods can be used to extend $\mu$ from $\mathcal{B}^T$ to a larger $\sigma$-field which includes C and (6.6.4) and other sets of interest. These have been studied in detail and the interested reader is advised to look in J. L. Doob, Stochastic Processes (Wiley, 1953).

6.7 Applications

In the second part of this book random variables will be defined as $\mathcal{F}$-measurable functions $f: \Omega \to R^*$ where $(\Omega, \mathcal{F}, \mu)$ is a probability space. Although it is usual to work with general carrier spaces $\Omega$, there is a sense in which the real line R has a structure sufficiently complicated to reproduce all the probability properties of the function f.

In fact, in many treatises on probability theory, the carrier space $\Omega$ is barely mentioned. This attitude is partially justified by the following considerations. Suppose $(\Omega, \mathcal{F}, \mu)$ is any finite measure space and $f: \Omega \to R$ is $\mathcal{F}$-measurable. For all real x, define

$F(x) = \mu\{y : f(y) \le x\}$. (6.7.1)


Then $F(x) \to 0$ as $x \to -\infty$, $F(x) \to \mu(\Omega)$ as $x \to +\infty$, and $F: R \to R$ is continuous on the right. Thus we can define a Stieltjes measure $\mu_F$ using this particular F.

Theorem 6.11. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $f: \Omega \to R$ is $\mathcal{F}$-measurable, $\mu_F$ is the Lebesgue-Stieltjes measure in R given by (6.7.1) and $g: R \to R$ is Borel measurable, then g(f) is $\mathcal{F}$-measurable and $\mu\{x : g(f)(x) \in B\}$ is determined by $\mu_F$ for every Borel set B. Further

$\int g(f)\, d\mu = \int g(x)\, dF(x)$

in the sense that, if either side exists so does the other, and the two are equal.
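On a finite carrier space the two sides of the identity in theorem 6.11 are finite sums and can be compared directly. The sketch below assumes a hypothetical six-point measure space of total mass 2 and particular choices of f and g; the measure $\mu_F$ then consists of atoms of size F(v) - F(v-) at the values v taken by f.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..5 with these masses (total mass 2).
mass = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4),
        3: Fraction(1, 3), 4: Fraction(1, 6), 5: Fraction(1, 2)}

def f(omega):
    """A function on the carrier space (every function is measurable here)."""
    return (omega - 2) ** 2            # takes the values 4, 1, 0, 1, 4, 9

def g(x):
    """A Borel measurable function on R."""
    return 3 * x + 1

def F(x):
    """Distribution function F(x) = mu{omega: f(omega) <= x}, as in (6.7.1)."""
    return sum((m for omega, m in mass.items() if f(omega) <= x), Fraction(0))

def integral_over_omega():
    """Left side: the integral of g(f) with respect to mu."""
    return sum(g(f(omega)) * m for omega, m in mass.items())

def stieltjes_integral():
    """Right side: the integral of g with respect to mu_F, summed over its atoms."""
    total = Fraction(0)
    previous = Fraction(0)             # value of F to the left of the smallest atom
    for v in sorted({f(omega) for omega in mass}):
        total += g(v) * (F(v) - previous)
        previous = F(v)
    return total
```

Here `integral_over_omega()` and `stieltjes_integral()` return the same rational number, and `F` runs from 0 up to the total mass 2, in line with the limits stated for F.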

Proof. $\{x : g(f)(x) \in B\} = \{x : f(x) \in g^{-1}(B)\}$ and $g^{-1}(B) = C$ is a Borel set so that $\{x : f(x) \in C\}$ is in $\mathcal{F}$, and $\mu\{x : f(x) \in C\}$ is uniquely determined by $\mu\{x : a < f(x) \le b\} = F(b) - F(a)$ for all real a, b, since $\mathcal{P}$, the class of half-open intervals, generates the $\sigma$-field $\mathcal{B}$ of Borel sets in R, and $F(b) - F(a) = \mu_F(a, b]$. Thus, for all B in $\mathcal{B}$, $\mu\{x : g(f)(x) \in B\} = \mu_F(g^{-1}(B))$.

Now suppose g is an indicator function of a Borel set B. Then

$\int g(f)\, d\mu = \mu\{x : f(x) \in B\} = \mu_F(B) = \int g\, d\mu_F$.

By linearity our result follows for non-negative simple functions and the monotone convergence theorem then gives it for non-negative Borel measurable g and then for all integrable g.

Corollary. In the notation of the theorem

$\int f\, d\mu = \int x\, dF(x)$.

Remark. There is an n-dimensional form of theorem 6.11 and corollary which links the behaviour of n measurable functions with a Lebesgue-Stieltjes distribution in $R^n$; see Chapter 14.

Marginal distributions

Not all measures in product spaces are product measures. Suppose X, Y are spaces, then the projection $p: X \times Y \to X$ given by p(x, y) = x defines a mapping. This will be a measurable transformation on $(X \times Y, \mathcal{F})$ into $(X, \mathcal{S})$ provided $E \times Y \in \mathcal{F}$ for every $E \in \mathcal{S}$. In this case, if $\mu$ is a $\sigma$-finite measure on $\mathcal{F}$, $\mu p^{-1}$ defines a measure on $\mathcal{S}$. In general it may not be a very interesting measure as there may be no sets of finite positive measure. However, if $(X \times Y, \mathcal{F}, \mu)$ is a finite measure space, then the measure $\mu p^{-1}$ on $\mathcal{S}$ is called the marginal measure on X. The marginal measure on Y is similarly defined using a projection on Y. If $\mu$ is a probability measure these marginal measures are called marginal (probability) distributions. If F(x, y) is a distribution function in $R^2$ (see § 4.5) then

$\lim_{y \to +\infty} F(x, y)$ and $\lim_{x \to +\infty} F(x, y)$

will again define 1-dimensional distribution functions, and it is immediate from theorem 4.8 that the corresponding Lebesgue-Stieltjes measures will be the marginal distributions of $\mu_F$. If $F(x, y) = F_1(x) F_2(y)$ is the product of two 1-dimensional distributions, then $\mu_F$ will be the completion of the product measure $\mu_{F_1} \times \mu_{F_2}$ and $F_1$, $F_2$ will be the marginal distributions for F. Conversely, if $\mu_F$ is a product measure, then it must be the product of its marginal distributions so that $F(x, y) = F_1(x) F_2(y)$ is a necessary as well as a sufficient condition for $\mu_F$ to be a product of two probability measures.

Thick subsets

For any finite measure space $(\Omega, \mathcal{F}, \mu)$ we can generate the outer measure

$\mu^*(E) = \inf \sum_{i=1}^{\infty} \mu(E_i)$, (6.7.2)

the infimum being taken over all sequences of sets $\{E_i\}$ in $\mathcal{F}$ with $E \subset \bigcup_{i=1}^{\infty} E_i$. (Since $\mu$ is a measure on the $\sigma$-field $\mathcal{F}$, (6.7.2) is the same as $\mu^*(E) = \inf \mu(F)$ for $F \supset E$, $F \in \mathcal{F}$.) A subset $E_0$ of $\Omega$ is said to be thick in $\Omega$ if $\mu^*(E_0) = \mu(\Omega)$. Thus a subset $E_0$ is thick if and only if $(\Omega - E_0)$ contains no set in $\mathcal{F}$ of positive $\mu$-measure. There is a sense in which the measure space can be projected onto any thick subset.
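The definition of a thick subset can be tried out on a very small example. The sketch below assumes a hypothetical four-point space whose $\sigma$-field is generated by the partition {0, 1}, {2, 3}, each atom having measure 1/2; it computes $\mu^*$ by the simplified formula $\inf \mu(F)$ over measurable $F \supset E$ and compares the two characterisations of thickness given above.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({0, 1, 2, 3})
# Hypothetical sigma-field: all unions of the two atoms below.
ATOMS = [frozenset({0, 1}), frozenset({2, 3})]
ATOM_MASS = {ATOMS[0]: Fraction(1, 2), ATOMS[1]: Fraction(1, 2)}

def sigma_field():
    """Every set of the sigma-field, i.e. every union of atoms."""
    sets = []
    for r in range(len(ATOMS) + 1):
        for chosen in combinations(ATOMS, r):
            sets.append(frozenset().union(*chosen))
    return sets

def mu(measurable_set):
    """Measure of a set of the sigma-field: total mass of the atoms it contains."""
    return sum((ATOM_MASS[a] for a in ATOMS if a <= measurable_set), Fraction(0))

def outer_measure(E):
    """mu*(E) = inf of mu(F) over sets F of the sigma-field containing E."""
    return min(mu(F) for F in sigma_field() if E <= F)

def is_thick(E):
    return outer_measure(E) == mu(OMEGA)

def complement_has_no_positive_set(E):
    """Equivalent test: Omega - E contains no measurable set of positive measure."""
    rest = OMEGA - E
    return all(mu(F) == 0 for F in sigma_field() if F <= rest)
```

In this example `frozenset({0, 2})` is thick although it is not in the $\sigma$-field, whereas the atom `frozenset({0, 1})` is not thick; the two tests agree on every subset of the space.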

Theorem 6.12. If $E_0$ is a thick subset of the finite measure space $(\Omega, \mathcal{F}, \mu)$, $\mathcal{F}_0 = \mathcal{F} \cap E_0$, and $\mu_0(E \cap E_0) = \mu(E)$ for any $E \in \mathcal{F}$, then $(E_0, \mathcal{F}_0, \mu_0)$ is a measure space.

Proof. We first see that $\mu_0$ is defined uniquely on $\mathcal{F}_0$. If $A_1, A_2 \in \mathcal{F}$ are such that $A_1 \cap E_0 = A_2 \cap E_0$, then we must have

$(A_1 \triangle A_2) \cap E_0 = \emptyset$,

so that $\mu(A_1 \triangle A_2) = 0$ and $\mu(A_1) = \mu(A_2)$.

Now suppose $\{B_n\}$ is a disjoint sequence of sets in $\mathcal{F}_0$ so that there is a sequence of sets $\{C_n\}$ in $\mathcal{F}$ with

$B_n = C_n \cap E_0$ (n = 1, 2, ...).

Put

$D_n = C_n - \bigcup_{i=1}^{n-1} C_i$ (n = 1, 2, ...).

Then

$D_n \cap E_0 = C_n \cap E_0$,

so that $\mu(D_n \triangle C_n) = 0$. It follows that

$\sum_{n=1}^{\infty} \mu_0(B_n) = \sum_{n=1}^{\infty} \mu(C_n) = \sum_{n=1}^{\infty} \mu(D_n) = \mu\left(\bigcup_{n=1}^{\infty} D_n\right) = \mu_0\left(\bigcup_{n=1}^{\infty} B_n\right)$

so that $\mu_0$ is a measure.

Remark. This theorem shows that in a probability space $(\Omega, \mathcal{F}, P)$, the $\sigma$-field $\mathcal{F}$ can be extended to include any set $E_0$ not in it whose outer measure is 1. The effect of this extension is to discard all the points of $\Omega$ which are not in $E_0$. The device turns out to be useful in the theory of stochastic processes where, by a careful choice of $E_0$, one

can obtain a probability on a useful class of subsets. In particular, for Wiener measure in $R^T$ described in § 6.6, it can be shown that the set C of continuous functions is thick and that the extension given by putting $E_0 = C$ is a useful one; see Chapter 15.

Exercises 6.7

1. Formulate and prove a theorem of the form of theorem 6.11 for n $\mathcal{F}$-measurable functions $f_i: \Omega \to R$ (i = 1, 2, ..., n).

2. Find the 2-dimensional distribution function F(x, y) which generates the measure $\mu_F$ such that $\mu_F(R)$ is $1/\sqrt{2}$ (length of diagonal D in R) for any rectangle R, where D is the segment joining (0, 0) to (1, 1). Calculate the marginal distributions of $\mu_F$, and show that $\mu_F$ is not a product measure.

3. If $(\Omega, \mathcal{F}, \mu)$ is a complete $\sigma$-finite measure space, and the outer measure $\mu^*$ is defined by (6.7.2), show that a set E is $\mu^*$-measurable if and only if it is in $\mathcal{F}$.

4. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $E_0$ is a subset of $\Omega$ such that, for $A_1, A_2 \in \mathcal{F}$,

$A_1 \cap E_0 = A_2 \cap E_0 \Rightarrow \mu(A_1) = \mu(A_2)$.

Prove that $E_0$ is thick in $\Omega$.

7

THE SPACE OF MEASURABLE FUNCTIONS

Throughout this chapter we will assume (unless stated otherwise) that $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space, and that the $\sigma$-field $\mathcal{F}$ is complete with respect to $\mu$. This implies that if $f: \Omega \to R^*$, $g: \Omega \to R^*$ are functions such that f is $\mathcal{F}$-measurable and f = g a.e., then g is also $\mathcal{F}$-measurable. Thus, if M is the class of functions $f: \Omega \to R^*$ which are $\mathcal{F}$-measurable, we say that $f_1$, $f_2$ in M are equivalent if $f_1 = f_2$ a.e. This clearly defines an equivalence relation in M and we can form the space $\mathcal{M}$ of equivalence classes with respect to this relation. When we think of a function f of M as an element of $\mathcal{M}$ we are really thinking of f as a representative of the class of $\mathcal{F}$-measurable functions which are equal to f a.e. As is usual we will use the same notation f for an element of M and $\mathcal{M}$.

We can think of M or $\mathcal{M}$ as an abstract space, and the definition of convergence, if given in terms of a metric, will then impose a topological structure on the space. We will consider several such notions of convergence of which some, but not all, can be expressed in terms of a metric in $\mathcal{M}$. We will obtain the relationships between different notions of convergence, and in each case prove that the space is complete in the sense that for any Cauchy sequence there is a limit function to which the sequence converges. The main strategy used to prove completeness will be to find a suitable subsequence of the given sequence which clearly converges to a limit f and then show that f is a limit of the whole sequence. This extends the method used in § 2.2 to show that R is complete.

7.1 Point-wise convergence

Given a sequence $\{f_n\}$ of functions where $f_n: E \to R^*$ and a function $f: E \to R^*$ ($E \subset \Omega$), we say that $f_n$ converges to f point-wise on E if, for each x in E, $f_n(x) \to f(x)$ as $n \to \infty$. This notion has a meaning if we restrict consideration to $\mathcal{M}$. If E is such that $\mu(\Omega - E) = 0$, and $f_n \to f$ point-wise on E, then we say that $f_n \to f$ a.e. For if $f_n \to f$ a.e., $f_n = g_n$ a.e. for each n, and $g_n \to g$ a.e., then

$\{x : f(x) \ne g(x)\} \subset \{x : f_n(x) \not\to f(x)\} \cup \{x : g_n(x) \not\to g(x)\} \cup \bigcup_{n=1}^{\infty} \{x : f_n(x) \ne g_n(x)\}$,

and each of these sets has zero measure so f(x) = g(x) a.e., which means that f = g in $\mathcal{M}$.

$\{f_n\}$ is a Cauchy sequence (point-wise) on E if, given $x \in E$, $\epsilon > 0$, there is an integer N such that

$|f_n(x) - f_m(x)| < \epsilon$ for $n, m \ge N$. (7.1.1)

(This has meaning only if $f_n: E \to R$ is finite valued.) Because R is complete it is clear that if $\{f_n\}$ is a Cauchy sequence on E, there must be an $f: E \to R$ such that $f_n \to f$ point-wise on E.

Uniform convergence

If the sequence $\{f_n\}$ and the function f are finite valued functions on E to R, we say that $f_n$ converges uniformly to f on E if for each $\epsilon > 0$, there is an integer N such that

$x \in E$, $n \ge N \Rightarrow |f_n(x) - f(x)| < \epsilon$.

Similarly, we say that the sequence is a Cauchy sequence uniformly on E if given $\epsilon > 0$, a single integer N can be chosen so that (7.1.1) is satisfied for all $x \in E$. Since a Cauchy sequence uniformly on E is certainly a Cauchy sequence on E and the existence of $\lim_{m \to \infty} f_m(x) = f(x)$ follows for each x, we can let $m \to \infty$ in (7.1.1) to deduce that a Cauchy sequence uniformly on E must have a limit function $f: E \to R$ such that $f_n \to f$ uniformly on E.

If $\mu(\Omega - E) = 0$ and $f_n \to f$ uniformly on E, then we say that $f_n \to f$ uniformly a.e. All these notions have a meaning for functions which need not be measurable. However, the notion of convergence uniformly a.e. can be expressed in terms of a metric on the restricted class of measurable functions.

Essentially bounded functions

An $\mathcal{F}$-measurable function $f: \Omega \to R^*$ is said to be essentially bounded if $\mu\{x : |f(x)| > a\} = 0$ for some real number a. In this case we define the essential supremum of f by

ess sup $|f| = \inf\{a : \mu\{x : |f(x)| > a\} = 0\}$.
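The difference between the supremum and the essential supremum shows up as soon as the measure has a null set on which the function is large. The sketch below assumes a hypothetical four-point measure space in which the point "d" carries no mass; because the space is finite and has positive total mass, the infimum in the definition is one of the values |f(x)|, so only those candidates are examined.

```python
from fractions import Fraction

# Hypothetical measure on four points; the point "d" is a null atom.
mass = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
f = {"a": -1, "b": 2, "c": -3, "d": 100}

def measure_exceeding(a):
    """mu{x: |f(x)| > a}."""
    return sum((m for x, m in mass.items() if abs(f[x]) > a), Fraction(0))

def ess_sup():
    """inf{a: mu{x: |f(x)| > a} = 0}, searched among the values |f(x)|."""
    candidates = sorted({abs(v) for v in f.values()})
    return min(a for a in candidates if measure_exceeding(a) == 0)

def plain_sup():
    """sup |f(x)| over every point, null or not."""
    return max(abs(v) for v in f.values())
```

With these data `ess_sup()` is 3 while `plain_sup()` is 100: the large value at the null point "d" is ignored by the essential supremum.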

Notice that, if ess sup $|f| = C$, then

$E = \{x : |f(x)| > C\} = \bigcup_{k=1}^{\infty} \{x : |f(x)| > C + 1/k\}$

so that $\mu(E) = 0$ and $|f(x)| \le C$ outside E. Thus $|f(x)| \le C$ a.e., and if we define

$f^*(x) = f(x)$ if $|f(x)| \le C$,
$f^*(x) = 0$ if $|f(x)| > C$,

then $|f^*(x)| \le C$ for all x and $f^* = f$ a.e. Further $\{x : |f^*(x)| > C - \epsilon\}$ has positive measure for all $\epsilon > 0$, so that it is non-empty and we must have $\sup |f^*| = C$.

It is clear that, if f = g a.e., then ess sup $|f|$ = ess sup $|g|$, so that we can think of ess sup as a functional on the subset $L_\infty$ of the essentially bounded functions of $\mathcal{M}$. If we define $(\alpha f + \beta g)$ by

$(\alpha f + \beta g)(x) = \alpha f(x) + \beta g(x)$ when $f(x), g(x) \in R$,
$(\alpha f + \beta g)(x) = 0$ otherwise;

it is clear that $(\alpha f + \beta g) \in L_\infty$ if $f, g \in L_\infty$ for any $\alpha, \beta \in R$, so that $L_\infty$ is a linear subspace of $\mathcal{M}$ (over the reals). Further

$\rho_\infty(f, g)$ = ess sup $|f - g|$

defines a metric in $L_\infty$, for

(i) $\rho_\infty(f, g) = \rho_\infty(g, f)$;

(ii) $\rho_\infty(f, g) = 0$ if and only if f = g a.e.;

(iii) ess sup $|f + g| \le$ ess sup $|f|$ + ess sup $|g|$, so that $\rho_\infty(f, g) \le \rho_\infty(f, h) + \rho_\infty(h, g)$.

Now it is clear that, if $\{f_n\}$ and f are functions in $L_\infty$ such that $f_n \to f$ uniformly a.e., then $\rho_\infty(f_n, f) \to 0$ as $n \to \infty$. Conversely suppose $\rho_\infty(f_n, f) \to 0$, and let $E_n$ be a set of $\mathcal{F}$ with $\mu(E_n) = 0$ and

ess sup $|f_n - f| = \sup_{x \in \Omega - E_n} |f_n(x) - f(x)|$.

Put $E = \bigcup E_n$, then for $x \in \Omega - E$

$|f_n(x) - f(x)| \le \sup_{x \in \Omega - E_n} |f_n(x) - f(x)|$ = ess sup $|f_n - f|$

so that $f_n \to f$ uniformly on $\Omega - E$ and $\mu(E) = 0$. A similar, but slightly more complicated argument shows that, in $L_\infty$, a Cauchy sequence uniformly a.e. is the same as a Cauchy sequence in $\rho_\infty$ norm.

Almost uniform convergence

Given functions $f_n: E \to R^*$ (n = 1, 2, ...) and $f: E \to R^*$ each of which is finite a.e. on E we say that $f_n$ converges almost uniformly to f on E if, for each $\epsilon > 0$, there is a set $F_\epsilon \subset E$, $F_\epsilon \in \mathcal{F}$, $\mu(F_\epsilon) < \epsilon$ such that $f_n \to f$ uniformly on $(E - F_\epsilon)$. The example E = [0, 1] $\subset$ R, $f_n(x) = x^n$, $\mu$ Lebesgue measure shows that it is possible for a sequence to converge almost uniformly on E while it does not converge uniformly a.e. on E. However, it is immediate from the definitions that convergence uniformly a.e. implies almost uniform convergence. What is more surprising is that, under suitable conditions, convergence a.e. implies almost uniform convergence.
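The example $f_n(x) = x^n$ on [0, 1] can be made quantitative. The sketch below assumes the exceptional set $(1 - \epsilon/2, 1]$, of measure $\epsilon/2 < \epsilon$, and uses the fact that $x^n$ is increasing in x; the function names are illustrative. It also records why no null set can rescue uniform convergence: the set where $x^n$ exceeds a fixed level always has positive measure.

```python
def sup_outside(n, eps):
    """sup of |x**n - 0| over [0, 1 - eps/2], attained at the right end point."""
    return (1.0 - eps / 2.0) ** n

def first_index_below(tolerance, eps):
    """Smallest n for which x**n < tolerance on the whole of [0, 1 - eps/2]."""
    n = 1
    while sup_outside(n, eps) >= tolerance:
        n += 1
    return n

def measure_where_large(n, level):
    """Lebesgue measure of {x in [0, 1]: x**n > level}, for 0 < level < 1."""
    return 1.0 - level ** (1.0 / n)
```

For each fixed `eps` the supremum `sup_outside(n, eps)` tends to 0, which is almost uniform convergence to 0, while `measure_where_large(n, 0.5)` is positive for every n, so that the convergence is not uniform off any set of measure zero.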

Theorem 7.1 (Egoroff). Suppose $E \in \mathcal{F}$, $\mu(E) < \infty$, and $\{f_n\}$ is a sequence of measurable functions on $E \to R^*$ which are finite a.e. and converge a.e. to a function $f: E \to R^*$ which is also finite a.e. Then $f_n \to f$ almost uniformly in E.

Proof. By omitting a subset of E of zero measure, we may assume that all the functions $f_n$ and f are finite and that

$f_n(x) \to f(x)$ for all $x \in E$.

For positive integers m, n put

$A_n^m = \bigcap_{i=n}^{\infty} \{x : |f_i(x) - f(x)| < 1/m\}$.

Then, for fixed m, $A_1^m, A_2^m, \ldots, A_n^m, \ldots$ is an increasing sequence of measurable sets converging to E. Since $\mu(E)$ is finite, by theorem 3.2 there is a positive integer $N_m = N_m(\epsilon)$ such that

$\mu(E - A_i^m) < \epsilon / 2^m$ for $i \ge N_m$.

If we put

$F_\epsilon = \bigcup_{m=1}^{\infty} (E - A_{N_m}^m)$

then $\mu(F_\epsilon) < \epsilon$. Further given $\delta > 0$ we can choose m so that $1/m < \delta$ and then $|f_i(x) - f(x)| < \delta$ for all $i \ge N_m$, $x \in (E - F_\epsilon)$, so that $f_n \to f$ uniformly on $(E - F_\epsilon)$.
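The construction in the proof can be followed step by step for $f_n(x) = x^n$ on E = [0, 1) with limit f = 0. In that case $A_n^m = [0, m^{-1/n})$, so everything is explicit. The sketch below assumes this example, truncates the union defining $F_\epsilon$ at a finite value of m, and uses hypothetical helper names.

```python
import math

def N_m(m, eps):
    """Smallest N with measure(E - A_N^m) = 1 - m**(-1/N) below eps / 2**m."""
    if m == 1:
        return 1                       # A_n^1 = [0, 1) already, so E - A_n^1 is empty
    bound = eps / 2 ** m
    return math.floor(math.log(m) / -math.log1p(-bound)) + 1

def left_end(m, N):
    """E - A_N^m = [m**(-1/N), 1): the points where x**N >= 1/m."""
    return m ** (-1.0 / N)

def exceptional_set(eps, max_m):
    """Left end c of F_eps = [c, 1), truncated to m <= max_m, and the chosen N_m."""
    choices = {m: N_m(m, eps) for m in range(1, max_m + 1)}
    c = min(left_end(m, N) for m, N in choices.items())
    return c, choices

c, choices = exceptional_set(0.1, 6)
```

With `eps = 0.1` the truncated set `[c, 1)` has measure `1 - c` below 0.1, and for each m up to 6 every x below c satisfies $x^i < 1/m$ for all $i \ge N_m$, which is the uniform estimate obtained at the end of the proof.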

Remark. The converse to theorem 7.1 is true and almost trivial. For if $\{f_n\}$, f are finite a.e. on E, measurable, and $f_n \to f$ almost uniformly, this means we can find sets $F_k$ with $\mu(F_k) < 1/k$ such that $f_n \to f$ uniformly on $(E - F_k)$ and so $f_n \to f$ point-wise on $(E - F_k)$. Put $F = \bigcap_{k=1}^{\infty} F_k$, then $\mu(F) = 0$ and $f_n \to f$ point-wise on (E - F) so that $f_n \to f$ a.e. on E.

Exercises 7.1

1. Let X be the space of positive integers, $\mathcal{F}$ the class of all subsets of X, and $\mu(E)$ the number of integers in $E \subset X$. If $f_n(x)$ is the indicator function of {1, 2, ..., n}, then $f_n(x) \to 1$ for all x. However, $f_n$ does not converge almost uniformly to 1, showing that theorem 7.1 is false without $\mu(E) < \infty$.

2. Suppose the conditions of theorem 7.1 are satisfied except that $\mu(E) = \infty$; show that given P > 0, there is a subset $F_P \subset E$ with $\mu(F_P) > P$ such that $f_n \to f$ uniformly on $F_P$, but that there need not be a subset F with $\mu(F) = +\infty$ with $f_n \to f$ uniformly on F.

3. Suppose $E \in \mathcal{F}$, $\mu(E) < \infty$, $f_n: E \to R^*$ (n = 1, 2, ...) is a Cauchy sequence a.e. of measurable functions each finite a.e. Prove there is a finite c and a measurable $F \subset E$ with $\mu(F) > 0$ such that, for every integer n, all $x \in F$, $|f_n(x)| < c$.

4. Suppose $E \in \mathcal{F}$, E has $\sigma$-finite measure, $f_n$ (n = 1, 2, ...) and f are finite a.e. on E and $f_n \to f$ a.e. on E. Show that there exists a sequence $\{E_i\}$ of sets in $\mathcal{F}$ such that $\mu(E - \bigcup_{i=1}^{\infty} E_i) = 0$ and $f_n \to f$ uniformly on each $E_i$. By considering the measure of example 2, §3.1, and a suitable sequence of functions show that the condition that E has $\sigma$-finite measure is essential.

5. In §4.4 we produced a sequence of sets $Q_i$ each of which was not Lebesgue measurable. If we put $f_n(x)$ = indicator function of

$[0, 1) - \bigcup_{i=1}^{n} Q_i$,

then $f_n(x) \to 0$ for all x in [0, 1]. Show that $f_n$ does not converge almost uniformly so that theorem 7.1 fails if the functions are not measurable.

6. Suppose $f_h: E \to R$, h > 0 is a continuous family of measurable functions, each finite valued, $\mu(E) < \infty$ and for each $x \in E$, $f_h(x) \to f(x)$ as $h \to 0$ where f is finite valued. Then if a continuous parameter version of Egoroff's theorem were valid we would have: given $\epsilon > 0$, there exists $F_\epsilon \subset E$, $\mu(F_\epsilon) < \epsilon$ such that $f_h(x) \to f(x)$ as $h \to 0$ uniformly on $(E - F_\epsilon)$. The following example shows that this extension is false. In Chapter 4, we saw that there is a non-measurable set $E \subset [0, 1)$ such that every point $x \in [0, 1)$ has a unique representation x = y + q (mod 1), $y \in E$, q rational.

Prove that, if M is a measurable subset of [0, 1] such that $M \cap E(r)$ is non-void for finitely many rationals r, then |M| = 0. Arrange the rationals Q as a sequence $\{q_n\}$. For $x \in [0, 1)$ let n(x) be the integer such that $x = y + q_{n(x)}$ (mod 1), $y \in E$. If $x/n(x) = 0.\alpha_1 \alpha_2 \ldots$ (decimal representation not ending in 9 recurring), put $\phi(x) = 0.\beta_1 \beta_2 \ldots$ where $\beta_{2k} = \alpha_k$ (k = 1, 2, ...); and $\beta_{2k-1} = 1$ for k = n(x), 0 otherwise. Put $f_h(x) = 1$ for $h = \phi(x)$, $f_h(x) = 0$ otherwise. Prove $f_h(x) \to 0$ as $h \to 0$ for each x. Show that if M is any measurable set with |M| > 0, then $f_h(x) \not\to 0$ uniformly on M.

7. Suppose $\{f_n\}$ is a sequence of functions in $\mathcal{M}$, $f_n \to f$ a.e. and there is an integrable function g such that $|f_n| \le g$ a.e. for all n. Show that $f_n \to f$ almost uniformly.

8. Define what is meant by saying that a sequence $\{f_n\}$ of a.e. finite valued functions is a Cauchy sequence almost uniformly, and show that this implies the existence of a limit function f such that $f_n \to f$ almost uniformly.


7.2 Convergence in measure

We now consider a different kind of `nearness' in $\mathcal{M}$ in which the measure of the set where two functions differ by more than a fixed positive number is relevant. This time we make the definitions relative to the whole space $\Omega$. Obvious changes give the corresponding concepts relative to a set E in $\mathcal{F}$.

Given $\mathcal{F}$-measurable functions $f: \Omega \to R^*$, $f_n: \Omega \to R^*$ (n = 1, 2, ...) we say that $f_n$ converges in measure ($\mu$) to f if, for each $\epsilon > 0$,

$\lim_{n \to \infty} \mu\{x : |f_n(x) - f(x)| \ge \epsilon\} = 0$.

Note that the definition only makes sense for functions in $\mathcal{M}$ which are finite a.e. We first see that the limit in measure is unique in $\mathcal{M}$. For suppose $f_n \to f$ in measure, $f_n \to g$ in measure; then if $\delta > 0$,

$\{x : |f(x) - g(x)| \ge \delta\} \subset \{x : |f_n(x) - f(x)| \ge \tfrac{1}{2}\delta\} \cup \{x : |f_n(x) - g(x)| \ge \tfrac{1}{2}\delta\}$

and both sets on the right can be made of arbitrarily small measure by choosing n large. This means that

$\mu\{x : |f(x) - g(x)| \ge \delta\} = 0$ for each $\delta > 0$,

and it follows that f = g a.e. (by taking a sequence $\delta_n$ decreasing to zero).

We say that the sequence $\{f_n\}$ of functions in $\mathcal{M}$ is a Cauchy sequence in measure if, given $\epsilon > 0$, $\delta > 0$ there is an integer N such that

$n \ge N$, $m \ge N \Rightarrow \mu\{x : |f_n(x) - f_m(x)| \ge \epsilon\} < \delta$.

The argument used to prove uniqueness of the limit also shows that

$f_n \to f$ in measure $\Rightarrow$ $\{f_n\}$ is a Cauchy sequence in measure.

The converse is included in the following theorem.

Theorem 7.2. Suppose f and $f_n$ (n = 1, 2, ...) are functions in $\mathcal{M}$ which are finite a.e. Then

(i) $f_n \to f$ almost uniformly $\Rightarrow$ $f_n \to f$ in measure;

(ii) $\{f_n\}$ is a Cauchy sequence almost uniformly $\Rightarrow$ $\{f_n\}$ is a Cauchy sequence in measure;

(iii) $\{f_n\}$ is a Cauchy sequence in measure $\Rightarrow$ there is a subsequence $\{n_k\}$ such that $\{f_{n_k}\}$ is a Cauchy sequence almost uniformly;

(iv) $\{f_n\}$ is a Cauchy sequence in measure $\Rightarrow$ there is a function $g \in \mathcal{M}$ such that $f_n \to g$ in measure.

Proof. (i) If $f_n \to f$ almost uniformly, for each $\epsilon > 0$, $\delta > 0$, we can find a set $E_\delta \in \mathcal{F}$ such that $\mu(E_\delta) < \delta$ and $f_n \to f$ uniformly on $(\Omega - E_\delta)$. Hence there is an N such that

$|f_n(x) - f(x)| < \epsilon$ for $n \ge N$, $x \in (\Omega - E_\delta)$

and then

$\mu\{x : |f_n(x) - f(x)| \ge \epsilon\} \le \mu(E_\delta) < \delta$ for $n \ge N$.

(ii) An argument similar to that in (i) will work.

(iii) Now suppose that $\{f_n\}$ is a Cauchy sequence in measure. For each positive integer k, choose an integer $m_k$ such that $m_k > m_{k-1}$ and

$n \ge m_k$, $m \ge m_k \Rightarrow \mu\{x : |f_n(x) - f_m(x)| \ge 2^{-k}\} < 2^{-k}$.

Put

$E_k = \{x : |f_{m_k}(x) - f_{m_{k+1}}(x)| \ge 2^{-k}\}$, $F_k = \bigcup_{i=k}^{\infty} E_i$.

Then

$\mu(F_k) \le \sum_{i=k}^{\infty} \mu(E_i) \le 2^{1-k}$.

Given $\epsilon > 0$ we can choose k so that $\epsilon > 2^{1-k}$; then $\mu(F_k) < \epsilon$ and for all $x \in (\Omega - F_k)$ we have

$|f_{m_i}(x) - f_{m_{i+1}}(x)| < 2^{-i}$ for all $i \ge k$.

Thus

$j > i \ge k \Rightarrow |f_{m_i}(x) - f_{m_j}(x)| \le \sum_{s=i}^{j-1} |f_{m_s}(x) - f_{m_{s+1}}(x)| < 2^{1-i}$

so that the sequence $f_{m_i}$ converges uniformly on $(\Omega - F_k)$; that is, since $\mu(F_k) \to 0$ as $k \to \infty$, it is a Cauchy sequence almost uniformly.

(iv) By (iii) we can obtain a subsequence $f_{m_k}$ of the given sequence which is a Cauchy sequence almost uniformly. This means we can find a function $g \in \mathcal{M}$ such that $f_{m_k} \to g$ almost uniformly as $k \to \infty$. Now, for $\epsilon > 0$,

$\{x : |f_n(x) - g(x)| \ge \epsilon\} \subset \{x : |f_n(x) - f_{m_k}(x)| \ge \tfrac{1}{2}\epsilon\} \cup \{x : |f_{m_k}(x) - g(x)| \ge \tfrac{1}{2}\epsilon\}$.

Given $\delta > 0$ we can find a set $E_\delta \in \mathcal{F}$ and integers $k_0$, N such that

$\mu(E_\delta) < \tfrac{1}{2}\delta$, $|f_{m_k}(x) - g(x)| < \tfrac{1}{2}\epsilon$ for $k \ge k_0$, $x \in \Omega - E_\delta$,

and

$\mu\{x : |f_n(x) - f_{m_k}(x)| \ge \tfrac{1}{2}\epsilon\} < \tfrac{1}{2}\delta$ for $n \ge N$, $m_k \ge N$.

It follows that

$n \ge \max\{N, m_{k_0}\} \Rightarrow \mu\{x : |f_n(x) - g(x)| \ge \epsilon\} < \delta$.

It is not difficult to see that convergence in measure does not necessarily imply convergence point-wise at any point, and so it certainly cannot imply almost uniform convergence of the whole sequence. For let

$E_{r,k} = \left[\frac{r-1}{2^k}, \frac{r}{2^k}\right]$ ($r = 1, 2, \ldots, 2^k$; k = 1, 2, ...),

and arrange these intervals as a single sequence of sets $\{F_n\}$ by taking first those for which k = 1, then those with k = 2, etc. If $\mu$ denotes Lebesgue measure on [0, 1], and $f_n(x)$ is the indicator function of $F_n$, then, for $0 < \epsilon \le 1$,

$\{x : |f_n(x)| \ge \epsilon\} = F_n$

so that, for any $\epsilon > 0$, $\mu\{x : |f_n(x)| \ge \epsilon\} \le \mu(F_n) \to 0$. This means that $f_n \to 0$ in measure in [0, 1]. However, at no point x in [0, 1] does $f_n(x) \to 0$; in fact, since every x is in infinitely many of the sets $F_n$ and infinitely many of the sets $(\Omega - F_n)$ we have

$\liminf f_n(x) = 0$, $\limsup f_n(x) = 1$ for all $x \in [0, 1]$.
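The behaviour of this sequence of sliding intervals is easy to tabulate exactly. The following sketch assumes rational sample points and a finite number of blocks, with hypothetical function names; it lists the intervals $E_{r,k}$ block by block and records which values $f_n(x)$ takes within each block.

```python
from fractions import Fraction

def sliding_intervals(max_k):
    """The intervals E_{r,k} = [(r-1)/2^k, r/2^k], listed block by block in k."""
    intervals = []
    for k in range(1, max_k + 1):
        for r in range(1, 2 ** k + 1):
            intervals.append((k, Fraction(r - 1, 2 ** k), Fraction(r, 2 ** k)))
    return intervals

def indicator(interval, x):
    """Value at x of the indicator function of the closed interval."""
    _, left, right = interval
    return 1 if left <= x <= right else 0

def values_in_block(x, k):
    """The set of values f_n(x) takes while n runs through the k-th block."""
    return {indicator((k, Fraction(r - 1, 2 ** k), Fraction(r, 2 ** k)), x)
            for r in range(1, 2 ** k + 1)}
```

For each sample point x and each block k >= 2, `values_in_block(x, k)` is `{0, 1}`, so the sequence $f_n(x)$ oscillates for ever, while the measures of the intervals in block k are all $2^{-k}$ and tend to zero. Taking only the first interval of each block, $[0, 2^{-k}]$, gives a subsequence which converges to 0 at every x > 0, in line with part (iii) of theorem 7.2.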

Exercises 7.2

1. Suppose $\{f_n\}$ is a Cauchy sequence in measure, and $f_{n_i}$, $f_{m_i}$ are two subsequences which converge to f, g, respectively. Prove that f = g a.e.

2. Show that if $\{f_n\}$ is a Cauchy sequence in measure then every subsequence of $\{f_n\}$ is also a Cauchy sequence in measure.

3. If $\Omega$ is the set of positive integers and $\mu$ is the counting measure on the class $\mathcal{F}$ of all subsets, show that convergence in measure is equivalent to uniform convergence.

4. If $\mu(\Omega) = \infty$ can we say that convergence a.e. implies convergence in measure?

5. Suppose $\{A_n\}$ is a sequence of sets in $\mathcal{F}$, $\chi_n$ is the indicator function of $A_n$, and $d(A, B) = \mu(A \triangle B)$ for $A, B \in \mathcal{F}$. Show that $\{\chi_n\}$ is a Cauchy sequence in measure if and only if $d(A_n, A_m) \to 0$ as $n, m \to \infty$.

6. Suppose $\{f_n\}$ is a sequence of functions of M which are finite a.e. and $f_n \to f$ a.e. with f finite a.e. Show that, if (i) $\mu(\Omega) < \infty$, or (ii) $|f_n| \le g_0$ for all n where $g_0$ is integrable; then $f_n \to f$ in measure.

7. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $\{f_n\}$, $\{g_n\}$ are finite valued $\mathcal{F}$-measurable functions which converge in measure to f, g respectively. Show

(i) $|f_n|$ converges in measure to $|f|$;
(ii) for all real $\alpha$, $\beta$ the sequence $\{\alpha f_n + \beta g_n\}$ converges in measure to $(\alpha f + \beta g)$;
(iii) if f = 0 a.e., then $f_n^2$ converges in measure to $f^2$;
(iv) the sequence $\{f_n g\}$ converges in measure to fg;
(v) the sequence $\{f_n^2\}$ converges in measure to $f^2$;
(vi) the sequence $\{f_n g_n\}$ converges in measure to fg;
(vii) if $f_n \ne 0$ a.e. for all n, $f \ne 0$ a.e., the sequence $\{1/f_n\}$ converges in measure to 1/f.

Is the condition $\mu(\Omega) < \infty$ essential for all these results?

7.3 Convergence in pth mean

All the definitions of the present section can be made relative to an arbitrary $E$ in $\mathcal{F}$. Since we could restrict $\mu$ to the $\sigma$-field $\mathcal{F} \cap E$ of subsets of $E$, there is no loss in generality in making our definitions in terms of $\Omega$, the whole space. In Chapter 5 we saw that $f \in \mathcal{M}$ is $\mu$-integrable (over $\Omega$) if and only if $|f|$ is $\mu$-integrable. Further we saw that the subset $L_1$ of $M$ consisting of $\mu$-integrable functions is a linear space (here we define $(\alpha f + \beta g)(x)$ arbitrarily on the set of zero measure where it is not defined because it involves $+\infty + (-\infty)$). Further for $f, g \in L_1$,

$$\rho_1(f, g) = \int |f - g| \, d\mu$$

is finite. By theorem 5.6, $\rho_1(f, g) = 0$ if and only if $f = g$ a.e., so that if we take equivalence classes of functions equal a.e. to form the linear space $\mathcal{L}_1 \subset \mathcal{M}$ we see that $\rho_1(f, g) = \rho_1(g, f)$ for all $f, g \in \mathcal{L}_1$, and $\rho_1(f, g) = 0$ if and only if $f = g$ in $\mathcal{L}_1$.

The triangle inequality

$$\rho_1(f, h) \le \rho_1(f, g) + \rho_1(g, h) \quad \text{for } f, g, h \in \mathcal{L}_1$$

also follows by integrating

$$|f(x) - h(x)| \le |f(x) - g(x)| + |g(x) - h(x)|,$$

so that $\rho_1$ defines a metric in the space $\mathcal{L}_1$.

Convergence in mean

A sequence $\{f_n\}$ of functions in $L_1$ (or in $\mathcal{L}_1$) is said to converge in mean to a function $f$ in $L_1$ if $\rho_1(f_n, f) \to 0$ as $n \to \infty$. A sequence $\{f_n\}$ of functions in $L_1$ is a Cauchy sequence in mean if $\rho_1(f_n, f_m) \to 0$ as $n, m \to \infty$. Convergence in mean is the special case $p = 1$ of convergence in pth mean. Since most of the proofs are the same for $p = 1$ and $p > 1$, it is convenient to consider this at the same time.


The class $L_p$

For $p \ge 1$, a function $f$ in $M$ is said to be of class $L_p$ if $|f|^p$ is $\mu$-integrable. Since

$$|f(x) + g(x)| \le \begin{cases} 2|f(x)|, & \text{if } |f(x)| \ge |g(x)|, \\ 2|g(x)|, & \text{if } |g(x)| \ge |f(x)|; \end{cases}$$

we have, for all $x$,

$$|f(x) + g(x)|^p \le 2^p \{|f(x)|^p + |g(x)|^p\}. \quad (7.3.1)$$

Thus, if $f, g \in L_p$ we must have $(f \pm g) \in L_p$. With the usual convention about the set of zero measure where $(\alpha f + \beta g)$ may not be defined, it follows that $L_p$ is a linear space. For $f, g \in L_p$ we define

$$\rho_p(f, g) = \left\{\int |f - g|^p \, d\mu\right\}^{1/p}$$

and notice again that $\rho_p(f, g) = 0$ if and only if $f = g$ a.e. so that in the space $\mathcal{L}_p \subset \mathcal{M}$ of equivalence classes we have $\rho_p(f, g) = \rho_p(g, f)$, and $\rho_p(f, g) = 0$ if and only if $f = g$ in $\mathcal{L}_p$.

We will prove in the next section that $\rho_p$ satisfies the triangle inequality, which shows that it is a metric in $\mathcal{L}_p$. However, we can now define:

Convergence in pth mean

A sequence $\{f_n\}$ of functions in $L_p$ (or in $\mathcal{L}_p$) is said to converge in pth mean to a function $f$ in $L_p$ if $\rho_p(f_n, f) \to 0$ as $n \to \infty$. A sequence $\{f_n\}$ of functions in $L_p$ is a Cauchy sequence in pth mean if $\rho_p(f_n, f_m) \to 0$ as $n, m \to \infty$. It is immediate, by (7.3.1), that convergence in pth mean to a function implies that we have a Cauchy sequence in pth mean. Completeness for this type of convergence can now be proved.

Theorem 7.3. For $p \ge 1$, if $\{f_n\}$ is a sequence of functions in $L_p$ which is a Cauchy sequence in pth mean, then there is an $f$ in $L_p$ such that $f_n \to f$ in pth mean.

Proof. We again use the device of obtaining a subsequence which will converge a.e. to $f$. For any $\epsilon > 0$, let $N(\epsilon)$ denote an integer such that

$$\int |f_r - f_s|^p \, d\mu < \epsilon^{p+1} \quad \text{for } r, s \ge N(\epsilon).$$


Put $N_k = N(\epsilon 2^{-k})$, and assume that $N_{k+1} > N_k$ for each integer $k$. Then $\mu(E(\epsilon, r, s)) < \epsilon$ for $r, s \ge N(\epsilon)$, where

$$E(\epsilon, r, s) = \{x: |f_r - f_s| \ge \epsilon\}.$$

If we put

$$E_k = E(\epsilon 2^{-k}, N_{k+1}, N_k), \quad F_k = \bigcup_{i=k}^{\infty} E_i,$$

we have $\mu(E_k) < 2^{-k}\epsilon$, $\mu(F_k) < 2^{1-k}\epsilon$, and if $x$ is not in $F_k$,

$$|f_{N_{i+1}}(x) - f_{N_i}(x)| < \epsilon 2^{-i} \quad \text{for all } i \ge k.$$

Hence the series $\sum_{i=1}^{\infty} (f_{N_{i+1}} - f_{N_i})$ converges outside $F = \bigcap_{k=1}^{\infty} F_k$ and $\mu(F) = 0$. Suppose then that $f_{N_i} \to f$ a.e. For a fixed integer $r$, if we put $g_i = |f_{N_i} - f_r|^p$, $g = |f - f_r|^p$ we obtain a sequence $g_i$ of non-negative measurable functions with $\liminf g_i = \lim g_i = g$ a.e. By theorem 5.7 (Fatou) we have

$$\int g \, d\mu \le \liminf \int |f_{N_i} - f_r|^p \, d\mu < \epsilon \quad \text{if } r \ge N(\epsilon).$$

Hence, $g$ is integrable, so that $(f - f_r) \in L_p$ which implies that $f \in L_p$. We have also proved that

$$\int |f - f_r|^p \, d\mu < \epsilon \quad \text{if } r \ge N(\epsilon)$$

so that $f_r \to f$ in pth mean. ]

It is worth remarking at this stage that the theorem corresponding to theorem 7.3 for Riemann integrals over a finite interval is false. It is not difficult to construct an example of a sequence of functions whose pth powers are Riemann integrable and which Cauchy converges in pth mean, but for which the limit is necessarily discontinuous on a set of positive measure and so cannot be Riemann integrable by theorem 5.9 (see exercise 7.3 (10)). Thus theorem 7.3 exhibits another way in which the Lebesgue integral is a big improvement on the Riemann integral. We now relate convergence in pth mean to convergence in measure.

Theorem 7.4. If $\{f_n\}$ is a sequence of functions of $L_p$ ($p \ge 1$) which is a Cauchy sequence in pth mean then $\{f_n\}$ is a Cauchy sequence in measure. If $f_n \to f$ in pth mean, then $f_n \to f$ in measure.

Proof. For any $h$ in $L_p$, $\eta > 0$,

$$\mu\{x: |h(x)| \ge \eta^{1/2p}\} \ge \eta^{1/2} \Rightarrow \int |h|^p \, d\mu \ge \eta.$$


If $\{f_n\}$ is not a Cauchy sequence in measure, then there is an $\epsilon > 0$, $\delta > 0$ for which

$$\mu\{x: |f_n(x) - f_m(x)| \ge \epsilon\} \ge \delta$$

for infinitely many $n$, $m$. If now $\eta > 0$ is small enough to ensure that $\epsilon \ge \eta^{1/2p}$, $\delta \ge \eta^{1/2}$ we have

$$\int |f_n(x) - f_m(x)|^p \, d\mu \ge \eta > 0$$

for infinitely many $n$, $m$ so that $\{f_n\}$ is not a Cauchy sequence in pth mean. This proves the first statement: the second part of the theorem is proved similarly. ]

Remark. The example after theorem 7.2 shows that $\{f_n\}$ may converge in pth mean but not converge a.e., though theorems 7.2, 7.3 together show that there must be a subsequence $\{f_{n_i}\}$ which converges a.e. If we consider Lebesgue measure in $R$ and put

$$f_n(x) = \begin{cases} n^{-1/p} & \text{for } x \text{ in } [0, n], \\ 0 & \text{otherwise,} \end{cases} \qquad g_n(x) = \begin{cases} n^{1/p} & \text{for } x \text{ in } [0, 1/n], \\ 0 & \text{otherwise,} \end{cases}$$

we see that $f_n \to 0$ uniformly (and therefore almost uniformly, a.e., and in measure) but not in pth mean. If $\Omega = [0, 1]$, then $g_n \to 0$ almost uniformly, a.e. and in measure, but not in pth mean so that even in a finite measure space we cannot deduce convergence in mean from other types of convergence without some additional condition, even if the functions concerned are all in $\mathcal{L}_p$. The next definition turns out to be appropriate:

Set functions equicontinuous at $\emptyset$

Suppose $\nu_i$ ($i \in I$) is a family of set functions defined on a $\sigma$-field $\mathcal{F}$. The family is said to be equicontinuous at $\emptyset$ if, given $\epsilon > 0$ and any sequence $\{B_n\}$ of sets of $\mathcal{F}$ which decreases to $\emptyset$, there is an integer $N$ such that $|\nu_i(B_n)| < \epsilon$ for all $i \in I$, $n \ge N$.

In § 6.4 we saw that a set function $\nu$ was absolutely continuous with respect to $\mu$ if, given $\epsilon > 0$ there is a $\delta > 0$ such that, for $E \in \mathcal{F}$,

$$\mu(E) < \delta \Rightarrow |\nu(E)| < \epsilon;$$

and that this condition was also necessary if $\nu$ was a finite valued measure. This makes the following definition reasonable:


Uniform absolute continuity

Any family $\nu_i$ ($i \in I$) of set functions defined on $\mathcal{F}$ is said to be uniformly absolutely continuous with respect to $\mu$ if, given $\epsilon > 0$ there is a $\delta > 0$ such that, for $E \in \mathcal{F}$, $\mu(E) < \delta \Rightarrow |\nu_i(E)| < \epsilon$ for all $i$.

To see what this condition means, suppose $\nu_i$ ($i \in I$) is a family of measures each of which is absolutely continuous with respect to $\mu$, but such that the family is not uniformly absolutely continuous. Then there is an $\epsilon > 0$, and a sequence $\{B_n\}$ of sets of $\mathcal{F}$ with indices $\{i_n\}$ such that $\mu(B_n) < 2^{-n}$, $\nu_{i_n}(B_n) \ge \epsilon$. Put

$$A_k = \bigcup_{n=k}^{\infty} B_n, \quad C = \lim_{k \to \infty} A_k.$$

Then $\mu(C) = 0$ and $\lim (A_k - C) = \emptyset$. It follows that

$$\nu_{i_k}(A_k - C) = \nu_{i_k}(A_k) \ge \nu_{i_k}(B_k) \ge \epsilon > 0$$

so that, by considering the sequence $\{A_k - C\}$ which decreases to $\emptyset$, we see that the family $\nu_i$ ($i \in I$) is not equicontinuous at $\emptyset$. Thus we have proved

Lemma. Suppose $\nu_i$ ($i \in I$) is a family of measures on $\mathcal{F}$ each of which is absolutely continuous w.r.t. $\mu$. Then if the family is equicontinuous at $\emptyset$, it is uniformly absolutely continuous w.r.t. $\mu$.

Theorem 7.5. Suppose $\{f_n\}$ is a sequence of functions of $L_p$, and

$$\nu_n(E) = \int_E |f_n|^p \, d\mu \quad (E \in \mathcal{F},\ n = 1, 2, \ldots).$$

(i) $\{f_n\}$ is a Cauchy sequence in pth mean if and only if $\{f_n\}$ is a Cauchy sequence in measure and the family $\{\nu_n\}$ of measures is equicontinuous at $\emptyset$.
(ii) The sequence $\{f_n\}$ converges to $f$ in pth mean if $f_n$ converges to $f$ in measure and $\{\nu_n\}$ is equicontinuous at $\emptyset$.

Proof. (i) Suppose first that $\{f_n\}$ is a Cauchy sequence in pth mean. Then by theorem 7.4 $\{f_n\}$ is a Cauchy sequence in measure. For each $\epsilon > 0$, there is an $N$ such that

$$\int |f_n - f_N|^p \, d\mu < \frac{\epsilon}{2^{p+1}} \quad \text{for } n \ge N.$$

Now suppose $\{B_k\}$ is a sequence of sets of $\mathcal{F}$ decreasing to $\emptyset$. Since $\nu_n$ is absolutely continuous for $n = 1, 2, \ldots, N$ we can find, by theorem 5.6, an integer $k_0$ such that

$$\int_{B_k} |f_n|^p \, d\mu < \frac{\epsilon}{2^{p+1}} \quad \text{for } k \ge k_0 \ (n = 1, 2, \ldots, N).$$


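By (7.3.1) we obtain, for $n > N$, $k \ge k_0$,

$$\int_{B_k} |f_n|^p \, d\mu \le 2^p \int_{B_k} |f_N|^p \, d\mu + 2^p \int_{B_k} |f_n - f_N|^p \, d\mu < \tfrac{1}{2}\epsilon + 2^p \int |f_n - f_N|^p \, d\mu < \epsilon,$$

so that the sequence $\{\nu_n\}$ is equicontinuous at $\emptyset$.

In the other direction, since we assume that $\mu$ is $\sigma$-finite on $\Omega$, there must be a sequence $\{E_n\}$ in $\mathcal{F}$ which decreases to $\emptyset$ and is such that $\mu(\Omega - E_n)$ is finite for all $n$. Given $\epsilon > 0$, the equicontinuity of $\nu_n$ now ensures that there is a set $E = E_N$ with $\mu(\Omega - E) < \infty$ and

$$\int_E |f_n|^p \, d\mu < \frac{\epsilon}{2^{p+2}} \quad \text{for all } n.$$

Thus, for all $m$, $n$, by (7.3.1)

$$\int_E |f_n - f_m|^p \, d\mu < \tfrac{1}{2}\epsilon. \quad (7.3.2)$$

Now put $\Omega - E = F$, $\mu(F) = \lambda$. By the lemma, the sequence $\{\nu_n\}$ of measures must be uniformly absolutely continuous. We can therefore find an $\eta > 0$ such that, for $B \in \mathcal{F}$, $\mu(B) < \eta$,

$$\nu_n(B) = \int_B |f_n|^p \, d\mu < \frac{\epsilon}{2^{p+3}}. \quad (7.3.3)$$

For each $m$, $n$ put

$$C_{m,n} = \left\{x: |f_m(x) - f_n(x)| \ge \left(\frac{\epsilon}{4\lambda}\right)^{1/p}\right\}.$$

Then

$$\int_{F - C_{m,n}} |f_m - f_n|^p \, d\mu \le \frac{\epsilon}{4\lambda}\,\mu(F - C_{m,n}) \le \frac{\epsilon}{4\lambda}\,\mu(F) = \tfrac{1}{4}\epsilon.$$

Since $\{f_n\}$ is a Cauchy sequence in measure we can find an $n_0$ such that $\mu(C_{m,n}) < \eta$ for $m \ge n_0$, $n \ge n_0$. This gives, by (7.3.3) and (7.3.1),

$$\int_{C_{m,n}} |f_m - f_n|^p \, d\mu < \tfrac{1}{4}\epsilon$$

so that

$$\int_F |f_m - f_n|^p \, d\mu < \tfrac{1}{2}\epsilon \quad \text{for } m, n \ge n_0.$$

This, together with (7.3.2) gives

$$\int |f_m - f_n|^p \, d\mu < \epsilon \quad \text{for } m, n \ge n_0.$$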


(ii) If $f_n \to f$ in measure, then $\{f_n\}$ is a Cauchy sequence in measure so that by (i) the condition that $\{\nu_n\}$ is equicontinuous implies that $\{f_n\}$ is a Cauchy sequence in pth mean. By theorem 7.3, there exists a $g \in L_p$ such that $f_n \to g$ in pth mean. By theorem 7.4 (i), $f_n \to g$ in measure so that we have $f = g$ a.e. and it follows that $f_n \to f$ in pth mean. ]

We can now slightly strengthen the dominated convergence theorem (5.8).

Theorem 7.6. Suppose $p \ge 1$, and $\{f_n\}$ is a sequence of measurable functions with $|f_n|^p \le h \in L_1$ for each $n$. If either $f_n \to f_0$ in measure or $f_n \to f_0$ a.e., then $f_n \to f_0$ in pth mean.

Proof. We must have

$$\nu_n(E) = \int_E |f_n|^p \, d\mu \le \int_E h \, d\mu,$$

so that the family $\{\nu_n\}$ is equicontinuous at $\emptyset$ by theorem 5.6. If $f_n \to f_0$ in measure we can apply theorem 7.5 (ii) to obtain the result. On the other hand, in the proof of theorem 7.5 (i) we only use convergence in measure on the subset $F$ of $\Omega$ with $\mu(F)$ finite. On $F$, $f_n \to f_0$ a.e. implies $f_n \to f_0$ in measure by theorems 7.1, 7.2, so that the condition $f_n \to f_0$ a.e., together with equicontinuity at $\emptyset$ of $\{\nu_n\}$, implies convergence in pth mean of $\{f_n\}$. ]

We have now defined convergence to a limit for sequences of functions in several different ways, and have proved completeness in each case. It may help to summarise the relationships by a number of diagrams (Figures 2 to 4). In each of these an arrow from A to B indicates that convergence in sense A implies convergence in sense B. The lack of an arrow from A to B indicates that there is an example of a sequence satisfying the stated conditions which converges in sense A, but not in sense B. We assume throughout that we are considering functions in $M$ which are a.e. finite.

[Fig. 2. No additional conditions. Diagram of the implications between uniform, almost uniform, pointwise, pointwise a.e., in measure and pth mean convergence.]

[Fig. 3. $\mu(\Omega) < \infty$. Diagram of the implications between uniform, almost uniform, a.e., in measure and pth mean convergence.]

[Fig. 4. $|f_n|^p \le h$ a.e., Diagram of the implications between uniform, almost uniform, a.e., in measure and pth mean convergence.]

Exercises 7.3

1. Check Figures 2, 3, 4, stating in each case the theorem or theorems which justifies A $\to$ B, or the example which justifies the exclusion of A $\to$ B.

2. Show that, if $\mu(\Omega) < \infty$, then the condition in theorem 7.5 that $\{\nu_n\}$ be equicontinuous at $\emptyset$ can be replaced by a condition that $\{\nu_n\}$ is uniformly absolutely continuous.

3. Show that if $\{f_n\}$ Cauchy converges uniformly a.e. and each $f_n$ is integrable over $E$ with $\mu(E)$ finite, then $f(x) = \lim f_n(x)$ a.e. is integrable over $E$ and

$$\int_E |f_n - f| \, d\mu \to 0 \quad \text{as } n \to \infty.$$

4. Suppose $\Omega$ is the set of positive integers and $\mu$ is the counting measure. Then
(i) If

$$f_n(k) = \begin{cases} 1/n & \text{for } 1 \le k \le n, \\ 0 & \text{for } k > n, \end{cases}$$

show that $f_n(x) \to 0$ uniformly on $\Omega$, each $f_n$ is integrable but $\int f_n \, d\mu \not\to 0$. This shows $\mu(E) < \infty$ is essential in question (3).
(ii) For the same sequence $\{f_n\}$ show that $\nu_n(E) = \int_E f_n \, d\mu$ is uniformly absolutely continuous, but not equicontinuous at $\emptyset$. This shows that the condition $\mu(\Omega) < \infty$ is essential in question (2).
(iii) If

$$f_n(k) = \begin{cases} 1/k & \text{for } 1 \le k \le n, \\ 0 & \text{for } k > n, \end{cases}$$

show that $\{f_n\}$ is uniformly convergent on $\Omega$, each $f_n$ is integrable, but the limit is not.

5. Show that, if $\{f_n\}$ converges in pth mean to $f$, and $g$ is essentially bounded, then $\{f_n g\}$ converges in pth mean to $fg$.

6. Show that if

$$\nu_n(E) = \int_E f_n \, d\mu \quad (n = 1, 2, \ldots),$$

defines a sequence of set functions which is uniformly absolutely continuous then so does $\lambda_n(E) = \int_E |f_n| \, d\mu$.

7. Suppose $\{f_n\}$ is a sequence of functions in $L_1$. Show that $\{f_n\}$ is a Cauchy sequence in mean if and only if $\int_E f_n \, d\mu = x_n$ is a Cauchy sequence of real numbers for every $E \in \mathcal{F}$, and $\{f_n\}$ is a Cauchy sequence in measure. Give an example of a sequence which does not converge in measure, for which $\lim \int_E f_n \, d\mu$ exists for all $E$.

8. Suppose $\mu(\Omega) < \infty$, and for $f, g \in \mathcal{M}$, and a.e. finite;

$$\rho(f, g) = \int \frac{|f - g|}{1 + |f - g|} \, d\mu.$$

Show that $\rho$ defines a metric in the space of a.e. finite functions of $\mathcal{M}$, and that convergence in this metric is equivalent to convergence in measure.

9. Suppose $\mu(\Omega) < \infty$ ($1 \le q \le p$). Show that $\mathcal{L}_1 \supset \mathcal{L}_q \supset \mathcal{L}_p \supset \mathcal{L}_\infty$, and that $\rho_\infty(f, 0) = \lim_{p \to \infty} \rho_p(f, 0)$ for $f \in \mathcal{L}_\infty$. Show that the finite measure condition is essential. By considering a suitable function on $[0, 1]$ show that $\mathcal{L}_\infty \ne \bigcap_{p \ge 1} \mathcal{L}_p$, but that if $f \in \bigcap_{p \ge 1} \mathcal{L}_p - \mathcal{L}_\infty$ then $\rho_p(f, 0) \to \infty$ as $p \to \infty$.

10. Suppose $\Omega = [0, 1]$, $\mu$ is Lebesgue measure. Let $K$ be a nowhere dense perfect set with positive measure and let $\{E_k\}$ be the set of disjoint open intervals such that $(0, 1) - K = \bigcup_{i=1}^{\infty} E_i$. Let $f_n$ be the indicator function of $F_n = \bigcup_{i=1}^{n} E_i$. Prove that $f_n$ ($n = 1, 2, \ldots$) is Riemann integrable and converges in mean to the indicator function $f$ of $(0, 1) - K$. By considering the construction of $K$, show that $f$ is discontinuous a.e. on the set $K$ of positive measure, and so cannot be Riemann integrable. This shows that the class of Riemann integrable functions is not complete with respect to convergence in mean.

7.4 Inequalities

We now obtain some inequalities which turn out to be important in several branches of analysis. We need to use the algebraic inequality

$$x^\alpha y^\beta \le \alpha x + \beta y \quad (7.4.1)$$

for $x > 0$, $y > 0$, $\alpha > 0$, $\beta > 0$, $\alpha + \beta = 1$, which is strict unless $x = y$. This is most easily proved by taking logarithms to give

$$\alpha \log x + \beta \log y \le \log(\alpha x + \beta y)$$

and using the fact that $\log: R^+ \to R$ is strictly concave since it has a negative second derivative.

Conjugate indices

If $p > 1$, $q > 1$ and $1/p + 1/q = 1$, we say that $p$ and $q$ are conjugate indices.

Theorem 7.7 (Hölder). If $p$, $q$ are conjugate indices, and $f \in L_p$, $g \in L_q$ then $fg$ is integrable and, for each $E$ in $\mathcal{F}$,

$$\int_E |fg| \, d\mu \le \left(\int_E |f|^p \, d\mu\right)^{1/p} \left(\int_E |g|^q \, d\mu\right)^{1/q}.$$

The inequality is strict unless there exist real numbers $a$, $b$ such that $a|f|^p = b|g|^q$ a.e. on $E$.

Proof. If

$$\int_E |fg| \, d\mu = 0,$$

then the loose inequality is certainly satisfied and if the right-hand side is also zero then either $f = 0$ a.e. on $E$ or $g = 0$ a.e. on $E$. In either case the condition $a|f|^p = b|g|^q$ a.e. is satisfied with $b = 0$ or $a = 0$, respectively. Hence we may assume that

$$\int_E |fg| \, d\mu > 0.$$

Put

$$E_1 = \{x: |g(x)| \le |f(x)|^{p-1}\}, \quad E_2 = \Omega - E_1.$$

Then

$$|f(x) g(x)| \le |f(x)|^p \quad \text{for } x \in E_1, \qquad |f(x) g(x)| \le |g(x)|^q \quad \text{for } x \in E_2,$$

so that

$$|f(x) g(x)| \le |f(x)|^p + |g(x)|^q \quad \text{for all } x;$$

this implies that $fg$ is integrable.


Given $E \in \mathcal{F}$, put $E_0 = E \cap \{x: f(x) g(x) = 0\}$. Then by our assumption $\mu(E - E_0) > 0$. For $x \in E - E_0$, we can apply (7.4.1) replacing

$$\alpha \text{ by } \frac{1}{p}, \quad \beta \text{ by } \frac{1}{q}, \quad x \text{ by } \frac{|f(x)|^p}{\int_{E-E_0} |f|^p \, d\mu} \quad \text{and } y \text{ by } \frac{|g(x)|^q}{\int_{E-E_0} |g|^q \, d\mu}.$$

This gives

$$\frac{|f(x) g(x)|}{\left(\int_{E-E_0} |f|^p \, d\mu\right)^{1/p} \left(\int_{E-E_0} |g|^q \, d\mu\right)^{1/q}} \le \frac{1}{p} \frac{|f(x)|^p}{\int_{E-E_0} |f|^p \, d\mu} + \frac{1}{q} \frac{|g(x)|^q}{\int_{E-E_0} |g|^q \, d\mu}. \quad (7.4.2)$$

If we now integrate over $(E - E_0)$ and note that the right-hand side gives $1/p + 1/q = 1$, we obtain the desired inequality for the integral over $(E - E_0)$. We can only obtain equality for the integrals if we have equality in (7.4.2) for almost all $x \in (E - E_0)$. The condition for equality in (7.4.1) now shows that we must have $a|f|^p = b|g|^q$ a.e. on $(E - E_0)$ where

$$a^{-1} = \int_{E-E_0} |f|^p \, d\mu \quad \text{and} \quad b^{-1} = \int_{E-E_0} |g|^q \, d\mu.$$

The inequality for the integrals over $E$ now follows, and we can again only have equality if $f = g = 0$ a.e. on $E_0$, since otherwise the right-hand side is increased while the left-hand side remains the same on replacing $(E - E_0)$ by $E$. Thus equality can only occur if $a|f|^p = b|g|^q$ a.e. on $E$. ]

Remark 1. The special case $p = q = 2$ of theorem 7.7 is called Schwarz's inequality. A simpler proof of this case is possible. See exercise 5.4 (13).

Remark 2. In the sense of theorem 7.7 the index conjugate to $p = 1$ is $q = \infty$. It is easy to prove directly that, if $f \in L_1$, $g \in L_\infty$ then

$$\int_E |fg| \, d\mu \le \left(\int_E |f| \, d\mu\right) \operatorname*{ess\,sup}_{x \in E} |g(x)|.$$

Theorem 7.8 (Minkowski). For $p \ge 1$, if $f, g \in L_p$ then $(f + g) \in L_p$ and for any $E \in \mathcal{F}$,

$$\left(\int_E |f + g|^p \, d\mu\right)^{1/p} \le \left(\int_E |f|^p \, d\mu\right)^{1/p} + \left(\int_E |g|^p \, d\mu\right)^{1/p}.$$

For $p > 1$, the inequality is strict unless there are real numbers $a \ge 0$, $b \ge 0$ such that $af = bg$ a.e. on $E$.


Proof. We already proved in §7.3 that $L_p$ is a linear space. For $p = 1$, the result is immediate. For $p > 1$,

$$\int_E |f + g|^p \, d\mu = \int_E |f + g| \, |f + g|^{p-1} \, d\mu \le \int_E |f| \, |f + g|^{p-1} \, d\mu + \int_E |g| \, |f + g|^{p-1} \, d\mu,$$

with equality if and only if $f$ and $g$ have the same sign a.e. in $E$. If we now apply theorem 7.7 to each of these integrals we obtain

$$\int_E |f + g|^p \, d\mu \le \left(\int_E |f|^p \, d\mu\right)^{1/p} \left(\int_E |f + g|^p \, d\mu\right)^{1/q} + \left(\int_E |g|^p \, d\mu\right)^{1/p} \left(\int_E |f + g|^p \, d\mu\right)^{1/q}$$

with strict inequality unless there are numbers $\alpha$, $\beta$, $\gamma$ such that $\alpha|f|^p = \beta|f + g|^p = \gamma|g|^p$ a.e. We can only have equality in both inequalities if there is $a \ge 0$, $b \ge 0$ with $af = bg$ a.e. Provided it is not zero we now divide both sides by

$$\left(\int_E |f + g|^p \, d\mu\right)^{1/q}$$

to obtain the desired result. If

$$\int_E |f + g|^p \, d\mu = 0,$$

then the inequality is trivially satisfied, and equality is only possible if $f = g = 0$ a.e. ]

The above theorem shows that

$$\rho_p(f, g) = \left(\int |f - g|^p \, d\mu\right)^{1/p}$$

defines a metric in $\mathcal{L}_p$.

We have proved the Hölder and Minkowski inequalities for general measure spaces. They are of course valid for Lebesgue measure in $R$ either over a finite interval or over the whole line.

However, we can also apply these general theorems to the case where $\Omega$ is the set of positive integers $Z$, and $\mu$ is the counting measure

$$\mu(E) = \text{number of integers in } E,$$

which makes all subsets $E \subset Z$ measurable. Then functions $f: Z \to R$ and $g: Z \to R$ reduce to

$$f(i) = a_i, \quad g(i) = b_i \quad (i = 1, 2, \ldots),$$

where $\{a_i\}$ and $\{b_i\}$ are sequences of real numbers, and we can apply theorems 7.7, 7.8 to give:

Hölder.

$$\sum_{i=1}^{\infty} |a_i b_i| \le \left(\sum_{i=1}^{\infty} |a_i|^p\right)^{1/p} \left(\sum_{i=1}^{\infty} |b_i|^q\right)^{1/q}$$

in the sense that the convergence of both series on the right implies the convergence of the series on the left and the inequality. Further, equality is only possible if there is a constant $k$ such that $|a_i|^p = k|b_i|^q$ for all $i$.

Minkowski.

$$\left(\sum_{i=1}^{\infty} |a_i + b_i|^p\right)^{1/p} \le \left(\sum_{i=1}^{\infty} |a_i|^p\right)^{1/p} + \left(\sum_{i=1}^{\infty} |b_i|^p\right)^{1/p}$$

with equality if and only if there is a $k \ge 0$ such that $a_i = k b_i$ for all $i$.
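The two sequence inequalities above can be checked numerically on finite sequences (a finite sequence corresponds to $a_i = b_i = 0$ for large $i$). The sketch below is illustrative only: the helper names `p_norm`, `holder_gap` and `minkowski_gap` are assumptions of this example, and floating-point arithmetic means the gaps are only accurate up to rounding error.

```python
def p_norm(seq, p):
    """(sum |t|^p)^(1/p) for a finite sequence of real numbers."""
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)


def holder_gap(a, b, p):
    """Right-hand side minus left-hand side of Holder's inequality (p > 1)."""
    q = p / (p - 1.0)  # conjugate index: 1/p + 1/q = 1
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return p_norm(a, p) * p_norm(b, q) - lhs


def minkowski_gap(a, b, p):
    """Right-hand side minus left-hand side of Minkowski's inequality (p >= 1)."""
    lhs = p_norm([x + y for x, y in zip(a, b)], p)
    return p_norm(a, p) + p_norm(b, p) - lhs


a = [1.0, -2.0, 3.0]
b = [0.5, 4.0, -1.0]
gap_h = holder_gap(a, b, 3.0)
gap_m = minkowski_gap(a, b, 3.0)
# Equality case of Minkowski: b is a non-negative multiple of a.
gap_equal = minkowski_gap(a, [2.0 * t for t in a], 3.0)
```

Both gaps are non-negative for the sample sequences, and the Minkowski gap vanishes (up to rounding) when one sequence is a non-negative multiple of the other.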

It is interesting to see how these elementary inequalities (which can of course be proved directly) fall out of the general theorems by using a simple special measure.

Exercises 7.4

1. Suppose $\alpha > 0$, $\beta > 0$, $\gamma > 0$, $\alpha + \beta + \gamma = 1$ and $f \in L_{1/\alpha}$, $g \in L_{1/\beta}$, $h \in L_{1/\gamma}$. Show that

$$\int_E |fgh| \, d\mu \le \left(\int_E |f|^{1/\alpha} \, d\mu\right)^{\alpha} \left(\int_E |g|^{1/\beta} \, d\mu\right)^{\beta} \left(\int_E |h|^{1/\gamma} \, d\mu\right)^{\gamma}.$$

Generalise to $n > 2$ functions.

2. If $\mu$ is Lebesgue measure on $I = (a, b)$ and $f \in L_p(I, \mu)$ show that there is a continuous $g$ such that

$$\int_a^b |f(x) - g(x)|^p \, dx < \epsilon.$$

Deduce that

$$\int_a^b |f(x + h) - f(x)|^p \, dx \to 0 \quad \text{as } h \to 0$$

(here $f$ is given the value zero outside $I$). Hint. See theorem 5.10.

3. If $\mu(\Omega) < \infty$, $1 \le p < q$ and $f \in L_q$, show that $\rho_p(f, 0) \le k \rho_q(f, 0)$ for a suitable constant $k$. Show that $\mu(\Omega) < \infty$ is essential.

4. Show by an example that theorem 7.8 is false for $p < 1$.

5. If $p$, $q$ are conjugate indices, $f_n \to f_0$ in pth mean, $g_n \to g_0$ in qth mean, show that $f_n g_n \to f_0 g_0$ in mean.

6. By considering intervals of $\Omega$ whose coordinates are rational, and linear combinations of indicator functions of such intervals obtain a countable dense set in $\mathcal{L}_p$ for $\Omega = R^k$, $\mu$ any Lebesgue-Stieltjes measure. Such a space $\mathcal{L}_p$ is therefore separable.

7.5* Measure preserving transformations from a space to itself

In § 6.5 we discussed measurable transformations $T$ from $(X, \mathcal{F}, \mu)$ to $(Y, \mathcal{S})$ and defined the measure $\mu T^{-1}$ on $\mathcal{S}$ in terms of the transformation. A special case of this is obtained when $T: X \to X$ maps $X$ into itself. We then say that $T$ is measure preserving if, for every $E \in \mathcal{F}$, $\mu(T^{-1}(E)) = \mu(E)$. Given a mapping $T: X \to X$, we can define the iterates $T^n$ obtained by composing $T$ with itself $n$ times. For convenience $T^0$ will denote the identity mapping, and $T^{-n}$ will be defined as a set mapping

$$T^{-n}(E) = \{x: T^n(x) \in E\}$$

even if $T^{-n}$ is not a point function. If $\mathcal{F}$ is the $\sigma$-ring generated by a semi-ring $\mathcal{P}$, then it is immediate on applying the extension theorems of Chapters 3 and 4 that $T$ is measure preserving if, and only if,

$$\mu(T^{-1}(E)) = \mu(E) \quad \text{for every } E \text{ in } \mathcal{P}.$$

If $T$ is a (1, 1) transformation from $X$ to itself, then it is said to be invertible and the condition for $T$ to be measure preserving in this case can be written as $\mu(T(E)) = \mu(E)$ for all $E$ in $\mathcal{F}$.

In § 4.5 we considered the geometrical properties of Lebesgue measure and showed that the transformations of Euclidean space defined in terms of translations, rotations or reflexions are measure preserving. One can also prove that a matrix transformation of determinant 1 defines a measure preserving transformation in Euclidean space. All these are easily seen to be invertible.

If $\Omega = [0, 1)$ and $Tx = 2x$ reduced mod 1 then, for Lebesgue measure, $T$ is seen to be measure preserving by considering the effect of $T^{-1}$ on the dyadic intervals $[p/2^q, (p + 1)/2^q)$ which form a semi-ring generating $\mathcal{B}$. If $x = .a_1 a_2 \ldots$ is the expansion of $x$ as a binary `decimal', then $Tx = .a_2 a_3 \ldots$. This $T$ is not invertible.

It is worth remarking that the study of measure preserving transformations started with certain considerations in statistical mechanics. Suppose we have a system of $k$ particles whose present state is described by a point in `phase space' $R^{6k}$ in which each particle determines 3 coordinates for position and 3 coordinates for momentum. Then the entire history of the system can be represented by a trajectory in phase space which is completely determined (assuming the laws of classical mechanics) by a single point on it. Thus for any (time) $t$ we can define an invertible transformation $T_t$ by saying that, for $x$ in phase space, $T_t x$ denotes the state of a system which starts at $x$ after a time $t$. One of the basic results in statistical mechanics (due to Liouville) states that, if the coordinates in phase space are correctly chosen, then the `flow' in phase space leaves all volumes (i.e. Lebesgue measure in $R^{6k}$) unchanged. This means that $T_t$ becomes a measure preserving transformation in $(R^{6k}, \mathcal{L}^{6k}, \mu)$.

In practice $k$ is enormous, and it is not possible to observe at any one moment all the particles of the system. Instead one asks questions like `what is the probability that at time $t$ the state of the system belongs to a given subset of phase space?' One then imposes conditions which ensure that this can be calculated by considering the `average' behaviour of $T_t x$ as $t \to \infty$. To be more precise $T_{s+t} = T_s T_t$ so that $T_{nt} = (T_t)^n$ and one can consider a discrete model, count the proportion of times up to $n$ that $T^i x \in E$ where $T = T_t$, and then let $n \to \infty$. In practice a set $E$ in phase space is replaced by a function $f(x)$ (representing some physical measurement) and one considers the average behaviour in terms of the sequence

$$\frac{1}{n} \sum_{i=0}^{n-1} f(T^i x) \quad (n = 1, 2, \ldots).$$

This discussion of phase space provides a physical interpretation for the mathematical results which we now formulate precisely. For the remainder of this section, $T$ will denote a measure preserving transformation of $\Omega$ to itself, and $f: \Omega \to R^*$ will denote an integrable function. We define

$$f_k(x) = f(T^k x) \quad (k = 0, 1, 2, \ldots).$$

Then $f_k$ will be integrable and theorem 6.8 shows immediately that

$$\int f_k \, d\mu = \int f \, d\mu.$$

Before giving the proof (due to F. Riesz) of the point-wise ergodic theorem, we obtain a lemma which is an important step towards it.

Lemma (sometimes called the maximal ergodic theorem). Suppose $E$ is the set of points $x \in \Omega$ such that

$$\sum_{i=0}^{n} f_i(x) \ge 0$$

for at least one $n$: then

$$\int_E f \, d\mu \ge 0.$$

Proof. We first need a result about finite sequences of real numbers. Suppose $a_1, a_2, \ldots, a_n \in R$ and $m \le n$. A term $a_i$ of this sequence is called an $m$-leader if there is an integer $p$, $1 \le p \le m$, such that

$$a_i + a_{i+1} + \ldots + a_{i+p-1} \ge 0.$$

For a fixed $m$, let $a_k$ be the first $m$-leader, and let $(a_k + \ldots + a_{k+p-1})$ be the shortest non-negative sum that it leads. Then for every integer $h$ with $k \le h \le k + p - 1$, we must have $a_h + a_{h+1} + \ldots + a_{k+p-1} \ge 0$, so that $a_h$ is an $m$-leader. Now continue with the first $m$-leader in $a_{k+p}, \ldots, a_n$ and repeat the argument until all the $m$-leaders have been found. It follows that the sum of all the $m$-leaders of the original sequence must be non-negative, as it is the same as the sum of the non-negative shortest sums obtained by the above procedure.

We can now turn to the proof of the lemma and notice that, since $f$ is integrable, we may assume that it is everywhere finite valued. If $E_m$ denotes the set of $x$ such that

$$\sum_{i=0}^{n} f_i(x) \ge 0$$

for at least one $n < m$, then $E_m$ increases to $E$, so it is sufficient to prove $\int_{E_m} f \, d\mu \ge 0$ for all $m$.
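The combinatorial fact about $m$-leaders used in this proof can be tested by brute force on short integer sequences. The following sketch is an illustration, not part of the original argument: the names `m_leaders`, `sum_of_leaders` and `leaders_sum_nonnegative` are assumptions of this example, indices are 0-based, and a sum is only considered when all of its terms lie inside the sequence.

```python
from itertools import product


def m_leaders(a, m):
    """Indices i for which a[i] + ... + a[i+p-1] >= 0 for some 1 <= p <= m."""
    n = len(a)
    leaders = []
    for i in range(n):
        total = 0
        for p in range(1, m + 1):
            if i + p > n:  # the sum would run past the end of the sequence
                break
            total += a[i + p - 1]
            if total >= 0:
                leaders.append(i)
                break
    return leaders


def sum_of_leaders(a, m):
    """Sum of all terms of a that are m-leaders."""
    return sum(a[i] for i in m_leaders(a, m))


def leaders_sum_nonnegative(values, length, m):
    """Check the claim for every sequence of the given length over the given values."""
    return all(
        sum_of_leaders(list(seq), m) >= 0
        for seq in product(values, repeat=length)
    )


example = [-1, 2, -3, 1, 1]
example_leaders = m_leaders(example, 2)
```

For the sample sequence with $m = 2$ the leaders are the terms $-1, 2, 1, 1$, whose sum is 3; the exhaustive check over small integer sequences is only evidence for the claim, the proof being the argument given in the text.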

For a positive integer $n$, let $s(x)$ be the sum of the $m$-leaders of the finite sequence $f_0(x), f_1(x), \ldots, f_{n+m-1}(x)$. Let $A_k$ be the set of $x$ for which $f_k(x)$ is an $m$-leader and let $\chi_k$ be the indicator function of $A_k$. Since $A_k$ is measurable, and

$$s(x) = \sum_{k=0}^{n+m-1} \chi_k(x) f_k(x),$$

$s$ is measurable and integrable and $s(x) \ge 0$ so that

$$\sum_{k=0}^{n+m-1} \int_{A_k} f_k \, d\mu \ge 0. \quad (7.5.1)$$

Now notice that, for $k = 1, 2, \ldots, n - 1$, $T(x) \in A_{k-1}$ if and only if $f_{k-1}(Tx) + \ldots + f_{k-1+p-1}(Tx) \ge 0$ for some $p \le m$, which is equivalent to $f_k(x) + \ldots + f_{k+p-1}(x) \ge 0$ for some $p \le m$ which in turn is the condition for $x \in A_k$. Thus $A_k = T^{-1} A_{k-1} = T^{-k} A_0$ for $k = 1, \ldots, n - 1$. Hence by theorem 6.8,

$$\int_{A_k} f_k(x) \, d\mu = \int_{T^{-k} A_0} f(T^k x) \, d\mu = \int_{A_0} f(x) \, d\mu$$

so that the first $n$ terms of (7.5.1) are all equal. Now $A_0 = E_m$, so that (7.5.1) implies

$$n \int_{E_m} f \, d\mu + m \int |f| \, d\mu \ge 0.$$

Divide by $n$, keep $m$ fixed and let $n \to \infty$ to give

$$\int_{E_m} f \, d\mu \ge 0.\ ]$$

Theorem 7.9 (Birkhoff). Suppose $T$ is a measure preserving transformation on a $\sigma$-finite measure space $(\Omega, \mathcal{F}, \mu)$ to itself and $f: \Omega \to R^*$ is integrable. Then
(i) $\frac{1}{n} \sum_{i=0}^{n-1} f(T^i x)$ converges point-wise a.e.;
(ii) the limit function $f^*(x)$ is integrable and invariant under $T$ (i.e. $f^*(Tx) = f^*(x)$ a.e.);
(iii) if $\mu(\Omega) < \infty$, then $\int f^* \, d\mu = \int f \, d\mu$.

Proof (i). Suppose $r$, $s$ are rational numbers, $r < s$, and $B = B(r, s)$ is the set of points $x$ for which

$$\liminf \frac{1}{n} \sum_{i=0}^{n-1} f_i(x) < r < s < \limsup \frac{1}{n} \sum_{i=0}^{n-1} f_i(x).$$

It is immediate that $B$ is measurable and invariant under $T$. Our result will now follow if we can prove that $\mu(B(r, s)) = 0$ for all rational $r$, $s$. The first step in this direction is to show that $\mu(B) < \infty$. We may assume without loss of generality that $s > 0$, for otherwise the argument can be carried out with $f$ replaced by $-f$. Suppose $C \in \mathcal{F}$, $\mu(C) < \infty$, $C \subset B$, and $\chi$ is the indicator function of $C$. Apply the lemma to the function $(f - s\chi)$ to give

$$\int_E (f - s\chi) \, d\mu \ge 0,$$

where $E$ is defined in the lemma. If $x \in B$, then at least one of the averages

$$\frac{1}{n} \sum_{i=0}^{n-1} f_i(x) > s > 0$$

so that at least one of the sums

$$\sum_{i=0}^{n-1} \{f(T^i x) - s\chi(T^i x)\} > 0,$$

and it follows that $x \in E$. Thus

$$\int_E f \, d\mu \ge \int_E s\chi \, d\mu \quad \text{so that} \quad \mu(C) \le \frac{1}{s} \int |f| \, d\mu.$$


Since $B$ has $\sigma$-finite measure and its subsets of finite measure have bounded measure, it follows that $\mu(B)$ is finite. Since $B$ is invariant under $T$ we can restrict our $\sigma$-field and measure to $B$ and think of $T$ as a measure preserving transformation on $B$. Apply the lemma again to the integrable function $(f - s)$, and note that in this case the set $E$ of the lemma is the whole space $B$. This gives

$$\int_B (f - s) \, d\mu \ge 0.$$

Similarly, we can obtain

$$\int_B (r - f) \, d\mu \ge 0.$$

Together these give

$$\int_B (r - s) \, d\mu \ge 0.$$

Since $r < s$, we must have $\mu(B) = 0$.

(ii) Put

$$f^*(x) = \lim_{n \to \infty} \left\{\frac{1}{n} \sum_{i=0}^{n-1} f_i(x)\right\}$$

when the sequence converges. Then it is immediate that $f^*$ is measurable and invariant. Further

$$\int \left|\frac{1}{n} \sum_{i=0}^{n-1} f_i(x)\right| d\mu \le \frac{1}{n} \sum_{i=0}^{n-1} \int |f_i(x)| \, d\mu = \int |f(x)| \, d\mu$$

so that, by theorem 5.7 (Fatou), $f^*$ is integrable, and

$$\int |f^*| \, d\mu \le \int |f| \, d\mu.$$

(iii) For fixed $n$, put

$$D(k, n) = \left\{x: \frac{k}{2^n} \le f^*(x) < \frac{k+1}{2^n}\right\}$$

and apply the lemma to the transformation $T$ on the set $D(k, n)$ which can be assumed to be invariant. Then $f^*(x) \ge k/2^n$ in $D(k, n)$, so that at least one of the sums

$$\sum_{i=0}^{n-1} \left(f_i(x) - \frac{k}{2^n} + \epsilon\right) > 0 \quad \text{for each } \epsilon > 0.$$

Hence

$$\int_{D(k,n)} f \, d\mu \ge \left(\frac{k}{2^n} - \epsilon\right) \mu(D(k, n)),$$

and so we must have

$$\int_{D(k,n)} f \, d\mu \ge \frac{k}{2^n} \mu(D(k, n)).$$

Similarly

$$\int_{D(k,n)} f \, d\mu \le \frac{k+1}{2^n} \mu(D(k, n))$$

and

$$\frac{k}{2^n} \mu(D(k, n)) \le \int_{D(k,n)} f^* \, d\mu \le \frac{k+1}{2^n} \mu(D(k, n)).$$

For each integer $k$, it follows that

$$\left|\int_{D(k,n)} f^* \, d\mu - \int_{D(k,n)} f \, d\mu\right| \le \frac{1}{2^n} \mu(D(k, n));$$

and if we sum over $k$

$$\left|\int_\Omega f^* \, d\mu - \int_\Omega f \, d\mu\right| \le \frac{1}{2^n} \mu(\Omega).$$

Since $n$ is arbitrary we must have $\int f^* \, d\mu = \int f \, d\mu$. ]

For applications to statistical mechanics one would expect the equilibrium value f*(x) to be independent of the point x, so that the limit function f* of theorem 7.9 is a constant. Unfortunately this is not true without imposing an additional condition. Ergodic transformation

A measure preserving transformation T is said to be ergodic (or metrically transitive or metrically invariant) if for all invariant sets E (sets for which T-1(E) = (E)), µ(E) = 0 or,u(S2-E) = 0. Lemma. T is ergodic if and only if every measurable invariant function is constant a.e.

Proof. Suppose $g$ is measurable and invariant. Then $\{x : g(x) > a\}$ is invariant for all real $a$, and must either have zero measure or be the complement of a set of zero measure. Hence $g$ = constant a.e., if $T$ is ergodic. Conversely, if every measurable invariant function is constant a.e., since the indicator function of an invariant set is an invariant function, there cannot be any invariant sets other than null sets and complements of null sets. ]

We can now apply this lemma to theorem 7.9 when $T$ is ergodic. There are two cases:

(i) $\mu(\Omega) = +\infty$. Since the only constant which is integrable over a space of infinite measure is zero we deduce that

$$\frac{1}{n}\sum_{i=0}^{n-1} f_i(x) \to 0 \quad \text{a.e.}$$

(ii) $\mu(\Omega) < \infty$. We can integrate $f^* = c$ a.e. by (iii) to obtain

$$\frac{1}{n}\sum_{i=0}^{n-1} f_i(x) \to \frac{1}{\mu(\Omega)}\int f\,d\mu \quad \text{a.e.}$$

This last result ties up with our remarks about statistical mechanics, since it shows that the average value of $f$ on the discrete trajectory approaches the average value of $f$ in phase space for all starting points $x$ except for a possible null set.
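The equality of time and space averages in case (ii) can be seen exactly on a very small example. The sketch below is only an illustration under assumptions that are not in the text: the space is the five-point set {0, ..., 4} with counting measure, the transformation is the cyclic shift (whose only invariant sets are the empty set and the whole space, so it is ergodic), and the function and the helper name `ergodic_average` are hypothetical choices.

```python
from fractions import Fraction

def ergodic_average(T, f, x, n):
    """Return (1/n) * sum_{i=0}^{n-1} f(T^i x) as an exact fraction."""
    total = Fraction(0)
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

# Assumed example: Omega = {0,...,4}, counting measure, T a 5-cycle.
POINTS = range(5)

def T(x):
    return (x + 1) % 5

def f(x):
    return Fraction(x * x)      # any real function on Omega would do

# Space average: (1/mu(Omega)) * integral of f with respect to counting measure.
space_average = sum(f(x) for x in POINTS) / 5

# Time averages over n = 5000 steps, one for each starting point.
time_averages = [ergodic_average(T, f, x, 5000) for x in POINTS]
```

Because 5000 is a multiple of the cycle length, every time average here coincides with the space average; for other values of n the two differ by a term of order 1/n.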

The reader who wishes to learn more about ergodic theory is advised to read P. R. Halmos, Lectures on Ergodic Theory (Chelsea, 1956).

Exercises 7.5

1. Suppose $\Omega$ is the real line, $T$ is the translation $Tx = x + 1$, $f$ is the indicator function of $[0, 1]$. What is

$$f^*(x) = \lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1} f_i(x)$$

in this case? Show that theorem 7.9 (iii) is not satisfied without the condition $\mu(\Omega) < \infty$.

2. Suppose $T$ is measure preserving and ergodic on $(\Omega, \mathcal{F}, \mu)$ and $\mu(\Omega) < \infty$. If $f$ is non-negative measurable and

$$\frac{1}{n}\sum_{i=0}^{n-1} f(T^i x) \to c \in \mathbf{R} \quad \text{a.e.,}$$

deduce that $f$ is integrable.

3. Suppose $\Omega$ is five point space $\{a, b, c, d, e\}$, $\mathcal{F}$ is the set of all subsets, $\mu\{a\} = \mu\{b\} = \mu\{c\} = 1$ and $\mu\{d\} = \mu\{e\} = 2$, $T$ is the permutation $(a, b, c)(d, e)$. Show that $T$ is measure preserving but not ergodic. Find the $f^*$ of theorem 7.9 if $f$ is the indicator function of $\{a, b, e\}$.

4. Suppose $(\Omega, \mathcal{F}, P)$ is a probability measure. Form the doubly infinite Cartesian product of copies of $(\Omega, \mathcal{F}, P)$ labelled $(\dots, -2, -1, 0, 1, \dots, n, \dots)$ and the product measure by the process of theorem 6.3. If a point of this product space is $\omega = (\dots, \omega_{-1}, \omega_0, \omega_1, \dots)$ and $T$ is the shift $(T\omega)_n = \omega_{n+1}$, show that $T$ is measure preserving and ergodic.

5. If $\Omega = [0, 1)$, $Tx = 2x \pmod 1$, and $\mu$ is Lebesgue measure, show that $T$ is ergodic. By applying the ergodic theorem to the indicator function of $[0, \tfrac12)$ deduce the Borel normal number theorem which states that in the binary expansion $0.a_1 a_2 \dots a_n \dots$ of real numbers $x$ in $[0, 1)$, the density

$$\frac{1}{n}\sum_{i=1}^{n} a_i \to \tfrac12$$

for almost all $x$.

6. Suppose $T$ is ergodic and measure preserving on a finite measure space $(X, \mathcal{F}, \mu)$ and $f$, $g$ are the indicator functions of measurable sets $F$, $G$. Show that

$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\mu\big((T^{-i}F)\cap G\big) = \frac{\mu(F)\,\mu(G)}{\mu(X)}.$$

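8

LINEAR FUNCTIONALS

In this chapter all measure spaces $(\Omega, \mathcal{F}, \mu)$ will be $\sigma$-finite, and $\mathcal{F}$ will be complete with respect to $\mu$, unless stated otherwise. In Chapter 7 we saw that $\mathcal{L}_p$ $(1 \le p < \infty)$ with the metric

$$\rho_p(f, g) = \Big\{\int |f - g|^p\,d\mu\Big\}^{1/p}$$

and $\mathcal{L}_\infty$ with the metric

$$\rho_\infty(f, g) = \operatorname{ess\,sup}|f - g|,$$

were complete metric spaces. We also proved they were linear spaces (over the reals); and it is immediate that the metric defines a norm

$$\|f\|_p = \rho_p(f, 0) \qquad (1 \le p \le \infty)$$

which satisfies

$$\|f\|_p > 0 \ \text{ if } f \ne 0, \qquad \|0\| = 0;$$

$$\|f + g\|_p \le \|f\|_p + \|g\|_p, \qquad \|af\|_p = |a|\,\|f\|_p \ \text{ for } a \in \mathbf{R}.$$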

We will omit the suffix $p$ in $\|\cdot\|_p$ when it is clear which $\mathcal{L}_p$ space is being considered. It turns out that the space $\mathcal{L}_2$ has some special properties not shared by other $\mathcal{L}_p$ spaces. These can be postulated in terms of the difference between Hilbert space and Banach space, but we prefer to examine, in the first three sections of this chapter, the structure of $\mathcal{L}_2$ and then discuss later the analogous properties of more general normed linear spaces.

8.1 Dependence of $\mathcal{L}_2$ on the underlying $(\Omega, \mathcal{F}, \mu)$

In general, the structure of the space $\mathcal{L}_2$ depends on the underlying space $(\Omega, \mathcal{F}, \mu)$: when we want to emphasise this we use the notation $\mathcal{L}_2(\Omega, \mu)$. We first examine conditions on $(\Omega, \mathcal{F}, \mu)$ which will ensure that $\mathcal{L}_2(\Omega, \mu)$ is separable (in the topology of the norm). We later define (real) Hilbert space in terms of its abstract properties, and show that $\mathcal{L}_2(\Omega, \mu)$ is always a realisation of Hilbert space.

Countable basis for measure

In the measure space $(\Omega, \mathcal{F}, \mu)$ we can use the equivalence relation

$$A \sim B \iff \mu(A \,\triangle\, B) = 0$$

to identify the subsets in $\mathcal{F}$ which differ only by a set of measure zero. If we denote the resulting quotient space by $\mathcal{F}_\mu$, it is clear that, when $\mu(\Omega) < \infty$, $\mathcal{F}_\mu$ is a metric space with the metric $\rho(A, B) = \mu(A \,\triangle\, B)$, and one can further show that $\mathcal{F}_\mu$ is complete. In this case we can define a dense subset by means of the topology of this metric. However, the notion of a dense subset in $\mathcal{F}_\mu$ can be extended to include the case $\mu(\Omega) = \infty$ by a device which makes sense provided $\mu$ is $\sigma$-finite on $\Omega$. Thus we say that $\mu$ has a countable basis if there is a sequence $\{E_n\}$ of sets in $\mathcal{F}$ such that, given $\epsilon > 0$ and any $A \in \mathcal{F}$ with $\mu(A) < \infty$, there is a set $E_k$ of the sequence for which $\mu(A \,\triangle\, E_k) < \epsilon$.

In Chapters 3 and 4 we saw how measures could be obtained by extending a measure already defined on a semi-ring. If $\mu$ can be defined by extending a finite measure on a semi-ring $\mathcal{P}$ which contains only a countable number of sets, then $\mu$ has a countable basis. For the ring $\mathcal{R}$ generated by $\mathcal{P}$ is countable, and forms a basis for $\mathcal{F}$ by theorem 4.4. In particular, in the definition of Lebesgue measure, we could have used the countable semi-ring of half-open intervals, whose bounding coordinates are rationals, to generate the $\sigma$-field $\mathcal{B}^k$ of Borel sets in $\mathbf{R}^k$; so it follows that Lebesgue measure in $\mathbf{R}^k$ has a countable basis. We first obtain a condition equivalent to the existence of a countable basis for $\mu$.

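Lemma. A measure $\mu$ has a countable basis if and only if, for each $\epsilon > 0$, any collection $\mathcal{C} \subset \mathcal{F}$ of subsets of finite measure such that

$$A, B \in \mathcal{C},\ A \ne B \implies \mu(A \,\triangle\, B) \ge \epsilon \tag{8.1.1}$$

is countable.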

Proof. Suppose first that $\epsilon > 0$ is such that there is a non-countable $\mathcal{C}$ satisfying (8.1.1); and suppose if possible that $\mu$ has a countable basis $\mathcal{D}$. Then, for each $A \in \mathcal{C}$ we can find a set $E_A \in \mathcal{D}$ with

$$\mu(A \,\triangle\, E_A) < \tfrac13\epsilon.$$

Then, if $A \ne B$,

$$\mu(E_A \,\triangle\, E_B) \ge \mu(A \,\triangle\, B) - \mu(A \,\triangle\, E_A) - \mu(B \,\triangle\, E_B) > \tfrac13\epsilon > 0,$$

so that $E_A \ne E_B$. Thus if $\mathcal{D}$ is dense, it contains a non-countable subclass, which is a contradiction.

Conversely, suppose the condition is satisfied. Then for each positive integer $n$, the set $\Gamma_n$ of those classes $\mathcal{C}_n$ which satisfy (8.1.1) with $\epsilon = 1/n$ can be partially ordered by inclusion. Clearly if $\Delta_n \subset \Gamma_n$ is a totally ordered set of classes, the union of all the classes in $\Delta_n$ is a class $\mathcal{C}_n$ in $\Gamma_n$ which is an upper bound for $\Delta_n$. By Zorn's lemma (see § 1.6) it follows that we can obtain a maximal element in $\Gamma_n$ with respect to this ordering. Thus we can find a class $\mathcal{C}_n^0 \subset \mathcal{F}$ satisfying (8.1.1), with $\epsilon = 1/n$, and such that, given $E \in \mathcal{F}$ of finite measure, there is at least one $A \in \mathcal{C}_n^0$ with $\mu(A \,\triangle\, E) < 1/n$, as otherwise $E$ could be added to $\mathcal{C}_n^0$ to form a larger collection. But $\mathcal{C}_n^0$ is countable so $\mathcal{C} = \bigcup_{n=1}^{\infty}\mathcal{C}_n^0$ is countable and forms a basis for $\mu$. ]

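Theorem 8.1. The space $\mathcal{L}_2(\Omega, \mu)$ of functions $f: \Omega \to \mathbf{R}^*$ which are square integrable is separable (in the norm topology) if and only if the measure $\mu$ has a countable basis.

Proof. Suppose first that $\mathcal{L}_2$ is separable, so that there is a sequence $\{f_n\}$ in $\mathcal{L}_2$ such that for any $\epsilon > 0$, $f \in \mathcal{L}_2$ we can find an integer $k$ with $\|f - f_k\| < \epsilon$. Let $\mathcal{C}$ be any collection of measurable sets of finite measure. Then for each $A \in \mathcal{C}$, the indicator function $\chi_A \in \mathcal{L}_2$, so there is an integer $k_A$ such that

$$\|f_{k_A} - \chi_A\| < \tfrac13\sqrt{\epsilon}.$$

Then, if $\mathcal{C}$ satisfies (8.1.1), we have $\|\chi_A - \chi_B\|^2 = \mu(A \,\triangle\, B) \ge \epsilon$ for $A \ne B$, and so we must have

$$\|f_{k_A} - f_{k_B}\| > \tfrac13\sqrt{\epsilon} \quad \text{for } A \ne B$$

so that $k_A \ne k_B$, and $\mathcal{C}$ must be countable. By the lemma this implies that $\mu$ has a countable basis.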

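Conversely suppose that $\mu$ has a countable basis $\mathcal{C}$. The set $\mathcal{S}$ of all simple functions

$$h = \sum_{i=1}^{n} r_i \chi_{C_i}$$

which are finite sums of rational multiples of indicator functions of sets of $\mathcal{C}$ is then countable. In order to prove $\mathcal{L}_2$ separable, it is sufficient to show that this set $\mathcal{S}$ is dense in $\mathcal{L}_2$. From the definition of the integral, for any $f \in \mathcal{L}_2$, $\epsilon > 0$ we can find a set $E \in \mathcal{F}$ with $\mu(E) < \infty$ such that $f$ is bounded on $E$ and

$$\int_{\Omega - E} |f|^2\,d\mu < \tfrac18\epsilon^2.$$

On the set $E$, we can use the process of theorem 5.2 to approximate $f$ uniformly by a simple function $g$ taking only rational values

$$g = \sum_{k=1}^{n} r_k \chi_{E_k}, \qquad E_i \cap E_j = \emptyset \ (i \ne j), \qquad E = \bigcup_{i=1}^{n} E_i.$$

Using $\mu(E) < \infty$, this means that such a function $g$ can be found with

$$\int_E |f - g|^2\,d\mu < \tfrac18\epsilon^2.$$

Then

$$\|f - g\|^2 = \int_{\Omega - E} |f - g|^2\,d\mu + \int_E |f - g|^2\,d\mu = \int_{\Omega - E} |f|^2\,d\mu + \int_E |f - g|^2\,d\mu < \tfrac14\epsilon^2,$$

so that $\|f - g\| < \tfrac12\epsilon$. If all the $r_k$ in the representation of $g$ are zero we are finished, so there is no loss in generality in assuming they are all non-zero. Since $\mathcal{C}$ is a basis for $\mu$ and $\mu(E_k) < \infty$, we can find sets $C_k$ of $\mathcal{C}$ such that

$$\mu(E_k \,\triangle\, C_k) < \Big(\frac{\epsilon}{2 r_k n}\Big)^2 \qquad (k = 1, 2, \dots, n).$$

Then

$$\|r_k \chi_{E_k} - r_k \chi_{C_k}\|^2 = r_k^2\,\mu(E_k \,\triangle\, C_k) < \Big(\frac{\epsilon}{2n}\Big)^2$$

so that, if

$$h = \sum_{k=1}^{n} r_k \chi_{C_k},$$

we have

$$\|g - h\| \le \sum_{k=1}^{n}\|r_k \chi_{E_k} - r_k \chi_{C_k}\| < \tfrac12\epsilon$$

and

$$\|f - h\| \le \|f - g\| + \|g - h\| < \epsilon.\ ]$$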

Corollary. If $\mu$ denotes Lebesgue measure in $\mathbf{R}^k$, then $\mathcal{L}_2(\mathbf{R}^k, \mu)$ is separable.

To prove this we use the observation that Lebesgue measure in $\mathbf{R}^k$ has a countable basis. It is worth remarking that the classical method of proving this corollary is to approximate $f \in \mathcal{L}_2$ by a continuous function vanishing outside a finite interval, and then approximate this continuous function by a polynomial with rational coefficients.

We end this section by making an important definition which is essential to a geometric understanding of linear spaces. We will see later that it is not possible to define an inner product in $\mathcal{L}_p$ for $p \ne 2$.

Inner product

For any normed linear space $K$ over the reals a function $(f, g)$ on $K \times K \to \mathbf{R}$ is called an inner product (or scalar product) if

(i) $(f, g) = (g, f)$;
(ii) $(f_1 + f_2, g) = (f_1, g) + (f_2, g)$;
(iii) $(\lambda f, g) = \lambda (f, g)$, for $\lambda \in \mathbf{R}$;
(iv) $(f, f) = \|f\|^2$.

For the normed linear space $\mathcal{L}_2$ we can define

$$(f, g) = \int fg\,d\mu, \qquad f, g \in \mathcal{L}_2,$$

since, by theorem 7.7, the product $fg$ is integrable. It is a simple matter to check that, with this definition, $(f, g)$ satisfies all the conditions (i)-(iv) for an inner product.

Exercises 8.1

1. For any normed linear space with an inner product, prove that $|(f, g)| \le \|f\|\,\|g\|$.

Hint. Consider $(f + \theta g, f + \theta g) \ge 0$ for all real $\theta$.

2. If $(f, g) = 0$ in a normed linear space, show that $\|f + g\|^2 = \|f\|^2 + \|g\|^2$.

3. Suppose $(\Omega, \mathcal{F}, \mu)$ is a discrete measure space, i.e. there is a sequence $\{p_i\}$ of reals with $\sum |p_i| < \infty$ and a sequence $\{x_i\}$ in $\Omega$ such that

$$\mu(E) = \sum_{x_i \in E} p_i.$$

Prove that $\mathcal{L}_2(\Omega, \mu)$ is separable.

4. If $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ are $\sigma$-finite measure spaces each with countable bases, show that the product measure $\lambda = \mu \times \nu$ on $X \times Y$ also has a countable basis.

Hint. Consider finite unions of rectangles which are products of basic sets.

5. Generalise the above to countable products of spaces with $\mu_i(X_i) = 1$. The example (8) below shows that it does not extend to arbitrary products.

6. Let $\Omega$ be any set and define the counting measure $\mu(E)$ = number of points in $E$ when $E$ is finite; $\mu(E) = +\infty$ otherwise. Show (i) if $\Omega$ is countable, the finite subsets of $\Omega$ form a countable basis; (ii) if $\Omega$ is uncountable, there is no countable basis for $\mu$.

7. Show that any Lebesgue-Stieltjes measure in $\mathbf{R}^k$ has a countable basis.

8. Suppose $I$ is a non-countable index set and for $\alpha \in I$, $X_\alpha$ denotes a 2-point space $\{0, 1\}$ with $\mu_\alpha\{0\} = \mu_\alpha\{1\} = \tfrac12$. Form the product measure $\mu$ on the Cartesian product $\prod_{\alpha \in I}\{0, 1\} = \Omega$. Show that there is no countable basis for $\mu$.

9. Show that, if $\mu$ has a countable basis, then $\mathcal{L}_p(\Omega, \mu)$, $1 \le p < \infty$, is a separable space.

8.2 Orthogonal systems of functions

We now examine the part of the structure of $\mathcal{L}_2(\Omega, \mu)$ which is more intimately related to the inner product.

Linear dependence

In a linear space $K$, the finite set $\phi_1, \phi_2, \dots, \phi_n$ is said to be linearly dependent if there are real numbers $c_i$, not all zero, such that

$$\sum_{i=1}^{n} c_i \phi_i = 0. \tag{8.2.1}$$

On the other hand, if (8.2.1) implies that all the $c_i$ are zero, then we say that $\phi_1, \dots, \phi_n$ are linearly independent. A set $E \subset K$ is said to be linearly independent if each of its finite subsets is linearly independent.

Note that, when $K = \mathcal{L}_2$ the relation (8.2.1) becomes

$$\sum_{i=1}^{n} c_i \phi_i(x) = 0 \quad \text{a.e.}$$

Closed linear span

In a normed linear space $K$, given a family $\{\phi_\alpha\}$, $\alpha \in I$ of points of $K$, the set of all finite linear combinations

$$\sum_{k=1}^{n} c_{\alpha_k}\phi_{\alpha_k}, \qquad c_{\alpha_k} \in \mathbf{R} \tag{8.2.2}$$

is called the span of $\{\phi_\alpha\}$, and its closure (in the norm topology) is called the closed linear span of $\{\phi_\alpha\}$ and denoted by $M\{\phi_\alpha\}$. Thus this set $M$ consists of precisely those elements of $K$ which can be approximated in norm by elements of the form (8.2.2).

Complete set

A family $\{\phi_\alpha\}$ $(\alpha \in I)$ in a normed linear space $K$ is said to form a complete set if its closed linear span is the whole space.

Suppose now that a normed linear space $K$ is separable, so that there is a sequence $x_1, x_2, \dots, x_n, \dots$ of points dense in $K$. By omitting, in turn, any point in the sequence which can be expressed as a linear combination of the previous ones we obtain a sequence $g_1, g_2, \dots$ which is linearly independent, and has the same linear span as $\{x_n\}$. Since $\{x_n\}$ is dense, the closed linear span of $\{g_n\}$ must therefore be the whole space. Thus in any separable normed linear space there is a complete set of independent points which is either finite or enumerable. If there is a finite complete set $(g_1, g_2, \dots, g_k)$ of independent points and $K$ has an inner product, then we will see that $K$ is isomorphic to Euclidean $k$-space. For $K = \mathcal{L}_2(\Omega, \mu)$, it is easy to see that $K$ is finite-dimensional if $\mu$ is a discrete measure concentrated on a finite set of points, for then the indicator functions of these individual points will form a finite complete set. However, the interesting $\mathcal{L}_2$-spaces are infinite-dimensional. Any $(\Omega, \mathcal{F}, \mu)$ for which $\mathcal{F}$ contains infinitely many disjoint sets, each of finite positive measure, clearly generates an infinite-dimensional $\mathcal{L}_2(\Omega, \mu)$ since the indicator functions of this sequence form an independent set.

Orthogonal system

Two points $x$, $y$ in a normed linear space $K$ with an inner product are said to be orthogonal if $(x, y) = 0$. Any class $\{\phi_i\}$ $(i \in I)$ of points of $K$ which are different from zero and pairwise orthogonal is called an orthogonal system. A non-zero element $x \in K$ is said to be normalized if $\|x\| = 1$, i.e. $(x, x) = 1$. An orthogonal system of normalized points is said to be an orthonormal system in $K$. Thus $\{\phi_i\}$ $(i \in I)$ is an orthonormal system if

$$(\phi_i, \phi_j) = \begin{cases} 1 & \text{for } i = j \in I, \\ 0 & \text{for } i \ne j. \end{cases}$$

Now any orthogonal system of points is certainly linearly independent for, if we take the inner product of (8.2.1) with $\phi_j$ we obtain $c_j(\phi_j, \phi_j) = 0$, so that $c_j = 0$. Further, if $K$ is separable, any orthogonal system in $K$ is countable. For any such system can be normalised to give an orthonormal system $\{\phi_i\}$ $(i \in I)$, and then

$$\|\phi_i - \phi_j\| = \sqrt{2} \quad \text{for } i \ne j;$$

and, if $\{x_n\}$ is a countable dense set, we can find for each $i \in I$ an integer $n_i$ such that $\|x_{n_i} - \phi_i\| < \tfrac12$ and this gives

$$\|x_{n_i} - x_{n_j}\| > \sqrt{2} - 1 > 0 \quad \text{for } i \ne j,$$

so that $n_i \ne n_j$.

In the study of finite-dimensional normed linear spaces it is helpful to use a (finite) orthogonal normalized basis. In the general case, at least for $K$ separable, it is also possible to find a complete orthonormal sequence for $K$. This can be done by first obtaining a complete independent sequence and then orthogonalising it by the process of the next theorem.

Theorem 8.2 (Gram-Schmidt orthogonalisation process). If $K$ is a normed linear space with an inner product and $x_1, x_2, \dots, x_n, \dots$ is a linearly independent sequence in $K$, then there is an orthonormal sequence $y_1, y_2, \dots, y_n, \dots$ such that

(i) $y_n = a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n$, $a_{nn} \ne 0$;
(ii) $x_n = b_{n1}y_1 + b_{n2}y_2 + \dots + b_{nn}y_n$, $b_{nn} \ne 0$;

where $a_{ij}$, $b_{ij}$ are real numbers. Further each $y_i$ is uniquely determined (up to the sign) by these conditions.

Proof. If $y_1 = a x_1$, then

$$(y_1, y_1) = a^2 (x_1, x_1) = 1$$

if $a$ is suitably chosen. The conditions are therefore satisfied with $n = 1$ if $b_{11} = 1/a_{11} = \sqrt{(x_1, x_1)}$. (Note that the linear independence condition ensures that $\|x_1\| \ne 0$.) For $n > 1$, suppose $y_1, y_2, \dots, y_{n-1}$ have been found to satisfy all the conditions. Then

$$x_n = b_{n1}y_1 + \dots + b_{n,n-1}y_{n-1} + z_n,$$

where $b_{ni} = (x_n, y_i)$ $(i = 1, 2, \dots, n-1)$ so that $(z_n, y_i) = 0$ for $i < n$. We must have $(z_n, z_n) > 0$, since otherwise $z_n = 0$ and $x_1, x_2, \dots, x_n$ would be linearly dependent. If we put

$$y_n = \frac{z_n}{\sqrt{(z_n, z_n)}}, \qquad b_{nn} = \sqrt{(z_n, z_n)},$$

then $(y_n, y_n) = 1$, $(y_n, y_i) = 0$ for $i < n$ and (ii) is satisfied. We can then deduce (i) since $b_{nn} \ne 0$. By induction the method of orthogonalisation is established.

The uniqueness of the process (apart from sign) follows since the values of the constants are all determined except for the ± sign in the square root which occurs at each stage. ]
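The inductive step of this proof translates directly into a short procedure. The following is a minimal sketch, assuming the space is Euclidean $\mathbf{R}^3$ with the usual scalar product; the function names `inner` and `gram_schmidt` and the sample vectors are illustrative choices, not part of the text, and the positive square root is taken at each stage.

```python
import math

def inner(u, v):
    """Usual scalar product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthonormalise a linearly independent list of vectors.

    Inductive step of the proof: z_n = x_n - sum_i (x_n, y_i) y_i,
    then y_n = z_n / sqrt((z_n, z_n)), with the positive square root.
    """
    ys = []
    for x in xs:
        z = list(x)
        for y in ys:
            b = inner(x, y)                          # b_{ni} = (x_n, y_i)
            z = [zi - b * yi for zi, yi in zip(z, y)]
        norm = math.sqrt(inner(z, z))                # b_{nn}
        if norm < 1e-12:
            raise ValueError("vectors are (numerically) linearly dependent")
        ys.append([zi / norm for zi in z])
    return ys

# Assumed sample data: three independent vectors in R^3.
xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ys = gram_schmidt(xs)
```

In floating point arithmetic the computed $y_n$ are orthonormal only up to rounding error, so any check of $(y_i, y_j)$ should allow a small tolerance.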

Corollary. If $\mathcal{L}_2(\Omega, \mu)$ is separable, then there is a complete orthonormal sequence in $\mathcal{L}_2$.

Proof. Start with a sequence $\{f_n\}$ which is dense in $\mathcal{L}_2$, and replace it first by a sequence $\{g_n\}$ of linearly independent functions with the same closed linear span. If this is an infinite sequence, use the process of theorem 8.2 to obtain the orthonormal sequence $\{h_n\}$. It is clear that this sequence has the same closed linear span as $\{g_n\}$ so that it is a complete set. On the other hand, if $\mathcal{L}_2$ is finite dimensional, we will obtain a finite set $\{g_1, g_2, \dots, g_n\}$ whose linear span is $\mathcal{L}_2$. This finite set can be replaced by a finite orthonormal set using the process of theorem 8.2. ]

In practice it is not always easy to prove that a given orthonormal sequence $\{\phi_1, \phi_2, \dots\}$ is complete. Various methods for proving completeness will be given in the next section.

Exercises 8.2

1. Suppose $\Omega = [0, 1)$, $\mu$ is Lebesgue measure, $f_0(x) \equiv 1$,

$$f_n(x) = \begin{cases} +1 & \text{if } 2^{n-1}x = y \in [0, \tfrac12) \pmod 1, \\ -1 & \text{if } 2^{n-1}x = y \in [\tfrac12, 1) \pmod 1. \end{cases}$$

The functions $f_n: [0, 1) \to \mathbf{R}$ are called the Rademacher functions. Show that they form an orthonormal sequence in $\mathcal{L}_2(\Omega, \mu)$.

2. For $\Omega = [-\pi, \pi]$, $\mu$ Lebesgue measure, show that the trigonometric functions

$$\frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}}\cos x,\ \frac{1}{\sqrt{\pi}}\sin x,\ \dots,\ \frac{1}{\sqrt{\pi}}\cos nx,\ \frac{1}{\sqrt{\pi}}\sin nx,\ \dots$$

form an orthonormal sequence in $\mathcal{L}_2(\Omega, \mu)$.

3. For $\Omega = [-1, 1]$, $\mu$ Lebesgue measure, show that the Legendre polynomials

$$P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}\{(x^2 - 1)^n\} \qquad (n = 0, 1, 2, \dots),$$

are orthogonal in $\mathcal{L}_2(\Omega, \mu)$ and that the sequence $\sqrt{\{\tfrac12(2n+1)\}}\,P_n(x)$ is orthonormal.

8.3 Riesz-Fischer theorem

This theorem is formulated and proved in Hilbert space. Since $\mathcal{L}_2$ spaces will be shown to be realisations of Hilbert space, we will deduce the classical theorem about the Fourier expansion as a trigonometric series of a function in $\mathcal{L}_2$ as a special case.

Hilbert space

Suppose $H$ is a normed linear space with an inner product which is complete in the topology of the norm; then $H$ is said to be a Hilbert space. (Note that some older text-books require separability in addition.) If $H$ is finite-dimensional, then theorem 8.2 allows us to choose a finite orthonormal basis $e_1, e_2, \dots, e_n$ for $H$. It is then clear that

$$x = \sum_{k=1}^{n} c_k e_k, \qquad c_k = (x, e_k) \tag{8.3.1}$$

is the unique expansion of $x \in H$ in terms of this basis. For separable infinite-dimensional $H$, we have seen that there is always an orthonormal sequence $\{e_i\}$ which forms a complete set. The main objective of the present section is to obtain the extension of (8.3.1) to this infinite-dimensional case. However, in formulating the results we will not assume that $H$ is separable. It will turn out that an expansion in the form (8.3.1) is still possible, and that at most countably many terms in any such expansion will be non-zero.

Before leaving the finite-dimensional $H$, we should observe that any Hilbert space of dimension $n$ is isomorphic to Euclidean $n$-space $\mathbf{R}^n$ with the usual scalar product. For (8.3.1) determines the point $(c_1, c_2, \dots, c_n) \in \mathbf{R}^n$ and defines a (1, 1) mapping which then preserves the inner product.

Fourier coefficients

Given an orthonormal family $\{e_j\}$, $j \in J$ in a Hilbert space $H$, and any point $x \in H$, the real numbers

$$c_j = (x, e_j) \qquad (j \in J),$$

are called the Fourier coefficients of $x$ on the orthonormal family, and the series

$$\sum_{j \in J} c_j e_j$$

is called the Fourier series of $x$. (Note we have not yet said in what sense (if any) this series converges.) The choice of the Fourier coefficients $c_j$ can be justified as follows. If $I$ is a finite subset of the index set $J$, re-label the indices $1, 2, \dots, n$ and consider the partial sum

$$s_n = \sum_{i=1}^{n} a_i e_i \qquad (n = 1, 2, \dots).$$

Then

$$\|s_n - x\|^2 = \Big(x - \sum_{i=1}^{n} a_i e_i,\ x - \sum_{i=1}^{n} a_i e_i\Big) = \|x\|^2 - 2\sum_{i=1}^{n}(x, a_i e_i) + \sum_{i=1}^{n}\sum_{j=1}^{n}(a_i e_i, a_j e_j) = \|x\|^2 - 2\sum_{i=1}^{n} a_i c_i + \sum_{i=1}^{n} a_i^2,$$

so that

$$\|s_n - x\|^2 = \|x\|^2 - \sum_{i=1}^{n} c_i^2 + \sum_{i=1}^{n}(a_i - c_i)^2. \tag{8.3.2}$$

Thus $\|s_n - x\|$ will be a minimum when all the terms of the last series in (8.3.2) are zero, and $\sum_{i=1}^{n} a_i e_i$ is the best approximation (in norm) to $x$ when all the $a_i$ are Fourier coefficients. This generalises the well-known geometrical theorem (for $\mathbf{R}^n$) which states that the length of the perpendicular from a point to a plane is smaller than the length of any other line joining the point to the plane: for $\big(x - \sum_{i=1}^{k} c_i e_i\big)$ is orthogonal to all linear combinations of the form $\sum_{i=1}^{k}\beta_i e_i$.
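The identity (8.3.2) and the minimising property of the Fourier coefficients can be checked numerically in a finite-dimensional case. The sketch below assumes Euclidean $\mathbf{R}^3$ with the usual scalar product and a hypothetical orthonormal pair $e_1, e_2$; the point $x$ and the perturbed coefficients $a_i$ are arbitrary illustrative data.

```python
import math

def inner(u, v):
    """Usual scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def combo(coeffs, es):
    """Return the vector sum_i coeffs[i] * es[i]."""
    n = len(es[0])
    return [sum(c * e[k] for c, e in zip(coeffs, es)) for k in range(n)]

def dist_sq(u, v):
    """Squared norm of u - v."""
    d = [a - b for a, b in zip(u, v)]
    return inner(d, d)

# Assumed data: two orthonormal vectors in R^3 and a point x.
r = 1 / math.sqrt(2)
es = [[r, r, 0.0], [r, -r, 0.0]]
x = [3.0, 1.0, 2.0]

c = [inner(x, e) for e in es]          # Fourier coefficients c_i = (x, e_i)
a = [c[0] + 0.5, c[1] - 0.25]          # some other choice of coefficients a_i

lhs = dist_sq(combo(a, es), x)         # ||s_n - x||^2 computed directly
rhs = (inner(x, x) - sum(ci ** 2 for ci in c)
       + sum((ai - ci) ** 2 for ai, ci in zip(a, c)))   # right side of (8.3.2)
best = dist_sq(combo(c, es), x)        # distance when a_i = c_i
```

Here `lhs` and `rhs` agree up to rounding, and `best` does not exceed `lhs`, in line with the statement that the Fourier coefficients give the best approximation.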

Bessel's inequality

We can make another deduction from (8.3.2) by noting that $\|s_n - x\|^2 \ge 0$. If we put $a_i = c_i$ we obtain

$$\sum_{k=1}^{n} c_k^2 \le \|x\|^2.$$

If we now define $\sum_{j \in J} c_j^2$ to be the supremum of $\sum_{j \in I} c_j^2$ for all finite subsets $I \subset J$ we find that

$$\sum_{j \in J} c_j^2 \le \|x\|^2, \tag{8.3.3}$$

and this is known as Bessel's inequality. It follows as an immediate corollary that at most countably many Fourier coefficients of a point in $H$ can be non-zero.

Theorem 8.3. If $\{e_j\}$ $(j \in J)$ is an orthonormal family in a Hilbert space, each of the following conditions is equivalent to $\{e_j\}$ being a complete family:

(i) $\sum_{j \in J} c_j^2 = \|x\|^2$ for every $x \in H$ (Parseval), where $\{c_j\}$ are the Fourier coefficients of $x$;

(ii) The finite partial sums $s_I = \sum_{k \in I} c_k e_k$ of the Fourier series of $x$ converge to $x$ in norm for all $x \in H$.

Note. For any arbitrary index set $J$ we say that $\sum_{j \in J} x_j$ converges in norm to $x$ if, given $\epsilon > 0$, there is a finite set $K$ such that if $I$ is finite and $K \subset I \subset J$ then

$$\Big\|\sum_{j \in I} x_j - x\Big\| < \epsilon.$$

It is easy to check that, when $J$ is countable and the $x_j$ are real so that we have a real series this notion reduces to the usual definition of absolute convergence.

Proof. The conditions (i) and (ii) are clearly equivalent by (8.3.2). Now suppose that (ii) is satisfied. Then any $x$ can be approximated in norm by a finite sum $s_n$ which is a linear combination of $e_1, e_2, \dots, e_n$. Hence, each $x$ is in the closed linear span of $\{e_j\}$ and the family must form a complete system.

Conversely suppose $\{e_j\}$ is a complete system. Then given $\epsilon > 0$, $x \in H$, there is a finite sum $y = \sum_{i=1}^{N} a_i e_i$ for which $\|x - y\| < \epsilon$. But, if $s_N$ is the corresponding partial Fourier sum, we know

$$\|x - y\|^2 \ge \|x - s_N\|^2,$$

so that, by (8.3.2)

$$\sum_{i=1}^{N} c_i^2 \ge \|x\|^2 - \epsilon^2.$$

Since $\epsilon$ is arbitrary, we can combine this with (8.3.3) to give $\sum_{j \in J} c_j^2 = \|x\|^2$. ]

From (8.3.3) we know that a given set $\{\beta_j\}$ $(j \in J)$ of real numbers can only be the Fourier coefficients of a point in $H$ if $\sum_{j \in J}\beta_j^2$ converges. It turns out that this condition is sufficient as well as necessary.

Theorem 8.4 (Riesz-Fischer). Let $\{e_j\}$ $(j \in J)$ be any orthonormal system (not necessarily complete) in a Hilbert space $H$, and let $\{\beta_j\}$ $(j \in J)$ be any set of real numbers such that $\sum_{j \in J}\beta_j^2$ converges. Then there is a point $x \in H$ with Fourier coefficients $\beta_j = (x, e_j)$ such that the finite partial sums $s_I = \sum_{j \in I}\beta_j e_j$ converge to $x$ in norm.

Proof. Since $\sum_{j \in J}\beta_j^2$ converges the set of $j$ for which $\beta_j \ne 0$ is countable and we may suppose then that these are renamed $\beta_1, \beta_2, \dots, \beta_n, \dots$ (it may be only a finite set). Put

$$s_n = \sum_{i=1}^{n}\beta_i e_i.$$

Then

$$\|s_{n+p} - s_n\|^2 = \sum_{i=n+1}^{n+p}\beta_i^2.$$

Since $\sum \beta_i^2$ converges, it follows that $\{s_n\}$ is a Cauchy sequence in norm. Since $H$ is complete, there must be an $x \in H$ such that $s_n \to x$ in norm. Further

$$(x, e_i) = (s_n, e_i) + (x - s_n, e_i) = \beta_i + (x - s_n, e_i) \quad \text{for } n \ge i$$

and, by exercise 8.1 (1),

$$|(x - s_n, e_i)| \le \|e_i\|\,\|x - s_n\| = \|x - s_n\| \to 0 \quad \text{as } n \to \infty.$$

Since $(x, e_i)$ is independent of $n$, we have $\beta_i = (x, e_i)$ for all $i$. ]

Corollary. An orthonormal system $\{e_j\}$ $(j \in J)$ in a Hilbert space $H$ forms a complete system if and only if the only point $x \in H$ which is orthogonal to all the $\{e_j\}$ is the point $x = 0$.

Proof. Suppose $\{e_j\}$ is a complete orthonormal set and $(x, e_j) = 0$ for all $j$. Then all the Fourier coefficients of $x$ are zero and so

$$\|x\|^2 = \sum_{j \in J} c_j^2 = 0.$$

Conversely suppose $\{e_j\}$ is not a complete system. Then by theorem 8.3 (i), there is a point $x \in H$ with

$$\|x\|^2 > \sum_{j \in J} c_j^2, \quad \text{where } c_j = (x, e_j).$$

By theorem 8.4, there is a $y \in H$ such that

$$c_j = (y, e_j), \qquad \|y\|^2 = \sum_{j \in J} c_j^2.$$

Then the point $(x - y) \in H$ is orthogonal to all the $e_j$. But $\|x - y\| \ne 0$, since $\|x\| > \|y\|$. ]

Remark. If the Hilbert space is separable we have already observed that any orthonormal system is countable. For a separable Hilbert space, therefore, it is natural to state and prove theorem 8.4 and corollary in terms of an arbitrary orthonormal sequence. No essential modifications to the proof are needed.

The space $l_2$

The class of all infinite sequences $c_1, c_2, \dots, c_n, \dots$ of real numbers for which $\sum_{k=1}^{\infty} c_k^2$ converges is called the space $l_2$. By using the discrete version of theorem 7.7 one can check that if $\{c_i\}, \{d_i\} \in l_2$,

$$(c, d) = \sum_{i=1}^{\infty} c_i d_i \tag{8.3.4}$$

converges and defines an inner product. Alternatively, if $\Omega$ is the set of positive integers, $\mu$ is the counting measure, $c_i = f(i)$, then $f \in \mathcal{L}_2(\Omega, \mu)$ if and only if $\sum_{k=1}^{\infty} c_k^2 < \infty$, and (8.3.4) is the usual inner product $(f, g) = \int fg\,d\mu$ in $\mathcal{L}_2$. Completeness and separability for $l_2$ can be proved directly, or they can be deduced from the corresponding properties of $\mathcal{L}_2(\Omega, \mu)$. Thus $l_2$ is a separable Hilbert space according to our definition, and historically $l_2$ was the space first considered in detail.

The justification for our abstract definition of Hilbert space is contained in the following theorem.

Theorem 8.5. An $n$-dimensional Hilbert space is isomorphic to $\mathbf{R}^n$. Any separable infinite-dimensional Hilbert space is isomorphic to $l_2$.

Proof. The finite-dimensional case was considered earlier. If $H$ is separable and infinite-dimensional we can select a complete orthonormal sequence $\{e_n\}$ and obtain the Fourier coefficients $\{c_n\}$ of a point $x \in H$. Then since $x \in H \Rightarrow \sum c_n^2 < \infty$, this defines a mapping from $H$ to $l_2$. Conversely, every sequence in $l_2$ determines a point in $H$ with these Fourier coefficients, by theorem 8.4. There is only one such point by the corollary to theorem 8.4. Thus to prove that we have an isomorphism it is sufficient to prove that the linear structure and inner product are preserved by the mapping. Suppose $x^{(1)}, x^{(2)} \in H$ correspond to $\{c_i^{(1)}\}, \{c_i^{(2)}\} \in l_2$. Then it is immediate that

$$a x^{(1)} \leftrightarrow \{a c_i^{(1)}\} \quad (a \in \mathbf{R}); \qquad x^{(1)} + x^{(2)} \leftrightarrow \{c_i^{(1)} + c_i^{(2)}\};$$

$$\|x^{(1)}\|^2 = \sum (c_i^{(1)})^2, \qquad \|x^{(2)}\|^2 = \sum (c_i^{(2)})^2,$$

$$\|x^{(1)} + x^{(2)}\|^2 = \|x^{(1)}\|^2 + 2(x^{(1)}, x^{(2)}) + \|x^{(2)}\|^2 = \sum (c_i^{(1)})^2 + 2\sum c_i^{(1)} c_i^{(2)} + \sum (c_i^{(2)})^2,$$

so that $(x^{(1)}, x^{(2)}) = \sum c_i^{(1)} c_i^{(2)}$. ]

Corollary. If $(\Omega, \mathcal{F}, \mu)$ is any measure space with a countable basis, then $\mathcal{L}_2(\Omega, \mu)$ is either finite-dimensional, when it is isomorphic to a Euclidean space $\mathbf{R}^n$, or it is infinite-dimensional when it is isomorphic to $l_2$. If $(\Omega_1, \mathcal{F}_1, \mu_1)$, $(\Omega_2, \mathcal{F}_2, \mu_2)$ are such that $\mathcal{L}_2(\Omega_1, \mu_1)$ and $\mathcal{L}_2(\Omega_2, \mu_2)$ are both infinite-dimensional and separable, then $\mathcal{L}_2(\Omega_1, \mu_1)$ and $\mathcal{L}_2(\Omega_2, \mu_2)$ are isomorphic.

The theorems of this section were first obtained for trigonometric series of functions $f$ in $\mathcal{L}_2([-\pi, \pi], \mu)$ where $\mu$ is Lebesgue measure. In order to obtain these special theorems one only has to prove that the functions

$$\frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}}\cos x,\ \frac{1}{\sqrt{\pi}}\sin x,\ \dots,\ \frac{1}{\sqrt{\pi}}\cos nx,\ \frac{1}{\sqrt{\pi}}\sin nx,\ \dots$$

form a complete orthonormal sequence in this $\mathcal{L}_2$ space. The steps necessary for this proof are contained in exercise 8.3 (2).

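Exercises 8.3

1. Prove that a series $\sum_{n=1}^{\infty} a_n$ of real terms converges absolutely to $s$ if and only if, for each $\epsilon > 0$ there is a finite set $I \subset Z$ such that for every finite $K$ with $I \subset K \subset Z$ we have $\big|s - \sum_{n \in K} a_n\big| < \epsilon$.

2. If $\Omega = [-\pi, \pi]$, $\mu$ is Lebesgue measure, $f \in \mathcal{L}_2(\Omega, \mu)$,

$$a_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos mx\,dx \qquad (m = 0, 1, 2, \dots);$$

$$b_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin mx\,dx \qquad (m = 1, 2, \dots),$$

then the $a_m$, $b_m$ are the classical Fourier coefficients of $f$. Prove:

(i) $\tfrac12 a_0^2 + \sum_{m=1}^{\infty}(a_m^2 + b_m^2) \le \dfrac{1}{\pi}\displaystyle\int_{-\pi}^{\pi}\{f(x)\}^2\,dx$;

(ii) if $\{a_n\}$, $\{b_n\}$ are such that

$$\tfrac12 a_0^2 + \sum_{m=1}^{\infty}(a_m^2 + b_m^2) < \infty,$$

then there is a function $f \in \mathcal{L}_2$ for which these are the Fourier coefficients and such that

$$s_n(x) = \tfrac12 a_0 + \sum_{m=1}^{n}(a_m\cos mx + b_m\sin mx) \to f$$

in second mean;

(iii) if $\{a_n\}$, $\{b_n\}$ are the Fourier coefficients of $f$ in the above sense,

$$s_n(x) = \tfrac12 a_0 + \sum_{m=1}^{n}(a_m\cos mx + b_m\sin mx),$$

$$\sigma_n(x) = \frac{1}{n+1}\,[s_0(x) + s_1(x) + \dots + s_n(x)],$$

then $\sigma_n(x) \to f(x)$ in $\mathcal{L}_2$ norm (and in fact $\sigma_n \to f$ a.e.);

(iv) $\dfrac{1}{\pi}\displaystyle\int_{-\pi}^{\pi}\sigma_n^2(x)\,d\mu = \tfrac12 a_0^2 + \sum_{k=1}^{n}(a_k^2 + b_k^2)\Big(\frac{n+1-k}{n+1}\Big)^2$;

(v) since

$$\frac{1}{\pi}\int_{-\pi}^{\pi}\sigma_n^2(x)\,d\mu \to \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,d\mu$$

and

$$\frac{1}{\pi}\int_{-\pi}^{\pi}\sigma_n^2(x)\,d\mu \le \tfrac12 a_0^2 + \sum_{k=1}^{n}(a_k^2 + b_k^2) \le \tfrac12 a_0^2 + \sum_{k=1}^{\infty}(a_k^2 + b_k^2),$$

we have for $f \in \mathcal{L}_2$

$$\frac{1}{\pi}\int_{-\pi}^{\pi}\{f(x)\}^2\,d\mu = \tfrac12 a_0^2 + \sum_{k=1}^{\infty}(a_k^2 + b_k^2);$$

(vi) the trigonometric system of exercise 8.2 (2) is complete.

3. If $\{\phi_n\}$ is any orthonormal sequence in a Hilbert space $H$, then for $x \in H$, $(x, \phi_n) \to 0$ as $n \to \infty$.

4. If $f \in \mathcal{L}_2([-\pi, \pi], \mu)$ then, as $n \to \infty$,

$$\int_{-\pi}^{\pi} f(x)\sin nx\,dx \to 0, \qquad \int_{-\pi}^{\pi} f(x)\cos nx\,dx \to 0.$$

5. If $l_p$ is the space of real sequences $\{x_i\}$ for which

$$\sum |x_i|^p < \infty \quad \text{for } 1 \le p < \infty,$$

show that $l_p$ is separable. $l_\infty$ is the space of bounded real sequences with $\|x\| = \sup |x_i|$. By considering sequences of 0's and 1's show that $l_\infty$ is not separable. Deduce that $\mathcal{L}_\infty$ is not separable if $\Omega$ has infinite but $\sigma$-finite measure.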

8.4* Space of linear functionals

We start by defining a more general type of normed linear space.

Banach space

Any normed linear space over the reals which is complete in the topology determined by the norm is called a (real) Banach space. We saw already that the $\mathcal{L}_p$ $(1 \le p \le +\infty)$ spaces are normed linear spaces and that each of them is complete in the norm topology so that each $\mathcal{L}_p$ is a Banach space. Euclidean $n$-space $\mathbf{R}^n$ with the usual metric provides a simpler example of a Banach space. $C[a, b]$, the space of continuous functions on a finite closed interval, with

$$\|f\| = \sup |f(x)|,$$

can also be easily seen to be a Banach space.

From our definition of Hilbert space, it follows that any Banach space will be a Hilbert space provided there is an inner product defined satisfying $\|f\|^2 = (f, f)$. The question immediately arises as to whether or not all Banach spaces are Hilbert spaces; or is it always possible to define an inner product in a Banach space? We can settle this as follows. If there is to be an inner product, then

$$\|f + g\|^2 = (f + g, f + g) = (f, f) + 2(f, g) + (g, g),$$

$$\|f - g\|^2 = (f - g, f - g) = (f, f) - 2(f, g) + (g, g),$$

so that on adding

$$\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2. \tag{8.4.1}$$
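As a quick numerical illustration of (8.4.1), the sketch below evaluates the difference between its two sides for a pair of vectors in $\mathbf{R}^2$, once with the Euclidean norm and once with the $p$-norm for $p = 1$. The use of $\mathbf{R}^2$ with the $p$-norm as a finite-dimensional stand-in, the function names, and the chosen vectors are assumptions made only for this example.

```python
def p_norm(v, p):
    """The p-norm of a finite real sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def parallelogram_defect(f, g, p):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2 in the p-norm."""
    s = [a + b for a, b in zip(f, g)]
    d = [a - b for a, b in zip(f, g)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(f, p) ** 2 - 2 * p_norm(g, p) ** 2)

# Assumed test vectors: the two unit coordinate vectors of R^2.
f, g = [1.0, 0.0], [0.0, 1.0]
defect_2 = parallelogram_defect(f, g, 2)   # Euclidean norm: identity holds
defect_1 = parallelogram_defect(f, g, 1)   # p = 1: identity fails
```

For these vectors the defect vanishes (up to rounding) when $p = 2$ and equals 4 when $p = 1$, so the identity cannot hold for every pair of vectors in the 1-norm.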

Thus, a relation (8.4.1) for all $f$, $g$ in the space is a necessary condition for the Banach space to have an inner product. One can also easily check that, if (8.4.1) is always satisfied, then

$$(f, g) = \tfrac12\{\|f + g\|^2 - \|f\|^2 - \|g\|^2\} \tag{8.4.2}$$

satisfies all the conditions for an inner product, so that any Banach space which satisfies (8.4.1) is a Hilbert space if we define the inner product by (8.4.2). We can think of the condition (8.4.1) as a generalisation of the Euclidean theorem that in any parallelogram the sum of the squares on the diagonals is twice the sum of the squares on two adjacent sides. If this theorem is not valid in the Banach space $K$, then it is not possible to define an inner product on $K$. This allows us to show that $\mathcal{L}_p$ is not a Hilbert space for $p \ne 2$; see exercise 8.4 (2).

Linear functional

Given a linear space $K$ over the reals a function $T: K \to \mathbf{R}$ is called a linear functional if, for all $x_1, x_2 \in K$, $\alpha, \beta \in \mathbf{R}$,

$$T(\alpha x_1 + \beta x_2) = \alpha T(x_1) + \beta T(x_2).$$

If $K$ is a normed linear space, then $T$ is continuous at $x_0 \in K$ if, given $\epsilon > 0$, there is a $\delta > 0$ with

$$\|x - x_0\| < \delta \implies |T(x) - T(x_0)| < \epsilon.$$

For a normed $K$ the functional is said to be bounded if there is a real constant $C$ such that

$$|T(x)| \le C\|x\| \quad \text{for all } x \in K.$$

It is immediate that a linear functional on a normed linear space is continuous everywhere if it is continuous at any one point. The connexion between continuity and boundedness is not quite so obvious.

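Lemma. A linear functional $T$ on a normed linear space $K$ is continuous if and only if it is bounded.

Proof. Suppose first that $T$ is bounded; then given $x_1 \in K$, $\epsilon > 0$, put $\delta = \epsilon C^{-1}$ and we have

$$|T(x) - T(x_1)| = |T(x - x_1)| \le C\|x - x_1\| < \epsilon \quad \text{if } \|x - x_1\| < \delta.$$

Conversely, if $T$ is continuous at $0 \in K$ we can choose $B > 0$ such that

$$|T(x)| < 1 \quad \text{for } \|x\| \le \frac{1}{B}.$$

Then for $x \in K$, $x \ne 0$,

$$\left\| \frac{x}{B\|x\|} \right\| = \frac{1}{B}, \quad \text{so that} \quad |T(x)| = B\|x\| \left| T\!\left( \frac{x}{B\|x\|} \right) \right| < B\|x\|,$$

and $T$ is bounded.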


Norm of a bounded functional

If $T$ is a bounded linear functional on a normed linear space $K$, the smallest number $C$ satisfying $|T(x)| \le C\|x\|$ for all $x \in K$ is called the norm of $T$ and we denote it by $\|T\|$. Because of linearity,

$$\|T\| = \sup_{x \ne 0} \frac{|T(x)|}{\|x\|} = \sup_{\|x\| = 1} |T(x)|.$$
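The agreement of the two suprema can be checked numerically in a small case. The sketch below is only an illustration under assumptions of our own: $K = R^3$ with the norm $\|x\| = \max_i |x_i|$, and the functional $T(x) = a \cdot x$ for an arbitrarily chosen vector `a`. Since $|T|$ is convex, its maximum over the unit ball of this norm is attained at a vertex, so only the eight points with coordinates $\pm 1$ need be examined.

```python
from itertools import product

def T(a, x):
    """The linear functional T(x) = a . x on R^n (an illustrative choice)."""
    return sum(ai * xi for ai, xi in zip(a, x))

def sup_norm(x):
    """The norm ||x|| = max |x_i| on R^n."""
    return max(abs(xi) for xi in x)

a = [2.0, -3.0, 0.5]

# Vertices of the unit ball of the sup norm: all coordinates equal to +1 or -1.
vertices = list(product([-1.0, 1.0], repeat=len(a)))

# sup of |T(x)| over ||x|| = 1, evaluated at the vertices.
norm_on_sphere = max(abs(T(a, v)) for v in vertices)

# sup of |T(x)| / ||x|| over the same directions rescaled away from the sphere.
scaled = [[7.5 * vi for vi in v] for v in vertices]
norm_by_ratio = max(abs(T(a, x)) / sup_norm(x) for x in scaled)

print(norm_on_sphere, norm_by_ratio)
```

Both quantities equal $|a_1| + |a_2| + |a_3|$ here; the rescaling by 7.5 is arbitrary and only illustrates that the ratio $|T(x)|/\|x\|$ depends on the direction of $x$ alone.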

If $T_1, T_2$ are two linear functionals on a linear space $K$, and $\alpha, \beta \in R$, then

$$(\alpha T_1 + \beta T_2)(x) = \alpha T_1(x) + \beta T_2(x)$$

is also a linear functional on K; and the set of all linear functionals on K is a linear space. We can say more if K is normed.

Lemma. If $K^*$ denotes the set of bounded linear functionals on a normed linear space $K$, then $K^*$ is a Banach space.

Proof. $T \in K^*$, $\|T\| = 0$ implies that $T(x) = 0$ for all $x \in K$, which means that $T$ is the null transformation. Further,

$$\|T_1 + T_2\| = \sup_{\|x\|=1} |T_1(x) + T_2(x)| \le \sup_{\|x\|=1} |T_1(x)| + \sup_{\|x\|=1} |T_2(x)| = \|T_1\| + \|T_2\|$$

and

$$\|\alpha T\| = \sup_{\|x\|=1} |\alpha T(x)| = |\alpha| \sup_{\|x\|=1} |T(x)| = |\alpha| \, \|T\|.$$

This shows that $K^*$ is a normed linear space with the norm

$$\|T\| = \sup_{\|x\|=1} |T(x)|.$$

It remains to show that $K^*$ is complete. Suppose $\{T_n\}$ is a sequence in $K^*$ such that

$$\|T_m - T_n\| \to 0 \quad \text{as } m, n \to \infty.$$

Then, for each $x \in K$, $|T_m(x) - T_n(x)| \to 0$ as $m, n \to \infty$. The completeness of $R$ now implies that there is a real number

$$y = T(x) = \lim_{n \to \infty} T_n(x).$$

Now $T$ is clearly a linear transformation and since

$$\big| \|T_m\| - \|T_n\| \big| \le \|T_m - T_n\|$$

the real sequence $\{\|T_n\|\}$ must be bounded, say by $C$. Then

$$|T_n(x)| \le C\|x\| \quad \text{for all } x \in K$$

and all integers $n$, so that $|T(x)| \le C\|x\|$ and $T$ is a bounded linear functional, that is $T \in K^*$. Given $\epsilon > 0$, there is an integer $N = N(\epsilon)$ such that

$$|T_m(x) - T_n(x)| < \epsilon \quad \text{for } \|x\| = 1,\ x \in K,\ m, n \ge N.$$

If we let $m \to \infty$, then

$$|T(x) - T_n(x)| \le \epsilon \quad \text{for } \|x\| = 1 \quad (n \ge N),$$

so that $\|T - T_n\| \to 0$ as $n \to \infty$, and $K^*$ is complete.

Conjugate space

For any normed linear space $K$ (in particular if $K$ is a Banach space), the Banach space $K^*$ of bounded linear functionals on $K$ is called the conjugate space (or dual space) of $K$.

Linear subspace

A set $H$ contained in a linear space $K$ such that $H$ is itself a linear space is called a linear subspace of $K$.

If $K$ contains a point $x \ne 0$, it is clear that the set of all points $\alpha x$, $\alpha \in R$ is a linear subspace $H$ of $K$. Then, if we put $T(\alpha x) = \alpha$ for all $\alpha x \in H$, it is immediate that $T \ne 0$ is a linear functional on $H$. However, it is not immediately obvious that the set of bounded linear functionals defined on the whole of $K$ contains any $T \ne 0$. The existence of such a non-trivial $T$ will follow if we can prove that linear functionals defined on a subspace can always be extended to the whole of $K$.

Theorem 8.6 (Hahn-Banach extension). Suppose $K$ is a linear subspace of a normed linear space $H$. Then any bounded linear functional on $K$ can be extended to a bounded linear functional on $H$ with the same norm.

Proof. Suppose $f: K \to R$ is the given functional and

$$a = \sup_{x \in K,\ x \ne 0} \frac{|f(x)|}{\|x\|}.$$

Then $|f(x)| \le a\|x\|$ for all $x$ in $K$. Consider the class $\mathscr{C}$ of all linear functionals $T$ defined on spaces $J$ such that (i) $K \subset J \subset H$; (ii) $T(x) = f(x)$ for $x \in K$; (iii) $T(x) \le a\|x\|$ for $x \in J$. We can partially order $\mathscr{C}$ by putting $g_1 < g_2$ if $g_1$ is defined on $J_1$, $g_2$ on $J_2$, $K \subset J_1 \subset J_2 \subset H$ and

$$g_1(x) = g_2(x) = f(x) \quad \text{for } x \in K,$$

$$g_1(x) = g_2(x) \quad \text{for } x \in J_1.$$

By Zorn's lemma (§ 1.6) we can find a maximal element in this partial ordering. This must be an extension $T$ of $f$ defined on a subspace $J \subset H$ such that no further extension to a larger subspace is possible. It is clearly sufficient to show that, for this maximal $T \in \mathscr{C}$, we must have $J = H$. Suppose not, then there is a point $z \in H - J$. We will obtain a contradiction by showing that $T$ can be extended to the linear space $J_z$ consisting of all points of the form $j + \alpha z$, $j \in J$, $\alpha \in R$. Note first that, since $z \notin J$, the representation $(j + \alpha z)$ is unique. The extension to $J_z$ will therefore be determined by its value at $z$. Now if $x, y \in J$,

$$T(x) - T(y) = T(x - y) \le a\|x - y\| \le a\|x + z\| + a\|-y - z\|$$

so that

$$-a\|-y - z\| - T(y) \le a\|x + z\| - T(x).$$

Hence

$$\sup_{y \in J} [-a\|-y - z\| - T(y)] \le \inf_{x \in J} [a\|x + z\| - T(x)].$$

Let $t$ be any real number satisfying

$$\sup_{y \in J} [-a\|-y - z\| - T(y)] \le t \le \inf_{x \in J} [a\|x + z\| - T(x)], \tag{8.4.3}$$

and put $T(z) = t$: this implies $T(k + \alpha z) = T(k) + \alpha t$ for $k \in J$. Now replace both $x$ and $y$ by $x/\alpha$ in (8.4.3) and let $w = x + \alpha z$:

$$-a\left\| -\frac{w}{\alpha} \right\| - T\!\left(\frac{x}{\alpha}\right) \le t \le a\left\| \frac{w}{\alpha} \right\| - T\!\left(\frac{x}{\alpha}\right).$$

If $\alpha > 0$ multiply the right-hand inequality by $\alpha$, while if $\alpha < 0$ multiply the left-hand inequality by $\alpha$. Both cases give

$$a\|w\| - T(x) \ge \alpha t$$

so that

$$T(w) \le a\|w\|$$

and $T \in \mathscr{C}$. Since $J$ is a proper subset of $J_z$ this establishes the existence of the extension. To see that the extension $F$ has the same norm as $f$ we need only note that $|F(x)| \le a\|x\|$ for all $x$ in $H$, so that $\|F\| \le a$, while

$$\|F\| = \sup_{x \in H,\ \|x\|=1} |F(x)| \ge \sup_{x \in K,\ \|x\|=1} |f(x)| = a.$$
Remark. In the above theorem, the only property of the norm which we used was that $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in H$. It is possible to state and prove the extension theorem in terms of any subadditive bounding functional. This gives

Theorem 8.6A (Hahn-Banach extension). Suppose $K$ is a linear subspace of a linear space $H$, $p$ is a subadditive functional on $H$ such that $p(\alpha x) = \alpha p(x)$ for $\alpha \ge 0$, $x \in H$; and $f$ is a linear functional on $K$ such that $f(x) \le p(x)$ for all $x \in K$. Then there is a linear functional $\tilde{f}: H \to R$ such that

$$\tilde{f}(x) = f(x) \quad \text{for } x \in K,$$

$$\tilde{f}(x) \le p(x) \quad \text{for } x \in H.$$


Exercises 8.4

1. If $(\Omega, \mathscr{F}, \mu)$ is such that there are two sets $E_1, E_2 \in \mathscr{F}$ with $\mu(E_1)$, $\mu(E_2)$ positive and finite, show by considering the indicator functions of $E_1$, $E_2$ that $\mathscr{L}_p$ does not satisfy (8.4.1) if $p \ne 2$; and therefore is not a Hilbert space.

2. Prove that $m$, the set of bounded real sequences $\{x_i\}$, is a Banach space with $\|x\| = \sup_i |x_i|$.

3. Suppose $K$ is a Banach space, $K^*$ is its dual, and $K^{**}$ is the dual of $K^*$. Prove: (i) if $x$ is a fixed element in $K$, $X(f) = f(x)$ for $f \in K^*$ defines a linear functional on $K^*$; (ii) for the above function $\|X\| = \|x\|$, so that $T(x) = X$ is a norm preserving map from $K$ to $K^{**}$; (iii) this map $T$ preserves the linear structure; (iv) the set of elements $X$ of $K^{**}$ such that $X(f) = f(x)$ for some $x \in K$ forms a closed linear subspace of $K^{**}$.

4. If $K$ is a linear subspace of a Banach space $H$ show that a point $y$ of $H$ is in the closure of $K$ if and only if $f(y) = 0$ for every linear functional $f \in H^*$ which vanishes on $K$.

5. In §4.4 we showed that it is not possible to define a measure on $[0, 1)$ which is defined for all subsets and invariant for translations (mod 1). The following steps will show that we can define such a finitely additive set function $\nu$ on all subsets of $[0, 1)$ with $\nu[0, 1) = 1$ and $\nu(E) = |E|$ when $E$ is Lebesgue measurable.
(i) Let $H$ be the set of all bounded functions $f: [0, 1) \to R$ which are extended to be periodic in the whole of $R$ by $f(x + 1) = f(x)$. Prove $H$ is a linear space.
(ii) Put

$$M(f; \alpha_1, \alpha_2, \ldots, \alpha_n) = \sup_{x \in R} \frac{1}{n} \sum_{i=1}^{n} f(x + \alpha_i),$$

$$p(f) = \inf M(f; \alpha_1, \ldots, \alpha_n)$$

for all such finite sequences of real $\alpha_i$. Prove that $p$ is subadditive and $p(af) = a\,p(f)$ for $a \ge 0$.
(iii) If $f: [0, 1) \to R$ is bounded and Lebesgue measurable, show that the Lebesgue integral $I(f) = \int_0^1 f\,dx$ satisfies $I(f) \le M(f; \alpha_1, \ldots, \alpha_n)$.
(iv) Show that the set of bounded measurable $f: [0, 1) \to R$ is a linear subspace of $H$, and $I(f)$ is a bounded linear functional on this subspace.
(v) Use theorem 8.6A to extend $I$ to a linear functional $F$ defined on all of $H$.
(vi) Show that $F\{f(x + x_0)\} = F\{f(x)\}$ for all $x_0 \in R$.
(vii) By considering indicator functions of subsets of $[0, 1)$, put $\nu(E) = F(\chi_E)$ for $E \subset [0, 1)$. Prove that $\nu(E_1 \cup E_2) = \nu(E_1) + \nu(E_2)$ if $E_1, E_2$ are disjoint, that $\nu(E) = |E|$ if $E$ is Lebesgue measurable, and that $\nu(E)$ is invariant under translations.

6. Similar arguments to those used in (5) above can be applied to the linear space $V$ of bounded real functions $f: [0, +\infty) \to R$. Show that there exists a bounded linear functional $\operatorname{Lim} f(x)$ on $V$ such that, for $a, b \in R$, (i) $\operatorname{Lim}\{af(x) + bg(x)\} = a \operatorname{Lim} f(x) + b \operatorname{Lim} g(x)$; (ii) $f(x) \ge 0$ in $[0, \infty) \Rightarrow \operatorname{Lim} f(x) \ge 0$; (iii) $\operatorname{Lim} f(x + x_0) = \operatorname{Lim} f(x)$ for any $x_0 \ge 0$; (iv) $\operatorname{Lim} f(x) = \lim_{x \to \infty} f(x)$ if this exists.

Deduce a corresponding result for the space m of bounded real sequences.

8.5* The space conjugate to $\mathscr{L}_p$

We have seen that $\mathscr{L}_p$ ($1 \le p \le +\infty$) is a Banach space and, a fortiori, a normed linear space. It follows from the lemma on p. 211 that the space of bounded linear functionals on $\mathscr{L}_p$ is also a Banach space. The object of the present section is to identify these conjugate spaces at least up to an isomorphism.

Theorem 8.7. Suppose $(\Omega, \mathscr{F}, \mu)$ is a $\sigma$-finite measure space and $\mathscr{L}_p$, $1 \le p < \infty$, is the linear space of $\mathscr{F}$-measurable functions $f: \Omega \to R^*$ whose $p$th power is integrable, with the usual norm

$$\|f\|_p = \left\{ \int |f|^p \, d\mu \right\}^{1/p}.$$

Let $1/p + 1/q = 1$ (if $p = 1$, $q = \infty$). Then
(i) for each $g \in \mathscr{L}_q$,

$$F(f) = \int f g \, d\mu$$

defines a linear functional on $\mathscr{L}_p$;
(ii) given any bounded linear functional $F$ on $\mathscr{L}_p$, there is a $g \in \mathscr{L}_q$ such that

$$F(f) = \int f g \, d\mu,$$

and in this case

$$\|F\| = \left\{ \int |g|^q \, d\mu \right\}^{1/q} \quad \text{if } p > 1,$$

$$\|F\| = \operatorname{ess\,sup} |g| \quad \text{if } p = 1.$$

Proof. (i) This follows immediately from theorem 7.7 and the linearity properties of the integral; for

$$|F(f)| \le \left\{ \int |f|^p \, d\mu \right\}^{1/p} \left\{ \int |g|^q \, d\mu \right\}^{1/q}.$$


(ii) Suppose first that $\mu(\Omega) < \infty$ and $F$ is a bounded linear functional on $\mathscr{L}_p$. For any measurable $E \subset \Omega$ put

$$\sigma(E) = F(\chi_E),$$

where $\chi_E$ is the indicator function of $E$. The linearity of $F$ implies immediately that $\sigma$ is finitely additive. Now suppose $E = \bigcup_{i=1}^{\infty} E_i$, $E_i$ disjoint. Then $\mu\left(\bigcup_{i=1}^{N} E_i\right) \to \mu(E)$ as $N \to \infty$, so that in $\mathscr{L}_p$,

$$\|\chi_{Q_N} - \chi_E\| \to 0, \quad \text{where } Q_N = \bigcup_{i=1}^{N} E_i.$$

Since $F$ is continuous, we must have

$$\sum_{i=1}^{\infty} \sigma(E_i) = \lim_{N \to \infty} \sum_{i=1}^{N} \sigma(E_i) = \sigma(E)$$

so that $\sigma$ is completely additive on $\mathscr{F}$. Further $\mu(E) = 0 \Rightarrow \sigma(E) = 0$, so that $\sigma$ is absolutely continuous with respect to $\mu$. By theorem 6.7 it now follows that there exists a function $g$ which is integrable, such that for all $E \in \mathscr{F}$,

$$F(\chi_E) = \sigma(E) = \int_E g \, d\mu = \int \chi_E \, g \, d\mu.$$

This gives the required representation for $F$ on the class of indicator functions of measurable sets. We must prove that $g \in \mathscr{L}_q$, and that the representation is valid on the whole of $\mathscr{L}_p$.

It is clear from linearity that the representation is valid for $\mathscr{F}$-simple functions. If $f_0 \in \mathscr{L}_p$, $f_0 \ge 0$, we can find a sequence $f_n$ of simple functions which increases monotonely to $f_0$. Then by theorem 7.6, $f_n \to f_0$ in $p$th mean, and by the continuity of $F$

$$F(f_0) = \lim_{n \to \infty} F(f_n) = \lim_{n \to \infty} \int f_n g \, d\mu = \int f_0 g \, d\mu$$

on applying the monotone convergence theorem to $\{f_n g_+\}$ and $\{f_n g_-\}$ separately. The restriction $f_0 \ge 0$ can be removed by considering $f_+$ and $f_-$ separately, so that

$$F(f) = \int f g \, d\mu \quad \text{for all } f \in \mathscr{L}_p.$$

Now suppose $p > 1$, and $g(t)$ has been obtained by the above process. Put

$$g_n(t) = \begin{cases} |g(t)|^{q-1} \operatorname{sign} g(t) & \text{if } |g(t)|^{q-1} \le n, \\ n \operatorname{sign} g(t) & \text{if } |g(t)|^{q-1} > n. \end{cases}$$

Then each $g_n$ is bounded and measurable, and is therefore in $\mathscr{L}_p$. Hence

$$|F(g_n)| = \left| \int g_n g \, d\mu \right| \le \|F\| \, \|g_n\| = \|F\| \left( \int |g_n|^p \, d\mu \right)^{1/p}.$$

But

$$g_n g = |g_n| \, |g| \ge |g_n| \, |g_n|^{1/(q-1)} = |g_n|^p,$$

so that

$$\int |g_n|^p \, d\mu \le \|F\| \left( \int |g_n|^p \, d\mu \right)^{1/p}$$

and

$$\left( \int |g_n|^p \, d\mu \right)^{1/q} \le \|F\|.$$

But $|g_n|^p \to |g|^q$ a.e. so that, by theorem 5.5,

$$\left( \int |g|^q \, d\mu \right)^{1/q} \le \|F\|. \tag{8.5.1}$$

Before going on to prove equality in (8.5.1), let us now remove the restriction $\mu(\Omega) < \infty$. Suppose $\mu(\Omega) = \infty$, so that there is a sequence $\{Q_n\}$ of disjoint measurable sets with

$$\Omega = \bigcup_{i=1}^{\infty} Q_i, \quad \mu(Q_n) < \infty \text{ for all } n.$$

We can apply the above argument to each of the spaces $(Q_n, \mathscr{F}_n, \mu)$ where $\mathscr{F}_n = \mathscr{F} \cap Q_n$. By the uniqueness of the derivative $g$ in the Radon-Nikodym theorem, if $f \in \mathscr{L}_p$ and vanishes outside $R_N = \bigcup_{i=1}^{N} Q_i$,

$$F(f) = \int f g \, d\mu.$$

But if $f \ge 0$, we can apply the monotone convergence theorem to each of $\{f_n g_+\}$, $\{f_n g_-\}$ where $f_n = f \chi_{R_n}$ to obtain this representation by using the continuity of $F$. The final step is to use $f = f_+ - f_-$ so that the representation is valid on all of $\mathscr{L}_p$. Further (8.5.1) follows since it is true for the integral over each $R_N$. Now by Hölder's inequality (theorem 7.7) we have

$$|F(f)| \le \|f\| \left\{ \int |g|^q \, d\mu \right\}^{1/q}$$

so that

$$\|F\| = \left\{ \int |g|^q \, d\mu \right\}^{1/q}$$

using (8.5.1).
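The equality $\|F\| = \|g\|_q$ for $p > 1$ can be observed numerically when $\Omega$ is a finite set with counting measure, so that the integrals reduce to finite sums. The sketch below is an illustration only; the vector `g`, the exponent `p = 3` and the random trial vectors are arbitrary choices of ours. It evaluates $F(f)/\|f\|_p$ at the extremal function $f = |g|^{q-1} \operatorname{sign} g$ used in the proof, and compares with random $f$.

```python
import math
import random

def p_norm(v, p):
    """The l_p norm (sum |v_i|^p)^(1/p) for counting measure on a finite set."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)            # conjugate exponent, 1/p + 1/q = 1
g = [1.0, -2.0, 0.5, 4.0]

def F(f):
    """The functional F(f) = sum f_i g_i, the discrete analogue of the integral of fg."""
    return sum(fi * gi for fi, gi in zip(f, g))

# Extremal function from the proof: |g|^(q-1) sign g.
f_star = [math.copysign(abs(x) ** (q - 1.0), x) for x in g]
ratio_at_f_star = F(f_star) / p_norm(f_star, p)

# Hölder's inequality bounds the ratio for every other f.
random.seed(1)
trials = [[random.uniform(-1.0, 1.0) for _ in g] for _ in range(1000)]
largest_random_ratio = max(abs(F(f)) / p_norm(f, p) for f in trials)

print(ratio_at_f_star, p_norm(g, q), largest_random_ratio)
```

The first two printed numbers agree to rounding error, and the third does not exceed them, in line with Hölder's inequality.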

We need to modify the argument in the case $p = 1$, assuming that $g$ has been defined as before as the Radon-Nikodym derivative of $\sigma$. For any $t > 0$, let $E$ be a set such that $0 < \mu(E) < \infty$ and $|g(x)| \ge t$ for $x \in E$. Put $f(x) = \chi_E(x) \operatorname{sign} g(x)$ and we have

$$F(f) \ge t\mu(E), \quad \|f\| = \mu(E)$$

so $\|F\| \ge t$. Since such a set $E$ can be found for any $t < \operatorname{ess\,sup} |g|$ we must have

$$\|F\| \ge \operatorname{ess\,sup} |g|.$$

But

$$|F(f)| = \left| \int f g \, d\mu \right| \le (\operatorname{ess\,sup} |g|) \, \|f\|,$$

so that $\|F\| \le \operatorname{ess\,sup} |g|$.

Corollary. If $H$ is a separable Hilbert space: (i) for any fixed $h \in H$, the inner product $F(f) = (f, h)$ defines a bounded linear functional; (ii) for any bounded linear functional $F$ on $H$, there is an $h \in H$ such that $F(f) = (f, h)$ for all $f \in H$: further $\|F\| = \|h\|$.

Proof. Choose a measure space $(\Omega, \mathscr{F}, \mu)$ such that $\mathscr{L}_2(\Omega, \mu)$ is a separable Hilbert space, and so is isomorphic to $H$. Now apply the theorem in the case $p = q = 2$.

Note. One can also construct a direct proof of the Corollary without the restriction that $H$ be separable; the case $p = q = 2$ of theorem 8.7 could then be deduced from this.

Reflexive Banach space

In exercise 8.4 (3) we proved that, for any Banach space $H$, $H^{**} \supset H$ in the sense that $H$ is isomorphic to a Banach subspace of $H^{**}$. Those Banach spaces $H$ for which $H = H^{**}$ are called reflexive. By our representation theorem 8.7, $\mathscr{L}_p$ is reflexive for $1 < p < \infty$. In general, $\mathscr{L}_1$ is not reflexive because $\mathscr{L}_\infty^*$ is bigger than $\mathscr{L}_1$: this will follow from exercises 8.5 (3, 4). In fact very little is known about the structure of $\mathscr{L}_\infty^*$: the difficulty is that the axiom of choice, or something equivalent, is needed to construct $\mathscr{L}_\infty^*$ and this makes it impossible to get a hold on it.

Exercises 8.5

1. Suppose $1 < p < \infty$, $1/p + 1/q = 1$, $f_n \to f$ in $\mathscr{L}_p$ norm, $g_n \to g$ in $\mathscr{L}_q$ norm. Deduce that

$$\int f_n g_n \, d\mu \to \int f g \, d\mu.$$

2. If $\Omega$ is the set of positive integers and $\mu$ is counting measure, then $\mathscr{L}_p$ ($1 \le p < \infty$) reduces to the set of sequences $\{x_i\}$ of real numbers such that $\sum_{i=1}^{\infty} |x_i|^p < \infty$; $\mathscr{L}_\infty$ reduces to the set $m$ of bounded sequences.

3. Let $X = [-1, 1]$, $\mu$ Lebesgue measure. Show that the collection $\mathscr{C}$ of continuous functions $f: X \to R$ is a closed linear subspace of $\mathscr{L}_\infty$ (provided any function $f$ which is equal a.e. to a continuous function is identified with it). Hence, by theorem 8.6, extend the bounded linear functional $F(f) = f(0)$ from $\mathscr{C}$ to $\mathscr{L}_\infty$ without changing its norm. If possible, suppose there is an $f_0$ which is integrable and such that

$$F(f) = \int f f_0 \, d\mu \quad \text{for } f \in \mathscr{L}_\infty.$$

Then, for the special sequence $f_n(x) = (1 - |x|^{1/n})$, we have $F(f_n) = 1$ for all $n$. Show that, for any $f_0 \in \mathscr{L}_1$,

$$\int f_n f_0 \, d\mu \to 0.$$

4. Extend example (3) to show that if $\Omega$ contains a disjoint sequence of measurable sets of finite positive measure, then $\mathscr{L}_1(\Omega, \mu)$ is a proper subspace of $\mathscr{L}_\infty^*$. Deduce that $\mathscr{L}_1$ is not reflexive.

8.6* Mean ergodic theorem

In §7.6 we obtained the point-wise ergodic theorem for functions in $\mathscr{L}_1$. If the function is in $\mathscr{L}_2$ there is an alternative form of this theorem in which point-wise convergence is replaced by convergence in second mean. We saw that any $\mathscr{L}_2$ is a Hilbert space. A measure preserving transformation $T$ on the underlying measure space then leads naturally to a mapping on the Hilbert space to itself which preserves the inner product (and norm). It is therefore possible to state the mean ergodic theorem in terms of the properties of such a mapping in Hilbert space, and deduce the $\mathscr{L}_2$ theorem by considering this as a realization of Hilbert space. However, we choose instead to state and prove it directly as a theorem about the structure of $\mathscr{L}_2$. It helps if we first show that bounded linear functionals on a Banach space can be used to separate a closed linear subspace $K$ from a point not in $K$ (see exercise 8.4 (4)).

Theorem 8.8. Suppose $K$ is a linear subspace of a Banach space $H$, and $y \in H$ with $d(y, K) = \eta > 0$. Then there is a bounded linear functional $F$ on $H$ such that $\|F\| = 1$, $F(y) = \eta$, $F(x) = 0$ for all $x \in K$.

Proof. Let $J$ be the set of points of $H$ of the form

$$x = z + \alpha y, \quad z \in K,\ \alpha \in R.$$

Then $J$ is a linear subspace of $H$ and the representation of points of $J$ in this form is unique. Define a linear functional $f$ on $J$ by $f(z + \alpha y) = \alpha \eta$. Then $f$ vanishes on $K$ and, for $\alpha \ne 0$,

$$\|z + \alpha y\| = |\alpha| \left\| \frac{z}{\alpha} + y \right\| \ge |\alpha| \eta = |f(z + \alpha y)|,$$

so that $\|f\| \le 1$. But if $\{z_n\}$ is a sequence in $K$ for which $\|z_n - y\| \to \eta$ we have

$$\|f\| \, \|z_n - y\| \ge |f(z_n - y)| = |f(z_n) - f(y)| = |f(y)| = \eta$$

so that $\|f\| \ge 1$, on letting $n \to \infty$. Hence $\|f\| = 1$, and $f$ has all the desired properties except that it is only defined on $J$, a linear subspace of $H$. Use theorem 8.6 to extend it to a linear functional $F$ on the whole of $H$ with $\|F\| = \|f\| = 1$.

Corollary. If $(\Omega, \mathscr{F}, \mu)$ is a $\sigma$-finite measure space, and $K$ is a closed linear subspace of $\mathscr{L}_2(\Omega, \mu)$, and $y \in \mathscr{L}_2 - K$, then $y = z + x$ where $z \in K$ and $(x, w) = 0$ for all $w \in K$.

Proof. $\mathscr{L}_2(\Omega, \mu)$ is a Banach space, and $K$ is closed (in the metric $\rho_2$) so that $d(y, K) = \eta > 0$. Find the functional $F$ satisfying the conditions of theorem 8.8 and represent it, by theorem 8.7, as $F(u) = (u, g)$ where $g \in \mathscr{L}_2$. Now put $x = \eta g$, $z = y - x$ so that

$$(x, w) = \eta F(w) = 0 \quad \text{for all } w \in K.$$

It only remains to show that $z \in K$. For $\epsilon > 0$ choose $k \in K$ such that

$$\|k - y\|^2 = (k - y, k - y) < \eta^2 + \epsilon.$$

Then

$$\|k - z\|^2 = (k - y, k - y) + 2(x, k - y) + (x, x)$$
$$= \|k - y\|^2 + 2\eta (g, k - y) + \eta^2 \|g\|^2$$
$$= \|k - y\|^2 - 2\eta F(y) + \eta^2 \|F\|^2$$
$$= \|k - y\|^2 - \eta^2 < \epsilon,$$

so that there are points of $K$ arbitrarily close to $z$, and we must have $z \in K$, since $K$ is closed.

Let us remind ourselves of the conditions under which we established theorem 7.9. $(\Omega, \mathscr{F}, \mu)$ is a $\sigma$-finite measure space, and $T$ is a measure preserving transformation from $\Omega$ to itself. $T^k$ is the result of repeating the transformation $k$ times ($T^0$ is the identity map). For an $\mathscr{F}$-measurable function $f$ which is finite a.e. we consider the sequence of means

$$g_n(x) = \frac{1}{n} \sum_{i=0}^{n-1} f(T^i x). \tag{8.6.1}$$


Theorem 8.9. If $\Omega$ and $T$ satisfy the conditions in theorem 7.9, $f \in \mathscr{L}_2(\Omega, \mu)$, and $g_n$ is defined by (8.6.1), then $\{g_n\}$ is a Cauchy sequence in second mean. Its limit (in second mean) $f^*$ satisfies

(i) $f^*$ is invariant under $T$, that is $f^*(Tx) = f^*(x)$ a.e.; (ii) $\|f^*\| \le \|f\|$; (iii) for any function $g$ in $\mathscr{L}_2$ which is invariant under $T$, $(g, f^*) = (g, f)$.

Proof. (a) Suppose first that $f$ is such that there is an $h \in \mathscr{L}_2$ such that

$$f(x) = h(Tx) - h(x) \quad \text{a.e.}$$

Then

$$g_n(x) = \frac{1}{n} \sum_{i=0}^{n-1} f(T^i x) = \frac{1}{n} [h(T^n x) - h(x)]$$

so that $\|g_n\| \le 2\|h\|/n \to 0$ as $n \to \infty$.

(b) Now suppose $f$ is the limit (in second mean) of a sequence $\{f_k\}$ such that, for each $k$, $f_k(x) = h_k(Tx) - h_k(x)$ with $h_k \in \mathscr{L}_2$. Then

$$\|g_n\| \le \left\| \frac{1}{n} \sum_{i=0}^{n-1} \{f(T^i x) - f_k(T^i x)\} \right\| + \left\| \frac{1}{n} \sum_{i=0}^{n-1} f_k(T^i x) \right\|$$
$$\le \frac{1}{n} \sum_{i=0}^{n-1} \|f(T^i x) - f_k(T^i x)\| + \left\| \frac{1}{n} \sum_{i=0}^{n-1} f_k(T^i x) \right\|$$
$$\le \|f - f_k\| + \frac{2}{n} \|h_k\|;$$

so that we can make $\|g_n\| < \epsilon$ by first choosing $f_k$ with $\|f - f_k\| < \tfrac{1}{2}\epsilon$ and then making $n$ large.

The class of $f \in \mathscr{L}_2$ which satisfy either (a) or (b) is clearly a closed linear subspace $K$ of $\mathscr{L}_2$. By the corollary to theorem 8.8, any $f \in \mathscr{L}_2$ can be written uniquely as

$$f = f_1 + f_2$$

where $f_1 \in K$, and

$$(f_2, Tf - f) = 0 \quad \text{for all } f \in \mathscr{L}_2.$$

Now

$$0 = (f_2, Tf - f) = (f_2, Tf) - (f_2, f) = (T^{-1} f_2, f) - (f_2, f) = (T^{-1} f_2 - f_2, f)$$

for all $f \in \mathscr{L}_2$, and in particular when $f$ is the indicator function of a measurable set $E$ of finite measure. Hence $T^{-1} f_2 = f_2$ a.e. so that $f_2$ is invariant under $T$. Hence

$$\frac{1}{n} \sum_{i=0}^{n-1} f_2(T^i x) = f_2(x) \quad \text{a.e. for all } n,$$

so that $f^* = f_2$ is the limit in second mean of $\{g_n\}$. Thus (i) and (ii) are proved. To prove (iii), suppose $g$ is invariant under $T$; then

$$(T^i f, g) = (f, T^{-i} g) = (f, g)$$

so that $(g_n, g) = (f, g)$ for each $n$ and the result follows on letting $n \to \infty$ since the inner product is continuous in the norm topology.

Corollary. Under the conditions of theorem 8.9, if $T$ is ergodic, then the limit (in second mean) $f^* = c$ a.e. Also (i) if $\mu(\Omega) = \infty$, then $c = 0$, (ii) if $\mu(\Omega) < \infty$, then $\int f^* \, d\mu = \int f \, d\mu$.

Proof. The only invariant functions are constants, so (i) of the theorem implies that $f^* = c$ a.e. Now if $\mu(\Omega) = \infty$, we have $\|f^*\|$ finite, so $c = 0$. If $\mu(\Omega) < \infty$, then the function $g(x) = 1$ is in $\mathscr{L}_2$ and is invariant so that

$$(1, f^*) = \int f^* \, d\mu = (1, f) = \int f \, d\mu.$$
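A finite model shows the convergence in second mean at work. The sketch below is an illustration under assumptions of our own: $\Omega = \{0, 1, \ldots, 6\}$ with counting measure, $T(x) = x + 1 \pmod 7$ (a bijection, hence measure preserving, and ergodic because it is a single cycle), and an arbitrary function `f`. The means $g_n$ of (8.6.1) are computed directly and compared in the $\mathscr{L}_2$ norm with the constant $c = \int f \, d\mu / \mu(\Omega)$.

```python
import math

m = 7                                   # Omega = {0, ..., m-1}, counting measure
f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]

def T(x):
    """Cyclic shift: measure preserving and ergodic on Omega."""
    return (x + 1) % m

def mean_g(n):
    """Return g_n = (1/n) * sum_{i<n} f(T^i x) as a list indexed by x."""
    g = []
    for x in range(m):
        total, y = 0.0, x
        for _ in range(n):
            total += f[y]
            y = T(y)
        g.append(total / n)
    return g

def l2_distance_to_constant(g, c):
    """The L_2 norm of g - c with respect to counting measure."""
    return math.sqrt(sum((v - c) ** 2 for v in g))

c = sum(f) / m                          # the constant limit f* in the ergodic case

for n in (11, 35, 1000):
    print(n, l2_distance_to_constant(mean_g(n), c), sum(mean_g(n)))
```

The distance is zero (up to rounding) whenever $n$ is a multiple of 7 and is of order $1/n$ otherwise, while $\int g_n \, d\mu = \int f \, d\mu$ for every $n$, as part (ii) of the corollary requires in the limit.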

Exercises 8.6

1. If $\mu(\Omega) < \infty$, $f \in \mathscr{L}_q(\Omega, \mu)$, 1 . 0. (This gives a simpler proof of theorem 8.9 for the case $\mu(\Omega) < \infty$.) Prove that (ii) of the corollary to theorem 8.9 is valid without the condition that $T$ be ergodic.

2. Suppose $X$ is an open subset of $R^k$ of finite Lebesgue measure and $T: X \to X$ preserves Lebesgue measure and is ergodic. Show that, for almost all $x \in X$, the sequence $\{T^k x\}$ is dense in $X$.

9

STRUCTURE OF MEASURES IN SPECIAL SPACES

In the present book most of the theory of measure and integration has been developed in abstract spaces, and we have used the properties of special spaces only to illustrate the general theory. The present chapter, apart from § 9.4, is devoted to a discussion of properties which depend essentially on the structure of the space.

The first question considered is that of point-wise differentiation. In the Radon-Nikodym theorem 6.7 we defined the derivative $d\mu/d\nu$ of one measure with respect to another for suitable measures $\mu$, $\nu$: but the point function $d\mu/d\nu$ obtained is only determined in the sense that the equivalence class of functions equal almost everywhere is uniquely defined. This means that at no single point (except for those points which form sets of positive measure) is the derivative defined by the Radon-Nikodym theorem. In order to define $d\mu/d\nu$ at a point $x$, the local topological structure of the space near $x$ has to be considered.

It is possible to develop this local differentiation theory in fairly general spaces, but only at the cost of complicated and rather unnatural additional conditions: we have decided instead to give the detailed theory only in the space $R$ of real numbers where the term derivative has a clear elementary meaning.

There are several ways of defining an integral with properties similar to those obtained in Chapter 5. So far in this book we have considered definitions which start from a given measure defined on a suitable class of sets. In § 9.4 we describe the Daniell integral and show that, under suitable conditions, this can be obtained in terms of a measure. Then, for locally compact spaces, we discuss positive linear functionals on the space Cg of real-valued continuous functions which vanish outside a compact set, and show that these also correspond to integrals with respect to a suitable measure.

The final section of the chapter is devoted to the definition of Haar measure in topological spaces which have the algebraic structure of a group and in which the group operation is continuous. The details are given only for locally compact metric groups.


9.1 Differentiating a monotone function

We say that $f: I \to R$, where $I$ is an open interval in $R$ (that is, a set of the form $(a, b)$ with $a, b \in R^*$), is monotone increasing if

$$x_1, x_2 \in I,\ x_1 < x_2 \Rightarrow f(x_1) \le f(x_2).$$

For any $f: I \to R$ and any $x \in I$, the four quantities

$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \quad D^- f(x) = \limsup_{h \to 0+} \frac{f(x) - f(x - h)}{h},$$

$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \quad D_- f(x) = \liminf_{h \to 0+} \frac{f(x) - f(x - h)}{h}$$

are always uniquely determined in the extended real number system $R^*$. These numbers are called the derivates of $f$ at $x$. We say that $f$ is differentiable at $x$ if

$$D^+ f(x) = D_+ f(x) = D^- f(x) = D_- f(x) = Df(x) \ne \pm\infty.$$

It is clear that $f$ is differentiable at $x$ if and only if there is a real number $Df(x)$ such that, given $\epsilon > 0$ there is a $\delta > 0$ for which

$$\left| \frac{f(x + h) - f(x)}{h} - Df(x) \right| < \epsilon \quad \text{if } 0 < |h| < \delta.$$

If $f: I \to R$ is continuous, but not monotone, it is possible that it is differentiable at no point $x$. However, a monotone $f: I \to R$ must be continuous except at the points in a countable set, and the monotonicity further implies that there are some points $x$ where the derivative exists. In fact we prove much more: the set of points $x$ in $I$ where $f$ is not differentiable turns out to have zero measure. In order to prove this it is convenient first to obtain a new type of covering theorem. When in § 2.2 we showed that a bounded closed interval $K$ in $R$ is compact we started with a covering of $K$ by a family of open sets and we demanded that all of $K$ be covered by a finite subfamily. However, in proving compactness we were not interested in economical covering, and the covering sets finally chosen could overlap. Clearly if we require that the covering sets must not overlap we can no longer require that all of $K$ be covered. However, even if we are satisfied with a countable subcovering by disjoint sets of almost all of $K$ (see exercise 9.1 (8)) additional conditions on the nature of the original covering are essential. A suitable form of these conditions now follows.


Vitali covering

For a subset $E \subset R$, a class $\mathscr{J}$ of intervals is said to cover $E$ in the Vitali sense if, given $x \in E$, $\epsilon > 0$ there is an interval $J \in \mathscr{J}$ with $x \in J$ and $0 < |J| < \epsilon$.

Theorem 9.1. Suppose $E \subset R$ has finite Lebesgue outer measure and is covered in the Vitali sense by a class $\mathscr{J}$ of intervals. Then there is a countable disjoint subclass $\mathscr{J}_1 \subset \mathscr{J}$ such that

$$\left| E - \bigcup \{J : J \in \mathscr{J}_1\} \right| = 0.$$

Proof. We use $|A|$ to denote the Lebesgue outer measure of $A$ whether or not $A$ is measurable. There is no harm in assuming that all the intervals $J$ in $\mathscr{J}$ are closed since $|\bar{I}| = |I|$ for any interval $I$. We may further assume without loss of generality that there is an open set $G \supset E$ with $|G| < \infty$, and that all the intervals of $\mathscr{J}$ are contained in $G$. We choose $\mathscr{J}_1$ by induction as follows. Let $J_1$ be any interval of $\mathscr{J}$. Suppose we have already chosen disjoint intervals $J_1, J_2, \ldots, J_m$ and let $s_m$ be the supremum of the lengths of the intervals in $\mathscr{J}$ which do not intersect any of $J_1, J_2, \ldots, J_m$. Now $s_m \le |G| < \infty$, and if $E$ is not contained in $\bigcup_{i=1}^{m} J_i$, we must have $s_m > 0$. Thus if $E$ is not already covered, we can choose $J_{m+1}$ disjoint from $\bigcup_{i=1}^{m} J_i$ with $|J_{m+1}| > \tfrac{1}{2} s_m$. Now the theorem is proved if $E \subset \bigcup_{i=1}^{m} J_i$ for any finite $m$. Otherwise we obtain a sequence $\{J_n\}$ of disjoint sets so that

$$\sum_{i=1}^{\infty} |J_i| \le |G| < \infty.$$

Now suppose, if possible, that

$$\left| E - \bigcup_{i=1}^{\infty} J_i \right| = \delta > 0.$$

We can choose $N$ so that

$$\sum_{i=N+1}^{\infty} |J_i| < \tfrac{1}{5} \delta,$$

and put

$$F = E - \bigcup_{i=1}^{N} J_i.$$

$F$ must be non-void and $\bigcup_{i=1}^{N} J_i$ is closed, so for each point $x$ in $F$ we can find an interval $J$ of $\mathscr{J}$ containing $x$ and short enough to be disjoint from $\bigcup_{i=1}^{N} J_i$. If $J$ does not meet any of $J_1, \ldots, J_n$, then $|J| \le s_n < 2|J_{n+1}|$. Since

$$\lim_{n \to \infty} |J_n| = 0,$$

this $J$ must meet at least one of the $J_i$ for $i > N$. Let $k$ be the smallest integer for which $J \cap J_k \ne \emptyset$. Then $|J| \le s_{k-1} < 2|J_k|$, so the distance from $x$ to the mid-point of $J_k$ is at most $|J| + \tfrac{1}{2}|J_k| \le \tfrac{5}{2}|J_k|$, and $x$ must belong to the interval $H_k$ which has the same centre as $J_k$ and 5 times the length. Thus

$$F \subset \bigcup_{i=N+1}^{\infty} H_i$$

and

$$\delta \le |F| \le \sum_{i=N+1}^{\infty} |H_i| = 5 \sum_{i=N+1}^{\infty} |J_i| < \delta,$$

which establishes a contradiction.

Corollary. Under the conditions of theorem 9.1, for each $\epsilon > 0$ there is a finite set $J_1, J_2, \ldots, J_p$ of disjoint intervals of $\mathscr{J}$ such that

$$\left| E - \bigcup_{i=1}^{p} J_i \right| < \epsilon.$$
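The selection step of the proof, and the role of the five-fold enlargements $H_k$, can be tried out on a finite family. The sketch below is a simplification and not the construction of the theorem itself: for a finite family the supremum $s_m$ is attained, so we simply take the longest admissible interval at each stage (which certainly has length greater than $\tfrac{1}{2} s_m$). The random family, the seed and the function names are our own illustrative choices.

```python
import random

def greedy_disjoint(intervals):
    """Scan closed intervals from longest to shortest, keeping each one
    that is disjoint from everything already kept."""
    chosen = []
    for a, b in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def enlarge(interval, factor=5.0):
    """Interval with the same centre and `factor` times the length."""
    a, b = interval
    mid, half = (a + b) / 2.0, factor * (b - a) / 2.0
    return (mid - half, mid + half)

random.seed(0)
family = []
for _ in range(200):
    left = random.uniform(0.0, 1.0)
    length = random.uniform(0.001, 0.2)
    family.append((left, left + length))

chosen = greedy_disjoint(family)
enlarged = [enlarge(iv) for iv in chosen]

# Every interval of the family lies inside some five-fold enlargement.
covered = all(
    any(c <= a and b <= d for c, d in enlarged) for a, b in family
)
pairwise_disjoint = all(
    b < c or d < a
    for i, (a, b) in enumerate(chosen)
    for (c, d) in chosen[i + 1:]
)
print(len(chosen), covered, pairwise_disjoint)
```

An interval that was rejected meets an earlier, at least equally long, chosen interval, and so lies inside its enlargement; this is the finite counterpart of the inclusion $F \subset \bigcup H_i$ in the proof.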

Theorem 9.2. Suppose $f: I \to R$ is monotone increasing. Then the set $E$ of points $x$ in $I$ for which $f$ is differentiable at $x$ satisfies $|I - E| = 0$. The derivative $f'$ is Lebesgue measurable, and if $[a, b] \subset I$,

$$\int_a^b f'(x) \, dx \le f(b) - f(a).$$

Proof. It is clearly sufficient to prove the theorem for a finite closed interval $I = [c, d]$. The first step is to show that each of the subsets of $I$:

$$\{x : D^+ f(x) > D_- f(x)\}, \quad \{x : D^- f(x) > D_+ f(x)\},$$

$$\{x : D^+ f(x) > D_+ f(x)\}, \quad \{x : D^- f(x) > D_- f(x)\},$$

has zero Lebesgue measure. We give the details for the set

$$E = \{x : D^+ f(x) > D_- f(x)\};$$

the proof for the others is similar. Now $E$ is the (countable) union of sets

$$E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}$$

over rational pairs $u, v$. It is therefore sufficient to show that $|E_{u,v}| = 0$ for all pairs $u, v$ with $u > v$.

Let $t = |E_{u,v}|$ and $\epsilon > 0$. Find an open set $G \supset E_{u,v}$ with $|G| < t + \epsilon$. For each $x \in E_{u,v}$, there is an arbitrarily small closed interval

$$[x - h, x] \subset G \quad \text{with} \quad f(x) - f(x - h) < vh.$$

By theorem 9.1, corollary, we can find a finite disjoint collection $J_1, J_2, \ldots, J_N$ of such intervals whose interiors cover a subset $F$ of $E_{u,v}$ with $|E_{u,v} - F| < \epsilon$. If we sum over these intervals

$$\sum_{n=1}^{N} \{f(x_n) - f(x_n - h_n)\} < v \sum_{n=1}^{N} h_n \le v|G| < v(t + \epsilon).$$

But each $y \in F$ is the left-hand end-point of an arbitrarily small interval $[y, y + k]$ which is contained in one of the $J_i$ ($i = 1, 2, \ldots, N$) and such that

$$f(y + k) - f(y) > uk.$$

Use theorem 9.1 again to find a disjoint collection $K_1, K_2, \ldots, K_P$ of such intervals which covers a subset $H$ of $F$ with $|H| > t - 2\epsilon$. Summing over these intervals, since each $K_i$ is contained in a $J_n$,

$$\sum_{n=1}^{N} \{f(x_n) - f(x_n - h_n)\} \ge \sum_{i=1}^{P} \{f(y_i + k_i) - f(y_i)\} > u \sum_{i=1}^{P} k_i > u(t - 2\epsilon)$$

so that

$$v(t + \epsilon) > u(t - 2\epsilon).$$

Since $u > v$ and $\epsilon$ is arbitrary, we must have $t = 0$. Thus for almost all $x$ in $I$,

$$g(x) = Df(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

exists as an element in $R^*$ (we are thus allowing the value $\pm\infty$ for a limit). If we put

$$g_n(x) = n\left[ f\!\left(x + \frac{1}{n}\right) - f(x) \right] \quad \text{for } x \in [a, b],$$

where we re-define $f(x) = f(b)$ for $x \ge b$, then $g_n(x)$ is defined and measurable and $g_n(x) \to g(x)$ for almost all $x$ in $[a, b]$ as $n \to \infty$, so that $g: I \to R^*$ is Lebesgue measurable if we define it arbitrarily to be zero on the exceptional set where $Df(x)$ is not defined. By Fatou (theorem 5.7)

$$\int_a^b g(x) \, dx \le \liminf_{n \to \infty} \int_a^b g_n(x) \, dx$$
$$= \liminf_{n \to \infty} n \int_a^b \left\{ f\!\left(x + \frac{1}{n}\right) - f(x) \right\} dx$$
$$= \liminf_{n \to \infty} \left[ n \int_b^{b + (1/n)} f(x) \, dx - n \int_a^{a + (1/n)} f(x) \, dx \right]$$
$$\le f(b) - f(a).$$

This shows that the function $g$ is integrable and so finite almost everywhere. Thus $f$ is differentiable a.e. in $[a, b]$. Since $[a, b]$ is an arbitrary subinterval of $I$, $f$ is differentiable almost everywhere in $I$.

Functions of bounded variation

A function $f: I \to R$ is said to be of bounded variation on $I$ if

$$\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|$$

is bounded above for all ordered finite sequences $x_0 < x_1 < \ldots < x_n$ in $I$. Clearly if $f$ is of bounded variation on $I$, it is also of bounded variation on each interval $J \subset I$. For an ordered sequence $\sigma = \{x_i\}$, $i = 0, 1, \ldots, n$ put

$$p(\sigma) = \sum_{i=1}^{n} \max[0, f(x_i) - f(x_{i-1})],$$

$$n(\sigma) = -\sum_{i=1}^{n} \min[0, f(x_i) - f(x_{i-1})],$$

$$t(\sigma) = p(\sigma) + n(\sigma) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|.$$

If $f: [a, b] \to R$ is of bounded variation on $[a, b]$, put

$$T_a^b = \sup_\sigma t(\sigma), \quad P_a^b = \sup_\sigma p(\sigma), \quad N_a^b = \sup_\sigma n(\sigma),$$

where each of the suprema is taken over all ordered finite sequences $\sigma$ in $[a, b]$. It is easy to check that, in this case,

$$T_a^b = P_a^b + N_a^b, \quad f(b) - f(a) = P_a^b - N_a^b.$$

Now if $f: [a, b] \to R$ is of bounded variation on $[a, b]$ we can put

$$g(x) = N_a^x, \quad h(x) = P_a^x \quad \text{for all } x \in [a, b]$$

so that $f(x)$ can be expressed as the difference of two non-decreasing functions of bounded variation.
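For a single ordered sequence $\sigma$ the identities $t(\sigma) = p(\sigma) + n(\sigma)$ and $f(x_n) - f(x_0) = p(\sigma) - n(\sigma)$ hold exactly, and they are what lies behind the relations between $T_a^b$, $P_a^b$ and $N_a^b$. The sketch below computes the three sums for an illustrative choice of our own, $f = \sin$ on a uniform partition of $[0, 2\pi]$ into 100 parts; the function name `variations` is hypothetical.

```python
import math

def variations(f, points):
    """Return (p, n, t) for the ordered sequence `points`:
    the sums of positive parts, of negative parts (as a positive number),
    and of absolute values of the increments of f."""
    p = 0.0
    n = 0.0
    for x0, x1 in zip(points, points[1:]):
        d = f(x1) - f(x0)
        p += max(0.0, d)
        n -= min(0.0, d)
    return p, n, p + n

points = [k * 2.0 * math.pi / 100 for k in range(101)]
p, n, t = variations(math.sin, points)

print(p, n, t)
print(p - n, math.sin(points[-1]) - math.sin(points[0]))
```

Because this partition contains the turning points $\pi/2$ and $3\pi/2$ of the sine function, $t(\sigma)$ already equals the total variation 4 up to rounding error; for a general partition it would only be a lower bound for $T_a^b$.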


Corollary (Lebesgue). A function $f: I \to R$ which is of bounded variation on each finite interval $[a, b] \subset I$ must be differentiable at $x$ for almost all $x$ in $I$.

Proof. In each finite $[a, b]$ we can express $f$ as the difference of two monotone increasing functions $g$ and $h$. Each of these is differentiable almost everywhere in $[a, b]$ by theorem 9.2. Hence the difference $f$ is differentiable almost everywhere in $[a, b]$.

Exercises 9.1

1. Show that, if $g: I \to R$, $h: I \to R$ are each monotone increasing, then f = g - h is of bounded variation on each $[a, b] \subset I$.

2. If $f: I \to R$ is of bounded variation on each $[a, b] \subset I$, show that the limits f(x + 0), f(x - 0) exist at each interior point of I.

3. If c is an interior point of I and $f: I \to R$ has a (local) maximum at c, show that $D^+ f(c) \le 0$, $D_- f(c) \ge 0$.

4. If $f: [a, b] \to R$ is continuous and $D^+ f(x) > 0$ for all x in [a, b), show that $f(b) \ge f(a)$.

5. Define

\[ f(0) = 0, \quad f(x) = x^2 \sin x^{-2} \text{ for } x \ne 0; \qquad g(0) = 0, \quad g(x) = x^2 \sin x^{-1} \text{ for } x \ne 0. \]

Which of the functions f, g is of bounded variation on [-1, 1]?

6. Give an example of a function for which all the four derivates are different at x = 0.

7. For any Lebesgue measurable $f: I \to R$, prove that $D^+ f(x)$ is Lebesgue measurable.

8. Show that theorem 9.1 as stated in R is false in $R^n$ for $n \ge 2$. Hint. Take a Vitali covering of [0, 1] and for each J of the covering consider $J \times [0, \tfrac{2}{3}]$ and $J \times [\tfrac{1}{3}, 1]$. This will give a covering in the sense of our definition of the unit square $[0, 1] \times [0, 1]$. Show theorem 9.1 is not satisfied. (In fact a more complicated construction shows that theorem 9.1 fails even if we require each point of the set to be covered by an interval J of arbitrarily small diameter.)

9. Show that theorem 9.1 is true in $R^n$ for all n if we restrict the covering to cubes. (In fact it can be shown that it is true if there is a constant K such that the ratio of the lengths of longest and shortest sides is bounded by K for the intervals in the covering class.)

10. For the Cantor ternary function $g: [0, 1] \to [0, 1]$ show that g'(x) = 0 for all $x \in [0, 1] - C$.


(This shows that we cannot hope, in general, for equality in

\[ \int_a^b f'(x)\,dx \le f(b) - f(a).) \]

11. Prove that a convergent series of non-decreasing real functions can be differentiated term by term a.e.

9.2 Differentiating the indefinite integral

The `fundamental theorem of the integral calculus' states that, if $f: [a, b] \to R$ is a continuous function and

\[ F(x) = \int_a^x f(t)\,dt, \]

then $F: [a, b] \to R$ is differentiable in (a, b) with F'(x) = f(x). The object of this section is to obtain the analogous theorem for the Lebesgue integral, where it is not appropriate to assume that f is continuous. (Of course, if $f: [a, b] \to R$ is continuous on [a, b], we know that F'(x) = f(x) for all x in (a, b) since the Lebesgue integral coincides with the Riemann integral in this case.) The first thing to note is that, even for a monotonic function F, we cannot claim that, in general,

\[ \int_a^b F'(x)\,dx = F(b) - F(a), \tag{9.2.1} \]

see exercise 9.1 (10). We will, however, obtain necessary and sufficient conditions for the truth of (9.2.1).

Lemma. If $f: [a, b] \to R^*$ is Lebesgue integrable on [a, b] and

\[ \int_a^x f(t)\,dt = 0 \]

for all x in [a, b], then f(t) = 0 for almost all t in [a, b].

Note. This strengthens the result of theorem 5.5 (vii).

Proof. If the lemma is false then at least one of the sets

\[ \{t: f(t) < 0\}, \quad \{t: f(t) > 0\} \]

has positive measure. If $|\{t: f(t) > 0\}| > 0$ then we can find a $\delta > 0$ for which $|E| > 0$, where $E = \{t: f(t) > \delta\}$. Now choose a closed set $F \subset E$ with $|F| > 0$, and consider the open set G = (a, b) - F. Then

\[ 0 = \int_a^b f\,dm = \int_F f\,dm + \int_G f\,dm. \]

But G is the disjoint union of a countable collection of open intervals $(a_n, b_n)$ and

\[ \int_{a_n}^{b_n} f\,dm = 0 \]


for each n. Since the integral defines a $\sigma$-additive set function we must have

\[ \int_G f\,dm = 0 \quad \text{so that} \quad \int_F f\,dm = 0, \]

and this contradicts

\[ \int_F f\,dm \ge \delta |F| > 0. \]

Let us now consider the properties of any function F which is an indefinite integral, that is

\[ F(x) = \int_a^x f(t)\,dt \]

for a function $f: [a, b] \to R^*$ which is Lebesgue integrable. It is immediate from theorem 5.6 that F is continuous on [a, b], but more can be said: since it is the difference of the indefinite integrals of $f_+$ and $f_-$ it must be the difference of two monotone functions and therefore it is of bounded variation. In fact, we saw in theorem 5.6 that the set function

\[ \nu(E) = \int_E f\,dm \quad (E \text{ measurable}, \ E \subset [a, b]) \]

is absolutely continuous; that is, $\nu(E) \to 0$ as $m(E) \to 0$. This means in particular that given $\epsilon > 0$, there is a $\delta > 0$ such that if $E = \bigcup_{k=1}^{n} I_k$ is a finite disjoint union of intervals in [a, b] for which

\[ \sum_{k=1}^{n} m(I_k) < \delta, \quad \text{then} \quad |\nu(E)| = \Big| \sum_{k=1}^{n} \nu(I_k) \Big| < \epsilon. \]

In fact, by considering separately the intervals $I_k$ for which $\nu$ is positive and negative we can find $\delta > 0$ such that

\[ \sum_{k=1}^{n} m(I_k) < \delta \ \Rightarrow \ \sum_{k=1}^{n} |\nu(I_k)| < \epsilon. \]

In terms of the indefinite integral F this means that the function $F: [a, b] \to R$ is such that, for each $\epsilon > 0$ there is a $\delta > 0$ for which

\[ \sum_{i=1}^{n} (b_i - a_i) < \delta \ \Rightarrow \ \sum_{i=1}^{n} |F(b_i) - F(a_i)| < \epsilon \tag{9.2.2} \]

for any finite class of disjoint intervals $(a_i, b_i) \subset (a, b)$. Any function $F: I \to R$ which satisfies this condition on every finite interval $(a, b) \subset I$ is said to be absolutely continuous on I.
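To see what the condition of absolute continuity excludes, it may help to look at the Cantor ternary function of § 2.7. The Python sketch below is an illustration under stated assumptions, not part of the text: it takes the usual self-similar description of the Cantor function (the function `cantor`, its recursion and the helper names are choices made here), forms the $2^k$ closed intervals left at stage k of the Cantor construction, and computes their total length together with the total increment of the function over them.

```python
from fractions import Fraction

def cantor(x, depth=64):
    """Cantor ternary function on [0, 1], via its self-similarity.

    Exact for triadic rationals; `depth` only guards against endless
    recursion on other inputs (an implementation choice of this sketch).
    """
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    if depth == 0:
        return Fraction(1, 2)
    if x <= Fraction(1, 3):
        return cantor(3 * x, depth - 1) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + cantor(3 * x - 2, depth - 1) / 2
    return Fraction(1, 2)          # constant on the removed middle third

def stage_intervals(k):
    """The 2**k closed intervals remaining after k middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

def length_and_increment(k):
    """Total length of the stage-k intervals and total increment over them."""
    ivs = stage_intervals(k)
    total_length = sum(b - a for a, b in ivs)
    total_increment = sum(cantor(b) - cantor(a) for a, b in ivs)
    return total_length, total_increment
```

In this sketch the total length at stage k is $(2/3)^k$, which can be made smaller than any $\delta$, while the total increment stays equal to 1, so no $\delta$ can serve for $\epsilon = 1$ in (9.2.2).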

It is immediate that any function $F: I \to R$ which is absolutely continuous is of bounded variation on each finite interval $[a, b] \subset I$. For if we put $\epsilon = 1$ in (9.2.2) and choose $\delta > 0$, then any finite dissection of [a, b] can be split into K sets of intervals (by inserting extra division points if necessary) each of total length less than $\delta$, where $K = [(b - a)/\delta] + 1$; and it follows that, for any dissection of [a, b],

\[ \sum_{r=1}^{n} |F(x_r) - F(x_{r-1})| < K. \]

By the corollary to theorem 9.2 we now see that any function F which is absolutely continuous is differentiable except on a set of zero measure.

Theorem 9.3. Suppose $f: [a, b] \to R^*$ is Lebesgue integrable on [a, b] and $F: [a, b] \to R$ satisfies

\[ F(x) = F(a) + \int_a^x f(t)\,dt. \]

Then F is differentiable with F'(x) = f(x) for almost all x in [a, b].

Proof. Assume first that f is bounded on [a, b] so that for a suitable M in R, $|f(x)| \le M$ for all x in [a, b]. Now we know that F is absolutely continuous and therefore differentiable almost everywhere. Put

\[ f_n(x) = n\left[F\left(x + \tfrac{1}{n}\right) - F(x)\right]. \]

Then $|f_n| \le M$ and $f_n(x) \to F'(x)$ almost everywhere; so, by theorem 5.8, for a < c < b,

\[ \int_a^c F'(x)\,dx = \lim_{n\to\infty} \int_a^c f_n(x)\,dx = \lim_{n\to\infty} n \int_a^c \left[F\left(x + \tfrac{1}{n}\right) - F(x)\right] dx \]

\[ = \lim_{n\to\infty} \left[ n \int_c^{c+(1/n)} F(x)\,dx - n \int_a^{a+(1/n)} F(x)\,dx \right] = F(c) - F(a) = \int_a^c f(x)\,dx \]

since F is continuous. Hence

\[ \int_a^c \{F'(x) - f(x)\}\,dx = 0 \]

for all c in [a, b] so that F'(x) = f(x) almost everywhere.

Now suppose that $f: [a, b] \to R^*$ is integrable but not bounded. From the definition of the integral it is sufficient to prove the theorem when $f \ge 0$. Put

\[ f_n(x) = \min[n, f(x)] \quad \text{and} \quad G_n(x) = \int_a^x [f(t) - f_n(t)]\,dt. \]


Since $f - f_n \ge 0$, $G_n$ is monotone increasing and so has a non-negative derivative almost everywhere. Since $f_n$ is bounded (by n) we know that

\[ \frac{d}{dx}\left( \int_a^x f_n(t)\,dt \right) = f_n(x) \quad \text{a.e.,} \]

so that the derivative

\[ F'(x) = G_n'(x) + \frac{d}{dx}\left( \int_a^x f_n(t)\,dt \right) \ge f_n(x), \]

and exists almost everywhere. Since this is true for each integer n,

\[ F'(x) \ge f(x) \quad \text{a.e.} \tag{9.2.3} \]

Hence

\[ \int_a^b F'(x)\,dx \ge \int_a^b f(x)\,dx = F(b) - F(a), \]

and by theorem 9.2 we must have

\[ \int_a^b F'(x)\,dx = F(b) - F(a) = \int_a^b f(x)\,dx, \]

and

\[ \int_a^b \{F'(x) - f(x)\}\,dx = 0. \]

This with (9.2.3) implies that F'(x) = f(x) a.e.
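The difference quotients $f_n(x) = n[F(x + 1/n) - F(x)]$ used in the bounded case of the proof can be followed on a small example. The Python sketch below is a hedged illustration only: the step function `f`, the closed form `F` for its indefinite integral from 0, and the extension of F to the right of the interval are all assumptions made for this example.

```python
from fractions import Fraction

def f(t):
    """Hypothetical bounded integrand: a step function with a jump at 1/2."""
    return Fraction(1) if t >= Fraction(1, 2) else Fraction(0)

def F(x):
    """Indefinite integral of f from 0 to x, written in closed form."""
    return max(Fraction(0), x - Fraction(1, 2))

def difference_quotient(x, n):
    """f_n(x) = n [F(x + 1/n) - F(x)], as in the proof of theorem 9.3."""
    return n * (F(x + Fraction(1, n)) - F(x))
```

Here the quotients stay between 0 and the bound M = 1 for every n, and at a point such as x = 3/8 they equal f(x) = 0 as soon as 1/n is smaller than the distance from x to the jump.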

Lemma. If $F: [a, b] \to R$ is absolutely continuous on [a, b] and F'(x) = 0 a.e., then F is constant.

Proof. Suppose a < c < b, and $E = \{x \in [a, c]: F'(x) = 0\}$. For a fixed $\epsilon > 0$, there are arbitrarily small intervals [x, x + h] for each $x \in E$ such that $|F(x + h) - F(x)| < \epsilon h$. Choose $\delta > 0$ to satisfy (9.2.2) in the definition of absolute continuity and use theorem 9.1 to obtain a finite collection $[x_k, y_k]$ of intervals with

\[ |F(y_k) - F(x_k)| < \epsilon (y_k - x_k) \]

which cover all of E except for a subset of measure less than $\delta$. Order these intervals so that

\[ y_0 = a \le x_1 < y_1 \le x_2 < \dots < y_n \le c = x_{n+1}, \]

and

\[ \sum_{i=0}^{n} |x_{i+1} - y_i| < \delta. \]

By (9.2.2) this implies

\[ \sum_{i=0}^{n} |F(x_{i+1}) - F(y_i)| < \epsilon \]


and, from the choice of the covering family,

\[ \sum_{i=1}^{n} |F(y_i) - F(x_i)| < \epsilon (c - a) \]

so that

\[ |F(c) - F(a)| = \Big| \sum_{i=0}^{n} \{F(x_{i+1}) - F(y_i)\} + \sum_{i=1}^{n} \{F(y_i) - F(x_i)\} \Big| < \epsilon (c - a + 1). \]

Since $\epsilon$ is arbitrary, we have F(c) = F(a).

Theorem 9.4. A function $F: I \to R$ is an indefinite integral, that is there is a measurable $f: I \to R^*$ such that

\[ F(b) - F(a) = \int_a^b f(x)\,dx \]

for all $[a, b] \subset I$, if and only if F is absolutely continuous on I.

Proof. We have already seen that any indefinite integral is absolutely continuous. Conversely suppose $F: I \to R$ is absolutely continuous. Then F is differentiable almost everywhere in [a, b] and

\[ |F'(x)| \le F_1'(x) + F_2'(x) \quad \text{a.e.,} \]

where $F = F_1 - F_2$ expresses F as the difference of two monotone functions. By theorem 9.2, F' is integrable on [a, b]. Put

\[ G(x) = \int_a^x F'(t)\,dt. \]

Then G is absolutely continuous and so is H = F - G. But, by theorem 9.3,

\[ H' = F' - G' = F' - F' = 0 \quad \text{a.e.} \]

so that H is constant by the lemma. Hence

\[ F(x) = F(a) + \int_a^x F'(t)\,dt. \]

Corollary. Every absolutely continuous function $F: I \to R$ is the indefinite integral of its derivative.

Density

Given a set $A \subset R$, $x \in R$ consider the ratio

\[ \frac{|I \cap A|}{|I|} \]

for all intervals I containing x, where $|E|$ denotes the Lebesgue outer measure of E. If this ratio converges to a limit as $|I| \to 0$, then


this limit is called the density of A at x and denoted $\tau(x, A)$. The point x is called a point of density for A if $\tau(x, A) = 1$, and a point of dispersion for A if $\tau(x, A) = 0$. We can obtain the following as a corollary of theorem 9.4.

Lemma (Lebesgue). If $A \subset R$, A is Lebesgue measurable, then

\[ \tau(x, A) = 1 \text{ for almost all } x \in A, \]

\[ \tau(x, A) = 0 \text{ for almost all } x \in R - A. \]

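Proof. Suppose a < x < b. Then the indicator function $\chi_A$ is Lebesgue integrable over [a, b]. Hence

\[ F(x) = \int_a^x \chi_A(t)\,dt \]

is differentiable almost everywhere and

\[ F'(x) = 1 \text{ for almost all } x \text{ in } [a, b] \cap A, \]

\[ F'(x) = 0 \text{ for almost all } x \text{ in } [a, b] \cap (R - A). \]

But if x is such that F'(x) = 1, there is for each $\epsilon > 0$ a $\delta > 0$ such that

\[ \text{(i)} \quad 1 \ge \frac{|[x, x+h] \cap A|}{h} > 1 - \epsilon \quad \text{for } 0 < h < \delta, \]

\[ \text{(ii)} \quad 1 \ge \frac{|[x-k, x] \cap A|}{k} > 1 - \epsilon \quad \text{for } 0 < k < \delta, \]

and so

\[ 1 \ge \frac{|[x-k, x+h] \cap A|}{h + k} > 1 - \epsilon \quad \text{for } 0 < h, k < \delta, \]

which is precisely the condition for $\tau(x, A) = 1$. A similar proof shows that, at points x where F'(x) = 0 we have $\tau(x, A) = 0$.

Exercises 9.2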

1. If $F: I \to R$ is absolutely continuous, show that $F^p$ is absolutely continuous for each p > 1, but not, in general, for p < 1.

2. If $F: [a, b] \to R$ is such that F' exists everywhere in (a, b) and is bounded show that

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]

For $F(x) = x^2 \sin(1/x^2)$ ($x \ne 0$), F(0) = 0 show that F'(x) exists for all x but is not Lebesgue integrable over [-1, 1]. (This shows that even the Lebesgue integral is not strong enough to integrate all derivatives.)

3. Construct a subset $A \subset R$ for which $\tau(0, A) = \tfrac{1}{2}$.

4. Extend the density result to non-measurable sets A by showing that for any $A \subset R$, $\tau(x, A) = 1$ for all x in A except a subset of zero measure. Hint. Assume A is contained in a finite interval, and take a measurable set $B \supset A$ with $|B| = |A|$. Deduce that a set $A \subset R$ is measurable if and only if $\tau(x, A) = 0$ for almost all x in (R - A).

5. Prove that the Cantor function $g: [0, 1] \to [0, 1]$ defined in § 2.7 is monotone increasing and continuous but not absolutely continuous.

6. The function $f: [0, 1] \to R$ is absolutely continuous on $[\epsilon, 1]$ for each $\epsilon > 0$. Can one deduce that f is absolutely continuous on [0, 1]? Does the additional condition that f is of bounded variation on [0, 1] help?

9.3 Point-wise differentiation of measures

In theorem 4.8 we proved that all measures $\mu$ in R defined for Borel sets and finite on bounded sets are Lebesgue-Stieltjes measures: that is, there is a monotone increasing function $F: R \to R$ which is continuous on the right such that $\mu = \mu_F$ on $\mathscr{B}$. Because of this correspondence we can obtain properties of such Borel measures in terms of the corresponding properties of F.

Lemma 1. Suppose $\mu_F$ is the Lebesgue-Stieltjes measure with respect to the function $F: R \to R$ which is continuous on the right. Then $\mu_F$ is absolutely continuous with respect to Lebesgue measure m if and only if F is absolutely continuous.

Proof. Suppose first that F is absolutely continuous. Then, by theorem 9.4,

\[ \mu_F(a, b] = F(b) - F(a) = \int_a^b F'(t)\,dt \]

so that, for $E \in \mathscr{P}$, $\mu_F$ coincides with the set function

\[ \nu(E) = \int_E F'\,dm. \]

But the extension of a measure from $\mathscr{P}$ to $\mathscr{B}$ is unique, so that $\mu_F = \nu$ on $\mathscr{B}$, and $\mu_F$ must therefore be absolutely continuous with respect to m. Conversely, if $\mu_F$ is absolutely continuous with respect to m, by the Radon-Nikodym theorem there is an $f \ge 0$ such that

\[ \mu_F(E) = \int_E f\,dm \quad \text{for } E \in \mathscr{B}. \]

Hence

\[ \mu_F(0, x] = F(x) - F(0) = \int_0^x f(t)\,dt \quad \text{for } x > 0, \]

\[ \mu_F(x, 0] = F(0) - F(x) = \int_x^0 f(t)\,dt \quad \text{for } x < 0, \]

so that $F: R \to R$ is an indefinite integral and must therefore be absolutely continuous.

Atom

Given any measure space $(X, \mathscr{F}, \mu)$ in which $\mathscr{F}$ contains all single point sets the point $x \in X$ is said to be an atom for the measure $\mu$ if $\mu\{x\} > 0$. A measure $\mu$ with no atoms is said to be non-atomic. Now if $\mu$ is $\sigma$-finite, the set of atoms of $\mu$ is countable. In this case if we put

\[ \nu(E) = \sum_{x \in E,\ \mu\{x\} \ne 0} \mu\{x\} \]

we obtain a new measure $\nu$ defined on all subsets of X, and $\nu$ is a discrete measure as defined in § 3.1. Further, the set function

\[ \tau = \mu - \nu \]

defined on $\mathscr{F}$ is clearly non-atomic and so

\[ \mu = \nu + \tau \]

is a decomposition of a $\sigma$-finite measure $\mu$ into the sum of a discrete measure and a non-atomic measure. This decomposition is clearly unique. Thus we have proved

Lemma 2. Given a $\sigma$-finite measure space $(X, \mathscr{F}, \mu)$ in which $\mathscr{F}$ contains all single point sets there is a unique decomposition of $\mu$,

\[ \mu = \nu + \tau \]

for which $\nu$ is a discrete measure on X and $\tau$ is a non-atomic measure on $\mathscr{F}$.

Lemma 3. A measure $\mu$ on $\mathscr{B}$ (the Borel sets of R) which is finite on bounded intervals is a discrete measure if and only if $\mu = \mu_F$ where F is a jump function, that is,

\[ F(x) = \sum_{0 < x_i \le x} p_i \ \text{ for } x \ge 0, \qquad -F(x) = \sum_{x < x_i \le 0} p_i \ \text{ for } x < 0, \tag{9.3.1} \]

where the measure $\mu$ has atoms $x_i$ of weight (or measure) $p_i$.
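The relation between a jump function of the form (9.3.1) and the discrete measure it generates can be checked directly on a finite example. The following Python sketch is illustrative only: the list `ATOMS` of atoms and weights and the function names are hypothetical choices, and only finitely many atoms are used.

```python
from fractions import Fraction

# Hypothetical atoms x_i with weights p_i (illustrative data only).
ATOMS = [
    (Fraction(-3, 2), Fraction(1, 4)),
    (Fraction(0), Fraction(1, 2)),
    (Fraction(1, 3), Fraction(2)),
    (Fraction(5, 2), Fraction(1, 8)),
]

def jump_function(x, atoms=ATOMS):
    """F(x) as in (9.3.1): sum over 0 < x_i <= x, or minus sum over x < x_i <= 0."""
    if x >= 0:
        return sum((p for xi, p in atoms if 0 < xi <= x), Fraction(0))
    return -sum((p for xi, p in atoms if x < xi <= 0), Fraction(0))

def discrete_measure(a, b, atoms=ATOMS):
    """Total weight of the atoms lying in the half-open interval (a, b]."""
    return sum((p for xi, p in atoms if a < xi <= b), Fraction(0))
```

With these definitions F(0) = 0, F is continuous on the right, and F(b) - F(a) agrees with the weight of the atoms in (a, b], including intervals that straddle 0.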


Proof. It is clear that if $F: R \to R$ satisfies (9.3.1) then

\[ \mu_F(a, b] = \sum_{a < x_i \le b} p_i = \sum_{x_i \in (a, b]} p_i \]

so that $\mu_F$ coincides with the discrete measure

\[ \nu(E) = \sum_{x_i \in E} p_i \]

for $E \in \mathscr{P}$. By uniqueness of extension $\mu_F$ must be a discrete measure. Conversely, if $\mu$ is a discrete measure with atoms $x_i$ of weight $p_i$, an application of theorem 4.8 shows that $\mu = \mu_F$, with F a jump function.

Lemma 4. A measure $\mu$ defined on $\mathscr{B}$ which is finite on bounded intervals is non-atomic if and only if $\mu = \mu_F$ for a continuous $F: R \to R$.

Proof. If F is continuous, then

\[ 0 \le \mu_F\{x\} \le \mu_F(x - h, x] = F(x) - F(x - h) \]

for all h > 0, so that $\mu_F\{x\} = 0$. Conversely if F is not continuous at $x_0$, then

\[ \mu_F\{x_0\} = \lim_{n\to\infty} \mu_F\left(x_0 - \tfrac{1}{n}, x_0\right] = F(x_0) - F(x_0 - 0) > 0 \]

so $x_0$ is an atom.

Singular monotone function

Any function $F: I \to R$ which is continuous and monotone increasing, such that F'(x) = 0 for all x in I except for a set of zero Lebesgue measure, is said to be singular. The function g defined in § 2.7 clearly satisfies these conditions without being constant.

Lemma 5. A function $F: R \to R$ is singular if and only if the Lebesgue-Stieltjes measure $\mu_F$ is non-atomic and singular with respect to Lebesgue measure.

Proof. The continuity of F is equivalent to the condition that $\mu_F$ be non-atomic by lemma 4. Now a measure $\nu$ is singular with respect to Lebesgue measure if and only if any absolutely continuous $\tau$ satisfying $\tau(E) \le \nu(E)$ for all E in $\mathscr{B}$ must be zero. Now if F'(x) > 0 on a set of positive measure, the set function

\[ \tau(E) = \int_E F'(x)\,dx \]

is not always zero and $\tau \le \mu_F$ by theorem 9.2 so $\mu_F$ is not singular with respect to Lebesgue measure. Conversely, if $\mu_F$ is not singular a


non-null absolutely continuous measure $\tau \le \mu_F$ can be found, and this corresponds to a function G, that is

\[ G(b) - G(a) = \int_a^b G'(x)\,dx. \]

But $F'(x) \ge G'(x)$ when both are defined, so F'(x) > 0 on a set of positive measure.

Theorem 9.5 (Lebesgue). Given any function $F: R \to R$ which is monotone increasing and continuous on the right, there is a decomposition

\[ F = F_1 + F_2 + F_3 \]

of F where $F_1$ is a jump function, $F_2$ is singular, $F_3$ is absolutely continuous. This decomposition is unique if we insist that $F_1(0) = F_2(0) = 0$.

Proof. Use the function F to define a Lebesgue-Stieltjes measure $\mu_F$ on $\mathscr{B}$. Decompose $\mu_F$ with respect to Lebesgue measure m by theorem 6.7 so that

\[ \mu_F = \nu_1 + \nu_3 \]

with $\nu_3 \ll m$ and $\nu_1$ singular with respect to m. Decompose $\nu_1$ by lemma 2,

\[ \nu_1 = \lambda_1 + \lambda_2, \]

where $\lambda_1$ is discrete and $\lambda_2$ is non-atomic.

Let $F_1$, $F_2$ be the monotone functions (with $F_1(0) = F_2(0) = 0$) obtained by theorem 4.8 for which $\lambda_1 = \mu_{F_1}$, $\lambda_2 = \mu_{F_2}$ on $\mathscr{B}$. Then by lemmas 3 and 5, $F_1$ is a jump function, and $F_2$ is a singular function. If one applies theorem 4.8 to $\nu_3$ one obtains an absolutely continuous $G_3$ for which $\nu_3 = \mu_{G_3}$. Finally, put $F_3(x) = G_3(x) + F(0)$ and we still have $F_3$ absolutely continuous, and $\nu_3 = \mu_{F_3}$. Now

\[ F(x) - F(0) = F_1(x) - F_1(0) + F_2(x) - F_2(0) + F_3(x) - F_3(0) \]

for all x so that

\[ F(x) = F_1(x) + F_2(x) + F_3(x). \]

The uniqueness follows from the uniqueness of the decomposition $\mu_F = \lambda_1 + \lambda_2 + \nu_3$, and theorem 4.8.

In R we can also use the connexion between $\mu_F$ and F to define differentiation. Thus if $F: I \to R$ is differentiable at $x_0$, this means that

\[ \frac{F(x_0 + h) - F(x_0 - k)}{h + k} \to F'(x_0) \quad \text{as } h, k \to 0 \]

with h > 0, k > 0, and so

\[ \frac{\mu_F(x_0 - k, x_0 + h]}{|(x_0 - k, x_0 + h]|} \to F'(x_0). \]

This can be written

\[ \frac{\mu_F(J)}{|J|} \to F'(x_0) \quad \text{as } |J| \to 0 \]

for intervals J containing $x_0$, and we can write $d\mu_F/dm\,(x_0)$ for the value of this limit. More generally, if $\mu$, $\nu$ are two measures in R which are finite for bounded sets then

\[ \lim_{|J| \to 0,\ x \in J} \left[ \frac{\mu(J)}{\nu(J)} \right], \]

when it exists, is called the derivative of $\mu$ with respect to $\nu$ at the point x. In $R^n$ we can consider the values of the ratio

\[ \frac{\mu(J)}{\nu(J)} \tag{9.3.2} \]

for rectangles J (in $\mathscr{P}^n$) containing a fixed point x and ask whether or not this ratio approaches a limit as diam $(J) \to 0$. The existence of this limit for all x except for a set of zero $\nu$ measure can be proved when $\nu$ is Lebesgue measure: the limit in this case is called the strong derivate of $\mu$ at x. This result is harder to prove than the result in § 9.1 because theorem 9.1 is not valid without some restriction on the ratio of the sides of the covering class. Essentially similar methods to those of § 9.1 will work if only cubes J are considered. On the other hand if in (9.3.2) one considers rectangles with arbitrary orientation an example can be given for which the limit exists nowhere.
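The one-dimensional ratio $\mu_F(J)/|J|$ for intervals J = (x_0 - k, x_0 + h] can be examined numerically. The Python sketch below is a hedged illustration: the choice F(x) = x^2 (monotone increasing for x >= 0) and the function names are assumptions made here, and only intervals lying in x >= 0 are used.

```python
from fractions import Fraction

def F(x):
    # Hypothetical distribution function, monotone increasing on x >= 0.
    return x * x

def mu_F(a, b):
    """Lebesgue-Stieltjes measure of the half-open interval (a, b]."""
    return F(b) - F(a)

def ratio(x0, h, k):
    """mu_F(J) / |J| for J = (x0 - k, x0 + h], with h > 0 and k > 0."""
    return mu_F(x0 - k, x0 + h) / (h + k)
```

For this F the ratio equals 2 x_0 + h - k exactly, so it tends to F'(x_0) = 2 x_0 however h and k tend to 0 independently.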

Differentiation point-wise in abstract spaces can also be defined in terms of suitable `nets', and the theorems of this chapter can be obtained if sufficient conditions are imposed. Since the results are not often used in practice, we will not state them in detail.

Exercises 9.3

1. Enumerate the rationals as a sequence $\{r_i\}$. By considering the discrete measure with mass $1/i^2$ at $r_i$ (i = 1, 2, ...) define a jump function which is constant in no interval.

2. Give an example of a singular function which is constant in no interval.

3. If F, G are two monotone real functions differentiable at $x_0$ with $G'(x_0) \ne 0$, show that

\[ \frac{d\mu_F}{d\mu_G}(x_0) = \lim_{|J| \to 0,\ x_0 \in J} \frac{\mu_F(J)}{\mu_G(J)} \]

exists and equals $F'(x_0)/G'(x_0)$.


9.4* The Daniell integral

Our approach in this book has been to regard measure as the primitive concept, and to define the integration process in terms of a given

measure. One important alternative is to start with an `integral' defined on a suitable class of functions, extend its definition to a larger

domain with desirable properties and then obtain measure as a by-product at a later stage. In the present section we describe this alternative approach: it is convenient to use it in the following section

to obtain the integral representation of an important class of linear functionals.

For an arbitrary space X, we consider a family L of functions $f: X \to R$ satisfying

(i) L is a linear space over the reals;

(ii) for each $f \in L$, the function $f^+ \in L$, where $f^+(x) = \max(0, f(x))$.

Now if we define, for each $f, g \in L$, $x \in X$,

\[ (f \vee g)(x) = \max(f(x), g(x)), \quad (f \wedge g)(x) = \min(f(x), g(x)), \]

the relations

\[ f^+ = f \vee 0, \quad f \vee g = (f - g) \vee 0 + g, \quad f \wedge g = f + g - (f \vee g) \]

show that

(iii) if $f, g \in L$, then $f \vee g$, $f \wedge g \in L$.
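The three lattice relations above are pointwise identities between real numbers, so they can be checked mechanically on sample functions. The Python sketch below does this on a grid of integer points; the helper names `join`, `meet`, `add`, `sub` and the two sample functions are hypothetical choices made for the illustration.

```python
def join(f, g):
    """Pointwise maximum of f and g."""
    return lambda x: max(f(x), g(x))

def meet(f, g):
    """Pointwise minimum of f and g."""
    return lambda x: min(f(x), g(x))

def add(f, g):
    return lambda x: f(x) + g(x)

def sub(f, g):
    return lambda x: f(x) - g(x)

def zero(x):
    return 0

# Hypothetical sample functions with integer values at integer points.
def f(x):
    return x * x - 2

def g(x):
    return 3 - x

grid = range(-10, 11)

# f v g = (f - g) v 0 + g, evaluated pointwise on the grid.
join_ok = all(join(f, g)(x) == add(join(sub(f, g), zero), g)(x) for x in grid)
# f ^ g = f + g - (f v g), evaluated pointwise on the grid.
meet_ok = all(meet(f, g)(x) == sub(add(f, g), join(f, g))(x) for x in grid)
# The positive part f+ coincides with f v 0.
positive_part_ok = all(join(f, zero)(x) == max(0, f(x)) for x in grid)
```

Since each relation only uses sums, differences and positive parts, a linear space closed under $f \mapsto f^+$ is automatically closed under the two lattice operations, which is the content of (iii).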

Any family L satisfying conditions (i) and (ii) (and therefore (iii)) is called a vector lattice of functions. Suppose $\mathscr{I}$ is a linear functional on L (considered as a real linear space), then we say $\mathscr{I}$ is positive if

\[ f \in L, \ f \ge 0 \ \Rightarrow \ \mathscr{I}(f) \ge 0. \]

A positive linear functional $\mathscr{I}$ on L is said to be a Daniell functional if, for every increasing sequence $\{f_n\}$ of functions of L,

\[ \mathscr{I}(g) \le \lim_{n\to\infty} \mathscr{I}(f_n) \tag{9.4.1} \]

for each $g \in L$ satisfying $g(x) \le \lim_{n\to\infty} f_n(x)$ for all $x \in X$. (Note that $\lim_{n\to\infty} f_n(x)$ will be $+\infty$ if the sequence $\{f_n(x)\}$ is unbounded, and even if $\lim f_n$ exists as a function with finite values we do not assume that it is in L.) In particular, this implies that, if $\mathscr{I}$ is a Daniell functional, $\{f_n\}$ a monotone sequence in L such that $f(x) = \lim_{n\to\infty} f_n(x)$, $x \in X$ defines


a function in L then $\mathscr{I}(f) = \lim_{n\to\infty} \mathscr{I}(f_n)$. For if $\{f_n\}$ is increasing then $f \ge f_n$ for all n, so $\mathscr{I}(f) \ge \mathscr{I}(f_n)$ since $\mathscr{I}$ is positive, which with (9.4.1) gives the required equality. Thus a Daniell functional is continuous in the sense that for any sequence $\{f_n\}$ in L which decreases monotonically to the zero function we must have $\mathscr{I}(f_n) \to 0$. Any Daniell functional is therefore an `integral' in the sense discussed in § 5.1. However, for the integral to be useful we want the domain L to be as large as possible: if $\{f_n\}$ is an increasing sequence in L which is bounded above by an element of L we would certainly want $\lim f_n$ to be in L. The Daniell integral is essentially the result of extending a Daniell functional $\mathscr{I}$ from L to a class $L_1 \supset L$: it turns out that this extension can be carried out in two stages.

Suppose $\mathscr{I}$ is a Daniell functional on a vector lattice L. Denote by $L^+$ the set of functions $f: X \to R^*$ which are limits of monotone increasing sequences of functions of L. $L^+$ is not a linear space but

\[ \alpha, \beta \ge 0; \ f, g \in L^+ \ \Rightarrow \ \alpha f + \beta g \in L^+. \]

Then if $\{f_n\}$ is an increasing sequence in L, $\{\mathscr{I}(f_n)\}$ is an increasing sequence in R which has a unique limit in $R \cup \{+\infty\}$. We can define $\mathscr{I}$ in $L^+$ by

\[ \mathscr{I}(\lim f_n) = \lim \mathscr{I}(f_n). \]

This definition is proper because if $\{f_n\}$, $\{g_n\}$ are two monotone sequences each converging to h in $L^+$, condition (9.4.1) gives, for fixed n,

\[ f_n \le h = \lim g_n \ \Rightarrow \ \mathscr{I}(f_n) \le \lim \mathscr{I}(g_n) \]

so that $\lim \mathscr{I}(f_n) \le \lim \mathscr{I}(g_n)$ and the opposite inequality can be similarly obtained. It is clear that $\mathscr{I}$ is linear on $L^+$ in the sense that

\[ \alpha \ge 0, \ \beta \ge 0; \ f, g \in L^+ \ \Rightarrow \ \mathscr{I}(\alpha f + \beta g) = \alpha \mathscr{I}(f) + \beta \mathscr{I}(g). \]

For an arbitrary function $f: X \to R^*$ we define the upper integral $\mathscr{I}^*(f)$ by

\[ \mathscr{I}^*(f) = \inf_{g \ge f,\ g \in L^+} \mathscr{I}(g), \]

where we adopt the (usual) convention that the infimum of the empty set is $+\infty$. Similarly, the lower integral $\mathscr{I}_*(f)$ is defined by

\[ \mathscr{I}_*(f) = -\mathscr{I}^*(-f), \]

and we say that a function $f: X \to R^*$ is integrable (with respect to $\mathscr{I}$) if $\mathscr{I}^*(f) = \mathscr{I}_*(f)$ and is finite. The class of integrable functions will be denoted by $L_1 = L_1(\mathscr{I}, L)$. For $f \in L_1$ we call the common value of $\mathscr{I}^*(f)$, $\mathscr{I}_*(f)$ the integral of f and denote it by $\mathscr{J}(f)$. We now show that


this functional $\mathscr{J}$ on $L_1$ is a Daniell functional which extends $\mathscr{I}$, and that $L_1$ has the closure properties desired. It is convenient to obtain a number of preliminary results before stating the theorem.

Lemma 1. If $\{g_n\}$ is a sequence of non-negative functions in $L^+$, then

\[ g = \sum_{n=1}^{\infty} g_n \]

is in $L^+$ and $\mathscr{I}(g) = \sum_{n=1}^{\infty} \mathscr{I}(g_n)$.

Proof. It is clear that a non-negative function $f: X \to R^+$ belongs to $L^+$ if and only if there is a sequence $\{f_n\}$ of non-negative functions in L with $f = \sum_{n=1}^{\infty} f_n$. By definition, in this case

\[ \mathscr{I}(f) = \sum_{n=1}^{\infty} \mathscr{I}(f_n). \]

Hence, each function $g_n$ can be expressed as a sum

\[ g_n = \sum_{\nu=1}^{\infty} f_{n,\nu} \quad \text{with } f_{n,\nu}: X \to R^+, \ f_{n,\nu} \in L. \]

It follows that

\[ g = \sum_{n} \sum_{\nu} f_{n,\nu} \]

is a countable sum of non-negative functions of L and so must be in $L^+$. Further since all the terms are non-negative, the order of summation is immaterial and

\[ \mathscr{I}(g) = \sum_{n=1}^{\infty} \Big( \sum_{\nu=1}^{\infty} \mathscr{I}(f_{n,\nu}) \Big) = \sum_{n=1}^{\infty} \mathscr{I}(g_n). \]

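Lemma 2. For arbitrary functions $f: X \to R^*$, $g: X \to R^*$:

(i) $\mathscr{I}^*(f + g) \le \mathscr{I}^*(f) + \mathscr{I}^*(g)$;

(ii) if $c \ge 0$, $\mathscr{I}^*(cf) = c\,\mathscr{I}^*(f)$;

(iii) if $f \le g$, $\mathscr{I}^*(f) \le \mathscr{I}^*(g)$, $\mathscr{I}_*(f) \le \mathscr{I}_*(g)$;

(iv) $\mathscr{I}_*(f) \le \mathscr{I}^*(f)$;

(v) if $f \in L^+$, $\mathscr{I}^*(f) = \mathscr{I}_*(f) = \mathscr{I}(f)$.

Proof. (i), (ii) and (iii) follow immediately from the definitions. It is worth noting in (i), that we can put $(f + g)(x) = +\infty$ at those points x for which one of f(x), g(x) is $+\infty$ and the other is $-\infty$ so that (i) is true whatever the value in $R^*$ chosen for (f + g)(x) at such points x.

(iv) Since $0 = \mathscr{I}^*(0) = \mathscr{I}^*(f - f) \le \mathscr{I}^*(f) + \mathscr{I}^*(-f)$ by (i), it follows that $\mathscr{I}_*(f) = -\mathscr{I}^*(-f) \le \mathscr{I}^*(f)$.

(v) If $f \in L^+$, then by definition $\mathscr{I}^*(f) = \mathscr{I}(f)$. Now if $g \in L$, then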


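$-g \in L \subset L^+$ so that $\mathscr{I}_*(g) = \mathscr{I}(g)$. But each f in $L^+$ is the limit of an increasing sequence $\{g_n\}$ in L. Thus $f \ge g_n$ so

\[ \mathscr{I}_*(f) \ge \mathscr{I}_*(g_n) = \mathscr{I}(g_n) \]

and $\mathscr{I}_*(f) \ge \lim \mathscr{I}(g_n) = \mathscr{I}(f)$.

Lemma 3. If $\{g_n\}$ is a sequence of functions on X to $R^+$, and

\[ g = \sum_{n=1}^{\infty} g_n, \quad \text{then} \quad \mathscr{I}^*(g) \le \sum_{n=1}^{\infty} \mathscr{I}^*(g_n). \]

Proof. If $\mathscr{I}^*(g_n) = +\infty$ for some n, or if the series $\sum \mathscr{I}^*(g_n)$ diverges there is nothing to prove. Otherwise, given $\epsilon > 0$, for each integer n choose $h_n \ge g_n$, $h_n \in L^+$ such that $\mathscr{I}^*(g_n) > \mathscr{I}(h_n) - \epsilon\,2^{-n}$. Then $h = \sum h_n \in L^+$ by lemma 1, $h \ge g$ and

\[ \mathscr{I}^*(g) \le \mathscr{I}(h) = \sum_{n=1}^{\infty} \mathscr{I}(h_n) < \epsilon + \sum_{n=1}^{\infty} \mathscr{I}^*(g_n). \]

Since $\epsilon$ is arbitrary the result is proved.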

Theorem 9.6. Given a Daniell functional $\mathscr{I}$ on a vector lattice L of functions on X to R, the process defining a functional $\mathscr{J}$ on the set $L_1$ determines a Daniell functional on a lattice $L_1$ which extends $\mathscr{I}$. Further, if $\{f_n\}$ is an increasing sequence of functions in $L_1$ and $f = \lim f_n$, then $f \in L_1$ if and only if $\lim \mathscr{J}(f_n)$ is finite in which case $\mathscr{J}(f) = \lim \mathscr{J}(f_n)$.

Proof. Lemma 2(v) shows that $L_1 \supset L$ and that $\mathscr{J}$ is an extension of $\mathscr{I}$. Now if $f \in L_1$ so does cf for c in R since

\[ c \ge 0 \ \Rightarrow \ \mathscr{I}^*(cf) = c\,\mathscr{I}^*(f) = c\,\mathscr{I}_*(f) = \mathscr{I}_*(cf), \]

\[ c < 0 \ \Rightarrow \ \mathscr{I}^*(cf) = c\,\mathscr{I}_*(f) = c\,\mathscr{I}^*(f) = \mathscr{I}_*(cf). \]

Further, if f and g are both in $L_1$, using lemma 2 (i),†

\[ -\mathscr{J}(f) - \mathscr{J}(g) \ge \mathscr{I}^*(-f - g) = -\mathscr{I}_*(f + g), \]

so

\[ \mathscr{I}_*(f + g) \ge \mathscr{J}(f) + \mathscr{J}(g) \ge \mathscr{I}^*(f + g); \]

and, by lemma 2 (iv), $f + g \in L_1$ and

\[ \mathscr{J}(f + g) = \mathscr{J}(f) + \mathscr{J}(g). \]

Thus $L_1$ is a real linear space, and $\mathscr{J}$ is a linear functional on $L_1$. To prove that $L_1$ is a lattice it is sufficient to prove that

\[ f \in L_1 \ \Rightarrow \ f^+ \in L_1. \]

† As pointed out in the proof of lemma 2(i) the inequality is valid, whatever value in $R^*$ is chosen for (f + g)(x) at points x where $f(x) = +\infty$, $g(x) = -\infty$. The proof given then shows that, for $f, g \in L_1$, $(f + g) \in L_1$ whatever values are assumed at such points.


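For a fixed f in $L_1$ and each $\epsilon > 0$, choose functions g, h in $L^+$ such that

\[ -h \le f \le g, \quad \mathscr{I}(g) < \mathscr{J}(f) + \epsilon < \infty, \quad \mathscr{I}(h) < -\mathscr{J}(f) + \epsilon < \infty. \]

Now $g = (g \vee 0) + (g \wedge 0)$ and $g \wedge 0 \in L^+$; so

\[ \mathscr{I}(g \vee 0) = \mathscr{I}(g) - \mathscr{I}(g \wedge 0) < \infty. \]

Thus $g^+ = g \vee 0 \in L^+$ and $\mathscr{I}(g^+) < \infty$. Similarly, $-h_- = h \wedge 0 \in L^+$ and

\[ h_- \le f^+ \le g^+. \]

But $(g + h) \ge 0$; and separate consideration of each possible pair of signs for g, h shows that $g^+ - h_- \le g + h$. Hence

\[ \mathscr{I}(g^+) + \mathscr{I}(-h_-) \le \mathscr{I}(g) + \mathscr{I}(h) < 2\epsilon. \]

But

\[ -\mathscr{I}(-h_-) \le \mathscr{I}_*(f^+) \le \mathscr{I}^*(f^+) \le \mathscr{I}(g^+) \]

so that $\mathscr{I}^*(f^+) - \mathscr{I}_*(f^+) < 2\epsilon$. Since $\epsilon$ is arbitrary and $\mathscr{I}(g^+)$ is finite we have $f^+ \in L_1$ as required.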

Now suppose $\{f_n\}$ is an increasing sequence of functions in $L_1$ and $f = \lim f_n$. Then if $\lim \mathscr{J}(f_n) = +\infty$, and $g \le f$, $g \in L_1$ it is clear that $\mathscr{J}(g) < \lim \mathscr{J}(f_n)$ since $\mathscr{J}(g)$ is finite. On the other hand if $\lim \mathscr{J}(f_n)$ is finite, put $h = f - f_1$. Then $h \ge 0$ and

\[ h = \sum_{n=1}^{\infty} (f_{n+1} - f_n). \]

By lemma 3,

\[ \mathscr{I}^*(h) \le \sum_{n=1}^{\infty} \{\mathscr{J}(f_{n+1}) - \mathscr{J}(f_n)\} = \lim \mathscr{J}(f_n) - \mathscr{J}(f_1) \]

so that

\[ \mathscr{I}^*(f) = \mathscr{I}^*(f_1 + h) \le \mathscr{I}^*(f_1) + \mathscr{I}^*(h) \le \lim \mathscr{J}(f_n). \]

But $f_n \le f$ so that $\mathscr{I}_*(f) \ge \lim \mathscr{J}(f_n)$, and we must have

\[ \mathscr{I}^*(f) = \mathscr{I}_*(f) = \lim \mathscr{J}(f_n). \]

This means that, if $\lim \mathscr{J}(f_n)$ is finite, then f is in $L_1$, and

\[ \mathscr{J}(f) = \lim \mathscr{J}(f_n). \]

The positive functional $\mathscr{J}$ therefore satisfies (9.4.1) and must be a Daniell functional on $L_1$.

Remark. There may be some functions $f: X \to R^*$ which take the values $\pm\infty$ at some points but are still in $L_1$. In the course of the proof we saw that it made no difference to the linear functional $\mathscr{J}$ what value was assigned to (bf + cg)(x) at points x where the usual calculation leads to $+\infty + (-\infty)$. It is in this sense that $\mathscr{J}$ is a linear functional


on the real linear space $L_1$. However, we will shortly see that all functions $f: X \to R^*$ in $L_1$ must take finite values at `almost all' points, so that the set where (bf + cg) is not determined by the laws of algebra is always small (relative to $\mathscr{J}$). Now if one starts with a Daniell functional $\mathscr{I}$ on a vector lattice L which is already closed for monotone limits, i.e. if $\{f_n\}$ is a monotone sequence in L and $\lim \mathscr{I}(f_n)$ is finite, then $f = \lim f_n$ is in L; the extension process defined will lead to nothing new as the part of $L^+$ on which $\mathscr{I}$ is finite is in L and this will give $L = L_1$.

Daniell integral

Any Daniell functional $\mathscr{J}$ on a vector lattice $L_1$ of functions on X to $R^*$ such that the limit f of a monotone sequence $\{f_n\}$ of functions in $L_1$ is in $L_1$ provided $\lim \mathscr{J}(f_n)$ is finite is called a Daniell integral.

We now see how one can obtain a theory of measure if one starts with a linear operator $\mathscr{J}$ satisfying these conditions. The definitions are made so that the integral (in the sense of Chapter 5) with respect to the measure recovers the operator $\mathscr{J}$. Starting with a Daniell integral $\mathscr{J}$ we say that a non-negative function $f: X \to R^+$ is measurable (with respect to $\mathscr{J}$) if $g \in L_1 \Rightarrow f \wedge g \in L_1$. We say that a set $A \subset X$ is measurable (with respect to $\mathscr{J}$) if the indicator function $\chi_A$ is measurable; while the set A is integrable if $\chi_A \in L_1$. In order to ensure that the class of measurable functions and sets has useful properties we will further assume that the space X is measurable, that is, that the constant function $f(x) \equiv 1$ is measurable.

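Lemma 4. If X is measurable, then the class $\mathscr{A}$ of sets measurable with respect to $\mathscr{J}$ is a $\sigma$-field. If $f: X \to R^+$ is any non-negative integrable function, the set $E_a = \{x: f(x) > a\} \in \mathscr{A}$ for all $a \in R$.

Proof. Given f, g non-negative measurable functions, the lattice properties of $L_1$ immediately give that $f \vee g$ and $f \wedge g$ are measurable. But

\[ \chi_{A \cap B} = \chi_A \wedge \chi_B, \quad \chi_{A \cup B} = \chi_A \vee \chi_B \]

so that $A, B \in \mathscr{A} \Rightarrow A \cap B$ and $A \cup B \in \mathscr{A}$. Further for any set E,

\[ g \wedge \chi_E = (g \vee 0 + g \wedge 0) \wedge \chi_E = (g \vee 0) \wedge \chi_E + g \wedge 0 \]

so that if $g \in L_1$, $g \vee 0$ and $g \wedge 0 \in L_1$ and

\[ (g \vee 0) \wedge \chi_{A-B} = (g \vee 0) \wedge \chi_A - (g \vee 0) \wedge \chi_{A \cap B} \in L_1, \]

\[ (g \wedge 0) \wedge \chi_{A-B} = g \wedge 0 \in L_1, \]

so that $g \wedge \chi_{A-B} \in L_1$. Thus $\mathscr{A}$ is a ring, and since $X \in \mathscr{A}$, we have proved that $\mathscr{A}$ is a field. To show that $\mathscr{A}$ is a $\sigma$-field one need only use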


the fact that $L_1$ is closed for monotone limits which are bounded, since $E_n = \bigcup_{i=1}^{n} A_i$ is monotone and so is $\chi_{E_n}$.

Now if $f: X \to R^+$ is non-negative and is in $L_1$, $E_a = X$ for a < 0. If a = 0 put h = f; while if a > 0 put $h = [a^{-1} f - (a^{-1} f) \wedge 1]$. Then $h \in L_1$, and in either case h(x) > 0 for $x \in E_a$ and h(x) = 0 for $x \in X - E_a$. For each integer n, put $f_n = 1 \wedge (nh)$. Then $f_n \in L_1$ and the sequence $\{f_n\}$ increases monotonely to $\chi_{E_a}$. Hence $\chi_{E_a}$ is measurable, so $E_a$ is measurable.

Theorem 9.7 (Stone). Suppose $\mathscr{J}$ is a Daniell integral on the class $L_1$ of functions $f: X \to R^*$, and X is a measurable set with respect to $\mathscr{J}$. Then

\[ \mu(E) = \mathscr{J}(\chi_E) \text{ when E is integrable,} \quad \mu(E) = +\infty \text{ otherwise} \]

defines a measure $\mu$ on the $\sigma$-field $\mathscr{A}$ of measurable sets. A function $f: X \to R^*$ is in $L_1$ if and only if it is integrable with respect to this measure $\mu$, and

\[ \mathscr{J}(f) = \int f\,d\mu \quad \text{for all } f \in L_1. \]

Proof. It is immediate that $\mu(\emptyset) = 0$. If B is integrable and A is measurable with $A \subset B$, the definitions ensure that A is integrable and $0 \le \mu(A) \le \mu(B)$. This inequality is trivially satisfied when B is measurable but not integrable, so $\mu$ is monotone on $\mathscr{A}$. Now let $\{E_n\}$ be a disjoint sequence in $\mathscr{A}$ and $E = \bigcup_{n=1}^{\infty} E_n$. If at least one of the $E_n$ fails to be integrable, then E is not integrable and

\[ \mu(E) = +\infty = \sum \mu(E_n). \tag{9.4.2} \]

If each of the sets $E_n$ is integrable, then E will be integrable if and only if $\sum \mu(E_n) < \infty$ by theorem 9.6, since $\chi_E = \sum \chi_{E_n}$, and in this case

\[ \mu(E) = \sum \mu(E_n) < \infty. \]

It is clear from the statement of theorem 9.6 that (9.4.2) will be satisfied if $\sum \mu(E_n) = +\infty$. Thus in all cases, $\mu$ is $\sigma$-additive on $\mathscr{A}$.

Now lemma 4 ensures that $\mathscr{A}$ is a $\sigma$-field and that any non-negative $\mathscr{J}$-integrable function is $\mathscr{A}$-measurable. Since each $\mathscr{J}$-integrable function is the difference of two non-negative $\mathscr{J}$-integrable functions it follows that any f in $L_1$ is $\mathscr{A}$-measurable.


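Consider a non-negative $f: X \to R^+$ in $L_1$. For each pair (r, s) of positive integers put

\[ E_{r,s} = \{x: f(x) > r/s\}. \]

Now $E_{r,s} \in \mathscr{A}$ and $\chi_{E_{r,s}} \in L_1$ (that is, $\mu(E_{r,s}) < \infty$) since

\[ \chi_{E_{r,s}} = \chi_{E_{r,s}} \wedge \left( \frac{s}{r} f \right). \]

Put

\[ f_n = \frac{1}{s} \sum_{r=1}^{s^2} \chi_{E_{r,s}}, \quad s = 2^n, \]

and note that $\{f_n\}$ is a monotone sequence in $L_1$ which converges to f.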

Hence /(f) = lim /(fn). But 1

s'

/(fn) = Z /(XEr,,) 8r=1

1

s'

-8r=1 Zi lu(Er.s) = ffndlj,,

and from the definition of the integral of a non-negative.-measurable function we have

/(f) = lim ffnda = ffd,u. Conversely, if f : X -+ R+ is non-negative and integrable with respect top, then each of the sets Er,s is insaf and has finite,-measure. Hence xEr 8 and therefore fn are in L1. Since

= lim f f- du = lim aNn) < co, by theorem 9.6, f = lim fn is in L.I. This completes the representation

theorem for non-negative functions. But for both the functional f, and the integral with respect to ,u we have a decomposition f = f+-fof any integrable f: X -* R* as the difference of two non-negative integrable functions, so we can deduce the representation for arbitrary integrable functions. I An obvious question arising is that of uniqueness for the measure

,u in theorem 9.7. This cannot always be obtained, but we give an outline of the uniqueness proof under suitable conditions in exercises 9.4(8, 9).
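The approximating sequence used in the proof, $f_n = s^{-1}\sum_{r=1}^{s^2}\chi_{E_{r,s}}$ with $s = 2^n$ and $E_{r,s} = \{f > r/s\}$, can be evaluated pointwise. The following Python sketch is an added illustration only: the function names `dyadic_approx` and `approximations` are hypothetical, and exact rational arithmetic is an assumption made here to avoid rounding issues.

```python
from fractions import Fraction


def dyadic_approx(value, n):
    """Value at one point of f_n = (1/s) * sum_{r=1}^{s^2} chi_{E_{r,s}}, s = 2**n.

    `value` plays the role of f(x); the point x lies in E_{r,s}
    exactly when f(x) > r/s.
    """
    s = 2 ** n
    count = sum(1 for r in range(1, s * s + 1) if value > Fraction(r, s))
    return Fraction(count, s)


def approximations(value, levels):
    """Return [f_0(x), f_1(x), ..., f_{levels-1}(x)] for f(x) = value."""
    return [dyadic_approx(value, n) for n in range(levels)]


# Illustrative point value f(x) = 22/7.
values = approximations(Fraction(22, 7), 5)
```

For a finite value f(x) the sketch returns a non-decreasing list bounded above by f(x); once $2^n \ge f(x)$ the n-th entry differs from f(x) by at most $2^{-n}$, which is the monotone convergence used in the proof.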


Exercises 9.4

1. Show that the condition (9.4.1) for a positive linear functional is equivalent to saying that, if $\{u_n\}$ is a sequence of non-negative functions in L and $\phi \in L$ satisfies $\phi \le \sum u_n$, then $\mathcal{I}(\phi) \le \sum \mathcal{I}(u_n)$.

2. If $(\Omega, \mu)$ is a $\sigma$-finite measure space, L is the class of $\mu$-integrable functions and $\mathcal{I}(f) = \int f\,d\mu$, show that $\mathcal{I}$ is a Daniell functional on L.

3. Let J be the class of continuous functions on R to R which are zero outside [-K, K] for some K and put
$$\mathcal{I}(f) = \int f(x)\,dx$$
in the Riemann sense. Show that $\mathcal{I}$ is a Daniell functional on J.

4. If $\mathcal{J}$ is a Daniell integral defined on the class $L_1$ prove that $f \in L_1 \Rightarrow |f| \in L_1$.

5. (Fatou for Daniell integral.) Suppose $\{f_n\}$ is a sequence of non-negative functions in $L_1$. Prove that $\liminf f_n$ is in $L_1$ if $\liminf \mathcal{J}(f_n) < \infty$ and in this case
$$\mathcal{J}(\liminf f_n) \le \liminf \mathcal{J}(f_n).$$

6. (Dominated convergence.) Suppose $\{f_n\}$ is a convergent sequence in $L_1$ such that $|f_n| \le g$ for all n where $g \in L_1$. Then if $f = \lim f_n$, $f \in L_1$ and
$$\mathcal{J}(f) = \lim \mathcal{J}(f_n).$$

7. Suppose $\mu$ is a measure on a field $\mathcal{A}$ of subsets of X, and L is the family of finite linear combinations of indicator functions of sets of $\mathcal{A}$ with finite measure. Show that L is a vector lattice and if $\mathcal{I}$ is defined on L to be integration with respect to $\mu$, then $\mathcal{I}$ is a Daniell functional. Discuss its extension $\mathcal{J}$ to a Daniell integral.

8. Suppose $\mathcal{I}$ is a Daniell functional on a vector lattice L, and $\mathcal{I}'$ is an extension of $\mathcal{I}$ to a Daniell functional on a vector lattice $L' \supset L$. If $\mathcal{I}$ and $\mathcal{I}'$ are extended to give Daniell integrals $\mathcal{J}$, $\mathcal{J}'$ over $L_1$ and $L_1'$ show that $L_1' \supset L_1$ and $\mathcal{J}'$ is an extension of $\mathcal{J}$.

9. Suppose L is a fixed vector lattice containing the constant function 1 and $\mathcal{B}$ is the smallest $\sigma$-field of subsets of X such that each function in L is measurable $\mathcal{B}$. Prove that for each Daniell integral $\mathcal{J}$ on $L_1$ there is a unique measure $\mu$ on $\mathcal{B}$ such that
$$\mathcal{J}(f) = \int f\,d\mu \quad \text{for all } f \in L.$$
Hint. If $\mathcal{A}$ is the $\sigma$-field of sets measurable w.r.t. $\mathcal{J}$, $\mathcal{A} \supset \mathcal{B}$. Existence of $\mu$ follows from theorem 9.7. To prove uniqueness it is sufficient to show that for any such $\mu$, $\mu(B) = \mathcal{J}(\chi_B)$ for all $B \in \mathcal{B}$. Use questions 8 and 7 above to extend the two Daniell functionals: one given and the other defined in terms of the integrals with respect to $\mu$.


9.5* Representation of linear functionals

In this section we restrict our attention to topological spaces X which are locally compact and Hausdorff. A topological space is Hausdorff if given two distinct points $x, y \in X$, there are open sets G, H with $x \in G$, $y \in H$, $G \cap H = \emptyset$. The family of functions $f : X \to R$ which are continuous on X and vanish outside a compact subset of X is called $C_0(X)$. If we define the support of a function $f : X \to R$ to be the closure of the set $\{x : f(x) \ne 0\}$, then $C_0(X)$ is the family of those continuous functions $f : X \to R$ which have compact support.

Baire sets and measure

The class of Baire sets is the smallest $\sigma$-field $\mathcal{C}$ of subsets of X such that each function f in $C_0(X)$ is $\mathcal{C}$-measurable. Thus $\mathcal{C}$ is the $\sigma$-field generated by the sets of the form $\{x : f(x) > a\}$, $f \in C_0(X)$, $a \in R$. A measure $\mu$ is called a Baire measure on X if it is defined on the $\sigma$-field $\mathcal{C}$ of Baire subsets, and $\mu(K)$ is finite for each compact set K in $\mathcal{C}$. Clearly $C_0(X)$ is a normed linear space if we put
$$\|f\| = \sup_{x \in X} |f(x)|,$$
and we will also use the fact that $C_0(X)$ is a vector lattice. This allows us to identify the positive linear functionals on $C_0(X)$.

Theorem 9.8 (Riesz). Suppose X is locally compact Hausdorff, and $\mathcal{I}$ is a positive linear functional on the space $C_0(X)$ of continuous functions $f : X \to R$ with compact support. Then there is a Baire measure $\mu$ on X such that
$$\mathcal{I}(f) = \int f\,d\mu \quad \text{for all } f \in C_0(X).$$

Proof. The first step is to show that $\mathcal{I}$ must be a Daniell functional on $C_0(X)$. Suppose $f \in C_0(X)$, $\{f_n\}$ is an increasing sequence in $C_0(X)$ and $f \le \lim f_n$. In order to prove that $\mathcal{I}(f) \le \lim \mathcal{I}(f_n)$ it is sufficient to show that $\mathcal{I}(f) = \lim \mathcal{I}(g_n)$ where $g_n = f \wedge f_n$, so that $f = \lim g_n \le \lim f_n$.

But then, if we put $h_n = f - g_n$ we obtain a decreasing sequence of functions of $C_0(X)$ whose limit is zero. Let K be the support of $h_1$; then there is a function $\phi$ in $C_0(X)$ which is non-negative and satisfies $\phi(x) = 1$ for $x \in K$.† For each $x \in K$, $\epsilon > 0$ there is an $n_x$ such that $h_{n_x}(x) < \tfrac{1}{2}\epsilon$ and, since $h_{n_x}$ is continuous, there is an open set $G_x$ for which $x \in G_x$ and $h_{n_x}(t) < \epsilon$ for $t \in G_x$.

† This uses a separation property of X; see, for example, page 146 of J. L. Kelley, General Topology, Van Nostrand (1955).

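Since K is compact there is a finite subcovering $G_{x_1}, \dots, G_{x_s}$ of K. If $N = \max[n_{x_1}, \dots, n_{x_s}]$ we have $h_n(x) < \epsilon$ for all x in K, $n \ge N$. Each $h_n$ vanishes outside K, because $0 \le h_n \le h_1$, so that $0 \le h_n \le \epsilon\phi$ for $n \ge N$. Thus
$$0 \le \mathcal{I}(h_n) \le \epsilon\,\mathcal{I}(\phi) \quad \text{for } n \ge N,$$
and, since $\epsilon$ is arbitrary, $\mathcal{I}(h_n) \to 0$; that is, $\mathcal{I}(f) = \lim \mathcal{I}(g_n)$.

We can now apply theorem 9.7 to the extension $\mathcal{J}$ of $\mathcal{I}$ to $L_1 \supset C_0(X)$ to obtain a measure $\mu$ on the $\sigma$-field $\mathcal{A}$ which contains the Baire sets and such that, for $f \in C_0(X)$,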

$$\mathcal{I}(f) = \mathcal{J}(f) = \int f\,d\mu.$$
By considering the above function $\phi$ which is in $C_0(X)$ and takes the value 1 on the compact K, we see that
$$\mu(K) = \mathcal{J}(\chi_K) \le \mathcal{J}(\phi) = \int \phi\,d\mu < \infty,$$
so that the measure $\mu$ we have obtained is finite on compact sets. ]

When X is compact, $C_0(X)$ is the same as $C(X)$, the space of continuous $f : X \to R$, so that in this case the positive linear functionals

on $C(X)$ correspond to finite Baire measures. Further, because of exercise 9.4 (9) there is uniqueness. This gives

Corollary. If X is a compact topological space and $C(X)$ is the set of continuous functions $f : X \to R$, then there is a (1, 1) correspondence between positive linear functionals $\mathcal{I}$ on $C(X)$ and finite Baire measures $\mu$ on X given by
$$\mathcal{I}(f) = \int f\,d\mu.$$

If we want to consider more general linear functionals on C(X), it is convenient to express these as the difference of two positive linear functionals so that theorem 9.8 can be applied. This can be done for bounded linear functionals.

If L is a vector lattice of bounded functions $f : X \to R$, then L is a normed linear space with $\|f\| = \sup |f(x)|$. A bounded linear functional F has a norm
$$\|F\| = \sup_{\|f\| \le 1} |F(f)|.$$

Theorem 9.9. Suppose L is a vector lattice of bounded functions $f : X \to R$ which contains the constant function 1. Then for each bounded linear functional F on L, there are two positive linear functionals $F^+$ and $F^-$ such that $F = F^+ - F^-$ and
$$\|F\| = F^+(1) + F^-(1).$$


Proof. For each $f \ge 0$ in L put
$$F^+(f) = \sup_{0 \le g \le f} F(g).$$
Since $F(0) = 0$, $F^+(f) \ge 0$ and $F^+(f) \ge F(f)$. Further $F^+(cf) = cF^+(f)$ for $c > 0$.

If f, g are two non-negative functions in L, and $\phi, \chi \in L$ are such that $0 \le \phi \le f$, $0 \le \chi \le g$, then $0 \le \phi + \chi \le f + g$, so that $F^+(f+g) \ge F(\phi) + F(\chi)$. Taking suprema over all such $\phi$, $\chi$ in L gives
$$F^+(f+g) \ge F^+(f) + F^+(g).$$
To obtain the reverse inequality consider $\chi \in L$ such that $0 \le \chi \le f + g$: then $0 \le \chi \wedge f \le f$ and $0 \le \chi - (\chi \wedge f) \le g$, so that
$$F(\chi) = F(\chi \wedge f) + F[\chi - (\chi \wedge f)] \le F^+(f) + F^+(g),$$
and taking the supremum over such $\chi$ gives
$$F^+(f+g) \le F^+(f) + F^+(g).$$

For an arbitrary $f \in L$, let p, q be two constants such that $(f + p)$ and $(f + q)$ are both non-negative. Then
$$F^+(f+p+q) = F^+(f+p) + F^+(q) = F^+(f+q) + F^+(p),$$
so that
$$F^+(f+p) - F^+(p) = F^+(f+q) - F^+(q).$$
This means that the value of $[F^+(f+p) - F^+(p)]$ is independent of p and we can define $F^+(f)$ to be this value. Thus $F^+$ is now defined on L,
$$F^+(f+g) = F^+(f) + F^+(g) \quad \text{for all } f, g \in L,$$
and
$$F^+(cf) = cF^+(f) \quad \text{for } c \ge 0,\ f \in L.$$
But $F^+(-f) + F^+(f) = F^+(0) = 0$, so we have $F^+(-f) = -F^+(f)$ and $F^+$ is a positive linear functional on L.

But $F^+(f) \ge F(f)$ for $f \ge 0$, so that $F^- = F^+ - F$ is also a positive linear functional on L. Now
$$\|F\| \le \|F^+\| + \|F^-\| = F^+(1) + F^-(1).$$
To establish the opposite inequality consider functions $f \in L$ for which $0 \le f \le 1$. Since $|2f - 1| \le 1$,
$$\|F\| \ge F(2f - 1) = 2F(f) - F(1).$$
Taking the supremum over such f gives
$$\|F\| \ge 2F^+(1) - F(1) = F^+(1) + F^-(1). \ ]$$
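The decomposition of theorem 9.9 can be seen concretely when X is a finite set, so that L is all of $R^n$ with the supremum norm. The Python sketch below is an added illustration under that assumption; the weight vector and the helper names (`F`, `F_plus_bruteforce`, `F_plus`, `F_minus`, `norm`) are hypothetical. A linear functional is then $F(f) = \sum_i w_i f_i$, and the supremum defining $F^+(f)$ is attained at a vertex of the box $0 \le g \le f$.

```python
import itertools


def F(weights, f):
    """Linear functional F(f) = sum_i w_i * f_i on functions on a finite set."""
    return sum(w * x for w, x in zip(weights, f))


def F_plus_bruteforce(weights, f):
    """sup{F(g) : 0 <= g <= f} for f >= 0, by checking the vertices g_i in {0, f_i}.

    A linear function on a box attains its supremum at a vertex.
    """
    vertices = itertools.product(*[(0, x) for x in f])
    return max(F(weights, g) for g in vertices)


def F_plus(weights, f):
    """Closed form on a finite set: keep only the positive weights."""
    return sum(max(w, 0) * x for w, x in zip(weights, f))


def F_minus(weights, f):
    """F^- = F^+ - F."""
    return F_plus(weights, f) - F(weights, f)


def norm(weights):
    """Norm of F with respect to the sup norm on functions: sum of |w_i|."""
    return sum(abs(w) for w in weights)


weights = [2, -3, 0, 5]   # illustrative functional
ones = [1, 1, 1, 1]       # the constant function 1
```

With these weights, `F_plus(weights, ones) + F_minus(weights, ones)` agrees with `norm(weights)`, which is the identity $\|F\| = F^+(1) + F^-(1)$ in this finite setting.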

Corollary. Let X be a compact Hausdorff space and $C(X)$ the set of continuous functions $f : X \to R$. Then there is a (1, 1) correspondence between finite signed Baire measures $\nu$ on X and the dual space to $C(X)$ given by
$$F(f) = \int f\,d\nu.$$
Moreover, $\|F\| = |\nu|(X)$.

Proof. If one starts with a finite signed Baire measure $\nu$, then by theorem 3.3, there is a decomposition $\nu = \nu_+ - \nu_-$ into the difference of two finite Baire measures. Clearly
$$F(f) = \int f\,d\nu_+ - \int f\,d\nu_-$$
then defines a bounded linear functional on $C(X)$ since each function f in $C(X)$ is bounded and measurable with respect to the class of Baire sets. Conversely, given a bounded linear functional F on $C(X)$, this can be decomposed by theorem 9.9 into the difference $F = F^+ - F^-$ of two positive linear functionals. Apply theorem 9.8 and corollary to find finite Baire measures $\mu_1$, $\mu_2$ with
$$F^+(f) = \int f\,d\mu_1, \qquad F^-(f) = \int f\,d\mu_2.$$
If we put $\nu = \mu_1 - \mu_2$, then $\nu$ is a finite signed Baire measure and
$$F(f) = \int f\,d\nu.$$
Now
$$|F(f)| \le \int |f|\,d|\nu| \le \|f\|\,|\nu|(X),$$
so that $\|F\| \le |\nu|(X)$. Further
$$|\nu|(X) \le \mu_1(X) + \mu_2(X) = F^+(1) + F^-(1) = \|F\|,$$
so we have $\|F\| = |\nu|(X)$.

To prove that $\nu$ is uniquely determined by F, suppose there are two signed measures $\nu_1$, $\nu_2$ with
$$\int f\,d\nu_1 = \int f\,d\nu_2 \quad \text{for each } f \in C(X).$$
Decompose $\lambda = \nu_1 - \nu_2$ by theorem 3.2 to give $\lambda = \lambda_+ - \lambda_-$. Then
$$\int f\,d\lambda_+ = \int f\,d\lambda_- \quad \text{for all } f \in C(X),$$
so that by the uniqueness proved in exercise 9.4 (9), $\lambda_+ = \lambda_-$ on the Baire sets. Hence $\nu_1 = \nu_2$. ]

Exercises 9.5

1. Show that in a locally compact separable metric space the class of Baire sets is the same as the class of Borel sets.

2. Suppose $\mu$ is a Baire measure on a locally compact space X. Let H be the union of all open Baire sets G for which $\mu(G) = 0$. The complement $F = X - H$ is closed and called the support of $\mu$. Prove (i) if G is an open Baire set and $G \cap F \ne \emptyset$ then $\mu(G) > 0$; (ii) if K is a compact Baire set with $K \cap F = \emptyset$, then $\mu(K) = 0$; (iii) if $f \in C_0(X)$ and $f \ge 0$, $\int f\,d\mu = 0$ if and only if $f \equiv 0$ on F.

3. The corollary to theorem 9.9 is not valid on $C_0(X)$ for X locally compact Hausdorff. A Radon measure $\phi$ on a locally compact space is defined to be a linear functional on $C_0(X)$ which is continuous in the sense that, for each compact K, $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x)| < \delta$ for all x, with the support of f contained in K, implies that $|\phi(f)| < \epsilon$. Prove every positive linear functional is a Radon measure. For R and the usual topology define
$$\phi(f) = \sum_{r=1}^{\infty} (-1)^r f(r) \quad \text{for } f \in C_0(R).$$
Show that $\phi$ is a Radon measure, but that $\phi$ does not correspond to any signed Baire measure.

9.6* Haar measure

There is a general method of defining a measure on an important class of topological spaces which have the algebraic structure of a group. For notational purposes we will represent the group operation in the set X by multiplication. We do not assume that the group operation is commutative. For subsets A, B of X and an element $x \in X$ we define
$$xA = \{xy : y \in A\}, \quad Ax = \{yx : y \in A\}, \quad AB = \{xy : x \in A,\ y \in B\}, \quad A^{-1} = \{x : x^{-1} \in A\},$$
and call xA and Ax respectively the left translation and right translation of A by x. We also require the algebraic operations to be continuous in the topology of X. The theory of Haar measure can be developed for any such topological group which is locally compact and Hausdorff, but in this section we will make the additional (unnecessary) assumption that the topology is determined by a metric $\rho$.

A set X is a metric group if X is a group and there is a metric $\rho$ such that in $(X, \rho)$, the group operation is continuous. In particular
$$\lim_{n\to\infty} x_n = x_0,\ \lim_{n\to\infty} y_n = y_0 \ \Rightarrow\ \lim_{n\to\infty} x_n y_n = x_0 y_0, \quad \lim_{n\to\infty} x_n^{-1} = x_0^{-1}.$$

We will, for the remainder of the section, assume that X is a metric group which is locally compact in the topology of the metric. We are interested in measures for which the translation of A by any element x leaves the measure invariant. For example, the space R of real numbers is clearly a metric group with ordinary addition for the group operation. Given a set $E \subset R$, and a point $x \in R$, xE denotes the set of real numbers of the form x + y with $y \in E$. We showed in § 4.5 that Lebesgue measure in R is invariant under translations in the sense that, for measurable E, $|E| = |xE|$. The notion of an invariant measure in a topological group should be thought of as a generalisation of this property of Lebesgue measure in R. To be precise, a measure $\mu$ defined on the class $\mathcal{B}$ of Borel subsets of X is called a left Haar measure if

(i) $\mu$ is invariant under left translations; that is, for every $E \in \mathcal{B}$, $x \in X$, $\mu(xE) = \mu(E)$;
(ii) for every compact set C, $\mu(C) < \infty$;
(iii) for every non-void open set G, $\mu(G) > 0$.
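As a check on the three conditions in a case other than Lebesgue measure, consider the positive reals under multiplication with the usual metric. The computation below is an added illustration; the measure $\mu(E) = \int_E x^{-1}\,dx$ is an assumption introduced here for the example rather than something constructed in the text at this point.

```latex
\[
  \mu(cE) \;=\; \int_{cE} \frac{dx}{x}
          \;=\; \int_{E} \frac{c\,dy}{c\,y}
          \;=\; \int_{E} \frac{dy}{y}
          \;=\; \mu(E)
  \qquad (c > 0,\ x = cy),
\]
\[
  C \subset [a,b],\ 0 < a < b \;\Longrightarrow\; \mu(C) \le \log\frac{b}{a} < \infty,
  \qquad
  (a,b) \subset G \;\Longrightarrow\; \mu(G) \ge \log\frac{b}{a} > 0 .
\]
```

The first line is condition (i); the second line gives (ii), since every compact subset of $(0,\infty)$ lies in some $[a,b]$ with $a > 0$, and (iii), since every non-void open set contains an interval.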

Conditions (ii) and (iii) eliminate such trivial measures as the zero measure, and the measure which is $+\infty$ except on the null set. A right Haar measure is one for which left translation invariance is replaced by invariance for right translations. We give the details of construction for a left Haar measure: obvious modifications would give the right Haar measure.

Let $\mathcal{C}_0$ be the class of non-empty open subsets of X whose closures are compact. The important consequence of local compactness is that every compact K in X can be covered by a finite number of sets of $\mathcal{C}_0$. The sets $\emptyset$, X added to $\mathcal{C}_0$ form the class $\mathcal{C}$. The first step is to define a suitable set function $\lambda$ on $\mathcal{C}$. Suppose $H \in \mathcal{C}_0$, and G is any non-empty open set. Then
$$\mathcal{G} = \{xG : x \in HG^{-1}\}$$
is a class of open sets covering H since, if $y \in H$, $g \in G$, $x = yg^{-1}$, then $y = xg \in xG$. But $\bar{H}$ is compact so there is a finite subclass of $\mathcal{G}$ which covers H. Let the smallest number of sets of $\mathcal{G}$ which cover H be denoted
$$(H : G).$$
This is a measure of the relative sizes of H and G. It is immediate that, for $A, B, C \in \mathcal{C}_0$,
$$1 \le (A : C) \le (A : B)(B : C).$$
Now compare all sets with some fixed $H_0 \in \mathcal{C}_0$, and put, for each non-empty open set G, $H \in \mathcal{C}_0$,
$$\lambda_G(H) = \frac{(H : G)}{(H_0 : G)}.$$
Now, for fixed H, $\lambda_G(H)$ is a bounded function of G since
$$0 < \frac{1}{(H_0 : H)} \le \lambda_G(H) \le (H : H_0). \qquad (9.6.1)$$
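The two bounds in (9.6.1) follow from the submultiplicative inequality for covering numbers. The derivation below is an added worked step; it assumes that the inequality $(A:C) \le (A:B)(B:C)$ remains valid when C is replaced by an arbitrary non-empty open set G, which is what the covering argument gives.

```latex
\[
  (H_0 : G) \;\le\; (H_0 : H)\,(H : G)
  \;\Longrightarrow\;
  \lambda_G(H) \;=\; \frac{(H : G)}{(H_0 : G)} \;\ge\; \frac{1}{(H_0 : H)} ,
\]
\[
  (H : G) \;\le\; (H : H_0)\,(H_0 : G)
  \;\Longrightarrow\;
  \lambda_G(H) \;=\; \frac{(H : G)}{(H_0 : G)} \;\le\; (H : H_0) .
\]
```

Neither bound depends on G, which is why the sequence obtained by shrinking G to the identity is bounded.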

If e is the identity element of the group X and
$$S_n = S(e, 1/n) \quad (n = 1, 2, \dots)$$
is the open sphere centre e radius 1/n, then for each fixed $H \in \mathcal{C}_0$,
$$\lambda_{S_n}(H) \quad (n = 1, 2, \dots)$$
is a bounded sequence of real numbers. Put
$$\lambda(H) = \operatorname{Lim} \lambda_{S_n}(H),$$
where Lim is the generalized limit defined for all sequences in m using the Hahn-Banach theorem to extend the definition from c to m while preserving the norm (see exercise 8.4 (7)). Finally, put $\lambda(\emptyset) = 0$, $\lambda(X) = +\infty$ if X is not compact (and so not in $\mathcal{C}_0$).

Lemma. The set function $\lambda$ defined on $\mathcal{C}$ has the following properties:
(i) $0 < \lambda(H) < \infty$ for every $H \in \mathcal{C}_0$;
(ii) if $H_1, H_2 \in \mathcal{C}_0$, $d(H_1, H_2) > 0$ then $\lambda(H_1 \cup H_2) = \lambda(H_1) + \lambda(H_2)$;
(iii) for any $H_1, H_2 \in \mathcal{C}_0$, $\lambda(H_1 \cup H_2) \le \lambda(H_1) + \lambda(H_2)$;
(iv) if $H_1, H_2 \in \mathcal{C}_0$, $H_1 \subset H_2$ then $\lambda(H_1) \le \lambda(H_2)$;
(v) for any $x \in X$, $H \in \mathcal{C}_0$, $\lambda(xH) = \lambda(H)$.
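The covering ratios $\lambda_{S_n}(H)$ can be computed explicitly in the simplest case, X = R with addition, where every set involved is an open interval. The Python sketch below is an added illustration: the function names are hypothetical, and it assumes the elementary fact that an open interval of length L is covered by one translate of an open interval of length l when $L \le l$, and otherwise needs exactly $\lfloor L/l \rfloor + 1$ translates, since neighbouring translates must overlap.

```python
from fractions import Fraction
from math import floor


def covering_number(length_h, length_g):
    """(H : G) for open intervals H, G in R, given only their lengths.

    One translate suffices when H is no longer than G; otherwise k
    translates cover H exactly when k * length_g > length_h.
    """
    if length_h <= length_g:
        return 1
    return floor(length_h / length_g) + 1


def covering_ratio(length_h, length_h0, n):
    """lambda_{S_n}(H) = (H : S_n) / (H_0 : S_n) with S_n = (-1/n, 1/n)."""
    g = Fraction(2, n)  # length of the open sphere S_n in R
    return Fraction(covering_number(length_h, g),
                    covering_number(length_h0, g))


# H of length 3 compared with the reference set H_0 = (0, 1).
ratios = [covering_ratio(Fraction(3), Fraction(1), n) for n in (1, 10, 100, 1000)]
```

In this example the ratios approach 3, the ratio of the lengths of H and $H_0$, which is the behaviour one expects if the construction is to recover Lebesgue measure on R.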

Proof. By (9.6.1), $\lambda_{S_n}(H)$ is bounded below and above so that
$$0 < \frac{1}{(H_0 : H)} \le \lambda(H) \le (H : H_0) < \infty.$$
This establishes (i). Further, for each $H \in \mathcal{C}_0$, G open, the covering ratios satisfy $(xH : G) = (H : G)$ for all $x \in X$; so that the sequence $\lambda_{S_n}(H)$ is invariant under left translations: therefore $\lambda$ is also and (v) is proved. If $H_1, H_2 \in \mathcal{C}_0$ and $d(H_1, H_2) = \eta > 0$, then for $1/n < \tfrac{1}{2}\eta$ we must have
$$(H_1 \cup H_2 : S_n) = (H_1 : S_n) + (H_2 : S_n), \qquad \lambda_{S_n}(H_1 \cup H_2) = \lambda_{S_n}(H_1) + \lambda_{S_n}(H_2),$$
and (ii) is now established by taking generalised limits. Now for any open G, and $H_1, H_2 \in \mathcal{C}_0$,
$$(H_1 \cup H_2 : G) \le (H_1 : G) + (H_2 : G),$$
so
$$\lambda_{S_n}(H_1 \cup H_2) \le \lambda_{S_n}(H_1) + \lambda_{S_n}(H_2);$$
this implies (iii) and a similar argument gives (iv). ]

We now define a set function $\mu^*$ for all subsets of X by
$$\mu^*(E) = \inf \sum_{i=1}^{\infty} \lambda(H_i), \qquad (9.6.2)$$
where the infimum is taken over all coverings $\{H_i\}$ of E by sets in $\mathcal{C}$.

Theorem 9.10. In a locally compact metric group, the set function $\mu^*$ given by (9.6.2) is a metric outer measure. The restriction $\mu$ of $\mu^*$ to the class $\mathcal{B}$ of Borel sets is a left Haar measure.

Proof. In the definition of outer measure given in § 3.1, condition (i) is obvious, (ii) follows from (iv) of the lemma, and subadditivity (iii) follows from (9.6.2) as in the proof of theorem 4.2. Thus $\mu^*$ is an outer measure. Now suppose $E_1, E_2 \subset X$ with $d(E_1, E_2) > 0$. If $E_1 \cup E_2$ cannot be covered by a sequence from $\mathcal{C}_0$, then at least one of the sets $E_1$, $E_2$ cannot be covered by such a sequence and
$$\mu^*(E_1 \cup E_2) = \mu^*(E_1) + \mu^*(E_2) \qquad (9.6.3)$$
since both sides are $+\infty$. If $E_1$, $E_2$ can be covered by sequences from $\mathcal{C}_0$, first choose open sets $G_1 \supset E_1$, $G_2 \supset E_2$ for which $d(G_1, G_2) > 0$ and let $\{H_i\}$ be a sequence of sets from $\mathcal{C}_0$ covering $E_1 \cup E_2$ with
$$\sum \lambda(H_i) \le \mu^*(E_1 \cup E_2) + \epsilon.$$
For each i, put
$$H_i^1 = G_1 \cap H_i, \qquad H_i^2 = G_2 \cap H_i.$$
Then by (ii) and (iv) of the lemma, for each integer i,
$$\lambda(H_i) \ge \lambda(H_i^1 \cup H_i^2) = \lambda(H_i^1) + \lambda(H_i^2)$$
and so
$$\mu^*(E_1) + \mu^*(E_2) \le \sum \lambda(H_i) \le \mu^*(E_1 \cup E_2) + \epsilon.$$
Since this is true for each $\epsilon > 0$, and $\mu^*$ is subadditive, we have established (9.6.3) so that $\mu^*$ is a metric outer measure.

Now apply theorem 4.1 to $\mu^*$ to obtain a measure $\mu$ on a class $\mathcal{M}$ of $\mu^*$-measurable sets. Since $\mu^*$ is a metric outer measure, this class $\mathcal{M}$ includes the open sets and therefore the Borel sets $\mathcal{B}$ (see exercise 4.3 (4)); so that the restriction of $\mu^*$ to $\mathcal{B}$ defines a measure on $\mathcal{B}$. If we now examine the conditions for $\mu$ to be left Haar measure we see that (v) of the lemma implies that $\mu^*$ is left translation invariant. If K is any compact set in X, there is a finite subclass of $\mathcal{C}_0$ which covers K so that
$$\mu^*(K) \le \sum_{i=1}^{n} \lambda(H_i) < \infty,$$
so that condition (ii) for a Haar measure is satisfied. Now suppose G is any non-void open set in X. If $x \in G$, pick $\epsilon > 0$ such that $S(x, \epsilon) \subset G$ and put $E = S(x, \tfrac{1}{2}\epsilon)$ so that $\bar{E} \subset G$. Since X is locally compact we may assume $\epsilon$ is small enough to make $\bar{E}$ compact so that $E \in \mathcal{C}_0$. If $\mu(G) = \infty$ then $\mu(G) > 0$; so we may suppose $\mu(G) < \infty$. For each $\eta > 0$ there is a sequence $\{H_i\}$ from $\mathcal{C}_0$ such that
$$\bigcup_{i=1}^{\infty} H_i \supset G \supset \bar{E}, \qquad \sum \lambda(H_i) < \mu^*(G) + \eta.$$
But $\bar{E}$ is compact so a finite number of the $H_i$ must cover $\bar{E}$. Then if $\bigcup_{i=1}^{n} H_i \supset \bar{E}$,
$$\lambda(E) \le \lambda\Bigl(\bigcup_{i=1}^{n} H_i\Bigr) \le \sum_{i=1}^{n} \lambda(H_i) < \mu^*(G) + \eta,$$
and since $\eta$ is arbitrary we have
$$\mu(G) = \mu^*(G) \ge \lambda(E) > 0,$$
so that $\mu$ satisfies condition (iii) for a Haar measure. ]

Corollary. For any compact metric group X there is a left Haar measure P defined on a $\sigma$-field $\mathcal{F}$ which includes the Borel sets such that $(X, \mathcal{F}, P)$ is a probability space.

Proof. If X is compact, the above construction gives a left Haar measure in which
$$0 < \mu(X) < \infty$$
with $\mu$ defined on a $\sigma$-field $\mathcal{F}$ which is complete with respect to $\mu$. If we put
$$P(E) = \frac{\mu(E)}{\mu(X)} \quad \text{for } E \in \mathcal{F},$$
it is clear that $(X, \mathcal{F}, P)$ is a probability space. ]

Exercises 9.6

1. Suppose $\Omega$ is the set of positive real numbers with the usual metric and multiplication for the group operation. If (1, e) is the reference set $H_0$ used in the definition of Haar measure $\mu$ in $\Omega$, show that for each interval $(a, b) \subset \Omega$,
$$\mu(a, b) = \log (b/a).$$
(Here e is the base for Napierian logarithms.)

2. With X = R and addition for the group operation define Haar measure $\mu$ with (0, 1) taken as the reference set $H_0$. Show that $\mu$ is Lebesgue measure in R.

3. Let X be the set of 2 × 2 matrices of the form
$$\begin{pmatrix} x & y \\ 0 & x \end{pmatrix}$$
with x > 0 and multiplication for the group operation. Define a metric in X by using the Euclidean metric in $R^2$. Show that in the topology of this metric, X is a locally compact metric group. Define
$$F(x, y) = -\frac{y}{x}.$$
Map the Lebesgue-Stieltjes measure $\mu_F$ in the right half-plane x > 0 of $R^2$ onto the set X, and show that the result is both a left and a right Haar measure.

4. X is the set of 2 × 2 matrices of the form
$$\begin{pmatrix} x & y \\ 0 & 1 \end{pmatrix} \quad (x > 0)$$
metrised by the Euclidean metric in $R^2$. As in question 3, obtain a measure in X by mapping the Lebesgue-Stieltjes measure $\mu_F$ of question 3 into X. Show that this is a left Haar measure but not a right Haar measure.

5. If $\mu$ is a left Haar measure on X and $\nu$ is defined by $\nu(E) = \mu(E^{-1})$, show $\nu$ is a right Haar measure.

6. The left Haar measure of theorem 9.10 is regular in the sense that
$$\mu^*(E) = \inf\{\mu(G) : G \supset E,\ G \text{ open}\}.$$

7. Haar measure is obviously not unique since for any Haar measure $\mu$, c > 0 the measure $c\mu$ is also a Haar measure. However, on a compact metric group, with the condition $\mu(X) = 1$ it can be proved that the Haar measure is essentially unique.

8. If A, B are two compact sets with $\mu(A) = \mu(B) = 0$, does it follow that $\mu(AB) = 0$?

9. If $\mu$ is a Haar measure in X, then X has a discrete topology if and only if $\mu\{x\} \ne 0$ for at least one $x \in X$.

10. If a Haar measure $\mu$ on X is finite prove that X is compact.

11. In a locally compact metric group X show that a Haar measure $\mu$ on X is $\sigma$-finite if and only if X is $\sigma$-compact.


INDEX OF NOTATION A-B, 9

lp, 1w, 209

-4,-4-,43

2k, 79

R*, 103

F, 96

C0(X), 250

168

C(X), 251 C, 2

Y2(S2, µ), 194

C*(X), 48

M, .,f, 166 M(ca), 199 M, m, 46 p, gn, 15

c, 6,47t Wx-9, 134 W*-9, 135

R, Rn, 2 R*, 34, 51

C, 46 C(a, b), 209

((3; iEI), 136

Q, 2

D, D+, D_, D+, D-, 224 d(x, E), d(E, F), 27 E, 26

R+, 51 RI, 158 S(x, r), 24 S(x, r), 25

Ex, EY, 135

s, 46

ExF, 2

2, 82

e, 15 gn, 18 f-1, 4

5P(M), 18

fog, 4 f: A-i-B, 3

{x; P},1 Z, 2 a.e., 109 diam, 27 ess inf, ess sup, 167 lim inf, lim sup, 12

XI, 157 {x1}, 5

fe(y), 136 (f, g), 198 Via, 44

F(A), 19 19

9a, 44 -or, 100

K*, 211 43

L1, $1, 174 Lp, Yp, 175 12, 48

11+, ,u_, 62 96

,af-1, 154 per, 168 Pl, 174 pp, 175

(r--q, 77 T(x, A), 235

t Note that the symbol c has two distinct uses, which should not be confused.

$\chi_E$, 12

No,6 E, O, z , =>, 1

o,2


u, n, to 11.11,45

f , see chapter 5 <<, 148

H,3

V, A, 241

3,4

The symbol ] is used to signal

-,6

v, n,A,9

the end of a proof.


GENERAL INDEX absolute continuity of functions, 231; of measures, 120, 148, 236 integration, 128 additive set function, 51, 65, 214 algebra of subsets, 15 algebraic numbers, 13 almost everywhere, 108 uniform convergence, 168 approximation in measure, 84 to measurable functions, 131 area, 69, 79 under a curve, 146 atom, 64, 237 axiom of choice, 7, 19, 93 Baire sets, 132, 250 measure, 250 Baire's category theorem, 42 Banach space, 194, 209 Bessel's inequality, 204 Birkhoff's theorem, 190 Borel field, 16 -measurable function, 107, 154 sets, 43 Borelian sets, 43 bounded, 27 convergence theorem, 126 linear functional, 210, 251 variation, 228 bounds, upper and lower, 21 Brownian motion, 161

Cantor set, function, 49, 110, 152, 229, 236,

Cantor's lemma, 41 Caratheodory, 127 cardinal numbers, 5, 6 Cartesian product, 2, 38, 134 category, first and second, 41 Cauchy integral, 127 -Riemann integral, 129 sequence, 29, 167, 171 chain, 20 maximal, 21 change of variable, 155 class, 2 closed linear span, 199 set, 25 sphere, 26 closure, 26 coarser topology, 40 collection, 2 compactification, 33, 34 compactness, 29, 30, 39

local, 31 complement, 9 complete measure, 81, 109, 166 metric space, 29, 175, 194 set in a linear space, 199 completion, 82 composition, 4, 154 conjugate indices, 183 space, 213, 215 consistency conditions, 158 continuous function, 35 set function, 56 continuum hypothesis, 7, 58 convergence, 180-1 in mean, 174 in measure, 171 in norm, 204 in pth mean, 174 of sets, 12 countable, 7 basis for measure, 195, 207 counting measure, 185 covering, 30, 225 cylinder set, 136, 140, 158

Daniell extension, 242 functional, 241 integral, 101, 223, 241, 246 -Kolmogorov theorem, 159 decomposition, Hahn-Jordan, 61, 64 Lebesgue, 149, 239 de Morgan's laws, 10 dense, 42, 195 density, 234 derivate, 224 derivative, 224, 240 Radon-Nikodym, 152_223 derived set, 26 diameter, 27 differentiable, 224 discrete distribution measure, 53, 80, 237 probability, 98 topology, 28 disjoint, 11 dissection, 102 distance, 23, 27 distribution function, 96-7 domain, 3 dominated convergence theorem, 121,

125-6, 180, 249 dual space, 21, 215

Egoroff's theorem, 169


elementary event figures, 17-18; length, area and volume of, 69

functions, 109 empty set, 2 enumerable, 7 equicontinuity, 177 equivalence relation, 6, 166 ergodic theorems, maximal, 188; mean, 221; pointwise, 190 transformation, 192, 222 essentially bounded, 167 extended real numbers, 33 extension of functions, 4 of set functions, 58 theorems, 65-6, 77, 244

interior point, 28 intersection, 9, 10 invariant function, 190 measure, 90, 255 inverse function, 4 image, 4 invertible, 187 Jacobian, 156

Jordan decomposition, 61, 64 jump function, 237 Kolmogorov, 159 Kuratowski's lemma, 21

least upper bound axiom, 21, 30 Lebesgue convergence theorem, 121

Fatou's lemma, 120, 249 field, 15 finite-dimensional distributions, inter-

section property, 32 Fourier coefficients, 203 series, 203

Fubini's theorem, 143-4 function, 3 space, 157

generated ring, 17, 65 o -ring, 18, 77 z-class, 17

Gram-Schmidt orthogonalisation, 201 groups, 254 Haar measure, 255, 257, 259 Hahn-Banach theorem, 212-13 decomposition, 61, 64 Hausdorff space, 250 Heine-Borel theorem, 30 Hilbert cube, 48 space, 194, 202

Holder's inequality, 183, 186 indefinite integral, 127, 149, 230, 234 indicator function, 12 indiscrete topology, 28 inequalities, 183 inner measure, 75 product, 198 integrable function, 113-14,127, 129, 246 set, 246 integral, 100 Cauchy, 127 Cauchy-Riemann, 129 Daniell, 101, 223, 241, 246 Lebesgue, 124 Lebesgue-Stieltjes, 125 Riemann, 100, 128, 129, 176 integration by parts, 157

covering lemma, 38 decomposition theorem, 149, 239 density theorem, 235 integral, 124 measurable, 108 measure, 69, 79, 195

-Stieltjes integral, 125; measure, 95, 198, 236

Legendre polynomials, 202 length, 69, 79 limit of a sequence, 27 point, 26 linear dependence, 199 functional, 201, 215, 241, 250 space, 45, 194 span, 199 subspace, 212 Lindelof space, 22 Liouville's theorem, 188 local compactness, 31, 250, 254

majorised, see dominated mapping, 3, 153 marginal distribution, 163 maximal ergodic theorem, 188 mean ergodic theorem, 221 measurable function, 103, 107, 166, 246

GENERAL INDEX metrisable, 25 Minkowski's inequality, 184, 186 monotone class, 16, 79 class theorem, 18 convergence theorem, 119 sequence, 12 set function, 60 monotonic function, 8, 224, 226 mutually singular measures, 149 neighbourhood, 26 non-atomic, 64, 238 non-measurable set, 93 norm, 45, 194, 211 normal numbers, 193 normed linear space, 44-5 nowhere dense, 41, 49 null set, 2 open covering, 30 set, 24 sphere, 24 ordered pairs, 2 ordering, 20 ordinate set, 145 orthogonal system, 199, 200 orthogonalisation, 201 orthonormal system, 200 outer measure, 59, 74 metric, 86, 88, 257

parallelogram law, 209 Parseval's identity, 204 partial ordering, 20 perfect set, 44, 49 phase space, 187 point of density, 235 of dispersion, 235 pointwise convergence, 166 positive linear functional, 241, 250 probability measure, 96, 98 space, 96 product field, 134 measure, 139, 141, 164 ring, 134 v-field, 134 space, 3, 5, 134 z-class, 134 projection, 40, 135 proper subset, 1

Rademacher functions, 202 Radon measure, 254 -Nikodym derivative, 152, 223; theorem, 149 range of a function, 3 rectangle, 134 reflexive, 20, 218


regular measure, 86

outer measure, 75 representation of linear functionals, 250 restriction, 4

of a set function, 58, 75 Riemann integral, 100, 128, 129, 176 Riesz-Fischer theorem, 205 Riesz's lemma, 188 representation theorem, 250 ring, 15 (in algebraic sense), 18 scalar product, 198 Schröder-Bernstein theorem, 6 Schwarz's inequality, 184 section, 135-6 semi-ring, 15 separable, 48, 187, 194 separating functional, 219 sequence, 4 monotone, 12, 18 set, 1 function, 51 shift, 193 sigma additive (σ-additive), 54 algebra (σ-algebra), 16 compact (σ-compact), 43 field (σ-field), 16 finite (σ-finite), 59 ring (σ-ring), 16 simple function, 102 singular, 149, 238 statistical mechanics, 188 Stieltjes measure, 95-6, 99, 163 see also Lebesgue-Stieltjes Stone's theorem, 247 strong derivative, 240 subadditive, 59, 213 support, 250 supremum, 21, 29 thick, 164

topological group, 254 space, 25

transcendental numbers 14 transformation, 3, 89, 154 measure-preserving, 187 transitive, 20 triangle inequality, 23 trigonometric functions, 202 Tychonoff's theorem, 39

uniform absolute continuity, 178 continuity, 37 convergence, 167 union, 9, 10 uniqueness of extension, 77

upper bound, 21 vector lattice, 241 Venn diagram, 9 Vitali covering, 225 volume, 69, 79 Von Neumann, 15

Weierstrass property, 31 well ordered, 2, 22 Wiener measure, 162 z-class, 16, 134 Zorn's lemma, 21
