The Structure of Lebesgue Integration Theory G. TEMPLE
OXFORD AT THE CLARENDON PRESS 1971
Oxford University Press, Ely House, London W. 1
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON CAPE TOWN SALISBURY IBADAN NAIROBI DAR ES SALAAM LUSAKA ADDIS ABABA BOMBAY CALCUTTA MADRAS KARACHI LAHORE DACCA KUALA LUMPUR SINGAPORE HONG KONG TOKYO
©OXFORD UNIVERSITY PRESS 1971
PRINTED IN GREAT BRITAIN
Preface
The purpose of this work is to introduce the principles and techniques of the theory of integration in the general and simple form that we owe primarily to Lebesgue, de la Vallée-Poussin, and W. H. Young. It is addressed to those who are already familiar with the elementary calculus of differentiation and integration as applied to the standard functions of algebraical and trigonometrical type. Some slight acquaintance with the topology of open and closed sets may also now be presumed in most first-year undergraduates, for whom the book is written, but it is not essential. I have endeavoured to provide an account of the essentials of the theory and practice of Lebesgue integration that are indispensable in analysis, in theoretical physics, and in the theory of probability in a form that can be readily assimilated by students reading for honours in mathematics, physics, or engineering. To realize this purpose is a serious and important pedagogical problem, for the theory of Lebesgue integration occupies a strange, ambivalent position in the minds of mathematicians confronted with the challenge of planning a syllabus for undergraduates. Then Lebesgue integration appears to be at once indispensable and unattainable, desirable and impracticable. As compared with 'Riemann' integration, so strongly entrenched in university courses of analysis, the subject of 'Lebesgue' integration possesses three great advantages: it is applicable to a much larger class of functions, the properties of the integral are much easier to establish, and the applications of the theory are made with much greater facility. And yet the Lebesgue theory is almost universally regarded as too difficult for inclusion in undergraduate instruction, and in spite of the numerous excellent expositions of the Lebesgue theory, there is still a need for a strictly elementary account of the subject, which will make it readily accessible and utilizable in an undergraduate course.
First of all let us squarely face the ineluctable problems which confront the writer who aspires to provide what the French so happily call 'une œuvre de la haute vulgarisation'. It must be admitted that there are undoubtedly two arduous passages in the traditional approach to the Lebesgue theory. The first is the theory of measure and the second is the theory of the differentiation of an indefinite integral. The first difficulty can be evaded by framing a direct definition of the integral, independently of the theory of measure. This has been done by L. C. Young, by O. Perron, and by F. Riesz and B. Sz.-Nagy. For a number of cogent reasons I have resisted the temptation to follow this seductive deviation from the traditional route. In the first place in the systematic and structural account of the Lebesgue theory the theory of measure is conceptually prior to the general theory of integration, since it is in fact the theory of the integration of the simple functions whose range consists of just two numbers, zero and unity. In the second place it is impossible to avoid the concept of sets of points of 'zero measure'. In the third place the theory of measure is indispensable in such important applications as ergodic theory and the theory of probability. In the present introduction to integration theory the theory of measure has therefore been retained, but an attempt has been made to simplify and shorten the exposition by translating the traditional account from geometrical into analytical language. This is easily accomplished by systematically representing a set of points $E$ by its characteristic function $\chi(E)$, a device suggested by Ch. J. de la Vallée-Poussin. The theory has also been simplified by replacing 'open sets' as the central concept, by enumerable collections of intervals which may be closed, open, or half-open. The second serious difficulty in the Lebesgue theory is the differentiation of an indefinite integral.
In elementary calculus an integral $\varphi(x)$ of a function $f(x)$ is descriptively defined as a function whose derivative is the integrand $f(x)$. To students familiar with this concept it must be a sad and disheartening experience to realize that the corresponding property of the Lebesgue integral is almost the last to be established, and that it requires the formidable apparatus of the covering theorems of Vitali or of Riesz, or the theory of réseaux developed by de la Vallée-Poussin. In the method of treatment proposed in this book the theory of differentiation is based on the analytical discussion given by F. Riesz and B. Sz.-Nagy (1953) and its geometrical expression in the 'rising sun' theorem of F. Boas (1960). The removal of these two well-known difficulties in the Lebesgue theory is scarcely an adequate excuse for the publication of yet another introduction to this much-introduced subject. The real justification lies in the more exacting demands now made on authors of mathematical works. The revolution in mathematical theory, which is still proceeding, has also provoked a revolution in mathematical teaching, and has imposed new canons of exposition. The two essential and necessary conditions which a modern textbook must attempt to satisfy are those associated with the key-words 'motivation' and 'structure'. An exposition of any branch of mathematics must now provide the student with an adequate motivation, that is with a line of thought which leads naturally, and almost inevitably and automatically, from the elementary concepts and methods he already possesses to the more general and abstract ideas and techniques of the theory that he proposes to study. The motivation reveals the inadequacy of our present knowledge, it poses urgent and important questions we are as yet unable to solve, and, in its highest achievements, it restates these questions in a form which suggests what methods must be devised for a solution.
The appetites excited by motivation must then be satisfied by a systematic exposition in which the whole of the theory is dominated by a few simple principles that endow the subject with a definite structure that can be described in advance before the student is committed to a detailed study. Thus a structure is not so much a set of definitions and theorems as a programme that directs the advance of the whole subject. These are high ideals for any writer and the present book must
be regarded as an experiment designed to determine how far these ideals can be realized in an introduction to the subject of integration. This is a subject which now offers most appropriate material for such an investigation. Numerous accounts of the theory of integration have been published, each of them furnishing its own special insight and technique. It now seems possible to extract the essential motivating ideas and structural principles that unify the whole theory. The motivating ideas, described in Chapter 2, lead from the Archimedean 'method of exhaustion' to the general concept of an integral. The structural principles are two in number, here termed the principle of bracketing and the principle of monotony. The principle of bracketing is a method of induction which enables us to extend a class of functions which are susceptible of integration, by using the concept of upper and lower integrals. We can thus ascend from the concept of area or volume to the Lebesgue measure and the Lebesgue integral. To carry out this programme we need the principle of monotony by which all sequences are reduced to monotone sequences and all functions are reduced to monotone functions. The problems of convergence and of integration are thus reduced to their simplest possible form. Both of these principles are firmly embedded in the literature, especially in the writings of W. H. Young and L. C. Young. The main contribution of the present work is the exhibition of these concepts as the essential structure of the Lebesgue theory of measure, of integration, of differentiation, and of convergence. I must express my indebtedness to friends and colleagues who have read this book in proof or in typescript, and who have given me valuable advice and help. I will mention especially Dr. J. D. M. Wright, who read the book in proof, and Dr. A. Ingleton, who, with great patience and much friendly criticism, has given me invaluable help in correcting and improving the successive drafts of the work.
Oxford 20 January 1971
G.T.
Contents

1. MOTIVATION
1.1. Introduction
1.2. The object of the Lebesgue theory
1.3. The achievement of Lebesgue
1.4. The techniques of Lebesgue theory
1.5. Alternative theories

2. THE CONCEPT OF AN INTEGRAL
2.1. Introduction
2.2. 'Primitives'
2.3. Areas
2.4. The Lebesgue integral
2.5. Lebesgue measure
2.6. The structure of the Lebesgue theory of integration
2.7. Exercises

3. THE TECHNIQUES OF LEBESGUE THEORY
3.1. The starting-point
3.2. The method of bracketing
3.3. The Riemann integral
3.4. Monotone sequences
3.5. Infinite integrals
3.6. The Dini derivatives
3.7. Exercises

4. INDICATORS
4.1. Introduction
4.2. Boolean convergence
4.3. Open and closed sets
4.4. Covering theorems
4.5. The indicator of a function
4.6. Exercises

SETS OF ZERO MEASURE

5. DIFFERENTIATION OF MONOTONE FUNCTIONS
5.1. Introduction
5.2. Sets of points of measure zero
5.3. The Cantor set of points
5.4. Average metric density
5.5. The problem of differentiation
5.6. The 'rising sun' lemma
5.7. The differentiation of monotone functions
5.8. The differentiation of series of monotone functions
5.9. Exercises

LEBESGUE THEORY IN ONE DIMENSION

6. GEOMETRIC MEASURE OF OUTER AND INNER SETS
6.1. Introduction
6.2. Elementary sets
6.3. Bounded outer sets
6.4. Unbounded outer sets
6.5. The principle of complementarity
6.6. Inner sets
6.7. Exercises

7. LEBESGUE MEASURE
7.1. Introduction
7.2. Outer and inner measure
7.3. Lebesgue measure
7.4. Examples of measurable sets
7.5. Unbounded sets
7.6. Non-measurable sets
7.7. Criteria for measurability
7.8. Monotone sequences of sets
7.9. Exercises

8. THE LEBESGUE INTEGRAL OF BOUNDED, MEASURABLE FUNCTIONS
8.1. Introduction
8.2. Measurable functions
8.3. Measure functions
8.4. Simple functions
8.5. Lebesgue bracketing functions
8.6. The Lebesgue-Young integral
8.7. The Lebesgue integral as a positive, linear continuous functional
8.8. The differentiability of the indefinite Lebesgue integral
8.9. Exercises

9. LEBESGUE INTEGRAL OF SUMMABLE FUNCTIONS
9.1. Introduction
9.2. Summable functions
9.3. The Lebesgue integral of summable functions as a positive linear 'continuous' functional
9.4. The Lebesgue integral as a primitive
9.5. Exercises

LEBESGUE THEORY IN d DIMENSIONS

10. MULTIPLE INTEGRALS
10.1. Introduction
10.2. Elementary sets in d dimensions
10.3. Lebesgue theory in d dimensions
10.4. Fubini's theorem stated
10.5. Fubini's theorem for indicators
10.6. Fubini's theorem for summable functions
10.7. Tonelli's theorem
10.8. Product sets
10.9. The geometric definition of the Lebesgue integral
10.10. Fubini's theorem in d dimensions
10.11. Exercises

11. THE LEBESGUE-STIELTJES INTEGRAL
11.1. Introduction
11.2. The weighted measure
11.3. The Lebesgue representation of a Stieltjes integral
11.4. The Lebesgue-Stieltjes integral in one dimension
11.5. The Lebesgue-Stieltjes integral in two dimensions
11.6. Exercises

12. EPILOGUE
12.1. The generality of the Lebesgue integral
12.2. The descriptive definition of the Lebesgue integral
12.3. Measure functions
12.4. The Young integral
12.5. References

INDEX
1
Motivation
1.1. Introduction
The study of any branch of mathematics is essentially a guided research, and, before he commits himself to such a project, the prudent student will require satisfactory answers to three or four questions:
(i) What is the object of the investigation?
(ii) What measure of success will be attained?
(iii) What methods of investigation will be needed? and
(iv) What other lines of investigation are available?
We proceed to answer these questions so far as they relate to the study of the Lebesgue theory of integration and differentiation.
1.2. The object of the Lebesgue theory
The object of the Lebesgue theory is stated clearly by Henri Lebesgue in the paper published in 1902 which is his thesis, certainly the most famous and influential doctoral thesis ever written. There he states his purpose to give the most precise and general definitions of three mathematical concepts: the integral of a function, the length of a curve, and the area of a curved surface. But why seek for the most general definition of an integral? Why not be content with the integral as defined in elementary treatises on the calculus? There are two reasons for a divine discontent.
(i) The greater generality of a definition is obtained by greater abstraction and therefore with greater simplicity.
(ii) The familiar world of the well-behaved 'tame' functions of elementary calculus is not 'closed', and most limiting processes take us out of these comfortable surroundings into a strange
world of 'wild' functions where the elementary concepts of integration are no longer valid. In view of the notorious difficulty of the Lebesgue theory the first reason may appear an idle paradox. But for the purpose of comprehending the real nature of an integral we know far too much about the properties of particular functions such as $x^n$, sin x and cos x, exp x and log x, and the multitude of known facts is an embarrassment. As soon as we begin to generalize and abstract we are no longer concerned with these trivial details and we can concentrate on the essential features of the problem, for example, is the function to be integrated monotone or continuous? The general definition of an integral given by Lebesgue is in fact essentially simple because it depends only on the most general properties of the function to be integrated. (In fairness to the student it must, however, be admitted that simplicity in mathematics, like simplicity of character, is an ideal to be achieved only by unremitting toil.) The construction of 'wild' functions from 'tame' functions by means of limiting processes is now an integral part of analysis with its own recognizable techniques such as the 'principle of the condensation of singularities'. Thus, from the function
\[ f(x) = \begin{cases} x \cos \ln|x| & (x \neq 0) \\ 0 & (x = 0), \end{cases} \]
which has no derivative at the origin ($x = 0$), we can construct a function $\varphi(x)$ built on the points $z_1, z_2, \ldots$, the rational numbers between 0 and 1. The function $\varphi(x)$ is continuous but has no derivative at any of the infinite number of points $x = z_n$, $n = 1, 2, \ldots$. The contemplation of such wild or pathological functions was repugnant to many classical analysts, such as Poincaré and Hermite (Saks 1937, p. iv) but their construction has two definite advantages. At the beginning of our studies it demonstrates the inadequacy of elementary analysis, and at the end
of our studies it can show that the results obtained are the 'best possible'.
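The statement in this section that x cos ln|x| has no derivative at the origin can be illustrated numerically. The difference quotient (f(h) - f(0))/h reduces to cos ln|h|, which keeps oscillating between -1 and 1 as h tends to 0. The following is a minimal sketch only; the function names and the two sequences of h are choices made for this illustration and do not come from the text.

```python
import math


def f(x: float) -> float:
    """The 'tame' building block: x * cos(ln|x|) for x != 0, and 0 at x = 0."""
    if x == 0.0:
        return 0.0
    return x * math.cos(math.log(abs(x)))


def difference_quotient(h: float) -> float:
    """(f(h) - f(0)) / h, which equals cos(ln|h|) for h != 0."""
    return (f(h) - f(0.0)) / h


def sample_quotients(k_max: int = 5):
    """Evaluate the quotient along two sequences of h that both tend to 0.

    Illustrative choice: h = exp(-2*pi*k) makes cos(ln h) = 1, while
    h = exp(-(2k+1)*pi) makes cos(ln h) = -1.
    """
    peaks = [difference_quotient(math.exp(-2 * math.pi * k))
             for k in range(1, k_max + 1)]
    troughs = [difference_quotient(math.exp(-(2 * k + 1) * math.pi))
               for k in range(1, k_max + 1)]
    return peaks, troughs


if __name__ == "__main__":
    peaks, troughs = sample_quotients()
    print("h = exp(-2*pi*k):    ", [round(q, 6) for q in peaks])
    print("h = exp(-(2k+1)*pi): ", [round(q, 6) for q in troughs])
```

Since the quotient takes values near 1 and near -1 along sequences that both shrink to zero, it has no limit, which is the sense in which the derivative fails to exist at x = 0.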
1.3. The achievement of Lebesgue
In this book we are concerned mainly with Lebesgue's attempt to provide the most general definition of the integral of a function of one or more variables, $x$, or $x_1, x_2, \ldots, x_n$. First let us consider the problem of integration of bounded functions over a bounded interval. In this case the method of Lebesgue yields the most general possible definition, i.e. it applies to the widest possible class of functions. These are the functions described by Lebesgue as 'measurable'. The scope of Lebesgue's definition of the integral
\[ \int_a^b f(x)\,dx \]
is established by the proof that if $f(x)$ is bounded and measurable, then the indefinite Lebesgue integral
\[ \varphi(x) = \int_a^x f(t)\,dt \]
possesses 'almost everywhere' a derivative $\varphi'(x)$ equal to $f(x)$. Without anticipating the precise definition of the phrase 'almost everywhere', it is sufficient to state that it allows $\varphi(x)$ to have no derivative at a set of points that may be enumerable, as in the case of the wild function quoted above, or even nonenumerable. Thus the object of Lebesgue's theory is completely attained so far as bounded functions over a bounded interval are concerned. In the case of unbounded functions or unbounded intervals the success of the Lebesgue theory is only partial. In fact the Lebesgue theory now applies directly only to the limited class of functions described as 'summable', and does not apply to the class of integrals which are not 'absolutely convergent'. Thus the integral
\[ \int_0^\infty \left| \frac{\sin x}{x} \right| dx \]
is infinite, and in consequence, the integral
\[ \int_0^\infty \frac{\sin x}{x}\,dx \]
is not directly integrable by the methods of Lebesgue, although it may be defined by the limiting process
\[ \lim_{a \to \infty} \int_0^a \frac{\sin x}{x}\,dx, \]
which is not, strictly speaking, part of the Lebesgue theory. Finally, we must emphasize that the Lebesgue theory applies to functions of several variables and to their integrals, not only over 'domains' but also over sets of points that belong to the class described as 'measurable'.
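The contrast drawn in this section between the two integrals of sin x / x can be examined numerically. The sketch below is only an illustration under stated assumptions: it uses a plain composite midpoint rule, and the function names and step size are choices made here, not part of the text. The comparison value pi/2 for the limit of the signed integral is the classical value of the Dirichlet integral.

```python
import math


def midpoint_integral(g, a: float, b: float, n: int) -> float:
    """Composite midpoint rule; it never evaluates g at the endpoints."""
    h = (b - a) / n
    return h * math.fsum(g(a + (k + 0.5) * h) for k in range(n))


def sinc(x: float) -> float:
    """sin x / x for x != 0 (the midpoint rule never asks for x = 0)."""
    return math.sin(x) / x


def partial_integrals(upper: float, steps_per_unit: int = 100):
    """Return (integral of sin x / x, integral of |sin x / x|) over [0, upper]."""
    n = int(upper * steps_per_unit)
    signed = midpoint_integral(sinc, 0.0, upper, n)
    absolute = midpoint_integral(lambda x: abs(sinc(x)), 0.0, upper, n)
    return signed, absolute


if __name__ == "__main__":
    for upper in (50.0, 100.0, 200.0, 400.0):
        signed, absolute = partial_integrals(upper)
        print(f"a = {upper:6.0f}   signed = {signed:.5f}   absolute = {absolute:.5f}")
    print(f"pi / 2 = {math.pi / 2:.5f}")
```

As the upper limit a grows, the signed partial integrals settle near pi/2, while the integrals of the absolute value keep increasing, roughly like a multiple of log a. This is the numerical face of the statement that the integrand is not summable although the limiting process converges.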
1.4. The techniques of Lebesgue theory
The achievements of the Lebesgue theory are finely described in all the standard works on the subject, but it is not always made clear that this success of the theory is attained by the skilful use of two simple techniques, which may be described as the 'method of bracketing' and the 'method of monotony'. These methods will be expounded in detail later but they may be summarily described as follows: The method of monotony consists in the reduction of problems of convergence to the study of monotone sequences, which are familiar and simple instruments of calculation. The method of bracketing consists in using the integrals of certain tame functions to integrate certain wild functions by 'bracketing' a wild function between a pair of tame functions. Thus two integrable tame functions $\lambda(x)$ and $\mu(x)$ may be said to bracket a wild function $f(x)$ with tolerance $\epsilon > 0$, if
\[ \lambda(x) \le f(x) \le \mu(x) \]
and if
\[ \int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx < \epsilon. \]
The integrals of $\lambda(x)$ and $\mu(x)$ may then be regarded as approximations to the (as yet) unknown and undefined integral of $f(x)$.
If there exists such a pair of integrable bracketing functions for any prescribed tolerance $\epsilon > 0$, we have at our disposal the method to define and to evaluate the integral of $f(x)$ to any desired degree of approximation. These two techniques are all that is necessary to construct the whole of the Lebesgue theory of integration.
1.5. Alternative theories
The account of the Lebesgue integral given in the following chapters follows the mature thought of Lebesgue as expounded in the second edition (1928) of his book modestly entitled Leçons sur l'intégration. In this account the theory of measure is developed as a preliminary to the theory of integration, but there are alternative methods of which the student should be informed. Broadly speaking, these alternative methods fall into two classes: in one class integration is reduced to measure theory, and, in the other, measure theory is subsumed into the theory of integration. Lebesgue himself adopted the first method in the first edition (1904) of his Leçons and this method is followed by Burkill (1953) in his Cambridge Tract. In this method geometry reigns supreme and the integral $\int_a^b f(x)\,dx$ of a non-negative function $f(x)$ is defined as the two-dimensional measure of the set of points $(x, y)$ such that $a \le x \le b$, $0 \le y \le f(x)$. In the second method the integral is defined directly and the measure of a set of points $E$ is then defined as the integral of their 'indicator' $\alpha(x, E)$, i.e. the function that is equal to unity if $x \in E$ or to zero if $x \notin E$. There are various techniques for a direct definition of the Lebesgue integral and we briefly refer to three of these:
(i) the method of monotone sequences invented by W. H. Young (1910) and expounded by L. C. Young (1927);
(ii) the modification of the Darboux-Riemann method described by Saks (1937, p. 3) and employed by Williamson (1962, p. 39);
(iii) the use of sequences of step functions $\{\varphi_n(x)\}$, with some generalized type of convergence. This is the method described by Riesz and Nagy (1953) and employed by Ingleton (1965).
Each of these methods has its own advantages, and we may apply to them the words of Kipling:
There are nine and sixty ways of constructing tribal lays,
And-every-single-one-of-them-is-right.
2
The concept of an integral
2.1. Introduction
In the classical treatises on the various branches of mathematics the fundamental definitions and axioms are enunciated at the very beginning and followed by a systematic explanation of their logical consequences. But an introductory account of the Lebesgue theory cannot exhibit this classical perfection, for the fundamental definition of the Lebesgue integral is not a datum to be unquestionably accepted but a quaesitum that has to be achieved. The fact that we know the name of the entity, 'the Lebesgue integral', that we have to discover must not blind us to the fact that, at the beginning of our search, we do not know anything more about it, except that we hope it will prove to be a generalization of the integral that we have met in elementary calculus. In this puzzling and paradoxical situation how can we plan a systematic investigation? The answer is provided by the distinction between 'constructive' and 'descriptive' definitions. Our task is to develop a constructive definition of the Lebesgue integral that will guarantee its real existence in the world of mathematics. The constructive definition will be the end of our search. But at the very beginning we can give a descriptive definition of the Lebesgue integral by enumerating some of the properties that it must possess. These properties will be the most general and fundamental properties of the integrals that we have encountered in elementary analysis. We shall find that these properties, together with the two techniques of bracketing and of monotony, almost inevitably decide the path that leads to a constructive definition of the Lebesgue integral. We therefore begin by disengaging the general concept of an integral from the material provided by the elements of the differential and integral calculus. In fact, elementary calculus
does provide two definitions of an integral, viz. as a 'primitive' and as an area. By examining the concept of a primitive we shall obtain a descriptive definition of an integral and by examining the concept of area we shall see, in general terms, how a constructive definition can be achieved.
2.2. 'Primitives'
A primitive is a correlative of a derivative, i.e. if two real functions, $\varphi(x)$ and $f(x)$, of a real variable $x$, defined and bounded in the interval $(a, b) = \{x : a < x < b\}$, are so related that
\[ \frac{\varphi(x+h) - \varphi(x)}{h} \to f(x) \quad \text{as } h \to 0 \]
for all values of $x$ and $x+h$ in the given interval, then $f(x)$ is the derivative of $\varphi(x)$ and $\varphi(x)$ is a primitive of $f(x)$. Any other primitive of $f(x)$ differs from $\varphi(x)$ only by an additive constant $c$, and has the form $\varphi(x) + c$. If the primitive function $\varphi(x)$ is prescribed, and if it has been chosen from the severely restricted class of functions that do possess derivatives, then the definition given above does prescribe a definite limiting process for calculating $f(x)$ (in principle) to any prescribed degree of accuracy. The definition is therefore 'constructive'. But if it is the derivative $f(x)$ that is prescribed, then the definition is purely 'descriptive' and provides no determinate means of calculating the primitive $\varphi(x)$, which in fact is usually found (when it exists!) by ingenious artifices and patient experimentation, aided by a well-stocked memory of lists of elementary functions and their derivatives. However, the familiar functions
\[ f(x) = \exp(-x^2) \quad \text{or} \quad f(x) = x^{-1} \sin x \]
provide examples of integrands whose primitives cannot be expressed by any finite combination of elementary functions. The relation of a primitive to a derivative is essentially a 'local' property, i.e. the numerical value of the derivative $f(x)$ at a specified point $x = \xi$ depends only on the values of the primitive $\varphi(x)$ in an arbitrarily small neighbourhood $\xi - \epsilon < x < \xi + \epsilon$ of
the point $\xi$. In more technical language, differentiability at a point $x = \xi$ is a local property of a function $f(x)$ because it is a property of the 'restriction' of $f(x)$ to any neighbourhood of $x = \xi$, i.e. of any function $f(x, \epsilon)$ such that $f(x, \epsilon) = f(x)$ when $\xi - \epsilon < x < \xi + \epsilon$, for some $\epsilon > 0$. The fundamental global properties of the differential relation are the following, which we designate by (N), (L), and (P).
(N) The integral of the unit function, $f(x) = 1$, over the interval $[a, b]$ is $b - a$, i.e. the function $\varphi(x) = x - a$ is a primitive of the unit function $f(x) = 1$. This property is often described as the Lebesgue normalizing condition.
(L) If $\varphi_1(x)$ and $\varphi_2(x)$ are respectively primitives of $f_1(x)$ and $f_2(x)$ in the same interval $[a, b]$ then $c_1 \varphi_1(x) + c_2 \varphi_2(x)$ is a primitive of $c_1 f_1(x) + c_2 f_2(x)$ in the same interval, for all real numbers $c_1$ and $c_2$.
(P) If $\varphi(x)$ is a primitive of $f(x)$ in an interval $[a, b]$, and if $f(x)$ is non-negative in this interval, then
\[ \varphi(x) \ge \varphi(a). \]
The outstanding question that remains for investigation is the continuity of the differential relation. We can in fact prove that if the sequence of derivatives $\{\varphi_n'(x)\}$ $(n = 1, 2, \ldots)$ converges uniformly in a closed interval $[a, b]$ to a derivative $\varphi'(x)$ as $n$ tends to infinity, then the corresponding sequence of primitives $\{\varphi_n(x) - \varphi_n(a)\}$ converges to $\varphi(x) - \varphi(a)$. This is a very restricted species of continuity for it requires that (i) the limit of the sequence $\{\varphi_n'(x)\}$ should be itself a derivative $\varphi'(x)$, and that (ii) the convergence of $\varphi_n'(x)$ to $\varphi'(x)$ should be uniform. At the present stage of our investigation we cannot foresee how much these conditions can be relaxed in a descriptive definition of an integral. We shall in fact establish three different (but related) conditions, which are each sufficient to ensure that
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx, \]
where $\{f_n(x)\}$ is a sequence of integrable functions with limit function $f(x)$, viz.
(i) the condition of 'bounded convergence', i.e. $|f_n(x)| < K$ for each $x$ in $[a, b]$ and for each $n$, $K$ being a constant independent of $x$ and $n$ (Theorem 8.7.7);
(ii) the condition of 'monotone convergence', i.e. $0 \le f_n(x) \le f_{n+1}(x)$ for each $x$ in $[-\infty, \infty]$ and for each $n$ (Theorem 9.4.2);
(iii) the condition of 'dominated convergence', i.e. $|f_n(x)| < \varphi(x)$ for each $x$ in $[-\infty, \infty]$ and for each $n$, $\varphi(x)$ being integrable over $[-\infty, \infty]$ (Theorem 9.3.7).
(The theorems quoted are even more general, for they refer to integrals over measurable sets of points rather than over intervals.)
We are now in a position to give a descriptive definition of an integral. The definite integral of $f(x)$ over an interval $[a, b]$ must be (when it exists!) a real number that depends upon the values assumed by $f(x)$ in this interval. It is therefore a 'functional'. Our investigation of the properties of a primitive suggests that the definite integral of $f(x)$ over an interval $[a, b]$ should be a positive, linear, functional satisfying the Lebesgue normalizing condition. Since the properties (P), (L), and (N) are possessed by the primitives of derivatives, these properties are consistent with one another and can be taken as a descriptive definition of an integral.
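A small numerical experiment may help to show why some condition of the kind listed above is needed before the limit and the integral can be interchanged. The sketch below is an illustration only; the two sequences are standard textbook examples chosen here and do not appear in the text, and the function names are hypothetical. The 'spike' functions converge to zero at every point of [0, 1] but are not bounded by any constant K independent of n, and their integrals stay equal to 1. The powers x**n are bounded by K = 2 and their integrals, equal to 1/(n + 1), do tend to the integral of the limit function.

```python
def midpoint_integral(g, n_points: int = 100_000) -> float:
    """Composite midpoint rule on [0, 1]."""
    h = 1.0 / n_points
    return h * sum(g((k + 0.5) * h) for k in range(n_points))


def spike(n: int):
    """f_n(x) = n on (0, 1/n) and 0 elsewhere.

    For each fixed x the values f_n(x) are eventually 0, but sup f_n = n,
    so no single bound K works for every n.
    """
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0


def power(n: int):
    """g_n(x) = x**n on [0, 1], so |g_n(x)| < K with K = 2 for every n and x."""
    return lambda x: x ** n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"n = {n:5d}   integral of spike = {midpoint_integral(spike(n)):.6f}"
              f"   integral of x**n = {midpoint_integral(power(n)):.6f}")
```

The first column of integrals remains at 1 although the limit function is identically zero, whereas the second column decreases towards 0, in agreement with the condition of bounded convergence.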
2.3. Areas
In elementary analysis a constructive definition of the concept of 'area' is obtained by the method of 'exhaustion' invented by the Greek mathematicians Eudoxus and Archimedes. The method is most simply described by considering a bounded, positive, non-decreasing function f(x) defined in an
interval $[a, b]$, and the region $R$ in the $(x, y)$-plane specified by the relations
\[ a \le x \le b, \quad 0 \le y \le f(x). \]
The function $f(x)$ is not assumed to be continuous. (Naturally the same method is applicable to bounded, positive, non-increasing functions.) We divide the interval $[a, b]$ by a finite number of points
\[ a = x_0 < x_1 < x_2 < \cdots < x_n = b, \]
and define two step functions,
\[ \lambda(x) = f(x_p), \quad \mu(x) = f(x_{p+1}) \quad \text{if } x_p \le x < x_{p+1} \text{ and } p = 0, 1, 2, \ldots, n-1. \]
Then, since $f(x)$ is non-decreasing,
\[ \lambda(x) \le f(x) \le \mu(x). \]
A glance at Fig. 1 will show that the regions of the $(x, y)$-plane specified by the relations
\[ a \le x \le b, \quad 0 \le y \le \lambda(x) \]
and
\[ a \le x \le b, \quad 0 \le y \le \mu(x) \]
are bounded by two polygons, one inscribed and the other escribed to the region 'under the curve', i.e. the region
\[ a \le x \le b, \quad 0 \le y \le f(x). \]
[FIG. 1. The curve y = f(x), with labelled points A and B and abscissae a and b.]
By definition the integrals
\[ \int_a^b \lambda(x)\,dx \quad \text{and} \quad \int_a^b \mu(x)\,dx \]
are the areas of the inscribed and escribed polygons respectively, i.e.
\[ \int_a^b \lambda(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(x_p), \]
\[ \int_a^b \mu(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(x_{p+1}). \]
Hence
\[ \int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx = \sum_{p=0}^{n-1} (x_{p+1} - x_p)\{f(x_{p+1}) - f(x_p)\}, \]
and, if the maximum length of the intervals $(x_p, x_{p+1})$ for $p = 0, 1, 2, \ldots, n-1$, is $\epsilon$, then
\[ 0 \le x_{p+1} - x_p \le \epsilon \]
and
\[ \int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx \le \epsilon \sum_{p=0}^{n-1} \{f(x_{p+1}) - f(x_p)\} = \epsilon\{f(b) - f(a)\}. \]
Thus, in the terminology of § 1.4, the function $f(x)$ is bracketed by the step functions $\lambda(x)$ and $\mu(x)$ with a tolerance $\epsilon\{f(b) - f(a)\}$, which can be made arbitrarily small by sufficiently increasing the number of sub-intervals into which the interval $[a, b]$ is divided. Now
\[ 0 \le \int_a^b \lambda(x)\,dx \le \int_a^b \mu(x)\,dx \le \sum_{p=0}^{n-1} (x_{p+1} - x_p) f(b) = (b - a) f(b). \]
Hence, if we consider all possible divisions of the interval $[a, b]$, the collection of integrals
\[ \Lambda = \int_a^b \lambda(x)\,dx \]
has an upper bound, while the collection of integrals
\[ M = \int_a^b \mu(x)\,dx \]
has a lower bound. Therefore the integrals $\Lambda$ have a supremum or least upper bound, $\sup \Lambda$, while the integrals $M$ have an infimum or greatest lower bound, $\inf M$. Also
\[ 0 \le \inf M - \sup \Lambda \le \epsilon\{f(b) - f(a)\} \]
for all $\epsilon > 0$. Therefore
\[ \sup \Lambda = \inf M = A, \text{ say.} \]
Hence there is a unique number $A$ such that, for any division of the interval $[a, b]$,
\[ \Lambda \le A \le M, \]
and this number is therefore defined to be the integral $\int_a^b f(x)\,dx$. This we may call the 'Archimedean' integral. Clearly
\[ (b - a) f(a) \le \int_a^b f(x)\,dx \le (b - a) f(b), \]
so that the Archimedean integral of a monotone function satisfies the mean value theorem, and hence is a positive functional.
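The construction of this section can be tried out on a concrete integrand. The following sketch is offered under stated assumptions: the division of [a, b] uses n equal sub-intervals, so that the maximum length is (b - a)/n, and the names bracket_sums and step_example, as well as the particular discontinuous integrand, are hypothetical choices made for this illustration.

```python
import math


def bracket_sums(f, a: float, b: float, n: int):
    """Integrals of the lower and upper step functions for a non-decreasing f.

    The division uses n equal sub-intervals (an assumption of this sketch),
    so the maximum sub-interval length is (b - a) / n.
    """
    xs = [a + (b - a) * p / n for p in range(n + 1)]
    # Lower step function takes the value f(x_p) on [x_p, x_{p+1}).
    lower = math.fsum((xs[p + 1] - xs[p]) * f(xs[p]) for p in range(n))
    # Upper step function takes the value f(x_{p+1}) on [x_p, x_{p+1}).
    upper = math.fsum((xs[p + 1] - xs[p]) * f(xs[p + 1]) for p in range(n))
    return lower, upper


def step_example(x: float) -> float:
    """A positive, non-decreasing integrand with a jump at x = 0.5."""
    return x * x + (1.0 if x >= 0.5 else 0.0) + 1.0


if __name__ == "__main__":
    a, b = 0.0, 1.0
    for n in (10, 100, 1000):
        lower, upper = bracket_sums(step_example, a, b, n)
        tolerance = ((b - a) / n) * (step_example(b) - step_example(a))
        print(f"n = {n:5d}   lower = {lower:.6f}   upper = {upper:.6f}"
              f"   gap = {upper - lower:.6f}   tolerance = {tolerance:.6f}")
```

For this integrand the area under the curve over [0, 1] is 1/3 + 1 + 1/2 = 11/6, and the printed lower and upper sums close in on that value from either side as n grows, with a gap no larger than the tolerance derived in the text.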
2.4. The Lebesgue integral
The success of the method of exhaustion applied to the Archimedean integral clearly depends upon the closeness with which the bracketing functions $\lambda(x)$ and $\mu(x)$ approximate to the integrand $f(x)$. In fact since
\[ \lambda(x) = f(x_p) \le f(x) \le f(x_{p+1}) = \mu(x) \]
in the interval $x_p \le x < x_{p+1}$, it follows that
\[ \mu(x) - \lambda(x) = f(x_{p+1}) - f(x_p), \]
i.e. the variation of $f(x)$ in this interval. If $f(x)$ is non-decreasing and continuous in $[a, b]$ then, for any prescribed tolerance $\epsilon > 0$ we can choose a division of the interval such that
\[ f(x_{p+1}) - f(x_p) < \epsilon \quad \text{for } p = 0, 1, 2, \ldots, n-1, \]
whence
f-L(X)-f(x)
and
f(x)-A.(x)
< <
e e
at all points x in [a, b]. Again, if f(x) is continuous, but not necessarily monotone, in [a, bJwe can define as bracketing functions the representations A.(x) = infj(x) } . . m each mterval xP ~ x f-L(X) = sup f (x)
<
xp+l·
Then, since $f(x)$ is uniformly continuous, for any arbitrary tolerance $\epsilon > 0$ we can choose a division of the interval $[a, b]$ such that $\sup f(x) - \inf f(x) < \epsilon$ in each sub-interval $x_p \le x < x_{p+1}$, $p = 0, 1, 2, \ldots, n-1$. If, however, $f(x)$ is neither monotone nor continuous, we are faced with the embarrassing possibility that it may be wildly oscillatory in value and may have an infinity of maxima and minima in each sub-interval of $[a, b]$. The existence of such functions is assured by the example
\[ f(x) = \sum_{n=1}^{\infty} \frac{1}{2^n} \sin\frac{1}{x - z_n} \]
if $x$ is irrational, or $f(x) = 0$ if $x$ is rational, where $z_1, z_2, \ldots$ is some enumeration of the rational numbers between 0 and 1. In these circumstances bracketing functions based on a division of the domain $[a, b]$ of $f(x)$ would be useless, and we proceed to consider a new species of bracketing function based on a division of the range $[A, B]$ of $f(x)$; this is one of the major contributions of Lebesgue to the theory of integration. We divide the range $[A, B]$ of the function $f(x)$ by a finite number of points
\[ A = t_0 < t_1 < t_2 < \ldots < t_n = B \]
and introduce the function
\[ \psi_p(x) = \begin{cases} 1 & (t_p < f(x) \le t_{p+1}), \\ 0 & (f(x) \le t_p \text{ or } t_{p+1} < f(x)), \end{cases} \]
so that
\[ \sum_{p=0}^{n-1} \psi_p(x) = 1 \quad \text{if } a \le x \le b. \]
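We define two bracketing functions,
\[ \lambda(x) = \sum_{p=0}^{n-1} t_p\,\psi_p(x), \qquad \mu(x) = \sum_{p=0}^{n-1} t_{p+1}\,\psi_p(x). \]
Then, if $\epsilon$ is the maximum length of the intervals $(t_p, t_{p+1})$ for $p = 0, 1, 2, \ldots, n-1$,
\[ 0 \le t_{p+1} - t_p \le \epsilon, \]
and
\[ \mu(x) - \lambda(x) = \sum_{p=0}^{n-1} (t_{p+1} - t_p)\,\psi_p(x) \le \epsilon \sum_{p=0}^{n-1} \psi_p(x) = \epsilon. \]
Now
\[ \lambda(x) \le f(x) \le \mu(x). \]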
Hence the arbitrary function $f(x)$ is uniformly approximated by the bracketing functions $\lambda(x)$ and $\mu(x)$ over the whole interval $[a, b]$. Moreover, if we possessed a definition of integration applicable to the bracketing functions we could conclude at once that
\[ \int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx = \int_a^b \{\mu(x) - \lambda(x)\}\,dx \le \epsilon(b-a). \]
We could then define the integral of $f(x)$ over the interval $[a, b]$ as
\[ \int_a^b f(x)\,dx = \inf \int_a^b \mu(x)\,dx = \sup \int_a^b \lambda(x)\,dx \]
for all bracketing functions of the type defined above. This is the Lebesgue integral! It thus appears that the problem of integrating a bounded function $f(x)$ over a bounded interval $[a, b]$ would be solved if we could integrate the bracketing functions $\lambda(x)$ and $\mu(x)$, and that these bracketing functions would be integrable if we could integrate the functions $\psi_p(x)$. This reconnaissance of the problem of integration, due to Lebesgue (1928, chap. vii), shows that the essence of the matter is the integration of functions, such as $\psi_p(x)$, which can take only two values, viz. 0 and 1. We therefore proceed to make a preliminary examination of the geometrical significance of the integrals $\int_a^b \psi_p(x)\,dx$.
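The division of the range can be illustrated pointwise. The sketch below is an illustration under stated assumptions, not part of the original exposition: the integrand sin 5x on [0, 2], the range [A, B] = [-1.5, 1.5] (chosen slightly wider than the true range so that every value falls in some interval (t_p, t_{p+1}]) and the helper name `lebesgue_brackets` are all hypothetical choices.

```python
import math
from bisect import bisect_left


def lebesgue_brackets(f, x, ts):
    """Return (lambda(x), mu(x)) for the range division ts = [t_0, ..., t_n].

    Exactly one psi_p(x) equals 1, namely the p with t_p < f(x) <= t_{p+1},
    so lambda(x) = t_p and mu(x) = t_{p+1}.
    """
    v = f(x)
    i = bisect_left(ts, v)          # first index with ts[i] >= v
    if i == 0 or i == len(ts):
        raise ValueError("f(x) lies outside (t_0, t_n]")
    return ts[i - 1], ts[i]


def f(x):
    return math.sin(5.0 * x)        # oscillatory sample integrand (assumed)


A, B, n = -1.5, 1.5, 30             # [A, B] chosen to contain the range of f
ts = [A + (B - A) * p / n for p in range(n + 1)]
eps = max(ts[p + 1] - ts[p] for p in range(n))

samples = [2.0 * k / 500 for k in range(501)]   # points of [a, b] = [0, 2]
brackets = [lebesgue_brackets(f, x, ts) for x in samples]
```

However wildly f oscillates, the two bracketing values at each sampled point differ by at most `eps`, because the division is made in the range and not in the domain.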
2.5. Lebesgue measure
(i) In the simplest case, when $f(x)$ is a non-decreasing function of $x$ in $[a, b]$, then to any number $t_p$ in the range $[A, B]$ of $f(x)$ there corresponds a unique number $x_p$ such that $a \le x_p \le b$ and
\[ f(x) \le t_p \quad (x \le x_p), \qquad f(x) > t_p \quad (x > x_p). \]
Then
\[ t_p < f(x) \le t_{p+1} \quad \text{if } x_p < x \le x_{p+1}. \]
Hence
\[ \int_a^b \psi_p(x)\,dx = \int_{x_p}^{x_{p+1}} 1 \cdot dx = x_{p+1} - x_p. \]
Therefore the Lebesgue bracketing functions
\[ \sum_{p=0}^{n-1} t_p\,\psi_p(x) \quad \text{and} \quad \sum_{p=0}^{n-1} t_{p+1}\,\psi_p(x) \]
have exactly the same integrals
\[ \sum_{p=0}^{n-1} f(x_p)(x_{p+1} - x_p) \quad \text{and} \quad \sum_{p=0}^{n-1} f(x_{p+1})(x_{p+1} - x_p) \]
as the step functions that we introduced as bracketing functions in § 2.3.
THEOREM 2.5.1. If $f(x)$ is a non-decreasing function, it possesses a Lebesgue integral, which is exactly the same as the 'Archimedean' integral of § 2.3.
(ii) To discuss the integration of $\psi_p(x)$ when $f(x)$ is continuous but not monotone in $[a, b]$ it is more convenient to introduce the function
\[ \alpha(x, t) = \begin{cases} 1 & (t < f(x)), \\ 0 & (f(x) \le t). \end{cases} \]
Then
\[ \psi_p(x) = \alpha(x, t_p) - \alpha(x, t_{p+1}). \]
The advantage of this transformation is that it is easily proved that the set of points at which $\alpha(x, t_p) = 1$ is 'open', i.e. it consists of an enumerable collection of open intervals $(a_k, b_k)$ $(k = 1, 2, \ldots)$. Thus
\[ \int_{a_k}^{b_k} \alpha(x, t_p)\,dx = b_k - a_k, \]
and $\int_a^b \alpha(x, t_p)\,dx$ may be suitably defined as $\sum_{k=1}^{\infty} (b_k - a_k)$, since the series of positive terms has an upper bound $(b-a)$ and is therefore convergent. (This, of course, requires proof, which we shall supply below.) We may cite, as an example, the function
\[ f(x) = \begin{cases} \ldots & (x \ne 0), \\ \ldots & (x = 0), \end{cases} \]
in the domain $(0, 1)$ and the auxiliary function
\[ \alpha(x, 0) = \begin{cases} 1 & (0 < f(x)), \\ 0 & (f(x) \le 0). \end{cases} \]
Then
\[ \int_0^1 \alpha(x, 0)\,dx = \tfrac{1}{2} - \tfrac{1}{3} + \tfrac{1}{4} - \tfrac{1}{5} + \cdots = 1 - \ln 2. \]
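The value 1 - ln 2 can be checked by summing interval lengths. The sketch below assumes, purely for illustration, a function that is positive in (0, 1) exactly on the open intervals (1/(2k+1), 1/(2k)), k = 1, 2, ..., as sin(π/x) is; that particular choice of f is an assumption, and the name `measure_of_positive_set` is hypothetical.

```python
import math


def measure_of_positive_set(terms):
    """Sum the lengths of the first `terms` open intervals
    (1/(2k+1), 1/(2k)), k = 1, 2, ..., on which sin(pi/x) > 0 in (0, 1)."""
    return sum(1.0 / (2 * k) - 1.0 / (2 * k + 1) for k in range(1, terms + 1))


partial = measure_of_positive_set(100_000)
target = 1.0 - math.log(2.0)
```

The partial sums increase towards `target`, in agreement with the remark that a series of positive terms bounded above by b - a must converge.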
(iii) In the general case, when $f(x)$ is neither monotone nor continuous, further research is necessary to determine under what conditions the function $\alpha(x, t)$ can be integrated. The answer is provided by the Lebesgue theory of measure that we shall develop in Chapter 7. Assuming that the given function $f(x)$ is such that the auxiliary function $\alpha(x, t)$ is integrable with respect to $x$ for each value of $t$, let
\[ m(t) = \int_a^b \alpha(x, t)\,dx. \]
This function $m(t)$ will be called the 'measure function' of $f(x)$ and will be proved to be a non-increasing function of $t$. The Lebesgue bracketing functions will then have the integrals
\[ \Lambda = \sum_{p=0}^{n-1} t_p\{m(t_p) - m(t_{p+1})\} \quad \text{and} \quad M = \sum_{p=0}^{n-1} t_{p+1}\{m(t_p) - m(t_{p+1})\}, \]
and as before
\[ 0 \le M - \Lambda \le \epsilon \sum_{p=0}^{n-1} \{m(t_p) - m(t_{p+1})\} = \epsilon\{m(t_0) - m(t_n)\} = \epsilon(b-a). \]
The Lebesgue integral of $f(x)$ over $[a, b]$ can then be defined as
\[ \int_a^b f(x)\,dx = \sup \Lambda = \inf M \]
for all Lebesgue bracketing functions, bracketing $f(x)$.
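The two sums built from the measure function can be evaluated directly when m(t) is known in closed form. The sketch below assumes the sample integrand f(x) = x² on [0, 1], for which the set where f(x) > t has length 1 - √t for 0 ≤ t ≤ 1; the integrand and the helper names are illustrative assumptions.

```python
import math


def m(t):
    """Measure function of f(x) = x**2 on [0, 1]: length of {x : f(x) > t}."""
    if t < 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    return 1.0 - math.sqrt(t)


def bracket_integrals(m, ts):
    """Integrals of the lower and upper Lebesgue bracketing functions."""
    n = len(ts) - 1
    lower = sum(ts[p] * (m(ts[p]) - m(ts[p + 1])) for p in range(n))
    upper = sum(ts[p + 1] * (m(ts[p]) - m(ts[p + 1])) for p in range(n))
    return lower, upper


n = 200
ts = [p / n for p in range(n + 1)]     # division of the range [A, B] = [0, 1]
eps = 1.0 / n
lower, upper = bracket_integrals(m, ts)
```

Both sums bracket the elementary value 1/3, and their difference is at most `eps` times the length of the interval of integration.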
2.6. The structure of the Lebesgue theory of integration
The preceding survey of the problem of integration, as analysed by Lebesgue, shows that a necessary preliminary is the theory of measure. To enable the essential features of the theory to be grasped more readily we shall give prior consideration to the theories of measure and integration in one dimension, but the terminology, theorems, and definitions will be so phrased that they can readily be interpreted in the d-dimensional theory. We shall show that the Lebesgue integral is a positive, linear, continuous functional and we shall investigate in what sense the indefinite Lebesgue integral
\[ \Phi(x) = \int_a^x f(t)\,dt \quad (a \le x \le b) \]
can be regarded as a primitive of $f(x)$. However, there is a chapter in the theory of Lebesgue integration that is even simpler than the one-dimensional theory. This is the chapter that discusses sets of points of zero measure. This theory of 'null sets' is not only of extreme simplicity, but it permeates the whole of Lebesgue theory. In particular, it is indispensable for the discussion of the differentiability of the indefinite integral
\[ \Phi(x) = \int_a^x f(t)\,dt. \]
We shall therefore begin our discussion of the measure theory with a chapter on sets of zero measure (Chapter 5). Before we proceed further to discuss the Lebesgue theory in one dimension it is convenient to give a rather more formal exposition of the two main instruments employed in the theory: the method of monotony and the method of bracketing. Also, in order to express the theory of measure in analytical form (rather than in the usual geometrical form) we shall intercalate a chapter on the theory and use of the indicator function $\alpha(x, t)$.
2.7. Exercises
1. If the sequence of derivatives $\{\phi_n'(x)\}$ is non-negative, monotonic in $x$, monotonic in $n$, and converges to zero as $n \to \infty$ in $[a, b]$, prove that $\phi_n(x) - \phi_n(a) \to 0$ as $n \to \infty$ (Denjoy).
2. If the function $f(x)$ possesses a derivative $f'(x)$ at each point of the interval $a \le x \le b$, $f'(a) = \alpha$, $f'(b) = \beta$, $\beta \ne \alpha$, and $\gamma$ lies between $\alpha$ and $\beta$, then there is a point $c$ between $a$ and $b$ such that $f'(c) = \gamma$ (Darboux).
3. Deduce from the properties (N), (L), (P) of the differential relation (§ 2.2) that, if $\phi(x)$ is a primitive of $f(x)$ in the interval $[a, b]$, and if $L \le f(x) \le M$ for $a \le x \le b$, then $L(b-a) \le \phi(b) - \phi(a) \le M(b-a)$. Hence deduce Rolle's theorem.
4. If $f(x)$ is a continuous non-decreasing function in $[a, b]$ and $c$ is any constant, prove that
\[ \int_a^b \{c - f(x)\}\,dx = c(b-a) - \int_a^b f(x)\,dx. \]
3
The techniques of Lebesgue theory
3.1. The starting-point
In seeking for the widest possible generalization of the concept of an integral there are two directions that we may follow: we may try to generalize the concept of a primitive or we may try to generalize the concept of area. The first method has been followed by Perron, Ward, and Henstock, the second method by Riemann, Borel, Young, and Lebesgue. In following the method of Lebesgue the starting-point is necessarily the concept of the area of a rectangle and its immediate extension to the integrals of step functions such as the bracketing functions of § 2.3. To change the metaphor, no other material is available for the construction of the Lebesgue integral than the integrals of step functions. It follows that the only techniques available for the process of construction are those already employed in constructing the functions of analysis from step functions. These are the familiar algebraical techniques of addition, multiplication, and their inverses, together with the analytical techniques of limiting processes.
3.2. The method of bracketing
The two techniques of generalization characteristic of Lebesgue theory are the method of bracketing and the method of monotonic convergence. The method of bracketing has been briefly discussed in § 1.4 as a technique for extending the concept of integration, but it is interesting to indicate the very extensive class of functionals to which it can be applied and to expose the inherent restrictions in this method. Briefly we shall show that the method of bracketing can be applied to the class of functionals $I(f)$, which are characterized by the property that, if $f(x) \le g(x)$, then $I(f) \le I(g)$, but that bracketing is necessarily a 'closure' operation that can only be applied once to extend a given class of functionals.
DEFINITION 3.2.1. A functional $I(f)$, defined for a set $F$ of functions $f(x)$ defined in an interval $(a \le x \le b)$, is said to be 'monotonic' if $I(f) \le I(g)$ whenever $f(x) \le g(x)$ for all $x$ in $[a, b]$, and $f(x)$, $g(x)$ belong to the class $F$.
DEFINITION 3.2.2. A function $\phi(x)$ is said to be 'bracketed' by two functions $\lambda(x, \epsilon)$, $\mu(x, \epsilon)$ (with tolerance $\epsilon > 0$) if $\lambda$ and $\mu$ belong to the domain $F$ of a monotonic functional $I(f)$ and if
\[ \lambda(x, \epsilon) \le \phi(x) \le \mu(x, \epsilon), \qquad I(\mu) - I(\lambda) \le \epsilon. \]
THEOREM 3.2.1. If, for each prescribed tolerance $\epsilon > 0$, the function $\phi(x)$ is bracketed by two functions $\lambda(x, \epsilon)$, $\mu(x, \epsilon)$ belonging to the domain $F$ of a functional $I(\lambda)$ then the supremum of $I(\lambda)$ and the infimum of $I(\mu)$ both exist and are equal.
Let $\lambda_0$, $\mu_0$ be any fixed pair of bracketing functions. Then
\[ \lambda_0 \le \phi \le \mu_0, \qquad \lambda \le \phi \le \mu. \]
Hence
\[ I(\lambda) \le I(\mu_0) \quad \text{and} \quad I(\lambda_0) \le I(\mu). \]
Thus the numbers $I(\lambda)$ are bounded above, and the numbers $I(\mu)$ are bounded below. Hence the numbers $I(\lambda)$ have a supremum or least upper bound, $\sup I(\lambda)$, and the numbers $I(\mu)$ have an infimum or greatest lower bound, $\inf I(\mu)$. For any prescribed tolerance $\epsilon > 0$ there exist bracketing functions $\lambda$ and $\mu$ such that
\[ I(\mu) - I(\lambda) \le \epsilon. \]
Hence
\[ 0 \le \inf I(\mu) - \sup I(\lambda) \le \epsilon \quad \text{for all } \epsilon > 0, \]
and therefore
\[ \sup I(\lambda) = \inf I(\mu). \]
DEFINITION 3.2.3. With the notation and terminology of Theorem 3.2.1, the 'bracketed' functional $I^*(\phi)$ is defined to be
\[ I^*(\phi) = \sup I(\lambda) = \inf I(\mu). \]
THEOREM 3.2.2. If $\phi(x)$ belongs to the domain $F$ of the functional $I(\phi)$, then
\[ I^*(\phi) = I(\phi). \]
For
\[ \sup I(\lambda) = I(\phi) = \inf I(\mu). \]
Hence the bracketed functional $I^*(\phi)$ of a function of the class $F$ is equal to the functional $I(\phi)$. Thus the bracketed functional is a valid and self-consistent extension of the original functional, which we may call the bracketed extension. It is important to notice at once the intrinsic limitation of the method of bracketing. Let $F$ be the domain of a monotonic functional $I(f)$ and $\Phi$ the domain of the bracketed functional $I^*(\phi)$. Is it possible to construct a further extension of the original functional by the use of functions $\psi(x)$ which are bracketed by functions from the class $\Phi$? The answer is in the negative.
THEOREM 3.2.3. If, for any prescribed tolerance $\epsilon > 0$, a function $\psi(x)$ is bracketed by two functions $\phi_1(x)$ and $\phi_2(x)$ of the set $\Phi$, then $\psi(x)$ also belongs to the set $\Phi$.
For there exist bracketing functions $\phi_1$, $\phi_2$ of set $\Phi$ such that
\[ \phi_1 \le \psi \le \phi_2 \quad \text{and} \quad I^*(\phi_2) - I^*(\phi_1) \le \epsilon. \]
Also there exist bracketing functions $\lambda_1$, $\mu_1$, $\lambda_2$, $\mu_2$ of set $F$ such that
\[ \lambda_1 \le \phi_1 \le \mu_1 \quad \text{and} \quad I^*(\mu_1) - I^*(\lambda_1) \le \epsilon, \]
\[ \lambda_2 \le \phi_2 \le \mu_2 \quad \text{and} \quad I^*(\mu_2) - I^*(\lambda_2) \le \epsilon. \]
Note also that
\[ I^*(\lambda_2) \le I^*(\phi_2) \quad \text{and} \quad I^*(\phi_1) \le I^*(\mu_1). \]
Then $\lambda_1 \le \psi \le \mu_2$ and
\[ I^*(\mu_2) - I^*(\lambda_1) = \{I^*(\mu_2) - I^*(\phi_2)\} + \{I^*(\phi_1) - I^*(\lambda_1)\} + \{I^*(\phi_2) - I^*(\phi_1)\} \]
\[ \le \{I^*(\mu_2) - I^*(\lambda_2)\} + \{I^*(\mu_1) - I^*(\lambda_1)\} + \{I^*(\phi_2) - I^*(\phi_1)\} \le 3\epsilon. \]
Thus $\psi$ is bracketed with arbitrary tolerance $3\epsilon$, by functions $\lambda_1$ and $\mu_2$ of the class $F$. Hence $\psi$ belongs to the same set $\Phi$ as the functions $\phi_1$, $\phi_2$, and we have not succeeded in making any further extension of the domain of the bracketed functional $I^*(\phi)$. The operation of bracketing is thus similar to the closure operation in point-set topology in as much as both are idempotent operations, i.e. the repetition of the operation produces no further extension of the set to which they are applied.
3.3. The Riemann integral
As we have shown in § 2.3 bracketing by step functions furnishes the integral of bounded, monotonic functions. In fact the scope of this technique is much larger and it provides the simplest definition of the Riemann integral. As before we divide the interval $[a, b]$ by a finite number of points $a = x_0 < x_1 < x_2 < \ldots < x_n = b$. Let $L_p$ and $M_p$ be the greatest lower bound and the least upper bound respectively of a function $f(x)$ in the interval $x_p \le x < x_{p+1}$, for $p = 0, 1, 2, \ldots, n-1$. Let $F_p$ be any number in the range $[L_p, M_p]$. Then the sum
\[ S = \sum_{p=0}^{n-1} F_p(x_{p+1} - x_p) \]
is a Riemann approximation to the integral of $f(x)$ in $[a, b]$. Let $\epsilon = \max(x_{p+1} - x_p)$ for $p = 0, 1, 2, \ldots, n-1$. As $\epsilon \to 0$, $n \to \infty$ and the corresponding sums $S$ may converge to a limit. If so, this limit is the Riemann integral $R\int_a^b f(x)\,dx$.
It is, however, clear that $f(x)$ is bracketed by the step functions
\[ \lambda(x) = L_p, \quad \mu(x) = M_p \quad \text{if } x_p \le x < x_{p+1}, \]
and that
\[ \lambda(x) \le f(x) \le \mu(x) \quad \text{if } a \le x < b, \]
while
\[ \Lambda = \int_a^b \lambda(x)\,dx = \sum_{p=0}^{n-1} L_p(x_{p+1} - x_p), \qquad M = \int_a^b \mu(x)\,dx = \sum_{p=0}^{n-1} M_p(x_{p+1} - x_p), \]
whence
\[ 0 \le M - \Lambda \le \sum_{p=0}^{n-1} (M_p - L_p)(x_{p+1} - x_p) \le \epsilon \sum_{p=0}^{n-1} (M_p - L_p). \]
As before, the collection of numbers $M$ has a greatest lower bound, $\inf M$, and the collection of numbers $\Lambda$ has a least upper bound, $\sup \Lambda$. These numbers are the upper and lower Darboux integrals of $f(x)$ over $[a, b]$. Now
\[ L_p \le F_p \le M_p \quad \text{and} \quad \Lambda \le S \le M. \]
Hence if the upper and lower Darboux integrals are equal, then the Riemann approximations $S$ must converge to the common value of $\sup \Lambda$ and $\inf M$. Conversely we can always choose the numbers $F_p$ so that $L_p = F_p$ and $\Lambda = S$ or so that $M_p = F_p$ and $M = S$. Thus the equality of the upper and lower Darboux integrals is a necessary and sufficient condition for the existence of the Riemann integral. We note without proof that any one of the following conditions is sufficient to ensure the existence of the Riemann integral of a bracketed function $f(x)$:
(1) $f(x)$ is continuous (and therefore uniformly continuous) in $[a, b]$,
(2) $f(x)$ has only an enumerable set of discontinuities in $[a, b]$,
(3) $f(x)$ has bounded variation in $[a, b]$,
and, of course, it is sufficient if $f(x)$ is monotone in $[a, b]$. The simplicity and naturalness of the Riemann integral leave little to be desired, and it is easy to show that it is a positive, absolute, and linear functional (see Exercise 3). But unfortunately it is not continuous, i.e. if each of the functions $f_n(x)$ in a convergent sequence is integrable by Riemann's method, it is not necessarily true that the limit function $f(x)$ is also integrable by the same method. Consider, for example, any enumeration $\{z_n\}$ $(n = 1, 2, \ldots)$ of the rational numbers between 0 and 1, and let
\[ f_n(x) = \begin{cases} 1 & (x = z_1, z_2, \ldots, z_n), \\ 0 & \text{otherwise.} \end{cases} \]
The limit function $f(x)$ has the value unity if $x$ is rational and the value zero if $x$ is irrational. For this limit function, in any subdivision of the interval $[0, 1]$,
\[ L_p = 0, \quad M_p = 1, \quad \text{and} \quad \Lambda = 0, \quad M = 1. \]
The upper and lower Darboux integrals of $f(x)$ are therefore respectively 1 and 0, whence $f(x)$ has no Riemann integral, although each $f_n(x)$ is integrable by Riemann's method. To be just to the Riemann integral, which has played so great a part in nineteenth-century analysis, it must be stated that it does possess a certain restricted continuity, in the sense that if the functions $\{f_n(x)\}$ are each integrable by Riemann's method, if the sequence $\{f_n(x)\}$ converges uniformly to the limit function $f(x)$, and if $f(x)$ is also integrable by Riemann's method, then
\[ \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx \quad \text{as } n \to \infty. \]
But this is far too restrictive a condition even in the theory of Fourier series, and having saluted the memory of Riemann's integral, we therefore pass on to our main topic, the Lebesgue integral.
3.4. Monotone sequences
Monotone sequences play an important role in Lebesgue theory, and we therefore summarize their main properties in this section. It is convenient to reword the traditional definitions of limiting points in order to get a more compact description.
DEFINITION 3.4.1. 'Almost all' numbers of an enumerable sequence $\{u_n\}$ $(n = 1, 2, \ldots)$ are said to possess a property $P$ if there is only a finite number that do not possess the property.
If $\{u_n\}$ is any sequence bounded below, then there exists an infimum or greatest lower bound $\inf u_n = \lambda$ such that
(i) $u_n \ge \lambda$ for all $n$, and
(ii) there is at least one number of the sequence in any closed interval $[\lambda, \lambda + \epsilon]$ where $\epsilon > 0$.
There is also a lower limit $\Lambda = \liminf u_n$ such that, for each tolerance $\epsilon > 0$,
(i) $u_n \ge \Lambda - \epsilon$ for almost all $n$, and
(ii) there is at least one number of the sequence less than $\Lambda + \epsilon$.
But if $\{u_n\}$ is a monotone, non-increasing sequence, bounded below, i.e. if $u_n \ge u_{n+1}$ for all $n$, then there exists a unique limit, $l = \lim u_n$, such that
(i) $u_n \ge l$ for all $n$, and
(ii) $u_n < l + \epsilon$ for almost all $n$.
Similarly if $\{u_n\}$ is any sequence bounded above, then there exists a supremum or least upper bound, $\sup u_n = \mu$, such that
(i) $u_n \le \mu$ for all $n$, and
(ii) there is at least one number of the sequence in any closed interval $[\mu - \epsilon, \mu]$.
There is also an upper limit $M = \limsup u_n$ such that, for each tolerance $\epsilon > 0$,
(i) $u_n \le M + \epsilon$ for almost all $n$, and
(ii) there is at least one number of the sequence greater than $M - \epsilon$.
But if $\{u_n\}$ is a monotone, non-decreasing sequence, bounded above, i.e. if $u_n \le u_{n+1}$ for all $n$, then there exists a unique limit, $m = \lim u_n$, such that
(i) $u_n \le m$ for all $n$, and
(ii) $u_n > m - \epsilon$ for almost all $n$.
The study of the convergence of any bounded sequence $\{u_n\}$ can be reduced to the study of two monotonic sequences by the use of the associated 'peak and chasm' sequences.
DEFINITION 3.4.2. Let
\[ \pi_n = \sup u_p \quad \text{for } p \ge n, \qquad \chi_n = \inf u_p \quad \text{for } p \ge n. \]
Then $\{\pi_n\}$ and $\{\chi_n\}$ are the peak and chasm sequences associated with the original sequence $\{u_n\}$. The peak sequence is non-increasing and the chasm sequence is non-decreasing. Their limits
\[ \lim \pi_n = \limsup u_n = \overline{\lim}\, u_n \quad \text{and} \quad \lim \chi_n = \liminf u_n = \underline{\lim}\, u_n \]
as $n \to \infty$, are the 'upper' and 'lower' limits of the sequence $\{u_n\}$. The necessary and sufficient condition that the original sequence $\{u_n\}$ should converge to a unique limit $\lambda$ is that
\[ \liminf u_n = \limsup u_n = \lim u_n = \lambda. \]
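All these considerations apply of course to a sequence $\{u_n\}$ defined as $u_n = f_n(x)$, i.e. the value of the functions $f_n(x)$ at a specific point $x$. For example, $u_n(x)$ and $v_n(x)$, the real and imaginary parts of the partial sums
\[ \sum_{p=0}^{n} e^{ipx} \quad (0 < x < 2\pi) \]
of the Fourier series of a Dirac delta function, are represented on the Argand diagram by points on a circle of centre $\tfrac{1}{2} + \tfrac{1}{2} i \cot \tfrac{1}{2}x$ and of radius $\tfrac{1}{2} \operatorname{cosec} \tfrac{1}{2}x$. Hence, if $x/2\pi$ is irrational, we easily find that
\[ \limsup u_n = \tfrac{1}{2} + \tfrac{1}{2} \operatorname{cosec} \tfrac{1}{2}x, \qquad \liminf u_n = \tfrac{1}{2} - \tfrac{1}{2} \operatorname{cosec} \tfrac{1}{2}x, \]
\[ \limsup v_n = \tfrac{1}{2} \cot \tfrac{1}{2}x + \tfrac{1}{2} \operatorname{cosec} \tfrac{1}{2}x, \qquad \liminf v_n = \tfrac{1}{2} \cot \tfrac{1}{2}x - \tfrac{1}{2} \operatorname{cosec} \tfrac{1}{2}x. \]
(If $x/2\pi$ is rational, the corresponding results are slightly more complicated.)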
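The peak and chasm construction can be tried on this very sequence. The sketch below is illustrative only: it takes x = 1 (so that x/2π is irrational), works over a finite horizon of N terms, so the suffix maxima and minima merely approximate the true peak and chasm sequences, and checks that the partial sums lie on the circle of centre ½ + ½ i cot ½x and radius ½ cosec ½x obtained by summing the geometric series. The variable names are hypothetical.

```python
import cmath
import math

x = 1.0                                   # x / (2*pi) is irrational
N = 2000                                  # finite horizon (an approximation)

# Partial sums s_n = sum_{p=0}^{n} exp(i p x), n = 0, 1, ..., N-1.
partial_sums = []
total = 0j
for p in range(N):
    total += cmath.exp(1j * p * x)
    partial_sums.append(total)

centre = 0.5 + 0.5j / math.tan(x / 2)     # 1/2 + (i/2) cot(x/2)
radius = 0.5 / math.sin(x / 2)            # (1/2) cosec(x/2)

u = [s.real for s in partial_sums]        # real parts u_n

# Peak and chasm sequences over the finite horizon (suffix max / min).
peak, chasm = u[:], u[:]
for n in range(N - 2, -1, -1):
    peak[n] = max(u[n], peak[n + 1])
    chasm[n] = min(u[n], chasm[n + 1])
```

Within the horizon the peak sequence never increases and the chasm sequence never decreases, and the largest real part stays just below ½ + ½ cosec ½x.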
3.5. Infinite integrals
In the Lebesgue theory the upper and lower bracketing functions are bounded, whence the definition of bracketed integrals (§ 3.2) is obviously restricted to bounded functions. In the Lebesgue theory the extension of the definition of integration to unbounded functions and to infinite intervals of integration is made particularly simple because we have to consider only integrals of non-negative functions, as will appear later in § 9.2. As a result the extension of the concept of integration can be made in terms of monotonic convergence.
DEFINITION 3.5.1. If $f(x)$ is a non-negative function defined for all values of $x$, then the 'truncated function' $f_{s,t}(x)$ is defined as
\[ f_{s,t}(x) = \begin{cases} f(x) & \text{if } -s \le x \le s \text{ and } f(x) \le t, \\ t & \text{if } -s \le x \le s \text{ and } f(x) > t, \\ 0 & \text{if } |x| > s, \end{cases} \]
$s$ and $t$ being positive numbers. Thus $f_{s,t}(x)$ converges monotonically to $f(x)$ as $s$ and $t$ tend independently to infinity.
DEFINITION 3.5.2. If each of the truncated functions $f_{s,t}(x)$ belongs to the domain of a monotonic functional $I(\ )$, then the 'monotonic' extension of $I(\ )$ for the function $f(x) = \lim f_{s,t}(x)$ is
\[ I^*(f) = \lim I(f_{s,t}) \quad \text{as } s, t \to \infty. \]
If $s < \sigma$ and $t < \tau$, then
\[ I(f_{s,t}) \le I(f_{\sigma,\tau}), \]
whence $I(f_{s,t})$ necessarily converges to a limit (finite or infinite) as $s, t \to \infty$. It is almost trivial to note
THEOREM 3.5.1. In Definition 3.5.2, the variables $s$ and $t$ can be restricted to the positive integers, and moreover we may take them to be equal so that $s = t = n$.
It is, however, important to state
THEOREM 3.5.2. If $f(x)$ is bounded and defined on a bounded interval $[a, b]$, then $I^*(f) = I(f)$.
For $f(x) - f_{s,t}(x) = 0$ if $s > \max(|a|, |b|)$ and $t$ is not less than the upper bound of $f(x)$.
This result is necessary to establish the consistency of our definition of the monotone extension of $I(\ )$.
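The truncation of Definition 3.5.1 is short to write down. The following sketch is an illustration under assumptions: the integrand 1/√|x| (set to 0 at the origin so that it is defined everywhere) and the helper name `truncate` are hypothetical choices, used only to show the monotonic behaviour in s and t.

```python
import math


def truncate(f, s, t):
    """Return the truncated function f_{s,t} of a non-negative function f."""
    def f_st(x):
        if abs(x) > s:
            return 0.0
        return min(f(x), t)     # f(x) where f(x) <= t, otherwise t
    return f_st


def f(x):
    # Non-negative and unbounded near x = 0; an illustrative choice.
    return 1.0 / math.sqrt(abs(x)) if x != 0.0 else 0.0


xs = [k / 10.0 for k in range(-50, 51)]
f_small = truncate(f, 1.0, 2.0)     # s = 1, t = 2
f_large = truncate(f, 4.0, 8.0)     # s = 4, t = 8
```

At every sample point the truncation with the smaller pair (s, t) lies below the one with the larger pair, which in turn lies below f itself; this is the monotonic convergence on which the extension rests.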
3.6. The Dini derivatives
In discussing the differentiation of an indefinite Lebesgue integral (§ 5.1) we shall have to take explicit cognisance of the fact that a continuous function does not necessarily possess a unique derivative everywhere. The elementary example
\[ f(x) = |x| \]
shows that $f(x)$ may not be differentiable at the origin, where
\[ \frac{f(x+h) - f(x)}{h} \to +1 \text{ or } -1, \quad \text{for } x = 0, \]
according as $h$ tends to zero through positive or negative values. By various ingenious methods it is possible to construct continuous functions that are not differentiable at any rational point, or indeed at any point whatsoever, and we refer to The theory of functions by E. C. Titchmarsh (§§ 11.21-11.23) for an account of what has been called the 'morbid pathology' of analysis. When a function $f(x)$ fails to possess a derivative in the ordinary sense, i.e. when the incrementary ratio
\[ G(x, h) = \frac{f(x+h) - f(x)}{h} \]
does not tend to a unique limit as $|h| \to 0$, we can employ the peak and chasm functions of § 3.4 to define what are commonly called the 'Dini' derivatives, after the Italian mathematician who introduced them into analysis. There is this difference that in § 3.4 we were considering the limits of a set of numbers $f_n(x)$ which were defined for integral values of $n$, whereas now we are concerned with the limits of a set of numbers $G(x, h)$ which are defined for values of $h$ in a continuous interval, $-\delta < h < \delta$.
Let $\pi(x, \delta)$ be the supremum of $G(x, h)$ and $\chi(x, \delta)$ be the infimum value of $G(x, h)$ in the domain $0 < h < \delta$. Then as $\delta \to 0$, $\pi(x, \delta)$ is non-increasing, and $\chi(x, \delta)$ is non-decreasing. Therefore $\pi(x, \delta)$ and $\chi(x, \delta)$ each tend monotonically to unique limits as $\delta \to 0$. These are upper and lower Dini derivatives of $f(x)$ on the right, usually denoted by
\[ D^+ f(x) = \lim_{\delta \to 0} \pi(x, \delta) = \limsup_{h \to 0} G(x, h) \quad (0 < h < \delta), \]
\[ D_+ f(x) = \lim_{\delta \to 0} \chi(x, \delta) = \liminf_{h \to 0} G(x, h) \quad (0 < h < \delta). \]
Similarly, the upper and lower Dini derivatives of $f(x)$ on the left are defined as
\[ D^- f(x) = \limsup_{h \to 0} G(x, h) \quad (-\delta < h < 0), \]
\[ D_- f(x) = \liminf_{h \to 0} G(x, h) \quad (-\delta < h < 0). \]
The Dini derivatives always exist, but their numerical value may be $+\infty$ or $-\infty$.
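For the elementary example f(x) = |x| the four Dini derivatives at the origin can be estimated by sampling the incrementary ratio. The sketch below is a rough numerical illustration: it replaces the limit as δ tends to zero by a single small δ and a finite sample of h, which is an approximation and not the definition; the helper names are hypothetical.

```python
def dini_right(f, x, delta, samples=1000):
    """Approximate (D^+ f(x), D_+ f(x)) by the sup and inf of
    G(x, h) = (f(x + h) - f(x)) / h over sampled h in (0, delta)."""
    hs = [delta * k / (samples + 1) for k in range(1, samples + 1)]
    ratios = [(f(x + h) - f(x)) / h for h in hs]
    return max(ratios), min(ratios)


def dini_left(f, x, delta, samples=1000):
    """Same as dini_right, with h sampled in (-delta, 0)."""
    hs = [-delta * k / (samples + 1) for k in range(1, samples + 1)]
    ratios = [(f(x + h) - f(x)) / h for h in hs]
    return max(ratios), min(ratios)


upper_right, lower_right = dini_right(abs, 0.0, 1e-3)
upper_left, lower_left = dini_left(abs, 0.0, 1e-3)
```

Both right-hand values come out as +1 and both left-hand values as -1, so the one-sided derivatives exist but disagree, which is exactly why |x| has no derivative at the origin.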
3.7. Exercises
A functional $I(f)$ is said to be
(i) positive if $f \ge 0$ implies that $I(f) \ge 0$,
(ii) additive if $I(f_1 + f_2) = I(f_1) + I(f_2)$,
(iii) multiplicative if $I(cf) = cI(f)$ for any constant $c$,
(iv) linear if $I(c_1 f_1 + c_2 f_2) = c_1 I(f_1) + c_2 I(f_2)$, for any constants $c_1$, $c_2$,
(v) 'absolute' if the existence of $I(f)$ implies the existence of $I(|f|)$,
(vi) completely additive if the conditions $f_n \ge 0$, $I(f_n)$ exists, $\sum_{p=1}^{n} f_p \to f$ as $n \to \infty$, and $I(f)$ exists, are sufficient to ensure that
\[ \sum_{p=1}^{\infty} I(f_p) = I(f). \]
1. Prove that the integral of a step function is a monotonic, positive, linear, absolute functional.
2. If $I^*(\phi)$ is the bracketed extension of $I(f)$, prove that
(i) if $I$ is positive, so also is $I^*$,
(ii) if $I$ is additive, so also is $I^*$,
(iii) if $I$ is multiplicative, so also is $I^*$,
(iv) if $I$ is linear, so also is $I^*$,
(v) if $I$ is linear and absolute, so also is $I^*$,
(vi) if $I$ is positive, linear, additive, and completely additive, so also is $I^*$.
3. The Riemann integral $R(f)$ can be defined as the bracketed extension of the monotone functional $I(f)$ for step functions in a bounded interval. Examine which of the properties listed above are possessed by the Riemann integral.
4. Show that the Riemann integral over a bounded interval of non-negative, non-decreasing functions $f(x)$ is a completely additive functional (Denjoy 1941-9, p. 428), i.e. if $f_n \ge 0$, $f_n(x)$ is non-decreasing in $x$, and the sums $\sum_{p=1}^{n} f_p(x)$ are uniformly bounded, then $\sum_{p=1}^{n} f_p(x)$ converges to a non-decreasing function $f(x)$ as $n \to \infty$, and
\[ \sum_{n=1}^{\infty} I(f_n) = I(f). \]
5. (i) If $f_n(x)$ possesses a non-negative derivative $f_n'(x)$ for each integer $n$ and each $x$ in $(a, b)$, and if
\[ s_n(x) = \sum_{p=1}^{n} f_p(x) \]
converges to a differentiable sum function $s(x)$ in $(a, b)$, show that the series $\sum_{p=1}^{\infty} f_p'(x)$ converges in $(a, b)$ to a function $A(x)$ such that $A(x) \le s'(x)$. (Show that
\[ s(x+h) - s(x) \ge \sum_{p=1}^{n} \{f_p(x+h) - f_p(x)\}.) \]
(ii) Show that there is a sequence of integers $p_n$ such that the series
\[ \sum_{n=1}^{\infty} \{s(x) - s_{p_n}(x)\} \]
converges for each $x$ in $(a, b)$. Hence deduce that $s_{p_n}'(x) \to s'(x)$ and $s_n'(x) \to s'(x)$ as $n \to \infty$. (This is a mild form of a theorem due to Fubini. Note that $s(x) - s_{p_n}(x) \le s(b) - s_{p_n}(b)$.)
6. If $\phi(x) = f(x) + g(x)$, prove that
\[ D_+ f + D_+ g \le D_+ \phi \le D^+ \phi \le D^+ f + D^+ g. \]
By considering the functions $g(x) = x - |x|$, $f(x) = |x|$, show that the signs $\le$ cannot be replaced by $=$.
7. If $f(x)$ has a unique derivative $f'(x)$ prove that
\[ D^+(f+g) = f'(x) + D^+ g. \]
4
Indicators
4.1. Introduction
We have it on the authority of Henri Poincaré (Œuvres de Laguerre, tome 1, Préface, p. x, Paris, 1898) that 'in the mathematical sciences a good notation has the same philosophical importance as a good classification in the natural sciences'. In the theory of sets of points there are many advantages in adopting the notation due to Charles de la Vallée-Poussin (1916), by which the whole of the theory is expressible in analytical form, rather than in the usual geometrical language. In particular, it is unnecessary to memorize formulae for the manipulation of special symbols for the intersection and complements of sets of points.
DEFINITION 4.1.1. The 'characteristic function' $\chi(x, E)$ of a set of points $E$ in a space $R$ is defined by the relations
\[ \chi(x, E) = \begin{cases} 1 & (x \in E), \\ 0 & (x \notin E), \end{cases} \]
$x$ being any point of $R$.
In view of the numerous meanings that have been given to the adjective 'characteristic', we shall follow the lead of modern books on probability theory and call the function $\chi(x, E)$ the 'indicator' of the set $E$. Thus the indicator of the whole space $R$ is the function $f(x) \equiv 1$ and the indicator of the 'empty set' is $f(x) \equiv 0$. When we are discussing the properties of some specified set $E$ we may write $\chi(x)$ for $\chi(x, E)$ and often we may further abbreviate $\chi(x)$ to the single letter $\chi$. Just as we commonly speak of 'the point $(x, y)$', meaning the point with coordinates $(x, y)$, so we shall speak of the 'set' or 'set of points $\chi(x, E)$', meaning the set of points $E$ with indicator $\chi(x, E)$.
The fundamental property of indicators is given by
THEOREM 4.1.1. The necessary and sufficient condition that a function $f(x)$, i.e. a mapping from the space $S$ of the points $x$ to the space $R$ of the real numbers, should be an indicator is that
\[ \{f(x)\}^2 = f(x) \quad \text{for each } x. \]
Since the elements $a$ of a Boolean algebra are characterized by the relation $a^2 = a$ it seems appropriate to give the theory of indicators the name of Boolean analysis. Thus point-set topology can be expressed in the form of Boolean analysis, and we shall proceed to summarize the relevant properties of sets of points in this form. The first fundamental relation in point-set theory is that of 'inclusion'.
DEFINITION 4.1.2. A set $\alpha$ is 'included' in a set $\beta$, or 'covered' by a set $\beta$ if each point of $\alpha$ is also a point of $\beta$.
THEOREM 4.1.2. The necessary and sufficient condition that the set $\alpha$ should be 'covered' by the set $\beta$ is that
\[ \alpha(x) = \alpha(x)\beta(x) \quad \text{for each } x. \]
For, if $\alpha$ is covered by $\beta$ then, by Definition 4.1.2, $\alpha(x) = 1$ implies that $\beta(x) = 1$, whence
\[ \alpha(x) \le \beta(x) \quad \text{for each } x. \]
Now
\[ 0 \le \alpha(x) \le 1 \quad \text{and} \quad 0 \le \beta(x) \le 1, \]
whence
\[ 0 \le \alpha(1-\beta) \le \beta(1-\beta) = 0, \]
and
\[ \alpha(1-\beta) = 0, \quad \text{i.e.} \quad \alpha(x) = \alpha(x)\beta(x). \]
Conversely, if $\alpha = \alpha\beta$ then, either $\alpha = 0$ or $\beta = 1$ and in either case $\alpha \le \beta$.
The second fundamental relation is that of 'disjunction'.
DEFINITION 4.1.3. A finite or enumerable collection of sets, $\alpha_1, \alpha_2, \ldots$, is said to be 'disjoint' if $\alpha_p \alpha_q = 0$ for $p \ne q$, i.e. if no two different sets have a point in common.
4.2. Boolean convergence
Since it is one of the distinguishing features of the Lebesgue theory of measure to consider enumerable collections of sets of points, we proceed at once to consider the conditions for the convergence of a sequence of indicators $\{\chi_n(x)\}$ $(n = 1, 2, \ldots)$. In classical analysis the condition for the convergence of a sequence $\{s_n\}$ $(n = 1, 2, \ldots)$ to a limit $s$ as $n$ tends to infinity is that, if $\epsilon$ is any prescribed tolerance, then
\[ |s_n - s| < \epsilon \]
for 'almost all $n$', i.e. for all values of $n$ except a finite number, i.e. for all $n$ greater than some threshold $n(\epsilon)$ which usually depends on $\epsilon$. But in a sequence of indicators, $\{\chi_n(x)\}$, each term can take two values only, viz. 0 and 1. Hence if $\chi_n(x)$ converges to a limit $\chi(x)$ the condition that $|\chi_n(x) - \chi(x)| < \epsilon$ for almost all $n$ implies that, for each value of $x$, the terms $\chi_n(x)$ are actually equal to the limit $\chi(x)$ for almost all $n$. Hence we have
THEOREM 4.2.1. The necessary and sufficient condition for the convergence of a sequence of indicators $\{\chi_n(x)\}$ to a limit $\chi(x)$ as $n \to \infty$ is that, for each $x$,
\[ \chi_n(x) = \chi(x) \quad \text{for almost all } n. \]
COROLLARY. The limit $\chi(x)$ of a convergent sequence of indicators is itself an indicator.
For almost all $n$, and each value of $x$,
\[ \chi^2 = \chi_n^2 = \chi_n = \chi. \]
Hence by Definition 4.1.1, $\chi$ is an indicator.
THEOREM 4.2.2. If $\{\chi_n(x)\}$ is any enumerable sequence of indicators, then the product
\[ \pi_n(x) = \chi_1(x)\chi_2(x) \cdots \chi_n(x) \]
is also an indicator, and the sequence $\{\pi_n(x)\}$ is always convergent.
For, if $\chi_n(\xi) = 1$ for all $n$, then $\pi_n(\xi) = 1$ for all $n$ and $\pi_n(\xi) \to 1$. But, if there is at least one integer $p$ such that $\chi_p(\xi) = 0$, then $\pi_n(\xi) = 0$ if $n \ge p$, and $\pi_n(\xi) \to 0$. Hence $\pi_n(\xi)$ tends to a limit $\pi(\xi)$ which has the values 0 and 1 only, and is therefore an indicator. Therefore we can frame
DEFINITION 4.2.1. The 'intersection' of a finite or enumerable collection of sets $\{\chi_n(x)\}$ $(n = 1, 2, \ldots)$ is the set
\[ \pi(x) = \chi_1(x)\chi_2(x) \cdots \chi_n(x) \]
if the collection is finite, or the set
\[ \pi(x) = \lim_{n \to \infty} \{\chi_1(x)\chi_2(x) \cdots \chi_n(x)\} \]
if the collection is enumerable.
THEOREM 4.2.3. The intersection of $\chi_1, \chi_2, \ldots$ consists of the points common to $\chi_1, \chi_2, \ldots$.
For $\pi(x) = 1$ if and only if $\chi_n(x) = 1$ for all $n$.
DEFINITION 4.2.2. The 'union' of a finite or enumerable collection of sets $\{\chi_m(x)\}$ $(m = 1, 2, \ldots)$ is the set
\[ \sigma(x) = 1 - \{1 - \chi_1(x)\}\{1 - \chi_2(x)\} \cdots \{1 - \chi_n(x)\} \]
if the collection is finite, or the set
\[ \sigma(x) = 1 - \lim_{n \to \infty} \{1 - \chi_1(x)\}\{1 - \chi_2(x)\} \cdots \{1 - \chi_n(x)\} \]
if the collection is enumerable.
THEOREM 4.2.4. The union of the sets $\chi_1, \chi_2, \ldots$ consists of the points which belong to at least one of these sets.
For $\sigma(x) = 1$ only if one of the terms $\{1 - \chi_p(x)\}$ is zero.
Although we need no special symbol, such as $\bigcap \chi_n(x)$, for the intersection of the sets $\chi_1, \chi_2, \ldots$, it will be convenient to denote their union by $\bigcup \chi_n(x)$, or by $\bigcup_{n=1}^{\infty} \chi_n(x)$, or by $\alpha \cup \beta$, if there are only two sets $\alpha$ and $\beta$.
A special case of great importance is covered by
THEOREM 4.2.5. If the collection of sets $\{\chi_n(x)\}$ is disjoint, then the series
$$\sum_{n=1}^{\infty}\chi_n(x)$$
always converges to the indicator of the union of the sets.
For $\chi_p\chi_q = 0$ if $p \ne q$, whence
$$1-(1-\chi_1)(1-\chi_2)\cdots(1-\chi_n) = \chi_1+\chi_2+\cdots+\chi_n.$$
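The Boolean algebra of indicators in Definitions 4.2.1 and 4.2.2 can be tried out numerically. The following is only a minimal sketch: the helper names `intersection`, `union`, and `interval` are illustrative and not part of the text, and half-open intervals are an arbitrary choice made so that neighbouring intervals are disjoint.

```python
def intersection(*chis):
    """Indicator of the intersection: the product of the indicators."""
    def pi(x):
        value = 1
        for chi in chis:
            value *= chi(x)
        return value
    return pi

def union(*chis):
    """Indicator of the union: 1 - (1 - chi_1)(1 - chi_2)...(1 - chi_n)."""
    def sigma(x):
        value = 1
        for chi in chis:
            value *= 1 - chi(x)
        return 1 - value
    return sigma

def interval(a, b):
    """Indicator of the half-open interval a <= x < b (an illustrative choice)."""
    return lambda x: 1 if a <= x < b else 0

# Three pairwise disjoint intervals: the union equals the plain sum (Theorem 4.2.5).
disjoint = [interval(0, 1), interval(1, 2), interval(3, 4)]
sample_points = [k / 4 for k in range(-4, 20)]
for x in sample_points:
    assert union(*disjoint)(x) == sum(chi(x) for chi in disjoint)

# Two overlapping intervals: the product picks out the common points only.
a, b = interval(0, 2), interval(1, 3)
common_points = [x for x in sample_points if intersection(a, b)(x) == 1]
```

For overlapping sets the plain sum can exceed 1, which is why the union is defined through the product of the complements rather than through the sum.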
4.3. Open and closed sets If we employ the language of Boolean analysis then the simplest and most natural way to define the classical concepts of 'open' and 'closed' sets is as follows.
DEFINITION 4.3.1. The set $o(x)$ is said to be 'open' if the indicator $o(x)$ is continuous at each point $x = \xi$ where $o(\xi) = 1$.
DEFINITION 4.3.2. The set $K(x)$ is said to be 'closed' if the indicator $K(x)$ is continuous at each point $x = \xi$ where $K(\xi) = 0$.
THEOREM 4.3.1. The open interval $a < x < b$ is an open set in the sense of Definition 4.3.1, and the closed interval $a \le x \le b$ is a closed set in the sense of Definition 4.3.2. In particular, an isolated point, $x = c$, is a closed set.
THEOREM 4.3.2. If $\xi$ is a point of the open set $o(x)$ then $\xi$ is an interior point of this set.
For there is a neighbourhood of $\xi$ in which $o(x) = o(\xi) = 1$. Hence this neighbourhood of $\xi$ belongs to the set $o(x)$, i.e. $\xi$ is an interior point of $o(x)$.
THEOREM 4.3.3. If $K(x)$ is a closed set, then it is identical with its closure.
Let $\xi_1, \xi_2, \ldots$ be any convergent sequence of points belonging to $K(x)$ with limit $\xi$. Then $\xi$ belongs to the closure of $K(x)$, and any neighbourhood of $\xi$ contains an infinite number of the points $\{\xi_n\}$. But, if $K(\xi) = 0$, there exists a neighbourhood of $\xi$ for all points $x$ of which $K(x) = 0$ by Definition 4.3.2. Thus we have a contradiction and hence $K(\xi) = 1$, i.e. the set $K(x)$ contains the limit of any convergent sequence of points in $K(x)$.
The preceding theorems show that the definitions of open and closed sets in terms of the continuity of their indicators are equivalent to the usual definitions. (But see Exercise 7 of § 4.6.)
DEFINITION 4.3.3. Two sets $\alpha(x)$ and $\beta(x)$ are said to be 'complementary' with respect to a set $\gamma(x)$ if $\alpha(x)$ and $\beta(x)$ are disjoint, and $\gamma(x)$ is their union.
THEOREM 4.3.4. The necessary and sufficient condition that $\alpha(x)$ and $\beta(x)$ should be complementary with respect to $\gamma(x)$ is that $\alpha(x)+\beta(x) = \gamma(x)$.
If $\alpha$ and $\beta$ are disjoint their union is
$$1-(1-\alpha)(1-\beta) = \alpha+\beta.$$
Hence the condition is necessary. It is also sufficient, for it implies that
$$0 = (\alpha+\beta-\gamma)(\alpha+\beta+\gamma) = \alpha+\beta-\gamma+2\alpha\beta = 2\alpha\beta,$$
whence $\alpha$ and $\beta$ are disjoint. Therefore their union is $\alpha+\beta = \gamma$.
THEOREM 4.3.5. The complement $o(x)$ of a closed set $K(x)$ with respect to an open set $O(x)$ is itself open.
For
$$o(x) = O(x)-K(x).$$
Hence, if $o(\xi) = 1$, then $O(\xi) = 1$ and $K(\xi) = 0$. Therefore $O(x)$ and $K(x)$ are continuous at $\xi$. Therefore $o(x)$ is continuous at $\xi$. Hence $o(x)$ is open.
THEOREM 4.3.6. The complement $K(x)$ of an open set $o(x)$ with respect to a closed set $r(x)$ is itself closed.
For
$$K(x) = r(x)-o(x).$$
Hence, if $K(\xi) = 0$, then either
(i) $r(\xi) = 1 = o(\xi)$,
or
(ii) $r(\xi) = 0 = o(\xi)$.
If $o(\xi) = 1$ then there is a neighbourhood of $\xi$ in which $o(x) = 1$ and in which
$$r(x) = K(x)+o(x) \ge o(x),$$
whence $r(x) = 1$ and $r(x)$ is continuous. Hence $K(x)$ is continuous at $\xi$. If $r(\xi) = 0$ then there is a neighbourhood of $\xi$ in which $r(x) = 0$ and in which
$$K(x) = r(x)-o(x) \le r(x),$$
whence $K(x) = 0$. Thus $K(x)$ is continuous at $\xi$. Hence $K(x)$ is continuous at any point $\xi$ at which $K(\xi) = 0$. Therefore $K(x)$ is closed.
THEOREM 4.3.7. The intersection of a finite or enumerable collection of closed sets $\{K_p\}$ is closed.
Let
$$\alpha_n(x) = K_1(x)K_2(x)\cdots K_n(x),$$
and $\alpha_n(\xi) = 0$. Then there exists an integer $p$ such that $1 \le p \le n$ and $K_p(\xi) = 0$. Hence there is a neighbourhood of $\xi$ in which $K_p(x) = 0$, and therefore $\alpha_n(x) = 0$. Thus $\alpha_n(x)$ is closed. Now let
$$\alpha(x) = \lim_{n\to\infty}\alpha_n(x),$$
and $\alpha(\xi) = 0$. Then, by Theorem 4.2.1, there is an integer $p$ such that $\alpha_p(\xi) = 0$. Hence there is a neighbourhood of $\xi$ in which $\alpha_p(x) = 0$ and therefore $\alpha(x) = 0$. Therefore $\alpha(x)$ is closed.
THEOREM 4.3.8. The union of a finite or enumerable collection of open sets is open.
This follows as the 'dual' theorem of 4.3.7 by using the principles of complementarity (4.3.5 and 4.3.6).
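4.4. Covering theorems
THEOREM 4.4.1. The union of enumerable non-disjoint sets $\{\alpha_n\}$ is also the union of the enumerable disjoint sets $\{\beta_n\}$ defined as
$$\beta_1 = \alpha_1, \qquad \beta_n = \alpha_n(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{n-1}) \quad (n > 1).$$
Since $\alpha_n^2 = \alpha_n$ and $(1-\alpha_n)^2 = 1-\alpha_n$, it follows that $\beta_n^2 = \beta_n$. Hence the functions $\{\beta_n\}$ are indicators. Also, if $p < q$,
$$\beta_p\beta_q = \alpha_p\alpha_q(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{q-1}).$$
The right-hand side contains the factor $\alpha_p(1-\alpha_p) = 0$, whence $\beta_p\beta_q = 0$, so that the sets $\{\beta_p\}$ are disjoint. Finally, we can prove by induction that
$$\beta_1+\beta_2+\cdots+\beta_n = 1-(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n).$$
For $\beta_1 = 1-(1-\alpha_1)$ and
$$1-(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)+\beta_{n+1} = 1+(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)(\alpha_{n+1}-1).$$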
Hence
$$\sum_{n=1}^{\infty}\beta_n = 1-\prod_{n=1}^{\infty}(1-\alpha_n),$$
the infinite series and product converging by Theorem 4.2.2.
COROLLARY. The union of enumerable non-disjoint intervals is also the union of certain enumerable disjoint intervals.
For, if $\alpha_1, \alpha_2, \ldots$ are intervals, then each $\beta_n$ is a finite collection of disjoint intervals.
THEOREM 4.4.2 (the Heine-Borel theorem). If $\alpha(x)$ is a compact set of points $E$ covered by an enumerable collection $C$ of open sets $\{\alpha_n(x)\}$ ($n = 1, 2, \ldots$) then $E$ is also covered by a finite number of sets in the collection $C$.
Let
$$\phi_n(x) = \alpha_1(x)+\alpha_2(x)+\cdots+\alpha_n(x).$$
Then the possible values of the function $\phi_n(x)$ are $0, 1, 2, \ldots, n$. Hence $\phi_n(x)$ has a greatest lower bound $A_n$ which is certainly attained at some point $\xi_n$, and
$$\phi_n(x) \ge \phi_n(\xi_n) = A_n \quad \text{for all } x \text{ in } E.$$
Since $E$ is compact, the sequence of points $\{\xi_n\}$ has at least one point of accumulation $\xi$ in $E$, and moreover, for some integer $q$, $\alpha_q(\xi) = 1$. But $\alpha_q(x)$ is continuous at $\xi$ since $\alpha_q$ is open. Therefore, for some integer $p \ge q$,
$$\alpha_q(\xi_p) = 1.$$
Hence
$$\phi_p(\xi_p) \ge \alpha_q(\xi_p) = 1.$$
Thus
$$\phi_p(x) \ge \phi_p(\xi_p) \ge 1$$
and $E$ is covered by the finite number of open sets, $\alpha_1, \alpha_2, \ldots, \alpha_p$.
COROLLARY. The Heine-Borel theorem clearly applies if $E$ is a bounded, closed set in $d$-dimensional Euclidean space.
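The disjointification of Theorem 4.4.1 lends itself to a direct numerical check. The sketch below is illustrative only: `disjointify` and `closed_interval` are hypothetical helper names, and three overlapping closed intervals stand in for the sets $\alpha_n$.

```python
def disjointify(alphas):
    """Return indicators beta_n = alpha_n (1 - alpha_1) ... (1 - alpha_{n-1})."""
    betas = []
    for n, alpha_n in enumerate(alphas):
        earlier = alphas[:n]  # alpha_1, ..., alpha_{n-1}
        def beta(x, alpha_n=alpha_n, earlier=earlier):
            value = alpha_n(x)
            for alpha in earlier:
                value *= 1 - alpha(x)
            return value
        betas.append(beta)
    return betas

def closed_interval(a, b):
    """Indicator of the closed interval a <= x <= b."""
    return lambda x: 1 if a <= x <= b else 0

alphas = [closed_interval(0, 2), closed_interval(1, 3), closed_interval(2.5, 4)]
betas = disjointify(alphas)

for k in range(-8, 41):
    x = k / 8
    values = [beta(x) for beta in betas]
    union = 1
    for alpha in alphas:
        union *= 1 - alpha(x)
    union = 1 - union
    assert sum(values) == union      # beta_1 + ... + beta_n is the union
    assert sum(values) in (0, 1)     # so at most one beta_n equals 1 at x
```

Each point of the union is credited to the first set $\alpha_n$ that contains it, which is what makes the $\beta_n$ disjoint.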
4.5. The indicator of a function
We have already indicated in § 2.4 the importance of the functions $\alpha_p(x)$ associated with a function $f(x)$ by the relations
$$\alpha_p(x) = \begin{cases} 1 & \text{if } t_p < f(x) \le t_{p+1}, \\ 0 & \text{if } f(x) \le t_p \text{ or } t_{p+1} < f(x). \end{cases}$$
In general, if $x$ is a point of a space $S$ and $f(x)$ any mapping from $S$ to the real numbers $R$ we have the definition
DEFINITION 4.5.1. The indicator of the function $f(x)$, viz. $\alpha(x, t, f)$ or $\alpha(x, t)$, is defined as the indicator of the set of points $x$ at which $f(x) > t$.
Hence
THEOREM 4.5.1. The indicator of the function $f(x)$ is a non-increasing function of $t$.
For, if $s < t$ and $f(x) > t$, then $f(x) > s$, i.e. $\alpha(x, t, f) = 1$ implies that $\alpha(x, s, f) = 1$. But, if $f(x) \le t$, then we may have either $f(x) > s$ or $f(x) \le s$, i.e. $\alpha(x, t, f) = 0$ implies that $\alpha(x, s, f) = 1$ or 0. Therefore $\alpha(x, t, f) \le \alpha(x, s, f)$.
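Definition 4.5.1 and Theorem 4.5.1 can be illustrated with a few lines of code. This is a sketch under stated assumptions: `indicator_of_function` is a hypothetical name, and $f = \sin$ with the sample levels below is an arbitrary choice of test function.

```python
import math

def indicator_of_function(f):
    """Return alpha(x, t) = 1 if f(x) > t, else 0."""
    return lambda x, t: 1 if f(x) > t else 0

alpha = indicator_of_function(math.sin)

levels = [-1.5, -0.5, 0.0, 0.5, 1.5]   # increasing values of t
for k in range(0, 63):
    x = k / 10
    row = [alpha(x, t) for t in levels]
    # Non-increasing in t: once the indicator drops to 0 it stays 0.
    assert all(row[i] >= row[i + 1] for i in range(len(row) - 1))
```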
4.6. Exercises
1. If the indicators $\alpha$, $\beta$ and $\alpha\beta$ are linearly dependent, i.e. if there exists a linear relation
$$a\alpha+b\beta+c\alpha\beta = 0$$
such that the real numbers $a$, $b$, $c$ are not all zero, prove that either $\alpha = \alpha\beta$ or $\beta = \alpha\beta$.
2. If $\alpha$, $\beta$, $\alpha\beta$, and $\gamma$ are each indicators and
$$\gamma = a\alpha+b\beta+c\alpha\beta,$$
where $a$, $b$, $c$ are real numbers, prove that either $\gamma = \alpha+\beta-\alpha\beta$, or $\gamma = \alpha+\beta-2\alpha\beta$, or $\gamma = \alpha-\alpha\beta$, or $\gamma = \beta-\alpha\beta$.
3. The 'differences' of two sets $\alpha$ and $\beta$ are defined as $\alpha \sim \beta = \alpha-\alpha\beta$ and $\beta \sim \alpha = \beta-\alpha\beta$ while the 'symmetric difference' is defined as $\alpha \Delta \beta = \alpha+\beta-2\alpha\beta$. Give geometric definitions of these sets in terms of the relations of inclusion and disjunction.
4. If $\xi$ is a point covered by the union of the sets $\{\alpha_n(x)\}$ of Theorem 4.4.1, prove directly that there is an integer $p$ such that $\alpha_n(\xi) = 0$ if $n < p$, $\alpha_p(\xi) = 1$, $\beta_n(\xi) = 0$ if $n \ne p$, $\beta_p(\xi) = 1$.
5. If $\{\alpha_n(x)\}$ is a decreasing sequence of non-empty, bounded closed sets in $d$-dimensional Euclidean space, i.e. if $\alpha_1 \supset \alpha_2 \supset \cdots \supset \alpha_n \supset \alpha_{n+1} \supset \cdots$, prove that the intersection of all the $\alpha_n$ is not empty.
6. Show that the indicator of the rational points in [0, 1] can be expressed in the form
$$\chi(x) = \lim_{m\to\infty}\Bigl\{\lim_{n\to\infty}(\cos m!\,\pi x)^{2n}\Bigr\}.$$
SETS OF ZERO MEASURE
5
Differentiation of monotone functions
5.1. Introduction
In Chapter 1 in order to provide a motivation for the search for the most general concept of integration we introduced the ideas of 'tame' and 'wild' functions. The tame functions of elementary calculus are the bounded continuous functions with finite derivatives everywhere. Then there are functions such as
$$\operatorname{sgn} x, \qquad |x|, \qquad [x],$$
which are tame everywhere except at one point (in these cases the origin). From these we can construct functions such as
$$\sum_{p=1}^{q}|x-p/q|, \qquad \sum_{p=1}^{q}\operatorname{sgn}(x-p/q) \qquad (p, q \text{ integers}),$$
which are tame everywhere except at a finite number of points; and functions such as
which is tame everywhere except at the rational points $x = r_1, r_2, \ldots$. Thus it appears that there are degrees of wildness and that one way of quantifying the wildness of a function is by giving some general specification of the points at which it ceases to be tame. Such a criterion of wildness should have some practical utility and be related to the general concept of integration. Thus if the 'wild' points of a wild function could be neglected in constructing its integral, the function could be described as 'wild but harmless'.
For example all physicists would regard $|x|$ as a primitive of $\operatorname{sgn} x$, disregarding the discontinuity in $\operatorname{sgn} x$ at the origin. Similarly any finite number of discontinuities in an otherwise continuous integrand are commonly ignored. But how far can we go in neglecting wild points? What is the largest collection of wild points that can be safely ignored in integration? The answer to this question is furnished by Lebesgue's theory of sets of points of zero measure.
5.2. Sets of points of measure zero
The length of an interval $I$ will be denoted by $|I|$ and the area of a rectangle $R$ by $|R|$.
DEFINITION 5.2.1. A set of points $E$ on the real axis is said to have zero one-dimensional measure if, to each positive number $\epsilon$ there corresponds an enumerable collection of intervals $\{I_n\}$ ($n = 1, 2, \ldots$), which cover the set $E$ and whose total length
$$I = \sum_{n=1}^{\infty}|I_n|$$
does not exceed $\epsilon$.
We note that the intervals may possibly be overlapping, and that it is immaterial whether the intervals are open, closed, or half-open. Also, since the terms $|I_1|, |I_2|, \ldots$ of the series are each positive, the sum $I$ is independent of the order of enumeration.
DEFINITION 5.2.2. A set of points $E$ in the $(x, y)$-plane is said to have zero two-dimensional measure, if to each positive number $\epsilon$ there corresponds an enumerable collection of rectangles $\{R_n\}$ (with edges parallel to the lines $x = 0$ or $y = 0$) which cover the set $E$ and whose total area
$$R = \sum_{n=1}^{\infty}|R_n|$$
does not exceed $\epsilon$.
As before, the rectangles may possibly be overlapping and may or may not include their edges or vertices. When there is no danger of confusion we shall use the shorter description 'sets of zero measure', or 'null sets' for sets with zero one-dimensional measure.
When a function $f(x)$ possesses a property $P$ for each point of an interval $(a, b)$ except for the points of a set of measure zero it is customary to say that '$f(x)$ possesses the property $P$ almost everywhere in $(a, b)$' or to write '$f(x)$ possesses the property $P$ p.p. in $(a, b)$' (p.p. being the abbreviation for presque partout). The essential point in these definitions, due effectively to Lebesgue, is the use of an enumerable collection of intervals (or rectangles) to cover the set of points $E$. It is obvious that, in one dimension, a single point, or a finite number of points has zero measure, and we shall prove that the same is true for an enumerable set of points.
THEOREM 5.2.1. Any enumerable set of points $E = \{x_1, x_2, \ldots\}$ has zero measure.
For $E$ is covered by the enumerable collection of intervals
$$I_n = \{x : x_n-\epsilon/2^{n+1} < x < x_n+\epsilon/2^{n+1}\}$$
whose total length is
$$\sum_{n=1}^{\infty}\epsilon/2^{n} = \epsilon.$$
COROLLARY. The set of rational points, i.e. the points whose coordinates $x_1, x_2, \ldots$ are each a rational number, is enumerable and therefore has zero measure.
THEOREM 5.2.2. If $E_1, E_2, \ldots$ is an enumerable collection of sets, each of measure zero, then their union $E$ is also of measure zero.
For the set $E_k$ can be covered by an enumerable collection of intervals $(a_{kn}, b_{kn})$ ($n = 1, 2, \ldots$) of total length less than $\epsilon/2^k$, where $\epsilon$ is any arbitrary tolerance. Hence the union of $E_1, E_2, \ldots$ can be covered by an enumerable collection of intervals of total length less than $\epsilon$. The question naturally arises whether a set of points of zero measure is necessarily enumerable. The answer is in the negative as is shown by the example of the 'Cantor set' in the next section.
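The covering used in the proof of Theorem 5.2.1 can be reproduced exactly with rational arithmetic. The following sketch is illustrative: the function names and the particular enumeration of the rationals in [0, 1] (by increasing denominator) are assumptions made for the example, not taken from the text.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """Enumerate the first `count` distinct rationals p/q in [0, 1] by increasing q."""
    seen, points = set(), []
    q = 1
    while len(points) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                points.append(r)
                if len(points) == count:
                    break
        q += 1
    return points

def cover(points, eps):
    """Interval I_n = (x_n - eps/2^(n+1), x_n + eps/2^(n+1)) for n = 1, 2, ..."""
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
assert total == eps * (1 - Fraction(1, 2 ** 50))   # partial sum of eps/2^n
assert total < eps
```

However many points are enumerated, the total length of the covering intervals stays below the chosen tolerance, because the lengths form a geometric series.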
5.3. The Cantor set of points
To construct the Cantor set of points we proceed as follows. From the unit interval, $0 \le x \le 1$, we remove in succession
$E_1$, the middle third, $\tfrac13 < x < \tfrac23$,
$E_2$, the middle thirds of the remaining intervals, viz.
(i) $\tfrac19 < x < \tfrac29$, (ii) $\tfrac79 < x < \tfrac89$,
$E_3$, the middle thirds of the remaining intervals, viz.
$\tfrac{1}{27} < x < \tfrac{2}{27}$, $\tfrac{7}{27} < x < \tfrac{8}{27}$, $\tfrac{19}{27} < x < \tfrac{20}{27}$, $\tfrac{25}{27} < x < \tfrac{26}{27}$,
and continue this process indefinitely. The lengths of the intervals removed are
$$|E_1| = \frac{1}{3}, \qquad |E_2| = \frac{2}{3^2}, \qquad |E_3| = \frac{4}{3^3},$$
and, in general,
$$|E_n| = \tfrac13\bigl(\tfrac23\bigr)^{n-1}.$$
Thus, at the $n$th stage, we have removed intervals of total length
$$\sum_{k=1}^{n}|E_k| = 1-\bigl(\tfrac23\bigr)^{n}.$$
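The successive removals can be carried out exactly with rational arithmetic, which gives a check on the formula for the total length removed. This is a minimal sketch; the function name `remove_middle_thirds` and the choice of ten stages are illustrative.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its two outer thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

remaining = [(Fraction(0), Fraction(1))]
for n in range(1, 11):
    remaining = remove_middle_thirds(remaining)
    remaining_length = sum(b - a for a, b in remaining)
    removed_length = 1 - remaining_length
    assert len(remaining) == 2 ** n
    assert removed_length == 1 - Fraction(2, 3) ** n
```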
Now consider the points that remain after $E_1, E_2, \ldots$ have been removed. These form a set $C$. $C$ is certainly covered by the points that remain after $E_1, E_2, \ldots, E_n$ have been removed. The removal of this finite number of intervals leaves a set of points $C_n$ which consists of a finite number of intervals of total length $(2/3)^n$. Since $n$ can be chosen to make $(2/3)^n$ less than any assigned tolerance $\epsilon > 0$, the set $C$ has zero measure. The set $C$, first constructed by Cantor, is not, however, enumerable. To prove this we observe that any number $x$ in the interval $0 \le x \le 1$ can be expressed in the form
$$x = \sum_{n=1}^{\infty}\frac{a_n}{3^n},$$
where each numerator $a_n$ is either 0, 1, or 2. The points of $E_1$ can be expressed as
$$x = \frac{1}{3}+\sum_{n=2}^{\infty}\frac{a_n}{3^n};$$
the points of $E_2$ as
$$x = \sum_{n=3}^{\infty}\frac{a_n}{3^n}+\frac{1}{9}+\Bigl(0 \text{ or } \frac{2}{3}\Bigr);$$
the points of $E_3$ as
$$x = \sum_{n=4}^{\infty}\frac{a_n}{3^n}+\frac{1}{27}+\Bigl(0 \text{ or } \frac{2}{3} \text{ or } \frac{2}{9} \text{ or } \frac{2}{3}+\frac{2}{9}\Bigr).$$
In general, the points of $E_n$ are those for which $x$ can be expressed in the form
$$x = \sum_{k=1}^{\infty}\frac{a_k}{3^k},$$
where $a_k = 0$ or 2 for $k < n$, $a_n = 1$, and $a_k = 0$, 1, or 2 for $k > n$.
Hence the points of the Cantor set $C$, i.e. the points of the interval (0, 1) that remain after the removal of $E_1, E_2, \ldots$, can be represented only in the form
$$x = \sum_{k=1}^{\infty}\frac{a_k}{3^k},$$
where each $a_k$ is either 0 or 2. If possible let these points be enumerable. Then they will form a sequence $\{x_n\}$, with $n = 1, 2, \ldots$, and $x_n$ expressible in the form
$$x_n = \sum_{k=1}^{\infty}\frac{a_{n,k}}{3^k},$$
where each $a_{n,k}$ is either 0 or 2. Now consider the point
$$\xi = \sum_{k=1}^{\infty}\frac{\delta_k}{3^k},$$
where $\delta_k = 2-a_{k,k}$. This point lies in the interval (0, 1) and belongs to the Cantor set since $\delta_k = 0$ or 2. But $\delta_k$ is always different from $a_{k,k}$. Hence $\xi$ is different from each number $x_n$, for $n = 1, 2, \ldots$, i.e. the number $\xi$ that belongs to $C$ is not included in the given enumeration of the points of $C$. We have thus arrived at a contradiction, which proves that the Cantor set $C$ is not enumerable.
5.4. Average metric density
A simple and useful criterion to prove that a set of points $E$ has zero measure (Boas 1960, p. 64), is conveniently expressed in terms of the concept of the 'average metric density' of the set $E$ in an interval $I$.
DEFINITION 5.4.1. If the subset of $E$ that lies in the interval $I$, of length $|I|$, can be covered by an enumerable collection of intervals of total length not greater than $p|I|$, then we say that the 'average metric density' of the set $E$ in the interval $I$ is not greater than $p$. Clearly $0 \le p \le 1$.
THEOREM 5.4.1. If the average metric density of a bounded set $E$ is not greater than a number $p$ less than unity, for all intervals $I$, then $E$ is a set of measure zero.
Consider the subset of $E$ in an interval $(a, b)$. This subset can be covered by an enumerable collection of intervals $(a_k, b_k)$ ($k = 1, 2, \ldots$) of total length not greater than $p(b-a)$. Now the subset of $E$ in $(a_k, b_k)$ can be covered by an enumerable collection of intervals $(a_{kl}, b_{kl})$ ($l = 1, 2, \ldots$) of total length not greater than $p(b_k-a_k)$. Hence the subset of $E$ in $(a, b)$ can be covered by an enumerable collection of intervals $(a_{kl}, b_{kl})$ ($k, l = 1, 2, \ldots$) of total length not greater than
$$\sum_{k=1}^{\infty}p(b_k-a_k) \le p^2(b-a).$$
By mathematical induction it follows that the subset of $E$ in $(a, b)$ can be covered by an enumerable collection of intervals of total length not greater than $p^n(b-a)$, where $n$ is any positive integer. Since $n$ is arbitrary and $p < 1$, the subset of $E$ in $(a, b)$ has zero measure.
5.5. The problem of differentiation
We are now in a position to enunciate the central result of the Lebesgue theory of differentiation as follows. If the function $f(x)$ is continuous and non-decreasing in the interval $(a, b)$, then $f(x)$ possesses a derivative $f'(x)$ at all points of this interval, with the exception of a set of points $Z$ of zero measure. This remarkable result, which finds so many applications, not only in the theory of integration but also in the whole of analysis and differential geometry, was first discovered by Lebesgue (1904, p. 128). The line of argument that we shall follow is that given by F. Riesz (1953, pp. 6-7) as simplified by R. P. Boas (1960, p. 134). The main strategy is to show that the following chain of inequalities holds almost everywhere in $(a, b)$, viz.
$$0 \le D^+f(x) \le D_-f(x) \le D^-f(x) \le D_+f(x) \le D^+f(x) < \infty.$$
It then follows at once that the four Dini derivatives of $f(x)$ are finite and equal almost everywhere, i.e. $f(x)$ possesses a finite derivative almost everywhere in $(a, b)$. To establish these inequalities it is sufficient to prove that, if $f(x)$ is any continuous, non-decreasing function, then
$$D^+f(x) < \infty \quad \text{p.p.}$$
and
$$D^+f(x) \le D_-f(x) \quad \text{p.p.} \qquad \text{(I)}$$
For, if $y = -x$ and $g(y) = -f(x)$, then
$$g(y+h) = -f(-y-h) = -f(x-h),$$
$$\frac{g(y+h)-g(y)}{h} = \frac{f(x-h)-f(x)}{-h}.$$
Hence
$$D^+g(y) = D^-f(x) \quad \text{and} \quad D_-g(y) = D_+f(x).$$
But $g(y+h)-g(y) = f(x)-f(x-h)$, so that the function $g(y)$, like $f(x)$, is continuous and non-decreasing in $y$.
Therefore (I) implies that
$$D^+g(y) \le D_-g(y) \quad \text{p.p.}$$
whence
$$D^-f(x) \le D_+f(x) \quad \text{p.p.} \qquad \text{(II)}$$
Now by the very definition of the Dini derivatives
$$D_-f(x) \le D^-f(x) \quad \text{and} \quad D_+f(x) \le D^+f(x). \qquad \text{(III)}$$
Hence, by combining the inequalities (I), (II), and (III), we find the desired result:
$$D^+f(x) \le D_-f(x) \le D^-f(x) \le D_+f(x) \le D^+f(x) \quad \text{p.p.},$$
the successive inequalities following from (I), (III), (II), and (III) respectively.
The problem is thus resolved into the proof of the two inequalities
$$D^+f(x) < \infty \quad \text{p.p.} \qquad \text{and} \qquad D^+f(x) \le D_-f(x) \quad \text{p.p.}$$
for any continuous, non-decreasing function $f(x)$.
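The Dini derivatives need not agree at every point, only almost everywhere. As a hedged numerical illustration (the function below is an arbitrary example chosen for this sketch, not one taken from the text), consider $f(x) = x\{2+\sin(\log x)\}$ for $x > 0$ with $f(0) = 0$. It is continuous and increasing, since $f'(x) = 2+\sin(\log x)+\cos(\log x) \ge 2-\sqrt{2} > 0$, yet its right-hand difference quotient at the origin, $2+\sin(\log h)$, oscillates between 1 and 3, so that $D_+f(0) = 1$ and $D^+f(0) = 3$. The code samples the quotient on a grid of small $h$; the maximum and minimum over the grid are only numerical stand-ins for the upper and lower limits.

```python
import math

def f(x):
    """A continuous, increasing function on [0, 1] (an illustrative choice)."""
    return x * (2 + math.sin(math.log(x))) if x > 0 else 0.0

# Right-hand difference quotients at x = 0 for h = e^{-t}, t in [10, 30].
quotients = [(f(math.exp(-t)) - f(0.0)) / math.exp(-t)
             for t in (10 + k / 1000 for k in range(20001))]

upper_right = max(quotients)   # numerical stand-in for D^+ f(0)
lower_right = min(quotients)   # numerical stand-in for D_+ f(0)
```

The origin is therefore one of the exceptional points of this function, which is consistent with the theorem because a single point has measure zero.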
5.6. The 'rising sun' lemma
Various methods have been devised to construct enumerable collections of intervals covering the points at which $D^+f(x) < \infty$ or $D^+f(x) \le D_-f(x)$, such as those invented by Vitali (Burkill 1953, p. 46), de la Vallee-Poussin (1916), and Rajchman and Saks (Titchmarsh, p. 358). But the method devised by F. Riesz has the advantage of a simple geometric interpretation and we shall therefore adopt it here, following the vivid account given by R. P. Boas (1960, p. 134). The geometrical significance of the 'rising sun' lemma is easily grasped if we regard the graph, $y = f(x)$, of a continuous function $f(x)$, as the profile of a series of parallel ridges (parallel to the z-axis!) of a mountain range, illuminated by the horizontal rays of the rising sun, at infinity on the x-axis (Fig. 2).
FIG. 2
Some points on the ridges are in the sunshine and some are in the shadows cast by ridges on their right. The points in the shadows occupy a number of hollows, such as $x' < x < x''$ in which $f(x) < f(x'')$, whence $f(x') \le f(x'')$. In order to give precision and rigour to these geometrical intuitions we frame the following definition and theorem.
DEFINITION 5.6.1. If $f(x)$ is continuous in the interval $[a, b]$, a point $x$ is said to be shaded, or dominated, by a point $\xi$ if
$$a \le x < \xi \le b \quad \text{and} \quad f(x) < f(\xi).$$
THEOREM 5.6.1 (the 'rising sun' lemma). The dominated points form an enumerable collection of disjoint open intervals $(a_k, b_k)$ ($k = 1, 2, \ldots$) such that if
$$a_k < x < b_k,$$
then
$$f(x) < f(b_k) \quad \text{and} \quad f(a_k) \le f(b_k).$$
The proof divides in four parts.
(1) Let $x = s$ be any dominated point. Since $f(x)$ is continuous, it attains a maximum $M(s)$ at some point $\xi$ in the interval $[s, b]$; and $M(s) = f(\xi) > f(s)$. There may be several points $\xi$ at which $f(x)$ attains the same maximum $M(s)$. Let $s''$ be the greatest lower bound of the points at which $f(\xi) = M(s)$, so that $f(s'') = M(s)$ by continuity. Then
$$f(x) < f(s'') \quad \text{if } s < x < s''.$$
(2) Since $f(x)$ is continuous, there is an interval $\alpha < x < s$ in which $f(x) < f(s'')$. Let $s'$ be the greatest lower bound of the points $\alpha$ at the lower extremity of these intervals. Then $s' < s$ and
$$f(x) < f(s'') \quad \text{if } s' < x < s''.$$
Thus any dominated point $x = s$ lies in the interior of a unique 'shaded interval' $(s', s'')$ such that, if $s' < x < s''$, then
$$f(x) < f(s'');$$
whence
$$f(s') \le f(s'').$$
(3) The shaded intervals are disjoint. For, if $(s', s'')$ and $(t', t'')$ are two overlapping shaded intervals, they have a common point $x$, and by (1),
$$f(x) < f(s''), \qquad f(x) < f(t'').$$
Hence $x$ is a dominated point. Therefore, by (1) and (2), $x$ lies in a unique shaded interval. Thus the intervals $(s', s'')$ and $(t', t'')$ are identical. Finally, to sum up, the dominated points $s$ lie in a collection of disjoint open intervals, such as $(s', s'')$, for which
$$f(s') \le f(s'').$$
(4) The shaded intervals are enumerable. For, if $n$ is any positive integer, the total number of the shaded intervals in $(a, b)$ with lengths greater than $(b-a)/2^n$ is not greater than $2^n$, since the shaded intervals are disjoint. Hence the total number of shaded intervals in $(a, b)$ with lengths lying between $(b-a)/2^{n+1}$ and $(b-a)/2^n$ is not greater than $2^{n+1}$ and therefore is finite. Thus the whole set of shaded intervals is enumerable.
There is a companion theorem, which we may call the 'setting sun lemma' and which is obtained by changing the definition of a dominated point as follows.
DEFINITION 5.6.2. If $f(x)$ is continuous in $[a, b]$, a point $x$ is said to be dominated by a point $\xi$ if $a \le \xi < x \le b$ and $f(x) < f(\xi)$.
THEOREM 5.6.2 (the 'setting sun' lemma). The dominated points then form an enumerable collection of disjoint open intervals $(a_k, b_k)$ ($k = 1, 2, \ldots$) such that, if $a_k < x < b_k$, then
$$f(x) < f(a_k) \quad \text{and} \quad f(b_k) \le f(a_k).$$
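A discrete analogue of the 'rising sun' lemma can be computed directly for a finite profile of heights. The sketch below is an illustration under stated assumptions, not the continuous argument of the text: the sequence `profile` plays the part of $f$, an index is called dominated when some later index carries a strictly larger value, and `shaded_runs` is a hypothetical helper name.

```python
def shaded_runs(values):
    """Group dominated indices of a finite sequence into maximal runs.

    Index i is dominated when some later index j > i has values[j] > values[i].
    Each run is returned as (start, stop), where stop is the first
    non-dominated index to the right of the run.
    """
    n = len(values)
    dominated = [False] * n
    best_to_right = float("-inf")          # max of values[i+1:]
    for i in range(n - 1, -1, -1):
        dominated[i] = best_to_right > values[i]
        best_to_right = max(best_to_right, values[i])

    runs, i = [], 0
    while i < n:
        if dominated[i]:
            start = i
            while dominated[i]:            # the last index is never dominated
                i += 1
            runs.append((start, i))
        else:
            i += 1
    return runs

profile = [3, 1, 2, 5, 4, 2, 3, 1]
runs = shaded_runs(profile)
for start, stop in runs:
    # Discrete analogue of f(x) < f(b_k) on the shaded interval.
    assert all(profile[i] < profile[stop] for i in range(start, stop))
```

Each run ends at a 'sunlit' index whose height exceeds every height inside the run, mirroring the inequality $f(x) < f(b_k)$ of Theorem 5.6.1.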
5.7. The differentiation of monotone functions
We can now apply the covering theorems of the preceding section to examine the differentiability of monotone functions.
THEOREM 5.7.1. If $\phi(x)$ is continuous and non-decreasing in $[a, b]$, if $\mu$ is any positive number, and $E_\mu$ the set of points in $[a, b]$ at which $D^+\phi(x) > \mu$, then $E_\mu$ is covered by an enumerable disjoint set of intervals $P$, $(\alpha_k, \beta_k)$ ($k = 1, 2, \ldots$), such that
$$\mu\,g(P) = \mu\sum_{k=1}^{\infty}(\beta_k-\alpha_k) \le \phi(b)-\phi(a).$$
Let $f(x) = \phi(x)-\mu x$. Then $f(x)$ is continuous in $[a, b]$. If $x \in E_\mu$ there exists a point $\xi$ in $(a, b)$ such that
$$x < \xi, \qquad \frac{\phi(\xi)-\phi(x)}{\xi-x} > \mu,$$
i.e.
$$f(\xi) > f(x),$$
i.e. $x$ is dominated by $\xi$. Therefore, by Theorem 5.6.1, the points of $E_\mu$ lie in an enumerable set of intervals, $P$, $(\alpha_k, \beta_k)$ ($k = 1, 2, \ldots$), such that
$$f(\beta_k) \ge f(\alpha_k) \quad \text{for } k = 1, 2, \ldots.$$
Hence
$$\phi(\beta_k)-\phi(\alpha_k) \ge \mu(\beta_k-\alpha_k).$$
Therefore
$$\mu\,g(P) \le \sum_{k=1}^{\infty}\{\phi(\beta_k)-\phi(\alpha_k)\}.$$
If we consider only a finite number of terms on the right-hand side, it is clear that, since $\phi(x)$ is non-decreasing and the intervals are disjoint, therefore
$$\sum_{k=1}^{n}\{\phi(\beta_k)-\phi(\alpha_k)\} \le \phi(b)-\phi(a).$$
Since this is true for all $n$, therefore
$$\mu\,g(P) \le \phi(b)-\phi(a).$$
THEOREM 5.7.2. If $\phi(x)$ is continuous and non-decreasing in $[a, b]$, $\lambda$ any positive number, and $F_\lambda$ the set of points in $[a, b]$ at which $D_-\phi(x) < \lambda$, then $F_\lambda$ is covered by an enumerable set of intervals $Q$, $(\alpha_k, \beta_k)$ ($k = 1, 2, \ldots$), such that
$$\sum_{k=1}^{\infty}\{\phi(\beta_k)-\phi(\alpha_k)\} \le \lambda(b-a).$$
Let
$$f(x) = \phi(x)-\lambda x.$$
Then $f(x)$ is continuous in $[a, b]$. If $x \in F_\lambda$, there exists a point $\eta$ in $(a, b)$ such that
$$\eta < x \quad \text{and} \quad \frac{\phi(\eta)-\phi(x)}{\eta-x} < \lambda,$$
i.e.
$$f(\eta) > f(x).$$
Hence $F_\lambda$ is covered by the set of points $x$ for which there is a point $\eta < x$ such that $f(x) < f(\eta)$.
Then, by the 'setting sun' lemma (5.6.2), these latter points form an enumerable collection of open intervals $(\alpha_k, \beta_k)$ ($k = 1, 2, \ldots$) such that
$$f(\alpha_k) \ge f(\beta_k).$$
Hence,
$$\phi(\beta_k)-\phi(\alpha_k) \le \lambda(\beta_k-\alpha_k).$$
Therefore
$$\sum_{k=1}^{\infty}\{\phi(\beta_k)-\phi(\alpha_k)\} \le \lambda(b-a).$$
THEOREM 5.7.3. If $\phi(x)$ is continuous and non-decreasing in a bounded interval $[a, b]$ then the set of points in $[a, b]$ at which
$$D^+\phi(x) = +\infty$$
has measure zero.
For this set of points $E$ is covered by an enumerable disjoint set of intervals $P$ of total length $g(P)$ not greater than
$$\frac{\phi(b)-\phi(a)}{\mu}$$
by Theorem 5.7.1. Since this is true for all $\mu > 0$ it follows that $E$ has measure zero.
THEOREM 5.7.4. If $\phi(x)$ is continuous and non-decreasing in a bounded interval $[a, b]$, then the set of points in $[a, b]$ at which
$$D_-\phi(x) < D^+\phi(x)$$
has measure zero.
Apply Theorem 5.7.1 to a typical interval $(\alpha_k, \beta_k)$ of Theorem 5.7.2. The set of points at which $D_-\phi(x) < \lambda$ is covered by the intervals $(\alpha_k, \beta_k)$. By Theorem 5.7.1, the set of points in $[\alpha_k, \beta_k]$ at which $D^+\phi(x) > \mu$ is covered by a disjoint collection of intervals of total length less than
$$\{\phi(\beta_k)-\phi(\alpha_k)\}/\mu.$$
Hence the set of points in $[a, b]$ at which
$$D_-\phi(x) < \lambda < \mu < D^+\phi(x)$$
is covered by a disjoint collection of intervals of total length less than $\mu^{-1}\sum_{k=1}^{\infty}\{\phi(\beta_k)-\phi(\alpha_k)\}$, and by Theorem 5.7.2 this is less than $\lambda(b-a)/\mu$.
Therefore the average metric density of the points in $(a, b)$ at which $D_-\phi(x) < \lambda < \mu < D^+\phi(x)$ is less than $\lambda/\mu$ which is less than unity. Since this is true for any sub-interval $(a, b)$, it follows from Theorem 5.4.1 that the set of points at which
$$D_-\phi(x) < \lambda < \mu < D^+\phi(x)$$
in a bounded interval is of measure zero.
THEOREM 5.7.5. If $\phi(x)$ is continuous and non-decreasing in a bounded interval $[a, b]$ then $\phi(x)$ has a derivative $\phi'(x)$ almost everywhere in $[a, b]$.
Take $\lambda$ and $\mu$ to be rational numbers such that $\lambda < \mu$. Then by Theorem 5.7.4 the set of points $E_{\lambda,\mu}$ in $[a, b]$ at which
$$D_-\phi(x) < \lambda < \mu < D^+\phi(x)$$
is of measure zero. But the set of points $E$ in $[a, b]$ at which $D_-\phi(x) < D^+\phi(x)$ is the union of the sets $E_{\lambda,\mu}$. Since $\lambda$ and $\mu$ are rational the collection of these sets is enumerable. Hence, by Theorem 5.2.2, the set $E$ has measure zero. Therefore, almost everywhere,
$$0 \le D^+\phi(x) \le D_-\phi(x) \le D^-\phi(x) \le D_+\phi(x) \le D^+\phi(x) < \infty,$$
where the third, fourth, and fifth inequalities are (III), (II), and (III) of § 5.5 respectively, and the last is Theorem 5.7.3.
Hence the four Dini derivatives are equal and finite almost everywhere in $(a, b)$, i.e. $\phi(x)$ is differentiable almost everywhere in $(a, b)$.
5.8. The differentiation of series of monotone functions
Denjoy's theorem (§ 3.7, Exercise 4) on the integration of a monotone sequence of monotone functions has a companion theorem due to Fubini (Riesz and Sz-Nagy 1953, p. 12), on the differentiation of a convergent series of monotone functions.
THEOREM 5.8.1. If $f_n(x)$ is a continuous, non-decreasing function of $x$ in $[a, b]$ for each value of $n$ ($= 1, 2, \ldots$) and if the series
$$\sum_{n=1}^{\infty}f_n(x)$$
converges to a sum function $s(x)$ at each point of $[a, b]$, then almost everywhere in $[a, b]$, the series of derivatives
$$\sum_{n=1}^{\infty}f_n'(x)$$
exists and converges to the derivative $s'(x)$.
Apart from a set of values of $x$ of measure zero the non-decreasing functions $f_n(x)$, and the partial sums
$$s_n(x) = \sum_{k=1}^{n}f_k(x),$$
together with the sum function $s(x)$, have finite, non-negative derivatives in $(a, b)$. To study the convergence of the sequence
$$s_n'(x) = \sum_{k=1}^{n}f_k'(x)$$
we note that
$$\frac{s(x+h)-s(x)}{h} \ge \frac{s_n(x+h)-s_n(x)}{h},$$
whence
$$s'(x) \ge s_n'(x) \quad \text{p.p. in } (a, b).$$
The functions $f_n'(x)$ are non-negative, whence $\{s_n'(x)\}$ is a non-decreasing sequence. We have just shown that it is bounded by $s'(x)$. Hence the sequence $\{s_n'(x)\}$ is convergent almost everywhere in $[a, b]$. To show that its limit is $s'(x)$, we select a sub-sequence $s_{n(k)}(x)$, such that $0 \le s(b)-s_{n(k)}(b) \le 1/2^k$ ($k = 1, 2, \ldots$). Then
$$s(x)-s_{n(k)}(x) = \sum_{m=n(k)+1}^{\infty}f_m(x),$$
whence $s(x)-s_{n(k)}(x)$ is a non-decreasing function of $x$. Hence $0 \le s(x)-s_{n(k)}(x) \le s(b)-s_{n(k)}(b) \le 1/2^k$. Therefore the series
$$\sum_{k=1}^{\infty}\{s(x)-s_{n(k)}(x)\}$$
is a convergent series of non-decreasing functions.
We can apply the result established above for the similar series $\sum_{k=1}^{\infty}f_k'(x)$, viz. that the differentiated series is convergent, almost everywhere in $[a, b]$. Thus the series
$$\sum_{k=1}^{\infty}\{s'(x)-s_{n(k)}'(x)\}$$
is convergent almost everywhere in $(a, b)$. Therefore
$$s'(x)-s_{n(k)}'(x) \to 0 \quad \text{as } k \to \infty.$$
But the sequence $\{s_n'(x)\}$ is non-decreasing in $n$. Hence
$$s'(x)-s_n'(x) \to 0 \quad \text{as } n \to \infty.$$
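Term-by-term differentiation can be checked on a simple series for which everything is known in closed form. The example is an assumption made for illustration and does not come from the text: $f_n(x) = (x/2)^n$ on $[0, 1]$, so that $s(x) = x/(2-x)$ and $s'(x) = 2/(2-x)^2$. In this smooth case the conclusion holds at every point, not merely almost everywhere.

```python
def f_prime(n, x):
    """Derivative of f_n(x) = (x/2)^n, a non-decreasing function on [0, 1]."""
    return n * x ** (n - 1) / 2 ** n

def s_prime(x):
    """Derivative of the sum s(x) = sum_n (x/2)^n = x / (2 - x)."""
    return 2 / (2 - x) ** 2

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    partial = 0.0
    for n in range(1, 201):
        partial += f_prime(n, x)       # s_n'(x), non-decreasing in n
    assert abs(partial - s_prime(x)) < 1e-12
```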
5.9. Exercises
1. Show that the least upper bound of the average metric density of a bounded set $E$ (taken over all sub-intervals of an interval $I$) is either 0 or 1.
2. If
$E_n$ is the union of $E_{mn}$ for $m = 1, 2, \ldots, 2^{n-1}$, and $E$ is the union of the enumerable collection $E_1, E_2, \ldots$, show that Cantor's set $C$ is the complement of $E$ with respect to [0, 1].
3. Show that a necessary and sufficient condition that a set of points $E$ should have measure zero is that there should exist an enumerable collection of intervals $I_1, I_2, \ldots$ of finite total length $\sum_{n=1}^{\infty}|I_n|$ such that each point of $E$ is interior to an infinite number of these intervals (Riesz-Nagy 1953, p. 6).
4. If $f(x)$ is non-increasing and not necessarily continuous in $(a, b)$, show that the limits
$$f(x+0) = \lim_{h\to 0}f(x+h) \quad (h > 0), \qquad f(x-0) = \lim_{h\to 0}f(x-h) \quad (h > 0)$$
exist at each point $x$ in $(a, b)$. Define $f(a-0)$ and $f(b+0)$ to be $f(a)$ and $f(b)$ respectively. Let $F(x) = \max\{f(x-0), f(x), f(x+0)\}$.
A point $x = s$ in $(a, b)$ is said to be dominated if there exists a point $x = \xi$ such that
and
Show that the dominated points form an open set, $E$. If $(a_k, b_k)$ is an open interval belonging to $E$, prove that
$$f(a_k+0) \le F(b_k).$$
Show that the points at which $f(x) \ne F(x)$ are enumerable. Hence deduce that $f(x)$ is differentiable almost everywhere in $(a, b)$.
LEBESGUE THEORY IN ONE DIMENSION
6
Geometric measure of outer sets and inner sets
6.1. Introduction
It appears from the considerations advanced in § 2.4 that the whole problem of the integration of bounded functions $f(x)$ can be reduced to the problem of the integration of the indicators $\alpha(x, t, f)$ of the sets of points at which $f(x) > t$. We therefore embark on a systematic method of integrating indicators, guided by the descriptive definition of an integral developed in Chapter 2.
When the integral
$$\int\alpha(x)\,dx$$
has been defined for an indicator $\alpha$, its value will be a generalization of the length of an interval that we shall call the geometric measure $g(\alpha)$ of the set $\alpha$. To define this integral we shall employ the method of bracketing (§ 3.2) and shall therefore need to construct outer and inner bracketing functions $\mu(x)$, $\lambda(x)$, such that $\lambda(x) \le \alpha(x) \le \mu(x)$. These bracketing functions will be the indicators of certain outer and inner sets of points, which we now proceed to define.
6.2. Elementary sets The theory of measure in one dimension as given by Lebesgue and by de la Vallee-Poussin is expressed almost entirely in terms of open and closed sets. The exception is the fundamental union-intersection theorem (6.2.5 and 6.3.9), which requires the use of general intervals that may or may not include either of their extremities. Kolmogorov and Fomin (1961) and Williamson
(1962) have simplified the original Lebesgue theory by the systematic use of general intervals throughout the theory, and we shall follow their example in order to give explicit proofs of the basic theorems in a form that can readily be extended to spaces of two or more dimensions. DEFINITION 6.2.1. In one dimension an 'interval' is a set of points x such that a -< x -< b, where the symbol -<, introduced by Williamson, represents either < or ~'and a, b are finite or infinite numbers. In particular a single point is an interval. DEFINITION 6.2.2. An 'elementary set', with indicator a, is the union of a finite number of disjoint intervals, with indicators
so that
ajak=O
and
a= a 1+a2+ ... +an.
ifj=/=k,
THEOREM 6.2.1. The intersection, union, difference, and symmetric difference of two elementary sets are also elementary sets.

The intersection of two intervals
$$(a \prec x \prec b) \quad \text{and} \quad (p \prec x \prec q)$$
is an interval of the form $\max(a,p) \prec x \prec \min(b,q)$. Now if $\sigma$ and $\tau$ are elementary sets and
$$\sigma = \sum_s \sigma_s, \qquad \tau = \sum_t \tau_t,$$
where $\{\sigma_s\}$ and $\{\tau_t\}$ are each disjoint intervals, then the intersection of $\sigma$ and $\tau$ is
$$\sigma\tau = \sum_{s,t} \sigma_s \tau_t.$$
Now $\sigma_s \tau_t$ is an interval and
$$(\sigma_p \tau_q)(\sigma_s \tau_t) = (\sigma_p \sigma_s)(\tau_q \tau_t) = 0$$
unless $p = s$ and $q = t$. Hence $\sigma\tau$ is the union of a finite number of disjoint intervals and is therefore an elementary set. If $\sigma$ is the union of a finite number of disjoint intervals
$$\sigma_s = (a_s \prec x \prec b_s),$$
we can enumerate these so that
$$a_1 \prec b_1 \prec a_2 \prec b_2 \prec \ldots \prec a_n \prec b_n.$$
Hence if $\sigma$ is covered by an interval $\omega = (p \prec x \prec q)$, the complement of $\sigma$ with respect to $\omega$ is the union of a finite number of disjoint intervals, lying between $p$ and $a_1$, between $b_1$ and $a_2$, ..., and between $b_n$ and $q$ (some of which may be empty sets). Therefore the complement $\omega-\sigma$ is an elementary set. In particular, $1-\omega$ is an elementary set. If $\sigma$ and $\tau$ are any two elementary sets, then $1-\sigma$ and $1-\tau$ are elementary sets, together with their intersection $(1-\sigma)(1-\tau)$ and the union of $\sigma$ and $\tau$,
$$\sigma \cup \tau = 1-(1-\sigma)(1-\tau).$$
Also the differences
$$\sigma - \sigma\tau \quad \text{and} \quad \tau - \sigma\tau$$
are clearly elementary sets, and the symmetric difference
$$\sigma \,\Delta\, \tau = \sigma \cup \tau - \sigma\tau.$$
Thus elementary sets form an algebra in which 'addition' is defined by union and 'multiplication' by intersection.

THEOREM 6.2.2. If $\sigma$ and $\tau$ are any pair of elementary sets then there exists a finite collection of disjoint intervals $\{\gamma_s\}$ such that
$$\sigma = \sum_s a_s \gamma_s, \qquad \tau = \sum_t b_t \gamma_t, \qquad \sigma \cup \tau = \sum_s \gamma_s,$$
and each coefficient, $a_s$ or $b_t$, is either 0 or 1.
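Now
$$\sigma = (\sigma - \sigma\tau) + \sigma\tau, \qquad \tau = (\tau - \sigma\tau) + \sigma\tau,$$
and the elementary sets
$$\sigma - \sigma\tau, \qquad \tau - \sigma\tau, \qquad \sigma\tau$$
are disjoint. Let $\{\gamma_s\}$ be the finite collection of disjoint intervals in these elementary sets. Then
$$\sigma = \sum_s a_s \gamma_s, \qquad \tau = \sum_t b_t \gamma_t,$$
and
$$\sigma \cup \tau = (\sigma - \sigma\tau) + (\tau - \sigma\tau) + \sigma\tau = \sum_s \gamma_s,$$
where each coefficient is either 0 or 1.

DEFINITION 6.2.3. The geometric measure $g(\alpha)$ of an interval $\alpha$ is its length, and is zero if $\alpha$ is a single point, or the empty set $\emptyset$.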
The geometric measure of an elementary set should clearly be defined in terms of the geometric measures of its component intervals, but we must take note of the fact that an elementary set can be resolved into the union of disjoint intervals in an infinite number of ways. For example, the interval $(0 \le x \le 1)$ can be expressed as the union of the intervals $(0 \le x < s)$, $(x = s)$, and $(s < x \le 1)$. We therefore need the following theorems.

THEOREM 6.2.3. If the interval $\sigma$ is the union of a finite number of disjoint intervals $\{\sigma_s\}$ then
$$g(\sigma) = \sum_s g(\sigma_s).$$
For we can enumerate the intervals so that $\sigma_{s+1}$ lies to the right of $\sigma_s$ and adjoins $\sigma_s$.

THEOREM 6.2.4. If the elementary set $\sigma$ has two different representations,
$$\sigma = \sum_s \alpha_s \quad \text{and} \quad \sigma = \sum_t \beta_t,$$
as the union of a finite number of disjoint intervals, then
$$\sum_s g(\alpha_s) = \sum_t g(\beta_t).$$
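For
$$\sigma = \sigma^2 = \sum_{s,t} \alpha_s \beta_t$$
and
$$\alpha_s = \alpha_s \sigma = \sum_t \alpha_s \beta_t.$$
Hence, by Theorem 6.2.3,
$$g(\alpha_s) = \sum_t g(\alpha_s \beta_t),$$
whence
$$\sum_s g(\alpha_s) = \sum_{s,t} g(\alpha_s \beta_t).$$
Similarly
$$g(\beta_t) = \sum_s g(\alpha_s \beta_t),$$
whence
$$\sum_t g(\beta_t) = \sum_{s,t} g(\alpha_s \beta_t) = \sum_s g(\alpha_s).$$
We can now frame

DEFINITION 6.2.4. The geometric measure $g(\alpha)$ of an elementary set $\alpha$ is
$$g(\alpha) = g(\alpha_1) + g(\alpha_2) + \ldots,$$
where $\alpha_1, \alpha_2, \ldots$ are the components of $\alpha$ in any finite representation.

The main instrument in proving theorems about geometric measure is the 'union-intersection' theorem, the simplest form of which is the following.

THEOREM 6.2.5. If $\alpha$ and $\beta$ are each elementary sets then
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta).$$
By Theorem 6.2.2, $\alpha$ and $\beta$ can be represented in the form
$$\alpha = \sum_s a_s \gamma_s, \qquad \beta = \sum_t b_t \gamma_t,$$
where $\{\gamma_s\}$ is a finite collection of disjoint intervals, and each coefficient, $a_s$ or $b_t$, is either 0 or 1. Hence
$$\alpha \cup \beta = \sum_s (a_s + b_s - a_s b_s)\gamma_s, \qquad \alpha\beta = \sum_s a_s b_s \gamma_s.$$
Whence the theorem follows at once.

COROLLARIES.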
(i) If $\alpha$ and $\beta$ are disjoint elementary sets, then
$$g(\alpha \cup \beta) = g(\alpha) + g(\beta).$$
(ii) By induction it follows that if $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a finite collection of disjoint elementary sets, then
$$g\Bigl(\bigcup_{s=1}^{n} \alpha_s\Bigr) = g(\alpha_1) + g(\alpha_2) + \ldots + g(\alpha_n).$$
(iii) If $\sigma$ and $\tau$ are elementary sets and $\sigma$ is covered by $\tau$, then
$$g(\sigma) \le g(\tau).$$
For, on writing $\alpha = \sigma$, $\beta = \tau - \sigma$ in the union-intersection theorem, we find that
$$\alpha \cup \beta = \tau, \qquad \alpha\beta = 0,$$
and
$$g(\tau) = g(\sigma) + g(\tau - \sigma) \ge g(\sigma).$$
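The union-intersection theorem for elementary sets can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes that an elementary set is stored as a list of `(lo, hi)` endpoint pairs and that endpoints are ignored, which is harmless here because single points have geometric measure zero. The helper names `normalise`, `g`, `union`, and `intersection` are hypothetical.

```python
from fractions import Fraction


def normalise(intervals):
    """Merge a finite list of (lo, hi) pairs into disjoint intervals sorted by lo.

    Endpoints are ignored (assumption): touching intervals are merged, which
    does not change the total length.
    """
    merged = []
    for lo, hi in sorted((lo, hi) for lo, hi in intervals if lo < hi):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def g(intervals):
    """Geometric measure: the sum of the lengths of the disjoint components."""
    return sum(hi - lo for lo, hi in normalise(intervals))


def union(a, b):
    """Union of two elementary sets."""
    return normalise(a + b)


def intersection(a, b):
    """Intersection of two elementary sets, built from pairwise interval intersections."""
    pieces = []
    for lo1, hi1 in normalise(a):
        for lo2, hi2 in normalise(b):
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo < hi:
                pieces.append((lo, hi))
    return normalise(pieces)


# Example: alpha = (0, 2) + (3, 5), beta = (1, 4).
alpha = [(Fraction(0), Fraction(2)), (Fraction(3), Fraction(5))]
beta = [(Fraction(1), Fraction(4))]
lhs = g(union(alpha, beta)) + g(intersection(alpha, beta))
rhs = g(alpha) + g(beta)
print(lhs, rhs)
```

In this example the union is the single interval from 0 to 5 and the intersection consists of the intervals from 1 to 2 and from 3 to 4, so both sides of the identity equal 7.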
6.3. Bounded outer sets
In the terminology that we have just adopted the theory of measure before Lebesgue may be briefly described as follows. The 'content' of a set of points $\alpha$ was defined as the greatest lower bound of the geometric measures $g(\sigma)$ of all elementary sets $\sigma$ covering $\alpha$. But this is clearly a rather crude measure for, according to this definition, both the rational points and the irrational points in the interval $(0, 1)$ have the same content, unity, although the irrational points are vastly more numerous. In the Lebesgue theory the concept of 'content' is replaced by the concept of 'measure', and this is achieved by covering the set of points $\alpha$, not by a finite collection of disjoint intervals, but by an enumerable collection of disjoint intervals. This provides a much finer measure of a set of points, which we now proceed to explore.

DEFINITION 6.3.1. An outer set is the union of an enumerable collection of disjoint intervals, called a representation of the outer set.

THEOREM 6.3.1. The union of an enumerable collection of disjoint outer sets is an outer set.
If the outer set $\alpha_n$ is the union of the enumerable disjoint intervals $\{a_{np}\}$ $(p = 1, 2, \ldots)$ then $\alpha$, the union of the outer sets $\{\alpha_n\}$, is the union of the intervals $\{a_{np}\}$ $(n, p = 1, 2, \ldots)$. Now
$$a_{mp} = \alpha_m a_{mp} \quad \text{and} \quad a_{nq} = \alpha_n a_{nq} \quad \text{(by Theorem 4.1.2)}.$$
Also $\alpha_m \alpha_n = 0$ if $m \ne n$, whence
$$a_{mp} a_{nq} = (\alpha_m a_{mp})(\alpha_n a_{nq}) = (\alpha_m \alpha_n)(a_{mp} a_{nq}) = 0 \quad \text{if } m \ne n.$$
Hence the collection of intervals $\{a_{np}\}$ is disjoint, whence $\alpha$ is an outer set.

THEOREM 6.3.2. The union of an enumerable collection of (not necessarily disjoint) intervals is an outer set.

If the set $\alpha$ is the union of the bounded, enumerable, non-disjoint intervals $\{\alpha_n\}$ $(n = 1, 2, \ldots)$ then by the covering Theorem 4.4.1, $\alpha$ is also the union of the enumerable, disjoint sets $\{\beta_n\}$ where
$$\beta_1 = \alpha_1$$
and
$$\beta_n = \alpha_n (1-\alpha_1)(1-\alpha_2) \ldots (1-\alpha_{n-1}) \quad (n > 1).$$
But, by Theorem 6.2.1, $(1-\alpha_1), (1-\alpha_2), \ldots, (1-\alpha_{n-1})$ are elementary sets, whence $\beta_n$ is an elementary set, i.e. the union of a finite number of disjoint intervals. Hence $\alpha$ is the union of an enumerable collection of disjoint intervals, and is therefore an outer set.

THEOREM 6.3.3. The intersection of any two outer sets is an outer set.

If the outer sets $\alpha$ and $\beta$ are respectively the unions of the enumerable collections of disjoint intervals $\{\alpha_m\}$ and $\{\beta_n\}$, then the intersection $\alpha\beta$ is the union of the enumerable intervals $\{\alpha_m \beta_n\}$. But
$$(\alpha_m \beta_n)(\alpha_p \beta_q) = (\alpha_m \alpha_p)(\beta_n \beta_q) = 0 \quad \text{unless } m = p \text{ and } n = q.$$
Hence the intervals $\alpha_m \beta_n$ are disjoint, and therefore the intersection $\alpha\beta$ is an outer set.
THEOREM 6.3.4. The union of any two outer sets is an outer set.

In the notation of Theorem 6.3.3, the union $\alpha \cup \beta$ of the outer sets $\alpha$ and $\beta$ is the union of the enumerable collection of intervals $\{\alpha_m\}$ and $\{\beta_n\}$. Hence, by Theorem 6.3.2, $\alpha \cup \beta$ is an outer set.

We must next establish the existence of geometric measures for outer sets.

THEOREM 6.3.5. If $\sigma$ is a bounded outer set with enumerable disjoint components $\{\sigma_n\}$ then the series
$$\sum_{n=1}^{\infty} g(\sigma_n)$$
is convergent.

If $\sigma$ is covered by an interval $\omega$, then the union $\tau_n$ of the finite collection of disjoint intervals $\sigma_1, \sigma_2, \ldots, \sigma_n$ is an elementary set also covered by $\omega$. Hence, by Corollary (iii) to Theorem 6.2.5,
$$\sum_{s=1}^{n} g(\sigma_s) = g(\tau_n) \le g(\omega).$$
Hence the partial sums of the series $\sum_{1}^{\infty} g(\sigma_n)$ are bounded. Since each term is non-negative, the series is therefore convergent.

THEOREM 6.3.6. If $\sigma$ and $\tau$ are bounded outer sets with the representations
$$\sigma = \sum_{s=1}^{\infty} \alpha_s \quad \text{and} \quad \tau = \sum_{t=1}^{\infty} \beta_t,$$
and if $\sigma$ is covered by $\tau$, then
$$\sum_{s=1}^{\infty} g(\alpha_s) \le \sum_{t=1}^{\infty} g(\beta_t).$$
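We shall compare the elementary set
$$\sigma_m = \sum_{s=1}^{m} \alpha_s$$
and the outer set
$$\tau = \sum_{t=1}^{\infty} \beta_t.$$
We replace each interval $\alpha_s$ by a closed interval $\alpha_s^{*}$ covered by $\alpha_s$ and only slightly smaller so that
$$\alpha_s^{*} \le \alpha_s \quad \text{and} \quad g(\alpha_s) - \varepsilon/2^{s} \le g(\alpha_s^{*}) \le g(\alpha_s)$$
for $s = 1, 2, \ldots, m$, $\varepsilon$ being an arbitrary positive tolerance (if $\alpha_s$ is a point, then $\alpha_s^{*}$ is an empty set!). The union of the closed, disjoint intervals $\{\alpha_s^{*}\}$ forms a closed set. We replace each interval $\beta_t$ by an open interval $\beta_t^{*}$ covering $\beta_t$ and only slightly larger so that
$$\beta_t \le \beta_t^{*} \quad \text{and} \quad g(\beta_t) \le g(\beta_t^{*}) \le g(\beta_t) + \varepsilon/2^{t}.$$
The closed set $\sum_s \alpha_s^{*}$ is covered by the enumerable open intervals $\{\beta_t^{*}\}$. Hence by the Heine-Borel covering theorem (4.4.2) the closed set $\sum_s \alpha_s^{*}$ is also covered by the union of a finite number of open intervals $\beta_1^{*}, \beta_2^{*}, \ldots, \beta_n^{*}$. Therefore, by Corollary (iii) to Theorem 6.2.5,
$$\sum_{s=1}^{m} g(\alpha_s^{*}) \le \sum_{t=1}^{n} g(\beta_t^{*}).$$
Hence
$$\sum_{s=1}^{m} g(\alpha_s) - \varepsilon \le \sum_{t=1}^{\infty} g(\beta_t) + \varepsilon.$$
Since this is true for all $m$ and $\varepsilon$,
$$\sum_{s=1}^{\infty} g(\alpha_s) \le \sum_{t=1}^{\infty} g(\beta_t).$$

COROLLARY.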
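If $\sigma$ is a bounded outer set with two representations
$$\sigma = \sum_{s=1}^{\infty} \alpha_s \quad \text{and} \quad \sigma = \sum_{t=1}^{\infty} \beta_t,$$
then
$$\sum_{s=1}^{\infty} g(\alpha_s) = \sum_{t=1}^{\infty} g(\beta_t).$$
For $\sigma$ is covered by $\sigma$, whence, by the theorem,
$$\sum_{s=1}^{\infty} g(\alpha_s) \le \sum_{t=1}^{\infty} g(\beta_t).$$
Similarly
$$\sum_{t=1}^{\infty} g(\beta_t) \le \sum_{s=1}^{\infty} g(\alpha_s),$$
and therefore
$$\sum_{s=1}^{\infty} g(\alpha_s) = \sum_{t=1}^{\infty} g(\beta_t).$$
We can now define the geometric measure of any bounded outer set $\alpha$.

DEFINITION 6.3.2. If $\sum_{s=1}^{\infty} \alpha_s$ is any representation of a bounded outer set $\alpha$, then the geometric measure of $\alpha$ is
$$g(\alpha) = \sum_{s=1}^{\infty} g(\alpha_s).$$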
THEOREM 6.3.7. If $\sigma$ and $\tau$ are bounded outer sets and $\sigma$ is covered by $\tau$, then $g(\sigma) \le g(\tau)$.

This follows at once from Definition 6.3.2 and Theorem 6.3.6.

We can now generalize the union-intersection theorem for an enumerable collection of bounded outer sets. First we consider disjoint sets.

THEOREM 6.3.8. If $\alpha_1, \alpha_2, \ldots$ are enumerable, disjoint outer sets with a bounded union $\alpha$, then $\alpha$ is an outer set and
$$g(\alpha) = g(\alpha_1) + g(\alpha_2) + \ldots.$$
For if $\alpha_s$ is the union of enumerable disjoint intervals $\{\alpha_{st}\}$, then $\alpha_s \le \alpha$, whence $\alpha_s$ is bounded and, by Definition 6.3.2,
$$g(\alpha_s) = \sum_{t=1}^{\infty} g(\alpha_{st}).$$
But $\alpha$ is the union of the disjoint intervals $\{\alpha_{st}\}$. Therefore, by the theorem on the derangement of double series with non-negative terms,
$$g(\alpha) = \sum_{s,t=1}^{\infty} g(\alpha_{st}) = \sum_{s=1}^{\infty} g(\alpha_s).$$
COROLLARY. If $\beta_1, \beta_2, \ldots$ are uniformly bounded, enumerable elementary sets such that
$$\beta_n \le \beta_{n+1} \quad (n = 1, 2, \ldots),$$
then
$$g(\beta_n) \to g(\beta) \quad \text{as } n \to \infty,$$
where
$$\beta = \lim_{n \to \infty} \beta_n.$$
For
$$\beta = \beta_1 + \alpha_1 + \alpha_2 + \ldots,$$
where
$$\alpha_n = \beta_{n+1} - \beta_n,$$
and $\beta_1, \alpha_1, \alpha_2, \ldots$ are disjoint elementary sets, to which we can apply the main theorem.

Secondly we consider outer sets which are not necessarily disjoint.

THEOREM 6.3.9 (the union-intersection theorem for outer sets). If $\alpha$ and $\beta$ are each bounded outer sets, then
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta).$$
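By Theorems 6.3.3 and 6.3.4, $\alpha\beta$ and $\alpha \cup \beta$ are each outer sets. Let $\alpha = \sum_s \alpha_s$ and $\beta = \sum_t \beta_t$ be representations of $\alpha$ and $\beta$ in terms of enumerable, disjoint intervals. Let
$$\sigma_n = \sum_{s=1}^{n} \alpha_s \quad \text{and} \quad \tau_n = \sum_{t=1}^{n} \beta_t.$$
Then, by Definition 6.3.2,
$$g(\sigma_n) = \sum_{s=1}^{n} g(\alpha_s) \quad \text{and} \quad g(\tau_n) = \sum_{t=1}^{n} g(\beta_t)$$
and
$$g(\sigma_m \tau_n) = \sum_{s=1}^{m} \sum_{t=1}^{n} g(\alpha_s \beta_t).$$
Hence, as $m, n \to \infty$,
$$g(\sigma_m) \to g(\alpha), \qquad g(\tau_n) \to g(\beta), \qquad g(\sigma_m \tau_n) \to g(\alpha\beta).$$
Also, if
$$\gamma_n = \sigma_n \cup \tau_n,$$
then
$$\gamma_n \le \gamma_{n+1} \quad \text{and} \quad \gamma_n \to \alpha \cup \beta \text{ as } n \to \infty.$$
Therefore, by the corollary to Theorem 6.3.8,
$$g(\gamma_n) \to g(\alpha \cup \beta).$$
But, by the union-intersection theorem for elementary sets (6.2.5),
$$g(\sigma_n \cup \tau_n) + g(\sigma_n \tau_n) = g(\sigma_n) + g(\tau_n).$$
Now let $n \to \infty$. Then
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta).$$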
As in the case of Theorem 6.2.5 there are a number of obvious corollaries to this theorem.

COROLLARIES.
(i) If $\alpha$ and $\beta$ are bounded disjoint outer sets, then
$$g(\alpha \cup \beta) = g(\alpha) + g(\beta).$$
(ii) By induction it follows that if $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a finite collection of bounded disjoint outer sets, then
$$g\Bigl(\bigcup_{s=1}^{n} \alpha_s\Bigr) = g(\alpha_1) + g(\alpha_2) + \ldots + g(\alpha_n).$$
(iii) If $\{\alpha_n\}$ is any finite collection of bounded outer sets with union $\alpha$ then
$$g(\alpha_1) + g(\alpha_2) \ge g(\alpha_1 \cup \alpha_2)$$
and by induction
$$\sum_{s=1}^{n} g(\alpha_s) \ge g(\alpha).$$
6.4. Unbounded outer sets
The definition of the geometric measure of an unbounded outer set $\sigma$ can be achieved only by considering the family of sets $\{\sigma_s\}$, where $\sigma_s$ is the intersection of $\sigma$ and the open interval
$$-s < x < s,$$
$s$ being any positive number. Thus, as $s \to \infty$ these open intervals steadily expand to infinity.

THEOREM 6.4.1. As $s \to \infty$, $g(\sigma_s)$ tends to a unique limit $\lambda$.

For, if $s < t$, then
$$\sigma_s \le \sigma_t$$
and, by Theorem 6.3.7,
$$g(\sigma_s) \le g(\sigma_t).$$
Hence the function $g(\sigma_s)$ converges to a unique limit $\lambda$ as $s \to \infty$, which may be finite or infinite.

DEFINITION 6.4.1. The geometric measure $g(\sigma)$ of the unbounded outer set $\sigma$ is the limit
$$g(\sigma) = \lim_{s \to \infty} g(\sigma_s).$$
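The truncation process of Definition 6.4.1 can be followed on a concrete case. The sketch below is an illustration only and assumes a particular unbounded outer set chosen for the purpose, namely the union of the disjoint intervals $(n, n + 2^{-n})$ for $n = 1, 2, \ldots$; the function name `g_truncated` is hypothetical. It computes $g(\sigma_s)$ exactly with rational arithmetic.

```python
from fractions import Fraction


def g_truncated(s):
    """g(sigma_s), where sigma is the union of the intervals (n, n + 2**-n),
    n = 1, 2, ..., and sigma_s is its intersection with -s < x < s.

    Only the intervals with n < s meet (-s, s), so the sum is finite.
    """
    s = Fraction(s)
    total = Fraction(0)
    n = 1
    while n < s:
        lo = Fraction(n)
        hi = lo + Fraction(1, 2 ** n)
        total += min(hi, s) - lo  # length of (lo, hi) cut off at s
        n += 1
    return total


# g(sigma_s) is non-decreasing in s and approaches 1 = 1/2 + 1/4 + 1/8 + ...
for s in (1, 2, 3, 5, 10, 20):
    print(s, g_truncated(s))
```

Here the limit is finite. For the whole line, by contrast, $g(\sigma_s) = 2s$, and the limit is infinite, as the theorem allows.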
6.5. The principle of complementarity
So far we have defined and studied only outer sets of points, the indicators of which are to be the upper bracketing functions in our definition of the Lebesgue measure. We must now examine inner sets, the indicators of which are to be the lower bracketing functions. In the original Lebesgue theory the outer sets were open sets and the inner sets were closed sets. We need to modify this theory in view of the generalization of outer sets that we adopted in § 6.3.

The motivation of the Lebesgue theory was the 'principle of complementarity', which we can describe as follows. If $\sigma$ and $\tau$ are complementary with respect to an interval $\omega$ then
$$\sigma + \tau = \omega \quad \text{and} \quad \sigma\tau = 0.$$
Now let $\mu$ be an outer set covering $\sigma$ and $\nu$ be an outer set covering $\tau$. Then
$$\tau \le \nu \quad \text{and} \quad \omega - \nu \le \omega - \tau = \sigma.$$
Hence $\sigma$ is bracketed by the sets
$$\lambda = \omega - \nu \quad \text{and} \quad \mu.$$
The lower bracketing set $\lambda$ is the complement of the upper bracketing set $\nu$ with respect to the interval $\omega$. We shall take the upper bracketing sets to be 'outer sets' covering the prescribed set $\sigma$; and we shall take the lower bracketing sets to be 'inner sets' which are the complements of outer sets with respect to an interval. But it is necessary to give a self-consistent definition of inner sets $\lambda$ and of their geometric measure $g(\lambda)$ and to prove that if a given set $\sigma$ is bracketed by an outer set $\mu$ and an inner set $\lambda$ then $g(\lambda) \le g(\mu)$. We therefore proceed to establish the necessary definitions and theorems.
6.6. Inner sets
THEOREM 6.6.1. If $\lambda$ is the complement of an outer set $\mu_1$ with respect to an interval $\omega_1$, then it is also the complement of some outer set $\mu_2$ with respect to any other interval $\omega_2$ which covers $\lambda$.

(1) We first prove that $\lambda$ is the complement of an outer set $\mu$ with respect to the interval $\omega = \omega_1 \omega_2$.

Since $\lambda$ is covered by $\omega_1$ and by $\omega_2$, it is also covered by $\omega$. Let
$$\omega = \lambda + \mu.$$
Then
$$\mu = \omega - \lambda = \omega - (\omega_1 - \mu_1) = \mu_1 - (\omega_1 - \omega).$$
Now $\omega = \omega_1 \omega_2 \le \omega_1$. Thus $\omega_1 - \omega$ is a set, which clearly consists of one or two intervals. Also $\omega_1 - \omega \le \mu_1$, whence
$$(\omega_1 - \omega)\mu_1 = (\omega_1 - \omega),$$
and
$$\mu = \mu_1 (1 - \omega_1 + \omega).$$
But $1 - \omega_1 + \omega$ is a set consisting of two or three intervals, and $\mu_1$ is the union of an enumerable collection of disjoint intervals. Hence so also is $\mu$, i.e. $\mu$ is an outer set.

(2) In the general case, let
$$\omega_2 = \lambda + \mu_2,$$
whence
$$\mu_2 = \omega_2 - \omega + \mu.$$
But
$$\mu = \mu\omega = \mu\omega_2$$
and
$$(\omega_2 - \omega)\omega = 0,$$
so that
$$(\omega_2 - \omega)\mu = 0.$$
Thus $\mu_2$ is the union of two disjoint sets, viz.:
(i) $\omega_2 - \omega$, which consists of one or two intervals, and
(ii) $\mu$, which we have proved to be an outer set.
Therefore $\mu_2$ is also an outer set.

DEFINITION 6.6.1. Any set $\lambda$ that is the complement of an outer set $\mu$ with respect to some interval $\omega$ is an 'inner set'.
DEFINITION 6.6.2. The geometric measure of an inner set $\lambda$ with respect to a finite interval $\omega$ covering $\lambda$ is defined to be
$$g_{\omega}(\lambda) = g(\omega) - g(\omega - \lambda).$$
THEOREM 6.6.2. If $g_{\omega_1}(\lambda)$ and $g_{\omega_2}(\lambda)$ are the geometric measures of an inner set $\lambda$ with respect to two intervals $\omega_1$ and $\omega_2$ each covering $\lambda$, then
$$g_{\omega_1}(\lambda) = g_{\omega_2}(\lambda).$$
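We apply the union-intersection theorem (6.3.9) to the sets
$$\alpha = \omega_1 \quad \text{and} \quad \beta = \omega_2 - \lambda.$$
Their union is
$$\alpha \cup \beta = \omega_1 + (\omega_2 - \lambda) - (\omega_1 \omega_2 - \omega_1 \lambda).$$
But $\lambda$ is covered by $\omega_1$, whence
$$\omega_1 \lambda = \lambda,$$
and
$$\alpha \cup \beta = \omega_1 + \omega_2 - \omega_1 \omega_2 = \omega_1 \cup \omega_2.$$
The intersection of $\alpha$ and $\beta$ is
$$\alpha\beta = \omega_1 (\omega_2 - \lambda) = \omega_1 \omega_2 - \lambda.$$
Hence
$$g(\omega_1) + g(\omega_2 - \lambda) = g(\alpha) + g(\beta) = g(\alpha \cup \beta) + g(\alpha\beta) = g(\omega_1 \cup \omega_2) + g(\omega_1 \omega_2 - \lambda).$$
Similarly
$$g(\omega_2) + g(\omega_1 - \lambda) = g(\omega_1 \cup \omega_2) + g(\omega_1 \omega_2 - \lambda).$$
Therefore
$$g(\omega_1) + g(\omega_2 - \lambda) = g(\omega_2) + g(\omega_1 - \lambda)$$
and
$$g(\omega_1) - g(\omega_1 - \lambda) = g(\omega_2) - g(\omega_2 - \lambda).$$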
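Theorem 6.6.2 can be illustrated on the simplest kind of inner set. The sketch below is an assumption-laden example rather than anything in the original text: it takes $\lambda$ to be a finite union of disjoint closed intervals (so that $\omega - \lambda$ is an elementary set), stores intervals as `(lo, hi)` pairs, and assumes that the covering interval $\omega$ really does contain $\lambda$. The names `complement_in` and `g_omega` are hypothetical.

```python
from fractions import Fraction


def g(intervals):
    """Total length of a list of disjoint (lo, hi) intervals."""
    return sum(hi - lo for lo, hi in intervals)


def complement_in(omega, lam):
    """Intervals of omega = (p, q) that remain after removing lam.

    lam is a list of disjoint intervals sorted by left endpoint and is
    assumed to be covered by omega.
    """
    p, q = omega
    pieces, left = [], p
    for lo, hi in lam:
        if left < lo:
            pieces.append((left, lo))
        left = hi
    if left < q:
        pieces.append((left, q))
    return pieces


def g_omega(omega, lam):
    """Geometric measure of lam with respect to omega: g(omega) - g(omega - lam)."""
    return g([omega]) - g(complement_in(omega, lam))


# lam = [0, 1] + [2, 3], measured with respect to two different covering intervals.
lam = [(Fraction(0), Fraction(1)), (Fraction(2), Fraction(3))]
print(g_omega((Fraction(-1), Fraction(4)), lam))
print(g_omega((Fraction(-10), Fraction(7)), lam))
```

Both calls give the value 2: enlarging $\omega$ adds the same length to $g(\omega)$ and to $g(\omega - \lambda)$, which is the content of the theorem.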
DEFINITION 6.6.3. The geometric measure of a bounded inner set $\lambda$ is $g(\lambda) = g(\omega) - g(\omega - \lambda)$, where $\omega$ is any finite interval covering $\lambda$.

THEOREM 6.6.3. If the bounded set $\sigma$ is bracketed by an outer set $\mu$ and an inner set $\lambda$ then $g(\lambda) \le g(\mu)$.

For
$$\lambda \le \sigma \le \mu$$
and there is a finite interval $\omega$ covering $\lambda$ and $\mu$ such that
$$\nu = \omega - \lambda$$
is an outer set (by Definition 6.6.1). Hence
$$\mu \cup \nu = \mu + \nu - \mu\nu = \mu + \omega - \lambda - \mu\omega + \mu\lambda = \omega,$$
since
$$\mu = \mu\omega \quad \text{and} \quad \lambda = \mu\lambda.$$
Therefore by the union-intersection theorem (6.3.9)
$$g(\omega) = g(\mu \cup \nu) = g(\mu) + g(\nu) - g(\mu\nu) \le g(\mu) + g(\nu).$$
But
$$g(\lambda) = g(\omega) - g(\nu),$$
whence
$$g(\lambda) \le g(\mu).$$
It is almost trivial, but nevertheless necessary, to establish the following theorem.

THEOREM 6.6.4. If $\sigma$ is any bounded set, then there always exist outer sets $\mu$ and inner sets $\lambda$ such that $\lambda \le \sigma \le \mu$.

Since $\sigma$ is bounded there is an interval $\mu$ covering $\sigma$, and any point of $\sigma$ is an interval $\lambda$ covered by $\sigma$. $\lambda$ and $\mu$ are inner and outer sets bracketing $\sigma$.
6.7. Exercises
1. Show that a single point $x = a$ is both an outer set and an inner set. Show that the empty set is both an outer set and an inner set.
2. A function $f(x)$ is semi-continuous at $x = \xi$
(i) on the right, if $f(\xi+h) \to f(\xi)$ as $h \to 0$ for $h > 0$, or
(ii) on the left, if $f(\xi-h) \to f(\xi)$ as $h \to 0$ for $h > 0$.
Show that $\alpha(x)$ is the indicator of an outer set if $\alpha(x)$ is semi-continuous (on either right or left or both) at each point $\xi$ where $\alpha(\xi) = 1$. Show that $\alpha(x)$ is the indicator of an inner set if $\alpha(x)$ is semi-continuous at each point $\xi$ where $\alpha(\xi) = 0$.
3. Show that the intersection of a finite or an enumerable collection of inner sets is also an inner set (which may be empty) (see Theorem 6.3.2).
4. Show that the intersection of a finite number of outer sets is also an outer set.
7
Lebesgue measure
7.1. Introduction
In the preceding chapter we have developed the theory of the geometric measure of outer and inner sets and have shown that, if $\sigma$ is any bounded set of points, then there exist outer and inner sets $\mu$ and $\lambda$ such that
$$\lambda \le \sigma \le \mu \quad \text{and} \quad g(\lambda) \le g(\mu).$$
Our guiding principles have been those originally adopted by Lebesgue, viz.:
(i) the use of enumerable collections of disjoint intervals $\{\mu_n\}$ for the upper bracketing function $\mu$, and
(ii) the use of the principle of complementarity to define the lower bracketing function $\lambda$.
Lebesgue himself never gave a complete and formal account of his theory of measure and integration and the first systematic treatment was given by de la Vallée-Poussin. In his account, which we follow here, the central position is occupied by the 'union-intersection theorem'. In this chapter we develop the theory of measure as the theory of the integration of the indicators $\alpha(x)$ of bounded linear sets of points.
7.2. Outer and inner measure
DEFINITION 7.2.1. The outer measure $m^{*}(\sigma)$ of a set of points $\sigma$ is the greatest lower bound of the geometric measures $g(\mu)$ of outer sets $\mu$ covering $\sigma$, i.e.
$$m^{*}(\sigma) = \inf g(\mu) \quad \text{for } \mu \ge \sigma.$$
DEFINITION 7.2.2. The inner measure $m_{*}(\sigma)$ of a set of points $\sigma$ is the least upper bound of the geometric measures $g(\lambda)$ of the inner sets $\lambda$ covered by $\sigma$, i.e.
$$m_{*}(\sigma) = \sup g(\lambda) \quad \text{for } \lambda \le \sigma.$$
THEOREM 7.2.1. The inner measure of any set $\sigma$ is not greater than its outer measure, i.e.
$$m_{*}(\sigma) \le m^{*}(\sigma).$$
For, with the notation of the preceding definitions,
$$g(\lambda) \le g(\mu) \quad \text{(by Theorem 6.6.3)},$$
whence
$$m_{*}(\sigma) = \sup g(\lambda) \le \inf g(\mu) = m^{*}(\sigma).$$
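The union-intersection theorem (6.3.9) for the geometric measure of outer sets can now be generalized to the outer and inner measures of any pair of sets.

THEOREM 7.2.2. If $\sigma$ and $\tau$ are any pair of sets, then
$$m^{*}(\sigma) + m^{*}(\tau) \ge m^{*}(\sigma \cup \tau) + m^{*}(\sigma\tau).$$
From Definition 7.2.1, if $\varepsilon$ is any prescribed positive tolerance, there exist outer sets $\alpha$ and $\beta$ such that
$$\sigma \le \alpha, \qquad \tau \le \beta,$$
and
$$m^{*}(\sigma) \ge g(\alpha) - \varepsilon, \qquad m^{*}(\tau) \ge g(\beta) - \varepsilon.$$
Now
$$\sigma \cup \tau = 1 - (1-\sigma)(1-\tau) \le 1 - (1-\alpha)(1-\beta) = \alpha \cup \beta,$$
and
$$\sigma\tau \le \alpha\beta.$$
But $\alpha \cup \beta$ and $\alpha\beta$ are outer sets. Therefore
$$m^{*}(\sigma \cup \tau) \le g(\alpha \cup \beta) \quad \text{and} \quad m^{*}(\sigma\tau) \le g(\alpha\beta).$$
By the union-intersection theorem (6.3.9)
$$g(\alpha \cup \beta) + g(\alpha\beta) = g(\alpha) + g(\beta),$$
whence
$$m^{*}(\sigma \cup \tau) + m^{*}(\sigma\tau) \le m^{*}(\sigma) + m^{*}(\tau) + 2\varepsilon.$$
Since this is true for all $\varepsilon > 0$ it follows that
$$m^{*}(\sigma \cup \tau) + m^{*}(\sigma\tau) \le m^{*}(\sigma) + m^{*}(\tau).$$
THEOREM 7.2.3. If $\sigma$ and $\tau$ are any pair of sets, then
$$m_{*}(\sigma \cup \tau) + m_{*}(\sigma\tau) \ge m_{*}(\sigma) + m_{*}(\tau).$$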
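Let $\omega$ be any finite interval covering $\sigma$ and $\tau$, and
$$\sigma' = \omega - \sigma, \qquad \tau' = \omega - \tau.$$
Then
$$\sigma' \cup \tau' = \omega - \sigma\tau \quad \text{and} \quad \sigma'\tau' = \omega - \sigma \cup \tau.$$
Hence
$$m_{*}(\sigma \cup \tau) = g(\omega) - m^{*}(\sigma'\tau'),$$
$$m_{*}(\sigma\tau) = g(\omega) - m^{*}(\sigma' \cup \tau'),$$
and
$$m_{*}(\sigma \cup \tau) + m_{*}(\sigma\tau) \ge 2g(\omega) - m^{*}(\sigma') - m^{*}(\tau') \quad \text{(by Theorem 7.2.2)}$$
$$= m_{*}(\sigma) + m_{*}(\tau).$$
COROLLARY. If $\sigma_1, \sigma_2, \ldots, \sigma_n$ are disjoint sets, then
$$m_{*}(\sigma_1 + \sigma_2 + \ldots + \sigma_n) \ge m_{*}(\sigma_1) + m_{*}(\sigma_2) + \ldots + m_{*}(\sigma_n).$$
For
$$m_{*}(\sigma_1 + \sigma_2) \ge m_{*}(\sigma_1) + m_{*}(\sigma_2)$$
and the corollary follows by induction.

THEOREM 7.2.4. If the set $\sigma$ is covered by the set $\tau$, i.e. if $\sigma \le \tau$, then
$$m^{*}(\sigma) \le m^{*}(\tau) \quad \text{and} \quad m_{*}(\sigma) \le m_{*}(\tau).$$
If $\varepsilon$ is any prescribed tolerance there exists an outer set $\nu$ such that $\tau \le \nu$ and $g(\nu) - \varepsilon \le m^{*}(\tau)$.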
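But
$$\sigma \le \tau \le \nu,$$
whence
$$m^{*}(\sigma) \le g(\nu) \le m^{*}(\tau) + \varepsilon.$$
Since this is true for each $\varepsilon > 0$ it follows that
$$m^{*}(\sigma) \le m^{*}(\tau).$$
Also, if $\omega$ is any finite interval covering $\tau$,
$$m_{*}(\sigma) = g(\omega) - m^{*}(\omega - \sigma),$$
$$m_{*}(\tau) = g(\omega) - m^{*}(\omega - \tau),$$
and
$$\omega - \tau \le \omega - \sigma,$$
whence
$$m_{*}(\tau) - m_{*}(\sigma) = m^{*}(\omega - \sigma) - m^{*}(\omega - \tau) \ge 0,$$
by the first part of this theorem.

The following theorem is needed later (§ 7.9, Exercise 4) in constructing a criterion for measurability.

THEOREM 7.2.5. If $\sigma$ and $\tau$ are any two sets and $\delta$ is their symmetric difference,
$$\delta = \sigma \cup \tau - \sigma\tau = \sigma + \tau - 2\sigma\tau,$$
then
$$|m^{*}(\sigma) - m^{*}(\tau)| \le m^{*}(\delta).$$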
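We note that
$$\tau \cup \delta = \tau + (\sigma + \tau - 2\sigma\tau) - \tau(\sigma + \tau - 2\sigma\tau) = \sigma + \tau(1 - \sigma) \ge \sigma.$$
Hence, by Theorem 7.2.4,
$$m^{*}(\sigma) \le m^{*}(\tau \cup \delta),$$
and by the union-intersection Theorem (7.2.2),
$$m^{*}(\tau \cup \delta) + m^{*}(\tau\delta) \le m^{*}(\tau) + m^{*}(\delta).$$
Therefore
$$m^{*}(\sigma) - m^{*}(\tau) \le m^{*}(\delta) - m^{*}(\tau\delta) \le m^{*}(\delta).$$
Similarly
$$m^{*}(\tau) - m^{*}(\sigma) \le m^{*}(\delta).$$
Whence the theorem follows.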
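The inequality of Theorem 7.2.5 is easy to test on elementary sets. The sketch below is illustrative only and rests on two stated assumptions: that each set is given as a list of disjoint `(lo, hi)` intervals, and that for such sets the outer measure coincides with the geometric measure (which the text establishes later, in Theorem 7.4.3). The measure of the symmetric difference is computed from $\delta = \sigma + \tau - 2\sigma\tau$; the function names are hypothetical.

```python
from fractions import Fraction


def g(s):
    """Geometric measure of a list of disjoint (lo, hi) intervals."""
    return sum(hi - lo for lo, hi in s)


def g_intersection(s, t):
    """Measure of the intersection, summed over all pairs of component intervals."""
    return sum(
        max(Fraction(0), min(h1, h2) - max(l1, l2))
        for l1, h1 in s
        for l2, h2 in t
    )


def g_symmetric_difference(s, t):
    """Measure of delta = s + t - 2st."""
    return g(s) + g(t) - 2 * g_intersection(s, t)


sigma = [(Fraction(0), Fraction(2)), (Fraction(5), Fraction(6))]
tau = [(Fraction(1), Fraction(4))]
print(abs(g(sigma) - g(tau)), g_symmetric_difference(sigma, tau))
```

For these two sets the measures are equal, while the symmetric difference has measure 4, so the inequality is strict. Equality occurs when one set covers the other, for example $\sigma = (0, 10)$ and $\tau = (2, 3)$.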
7.3. Lebesgue measure
If $\lambda$ and $\mu$ are inner and outer sets bracketing a given bounded set $\sigma$, then
$$g(\lambda) \le m_{*}(\sigma) \le m^{*}(\sigma) \le g(\mu).$$
Thus if $g(\lambda)$ and $g(\mu)$ are to be regarded as lower and upper approximations to the integral of $\sigma(x)$ the tolerance of this bracketing process, $g(\mu) - g(\lambda)$, cannot be less than $m^{*}(\sigma) - m_{*}(\sigma)$ and it will not succeed in providing a definition of $\int \sigma(x)\,dx$ unless $m^{*}(\sigma) = m_{*}(\sigma)$. Hence we are forced to the following definition.

DEFINITION 7.3.1. A bounded set $\sigma$ is measurable, in the sense of Lebesgue, if $m^{*}(\sigma) = m_{*}(\sigma)$ and the common value of its outer and inner measures is the Lebesgue measure $m(\sigma)$ of $\sigma$, i.e.
$$m(\sigma) = m^{*}(\sigma) = m_{*}(\sigma).$$
We note at once

THEOREM 7.3.1. If the set $\sigma$ bounded by an interval $\omega$ is measurable then so also is the complementary set $\tau = \omega - \sigma$, and $m(\sigma) + m(\tau) = g(\omega)$.

For, by Definition 7.2.2,
$$m_{*}(\tau) = g(\omega) - m^{*}(\sigma) \quad \text{and} \quad m_{*}(\sigma) = g(\omega) - m^{*}(\tau),$$
whence
$$m^{*}(\tau) - m_{*}(\tau) = m^{*}(\sigma) - m_{*}(\sigma) = 0,$$
and
$$m(\sigma) + m(\tau) = g(\omega).$$
We can now complete the series of union-intersection theorems as follows.
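THEOREM 7.3.2. If $\sigma$ and $\tau$ are bounded measurable sets, then so is their intersection $\sigma\tau$ and their union $\sigma \cup \tau$, and
$$m(\sigma \cup \tau) + m(\sigma\tau) = m(\sigma) + m(\tau).$$
By the union-intersection theorems for outer and inner measure, (7.2.2) and (7.2.3),
$$m^{*}(\sigma \cup \tau) + m^{*}(\sigma\tau) \le m^{*}(\sigma) + m^{*}(\tau) = m_{*}(\sigma) + m_{*}(\tau) \le m_{*}(\sigma \cup \tau) + m_{*}(\sigma\tau).$$
Hence
$$\{m^{*}(\sigma \cup \tau) - m_{*}(\sigma \cup \tau)\} + \{m^{*}(\sigma\tau) - m_{*}(\sigma\tau)\} \le 0.$$
But, by Theorem 7.2.1, the bracketed expressions are each non-negative. They are therefore each equal to zero, i.e.
$$m^{*}(\sigma \cup \tau) = m_{*}(\sigma \cup \tau) \quad \text{and} \quad m^{*}(\sigma\tau) = m_{*}(\sigma\tau).$$
Thus the union, $\sigma \cup \tau$, and the intersection, $\sigma\tau$, are each measurable. Therefore, using the union-intersection theorems again,
$$m(\sigma \cup \tau) + m(\sigma\tau) \le m(\sigma) + m(\tau) \le m(\sigma \cup \tau) + m(\sigma\tau),$$
whence
$$m(\sigma \cup \tau) + m(\sigma\tau) = m(\sigma) + m(\tau).$$
COROLLARY.
(i) If $\sigma$ and $\tau$ are disjoint measurable sets then
$$m(\sigma + \tau) = m(\sigma) + m(\tau).$$
By induction, if $\sigma_1, \sigma_2, \ldots, \sigma_n$ is a finite collection of disjoint measurable sets then
$$m(\sigma_1 + \sigma_2 + \ldots + \sigma_n) = m(\sigma_1) + m(\sigma_2) + \ldots + m(\sigma_n).$$
It is not quite trivial to notice that
(ii) If $\sigma$ and $\tau$ are measurable and $\sigma$ covers $\tau$, then
$$m(\sigma) \ge m(\tau).$$
For $\eta = \sigma(1 - \tau)$ is measurable and
$$m(\sigma) = m(\tau) + m(\eta) \ge m(\tau)$$
(cf. Theorem 7.2.4).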
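THEOREM 7.3.3. If $\{\sigma_n\}$ is any enumerable collection of sets of points with bounded union $\sigma$, then
$$m^{*}(\sigma) \le \sum_{n=1}^{\infty} m^{*}(\sigma_n).$$
By Definition 7.2.1, for any prescribed tolerance $\varepsilon > 0$, there is an outer set $\alpha_n$ such that
$$\sigma_n \le \alpha_n \quad \text{and} \quad g(\alpha_n) \le m^{*}(\sigma_n) + \varepsilon/2^{n}.$$
The outer set $\alpha_n$ is the union of an enumerable collection of disjoint intervals $\{\alpha_{np}\}$ $(p = 1, 2, \ldots)$, whence $\alpha$, the union of all the outer sets $\{\alpha_n\}$, is the union of an enumerable collection of intervals $\{\alpha_{np}\}$ $(n, p = 1, 2, \ldots)$. These intervals are not necessarily disjoint, but by the corollary to the covering Theorem 4.4.1 there exists an enumerable collection of disjoint elementary sets $\{\beta_{np}\}$, such that
$$\beta_{np} \le \alpha_{np} \quad \text{and} \quad \bigcup \beta_{np} = \bigcup \alpha_{np}.$$
Since $\sigma = \bigcup \sigma_n$ is covered by $\bigcup \alpha_n$, it is also covered by $\bigcup \alpha_{np}$ and by $\bigcup \beta_{np}$ $(n, p = 1, 2, \ldots)$. Hence
$$m^{*}(\sigma) \le g\Bigl(\bigcup_{n,p=1}^{\infty} \beta_{np}\Bigr) = \sum_{n,p=1}^{\infty} g(\beta_{np}) \le \sum_{n,p=1}^{\infty} g(\alpha_{np}) \quad \text{(by Theorem 6.3.7)}$$
$$= \sum_{n=1}^{\infty} g(\alpha_n) \le \sum_{n=1}^{\infty} m^{*}(\sigma_n) + \sum_{n=1}^{\infty} \varepsilon/2^{n} = \sum_{n=1}^{\infty} m^{*}(\sigma_n) + \varepsilon.$$
This is true for all $\varepsilon > 0$, so that the theorem is established.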
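THEOREM 7.3.4. If $\{\sigma_n\}$ is any enumerable collection of disjoint sets of points with union $\sigma$, then
$$m_{*}(\sigma) \ge \sum_{n=1}^{\infty} m_{*}(\sigma_n).$$
By Theorem 7.2.4, since $\sigma$ covers the union of $\sigma_1, \sigma_2, \ldots, \sigma_n$,
$$m_{*}(\sigma) \ge m_{*}\Bigl(\bigcup_{p=1}^{n} \sigma_p\Bigr) \ge m_{*}(\sigma_1) + m_{*}(\sigma_2) + \ldots + m_{*}(\sigma_n),$$
by the corollary to Theorem 7.2.3. Since this is true for all $n$, it follows that
$$m_{*}(\sigma) \ge \sum_{n=1}^{\infty} m_{*}(\sigma_n).$$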
Finally, by combining Theorems 7.3.3 and 7.3.4 we obtain

THEOREM 7.3.5. If $\{\sigma_n\}$ is any enumerable collection of disjoint measurable sets, with union $\sigma$, then $\sigma$ is measurable and
$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma_n).$$
The corollary to Theorem 7.3.2 states that the Lebesgue measure is an additive functional in the sense that, if $\sigma_1, \sigma_2, \ldots, \sigma_n$ is a finite collection of disjoint measurable sets, then
$$m(\sigma_1 + \sigma_2 + \ldots + \sigma_n) = m(\sigma_1) + m(\sigma_2) + \ldots + m(\sigma_n).$$
We have now proved that the Lebesgue measure is completely additive in the sense that, if $\{\sigma_n\}$ is an enumerable collection of disjoint measurable sets then their union is measurable and
$$m(\sigma_1 + \sigma_2 + \ldots) = m(\sigma_1) + m(\sigma_2) + \ldots.$$
The Lebesgue measure $m(\sigma)$ of a measurable set $\sigma$ is therefore a positive, additive, continuous functional of $\sigma$ and we shall prove (Theorem 7.4.2) that it satisfies the normalizing condition that if $\sigma$ is an interval, then $m(\sigma)$ is the length of the interval. Hence the Lebesgue measure $m(\sigma)$ can rightly be taken to be the integral of the indicator $\sigma$.
7.4. Examples of measurable sets
The simplest sets of points are finite or enumerable collections of points and we can prove at once

THEOREM 7.4.1. Any enumerable collection of points $E$ is measurable and has Lebesgue measure zero.

When $E$ contains only one point $\xi$ it is covered by the interval
$$\xi - \tfrac{1}{2}\varepsilon < x < \xi + \tfrac{1}{2}\varepsilon$$
with geometric measure $\varepsilon$. Hence
$$0 \le m_{*}(E) \le m^{*}(E) \le \varepsilon \quad \text{for all } \varepsilon > 0,$$
and
$$m^{*}(E) - m_{*}(E) = 0.$$
Thus $E$ is measurable and $m(E) = 0$. Hence, by the completely additive property of Lebesgue measure (Theorem 7.3.5), an enumerable collection $E$ of points $\{E_n\}$ is measurable and
$$m(E) = \sum_{n=1}^{\infty} m(E_n) = 0.$$
Thus a set of points with 'zero one-dimensional' measure, according to Definition 5.2.1, has Lebesgue measure zero in accordance with Definition 7.3.1. In order of increasing complexity the next set of points to be considered is the interval.
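That an enumerable set has outer measure zero can also be seen by a direct covering, which the following sketch illustrates. This is not the route taken in the proof above, which goes through complete additivity; it borrows instead the $\varepsilon/2^{n}$ device from the proof of Theorem 7.3.3. The enumeration of the rationals by increasing denominator and the function names are assumptions made for the example.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n."""
    eps = Fraction(eps)
    return [
        (x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
        for n, x in enumerate(points, start=1)
    ]


def total_length(intervals):
    """Sum of the lengths of the covering intervals (an upper bound for the measure of their union)."""
    return sum(hi - lo for lo, hi in intervals)


points = rationals_in_unit_interval(50)
eps = Fraction(1, 100)
print(total_length(cover(points, eps)) < eps)
```

However many points are taken, the total length of the covering intervals is $\varepsilon(1 - 2^{-N})$, which stays below $\varepsilon$; this is why the rational points of $(0, 1)$, which have content unity, have measure zero.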
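THEOREM 7.4.2. Any interval $\alpha$ is measurable and its Lebesgue measure is equal to its geometric measure, i.e.
$$m(\alpha) = g(\alpha).$$
Since $\alpha$ is covered by $\alpha$ it follows from Theorem 7.2.1 that $m^{*}(\alpha) \le g(\alpha)$. Also, if $\alpha$ is covered by an interval $\omega$ then
$$m_{*}(\alpha) = g(\omega) - m^{*}(\omega - \alpha) \ge g(\omega) - g(\omega - \alpha).$$
Hence
$$m^{*}(\alpha) - m_{*}(\alpha) \le g(\alpha) + g(\omega - \alpha) - g(\omega) = 0.$$
Thus $\alpha$ is measurable. Now let $\sigma$ be any outer set covering $\alpha$. Then, by Theorem 6.3.7, $g(\alpha) \le g(\sigma)$, whence
$$m^{*}(\alpha) = \inf g(\sigma) \ge g(\alpha).$$
But
$$m^{*}(\alpha) \le g(\alpha).$$
Therefore
$$m(\alpha) = m^{*}(\alpha) = g(\alpha).$$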
THEOREM 7.4.3. Any outer set is measurable and its Lebesgue measure is equal to its geometric measure.
If $\sigma$ is the union of enumerable disjoint intervals $\{\sigma_n\}$ then by the complete additivity of Lebesgue measure (Theorem 7.3.5) $\sigma$ is measurable and
$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma_n) = \sum_{n=1}^{\infty} g(\sigma_n) \quad \text{(by Theorem 7.4.2)}$$
$$= g(\sigma) \quad \text{(by Definition 6.3.2)}.$$
THEOREM 7.4.4. Any inner set is measurable and its Lebesgue measure is equal to its geometric measure.
By Theorem 7.3.1, if the inner set $\tau$ is the complement of an outer set $\sigma$ with respect to an interval $\omega$, then
$$m^{*}(\tau) = g(\omega) - m_{*}(\sigma), \qquad m_{*}(\tau) = g(\omega) - m^{*}(\sigma).$$
But
$$m^{*}(\sigma) = m_{*}(\sigma) = m(\sigma)$$
by Theorem 7.4.3; whence
$$m^{*}(\tau) = m_{*}(\tau) = g(\omega) - m(\sigma) = g(\omega) - g(\sigma) = g(\tau),$$
by Definition 6.6.2.

Next we shall consider 'null sets', i.e. sets of points of zero measure.
THEOREM 7.4.5. The necessary and sufficient condition that a set $\alpha$ should be measurable and have measure zero is that $m^{*}(\alpha) = 0$.

The condition is obviously necessary for if $\alpha$ has measure zero then $m^{*}(\alpha) = m(\alpha) = 0$. The condition is also sufficient for the relations
$$0 \le m_{*}(\alpha) \le m^{*}(\alpha) = 0$$
show that
$$m_{*}(\alpha) = m^{*}(\alpha) = 0.$$
The definition that we gave in Chapter 5 (5.2.1) is therefore completely in accord with the definition based on the theory of Lebesgue measure.
7.5. Unbounded sets
Hitherto we have restricted ourselves to bounded sets of points $\alpha$ and we have therefore been able to assert the existence of upper bracketing functions $\mu$ such that $\mu$ is an enumerable collection of intervals which cover $\alpha$. In the case of unbounded sets bounded upper bracketing functions may not be available and we therefore rely on the method of monotony to provide a definition of measure.

DEFINITION 7.5.1. If $\sigma_s$ denotes the interval
$$-s < x < s,$$
an unbounded set $\alpha$ is measurable if the bounded sets $\alpha\sigma_s$ are measurable for each value of $s$, and the measure of $\alpha$ is defined to be the limit of the non-decreasing function $m(\alpha\sigma_s)$, i.e.
$$m(\alpha) = \lim_{s \to \infty} m(\alpha\sigma_s).$$
This measure $m(\alpha)$ may possibly be infinite.
7.6. Non-measurable sets
At this stage the patient reader may well be inclined to inquire if the use of both upper and lower bracketing functions and the introduction of both outer and inner measures is really necessary, since the measure of a bounded measurable set $\alpha$ can be defined simply as
$$m(\alpha) = \inf g(\mu)$$
for all outer sets $\mu$ covering $\alpha$. It might therefore appear that only the outer measure and the upper bracketing function need to be defined. The answer is that such a definition would apply only to measurable sets and that measurable sets are defined only by reference to their outer and inner measures. We have not proved that all sets are measurable and hence we have relied on the bracketing process to identify the measurable sets. But the awkward question then arises: are there really any non-measurable sets, i.e. sets such that $m^{*}(\alpha) \ne m_{*}(\alpha)$? The reply to this question falls into three parts:
(i) Examples of non-measurable sets have been constructed on the assumption that the axiom of choice of set theory is valid.
(ii) It has been proved by Solovay (1970) that the existence of non-measurable sets cannot be established if the axiom of choice is disallowed.
(iii) But the sets usually encountered in analysis are all measurable.
We shall therefore not pursue this matter further but refer the reader to the discussion in Burkill (1953), McShane (1947), and Williamson (1962).
7.7. Criteria for measurability
The preceding considerations suggest the desirability of constructing some simple and practical criteria of the measurability of sets. To do this we shall compare the set $\sigma$ to be examined for measurability with certain outer sets $\alpha$, which closely approximate to $\sigma$ in the sense that the outer measure $m^{*}(\sigma \,\Delta\, \alpha)$ of the symmetric difference of $\sigma$ and $\alpha$ can be made arbitrarily small. We shall thus obtain some insight into the 'structure' of a measurable set $\sigma$ and we shall prove that for any tolerance $\varepsilon > 0$ there exists an outer set $\alpha$ such that $m^{*}(\sigma \,\Delta\, \alpha) < \varepsilon$. The basic theorem follows at once from the definition of outer and inner measures:

THEOREM 7.7.1. If $\alpha$ and $\beta$ are complementary with respect to an interval $\omega$ then the necessary and sufficient condition that $\alpha$ and $\beta$ should be measurable is that for any $\varepsilon > 0$ there should exist outer sets $\mu$ and $\nu$ such that
$$\alpha \le \mu, \qquad \beta \le \nu,$$
and
$$g(\mu) + g(\nu) \le g(\omega) + \varepsilon.$$
By Definitions 7.2.1 and 7.2.2,
$$m^{*}(\alpha) \le g(\mu), \qquad m^{*}(\beta) \le g(\nu), \qquad m_{*}(\alpha) = g(\omega) - m^{*}(\beta).$$
Hence
$$m^{*}(\alpha) - m_{*}(\alpha) \le g(\mu) - g(\omega) + m^{*}(\beta) \le g(\mu) + g(\nu) - g(\omega),$$
and similarly
$$m^{*}(\beta) - m_{*}(\beta) \le g(\mu) + g(\nu) - g(\omega).$$
Thus the conditions of the theorem are sufficient to ensure the measurability of $\alpha$ and $\beta$. Also, if $\alpha$ is measurable, so also is $\beta$ by Theorem 7.3.1. Hence by definition for any tolerance $\varepsilon > 0$ there exist outer sets $\mu$ and $\nu$ such that $\alpha \le \mu$, $\beta \le \nu$ and
$$m(\alpha) \ge g(\mu) - \tfrac{1}{2}\varepsilon, \qquad m(\beta) \ge g(\nu) - \tfrac{1}{2}\varepsilon,$$
whence
$$g(\mu) + g(\nu) \le m(\alpha) + m(\beta) + \varepsilon = g(\omega) + \varepsilon$$
by Theorem 7.3.1. Thus the conditions are necessary.
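A slightly different version of this result is due to de la Vallée-Poussin.

THEOREM 7.7.2. A necessary and sufficient condition that a bounded set $\sigma$ should be measurable is that to any tolerance $\varepsilon > 0$ there corresponds an inner set $\lambda$ such that $\lambda \le \sigma$ and $m^{*}(\sigma - \lambda) \le \varepsilon$.

Since $\sigma$ covers $\lambda$, $\sigma\lambda = \lambda$. The union of $\eta = \sigma - \lambda$ and of $\lambda$ is
$$\eta \cup \lambda = (\sigma - \lambda) + \lambda - (\sigma\lambda - \lambda) = \sigma.$$
The intersection of $\eta$ and $\lambda$ is
$$\eta\lambda = \sigma\lambda - \lambda = 0.$$
Hence by the union-intersection theorems
$$m^{*}(\eta) + m^{*}(\lambda) \ge m^{*}(\sigma)$$
and
$$m_{*}(\eta) + m_{*}(\lambda) \le m_{*}(\sigma),$$
whence
$$m^{*}(\sigma) - m_{*}(\sigma) \le \{m^{*}(\eta) - m_{*}(\eta)\} + \{m^{*}(\lambda) - m_{*}(\lambda)\} \le m^{*}(\eta).$$
The condition is therefore sufficient to ensure that
$$m^{*}(\sigma) = m_{*}(\sigma).$$
The condition is also necessary. For $\lambda$ is measurable, and, if $\sigma$ is measurable, so also is $\eta$ (by the corollary to Theorem 7.3.2) and
$$m(\sigma) = m(\lambda) + m(\eta).$$
But, by Definition 7.2.2, for every given tolerance $\varepsilon > 0$, there exists an inner set $\lambda$ such that $\lambda \le \sigma$,
$$m(\sigma) \le m(\lambda) + \varepsilon,$$
whence
$$m(\eta) \le \varepsilon.$$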
7.8. Monotone sequences of sets By expressing Theorem 7.3.5 in terms of monotone sequences of measurable sets, instead of a series of disjoint sets, we can generalize it to apply to any convergent sequence of measurable sets.
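THEOREM 7.8.1. If $\{\sigma_n\}$ is a bounded non-decreasing sequence of measurable sets, then $\sigma = \lim_{n\to\infty} \sigma_n$ is also measurable and
$$m(\sigma) = \lim_{n\to\infty} m(\sigma_n).$$
Since $\{\sigma_n\}$ is bounded, there is an interval $\omega$ covering all the sets $\sigma_n$. Then $\omega - \sigma_n$ is measurable by Theorem 7.3.1, and
$$\tau_n = \sigma_n - \sigma_{n-1} = \sigma_n(\omega - \sigma_{n-1})$$
is measurable by Theorem 7.3.2. Now, if $m < n$, then $\tau_m \le \sigma_m \le \sigma_{n-1}$, i.e. $\tau_m(\omega - \sigma_{n-1}) = 0$, whence
$$\tau_m \tau_n = 0.$$
Thus the sets $\{\tau_n\}$ are disjoint. But
$$\sigma = \sigma_1 + \sum_{n=2}^{\infty} \tau_n$$
and
$$\sigma_1 \tau_n = 0 \quad\text{for } n = 2, 3, \ldots.$$
Hence, by Theorem 7.3.5, $\sigma$ is measurable and
$$m(\sigma) = m(\sigma_1) + \sum_{n=2}^{\infty} m(\tau_n) = m(\sigma_1) + \lim_{n\to\infty} \sum_{k=2}^{n} \{m(\sigma_k) - m(\sigma_{k-1})\},$$
by Theorem 7.3.2. Therefore
$$m(\sigma) = \lim_{n\to\infty} m(\sigma_n).$$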
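The telescoping argument of Theorem 7.8.1 can be checked on a simple illustrative case. The sketch below assumes the particular sequence $\sigma_n = [0, 1 - 1/n]$, which is not taken from the text; the function names are likewise hypothetical. It verifies that $m(\sigma_1)$ plus the measures of the disjoint differences $\tau_n$ reproduces $m(\sigma_n)$ exactly.

```python
from fractions import Fraction


def m_sigma(n):
    """Measure of the assumed set sigma_n = [0, 1 - 1/n] (non-decreasing in n)."""
    return 1 - Fraction(1, n)


def m_tau(n):
    """Measure of the disjoint difference tau_n = sigma_n - sigma_(n-1), n >= 2."""
    return m_sigma(n) - m_sigma(n - 1)


def partial_measure(N):
    """m(sigma_1) + sum of m(tau_n) for n = 2..N, as in the proof of Theorem 7.8.1."""
    return m_sigma(1) + sum(m_tau(n) for n in range(2, N + 1))


if __name__ == "__main__":
    for N in (2, 10, 1000):
        # The partial sums telescope to m(sigma_N), which tends to m([0, 1)) = 1.
        print(N, partial_measure(N), m_sigma(N))
```

In this example the limit set is $[0, 1)$, of measure 1, and the partial sums differ from 1 by exactly $1/N$.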
COROLLARY. If $\{\sigma_n\}$ is a bounded non-increasing sequence of measurable sets then $\sigma = \lim_{n\to\infty} \sigma_n$ is also measurable and
$$m(\sigma) = \lim_{n\to\infty} m(\sigma_n).$$
For we have only to apply Theorem 7.8.1 to the non-decreasing sequence $\{\omega - \sigma_n\}$.
We can now use the peak and chasm functions of Definition 3.4.2 to discuss the measurability of any bounded convergent sequence of measurable sets.
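THEOREM 7.8.2. If $\{\sigma_n\}$ is a bounded convergent sequence of measurable sets with limit $\sigma$ then $\sigma$ is also measurable and
$$m(\sigma) = \lim_{n\to\infty} m(\sigma_n).$$
The associated peak and chasm sequences of Definition 3.4.2 are $\{\pi_n\}$ and $\{\chi_n\}$ where
$$\pi_n = \bigcup_{k=n}^{\infty} \sigma_k \quad\text{and}\quad \chi_n = \sigma_n \sigma_{n+1} \cdots.$$
Now
$$\pi_n \ge \pi_{n+1},$$
whence the sequence $\{\pi_n\}$ converges to a limit $\pi$ and, by the corollary to Theorem 7.8.1,
$$m(\pi) = \lim m(\pi_n) \ge \limsup m(\sigma_n).$$
Similarly if each $\sigma_n$ is bounded by an interval $\omega$ we have the complementary relations
$$\omega - \chi = \omega - \lim \chi_n = \lim(\omega - \chi_n)$$
and
$$m(\omega - \chi) = \lim m(\omega - \chi_n) \ge \limsup m(\omega - \sigma_n),$$
i.e.
$$m(\chi) \le \liminf m(\sigma_n).$$
But
$$\chi = \pi = \sigma,$$
hence
$$m(\sigma) \le \liminf m(\sigma_n) \le \limsup m(\sigma_n) \le m(\sigma),$$
and
$$m(\sigma) = \lim m(\sigma_n).$$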
7.9. Exercises
1. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\epsilon > 0$, there exist open sets $\beta$ and $\alpha$ such that
$$\sigma \le \beta, \qquad \beta - \sigma \le \alpha, \quad\text{and}\quad m(\alpha) \le \epsilon.$$
2. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\epsilon > 0$, there exists an open set $\beta$ such that
$$m^*(\beta - \sigma) < \epsilon \quad\text{(Saks, Riesz).}$$
3. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\epsilon > 0$, there exist an elementary set $\alpha$ and two other sets $\eta_1$, $\eta_2$ such that
$$\sigma = \alpha + \eta_1 - \eta_2$$
and
$$m^*(\eta_1) < \epsilon, \qquad m^*(\eta_2) < \epsilon \quad\text{(Lebesgue).}$$
4. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any tolerance $\epsilon > 0$, there exists an elementary set $\varepsilon$ such that $m^*(\delta) \le \epsilon$, where $\delta = \varepsilon + \sigma - 2\varepsilon\sigma$ is the symmetric difference of $\varepsilon$ and $\sigma$ (Kolmogorov and Fomin).
5. A necessary and sufficient condition for a bounded set $\sigma$ to be measurable is that, for any set $\tau$,
$$m^*(\tau) = m^*(\tau\sigma) + m^*(\tau\sigma'),$$
where $\sigma' = 1 - \sigma$.
8
The Lebesgue integral of bounded, measurable functions
8.1. Introduction
In the preceding chapters (6 and 7) we have discussed measurable sets of points and the integration of indicators. Now, as we have indicated in § 2.4, the strategy of Lebesgue integration is to bracket the integrand $f(x)$ by a pair of functions of the form
$$\lambda(x) = \sum_{p=0}^{n-1} t_p\,\alpha_p(x), \qquad \mu(x) = \sum_{p=0}^{n-1} t_{p+1}\,\alpha_p(x),$$
where $\alpha_p(x)$ is the indicator of the set of points at which
$$t_p < f(x) \le t_{p+1}.$$
The problem of defining the integral of $f(x)$ is thus reduced to the problem of integrating the bracketing functions $\lambda(x)$ and $\mu(x)$ and this depends upon the problem of integrating the indicators $\alpha_p(x)$. The problem of integration is therefore soluble, at least for bounded functions $f(x)$ and a finite interval of integration, when the indicators $\alpha_p(x)$ are integrable, i.e. when the sets of points
$$t_p < f(x) \le t_{p+1}$$
are measurable for all values of $t_0, t_1, \ldots, t_n$.
In pre-Lebesguean analysis integrals in one dimension were defined only over finite or infinite intervals, but it is one of the remarkable and important characteristics of the Lebesgue integral that it can be defined just as easily over any measurable set of points $E$. It is a great convenience to adopt this more general definition from the beginning. To illustrate the concept of integration over a measurable $E$ we may anticipate the results established later in this chapter
(§ 8.6) and say that, if $f(x)$ is integrable over the interval $[a, b]$ and if $\chi(x, E)$ is the indicator of a measurable set of points $E$ lying in this interval, then the product $f(x)\chi(x, E)$ is also integrable over $[a, b]$ and the integral of $f(x)$ over the set $E$ is
$$\int_E f(x)\,dx = \int_a^b f(x)\chi(x, E)\,dx.$$
In uniting these two ideas, of the bracketing process and of integration over a measurable set, we shall therefore begin by studying bounded functions $f(x)$ that are defined on a bounded measurable set $E$, and are such that the subsets of $E$ at which
$$t < f(x) \le u$$
are measurable for all $t$ and $u$.
8.2. Measurable functions
The set of points in a measurable set $E$ at which $t < f(x) \le u$ is clearly the intersection of the set of points in $E$ at which $t < f(x)$ and the set of points in $E$ at which $f(x) \le u$. We therefore take as our basic definition:
DEFINITION 8.2.1. We denote by $E(f > t)$, $E(f < t)$, $E(f \ge t)$, $E(f \le t)$, $E(f = t)$ the sets of points in a measurable set $E$ at which $f(x) > t$, $f(x) < t$, $f(x) \ge t$, $f(x) \le t$, $f(x) = t$, respectively.
DEFINITION 8.2.2. A function $f(x)$ is 'measurable in a measurable set $E$' if the set $E(f > t)$ is measurable for each value of $t$.
THEOREM 8.2.1. If the set $E(f > t)$ is measurable for each value of $t$, so also are the sets $E(f < t)$, $E(f \ge t)$, $E(f \le t)$, $E(f = t)$.
The set $E(f \ge t)$ is the limit of the convergent sequence of measurable sets $E(f > t - 1/n)$ for $n = 1, 2, \ldots$. Hence, by Theorem 7.8.2 the set $E(f \ge t)$ is measurable. The sets $E(f < t)$ and $E(f \ge t)$ are complementary with respect to the set $E$. Hence, by Theorem 7.3.1 the set $E(f < t)$ is measurable. Similarly, the set $E(f \le t)$ is measurable.
Finally, the set $E(f = t)$ is the intersection of the sets $E(f \ge t)$ and $E(f \le t)$, whence, by Theorem 7.3.2, the set $E(f = t)$ is measurable.
THEOREM 8.2.2. If $f(x)$ and $g(x)$ are each measurable in a set $E$, so also are the functions
$$f(x) + c, \qquad cf(x), \qquad f^2(x), \quad\text{and}\quad |f(x)|,$$
where $c$ is any real number.
The relations
$$E(f + c > t) = E(f > t - c)$$
and
$$E(cf > t) = E(f > t/c) \qquad (c > 0)$$
are sufficient to show that $f + c$ is measurable in $E$ for all $c$, and that $cf$ is measurable in $E$ if $c$ is positive. Also the set of points $E(-f > t)$ is the same as the set $E(f < -t)$, which is measurable by Theorem 8.2.1. Hence $-f$ is measurable in $E$, and $cf$ is measurable in $E$ whether $c$ is positive or negative. The same theorem shows that the sets $E(f > \sqrt{t})$ and $E(f < -\sqrt{t})$ are both measurable if $t$ is non-negative, whence their union $E(f^2 > t)$ is measurable, i.e. $f^2$ is measurable in $E$. Similarly the set $E(|f| > t)$ is the union of the measurable sets $E(f > t)$ and $E(f < -t)$, whence $|f|$ is measurable in $E$.
In order to establish the measurability of $f + g$ we need the following lemma.
LEMMA. If $f(x)$ and $g(x)$ are bounded, measurable functions in $E$, then the set of points $E(f > g)$ is measurable.
Let $\{r_n\}$ be any enumeration of the rational numbers. Then any point $\xi$ in $E$ at which $f(\xi) > g(\xi)$ lies in the intersection $I_n$ of two sets of the type $E(f > r_n)$ and $E(g < r_n)$. Hence $E(f > g)$ is the union of the enumerable collection of measurable sets $\{I_n\}$ and is therefore measurable.
THEOREM 8.2.3. If $f$ and $g$ are measurable in $E$, so also are $f + g$, $af + bg$, $fg$, $\max(f, g)$, $\min(f, g)$.
For E(f+g > t) = E(f > t-g). By Theorem 8.2.2, -g and t-g are measurable, whence by the
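lemma, $f + g$ is measurable. Also by Theorem 8.2.2, $af$ and $bg$ are measurable, whence $af + bg$ is measurable. Now
$$fg = \tfrac{1}{4}(f + g)^2 - \tfrac{1}{4}(f - g)^2,$$
and by Theorem 8.2.2, $(f + g)^2$, $(f - g)^2$ are measurable, whence $fg$ is measurable. If $\phi = \max(f, g)$ and $\psi = \min(f, g)$ then
$$\phi = \tfrac{1}{2}|f - g| + \tfrac{1}{2}(f + g), \qquad \psi = \tfrac{1}{2}(f + g) - \tfrac{1}{2}|f - g|.$$
Since $f + g$, $f - g$, $|f + g|$, $|f - g|$ are measurable, so also are $\phi$ and $\psi$.
THEOREM 8.2.4. If $\{f_n(x)\}$ is a sequence of functions measurable in $E$, then the limits
$$\overline{\lim_{n\to\infty}}\, f_n(x) \quad\text{and}\quad \underline{\lim}_{n\to\infty}\, f_n(x)$$
are also measurable in $E$.
Let
$$M(x) = \sup f_n(x), \qquad L(x) = \inf f_n(x).$$
The set $E(M > t)$ is the union of the enumerable collection of measurable sets $E(f_n > t)$, and the set $E(L < t) = E(-L > -t)$ is the union of the enumerable collection of measurable sets $E(-f_n > -t)$. Hence $M(x)$ and $L(x)$ are each measurable in $E$. Now let
$$M_n(x) = \sup\{f_n(x), f_{n+1}(x), \ldots\}$$
and
$$L_n(x) = \inf\{f_n(x), f_{n+1}(x), \ldots\}.$$
Then
$$L_{n+1}(x) \ge L_n(x) \quad\text{and}\quad M_n(x) \ge M_{n+1}(x).$$
Hence
$$\underline{\lim}_{n\to\infty}\, f_n(x) = \lim_{n\to\infty} L_n(x), \qquad \overline{\lim_{n\to\infty}}\, f_n(x) = \lim_{n\to\infty} M_n(x),$$
and $L_n(x)$ and $M_n(x)$ are each measurable in $E$. Hence, as before, so also are $\lim L_n(x)$ and $\lim M_n(x)$, i.e. $\underline{\lim}\, f_n(x)$ and $\overline{\lim}\, f_n(x)$.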
COROLLARY. If $\{f_n(x)\}$ is a sequence of functions measurable in $E$ and $f_n(x)$ converges pointwise to a limit $f(x)$ in $E$ then $f(x)$ is measurable in $E$.
For
$$f(x) = \overline{\lim_{n\to\infty}}\, f_n(x) = \underline{\lim}_{n\to\infty}\, f_n(x).$$
Finally, we prove
THEOREM 8.2.5. If $f(x)$ is continuous in the interval $[a, b]$ then it is also measurable in $[a, b]$.
The set of points in $[a, b]$ at which $f(x) > t$ is open, and is therefore an outer set, which is measurable by Theorem 7.4.3.
All the functions of classical analysis can be constructed from the functions f(x) = 1 and f(x) = x by the algebraic processes of addition and multiplication together with the analytic process of taking the limit of a convergent sequence. Hence all such functions are measurable. The question of the existence of non-measurable functions, like the question of the existence of non-measurable sets of points depends upon the truth or falsehood of the axiom of choice, but we shall not explore this branch of the morbid pathology of functions.
8.3. Measure functions
DEFINITION 8.3.1. If $f(x)$ is measurable in a set $E$, its 'measure function' $m_E(t, f)$ is the measure of the set of points in $E$ at which $f(x) > t$.
THEOREM 8.3.1. The measure function $m_E(t, f)$ is a non-increasing function of $t$.
For, if $s < t$, then the set of points $E(f > t)$ is covered by the set $E(f > s)$, whence
$$m_E(t, f) \le m_E(s, f),$$
by Theorem 7.3.2, Corollary (ii).
COROLLARY. If $f(x)$ is measurable in a set $E$ which lies in a finite interval $I$ and if $\alpha \le f(x) \le \beta$ in $E$, then the measure function
$m_E(t, f)$ maps the bounded domain $\alpha \le t \le \beta$ into a bounded range (since $0 \le m_E(t, f) \le |I|$), whence $m_E(t, f)$ is integrable by § 2.5.
8.4. Simple functions
The functions $\lambda(x)$ and $\mu(x)$ which Lebesgue introduced to bracket a bounded function $f(x)$ belong to the class of 'simple' functions, which may be formally described in the following definitions and theorems.
DEFINITION 8.4.1. A simple function $\sigma(x)$ is one which is zero outside some finite interval $(a, b)$ and whose range is a finite collection of distinct real numbers $s_1, s_2, \ldots, s_m$.
THEOREM 8.4.1. The simple function $\sigma(x)$ of Definition 8.4.1 can be expressed in the form
$$\sigma(x) = \sum_{p=1}^{m} s_p \alpha_p,$$
where $\alpha_p = \alpha_p(x)$ is the indicator of the points at which $\sigma(x) = s_p$ and where
$$\sum_{p=1}^{m} \alpha_p = 1.$$
For if $x$ is any prescribed point, then $\sigma(x)$ has one and only one of the values $s_1, s_2, \ldots, s_m$, say $s_k$, and
$$\sum_{p=1}^{m} s_p \alpha_p(x) = s_k = \sigma(x).$$
To define the integral of the simple function $\sigma(x)$ we could follow the same methods as those we used to define the integral of an indicator function (Chapter 7), but a few moments' reflection should satisfy the reader that the final result would be expressed by
DEFINITION 8.4.2. If the simple function $\sigma(x)$ is measurable, then its integral is
$$\int \sigma(x)\,dx = \sum_{p=1}^{m} s_p \int \alpha_p(x)\,dx = \sum_{p=1}^{m} s_p m_p,$$
where $m_p$ is the measure of the set of points $\alpha_p$ at which $\sigma(x) = s_p$. To justify this definition we prove
THEOREM 8.4.2. The integral $\int \sigma(x)\,dx$ of a simple function $\sigma(x)$ is a positive, linear functional on the space of simple functions.
For, if $\sigma(x)$ is non-negative, then
$$s_p \ge 0 \quad\text{for } p = 1, 2, \ldots, m,$$
whence
$$\int \sigma(x)\,dx \ge 0.$$
Now let the simple functions $\sigma(x)$ and $\tau(x)$ have representations
$$\sigma(x) = \sum_p s_p \alpha_p, \qquad \tau(x) = \sum_q t_q \beta_q,$$
with
$$\sum_p \alpha_p = 1, \qquad \sum_q \beta_q = 1.$$
Then
$$\sigma(x) + \tau(x) = \sum_p s_p \alpha_p + \sum_q t_q \beta_q = \sum_{p,q} (s_p + t_q)\alpha_p \beta_q.$$
Also
$$\sum_{p,q} \alpha_p \beta_q = \sum_p \alpha_p \sum_q \beta_q = 1.$$
The set of values $(s_p + t_q)$ may not all be distinct, but they can be grouped into a finite collection of distinct values $u_1, u_2, \ldots$ and then $\sum \alpha_p \beta_q$ summed over all $p$ and $q$ for which $s_p + t_q = u_k$ will be the indicator $\chi(x, u_k)$ of the points at which $\sigma(x) + \tau(x) = u_k$. Hence the sum of two simple functions is a simple function.
Now, by Definition 8.4.2,
$$\int \{\sigma(x) + \tau(x)\}\,dx = \sum_k u_k \int \chi(x, u_k)\,dx = \sum_{p,q} (s_p + t_q) \int \alpha_p \beta_q\,dx.$$
The set of points at which $\sigma(x) = s_p$ is the union of the finite collection of disjoint sets at which $\sigma(x) = s_p$ and $\tau(x) = t_q$ for $q = 1, 2, \ldots$. Hence, by Theorem 7.3.2, Corollary (i),
$$\int \alpha_p\,dx = \sum_q \int \alpha_p \beta_q\,dx.$$
Similarly
$$\int \beta_q\,dx = \sum_p \int \alpha_p \beta_q\,dx.$$
Therefore
$$\int \{\sigma(x) + \tau(x)\}\,dx = \sum_{p,q} s_p \int \alpha_p \beta_q\,dx + \sum_{p,q} t_q \int \alpha_p \beta_q\,dx = \sum_p s_p \int \alpha_p\,dx + \sum_q t_q \int \beta_q\,dx = \int \sigma(x)\,dx + \int \tau(x)\,dx.$$
Similarly we can show that, if $c$ and $k$ are any real numbers, $c\sigma(x)$, $k\tau(x)$ and $c\sigma(x) + k\tau(x)$ are simple functions, and that
$$\int \{c\sigma(x) + k\tau(x)\}\,dx = c\int \sigma(x)\,dx + k\int \tau(x)\,dx.$$
Thus the integral of a simple function is a linear functional.
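The refinement argument used for Theorem 8.4.2 can be sketched in a few lines. The sketch assumes that each set $\alpha_p$ is a finite union of disjoint half-open intervals, which is far more special than a general measurable set; the representation as (value, intervals) pairs and all function names are hypothetical choices made for the illustration.

```python
from fractions import Fraction as F


def measure(intervals):
    """Total length of a list of disjoint half-open intervals [lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)


def intersect(A, B):
    """Intersection of two unions of disjoint intervals (the set alpha_p * beta_q)."""
    out = []
    for a_lo, a_hi in A:
        for b_lo, b_hi in B:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo < hi:
                out.append((lo, hi))
    return out


def integral(simple):
    """Definition 8.4.2: the sum of s_p * m_p over the pieces of a simple function."""
    return sum(s * measure(A) for s, A in simple)


def add(sigma, tau):
    """Value s_p + t_q on alpha_p * beta_q; repeated values are not regrouped here."""
    return [(s + t, intersect(A, B)) for s, A in sigma for t, B in tau]


def scale(c, sigma):
    """The simple function c * sigma."""
    return [(c * s, A) for s, A in sigma]


# Two assumed simple functions on [0, 1); the pieces of each cover the whole interval.
sigma = [(F(2), [(F(0), F(1, 2))]), (F(0), [(F(1, 2), F(1))])]
tau = [(F(1), [(F(0), F(1, 4))]), (F(3), [(F(1, 4), F(1))])]

if __name__ == "__main__":
    print(integral(sigma), integral(tau), integral(add(sigma, tau)))
```

Leaving repeated values $s_p + t_q$ ungrouped does not affect the integral, since grouping merely collects terms with a common factor $u_k$.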
8.5. Lebesgue bracketing functions
The following definition is effectively the same as that given by Lebesgue.
DEFINITION 8.5.1. If $E$ is a measurable set of points and if $f(x)$ is a bounded function and its range, $A < f(x) \le B$, is divided at the finite number of points $t_0, t_1, \ldots, t_n$ so that
$$A = t_0 < t_1 < t_2 < \cdots < t_n = B,$$
then
$$\lambda(x) = \sum_{p=0}^{n-1} t_p\{\alpha_E(x, t_p) - \alpha_E(x, t_{p+1})\}$$
and
$$\mu(x) = \sum_{p=0}^{n-1} t_{p+1}\{\alpha_E(x, t_p) - \alpha_E(x, t_{p+1})\}$$
are lower and upper Lebesgue bracketing functions for $f(x)$, $\alpha_E(x, t_p)$ being the indicator of the points in $E$ at which $f(x) > t_p$.
THEOREM 8.5.1. The Lebesgue bracketing functions of a bounded measurable function $f(x)$ defined on a bounded, measurable set $E$ are uniform approximations to $f(x)$ on $E$, and
$$\sup \int_a^b \lambda(x)\,dx = \inf \int_a^b \mu(x)\,dx.$$
Let $\epsilon = \max(t_{p+1} - t_p)$ for $p = 0, 1, 2, \ldots, n-1$. Then, if $x \in E$,
$$\lambda(x) \le f(x) \le \mu(x),$$
and
$$\mu(x) - \lambda(x) \le \epsilon.$$
Hence the functions $\lambda(x)$ and $\mu(x)$ approximate $f(x)$ uniformly in $E$.
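The bracketing functions are simple functions. Hence they are integrable and, by Theorem 8.4.2,
$$\int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx = \int_a^b \{\mu(x) - \lambda(x)\}\,dx \le \epsilon(b - a).$$
Now $\lambda(x) \le B$ and $\mu(x) \ge A$. Hence if we consider all the Lebesgue bracketing functions of a bounded measurable function $f(x)$, then the bounds
$$\sup \int_a^b \lambda(x)\,dx \quad\text{and}\quad \inf \int_a^b \mu(x)\,dx$$
both exist. Since every lower bracketing function is covered by every upper bracketing function, the supremum cannot exceed the infimum, and
$$0 \le \inf \int_a^b \mu(x)\,dx - \sup \int_a^b \lambda(x)\,dx \le \epsilon(b - a).$$
Since this is true for all $\epsilon > 0$ it follows that
$$\inf \int_a^b \mu(x)\,dx = \sup \int_a^b \lambda(x)\,dx.$$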
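The bracketing of Theorem 8.5.1 can be watched numerically. The following sketch assumes the illustrative function $f(x) = x^2$ on $E = (0, 1]$, for which the set $E(f > t)$ is the interval $(\sqrt{t}, 1]$ of measure $1 - \sqrt{t}$; neither this example nor the function names come from the text. The range $0 < f \le 1$ is divided uniformly, so that $\epsilon = 1/n$.

```python
import math


def m_E(t):
    """Measure of {x in (0, 1] : x**2 > t}; equals 1 - sqrt(t) for 0 <= t <= 1."""
    return 1.0 - math.sqrt(t)


def bracket_integrals(n):
    """Integrals of the lower and upper bracketing functions for t_p = p/n."""
    t = [p / n for p in range(n + 1)]
    m = [m_E(tp) for tp in t]
    # m[p] - m[p+1] is the measure of the set where t_p < f <= t_(p+1).
    lower = sum(t[p] * (m[p] - m[p + 1]) for p in range(n))
    upper = sum(t[p + 1] * (m[p] - m[p + 1]) for p in range(n))
    return lower, upper


if __name__ == "__main__":
    for n in (10, 100, 1000):
        lower, upper = bracket_integrals(n)
        # The gap upper - lower equals epsilon * m(E) = 1/n up to rounding.
        print(n, lower, upper, upper - lower)
```

Both sums close in on the elementary value $\int_0^1 x^2\,dx = 1/3$, the lower from below and the upper from above.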
8.6. The Lebesgue-Young integral
We are at last in a position to define the Lebesgue integral of any bounded, measurable function $f(x)$ over a bounded measurable set $E$. In accordance with our general bracketing principle, which has been our main guide from Chapter 1 onwards, we now give
DEFINITION 8.6.1. If $\lambda(x)$ and $\mu(x)$ are Lebesgue bracketing functions of a bounded, measurable function $f(x)$, for a bounded, measurable set $E$, then the Lebesgue integral of $f(x)$ over $E$ is defined to be
$$\int_E f(x)\,dx = \sup \int_a^b \lambda(x)\,dx = \inf \int_a^b \mu(x)\,dx.$$
We shall justify this definition in § 8.7 by showing that the Lebesgue integral is a positive, linear, continuous functional and by proving that almost everywhere in $(a, b)$ the derivative
$$\frac{d}{dx} \int_a^x f(t)\,dt$$
exists and equals $f(x)$.
The Lebesgue integral of $f(x)$ can be expressed as the integral of the measure function $m_E(t, f)$ in the form given by
THEOREM 8.6.1. If $A < f(x) \le B$ for all $x \in E$ then
$$\int_E f(x)\,dx = A\,m(E) + \int_A^B m_E(t, f)\,dt.$$
Let
$$m_k = m_E(t_k, f) = \int \alpha(x, t_k, f)\,dx, \qquad A = t_0 < t_1 < \cdots < t_n = B.$$
Then
$$m_0 = m(E) \quad\text{and}\quad m_n = 0.$$
Also
$$\int_E \lambda(x)\,dx = t_0(m_0 - m_1) + t_1(m_1 - m_2) + \cdots + t_{n-1}(m_{n-1} - m_n) = m_0 t_0 + m_1(t_1 - t_0) + \cdots + m_{n-1}(t_{n-1} - t_{n-2}).$$
Hence, by Theorem 2.5.1 on the integration of monotone functions,
$$\int_E \lambda(x)\,dx \le A\,m(E) + \int_A^B m_E(t, f)\,dt.$$
Similarly
$$\int_E \mu(x)\,dx = t_1(m_0 - m_1) + t_2(m_1 - m_2) + \cdots + t_n(m_{n-1} - m_n) = m_0 t_1 + m_1(t_2 - t_1) + \cdots + m_{n-1}(t_n - t_{n-1}) \ge A\,m(E) + \int_A^B m_E(t, f)\,dt.$$
Therefore, by Definition 8.6.1,
$$\int_E f(x)\,dx = A\,m(E) + \int_A^B m_E(t, f)\,dt.$$
This representation of the Lebesgue integral in terms of the measure function $m_E(t, f)$ was the independent discovery of Lebesgue and of W. H. Young (1905). While the independent variable $x$ varies over the set $E$, the dependent variable $t = f(x)$ varies over the range $(A, B)$, so that in the definition of the Lebesgue integral the roles of the independent and dependent variables are interchanged. This inversion of the roles of the independent and dependent variables is well described by G. H. Hardy as a 'dramatic' transformation.
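The interchange of roles can be illustrated numerically by integrating over the range of $f$ instead of over $E$. The sketch assumes the example $f(x) = x^2$ on $E = (0, 1]$, so that $A = 0$, $B = 1$, $m(E) = 1$, $m_E(t, f) = 1 - \sqrt{t}$, and the measure of the set where $f(x) \le t$ is $\sqrt{t}$. The midpoint rule and all names are assumptions made for the illustration, not part of the theory.

```python
import math


def midpoint_integral(g, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


A, B, MEASURE_E = 0.0, 1.0, 1.0  # range bounds of f and the measure of E = (0, 1]


def m_E(t):
    """Measure of the set where x**2 > t, for 0 <= t <= 1."""
    return 1.0 - math.sqrt(t)


def mu_E(t):
    """Measure of the set where x**2 <= t, for 0 <= t <= 1."""
    return math.sqrt(t)


# Form of Theorem 8.6.1: A m(E) + integral of m_E over [A, B].
young_first_form = A * MEASURE_E + midpoint_integral(m_E, A, B)
# Companion form: B m(E) - integral of mu_E over [A, B].
young_second_form = B * MEASURE_E - midpoint_integral(mu_E, A, B)

if __name__ == "__main__":
    print(young_first_form, young_second_form)
```

Both expressions approximate $1/3$, which is the elementary value of $\int_0^1 x^2\,dx$.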
COROLLARY. (i) We note in particular that if $f(x)$ is bounded and measurable in the closed interval $[a, b]$, then it is also bounded and measurable in the open interval $(a, b)$ and the half-open intervals $(a, b]$, $[a, b)$. Each of these intervals has the same measure $b - a$, and the measure function $m_E(t, f)$ has the same value in each case. Therefore we can use the same symbol
$$\int_a^b f(x)\,dx$$
for the integral of $f(x)$ over $[a, b]$, $(a, b)$, $[a, b)$ or $(a, b]$.
(ii) It is almost trivial to note that if $f(x)$ is bounded and measurable in $[a, b]$ then we can similarly define the Lebesgue integral
$$\int_\alpha^\beta f(x)\,dx$$
for any interval $[\alpha, \beta]$ covered by $[a, b]$.
Note. It is convenient to define
$$\int_b^a f(x)\,dx \quad\text{for } a < b \text{ to mean}\quad -\int_a^b f(x)\,dx.$$
There is a companion theorem to 8.6.1 which can be proved in exactly the same way and which yields
THEOREM 8.6.2. If $f(x)$ is a bounded, measurable function and $A < f(x) \le B$ in the measurable set $E$ and $\mu_E(t, f)$ is the measure of the set of points in $E$ at which $f(x) \le t$, then
$$\int_E f(x)\,dx = B\,m(E) - \int_A^B \mu_E(t, f)\,dt.$$
We can also deduce this result from Theorem 8.6.1 by introducing $\beta(x, t)$ as the indicator of the points in $E$ at which $f(x) \le t$. Then
$$\alpha(x, t) + \beta(x, t) = \chi_E(x),$$
$$\mu_E(t, f) = \int_a^b \beta(x, t)\,dx = m(E) - m_E(t, f)$$
and
$$A\,m(E) + \int_A^B m_E(t, f)\,dt = A\,m(E) + \int_A^B \{m(E) - \mu_E(t, f)\}\,dt = B\,m(E) - \int_A^B \mu_E(t, f)\,dt$$
by the additive property of the integrals of monotone functions (§ 2.7, Exercise 4).
8.7. The Lebesgue integral as a positive, linear, continuous functional
Throughout this section $f(x)$ denotes a bounded, measurable function and $E$ a bounded, measurable set of points. To justify the definition (8.6.1) of the Lebesgue integral of $f(x)$ over $E$ we must show that $\int_E f(x)\,dx$ is a positive, linear, continuous functional of $f(x)$ with the Lebesgue norm. We shall in fact establish the theorem of bounded convergence (8.7.7) anticipated in § 2.2.
THEOREM 8.7.1 (the mean value theorem). If $A < f(x) \le B$, then
$$A\,m(E) \le \int_E f(x)\,dx \le B\,m(E).$$
For
$$\lambda(x) \ge A\{\alpha_E(x, A) - \alpha_E(x, B)\}, \qquad \mu(x) \le B\{\alpha_E(x, A) - \alpha_E(x, B)\},$$
whence
$$\int_E \lambda(x)\,dx \ge A\,m(E) \quad\text{and}\quad \int_E \mu(x)\,dx \le B\,m(E),$$
where
$$m(E) = \int_E \{\alpha_E(x, t_0) - \alpha_E(x, t_n)\}\,dx,$$
i.e. the measure of the set of points in $E$ at which $A < f(x) \le B$.
COROLLARY. (i) If $f(x) = C$ in $E$, where $C$ is a constant, then
$$\int_E f(x)\,dx = C\,m(E).$$
If $E$ is the interval $(a, b)$, and $f(x) = 1$ then
$$\int_a^b f(x)\,dx = (b - a).$$
Thus the Lebesgue integral satisfies the Lebesgue normalizing condition of § 2.2.
(ii) If $A \ge 0$, then $\int_E f(x)\,dx \ge 0$.
Hence the Lebesgue integral is a positive functional on the space of bounded measurable functions.
THEOREM 8.7.2 (the addition theorem for sets). If the bounded set $E$ is the union of a finite or enumerable collection of disjoint measurable sets $\{E_n\}$ and $f(x)$ is bounded and measurable on $E$, then
$$\int_E f(x)\,dx = \sum_{n=1}^{\infty} \int_{E_n} f(x)\,dx.$$
If $\lambda(x)$ and $\mu(x)$ are Lebesgue bracketing functions for $f(x)$ and the set $E$, and if $\chi_p = \chi(x, E_p)$ is the indicator of the set $E_p$ then by Definition 8.5.1, $\chi_p\lambda$ and $\chi_p\mu$ are Lebesgue bracketing functions for $f(x)$ and the set $E_p$. For, if $x \in E_p$, then
$$\chi_p\lambda \le f(x) \le \chi_p\mu.$$
Similarly
$$\sum_{p=1}^{n} \chi_p\lambda \quad\text{and}\quad \sum_{p=1}^{n} \chi_p\mu$$
are Lebesgue bracketing functions for $f(x)$ and the union $U_n$ of the sets $E_1, E_2, \ldots, E_n$. All these bracketing functions are simple, whence by Theorem 8.4.2 on sums of simple functions,
$$\sum_{p=1}^{n} \int \chi_p\lambda\,dx \le \int_{U_n} f(x)\,dx \le \sum_{p=1}^{n} \int \chi_p\mu\,dx.$$
Also
$$\int \chi_p\lambda\,dx \le \int_{E_p} f(x)\,dx \le \int \chi_p\mu\,dx.$$
We can choose the functions $\lambda$ and $\mu$ so that
$$\int_E \chi_p\mu\,dx - \int_E \chi_p\lambda\,dx \le \int_E \mu\,dx - \int_E \lambda\,dx \le \epsilon/n.$$
Hence, on passing to the limit,
$$\int_{U_n} f(x)\,dx = \sum_{p=1}^{n} \int_{E_p} f(x)\,dx.$$
Now let $V_n$ be the union of the sets $E_{n+1}, E_{n+2}, \ldots$. Then, by the result just established,
$$\int_E f(x)\,dx = \int_{U_n} f(x)\,dx + \int_{V_n} f(x)\,dx.$$
But, by the mean value theorem (8.7.1),
$$A\,m(V_n) \le \int_{V_n} f(x)\,dx \le B\,m(V_n),$$
if $A \le f(x) \le B$ in $E$. Also
$$m(V_n) = m(E) - m(U_n) \to 0 \quad\text{as } n \to \infty.$$
Therefore
$$\int_E f(x)\,dx = \lim_{n\to\infty} \int_{U_n} f(x)\,dx = \sum_{p=1}^{\infty} \int_{E_p} f(x)\,dx.$$
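A small exact computation shows the two halves of this proof at work. The sketch assumes the example $f(x) = x$ on $E = (0, 1]$, decomposed into the disjoint intervals $E_n = (1/(n+1), 1/n]$, and takes the integral of $x$ over an interval from the elementary calculus; the example and the function names are illustrative only.

```python
from fractions import Fraction


def piece(n):
    """Integral of f(x) = x over the assumed set E_n = (1/(n+1), 1/n]."""
    return (Fraction(1, n) ** 2 - Fraction(1, n + 1) ** 2) / 2


def partial(N):
    """Integral over U_N, the union of E_1, ..., E_N, as a finite sum."""
    return sum(piece(n) for n in range(1, N + 1))


def tail_bound(N):
    """B * m(V_N) with B = 1 and m(V_N) = 1/(N+1), from the mean value theorem."""
    return Fraction(1, N + 1)


if __name__ == "__main__":
    total = Fraction(1, 2)  # integral of x over (0, 1]
    for N in (1, 10, 100):
        # The remainder is the integral over V_N; it lies between 0 and B * m(V_N).
        print(N, partial(N), total - partial(N), tail_bound(N))
```

The remainder after $N$ terms is exactly $1/\{2(N+1)^2\}$, comfortably inside the bound $B\,m(V_N) = 1/(N+1)$ supplied by Theorem 8.7.1.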
COROLLARY. (i) By the corollary to Theorem 8.6.1 it follows that if $f(x)$ is bounded and measurable on $[a, b]$ and if $a < c < b$, then
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$
(ii) If $f(x) = 0$ p.p. in $E$, then
$$\int_E f(x)\,dx = 0.$$
For if $f(x) = 0$ in $N$ and $f(x) \ne 0$ in $M$, then $m(M) = 0$ and
$$\int_E f(x)\,dx = \int_N f(x)\,dx + \int_M f(x)\,dx = 0.$$
We can now prove the following extension of the mean value theorem (8.7.1).
THEOREM 8.7.3. If $f(x)$ and $\phi(x)$ are each bounded and measurable on the set $E$, and if $\phi(x) \ge f(x)$ on $E$ then
$$\int_E \phi(x)\,dx \ge \int_E f(x)\,dx.$$
If $\lambda(x)$ is the lower Lebesgue bracketing function for $f(x)$ as given in Definition 8.5.1, then $\lambda(x)$ is also a lower bracketing function for $\phi(x)$. For we have only to choose $A$ and $B$ so that
$$A < f(x) \le \phi(x) \le B.$$
Hence
$$\int_E \phi(x)\,dx \ge \int_E \lambda(x)\,dx,$$
and
$$\int_E \phi(x)\,dx \ge \sup \int_E \lambda(x)\,dx = \int_E f(x)\,dx.$$
The mean value theorem enables us to infer upper and lower bounds for the integral of $f(x)$ from upper and lower bounds for $f(x)$. It is obvious that there cannot be any exact converse of this theorem, but there are two theorems that are converses of Corollary (ii) of Theorem 8.7.2.
THEOREM 8.7.4 (the null integral theorem). (i) If $f(x)$ is bounded, measurable and non-negative in a measurable set $E$, and $\int_E f(x)\,dx = 0$, then $f(x) = 0$ p.p. in $E$.
Let $E_n$ be the set of points in $E$ at which $f(x) > 1/n$. Then $E_n$ is measurable and
$$0 \le \frac{1}{n}\,m(E_n) \le \int_{E_n} f(x)\,dx \le \int_E f(x)\,dx = 0$$
(by Theorem 8.7.1) whence
$$m(E_n) = 0.$$
Now the set of points $P$ in $E$ at which $f(x) > 0$ is the union of the enumerable collection of sets $\{E_n\}$ $(n = 1, 2, \ldots)$. Hence the measure of the set $P$ is zero, i.e. $f(x) = 0$ p.p. in $E$.
(ii) If $f(x)$ is bounded and measurable in an interval $I$ and $\int_J f(x)\,dx = 0$ for any interval $J$ covered by $I$ then $f(x) = 0$ p.p. in $I$.
Let $f(x) > 0$ in a set $P$ of positive measure in $I$. Since $P$ is measurable it covers an inner set $Q$ of positive measure, and $f(x) > 0$ in $Q$. The complement, $I - Q$, is an outer set, i.e. an enumerable collection of intervals $E_n$. Hence
$$\int_Q f(x)\,dx = \int_I f(x)\,dx - \sum_{n=1}^{\infty} \int_{E_n} f(x)\,dx = 0.$$
Therefore, by part (i), $f(x) = 0$ p.p. in $Q$, which is a contradiction, whence the result follows.
So far we have followed the concise exposition of de la Vallée-Poussin rather closely, but we can remove a certain artificiality from his proof of the following addition theorems by using the properties of simple functions.
THEOREM 8.7.5 (the addition theorem for functions). If the functions $f_1$ and $f_2$ are bounded and measurable in $E$ so also is $f_1 + f_2$ and
$$\int_E (f_1 + f_2)\,dx = \int_E f_1\,dx + \int_E f_2\,dx.$$
If $\lambda_1$, $\mu_1$ and $\lambda_2$, $\mu_2$ are lower and upper Lebesgue bracketing functions for $f_1$ and $f_2$ respectively, then $\lambda_1 + \lambda_2$ and $\mu_1 + \mu_2$ are bracketing functions for $f_1 + f_2$, although they are not Lebesgue bracketing functions. Nevertheless they are simple functions, whence, by Theorem 8.4.2,
$$\int_E \lambda_1\,dx + \int_E \lambda_2\,dx = \int_E (\lambda_1 + \lambda_2)\,dx.$$
But, by Theorem 8.7.3,
$$\int_E (\lambda_1 + \lambda_2)\,dx \le \int_E (f_1 + f_2)\,dx;$$
whence
$$\int_E \lambda_1\,dx + \int_E \lambda_2\,dx \le \int_E (f_1 + f_2)\,dx,$$
and, similarly,
$$\int_E \mu_1\,dx + \int_E \mu_2\,dx \ge \int_E (f_1 + f_2)\,dx.$$
Now we can choose $\lambda_1$, $\lambda_2$, $\mu_1$, $\mu_2$ so that, given any tolerance $\epsilon > 0$,
$$\int_E \mu_1\,dx - \epsilon < \int_E f_1\,dx < \int_E \lambda_1\,dx + \epsilon,$$
$$\int_E \mu_2\,dx - \epsilon < \int_E f_2\,dx < \int_E \lambda_2\,dx + \epsilon.$$
Hence
$$\int_E f_1\,dx + \int_E f_2\,dx - 2\epsilon < \int_E (f_1 + f_2)\,dx < \int_E f_1\,dx + \int_E f_2\,dx + 2\epsilon.$$
Since this is true for all $\epsilon > 0$,
$$\int_E (f_1 + f_2)\,dx = \int_E f_1\,dx + \int_E f_2\,dx.$$
We can now show that the Lebesgue integral is a linear functional.
THEOREM 8.7.6. If $f(x)$ and $g(x)$ are bounded and measurable on a measurable set $E$, and if $a$, $b$ are any real numbers, then
$$\int_E (af + bg)\,dx = a\int_E f\,dx + b\int_E g\,dx.$$
If $a \ge 0$, and if $\lambda$, $\mu$ are lower and upper Lebesgue bracketing functions to $f(x)$, then $a\lambda$, $a\mu$ are lower and upper Lebesgue bracketing functions to $af(x)$. Hence
$$\int_E af\,dx = a\int_E f\,dx \qquad (a \ge 0).$$
Also, by Theorem 8.7.5,
$$\int_E (-af)\,dx + \int_E (af)\,dx = 0,$$
whence
$$\int_E (-af)\,dx = -\int_E af\,dx = -a\int_E f\,dx.$$
Therefore
$$\int_E af\,dx = a\int_E f\,dx,$$
whether $a$ is positive or negative (or zero). Finally, by the same Theorem 8.7.5,
$$\int_E (af + bg)\,dx = \int_E af\,dx + \int_E bg\,dx = a\int_E f\,dx + b\int_E g\,dx.$$
COROLLARY. By induction it follows that if $f_1, f_2, \ldots, f_n$ is any finite collection of functions, each bounded and measurable on a measurable set $E$, and if $c_1, c_2, \ldots, c_n$ are any real numbers, then
$$\int_E \sum_{k=1}^{n} c_k f_k\,dx = \sum_{k=1}^{n} c_k \int_E f_k\,dx.$$
The extension of this result to an enumerable collection of functions is one of the most powerful and attractive features of the Lebesgue theory. We shall begin by proving the 'theorem of bounded convergence', which applies to a sequence $\{f_n(x)\}$ $(n = 1, 2, \ldots)$, such that, at each point $x$ of a bounded and measurable set $E$, $|f_n(x)|$ is less than a constant $K$ independent of $n$, and $f_n(x)$ converges to a limit function $f(x)$.
THEOREM 8.7.7. If the sequence $\{f_n(x)\}$ converges boundedly to $f(x)$ in a bounded and measurable set $E$, then
$$\int_E f_n(x)\,dx \to \int_E f(x)\,dx \quad\text{as } n \to \infty.$$
Choose any positive number $\epsilon$. Let $E_1$ be the set of points at which $|f(x) - f_n(x)| < \epsilon$ for all $n$, and let $E_{k+1}$ be the set of points at which
$$|f(x) - f_k(x)| \ge \epsilon$$
and
$$|f(x) - f_n(x)| < \epsilon \quad\text{for } n \ge k+1.$$
Then the sets $\{E_k\}$ $(k = 1, 2, \ldots)$ are disjoint and their union is $E$. Hence, by Theorem 8.7.2,
$$\int_E f_n(x)\,dx = \sum_{k=1}^{\infty} \int_{E_k} f_n(x)\,dx.$$
By the addition theorems (8.7.2) and (8.7.5),
$$\int_E f(x)\,dx - \int_E f_n(x)\,dx = \int_E \{f(x) - f_n(x)\}\,dx = \sum_{k=1}^{n} \int_{E_k} \{f(x) - f_n(x)\}\,dx + \int_{F_n} \{f(x) - f_n(x)\}\,dx,$$
where $F_n = E - E_1 - E_2 - \cdots - E_n$. By the mean value theorem (8.7.1),
$$\left|\int_{E_k} \{f(x) - f_n(x)\}\,dx\right| \le \epsilon\,m(E_k) \quad\text{if } n \ge k,$$
whence
$$\left|\sum_{k=1}^{n} \int_{E_k} \{f(x) - f_n(x)\}\,dx\right| \le \epsilon \sum_{k=1}^{n} m(E_k) \le \epsilon\,m(E).$$
Since $f_n(x)$ converges boundedly to $f(x)$, there is a constant $B$ such that $|f(x) - f_n(x)| < B$ for all $n$ and all $x$ in $E$. Hence
$$\left|\int_{F_n} \{f(x) - f_n(x)\}\,dx\right| \le B\,m(F_n)$$
and
$$\left|\int_E f(x)\,dx - \int_E f_n(x)\,dx\right| \le \epsilon\,m(E) + B\,m(F_n).$$
Now
$$E_1 + E_2 + \cdots + E_n \to E \quad\text{as } n \to \infty,$$
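whence
$$m(F_n) \to 0.$$
Therefore
$$\limsup_{n\to\infty} \left|\int_E f(x)\,dx - \int_E f_n(x)\,dx\right| \le \epsilon\,m(E),$$
i.e.
$$\int_E f_n(x)\,dx \to \int_E f(x)\,dx \quad\text{as } n \to \infty.$$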
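The estimate $\epsilon\,m(E) + B\,m(F_n)$ obtained in this proof can be compared with an actual error in a simple case. The sketch assumes the sequence $f_n(x) = x^n$ on $E = [0, 1]$, which converges boundedly to a function equal to zero p.p.; for this sequence the exceptional set $F_n$ is the set where $x^n \ge \epsilon$ and $x < 1$, of measure $1 - \epsilon^{1/n}$, and one may take $B = 1$. These choices and the function names are illustrative assumptions.

```python
def integral_fn(n):
    """Exact integral of the assumed function f_n(x) = x**n over [0, 1]."""
    return 1.0 / (n + 1)


def measure_F(n, eps):
    """Measure of F_n = {x in [0, 1) : x**n >= eps} for 0 < eps < 1."""
    return 1.0 - eps ** (1.0 / n)


def bound(n, eps):
    """The estimate eps * m(E) + B * m(F_n), with m(E) = 1 and B = 1."""
    return eps * 1.0 + 1.0 * measure_F(n, eps)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # The limit function has integral 0, so the error is integral_fn(n) itself.
        print(n, integral_fn(n), bound(n, 0.01))
```

As $n$ increases, $m(F_n)$ tends to zero and the bound falls to $\epsilon\,m(E)$, while the integrals $1/(n+1)$ tend to the integral of the limit function, which is zero.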
8.8. The differentiability of the indefinite Lebesgue integral
To complete the discussion of the Lebesgue integral of a bounded measurable function $f(x)$ over a bounded measurable set of points $E$ we must return to the special and familiar case when $E$ is an interval $[a, t]$ and discuss the properties of the indefinite integral,
$$\phi(t) = \int_a^t f(x)\,dx,$$
regarded as a function of $t$ in some interval $[a, b]$. We shall prove that almost everywhere in $[a, b]$, $\phi(t)$ possesses a derivative $\phi'(t)$ equal to $f(t)$. This in fact is the best possible result that we can hope to prove, for if $\chi(x, E)$ is the indicator of Cantor's ternary set $E$ (§ 5.3), then
$$m(t) = \int_0^t \chi(x, E)\,dx = 0, \quad\text{if } 0 \le t \le 1;$$
whence
$$m'(t) = 0,$$
but
$$\chi(t, E) = 1$$
if $t \in E$, i.e. at a set of points of measure zero. We therefore begin by considering the simplest Lebesgue integral, i.e. the measure of a measurable set of points.
THEOREM 8.8.1. If $\chi(x)$ is the indicator of a set of points $E$ and
$$m(x) = \int_a^x \chi(t)\,dt,$$
then, almost everywhere, the measure function $m(x)$ has a derivative $m'(x)$ equal to $\chi(x)$.
Since
$$m(y) - m(x) = \int_x^y \chi(t)\,dt \ge 0,$$
if $y > x$, therefore
$$0 \le m(y) - m(x) \le y - x,$$
whence $m(x)$ is continuous and is a non-decreasing function of $x$. Therefore, by Theorem 5.7.5, $m(x)$ possesses a derivative $m'(x)$ almost everywhere and $0 \le m'(x) \le 1$.
It only remains to prove that, almost everywhere, $m'(x) = \chi(x)$. Let
$$m_n(x) = n\{m(x + 1/n) - m(x)\} \qquad (n = 1, 2, \ldots).$$
Then $m_n(x)$ converges boundedly to $m'(x)$ almost everywhere in any interval $[a, b]$. Therefore by the theorem of bounded convergence (8.7.7), if $a \le c \le b$,
$$\int_a^c m_n(x)\,dx \to \int_a^c m'(x)\,dx \quad\text{as } n \to \infty.$$
But
$$\int_a^c m_n(x)\,dx = n\int_a^c m(x + 1/n)\,dx - n\int_a^c m(x)\,dx = n\int_{a+1/n}^{c+1/n} m(x)\,dx - n\int_a^c m(x)\,dx = n\int_c^{c+1/n} m(x)\,dx - n\int_a^{a+1/n} m(x)\,dx,$$
for all $n > (b - c)^{-1}$. Now, by the mean value theorem (8.7.1), if
$$A = \inf m(x) \quad\text{and}\quad B = \sup m(x)$$
for $a \le x \le a + 1/n$, then
$$A/n \le \int_a^{a+1/n} m(x)\,dx \le B/n.$$
But, since $m(x)$ is continuous,
$$A \to m(a) \quad\text{and}\quad B \to m(a) \quad\text{as } n \to \infty.$$
Therefore
$$n\int_a^{a+1/n} m(x)\,dx \to m(a) \quad\text{as } n \to \infty.$$
Similarly
$$n\int_c^{c+1/n} m(x)\,dx \to m(c) \quad\text{as } n \to \infty.$$
Therefore
$$\int_a^c m'(x)\,dx = m(c) - m(a) = \int_a^c \chi(x)\,dx,$$
for any interval $[a, c]$. Hence, by Theorem 8.7.4,
$$m'(x) = \chi(x) \quad\text{p.p. in } [a, b].$$
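The difference quotients $m_n(x)$ used in this proof can be examined directly for a simple set. The sketch assumes $E = [1/4, 1/2]$ inside $[0, 1]$, with $a = 0$, so that $m(x)$ is the length of $E \cap [0, x]$; the example and the function names are hypothetical. Exact rational arithmetic shows where $m_n(x)$ reproduces the indicator and where it does not.

```python
from fractions import Fraction as F

LO, HI = F(1, 4), F(1, 2)  # the assumed set E = [1/4, 1/2]


def chi(x):
    """Indicator of E."""
    return 1 if LO <= x <= HI else 0


def m(x):
    """Measure function m(x): the length of the part of E lying in [0, x]."""
    return min(max(x - LO, 0), HI - LO)


def m_n(x, n):
    """Difference quotient n * {m(x + 1/n) - m(x)} from the proof of Theorem 8.8.1."""
    return n * (m(x + F(1, n)) - m(x))


if __name__ == "__main__":
    for x in (F(1, 8), F(3, 8), F(1, 2), F(3, 4)):
        # At x = 1/2 the quotient is 0 although chi = 1: an exceptional point.
        print(x, chi(x), m_n(x, 1000))
```

The only points at which $m'(x)$ fails to exist here are the two end-points of $E$, a set of measure zero, in agreement with the theorem.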
A point $\xi$ is said to be a point of metric density of the set $E$ with indicator $\chi(x)$ if the measure function
$$m(x) = \int_a^x \chi(t)\,dt$$
has a derivative $m'(\xi)$ at the point $\xi$. Hence we have proved that almost all points are points of metric density and that the value of the metric density is 1 if $x \in E$ or 0 if $x \notin E$.
We can now examine the differentiability of the Lebesgue integral of a bounded, measurable function $f(x)$ over a bounded interval $[a, b]$.
THEOREM 8.8.2. Almost everywhere in $[a, b]$ the integral
$$\phi(x) = \int_a^x f(t)\,dt \qquad (a < x < b)$$
possesses a derivative equal to $f(x)$.
If $\alpha(x, t)$ is the indicator of the set of points in $[a, b]$ at which $f(x) > t$ then the measure function
$$m(x) = \int_a^x \alpha(s, t)\,ds$$
has, almost everywhere in $[a, b]$, a derivative $m'(x)$ equal to $\alpha(x, t)$ by Theorem 8.8.1. Hence, if $\lambda(x)$ and $\mu(x)$ are the Lebesgue bracketing functions of Definition 8.5.1, the integrals
$$\int_a^x \lambda(s)\,ds \quad\text{and}\quad \int_a^x \mu(s)\,ds$$
have, almost everywhere in $[a, b]$, derivatives which are respectively equal to $\lambda(x)$ and $\mu(x)$.
But, by Theorem 8.7.3,
$$\int_x^{x+h} \lambda(s)\,ds \le \int_x^{x+h} f(s)\,ds \le \int_x^{x+h} \mu(s)\,ds.$$
Let
$$f_h(x) = \frac{1}{h} \int_x^{x+h} f(s)\,ds.$$
Then, at almost every point in $[a, b]$,
$$\lambda(x) \le \liminf_{h\to 0} f_h(x) \le \limsup_{h\to 0} f_h(x) \le \mu(x).$$
But we can choose the bracketing functions so that
$$\lambda(x) \le f(x) \le \mu(x),$$
and $\mu(x) - \lambda(x) < 1/n$ for each tolerance $1/n > 0$. Hence, for each positive integer $n$,
$$f(x) - 1/n \le \liminf_{h\to 0} f_h(x) \le \limsup_{h\to 0} f_h(x) \le f(x) + 1/n,$$
except in a set of measure zero. The union of an enumerable collection of sets of measure zero has measure zero. Hence, almost everywhere in $[a, b]$,
$$\limsup_{h\to 0} f_h(x) = \liminf_{h\to 0} f_h(x) = f(x),$$
i.e.
$$D^+\phi(x) = f(x) = D_+\phi(x).$$
Similarly we can prove that $D^-\phi(x) = f(x) = D_-\phi(x)$, whence the integral $\phi(x)$ possesses a derivative equal to $f(x)$ almost everywhere in $[a, b]$.
8.9. Exercises
1. If the set $E(f \ge t)$ is measurable, deduce the measurability of $E(f > t)$.
2. If $f(x)$ is measurable and $\Omega$ is an outer set of points, show that the set of points $E$ for which $f(x) \in \Omega$ is measurable.
3. If $f(x)$ is continuous and $g(x)$ is measurable, prove that $f[g(x)]$ is measurable.
4. Evaluate the indicator $\beta(x, \tau)$ of the indicator $\alpha(x, \sigma)$ of a measurable function $f(x)$ and evaluate
$$\int_0^1 \beta(x, \tau)\,dx.$$
5. If $f(x)$ is bounded and measurable in $[a, b]$, prove that for each tolerance $\epsilon > 0$ there exists a step function $g(x)$ such that
$$\int_a^b |f(x) - g(x)|\,dx \le \epsilon.$$
6. If the sequence of bounded, measurable functions $\{f_n(x)\}$ converges to a limit function $f(x)$ at all points of a measurable set $E$, prove that for each tolerance $\epsilon > 0$ there exists a measurable set $F$ such that $F \subset E$, $m(F) > m(E) - \epsilon$, and $f_n(x) \to f(x)$ uniformly in $F$ as $n \to \infty$.
9
The Lebesgue integral of summable functions
9.1. Introduction
In the preceding chapter we have given the analytic construction of the integral of a bounded, measurable function over a bounded measurable set by the methods of Lebesgue, and following in the path of de la Vallée-Poussin we have shown that the Lebesgue integral is a positive, linear, 'continuous' functional. Lastly we have shown that the indefinite integral
$$\varphi(x) = \int_a^x f(s)\,ds$$
has almost everywhere a derivative equal to f(x). So far as bounded functions and bounded intervals are concerned it does appear that the Lebesgue integral furnishes the most general concept of an integral, but it is clearly desirable to extend this concept to unbounded functions and to unbounded intervals. Here the methods of Lebesgue do not meet with such complete success. In fact the theory applies only to the class of functions now called 'summable',† and for further generalizations it is necessary to employ the more powerful methods of Denjoy, Perron, Ward, and Henstock.
9.2. Summable functions
We have proved in Theorems 8.6.1 and 8.6.2 that the Lebesgue integral of a function f(x), bounded and measurable in a bounded and measurable set E, can be expressed as a Young integral in the form
$$\int_E f(x)\,dx = A\,m(E) + \int_A^B m_E(t,f)\,dt,$$
where $A \le f(x) \le B$ if $x \in E$, and $m_E(t, f)$ is the measure of the set of points in E at which f(x) > t, or in the form
$$\int_E f(x)\,dx = B\,m(E) - \int_A^B \mu_E(t,f)\,dt,$$
where $\mu_E(t, f)$ is the measure of the set of points in E at which f(x) < t.
† In Lebesgue's first paper (1902) this adjective is equivalent to 'integrable'.
These expressions cannot be immediately generalized to unbounded functions or to unbounded intervals, for, in the first term, either A or B or both will be infinite. This particular difficulty disappears if we restrict ourselves to non-negative functions, for, if $f(x) \ge 0$, then
$$\int_E f(x)\,dx = \int_0^B m_E(t,f)\,dt.$$
We are now left with the problem of defining the Young integral for an unbounded measurable function f(x) and an unbounded measurable set E. By Definition 7.5.1 the integrand $m_E(t, f)$ is given in terms of $m_{E_s}(t, f)$, the corresponding measure for the part of E in the interval $E_s$ $(-s \le x \le s)$, as
$$m_E(t,f) = \lim_{s\to\infty} m_{E_s}(t,f).$$
Hence $m_E(t, f)$, like $m_{E_s}(t, f)$, is a non-increasing function of t. The difficulties in framing a definition are that the upper limit B may be infinite and the integrand $m_E(t, f)$ may not be bounded. However, it is clear that we can restrict ourselves to the case when $m_E(t, f)$ is bounded in any interval $0 < \epsilon \le t$.
For in the alternative case, for some positive number $\tau$, $m_{E_s}(\tau, f)$ would tend to infinity as $s \to \infty$ and so also would $m_{E_s}(t, f)$ for $0 \le t \le \tau$. The Young integral would therefore be infinite. For example, if
$$f(x) = \frac{x}{1+x} \ \text{for } x \ge 0 \quad\text{and}\quad f(x) = 0 \ \text{for } x < 0,$$
then $0 \le f(x) \le 1$ and
$$m_{E_s}(t,f) = s - \frac{t}{1-t},$$
if this is positive.
Thus $m_{E_s}(t,f) \to \infty$ as $s \to \infty$ for all t with $0 \le t < 1$. If $m_E(t, f)$ is bounded in any interval $0 < \epsilon \le t$, i.e. if $m_E(t, f)$ is finite for all positive values of t, we can start with the Young integral
$$\int_\epsilon^B m_E(t,f)\,dt$$
and consider the effects of allowing $\epsilon$ to tend to zero and B to tend to infinity. We are thus led to the following definition.
DEFINITION 9.2.1. A non-negative function f(x) is 'summable' over the bounded or unbounded, measurable set E, if the Young integral
$$\int_\epsilon^B m_E(t,f)\,dt$$
converges to a finite limit as $\epsilon$ tends to zero and B tends independently to infinity.
THEOREM 9.2.1. If f(x) is non-negative and measurable over a measurable set E, and $f(x) \le \varphi(x)$, where $\varphi(x)$ is summable over E, then f(x) is summable over E.
For
$$m_E(t,f) \le m_E(t,\varphi),$$
and
$$\int_\epsilon^B m_E(t,f)\,dt \le \int_\epsilon^B m_E(t,\varphi)\,dt.$$
Hence the integral on the left must converge as $\epsilon \to 0$ and $B \to \infty$.
DEFINITION 9.2.2. The Lebesgue integral over E of a non-negative function f(x) summable over E is
$$\int_E f(x)\,dx = \lim_{B\to\infty,\ \epsilon\to 0} \int_\epsilon^B m_E(t,f)\,dt.$$
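Definition 9.2.2 can be illustrated on an unbounded function. The sketch below assumes the example $f(x) = x^{-1/2}$ on E = (0, 1], which is not taken from the text; for it the set where f(x) > t has measure $\min(1, t^{-2})$, and the truncated Young integral over $[\epsilon, B]$ should approach $\int_0^1 x^{-1/2}\,dx = 2$. The function names and the midpoint rule are illustrative choices.

```python
def measure_above(t):
    """m_E(t, f) for f(x) = x**-0.5 on E = (0, 1]: length of {x in E : f(x) > t}."""
    return 1.0 if t < 1.0 else 1.0 / (t * t)


def young_integral(eps, big_b, steps=100_000):
    """Midpoint-rule approximation of the integral of m_E(t, f) over [eps, big_b]."""
    width = (big_b - eps) / steps
    total = 0.0
    for i in range(steps):
        total += measure_above(eps + (i + 0.5) * width)
    return total * width


if __name__ == "__main__":
    # eps shrinks and B grows independently; the values approach 2.
    for eps, big_b in ((0.1, 10.0), (0.01, 100.0), (0.001, 1000.0)):
        print(eps, big_b, young_integral(eps, big_b))
```

For $\epsilon < 1 < B$ the exact value is $(1-\epsilon) + (1 - 1/B)$, so the two limits may indeed be taken independently.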
Consider, for example, the function
$$f(x) = \frac{\sin^2 x}{x^2} \ \text{if } x \ne 0, \qquad f(0) = 0,$$
for which B = 1. The set of points at which f(x) > t is obviously covered by the set of points at which $x^{-2} > t$. Hence, if E is the semi-infinite interval $0 \le x < \infty$, then
$$m_E(t,f) < t^{-1/2}$$
and
$$\int_\epsilon^B m_E(t,f)\,dt < \int_\epsilon^1 t^{-1/2}\,dt = 2(1-\epsilon^{1/2}).$$
Thus the function $x^{-2}\sin^2 x$ is summable over $(0 \le x < \infty)$.
To extend these definitions to functions that take both positive and negative values we introduce the concept of the positive and negative parts of a function.
DEFINITION 9.2.3. The positive and negative parts of a function f(x) defined on a set E are the non-negative functions, defined also on E, such that
$$f^+(x) = \begin{cases} f(x) & (f(x) \ge 0),\\ 0 & (f(x) < 0),\end{cases} \quad\text{and}\quad f^-(x) = \begin{cases} -f(x) & (f(x) < 0),\\ 0 & (f(x) \ge 0).\end{cases}$$
THEOREM 9.2.2.
(a) $f(x) = f^+(x)-f^-(x)$, $|f(x)| = f^+(x)+f^-(x)$.
(b) If f(x) is bounded and measurable on a measurable set E, so also are $f^+(x)$ and $f^-(x)$ and
$$\int_E f(x)\,dx = \int_E f^+(x)\,dx - \int_E f^-(x)\,dx, \qquad \int_E |f(x)|\,dx = \int_E f^+(x)\,dx + \int_E f^-(x)\,dx.$$
Since $f^+(x) = \max(f, 0)$ and $f^-(x) = \max(-f, 0)$, $f^+(x)$ and $f^-(x)$ are bounded and measurable by Theorem 8.2.3, and the expressions for the integrals of f(x) and |f(x)| follow from the addition theorem for functions (8.7.5).
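The decomposition into positive and negative parts is easy to check numerically. The following sketch assumes the test function sin x on $[0, 2\pi]$ and a plain midpoint rule, neither of which comes from the text; it confirms that the integral of f is the difference, and the integral of |f| the sum, of the integrals of the two parts.

```python
import math


def positive_part(value):
    """f+ = max(f, 0)."""
    return max(value, 0.0)


def negative_part(value):
    """f- = max(-f, 0); note that it is non-negative."""
    return max(-value, 0.0)


def midpoint(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    lo, hi = 0.0, 2.0 * math.pi
    plus = midpoint(lambda x: positive_part(math.sin(x)), lo, hi)
    minus = midpoint(lambda x: negative_part(math.sin(x)), lo, hi)
    print("integral of f   ~", plus - minus)
    print("integral of |f| ~", plus + minus)
```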
DEFINITION 9.2.4. The function f(x) is summable over E if its positive and negative parts $f^+(x)$ and $f^-(x)$ are each summable over E.
DEFINITION 9.2.5. If f(x) is summable over E its Lebesgue integral over E is
$$\int_E f(x)\,dx = \int_E f^+(x)\,dx - \int_E f^-(x)\,dx.$$
THEOREM 9.2.3. If f(x) is non-negative and summable over the measurable set I and if I covers the measurable set J, then f(x) is also summable over J, and
$$\int_I f(x)\,dx \ge \int_J f(x)\,dx.$$
For the set $E\{x;\ f(x) > t,\ x \in J\}$ is the intersection of the measurable sets $E\{x;\ f(x) > t,\ x \in I\}$ and J, and hence is measurable. Also
$$E\{x;\ f(x) > t,\ x \in I\} \supseteq E\{x;\ f(x) > t,\ x \in J\};$$
whence
$$m_I(t,f) \ge m_J(t,f) \quad\text{and}\quad \int_0^\infty m_I(t,f)\,dt \ge \int_0^\infty m_J(t,f)\,dt.$$
Two important consequences of these definitions relate to sets of zero measure.
THEOREM 9.2.4. If f(x) is any function, bounded or unbounded, measurable or not, and if E is any set of measure zero, then
$$\int_E f(x)\,dx = 0.$$
For
$$m_E(t,f^+) \le m(E) = 0 \quad\text{and}\quad m_E(t,f^-) \le m(E) = 0.$$
Hence
$$\int_\epsilon^B m_E(t,f^+)\,dt = 0 = \int_\epsilon^B m_E(t,f^-)\,dt,$$
and therefore
$$\int_E f(x)\,dx = 0,$$
by Definition 9.2.5.
THEOREM 9.2.5. If f(x) is summable over a measurable set E and Z is a subset of E of measure zero, then
$$\int_E f(x)\,dx = \int_F f(x)\,dx,$$
where F is the complement of Z with respect to E.
We need only prove the theorem for $f^+(x)$. Now
$$m_E(t,f^+) = m_F(t,f^+)+m_Z(t,f^+) \quad\text{and}\quad m_Z(t,f^+) = 0.$$
Hence
$$\int_E f^+(x)\,dx = \int_0^\infty m_E(t,f^+)\,dt = \int_0^\infty m_F(t,f^+)\,dt = \int_F f^+(x)\,dx.$$
Hence in calculating a Lebesgue integral over E we can always neglect the contribution from a set of points in E of measure zero.
DEFINITION 9.2.6. Two functions f(x) and g(x), each summable over E, are said to be 'equivalent' in E if f(x) = g(x) almost everywhere in E.
THEOREM 9.2.6. If f(x) and g(x) are each summable over E and f(x) and g(x) are equivalent in E, then
$$\int_E f(x)\,dx = \int_E g(x)\,dx.$$
9.3. The Lebesgue integral of summable functions as a positive, linear, 'continuous' functional
To justify the definition of the Lebesgue integral of a summable function we must show that, as in the case of a bounded measurable function, it is a positive, linear, continuous functional, but now 'continuity' is taken in the very general sense of the theorem of 'dominated convergence' (9.3.7).
THEOREM 9.3.1. If f(x) is non-negative and summable over a measurable set E, then
$$\int_E f(x)\,dx \ge 0.$$
For in this case
$$\int_E f(x)\,dx = \int_E f^+(x)\,dx = \int_0^\infty m_E(t,f^+)\,dt \ge 0.$$
Hence the Lebesgue integral of a summable function is a positive functional.
THEOREM 9.3.2 (the addition theorem for sets). If the bounded measurable set E is the union of a finite or enumerable collection of disjoint, measurable sets $E_n$ (n = 1, 2, ...) and if f(x) is summable over each set $E_n$ then it is also summable over E and
$$\int_E f(x)\,dx = \sum_{n=1}^{\infty} \int_{E_n} f(x)\,dx.$$
It is clearly sufficient to prove this theorem for $f^+(x)$. Let F be the set of points in E at which
$$0 < \epsilon \le f^+(x) \le B$$
and $F_n$ be the corresponding set of points in $E_n$. Then, by Theorems 8.7.2 and 9.2.3,
$$\int_F f^+(x)\,dx = \sum_{k=1}^{\infty} \int_{F_k} f^+(x)\,dx \le \sum_{k=1}^{\infty} \int_{E_k} f^+(x)\,dx.$$
Hence, proceeding to the limit $\epsilon \to 0$, $B \to \infty$,
$$\int_E f^+(x)\,dx \le \sum_{k=1}^{\infty} \int_{E_k} f^+(x)\,dx.$$
Also
$$\int_F f^+(x)\,dx \ge \sum_{k=1}^{p} \int_{F_k} f^+(x)\,dx.$$
Hence, proceeding to the same limit,
$$\int_E f^+(x)\,dx \ge \sum_{k=1}^{p} \int_{E_k} f^+(x)\,dx.$$
Finally, let p tend to infinity. Then
$$\int_E f^+(x)\,dx \ge \sum_{k=1}^{\infty} \int_{E_k} f^+(x)\,dx.$$
The two bounds we have obtained for the integral of $f^+(x)$ over E establish the theorem.
THEOREM 9.3.3 (the addition theorem for summable functions). If the functions f(x) and g(x) are each summable over the measurable set E, then so also is their sum $\varphi(x) = f(x)+g(x)$ and
$$\int_E \varphi(x)\,dx = \int_E f(x)\,dx + \int_E g(x)\,dx.$$
We first prove the theorem for non-negative functions f(x) and g(x). In the notation of § 8.2 let
$$F = E(0 < \epsilon \le f \le B), \qquad G = E(0 < \epsilon \le g \le B),$$
I = intersection of F and G, U = union of F and G.
Then $I \subseteq F$, $G \subseteq U$. In the abbreviated notation,
$$\int_I f = \int_I f(x)\,dx,$$
etc., it is clear that
$$\int_I f + \int_I g \le \int_F f + \int_G g \le \int_U f + \int_U g.$$
Now f(x), g(x), and $\varphi(x)$ are bounded and measurable in each of the bounded and measurable sets I, F, G, and U. Hence, by Theorem 8.7.5,
$$\int_I f + \int_I g = \int_I \varphi \quad\text{and}\quad \int_U f + \int_U g = \int_U \varphi.$$
Let $\epsilon \to 0$ and $B \to \infty$. Then
$$\int_F f \to \int_E f \quad\text{and}\quad \int_G g \to \int_E g.$$
Hence $\int_I \varphi$ is bounded for all $I \subseteq E$ and therefore $\varphi$ is summable over E. Thus
$$\int_E \varphi \le \int_E f + \int_E g \le \int_E \varphi;$$
whence the theorem follows for non-negative functions f, g, and $\varphi$.
In the general case, when f(x) and g(x) may each be of variable sign, we note first that
$$|\varphi(x)| \le |f(x)|+|g(x)|.$$
Now |f(x)| and |g(x)| are each summable over E, whence so also is their sum |f(x)|+|g(x)|, as we have just proved. Therefore, by Theorem 9.2.1, $|\varphi(x)|$ is summable over E, and so also is $\varphi(x)$. We replace the usual argument, due to de la Vallée-Poussin, by the following concise proof, due to John Wright. By Theorem 9.2.1, $\varphi^+$ and $\varphi^-$ are each summable over E. Now
$$\varphi^+ + f^- + g^- = \varphi^- + f^+ + g^+,$$
whence, by applying the above result twice,
$$\int_E \varphi^+ + \int_E f^- + \int_E g^- = \int_E \varphi^- + \int_E f^+ + \int_E g^+.$$
Therefore
$$\int_E \varphi = \int_E f + \int_E g.$$
THEOREM 9.3.4. If f(x) and g(x) are each summable over a measurable set E and if a, b are any real numbers, then
$$\int_E (af+bg) = a\int_E f + b\int_E g.$$
In the notation of Theorem 9.3.3,
$$\int_E af = \lim \int_F af = a \lim \int_F f = a\int_E f.$$
Hence, by Theorem 9.3.3,
$$a\int_E f + b\int_E g = \int_E af + \int_E bg = \int_E (af+bg).$$
Thus the Lebesgue integral of summable functions is a positive, linear functional. Clearly there can be no mean value theorem for unbounded functions but there are a number of related theorems.
THEOREM 9.3.5. If f(x) and g(x) are each summable over a bounded measurable set E, and if $f(x) \ge g(x)$ in E, then
$$\int_E f(x)\,dx \ge \int_E g(x)\,dx.$$
For, if $f(x)-g(x) = \varphi(x)$ then $\varphi(x) \ge 0$ in E, and, by Theorem 9.3.1,
$$\int_E \varphi(x)\,dx \ge 0.$$
Hence, by Theorem 9.3.3,
$$\int_E f(x)\,dx = \int_E g(x)\,dx + \int_E \varphi(x)\,dx \ge \int_E g(x)\,dx.$$
THEOREM 9.3.6. If f(x) is measurable over a measurable set E, and
$$|f(x)| \le \varphi(x),$$
where $\varphi(x)$ is summable over E, then f(x) is summable over E.
For if F is the set of points in E at which
$$0 < \epsilon \le f^+(x) \le B,$$
then, by Theorems 9.3.5 and 9.2.3,
$$\int_F f^+(x)\,dx \le \int_F \varphi(x)\,dx \le \int_E \varphi(x)\,dx.$$
Hence the integral of $f^+(x)$ over F converges as $\epsilon \to 0$, $B \to \infty$. Therefore $f^+(x)$ is summable over E. Similarly $f^-(x)$ is summable over E, and so also is f(x).
Finally we come to the triumphant climax of this series of theorems: Lebesgue's great theorem of 'dominated convergence'.
THEOREM 9.3.7. If $\{f_n(x)\}$ (n = 1, 2, ...) is a sequence of measurable functions defined in a bounded measurable set E, if
$$|f_n(x)| \le \varphi(x)$$
for each n, where $\varphi(x)$ is summable in E, and if $f_n(x)$ converges pointwise in E as $n \to \infty$ to a function f(x), then f(x) is summable in E and
$$\lim_{n\to\infty} \int_E f_n(x)\,dx = \int_E f(x)\,dx.$$
Since $|f_n(x)| \le \varphi(x)$ for each n, $|f(x)| \le \varphi(x)$, and hence by Theorem 9.3.6, f(x) is summable in E. As in the proof of Theorem 8.7.7 we follow the concise proof given by de la Vallée-Poussin.
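Let $g_n(x) = |f(x)-f_n(x)|$ and let $\epsilon$ be any positive number. Then, by Theorems 9.3.4 and 9.2.2, $g_n(x)$ is summable in E. Let
$$E_1 = E\{x;\ \epsilon > g_1(x), g_2(x), \ldots\},$$
$$E_2 = E\{x;\ g_1(x) \ge \epsilon > g_2(x), g_3(x), \ldots\},$$
$$E_{k+1} = E\{x;\ g_k(x) \ge \epsilon > g_{k+1}(x), g_{k+2}(x), \ldots\}.$$
Then the sets $\{E_k\}$ (k = 1, 2, ...) form an enumerable collection of disjoint, measurable sets whose union is E. Hence by the addition theorem for sets (9.3.2),
$$\int_E g_n = \int_{E_1} g_n + \int_{E_2} g_n + \ldots + \int_{E_n} g_n + \int_{F_n} g_n,$$
where $F_n$ is the union of $E_{n+1}, E_{n+2}, \ldots$. Since $0 \le g_n < \epsilon$ in each of the sets $E_1, E_2, \ldots, E_n$ it follows that
$$0 \le \int_{E_k} g_n \le \epsilon\, m(E_k) \quad\text{if } k = 1, 2, \ldots, n.$$
Also
$$0 \le \int_{F_n} g_n \le 2\int_{F_n} \varphi,$$
and
$$\int_{F_n} \varphi = \int_E \varphi - \int_{E_1} \varphi - \int_{E_2} \varphi - \ldots - \int_{E_n} \varphi \to 0 \quad\text{as } n \to \infty,$$
whence
$$\int_{F_n} g_n \to 0 \quad\text{as } n \to \infty.$$
Therefore
$$\limsup_{n\to\infty} \int_E g_n \le \epsilon\, m(E_1)+\epsilon\, m(E_2)+\ldots \le \epsilon\, m(E).$$
Since m(E) is finite and this inequality is true for all $\epsilon > 0$,
$$\int_E g_n \to 0 \quad\text{as } n \to \infty.$$
Thus
$$\left|\int_E f_n(x)\,dx - \int_E f(x)\,dx\right| \le \int_E g_n(x)\,dx \to 0 \quad\text{as } n \to \infty.$$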
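A simple numerical illustration of dominated convergence may be helpful. The sketch below assumes the sequence $f_n(x) = x^n$ on E = [0, 1], dominated by the summable function $\varphi(x) = 1$; this example is not in the text. The pointwise limit is zero except at the single point x = 1, so the theorem predicts that the integrals tend to zero, and indeed they equal 1/(n+1).

```python
def integral_of_power(n, steps=100_000):
    """Midpoint-rule approximation of the integral of x**n over [0, 1]."""
    width = 1.0 / steps
    return sum(((i + 0.5) * width) ** n for i in range(steps)) * width


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Compare with the exact value 1 / (n + 1), which tends to 0.
        print(n, integral_of_power(n), 1.0 / (n + 1))
```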
COROLLARY. The theorem of dominated convergence is also true for any measurable set E, bounded or unbounded.
Let $E_s$ be the intersection of E and the finite interval $-s \le x \le s$. Then
$$\int_E |f-f_n| = A_{n,s}+B_{n,s},$$
where
$$A_{n,s} = \int_{E-E_s} |f-f_n| \quad\text{and}\quad B_{n,s} = \int_{E_s} |f-f_n|.$$
For all n,
$$0 \le A_{n,s} \le 2\int_E \varphi - 2\int_{E_s} \varphi.$$
Hence there exists an integer $\sigma(\epsilon)$ such that $A_{n,s} \le \epsilon$ for $s \ge \sigma(\epsilon)$ and for all n. Also there exists an integer $\nu(s, \epsilon)$ such that $B_{n,s} \le \epsilon$ for $n \ge \nu(s, \epsilon)$. Hence there exists an integer $\nu(\sigma, \epsilon)$ dependent only on $\epsilon$ such that $A_{n,\sigma}+B_{n,\sigma} \le 2\epsilon$ for $n > \nu(\sigma, \epsilon)$. Therefore
$$\lim_{n\to\infty} \int_E |f-f_n| = 0$$
and the theorem is established.
Thus the Lebesgue integral of summable functions is a continuous functional. The existence of a dominant function $\varphi(x)$ such that $|f_n(x)| \le \varphi(x)$ and $\varphi(x)$ is summable over an unbounded measurable interval E is sufficient but not necessary for the convergence of the sequence
$$\int_E f_n(x)\,dx \quad\text{to the limit}\quad \int_E \lim f_n(x)\,dx.$$
A simple counter example is given by
$$f_n(x) = \begin{cases} x^{-1} & (n-\tfrac12 < x < n+\tfrac12),\\ 0 & \text{otherwise.}\end{cases}$$
Then $f_n(x) \to 0$ for all x and
$$\int_0^\infty f_n(x)\,dx = \ln\frac{2n+1}{2n-1} \to 0 \quad\text{as } n \to \infty.$$
But if $\varphi(x) \ge |f_n(x)|$ for all n, then $\varphi(x) \ge x^{-1}$ for almost all $x > \tfrac12$, and hence no summable dominant function exists.
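The counter example can be checked with a few lines of arithmetic. The sketch below uses the closed forms of the two integrals involved; the function names are illustrative. The integrals of $f_n$ shrink to zero, while the integral of $x^{-1}$ over $(\tfrac12, B)$, which any dominant function must exceed, grows without bound as B increases.

```python
import math


def integral_fn(n):
    """Exact integral of f_n(x) = 1/x over (n - 1/2, n + 1/2)."""
    return math.log((2 * n + 1) / (2 * n - 1))


def envelope_integral(upper):
    """Exact integral of 1/x over (1/2, upper): a lower bound for any dominant."""
    return math.log(2.0 * upper)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print("n =", n, "integral of f_n =", integral_fn(n))
    for upper in (10.0, 1e3, 1e6):
        print("B =", upper, "integral of envelope =", envelope_integral(upper))
```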
9.4. The Lebesgue integral as a primitive
We have already proved in Theorem 8.8.2 that if f(x) is a bounded, measurable function in an interval [a, b] then the indefinite Lebesgue integral
$$\varphi(x) = \int_a^x f(t)\,dt$$
possesses, almost everywhere in [a, b], a derivative $\varphi'(x)$ equal to f(x). We can now extend this result to summable functions. We need a preliminary lemma, due to Fatou, on sequences of functions $\{f_n(x)\}$ which converge to a limit f(x) (but which do not possess dominated convergence).
THEOREM 9.4.1 (Fatou's lemma). If $\{f_n(x)\}$ is any sequence of non-negative functions, each summable in a bounded measurable set E, and if $f_n(x) \to f(x)$ pointwise in E, and
$$\liminf_{n\to\infty} \int_E f_n(x)\,dx < \infty,$$
then f(x) is summable over E and
$$\liminf_{n\to\infty} \int_E f_n(x)\,dx \ge \int_E f(x)\,dx.$$
Let
$$f_{n,k} = \min(f_n, k) \quad (k = 1, 2, \ldots).$$
Then, as $n \to \infty$, $f_{n,k} \to \min(f, k) = \varphi_k$, say. Hence, by Theorem 8.7.7,
$$\int_E f_{n,k} \to \int_E \varphi_k.$$
But $f_{n,k} \le f_n$, whence, by Theorem 9.3.5,
$$\int_E f_{n,k} \le \int_E f_n.$$
Therefore
$$\liminf \int_E f_n \ge \liminf \int_E f_{n,k} = \lim \int_E f_{n,k} = \int_E \varphi_k.$$
Since this is true for all k and since
$$\int_0^k m(t,f)\,dt = \int_E \varphi_k \le \liminf \int_E f_n,$$
it follows that f is summable and that
$$\int_E f \le \liminf \int_E f_n.$$
This theorem is the 'best possible' with the prescribed restrictions on the functions $f_n(x)$, for we can easily construct an example where the sign of inequality must be taken. Let
$$f_n(x) = \begin{cases} n^2 x & (0 \le x \le 1/n),\\ 0 & (1/n < x).\end{cases}$$
Then $f_n(x) \to 0$ for each value of x. Hence $\int_0^1 f = 0$. But
$$\int_0^1 f_n = \int_0^{1/n} n^2 x\,dx = \tfrac12 > \int_0^1 f.$$
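The strict inequality in this example can be observed numerically. The following is a minimal sketch, assuming a midpoint rule on [0, 1] with a fixed number of steps (an illustrative choice, not part of the text): every integral of $f_n$ stays at one half although $f_n(x)$ itself tends to zero at each fixed x.

```python
def f_n(x, n):
    """The function n^2 * x on [0, 1/n], and zero for x > 1/n."""
    return n * n * x if x <= 1.0 / n else 0.0


def integral_f_n(n, steps=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    width = 1.0 / steps
    return sum(f_n((i + 0.5) * width, n) for i in range(steps)) * width


if __name__ == "__main__":
    for n in (1, 10, 100):
        # The integral stays near 1/2 while the value at x = 0.3 drops to 0.
        print(n, integral_f_n(n), f_n(0.3, n))
```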
A special case of Fatou's lemma (9.4.1) is worthy of note, viz. the 'monotone convergence theorem'.
THEOREM 9.4.2. If $\{f_n(x)\}$ is any monotone, non-decreasing sequence of non-negative functions, each summable in a bounded, measurable set E, $f_n(x) \to f(x)$ pointwise in E, and
$$\lim_{n\to\infty} \int_E f_n(x)\,dx < \infty,$$
then f(x) is summable over E and
$$\int_E f_n(x)\,dx \to \int_E f(x)\,dx.$$
For by Fatou's lemma (9.4.1), f(x) is summable over E. Hence
$$\int_E f_n(x)\,dx \le \int_E f(x)\,dx \le \liminf_{n\to\infty} \int_E f_n(x)\,dx \ \text{(by 9.4.1)} = \lim_{n\to\infty} \int_E f_n(x)\,dx,$$
since the sequence $\{\int_E f_n(x)\,dx\}$ is monotonic; therefore
$$\lim_{n\to\infty} \int_E f_n(x)\,dx = \int_E f(x)\,dx.$$
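The monotone convergence theorem can be illustrated with truncations of an unbounded function. The sketch below assumes $f(x) = x^{-1/2}$ on (0, 1] and the truncations $f_n = \min(f, n)$, an example chosen for illustration rather than taken from the text. The exact integral of $f_n$ is $2 - 1/n$, which increases to $\int_0^1 f = 2$.

```python
def truncated(x, n):
    """f_n(x) = min(x**-0.5, n) for 0 < x <= 1."""
    return min(x ** -0.5, float(n))


def integral_truncated(n, steps=200_000):
    """Midpoint-rule approximation of the integral of f_n over (0, 1]."""
    width = 1.0 / steps
    return sum(truncated((i + 0.5) * width, n) for i in range(steps)) * width


def exact_truncated(n):
    """Closed form: n * (1/n^2) + (2 - 2/n) = 2 - 1/n."""
    return 2.0 - 1.0 / n


if __name__ == "__main__":
    for n in (1, 2, 10):
        print(n, integral_truncated(n), exact_truncated(n))
```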
Before studying the differentiation of an indefinite integral, it is a simpler problem to study the integration of a derivative. The surprising result is given by
THEOREM 9.4.3. If $\varphi(x)$ is continuous and non-decreasing in [a, b] then its derivative $\varphi'(x)$ is summable over (a, b) and
$$\int_a^b \varphi'(x)\,dx \le \varphi(b)-\varphi(a).$$
By Theorem 5.7.5 the incrementary ratio
$$\varphi(x, n) = n\varphi(x+1/n)-n\varphi(x) \quad (n = 1, 2, \ldots)$$
converges to a limit $\varphi'(x)$ as $n \to \infty$ almost everywhere in (a, b), i.e. at a set of points E with measure m(E) = b-a. Hence, by Fatou's lemma, $\varphi'(x)$ is summable over E and
$$\int_E \varphi'(x)\,dx \le \liminf_{n\to\infty} \int_E \varphi(x,n)\,dx.$$
Now, by Theorem 9.2.5,
$$\int_a^b \varphi'(x)\,dx = \int_E \varphi'(x)\,dx,$$
while
$$\int_E \varphi(x,n)\,dx = \int_a^b \varphi(x,n)\,dx = n\int_a^b \varphi(x+1/n)\,dx - n\int_a^b \varphi(x)\,dx = n\int_b^{b+1/n} \varphi(x)\,dx - n\int_a^{a+1/n} \varphi(x)\,dx$$
if we define $\varphi(x)$ as equal to $\varphi(b)$ in the interval $b \le x \le b+1/n$. Now
$$n\int_b^{b+1/n} \varphi(x)\,dx = \varphi(b) \quad\text{and}\quad n\int_a^{a+1/n} \varphi(x)\,dx \ge \varphi(a).$$
Hence
$$\int_a^b \varphi'(x)\,dx \le \liminf_{n\to\infty} \int_a^b \varphi(x,n)\,dx \le \varphi(b)-\varphi(a).$$
Once again we note that this is the 'best possible' result, for there are continuous and non-decreasing functions $\varphi(x)$ for which
$$\int_a^b \varphi'(x)\,dx < \varphi(b)-\varphi(a).$$
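The text does not name such a function; a standard example, offered here as an assumption, is the Cantor function on [0, 1], which is continuous and non-decreasing, rises from 0 to 1, and yet has derivative zero almost everywhere. The sketch below evaluates it from the ternary expansion of x (the helper name `cantor` and the depth limit are illustrative) and shows that the incrementary ratio vanishes at a point outside the Cantor set.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function on [0, 1] from the ternary digits of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            # x lies in a removed middle third, where the function is constant.
            return value + scale
        value += scale * (digit // 2)
        scale /= 2.0
    return value


if __name__ == "__main__":
    print("total rise:", cantor(1.0) - cantor(0.0))
    # Incrementary ratio at x = 0.5, well inside the removed interval (1/3, 2/3).
    print("ratio at 0.5:", (cantor(0.51) - cantor(0.5)) / 0.01)
```

Here the integral of the derivative is 0 while $\varphi(1)-\varphi(0) = 1$, so the inequality of Theorem 9.4.3 is strict.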
To investigate the differentiability of the indefinite Lebesgue integral
$$\varphi(x) = \int_a^x f(t)\,dt$$
we can, as usual, restrict ourselves to a non-negative function f(x), but we need to verify the continuity of $\varphi(x)$ when f(x) is unbounded but summable.
THEOREM 9.4.4. If f(x) is non-negative and summable over the interval [a, b], then the Lebesgue integral
$$\varphi(x) = \int_a^x f(t)\,dt$$
is continuous for all x in [a, b].
Let
$$f_n(x) = \begin{cases} f(x) & (f(x) \le n),\\ n & (f(x) > n).\end{cases}$$
Then $f_n(x) \to f(x)$ as $n \to \infty$, and, by Theorem 9.3.7,
$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx \quad\text{as } n \to \infty.$$
Hence to any tolerance $\epsilon > 0$ there corresponds an integer n such that
$$0 \le \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx < \tfrac12\epsilon.$$
Now $f_n(x) \le f(x)$, whence, if $a \le \alpha < \beta \le b$, then
$$0 \le \int_\alpha^\beta f(x)\,dx - \int_\alpha^\beta f_n(x)\,dx < \tfrac12\epsilon$$
and
$$0 \le \int_\alpha^\beta f(x)\,dx < n(\beta-\alpha)+\tfrac12\epsilon.$$
Therefore, if $\beta-\alpha < \epsilon/2n$, then
$$0 \le \varphi(\beta)-\varphi(\alpha) < \epsilon,$$
i.e. $\varphi(x)$ is continuous for all x in [a, b].
THEOREM 9.4.5. If f(x) is summable over the interval [a, b] then the Lebesgue integral
$$\varphi(x) = \int_a^x f(t)\,dt$$
possesses almost everywhere in (a, b) a derivative $\varphi'(x)$ equal to f(x).
The function $f(x)-f_n(x)$ is non-negative and summable, and the integral
$$\int_a^x \{f(t)-f_n(t)\}\,dt = \varphi(x)-\varphi(a)-\int_a^x f_n(t)\,dt$$
is a non-decreasing, continuous function of x. Hence, by Theorem 5.7.5, it possesses almost everywhere a derivative which is non-negative. By the same theorem the integrals
$$\int_a^x f(t)\,dt \quad\text{and}\quad \int_a^x f_n(t)\,dt$$
are differentiable almost everywhere. Now
$$\varphi'(x) = \frac{d}{dx}\int_a^x f_n(t)\,dt + \frac{d}{dx}\int_a^x \{f(t)-f_n(t)\}\,dt \quad\text{p.p.}$$
and $f(t) \ge f_n(t)$. Therefore, by Theorem 8.8.2,
$$\varphi'(x) \ge \frac{d}{dx}\int_a^x f_n(t)\,dt = f_n(x) \quad\text{p.p.}$$
and
$$\varphi'(x) \ge \lim_{n\to\infty} f_n(x) = f(x) \quad\text{p.p.}$$
But, by Theorem 9.3.3,
$$\int_a^b \{\varphi'(x)-f(x)\}\,dx = \int_a^b \varphi'(x)\,dx - \int_a^b f(x)\,dx \le 0.$$
Hence, by the null integral theorem (8.7.4),
$$\varphi'(x) = f(x) \quad\text{p.p. in } (a, b).$$
9.5. Exercises
1. Show that the function
$$f(x) = \begin{cases} \dfrac{\sin x}{x} & (x \ne 0),\\ 1 & (x = 0)\end{cases}$$
is not summable over the interval $(0 \le x < \infty)$.
2. If the non-negative functions $f_n(x)$ (n = 1, 2, ...) are each summable over a measurable set E, and if $f_n(x) \le f_{n+1}(x)$, prove that the limit function
$$f(x) = \lim_{n\to\infty} f_n(x)$$
is summable over E and that
$$\int_E f_n(x)\,dx \to \int_E f(x)\,dx \quad\text{as } n \to \infty.$$
3. If $f_n(x)$ denotes the truncated function
$$f_n(x) = \begin{cases} f(x) & \text{if } 0 \le f(x) \le n,\\ n & \text{if } n < f(x)\end{cases} \quad (n = 1, 2, \ldots)$$
prove that $(f+g)_n \le f_n+g_n \le (f+g)_{2n}$, and deduce the addition theorem (9.3.3) for functions.
4. The function
$$f(x) = 2x\sin(1/x^2)-(2/x)\cos(1/x^2)$$
is the derivative of $x^2\sin(1/x^2)$. Explain why f(x) is not summable over [0, 1].
5. Examine the sequence of integrals
$$\int_0^\infty f_n(x)\,dx$$
in the light of the theorems of dominated and monotone convergence if
(i) $f_n(x) = nxe^{-nx}$,
(ii) $f_n(x) = 2n^2e^{-n^2x^2}$,
(iii) $f_n(x) = \varphi(nx)$,
(iv) $f_n(x) = \varphi(x-n)$,
where $\varphi(x)$ is summable over $(0, \infty)$.
LEBESGUE THEORY IN d DIMENSIONS
10
Multiple integrals
10.1. Introduction
It is possible to develop the theory of Lebesgue measure and integration in d-dimensional Euclidean space from the very beginning, but in the interests of intelligibility we have so far restricted our exposition to one dimension. We have, however, phrased the terminology, the notation, the definitions, and the theorems so that most of them are applicable to d-dimensional space.
10.2. Elementary sets in d dimensions
DEFINITION 10.2.1. In d dimensions, with coordinates $(x_1, x_2, \ldots, x_d)$, an 'interval' is the Cartesian product of the linear intervals
$$a_k \prec x_k \prec b_k \quad (k = 1, 2, \ldots, d)$$
where $a_k$, $b_k$ are finite or infinite numbers for all k.
DEFINITION 10.2.2. An 'elementary set', with indicator $\sigma$, is the union of a finite number of disjoint intervals, with indicators $\sigma_1, \sigma_2, \ldots, \sigma_n$ such that
$$\sigma_j\sigma_k = 0 \quad\text{if } j \ne k,$$
and $\sigma = \sigma_1+\sigma_2+\ldots+\sigma_n$. The intervals $\{\sigma_s\}$ are called the 'components' of $\sigma$.
THEOREM 10.2.1. The intersection, union, difference, and symmetric difference of two elementary sets are also elementary sets.
The intersection of two intervals
$$(a_k \prec x_k \prec b_k) \quad\text{and}\quad (a'_k \prec x_k \prec b'_k)$$
is an interval of the form
$$(\max(a_k, a'_k) \prec x_k \prec \min(b_k, b'_k)).$$
The proof for the intersection of two elementary sets then follows as in Theorem 6.2.1.
If the interval $\tau$ is covered by an interval $\omega$, then the complement $\omega-\tau$ is an elementary set. For if $\tau$ is the interval $(a_k \prec x_k \prec b_k)$ (k = 1, 2, ..., d) then the 2d hyperplanes, $x_k = a_k$ and $x_k = b_k$, divide the interval $\omega$ into $3^d$ intervals, one of which may be taken to be $\tau$ and the remainder of which form an elementary set, i.e. the complement $\omega-\tau$ is an elementary set.
If the elementary set $\tau$ is covered by another elementary set $\sigma$ then the complement $\sigma-\tau$ is an elementary set. For
$$\sigma-\tau = \sigma-\sigma\tau = \sum_s \sigma_s(1-\tau),$$
where $\{\sigma_s\}$ are the component intervals of $\sigma$. Now $\sigma_s$ and $1-\tau$ are elementary sets, and so is their intersection $\sigma_s(1-\tau)$. Also $\sigma_s(1-\tau)\cdot\sigma_t(1-\tau) = 0$, i.e. the elementary sets $\sigma_s(1-\tau)$ are disjoint. Thus $\sigma-\tau$ is the union of a finite number of disjoint intervals and is therefore an elementary set.
It follows as in Theorem 6.2.1 that, if $\sigma$ and $\tau$ are elementary sets, so also is their union $\sigma \cup \tau$, the differences $\sigma-\sigma\tau$, $\tau-\sigma\tau$, and the symmetric difference $\sigma \,\Delta\, \tau$. Thus elementary sets in d dimensions form an algebra, as in one dimension.
THEOREM 10.2.2. If $\sigma$ and $\tau$ are any pair of elementary sets then there exists a finite collection of disjoint intervals $\{\gamma_s\}$ such that
$$\sigma = \sum_s a_s\gamma_s, \qquad \tau = \sum_t b_t\gamma_t, \qquad \sigma \cup \tau = \sum_s \gamma_s,$$
where each coefficient $a_s$ or $b_t$ is either 0 or 1.
The proof is exactly the same as in Theorem 6.2.2.
DEFINITION 10.2.3. The geometric measure $g(\alpha)$ of an interval $\alpha$ $(a_k \prec x_k \prec b_k)$ (k = 1, 2, ..., d) is its area (if d = 2), volume (if d = 3), or hypervolume (if d > 3), and
$$g(\alpha) = \prod_{k=1}^{d} (b_k-a_k).$$
THEOREM 10.2.3. If the interval $\omega$ is the union of a finite number N of disjoint intervals $\{\alpha_s\}$ then
$$g(\omega) = \sum_{s=1}^{N} g(\alpha_s).$$
If $\alpha_s$ is the interval $(a_k^{(s)} \prec x_k \prec b_k^{(s)})$ (k = 1, 2, ..., d) then the 2dN hyperplanes, $x_k = a_k^{(s)}$, $x_k = b_k^{(s)}$, divide the interval $\omega$ into a finite number of intervals, $\omega_1, \omega_2, \ldots$, which can be enumerated so that the interior of $\omega_s$ coincides with the interior of $\alpha_s$ for s = 1, 2, ..., N. Hence
$$g(\omega) = \sum_{s=1}^{N} g(\alpha_s).$$
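The geometric measure and its additivity are simple to compute. The sketch below assumes an interval is represented as a list of (a_k, b_k) pairs with finite end-points; this representation and the helper `split_box` are illustrative and not from the text. Cutting a box by a hyperplane $x_k = c$ gives two disjoint boxes whose measures add up to that of the original.

```python
from math import prod


def geometric_measure(box):
    """g(box) = product of the edge lengths (b_k - a_k)."""
    return prod(b - a for a, b in box)


def split_box(box, axis, cut):
    """Cut the box by the hyperplane x_axis = cut into two disjoint boxes."""
    a, b = box[axis]
    if not a < cut < b:
        raise ValueError("the cut must lie strictly inside the chosen edge")
    left, right = list(box), list(box)
    left[axis] = (a, cut)
    right[axis] = (cut, b)
    return left, right


if __name__ == "__main__":
    omega = [(0.0, 2.0), (1.0, 4.0), (0.0, 0.5)]
    part_one, part_two = split_box(omega, axis=1, cut=2.0)
    print(geometric_measure(omega))
    print(geometric_measure(part_one) + geometric_measure(part_two))
```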
10.3. Lebesgue theory in d dimensions
From this point onwards the d-dimensional theory follows, almost word for word, the one-dimensional theory of §§ 6.2, 6.3, 6.4, 6.5, and 6.6. Outer and inner measure are then defined and discussed exactly as in Chapter 7, and the Lebesgue integral as in Chapters 8 and 9, with one exception noted below and with the convention that the symbols
$$\int\!\!\int\!\ldots\!\int f(x)\,dx_1\,dx_2 \ldots dx_d, \qquad \int_a^b f(x)\,dx, \qquad \int_R f(x)\,dx, \qquad\text{or}\qquad \int_R f(x)$$
now represent the integral of $f(x_1, x_2, \ldots, x_d)$ over the interval R
$$a_k \le x_k \le b_k \quad (k = 1, 2, \ldots, d).$$
The reduction of the d-dimensional theory to the one-dimensional case is facilitated by Lebesgue's concept of integration over a measurable set and by Young's integral
$$\int_E f(x)\,dx = \int m_E(t,f)\,dt,$$
which reduces the d-dimensional integral of f(x) over E to the one-dimensional integral of its measure function over the range of f(x).
The one exception is that we must exclude the results concerning the differentiability of the indefinite Lebesgue integral
$$\varphi(t) = \int_a^t f(x)\,dx,$$
which is crucially dependent on the monotone and continuous character of $\varphi(t)$ as a function of the single variable t for non-negative functions f(x).
There is, however, one new problem that arises in the multidimensional theory, which is most clearly exhibited in the case of two dimensions. This is the problem of the relation between the integral
$$\int\!\!\int_R f(x,y)\,dx\,dy$$
over the rectangle R $(a \le x \le b,\ p \le y \le q)$ and the integrals
$$g(y) = \int_a^b f(x,y)\,dx, \qquad h(x) = \int_p^q f(x,y)\,dy, \qquad \int_p^q g(y)\,dy, \qquad \int_a^b h(x)\,dx.$$
Elementary calculus suggests that if f(x, y) is bounded and continuous in R, then the multiple integral can be expressed as a repeated integral in the forms
$$\int\!\!\int_R f(x,y)\,dx\,dy = \int_p^q g(y)\,dy = \int_a^b h(x)\,dx,$$
but there are a number of well-known examples which show that these relations are not true for all unbounded functions:
(1) If
$$f(x,y) = \frac{x^2-y^2}{(x^2+y^2)^2}$$
except at the origin, where f = 0, and if R is the square $(0 \le x \le 1,\ 0 \le y \le 1)$, then
$$g(y) = -(1+y^2)^{-1}, \qquad h(x) = (1+x^2)^{-1},$$
$$\int_0^1 g(y)\,dy = -\tfrac14\pi, \qquad \int_0^1 h(x)\,dx = \tfrac14\pi.$$
The set of points (x, y) at which f(x, y) > t > 0 occupies the interior of half of one of the loops of the lemniscate
$$tr^2 = \cos 2\theta,$$
whence
$$m(t,f) = \tfrac14 t^{-1}$$
and the Young integral
$$\int_0^\infty m(t,f)\,dt$$
is not convergent, so that f(x, y) is not summable over R.
(2) If $\varphi(z) = pz^p/(1+z^{2p})$ (p > 1), z = xy, $f(x, y) = d\varphi(z)/dz$, and R is the rectangle $(0 < x < \infty,\ 0 < y < c)$, then
$$g(y) = 0, \qquad h(x) = x^{-1}\varphi(cx),$$
$$\int_0^c g(y)\,dy = 0, \qquad \int_0^\infty h(x)\,dx = \tfrac12\pi$$
(Hobson, vol. ii, p. 351). If $f^+(x, y)$ and $f^-(x, y)$ are the positive and negative parts of $\varphi'(z)$ then
$$\int_0^\infty f^+\,dx = \tfrac12 p/y = \int_0^\infty f^-\,dx \quad (\text{if } y > 0).$$
Hence the integral of $\varphi'(z)$ is not absolutely convergent.
Similar problems arise in the case of integrals over three or more dimensions but for the sake of clarity we shall restrict our exposition to integrals in two dimensions.
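Example (1) can be checked numerically. The sketch below assumes the integrand $(x^2-y^2)/(x^2+y^2)^2$ on the unit square and uses a plain midpoint rule; the helper names are illustrative. It verifies one inner integral against the closed form g(y) and then shows that the two repeated integrals come out as $-\pi/4$ and $+\pi/4$.

```python
import math


def f(x, y):
    """The integrand of example (1), away from the origin."""
    return (x * x - y * y) / (x * x + y * y) ** 2


def g(y):
    """Closed form of the inner integral of f over 0 <= x <= 1."""
    return -1.0 / (1.0 + y * y)


def h(x):
    """Closed form of the inner integral of f over 0 <= y <= 1."""
    return 1.0 / (1.0 + x * x)


def midpoint(func, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    print("inner integral at y = 0.5:", midpoint(lambda x: f(x, 0.5), 0.0, 1.0), g(0.5))
    print("integral of g:", midpoint(g, 0.0, 1.0), -math.pi / 4)
    print("integral of h:", midpoint(h, 0.0, 1.0), math.pi / 4)
```

The disagreement of the two repeated integrals is consistent with the conclusion that f is not summable over the square.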
10.4. Fubini's theorem stated
We begin by giving a general statement of the theory which is due to Fubini. We consider a function f(x, y) which is non-negative and summable over the whole of the (x, y)-plane. To apply the theorem to a summable function taking both positive and negative signs we have only to consider its positive and negative parts separately. Again, to apply the theorem to the integral of f(x, y) over a bounded measurable set E we introduce the function $\varphi(x, y)$ defined by the relations
$$\varphi(x,y) = \begin{cases} f(x,y) & ((x,y) \in E),\\ 0 & ((x,y) \notin E).\end{cases}$$
With the non-negative, summable function f(x, y) we associate its indicator function
$$\alpha(x,y,a) = \begin{cases} 1 & (f(x,y) > a),\\ 0 & (f(x,y) \le a),\end{cases}$$
and its two-dimensional measure function
$$m(a) = \int\!\!\int \alpha(x,y,a)\,dx\,dy.$$
We prove that if y is a fixed number then f(x, y), regarded as a function of x, is summable for almost all values of y and has a one-dimensional measure function
$$\eta(y,a) = \int \alpha(x,y,a)\,dx.$$
Similarly we prove that if x is a fixed number then f(x, y), regarded as a function of y, is summable for almost all values of x and has a one-dimensional measure function
$$\xi(x,a) = \int \alpha(x,y,a)\,dy.$$
We then prove that
$$\int \xi(x,a)\,dx = m(a) = \int \eta(y,a)\,dy.$$
Turning to the function f(x, y) we next prove that the integrals
$$g(y) = \int f(x,y)\,dx \quad\text{and}\quad h(x) = \int f(x,y)\,dy$$
exist for almost all values of y and x respectively, and that
$$\int g(y)\,dy = \int\!\!\int f(x,y)\,dx\,dy = \int h(x)\,dx.$$
Fubini's theorem presupposes the existence of the double integral $\int\!\!\int f(x,y)\,dx\,dy$ and then expresses this double integral as a pair of repeated integrals
$$I = \int dy\left\{\int f(x,y)\,dx\right\} \quad\text{and}\quad J = \int dx\left\{\int f(x,y)\,dy\right\}.$$
But, in many problems of analysis, we can presuppose the existence of one repeated integral I and we require conditions under which I is equal to the other repeated integral J. Sufficient conditions are provided by Tonelli's theorem in the form that if f(x, y) is non-negative and measurable over the (x, y)-plane, and if the integral I exists, then so also does the integral J and I = J.
10.5. Fubini's theorem for indicators
THEOREM 10.5.1. If the set of points E with indicator $\alpha(x, y)$ has finite two-dimensional measure
$$\mu = \int\!\!\int \alpha(x,y)\,dx\,dy$$
then the set of points A on the line x = s with indicator $\alpha(s, y)$ has finite one-dimensional measure
$$\mu(s) = \int \alpha(s,y)\,dy$$
for almost all values of s. Also $\mu(s)$ is integrable over s and
$$\int \mu(s)\,ds = \mu.$$
(i) If E is the interval $(a \prec x \prec b,\ p \prec y \prec q)$, then
$$\mu(s) = \begin{cases} q-p & \text{if } a \prec s \prec b,\\ 0 & \text{otherwise}\end{cases}$$
and
$$\int \mu(s)\,ds = (b-a)(q-p) = \mu.$$
(ii) If E is an elementary set $\alpha(x, y, E)$, i.e. the union of a finite collection of disjoint intervals $\{E_n\}$, then $E_n$ will have a one-dimensional measure
$$\mu_n(s) = \int \alpha(s,y,E_n)\,dy$$
and a two-dimensional measure
$$\mu_n = \int \mu_n(s)\,ds.$$
E has the one-dimensional measure
$$\mu(s) = \int \alpha(s,y,E)\,dy = \sum_n \mu_n(s)$$
and the two-dimensional measure
$$\mu = \int\!\!\int \alpha(s,y,E)\,ds\,dy = \int\!\!\int \sum_n \alpha(s,y,E_n)\,ds\,dy.$$
Hence
$$\mu = \int \mu(s)\,ds.$$
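Steps (i) and (ii) can be traced in a short computation. The sketch below assumes an elementary set given as a list of disjoint rectangles (a, b, p, q), each read as half-open in x, i.e. $a \le x < b$; this representation and the function names are illustrative and not from the text. The slice measure $\mu(s)$ is a step function of s, so its integral is obtained exactly by summing over the intervals between consecutive vertical edges.

```python
def slice_measure(rects, s):
    """mu(s): total length of the section of the set by the line x = s."""
    return sum(q - p for (a, b, p, q) in rects if a <= s < b)


def integrate_slices(rects):
    """Integral of mu(s) ds, exact because mu(s) is constant between edges."""
    cuts = sorted({edge for (a, b, _, _) in rects for edge in (a, b)})
    total = 0.0
    for left, right in zip(cuts, cuts[1:]):
        total += slice_measure(rects, 0.5 * (left + right)) * (right - left)
    return total


def area(rects):
    """Two-dimensional measure: the sum of (b - a)(q - p) over the components."""
    return sum((b - a) * (q - p) for (a, b, p, q) in rects)


if __name__ == "__main__":
    elementary_set = [(0, 1, 0, 2), (1, 3, 0, 1), (0, 1, 3, 4)]
    print("mu(0.5) =", slice_measure(elementary_set, 0.5))
    print("integral of mu(s) =", integrate_slices(elementary_set))
    print("two-dimensional measure =", area(elementary_set))
```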
(iii) If E is a bounded, outer set, i.e. the union of an enumerable collection of disjoint intervals $\{E_n\}$ (n = 1, 2, ...), then each $E_n$ has a one-dimensional measure $\mu(s, E_n)$ and, by Theorem 7.4.3, in two dimensions, E also has a one-dimensional measure
$$\mu(s,E) = \sum_{n=1}^{\infty} \mu(s,E_n).$$
Hence, by the theorem of bounded convergence,
$$\int \mu(s,E)\,ds = \sum_{n=1}^{\infty} \int \mu(s,E_n)\,ds = \sum_{n=1}^{\infty} \mu(E_n), \ \text{by (ii)}, \ = \mu(E).$$
(iv) If F is an inner set, the complement of an outer set E with respect to a bounded interval I $(a \prec x \prec b,\ p \prec y \prec q)$, then
$$\mu(s,F) = \int \{\alpha(s,y,I)-\alpha(s,y,E)\}\,dy = \begin{cases} (q-p)-\mu(s,E) & \text{if } s \in I,\\ 0 & \text{otherwise.}\end{cases}$$
Hence
$$\int \mu(s,F)\,ds = \int_I (q-p)\,ds - \int_I \mu(s,E)\,ds = \mu(I)-\mu(E) = \mu(F).$$
(v) If G is a bounded, measurable set, then there exist two monotone sequences of bracketing sets $\{E_n\}$ and $\{F_n\}$ such that each $E_n$ is an outer set, each $F_n$ is an inner set,
$$E_n \supset E_{n+1} \supset G \supset F_{n+1} \supset F_n,$$
and
$$\mu(E_n)-\mu(F_n) < \epsilon_n,$$
where $\{\epsilon_n\}$ is any sequence of positive numbers converging to zero as $n \to \infty$. Let $\mu(s, E_n)$ and $\mu(s, F_n)$ be the one-dimensional measures of $E_n$ and $F_n$ respectively, and let
$$\mu_n(s) = \mu(s,E_n)-\mu(s,F_n).$$
Then
$$\mu_n(s) \ge \mu_{n+1}(s) \ge 0.$$
Hence, as $n \to \infty$, $\mu_n(s)$ converges to a non-negative function $\mu(s)$. Now
$$\int \mu_n(s)\,ds = \int \mu(s,E_n)\,ds - \int \mu(s,F_n)\,ds = \mu(E_n)-\mu(F_n), \ \text{by (iv)}, \ < \epsilon_n.$$
Therefore, by the monotone convergence theorem,
$$\int \mu(s)\,ds = \lim_{n\to\infty} \int \mu_n(s)\,ds = 0.$$
But $\mu(s) \ge 0$. Hence, by the null integral theorem (8.7.4),
$$\mu(s) = 0 \quad\text{p.p.}$$
Now
$$\alpha(s,y,F_n) \le \alpha(s,y,G) \le \alpha(s,y,E_n)$$
and
$$\int \alpha(s,y,E_n)\,dy - \int \alpha(s,y,F_n)\,dy = \mu(s,E_n)-\mu(s,F_n) = \mu_n(s).$$
Therefore for almost all s, $\alpha(s, y, G)$ is integrable with respect to y, i.e. for almost all s, G has a one-dimensional measure $\mu(s, G)$ and
$$\mu(s,F_n) \le \mu(s,G) \le \mu(s,E_n).$$
Hence, $\mu(s, G)$ is integrable with respect to s and
$$\mu(F_n) = \int \mu(s,F_n)\,ds \le \int \mu(s,G)\,ds \le \int \mu(s,E_n)\,ds = \mu(E_n).$$
Therefore
$$\int \mu(s,G)\,ds = \mu(G).$$
(vi) If H is an unbounded, measurable set, let $G_n$ be the intersection of H with the interval $-n \le x \le n$, $-n \le y \le n$. Then
$$\mu(H) = \lim_{n\to\infty} \mu(G_n) = \lim_{n\to\infty} \int \mu(s,G_n)\,ds = \int \lim_{n\to\infty} \mu(s,G_n)\,ds = \int \mu(s,H)\,ds.$$
This completes the proof of Fubini's theorem for indicators of sets E with finite two-dimensional measure.
10.6. Fubini's theorem for summable functions
THEOREM 10.6.1. If f(x, y) is non-negative, bounded, and measurable over the interval I $(a \le x \le b,\ p \le y \le q)$, then f(x, y) is also measurable over the line $(a \le x \le b,\ y = \text{constant})$ for almost all values of y, and if
$$\int_a^b f(x,y)\,dx = g(y) \quad (\text{p.p. in } y)$$
then g(y) is measurable over the line $p \le y \le q$ and
$$\int_p^q g(y)\,dy = \int\!\!\int_I f(x,y)\,dx\,dy.$$
Fubini's theorem for indicators (10.5.1) applies to the indicator $\alpha(x, y, t)$ of the function $f(x, y)$, which has the integral
\[ m(t) = \iint_I \alpha(x, y, t)\,dx\,dy, \]
which is the two-dimensional measure of the set $E\{x, y;\ f(x, y) > t\}$. Hence the integral
\[ \eta(y, t) = \int_a^b \alpha(x, y, t)\,dx \]
exists for almost all $y$ and gives the measure of the set
\[ E\{x;\ f(x, y) > t,\ y = \text{constant}\}. \]
Also
\[ m(t) = \int_p^q \eta(y, t)\,dy. \]
Now consider the Lebesgue bracketing function $\lambda(x, y)$ of $f(x, y)$ defined in 8.5.1 in the form
\[ \lambda(x, y) = \sum_{k=0}^{n-1} t_k \{\alpha(x, y, t_k) - \alpha(x, y, t_{k+1})\}. \]
Since Fubini's theorem applies to the indicator $\alpha(x, y, t_k)$ it follows that the integral
\[ \int_a^b \lambda(x, y)\,dx \]
exists for almost all $y$ and that
\[ \int_p^q dy \int_a^b \lambda(x, y)\,dx = \iint_I \lambda(x, y)\,dx\,dy. \]
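Similarly, if $\mu(x, y)$ is the upper bracketing function,
\[ \int_p^q dy \int_a^b \mu(x, y)\,dx = \iint_I \mu(x, y)\,dx\,dy. \]
Now
\[ \lambda(x, y) \le f(x, y) \le \mu(x, y) \]
and
\[ \int_a^b \mu(x, y)\,dx - \int_a^b \lambda(x, y)\,dx \le \epsilon (b-a) \quad \text{p.p. in } y, \]
where
\[ \epsilon = \max(t_{k+1} - t_k) \quad (k = 0, 1, \ldots, n-1). \]
Hence $\int_a^b f(x, y)\,dx$ exists for almost all values of $y$ and
\[ \int_a^b \lambda(x, y)\,dx \le \int_a^b f(x, y)\,dx \le \int_a^b \mu(x, y)\,dx. \]
Also
\[ \int_p^q dy \int_a^b \lambda(x, y)\,dx \le \int_p^q dy \int_a^b f(x, y)\,dx \le \int_p^q dy \int_a^b \mu(x, y)\,dx. \]
Since the two outer repeated integrals equal the double integrals of $\lambda$ and $\mu$ over $I$, which differ by at most $\epsilon(b-a)(q-p)$ and bracket the double integral of $f$, it follows that
\[ \int_p^q dy \int_a^b f(x, y)\,dx = \iint_I f(x, y)\,dx\,dy. \]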
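The statement of Theorem 10.6.1 can be checked numerically on a simple case. The following is only a sketch under stated assumptions: the integrand $f(x, y) = x y^2$, the rectangle $0 \le x \le 1$, $0 \le y \le 2$, and the helper `midpoint_integral` are illustrative choices that do not appear in the text, and the midpoint rule stands in for the Lebesgue integral, which is legitimate here only because the integrand is continuous.

```python
def midpoint_integral(func, lo, hi, steps=400):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))


def f(x, y):
    # Hypothetical integrand: non-negative, bounded and continuous on I.
    return x * y * y


a, b, p, q = 0.0, 1.0, 0.0, 2.0


def g(y):
    # g(y) = integral of f(x, y) dx over [a, b], for fixed y.
    return midpoint_integral(lambda x: f(x, y), a, b)


def h(x):
    # h(x) = integral of f(x, y) dy over [p, q], for fixed x.
    return midpoint_integral(lambda y: f(x, y), p, q)


integral_of_g = midpoint_integral(g, p, q)   # repeated integral, x first
integral_of_h = midpoint_integral(h, a, b)   # repeated integral, y first
exact = 4.0 / 3.0                            # (1/2) * (8/3), the double integral
```

Both repeated integrals agree with the exact double integral $\tfrac{4}{3}$ to within the discretization error of the rule.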
THEOREM 10.6.2. If $f(x, y)$ is non-negative and summable over the $(x, y)$-plane, $I$, then $f(x, y)$ is also summable over the line ($-\infty < x < \infty$, $y$ = constant) for almost all values of $y$, and if
\[ g(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \]
then $g(y)$ is summable over the line $-\infty < y < \infty$ and
\[ \int_{-\infty}^{\infty} g(y)\,dy = \iint_I f(x, y)\,dx\,dy. \]
Define the truncated function $f_n(x, y)$ as follows:
\[ f_n = \begin{cases} f, & \text{if } |x| \le n,\ |y| \le n, \text{ and } 0 \le f \le n, \\ n, & \text{if } |x| \le n,\ |y| \le n, \text{ and } f > n, \\ 0, & \text{if } |x| > n \text{ or } |y| > n. \end{cases} \]
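Then $f_n(x, y)$ is bounded and measurable and, by Theorem 10.6.1, the integral
\[ g_n(y) = \int_{-\infty}^{\infty} f_n(x, y)\,dx \]
exists for almost all $y$ and
\[ \int_{-\infty}^{\infty} g_n(y)\,dy = \iint_I f_n(x, y)\,dx\,dy. \]
Now, by the theorem of monotone convergence (9.4.2), the integral
\[ g(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \]
exists as the limit
\[ \int_{-\infty}^{\infty} \lim_{n \to \infty} f_n(x, y)\,dx = \lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x, y)\,dx = \lim_{n \to \infty} g_n(y). \]
Similarly the integral
\[ \iint_I f(x, y)\,dx\,dy \]
exists as the limit
\[ \iint_I \lim_{n \to \infty} f_n(x, y)\,dx\,dy = \lim_{n \to \infty} \iint_I f_n(x, y)\,dx\,dy = \lim_{n \to \infty} \int_{-\infty}^{\infty} g_n(y)\,dy = \int_{-\infty}^{\infty} \lim_{n \to \infty} g_n(y)\,dy = \int_{-\infty}^{\infty} g(y)\,dy. \]
This completes the proof of Fubini's theorem.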
10.7. Tonelli's theorem
Fubini's theorem can be used to establish the following result, due to Tonelli, which is often of much more practical utility.
THEOREM 10.7.1. If $f(x, y)$ is a non-negative, measurable function of $(x, y)$ over the $x, y$ plane $I$ ($-\infty < x < \infty$, $-\infty < y < \infty$) and if one repeated integral
\[ \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} f(x, y)\,dx \]
exists, then so does the other repeated integral
\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} f(x, y)\,dy \]
and the two repeated integrals are equal.
If $f_n(x, y)$ is the truncated function of Theorem 10.6.2 then $f_n(x, y)$ is non-negative, bounded, and measurable, and the integrals
\[ \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} f_n\,dx, \qquad \iint_I f_n\,dx\,dy, \]
each exist and are equal. The existence of the repeated integral
\[ \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} f\,dx \]
implies the existence of
\[ g(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \]
for almost all $y$, and the existence of $\int_{-\infty}^{\infty} g(y)\,dy$. Now, by the monotone convergence theorem (9.4.2),
\[ \int_{-\infty}^{\infty} g(y)\,dy = \int_{-\infty}^{\infty} \lim_{n \to \infty} g_n(y)\,dy = \lim_{n \to \infty} \int_{-\infty}^{\infty} g_n(y)\,dy = \lim_{n \to \infty} \iint_I f_n(x, y)\,dx\,dy \quad \text{(by 10.6.1)} \quad = \iint_I \lim_{n \to \infty} f_n(x, y)\,dx\,dy, \]
i.e. the integral $\iint_I f\,dx\,dy$ exists and equals $\int_{-\infty}^{\infty} g(y)\,dy$.
We can now use Theorem 10.6.2 to establish the existence of
\[ h(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \]
almost everywhere in $x$ and to prove that
\[ \iint_I f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} h(x)\,dx. \]
Combining these results we see that
\[ \int_{-\infty}^{\infty} h(x)\,dx = \int_{-\infty}^{\infty} g(y)\,dy, \]
which is the theorem to be proved.
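The hypothesis that $f(x, y)$ is non-negative cannot simply be dropped. A standard illustration, which is not taken from the text and is stated here on the unit square rather than the whole plane, is the following function of variable sign, for which both repeated integrals exist but differ:

```latex
f(x,y) = \frac{x^{2}-y^{2}}{(x^{2}+y^{2})^{2}}
       = \frac{\partial}{\partial y}\left(\frac{y}{x^{2}+y^{2}}\right)
       = -\frac{\partial}{\partial x}\left(\frac{x}{x^{2}+y^{2}}\right),
       \qquad 0 < x < 1,\ 0 < y < 1,

\int_{0}^{1} dx \int_{0}^{1} f(x,y)\,dy = \int_{0}^{1} \frac{dx}{1+x^{2}} = \tfrac{1}{4}\pi,
\qquad
\int_{0}^{1} dy \int_{0}^{1} f(x,y)\,dx = -\int_{0}^{1} \frac{dy}{1+y^{2}} = -\tfrac{1}{4}\pi .
```

Here $|f|$ is not summable over the square, so neither Theorem 10.6.2 nor Theorem 10.7.1 is contradicted.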
10.8. Product sets
In order to discuss the geometric definition of the Lebesgue integral (§ 10.9) we shall need to consider the measurability of the 'product set', $G = E \times F$, of a set $E$ on the $x$-axis and a set $F$ on the $y$-axis.
DEFINITION 10.8.1. The product set $G = E \times F$ is the set of points with plane Cartesian coordinates $(x, y)$ such that $x \in E$ and $y \in F$.
DEFINITION 10.8.2. If $\alpha = \alpha(x)$ and $\beta = \beta(y)$ are the indicators of $E$ and $F$ respectively, then the indicator of $G$ will be denoted by $\alpha * \beta$. (The symbol $\sigma\tau$ has been used (Definition 4.2.1) to denote the intersection of the two collinear sets $\sigma$ and $\tau$. To prevent any misunderstanding we shall use the symbol $\alpha * \beta$ to denote the 'Cartesian product' of the set $\alpha(x)$ on the $x$-axis and the set $\beta(y)$ on the $y$-axis.)
THEOREM 10.8.1. If the sets $E$ and $F$ are bounded and measurable, so also is $G = E \times F$, and
\[ m(G) = m(E)\,m(F). \]
Let $\alpha = \alpha(x)$ and $\beta = \beta(y)$ be the indicators of $E$ and $F$ respectively. Then to any tolerance $\epsilon > 0$ there correspond outer sets $\sigma = \sigma(x)$ and $\tau = \tau(y)$ such that $\alpha \le \sigma$, $\beta \le \tau$ and
\[ m(\sigma) - \epsilon \le m(\alpha) \le m(\sigma), \qquad m(\tau) - \epsilon \le m(\beta) \le m(\tau). \]
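Now $\sigma$ and $\tau$ are respectively the unions of enumerable disjoint intervals $\{\sigma_p\}$, $\{\tau_q\}$, so that
\[ \sigma = \sum_p \sigma_p, \qquad \tau = \sum_q \tau_q, \]
and $\sigma_p * \tau_q$ is a rectangle whose edges are the intervals $\sigma_p$ and $\tau_q$. Hence $m(\sigma_p * \tau_q) = m(\sigma_p)\,m(\tau_q)$ and
\[ m(\sigma * \tau) = \sum_p \sum_q m(\sigma_p)\,m(\tau_q) = \sum_p m(\sigma_p) \sum_q m(\tau_q) = m(\sigma)\,m(\tau). \]
The product set $\alpha * \beta$ is covered by the open set $\sigma * \tau$, whence
\[ m^*(\alpha * \beta) \le m(\sigma * \tau) = m(\sigma)\,m(\tau) \le \{m(\alpha) + \epsilon\}\{m(\beta) + \epsilon\}. \]
This is true for each $\epsilon > 0$, therefore
\[ m^*(\alpha * \beta) \le m(\alpha)\,m(\beta). \]
Now the sets $\alpha$ and $\beta$ are each bounded. Hence there are intervals $A$ and $B$, lying on the $x$- and $y$-axes respectively, such that $\alpha \le \sigma \le A$ and $\beta \le \tau \le B$. The complement of the set $\alpha * \beta$ with respect to the rectangle $A * B$ is expressible as the union of three disjoint sets as
\[ A * B - \alpha * \beta = (A - \alpha) * (B - \beta) + (A - \alpha) * \beta + \alpha * (B - \beta). \]
By the result established above for outer measures
\[ m^*\{(A - \alpha) * (B - \beta)\} \le m(A - \alpha)\,m(B - \beta), \]
\[ m^*\{(A - \alpha) * \beta\} \le m(A - \alpha)\,m(\beta), \]
\[ m^*\{\alpha * (B - \beta)\} \le m(\alpha)\,m(B - \beta). \]
Therefore
\[ m^*(A * B - \alpha * \beta) \le m(A)\,m(B) - m(\alpha)\,m(\beta), \]
and
\[ m_*(\alpha * \beta) = m(A * B) - m^*(A * B - \alpha * \beta) \ge m(\alpha)\,m(\beta). \]
Thus
\[ m(\alpha)\,m(\beta) \le m_*(\alpha * \beta) \le m^*(\alpha * \beta) \le m(\alpha)\,m(\beta). \]
Hence $\alpha * \beta$ is measurable and $m(\alpha * \beta) = m(\alpha)\,m(\beta)$.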
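The first step of the proof, that the product of two unions of disjoint intervals is a union of disjoint rectangles whose areas sum to the product of the lengths, can be made concrete. The sketch below is illustrative only: it uses finite rather than enumerable unions, and the particular intervals and the helper names `length`, `product_rectangles`, and `area` are assumptions introduced for the example.

```python
from itertools import product


def length(intervals):
    """Total length of a finite list of disjoint intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)


def product_rectangles(sigma, tau):
    """Rectangles sigma_p * tau_q that make up the product set sigma * tau."""
    return [(sp, tq) for sp, tq in product(sigma, tau)]


def area(rectangles):
    """Sum of the areas m(sigma_p) m(tau_q) of disjoint rectangles."""
    return sum((x1 - x0) * (y1 - y0) for (x0, x1), (y0, y1) in rectangles)


# Hypothetical sets of disjoint intervals on the x-axis and on the y-axis.
sigma = [(0.0, 0.5), (1.0, 1.25)]
tau = [(0.0, 2.0), (3.0, 3.5), (4.0, 4.25)]

rects = product_rectangles(sigma, tau)
```

With these values `area(rects)` and `length(sigma) * length(tau)` coincide, which is the identity $m(\sigma * \tau) = m(\sigma)\,m(\tau)$ in the finite case.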
10.9. The geometric definition of the Lebesgue integral
The definition of the Lebesgue integral given by its inventor in his thesis and first paper (1902) was geometric rather than analytic in character, and it provides an illuminating approach, as can be seen from the clear and concise account given by Burkill (1953). The relation between the geometric and analytic definitions is given by the following definition and theorem.
DEFINITION 10.9.1. If $f(x)$ is a non-negative function defined in a set $E$, then the 'ordinate set' of $f(x)$ over $E$ is the set of points $(x, y)$ such that $x \in E$ and $0 \le y < f(x)$.
THEOREM 10.9.1. If $f(x)$ is non-negative, bounded, and measurable over a bounded, measurable set $E$, then the Lebesgue integral $\int_E f(x)\,dx$ is equal to the two-dimensional measure of the ordinate set of $f(x)$ over $E$.
In the notation of § 8.5 let the range $[0, B]$ of $f(x)$ be divided by the finite number of points
\[ 0 = t_0 < t_1 < t_2 < \ldots < t_n = B. \]
Let $\alpha(x, y)$ be the indicator of the ordinate set of $f(x)$ over $E$, and let $\eta_p(y)$ be the indicator of the set $t_p \le y < t_{p+1}$. Let
\[ \lambda(x, y) = \sum_{p=0}^{n-1} \alpha(x, t_{p+1}) * \eta_p(y) \]
and
\[ \mu(x, y) = \sum_{p=0}^{n-1} \alpha(x, t_p) * \eta_p(y). \]
Then
\[ \lambda(x, y) \le \alpha(x, y) \le \mu(x, y). \]
The sets $\alpha(x, t_p) * \eta_p(y)$ ($p = 0, 1, 2, \ldots, n-1$) are disjoint, and by Theorem 10.8.1 on product sets,
\[ m\{\alpha(x, t_p) * \eta_p(y)\} = m\{\alpha(x, t_p)\}\,m\{\eta_p(y)\} = m(t_p, f)(t_{p+1} - t_p), \]
where $m(t, f)$ is the measure of the set $\{x;\ f(x) > t\}$. Hence the measure of the set of points with indicator $\mu(x, y)$ is
\[ m(\mu) = \sum_{p=0}^{n-1} m(t_p, f)(t_{p+1} - t_p). \]
Similarly
\[ m(\lambda) = \sum_{p=0}^{n-1} m(t_{p+1}, f)(t_{p+1} - t_p) \]
and
\[ m(\mu) - m(\lambda) = \sum_{p=0}^{n-1} \{m(t_p, f) - m(t_{p+1}, f)\}(t_{p+1} - t_p) \le \epsilon \sum_{p=0}^{n-1} \{m(t_p, f) - m(t_{p+1}, f)\} \le \epsilon\,m(E), \]
if $\epsilon = \max(t_{p+1} - t_p)$ for $p = 0, 1, 2, \ldots, n-1$. Therefore
\[ m(\lambda) \le \int_0^B m(t, f)\,dt \le m(\mu) \]
and, since $\alpha \le \mu$,
\[ m^*(\alpha) \le m(\mu). \]
If $\omega$ is an interval covering $\alpha$, then
\[ \omega - \alpha \le \omega - \lambda \]
and, by the preceding result,
\[ m^*(\omega - \alpha) \le m(\omega - \lambda) = m(\omega) - m(\lambda), \]
whence
\[ m(\lambda) \le m(\omega) - m^*(\omega - \alpha) = m_*(\alpha). \]
Now
\[ m^*(\alpha) - m_*(\alpha) \le m(\mu) - m(\lambda) \le \epsilon\,m(E). \]
Since this holds for each $\epsilon > 0$,
\[ m^*(\alpha) = m_*(\alpha). \]
Thus the ordinate set $\alpha$ is measurable. Hence
\[ m(\lambda) \le m(\alpha) \le m(\mu) \]
and therefore
\[ m(\alpha) = \int_0^B m(t, f)\,dt. \]
If the Lebesgue integral is defined geometrically as in Theorem 10.9.1, then it is possible to give very compact proofs of the convergence theorems (8.7.7 and 9.3.7) and of Fubini's theorem, as in Burkill's monograph (1953).
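The bracketing of the ordinate set by $\lambda$ and $\mu$ can be followed on a concrete case. The sketch below is an illustration under stated assumptions, not part of the text: it takes $f(x) = x^2$ on $E = [0, 1]$, for which the measure function is $m(t, f) = 1 - \sqrt{t}$ for $0 \le t \le 1$, and it uses an equal subdivision of the range $[0, B]$ with $B = 1$.

```python
import math


def measure_function(t):
    """m(t, f) for the hypothetical f(x) = x**2 on E = [0, 1]."""
    if t < 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    return 1.0 - math.sqrt(t)


def bracketing_measures(n, upper_bound=1.0):
    """Return (m(lambda), m(mu)) for n equal subdivisions of [0, B]."""
    ts = [upper_bound * p / n for p in range(n + 1)]
    m_lower = sum(measure_function(ts[p + 1]) * (ts[p + 1] - ts[p]) for p in range(n))
    m_upper = sum(measure_function(ts[p]) * (ts[p + 1] - ts[p]) for p in range(n))
    return m_lower, m_upper


m_lower, m_upper = bracketing_measures(1000)
integral_of_f = 1.0 / 3.0   # exact integral of x**2 over [0, 1]
```

The two sums enclose the value $\tfrac{1}{3}$, and their difference does not exceed $\epsilon\,m(E)$ with $\epsilon = 1/1000$ and $m(E) = 1$.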
10.10. Fubini's theorem in d dimensions
The generalization of Fubini's theorem to $d$ dimensions is proved in exactly the same way as in two dimensions and it is sufficient to state the results without proof if we use the customary compact notation.
The $d$-dimensional space $R$ is the Cartesian product of the $p$-dimensional space $R^{(p)}$ and the $q$-dimensional space $R^{(q)}$ ($d = p+q$). The vector $x = (x_1, x_2, \ldots, x_p)$ is a point in $R^{(p)}$, the vector $y = (y_1, y_2, \ldots, y_q)$ is a point in $R^{(q)}$, and the indicator of a set $E$ in $R$ is
\[ \alpha(x, y) = \alpha(x_1, x_2, \ldots, x_p, y_1, y_2, \ldots, y_q). \]
If $E$ has finite $d$-dimensional measure in $R$,
\[ \mu = \int_R \alpha(x, y), \]
then the set of points $\alpha(s, y)$ ($s$ a fixed vector in $R^{(p)}$) has finite $q$-dimensional measure for almost all $s$,
\[ \mu(s) = \int_{R^{(q)}} \alpha(s, y). \]
Also $\mu(s)$ is a measurable function of $s$ if $s \in R^{(p)}$ and
\[ \int_{R^{(p)}} \mu(s)\,ds = \mu. \]
If $f(x, y) = f(x_1, x_2, \ldots, x_p, y_1, y_2, \ldots, y_q)$ is non-negative and summable over $R$ then $f(x, y)$ is also summable over $R^{(p)}$ for almost all $y$, and if
\[ g(y) = \int_{R^{(p)}} f(x, y), \]
then $g(y)$ is summable over $R^{(q)}$ and
\[ \int_{R^{(q)}} g(y) = \int_R f(x, y). \]
Finally, if $f(x, y)$ is non-negative and measurable in $R$, and if one repeated integral
\[ \int_{R^{(q)}} \int_{R^{(p)}} f(x, y) \]
exists, then so also does the other
\[ \int_{R^{(p)}} \int_{R^{(q)}} f(x, y) \]
and the two repeated integrals are equal.
10.11. Exercises
1. $f(x, \lambda)$ possesses a partial derivative
\[ g(x, \lambda) = \partial f(x, \lambda)/\partial \lambda \]
if $a \le x \le b$ and $\alpha \le \lambda \le \beta$. $g(x, \lambda)$ is a bounded and measurable function of $(x, \lambda)$. Prove that
\[ h(\lambda) = \int_a^b g(x, \lambda)\,dx \]
exists and that
\[ \int_\alpha^\lambda h(t)\,dt = \int_a^b \{f(x, \lambda) - f(x, \alpha)\}\,dx. \]
Deduce that
\[ h(\lambda) = \frac{d}{d\lambda} \int_a^b f(x, \lambda)\,dx \quad \text{p.p. in } \lambda. \]
2. Use Fubini's theorem to prove that
\[ \int_0^\infty e^{-x^2}\,dx \int_0^\infty e^{-y^2}\,dy = \tfrac{1}{4}\pi, \]
expressing the multiple integral as a Young integral.
3. If $g(x)$ and $h(x)$ are summable functions and
\[ G(x) = \int_{-\infty}^{x} g(s)\,ds, \qquad H(x) = \int_{-\infty}^{x} h(s)\,ds, \]
prove that
\[ \int_{-\infty}^{\infty} g(x) H(x)\,dx + \int_{-\infty}^{\infty} h(x) G(x)\,dx = \lim_{t \to \infty} G(t) H(t). \]
4. If $f(x, y)$ is bounded and measurable in the interval ($a \le x \le b$, $p \le y \le q$) and is non-increasing in $y$ for each fixed value of $x$, prove directly that
\[ \int_a^b dx \int_p^q f(x, y)\,dy = \int_p^q dy \int_a^b f(x, y)\,dx. \]
5. If $f(x)$ is non-negative and bounded over $E$, and if the ordinate set of $f(x)$ over $E$ is measurable with measure $m$, show that $f(x)$ is measurable over $E$ and that $m = \int_E f(x)\,dx$ (Williamson, pp. 54, 55).
6. Prove Theorems 10.8.1 and 10.9.1 for unbounded measurable sets $E$.
11
The Lebesgue-Stieltjes integral
11.1. Introduction
The concept of the Stieltjes integral can be illustrated by the problem of calculating the quantity of heat $Q$ required to raise the temperature of a given heterogeneous body by 1 °C, given the specific heat at each point of the body (Lebesgue 1928, p. xii). Let the body be divided into a finite number of parts of masses $m_1, m_2, \ldots, m_n$, and let $\underline{s}_p$ and $\bar{s}_p$ be the infimum and the supremum of the specific heat at points in the part with mass $m_p$. Then, by the definition of specific heat, $Q$ is intermediate in value between the sums
\[ \Lambda = \sum_{p=1}^{n} \underline{s}_p m_p \quad \text{and} \quad M = \sum_{p=1}^{n} \bar{s}_p m_p. \]
As in § 3.3 we can consider the collections of numbers $\Lambda$ and $M$ for all subdivisions of the body into a finite number of parts, and we can define $\sup \Lambda$ and $\inf M$ as the lower and upper Stieltjes integrals of the specific heat over the mass of the body. If these bounds are equal we can define their common value to be the Stieltjes integral of the specific heat over the mass of the body. This process is clearly analogous to that by which we obtained the lower and upper Darboux integrals (§ 3.3) and the Riemann integral, and the result is often called the Riemann-Stieltjes integral. It is subject to the same criticism as the Riemann integral and we shall therefore pass on at once and construct the analogue of the Lebesgue integral and thus obtain the Lebesgue-Stieltjes integral.
For this purpose we can adopt the whole of the Lebesgue theory if we make one small but vital change at the very beginning and replace the geometric measure of an interval by what we shall call the 'weighted measure'. Thus in the example quoted at the beginning of this section the primary concept would be not the volumes of the parts into which the body is divided but the masses of those parts.
Now the physical concept of mass has one important property that is analogous to the mathematical concept of measure, viz. it is an additive function of the parts into which a body is divided, i.e. if a body of mass $m$ is divided into a finite number of parts with masses $m_1, m_2, \ldots, m_n$, then
\[ m = m_1 + m_2 + \ldots + m_n. \]
But the concept of mass differs from the concept of measure inasmuch as we can have masses that are concentrated into surfaces, lines, or points, whereas the three-dimensional measure of a plane, or straight line, or a point is zero. (We do not speak of three-dimensional measure of a surface or a curve in general, because there are pathological examples that invalidate the corresponding plausible assertion.) This 'grittiness' or lack of smoothness in a mass distribution necessitates a rather careful definition of the concept of weighted measure. In three-dimensional space and even in two-dimensional space this leads to rather tiresome complications and we shall therefore first restrict ourselves to the Lebesgue-Stieltjes integral in one dimension.
11.2. The weighted measure
Whether we are considering an open interval $(a, x)$ or a closed interval $[a, x]$ its weighted measure $w(x)$ is a non-negative, monotone, non-decreasing function of $x$. In any interval $a \le x \le b$, such a function is necessarily continuous at each point with the possible exception of a finite or enumerable set of points $\{x_n\}$ ($n = 1, 2, \ldots$). At each of these points there exist the limits
\[ w(x_n - 0) = \lim w(x_n - h) \quad \text{and} \quad w(x_n + 0) = \lim w(x_n + h) \]
as $h$ tends to zero through positive values. Hence we are led to the following definitions.
DEFINITION 11.2.1. The weight function $w(x)$ is a non-negative, non-decreasing function of $x$.
DEFINITION 11.2.2. The weighted measure of a point $t$ is
\[ w(t+0) - w(t-0). \]
(This is zero unless $t$ is one of the points of discontinuity $x_1, x_2, \ldots$.) The weighted measure of an open interval $(a, b)$ is $w(b-0) - w(a+0)$. The weighted measure of a closed interval $[a, b]$ is $w(b+0) - w(a-0)$. The weighted measure of a half-open interval $[a, b)$ is $w(b-0) - w(a-0)$.
The theory of weighted measure can now be developed by exact analogy with the theory of geometric measure (Chapter 6).
DEFINITION 11.2.3. The weighted measure $w(\sigma)$ of an elementary set $\sigma$, consisting of a finite number of disjoint intervals $\sigma_1, \sigma_2, \ldots, \sigma_n$, is the sum
\[ w(\sigma) = \sum_{k=1}^{n} w(\sigma_k). \]
DEFINITION 11.2.4. The weighted measure $w(\sigma)$ of an outer set $\sigma$ consisting of an enumerable collection of disjoint intervals is the sum
\[ w(\sigma) = \sum_{n=1}^{\infty} w(\sigma_n). \]
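The interval formulas of Definition 11.2.2 can be tried out on a weight function with jumps. The sketch below is illustrative only and rests on assumptions not in the text: the weight is taken to be a continuous non-decreasing part plus finitely many point masses, and the helper `make_weight` and the particular numbers are hypothetical.

```python
def make_weight(continuous_part, jumps):
    """Return (w_minus, w_plus), the one-sided limits w(x-0) and w(x+0).

    continuous_part: a continuous, non-negative, non-decreasing function c(x).
    jumps: dict mapping a point x_n to its positive jump w(x_n+0) - w(x_n-0).
    """
    def w_minus(x):
        return continuous_part(x) + sum(j for xn, j in jumps.items() if xn < x)

    def w_plus(x):
        return continuous_part(x) + sum(j for xn, j in jumps.items() if xn <= x)

    return w_minus, w_plus


# Hypothetical weight: unit density on x >= 0 plus point masses at 1 and 2.
w_minus, w_plus = make_weight(lambda x: max(x, 0.0), {1.0: 0.5, 2.0: 0.25})


def point(t):
    return w_plus(t) - w_minus(t)          # w(t+0) - w(t-0)


def open_interval(a, b):
    return w_minus(b) - w_plus(a)          # w(b-0) - w(a+0)


def closed_interval(a, b):
    return w_plus(b) - w_minus(a)          # w(b+0) - w(a-0)


def half_open_interval(a, b):
    return w_minus(b) - w_minus(a)         # [a, b): w(b-0) - w(a-0)
```

For this weight the closed interval $[1, 2]$ has weighted measure 1.75, which is the sum of the point mass at 1, the open interval $(1, 2)$, and the point mass at 2, so the formulas are mutually consistent with additivity.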
It may now be verified as in Theorem 6.3.6 for Lebesgue measure that, with these definitions, if $\sigma$ and $\tau$ are bounded outer sets with the representations
\[ \sigma = \sum_{s=1}^{\infty} \alpha_s \quad \text{and} \quad \tau = \sum_{t=1}^{\infty} \beta_t, \]
and if $\sigma$ is covered by $\tau$, then
\[ \sum_{s=1}^{\infty} w(\alpha_s) \le \sum_{t=1}^{\infty} w(\beta_t). \]
DEFINITION 11.2.5. The outer weighted measure $w^*(\alpha)$ of a bounded set of points $\alpha$ is the infimum of the weighted measures $w(\sigma)$ of the outer sets $\sigma$ which cover $\alpha$, i.e.
\[ w^*(\alpha) = \inf w(\sigma) \quad \text{for } \sigma \ge \alpha. \]
DEFINITION 11.2.6. The inner weighted measure $w_\omega(\alpha)$ of a bounded set of points $\alpha$ with respect to an interval $\omega$ which covers $\alpha$ is
\[ w_\omega(\alpha) = w(\omega) - w^*(\omega - \alpha). \]
THEOREM 11.2.1. The value of $w_\omega(\alpha)$ is independent of $\omega$.
DEFINITION 11.2.7. The inner weighted measure $w_*(\alpha)$ of a bounded set of points is the value of $w_\omega(\alpha)$ for any interval $\omega$ which covers $\alpha$.
DEFINITION 11.2.8. A bounded set of points $\alpha$ is said to have Stieltjes measure $\mu_w(\alpha)$ with respect to the weight function $w(x)$ if the outer and inner weighted measures of $\alpha$ are equal, and the value of the Stieltjes measure is defined to be
\[ \mu_w(\alpha) = w^*(\alpha) = w_*(\alpha). \]
For brevity we often say that, under these conditions, 'the set $\alpha$ is measurable ($w$)'.
THEOREM 11.2.2. The Stieltjes measure $\mu_w(\alpha)$ is a positive, additive, continuous functional of the indicator $\alpha(x)$, i.e.
\[ \mu_w(\alpha) \ge 0, \]
\[ \mu_w(\alpha_1 + \alpha_2) = \mu_w(\alpha_1) + \mu_w(\alpha_2) \quad \text{if } \alpha_1 \alpha_2 = 0, \]
and
\[ \mu_w(\alpha) = \sum_{n=1}^{\infty} \mu_w(\alpha_n) \]
if $\alpha_1, \alpha_2, \ldots$ is an enumerable collection of disjoint sets, with union $\alpha$, each bounded by the same interval $I$.
The Stieltjes measure $\mu_w(\alpha)$ therefore possesses many of the properties of an integral, and is therefore commonly written in the form
\[ \mu_w(\alpha) = \int \alpha(x)\,dw(x). \]
In particular, for an interval $I$,
\[ \mu_w(I) = \int_I dw(x). \]
DEFINITION 11.2.9. A bounded function $f(x)$ will be said to have Stieltjes measure function $\mu_w(t, f)$ (or to be measurable $w$) if the set of points $E\{x;\ f(x) > t\}$ has Stieltjes measure $\mu_w(t, f)$ for each value of $t$.
As in § 8.5 we can introduce the upper and lower Lebesgue bracketing functions $\mu(x)$ and $\lambda(x)$, corresponding to a partition
\[ A = t_0 < t_1 < t_2 < \ldots < t_n = B \]
of the range of $f(x)$ for $a \le x \le b$. Let
\[ \epsilon = \max(t_{p+1} - t_p) \quad \text{for } p = 0, 1, 2, \ldots, n-1. \]
THEOREM 11.2.3.
\[ \int_a^b \mu(x)\,dw(x) - \int_a^b \lambda(x)\,dw(x) \le \epsilon \{\mu_w(A, f) - \mu_w(B, f)\}. \]
DEFINITION 11.2.10. The Lebesgue-Stieltjes integral of $f(x)$ with respect to the weight function $w(x)$ over the interval $[a, b]$ is
\[ \int_a^b f(x)\,dw(x) = \inf \int_a^b \mu(x)\,dw(x) = \sup \int_a^b \lambda(x)\,dw(x), \]
for all Lebesgue bracketing functions $\lambda(x)$ and $\mu(x)$.
THEOREM 11.2.4.
\[ \int_a^b f(x)\,dw(x) = \mu_w(A, f) \cdot A + \int_A^B \mu_w(t, f)\,dt. \]
This identifies the Lebesgue-Stieltjes integral with the 'Young-Stieltjes' integral of the monotone measure function $\mu_w(t, f)$.
We can then extend the definition to unbounded functions and unbounded intervals of integration as before, by considering separately the positive and negative parts of $f(x)$.
11.3. The Lebesgue representation of a Stieltjes integral
Lebesgue has shown that the Lebesgue-Stieltjes integral in one dimension can be represented as an ordinary Lebesgue integral by a simple transformation of the independent variable from $x$ to the 'Lebesgue inverse function' of the weight function $w(x)$.
The weight function $w(x)$ has no unique inverse in the ordinary sense, for, to a prescribed value of $y$, there may correspond a whole interval, $\alpha \le x \le \beta$, of points at which $w(x) = y$ as in Fig. 3 (a). However, since $w(x)$ is non-decreasing in any bounded interval $[a, b]$, it is measurable in the sense of Lebesgue, and its measure function $g(y)$, the Lebesgue measure of the set of points $E\{x;\ w(x) > y\}$, does provide a species of inverse function for $w(x)$ (see Fig. 3 (b)).
[FIG. 3. (a) The weight function $w(x)$; (b) its Lebesgue inverse function $g(y)$.]
The Lebesgue inverse function $g(y)$ is a non-increasing function which is discontinuous at each value $\eta$ of $y$ which corresponds to an interval $[\alpha, \beta]$ of $x$ in which the weight function $w(x)$ remains constant, for
\[ g(\eta + 0) = b - \beta \quad \text{and} \quad g(\eta - 0) = b - \alpha. \]
In the Lebesgue transformation we introduce the function $\phi(y)$ defined by the relation
\[ f[g(y)] = \phi(y). \]
The indicator function of $\phi(y)$, say $\beta(y, t)$, is defined by the relations
\[ \beta(y, t) = \begin{cases} 1 & (\phi(y) > t), \\ 0 & (\phi(y) \le t). \end{cases} \]
If $\alpha(x, t)$ is the indicator of $f(x)$, then these relations imply that
\[ \alpha(g(y), t) = \beta(y, t). \]
Hence the lower Lebesgue bracketing function for $f(x)$ is
\[ \lambda[g(y)] = \sum_{p=0}^{n-1} t_p \{\beta(y, t_p) - \beta(y, t_{p+1})\} \]
and its Stieltjes integral is
\[ \int_a^b \lambda(x)\,dw(x) = \sum_{p=0}^{n-1} t_p \{m(t_p) - m(t_{p+1})\}, \]
where
\[ m(t_p) = \int \beta(y, t_p)\,dy, \]
which is the Lebesgue measure of the set $E\{y;\ \phi(y) > t_p\}$.
Therefore, the supremum of $\int_a^b \lambda(x)\,dw(x)$ is the Lebesgue integral of $\phi(y)$ over the range $y = w(a)$ to $y = w(b)$. We could similarly identify the infimum of $\int_a^b \mu(x)\,dw(x)$. But the common value of these bounds is the Lebesgue-Stieltjes integral of $f(x)$ with respect to $w(x)$. Therefore
\[ \int_a^b f(x)\,dw(x) = \int_{w(a)}^{w(b)} f[g(y)]\,dy. \]
In particular, if $f(x) = 1$ in the interval $(a, s)$ and is zero elsewhere, then
\[ \int_a^s f(x)\,dw(x) = \int_{w(a)}^{w(s)} 1\,dy = w(s) - w(a). \]
11.4. The Lebesgue-Stieltjes integral in one dimension
The preceding section has been devoted to the Lebesgue-Stieltjes integral of a bounded function $f(x)$ with respect to a non-negative, non-decreasing weight function $w(x)$. The theory is easily extended to weight functions that are of bounded variation, and we shall merely state the relevant definitions and theorems, leaving the proofs to the reader.
THEOREM 11.4.1. If $w_1(x)$ and $w_2(x)$ are two non-negative, non-decreasing weight functions, so also is their sum
\[ w(x) = w_1(x) + w_2(x), \]
and if a bounded function $f(x)$ has Stieltjes measure for an interval $I$ with respect to each of the weight functions $w_1(x)$ and $w_2(x)$, then it has Stieltjes measure with respect to $w(x)$ and
\[ \int_I f(x)\,dw(x) = \int_I f(x)\,dw_1(x) + \int_I f(x)\,dw_2(x). \]
DEFINITION 11.4.1. A function $w(x)$ of the real variable $x$ is said to be 'of bounded variation' in an interval $I$ if there exists a pair of non-negative, non-decreasing functions, $p(x)$ and $n(x)$, such that $w(x) = p(x) - n(x)$ for $x \in I$.
THEOREM 11.4.2. If $w(x)$ has bounded variation in $I$ and $[p_1(x), n_1(x)]$, $[p_2(x), n_2(x)]$ are two pairs of non-negative, non-decreasing functions such that
\[ w(x) = p_1(x) - n_1(x) = p_2(x) - n_2(x), \]
and if the bounded function $f(x)$ has Stieltjes measure for an interval $I$ with respect to each of the weight functions $p_1(x)$, $p_2(x)$, $n_1(x)$, $n_2(x)$, then
\[ \int_I f(x)\,dp_1(x) - \int_I f(x)\,dn_1(x) = \int_I f(x)\,dp_2(x) - \int_I f(x)\,dn_2(x). \]
DEFINITION 11.4.2. The Lebesgue-Stieltjes integral of the bounded function $f(x)$ with respect to the weight function $w(x)$ of Definition 11.4.1 for an interval $I$ is
\[ \int_I f(x)\,dw(x) = \int_I f(x)\,dp_1(x) - \int_I f(x)\,dn_1(x) \]
in the notation of Theorem 11.4.2.
11.5. The Lebesgue-Stieltjes integral in two dimensions
The theory of the Lebesgue-Stieltjes integral in two or more dimensions will be found, in summary form, in modern books on statistics and the theory of probability (e.g. Moran 1969, p. 203 and Kingman and Taylor 1966, p. 95) and, in more detail, in serious works on integration (e.g. McShane 1947, chap. vii). The crux of the theory is the introduction of a weight function $w(I)$ which is a mapping of the intervals in Euclidean space of $d$ dimensions into the real axis, and which is (i) non-negative; (ii) additive, in the sense that if the interval $I$ is the union of two disjoint intervals $I_1$, $I_2$ then
\[ w(I) = w(I_1) + w(I_2). \]
The main difficulty arises from the possible discontinuities in the weight function. From the point of view of the physicist the weight function may represent a discrete distribution of mass at a number of isolated points. The question then arises, does the weight function $w(I)$ include any point masses on the boundaries of $I$? And, if so, how is the additive character of $w(I)$ preserved? From the point of view of the mathematician the possibility of defining the weight function for any bounded open set $Q$ depends upon the representation of $Q$ as the union of an enumerable collection of disjoint intervals $I_n$, and hence the intervals $I_n$ must not all be open intervals, or all closed intervals.
After meditating on these difficulties the reader may be prepared to settle for the following definitions, which allow a fairly concise account of the theory. In order to exhibit the essence of the theory we shall restrict ourselves to two dimensions. The extension to three or more dimensions is an obvious generalization, which is fully discussed in the references given above.
DEFINITION 11.5.1. In the convenient terminology of Ingleton (1965) a 'standard' interval is the set of points $(x, y)$ in the half-open rectangle $a \le x < b$, $c \le y < d$.
[FIG. 4. A standard interval, with vertices including $(a, c)$ and $(b, c)$.]
THEOREM 11.5.2. The Stieltjes measure function $w(I)$ of Definition 11.5.2 is non-negative and additive for all standard intervals $I$.
We can now follow the same process as in Chapter 7 and proceed to define the outer and inner Stieltjes measures of a set of points $Q$ by covering $Q$ with an enumerable collection of disjoint standard intervals. We can then define the sets of points that have a unique Stieltjes measure. A bounded function $f(x, y)$ will be said to have Stieltjes measure if this is true of each set of points $(x, y)$ for which $f(x, y) > t$. We can then introduce the indicator $\alpha(x, y, t)$ of this set of points and its Stieltjes measure $\mu_w(t)$, which we identify with the Lebesgue-Stieltjes integral of the indicator. We can bracket $f(x, y)$ by two simple functions as in § 11.2 and define the Lebesgue-Stieltjes integral of $f(x, y)$ over a region $Q$ with respect to a weight function $w(x, y)$. As in one dimension we can consider weight functions of bounded variation and functions that are summable with respect to such weight functions.
11.6. Exercises
1. The Heaviside unit function
\[ H(x) = \begin{cases} 0 & (x \le 0), \\ 1 & (x > 0), \end{cases} \]
is a weight function in one dimension, which is semi-continuous on the right, i.e. $H(x) = \lim H(x - \epsilon)$, as $\epsilon$ tends to zero through positive values. It represents a unit mass at the origin.
2. Construct the corresponding weight function in two dimensions for a unit mass at the origin.
3. In $d$ dimensions, where $d \ge 2$, a standard interval is specified by the $2^d$ vertices, $V_1, V_2, \ldots$ of a rectangular block $I$,
\[ a_k \le x_k < b_k \quad (k = 1, 2, \ldots, d). \]
The numbers $a_1, a_2, \ldots, a_d$ will be called the lower coordinates of $I$, and the numbers $b_1, b_2, \ldots, b_d$ will be called the upper coordinates of $I$. If a vertex $V_i$ is specified by $r$ upper coordinates and by $d - r$ lower coordinates, we assign to it the number $N(V_i) = (-1)^{r+d}$. Show that a Stieltjes measure function in $d$ dimensions has the form
\[ \mu(I) = \sum_i N(V_i) f(V_i) \quad (1 \le i \le 2^d), \]
where $f(V_i)$ is a non-negative function of the coordinates of $V_i$, non-decreasing in each coordinate and semi-continuous in the sense that
\[ f(x_1, x_2, \ldots, x_d) = \lim f(x_1 - \epsilon_1, x_2 - \epsilon_2, \ldots, x_d - \epsilon_d) \]
as $\epsilon_1, \epsilon_2, \ldots, \epsilon_d$ tend to zero through positive values.
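The signed vertex sum of Exercise 3 is easy to evaluate mechanically. The following sketch is an illustration only, not a solution offered by the text: the function name `stieltjes_measure`, the helper `point_mass`, and the two sample functions are assumptions chosen for the example, and the point-mass function is merely one candidate that is left-continuous in the sense of the exercise.

```python
from itertools import product


def stieltjes_measure(f, lower, upper):
    """Signed vertex sum  sum_i N(V_i) f(V_i)  over the 2**d vertices of I."""
    d = len(lower)
    total = 0.0
    for choice in product((0, 1), repeat=d):       # 1 selects an upper coordinate
        vertex = [upper[k] if choice[k] else lower[k] for k in range(d)]
        r = sum(choice)                             # number of upper coordinates
        total += (-1) ** (r + d) * f(*vertex)
    return total


def point_mass(x0, y0):
    """Hypothetical candidate for a unit mass at (x0, y0) in two dimensions."""
    def f(x, y):
        return 1.0 if (x > x0 and y > y0) else 0.0
    return f


def area_weight(x, y):
    # Hypothetical smooth case: f(x, y) = x*y reproduces the ordinary area.
    return x * y


mass_at_origin = point_mass(0.0, 0.0)
```

With these choices the standard interval $0 \le x < 1$, $0 \le y < 1$ receives measure 1 and the interval $-1 \le x < 0$, $-1 \le y < 0$ receives measure 0, so the half-open convention assigns the point mass to exactly one of two abutting intervals.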
12
Epilogue
12.1. The generality of the Lebesgue integral
It seems appropriate to conclude this introductory account of the Lebesgue integral with a rapid summary of its advantages. The most obvious advantage of the Lebesgue theory is in its application to the integration of sequences of functions where the generality and simplicity of the theorems of bounded and dominated convergence greatly facilitate the analysis. By comparison the Riemann theory is much restricted in scope. But a greater attraction of the Lebesgue theory is its wide generality and the kind of inner necessity and inevitability with which the theory develops. From this point of view the central feature is the relation between an integrable function $f(x)$ and its indicator $\alpha(x, t, f)$. To exhibit this characteristic of the Lebesgue theory we shall develop the descriptive definition of the Lebesgue integral as given by Lebesgue himself in the second edition (1928) of his Leçons sur l'intégration.
12.2. The descriptive definition of the Lebesgue integral
To avoid confusion with the Lebesgue integral as constructively defined in the preceding chapters we shall now speak of the 'Lebesgue functional'.
DEFINITION 12.2.1. A Lebesgue functional is a functional $\int_a^b f(x)\,dx$, defined in a space $\mathcal{L}$ of bounded and real-valued functions $f(x)$ on the interval $(a, b)$, which is linear (L), positive (P), 'absolute' (A), and 'monotonely convergent' (C), with the Lebesgue normalization (N). The functions in $\mathcal{L}$ will be called 'Lebesgue functions'.
The definitions of linear, positive functionals have been given in § 2.2. Condition (A) implies that, if $f(x) \in \mathcal{L}$, so also does its absolute value $|f(x)|$, and condition (N) implies that the indicator $\alpha(x)$ of any interval $(p, q)$ is a Lebesgue function, and that
\[ \int_a^b \alpha(x)\,dx = q - p \quad \text{if } a \le p < q \le b. \]
Condition (C) means that, if $\{f_n(x)\}$ is a sequence of Lebesgue functions, such that
\[ 0 \le f_n(x) \le f_{n+1}(x) \le M \]
for all $n$ and $a \le x \le b$, where $M$ is independent of $x$ and $n$, and if the sequence $\{f_n(x)\}$ converges to a limit function $f(x)$ at all points of $(a, b)$ except at an enumerable set $x = x_1, x_2, \ldots$, then $f(x)$ is a Lebesgue function in $(a, b)$ and
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]
Condition (L) now implies that the space $\mathcal{L}$ contains the step functions of § 2.3.
It is, of course, necessary to verify that these characteristics by which we have described the Lebesgue functional are mutually consistent. To do this it is sufficient to observe that all these characteristics are possessed by the integrals of functions which are constant in the interval $(a, b)$.
THEOREM 12.2.1. If $f(x) = c$ in $(a, b)$, the functional
\[ \int_a^b f(x)\,dx = c(b - a) \]
possesses the characteristics (L), (P), (A), (C), and (N).
For the characteristics (L), (P), (A), and (N) the proof is too elementary to print. For the characteristic (C), let
\[ f_n(x) = c_n \quad \text{in } (a, b), \quad \text{and} \quad c_n \to c \quad \text{as } n \to \infty. \]
Then
\[ \int_a^b f_n(x)\,dx = c_n(b - a) \to c(b - a) = \int_a^b f(x)\,dx, \]
as $n \to \infty$.
In order to show that the class $\mathcal{L}$ of Lebesgue functions does contain some non-trivial functions we give the following theorem.
THEOREM 12.2.2. Any bounded, non-negative, non-decreasing function $f(x)$ is a Lebesgue function.
Let $f(a) = A$, $f(b) = B$, and $b - a = 2^n \epsilon_n$, where $n$ is a positive integer. Define the step function, $\phi_n(x)$, by the condition that
\[ \phi_n(x) = f(a + p\epsilon_n), \quad \text{if} \quad a + p\epsilon_n \le x < a + (p+1)\epsilon_n, \]
for $p = 0, 1, 2, \ldots, 2^n - 1$. Then, if $x$ is fixed, $p$ is the greatest integer not exceeding
\[ 2^n (x - a)/(b - a). \]
Similarly
\[ \phi_{n+1}(x) = f(a + q\epsilon_{n+1}) \quad \text{if} \quad a + q\epsilon_{n+1} \le x < a + (q+1)\epsilon_{n+1}, \]
for $q = 0, 1, 2, \ldots, 2^{n+1} - 1$. If $x$ is fixed, $q$ is the greatest integer not exceeding $2^{n+1}(x - a)/(b - a)$, but
\[ 2p \le 2^{n+1}(x - a)/(b - a), \]
whence
\[ 2p \le q. \]
Therefore
\[ (a + q\epsilon_{n+1}) - (a + p\epsilon_n) = 2^{-n-1}(b - a)(q - 2p) \ge 0. \]
But $f(x)$ is non-decreasing, whence
\[ \phi_{n+1}(x) - \phi_n(x) = f(a + q\epsilon_{n+1}) - f(a + p\epsilon_n) \ge 0. \]
Also
\[ 0 \le x - (a + p\epsilon_n) < \epsilon_n. \]
Hence at each value of $x$, whereat $f(x)$ is continuous,
\[ \phi_n(x) \to f(x) \quad \text{as } n \to \infty. \]
But $f(x)$ is non-decreasing, and hence is discontinuous only at an enumerable set of points in $(a, b)$. Therefore by condition (C), $f(x)$ is a Lebesgue function.
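The dyadic step functions of this proof can be generated directly. The sketch below is an illustration under stated assumptions: the sample function $f$, with a single jump at $x = \tfrac{1}{2}$, the interval $[0, 1]$, and the helper name `dyadic_step` are hypothetical choices, and the value at $x = b$ is clamped to the last step for convenience.

```python
import math


def dyadic_step(f, a, b, n):
    """Return phi_n, equal to f(a + p*eps_n) on a + p*eps_n <= x < a + (p+1)*eps_n,
    where eps_n = (b - a) / 2**n."""
    eps = (b - a) / 2 ** n

    def phi(x):
        p = min(math.floor((x - a) / eps), 2 ** n - 1)   # clamp at x = b
        return f(a + p * eps)

    return phi


def f(x):
    # Hypothetical bounded, non-negative, non-decreasing function with one jump.
    return x * x + (1.0 if x > 0.5 else 0.0)


a, b = 0.0, 1.0
xs = [k / 97 for k in range(97)]                  # sample points in [a, b)
phis = [dyadic_step(f, a, b, n) for n in range(1, 12)]
```

At each sample point the values $\phi_1(x), \phi_2(x), \ldots$ form a non-decreasing sequence bounded above by $f(x)$, and at the sampled points, none of which is the jump, $\phi_{11}(x)$ is already close to $f(x)$.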
12.3. Measure functions
DEFINITION 12.3.1. The indicator $\alpha(x, t, f)$ of a function $f(x)$ is defined by the relations
\[ \alpha(x, t, f) = \begin{cases} 1 & (f(x) > t), \\ 0 & (f(x) \le t), \end{cases} \]
where $t$ is any real number.
THEOREM 12.3.1. If $f(x)$ is a Lebesgue function in $(a, b)$, so also is its indicator $\alpha(x, t, f)$.
Let
\[ \alpha_n(x) = \tfrac{1}{2}|n f(x) - nt| - \tfrac{1}{2}|n f(x) - nt - 1| + \tfrac{1}{2}, \]
where $n$ is a positive integer. Then it follows from conditions (N), (L), and (A) of Definition 12.2.1, that $\alpha_n(x)$ is a Lebesgue function in $(a, b)$. Now
\[ \alpha_n(x) = \begin{cases} 1 & (f(x) \ge t + 1/n), \\ n f(x) - nt & (t \le f(x) \le t + 1/n), \\ 0 & (f(x) \le t). \end{cases} \]
Hence $\{\alpha_n(x)\}$ is a bounded, non-decreasing sequence of Lebesgue functions. Also
\[ \alpha_n(x) \to \begin{cases} 1 & (f(x) > t), \\ 0 & (f(x) \le t), \end{cases} \]
as $n \to \infty$. Therefore, by condition (C) of Definition 12.2.1, the indicator $\alpha(x, t, f)$ is a Lebesgue function in $(a, b)$.
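The closed form for $\alpha_n(x)$ can be checked against its piecewise description. In the sketch below the function names `alpha_n` and `indicator`, and the sample values of $f(x)$, $t$, and $n$, are assumptions made for illustration; the formula itself is the one in the proof.

```python
def alpha_n(value, t, n):
    """alpha_n = (1/2)|n f - n t| - (1/2)|n f - n t - 1| + 1/2, with f(x) = value."""
    u = n * value - n * t
    return 0.5 * abs(u) - 0.5 * abs(u - 1.0) + 0.5


def indicator(value, t):
    """alpha(x, t, f): 1 where f(x) > t, 0 where f(x) <= t."""
    return 1.0 if value > t else 0.0


t = 0.5
ramp = [alpha_n(0.625, t, n) for n in (1, 2, 4, 8)]   # f(x) = 0.625 > t
```

For $f(x) = 0.625$ and $t = 0.5$ the values rise through 0.125, 0.25, 0.5 to 1, the value of the indicator, while for $f(x) \le t$ every $\alpha_n$ is zero.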
DEFINITION 12.3.2. The Lebesgue integral, b
m(t,J)
=
Jrx(x, t,j) dx, a
which exists by Theorem 12.3.1, is called the 'measure function' of f(x). THEOREM 12.3.2. If f(x) is a Lebesgue function in (a, b), and if, in this interval, A
M
178
Epilogue
LEMMA. If cf;(x) and ~(x) are Lebesgue functions in (a, b) and if cf;(x) ;;?: ~(x) in (a, b), then b
b
J cf;(x) dx ;;?: J ~(x) dx. a
a
For by conditions (L) and (P) of Definition 12.2.1, b
b
b
Jcf;(x) dx- J~(x) dx = J{cf;(x)-~(x)} dx. a
Now, if $s < t$, then
$$\alpha(x, s, f) \ge \alpha(x, t, f),$$
whence $m(s, f) \ge m(t, f)$, by the lemma. Thus $-m(t, f)$ is a non-decreasing function of $t$. Also
$$\alpha(x, A, f) = 1 \quad\text{and}\quad \alpha(x, B, f) = 0, \quad\text{if } a < x < b.$$
Therefore
$$m(A, f) = b-a \quad\text{by (N)}$$
and
$$m(B, f) = \int_a^b 0\,dx = \int_a^b 1\,dx - \int_a^b 1\,dx \quad\text{by (L)}$$
$$= 0 \quad\text{by (N)}.$$
Thus
$$0 = m(B, f) \le m(t, f) \le m(A, f),$$
if $A < t < B$. Hence the function $-m(t, f)$ is bounded. Therefore the function $-m(t, f)$, and also (by condition (L)), the function $m(t, f)$ are Lebesgue functions of $t$ in $(A, B)$.
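The properties of the measure function established in this proof can be observed in a simple case. The following Python sketch approximates $m(t, f)$ by counting sample points at which $f(x) > t$. The midpoint sampling, the function name `measure_function`, and the example $f(x) = x^2$ on $(0, 1)$ with $A = 0$, $B = 1$ are assumptions made for illustration, not part of the theory.

```python
def measure_function(f, a, b, t, samples=10_000):
    """Approximate m(t, f), the integral over (a, b) of the indicator of
    {f(x) > t}, by a midpoint rule on a uniform grid."""
    h = (b - a) / samples
    count = sum(1 for k in range(samples) if f(a + (k + 0.5) * h) > t)
    return count * h

def f(x):
    # Hypothetical example: 0 < f(x) < 1 for 0 < x < 1.
    return x * x

a, b = 0.0, 1.0
ts = [k / 10 for k in range(11)]                      # t from A = 0 to B = 1
ms = [measure_function(f, a, b, t) for t in ts]

# m(t, f) never increases with t and stays between m(B, f) = 0 and m(A, f) = b - a.
is_non_increasing = all(m2 <= m1 for m1, m2 in zip(ms, ms[1:]))
```

For this example the exact measure function is $1-\sqrt{t}$, so the computed values should fall from $b-a = 1$ at $t = 0$ to 0 at $t = 1$.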
12.4. The Young integral
THEOREM 12.4.1. If $f(x)$ is a Lebesgue function in $(a, b)$, and if $0 \le f(x) \le M$ in this interval, then
$$\int_a^b f(x)\,dx = \int_0^M m(t, f)\,dt.$$
Let $0 = t_0 < t_1 < t_2 < \ldots < t_n = M$ be a partition of the range $[0, M]$ of $f(x)$.
As in § 8.5 we introduce the Lebesgue bracketing functions $\lambda(x)$ and $\mu(x)$ for this partition. Then
$$\lambda(x) \le f(x) \le \mu(x).$$
Now by conditions (L), (N) and Theorem 12.3.1, $\lambda(x)$ and $\mu(x)$ are Lebesgue functions in $(a, b)$. Hence, by the lemma of Theorem 12.3.2,
$$\int_a^b \lambda(x)\,dx \le \int_a^b f(x)\,dx \le \int_a^b \mu(x)\,dx.$$
The analysis of § 8.6 also shows that
$$\int_a^b \lambda(x)\,dx \le \int_0^M m(t, f)\,dt \le \int_a^b \mu(x)\,dx,$$
and that
$$\int_a^b \mu(x)\,dx - \int_a^b \lambda(x)\,dx \le \epsilon(b-a),$$
where $\epsilon = \max(t_{p+1}-t_p)$ for $p = 0, 1, 2, \ldots, n-1$. Since this is true for any tolerance $\epsilon > 0$ it follows that
$$\int_a^b f(x)\,dx = \int_0^M m(t, f)\,dt.$$
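The identity of Theorem 12.4.1 can be verified numerically in a simple case. The Python sketch below takes the hypothetical example $f(x) = x^2$ on $(0, 1)$ with $M = 1$, for which $m(t, f) = 1-\sqrt{t}$ is known in closed form. It also forms bracketing sums on the assumption that $\lambda(x) = t_p$ and $\mu(x) = t_{p+1}$ wherever $t_p < f(x) \le t_{p+1}$; this form is an assumption about the construction of § 8.5 and is used here only for illustration, as are the names `midpoint_integral`, `lower`, and `upper`.

```python
import math

a, b, M = 0.0, 1.0, 1.0

def f(x):
    return x * x                        # hypothetical example, 0 <= f(x) <= M on (a, b)

def m(t):
    return 1.0 - math.sqrt(t)           # measure of {x in (0, 1): x*x > t}

def midpoint_integral(g, lo, hi, steps=20_000):
    """Midpoint-rule approximation to the integral of g over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

lhs = midpoint_integral(f, a, b)        # integral of f(x) dx over (a, b)
rhs = midpoint_integral(m, 0.0, M)      # integral of m(t, f) dt over (0, M)

# Bracketing sums for the uniform partition 0 = t_0 < t_1 < ... < t_n = M.
# Assumed form: the set {t_p < f <= t_{p+1}} has measure m(t_p) - m(t_{p+1}).
n = 100
ts = [M * p / n for p in range(n + 1)]
lower = sum(ts[p] * (m(ts[p]) - m(ts[p + 1])) for p in range(n))
upper = sum(ts[p + 1] * (m(ts[p]) - m(ts[p + 1])) for p in range(n))
eps = max(ts[p + 1] - ts[p] for p in range(n))
```

Both `lhs` and `rhs` approximate the common value $1/3$, and the gap `upper - lower` does not exceed `eps * (b - a)`, which is the estimate used at the end of the proof.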
This identifies the Lebesgue integral of $f(x)$ over $(a, b)$ with the W. H. Young integral of its measure function $m(t, f)$ over the range $(0, M)$ of $f(x)$. The significance of this investigation is that it proves that the domain of the Lebesgue functional, as defined in 12.2.1, is included in the space of bounded measurable functions, i.e. the functions $f(x)$ whose indicator $\alpha(x, t, f)$ is a Lebesgue function. Of course it remains to be proved that all such bounded measurable functions are in fact Lebesgue functions and, to do this, we need the constructive definitions of the preceding chapters. The use of monotone sequences then allows us to extend the
definition of the Lebesgue integral to unbounded functions and to unbounded intervals as in Chapter 9, but such integrals are necessarily absolutely convergent. If, however, we are prepared to waive condition (A) and admit non-absolutely convergent integrals, then the space of integrable functions can be considerably extended. In this work of analytic exploration the initial advances were made by A. Denjoy (1941-9) by means of constructive definitions involving transfinite induction. Later, O. Perron gave a descriptive definition by adapting the method of bracketing to give directly upper and lower bounds to the integral (rather than the integrand, as in our exposition). Outstanding advances in this field have been made by A. J. Ward and R. Henstock and are described in the latter author's book Theory of integration (1963).
The intentions of the author of the present work will be amply fulfilled if this introductory account of the Lebesgue integral has stimulated the reader to study more formal and profound accounts of the subject. The following works are especially recommended for the undergraduate. The original paper and books by Lebesgue and the exposition by de la Vallee-Poussin are somewhat terse, and chapters x, xi, and xii of The theory of functions by E. C. Titchmarsh (Clarendon Press, Oxford) will be found to provide a most illuminating commentary. The geometric theory of the Lebesgue integral is expounded with great clarity and conciseness in J. C. Burkill's Cambridge Tract. A rather more advanced treatment is given with greater emphasis on topological and set-theoretic concepts in Lebesgue integration by J. H. Williamson. By contrast, the treatment by A. N. Kolmogorov and S. V. Fomin in the translation published by Academic Press (New York and London, 1961) with the title Measure, Lebesgue integrals and Hilbert space, may perhaps be described as more algebraical in presentation. A forthcoming book by A. W. Ingleton will provide yet another attractive line of approach to integration theory which may be roughly characterized as the method of functional analysis.
12.5. References
BOAS, R. P. (1960). A primer of real functions. Wiley.
BURKILL, J. C. (1953). The Lebesgue integral. Cambridge University Press.
DENJOY, A. (1941-9). Leçons sur le calcul des coefficients. Gauthier-Villars, Paris.
HARTMAN, S., and MIKUSINSKI, J. (1961). The theory of Lebesgue measure and integration. Pergamon Press, Oxford.
HENSTOCK, R. (1963). Theory of integration. Butterworths, London.
HOBSON, E. W. (1927). The theory of functions of a real variable and the theory of Fourier's series, vols. i and ii. Cambridge University Press.
INGLETON, A. W. (1965). Notes on integration. The Mathematical Institute, Oxford.
KINGMAN, J. F. C., and TAYLOR, S. J. (1966). Introduction to measure and probability. Cambridge University Press.
KOLMOGOROV, A. N., and FOMIN, S. V. (1961). Measure, Lebesgue integrals and Hilbert space. Academic Press, New York.
LEBESGUE, H. (1902). Intégrale, longueur, aire. Annali Mat. pura appl. 3, 231-359.
--- (1904, 1928). Leçons sur l'intégration. Gauthier-Villars, Paris.
McSHANE, E. J. (1947). Integration. Princeton University Press.
MORAN, P. A. P. (1968). An introduction to probability theory. Clarendon Press, Oxford.
RIESZ, F., and SZ-NAGY, B. (1953). Leçons d'analyse fonctionnelle. Akademiai Kiado, Budapest.
SAKS, S. (1937). Theory of the integral (English translation by L. C. Young). Subwencji Funduszu Kultury Narodowej, Warszawa-Lwów.
SOLOVAY, R. M. (1971). Annals of Math. (2), 92, 1.
DE LA VALLEE-POUSSIN, C. J. (1915). Trans. Am. math. Soc. 16, 435.
--- (1916). Intégrales de Lebesgue. Gauthier-Villars, Paris.
WILLIAMSON, J. H. (1962). Lebesgue integration. Holt, Rinehart, and Winston, New York.
YOUNG, L. C. (1927). The theory of integration. Cambridge University Press.
YOUNG, W. H. (1905). On upper and lower integrals. Proc. Lond. math. Soc. 2, 52.
PRINTED IN GREAT BRITAIN AT THE UNIVERSITY PRESS, OXFORD BY VIVIAN RIDLER PRINTER TO THE UNIVERSITY