DISTRIBUTIONS AND FOURIER TRANSFORMS
This is Volume 32 in PURE AND APPLIED MATHEMATICS A series of monographs and textbooks Edited by PAUL A. SMITH and SAMUEL EILENBERG, Columbia University, New York A complete list of the books in this series appears at the end of this volume.
DISTRIBUTIONS AND FOURIER TRANSFORMS WILLIAM F. DONOGHUE, JR. Department of Mathematics University of California Irvine, California
ACADEMIC PRESS New York and London
1969
COPYRIGHT © 1969, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. Berkeley Square House, London W.1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 69-12285
PRINTED IN THE UNITED STATES OF AMERICA
In this book I try to give a readable introduction to the modern theory of the Fourier transform and to show some interesting applications of that theory in higher analysis. The book is directed to students having only a moderate preparation in real and complex analysis. More exactly, I suppose the reader to be familiar with the elements of real variables and Lebesgue integration and to have some knowledge of analytic functions. Further along in the book both Hilbert spaces and $L^p$-spaces play a role, but the reader is presumed to know only a little about either topic, much less, in fact, than appears in any standard modern real variable textbook. Much of the material the student is expected to know is reviewed in the first part of the book, which also serves to establish our conventions of notation and terminology. Some topics from advanced calculus and analytic function theory are treated here. There have also been adjoined brief discussions of linear topological spaces, analytic functions of several variables, as well as certain aspects of convexity; these subjects are perhaps not strictly needed for the study of the Fourier transform as we undertake it. Not everything in Part I is needed for the study of Part II, which presents the theory of distributions on the n-dimensional real space as well as the theory of the Fourier transform for temperate distributions. The machinery developed in Part II makes it possible to obtain significant results in harmonic analysis in a fairly simple and direct way; this is done in Part III. The whole book can be covered conveniently in a one-year course if one or two special topics in the third part are omitted. Much of the book closely follows the lectures in harmonic analysis given by L. Hörmander at Stockholm University during the academic year 1958-1959. However, a number of topics covered in those lectures have been omitted, while a good deal of potential theory and analytic function theory has been adjoined; it would be surprising if Professor Hörmander cared to acknowledge the result as his own. Nevertheless, almost everything in this book has been taught me by L. Hörmander and N. Aronszajn.
PREFACE
There are certain inconsistencies in the presentation. To make the book accessible to as wide a readership as possible I have avoided the treatment of distributions on manifolds and never refer to an exterior differential form. This has made it desirable to accept Green's formula without proof, although it is only needed here for spheres. Sometimes a theorem is proved with the tacit assumption that the functions or linear spaces occurring in the argument are all real, and later that theorem is invoked in a context where the scalars are complex. This abuse is preferred to the repetition of some incantation assuring the reader that the arguments may be modified to cover the case of complex scalars. I have tried to make the notations as traditional and natural as possible, but have not been able to avoid some trivial ambiguities. Thus, for example, a system of points in $R^n$ is generally written $x_k$, although the same notation is used for the coordinate functions themselves. A book covering such a wide range of material is bound to contain mistakes. These, I think, are unimportant, so long as the book conveys the mathematical spirit of the apostolic, nay, the Petrine succession, extending from Gauss, Riemann, and Dirichlet, through Hilbert, Courant, Friedrichs, and John.
March, 1969
WILLIAM F. DONOGHUE, JR.
Contents

Preface

PART I. Introduction
1. Equicontinuous Families
2. Infinite Products
3. Convex Functions
4. The Gamma Function
5. Measure and Integration
6. Hausdorff Measures and Dimension
7. Product Measures
8. The Newtonian Potential
9. Harmonic Functions and the Poisson Integral
10. Smooth Functions
11. Taylor's Formula
12. The Orthogonal Group
13. Second-Order Differential Operators
14. Convex Sets
15. Convex Functions of Several Variables
16. Analytic Functions of Several Variables
17. Linear Topological Spaces

PART II. Distributions
18. Distributions
19. Differentiation of Distributions
20. Topology of Distributions
21. The Support of a Distribution
22. Distributions in One Dimension
23. Homogeneous Distributions
24. The Analytic Continuation of Distributions
25. The Convolution of a Distribution with a Test Function
26. The Convolution of Distributions
27. Harmonic and Subharmonic Distributions
28. Temperate Distributions
29. Fourier Transforms of Functions in $\mathcal{S}$
30. Fourier Transforms of Temperate Distributions
31. The Convolution of Temperate Distributions
32. Fourier Transforms of Homogeneous Distributions
33. Periodic Distributions in One Variable
34. Periodic Distributions in Several Variables
35. Spherical Harmonics
36. Singular Integrals

PART III. Harmonic Analysis
37. Functions of Positive Type
38. Groups of Unitary Transformations
39. Autocorrelation Functions
40. Uniform Distribution Modulo 1
41. Schoenberg's Theorem
42. Distributions of Positive Type
43. Paley-Wiener Theorems
44. Functions of the Pick Class
45. Titchmarsh Convolution Theorem
46. The Spectrum of a Distribution
47. Tauberian Theorems
48. Prime Number Theorem
49. The Riemann Zeta Function
50. Beurling's Theorem
51. Riesz Convexity Theorem
52. The Salem Example
53. Convolution Operators
54. A Hardy-Littlewood Inequality
55. Functions of Exponential Type
56. The Bessel Kernel
57. The Bessel Potential
58. The Spaces of the Bessel Potential

Index
PART I
INTRODUCTION
1. Equicontinuous Families

Let $X$ be a metric space with the metric $d(x, y)$. If $f(x)$ is a uniformly continuous real or complex-valued function defined on $X$, its modulus of continuity is the function
\[ \omega(t) = \sup_{d(x, y) \le t} |f(x) - f(y)|, \]
which is defined for all positive $t$. The function $\omega(t)$ is monotone nondecreasing in $t$ and approaches 0 as $t$ does. Since it is not always necessary to operate with the modulus of continuity of a function, we will say that any function $\omega^*(t)$ which is monotone and vanishes as $t$ approaches 0 and which satisfies the inequality $\omega(t) \le \omega^*(t)$ is a modulus of continuity for $f(x)$. The function $f(x)$ is Lipschitzian with Lipschitz constant $M$ if $\omega^*(t) = Mt$ is a modulus of continuity for it; it is Lipschitzian of order $\alpha$ if a function of the form $\omega^*(t) = Mt^{\alpha}$ will serve as a modulus of continuity. In practice, only values of $\alpha$ which are smaller than 1 are of interest; one easily shows, for example, that on the real line a function Lipschitzian of order $\alpha$ for $\alpha > 1$ is necessarily a constant.

Let $\mathcal{F}$ be a family of functions defined on the metric space $X$; the family is called equicontinuous if there exists a fixed modulus of continuity $\omega(t)$ which serves for all functions in the family.

Theorem (Ascoli-Arzelà): Let $\mathcal{F}$ be an infinite equicontinuous family of functions on the compact metric space $K$ which is uniformly bounded, that is, $|f(x)| \le M$ for all $x$ in $K$ and all $f$ in $\mathcal{F}$; then $\mathcal{F}$ contains an infinite sequence $f_n(x)$ which converges uniformly on $K$.

PROOF: The proof is essentially that of the Bolzano-Weierstrass theorem. For any small positive $\varepsilon$, there exists a finite set $F = \{x_1, x_2, \ldots, x_n\}$ of points of $K$ such that every point of $K$ is in an $\varepsilon$-neighborhood of at least one point in $F$. (This is just the assertion that the compact $K$ is totally bounded.) For the same $\varepsilon$, we divide the circle $|z| \le M$ into $l$ disjoint sets $G_j$ of diameter at most $\varepsilon$. Next, we partition the family $\mathcal{F}$ into $l^n$ disjoint subfamilies, each subfamily being described by $n$ assertions of the form $f(x_i)$ in $G_j$. Since $\mathcal{F}$ is infinite, at least one of the subfamilies is also infinite, and for any two functions $f_1(x)$ and $f_2(x)$ belonging to the same subfamily, we have
\[ |f_1(x) - f_2(x)| \le |f_1(x) - f_1(x_i)| + |f_1(x_i) - f_2(x_i)| + |f_2(x_i) - f_2(x)|. \]
If we choose $x_i$ in $F$ so that $d(x_i, x) < \varepsilon$, the first and last terms are bounded by $\omega(\varepsilon)$; the middle term is bounded by $\varepsilon$ since both numbers $f_1(x_i)$ and $f_2(x_i)$
belong to the same set $G_j$; thus, independently of $x$, $|f_1(x) - f_2(x)| \le 2\omega(\varepsilon) + \varepsilon$. Accordingly, for a fixed small $\varepsilon$, we find an infinite subfamily $\mathcal{F}_1$ of $\mathcal{F}$ having the property that any two functions in $\mathcal{F}_1$ differ by at most $2\omega(\varepsilon) + \varepsilon$ anywhere in $K$. Passing to $\varepsilon/2$ and arguing with the family $\mathcal{F}_1$, we obtain an infinite subfamily $\mathcal{F}_2$ associated with the bound $2\omega(\varepsilon/2) + \varepsilon/2$, and continuing in this fashion, we obtain an infinite descending sequence of subfamilies $\mathcal{F}_n$ associated with the bounds $2\omega(\varepsilon 2^{-n}) + 2^{-n}\varepsilon$, which converge to 0. We have now only to choose $f_n(x)$ in the family $\mathcal{F}_n$, distinct from the previous $f_k(x)$, in order to obtain an infinite sequence converging uniformly on $K$.

The sequence $f_n(x)$ obviously converges to a continuous limit $f^*(x)$ which in general does not belong to the family $\mathcal{F}$; however, the function $\omega(t)$ is a modulus of continuity for $f^*$ and $|f^*(x)| \le M$ on $K$. When the metric space $X$ is the union of a sequence of compact sets and the family of functions $\mathcal{F}$ is uniformly bounded and equicontinuous on each compact, we can evidently extract a subsequence which converges uniformly on all compact subsets of $X$.

An important special case of the foregoing arises in function theory. We suppose that $\mathcal{F}$ is a family of functions analytic in some region $G$ and uniformly bounded there by $M$; if $K$ is a compact subset of $G$, it can be surrounded by a rectifiable curve $C$ lying wholly in $G$. We let $d$ denote the distance from $K$ to the curve $C$, and note that for any $f(z)$ in the family and any $z$ in $K$
\[ |f'(z)| = \left| \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta \right| \le \frac{ML}{2\pi d^2}, \]
where $L$ is the length of the curve $C$. Thus the derivatives of functions in $\mathcal{F}$ are uniformly bounded on $K$ and hence those functions are all Lipschitzian with the same Lipschitz constant, that is, the family is equicontinuous on $K$. Accordingly, when the family is infinite, we can extract an infinite sequence which converges uniformly on all compact subsets of $G$ to a limit which is also analytic in $G$ and bounded there by $M$. We apply this remark to prove the following theorem, which may easily be generalized.
Theorem: Let $f(z)$ be analytic and bounded in $G$, the sector $0 < |z| < R$, $|\arg z| < c$, and suppose $f(x)$ approaches 0 as the real $x$ does; then $f(z)$ converges to 0 uniformly in the sector $|\arg z| \le d$ for any $d < c$.
PROOF: (See Fig. 1.) We suppose $R \ge 2$ and consider the compact $K$ defined by $\tfrac12 \le |z| \le 1$ and $|\arg z| \le d$, as well as the sequence of functions $f_n(z) = f(2^{-n}z)$, which is uniformly bounded in $G$ and hence equicontinuous on $K$. We may extract a subsequence converging uniformly on $K$ to an analytic limit $f^*(z)$. On the intersection of the real axis with $K$, we have $f_n(x)$ converging to 0, whence $f^*(x) = 0$, that is, $f^*$ vanishes on the real axis and is therefore identically 0. Since this argument holds for any convergent subsequence of $f_n(z)$, it follows that the original sequence $f_n(z)$ converged uniformly on $K$ to 0. Therefore, for sufficiently large $n$, $|f_n(z)| < \varepsilon$ on $K$, which means $|f(z)| < \varepsilon$ on the part of the sector $|\arg z| \le d$ where $|z| \le 2^{-n}$. Thus $f(z)$ converges to 0 uniformly in the angle.

[Fig. 1.]

An extended real valued function $u(x)$ on a metric space $X$ is lower semicontinuous if it never takes the value $-\infty$ (although $+\infty$ is permitted) and for every real $\lambda$ the set defined by the inequality $u(x) \le \lambda$ is closed. The upper semicontinuous functions are the negatives of the lower semicontinuous ones, and the continuous functions are exactly those which are both upper and lower semicontinuous. If $K$ is a compact subset of $X$ and $u(x)$ is lower semicontinuous on $K$, then that function is bounded from below on $K$, since otherwise the sets $K_n$
consisting of points $x$ in $K$ for which $u(x) \le -n$ would form a decreasing sequence of nonempty closed subsets of $K$; these would have to have a point in common at which the function took the excluded value $-\infty$. The function $u(x)$ actually attains its infimum on $K$, for if $\lambda = \inf_K u(x)$, then the subsets $K_n$ of $K$ defined by $u(x) \le \lambda + 1/n$ have a nonempty intersection $K_\lambda$ upon which $u(x) = \lambda$.

Theorem: Let $f_\alpha(x)$ be a family of continuous (or lower semicontinuous) functions on the metric space $X$ and $F(x) = \sup_\alpha f_\alpha(x)$; then $F(x)$ is lower semicontinuous.

PROOF: It is evident that $F(x)$ cannot assume the value $-\infty$, and the set $F(x) \le \lambda$ is the intersection of the family of closed sets $f_\alpha(x) \le \lambda$ and is therefore closed.

A converse to the previous theorem holds if the space $X$ is compact.

Theorem: Let $u(x)$ be lower semicontinuous on the compact metric space $X$; there then exists a monotone increasing sequence of continuous functions $u_k(x)$ converging to $u(x)$.

PROOF: Since the function $u(x)$ is bounded from below, there is no loss of generality in assuming that $u(x)$ is nonnegative on $X$. The compact metric space $X$ is separable, and the open sets have a countable base, namely, the spheres $S(x_k, r)$ of rational radius centered about points of a given countable dense subset of $X$. For every pair of such spheres $S'$ and $S''$, where $S'$ is contained in $S''$, we select once and for all a continuous function $f(x, S', S'')$ taking values in the interval $[0, 1]$, which vanishes outside $S''$ and equals $+1$ on $S'$. Only a countable family of functions $f(x, S', S'')$ is obtained in this way. For a given positive $\varepsilon$ and every point $x_0$ in $X$ the set $u(x) > u(x_0) - \varepsilon$ is an open set containing $x_0$; there exists, therefore, a pair of spheres $S'$ and $S''$ in the countable base such that $x_0$ is contained in $S'$, which in turn is contained in $S''$, and $S''$ is contained in that open set. Let $r$ be a rational number in the interval $(u(x_0) - 2\varepsilon, u(x_0) - \varepsilon)$; now $r f(x, S', S'')$ is a continuous function satisfying the inequality $r f(x, S', S'') \le u(x)$ everywhere on $X$. Only countably many functions of the form $r f(x, S', S'')$ appear, and it is obvious that $u(x)$ is the supremum of this family if $\varepsilon$ approaches 0. If that family is enumerated in any way and written $g_m(x)$, the functions
\[ u_k(x) = \max_{m \le k} g_m(x) \]
form a monotone increasing sequence of continuous functions which converges to u(x).
2. Infinite Products

Let $a_k$ be a sequence of complex numbers; we consider the sequence of products
\[ p_n = \prod_{k=1}^{n} a_k = a_1 a_2 a_3 \cdots a_n. \]
Obviously, if one of the $a_k$ is 0, the products $p_n$ vanish for all large $n$ and the sequence of products converges trivially to 0. We suppose, therefore, that none of the factors vanishes; it is then clear that the products $p_n$ converge to a limit $P$ which is not zero and which is finite if and only if $\log p_n$ converges to $\log P$ for an appropriate determination of the logarithm, and, therefore, if and only if the series $\sum \log a_k$ is convergent. Now, if that series does not converge absolutely, it will be possible, by a suitable rearrangement of its terms, to make it converge to some other limit or to diverge. Accordingly, the partial products converge to a finite, nonzero limit independently of the order of the factors, if and only if the series $\sum |\log a_k|$ converges to a finite sum. From the convergence of this series we deduce $\lim_k \log a_k = 0$, and therefore $\lim_k a_k = 1$, as we would expect.

In studying the convergence of the product, then, we can assume that the numbers $a_k$ are sufficiently close to 1. Consider that determination of the logarithm which is real on the real axis; we have $\log 1 = 0$ and the logarithm is analytic in a circle of radius 1 about $z = 1$. We can divide that function by $(z - 1)$ to obtain a quotient $q(z) = \log z/(z - 1)$ which is analytic in the same circle and such that $q(1) = 1$. It is now clear that there exists $R > 0$ such that in a circle about $z = 1$ of radius $R$, $\tfrac12 < |q(z)| < 2$, and therefore,
\[ \tfrac12 |z - 1| < |\log z| < 2 |z - 1|. \]
We write $a_k = 1 + b_k$ and know that $b_k$ converges to 0. Hence, for large $k$,
\[ \tfrac12 |b_k| < |\log a_k| < 2 |b_k|, \]
and the absolute convergence of the series of logarithms is completely equivalent to the convergence of the series $\sum |b_k|$. To sum up: The infinite product $\prod_{k=1}^{\infty} (1 + b_k)$ converges to a finite, nonzero limit independently of the order of the factors, if and only if the series $\sum |b_k|$ converges and no $b_k$ is $-1$.
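The criterion just summarized can be observed numerically. The following is only a minimal sketch: the helper name `partial_products` and the two sample sequences $b_k = 1/k^2$ and $b_k = 1/k$ are choices made here for illustration and do not come from the text. For the first sequence $\sum |b_k|$ converges and the partial products settle down (the limit is known to be $\sinh(\pi)/\pi$); for the second the series diverges and the $n$th partial product telescopes to $n + 1$.

```python
import math

def partial_products(b, n):
    """Return the list p_1, ..., p_n of partial products of prod (1 + b(k))."""
    products, p = [], 1.0
    for k in range(1, n + 1):
        p *= 1.0 + b(k)
        products.append(p)
    return products

# Sum |b_k| converges: b_k = 1/k^2, so the product has a finite nonzero limit.
conv = partial_products(lambda k: 1.0 / k**2, 20000)
# Sum |b_k| diverges: b_k = 1/k, and the n-th partial product equals n + 1.
div = partial_products(lambda k: 1.0 / k, 20000)

print(conv[-1], math.sinh(math.pi) / math.pi)  # the two values nearly agree
print(div[-1])                                 # grows without bound with n
```

A finite computation of this kind can only suggest convergence or divergence; it is no substitute for the argument with the logarithm.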
As an example we consider the infinite product
\[ P(z) = \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2 \pi^2} \right). \]
For any fixed value of $z$, the series $\sum |z^2/n^2\pi^2|$ evidently converges, thus $P(z)$ is well defined and finite for all $z$, and vanishes for $z$ of the form $n\pi$ with $n$ a nonzero integer and only at such $z$. If $P_m(z)$ is the $m$th partial product and $|z| \le R$, then
\[ |P_m(z)| \le \prod_{n=1}^{m} \left( 1 + \frac{R^2}{n^2 \pi^2} \right) = P_m(iR) \le P(iR). \]
It follows that the sequence of partial products is uniformly bounded in the circle of radius $R$, and hence contains a subsequence converging uniformly on that circle to an analytic limit which necessarily has the value $P(z)$ at $z$. The uniqueness of the limit shows that the passage to a subsequence was unnecessary, and since $R$ was arbitrary, it follows that $P(z)$ is an entire function. Although true, it is not so clear that $P(z) = \sin z / z$. A proof will be given in Section 44.

Let us recall another theorem from the theory of functions:

Schwarz's Lemma: Let $f(z)$ be analytic in the circle $|z| < 1$ and bounded there; set $M = \sup |f(z)|$, $|z| < 1$, and suppose $f(0) = 0$. Then the function $h(z) = f(z)/z$ is also analytic in the circle and $\sup |h(z)| = M$.

PROOF: From the power series expansion we see that we can divide out $z$, and so $h(z)$ is analytic in the circle. We pass to a subcircle of radius $r = 1 - \varepsilon$ where the positive $\varepsilon$ is small. For that subcircle, the function $|h(z)|$ assumes its maximum on the boundary, and that maximum is therefore of the form
\[ |h(re^{i\theta})| = \frac{|f(re^{i\theta})|}{r} \le \frac{M}{1 - \varepsilon}. \]
Since $\varepsilon$ is arbitrarily small, $|h(z)|$ is bounded by $M$.
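Returning for a moment to the example product $\prod (1 - z^2/n^2\pi^2)$ considered earlier in this section, the identity with $\sin z / z$ that the text defers to a later section can at least be made plausible numerically. The sketch below is illustrative only: the function name `sine_product`, the test point, and the number of factors are assumptions chosen here, and agreement at one point proves nothing.

```python
import cmath

def sine_product(z, m):
    """m-th partial product of prod_{n>=1} (1 - z^2 / (n^2 pi^2))."""
    p = complex(1.0)
    for n in range(1, m + 1):
        p *= 1.0 - (z * z) / (n * n * cmath.pi ** 2)
    return p

z = 1.5 + 0.5j                      # an arbitrary test point
approx = sine_product(z, 100000)    # partial product with many factors
exact = cmath.sin(z) / z            # the conjectured closed form

print(abs(approx - exact))          # small, and shrinking roughly like 1/m
```

The slow decay of the error, of order $|z|^2/(\pi^2 m)$, reflects the fact that the tail of the series $\sum |z^2/n^2\pi^2|$ is of order $1/m$.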
We obtain virtually the same result if we change the hypothesis slightly and suppose that $f(a) = 0$ for some $a \ne 0$ in the unit circle and divide out the function
\[ h_a(z) = \frac{|a|}{a}\,\frac{a - z}{1 - \bar{a} z}; \]
this linear fractional function has a zero at $a$, a pole at $1/\bar{a}$ and is regular in a neighborhood of $|z| = 1$ where it has absolute value 1. (Check this by computing the absolute value of $h_a(e^{i\theta})$.) If we divide $f(z)$ by $h_a(z)$, we find, as before, that the quotient has the same bound $M$ in the circle.

We consider next a function $f(z)$, analytic in the circle and bounded by $M$; we suppose also that $f(0) = p > 0$. Let $a_k$ be the sequence of zeros of $f(z)$; in general, this sequence is infinite, and we find it convenient to enumerate the zeros in such a way that
\[ |a_1| \le |a_2| \le |a_3| \le \cdots. \]
It should be noted that if $a$ is a zero of order $\nu$, then $a$ occurs $\nu$ times in the sequence. Thus, each zero is counted as often as its multiplicity requires. Then, successively, for each $a_k$, we divide out $h_{a_k}(z)$, the quotient each time being bounded by $M$. In particular, at the origin we have, for any $n$,
\[ 0 < \frac{p}{M} \le \prod_{k=1}^{n} |a_k|. \]
Since $|a_k|$ is always positive and smaller than 1, the sequence of partial products diminishes to a nonzero limit and hence the infinite product converges. We deduce that the series $\sum (1 - |a_k|)$ converges. Thus we have proved half of the following theorem, due to W. Blaschke.

Theorem (Blaschke): A sequence $a_k$ of complex numbers in the unit circle is the set of zeros of a bounded analytic function with appropriate multiplicity if and only if the series $\sum (1 - |a_k|)$ converges.

Note that $1 - |a|$ is the distance from $a$ to the boundary of the circle. To complete the proof of the Blaschke theorem, we construct, for a given sequence $a_k$ satisfying the condition, a bounded analytic function having exactly those zeros. We may suppose that no $a_k$ is zero. The function in question is the Blaschke product
\[ B(z) = \prod_{k=1}^{\infty} h_{a_k}(z) \]
which vanishes whenever $z = a_k$. For other values of $z$, the product converges to a finite, nonzero limit, since
\[ -b_k(z) = 1 - h_{a_k}(z) = \frac{(1 - |a_k|)(|a_k| + \bar{a}_k z)}{|a_k|\,(1 - \bar{a}_k z)}, \]
whence $|b_k(z)| \le 2(1 - |a_k|)/(1 - |z|)$ and therefore $\sum |b_k(z)|$ converges. The partial products $B_m(z) = \prod_{k=1}^{m} h_{a_k}(z)$ are rational functions which have the bound 1 in the unit circle; the sequence of these products is then uniformly bounded for $|z| < 1$. Hence, there exists a subsequence converging to a limit $B(z)$ which is analytic in the circle; $B(z)$ must then coincide with the infinite product. This completes the proof.

Note that by an ingenious choice of the numbers $a_k$ we can construct a bounded analytic $B(z)$ which has the circle $|z| = 1$ as a natural boundary.

The theorem of Blaschke and Schwarz's lemma combined permit us to set up a canonical factorization for functions bounded and analytic in the circle:
\[ f(z) = C z^m B(z) e^{g(z)}, \]
where $m$ is an integer $\ge 0$, $C$ a constant, $B(z)$ a Blaschke product, and $g(z) = u(z) + iv(z)$ is analytic with $u(z) \le 0$. The integer $m$ is the multiplicity of the zero of $f(z)$ at the origin, if there is one, $m$ being equal to 0 otherwise, and the Blaschke product is completely determined by the other zeros of $f(z)$. Because of the argument above, the ratio $h(z) = f(z)/z^m B(z)$ is bounded in the circle and $C$ should be taken as its bound. It follows that the function $h(z)/C$ has no zeros in the circle and is bounded there by 1; its logarithm is therefore analytic in the circle with a negative real part.
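The properties of the factors used in the Blaschke construction can be checked numerically. The sketch below assumes the normalization $h_a(z) = (|a|/a)(a - z)/(1 - \bar{a}z)$ for $a \ne 0$, which is one standard form consistent with the bound $2(1 - |a_k|)/(1 - |z|)$ quoted above; the names `blaschke_factor` and `blaschke_partial` and the sample zeros $a_k = (1 - (k+1)^{-2})e^{ik}$ are hypothetical choices made here, not taken from the text.

```python
import cmath

def blaschke_factor(a, z):
    """Assumed normalization: h_a(z) = (|a|/a) (a - z) / (1 - conj(a) z), a != 0."""
    return (abs(a) / a) * (a - z) / (1 - a.conjugate() * z)

def blaschke_partial(zeros, z):
    """Finite Blaschke product over the given list of zeros."""
    b = complex(1.0)
    for a in zeros:
        b *= blaschke_factor(a, z)
    return b

# Hypothetical zeros with sum (1 - |a_k|) = sum 1/(k+1)^2 convergent.
zeros = [(1 - 1 / (k + 1) ** 2) * cmath.exp(1j * k) for k in range(1, 200)]

# A single factor has modulus 1 on the unit circle ...
on_circle = [abs(blaschke_factor(zeros[0], cmath.exp(1j * t / 10))) for t in range(63)]
# ... and the partial products are bounded by 1 inside it (sampled on |z| = 0.9).
inside = [abs(blaschke_partial(zeros, 0.9 * cmath.exp(1j * t / 10))) for t in range(63)]

print(max(abs(v - 1) for v in on_circle), max(inside))
```

Sampling on two circles is of course only a spot check; the bound inside the disc is what the maximum principle guarantees for every factor.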
3. Convex Functions

We shall consider only functions $f(x)$, real and finite, defined on an open interval $(a, b)$. Such a function is midpoint convex if and only if for all $x$, $y$ in $(a, b)$
\[ f\!\left( \frac{x + y}{2} \right) \le \frac{f(x) + f(y)}{2}, \]
and is said to be convex if and only if for all $x$, $y$ in $(a, b)$ and all $t$ in the closed interval $[0, 1]$
\[ f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y). \]
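The two definitions differ only in the set of admissible $t$, which the following small sketch makes explicit. The helper `violates_convexity` and the sample grids are assumptions introduced here for illustration; a grid search of this kind can refute convexity by exhibiting a violation, but can never establish it.

```python
def violates_convexity(f, xs, ts, tol=1e-12):
    """Return True if f(t x + (1-t) y) > t f(x) + (1-t) f(y) somewhere on the grid."""
    for x in xs:
        for y in xs:
            for t in ts:
                if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + tol:
                    return True
    return False

xs = [i / 10 for i in range(-20, 21)]   # sample points in [-2, 2]
ts_mid = [0.5]                          # midpoint convexity: t = 1/2 only
ts_all = [j / 8 for j in range(9)]      # t = 0, 1/8, ..., 1

print(violates_convexity(lambda x: x * x, xs, ts_all))   # x^2 passes every test
print(violates_convexity(lambda x: x ** 3, xs, ts_mid))  # x^3 fails already at t = 1/2
```

For $x^3$ the failure occurs, for instance, at $x = -2$, $y = 0$, where the midpoint value $-1$ exceeds the average $-4$.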
The convex functions are clearly midpoint convex: we have only to set $t = \tfrac12$; we shall show that any midpoint convex function which satisfies reasonable further conditions is convex. That there exist midpoint convex functions which are not convex is shown by the following example: consider the real numbers as a vector space over the field of rational numbers, and let $\{x_\lambda\}$ be a Hamel base; every $x$ in $R$ is representable in a unique way as a finite sum with rational coefficients
\[ x = \sum_\lambda c_\lambda(x)\, x_\lambda. \]
The coefficients $c_\lambda(x)$ are "linear" functions of $x$ taking rational values; since $\tfrac12$ is rational,
\[ c_\lambda\!\left( \frac{x + y}{2} \right) = \frac{c_\lambda(x) + c_\lambda(y)}{2}, \]
and, therefore, $c_\lambda(x)$ is midpoint convex. Since it is not the constant function, and assumes only rational values, $c_\lambda(x)$ is not continuous, and therefore not convex, since, as we shall see, convex functions are continuous.
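Theorem: If $f(x)$ is midpoint convex and continuous, then $f(x)$ is convex.

PROOF: We first show, by induction on $n$, that the convexity inequality above holds for all $x$ and $y$ in $(a, b)$ and all $t$ of the form $p/2^n$. The inequality being shown for $n$, we pass to $n + 1$: let
\[ z = \frac{p}{2^{n+1}}\, x + \frac{q}{2^{n+1}}\, y = \frac12 \left[ \frac{p}{2^n}\, x + \frac{r}{2^n}\, y + y \right], \quad \text{where } p + q = 2^{n+1}, \]
and where we may suppose that $p < q$, whence $p < 2^n < q = 2^n + r$. Now
\[ f(z) \le \frac12 \left[ f\!\left( \frac{p}{2^n}\, x + \frac{r}{2^n}\, y \right) + f(y) \right] \le \frac12 \left[ \frac{p}{2^n} f(x) + \frac{r}{2^n} f(y) + f(y) \right] = \frac{p}{2^{n+1}} f(x) + \frac{q}{2^{n+1}} f(y). \]
Since the set of $t$ of the form $p/2^n$ is dense in the unit interval, from the continuity of $f(x)$ we obtain the full convexity inequality, that is, $f(x)$ is convex.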
Theorem: If $f(x)$ is midpoint convex and is discontinuous at a point $x_0$ in $(a, b)$, then $f(x)$ is unbounded on every subinterval of $(a, b)$, and hence everywhere discontinuous.

PROOF: We may suppose that the interval is of the form $(-a, a)$, that $x_0 = 0$ and that $f(0) = 0$. There exists a sequence $x_n$ converging to 0 for which $f(x_n)$ converges to a limit $m \ne 0$; we may suppose $m > 0$, since, otherwise, we pass to $y_n = -x_n$ and use that sequence instead. Now the sequence $2x_n$ also converges to 0, and we have
\[ f(x_n) \le \tfrac12 f(0) + \tfrac12 f(2x_n) = \tfrac12 f(2x_n), \]
and therefore $\liminf f(2x_n) \ge 2m$. Repeating the argument, $\liminf f(4x_n) \ge 4m$ and inductively $\liminf f(2^k x_n) \ge 2^k m$. Thus $f(x)$ is not bounded near $x = 0$, and there even exists a sequence $x_n$ converging to 0 upon which $f$ converges to infinity. Let $z$ be an arbitrary point of the interval; the sequence $z + 2x_n$ converges to $z$, while
\[ f(x_n) = f\!\left( \frac{(z + 2x_n) + (-z)}{2} \right) \le \tfrac12 f(z + 2x_n) + \tfrac12 f(-z). \]
Since the left-hand side converges to infinity with increasing $n$, the right-hand side also converges to infinity, whence $f(z + 2x_n)$ converges to infinity, and $f$ is not bounded near the point $z$. Since $z$ was arbitrary, it follows that $f(x)$ is bounded in the neighborhood of no point.

Since convex functions are bounded on subintervals, it follows that convex functions are continuous. The following beautiful theorem is due to Sierpinski.
Theorem (Sierpinski): If f(x) is midpoint convex and Lebesgue measurable, then it is convex.
PROOF: The theorem is a consequence of the following even stronger result proved by Ostrowski.

Theorem (Ostrowski): If $f(x)$ is midpoint convex and bounded on a set $E$ which is Lebesgue measurable with positive measure, then $f(x)$ is convex.
PROOF: We will write $m(A)$ to denote Lebesgue measure, $m^*(A)$ for the Lebesgue outer measure. There exists an open set $G$ in $(a, b)$ containing $E$ such that $m(E) \le m(G) < \tfrac32 m(E)$. Since $G$ is the union of a sequence of disjoint intervals $I_n$ and all these sets are measurable, we may write
\[ \sum m(E \cap I_n) \le \sum m(I_n) < \tfrac32 \sum m(E \cap I_n); \]
hence, for at least one value of $n$ we have
\[ m(E \cap I_n) \le m(I_n) < \tfrac32\, m(E \cap I_n). \]
We may therefore suppose that $I_n = I = (a, b)$ and that $E$ is contained in $I$. Let $M$ be the bound for $f(x)$ on $E$. If $f(x)$ is not convex, then by the two preceding theorems it is unbounded on every subinterval, so the set of $x$ in $I$ where $f(x) > M$ is dense; we may therefore assume that the midpoint of $I$ is in that set (otherwise we diminish $I$ slightly). We may pick coordinates so that the midpoint is 0, and $I = (-a, a)$. Let $S$ be the set $f(x) > M$ and $\tilde{S}$ its reflection through the origin, that is, the set of all $x$ such that $-x$ is in $S$. For any choice of $x$
\[ M < f(0) \le \tfrac12 f(x) + \tfrac12 f(-x), \]
and therefore either $x$ or $-x$ belongs to $S$. Hence $S \cup \tilde{S}$ contains the whole interval $I$. Moreover, although $S$ may not be measurable, $\tilde{S}$ is obtained from it by an isometry, hence $m^*(\tilde{S}) = m^*(S)$. Since $E$ is disjoint from $S$ we have $E \subset \tilde{S}$ as well as $S \subset I - E$, the complement of $E$. Then
\[ m(E) \le m^*(\tilde{S}) = m^*(S) \le m(I - E) = m(I) - m(E), \]
and so $2m(E) \le m(I) < \tfrac32 m(E)$, contradicting the hypothesis $m(E) > 0$.

We find this theorem useful when we have to verify that a given function is convex: the function is usually obviously measurable, hence only the midpoint convexity has to be verified, and this involves only the convenient number $\tfrac12$.

When $f(x)$ is convex, we consider the difference quotient (for $h > 0$)
\[ \frac{f(x + h) - f(x)}{h} \]
which is defined and continuous in the interval $(a, b - h)$. We verify that the difference quotient is monotone increasing in $h$ and also in $x$. For $h' < h''$ we write $h' = th$, $h'' = h$, where $0 < t < 1$, and have to show
\[ \frac{f(x + th) - f(x)}{th} \le \frac{f(x + h) - f(x)}{h}, \]
that is, $f(x + th) \le t f(x + h) + (1 - t) f(x)$, which is true. For $x' < x''$, we want
\[ f(x' + h) - f(x') \le f(x'' + h) - f(x'') \]
or
\[ f(x' + h) + f(x'') \le f(x'' + h) + f(x'). \]
Now
\[ x'' = \frac{h}{d}\, x' + \frac{x'' - x'}{d}\, (x'' + h) \]
and
\[ x' + h = \frac{x'' - x'}{d}\, x' + \frac{h}{d}\, (x'' + h), \]
where $d = x'' - x' + h$, and therefore
\[ f(x'') \le \frac{h}{d}\, f(x') + \frac{x'' - x'}{d}\, f(x'' + h), \qquad f(x' + h) \le \frac{x'' - x'}{d}\, f(x') + \frac{h}{d}\, f(x'' + h), \]
and we obtain the desired inequality by addition.

It follows that, if we let the positive $h$ diminish to 0 in the difference quotient, we obtain a limit $f'_+(x)$ at every $x$ (from the monotonicity of the difference quotients that function is itself monotone nondecreasing). In a similar way, the difference quotients on the left ($h > 0$)
\[ \frac{f(x - h) - f(x)}{-h} \]
are monotone increasing in $x$ and monotone decreasing in the positive $h$; thus, for each $x$ there exists the limit $f'_-(x)$, which is a monotone function of $x$. It is easy to see that $f'_-(x) \le f'_+(x) \le f'_-(x + h)$, so that at any point of continuity of $f'_-(x)$ we have equality. Thus the convex function $f(x)$ has a left
derivative at all points and a right derivative at all points: the derivatives are monotone nondecreasing and coincide except on the set of their (common) discontinuities, and this is at most a countable set. Thus $f(x)$ is differentiable almost everywhere, with a monotone derivative. When we pass to a closed subinterval $[\alpha, \beta]$ of $(a, b)$ and consider any points $x$, $y$ ($x < y$) in the subinterval, the difference quotient satisfies the following inequality:
\[ f'_+(\alpha) \le \frac{f(y) - f(x)}{y - x} \le f'_-(\beta), \]
and hence, all such difference quotients are bounded in absolute value by $\max(|f'_+(\alpha)|, |f'_-(\beta)|)$, from which it follows that the function is Lipschitzian on the closed interval, therefore absolutely continuous there. We see that $f(x)$ is the indefinite integral of its monotone derivative, and choosing any point $c$ in $(a, b)$, we have finally
\[ f(x) = f(c) + \int_c^x f'(t)\, dt. \]
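The monotonicity of the difference quotients, and the resulting one-sided derivatives, can be watched on a concrete example. The sketch below uses the convex function $f(x) = |x| + x^2$, chosen here purely for illustration because it has a corner at 0; the helper names `right_quotient` and `left_quotient` are likewise hypothetical. With dyadic step sizes the arithmetic is exact in floating point, so the strict monotonicity is visible without rounding noise.

```python
def f(x):
    """A convex function with a corner at 0 (chosen only for illustration)."""
    return abs(x) + x * x

def right_quotient(x, h):
    """Difference quotient on the right, (f(x + h) - f(x)) / h, for h > 0."""
    return (f(x + h) - f(x)) / h

def left_quotient(x, h):
    """Difference quotient on the left, (f(x - h) - f(x)) / (-h), for h > 0."""
    return (f(x - h) - f(x)) / (-h)

hs = [2.0 ** (-k) for k in range(1, 20)]       # h decreasing to 0
right = [right_quotient(0.0, h) for h in hs]   # decreases toward the right derivative 1
left = [left_quotient(0.0, h) for h in hs]     # increases toward the left derivative -1
in_x = [right_quotient(i / 4, 0.5) for i in range(-8, 9)]  # fixed h, increasing x

print(right[-1], left[-1])
```

At the corner the two limits are $-1$ and $+1$, so the left and right derivatives differ there, in agreement with the statement that they coincide only off a countable set.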
Conversely, it is easy to verify that the indefinite integral of an arbitrary monotone increasing function is convex.

The following assertions are easy to prove: the sum of two convex functions is convex and so also is their maximum; the supremum of an arbitrary family of convex functions is convex on any interval where it is finite; if $F(\lambda)$ is convex and monotone increasing and $f(x)$ convex, the composed function $F(f(x)) = k(x)$ is convex where it exists. In particular, therefore, when $f(x)$ is convex, so also is $e^{f(x)}$. A $C^2$-function $f(x)$ is convex if and only if $f''(x) \ge 0$.

It is geometrically self-evident that a convex curve is above its tangent; we get a formal proof by noting that for positive $h$
\[ \frac{f(x + h) - f(x)}{h} \ge f'_+(x), \]
and therefore,
\[ f(x + h) \ge f(x) + h f'_+(x), \]
and on the other side, we obtain $f(x - h) \ge f(x) - h f'_-(x)$. Thus, if $m$ is any number in the interval $f'_-(x) \le m \le f'_+(x)$, then $f(x) + m(y - x) \le f(y)$ for all $y$ in $(a, b)$. This circumstance gives us a proof of the famous Jensen's Inequality.
Theorem (Jensen): If $d\mu(\lambda)$ is a positive distribution of mass of total mass 1 on the interval $(a, b)$ where the convex function $k(\lambda)$ is defined, and if $c = \int \lambda\, d\mu(\lambda)$, then
\[ k(c) \le \int k(\lambda)\, d\mu(\lambda). \]
Equality occurs if and only if $d\mu$ is supported by a point, or an interval upon which $k(\lambda)$ is linear.
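For a measure consisting of finitely many point masses the statement of the theorem reduces to a finite sum, which makes it easy to test. The following is a minimal sketch under that assumption; the function name `jensen_gap` and the sample points and masses are hypothetical choices made here for illustration.

```python
import math

def jensen_gap(k, points, masses):
    """Return  sum_i p_i k(x_i) - k(sum_i p_i x_i)  for masses p_i summing to 1."""
    c = sum(p * x for p, x in zip(masses, points))
    return sum(p * k(x) for p, x in zip(masses, points)) - k(c)

points = [0.5, 1.0, 2.0, 4.0]
masses = [0.1, 0.2, 0.3, 0.4]      # positive masses of total mass 1

gap_exp = jensen_gap(math.exp, points, masses)                   # k strictly convex
gap_neglog = jensen_gap(lambda x: -math.log(x), points, masses)  # k = -log on (0, inf)
gap_linear = jensen_gap(lambda x: 3 * x - 1, points, masses)     # k linear: equality
gap_point = jensen_gap(math.exp, [2.0], [1.0])                   # point mass: equality

print(gap_exp, gap_neglog, gap_linear, gap_point)
```

The first two gaps are strictly positive, while the last two vanish up to rounding, matching the two cases of equality named in the theorem.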
PROOF: We may pick coordinates so that $c = 0$, and by adding a constant to $k(\lambda)$ (which does not affect the convexity) we may suppose that $k(0) = 0$. We have to show that the integral above is nonnegative. If $m = k'_+(0)$ the function $k(\lambda) - m\lambda$ is nonnegative on the interval, hence has a nonnegative integral; since $\int \lambda\, d\mu(\lambda) = c = 0$,
\[ 0 \le \int [k(\lambda) - m\lambda]\, d\mu(\lambda) = \int k(\lambda)\, d\mu(\lambda). \]
We see that equality occurs only if $d\mu$ is supported by a set where the nonnegative and convex function $k(\lambda) - m\lambda$ vanishes; if this set is not a point, then it is an interval upon which $k(\lambda)$ is linear.

In the special case when $d\mu$ consists of two point masses, Jensen's inequality is the usual assertion of convexity. Note, however, if for some choice of $x$ and $y$ in the interval,
\[ f\!\left( \frac{x + y}{2} \right) = \frac{f(x) + f(y)}{2}, \]
then $f$ is linear in the interval $(x, y)$. Since the function $f(x) = -\log x$ is convex on the right half-axis we invoke Jensen's inequality for the measure $d\mu$ consisting of a finite or countable distribution of masses $p_i$ at points $x_i$ to obtain $-\log\left(\sum p_i x_i\right) \le -\sum p_i \log x_i$, where $\sum p_i = 1$; taking exponentials
\[ \prod x_i^{p_i} \le \sum p_i x_i, \]
which is the most general form of the inequality between arithmetic and geometric means. We note that since the logarithm is linear on no interval, equality can occur only if all of the points $x_i$ coincide.

We pass next to the class of logarithmically convex functions. A function $f(x)$ is logarithmically convex if and only if it is positive and its logarithm is a convex function; equivalently, if and only if $f(x) = e^{k(x)}$ where $k(x)$ is convex.
Every such function is convex, but not every convex function is logarithmically convex: for example, $f(x) = x = e^{\log x}$ is clearly not logarithmically convex. It is convenient to notice how the midpoint convexity of the logarithm is expressed:
\[ f\!\left( \frac{x + y}{2} \right)^2 \le f(x)\, f(y). \]
A $C^2$-function is logarithmically convex if and only if $(f')^2 \le f f''$. Using the midpoint logarithmic convexity inequality, it is easy to show that the sum of log convex functions is log convex. We cannot infer that log convex functions are any smoother than convex functions, but remark that no nonconstant log convex function can be linear on an interval, since we would have $f'' = 0$, hence $f' = 0$ on the interval. The positive limit of a sequence of logarithmically convex functions is logarithmically convex if it is finite, as is also the finite supremum of an arbitrary family of such functions. We use these facts in the following proof of Hölder's inequality.
Theorem (Hölder): Let $(X, \mu)$ be a measure space and $f(x)$ and $g(x)$ measurable functions on it such that $F(x) = |f(x)|^p$ and $G(x) = |g(x)|^q$ are integrable, with $1 < p < \infty$ and $1/p + 1/q = 1$; then the (obviously measurable) function $f(x)g(x)$ is integrable and
$$\Bigl|\int f(x) g(x)\, d\mu(x)\Bigr| \le \Bigl[\int |f(x)|^p\, d\mu(x)\Bigr]^{1/p} \Bigl[\int |g(x)|^q\, d\mu(x)\Bigr]^{1/q}.$$
Equality can occur if and only if there exists a constant $C$ such that $F(x) = C\,G(x)$ almost everywhere $\mu$.
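For a measure consisting of finitely many point masses, both sides of Hölder's inequality are finite sums and can be compared directly. The following is a minimal sketch under that assumption; the function name `holder_sides` and the sample data are hypothetical.

```python
def holder_sides(f, g, w, p):
    """Both sides of Hölder's inequality for a discrete measure.

    f, g hold the function values at finitely many points and w the
    (positive) masses of those points; p > 1 and q = p / (p - 1).
    """
    if p <= 1:
        raise ValueError("p must exceed 1")
    q = p / (p - 1.0)
    lhs = abs(sum(fi * gi * wi for fi, gi, wi in zip(f, g, w)))
    norm_f = sum(abs(fi) ** p * wi for fi, wi in zip(f, w)) ** (1.0 / p)
    norm_g = sum(abs(gi) ** q * wi for gi, wi in zip(g, w)) ** (1.0 / q)
    return lhs, norm_f * norm_g

w = [0.5, 1.0, 2.0]
lhs, rhs = holder_sides([1.0, -2.0, 3.0], [4.0, 0.5, -1.0], w, p=3.0)
print(lhs <= rhs)

# Equality case: p = 3, q = 3/2 and g = f**2 with f > 0, so |f|**p = |g|**q.
f_eq = [1.0, 2.0, 3.0]
g_eq = [x * x for x in f_eq]
lhs_eq, rhs_eq = holder_sides(f_eq, g_eq, w, p=3.0)
print(lhs_eq, rhs_eq)  # the two sides agree up to rounding
```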
PROOF: We may suppose $0 < F(x)G(x) < \infty$, since otherwise we pass to the subset of $X$ on which this inequality is satisfied. Let $s = 1/p$; from the inequality between geometric and arithmetic means,
$$|f(x) g(x)| = F(x)^s G(x)^{1-s} \le s F(x) + (1 - s) G(x),$$
and therefore the product is integrable, and indeed, $F(x)^t G(x)^{1-t}$ is integrable for all $t$ in the interval $[0, 1]$. Thus the function
$$\Phi(t) = \int F(x)^t G(x)^{1-t}\, d\mu(x) = \int \exp[t(\log F(x) - \log G(x))]\, G(x)\, d\mu(x)$$
is logarithmically convex, being the limit of a sequence of such functions. It follows, therefore, that $\Phi(t) \le \Phi(0)^{1-t} \Phi(1)^t$, and for $t = s = 1/p$ this is
Hölder's inequality. If we suppose that equality occurs for this value, we deduce from Jensen's inequality that $\log \Phi(t)$ was linear in the interval, whence $\Phi(t) = AB^t$ for appropriate $A$ and $B$. Setting $G(x)\, d\mu(x) = d\nu(x)$ to obtain a positive measure of finite total mass, and putting
$$H(x) = \log F(x) - \log G(x) - \log B,$$
we have
$$A(t) = \int e^{tH(x)}\, d\nu(x) = \text{constant}$$
for all $t$ in the interval. The second differences of $A(t)$ are all 0. However, the second difference under the integral sign is of the form
$$e^{(t+h)H(x)} - 2e^{tH(x)} + e^{(t-h)H(x)} = e^{tH(x)} \bigl[e^{hH(x)/2} - e^{-hH(x)/2}\bigr]^2$$
and the second factor is a square. Therefore, putting $K(x) = (h/2)H(x)$, we have
$$\int e^{tH(x)} \sinh^2 K(x)\, d\nu(x) = 0,$$
and as the integrand is nonnegative, it vanishes for almost all $x$. Thus $K(x) = 0$ almost everywhere, whence $F(x) = B\,G(x)$ almost everywhere.

Three Lines Theorem (Lindelöf): Let $f(z)$ be bounded and analytic in the strip $a < x < b$ and not identically 0; then the function
$$\mu(x) = \sup_y |f(x + iy)|$$
is logarithmically convex in the interval $(a, b)$.

PROOF: We may take $(a, b) = (0, 1)$ and remark that it is enough to prove the theorem for functions $f(z)$ which converge to 0 uniformly with increasing $|y|$; we form $f_N(z) = f(z)(N - 1)/(N - z)$, a function which vanishes at infinity in the strip and which converges with increasing $N$ to $f(z)$. Since the factor $(N - 1)/(N - z)$ is bounded in absolute value by 1 in the strip, the corresponding $\mu_N(x) \le \mu(x)$, and with increasing $N$ converges to $\mu(x)$. Thus $\mu(x)$ is the limit of a sequence of log convex functions, and is log convex itself.

We may therefore suppose that for $\varepsilon > 0$ there exists $N$ such that $|y| > N$ implies $|f(z)| < \varepsilon$, and therefore on any line $x = $ constant, the function $|f(x + iy)|$ actually attains its maximum.
If $\mu(x)$ is not log convex, there exist points $x'$ and $x''$ in the interval such that
$$\log \mu\Bigl(\frac{x' + x''}{2}\Bigr) > \frac{\log \mu(x') + \log \mu(x'')}{2}.$$
We may choose the real $\lambda$ in such a way that the function $k(x) = \log \mu(x) + \lambda x$ is such that $k(x') = k(x'') < k([x' + x'']/2)$. Let $f^*(z) = f(z)e^{\lambda z}$; this function, bounded and analytic in the strip, corresponds to $\mu^*(x) = e^{k(x)}$. We consider $f^*(z)$ in the rectangle determined by the vertical lines $x = x'$ and $x = x''$ and the horizontal lines $|y| = N$ where $N$ is large. Evidently $|f^*(z)|$ attains its maximum at an interior point of the rectangle, contradicting the usual maximum principle for harmonic functions. Thus $\mu(x)$ is logarithmically convex.

Another version, perhaps older than the Lindelöf theorem, is the following.
Hadamard Three Circles Theorem: Let $f(z)$ be bounded and analytic in the circle $|z| < 1$ and not identically zero; then the function
$$M(r) = \sup_{|z| = r} |f(z)|$$
is a logarithmically convex function of $\log r$.

PROOF: The function $F(\zeta) = f(e^{\zeta})$ is analytic in the left half-plane and bounded there; by the Three Lines Theorem,
$$\mu(x) = \sup_y |F(x + iy)| = \sup_y |f(e^x e^{iy})| = M(e^x)$$
is logarithmically convex in $x$.

We conclude this section with a theorem from the Theory of Equations.

Theorem: Let $P(x)$ be a polynomial with real coefficients, all of whose roots are real. If $P(x) = \sum_{k=0}^{n} a_k x^k$, then $a_{k-1} a_{k+1} \le a_k^2$ for $1 \le k \le n - 1$.
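PROOF: The inequality to be established may be written
$$\frac{P^{(k-1)}(0)}{(k-1)!} \cdot \frac{P^{(k+1)}(0)}{(k+1)!} \le \Bigl[\frac{P^{(k)}(0)}{k!}\Bigr]^2.$$
Since $(k - 1)!\,(k + 1)! \ge (k!)^2$, it is enough to show
$$P^{(k-1)}(0)\, P^{(k+1)}(0) \le [P^{(k)}(0)]^2,$$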
and it is even sufficient to prove the inequality for the special case $k = 1$, since if $P(x)$ has only real zeros, it follows from Rolle's theorem that $P'(x)$, and indeed all of the derivatives, have the same property. The problem reduces, therefore, to the proof of the inequality
$$P''(x) P(x) \le [P'(x)]^2$$
for $x = 0$, but it is easier to show that it holds for all $x$. This inequality is the assertion that the function $-\log|P(x)|$ is convex in any interval $(a, b)$ in which $P$ has no zeros. If the not necessarily distinct zeros of $P(x)$ are written $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
$$-\log|P(x)| = -\log|a_n| - \sum_{i=1}^{n} \log|x - \lambda_i|$$
is obviously convex between those zeros. The proof is complete.
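The coefficient inequality of the theorem is easy to test on examples. The sketch below builds the coefficients of a monic polynomial from prescribed real roots and checks $a_{k-1}a_{k+1} \le a_k^2$; the function names are hypothetical and the tolerance is an assumption made to absorb rounding.

```python
def poly_from_roots(roots):
    """Coefficients [a_0, ..., a_n] (ascending powers) of prod (x - r)."""
    coeffs = [1.0]
    for r in roots:
        new = [0.0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c      # contribution of x * (c x^i)
            new[i] -= r * c      # contribution of -r * (c x^i)
        coeffs = new
    return coeffs

def satisfies_coefficient_inequality(coeffs, tol=1e-9):
    """Check a_{k-1} * a_{k+1} <= a_k**2 for 1 <= k <= n - 1."""
    return all(coeffs[k - 1] * coeffs[k + 1] <= coeffs[k] ** 2 + tol
               for k in range(1, len(coeffs) - 1))

real_rooted = poly_from_roots([1.0, 2.0, 3.0])   # x^3 - 6x^2 + 11x - 6
print(real_rooted)
print(satisfies_coefficient_inequality(real_rooted))
# x^2 + 1 has no real roots, and the inequality fails for it.
print(satisfies_coefficient_inequality([1.0, 0.0, 1.0]))
```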
4. The Gamma Function

For $x > 0$, we consider the integral $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$; the positive integrand diminishes exponentially for large $t$, and looks like $t^q$ where $q > -1$ for small $t$; thus the integral exists and is positive for all $x > 0$. Passing to the complex variable $z = x + iy$ we see that the integral exists for any $z$ in the right half-plane. It is not difficult to see that the function is analytic there; we consider the sequence of functions $F_n(z) = \int_{1/n}^{n} t^{z-1} e^{-t}\, dt$; each of these is the limit of the sequence of its Riemann sums $\sum t_k^{z-1} m_k$, and these functions are entire, being finite sums of exponentials. For the strip $0 < a < x < b$ we have
$$\Bigl|\sum t_k^{z-1} m_k\Bigr| \le \sum t_k^{x-1} m_k \le \sum t_k^{a-1} m_k + \sum t_k^{b-1} m_k \le \Gamma(a) + \Gamma(b) + 1$$
and the Riemann sums are uniformly bounded in such a strip, whence the functions $F_n(z)$ are analytic in the strip and bounded there. It follows that $\Gamma(z) = \lim F_n(z)$ is analytic in the half-plane. Since it is obvious that $\sup_y |\Gamma(x + iy)| = \Gamma(x)$, we also see that $\Gamma$ is logarithmically convex on the right half-axis. We obtain the functional equation for the Gamma function by integrating by parts:
$$\Gamma(z + 1) = z\Gamma(z) \quad\text{and}\quad \Gamma(1) = 1, \quad\text{whence}\quad \Gamma(n) = (n - 1)!\,.$$
The function may now be extended analytically to the left half-plane if we make use of the functional equation: we first consider the strip $-1 < x \le 0$ and define $\Gamma(z)$ there by $\Gamma(z) = \Gamma(1 + z)/z$. Since, for small $|z|$, we have $\Gamma(z + 1) = 1 + zh(z)$ where $h(z)$ is regular near the origin, the extended function has a simple pole at the origin with residue $+1$; it is analytic everywhere else in the strip. We repeat the process, passing to the strip $-2 < x \le -1$. We proceed inductively, extending the definition of $\Gamma$ throughout the whole finite plane, and obtain a meromorphic function whose only singularities are simple poles at the nonpositive integers and which satisfies the functional equation throughout the plane.

Had we been interested only in the functional equation $f(x + 1) = xf(x)$ for $x > 0$, we could have found many solutions. Indeed, it is easy to see that every solution to that equation can be constructed in the following way: Choose an arbitrary finite function $f(t)$ in the interval $0 < t \le 1$. Extend the function to $1 < x \le 2$ by the rule $f(t + 1) = tf(t)$. Continuing in this way we obtain a function $f(x)$, finite everywhere, and satisfying the equation. If we require only that $f(1) = 1$, the function which we construct will then satisfy $f(n) = (n - 1)!$ for all positive integers $n$. Now, although this class of solutions to the functional equation is very great, a remarkable theorem due to Artin asserts that there is only one solution if we require that it be logarithmically convex.
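The strip-by-strip extension can be imitated on the real axis. The sketch below is an illustration under stated assumptions: it uses Python's `math.gamma` as a stand-in for the integral on $x > 0$ and applies $\Gamma(x) = \Gamma(x + 1)/x$ repeatedly for negative non-integer $x$; the name `gamma_extended` is hypothetical.

```python
import math

def gamma_extended(x):
    """Extend Gamma from x > 0 to negative non-integer real x.

    For x > 0 the library value stands in for the integral; otherwise
    the functional equation Gamma(x) = Gamma(x + 1) / x moves one strip
    to the right until the right half-axis is reached.
    """
    if x > 0:
        return math.gamma(x)
    if x == int(x):
        raise ValueError("simple pole at a nonpositive integer")
    return gamma_extended(x + 1.0) / x

print(gamma_extended(5.0))    # equals 4! by the functional equation
print(gamma_extended(-0.5))   # Gamma(1/2) / (-1/2)
print(gamma_extended(-2.5), math.gamma(-2.5))  # agrees with the library's own extension
```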
Theorem (Artin): Let $f(x)$ be defined and finite for $x > 0$ and logarithmically convex; if $f(x + 1) = xf(x)$ for all $x$ and $f(1) = 1$, then $f(x)$ is uniquely determined.
PROOF: (We remark first that since $\Gamma(x)$ is a solution to the equation which is logarithmically convex and satisfies $\Gamma(1) = 1$, evidently $f(x) = \Gamma(x)$.) Select $x$ of the form $x = n + t$ with $0 < t < 1$ and let $k(x) = \log f(x)$. From the convexity, then,
$$\frac{k(n) - k(n-1)}{1} \le \frac{k(x) - k(n)}{t} \le \frac{k(n+1) - k(n)}{1},$$
that is,
$$\log (n-1)^t \le \log f(x) - \log (n-1)! \le \log n^t,$$
$$(n-1)^t \le \frac{f(n+t)}{(n-1)!} \le n^t.$$
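From the functional equation we have
$$f(n + t) = t(t + 1)(t + 2) \cdots (t + n - 1) f(t)$$
and so
$$(n-1)^t \le \frac{t(t+1) \cdots (t+n-1)}{(n-1)!}\, f(t) \le n^t.$$
Passing to reciprocals,
$$\frac{1}{n^t} \le \frac{(n-1)!}{t(t+1) \cdots (t+n-1)\, f(t)} \le \frac{1}{(n-1)^t}.$$
The first of these inequalities, together with the second written for $n + 1$ in place of $n$, gives
$$\Bigl(1 + \frac{t}{n}\Bigr)^{-1} f(t) \le \frac{n^t\, n!}{t(t+1) \cdots (t+n)} \le f(t),$$
and since $\lim_n (1 + t/n)^{-1} = 1$, the limit as $n$ approaches infinity of
$$\frac{n^t\, n!}{t(t+1) \cdots (t+n)}$$
exists and equals $f(t)$. This proves the theorem, since it shows that $f(t)$ is uniquely determined in the interval $(0, 1)$ and hence everywhere.

Accordingly $\Gamma(z) = \lim_n F_n(z)$ where the function $F_n(z) = (1/z)\, e^{z \log n} \prod_{k=1}^{n} [1 + (z/k)]^{-1}$ is meromorphic in the entire plane and has exactly $n + 1$ poles which are simple. It may be written
$$F_n(z) = \frac{1}{z} \exp\Bigl[z\Bigl(\log n - \sum_{k=1}^{n} \frac{1}{k}\Bigr)\Bigr] \prod_{k=1}^{n} \frac{e^{z/k}}{1 + (z/k)}.$$
From elementary calculus we can verify that $\sum_{k=1}^{n} (1/k) - \log n$ converges with increasing $n$ to Euler's constant $C > 0$, and hence the first factor in $F_n(z)$ converges to $(1/z)\, e^{-Cz}$. The second factor is an infinite product which converges everywhere except at the negative integers. We write
$$b(z) = [e^z/(1 + z)] - 1$$
and note that this function is bounded in the circle $|z| \le 1/2$ and has a double zero at the origin, whence $b(z) = z^2 h(z)$ with $h(z)$ bounded in the circle. Therefore $|b(z)| \le M|z|^2$ for a suitable $M$ and small $|z|$.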
The product then involves factors of the form
$$1 + b_k(z) = 1 + b(z/k) \quad\text{and}\quad |b_k(z)| \le M|z|^2 (1/k^2).$$
It follows that we may speak of the infinite product $\prod_{k=1}^{\infty} e^{z/k}/[1 + (z/k)]$, which converges uniformly on compact subsets of the plane away from the negative integers, since $\sum |b_k(z)|$ converges uniformly on all compact sets. We obtain in this way another representation of $\Gamma(z)$:
$$\Gamma(z) = \frac{1}{z}\, e^{-Cz} \prod_{k=1}^{\infty} \frac{e^{z/k}}{1 + (z/k)},$$
which explicitly displays the poles of the function. From this representation, too, we see that the Gamma function cannot have a zero, since the product is a convergent one. We should remark that the expression above is an analytic function which coincides with $\Gamma(z)$ on the interval $0 < t < 1$, and hence coincides with it everywhere. It is easy to verify that $\Gamma(z)\Gamma(-z)$ is the reciprocal of $-z^2 \prod_{k=1}^{\infty} [1 - (z^2/k^2)]$, a product which we have already identified without proof. Accordingly
$$-z\Gamma(z)\Gamma(-z) = \frac{\pi}{\sin(\pi z)},$$
and since $-z\Gamma(-z) = \Gamma(1 - z)$, if we set $z = 1/2$ we get $\Gamma(1/2)^2 = \pi$. We could also easily determine $\Gamma(1/2)$ from the integral.
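Both the limit $\Gamma(z) = \lim_n F_n(z)$ and the reflection identity lend themselves to a quick numerical check. The sketch below assumes real arguments and compares against Python's `math.gamma`; the names `gauss_product` and `reflection_sides` are hypothetical. Convergence of $F_n$ is slow, with an error of order $1/n$.

```python
import math

def gauss_product(z, n):
    """F_n(z) = (1/z) * n**z * prod_{k=1..n} (1 + z/k)**(-1), for real z > 0."""
    value = n ** z / z
    for k in range(1, n + 1):
        value /= 1.0 + z / k
    return value

def reflection_sides(z):
    """Return (Gamma(z) * Gamma(1 - z), pi / sin(pi z)) for real non-integer z."""
    return math.gamma(z) * math.gamma(1.0 - z), math.pi / math.sin(math.pi * z)

for n in (10, 1000, 100000):
    print(n, gauss_product(0.5, n), math.gamma(0.5))  # F_n(1/2) approaches Gamma(1/2)
print(reflection_sides(0.3))                          # the two entries agree up to rounding
print(math.gamma(0.5) ** 2, math.pi)                  # Gamma(1/2)**2 against pi
```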
5. Measure and Integration

In this section, as well as in the section on product measures, we give few proofs, our purpose being merely to establish conventions of notation and terminology, since the reader is familiar with the material. An outer measure on an abstract space $X$ is an extended real valued nonnegative function $\mu(A)$ defined on all subsets of $X$ having the properties:
(i) $\mu(\emptyset) = 0$;
(ii) $A \subset B$ implies $\mu(A) \le \mu(B)$;
(iii) for any countable family $A_i$, $\mu(\bigcup A_i) \le \sum \mu(A_i)$.
A set $A$ is measurable if and only if, for all sets $T$,
$$\mu(T) = \mu(T \cap A) + \mu(T \cap (X - A)).$$
The family of measurable sets forms a sigma-algebra of sets; that is, the union of countably many measurable sets is again measurable, the complements of measurable sets are measurable, and the empty set and the space $X$ are both measurable sets. All sets of outer measure 0 are measurable.
Given an outer measure $\mu$, we may pass to another by the definition
$$\mu^*(A) = \inf \mu(B),$$
the infimum being taken over all measurable sets $B$ which contain $A$. The functions $\mu$ and $\mu^*$ coincide on the class of $\mu$-measurable sets, and every such set is $\mu^*$-measurable. Any set $A$ which is $\mu^*$-measurable and not $\mu$-measurable satisfies $\mu^*(A) = +\infty$. A measure is called regular if $\mu$ and $\mu^*$ coincide; this is equivalent to assuming that every set $A$ is contained in a measurable set $A^*$ having the same measure. When the measure is regular, the testing set $T$ occurring in the definition of measurability may be taken to be measurable itself.

When the space $X$ is a topological space, the measure $\mu$ is called a Borel measure if the Borel sets are $\mu$-measurable, and this happens if and only if all open sets are measurable. If $X$ is locally compact and $\mu$ a Borel measure which is finite on all compact subsets of $X$, then $\mu$ is called a Radon measure. An outer measure $\mu$ on a metric space $X$ is a Carathéodory outer measure if, whenever the sets $A$ and $B$ are at a positive distance apart,
$$\mu(A \cup B) = \mu(A) + \mu(B).$$
One proves that every such outer measure is a Borel measure. A measure is sigma-finite if the space $X$ can be decomposed into a countable union of measurable sets of finite measure.

We remind ourselves of the three fundamental theorems in the theory of integration. We suppose that $\mu$ is an outer measure on $X$. When a function $f(x)$ is measurable and nonnegative, we admit the value $+\infty$ in the definition of the integral.

Theorem (Fatou): If $f_n(x)$ is a sequence of nonnegative measurable functions on $X$ then
$$\int \liminf f_n(x)\, d\mu(x) \le \liminf \int f_n(x)\, d\mu(x).$$

Theorem (Beppo Levi): If $f_n(x)$ is a sequence of nonnegative measurable functions on $X$ such that $f_n(x) \le f_{n+1}(x)$ for all $x$ and $n$, then
$$\lim \int f_n(x)\, d\mu(x) = \int \lim f_n(x)\, d\mu(x).$$

Theorem (Lebesgue): If a sequence $f_n(x)$ of integrable functions converges almost everywhere to $F(x)$ and if there exists an integrable function $\Phi(x)$ for which $|f_n(x)| \le \Phi(x)$ almost everywhere for all $n$, then
$$\lim \int f_n(x)\, d\mu(x) \quad\text{exists and equals}\quad \int F(x)\, d\mu(x).$$
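A standard example, not taken from the text, shows that the inequality in Fatou's theorem can be strict and that the dominating function in Lebesgue's theorem cannot simply be dropped: on $(0, 1)$ with Lebesgue measure take $f_n = n$ on $(0, 1/n)$ and $0$ elsewhere. The sketch below uses exact rational arithmetic; the names `spike` and `spike_integral` are hypothetical.

```python
from fractions import Fraction

def spike(n, x):
    """f_n(x) = n on the open interval (0, 1/n) and 0 elsewhere."""
    return Fraction(n) if 0 < x < Fraction(1, n) else Fraction(0)

def spike_integral(n):
    """Lebesgue integral of f_n over (0, 1): height n times width 1/n."""
    return Fraction(n) * Fraction(1, n)

x = Fraction(1, 7)
values = [spike(n, x) for n in range(1, 20)]
integrals = [spike_integral(n) for n in range(1, 20)]
print(values)     # zero from n = 7 on, so the pointwise limit at x is 0
print(integrals)  # every integral equals 1, so the limit of the integrals is 1
```

Here the integral of the pointwise limit is 0 while the integrals themselves stay equal to 1; no integrable function dominates the whole sequence.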
We also recall the definition of the spaces $L^p(X, \mu)$: for $1 \le p < \infty$, this is the space of all $\mu$-measurable functions $f(x)$ on $X$ for which the integral
$$\int |f(x)|^p\, d\mu(x)$$
is finite, and the $p$th root of this integral, written $\|f\|_p$, is a norm on the linear space $L^p(X, \mu)$. More exactly, $\|f\|_p$ is a seminorm, which vanishes if and only if $f(x) = 0$ almost everywhere. When $p = +\infty$, we take $L^{\infty}(X, \mu)$ as the linear space of all bounded, $\mu$-measurable functions on $X$ and the corresponding seminorm is defined as the essential supremum:
$$\|f\|_{\infty} = \inf \lambda : \text{the set } |f(x)| > \lambda \text{ has } \mu\text{-measure } 0.$$
Whether $p$ is finite or not, Hölder's inequality is valid:
$$\Bigl|\int f(x) g(x)\, d\mu(x)\Bigr| \le \|f\|_p \|g\|_q$$
for any pair of functions $f$ in $L^p(X, \mu)$ and $g$ in $L^q(X, \mu)$ where $(1/p) + (1/q) = 1$. An important theorem which we do not prove asserts that any continuous linear functional $F$ on $L^p(X, \mu)$ corresponds to an element $g(x)$ of $L^q(X, \mu)$ where $(1/p) + (1/q) = 1$ and is given by the formula
$$F(f) = \int f(x) g(x)\, d\mu(x).$$
Here we must suppose that $p$ is finite, and when $p = 1$ that $\mu$ is sigma-finite.

The theorem which follows is of considerable importance, but our proof leaves many verifications to the reader.

Theorem: Let $X$ be a locally compact metric space and $C_0(X)$ the linear space of all continuous functions on $X$ which vanish outside a compact subset of that space. If $F(f)$ is a linear functional on $C_0(X)$ having the property that $F(f) \ge 0$ whenever $f(x) \ge 0$, then there exists a Radon measure $\mu$ on $X$ such that for all $f$
$$F(f) = \int f(x)\, d\mu(x).$$
PROOF: We first construct the measure $\mu$, defining the function $m(K)$ on the class of compact subsets of $X$ as follows:
$$m(K) = \inf F(u), \qquad u(x) \ge 0 \text{ on } X, \quad u(x) \ge 1 \text{ on } K.$$
It is fairly easy to verify that $m(K)$ has the following properties:
(1) $0 \le m(K) < \infty$ for all $K$.
(2) $m(\emptyset) = 0$.
(3) $K_1 \subset K_2$ implies $m(K_1) \le m(K_2)$.
(4) If $K \subset \bigcup_{i=1}^{\infty} K_i$, then $m(K) \le \sum_{i=1}^{\infty} m(K_i)$.
(5) If $K_1 \cap K_2 = \emptyset$, then $m(K_1 \cup K_2) = m(K_1) + m(K_2)$.

To establish (5), one should note that there exist positive functions in $C_0(X)$ vanishing on the one compact set and equal to $+1$ on the other. We next define a set function $\mu(G)$ on the open subsets of $X$ by setting
$$\mu(G) = \sup m(K), \qquad K \subset G.$$
It is not hard to check that $\mu$ has generally the same properties:
(1) $0 \le \mu(G) \le +\infty$ for all $G$.
(2) $\mu(\emptyset) = 0$.
(3) $G_1 \subset G_2$ implies $\mu(G_1) \le \mu(G_2)$.
(4) If $G \subset \bigcup_{i=1}^{\infty} G_i$, then $\mu(G) \le \sum_{i=1}^{\infty} \mu(G_i)$.
(5) If $G_1$ and $G_2$ are at a positive distance apart, then $\mu(G_1 \cup G_2) = \mu(G_1) + \mu(G_2)$.

Finally, the function $\mu(G)$ is extended from the open sets to all subsets of $X$ by the definition
$$\mu(A) = \inf \mu(G), \qquad A \subset G,$$
and it is easy to see that this is a Carathéodory outer measure on $X$. It is almost obvious that if $K$ is compact, then $m(K) \le \mu(K)$, and to establish the reverse inequality, we note that, for $\varepsilon > 0$, there exists a positive function $u(x)$ in $C_0(X)$ which is $\ge 1$ on $K$ and for which $F(u) \le m(K) + \varepsilon$. Let $G_{\varepsilon}$ be the (open) set where $u(x) > 1 - \varepsilon$ and $H$ an arbitrary compact subset of $G_{\varepsilon}$; now $u(x)/(1 - \varepsilon) > 1$ on $H$, and so $m(H) \le F(u)/(1 - \varepsilon) \le (m(K) + \varepsilon)/(1 - \varepsilon)$; therefore, since $H$ was an arbitrary compact in $G_{\varepsilon}$, $\mu(G_{\varepsilon})$ is bounded by the same number, and $\varepsilon$ being arbitrary, $\mu(K) \le m(K)$, as desired.

It remains to show that the measure $\mu$ represents the linear functional $F(f)$ and it will be enough to show this for positive functions $f(x)$ in $C_0(X)$. Given such a function, we first remark that the set of non-zero values of $\lambda$ for which the set $f(x) = \lambda$ has positive $\mu$-measure is at most countable, in view of the fact that the set $f(x) \ne 0$ has finite $\mu$-measure. We partition the interval $0 \le \lambda \le 1 + \|f\|_{\infty}$ by a finite sequence of closely spaced points:
$$0 = \lambda_0 < \lambda_1 < \lambda_2 < \cdots < \lambda_N = 1 + \|f\|_{\infty},$$
choosing the $\lambda_i$ outside the countable set of exceptional values identified above. This partition gives rise to a decomposition of the set $f(x) > 0$ into a
finite union of measurable sets, namely, a set of measure 0 and the open sets $G_i$ defined by the inequalities $\lambda_{i-1} < f(x) < \lambda_i$. For $\varepsilon > 0$, there exists a compact subset $K_i$ in $G_i$ such that $m(K_i) > \mu(G_i) - \varepsilon$ and a positive function $u_i(x)$ in $C_0(X)$ such that $u_i(x) = 0$ outside $G_i$, $u_i(x) = 1$ on $K_i$ and $u_i(x) \le 1$ everywhere on $X$. Thus $F(u_i) \ge \mu(G_i) - \varepsilon$. Similarly, there exists $w_i(x)$ in $C_0(X)$ which is positive and equal to $+1$ on the compact closure $\overline{G_i}$ of $G_i$ such that $F(w_i) \le m(\overline{G_i}) + \varepsilon = \mu(G_i) + \varepsilon$. (Here we use the fact that the boundary of $G_i$ has $\mu$-measure 0.) Accordingly,
$$\sum_{1}^{N} \lambda_{i-1} u_i(x) \le f(x) \le \sum_{1}^{N} \lambda_i w_i(x)$$
and all three terms of this inequality are positive functions in $C_0(X)$. Thus
$$\sum_{1}^{N} \lambda_{i-1} F(u_i) \le F(f) \le \sum_{1}^{N} \lambda_i F(w_i)$$
and from the arbitrariness of the $\varepsilon$ we infer
$$\sum_{1}^{N} \lambda_{i-1}\, \mu(G_i) \le F(f) \le \sum_{1}^{N} \lambda_i\, \mu(G_i),$$
and from the theory of integration, it follows that
$$F(f) = \int f(x)\, d\mu(x).$$

Riesz Representation Theorem: Let $X$ be a compact metric space and $C(X)$ the space of continuous functions on $X$ normed by $\|f\| = \sup |f(x)|$; if $F$ is a linear functional on $C(X)$ which is continuous, that is, $|F(f)| \le M\|f\|$ for a fixed $M$ and all $f$, then there exists a (signed) measure $\nu$ on $X$ such that
for all $f(x)$ in $C(X)$
$$F(f) = \int f(x)\, d\nu(x).$$
Moreover, the number $\int |d\nu(x)|$ may be taken as the bound of $F$, i.e., the number $M$ above.
PROOF: There would be nothing to prove if we knew that $F$ was a positive linear functional, that is, $F(p) \ge 0$ for every $p(x) \ge 0$, since then the previous theorem would guarantee that $F$ was represented by a positive Radon measure $\mu$ on $X$. This would be so even without the explicit hypothesis that $F$ is continuous. In general, since $F$ cannot be supposed positive, we form
$$F^+(p) = \sup F(u), \qquad 0 \le u(x) \le p(x),$$
to obtain a function defined on the cone of positive elements of $C(X)$. For $\lambda \ge 0$ it is clear that $F^+(\lambda p) = \lambda F^+(p)$, and it is not very hard to show that if $p$ and $q$ are any two positive functions, $F^+(p + q) = F^+(p) + F^+(q)$. This is a consequence of the fact that the inequalities $0 \le u \le p$ and $0 \le v \le q$ imply that $0 \le w = u + v \le p + q$, while the inequality $0 \le w \le p + q$ implies that $w$ is of the form $w = u + v$ with $0 \le u \le p$ and $0 \le v \le q$; we have only to set $u(x) = \min(p(x), w(x))$ and $v(x) = w(x) - u(x)$. Obviously $F^+(p) \ge 0$ for $p \ge 0$. We can extend $F^+$ so it becomes a linear functional on the whole $C(X)$ if we make use of the canonical decomposition of continuous functions into differences of positive continuous functions given by
$$f = f^+ - f^-, \qquad f^+(x) = \max(f(x), 0), \quad f^-(x) = \max(-f(x), 0).$$
We put $F^+(f) = F^+(f^+ - f^-) = F^+(f^+) - F^+(f^-)$, and since it is obvious that $F^+(-f) = -F^+(f)$, we have, in general, $F^+(\lambda f) = \lambda F^+(f)$ for all real $\lambda$. Moreover, $F^+(f + g) = F^+(f) + F^+(g)$ is an easy consequence of the identity
$$f + g = (f + g)^+ - (f + g)^- = (f^+ + g^+) - (f^- + g^-).$$
It therefore follows that $F^+$ is a positive, continuous linear functional on $C(X)$, represented by a positive Radon measure $\mu^+$, which is of finite total
mass since $X$ is compact. If $F^-$ is the linear functional defined by $F^-(f) = F^+(f) - F(f)$, it is continuous as the difference of two continuous linear functionals and is obviously positive, thus represented by a positive Radon measure $\mu^-$ on $X$. Hence, finally,
$$F(f) = F^+(f) - F^-(f) = \int f(x)\, d\mu^+(x) - \int f(x)\, d\mu^-(x) = \int f(x)\, d\nu(x),$$
where $\nu = \mu^+ - \mu^-$ has the total mass $\int_X |d\nu(x)| = \mu^+(X) + \mu^-(X)$.

When $X$ is a compact metric space, the associated continuous function space taken with the usual norm $\|f\|_{\infty} = \sup |f(x)|$ is a Banach space, that is, a complete normed linear space. It is also true that $C(X)$ is separable, but we prefer not to prove this fact, and in the following theorem we either accept it without proof, or adjoin the hypothesis that $C(X)$ is separable to Helly's theorem. A continuous linear functional $F$ on $C(X)$ has a bound $\|F\| = \sup |F(f)|$, the supremum being taken over all $f$ in $C(X)$ for which $\|f\|_{\infty} \le 1$, and it is important to notice that the functional is necessarily Lipschitzian with Lipschitz constant $\|F\|$. This follows from the easy estimate
$$|F(f) - F(g)| = |F(f - g)| \le \|F\|\, \|f - g\|_{\infty}.$$
Accordingly, if a family $F_m$ of linear functionals has a uniform bound, $\sup_m \|F_m\| = k < +\infty$, the family is necessarily equicontinuous on the metric space $C(X)$. This circumstance enables us to prove the useful theorem which follows.

Helly's Theorem: Let $\nu_k$ be a sequence of (signed) Radon measures on the compact metric space $X$ of uniformly bounded total mass; then there exists a subsequence $\nu_{k_i}$ and a Radon measure $\nu_0$ on $X$ such that for all $f(x)$ in $C(X)$
$$\lim_i \int f(x)\, d\nu_{k_i}(x) \quad\text{exists and equals}\quad \int f(x)\, d\nu_0(x).$$

PROOF: We have already remarked that the linear functionals $F_k$ associated with $\nu_k$ form an equicontinuous family on $C(X)$, and therefore,
if $f_m$ is a countable dense subset of $C(X)$, we can extract a subsequence $F_{k_i}$ so that $\lim F_{k_i}(f_m)$ exists for every $m$. The limit now exists for every $f$ in $C(X)$, since an arbitrary $f$ may be approximated by an element $f_m$ of the countable dense subset so that $\|f - f_m\|_{\infty} < \varepsilon$, whence
$$|F_{k_j}(f) - F_{k_l}(f)| \le |F_{k_j}(f) - F_{k_j}(f_m)| + |F_{k_j}(f_m) - F_{k_l}(f_m)| + |F_{k_l}(f_m) - F_{k_l}(f)| \le 2M\varepsilon + |F_{k_j}(f_m) - F_{k_l}(f_m)|,$$
where $M$ is the common bound for the total masses of $\nu_k$. For sufficiently large $j$ and $l$ this is smaller than $(2M + 1)\varepsilon$, and the $\varepsilon$ being arbitrary, the sequence $F_{k_i}(f)$ is Cauchy. Hence for every $f$ in $C(X)$ there exists a well-defined limit
$$F_0(f) = \lim_i F_{k_i}(f)$$
and $|F_0(f)| \le M\|f\|_{\infty}$, from which it is evident that the obviously linear $F_0$ is continuous on $C(X)$ and therefore represented by a measure $\nu_0$ of finite total mass.

It is also clear that our theorem has really little to do with measures, and that we have proved the following apparently more abstract theorem.

Theorem: Let $E$ be a separable normed linear space and $F_m$ a sequence of continuous linear functionals on $E$ such that $\sup_m \|F_m\|$ is finite; then there exists a continuous linear functional $F_0$ on $E$ and a subsequence $F_{m_k}$ such that $\lim_k F_{m_k}(x)$ exists and equals $F_0(x)$ for every $x$ in $E$.

In more pretentious language, the theorem asserts the relative weak-star compactness of bounded sets in the dual of a separable normed linear space.
6. Hausdorff Measures and Dimension

If $X$ is a metric space, we construct the family of Hausdorff measures on $X$ in the following way. Let $h(t)$ be a function defined on the unit interval $[0, 1]$ which is monotone nondecreasing and for which $h(0) = 0$. We require also that 0 be a point of continuity of the function, that is, $\lim_{t \to 0} h(t) = 0$. For an arbitrary positive $\eta$, we form the set function
$$h_{\eta}(A) = \inf \sum h(d_i),$$
the infimum being taken over all coverings of $A$ by families of sets $B_i$ of diameter $d_i$ where $d_i \le \eta$. By a routine argument, we show that $h_{\eta}(A)$ is an outer measure on $X$. As $\eta$ diminishes to 0, $h_{\eta}(A)$ increases, or at least, does not decrease. There exists, therefore, a well-defined set function
$$H(A) = \sup_{\eta > 0} h_{\eta}(A),$$
which is also an outer measure. This is the Hausdorff measure on $X$ associated with the function $h(t)$. It is clear that $H(A)$ is a Carathéodory outer measure, since if two sets $A$ and $B$ are at a distance $d$ apart, $d > 0$, then, for $\eta < d/2$ we evidently have $h_{\eta}(A \cup B) = h_{\eta}(A) + h_{\eta}(B)$, whence
$$H(A \cup B) = H(A) + H(B).$$
If $F$ is an isometric mapping of $X$ onto itself, evidently $H(F(A)) = H(A)$ and $A$ is measurable if and only if $F(A)$ is measurable. If $X$ is $R^n$ and $h(t) = c_n t^n$ for an appropriate choice of $c_n$, the corresponding Hausdorff measure is the usual Lebesgue measure in $R^n$.

We consider next the family of functions $h_{\alpha}(t) = t^{\alpha}$ for $\alpha > 0$; these correspond to the Hausdorff measures $H_{\alpha}(A)$ on $X$. Sometimes it is convenient to adjoin $h_0(t) = [-\log t]^{-1}$ and the corresponding $H_0(A)$. One shows that if $\beta > \alpha$ and $H_{\alpha}(A)$ is finite for some set $A$, then $H_{\beta}(A) = 0$. Accordingly, if the set $A$ is fixed and $H_{\alpha}(A)$ is regarded as a function of $\alpha$, this function is finite and nonzero for at most one value of the parameter $\alpha$, namely, $\alpha_0 = \inf \alpha$, $H_{\alpha}(A) = 0$. The number $\alpha_0$ is called the Hausdorff dimension of the set $A$. It is clearly invariant under isometries, and in fact, is invariant under Lipschitzian homeomorphisms of $A$. This is a consequence of the following property of the Hausdorff measures $H_{\alpha}$. Let $F$ be a Lipschitzian transformation of $X$ into itself with Lipschitz constant $M$; that is, $d(F(x), F(y)) \le M d(x, y)$ for all $x, y$ in $X$. Then $H_{\alpha}(F(A)) \le M^{\alpha} H_{\alpha}(A)$. The proof is an immediate consequence of the fact that if $B_i$ is a covering of $A$ by sets of diameter at most $\eta$, then $F(B_i)$ is a covering of $F(A)$ by sets of diameter at most $M\eta$.

Let $S$ denote the surface of the unit sphere in $R^n$ where $n \ge 2$. If $x = (x_1, x_2, \ldots, x_n)$ is a point of $S$ for which $x_n$ is not 0, then the projection of $S$ onto the coordinate hyperplane $x_n = 0$ is a homeomorphism of a neighborhood of $x$ in $S$ onto a neighborhood in $R^{n-1}$ which is Lipschitzian with Lipschitzian inverse. Thus, there is a neighborhood of $x$ in $S$ with a Hausdorff measure $H_{n-1}$ which is finite and nonzero. We can therefore define a nontrivial Borel measure $\omega$ on $S$ by taking the Hausdorff measure associated with the function $h(t) = t^{n-1}$ and normalizing it so that $\omega(S) = 1$; the resulting measure, which we shall often use, is obviously invariant under rotations of the sphere.
The following theorem is a classical one concerning removable singularities of analytic functions.

Theorem: Let $G$ be a region in the plane, $F$ a relatively closed subset of $G$ for which $H_1(F)$ is finite; let $f(z)$ be a continuous complex valued function defined in $G$ which is analytic on the complement of $F$; then $f(z)$ is analytic throughout $G$.

PROOF: We invoke Morera's theorem and show that the integral of $f(z)$ along any rectifiable path bounding a subset of $G$ is zero; for this, it is enough to show that the integral vanishes when $C$, the path of integration, bounds a rectangle which is contained in $G$. Let $R$ be the rectangle and $K = F \cap R$. For $\varepsilon > 0$, there exists a covering of $K$ by circles $C_i$ of diameter $d_i < \varepsilon$ such that $\sum d_i \le H_1(F) + 1 = M$, and since $K$ is compact, we may suppose that the number of circles in the covering is finite. This covering gives rise to a decomposition of $R$ into a finite number of pieces $D_k$, the boundary of each $D_k$ consisting of line segments (from the boundary of $R$) and arcs of circles. (See Fig. 2.) Those pieces $D_k$ of diameter larger than $\varepsilon$ are pieces in the interior of which $f(z)$ is analytic; the integral of $f(z)$ around the boundary of such a piece is therefore 0.

Fig. 2.

We may write
$$\int_C f(z)\, dz = \sum_i \int_{B_i} f(z)\, dz,$$
where $B_i$ is the boundary of $D_i$ which may be supposed to have diameter $\le \varepsilon$.
Choosing a point $z_i$ in $D_i$, we may write
$$\int_{B_i} f(z)\, dz = \int_{B_i} [f(z) - f(z_i)]\, dz,$$
and the absolute value of this integral is at most $\omega(\varepsilon)|B_i|$ where $\omega(t)$ is the modulus of continuity of $f(z)$ on $R$ and $|B_i|$ is the length of the path $B_i$. Now $\sum |B_i| \le \pi \sum d_i \le \pi M$ and hence the integral is bounded by $\omega(\varepsilon)\pi M$, a quantity which approaches 0 with $\varepsilon$.
A special case of this theorem is the Schwarz reflection principle.

We conclude this section with the construction of sets of the Cantor type in the unit interval having a prescribed Hausdorff dimension $\alpha$ where $0 < \alpha < 1$. For this purpose, we choose a positive $q$ and an integer $N$ so that $Nq < 1$, and indeed, so that $Nq^{\alpha} = 1$, or what is the same thing, $\log N + \alpha \log q = 0$. Choose $N$ points $a_i$ in the unit interval $[0, 1]$ in such a way that $0 \le a_1 < a_2 < \cdots < a_N \le 1 - q$ and widely enough spaced so that the distance apart of any two $a_i$ is larger than $q$. Let $K_1$ be the compact set consisting of the union of the $N$ closed intervals $[a_i, a_i + q]$, each of width $q$. Let $K_2$ be the set obtained from $K_1$ by subdividing each of those intervals in the same way; thus $K_2$ consists of $N^2$ intervals each of length $q^2$ of the form $[a_i + a_j q, a_i + a_j q + q^2]$. In a similar way $K_3$ is composed of $N^3$ intervals of length $q^3$ of the form
$$[a_i + a_j q + a_k q^2,\; a_i + a_j q + a_k q^2 + q^3].$$
Inductively, a sequence of sets $K_n$, consisting of $N^n$ closed intervals of length $q^n$, is so defined; these sets decrease to an intersection $K$ which is a compact perfect nowhere dense subset of $[0, 1]$. The construction is shown in Fig. 3.

We compute the Hausdorff measure of order $\alpha$ of $K$. Among the coverings of $K$ which compete in the definition of $H_{\alpha}(K)$ are the coverings $K_n$ themselves, consisting of $N^n$ intervals of length $q^n$. Summing the $\alpha$th power of the diameters of the covering intervals, we obtain
$$H_{\alpha}(K) \le N^n q^{n\alpha} = (Nq^{\alpha})^n = 1$$
and see that the dimension of $K$ is at most $\alpha$. To show that the dimension is exactly $\alpha$, we show that $H_{\alpha}(K)$ is not 0. In computing the Hausdorff measure, it is enough to take the infimum of $\sum d_i^{\alpha}$ over all coverings of $K$ by countable families of (sufficiently small) open intervals $\Delta_i$ with endpoints in the complement of $K$; this is a consequence of the fact that $K$ is nowhere dense. From the compactness, it also is clear that these coverings need consist only of a finite number of disjoint, open intervals.
Let $\Delta_i$ be such a covering; the open set $\bigcup \Delta_i$ contains the compact $K$ as a subset, and hence also contains $K_n$ for a sufficiently large $n$. For each interval $\Delta_i$, let $d_i$ be its diameter and $p_i$ the smallest integer $p$ such that $\Delta_i$ contains at least one interval of $K_p$. Thus $\Delta_i$ contains a certain number $k_i$ of constituent intervals of $K_{p_i}$. It follows that $d_i \ge k_i q^{p_i}$, and it is important to note that the integer $k_i$ satisfies the inequality $1 \le k_i \le 2N - 2$, since if $k_i$ were larger than $2N - 2$, the interval $\Delta_i$ would contain at least one interval of $K_{p_i - 1}$. It is also clear that $\Delta_i$ contains $k_i N$ intervals of $K_{p_i + 1}$, and $k_i N^{n - p_i}$ intervals of $K_n$. Let $M = (2N - 2)^{\alpha - 1}$; now $d_i^{\alpha} \ge k_i^{\alpha} q^{p_i \alpha} \ge M k_i q^{p_i \alpha}$, and since $Nq^{\alpha} = 1$ this may be written $d_i^{\alpha} \ge M k_i N^{n - p_i} q^{n\alpha}$, whence, summing over $i$, we have
$$\sum d_i^{\alpha} \ge M q^{n\alpha} \sum k_i N^{n - p_i} = M N^n q^{n\alpha} = M > 0.$$
Thus $H_{\alpha}(K) \ge M > 0$. For the construction of the usual Cantor set in $[0, 1]$ we have $N = 2$ and $q = 1/3$ with $a_1 = 0$ and $a_2 = 2/3$; the dimension is
$$\alpha = \frac{\log 2}{\log 3} = 0.6309\ldots.$$
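The relation $Nq^{\alpha} = 1$ determines the dimension as $\alpha = \log N / \log(1/q)$, and the upper bound above rests on the fact that the covering $K_n$ always contributes the sum $N^n q^{n\alpha} = 1$. A small sketch of both computations follows; the function names are hypothetical, and the requirement $q < 1/N$ simply restates $Nq < 1$.

```python
import math

def cantor_type_dimension(N, q):
    """Solve N * q**alpha = 1 for alpha, i.e. alpha = log N / log(1/q)."""
    if N < 2 or not (0.0 < q < 1.0 / N):
        raise ValueError("need an integer N >= 2 and 0 < q < 1/N")
    return math.log(N) / math.log(1.0 / q)

def covering_sum(N, q, alpha, n):
    """Sum of alpha-th powers of diameters for the covering K_n:
    N**n intervals, each of length q**n."""
    return (N ** n) * (q ** n) ** alpha

alpha = cantor_type_dimension(2, 1.0 / 3.0)   # the usual Cantor set
print(alpha)
print(covering_sum(2, 1.0 / 3.0, alpha, 10))  # stays at 1 up to rounding, for every n
```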
7. Product Measures

If $(X, \mu)$ and $(Y, \nu)$ are measure spaces, the direct product $Z = X \times Y$ can be made into a measure space with the product measure $\omega$ as follows. For every measurable $A \subset X$ and $B \subset Y$, the rectangle $R = A \times B$ may be assigned the mass $\mu(A)\nu(B)$; then, for an arbitrary subset $C$ of $Z$ we set
$$\omega(C) = \inf \sum \mu(A_i)\nu(B_i),$$
the infimum being taken over all coverings of $C$ by families of rectangles $R_i = A_i \times B_i$. One shows that $\omega$ is a regular outer measure for which $\omega(R) = \mu(A)\nu(B)$. If $X$ and $Y$ are metric spaces, and $\mu$ and $\nu$ are Caratheodory measures, then $\omega$ is also a Caratheodory measure when $Z$ is given the usual product topology. The important theorems are those of Fubini and Tonelli.

Theorem (Tonelli): Suppose $\mu$ and $\nu$ are sigma-finite and $f(z)$ a nonnegative $\omega$-measurable function on $Z$; then for almost all $x$, $f(x, y)$ is $\nu$-measurable and $\int_Y f(x, y)\,d\nu(y)$ is a nonnegative $\mu$-measurable function of $x$; we then have
$$\int_Z f(z)\,d\omega(z) = \int_X \left[\int_Y f(x, y)\,d\nu(y)\right] d\mu(x) = \int_Y \left[\int_X f(x, y)\,d\mu(x)\right] d\nu(y).$$

We remark that in the equality above, all three terms may be $+\infty$. This is not the case in the Fubini theorem, which is an immediate consequence of the previous theorem.

Theorem (Fubini): If $\mu$ and $\nu$ are sigma-finite and $f(z)$ is an integrable function on $Z$, then $f(x, y)$ is $\nu$-integrable for almost all $x$ and $\int_Y f(x, y)\,d\nu(y)$ is $\mu$-integrable; we then have
$$\int_Z f(z)\,d\omega(z) = \int_X \left[\int_Y f(x, y)\,d\nu(y)\right] d\mu(x) = \int_Y \left[\int_X f(x, y)\,d\mu(x)\right] d\nu(y).$$
We obtain a most important product measure when we introduce spherical coordinates in $R^n$, and this we can do in the following way. To the point $x$ in $R^n$ we assign the coordinates $(r, \theta)$ where $r = |x|$ = distance from $x$ to $0$ and $\theta$ is the point $x/r$ on the sphere $S$. The space $R^n$ with the origin deleted is then the (topological) direct product of the half-axis $0 < r < \infty$ and the compact $S$. On this space we take the product measure $\omega_n\,d\omega\,r^{n-1}\,dr$ where $dr$ is the usual one-dimensional Lebesgue measure, $\omega$ the normalized measure on $S$ introduced in the previous section, and $\omega_n$ a constant to be determined. The product measure is a Radon measure, and the product measure of the ball $|x| < R$ is $(R^n/n)\omega_n$; we select $\omega_n$ so that this coincides with the usual $n$-dimensional Lebesgue measure of such balls. The product measure is invariant under orthogonal transformations of the space as is the Lebesgue measure, and it is therefore not surprising that these two measures are the same. We do not demonstrate this fact yet, but remark that since the two measures coincide on the class of balls centered about the origin, all smooth functions of radius have the same integral relative to either of these measures. It remains to determine the constant $\omega_n$ explicitly. We consider the Gaussian $g(x) = e^{-|x|^2}$ and integrate it relative to both measures.
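$$\int g(x)\,dx = \int\!\cdots\!\int e^{-|x|^2}\,dx_1 \cdots dx_n = \left[\int_{-\infty}^{\infty} e^{-t^2}\,dt\right]^n = \pi^{n/2}.$$
On the other hand,
$$\omega_n \int\!\!\int_0^\infty e^{-r^2} r^{n-1}\,dr\,d\omega = \frac{\omega_n}{2} \int_0^\infty e^{-t}\,t^{(n/2) - 1}\,dt = \frac{\omega_n}{2}\,\Gamma\!\left(\frac{n}{2}\right).$$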
Hence $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$.

The following theorem, which gives the rule for differentiation under the integral sign, is very useful.

Theorem: Let $(X, \mu)$ be a sigma-finite measure space and $(I, dt)$ the interval $I = [a, b]$ with Lebesgue measure; let $f(x, t)$ be a product measurable function on the product having the properties

(i) $f(x, t)$ is $\mu$-integrable over $X$ for all $t$;
(ii) $f(x, t)$ is absolutely continuous on $I$ for all $x$;
(iii) $\int_X \int_I \left|\dfrac{\partial f}{\partial t}(x, t)\right| dt\,d\mu(x) < \infty$.

Then $F(t) = \int_X f(x, t)\,d\mu(x)$ is absolutely continuous on $I$ and
$$F'(t) = \int_X \frac{\partial f}{\partial t}(x, t)\,d\mu(x) \quad \text{for almost all } t.$$
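PROOF: For all $x$,
$$f(x, t) = f(x, a) + \int_a^t \frac{\partial f}{\partial s}(x, s)\,ds,$$
since the function is absolutely continuous in $t$ over $I$. Thus
$$F(t) = F(a) + \int_X \int_a^t \frac{\partial f}{\partial s}(x, s)\,ds\,d\mu(x).$$
The order of integration may be changed by virtue of the Fubini theorem, since the partial derivative is surely product measurable, being the limit of difference quotients of $f(x, t)$, and therefore the integral occurring in (iii) makes sense. Accordingly
$$F(t) = F(a) + \int_a^t \left[\int_X \frac{\partial f}{\partial s}(x, s)\,d\mu(x)\right] ds,$$
so that $F(t)$ is an indefinite integral, hence absolutely continuous, and its derivative is the inner integral for almost all $t$.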
In the applications of this theorem which we shall make, $X$ will generally be a topological space, $\mu$ a Borel measure, and the function $f(x, t)$ a Borel function for the product space and therefore product measurable. The hypotheses (i) and (ii) are necessary for the function $F(t)$ and the formula for its derivative to make sense. Hence the essential hypothesis to be verified will usually be (iii).

If $f(x)$ is Lebesgue measurable on $R^n$, the function $f(x - y)$ is a function of $2n$ variables, the coordinates of $x$ and of $y$; we wish to show that it is measurable on $R^{2n}$. This is a consequence of the following assertion: Let $E$ be a measurable subset of $R^n$, and $\hat{E}$ the subset of $R^{2n}$ of all points $(x, y)$ for which $x - y$ is in $E$; then $\hat{E}$ is measurable in $R^{2n}$. For the proof, we consider the transformation $T$ which maps $R^{2n}$ into itself as follows: $T(x, y) = (x - y, y)$; $T$ is linear and one-to-one; indeed, $T^{-1}$ is the linear transformation $T^{-1}(x, y) = (x + y, y)$. Since the Lebesgue measure is a Hausdorff measure, and $T$ and $T^{-1}$ are Lipschitzian, sets of measure 0 go into sets of measure 0, while sets which are G-deltas map into G-deltas. Thus $T$ and $T^{-1}$ preserve measurability, since a set $A$ is Lebesgue measurable if and only if there exists a set $N$ of measure 0 such that $A \cup N$ is a G-delta. Now $T(\hat{E})$ is the rectangle $E \times R^n$, and this is clearly measurable in $R^{2n}$ if and only if $E$ is measurable in $R^n$.
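The constant $\omega_n$ determined earlier in this section can be checked numerically. The sketch below, with illustrative function names that are not part of the text, evaluates $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$, the ball volume $(R^n/n)\omega_n$, and the radial form of the Gaussian integral by a plain midpoint rule; the truncation of the radial integral at `r_max` is an assumption made only for the numerics.

```python
import math

def sphere_area(n: int) -> float:
    """omega_n = 2 * pi**(n/2) / Gamma(n/2), the area of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)

def ball_volume(n: int, R: float = 1.0) -> float:
    """Lebesgue measure of the ball |x| < R, namely (R**n / n) * omega_n."""
    return R**n / n * sphere_area(n)

def gaussian_radial_integral(n: int, steps: int = 100_000, r_max: float = 12.0) -> float:
    """Midpoint rule for omega_n * integral over 0 < r < r_max of exp(-r^2) r^(n-1)."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += math.exp(-r * r) * r ** (n - 1)
    return sphere_area(n) * total * h

for n in (2, 3, 5):
    # The two columns should agree: the radial integral and pi**(n/2).
    print(n, gaussian_radial_integral(n), math.pi ** (n / 2))
```

For $n = 2$ and $n = 3$ the function `sphere_area` returns the familiar values $2\pi$ and $4\pi$.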
8. The Newtonian Potential

In the space $R^n$, we consider the Laplacian differential operator
$$\Delta = \sum_{k=1}^{n} \frac{\partial^2}{\partial x_k^2}$$
and note that if $u(x)$ is a function only of radius, $r = |x|$, the operator takes a particularly simple form
$$\Delta u(x) = f''(r) + \frac{n - 1}{r} f'(r),$$
where $u(x) = f(|x|)$. Hence, the solutions to the equation $\Delta u = 0$ when $n \ge 3$ which are functions only of $r = |x|$ are necessarily of the form $u(x) = A + Br^{2-n}$; here, we use the fact that the solutions of a second-order linear differential equation form a two-dimensional vector space; we then have only to verify that $u$ = constant and $u(x) = r^{2-n}$ are a pair of linearly independent solutions to the equation. When $n = 2$, the solutions are of the form $A + B \log r$ but for simplicity in the sequel, we shall exclude this case, although it is actually the most interesting in view of the relation between potential theory and the theory of analytic functions.

Let $\mu$ be a positive Borel measure on $R^n$ of finite total mass; for $n \ge 3$ we form the Newtonian potential of $\mu$,
$$u(x) = \int \frac{d\mu(y)}{|x - y|^{n-2}},$$
a function which is unambiguously defined for all $x$, although, perhaps, often infinite. Let $F$ denote the support of $\mu$, that is, the smallest closed set, the complement of which has $\mu$-measure 0. The Newtonian potential of $\mu$ then has the following properties:

(1) $u(x)$ is a $C^\infty$-function in the complement of $F$ and $\Delta u = 0$ there.
(2) $u(x)$ is positive and lower semicontinuous.
(3) $u(x)$ is integrable over any sphere $|x| < R$, hence is infinite only on a set of Lebesgue measure 0.
(4) If $u(x)$ is bounded on $F$ by $M$, then $u(x) \le 2^{n-2}M$ for all $x$ in $R^n$.
PROOF: If $x$ is considered in a neighborhood $U$ contained in the open complement of $F$, the differentiation under the integral sign is legitimate; since there the function $|x - y|^{2-n}$ is $C^\infty$ and $x$ is bounded away from $y$ in $F$, we find that $u(x)$ is $C^\infty$ in the neighborhood $U$; since $r^{2-n}$ is harmonic, we find that $\Delta u = 0$ in any such neighborhood. That the function is lower semicontinuous follows immediately from the theorem of Fatou: if $x_k$ converges to $x_0$, then $u(x_0) \le \liminf u(x_k)$. If we compute the integral of $u(x)$ over a sphere of radius $R$ we have
$$\int_{|x| < R} u(x)\,dx = \int \left[\int_{|x| < R} \frac{dx}{|x - y|^{n-2}}\right] d\mu(y),$$
the interchange of integrations being permitted by the fact that the function $|x - y|^{2-n}$ is a Borel function in the $2n$-dimensional space and positive there. The inner integral is the potential computed with a measure $\nu$ consisting of Lebesgue measure confined to the sphere $|x| < R$; since $r^{2-n}$ is locally Lebesgue integrable this potential is a bounded function; it is surely lower semicontinuous and it obviously takes its maximum at the origin, where it is finite. Thus we have to compute the integral of a bounded Borel function relative to the measure $\mu$ which has finite total mass. The integral is therefore finite. Hence $u(x)$ is locally integrable, and therefore is infinite only on a set of Lebesgue measure 0.

We suppose finally that $u(x) \le M$ on $F$. Let $x$ be outside $F$, and $x'$ the nearest point of $F$ to $x$. (If there is more than one such point, $x'$ is any convenient choice of the nearest point.) Now, for all $y$ in $F$,
$$|x' - y| \le |x' - x| + |x - y| \le 2|x - y|,$$
whence
$$|x - y|^{2-n} \le 2^{n-2}|x' - y|^{2-n}.$$
From this it follows directly that $u(x) \le 2^{n-2}u(x') \le 2^{n-2}M$.

We use these facts to compute two interesting potentials. Let
$$U(x) = \int \frac{d\omega(y)}{|x - y|^{n-2}};$$
here $\omega$ is the uniform distribution of unit mass over the sphere $|y| = 1$. The function is clearly harmonic outside $F = \{|y| = 1\}$ and is a function only of radius, owing to the symmetry. Thus, inside the sphere, since $U(0) = 1$, the function is constant, $U(x) = 1$, and outside the sphere we have $U(x) = B|x|^{2-n}$, since $U(x)$ must vanish as $|x|$ grows large. We will show presently that $U(x)$ is continuous, hence $B = 1$. Let $x$ vary on some line, say the positive $x_1$-coordinate axis, and let $z$ be the intersection of that line with the unit sphere. By the lower semicontinuity of $U(x)$, or what is the same thing, Fatou's theorem, $U(z) \le 1$ and hence $1/|z - y|^{n-2}$ is $\omega$-integrable. As $x$ varies on the positive $x_1$-axis, $z$ is the nearest point to $x$ belonging to the support of $\omega$, hence, as we have seen, $|x - y|^{2-n} \le 2^{n-2}|z - y|^{2-n}$ for all $y$ such that $|y| = 1$; from the Lebesgue convergence theorem, then, $U(x)$ approaches $U(z)$ and $U(x)$ is continuous.
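The conclusion that the potential of the uniform unit mass on the sphere equals 1 inside and $|x|^{2-n}$ outside can be tested numerically for $n = 3$. The sketch below is only an illustration under one stated assumption: for $n = 3$ the normalized surface measure, expressed in the variable $t = \cos(\text{angle between } x \text{ and } y)$, is $dt/2$ on $[-1, 1]$, so that the potential at distance $\rho$ from the origin becomes a one-dimensional integral. The function name is hypothetical.

```python
def shell_potential(rho: float, steps: int = 20_000) -> float:
    """U(x) for n = 3 and |x| = rho, by the midpoint rule in t = cos(angle).

    Uses |x - y|^2 = rho^2 + 1 - 2*rho*t for |y| = 1 and the normalized
    measure dt/2 on [-1, 1] (valid for n = 3 only).
    """
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += (rho * rho + 1.0 - 2.0 * rho * t) ** -0.5
    return 0.5 * total * h

for rho in (0.0, 0.5, 2.0, 4.0):
    # Compare with 1 inside the sphere and 1/rho outside it.
    print(rho, shell_potential(rho))
```

The quadrature is kept away from $\rho = 1$, where the integrand has an integrable singularity and the midpoint rule converges slowly.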
Another interesting potential is
$$V(x) = \int_{|y| \le 1} \frac{dy}{|x - y|^{n-2}},$$
the measure being Lebesgue measure restricted to the unit ball $|y| \le 1$. As we have already observed, this function is finite at all points, since $r^{2-n}$ is locally integrable in $R^n$. We write it in spherical coordinates to obtain
$$V(x) = \omega_n \int_0^1 \left[\int \frac{d\omega(z)}{|x - rz|^{n-2}}\right] r^{n-1}\,dr, \quad \text{where } y = rz \text{ with } r = |y| \text{ and } |z| = 1.$$
Thus, if $U(x)$ denotes the potential which we have just studied, then
$$V(x) = \omega_n \int_0^1 r\,U\!\left(\frac{x}{r}\right) dr.$$
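A numerical sketch of this potential for $n = 3$ may help fix ideas. It assumes the closed form of $U$ found above ($U = 1$ inside the unit sphere, $U(x) = 1/|x|$ outside), takes $\omega_3 = 4\pi$, and applies the radial form of the Laplacian, $f''(r) + \frac{n-1}{r}f'(r)$, by central differences. All function names are illustrative, and the step sizes are arbitrary choices.

```python
import math

def shell_potential_3d(s: float) -> float:
    """Closed form of U for n = 3: 1 inside the unit sphere, 1/|x| outside."""
    return 1.0 if s <= 1.0 else 1.0 / s

def ball_potential_3d(rho: float, steps: int = 20_000) -> float:
    """V(x) = omega_3 * integral over 0 < r < 1 of r U(x/r), for |x| = rho."""
    omega_3 = 4.0 * math.pi
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r * shell_potential_3d(rho / r)
    return omega_3 * total * h

def radial_laplacian(func, rho: float, d: float = 0.05) -> float:
    """f''(r) + (2/r) f'(r), the radial Laplacian for n = 3, by central differences."""
    second = (func(rho + d) - 2.0 * func(rho) + func(rho - d)) / (d * d)
    first = (func(rho + d) - func(rho - d)) / (2.0 * d)
    return second + 2.0 / rho * first

print(radial_laplacian(ball_potential_3d, 0.5))  # close to omega_3 * (2 - 3) = -4*pi
print(radial_laplacian(ball_potential_3d, 2.0))  # close to 0
```

Inside the ball the integral works out to $V = \omega_3\,(1/2 - |x|^2/6)$ and outside to $\omega_3/(3|x|)$, which is what the quadrature reproduces.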
If we compute $\Delta V$, which is well defined except at the surface of the unit sphere, we find $\Delta V = \omega_n(2 - n)$ inside the sphere, and $\Delta V = 0$ outside it.

We consider a form of Green's formula: If $V$ is an open set with a suitably regular boundary $B$, and $u(x)$ and $v(x)$ are two $C^2$-functions defined in the closure of $V$, then
$$\int_V (\Delta u\,v - \Delta v\,u)\,dx = \int_B \left(\frac{\partial u}{\partial n}\,v - \frac{\partial v}{\partial n}\,u\right) dS,$$
where $dS$ is the element of surface area of $B$ and $\partial u/\partial n$ is the exterior normal derivative of $u(x)$. The formula is easily verified when $V$ is a rectangular parallelepiped; however, we shall need it when $V$ is a sphere. Let $V$ be the volume between two concentric spheres in $R^n$ centered about the origin, that is, the set $\varepsilon < r < R$; let $u(x)$ be a $C^2$-function which vanishes for $|x| \ge R$ and let $v(x) = r^{2-n}$, a function which is smooth and harmonic in $V$. We apply Green's formula, and note that the integrand vanishes on the outer boundary of $V$, whence
$$\int_{\varepsilon < |x| < R} \frac{\Delta u(x)}{|x|^{n-2}}\,dx = -\omega_n \varepsilon \int \frac{\partial u}{\partial r}(\varepsilon, \theta)\,d\omega(\theta) + (2 - n)\omega_n \int u(\varepsilon, \theta)\,d\omega(\theta).$$
Since $|x|^{2-n}$ is integrable (we have $n \ge 3$), the left-hand side tends with decreasing $\varepsilon$ to $\int \Delta u(x)/|x|^{n-2}\,dx$; the first term on the right-hand side approaches 0, since the gradient of the $C^2$-function $u(x)$ is uniformly bounded; the last term evidently approaches $(2 - n)\omega_n u(0)$. Since the origin can be chosen anywhere, we have finally
$$u(x) = \frac{-1}{(n - 2)\omega_n} \int \frac{\Delta u(y)}{|x - y|^{n-2}}\,dy,$$
where the integral may be taken over the whole space. Another interesting consequence of the Green's formula is obtained by applying it when $v(x) = 1$ and $V$ is the sphere of radius $R$. We have then
$$\int_V \Delta u(x)\,dx = \int \frac{\partial u}{\partial r}(R, \theta)\,d\omega(\theta)\;R^{n-1}\omega_n.$$
We shall write $F(r) = \int u(r, \theta)\,d\omega(\theta)$ and note that $F(0) = u(0)$. Since $F(r)$ is absolutely continuous and $F'(r) = \int (\partial u/\partial r)(r, \theta)\,d\omega(\theta)$ we have
$$F'(R) = \frac{R}{n}\,\frac{1}{|S_R|} \int_{S_R} \Delta u\,dx,$$
where $S_R$ denotes the ball $|x| < R$ and $|S_R|$ its measure. Thus, finally,
$$F(R) - u(0) = \int_0^R \frac{r}{n}\,\frac{1}{|S_r|} \int_{S_r} \Delta u\,dx\,dr.$$
Thus, if we suppose $\Delta u \ge 0$ inside a sphere $|x| < R$, then for all $r < R$ we have
$$u(0) \le \int u(r, \theta)\,d\omega(\theta).$$
It follows that if $u(x)$ is $C^2$ in an open set $G$ where $\Delta u$ is nonnegative, then for any $x$ in $G$ and any $r$ smaller than the distance of $x$ to the boundary, we have
$$u(x) \le \int u(x + rz)\,d\omega(z).$$
In this case we say that $u$ is subharmonic in $G$. Similarly, if $\Delta u \le 0$ in $G$ we have the reverse inequality; $u(x)$ is superharmonic in $G$. Finally, for $\Delta u = 0$ in $G$, we have equality, and we say that $u(x)$ is harmonic in $G$. In the next section we introduce a seemingly more general definition of harmonic functions, while in Section 27 the subharmonic functions are considered in some detail.
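The mean value inequalities of this section are easy to observe numerically in the plane ($n = 2$), where the normalized measure on the sphere is arc length on a circle divided by $2\pi$. The following sketch is an illustration only; the two test functions and all names are choices made here, not taken from the text.

```python
import math

def circle_mean(u, x, r, steps=3600):
    """Average of u over the circle of radius r about x (normalized measure, n = 2)."""
    total = 0.0
    for i in range(steps):
        a = 2.0 * math.pi * (i + 0.5) / steps
        total += u(x[0] + r * math.cos(a), x[1] + r * math.sin(a))
    return total / steps

def harmonic(s, t):
    """s^2 - t^2 has Laplacian 0."""
    return s * s - t * t

def subharmonic(s, t):
    """s^2 + t^2 has Laplacian 4 >= 0."""
    return s * s + t * t

x = (0.3, -0.2)
print(circle_mean(harmonic, x, 0.5), harmonic(*x))        # equal up to rounding
print(circle_mean(subharmonic, x, 0.5), subharmonic(*x))  # mean exceeds the center value
```

For the second function the mean exceeds the central value by $r^2$, in agreement with the formula for $F(R) - u(0)$ when $\Delta u = 4$ and $n = 2$.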
9. Harmonic Functions and the Poisson Integral

For a fixed point $y$ in $R^n$, $n \ge 2$, we verify by differentiation that the function
$$\frac{|y|^2 - |x|^2}{|x - y|^n}$$
is a solution to the Laplace equation for all $x \ne y$. The computation is easy but tedious; it is convenient to simplify by the change of variable $z = x - y$, obtaining the function
$$\frac{-1}{|z|^{n-2}} - \frac{2(z, y)}{|z|^n};$$
the first term is known to be harmonic, and only the second need be differentiated.

Let $f(y)$ be a continuous function defined on the surface of the unit sphere $|y| = 1$; for $|x| < 1$ we consider the Poisson Integral
$$u(x) = \int \frac{1 - |x|^2}{|x - y|^n}\,f(y)\,d\omega(y),$$
which surely exists, the integrand being continuous on $|y| = 1$. It is also clear that we may differentiate the integral as many times as we like under the integral sign; accordingly $u(x)$ is a $C^\infty$-function in $|x| < 1$ and satisfies the Laplace equation $\Delta u = 0$. If we consider the special case when $f(y) = 1$, then $u(x)$ is a function of radius, harmonic in the sphere for which $u(0) = 1$. Since the harmonic functions which depend only on radius are all of the form $u(x) = A + B|x|^{2-n}$ (or $u(x) = A + B \log |x|$ when $n = 2$), we see that $u(x) = 1$ inside the sphere. We use this fact to show that the Poisson integral takes on the boundary values $f(y)$. Choose $z$ so that $|z| = 1$. Now,
$$u(rz) - f(z) = \int \frac{1 - r^2}{|rz - y|^n}\,[f(y) - f(z)]\,d\omega(y).$$
We write the integral as a sum of two terms, the first being the integral taken over $|y - z| > \delta$, the second for $|y - z| \le \delta$. If $M = \sup |f(y)|$, $|y| = 1$, then the first term in absolute value is bounded by
$$2M(1 - r^2)\left(\frac{2}{\delta}\right)^n$$
as soon as $1 - r < \delta/2$, since then $|rz - y| > \delta/2$ on the set in question. On the other hand, given a small $\varepsilon$, there exists $\delta$ so that for $|y - z| \le \delta$ we have $|f(y) - f(z)| < \varepsilon$; the second term is therefore bounded by
$$\varepsilon \int \frac{1 - r^2}{|rz - y|^n}\,d\omega(y) = \varepsilon.$$
Thus, finally, $|u(rz) - f(z)| \le \varepsilon + 2M(1 - r^2)(2/\delta)^n$, which is $< 2\varepsilon$ for $r$ sufficiently close to 1, uniformly on $|z| = 1$. Therefore, if we define $u(y) = f(y)$ on the boundary of the sphere, we obtain a function harmonic inside the sphere, and continuous on the closed sphere. By a simple change of variables, we obtain the Poisson integral when we are concerned with a sphere of radius $R$:
$$u(x) = \int \frac{(R^2 - |x|^2)R^{n-2}}{|x - Ry|^n}\,f(y)\,d\omega(y)$$
(in this formula, of course, $|y| = 1$). We have $u(Ry) = f(y)$.

We have already established a mean value property for functions which are solutions to the differential equation $\Delta u = 0$ and are therefore led to the following definition. A function $u(x)$ defined in an open subset $G$ of $R^n$ is harmonic there if and only if (i) $u(x)$ is locally integrable, that is, $u(x)$ is integrable over any compact subset of $G$, and (ii) for every $x$ in $G$, and any sphere $S(x, r)$ with center at $x$ and radius $r$ which is wholly contained in $G$,
$$u(x) = \frac{1}{|S_r|} \int_{S(x, r)} u(y)\,dy.$$
Note that the first hypothesis is needed so that the second makes sense. If $u(x)$ is a function defined in $G$ and $C^2$ there for which $\Delta u = 0$, we have already seen that it satisfies a mean value theorem; that is, for any $x_0$ in $G$ and sufficiently small $r$
$$u(x_0) = \int u(x_0 + rz)\,d\omega(z).$$
We multiply this equation by $r^{n-1}$ and integrate over $0 < r < R$ to obtain
$$u(x_0)\,\frac{R^n}{n} = \int\!\!\int_0^R u(x_0 + rz)\,r^{n-1}\,dr\,d\omega(z) = \frac{1}{\omega_n} \int_{S(x_0, R)} u(y)\,dy.$$
Thus $u(x)$ is harmonic in the sense of our definition, which uses volume integrals and not surface integrals.

We remark first that if $u(x)$ is harmonic in $G$, it is continuous there, for if we suppose that $x_k$ is a sequence in $G$ converging to $x_0$ in $G$, the distances from $x_k$ to the boundary of $G$ are uniformly bounded from below, and we may suppose such distances $\ge 2r$ for some small positive $r$. We consider, then, the sphere of radius $2r$ about $x_0$ and take $k$ so large that $S(x_k, r)$ is a subset of $S(x_0, 2r)$. Then, writing $\chi_k(y)$ as the characteristic function of $S(x_k, r)$ we have
$$u(x_k) = \frac{1}{|S_r|} \int \chi_k(y)u(y)\,dy \quad \text{and} \quad u(x_0) = \frac{1}{|S_r|} \int \chi_0(y)u(y)\,dy,$$
and the Lebesgue convergence theorem may be invoked to show that $u(x_k)$ approaches $u(x_0)$; the integrands $\chi_k(y)u(y)$ converge pointwise almost everywhere to $\chi_0(y)u(y)$ and are uniformly bounded in absolute value by the integrable function $|H(y)u(y)|$ where $H(y)$ is the characteristic function of $S(x_0, 2r)$.

We select next a point $x_0$ in $G$ and a sphere $S(x_0, R)$ about it which is wholly contained in $G$. We may change coordinates so that $x_0$ appears as the origin. We form the Poisson integral for the sphere of radius $R$ determined by the values of $u(x)$ on that surface
$$w(x) = \int \frac{(R^2 - |x|^2)R^{n-2}}{|x - Ry|^n}\,u(Ry)\,d\omega(y).$$
We consider next the difference, $v(x) = w(x) - u(x)$ in the closed sphere $|x| \le R$. Since each of the terms is continuous in the closed sphere, this function is also continuous and vanishes on the boundary. Suppose $M = \sup v(x)$ for $|x| \le R$ and that $M > 0$. There is then a point $z_0$ where $v(z_0) = M$ and $|z_0| < R$. Choose $r = R - |z_0|$ and note that
$$0 \le \frac{1}{|S_r|} \int_{S(z_0, r)} [M - v(y)]\,dy = M - v(z_0) = 0,$$
whence $v(y) = M$ almost everywhere in $S(z_0, r)$; from the continuity of $v(y)$, the equation even holds everywhere in that sphere. Clearly, there is a sequence of points in that sphere approaching the boundary and the corresponding values of the continuous $v(y)$ must approach zero, contradicting $M > 0$. A similar argument shows that the minimum of $v(x)$ in the sphere is 0, whence $u(x) = w(x)$. But $w(x)$, a Poisson integral, is known to be $C^\infty$ and a solution of $\Delta w = 0$ inside the sphere; hence $u(x)$ has the same properties. Since the sphere was about an arbitrary point $x_0$ in $G$, it follows that the harmonic function $u(x)$ is $C^\infty$ in $G$ and satisfies the differential equation $\Delta u = 0$.

We prove two well-known results about harmonic functions in $R^n$.
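The statement that the Poisson integral reproduces a harmonic function from its boundary values can be illustrated in the unit disk ($n = 2$), where the kernel is $(1 - |x|^2)/|x - y|^2$ and the normalized measure is arc length divided by $2\pi$. The sketch below is a minimal illustration; the function name and the choice of boundary data are assumptions made here.

```python
import math

def poisson_integral(f, x, steps=2000):
    """Integral of (1 - |x|^2)/|x - y|^2 * f(y) over |y| = 1 (normalized), n = 2."""
    x1, x2 = x
    num = 1.0 - (x1 * x1 + x2 * x2)
    total = 0.0
    for i in range(steps):
        a = 2.0 * math.pi * (i + 0.5) / steps
        y1, y2 = math.cos(a), math.sin(a)
        total += num / ((x1 - y1) ** 2 + (x2 - y2) ** 2) * f(y1, y2)
    return total / steps

x = (0.3, 0.4)
# Boundary data 1 should give the constant 1 inside the disk.
print(poisson_integral(lambda a, b: 1.0, x))
# Boundary data y1^2 - y2^2 should give the harmonic function x1^2 - x2^2.
print(poisson_integral(lambda a, b: a * a - b * b, x), x[0] ** 2 - x[1] ** 2)
```

Equally spaced nodes are used because the integrand is smooth and periodic in the angle, for which this rule is very accurate when $x$ is not close to the boundary.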
Liouville's Theorem: Let $u(x)$ be harmonic in all of $R^n$ and bounded there by $M$; then $u(x)$ = constant.

PROOF: We choose a point $x_0$ and estimate $|u(x_0) - u(0)|$. We take the spheres $S(0, R)$ and $S(x_0, R)$, where $R > 4|x_0|$, and write their characteristic functions $\chi_0$, $\chi_1$ to obtain
$$u(x_0) - u(0) = \frac{1}{|S_R|} \int [\chi_1(y) - \chi_0(y)]\,u(y)\,dy;$$
the integrand vanishes inside the sphere about the origin of radius $R - |x_0|$ and outside the sphere of radius $R + |x_0|$, and so is supported by a set of measure at most
$$\frac{\omega_n}{n}\left[(R + |x_0|)^n - (R - |x_0|)^n\right];$$
by the mean value theorem, this is bounded by $\omega_n(R + |x_0|)^{n-1}\,2|x_0|$. It follows that
$$|u(x_0) - u(0)| \le \frac{2nM|x_0|(R + |x_0|)^{n-1}}{R^n},$$
a quantity which converges to 0 with increasing $R$. Since $R$ may be taken arbitrarily large, it follows that $u(x)$ = constant = $u(0)$.
Schwarz Reflection Principle: Let $u(x)$ be defined and continuous in the closed hemisphere $|x| \le 1$, $x_n \ge 0$. Suppose $u(x) = 0$ for $x_n = 0$ and that $u(x)$ is harmonic in the open hemisphere. Then $u(x)$ may be extended to the lower hemisphere by the equation
$$u(x_1, x_2, \ldots, x_{n-1}, x_n) = -u(x_1, x_2, \ldots, x_{n-1}, -x_n)$$
to obtain a function continuous in $|x| \le 1$ and harmonic in $|x| < 1$.

PROOF: Given $x$ in $R^n$, we write $x'$ for the reflection, that is, the point with the same coordinates except $x_n' = -x_n$. Our extension formula then is conveniently written $u(x) = -u(x')$. Let $f(y)$ be the restriction of $u(x)$ to $|x| = 1$, $x_n \ge 0$; we extend $f(y)$ by the equation $f(y) = -f(y')$ to obtain a function defined and continuous on the sphere $|y| = 1$. We form the Poisson integral of the extended function
$$w(x) = \int \frac{1 - |x|^2}{|x - y|^n}\,f(y)\,d\omega(y)$$
and note that this function satisfies the equation $w(x') = -w(x)$ since $1 - |x'|^2 = 1 - |x|^2$ and $|x' - y| = |x - y'|$; thus, $w(x')$ is the Poisson integral of $f(y')$ which is the negative of $f(y)$. From this equation it also follows that $w(x)$ vanishes when $x_n = 0$ (since then $x = x'$). Thus, in the upper hemisphere, the harmonic function $w(x)$ has the same boundary values as the harmonic function $u(x)$; their difference, which is also harmonic, vanishes on the boundary, is continuous in the closed hemisphere, and cannot have a maximum or minimum inside by an argument which we have already used. Thus that difference vanishes identically and $u(x)$ and $w(x)$ coincide in the upper hemisphere. Hence $u(x)$ is extended by $w(x)$ as asserted in the Reflection theorem.

In Section 1 we made the important observation that a family of functions, analytic in some region $G$ of the complex plane and uniformly bounded there, was an equicontinuous family. The same result holds in a more general context: we consider an infinite sequence $u_k(x)$ of functions harmonic in a region $G$ of $R^n$ and uniformly bounded there by the constant $M$. Let $K$ be a compact subset of $G$, $d$ the diameter of $K$, and $R$ the distance from $K$ to the boundary of $G$. If $x_1$ and $x_2$ are any two points of $K$, the estimate used in the proof of Liouville's theorem shows that if $|x_1 - x_2|$ is sufficiently small, say smaller than $R/4$, then
$$|u_k(x_1) - u_k(x_2)| \le \frac{2nM(R + |x_1 - x_2|)^{n-1}}{R^n}\,|x_1 - x_2| \le \frac{2nM}{R}\left(\frac{5}{4}\right)^{n-1} |x_1 - x_2|,$$
and the coefficient of $|x_1 - x_2|$ above is then a uniform Lipschitz constant for functions in the sequence relative to the compact $K$. Since $G$ is a countable union of such compacts, there exists a subsequence $u_{k_i}(x)$ converging uniformly on all compact subsets of $G$ to a limit $u(x)$ which is necessarily continuous. The limit is even harmonic, since the mean value property for $u(x)$ is an immediate consequence of the mean value property for the functions of the sequence via the Lebesgue convergence theorem.

When a function $u(x)$ is harmonic in a region $G$ and is also nonnegative, more can be said. Suppose $u(x)$ is a nonnegative, harmonic function in the ball $|x| < R + \varepsilon$; from the Poisson integral representation for $u$, we have for all $x$ in $|x| < R$,
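$$u(x) \le \frac{(R^2 - |x|^2)R^{n-2}}{(R - |x|)^n}\,u(0) = \frac{(R + |x|)R^{n-2}}{(R - |x|)^{n-1}}\,u(0),$$
since $u(0) = \int u(Ry)\,d\omega(y)$ and $R - |x|$ is a lower bound for the distance from $x$ to the sphere $|y| = R$. This is called Harnack's inequality and it enables us to establish the following remarkable theorem.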
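Harnack's inequality, taken here in its standard upper form $u(x) \le (R + |x|)R^{n-2}(R - |x|)^{1-n}\,u(0)$, can be tried on a concrete positive harmonic function. The sketch below is an illustration under stated assumptions: the test function is the Poisson kernel of the ball of radius 2 with a fixed pole, which is positive and harmonic in $|x| < 2$ by the computation at the start of this section, and the inequality is applied with $R = 1.5$. The names are hypothetical.

```python
import math

def poisson_kernel(x, y, R=1.0):
    """(R^2 - |x|^2) R^(n-2) / |x - R y|^n for tuples x, y with |y| = 1."""
    n = len(x)
    abs_x_sq = sum(c * c for c in x)
    dist = math.sqrt(sum((a - R * b) ** 2 for a, b in zip(x, y)))
    return (R * R - abs_x_sq) * R ** (n - 2) / dist ** n

def harnack_upper(abs_x, R, n):
    """Upper bound for u(x)/u(0): (R + |x|) R^(n-2) / (R - |x|)^(n-1)."""
    return (R + abs_x) * R ** (n - 2) / (R - abs_x) ** (n - 1)

pole = (0.0, 0.0, 1.0)                       # unit vector; the pole sits at 2 * pole

def u(x):
    """Positive harmonic function in |x| < 2 with u(0) = 1."""
    return poisson_kernel(x, pole, 2.0)

for x in [(0.0, 0.0, 1.0), (0.5, -0.5, 0.7), (0.0, 1.2, 0.3), (0.0, 0.0, 1.4)]:
    abs_x = math.sqrt(sum(c * c for c in x))
    print(u(x), "<=", harnack_upper(abs_x, 1.5, 3) * u((0.0, 0.0, 0.0)))
```

The bound is far from sharp at most points, but it depends only on $|x|$, $R$ and $u(0)$, which is what the compactness argument in the next theorem requires.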
Theorem: Let $u_k(x)$ be a sequence of positive harmonic functions in a region $G$ and $x_0$ a point of $G$ such that the values $u_k(x_0)$ are bounded; then there exists a subsequence $u_{k_i}(x)$ converging uniformly on compact subsets of $G$ to a positive, harmonic limit in $G$.

PROOF: It is enough to show that the sequence is uniformly bounded on any compact subset of $G$ which is connected and which contains $x_0$. If $K$ is such a set, let $r$ be smaller than one quarter of the distance from $K$ to the boundary of $G$; we can cover $K$ by a finite number of overlapping balls of radius $r$. From the Harnack inequality, the boundedness of the sequence of functions at some point of such a ball implies its boundedness at all other points of the ball; indeed, the sequence is uniformly bounded on the ball. Since there are only finitely many balls in the covering, the sequence is uniformly bounded on $K$.
Theorem: The most general function $u(x)$, positive and harmonic in the ball $|x| < R$, is of the form
$$u(x) = \int \frac{(R^2 - |x|^2)R^{n-2}}{|x - Ry|^n}\,d\nu(y),$$
where $\nu$ is a positive Radon measure on the sphere $|y| = 1$.

PROOF: Let $R_k$ be a sequence of positive numbers converging increasingly to $R$; for $|x| < R_k$ we have
$$u(x) = \int \frac{(R_k^2 - |x|^2)R_k^{n-2}}{|x - R_k y|^n}\,u(R_k y)\,d\omega(y).$$
The measure $d\nu_k(y) = u(R_k y)\,d\omega(y)$ is positive and has total mass $u(0)$; from Helly's theorem it follows that there is a weakly convergent subsequence of these measures, that is, there exists a positive Radon measure $\nu_0$ on $|y| = 1$ such that the integrals $\int f(y)\,d\nu_{k_i}(y)$ converge to $\int f(y)\,d\nu_0(y)$ for all functions $f(y)$ continuous on the sphere. Since the Poisson kernel is continuous on the sphere, the formula of the theorem is then valid.

The Poisson integral is particularly useful in the study of harmonic and analytic functions in the complex plane, and it is then convenient to write the integral in terms of polar coordinates. If $z = re^{i\theta}$ is a point in the disk $|z| < 1$ and $y$ is a point on the boundary, that is, $y = e^{i\omega}$, the kernel
$$\frac{1 - |z|^2}{|z - y|^2}$$
becomes
$$\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \omega)}$$
I. INTRODUCTION
and the Poisson integral is written u(z) = u(reie)
= -1J
1 - r2
2n
1
211
+ r 2 - 2r cos(e - w ) u(e'O) d o .
From the previous theorem we have the following corollary.
Corollary: The most general function $u(z)$ positive and harmonic in the disk $|z| < 1$ is of the form
$$u(re^{i\theta}) = \int \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \omega)}\,d\nu(\omega),$$
where $\nu$ is a positive Radon measure on the circle $|z| = 1$.

Another easy and useful result is the following.

Theorem: The most general function $f(z)$, analytic in the disk $|z| < 1$ and having a positive real part there, is of the form
$$f(z) = iC + \int \frac{e^{i\omega} + z}{e^{i\omega} - z}\,d\nu(\omega),$$
where $C$ is the constant $\mathrm{Im}[f(0)]$ and $\nu$ a positive Radon measure on the circle $|z| = 1$.
PROOF: The function $f(z) = u(z) + iv(z)$ has positive real part and so the harmonic $u(z)$ is the Poisson integral of a positive measure $\nu$. Since the real part of $(e^{i\omega} + z)/(e^{i\omega} - z)$ is the Poisson kernel, the formula determines an analytic function $U(z) + iV(z)$ for which $U(z) = u(z)$. It follows that the function so obtained differs from $f(z)$ by a constant and the theorem readily follows.
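Two identities are used in this part of the section: the polar form of the Poisson kernel of the disk, and the fact that the real part of $(e^{i\omega} + z)/(e^{i\omega} - z)$ is that kernel. Both can be spot-checked with complex arithmetic. The sketch below is only such a check at a few sample points; the function names are illustrative.

```python
import cmath
import math

def poisson_kernel_polar(r: float, theta: float, omega: float) -> float:
    """(1 - r^2) / (1 + r^2 - 2 r cos(theta - omega))."""
    return (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(theta - omega))

def analytic_kernel(z: complex, omega: float) -> complex:
    """(e^{i omega} + z) / (e^{i omega} - z), analytic in z for |z| < 1."""
    e = cmath.exp(1j * omega)
    return (e + z) / (e - z)

for r, theta, omega in [(0.6, 0.7, 2.1), (0.9, -1.0, 3.0), (0.2, 5.5, 0.1)]:
    z = r * cmath.exp(1j * theta)
    y = cmath.exp(1j * omega)
    cartesian = (1.0 - abs(z) ** 2) / abs(z - y) ** 2
    # The three printed numbers should coincide up to rounding.
    print(cartesian, poisson_kernel_polar(r, theta, omega), analytic_kernel(z, omega).real)
```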
At the end of Section 2, we obtained a canonical factorization for functions analytic and bounded in the disk. This is now easily extended to the following representation.

Theorem: The most general function $f(z)$, analytic and bounded in the disk $|z| < 1$, is of the form
$$f(z) = Cz^l B(z) \exp\left[-\int \frac{e^{i\omega} + z}{e^{i\omega} - z}\,d\nu(\omega)\right],$$
where $C$ is a constant, $l$ a nonnegative integer, $B(z)$ a Blaschke product, and $\nu$ a positive Radon measure on the circle. A proof is scarcely necessary; the function $g(z)$ occurring in the canonical factorization of $f(z)$ has a negative real part.
10. Smooth Functions

In most of this section we consider functions $f(x)$ defined on the real axis and $C^\infty$ there, that is, having continuous derivatives of all orders. Such functions need not be analytic, as the example which we shall now construct shows. Let
$$f(x) = 0 \ \text{ for } x \le 0, \qquad f(x) = \exp(-1/x) \ \text{ for } x > 0;$$
it is clear that $f(x)$ is $C^\infty$ everywhere on $R^1$, except perhaps at the origin. The function is clearly continuous at the origin, since $\lim_{x \to 0} f(x) = 0$. We next invoke the following elementary lemma.

Lemma: Let $f(x)$ and $g(x)$ be continuous functions on the real axis such that $g(x)$ is the derivative of $f(x)$ at all points $x$ different from 0; then $f'(0)$ exists and equals $g(0)$.
PROOF: By the mean value theorem, the function $f(x)$ is Lipschitzian in the interval $[0, 1]$ since $f(x) - f(0) = f'(\theta x)x$ (the mean value theorem only requires that the derivative exist in the open interval $(0, x)$) and therefore the difference is bounded in absolute value by $\|g\|_\infty |x|$. A similar argument shows that the function is Lipschitzian in $[-1, 0]$ and hence, it is absolutely continuous in $[-1, 1]$ and therefore the indefinite integral of its derivative. We infer that
$$f(x) = f(-1) + \int_{-1}^{x} g(t)\,dt$$
and since $g(t)$ is continuous, $f(x)$ is differentiable at the origin with derivative $g(0)$.
Now let $r(x)$ be any rational function of $x$, and $f(x)$ the function introduced above. We have
$$\lim_{x \to 0} r(x)f(x) = 0,$$
because this is surely true if $r(x)$ is continuous at the origin; if it has a singularity at the origin, that singularity is of the form $x^{-k}$ for some positive integer $k$. We write $x = t^{-1}$ and infer that the limit is 0, since $t^k e^{-t}$ converges to 0 with increasing $t$. Finally, we note that the successive derivatives of the function $f(x)$, which exist everywhere except at the origin, are all of the form $r(x)f(x)$ where $r(x)$ is rational; thus, they are all continuous functions, vanishing for $x \le 0$; from the lemma, it follows that each is differentiable at the origin with derivative 0 there. Thus $f(x)$ is $C^\infty$, but clearly not analytic, since its formal Taylor expansion about the origin vanishes identically.

If we pass next to the function $F(x) = f(1 - x^2)$, we obtain a function $F(x)$ which is $C^\infty$ and vanishes outside the interval $[-1, 1]$; this function is even and positive inside that interval. Next we average this function, choosing a small positive $\varepsilon$ and forming
$$\psi(x) = \int_{-1}^{1} F\!\left(\frac{x - t}{\varepsilon}\right) dt.$$
It is legitimate to differentiate under the integral sign, and hence $\psi(x)$ is itself a $C^\infty$-function, even and nonnegative. For fixed $x$, the integrand $F([x - t]/\varepsilon)$ vanishes outside the interval $x - \varepsilon < t < x + \varepsilon$ and therefore $\psi(x) = 0$ for $|x| > 1 + \varepsilon$ and is constant in the interval $|x| < 1 - \varepsilon$. If $\psi(x)$ is multiplied by a suitable positive constant, it becomes identically $+1$ in a neighborhood of the origin.

It is now clear that there is a substantial class of functions $C^\infty$ on the real axis; the class is closed under differentiation, multiplication, and addition, as well as the operations of translation ($f(x)$ being carried into $f(x - h)$) and composition: $f(g(x)) = (f \circ g)(x)$. Every function in the class has a formal Taylor expansion about an arbitrary point $x_0$, but that expansion usually has the radius of convergence 0, and even when it does not, the sum of the series in general is different from the function. The question naturally arises as to whether the Taylor coefficients of the function are subject to some condition as a consequence of the hypothesis that the function is $C^\infty$. It was shown by Borel that this is not the case: those coefficients may be arbitrary. The elegant proof is due to L. Gårding.
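The three functions constructed above can be evaluated directly. The following sketch is a numerical illustration only: it implements $f$, the bump $F(x) = f(1 - x^2)$, and the averaged function $\psi$ by a midpoint rule whose step count is an arbitrary choice, and it shows how quickly $x^{-k}f(x)$ decays near the origin.

```python
import math

def f(x: float) -> float:
    """0 for x <= 0 and exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0

def F(x: float) -> float:
    """The bump f(1 - x^2), which vanishes outside [-1, 1]."""
    return f(1.0 - x * x)

def psi(x: float, eps: float, steps: int = 4000) -> float:
    """Midpoint rule for the integral of F((x - t)/eps) over -1 <= t <= 1."""
    h = 2.0 / steps
    return h * sum(F((x - (-1.0 + (i + 0.5) * h)) / eps) for i in range(steps))

print(0.01 ** -5 * f(0.01))           # x**(-k) * f(x) is already tiny at x = 0.01
print(psi(0.0, 0.2), psi(0.3, 0.2))   # equal: psi is constant for |x| < 1 - eps
print(psi(1.3, 0.2))                  # zero: psi vanishes for |x| > 1 + eps
```

Dividing `psi` by its constant value on $|x| < 1 - \varepsilon$ gives a function that is identically 1 near the origin, as described in the text.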
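Theorem: Let $a_k$ be an arbitrary sequence of real numbers; then there exists a function $F(x)$, $C^\infty$ on the whole axis, such that $F^{(k)}(0) = a_k$ for all $k$.

PROOF: Let $\varphi(x)$ be a $C^\infty$-function which vanishes for $|x| \ge 1$ and is equal to $+1$ for $|x| < \tfrac{1}{2}$, and let $b_k = k + \sum_{j=0}^{k} |a_j|$. The function
$$F(x) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\,x^k \varphi(b_k x)$$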
has the required properties. Only finitely many terms of the series are nonzero on any closed interval $[c, d]$ not containing the origin, since $\varphi(b_k x)$ vanishes for $|x| > 1/b_k$, a quantity which converges to 0. Thus $F(x)$ is $C^\infty$ in a neighborhood of any nonzero $x$, and we have to show that it is equally regular at the origin. We form the derivatives; when $x$ is not 0 these are given by the convergent series
$$F^{(n)}(x) = \sum_{k=0}^{\infty} \sum_{j=0}^{n} \frac{n!}{(n - j)!\,j!}\,\frac{a_k}{k!}\,k(k - 1)(k - 2) \cdots (k - j + 1)\,x^{k-j}\,\varphi^{(n-j)}(b_k x)\,b_k^{n-j}$$
and we shall show that this series of continuous functions converges uniformly on the real axis using the Weierstrass M-test. Let $M_n = \max_{j \le n} \|\varphi^{(j)}\|_\infty$ and suppose $k \ge n + 1$. Since only terms for which $|x| b_k < 1$ will contribute to the sum we have
$$|a_k x^{k-j} b_k^{n-j}| \le |a_k|\,b_k^{n-k} \le 1.$$
Accordingly
$$\left|\sum_{j=0}^{n} \frac{n!}{(n - j)!\,j!}\,\frac{a_k}{k!}\,k(k - 1) \cdots (k - j + 1)\,x^{k-j}\,\varphi^{(n-j)}(b_k x)\,b_k^{n-j}\right| \le 2^n M_n \frac{1}{(k - n)!}.$$
Hence the sum of the terms for which $k \ge n + 1$ is bounded by
$$2^n M_n \sum_{k=n}^{\infty} \frac{1}{(k - n)!} = e\,2^n M_n.$$
Since the series giving $F^{(n)}(x)$ converges uniformly on the axis, that function may be extended to $x = 0$ in such a way that it becomes continuous, and in view of our lemma, $F^{(n-1)}(x)$, when so extended, is differentiable at the origin and its derivative is $F^{(n)}(0)$. But this number is just $a_n$.

There are theorems which assert that a $C^\infty$-function which satisfies some further condition is actually analytic. We give two interesting examples.

Theorem (S. N. Bernstein): Let $f(x)$ be a $C^\infty$-function on the interval $(a, b)$ having either of the following properties:

(i) $f^{(k)}(x) \ge 0$ in $(a, b)$ for all $k \ge 0$,
(ii) $(-1)^k f^{(k)}(x) \ge 0$ in $(a, b)$ for all $k \ge 0$.

Then $f(x)$ is the restriction to $(a, b)$ of a function analytic in a circle of radius $(b - a)$.
PROOF: If (i) holds we consider the formal Taylor expansion of $f(x)$ taken about the point $a + \varepsilon$ where $\varepsilon$ is positive and small. The coefficients in the series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(a + \varepsilon)}{k!}\,(x - a - \varepsilon)^k$$
are all positive, and its partial sums $P_n(x)$ form a monotone increasing sequence of positive polynomials on the interval $[a + \varepsilon, b)$. From the remainder form of Taylor's theorem, for $x$ in that interval,
$$f(x) = P_n(x) + \frac{f^{(n+1)}(\xi)}{(n + 1)!}\,(x - a - \varepsilon)^{n+1},$$
where $\xi$ is some point in the interval $(a + \varepsilon, x)$. Since the remainder is nonnegative, $P_n(x) \le f(x)$ in $[a + \varepsilon, b)$ and therefore the sequence $P_n(x)$ there converges increasingly to a sum $S(x) \le f(x)$. It follows that the series has a nonzero radius of convergence which is at least $b - a - \varepsilon$ and its sum $S(x)$ is analytic in a circle of that radius about $a + \varepsilon$. Let $c$ be the midpoint of the interval $(a, b)$; then, for $x$ in $[a + \varepsilon, c]$,
$$0 \le f(x) - P_n(x) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}\,(x - a - \varepsilon)^{n+1} \le \frac{f^{(n+1)}(c)}{(n + 1)!}\,(c - a - \varepsilon)^{n+1},$$
the last inequality holding because $f^{(n+1)}$ is nondecreasing and $\xi < c$, and the quantity on the right is a term of a convergent series (the Taylor series of $f$ about $c$, evaluated at the point $b - \varepsilon$), hence converges to 0. Thus $f(x) = S(x)$ in a neighborhood of $a + \varepsilon$, and since $\varepsilon$ is arbitrary, the functions coincide everywhere. Finally, as $\varepsilon$ approaches 0, the function $S(x)$ is analytic in a circle about $a$ of radius $b - a$. When the alternate hypothesis (ii) holds, the argument is almost the same; we expand initially about $b - \varepsilon$ and note that the partial sums of the Taylor series form a monotone increasing sequence of polynomials on the interval $(a, b - \varepsilon]$. It follows that $f(x)$ is analytic in a circle about $b$ of radius $b - a$.

In the special case that the interval is the right half-axis, the point $b$ may be taken arbitrarily large, and $f$ appears as a function analytic in the right half-plane. Here Bernstein has shown more, namely, that the function $f(x)$ is the Laplace transform of a positive measure:
$$f(x) = \int_0^\infty e^{-xt}\,d\mu(t).$$
We shall not give the proof, which has been given in an elegant modern version by G. Choquet.

A curious theorem of Corominas and Balaguer asserts that if a function $f(x)$ is $C^\infty$ on an interval, and if its formal Taylor expansion about every point
of the interval has at least one coefficient equal to 0, then the function is a polynomial. This result was also obtained by S. Agmon.

Theorem: Let $f(x)$ be $C^\infty$ on $(c, d)$ such that for every point $x$ in the interval there exists an integer $N_x$ for which $f^{(N_x)}(x) = 0$; then $f(x)$ is a polynomial.

PROOF: Let $G$ be the open set of all $x$ for which there exists a neighborhood within which $f(x)$ coincides with some polynomial, that is, a neighborhood on which $f^{(k)}(x)$ vanishes identically for some value of $k$. Let $F$ be the complement of $G$; the theorem will be proved if we show that $F$ is empty. The set $F$ cannot have an isolated point, for if $x_0$ were such a point, it is the right-hand endpoint of an interval $(a, x_0)$ on which $f$ coincides with a polynomial, and the Taylor coefficients of the polynomial coincide with the formal Taylor expansion of $f(x)$ about $x_0$; similarly, $x_0$ is the left-hand endpoint of an interval $(x_0, b)$ on which $f(x)$ coincides with a polynomial, which is determined by the Taylor expansion of the function about $x_0$; thus, $f$ coincides with a certain polynomial in the interval $(a, b)$ and $x_0$ is not in $F$. Let $E_n$ be the subset of $F$ on which $f^{(n)}(x) = 0$; this is a closed subset of $F$, and $F$ is the union of the sets $E_n$. By category, then, there exists $N$ such that $E_N$ contains a sphere, that is, a point $x_0$ in $F$ and all points of $F$ sufficiently close to $x_0$. If $I$ is a closed interval about $x_0$ which is sufficiently small, $F \cap I \subset E_N$ and $f^{(N)}(x)$ vanishes identically in $F \cap I$. It follows that the difference quotients of that function, computed with points in $F \cap I$, also vanish, and since that set is perfect the derivative $f^{(N+1)}(x)$ also vanishes on $F \cap I$, as well as all higher derivatives. The perfect set $F \cap I$ cannot coincide with the interval $I$, since then all interior points of that interval belong to $G$; thus, there is a small interval $(a, b)$ in $I$ which is a constituent interval of $G$, and $f(x)$ coincides with a polynomial $p(x)$ in $(a, b)$. Since $p(x)$ can be obtained from its Taylor expansion about either of the end points, and all the coefficients $f^{(k)}(a) = f^{(k)}(b) = 0$ for $k \ge N$, it follows that $f^{(N)}(x)$ vanishes identically on $(a, b)$. Since $(a, b)$ was an arbitrary subinterval of the intersection $G \cap I$, it follows that $f^{(N)}(x)$ vanishes identically on $I$, whence $I$ is contained in $G$, except, perhaps, for the endpoints. This shows that $F$ is empty.
11. Taylor's Formula

We introduce certain notations and conventions that we shall regularly use in the sequel. Let $Z^n$ denote the group of $n$-tuples of integers:

$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$$

with the obvious definition of addition; we are interested only in those elements $\alpha \ge 0$, that is, $\alpha_k \ge 0$ for all $k$, and shall not explicitly state this in the future. If $x$ is a point in $R^n$, $x = (x_1, x_2, \ldots, x_n)$, we may write the monomial $x^\alpha$ to denote the product $\prod_{k=1}^n x_k^{\alpha_k}$. Thus we will have $x^\alpha x^\beta = x^{\alpha+\beta}$. By $|\alpha|$ we mean $\sum_{k=1}^n \alpha_k$ and by $\alpha!$ we mean $\prod_{k=1}^n \alpha_k!$. For any $\alpha$ we have the corresponding differential operator

$$D^\alpha = \prod_{k=1}^n \left(\frac{\partial}{\partial x_k}\right)^{\alpha_k}.$$
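If $P(x)$ is a polynomial, it may be written as a sum of monomials in standard fashion: $P(x) = \sum a_\alpha x^\alpha$, and since $D^\alpha x^\beta = (\beta!/(\beta-\alpha)!)\,x^{\beta-\alpha}$ if $\alpha \le \beta$ and is 0 otherwise, we obtain the coefficients $a_\alpha = (1/\alpha!)D^\alpha P(0)$. This leads to the usual Taylor expansion of the polynomial about a point $x$: $P(x+y) = \sum_\alpha \{P^{(\alpha)}(x)/\alpha!\}\,y^\alpha$, where $P^{(\alpha)} = D^\alpha P$. In particular this leads us to

$$(x+y)^\alpha = \sum_{\beta \le \alpha} \frac{\alpha!}{(\alpha-\beta)!\,\beta!}\, x^{\alpha-\beta} y^\beta$$

and we see that if $x = y = (1, 1, \ldots, 1)$, then for $|\alpha| = N$, $2^N = \sum_{\beta \le \alpha} \alpha!/((\alpha-\beta)!\,\beta!)$. Given a polynomial $P(x) = \sum a_\alpha x^\alpha$, we may form a corresponding polynomial in the differential operator: $P(D) = \sum a_\alpha D^\alpha$; there is a one-to-one correspondence between these two classes of polynomials, since if we consider the smooth function $e^{\zeta x}$ where $\zeta x = \sum \zeta_k x_k$ for any choice of the complex vector $\zeta$, we have $P(D)e^{\zeta x} = P(\zeta)e^{\zeta x}$. The Leibnitz formula for the differentiation of a product is similar to the formula for the Taylor expansion:

$$P(D)(uv) = \sum_\alpha \frac{1}{\alpha!}\,(D^\alpha u)\,P^{(\alpha)}(D)v .$$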
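We establish this formula by verifying that if we introduce two auxiliary operators $D_u$ and $D_v$, with the convention that $D_u$ operates only on the function $u(x)$ and its derivatives, while $D_v$ operates only on $v$, then $P(D)uv = P(D_u + D_v)uv$. From the Taylor formula, then,

$$P(D_u + D_v) = \sum_\alpha \frac{1}{\alpha!}\, D_u^\alpha\, P^{(\alpha)}(D_v)$$

from which the Leibnitz formula follows immediately. In particular we have

$$D^\alpha(uv) = \sum_{\beta \le \alpha} \frac{\alpha!}{(\alpha-\beta)!\,\beta!}\, D^\beta u\, D^{\alpha-\beta} v .$$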
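The multi-index conventions lend themselves to a quick machine check. The following Python sketch is an illustration only; the helper names `multi_factorial`, `multi_binomial`, `below`, and `monomial` are ours and do not come from the text. For one sample multi-index it compares both sides of the binomial expansion of $(x + y)^\alpha$ and sums the coefficients $\alpha!/((\alpha-\beta)!\,\beta!)$ over $\beta \le \alpha$.

```python
from itertools import product
from math import factorial, prod


def multi_factorial(alpha):
    """alpha! = product of the factorials alpha_k!."""
    return prod(factorial(a) for a in alpha)


def multi_binomial(alpha, beta):
    """alpha! / ((alpha - beta)! beta!) for beta <= alpha componentwise."""
    diff = tuple(a - b for a, b in zip(alpha, beta))
    return multi_factorial(alpha) // (multi_factorial(diff) * multi_factorial(beta))


def below(alpha):
    """All multi-indices beta with 0 <= beta <= alpha."""
    return product(*(range(a + 1) for a in alpha))


def monomial(x, alpha):
    """x^alpha = product of x_k ** alpha_k."""
    return prod(xk ** ak for xk, ak in zip(x, alpha))


alpha = (2, 1, 3)          # sample multi-index, |alpha| = 6
x = (2, -1, 3)             # sample integer points, chosen arbitrarily
y = (1, 4, -2)

# Left and right sides of the expansion of (x + y)^alpha.
lhs = monomial(tuple(a + b for a, b in zip(x, y)), alpha)
rhs = sum(
    multi_binomial(alpha, beta)
    * monomial(x, tuple(a - b for a, b in zip(alpha, beta)))
    * monomial(y, beta)
    for beta in below(alpha)
)

# Sum of the coefficients alone, to be compared with 2 ** |alpha|.
binomial_sum = sum(multi_binomial(alpha, beta) for beta in below(alpha))
```

Integer arithmetic is used throughout, so the comparisons `lhs == rhs` and `binomial_sum == 2 ** sum(alpha)` are exact rather than approximate.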
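The Taylor expansion exists in general for smooth functions $f(x)$ defined in a region of $R^n$. Suppose, for simplicity, that $f(x)$ is $C^\infty$; we may write

$$f(x) = f(0) + \sum_{k=1}^n x_k g_k(x)$$

where $g_k(x)$ is the function $\int_0^1 (\partial f/\partial x_k)(xt)\,dt$. It is legitimate to differentiate under the integral sign; we find

$$\frac{\partial g_k}{\partial x_j}(x) = \int_0^1 t\,\frac{\partial^2 f}{\partial x_j\,\partial x_k}(xt)\,dt$$

and this may be differentiated again. It is easy to see that $g_k(0) = (\partial f/\partial x_k)(0)$, and hence

$$f(x) = f(0) + \sum_{k} x_k \frac{\partial f}{\partial x_k}(0) + \sum_{j,k} x_j x_k g_{jk}(x)$$

or more generally

$$f(x) = \sum_{|\alpha| \le N} \frac{D^\alpha f(0)}{\alpha!}\,x^\alpha + \sum_{|\alpha| = N+1} x^\alpha g_\alpha(x)$$

where the functions $g_\alpha(x)$ are also $C^\infty$. This may be written $f(x) = P_N(x) + R_N(x)$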
where $P_N(x)$ is a polynomial of degree $N$ and the remainder $R_N(x)$ is a $C^\infty$-function satisfying an inequality of the form $|R_N(x)| \le C|x|^{N+1}$ in some neighborhood of 0.

Let $f(x)$ be the function of one variable introduced in the previous section which is $C^\infty$, vanishes on the left half-axis, and is given by $\exp(-1/x)$ for $x > 0$; since $1 - |x|^2$ is a polynomial, we form $F(x) = f(1 - |x|^2)$ to obtain a $C^\infty$ function on $R^n$ which is a function of radius, and which vanishes outside the sphere of unit radius, and is, of course, nonnegative. It is convenient to multiply $F(x)$ by an appropriate constant to obtain a function $\varphi(x)$ for which $\int \varphi(x)\,dx = 1$. With the function so obtained we are able to define the regularizations of any locally integrable function in $R^n$. Given the function $\varphi(x)$ above, we form, for $\varepsilon > 0$, the family of functions $\varphi_\varepsilon(x) = \varepsilon^{-n}\varphi(x/\varepsilon)$; each $\varphi_\varepsilon$ is a nonnegative $C^\infty$-function vanishing outside the sphere of radius $\varepsilon$, and $\int \varphi_\varepsilon(x)\,dx = 1$.
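Suppose $f(x)$ belongs to $L^1_{\mathrm{loc}}(R^n)$, that is, is a measurable function on $R^n$ which is integrable over any compact set. We form the regularization of $f(x)$ as follows:

$$f_\varepsilon(x) = \int \varphi_\varepsilon(x-y) f(y)\,dy = \varepsilon^{-n} \int \varphi\!\left(\frac{x-y}{\varepsilon}\right) f(y)\,dy = \int \varphi(z) f(x - \varepsilon z)\,dz .$$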
These integrals surely exist, since the integrands vanish outside a ball of radius $\varepsilon$, the function $\varphi$ is bounded and $f(x)$ is locally integrable by hypothesis. The regularizations are also $C^\infty$-functions, since the differentiation under the integral sign is obviously legitimate. We must show that as $\varepsilon$ approaches 0, the regularizations converge to $f(x)$ in an appropriate sense. This is the content of the next theorem.
Theorem: (1) If $f(x)$ is uniformly continuous, the regularizations $f_\varepsilon$ converge uniformly to $f$. (2) If $f(x)$ belongs to $L^p(R^n)$ for $1 \le p < \infty$, then this also holds for the regularizations and $\|f_\varepsilon\|_p \le \|f\|_p$; as $\varepsilon$ approaches 0, $f_\varepsilon$ converges to $f$ in $L^p$. (3) The regularizations $f_\varepsilon(x)$ converge to $f(x)$ at every Lebesgue point of $f(x)$, and hence almost everywhere, and in particular at every point of continuity of $f(x)$.
PROOF: (1) Let $\omega(r)$ be the modulus of continuity of $f(x)$. Now

$$|f_\varepsilon(x) - f(x)| \le \int \varphi(z)\,|f(x - \varepsilon z) - f(x)|\,dz \le \omega(\varepsilon) \int \varphi(z)\,dz = \omega(\varepsilon)$$

and this estimate holds uniformly in $x$.
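The estimate in (1) can be observed numerically in one variable. The sketch below is a minimal illustration under stated assumptions, not a construction taken from the text: the integral defining $f_\varepsilon$ is replaced by a midpoint quadrature whose weights are normalized to sum to 1 (this plays the role of the normalizing constant for $\varphi$), the test function is $f(x) = |x|$, whose modulus of continuity is $\omega(r) = r$, and the names `bump`, `regularize`, and `sup_error` are hypothetical.

```python
import math


def bump(z):
    """Unnormalized C-infinity bump exp(-1 / (1 - z^2)), zero for |z| >= 1."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0


# Midpoint nodes on (-1, 1). Dividing by the total makes the discrete
# weights nonnegative with sum 1, mirroring the condition integral(phi) = 1.
M = 2000
nodes = [-1.0 + (2 * j + 1) / M for j in range(M)]
raw = [bump(z) for z in nodes]
total = sum(raw)
weights = [r / total for r in raw]


def regularize(f, x, eps):
    """Quadrature approximation of f_eps(x) = integral of phi(z) f(x - eps z) dz."""
    return sum(w * f(x - eps * z) for w, z in zip(weights, nodes))


f = abs  # uniformly continuous, modulus of continuity omega(r) = r
grid = [k / 50.0 for k in range(-100, 101)]  # sample points in [-2, 2]


def sup_error(eps):
    """Maximum of |f_eps(x) - f(x)| over the sample grid."""
    return max(abs(regularize(f, x, eps) - f(x)) for x in grid)


errors = {eps: sup_error(eps) for eps in (0.5, 0.1, 0.02)}
```

Because the discrete weights are nonnegative and sum to 1, the same chain of inequalities as in the proof applies to the quadrature sums, so each entry of `errors` should not exceed the corresponding $\varepsilon$.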
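(2) When $f$ is in $L^p(R^n)$, $1 \le p < \infty$, we write the regularization

$$f_\varepsilon(x) = \int f(x - \varepsilon z)\varphi(z)\,dz = \int f(x - \varepsilon z)\,dM(z),$$

where $dM(z)$ is the measure $\varphi(z)\,dz$ of total mass 1; from Hölder's inequality

$$|f_\varepsilon(x)| \le \left[\int |f(x - \varepsilon z)|^p\,dM\right]^{1/p} \left[\int 1^q\,dM\right]^{1/q}, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1.$$

Thus

$$\int |f_\varepsilon(x)|^p\,dx \le \iint |f(x - \varepsilon z)|^p\,dx\,dM = \|f\|_p^p .$$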
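Here we have used the Fubini-Tonelli Theorem, since we have shown that the function $f(x - z)$ is measurable on the $2n$-dimensional space. Since there exists a continuous function $g(x)$ with compact support (and therefore uniformly continuous) for which $\|f - g\|_p < \eta$ we have

$$\|f - f_\varepsilon\|_p \le \|f - g\|_p + \|g - g_\varepsilon\|_p + \|g_\varepsilon - f_\varepsilon\|_p$$

and the last term may be written $\|(g - f)_\varepsilon\|_p$ and is no larger than $\|g - f\|_p$; so $\|f - f_\varepsilon\|_p \le 2\eta + \|g - g_\varepsilon\|_p$, and since the regularizations of $g$ converge to that function uniformly while vanishing outside a fixed compact set, they surely do in $L^p$, whence, for small $\varepsilon$, $\|f - f_\varepsilon\|_p < 3\eta$.

(3) We have

$$|f_\varepsilon(x) - f(x)| \le \int \varphi_\varepsilon(x - y)\,|f(y) - f(x)|\,dy \le \frac{M}{\varepsilon^n} \int_{|y - x| \le \varepsilon} |f(y) - f(x)|\,dy$$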
where $M$ is a bound for $\varphi(x)$. A point $x$ is said to be a Lebesgue point of an integrable function $f(x)$ if and only if the average occurring in the last term converges to 0. A standard theorem asserts that almost every point $x$ in $R^n$ is a Lebesgue point for a given integrable function $f(x)$; this is not difficult to prove when $n = 1$ and is found in many references. For higher dimensions the theorem is not so easy and is related to the general theory of the differentiation of set functions. It is obvious that a point of continuity of $f(x)$ is a Lebesgue point.

The following property of the regularization is also important and useful. If $f(x)$ is $k$ times continuously differentiable in some region $G$ of $R^n$ then for all points in a slightly smaller region and all indices $\alpha$ with $|\alpha| \le k$,

$$(D^\alpha f)_\varepsilon(x) = D^\alpha f_\varepsilon(x),$$
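that is, the derivatives of the regularizations are the regularizations of the derivatives. Of course we must require that the distance from $x$ to the boundary of $G$ is greater than $\varepsilon$; then, since the differentiation under the integral sign is legitimate

$$D^\alpha f_\varepsilon(x) = \varepsilon^{-n} \int D_x^\alpha \varphi\!\left(\frac{x-y}{\varepsilon}\right) f(y)\,dy .$$

Since

$$D_x^\alpha \varphi\!\left(\frac{x-y}{\varepsilon}\right) = (-1)^{|\alpha|} D_y^\alpha \varphi\!\left(\frac{x-y}{\varepsilon}\right)$$

the differential operator $D^\alpha$ may be taken as acting in the variable $y$, hence

$$D^\alpha f_\varepsilon(x) = (-1)^{|\alpha|} \varepsilon^{-n} \int D_y^\alpha \varphi\!\left(\frac{x-y}{\varepsilon}\right) f(y)\,dy$$

and this may now be integrated by parts. The integrated terms vanish, since the function $\varphi(\{x - y\}/\varepsilon)$ vanishes outside a sphere of radius $\varepsilon$ about $x$, and so finally

$$D^\alpha f_\varepsilon(x) = \varepsilon^{-n} \int \varphi\!\left(\frac{x-y}{\varepsilon}\right) D^\alpha f(y)\,dy = (D^\alpha f)_\varepsilon(x)$$

as required. The hypothesis of our assertion could be slightly weakened: we need only require $k - 1$ continuous derivatives for $f(x)$, the derivatives of order $k - 1$ being Lipschitzian.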
12. The Orthogonal Group

In this section we consider linear transformations $l$ of $R^n$ into itself; these are the transformations which satisfy

$$l(ax + by) = a\,l(x) + b\,l(y)$$

for all vectors $x$ and $y$ and all scalars $a$ and $b$. When a coordinate system is fixed, such transformations are usually written as matrices. The general linear group, $GL(n)$, is the class of linear transformations on $R^n$ which have inverses, and obviously forms a group. There are two remarkable subgroups, the group of homothetic transformations and the orthogonal group. The homothetic
transformations are of the form $\varepsilon I$ where $\varepsilon > 0$ and $I$ is the identity transformation; we write the transformation $l_\varepsilon$. Evidently $l_\varepsilon l_\delta = l_{\varepsilon\delta}$ and $l_\varepsilon^{-1} = l_{1/\varepsilon}$, $l_1 =$ identity, and the group is isomorphic to the multiplicative group of positive reals. The orthogonal group consists of those linear transformations which are isometries: $|l(x)| = |x|$. The surface of the unit ball is then mapped by $l$ onto itself in a one-to-one way. The group is denoted by the symbol $O(n)$. If $f(x)$ is a function on $R^n$, the composition of $f$ with an element of $GL(n)$ is defined as the function

$$(f \circ l)(x) = f(l(x)).$$

Clearly $(f \circ l') \circ l'' = f \circ (l'l'')$. If there exists a number $k$ such that $f \circ l_\varepsilon = \varepsilon^k f$ for all $\varepsilon > 0$, then $f(x)$ is said to be homogeneous of degree $k$. A transformation $l$ has an adjoint $l^*$ defined by the equation

$$(l(x), y) = (x, l^*(y))$$

for all $x$ and $y$ in $R^n$; here it should be emphasized that the inner product is a real inner product since the vectors are elements of a real vector space. When $l$ is written as a matrix, $l^*$ is given by the transposed matrix. It is not difficult to show that if $l$ has an inverse, then $l^*$ also has an inverse and $(l^*)^{-1} = (l^{-1})^*$. We write $l_*$ for the adjoint of the inverse, and this notation will be particularly useful when the Fourier transform is studied. It is easy to see that $(l(x), l_*(y)) = (x, y)$ for all vectors $x$ and $y$. When $l$ belongs to the orthogonal group, $l = l_*$.

There are many occasions when it is desirable to be able to form certain averages of functions defined on the orthogonal group, and this circumstance leads naturally to the development of a theory of integration over that group. Rather than consider the general theory of integration on locally compact groups, we confine ourselves here to the theory of integration developed by W. Maak for compact metric groups and present that theory for the particular group $O(n)$. First note that $O(n)$ occurs naturally as a compact metric space, the metric being defined in the following way. For any element $l$ in $O(n)$ define its displacement $d(l)$ by

$$d(l) = \sup_{|x| = 1} |x - l(x)|;$$

this supremum is attained for at least one point $x_0$ of the unit sphere since that sphere is compact and the function $|x - l(x)|$ is continuous. For any $x$ on the unit sphere and any pair of group elements $l_1$ and $l_2$

$$|x - l_1 l_2(x)| \le |x - l_2(x)| + |l_2(x) - l_1 l_2(x)| \le d(l_2) + d(l_1)$$
and therefore $d(l_1 l_2) \le d(l_1) + d(l_2)$. Obviously $d(l) \le 2$ for all $l$ in $O(n)$ and $d(l) = 0$ only when $l$ is the identity. It is also clear that $d(l^{-1}) = d(l)$. If $x_0$ is a point of the unit sphere where the function $|x - l_1 l_2(x)|$ attains its maximum, then, if $y_0 = l_2(x_0)$,

$$d(l_1 l_2) = |x_0 - l_1 l_2(x_0)| = |l_2(x_0) - l_2 l_1 l_2(x_0)| = |y_0 - l_2 l_1(y_0)| \le d(l_2 l_1)$$

and since the transformations $l_1$ and $l_2$ are arbitrary, $d(l_1 l_2) = d(l_2 l_1)$. For the metric on $O(n)$ we take $\rho(l_1, l_2) = d(l_1 l_2^{-1})$.
Evidently $\rho(l_1, l_2) = 0$ if and only if $l_1 = l_2$, while $\rho(l_1, l_2) = \rho(l_2, l_1)$ since $l_2 l_1^{-1} = (l_1 l_2^{-1})^{-1}$. The triangle inequality is easily verified:

$$\rho(l_1, l_3) = d(l_1 l_3^{-1}) = d(l_1 l_2^{-1} l_2 l_3^{-1}) \le d(l_1 l_2^{-1}) + d(l_2 l_3^{-1}) = \rho(l_1, l_2) + \rho(l_2, l_3).$$
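For the plane, the displacement and the metric built from it are easy to compute. The following Python sketch is only an illustration for $O(2)$: the supremum over the unit circle is approximated by a maximum over 720 sample points, and the helper names (`rotation`, `reflection`, `displacement`, `rho`, and so on) are hypothetical. For a rotation through the angle $\theta$ every unit vector is moved by the same amount $2|\sin(\theta/2)|$, which gives an independent value to compare against.

```python
import math


def rotation(theta):
    """Matrix of the rotation of the plane through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def reflection(theta):
    """Matrix of the reflection across the line at angle theta / 2."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (s, -c))


def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(a):
    """Transpose, which is the inverse for an orthogonal matrix."""
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))


def apply(a, x):
    """Image of the vector x under the matrix a."""
    return (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])


# Sample points of the unit circle; the supremum is approximated by a maximum.
CIRCLE = [
    (math.cos(2 * math.pi * k / 720), math.sin(2 * math.pi * k / 720))
    for k in range(720)
]


def displacement(l):
    """d(l) = sup over |x| = 1 of |x - l(x)|, approximated on CIRCLE."""
    return max(math.dist(x, apply(l, x)) for x in CIRCLE)


def rho(l1, l2):
    """rho(l1, l2) = d(l1 l2^{-1})."""
    return displacement(matmul(l1, transpose(l2)))


# Sample group elements (angles chosen arbitrarily).
l0, l1, l3 = reflection(1.1), rotation(0.7), rotation(-2.0)
d_rotation = displacement(l1)          # compare with 2 * sin(0.35)
triangle_gap = rho(l1, l0) + rho(l0, l3) - rho(l1, l3)
```

For reflections the sampled maximum only approximates the true displacement 2, so comparisons involving reflections are best made with a small tolerance.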
It is important to note that the metric is invariant under the group operation: if $l_1$, $l_2$, and $l_3$ are any three elements of the group, then

$$\rho(l_1 l_2, l_1 l_3) = \rho(l_2, l_3);$$

moreover,

$$\rho(l_1, l_2) = \rho(l_1^{-1}, l_2^{-1})$$

and from these equations it is clear that multiplication by a group element $l_1$ is a homeomorphism, indeed an isometry of $O(n)$ onto itself. It is also easy to show that $O(n)$ is compact under this metric: a sequence $l_k$ of group elements is a sequence of homeomorphisms of the unit ball $|x| \le 1$ in $R^n$; since the $l_k$ are isometries, they are necessarily Lipschitzian with Lipschitz constant 1, and the Ascoli-Arzelà theorem guarantees that there exists a subsequence $l_{k_j}$ converging uniformly on the unit ball. The limit is evidently a linear isometry $l_0$ of $R^n$ onto itself, and the numbers $\rho(l_{k_j}, l_0)$ converge to 0. It is therefore established that $O(n)$ is a compact metric space and also a topological group since the group operations are (jointly) continuous for this metric.

Before proceeding to Maak's theory of integration, we first follow that author in his solution to the so-called Marriage Problem. Let there be given
two finite sets $B$ and $G$ of boys and girls respectively, and a relation of acquaintance between them. We shall say that a solution to the Marriage Problem exists if it is possible to marry off the boys, each to a girl of his acquaintance. Since the marriages are to be monogamous, it is obvious that a necessary condition for such a mass marriage to be possible is that there exist no subset of $k$ boys whose total acquaintance among the girls involves fewer than $k$ girls. The result of Maak is that this condition is also sufficient for a solution to the Marriage Problem.

The proof is by induction on $N$, the cardinal of $B$. The proof for $N = 1$ or $N = 2$ is immediate, and is here omitted. The assertion being supposed true for $N$, the passage to $N + 1$ is established by the following argument. Suppose, first, there exists a proper subset $H$ of $B$ consisting of $k$ boys, whose total acquaintance consists of exactly $k$ girls forming a set $W$. By the inductive hypothesis, the boys in $H$ can be married to the girls in $W$. The remaining $N + 1 - k$ boys have sufficiently many acquaintances in the complement of $W$, for if there exists a subset $M$ of $j$ remaining boys having a total acquaintance among the unmarried girls which consists of fewer than $j$ girls, the union $H \cup M$ is a set of $k + j$ boys, with a total acquaintance of at most $k + j - 1$ girls, contradicting the hypothesis of the Maak theorem. Hence, the existence of a proper subset of $k$ boys, acquainted with exactly $k$ girls, leads to a solution of the Marriage Problem for $N + 1$. More generally, then, given $N + 1$ boys, to solve the Marriage Problem we marry off one boy to a girl of his acquaintance. The remaining $N$ boys can be married off to the remaining girls by the inductive hypothesis, since otherwise there exists a subset of $k$ of them with a total acquaintance of at most $k - 1$ unmarried girls, and in this latter case we divorce the married couple, and adjoin the divorcée to those $k - 1$ unmarried girls, obtaining a set of $k$ boys whose total acquaintance consists of exactly $k$ girls. This is the case considered in the first part of our argument, which is now complete.

It is now possible to construct a Haar (or Maak) measure on the group $O(n)$. Given $\varepsilon > 0$, let $H_i$ be a decomposition of $O(n)$ into a finite union of sets of diameter at most $\varepsilon$; such a decomposition is minimal when the total number $N = N(\varepsilon)$ of sets in the decomposition is a minimum. We consider only minimal decompositions of $O(n)$. For any such minimal decomposition, choose a point $l_i$ in each $H_i$ and put the mass $1/N$ at that point; in this way there is determined a measure $\mu$ of total mass 1 consisting of $N$ equal point masses more or less uniformly distributed through $O(n)$. Let $\varepsilon$ approach 0 through a countable set of values; there results a sequence $\mu_k$ of measures on $O(n)$ to which Helly's theorem is applicable, and hence a subsequence converging weakly to a measure $\mu_0$. It is a remarkable fact that $\mu_0$ is independent of the particular choice of the measures $\mu_k$. To show this, we first consider any two minimal decompositions $H_i'$ and $H_j''$ associated with a particular $\varepsilon > 0$; there are equally many sets in each
decomposition, namely, $N = N(\varepsilon)$. Let $\mu'$ and $\mu''$ be the two associated measures, consisting of equal point masses at points $l_i'$ in $H_i'$ and $l_j''$ in $H_j''$ respectively. If we think of the $H_i'$ as boys and the $H_j''$ as girls, the hypotheses of the Marriage Problem are satisfied if the $H_i'$ and $H_j''$ are said to be acquainted if and only if they have a nonempty intersection. For if there exists a subset of $k$ sets $H_i'$ whose union intersects at most $k - 1$ sets $H_j''$, the decomposition $H_i'$ could not have been minimal, since the substitution of the $k - 1$ sets $H_j''$ for the $k$ sets $H_i'$ results in a decomposition of $O(n)$ into $N - 1$ sets of diameter at most $\varepsilon$. It follows, therefore, that the subscripts may be assigned to the $N$ sets $H_j''$ in such a way that $H_i'$ always has a nonempty intersection with the $H_j''$ having the same subscript. Suppose, next, that $F(l)$ is any continuous function on $O(n)$; since the group is compact, the function is uniformly continuous, and we let $\omega(t)$ denote the modulus of continuity. Now

$$\left| \int F(l)\,d\mu'(l) - \int F(l)\,d\mu''(l) \right| \le \frac{1}{N} \sum_{i=1}^N |F(l_i') - F(l_i'')| \le \omega(2\varepsilon)$$

since $\rho(l_i', l_i'') \le 2\varepsilon$, because the sets $H_i'$ and $H_i''$ have diameter at most $\varepsilon$ and a nonempty intersection. It follows immediately that $\int F(l)\,d\mu_k(l)$ is a Cauchy sequence of numbers, converging, of course, to $\int F(l)\,d\mu_0(l)$. Thus $\mu_0$ is uniquely determined and it was not necessary to pass to a subsequence in the application of Helly's theorem. Moreover, if $l_0$ is a fixed group element, $F(l)$ continuous on $O(n)$ and $\mu_k$ a sequence of approximating measures of the type constructed above which converges to $\mu_0$, then

$$\int F(l_0 l)\,d\mu_0(l) = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^N F(l_0 l_i) = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^N F(l_i) = \int F(l)\,d\mu_0(l),$$

the points $l_0 l_i$ being associated with the minimal decomposition $l_0 H_i$. Thus the measure $\mu_0$ is invariant under the group operation: if $A$ is a Borel set and $l_0$ a group element, then $\mu_0[A] = \mu_0[l_0 A]$. The approximating sums $(1/N)\sum F(l_i)$ and the corresponding measures $\mu_k$ introduced above for the construction of the Haar measure are themselves
particularly useful in applications; we shall call them Maak sums and Maak measures, in analogy to Riemann sums. One application is immediate. Let $F(x)$ be a function defined and continuous on some ball $|x| \le R$ in $R^n$. The corresponding function $F(l(x)) = (F \circ l)(x)$ for fixed $x$ is a continuous function on $O(n)$ and hence is integrable. Now

$$G(x) = \int (F \circ l)(x)\,d\mu_0(l)$$

is a spherical average of $F(x)$; it satisfies the equation $G(l(x)) = G(x)$ for all $l$ in $O(n)$ and hence $G(x)$ is a function of radius $|x|$ only. However, it is possible to find a spherical average of $F(x)$ in another way: the function $F(x)$ is written in terms of the spherical coordinates $F(r, \theta)$ and the average is $H(x) = h(r) = \int F(r, \theta)\,d\omega(\theta)$. Evidently both $F(x)$ and $(F \circ l)(x)$ give rise to the same spherical average $H(x)$ in this way. So also do the functions $(1/N)\sum (F \circ l_i)(x)$ and hence $H(x) = \int G(r, \theta)\,d\omega(\theta)$ and therefore $H(x) = G(x)$. Accordingly, the Lebesgue integral

$$\int_{|x| \le R} F(x)\,dx = \int_{|x| \le R} F(l(x))\,dx,$$

which is independent of $l$, is equal to

$$\int_{|x| \le R} G(x)\,dx = \int_0^R \int F(r, \theta)\,d\omega(\theta)\, r^{n-1}\,dr\,\omega_n$$

and it follows finally that the Lebesgue measure $dx$ in $R^n$ coincides with the product measure $\omega_n\,d\omega\, r^{n-1}\,dr$ as was asserted in Section 7.
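A Maak sum in the plane gives a concrete picture of the spherical average. The sketch below is a hedged illustration, not the construction of the text: it assumes that $n$ rotations and $n$ reflections of $R^2$, equally spaced in angle, may serve as the sample points $l_i$ of an approximating measure on $O(2)$, and the names `maak_sum` and `F` are hypothetical. For the sample function $F(x) = x_1^2 + 3x_2$ the average over the circle of radius $r$ is $r^2/2$, so the sum should depend on $|x|$ only.

```python
import math


def maak_sum(F, x, n):
    """Average of F(l(x)) over n rotations and n reflections of the plane,
    equally spaced in angle (an assumed choice of sample points in O(2))."""
    r = math.hypot(x[0], x[1])
    phi = math.atan2(x[1], x[0])
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        # The rotation through theta sends the polar angle phi to phi + theta.
        total += F((r * math.cos(phi + theta), r * math.sin(phi + theta)))
        # The reflection ((cos, sin), (sin, -cos)) sends phi to theta - phi.
        total += F((r * math.cos(theta - phi), r * math.sin(theta - phi)))
    return total / (2 * n)


def F(x):
    """Sample continuous function; it is not a function of radius alone."""
    return x[0] ** 2 + 3.0 * x[1]


x = (0.3, -1.2)
y = (1.2, 0.3)             # same radius as x, different direction
G_x = maak_sum(F, x, 8)
G_y = maak_sum(F, y, 8)
```

For this particular polynomial the equally spaced sums already reproduce the circular average up to rounding; for a general continuous $F$ one would expect only approximate agreement that improves as $n$ grows.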
13. Second-Order Differential Operators

In this section we prove a general existence and uniqueness theorem for solutions of a second-order linear differential equation. The equations are considered relative to a closed, finite interval $[a, b]$ and are of the form

$$-\frac{d}{dx}\left(P(x)\frac{dy}{dx}\right) + R(x)y(x) = h(x).$$
We shall require that $h(x)$ and $R(x)$ be integrable over the interval, and that the reciprocal of $P(x)$ be also integrable over that interval. A solution $y(x)$ must be such that the terms of the equation make sense, at least almost everywhere; hence we require $y(x)$ to be absolutely continuous. It will then be differentiable almost everywhere and the product $P(x)\,dy/dx$ will then make sense almost everywhere. We shall say that this product is absolutely continuous when we mean that there exists an absolutely continuous function on $[a, b]$ which coincides almost everywhere with $P(x)\,dy/dx$; we will then be able to speak of the value of $P(x)\,dy/dx$ at an arbitrary point $c$ of the interval.
Theorem: Let $c$ be a point of $[a, b]$ and $A$, $B$ two constants; suppose $R(x)$ and $h(x)$ are integrable over $[a, b]$ and $1/P(x)$ also integrable; then there exists a unique function $f(x)$ with the following properties: (1) $f$ is absolutely continuous on $[a, b]$, (2) $P(x)f'(x)$ is absolutely continuous on $[a, b]$, (3) $-(Pf')' + Rf = h$ almost everywhere on $[a, b]$, (4) $f(c) = A$, $(Pf')(c) = B$.
PROOF: We first consider the case when $R(x)$ vanishes identically; if there exists such a function $f(x)$ it is uniquely determined by integration and the requirements (4); we shall have

$$f(x) = A + Bv(x) + \int_c^x (v(s) - v(x))h(s)\,ds, \quad \text{where } v(x) = \int_c^x \frac{dt}{P(t)}.$$

On the other hand, it is easy to check that the function $f(x)$ given by the expression above satisfies requirements (1)-(4). We pass next to the case when $R(x)$ does not vanish identically; a function $f(x)$ satisfying the conditions of the theorem must then be a solution to $-(Pf')' = h - Rf$, and hence

$$f(x) = A + Bv(x) + \int_c^x (v(s) - v(x))(h(s) - R(s)f(s))\,ds$$

while, conversely, any continuous solution $f(x)$ of this integral equation satisfies the conditions of the theorem. We display the existence of a continuous solution of the integral equation by an iterative procedure. Set $f_0(x) = 0$ and for $n \ge 0$,

$$f_{n+1}(x) = A + Bv(x) + \int_c^x (v(s) - v(x))(h(s) - R(s)f_n(s))\,ds .$$

All the functions $f_n(x)$ are continuous on $[a, b]$. Let $g_0(x) = 0$ and $g_n(x) = f_n(x) - f_{n-1}(x)$; the series $\sum g_k(x)$ has the partial sums $f_n(x)$ and we shall show that the series converges uniformly, hence to a continuous limit $f(x)$. For $n \ge 2$ the function $g_n(x)$ satisfies the integral equation

$$g_n(x) = -\int_c^x (v(s) - v(x))R(s)g_{n-1}(s)\,ds .$$
Let $M = \sup |v(x)|$, $a \le x \le b$, and select a positive $\delta$ such that $\int_I |R(s)|\,ds < 1/4M$ whenever the length of the closed interval $I$ is $\le \delta$; this can be done, since the function $R(x)$ is integrable over $[a, b]$. We select next an interval $I$ of length $\delta$ containing the point $c$ and prove our theorem relative to $I$, rather than $[a, b]$. If $\|g\| = \sup |g(x)|$, $x \in I$, then we have, for any $x$ in $I$,

$$|g_n(x)| \le 2M\,\|g_{n-1}\| \int_I |R(s)|\,ds \le \tfrac{1}{2}\|g_{n-1}\|$$

whence $\|g_n\| \le \tfrac{1}{2}\|g_{n-1}\|$ and the series $\sum g_k(x)$ converges uniformly on $I$. The continuous function $f(x)$ which is the sum of this series is evidently a solution to the integral equation, and therefore of the differential equation for the interval $I$ containing $c$, and the conditions (4) are satisfied by $f(x)$ at $c$. We also get uniqueness for the solution: if there were two solutions, their difference $g(x)$ would satisfy the integral equation

$$g(x) = -\int_c^x (v(s) - v(x))R(s)g(s)\,ds$$
and as above, we would find $\|g\| \le \tfrac{1}{2}\|g\|$, whence $\|g\| = 0$. The number $\delta$ determined from the absolute continuity of the indefinite integral of $|R(x)|$ is independent of the position of the interval $I$ in $[a, b]$; we therefore subdivide the latter interval into a finite union of intervals of length $\delta/4$ and choose in each a point $c_i$; an obvious continuation process then completes the construction of $f(x)$.

In the special case when the equation is a homogeneous one, namely, $h(x) = 0$ for all $x$, for any two solutions $u(x)$, $v(x)$ of the equation, we introduce the determinant

$$W(x) = \begin{vmatrix} u(x) & v(x) \\ P(x)u'(x) & P(x)v'(x) \end{vmatrix} = P(x)\,[u(x)v'(x) - v(x)u'(x)].$$

This function is evidently absolutely continuous on $[a, b]$, and its derivative vanishes almost everywhere, hence $W(x)$ is a constant, which vanishes if and only if the two solutions $u$ and $v$ are linearly dependent. Two such solutions which are linearly independent give rise to the convenient representation of the solution to the general problem which follows:

$$f(x) = Cu(x) + Dv(x) - \int_c^x [v(x)u(s) - u(x)v(s)]\,\frac{h(s)}{W}\,ds .$$

One verifies that this function satisfies (1), (2), and (3); for a convenient choice of the coefficients $C$ and $D$, (4) will be satisfied; by virtue of the uniqueness, this must then be the solution.
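The iterative procedure used in the existence proof can be carried out exactly in a simple case. The sketch below assumes the hypothetical data $P = 1$, $R = 1$, $h = 0$, $c = 0$, $A = 1$, $B = 0$, so that the auxiliary function is $v(x) = x$ and the iteration becomes $f_{n+1}(x) = 1 + \int_0^x (x - s) f_n(s)\,ds$. Polynomials are stored as lists of exact rational coefficients; the name `picard_step` is ours.

```python
from fractions import Fraction
from math import factorial


def picard_step(coeffs):
    """One step f_n -> f_{n+1} for P = 1, R = 1, h = 0, c = 0, A = 1, B = 0.

    coeffs[k] is the coefficient of x^k in f_n.  Since the integral of
    (x - s) s^k over [0, x] equals x^(k+2) / ((k+1)(k+2)), the coefficient
    of x^(k+2) in f_{n+1} is coeffs[k] / ((k+1)(k+2)).
    """
    new = [Fraction(1), Fraction(0)]      # constant term A = 1, no linear term
    for k, a in enumerate(coeffs):
        new.append(a / ((k + 1) * (k + 2)))
    return new


f = [Fraction(0)]                         # f_0 = 0
iterates = [f]
for _ in range(4):
    f = picard_step(f)
    iterates.append(f)

# Taylor coefficients of cosh x up to x^6, for comparison with f_4.
cosh_coeffs = [
    Fraction(1, factorial(k)) if k % 2 == 0 else Fraction(0) for k in range(7)
]
matches_cosh = iterates[4][:7] == cosh_coeffs
```

With these data the differential equation is $-f'' + f = 0$ with $f(0) = 1$, $f'(0) = 0$, whose solution is $\cosh x$; the iterates are therefore expected to build up its Taylor polynomial two degrees at a time.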
14. Convex Sets

Let $K'$ be a closed, nonempty convex set in Hilbert space which does not contain the origin and let $d$ be the distance from the origin to $K'$. We show that there exists a unique element $g'$ in $K'$ nearest the origin, that is, such that $\|g'\| = d$. For if $g_k$ is a sequence in $K'$ for which $\lim \|g_k\| = d$, then for large enough $k$, $\|g_k\| < d + \varepsilon$ and where $l > k$, in the two dimensional subspace determined by the elements $g_k$ and $g_l$ the line segment having those endpoints is contained in the shell between the spheres about the origin of radius $d$ and $d + \varepsilon$, respectively. Thus, from Fig. 4, by elementary geometry, $\|g_k - g_l\|^2 \le 8d\varepsilon + 4\varepsilon^2$ and thus the sequence is Cauchy. Since the Hilbert space is complete, the sequence converges to an element $g'$ for which $\|g'\| = d$, and $g'$ is in $K'$ since $K'$ is closed. The element is unique, since if there were two, the line segment between them would lie on the surface of the sphere of radius $d$, a geometric impossibility.
Let $h'$ be an arbitrary element of $K'$, $g'$ the element determined above which is closest to the origin; for $t$ in the unit interval we form $tg' + (1 - t)h'$ which is in $K'$. Since

$$\|g'\|^2 = d^2 \le \|tg' + (1 - t)h'\|^2,$$

we have $(1 - t^2)\|g'\|^2 \le (1 - t)^2\|h'\|^2 + 2t(1 - t)\,\mathrm{Re}[(g', h')]$ and canceling a factor $(1 - t)$ and letting $t$ approach 1

$$\|g'\|^2 = d^2 \le \mathrm{Re}[(g', h')].$$

This is simply the assertion that the vectors from the origin to $g'$ and $h'$ make an angle at most 90° and is illustrated by Fig. 5.
Fig. 5. $\|g'\|^2 \le \mathrm{Re}[(g', h')]$.
More generally, we may consider an arbitrary closed convex set $K$ in Hilbert space and a point $f$ not in $K$; there is a unique element $g$ in $K$ which is nearest to $f$; for any $h$ in $K$ we will have

$$\|f - g\|^2 = d^2 \le \mathrm{Re}[(g - f, h - f)].$$

Suppose the Hilbert space is over real scalars; therefore, the inner product is real; let $u = f - g$ and let $L$ be the real linear functional $L(w) = (u, w)$. Now for all $h$ in $K$, $L(f) - L(g) = \|u\|^2 = d^2 \le L(f) - L(h)$; therefore, $L(h) \le L(g) < L(f)$.
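These inequalities are easy to test in the plane. The following Python sketch is an illustration under assumptions of our own: the closed convex set $K$ is taken to be a box, for which the nearest point is obtained by clamping each coordinate, and the names `project_onto_box`, `dot`, and `L` are hypothetical. It samples points $h$ of $K$ at random so that $L(h) \le L(g) < L(f)$ and $d^2 \le (g - f, h - f)$ can be checked.

```python
import random


def project_onto_box(f, lo, hi):
    """Nearest point to f in the box [lo_1, hi_1] x ... x [lo_n, hi_n]."""
    return tuple(min(max(fk, a), b) for fk, a, b in zip(f, lo, hi))


def dot(u, w):
    """Real inner product (u, w)."""
    return sum(a * b for a, b in zip(u, w))


lo, hi = (1.0, -1.0), (2.0, 3.0)   # the closed convex set K (a sample box)
f = (0.0, 4.5)                     # a point not in K
g = project_onto_box(f, lo, hi)    # the element of K nearest to f
u = tuple(a - b for a, b in zip(f, g))   # u = f - g
d2 = dot(u, u)                           # d^2 = ||f - g||^2


def L(w):
    """The linear functional L(w) = (u, w)."""
    return dot(u, w)


random.seed(0)
samples = [tuple(random.uniform(a, b) for a, b in zip(lo, hi)) for _ in range(1000)]

# Largest value of L on the sampled points of K, to be compared with L(g).
max_L_on_samples = max(L(h) for h in samples)
# Smallest value of (g - f, h - f) on the samples, to be compared with d^2.
min_inner = min(
    dot(tuple(a - b for a, b in zip(g, f)), tuple(a - b for a, b in zip(h, f)))
    for h in samples
)
```

The box is only a convenient choice; any closed convex set with a computable nearest-point map could be substituted.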
The linear functional $L$ takes its maximum on $K$ at the point $g$, and this maximum is smaller than $L(f)$. From this we immediately obtain the fact that a closed convex set in Hilbert space is the intersection of the closed half-spaces containing it: a half-space being a set defined by an inequality of the form $L(h) \le C$ for some continuous linear functional $L$.

Let $Z_n$ be the lattice of points $z$ in $R^n$ having integer coordinates and $K$ a convex open set in $R^n$ which is symmetric about the origin, that is, $K = -K$.
Minkowski's Lemma: If the Lebesgue measure of $K$ is greater than $2^n$, then that set contains a lattice point of $Z_n$ different from the origin.

PROOF: There is no loss of generality in supposing the volume of $K$ finite. For some large integer $m$ consider the hypercubes determined by systems of inequalities of the form

$$\frac{n_i}{m} \le x_i < \frac{n_i + 1}{m}, \quad i = 1, 2, \ldots, n,$$
where the $n_i$ are integers; such hypercubes have volume $m^{-n}$. Let $N_m$ denote the number of such hypercubes wholly contained in $K$. From the theory of the Riemann integral, it follows that the volume of $K$ is approximated by $N_m m^{-n}$ and so for sufficiently large $m$, $N_m > (2m)^n$. Since the hypercubes are disjoint, and any hypercube contains at least one point $x$ of the form $x = (1/m)z$ where $z$ is in $Z_n$, it follows that there are more than $(2m)^n$ points of $Z_n$ in the set $mK$. Next consider the equivalence relation defined on $Z_n$ by congruence modulo $2m$; two lattice points $z' = (z_1', z_2', \ldots, z_n')$ and $z'' = (z_1'', z_2'', \ldots, z_n'')$ being equivalent if and only if $z_i' - z_i''$ is divisible by $2m$ for all $i$. There are exactly $(2m)^n$ different equivalence classes, and therefore $mK$ contains a pair of distinct equivalent lattice points $z'$ and $z''$. Accordingly, $z' - z'' = 2mz^*$ where $z^*$ is in $Z_n$; hence $mz^* = (z' - z'')/2 = (z' + (-z''))/2$ is a point of $mK$, that set being convex and symmetric about the origin. Thus $z^*$ is in $K$ and it is not the origin since $z'$ and $z''$ were distinct.

Let $l$ be a nonsingular linear transformation of $R^n$ into itself and $B$ the parallelepiped defined by the inequalities $|x_i| < b_i$, $i = 1, 2, \ldots, n$. The volume of $B$ is $2^n \prod_{k=1}^n b_k$ and the set $K = l(B)$ is convex, symmetric about the origin, and has volume $|\det l|\,2^n \prod_{k=1}^n b_k$. Thus $K$ contains a point of $Z_n$ different from the origin if $|\det l| \prod_{k=1}^n b_k > 1$. We obtain a well-known theorem.
Theorem (Minkowski): The system of $n$ inequalities

$$\Bigl|\sum_j a_{ij} z_j\Bigr| < b_i$$

has a nontrivial solution in $Z_n$ if $0 < |\det a_{ij}| < \prod_{i=1}^n b_i$.
PROOF: The matrix $a_{ij}$ determines a nonsingular linear transformation on $R^n$; if the inverse of this transformation is $l$ then $|\det l| \prod_{k=1}^n b_k > 1$ and there exists a lattice point $z$ not the origin in $l(B)$. Accordingly $l^{-1}(z)$ is in $B$.
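The theorem guarantees a solution but does not exhibit one; for small systems a brute-force search does. The Python sketch below is an illustration with hypothetical data: the two linear forms $z_1 - \sqrt{2}\,z_2$ and $z_2$ have determinant 1, and the bounds $b = (0.11, 10)$ have product 1.1, so the hypothesis $0 < |\det a_{ij}| < \prod b_i$ is satisfied. The function name `lattice_solutions` and the search bound are our own choices.

```python
import math
from itertools import product


def lattice_solutions(a, b, bound):
    """Nonzero integer vectors z with |z_j| <= bound for all j and
    |sum_j a[i][j] * z[j]| < b[i] for every row i."""
    n = len(a)
    found = []
    for z in product(range(-bound, bound + 1), repeat=n):
        if any(z) and all(
            abs(sum(a[i][j] * z[j] for j in range(n))) < b[i] for i in range(n)
        ):
            found.append(z)
    return found


# Sample system: |z_1 - sqrt(2) z_2| < 0.11 and |z_2| < 10.
a = [[1.0, -math.sqrt(2.0)],
     [0.0, 1.0]]
b = [0.11, 10.0]           # |det a| = 1 < b_1 * b_2 = 1.1
solutions = lattice_solutions(a, b, bound=15)
```

Each solution $(z_1, z_2)$ found this way gives a rational approximation $z_1/z_2$ to $\sqrt{2}$ with denominator below 10; the search box `bound=15` is large enough here because $|z_2| < 10$ forces $|z_1| < 15$.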
15. Convex Functions of Several Variables

A function $f(x)$ defined in some open convex set $G$ in $R^n$ is called convex if, for any points $x$ and $y$ in $G$ and all $t$ in the interval $[0, 1]$, the inequality

$$f(tx + (1 - t)y) \le tf(x) + (1 - t)f(y)$$

is valid. Thus the function is convex if and only if it appears as a convex function of one variable on any line segment in $G$. Let $x_1, x_2, \ldots, x_N$ be $N$ points of the set $G$ and $p_i$ a set of $N$ positive numbers such that $\sum p_i = 1$; then $f(\sum p_i x_i) \le \sum p_i f(x_i)$. The proof is by induction; the assertion being established for $N$ we have

$$f\Bigl(\sum_{i=1}^{N+1} p_i x_i\Bigr) \le M f\Bigl(\sum_{i=1}^{N} \frac{p_i}{M} x_i\Bigr) + (1 - M) f(x_{N+1}) \le \sum_{i=1}^{N+1} p_i f(x_i)$$
where $M = \sum_{i=1}^N p_i$ and $1 - M = p_{N+1}$. It follows from this inequality that if $x_0$ is a point of $G$, then the convex function $f(x)$ is bounded from above in a neighborhood of $x_0$, since every point $x$ of a cube in $G$ containing $x_0$ in its interior is a convex combination of the vertices $x_i$ of that cube, and then we have

$$f(x) \le \sum p_i f(x_i) \le \max_i f(x_i).$$
This circumstance enables us to repeat an earlier argument to show that the convex function $f(x)$ is actually continuous: there is no loss of generality in assuming that if there is a point of $G$ where $f$ is not continuous it is the origin and that $f(0) = 0$, and indeed, that there exists a sequence $x_k$ in $G$ converging to the origin for which $\lim f(x_k) = p > 0$. Then the sequence $2x_k$ also converges to the origin and $\liminf f(2x_k) \ge 2p$, while for the sequence $4x_k$ we have $\liminf f(4x_k) \ge 4p$; inductively, then, $\liminf f(2^q x_k) \ge 2^q p$ for all $q$ and $f(x)$ is not bounded from above in a neighborhood of the origin, a contradiction.
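The inequality $f(\sum p_i x_i) \le \sum p_i f(x_i)$ on which this argument rests can be spot-checked numerically. The sketch below is purely illustrative: the convex function (a maximum of linear functions plus $|x|^2$), the sampling ranges, and the names `f`, `convex_combination`, and `gaps` are all assumptions of ours.

```python
import random


def f(x):
    """A sample convex function on R^2: a maximum of linear functions plus |x|^2."""
    return max(x[0] - x[1], 2.0 * x[1], -x[0]) + x[0] ** 2 + x[1] ** 2


def convex_combination(points, weights):
    """The point sum_i p_i x_i for planar points x_i and weights p_i."""
    return tuple(sum(p * x[k] for p, x in zip(weights, points)) for k in range(2))


random.seed(1)
gaps = []
for _ in range(500):
    points = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(5)]
    raw = [random.uniform(0.1, 1.0) for _ in range(5)]
    weights = [r / sum(raw) for r in raw]           # positive, summing to 1
    lhs = f(convex_combination(points, weights))
    rhs = sum(p * f(x) for p, x in zip(weights, points))
    gaps.append(rhs - lhs)                           # should be nonnegative
```

A random check of this kind proves nothing, but a negative entry in `gaps` (beyond rounding error) would signal that the chosen function is not convex.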
Theorem: If $f(x)$ is a $C^2$-function defined in a convex, open set $G$ in $R^n$, then $f(x)$ is convex if and only if the $n \times n$ matrix computed at any $x$ in $G$,

$$\left[\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right],$$

is positive definite (or semidefinite). The matrix is called the Hessian.

PROOF: Let $x$ belong to $G$ and $y$ be arbitrary: when $f(x)$ is convex the function $F(t) = f(x + ty)$ is a convex function of $t$ in some neighborhood of the origin; since $f(x)$ is $C^2$ then so also is $F(t)$, and $F''(0) \ge 0$. But

$$F''(0) = \sum_{i,j} \frac{\partial^2 f}{\partial x_i\,\partial x_j}(x)\,y_i y_j$$

and because $y$ is arbitrary, the matrix above is positive. Conversely, if the matrix is positive, functions of the type $F(t)$ are convex in $t$, whence $f$ is convex.

Let $f(x)$ be convex, defined in the open set $G$. We consider the function on any line parallel to the $x_1$-axis; on such a line $f(x)$ appears as a usual convex function of $x_1$ and the derivative with respect to that variable exists (except perhaps for a countable set of points) and is monotone increasing. Since the partial derivative $\partial f/\partial x_1$ is a limit of difference quotients of the continuous $f(x)$, the set upon which that derivative fails to exist is measurable in $R^n$ and intersects any line parallel to the $x_1$-axis in a set of linear measure 0. Thus, by the Fubini theorem, the partial derivative of $f$ exists almost everywhere. Since the same argument may be carried out for the other $n - 1$ variables, we infer that the function has a gradient almost everywhere, and that the coordinates of the gradient increase monotonically with the coordinates of the point.

We should remark that if $F(t)$ is an increasing convex function on an interval of the real axis, then for any convex function $k(x)$ on $R^n$, the composed function $K(x) = F(k(x))$ is convex where it is defined:

$$K(tx + (1 - t)y) \le F(tk(x) + (1 - t)k(y)) \le tK(x) + (1 - t)K(y).$$
This is particularly important when $k(x) = |x|$. It is easy to see that the sum of two or more convex functions is again convex and that the supremum of an arbitrary family of convex functions is convex if it is finite. As in the case of one variable, we say that a function $k(x)$
is logarithmically convex if it is positive and has a convex logarithm. Such functions are necessarily convex.

For some problems, there is an advantage in considering certain functions more general than those introduced above as convex. The wider class consists of functions defined on the whole $R^n$ and which are permitted to assume the value $+\infty$ although not $-\infty$; the convexity inequality should hold everywhere. The set upon which such a function is finite is obviously convex, and if that set has an interior, the convex function is continuous in that interior. If that set has no interior, the function is infinite except on a subset of a linear variety in $R^n$ of lower dimension. It is convenient to require also that the convex function be lower semicontinuous; this is no severe restriction, since it applies only to the behavior of the function at certain boundary points of the set on which it is finite. Let $K(x)$ be such a convex function defined on $R^n$. We introduce its conjugate function
$$K^*(\xi) = \sup_x \,[(x, \xi) - K(x)],$$
the supremum being taken over all $x$ in $R^n$. As a supremum of a family of affine (hence convex) functions, $K^*(\xi)$ is a convex, lower semicontinuous function. Obviously, we always have the fundamental inequality
$$(x, \xi) \le K(x) + K^*(\xi)$$
and therefore
$$K^{**}(x) = \sup_\xi \,[(x, \xi) - K^*(\xi)] \le K(x).$$
We agree to leave out of consideration the function $K(x)$ which is identically $+\infty$, since its conjugate would be identically $-\infty$.
It is easy to compute the conjugates of some simple functions. If $K(x) = C + (x, A)$ for some constant $C$ and some point $A$ in $R^n$, then obviously $C = K(0)$ and the conjugate function is
$$K^*(\xi) = -C + \sup_x \,(x, \xi - A).$$
Hence, $K^*(\xi) = -C = -K(0)$ at $\xi = A$ and is $+\infty$ elsewhere. It is even easier to compute the conjugate of this function: $K^{**}(x) = (x, A) - K^*(A)$, since $\xi = A$ is the only point where $(x, \xi) - K^*(\xi)$ is not $-\infty$. Hence $K(x) = K^{**}(x)$. If
$$K(x) = \frac{|x|^2}{2}$$
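then
$$K^*(\xi) = \sup_x \Bigl[(x, \xi) - \frac{|x|^2}{2}\Bigr] = \sup_x \Bigl[\frac{|\xi|^2}{2} - \frac{|\xi - x|^2}{2}\Bigr] = \frac{|\xi|^2}{2}.$$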
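The computation above can be checked numerically in one variable. The following is a minimal sketch, not part of the original argument: it approximates the supremum defining $K^*$ by a maximum over a finite grid. The grid bounds, the step, and the helper name `conjugate` are assumptions made for illustration only.

```python
# Numerical sketch of K*(xi) = sup_x [x*xi - K(x)] in one variable.
# Assumption: the supremum is replaced by a maximum over a finite grid,
# which is adequate here because the maximizer x = xi lies inside the grid.

def conjugate(K, xi, xs):
    """Approximate the conjugate of K at xi by maximizing over the grid xs."""
    return max(x * xi - K(x) for x in xs)

def K(x):
    """The self-conjugate function K(x) = x^2 / 2."""
    return 0.5 * x * x

# Grid on [-10, 10] with step 0.01 (illustrative choice).
xs = [i / 100.0 for i in range(-1000, 1001)]

# Compare the grid approximation with the closed form xi^2 / 2.
xis = [-3.0, -1.5, 0.0, 0.5, 2.0]
errors = [abs(conjugate(K, xi, xs) - 0.5 * xi * xi) for xi in xis]
max_error = max(errors)
print(max_error)
```

The same helper can be applied to other convex functions, provided the grid is wide enough to contain the point where the supremum is attained.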
If $C > 0$ and $K(x)$ is convex, then $CK(x)$ is convex and $[CK(x)]^*(\xi) = CK^*(\xi/C)$. Let $l$ be a nonsingular linear transformation of $R^n$ into itself; by $l_*$ we denote the adjoint of its inverse, namely, the transformation defined by the identity $(l(x), l_*(\xi)) = (x, \xi)$. If $K(x)$ is convex, then the composed function $(K \circ l)(x) = K(l(x))$ is convex and
$$(K \circ l)^*(\xi) = \sup_x \,[(x, \xi) - K(l(x))] = \sup_x \,[(l(x), l_*(\xi)) - K(l(x))] = (K^* \circ l_*)(\xi).$$
Finally, we note that
$$[K(x) + (x, \eta)]^*(\xi) = K^*(\xi - \eta)$$
and
$$[K(x - y)]^*(\xi) = K^*(\xi) + (\xi, y).$$
It is also easy to verify that if $K_1(x) \le K_2(x)$ for all $x$, then $K_1^*(\xi) \ge K_2^*(\xi)$ for all $\xi$. This fact makes it easy to show that the only function which is its own conjugate is the function $|x|^2/2$; for if $K(x) = K^*(x)$, we must have
$$(x, x) = |x|^2 \le K(x) + K^*(x) = 2K(x)$$
and therefore $|x|^2/2 \le K(x)$; taking conjugates we get the reverse inequality.
An important special case arises when the convex function $K_C(x)$ vanishes on some closed convex set $C$ and is infinite elsewhere: the conjugate
$$K_C^*(\xi) = \sup_{x \in C} \,(x, \xi)$$
is then called the support function of $C$ and we write it $H_C(\xi)$. This function is positively homogeneous: for $t \ge 0$ we have $H_C(t\xi) = tH_C(\xi)$. Moreover, if $A$ and $B$ are two compact convex sets, the set $A + B$ defined as the collection of
vectors of the form $x + y$ with $x$ in $A$ and $y$ in $B$ is also compact and convex. We will have $H_{A+B}(\xi) = H_A(\xi) + H_B(\xi)$. For any positive $\lambda$, the set $\lambda A$ consists of all vectors $\lambda x$ with $x$ in $A$; this set is closed and convex and $H_{\lambda A}(\xi) = \lambda H_A(\xi)$. If $H_A(\xi)$ is given, we can recover $A$ by taking the set of all $x$ such that $(x, \xi) \le H_A(\xi)$ for all $\xi$; indeed, that inequality is satisfied if $x$ belongs to $A$, while if $x_0$ is a point outside the closed and convex set $A$, there exists a linear functional $L(x) = (x, \xi_0)$ so that $L(x_0) = (x_0, \xi_0) > \sup_{x \in A}(x, \xi_0) = H_A(\xi_0)$. Accordingly,
$$H_A^*(x) = \sup_\xi \,[(x, \xi) - H_A(\xi)]$$
vanishes for $x$ in $A$ and is positive for $x$ not in $A$. We may write
$$H_A^*(x) = \sup_\xi \Bigl[\sup_{t > 0} \, t\,\{(x, \xi) - H_A(\xi)\}\Bigr]$$
and from the homogeneity of $H_A(\xi)$ it follows that the supremum in brackets is infinite if $(x, \xi) - H_A(\xi)$ is positive, and otherwise it vanishes. This means that $H_A^*(x)$ vanishes on $A$ and is infinite elsewhere; accordingly, $K_A^{**}(x) = H_A^*(x) = K_A(x)$. The improper convex function $K(x)$ that is identically infinite has the conjugate $K^*(x)$ identically $-\infty$; this is the support function of the empty set.
It remains to show that the term "conjugate" is not a misnomer, that is, that $K^{**}(x) = K(x)$. Evidently the hypothesis that $K(x)$ be lower semicontinuous is necessary here. We have already shown this for convex functions $K(x)$ that vanish on a closed convex set and are infinite elsewhere. We pass to the graph of $K$: this is the set of points $(x, z)$ in $R^{n+1}$ with $x$ in $R^n$, $z$ in $R^1$ such that $z = K(x)$. The set $C_K$ determined by the inequality $z \ge K(x)$ is closed and convex in $R^{n+1}$ and the points of the graph are boundary points of $C_K$. Note that if $x_0$ is a point in $R^n$ where $K$ is infinite, there is no point in $C_K$ above it. $C_K$ is closed because $K(x)$ is lower semicontinuous. We consider first a point $x_0$ where $z_0 = K(x_0)$ is finite. For positive $\varepsilon$ the point $(x_0, z_0 - \varepsilon)$ is not in $C_K$, and there exists a linear functional $L$ defined on $R^{n+1}$ such that $L(x_0, z_0 - \varepsilon) < \inf L(x, z)$, the infimum being taken over $C_K$. The functional $L$ is necessarily of the form $L(x, z) = \xi x + \lambda z$ where $\xi$ is in $R^n$ and $\lambda$ in $R^1$. Thus
$$\xi x_0 + \lambda z_0 - \lambda\varepsilon < \xi x + \lambda z \qquad \text{for all } (x, z) \text{ in } C_K.$$
It then follows that $\lambda \ge 0$, and indeed, that number must be strictly positive, since the inequality would reduce to equality if $\lambda = 0$ and $x = x_0$. We divide by $\lambda$ to obtain
$$\frac{\xi}{\lambda}\,x_0 + K(x_0) - \varepsilon < \frac{\xi}{\lambda}\,x + K(x),$$
an inequality which holds for all $x$, since we have derived it when $K(x)$ was finite (that is, $(x, K(x))$ in $C_K$) and the inequality is obvious when $K(x) = +\infty$. If we set $\eta = -\xi/\lambda$ this becomes
$$\eta(x - x_0) - \varepsilon < K(x) - K(x_0)$$
and letting $\varepsilon$ approach 0, we have
$$\eta(x - x_0) \le K(x) - K(x_0).$$
The vector $\eta$ is often called a subgradient of the function $K(x)$ at $x_0$; if $K(x)$ has a gradient at $x_0$, then $\eta$ must coincide with that gradient. We now have $\eta x - K(x) \le \eta x_0 - K(x_0)$, and therefore $K^*(\eta) = \eta x_0 - K(x_0)$, whence $K^{**}(x_0) \ge \eta x_0 - K^*(\eta) = K(x_0)$. Thus $K^{**}(x_0) = K(x_0)$, since we always have the inequality $K^{**}(x) \le K(x)$. It follows that $K^{**}(x) = K(x)$ on the set where $K(x)$ is finite.
It remains to show the equality of $K(x)$ and $K^{**}(x)$ at points where $K(x)$ is infinite. We suppose first that $K(x) \ge 0$ on $R^n$ and that $C$ is the closure of the set where $K(x)$ is finite. Let $K_1(x)$ vanish on $C$ and be infinite elsewhere. Evidently $K_1(x) \le K(x)$, whence $K_1^{**}(x) \le K^{**}(x)$. Since we have already shown that $K_1^{**}(x) = K_1(x)$, it follows that $K^{**}(x)$ is infinite outside $C$. Accordingly, the inequality $K^{**}(x) < K(x)$ can only occur at a point $x$ which is a boundary point of $C$ where $K(x)$ is infinite. Such a point is the endpoint of a line segment upon the interior of which the functions $K$ and $K^{**}$ coincide and are finite. Since both are lower semicontinuous, those functions are equal at the point $x$ as well.
Finally, if the function $K(x)$ is not always positive, we select a point $x_0$ at which it has a subgradient, that is to say, any point where $K(x)$ is finite. There then exists a linear functional $(x, \eta)$ and a constant $c$ such that
$$K_2(x) = K(x) + c + (x, \eta)$$
is positive for all $x$, hence equal to its second conjugate. The simple rules which we have given for computations with conjugates show that
$$K_2^{**}(x) = K^{**}(x) + c + (x, \eta)$$
and therefore in general, $K(x) = K^{**}(x)$, which completes the proof.
The traditional form of the conjugate function occurs in Young's inequality: we consider a convex function on $R^1$ which is even, positive, and which vanishes at the origin. The function is then the indefinite integral of a monotone increasing function $m(t)$. For positive $\xi$, the conjugate is obtained by integrating the function inverse to $m$. This is generally shown by giving
a geometric argument illustrated by Fig. 6 to establish Young's inequality
$$x\xi \le K(x) + K^*(\xi)$$
and noting that for any $x > 0$ there exists $\xi > 0$ for which equality holds. The most important special case comes up when $K(x) = x^p/p$ where $p > 1$; we then have $K^*(\xi) = \xi^q/q$ where $p^{-1} + q^{-1} = 1$. Hölder's inequality then is deduced from Young's inequality.
Fig. 6. Young's inequality: $(x, \xi) \le K(x) + K^*(\xi)$.
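The special case $K(x) = x^p/p$ lends itself to a quick numerical check. The sketch below is illustrative only: the exponent $p = 3$, the sample points, and the helper name `young_gap` are assumptions chosen for the example. It evaluates the difference $K(x) + K^*(\xi) - x\xi$ at a few positive points and at the point $\xi = x^{p-1}$, where equality is expected.

```python
# Sketch: Young's inequality x*xi <= x^p/p + xi^q/q for x, xi > 0,
# with 1/p + 1/q = 1.  The sample values are arbitrary illustrations.

def young_gap(x, xi, p):
    """Return K(x) + K*(xi) - x*xi for K(x) = x^p / p (nonnegative by Young)."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    return x ** p / p + xi ** q / q - x * xi

p = 3.0
points = [(0.5, 1.0), (1.0, 2.0), (2.0, 0.1), (1.5, 1.5)]
gaps = [young_gap(x, xi, p) for x, xi in points]

# Equality holds when xi = x^(p-1), i.e. xi = m(x) with m(t) = t^(p-1).
equality_gap = young_gap(2.0, 2.0 ** (p - 1.0), p)
print(gaps, equality_gap)
```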
We have already remarked that the conjugate of a homogeneous convex function on $R^n$ necessarily assumes only the values 0 and infinity; accordingly, every homogeneous convex function is the support function of some closed convex set.
We conclude this section with a brief account of the Hahn-Banach theorem.
Theorem: Let $H(x)$ be a convex function defined on a linear space $E$ and which is positively homogeneous; namely, $H(tx) = tH(x)$ for all $t > 0$. Let $x_0$ be fixed in $E$. Then there exists a linear functional $F(x)$ defined on $E$ such that $F(x_0) = H(x_0)$ and $F(x) \le H(x)$ for all $x$.
PROOF: We first consider the case when the dimension of $E$ is finite. If $E$ is one-dimensional, $H(x)$ is necessarily of the form
$$H(x) = \begin{cases} Ax & \text{for } x > 0, \\ Bx & \text{for } x < 0 \end{cases}$$
with $A - B \ge 0$ (convexity). If $x_0 \ge 0$, we select $F(x) = Ax$, while if $x_0 < 0$ we take $F(x) = Bx$. Thus the theorem is established for dimension $E = n = 1$. We argue by induction to obtain the result for all finite dimensional $E$; the theorem being true for $n$, we pass to $n + 1$ by choosing a subspace $M$ of $E$ of dimension $n$ which contains $x_0$, and a linear functional $f(x)$ defined on $M$ for which $f(x_0) = H(x_0)$ as well as $f(x) \le H(x)$ for all $x$ in $M$. We must extend the functional $f(x)$, defined on $M$, to a functional $F(x)$ defined on $E$. Choose an element $e$ in $E$ not in $M$; every vector $u$ in $E$ can be written $u = x + \lambda e$ where $x$ is in $M$ and $\lambda$ a scalar; this representation is unique. Hence the linear functional $F$ is determined by the specification of its value at the element $e$: $F(u) = f(x) + \lambda F(e)$. We must select the number $F(e)$, and therefore define $F(u)$ on $E$, in such a way that $F(u) \le H(u)$ for all $u$ in $E$. This means
$$F(u) = f(x) + \lambda F(e) \le H(x + \lambda e)$$
for all scalars $\lambda$ and all $x$ in $M$. We write it
$$\lambda F(e) \le H(x + \lambda e) - f(x)$$
and we already know that inequality when $\lambda = 0$. Dividing by $|\lambda|$ and putting $y = x/|\lambda|$ we have
$$\varepsilon F(e) \le H(y + \varepsilon e) - f(y), \qquad \text{where } \varepsilon = \lambda/|\lambda| = \pm 1.$$
Thus, finally,
$$F(e) \le H(y + e) - f(y) \qquad \text{for all } y \text{ in } M$$
and
$$F(e) \ge f(y) - H(y - e) \qquad \text{for all } y \text{ in } M.$$
It follows that
$$\sup_{y \in M} \,[f(y) - H(y - e)] \le F(e) \le \inf_{y \in M} \,[H(y + e) - f(y)].$$
On the other hand, if this inequality holds, it is possible to select $F(e)$, and therefore to define $F$ in such a way that the inequality $F(u) \le H(u)$ holds everywhere in the space. To establish this inequality, we need only consider
an arbitrary pair of vectors $x$, $y$ in $M$ and to show
$$f(x) - H(x - e) \le H(y + e) - f(y),$$
that is,
$$f(x) + f(y) \le H(y + e) + H(x - e).$$
The left-hand side equals $f(x + y)$, and this, by the inductive hypothesis, is at most $H(x + y) = H((x - e) + (y + e))$, which in turn is at most $H(x - e) + H(y + e)$ by convexity and homogeneity. Thus the theorem is established for finite dimensional spaces. Since our argument did not make explicit use of the finite dimensionality of $M$, we apply Zorn's lemma in the usual way to extend the theorem to arbitrary vector spaces $E$.
We should remark that in our proof, we tacitly supposed that the scalar field was real. However, the proof did not require that the function $H(x)$ be positive.
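The extension step of the proof can be illustrated in the plane. In the sketch below every concrete choice is an assumption made for the example: $H(u) = |u_1| + |u_2|$, the subspace $M$ spanned by $x_0 = (1, 0)$, the functional $f(s, 0) = s$ on $M$, and $e = (0, 1)$. The supremum and infimum over $M$ are replaced by a maximum and a minimum over a finite grid, which here happen to be attained.

```python
# Sketch of the one-step extension in the Hahn-Banach proof for E = R^2.
# Assumptions: H(u) = |u1| + |u2|, M = {(s, 0)}, f(s, 0) = s, e = (0, 1).

def H(u1, u2):
    """A convex, positively homogeneous function on R^2."""
    return abs(u1) + abs(u2)

def f(s):
    """Linear functional on M; f <= H on M and f(x0) = H(x0) at x0 = (1, 0)."""
    return s

# Finite grid standing in for y = (s, 0) ranging over M.
ss = [i / 10.0 for i in range(-500, 501)]

lower = max(f(s) - H(s, -1.0) for s in ss)  # sup of f(y) - H(y - e)
upper = min(H(s, 1.0) - f(s) for s in ss)   # inf of H(y + e) - f(y)

# Any value in [lower, upper] is an admissible choice of F(e).
c = 0.5 * (lower + upper)

def F(u1, u2):
    """Extension F(x + lam*e) = f(x) + lam*F(e), with F(e) = c."""
    return f(u1) + u2 * c

# Largest value of F(u) - H(u) over a grid of points u; it should not be positive.
worst = max(F(a / 4.0, b / 4.0) - H(a / 4.0, b / 4.0)
            for a in range(-20, 21) for b in range(-20, 21))
print(lower, upper, worst)
```

For these choices the admissible interval is $[-1, 1]$, so the extension is far from unique.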
16. Analytic Functions of Several Variables
A vector space of dimension $n$ over the complex scalars is obtained by taking the collection of $n$-tuples of complex numbers $z = [z_1, z_2, z_3, \ldots, z_n]$, and this space, denoted by $C^n$, may also be regarded as a real vector space of dimension $2n$ with generic point $z = x + iy = [x_1, y_1, x_2, y_2, \ldots, y_n]$. The natural topology on $C^n$ is clearly that of $R^{2n}$, but it is often convenient not to make use of balls about a point $\zeta \in C^n$ but rather of polycylinders about such a point, namely, sets defined by inequalities of the form
$$|z_k - \zeta_k| < r_k, \qquad k = 1, 2, \ldots, n.$$
Such sets are obviously open and form a base for the metric topology on $C^n$ equivalent to the base defined by the balls.
A polynomial $P(z)$ on $C^n$ is a function defined by a finite sum of the form
$$P(z) = \sum_\alpha p_\alpha z^\alpha$$
and is evidently a polynomial in the $2n$ variables $x_k$, $y_k$ and hence a smooth function. If we fix all of the values of $z_j$ for $j$ different from $k$, $P(z)$ appears as a polynomial in the remaining complex variable $z_k$ and hence
$$\Delta_k P = \frac{\partial^2 P}{\partial x_k^2} + \frac{\partial^2 P}{\partial y_k^2} = 0.$$
If we sum over $k$ we find $\Delta P = 0$ and conclude that $P(z)$ is a harmonic function.
Suppose that the formal power series $\sum a_\alpha z^\alpha$ converges for some value $z = \zeta$, no component of which is 0. We shall only require that the general term of the series converges to 0 and therefore that there exists a constant $M$ such that for all $\alpha$, $|a_\alpha|\, r^\alpha \le M$, where $|\zeta^\alpha| = r^\alpha$ and $|\zeta_k| = r_k > 0$. If $\theta$ is a number in the interval $(0, 1)$ and $z$ in $C^n$ belongs to the closed polycylinder $|z_k| \le \theta r_k$, $k = 1, 2, \ldots, n$, then
$$|a_\alpha z^\alpha| \le |a_\alpha|\, r^\alpha \prod_{k=1}^n \theta^{\alpha_k} \le M\theta^{|\alpha|}$$
and therefore, the series $\sum a_\alpha z^\alpha$ converges absolutely and uniformly on the closed polycylinder, being majorated there by the convergent series
$$M \sum_\alpha \theta^{|\alpha|} = M(1 - \theta)^{-n}.$$
Hence, the series defines a continuous function on the open polycylinder $|z_k| < r_k$, $k = 1, 2, \ldots, n$, where this function is even harmonic since, on any compact subset of the polycylinder, it is the uniform limit of a sequence of (harmonic) polynomials.
We have, therefore, a natural definition of an analytic function $F(z)$ defined on an open subset $G$ of $C^n$: $F(z)$ is analytic in $G$ if and only if, for every point $\zeta$ in $G$, $F(z)$ may be represented in some polycylinder about $\zeta$ by a convergent series
$$F(z) = \sum_\alpha a_\alpha (z - \zeta)^\alpha.$$
If the series is differentiated term by term, then
$$\frac{\partial F}{\partial z_k} = \sum_\alpha \alpha_k\, a_\alpha\, (z - \zeta)^{\alpha - e_k}, \qquad e_k \text{ being the multi-index with 1 in the } k\text{th place and 0 elsewhere},$$
and the uniform convergence of this series on compact subsets of the polycylinder guarantees that the differentiation was legitimate; accordingly, the partial derivatives $\partial F/\partial z_k$ are themselves analytic in $G$. Clearly, $D^\alpha F(\zeta)/\alpha! = a_\alpha$.
Certain properties of analytic functions defined on a domain $G$ in $C^n$ are now obvious: the class of functions analytic in $G$ is closed under addition and multiplication; if $f(\tau)$ is an analytic function of one complex variable and $F(z)$ is analytic in $G$, then the composed function $f(F(z))$ is also analytic in $G$ if the composition makes sense. It is also important to notice that when $G$ is open and connected, the analytic function $F(z)$ in $G$ is completely determined by its behavior in the neighborhood of an arbitrary point $\zeta$ by virtue of an obvious analytic continuation argument.
If $F(z)$ is analytic in $G$ and $\zeta$ is a point of that domain, then the function $f(z_k) = F(\zeta_1, \zeta_2, \ldots, \zeta_{k-1}, z_k, \zeta_{k+1}, \ldots, \zeta_n)$ is an analytic function of the one complex variable $z_k$ in some circle $|z_k - \zeta_k| < r$; this is a consequence of the fact that $f(z_k)$ may be represented as a power series. Accordingly, an analytic function $F(z)$ is analytic in each of the variables separately, and an important theorem due to Hartogs asserts that a complex-valued function $F(z)$ defined in a region $G$ in $C^n$ is analytic if it is analytic in each variable separately. We shall only prove the theorem under the strong additional hypothesis that $F(z)$ is continuous in $G$.
There is no loss of generality in our supposing that $G$ is a polycylinder about the origin defined by the inequalities $|z_k| < 2r$. We select any system of positive numbers $r_k \le r$ and note that the analyticity of $F(z)$ in the variable $z_1$ implies
$$F(z) = \frac{1}{2\pi i} \int_{|\zeta_1| = r_1} \frac{F(\zeta_1, z_2, \ldots, z_n)}{\zeta_1 - z_1}\, d\zeta_1,$$
a representation which is valid for $z$ in the polycylinder $|z_k| < r_k$. After a finite number of steps,
$$F(z) = \frac{1}{(2\pi i)^n} \int \cdots \int \frac{F(\zeta_1, \zeta_2, \ldots, \zeta_n)}{\prod_{k=1}^n (\zeta_k - z_k)}\, d\zeta_1\, d\zeta_2 \cdots d\zeta_n,$$
where the integration is taken over the set $|\zeta_k| = r_k$, $k = 1, 2, \ldots, n$. This expression is a repeated integral representing $F(z)$; we shall now consider it from another point of view.
Write $\zeta_k = r_k e^{i\theta_k}$ and $\zeta^{1} = \prod_{k=1}^n \zeta_k$, where 1 in an exponent denotes the multi-index $[1, 1, \ldots, 1]$; let $d\theta$ denote the usual product measure on the $n$-dimensional torus, namely, $d\theta = d\theta_1\, d\theta_2\, d\theta_3 \cdots d\theta_n$. The measure
$$d\mu = (2\pi)^{-n} \exp\Bigl(i \sum_{k=1}^n \theta_k\Bigr)\, d\theta \prod_{k=1}^n r_k$$
is a Borel measure of finite total mass supported by the compact set defined by the equations $|\zeta_k| = r_k$, $k = 1, 2, \ldots, n$, and in a neighborhood of that compact set the function $F(\zeta)/\prod_k (\zeta_k - z_k)$ is continuous if $z$ is fixed inside the open polycylinder $|z_k| < r_k$, $k = 1, 2, \ldots, n$. Accordingly, the integral $\int F(\zeta)/\prod_k (\zeta_k - z_k)\, d\mu$ makes
sense for all such $z$ and defines a continuous function of $z$ in the polycylinder. Because of the Fubini theorem, the integral may be taken as a repeated integral and it therefore equals the expression for $F(z)$ above. We may therefore write
$$F(z) = \int \frac{F(\zeta)}{\prod_{k=1}^n (\zeta_k - z_k)}\, d\mu(\zeta),$$
the integration being taken over the set defined by the equations $|\zeta_k| = r_k$. Now when $\zeta$ is in that set and $z$ in the polycylinder,
$$\frac{1}{\zeta_k - z_k} = \sum_{m=0}^\infty \frac{z_k^m}{\zeta_k^{m+1}}$$
and therefore
$$\frac{1}{\prod_{k=1}^n (\zeta_k - z_k)} = \sum_\alpha \frac{z^\alpha}{\zeta^{\alpha + 1}}.$$
Because of the uniform convergence of the series on the support of the measure, $z$ being fixed, the summation and integration may be interchanged:
$$F(z) = \sum_\alpha a_\alpha z^\alpha, \qquad \text{where } a_\alpha = (2\pi)^{-n} \int \frac{F(\zeta)}{\zeta^\alpha}\, d\theta.$$
It follows that $F(z)$ is analytic in a neighborhood of the origin since it is represented there by a convergent power series. We also have the Cauchy integral formula:
$$D^\alpha F(z) = \alpha! \int \frac{F(\zeta)}{\prod_{k=1}^n (\zeta_k - z_k)^{\alpha_k + 1}}\, d\mu(\zeta),$$
which is valid for all $z$ in the polycylinder and all multi-indices $\alpha$.
It is important to notice that the Cauchy integral formula represents an analytic function in a polycylinder in terms of its values on a subset of the boundary. Since the dimension of $C^n$ is $2n$ and the integration is taken over a set of dimension only $n$, it becomes clear that whenever $n > 1$, the Cauchy integral formula involves integration only over a rather small subset of the boundary of the polycylinder. This subset is the distinguished boundary of the polycylinder. It is this circumstance which makes the study of many complex variables essentially different from the study of one complex variable. For example, a function $F(z)$, analytic in a connected neighborhood of the boundary necessarily possesses an analytic continuation over the whole polycylinder,
given, naturally, by the Cauchy integral formula. The proof is not quite trivial and we omit it. More generally, in the study of many complex variables, there will exist many open sets having the property that any function $f(z)$, analytic in the set, has an analytic continuation to a larger domain. Accordingly, the sets which are natural domains of existence play an important role: such sets are called holomorphy envelopes and an extensive study has been made of such sets. For example, it can be shown that any open set, convex in $R^{2n}$, is a holomorphy envelope.
Let $f(z) = U(z) + iV(z)$ be a smooth, complex-valued function defined in some domain in $C^n$, $U(z)$ and $V(z)$ its real and imaginary parts, respectively. It is convenient to introduce the formal differential operators
$$\frac{\partial}{\partial z_k} = \frac{1}{2}\Bigl(\frac{\partial}{\partial x_k} - i\,\frac{\partial}{\partial y_k}\Bigr), \qquad \frac{\partial}{\partial \bar z_k} = \frac{1}{2}\Bigl(\frac{\partial}{\partial x_k} + i\,\frac{\partial}{\partial y_k}\Bigr).$$
Now, it is easy to see that for any $k$,
$$\frac{\partial f}{\partial \bar z_k} = \frac{1}{2}\Bigl[\Bigl(\frac{\partial U}{\partial x_k} - \frac{\partial V}{\partial y_k}\Bigr) + i\Bigl(\frac{\partial U}{\partial y_k} + \frac{\partial V}{\partial x_k}\Bigr)\Bigr]$$
and this vanishes identically if and only if the smooth function $f(z)$ satisfies the Cauchy-Riemann equations in the variables $x_k$, $y_k$. Accordingly, a smooth function $f$ is analytic if and only if $\partial f/\partial \bar z_k = 0$ for all $k$, and in this case the partial derivatives $\partial f/\partial z_k$ are analytic functions which are given by the formal differential operator introduced above. It should be emphasized, however, that the $2n$ variables $z_k = x_k + iy_k$, $\bar z_k = x_k - iy_k$ are not a system of independent variables in the space. We also note the identity
$$4\,\frac{\partial^2 f}{\partial z_k\, \partial \bar z_k} = \Delta_k f = \frac{\partial^2 f}{\partial x_k^2} + \frac{\partial^2 f}{\partial y_k^2}.$$
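The Cauchy integral over the distinguished boundary, discussed earlier in this section, can be tested numerically for $n = 2$. The following is a sketch under stated assumptions: the test function $F(z) = \exp(z_1 + 2z_2)$, the radii $r_k = 1$, the evaluation point, and the helper name `cauchy_torus` are all illustrative choices. The integral against $d\mu$ is approximated by the trapezoid rule in each angle $\theta_k$, using $d\mu = (2\pi)^{-n}\zeta_1\zeta_2\,d\theta$.

```python
import cmath
import math

# Sketch: Cauchy integral over the torus |zeta_1| = r_1, |zeta_2| = r_2
# for n = 2, approximated by an m-point trapezoid rule in each angle.

def F(z1, z2):
    """Hypothetical test function, analytic on all of C^2."""
    return cmath.exp(z1 + 2.0 * z2)

def cauchy_torus(F, z, r, m=64):
    """Approximate the integral of F(zeta) / prod(zeta_k - z_k) against d(mu)."""
    total = 0j
    for a in range(m):
        zeta1 = r[0] * cmath.exp(2j * math.pi * a / m)
        for b in range(m):
            zeta2 = r[1] * cmath.exp(2j * math.pi * b / m)
            # Density of d(mu) relative to d(theta) contributes zeta1 * zeta2.
            kernel = (zeta1 * zeta2) / ((zeta1 - z[0]) * (zeta2 - z[1]))
            total += F(zeta1, zeta2) * kernel
    # (2 pi)^(-2) times the step (2 pi / m)^2 reduces to 1 / m^2.
    return total / (m * m)

z = (0.3 + 0.1j, -0.2 + 0.4j)   # a point inside the unit polycylinder
approx = cauchy_torus(F, z, (1.0, 1.0))
exact = F(z[0], z[1])
error = abs(approx - exact)
print(error)
```

Only values of $F$ on the two-dimensional torus enter the sum, although the full boundary of the polycylinder in $C^2$ is three-dimensional.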
17. Linear Topological Spaces
Let $E$ be a linear space over real or complex scalars. Let $\mathcal{N}$ be a family of nonempty sets containing 0 with the following properties:
(1) For $x$ in $V$ and $V$ in $\mathcal{N}$, $\lambda x$ is in $V$ for all $\lambda$ with $|\lambda| \le 1$.
(2) For $U$ and $V$ in $\mathcal{N}$, there exists $W$ in $\mathcal{N}$ contained in $U \cap V$.
(3) For every $x \ne 0$ in $E$, there exists $V$ in $\mathcal{N}$ not containing $x$.
(4) For every $x$ in $E$ and $V$ in $\mathcal{N}$, there exists $\lambda \ne 0$ so that $\lambda x$ is in $V$.
(5) Given $U$ in $\mathcal{N}$, there exists $V$ in $\mathcal{N}$ such that $V + V \subset U$.
Sets of the family $\mathcal{N}$ are called neighborhoods of the origin; we define neighborhoods of an arbitrary point $x$ by taking all sets of the form $x + V$ where $V$ is in $\mathcal{N}$. Because of (2), these neighborhood systems do, in fact, define a topology when open sets are defined as all sets $G$ which, containing $x$, also contain a neighborhood of $x$. The topology is Hausdorff in view of (3) and (5): for $x$ different from $y$ we take $U$ in $\mathcal{N}$ not containing $x - y$, and $V$ in $\mathcal{N}$ such that $V + V \subset U$; the neighborhoods $x + V$ and $y + V$ are then disjoint.
In this topology, the linear space operations of addition and scalar multiplication are continuous. To show the continuity of addition, we must show that the mapping $[x, y] \to x + y$ from $E \times E$ into $E$ is continuous when the product space is given the usual product (Tychonoff) topology. Given a neighborhood of $x + y$, namely, $x + y + U$ where $U$ is in $\mathcal{N}$, by (5) we take $V$ in $\mathcal{N}$ with $V + V \subset U$ and form the neighborhoods $x + V$ and $y + V$ respectively; their product in $E \times E$ is mapped into $x + y + U$.
The proof of the continuity of scalar multiplication is slightly longer. We consider here the mapping from the product of $E$ with the scalar field into $E$ given by $[\lambda, x] \to \lambda x$. Let $C_\varepsilon$ be the set of all scalars $t$ with $|t| < \varepsilon$; for a given $U$ in $\mathcal{N}$, we must show the existence of $\varepsilon > 0$ and $V$ in $\mathcal{N}$ such that $(\lambda + C_\varepsilon)(x + V)$ is contained in $\lambda x + U$. By virtue of (5), there exists $V$ in $\mathcal{N}$ such that $2^k V \subset U$ where $k$ is so large that $2^k > 2 + |\lambda|$, and because of (4), there exists $\varepsilon > 0$ with $\varepsilon x$ belonging to $V$; now $C_\varepsilon x + \lambda V + C_\varepsilon V$ is contained in $(2 + |\lambda|)V$ and therefore in $U$. [Here we have used (1).]
It is more important to note that if the linear space $E$ has a topology making it a Hausdorff space with continuous addition and scalar multiplication, then the topology may be defined by means of a family of neighborhoods of the origin satisfying the axioms above.
Because, if we suppose that the topology is defined by a system of neighborhoods taken at each point, the continuity of addition permits us to substitute the translates of the neighborhoods of the origin for the initial system of neighborhoods at $x$ without changing the topology. Axiom (2) is necessarily satisfied by every neighborhood system. Axiom (3) is a consequence of the hypothesis that $E$ is a Hausdorff space. Axiom (4) follows from the fact that for any $x$, $\lambda x$ must converge to 0 in $E$ as $\lambda$ converges to 0 in the scalar field. Axiom (5) comes from the continuity of addition. Finally, the first axiom is not quite necessary. However, the continuity of scalar multiplication at $x = 0$ and $\lambda = 0$ requires that for any neighborhood of the origin $U$, there exist $\varepsilon > 0$ and a neighborhood $V$ such that $\lambda V \subset U$ for all $\lambda$ with $|\lambda| < \varepsilon$. Thus, the sets $V^* = \bigcup tV$, the union being taken over all scalars $t$ with $|t| < \varepsilon$, form an equivalent system of neighborhoods of the origin, and this system satisfies (1).
Sometimes it is found convenient to require that the neighborhoods $V$ in $\mathcal{N}$ be closed. It is instructive to consider some examples.
Example 1: On the unit interval, we consider the space $E$ of all equivalence classes of Lebesgue measurable functions $x(t)$, two functions being equivalent if they coincide almost everywhere. We define neighborhoods of the origin $V_N$ as follows: $V_N$ consists of all functions $x(t)$ such that
It is easy to verify that the five axioms are satisfied and that the topology is that of convergence in measure. $E$ is a complete metric space.
Example 2: On the unit interval, we consider the linear space of all (finite) real valued functions $x(t)$. A neighborhood of the origin is defined by a positive number $\varepsilon$ and a finite set $F = [t_1, t_2, \ldots, t_N]$. The neighborhood $V_{F,\varepsilon}$ is the set of all functions $x(t)$ satisfying the $N$ inequalities $|x(t_k)| < \varepsilon$. One verifies that the five axioms are satisfied. If we ignore the linear space structure of $E$ we see that $E$ is just the direct product of uncountably many copies of the real axis in the usual Tychonoff topology. Note that $E$ is not a metric space and that its topology is not metrizable. If $S$ is the set of all elements in the space equal to $+1$ at all but a finite number of points (the finite set depending on the element $x$ in $S$, not on $S$), then 0 is clearly a limit point of $S$, since $S$ contains points in every neighborhood of the origin. However, no sequence in $S$ converges to 0, since, for any such sequence $x_n(t)$ in $S$ there exists a point $t_0$ in the interval for which $x_n(t_0) = 1$ for all $n$; accordingly, the neighborhood determined by $\varepsilon = \tfrac{1}{2}$ and the finite set $F = [t_0]$ contains no element of the sequence.
In considering the linear topological space $E$, it is natural to study the linear functionals on $E$ which are continuous; that is, the functions $F(x)$ continuous on $E$ for which $F(ax + by) = aF(x) + bF(y)$. If $G$ is an open subset of the scalar field, the set $F^{-1}(G)$ must be open in $E$, and in particular, taking $G$ as the set of all scalars $\lambda$ for which $|\lambda| < 1$, the set $F^{-1}(G)$ is an open set containing the origin, and hence containing a neighborhood of the origin. We remark also that that set is convex.
Returning, now, to the linear space of Example 1, the real measurable functions on $[0, 1]$, we find that this space has no continuous linear functionals, since, as we shall show, there exists no open convex set containing the origin other than $E$ itself. Let $V_N$ be a neighborhood of the origin. Divide the interval $[0, 1]$ into $N$ disjoint equal subintervals $I_k$ and for each $k$ let $\chi_k(t)$ be the characteristic function of $I_k$. For any measurable function $f(t)$, the product $f(t)\chi_k(t)$ is in $V_N$, and the function $(1/N)\sum_k f(t)\chi_k(t) = (1/N)f(t)$ is a convex combination of elements of $V_N$. Since $f(t)$ was arbitrary, the whole space $E$ is contained in the convex hull of $V_N$. Thus there is no nontrivial open convex set containing the origin, or, for that matter, any other point.
This example, and our interest in studying continuous linear functionals, makes it desirable to adjoin a further axiom:
(6) The sets $V$ in $\mathcal{N}$ are convex,
and thereby define the class of locally convex linear topological spaces. When the space is locally convex, the topology is often more conveniently described by the use of a family of seminorms. We say that a function $\|x\|_V$ defined on $E$ is a seminorm if (1) the function takes finite, nonnegative values, (2) it is positively homogeneous: $\|\lambda x\|_V = |\lambda|\, \|x\|_V$ for all $\lambda$ and all $x$, and (3) it is convex: $\|x + y\|_V \le \|x\|_V + \|y\|_V$ (in view of (2) this is the usual convexity). A seminorm is a norm if, moreover, (4) $\|x\|_V = 0$ implies $x = 0$. If $V$ is a neighborhood of the origin, we define $\|x\|_V = \inf\,[\lambda > 0,\ x/\lambda \text{ in } V]$. One verifies that this is a seminorm when $V$ is convex, and that if we take $V$ as a closed set, then $x$ is in $V$ if and only if $\|x\|_V \le 1$. Thus, we may work with the family of seminorms rather than the family of neighborhoods. If we make the convention that a family of seminorms contains, with each seminorm $\|x\|$, all of the seminorms $t\|x\|$ for $t > 0$, the six axioms for the neighborhood system reduce to the following two axioms for the family of seminorms:
I. For every pair $\|x\|_1$ and $\|x\|_2$ in the family, there exists a third $\|x\|_3$ for which $\|x\|_3 \ge 2[\|x\|_1 + \|x\|_2]$ for all $x$.
II. For every $x_0$ in $E$, $x_0 \ne 0$, there is a seminorm $\|x\|_V$ of the family so that $\|x_0\|_V > 0$.
The second axiom is needed only to make the space Hausdorff. The linear space of Example 2 is locally convex; the seminorms are the functions $\|x\|_F = \sup |x(t)|$, $t$ in $F$, where $F$ is any finite set in $[0, 1]$.
Let $f(x)$ be a linear functional on the locally convex space $E$. If $f$ is continuous, the set of $x$ for which $|f(x)| < 1$ is an open convex set containing the origin, and therefore containing a set $V$ in $\mathcal{N}$. It follows that $|f(x)| \le \|x\|_V$ throughout the space. Conversely, if a linear functional $f$ satisfies such an inequality, it is bounded on a neighborhood $V$ of the origin; hence, given any $x$ and $N$, if $y$ belongs to $x + (1/N)V$,
$$|f(y) - f(x)| = |f(y - x)| \le \|y - x\|_V \le \frac{1}{N}$$
and therefore $f$ is continuous.
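The seminorm $\|x\|_V = \inf[\lambda > 0,\ x/\lambda \text{ in } V]$ attached to a convex neighborhood can be computed by bisection whenever membership in $V$ can be tested. The sketch below assumes a particular closed convex set in the plane, the ellipse $x_1^2/4 + x_2^2 \le 1$, for which the seminorm is known in closed form; the set, the helper names `in_V` and `gauge`, and the initial bracket are illustrative assumptions.

```python
import math

# Sketch: the seminorm ||x||_V = inf{lam > 0 : x/lam in V} by bisection.
# Assumption: V is the closed ellipse x1^2/4 + x2^2 <= 1 in R^2, so that
# ||x||_V = sqrt(x1^2/4 + x2^2) exactly.

def in_V(x):
    """Membership test for the closed convex set V."""
    return x[0] ** 2 / 4.0 + x[1] ** 2 <= 1.0

def gauge(x, contains, hi=1e6, steps=200):
    """Bisect on lam; assumes x/hi lies in V for the initial value of hi."""
    lo = 0.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        # For convex V containing 0, x/lam in V for every lam >= ||x||_V.
        if contains((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

x = (3.0, -4.0)
y = (1.0, 0.5)
gx = gauge(x, in_V)
gy = gauge(y, in_V)
gsum = gauge((x[0] + y[0], x[1] + y[1]), in_V)
g2x = gauge((2.0 * x[0], 2.0 * x[1]), in_V)
print(gx, math.sqrt(x[0] ** 2 / 4.0 + x[1] ** 2))
```

The values `gsum` and `g2x` allow the convexity inequality and the positive homogeneity to be checked on this example.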
Suppose $x_0$ is not 0; by II there exists a seminorm in the family for which $\|x_0\|_V > 0$, and since that seminorm is convex and positively homogeneous,
we invoke the Hahn-Banach theorem to obtain a linear functional $f(x)$ having the properties $f(x_0) = \|x_0\|_V$ and $f(x) \le \|x\|_V$ for all $x$ in $E$. In particular, we have $-f(x) = f(-x) \le \|-x\|_V = \|x\|_V$ and therefore $|f(x)| \le \|x\|_V$ and $f$ is continuous. It follows that there exist sufficiently many continuous linear functionals, that is, if $f(x_0) = 0$ for every continuous linear $f$, then $x_0 = 0$.
If the locally convex space $E$ has a metric topology, there is a countable basis of the neighborhoods of a point, and therefore the neighborhood system $\mathcal{N}$ may be taken as a countable system of sets, and the family of seminorms is a countable family which may be so indexed that the functions $\|x\|_n$ are monotone increasing in $n$. Conversely, if the family of seminorms is countable, the topology of $E$ is metrizable: we introduce the metric
$$d(x, y) = \sum_n 2^{-n}\, \frac{\|x - y\|_n}{1 + \|x - y\|_n}$$
and it is easy to verify that this is a metric, the real function $t/(1 + t)$ being monotone increasing for positive $t$. The metric is also translation-invariant: $d(x - z, y - z) = d(x, y)$ for all $z$. If a sequence $x_k$ in $E$ converges to 0 in the sense of the metric, that is, if $d(x_k, 0)$ converges to 0, then for every fixed $n$ evidently $\|x_k\|_n$ also converges to 0, and therefore $x_k$ converges to 0 in $E$. On the other hand, if $x_k$ converges to 0 in $E$, then for large $N$ and small $\varepsilon$, $k$ being sufficiently large, $\max_{n \le N} \|x_k\|_n < \varepsilon$ and therefore $d(x_k, 0) < \varepsilon + 2^{-N}$. Accordingly, the sequence converges to 0 in the metric topology, and the metric determines a topology on $E$ which is the same as that determined by the seminorms.
A subset $B$ of a locally convex space $E$ is called bounded if, for every neighborhood $V$ in $\mathcal{N}$, there exists a scalar $\lambda \ne 0$ so that $\lambda B \subset V$. It is easy to show that the union of two bounded sets is bounded, that the closure of a bounded set is bounded, and that all compact sets are bounded. In general, bounded sets are rather thin, as the following theorem shows.
Theorem: If there exists a bounded neighborhood of the origin, the space $E$ is a normed linear space.
PROOF: Let $V$ be the neighborhood in $\mathcal{N}$ which is bounded. The sequence of sets $(1/n)V$ forms an equivalent system of neighborhoods of the origin, since any $U$ in $\mathcal{N}$ contains a set of the sequence; thus the topology is metrizable and the family of seminorms consists of constant multiples of $\|x\|_V$, which is actually a norm. Since the translates of a bounded set are bounded, it follows that if $E$ is not a normed linear space, no bounded set has an interior point.
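The metric built earlier in this section from a countable increasing family of seminorms can be tried out on a small model. In the sketch below the space, the seminorms $\|x\|_n = \max_{k \le n}|x_k|$ on sequences of fixed finite length, the truncation of the sum at `n_max`, and the sample vectors are all assumptions made for illustration; the code evaluates $d(x, y)$ and the quantities needed to check the triangle inequality and translation invariance.

```python
# Sketch: d(x, y) = sum_n 2^(-n) * ||x - y||_n / (1 + ||x - y||_n)
# for a hypothetical increasing family of seminorms on finite sequences.
# Assumption: the sum is truncated at n_max = len(x).

def seminorm(x, n):
    """||x||_n = max of |x_k| over the first n coordinates (increasing in n)."""
    return max(abs(t) for t in x[:n])

def d(x, y):
    """Translation-invariant metric defined by the seminorms above."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for n in range(1, len(diff) + 1):
        s = seminorm(diff, n)
        total += 2.0 ** (-n) * s / (1.0 + s)
    return total

x = [1.0, -2.0, 0.5, 3.0]
y = [0.0, 1.0, 1.0, -1.0]
z = [2.0, 2.0, -3.0, 0.25]

d_xy = d(x, y)
d_xz = d(x, z)
d_zy = d(z, y)

# Translate both points by z and recompute the distance.
x_shift = [a - c for a, c in zip(x, z)]
y_shift = [b - c for b, c in zip(y, z)]
d_shift = d(x_shift, y_shift)
print(d_xy, d_xz + d_zy, d_shift)
```

Every term of the sum is less than $2^{-n}$, so the metric is bounded by 1 regardless of how large the seminorms are.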
A locally convex space E is called a Montel space if every closed and bounded set is compact. A well-known theorem in the theory of Banach spaces asserts that the unit sphere of a normed linear space is compact if and only if the space is finite dimensional. Thus, among normed linear spaces, only the finite dimensional ones are Montel spaces. However, there exist a variety of infinite dimensional locally convex spaces, clearly not normed linear spaces, which are Montel spaces. We give two examples.
Example 3: $E$ is the space of all functions $x(z)$ analytic in a region $G$ of the complex plane with the topology of uniform convergence on compact subsets of $G$. Thus, a neighborhood of the origin is indexed by a compact subset $K$ of $G$ and a positive $\varepsilon$:
$$V_{K,\varepsilon} = \text{all } x(z) \text{ for which } |x(z)| < \varepsilon \text{ on } K.$$
The corresponding seminorms are
$$\|x\|_K = \sup |x(z)|, \qquad z \text{ in } K.$$
The space is a complete metric space. A bounded set in $E$ is a family $B$ of functions $x(z)$ uniformly bounded on each compact set: $\sup |x(z)| = M_{K,B} < \infty$, the supremum being taken over all $x$ in $B$ and $z$ in $K$. Since $G$ is a countable union of compact sets and the functions in $B$ are equicontinuous on any compact set, there exists a sequence $x_k(z)$ in $B$ converging uniformly on compact subsets of $G$ to an analytic limit $x_0(z)$; $x_0$ is in $E$, and, of course, is in the closure of $B$. Thus $E$ is a Montel space.
Example 4: $E$ is the space $\mathcal{D}_K$ of all $C^\infty$-functions on $R^n$ vanishing for $x$ outside the compact set $K$, the topology being determined by the seminorms
$$\|u\|_N = \sup_{|\alpha| \le N}\, \sup_x\, |D^\alpha u(x)|.$$
$E$ is a complete metric space. A bounded set $B$ is a family of functions $u(x)$ having the property that
$$\sup_{u \in B} \|u\|_N = M_N < \infty$$
for all $N$. Evidently, the functions in $B$ are all uniformly Lipschitzian and are uniformly bounded; the family is therefore equicontinuous for the uniform topology and there exists a sequence $u_n$ in $B$ converging uniformly on the whole space $R^n$. The first derivatives of the functions $u_n(x)$ have the same property: they are uniformly bounded and have uniformly bounded derivatives; accordingly, those derivatives form an equicontinuous family. Proceeding inductively and using the Cantor diagonal process, we arrive at a subsequence $u_k(x)$ which converges uniformly on $R^n$ to a continuous limit $u(x)$ and such that
the sequences $D^\alpha u_k$ converge uniformly to the continuous functions $u_\alpha(x)$. It is easy to verify that $u_\alpha(x) = D^\alpha u(x)$.

When E is a locally convex space, we denote by E' the class of all continuous linear functionals on E; this is obviously a vector space over the same scalar field. There are two natural locally convex topologies which arise on E'.

(1) The weak-star topology on E' is determined by the following system of neighborhoods of the origin:
$$V_{F,\varepsilon} = \text{all } f \text{ in } E' \text{ such that } |f(x_k)| < \varepsilon, \quad x_k \text{ in } F = [x_1, x_2, \ldots, x_N].$$
The corresponding seminorms are, of course,
$$\|f\|_F = \sup_{x \in F} |f(x)|.$$

(2) The strong topology on E' is determined by a family $\mathcal{N}$ of neighborhoods of the origin indexed by the family of bounded sets in E:
$$V_B = \text{all } f \text{ in } E' \text{ such that } |f(x)| < 1 \text{ for } x \text{ in } B.$$
The corresponding seminorms are
$$\|f\|_B = \sup_{x \in B} |f(x)|.$$
Note that when E is a normed linear space, the strong topology is the usual topology of the dual Banach space.

Another important topology occurs on the space E itself, and is defined by E': the weak topology of E is determined by the system of neighborhoods
$$V_{F,\varepsilon} = \text{all } x \text{ in } E \text{ such that } |f_i(x)| < \varepsilon, \quad f_i \text{ in } F = [f_1, f_2, \ldots, f_N].$$
Of course, this is the weak-star topology we would obtain regarding E as a space of linear functionals over E'. The space E' consisting of the continuous linear functionals on the locally convex space E is called the conjugate or dual space. We will not often have occasion to make use of the strong topology on E'; however, the weak-star topology will be very useful, and so also will be the weak topology defined on E by E'. This will be particularly important when we define distributions.

We shall consider the linear space $\mathcal{D}$ consisting of $C^\infty$-functions on $R^n$ with compact support; when that space is given an appropriate locally convex topology, the corresponding continuous linear functionals will be the distributions on $R^n$ and the dual space is the space of distributions. However, a direct description of the required topology on $\mathcal{D}$ is complicated and really unnecessary. We shall instead make use of the following artifice: we define directly the linear functionals which are to be the distributions; these functionals form a linear space $\mathcal{D}'$ and that space gives rise to a weak topology on $\mathcal{D}$. The topology so obtained, otherwise not described explicitly, is then to be the topology of $\mathcal{D}$. We will be
able to identify the class of convergent sequences, and although it will transpire that the topology of $\mathcal{D}$ is not metric, the knowledge of the convergent sequences will be enough for our purposes.

We conclude the section with two examples of auxiliary spaces which we shall often have occasion to use.

Example 5: $\mathcal{E}$ is the space of all $C^\infty$-functions on $R^n$ with the topology determined by the seminorms
$$\|u\|_{K,N} = \sum_{|\alpha| \le N} \sup_{x \in K} |D^\alpha u(x)|,$$
where K is a compact subset of $R^n$ and N an integer. Since the whole $R^n$ is the union of an increasing sequence of compact subsets, there are only countably many seminorms and the topology is metric. Evidently a sequence $u_k(x)$ converges in $\mathcal{E}$ if and only if, for every compact set K and fixed $\alpha$, the sequence $D^\alpha u_k$ converges uniformly on K. It is also clear that if the sequence $u_k$ converges to u, then the derivatives $D^\alpha u_k$ converge to $D^\alpha u$. From the Arzela-Ascoli Theorem, it readily follows that closed and bounded sets in $\mathcal{E}$ are compact, that is, that $\mathcal{E}$ is a Montel space. This is not a fact that we shall need, however. It is more important to notice that the operations of multiplication and differentiation are continuous in $\mathcal{E}$ and that it is complete.
Example 6: The space $\mathcal{S}$ consists of all $C^\infty$-functions on $R^n$ which vanish quite rapidly at infinity; more exactly, for every multi-index $\alpha$ and every positive integer N
$$\sup_x \left| (1 + |x|^2)^N D^\alpha u(x) \right| < \infty.$$
The topology is determined by the sequence of seminorms
As in the previous example, the space is a complete metric space; it is a Montel space, and the operations of differentiation and multiplication are continuous. Five of the examples given in this section are complete metric spaces; the fact that such spaces are of the second category in themselves is often used to establish special properties of those linear spaces.
PART II
DISTRIBUTIONS
18. Distributions

Let f(x) be a continuous function defined in $R^n$ or at least on an open subset of that space. If the function is bounded, we set $\|f\|_\infty = \sup_x |f(x)|$, and we define the support of f, written supp f, as the closure of the set where $|f(x)| > 0$; thus the support is the complement of the interior of the set where f(x) = 0. We are particularly interested in the functions which are infinitely differentiable with compact support; we have already seen that many such functions exist, and we call them testfunctions. Let $\Omega$ be an open subset of $R^n$; by $\mathcal{D} = \mathcal{D}(\Omega)$ we designate the class of all testfunctions having support in $\Omega$, and this is evidently a linear space. We think of it as a linear space over real scalars when we are considering only real valued testfunctions, and for complex valued testfunctions we take it as a space over complex scalars. It is convenient to introduce a family of seminorms on $\mathcal{D}$ as follows:
$$\|\varphi\|_N = \sum_{|\alpha| \le N} \sup_x |D^\alpha \varphi(x)|;$$
these seminorms are actually norms and form an increasing sequence. It is natural then to define a locally convex topology on $\mathcal{D}$ with these seminorms; the resulting topology makes $\mathcal{D}$ a metric space which unfortunately is not complete. To see this, we suppose that $\Omega = R^1$ = the real axis and take a fixed testfunction $\varphi(x)$; the partial sums of the infinite series
$$\sum_{n=1}^{\infty} 2^{-n} \varphi(x - n)$$
form a Cauchy sequence for the metric topology determined by the seminorms, but the series does not represent a function with compact support. We therefore reject the topology which we have just defined, but we will find the norms $\|\varphi\|_N$ useful in any case.

Definition: A linear functional T on $\mathcal{D}(\Omega)$ is a distribution if and only if for every compact subset K of $\Omega$, there exist constants C and N such that
$$|T(\varphi)| \le C \|\varphi\|_N$$
for all testfunctions $\varphi$ with support in K. If the integer N can be chosen independent of the compact K, and N is the smallest such choice, the distribution is said to be of order N.
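As a quick check of the definition, the following worked lines test the estimate on two functionals on the real axis: evaluation at the origin and evaluation of the first derivative at the origin. This is only a sketch. It assumes the norms $\|\varphi\|_N = \sum_{|\alpha| \le N} \sup_x |D^\alpha \varphi(x)|$, and the names $T_0$, $T_1$ and $\varphi_k$ are introduced here purely for illustration.

```latex
% Assumed: \|\varphi\|_N = \sum_{|\alpha| \le N} \sup_x |D^\alpha \varphi(x)| on R^1.
\begin{align*}
T_0(\varphi) &= \varphi(0), & |T_0(\varphi)| &\le \sup_x |\varphi(x)| = \|\varphi\|_0, \\
T_1(\varphi) &= \varphi'(0), & |T_1(\varphi)| &\le \sup_x |\varphi'(x)| \le \|\varphi\|_1 .
\end{align*}
% So T_0 is of order 0, and T_1 is of order at most 1.
% T_1 is not of order 0. Fix a testfunction \varphi supported in (-1,1) with
% \varphi'(0) = 1 and put \varphi_k(x) = k^{-1} \varphi(kx). Then
\begin{align*}
\operatorname{supp} \varphi_k \subset [-1,1], \qquad
\|\varphi_k\|_0 = k^{-1} \|\varphi\|_0 \to 0, \qquad
T_1(\varphi_k) = \varphi'(0) = 1 ,
\end{align*}
% so no inequality |T_1(\varphi)| \le C \|\varphi\|_0 can hold on K = [-1,1].
```

The scaling $\varphi_k(x) = k^{-1}\varphi(kx)$ shrinks the function while leaving its derivative at the origin unchanged, which is why the order of $T_1$ is exactly 1.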
We consider some important examples of distributions. Let $d\mu$ be a Radon measure on $\Omega$; we define the corresponding distribution by the equation
$$T(\varphi) = \int \varphi(x)\, d\mu(x).$$
If K is a compact subset of $\Omega$ and $C = \int_K |d\mu(x)|$ is the total mass of $d\mu$ on K, then
$$|T(\varphi)| \le C \|\varphi\|_\infty = C \|\varphi\|_0$$
for all testfunctions supported by K. Thus the functional T corresponding to $d\mu$ is in fact a distribution and is of order 0. We will make a canonical identification of Radon measures and the distributions which correspond in this way. An important special case is given by the Dirac $\delta$-distribution; this is the measure which consists of a positive unit mass at the origin, or equivalently, the distribution $\delta$ defined by $\delta(\varphi) = \varphi(0)$.

Let f(x) be a locally integrable function on $\Omega$; we form the Radon measure f(x) dx and pass to the corresponding distribution
$$f(\varphi) = \int \varphi(x) f(x)\, dx.$$
In the sequel we shall often identify locally integrable functions and the corresponding distributions and shall not speak explicitly of this identification. Thus, we will speak of a distribution which is a polynomial, a $C^\infty$-function, or the characteristic function of a set, etc.

Let $\{U_i\}$ be a countable family of open subsets of $\Omega$ having compact closures in $\Omega$ and which form a covering of that set: every x in $\Omega$ belongs to at least one $U_i$. We suppose also that the covering is locally finite, that is, that every x in $\Omega$ belongs at most to a finite number of the sets of the covering. It is then easy to show that a compact subset of $\Omega$ intersects at most a finite number of the $U_i$. By induction, we can always construct a further locally finite covering subordinate to the covering $\{U_i\}$; this is a locally finite covering $\{V_i\}$ such that the closure of $V_i$ is a compact subset of $U_i$. If $f_i(x)$ is the characteristic function of $V_i$, then for sufficiently small positive $\varepsilon$, the regularization of $f_i(x)$ of order $\varepsilon$ will be a testfunction $\varphi_i(x)$ in $\Omega$ which is positive on $V_i$ and which has its support in $U_i$. We form the infinite series
$$\Phi(x) = \sum_i \varphi_i(x)$$
to obtain a function which is strictly positive in $\Omega$ and infinitely differentiable there, since only finitely many terms of the series are nonzero on any compact subset of $\Omega$. Finally, we form the functions
$$\psi_i(x) = \frac{\varphi_i(x)}{\Phi(x)}$$
and obtain a system of testfunctions satisfying the conditions $0 \le \psi_i(x) \le 1$, $\psi_i(x) = 0$ outside $U_i$, and $\sum_i \psi_i(x) = 1$. The system is called a partition of unity subordinate to the covering $\{U_i\}$. We should remark that there is no difficulty concerning the existence of partitions of unity, since any open subset $\Omega$ of $R^n$ has locally finite coverings.

We use the partitions of unity first to show that the distributions are determined by their local behavior. More exactly, if two distributions T and S on $\Omega$ have the property that for every x in $\Omega$ there exists a neighborhood U such that $T(\varphi) = S(\varphi)$ for all testfunctions $\varphi(x)$ supported by U, then T = S. The proof consists in passing to a locally finite covering $\{U_i\}$ consisting of neighborhoods on which the distributions coincide and taking a corresponding partition of unity. For any testfunction $\varphi(x)$ we have
$$\varphi(x) = \sum_i \psi_i(x) \varphi(x)$$
and
$$T(\varphi) = \sum_i T(\psi_i \varphi) = \sum_i S(\psi_i \varphi) = S(\varphi),$$
since only finitely many terms in the series are nonzero.

Theorem: T is a distribution of order 0 if and only if T is a Radon measure.
PROOF: We have already seen that the Radon measures are distributions of order 0. On the other hand, T being of order 0, we take a locally finite covering $\{U_i\}$ of $\Omega$ and for each i we consider the continuous function space $\mathcal{C}(\bar{U}_i)$. This, of course, is the space of continuous functions on the compact $\bar{U}_i$ with the usual supremum norm. The linear functional $T(\varphi)$ is defined on the subspace consisting of testfunctions with support in $U_i$, and because T is of order 0, it is continuous for the norm of $\mathcal{C}(\bar{U}_i)$. Thus the functional can be extended by continuity to the closure of the testfunctions, and by Hahn-Banach to a continuous linear functional on the whole continuous function space. The theorem of F. Riesz guarantees that this extension is a measure on $\bar{U}_i$ of finite total mass, whence
$$T(\varphi) = \int \varphi(x)\, d\mu_i(x)$$
for all testfunctions with support in $U_i$. If, now, $\psi_i(x)$ is a partition of unity subordinate to the covering $U_i$, we have, finally,
$$T(\varphi) = T\Big(\sum_i \psi_i \varphi\Big) = \sum_i T(\psi_i \varphi) = \sum_i \int \psi_i(x) \varphi(x)\, d\mu_i(x) = \int \varphi(x)\, d\mu(x)$$
where $d\mu(x)$ is the Radon measure $\sum_i \psi_i(x)\, d\mu_i(x)$.

Definition: A distribution T on $\Omega$ is positive if and only if $T(\varphi) \ge 0$ for all testfunctions satisfying $\varphi(x) \ge 0$.

If the distribution T is positive, and K a compact, we select a positive testfunction $\psi$ which equals +1 on a neighborhood of K. Since for any $\varphi(x)$ supported by K
$$-\|\varphi\|_\infty \psi(x) \le \varphi(x) \le +\|\varphi\|_\infty \psi(x)$$
for all points x in $\Omega$, we obviously have
$$|T(\varphi)| \le |T(\psi)|\, \|\varphi\|_\infty$$
and therefore T is of order 0. A slight variant of our proof establishes the following result.
Theorem: A distribution T is positive if and only if it is a positive Radon measure.

Let $\Omega$ be the positive half-axis in $R^1$ and let $\{x_k\}$ be the sequence $\{1/k\}$. It is easy to verify that the form
is a distribution on $\Omega$ which is not of finite order. Moreover, this distribution cannot be extended to a distribution defined on the whole $R^1$.
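19. Differentiation of Distributions

Let T be a distribution on $\Omega$ and $x_k$ one of the coordinate functions. We define the derivative of T with respect to $x_k$ by the equation
$$\frac{\partial T}{\partial x_k}(\varphi) = -T\Big(\frac{\partial \varphi}{\partial x_k}\Big).$$
Since the derivative of a testfunction is again a testfunction, the differentiated distribution is a linear functional on $\mathcal{D}(\Omega)$; that it is a distribution follows from the inequality
$$\Big|\frac{\partial T}{\partial x_k}(\varphi)\Big| = \Big|T\Big(\frac{\partial \varphi}{\partial x_k}\Big)\Big| \le C \Big\|\frac{\partial \varphi}{\partial x_k}\Big\|_N \le C \|\varphi\|_{N+1},$$
valid for any testfunction $\varphi$ supported by a compact K on which $|T(\varphi)| \le C \|\varphi\|_N$.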
Since the testfunctions $\varphi$ are smooth, the mixed partial derivatives are independent of the order of differentiation:
$$\frac{\partial^2 \varphi}{\partial x_i\, \partial x_j} = \frac{\partial^2 \varphi}{\partial x_j\, \partial x_i},$$
and it follows that the same equation holds for distributions:
$$\frac{\partial^2 T}{\partial x_i\, \partial x_j} = \frac{\partial^2 T}{\partial x_j\, \partial x_i}.$$
We infer that for any multi-index $\alpha$ the corresponding derivative of T is given by the equation
$$D^\alpha T(\varphi) = (-1)^{|\alpha|}\, T(D^\alpha \varphi).$$
It is instructive to consider some examples.

Example 1: If $\Omega$ contains the origin, we can differentiate the Dirac $\delta$:
$$(D^\alpha \delta)(\varphi) = (-1)^{|\alpha|}\, \delta(D^\alpha \varphi) = (-1)^{|\alpha|} (D^\alpha \varphi)(0).$$

Example 2: If $\Omega = R^1$ and Y(x) is the function which vanishes for $x \le 0$ and equals +1 for x > 0, then the distribution derivative $Y' = \delta$.
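The claim of Example 2 follows in one line from the definition of the distribution derivative. The computation below is a sketch for a testfunction $\varphi$ on the real axis; it uses only the fact that $\varphi$ vanishes for large x.

```latex
\begin{align*}
Y'(\varphi) &= -Y(\varphi')
             = -\int_{-\infty}^{\infty} Y(x)\, \varphi'(x)\, dx
             = -\int_{0}^{\infty} \varphi'(x)\, dx \\
            &= -\bigl[\varphi(x)\bigr]_{0}^{\infty}
             = \varphi(0)
             = \delta(\varphi).
\end{align*}
```

The jump of Y at the origin is what survives the integration, so the unit mass at the origin has the size of the jump.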
Example 3: The second derivative of the distribution corresponding to the function f(x) = |x| on $R^1$ is $2\delta$.

Example 4: We continue to take $\Omega = R^1$ and consider the distribution corresponding to the function f(x) where f(x) = 0 for $x \le 0$ and $f(x) = 1/\sqrt{x}$ for x > 0. The function is locally integrable, and is identified with the corresponding distribution in the canonical way. Using the definition of the derivative and integrating by parts we obtain
$$f'(\varphi) = -\int_0^\infty \frac{\varphi'(x)}{\sqrt{x}}\, dx = -\frac{1}{2} \int_0^\infty \frac{\varphi(x) - \varphi(0)}{x^{3/2}}\, dx,$$
a distribution which is clearly not a Radon measure. Away from the origin, however, the distribution derivative coincides with the usual derivative of the smooth function f(x).

Example 5: Let $\Omega$ be the space $R^n$ where $n \ge 3$; we consider the locally integrable function $E(x) = 1/[(2 - n)\, \omega_n |x|^{n-2}]$ which we have already met in the study of Newtonian potentials. The function E(x) is harmonic, except at
the origin where it has a singularity. To compute the distribution Laplacian we write
$$\Delta E(\varphi) = E(\Delta \varphi) = \frac{1}{(2 - n)\, \omega_n} \int \frac{\Delta \varphi(x)}{|x|^{n-2}}\, dx,$$
and since $\varphi$ is a $C^2$-function with compact support, the integral equals $\varphi(0)$. Thus the Laplacian of E is the Dirac delta.

Example 6: Let f(x) be a $C^1$-function on $\Omega$; for the distribution $\partial f / \partial x_k$ we have
$$\frac{\partial f}{\partial x_k}(\varphi) = -\int f(x)\, \frac{\partial \varphi}{\partial x_k}(x)\, dx$$
and by an integration by parts this becomes
$$\int \frac{\partial f}{\partial x_k}(x)\, \varphi(x)\, dx.$$
Thus the usual derivative of the function and the distribution derivative coincide when the functions and derivatives are identified in the canonical way. It follows more generally that for functions f(x) in $C^l$, the distribution derivative $D^\alpha f$ for $|\alpha| \le l$ coincides with the usual derivative of f(x) of the same order.

The theorem which follows is easy but important; we may fairly attribute it to Du Bois-Reymond.

Theorem: If f(x) and g(x) are continuous functions on $\Omega \subset R^n$ such that $\partial f / \partial x_k = g$ in the sense of distributions, then that equation also holds in the classical sense.
PROOF. We form the regularizations $f_\varepsilon(x)$ and $g_\varepsilon(x)$ and verify that $\partial f_\varepsilon(x) / \partial x_k = g_\varepsilon(x)$, a consequence of the definition of the distribution derivative. We may suppose that k = 1 and write
$$f_\varepsilon(x_1, x_2, \ldots, x_n) = f_\varepsilon(a, x_2, \ldots, x_n) + \int_a^{x_1} g_\varepsilon(t, x_2, \ldots, x_n)\, dt.$$
As $\varepsilon$ approaches 0, the functions converge uniformly on compact subsets, and so
$$f(x_1, x_2, \ldots, x_n) = f(a, x_2, \ldots, x_n) + \int_a^{x_1} g(t, x_2, \ldots, x_n)\, dt$$
and therefore f is $C^1$ in the variable $x_1$ and its derivative is the continuous function g(x).
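Returning to Example 3 of this section, the computation for f(x) = |x| on the real axis can be written out as follows. It also shows why continuity of both functions is needed in the theorem just proved. This is a sketch from the definition of the distribution derivative; the symbol sgn for the sign function is introduced here and is not notation from the text.

```latex
% First derivative: split the integral at the origin and integrate by parts.
\begin{align*}
f'(\varphi) &= -\int_{-\infty}^{\infty} |x|\, \varphi'(x)\, dx
             = \int_{-\infty}^{0} x\, \varphi'(x)\, dx - \int_{0}^{\infty} x\, \varphi'(x)\, dx \\
            &= -\int_{-\infty}^{0} \varphi(x)\, dx + \int_{0}^{\infty} \varphi(x)\, dx
             = \int_{-\infty}^{\infty} \operatorname{sgn}(x)\, \varphi(x)\, dx .
\end{align*}
% Second derivative: apply the definition once more.
\begin{align*}
f''(\varphi) &= -\int_{-\infty}^{\infty} \operatorname{sgn}(x)\, \varphi'(x)\, dx
              = \int_{-\infty}^{0} \varphi'(x)\, dx - \int_{0}^{\infty} \varphi'(x)\, dx
              = \varphi(0) + \varphi(0) = 2\, \delta(\varphi) .
\end{align*}
```

Here f is continuous but its distribution derivative sgn is not, so the theorem does not apply at the origin, in agreement with the fact that |x| has no classical derivative there.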
In the classical Calculus of Variations one considered a function F(x, y, z) of three variables which was sufficiently smooth and sought to minimize the integral
$$\int_a^b F(u(t), u'(t), t)\, dt$$
over the class of functions u(t) defined on the closed interval [a, b] and having continuous derivatives there. If we suppose that a solution u(t), which minimizes the integral, exists and belongs to the class $C^1$, then, for $\varepsilon$ in an interval about the origin, the function
$$G(\varepsilon) = \int_a^b F(u(t) + \varepsilon \varphi(t),\, u'(t) + \varepsilon \varphi'(t),\, t)\, dt,$$
$\varphi(t)$ being a testfunction on (a, b), has its minimum at $\varepsilon = 0$ and is differentiable. Thus G'(0) = 0, that is,
$$\int_a^b \Big[ \frac{\partial F}{\partial u}\, \varphi + \frac{\partial F}{\partial u'}\, \varphi' \Big]\, dt = 0,$$
the differentiation under the integral sign being legitimate, and this equation simply says that of the two continuous functions of t,
$$\frac{\partial F}{\partial u} \quad \text{and} \quad \frac{\partial F}{\partial u'},$$
one is the distribution derivative of the other. The previous theorem then asserts that it is a derivative in the classical sense. Thus the Euler-Lagrange equation is satisfied by the minimizing u(t):
$$\frac{d}{dt} \frac{\partial F}{\partial u'} = \frac{\partial F}{\partial u}.$$
The result of Du Bois-Reymond, obtained in the middle of the 19th century, asserted that, if a solution u(t) of the variational problem exists, then it actually satisfies the Euler-Lagrange equation. This remark shows how far back the idea of distributions goes.
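To see the argument at work on a concrete integrand, consider the illustrative choice $F(u, u', t) = \tfrac{1}{2} u'^2 + \tfrac{1}{2} u^2$. This F is an assumption made here for the example and does not come from the text; the lines below are a sketch of how the weak equation becomes a classical one.

```latex
% Illustrative integrand (an assumption): F(u,u',t) = (1/2) u'^2 + (1/2) u^2,
% so that \partial F/\partial u = u and \partial F/\partial u' = u'.
\begin{align*}
G'(0) = \int_a^b \bigl[ u(t)\, \varphi(t) + u'(t)\, \varphi'(t) \bigr]\, dt &= 0
  \quad \text{for every testfunction } \varphi \text{ on } (a,b), \\
\text{that is,} \qquad (u')'(\varphi) = -\int_a^b u'(t)\, \varphi'(t)\, dt
  &= \int_a^b u(t)\, \varphi(t)\, dt .
\end{align*}
% Hence (u')' = u in the sense of distributions. Since u and u' are both
% continuous, the theorem gives the classical equation
\begin{align*}
u''(t) = u(t) \quad \text{on } (a,b),
\end{align*}
% and in particular the minimizing u, assumed only C^1, is in fact C^2.
```

The gain of one derivative for u is exactly what the theorem of this section supplies; the hypothesis was only that u is continuously differentiable.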
20. Topology of Distributions

We denote the space of distributions on $\Omega$ by $\mathcal{D}' = \mathcal{D}'(\Omega)$; this space is a linear space over the same scalar field as the space of testfunctions $\mathcal{D}(\Omega)$. We should remark that both spaces are modules over the class of infinitely differentiable functions on $\Omega$; here we must put
$$(aT)(\varphi) = T(a\varphi), \quad \text{where } a = a(x) \in C^\infty(\Omega).$$
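We immediately verify that the Leibnitz formula for differentiation is valid when the distribution T is multiplied by the function a(x) as follows. If D is a first-order differential operator and $\varphi$ a testfunction then
$$D(aT)(\varphi) = -T(a\, D\varphi) = -T(D(a\varphi)) + T((Da)\varphi) = (a\, DT)(\varphi) + ((Da)T)(\varphi).$$
An easy induction argument then verifies the Leibnitz formula in general:
$$D^\alpha(aT) = \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!\, (\alpha - \beta)!}\, (D^\beta a)\, D^{\alpha - \beta} T.$$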
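A small instance of the Leibnitz formula, worked on the real axis with a(x) = x and T the Dirac $\delta$, may help fix the conventions. This is a sketch; the choice of a and T is made here for illustration only.

```latex
% Multiplication by a(x) = x annihilates \delta:
\begin{align*}
(x\delta)(\varphi) = \delta(x\varphi) = 0 \cdot \varphi(0) = 0 .
\end{align*}
% Direct computation of x\delta' from (aT)(\varphi) = T(a\varphi):
\begin{align*}
(x\delta')(\varphi) = \delta'(x\varphi) = -\frac{d}{dx}\bigl(x\varphi(x)\bigr)\Big|_{x=0}
  = -\varphi(0) = -\delta(\varphi) .
\end{align*}
% The first-order Leibnitz formula gives the same result:
\begin{align*}
0 = (x\delta)' = (Dx)\,\delta + x\,\delta' = \delta + x\delta'
  \quad\Longrightarrow\quad x\delta' = -\delta .
\end{align*}
```

Both routes agree, which is a useful check on the sign in the definition of the distribution derivative.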
The spaces $\mathcal{D}$ and $\mathcal{D}'$ have a natural pairing and the elements of either are linear functionals over the other. Thus there arises the weak or weak-star topology in each of these spaces. On the space of testfunctions the weak topology is defined by the system of neighborhoods indexed by a finite set of distributions and a positive $\varepsilon$:
$$V = V_{F,\varepsilon} = \text{all testfunctions } \varphi(x) \text{ for which } |T_k(\varphi)| < \varepsilon \text{ for all } T_k \text{ in } F,$$
where $F = [T_1, T_2, \ldots, T_l]$. The weak-star topology is defined on the space of distributions, the neighborhoods of the origin being defined by finite sets $F = [\varphi_1, \varphi_2, \ldots, \varphi_l]$ of testfunctions and a positive $\varepsilon$:
$$V = V_{F,\varepsilon} = \text{all distributions } T \text{ for which } |T(\varphi_k)| < \varepsilon \text{ for all } \varphi_k \text{ in } F.$$
We shall always take these spaces in the topologies just described. Since these are locally convex linear space topologies, the linear space operations are always continuous, as also is the operation of differentiation. We show this for the space of testfunctions. If $V_{F,\varepsilon}$ is a neighborhood of the origin defined by $\varepsilon > 0$ and the finite set of distributions $F = [T_1, T_2, \ldots, T_l]$, we set $G = [D^\alpha T_1, D^\alpha T_2, \ldots, D^\alpha T_l]$, and note that the differential operator $D^\alpha$ carries the neighborhood $V_{G,\varepsilon}$ into $V_{F,\varepsilon}$. Thus the operator is continuous. Just the same argument works to show the continuity of $D^\alpha$ in $\mathcal{D}'(\Omega)$. A similar argument shows that multiplication by the $C^\infty$-function a(x) is a continuous operation on $\mathcal{D}$ and on $\mathcal{D}'$.

It will soon be clear that the topology on $\mathcal{D}$ is not a metric one. We start by identifying the convergent sequences in $\mathcal{D}$.
Theorem: A sequence $\varphi_k(x)$ of testfunctions converges to 0 if and only if

(1) There exists a compact subset K of $\Omega$ such that supp $\varphi_k \subset K$ for all k.
(2) For all N, $\|\varphi_k\|_N$ converges to 0.

PROOF: If a sequence of testfunctions satisfies the hypotheses (1) and (2) and T is a distribution, there exist constants C and N associated with T and K such that $|T(\varphi)| \le C \|\varphi\|_N$ for all testfunctions $\varphi$ supported by K and in particular for the testfunctions of the sequence $\varphi_k$. Thus $T(\varphi_k)$ converges to 0, and therefore uniformly over any finite set of distributions $T_i$. It follows that the sequence converges weakly to 0.

The converse is more difficult; we must suppose that a sequence of testfunctions converges weakly to 0 and deduce that (1) and (2) are satisfied. We will argue by contradiction, often passing to appropriately selected subsequences, since a subsequence of a sequence converging to 0 also converges to 0. If (1) is not satisfied, the union of the supports of the testfunctions $\varphi_k$ is contained in no compact subset of $\Omega$. Hence, there exists a sequence of points in that union converging to infinity, or at least converging to a point of $R^n$ which is not in $\Omega$. Passing, if necessary, to a subsequence of the testfunctions, we obtain a sequence $\varphi_k(x)$ in $\mathcal{D}$ and a sequence $x_k$ in $\Omega$ such that $\varphi_k(x_k)$ is not 0, and $\varphi_k(x_j) = 0$ for j > k. We define a Radon measure on $\Omega$ by putting the mass $m_k$ at the point $x_k$; since only finitely many points of the sequence $x_k$ are contained in any compact subset of $\Omega$ it follows that any such compact has finite total mass, that is, however the numbers $m_k$ are chosen, we obtain a Radon measure. Select $m_k$ so that
$$T(\varphi_k) = \sum_{i=1}^{k} m_i\, \varphi_k(x_i) = 1;$$
this can always be done inductively. The distribution T, which is the Radon measure just defined, has then the property $T(\varphi_k) = 1$ and this contradicts the hypothesis that the sequence of testfunctions converged to 0.

In order to show (2), it will be enough to show that the sequence $\varphi_k$ is uniformly bounded, for this will mean that any sequence of derivatives $D^\alpha \varphi_k$ is also uniformly bounded, since the sequence $D^\alpha \varphi_k$ also converges weakly to 0. Thus the functions will be uniformly bounded and uniformly Lipschitzian. By the Ascoli-Arzela theorem, then, they will converge uniformly on the compact set K, which supports the functions of the sequence, and therefore uniformly on $R^n$. Since the same assertion will be valid for any of the sequences $D^\alpha \varphi_k$, it will follow that $\|\varphi_k\|_N$ converges to 0 for every N, as desired. If the sequence is not uniformly bounded, there exists a subsequence which we also write $\varphi_k$ having the property that $\|\varphi_k\|_\infty > 3^k$. We select points $x_k$ in $\Omega$
such that $|\varphi_k(x_k)| = \|\varphi_k\|_\infty$ and using the fact that the sequence converges to 0 pointwise and passing if need be to a further subsequence, we may require $|\varphi_k(x_j)| < 4^{-k}$. We again select masses $m_k$ which we put at the points $x_k$; we shall require $\sum |m_k| < \infty$ and therefore we shall construct a distribution T which is a measure of finite total mass for which $T(\varphi_k)$ does not converge to 0. We take $m_k = 3^{-k}$ and estimate
$$T(\varphi_l) = \sum_k m_k\, \varphi_l(x_k).$$
This quantity exceeds $3 - (1/4^l)$ in absolute value, hence it cannot converge to 0 with increasing l. The proof is complete.

In the theorem we have identified the sequences converging to 0; clearly, a sequence $\varphi_k(x)$ converges to $\psi(x)$ if and only if $\varphi_k(x) - \psi(x)$ converges to 0.

We should remark that the distributions are exactly the sequentially continuous linear forms on $\mathcal{D}$, because a distribution is continuous on that space, hence sequentially continuous. On the other hand, a linear functional $F(\varphi)$ on $\mathcal{D}$ which is sequentially continuous is a distribution, for if it were not, there would exist a compact subset K of $\Omega$ for which there was no possible choice of C and N so that $|F(\varphi)| \le C \|\varphi\|_N$ would hold for all testfunctions supported by K. Accordingly, for each N, there is a testfunction $\varphi_N$ having the property $1 = F(\varphi_N) \ge N \|\varphi_N\|_N$ where $\varphi_N$ has its support in K. Thus this sequence converges to 0 without $F(\varphi_N)$ converging to 0, contradicting the assumed sequential continuity of F. Thus, to check that a linear functional on $\mathcal{D}$ is a distribution, it is only required to check that it is sequentially continuous.

We can now show that $\mathcal{D}$ is not a metric space. If the topology of $\mathcal{D}$ were metrizable, we denote the metric by $\rho(\varphi, \psi)$ and consider an increasing sequence $K_n$ of compacts whose union is $\Omega$. For each $K_n$, we select a testfunction $\varphi_n(x)$ which is = 1 on $K_n$. Holding n fixed, we select $\lambda_n$ so that $\rho(\lambda_n \varphi_n, 0) < 2^{-n}$; this can always be done since $\lambda \varphi$ converges to 0 in $\mathcal{D}$ as $\lambda$ approaches 0. The sequence $\lambda_n \varphi_n(x)$ converges to 0 but does not satisfy hypothesis (1) of the previous theorem.

We have next an important theorem concerning the topology of $\mathcal{D}'$.
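Before turning to that theorem, the two conditions in the characterization of convergent sequences can be illustrated on the real axis. The sketch below fixes a testfunction $\varphi$ supported in $(-\tfrac{1}{2}, \tfrac{1}{2})$ with $\varphi(0) = 1$; this normalization and the names $\varphi_k$, $\eta_k$ and T are assumptions made here for the example.

```latex
% A sequence that converges to 0 in the space of testfunctions:
\begin{align*}
\varphi_k(x) = \frac{1}{k}\, \varphi(x): \qquad
\operatorname{supp} \varphi_k = \operatorname{supp} \varphi, \qquad
\|\varphi_k\|_N = \frac{1}{k}\, \|\varphi\|_N \to 0 \ \text{ for every } N .
\end{align*}
% A sequence satisfying (2) but not (1):
\begin{align*}
\eta_k(x) = \frac{1}{k}\, \varphi(x - k): \qquad
\|\eta_k\|_N = \frac{1}{k}\, \|\varphi\|_N \to 0, \qquad
\operatorname{supp} \eta_k \subset \bigl(k - \tfrac{1}{2},\, k + \tfrac{1}{2}\bigr) .
\end{align*}
% The Radon measure placing mass j at the integer j \ge 1 detects the failure:
\begin{align*}
T(\theta) = \sum_{j \ge 1} j\, \theta(j), \qquad
T(\eta_k) = \sum_{j \ge 1} \frac{j}{k}\, \varphi(j - k) = \frac{k}{k}\, \varphi(0) = 1 ,
\end{align*}
% since \varphi(j - k) = 0 for j \ne k. Thus \eta_k does not converge weakly to 0.
```

The second sequence tends to 0 in every norm $\|\cdot\|_N$, which is why those norms alone do not describe the topology of the space of testfunctions.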
Theorem: If a sequence $T_k$ of distributions has the property that for every testfunction $\varphi$ the sequence of numbers $T_k(\varphi)$ is Cauchy, then there exists a distribution $T_0$ such that $T_k$ converges to $T_0$.

PROOF: It is evident how $T_0$ is defined: we have $T_0(\varphi) = \lim_k T_k(\varphi)$ and this is obviously a linear functional on $\mathcal{D}$; we have to show that it is continuous. That is to say, that for each compact subset K of $\Omega$ there exist constants C and N so that $|T_0(\varphi)| \le C \|\varphi\|_N$ for every $\varphi$ supported by K. For this purpose we pass to the space $\mathcal{D}_K$, studied in Section 17; this is the space
of all testfunctions supported by K, that is, all $C^\infty$-functions in $\Omega$ vanishing outside K. The topology of $\mathcal{D}_K$ is defined by the family of seminorms
$$\|f\|_N = \sum_{|\alpha| \le N} \sup_x |D^\alpha f(x)|$$
and $\mathcal{D}_K$ is a complete metric space. On this space we consider the function
$$|||f||| = \sup_k |T_k(f)|$$
which is finite, because the numbers $T_k(f)$ form a Cauchy sequence, and which is nonnegative and positively homogeneous, in fact, a seminorm, since it is the supremum of a family of seminorms. Moreover, this function is lower semicontinuous since it is a supremum of continuous functions. Thus, if $S_j$ is the set of all f in $\mathcal{D}_K$ for which $|||f||| \le j$, $S_j$ is closed and convex in $\mathcal{D}_K$, and since that space is the union of the sets $S_j$, a category argument shows that at least one $S_j$ contains a sphere, that is, has an interior point. Since $S_j$ is symmetric about the origin, we may suppose that 0 is the interior point. Thus, if $\rho(f, g)$ represents the metric in $\mathcal{D}_K$, there exist M and $\varepsilon > 0$ so that
$$\rho(f, 0) < \varepsilon \quad \text{implies} \quad |||f||| \le M, \quad \text{that is} \quad |T_k(f)| \le M \ \text{ for all } k,$$
and therefore $|T_0(f)| \le M$ for all f in the sphere of radius $\varepsilon$ about 0. This is the assertion that $T_0$ is continuous on $\mathcal{D}_K$ and therefore that $|T_0(\varphi)| \le C \|\varphi\|_N$ for appropriate C and N.

The content of the theorem can be put in more technical language: The space $\mathcal{D}'$ is weak-star sequentially complete.
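A standard illustration of this completeness, sketched here on the real axis, is a sequence of locally integrable functions whose limit in the sense of distributions is not a function. The particular sequence $T_k$ below is chosen for this example and does not come from the text.

```latex
% T_k is the function equal to k on [0, 1/k] and to 0 elsewhere.
\begin{align*}
T_k(\varphi) &= k \int_0^{1/k} \varphi(x)\, dx, \\
|T_k(\varphi) - \varphi(0)| &\le k \int_0^{1/k} |\varphi(x) - \varphi(0)|\, dx
  \le k\, \sup_x |\varphi'(x)| \int_0^{1/k} x\, dx
  = \frac{1}{2k}\, \sup_x |\varphi'(x)| .
\end{align*}
% Hence T_k(\varphi) \to \varphi(0) for every testfunction \varphi: the sequence
% of numbers T_k(\varphi) is Cauchy, and the limit distribution is T_0 = \delta.
```

The functions $T_k$ converge to 0 at every point other than the origin, yet the limit guaranteed by the theorem is the unit mass at the origin.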
21. The Support of a Distribution

If T is a distribution, the support of T is the set of all points x in $\Omega$ such that for every neighborhood U of x there exists a testfunction $\varphi \in \mathcal{D}(U)$ so that $T(\varphi) \ne 0$. It is clear that the support is a closed set, since its complement is open, any point in the complement being surrounded by a neighborhood U so that $T(\varphi) = 0$ for all testfunctions supported by U. When the distribution is a continuous function, its support as a distribution and its support as a continuous function are the same set. Writing supp T for that support we also see
$$\operatorname{supp} D^\alpha T \subset \operatorname{supp} T \quad \text{and} \quad \operatorname{supp} a(x)T \subset \operatorname{supp} a \cap \operatorname{supp} T.$$
Let $\varphi$ be a testfunction whose support K is contained in the complement of supp T; it is then easy to show that $T(\varphi) = 0$, since K is covered by neighborhoods $U_i$ on which T vanishes in an obvious sense. We extend $U_i$ to a locally finite covering of some neighborhood of K and take the corresponding partition of unity $\psi_i$ to deduce $T(\varphi) = \sum_i T(\psi_i \varphi) = 0$.

When the distribution T has a compact support, we select a testfunction $\chi$ which is identically +1 on a neighborhood of supp T, and a compact set K containing the support of $\chi$. For any testfunction $\varphi$ in $\mathcal{D}(\Omega)$ the product $\chi\varphi$ is supported by K and $T(\varphi) = T(\chi\varphi)$ since $\varphi - \chi\varphi$ vanishes on a neighborhood of supp T. Accordingly, for an appropriate C and N associated with K we have
$$|T(\varphi)| \le C \|\chi\varphi\|_N$$
and since, from the Leibnitz rule for differentiation, there is an inequality of the form $\|\chi\varphi\|_N \le C \|\chi\|_N \|\varphi\|_N$ for all testfunctions $\varphi$, we have, finally,
$$|T(\varphi)| \le C \|\varphi\|_N$$
for all testfunctions in $\mathcal{D}(\Omega)$. Note that the constant C is now independent of the support of $\varphi$.

If $\varphi(x)$ is a testfunction and $\chi(x)$ another, which equals +1 on a neighborhood of the support of $\varphi$, we have
$$\varphi(x) = \varphi(x)\chi(x) = P(x)\chi(x) + \psi(x)$$
where P(x) is the polynomial which is the Taylor expansion of order N of $\varphi$ about the origin. From the considerations of Section 11, it is clear that the remainder term is a testfunction $\psi$ which satisfies an inequality of the form $|\psi(x)| \le C |x|^{N+1}$. When the function $\varphi(x)$ vanishes at the origin with all of its derivatives of order $\le N$ then P(x) = 0 and $\varphi$ coincides with $\psi$; we have $|\varphi(x)| \le C |x|^{N+1}$ and for $|\alpha| \le N$ we have $|D^\alpha \varphi(x)| \le C |x|^{N+1-|\alpha|}$ with an appropriate constant C.

Theorem: Let T be a distribution of order N supported by the set F and let $\varphi(x)$ be a testfunction which, together with all of its derivatives of order $\le N$, vanishes on F. Then $T(\varphi) = 0$.

PROOF: We let $F_\varepsilon$ denote the set of points x whose distance from F is at most $\varepsilon$ and let $\chi(x)$ be the regularization of order $\varepsilon$ of the characteristic function of $F_{2\varepsilon}$; this is a $C^\infty$-function which equals +1 on a neighborhood of F = supp T. Accordingly, $T(\varphi) = T(\chi\varphi)$. Since $\chi$ vanishes outside $F_{4\varepsilon}$, the product $\chi\varphi$ is supported by $F_{4\varepsilon}$ and vanishes on F. If K is a compact containing
the support of $\varphi$ we find then that $|T(\varphi)| = |T(\chi\varphi)| \le C \|\chi\varphi\|_N$ where C depends, of course, on K, and we shall show that the quantity on the right converges to 0 with $\varepsilon$, hence that $T(\varphi) = 0$. If $\varphi_\varepsilon(x)$ is the regularizing function and $\psi_\varepsilon(x) = D^\alpha \varphi_\varepsilon(x)$, then
and
and therefore there exists a constant $C_1$ depending only on F such that for all $\alpha$ with $|\alpha| \le N$ and x in $F_{4\varepsilon}$, $|D^\alpha \chi(x)| \le C_1 \varepsilon^{-|\alpha|}$. Moreover, from the hypothesis made concerning the testfunction $\varphi(x)$ and the remarks preceding the theorem, we have a constant $C_2$ such that for all x in $F_{4\varepsilon}$, $|D^\alpha \varphi(x)| \le C_2 \varepsilon^{N+1-|\alpha|}$. Hence, to estimate $\|\chi\varphi\|_N$, we consider the individual terms in that norm, namely, $\|D^\alpha(\chi\varphi)\|_\infty$ and write, using the Leibnitz rule,
$$D^\alpha(\chi\varphi) = \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!\, (\alpha - \beta)!}\, (D^\beta \chi)(D^{\alpha - \beta} \varphi).$$
We compute the supremum over $F_{4\varepsilon}$ to find that this is at most
$$C_3\, \varepsilon^{N+1-|\alpha|}$$
for a suitable constant $C_3$ depending only on F and the testfunction $\varphi$. Thus, $\|D^\alpha(\chi\varphi)\|_\infty$ converges to 0 with $\varepsilon$; $\|\chi\varphi\|_N$ also converges to 0, therefore $T(\varphi)$ does too. The proof is complete.

An important consequence of this theorem is the following corollary.

Corollary: If the distribution T is supported by a point, it is a finite linear combination of derivatives of the unit mass at that point.

PROOF: We may suppose that the point is the origin and that T is of order N. For any testfunction $\varphi(x)$ we write its Taylor expansion
$$\varphi(x) = P(x)\chi(x) + \psi(x)$$
where $\chi(x)$ is a testfunction which is identically +1 on a neighborhood of the support of $\varphi(x)$ and the testfunction $\psi(x)$ vanishes at the origin, together with
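all derivatives of order at most N. We must then have $T(\psi) = 0$ and
$$T(\varphi) = T(P\chi) = \sum_{|\alpha| \le N} \frac{D^\alpha \varphi(0)}{\alpha!}\, T(x^\alpha \chi(x)),$$
and if we set $C_\alpha = (1/\alpha!)\, T(x^\alpha \chi(x))$, we have
$$T(\varphi) = \sum_{|\alpha| \le N} C_\alpha\, D^\alpha \varphi(0)$$
and
$$T = \sum_{|\alpha| \le N} (-1)^{|\alpha|}\, C_\alpha\, D^\alpha \delta.$$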
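One typical use of the corollary, sketched here on the real axis, is the solution of the equation xT = 0. The argument below is an illustration added for this purpose; it writes $\delta^{(k)}$ for the k-th derivative of $\delta$ and uses only the definitions of multiplication and differentiation of distributions.

```latex
% Step 1: supp T is contained in {0}. If \varphi is a testfunction vanishing near 0,
% then \varphi(x)/x is again a testfunction, and
\begin{align*}
T(\varphi) = T\bigl(x \cdot \varphi/x\bigr) = (xT)(\varphi/x) = 0 .
\end{align*}
% Step 2: by the corollary, T = \sum_{k=0}^{N} c_k \delta^{(k)}.
% Step 3: compute x\delta^{(k)} from the definitions:
\begin{align*}
(x\delta^{(k)})(\varphi) = (-1)^k \frac{d^k}{dx^k}\bigl(x\varphi(x)\bigr)\Big|_{x=0}
  = (-1)^k\, k\, \varphi^{(k-1)}(0) = -k\, \delta^{(k-1)}(\varphi) .
\end{align*}
% Step 4: therefore
\begin{align*}
0 = xT = -\sum_{k=1}^{N} k\, c_k\, \delta^{(k-1)} ,
\end{align*}
% and testing on \varphi(x) = x^{j}\chi(x), with \chi = 1 near 0, gives c_{j+1} = 0
% for j = 0, \ldots, N-1. Hence T = c_0 \delta.
```

Conversely every multiple of $\delta$ satisfies the equation, since $(x\delta)(\varphi) = 0 \cdot \varphi(0) = 0$.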
In Section 17 we introduced the space $\mathcal{E} = \mathcal{E}(\Omega)$ of all $C^\infty$-functions a(x) defined in $\Omega$ with the topology determined by the family of seminorms
$$\|a\|_{K,N} = \sum_{|\alpha| \le N} \sup_{x \in K} |D^\alpha a(x)|$$
where K is a compact subset of $\Omega$. $\mathcal{E}$ is a complete metric space, and as we have seen, the space of testfunctions and the space of distributions are modules over $\mathcal{E}$. Let F(a) be a continuous linear functional on $\mathcal{E}$; since F must be continuous relative to some seminorm in the family, it follows that there exists a compact subset K of $\Omega$ and an integer N so that $|F(a)| \le C \|a\|_{K,N}$ for all a in $\mathcal{E}$ and an appropriate constant C. Thus, if a is a testfunction with support disjoint from K, $\|a\|_{K,N} = 0$, and F(a) = 0. When the linear functional F is restricted to the space of testfunctions, it is evidently a distribution, and indeed has compact support. Thus the functional F determines a corresponding distribution with compact support.

On the other hand, if T is a distribution with compact support K, we can extend it to a linear functional on $\mathcal{E}$; select a testfunction $\chi(x)$ which is equal to +1 on a neighborhood of K, and define $F(a) = T(\chi a)$ to obtain a linear functional F defined on $\mathcal{E}$. We will have
$$|F(a)| \le C \|\chi a\|_N \le C' \|a\|_{K',N},$$
where K' is the support of $\chi$; thus F is continuous on $\mathcal{E}$. The functional F is uniquely determined by T; had we chosen another testfunction $\chi_1(x)$ equal to +1 on a neighborhood of the support of T to obtain another functional $F_1(a)$, we would have
$$F(a) - F_1(a) = T((\chi - \chi_1)a) = 0,$$
since the testfunction $(\chi - \chi_1)a$ vanishes on a neighborhood of the support of T. It is therefore legitimate for us to identify the distributions with compact support with the space of continuous linear functionals on $\mathcal{E}$; we shall do this regularly in the sequel, and denote the class of such distributions by $\mathcal{E}'$.
22. Distributions in One Dimension We consider 0 an open connected subset of R'; it is therefore an open interval (a, h), where the length of the interval may be infinite. We denote differentiation by primes. Our first result is a well-known theorem in Calculus.
Theorem:
If T' = 0, then T is a constant.
PROOF: Of course this means that $T$ is the distribution corresponding to the constant function. $T' = 0$ means $T(\varphi') = 0$ for all testfunctions $\varphi$. Now a testfunction $\varphi(x)$ is the derivative of another testfunction if and only if $\int\varphi(x)\,dx = 0$, since it is the derivative of $\int_{-\infty}^{x}\varphi(t)\,dt$. To prove the theorem, we choose an arbitrary testfunction $\chi$ for which $\int\chi(x)\,dx = 1$ and write, for any testfunction $\varphi(x)$,
\[ \varphi(x) = \Big[\varphi(x) - \int\varphi(t)\,dt\;\chi(x)\Big] + \int\varphi(t)\,dt\;\chi(x). \]
The first term is a testfunction, the integral of which is 0; it is therefore the derivative of another testfunction and the distribution $T$ vanishes for the testfunction in brackets. Thus $T(\varphi) = \int\varphi(t)\,dt\;T(\chi)$ and $T$ corresponds to the constant $T(\chi)$.

Corollary: If $a$ is a constant, the differential equation $T' = aT$ has only the solution $T = Ce^{ax}$, that is, the classical one.
PROOF: We write $S = e^{-ax}T$; this is a distribution since the exponential is a $C^\infty$-function and we can multiply distributions by such functions. We find $S' = 0$, hence $S = C$.

The device which we have used in the proof of the theorem shows that there always exists a primitive distribution, that is, given a distribution $T$, there always exists a distribution $S$ such that $S' = T$. In view of our theorem, $S$ is determined only up to an additive constant. We can construct $S$ by writing
\[ S(\varphi) = -T(\psi), \qquad \psi(x) = \int_{-\infty}^{x}\Big[\varphi(t) - \int\varphi(s)\,ds\;\chi(t)\Big]\,dt; \]
this is clearly linear in $\varphi$ and it is easy to verify that $S(\varphi_k)$ converges to 0 if the sequence $\varphi_k$ converges to 0 in $\mathcal{D}$; thus $S$ is a distribution, and obviously $S' = T$.
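As a worked illustration, here is a sketch of the primitive of $T = \delta$. It assumes the primitive is taken in the form $S(\varphi) = -T(\psi)$ with $\psi(x) = \int_{-\infty}^{x}[\varphi(t) - (\int\varphi)\,\chi(t)]\,dt$, where $\chi$ is the fixed testfunction of integral 1 from the proof of the theorem, and it writes $Y$ for the function equal to 1 for $x > 0$ and 0 for $x < 0$:

```latex
\begin{align*}
S(\varphi) &= -\delta(\psi) = -\psi(0)
  = -\int_{-\infty}^{0}\varphi(t)\,dt
    + \Big(\int\varphi(t)\,dt\Big)\int_{-\infty}^{0}\chi(t)\,dt \\
 &= \int_{0}^{\infty}\varphi(t)\,dt
    - \Big(\int_{0}^{\infty}\chi(t)\,dt\Big)\int\varphi(t)\,dt ,
\end{align*}
% using \int_{-\infty}^{0}\chi = 1 - \int_{0}^{\infty}\chi and
% -\int_{-\infty}^{0}\varphi = \int_{0}^{\infty}\varphi - \int\varphi .
% Hence S = Y - d with the constant d = \int_{0}^{\infty}\chi(t)\,dt .
```

So under these assumptions the primitive of $\delta$ is the function $Y$ up to an additive constant, and a different choice of $\chi$ changes only that constant, as the theorem predicts.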
It is important to notice the following: if the sequence of testfunctions $\varphi_k$ is supported by the compact interval $[a', b']$, and if that sequence, with all derivatives of order $\le N - 1$, converges uniformly to 0, where $N$ is the order of $T$ relative to $[a', b']$, then $S(\varphi_k)$ converges to 0. Thus $S$ is of order at most $N - 1$ on $[a', b']$ if $T$ is of order $N$ there. It follows that $T$ is the $N$th derivative of a distribution of order 0 on $[a', b']$. We shall improve this result presently.
Theorem:
$T' \ge 0$ if and only if $T$ is a monotone increasing function.
PROOF: We know that $T' \ge 0$ if and only if it is a positive Radon measure, that is, $T'(\varphi) = \int\varphi(x)\,d\mu(x)$; we integrate this Stieltjes integral by parts, noting that the integrated term vanishes, since the testfunction vanishes outside some interval, to obtain $-T(\varphi') = -\int\varphi'(x)\mu(x)\,dx$. Accordingly, $T$ corresponds to the monotone nondecreasing function $\mu(x) + C$ for some constant $C$. Conversely, any monotone nondecreasing $\mu(x)$ has a positive measure for its derivative.

Corollary: $T'' \ge 0$ if and only if $T'$ is a monotone increasing function, and this occurs if and only if $T$ is a convex function.

We need only the classical fact that the convex functions are exactly the integrals of nondecreasing functions. Since a distribution of order 0 is the difference of two positive measures, it corresponds to the derivative of the difference of two monotone functions and to the second derivative of the difference of two convex functions. This, coupled with our earlier remarks concerning primitives of distributions, gives rise to the following theorem.
Theorem: A distribution $T$ of order $N$ is the $(N + 2)$nd derivative of a continuous function.

Theorem: Let the distribution $T$ satisfy the differential equation
\[ T^{(n)} + a_{n-1}(x)T^{(n-1)} + a_{n-2}(x)T^{(n-2)} + \cdots + a_1(x)T' + a_0(x)T = f(x), \]
where $f(x)$ is continuous and the coefficients $a_k(x)$ are in $\mathcal{E}$; then $T$ is a $C^n$-function, hence a classical solution to the equation.
PROOF: We first note that the distribution $T^{(n)}$ is of order 0, for if it were of order $N > 0$, the terms $T^{(n-k)}$ would be of order $N - k$, and thus all terms in the equation except the first would be of order at most $N - 1$, whence $T^{(n)}$ would be of order $N - 1$, a contradiction. Since this is so, on any compact subinterval the terms $T^{(n-k)}$ are all functions of bounded variation for $k \ge 1$, and therefore $T^{(n)}$ is the sum of a continuous function and one of bounded variation on any such subinterval; in particular, $T^{(n)}$ is bounded and $T^{(n-1)}$ is Lipschitzian on such subintervals, and therefore continuous. Thus $T^{(n)}$ finally appears as a continuous function.

This theorem, as well as the previous ones, shows that there is nothing essentially to be gained by the use of distributions for the study of ordinary differential equations. The situation is quite different when the equations to be studied are partial differential equations, however. We should also remark that there exist theorems showing that a distribution $T$ in more than one variable is locally of the form $P(D)f$, where $P(D)$ is a differential operator with constant coefficients, and $f(x)$ is a continuous function. We do not give the proof, although it is not hard, since the theorem would serve only to show that the class of distributions will arise naturally when we seek a class containing the continuous functions and closed under differentiation.

We consider finally the division problem in one dimension. In general, given a distribution $T$ and a $C^\infty$-function $a(x)$, the division problem is the problem of finding a distribution $S$ such that $aS = T$. If the function $a(x)$ never vanishes, an obvious solution is $(1/a)T$ and this solution is unique. On the other hand, when $a(x)$ does have zeros, a solution $S$, if it exists, cannot be unique, since any measure $\mu$ supported by the zero set of $a(x)$ satisfies the equation $a\mu = 0$. The study of the division problem requires a careful study of the behavior of $a(x)$ on and near the set where it vanishes. We consider here only the simplest possible case, where $\Omega$ is a subinterval of the real axis and the function $a(x)$ has a simple, isolated zero. Evidently, we may suppose that the point in question is the origin, and $a(x) = x$.
For the solution of the division problem, we select a testfunction $\chi(x)$ equal to $+1$ in a neighborhood of the origin and form the corresponding Taylor expansion of the testfunction $\varphi(x)$, taking that expansion only to one term:
\[ \varphi(x) = \varphi(0)\chi(x) + x\psi(x); \]
we set
\[ S(\varphi) = T(\psi) \]
and note that this is obviously a linear functional on $\mathcal{D}$. It is also a distribution since it is sequentially continuous: when the testfunction $\varphi$ converges to 0 in $\mathcal{D}$, the testfunction $\psi$ also converges to 0. The solution $S$ is determined up to a multiple of the $\delta$-distribution; this is easy to see, since the difference of any two solutions must be supported by the origin, and hence is necessarily of the
form $P(D)\delta$; now $xP(D)\delta = 0$ means that the polynomial $P(D)$ consists only of the constant term, since
\[ xD^{l}\delta = -l\,D^{l-1}\delta \quad\text{for all } l \ge 1. \]
As an example, we can consider $T$ as the $\delta$-distribution in $R^1$; dividing it by $x$ we obtain $S = -\delta' + C\delta$ where the constant $C$ is arbitrary. A more interesting example is obtained when the distribution $T$ is defined by the function $1/\sqrt{x}$ on the right half-axis and vanishing on the left and we divide by $-2x$; for the quotient, we obtain
\[ S(\varphi) = -\frac{1}{2}\int_0^\infty x^{-3/2}\,[\varphi(x) - \varphi(0)\chi(x)]\,dx + C\varphi(0), \]
where the constant $C$ is arbitrary. For an appropriate choice of $C$ we obtain the distribution determined earlier as the derivative $T'$.

Using partitions of unity, we can easily extend our argument to cover the case of division when $\Omega$ is the whole real axis and $a(x)$ is a smooth function whose zeros are all of finite order and all isolated. For example, it is easy to see that we can divide on $R^1$ by the function $\sin x$, the solution being determined up to a Radon measure supported by the zeros of that function.
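The first example can be checked by hand. The following sketch assumes only the usual conventions $(aS)(\varphi) = S(a\varphi)$ and $\delta'(\varphi) = -\varphi'(0)$, together with the one-term expansion $\varphi(x) = \varphi(0)\chi(x) + x\psi(x)$ in which $\chi = 1$ near the origin:

```latex
\begin{align*}
\big(x(-\delta' + C\delta)\big)(\varphi)
  &= -\delta'(x\varphi) + C\,\delta(x\varphi)
   = (x\varphi)'(0) + 0
   = \varphi(0) = \delta(\varphi), \\
\delta(\psi) &= \psi(0)
   = \lim_{x \to 0}\frac{\varphi(x) - \varphi(0)\chi(x)}{x}
   = \varphi'(0) = -\delta'(\varphi).
\end{align*}
```

The first line confirms that $-\delta' + C\delta$ solves $xS = \delta$ for every $C$; the second shows that evaluating $\delta$ on the remainder $\psi$ produces the particular solution $-\delta'$.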
23. Homogeneous Distributions

Let $f(x)$ be a function which is locally integrable on $R^n$ and suppose that $l$ is a linear transformation of $R^n$ into itself having an inverse. The mapping $l$ induces a new function: $(f \circ l)(x) = f(lx)$ which is also locally integrable. If we think of $f(x)$ as a distribution, then for any testfunction $\varphi$, we have
\[ \int f(lx)\varphi(x)\,dx = \frac{1}{|\det l|}\int f(y)\varphi(l^{-1}y)\,dy, \]
and this formula shows how the composition of a distribution with $l$ should be defined:
\[ (T \circ l)(\varphi) = \frac{1}{|\det l|}\,T(\varphi \circ l^{-1}). \]
It is obvious that the composition with $l$ defines a linear mapping of the space
of distributions onto itself, so long as the distributions we consider are taken in domains invariant under $l$. If $l$ is the mapping given by $-I$ where $I$ is the identity, $l$ carries $x$ into $-x$ and $l^2 = I$. In this case, $|\det l| = 1$ and we write $T \circ (-I) = \check{T}$. Clearly $\check{T}(\varphi) = T(\check{\varphi})$, where, of course, $\check{\varphi}(x) = \varphi(-x)$. Similarly, if $l$ is any orthogonal transformation of $R^n$ into itself, the absolute value of the determinant is 1 and the formula for the composition is quite simple. An important case arises when $l$ is $l_\varepsilon = \varepsilon I$ for some positive number $\varepsilon$ and the distributions are defined on the whole space; here $|\det l_\varepsilon| = \varepsilon^n$ and
\[ (T \circ l_\varepsilon)(\varphi) = \varepsilon^{-n}\,T(\varphi \circ l_\varepsilon^{-1}). \]
A distribution $T$ is homogeneous of order $k$, if for all positive $\varepsilon$, $T \circ l_\varepsilon = \varepsilon^k T$. Note that this extends the usual definition when we speak of homogeneous functions. It is also important to notice that the delta distribution is homogeneous of order $-n$. Moreover, for transformations $l$ of this form, $D^\alpha(\varphi \circ l_\varepsilon^{-1}) = \varepsilon^{-|\alpha|}(D^\alpha\varphi) \circ l_\varepsilon^{-1}$ and therefore, if $T$ is homogeneous of order $k$, $D^\alpha T$ is homogeneous of order $k - |\alpha|$.

The notation which we have introduced is not particularly handy, and so we will use it but little; the idea, although quite simple, is very important. It is obvious that it will be often convenient to speak of distributions on $R^3$ which are invariant under rotation or under reflection through the origin, or which are invariant for other groups of transformations, as well as those which are homogeneous of a certain order. In every such case the meaning of invariance will be that the composition of the distribution with transformations $l$ in the group leaves the distribution fixed, that is, $T \circ l = T$.

It is also convenient to consider another group of transformations of $R^n$ into itself: the translation group. For any function $f(x)$ defined on $R^n$ and any vector $h$ in that space, we define
\[ (\tau_h f)(x) = f(x - h), \]
and it is clear that these translation operators form a group isomorphic with $R^n$:
\[ \tau_0 = \text{identity}, \qquad \tau_{-h} = \tau_h^{-1}, \qquad \tau_{h+k} = \tau_h\tau_k = \tau_k\tau_h. \]
Since this group of operators maps the space of testfunctions into itself, we extend it to a group of operators mapping the space of distributions onto itself by the definition
\[ (\tau_h T)(\varphi) = T(\tau_{-h}\varphi); \]
when $T$ is a function, this definition coincides with our initial one, since, $T$ being given by $f(x)$,
\[ (\tau_h T)(\varphi) = \int f(x)\varphi(x + h)\,dx = \int f(x - h)\varphi(x)\,dx. \]
The translation operators commute with the differential operator for functions, in particular testfunctions, and therefore also for distributions:
\[ (\tau_h D^\alpha T)(\varphi) = D^\alpha T(\tau_{-h}\varphi) = T((-1)^{|\alpha|}D^\alpha\tau_{-h}\varphi) = (-1)^{|\alpha|}T(\tau_{-h}D^\alpha\varphi) = (-1)^{|\alpha|}\,\tau_h T(D^\alpha\varphi) = (D^\alpha\tau_h T)(\varphi). \]
It is also clear that the translation operators are sequentially continuous linear transformations of the space of testfunctions into itself, since if $\varphi_k$ converges to 0 in $\mathcal{D}$, the sequence $\tau_h\varphi_k$ also converges to 0; thus, the distribution $\tau_h T$ introduced above actually is a distribution, that is, has the continuity properties required of a distribution. It follows immediately that if the sequence of distributions $T_k$ converges to a limit $T_0$, then $\tau_h T_k$ converges to $\tau_h T_0$ and the translation operator is sequentially continuous on the space of distributions.

Let $h$ be a point on the positive $x_i$-axis in $R^n$ and $\varphi$ a testfunction; we form the difference quotient
\[ \frac{\varphi(x + h) - \varphi(x)}{|h|} \]
and as $|h|$ diminishes to 0 we obtain a sequence of testfunctions supported by a fixed compact, and these testfunctions converge pointwise to the testfunction $\partial\varphi/\partial x_i$. This sequence of testfunctions is uniformly bounded, since the mean value theorem guarantees that the quotient is equal to $(\partial\varphi/\partial x_i)(x + \theta h)$ for an appropriate value of $\theta$ in the interval $[0, 1]$, hence these quotients are all bounded by $\|\varphi\|_1$. We could equally well have argued with the testfunction $D^\alpha\varphi$ to have obtained a bounded sequence of testfunctions
\[ \frac{D^\alpha\varphi(x + h) - D^\alpha\varphi(x)}{|h|} \]
converging pointwise to $D^\alpha\,\partial\varphi/\partial x_i$. Hence we may invoke the Arzela-Ascoli theorem to deduce that the sequence of difference quotients converges in the space of testfunctions to the testfunction $\partial\varphi/\partial x_i$. It follows in general that for the space $\mathcal{D}$, the differential operators are the limits of difference quotients. The same conclusion also holds for distributions:
\[ \frac{\tau_{-h}T - T}{|h|} \quad\text{converges to}\quad \frac{\partial T}{\partial x_i}. \]
If the point $h$ in $R^n$ approaches a limit $k$ through a countable set of values, then for any testfunction $\varphi$, the sequence of testfunctions $\tau_h\varphi$ is uniformly bounded and the supports of these functions are contained in a fixed compact set. It is perfectly clear that the testfunctions converge pointwise to $\tau_k\varphi$ and that the derivatives have the same property; thus the sequence converges in $\mathcal{D}(R^n)$, since the Ascoli-Arzela theorem guarantees that the convergence of these sequences is uniform. Hence for any distribution $T$, the numerically valued function $f(x) = T(\tau_x\varphi)$ is continuous in $x$, and indeed this function is even differentiable, since the difference quotient
\[ \frac{f(x + h) - f(x)}{|h|}, \qquad h \text{ on the positive } x_i\text{-axis}, \]
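converges to $\tau_{-x}T(-\partial\varphi/\partial x_i)$. This argument may be repeated for higher derivatives and hence $f(x)$ is infinitely differentiable.

Let $\varphi(x)$ be a testfunction; the family $\varphi \circ l_\varepsilon$ then varies continuously in $\mathcal{D}$, and indeed, differentiably, since for a fixed $\varepsilon$ and small positive $t$ the difference quotients
\[ \frac{\varphi(\varepsilon x + tx) - \varphi(\varepsilon x)}{t} \]
form a family of testfunctions supported by some fixed ball of radius $R$ and uniformly bounded by $R\|\varphi\|_1$. The bound is obtained from the mean value theorem, since at any point $x$ the quotient is
\[ \sum_{j=1}^{n} x_j\,\frac{\partial\varphi}{\partial x_j}(\varepsilon x + \theta tx) \]
for an appropriate $\theta$ in the unit interval. Since the corresponding quotients for the functions $D^\alpha\varphi(x)$ have the same property, it follows that the functions
\[ \frac{\varphi(\varepsilon x + tx) - \varphi(\varepsilon x)}{t} \]
converge in $\mathcal{D}$ to
\[ \sum_{j=1}^{n} x_j\,\frac{\partial\varphi}{\partial x_j}(\varepsilon x) \]
as $t$ diminishes to 0. Thus, for any distribution $T$ and any testfunction $\varphi$, the function $T(\varphi \circ l_\varepsilon)$ is a differentiable function of $\varepsilon$ and its derivative is
\[ \frac{d}{d\varepsilon}\,T(\varphi \circ l_\varepsilon) = T\Big(\sum_{j=1}^{n} x_j\Big(\frac{\partial\varphi}{\partial x_j} \circ l_\varepsilon\Big)\Big). \]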
Theorem: A distribution $T$ is homogeneous of degree $k$, if and only if it satisfies the Euler equation
\[ kT = \sum_{j=1}^{n} x_j\,\frac{\partial T}{\partial x_j}. \]
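PROOF: If $T$ is homogeneous of degree $k$, then
\[ T(\varphi \circ l_\varepsilon) = \varepsilon^{-n}(T \circ l_\varepsilon^{-1})(\varphi) = \varepsilon^{-(n+k)}T(\varphi). \]
The derivative of this function at $\varepsilon = 1$ is then
\[ -(n + k)T(\varphi) = T\Big(\sum_{j=1}^{n} x_j\,\frac{\partial\varphi}{\partial x_j}\Big) = -\sum_{j=1}^{n}\frac{\partial}{\partial x_j}(x_j T)(\varphi), \]
and so
\[ -(n + k)T = -nT - \sum_{j=1}^{n} x_j\,\frac{\partial T}{\partial x_j}, \]
from which follows the Euler equation. Conversely, if $T$ satisfies the Euler equation, we write $\gamma(\varepsilon) = T(\varphi \circ l_\varepsilon)$ to obtain
\[ \gamma'(\varepsilon) = T\Big(\sum_{j=1}^{n} x_j\Big(\frac{\partial\varphi}{\partial x_j} \circ l_\varepsilon\Big)\Big), \]
and since
\[ \frac{\partial}{\partial x_j}(\varphi \circ l_\varepsilon) = \varepsilon\,\frac{\partial\varphi}{\partial x_j} \circ l_\varepsilon, \]
we find
\[ \gamma'(\varepsilon) = -\frac{n + k}{\varepsilon}\,\gamma(\varepsilon). \]
Thus $\gamma(\varepsilon) = C\varepsilon^{-(n+k)}$ with $C = \gamma(1) = T(\varphi)$, and therefore
\[ \varepsilon^{-(n+k)}T(\varphi) = T(\varphi \circ l_\varepsilon) = \varepsilon^{-n}(T \circ l_\varepsilon^{-1})(\varphi), \]
whence $T \circ l_\varepsilon^{-1} = \varepsilon^{-k}T$ and $T$ is homogeneous of degree $k$.

Since $\delta$ is homogeneous of degree $-n$, we must have
\[ -n\delta = \sum_{j=1}^{n} x_j\,\frac{\partial\delta}{\partial x_j} \]
and it is easy to verify this equation directly.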
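The direct verification can be written out in a few lines. This sketch uses only the definitions of multiplication by a smooth function and of the derivative of a distribution, $(x_j S)(\varphi) = S(x_j\varphi)$ and $(\partial S/\partial x_j)(\varphi) = -S(\partial\varphi/\partial x_j)$:

```latex
\begin{align*}
\Big(x_j\,\frac{\partial\delta}{\partial x_j}\Big)(\varphi)
  &= \frac{\partial\delta}{\partial x_j}(x_j\varphi)
   = -\delta\Big(\frac{\partial}{\partial x_j}(x_j\varphi)\Big)
   = -\Big(\varphi + x_j\,\frac{\partial\varphi}{\partial x_j}\Big)(0)
   = -\varphi(0), \\
\sum_{j=1}^{n}\Big(x_j\,\frac{\partial\delta}{\partial x_j}\Big)(\varphi)
  &= -n\,\varphi(0) = (-n\delta)(\varphi).
\end{align*}
```

Each of the $n$ terms contributes $-\varphi(0)$, which is where the degree $-n$ comes from.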
If the distribution $T$ is homogeneous of degree $k$ on the real axis, it is necessarily a function of the form $C|x|^k$ away from the origin, since the Euler equation reduces to $kT = xT'$ and we have shown that such equations only have the classical solutions. Note, however, that different constants $C$ may occur for the left and right half-axes.

Theorem: If $T$ is a distribution on $R^n$ and is invariant under the orthogonal group and is homogeneous of degree $k$, then, away from the origin, $T$ has the form $C|x|^k$. (Here, $n \ge 2$ of course.)
PROOF: If $\psi(r)$ is a testfunction in one dimension supported by the half-axis $0 < r < \infty$, the function
\[ (\mathcal{Y}\psi)(x) = \varphi(x) = \psi(|x|) \]
is a testfunction in $R^n$, vanishing in a neighborhood of the origin. The mapping $\mathcal{Y}$ determined by this equation is a continuous linear transformation of the testfunctions on the half-axis into $\mathcal{D}(R^n)$, and therefore, the linear functional
\[ S(\psi) = T(\mathcal{Y}\psi) \]
is a distribution on $r > 0$. Since $\mathcal{Y}(\psi \circ l_\varepsilon) = (\mathcal{Y}\psi) \circ l_\varepsilon$ and therefore $S(\psi \circ l_\varepsilon) = T(\mathcal{Y}\psi \circ l_\varepsilon)$, the hypothesis that $T$ is homogeneous of order $k$ leads to the equation $S \circ l_\varepsilon = \varepsilon^{k+n-1}S$, and a slight modification of the proof of the previous theorem shows that $S = cr^{k+n-1}$ for an appropriate constant $c$. It follows that the distribution $U$ defined by
\[ U(\varphi) = T(\varphi) - C\int |x|^k\varphi(x)\,dx, \quad\text{where } C = c/\omega_n, \]
is a distribution defined on the domain $|x| > 0$, which vanishes on all testfunctions that are functions only of radius $|x|$; moreover, $U$ is invariant under the orthogonal group $O(n)$. It remains to show that $U$ is identically 0. Let $\varphi(x)$ be an arbitrary testfunction vanishing for $|x| < \varepsilon$; we average it over the orthogonal group, forming the Maak sums $\varphi_N(x)$. The functions $\varphi_N(x)$ are testfunctions and converge with increasing $N$ as testfunctions to a limit $\varphi_0(x)$ which depends only on radius. Since $U$ is invariant under $O(n)$,
\[ U(\varphi) = U(\varphi_N) = \lim U(\varphi_N) = U(\varphi_0) = 0. \]
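The scaling relation $T(\varphi \circ l_\varepsilon) = \varepsilon^{-(n+k)}T(\varphi)$ used throughout this section can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: $n = 1$, $T$ is the function $|x|^k$ with $k = 1/2$, $\varphi$ is the standard bump $\exp(-1/(1 - x^2))$, and the integral is approximated by a midpoint rule. The names `bump` and `pairing` are hypothetical helpers, not notation from the text.

```python
import math


def bump(x):
    """Smooth testfunction supported in (-1, 1) (an assumed choice of phi)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def pairing(k, eps, cells=2000):
    """Midpoint-rule value of T(phi o l_eps) = integral of |x|^k * phi(eps * x) dx."""
    half_width = 1.0 / eps  # phi(eps * x) vanishes for |x| >= 1/eps
    h = 2.0 * half_width / cells
    total = 0.0
    for i in range(cells):
        x = -half_width + (i + 0.5) * h  # midpoint of the i-th cell
        total += abs(x) ** k * bump(eps * x)
    return total * h


k = 0.5
base = pairing(k, 1.0)
# Observed ratio T(phi o l_eps) / T(phi) against the predicted eps^-(1 + k).
ratios = {eps: pairing(k, eps) / base for eps in (0.5, 2.0, 3.0)}
predicted = {eps: eps ** (-(1 + k)) for eps in ratios}

for eps in ratios:
    print(eps, ratios[eps], predicted[eps])
```

Because the midpoint grid scales together with the support of $\varphi \circ l_\varepsilon$, the discrete sums satisfy the same scaling law as the integrals up to rounding, so the two columns should agree closely.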
24. The Analytic Continuation of Distributions

We have made a canonical identification of locally integrable functions $f(x)$ with distributions, and this identification is at the base of our theory. However, there are functions arising quite naturally in analysis which are not locally integrable, and it is therefore desirable to extend the relation between functions and distributions to a wider class of functions. Let us suppose that the functions in question have isolated singularities. It is therefore sufficient to suppose that we are concerned with a function $f(x)$, locally integrable except in a neighborhood of the origin; we shall also suppose that the singularity is not too bad, more exactly, that for some integer $N$ the function $|x|^N f(x)$ is integrable over $|x| \le 1$. If $f(x)$ is to be identified with the distribution $T$, then, for testfunctions $\varphi$ whose support does not contain the origin, we ought to have
\[ T(\varphi) = \int f(x)\varphi(x)\,dx. \]
The same formula ought to be valid whenever the testfunction $\varphi(x)$ is such that the integral above exists, in particular, then, if $\varphi(x)$ satisfies an inequality of the form $|\varphi(x)| \le C|x|^N$. Making use, then, of the Taylor expansion for $\varphi$, we have
\[ \varphi(x) = P(x)\chi(x) + \psi(x), \]
where the testfunction $\chi(x)$ equals $+1$ on a suitable sphere $|x| \le R$, and where $|\psi(x)| \le C|x|^N$ if the degree of the polynomial $P(x)$ is $N - 1$. Since $T$ has already been defined for functions of the type $\psi(x)$, it remains only to define it for functions of the type $P(x)\chi(x)$, and here we will have
\[ T(P\chi) = \sum_{\alpha}\frac{D^\alpha\varphi(0)}{\alpha!}\,T(x^\alpha\chi(x)), \]
the sum being taken over all indices $\alpha$ with $|\alpha| \le N - 1$, and therefore $T(P\chi) = \big(\sum C_\alpha D^\alpha\delta\big)(\varphi)$, where $C_\alpha = ((-1)^{|\alpha|}/\alpha!)\,T(x^\alpha\chi(x))$. It follows that we have very little choice in determining a distribution $T$ which is to correspond to $f(x)$; $T$ is completely determined on sets not containing the origin, and we are at liberty to determine only the coefficients $C_\alpha$ in the polynomial $P(D) = \sum C_\alpha D^\alpha$ which is applied to $\delta$. Moreover, these coefficients cannot be determined arbitrarily, for we should want to assign a distribution to the function $f(x)$ in a way consistent with, say, differentiation: the function $D^\alpha f$ has a singularity at the origin, and should correspond to a distribution $S$ for which
$D^\alpha T = S$. Hence we seek a recipe which will provide a consistent determination of a distribution $T$ to correspond to a function $f(x)$ having a certain type of singularity. While no solution to this problem exists in general, those functions $f(x)$ having singularities of an analytic nature do admit such natural extensions to distributions, the extension being found by a process of analytic continuation to which this section is devoted. Let us remark that we have already twice encountered this kind of problem: the derivative of the function $g(x)$ which vanished for $x < 0$ and was given by $x^{-1/2}$ for $x > 0$ was given in the sense of distributions by a function $f(x)$ which coincided with the usual derivative away from the origin, but had a nonintegrable singularity at the origin. We also encountered this problem when we sought to divide a distribution in one variable by the function $a(x) = x$.

We proceed to the definition of a distribution depending analytically on a parameter, that is, a distribution-valued analytic function. Let $G$ be a region in the complex $\lambda$-plane and for every $\lambda$ in $G$ let $T_\lambda$ be a distribution in $\mathcal{D}'(\Omega)$. This function is said to be analytic, if for every testfunction $\varphi$ in $\mathcal{D}(\Omega)$ the numerically valued function $T_\lambda(\varphi)$ is analytic for $\lambda$ in $G$. The usual results concerning analytic functions may now be extended to the functions $T_\lambda$; for example, we consider the derivative with respect to $\lambda$
\[ \frac{dT_\lambda}{d\lambda}(\varphi) = \frac{d}{d\lambda}\,T_\lambda(\varphi), \]
which is evidently a linear functional in $\varphi$; it is a distribution, since it is the limit of a sequence of difference quotients
\[ \frac{T_{\lambda+h}(\varphi) - T_\lambda(\varphi)}{h} \]
and we have already established that the space of distributions is weak-star sequentially complete. It is also evident that the derivative is itself an analytic distribution-valued function in $G$, and so are the higher derivatives. In the same way, we can consider the Taylor expansion: for any testfunction $\varphi$, we have
\[ T_\lambda(\varphi) = \sum_k a_k(\varphi)(\lambda - \lambda_0)^k, \]
the series converging in the largest circle about $\lambda_0$ in $G$. The coefficients are given by
\[ a_k(\varphi) = \frac{1}{k!}\,\frac{d^kT_\lambda}{d\lambda^k}(\varphi) \quad\text{taken at } \lambda = \lambda_0. \]
These coefficients are distributions, and so $T_\lambda$ is the limit of the distributions which are the partial sums of the series. Hence
\[ T_\lambda = \sum_k \frac{1}{k!}\,\frac{d^kT_\lambda}{d\lambda^k}\,(\lambda - \lambda_0)^k \]
with the series converging in the space of distributions in the largest circle about $\lambda_0$ which is contained in $G$. Similar arguments, all based on the theorem that the pointwise limit of a sequence of distributions is a distribution, permit us to speak of a Laurent expansion of a distribution and of certain integrals of distributions along a path in $G$. We also obtain the concept of analytic continuation: if $H$ is a region containing $G$ as a proper subset, and if the numerically valued functions $T_\lambda(\varphi)$ are all analytically continuable to $H$, then the distribution $T_\lambda$ can be continued to a distribution-valued function analytic in the larger domain; all that we need for the argument is to notice that the circles of convergence of the Taylor series are now larger: they are the largest circles about their centers which are contained in $H$.

Let $P(D)$ be a polynomial in the differential operator with $C^\infty$-coefficients. The distribution $P(D)T_\lambda$ is evidently also analytic in $\lambda$ and if we make an analytic continuation of $T_\lambda$ from $G$ to $H$, the function $P(D)T_\lambda$ is similarly continuable, and the continuation is still the result of applying the differential operator to the continuation of $T_\lambda$. A similar assertion can be made if we consider the multiplication of $T_\lambda$ by the smooth function $a(x)$ in $\mathcal{E}(\Omega)$: $a(x)T_\lambda$ is analytically continuable and its continuation is the result of multiplying the continuation of $T_\lambda$ by the function $a(x)$. Finally, if we suppose that $\Omega = R^n$ and $l$ a suitable linear transformation of that space onto itself, we find that the analytic continuation of $T_\lambda \circ l$ coincides with the analytic continuation of $T_\lambda$ composed with the mapping $l$. In particular, if $T_\lambda$ is homogeneous of order $k$, so is its analytic continuation, and indeed, if $k$ is itself an analytic function of $\lambda$ the extension is homogeneous of order $k(\lambda)$. We give two illustrations of this important topic.
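Consider first $\Omega = R^1$ and the distribution which corresponds to the function which vanishes for $x < 0$ and is given by $x^{\lambda-1}$ for $x > 0$; if $\mathrm{Re}[\lambda] > 0$, the function is locally integrable and the distribution is clearly analytic in the right half-plane of $\lambda$. For any testfunction $\varphi(x)$ we pass to the Taylor expansion about the origin, writing
\[ \varphi(x) = P(x) + x^N g(x). \]
We then have
\[ \int_0^\infty x^{\lambda-1}\varphi(x)\,dx = \int_1^\infty x^{\lambda-1}\varphi(x)\,dx + \int_0^1 x^{\lambda+N-1}g(x)\,dx + \int_0^1 x^{\lambda-1}P(x)\,dx. \]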
The function on the left-hand side we know to be analytic in the right half-plane; the first term on the right-hand side is an entire function of $\lambda$. The
middle term is analytic in the half-plane $\mathrm{Re}[\lambda] > -N$, and the last term is a rational function of $\lambda$ which we can compute explicitly: it is
\[ \sum_{k=0}^{N-1}\frac{1}{k!}\,\frac{D^k\varphi(0)}{\lambda + k}. \]
Since $\varphi$ and $N$ were arbitrary, it is evident that we can continue the distribution from the right half-plane to the whole plane, with the exception of simple poles at the origin and the negative integers. Since these poles are exactly those of the Gamma function, we find it advantageous to consider, instead, the distribution $T_\lambda = x^{\lambda-1}/\Gamma(\lambda)$ with the convention that the distribution is 0 on the left half of the real axis; $T_\lambda$ admits an analytic continuation to an entire distribution-valued function. For $\lambda > 1$ we obviously have $T_\lambda' = T_{\lambda-1}$ where the prime denotes differentiation with respect to $x$; by analytic continuation, then, that relation holds everywhere. For $\lambda = 1$, $T_\lambda = T_1 = Y(x)$, the Heaviside function, equal to $+1$ for $x > 0$ and equal to 0 for $x < 0$; we know its derivative, $Y' = \delta$, and hence infer that $T_0 = \delta$, and therefore $T_{-k} = D^k\delta$ for all $k \ge 0$. We note that this is consistent with the general rule that $T_\lambda$ is homogeneous of degree $\lambda - 1$.

For our second example, we consider the function $r^{\lambda-n}$ on the space $R^n$ where $r = |x|$; if $\mathrm{Re}[\lambda] > 0$, this is a locally integrable function which is homogeneous of degree $\lambda - n$. As in the previous example, the distribution may be continued analytically over into the left half-plane, although poles will appear at some of the negative integers. For any testfunction $\varphi$, we may write
\[ \int r^{\lambda-n}\varphi(x)\,dx = \int_{|x| > 1} r^{\lambda-n}\varphi(x)\,dx + \int_{|x| \le 1} r^{\lambda-n}[\varphi(x) - P(x)]\,dx + \int_{|x| \le 1} r^{\lambda-n}P(x)\,dx, \]
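where $P(x)$ is the Taylor expansion of $\varphi(x)$ about the origin, taken to all terms of order $N - 1$. The first term is entire as a function of $\lambda$, the second is analytic in the half-plane $\mathrm{Re}[\lambda] > -N$, and the third term is a rational function of $\lambda$ which we may compute explicitly:
\[ \sum_{\alpha}\frac{D^\alpha\varphi(0)}{\alpha!}\int_{|x| \le 1} r^{\lambda-n}x^\alpha\,dx, \]
where the sum is taken over all indices for which $|\alpha| \le N - 1$. The coefficient reduces to
\[ \int_{|x| \le 1} r^{\lambda-n}x^\alpha\,dx = \frac{1}{\lambda + |\alpha|}\int_{|\omega| = 1}\omega^\alpha\,d\omega \]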
and it is easy to see that the integral vanishes if the multi-index $\alpha$ contains an odd integer. Hence $\alpha$ must be of the form $2\beta$ and our distribution can only have poles at the origin and the even negative integers. It is therefore convenient to pass to the distribution defined by the equation $T_\lambda = r^{\lambda-n}/\Gamma(\lambda/2)$ to obtain a distribution analytic in the entire plane. It is easy to verify that we have $\Delta T_\lambda = 2(\lambda - n)T_{\lambda-2}$ and since $T_2 = (2 - n)\omega_n E$ where $E$ is the fundamental solution for the Laplacian satisfying $\Delta E = \delta$, we infer that $T_0 = (\omega_n/2)\delta$ and that $T_{-2k}$ is a constant multiple of $\Delta^k\delta$ for integers $k > 0$. Here we are supposing $n \ge 3$, so that $E$ is the fundamental solution for the Laplacian.

For some applications, it is desirable to pass to the Riesz kernel: the distribution
\[ R_\lambda = \frac{\Gamma((n - \lambda)/2)}{\pi^{n/2}\,2^{\lambda}\,\Gamma(\lambda/2)}\;r^{\lambda-n}, \]
which now has simple poles at values $\lambda$ of the form $n + 2k$ for integers $k \ge 0$. This analytic distribution satisfies the equation $-\Delta R_\lambda = R_{\lambda-2}$ and $R_2 = -E$. Thus $R_{-2k} = (-\Delta)^k\delta$ if $k$ is an integer $\ge 0$. When $\lambda$ is real and in the interval $(0, n)$, the kernel $R_\lambda$ is a positive locally integrable function.

In conclusion, we should remark that the simple formulas which we have written to obtain the analytic continuation of a distribution are rarely the best: the continuation is independent of the particular formula with which we compute, and the astute selection of such a formula will always be profitable in any particular case.
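Returning to the first illustration of this section, the value $T_0 = \delta$ can also be read off from the continuation formula itself. The following is a sketch under the assumption that the splitting is taken with $N = 1$, so that $P(x) = \varphi(0)$ and the formula is valid for $\mathrm{Re}[\lambda] > -1$; the names $A(\lambda)$ and $B(\lambda)$ are introduced here only for the two analytic pieces:

```latex
\begin{align*}
\Gamma(\lambda)\,T_\lambda(\varphi)
  &= \underbrace{\int_1^\infty x^{\lambda-1}\varphi(x)\,dx}_{A(\lambda)}
   + \underbrace{\int_0^1 x^{\lambda-1}\,[\varphi(x) - \varphi(0)]\,dx}_{B(\lambda)}
   + \frac{\varphi(0)}{\lambda}, \\
T_\lambda(\varphi)
  &= \frac{\lambda}{\Gamma(\lambda+1)}\,\big[A(\lambda) + B(\lambda)\big]
   + \frac{\varphi(0)}{\Gamma(\lambda+1)}
   \;\longrightarrow\; \varphi(0) \quad (\lambda \to 0),
\end{align*}
% using \Gamma(\lambda+1) = \lambda\,\Gamma(\lambda) and \Gamma(1) = 1;
% A and B are analytic near \lambda = 0, so the first term tends to 0.
```

The simple pole of the unnormalized distribution at $\lambda = 0$ is cancelled by the zero of $1/\Gamma(\lambda)$, and what remains is exactly the residue $\varphi(0)$.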
25. The Convolution of a Distribution with a Testfunction

In this section we consider only distributions defined on the whole space $R^n$. We have already introduced a special notation for the testfunctions $\varphi \circ l$ and the distributions $T \circ l$ when $l$ is the reflection of $R^n$ through the origin, namely, $T \circ l = \check{T}$ and $\varphi \circ l = \check{\varphi}$; it is desirable also to introduce $\check{\tau}_h = \tau_{-h}$ and $\check{D}^\alpha = (-1)^{|\alpha|}D^\alpha$. It is then easy to verify the identities
\[ (D^\alpha\varphi)^{\vee} = \check{D}^\alpha\check{\varphi}, \qquad (\tau_h\varphi)^{\vee} = \check{\tau}_h\check{\varphi}, \qquad (D^\alpha T)^{\vee} = \check{D}^\alpha\check{T}, \qquad (\tau_h T)^{\vee} = \check{\tau}_h\check{T}, \]
and it will be a general rule that the reflection of a product is the product of the reflections.
When $\varphi$ is a testfunction and $T$ a distribution, we define the convolution of $T$ with $\varphi$ as the function (or distribution) which follows:
\[ a(x) = (T * \varphi)(x) = T(\tau_x\check{\varphi}). \]
We have already seen that $a(x)$ belongs to the class $\mathcal{E}(R^n)$ of $C^\infty$-functions on the space. It is then immediate that
\[ (\tau_h a)(x) = (\tau_h T * \varphi)(x) = (T * \tau_h\varphi)(x) \]
and passing to difference quotients we obtain, in general
\[ (D^\alpha a)(x) = (D^\alpha T * \varphi)(x) = (T * D^\alpha\varphi)(x). \]
Another useful and easy identity is $(T * \varphi)^{\vee} = \check{T} * \check{\varphi}$. When $T$ has a compact support $T * \varphi$ also has a compact support, since for large $|x|$ the testfunction $\tau_x\check{\varphi}$ has a support disjoint from that of $T$; the convolution is therefore another testfunction. Another obvious but important fact is the following:
\[ a(0) = (T * \varphi)(0) = T(\tau_0\check{\varphi}) = T(\check{\varphi}); \]
hence $T(\varphi) = (T * \check{\varphi})(0)$. When the distribution $T$ is given by the locally integrable function $f(x)$, we have
\[ (T * \varphi)(x) = \int f(x - y)\varphi(y)\,dy, \]
which is the usual definition for the convolution of two functions. If we convolute two testfunctions $\varphi$ and $\psi$ we obtain a third:
\[ \chi(x) = (\varphi * \psi)(x) = \int \varphi(x - y)\psi(y)\,dy. \]
Of course, the integral above is taken only over a compact set, since the support of $\psi$ is compact, and we find it convenient to approximate the integral by a sequence of Riemann sums formed in the following way. The domain of integration $F$ is written as a finite union of disjoint measurable sets $F_i$, each of diameter smaller than $\varepsilon$; a point $y_i$ is chosen in each $F_i$, and the approximating sum is
\[ \sum_i \varphi(x - y_i)\psi(y_i)\,m(F_i) = S_\varepsilon(x). \]
The approximating sums are themselves all testfunctions, being finite linear combinations of translates of $\varphi$, and there exists a fixed compact set $K$ which supports all the testfunctions $S_\varepsilon$, namely, the set of all points $x$ whose distance from $F$ is at most twice the diameter of the support of $\varphi$. The sums $S_\varepsilon(x)$ converge to $\chi(x)$ at every $x$ as $\varepsilon$ diminishes to 0 and those sums are uniformly bounded:
\[ |S_\varepsilon(x)| \le \sup|\varphi|\,\sup|\psi|\,m(F). \]
Moreover, the derivatives $D^\alpha S_\varepsilon(x)$ are the corresponding Riemann sums for the convolution $(D^\alpha\varphi * \psi)(x)$ and are therefore uniformly bounded and converge pointwise. Thus the Arzela-Ascoli theorem guarantees that the testfunctions $S_\varepsilon(x)$ converge to $\chi(x)$ in the space $\mathcal{D}(R^n)$; it is this circumstance that enables us to prove the following theorem.
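The convergence of the Riemann sums can be observed numerically in one dimension. The sketch below is an illustration only, under assumptions that are not in the text: both testfunctions are the standard bump $\exp(-1/(1 - x^2))$ on $(-1, 1)$, the sets $F_i$ are equal subintervals of $F = [-1, 1]$, the sample point $y_i$ is the midpoint of $F_i$, and a very fine sum stands in for the exact convolution. The names `bump` and `riemann_sum` are hypothetical.

```python
import math


def bump(x):
    """Smooth testfunction supported in (-1, 1) (assumed choice for phi and psi)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def riemann_sum(phi, psi, x, cells, lo=-1.0, hi=1.0):
    """S(x) = sum_i phi(x - y_i) * psi(y_i) * m(F_i) over equal cells F_i of [lo, hi]."""
    h = (hi - lo) / cells  # m(F_i), which is also the diameter of each cell
    total = 0.0
    for i in range(cells):
        y = lo + (i + 0.5) * h  # sample point y_i chosen in F_i (here the midpoint)
        total += phi(x - y) * psi(y) * h
    return total


x0 = 0.3
reference = riemann_sum(bump, bump, x0, 4000)  # stand-in for (phi * psi)(x0)
coarse_error = abs(riemann_sum(bump, bump, x0, 4) - reference)
fine_error = abs(riemann_sum(bump, bump, x0, 400) - reference)

print(coarse_error, fine_error)
```

Each sum is a finite linear combination of translates of `bump`, exactly as in the argument above; refining the partition drives the value at `x0` toward the reference value.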
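Theorem: $(T * \varphi) * \psi = T * (\varphi * \psi)$.

PROOF: With $\chi = \varphi * \psi$,
\[ (T * \chi)(x) = T(\tau_x\check{\chi}) = \int (T * \varphi)(x - y)\psi(y)\,dy = ((T * \varphi) * \psi)(x). \]
The middle equality holds because the Riemann sums $S_\varepsilon$ converge to $\chi$ in $\mathcal{D}(R^n)$: applying $T$ to $\tau_x\check{S}_\varepsilon$ gives the corresponding Riemann sums for the integral on the right, and $T$ is sequentially continuous.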
We have earlier defined the regularization of a locally integrable function $f(x)$: we took a testfunction $\varphi(x)$ which was even and positive, and for which $\int\varphi(x)\,dx = 1$, and defined $\varphi_\varepsilon(x)$ as $\varepsilon^{-n}(\varphi \circ l_\varepsilon^{-1}) = (1/\varepsilon^n)\varphi(x/\varepsilon)$. The regularization was the $C^\infty$-function $f_\varepsilon(x) = (f * \varphi_\varepsilon)(x)$ and we showed that the regularizations converge to $f(x)$ in any reasonable sense. In particular, when $f(x)$ was a testfunction, the regularizations converged to $f(x)$ in the topology of $\mathcal{D}$, since
a sequence of them had a fixed compact support and converged uniformly, as well as all of the derivatives. We can now extend the idea of regularization to general distributions, not just locally integrable functions, as follows:
\[ T_\varepsilon = T_\varepsilon(x) = (T * \varphi_\varepsilon)(x); \]
this is a $C^\infty$-function, and for any testfunction $\psi(x)$,
\[ T_\varepsilon(\psi) = (T_\varepsilon * \check{\psi})(0) = (T * \varphi_\varepsilon * \check{\psi})(0) = T * (\varphi_\varepsilon * \check{\psi})(0) = T(\varphi_\varepsilon * \psi) \]
and this converges with diminishing $\varepsilon$ to $T(\psi)$. Thus, the regularizations converge to $T$ in the space of distributions.

Let the distribution $T$ be fixed: it is then clear that the mapping which carries the testfunction $\varphi(x)$ into the $C^\infty$-function $a(x) = (T * \varphi)(x)$ is a linear mapping of $\mathcal{D}$ into $\mathcal{E}$ which commutes with translation and which is sequentially continuous, a convergent sequence in $\mathcal{D}$ being carried into a convergent sequence in $\mathcal{E}$. The next theorem assures us that every such mapping is a convolution.

Theorem: Let $\mathcal{L}$ be a linear mapping of $\mathcal{D}(R^n)$ into $\mathcal{E}(R^n)$ which commutes with translation: $\tau_h(\mathcal{L}\varphi) = \mathcal{L}(\tau_h\varphi)$, and which is sequentially continuous: $\varphi_k$ converging to 0 in $\mathcal{D}$ implies $\mathcal{L}\varphi_k$ converging to 0 in $\mathcal{E}$; then there exists a unique distribution $T$ such that $\mathcal{L}\varphi = T * \varphi$ for all $\varphi$.

PROOF: The evaluation functional $\mathcal{L}\varphi(0)$ is a linear functional on $\mathcal{D}$ and from the continuity of the mapping it is even a sequentially continuous linear functional on $\mathcal{D}$, hence a distribution. We write, then, $\mathcal{L}\varphi(0) = \check{T}(\varphi)$, and now
\[ (\mathcal{L}\varphi)(x) = \mathcal{L}(\tau_{-x}\varphi)(0) = \check{T}(\tau_{-x}\varphi) = T(\tau_x\check{\varphi}) = (T * \varphi)(x). \]
The distribution $T$ is uniquely determined since its regularizations are, $T_\varepsilon$ being the image under $\mathcal{L}$ of the function $\varphi_\varepsilon$, and these regularizations converge to $T$ as $\varepsilon$ converges to 0.

When the distribution $T$ has compact support, the corresponding convolution mapping carries the space of testfunctions into itself. The previous theorem admits an easy extension to one asserting that the linear, translation-invariant mappings of $\mathcal{D}$ into itself which are sequentially continuous are convolutions with distributions in $\mathcal{E}'$. In fact, the only point in the proof which is not immediate is the compactness of the support of $T$; however, if that set were not compact, there would be a sequence of points $x_n$ in the support of $\check{T}$
having no finite limit point, and each $x_n$ would be surrounded by a neighborhood of small diameter supporting a testfunction $\psi_n$ for which $\check T(\psi_n) = 1$. We translate these testfunctions to the origin, forming $\varphi_n = \tau_{-x_n}\psi_n$, a system of testfunctions supported by the unit sphere. For a suitable choice of constants $c_n$ converging rapidly to 0, the sequence $c_n\varphi_n$ converges to 0 in $\mathcal{D}$, while the system of their convolutions with $T$ is not supported by any fixed compact, hence does not converge to 0 in $\mathcal{D}$.

Let $T$ be the $\delta$-distribution and $\varphi$ a testfunction; $(\delta * \varphi)(x) = \delta(\tau_x\check\varphi) = \varphi(x)$ and so $\delta$ corresponds to the identity mapping of $\mathcal{D}$ into itself. More generally, then, for any polynomial $P$ with constant coefficients, the convolution with the distribution $P(D)\delta$ is the mapping which carries $\varphi$ into $P(D)\varphi$. In a similar way we see that the translation operator $\tau_h$ itself corresponds to convolution with the distribution $\tau_h\delta$, and this distribution is the measure consisting of a unit mass at the point $x = h$.

If the distribution $T$ has compact support and $a(x)$ is a function in the class $\mathcal{E}$ we can obviously form the convolution $(T * a)(x) = T(\tau_x\check a)$, whether or not $a(x)$ is in $\mathcal{D}$; the convolution is again in $\mathcal{E}$ and it is easy to verify that $D^\alpha(T * a) = (D^\alpha T) * a = T * (D^\alpha a)$ as well as $\tau_h(T * a) = \tau_h T * a = T * \tau_h a$.
If a sequence $a_k$ converges to 0 in the metric space $\mathcal{E}$, then the sequence $T * a_k$ also converges to 0. We also have another consequence of the previous theorem.

Corollary: Every linear mapping $\mathcal{L}$ from $\mathcal{E}$ into $\mathcal{E}$ which is continuous and commutes with translation is of the form $\mathcal{L}a = T * a$ for some uniquely determined distribution $T$ with compact support.

PROOF: The restriction of $\mathcal{L}$ to $\mathcal{D}$ satisfies the hypothesis of the theorem, hence corresponds to a distribution. The continuity of the linear form $\mathcal{L}a(0)$ on the space $\mathcal{E}$ makes the distribution one with compact support.

We consider finally the support of the convolution $T * \varphi$, where $\varphi$ is a testfunction.

Theorem: $\operatorname{supp} T * \varphi \subseteq \operatorname{supp} T + \operatorname{supp} \varphi$.

PROOF: It should be noted that the fact that $\operatorname{supp}\varphi$ is compact makes it easy to show that the sum $\operatorname{supp} T + \operatorname{supp}\varphi$ is closed. It will be enough to
show that any point $x$ for which $(T * \varphi)(x)$ is not 0 is of the form $x = y + z$ with $y$ in $\operatorname{supp} T$ and $z$ in $\operatorname{supp}\varphi$. Since $T(\tau_x\check\varphi)$ is not 0, there is a point $y$ in the support of $T$ which is also in the support of $\tau_x\check\varphi$, and this support is the set $\operatorname{supp}\check\varphi + x$; hence $y = x - z$ for some $z$ in the support of $\varphi$.
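The support inclusion can be illustrated on a discrete stand-in. The following is only a sketch under assumptions that are not in the text: finitely supported sequences on the integers replace the distribution and the testfunction, and the helper names `convolve`, `support` and `minkowski_sum` are hypothetical.

```python
def convolve(t, phi):
    """Convolve two finitely supported sequences stored as {index: value}."""
    out = {}
    for i, a in t.items():
        for j, b in phi.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def support(seq):
    """Indices at which the sequence is nonzero."""
    return {k for k, v in seq.items() if v != 0}


def minkowski_sum(a, b):
    """The set a + b of all sums of one element from each set."""
    return {i + j for i in a for j in b}


# Assumed toy data: t plays the role of T, phi the role of the testfunction.
t = {-3: 2, 0: -1, 5: 4}
phi = {0: 1, 1: 3, 2: 1}
conv = convolve(t, phi)
inclusion_holds = support(conv) <= minkowski_sum(support(t), support(phi))

# Cancellation can make the inclusion strict: index 1 drops out here.
strict = convolve({0: 1, 1: -1}, {0: 1, 1: 1})
strict_support = support(strict)
```

The second example shows why the theorem asserts an inclusion rather than an equality: the sum of the supports is {0, 1, 2}, while the convolution vanishes at index 1.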
26. The Convolution of Distributions

In this section we define the convolution of two distributions, one of which has compact support; later we will extend the definition a little further. It should be made clear, however, that it is not possible to define the convolution of a pair of arbitrary distributions.

Let $T$ be a distribution on $R^n$ and $S$ another distribution with compact support. The distribution $S$ defines a mapping of the space of testfunctions into itself; this mapping is sequentially continuous and commutes with translation. The distribution $T$ also defines a mapping of the testfunctions into $\mathcal{E}$ which is sequentially continuous and commutes with translation, and it therefore follows that the composition of these mappings is sequentially continuous from $\mathcal{D}$ to $\mathcal{E}$ and commutes with translation, hence corresponds to a distribution which we write $T * S$ and take as the definition of the convolution of those distributions.

We could have considered the mappings in another order: convolution with $T$ would carry $\mathcal{D}$ into $\mathcal{E}$ and convolution with $S$ would carry $\mathcal{E}$ into $\mathcal{E}$; the composed mapping would be continuous from $\mathcal{D}$ to $\mathcal{E}$, would commute with translation, and would therefore correspond to a uniquely determined distribution which we write $S * T$. It is important to show that $S * T = T * S$, that is, that the composed mappings, in either order, are the same. For this purpose we need the following lemma.

Lemma:
If $S$ is in $\mathcal{E}'$, $\varphi$ in $\mathcal{D}$, and $a$ in $\mathcal{E}$, then $(S * (\varphi * a))(x) = ((S * \varphi) * a)(x)$.
PROOF: Choose $r > 0$ so large that the supports of both $S$ and $\varphi$ are contained in a sphere of radius $r$ about the origin, then choose a large $M$ and a testfunction $\chi(x)$ which is equal to 1 on a sphere of radius $M + 4r$. We then write $a(x)$ as a sum:

$$a(x) = \chi(x)a(x) + (1 - \chi(x))a(x) = a_1(x) + a_2(x).$$
Now

$$(S * \varphi) * a = (S * \varphi) * (a_1 + a_2) = (S * \varphi) * a_1 + (S * \varphi) * a_2$$

and

$$S * (\varphi * a) = S * (\varphi * (a_1 + a_2)) = S * (\varphi * a_1) + S * (\varphi * a_2).$$
In view of the fact that $a_1$ is a testfunction, the two first terms are the same, and the lemma is proved if we show that the second terms are equal. However, the function $a_2(x)$ has its support outside the sphere of radius $M + 4r$, hence $\varphi * a_2$ has its support outside the sphere of radius $M + 2r$, and the convolution of this with $S$ has its support outside the sphere of radius $M$. In a similar way we find that since $S * \varphi$ is supported by the sphere of radius $4r$, its convolution with $a_2$ vanishes in the sphere of radius $M$. Thus the two second terms above vanish on the sphere $|x| \le M$, whence the functions $S * (\varphi * a)$ and $(S * \varphi) * a$ coincide on that sphere; $M$ being arbitrary, they coincide everywhere and the proof is complete.

Theorem: $T * S = S * T$.
PROOF: We shall show presently that for every pair of testfunctions $\varphi$ and $\psi$ we have $(T * S) * (\varphi * \psi) = (S * T) * (\varphi * \psi)$. From this we get the desired equality by taking $\varphi = \varphi_\varepsilon$, the system of regularizing functions, to obtain $(T * S)_\varepsilon = (S * T)_\varepsilon$ for all $\varepsilon$, and since the regularizations of a distribution converge to the distribution the equality is proved. Let $a(x) = (T * \psi)(x)$; then

$$(T * S) * (\varphi * \psi) = T * (S * \varphi * \psi) = T * (S * \varphi) * \psi,$$

and since $S * \varphi$ is a testfunction, this equals $(S * \varphi) * T * \psi = (S * \varphi) * a$, which, by the lemma, is

$$S * (\varphi * a) = S * (T * \psi * \varphi) = (S * T) * (\psi * \varphi) = (S * T) * (\varphi * \psi).$$
Since the distribution $\delta$ has compact support, $D^\alpha\delta$ and $\tau_h\delta$ also have compact support; from the previous theorem, then, we have the identities

$$D^\alpha(T * S) = (D^\alpha T) * S = T * D^\alpha S$$

and

$$\tau_h(T * S) = (\tau_h T) * S = T * \tau_h S,$$

since the convolution of $D^\alpha\delta$ with $T$ is $D^\alpha T$ and similarly $\tau_h\delta * T = \tau_h T$ for all distributions $T$. It is also easy to verify the identity $(T * S)^\vee = \check T * \check S$.
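These identities have exact analogues for finitely supported integer sequences, which makes them easy to check by machine. The sketch below is an illustration under assumptions not taken from the text: sequences on the integers stand in for distributions, the backward difference `DIFF` stands in for a derivative of $\delta$, and all names are hypothetical.

```python
def convolve(a, b):
    """Convolve two finitely supported integer sequences given as dicts."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    # Drop zero entries so that equal sequences compare equal as dicts.
    return {k: v for k, v in out.items() if v != 0}


DELTA = {0: 1}          # unit mass at the origin
DIFF = {0: 1, 1: -1}    # discrete stand-in for a derivative of delta


def shifted_delta(h):
    """Unit mass at h; convolving with it translates a sequence by h."""
    return {h: 1}


# Assumed toy data.
t = {-2: 3, 0: 1, 4: -5}
s = {1: 2, 2: 7}

commutes = convolve(t, s) == convolve(s, t)
identity = convolve(DELTA, t) == t
derivative_rule = (
    convolve(DIFF, convolve(t, s))
    == convolve(convolve(DIFF, t), s)
    == convolve(t, convolve(DIFF, s))
)
translation_rule = (
    convolve(shifted_delta(3), convolve(t, s))
    == convolve(convolve(shifted_delta(3), t), s)
    == convolve(t, convolve(shifted_delta(3), s))
)
```

Integer arithmetic keeps every comparison exact, so each flag tests the identity itself rather than an approximation of it.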
We consider next the support of the distribution $T * S$. If $x$ is a point of that support, there exists a testfunction $\varphi$ supported by an $\varepsilon$-neighborhood of $x$ such that $(T * S)(\varphi) = (T * S * \check\varphi)(0)$ is not 0. Thus $T * (S * \check\varphi)(0) = T((S * \check\varphi)^\vee)$ is not 0, or, better still, $T(\check S * \varphi)$ is not 0. It follows that there exists a point $y$ in the support of $T$ which is also in the support of $\check S * \varphi$, and is therefore of the form $-z + x'$ where $z$ is in the support of $S$ and $x'$ in the support of $\varphi$. It follows that $x' = z + y$ is in an $\varepsilon$-neighborhood of $x$ and is contained in the set $\operatorname{supp} S + \operatorname{supp} T$, and that set is closed since $\operatorname{supp} S$ is compact. The $\varepsilon$ being arbitrary, we have proved the following result.

Theorem: $\operatorname{supp} T * S \subseteq \operatorname{supp} T + \operatorname{supp} S$.
We would like to extend the definition of convolution to other pairs $T$, $S$ without the hypothesis that $S$ has compact support. In general this is not possible: it is easy to believe that there is no natural definition for $T * T$ where the distribution $T$ corresponds to the constant function $+1$. In various special cases, both $T$ and $S$ satisfying appropriate hypotheses, the notion of convolution can be extended.

Suppose that $\mu$ and $\nu$ are Radon measures on $R^n$, $\nu$ having compact support. It is clear that the convolution $\mu * \nu$ is also a Radon measure, since it is obviously a positive distribution when $\mu$ and $\nu$ are positive measures. Since $\nu$ has compact support it has finite total mass, and the convolution of $\nu$ with a testfunction, or more generally, with a bounded function in $\mathcal{E}$ is itself a bounded function in $\mathcal{E}$: $|(\nu * a)(x)| \le M\|a\|_\infty$ where $M$ is the total mass of $\nu$. We may therefore drop the hypothesis that $\nu$ has compact support and impose the hypothesis that both measures have finite total mass: the convolution mappings determined by the measures carry the testfunctions and the bounded functions of $\mathcal{E}$ into the bounded functions of $\mathcal{E}$. Thus the composition of those mappings defines a distribution which we take to be $\mu * \nu$. It is easy to verify that

$$(\mu * \nu)(\varphi) = \iint \varphi(x + y)\,d\mu(x)\,d\nu(y)$$

and hence, that $\mu * \nu = \nu * \mu$ as well as $(\mu * \nu)^\vee = \check\mu * \check\nu$. Obviously, $\mu * \nu$ is a measure of finite total mass, and if $S$ is any distribution with compact support, we have the identity $(\mu * \nu) * S = \mu * (\nu * S)$ as an immediate consequence of the definition of convolution as the composition of mappings. We may therefore substitute $S = D^\alpha\delta$ and $S = \tau_h\delta$ to obtain the familiar relations

$$D^\alpha(\mu * \nu) = (D^\alpha\mu) * \nu = \mu * D^\alpha\nu$$

and

$$\tau_h(\mu * \nu) = (\tau_h\mu) * \nu = \mu * \tau_h\nu.$$
Since $\mu * \nu$ is a measure, it is sometimes of interest to compute the convolution measure of a Borel set $A$ in terms of the initial measures; by an easy limiting process we approximate the characteristic function of $A$ by testfunctions to obtain

$$(\mu * \nu)(A) = \iint \chi_A(x + y)\,d\mu(x)\,d\nu(y) = \int \nu(\tau_{-x}A)\,d\mu(x).$$

Note that if $\mu = \delta$, then $\mu(\tau_{-y}A) = 1$ if and only if $y$ is in $A$, hence $(\delta * \nu)(A) = \int \chi_A(y)\,d\nu(y) = \nu(A)$, that is, $\delta * \nu = \nu$, which should be the case. When the measure $\mu$ is absolutely continuous, that is, $d\mu(x) = f(x)\,dx$ with $f(x)$ in $L^1(R^n)$, the convolution $\mu * \nu$ is also absolutely continuous.
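For measures carried by finitely many points the formula for $(\mu * \nu)(A)$ becomes a finite double sum. The sketch below assumes such purely atomic measures, stored as hypothetical `{point: mass}` dictionaries with exact rational masses; it is an illustration, not a construction taken from the text.

```python
from fractions import Fraction


def convolve_measures(mu, nu):
    """Convolution of two finitely supported measures given as {point: mass}."""
    out = {}
    for x, m in mu.items():
        for y, n in nu.items():
            out[x + y] = out.get(x + y, 0) + m * n
    return out


def measure_of(measure, indicator):
    """Mass that the measure gives to the set described by the indicator."""
    return sum(mass for point, mass in measure.items() if indicator(point))


# Assumed toy measures of total mass 1.
mu = {0: Fraction(1, 2), 1: Fraction(1, 2)}
nu = {1: Fraction(1, 3), 2: Fraction(2, 3)}


def in_A(point):
    """Indicator of the set A = [2, infinity)."""
    return point >= 2


# (mu * nu)(A) computed directly from the convolution measure.
direct = measure_of(convolve_measures(mu, nu), in_A)

# The double integral of chi_A(x + y) against mu and nu.
double_sum = sum(m * n for x, m in mu.items() for y, n in nu.items() if in_A(x + y))

# The iterated form: integrate nu(A - x) against mu.
iterated = sum(m * measure_of(nu, lambda y, x=x: in_A(x + y)) for x, m in mu.items())

delta = {0: 1}
delta_is_identity = convolve_measures(delta, nu) == nu
```

All three expressions give the same rational number for this data, and convolving with the unit mass at the origin returns `nu` unchanged.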
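Theorem: The convolution of an $L^1$-function $f(x)$ and a measure $\nu$ of finite total mass is the integrable function $g(x)$, where

$$g(x) = \int f(x - y)\,d\nu(y).$$

PROOF: That $g(x)$ is integrable follows from Fubini's theorem:

$$\int |g(x)|\,dx \le \iint |f(x - y)|\,dx\,|d\nu(y)| = \int |f(x)|\,dx \int |d\nu(y)|.$$

Hence, we have to verify that the measure $g(x)\,dx$ is the result of convoluting $f(x)\,dx$ and $\nu$. The convolution with $\nu$ carries a testfunction $\varphi$ into the function $a(z) = \int \varphi(z - y)\,d\nu(y)$ and the convolution of $a$ with $f$ gives rise to the function

$$\int f(x)\,a(z - x)\,dx = \iint f(x)\,\varphi(z - x - y)\,d\nu(y)\,dx,$$

which may be written

$$\int \varphi(z - t)\left[\int f(t - y)\,d\nu(y)\right]dt = \int \varphi(z - t)\,g(t)\,dt.$$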
27. Harmonic and Subharmonic Distributions

A distribution $T$ is called harmonic if it is a solution of the differential equation $\Delta T = 0$. An important theorem, due to Hermann Weyl, asserts that the harmonic distributions are the usual harmonic functions.

Theorem (Weyl): If $\Delta T = 0$, then $T$ is a harmonic function.
PROOF: We must show that $T$ is a $C^2$-function, because then the distribution Laplacian and the usual Laplacian are the same, and $T$ appears as an ordinary harmonic function. Let $z$ be a fixed point in the region $\Omega$ where $T$ is defined; we select a testfunction $\chi(x)$ which equals $+1$ in the ball $|x - z| \le r$ and define $S = \chi T$; $S$ is a distribution with compact support and we are to show that $S$ is a $C^\infty$-function in the ball, thereby showing that $T$ is a smooth function in a neighborhood of $z$. It will then follow that $T$ is everywhere a smooth function, hence a harmonic function. The distribution $\Delta S$ has compact support and vanishes in the set $|x - z| < r$. We let $E(x)$ be the fundamental solution for the Laplacian,

$$E(x) = \frac{1}{(2 - n)\,\omega_n\,|x|^{n-2}},$$

which is a $C^\infty$-function away from the origin and which satisfies the equation $\Delta E = \delta$. Let $\varphi(x)$ be a testfunction which is equal to $+1$ on the ball $|x| \le \varepsilon$ and which vanishes for $|x| \ge 2\varepsilon$ where $\varepsilon$ is small. We write

$$E(x) = \varphi(x)E(x) + (1 - \varphi(x))E(x) = E_1(x) + E_2(x);$$

here $E_2(x)$ is a $C^\infty$-function on $R^n$ and $E_1(x)$ is a distribution with compact support. Now $S = \delta * S = \Delta E * S = E * \Delta S = E_1 * \Delta S + E_2 * \Delta S$. The term $E_2 * \Delta S$ is the convolution of a $C^\infty$-function with a distribution having compact support; it is therefore a $C^\infty$-function itself. The term $E_1 * \Delta S$ is the convolution of two distributions with compact support, and its support is within a $2\varepsilon$-neighborhood of the complement of the ball $|x - z| < r$; that is to say, the convolution $E_1 * \Delta S$ is 0 in the ball $|x - z| < r - 2\varepsilon$. It follows that $S$ is a $C^\infty$-function in the ball $|x - z| < r - 2\varepsilon$, and since $z$ was arbitrary, $T$ is a $C^\infty$-function everywhere.
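The proof uses only that $E$ is smooth and harmonic away from the origin and that $\Delta E = \delta$. The first property can be checked numerically. The sketch below is an illustration under assumptions of its own: the dimension is $n = 3$, $\omega_n$ is taken to be the surface area of the unit sphere, computed from the Gamma function, and a second-order finite difference replaces the Laplacian. The function names are hypothetical.

```python
import math

N_DIM = 3
# Assumed normalization: omega_n is the area of the unit sphere (4*pi when n = 3).
OMEGA = 2 * math.pi ** (N_DIM / 2) / math.gamma(N_DIM / 2)


def fundamental_solution(x):
    """E(x) = 1 / ((2 - n) * omega_n * |x|^(n - 2)), defined for x != 0."""
    r = math.sqrt(sum(c * c for c in x))
    return 1.0 / ((2 - N_DIM) * OMEGA * r ** (N_DIM - 2))


def discrete_laplacian(func, x, h=1e-3):
    """Second-order central-difference approximation of the Laplacian at x."""
    total = 0.0
    for axis in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[axis] += h
        backward[axis] -= h
        total += (func(forward) + func(backward) - 2 * func(x)) / h ** 2
    return total


point = [0.3, -0.4, 0.5]  # any point away from the origin
laplacian_at_point = discrete_laplacian(fundamental_solution, point)
```

The computed Laplacian is zero up to discretization error, while $E$ itself is negative, in line with the later remark that $E * \mu$ is a negative multiple of a Newtonian potential.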
The proof of the Weyl theorem is incomplete, since we made the tacit assumption that the dimension of the space was at least 3, because we have established the fundamental solution $E(x)$ only for $R^n$, $n \ge 3$. A corresponding proof, using the logarithmic kernel, will hold in two dimensions, but we omit it; in one dimension the theorem merely says that the differential equation $T'' = 0$ has only linear functions as its solutions, and this we have proved earlier.

A distribution $T$ is said to be subharmonic in a domain $\Omega$ if and only if its Laplacian is a positive measure: $\Delta T \ge 0$; the distribution is superharmonic if it is the negative of a subharmonic distribution. The subharmonic distributions are described by a remarkable theorem essentially due to F. Riesz. As before, to shorten the proof, we shall suppose $n \ge 3$. The ball of radius $r$ about the center $z$ is written $S(z, r)$; its volume is $|S_r| = \omega_n r^n/n$.
Theorem: Let $T$ be a distribution in $\Omega$; the following assertions are then equivalent:

(1) $\Delta T \ge 0$ in $\Omega$, that is, $T$ is subharmonic in $\Omega$.

(2) $T$ is a locally integrable function in $\Omega$ which, in the interior of any closed ball $S(z, r)$ contained in $\Omega$, admits the representation

$$T(x) = h(x) + \frac{1}{(2 - n)\,\omega_n}\int \frac{d\mu(y)}{|x - y|^{n-2}},$$

where $h(x)$ is harmonic and $\mu$ a positive Radon measure on $S(z, r)$ (it is in fact the restriction of $\Delta T$ to that ball).

(3) $T$ is a locally integrable function in $\Omega$ which coincides almost everywhere with an upper semicontinuous function $u$ satisfying the inequality

$$u(x) \le \int u(x + ry)\,d\omega(y)$$

for all $x$ and for all $r$ not larger than the distance from $x$ to the boundary of $\Omega$.

(4) $T$ is a locally integrable function $u(x)$ which satisfies almost everywhere the inequality

$$u(x) \le \frac{1}{|S_\rho|}\int_{|y - x| < \rho} u(y)\,dy$$

for all $\rho$ not larger than the distance from $x$ to the boundary of $\Omega$.
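Before the proof, assertion (3) can be illustrated numerically. The sketch below makes assumptions that go beyond the statement: it works in the plane rather than in $R^n$ with $n \ge 3$, it reads $d\omega$ as the normalized measure on the unit circle, and it uses two hand-picked polynomials. The names `circle_mean`, `subharmonic` and `harmonic` are hypothetical.

```python
import math


def circle_mean(u, center, radius, samples=360):
    """Average of u over the circle of the given radius about center."""
    cx, cy = center
    total = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        total += u(cx + radius * math.cos(angle), cy + radius * math.sin(angle))
    return total / samples


def subharmonic(x, y):
    """u = x^2 + y^2 has Laplacian 4 >= 0."""
    return x * x + y * y


def harmonic(x, y):
    """u = x^2 - y^2 has Laplacian 0."""
    return x * x - y * y


center = (0.7, -0.2)
radius = 0.5

# Mean over the circle minus the value at the center.
gap_subharmonic = circle_mean(subharmonic, center, radius) - subharmonic(*center)
gap_harmonic = circle_mean(harmonic, center, radius) - harmonic(*center)
```

For $x^2 + y^2$ the gap equals the square of the radius, so the mean value inequality is strict; for the harmonic polynomial the gap vanishes up to rounding.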
PROOF: If $T$ is subharmonic and $\mu$ the measure $\Delta T$, the measure of the compact $S(z, r)$ is finite, and restricting $\mu$ to that set, we form the convolution $u(x) = (E * \mu)(x)$, and in the interior of $S(z, r)$, the Laplacian of $u$ is also $\mu$. It follows that $\Delta(T - u) = 0$ inside $S(z, r)$, and therefore, by the previous
theorem, $T - u = h$ is a harmonic function in that ball. Thus (2) is proved. The convolution $u(x)$ is the Newtonian potential of $\mu$ multiplied by the negative constant $1/((2 - n)\omega_n)$ and is therefore a negative, upper semicontinuous function, since the Newtonian potentials are positive and lower semicontinuous. To establish (3), it is enough to show that for any Newtonian potential $u(x) = \int d\mu(z)/|x - z|^{n-2}$ we have

$$u(x) \ge \int u(x + \rho y)\,d\omega(y),$$

and it is clear that we may suppose $x$ to be the origin. Now

$$\int u(\rho y)\,d\omega(y) = \int\left[\int \frac{d\omega(y)}{|\rho y - z|^{n-2}}\right]d\mu(z) = \rho^{2-n}\int U(z/\rho)\,d\mu(z),$$

where $U$ is the Newtonian potential of the measure $d\omega$ and has been explicitly computed in Section 8. We have

$$U(x) = 1 \ \text{ for } |x| \le 1, \qquad U(x) = |x|^{2-n} \ \text{ for } |x| \ge 1.$$

Hence

$$\int u(\rho y)\,d\omega(y) = \int \min(\rho^{2-n}, |z|^{2-n})\,d\mu(z) \le \int \frac{d\mu(z)}{|z|^{n-2}} = u(0).$$

Thus (3) is proved. Since the averages $\int u(\rho y)\,d\omega(y)$ converge increasingly to $u(0)$ as $\rho$ approaches 0, then for every $x$, the upper semicontinuous subharmonic $u(x)$ satisfies the relation

$$u(x) = \lim_{r \to 0}\int u(x + ry)\,d\omega(y),$$

where the convergence is monotone.
It is easy to see that if the distribution satisfies (3) it will also satisfy the corresponding inequality given in (4): we multiply the inequality by $\omega_n r^{n-1}$ and integrate from 0 to $\rho$ to obtain

$$|S_\rho|\,u(x) = \int_0^\rho \omega_n\,u(x)\,r^{n-1}\,dr \le \int_0^\rho\!\!\int u(x + ry)\,r^{n-1}\,dr\,\omega_n\,d\omega(y) = \int_{|y| < \rho} u(x + y)\,dy.$$
It remains to show that a function $u(x)$ in $\Omega$ satisfying the mean value inequality (4) has a nonnegative distribution Laplacian there. For small $\rho$ let $\chi_\rho(x)$ be the characteristic function of the ball $|x| < \rho$ and

$$H_\rho(x) = (1/|S_\rho|)\,\chi_\rho(x);$$

the mean value inequality may then be written

$$u(x) \le (H_\rho * u)(x)$$

and holds almost everywhere in any subset of $\Omega$ with distance $> \rho$ from the boundary. The regularizations of $u$ then satisfy the same inequality on such sets, and these are smooth functions. If a regularization $u_\varepsilon(x) = (u * \varphi_\varepsilon)(x)$ had a negative Laplacian at a point $x_0$ where the distance from $x_0$ to the boundary was greater than $\varepsilon$, then $u_\varepsilon$ would be superharmonic near $x_0$ and for sufficiently small $\rho$ we would have $u_\varepsilon(x_0) > (H_\rho * u_\varepsilon)(x_0)$, a contradiction. It follows that for small enough $\varepsilon$ the regularizations have positive Laplacians on any set bounded away from the boundary of $\Omega$. Since the regularizations converge to $u$ as distributions, the positive distributions $\Delta u_\varepsilon$ converge to $\Delta u$, which is therefore a positive distribution, hence a positive measure $\mu$. Thus $u$ is subharmonic. This completes the proof of the theorem.

When a distribution $T$ corresponds to a locally integrable function $f(x)$, that function, of course, is determined only up to a set of measure 0 since the distribution is more accurately the measure $f(x)\,dx$. However, in certain special cases there exists an obvious canonical determination for the function; for example, when $T$ is harmonic, the function $f(x)$ ought to be taken as the corresponding smooth function. There also exists a canonical determination of the function representing a subharmonic distribution $T$ and this is defined by the representation of $T$ in terms of a harmonic function and a Newtonian potential. To obtain this canonical determination, let $f(x)$ be any locally integrable function representing $T$, and set for all $x$

$$u(x) = \lim_{\rho \to 0}(H_\rho * f)(x).$$
From the proof of the Riesz theorem, it readily follows that the functions $(H_\rho * f)(x)$ converge decreasingly to the upper semicontinuous function $u(x)$ which we have taken for the canonical representation. Of course $u(x)$ coincides with $f(x)$ at every Lebesgue point of the latter function, hence almost everywhere. We should also note that instead of considering the convolutions $(H_\rho * f)(x)$, we could equally well have considered the regularizations of $f$, provided the regularizing testfunction $\varphi(x)$ was a function only of radius. It follows that the regularizations of a subharmonic distribution converge decreasingly to an obviously upper semicontinuous function which is the canonical representation of that distribution.

These considerations lead us to the concept of a subharmonic function. A real function $u(x)$ defined on a domain $\Omega$ in $R^n$ is subharmonic there if and only if (i) it is upper semicontinuous; and (ii) for every compact subset $K$ of $\Omega$ and every function $h(x)$, continuous on $K$ and harmonic in the interior of $K$, the inequality $u(x) \le h(x)$ on the boundary of $K$ implies the same inequality for all $x$ in $K$. The definition explains the choice of the term "subharmonic": if $u(x)$ is smaller than a harmonic function on the boundary of $K$ it must also be smaller inside $K$. It is also clear that the subharmonic functions in one dimension are exactly the convex functions. The following theorem is therefore not surprising.
Theorem: The canonical representation of a subharmonic distribution is a subharmonic function; conversely, every subharmonic function is the canonical representation of a subharmonic distribution.
PROOF: Let $u(x)$ be the canonical representation of a subharmonic distribution in the domain $\Omega$, $K$ a compact subset of $\Omega$, and $h(x)$ a function continuous on $K$ and harmonic in the interior of $K$. We suppose that $u(x) \le h(x)$ on the boundary of $K$ and show that this inequality also holds in the interior. The function $w(x) = u(x) - h(x)$ is upper semicontinuous on $K$ and is therefore bounded above on that set and attains its maximum at some point $x_0$ of $K$. If $x_0$ is an interior point of $K$, then, since $w(x)$ is the canonical representation of a subharmonic distribution on the interior of $K$, the inequality

$$w(x_0) \le \int w(x_0 + ry)\,d\omega(y)$$

is valid for all small $r > 0$, and this may be written

$$\int [w(x_0 + ry) - w(x_0)]\,d\omega(y) \ge 0.$$

Since the integrand is nonpositive, it follows that $w(x_0 + ry) = w(x_0)$ almost everywhere relative to the measure $d\omega(y)$, and this holds for all values of $r$
smaller than the distance $d$ from $x_0$ to the boundary of $K$. Accordingly, $w(x) = w(x_0)$ almost everywhere in the ball $S(x_0, d)$, and in view of the upper semicontinuity, $w(x)$ is constant there. It therefore becomes clear that the function also attains its maximum at a boundary point of $K$. This maximum cannot be strictly positive, and hence $u(x) \le h(x)$ throughout $K$, as desired.

On the other hand, if $u(x)$ is a subharmonic function in $\Omega$ we consider an arbitrary point $x_0$ of that domain and a ball $S(x_0, r)$ centered about that point. On the compact boundary of that ball the upper semicontinuous $u(x_0 + ry)$ is the limit of a monotone decreasing sequence of continuous functions $f_k(x_0 + ry)$. Let $h_k(x)$ be the Poisson integral of the continuous $f_k(x_0 + ry)$; this function is continuous on the closed ball and harmonic in the interior and it coincides with $f_k(x_0 + ry)$ on the boundary. Since $u(x_0 + ry) \le f_k(x_0 + ry)$ for all $k$, it follows that

$$u(x_0) \le h_k(x_0) = \int f_k(x_0 + ry)\,d\omega(y)$$

and the monotone convergence theorem guarantees that the inequality

$$u(x_0) \le \int u(x_0 + ry)\,d\omega(y)$$

is valid; thus $u(x)$ is subharmonic as a distribution. Moreover, the inequality $u(x) \le (H_\rho * u)(x)$ holds for any point $x$ and sufficiently small $\rho$, and therefore $u(x) \le V(x)$, where $V(x)$ is the canonical representation of the corresponding subharmonic distribution. If these functions do not coincide, there exists a point $x_0$ and a real number $c$ such that $u(x_0) < c < V(x_0)$. The set $u(x) < c$ is open and contains $x_0$, and so for small enough $\rho$ we have $V(x_0) \le (H_\rho * u)(x_0) < c$, a contradiction.

It is often convenient to know that the finite supremum of a family of subharmonic functions is subharmonic, in analogy to the case of convex functions. Such a supremum, in general, is not upper semicontinuous, but if the family is countable the supremum is surely measurable. Let $u_k(x)$ be the family; the inequality $u_k(x) \le (H_\rho * u_k)(x)$, valid for small $\rho$ and all $k$, implies that $U(x) \le (H_\rho * U)(x)$ where $U(x) = \sup_k u_k(x)$, and therefore that $U(x)$ is subharmonic as a distribution. If the further hypothesis that $U(x)$ is continuous is satisfied, then the function is evidently the canonical representation of the corresponding distribution, and is therefore subharmonic as a function.

Using the Green's formula and following the method of Section 8, it is easy to show that for $n = 2$ the function $\log|x|$ is subharmonic and that its Laplacian is $2\pi\delta$. It follows from this that if $f(z)$ is analytic in some region of the complex plane, the function $u(z) = \log|f(z)|$ (clearly harmonic away from
the zeros of $f(z)$) is subharmonic, its Laplacian being the measure which puts the mass $2\pi$ at each zero of $f(z)$, counting those zeros as often as multiplicity requires.

This circumstance makes possible an ingenious proof of the Three Lines theorem given by Thorin. Let $f(z)$ be analytic and bounded in the strip $0 < x < 1$; its derivative $f'(z)$ is then bounded in any closed substrip $a \le x \le b$ and the family of functions $g_y(x) = |f(x + iy)|$ is uniformly Lipschitzian on the interval $[a, b]$. Hence, that family is equicontinuous and the supremum $p(x) = \sup_y |f(x + iy)|$ is continuous on the open interval $(0, 1)$. If the function $f(z)$ is not trivial, $p(x)$ never vanishes and $p(x) = e^{K(x)}$, where $K(x) = \sup_y \log|f(x + iy)|$ is also continuous. If we consider $K(x)$ as a function in the strip, we have

$$K(x) = K(x + iy) = \sup_t \log|f(x + iy + it)|,$$

where $t$ runs over a countable set of values, say the rational numbers. It follows that $K(z)$ is subharmonic in the strip, and since it is a function of only one variable, $K(x)$ is convex and therefore $p(x)$ is logarithmically convex.

Another important result in function theory, Jensen's formula, is obtained from the same considerations. Let $f(z)$ be analytic in a region containing the origin; for simplicity we suppose $f(0) \ne 0$. Let $\mu$ be the positive measure $\Delta\log|f(z)|$, and choose the positive $R$ so that $f(z)$ has no zeros on the circle $|z| = R$, and therefore no zeros in the annulus $R \le |z| \le R + \varepsilon$ for a certain small $\varepsilon > 0$. If $S_r$ denotes the disk of radius $r$, we have $\mu(S_R) = \mu(S_{R+\varepsilon})$ and if $\varphi$ is a testfunction, equal to 1 on $S_R$ and vanishing for $|z| > R + \varepsilon$, then
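$$\mu(S_R) = \int \varphi(z)\,d\mu(z) = \int \Delta\varphi(z)\,\log|f(z)|\,dz.$$

(Here $dz$ represents the element of area in the plane.) We may suppose that the testfunction $\varphi$ is a function only of radius: $\varphi(z) = F(|z|)$ where $F(r) = 1$ for $r \le R$ and vanishes for $r > R + \varepsilon$. It follows that

$$\mu(S_R) = \int_0^{2\pi}\!\!\int_R^{R+\varepsilon}\left(F''(r) + \frac{1}{r}F'(r)\right)\log|f(re^{i\theta})|\,r\,dr\,d\theta = \int_R^{R+\varepsilon}\left(rF''(r) + F'(r)\right)H(r)\,dr,$$

where $H(r) = \int_0^{2\pi}\log|f(re^{i\theta})|\,d\theta$.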
The fact that $f(z)$ has no zeros in the annulus $R \le |z| \le R + \varepsilon$ means that $H(r)$ is a smooth function in the interval $[R, R + \varepsilon]$ since we can differentiate under the integral sign as often as required. It is therefore possible to integrate by parts, and since $rF'(r)$ vanishes at the endpoints of the interval of integration, we have

$$\mu(S_R) = -\int_R^{R+\varepsilon} rF'(r)H'(r)\,dr.$$

Since $rH'(r)$ is continuous in $[R, R + \varepsilon]$ and the integral of $F'(r)$ over that interval is $-1$ we infer, letting $\varepsilon$ approach 0, that $\mu(S_r) = rH'(r)$ in that interval. The function $H(r)$ is differentiable except, perhaps, at a countable set of isolated values of $r$ which correspond to zeros of $f(z)$, and this equation shows that $H'(r)$ is locally bounded where it exists. Hence $H(r)$ is Lipschitzian, and therefore absolutely continuous; since $H(0) = 2\pi\log|f(0)|$ we obtain

Jensen's Formula:

$$\int_0^R \frac{\mu(S_r)}{r}\,dr = \int_0^{2\pi}\log|f(Re^{i\theta})|\,d\theta - 2\pi\log|f(0)|.$$

Of course $\mu(S_r) = 2\pi N(r)$ where $N(r)$ is the number of zeros of $f(z)$ in the disk $S_r$.
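Jensen's formula is easy to test numerically when the zeros are known. The sketch below assumes a polynomial with hand-picked zeros (none at the origin and none on the circle $|z| = R$) and evaluates the angular integral by an equally spaced sum; both the data and the function names are assumptions made for illustration. Since $\mu(S_r) = 2\pi N(r)$, each zero $a$ with $|a| < R$ contributes $2\pi\log(R/|a|)$ to the left side.

```python
import cmath
import math

# Assumed zeros of the test polynomial.
ZEROS = [0.5, 0.3j, -0.2 + 0.6j, 2.0]


def f(z):
    """Monic polynomial with the zeros listed above."""
    value = 1.0
    for a in ZEROS:
        value *= (z - a)
    return value


def angular_integral(radius, samples=4096):
    """Approximate the integral over [0, 2*pi] of log|f(radius * e^{i theta})|."""
    step = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        total += math.log(abs(f(radius * cmath.exp(1j * k * step))))
    return total * step


R = 1.0

# Left side: integral from 0 to R of mu(S_r)/r dr with mu(S_r) = 2*pi*N(r).
left_side = 2 * math.pi * sum(math.log(R / abs(a)) for a in ZEROS if abs(a) < R)

# Right side of Jensen's formula.
right_side = angular_integral(R) - 2 * math.pi * math.log(abs(f(0)))
```

The zero at 2 lies outside the disk and contributes nothing to the left side, which is why only three terms appear in the sum.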
28. Temperate Distributions

We introduce the class $\mathcal{S}$ consisting of $C^\infty$-functions $f(x)$ on $R^n$ having the property that $p(x)q(D)f(x)$ is bounded for all polynomials $p$ and $q$. Such functions evidently converge to 0 quite rapidly at infinity: for every index $\alpha$ and every integer $N$ the function $(1 + |x|^2)^N D^\alpha f(x)$ is bounded, and therefore $|D^\alpha f(x)| \le C(1 + |x|^2)^{-N}$. The class $\mathcal{S}$ is a linear space, and we take it with a natural topology: that of the uniform convergence, on $R^n$, of the functions $p(x)q(D)f(x)$. This topology is determined by the sequence of seminorms (they are in fact norms) defined by

$$|||f|||_k = \sup_{|\alpha| \le k}\,\sup_x\,(1 + |x|^2)^k\,|D^\alpha f(x)|.$$

The topology is obviously metric, and it is not hard to see that the space is complete; if $f_i(x)$ is a Cauchy sequence, it is surely Cauchy in the uniform norm, and so also are the sequences $(1 + |x|^2)^N D^\alpha f_i(x)$ for any $N$ and any $\alpha$; thus there exists an $f_0(x)$ in $\mathcal{S}$ to which the sequence converges.
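The testfunctions are exactly the functions of $\mathcal{S}$ with compact support, and it is important to notice that the testfunctions are dense in $\mathcal{S}$; this we verify by choosing a testfunction $\varphi(x)$ which equals $+1$ for $|x| \le 1$ and which vanishes for $|x| > 2$; we set $f_m(x) = \varphi(x/m)f(x)$, and we show that this sequence of testfunctions converges in $\mathcal{S}$ to $f(x)$. Let $\psi(x) = 1 - \varphi(x)$; therefore, $f(x) - f_m(x) = \psi(x/m)f(x)$. By the Leibniz rule,

$$D^\alpha(f - f_m)(x) = \sum_{\beta \le \alpha}\binom{\alpha}{\beta}\,m^{-|\beta|}\,(D^\beta\psi)(x/m)\,D^{\alpha-\beta}f(x),$$

and if the index $\beta$ is not 0, we have

$$|m^{-|\beta|}(D^\beta\psi)(x/m)| \le C_\beta/m$$

for an appropriate constant $C_\beta$. Separating out the term in the sum for which $\beta = 0$, we have

$$|D^\alpha(f - f_m)(x)| \le \frac{1}{m}\sum_{0 \ne \beta \le \alpha}\binom{\alpha}{\beta}\,C_\beta\,|D^{\alpha-\beta}f(x)| + |\psi(x/m)|\,|D^\alpha f(x)|,$$

where $|\psi(x/m)| \le C_0 = \|\psi\|_\infty$ and $\psi(x/m)$ vanishes for $|x| \le m$, and therefore,

$$\|(1 + |x|^2)^k D^\alpha(f - f_m)\|_\infty \le \frac{1}{m}\sum_{0 \ne \beta \le \alpha}\binom{\alpha}{\beta}\,C_\beta\,\|(1 + |x|^2)^k D^{\alpha-\beta}f\|_\infty + \sup_{|x| > m} C_0\,|D^\alpha f(x)|\,(1 + |x|^2)^k.$$

It follows that $\|(1 + |x|^2)^k D^\alpha(f - f_m)\|_\infty$ converges to 0 with increasing $m$.

Besides being a complete metric space, $\mathcal{S}$ is also separable; since we do not need this fact now we postpone the proof. It is clear that $\mathcal{S}$ is closed under differentiation, and that differentiation is a continuous linear mapping of $\mathcal{S}$ into itself. The product of any two elements of $\mathcal{S}$ is again in $\mathcal{S}$, and it is also true, but not so easy to see, that $\mathcal{S}$ is closed under convolution; this will become clear a little later.

We consider next the continuous linear functionals on $\mathcal{S}$. If $F$ is such a functional, it is clearly a linear functional on $\mathcal{D}$, and since a sequence converging in $\mathcal{D}$ evidently converges also in $\mathcal{S}$, $F$ is sequentially continuous on $\mathcal{D}$ and hence is a distribution. We therefore identify $F$ with this distribution and call it a temperate distribution. It is necessary to note that two distinct continuous linear functionals on $\mathcal{S}$ determine two distinct temperate distributions,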
since $\mathcal{D}$ is dense in $\mathcal{S}$. Not every distribution is temperate: the function $f(x) = e^x$ on the real axis determines a distribution which cannot be extended to a continuous linear functional on $\mathcal{S}$. The space of temperate distributions is the dual of the space $\mathcal{S}$ and is written $\mathcal{S}'$. Note that we could have begun the theory of distributions with the space $\mathcal{S}$ rather than $\mathcal{D}$; the distributions which we would have obtained would be those in $\mathcal{S}'$ and the theory would have developed along quite analogous lines.

It is clear that the temperate distributions are closed under differentiation, but it is not true that the product of a temperate distribution $T$ and a $C^\infty$-function $a(x)$ is always temperate. For example, if $a(x) = \cos(e^x)$ and $T$ is the distribution defined by the equation $T(\varphi) = \sum_{n=1}^\infty \varphi'(n)$, then $aT$ is not temperate. This is a consequence of the fact that the function $a(x)f(x)$ need not be in $\mathcal{S}$ even though $f(x)$ is, since the derivatives of this function, in general, will not vanish at infinity. It is important, but easy to see, that any distribution $S$ with compact support is temperate. The following theorem is also quite useful.
Theorem: Let $\mu$ be a measure on $R^n$ such that there exists a constant $C$ and an integer $N$ so that the total mass of $\mu$ in a ball of radius $r$ is at most $Cr^N$; then $\mu$ is temperate.

PROOF: We have

$$|\mu(\varphi)| = \left|\int \varphi(x)\,d\mu(x)\right| \le |||\varphi|||_k \int \frac{|d\mu(x)|}{(1 + |x|^2)^k},$$

which for large enough $k$ is finite, as we shall show, and therefore the distribution is continuous relative to the seminorm $|||\varphi|||_k$ and hence temperate. To estimate the integral above, we introduce $M(r)$ = total mass of $\mu$ in the ball of radius $r$; the integral becomes the Stieltjes integral

$$\int_0^\infty \frac{dM(r)}{(1 + r^2)^k},$$

which can be integrated by parts; it is then bounded by $M(1) + \int_1^\infty 2kCr^{N-2k}\,dr$, which is finite if $2k > N + 1$.

From the theorem, it follows that any measure of finite total mass is temperate and that any bounded measurable function is temperate. Thus, functions in $L^\infty(R^n)$ and functions in $L^1(R^n)$ are temperate distributions. Since any function in $L^p(R^n)$ is the sum of a bounded function and an integrable one, the functions in $L^p(R^n)$ are all temperate distributions. This remark gives rise to an interesting example: the function $a(x) = \cos(e^x)$ is temperate, and so, therefore, are its various derivatives. The first derivative $a'(x) = -\sin(e^x)\,e^x$ is temperate, although its absolute value is not.
Of course, the space $\mathcal{S}'$ of temperate distributions is taken in the weak-star topology determined by $\mathcal{S}$, as explained in Section 17. A sequence $T_k$ in $\mathcal{S}'$ converges to $T_0$ if and only if $T_k(f)$ converges to $T_0(f)$ for every $f$ in $\mathcal{S}$. It should be noted, however, that the topology of $\mathcal{S}'$ is not metrizable. Finally we should observe that, just as in the case of general distributions, the space of temperate distributions is weak-star sequentially complete, that is, if $T_k$ is a sequence in $\mathcal{S}'$ such that $T_k(f)$ is a Cauchy sequence for every $f$ in $\mathcal{S}$, then the linear functional $T_0(f) = \lim_k T_k(f)$ is also a temperate distribution. The proof is virtually the one we gave before, and depends on the fact that $\mathcal{S}$ is a complete metric space. We form the seminorm

$$|||f||| = \sup_k |T_k(f)|,$$

which is lower semicontinuous on $\mathcal{S}$, since it is a supremum of continuous functions, and which is everywhere finite by hypothesis. The sets

$$K_m = \{f : |||f||| \le m\}$$

are closed and convex, and are also symmetric about the origin. Their union covers $\mathcal{S}$, hence, by category, there exists an integer $N$ such that $K_N$ has an interior point, and by virtue of the symmetry, the origin is such an interior point. It follows that $|||f|||$ is bounded on some neighborhood of the origin, and therefore $|T_0(f)|$ is bounded on such a neighborhood. Thus $T_0$ is continuous on $\mathcal{S}$, that is, it is a temperate distribution.

It is easy to show that the complete metric space $\mathcal{S}$ is separable: the testfunctions are themselves a dense subset of $\mathcal{S}$, and for any testfunction $f(x)$, its regularizations $(f * \varphi_\varepsilon)(x)$ converge to $f$ in $\mathcal{D}$ as $\varepsilon$ approaches 0 through a countable set of values. Since the supports of these functions are all contained in some fixed compact, the sequence also converges in the metric of $\mathcal{S}$. Now, as we have seen, the regularizations themselves are the limits of the Riemann sums:

$$(f * \varphi_\varepsilon)(x) = \int f(y)\,\varphi_\varepsilon(x - y)\,dy = \lim \sum_i f(y_i)\,\varphi_\varepsilon(x - y_i)\,|\Delta_i|,$$

and the Riemann sums are testfunctions, converging in $\mathcal{D}$ to $f * \varphi_\varepsilon$, hence also converging in $\mathcal{S}$. It follows that the system of functions

$$\sum_i m_i\,\varphi_\varepsilon(x - y_i)$$

form a countable dense subset of $\mathcal{S}$ as the coefficients $m_i$ run through all rational numbers, the $y_i$ through a countable dense subset of $R^n$, and the $\varepsilon$ through a sequence converging to 0.
29. Fourier Transforms of Functions in $\mathcal{S}$

For any function $f(x)$ in the class $\mathcal{S}$ we define its Fourier transform as the function $\hat f(\xi)$ given by the formula

$$\hat f(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi} f(x)\,dx.$$

There is no doubt that the integral exists, since $f(x)$ is continuous and vanishes quite rapidly at infinity; moreover, $\hat f(\xi)$ is even a continuous function of $\xi$, since if $\xi$ approaches a limit, the integrands above converge to the corresponding limit at all points of $R^n$ and are uniformly bounded by the integrable function $|f(x)|$, and therefore the Lebesgue convergence theorem guarantees the continuity of $\hat f(\xi)$. It is more important to note that $\hat f(\xi)$ is differentiable: if we differentiate formally relative to the variable $\xi_1$ we have

$$\frac{\partial \hat f}{\partial \xi_1}(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi}(-ix_1) f(x)\,dx$$

and since the function $(-ix_1)f(x)$ is in $\mathcal{S}$, hence integrable, the differentiation under the integral sign was legitimate. It follows that $\hat f(\xi)$ has continuous first derivatives everywhere. Since this argument can easily be repeated, we see that $\hat f(\xi)$ has continuous derivatives of all orders, that is, $\hat f(\xi)$ is in $\mathcal{E}$, and its derivatives are given by the formula

$$D^\alpha \hat f = \big((-ix)^\alpha f\big)^{\wedge}.$$

If $f(x)$ is in $\mathcal{S}$, so also is $\partial f/\partial x_1(x)$; we compute its Fourier transform to obtain

$$\Big(\frac{\partial f}{\partial x_1}\Big)^{\wedge}(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi}\,\frac{\partial f}{\partial x_1}(x)\,dx$$

which can be integrated by parts. We find $(\partial f/\partial x_1)^{\wedge}(\xi) = (i\xi_1)\hat f(\xi)$, and more generally $(D^\alpha f)^{\wedge}(\xi) = (i\xi)^\alpha \hat f(\xi)$.
The class $\hat{\mathcal{S}}$ consisting of Fourier transforms of functions in $\mathcal{S}$ is therefore a class of $C^\infty$-functions, closed under the operations of differentiation and multiplication by polynomials. Moreover, any function in that class is bounded; we have only to choose $N$ large enough in the following inequality:

$$\|\hat f\|_\infty \le (2\pi)^{-n/2}\int |f(x)|\,dx \le (2\pi)^{-n/2}\,|||f|||_N \int \frac{dx}{(1 + |x|^2)^N}.$$

It follows that the Fourier transforms are themselves functions in $\mathcal{S}$.

Theorem: The Fourier transform is a continuous transformation of $\mathcal{S}$ into itself.

PROOF: Suppose $f_j(x)$ is a sequence converging to 0 in $\mathcal{S}$; choosing $N$ greater than the dimension of the space in the foregoing inequality we find that $\|\hat f_j\|_\infty$ converges to 0. Moreover, for fixed $k$ and $\alpha$, the sequence $g_j(x) = (1 - \Delta)^k(-ix)^\alpha f_j(x)$ also converges to 0 in $\mathcal{S}$, therefore

$$\|(1 + |\xi|^2)^k D^\alpha \hat f_j\|_\infty = \|\hat g_j\|_\infty$$

also converges to 0. Since $|||\hat f_j|||_N$ is a finite sum of such terms, that seminorm converges to 0.

It is easy to see that the Gaussian $G(x) = \exp(-|x|^2/2)$ is a function in $\mathcal{S}$; in order to compute its Fourier transform we consider the special case when the dimension $n$ is 1 and note that the function satisfies the differential equation $G'(x) = -xG(x)$. Taking Fourier transforms we find

$$i\,(G')^{\wedge}(\xi) = [-ixG]^{\wedge}(\xi)$$

and since $i\,(G')^{\wedge}(\xi)$ is also equal to $-\xi\hat G(\xi)$ it follows that $(d/d\xi)\hat G(\xi) = -\xi\hat G(\xi)$. This means that the derivative of the ratio $\hat G(\xi)/G(\xi)$ vanishes identically, and so $\hat G(\xi) = CG(\xi)$ where the constant $C$, of course, is $\hat G(0) = 1$.
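The equation $\hat G(\xi) = G(\xi)$, which asserts that the Gaussian is its own Fourier transform, also holds for higher dimensions, as the following computation shows.

$$\hat G(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi}\exp[-|x|^2/2]\,dx = \prod_{k=1}^{n}\Big((2\pi)^{-1/2}\int_{-\infty}^{+\infty}\exp[-i\xi_k t]\exp[-t^2/2]\,dt\Big) = \prod_{k=1}^{n}\exp[-\xi_k^2/2] = G(\xi).$$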
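The one-dimensional statement $\hat G = G$ can be checked numerically. The following is only a sketch under stated assumptions: the helper `fourier_transform`, the truncation of the integral to a finite interval, and the trapezoidal rule are choices made for this illustration and are not part of the text; the normalization $(2\pi)^{-1/2}$ is the one adopted above.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=12.0, steps=4000):
    """Approximate (2*pi)**(-1/2) * integral exp(-i*x*xi) * f(x) dx (n = 1).

    The integral is truncated to [-half_width, half_width] and evaluated by
    the trapezoidal rule; both are assumptions of this sketch and are only
    adequate for rapidly decreasing f.
    """
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * xi) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian(x):
    """G(x) = exp(-x**2 / 2)."""
    return math.exp(-x * x / 2.0)


# Compare the numerical transform with G itself at a few frequencies.
for xi in (0.0, 0.5, 1.0, 2.0, 3.0):
    approx = fourier_transform(gaussian, xi)
    print(xi, approx.real, gaussian(xi))
```

In higher dimensions the integral factors into $n$ such one-dimensional integrals, which is exactly the product appearing in the computation above.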
The important result that the Fourier transform is one-to-one and onto is a consequence of the following theorem.

Theorem: $\hat{\hat f} = \check f$.

PROOF: We first remark that reflection and Fourier transform commute, namely, $(\check f)^{\wedge} = (\hat f)^{\vee}$, and this is an immediate consequence of the definition of the transform by the integral. Thus, even functions have even transforms. Choose $f$ and $g$ in $\mathcal{S}$ and form the Fourier transform of $g(\xi)\hat f(\xi)$; we obtain

$$[g\hat f]^{\wedge}(x) = (2\pi)^{-n/2}\int e^{-ix\xi} g(\xi)\hat f(\xi)\,d\xi = (2\pi)^{-n/2}\int e^{-ix\xi} g(\xi)\,(2\pi)^{-n/2}\int e^{-iy\xi} f(y)\,dy\,d\xi.$$

Since both $f$ and $g$ are integrable, Fubini's theorem may be invoked to interchange the order of integrations; the expression above becomes

$$(2\pi)^{-n/2}\int \Big[(2\pi)^{-n/2}\int e^{-i(x+y)\xi} g(\xi)\,d\xi\Big] f(y)\,dy = (2\pi)^{-n/2}\int \hat g(x + y) f(y)\,dy.$$

Thus, finally,

$$\int e^{-ix\xi} g(\xi)\hat f(\xi)\,d\xi = \int \hat g(x + y) f(y)\,dy.$$

Now we pick the function $g(\xi)$ as follows: we select a function $k(\xi)$ in $\mathcal{S}$ such that $k(0) = 1$, and let $g(\xi) = k(\varepsilon\xi)$ where $\varepsilon$ is positive. Accordingly,

$$\hat g(x) = (2\pi)^{-n/2}\int e^{-ix\xi} k(\varepsilon\xi)\,d\xi = \varepsilon^{-n}(2\pi)^{-n/2}\int e^{-ix\eta/\varepsilon} k(\eta)\,d\eta$$
and therefore, $\hat g(x) = \varepsilon^{-n}\hat k(x/\varepsilon)$. Thus, the right-hand side of the identity above becomes

$$\varepsilon^{-n}\int \hat k\Big(\frac{x + y}{\varepsilon}\Big) f(y)\,dy = \int \hat k(s) f(-x + \varepsilon s)\,ds,$$

while the left-hand side is $\int e^{-ix\xi} k(\varepsilon\xi)\hat f(\xi)\,d\xi$. We therefore let $\varepsilon$ approach 0 in the identity

$$\int e^{-ix\xi} k(\varepsilon\xi)\hat f(\xi)\,d\xi = \int \hat k(s) f(-x + \varepsilon s)\,ds$$

and note that the integrands on either side converge with decreasing $\varepsilon$ pointwise everywhere to the limits $e^{-ix\xi}\hat f(\xi)$ and $\hat k(s) f(-x)$, respectively, and in each case these functions are uniformly bounded in absolute value by the integrable functions

$$|\hat f(\xi)|\,\|k\|_\infty \quad\text{and}\quad |\hat k(s)|\,\|f\|_\infty,$$

respectively. The Lebesgue convergence theorem then guarantees that in the limit we obtain

$$\hat{\hat f}(x) = Cf(-x) = C\check f(x)$$

after we have multiplied both sides by the factor $(2\pi)^{-n/2}$, the constant $C$ being independent of the choice of $f$ in $\mathcal{S}$. Since there exists an even function in $\mathcal{S}$ which is its own Fourier transform, namely the Gaussian, we have $C = 1$.

The previous theorem immediately gives us an expression for the inverse Fourier transform: we have

$$\hat{\hat f}(x) = f(-x) = (2\pi)^{-n/2}\int e^{-ix\xi}\hat f(\xi)\,d\xi$$

and therefore, $f(x) = (2\pi)^{-n/2}\int e^{+ix\xi}\hat f(\xi)\,d\xi$. Some authors prefer to define the Fourier transform without the factor $(2\pi)^{-n/2}$. Another convention is to define the transform without that factor, but to insert it in the exponential, so that the transform is given by the integral $\int e^{-2\pi i x\xi} f(x)\,dx$, the inverse transform occurring in an equally simple expression. The convention which we follow, however, seems to be the one most widely used.
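The inversion formula and the relation $\hat{\hat f} = \check f$ lend themselves to a numerical check in one dimension. The sketch below is an illustration only: the test function $f(x) = (1 + x)e^{-x^2/2}$, the grid, and the helper `integral_transform` are assumptions of the example. The function is chosen to be neither even nor odd, so that $f(x)$ and $f(-x)$ are visibly different.

```python
import cmath
import math

HALF_WIDTH = 10.0
STEPS = 400
H = 2.0 * HALF_WIDTH / STEPS
GRID = [-HALF_WIDTH + j * H for j in range(STEPS + 1)]
NORM = 1.0 / math.sqrt(2.0 * math.pi)


def f(x):
    """A rapidly decreasing test function that is neither even nor odd."""
    return (1.0 + x) * math.exp(-x * x / 2.0)


def integral_transform(samples, point, sign):
    """Trapezoidal value of (2*pi)**(-1/2) * sum exp(sign*i*t*point) * samples(t).

    sign = -1 gives the Fourier transform, sign = +1 the inverse transform,
    both in the normalization used in the text.
    """
    total = 0.0 + 0.0j
    for j, (t, value) in enumerate(zip(GRID, samples)):
        weight = 0.5 if j in (0, STEPS) else 1.0
        total += weight * cmath.exp(sign * 1j * t * point) * value
    return NORM * H * total


f_samples = [f(x) for x in GRID]
f_hat = [integral_transform(f_samples, xi, -1) for xi in GRID]


def inverse_at(x):
    """Inverse transform of f_hat: should reproduce f(x)."""
    return integral_transform(f_hat, x, +1)


def double_transform_at(x):
    """Transform of f_hat: should reproduce the reflection f(-x)."""
    return integral_transform(f_hat, x, -1)


x0 = 0.7
print(inverse_at(x0).real, f(x0))
print(double_transform_at(x0).real, f(-x0))
```

Applying the transform twice reflects the function and applying it four times returns it, which is the "period four" property used later for temperate distributions.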
We return to the fundamental identity of the proof of the previous theorem:

$$[g\hat f]^{\wedge}(x) = (2\pi)^{-n/2}\int \hat g(x + y) f(y)\,dy$$

and substitute, for $f$, the function $(\hat h)^{\vee}$ where $h$ is in $\mathcal{S}$. We obtain

$$[gh]^{\wedge}(x) = (2\pi)^{-n/2}\int \hat g(x + y)\hat h(-y)\,dy = (2\pi)^{-n/2}(\hat g * \hat h)(x)$$

and by simple substitutions, the companion equation

$$(g * h)^{\wedge}(\xi) = (2\pi)^{n/2}\hat g(\xi)\hat h(\xi).$$

It follows that multiplication and convolution correspond to one another under the Fourier transform (except, of course, for the factor $(2\pi)^{n/2}$) and since the class $\mathcal{S}$ is closed under multiplication, that class must also be closed under convolution.

There is more to be learned from the identity of the previous theorem. We write again

$$(2\pi)^{-n/2}\int e^{-ix\xi} g(\xi)\hat f(\xi)\,d\xi = (2\pi)^{-n/2}\int \hat g(x + y) f(y)\,dy$$

and set $x = 0$, canceling the factor $(2\pi)^{-n/2}$ to obtain

$$\int g(\xi)\hat f(\xi)\,d\xi = \int \hat g(y) f(y)\,dy,$$

an identity which we shall need when we study the Fourier transforms of temperate distributions. If we substitute $\hat f$ for $f$ we find

$$\int g(\xi)\check f(\xi)\,d\xi = \int \hat g(x)\hat f(x)\,dx;$$

selecting $h$ in $\mathcal{S}$ so that its complex conjugate is the reflection of $f$, that is, $\check f = \bar h$, we have

$$\hat f(\xi) = \overline{\hat h(\xi)};$$

thus, finally,

$$\int g(\xi)\overline{h(\xi)}\,d\xi = \int \hat g(\xi)\overline{\hat h(\xi)}\,d\xi.$$

In the familiar notation of $L^2$-spaces, this may be conveniently put as $(g, h) = (\hat g, \hat h)$;
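this is the Parseval equation; in particular, setting $g = h$ we have $\|g\| = \|\hat g\|$, which shows that the Fourier transform, defined on the dense subspace $\mathcal{S}$ of $L^2$, is an isometric mapping of that space on itself. Of course, the term isometric has reference to the $L^2$-metric.

We list the essential results for the Fourier transforms of functions in the class $\mathcal{S}$:

$$\hat f(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi} f(x)\,dx, \qquad f(x) = (2\pi)^{-n/2}\int e^{ix\xi}\hat f(\xi)\,d\xi,$$

$$(D^\alpha f)^{\wedge}(\xi) = (i\xi)^\alpha\hat f(\xi), \qquad D^\alpha\hat f = \big((-ix)^\alpha f(x)\big)^{\wedge},$$

$$(\tau_h f)^{\wedge}(\xi) = e^{-i\xi h}\hat f(\xi), \qquad \tau_h\hat f = \big(e^{ihx} f(x)\big)^{\wedge},$$

$$(f * g)^{\wedge}(\xi) = (2\pi)^{n/2}\hat f(\xi)\hat g(\xi), \qquad (fg)^{\wedge}(\xi) = (2\pi)^{-n/2}(\hat f * \hat g)(\xi),$$

$$\hat{\hat f} = \check f, \qquad (\check f)^{\wedge} = (\hat f)^{\vee}.$$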
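Among the relations listed, the correspondence between convolution and multiplication is the one most easily mistaken by a factor, so a numerical check in one dimension may be worthwhile. The following sketch compares $(f * g)^{\wedge}(\xi)$ with $(2\pi)^{1/2}\hat f(\xi)\hat g(\xi)$; the particular functions $f(x) = e^{-x^2/2}$ and $g(x) = xe^{-x^2}$, the grid, and all helper names are assumptions made for the illustration.

```python
import cmath
import math

HALF_WIDTH = 10.0
STEPS = 400
H = 2.0 * HALF_WIDTH / STEPS
GRID = [-HALF_WIDTH + j * H for j in range(STEPS + 1)]
ROOT_TWO_PI = math.sqrt(2.0 * math.pi)


def trapezoid(values):
    """Trapezoidal rule on GRID for a list of sampled values."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))


def transform_samples(samples, xi):
    """(2*pi)**(-1/2) * integral exp(-i*x*xi) * samples(x) dx on GRID."""
    values = [cmath.exp(-1j * x * xi) * s for x, s in zip(GRID, samples)]
    return trapezoid(values) / ROOT_TWO_PI


def f(x):
    return math.exp(-x * x / 2.0)


def g(x):
    return x * math.exp(-x * x)


def convolution(x):
    """(f * g)(x) = integral f(x - y) g(y) dy, truncated to the grid."""
    return trapezoid([f(x - y) * g(y) for y in GRID])


F_SAMPLES = [f(x) for x in GRID]
G_SAMPLES = [g(x) for x in GRID]
CONV_SAMPLES = [convolution(x) for x in GRID]


def lhs(xi):
    """Transform of the convolution."""
    return transform_samples(CONV_SAMPLES, xi)


def rhs(xi):
    """(2*pi)**(1/2) times the product of the transforms."""
    return ROOT_TWO_PI * transform_samples(F_SAMPLES, xi) * transform_samples(G_SAMPLES, xi)


for xi in (0.9, 1.7):
    print(xi, lhs(xi), rhs(xi))
```

With a different normalization of the transform the factor $(2\pi)^{1/2}$ would change or disappear, which is why the convention has to be fixed before such a comparison is made.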
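Of great importance is the following relation, a special case of which we have already encountered:

$$(f \circ l)^{\wedge} = \frac{1}{|\det l|}\,\hat f \circ l_*.$$

Here the transformation $l_*$ is defined by the identity $(\xi, l^{-1}(x)) = (l_*(\xi), x)$ and when the transformations are written in terms of matrices, $l_*$ is the transpose of the inverse of $l$, which we suppose exists. The relation is proved by a straightforward change of variables, as follows:

$$(f \circ l)^{\wedge}(\xi) = (2\pi)^{-n/2}\int e^{-i(\xi, x)} f(l(x))\,dx = \frac{(2\pi)^{-n/2}}{|\det l|}\int e^{-i(\xi, l^{-1}(y))} f(y)\,dy = \frac{(2\pi)^{-n/2}}{|\det l|}\int e^{-i(l_*(\xi), y)} f(y)\,dy.$$

In particular, when $l = l_\varepsilon$, $(f \circ l_\varepsilon)^{\wedge}(\xi) = \varepsilon^{-n}\hat f(\xi/\varepsilon)$.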
30. Fourier Transforms of Temperate Distributions

If $T$ is a temperate distribution, we define its Fourier transform $\hat T$ by the equation $\hat T(\varphi) = T(\hat\varphi)$ for all testfunctions $\varphi$. Since the Fourier transform is a continuous one-to-one mapping of $\mathcal{S}$ onto itself, the distribution $\hat T$ is also a continuous linear functional on $\mathcal{S}$, that is, is itself a temperate distribution. Because $\mathcal{D}$ is dense in $\mathcal{S}$ we have $\hat T(f) = T(\hat f)$ for all $f$ in $\mathcal{S}$, not just testfunctions. The properties of the transformation on $\mathcal{S}$ are carried over immediately to the same properties for the Fourier transform on $\mathcal{S}'$. Thus the Fourier transform is a one-to-one continuous linear mapping of $\mathcal{S}'$ onto itself with period four.

Theorem:
(1) $(D^\alpha T)^{\wedge} = (i\xi)^\alpha\hat T$.
(2) $D^\alpha\hat T = ((-ix)^\alpha T)^{\wedge}$.
(3) $(\tau_h T)^{\wedge} = e^{-ih\xi}\hat T$.
(4) $\tau_h\hat T = (e^{ihx}T)^{\wedge}$.
(5) $(\check T)^{\wedge} = (\hat T)^{\vee}$.
(6) $\hat{\hat T} = \check T$.
(7) $(T \circ l)^{\wedge} = \dfrac{1}{|\det l|}\,\hat T \circ l_*$.
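PROOF: Each assertion follows from the corresponding formula for functions in $\mathcal{S}$. For example, for (2) we have, for $f$ in $\mathcal{S}$,

$$(D^\alpha\hat T)(f) = (-1)^{|\alpha|}\hat T(D^\alpha f) = (-1)^{|\alpha|}T\big((D^\alpha f)^{\wedge}\big) = T\big((-ix)^\alpha\hat f\big) = \big((-ix)^\alpha T\big)(\hat f) = \big((-ix)^\alpha T\big)^{\wedge}(f).$$

For (7) we note that

$$(f \circ l_*^{-1})^{\wedge} = \frac{1}{|\det l_*^{-1}|}\,\hat f \circ l^{-1}$$

and therefore, since $|\det l_*^{-1}| = |\det l|$,

$$\hat f \circ l^{-1} = |\det l|\,(f \circ l_*^{-1})^{\wedge}.$$

Now,

$$(T \circ l)^{\wedge}(f) = (T \circ l)(\hat f) = \frac{1}{|\det l|}\,T(\hat f \circ l^{-1}) = T\big((f \circ l_*^{-1})^{\wedge}\big) = \hat T(f \circ l_*^{-1}) = \frac{1}{|\det l|}\,(\hat T \circ l_*)(f).$$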
Of course, the most important part of the previous theorem is (6), which guarantees that the Fourier transform is one-to-one and onto. When the distribution $T$ is a measure on $R^n$ of finite total mass it is clearly temperate; its Fourier transform is then a function, as the following theorem shows.

Theorem: If $\mu$ has finite total mass its Fourier transform is the continuous and bounded function $\hat\mu(\xi)$ defined by

$$\hat\mu(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi}\,d\mu(x).$$
PROOF: There is no question about the existence of the integral above, since $\mu$ has finite total mass and the exponential is of absolute value 1, thus $|\hat\mu(\xi)| \le (2\pi)^{-n/2}\|\mu\|$ where $\|\mu\| = \int |d\mu(x)|$; the function is continuous by virtue of the Lebesgue convergence theorem, since, as $\xi_k$ converges to $\xi_0$ the integrands converge pointwise everywhere to $e^{-ix\xi_0}$ and are uniformly bounded by the integrable function $+1$. To show that $\hat\mu(\xi)$ is indeed the Fourier transform we invoke the Fubini theorem:

$$\mu(\hat f) = \int \hat f(x)\,d\mu(x) = (2\pi)^{-n/2}\iint e^{-ix\xi} f(\xi)\,d\xi\,d\mu(x) = \int \hat\mu(\xi) f(\xi)\,d\xi.$$

Thus our theorem shows that the Fourier transform of a function in $\mathcal{S}$ is also its Fourier transform when that function is regarded as a temperate distribution. A stronger result holds when we know that the measure $\mu$ is absolutely continuous.
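The theorem, and the difference that absolute continuity makes, can be illustrated with two explicit examples in one dimension. The sketch below is not taken from the text: the measure made of finitely many point masses, the closed form $(2\pi)^{-1/2}\,2\sin\xi/\xi$ for the indicator function of $[-1, 1]$, and the helper names are all assumptions of the example. Both transforms obey the bound $(2\pi)^{-1/2}\|\mu\|$, but only the second decays at infinity.

```python
import cmath
import math

NORM = 1.0 / math.sqrt(2.0 * math.pi)


def discrete_measure_hat(atoms, xi):
    """Transform of a finite sum of point masses.

    atoms is a list of (position, mass) pairs; the transform is
    (2*pi)**(-1/2) * sum mass * exp(-i * position * xi).
    """
    return NORM * sum(mass * cmath.exp(-1j * x * xi) for x, mass in atoms)


def total_mass(atoms):
    """Total mass (total variation) of the discrete measure."""
    return sum(abs(mass) for _, mass in atoms)


def indicator_hat(xi):
    """Transform of the indicator function of [-1, 1] (an L^1 density)."""
    if xi == 0.0:
        return 2.0 * NORM
    return 2.0 * NORM * math.sin(xi) / xi


atoms = [(-1.0, 0.5), (0.25, -2.0), (3.0, 1.5)]
for xi in (0.0, 1.0, 10.0, 1000.0):
    bound = NORM * total_mass(atoms)
    print(xi, abs(discrete_measure_hat(atoms, xi)), bound, abs(indicator_hat(xi)))
```

A single point mass has a transform of constant modulus $(2\pi)^{-1/2}$, so no decay can be expected for measures in general.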
Riemann-Lebesgue Lemma: If $f(x)$ is an integrable function, its Fourier transform is the function

$$\hat f(\xi) = (2\pi)^{-n/2}\int e^{-ix\xi} f(x)\,dx,$$

which is continuous, vanishing at infinity and bounded by $(2\pi)^{-n/2}\int |f(x)|\,dx$.

PROOF: All that we have to show is that $\hat f(\xi)$ vanishes at infinity. Since the space $\mathcal{D}$ is dense in $L^1(R^n)$, there exists a testfunction $\varphi$ such that $f = \varphi + h$ where the $L^1$-norm of $h$ is smaller than $\varepsilon$. Accordingly $\hat f(\xi) = \hat\varphi(\xi) + \hat h(\xi)$ with $\hat\varphi(\xi)$ in the class $\mathcal{S}$ and $\|\hat h\|_\infty < \varepsilon$. We take the limit superior as $\xi$ approaches infinity:

$$\limsup |\hat f(\xi)| \le \limsup |\hat\varphi(\xi)| + \limsup |\hat h(\xi)|;$$

this is necessarily smaller than $\varepsilon$ because $\hat\varphi$ is in $\mathcal{S}$. Since $\varepsilon$ is arbitrary, the function $\hat f(\xi)$ vanishes at infinity.

The hypothesis that the measure $d\mu = f(x)\,dx$ was absolutely continuous was really needed in the previous theorem; for example, the Fourier transform of $\delta$ is the constant function $(2\pi)^{-n/2}$, which does not vanish at infinity.

We have already seen that two measures $\mu$ and $\nu$, each of finite total mass, have a convolution $\mu * \nu$ which is also a measure of finite total mass. Thus the convolution is also temperate. We verify that the Fourier transform of the
convolution is the product of the Fourier transforms, with an additional factor of $(2\pi)^{n/2}$. We recall that $(\mu * \nu)(\varphi) = \iint \varphi(x + y)\,d\mu(x)\,d\nu(y)$; hence, computing the transform, we obtain

$$(\mu * \nu)^{\wedge}(\xi) = (\mu * \nu)\big((2\pi)^{-n/2}e^{-i\xi(x + y)}\big) = (2\pi)^{-n/2}\iint e^{-i\xi x}e^{-i\xi y}\,d\mu(x)\,d\nu(y) = (2\pi)^{n/2}\hat\mu(\xi)\hat\nu(\xi).$$

In the special case when both measures are absolutely continuous we have

$$(f * g)^{\wedge}(\xi) = (2\pi)^{n/2}\hat f(\xi)\hat g(\xi),$$

a formula which we have already found when $f$ and $g$ are in $\mathcal{S}$.

It is also easy to describe the Fourier transforms of functions in $L^2(R^n)$. If the distribution $T$ is given by the measure $f(x)\,dx$ where $f$ is in $L^2$, then $\hat T$ is the distribution defined by $\hat T(g) = \int f(x)\hat g(x)\,dx$ for all $g$ in $\mathcal{S}$. There exists a sequence $f_k$ in $\mathcal{S}$ converging in $L^2$ to $f$, and therefore, by the Parseval formula, which has been established for functions in $\mathcal{S}$, the sequence $\hat f_k$ is Cauchy in $L^2$, and therefore converges to some limit $h(\xi)$ in $L^2$. Accordingly,

$$\hat T(g) = \int f(x)\hat g(x)\,dx = \lim_k \int f_k(x)\hat g(x)\,dx = \lim_k \int \hat f_k(\xi) g(\xi)\,d\xi = \int h(\xi) g(\xi)\,d\xi.$$

Since this relation holds for all $g$ in $\mathcal{S}$, $\hat T$ is identified with the $L^2$-function $h(\xi)$ which we write as $\hat f(\xi)$. For every $k$ we had $\|\hat f_k\|_2 = \|f_k\|_2$, and therefore $\hat f$ and $f$ have the same $L^2$-norm. On $L^2$, therefore, the Fourier transform is the isometry obtained by extending the Fourier transform from the class $\mathcal{S}$ to the whole space. The fact that $\mathcal{S}$ was dense in $L^2$ determined this extension uniquely. It is perhaps of interest to note that the Fourier transform is therefore a unitary transformation of $L^2$ with period 4; it therefore has a spectrum which consists of only the numbers $+1$, $-1$, $i$, and $-i$, and indeed, the space is a direct sum of four eigenspaces of the transformation. We do not insist on that here.
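Although the eigenspaces are not pursued here, the four eigenvalues are easy to exhibit numerically in one dimension. The sketch below uses the first four Hermite functions, which are not introduced in the text and are brought in only as an assumption of the example: $e^{-x^2/2}$, $xe^{-x^2/2}$, $(2x^2 - 1)e^{-x^2/2}$ and $(2x^3 - 3x)e^{-x^2/2}$, with expected eigenvalues $1$, $-i$, $-1$ and $i$. The quadrature helper is likewise hypothetical.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=12.0, steps=2400):
    """Trapezoidal approximation of (2*pi)**(-1/2) * integral exp(-i*x*xi) f(x) dx."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * xi) * f(x)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian(x):
    return math.exp(-x * x / 2.0)


# (expected eigenvalue, function) pairs; the Hermite functions are an
# assumption of this sketch, not something derived in the text.
EIGENFUNCTIONS = [
    (1.0 + 0.0j, lambda x: gaussian(x)),
    (-1.0j, lambda x: x * gaussian(x)),
    (-1.0 + 0.0j, lambda x: (2.0 * x * x - 1.0) * gaussian(x)),
    (1.0j, lambda x: (2.0 * x ** 3 - 3.0 * x) * gaussian(x)),
]

XI0 = 0.8  # a frequency at which none of the four functions vanishes


def eigenvalue_ratio(psi):
    """Ratio of the transform of psi to psi itself at XI0."""
    return fourier_transform(psi, XI0) / psi(XI0)


for expected, psi in EIGENFUNCTIONS:
    print(expected, eigenvalue_ratio(psi))
```

Each ratio is a fourth root of unity, in agreement with the fact that the transform has period 4.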
An explicit formula for the transformation can be given as follows:

$$\hat f(\xi) = \lim_k\,(2\pi)^{-n/2}\int_{|x| \le k} e^{-ix\xi} f(x)\,dx,$$

the limit being taken in the $L^2$-topology, since the functions which coincide with $f(x)$ on the ball of radius $k$ and vanish outside are in $L^1$, and converge to $f(x)$ in $L^2$. In every case we have

$$\int |f(x)|^2\,dx = \int |\hat f(\xi)|^2\,d\xi$$

and, indeed, $\int f(x)\overline{g(x)}\,dx = \int \hat f(\xi)\overline{\hat g(\xi)}\,d\xi$ for all $f$ and $g$ in $L^2$.

The distributions with compact support are always temperate; their Fourier transforms are described by the following theorem.

Theorem: Let $T$ be a distribution whose compact support is contained in the interior of a ball of radius $A$. The Fourier transform of $T$ is a function $\hat T(\zeta)$ which is defined not only on $R^n$ but on $C^n$, the space of $n$ complex variables, and is an entire, analytic function satisfying an inequality of the form

$$|\hat T(\zeta)| \le C(1 + |\zeta|)^N e^{A|\eta|},$$

$\eta$ being the imaginary part of $\zeta$. It is given by the equation $\hat T(\zeta) = (2\pi)^{-n/2}\,T(e^{-i\zeta x})$.

PROOF: We first study the function given by the formula for $\hat T$, and then identify that function as the Fourier transform of $T$.

The function $e^{-i\zeta x}$ for fixed $\zeta$ is a $C^\infty$-function on $R^n$ and its Taylor expansion about the origin converges to it in the space $\mathcal{E}$. Thus

$$e^{-ix\zeta} = \sum_\alpha \frac{(-ix)^\alpha\,\zeta^\alpha}{\alpha!}$$

converges, with all its derivatives, uniformly on compacts in $R^n$. Accordingly,

$$T(e^{-ix\zeta}) = \sum_\alpha T\big((-ix)^\alpha\big)\,\frac{\zeta^\alpha}{\alpha!}$$

also converges for all $\zeta$ in $C^n$ and therefore uniformly on compact subsets of that space. It follows that $\hat T(\zeta)$, as given by the formula, is an analytic function defined everywhere in $C^n$. Since the distribution $T$ is supported by a ball of radius $A$, there exists a constant $C$ and an integer $N$ so that

$$|T(e^{-ix\zeta})| \le C\sum_{|\alpha| \le N}\,\sup_{|x| \le A}\,|D_x^\alpha e^{-ix\zeta}| \le C\sum_{|\alpha| \le N} |\zeta^\alpha|\,e^{A|\eta|}.$$
This inequality is equivalent to the inequality stated in the theorem.

To identify $\hat T$ we suppose that $\varphi$ is a testfunction, and write $\hat T(\varphi) = T(\hat\varphi)$. We write the Fourier transform $\hat\varphi(x)$ as the limit of its Riemann sums:

$$\hat\varphi(x) = \lim_k \hat\varphi_k(x) = \lim_k\,(2\pi)^{-n/2}\sum_{j=1}^{k} e^{-ix\xi_j}\int_{V_j}\varphi(\xi)\,d\xi,$$

where the support of $\varphi$ is divided into $k$ disjoint sets $V_j$ each of diameter smaller than $\varepsilon$, the point $\xi_j$ being chosen in $V_j$. This sequence converges to $\hat\varphi(x)$ for every $x$ in $R^n$ and the functions of the sequence are uniformly bounded by $\int |\varphi(\xi)|\,d\xi$. A corresponding remark holds for the sequence of derivatives $D^\alpha\hat\varphi_k(x)$, since the derivatives of the Riemann sum are essentially Riemann sums for the testfunction $(-i\xi)^\alpha\varphi(\xi)$, and thus the Ascoli-Arzelà theorem guarantees that the Riemann sums converge to $\hat\varphi(x)$ in $\mathcal{E}$. Accordingly,

$$T(\hat\varphi) = \lim_k T(\hat\varphi_k) = \lim_k\,(2\pi)^{-n/2}\sum_{j=1}^{k} T(e^{-ix\xi_j})\int_{V_j}\varphi(\xi)\,d\xi$$

and these are evidently the Riemann sums of $\int \hat T(\xi)\varphi(\xi)\,d\xi$. Thus, the proof is complete.
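The growth estimate of the theorem can be seen concretely on the simplest distribution with compact support, the indicator function of an interval $[-a, a]$ on the line. The sketch below is an illustration under stated assumptions: the closed form $(2\pi)^{-1/2}\,2\sin(a\zeta)/\zeta$ for the transform at complex $\zeta$ and the constants in the bound are worked out for this example and do not come from the text. Here the estimate holds with $N = 0$ and $C = (2\pi)^{-1/2}\,2a$.

```python
import cmath
import math

ROOT_TWO_PI = math.sqrt(2.0 * math.pi)


def indicator_hat(zeta, a):
    """(2*pi)**(-1/2) * integral_{-a}^{a} exp(-i*zeta*x) dx for complex zeta."""
    if zeta == 0:
        return 2.0 * a / ROOT_TWO_PI
    return 2.0 * cmath.sin(a * zeta) / zeta / ROOT_TWO_PI


def exponential_bound(zeta, a):
    """The bound (2*pi)**(-1/2) * 2a * exp(a * |Im zeta|) for this example."""
    return 2.0 * a * math.exp(a * abs(complex(zeta).imag)) / ROOT_TWO_PI


A_RADIUS = 1.5
SAMPLE_POINTS = [
    complex(3.0, 4.0),
    complex(-50.0, 0.1),
    complex(0.0, -20.0),
    complex(1.0e-3, 1.0e-3),
    complex(7.0, 0.0),
]

for zeta in SAMPLE_POINTS:
    print(zeta, abs(indicator_hat(zeta, A_RADIUS)), exponential_bound(zeta, A_RADIUS))
```

On the real axis the transform is bounded, while along the imaginary axis it grows like $e^{a|\eta|}$, so the exponent in the estimate cannot be improved for this example.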
31. The Convolution of Temperate Distributions

In previous sections we have obtained a variety of results concerning distributions which we extend in this section to temperate distributions. For example, it has been shown that the difference quotients $[(\tau_{-h} - \tau_0)/|h|]\,T$ of a distribution converge in the topology of $\mathcal{D}'(R^n)$ to an appropriate derivative of $T$, that the regularizations $T * \varphi_\varepsilon$ converge in the same topology to $T$, and that distributions depending analytically on a parameter have certain properties. Now it is necessary to establish the corresponding results for
distributions which are temperate, and, what is the essential point, that the convergence takes place in the topology of the space $\mathcal{S}'$ of temperate distributions.

It should first be noticed that multiplication is a (jointly) continuous operation in $\mathcal{S}$: if $f_k(x)$ converge in $\mathcal{S}$ to $f(x)$, and similarly $g_k(x)$ converge to $g(x)$, then the products $f_k(x)g_k(x)$ converge to $f(x)g(x)$. Accordingly, if $T$ is temperate and $g$ is in $\mathcal{S}$, then the product $g(x)T$, which makes sense as a distribution since $g(x)$ is smooth, is indeed a temperate distribution, since $(gT)(f) = T(gf)$ is continuous as $f$ varies in $\mathcal{S}$. Because the Fourier transform is a linear homeomorphism of $\mathcal{S}$ onto itself, the continuity of multiplication in $\mathcal{S}$ implies the continuity of convolution: $g_k * f_k$ converge in $\mathcal{S}$ to $g * f$, since their Fourier transforms converge correspondingly.

Let $F(x)$ be given on the real axis by $(e^{ix} - 1)/ix$; it is the restriction to that axis of an entire function, and this function and all of its derivatives vanish at infinity, although $F(x)$ is not in $\mathcal{S}$ since it does not converge to 0 sufficiently rapidly at infinity. As the positive $\varepsilon$ approaches 0, the functions $F(\varepsilon x)$ converge to $+1$ uniformly on compacts and are uniformly bounded; for $k \ge 1$ the derivatives $D^k F(\varepsilon x)$ converge to 0 uniformly on the whole axis. Thus, for $f$ in $\mathcal{S}$ the product $F(\varepsilon x)f(x)$ converges in the topology of $\mathcal{S}$ to $f(x)$. This circumstance enables us to infer that if $g(\xi)$ is any function in the class $\mathcal{S}$ and $h$ is a point in $R^n$ on the positive $\xi_1$-axis, then

$$F(|h|\xi_1)\,g(\xi) = \frac{e^{ih\xi} - 1}{i|h|\xi_1}\,g(\xi)$$

converges in the space $\mathcal{S}$ to $g(\xi)$ as $|h|$ approaches 0. It follows immediately that the difference quotient

$$\frac{g(x + h) - g(x)}{|h|},$$

which is itself a function in $\mathcal{S}$, converges in the topology of that space to $\partial g(x)/\partial x_1$, since the Fourier transform of the difference quotient may be written

$$\frac{e^{ih\xi} - 1}{|h|}\,\hat g(\xi) = F(|h|\xi_1)\,(i\xi_1)\,\hat g(\xi).$$

More generally, it is clear that the differential operators are the limits of the corresponding difference quotients, not merely in the topology of $\mathcal{D}'(R^n)$, but in the topology of $\mathcal{S}'(R^n)$.

Let $T$ be in $\mathcal{S}'$ and $f$ in $\mathcal{S}$; there is only one way to define their convolution, since the definition must be an extension of the definition of convolution already given. We must take

$$(T * f)(x) = T(\tau_x\check f)$$
and it is not hard to see that this is a $C^\infty$-function because the previous argument shows that it is continuous and even differentiable, since the difference quotients $[(T * f)(x + h) - (T * f)(x)]/|h|$ converge with diminishing $|h|$ to $(S * f)(x)$ where the temperate distribution $S$ is a first derivative of $T$. Thus, part of the following theorem has been established.

Theorem: If $T$ is in $\mathcal{S}'$ and $f$ in $\mathcal{S}$, their convolution $(T * f)(x)$ is a $C^\infty$-function satisfying an inequality of the form $|(T * f)(x)| \le C(1 + |x|^2)^N$.

PROOF: For some choice of $C$ and $N$, $T$ being temperate, and for all $f$ in $\mathcal{S}$,

$$|(T * f)(h)| = |T(\tau_h\check f)| \le C\,|||\tau_h\check f|||_N.$$

Since $1 + |x + h|^2 \le 2(1 + |x|^2)(1 + |h|^2)$, it follows that

$$|(T * f)(h)| \le 2^N(1 + |h|^2)^N\,C\sum_{|\alpha| \le N}\|(1 + |x|^2)^N D^\alpha f\|_\infty = 2^N(1 + |h|^2)^N\,C\,|||f|||_N.$$
The next two theorems are proved in virtually the same way as the corresponding theorems of Section 25; it is therefore unnecessary to give separate proofs.

Theorem: Every continuous linear mapping $\mathcal{L}$ from $\mathcal{S}$ to $\mathcal{E}$ which commutes with translation is of the form $\mathcal{L}f = T * f$ for some temperate distribution $T$; conversely, every temperate distribution defines such a mapping and the correspondence between $\mathcal{L}$ and $T$ is one-to-one.

Theorem: If $T$ is temperate and $f$ and $g$ in $\mathcal{S}$, then $T * (f * g) = (T * f) * g$.

Finally, for the Fourier transform, we have an important result.

Theorem: For $T$ in $\mathcal{S}'$ and $f$ in $\mathcal{S}$, $(T * f)^{\wedge}(\xi) = (2\pi)^{n/2}\hat f(\xi)\hat T$.
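PROOF: Since $\hat T$ is temperate, the product $\hat f(\xi)\hat T$ is also temperate. Of course, since $T * f$ is a smooth function of polynomial growth, it is a temperate distribution. For any $g$ in $\mathcal{S}$

$$(T * f)^{\wedge}(g) = (T * f)(\hat g) = \big(T * f * (\hat g)^{\vee}\big)(0) = \big(T * (f * (\hat g)^{\vee})\big)(0) = T(\check f * \hat g) = (2\pi)^{n/2}\,T\big((\hat f g)^{\wedge}\big) = (2\pi)^{n/2}(\hat f\hat T)(g).$$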
When the regularization of a distribution $T$ is computed, one forms the convolution $T * \varphi_\varepsilon$, where the testfunction $\varphi(x)$ is subject to the conditions $\int \varphi(x)\,dx = 1$ and is positive and even, and $\varphi_\varepsilon = \varepsilon^{-n}\varphi \circ l_\varepsilon^{-1}$. The Fourier transform of $\varphi_\varepsilon$ is $\hat\varphi(\varepsilon\xi)$ and the normalization requirement means that $\hat\varphi(0) = (2\pi)^{-n/2}$. Thus, when $T$ is temperate, so also are its regularizations, and their Fourier transforms are $(2\pi)^{n/2}\hat\varphi(\varepsilon\xi)\hat T$; as $\varepsilon$ approaches 0 these converge, in $\mathcal{S}'$, to $\hat T$. Because the Fourier transform is a linear homeomorphism of $\mathcal{S}'$ onto itself, it follows that the regularizations converge to $T$ in the topology of the space of temperate distributions.

The distributions with compact support are temperate, and we have seen that their Fourier transforms are entire functions of exponential growth. Since the convolution of a distribution in $\mathcal{E}'$ with an arbitrary distribution has been defined, there is a special interest in the case when the convolution is taken with a temperate distribution.
Theorem: If $S$ is a distribution with compact support, and $T$ is temperate, then $T * S$ is temperate and $(T * S)^{\wedge} = (2\pi)^{n/2}\hat S(\xi)\hat T$.

PROOF: The fact that the smooth function $\hat S(\xi)$ is bounded by a polynomial on $R^n$ as well as all of its derivatives $D^\alpha\hat S(\xi)$ means that multiplication by $\hat S(\xi)$ is a continuous linear mapping of $\mathcal{S}$ into itself. Thus, convolution with the distribution $S$ is an operation which maps $\mathcal{S}$ continuously into itself, and so $T * S$ is continuous on $\mathcal{S}$, that is, temperate. Now,

$$(T * S)^{\wedge}(f) = (T * S)(\hat f) = \big(T * S * (\hat f)^{\vee}\big)(0) = \big(T * (S * (\hat f)^{\vee})\big)(0)$$
and

$$\big(T * (S * (\hat f)^{\vee})\big)(0) = T(\check S * \hat f),$$

so, since $(\hat S f)^{\wedge} = (2\pi)^{-n/2}\,\check S * \hat f$, we have $\check S * \hat f = (2\pi)^{n/2}(\hat S f)^{\wedge}$. Accordingly,

$$(T * S)^{\wedge}(f) = (2\pi)^{n/2}\,T\big((\hat S f)^{\wedge}\big) = (2\pi)^{n/2}\,\hat T(\hat S f) = (2\pi)^{n/2}\big(\hat S(\xi)\hat T\big)(f).$$

We have already seen special cases of this theorem, for example, the convolution of the distribution $P(D)\delta$ with a temperate distribution $T$ has for its Fourier transform $P(i\xi)\hat T$.

Let $T_\lambda$ be a distribution which depends analytically on the parameter $\lambda$ as that number varies over some region $G$ in the complex $\lambda$-plane; if we suppose in addition that $T_\lambda$ is temperate for $\lambda$ in $G$, then its derivatives with respect to $\lambda$ will also be temperate; this is a consequence of the fact that the quotients $[(T_{\lambda + h} - T_\lambda)/h](f)$ converge for all $f$ in $\mathcal{S}$ as $h$ tends to 0, and that the space of temperate distributions is (weak-star) sequentially complete. From the same circumstance, it follows that the Taylor expansion of $T_\lambda$ about some point in $G$ converges in the space $\mathcal{S}'$ to $T_\lambda$. It is important to notice the following, $T_\lambda$ being temperate: $(T_\lambda)^{\wedge}(f) = T_\lambda(\hat f)$ is also analytic in $G$, and the analytic continuation of one side of this equation coincides with the analytic continuation of the other. Thus, the analytic continuation of $(\hat T)_\lambda$ coincides with the continuation of $(T_\lambda)^{\wedge}$. In the next section, we will find this fact very useful in computing certain Fourier transforms, since there will be a region of the plane in which it is fairly easy to compute $\hat T_\lambda$, and then, by analytic continuation we obtain the transform for all $\lambda$.

It should be emphasized, however, that it may be possible to continue a distribution $T_\lambda$ analytically outside of a domain $G$ in which $T_\lambda$ is temperate without the continuation being temperate. For example, the distribution defined on the real axis by the function $e^{-\lambda|x|}$ is analytic for all $\lambda$, that is, is entire. For $\mathrm{Re}[\lambda] \ge 0$, this distribution is temperate, but for $\mathrm{Re}[\lambda] < 0$, the distribution is not temperate. This circumstance should not be surprising: for the analyticity of a distribution $T_\lambda$ in $G$ we require that all the functions $T_\lambda(f)$ be analytic in $G$ for every testfunction $f$; for the analyticity of the temperate distribution, we require more, namely, that the function $T_\lambda(f)$ be analytic in $G$ for all functions $f$ in $\mathcal{S}$.
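The example $e^{-\lambda|x|}$ also illustrates how a transform computed in an easy region depends analytically on the parameter. For $\mathrm{Re}[\lambda] > 0$ the function is integrable and its transform is an ordinary integral; the closed form $(2\pi)^{-1/2}\,2\lambda/(\lambda^2 + \xi^2)$ used below is a standard evaluation that is not carried out in the text and is stated here as an assumption of the sketch, as are the helper names and the quadrature parameters.

```python
import cmath
import math

ROOT_TWO_PI = math.sqrt(2.0 * math.pi)


def transform_exp_abs(lam, xi, half_width=40.0, steps=40000):
    """Trapezoidal value of (2*pi)**(-1/2) * integral exp(-i*x*xi - lam*|x|) dx.

    Only meaningful for Re(lam) > 0, where the integrand decays; the grid is
    chosen so that the kink of |x| at the origin falls on a node.
    """
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * xi - lam * abs(x))
    return total * h / ROOT_TWO_PI


def closed_form(lam, xi):
    """Assumed closed form of the transform, analytic in lam for Re(lam) > 0."""
    return 2.0 * lam / (lam * lam + xi * xi) / ROOT_TWO_PI


for lam in (1.0 + 0.0j, 1.0 + 0.5j, 2.0 - 1.0j):
    print(lam, transform_exp_abs(lam, 1.3), closed_form(lam, 1.3))
```

The closed form is a rational function of $\lambda$, analytic in the right half-plane where the distribution is temperate.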
32. Fourier Transforms of Homogeneous Distributions

Suppose the temperate distribution $T$ is homogeneous of degree $k$, that is, $T \circ l_\varepsilon = \varepsilon^k T$; taking Fourier transforms of both sides of this equation we obtain $(T \circ l_\varepsilon)^{\wedge} = \varepsilon^k\hat T$ where $(T \circ l_\varepsilon)^{\wedge} = (1/\varepsilon^n)\,\hat T \circ l_\varepsilon^{-1}$. Since $l_\varepsilon^{-1} = l_{1/\varepsilon}$, the relation becomes $\hat T \circ l_\varepsilon = \varepsilon^{-(n + k)}\hat T$ and thus the Fourier transform $\hat T$ is homogeneous and of degree $-n - k$. This relation holds for all homogeneous distributions, in view of the following theorem.

Theorem: Every homogeneous distribution is temperate.

PROOF: Let $T$ be homogeneous of degree $k$ on $R^n$. Since $T$ is a distribution, there exists a constant $C_1$ and an integer $N$ so that $|T(\varphi)| \le C_1\|\varphi\|_N$ for all testfunctions $\varphi$ supported by the ball $|x| \le 1$. If a testfunction $\varphi$ has its support in the ball of radius $R$, then $\varphi \circ l_R$ is a testfunction supported by $|x| \le 1$. Now

$$|(T \circ l_R^{-1})(\varphi)| = R^{-k}|T(\varphi)| = R^n|T(\varphi \circ l_R)| \le C_1 R^n\|\varphi \circ l_R\|_N.$$

We are only interested in large $R$, so we may suppose $R > 1$ and it is clear that

$$\|\varphi \circ l_R\|_N \le R^N\|\varphi\|_N.$$

It follows that $|T(\varphi)| \le C_1 R^{n + k + N}\|\varphi\|_N$ for every testfunction $\varphi$ supported by the ball $|x| \le R$. In particular, we suppose that $\varphi$ is a testfunction such that the diameter of the support of $\varphi$ is at most 1 and let $R = |x'|$ where $x'$ is a point of that support; we have $|T(\varphi)| \le C_1(R + 1)^{n + k + N}\|\varphi\|_N$. On the other hand, if $R$ is not very small, $R - 1 \le |x|$ for all $x$ in the support of $\varphi$ and so

$$(R - 1)^{2N}|D^\alpha\varphi(x)| \le |x|^{2N}|D^\alpha\varphi(x)| \le (1 + |x|^2)^N|D^\alpha\varphi(x)|$$

and therefore

$$(R - 1)^{2N}\|D^\alpha\varphi\|_\infty \le \|(1 + |x|^2)^N D^\alpha\varphi(x)\|_\infty,$$

whence, summing over the multi-indices $\alpha$ with $|\alpha| \le N$, we obtain $(R - 1)^{2N}\|\varphi\|_N \le |||\varphi|||_N$.
Note that the seminorm on the right is associated with the space $\mathcal{S}$. Accordingly, for such a testfunction,

$$|T(\varphi)| \le C_1\,\frac{(R + 1)^{n + k + N}}{(R - 1)^{2N}}\,|||\varphi|||_N,$$

at least when $R \ge 4$. We take next a partition of unity $\chi_i(x)$ for the space $R^n$; this partition of unity is not arbitrary: the functions are to be supported by sets of diameter at most 1 and are to be such that there exists a constant $C_2$ such that $\|\chi_i\|_N \le C_2$ for every index $i$. Such partitions of unity are easy to come by, since the space being covered is the whole $R^n$; indeed, it is possible to construct a partition of unity $\chi_i$ having the property that every function of the family is a translate of some fixed testfunction. Let $x_i$ be a point in the support of $\chi_i$ and $R_i = |x_i|$. If $f$ is an arbitrary testfunction, we set $f_i(x) = \chi_i(x)f(x)$ to write $f = \sum f_i$ and therefore to deduce

$$|T(f)| \le \sum_i |T(f_i)| \le C_1\sum_i \frac{(R_i + 1)^{n + k + N}}{(R_i - 1)^{2N}}\,|||f_i|||_N.$$

Now, owing to our special choice of the partition of unity,

$$|||f_i|||_N \le 2^N C_2\,|||f|||_N$$

and therefore,

$$|T(f)| \le |||f|||_N\,C_1 C_2 2^N\sum_i \frac{(R_i + 1)^{n + k + N}}{(R_i - 1)^{2N}}$$

where the series on the right converges if $N$ is sufficiently large, for example, if $N \ge 3(n + k + 1)$. Hence, finally, for every testfunction $f$

$$|T(f)| \le C\,|||f|||_N$$

and $T$ is continuous relative to a seminorm of the space $\mathcal{S}$; that is, is temperate.

It is interesting to notice that our assertion that the Fourier transform of a distribution homogeneous of degree $k$ is homogeneous of degree $-n - k$ can also be obtained from the Euler relation. $T$ being homogeneous of degree $k$ means

$$kT = \sum_{j=1}^{n} x_j\,\frac{\partial T}{\partial x_j}$$
which becomes, under Fourier transforms,

$$k\hat T = -\sum_{j=1}^{n}\frac{\partial}{\partial\xi_j}(\xi_j\hat T) = -n\hat T - \sum_{j=1}^{n}\xi_j\,\frac{\partial\hat T}{\partial\xi_j},$$

whence

$$-(k + n)\hat T = \sum_{j=1}^{n}\xi_j\,\frac{\partial\hat T}{\partial\xi_j}$$

and this is the Euler equation for $\hat T$.

As an example, we consider in detail the Fourier transform of the distribution $T_\lambda$ defined for $\mathrm{Re}[\lambda] > 0$ by the function $1/|x|^{n - \lambda}$; this is a locally integrable function, and as we know, $T_\lambda$ admits an analytic continuation (as a distribution) to the left half-plane, with singularities at the even integers $\le 0$. The distribution is homogeneous of degree $-n + \lambda$ and is obviously temperate since it is of polynomial growth at infinity; thus the previous theorem need not be appealed to. For $\mathrm{Re}[\lambda]$ positive but smaller than $n/2$, the function is integrable over the unit ball, while its square is integrable over the complement of that ball. For such values of $\lambda$, then, the function is the sum of an integrable function and an $L^2$-function; its Fourier transform is therefore the sum of a continuous function vanishing at infinity (by the Riemann-Lebesgue lemma) and an $L^2$-function (because of the Parseval relations). Thus the Fourier transform is a function. Since $T_\lambda$ is invariant under the orthogonal group (the linear transformations of $R^n$ into itself which leave distance invariant) so also is $\hat T_\lambda$, and so, for the values of $\lambda$ in question, $\hat T_\lambda$ is a function of radius. Since it must be homogeneous of degree $-n - (\lambda - n) = -\lambda$, $\hat T_\lambda$ is necessarily of the form $C|\xi|^{-\lambda}$, at least for $\lambda$ in the strip $0 < \mathrm{Re}[\lambda] < n/2$. We therefore write

$$\big(|x|^{\lambda - n}\big)^{\wedge} = C(n, \lambda)\,|\xi|^{-\lambda}$$

where the coefficient $C(n, \lambda)$ is to be determined. This coefficient is surely an analytic function in the strip, since both distributions are analytic in that strip. Before determining that coefficient, we should remark that the left-hand side can be continued analytically throughout the plane except for certain poles at the origin and at negative even integers, indeed, if we divide by $\Gamma(\lambda/2)$ we
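obtain a distribution analytic in the entire plane. It is important to note that this extension is still temperate. The right-hand side is a distribution of exactly the same type, and is analytically continuable to a temperate distribution defined throughout the plane except for certain singularities at positive integers of the form $n + 2k$ for $k \ge 0$. Accordingly, the function $C(n, \lambda)$ occurring above ought to be meromorphic in the entire plane, with zeros at points of the form $\lambda = n + 2k$ and poles at $\lambda = -2k$ for $k \ge 0$. This we see when we compute $C(n, \lambda)$ explicitly, as follows. Since both sides of the equation above are temperate distributions, we can apply the distributions in question to the Gaussian $G(x) = \exp(-|x|^2/2)$, a function which is its own Fourier transform. The equation $\hat T_\lambda(G) = T_\lambda(G)$ may be written

$$C(n, \lambda)\,\omega_n\int_0^\infty r^{n - \lambda - 1}e^{-r^2/2}\,dr = \omega_n\int_0^\infty r^{\lambda - 1}e^{-r^2/2}\,dr;$$

the factor $\omega_n$ may be cancelled and the new variable $t = r^2/2$ introduced. Thus,

$$\int_0^\infty r^{\lambda - 1}e^{-r^2/2}\,dr = 2^{\lambda/2 - 1}\int_0^\infty t^{\lambda/2 - 1}e^{-t}\,dt = 2^{\lambda/2 - 1}\,\Gamma(\lambda/2)$$

and

$$C(n, \lambda)\,2^{(n - \lambda)/2 - 1}\,\Gamma\Big(\frac{n - \lambda}{2}\Big) = 2^{\lambda/2 - 1}\,\Gamma\Big(\frac{\lambda}{2}\Big),$$

whence, finally,

$$C(n, \lambda) = 2^{\lambda - n/2}\,\frac{\Gamma(\lambda/2)}{\Gamma((n - \lambda)/2)}.$$

A more symmetric way of writing our result is

$$\Big(\frac{|x|^{\lambda - n}}{2^{\lambda/2}\,\Gamma(\lambda/2)}\Big)^{\wedge} = \frac{|\xi|^{-\lambda}}{2^{(n - \lambda)/2}\,\Gamma((n - \lambda)/2)};$$

this way of writing the relation leads us to introduce the distribution $S_\lambda$ obtained by dividing $T_\lambda$ by the factor $2^{\lambda/2}\Gamma(\lambda/2)$; the distribution valued analytic function so obtained is analytic everywhere in the finite plane and is temperate. We then have $\hat S_\lambda = S_{n - \lambda}$.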
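The determination of the coefficient can be checked numerically. The sketch below assumes the value $C(n, \lambda) = 2^{\lambda - n/2}\,\Gamma(\lambda/2)/\Gamma((n - \lambda)/2)$ to which the substitution $t = r^2/2$ leads, and tests it against the Gaussian identity from which it was derived, using a midpoint rule for the radial integrals. As a second check it takes $\lambda = 2$ and the surface area $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$ of the unit sphere (an assumption about the meaning of $\omega_n$) and confirms that $-C(n, 2)/(\omega_n(2 - n))$ equals $(2\pi)^{-n/2}$, the relation required by the fundamental solution of the Laplace equation discussed next. All function names are hypothetical.

```python
import math


def coefficient(n, lam):
    """Assumed value C(n, lam) = 2**(lam - n/2) * Gamma(lam/2) / Gamma((n - lam)/2)."""
    return 2.0 ** (lam - n / 2.0) * math.gamma(lam / 2.0) / math.gamma((n - lam) / 2.0)


def radial_moment(power, upper=12.0, steps=24000):
    """Midpoint-rule value of integral_0^inf r**(power - 1) * exp(-r*r/2) dr.

    Intended for power >= 1, where the integrand is bounded at the origin.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        total += r ** (power - 1.0) * math.exp(-r * r / 2.0)
    return total * h


def sphere_area(n):
    """Surface area of the unit sphere in R^n (assumed meaning of omega_n)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def fundamental_solution_constant(n):
    """Coefficient of |xi|**(-2) in the transform of 1 / (omega_n (2 - n) |x|**(n - 2))."""
    return coefficient(n, 2.0) / (sphere_area(n) * (2.0 - n))


# Gaussian identity: integral r**(lam-1) G = C(n, lam) * integral r**(n-lam-1) G.
for n, lam in ((3, 1.0), (5, 2.0), (6, 2.5)):
    print(n, lam, radial_moment(lam), coefficient(n, lam) * radial_moment(n - lam))

# Special case lam = 2: minus the constant should equal (2*pi)**(-n/2).
for n in (3, 4, 5, 6, 7):
    print(n, -fundamental_solution_constant(n), (2.0 * math.pi) ** (-n / 2.0))
```

The agreement for $\lambda = 2$ in dimensions where $2 \ge n/2$ reflects the analytic continuation in $\lambda$ rather than the original strip.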
A special case is of interest: suppose $n > 2$ and $\lambda = 2$ and recall that the fundamental solution for the Laplace equation is given by

$$E(x) = \frac{1}{\omega_n(2 - n)\,|x|^{n - 2}};$$

we find, therefore, that

$$\hat E(\xi) = \frac{C(n, 2)}{\omega_n(2 - n)}\,|\xi|^{-2} = \frac{2^{2 - n/2}}{\omega_n(2 - n)\,\Gamma((n - 2)/2)}\,|\xi|^{-2}.$$

Since $\Delta E = \delta$ we must have $-|\xi|^2\hat E(\xi) = \hat\delta(\xi) = (2\pi)^{-n/2}$; substituting the known value of $\omega_n$ and simplifying, we find that this relation in fact holds.

It is also important to notice that the Riesz kernel, introduced in Section 24, may be written $R_\alpha = 2^{-\alpha/2}\pi^{-n/2}\,\Gamma([n - \alpha]/2)\,S_\alpha$ and this is temperate and meromorphic in the plane. Taking the Fourier transform we find the simple expression

$$\hat R_\alpha(\xi) = (2\pi)^{-n/2}\,|\xi|^{-\alpha}$$

and, therefore,

$$\hat R_{\alpha + \beta} = (2\pi)^{n/2}\,\hat R_\alpha\,\hat R_\beta,$$

a relation which suggests that the convolution equation $R_{\alpha + \beta} = R_\alpha * R_\beta$ is valid. As yet, however, we lack a definition of the convolution in question. In the special case when the indices $\alpha$ and $\beta$ are in the open interval $(0, n/2)$ the distributions $R_\alpha$ and $R_\beta$ are positive, locally integrable functions which are not integrable over the whole space. For $x$ different from 0, the integral

$$\int R_\alpha(x - y)\,R_\beta(y)\,dy$$

exists and is finite, since the integrand has two integrable singularities (at the origin and at $x$) and falls off at infinity like $|y|^{\alpha + \beta - 2n}$. A simple change of variables shows that the integral is homogeneous in $x$ of degree $\alpha + \beta - n$, and since it is obviously invariant under orthogonal transformations, the integral must be of the form $CR_{\alpha + \beta}(x)$. In order to show that $C = 1$, we select a function $\varphi$ in $\mathcal{S}$ having the property that $\hat\varphi$ vanishes in a neighborhood of the origin; the convolution $R_\beta * \varphi$ has the Fourier transform $|\xi|^{-\beta}\hat\varphi(\xi)$ which is a function in $\mathcal{S}$ vanishing near 0. Accordingly, the convolution $R_\beta * \varphi$ is itself in $\mathcal{S}$ and this function may be convoluted with $R_\alpha$ to obtain a
function in $\mathcal{S}$ having the Fourier transform $|\xi|^{-\alpha-\beta}\hat\varphi(\xi)$. The convolution can also be computed explicitly using the Fubini theorem:
$$(R_\alpha * (R_\beta * \varphi))(x) = C\int R_{\alpha+\beta}(x-z)\,\varphi(z)\,dz,$$
which has the Fourier transform $C|\xi|^{-\alpha-\beta}\hat\varphi(\xi)$. Thus $C = 1$.

One can often compute the Fourier transform of a homogeneous distribution without making use of the analytic continuation. For example, we consider the function $f(x) = |x|$ on $R^1$; this convex function is even and homogeneous of degree 1; thus, its transform must be even and homogeneous of degree $-2$. Accordingly, away from the origin, $\hat f$ coincides with a function of the form $C|\xi|^{-2}$ and it is not difficult to compute the constant $C$. Indeed, taking the Fourier transform of the equation $D^2f = 2\delta$, we have $-\xi^2\hat f(\xi) = 2(2\pi)^{-1/2}$ and hence $-C = \sqrt{2/\pi}$. Let the distribution $S$ be defined by the formula
The integral exists for any testfunction $\varphi$ since the integrand is bounded near the origin and is integrable for large $\xi$. It is easy to see that $S$ is a distribution and that $S(\varphi) = \hat f(\varphi)$ for all $\varphi$ vanishing at the origin. Accordingly $S - \hat f = c\delta$ for some constant $c$. Since $\delta$ is homogeneous of degree $-1$ and the difference is homogeneous of degree $-2$ it follows that $S = \hat f$.

Virtually the same argument works in higher dimensions: the function $f(x) = |x|$ on $R^n$ is homogeneous of degree 1 and is invariant under the transformations of the orthogonal group. The Fourier transform must then be of the form $C|\xi|^{-n-1}$ away from the origin. We pass to spherical coordinates and introduce the distribution
From the Taylor expansion of $\varphi(x)$ about the origin:
$$\varphi(x) = \varphi(0) + \sum_k a_kx_k + \sum_{i,j}x_ix_j\,g_{ij}(x).$$
As $g_{ij}(x)$ is smooth, it readily follows that $\left|\varphi(0) - \int\varphi(r,\theta)\,d\omega(\theta)\right| \le cr^2$ and therefore, that the integral defining $S$ always exists, and that it is small when $\varphi$ and its first and second derivatives are small. Thus $S$ is a distribution, and for
an appropriate choice of the constant $C$, $S(\varphi) = \hat f(\varphi)$ for all testfunctions $\varphi$ vanishing at the origin. $S$ is homogeneous of degree $-n-1$ and is invariant under the transformations of the orthogonal group. It follows that the difference $S - \hat f$ is a multiple of $\delta$ since it has the same null space as $\delta$. The latter distribution is homogeneous of degree $-n$, however, and not $-n-1$. Thus $S = \hat f$.

To determine the constant $C$, we note that the Laplacian of $f$ must be homogeneous of degree $-1$, and since $f$ is convex, hence subharmonic, the Laplacian is a positive measure. Away from the origin, we obviously have $\Delta f = (n-1)/|x|$, and there can be no point mass at the origin, since $\delta$ does not have the required degree of homogeneity when $n > 1$. Accordingly,
$$\widehat{\Delta f} = (n-1)\,C(n,n-1)\,|\xi|^{1-n}$$
and therefore, away from the origin, $S$ is given by the function $-(n-1)\,C(n,n-1)\,|\xi|^{-1-n}$, while from the definition of $S$ it is given by $-C|\xi|^{-1-n}$. Hence
$$C = (n-1)\,C(n,n-1) = 2^{n/2}\,\Gamma\!\left(\tfrac{n+1}{2}\right)\pi^{-1/2}.$$

The next example is more complicated. The distribution defined on $R^n$ and associated with the function $|x|^{-n}$ is determined only up to a multiple of $\delta$, since the function $|x|^{-n}$ is not integrable in a neighborhood of the origin. For an arbitrary positive $s$, let $S_s$ be defined by
it is clear that all possible distributions associated with the function $|x|^{-n}$ are obtained in this way as $s$ varies over the positive real numbers. It is easy to verify the equation:
$$S_s - S_t = \omega_n\log\left(\frac{t}{s}\right)\delta,$$
since $\int_{t<|x|<s}|x|^{-n}\,dx = -\omega_n\log(t/s)$. Moreover, $\varepsilon^n(S_s\circ l_\varepsilon)(\varphi)$ is given by
and with the change of variable $y = x/\varepsilon$, this becomes $S_{s/\varepsilon}(\varphi)$. Accordingly,
$$S_s\circ l_\varepsilon = \varepsilon^{-n}S_s + \varepsilon^{-n}(\log\varepsilon)\,\omega_n\,\delta$$
and therefore,
$$\hat S_s\circ l_\varepsilon = \hat S_s - (2\pi)^{-n/2}\omega_n(\log\varepsilon)\,1.$$
Consider next the distribution $T = -(2\pi)^{-n/2}\omega_n\log|\xi|$; this is a locally integrable function on $R^n$. Obviously,
$$T\circ l_\varepsilon = T - (2\pi)^{-n/2}\omega_n(\log\varepsilon)\,1$$
and therefore, $(T - \hat S_s)\circ l_\varepsilon = T - \hat S_s$ for all $\varepsilon$, which difference is homogeneous of order 0, besides being invariant under the orthogonal group. It follows that the distribution $T - \hat S_s$ is a constant function, but different constants arise for different values of $s$. Thus, for a certain function $c(s)$ defined for $s > 0$, we have
$$\hat S_s = T - c(s)\,1.$$
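The constants met in this section can be sanity-checked numerically. The sketch below assumes that the Fourier transform of $|x|^{\lambda-n}$ is $C(n,\lambda)|\xi|^{-\lambda}$ with $C(n,\lambda) = 2^{\lambda-n/2}\Gamma(\lambda/2)/\Gamma((n-\lambda)/2)$ (the normalization that the relation between $S_\lambda$ and its transform suggests) and that $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$ is the area of the unit sphere. The function names are illustrative and do not come from the text.

```python
import math


def omega(n):
    """Area of the unit sphere in R^n (assumed meaning of omega_n)."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)


def c_const(n, lam):
    """Assumed constant: the transform of |x|^(lam-n) is c_const * |xi|^(-lam)."""
    return 2 ** (lam - n / 2) * math.gamma(lam / 2) / math.gamma((n - lam) / 2)


def laplace_check(n):
    """Value of -|xi|^2 * E_hat for E = |x|^(2-n) / (omega_n (2-n)), n > 2."""
    return -c_const(n, 2) / (omega(n) * (2 - n))


def abs_constant(n):
    """The constant C = (n-1) C(n, n-1) for f(x) = |x|, n >= 2."""
    return (n - 1) * c_const(n, n - 1)


def abs_closed_form(n):
    """Closed form 2^(n/2) Gamma((n+1)/2) pi^(-1/2)."""
    return 2 ** (n / 2) * math.gamma((n + 1) / 2) / math.sqrt(math.pi)


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, laplace_check(n), (2 * math.pi) ** (-n / 2),
              abs_constant(n), abs_closed_form(n))
```

Under these assumptions the two columns in each pair agree to rounding error, which is the content of the relations $-|\xi|^2\hat E = (2\pi)^{-n/2}$ and $C = (n-1)C(n,n-1)$.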
33. Periodic Distributions in One Variable

On the real axis we consider a distribution $T$ which is periodic, with period $h$. It is important to notice that $T$ is necessarily temperate.
Theorem:
Every periodic distribution is temperate.
PROOF: Let $\varphi(x)$ be a testfunction, supported by the interval $(0, h)$ and which equals $+1$ in a neighborhood of $h/2$; we shall further suppose that $0 \le \varphi(x) \le 1$ and extend $\varphi$ by periodicity to obtain a periodic $C^\infty$-function $F(x)$ with period $h$ taking values in the unit interval. We may write
$$T = F(x)T + (1 - F(x))T$$
and it will be sufficient to show that each of the terms in this sum is temperate. Either of these distributions is of the form $\sum_k\mathcal{T}_{kh}S$, where the sum is taken over all integers $k$, and $S$ is a distribution with compact support. The sum converges in the space of distributions. The partial sums are temperate, and we show that they converge in the space of temperate distributions. More precisely, we show that for every $f$ in $\mathcal{S}$, the series $\sum(\mathcal{T}_{kh}S)(f)$ converges, and thus that the partial sums form a Cauchy sequence of numbers. It will follow that the partial sums of $\sum\mathcal{T}_{kh}S$ form a weak-star convergent sequence
in $\mathcal{S}'$, and because of the weak-star sequential completeness of that space, they converge to a temperate distribution. For $f$ in $\mathcal{S}$, the series $\sum\mathcal{T}_{kh}f(x)$ and all its derivatives converge uniformly on the compact support of $S$; this is a consequence of the fact that the functions of $\mathcal{S}$ vanish quite rapidly at infinity. Thus, $S(\sum\mathcal{T}_{kh}f) = \sum S(\mathcal{T}_{kh}f)$ converges, and the proof is complete.
Since the periodic $T$ is temperate, it has a Fourier transform, the nature of which is easily determined. Taking the Fourier transform of the equation $\mathcal{T}_hT = T$ we obtain $(e^{-ih\xi} - 1)\hat T = 0$; it follows that $\hat T$ is supported by the zeros of the function $(e^{-ih\xi} - 1)$, and since these zeros are simple, $\hat T$ must be a measure. Accordingly, $\hat T$ is a measure, supported by the points $\xi$ of the form $\xi = 2\pi k/h$ as $k$ runs through the integers. Obviously, it is convenient to have $h = 2\pi$, and in this case the Fourier transform is supported by the integers.

We pass next to the special case of a distribution $P$ periodic with period $h = \sqrt{2\pi}$, which consists of a unit mass at each point of the form $x = kh$. Its Fourier transform, $\hat P$, is supported by the same set of points and is a measure. The equation $(e^{-ihx} - 1)P = 0$ shows that $\hat P$ is periodic with period $h$, so $\hat P$ has the same mass at every point. It follows that there exists a constant $C$ so that $\hat P = CP$, and since $P$ is even, $P = \hat{\hat P} = C^2P$, where $C = \pm1$. Since there exist functions in $\mathcal{S}$ which are positive, whose Fourier transforms are also positive, for example, the Gaussian, it follows that $C = 1$. Thus $P$ is its own Fourier transform. This may be written as follows: for all $f$ in $\mathcal{S}$,
$$\sum_kf(kh) = \sum_k\hat f(kh),$$
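the sums being taken over all integers $k$. Substituting $f\circ l_h$ for $f$ we obtain, finally, the following formula.

Poisson Summation Formula (1): For all functions $f$ in $\mathcal{S}$,
$$\sum_kf(2\pi k) = \frac{1}{\sqrt{2\pi}}\sum_k\hat f(k).$$
If we substitute $\mathcal{T}_{-x}f$ for $f$ in the Poisson formula, we obtain
$$\sum_kf(x + 2\pi k) = \frac{1}{\sqrt{2\pi}}\sum_k\hat f(k)\,e^{ikx}.$$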
Suppose that $\mu$ is a measure on the axis, periodic with period $2\pi$; we know it to be temperate, and its Fourier transform is a measure supported by the integers.
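The Poisson summation formula of this section lends itself to a quick numerical check. The sketch below assumes the unitary convention $\hat f(\xi) = (2\pi)^{-1/2}\int f(x)e^{-ix\xi}\,dx$, under which the Gaussian $f(x) = e^{-ax^2/2}$ has transform $a^{-1/2}e^{-\xi^2/(2a)}$; it compares $\sum_k f(x + 2\pi k)$ with $(2\pi)^{-1/2}\sum_k\hat f(k)e^{ikx}$ after truncating both series. The names `periodized` and `fourier_side` are illustrative.

```python
import math


def gaussian_pair(a):
    """Return f(x) = exp(-a x^2 / 2) and its transform in the unitary convention."""
    def f(x):
        return math.exp(-a * x * x / 2)

    def f_hat(xi):
        return math.exp(-xi * xi / (2 * a)) / math.sqrt(a)

    return f, f_hat


def periodized(f, x, terms=50):
    """Truncated sum of f(x + 2 pi k) over |k| <= terms."""
    return sum(f(x + 2 * math.pi * k) for k in range(-terms, terms + 1))


def fourier_side(f_hat, x, terms=50):
    """Truncated (2 pi)^(-1/2) * sum of f_hat(k) e^{ikx}; f_hat is even and real,
    so the exponentials combine into cosines."""
    total = sum(f_hat(k) * math.cos(k * x) for k in range(-terms, terms + 1))
    return total / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    f, f_hat = gaussian_pair(0.7)
    for x in (0.0, 1.0, 2.5):
        print(x, periodized(f, x), fourier_side(f_hat, x))
```

Taking $x = 0$ gives the unshifted formula; the Gaussian is used only because both sides converge very quickly for it.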
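If we integrate the Poisson formula above over a period of $d\mu$, we have
$$\mu(f) = \frac{1}{\sqrt{2\pi}}\sum_k\hat f(k)\int_0^{2\pi}e^{ikx}\,d\mu(x)$$
and therefore, substituting $\hat f$ for $f$,
$$\hat\mu(f) = \mu(\hat f) = \sum_kf(k)\,\frac{1}{\sqrt{2\pi}}\int_0^{2\pi}e^{-ikx}\,d\mu(x).$$
Thus the measure $\hat\mu$ concentrates the mass
$$c_k = \frac{1}{\sqrt{2\pi}}\int_0^{2\pi}e^{-ikx}\,d\mu(x)$$
at the point $\xi = k$. In particular, when the measure $\mu$ is absolutely continuous, it is of the form $f(x)\,dx$ where $f(x)$ is locally integrable and periodic with period $2\pi$. The masses $c_k$ are then the usual Fourier coefficients of the function.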
It often happens that the Fourier transform $\hat T$ of a distribution $T$ periodic with period $2\pi$ is a measure of finite total mass, that is, $\sum|c_k| < \infty$; in this case $T$ is a continuous function, indeed, by the formula for the inverse Fourier transform
$$T(x) = \frac{1}{\sqrt{2\pi}}\sum_kc_ke^{ikx},$$
the series converging absolutely.

We also have a form of the Parseval equations for trigonometric series. If $f(x)$ is a locally $L^2$-function, which is periodic of period $2\pi$, its Fourier transform is the measure which puts the mass $c_k$ at $\xi = k$ where $c_k$ is the usual Fourier coefficient. The regularization $f * \varphi_\varepsilon$ is also periodic with period $2\pi$ and its Fourier transform is $(2\pi)^{1/2}\hat\varphi(\varepsilon\xi)\hat f$, that is, the measure which puts the mass $(2\pi)^{1/2}\hat\varphi(\varepsilon k)c_k$ at $\xi = k$. Since $\hat\varphi(\xi)$ is in $\mathcal{S}$, this second measure has finite total mass, hence
$$(f * \varphi_\varepsilon)(x) = \sum_ke^{ikx}c_k\,\hat\varphi(\varepsilon k),$$
the series converging absolutely. Accordingly, integrating the identity we find
$$\int_0^{2\pi}|(f * \varphi_\varepsilon)(x)|^2\,dx = 2\pi\sum_k|c_k|^2\,|\hat\varphi(\varepsilon k)|^2.$$
As $\varepsilon$ approaches 0, the left-hand side approaches the square of the $L^2$-norm of $f$, since the regularizations converge in $L^2$, and it is not difficult to see that the right-hand side converges to $\sum|c_k|^2$, since $\hat\varphi(0) = (2\pi)^{-1/2}$. Accordingly,
$$\int_0^{2\pi}|f(x)|^2\,dx = \sum_k|c_k|^2$$
and more generally, if $g$ is another periodic locally $L^2$-function with Fourier coefficients $d_k$, we have $(f, g) = \sum c_k\bar d_k$ for the $L^2$ inner product. The Parseval equation shows that the system of functions $h_k(x) = (2\pi)^{-1/2}e^{ikx}$ as $k$ runs through the integers is a complete orthonormal set in $L^2(0, 2\pi)$.

Our arguments make it plain that a trigonometric series should be thought of as a formal Fourier transform of a measure on the integers, the coefficient $c_k$ being the mass at $\xi = k$. If the coefficients have a polynomial order of growth, the measure is the Fourier transform of a periodic distribution $T$ of finite order. The smoother that distribution is, the more rapidly do the Fourier coefficients tend to 0 at infinity. If, for example, $T$ has an $m$th derivative in $L^2$ then $|k|^m|c_k|$ belongs to $\ell^2$, whence, certainly, $|c_k| \le C|k|^{-m}$. While the continuity of $f(x)$ does not guarantee that its Fourier transform is a measure of finite total mass, the hypothesis that $f$ belongs to an appropriate Lipschitz class will, as the following theorem of S. N. Bernstein shows.
Theorem (Bernstein): Let $f(x)$ be a function, periodic with period $2\pi$, satisfying a Lipschitz condition of order $\alpha$ where $\alpha > 1/2$; then the Fourier series converges absolutely.

PROOF: Since $f(x)$ is Lipschitzian, it is certainly continuous and hence is locally $L^2$. The modulus of continuity of $f(x)$ is $\omega(t)$ and by hypothesis, $\omega(t) \le Ct^\alpha$. The $L^2$-function $f(x+h) - f(x-h)$ has the Fourier coefficients $c_k(2i)\sin(hk)$ and is continuous, so
$$\int_0^{2\pi}|f(x+h) - f(x-h)|^2\,dx = \sum_k|c_k|^2\,4\sin^2(hk) \le 2\pi\,\omega(2h)^2$$
and therefore, there exists a constant $C_1$ so that $\sum|c_k|^2\sin^2(hk) \le C_1h^{2\alpha}$. We choose an integer $N$ and consider the block of integers $k$ for which $2^N \le |k| < 2^{N+1}$; there are $2^{N+1}$ members of the block. Select $h = (\pi/4)2^{-N}$ and note that for $k$ in the block, $\tfrac12 \le \sin^2(hk) \le 1$; therefore, if we sum over the block, $\sum|c_k|^2 \le 4C_12^{-2N\alpha}$. Now by the Schwarz inequality and still summing over the block, we have
Hence there exists a constant $C_2$ so that this sum is bounded by $C_2\lambda^N$ where $\lambda = 2^{1/2-\alpha}$ is smaller than 1, since $\alpha > 1/2$. Thus $\sum|c_k|$ is finite.

Bernstein's theorem is the best possible in the following sense: there exists a function $f(x)$ that is Lipschitzian of order $1/2$ with a Fourier series that does not converge absolutely.
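For the reader who wants the Schwarz step of the proof written out, one way to obtain the geometric bound is sketched below. It uses only the block estimate $\sum|c_k|^2 \le 4C_12^{-2N\alpha}$ and the count $2^{N+1}$ of integers in the block; the explicit value of $C_2$ is one admissible choice, not necessarily the one the author had in mind.

```latex
\sum_{2^N \le |k| < 2^{N+1}} |c_k|
  \;\le\; \bigl(2^{N+1}\bigr)^{1/2}
          \Bigl(\sum_{2^N \le |k| < 2^{N+1}} |c_k|^2\Bigr)^{1/2}
  \;\le\; 2^{(N+1)/2}\,\bigl(4C_1\,2^{-2N\alpha}\bigr)^{1/2}
  \;=\; 2^{3/2}\,C_1^{1/2}\,\bigl(2^{1/2-\alpha}\bigr)^{N},
\qquad\text{so that}\qquad
\sum_{k \ne 0} |c_k|
  \;\le\; 2^{3/2}\,C_1^{1/2} \sum_{N \ge 0} \lambda^{N}
  \;=\; \frac{2^{3/2}\,C_1^{1/2}}{1-\lambda},
\quad \lambda = 2^{1/2-\alpha} < 1 .
```

The blocks $2^N \le |k| < 2^{N+1}$, $N \ge 0$, exhaust the nonzero integers, which is why summing the geometric series over $N$ controls the whole series.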
34. Periodic Distributions in Several Variables

The distributions $T$ which we consider in this section are defined on $R^n$ and have $n$ linearly independent periods. That is to say, we suppose the existence of $n$ linearly independent vectors $h_1, h_2, \ldots, h_n$ in $R^n$, such that $\mathcal{T}_{h_k}T = T$ for $k = 1, 2, \ldots, n$. There will then exist a linear transformation $l$ of $R^n$ into itself, such that $l(e_k) = h_k$ for all $k$, where $e_k$ is the unit vector in the direction of the $k$th coordinate axis. It will follow that if $S = T\circ l$ then $\mathcal{T}_zS = S$ for all $z$ in $Z^n$, the lattice of points with integer coordinates. We shall presently show that $S$ is temperate, from which it will follow that $T$ is also temperate.

Let $\psi(x)$ be a testfunction which is equal to 1 on the cube $C$ defined by the inequalities $|x_i| \le 1$, $i = 1, 2, \ldots, n$; we suppose further that $0 \le \psi(x) \le 1$ and that $\psi(x)$ vanishes outside a small neighborhood of $C$. The sum
$$\Phi(x) = \sum_{z\in Z^n}\mathcal{T}_z\psi(x)$$
is a periodic $C^\infty$-function which never vanishes, and the ratios
$$\varphi_z(x) = \mathcal{T}_z\psi(x)/\Phi(x) = \mathcal{T}_z\varphi_0(x)$$
form a partition of unity. Accordingly, the distribution $S$ may be written as a sum: $S = \sum_{z\in Z^n}\mathcal{T}_z(\varphi_0S)$, each term of which has compact support and is therefore temperate. The argument of the previous section shows that the sum converges in the space of temperate distributions, hence that $S$ is temperate. Moreover, the equation $\mathcal{T}_zS = S$, which holds for all $z$ in $Z^n$, implies that $(e^{-i(z,\xi)} - 1)\hat S = 0$ for all such $z$, and therefore, $\hat S$ is supported by the lattice of points of the form $2\pi\zeta$ where $\zeta$ has integer coordinates. It should also be clear that $\hat S$ must be a measure.

The distribution $T = S\circ l^{-1}$ therefore has the Fourier transform $\hat T$ which is a measure supported by the lattice $l_*(2\pi Z^n)$. The simplest illustration gives rise to the Poisson summation formula: the distribution $P$ which consists of a unit mass at every point of the lattice $Z^n$ satisfies the equation $(1 - e^{i2\pi(\zeta,x)})P = 0$
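for every $\zeta$ in that lattice, and from this it follows that $\mathcal{T}_{2\pi\zeta}\hat P = \hat P$ and therefore that $\hat P$ is periodic. This means that $\hat P$ consists of the measure which puts the same mass $m$ at every point of $2\pi Z^n$. Accordingly, for every function $\varphi(x)$ in $\mathcal{S}$,
$$\sum_{k\in Z^n}\varphi(k) = m\sum_{k\in Z^n}\hat\varphi(2\pi k).$$
Since $\varphi(x)$ may be the Gaussian, $m$ is clearly positive, while for $\varphi = \psi\circ l_\varepsilon$ we have
and taking $\varepsilon = \sqrt{2\pi}$ we find $m = (2\pi)^{n/2}$. This establishes the following formula.

Poisson Summation Formula (2): For all functions $\varphi$ in $\mathcal{S}$,
$$\sum_{k\in Z^n}\varphi(k) = (2\pi)^{n/2}\sum_{k\in Z^n}\hat\varphi(2\pi k).$$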
The study of the periodic distributions in $R^n$ is, therefore, exactly parallel to that in $R^1$; as in the previous section, one shows that if $f(x)$ is a locally integrable function such that every point in the lattice $2\pi Z^n$ is a period of $f(x)$, then its Fourier transform is a measure supported by $Z^n$. At the point $z$ of that lattice, the measure has the mass
$$c_z = (2\pi)^{-n/2}\int_0^{2\pi}\!\!\int_0^{2\pi}\!\cdots\!\int_0^{2\pi}e^{-i(x,z)}f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2\cdots dx_n.$$
When the total mass is finite, the function is given by the absolutely convergent series
$$f(x) = (2\pi)^{-n/2}\sum_{z\in Z^n}c_ze^{i(x,z)}$$
and when the function is locally $L^2$, the Parseval equation is valid:
$$\sum_{z\in Z^n}|c_z|^2 = \int_0^{2\pi}\!\!\int_0^{2\pi}\!\cdots\!\int_0^{2\pi}|f(x_1, x_2, \ldots, x_n)|^2\,dx_1\,dx_2\cdots dx_n.$$
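The Parseval equation can be illustrated on a small example. The sketch below takes $n = 2$ and a trigonometric polynomial chosen purely for illustration, computes the masses $c_z = (2\pi)^{-1}\int\!\!\int e^{-i(x,z)}f\,dx_1dx_2$ by an equally spaced Riemann sum (which is exact for trigonometric polynomials of low enough degree), and compares $\sum|c_z|^2$ with the integral of $|f|^2$ over a period cell. The grid size `M` and all function names are assumptions of the example.

```python
import cmath
import math

M = 16  # grid points per axis; exact for the low frequencies used below


def f(x, y):
    """Illustrative trigonometric polynomial, periodic under 2 pi Z^2."""
    return 1 + 2 * math.cos(x) + math.sin(x + 2 * y)


def coefficient(z1, z2):
    """c_z = (2 pi)^(-n/2) * integral of exp(-i(x,z)) f over [0, 2 pi]^2, n = 2."""
    h = 2 * math.pi / M
    total = 0j
    for a in range(M):
        for b in range(M):
            x, y = a * h, b * h
            total += cmath.exp(-1j * (z1 * x + z2 * y)) * f(x, y)
    return total * h * h / (2 * math.pi)


def norm_squared():
    """Integral of |f|^2 over one period cell, by the same Riemann sum."""
    h = 2 * math.pi / M
    return sum(f(a * h, b * h) ** 2 for a in range(M) for b in range(M)) * h * h


def coefficient_energy(zmax=3):
    """Sum of |c_z|^2 over the lattice points with |z_i| <= zmax."""
    r = range(-zmax, zmax + 1)
    return sum(abs(coefficient(z1, z2)) ** 2 for z1 in r for z2 in r)


if __name__ == "__main__":
    print(norm_squared(), coefficient_energy())
```

For this particular $f$ both quantities equal $14\pi^2$: the nonzero masses are $2\pi$ at $z = (0,0), (\pm1, 0)$ and $\mp i\pi$ at $z = \pm(1,2)$.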
The Poisson summation formula makes possible an alternate proof for the Minkowski Lemma stated in Section 14. Following C. L. Siegel, we suppose that $V$ is a parallelepiped $|x_i| < b_i$, $i = 1, 2, \ldots, n$, in $R^n$ of volume $m(V) = b$, which contains no point different from the origin with integer coordinates. We show that $m(V) \le 2^n$.
Let $C$ be the cube $|x_i| < 1$, $i = 1, 2, \ldots, n$, in $R^n$ and $l$ a linear transformation which maps $V$ onto $C$; then $|\det l|\,m(V) = 2^n$. If $t(s)$ is defined on the real axis as the triangle function, vanishing for $|s| > 1$ and given by $1 - |s|$ on $[-1, 1]$, then $T(x) = \prod_{k=1}^nt(x_k)$ is a continuous function on $R^n$ supported by $C$. Its Fourier transform is readily computed: it is the positive, integrable function
$$\hat T(\xi) = (2\pi)^{-n/2}\prod_{k=1}^n\left(\frac{\sin(\xi_k/2)}{\xi_k/2}\right)^2.$$
The composed function $(T\circ l)(x) = T(l(x))$ is supported by $V$ and so vanishes for all $x$ other than the origin with integer coordinates. Even though this function is not in the class $\mathcal{S}$, the Poisson summation formula may be applied to it in view of an obvious regularization argument. Accordingly,
$$\sum(T\circ l)(k) = (2\pi)^{n/2}\sum(T\circ l)^\wedge(2\pi k),$$
where all terms in these sums are nonnegative and the summation is taken over all $k$ in the lattice $Z^n$ of points with integer coordinates. The left-hand side reduces to the single term corresponding to $k = 0$, hence,
$$1 = (2\pi)^{n/2}\,|\det l|^{-1}\sum_k\hat T(l_*(2\pi k)),$$
and because $(\hat T\circ l_*)(0) = \hat T(0) = (2\pi)^{-n/2}$, the first term of the series is $1/|\det l|$. We therefore find that $1 = |\det l|^{-1} + R$ where the remainder, $R$, is nonnegative, and it follows that $m(V) \le 2^n$.

We should also note that if equality holds in this inequality, the remainder above must vanish, and since it is a sum of nonnegative terms, every term $\hat T(l_*(2\pi k))$ vanishes for $k$ not the origin. Since the zeros of $\hat T$ are exactly the lattice $2\pi Z^n$, the transformation $l_*$ must carry that lattice into itself. We see then that $l$ maps $Z^n$ into itself and so the vertices of the parallelepiped $V = l^{-1}(C)$ are points of $Z^n$. Thus $V$ must contain $C$, the smallest parallelepiped with vertices in $Z^n$ containing the origin in its interior. Now since $|\det l| = 1$, it follows that $V$ coincides with $C$ if $m(V) = 2^n$ and $V$ intersects $Z^n$ only in the origin.
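The transform of the triangle function used in this argument can be checked numerically. The sketch below assumes the unitary convention $\hat t(\xi) = (2\pi)^{-1/2}\int t(s)e^{-is\xi}\,ds$, for which the closed form is $(2\pi)^{-1/2}\bigl(\sin(\xi/2)/(\xi/2)\bigr)^2$; it compares a midpoint-rule quadrature with that closed form and confirms that the value at the origin is $(2\pi)^{-1/2}$ and that the transform vanishes at nonzero multiples of $2\pi$. The function names are illustrative.

```python
import math


def triangle_hat_numeric(xi, steps=20000):
    """(2 pi)^(-1/2) * integral over [-1, 1] of (1 - |s|) cos(s xi) ds, midpoint rule.
    The sine part of exp(-i s xi) integrates to zero because t is even."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        s = -1.0 + (i + 0.5) * h
        total += (1.0 - abs(s)) * math.cos(s * xi)
    return total * h / math.sqrt(2 * math.pi)


def triangle_hat_closed(xi):
    """Closed form (2 pi)^(-1/2) (sin(xi/2) / (xi/2))^2, with the limit value at 0."""
    if xi == 0:
        return 1.0 / math.sqrt(2 * math.pi)
    return (math.sin(xi / 2) / (xi / 2)) ** 2 / math.sqrt(2 * math.pi)


def product_hat(xi_vector):
    """Transform of T(x) = prod t(x_k) as the product of one-dimensional transforms."""
    result = 1.0
    for xi in xi_vector:
        result *= triangle_hat_closed(xi)
    return result


if __name__ == "__main__":
    for xi in (0.0, 1.0, 2 * math.pi, 7.3):
        print(xi, triangle_hat_numeric(xi), triangle_hat_closed(xi))
```

In several variables the product vanishes as soon as one coordinate is a nonzero multiple of $2\pi$, which is the property the equality case of the argument relies on.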
35. Spherical Harmonics

For $m \ge 0$, we consider the space $\Pi_m$ consisting of homogeneous polynomials $P(x)$ of degree $m$ defined on $R^n$ for $n \ge 3$. Thus, $P\circ l_\varepsilon = \varepsilon^mP$ and $\Pi_m$ is obviously a vector space of dimension $d(m) = (m+n-1)!/m!(n-1)!$, a basis for the space being given by the monomials $x^\alpha$ for $|\alpha| = m$.
If $\alpha$ and $\beta$ are two multi-indices with $|\alpha| = |\beta| = m$, then
$$D^\alpha x^\beta = \prod_kD_k^{\alpha_k}x_k^{\beta_k},$$
which vanishes if $\alpha_k > \beta_k$ for some value of $k$. Accordingly, $D^\alpha x^\beta = 0$ except when $\alpha = \beta$; in this case, $D^\alpha x^\alpha = \alpha!$. It follows that if $P(x)$ and $Q(x)$ are homogeneous polynomials of degree $m$,
$$P(x) = \sum_{|\alpha|=m}a_\alpha x^\alpha,\qquad Q(x) = \sum_{|\alpha|=m}b_\alpha x^\alpha,$$
then $P(D)Q(x)$ is the constant $\sum_{|\alpha|=m}a_\alpha b_\alpha\alpha!$. A convenient inner product is thus obtained for the space $\Pi_m$; we set $(P, Q) = P(D)Q(x)$ and note that $(P, P) \ge 0$, the latter quantity vanishing only for the zero polynomial.

It is easy to see that when $P(x)$ is homogeneous of degree $m$, the differential operator $P(D)$ determines a linear mapping of $\Pi_j$ into $\Pi_{j-m}$ whenever $j \ge m$ and it is important to notice that this mapping is onto. For if it were not onto, there would exist a nontrivial polynomial $Q(x)$ in $\Pi_{j-m}$, orthogonal to all polynomials $P(D)H(x)$ for $H$ in $\Pi_j$. In particular, for $H(x) = Q(x)P(x)$, $(Q, P(D)QP) = 0$, and therefore, $H(D)H(x) = 0$, whence $H(x)$ vanishes identically, a contradiction. In view of this result, it becomes clear that the dimension of the null space of $P(D)$ in $\Pi_j$, that is, of the polynomials $Q$ homogeneous of degree $j$ for which $P(D)Q = 0$, is exactly the difference $d(j) - d(j-m)$. The representation theorem which follows is a useful one.
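The counting in this paragraph is easy to experiment with. The sketch below computes $d(m)$, confirms it against a brute-force count of the monomials $x^\alpha$ with $|\alpha| = m$, evaluates the null-space dimension $d(j) - d(j-m)$, and implements the inner product $(P, Q) = \sum a_\alpha b_\alpha\alpha!$ for polynomials stored as dictionaries from multi-indices to coefficients. That representation and the function names are choices made for the example only.

```python
from itertools import product
from math import comb, factorial


def dim_homogeneous(m, n):
    """d(m) = (m+n-1)! / (m! (n-1)!), taken to be 0 for negative m."""
    return comb(m + n - 1, n - 1) if m >= 0 else 0


def count_monomials(m, n):
    """Brute-force count of multi-indices alpha in n variables with |alpha| = m."""
    return sum(1 for alpha in product(range(m + 1), repeat=n) if sum(alpha) == m)


def dim_null_space(j, m, n):
    """Dimension d(j) - d(j-m) of the null space of P(D) acting on degree j."""
    return dim_homogeneous(j, n) - dim_homogeneous(j - m, n)


def inner_product(p, q):
    """(P, Q) = P(D)Q = sum over alpha of a_alpha * b_alpha * alpha!."""
    total = 0
    for alpha, a in p.items():
        b = q.get(alpha, 0)
        alpha_factorial = 1
        for exponent in alpha:
            alpha_factorial *= factorial(exponent)
        total += a * b * alpha_factorial
    return total


if __name__ == "__main__":
    print([dim_homogeneous(m, 3) for m in range(5)])
    print([dim_null_space(j, 2, 3) for j in range(5)])
    p = {(2, 0): 1, (0, 2): 1}          # x1^2 + x2^2 in two variables
    print(inner_product(p, p))           # (d1^2 + d2^2)(x1^2 + x2^2)
```

With $n = 3$ and $m = 2$ (the case of $P(x) = |x|^2$) the null-space dimensions come out as $2j + 1$.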
Theorem: Let $P(x)$ belong to $\Pi_m$ and $j$ be an integer $> m$; then any polynomial $T(x)$ in $\Pi_j$ admits a unique representation of the form
$$T(x) = \sum_kR_k(x)P^k(x)$$
(the sum being taken over all $k$ for which $km \le j$), where $P(D)R_k = 0$ and $R_k$ is not divisible by $P$.

PROOF: We first suppose $k = 0$ and try to write $T = SP + R_0$, where $R_0$ belongs to $\Pi_j$ and satisfies $P(D)R_0 = 0$. The polynomials of the form $S(x)P(x)$ as $S$ runs through $\Pi_{j-m}$ form a subspace of $\Pi_j$ of dimension $d(j-m)$ while the dimension of the null space of $P(D)$ in $\Pi_j$ is $d(j) - d(j-m)$. These two subspaces have only the zero polynomial in common, since a polynomial $H(x)$ of the form $S(x)P(x)$ which is in the null space of $P(D)$ must satisfy the equation $H(D)H(x) = 0$. The direct sum of these two subspaces thus has dimension $d(j)$ and must, therefore, coincide with $\Pi_j$. It follows that $T$ may be written in the form $T = SP + R_0$ where $P(D)R_0 = 0$, the representation being unique and $R_0$ not divisible by $P$. Similarly, the polynomial $S$ has a unique decomposition $S = PS_1 + R_1$ with $P(D)R_1 = 0$ and $R_1$ not divisible by $P$, whence
$T = P^2S_1 + PR_1 + R_0$. Since $S_1$ can also be so decomposed, the theorem is proved after a finite number of steps.

The interesting case is that where $P(x) = \sum x_i^2 = |x|^2$; here $m = 2$ and $P(D)$ is the Laplace operator $\Delta$. It follows that any polynomial $T(x)$ in $\Pi_j$ has a unique representation of the form
$$T(x) = \sum_k|x|^{2k}R_k(x),$$
where $R_k(x)$ is in $\Pi_{j-2k}$ and is not divisible by $|x|^2$ and is a solution to $\Delta R_k = 0$, that is, is a harmonic function.

The solid harmonics of order $m$ are the homogeneous polynomials of degree $m$ which are also harmonic functions; they form a vector space $\mathcal{H}_m$ of dimension $d(m) - d(m-2)$. The spherical harmonics of order $m$ are the restrictions of the solid harmonics of order $m$ to the surface of the unit ball, that is, the restrictions of polynomials in $\mathcal{H}_m$ to the surface $S = [|x| = 1]$. These functions form a vector space $\mathcal{S}_m$ having the same dimension as $\mathcal{H}_m$, since a solid harmonic which vanishes on $S$ must vanish for $|x| < 1$ because it is the Poisson integral of its boundary values; thus it vanishes identically. The correspondence between the solid and spherical harmonics is thus one-to-one. Since the spherical harmonics are smooth functions on $S$, they are clearly bounded and integrable relative to the natural Hausdorff measure $d\omega$ on $S$ and it is generally convenient to regard $\mathcal{S}_m$ as a (finite-dimensional) subspace of the Hilbert space $L^2(S, d\omega)$. If $f$ is a spherical harmonic of order $m$, the function $f(x/|x|)$ is defined for all $x \ne 0$ in $R^n$ and is homogeneous of order 0; the product $|x|^mf(x/|x|) = P(x)$ is then homogeneous of degree $m$ and is a solid harmonic, since the function $f(x/|x|)$ must be of the form $|x|^{-m}H(x)$ for some solid harmonic $H(x)$, whence $H = P$.

Let $P(x)$ and $Q(x)$ be solid harmonics of order $m$ and $j$ respectively where $m \ne j$. It is obvious that $P\,\Delta Q - Q\,\Delta P$ vanishes identically; integrating this expression over the unit ball $|x| \le 1$ and making use of Green's formula, we obtain
$$0 = \int_S\left(P\frac{\partial Q}{\partial n} - Q\frac{\partial P}{\partial n}\right)d\omega.$$
Here, the normal derivative is of course the derivative with respect to the radius $|x|$. As the solid harmonic $P(x)$ is of the form $|x|^mf(x/|x|)$, the derivative with respect to radius is $m|x|^{m-1}f(x/|x|)$ and the normal derivative of $Q(x)$ is equally simple. Thus, on the surface $S$ where $|x| = 1$, we have
$$0 = (j - m)\int_SP\,Q\,d\omega$$
and it follows that the spherical harmonics obtained from $P$ and $Q$ are orthogonal in the space $L^2(S, d\omega)$. Accordingly, for $m$ different from $j$, the subspaces $\mathcal{S}_m$ and $\mathcal{S}_j$ are mutually orthogonal in that Hilbert space. It also follows easily that for the solid harmonics $P$ and $Q$ of different orders,
$$\int_{|x|\le1}P(x)Q(x)\,dx = 0.$$
Since an arbitrary polynomial $H(x)$ on $R^n$ is a finite sum of homogeneous polynomials, and these, by the representation theorem, are sums of products of powers of $|x|^2$ and solid harmonics, it follows that the restriction to $S$ of any polynomial $H(x)$ is a finite sum of spherical harmonics there. From this fact, it is easy to conclude that the system of all spherical harmonics is complete in the Hilbert space $L^2(S, d\omega)$. For if $f$ in $L^2(S, d\omega)$ is orthogonal to all spherical harmonics, the measure $d\mu = f(x)\,d\omega(x)$ has compact support in $R^n$ and its (entire) Fourier transform $\hat\mu(\xi)$ is determined by the McLaurin coefficients
By hypothesis, these vanish for all $\alpha$, hence $\hat\mu(\xi)$ vanishes identically, and therefore, $f(x) = 0$ almost everywhere $d\omega$. It has already been shown that the subspaces $\mathcal{S}_m$ are mutually orthogonal, hence, if $E_m$ denotes the projection on $\mathcal{S}_m$, an arbitrary $f$ in $L^2(S, d\omega)$ has the unique representation $f = \sum_mE_mf$.

All the spaces which we have considered here are invariant relative to the orthogonal group: if $P$ belongs to $\Pi_m$ so also does $P\circ l$ for any $l$ in $O(n)$, while $P$ is harmonic if and only if it has the mean value property defining harmonic functions; $P\circ l$ has that same property and is therefore also harmonic. Thus the space $\mathcal{H}_m$ is invariant under the substitutions of the orthogonal group, which is obviously also true of the corresponding space of spherical harmonics. The substitutions of the orthogonal group also act in a natural way on $L^2(S, d\omega)$: if $f$ is a function in that space and $l$ an element of $O(n)$ the sets defined by the inequalities a
$$(U_lf)(x) = f(l(x)),\qquad\|U_lf\|^2 = \|f\|^2;$$
the element $l^{-1}$ gives rise to the inverse transformation and it becomes clear
that the system $U_l$ is a group of unitary transformations of $L^2(S, d\omega)$ into itself which is isomorphic to the orthogonal group $O(n)$. Thus $U_{l_1}U_{l_2} = U_{l_1l_2}$ and $U_{l^{-1}} = U_l^{-1}$.

The subspace $\mathcal{S}_m$ is mapped into itself by all of the unitary transformations $U_l$. Let $m$ be given and $p$ a fixed point in $S$; the elements of $\mathcal{S}_m$ are continuous functions on $S$ and the valuation
$$L_p(f) = f(p)$$
is a continuous linear functional on the Hilbert space $\mathcal{S}_m$. Accordingly, that functional is represented by an inner product, and there exists a uniquely determined element $Z_p^{(m)}$ in $\mathcal{S}_m$ such that
$$f(p) = \int_Sf(x)\,Z_p^{(m)}(x)\,d\omega(x) = (f, Z_p^{(m)})$$
for all $f$ in $\mathcal{S}_m$. The spherical harmonic $Z_p^{(m)}(x)$ is called the zonal harmonic of order $m$ with pole $p$. From the equation
$$(U_lf)(p) = f(l(p)) = (U_lf, Z_p^{(m)}) = (f, U_l^{-1}Z_p^{(m)}) = (f, Z_{l(p)}^{(m)}),$$
it follows that $Z_p^{(m)}(l^{-1}(x)) = Z_{l(p)}^{(m)}(x)$, or better, $Z_{l(p)}^{(m)}(l(x)) = Z_p^{(m)}(x)$. Suppose, then, that $x'$ and $x''$ are points of $S$ equally distant from $p$; there surely exists then an orthogonal transformation $l$ in $O(n)$ so that $l(p) = p$ and $l(x') = x''$; and therefore, $Z_p^{(m)}(x') = Z_p^{(m)}(x'')$. Thus $Z_p^{(m)}(x)$ depends only on the distance of $x$ from the pole $p$. In the special case when $n = 3$ and $S$ is thought of as the globe with $p$ the North pole, the zonal harmonics are all functions only of latitude, hence their name. It will become clear that this feature of the zonal harmonics makes them quite the simplest of the spherical harmonics.

Let the pole $p$ be taken as the point $(1, 0, 0, \ldots, 0)$ and the subscript neglected; the zonal harmonic may be written $Z^{(m)}(x/|x|) = |x|^{-m}P(x)$ where $P(x)$ is a solid harmonic of order $m$. Because $P$ is a homogeneous polynomial of degree $m$, this may be written
and since the function can depend only on the ratio $x_1/|x|$, this leads to the equation
$$Z^{(m)}(x/|x|) = u_m(x_1/|x|),$$
where $u_m(t)$ is a polynomial of degree at most $m$ considered on the interval $[-1, 1]$. For distinct values of $m$ and $j$, then, the orthogonality of the harmonics $|x|^mZ^{(m)}(x/|x|)$ and $|x|^jZ^{(j)}(x/|x|)$ over $|x| \le 1$ can be written
$$\int_{|x|\le1}|x|^{m+j}\,u_m(x_1/|x|)\,u_j(x_1/|x|)\,dx = 0.$$
We put $y$ as the point in $R^{n-1}$ with the coordinates $(x_2, x_3, \ldots, x_n)$ to obtain
and introduce spherical coordinates in $R^{n-1}$, writing $dy = r^{n-2}\,dr\,\omega_{n-1}\,d\omega$ so that the equation becomes
This double integral is just an integral over a semicircle in the $x_1$, $r$ plane and can be computed with polar coordinates. We put $x_1 = \rho\cos\theta$ and $r = \rho\sin\theta$ to obtain $dx_1\,dr = \rho\,d\rho\,d\theta$ and
The integral with respect to $\rho$ is easily computed and is not zero; hence, $\int_0^\pi u_m(\cos\theta)\,u_j(\cos\theta)\,(\sin\theta)^{n-2}\,d\theta = 0$ or, finally, $\int_{-1}^{1}u_m(t)\,u_j(t)\,W(t)\,dt = 0$, where
$$W(t) = (1 - t^2)^{(n-3)/2}.$$
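The orthogonality relation with weight $W(t) = (1-t^2)^{(n-3)/2}$ can be tested numerically. The sketch below assumes that, up to scalar factors, the orthogonal polynomials for this weight are the classical Gegenbauer polynomials $C_k^{(\lambda)}$ with $\lambda = (n-2)/2$, generated by the standard three-term recurrence $kC_k = 2(k+\lambda-1)tC_{k-1} - (k+2\lambda-2)C_{k-2}$; it evaluates the integral in the form $\int_0^\pi u_m(\cos\theta)u_j(\cos\theta)(\sin\theta)^{n-2}\,d\theta$ by the midpoint rule. The function names are illustrative.

```python
import math


def gegenbauer(k, lam, t):
    """Classical Gegenbauer polynomial C_k^(lam)(t) via the three-term recurrence."""
    if k == 0:
        return 1.0
    previous, current = 1.0, 2.0 * lam * t
    for i in range(2, k + 1):
        previous, current = current, (
            2.0 * (i + lam - 1.0) * t * current - (i + 2.0 * lam - 2.0) * previous
        ) / i
    return current


def weighted_inner(m, j, n, steps=4000):
    """Midpoint approximation of the integral over [0, pi] of
    u_m(cos th) u_j(cos th) (sin th)^(n-2), with u_k = C_k^((n-2)/2)."""
    lam = (n - 2) / 2.0
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        t = math.cos(theta)
        total += gegenbauer(m, lam, t) * gegenbauer(j, lam, t) * math.sin(theta) ** (n - 2)
    return total * h


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, [round(weighted_inner(2, j, n), 8) for j in range(5)])
```

For $n = 3$ the recurrence produces the Legendre polynomials and for $n = 4$ the Chebyshev polynomials of the second kind, so in each printed row only the entry with $j = 2$ should be visibly nonzero.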
It follows that the polynomials $u_m(t)$ form an orthogonal system of polynomials over the interval $[-1, 1]$ relative to the weight function $W(t)$, that is, the measure $W(t)\,dt$. In particular, then, the degree of $u_m(t)$ is exactly $m$. In the special case $n = 3$, the weight function $W(t) = 1$ and the polynomials $u_m(t)$ are scalar multiples of the Legendre polynomials. When $n = 4$, the associated orthogonal polynomials are the Tchebysheff polynomials of the second kind. For general $n$, the systems are called the ultraspherical polynomials or sometimes the Gegenbauer polynomials.

This identification of the zonal harmonics gives rise to another interesting observation; the zonal harmonic is essentially the only function in $\mathcal{S}_m$ which depends on only one variable. For if $f(x)$ is a spherical harmonic of order $m$ depending only on the distance from the pole, then, after an appropriate
choice of coordinates, $f(x/|x|)$ is given by $v(x_1/|x|)$, where $v(t)$ is a polynomial in $t$ of degree at most $m$. The orthogonality of $f(x)$ and $Z^{(j)}(x)$ for all $j$ different from $m$ then leads as previously to the orthogonality of the polynomial $v(t)$ and the polynomials $u_j(t)$ relative to $W(t)\,dt$ for all such $j$. Since the orthogonal system of polynomials is complete, it follows that $v(t) = cu_m(t)$ for some constant $c$, hence $f(x) = cZ^{(m)}(x)$.

It has already been remarked that the space $\mathcal{S}_m$ is invariant under the operators $U_l$ representing the group $O(n)$. An important fact is that this representation is irreducible; that is, that there cannot exist a proper, nontrivial subspace $\mathcal{M}$ of $\mathcal{S}_m$ which is itself invariant under the group. To show this, we shall consider an arbitrary nonzero element $f$ of $\mathcal{S}_m$ and form the subspace $\mathcal{M}(f)$ consisting of all finite linear combinations of images of $f$ under $U_l$, that is, all functions of the form
$$\sum_{k=1}^N\lambda_kU_{l_k}f.$$
As a subspace of the finite dimensional $\mathcal{S}_m$, $\mathcal{M}(f)$ is automatically closed. We shall show that $\mathcal{M}(f)$ contains a zonal harmonic $Z_p^{(m)}$, and therefore, since $\mathcal{M}(f)$ is invariant under $U_l$, it contains all of the zonal harmonics $Z_{l(p)}^{(m)}$. Thus any $g$ in $\mathcal{S}_m$ orthogonal to $\mathcal{M}(f)$ must be orthogonal to every such zonal harmonic, whence $g(q) = (g, Z_q^{(m)}) = 0$ for all points $q$ in $S$ and $g$ vanishes identically.

Let $p$ be a point of $S$ where $f(p)$ is not 0 and let $O_p(n)$ be the subgroup of the orthogonal group consisting of transformations $l$ satisfying $l(p) = p$. Following the method of Section 12, we integrate over that subgroup to obtain
$$h = \int_{O_p(n)}U_lf\,d\mu(l),$$
a limit of Maak sums $(1/N)\sum_{k=1}^NU_{l_k}f = h_N$. Now, $h_N$ is evidently in $\mathcal{M}(f)$ and $h(p) = \lim h_N(p) = f(p)$ is not 0, hence $h$ is not zero. By the construction of $h$, we have $U_lh = h$ for all $l$ in $O(n)$ satisfying $l(p) = p$ and therefore $h$ is a spherical harmonic of order $m$ which is a function of only one variable, the distance from $p$. Hence $h = cZ_p^{(m)}$ for some nonzero constant $c$, and the zonal harmonic is in $\mathcal{M}(f)$, which therefore coincides with $\mathcal{S}_m$, completing the proof.

It is now possible to determine the subspaces of $L^2(S, d\omega)$ which are invariant for the group $O(n)$. It should first be observed that if $\mathcal{M}$ is a closed subspace of $L^2(S, d\omega)$ and $E$ the (orthogonal) projection on it, then $\mathcal{M}$ is invariant for the group if and only if $E$ commutes with every $U_l$. The invariance of $\mathcal{M}$ also implies the invariance of its orthocomplement $\mathcal{M}^\perp$, since $g$ in $\mathcal{M}^\perp$ means $(g, U_lf) = 0$ for every $f$ in $\mathcal{M}$ and every $l$ in $O(n)$; hence, $(U_{l^{-1}}g, f) = 0$ and $U_lg$ is in $\mathcal{M}^\perp$ for all $l$. Accordingly, if $f = Ef + (I - E)f$, then $U_lf = U_lEf + U_l(I - E)f$; the first term is in $\mathcal{M}$ and the second in $\mathcal{M}^\perp$, whence
U1, then for f in A, U 1 f= U,Ef = EU,f is also in A and .1is invariant. Suppose, now, that A is invariant under the group and E is the corresponding projection. We form T = EjEEm where E m , of course, is the projection on 9,. The range of T is contained in Y j ,and since the factors all commute with U , , so does T. Hence, TUJ = UITf and the range of T is invariant under the group. Because Y jis irreducible, it follows that the range of T is either the zero subspace, or all of Y j .The dimension of the range must be at most the dimension of the range of E m , that is, the dimension of Y m , and therefore, j 6 m. The operator T is nonzero if and only if its adjoint T * = El, EEj is nonzero; if this happens, then m 5 j as we have seen. Accordingly, T = 0 unless, perhaps,j = m,in which case the range of E E , is orthogonal to all Y j f o r j different from m. This means that the range of EE, is contained in Y,,, and, therefore, that 9,is contained in the range of E, that is, in Ji'. Finally, then, for every m 2 0 either Y mis contained in A or it is contained in A%''. We may therefore write the direct sums: U, Ef = EU,.f. On the other hand, if E commutes with all
where the systems $m'$ and $m''$ are complementary subsets of the integers $\ge 0$. Now let $f$ be an element of $L^2(S, d\omega)$ and let $\mathcal{M}(f)$ be the closed linear subspace spanned by the vectors $U_l f$. We may write

$$f = \sum E_m f$$

and

$$U_l f = \sum U_l E_m f = \sum E_m U_l f;$$

it follows that $\mathcal{Y}_j$ is contained in $\mathcal{M}(f)$ if and only if $E_j f$ is not zero. In particular, if $f$ is a function only of one variable, that is, of the distance to a fixed pole $p$, then for an appropriate choice of coordinates, $f$ is of the form $f(x) = v(x_1)$ where $v(t)$ belongs to $L^2([-1, 1], W(t)\,dt)$. Moreover, $U_l f = f$ for all $l$ in $O(n)$ which map the pole $p$ into itself, and therefore $U_l E_m f = E_m f$ for all $m$ and all such $l$. This means that $E_m f$ is also a function of one variable, hence of the form $c_m Z_m^{(p)}$. Finally, then, $\mathcal{M}(f)$ contains $\mathcal{Y}_j$ if and only if the coefficient $c_j$ is nonzero, and this coefficient vanishes if and only if the integral

$$\int_{-1}^{+1} v(t)\,u_j(t)\,W(t)\,dt = 0.$$
We have therefore proved most of the following theorem due to P. Ungar.

Theorem (Ungar): If $f$ is the characteristic function of a circular cap, that is, a subset of $S$ of the form $x_1 > a$, then the subspace $\mathcal{M}(f)$ coincides with $L^2(S, d\omega)$ if and only if $a$ does not belong to a certain countable, dense subset of the interval $[-1, 1]$.
PROOF: The coefficient $c_j$ associated with $f$ is $\int_a^1 u_j(t) W(t)\,dt = c_j(a)$ and this function of $a$ has only finitely many zeros in the interval, since its derivative is of the form $p(a)W(a)$ where $p$ is a polynomial; the derivative has only finitely many zeros and hence, the nonconstant $c_j(a)$ has only finitely many zeros, in view of Rolle's theorem. Thus the subset of $[-1, 1]$ where at least one $c_j(a)$ vanishes is countable, and we shall omit the proof that it is dense.
36. Singular Integrals

Let $w(x)$ be a bounded, measurable function on $R^n$ which is homogeneous of order 0; this means, effectively, that $w(x)$ is a function only of direction and not of distance from the origin. Accordingly, $w(x) = W(x/|x|)$ for some bounded function $W$ defined on the surface $|x| = 1$. We pass to the distribution $S_\lambda = w(x)/|x|^{n-\lambda}$, where $0 < \operatorname{Re}[\lambda] < n$; the distribution is obviously a function, and is homogeneous of degree $\lambda - n$; moreover, it depends analytically on $\lambda$. The distribution admits an analytic continuation over most of the complex $\lambda$-plane and this is determined in exactly the same way as the continuation is found in the special case $w(x) = 1$ treated in Section 24. There will be poles at certain nonpositive integers. In order to study the behavior of the distribution near $\lambda = 0$, we choose a positive radius $d$ and write

$$S_\lambda(\varphi) = \int_{|x|<d} \frac{w(x)[\varphi(x) - \varphi(0)]}{|x|^{n-\lambda}}\,dx + \int_{|x|>d} \frac{w(x)\varphi(x)}{|x|^{n-\lambda}}\,dx + \varphi(0)\int_{|x|<d} \frac{w(x)}{|x|^{n-\lambda}}\,dx,$$
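an expression which is certainly valid when $0 < \operatorname{Re}[\lambda] < n$. If, now, we make the hypothesis that the integral of $w(x)$ over the unit sphere vanishes, that is, $\int W(\theta)\,d\omega(\theta) = 0$, and this will surely be the case if $w(x)$ is an odd function, then the final term in the formula above vanishes. It becomes clear that $S_\lambda$ has no pole at $\lambda = 0$, and its value at that point is given by the formula

$$S_0(\varphi) = \int_{|x|<d} \frac{w(x)[\varphi(x) - \varphi(0)]}{|x|^{n}}\,dx + \int_{|x|>d} \frac{w(x)\varphi(x)}{|x|^{n}}\,dx.$$

Moreover, since the number $S_0(\varphi)$ is independent of $d$, we may write

$$S_0(\varphi) = \lim_{d \to 0} \int_{|x|>d} \frac{w(x)\varphi(x)}{|x|^{n}}\,dx$$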
to obtain a sort of Cauchy principal value. It follows that the distribution $w(x)/|x|^n$ is unambiguously defined. It is called a singular integral. Since the singular integral is homogeneous of degree $-n$, it is temperate and its Fourier transform is homogeneous of degree 0. It is also easy to see that the Fourier transform is a function, since the singular integral is the sum of a distribution with compact support and a function in the class $L^2$, and therefore its transform is the sum of an entire function and a function which is square integrable. Accordingly, $\hat S_0(\xi)$ is a function of the form $F(\xi/|\xi|)$ where $F$ is in $L^2(S, d\omega)$. An important theorem of Calderón and Zygmund asserts that the transform $\hat S_0(\xi)$ is itself a bounded function. We shall prove it only in the special case when the homogeneous function $w(x)$ is odd. In this case $\hat S_0(\xi)$ is the limit in the space of temperate distributions of the sequence
and since the integrand is odd, this reduces to
where $\theta = x/|x|$. If $(\theta\xi) = a > 0$, the inner integral becomes an integral of $(\sin t)/t\,dt$ and converges boundedly, with increasing $N$, to $\pi/2$. In a similar way, the inner integral converges to $-\pi/2$ if $(\theta\xi)$ is negative. Accordingly,
where the signum function vanishes at the origin and is otherwise given by the formula $\operatorname{sign} t = t/|t|$.

Our argument shows not only that $\hat S_0(\xi)$ is a bounded function and homogeneous of order 0, but that it is even a continuous function away from the origin; this is a consequence of the fact that $\hat S_0(\xi)$ on the unit sphere $|\xi| = 1$ is obtained from the initial function $W(\theta)$ by an integral operator having the kernel $K(\theta, \xi) = C \operatorname{sign}(\theta\xi)$ for some $C$. This operator carries $L^2(S, \omega)$ into the space of continuous functions on $S$.

Although we do not prove the full Calderón-Zygmund theorem concerning the boundedness of $\hat S_0(\xi)$, it is instructive to prove it in the special case when the function $W(\theta)$ is a spherical harmonic. For this purpose we first establish the following lemma.
Lemma:
Let $u(x)$ be an integrable function on $R^n$ of the form

$$u(x) = f(r)\,P(\theta), \qquad r = |x|, \quad \theta = x/|x|,$$

where $P(\theta)$ is a spherical harmonic of degree $k$. Then the Fourier transform is of the form

$$\hat u(\xi) = F(\rho)\,P(\xi'), \qquad \rho = |\xi|, \quad \xi' = \xi/|\xi|,$$

where $P$ is the same spherical harmonic.

PROOF: Let us remark that we know the theorem in the case of the spherical harmonics of degree 0, for in this case, $P(\theta) = 1$ and $u(x)$ is a function of radius only. Consider first the function $e^{-it(\theta\xi)}$, where $t > 0$ and both $\theta$ and $\xi$ are points on the surface $S$ of the unit ball. If we suppose that both $t$ and $\xi$ are fixed, the exponential appears as a smooth function on the surface $S$, and is obviously square integrable. It therefore has an expansion in terms of the system of spherical harmonics. In each of the spaces $\mathcal{Y}_m$, select the zonal harmonic $Z_m^{(\xi)}(\theta)$ having its pole at the point $\xi$; since the exponential depends only on the distance between $\theta$ and $\xi$, it is a function of essentially only one variable and its projection on $\mathcal{Y}_m$ has the same property. Accordingly, the exponential has an expansion in spherical harmonics of the form

$$e^{-it(\theta\xi)} = \sum_{m=0}^{\infty} c_m(t)\,Z_m^{(\xi)}(\theta).$$

We now use this expansion to compute the Fourier transform of $u(x)$:

$$\hat u(\xi) = (2\pi)^{-n/2}\int e^{-i(x\xi)}u(x)\,dx = (2\pi)^{-n/2}\,\omega_n \int_0^\infty\!\!\int_S e^{-ir\rho(\theta\xi')} f(r) P(\theta)\, r^{n-1}\,d\omega(\theta)\,dr,$$

where $\rho = |\xi|$ and $\xi' = \xi/|\xi|$, and because $P(\theta)$ is itself a spherical harmonic of degree $k$,

$$\hat u(\xi) = (2\pi)^{-n/2}\,\omega_n \int_0^\infty f(r)\, r^{n-1} c_k(r\rho) \int_S Z_k^{(\xi')}(\theta) P(\theta)\,d\omega(\theta)\,dr.$$
Because of the reproducing property of the zonal harmonic, this reduces to

$$\hat u(\xi) = P(\xi')\,(2\pi)^{-n/2}\,\omega_n \int_0^\infty f(r)\, r^{n-1} c_k(r|\xi|)\,dr,$$

proving the lemma.

Let $\varphi(x)$ be a testfunction depending only on radius and equal to $+1$ in a neighborhood of the origin. For small positive $d$ the function
satisfies the hypotheses of the lemma if P ( 0 ) is a spherical harmonic. The Fourier transform Gd(t) is therefore of the form P(
integral, and so the Fourier transforms converge to the transform $\hat S_0(\xi)$, which is a locally integrable function and homogeneous of degree 0. Hence for some constant $C$,

$$\hat S_0(\xi) = C\,P(\xi/|\xi|).$$

The value of $C$ depends on $k$, the degree of the spherical harmonic $P(\theta)$. More generally, then, if $S_0$ is the singular integral determined by a function $W(\theta)$ which is a finite sum of spherical harmonics, then $\hat S_0(\xi)$ is a finite sum of spherical harmonics of the same degrees.
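The odd case treated earlier in this section rests on the fact that integrals of $(\sin t)/t$ over growing intervals stay bounded and tend to $\pi/2$. The following is only a numerical sketch of that fact, assuming the integral is taken over $[0, R]$; the function name `sine_integral` and the choice of Simpson's rule are illustrative and not part of the text.

```python
import math

def sine_integral(upper, steps=200_000):
    """Composite Simpson approximation of the integral of sin(t)/t over [0, upper]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals

    def g(t):
        # sin(t)/t extended continuously by the value 1 at t = 0
        return 1.0 if t == 0.0 else math.sin(t) / t

    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    # The values oscillate around pi/2 with shrinking amplitude.
    for upper in (10.0, 100.0, 1000.0):
        print(upper, sine_integral(upper), math.pi / 2)
```

The partial integrals never exceed their value at $R = \pi$, which is the bounded convergence that allows the limit to be taken under the integral over the sphere.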
PART III
HARMONIC ANALYSIS
37. Functions of Positive Type

An $N \times N$ matrix $a_{jk}$ is called a positive matrix if the associated quadratic form is nonnegative, that is,

$$\sum_{j}\sum_{k} a_{jk}\, z_j \bar z_k \ge 0$$

for all choices of the $N$ complex numbers $z_k$. Selecting these numbers so that all but one vanish, we see that the diagonal elements of the matrix are nonnegative, and choosing the $z_k$ in such a way that all but two vanish, say $z_\nu$ and $z_\mu$, then

$$a_{\nu\nu}|z_\nu|^2 + a_{\mu\mu}|z_\mu|^2 + a_{\nu\mu} z_\nu \bar z_\mu + a_{\mu\nu} z_\mu \bar z_\nu \ge 0,$$

from which it easily follows that $a_{\nu\mu} = \overline{a_{\mu\nu}}$ and the matrix is hermitian symmetric. Moreover, if $a_{\nu\mu} = |a_{\nu\mu}|e^{i\theta}$, then putting $z_\mu = 1$ and $z_\nu = te^{-i\theta}$ where $t$ is real, the polynomial

$$a_{\nu\nu}t^2 + a_{\mu\mu} + a_{\nu\mu}te^{-i\theta} + a_{\mu\nu}te^{i\theta} = a_{\nu\nu}t^2 + 2|a_{\nu\mu}|t + a_{\mu\mu}$$

has nonnegative coefficients and is nonnegative on the real $t$-axis. Thus, the discriminant is nonpositive and

$$|a_{\nu\mu}|^2 \le a_{\nu\nu}a_{\mu\mu},$$

an inequality which is really the Schwarz inequality. Let $e_k$ be an orthonormal basis in an $N$-dimensional Hilbert space; the matrix $a_{jk}$ gives rise to a linear transformation $A$ defined by

$$A e_k = \sum_j a_{jk}\, e_j$$
and accordingly, $(Af, f) \ge 0$ for all $f$ in the Hilbert space. The transformation $A$ is hermitian symmetric and has a complete orthonormal system of eigenvectors $v_i$ corresponding to eigenvalues $\lambda_i$ which are nonnegative. It is clear that, conversely, every transformation $A$ having the property that $(Af, f)$ is always nonnegative will give rise to a positive matrix when it is written in terms of an arbitrary orthonormal basis. The basis being $e_k$ we will have $a_{jk} = (Ae_k, e_j)$, and if we make use of the fact that $A$ has a positive square root $S$, this square root being simply the transformation with the same eigenvectors and the square roots of the eigenvalues, we may write

$$a_{jk} = (Se_k, Se_j) = (g_k, g_j).$$

It follows that the positive matrix is really a Gram's matrix and the inequality $|a_{\nu\mu}|^2 \le a_{\nu\nu}a_{\mu\mu}$ is really the Schwarz inequality. Finally we note that of course any Gram's matrix is a positive one:

$$\sum_j\sum_k (g_k, g_j)\,\bar z_j z_k = \Big\|\sum_j z_j g_j\Big\|^2 \ge 0.$$

Theorem (I. Schur): If $a_{jk}$ and $b_{jk}$ are positive matrices of the same order, then the matrix $c_{jk} = a_{jk} b_{jk}$ is also positive.
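The statement can be checked on a small example before it is proved. The sketch below builds two Gram matrices from random complex vectors, forms their entrywise product, and evaluates the quadratic form at random points. The helper names (`gram_matrix`, `schur_product`, `quadratic_form`, `smallest_form_value`) are hypothetical, and a random test of this kind illustrates the theorem without proving it.

```python
import random

def gram_matrix(vectors):
    """Gram matrix a_jk = (g_k, g_j), the inner product being linear in its first slot."""
    def inner(u, v):
        return sum(a * b.conjugate() for a, b in zip(u, v))
    n = len(vectors)
    return [[inner(vectors[k], vectors[j]) for k in range(n)] for j in range(n)]

def schur_product(a, b):
    """Entrywise product c_jk = a_jk * b_jk of two square matrices of equal order."""
    n = len(a)
    return [[a[j][k] * b[j][k] for k in range(n)] for j in range(n)]

def quadratic_form(c, z):
    """Value of sum_{j,k} c_jk z_j conj(z_k)."""
    n = len(c)
    return sum(c[j][k] * z[j] * z[k].conjugate() for j in range(n) for k in range(n))

def random_complex_vector(rng, dim):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

def smallest_form_value(trials=200, order=4, seed=0):
    """Smallest real part of the form of the Schur product over random choices of z."""
    rng = random.Random(seed)
    a = gram_matrix([random_complex_vector(rng, 3) for _ in range(order)])
    b = gram_matrix([random_complex_vector(rng, 2) for _ in range(order)])
    c = schur_product(a, b)
    return min(quadratic_form(c, random_complex_vector(rng, order)).real
               for _ in range(trials))

if __name__ == "__main__":
    print(smallest_form_value())  # expected to be nonnegative up to rounding
```

Both factors here are deliberately singular (four vectors in spaces of dimension three and two), so the example also covers positive matrices that are not strictly positive definite.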
PROOF: While our method of proof is purely algebraic, it is convenient to avoid the subscripts, and therefore to write the matrices as if they were the kernels of integral operators. Let $X$ be the space of $N$ points, namely, the indices from 1 to $N$, and $\mu$ the counting measure on $X$; the kernel $A(x, x')$ defined by $A(x, x') = a_{xx'}$ corresponds to an integral operator on the space $L^2(X, \mu)$:

$$(Af)(x) = \int A(x, x') f(x')\,d\mu(x').$$

In a similar way, $b_{jk}$ determines an integral operator $B$ on $L^2(Y, \mu)$, where $Y$ is another name for $X$. The product space $Z = X \times Y$ consists of $N^2$ points, and with the counting measure $\mu$, which is also the product measure $d\mu(x)\,d\mu(y)$, gives rise to the $N^2$-dimensional Hilbert space $L^2(Z, \mu)$. The kernel $K(z, z') = A(x, x')B(y, y')$ now defines an integral operator in $L^2(Z)$:

$$(Kf)(z) = \int K(z, z') f(z')\,d\mu(z').$$

Let $u_i(x)$ be the system of normalized eigenfunctions of $A$ in $L^2(X)$ and $\alpha_i$ the corresponding (positive) eigenvalues; similarly $w_j(y)$ and $\beta_j$ correspond to $B$ in $L^2(Y)$. The function $f_{ij}(z) = u_i(x)w_j(y)$ is then an eigenfunction of $K$ with eigenvalue $\alpha_i\beta_j$, and since there are $N^2$ such functions and they form an orthonormal set, they are complete and it follows that $K$ is a positive operator. Consider next any function $g(z)$ in $L^2(Z, \mu)$ which vanishes outside the diagonal, that is, $g(z) = g(x, y) = 0$ for $x$ different from $y$. Now,

$$0 \le (Kg, g) = \iint K(z, z')\,g(z')\,\overline{g(z)}\,d\mu(z')\,d\mu(z)$$
$$= \iiiint A(x, x')B(y, y')\,g(x', y')\,\overline{g(x, y)}\,d\mu(x')\,d\mu(y')\,d\mu(x)\,d\mu(y)$$
$$= \iint A(x, x')B(x, x')\,g(x', x')\,\overline{g(x, x)}\,d\mu(x')\,d\mu(x).$$
Since it is obvious that any system of $N$ complex numbers $z_k$ can be realized by an appropriate choice of $g(z)$ vanishing outside the diagonal in $Z$, the proof is complete.

The definition which follows appears very artificial, but the class so defined is important.

Definition: A function $f(x)$ on $R^n$ is of positive type if and only if it is continuous and has the property that for every integer $N$ and every set of $N$ points $x_1, x_2, \ldots, x_N$ in the space, the associated matrix $f(x_j - x_k)$ is a positive matrix.

It is evident that if $f(x)$ is of positive type then $f(-x) = \overline{f(x)}$, $f(0) \ge 0$, and $|f(x)|^2 \le f(0)^2$; hence, the functions of positive type have a certain hermitian symmetry and attain their maxima at the origin. In particular, they are bounded and, therefore, are temperate distributions. From Schur's theorem, it follows that the product of any two functions of positive type is again of positive type, that product being obviously continuous. It is even easier to show that the sum of two functions of positive type is also one, and therefore that the class of functions of positive type is a convex cone, closed under multiplication. The exponential function itself is of positive type: for fixed $h$ in $R^n$ the function $e^{ixh}$ is continuous and

$$\sum_j\sum_k e^{i(x_j - x_k)h}\, z_j \bar z_k = \Big|\sum_j z_j e^{ix_j h}\Big|^2 \ge 0.$$

It also readily follows that the Fourier transform of any positive measure $\mu$ of finite total mass is also of positive type, since it was shown in Section 30 that such functions are continuous, while the equation $f(x - y) = (2\pi)^{-n/2}\int e^{-ix\xi}\,\overline{e^{-iy\xi}}\,d\mu(\xi)$ shows that $f(x - y) = (e_x, e_y)$ is an inner product in the $L^2$-space associated with the measure $\mu$, and therefore, that the matrix $f(x_j - x_k)$ is a Gram's matrix. In particular, any function with an integrable nonnegative Fourier transform is of positive type, for example, the Gaussian, which is its own transform. For an integrable function $f(x)$ we set $\tilde f(x) = \overline{f(-x)}$; its Fourier transform will be $\overline{\hat f(\xi)}$, and hence, the Fourier transform of $f * \tilde f$ is $(2\pi)^{n/2}|\hat f(\xi)|^2$ and is positive. Accordingly, $f * \tilde f$ is of positive type; we note
that it is convenient always to choose the testfunction with which regularizations are formed so that it is of positive type. This can be done by taking the usual regularizing function $\varphi$ and passing to $\varphi * \tilde\varphi$, a positive testfunction of positive type; its Fourier transform is then a positive function in $\mathcal{S}$ of positive type, and therefore attains its maximum at the origin. The fundamental theorem is due to Bochner.

Theorem (Bochner): The functions of positive type are exactly the functions of the form

$$f(x) = (2\pi)^{-n/2}\int e^{-ix\xi}\,d\mu(\xi),$$

where $\mu$ is a positive Radon measure of finite total mass.
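The easy half of the theorem can be illustrated in one dimension with a measure made of finitely many positive point masses. The sketch below evaluates $f(x) = (2\pi)^{-1/2}\sum_k m_k e^{-ix\xi_k}$ and tests the quadratic form $\sum\sum f(x_j - x_k) z_j \bar z_k$ at random points; the masses, locations, and helper names are made up for the illustration.

```python
import cmath
import math
import random

def transform_of_point_masses(masses, locations):
    """f(x) = (2*pi)^(-1/2) * sum_k m_k exp(-i x xi_k) for masses m_k placed at xi_k."""
    def f(x):
        total = sum(m * cmath.exp(-1j * x * xi) for m, xi in zip(masses, locations))
        return total / math.sqrt(2.0 * math.pi)
    return f

def form_value(f, points, z):
    """sum_{j,k} f(x_j - x_k) z_j conj(z_k)."""
    return sum(f(xj - xk) * zj * zk.conjugate()
               for xj, zj in zip(points, z)
               for xk, zk in zip(points, z))

def smallest_form_value(f, n_points=5, trials=100, seed=0):
    """Smallest real part of the form over random points and random coefficients."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        points = [rng.uniform(-10.0, 10.0) for _ in range(n_points)]
        z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_points)]
        worst = min(worst, form_value(f, points, z).real)
    return worst

if __name__ == "__main__":
    f = transform_of_point_masses([0.5, 1.5, 2.0], [-1.0, 0.3, 2.2])
    print(f(0.0), smallest_form_value(f))
```

Replacing one of the masses by a negative number typically produces negative values of the form, which is one way to see that positivity of the measure is what matters.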
PROOF: We have already shown that such functions are of positive type. Now let $f(x)$ be of positive type and $\varphi(x)$ an arbitrary function in $\mathcal{S}$; from the boundedness of $f(x)$, its continuity and the integrability of $\varphi(x)$ the integral

$$\iint f(x - y)\,\varphi(x)\,\overline{\varphi(y)}\,dx\,dy$$

is the limit of its Riemann sums of the form

$$\sum_j\sum_k f(x_j - x_k)\,\varphi(x_j)\,\overline{\varphi(x_k)}\,\Delta x_j\,\Delta x_k,$$

which is therefore nonnegative. The integral may be computed in any order; we first change variables, writing $z = x - y$, to obtain

$$0 \le \int f(z)\int \varphi(x)\,\overline{\varphi(x - z)}\,dx\,dz = \int f(z)\,(\varphi * \tilde\varphi)(z)\,dz.$$

Considering $f$ as a temperate distribution and taking Fourier transforms, we find

$$0 \le f(\varphi * \tilde\varphi) = \hat f\big((2\pi)^{n/2}|\hat\varphi(\xi)|^2\big)$$

and therefore infer that the temperate distribution $\hat f$ is positive on squares, that is, $\hat f(\psi) \ge 0$ for all $\psi$ in $\mathcal{S}$ of the form $|\varphi(\xi)|^2$ with $\varphi$ in $\mathcal{S}$. Let $\psi(\xi)$ be a testfunction which is nonnegative and $G(\xi)$ the Gaussian; for $\varepsilon > 0$ the function $\psi(\xi) + \varepsilon^2 G(\xi)$ is in $\mathcal{S}$ and is everywhere strictly positive.
This function therefore has a positive square root which is $C^\infty$ and which is also in $\mathcal{S}$, since it coincides with $\varepsilon G(\xi/\sqrt{2})$ outside of some compact set. Accordingly

$$0 \le \hat f(\psi + \varepsilon^2 G) = \hat f(\psi) + \varepsilon^2 \hat f(G),$$

and since $\varepsilon$ is arbitrary, $\hat f(\psi) \ge 0$ for positive testfunctions. It follows that $\hat f$ is a positive distribution, that is, a positive Radon measure $\mu$. To show that the measure has finite total mass, we regularize $f(x)$ with a positive regularizing function of positive type. Now since $f(x)$ takes its maximum at the origin,
as $\varepsilon$ diminishes to 0 the integrand converges to $(2\pi)^{-n/2}$ uniformly on compact subsets of $R^n$; therefore, for any compact subset $K$ of that space, $\mu(K) \le (2\pi)^{n/2} f(0)$. Since the bound is independent of $K$, $\mu$ has finite total mass.

It is worthwhile to introduce a further definition: a function $f(x)$ which is bounded and measurable is of weak positive type if for every testfunction $\varphi(x)$ the integral

$$\iint f(x - y)\,\varphi(x)\,\overline{\varphi(y)}\,dx\,dy$$

is nonnegative. The proof of Bochner's theorem essentially depended on the fact that $f(x)$ of positive type was of weak positive type. Hence the following theorem is immediate.

Theorem: A function $f(x)$ is of weak positive type if and only if it coincides almost everywhere with a function of positive type.

The functions of positive type do not have any regularity properties other than continuity; almost a century ago, Weierstrass showed that for an appropriate choice of the positive integer $a$, and $b > 0$, the continuous function

$$f(x) = \sum_{k=1}^{\infty} b^k \cos(a^k x)$$

was nowhere differentiable. Now, this is evidently of positive type, since it is the Fourier transform of a positive measure of finite total mass, the measure
consisting of a sequence of point masses on the axis. Nevertheless, if a function of positive type behaves well at the origin, it will behave equally well elsewhere, as the following theorem shows.

Theorem: Let $f(x)$ be of positive type and belong to the class $C^{2k}$ in some neighborhood of the origin; then $f(x)$ is everywhere $C^{2k}$.

PROOF: Let $\mu$ be the Fourier transform of $f(x)$ so that

$$(1 - \Delta)^k f(x) = g(x)$$

has the Fourier transform $(1 + |\xi|^2)^k\,d\mu(\xi)$. Since $g(x)$ is continuous in some neighborhood of 0, it is certainly bounded there, and its regularizations (for small $\varepsilon$) are uniformly bounded there. We have
and deduce that the measure $(1 + |\xi|^2)^k\,d\mu(\xi)$ has finite total mass and $g(x)$ is a (continuous) function of positive type. Thus for all indices $\alpha$ with $|\alpha| \le 2k$, the distribution $(D^\alpha f)(x)$ has a (possibly signed) measure of finite total mass as its Fourier transform, and is therefore a continuous function on $R^n$. We finally invoke the theorem of du Bois-Reymond to infer that $f(x)$ itself has continuous derivatives of all orders $\le 2k$ everywhere in the space.

Suppose $f(x)$ of positive type has the Fourier transform $\mu$ and attains its maximum at some point $h$ in $R^n$ different from the origin; then

$$0 = f(0) - f(h) = (2\pi)^{-n/2}\int (1 - e^{-ih\xi})\,d\mu(\xi).$$

It follows that the positive measure $(1 - \cos(h\xi))\,d\mu(\xi)$ is 0, hence that the measure $(1 - e^{-ih\xi})\,d\mu(\xi)$ is 0. Now for all $x$

$$f(x) - f(x + h) = (2\pi)^{-n/2}\int e^{-ix\xi}(1 - e^{-ih\xi})\,d\mu(\xi) = 0$$

and $h$ is a period of $f(x)$.

In one dimension, the triangle function $t(x)$ is that even, positive function which vanishes for $|x| > 1$ and is equal to $1 - |x|$ elsewhere; it is obviously integrable and the Fourier transform is easily computed by elementary calculus, but we prefer to compute it in another way. The distribution second derivative of $t(x)$ is a measure which consists of 3 point masses: two equal positive masses at the points $x = +1$ and $x = -1$ as well as a negative mass at the origin.
The total mass of $t''$ is 0. Now $\widehat{t''}(\xi) = -\xi^2\,\hat t(\xi) = (2\pi)^{-1/2}[-2 + 2\cos\xi]$ and therefore

$$\hat t(\xi) = \frac{2}{\sqrt{2\pi}}\,\frac{1 - \cos\xi}{\xi^2}.$$

We need not be concerned about a possible singularity at the origin, since $\hat t(\xi)$ must be a continuous and, indeed, analytic function. In any event, the numerator above has a zero of order 2 at the origin. Thus $t(x)$ is of positive type. Essentially the same calculation is used to prove the following theorem.

Theorem: A bounded, even, positive function $f(x)$ which is convex on the right half-axis is of positive type on $R^1$.

PROOF: From the convexity, it follows that the function is continuous. We must show that the sums $\sum\sum f(x_j - x_k) z_j \bar z_k$ are nonnegative. Supposing that the points $x_j$ are given, we form the set of positive differences $x_{jk} = |x_j - x_k| > 0$ to obtain a finite subset $F$ of $x > 0$. We shall presently construct a function $g(x)$ of positive type which coincides with $f(x)$ on $F$ and on its reflection $\tilde F$, satisfying the inequality $g(0) \le f(0)$. Since the two sums then differ only in their diagonal terms,

$$\sum\sum f(x_j - x_k)\,z_j \bar z_k \ge \sum\sum g(x_j - x_k)\,z_j \bar z_k \ge 0$$

as desired.
The new function $g(x)$ is first defined on the right half-axis by

$$g(x) = \max[0,\ l_1(x),\ l_2(x),\ \ldots,\ l_p(x)],$$

where the functions $l_k(x)$ are linear, the graph of $l_k(x)$ being tangent to that of $f(x)$ at the $k$th point of $F$ (see Fig. 7). The function is next extended to the left half-axis by reflection, to obtain an even, continuous positive function which is convex on the right half-axis and has compact support. The graph of $g(x)$ is a polygonal arc; the second derivative of $g$ is a measure, consisting of positive masses at the points of $F$ and their reflections, and a negative mass at the origin. Moreover, the total mass of $g''$ is 0. We may write $g'' = -m\delta + \mu$, where $\mu$ is a positive measure of total mass $m$. Accordingly,

$$\widehat{g''}(\xi) = -m\hat\delta + \hat\mu(\xi) = -\xi^2\,\hat g(\xi),$$

whence $\hat g(\xi) = (m\hat\delta(\xi) - \hat\mu(\xi))/\xi^2$. The function $\hat\mu(\xi)$ is of positive type and its maximum is $\hat\mu(0) = (1/\sqrt{2\pi})\,m = m\hat\delta(\xi)$, hence the numerator in the fraction above is never negative and $g(x)$ is of positive type.
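The closed form found above for the transform of the triangle function, $\hat t(\xi) = (2/\sqrt{2\pi})(1 - \cos\xi)/\xi^2$, can be compared with a direct numerical evaluation of $(2\pi)^{-1/2}\int_{-1}^{1}(1 - |x|)e^{-ix\xi}\,dx$. This is only a sanity check under the normalization of the transform used in this section; the function names and the use of Simpson's rule are choices made for the sketch.

```python
import math

def triangle_transform_numeric(xi, steps=2000):
    """Simpson approximation of (2*pi)^(-1/2) * integral over [-1, 1] of (1 - |x|) exp(-i x xi).

    The integrand's odd part cancels, so the value is real and equals
    (2/sqrt(2*pi)) * integral over [0, 1] of (1 - x) cos(x xi).
    """
    def g(x):
        return (1.0 - x) * math.cos(x * xi)

    h = 1.0 / steps  # steps must be even for Simpson's rule
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2.0 * (total * h / 3.0) / math.sqrt(2.0 * math.pi)

def triangle_transform_closed(xi):
    """Closed form (2/sqrt(2*pi)) * (1 - cos xi) / xi^2, with its limiting value at xi = 0."""
    if xi == 0.0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    return 2.0 * (1.0 - math.cos(xi)) / (math.sqrt(2.0 * math.pi) * xi * xi)

if __name__ == "__main__":
    for xi in (0.0, 0.5, 2.0, 7.3):
        print(xi, triangle_transform_numeric(xi), triangle_transform_closed(xi))
```

The closed form is visibly nonnegative, which is the property that makes $t(x)$ a function of positive type.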
38. Groups of Unitary Transformations

An important application of Bochner's theorem arises in the study of one parameter groups of unitary transformations in Hilbert space. Let $U_t$ be a strongly continuous representation of the group $R^1$ by unitary operators on a Hilbert space $\mathcal{H}$. This means that $U_{t+s} = U_t U_s = U_s U_t$ for all real $t$ and $s$, hence, that $U_0 = I$ = identity, $U_{-t} = U_t^{-1}$ for all $t$, and that $\|U_{t+s}f - U_t f\|$ converges to 0 with $s$ for every $t$ in $R^1$ and every $f$ in $\mathcal{H}$. The continuity condition is really only a requirement that $U_t$ be continuous at the origin, since $\|U_{t+s}f - U_t f\| = \|U_s f - f\|$. The main theorem is due to M. H. Stone. Our proof will not be complete.

Theorem (Stone): There exists a resolution of the identity $E_\lambda$ such that

$$U_t = \int e^{i\lambda t}\,dE_\lambda$$

for all real $t$.
PROOF: Choose an element $f$ in $\mathcal{H}$ and consider the linear space $\mathcal{L}$ of all formal finite sums $\sum z_k U_{t_k} f$, a space which admits a quadratic seminorm, namely, the norm induced by $\mathcal{H}$. We cannot regard $\mathcal{L}$ as a subspace of $\mathcal{H}$ since certain different formal sums may reduce to the same element in $\mathcal{H}$. The function $(U_t f, f) = \psi(t)$ is a function of positive type: it is continuous by hypothesis and

$$\sum\sum \psi(t_j - t_k)\,z_j \bar z_k = \Big\|\sum z_k U_{t_k} f\Big\|^2 \ge 0.$$

We may therefore write

$$\psi(t) = \int e^{i\xi t}\,d\mu(\xi)$$

for some positive measure $\mu$ of finite total mass. Now map $\mathcal{L}$ into the concrete Hilbert space $L^2(\mu)$ in the following way: $\sum a_k U_{t_k} f$ goes into the function $\sum a_k e^{i\xi t_k}$. This mapping is an isometry:

$$\Big\|\sum a_k U_{t_k} f\Big\|^2 = \int \Big|\sum a_k e^{i\xi t_k}\Big|^2\,d\mu(\xi);$$

therefore, elements of $\mathcal{L}$ which correspond to the same element of $\mathcal{H}$ have the same image in $L^2(\mu)$. It follows that the linear subspace of $\mathcal{H}$ determined by all elements $U_t f$ is mapped isometrically into a subspace of $L^2(\mu)$ by the
mapping introduced above, the element $f$ itself going into the function $f(\xi) = +1$. The mapping may therefore be extended by continuity to the smallest closed linear subspace of $\mathcal{H}$ containing all $U_t f$. This subspace is called $\mathcal{M}(f)$. The isometry from $\mathcal{M}(f)$ to $L^2(\mu)$ is in fact onto, since if $g(\xi)$ belongs to $L^2(\mu)$ and is orthogonal to every function $e^{it\xi}$, then

$$0 = \int e^{-it\xi} g(\xi)\,d\mu(\xi)$$

for all $t$, and the (signed) measure $g(\xi)\,d\mu(\xi)$ has the Fourier transform 0. The uniqueness theorem for Fourier transforms then guarantees that $g(\xi)$ is the 0 element of $L^2(\mu)$. The space $\mathcal{M}(f)$ is now realized concretely by $L^2(\mu)$ and the operators $U_t$ are unitarily equivalent to the operators of multiplication by the exponentials. Let $E_\lambda(\xi)$ be the characteristic function of the interval $\xi < \lambda$; multiplication by $E_\lambda$ is a projection in $L^2(\mu)$ and the usual integration theory for operators shows that the equation of Stone's theorem is valid. The theorem has therefore been established when $f$ can be chosen so that $\mathcal{M}(f) = \mathcal{H}$. In the general case, it is necessary to decompose $\mathcal{H}$ into a direct sum of reducing subspaces, each of the form $\mathcal{M}(f)$. We omit these details.

We turn to the computation of the Fourier transform of certain singular measures supported by sets of the Cantor type in the interval $[0, 1]$; our notations and terms all refer to the construction of Section 6 where a Cantor type set of Hausdorff dimension $\alpha$ was constructed. Let $\mu_1$ be a measure on $[0, 1]$ consisting of $N$ point masses, each of mass $1/N$, located at the points $a_1, a_2, \ldots, a_N$. The Fourier transform of $\mu_1$ is the function $(1/\sqrt{2\pi})P(\xi)$ where $P(\xi) = (1/N)\sum e^{-i\xi a_k}$. Let $\mu_2$ be a similar measure of total mass 1, which has the masses $1/N$ located at the points $a_1 q, a_2 q, \ldots, a_N q$; its Fourier transform will be $(1/\sqrt{2\pi})P(\xi q)$. In the same way, the measure $\mu_3$ puts the mass $1/N$ at points of the form $a_k q^2$ and will have the Fourier transform $(1/\sqrt{2\pi})P(\xi q^2)$. A sequence of measures is defined in this way. The convolution $\mu_1 * \mu_2$ is a positive measure of total mass 1; it consists of $N^2$ equal point masses at points of the form $a_j + a_k q$. More generally, the convolution $\nu_n = \mu_1 * \mu_2 * \mu_3 * \cdots * \mu_n$ consists of $N^n$ equal point masses at the left-hand endpoints of the constituent intervals of the set $K_n$; the measure has total mass 1 and its Fourier transform is

$$\hat\nu_n(\xi) = \frac{1}{\sqrt{2\pi}}\prod_{k=0}^{n-1} P(\xi q^k).$$

It is easy to see that the sequence $\nu_n$ converges to a measure $\mu$ of total mass 1 and supported by the set $K$; we do not have to invoke Helly's theorem
for this. If $I$ is any interval, the endpoints of which are not in $K$, then for sufficiently large $n$ the endpoints will not be in $K_n$ either and the number $\nu_n(I)$ will not change with increasing $n$. Therefore, the limiting measure $\mu$ is uniquely determined; it is supported by $K$ since it is supported by every $K_n$, that set being a support for $\nu_m$ whenever $m \ge n$. All the measures considered here are positive and supported by the unit interval; their Fourier transforms are therefore entire functions of positive type on the real axis. Since the $\nu_n$ converge to $\mu$ as temperate distributions, the transforms converge and we find

$$\hat\mu(\xi) = \frac{1}{\sqrt{2\pi}}\prod_{k=0}^{\infty} P(\xi q^k).$$

There can be no difficulty with the convergence of the product: the partial products converge uniformly on any compact subset of the complex plane. In the special case when the set $K$ is the usual Cantor set, we have $N = 2$, $q = 1/3$ and $a_1 = 0$, $a_2 = 2/3$, whence

$$P(\xi) = e^{-i\xi/3}\cos(\xi/3).$$

Accordingly, since $\sum_{k \ge 1} 3^{-k} = \tfrac{1}{2}$, the Fourier transform is

$$\hat\mu(\xi) = \frac{1}{\sqrt{2\pi}}\,e^{-i\xi/2}\prod_{k=1}^{\infty}\cos(\xi/3^k),$$

the factor $e^{-i\xi/2}$ occurring quite naturally, since the set is not symmetric about the origin, but rather about $x = \tfrac{1}{2}$. The smooth function $\hat\mu(\xi)$ attains its maximum only at the origin, since it is not periodic, because in that case, $\mu$ would be supported by a countable set of points. The transform $\hat\mu(\xi)$ does not vanish at infinity; this is established by the determination of the value of the function at points $\xi$ of the form $\xi = 2\pi 3^m$ which follows. We have $\cos(2\pi 3^{m-k}) = +1$ if $m - k \ge 0$, hence,

$$|\hat\mu(2\pi 3^m)| = \frac{1}{\sqrt{2\pi}}\prod_{j=1}^{\infty}|\cos(2\pi/3^j)|,$$

a quantity independent of $m$. It is not 0, since for $j = 1$, the corresponding cosine is $-\tfrac{1}{2}$, and for larger $j$, by the mean value theorem,

$$\cos(2\pi/3^j) \ge 1 - 2\pi/3^j.$$

Since the infinite product $\prod_{j \ge 2}(1 - [2\pi/3^j])$ converges, it is not 0, and therefore $|\hat\mu(2\pi 3^m)|$ is not 0.
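The claim that the transform of the Cantor measure does not vanish at infinity can be made concrete by evaluating the product $\prod_{k \ge 1}|\cos(\xi/3^k)|$, which is $\sqrt{2\pi}\,|\hat\mu(\xi)|$, at the points $\xi = 2\pi 3^m$. The sketch below truncates the product after a fixed number of factors; that truncation, and the function name, are assumptions of the illustration rather than anything in the text.

```python
import math

def cantor_transform_modulus(xi, terms=60):
    """Truncated product prod_{k=1}^{terms} |cos(xi / 3^k)|, i.e. sqrt(2*pi) * |mu_hat(xi)|.

    The neglected factors differ from 1 by roughly (xi / 3^k)^2 / 2, so the
    truncation is harmless once 3^terms is far larger than xi.
    """
    product = 1.0
    for k in range(1, terms + 1):
        product *= abs(math.cos(xi / 3 ** k))
    return product

if __name__ == "__main__":
    # The same nonzero value appears for every m, so the transform
    # cannot tend to 0 along the sequence 2*pi*3^m.
    for m in range(6):
        print(m, cantor_transform_modulus(2.0 * math.pi * 3 ** m))
```

The common value printed for each $m$ is the constant $\prod_{j \ge 1}|\cos(2\pi/3^j)|$ appearing in the argument above.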
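39. Autocorrelation Functions

Let $f(t)$ be defined and locally $L^2$ on the real axis; it is said to have the autocorrelation function $\Phi(s)$ if and only if for all $s$ the limit

$$\lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} f(t - s)\,\overline{f(t)}\,dt$$

exists and equals $\Phi(s)$. We first consider some examples.

Example 1: Let $f(t) = e^{i\lambda t}$; no matter what the value of $T$ is, we have

$$\frac{1}{2T}\int_{-T}^{T} e^{i\lambda(t - s)}e^{-i\lambda t}\,dt = e^{-i\lambda s}$$

and therefore, the autocorrelation exists and is $e^{-i\lambda s}$.

Example 2: Let $f(t) = e^{i\lambda t^2}$ where $\lambda$ is not 0. For $T > 0$, then,

$$\frac{1}{2T}\int_{-T}^{T} e^{i\lambda(t - s)^2}e^{-i\lambda t^2}\,dt = e^{i\lambda s^2}\,\frac{\sin(2\lambda sT)}{2\lambda sT};$$

this is a quantity which converges to 0 with increasing $T$ if $s$ is not 0. Thus, the autocorrelation $\Phi(s)$ exists and vanishes everywhere, except at the origin, where $\Phi(0) = 1$.

Example 3: Let $f(t)$ be a function in $L^2(R^1)$; by the Schwarz inequality,

$$\Big|\frac{1}{2T}\int_{-T}^{T} f(t - s)\,\overline{f(t)}\,dt\Big| \le \frac{1}{2T}\|f\|^2,$$

the norm being taken in $L^2(R^1)$. Thus, the autocorrelation exists and is identically 0.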
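The first two examples can be checked numerically. The sketch below approximates the average $(1/2T)\int_{-T}^{T} f(t - s)\overline{f(t)}\,dt$ by the midpoint rule and compares it with the values $e^{-i\lambda s}$ for $f(t) = e^{i\lambda t}$ and $e^{i\lambda s^2}\sin(2\lambda sT)/(2\lambda sT)$ for $f(t) = e^{i\lambda t^2}$, the latter obtained by direct integration. The function name `windowed_average` and the quadrature rule are choices made for this illustration.

```python
import cmath
import math

def windowed_average(f, s, T, steps=20000):
    """Midpoint-rule approximation of (1/2T) * integral over [-T, T] of f(t - s) conj(f(t))."""
    h = 2.0 * T / steps
    total = 0j
    for i in range(steps):
        t = -T + (i + 0.5) * h
        total += f(t - s) * f(t).conjugate()
    return total * h / (2.0 * T)

def pure_exponential(lam):
    """f(t) = exp(i lam t), the function of Example 1."""
    return lambda t: cmath.exp(1j * lam * t)

def chirp(lam):
    """f(t) = exp(i lam t^2), the function of Example 2."""
    return lambda t: cmath.exp(1j * lam * t * t)

def chirp_average_closed(lam, s, T):
    """Closed form exp(i lam s^2) * sin(2 lam s T) / (2 lam s T), valid for s != 0."""
    return cmath.exp(1j * lam * s * s) * math.sin(2.0 * lam * s * T) / (2.0 * lam * s * T)

if __name__ == "__main__":
    print(windowed_average(pure_exponential(2.0), 0.7, 10.0), cmath.exp(-1.4j))
    for T in (10.0, 50.0, 200.0):
        # the modulus decays like 1/T, so the autocorrelation vanishes for s != 0
        print(T, abs(windowed_average(chirp(1.0), 0.5, T)))
```

For the quadratic phase the average at $s = 0$ is identically 1 while it decays like $1/T$ for every other $s$, which is the discontinuous autocorrelation described in Example 2.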
Example 4: Let $\nu$ be a (signed) measure of finite total mass and $f(t)$ be the continuous function

$$f(t) = \int e^{i\lambda t}\,d\nu(\lambda).$$

Then

$$\frac{1}{2T}\int_{-T}^{T} f(t - s)\,\overline{f(t)}\,dt = \frac{1}{2T}\int_{-T}^{T}\iint e^{i\lambda(t - s)}e^{-i\eta t}\,d\nu(\lambda)\,d\nu(\eta)\,dt.$$

Since all three measures are of finite total mass and the integrand a bounded smooth function, the order of integration may be altered to obtain

$$\iint e^{-i\lambda s}\,\frac{\sin((\lambda - \eta)T)}{(\lambda - \eta)T}\,d\nu(\lambda)\,d\nu(\eta).$$

As $T$ approaches infinity, the integrand is uniformly bounded in the $(\lambda, \eta)$ plane and converges pointwise to 0 at all points not on the diagonal $\lambda = \eta$; on the diagonal, the functions are all equal to $e^{-i\lambda s}$. It follows that the limit exists for all $s$ and the autocorrelation $\Phi(s)$ is given by

$$\Phi(s) = \iint e^{-i\lambda s}\,\chi(\lambda, \eta)\,d\nu(\lambda)\,d\nu(\eta),$$

where $\chi(\lambda, \eta)$ is the characteristic function of the diagonal, a measurable set for the product measure. The theorem of Fubini permits us to compute the integral as a repeated integral; first integrating relative to $\eta$, we find

$$\Phi(s) = \int e^{-i\lambda s}\,\nu[\lambda]\,d\nu(\lambda),$$

where $\nu[\lambda]$ is the mass which $\nu$ concentrates at the point $\lambda$; therefore, finally

$$\Phi(s) = \sum e^{-i\lambda_k s}\,\nu[\lambda_k]^2.$$

Thus, only the point masses in the measure $\nu$ contribute to the autocorrelation. In particular, if $f(t)$ were the Fourier transform of a singular measure of the Cantor type, the autocorrelation would exist and be identically 0, even though the function would not, in general, vanish at infinity. We infer that in some sense it would be small most of the time.

Theorem (Wiener): The autocorrelation function $\Phi(s)$ is of weak positive type.

PROOF: Let $f_T(t)$ be the function which coincides with $f(t)$ on the interval $[-T, T]$ and vanishes elsewhere; the function $\Phi_T(s) = (1/2T)(f_T * \tilde f_T)(s)$ has the Fourier transform $(\sqrt{2\pi}/2T)|\hat f_T(\xi)|^2 = d\mu_T(\xi)$ and is evidently of positive type. We first show that $\Phi_T(-s)$ converges to $\Phi(s)$ as $T$
increases, and it is enough to show this for $s \ge 0$. We may write
and this differs from $(1/2T)\int_{-T}^{T} f(t - s)\,\overline{f(t)}\,dt$ by at most
when $T > s$. Since the one quantity converges to $\Phi(s)$, it is enough to show that the square of the difference converges to 0, and by the Schwarz inequality that square is bounded by the product
Each factor converges to 0 with increasing $T$; for example, the second factor is bounded by
which converges to $\Phi(0) - \Phi(0) = 0$; the first factor is treated similarly. If $M = \sup_T \Phi_T(0)$, then all the functions $\Phi_T(s)$ are uniformly bounded by $M$ and so is the pointwise limit $\Phi(s)$. If $\varphi(x)$ is any testfunction, the positive numbers
converge by the Lebesgue convergence theorem to

$$\iint \Phi(y - x)\,\varphi(x)\,\overline{\varphi(y)}\,dx\,dy,$$

which is therefore nonnegative. Thus $\Phi(-s)$ is of weak positive type, and so is $\Phi(s)$ itself.

It is also clear that the positive measure $\mu$ which is the Fourier transform of $\Phi(s)$ is the weak limit (limit in the sense of Helly's theorem) of the positive measures $d\mu_T(\xi) = (\sqrt{2\pi}/2T)|\hat f_T(-\xi)|^2\,d\xi$. Another important theorem is due to van der Corput.

Theorem (van der Corput): Let $f(t)$ have the autocorrelation $\Phi(s)$ with the Fourier transform $\mu$; then

$$\limsup_{T \to \infty}\Big|\frac{1}{2T}\int_{-T}^{T} f(t)\,dt\Big|^2 \le \frac{1}{\sqrt{2\pi}}\,\mu[0].$$

(Here $\mu[0]$ denotes the mass which $\mu$ concentrates at the origin.)
PROOF: Let $\varphi(t)$ be a regularizing function in $\mathcal{D}$; we recall that we may always take it as a positive function of positive type. Since its integral is $+1$, we have
and we can estimate the absolute value squared of this quantity by the Schwarz inequality, recalling that the testfunction $\varphi_\varepsilon(t)$ is supported by the interval $|t| \le \varepsilon$. Thus,
Now, by the Parseval equation, the last quantity is equal to
and this may be written
It has already been remarked that the measures $d\mu_T$ converge weakly to $d\mu$ and therefore,
As $\varepsilon$ converges to infinity, the functions $\hat\varphi(\varepsilon\xi)$, which are of positive type, are uniformly bounded by their common value at the origin, namely, $1/\sqrt{2\pi}$, and converge pointwise to 0 at all other points; from the Lebesgue convergence theorem, then, we infer that the bound on the right converges to $\mu[0]/\sqrt{2\pi}$ as desired.

Let $f(t)$ have the autocorrelation function $\Phi(s)$ with Fourier transform $\mu$; the function $g(t) = e^{iht}f(t)$ also has an autocorrelation, namely, $e^{-ihs}\Phi(s)$,
and this has the Fourier transform y
h p.
Thus,
and we obtain a corollary to van der Corput's theorem.
Corollary:
Iff(r) has an autocorrelation, the quantity
1
l T f( t)e"" dt 2T - T
-
converges to 0 for all but countably many h. The hypothesis that f ( r ) had an autocorrelation was essential for the corollary. We should remark that Wallin has shown that if we suppose only thatf(r) is bounded and measurable, then the statement of the corollary holds for all h outside of an exceptional set whose Hausdorff dimension is 0. We also remark that if a functionf(t) has an autocorrelation @(s) and if f ( t ) is constant on all open intervals (k,k + 1) for all integers k, then the function @(s) is linear on all such intervals and is a continuous function. This is a consequence of the fact that the convolutions (1/2T)(f, * f T ) ( - s ) are all linear on intervals of the form (k,k + 1) and they converge pointwise to @(s), which must therefore have the same property. The continuity of @(s) follows from the fact that the continuous approximating functions are uniformly Lipschitzian, at least locally. Thus @(s) is actually of positive type, and not just of weak positive type. It should be emphasized that the sum of two functionsf(t) and g(r), each having an autocorrelation, may not itself have an autocorrelation. We sketch an example due to E. Thorp: Let f ( t ) be the characteristic function of the union of the intervals [8k - 1, 8k + 13 as k runs through the integers; this is an even function, periodic with period 8. It is easy to identify the autocorrelation @(s) which is also even and periodic with period 8. We choose a sequence T,, = 4p,, where the odd integers p,, converge rapidly to infinity and define a function g(t) on the right half-axis by setting g(r) = f ( t ) in alternate intervals of the form [ T, , T,,+J,while in the remaining intervals g(r) is given by the translate: g(t) = f ( t - 2). The function g(r) is extended to r < 0 by reflection, so g(r) is also even. For any s with Is1 < 8, the numbers
differ by at most 16, and it is therefore easy to verify that $g(t)$ has an autocorrelation, and that this autocorrelation is also $\Phi(s)$.
In alternate intervals $[T_n, T_{n+1}]$ on the right half-axis, the function $f(t) - g(t)$ vanishes, and in the remaining intervals $|f(t) - g(t)| = 1$ on a subset whose measure is approximately half the length $T_{n+1} - T_n$. A similar comment is valid for the left half-axis. If the difference $f(t) - g(t)$ had an autocorrelation, the limit as $T$ approached infinity of
$$Q(T) = \frac{1}{2T}\int_{-T}^{T} |f(t) - g(t)|^2\,dt$$
would exist, but if the $T_n$ are widely enough spaced, say $T_n = 4(n! + 1)$, the function $Q(T)$ oscillates between 0 and $\tfrac12$.
40. Uniform Distribution Modulo 1

A real number $x$ may be written in a unique way in the form $x = [x] + (x)$ where $[x]$ is an integer and $(x)$ is in the interval $0 \le (x) < 1$; the number $(x)$ is the representative of $x$ modulo 1. Given a sequence $a_k$ of real numbers, we study the sequence modulo 1, that is, the sequence $(a_k)$ in $[0, 1)$. The sequence is said to be uniformly distributed mod 1 if, for every interval $I$ contained in $[0, 1)$, the proportion of $(a_k)$ which falls in $I$ is asymptotically equal to the length of $I$. More formally: if $N(m, I)$ is the number of $(a_k)$ with $k \le m$ which are in the interval $I$, then
$$\lim_{m\to\infty}\frac{N(m, I)}{m}$$
exists and equals the Lebesgue measure of $I$.

It is also possible to think of the uniform distribution in another way: the first $m$ numbers $(a_k)$ determine a measure in the unit interval consisting of point masses $1/m$ at the $m$ (not necessarily distinct) points $(a_k)$; this sequence of measures $\mu_m$ consists of measures of total mass 1, and by Helly's theorem has at least one weakly convergent subsequence, converging to a limit measure $\mu$. The sequence is uniformly distributed mod 1 if and only if $\mu$ is the Lebesgue measure, and in this case it is not necessary to pass to a subsequence. This remark is virtually a proof of the following theorem, which we nevertheless prove without invoking Helly's theorem.
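The definition can be explored numerically. The following is a minimal sketch, not part of the original text: the sequence $a_k = k\sqrt2$, the interval $[0.2, 0.5)$ and the helper names are illustrative assumptions. It computes the proportion $N(m, I)/m$ directly from the definition.

```python
import math


def fractional_part(x):
    """Return (x), the representative of x modulo 1, in [0, 1)."""
    return x - math.floor(x)


def proportion_in_interval(seq, lo, hi):
    """Return N(m, I)/m for I = [lo, hi), where m = len(seq)."""
    hits = sum(1 for a in seq if lo <= fractional_part(a) < hi)
    return hits / len(seq)


# Illustrative choice (an assumption, not taken from the text): a_k = k*sqrt(2).
m = 100_000
sequence = [k * math.sqrt(2.0) for k in range(1, m + 1)]

# The proportion falling in [0.2, 0.5) should be close to the length 0.3.
observed = proportion_in_interval(sequence, 0.2, 0.5)
```

For a sequence that is not uniformly distributed, such as $a_k = k/4$, the same function returns proportions that do not match the interval lengths.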
Theorem: The sequence $a_k$ is uniformly distributed mod 1 if and only if for every Riemann integrable function $f(x)$, periodic with period 1, the limit
$$\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} f(a_k) \quad\text{exists and equals}\quad \int_0^1 f(x)\,dx.$$
PROOF: If the limit exists, as asserted in the theorem, we take for $f(x)$ the characteristic function of the interval $I$ extended over the axis with period 1 to infer that $\lim_m N(m, I)/m$ exists and equals the Lebesgue measure of $I$, that is, that the sequence is uniformly distributed mod 1. On the other hand, if the sequence is uniformly distributed mod 1, the assertion of the theorem holds for any function $f(x)$ which is a finite linear combination of such characteristic functions of intervals extended by periodicity with period 1. Now for any function $f(x)$, Riemann integrable in the interval, there exist two finite linear combinations of characteristic functions of intervals $h(x)$ and $g(x)$ such that $g(x) \le f(x) \le h(x)$ and $\int_0^1 [h(x) - g(x)]\,dx < \varepsilon$; extending those two functions by periodicity we have
$$\int_0^1 g(x)\,dx = \lim_{N}\frac1N\sum_{k=1}^{N} g(a_k) \le \liminf_{N}\frac1N\sum_{k=1}^{N} f(a_k) \le \limsup_{N}\frac1N\sum_{k=1}^{N} f(a_k) \le \lim_{N}\frac1N\sum_{k=1}^{N} h(a_k) = \int_0^1 h(x)\,dx,$$
and the limits at either end of this inequality differ by at most $\varepsilon$. Thus, the theorem is proved, and from it we obtain a criterion established by H. Weyl.
Theorem (Weyl): The sequence $a_k$ is uniformly distributed mod 1 if and only if, for every integer $l > 0$,
$$\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N} e^{i2\pi l a_k} \quad\text{exists and is } 0.$$
PROOF: If the sequence $a_k$ is uniformly distributed mod 1, we invoke the previous theorem for the function $f(x) = e^{i2\pi lx}$, noting that $\int_0^1 f(x)\,dx = 0$. On the other hand, if the limits considered in the theorem exist and are 0, then for every trigonometric polynomial $P(x) = \sum_m A_m e^{i2\pi mx}$,
$$\lim_{N\to\infty}\frac1N\sum_{k=1}^{N} P(a_k) \quad\text{exists and equals}\quad A_0 = \int_0^1 P(x)\,dx.$$
If $f(x)$ is the characteristic function of an interval $I$ in $[0, 1)$ extended periodically with period 1, there exist two trigonometric polynomials $P(x)$ and $Q(x)$ so that $P(x) \le f(x) \le Q(x)$ and $\int_0^1 [Q(x) - P(x)]\,dx < \varepsilon$; we infer that $\lim (1/N)\sum_{k=1}^{N} f(a_k)$ exists and equals the length of $I$, hence, that the sequence is uniformly distributed mod 1.
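Let $\lambda$ be an irrational number and $a_k$ be the sequence $a_k = k\lambda$, $k \ge 1$; this sequence is uniformly distributed mod 1, since for every $l > 0$,
$$\frac1N\sum_{k=1}^{N} e^{i2\pi l\lambda k} = \frac1N\, e^{i2\pi l\lambda}\,\frac{e^{i2\pi l\lambda N} - 1}{e^{i2\pi l\lambda} - 1},$$
and this quantity is bounded in absolute value by $2/N|\sin(2\pi l\lambda)|$ and hence converges to 0. Had $\lambda$ been rational, of course only a finite set of residues mod 1 would occur. We pass to a theorem of van der Corput.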
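The estimate for the sequence $a_k = k\lambda$ can be checked numerically. The sketch below is only an illustration: the function names and the choice $\lambda = \sqrt2$ are assumptions, not taken from the text. It evaluates the Weyl sums $(1/N)\sum e^{i2\pi l a_k}$ and compares them with the bound $2/N|\sin(2\pi l\lambda)|$ quoted above.

```python
import cmath
import math


def weyl_sum(seq, l):
    """Return (1/N) * sum_k exp(i 2 pi l a_k) for the finite sequence seq."""
    return sum(cmath.exp(2j * math.pi * l * a) for a in seq) / len(seq)


def rotation(lam, n):
    """Return the first n terms a_k = k * lam, k = 1, ..., n."""
    return [k * lam for k in range(1, n + 1)]


def stated_bound(lam, l, n):
    """Return the bound 2 / (N |sin(2 pi l lam)|) quoted in the text."""
    return 2.0 / (n * abs(math.sin(2.0 * math.pi * l * lam)))


N = 5000
irrational = rotation(math.sqrt(2.0), N)   # illustrative irrational lambda
sums = {l: abs(weyl_sum(irrational, l)) for l in range(1, 6)}
```

For a rational $\lambda$ the sums need not tend to 0: with $\lambda = 1/4$ and $l = 4$ every term of the sum equals 1.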
Theorem (van der Corput): Let the sequence $a_k$ have the property that for every integer $h > 0$ the sequence $b_k = a_{k+h} - a_k$ is uniformly distributed mod 1; then this also holds for the sequence $a_k$.
PROOF: For a fixed integer $l > 0$, define the function $f(t)$ equal to 0 for $t < 0$ and equal to $e^{i2\pi l a_k}$ in the interval $k - 1 \le t < k$. This function has an autocorrelation, since the functions $(1/2T)(f_T * \tilde f_T)(s)$, which are linear in intervals of the form $(k - 1, k)$, converge for integral values of $s$. This convergence is obvious for $s = 0$, while for $s = h > 0$ and larger integer values of $T$ the value is, up to complex conjugation,
$$\frac{1}{2T}\sum_{k=1}^{T-h} e^{i2\pi l (a_{k+h} - a_k)},$$
which converges to 0 by hypothesis. The autocorrelation therefore exists and is a triangle function: it vanishes for $|s| \ge 1$ and is equal to $\tfrac12(1 - |s|)$ for $|s| < 1$; its Fourier transform is an absolutely continuous measure $\mu$ which has therefore no mass at the origin. From the van der Corput theorem, then,
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} f(t)\,dt = 0,$$
and this means that the numbers $(1/2N)\sum_{k=1}^{N} e^{i2\pi l a_k}$ converge to 0 with increasing $N$. Since $l$ was arbitrary, it follows that the $a_k$ are uniformly distributed mod 1.
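As an illustration of the hypothesis and the conclusion of the theorem, consider the sequence $a_k = \sqrt2\,k^2$ (an assumed example, chosen here only for the sketch). Its difference sequences $b_k = a_{k+h} - a_k = 2\sqrt2\,hk + \sqrt2\,h^2$ are irrational rotations up to a shift, so their Weyl sums are small, and the Weyl sums of $a_k$ itself are observed to be small as well. The helper names are hypothetical.

```python
import cmath
import math


def weyl_mean(seq, l=1):
    """Return (1/N) * sum_k exp(i 2 pi l a_k)."""
    return sum(cmath.exp(2j * math.pi * l * a) for a in seq) / len(seq)


def differences(seq, h):
    """Return the sequence b_k = a_{k+h} - a_k."""
    return [seq[k + h] - seq[k] for k in range(len(seq) - h)]


N = 20_000
# Assumed example sequence: a_k = sqrt(2) * k^2.
a = [math.sqrt(2.0) * k * k for k in range(1, N + 1)]

# Weyl sums of the difference sequences (hypothesis) and of a_k (conclusion).
diff_sizes = [abs(weyl_mean(differences(a, h))) for h in (1, 2, 3)]
a_size = abs(weyl_mean(a))
```

The sums for $a_k$ decay more slowly than those for the differences, so a fairly large $N$ is needed before they look small.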
Corollary: Let the polynomial
$$P(x) = A_m x^m + A_{m-1}x^{m-1} + \cdots + A_1 x + A_0$$
have an irrational leading coefficient $A_m$; then the sequence $a_k = P(k)$ is uniformly distributed mod 1.
PROOF: We argue by induction; for $m = 1$, the theorem has already been shown. For larger $m$ and any integer $h > 0$, the polynomial $Q(x) = P(x + h) - P(x)$ is of lower degree and has an irrational number as its leading coefficient, and so $Q(k)$ is uniformly distributed mod 1. The $h$ being arbitrary, the previous theorem guarantees that $a_k$ is uniformly distributed mod 1.

It is not difficult to extend the criteria of the previous theorems to sequences of points in $R^n$; these sequences are reduced mod 1 to sequences of representative points in the unit cube of $R^n$, each coordinate being taken mod 1 separately. The most interesting case occurs when $n = 2$, where the sequence of points has the coordinates $(a_k, b_k)$, the representatives mod 1 being $((a_k), (b_k))$ in the unit square. The sequence is uniformly distributed in the square if and only if for every pair of integers $(l, h)$ not both 0, the sums
$$\frac1N\sum_{k=1}^{N} e^{i2\pi(l a_k + h b_k)}$$
converge to 0.

If we consider a point moving with uniform velocity in the $x, y$ plane along a linear path of slope $m$, the coordinates of the point may be written as functions of time: $x(t) = t$, $y(t) = mt + b$, and when these coordinates are reduced mod 1, we obtain a family of lines of slope $m$ in the unit square. As the time $t$ runs through the positive integers, we obtain a set of points in the square which is uniformly distributed there, provided that the slope $m$ is irrational, since
$$\frac1N\sum_{k=1}^{N} e^{i2\pi(l + hm)k}\,e^{i2\pi hb}$$
converges to 0 with increasing $N$ because $(l + hm)$ is irrational if $m$ is irrational and $h \ne 0$.

Fig. 8.

If we make the further reduction shown by Fig. 8,
$$X(t) = \min[(x(t)),\, 1 - (x(t))], \qquad Y(t) = \min[(y(t)),\, 1 - (y(t))],$$
we obtain a continuous path in the square of side length $\tfrac12$ which is that of a billiard ball on a square billiard table, the ball being reflected by the sides of the table in the usual way. Thus, the slope being irrational, the ball spends equal amounts of time in equal areas of the table. Note that the initial condition, essentially the coordinate $b$, has nothing to do with the long term behavior of the ball. When the slope $m$ is rational, the path of the ball is periodic.
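The billiard statement can be illustrated by a small simulation. This is a sketch under stated assumptions rather than anything from the text: the path is sampled at times $t = k\,\Delta t$ with the step $\Delta t = \sqrt3/50$ (chosen so that the sampled points are themselves uniformly distributed in the unit square), the slope is $m = \sqrt2$, and the test region is a quarter of the table. The function names are hypothetical.

```python
import math


def fold(u):
    """Return min((u), 1 - (u)), a point of the interval [0, 1/2]."""
    frac = u - math.floor(u)
    return min(frac, 1.0 - frac)


def time_fraction_in_box(slope, intercept, dt, steps, box):
    """Fraction of sampled times at which the folded point (X, Y) lies in box.

    box = (x_lo, x_hi, y_lo, y_hi) is a rectangle inside [0, 1/2] x [0, 1/2].
    """
    x_lo, x_hi, y_lo, y_hi = box
    hits = 0
    for k in range(steps):
        t = k * dt
        big_x = fold(t)                        # X(t) with x(t) = t
        big_y = fold(slope * t + intercept)    # Y(t) with y(t) = m t + b
        if x_lo <= big_x < x_hi and y_lo <= big_y < y_hi:
            hits += 1
    return hits / steps


# Assumed parameters: slope sqrt(2), sampling step sqrt(3)/50.
box = (0.0, 0.25, 0.0, 0.25)   # area 1/16, one quarter of the table's area 1/4
frac_b1 = time_fraction_in_box(math.sqrt(2.0), 0.3, math.sqrt(3.0) / 50.0, 200_000, box)
frac_b2 = time_fraction_in_box(math.sqrt(2.0), 0.7, math.sqrt(3.0) / 50.0, 200_000, box)
```

Both runs should return a value near 1/4, in line with the remark that the coordinate $b$ does not affect the long term behavior.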
41. Schoenberg's Theorem

The measure $\omega$ on $R^n$ which consists of a uniform distribution of unit mass on the surface $|x| = 1$ clearly plays an important role in the study of functions and distributions which are spherically symmetric, that is, are invariant under the orthogonal group. Hence, it is natural to expect that the Fourier transform of that measure will appear in a variety of applications and will be a particularly important function of positive type. We study that function in this section, but find it convenient to normalize the measure differently, and to consider the measure $\omega_n\,d\omega$; we recall that $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$. Let
$$\hat\omega_n(\xi) = (2\pi)^{-n/2}\int e^{-i(x\xi)}\,\omega_n\,d\omega(x);$$
this is evidently a function of positive type, and since the support of the measure is compact, it can be extended to an analytic function of $n$ complex variables. Since the measure is invariant under the transformations of the orthogonal group, so is the function $\hat\omega_n(\xi)$, which is therefore a function of radius alone, and we may write
$$\hat\omega_n(\xi) = \hat\omega_n(0)\,f_n(|\xi|),$$
the function $f_n(t)$ being defined for $t \ge 0$. However, $f_n(t)\,\hat\omega_n(0)$ is merely the restriction of the entire function $\hat\omega_n(\zeta)$ to the right $\xi_1$ half-axis; it is therefore an even function of the real variable $t$, and an entire function of the complex variable $z = t + is$. Obviously, $f_n(0) = 1$. Since the measure is concentrated on the sphere $|x| = 1$, it is clear that $(1 - |x|^2)\,\omega_n\,d\omega$ is the zero measure, hence $(1 + \Delta)\hat\omega_n(\xi) = 0$, and this equation reduces to
$$f_n'' + \frac{n-1}{t}\,f_n' + f_n = 0$$
when the Laplacian is written in terms of the radius. From the analyticity of $f_n(z)$, it follows that that entire function satisfies the given differential equation throughout the complex $z$-plane. We pass to the Maclaurin expansion for $f_n(z)$, which is of the form
$$f_n(z) = \sum_{k=0}^{\infty} a_{2k}\,z^{2k},$$
since $f_n(z)$ is even on the real axis. From the differential equation we obtain a recurrence relation for the coefficients:
$$a_{2k} = \frac{-a_{2k-2}}{2k(n - 2 + 2k)}.$$
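Let $\nu = (n - 2)/2$; the relation becomes $a_{2k} = -a_{2k-2}/4k(\nu + k)$, and in view of the condition $a_0 = f_n(0) = 1$, the coefficients are
$$a_{2k} = \frac{(-1)^k\,\Gamma(\nu + 1)}{4^k\,k!\,\Gamma(\nu + k + 1)}.$$
Accordingly,
$$f_n(z) = \sum_{k=0}^{\infty}\frac{(-1)^k\,\Gamma(\nu + 1)}{4^k\,k!\,\Gamma(\nu + k + 1)}\,z^{2k}.$$
Since by definition the Bessel function of the first kind of order $\nu$ is
$$J_\nu(z) = \frac{z^\nu}{2^\nu}\sum_{k=0}^{\infty}\frac{(-1)^k z^{2k}}{4^k\,k!\,\Gamma(\nu + k + 1)},$$
it follows that $f_n(z) = 2^\nu z^{-\nu} J_\nu(z)\,\Gamma(\nu + 1)$, and since $\hat\omega_n(0) = 2^{-\nu}/\Gamma(\nu + 1)$ we finally obtain
$$\hat\omega_n(\xi) = |\xi|^{-\nu}\,J_\nu(|\xi|),$$
where $\nu = (n - 2)/2$.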
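The recurrence for the coefficients lends itself to a direct numerical check. The following sketch is an illustration only (the function name and the truncation length are assumptions): it sums the series for $f_n(z)$ from the recurrence $a_{2k} = -a_{2k-2}/4k(\nu + k)$ and compares the case $n = 3$ with $\sin z / z$, which is what the formula $f_n(z) = 2^\nu z^{-\nu}J_\nu(z)\Gamma(\nu+1)$ gives for $\nu = \tfrac12$.

```python
import math


def f_n(n, z, terms=80):
    """Sum the Maclaurin series of f_n(z) using a_{2k} = -a_{2k-2} / (4k(nu+k))."""
    nu = (n - 2) / 2.0
    term = 1.0      # a_0 z^0, with a_0 = f_n(0) = 1
    total = 1.0
    for k in range(1, terms):
        # Multiply by a_{2k} z^{2k} / (a_{2k-2} z^{2k-2}).
        term *= -(z * z) / (4.0 * k * (nu + k))
        total += term
    return total


# For n = 3 (nu = 1/2) the series is that of sin(z)/z.
value = f_n(3, 2.0)
reference = math.sin(2.0) / 2.0
```

The truncation at 80 terms is ample for moderate $|z|$; for large arguments more terms, or a library Bessel routine, would be the better choice.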
This explicit identification of $\hat\omega_n(\xi)$ is particularly useful when it is required to write the Fourier transform of an integrable function $f(x)$ defined on $R^n$ which is a function only of radius: $f(x) = F(|x|)$. The transform
$$\hat f(\xi) = (2\pi)^{-n/2}\int e^{-i(x\xi)} f(x)\,dx$$
may be written in spherical coordinates as
$$\hat f(\xi) = \int_0^\infty \hat\omega_n(\xi r)\,F(r)\,r^{n-1}\,dr = \frac{1}{|\xi|^{\nu}}\int_0^\infty J_\nu(|\xi| r)\,F(r)\,r^{n/2}\,dr, \qquad \nu = \frac{n-2}{2}.$$
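It is desirable to have certain estimates for the magnitude of $\hat\omega_n(\xi)$ for large $|\xi|$, or, what is the same thing, for $|f_n(t)|$ when the real variable $t$ is large. Let $\chi_\rho(x)$ be the characteristic function of the ball of radius $\rho$ in $R^n$; its Fourier transform is
$$\hat\chi_\rho(\xi) = (2\pi)^{-n/2}\int_{|x| \le \rho} e^{-i(x\xi)}\,dx = (2\pi)^{-n/2}\int_0^\rho\!\!\int e^{-i(\xi\omega)r}\,\omega_n\,d\omega\;r^{n-1}\,dr = \int_0^\rho \hat\omega_n(\xi r)\,r^{n-1}\,dr,$$
and for fixed $\xi$ this is evidently absolutely continuous as a function of $\rho$ and has the continuous derivative $\hat\omega_n(\xi\rho)\,\rho^{n-1}$. For $\rho = 1$, then, the derivative is particularly simple:
$$\frac{d}{d\rho}\,\hat\chi_\rho(\xi)\Big|_{\rho = 1} = \hat\omega_n(\xi).$$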
It is also possible to write the Fourier transform $\hat\chi_\rho(\xi)$ in another way and to compute the integral explicitly. The inner product $(x\xi)$ is bounded by $\rho|\xi|$ over the domain of integration; we choose $t$ in the interval $[-\rho, \rho]$ and consider the set of $x$ in $|x| \le \rho$ for which $(x\xi) = t|\xi|$. This set is contained in a translate of a coordinate hyperplane; it is shaped like an $(n-1)$-dimensional sphere of radius $(\rho^2 - t^2)^{1/2}$ and its $(n-1)$-dimensional Lebesgue measure is therefore
$$\frac{\omega_{n-1}}{n-1}\,(\rho^2 - t^2)^{(n-1)/2}.$$
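Here we are tacitly supposing that $n > 2$. Since the exponential is constant on this set, we have finally
$$\hat\chi_\rho(\xi) = (2\pi)^{-n/2}\,\frac{\omega_{n-1}}{n-1}\int_{-\rho}^{\rho} e^{-it|\xi|}\,(\rho^2 - t^2)^{(n-1)/2}\,dt.$$
This expression may be differentiated with respect to $\rho$ and $\rho$ set equal to 1; the contribution to the derivative from the limits of integration will be 0, since the integrand vanishes at those limits. Thus the derivative is
$$(2\pi)^{-n/2}\,\omega_{n-1}\int_{-1}^{1} e^{-it|\xi|}\,(1 - t^2)^{(n-3)/2}\,dt,$$
and we finally obtain
$$\hat\omega_n(\xi) = 2\omega_{n-1}(2\pi)^{-n/2}\int_0^1 \cos(t|\xi|)\,(1 - t^2)^{(n-3)/2}\,dt.$$
When $n = 3$, this reduces to a particularly simple expression:
$$\hat\omega_3(\xi) = 2\omega_2(2\pi)^{-3/2}\,\frac{\sin|\xi|}{|\xi|},$$
but we are more interested in larger values of $n$, and therefore, $n$ being greater than 3, we integrate by parts to obtain
$$\hat\omega_n(\xi) = \omega_{n-1}(2\pi)^{-n/2}\,\frac{n-3}{|\xi|}\int_{-1}^{1} \sin(t|\xi|)\,t\,(1 - t^2)^{(n-5)/2}\,dt.$$
Since $|\sin x| \le 1$ for real $x$, the integral is bounded by $2/(n - 3)$ and
$$|\hat\omega_n(\xi)| \le \frac{2\omega_{n-1}(2\pi)^{-n/2}}{|\xi|}.$$
Because $\hat\omega_n(\xi) = \hat\omega_n(0)\,f_n(|\xi|)$ with $\hat\omega_n(0) = \omega_n(2\pi)^{-n/2}$, this leads to the inequality
$$|f_n(t)| \le \frac{2\omega_{n-1}}{\omega_n\,t} = \frac{2}{\sqrt\pi}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\frac1t, \qquad t > 0.$$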
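From the logarithmic convexity of the Gamma function,
$$\Gamma\Big(\frac n2\Big)^2 \le \Gamma\Big(\frac{n-1}{2}\Big)\,\Gamma\Big(\frac{n+1}{2}\Big) = \frac{n-1}{2}\,\Gamma\Big(\frac{n-1}{2}\Big)^2,$$
and therefore,
$$\frac{\Gamma(n/2)}{\Gamma((n-1)/2)} \le \sqrt{\frac{n-1}{2}}.$$
Accordingly,
$$|f_n(t)| \le \sqrt{\frac{2(n-1)}{\pi}}\;\frac1t.$$
It is convenient to introduce the entire function $H_n(z) = f_n(z\sqrt{2n})$; on the real axis this function satisfies the inequality
$$|H_n(x)| < \frac{1}{\sqrt\pi\,|x|}.$$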
Its power series expansion about the origin can be immediately determined from that of $f_n(z)$:
$$H_n(z) = \sum_{k=0}^{\infty} C_{2k}\,z^{2k}, \qquad C_{2k} = \frac{(-1)^k}{k!}\cdot\frac{n^k\,\Gamma(\nu + 1)}{\Gamma(\nu + k + 1)\,2^k},$$
where, as before, $\nu = (n - 2)/2$. The factor $n^k\Gamma(\nu + 1)/\Gamma(\nu + k + 1)2^k$ is bounded by 1 in absolute value, hence, throughout the complex plane, $|H_n(z)| \le e^{|z|^2}$ uniformly in $n$. Moreover, as $n$ increases, the coefficients $C_{2k}$ converge to $(-1)^k/k!$ and so the functions $H_n(z)$, which form an equicontinuous family on any compact subset of the plane, converge to the function $e^{-z^2}$. The inequality $|H_n(x)| \le 1/\sqrt\pi\,|x|$ shows that this convergence is even uniform on the real axis, and this circumstance is essential for the proof of the following remarkable theorem due to I. J. Schoenberg.

Theorem (Schoenberg): A function $F(r)$ defined for $r \ge 0$ has the property that for every integer $n$, the function $\Phi(x) = F(|x|)$ is a function of positive type on $R^n$ if and only if there exists a positive Radon measure $\mu$ of finite total mass such that
$$F(r) = \int_0^\infty e^{-r^2\lambda}\,d\mu(\lambda).$$
PROOF: Half of the proof of the theorem is easy: if $F(r)$ is of the given form, the fact that $d\mu$ has finite total mass means that $F(r)$ is continuous on the closed right half-axis, and $\Phi(x)$ is continuous on $R^n$. This function is of positive type since, for any points $x_j$ and complex numbers $z_j$,
$$\sum_{j,k}\Phi(x_j - x_k)\,z_j\bar z_k = \int_0^\infty\Big[\sum_{j,k} e^{-\lambda|x_j - x_k|^2}\,z_j\bar z_k\Big]\,d\mu(\lambda),$$
and this is positive since the integrand is positive, the Gaussian being of positive type.

On the other hand, if we suppose that $F(r)$ has the property that $\Phi(x) = F(|x|)$ is of positive type on $R^n$ for every $n$, the same holds for $\Phi(x)G(\varepsilon x)$ where $\varepsilon$ is positive and $G(x)$ the Gaussian, because the product of functions of positive type is again of positive type. The function $\Phi(x)G(\varepsilon x)$ is also in $L^1(R^n)$ and hence, its Fourier transform is a positive integrable function; both the function and its transform are invariant under orthogonal transformations and are therefore functions of radius. Accordingly,
$$\Phi(x)G(\varepsilon x) = (2\pi)^{-n/2}\int e^{-i(\xi x)}\,M_n(\xi)\,d\xi,$$
where $M_n(\xi) = m_n(|\xi|)$; this may be written
$$\Phi(x)G(\varepsilon x) = \int_0^\infty \hat\omega_n(xr)\,m_n(r)\,r^{n-1}\,dr = \hat\omega_n(0)\int_0^\infty f_n(|x|r)\,m_n(r)\,r^{n-1}\,dr.$$
After a change of variables: $r = t\sqrt{2n}$, this becomes
$$\Phi(x)G(\varepsilon x) = \int_0^\infty H_n(|x|t)\,d\mu_n(t),$$
where the measure $d\mu_n(t)$ is $\hat\omega_n(0)\,m_n(t\sqrt{2n})\,t^{n-1}(2n)^{n/2}\,dt$; we have
$$F(0) = \Phi(0) = \int_0^\infty d\mu_n(t)$$
for all $n$. It is therefore possible to invoke Helly's theorem: the functions $H_n(|x|t)$ converge uniformly on the axis to $e^{-|x|^2t^2}$ and the measures $d\mu_n$ form a sequence of measures of the same mass on the one-point compactification of the real axis. Thus
$$\Phi(x)G(\varepsilon x) = \int e^{-|x|^2t^2}\,d\nu_\varepsilon(t)$$
for some positive measure $\nu_\varepsilon$ on the right half-axis of total mass $F(0)$. As $\varepsilon$ approaches 0, Helly's theorem again guarantees the existence of a positive measure $\nu$ of total mass $F(0)$ for which
$$F(|x|) = \Phi(x) = \int_0^\infty e^{-|x|^2t^2}\,d\nu(t),$$
and if we set $\lambda = t^2$ we finally obtain the measure $d\mu(\lambda)$ required by the theorem.
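The convergence of $H_n$ to the Gaussian, on which the proof rests, can be observed numerically. The sketch below is illustrative only: it assumes the series $H_n(x) = \sum C_{2k}x^{2k}$ with the ratio $C_{2k}/C_{2k-2} = -n/2k(\nu + k)$ that follows from the coefficients of $f_n$, and the function name and truncation length are hypothetical.

```python
import math


def big_h(n, x, terms=200):
    """Sum the Maclaurin series of H_n(x) = f_n(x * sqrt(2n))."""
    nu = (n - 2) / 2.0
    term = 1.0      # C_0 = 1
    total = 1.0
    for k in range(1, terms):
        # C_{2k} x^{2k} / (C_{2k-2} x^{2k-2}) = -(n/2) x^2 / (k (nu + k))
        term *= -(n / 2.0) * x * x / (k * (nu + k))
        total += term
    return total


# Distance from the limit exp(-x^2) at x = 1 for increasing dimension n.
errors = [abs(big_h(n, 1.0) - math.exp(-1.0)) for n in (4, 40, 400)]
```

The errors shrink roughly like $1/n$, and the values also respect the bound $|H_n(x)| < 1/\sqrt\pi\,|x|$ on the real axis.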
42. Distributions of Positive Type

We have already remarked that it is often convenient to have the regularizing functions $\varphi(x)$ as testfunctions of positive type; this could always be obtained by passing from the testfunction $\varphi(x)$ to the testfunction $\varphi * \tilde\varphi$, the transform of which is $(2\pi)^{n/2}|\hat\varphi(\xi)|^2$. The regularizing function can also be supposed positive and even, and its Fourier transform will have the same properties. Sometimes there is an advantage in having the transform strictly positive on the whole of $R^n$, and this can be guaranteed in the following way. The testfunction $\varphi(x)$ is surely a distribution with compact support; its Fourier transform is then an entire function of $n$ complex variables, which, on the real space, belongs to the class $\mathcal{S}$. The transform $\hat\varphi(\xi)$, therefore, can never vanish on an open subset of $R^n$, since it would then be represented by its Taylor expansion about a point in that open set, and that Taylor expansion would be identically 0. It follows that the set defined by the equation $\hat\varphi(\xi) = 0$ is a closed set which is nowhere dense. Thus the testfunction $\varphi^2(x)$ has $(2\pi)^{-n/2}\,\hat\varphi * \hat\varphi$ for its Fourier transform, and the convolution is never 0 since the integrand $\hat\varphi(\xi - \eta)\hat\varphi(\eta)$ is strictly positive on a set of positive measure. We are, therefore, always at liberty to assume that the regularizing function $\varphi(x)$ is a testfunction which is even, positive, of positive type, and with a Fourier transform having the same properties and which never vanishes.

A distribution $T$ defined on $R^n$ is said to be of positive type if and only if, for every testfunction $\varphi$, $T(\varphi * \tilde\varphi) \ge 0$. It is clear that the functions of positive type, and more generally, the functions of weak positive type, are distributions of positive type. An important generalization of Bochner's theorem has been obtained by L. Schwartz.

Theorem (Schwartz): A distribution $T$ on $R^n$ is of positive type if and only if it is temperate and its Fourier transform is a positive Radon measure.
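PROOF: If we suppose that $\mu$ is a positive Radon measure which is also a temperate distribution, and that $T$ is the temperate distribution with $\hat T = \mu$, then by the definition of the Fourier transform
$$T(\varphi * \tilde\varphi) = (2\pi)^{n/2}\int |\hat\varphi(-\xi)|^2\,d\mu(\xi) \ge 0,$$
and hence, $T$ is of positive type. On the other hand, if the distribution $T$ is of positive type, the convolution $T * \varphi * \tilde\varphi$ is a continuous function which is also a function of positive type: we have only to verify that it is of weak positive type; this follows from
$$(T * \varphi * \tilde\varphi)(\psi * \tilde\psi) = T\big((\check\varphi * \psi) * (\check\varphi * \psi)^{\sim}\big) \ge 0.$$
From Bochner's theorem, then, the Fourier transform of $T * \varphi * \tilde\varphi$ is a positive Radon measure $d\mu_\varphi$ of finite total mass. If we consider any two testfunctions $\varphi(x)$ and $\psi(x)$, we find that the Fourier transform of $T * \varphi * \tilde\varphi * \psi * \tilde\psi$ can be written in either of two forms:
$$(2\pi)^{n/2}|\hat\psi(\xi)|^2\,d\mu_\varphi = (2\pi)^{n/2}|\hat\varphi(\xi)|^2\,d\mu_\psi,$$
and it is therefore convenient to select a testfunction $\psi(x)$ having a positive Fourier transform which is never 0 to define
$$d\mu(\xi) = (2\pi)^{-n}\,|\hat\psi(\xi)|^{-2}\,d\mu_\psi.$$
Accordingly,
$$d\mu_\varphi = \frac{|\hat\varphi(\xi)|^2}{|\hat\psi(\xi)|^2}\,d\mu_\psi,$$
and, therefore, $d\mu_\varphi = (2\pi)^{n}|\hat\varphi(\xi)|^2\,d\mu(\xi)$ for any testfunction $\varphi$. It follows that
$$T\big((\varphi * \tilde\varphi)^{\vee}\big) = (T * \varphi * \tilde\varphi)(0) = (2\pi)^{-n/2}\int d\mu_\varphi = (2\pi)^{n/2}\int |\hat\varphi(\xi)|^2\,d\mu(\xi),$$
and if $\chi(x)$ is a testfunction of the form $\varphi * \tilde\varphi$, this reads
$$T(\check\chi) = \int \hat\chi(\xi)\,d\mu(\xi).$$
In particular, it is possible to suppose that $\chi$ is a regularizing function, supported by the unit ball $|x| \le 1$; the function $\chi_\varepsilon(x) = (1/\varepsilon^n)\chi(x/\varepsilon)$ is then supported by the ball of radius $\varepsilon$, and since $T$ is a distribution,
$$|T(\chi_\varepsilon)| \le C\,\|\chi_\varepsilon\|_N \le C\,\varepsilon^{-n-N}\,\|\chi\|_N \quad\text{for all positive } \varepsilon \le 1.$$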
On the other hand,
$$T(\chi_\varepsilon) = \int \hat\chi(\varepsilon\xi)\,d\mu(\xi) \ge c\,\mu(\{|\xi| \le 1/\varepsilon\}),$$
where $c > 0$ is the minimum of the strictly positive function $\hat\chi$ on the unit ball. Thus the $\mu$-measure of the ball of radius $1/\varepsilon$ is bounded by $C'\varepsilon^{-n-N}$, that is, the ball of radius $R$ has measure at most $C'R^{n+N}$; the measure $\mu$ is of polynomial growth at infinity and is hence a temperate distribution.

It remains only to identify the Fourier transform of $T$ with $\mu$. The regularization $T * \chi_\varepsilon$ is a function of positive type with the Fourier transform $(2\pi)^{n/2}\hat\chi(\varepsilon\xi)\,d\mu(\xi)$, and hence, for all testfunctions $\psi(x)$,
$$(T * \chi_\varepsilon)(\hat\psi) = \int \psi(\xi)\,(2\pi)^{n/2}\hat\chi(\varepsilon\xi)\,d\mu(\xi);$$
as $\varepsilon$ approaches 0, the left-hand side converges to $T(\hat\psi) = \hat T(\psi)$, while the right-hand side becomes $\int \psi(\xi)\,d\mu(\xi)$ by virtue of the Lebesgue convergence theorem. Hence,
$$\hat T(\psi) = \int \psi(\xi)\,d\mu(\xi)$$
and $\mu$ is the Fourier transform of $T$.

It is not hard to find examples of distributions of positive type which are not functions of weak positive type. The $\delta$-distribution is such an example, its Fourier transform being a positive multiple of Lebesgue measure. If $f(x)$ is any function of positive type with Fourier transform $\mu$, the distribution $(1 - \Delta)^k f$ for any positive integer $k$ has the Fourier transform $(1 + |\xi|^2)^k\,d\mu(\xi)$ and is therefore of positive type. We have also seen that the derivative $-t''(x)$, where $t(x)$ is the triangle function on the real axis, is the mass distribution which puts a mass of 2 units at the origin, and $-1$ units at the points $x = 1$ and $x = -1$, respectively; these three point masses form a distribution of positive type.

A more sophisticated example can be obtained by taking the Cantor measure, $\mu$, concentrated on the Cantor set, and convoluting it with its reflection $\tilde\mu$ to obtain a measure $\mu * \tilde\mu = \nu$ on the interval $[-1, 1]$. The Fourier transform
$$\hat\nu(\xi) = (2\pi)^{1/2}\,|\hat\mu(\xi)|^2$$
is positive, and indeed is a function of positive type. The measure $\nu$ is not absolutely continuous, since its transform does not vanish at infinity. Moreover, $\nu$ has no point masses because $\mu$ has none, since for any Borel set $A$,
$$\nu(A) = \int \mu(A + x)\,d\mu(x),$$
and if $A$ is finite, the integrand vanishes identically. It is known that $\nu$ is a singular measure, but we shall not prove it.

An important example of a distribution of positive type was found in Section 33 associated with the Poisson summation formula: this distribution is a positive measure which has equal masses at all points with integer coordinates and has a similar measure for its Fourier transform; both distributions are positive measures, not of finite total mass, and hence both are of positive type.

In Section 32, the distribution defined by the function $1/|x|^{n-\lambda}$ was studied in some detail; when $\lambda$ belonged to the interval $0 < \lambda < n$, this was a positive, locally integrable function, and its Fourier transform was $C(n, \lambda)/|\xi|^{\lambda}$ and had the same properties, the coefficient $C(n, \lambda)$ being positive. Since these distributions are of positive type, we deduce that for any testfunction $\varphi(x)$,
$$\iint \frac{\varphi(x)\,\overline{\varphi(y)}}{|x - y|^{n-\lambda}}\,dx\,dy \ge 0.$$

Another remark may be instructive: let $f(x)$ be an $L^2$-function which is positive; its Fourier transform is therefore a distribution of positive type and if $\hat f(\xi)$ is bounded it is a function of positive type. It will follow that $f(x)\,dx$ is a measure of finite total mass. Accordingly, a positive $L^2$-function with a bounded Fourier transform is necessarily integrable.
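The three-point-mass example from this section is easy to verify by hand and by machine. The sketch below is an illustration with hypothetical names; it omits the normalizing factor $(2\pi)^{-1/2}$, which does not affect positivity. It evaluates $\sum_j m_j e^{-ix_j\xi}$ for the masses $2, -1, -1$ at the points $0, 1, -1$, which equals $2 - 2\cos\xi = 4\sin^2(\xi/2) \ge 0$, in agreement with the theorem of Schwartz.

```python
import cmath
import math


def transform_of_point_masses(masses, points, xi):
    """Return sum_j m_j exp(-i x_j xi), the transform up to the factor (2 pi)^(-1/2)."""
    return sum(m * cmath.exp(-1j * x * xi) for m, x in zip(masses, points))


# Masses of -t''(x) for the triangle function: 2 at 0, -1 at +1 and at -1.
masses = [2.0, -1.0, -1.0]
points = [0.0, 1.0, -1.0]

grid = [0.1 * j for j in range(-100, 101)]
values = [transform_of_point_masses(masses, points, xi) for xi in grid]
```

Changing the central mass to anything smaller than 2 makes the transform negative near $\xi = 0$, so the resulting measure is no longer of positive type.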
43. Paley-Wiener Theorems

In Section 30 it was established that if $T$ is a distribution with compact support, then its Fourier transform is an entire function of $n$ complex variables defined by
$$\hat T(\zeta) = T\big((2\pi)^{-n/2}e^{-i(x\zeta)}\big).$$
It is desirable to extend and sharpen this result.
Theorem: Let $T$ be a distribution with compact support, $K$ the convex hull of the support of $T$ and $H(\eta)$ the support function of $K$; then the Fourier transform $\hat T(\zeta)$ satisfies an inequality of the form
$$|\hat T(\zeta)| \le C(1 + |\zeta|)^N e^{H(\eta)},$$
where $\zeta = \xi + i\eta$ and $N$ is the order of $T$.
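PROOF: Let $\chi(x)$ be a fixed testfunction, equal to $(2\pi)^{-n/2}$ on a neighborhood of $K$, and let $F(t)$ be a fixed $C^\infty$-function on the real axis such that $F(t) = +1$ for $t \le 1$ and $F(t) = 0$ for $t > 2$. Choose a small positive $\lambda$ and form the testfunction
$$\varphi(x) = \chi(x)\,e^{-i(x\zeta)}\,F\big(|\zeta|^{\lambda}[(x\eta) - H(\eta)]\big).$$
This testfunction is equal to $(2\pi)^{-n/2}e^{-i(x\zeta)}$ on a neighborhood of $K$ since the third factor is $+1$ for $(x\eta) - H(\eta) < |\zeta|^{-\lambda}$, and therefore $\hat T(\zeta) = T(\varphi)$. Since $T$ has a compact support, there exists a constant $C$ and an integer $N$ so that $|T(\psi)| \le C\,\|\psi\|_N$ for all testfunctions; accordingly
$$|\hat T(\zeta)| \le C\,\|\varphi\|_N.$$
The function $D^\alpha\varphi$ can be computed by the Leibniz rule, $\varphi$ being the product of three factors. It will appear as the sum
$$D^\alpha(\chi e F) = \sum \frac{\alpha!}{\beta!\,\gamma!\,\delta!}\,D^\beta\chi\,D^\gamma e\,D^\delta F,$$
where the summation is taken over all multi-indices $\beta$, $\gamma$, and $\delta$ such that $\beta + \gamma + \delta = \alpha$. In estimating the absolute value we have $|D^\beta\chi(x)| \le \|\chi\|_N$ in every case, and since $|e^{-i(x\zeta)}| = e^{(x\eta)}$ it follows that $|D^\gamma e^{-i(x\zeta)}| = |\zeta^\gamma|\,e^{(x\eta)} \le |\zeta|^{|\gamma|}e^{(x\eta)}$. There exists a constant $M$ which is a bound for the derivatives $|F^{(k)}(t)|$ on the real axis for $k \le N$, hence the factor $D^\delta F$ is bounded by $M(|\zeta|^{\lambda}|\eta|)^{|\delta|}$ and therefore
$$|D^\alpha\varphi(x)| \le \sum \frac{\alpha!}{\beta!\,\gamma!\,\delta!}\,\|\chi\|_N\,|\zeta|^{|\gamma|}\,e^{(x\eta)}\,M(|\zeta|^{\lambda}|\eta|)^{|\delta|}.$$
It follows that there exists a constant $C$ such that
$$|\hat T(\zeta)| \le C(1 + |\zeta|)^{N(1+\lambda)}\,\sup e^{(x\eta)},$$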
the supremum being taken over the neighborhood of $K$ defined by the inequality $(x\eta) < H(\eta) + 3|\zeta|^{-\lambda}$. If $|\zeta| > 1$ then $e^{(x\eta)} \le e^{H(\eta)}e^3$ on this neighborhood. Thus for $|\zeta| > 1$,
$$|\hat T(\zeta)| \le Ce^3(1 + |\zeta|)^{N(1+\lambda)}\,e^{H(\eta)};$$
none of the terms depends on $\lambda$ except the exponent in which it figures explicitly; letting $\lambda$ approach 0, we obtain the desired inequality for $|\zeta| \ge 1$; it therefore follows for $|\zeta| < 1$, perhaps with a slightly larger constant. We should note that our argument has not supposed that $H(\eta)$ is positive.

When the distribution $T$ is a measure, $N = 0$, and the inequality takes the simple form $|\hat T(\zeta)| \le Ce^{H(\eta)}$. This circumstance leads to the following result.

Corollary: The Fourier transform $\hat\varphi(\zeta)$ of a testfunction $\varphi(x)$ is an entire function of $n$ complex variables; for every integer $k \ge 0$ there exists a constant $C_k$ so that
$$|\hat\varphi(\zeta)| \le C_k(1 + |\zeta|)^{-k}e^{H(\eta)}, \qquad \zeta = \xi + i\eta,$$
where $H(\eta)$ is the support function of the convex hull of the support of $\varphi(x)$.

PROOF: Differentiate $\varphi(x)$ $k$ times in the $x_1$-direction; the resulting testfunction is a measure with compact support, hence $|\zeta_1|^k|\hat\varphi(\zeta)| \le C_1e^{H(\eta)}$ for an appropriate constant $C_1$, the support of the derivative being contained in the support of $\varphi(x)$. Since a similar inequality holds for all other coordinates,
$$\Big(1 + \sum_j |\zeta_j|^k\Big)\,|\hat\varphi(\zeta)| \le C\,e^{H(\eta)}.$$
Since the ratio of $(1 + \sum_j|\zeta_j|^k)$ to $(1 + |\zeta|)^k$ is bounded and bounded away from 0, the corollary is proved.
It is important to establish the converse of the previous theorem and its corollary, and it is convenient to begin with the corollary.

Theorem: Let $K$ be a compact, convex set in $R^n$, $H(\eta)$ its support function; let $F(\zeta)$ be an entire function of $n$ complex variables with the property that for every integer $k \ge 0$, there exists a constant $C_k$ such that $|F(\zeta)| \le C_k(1 + |\zeta|)^{-k}e^{H(\eta)}$ where $\zeta = \xi + i\eta$; there then exists a testfunction $\varphi(x)$ supported by $K$ such that $\hat\varphi(\zeta) = F(\zeta)$.

PROOF: It is evident that on the real space $F(\xi)$ is a $C^\infty$-function which diminishes rapidly as $|\xi|$ approaches infinity; indeed, for any polynomial $p(\xi)$ the function $p(\xi)F(\xi)$ is integrable. Thus $F(\xi)$ is the Fourier transform of a $C^\infty$-function $\varphi(x)$ and we have only to show that $\varphi(x)$ has a compact support. Let $x$ be any point not in $K$; the theorem is proved by showing $\varphi(x) = 0$. The inverse Fourier transform may be written
$$\varphi(x) = (2\pi)^{-n/2}\int e^{i(x\xi)}F(\xi)\,d\xi;$$
this can be put more explicitly as
$$\varphi(x) = (2\pi)^{-n/2}\int\!\cdots\!\int e^{i(x_1\xi_1 + \cdots + x_n\xi_n)}F(\xi_1, \ldots, \xi_n)\,d\xi_1\cdots d\xi_n,$$
where each integral is taken along the real axis and each integrand is an entire function. The path of integration may therefore be changed: for an arbitrary vector $\theta$ in $R^n$ and positive $t$ we pass from $\xi$ to $\xi + it\theta$, changing the variable a coordinate at a time to obtain
$$\varphi(x) = (2\pi)^{-n/2}\int e^{i(x\xi) - t(x\theta)}\,F(\xi + it\theta)\,d\xi.$$
The integrand is bounded by $e^{-t(x\theta)}C_k(1 + |\xi|)^{-k}e^{H(t\theta)}$; this for $k \ge n + 1$ is integrable over $R^n$. Accordingly,
$$|\varphi(x)| \le (2\pi)^{-n/2}\int (1 + |\xi|)^{-k}\,d\xi\;e^{t[H(\theta) - (x\theta)]}\,C_k,$$
since $H(\eta)$ is positively homogeneous. Since $x$ is not in $K$ we can take $\theta$ so that $H(\theta) - (x\theta)$ is negative, and as $t$ can be chosen arbitrarily large, $\varphi(x) = 0$. It follows that $\varphi(x)$ is a testfunction, and, indeed, one supported by $K$.

Theorem: Let $F(\zeta)$ be an entire function of $n$ complex variables which satisfies an inequality of the form $|F(\zeta)| \le C(1 + |\zeta|)^N e^{H(\eta)}$ where $H(\eta)$ is the support function of some compact, convex set $K$; then $F(\zeta)$ is the Fourier transform of a distribution $T$ supported by $K$.
PROOF: On the real space $F(\xi)$ is bounded by a polynomial, hence is a temperate distribution, and is therefore the Fourier transform of a temperate distribution $T$. The regularization of $T$ with a regularizing function $\varphi_\varepsilon(x)$ is a convolution of $T$ with a distribution of compact support; its Fourier transform is then the product $(2\pi)^{n/2}\hat\varphi_\varepsilon(\xi)F(\xi)$, or, more precisely, coincides with this function on the real space $\zeta = \xi$ and hence admits an analytic continuation to this entire function. But this entire function is bounded by $CC_k(1 + |\zeta|)^{N-k}e^{H(\eta) + \varepsilon|\eta|}$ for any $k > 0$, since $\varphi_\varepsilon(x)$ is supported by the ball $|x| \le \varepsilon$ and the support function for that set is $\varepsilon|\eta|$. From the previous theorem it follows that the smooth function $T * \varphi_\varepsilon$ is supported by an $\varepsilon$-neighborhood of $K$. As $\varepsilon$ approaches 0, the regularizations converge to $T$, which is therefore supported by every neighborhood of $K$. This proves the theorem.
Theorem: Let the distribution $T$ be a measure with compact support and $M(\eta) = \sup_\xi |\hat T(\xi + i\eta)|$; then
$$H(\eta) = \lim_{t\to\infty}\frac{\log M(t\eta) - \log M(0)}{t}$$
is the support function of the smallest closed convex set supporting $T$.

PROOF: From previous theorems it is clear that the Fourier transform $\hat T(\zeta)$ satisfies an inequality of the form $|\hat T(\zeta)| \le Ce^{A|\eta|}$; therefore, $M(\eta)$ satisfies the same inequality and, in particular, is always finite. Moreover, $M(\eta)$ is never 0 since the vanishing of $\hat T(\xi + i\eta)$ for some $\eta$ and all real $\xi$ implies that $\hat T(\zeta)$ is identically 0. It is important to notice that $M(\eta)$ is a logarithmically convex function of $\eta$. To show this, we consider three real vectors $\xi$, $\eta'$, and $\eta''$ and the complex variable $z = t + is$; the point $\xi + i(z\eta' + (1 - z)\eta'')$ depends analytically on $z$ and may be written $\xi - s(\eta' - \eta'') + i(t\eta' + (1 - t)\eta'')$, and the function
$$f(z) = \hat T\big(\xi - s(\eta' - \eta'') + i(t\eta' + (1 - t)\eta'')\big)$$
is an entire function of $z$. By the Three Lines theorem, the supremum $L(t) = \sup_s |f(t + is)|$ is a logarithmically convex function of $t$. Taking the supremum again over $\xi$ we find that $\log M(t\eta' + (1 - t)\eta'')$ is convex in $t$. Since the vectors $\eta'$ and $\eta''$ were arbitrary, it follows that $M(\eta)$ is logarithmically convex. In view of this fact, the difference quotient
$$\frac{\log M(t\eta) - \log M(0)}{t}$$
is convex in $\eta$ for fixed positive $t$, and as the difference quotient of a convex function, it increases with $t$. Since it is bounded by
$$A|\eta| + \frac{\log C - \log M(0)}{t},$$
it follows that there exists a limit as $t$ approaches infinity. Thus
$$H(\eta) = \lim_{t\to\infty}\frac{\log M(t\eta) - \log M(0)}{t} = \sup_{t > 0}\frac{\log M(t\eta) - \log M(0)}{t}$$
is convex in $\eta$ and finite everywhere. Evidently $H(\eta)$ is at least as large as the quotient for $t = 1$, whence $\log M(\eta) \le H(\eta) + \log M(0)$, that is, $M(\eta) \le M(0)e^{H(\eta)}$. It is also clear that $H(\eta)$ is positively homogeneous: if $s > 0$, then
$$H(s\eta) = \sup_t\frac{\log M(ts\eta) - \log M(0)}{t} = s\,\sup_r\frac{\log M(r\eta) - \log M(0)}{r} = sH(\eta).$$
It follows that $H(\eta)$ is the support function of a compact convex set $K$ in $R^n$. From the previous theorem, it is clear that $K$ is a support for the measure $T$, and it remains to show that it is exactly the smallest convex set supporting $T$. If $K^*$ is a compact, convex support for $T$ with support function $H^*(\eta)$, then $|\hat T(\zeta)| \le Ce^{H^*(\eta)}$ and therefore $M(\eta) \le Ce^{H^*(\eta)}$, whence, for $t > 0$,
$$\log M(t\eta) - \log M(0) \le tH^*(\eta) + \log C - \log M(0),$$
and therefore $H(\eta) \le H^*(\eta)$. It follows that $K$ is contained in $K^*$.
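A one-dimensional example makes the last theorem concrete. The sketch below is an illustration under stated assumptions: $T$ is taken to be a sum of unit point masses at given points $x_j$ of the real line, so that $|\hat T(\xi + i\eta)|$ is largest at $\xi = 0$ and $M(\eta)$ is a constant multiple of $\sum_j e^{x_j\eta}$; the constant cancels in the difference quotient. The function names are hypothetical.

```python
import math


def log_m(eta, points):
    """Return log sum_j exp(x_j * eta), which is log M(eta) up to an additive constant."""
    exponents = [x * eta for x in points]
    top = max(exponents)
    # Log-sum-exp form, so that large values of eta do not overflow.
    return top + math.log(sum(math.exp(e - top) for e in exponents))


def support_estimate(eta, points, t):
    """Return the difference quotient (log M(t eta) - log M(0)) / t."""
    return (log_m(t * eta, points) - log_m(0.0, points)) / t


# Unit masses at -1 and 3: the convex hull of the support is [-1, 3],
# whose support function is H(eta) = max(-eta, 3 * eta).
points = [-1.0, 3.0]
right = support_estimate(1.0, points, 500.0)    # close to H(1) = 3
left = support_estimate(-1.0, points, 500.0)    # close to H(-1) = 1
```

The quotient increases with $t$, as the proof asserts, and approaches the support function from below.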
44. Functions of the Pick Class

In Section 9, which treated the Poisson integral, it was established that the most general function $u(z)$, positive and harmonic in the unit disk $|z| < 1$, was of the form
$$u(z) = u(re^{i\theta}) = \int \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \omega)}\,d\nu(e^{i\omega}),$$
where $d\nu$ was a positive Radon measure on the circumference $|z| = 1$ of total mass $u(0)$. From this it followed almost immediately that the most general function $f(z)$, analytic in the disk with positive real part, was of the form
$$f(z) = u(z) + iv(z) = iC + \int \frac{e^{i\omega} + z}{e^{i\omega} - z}\,d\nu(e^{i\omega}),$$
where $C = \mathrm{Im}[f(0)]$ is real.
Now it is convenient to study harmonic and analytic functions in the upper half-plane rather than the unit circle. We write the complex variable $\zeta = \xi + i\eta$ and introduce the linear fractional transformations
$$\zeta = \zeta(z) = i\,\frac{1 + z}{1 - z} \qquad\text{and}\qquad z = z(\zeta) = \frac{i\zeta + 1}{i\zeta - 1},$$
which are inverses of one another and which interchange the disk $|z| < 1$ and the half-plane $\eta > 0$. Moreover, the formula
$$\varphi(\zeta) = i\,f(z(\zeta))$$
obviously determines a one-to-one mapping of the class of functions $f(z)$, analytic in the disk with positive real part, and the class of functions $\varphi(\zeta)$, analytic in the upper half-plane, with positive imaginary part. The functions of this latter class are called Pick functions, or functions of the Pick class.

The most general function in the Pick class is then obtained by a change of variables in the integral formula above. In computing that change of variable, it is convenient to display any contribution to $f(z)$ which may arise from a point mass in the measure $d\nu$ at $z = 1$, since that point goes into the point at infinity under the mapping $\zeta(z)$. Accordingly,
$$f(z) = iC + \nu[1]\,\frac{1 + z}{1 - z} + \int \frac{e^{i\omega} + z}{e^{i\omega} - z}\,d\nu'(e^{i\omega}),$$
where $d\nu'$ is the measure $d\nu$ diminished by the mass $\nu[1]$ at $z = 1$. By a routine computation, then,
$$\varphi(\zeta) = -C + \nu[1]\,\zeta + \int \frac{\cot(\omega/2)\,\zeta - 1}{\zeta + \cot(\omega/2)}\,d\nu'(e^{i\omega}),$$
which becomes
$$\varphi(\zeta) = \alpha\zeta + \beta + \int \frac{1 + \lambda\zeta}{\lambda - \zeta}\,d\sigma(\lambda)$$
after the substitution $\lambda = -\cot(\omega/2)$, where $\alpha = \nu[1]$, $\beta = \mathrm{Re}[\varphi(i)]$, and $d\sigma$ is a positive Radon measure on the real $\lambda$-axis of finite total mass. Experience has shown that it is better not to insist on using measures of finite total mass and to introduce the measure $d\mu(\lambda) = (\lambda^2 + 1)\,d\sigma(\lambda)$. This leads to the following canonical representation of functions in the Pick class.
Theorem: The most general function $\varphi(\zeta)$, analytic in the upper half-plane with positive imaginary part, is of the form
$$\varphi(\zeta) = \alpha\zeta + \beta + \int \left[\frac{1}{\lambda - \zeta} - \frac{\lambda}{\lambda^2 + 1}\right] d\mu(\lambda),$$
where $\alpha \ge 0$, $\beta = \operatorname{Re}[\varphi(i)]$, and $d\mu(\lambda)$ is a positive Radon measure for which $\int (\lambda^2 + 1)^{-1}\,d\mu(\lambda)$ is finite.

It is easy to see that if $\alpha$ and $\beta$ are given, as well as a measure $d\mu(\lambda)$ satisfying the requirements of the theorem, then the integral exists for all $\zeta$ in the upper half-plane and is a Pick function. The function may be written $\varphi(\zeta) = U(\zeta) + iV(\zeta)$ to obtain a companion canonical representation.

Corollary: The most general function $V(\zeta)$, harmonic and positive in the upper half-plane, is of the form
$$V(\zeta) = V(\xi + i\eta) = \alpha\eta + \int \frac{\eta}{(\lambda - \xi)^2 + \eta^2}\,d\mu(\lambda),$$
where $\alpha \ge 0$ and $d\mu$ is a positive Radon measure for which $\int (\lambda^2 + 1)^{-1}\,d\mu(\lambda)$ is finite.
It is important to prove that the canonical representations given by the theorem and its corollary are unique: that is, that the numbers $\alpha$ and $\beta$ as well as the measure $d\mu$ are determined by the function $\varphi(\zeta)$. This is of course obvious for $\beta$. Moreover
$$\frac{V(i\eta)}{\eta} - \alpha = \int \frac{d\mu(\lambda)}{\lambda^2 + \eta^2}$$
and as $\eta$ increases, the positive integrand diminishes monotonically to 0. Hence, from the Lebesgue convergence theorem,
$$\alpha = \lim_{\eta \to \infty} \frac{V(i\eta)}{\eta}.$$
The measure $\mu$ is described by a suitable normalized monotone increasing function $\mu(\lambda)$ on the real axis, the normalization being $\mu(0) = 0$ and $\tfrac12[\mu(\lambda + 0) + \mu(\lambda - 0)] = \mu(\lambda)$; the second condition is preferable to the requirement that $\mu$ be left (or right) continuous. The correspondence between $V(\zeta)$ and $\mu$ is given by the following theorem.
Theorem: For any finite $a$ and $b$,
$$\mu(b) - \mu(a) = \lim_{\eta \to 0} \frac{1}{\pi}\int_a^b V(x + i\eta)\,dx.$$
PROOF: Suppose first that the measure has finite total mass; then by the Fubini theorem and the substitution $t = (x - \lambda)/\eta$,
$$\frac{1}{\pi}\int_a^b V(x + i\eta)\,dx = \frac{\alpha\eta(b - a)}{\pi} + \int \frac{1}{\pi}\left[\arctan\frac{b - \lambda}{\eta} - \arctan\frac{a - \lambda}{\eta}\right] d\mu(\lambda).$$
As $\eta$ approaches 0, the positive integrand is bounded by $+1$ and converges to the function $F(\lambda)$ equal to $+1$ in the open interval $(a, b)$, vanishing outside the closed interval $[a, b]$, and equal to $\tfrac12$ at the endpoints. In view of the convention made about the discontinuities of $\mu(\lambda)$, then, the limit is
$$\mu(b) - \mu(a) = \int F(\lambda)\,d\mu(\lambda).$$
If the measure does not have finite total mass, we decompose $\mu$ into a sum $\mu = \mu_1 + \mu_2$, where $\mu_1$ is the restriction of $\mu$ to the interval $(a - 1, b + 1)$. There is a corresponding decomposition of $V(\zeta) = V_1(\zeta) + V_2(\zeta)$. The theorem has been proved for $V_1(\zeta)$, while the function $V_2(x + i\eta)$ converges uniformly to 0 with decreasing $\eta$ on the interval $[a, b]$. Thus the theorem is proved.

Some illustrations of the previous theorems may be instructive. The function $\operatorname{Log}\zeta = \log|\zeta| + i\arg(\zeta)$ is that determination of the logarithm which is real on the right half-axis and analytic in the plane slit along the negative real axis. In the upper half-plane its imaginary part $V(\zeta) = \arg(\zeta)$ is positive and bounded by $\pi$. Thus $\operatorname{Log}\zeta$ is a Pick function and $\alpha = 0$, while $\beta = \operatorname{Re}[\operatorname{Log} i]$ also vanishes. The function $(1/\pi)V(x + i\eta)$ converges to $+1$ uniformly on closed subsets of $x < 0$ and converges uniformly to 0 on closed subsets of $x > 0$. Hence the measure $d\mu(\lambda)$ is merely the restriction of Lebesgue measure to the left half-axis, and accordingly,
$$\operatorname{Log}\zeta = \int_{-\infty}^{0} \left[\frac{1}{\lambda - \zeta} - \frac{\lambda}{\lambda^2 + 1}\right] d\lambda.$$
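The inversion theorem can be tried numerically on the logarithm. The following is only a sketch under our own assumptions: the names `V`, `smoothed_mass` and `lebesgue_mass_left_axis` are hypothetical helpers, the integral is approximated by a midpoint rule, and a small fixed $\eta$ stands in for the limit $\eta \to 0$.

```python
import math

def V(x, eta):
    """Imaginary part of Log(x + i*eta): arg(zeta), which lies in (0, pi) for eta > 0."""
    return math.atan2(eta, x)

def smoothed_mass(a, b, eta, steps=20000):
    """Midpoint-rule approximation of (1/pi) * integral over [a, b] of V(x + i*eta) dx."""
    h = (b - a) / steps
    total = sum(V(a + (k + 0.5) * h, eta) for k in range(steps))
    return total * h / math.pi

def lebesgue_mass_left_axis(a, b):
    """mu(b) - mu(a) when d(mu) is Lebesgue measure restricted to the left half-axis (a < b)."""
    return min(b, 0.0) - min(a, 0.0)

if __name__ == "__main__":
    for a, b in [(-3.0, -1.0), (-2.0, 5.0), (1.0, 4.0)]:
        print((a, b), smoothed_mass(a, b, 1e-3), lebesgue_mass_left_axis(a, b))
```

For each interval the two printed numbers should agree up to an error that shrinks with $\eta$, which is what the theorem predicts for this particular measure.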
The function $\tan\zeta$ satisfies the usual trigonometric addition formula
$$\tan(\xi + i\eta) = \frac{\tan\xi + \tan(i\eta)}{1 - \tan\xi\,\tan(i\eta)}$$
and so its imaginary part is easily shown to be positive for $\eta > 0$. From the boundedness of the hyperbolic tangent on the real axis we deduce that $\alpha = 0$, and it is easy to see that $\beta = \operatorname{Re}[\tan(i)] = 0$. Since the tangent is real and analytic between its poles, the measure $\mu$ reduces to a system of point masses at these poles, and from the periodicity of the function, it follows that the same mass is put at each pole. Because the residue of the tangent at a pole is always $-1$, it follows that
$$\tan\zeta = \sum_n \left[\frac{1}{\lambda_n - \zeta} - \frac{\lambda_n}{\lambda_n^2 + 1}\right],$$
the sum being taken over all the zeros $\lambda_n$ of the cosine. Since these are symmetrically distributed about the origin, the canonical representation for the tangent reduces to
$$\tan\zeta = 2\zeta \sum_{n=0}^{\infty} \frac{1}{(n + \tfrac12)^2\pi^2 - \zeta^2}.$$
It is also easy to see that the meromorphic function
$$-\cot\zeta = \frac{-1}{\tan\zeta} = \frac{-\cos\zeta}{\sin\zeta} = -\frac{d}{d\zeta}\log\sin\zeta$$
is in the Pick class and that it admits a representation
$$-\cot\zeta = \sum_n \left[\frac{1}{\lambda_n - \zeta} - \frac{\lambda_n}{\lambda_n^2 + 1}\right],$$
where $\lambda_n = n\pi$ runs through the zeros of the sine. The sum simplifies to
$$-\frac{1}{\zeta} + 2\zeta\sum_{n=1}^{\infty} \frac{1}{n^2\pi^2 - \zeta^2}$$
and since $1/\zeta = \dfrac{d}{d\zeta}(\log\zeta)$, we have
$$\frac{d}{d\zeta}\log\frac{\sin\zeta}{\zeta} = -\sum_{n=1}^{\infty} \frac{2\zeta}{n^2\pi^2 - \zeta^2}.$$
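The two partial fraction expansions just obtained are easy to test at a sample point of the upper half-plane. The sketch below is ours, not part of the text: `tan_series` and `minus_cot_series` are hypothetical names, and the infinite sums are simply truncated, so agreement with `cmath` can be expected only up to the size of the discarded tail.

```python
import cmath
import math

def tan_series(zeta, terms=100000):
    """Truncation of 2*zeta * sum_{n>=0} 1 / ((n + 1/2)^2 pi^2 - zeta^2)."""
    s = sum(1.0 / (((n + 0.5) * math.pi) ** 2 - zeta * zeta) for n in range(terms))
    return 2 * zeta * s

def minus_cot_series(zeta, terms=100000):
    """Truncation of -1/zeta + 2*zeta * sum_{n>=1} 1 / (n^2 pi^2 - zeta^2)."""
    s = sum(1.0 / ((n * math.pi) ** 2 - zeta * zeta) for n in range(1, terms))
    return -1.0 / zeta + 2 * zeta * s

if __name__ == "__main__":
    zeta = 0.7 + 0.4j  # a sample point with positive imaginary part
    print(tan_series(zeta), cmath.tan(zeta))
    print(minus_cot_series(zeta), -cmath.cos(zeta) / cmath.sin(zeta))
```

At the sample point both functions should also show a positive imaginary part, as membership in the Pick class requires.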
We integrate this equation along a path which is a line segment from the origin to a nonreal $z$; on this path the series converges uniformly, and therefore, since $(\sin\zeta)/\zeta = 1$ at the origin,
$$\log\frac{\sin z}{z} = \sum_{n=1}^{\infty} \log\left(1 - \frac{z^2}{n^2\pi^2}\right).$$
Finally, taking exponentials, we obtain for all nonreal $z$
$$\frac{\sin z}{z} = \prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2\pi^2}\right)$$
and it becomes clear that the equation is also valid for real $z$. This result was mentioned without proof in Section 2.

It is not difficult to see that if the positive harmonic function $V(\zeta)$ is bounded in the upper half-plane by the constant $M$, then the associated function $\mu(\lambda)$ is Lipschitzian, with Lipschitz constant $M/\pi$. It follows that $d\mu$ is absolutely continuous and therefore of the form $d\mu(\lambda) = (1/\pi)V(\lambda)\,d\lambda$ where $0 \le V(\lambda) \le M$. We next show that $V(\lambda)$ has been appropriately named, that is, that the harmonic $V(\zeta)$ assumes boundary values equal to $V(\lambda)$ almost everywhere. This is a consequence of the following theorem of Fatou.
Theorem (Fatou): If the function $\mu(\lambda)$ occurring in the canonical representation of $V(\zeta)$ has a finite derivative at $\lambda = \xi_0$, then $V(\xi_0 + i\eta)$ converges to $\pi\mu'(\xi_0)$ with diminishing $\eta$.
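PROOF: There is no loss of generality in supposing that $\xi_0 = 0$ and that the measure $d\mu$ is concentrated on the interval $[-1, 1]$. Since
$$V(i\eta) = \alpha\eta + \int \frac{\eta}{\lambda^2 + \eta^2}\,d\mu(\lambda),$$
the Stieltjes integral can be integrated by parts to obtain
$$V(i\eta) = \alpha\eta + \int \frac{2\eta\lambda}{(\lambda^2 + \eta^2)^2}\,\mu(\lambda)\,d\lambda,$$
where the integration is taken over the whole real axis. After the substitution $r = \lambda/\eta$, this becomes
$$V(i\eta) = \alpha\eta + \int \frac{\mu(\eta r)}{\eta r}\,\frac{2r^2}{(1 + r^2)^2}\,dr$$
and since the ratio $\mu(s)/s$ is uniformly bounded in $s$, the Lebesgue convergence theorem guarantees that
$$\lim_{\eta \to 0} V(i\eta) = \mu'(0)\int \frac{2r^2}{(1 + r^2)^2}\,dr.$$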
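The value of the universal constant can also be checked directly. The following sketch is our own: it assumes the kernel $2r^2/(1 + r^2)^2$ as it arises from the integration by parts above, uses the elementary antiderivative $\arctan r - r/(1 + r^2)$, and compares it with a midpoint-rule quadrature; the function names are hypothetical.

```python
import math

def kernel(r):
    """The kernel 2 r^2 / (1 + r^2)^2 that multiplies mu(eta*r)/(eta*r)."""
    return 2.0 * r * r / (1.0 + r * r) ** 2

def antiderivative(r):
    """arctan(r) - r/(1 + r^2); its derivative is kernel(r)."""
    return math.atan(r) - r / (1.0 + r * r)

def integral_over(R, steps=200000):
    """Midpoint-rule approximation of the integral of kernel over [-R, R]."""
    h = 2.0 * R / steps
    return sum(kernel(-R + (k + 0.5) * h) for k in range(steps)) * h

if __name__ == "__main__":
    print(integral_over(50.0), 2.0 * antiderivative(50.0))
    print(2.0 * antiderivative(1e9), math.pi)
```

As $R$ grows, $2[\arctan R - R/(1 + R^2)]$ tends to $\pi$, in agreement with the constant in the statement of the theorem.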
The integral could be computed explicitly; however, since it must be a universal constant, simple examples show that it must be $\pi$, for example, $V(\zeta) = \operatorname{Im}[\operatorname{Log}\zeta] = \arg\zeta$. A more general version of the Fatou theorem can be proved: the function $V(\xi_0 + re^{i\omega})$ converges to $\pi\mu'(\xi_0)$ as $r$ goes to 0 uniformly for $\omega$ in an interval of the form $[\varepsilon, \pi - \varepsilon]$ where $\varepsilon > 0$. We shall not give the proof.

Since every monotone function $\mu(\lambda)$ is differentiable almost everywhere, the theorem of Fatou implies that every function $G(\zeta)$, bounded and harmonic in the half-plane, assumes boundary values $G(\xi) = \lim_{\eta \to 0} G(\xi + i\eta)$ almost everywhere, since the addition of a constant makes the function positive. It is also clear that bounded analytic functions in the half-plane take such boundary values almost everywhere. If $f(z)$ is a function which is bounded and analytic in the disk $|z| < 1$, then $F(\zeta) = f(z(\zeta))$ is bounded and analytic in the upper half-plane when $z(\zeta)$ is the linear fractional transformation already introduced which maps the half-plane onto the disk. Since $F(\zeta)$ has boundary values almost everywhere on the real axis, it follows that $f(z)$ assumes boundary values almost everywhere on the unit circle. That is to say, the limit $\lim_{r \to 1} f(re^{i\omega})$ exists for almost all $\omega$ in $[0, 2\pi]$. It is obvious that the limiting function defined almost everywhere is measurable. In particular, the Blaschke products introduced in Section 2 take boundary values almost everywhere on the circle, and an important theorem asserts that those boundary values are almost everywhere of absolute value $+1$.

Theorem: If $B(z)$ is a Blaschke product and $B(e^{i\omega}) = \lim_{r \to 1} B(re^{i\omega})$, then $|B(e^{i\omega})| = 1$ almost everywhere.
PROOF: Since $|B(z)| \le 1$ in the disk, it is obvious that $|B(e^{i\omega})| \le 1$ almost everywhere. If there is a measurable set $E$ of positive measure on which $|B(e^{i\omega})| < 1$, then
$$0 < \int_0^{2\pi} \bigl|\log|B(e^{i\omega})|\bigr|\,d\omega$$
and by the Fatou theorem in integration theory this is no larger than
$$\liminf_{r \to 1} \int_0^{2\pi} \bigl|\log|B(re^{i\omega})|\bigr|\,d\omega.$$
It is therefore sufficient to show that $\lim_{r \to 1} H(r) = 0$ where
$$H(r) = \int_0^{2\pi} \log|B(re^{i\omega})|\,d\omega,$$
and this, by Jensen's formula (Section 27), is given by
$$2\pi\log|B(0)| + 2\pi\int_0^r \frac{N(t)}{t}\,dt,$$
where $N(t)$ is the number of zeros of $B(z)$ in the disk $|z| \le t$. If the zeros are denoted as in Section 2, then $\log|B(0)| = \sum_k \log|a_k|$ and this may be written $\int_0^1 \log t\,dN(t)$ and integrated by parts. It is easy to check that the integrated term vanishes and therefore to find $\log|B(0)| = -\int_0^1 N(t)/t\,dt$. This means that
$$H(r) = -2\pi\int_r^1 \frac{N(t)}{t}\,dt,$$
which obviously approaches 0 as $r$ tends to 1.

The canonical factorization given in Section 2 for the most general function $f(z)$ bounded and analytic in the unit circle is easily brought over to the half-plane by means of the conformal mappings introduced above. The factor $z^n$ becomes $([\zeta - i]/[\zeta + i])^n$ and the individual factors in the Blaschke product are all of the form $c([\zeta - \lambda]/[\zeta - \bar\lambda])$ where the constant $c$ of absolute value $+1$ is so chosen that the factor is positive at $\zeta = i$. Accordingly, the product $z^n B(z)$ is transformed into the function
$$B(\zeta) = \left(\frac{\zeta - i}{\zeta + i}\right)^n \prod_k c_k\,\frac{\zeta - \lambda_k}{\zeta - \bar\lambda_k},$$
which has exactly the $\lambda_k$ as its zeros. This is the Blaschke product for the half-plane and satisfies $|B(\zeta)| \le 1$ for all $\zeta$ in the upper half-plane. As a bounded analytic function, it assumes boundary values almost everywhere on the real axis, and the previous theorem shows that those boundary values are almost everywhere of absolute value $+1$. It follows that the most general function $F(\zeta)$ bounded and analytic in the upper half-plane is of the form
$$F(\zeta) = CB(\zeta)G(\zeta)$$
where the bounded $G(\zeta)$ has no zeros. It should be emphasized that $G$ is bounded since the corresponding assertion is valid in the unit disk: the quotients $f(z)/B_N(z)$, where $B_N(z)$ is the partial Blaschke product, were uniformly bounded by the bound of $f(z)$. Thus, the constant $C$ may be taken in such a way that $\sup|G(\zeta)| = 1$, and therefore $G(\zeta) = e^{i\varphi(\zeta)}$ for an appropriate choice of $\varphi(\zeta) = U(\zeta) + iV(\zeta)$ in the Pick class. Moreover, we may require $\beta = \operatorname{Re}[\varphi(i)] = 0$ by choosing the phase of $C$ correctly. In this way, a canonical factorization
$$F(\zeta) = CB(\zeta)G(\zeta)$$
is determined for functions bounded and analytic in the upper half-plane.
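A single factor of the half-plane Blaschke product can be examined numerically. In the sketch below the explicit normalizing constant $c = |\lambda^2 + 1|/(\lambda^2 + 1)$ is our own computation of the unimodular constant that makes the factor positive at $\zeta = i$ (it presupposes $\lambda \ne i$, the zero at $i$ being carried by the separate factor $([\zeta - i]/[\zeta + i])^n$); the function name is hypothetical.

```python
def blaschke_factor(zeta, lam):
    """One factor c * (zeta - lam) / (zeta - conj(lam)) with Im(lam) > 0 and lam != i.

    The constant c = |lam^2 + 1| / (lam^2 + 1) has absolute value 1 and is an
    assumption of this sketch: it is chosen so that the factor is positive at zeta = i.
    """
    c = abs(lam * lam + 1) / (lam * lam + 1)
    return c * (zeta - lam) / (zeta - lam.conjugate())

if __name__ == "__main__":
    lam = 2.0 + 1.0j
    print(blaschke_factor(1j, lam))                # should be real and positive
    print(abs(blaschke_factor(0.3 + 0.5j, lam)))   # a point with eta > 0
    print(abs(blaschke_factor(-4.0 + 0j, lam)))    # a point of the real axis
```

The second printed value should be smaller than 1 and the third equal to 1 up to rounding, in accordance with the properties of $B(\zeta)$ stated above.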
Theorem: Let $F(\zeta)$ be bounded and analytic in the upper half-plane and
$$M(\eta) = \sup_{\xi} |F(\xi + i\eta)|.$$
If
$$H(\eta) = \lim_{t \to +\infty} \frac{\log M(t\eta) - \log M(0)}{t},$$
then $H(\eta) = -\alpha\eta$ where $\alpha$ occurs in the canonical factorization of $F(\zeta)$.
PROOF: From the Three Lines theorem, the function $M(\eta)$ is logarithmically convex, and the limit $H(\eta)$ must be positively homogeneous; there is only one complex variable in question, and therefore $H(\eta)$ is linear, at least for positive $\eta$, the only ones of interest. It follows that $H(\eta) = c\eta$ and the theorem asserts that the constant $c$ is $-\alpha$. Suppose first that the factor $CB(\zeta)$ is not present in the canonical factorization, that is, $F(\zeta) = G(\zeta) = e^{-V(\zeta) + iU(\zeta)}$. Now
$$-\log|G(\xi + i\eta)| = V(\xi + i\eta)$$
and the positive quantity on the right is always at least $\alpha\eta$, while for large $\eta$
$$e^{-(\alpha + \varepsilon)\eta} \le |G(i\eta)| \le e^{-\alpha\eta}$$
and therefore $M(\eta)$ satisfies the same inequalities when $\eta$ is large. It follows that
$$H(\eta) = \lim_{t \to +\infty} \frac{\log M(t\eta) - \log M(0)}{t} = -\alpha\eta.$$
Thus, the theorem is true for the functions $G(\zeta)$. Moreover, the value of $C$ does not affect the quantity $H(\eta)$ at all, and there is then no loss of generality in taking $C = 1$ for this proof. Since $|B(\zeta)| \le 1$, the function $M(\eta) \le \sup_{\xi}|G(\xi + i\eta)|$, and therefore $H(\eta) = c\eta$ where $c \le -\alpha$. If $c$ is strictly smaller than $-\alpha$, say $c = -\alpha - 2\varepsilon$ where $\varepsilon$ is positive, then
$$|F(\zeta)e^{-i(\alpha + 2\varepsilon)\zeta}| \le M(\eta)e^{(\alpha + 2\varepsilon)\eta} \le M(0)$$
and so the product $F_1(\zeta) = F(\zeta)e^{-i(\alpha + 2\varepsilon)\zeta}$ is bounded in the half-plane. However, $|G(i\eta)e^{(\alpha + 2\varepsilon)\eta}| \ge e^{\varepsilon\eta}$, at least for large $\eta$, and therefore $G_1(\zeta) = G(\zeta)e^{-i(\alpha + 2\varepsilon)\zeta}$ is not bounded in the half-plane. The bounded $F_1(\zeta)$ therefore admits the canonical factorization $F_1(\zeta) = B(\zeta)G_1(\zeta)$ where $G_1(\zeta)$ is unbounded, a contradiction. This completes the proof of the theorem.
45. Titchmarsh Convolution Theorem

Let $T_1$ and $T_2$ be distributions in $R^n$ with compact support, $T_3$ their convolution, and $K_1$, $K_2$, and $K_3$ the convex hulls of the supports of these distributions. It has been shown in Section 26 that
$$\operatorname{supp}(T_3) = \operatorname{supp}(T_1 * T_2) \subseteq \operatorname{supp}(T_1) + \operatorname{supp}(T_2)$$
and from this it easily follows that $K_3 \subseteq K_1 + K_2$ when the convex hulls are taken instead. This relation may also be written in terms of the support functions of those three convex, compact sets:
$$H_3(\eta) \le H_1(\eta) + H_2(\eta).$$
It is an important fact that while the set $\operatorname{supp}(T_3)$ may in fact be smaller than the sum $\operatorname{supp}(T_1) + \operatorname{supp}(T_2)$, the convex hulls are equal, that is, $K_3 = K_1 + K_2$, or equivalently, $H_3(\eta) = H_1(\eta) + H_2(\eta)$. The result is due to Titchmarsh and Lions.

Theorem: If $T_1$ and $T_2$ have compact support, the convex hull of $\operatorname{supp}(T_1 * T_2)$ is the sum of the convex hulls of $\operatorname{supp}(T_1)$ and $\operatorname{supp}(T_2)$.
PROOF: We suppose that the theorem is false and show that it is then false in the special case when $T_1$ and $T_2$ are testfunctions on the real axis; this special case is then treated by the methods of the Fourier transform and the results of the previous section. Supposing, then, that $K_3$ is a proper subset of $K_1 + K_2$, we infer that there exists a vector $\eta$ with $|\eta| = 1$ such that $H_3(\eta) < H_1(\eta) + H_2(\eta)$; choosing $x'$ in $K_1$ such that $H_1(\eta) = \sup_{x \in K_1}(x, \eta) = (x', \eta)$ and $x''$ in $K_2$ so that $H_2(\eta) = \sup_{x \in K_2}(x, \eta) = (x'', \eta)$, we have $H_3(\eta) = \sup_{x \in K_3}(x, \eta) < (x' + x'', \eta)$. It is important to notice that the points $x'$ and $x''$ may be supposed to belong to $\operatorname{supp}(T_1)$ and $\operatorname{supp}(T_2)$, respectively, since the supremum of $(x, \eta)$ on $K_1$ is attained at a point of $\operatorname{supp}(T_1)$, the former set being the convex hull of the latter. Now the distribution $T' = \mathcal{T}_{-x'}T_1$ is supported by $\operatorname{supp}(T_1) - x'$, a set which contains the origin; similarly $T'' = \mathcal{T}_{-x''}T_2$ is supported by $\operatorname{supp}(T_2) - x''$ which also contains the origin. The convolution $T' * T''$ is supported by $\operatorname{supp}(T_3) - (x' + x'')$ and the origin lies outside the closed convex hull of this set. Let $d$ be the distance from the origin to that convex hull; by the definition of the support of a distribution, there exist testfunctions $\psi_1(x)$ and $\psi_2(x)$ supported by a ball of radius $d/8$ about the origin for which $T'(\check\psi_1) = T''(\check\psi_2) = 1$. Accordingly, the convolutions $T' * \psi_1$ and $T'' * \psi_2$ are testfunctions, not vanishing at the origin and having a convolution which is a testfunction supported by a set with a convex hull not containing the origin. Finally, if these testfunctions are composed with a suitable orthogonal transformation $l$, the composed functions will not vanish at the origin, and their convolution will be supported by the half-space $x_1 \le -\varepsilon$ where $\varepsilon$ is positive and $x_1$ the first coordinate. Let $\varphi_1(x)$ and $\varphi_2(x)$ be the testfunctions so obtained, and write the generic point of $R^n$ in the form $x = (x_1, x')$ where $x' = (x_2, x_3, \ldots, x_n)$ is a point of $R^{n-1}$. Form the integral
$$F_1(x_1, \xi') = \int_{R^{n-1}} e^{-i(x', \xi')}\varphi_1(x_1, x')\,dx';$$
for any fixed value of $x_1$, this function is analytic in $\xi'$ and cannot vanish identically for $x_1 = 0$; moreover, if $\xi'$ is fixed, $F_1(x_1, \xi')$ is a testfunction in the one variable $x_1$. In a similar fashion we form
$$F_2(x_1, \xi') = \int_{R^{n-1}} e^{-i(x', \xi')}\varphi_2(x_1, x')\,dx'$$
and then select $\xi'$ in $R^{n-1}$ once and for all in such a way that $F_1(0, \xi')F_2(0, \xi')$ is not zero. In this way we obtain two testfunctions on the real axis, $f_1(x_1) = F_1(x_1, \xi')$ and $f_2(x_1) = F_2(x_1, \xi')$, neither of which vanishes at the origin. Their convolution may be written
$$(f_1 * f_2)(x_1) = \int_{-\infty}^{+\infty} \int_{R^{n-1}} e^{-i(x', \xi')}\varphi_1(x_1 - y_1, x')\,dx' \int_{R^{n-1}} e^{-i(y', \xi')}\varphi_2(y_1, y')\,dy'\,dy_1.$$
Since all integrations are taken only over compact sets and all integrands are smooth functions, this may be written
$$\int_{-\infty}^{+\infty}\int_{R^{n-1}}\int_{R^{n-1}} e^{-i(x' + y', \xi')}\varphi_1(x_1 - y_1, x')\varphi_2(y_1, y')\,dx'\,dy'\,dy_1,$$
which, after the substitution $z' = x' + y'$, becomes
$$\int_{-\infty}^{+\infty}\int_{R^{n-1}}\int_{R^{n-1}} e^{-i(z', \xi')}\varphi_1(x_1 - y_1, z' - y')\varphi_2(y_1, y')\,dy'\,dz'\,dy_1.$$
Finally, then,
$$\int_{R^{n-1}} e^{-i(z', \xi')}(\varphi_1 * \varphi_2)(x_1, z')\,dz' = (f_1 * f_2)(x_1)$$
and since the convolution $\varphi_1 * \varphi_2$ vanishes for any point with $x_1 > -\varepsilon$, the convolution $f_1 * f_2$ is supported by the half-axis $x_1 \le -\varepsilon$ even though $f_1(0)f_2(0)$ does not vanish.

The Titchmarsh theorem has now been reduced to the special case of two testfunctions on the real axis. Let $[a', b']$ be the closed convex hull of the support of $f_1(x)$; evidently $a' < 0 < b'$. The testfunction $\mathcal{T}_{-b'}f_1 = g_1$ has then a support with the convex hull $[c', 0]$ where $c' < 0$. The support function of this interval is $H_1(\eta)$ and this function vanishes for $\eta > 0$. In a similar way, the closed convex hull of the support of $f_2$ is $[a'', b'']$ and that of the testfunction $\mathcal{T}_{-b''}f_2 = g_2$ is $[c'', 0]$ where $c'' < 0$; there corresponds a support function $H_2(\eta) = 0$ for $\eta > 0$. From the Paley-Wiener theorems it follows that the Fourier transforms $\hat g_1(\zeta)$ and $\hat g_2(\zeta)$ are bounded in the upper half-plane; these two functions admit canonical representations of the form
$$\hat g_1(\zeta) = C_1B_1(\zeta)e^{i\varphi_1(\zeta)}, \qquad \hat g_2(\zeta) = C_2B_2(\zeta)e^{i\varphi_2(\zeta)},$$
where the functions $B_1(\zeta)$ and $B_2(\zeta)$, of course, are Blaschke products and the exponents $\varphi_1(\zeta)$ and $\varphi_2(\zeta)$ are Pick functions. The coefficients $\alpha$ occurring in the canonical representations of the Pick functions are both 0 in view of the Paley-Wiener theorems and the results of the previous section. Hence the canonical representation for the Fourier transform of the convolution $g_3 = g_1 * g_2$ is
$$\hat g_3(\zeta) = (2\pi)^{1/2}\hat g_1(\zeta)\hat g_2(\zeta) = (2\pi)^{1/2}C_1C_2B_1(\zeta)B_2(\zeta)e^{i[\varphi_1(\zeta) + \varphi_2(\zeta)]},$$
where, obviously, the coefficient $\alpha$ associated with the Pick function in the exponential is 0. This contradicts the hypothesis that the convolution $g_1 * g_2$ is supported by an interval of the form $[d, -\varepsilon]$ where $d < -\varepsilon < 0$, for in that case, the corresponding support function $H_3(\eta)$ would necessarily involve a coefficient $\alpha$ equal to $\varepsilon > 0$. Another proof of the Titchmarsh theorem will be indicated in Section 55.
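The distinction between the support of a convolution and its convex hull is already visible for finite sums of point masses at the integers, where the theorem is elementary because the extreme coefficients simply multiply. The sketch below is such a discrete illustration, not the argument of the text; a list `a` stands for the distribution $\sum_k a_k\delta_k$ and the helper names are hypothetical.

```python
def convolve(a, b):
    """Convolution of two finitely supported sequences, given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def support(seq):
    """Indices at which the sequence does not vanish."""
    return [k for k, v in enumerate(seq) if v != 0]

def hull(seq):
    """Endpoints of the convex hull of the support (the sequence must not vanish identically)."""
    s = support(seq)
    return (s[0], s[-1])

if __name__ == "__main__":
    t1 = [1, -1]   # delta_0 - delta_1
    t2 = [1, 1]    # delta_0 + delta_1
    t3 = convolve(t1, t2)
    print(t3, support(t3))            # the point 1 is missing from the support
    print(hull(t3), hull(t1), hull(t2))
```

Here the support of the convolution is $\{0, 2\}$, strictly smaller than $\{0, 1\} + \{0, 1\} = \{0, 1, 2\}$, while the convex hull $[0, 2]$ is exactly the sum $[0, 1] + [0, 1]$.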
46. The Spectrum of a Distribution

If $T$ is a temperate distribution on $R^n$, the closed set $\operatorname{supp}(\hat T)$ is called the spectrum of $T$. Evidently the spectrum is empty if and only if $T = 0$, and it is easy to show that when the spectrum is a finite set $\{\xi_1, \xi_2, \ldots, \xi_N\}$, then $T$ is of the form $T = \sum_{k=1}^{N} p_k(x)e^{i(x, \xi_k)}$, where $p_k(x)$ is a polynomial. Such functions are called exponential polynomials. In view of the Paley-Wiener theorems, the compactness of the spectrum of $T$ implies that the distribution $T$ is actually a smooth function, the restriction to the real space of an entire function $T(z)$ of $n$ complex variables satisfying an inequality of the form $|T(z)| \le C(1 + |z|)^N e^{R|y|}$ where $z = x + iy$ and $R$ is the radius of a ball containing the spectrum.

A particularly important case arises when $T$ is a bounded function on $R^n$ with compact spectrum. In this case, if $P(D)$ is any polynomial of differentiation, the function $P(D)T$ is also a bounded smooth function and $\|P(D)T\|_\infty \le C\|T\|_\infty$, where the constant $C$ depends only on the polynomial $P(D)$ itself and the radius $R$ of the smallest ball about the origin containing the spectrum of $T$. To show this, we select a testfunction $\hat\psi(\xi)$ which coincides with the polynomial $P(i\xi)$ for $|\xi| \le R + 1$. Then
$$(P(D)T)^\wedge = P(i\xi)\hat T = \hat\psi(\xi)\hat T = (2\pi)^{-n/2}(\psi * T)^\wedge,$$
the convolution on the right making sense, since $T$ is temperate and $\psi$ is in $\mathcal{S}$. Because $T$ is actually in $L^\infty$, that convolution is an integral, and it follows that
$$P(D)T(x) = (2\pi)^{-n/2}\int \psi(x - y)T(y)\,dy$$
and, therefore, easily that $\|P(D)T\|_\infty \le \|T\|_\infty (2\pi)^{-n/2}\int |\psi(y)|\,dy$; the factor on the right-hand side depends only on the choice of $\hat\psi(\xi)$ and therefore only on the choice of $P(D)$ and $R$. The bounded functions with compact spectrum therefore enjoy the remarkable property that $P(D)T$ can be obtained from $T(x)$ by convolution with a kernel in the class $\mathcal{S}$, that is to say, by a certain type of averaging process. It is also clear that every derivative is also bounded and belongs to the same class. The sharp result is the Bernstein inequality.

Theorem: Let $f(x)$ be a bounded measurable function with spectrum in the ball $|\xi| \le R$; then $\|\operatorname{grad} f\|_\infty \le R\|f\|_\infty$.
PROOF: First it is necessary to reduce the proof to that of a simpler case. Since the spectrum of the translate $\mathcal{T}_h f$ is also the spectrum of $f$, and since the bounds $\|\operatorname{grad}\mathcal{T}_h f\|_\infty$ and $\|\mathcal{T}_h f\|_\infty$ are independent of $h$, it is enough to prove $|\operatorname{grad} f(0)| \le R\|f\|_\infty$. If the theorem is not true, there exists a bounded function $f(x)$ for which $|\operatorname{grad} f(0)| > (R + 2\varepsilon)\|f\|_\infty$, the spectrum of $f$ being contained in a sphere of radius $R$. However, the regularization of the Fourier transform, $\hat f * \varphi_\varepsilon$, is a testfunction supported by a sphere of radius $R + \varepsilon$ and is the Fourier transform of the function $(2\pi)^{n/2}\hat\varphi(\varepsilon x)f(x)$ which is always bounded by $\|f\|_\infty$. As $\varepsilon$ converges to 0, the gradient of this product at the origin converges to $\operatorname{grad} f(0)$; hence, for the proof of the theorem, we may suppose that $\hat f$ is a measure (and even a testfunction) supported by the ball $|\xi| \le R$. This circumstance leads to the inequality $|f(z)| \le Ce^{R|y|}$ which is more manageable than $|f(z)| \le C(1 + |z|)^N e^{R|y|}$. A further simplification is possible: $f$ may be transformed to $f \circ l$ by means of a suitable orthogonal transformation $l$ in such a way that the gradient at the origin is in the direction of the $x_1$-axis. Thus $\operatorname{grad} f(0) = \partial(f \circ l)/\partial x_1(0)$ and $f \circ l$ obviously satisfies the other hypotheses of the theorem. The function of one variable $F(x_1) = f(x_1, 0, 0, \ldots, 0)$ has a derivative at the origin equal to $\operatorname{grad} f(0)$ and is the restriction to the real axis of the entire function $F(z_1) = f(z_1, 0, 0, \ldots, 0)$. Evidently, $|F(z_1)| \le Ce^{R|y_1|}$ and so the spectrum of $F(x_1)$ is in the interval $[-R, R]$. It is therefore sufficient to prove Bernstein's inequality for the function $F(x_1)$ since $\sup_{x_1}|F(x_1)| \le \sup_x|f(x)| = \|f\|_\infty$; that is to say, it is sufficient to prove the theorem for functions of one variable. Finally, since we must show $|f'(0)| \le R\|f\|_\infty$, the passage to the function $f(x/R)$ makes it clear that $R$ may be taken equal to $+1$.

Let $k$ be a positive integer and $S_k$ a square of side length $8k\pi$ in the complex $z$-plane centered about the origin, the sides being parallel to the coordinate axes. On this path the quotient $f(z)/\cos z$ is a bounded function, since, on the vertical sides
$$\left|\frac{f(z)}{\cos z}\right| \le \frac{Ce^{|y|}}{\cosh y} \le 2C,$$
while on the horizontal sides $|y| \ge 4\pi$ and
$$\left|\frac{f(z)}{\cos z}\right| \le \frac{Ce^{|y|}}{\sinh|y|} \le 4C.$$
Hence, as the integer $k$ increases, the integral
$$I_k = \frac{1}{2\pi i}\int_{S_k} \frac{f(z)}{z^2\cos z}\,dz$$
converges to 0 since $|I_k| \le 4C/k\pi$ by an easy estimate. The integral $I_k$ can be explicitly computed by residues, since the integrand has a pole at the origin, as well as poles at zeros of the cosine occurring inside the square, and is given by
$$I_k = f'(0) - \sum_m \frac{4\sin(m\pi/2)}{m^2\pi^2}\,f\!\left(\frac{m\pi}{2}\right),$$
the sum being taken over $m \ne 0$ with $|m| < 8k$. It follows immediately that $f'(0)$ is given by the convergent series
$$f'(0) = \sum_{m \ne 0} \frac{4\sin(m\pi/2)}{m^2\pi^2}\,f\!\left(\frac{m\pi}{2}\right).$$
In the special case when $f(x) = \sin x$, a bounded function with spectrum in the interval $[-1, 1]$, this reduces to
$$1 = \sum_m \frac{4}{m^2\pi^2},$$
where the summation is taken over all odd integers $m$, positive or negative. More generally, then,
$$|f'(0)| \le \|f\|_\infty \sum_m \frac{4}{m^2\pi^2} = \|f\|_\infty$$
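and the proof is complete.

The previous argument, due to L. Gårding, shows a good deal more than simply Bernstein's inequality. It is clear that for all $x$
$$f'(x) = \sum_{m \ne 0} \frac{4\sin(m\pi/2)}{m^2\pi^2}\,f\!\left(x + \frac{m\pi}{2}\right)$$
and this may be written
$$f'(x) = (f * \mu)(x),$$
where $\mu$ is the measure which has the mass $-4\sin(m\pi/2)/m^2\pi^2$ at the point $x = m\pi/2$. This measure of finite total mass has a continuous Fourier transform
$$\hat\mu(\xi) = -(2\pi)^{-1/2}\sum_{m \ne 0} \frac{4\sin(m\pi/2)}{m^2\pi^2}\,e^{-im\pi\xi/2}$$
and it is easy to verify that $(2\pi)^{1/2}\hat\mu(\xi) = i\xi$ for $|\xi| \le 1$.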
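Gårding's series for the derivative lends itself to a direct numerical test. The sketch below is ours: the series is truncated at a hypothetical cutoff `max_m`, only odd $m$ contribute because $\sin(m\pi/2)$ vanishes for even $m$, and the test function $\cos(0.7x + 0.3)$ is an arbitrary choice of a bounded function with spectrum in $[-1, 1]$.

```python
import math

def garding_derivative(f, x, max_m=20001):
    """Truncation of sum over odd m of 4 sin(m pi/2) / (m^2 pi^2) * f(x + m pi/2).

    The terms for m and -m are paired; sin(m pi/2) is +1 for m = 1 mod 4 and -1 for m = 3 mod 4.
    """
    total = 0.0
    for m in range(1, max_m + 1, 2):
        s = 1.0 if m % 4 == 1 else -1.0
        w = 4.0 * s / (m * m * math.pi ** 2)
        total += w * (f(x + m * math.pi / 2) - f(x - m * math.pi / 2))
    return total

def total_mass(max_m=20001):
    """Truncation of the sum over all odd m (positive and negative) of 4 / (m^2 pi^2)."""
    return sum(8.0 / (m * m * math.pi ** 2) for m in range(1, max_m + 1, 2))

if __name__ == "__main__":
    f = lambda x: math.cos(0.7 * x + 0.3)
    x0 = 1.3
    print(garding_derivative(f, x0), -0.7 * math.sin(0.7 * x0 + 0.3))
    print(garding_derivative(math.sin, 0.0), total_mass())
```

Both printed pairs should agree up to the size of the discarded tail, and the total mass should be close to 1, which is the content of Bernstein's inequality for $R = 1$.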
and since the points $m\pi/2$ for odd $m$ are local maxima or minima of the function, $f'(m\pi/2) = 0$. Indeed, from
it follows that $f'(k\pi/2) = \pm 1$ when $k$ is even, the sign depending on the parity of $k/2$. Since the function $f'(x)$ is bounded and has its spectrum in the unit interval,
and thus $f''(j\pi/2) = 0$ for even integers $j$ and $f''(j\pi/2) = \pm 1$ for odd integers $j$, the sign depending on the parity of $(j + 1)/2$. The argument can be repeated to show that the $n$th derivative of $f(x)$ has the same values as the $n$th derivative of $\sin x$ at points of the form $x = m\pi/2$, and therefore the Taylor expansion of $f(x)$ about the point $x = \pi/2$ coincides with the expansion of $\sin x$ about that point. The functions being analytic, they coincide everywhere, and it follows more generally that the only bounded function $f(x)$ on the real axis, with spectrum in the interval $[-R, R]$ for which $\|f'\|_\infty = R\|f\|_\infty$ where $f'$ attains its supremum, is of the form $f(x) = C\sin(Rx - h)$.

It is convenient to introduce two definitions. A sequence of functions $f_n(x)$, continuous and bounded on $R^n$, converges narrowly to $f_0(x)$ having the same properties if and only if $f_n(x)$ converges to $f_0(x)$ uniformly on bounded subsets of $R^n$ and the numbers $\|f_n\|_\infty$ converge to $\|f_0\|_\infty$. A distribution $T$ on $R^n$ which is temperate is called a bounded distribution if and only if all of the convolutions $(T * \varphi)(x)$ are bounded functions when $\varphi$ is in $\mathcal{S}$. Evidently, any bounded measurable function is a bounded distribution. The important theorem is due to A. Beurling.
Theorem (Beurling): If $T$ is a bounded distribution, a point $\xi_0$ belongs to the spectrum of $T$ if and only if there exists a sequence of functions $\varphi_n$ in $\mathcal{S}$ such that the convolutions $f_n(x) = (T * \varphi_n)(x)$ converge narrowly to $f_0(x) = e^{i(x, \xi_0)}$.

PROOF: Suppose first that $\xi_0 = 0$ and therefore that $f_0(x)$ is identically $+1$. If 0 belongs to the spectrum of $T$, there exist testfunctions $\hat\varphi_n(\xi)$ supported by $|\xi| \le 1/n$ for which $\hat T(\hat\varphi_n)$ is not zero and therefore $(T * \varphi_n)(x) = f_n(x)$ not identically 0. If $\hat\varphi_n$ is multiplied by an appropriate real constant, it will follow that $\|f_n\|_\infty = 1 = \|f_0\|_\infty$, while the multiplication of $\hat\varphi_n$ by the exponential $e^{i(h, \xi)}$ corresponds to the translation of $\varphi_n$, and therefore of $f_n$. Accordingly, it is possible to choose the testfunctions $\hat\varphi_n(\xi)$ so that $f_n(0) \ge 1 - (1/n)$. To show that $f_n(x)$ converge narrowly to $f_0(x) = 1$, it is only necessary to show the uniform convergence on compact subsets of $R^n$; however, for any point $x$ in the space,
$$|f_n(x) - 1| \le |f_n(x) - f_n(0)| + |f_n(0) - 1|$$
and by the mean value theorem, this is smaller than $|x|\,\|\operatorname{grad} f_n\|_\infty + (1/n)$, which, by Bernstein's inequality, is in its turn smaller than $(1 + |x|)/n$, the spectrum of $f_n(x)$ being in the ball of radius $1/n$. Thus, $|f_n(x) - 1|$ converges uniformly to 0 on any compact subset of $R^n$. On the other hand, if 0 is not in the spectrum, there cannot exist such a sequence $f_n(x) = (T * \varphi_n)(x)$ with $\varphi_n$ in $\mathcal{S}$ and $f_n(x)$ converging narrowly to $+1$.
This is a consequence of the existence of a testfunction $\hat\psi(\xi)$ supported by a neighborhood of 0 wholly outside the spectrum of $T$ and for which $\hat\psi(0) = 1$. The convolution $T * \psi$ vanishes identically and so also does $T * \varphi_n * \psi = f_n * \psi$, and this may be written
$$\int f_n(x - y)\psi(y)\,dy = 0.$$
Since $\psi$ is in $\mathcal{S}$ and hence in $L^1$ with $\|f_n\|_\infty$ uniformly bounded, the Lebesgue convergence theorem guarantees that
$$\int \psi(y)\,dy = (2\pi)^{n/2}\hat\psi(0) = 0,$$
contradicting $\hat\psi(0) = 1$. If $\xi_0$ is not 0 we set $S = e^{-i(x, \xi_0)}T$ and note that $\xi_0$ is in the spectrum of $T$ if and only if 0 is in the spectrum of $S$, and this happens if and only if there exists a sequence $\varphi_n$ in $\mathcal{S}$ such that the functions $g_n(x) = (S * \varphi_n)(x)$ converge narrowly to $g_0(x) = 1$. The sequence $\psi_n(x) = e^{i(x, \xi_0)}\varphi_n(x)$ also belongs to $\mathcal{S}$ and $g_n(x) = e^{-i(x, \xi_0)}(T * \psi_n)(x)$; hence, $(T * \psi_n)(x)$ converges narrowly to $e^{i(x, \xi_0)}$ as desired.

In the remainder of this section we suppose the bounded distribution $T$ to be a bounded measurable function $f(x)$. Where $K(x)$ was a function in $L^1(R^n)$, the convolution $(K * f)(x)$ was defined in previous sections of this book only under some additional hypotheses, for example, that $K$ has compact support, or that one of the two distributions belongs to the class $\mathcal{S}$. However, wherever that convolution was defined it has been the function
$$(K * f)(x) = \int K(x - y)f(y)\,dy,$$
which makes sense for all $x$ in $R^n$. We shall take this integral in general as the definition of the convolution, and it is important to notice that the convolution is a continuous and bounded function. This is a consequence of the fact that $K(x)$ can be approximated in $L^1(R^n)$ by testfunctions. Accordingly, if $K_m(x)$ is a testfunction so chosen that $\int |K(x) - K_m(x)|\,dx < 1/m$, then
$$|(K * f)(x) - (K_m * f)(x)| \le \frac{1}{m}\|f\|_\infty$$
and the convolution is the uniform limit of the sequence of continuous functions $(K_m * f)(x)$. Moreover, in those cases which we have studied, the Fourier transform of the bounded continuous $K * f$ was given by $(2\pi)^{n/2}\hat K(\xi)\hat f$. This result we have proved, for example, when $f(x)$ is also integrable, or when $K(x)$ has compact support or is in the class $\mathcal{S}$. However, it cannot be true in general, since at best all we know about $\hat f$ is that it is a distribution, and a distribution can only be multiplied by a $C^\infty$-function, and therefore the product $(2\pi)^{n/2}\hat K(\xi)\hat f$ will make sense in general only if further hypotheses are made about $f$ or $K$.
Theorem: If $K * f$ vanishes identically, then $\hat K(\xi) = 0$ on the spectrum of $f$.

PROOF: If $\xi_0$ belongs to that spectrum, by Beurling's theorem there exists a sequence $\varphi_n(x)$ in $\mathcal{S}$ so that the functions $f_n(x) = (f * \varphi_n)(x)$ converge narrowly to $e^{i(x, \xi_0)}$. Then since
$$K * f_n = K * f * \varphi_n$$
vanishes identically, it must vanish for $x = 0$, whence by the Lebesgue theorem
$$0 = \int K(y)f_n(-y)\,dy$$
converges to $\int K(y)e^{-i(y, \xi_0)}\,dy = (2\pi)^{n/2}\hat K(\xi_0)$. Therefore, $\hat K(\xi_0) = 0$.

It is not true that the vanishing of $\hat K(\xi)$ on the spectrum of $f$ implies that the convolution is 0; however, some information is given by the following theorem, due essentially to Agmon and Mandelbrojt.
Theorem: If $\hat K(\xi) = 0$ on the spectrum of $f$, then the spectrum of the convolution $K * f$ is a perfect subset of the boundary of the spectrum of $f$.
PROOF: If $\xi_0$ belongs to the spectrum of $K * f$, by the Beurling theorem there exists a sequence $\varphi_n$ in $\mathcal{S}$ so that $(K * f * \varphi_n)(x)$ converges narrowly to $e^{i(x, \xi_0)}$. For any function $\psi$ in $\mathcal{S}$, then, the convolutions
$$(K * f * \varphi_n * \psi)(x) = \int (K * f * \varphi_n)(x - y)\psi(y)\,dy$$
converge by the Lebesgue theorem to $\int e^{i(\xi_0, x - y)}\psi(y)\,dy = (2\pi)^{n/2}e^{i(x, \xi_0)}\hat\psi(\xi_0)$. If $\xi_0$ belongs to the complement of the spectrum of $f$, there exists a function $\psi(x)$ in $\mathcal{S}$ whose Fourier transform $\hat\psi(\xi)$ is supported by a neighborhood of $\xi_0$ lying wholly outside the spectrum of $f$ with $\hat\psi(\xi_0) = 1$. In this case, however, $f * \psi$ vanishes identically, and therefore $K * f * \varphi_n * \psi$ also, and this means $\hat\psi(\xi_0) = 0$, a contradiction. Similarly, if $\xi_0$ is an interior point of the spectrum of $f$, it is surrounded by a neighborhood on which $\hat K(\xi)$ vanishes identically; we choose $\hat\psi(\xi)$ so that $\hat\psi(\xi)$ is supported by that neighborhood and satisfies $\hat\psi(\xi_0) = 1$. Since $K * \psi$ vanishes identically so also does $K * f * \varphi_n * \psi$ and, therefore, also $\hat\psi(\xi_0)$, another contradiction. It follows that $\xi_0$ is not an interior point of the set where $\hat K(\xi)$ vanishes, nor of the complement of the spectrum of $f$.

To show that the spectrum of $K * f$ is perfect is to show that there is no isolated point in that spectrum, and there is no loss of generality if we suppose that the point in question is the origin. If 0 belongs to the spectrum of $K * f$ and is an isolated point of that set, then there exists an even, positive test function $\hat\chi(\xi)$ [...]; the transform would be the function $\chi_\varepsilon(x) = (1/\varepsilon^n)\chi(x/\varepsilon)$. Then the convolutions $f_\varepsilon = K * f * \chi_\varepsilon$ form a family of functions, uniformly bounded by $M = \|f\|_\infty\int |K(y)|\,dy$, and the spectrum of $f_\varepsilon$ is contained in the ball of radius $1/\varepsilon$. From Bernstein's inequality it follows that $\|\operatorname{grad} f_\varepsilon\|_\infty \le M/\varepsilon$ and hence, for large $\varepsilon$, the functions $f_\varepsilon$ are uniformly Lipschitzian with a small Lipschitz constant. From the Arzela-Ascoli theorem, then, there exists a subsequence converging uniformly on compact subsets to a continuous function with Lipschitz constant 0, that is, to a constant function $C_2$. Accordingly,
$$C_1 = (K * f * \chi_\varepsilon)(x) = (K * f_\varepsilon)(x) = \int K(y)f_\varepsilon(x - y)\,dy$$
converges with increasing $\varepsilon$ to $C_2\int K(y)\,dy = C_2(2\pi)^{n/2}\hat K(0) = 0$, whence $C_1 = 0$, as desired.
Theorem (Wiener): If $K(x)$ is in $L^1(R^n)$ and $\mathcal{M}$ denotes the space of finite linear combinations of translates of $K$, then $\mathcal{M}$ is dense in $L^1(R^n)$ if and only if the Fourier transform $\hat{K}(\xi)$ is never 0.

PROOF: If for some $\xi_0$, $\hat{K}(\xi_0) = 0$, then for every $h$ in $R^n$
$$\int K(x - h)\,e^{-ix\xi_0}\,dx = 0$$
and so $\hat{F}(\xi_0) = 0$ for every $F(x)$ in $\mathcal{M}$. If a sequence $F_k(x)$ in $\mathcal{M}$ converges in $L^1(R^n)$ to $G(x)$, then $\hat{G}(\xi_0) = \lim \hat{F}_k(\xi_0) = 0$ and so the limit points of $\mathcal{M}$ have Fourier transforms vanishing at $\xi_0$, from which it follows that $\mathcal{M}$ is not dense. On the other hand, if $\mathcal{M}$ is not dense, there exists a nonzero bounded measurable function $f$ for which $\int G(x)f(x)\,dx = 0$ for all $G$ in $\mathcal{M}$, and therefore $\int K(x - h)f(x)\,dx = 0$ for all $h$. Thus $K * f$ vanishes identically and so $\hat{K}(\xi)$ vanishes on the nonempty spectrum of $f$.

A completely analogous theorem holds for the space $L^2(R^n)$ but its proof is essentially more elementary.
Theorem (Wiener): If $K(x)$ is in $L^2(R^n)$, the finite linear combinations of its translates are dense in $L^2(R^n)$ if and only if the Fourier transform $\hat{K}(\xi)$ vanishes on no set of positive Lebesgue measure.

PROOF: If the linear span of the translates is not dense in $L^2$, there exists a nonzero $g(x)$ in $L^2$ orthogonal to all the translates; taking Fourier transforms this may be written
$$\int \hat{K}(\xi)\,\overline{\hat{g}(\xi)}\,e^{-ih\xi}\,d\xi = 0 \quad\text{for all } h.$$
The integrable function $\hat{K}(\xi)\overline{\hat{g}(\xi)}$ therefore has a Fourier transform which vanishes identically, hence $\hat{K}(\xi)\overline{\hat{g}(\xi)} = 0$ almost everywhere. It follows that $\hat{K}(\xi)$ vanishes on a set of positive measure, since otherwise $\hat{g}(\xi)$ would have to vanish almost everywhere, whence $g = 0$. Conversely, if $\hat{K}(\xi) = 0$ on a set $E$ of positive and finite measure, the characteristic function $\hat{g}(\xi)$ of that set is in $L^2$ and $\hat{K}(\xi)\overline{\hat{g}(\xi)} = 0$ almost everywhere. It follows that $g$ is orthogonal to $\mathcal{M}$ in $L^2$, completing the proof.

Our next theorem is of great importance.

Wiener Tauberian Theorem:
Let $K(x)$ be in $L^1(R^n)$ and be such that $\hat{K}(\xi)$ never vanishes; let $f(x)$ belong to $L^\infty(R^n)$ and suppose
$$\lim_{|x|\to\infty} (K * f)(x)$$
exists. (This necessarily finite limit may be written $A\int K(x)\,dx = A(2\pi)^{n/2}\hat{K}(0)$.) Then for any $G(x)$ in $L^1(R^n)$,
$$\lim_{|x|\to\infty} (G * f)(x) = A\int G(x)\,dx = A(2\pi)^{n/2}\hat{G}(0).$$

PROOF: If $G(x)$ is a finite linear combination of translates of $K(x)$ the theorem is virtually obvious. Since $\hat{K}(\xi)$ is never 0, an arbitrary $G(x)$ in $L^1$ may be approximated in that space by such combinations of translates. Write $G = G_1 + G_2$ where $G_1$ is in $\mathcal{M}$ and $\int |G_2(x)|\,dx < \varepsilon$; now
$$\Bigl|(G * f)(x) - A\int G(y)\,dy\Bigr| \le \Bigl|(G_1 * f)(x) - A\int G_1(y)\,dy\Bigr| + (\|f\|_\infty + |A|)\,\varepsilon$$
and therefore
$$\limsup_{|x|\to\infty}\Bigl|(G * f)(x) - A\int G(y)\,dy\Bigr| \le (\|f\|_\infty + |A|)\,\varepsilon,$$
which can be made arbitrarily small. Thus the proof is complete.

The hypothesis of the theorem can be modified to require only that the convolution $(K * f)(x)$ approach a limit as $x$ approaches infinity in some open cone with vertex at the origin; we conclude, as before, that $(G * f)(x)$ converges to $A\int G(y)\,dy$ in that same cone. In particular, in the special and important case of one dimension, we may suppose only that the real number $x$ approaches infinity through positive (or negative) values.

The Wiener theorem is really a theorem concerning $f$ and not $K$, and can be given a slightly different formulation. The system of uniformly bounded
functions $f(x - h)$ in $L^\infty(R^n)$ has the property that for a certain integrable $K(x)$,
$$\lim_{|h|\to\infty}\int f(x - h)K(x)\,dx \text{ exists and equals } A\int K(x)\,dx.$$
Then if $\hat{K}(\xi)$ is never 0, the functions $f(x - h)$ converge to $A$ in the weak-star topology of $L^\infty(R^n)$. From this fact, we would expect $f(x)$ to look very much like the constant function $A$ for large $|x|$, and indeed, if $G(x)$ is a regularizing function $\varphi_\varepsilon$, the convolutions $(f * \varphi_\varepsilon)(x)$ do converge to $A$ at infinity. It follows that if the function $f(x)$ is uniformly continuous, it then itself converges to $A$ at infinity. In most applications, no such hypothesis is available, and it is necessary to make a particular study of $f(x)$ to determine its exact behavior at infinity.
47. Tauberian Theorems

One of the best known summability methods is the method of Abel summation: the sum $S$ is assigned to the series $\sum_{k=1}^{\infty} a_k$ if the function $f(z) = \sum_{k=1}^{\infty} a_k z^k$ is analytic in the unit disk and approaches the limit $S$ as $z$ converges to $+1$ along the real axis. It is easy to show that the method is regular, that is, that the series is Abel summable to $S$ whenever it converges and converges to $S$. Various theorems provide a partial converse to this statement, guaranteeing that a series is convergent if it is Abel summable and some further hypothesis is satisfied. The first such theorem was proved by Tauber who showed that an Abel summable series for which $\lim_{k\to\infty} k a_k = 0$ is convergent. All such theorems have therefore been called Tauberian theorems. One of the most important is due to Littlewood.
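As an illustration of Abel summation and of why a Tauberian hypothesis is needed, the following sketch evaluates the power series numerically for two series: the series with $a_k = (-1)^{k+1}$, which is Abel summable to 1/2 but divergent (here $k a_k$ is unbounded), and the alternating harmonic series, which converges to $\log 2$. The function names and the truncation length are assumptions made for the illustration, not part of the text.

```python
import math

def abel_mean(coeff, x, terms=200_000):
    """Evaluate sum_{k>=1} coeff(k) * x**k for 0 < x < 1 by truncation."""
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= x
        total += coeff(k) * power
    return total

def grandi(k):
    # a_k = (-1)**(k+1): the partial sums oscillate between 1 and 0
    return 1.0 if k % 2 == 1 else -1.0

def alternating_harmonic(k):
    # a_k = (-1)**(k+1) / k: convergent, with k * |a_k| = 1 bounded
    return grandi(k) / k

if __name__ == "__main__":
    for x in (0.9, 0.99, 0.999):
        print(x, abel_mean(grandi, x), abel_mean(alternating_harmonic, x))
    print("log 2 =", math.log(2.0))
```

As $x$ increases to 1 the first column approaches 1/2 and the second approaches $\log 2$; only the second series satisfies the hypothesis of the theorems discussed in this section.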
Theorem (Littlewood): Let $\sum_{k=1}^{\infty} a_k$ be Abel summable to $S$ and $\sup k|a_k| = A < \infty$; then the series is convergent.

PROOF: By hypothesis, the function $f(z) = \sum_{k=1}^{\infty} a_k z^k$ is analytic in $|z| < 1$ and $\lim_{x\to 1} f(x) = S$; accordingly, the function $F(s) = \sum_{k=1}^{\infty} a_k e^{-ks}$ is analytic in the open right half-plane $\mathrm{Re}[s] > 0$ and approaches the limit $S$ as the real $s$ diminishes to 0. This function is a Laplace transform:
$$F(s) = \int_0^\infty e^{-s\lambda}\,d\mu(\lambda),$$
where the measure $d\mu$ puts the mass $a_k$ at the point $\lambda = k$. The corresponding function $\mu(\lambda)$ vanishes for $\lambda < 1$, is constant in intervals of the form $(k, k + 1)$,
and increases at most logarithmically. We take it to be right-continuous. Because of the slow growth of $\mu(\lambda)$, the integral may be integrated by parts to obtain
$$F(s) = s\int_0^\infty e^{-s\lambda}\mu(\lambda)\,d\lambda.$$
It is important to show that the function $\mu(\lambda)$ is bounded. Now for large integers $N$
$$F(1/N) - \mu(N) = \sum_{k=1}^{N} a_k\,(e^{-k/N} - 1) + \sum_{k=N+1}^{\infty} a_k e^{-k/N}.$$
From the convexity of the exponential $|e^{-k/N} - 1| \le k/N$ and therefore the first sum is bounded by $A$. If $g(t)$ is the function $e^{-t}/t$, the second sum is bounded by the positive series $A\sum_{k=N+1}^{\infty} g(k/N)(1/N)$, which in turn is bounded by $A + A\int_1^\infty g(t)\,dt$ since the function $g(t)$ is monotone decreasing for $t > 1$. Accordingly, as $N$ increases, $F(1/N)$ converges to $S$ and therefore $\mu(N)$ is bounded. Let $s = e^{-y}$ and $\lambda = e^{x}$, then $d\lambda = e^{x}\,dx$ and $s\lambda = e^{x-y}$:
$$F(s) = F(e^{-y}) = \int_{-\infty}^{\infty} e^{-e^{x-y}}\,e^{x-y}\,\mu(e^{x})\,dx.$$
Let $K(t) = e^{-e^{t}}e^{t}$ (this is a positive, integrable function on the real axis) and let $f(x)$ be the bounded measurable function $\mu(e^{x})$. Now
$$\lim_{y\to+\infty}\int K(x - y)f(x)\,dx = \lim_{y\to+\infty}(K * f)(y) = S$$
and the Wiener-Tauberian theorem may be invoked if the Fourier transform $\hat{K}(\xi)$ is never 0. But this transform is easily computed as follows:
$$\hat{K}(\xi) = (2\pi)^{-1/2}\int e^{-i\xi t}\,e^{-e^{t}}e^{t}\,dt = (2\pi)^{-1/2}\,\Gamma(1 - i\xi).$$
Since the Gamma function has no zeros, $\hat{K}(\xi)$ is never 0, and the hypotheses of the Wiener theorem are satisfied. Hence, every regularization of $f(x)$ converges to $S$ at infinity.
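The computation of the transform of the kernel can be checked numerically. The sketch below approximates the Fourier integral of $K(t) = e^{-e^{t}}e^{t}$ by the trapezoid rule and compares $2\pi|\hat{K}(\xi)|^2$ with $|\Gamma(1 - i\xi)|^2$, using the classical identity $|\Gamma(1 + iy)|^2 = \pi y/\sinh(\pi y)$, which is not stated in the text. The integration limits and step count are assumptions chosen so that the neglected tails are negligible.

```python
import cmath
import math

def kernel(t):
    """K(t) = exp(-e^t) * e^t, written as a single exponential."""
    return math.exp(t - math.exp(t))

def kernel_transform(xi, lo=-40.0, hi=6.0, steps=46_000):
    """Trapezoid approximation of (2*pi)**-0.5 * integral exp(-i*t*xi) K(t) dt."""
    h = (hi - lo) / steps
    total = 0j
    for j in range(steps + 1):
        t = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * t * xi) * kernel(t)
    return total * h / math.sqrt(2.0 * math.pi)

def gamma_modulus_squared(xi):
    """|Gamma(1 - i*xi)|**2 = pi*xi / sinh(pi*xi) for real xi."""
    return 1.0 if xi == 0 else math.pi * xi / math.sinh(math.pi * xi)

if __name__ == "__main__":
    for xi in (0.0, 0.5, 1.0, 3.0):
        lhs = 2.0 * math.pi * abs(kernel_transform(xi)) ** 2
        print(xi, lhs, gamma_modulus_squared(xi))
```

Since $\pi\xi/\sinh(\pi\xi)$ is strictly positive for every real $\xi$, the agreement of the two columns illustrates that the modulus of the transform decays rapidly but never reaches 0.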
In order to make sure that $f(x)$ itself converges to $S$ at infinity, we must study this function in greater detail. The hypothesis $|a_k|k \le A$ shows that for $1 \le a < b$
$$|\mu(b) - \mu(a)| \le \sum_{a < k \le b}\frac{A}{k} \le \frac{A}{a} + A\log\frac{b}{a}.$$
Accordingly, if $y$ is large and $x$ is larger, $|f(x) - f(y)| = |\mu(e^{x}) - \mu(e^{y})| \le Ae^{-y} + A|x - y|$. If $x_k$ is a sequence of points on the axis converging to infinity such that $f(x_k) > S + 4\eta$ for some positive $\eta$, then, for $h = \eta/A$ and all large enough $k$, $f(t) > S + \eta$ for $t$ in the interval $[x_k, x_k + h]$. If we take a regularizing function $\varphi(x)$ supported by an interval of length smaller than $h/2$, the convolution $(f * \varphi)(x)$ exceeds $S + \eta$ at points of the form $x = x_k + (h/2)$ and therefore does not converge to $S$ at infinity, a contradiction. Thus $\limsup f(x) \le S$. A similar argument shows that $\liminf f(x) \ge S$, completing the proof of the theorem.

Another important Tauberian theorem is the Ikehara theorem.
Theorem (Ikehara): Let $d\mu$ be a positive measure on the right half-axis such that the Laplace transform
$$F(s) = \int_0^\infty e^{-sx}\,d\mu(x)$$
converges for $\mathrm{Re}[s] > 1$; suppose, moreover, there exists a constant $A$ such that $F(s) - [A/(s - 1)] = H(s)$ is continuous in the closed half-plane $\mathrm{Re}[s] \ge 1$; then $\lim_{x\to\infty} e^{-x}\mu(x)$ exists and equals $A$.

PROOF: Whenever $\mathrm{Re}[s] > 1$, it is possible to integrate by parts to obtain
$$\int_0^\infty e^{-sx}\mu(x)\,dx = \frac{F(s)}{s} = \frac{A}{s - 1} + h(s),$$
where the function $h(s) = [H(s) - A]/s$ is analytic for $\mathrm{Re}[s] > 1$ and continuous in $\mathrm{Re}[s] \ge 1$.
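Now set $b(x) = e^{-x}\mu(x)$ and write $a_\varepsilon(x)$ for the function which vanishes for $x < 0$ and is equal to $e^{-\varepsilon x}$ for $x > 0$. Taking $s = 1 + \varepsilon + it$, we obtain
$$\int_0^\infty e^{-itx}\,e^{-\varepsilon x}\,b(x)\,dx = \frac{A}{\varepsilon + it} + h(1 + \varepsilon + it).$$
After division by $\sqrt{2\pi}$, the term on the left-hand side is a function of $t$, the Fourier transform of the product $a_\varepsilon(x)b(x)$. The first term on the right-hand side is the Fourier transform of $A a_\varepsilon(x)$. From the definition of Fourier transform, that is, the equation $\hat{T}(\varphi) = T(\hat{\varphi})$, it follows that for any function $\varphi$ in $\mathcal{S}$
$$\int_0^\infty e^{-\varepsilon x}b(x)\hat{\varphi}(x)\,dx = A\int_0^\infty e^{-\varepsilon x}\hat{\varphi}(x)\,dx + \frac{1}{\sqrt{2\pi}}\int h(1 + \varepsilon + it)\,\varphi(t)\,dt.$$
Now select $\varphi(t)$ as a testfunction such that its Fourier transform is nonnegative, multiply it by the exponential $e^{iht}$, and substitute to obtain
$$\int_0^\infty e^{-\varepsilon x}b(x)\hat{\varphi}(x - h)\,dx = A\int_0^\infty e^{-\varepsilon x}\hat{\varphi}(x - h)\,dx + \frac{1}{\sqrt{2\pi}}\int h(1 + \varepsilon + it)\,e^{iht}\varphi(t)\,dt.$$
As $\varepsilon$ approaches 0, the two integrals on the right-hand side converge to finite limits, in view of the integrability of $\hat{\varphi}$ and the hypothesis that $h(s)$ is continuous in the closed half-plane $\mathrm{Re}[s] \ge 1$. The integrand on the left is always positive and increases as $\varepsilon$ approaches 0, hence from the Beppo Levi theorem it follows that the limit is integrable and
$$\int_0^\infty b(x)\hat{\varphi}(x - h)\,dx = A\int_0^\infty \hat{\varphi}(x - h)\,dx + \frac{1}{\sqrt{2\pi}}\int h(1 + it)\,e^{iht}\varphi(t)\,dt.$$
As $h$ approaches $+\infty$, the second integral on the right-hand side converges to 0 in view of the Riemann-Lebesgue lemma, and therefore, since the first term increases with $h$ because $\hat{\varphi}(x)$ is positive,
$$\lim_{h\to+\infty}\int_0^\infty b(x)\hat{\varphi}(x - h)\,dx = A\lim_{h\to+\infty}\int_{-h}^{\infty}\hat{\varphi}(x)\,dx = A\int\hat{\varphi}(x)\,dx.$$
This makes it clear that certain averages of $b(x)$ approach $A$ at infinity.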
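Let $x > h > 0$; then $e^{x}b(x) = \mu(x) \ge \mu(h) = e^{h}b(h)$ and therefore $b(x) \ge b(h)e^{h-x}$. Accordingly,
$$\int_0^\infty b(x)\hat{\varphi}(x - h)\,dx = \int_0^{h} b(x)\hat{\varphi}(x - h)\,dx + \int_h^\infty b(x)\hat{\varphi}(x - h)\,dx \ge b(h)\int_h^\infty e^{h-x}\hat{\varphi}(x - h)\,dx = b(h)\int_0^\infty e^{-\lambda}\hat{\varphi}(\lambda)\,d\lambda.$$
When the testfunction $\varphi(t)$ is so chosen that $\int\hat{\varphi}(x)\,dx = 1$, this leads to
$$\limsup_{h\to\infty} b(h) \le A\Bigl[\int_0^\infty e^{-\lambda}\hat{\varphi}(\lambda)\,d\lambda\Bigr]^{-1}.$$
The factor multiplying $A$ can be made as close to 1 as desired: for $\hat{\varphi}(\lambda)$ we substitute $\varepsilon\hat{\varphi}(\varepsilon\lambda - \sqrt{\varepsilon}) = \hat{\psi}_\varepsilon(\lambda)$, which is also a positive function in $\mathcal{S}$ with integral equal to 1. As $\varepsilon$ approaches infinity, the quantity $\int_0^\infty e^{-\lambda}\hat{\psi}_\varepsilon(\lambda)\,d\lambda$ converges to 1. Thus $\limsup_{h\to\infty} b(h) \le A$.

The argument dealing with the limit inferior is slightly more complicated.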
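$$\int_0^\infty b(x)\hat{\varphi}(x - h)\,dx = \int_{-h}^{\infty} b(x + h)\hat{\varphi}(x)\,dx = \int_{-h}^{-1} + \int_{-1}^{0} + \int_0^\infty.$$
Since $b(x + h) \le b(h)e^{-x}$ for negative $x$, this integral is bounded from above by
$$B\int_{-h}^{-1}\hat{\varphi}(x)\,dx + b(h)\int_{-1}^{0} e^{-x}\hat{\varphi}(x)\,dx + \sup_{x \ge h} b(x)\int_0^\infty\hat{\varphi}(x)\,dx,$$
where $B = \sup_{x > 0} b(x)$. Let $h$ approach infinity in such a way that $b(h)$ converges to the limit inferior; then, since the limit superior of $b$ is at most $A$,
$$A\int_{-\infty}^{0}\hat{\varphi}(x)\,dx \le B\int_{-\infty}^{-1}\hat{\varphi}(x)\,dx + \liminf_{h\to\infty} b(h)\int_{-1}^{0} e^{-x}\hat{\varphi}(x)\,dx.$$
The substitution of $\varepsilon\hat{\varphi}(\varepsilon x)$ for $\hat{\varphi}(x)$, with $\varepsilon$ increasing to infinity, then leads to
$$\liminf_{h\to\infty} b(h) \ge A,$$
completing the proof.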
48. Prime Number Theorem

In this section we write the complex variable $s$ in the form $s = \sigma + it$ following established custom. The Riemann Zeta function is given by the infinite series
$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^{s}},$$
which converges absolutely and uniformly on compact subsets of the half-plane $\sigma > 1$. Since $1/n^{s}$ is $e^{-s\log n}$, an entire function, it is easy to see that $\zeta(s)$ is the Laplace transform of the positive measure which puts a unit mass at the logarithm of each integer $\ge 1$. The function evidently has a singularity at $s = 1$, and this singularity can be determined exactly. Since
$$\frac{1}{s - 1} = \int_1^\infty x^{-s}\,dx$$
is valid for $\sigma > 1$, it follows that in the same region,
$$\zeta(s) - \frac{1}{s - 1} = \sum_{n=1}^{\infty}\int_n^{n+1}\bigl(n^{-s} - x^{-s}\bigr)\,dx.$$
However, the series on the right-hand side converges for $\sigma > 0$, since $x^{-s}$ has the derivative $-sx^{-s-1}$ and this is bounded in absolute value on the interval $[n, n + 1]$ by $|s|n^{-\sigma-1}$, and therefore the series is majorated by $|s|\sum n^{-\sigma-1}$ which converges for positive $\sigma$. Hence $\zeta(s)$ is meromorphic in the open right half-plane and has only one singularity in that region, a simple pole at $s = 1$.

Let $p_k$ denote the sequence of primes: $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $p_4 = 7$, etc. From the convergence of the series representing $\zeta(s)$ we deduce that $\sum p_k^{-s}$ converges also for $\sigma > 1$. Hence the infinite product
$$\prod_{k=1}^{\infty}\Bigl(1 - \frac{1}{p_k^{s}}\Bigr)$$
converges in the same region. However, since
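$$\zeta(s)\Bigl(1 - \frac{1}{2^{s}}\Bigr) = \sum_{n}\frac{1}{n^{s}} - \sum_{n}\frac{1}{(2n)^{s}} = \sum_{m}\frac{1}{m^{s}},$$
where the final sum is taken over all odd integers $m$, and since
$$\zeta(s)\Bigl(1 - \frac{1}{2^{s}}\Bigr)\Bigl(1 - \frac{1}{3^{s}}\Bigr) = \sum_{k}\frac{1}{k^{s}},$$
the last sum being taken over integers $k$ not divisible by 2 or 3, it becomes clear that
$$\zeta(s)\prod_{p_k \le N}\Bigl(1 - \frac{1}{p_k^{s}}\Bigr) = \sum_{n}\frac{1}{n^{s}},$$
where the sum is taken over all integers $n \ge 1$ not divisible by any prime $\le N$. Evidently, this sum converges to 1 as $N$ increases, hence the infinite product represents the reciprocal of Zeta. That reciprocal approaches 0 as $s$ approaches $+1$ along the real axis, and therefore the infinite product cannot converge for $s = 1$. This in turn implies that $\sum_{k=1}^{\infty} 1/p_k = +\infty$, a very weak result concerning the distribution of primes.

Let $G(s)$ be the negative of the logarithmic derivative of Zeta:
$$G(s) = -\frac{\zeta'(s)}{\zeta(s)} = -\frac{d}{ds}\log\zeta(s).$$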
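The relation between the Zeta function and the product over the primes can be checked numerically at $s = 2$, where $\zeta(2) = \pi^2/6$ is known in closed form. The sketch below is an illustration only; the helper names are hypothetical, and the sieve is the ordinary sieve of Eratosthenes rather than anything taken from the text.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n (n >= 2)."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if flags[p]]

def truncated_euler_product(s, bound):
    """Product of (1 - p**-s) over the primes p <= bound."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 - p ** (-s)
    return product

if __name__ == "__main__":
    zeta_2 = math.pi ** 2 / 6.0
    for bound in (10, 100, 1000):
        # zeta(2) times the truncated product is the sum of n**-2 over the
        # integers n >= 1 with no prime factor <= bound, so it decreases to 1.
        print(bound, zeta_2 * truncated_euler_product(2.0, bound))
    # Partial sums of 1/p grow without bound, but only very slowly.
    print(sum(1.0 / p for p in primes_up_to(10 ** 6)))
```

The first loop shows the product of $\zeta(2)$ with the truncated Euler product decreasing to 1; the last line shows how slowly the sum of the reciprocals of the primes grows.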
Since $\zeta(s)$ is meromorphic for $\sigma > 0$, $G(s)$ is also meromorphic and has only simple poles in that region. Since the residues at such poles are integers corresponding to the multiplicity of the point in question as a zero or pole of $\zeta(s)$, it follows that $G(s)$ has a simple pole at $s = 1$ with residue $+1$, and simple poles at the zeros of $\zeta(s)$ with residue $-k$, where $k$ is the multiplicity of the zero. Since the reciprocal of $\zeta(s)$ is given for $\sigma > 1$ by a convergent infinite product, it is clear that $\zeta(s)$ has no zeros in the half-plane $\sigma > 1$, and therefore $G(s)$ is regular in that half-plane. The infinite product makes it possible to write a series for the logarithm of $\zeta(s)$:
$$-\log\zeta(s) = \sum_{k=1}^{\infty}\log\Bigl(1 - \frac{1}{p_k^{s}}\Bigr)$$
and since $\log(1 - z) = -\sum_{n=1}^{\infty} z^{n}/n$ when $|z| < 1$, the double series
$$\log\zeta(s) = \sum_{k=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{n}\,p_k^{-ns}$$
converges absolutely and uniformly in compact subsets of $\sigma > 1$. Since this can be differentiated term by term it follows that $G(s)$ can be written as a Laplace transform, at least for $\sigma > 1$, as
$$G(s) = \int_0^\infty e^{-s\lambda}\,\lambda\,d\nu(e^{\lambda}),$$
where the measure $d\nu$ has the mass $1/n$ at points $x$ of the form $p^{n}$ where $p$ is a prime. It will be necessary to apply the Ikehara theorem to the function $G(s)$ which has a pole of the form $1/(s - 1)$ at $s = 1$, and so it is necessary to verify that $G(s)$ has no other singularities on the line $\sigma = 1$. Owing to the special form of $G(s)$, such singularities can only be poles of the form $-k/(s - (1 + i\xi_0))$, where $1 + i\xi_0$ is a zero of $\zeta(s)$ of multiplicity $k$. It is immediate that $k$ cannot be greater than 1, since for positive $\varepsilon$, the function $\Phi_\varepsilon(\xi) = G(1 + \varepsilon + i\xi)$ is a function of positive type which therefore attains its maximum at the origin $\xi = 0$. When $\varepsilon$ is small, this value is of the order $1/\varepsilon$, while $\Phi_\varepsilon(\xi_0)$ is approximately $-k/\varepsilon$ if $\zeta(s)$ has a zero at $1 + i\xi_0$. Now for small positive $\varepsilon$, the function $\varepsilon\Phi_\varepsilon(\xi)$ is of positive type and $\varepsilon\Phi_\varepsilon(0) = 1 + \varepsilon h_1$, where the quantity $h_1$ remains bounded as $\varepsilon$ approaches 0; similarly, $\varepsilon\Phi_\varepsilon(\xi_0) = -1 + \varepsilon h_2$ where $h_2$ is bounded for all small $\varepsilon$. At the point $\xi = 2\xi_0$, the function is of the form
$$\varepsilon\Phi_\varepsilon(2\xi_0) = -k + \varepsilon h_3,$$
where $k$ is 0 or $+1$ depending on whether $1 + i2\xi_0$ is a regular point or a pole of $G(s)$ and $h_3$ is bounded. Let $x_1 = 0$, $x_2 = \xi_0$, and $x_3 = 2\xi_0$ and form the positive matrix $\varepsilon\Phi_\varepsilon(x_j - x_k)$; as $\varepsilon$ approaches 0, this matrix converges to the positive matrix
$$\begin{bmatrix} 1 & -1 & -k \\ -1 & 1 & -1 \\ -k & -1 & 1 \end{bmatrix}.$$
Whether $k$ is 0 or 1 the determinant of this matrix, which equals $-(1 + k)^2$, is negative and therefore it is not a positive matrix; this contradicts the assumption that $G(s)$ had a pole at $1 + i\xi_0$, hence no such pole exists. It follows that $\zeta(s)$ has no zeros on the line $\sigma = 1$, and it is clear that that function has no zeros in the open half-plane $\sigma > 1$ since the infinite product which represents the reciprocal of $\zeta(s)$ converges in that half-plane.

Let $\pi(x)$ be defined as the number of primes $\le x$; this is a monotone increasing function which vanishes for $x < 2$ and takes integer values; obviously $\pi(x) < x$. In order to show that $\pi(x)/x$ converges to 0 with increasing $x$ the following lemma is proved.
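Lemma: Let $\nu$ be an integer and $N_\nu = p_1p_2p_3\cdots p_\nu$ the product of the first $\nu$ primes; then in any block of $N_\nu$ consecutive integers, exactly
$$N_\nu\prod_{k=1}^{\nu}\Bigl(1 - \frac{1}{p_k}\Bigr)$$
of them are not divisible by any prime $p_k$ with $k \le \nu$.

PROOF: The proof is by induction; for $\nu = 1$, the Lemma reduces to the assertion that any pair of two consecutive integers contains just one odd integer. Given $N_{\nu+1} = p_{\nu+1}N_\nu$ consecutive integers, the set may be resolved into $p_{\nu+1}$ blocks of $N_\nu$ consecutive integers, and from the inductive hypothesis there are exactly $p_{\nu+1}N_\nu\prod_{k=1}^{\nu}(1 - (1/p_k))$ integers in the whole set not divisible by $p_k$ for $k \le \nu$. On the other hand, the whole set contains $N_\nu$ multiples of $p_{\nu+1}$. These are of the form $kp_{\nu+1}$ where the factor $k$ runs over a block of $N_\nu$ consecutive integers. Hence there are exactly $N_\nu\prod_{k=1}^{\nu}(1 - (1/p_k))$ multiples of $p_{\nu+1}$ in the block which are not divisible by any smaller prime. It follows that there are exactly
$$(p_{\nu+1} - 1)\,N_\nu\prod_{k=1}^{\nu}\Bigl(1 - \frac{1}{p_k}\Bigr) = N_{\nu+1}\prod_{k=1}^{\nu+1}\Bigl(1 - \frac{1}{p_k}\Bigr)$$
integers in the block not divisible by any prime $p_k$ with $k \le \nu + 1$.

Let the large positive $x$ be in some interval of the form $kN_\nu \le x < (k + 1)N_\nu$ where $k < p_{\nu+1}$; now $\pi(x) \le \pi(2kN_\nu)$, and since every prime other than $p_1, \dots, p_\nu$ is divisible by none of these, $\pi(2kN_\nu)$ exceeds by at most $\nu$ the number of integers in the block $[1, 2kN_\nu]$ not divisible by $p_j$ for $j \le \nu$, and this block is the union of $2k$ blocks of $N_\nu$ consecutive integers. Hence
$$\frac{\pi(x)}{x} \le \frac{\nu}{kN_\nu} + \frac{2kN_\nu}{kN_\nu}\prod_{j=1}^{\nu}\Bigl(1 - \frac{1}{p_j}\Bigr) \le \frac{\nu}{N_\nu} + 2\prod_{j=1}^{\nu}\Bigl(1 - \frac{1}{p_j}\Bigr).$$
Since $\nu/N_\nu$ converges to 0 and the corresponding infinite product diverges, $\pi(x)/x$ converges to 0 with increasing $x$, that is, with increasing $\nu$.

Since $d\pi(x)$ is the measure which puts a unit mass at each prime, the divergence of the series $\sum 1/p_k$ may be written as $\int (1/\lambda)\,d\pi(\lambda) = +\infty$, and if this Stieltjes integral is integrated by parts, the integrated term vanishes since $\pi(\lambda)/\lambda$ converges to 0 as $\lambda$ increases. Thus
$$\int_1^\infty\frac{\pi(\lambda)}{\lambda}\,\frac{d\lambda}{\lambda} = +\infty$$
and it is clear that the ratio $\pi(\lambda)/\lambda$ converges very slowly to zero at infinity.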
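The counting statement of the Lemma lends itself to a brute-force check. The sketch below counts, in several blocks of $N_\nu$ consecutive integers, those divisible by none of the first $\nu$ primes and compares the result with $N_\nu\prod(1 - 1/p_k) = \prod(p_k - 1)$. The function names and the choice of starting points are assumptions made for the illustration.

```python
FIRST_PRIMES = [2, 3, 5, 7, 11]

def block_length(nu):
    """N_nu, the product of the first nu primes."""
    n = 1
    for p in FIRST_PRIMES[:nu]:
        n *= p
    return n

def count_unsieved(start, nu):
    """Count the integers m in [start, start + N_nu) that are divisible
    by none of the first nu primes."""
    primes = FIRST_PRIMES[:nu]
    return sum(
        1
        for m in range(start, start + block_length(nu))
        if all(m % p for p in primes)
    )

def predicted_count(nu):
    """N_nu * prod(1 - 1/p_k) over the first nu primes, i.e. prod(p_k - 1)."""
    count = 1
    for p in FIRST_PRIMES[:nu]:
        count *= p - 1
    return count

if __name__ == "__main__":
    for nu in (1, 2, 3, 4):
        counts = {count_unsieved(start, nu) for start in (1, 17, 1000, -50)}
        print(nu, block_length(nu), counts, predicted_count(nu))
```

For each $\nu$ the set of observed counts contains a single value, which agrees with the predicted one regardless of where the block begins.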
The exact behavior is given by the following famous result, conjectured in the eighteenth century and finally proved at the very end of the nineteenth by Hadamard and La Vallée-Poussin independently.

Prime Number Theorem: The limit
$$\lim_{x\to\infty}\frac{\pi(x)\log x}{x}$$
exists and equals $+1$.

PROOF: It has already been remarked that the Ikehara theorem may be applied to the function
$$G(s) = \int_0^\infty e^{-s\lambda}\,\lambda\,d\nu(e^{\lambda})$$
and therefore the quantity $e^{-x}\int_0^{x}\lambda\,d\nu(e^{\lambda})$ converges to 1 with increasing $x$. Let $t = e^{\lambda}$; now
$$\lim_{N\to\infty}\frac{1}{N}\int_1^{N}\log t\,d\nu(t) = 1$$
and after an integration by parts, since $\nu(t)$ vanishes for $t < 1$,
$$\frac{\nu(N)\log N}{N} - \frac{1}{N}\int_1^{N}\frac{\nu(t)}{t}\,dt \text{ converges to } 1.$$
We shall presently show that $\nu(t) = \pi(t) + \rho(t)$ where $\rho(t) \le \sqrt{t}\log t$; it will follow that $\nu(t)/t$ converges to 0 with increasing $t$, and hence that its average over the interval $[0, N]$ also does. Accordingly
$$\frac{\pi(N)\log N}{N} + \frac{\rho(N)\log N}{N} \text{ converges to } +1,$$
and the theorem follows since the second term converges to 0, because it is smaller than $(\log^2 N)/\sqrt{N}$.

Since the measure $d\nu$ puts the mass $1/n$ at every number $\lambda$ of the form $p^{n}$ where $p$ is a prime, we may write
$$\nu(x) = \pi(x) + \tfrac12\pi(x^{1/2}) + \tfrac13\pi(x^{1/3}) + \cdots$$
and therefore $\rho(x) = \sum_{n\ge 2}(1/n)\,\pi(x^{1/n})$. If $x$ is fixed, the $n$th term in the series vanishes for all $n$ such that $x^{1/n} < 2$, that is, for $n > \log x/\log 2$. Since the coefficients in the series are at most $\tfrac12$, it follows that $\rho(x) < \pi(x^{1/2})\log x/(2\log 2)$, and because $\pi(x)/x \le 1$ and $2\log 2 > 1$, $\rho(x) < \sqrt{x}\log x$ as desired.
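The quantities in the proof can be tabulated for moderate $x$. The sketch below computes $\pi(x)$ by a sieve, the ratio $\pi(x)\log x/x$, and the correction term $\rho(x) = \sum_{n\ge2}\pi(x^{1/n})/n$ together with the bound $\sqrt{x}\log x$. All function names are hypothetical, and the table only illustrates the slow approach of the ratio to 1; it proves nothing.

```python
import math

def prime_sieve(limit):
    """Bytearray with flags[m] == 1 exactly when m is prime, for m <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return flags

def integer_root(x, n):
    """Largest integer r with r**n <= x, for a positive integer x."""
    r = round(x ** (1.0 / n))
    while r ** n > x:
        r -= 1
    while (r + 1) ** n <= x:
        r += 1
    return r

def prime_pi(x, flags):
    """pi(x), the number of primes <= x, for x within the sieve's range."""
    return sum(flags[: x + 1])

def rho(x, flags):
    """rho(x) = sum over n >= 2 of pi(x**(1/n)) / n, so that nu = pi + rho."""
    total, n = 0.0, 2
    while 2 ** n <= x:
        total += prime_pi(integer_root(x, n), flags) / n
        n += 1
    return total

if __name__ == "__main__":
    flags = prime_sieve(10 ** 6)
    for x in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        ratio = prime_pi(x, flags) * math.log(x) / x
        bound = math.sqrt(x) * math.log(x)
        print(x, prime_pi(x, flags), ratio, rho(x, flags), bound)
```

The ratio in the third column stays above 1 in this range and decreases only slowly, while $\rho(x)$ is far smaller than the bound used in the proof.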
49. The Riemann Zeta Function

If $h = (2\pi)^{1/2}$ and $\varphi(x)$ is a function in $\mathcal{S}$, by the Poisson summation formula
$$\sum\varphi(nh) = \sum\hat{\varphi}(nh),$$
the sums being taken over all integers $n$. Let $G(x)$ be the Gaussian $e^{-|x|^2/2}$ and $\varphi(x) = (1/\varepsilon)G(x/\varepsilon)$ where $\varepsilon$ is positive; then
$$\frac{1}{\varepsilon}\sum G\Bigl(\frac{nh}{\varepsilon}\Bigr) = \sum G(\varepsilon nh).$$
Put $x = \varepsilon^2 > 0$; then, since $h^2/2 = \pi$, this becomes
$$\Theta(x) = \sum e^{-n^2\pi x} = x^{-1/2}\sum e^{-n^2\pi/x} = x^{-1/2}\,\Theta(1/x),$$
where $\Theta(x)$ is defined for $x > 0$ by the convergent series above. It follows that the associated function
$$W(x) = \tfrac12[\Theta(x) - 1] = \sum_{n=1}^{\infty} e^{-n^2\pi x},$$
which is the Laplace transform of the measure which has a unit mass at every $\lambda$ of the form $n^2\pi$, is analytic in the open right half-plane and satisfies the functional equation $W(1/x) = x^{1/2}W(x) + \tfrac12 x^{1/2} - \tfrac12$ for all $x$ in that half-plane. Evidently, for $x > 0$,
$$W(x) \le e^{-\pi x}\sum_{n=0}^{\infty}(e^{-\pi x})^{n} = \frac{e^{-\pi x}}{1 - e^{-\pi x}}$$
and therefore $W(x)$ vanishes quite rapidly at infinity. The functional equation now makes possible an estimate of the behavior of $W(x)$ near the origin: $W(x) \le x^{-1/2}$, and therefore the function is integrable over the half-axis $x > 0$.
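The functional equations for $\Theta$ and $W$ are easy to test numerically, because the defining series converge extremely fast. The following sketch is only an illustration; the truncation length is an assumption that is adequate for the sample points used.

```python
import math

def w(x, terms=60):
    """W(x) = sum_{n>=1} exp(-n*n*pi*x) for x > 0 (truncated series)."""
    return sum(math.exp(-n * n * math.pi * x) for n in range(1, terms + 1))

def theta(x, terms=60):
    """Theta(x) = sum over all integers n of exp(-n*n*pi*x) = 1 + 2*W(x)."""
    return 1.0 + 2.0 * w(x, terms)

if __name__ == "__main__":
    for x in (0.05, 0.3, 1.0, 2.5):
        # Theta(x) against x**-0.5 * Theta(1/x)
        print(x, theta(x), theta(1.0 / x) / math.sqrt(x))
        # W(1/x) against sqrt(x) * W(x) + sqrt(x)/2 - 1/2
        print(x, w(1.0 / x), math.sqrt(x) * w(x) + 0.5 * math.sqrt(x) - 0.5)
```

The two numbers on each printed line agree to within rounding error, and the values of `w(x)` for small `x` stay below `x ** -0.5`, in line with the estimate near the origin.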
In the same way, it can be shown that the derivatives of $W(x)$ vanish exponentially with increasing $x$. For $k > 0$,
$$W^{(k)}(x) = (-\pi)^{k}\sum_{n=1}^{\infty} n^{2k}e^{-n^2\pi x} = (-\pi)^{k}e^{-\pi x}\sum_{n=1}^{\infty} n^{2k}e^{-(n^2-1)\pi x}$$
and since the series is bounded for large $x$ by a fixed $C_k$,
$$|W^{(k)}(x)| \le C_k\,\pi^{k}e^{-\pi x}.$$
For a later application, it will be important to know the behavior of $W(s)$ as the complex variable $s$ approaches $s = i$ along a path in the right half-plane almost parallel to the real axis. Let $s = i + z$ where $z = re^{i\omega}$, $|\omega| \le \pi/4$, and $r$ is small. Now
$$W(i + z) = \sum_{n=1}^{\infty} e^{-n^2\pi i}e^{-n^2\pi z} = \sum_{n\text{ even}} e^{-n^2\pi z} - \sum_{n\text{ odd}} e^{-n^2\pi z} = 2\sum_{n\text{ even}} e^{-n^2\pi z} - W(z).$$
Since the first sum may be written $\sum e^{-(2n)^2\pi z}$ where the summation is taken over all integers $\ge 1$, it is therefore equal to $W(4z)$, and hence, finally,
$$W(i + z) = 2W(4z) - W(z).$$
From the functional equation satisfied by $W(z)$, it then follows that
$$W(i + z) = z^{-1/2}\Bigl[W\Bigl(\frac{1}{4z}\Bigr) - W\Bigl(\frac{1}{z}\Bigr)\Bigr] - \frac12,$$
where there is no ambiguity in the determination of the square root. The first term above converges exponentially to 0 as $r$ decreases since it is bounded by $4r^{-1/2}e^{-\pi/8r}$. All of the derivatives of $W(z)$ have the same property since the differentiation of the relation above displays $W^{(k)}(i + z)$ as a finite linear combination of functions of the form $W^{(m)}(1/z)$ and $W^{(m)}(1/4z)$ with coefficients having only algebraic singularities at the origin. Since those derivatives diminish exponentially, the derivatives of $W(i + z)$ converge rapidly to 0 as $z$ approaches the origin in the sector $|\arg z| = |\omega| \le \pi/4$.

From the definition of $\Gamma(s)$, when $\mathrm{Re}\,s = \sigma > 0$,
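$$\Gamma(s/2) = \int_0^\infty e^{-x}x^{s/2}\,\frac{dx}{x};$$
after the substitution $x = n^2\pi\lambda$ this becomes $n^{s}\pi^{s/2}\int_0^\infty e^{-n^2\pi\lambda}\lambda^{s/2}\,d\lambda/\lambda$ and therefore
$$\pi^{-s/2}\,\Gamma(s/2)\,n^{-s} = \int_0^\infty e^{-n^2\pi\lambda}\lambda^{s/2}\,\frac{d\lambda}{\lambda}.$$
Summing over $n$ and invoking the Fubini theorem we obtain for $\sigma > 1$
$$\eta(s) = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \int_0^\infty W(\lambda)\lambda^{s/2}\,\frac{d\lambda}{\lambda}.$$
Now
$$\int_0^{1} W(\lambda)\lambda^{s/2}\,\frac{d\lambda}{\lambda} = \int_1^\infty W(1/\lambda)\lambda^{-s/2}\,\frac{d\lambda}{\lambda} = \int_1^\infty W(\lambda)\lambda^{(1-s)/2}\,\frac{d\lambda}{\lambda} + \frac12\int_1^\infty(\lambda^{1/2} - 1)\lambda^{-s/2}\,\frac{d\lambda}{\lambda}$$
and the second integral may be computed explicitly, $\sigma$ being $> 1$. Thus finally,
$$\eta(s) = \int_1^\infty W(\lambda)\bigl[\lambda^{s/2} + \lambda^{(1-s)/2}\bigr]\frac{d\lambda}{\lambda} + \frac{1}{s - 1} - \frac{1}{s}.$$
It is easy to see that the first term, the integral, is an entire function of $s$ and it therefore appears that $\eta(s)$ is meromorphic in the entire plane, having only two singularities, simple poles at $s = 1$ and $s = 0$. Hence, $\zeta(s)$ possesses an analytic continuation over the whole plane. The formula also shows that
$$\eta(s) = \eta(1 - s).$$
Since $\Gamma(s/2)$ has poles at the negative even integers and $\eta(s)$ does not, it follows that $\zeta(s)$ has simple zeros at those points. These are called the trivial zeros of $\zeta(s)$. Because the product $\pi^{-s/2}\Gamma(s/2)$ never vanishes, it follows that the nontrivial zeros of $\zeta(s)$ and the zeros of $\eta(s)$ are the same. However, $\zeta(s)$ has no zero in the half-plane $\sigma \ge 1$, and therefore $\eta(s)$ has no zeros there; in view of the functional equation satisfied by $\eta(s)$, this means that $\eta(s)$ has no zeros in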
the closed left half-plane $\sigma \le 0$, and hence, $\zeta(s)$ has no nontrivial zeros in that half-plane. It follows that the nontrivial zeros of $\zeta(s)$ are in the open strip $0 < \sigma < 1$. The function $\eta(s)$ has a simple pole at the origin which it inherits from $\Gamma(s/2)$, hence $\zeta(s)$ is regular and nonzero at the origin. We note also that $\zeta(s) - 1/(s - 1)$ is entire.

There is another consequence of the functional equation satisfied by $\eta(s)$:
$$\eta(\tfrac12 + it) = \eta(1 - \tfrac12 - it) = \eta(\tfrac12 - it) = \overline{\eta(\tfrac12 + it)}$$
by virtue of the Schwarz reflection principle and the reality of $\eta(s)$ on the real axis. It follows that $\eta(s)$ is real on the line $\sigma = \tfrac12$ and therefore that the zeros of that function are symmetrically distributed about the line $\sigma = \tfrac12$ as well as about the real axis. These zeros, as we have remarked, are all in the strip $0 < \sigma < 1$. The famous Riemann Conjecture, now more than a century old, is that all the nontrivial zeros of $\zeta(s)$ fall on the line $\sigma = \tfrac12$. The function
$$\xi(s) = \tfrac12 s(s - 1)\,\eta(s)$$
is evidently entire, real on the real axis and on the line $\sigma = \tfrac12$. To study this function on the vertical line $\sigma = \tfrac12$, it is convenient to change variables and to write
$$\Xi(z) = \xi(\tfrac12 + iz),$$
obtaining an even, entire function of $z$. We find an explicit representation of this function from the integral formula for $\eta(s)$; here it is convenient to use $z/2$:
$$\Xi(z/2) = \frac12 - \frac{1 + z^2}{8}\int_1^\infty W(\lambda)\,\lambda^{1/4}\bigl[\lambda^{iz/4} + \lambda^{-iz/4}\bigr]\frac{d\lambda}{\lambda}.$$
Putting $\lambda = e^{4x}$, this becomes
$$\Xi(z/2) = \frac12 - (1 + z^2)\int_0^\infty W(e^{4x})\,e^{x}\cos(zx)\,dx.$$
The formula now takes a familiar form if we consider only the real values $z = \xi$ to write
$$\frac{\Xi(\xi/2)}{1 + \xi^2} = \frac{1}{2(1 + \xi^2)} - \int_0^\infty W(e^{4x})\,e^{x}\cos(\xi x)\,dx.$$
Accordingly, if $F(x)$ is the even integrable function defined by
$$F(x) = \tfrac12 e^{-|x|} - W(e^{4|x|})\,e^{|x|},$$
then its Fourier transform is given by
$$\hat{F}(\xi) = \sqrt{2/\pi}\;\frac{\Xi(\xi/2)}{1 + \xi^2}.$$
Since $\Xi(z)$ is entire, $\hat{F}(\zeta)$ is meromorphic in the complex $\zeta$-plane with poles only at $\zeta = +i$ and $\zeta = -i$. It is important to notice that the function $F(x)$ is also analytic: the function $\tfrac12 e^{-x} - W(e^{4x})e^{x}$ is analytic in a strip about the real axis of width $\pi/4$ and coincides with $F(x)$ for $x > 0$; however, this function is even and therefore coincides with $F(x)$ for all real $x$ and therefore $F(x)$ itself is analytic. To show this we write the functional equation for $W$ as
$$W(1/\lambda) = \lambda^{1/2}W(\lambda) + \tfrac12(\lambda^{1/2} - 1)$$
and put $\lambda = e^{4x}$ to obtain $W(e^{-4x}) = e^{2x}W(e^{4x}) + \tfrac12(e^{2x} - 1)$, which leads easily to
$$\tfrac12 e^{-x} - W(e^{4x})\,e^{x} = \tfrac12 e^{x} - W(e^{-4x})\,e^{-x}.$$
It should also be clear from the estimates we have obtained for the behavior of $W(\lambda)$ and its derivatives for large $\lambda$ that $F(x)$ is in the class $\mathcal{S}$, hence, its Fourier transform $\hat{F}(\xi)$ is in the class $\mathcal{S}$. The study of the Fourier transform pair $F(x)$, $\hat{F}(\xi)$ provides a proof of a theorem of G. H. Hardy.
Theorem (Hardy): The Riemann Zeta function has infinitely many zeros on the line $\sigma = \tfrac12$.

PROOF: $\zeta(s)$ has zeros on the line $\sigma = \tfrac12$ if and only if $\Xi(\xi/2)$ has real zeros, and therefore if and only if $\hat{F}(\xi)$ has real zeros, and infinitely many zeros of the one correspond to infinitely many of the other. If $\hat{F}(\xi)$ has only finitely many zeros, that analytic function has a fixed sign at infinity; we may therefore suppose without real loss of generality that $\hat{F}(\xi)$ is positive for all sufficiently large $|\xi|$. If $\varphi(\xi)$ is a suitable even, positive testfunction, the function $\hat{H}(\xi) = \hat{F}(\xi) + \varphi(\xi)$ is always $> 0$ and therefore its Fourier transform $H(x) = F(x) + \hat{\varphi}(x)$ is a function of positive type in $\mathcal{S}$ which is also analytic in a strip about the real axis, since $\hat{\varphi}$ is entire. Accordingly, $H(z)$ can be expanded in a Taylor series about the origin
$$H(z) = \sum_{k=0}^{\infty} c_{2k}z^{2k}, \quad\text{where } (2k)!\,c_{2k} = (2\pi)^{-1/2}\int(i\xi)^{2k}\hat{H}(\xi)\,d\xi;$$
only the coefficients with even indices occur, since all of the functions are even. Clearly $(-1)^{k}c_{2k} > 0$ for all $k$ and the radius of convergence of the series is exactly $\pi/8$. If the analytic function is considered on the segment of the imaginary axis defined by $0 < y < \pi/8$, from
$$h(y) = H(iy) = \sum_{k=0}^{\infty}|c_{2k}|\,y^{2k}$$
it is clear that the function $h(y)$ is a positive, increasing function of the real variable $y$, and all its derivatives $h^{(j)}(y)$ have the same property. It follows that for every $j \ge 0$ the function $h^{(j)}(y)$ increases to a (possibly infinite) limit $d_j$ as $y$ increases to $\pi/8$. Obviously $d_{2k} > (2k)!\,|c_{2k}|$. On the other hand, writing out $H(z)$ explicitly and recalling that $\hat{\varphi}(z)$ is entire, we see that the numbers $d_j$ are also given by
$$d_j = (i)^{j}\hat{\varphi}^{(j)}(i\pi/8) + \lim_{y\to\pi/8} D^{j}\bigl[\tfrac12 e^{-iy} - W(e^{i4y})\,e^{iy}\bigr].$$
The derivative here can be computed in the usual way,
$$D^{j}\bigl[W(e^{i4y})\,e^{iy}\bigr] = \sum_{m=0}^{j}\binom{j}{m}D^{m}W(e^{i4y})\,D^{j-m}e^{iy};$$
as $y$ approaches $\pi/8$, the function $D^{m}W(e^{i4y})$ converges rapidly to 0 if $m > 0$, as shown earlier. The function $W(e^{i4y})$ itself converges to $-\tfrac12$, so that the limit is that of $D^{j}\cos y$, and is therefore $(i)^{j}\cos(\pi/8)$ if $j$ is even and $(i)^{j+1}\sin(\pi/8)$ if $j$ is odd. The final inference is that the numbers $d_j$ differ in absolute value by at most 2 from $\hat{\varphi}^{(j)}(i\pi/8)$; the radius of convergence of the series having the coefficients $d_{2k}/(2k)!$ is infinite since $(1/j!)\hat{\varphi}^{(j)}(i\pi/8)$ are the coefficients of the expansion of the entire $\hat{\varphi}(z)$ about $z = i\pi/8$. In view of the inequality $d_{2k} > (2k)!\,|c_{2k}| > 0$, it follows that the radius of convergence of $H(z) = \sum c_{2k}z^{2k}$ is infinite, contradicting the fact that its radius of convergence is exactly $\pi/8$.

As we have remarked, the Riemann Conjecture is that all the nontrivial zeros of $\zeta(s)$ fall on the line $\sigma = \tfrac12$; the theorem of G. H. Hardy shows that there are in fact infinitely many on that line. In recent years, the behavior of $\zeta(s)$ in the critical strip $0 \le \sigma \le 1$ has also been investigated with the help of the high-speed computer and an astonishingly large number of zeros have been located, approximately $3\tfrac12$ million, in fact. It has been possible to verify that these zeros do fall on the critical line. Indeed, if $R$ is the boundary of the
rectangle formed by the intersection of the strips $0 \le \sigma \le 1$ and $a \le t \le b$, and if $\zeta(s)$ has no zeros on the lines $t = a$ and $t = b$, then the integral
$$\frac{1}{2\pi i}\int_R\frac{\zeta'(s)}{\zeta(s)}\,ds$$
is an integer, namely, the number of zeros of $\zeta(s)$ inside $R$. It follows that if $\zeta(s)$ and its derivative are computed numerically to a certain accuracy, the integral can be determined exactly. In particular, when it equals 1, there is only one zero of $\zeta(s)$ inside the rectangle, and on symmetry grounds this zero must fall on the line $\sigma = \tfrac12$.
50. Beurling's Theorem

Let $\rho(x)$ be the function defined for $x > 0$ equal to the representative of $x$ modulo 1; thus $x = [x] + \rho(x)$ where $[x]$ is the largest integer $\le x$. For $0 < \theta \le 1$, the function $\rho(\theta/x)$ coincides with $\theta/x$ when $x > \theta$ and takes values in the interval $[0, 1]$; this function has only countably many discontinuities, and the points of discontinuity form a sequence converging to 0. The linear space $\mathcal{M}$ consisting of functions of the form
$$f(x) = \sum_{k=1}^{N} a_k\,\rho(\theta_k/x), \quad\text{where } \sum a_k\theta_k = 0,$$
then consists of bounded, measurable functions vanishing for $x > \max_k[\theta_k]$ and therefore vanishing for $x > 1$. In this section the following remarkable result of A. Beurling is established.

Theorem (Beurling): $\mathcal{M}$ is dense in $L^{p}(0, 1)$, $1 \le p \le \infty$, if and only if the Riemann Zeta function $\zeta(s)$ has no zeros in the half-plane $\sigma > 1/p$.

PROOF: The proof is lengthy, and it is first convenient to make certain calculations. When $s = \sigma + it$ the function $x^{s-1}$ belongs to $L^{p}(0, 1)$ for $p$ finite if and only if $\sigma > 1/p'$, where, as usual, $p'$ is the conjugate index determined by $(1/p) + (1/p') = 1$. Its norm in that space is given by
$$\|x^{s-1}\|_p = (1 + (\sigma - 1)p)^{-1/p}.$$
The integral $\int_0^1\rho(\theta/x)\,x^{s-1}\,dx$ exists whenever $\sigma > 0$ and is an analytic function of $s$; it may therefore be conveniently computed for $\sigma > 1$ and
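determined for other values of $s$ by analytic continuation. We have
$$\int_0^1\rho(\theta/x)\,x^{s-1}\,dx = \theta\int_\theta^1 x^{s-2}\,dx + \theta^{s}\int_1^\infty\rho(t)\,t^{-s-1}\,dt$$
and the first term is evidently $(\theta^{s} - \theta)/(1 - s)$. The integral occurring in the second term is a Stieltjes integral; we integrate by parts from $1 + 0$ to $\infty$ as follows
$$\int_1^\infty\rho(t)\,t^{-s-1}\,dt = \frac{1}{s}\int_{1+0}^{\infty} t^{-s}\,d\rho(t) = \frac{1}{s}\Bigl[\int_1^\infty t^{-s}\,dt - \sum_{n=2}^{\infty} n^{-s}\Bigr]$$
since the measure $d\rho(t)$ consists of negative unit point masses at the integers $\ge 1$ and Lebesgue measure otherwise. Thus the integral becomes
$$-\frac{1}{s}[\zeta(s) - 1] - \frac{1}{s}\cdot\frac{1}{1 - s}$$
and finally,
$$\int_0^1\rho(\theta/x)\,x^{s-1}\,dx = -\frac{\theta}{1 - s} - \theta^{s}\,\frac{\zeta(s)}{s}.$$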
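The closed form just obtained can be checked independently in the case $s = 2$, where $\zeta(2) = \pi^2/6$ and the formula reads $\theta - \theta^2\pi^2/12$. On each interval $(\theta/(n+1), \theta/n]$ the integrand $\rho(\theta/x)\,x$ equals $\theta - nx$, so the integral can be summed piece by piece. The sketch below does this; the function names and the number of pieces are assumptions made for the illustration.

```python
import math

def rho_integral_s2(theta, pieces=200_000):
    """Integral over (0, 1) of rho(theta/x) * x dx for 0 < theta <= 1.

    On (theta/(n+1), theta/n] the integrand is theta - n*x, and on
    (theta, 1] it is theta, so every piece has a closed-form integral.
    """
    def antiderivative(x, n):
        return theta * x - 0.5 * n * x * x

    total = theta * (1.0 - theta)  # contribution of theta < x <= 1
    for n in range(1, pieces + 1):
        a, b = theta / (n + 1), theta / n
        total += antiderivative(b, n) - antiderivative(a, n)
    return total  # the neglected part near 0 is of order pieces**-2

def closed_form_s2(theta):
    """-theta/(1 - s) - theta**s * zeta(s)/s at s = 2."""
    return theta - theta * theta * math.pi ** 2 / 12.0

if __name__ == "__main__":
    for theta in (0.25, 0.5, 1.0):
        print(theta, rho_integral_s2(theta), closed_form_s2(theta))
```

For $\theta = 1$ both columns give $1 - \pi^2/12$, the integral of the fractional part of $1/x$ against $x\,dx$.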
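It follows that for $f(x)$ in $\mathcal{M}$,
$$\int_0^1 f(x)\,x^{s-1}\,dx = -\frac{\zeta(s)}{s}\sum_{k=1}^{N} a_k\theta_k^{s}.$$
Suppose, now, that $\mathcal{M}$ is dense in $L^{p}(0, 1)$ where $p$ is finite and that $s$ is chosen with $\mathrm{Re}[s] = \sigma > 1/p$. Since it is now possible to approximate the function $h(x) = -1$ in the $L^{p}$ norm by functions in $\mathcal{M}$, there exist functions $f(x)$ in $\mathcal{M}$ so that $\|f + 1\|_p$ is arbitrarily small. Moreover, the function $x^{s-1}$ is in $L^{p'}(0, 1)$ and so by Hölder's inequality
$$\Bigl|\int_0^1 (f(x) + 1)\,x^{s-1}\,dx\Bigr| \le \|f + 1\|_p\,\|x^{s-1}\|_{p'};$$
thus, the integral can be made arbitrarily small by an appropriate choice of $f$ in $\mathcal{M}$. But this means that
$$\Bigl|\frac{1}{s} - \frac{\zeta(s)}{s}\sum_{k=1}^{N} a_k\theta_k^{s}\Bigr|$$
can be made arbitrarily small for the right choices of $a_k$, $\theta_k$ satisfying $\sum a_k\theta_k = 0$. It follows that $\zeta(s) \ne 0$ and therefore the Zeta function has no zero in the half-plane $\sigma > 1/p$. This establishes the easy half of the theorem; since $\zeta(s)$ surely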
has nontrivial zeros, it follows that $A$ is not dense in $L^p(0,1)$ for $p > 2$, a fortiori for $p = \infty$.
For the balance of the argument, we assume that $A$ is not dense in $L^p(0,1)$ and that $1 \le p \le 2$, and show that the Zeta function has a zero in the half-plane $\sigma > 1/p$. It should first be noticed that if $f(x)$ is in $A$, so also is the function $f(x/\varepsilon)$ where $0 < \varepsilon < 1$; the division by $\varepsilon$ merely multiplies each $\theta_k$ by $\varepsilon$. By hypothesis, $A$ is not dense in $L^p(0,1)$ and therefore there exists a function $g(x)$ in $L^{p'}(0,1)$ so that $\int_0^1 f(x/\varepsilon)\,g(x)\,dx = 0$ for all $f$ in $A$ and all $\varepsilon$ in $(0,1)$. After the substitution $x = e^{-\lambda}$, $\varepsilon = e^{-y}$ this becomes
\[
\int_0^\infty f(e^{-\lambda+y})\,g(e^{-\lambda})\,e^{-\lambda}\,d\lambda = H(y) = 0
\]
for $y > 0$ and all $f$ in $A$.
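Set
\[
G(\lambda) = g(e^{-\lambda})\,e^{-\lambda} \ \text{ for } \lambda > 0, \qquad G(\lambda) = 0 \ \text{ for } \lambda < 0,
\]
to obtain an integrable function on the real axis: $\int |G(\lambda)|\,d\lambda = \int_0^1 |g(x)|\,dx < \infty$.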
Similarly, put
\[
F(\lambda) = f(e^{\lambda}) \ \text{ for } \lambda < 0, \qquad F(\lambda) = 0 \ \text{ for } \lambda > 0,
\]
to obtain a bounded measurable function which vanishes for positive $\lambda$. The convolution
\[
H(y) = \int F(y - \lambda)\,G(\lambda)\,d\lambda = (F * G)(y)
\]
is then a bounded, continuous function of $y$ vanishing for positive $y$. To avoid confusion with the Zeta function itself, we write the complex variable $z = \xi + i\eta$ and form the complex Fourier transform of $F(\lambda)$:
\[
\hat F(z) = (2\pi)^{-1/2}\int e^{-iz\lambda}F(\lambda)\,d\lambda.
\]
Since $F(\lambda)$ is bounded, and vanishes for $\lambda > 0$, this function is analytic in the upper half-plane $\eta > 0$ and is bounded there by $(2\pi)^{-1/2}\,\|F\|_\infty\,\eta^{-1}$. The integral may even be computed explicitly by means of the substitution $x = e^{\lambda}$, with the result
\[
\hat F(z) = (2\pi)^{-1/2}\,\frac{\zeta(-iz)}{iz}\sum a_k\theta_k^{-iz}.
\]
It follows that $\hat F(z)$ can be continued to be meromorphic in the whole plane, and since $\sum a_k\theta_k = 0$ there is no singularity at $z = i$ corresponding to the pole of the Zeta function. Thus $\hat F(z)$ has at most one singularity: a simple pole at the origin. Since the convolution $H(\lambda)$ is bounded and continuous and vanishes for positive $\lambda$, its complex Fourier transform
\[
\hat H(z) = (2\pi)^{-1/2}\int e^{-iz\lambda}H(\lambda)\,d\lambda
\]
is analytic in the upper half-plane and is bounded there by $C/\eta$ for some constant $C$. Finally, the integrable function $G(\lambda)$ has a Fourier transform, and since that function vanishes for $\lambda < 0$, its transform may be extended to a function bounded and analytic in the lower half-plane
\[
\hat G(z) = (2\pi)^{-1/2}\int e^{-iz\lambda}G(\lambda)\,d\lambda.
\]
The substitution $x = e^{-\lambda}$ makes this integral appear as $(2\pi)^{-1/2}\int_0^1 g(x)\,x^{iz}\,dx$ and because of the hypothesis that $g(x)$ is in $L^{p'}(0,1)$, this integral exists (and defines an analytic function) so long as $x^{iz}$ is in $L^p(0,1)$, that is to say, provided $\eta < 1/p$. It follows that $\hat G(z)$ is analytic in a larger half-plane than $\eta < 0$ and, because of Hölder's inequality,
\[
|\hat G(z)| \le (2\pi)^{-1/2}\,\|g\|_{p'}\,\|x^{iz}\|_p
\]
for $z = \xi + i\eta$ where $0 < \eta < 1/p$. All three functions are analytic in the strip $0 < \eta < 1/p$ and for any such $\eta$, $z = \xi + i\eta$, $\hat F(z)$, $\hat G(z)$, and $\hat H(z)$ are respectively the Fourier transforms of the integrable functions $e^{\eta\lambda}F(\lambda)$, $e^{\eta\lambda}G(\lambda)$ and $e^{\eta\lambda}H(\lambda)$. It is easy to verify that the last function is the convolution of the first two, and therefore, in the strip,
\[
\hat H(z) = (2\pi)^{1/2}\,\hat G(z)\,\hat F(z).
\]
From this relation it follows that $\hat G(z)$ is the ratio of two bounded analytic functions in the half-plane $\eta > 1/(2p)$ and therefore that $\hat G(z)$ is meromorphic in the whole plane, which is true, accordingly, for $\hat H(z)$ also. If $\hat G(z)$ has a pole, it cannot have that pole on the line $\eta = 1/p$, for if there were such a pole, at, say, $z_0 = \xi_0 + (i/p)$, then for $z = \xi_0 + i\eta$ with $\eta = (1/p) - \varepsilon$, $|\hat G(z)| > C/\varepsilon$, contradicting the estimate $|\hat G(z)| < C\varepsilon^{-1/p}$ obtained from Hölder's inequality.
Suppose, then, that $\hat G(z)$ has a pole at $z_0 = \xi_0 + i\eta_0$ with $\eta_0 > 1/p$. Because $\hat H(z)$ is regular in the half-plane, $\hat F(z)$ must have a zero at $z_0$, and this for all $f(x)$ in $A$. It follows that the function $\zeta(-iz_0)\sum a_k\theta_k^{-iz_0}$ vanishes at that point for all admissible choices of the coefficients and the parameters $\theta_k$. Hence, $\zeta(-iz_0)$ vanishes, and therefore $\zeta(s)$ has a zero with real part $> 1/p$.
The argument is therefore complete unless $\hat G(z)$ has no poles at all, that is, is an entire function. We must show that this is impossible. Choose $a_1 = \theta_1 = 1$, $a_2 = -1/\theta$, and $\theta_2 = \theta$ for some $\theta$ in the unit interval. The function $\sum a_k\theta_k^{-iz}$ is then bounded away from 0 in the half-plane $\eta \ge 2$. Moreover, the function $|\zeta(-iz)|$ is also bounded away from 0 in that half-plane; this is a consequence of the inequality
11;
valid for all $s$ with $\mathrm{Re}[s] = \sigma \ge 2$. Hence, from the relation
\[
\hat H(z) = (2\pi)^{1/2}\,\hat G(z)\,\hat F(z)
\]
and the fact that $\hat H(z)$ is bounded in the half-plane $\eta \ge 1$, it follows that the ratio $\hat G(z)/z$ is bounded in the half-plane $\eta \ge 2$. It has already been established that the function $\hat G(z)$ is bounded in any half-plane $\eta \le (1/p) - \varepsilon$ and our argument depends on our showing that $\hat G(z)/z$ is bounded in the strip $(1/p) - \varepsilon \le \eta \le 2$; if this is shown, then the entire function $\hat G(z)$ evidently satisfies an inequality of the form $|\hat G(z)| \le A + B|z|$ and is therefore a polynomial; being bounded in a half-plane the polynomial must be a constant,
and since $\hat G(\xi)$ vanishes at infinity, the constant is 0. Thus, the argument will be complete. To show that $\hat G(z)/z$ is bounded in the strip in question we note that that function is regular in a neighborhood of the strip and is bounded on the two bounding lines. It will be shown that the growth of the function in the strip
h
+
is not too great, more precisely, that IG(t iq)l 6 Celt' for appropriate C > 0 and all q in the interval [(l/p) - E , 21. A form of the Phragmen-Lindelof theorem given in Section 55 will then guarantee that G(z)/z is bounded in the strip, since the hypothesis p 5 2 makes the width of the strip at most 3 x / 2 . h
-=
By hypothesis, $\hat G(z)$ is the ratio of two bounded functions in the half-plane $\eta \ge \varepsilon$; these two functions may be written in canonical form
\[
(2\pi)^{1/2}\,\hat G(z) = \frac{\hat H(z)}{\hat F(z)}
\]
and since $\hat G(z)$ is entire, the zeros occurring in the Blaschke product $B_2(z)$ also appear in $B_1(z)$; thus, the function $B_2(z)$ divides $B_1(z)$. Thus, the ratio is of the form $C_3B_3(z)\exp[\varphi_1(z) - \varphi_2(z)]$ and therefore $|\hat G(z)| \le C\exp[V(z)]$ where $V(z)$ is a positive harmonic function in the half-plane $\eta \ge \varepsilon$. It must be shown that $V(\xi + i\eta)$ increases at most linearly along horizontal lines, and this is a general property of functions positive and harmonic in a half-plane. If $V(\zeta)$ is positive and harmonic in the half-plane, the function $u(z) = V[i(z + 1)/(1 - z)]$ is positive and harmonic in the unit disk, and by Harnack's inequality $u(z) \le 2u(0)/(1 - |z|)$. Accordingly, $V(\zeta) \le 2V(i)/(1 - r)$ where $r = |(\zeta - i)/(\zeta + i)| = |1 - [2i/(\zeta + i)]|$. It follows that for large values of $|\zeta|$, $V(\zeta) \le C|\zeta|$; in particular, there exist constants $A$ and $B$ such that $V(\xi + i\eta) \le A|\xi| + B$ for all points $\zeta$ in the strip $\varepsilon \le \eta \le 2$. Hence, for an appropriate choice of the constant $C$, $|\hat G(\xi + i\eta)| \le Ce^{A|\xi|}$. This inequality is stronger than that required in the Phragmén-Lindelöf theorem, and completes the proof of the Beurling theorem.
The theorem of this section provides a dramatic illustration of the fact that it is always difficult to show that a given set of functions is complete. The spaces $L^p$ are given by a rather abstract definition: all measurable functions for which $\int |f(x)|^p\,dx$ is finite, while the functions occurring in a completeness problem are given concretely; accordingly, some mathematics must be done to show that such a set is complete. This section shows that one of the most famous problems in mathematics is equivalent to a completeness problem in $L^2(0,1)$.
51. Riesz Convexity Theorem
In this section we prove a theorem concerning general linear transformations defined on certain $L^p$ spaces; since the measure spaces are almost arbitrary, we cannot use the theory of distributions. Let $(X, \mu)$ be a measure space and $(Y, \nu)$ another: the letters $S_X$ and $S_Y$ denote the spaces of simple functions on the corresponding measure spaces; thus $f(x)$ is an element of $S_X$ if and only if $f(x)$ is a finite linear combination of characteristic functions of $\mu$-measurable subsets of $X$ of finite measure. It is evident that $S_X$ is a linear space, dense in every $L^p(X, \mu)$ for $1 \le p < \infty$ and it is even dense in $L^\infty(X)$ when $\mu(X) < \infty$. We consider linear transformations $T$ defined on $S_X$ and taking values in the space $M_Y$ of $\nu$-measurable functions on $Y$. Such a transformation is said to be of type $(p, q)$ if and only if it is continuous from $S_X$ to $L^q(Y)$ when $S_X$ is given the topology determined by the norm of $L^p(X)$; thus for all $f$ in $S_X$, $\|Tf\|_q \le C_{pq}\|f\|_p$. When $T$ is of type $(p, q)$, it has a uniquely determined extension to the closure of $S_X$ in $L^p$ which is a continuous mapping of that space into $L^q(Y)$ with the same bound $C_{pq}$. This closure is, of course, all of $L^p(X)$ except in the special case $p = \infty$. When $T$ is of type $(p, q)$, there then exists a constant $C_{pq}$ so that
\[
\Big|\int_Y (Tf)(y)\,g(y)\,d\nu(y)\Big| = |(Tf, g)| \le C_{pq}\,\|f\|_p\,\|g\|_{q'}
\]
(here $q'$ denotes the index conjugate to $q$) for all $f$ in $S_X$ and $g$ in $S_Y$. Moreover, from Hölder's inequality, it follows that this is a sufficient condition for $T$ to be of type $(p, q)$, since the $\nu$-measurable function $(Tf)(y)$ is then evidently in $L^q$ and its norm in that space is at most $C_{pq}\|f\|_p$.
The type-set of such a transformation $T$ is the set of all points in the plane with coordinates $(1/p, 1/q)$ where $T$ is of type $(p, q)$. The important theorem is due to M. Riesz and Thorin.
Theorem (Riesz-Thorin): The type set of $T$ is convex.
PROOF: Let $(1/p_0, 1/q_0)$ and $(1/p_1, 1/q_1)$ belong to the type set of $T$; it is to be shown that $(1/p_t, 1/q_t)$ also belongs to that set where $0 \le t \le 1$, and
\[
1/p_t = t/p_1 + (1 - t)/p_0, \qquad 1/q_t = t/q_1 + (1 - t)/q_0.
\]
Note that $1/q_t' = 1 - 1/q_t = t/q_1' + (1 - t)/q_0'$. The point $(1/p_t, 1/q_t)$ belongs to the type set of $T$ if and only if there exists a constant $C_t$ such that $|(Tf, g)| \le C_t$ for every pair of simple functions $f(x)$ and $g(y)$ for which $\|f\|_{p_t} = \|g\|_{q_t'} = 1$. If these simple functions are written explicitly
\[
f(x) = \sum a_k\chi_k(x), \qquad g(y) = \sum b_l\chi_l(y)
\]
where $\chi_k(x)$ and $\chi_l(y)$ are characteristic functions of the disjoint measurable sets $A_k$ and $B_l$, respectively, then
\[
1 = \|f\|_{p_t} = \sum |a_k|^{p_t}\mu(A_k), \qquad 1 = \|g\|_{q_t'} = \sum |b_l|^{q_t'}\nu(B_l),
\]
while $(Tf, g) = \sum\sum a_kb_l\,(T\chi_k, \chi_l)$. Set $c_k = |a_k|^{p_t}$ and $d_l = |b_l|^{q_t'}$ to obtain $a_k = c_k^{1/p_t}e^{i\theta_k}$ and $b_l = d_l^{1/q_t'}e^{-i\phi_l}$ so that
\[
(Tf, g) = \sum\sum c_k^{1/p_t}\,d_l^{1/q_t'}\,e^{i(\theta_k - \phi_l)}\,(T\chi_k, \chi_l)
\]
as well as $\sum c_k\mu(A_k) = \sum d_l\nu(B_l) = 1$. Next, form the analytic function
\[
H(z) = \sum\sum c_k^{[z/p_1 + (1 - z)/p_0]}\,d_l^{[z/q_1' + (1 - z)/q_0']}\,e^{i(\theta_k - \phi_l)}\,(T\chi_k, \chi_l);
\]
as a finite linear combination of exponentials, this is entire, and it is easy to see that it is bounded on any vertical line in the $z$-plane. Accordingly, by the Lindelöf Three Lines theorem, if $z = t + i\eta$, the function
\[
M(t) = \sup_\eta |H(t + i\eta)|
\]
is finite for all $t$ and is a logarithmically convex function of that variable.
\[
M(0) = \sup_\eta \Big|\sum\sum c_k^{1/p_0}\,d_l^{1/q_0'}\,c_k^{i\eta[1/p_1 - 1/p_0]}\,d_l^{i\eta[1/q_1' - 1/q_0']}\,e^{i(\theta_k - \phi_l)}\,(T\chi_k, \chi_l)\Big|
\]
and since most of the factors are of absolute value $+1$ this may be put more simply
\[
M(0) = \sup_\eta \Big|\sum\sum c_k^{1/p_0}\,d_l^{1/q_0'}\,e^{i[\theta_k' + \phi_l']}\,(T\chi_k, \chi_l)\Big|.
\]
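For any choice of $\eta$ the term on the right-hand side is of the form $|(Tf_1, g_1)|$ where
\[
f_1(x) = \sum c_k^{1/p_0}\,e^{i\theta_k'}\,\chi_k(x) \quad\text{and}\quad g_1(y) = \sum d_l^{1/q_0'}\,e^{i\phi_l'}\,\chi_l(y).
\]
Here $\|f_1\|_{p_0} = \|g_1\|_{q_0'} = 1$, and therefore $M(0) \le C_{p_0q_0}$. Similarly, $M(1) \le C_{p_1q_1}$ and by virtue of the logarithmic convexity of $M(t)$
\[
|(Tf, g)| = |H(t)| \le M(t) \le C_{p_0q_0}^{\,1-t}\,C_{p_1q_1}^{\,t}.
\]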
This completes the proof, since the initial choice of the simple functions $f$ and $g$ was arbitrary, subject only to the conditions $\|f\|_{p_t} = \|g\|_{q_t'} = 1$. It is also clear that more has been proved: the bound $C_{pq}$ for $T$ as a transformation from $L^p(X)$ to $L^q(Y)$ is finite and logarithmically convex on the type set.
When a transformation $T$ is of type $(p_0, q_0)$ and also of type $(p_1, q_1)$ then, as we have remarked, it possesses a well determined extension to a continuous linear transformation of $L^{p_0}(X)$ into $L^{q_0}(Y)$ as well as a continuous extension to a transformation from $L^{p_1}(X)$ to $L^{q_1}(Y)$. It is important to notice that these two extensions are consistent: if a function $f(x)$ belongs to both the spaces $L^p(X)$ for $p = p_0$, $p = p_1$ it can be approximated by a sequence of simple functions $f_n(x)$ in $S_X$ which is Cauchy for both $L^p$ norms and which converges pointwise almost everywhere to $f(x)$. The sequence of images $(Tf_n)(y)$ is then Cauchy for both $L^q$ norms, and an appropriate subsequence converges almost everywhere to a function $g(y)$ which is the limit, in either $L^q$ space, of the sequence $(Tf_n)(y)$. It follows that the transformation $T$ is extended to the union of the spaces $L^p(X)$ where $1/p$ belongs to the projection of the type set on the $1/p$-axis; it takes values in the union of the spaces $L^q(Y)$ where $1/q$ is in the projection of the type set on the $1/q$-axis, and this extended transformation is the real object of interest.
It often happens that a transformation $T$ arises naturally not on the space of simple functions, but rather on a domain which is dense in every $L^p(X)$ for $p < \infty$; the type could then be defined in the same way, $(1/p, 1/q)$ being in the type set if and only if $T$ was a continuous transformation to $L^q(Y)$ when its domain is given the $L^p(X)$ norm. In this case, and only in this case, $T$ would have a continuous extension to the whole space $L^p(X)$ and, what is important here, that extension would be consistently defined on the simple functions, no matter what point in the type set was considered. It follows that the type set is convex in this case as well. In particular, when the measure spaces are subsets of $R^n$ with Lebesgue measure, it is convenient to take the testfunctions $\mathcal{D}$ as the initial domain of the transformation, and to extend the transformation to a union of $L^p$ spaces.
We pass to certain examples. Let $X = Y = R^n$, $d\mu = d\nu =$ Lebesgue measure, and $T$ the transformation defined on the testfunctions by Fourier transformation:
It is obvious that $T$ is linear. Because of the Parseval equality this is a unitary transformation in $L^2$, hence $(1/2, 1/2)$ belongs to the type set of $T$. The Riemann-Lebesgue lemma guarantees that if $f$ is in $L^1(R^n)$, its Fourier transform is a continuous function vanishing at infinity, and therefore an element of $L^\infty(R^n)$, and, indeed, $\|\hat f\|_\infty \le \|f\|_1$. This means that the point $(1, 0)$ belongs to the type set of $T$. Thus the line segment connecting those points also belongs to the type set of $T$ and we obtain a theorem due to Titchmarsh.
Theorem (Titchmarsh): The Fourier transformation is a continuous linear transformation from $L^p(R^n)$ to $L^{p'}(R^n)$ for all $p$ in the interval $1 \le p \le 2$.
It should be noticed that the extension of $T$ from the testfunctions to the union $L^1(R^n) \cup L^2(R^n)$ actually does coincide with the Fourier transform as we have defined it; the Fourier transform is a continuous mapping of the space of temperate distributions on itself, and if a sequence of testfunctions converge in $L^p$ to some limit, they surely converge as temperate distributions.
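The interpolation step behind the Titchmarsh theorem can be made concrete with a short sketch. The helper names `interpolate` and `interpolated_bound` are illustrative only; the sketch assumes nothing beyond the two endpoints named above, $(1/2, 1/2)$ and $(1, 0)$, and the logarithmically convex bound $C_t \le C_0^{1-t}C_1^{t}$ from the Riesz-Thorin proof.

```python
from fractions import Fraction

def interpolate(point0, point1, t):
    """Point (1/p_t, 1/q_t) on the segment joining two points of the type set."""
    (u0, v0), (u1, v1) = point0, point1
    return (t * u1 + (1 - t) * u0, t * v1 + (1 - t) * v0)

def interpolated_bound(c0, c1, t):
    """Logarithmically convex bound C_t <= C_0**(1-t) * C_1**t."""
    return c0 ** (1 - t) * c1 ** t

# Endpoints for the Fourier transform: Parseval and Riemann-Lebesgue.
parseval = (Fraction(1, 2), Fraction(1, 2))
riemann_lebesgue = (Fraction(1), Fraction(0))

for t in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)):
    inv_p, inv_q = interpolate(parseval, riemann_lebesgue, t)
    # Every interpolated point satisfies 1/p + 1/q = 1, that is, q = p'.
    print(f"t={t}: 1/p={inv_p}, 1/q={inv_q}, sum={inv_p + inv_q}")
```

With both endpoint bounds equal to 1, as in the inequalities quoted in the text, `interpolated_bound(1.0, 1.0, t)` stays equal to 1 along the whole segment.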
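It is worth showing that the type set of the Fourier transform is exactly the line segment determined above. If $T$ is of type $(p, q)$, then for all testfunctions $\varphi(x)$ in $\mathcal{D}$, $\|\hat\varphi\|_q \le C\|\varphi\|_p$ for some suitable constant $C$. We pass to the testfunction $(\varphi \circ I_\varepsilon)(x) = \varphi(\varepsilon x)$ which has the Fourier transform $\varepsilon^{-n}\hat\varphi(\xi/\varepsilon)$, and infer that
\[
\|(\varphi \circ I_\varepsilon)^{\wedge}\|_q = \varepsilon^{-n + n/q}\,\|\hat\varphi\|_q \le C\,\|\varphi \circ I_\varepsilon\|_p = C\,\varepsilon^{-n/p}\,\|\varphi\|_p.
\]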
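The two scaling exponents in the display above can be sanity-checked in closed form. The sketch below assumes $n = 1$, the Gaussian $\varphi(x) = e^{-x^2/2}$ (not a testfunction in $\mathcal{D}$, but convenient because its norms are explicit), and the normalization in which this Gaussian is its own Fourier transform; the names `gaussian_norm` and `ratio` are illustrative.

```python
import math

def gaussian_norm(scale, r):
    """L^r(R) norm of x -> exp(-(x/scale)**2 / 2)."""
    return (scale * math.sqrt(2 * math.pi / r)) ** (1 / r)

def ratio(eps, p, q):
    """||(phi o I_eps)^||_q / ||phi o I_eps||_p for phi(x) = exp(-x**2/2), n = 1."""
    # phi(eps*x) is a Gaussian of width 1/eps; its transform is
    # eps**-1 times a Gaussian of width eps (assumed normalization).
    norm_hat = gaussian_norm(eps, q) / eps
    norm_phi = gaussian_norm(1 / eps, p)
    return norm_hat / norm_phi

for eps in (1e-4, 1.0, 1e4):
    # On the line 1/p + 1/q = 1 the ratio does not depend on eps;
    # off that line it behaves like eps**(1/p + 1/q - 1).
    print(eps, ratio(eps, 4 / 3, 4), ratio(eps, 2, 4))
```

The second column stays constant, while the third is unbounded as `eps` tends to 0, so no constant $C$ can serve for the pair $p = 2$, $q = 4$.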
Since this inequality must hold for all positive $\varepsilon$, it follows that the exponents on either side are equal, that is, $(1/p) + (1/q) = 1$. Thus, the type set is a subset of the line determined by that equation. On the other hand, the type set cannot contain a pair $(1/p, 1/p')$ where $p > 2$, for in this case, the Fourier transform would be a one-to-one continuous and invertible transformation between $L^p(R^n)$ and $L^{p'}(R^n)$ since the pair $(1/p', 1/p)$ already belongs to the type set and the square of the Fourier transform is the operation of reflection $(f \to \check f)$ which is surely an isometry in $L^p$. Hence, the type set of the Fourier transform is determined.
Let $X$ be the interval $[0, 2\pi]$ and $\mu$ the Lebesgue measure; let $Y$ be the space of integers with the counting measure $\nu$. We study the transformation $T$ defined on the simple functions on $X$, carrying each such function into the sequence of its Fourier coefficients:
\[
Tf = \{c_k\}, \qquad c_k = (2\pi)^{-1/2}\int_0^{2\pi} e^{-ikx}f(x)\,dx.
\]
The sequence $c_k$ is to be regarded as an element of $L^q(Y, \nu)$, a space usually written $l^q$. The inequality $|c_k| \le (2\pi)^{-1/2}\int |f(x)|\,dx$ may be written $\|c_k\|_\infty \le (2\pi)^{-1/2}\|f\|_1$ and this shows that the point $(1, 0)$ is in the type set of $T$. Similarly, the Parseval equation $\sum |c_k|^2 = \int |f(x)|^2\,dx$ may be written $\|c_k\|_2 = \|f\|_2$ which means that the point $(1/2, 1/2)$ is in the type set. An obvious inequality shows that $(0, 0)$ is in the type set with $\|c_k\|_\infty \le (2\pi)^{1/2}\|f\|_\infty$ and finally, the fact that $\|f\|_2 \le \sqrt{2\pi}\,\|f\|_\infty$ with the Parseval equation means that $\|c_k\|_2 \le \sqrt{2\pi}\,\|f\|_\infty$ and therefore that $(0, 1/2)$ is in the type set of $T$. Hence the type set contains the convex hull of the four points $(1, 0)$, $(1/2, 1/2)$, $(0, 1/2)$, and $(0, 0)$, and although we do not show it, the type set consists of exactly this closed, convex set (see Fig. 9). The result is again a well-known theorem:
Fig. 9a. Titchmarsh.
Fig. 9b. Hausdorff-Young.
Theorem (Hausdorff-Young): For $1/q \le \min[1/2, 1/p']$,
\[
\|c_k\|_q \le \sqrt{2\pi}\,\|f\|_p
\]
and in the special case $1/q = 1/p'$, $1 \le p \le 2$,
\[
\|c_k\|_{p'} \le \|f\|_p.
\]
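A small numerical illustration of these inequalities, under stated assumptions: $f$ is taken to be the characteristic function of $[0, a]$ inside $[0, 2\pi]$ (an arbitrary choice made here), for which $|c_k| = (2\pi)^{-1/2}\,|2\sin(ka/2)/k|$, and the $l^q$ sums are truncated at a hypothetical cutoff `k_max`, so they are lower bounds for the true norms. The function names are illustrative.

```python
import math

def coefficient_modulus(k, a):
    """|c_k| for f = indicator of [0, a], c_k = (2*pi)**-0.5 * integral of exp(-i*k*x) * f(x)."""
    if k == 0:
        return a / math.sqrt(2 * math.pi)
    return abs(2 * math.sin(k * a / 2) / k) / math.sqrt(2 * math.pi)

def lq_norm(a, q, k_max=100_000):
    """Truncated l^q norm of the coefficient sequence over |k| <= k_max."""
    total = coefficient_modulus(0, a) ** q
    total += 2 * sum(coefficient_modulus(k, a) ** q for k in range(1, k_max + 1))
    return total ** (1 / q)

def lp_norm_indicator(a, p):
    """L^p norm of the indicator of an interval of length a."""
    return a ** (1 / p)

a = 1.0
print("Parseval:", lq_norm(a, 2) ** 2, "vs", lp_norm_indicator(a, 2) ** 2)
print("p = 4/3, q = 4:", lq_norm(a, 4), "<=", lp_norm_indicator(a, 4 / 3))
```

The first line should show the truncated sum approaching $\|f\|_2^2 = a$ from below; the second compares $\|c_k\|_{p'}$ with $\|f\|_p$ for one conjugate pair.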
To check that the constants in these inequalities are correct, we note that the bound $C_{pq}$ associated with $T$ is at most $\sqrt{2\pi}$ at the four corner points; from its convexity as a function of $(1/p, 1/q)$ it follows that it is uniformly bounded on the type set by $\sqrt{2\pi}$. On the segment where $1/q = 1/p'$, the bound is at most 1.
One can obtain results complementary to the Hausdorff-Young theorem by studying the transformation $T'$, defined on the simple functions of the measure space $X =$ integers with counting measure, and taking values in the space of measurable functions $f(y)$ on $Y$, the interval $[0, 2\pi]$ with Lebesgue measure. The transformation is defined as follows: the simple function, which is here a finitely nonzero sequence of coefficients $c_k$, is carried into the trigonometric polynomial
\[
T'(c_k)(y) = (2\pi)^{-1/2}\sum c_k e^{iky} = f(y);
\]
evidently $T'$ is a sort of summability method applied to the trigonometric series with coefficients in $l^p$. As before the Parseval equation guarantees that the point $(1/2, 1/2)$ is in the type set and $\|f\|_2 = \|c_k\|_2$. Moreover, since $|f(y)| \le (2\pi)^{-1/2}\sum |c_k|$ it follows that $\|f\|_\infty \le (2\pi)^{-1/2}\|c_k\|_1$ and therefore that $(1, 0)$ is in the type set of $T'$. The same argument shows that $(1, 1)$ is in the type set: $\|f\|_1 \le \sqrt{2\pi}\,\|c_k\|_1$, while the inequality $\|f\|_1 \le \sqrt{2\pi}\,\|f\|_2$ and the Parseval equation makes $\|f\|_1 \le \sqrt{2\pi}\,\|c_k\|_2$, putting the point $(1/2, 1)$ in the type set of $T'$. The type set therefore contains the convex hull of the four points $(1, 0)$, $(1, 1)$, $(1/2, 1)$, and $(1/2, 1/2)$, and actually is exactly that set. The corresponding theorem follows.
Theorem:
For $1 \le p \le 2$ and $1/q \ge 1/p' = 1 - 1/p$,
\[
\|f\|_q \le \sqrt{2\pi}\,\|c_k\|_p
\]
and in the special case $q = p'$,
\[
\|f\|_{p'} \le \|c_k\|_p.
\]
52. The Salem Example
The Titchmarsh theorem, established in the previous section, shows that an $L^p$ function has a Fourier transform in $L^{p'}$ provided that $1 \le p \le 2$; if the same result were true for some $p > 2$ then the Fourier transform would be continuous from $L^p$ to $L^{p'}$ and would have a continuous inverse, namely, the Fourier transform followed by reflection. These transformations being one-to-one, it would follow that the $L^p$ and $L^{p'}$ norms were equivalent, and therefore that $p = 2$. Accordingly, for every $q > 2$, there exist functions $g(x)$ in $L^q(R^n)$ for which $\hat g(\xi)$
to all $L^p$ classes such that $f(t)$ belongs to $L^q$ only for $q \ge 2$. The kernel of the argument is contained in the following lemma due to van der Corput.
Lemma (van der Corput): Let $k(x)$ be a $C^2$ convex function defined on the interval $(a, b)$ for which $k''(x) \ge \rho^2$ for some positive $\rho$; then
\[
\Big|\int_a^b e^{ik(x)}\,dx\Big| \le \frac{8}{\rho}.
\]
(Note that the estimate is independent of the length of the interval.)
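As a numerical illustration of the lemma, here is a minimal sketch assuming the quadratic phase $k(x) = (\rho x)^2/2$, for which $k''(x) = \rho^2$ exactly, and a plain midpoint rule; the function name `oscillatory_integral`, the interval $(-5, 8)$ and the step count are choices made here, not taken from the text.

```python
import cmath

def oscillatory_integral(phase, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(i*phase(x)) over (a, b)."""
    h = (b - a) / steps
    return h * sum(cmath.exp(1j * phase(a + (j + 0.5) * h)) for j in range(steps))

rho = 3.0

def phase(x):
    # Convex phase with second derivative exactly rho**2.
    return 0.5 * (rho * x) ** 2

value = oscillatory_integral(phase, -5.0, 8.0)
print(abs(value), "<=", 8 / rho)
```

Lengthening the interval leaves the modulus of the integral near $\sqrt{2\pi}/\rho$, in line with the remark that the estimate does not depend on $b - a$.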
PROOF: Since the interval $(a, b)$ may be decomposed into the union of two intervals on which $k(x)$ is monotone, there is no loss of generality in supposing that $k(x)$ is monotone increasing, and proving the inequality with 4 in place of 8. Consider first the interval for which $k'(x) \ge m$ for some positive $m$, and introduce the change of variable $\lambda = k(x)$. Since the function is monotone, there is an inverse function $x = \psi(\lambda)$, and $\psi'(\lambda) = 1/k'(x) \le 1/m$. It is also clear that $\psi'(\lambda)$ is monotone. We have, therefore, on that interval
\[
\int e^{ik(x)}\,dx = \int e^{i\lambda}\psi'(\lambda)\,d\lambda
\]
which is bounded in absolute value by $3/m$. On the other hand, since the interval on which $0 \le k'(x) \le m$ has length at most $m/\rho^2$, the integral is bounded by $(3/m) + (m/\rho^2)$; putting $m = \sqrt{3}\,\rho$ we finally obtain the bound $2\sqrt{3}/\rho < 4/\rho$ as required.
Now let $h(x)$ be a smooth function on the real axis, vanishing for $x < 0$ and equal to
\[
\frac{e^{ix\log x}}{\sqrt{x}\,(\log x)^2}
\]
for $x > 2$;
this function belongs to $L^q$ if and only if $q \ge 2$. Its Fourier transform $\hat h(\xi)$ is therefore in $L^2$, and we shall show that it is both bounded and integrable. This will put $\hat h(\xi)$ in every $L^p$ class. To show the boundedness, we note that $\hat h(\xi)$ is of the form $E(\xi) + G(\xi)$ where $E(\xi) = (1/\sqrt{2\pi})\int_0^2 e^{-i\xi x}h(x)\,dx$ is a bounded function vanishing at infinity and $G(\xi)$ is the limit in mean (that is, in $L^2$) of the functions
\[
\hat h_N(\xi) = \frac{1}{\sqrt{2\pi}}\int_2^N \frac{e^{i(x\log x - \xi x)}}{\sqrt{x}\,(\log x)^2}\,dx.
\]
We put
and write
to obtain
From van der Corput's lemma, $|S(x)| \le 8\sqrt{x}$, and therefore
It follows that there is a uniform bound for the functions $\hat h_N(\xi)$ and therefore that $\hat h(\xi)$ is bounded. Hence to show that it is integrable, we need only consider its behavior at infinity. However, the smooth function $h(x)$ has a derivative which diminishes like $1/(\sqrt{x}\,\log x)$ at infinity, and hence is in $L^2$. Accordingly, its Fourier transform $i\xi\hat h(\xi)$ is also in that class, and $\hat h(\xi) = g(\xi)/\xi$ for large $|\xi|$ where $g(\xi)$ is in $L^2$. This means that $\hat h(\xi)$ is integrable at infinity, hence, since it is bounded, a function in $L^1$. We now put $f(x) = \hat h(x)$ to obtain a function in all $L^p$ classes with a Fourier transform only in $L^q$ for $q \ge 2$.
Our next example is considerably more complicated. For $\alpha$ in the interval $(0, 1)$ and $q > 2/\alpha$ we shall show the existence of a compact subset $K$ of the unit interval having Hausdorff dimension $\alpha$ supporting a positive Radon measure $\mu$ such that the Fourier transform $\hat\mu(\xi)$ belongs to $L^q$. Of course $\hat\mu(\xi)$ is a function of positive type, hence bounded, and also the restriction to $R^1$ of an entire function of exponential type, and since $K$, the support of $\mu$, has Hausdorff dimension smaller than 1, $K$ is a set of Lebesgue measure 0 and $\mu$ a singular measure.
It is first necessary to make certain preliminary remarks. A system of real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ is independent if it is linearly independent over the field of rational numbers, that is to say, from the equation $\sum k_i\lambda_i = 0$ and the hypothesis that the coefficients $k_i$ are integers, we may infer that $k_i = 0$ for all $i$. It is clear that if we are given finitely many numbers $\lambda_i$, the set of all sums $\sum n_i\lambda_i$ with integer coefficients is countable, and therefore the complementary set is everywhere dense. We can therefore approximate with arbitrary precision any finite set $b_k$ with equally many $a_k$ so that the $a_k$ are independent. Now let the $N$ numbers $a_1, a_2, \ldots, a_N$ be independent, and consider the trigonometric polynomial
\[
P(t) = \frac{1}{N}\sum_{k=1}^{N} e^{ia_kt}.
\]
We have then a lemma due to R. Salem.
Lemma (Salem): The frequencies being independent and $r \ge 2$ there exists a constant $T_0$ such that for all real $b$ and $T \ge T_0$,
\[
\frac{1}{T}\int_b^{b+T} |P(t)|^r\,dt \le N^{-r/2}\,[(r/2) + 1]^{r/2}.
\]
PROOF:
We suppose that $r$ is an even integer $r = 2l$. Now
\[
|P(t)|^{2l} = N^{-2l}\Big[\sum \exp[-ita_k]\Big]^{l}\Big[\sum \exp[+ita_k]\Big]^{l} = N^{-2l}\Big[\sum_{|m| = l}\Big(\frac{l!}{m!}\Big)^{2} + \sum_j c_je^{i\lambda_jt}\Big]
\]
where $m = (m_1, \ldots, m_N)$ runs over the multi-indices with $|m| = l$, $m! = m_1!\cdots m_N!$, and where the frequencies $\lambda_j$ occurring in the second sum do not vanish. Hence, if we average over a long interval, the average of the second term tends to zero as the length of the interval increases, independently of its position on the axis. Accordingly, uniformly in $b$, as $T$ increases,
\[
\frac{1}{T}\int_b^{b+T} |P(t)|^{2l}\,dt \quad\text{converges to}\quad N^{-2l}\sum_{|m| = l}\Big(\frac{l!}{m!}\Big)^{2}
\]
and since
\[
\sum_{|m| = l}\Big(\frac{l!}{m!}\Big)^{2} \le l!\sum_{|m| = l}\frac{l!}{m!} = l!\,N^{l}
\]
this means that for $T$ large enough, the average is smaller than $l!\,N^{-l} < (l/N)^{l}$. The quantity $\big(\int_b^{b+T}|P(\xi)|^p\,d\xi/T\big)^{1/p}$ may be written $\|P\|_p$ since it is the $L^p$-norm of a bounded function on a finite interval relative to a measure of total mass 1; we have shown that $\|P\|_{2l} \le \sqrt{l/N}$ when $l$ is a positive integer. It is important to notice that $\|P\|_p$ is a logarithmically convex function of the variable $1/p$; this is really contained in the proof of the Riesz convexity theorem, but we prove it independently. We have
the supremum being taken over all positive, simple functions
4(t) for which
Jbb+'4(T)d
+
whence
In Section 6 we gave a recipe for the construction of a subset $K$ of the interval $[0, 1]$ which was closed and had the Hausdorff dimension $\alpha$. This set was the intersection of a sequence of sets $K_n$, where $K_n$ consisted of $N^n$ intervals of length $\eta^n$, the number $\eta$ being so chosen that $N\eta^{\alpha} = 1$. Our present construction is virtually the same, only we vary the choice of the width $\eta$ as follows. We select $\eta > 0$ and an integer $N$ so that $N\eta^{\alpha} = 1$ and then consider a sequence of positive numbers $\eta_n$ converging to $\eta$. We shall also require $\eta_n \le \eta$ for all $n$. It will be necessary then to take a system of $N$ numbers $a_k$ in the unit interval which are independent, so that the previous lemma can be invoked, and which are sufficiently widely spaced. The first set of the sequence, $K_1$, consists of $N$ intervals of length $\eta_1$, of the form $[a_k, a_k + \eta_1]$. The second set, $K_2$, is obtained from $K_1$, only now the factor $\eta_2$ is used: we obtain $N^2$ intervals of length $\eta_1\eta_2$ of the form $[a_i + a_j\eta_1,\ a_i + a_j\eta_1 + \eta_1\eta_2]$. Inductively, then, we obtain a sequence of sets $K_n$, the $n$th set consisting of $N^n$ intervals of length $\eta_1\eta_2\eta_3\cdots\eta_n$. Only a slight modification of the argument of Section 6 is needed to show that the intersection of this sequence of sets is a set $K$ of Hausdorff dimension $\alpha$. The computation of Section 38 shows that there is associated with this set a positive measure $\mu$ of finite total mass; $K$ supports $\mu$ and the Fourier transform of $\mu$ is the infinite product
where
Since it is exceedingly difficult to compute with products of the type just described we consider no single such product, but rather a whole probability space of them. For this purpose let $\Omega$ denote the space of all sequences of numbers $\eta_n$, where $n \ge 1$ and
for all $n$; this is the direct product of intervals $I_n$ of length $\eta(n + 1)^{-2}$. On this space we introduce the usual product (or probability) measure $d\omega$, so that a set described by a finite number of inequalities
\[
b_{n_i} < \eta_{n_i} < c_{n_i}, \qquad i = 1, 2, \ldots, k,
\]
has the $\omega$-measure
The measure of the whole space is 1, and smooth functions of different coordinates, say $f(\eta_j)$ and $g(\eta_m)$, will be independent random variables, and the integral of their product is the product of their integrals. More generally, given a continuous function of the first $k$ coordinates $F(\eta_1, \eta_2, \ldots, \eta_k)$ we have
\[
\int_\Omega F(\eta_1, \eta_2, \ldots, \eta_k)\,d\omega = \eta^{-k}\prod_{j=1}^{k}(j + 1)^2\int\!\cdots\!\int F(\eta_1, \eta_2, \ldots, \eta_k)\,d\eta_1\,d\eta_2\cdots d\eta_k,
\]
the integral on the right being the usual repeated Riemann integral over the intervals $I_j$.
Let $t$ be the generic point of the probability space $\Omega$. Associated with each $t$ is a sequence $\eta_n$ and a perfect subset $K_t$ of the interval $[0, 1]$. $K_t$ has Hausdorff dimension $\alpha$ and supports a positive measure $\mu_t$ having the Fourier transform
Our object is to show that for a given $q > 2/\alpha$ almost all $\hat\mu_t(\xi)$ belong to $L^q(R^1)$ and this fact will be an obvious consequence of the finiteness of the integral
From the Fubini-Tonelli theorem, it follows that it will be enough to show that the function
\[
H(\xi) = \int_\Omega |\hat\mu_t(\xi)|^q\,d\omega(t)
\]
is integrable over the real axis. Since every measure $\mu_t$ has total mass 1 the functions $\hat\mu_t(\xi)$ are uniformly bounded in $t$ and $\xi$; it is therefore only necessary to investigate $H(\xi)$ for large $|\xi|$ and we shall show that it diminishes like $|\xi|^{-1-\varepsilon}$ for a certain positive $\varepsilon$. If the real $\xi$ is fixed, we select a positive integer $n = n(\xi)$ and write the easy estimate
\[
|\hat\mu_t(\xi)|^q \le \prod_{k=1}^{n} |P(\xi\eta_1\eta_2\cdots\eta_k)|^q
\]
to obtain
This integral is written as a repeated integral, the integration relative to the final variable $\eta_n$ being distinguished
\[
\eta^{-n}\prod_{j=1}^{n}(j + 1)^2\int\!\cdots\!\int\prod_{k=1}^{n-1}|P(\xi\eta_1\cdots\eta_k)|^q \times \int |P(\xi\eta_1\cdots\eta_{n-1}\eta_n)|^q\,d\eta_n\,d\eta_{n-1}\cdots d\eta_1.
\]
Now change variables in the integral relative to $\eta_n$, putting $t = c\eta_n$ where $c = |\xi|\eta_1\eta_2\cdots\eta_{n-1}$ to obtain $[(n + 1)^2/\eta c]\int |P(t)|^q\,dt$, the integral being taken over an interval of length $T = \eta c(n + 1)^{-2}$. It follows that if $n = n(\xi)$ can be taken so that $T \ge T_0$, the lemma of Salem guarantees that the integral is bounded by $Q = N^{-q/2}[(q/2) + 1]^{q/2}$, a quantity which is of course independent of the other coordinates $\eta_k$. We must therefore require that $T_0$ be smaller than $(n + 1)^{-2}|\xi|\,\eta\,\eta_1\eta_2\cdots\eta_{n-1}$ and if this is the case, we repeat the foregoing argument to estimate the integral relative to $\eta_{n-1}$; this is legitimate since $T_0$ is surely smaller than $n^{-2}|\xi|\,\eta\,\eta_1\eta_2\cdots\eta_{n-2}$; accordingly, after two integrations, we have $Q^2$ as a bound. In a similar way, the other variables are integrated out of the formula and $H(\xi) \le Q^{n(\xi)}$ provided that $n(\xi) = n$ is so chosen that $(n + 1)^2T_0 \le |\xi|\,\eta\,\eta_1\eta_2\cdots\eta_{n-1}$
and the last factor is greater than the convergent product P = We shall therefore require that n = n ( t ) be so chosen that
2 log(n(t)
+ 1) + log
25
The
nr= 1 - (1/k2)). 2(
+
log(
To infer that H ( < ) S ltl-l-e for some small positive E we must evidently arrange matters so that Q < 1 and n(<) log Q 5 -(1 + E ) log1<1. This leads to the inequality 1 + E
2llog QI 10g"l
*
The other inequality which n(t) must satisfy may be written
Obviously, it is enough to have these inequalities satisfied for sufficiently large $|\xi|$; if, now, we ignore the condition that $n(\xi)$ take integer values and take $n(\xi) = m\log|\xi|$ for a suitable constant $m > 0$, the inequalities reduce to
\[
\frac{1 + \varepsilon}{|\log Q|} < m < \frac{1}{|\log\eta|}
\]
and a solution can easily be found if $\log Q < \log\eta$, that is, if
\[
q\log[(q/2) + 1] < [q - (2/\alpha)]\log N.
\]
Here we have used the fundamental fact that $N\eta^{\alpha} = 1$. The hypothesis $q > 2/\alpha$ now makes it clear that if $N$ is taken sufficiently large throughout the construction, two positive constants $m' < m''$ exist so that $m'\log|\xi|$ and $m''\log|\xi|$ satisfy both inequalities for large enough $|\xi|$. Since the difference $(m'' - m')\log|\xi|$ converges to infinity, it finally follows that an integer valued function $n(\xi)$ may be defined satisfying both inequalities for large $|\xi|$.
The argument is now essentially complete, and the construction should be carried out in the following way. The numbers $q$ and $\alpha$ being given, we first select the positive integer $N$ so large that $q\log[(q/2) + 1] < [q - (2/\alpha)]\log N$. The small positive $\eta$ is then determined from the equation $N\eta^{\alpha} = 1$. Next, the $N$ independent numbers $a_k$ are chosen in $[0, 1]$ widely enough spaced so that the intervals $[a_k, a_k + \eta]$ are disjoint. Then for each $t = \{\eta_n\}$ in $\Omega$, the corresponding perfect set $K_t$ and the measure $\mu_t$ are constructed. The foregoing argument shows that for all $t$, except those in a set of probability measure 0, the Fourier transform $\hat\mu_t(\xi)$ is in $L^q$.
We remark finally that Salem showed even more, namely, the existence of such a measure $\mu$ supported by a set of Hausdorff dimension $\alpha$ with a Fourier transform $\hat\mu(\xi)$ belonging to every $L^q$ with $q > 2/\alpha$. The construction is laborious. From the results of Section 58 it will become clear that no better result could be expected.
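The first two steps of this recipe (the choice of $N$ and of $\eta$) can be sketched as follows. This is only an illustration under stated assumptions: the function name `salem_parameters` is hypothetical, the smallest admissible $N$ is taken although the text only asks for $N$ "so large", and $Q = N^{-q/2}[(q/2) + 1]^{q/2}$ is the bound quoted from Salem's lemma.

```python
import math

def salem_parameters(q, alpha):
    """Return (N, eta, Q) with N the least integer >= 2 satisfying
    q*log(q/2 + 1) < (q - 2/alpha)*log(N), eta from N*eta**alpha = 1,
    and Q = N**(-q/2) * (q/2 + 1)**(q/2)."""
    if not (0 < alpha < 1 and q > 2 / alpha):
        raise ValueError("need 0 < alpha < 1 and q > 2/alpha")
    lhs = q * math.log(q / 2 + 1)
    # log(N) must exceed lhs / (q - 2/alpha); floor(...) + 1 is the least such integer.
    n = max(2, math.floor(math.exp(lhs / (q - 2 / alpha))) + 1)
    eta = n ** (-1 / alpha)
    big_q = (q / 2 + 1) ** (q / 2) * n ** (-q / 2)
    return n, eta, big_q

N, eta, Q = salem_parameters(q=5.0, alpha=0.5)
print(N, eta, Q)
```

For any admissible pair the returned values satisfy $Q < \eta < 1$, which is the condition $\log Q < \log\eta$ needed for the two inequalities on $n(\xi)$ to be compatible. When $q$ is close to $2/\alpha$ the required $N$ grows very quickly.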
53. Convolution Operators
In this section we investigate linear transformations $T$ from $L^p(R^n)$ to $L^q(R^n)$ which are continuous and commute with translation, that is,
\[
T(\tau_hu) = \tau_hT(u)
\]
for every $u$ in $L^p$ and every $h$ in $R^n$. We will find that such transformations are convolutions and that the corresponding type sets have special properties. It is first convenient to establish the following lemma, typical of a whole family of results concerning the local behavior of distributions.
Lemma: Let $f$ be a distribution in $R^n$ which has the property that every derivative $D^{\alpha}f$ belongs locally to $L^p$ for $|\alpha| \le n$; then $f$ coincides almost everywhere with a continuous function $f(x)$ and there exists a constant $C$ so that
PROOF: We first suppose that $f$ has compact support, and, indeed, that its support is the unit ball $|x| \le 1$. Let $H(x)$ be the characteristic function of the positive cone in $R^n$, that is, the set where all of the coordinate functions are positive. It is easy to verify that $H(x) = Y(x_1)Y(x_2)\cdots Y(x_n)$ where $Y$ is the Heaviside function, and that $H(x)$ is a fundamental solution for the mixed differential operator of order $n$:
\[
\frac{\partial^nH}{\partial x_1\,\partial x_2\cdots\partial x_n} = D^{\beta}H = \delta, \qquad \beta = [1, 1, \ldots, 1].
\]
Accordingly $f = \delta * f = D^{\beta}H * f = H * D^{\beta}f$ and this last term is the convolution of a bounded function with an integrable function with compact support. Obviously this convolution is continuous and coincides with $f$ almost everywhere. We have $\|f\|_\infty \le \|H\|_\infty\,\|D^{\beta}f\|_1 = \|D^{\beta}f\|_1$ and because of Hölder's inequality this is smaller than $C\|D^{\beta}f\|_p$ where $C$ depends only on the measure of the unit ball.
More generally, then, given $f$ as in the lemma, we consider the product $\chi(x)f(x)$ where the testfunction $\chi$ is supported by the unit ball and equals $+1$ in a neighborhood of the origin; now $\chi f$ may be taken as a continuous function if $f$ is corrected on a set of measure 0 and
From the Leibnitz formula it then follows that there exists another constant C which depends essentially only on the choice of the multiplier x so that
for all $x$ near the origin. The assertion of the lemma is an obvious consequence of this. It also becomes clear that a distribution which has all of its derivatives locally in some $L^p$ class is necessarily a $C^\infty$-function. A linear transformation $T$ which carries $L^p(R^n)$ into $L^q(R^n)$ and commutes with translation must carry difference quotients into difference quotients:

$$T\left(\frac{\tau_h u - u}{|h|}\right) = \frac{\tau_h T(u) - T(u)}{|h|}.$$
Now if $u$ is a testfunction, the difference quotients may be so taken that they converge as testfunctions to a first derivative of $u$, and therefore converge to $Du$ in $L^p$. It will follow that the difference quotients of $T(u)$ will converge in $L^q$ and therefore that they surely converge as distributions; they converge to $DT(u)$ which must then be an element of $L^q(R^n)$. This argument may be repeated for the higher derivatives, and we infer that $T(u)$ and all of its derivatives are in $L^q$. This means that the transformation $T$, when restricted to the testfunctions, maps $\mathcal{D}$ into $\mathcal{E}$ in a translation-invariant way. Moreover, if a sequence $u_k$ of testfunctions converges in $\mathcal{D}$ the images $T(u_k)$ converge not only in $L^q$ but also uniformly on compacts, in view of the inequality of the lemma. Thus $T$ is a sequentially continuous map from $\mathcal{D}$ to $\mathcal{E}$ and the theorem of Section 25 may be invoked. It follows that there exists a distribution (which we denote by the same letter) so that the transformation is a convolution:
$$T(u) = T * u,$$

and this convolution makes sense for all testfunctions $u$, and also for all $u$ in $L^p$ with compact support. For the other elements of $L^p$ we will simply define the convolution to be $T(u)$. The class of operators obtained in this way we call convolution operators. It is important to note that for the study of the type set of such convolution operators it is sufficient to consider $T * u$ only for functions $u$ which are testfunctions or which have compact support. The type set of $T$ is a convex subset of the square $0 \le 1/p \le 1$, $0 \le 1/q \le 1$ by virtue of the Riesz-Thorin theorem. We distinguish the diagonals of that square and, borrowing a term from heraldry, call the dexter diagonal the one given by the equation $1/p = 1/q$; the other diagonal is the sinister one and is described by $1/q = 1 - (1/p) = 1/p'$. These diagonals are shown in Fig. 10. Evidently, points on the dexter diagonal correspond to transformations of $L^p$ into itself, while those on the sinister diagonal refer to transformations from a space to its conjugate. Most of the following theorem is due to Hörmander.
Theorem (Hörmander): The type set of a convolution operator $T$ contains no point above the dexter diagonal and is symmetric about the sinister diagonal.

PROOF: From the identity $(T * u)^{\vee} = \check T * \check u$ and the equation $\|\check u\|_p = \|u\|_p$ it is obvious that $\check T$ (convolution with the reflected distribution) and $T$ have the same type set. Now a point $(1/p, 1/q)$ belongs to the type set of $T$ if and only if there exists a constant $C$ so that for all testfunctions $u$ and $v$

$$|(T * u, v)| \le C\|u\|_p\|v\|_{q'}.$$

We have

$$(T * u, v) = (T * u * \check v)(0) = (T * u * \check v)^{\vee}(0) = (\check T * \check u * v)(0) = (\check T * v, u)$$

and so

$$|(\check T * v, u)| \le C\|v\|_{q'}\|u\|_p;$$

this means that the point $(1/q', 1/p')$ is in the type set of $\check T$ and therefore in the type set of $T$. Accordingly, the type set is symmetric about the sinister diagonal.
Fig. 10. Young's convolution inequality.
If the type set of $T$ contains a point $(1/p, 1/q)$ above the dexter diagonal it then contains such a point not on the boundary of the square in view of the convexity and symmetry. Accordingly, we may suppose that $1 < q < p < \infty$ and that $C$ is a bound for the transformation from $L^p$ to $L^q$. For any testfunction $u$ and large $|h|$, we put $u_h = u + \tau_h u$ and note that $\|u_h\|_p^p = 2\|u\|_p^p$. Now $T * u_h = T * u + \tau_h T * u$ and as $|h|$ increases the quantity $\|T * u_h\|_q^q$ converges to $2\|T * u\|_q^q$. Accordingly,

$$\|T * u\|_q \le \limsup_{h} \|T * u_h\|_q\, 2^{-1/q} \le C\, 2^{(1/p)-(1/q)}\|u\|_p = Ck\|u\|_p$$

where $k = 2^{(1/p)-(1/q)} < 1$. Since $u$ was arbitrary, it follows that $Ck$ is also a bound for the transformation, and the same argument then gives the bound $Ck^2$. Since the bounds $Ck^n$ converge to 0, it follows that $T = 0$ and the transformation is trivial.

A useful result in this direction is the convolution inequality of W. H. Young, illustrated in Fig. 10.

Theorem (Young): The point $(1, 1/r)$ where $r > 1$ belongs to the type set of the convolution operator $T$ if and only if $T$ is in $L^r$; its type set then contains the line segment $1/q = (1/p) - (1/r')$ where $0 \le 1/q \le 1/r$.
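As an illustration only, the inequality behind Young's theorem can be spot-checked numerically. The sketch below is not part of the argument of the text: it replaces functions on $R^n$ by finitely supported sequences (a discrete analogue, for which the inequality $\|T * u\|_q \le \|T\|_r\|u\|_p$ with $1/q = (1/p) - (1/r')$ also holds), and the helper names `lp_norm`, `young_exponent`, and `young_ratio` are hypothetical, introduced here for the example.

```python
import numpy as np

def lp_norm(x, p):
    """l^p norm of a finite sequence; p may be np.inf."""
    a = np.abs(np.asarray(x, dtype=float))
    if np.isinf(p):
        return float(a.max())
    return float((a ** p).sum() ** (1.0 / p))

def young_exponent(p, r):
    """Return q with 1/q = 1/p - 1/r' = 1/p + 1/r - 1 (np.inf when 1/q = 0)."""
    inv_q = 1.0 / p + 1.0 / r - 1.0
    if inv_q < -1e-12:
        raise ValueError("the point (1/p, 1/q) would fall outside the square")
    return np.inf if abs(inv_q) <= 1e-12 else 1.0 / inv_q

def young_ratio(T, u, p, r):
    """||T * u||_q / (||T||_r ||u||_p); the inequality says this is at most 1."""
    q = young_exponent(p, r)
    conv = np.convolve(T, u)  # full discrete convolution of the two sequences
    return lp_norm(conv, q) / (lp_norm(T, r) * lp_norm(u, p))

rng = np.random.default_rng(0)
T = rng.standard_normal(40)   # plays the role of the kernel, measured in l^r
u = rng.standard_normal(60)   # plays the role of the function, measured in l^p
for p, r in [(1.0, 2.0), (1.5, 1.5), (2.0, 2.0), (1.0, 1.0)]:
    print(p, r, young_exponent(p, r), young_ratio(T, u, p, r))
```

The pair $(p, r) = (1, r)$ corresponds to the point $(1, 1/r)$ of the theorem; the other pairs lie on the segment $1/q = (1/p) - (1/r')$.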
PROOF: The regularizations $T_\varepsilon$ of the distribution $T$ are the convolutions of $T$ with the testfunctions $\varphi_\varepsilon$, all of which have unit $L^1$-norm. It follows that there is a uniform bound for the norms of the regularizations in $L^r$: $\|T_\varepsilon\|_r \le M$ and so for any testfunction $u$, $|T_\varepsilon(u)| \le M\|u\|_{r'}$ where $M$ does not depend on $\varepsilon$ or $u$. Accordingly, $|T(u)| \le M\|u\|_{r'}$ and $T$ determines a continuous linear functional on $L^{r'}$. Thus $T$ may be taken as an element of $L^r$ since $r'$ is finite. Conversely, if $T$ is a function in $L^r$, and $u$ and $v$ are testfunctions,

$$|(T * u, v)| \le \|T\|_r\|v\|_{r'}\|u\|_1.$$

This means that $(1, 1/r)$ is in the type set of $T$.

We consider next the convolution operator defined by the Riesz kernel; here $R_\alpha(x)$ is the function $c/|x|^{n-\alpha}$ where $0 < \alpha < n$ and $c$ is a positive constant given in Section 24, the exact value of which is immaterial to our argument. The function belongs to no $L^p$-class but is homogeneous of order $\alpha - n$; it is then easy to verify that for all testfunctions $u$, $R_\alpha(u \circ l_\varepsilon) = \varepsilon^{-\alpha}(R_\alpha u) \circ l_\varepsilon$, where $l_\varepsilon$ denotes the dilation $x \mapsto \varepsilon x$. The important result is the following theorem which describes the type set of $R_\alpha$; this is shown in Fig. 11.
Sobolev's Theorem: The type set of $R_\alpha$ consists of all interior points of the square on the line $1/q = (1/p) - (\alpha/n)$.

PROOF: We first show that the type set cannot contain points not on the line. If $(1/p, 1/q)$ is in the type set, there exists a constant $C$ so that for all testfunctions $u$ and all positive $\varepsilon$,

$$\|R_\alpha(u \circ l_\varepsilon)\|_q \le C\|u \circ l_\varepsilon\|_p.$$

Fig. 11a. Sobolev's inequality: $\|R_\alpha * f\|_q \le C\|f\|_p$.

Fig. 11b. Type set of the Bessel kernel $G_\alpha$: $\|G_\alpha * f\|_q \le C\|f\|_p$.

Since $\|u \circ l_\varepsilon\|_p = \varepsilon^{-n/p}\|u\|_p$ and $R_\alpha$ is homogeneous, this reduces to

$$\varepsilon^{-\alpha-(n/q)}\|R_\alpha u\|_q \le C\varepsilon^{-n/p}\|u\|_p,$$

an inequality which must be valid for all positive $\varepsilon$. It follows that

$$1/q = (1/p) - (\alpha/n).$$

Next let $u(x)$ be a positive testfunction supported by the unit ball; it is obvious that if $\int u(x)\,dx = 1$ then $(R_\alpha u)(x) > \tfrac12 R_\alpha(x)$ for large $|x|$. This function is not in the class $L^q(R^n)$ for $q = n/(n - \alpha)$ and so $R_\alpha u$ is not in that class either.
This means that the point $(1, 1 - (\alpha/n))$ cannot belong to the type set of $R_\alpha$, and so, in view of the symmetry, $(\alpha/n, 0)$ cannot belong to that type set either. We have next to show that the points within the square on the line $1/q = (1/p) - (\alpha/n)$ are actually in the type set. Two simplifications are possible. Because of the homogeneity of the kernel $R_\alpha$ and the choice of the exponents, it is sufficient to prove the existence of a constant $C$ so that $|(R_\alpha u, v)| \le C\|u\|_p\|v\|_{q'}$ for all testfunctions $u$ and $v$ supported by the unit ball $|x| \le 1$. A device introduced by du Plessis enables us to reduce the problem to the one-dimensional case, as follows. From the inequality between arithmetic and geometric means we have immediately
and therefore
It follows that
where $\beta = \alpha/n$. Now, if we suppose the theorem true for $n = 1$, it follows for the higher values of $n$, since we may write
Here, of course, $x'$ and $y'$ are points of $R^{n-1}$ determined by the coordinate systems $x_k$, $y_k$ for $k \ge 2$. Integrating relative to $x_1$ and $y_1$ and using the theorem for $n = 1$, we obtain
where

$$U(x') = \left[\int |u(x_1, x')|^p\,dx_1\right]^{1/p} \quad\text{and}\quad V(y') = \left[\int |v(y_1, y')|^{q'}\,dy_1\right]^{1/q'}.$$

It is clear that $U(x')$ and $V(y')$ belong to $L^p(R^{n-1})$ and $L^{q'}(R^{n-1})$, respectively, and that $\|U\|_p = \|u\|_p$ as well as $\|V\|_{q'} = \|v\|_{q'}$. We repeat the argument, integrating relative to the variables $x_2$ and $y_2$ to obtain a similar inequality; after $n$ steps this becomes

$$|(R_\alpha u, v)| \le cC_1^n\|u\|_p\|v\|_{q'},$$
which was to be shown. The proof of Sobolev's theorem thus reduces to the proof of the inequality

$$\left|\int\!\!\int |x - y|^{\beta-1}u(x)v(y)\,dx\,dy\right| \le C\|u\|_p\|v\|_{q'}$$

where $0 < 1/q = (1/p) - \beta$ and $p > 1$. Of course, $\beta$ is in the interval $(0, 1)$. This inequality has been established by Hardy and Littlewood, and will be proved in the next section.

The term "Sobolev's inequality" properly applies to convolution inequalities of the form provided by this theorem: $\|R_\alpha f\|_q \le C\|f\|_p$. By extension, however, any inequality guaranteeing that a function is in a certain class if its derivatives are in another is called a Sobolev inequality. One example is immediate. If we suppose that the Laplacian of $f$ is in $L^p$, then, since $f = -R_2 * \Delta f$, the function $f$ is in $L^q$ for $1/q = (1/p) - (2/n)$ if there are finite values of $q$ satisfying this equation. Of course, we suppose $p > 1$. It should be emphasized that the theorem is both local and global.

The Bessel kernel $G_\alpha$, studied in Section 56, also gives rise to a convolution operator similar to $R_\alpha$. The corresponding type set contains that of the Riesz kernel and also the whole dexter diagonal and is shown in Fig. 11. We do not give the argument, which is not hard and depends essentially on both the Young convolution inequality and the Sobolev inequality.
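The exponent bookkeeping in Sobolev's theorem can be sketched in a few lines of exact rational arithmetic. This is only an illustration: the function names `sobolev_exponent`, `scaling_powers`, and `reflect_sinister` are hypothetical, and the code checks nothing beyond the relation $1/q = (1/p) - (\alpha/n)$, the balance of the powers of $\varepsilon$ in the scaling argument, and the symmetry of the type set about the sinister diagonal.

```python
from fractions import Fraction

def sobolev_exponent(p, alpha, n):
    """Return q with 1/q = 1/p - alpha/n, or None when (1/p, 1/q) is not an
    interior point of the unit square (endpoints are excluded by the theorem)."""
    inv_p = 1 / Fraction(p)
    inv_q = inv_p - Fraction(alpha) / n
    if not (inv_q > 0 and inv_p < 1):
        return None
    return 1 / inv_q

def scaling_powers(p, q, alpha, n):
    """Powers of epsilon on the two sides of the dilated inequality:
    eps^(-alpha - n/q) ||R u||_q <= C eps^(-n/p) ||u||_p."""
    left = -Fraction(alpha) - Fraction(n) / Fraction(q)
    right = -Fraction(n) / Fraction(p)
    return left, right

def reflect_sinister(inv_p, inv_q):
    """Reflect the point (1/p, 1/q) about the sinister diagonal: (1/q', 1/p')."""
    return (1 - inv_q, 1 - inv_p)

q = sobolev_exponent(2, 1, 3)                      # alpha = 1 in R^3, p = 2
print(q, scaling_powers(2, q, 1, 3))               # both powers must agree
print(reflect_sinister(Fraction(1, 2), 1 / q))     # stays on the same line
print(sobolev_exponent(2, 2, 5))                   # Laplacian example: alpha = 2, n = 5
```

The powers returned by `scaling_powers` agree exactly when the point lies on the line of the theorem, which is the content of the dilation argument.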
54. A Hardy-Littlewood Inequality

In this section we establish the inequality of Hardy and Littlewood used in the proof of the Sobolev inequality of the previous section. We must begin with two elementary lemmas.

Lemma (Tchebysheff): Let $f(x)$ and $g(x)$ be nonnegative, monotone functions on the interval $(0, 1)$ where $f(x)$ is increasing and $g(x)$ is decreasing; then

$$\int_0^1 f g\,dx \le \int_0^1 f\,dx\int_0^1 g\,dx.$$

PROOF: We are to show that the difference $\int f\,dx\int g\,dx - \int fg\,dx$ is positive, and this difference may be written
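$$-\int_0^1\!\!\int_0^1 f(x)[g(x) - g(y)]\,dx\,dy;$$

it may also be written

$$-\int_0^1\!\!\int_0^1 f(y)[g(y) - g(x)]\,dx\,dy$$

and so is equal to half the sum

$$-\frac12\int_0^1\!\!\int_0^1 [f(x) - f(y)][g(x) - g(y)]\,dx\,dy.$$

The integrand is negative because the functions $f$ and $g$ are monotone in opposite senses, and hence the difference is positive. It is also clear that if $f$ and $g$ had been monotone in the same sense, then exactly the opposite inequality would hold.

Lemma: Let $f(x)$ belong to $L^p(0, 1)$ where $p > 1$ and $F(x) = \int_0^x f(t)\,dt$; then $F(x)/x$ is in $L^p(0, 1)$ and

$$\left\|\frac{F(x)}{x}\right\|_p \le \frac{p}{p-1}\|f\|_p.$$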
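PROOF: Clearly, we may suppose $f(x) \ge 0$, and in our argument we will assume that $f(x)$ is bounded; the inequality being established for all bounded functions, it will then follow by continuity for arbitrary functions in $L^p$. We first integrate by parts to obtain

$$\int_0^1 \left(\frac{F(x)}{x}\right)^p dx = \left[\frac{x^{1-p}F(x)^p}{1-p}\right]_0^1 - \frac{p}{1-p}\int_0^1 x^{1-p}F(x)^{p-1}f(x)\,dx.$$

The integrated term is negative at the upper limit of integration and vanishes at the lower limit, since $F(x)/x$ is bounded. It follows that the integral is bounded by

$$\frac{p}{p-1}\int_0^1 \left(\frac{F(x)}{x}\right)^{p-1}f(x)\,dx$$

and by Hölder's inequality this is no larger than

$$\frac{p}{p-1}\left\|\left(\frac{F(x)}{x}\right)^{p-1}\right\|_{p'}\|f\|_p.$$

Now, since $(p - 1)p' = p$, this bound equals $\frac{p}{p-1}\|F(x)/x\|_p^{p-1}\|f\|_p$, and dividing by $\|F(x)/x\|_p^{p-1}$ we obtain

$$\left\|\frac{F(x)}{x}\right\|_p \le \frac{p}{p-1}\|f\|_p.$$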
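The two lemmas lend themselves to a quick numerical sanity check. The sketch below is illustrative only and makes its own assumptions: integrals over $(0, 1)$ are approximated by the midpoint rule, the test functions are chosen arbitrarily, and the names `midpoint_grid`, `tchebysheff_gap`, and `hardy_ratio` are hypothetical helpers, not anything defined in the text.

```python
import numpy as np

def midpoint_grid(m):
    """Midpoints and cell width of m equal cells of (0, 1)."""
    h = 1.0 / m
    return (np.arange(m) + 0.5) * h, h

def tchebysheff_gap(f, g, m=2000):
    """Approximate (int f)(int g) - int f g on (0, 1).
    Nonnegative when f is increasing and g is decreasing."""
    x, h = midpoint_grid(m)
    fx, gx = f(x), g(x)
    return (h * fx.sum()) * (h * gx.sum()) - h * (fx * gx).sum()

def hardy_ratio(f, p, m=20000):
    """Approximate ||F(x)/x||_p / ||f||_p for nonnegative f, F(x) = int_0^x f.
    The second lemma bounds this ratio by p / (p - 1)."""
    x, h = midpoint_grid(m)
    fx = f(x)
    F = h * (np.cumsum(fx) - 0.5 * fx)   # integral of f from 0 to each midpoint
    num = (h * np.sum((F / x) ** p)) ** (1.0 / p)
    den = (h * np.sum(fx ** p)) ** (1.0 / p)
    return num / den

# f increasing, g decreasing: the gap should be close to 1/6 - 1/12 = 1/12.
print(tchebysheff_gap(lambda x: x ** 2, lambda x: 1.0 - x))

# For f(x) = x^(-a) with a < 1/p the exact ratio is 1/(1 - a); as a approaches
# 1/p it approaches the constant p/(p - 1) of the lemma.
p = 2.0
print(hardy_ratio(lambda x: x ** -0.2, p), p / (p - 1.0))
```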
Let $K(x, y)$ be the function $|x - y|^{\beta-1}$ where $0 < \beta < 1$; for $1/q = (1/p) - \beta$ and $\beta < 1/p < 1$ we have to show the boundedness of the ratio

$$\frac{\int_{-1}^{1}\int_{-1}^{1} K(x, y)u(x)v(y)\,dx\,dy}{\|u\|_p\,\|v\|_{q'}}$$

as the functions $u(x)$ and $v(y)$ vary over the spaces $L^p(-1, 1)$ and $L^{q'}(-1, 1)$ respectively. It is obvious that we may always suppose that those functions are nonnegative, and it is not hard to show that they may be taken as even functions. For if we substitute $\check u(x) = u(-x)$ for $u$ in the expression above, neither the integral in the numerator nor the norm in the denominator changes; similarly we can substitute $\check v$ for $v$. It follows that the even functions $(u + \check u)/2$ and $(v + \check v)/2$ can be substituted for $u$ and $v$ in the ratio without changing the numerator; the denominator, if anything, becomes smaller, since if $u_t = tu + (1 - t)\check u$, then $\|u_t\|_p$ is a convex function on the interval $0 \le t \le 1$, which takes the same value at the endpoints, and therefore cannot be larger at $t = 1/2$.

An essential part of our argument consists in establishing a further fact: the even and positive functions $u(x)$ and $v(y)$ may also be supposed monotone nonincreasing functions of $|x|$ and $|y|$ respectively. We postpone the proof of this fact to the end of the section, and now merely assume it. Because of the symmetry of the kernel and the functions $u(x)$, $v(y)$, it is clear that the integral computed over the bottom half of the square, namely, the set $y < 0$, is equal to the integral taken over the top half. Moreover, for this top half, the integral computed over the left-hand side $x < 0$ is smaller than the integral taken over the right-hand side; this is an evident consequence of the fact that $u(x)$ is even, and $K(x, y)$ increases as the point $(x, y)$ approaches the diagonal. It is therefore sufficient to establish the existence of a constant $C$ such that

$$\int_0^1\!\!\int_0^1 K(x, y)u(x)v(y)\,dx\,dy \le C\|u\|_p\|v\|_{q'}$$
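where $u(x)$, $v(y)$ are positive, monotone nonincreasing functions on the unit interval. We carry out the computation in detail for that part of the integral associated with the triangle $0 \le y \le x$, $0 \le x \le 1$; the integral over the other triangle is estimated in exactly the same way. Integrate first relative to the variable $y$; the function $K(x, y)$ is monotone increasing in $y$, while $v(y)$ is monotone decreasing and both functions are nonnegative. Hence by Tchebysheff's lemma,

$$x\int_0^x K(x, y)v(y)\,dy \le \int_0^x K(x, y)\,dy\int_0^x v(y)\,dy = \frac{x^\beta}{\beta}V(x),$$

where $V(x)$ denotes the indefinite integral of $v(y)$. Accordingly,

$$\int_0^x K(x, y)v(y)\,dy \le \frac{x^{\beta-1}}{\beta}V(x),$$

and if we multiply by $u(x)$ and integrate over the unit interval, we obtain

$$\int_0^1\!\!\int_0^x K(x, y)v(y)\,dy\,u(x)\,dx \le \frac{1}{\beta}\int_0^1 x^{\beta-1}V(x)u(x)\,dx;$$

again, by Hölder's inequality, this is no larger than

$$\frac{1}{\beta}\|u\|_p\left[\int_0^1 x^{(\beta-1)p'}V(x)^{p'}\,dx\right]^{1/p'}.$$

All that remains is to estimate the integral on the right-hand side rather carefully. From Hölder's inequality follows an easy estimate for $V(x)$:

$$V(x) \le x^{1/q}\|v\|_{q'},$$

so that

$$\int_0^1 x^{(\beta-1)p'}V(x)^{p'}\,dx \le \|v\|_{q'}^{p'-q'}\int_0^1 x^{\lambda}\left[\frac{V(x)}{x}\right]^{q'}dx.$$

Here, the exponent $\lambda$ is $(\beta - 1)p' + [(p' - q')/q] + q'$, and this vanishes in view of the hypothesis relating $p$, $q$, and $\beta$. Accordingly,

$$\int_0^1 x^{(\beta-1)p'}V(x)^{p'}\,dx \le \|v\|_{q'}^{p'-q'}\left\|\frac{V(x)}{x}\right\|_{q'}^{q'} \le q^{q'}\|v\|_{q'}^{p'}$$

in view of the second lemma of this section. Therefore, finally,

$$\int_0^1\!\!\int_0^x K(x, y)u(x)v(y)\,dy\,dx \le \frac{q^{q'/p'}}{\beta}\|u\|_p\|v\|_{q'} = C\|u\|_p\|v\|_{q'}.$$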
The argument clearly depended in an essential way on the hypothesis that $v(y)$ was monotone decreasing on the interval $[0, 1]$ and the corresponding calculation for the integral over the triangle $0 \le x \le y$, $0 \le y \le 1$ requires the hypothesis that $u(x)$ is monotone decreasing on $[0, 1]$. This hypothesis is validated by a "rearrangement" theorem, first established by F. Riesz.

Let $u_1(x)$ and $u_2(x)$ be two measurable functions on the interval $[-1, 1]$; they are said to be rearrangements of one another if for every real $\lambda$, the sets $[u_1(x) \ge \lambda]$ and $[u_2(x) \ge \lambda]$ have the same Lebesgue measure. It is then obvious that $\|u_1\|_p = \|u_2\|_p$ for all $p \ge 1$. If $u(x)$ is a positive integrable function on $[-1, 1]$ its decreasing even rearrangement $u^*(x)$ is that rearrangement of $u$ which is even and nonincreasing on $[0, 1]$. Such a rearrangement always exists, and it is easy to give an explicit formula: if $m(\lambda) = \tfrac12 m[u \ge \lambda]$, where $m$ denotes the Lebesgue measure, then $m(\lambda)$ has an inverse function, and for $x > 0$, $u^*(x) = m^{-1}(x)$.
Rearrangement Theorem: If $K(x, y) = k(|x - y|)$ where $k(t)$ is a positive, integrable function decreasing on $[0, 1]$, then for any pair of positive, bounded measurable functions $u$ and $v$,

$$\int_{-1}^{+1}\!\!\int_{-1}^{+1} K(x, y)u(x)v(y)\,dx\,dy \le \int_{-1}^{+1}\!\!\int_{-1}^{+1} K(x, y)u^*(x)v^*(y)\,dx\,dy$$

where $u^*$ and $v^*$ are the decreasing, even rearrangements of $u$ and $v$.
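The statement can be tried out on step functions, for which both double integrals are finite sums. The following is a minimal sketch under assumptions of its own: the kernel is taken to be $k(t) = 2 - t$ (positive and decreasing for $0 \le t \le 2$, chosen here because its cell integrals have a closed form), $u$ and $v$ are constant on $N$ equal cells of $[-1, 1]$, and all function names are hypothetical. The rearrangement $u^*$ is then constant on the $2N$ half-cells of width $1/N$.

```python
import numpy as np

def refine(values):
    """Represent a step function on N equal cells on the grid of 2N half-cells."""
    return np.repeat(np.asarray(values, dtype=float), 2)

def even_decreasing_rearrangement(values):
    """Half-cell values of u*: the k-th largest value a_k of u occupies
    (k-1)/N <= |x| < k/N, a set of the same measure 2/N as one cell of u."""
    a = np.sort(np.asarray(values, dtype=float))[::-1]   # a_1 >= a_2 >= ...
    return np.concatenate([a[::-1], a])                   # mirror about x = 0

def kernel_matrix(m):
    """Exact integrals of k(|x - y|) = 2 - |x - y| over pairs of the m equal
    cells of [-1, 1] (width d = 2/m)."""
    d = 2.0 / m
    c = -1.0 + d * (np.arange(m) + 0.5)                   # cell centers
    M = d * d * (2.0 - np.abs(c[:, None] - c[None, :]))   # distinct cells
    M[np.diag_indices(m)] = 2.0 * d * d - d ** 3 / 3.0    # same cell
    return M

def bilinear(u_half, v_half):
    """Double integral of K(x, y) u(x) v(y) for step functions on half-cells."""
    return float(u_half @ kernel_matrix(len(u_half)) @ v_half)

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, 8)
v = rng.uniform(0.0, 1.0, 8)
lhs = bilinear(refine(u), refine(v))
rhs = bilinear(even_decreasing_rearrangement(u), even_decreasing_rearrangement(v))
print(lhs, rhs, lhs <= rhs)
```

Because the kernel is linear in $|x - y|$, the cell integrals are exact, so the comparison is an exact instance of the theorem for step functions rather than a quadrature approximation.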
PROOF: First recall that the simple functions are those which are finite linear combinations of characteristic functions of measurable sets of finite measure. A bounded measurable function $u(x)$ on $[-1, +1]$ is the limit almost everywhere of a monotone increasing sequence $u_n(x)$ of simple functions, and its rearrangement $u^*(x)$ is the limit of the rearrangements $u_n^*(x)$, a consequence of the fact that $u_1(x) \le u_2(x)$ implies $u_1^*(x) \le u_2^*(x)$. It follows that it is enough to prove the theorem for simple functions $u$ and $v$. Any such function

$$u(x) = \sum c_k\chi_k(x)$$

may be written in a variety of ways as a combination of characteristic functions; we make a canonical choice as follows. Let $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_N$ be the finitely many values assumed by $u(x)$ and let $\chi_k(x)$ be the characteristic function of the set $u(x) \ge \lambda_k$. The function $u(x)$ is now given by the sum

$$\lambda_1\chi_1(x) + \sum_{k=2}^{N}(\lambda_k - \lambda_{k-1})\chi_k(x);$$

the coefficients are positive and the partial sums form a monotone increasing sequence. It is now easy to see that for the rearrangement,

$$u^*(x) = \lambda_1\chi_1^*(x) + \sum_{k=2}^{N}(\lambda_k - \lambda_{k-1})\chi_k^*(x)$$

and so, if $v(y) = \sum b_l\psi_l(y)$ is the canonical representation of $v(y)$ in terms of characteristic functions $\psi_l(y)$, the inequality to be proved becomes

$$\sum_{k,l} c_k b_l\int\!\!\int K(x, y)\chi_k(x)\psi_l(y)\,dx\,dy \le \sum_{k,l} c_k b_l\int\!\!\int K(x, y)\chi_k^*(x)\psi_l^*(y)\,dx\,dy,$$

where $c_k$ and $b_l$ denote the positive coefficients of the two canonical representations.
Accordingly, the proof reduces to the proof of the inequality for one term, that is, to the case when both $u(x)$ and $v(y)$ are characteristic functions of measurable subsets of the interval. A further simplification is immediate: on obvious continuity grounds we may suppose that $u(x)$ and $v(y)$ are characteristic functions of sets $S'$ and $S''$, respectively, each set being a finite union of intervals. The proof is then by induction on $N$, the sum of the number of intervals in $S'$ and the number of intervals in $S''$. This integer is at least 2, and when it is 2 the set $S' \times S''$ is a rectangle. It should be geometrically evident that the integral of $K(x, y)$ over such a rectangle is maximized if the center of the rectangle lies on the diagonal $x = y$. This proves the theorem for $N = 2$. For larger $N$, we note that the set $S'' \times S'$ is an array of rectangles, and if the top row of rectangles is wholly above the diagonal, the integral of $K(x, y)$ over the set is increased if the top row is translated downwards, and the integral increases until the center of the small rectangle in the upper right-hand corner falls on the diagonal (see Fig. 12a). In the same way, if the right-hand column of rectangles lies wholly to the right of the diagonal, that column may be moved to the left without decreasing the value of the integral.
It follows that if we start with the set $S'' \times S'$ then either the center of the upper right-hand rectangle is on the diagonal already or translations of the type described serve to increase the integral. In the course of these simple transformations, one row or column may have been brought flush with another, thereby decreasing the number $N$ of intervals, and proving the theorem from the induction hypothesis. If this is not the case, we may suppose the center of the rectangle in the upper right-hand corner to fall on the diagonal and we simultaneously translate the top row and the right column, that is to say, the intervals $I'$ and $I''$ corresponding to the top row and the right column are simultaneously translated towards their neighbors; the rectangles in the top row move downwards, those in the right column move to the left, and the rectangle $I'' \times I'$ moves along the diagonal in a southwesterly direction. The contribution to the integral of $K$ over the set increases for every rectangle except the special one $I'' \times I'$ whose contribution is unchanged. This final transformation diminishes the total number of intervals, and hence proves the theorem by induction.

It should be remarked that F. Riesz proved a rearrangement theorem considerably more general than the one stated in this section.
55. Functions of Exponential Type

The proper object of this section is the study of certain entire functions but it is convenient to begin with theorems concerning subharmonic functions in the plane. So far we have not explicitly noted that a convex function $K(x)$ defined in some region of $R^n$ is subharmonic there, although this is obvious when $K(x)$ is $C^2$. For any $x_0$ in the domain of $K$ and sufficiently small $r > 0$ the inequality

$$2K(x_0) \le K(x_0 + ry) + K(x_0 - ry)$$

is valid where $|y| = 1$. Integrating this inequality relative to the measure $d\omega(y)$ we find

$$K(x_0) \le \frac{\int K(x_0 + ry)\,d\omega(y)}{\int d\omega(y)}$$

for all small $r$; thus, the continuous $K(x)$ is subharmonic as a distribution, hence is a subharmonic function. In a one-dimensional space the subharmonic functions are exactly the convex functions, while for a two-dimensional space a further hypothesis is required.
Theorem: Let $H(z)$ be a subharmonic function in the plane which is homogeneous of degree 1, that is, $H(\lambda z) = \lambda H(z)$ for all $\lambda \ge 0$; then $H(z)$ is convex.

PROOF: Since $H(z)$ is subharmonic it is upper semicontinuous, and therefore is bounded above on the disk $|z| \le 1$; it follows that there exists a constant $A$ such that $H(z) \le A|z|$ throughout the plane. We have to show that the restriction of $H(z)$ to any line segment $[z_1, z_2]$ is a convex function on the segment. Since the restriction of $H(z)$ to any ray $\arg(z) = \text{constant}$ is linear there, we may suppose that the segment in question does not lie on any such ray. There is no loss of generality in supposing that the segment is short and near the positive real axis; accordingly we take $z_1 = r_1e^{i\omega}$ and $z_2 = r_2e^{-i\omega}$ where $0 < \omega < \pi/4$. It is easy to see that there exists a linear function $L(z) = ax + by$ which coincides with $H(z)$ on the rays $|\arg(z)| = \omega$, since it is only necessary to choose the coefficients $a$ and $b$ so that

$$ar_1\cos\omega + br_1\sin\omega = H(z_1), \qquad ar_2\cos\omega - br_2\sin\omega = H(z_2),$$

and the determinant of this system does not vanish in view of the hypothesis made about the points $z_1$ and $z_2$. For a positive $\varepsilon$, the polynomial $\varepsilon(x^2 - y^2)$ is a harmonic function which is positive for $|\arg z| < \pi/4$ and so the harmonic function $L(z) + \varepsilon(x^2 - y^2)$ is at least as large as $H(z)$ on the two rays $|\arg z| = \omega$. Moreover, the harmonic function increases quadratically with increasing $|z|$ while $H(z) \le A|z|$; hence for $|z| = R$, where $R$ is big enough, the harmonic function is larger. It follows that the inequality

$$H(z) \le L(z) + \varepsilon(x^2 - y^2)$$

is valid on the boundary of the region $|\arg(z)| < \omega$, $|z| < R$, and because $H(z)$ is subharmonic, it must be valid throughout that region. The $R$ being arbitrarily large, the inequality holds for all $z$ with $|\arg(z)| \le \omega$, and the $\varepsilon$ being arbitrarily small we have finally $H(z) \le L(z)$ for $|\arg(z)| \le \omega$, in particular, on the segment $[z_1, z_2]$. This shows that $H(z)$ is convex, since the segment was arbitrary.

The device used in the proof of the previous theorem is commonly employed in establishing theorems of the Phragmén-Lindelöf type. We give some simple examples.

Theorem: Let $f(z)$ be analytic in a neighborhood of the set $F$ described by the inequalities $1 \le |z|$ and $|\arg(z)| \le \omega$ where $\omega < \pi/4$; if $f(z)$ is bounded on the boundary of $F$ and satisfies an inequality of the form $|f(z)| \le Be^{A|z|}$, then it is bounded throughout $F$.
PROOF: We may suppose that $|f(z)| \le 1$ on the boundary of $F$ and pass to the subharmonic function $\log|f(z)|$ which is then nonpositive on that boundary. Since $\log|f(z)| \le A|z| + \log B$, the inequality

$$\log|f(z)| \le \varepsilon(x^2 - y^2)$$

will be valid for large $|z|$ and so this inequality must hold for points $z$ on the boundary of the intersection of $F$ with a disk $|z| \le R$. Since $\log|f(z)|$ is subharmonic, that inequality holds in the interior of the intersection, and since $R$ may be arbitrarily large, it holds throughout $F$. Finally, the positive $\varepsilon$ being arbitrarily small, $\log|f(z)|$ is never positive in $F$, and therefore $|f(z)| \le 1$ on that set, as asserted.

We have invoked an equivalent version of this theorem in Section 50 where we needed the following result.

Theorem: Let $F(\zeta)$ be analytic in a region of the $\zeta$-plane containing the half-strip $\xi \ge 0$, $|\eta| \le \omega < \pi/4$; if $F(\zeta)$ is bounded on the boundary of the half-strip and satisfies an inequality of the form $|F(\zeta)| \le C\exp[Ae^{\xi}]$, then $F(\zeta)$ is bounded throughout the half-strip.
PROOF: We make the substitution $z = e^{\zeta} = e^{\xi}e^{i\eta}$ which maps the half-strip into that part of the sector $|\arg(z)| \le \omega$ which is outside the unit circle. The function $|f(z)| = |F(\log z)|$ is bounded by $Ce^{A|z|}$ in that set, and the previous theorem then shows that $f(z)$ is bounded in the image of the half-strip. This means that $F(\zeta)$ is bounded in the half-strip.

An entire function $f(z)$ is said to be of exponential type if it satisfies an inequality of the form $|f(z)| \le Ce^{A|z|}$. Let $f(z)$ be such a function, and for simplicity, suppose $f(0) = 1$. The subharmonic function $F_t(z) = \log|f(tz)|/t$ is bounded by $A|z|$ for all positive $t$ and satisfies the inequality

$$0 = F_t(0) \le \frac{1}{2\pi}\int_0^{2\pi}F_t(re^{i\theta})\,d\theta$$
and it follows that the integral of $F_t(z)$ over the disk $|z| < R$ is necessarily positive and bounded by the integral of $A|z|$ over that disk. Indeed, if $P_t$ is the subset of $|z| < R$ upon which $F_t(z)$ is positive, and $N_t$ is the complementary subset, then

$$\int\!\!\int_{N_t}|F_t(z)|\,dx\,dy \le \int\!\!\int_{P_t}F_t(z)\,dx\,dy \le 2\pi AR^3/3$$

and so

$$\int\!\!\int_{|z| < R}|F_t(z)|\,dx\,dy \le 4\pi AR^3/3.$$

It follows that the system of measures $F_t(z)\,dx\,dy$ has uniformly bounded total mass on any compact subset of the plane, and by Helly's theorem, any sequence of such measures has a weakly convergent subsequence. Accordingly, if $t$ converges to infinity through a countable set of values, there is a subsequence $t_n$ having the property that the distributions $F_{t_n}$ converge as distributions to some limit $T$. Since the Laplacians of the $F_{t_n}$ converge as distributions to the Laplacian of $T$, $\Delta T$ must be a positive measure and so $T$ is a subharmonic distribution. We write its canonical representation as $T(z)$. For $n \ge 0$, the function $H_n(z) = \sup_{t \ge n}F_t(z)$ is clearly a subharmonic distribution, and as $n$ increases this sequence converges decreasingly to

$$H(z) = \limsup_{t \to \infty}F_t(z).$$
If $T$ is any limiting distribution of the type we have obtained, the inequality $T(z) \le H(z) \le A|z|$ holds almost everywhere and it follows that $H(z)$ is also a locally integrable function. Thus the sequence $H_n(z)$ converges to $H(z)$ in the space of distributions and the distribution $H(z)$ is subharmonic. Since it is also homogeneous of degree 1, the canonical representation of $H(z)$ is a convex function. It can be shown by arguments similar to those used earlier in this section that the limit superior $H(z)$ itself is convex, but we omit the proof. The fact that $H(z)$ is both convex and homogeneous implies that it is the support function of a compact, convex set $K$ in the plane. That set is called the Pólya Indicator Diagram of $f(z)$ and its support function, $H(z)$, is called the indicator function corresponding to $f(z)$. It often occurs in the literature in a superficially different form: the function $H(z)/|z|$ depends only on the angle $\arg(z)$ and $H(z)$ is described by a function $h(\theta)$ for $0 \le \theta \le 2\pi$. The most interesting special case arises when the indicator function depends only on the variable $y$; it is then of the form

$$H(z) = \begin{cases} by, & y \ge 0, \\ ay, & y \le 0, \end{cases}$$

where $b - a \ge 0$. The corresponding indicator diagram is then an interval on the imaginary axis with the endpoints $ai$ and $bi$. In particular, this will be the case when $f(z)$ is the Fourier transform of a distribution $T$ on the real axis supported by the interval $[a, b]$.

The convergence of $F_{t_n}$ as distributions to some limit $T$ implies the convergence of the Laplacians $\Delta F_{t_n}$ to $\Delta T$, and this means that $\Delta T$ is a positive measure. It follows that $T$ is itself subharmonic. In general, we would expect a variety of subharmonic distributions $T$ to appear in this way; however, in certain special cases, only the indicator function can appear as such a limit, as the following theorem of Beurling shows.
Theorem (Beurling): Let $f(z)$ be of exponential type and satisfy an inequality of the form

$$|f(z)| \le Ce^{A|y|};$$

then as $t$ approaches infinity, the distribution limit of $F_t(z) = \log|f(tz)|/t$ exists and equals the indicator function $H(z)$.
PROOF: We may suppose that $t_n$ is an unbounded increasing sequence of positive numbers such that $F_{t_n}$ converges to a subharmonic limit $T$; we have to show that $T = H(z)$. It will be enough to show this separately for the upper and lower half-planes, and we therefore consider only the upper half-plane. The function $g(z) = e^{iAz}f(z)$ is bounded in the upper half-plane, and so admits a canonical representation, according to Section 44, of the form $g(z) = C_0B(z)e^{i\varphi(z)}$, where $B(z)$ is a Blaschke product in the half-plane and $\varphi(z) = U(z) + az + iV(z)$ belongs to the Pick class. Accordingly,

$$\frac{\log|g(tz)|}{t} = \frac{\log|f(tz)| - Ayt}{t} = F_t(z) - Ay$$

and therefore

$$F_t(z) = (A - a)y + \frac{\log|C_0|}{t} + \frac{\log|B(tz)|}{t} - \frac{V(tz)}{t}.$$

It is obvious that $\log|C_0|/t$ converges to 0 with increasing $t$, and we shall show that the last two terms on the right-hand side also converge to 0, making it clear that the limit $T(z)$ is necessarily a uniquely determined linear function of $y$ in the upper half-plane, namely $(A - a)y$. Now, the definition of $H(z)$ as the limit superior of $F_t(z)$ shows that $H(z) = T(z)$. Of course, essentially the same argument holds for the lower half-plane.

The function $V(z)$ is a positive, harmonic function in the upper half-plane, associated with a zero mass at infinity; this means that $\lim(V(it)/t) = 0$ and it is therefore not difficult to infer that $\lim(V(zt)/t)$ also vanishes when $z$ is fixed in the upper half-plane. Indeed, for every $t > 1$, the function $V_t(z) = V(tz)/t$ is positive and harmonic in the half-plane, and $V_t(i)$ converges to 0 with increasing $t$. From the compactness property of families of positive harmonic functions bounded at a point established in Section 9, it follows that for any sequence $t_n$ converging to infinity, the sequence of functions $V_{t_n}(z)$ has a subsequence, converging uniformly on compact subsets of the half-plane to a positive harmonic limit $W(z)$. Now, $W(i) = 0$, hence $W(z)$ vanishes identically, and the limiting function is unique. Accordingly, $V_t(z)$ converges to 0 with increasing $t$.

The essential part of the argument concerns the behavior of the function $\log|B(tz)|/t$ and depends on the following lemma.
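Lemma: Let $K$ be the disk $|z - Z| \le \rho$, where $Z = X + iY$ is fixed in the upper half-plane and $0 < \rho < Y$; then there exists a constant $C_K$ so that

$$\int\!\!\int_K\left|\log\left|\frac{z - \zeta}{z - \bar\zeta}\right|^2\right|dx\,dy \le C_K\frac{\eta}{(X - \xi)^2 + (Y + \eta)^2}$$

uniformly for $\zeta = \xi + i\eta$ in the upper half-plane.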
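PROOF: Each side of the inequality is a positive, continuous function of $\zeta$ in the upper half-plane, and therefore the ratio of the two is bounded, and bounded away from 0 on any compact subset of the half-plane. In particular, if the compact set in question is the disk determined by the inequality $4Y\eta/[(X - \xi)^2 + (Y + \eta)^2] \ge R$ for some small $R$, there will exist a constant $C_K$ making the inequality valid for $\zeta$ in that disk. For $\zeta$ outside that disk, however, the inequality holds with the constant $C_K = 8\pi Y^3$; the integrand is the negative, subharmonic function $\log|(z - \zeta)/(z - \bar\zeta)|^2$, and therefore

$$\int\!\!\int_K\log\left|\frac{z - \zeta}{z - \bar\zeta}\right|^2dx\,dy \ge \pi\rho^2\log\left|\frac{Z - \zeta}{Z - \bar\zeta}\right|^2 = \pi\rho^2\log\left[1 - \frac{4Y\eta}{(X - \xi)^2 + (Y + \eta)^2}\right]$$

and if we choose $R$ as in Section 2 so that $|z| < R$ implies $|\log(1 + z)| < 2|z|$, then

$$\int\!\!\int_K\left|\log\left|\frac{z - \zeta}{z - \bar\zeta}\right|^2\right|dx\,dy \le 8\pi\rho^2\frac{Y\eta}{(X - \xi)^2 + (Y + \eta)^2} \le 8\pi Y^3\frac{\eta}{(X - \xi)^2 + (Y + \eta)^2}.$$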
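If $\zeta_k = \xi_k + i\eta_k$ is the system of zeros of $f(z)$ in the upper half-plane the Blaschke product $B(z)$ may be written

$$B(z) = \prod_k e^{i\delta_k}\frac{z - \zeta_k}{z - \bar\zeta_k}$$

where the phase angles $\delta_k$ are appropriately chosen. It readily follows that

and therefore that the series $\sum \eta_k/[\xi_k^2 + (1 + \eta_k)^2]$ is convergent. Moreover,

$$\frac{\log|B(tz)|}{t} = \frac{1}{2t}\sum_k\log\left|\frac{tz - \zeta_k}{tz - \bar\zeta_k}\right|^2.$$

We integrate the absolute value of this function over a disk $|z - Z| \le \rho$ in the upper half-plane to obtain

$$\int\!\!\int_K\left|\frac{\log|B(tz)|}{t}\right|dx\,dy \le C_K\sum_{k=1}^{\infty}\frac{\eta_k}{(Xt - \xi_k)^2 + (Yt + \eta_k)^2}$$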
and since this function converges to 0 with increasing t, the integral also converges to 0. Since the disk K was an arbitrary disk in the half-plane, it becomes clear that as distributions, the functions loglB(rz) I/t necessarily converge to 0 with increasing t . This completes the proof of the theorem. The most interesting consequence of the Beurling theorem has to do with the Laplacians of the functions F, and H: the measures AF, must converge as distributions to the measure AH, and the latter measure is simply (b - a) dx where dx represents Lebesgue measure on the real axis. If c k is the (countable)
set of zeros of $f(z)$, each zero taken as often as its multiplicity requires, then the measure $\Delta F_t$ has the mass $2\pi/t$ at the points $\zeta_k/t$; the system of point masses $\Delta F_t$ converges to $(b - a)\,dx$. The total mass of the unit disk $|z| \le 1$ relative to the measure $\Delta F_t$ is then $(2\pi/t)N(t)$, where $N(t)$ is the number of zeros of $f(z)$ in the disk $|z| \le t$. Accordingly,
$$\frac{N(t)}{t} \quad \text{converges to} \quad \frac{b - a}{\pi}.$$
It is now evident that if $f(z)$ is the Fourier transform of a measure or function with compact support, say the interval $[a, b]$, then the diameter of the support can be deduced from the asymptotic distribution of the zeros of $f(z)$. This makes possible another proof of the Titchmarsh convolution theorem, since if $T_1$, $T_2$, and $T_3$ are measures on the real axis with compact support, where $T_3 = T_1 * T_2$, then $\hat T_3(\zeta) = (2\pi)^{1/2}\,\hat T_1(\zeta)\,\hat T_2(\zeta)$ and all three functions are entire, satisfying the hypotheses of the Beurling theorem. If $N_1(r)$, $N_2(r)$, and $N_3(r)$ are the corresponding counting functions for the zeros, then, obviously,
$$N_3(r) = N_1(r) + N_2(r),$$
from which it follows immediately that the diameter of the support of $T_3$ is the sum of the diameters of the supports of $T_1$ and $T_2$; this is the Titchmarsh theorem.
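The counting argument can be checked on the simplest example. The sketch below is an illustration and not part of the original text. It assumes that the measure is Lebesgue measure restricted to an interval of length $b - a$, whose transform is a constant multiple of $e^{-i(a+b)z/2}\sin((b - a)z/2)/z$ and so has simple zeros exactly at the points $2\pi k/(b - a)$, $k = \pm 1, \pm 2, \ldots$. The function names are hypothetical.

```python
import math


def zero_count(length: float, t: float) -> int:
    """Zeros in |z| <= t of the transform of the indicator of an interval.

    Assumption: the zeros are the points 2*pi*k/length for nonzero integers k,
    so they come in pairs +-k with k <= t*length/(2*pi).
    """
    return 2 * math.floor(t * length / (2.0 * math.pi))


def zero_density(length: float, t: float) -> float:
    """N(t)/t, which should approach length/pi as t grows."""
    return zero_count(length, t) / t


if __name__ == "__main__":
    t = 1.0e6
    # One interval of length 3: N(t)/t is close to 3/pi.
    print(zero_density(3.0, t), 3.0 / math.pi)
    # Convolving indicators of lengths 1 and 2 multiplies the transforms,
    # so the zero counts add and the density again approaches 3/pi.
    print((zero_count(1.0, t) + zero_count(2.0, t)) / t)
```

In this example the additivity of the counting functions reproduces the additivity of the lengths of the supports, which is the content of the Titchmarsh theorem in this special case.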
56. The Bessel Kernel

In Section 24, the Riesz kernel was introduced for values of $\alpha$ in the interval $0 < \alpha < n$; this was the locally integrable positive function
$$R_\alpha(x) = 2^{-\alpha}\pi^{-n/2}\,\frac{\Gamma\!\left(\frac{n - \alpha}{2}\right)}{\Gamma\!\left(\frac{\alpha}{2}\right)}\,|x|^{\alpha - n}$$
and its Fourier transform had the simple form $\hat R_\alpha(\xi) = (2\pi)^{-n/2}|\xi|^{-\alpha}$. Thus $R_\alpha(x)$ is a distribution of positive type, although not a function of positive type. It has the further property of being homogeneous of degree $\alpha - n$. This kernel is very useful in many applications; however, the fact that the function $R_\alpha(x)$ is not an integrable function leads to many inconveniences, and it is often preferable to consider instead the distribution $G_\alpha$, defined for all $\alpha > 0$, having the Fourier transform $\hat G_\alpha(\xi) = (2\pi)^{-n/2}(1 + |\xi|^2)^{-\alpha/2}$. We study that
distribution in this section and find that $G_\alpha(x)$ is an integrable function inheriting many of the useful properties of $R_\alpha(x)$. It is possible to compute $G_\alpha(x)$ explicitly; we omit that computation and record only the result:
$$G_\alpha(x) = C_\alpha\,K_{(n - \alpha)/2}(|x|)\,|x|^{(\alpha - n)/2},$$
where $K_\nu$ is the modified Bessel function of the third kind and $C_\alpha$ is the reciprocal of
For $\beta > 0$, we may write
$$\Gamma(\beta) = \int_0^\infty e^{-t}\,t^{\beta - 1}\,dt$$
and therefore, when $s$ is positive,
$$\frac{\Gamma(\beta)}{s^{\beta}} = \int_0^\infty e^{-st}\,t^{\beta - 1}\,dt.$$
Accordingly,
It follows from the easy half of the Schoenberg theorem that the function
$$\hat G_\alpha(\xi) = (2\pi)^{-n/2}(1 + |\xi|^2)^{-\alpha/2}$$
is a function of positive type on $R^n$ for every $n$ and every $\alpha > 0$. So its Fourier transform, $G_\alpha$, is a positive measure of total mass 1, and is also a distribution of positive type since $\hat G_\alpha(\xi)$ is positive. The distribution $G_\alpha$ is an $L^2$-function if and only if $\hat G_\alpha(\xi)$ is an $L^2$-function, and this happens if and only if $\alpha > n/2$. It is easy to see that $G_0 = \delta$, that $(1 - \Delta)G_\alpha = G_{\alpha - 2}$ whenever $\alpha \ge 2$, and that the convolution equation $G_\alpha * G_\beta = G_{\alpha + \beta}$ is valid for all positive $\alpha$ and $\beta$. When $\alpha > n$, the function $\hat G_\alpha(\xi)$ is integrable and in this case the transform $G_\alpha$ is a continuous function vanishing at infinity.

While we are only interested in positive values of the parameter $\alpha$, we should note that $\hat G_\alpha(\xi)$ is a temperate distribution for all complex values of $\alpha$ and depends analytically on $\alpha$; its transform $G_\alpha$ must then have the same property, and we can identify $G_\alpha$ for values of $\alpha$ in the interval $0 < \alpha < n$ by making use of the analytic continuation from larger values of $\alpha$. It is important to show that $G_\alpha$ is an absolutely continuous measure, that is, an integrable function for all positive $\alpha$; it is then obviously a function of radius only, and we shall see that it is a monotone decreasing function, analytic away from the origin.
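The Gamma-function identity used above, $\Gamma(\beta)s^{-\beta} = \int_0^\infty e^{-st}t^{\beta - 1}\,dt$, exhibits $(1 + |\xi|^2)^{-\alpha/2}$ as a positive mixture of the Gaussians $e^{-t(1 + |\xi|^2)}$ when one takes $\beta = \alpha/2$ and $s = 1 + |\xi|^2$. The following sketch, which is an illustration and not part of the original text, checks both statements numerically with a plain trapezoid rule. The function names, the substitution $t = e^u$, and the integration limits are choices made for this example only.

```python
import math


def laplace_power(beta: float, s: float, n_steps: int = 8000,
                  u_min: float = -80.0, u_max: float = 40.0) -> float:
    """Trapezoid approximation of the integral of exp(-s*t) * t**(beta-1) over t > 0.

    The substitution t = exp(u) turns the integrand into exp(-s*exp(u) + beta*u),
    which decays rapidly in both directions when beta > 0 and s > 0.
    """
    h = (u_max - u_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        u = u_min + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(-s * math.exp(u) + beta * u)
    return total * h


def symbol_from_gaussians(alpha: float, xi_squared: float) -> float:
    """(1 + |xi|^2)^(-alpha/2) rebuilt as a mixture of Gaussians exp(-t*(1 + |xi|^2))."""
    return laplace_power(alpha / 2.0, 1.0 + xi_squared) / math.gamma(alpha / 2.0)


if __name__ == "__main__":
    # Gamma(beta) / s**beta against the quadrature, for beta = 1.5 and s = 2.
    print(laplace_power(1.5, 2.0), math.gamma(1.5) / 2.0 ** 1.5)
    # alpha = 1 and |xi|^2 = 3: the symbol is (1 + 3)^(-1/2).
    print(symbol_from_gaussians(1.0, 3.0), 4.0 ** -0.5)
```

The default limits are adequate for $\beta \ge 1/2$; smaller values of $\beta$ would need a smaller `u_min`.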
For $\alpha > n$, we can compute the transform explicitly
The inner integral is the Fourier transform of $G(\xi\sqrt{2t})$, where $G$ (without a subscript) denotes the Gaussian; this transform is therefore $(2t)^{-n/2}\,G(x/\sqrt{2t})$. Thus $G_\alpha(x) = g_{\alpha,n}(|x|)$, where
It is perhaps appropriate to change the variable, replacing $t$ by $1/t$, to obtain the integral in the form
$$\int_0^\infty e^{-(r^2/4)t}\,e^{-1/t}\,t^{(n - \alpha)/2}\,\frac{dt}{t};$$
for any $r > 0$ and any complex value of $\alpha$, the integrand vanishes exponentially at infinity and is bounded at the origin; the integral therefore is always finite and is an analytic function of $\alpha$. The analytic continuation of the distribution $G_\alpha$ is then the function $g_{\alpha,n}(|x|)$; it follows immediately that this positive function diminishes as $|x|$ increases, and that it is of the form $F(r^2/2)$ where $F(s)$ is the Laplace transform of a positive measure. Evidently $F(s)$ is analytic for $\operatorname{Re}[s] \ge \varepsilon > 0$, and therefore $G_\alpha(x)$ is analytic for $x \ne 0$. In the special case $\alpha = n + 1$, we obtain
where $r = |x|$. In particular, for $n = 1$ and $\alpha = 2$, the Fourier transform of
$$\frac{1}{\sqrt{2\pi}}\,\frac{1}{1 + |\xi|^2}$$
is $\tfrac{1}{2}e^{-|x|}$.
It follows in general that the function $G_{n+1}(x)$ has a particularly simple form:
It is worthwhile to determine the nature of the singularity in $G_\alpha(x)$ when $0 < \alpha \le n$. If we change the variables in one of the integral representations of this function, writing $t = s/r^2$, we find
and if $\alpha < n$, the Lebesgue convergence theorem guarantees that the integral converges to
as $r$ converges to 0. Thus, for $0 < \alpha < n$,
$$G_\alpha(x) = (1 - O(x))\,R_\alpha(x),$$
where the function $O(x)$ converges to 0 with $|x|$, and $R_\alpha(x)$ is the Riesz kernel introduced in Section 24, the Fourier transform of which, $(2\pi)^{-n/2}|\xi|^{-\alpha}$, is very much like $\hat G_\alpha(\xi)$. The case $\alpha = n$ is only slightly harder: here
and the integral is broken up into a sum of three integrals. Two of these integrals are bounded as $r$ approaches 0, for example,
and the third integral is estimated similarly. The exponential in the remaining term
takes values in the interval between $\tfrac14$ and 1; that integral, therefore, is equal to $2|\log r|\,(1 - \varepsilon(r))$ where $0 < \varepsilon(r) < \tfrac34$ for all small $r$. Hence, finally,
To compute the behavior of $G_\alpha(x)$ as $|x|$ approaches infinity, we write the
after the substitution $x = rt/2$ this may be reduced to the more symmetric expression
If $r$ is very large, only the part of the integral taken over the interval $1 \le x \le 1 + \varepsilon$ is significant, hence the integral is very close to
and for large $r$, this is approximately $(2\pi/r)^{1/2}$. Thus, collecting the constants, we find that for all positive $\alpha$ and large $|x|$,
$$G_\alpha(x) = C_\alpha (2\pi)^{1/2}\,|x|^{(\alpha - n - 1)/2}\,e^{-|x|}\,(1 + O(x)),$$
where $C_\alpha$ is given at the beginning of this section, and $O(x)$ converges to 0 with $1/|x|$.

Finally, we should remark that from the equation $(1 - \Delta)G_\alpha = G_{\alpha - 2}$, valid for $\alpha > 2$, we have $\Delta G_\alpha = G_\alpha - G_{\alpha - 2}$, and this is an integrable function which has no fixed sign since $\int \Delta G_\alpha\,dx = \int G_\alpha\,dx - \int G_{\alpha - 2}\,dx = 0$. From the estimates which we have obtained for the singularity at the origin, it is easy to verify that $\Delta G_\alpha$ is negative in a neighborhood of the origin, more accurately, in a ball of radius $r(\alpha)$, where $r(\alpha)$ tends to 0 as $\alpha$ diminishes to 2. When $\alpha = 2$, the equation $(1 - \Delta)G_2 = \delta$ shows that a different behavior prevails: $\Delta G_2(x)$ is positive away from the origin. The same statement is true for values of $\alpha$ in the interval $(0, 2)$, since by analytic continuation we obtain the identity
$$\Delta G_\alpha(x) = g_{\alpha,n}(|x|) - g_{\alpha - 2,n}(|x|)$$
valid for all real $\alpha$, the function $g_{\alpha,n}(r)$ being given by an integral formula which we have derived. Now if $\alpha$ belongs to the interval $(0, 2)$, the number $(\alpha - 2)/2$ is in $(-1, 0)$, an interval where the Gamma function is negative. It follows that $\Delta G_\alpha(x)$ is the sum of two positive functions when $|x| > 0$. Finally, then, $G_\alpha(x)$ is subharmonic away from the origin if $0 < \alpha \le 2$, and is superharmonic in a neighborhood of the origin when $2 < \alpha$.
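The qualitative statements of this section can be tested numerically. The sketch below is an illustration and not part of the original text. It assumes the standard subordination formula $G_\alpha(x) = (4\pi)^{-n/2}\Gamma(\alpha/2)^{-1}\int_0^\infty e^{-t - r^2/(4t)}\,t^{(\alpha - n)/2 - 1}\,dt$ with $r = |x|$, which is what the normalization $\hat G_\alpha(\xi) = (2\pi)^{-n/2}(1 + |\xi|^2)^{-\alpha/2}$ leads to; the function names and the quadrature limits are hypothetical choices. It compares the result with the closed forms $\tfrac12 e^{-r}$ (for $n = 1$, $\alpha = 2$) and $e^{-r}/(4\pi r)$ (for $n = 3$, $\alpha = 2$), and with the Riesz kernel near the origin.

```python
import math


def bessel_kernel(r: float, alpha: float, n: int, n_steps: int = 8000,
                  u_min: float = -60.0, u_max: float = 60.0) -> float:
    """G_alpha at |x| = r > 0 from the assumed subordination integral.

    With t = exp(u) the integrand becomes
    exp(-exp(u) - (r*r/4)*exp(-u) + beta*u), where beta = (alpha - n)/2.
    The limits suit moderate r (roughly 1e-6 < r < 50).
    """
    beta = (alpha - n) / 2.0
    h = (u_max - u_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        u = u_min + i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * math.exp(-math.exp(u) - 0.25 * r * r * math.exp(-u) + beta * u)
    return total * h / ((4.0 * math.pi) ** (n / 2.0) * math.gamma(alpha / 2.0))


def riesz_kernel(r: float, alpha: float, n: int) -> float:
    """R_alpha at |x| = r for 0 < alpha < n, as defined at the start of the section."""
    const = 2.0 ** (-alpha) * math.pi ** (-n / 2.0)
    return const * math.gamma((n - alpha) / 2.0) / math.gamma(alpha / 2.0) * r ** (alpha - n)


if __name__ == "__main__":
    print(bessel_kernel(1.0, 2.0, 1), 0.5 * math.exp(-1.0))
    print(bessel_kernel(0.5, 2.0, 3), math.exp(-0.5) / (4.0 * math.pi * 0.5))
    # Near the origin the ratio G_alpha / R_alpha tends to 1 when alpha < n.
    print(bessel_kernel(1e-3, 2.0, 3) / riesz_kernel(1e-3, 2.0, 3))
```

For $n = 3$ and $\alpha = 2$ the ratio $G_\alpha/R_\alpha$ is exactly $e^{-r}$, which shows both the common singularity at the origin and the exponential decay at infinity.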
57. The Bessel Potential

Let $\mu$ be a positive Radon measure on $R^n$ and $G_{2\alpha}$ the Bessel kernel of order $2\alpha$ where $\alpha > 0$; the integral
$$\int G_{2\alpha}(x - y)\,d\mu(y)$$
makes sense for all $x$ and defines a function $u(x)$ which may take the value $+\infty$. This function is called the $2\alpha$-Bessel potential of $\mu$ and is written $u = G_{2\alpha}\mu$. If $\mu$ has compact support, or satisfies other special hypotheses, the distribution $u$ is the convolution $G_{2\alpha} * \mu$, but in this section, we prefer to think of $u$ as a function defined everywhere and not as a distribution. It should be remarked that the arguments of this section could be carried out for the Riesz-Frostman potentials of order $2\alpha$ where $0 < \alpha < n/2$; these are the functions
$$\int R_{2\alpha}(x - y)\,d\mu(y),$$
where $R_{2\alpha}$ is the Riesz kernel of order $2\alpha$. The Riesz kernel is somewhat simpler than the Bessel kernel and both kernels have the same singularity at the origin; however, the Bessel kernel diminishes exponentially at infinity and is therefore more convenient for the study of functions on the whole space $R^n$.

Many of the results found in Section 8 for the Newtonian potential can be established by virtually the same arguments for Bessel potentials. Thus, if the measure $\mu$ has finite total mass the corresponding potential $G_{2\alpha}\mu$ is a positive, lower semicontinuous function, which is integrable, and therefore can be infinite only on a set of measure zero. It is only slightly more difficult to bring over another result: if $\mu$ has compact support $K$ and its potential is bounded on $K$ by $M$, then $u(x)$ is bounded throughout the space by $CM$ where the constant $C$ depends only on the diameter of $K$. To show this we argue as in Section 8, choosing a point $x'$ in $K$ nearest $x$ and noting that $|x' - y| \le 2|x - y|$ for all $y$ in $K$, hence $G_{2\alpha}(x - y) \le G_{2\alpha}[(x' - y)/2]$. It is then only necessary to
consider the form of the kernel $G_{2\alpha}(x) = g_{2\alpha,n}(|x|)$ and to establish the existence of a constant $C_R$ (for any $R$) such that $g_{2\alpha}(r/2) \le C_R\,g_{2\alpha}(r)$ for all $r$ in $[0, R]$.

An arbitrary positive Radon measure can be decomposed into the sum of two such measures $\mu = \mu_1 + \mu_2$, where $\mu_1$ is the restriction of $\mu$ to some ball of radius $R$. It is evident that $G_{2\alpha}\mu = G_{2\alpha}\mu_1 + G_{2\alpha}\mu_2$ and that the first function is an integrable function of $x$. The second potential will be a $C^\infty$-function in the ball $|x| < R$ or it will be identically infinite there. Indeed, if $m(r) = \mu_2(S_r)$ where $S_r$ is the ball $|x| < r$, then at the origin,
$$G_{2\alpha}\mu_2(0) = \int_R^\infty g_{2\alpha,n}(r)\,dm(r)$$
and it is obvious that if $m(r)$ increases too rapidly at infinity the potential is infinite at the origin, and indeed throughout the ball $|x| < R$. This shows that for an arbitrary Radon measure $\mu$, either the potential $G_{2\alpha}\mu$ is identically infinite on $R^n$ or it is finite almost everywhere, locally integrable, and lower semicontinuous. Naturally, only the latter case is of interest.

A positive Radon measure is said to have finite $2\alpha$-energy if the integral
$$\|\mu\|_{-\alpha}^2 = \iint G_{2\alpha}(x - y)\,d\mu(x)\,d\mu(y)$$
is finite. If $u(x)$ is the potential $G_{2\alpha}\mu(x)$ the Fubini-Tonelli theorem permits us to write
$$\|\mu\|_{-\alpha}^2 = \int u(x)\,d\mu(x),$$
while if $\nu$ is another positive Radon measure of finite $2\alpha$-energy, we may speak of the mutual energy
$$(\mu, \nu)_{-\alpha} = \iint G_{2\alpha}(x - y)\,d\mu(x)\,d\nu(y);$$
again by the Fubini-Tonelli theorem this is
$$(\mu, \nu)_{-\alpha} = \int G_{2\alpha}\nu(x)\,d\mu(x) = \int G_{2\alpha}\mu(y)\,d\nu(y).$$
A measure $\mu$ of finite total mass need not be of finite $2\alpha$-energy; a simple example is provided by the measure $\delta$ which has a unit mass at the origin. Its $2\alpha$-energy is just $G_{2\alpha}(0)$ and this is infinite for $2\alpha \le n$. On the other hand, a measure of finite energy need not be of finite total mass, because the exponential decay of $G_{2\alpha}$ at infinity may make the energy integral finite, even though the measure $\mu$ is quite large.
We should also note that a bounded, nonnegative integrable function $h(x)$ determines a measure $h(x)\,dx$ which has finite $2\alpha$-energy. The potential $G_{2\alpha}h$ is continuous, being the convolution of an integrable function and a bounded one, and it is also bounded. Hence, $\int G_{2\alpha}h(x)\,h(x)\,dx$ is finite.

The identity $G_\alpha * G_\beta = G_{\alpha + \beta}$ established in the previous section shows that the function $G_{\alpha + \beta}(x)$ is the $\alpha$-potential of the measure $G_\beta(x)\,dx$. Moreover, any function in the class $\mathcal{S}$ is an $\alpha$-potential of a (signed) measure: given $\varphi$ in $\mathcal{S}$ we may write $\varphi = G_\alpha\psi$ where $\psi$ is in $\mathcal{S}$; it is only necessary to select $\psi$ so that $\hat\psi(\xi) = \hat\varphi(\xi)(1 + |\xi|^2)^{\alpha/2}$. We use these facts and the Fubini-Tonelli theorem in the following calculations. Let $\mu$ have finite $2\alpha$-energy. Now
$$\begin{aligned}
\|\mu\|_{-\alpha}^2 &= \iint G_{2\alpha}(x - y)\,d\mu(x)\,d\mu(y) \\
&= \iiint G_\alpha(x - z)\,G_\alpha(z - y)\,dz\,d\mu(x)\,d\mu(y) \\
&= \int |G_\alpha\mu(z)|^2\,dz = \int |f(z)|^2\,dz,
\end{aligned}$$
where $f(z) = \int G_\alpha(z - x)\,d\mu(x)$.

It follows that the $\alpha$-potential of $\mu$ is a positive $L^2$-function $f(z)$ and the square of its $L^2$-norm is exactly the $2\alpha$-energy of $\mu$. Moreover, if $G_\alpha f$ denotes the $\alpha$-potential of the measure $f(x)\,dx$, then $G_{2\alpha}\mu = G_\alpha f$ identically. It is obvious that $G_\alpha f$ is a temperate distribution with the Fourier transform $\hat f(\xi)(1 + |\xi|^2)^{-\alpha/2}$.
We have next to compute $\int \varphi(x)\,d\mu(x)$ where $\varphi$ is a testfunction; we may put $\varphi = G_\alpha\psi$ where $\psi$ is in $\mathcal{S}$, and since the measure $|\psi(x)|\,dx$ has finite $2\alpha$-energy, the Fubini theorem permits the following interchange of integrations:
$$\int \varphi(x)\,d\mu(x) = \int G_\alpha\psi(x)\,d\mu(x) = \int \psi(x)\,G_\alpha\mu(x)\,dx = \int \psi(x) f(x)\,dx = \int \hat\psi(\xi)\,\overline{\hat f(\xi)}\,d\xi.$$
Since $\hat f(\xi)$ is in $L^2(R^n)$, it becomes clear that $\mu$ is a temperate distribution and that its Fourier transform is $\hat\mu(\xi) = \hat f(\xi)(1 + |\xi|^2)^{\alpha/2}$. This Fourier transform
is a distribution of positive type because it is the transform of a positive measure; it is also a function, but in general is not a function of positive type since $\mu$ in general does not have finite total mass. The $2\alpha$-energy of $\mu$ is also given in terms of the transform by the integral
$$\|\mu\|_{-\alpha}^2 = \int |\hat\mu(\xi)|^2 (1 + |\xi|^2)^{-\alpha}\,d\xi.$$
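The Fourier form of the energy makes the example of the unit mass $\delta$ quantitative. The sketch below is an illustration and not part of the original text. It assumes $n = 1$ and $\hat\delta = (2\pi)^{-1/2}$, which is consistent with $G_0 = \delta$ and the normalization of $\hat G_\alpha$, so that the energy of $\delta$ is $(2\pi)^{-1}\int (1 + \xi^2)^{-\alpha}\,d\xi$. The integral is truncated at $|\xi| \le L$ and evaluated by the midpoint rule after the substitution $\xi = \sinh s$; the function name and the cutoffs are hypothetical.

```python
import math


def delta_energy_truncated(alpha: float, cutoff: float, n_steps: int = 20000) -> float:
    """(1/(2*pi)) times the integral of (1 + xi^2)^(-alpha) over |xi| <= cutoff, n = 1.

    With xi = sinh(s) the integrand becomes cosh(s)**(1 - 2*alpha) on |s| <= asinh(cutoff).
    """
    s_max = math.asinh(cutoff)
    h = 2.0 * s_max / n_steps
    total = 0.0
    for i in range(n_steps):
        s = -s_max + (i + 0.5) * h
        total += math.cosh(s) ** (1.0 - 2.0 * alpha)
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    # 2*alpha = 2 > n = 1: the energy is finite and close to 1/2.
    print(delta_energy_truncated(1.0, 1.0e8))
    # 2*alpha = 1 = n: the truncated energy keeps growing like log(cutoff).
    print(delta_energy_truncated(0.5, 1.0e4), delta_energy_truncated(0.5, 1.0e8))
```

The value $1/2$ obtained for $\alpha = 1$ agrees with $G_2(0) = \tfrac12$ on the line, while the logarithmic growth for $\alpha = \tfrac12$ reflects the statement that the energy of $\delta$ is infinite when $2\alpha \le n$.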
So far we have considered only positive Radon measures of finite energy; if we pass to the differences of such measures we obtain a linear space $M_{-\alpha}$ contained in the space of temperate distributions and having a natural quadratic norm defined by
$$\|\mu\|_{-\alpha}^2 = \int |\hat\mu(\xi)|^2 (1 + |\xi|^2)^{-\alpha}\,d\xi = \iint G_{2\alpha}(x - y)\,d\mu(x)\,d\mu(y).$$
The corresponding inner product is the mutual energy
$$(\mu, \nu)_{-\alpha} = \int \hat\mu(\xi)\,\overline{\hat\nu(\xi)}\,(1 + |\xi|^2)^{-\alpha}\,d\xi = \iint G_{2\alpha}(x - y)\,d\mu(x)\,d\nu(y) = (G_\alpha\mu, G_\alpha\nu)_{L^2},$$
where, of course, $G_\alpha\nu$ is also in $L^2(R^n)$. The space $M_{-\alpha}$ of signed measures obtained in this way is not complete; it is a pre-Hilbert space. However, an important theorem assures us that certain subsets of this space are complete, and are, therefore, closed subsets of the completion.

Theorem: Let $F$ be a closed subset of $R^n$ and $C_F$ the convex cone of all positive measures of finite $2\alpha$-energy supported by $F$; then $C_F$ is complete in $M_{-\alpha}$.
PROOF: A sequence of positive Radon measures $\mu_m$ supported by $F$ which is Cauchy in $M_{-\alpha}$ is surely a convergent sequence of temperate distributions: for any function $\varphi$ in $\mathcal{S}$ we have
$$\lim_{m \to \infty} \mu_m(\varphi) = \lim_{m \to \infty} \int \hat\varphi(\xi)\,\overline{\hat\mu_m(\xi)}\,d\xi.$$
Now the function $\hat\varphi(\xi)(1 + |\xi|^2)^{\alpha/2}$ is a fixed function in $L^2(R^n)$, while the sequence $\hat\mu_m(\xi)(1 + |\xi|^2)^{-\alpha/2}$ is Cauchy in that space by hypothesis. Accordingly, the system of numbers $\mu_m(\varphi)$ is Cauchy, and the $\mu_m$ converge to some temperate distribution $\mu$. Each $\mu_m$ being a positive Radon measure supported by $F$, it immediately follows that $\mu$ has the same properties. From the continuity of the Fourier transform on the space $\mathcal{S}'$ we have
$$\hat\mu(\xi)(1 + |\xi|^2)^{-\alpha/2} = \lim_{m} \hat\mu_m(\xi)(1 + |\xi|^2)^{-\alpha/2}$$
and therefore $\mu$ has finite $2\alpha$-energy. Finally
$$\|\mu - \mu_m\|_{-\alpha}^2 = \int |\hat\mu(\xi) - \hat\mu_m(\xi)|^2 (1 + |\xi|^2)^{-\alpha}\,d\xi$$
converges to 0, completing the proof of the theorem.

We can now introduce the capacitary potential $u_K(x) = G_{2\alpha}\mu_K(x)$ associated with a compact subset $K$ of $R^n$ and the corresponding capacitary measure $\mu_K$. The previous theorem guarantees that the cone $C_K$ consisting of all positive Radon measures supported by $K$ and having finite $2\alpha$-energy is a closed, convex subset of the completion of $M_{-\alpha}$. It may happen that $C_K$ consists only of the zero measure. If $\varphi(x)$ is a testfunction equal to 1 on a neighborhood of the compact $K$ we may write $\varphi = G_{2\alpha}\psi$ where $\psi$ is in $\mathcal{S}$; the measure $\psi(x)\,dx$ is a signed measure of finite $2\alpha$-energy. From the results of Section 14 it follows that there is a unique element $\mu_K$ in $C_K$ nearest to $\psi$ in the sense of the metric of $M_{-\alpha}$. We call it the capacitary measure. As was shown in Section 14, the inequality
$$\|\mu_K - \psi\|_{-\alpha}^2 \le \operatorname{Re}\bigl[(\mu_K - \psi,\ \nu - \psi)_{-\alpha}\bigr]$$
must hold for all $\nu$ in $C_K$. The Hilbert space in question is real, and so we obtain
$$\|\mu_K\|_{-\alpha}^2 - (\mu_K, \psi)_{-\alpha} \le (\mu_K - \psi,\ \nu)_{-\alpha}$$
for all such $\nu$, and this may be written more explicitly:
$$\int [G_{2\alpha}\mu_K(x) - 1]\,d\mu_K(x) \le \int [G_{2\alpha}\mu_K(x) - 1]\,d\nu(x).$$
Among the measures $\nu$ in $C_K$ are the measures $t\mu_K$, where $t$ is an arbitrary positive number; it follows that the left-hand side of the inequality above is 0. This means that the capacitary potential $u_K(x) = G_{2\alpha}\mu_K(x)$ is equal to $+1$ almost everywhere with respect to the capacitary measure $\mu_K$. The inequality also shows that the subset of $K$ upon which $u_K(x)$ is smaller than 1 is a set of measure 0 for every measure $\nu$ supported by $K$ and having finite $2\alpha$-energy. It should be emphasized that the definition of the capacitary potential and the capacitary measure is independent of the particular choice of the
testfunction $\varphi(x)$ equal to $+1$ on a neighborhood of the set $K$; for every $\nu$ in $C_K$,
$$(\nu, \psi)_{-\alpha} = \int \varphi(x)\,d\nu(x) = \int 1\,d\nu(x) = \nu(K),$$
and only the inner products $(\nu, \psi)_{-\alpha}$ matter in the previous calculation. Since $u_K(x)$ is lower semicontinuous, the set $u_K(x) \le 1$ is closed, and the fact that $u_K(x) = 1$ almost everywhere $\mu_K$ means that this set is a support for $\mu_K$. Thus $u_K(x)$ is bounded by $+1$ on the support of $\mu_K$ and so $u_K(x)$ is bounded throughout the space $R^n$.

Let $U_{2\alpha}$ denote the family of $2\alpha$-exceptional sets: this is the class of all subsets of $R^n$ which are contained in G-deltas of measure 0 for every positive Radon measure of finite $2\alpha$-energy. It is clear that this class is a sigma-ring, that is, it is closed under countable unions, and that a subset of an exceptional set is also exceptional. All exceptional sets are of Lebesgue measure zero, but as we shall see, the exceptional sets form a much smaller class than the sets of Lebesgue measure zero. The term "almost everywhere" has a well established meaning in measure theory; we shall say "quasi-everywhere" when we mean "with the exception of a set in $U_{2\alpha}$." Because the set $u_K(x) < 1$ is a G-delta of $\nu$ measure 0 for every $\nu$ in $C_K$, our results about capacitary potentials may be put as follows:

$u_K(x) = G_{2\alpha}\mu_K(x) \ge 1$ quasi-everywhere on $K$,
$u_K(x) = G_{2\alpha}\mu_K(x) = 1$ almost everywhere $\mu_K$.

Writing $|\mu_K|$ for the total mass of $\mu_K$, we then have
$$|\mu_K| = \mu_K(K) = \int u_K(x)\,d\mu_K(x) = \|\mu_K\|_{-\alpha}^2.$$
The common value in the previous equation is called the $2\alpha$-capacity of $K$ and is written $\gamma_{2\alpha}(K)$. The capacity vanishes if and only if $\mu_K$ is the 0 measure, and in this case $0 = u_K(x) \ge 1$ quasi-everywhere on $K$. It follows that $K$ is a set of measure 0 for all $\nu$ of finite $2\alpha$-energy, and therefore that $K$ is in $U_{2\alpha}$. Conversely, if $K$ is exceptional the convex cone $C_K$ consists only of the 0 measure and the capacity vanishes. The following lemmas will be useful when we turn to the study of the capacity $\gamma_{2\alpha}(K)$ as a set function.
Lemma 1: Let $K$ be the intersection of a decreasing sequence of compact sets $K_m$ and $\mu_m$ the capacitary measure of $K_m$; then $\mu_m$ converges to $\mu_K$ in $M_{-\alpha}$.

PROOF: Let $\varphi(x) = G_{2\alpha}\psi(x)$ be a testfunction which is $+1$ in a neighborhood of $K_1$; the capacitary measures $\mu_m$ are the elements in the convex cone $C_{K_m}$ nearest $\psi$ in the sense of the metric of $M_{-\alpha}$. As the sets $K_m$ diminish to $K$, the corresponding convex cones $C_{K_m}$ diminish to their intersection $C_K$, and it is easy to verify that the sequence $\mu_m$ of nearest elements to $\psi$ forms a Cauchy sequence, converging to the corresponding nearest element of $C_K$, that is, to $\mu_K$.

It follows from the lemma that
$$\gamma_{2\alpha}(K) = \|\mu_K\|_{-\alpha}^2 = \lim \|\mu_m\|_{-\alpha}^2 = \lim \gamma_{2\alpha}(K_m).$$

Lemma 2: $\gamma_{2\alpha}(K) = \inf \|\nu\|_{-\alpha}^2$, the infimum being taken over all positive measures $\nu$ of finite $2\alpha$-energy for which $G_{2\alpha}\nu(x) \ge 1$ quasi-everywhere on $K$.

PROOF:
$$(\nu, \mu_K)_{-\alpha} = \int G_{2\alpha}\nu(x)\,d\mu_K(x) \ge \int d\mu_K(x) = \mu_K(K) = \gamma_{2\alpha}(K) = \|\mu_K\|_{-\alpha}^2$$
for all such $\nu$, and by the Schwarz inequality, $(\nu, \mu_K)_{-\alpha} \le \|\nu\|_{-\alpha}\|\mu_K\|_{-\alpha}$ and therefore $\|\mu_K\|_{-\alpha} \le \|\nu\|_{-\alpha}$. Since equality holds in the Schwarz inequality only for $\nu$ a scalar multiple of $\mu_K$, it follows that the capacitary measure is the unique solution to the variational problem posed in the lemma.

Lemma 3: $\gamma_{2\alpha}(K) = \sup \nu(K)$, the supremum being taken over all $\nu$ of finite $2\alpha$-energy for which $G_{2\alpha}\nu(x) \le 1$ almost everywhere $\nu$.

PROOF: If $\nu$ has the property that its potential satisfies the inequality $G_{2\alpha}\nu(x) \le 1$ almost everywhere relative to the measure $\nu$ itself then $\nu^*$, the restriction of $\nu$ to $K$, surely has the same property. Now
$$\|\nu^*\|_{-\alpha}^2 = \int G_{2\alpha}\nu^*(x)\,d\nu^*(x) \le \nu^*(K) \le \int G_{2\alpha}\mu_K(x)\,d\nu^*(x) = (\mu_K, \nu^*)_{-\alpha} \le \|\nu^*\|_{-\alpha}\|\mu_K\|_{-\alpha}$$
and it follows that $\|\nu^*\|_{-\alpha} \le \|\mu_K\|_{-\alpha}$; hence, $\nu(K) = \nu^*(K) \le \|\mu_K\|_{-\alpha}^2 = \gamma_{2\alpha}(K)$. The supremum is attained only when equality occurs in the Schwarz inequality, and we then have $\nu^* = \mu_K$ and $\nu = \nu^*$.

The $2\alpha$-capacity is a set function which we have defined on the class of compact sets; it is obvious that it is a monotone function in the sense that $K_1 \subset K_2$ implies $\gamma_{2\alpha}(K_1) \le \gamma_{2\alpha}(K_2)$ and that the capacity of the empty set is 0. The previous lemma now shows that the capacity is subadditive because the class $\mathcal{M}$ of measures $\mu$ of finite $2\alpha$-energy, having the property that $G_{2\alpha}\mu \le 1$ almost everywhere $\mu$, is independent of the compact $K$ under consideration; so if we take suprema over $\mathcal{M}$,
$$\gamma_{2\alpha}(K_1 \cup K_2) = \sup \mu(K_1 \cup K_2) \le \sup \mu(K_1) + \sup \mu(K_2) = \gamma_{2\alpha}(K_1) + \gamma_{2\alpha}(K_2).$$
More generally, then, $\gamma_{2\alpha}\bigl(\bigcup_{j=1}^{N} K_j\bigr) \le \sum_{j=1}^{N} \gamma_{2\alpha}(K_j)$ holds for all finite unions of compacts $K_j$. We may therefore follow the method of Section 5 and extend the set function $\gamma_{2\alpha}$ to the class of open sets, putting $\gamma_{2\alpha}(G) = \sup \gamma_{2\alpha}(K)$, the supremum being taken over all compact subsets $K$ of $G$. It is easy to verify that this extended function is subadditive on the class of open sets. Finally, for arbitrary sets $A$, we define the capacity
$$\gamma_{2\alpha}(A) = \inf \gamma_{2\alpha}(G), \qquad A \subset G.$$
In this way there is obtained an outer measure on $R^n$. Of course, it is necessary to verify that the extended function coincides with the initial capacity on compact sets. According to the new definition, $\gamma_{2\alpha}(K) = \inf \gamma_{2\alpha}(G)$ where the infimum is taken over all open sets containing $K$; evidently we may suppose that $K$ is the intersection of a decreasing sequence of bounded open sets $G_m$ such that $\gamma_{2\alpha}(K) = \lim \gamma_{2\alpha}(G_m)$. It is also clear that we may suppose that each $G_m$ has a compact closure contained in $G_{m-1}$ if $m \ge 2$. If $\mu_m$ is the capacitary measure associated with the compact $\bar G_m$ it follows directly from the definition of the capacity that
$$\|\mu_m\|_{-\alpha}^2 \le \gamma_{2\alpha}(G_{m-1}) \le \|\mu_{m-1}\|_{-\alpha}^2$$
and therefore $\gamma_{2\alpha}(K) = \lim \|\mu_m\|_{-\alpha}^2$. This, however, in view of Lemma 1, is what we originally had for the capacity of $K$. It should also be clear that an arbitrary set $A$ is contained in a G-delta having exactly the same capacity.

Although the capacity is an outer measure it is not a Caratheodory outer measure: if two sets are at a positive distance apart it need not follow that the capacity of their union is the sum of their capacities, even though the sets are compact. Indeed, almost the opposite circumstance prevails: it can be
shown that the only sets which are measurable for the capacity $\gamma_{2\alpha}$ are the sets of capacity zero and their complements. Fortunately, in studies where the capacity plays a role, measurability is not of interest, and the important question is whether or not a set is of capacity zero.

There is also a concept of capacitability: a set $A$ is capacitable if and only if $\gamma_{2\alpha}(A) = \sup \gamma_{2\alpha}(K)$, the supremum being taken over all compacts $K$ contained in $A$. By definition, open sets are capacitable. In the theory of capacities developed by G. Choquet it is established that all analytic sets are capacitable, and this theorem is proved under much more general hypotheses than we have admitted here. Of course we will not prove Choquet's theorem, but we will invoke it in the special case of sets which are G-deltas to obtain two interesting results.

Theorem: A subset $A$ of $R^n$ is exceptional if and only if $\gamma_{2\alpha}(A) = 0$.

PROOF: If the capacity of $A$ vanishes, $A$ is contained in the intersection of a decreasing sequence $G_m$ of open sets such that $\gamma_{2\alpha}(G_m)$ converges to 0. If $\nu$ is a positive Radon measure of finite energy such that $\nu(A) = 2M > 0$ then $\nu(G_m) \ge 2M$ and there exists a compact subset $K_m$ of $G_m$ such that $\nu(K_m) \ge M$. Let $u_m = G_{2\alpha}\mu_m$ be the capacitary potential of $K_m$. Now
$$M \le \nu(K_m) \le \int u_m(x)\,d\nu(x) = (\mu_m, \nu)_{-\alpha} \le \|\mu_m\|_{-\alpha}\|\nu\|_{-\alpha} \le \sqrt{\gamma_{2\alpha}(G_m)}\,\|\nu\|_{-\alpha}$$
and the quantity on the right converges to 0 with increasing $m$. Hence there exists no $\nu$ of finite energy for which $\nu(A)$ is positive, and therefore $A$ is exceptional.

On the other hand, if $A$ is exceptional it is contained in an exceptional G-delta which is a capacitable set by Choquet's theorem, and its capacity is therefore the supremum of the capacities $\gamma_{2\alpha}(K)$ taken over all compact subsets $K$. Since those compact subsets are exceptional, their capacities vanish, and therefore $\gamma_{2\alpha}(A) = 0$.

Theorem: Let $\mu$ be a positive Radon measure on $R^n$, not necessarily of finite $2\alpha$-energy; then the potential
$$u(x) = \int G_{2\alpha}(x - y)\,d\mu(y)$$
is either identically infinite, or is infinite only on an exceptional G-delta.
PROOF: If we suppose that $u(x)$ is not identically infinite it is then a lower semicontinuous function and the sets defined by the inequalities $u(x) > k$ are open. The intersection of that decreasing sequence of open sets is a G-delta, and is the set where $u(x)$ is infinite. It will be sufficient to show that the intersection of this set with the open ball $|x| < R$ is exceptional, since the class of exceptional sets is closed under countable unions. Let $\varphi(x)$ be a testfunction taking values in the interval $[0, 1]$ and equal to 1 for $|x| \le R$ and write
$$\mu = \varphi\mu + (1 - \varphi)\mu = \mu_1 + \mu_2.$$
Evidently, $u = G_{2\alpha}\mu_1 + G_{2\alpha}\mu_2$, and neither potential in this sum is identically infinite. It follows that $G_{2\alpha}\mu_2$ is finite and even continuous in the ball $|x| < R$. The subset of that ball where $u(x)$ is infinite is therefore the set where $u_1(x) = G_{2\alpha}\mu_1(x)$ is. Since that set is a G-delta it is capacitable by Choquet's theorem. If its capacity were positive, there would exist a compact subset $K$ of positive capacity upon which $u_1(x)$ was infinite. Let $u_K = G_{2\alpha}\mu_K$ be the capacitary potential of $K$; it is obvious that the integral $\int u_1(x)\,d\mu_K(x)$ is infinite, although, by virtue of the Tonelli theorem, this integral is also equal to $\int u_K(x)\,d\mu_1(x)$ which is finite, being the integral of a bounded, lower semicontinuous function relative to a positive Radon measure with compact support. This contradiction shows that $u_1(x)$ is infinite only on a set of capacity zero, and therefore $u(x)$ has the same property.

In this section we have made no hypothesis concerning the value of the parameter $\alpha$ except that it was positive. However, if $2\alpha > n$, the kernel $G_{2\alpha}(x)$ is a bounded, continuous integrable function of positive type. It follows that a set consisting of only one point has a positive capacity. Accordingly, only the empty set has capacity zero if $2\alpha > n$, and similarly, only the empty set is exceptional. When $\alpha = 0$, it is convenient to take the Lebesgue measure for the capacity.
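A finite example shows concretely that a point has positive capacity when $2\alpha > n$ and that the capacity is not additive on sets at a positive distance. The sketch below is an illustration and not part of the original text. It assumes $n = 1$ and $2\alpha = 2$, with the closed form $G_2(x) = \tfrac12 e^{-|x|}$ for the kernel on the line, and it assumes (by the symmetry of the set and the uniqueness of the capacitary measure) that the capacitary measure of the two-point set $\{0, d\}$ is $c(\delta_0 + \delta_d)$ with $c$ fixed by the requirement that the potential equal 1 at both points. The function names are hypothetical.

```python
import math


def g2(x: float) -> float:
    """Assumed closed form of the Bessel kernel of order 2 on the line (n = 1)."""
    return 0.5 * math.exp(-abs(x))


def capacity_of_point() -> float:
    """Mass c with c * g2(0) = 1, so that the potential equals 1 at the point."""
    return 1.0 / g2(0.0)


def capacity_of_two_points(d: float) -> float:
    """Total mass of c*(delta_0 + delta_d) whose potential equals 1 at both points."""
    c = 1.0 / (g2(0.0) + g2(d))
    return 2.0 * c


def energy_of_two_points(d: float) -> float:
    """Energy of the same measure, summed over the four ordered pairs of points."""
    c = 1.0 / (g2(0.0) + g2(d))
    return c * c * (2.0 * g2(0.0) + 2.0 * g2(d))


if __name__ == "__main__":
    print(capacity_of_point())            # capacity of a single point
    print(capacity_of_two_points(1.0))    # strictly less than twice that value
    print(energy_of_two_points(1.0))      # total mass and energy agree
```

Under these assumptions the capacity of the pair is $4/(1 + e^{-d})$, which is strictly smaller than the sum $2 + 2$ of the capacities of the two points for every finite distance $d$, in agreement with the remark that the capacity is not a Caratheodory outer measure.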
58. The Spaces of Bessel Potentials

Let $\alpha$ be a real number and $\mathcal{H}^\alpha$ the linear space of all temperate distributions $u$ on $R^n$ having Fourier transforms that are square integrable relative to the measure $(1 + |\xi|^2)^{\alpha}\,d\xi$; thus, $u$ is in $\mathcal{H}^\alpha$ if and only if
$$\|u\|_\alpha^2 = \int |\hat u(\xi)|^2 (1 + |\xi|^2)^{\alpha}\,d\xi$$
is finite. For example, if $\alpha$ is positive and $\mu$ a measure of finite $2\alpha$-energy the Bessel potential $u = G_{2\alpha}\mu$ belongs to $\mathcal{H}^\alpha$ while the measure $\mu$ itself belongs to $\mathcal{H}^{-\alpha}$. In particular, the space $M_{-\alpha}$ is a subset of $\mathcal{H}^{-\alpha}$. It is obvious that $\mathcal{H}^\alpha$ is a Hilbert space and that $\mathcal{H}^0$ is just another name for $L^2(R^n)$.

When $\alpha$ is positive, the elements of $\mathcal{H}^\alpha$ are also elements of $L^2(R^n)$ and are therefore not properly considered as functions, but rather as equivalence classes of functions, two functions being considered equivalent if and only if they coincide almost everywhere. One of our principal purposes in this section is to modify the definition of the spaces $\mathcal{H}^\alpha$ for positive $\alpha$ in such a way that the representing functions will be much better defined; they will appear as defined up to a set of $2\alpha$-capacity zero rather than just a set of Lebesgue measure zero.

If $f(x)$ is an element of $L^2(R^n)$ which is positive almost everywhere, the Bessel potential $G_\alpha f$ is unambiguously defined for all $x$ and is a positive, lower semicontinuous function on the space, and as a distribution it belongs to $\mathcal{H}^\alpha$. It is important to notice that the set where $G_\alpha f$ is infinite is a set of $2\alpha$-capacity zero. The proof is immediate: that set is obviously a G-delta, being the intersection of the sequence of open sets $G_\alpha f(x) > m$, and so is capacitable. If it were not of capacity zero, there would exist a compact set $K$ of positive capacity upon which $G_\alpha f$ was infinite. Now $\int G_\alpha f\,d\mu_K$ is then also infinite, although by the Tonelli theorem this is equal to $\int f(x)g(x)\,dx$, where $g(x) = G_\alpha\mu_K$ is in $L^2(R^n)$ since $\mu_K$ has finite $2\alpha$-energy.

More generally, then, since an arbitrary measure $f(x)\,dx$ with $f(x)$ in $L^2$ may be written in a unique way as the difference of two positive measures with $L^2$-densities, $f(x)\,dx = f_+(x)\,dx - f_-(x)\,dx$, then $G_\alpha f = G_\alpha f_+ - G_\alpha f_-$ makes sense and is finite as an integral for all points $x$ except, perhaps, those in a subset of a G-delta of $2\alpha$-capacity zero. It is clear that an arbitrary element $u$ of $\mathcal{H}^\alpha$, where $\alpha > 0$, is of the form $G_\alpha f$ for some $f$ in $L^2 = \mathcal{H}^0$; we have only to take $f$ as the inverse Fourier transform of $\hat u(\xi)(1 + |\xi|^2)^{\alpha/2}$. Moreover, the $L^2$-norm of $f$ is given by the identity $\|f\|_0^2 = \|G_\alpha f\|_\alpha^2 = \|u\|_\alpha^2$.

We are now able to realize the elements of $\mathcal{H}^\alpha$ for $\alpha > 0$ in a more special way: to the distribution $u = G_\alpha f$, we assign the equivalence class of functions that coincide with the function $G_\alpha f = G_\alpha f_+ - G_\alpha f_-$ except on a set of $2\alpha$-capacity zero. The Hilbert space (of equivalence classes) so obtained is called the space of Bessel potentials of order $\alpha$ and is written $P^\alpha$. It should be emphasized that the spaces $\mathcal{H}^\alpha$ and $P^\alpha$ are indistinguishable as spaces of distributions; however, the equivalence classes in $P^\alpha$ are essentially smaller than those
in $\mathcal{H}^\alpha$, and this is a consequence of the fact that the family of sets of $2\alpha$-capacity zero is materially smaller than the family of sets of Lebesgue measure zero. Of course, when $\alpha > n/2$, an equivalence class in $P^\alpha$ contains only one function (which happens to be continuous) because only the empty set has capacity zero in this case.

Let $u_k(x)$ be a sequence of functions in $P^\alpha$ which is Cauchy relative to the norm of that space. (More properly, we should call it a seminorm and not a norm, since we think of the elements of the space as functions and not as equivalence classes.) We may pass to a subsequence written $u_j = G_\alpha f_j$ and so chosen that $\|u_{j+1} - u_j\|_\alpha < 2^{-j}$. Now, except perhaps for a set of $2\alpha$-capacity zero,
$$\sum |u_{j+1}(x) - u_j(x)| = \sum |G_\alpha(f_{j+1} - f_j)(x)| \le \sum G_\alpha|f_{j+1} - f_j|(x) = G_\alpha h(x),$$
where $h$ is the $L^2$-function $\sum |f_{j+1} - f_j|$. It follows that the series converges for all $x$ outside some set of capacity zero, and therefore that the general term of that series converges to 0 outside such a set. This means that the subsequence $u_j(x)$ converges to a finite limit outside a set of capacity zero. Since the space $P^\alpha$ is complete, there exists a Bessel potential $w(x)$ in $P^\alpha$ such that $u_k$ converges to $w$ in $P^\alpha$, and it is easy to see that $\|u_j - w\|_\alpha < 2 \cdot 2^{-j}$. The previous argument now makes it clear that the series
$$\sum |u_j(x) - w(x)|$$
converges outside a set of $2\alpha$-capacity zero, and hence the subsequence $u_j(x)$ converges pointwise to the correct limit $w(x)$ outside such a set.

It is possible to obtain the concepts of capacity, capacitary potential and capacitary measure operating directly with the space $P^\alpha$; we therefore introduce new definitions, independent of those given in the previous section. Let $A$ be an arbitrary set in $R^n$, and $C_A$ be the cone of all potentials $u(x)$ in $P^\alpha$ having the property that $u(x) \ge 1$ on $A$ except for a subset of $2\alpha$-capacity zero. If the cone $C_A$ is empty, we shall call $A$ a set of infinite capacity, and it will have no capacitary potential. The remarks in the previous paragraph show that the cone $C_A$ is closed in $P^\alpha$ and is obviously convex. It follows that there exists a unique element $u_A$ in that cone nearest the origin. If $\varphi$ is a positive testfunction and $t \ge 0$, the function $u_A(x) + t\varphi(x)$ is also in $C_A$; therefore, the Hilbert space being real,
$$\|u_A + t\varphi\|_\alpha^2 = \|u_A\|_\alpha^2 + t^2\|\varphi\|_\alpha^2 + 2t(u_A, \varphi)_\alpha \ge \|u_A\|_\alpha^2$$
for all positive $t$. It follows that $(u_A, \varphi)_\alpha \ge 0$ for all positive testfunctions $\varphi$. Thus, the distribution $T(\varphi) = (u_A, \varphi)_\alpha$ is a positive distribution and is represented by a positive Radon measure $\mu_A$. This measure is supported by the
closure of $A$, because, if the testfunction $\varphi$ has its support in the complement of the closure of $A$, the potential $u_A(x) + t\varphi(x)$ is also in $C_A$ for negative values of $t$, and this implies $(u_A, \varphi)_\alpha = 0$; hence $\int \varphi \, d\mu_A = 0$. If we write the distribution $\mu_A$ in terms of the Fourier transform, we infer that $\mu_A$ is temperate and find

$$\hat\mu_A(\xi) = \hat u_A(\xi)\,(1 + |\xi|^2)^{\alpha}$$

and therefore $u_A = G_{2\alpha}\mu_A$. It is also clear that $\mu_A$ is of finite $2\alpha$-energy. Indeed,

$$\|\mu_A\|_{-\alpha}^2 = \int |\hat\mu_A(\xi)|^2 (1 + |\xi|^2)^{-\alpha} \, d\xi = \|u_A\|_\alpha^2.$$

Of course, we call $u_A$ the capacitary potential of $A$, $\mu_A$ the corresponding capacitary measure, and $\|\mu_A\|_{-\alpha}^2 = \|u_A\|_\alpha^2$ the $2\alpha$-capacity of $A$. The capacitary potential may now be considered as the lower semicontinuous function $G_{2\alpha}\mu_A(x)$ and not as an element of $P^\alpha$ determined up to an exceptional set. The set defined by the inequality $u_A(x) > 1$ is open, and it is not hard to show that it has measure zero for the capacitary measure. If the positive testfunction $\varphi$ has its support in the set $u_A(x) > 1$, then $u_A + t\varphi$ is in $C_A$ for certain (small) negative values of $t$, and the variational property defining $u_A$ implies that $(u_A, \varphi)_\alpha = \int \varphi \, d\mu_A = 0$. Hence, $u_A(x) \le 1$ almost everywhere $\mu_A$.

It should be clear that when the set $A$ is a compact set $K$, the corresponding capacitary potential $u_A$ coincides with the capacitary potential defined in the previous section. This is the content of Lemma 2 of that section. Accordingly, the set function
$$\mathrm{Cap}(A) = \|\mu_A\|_{-\alpha}^2 = \|u_A\|_\alpha^2$$

coincides with $\gamma_{2\alpha}(A)$ on the class of all compact sets.

We next show that these set functions coincide on the class of open sets. If $G$ is open, let $K_n$ be an increasing sequence of compacts whose union is $G$, and $u_n = G_{2\alpha}\mu_n$ be the corresponding capacitary potentials. The numbers $\|u_n\|_\alpha^2 = \gamma_{2\alpha}(K_n)$ converge increasingly to $\gamma_{2\alpha}(G)$ which we suppose finite. For $n > m$, the potential $u_n(x) \ge 1$ on $K_m$ quasi-everywhere; accordingly,

$$(u_n, u_m)_\alpha \ge \|u_m\|_\alpha^2$$

and therefore

$$\|u_n - u_m\|_\alpha^2 = \|u_n\|_\alpha^2 + \|u_m\|_\alpha^2 - 2(u_n, u_m)_\alpha \le \|u_n\|_\alpha^2 - \|u_m\|_\alpha^2.$$
Therefore, the sequence is a Cauchy sequence in $P^\alpha$ and converges to a limit $u(x)$ in that space. Passing, if need be, to a subsequence, we may suppose that the sequence converges pointwise outside an exceptional set and therefore that $u(x) \ge 1$ quasi-everywhere on every $K_n$. This means that $u(x)$ is in $C_G$, and therefore that $\mathrm{Cap}(G) = \|u_G\|_\alpha^2 \le \|u\|_\alpha^2$. As the set function $\mathrm{Cap}(A)$ is obviously monotone, we have, finally, $\mathrm{Cap}(G) = \gamma_{2\alpha}(G)$ as desired.

It is now clear that if a set $A$ is a G-delta, then $\mathrm{Cap}(A)$ is at least as large as the supremum of the $2\alpha$-capacities of compacts contained in $A$, while it is no larger than $\gamma_{2\alpha}(G)$ for any open set $G$ containing $A$, whence $\mathrm{Cap}(A) = \gamma_{2\alpha}(A)$ for such sets. Finally, for an arbitrary set $A$, the set $u_A(x) \ge 1$ is a G-delta having the same capacitary potential as $A$, and it follows that the set functions $\mathrm{Cap}(A)$ and $\gamma_{2\alpha}(A)$ coincide on the class of all sets.

We have remarked that the class of sets of capacity zero is substantially smaller than the class of sets of Lebesgue measure zero. This is made explicit by the following theorem of O. Frostman which we state without proof.

Theorem: For any set $A$ and $0 < \alpha < n/2$, $\gamma_{2\alpha}(A) = 0$ if $H_{n-2\alpha}(A) = 0$, while if $H_{n-2\alpha}(A)$ is infinite, then $\gamma_{2\beta}(A) > 0$ for all $\beta > \alpha$.

The Frostman theorem makes it clear, in particular, that subsets of dimension $n - 1$ in $R^n$ which occur as boundaries of open sets have positive capacity for $\alpha > 1/2$. For example, the surface $S$ of the unit ball has positive capacity for such values of $\alpha$, the corresponding capacitary potential being evidently a function only of radius, and the capacitary measure being of the form $C\omega$ for a certain value of $C$.
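As a check on the exponents in this example, the following worked computation applies the theorem, as stated above, to the unit sphere $S$ in $R^n$ with $n \ge 2$. It is only a sketch: it assumes the standard facts that $0 < H_{n-1}(S) < \infty$, so that $H_s(S) = \infty$ for $s < n - 1$ and $H_s(S) = 0$ for $s > n - 1$. The auxiliary exponent $\beta$ is introduced here purely for illustration.

```latex
% Frostman's theorem applied to the unit sphere S in R^n, n >= 2.
% Assumed: 0 < H_{n-1}(S) < \infty, hence
%   H_s(S) = \infty for s < n-1   and   H_s(S) = 0 for s > n-1.
\begin{align*}
&\text{Case } \tfrac12 < \alpha < \tfrac n2:\ \text{choose } \beta \text{ with } \tfrac12 < \beta < \alpha. \\
&\qquad n - 2\beta < n - 1
   \;\Longrightarrow\; H_{n-2\beta}(S) = \infty
   \;\Longrightarrow\; \gamma_{2\alpha}(S) > 0
   \quad (\text{second half of the theorem, with } \beta \text{ in the role of } \alpha). \\[4pt]
&\text{Case } 0 < \alpha < \tfrac12: \\
&\qquad n - 2\alpha > n - 1
   \;\Longrightarrow\; H_{n-2\alpha}(S) = 0
   \;\Longrightarrow\; \gamma_{2\alpha}(S) = 0
   \quad (\text{first half of the theorem}).
\end{align*}
```

The borderline value $\alpha = 1/2$ is not decided by the statement quoted here, since $H_{n-1}(S)$ is neither zero nor infinite.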
The spaces of potentials $P^\alpha$ are, therefore, particularly adapted to the study of elliptic boundary value problems since the Bessel potential $u = G_\alpha f$ is already defined up to a set of capacity zero on the boundary of the domain for which the problem is posed; however, the corresponding distribution $u$ in $\mathscr{L}_\alpha$ cannot be said to be defined on that boundary at all since the boundary is only a set of Lebesgue measure zero.

In Section 56, it has been shown that for $\alpha > 1$, the kernel $G_{2\alpha}(x)$ is superharmonic in a neighborhood of the origin; hence, if $S_r$ is the ball $S(0, r)$ and $u(x)$ is the capacitary potential of its surface, the function $u(x)$ is a continuous superharmonic function in a neighborhood of the ball if $r$ is sufficiently small. Since $u(x) = 1$ on the surface, we must have $u(x) \ge 1$ everywhere inside, and because the potential cannot be harmonic inside, we actually have $u(x) > 1$ for $|x| < r$. Accordingly, the capacitary potential of the boundary is also
the capacitary potential for the whole ball and the corresponding measure is concentrated on the boundary. It follows that the capacity of the ball $S(0, r)$ is exactly the same as the capacity of its boundary. The corresponding capacitary potential is then strictly greater than 1 at all interior points of the set.

For $\alpha \le 1$, quite a different thing happens: the kernel is now subharmonic away from the origin and so the capacitary potential of any set is subharmonic away from the support of the measure. If that capacitary potential is continuous, it attains its supremum on the support of the measure, being subharmonic everywhere else, and so it is bounded above by 1. By a rather complicated argument, it can be shown that all capacitary potentials $u_A$ have the same property $\|u_A\|_\infty = 1$ if $\alpha \le 1$.

It is clear from the previous section that a compact set $K$ has capacity zero if and only if it supports no nontrivial measure of finite $2\alpha$-energy. A seemingly more exacting requirement leads to the following definition: a compact set is $2\alpha$-polar if and only if it supports no nontrivial distribution belonging to $\mathscr{L}_{-\alpha}$. Obviously, a polar compact is of capacity zero. It is not known at this time whether there exist compact sets of capacity zero which are not polar sets. For $\alpha > n/2$, there is, of course, no problem since only the empty set has capacity zero. More information is given by the following lemma.

Lemma: The compact set $K$ is $2\alpha$-polar if there exists a sequence $u_m$ in $P^\alpha$ converging to 0 in that space such that for every $m$ there exists a neighborhood of $K$ on which $u_m(x) = 1$.
PROOF: If there exists such a sequence of potentials they may be suitably regularized so that they are also $C^\infty$-functions. If $T$ is a distribution of finite $2\alpha$-energy supported by $K$ and $\varphi$ is a testfunction, then $T(\varphi) = T(\varphi u_m)$ for all $m$, whence

$$|T(\varphi)| \le \|T\|_{-\alpha} \, \|\varphi u_m\|_\alpha,$$

and the quantity on the right-hand side converges to 0 with increasing $m$ since multiplication by the fixed testfunction $\varphi$ is a continuous mapping of $P^\alpha$ into itself. Accordingly $T = 0$, and therefore $K$ supports no nontrivial distribution of finite $2\alpha$-energy.

From the lemma it follows that if $\alpha \le 1$, any exceptional compact is polar: the set may be taken as the intersection of a decreasing sequence of bounded open sets $G_n$, and if $u_n$ is the capacitary potential of $G_n$, then $u_n(x) = 1$ on that set. Evidently, the numbers $\|u_n\|_\alpha^2$ converge to $\gamma_{2\alpha}(K) = 0$.

These considerations lead to another theorem of Beurling. We consider an integrable function $f(x)$ on the real axis which is also square integrable; it therefore belongs to every $L^p$ for $1 \le p \le 2$ and has a continuous Fourier transform. Let $F$ be the closed set where the Fourier transform vanishes and $\mathscr{M}$ the space of finite linear combinations of translates of $f$.
Theorem (Beurling): If for some $p$ in the interval $(1, 2)$ the space $\mathscr{M}$ is not dense in $L^p(R^1)$, then the Hausdorff dimension of $F$ is at least $2/p' = 2 - (2/p)$.

PROOF: If $\mathscr{M}$ is not dense in $L^p$, there exists a nonzero $g(x)$ in $L^{p'}$ orthogonal to all the translates:

$$\int f(x - h)\, g(x) \, dx = 0.$$

This may be written $f * \tilde g = 0$, where $\tilde g(x) = g(-x)$, and the equation remains valid if $g$ is regularized; hence, we may suppose that $g(x)$ is a bounded $C^\infty$-function in $L^{p'}$ with a compact spectrum. Let the distribution $T$ be the inverse Fourier transform of $g$; $T$ has a compact support $K$ which is contained in the set $F$ by virtue of a theorem of Section 46. Choose $\alpha$ so that $1 - 2\alpha < 2/p'$; it will then follow that the distribution $T$ has finite $2\alpha$-energy, since that energy is given by

$$\int |g(x)|^2 (1 + |x|^2)^{-\alpha} \, dx,$$

and this integral is finite by virtue of Hölder's inequality since $|g(x)|^2$ is in $L^q$ where $q = p'/2$ and $(1 + |x|^2)^{-\alpha}$ is in $L^{q'}$. Hence, the set $K$ supports a distribution of finite energy. The distinction between polar sets and sets of capacity zero being meaningless when the dimension of the space is $\le 2$, it follows that $K$ is a set of positive $2\alpha$-capacity. The Frostman theorem then implies that $H_{1-2\alpha}(K) > 0$. Since $K$ is contained in $F$, $H_\beta(F) > 0$ for all $\beta < 2/p'$.

The argument of this section shows that a compact subset of the real axis of Hausdorff dimension $a$ cannot support a distribution $T$ with Fourier transform in $L^q$ where $q < 2/a$. Hence, the result of Salem cited at the end of Section 52 cannot be improved.
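The exponent bookkeeping behind the appeal to Hölder's inequality can be written out explicitly. The following is a sketch under the hypotheses of the proof ($1 < p < 2$, $g$ in $L^{p'}$, $0 < \alpha < 1/2$, everything on the real line); the sample value $p = 4/3$ is chosen here purely for illustration.

```latex
% Hoelder exponents in the proof of Beurling's theorem (dimension n = 1).
\begin{align*}
q &= \frac{p'}{2} > 1, \qquad \frac{1}{q'} = 1 - \frac{1}{q} = 1 - \frac{2}{p'}, \\[4pt]
\int |g(x)|^2 (1+|x|^2)^{-\alpha}\,dx
  &\le \bigl\|\,|g|^2\,\bigr\|_{L^q}\,\bigl\|(1+|x|^2)^{-\alpha}\bigr\|_{L^{q'}}
   = \|g\|_{L^{p'}}^{2}\,\bigl\|(1+|x|^2)^{-\alpha}\bigr\|_{L^{q'}}, \\[4pt]
\bigl\|(1+|x|^2)^{-\alpha}\bigr\|_{L^{q'}} < \infty
  &\iff 2\alpha q' > 1
   \iff 2\alpha > 1 - \frac{2}{p'}
   \iff 1 - 2\alpha < \frac{2}{p'} .
\end{align*}
% Illustration (assumed sample value): p = 4/3 gives p' = 4, q = q' = 2, 2/p' = 1/2.
% Every alpha with 1/4 < alpha < 1/2 is then admissible, and H_{1-2\alpha}(F) > 0
% for each such alpha yields Hausdorff dimension of F at least 1/2 = 2/p'.
```

Letting $1 - 2\alpha$ increase to $2/p'$ is what turns the family of statements $H_{1-2\alpha}(F) > 0$ into the dimension bound of the theorem.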
Pure and Applied Mathematics
A Series of Monographs and Textbooks
Edited by Paul A. Smith and Samuel Eilenberg, Columbia University, New York

1: ARNOLD SOMMERFELD. Partial Differential Equations in Physics. 1949 (Lectures on Theoretical Physics, Volume VI)
2: REINHOLD BAER. Linear Algebra and Projective Geometry. 1952
3: HERBERT BUSEMANN AND PAUL KELLY. Projective Geometry and Projective Metrics. 1953
4: STEFAN BERGMAN AND M. SCHIFFER. Kernel Functions and Elliptic Differential Equations in Mathematical Physics. 1953
5: RALPH PHILIP BOAS, JR. Entire Functions. 1954
6: HERBERT BUSEMANN. The Geometry of Geodesics. 1955
7: CLAUDE CHEVALLEY. Fundamental Concepts of Algebra. 1956
8: SZE-TSEN HU. Homotopy Theory. 1959
9: A. M. OSTROWSKI. Solution of Equations and Systems of Equations. Second Edition. 1966
10: J. DIEUDONNÉ. Foundations of Modern Analysis. 1960
11: S. I. GOLDBERG. Curvature and Homology. 1962
12: SIGURDUR HELGASON. Differential Geometry and Symmetric Spaces. 1962
13: T. H. HILDEBRANDT. Introduction to the Theory of Integration. 1963
14: SHREERAM ABHYANKAR. Local Analytic Geometry. 1964
15: RICHARD L. BISHOP AND RICHARD J. CRITTENDEN. Geometry of Manifolds. 1964
16: STEVEN A. GAAL. Point Set Topology. 1964
17: BARRY MITCHELL. Theory of Categories. 1965
18: ANTHONY P. MORSE. A Theory of Sets. 1965
19: GUSTAVE CHOQUET. Topology. 1966
20: Z. I. BOREVICH AND I. R. SHAFAREVICH. Number Theory. 1966
21: JOSÉ LUIS MASSERA AND JUAN JORGE SCHAFFER. Linear Differential Equations and Function Spaces. 1966
22: RICHARD D. SCHAFER. An Introduction to Nonassociative Algebras. 1966
23: MARTIN EICHLER. Introduction to the Theory of Algebraic Numbers and Functions. 1966
24: SHREERAM ABHYANKAR. Resolution of Singularities of Embedded Algebraic Surfaces. 1966
25: FRANÇOIS TREVES. Topological Vector Spaces, Distributions, and Kernels. 1967
26: PETER D. LAX AND RALPH S. PHILLIPS. Scattering Theory. 1967
27: OYSTEIN ORE. The Four Color Problem. 1967
28: MAURICE HEINS. Complex Function Theory. 1968
29: R. M. BLUMENTHAL AND R. K. GETOOR. Markov Processes and Potential Theory. 1968
30: L. J. MORDELL. Diophantine Equations. 1969
31: J. BARKLEY ROSSER. Simplified Independence Proofs: Boolean Valued Models of Set Theory. 1969
32: WILLIAM F. DONOGHUE, JR. Distributions and Fourier Transforms. 1969
33: MARSTON MORSE AND STEWART S. CAIRNS. Critical Point Theory in Global and Differential Topology. 1969

In preparation:
HANS FREUDENTHAL AND H. DE VRIES. Linear Lie Groups.
J. DIEUDONNÉ. Foundations of Modern Analysis (enlarged and corrected printing)
EDWIN WEISS. Cohomology of Groups.