Pointwise Convergence Of Fourier Series


Author: Juan Arias de Reyna


Preface

This book grew out of my attempt in August 1998 to compare Carleson’s and Feﬀerman’s proofs of the pointwise convergence of Fourier series with Lacey and Thiele’s proof of the boundedness of the bilinear Hilbert transform. I started with Carleson’s paper and soon realized that my summer vacation would not suﬃce to understand Carleson’s proof. Bit by bit I began to understand it. I was impressed by the breathtaking proof and started to give a detailed exposition that could be understandable by someone who, like me, was not a specialist in harmonic analysis. I’ve been working on this project for almost two years and lectured on it at the University of Seville from February to June 2000. Thus, this book is meant for graduate students who want to understand one of the great achievements of the twentieth century. This is the ﬁrst exposition of Carleson’s theorem about the convergence of Fourier series in book form. It diﬀers from the previous lecture notes, one by Mozzochi [38], and the other by Jørsboe and Mejlbro [26], in that our exposition points out the motivation of every step in the proof. Since its publication in 1966, the theorem has acquired a reputation of being an isolated result, very technical, and not proﬁtable to study. There have also been many attempts to obtain the results by simpler methods. To this day it is the proof that gives the ﬁnest results about the maximal operator of Fourier series. The Carleson analysis of the function, one of the fundamental steps of the proof, has an interesting musical interpretation. A sound wave consists of a periodic variation of pressure occurring around the equilibrium pressure prevailing at a particular time and place. The sound signal f is the variation of the pressure as a function of time. The Carleson analysis gives the score of a musical composition given the sound signal f . The Carleson analysis can be carried out at diﬀerent levels. 
Obviously the above assertion is true only if we consider an adequate level. Carleson's proof has something that reminds me of living organisms. The proof is based on many choices that seem arbitrary. This happens also in living organisms. An example is the error in the design of the eyes of the vertebrates. The photoreceptors are situated in the retina, but their outputs emerge on the wrong side: inside the eyes. Therefore the axons must finally be packed in the optic nerve, which exits the eye through the so-called blind spot. But so many fibers (125 million light-sensitive cells) will not pass through a small spot. Hence evolution has solved the problem by packing another layer of neurons inside the eyes that have rich interconnections with the photoreceptors and with each other. These neurons process the information before it is sent to the brain, hence the number of axons that must leave the eye is substantially reduced (one million axons in each optic nerve). The incoming light must traverse these neurons to reach the photoreceptors, hence evolution has the added problem of making them transparent.

We have tried to arrange the proof so that these things do not happen, so that these arbitrary selections do not obscure the idea of the proof. We have had the advantage of the text processor TeX, which has allowed us to rewrite without much pain. (We hope that no signs of these rewritings remain.) By the way, the eyes and the ears process the information in totally different ways. The proof of Carleson follows more the ear than the eyes. But what these neurons are doing inside the eyes is just to solve the problem: How must I compress the information to send images using the least possible number of bits? A problem for which wavelets are being used today.

I would like this book to be a commentary on the Carleson paper. Therefore we give the Carleson-Hunt theorem following more Carleson's than Hunt's paper. The chapter on the maximal operator of Fourier series $S^*f$ gives the first exposition of the consequences of the Carleson-Hunt theorem. Some of the results appear here for the first time. I wish to express my thanks to Fernando Soria and to N. Yu Antonov for sending me their papers and their comments about the consequences of the Carleson-Hunt theorem.
Also to some members of the Department of Mathematical Analysis of the University of Seville, especially to Luis Rodríguez-Piazza, who showed me the example contained in chapter XIII.

Table of Contents

Preface
Introduction
About the notation

Part I. Fourier series and Hilbert Transform

1. Hardy-Littlewood maximal function
1.1 Introduction
1.2 Weak Inequality
1.3 Differentiability
1.4 Interpolation
1.5 A general inequality

2. Fourier Series
2.1 Introduction
2.2 Dirichlet
2.3 Fourier Series of Continuous Functions
2.4 Banach continuity principle
2.5 Summability
2.6 The Conjugate Function
2.7 The Hilbert transform on $\mathbb{R}$
2.8 The conjecture of Luzin

3. Hilbert Transform
3.1 Introduction
3.2 Truncated operators on $L^2(\mathbb{R})$
3.3 Truncated operators on $L^1(\mathbb{R})$
3.4 Interpolation
3.5 The Hilbert Transform
3.6 Maximal Hilbert Transform

Part II. The Carleson-Hunt Theorem

4. The Basic Step
4.1 Introduction
4.2 Carleson maximal operator
4.3 Local norms
4.4 Dyadic Partition
4.5 Some definitions
4.6 Basic decomposition
4.7 The first term
4.8 Notation $\alpha/\beta$
4.9 The second term
4.10 The third term
4.11 First form of the basic step
4.12 Some comments about the proof
4.13 Choosing the partition $\Pi_\alpha$. The norm $|f|_\alpha$
4.14 Basic theorem, second form

5. Maximal inequalities
5.1 Maximal inequalities for $\Delta(\Pi, x)$
5.2 Maximal inequalities for $H_I^* f$

6. Growth of Partial Sums
6.1 Introduction
6.2 The seven trick
6.3 The exceptional set
6.4 Bound for the partial sums

7. Carleson Analysis of the Function
7.1 Introduction
7.2 A musical interlude
7.3 The notes of $f$
7.4 The set $X$
7.5 The set $S$

8. Allowed pairs
8.1 The length of the notes
8.2 Well situated notes
8.3 The length of well situated notes
8.4 Allowed pairs
8.5 The exceptional set

9. Pair Interchange Theorems
9.1 Introduction
9.2 Choosing the shift $m$
9.3 A bound of $\|f\|_\alpha$
9.4 Selecting an allowed pair

10. All together
10.1 Introduction
10.2 End of proof

Part III. Consequences

11. Some spaces of functions
11.1 Introduction
11.2 Decreasing rearrangement
11.3 The Lorentz spaces $L^{p,1}(\mu)$ and $L^{p,\infty}(\mu)$
11.4 Marcinkiewicz interpolation theorem
11.5 Spaces near $L^1(\mu)$
11.6 The spaces $L\log L(\mu)$ and $L_{\exp}(\mu)$

12. The Maximal Operator of Fourier series
12.1 Introduction
12.2 Maximal operator of Fourier series
12.3 The distribution function of $S^*f$
12.4 The operator $S^*$ on the space $L^\infty$
12.5 The operator $S^*$ on the space $L(\log L)^2$
12.6 The operator $S^*$ on the space $L^p$
12.7 The maximal space $Q$
12.8 The theorem of Antonov

13. Fourier transform on the line
13.1 Introduction
13.2 Fourier transform

References
Comments
Subject Index

Introduction

The origin of Fourier series is the 18th century study by Euler and Daniel Bernoulli of the vibrating string. Bernoulli took the point of view, suggested by physical considerations, that every function can be expanded in a trigonometric series. At this time the prevalent idea was that such an expression implied differentiability properties, and that such an expansion was not possible in general. The question could not be settled at that time: a response depended on what is understood by a function, a concept that was not clear until the 20th century. The first positive results were given in 1829 by Dirichlet, who proved that the expansion is valid for every continuous function with a finite number of maxima and minima.

A great portion of the mathematics of the first part of the 20th century was motivated by the convergence of Fourier series. For example, Cantor's set theory has its origin in the study of this convergence. Also Lebesgue's measure theory owes its success to its application to Fourier series. Luzin, in 1913, while considering the properties of Hilbert's transform, conjectured that every function in $L^2[-\pi,\pi]$ has an a.e. convergent Fourier series. Kolmogorov, in 1923, gave an example of a function in $L^1[-\pi,\pi]$ with an a.e. divergent Fourier series. A. P. Calderón (in 1959) proved that if the Fourier series of every function in $L^2[-\pi,\pi]$ converges a.e., then

\[
m\{x : \sup_n |S_n f(x)| > y\} \le C\,\frac{\|f\|_2^2}{y^2}.
\]

For many people the belief in Luzin's conjecture was destroyed; it seemed too good to be true. So, it was a surprise when Carleson, in 1966, proved Luzin's conjecture. The next year Hunt proved the a.e. convergence of the Fourier series of every $f \in L^p[-\pi,\pi]$ for $1 < p \le \infty$. Kolmogorov's example is in fact a function in $L\log\log L$ with a.e. divergent Fourier series. Hunt proved that every function in $L(\log L)^2$ has an a.e. convergent Fourier series. Sjölin, in 1969, sharpened this result: every function in the space $L\log L\log\log L$ has an a.e. convergent Fourier series. The last result in this direction is that of Antonov (in 1996), who proved the same for functions in $L\log L(\log\log\log L)$. Also, there are some quasi-Banach spaces of functions with a.e. convergent Fourier series given by Soria in 1985. In the other direction, the results of Kolmogorov were sharpened by Chen in 1969, giving functions in $L(\log\log L)^{1-\varepsilon}$ with a.e. divergent Fourier series. Recently, Konyagin (1999) has obtained the same result for the space $L\varphi(L)$, whenever $\varphi$ satisfies $\varphi(t) = o\bigl(\sqrt{\log t/\log\log t}\bigr)$.

Apart from the proof of Carleson, there have been two others. First, the one by Fefferman in 1973. He says: "However, our proof is very inefficient near $L^1$. Carleson's construction can be pushed down as far as $L\log L(\log\log L)$, but our proof seems unavoidably restricted to $L(\log L)^M$ for some large $M$." Then we have the recent proof of Lacey and Thiele; they are not so explicit as Fefferman, but what they prove (as far as I know) is limited to the case $p = 2$. The proof of Lacey and Thiele is based on ideas from Fefferman's proof. The proof of Fefferman has also been very important because it inspired these two authors in their magnificent proof of the boundedness of the bilinear Hilbert transform. The trees and forests that appear in these proofs have some resemblance to the notes of the function and the allowed pairs of the proof of Carleson that are introduced in chapters eight and nine of this book, but a better understanding of these relationships is a matter for another book.

The aim of this book is the exposition of the principal result about the convergence of Fourier series, that is, the Carleson-Hunt Theorem. The book has three parts. The first part gives a review of some results needed in the proof and consists of three chapters. In the first chapter we give a review of the Hardy-Littlewood maximal function. We prove that this operator transforms $L^p$ into $L^p$ for $1 < p \le \infty$. The differentiation theorem allows one to see the great success we get with a pointwise convergence problem by applying the idea of the maximal function.
This makes it reasonable to consider the maximal operator $S^*f(x) = \sup_n |S_n(f,x)|$ in the problem of convergence of Fourier series. In chapter two we give elementary results about Fourier series. We see the relevance of the conjugate function and explain the elements on which Luzin, in 1913, founded his conjecture about the convergence of Fourier series of $L^2$ functions. We also present Dini's and Jordan's tests of convergence in conformity with the law: s/he who does not know these criteria must not read Carleson's proof. The properties of Hilbert's transform needed in the proof of Carleson's Theorem are treated in chapter three.

In the second part we give the exposition of the Carleson-Hunt Theorem. The basic idea of the proof is the following. Our aim is to bound what we call Carleson integrals

\[
\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\, f(t)\,dt.
\]

To this end we consider a partition $\Pi$ of the interval $[-\pi,\pi]$ into subintervals, one of them $I(x)$ containing the point $x$, and write the integral as

\[
\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\, f(t)\,dt
= \mathrm{p.v.}\int_{I(x)} \frac{e^{in(x-t)}}{x-t}\, f(t)\,dt
+ \sum_{J\in\Pi,\ J\neq I(x)} \left\{ \int_J \frac{e^{in(x-t)} f(t) - M_J}{x-t}\,dt + \int_J \frac{M_J}{x-t}\,dt \right\},
\]

where $M_J$ is the mean value of $e^{in(x-t)} f(t)$ on the interval $J$. The last sum can be conveniently bounded so that, in fact, we have changed the problem of bounding the first integral to the analogous problem for the integral on $I(x)$. After a change of scale we see that we have a similar integral, but the number of cycles $n$ in the exponent has decreased. Therefore we can repeat the reasoning. With this procedure we obtain the theorem that $S_n(f,x) = o(\log\log n)$ a.e., for every $f \in L^2[-\pi,\pi]$.

We think that to understand the proof of Carleson's theorem it is important to start with this theorem, because this is how the proof was generated. Only after we have understood this proof can we understand the very clever modifications that Carleson devised to obtain his theorem. The next three chapters are dedicated to this end. The first deals with the actual bound of the second and third terms, and the problem of how we must choose the partition to optimize these bounds. In Chapter five we prove that the bounds are good, with the exception of sets of controlled measure. Then, in Chapter six, given $f \in L^2[-\pi,\pi]$, $y > 0$, $N \in \mathbb{N}$, and $\varepsilon > 0$, we define a measurable set $E$ with $m(E) < A\varepsilon$ and such that, for every $x \in [-\pi,\pi] \setminus E$,

\[
\sup_{0\le n\le N/4} \left| \mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\, f(t)\,dt \right| \le C\,\frac{\|f\|_2}{\sqrt{\varepsilon}}\,(\log N).
\]

From this estimate we obtain the desired conclusion that $S_n(f,x) = o(\log\log n)$ a.e. These three chapters follow Carleson's paper, where instead of $f \in L^2$ he assumed that $|f|(\log^+|f|)^{1+\delta} \in L^1$, reaching the same conclusion. Since we shall obtain further results, we have taken the simpler hypothesis that $f \in L^2$. In fact, our motivation to include the proof is to allow the reader to understand the modifications contained in the next five chapters.

The logarithmic term appears in the above proof because every time we apply the basic procedure, we must set apart in the set $E$ a small subset where the bound is not good. We have to put in a term $\log N$ in order to obtain a controlled measure.
In fact, we are considering all pairs $(n, J)$ formed by a dyadic subinterval $J$ of $[-\pi,\pi]$ and the number $n$ of cycles of the Carleson integral. If we consider the procedure of chapters four to six, then we suspect that we do not need all of these pairs. This is the basic observation on which all the clever reasoning of Carleson is founded.

In chapter seven we determine which pairs are needed. Carleson made an analysis of the function to detect which pairs these are. If we think of $f$ as the sound signal of a piece of music, then this analysis can be seen as a process to derive from $f$ the score of this piece of music. In this chapter we define the set $Q_j$ of notes of $f$ to the level $j$. In chapter eight we define the set $R_j$ of allowed pairs. This is an enlargement of the set of notes of $f$, so that we can achieve two objectives. The principal objective is that if $\alpha = (n, J)$ is a pair such that $\alpha \in R_j$, then the sound of the notes of $f$ (at level $j$) that have a duration containing $J$ is essentially a single note or a rest. This is very important because if we consider a Carleson integral $C_\alpha f(x)$ with this pair, then we have a candidate note, the sound of $f$, that is an allowed pair and therefore can be used in the basic procedure of chapter four.

Chapter nine is the most difficult part of the proof. In it we see how, given an arbitrary Carleson integral $C_\alpha f(x)$, we can obtain an allowed pair $\xi$ such that we can apply the procedure of chapter four, and a change of frequency to bound this integral. In chapter ten we apply all this machinery to prove the basic inequality of Theorem 10.2.

The last part of the book is dedicated to deriving some consequences of the proof of Carleson-Hunt. First, in chapter eleven we prove a version of the Marcinkiewicz interpolation theorem and give the definition and first properties of the spaces that we shall need in chapter twelve. In particular, we study a class of spaces near $L^1(\mu)$ that play a prominent role. We prove that they are atomic spaces, a fact that allows very neat proofs in the following chapter. In Chapter twelve we study the maximal operator $S^*f$ of Fourier series.
In it we give detailed and explicit versions of Hunt's theorem, with improved constants. We end the chapter by defining two quasi-Banach spaces, $Q$ and $QA$, of functions with almost everywhere convergent Fourier series. These spaces improve the known results of Sjölin, Soria and Antonov, and the proofs are simpler. In the last chapter we consider the Fourier transform on $\mathbb{R}$. We consider the problem of when we can obtain the Fourier transform of a function $f \in L^p(\mathbb{R})$ by the formula

\[
\hat f(x) = \lim_{a\to+\infty} \int_{-a}^{a} f(t)\,e^{-2\pi i x t}\,dt.
\]

We prove by an example (Example 13.2) that our results are optimal.

1. Hardy-Littlewood maximal function

J.A. de Reyna: LNM 1785, pp. 3–10, 2002. © Springer-Verlag Berlin Heidelberg 2002

1.1 Introduction

What Carleson proved in 1966 was Luzin's conjecture of 1913, and this proof depended on many results obtained in the fifty years since the conjecture was stated. In this chapter we make a rapid exposition of one of these prerequisites. We can also see one of the best ideas, that is, taking a maximal operator when one wants to prove pointwise convergence. The convergence result obtained is simple: the differentiability of the definite integral. This permits one to observe one of the pieces of Carleson's proof without any technical problems.

Given a function $f \in L^1(\mathbb{R})$ we ask about the differentiability properties of the definite integral

\[
F(x) = \int_{-\infty}^{x} f(t)\,dt.
\]

This is equivalent to the question of whether there exists

\[
\lim_{h\to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h\to 0} \frac{1}{h}\int_{x}^{x+h} f(t)\,dt.
\]

When we are confronted with questions of convergence it is advisable to study the corresponding maximal function. Here,

\[
\sup_{h} \frac{1}{h}\left|\int_{x}^{x+h} f(t)\,dt\right|.
\]

An analogous result in dimension $n$ will be

\[
f(x) = \lim_{Q\searrow x} \frac{1}{|Q|}\int_{Q} f(t)\,dt, \tag{1.1}
\]

where $Q$ denotes a cube of center $x$ and side $h$ and we write $Q \searrow x$ to express that the side $h \to 0^+$. In the one-dimensional case we have $Q = [x-h, x+h]$; this difference ($[x-h, x+h]$ instead of $[x, x+h]$) has no consequence, as we will see. For every locally integrable function $f\colon \mathbb{R}^n \to \mathbb{C}$, we put

\[
Mf(x) = \sup_{Q} \frac{1}{|Q|}\int_{Q} |f(t)|\,dt,
\]

where the supremum is taken over all cubes $Q \subset \mathbb{R}^n$ with center $x$. $Mf$ is the Hardy-Littlewood maximal function.
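To make the definition concrete, here is a minimal sketch of a discrete analogue in dimension one. It is an illustration under stated assumptions, not the construction used in the text: the function is sampled on the integers, assumed to vanish outside the given list, and cubes with center $x$ become windows $\{n-r,\dots,n+r\}$. The name `maximal_function` is hypothetical.

```python
from typing import List, Sequence


def maximal_function(f: Sequence[float]) -> List[float]:
    """Discrete centered Hardy-Littlewood maximal function (illustrative).

    f is treated as a function on the integers that vanishes outside
    range(len(f)). For each index n we take the largest mean of |f|
    over the windows {n-r, ..., n+r}, r >= 0.
    """
    n_pts = len(f)
    # prefix[k] = |f[0]| + ... + |f[k-1]|, so each window sum costs O(1).
    prefix = [0.0]
    for value in f:
        prefix.append(prefix[-1] + abs(value))

    result = []
    for n in range(n_pts):
        best = 0.0
        # For r >= n_pts - 1 the window already contains the whole support,
        # so larger windows only lower the mean; r < n_pts is enough.
        for r in range(n_pts):
            lo = max(0, n - r)
            hi = min(n_pts, n + r + 1)
            mean = (prefix[hi] - prefix[lo]) / (2 * r + 1)
            best = max(best, mean)
        result.append(best)
    return result
```

For a unit spike, `maximal_function([0, 0, 1, 0, 0])` decays like the reciprocal of the window length needed to reach the spike, which is the discrete counterpart of the slow decay that prevents $Mf$ from being integrable.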

1.2 Weak inequality

First observe that given $f$ locally integrable, the function $Mf\colon \mathbb{R}^n \to [0,+\infty]$ is measurable. In fact for every positive real number $\alpha$ the set $\{Mf(x) > \alpha\}$ is open, because given $x \in \mathbb{R}^n$ with $Mf(x) > \alpha$ there exists a cube $Q$ with center at $x$ and such that

\[
\frac{1}{|Q|}\int_{Q} |f(t)|\,dt > \alpha.
\]

We only have to observe that the function

\[
y \mapsto \frac{1}{|Q|}\int_{y+Q} |f(t)|\,dt
\]

is continuous. If $f \in L^p(\mathbb{R}^n)$ with $1 < p \le +\infty$, then $Mf \in L^p(\mathbb{R}^n)$ (see Section 1.4). For $p = 1$ what we obtain instead is the weak inequality

\[
m\{Mf(x) > \alpha\} \le c_n\,\frac{\|f\|_1}{\alpha}.
\]

The proof is really wonderful. The set where $Mf(x) > \alpha$ is covered by cubes where the mean of $|f|$ is greater than $\alpha$. If this set has a big measure, we shall have plenty of these cubes. Then we can select a big pairwise disjoint subfamily and this implies that the norm of $f$ is big. The most delicate point of this proof is that at which we select the disjoint cubes. This is accomplished by the following covering lemma.

Lemma 1.1 (Covering lemma) Let $\mathbb{R}^d$ be endowed with some norm, and let $c_d = 2\cdot 3^d$. If $A \subset \mathbb{R}^d$ is a non-empty set of finite exterior measure, and $\mathcal{U}$ is a covering of $A$ by open balls, then there is a finite subfamily of disjoint balls $B_1, \dots, B_n$ of $\mathcal{U}$ such that

\[
c_d \sum_{j=1}^{n} m(B_j) \ge m^*(A).
\]

Proof. We can assume that $A$ is measurable, because if it were not, there would exist an open set $G \supset A$ with $m(G)$ finite and such that $\mathcal{U}$ would be a covering of $G$. Now, assuming that $A$ is measurable, there exists a compact set $K \subset A$ with $m(K) \ge m(A)/2$. Now select a finite subcovering of $K$, say that with the balls $U_1, U_2, \dots, U_m$. Assume that these balls are ordered with decreasing radii.

Then we select the balls $B_j$ in the following way. First $B_1 = U_1$ is the largest of them all. Then $B_2$ is the first ball in the sequence of $U_j$ that is disjoint from $B_1$, if there is one; otherwise we put $n = 1$. Then $B_3$ will be the first ball from the $U_j$ that is disjoint from $B_1 \cup B_2$. We continue in this way, until every ball from the sequence $U_j$ has non-empty intersection with some $B_k$.

Now we claim that $K \subset \bigcup_{j=1}^{n} 3B_j$. In fact we know that $K \subset \bigcup_{j=1}^{m} U_j$. Hence for every $x \in K$, there is a first $j$ such that $x \in U_j$. If this $U_j$ is equal to some $B_k$, obviously we have $x \in B_k \subset 3B_k$. Otherwise $U_j$ intersects some $B_k = U_s$. Selecting the minimum $k$, it must be that $s < j$, for otherwise we would have selected $U_j$ instead of $B_k$ in our process. So the radius of the ball $B_k$ is greater than or equal to that of $U_j$. It follows that $U_j \subset 3B_k$. Therefore

\[
\frac{1}{2}\,m(A) \le m(K) \le 3^d \sum_{j=1}^{n} m(B_j),
\]

and the construction implies that these balls are disjoint.
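The selection step of the proof is a greedy algorithm, and it can be sketched in a few lines for dimension one, where balls are open intervals. This is only an illustration of the argument: the representation of a ball as a `(center, radius)` pair and the function names are assumptions made here, not notation from the text.

```python
from typing import List, Tuple

Ball = Tuple[float, float]  # (center, radius): the open interval (c - r, c + r)


def select_disjoint(balls: List[Ball]) -> List[Ball]:
    """Greedy selection from the proof: scan the balls by decreasing radius
    and keep each ball that is disjoint from all balls kept so far."""
    ordered = sorted(balls, key=lambda ball: ball[1], reverse=True)
    chosen: List[Ball] = []
    for center, radius in ordered:
        # Two open intervals are disjoint iff their centers are at
        # distance at least the sum of their radii.
        if all(abs(center - c) >= radius + r for c, r in chosen):
            chosen.append((center, radius))
    return chosen


def covered_by_dilates(balls: List[Ball], chosen: List[Ball],
                       factor: float = 3.0) -> bool:
    """Check that every ball lies inside the factor-dilate of a chosen ball."""
    return all(
        any(abs(center - c) + radius <= factor * r for c, r in chosen)
        for center, radius in balls
    )


if __name__ == "__main__":
    family = [(0.0, 1.0), (1.5, 1.0), (3.0, 0.5), (10.0, 2.0), (9.0, 0.5)]
    kept = select_disjoint(family)
    print(kept)
    print(covered_by_dilates(family, kept))
```

A rejected ball meets a previously kept ball of radius at least its own, which is exactly why the factor 3 suffices in the containment check.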

Lemma 1.2 (Hardy and Littlewood) If $f \in L^1(\mathbb{R}^d)$ then $Mf$ satisfies, for each $\alpha > 0$, the weak inequality

\[
m\{x \in \mathbb{R}^d \mid Mf(x) > \alpha\} \le c_d\,\frac{\|f\|_1}{\alpha}.
\]

Proof. Let $A = \{x \in \mathbb{R}^d \mid Mf(x) > \alpha\}$; it is an open set. We do not know yet that it has finite measure, so we consider $A_n = A \cap B_n$, where $B_n$ is a ball of radius $n$ and center 0. Now each $x \in A_n$ has $Mf(x) > \alpha$; hence there exists an open cube $Q$, with center at $x$ and such that

\[
\frac{1}{|Q|}\int_{Q} |f(t)|\,dt > \alpha. \tag{1.2}
\]

Now cubes are balls for the norm $\|\cdot\|_\infty$ on $\mathbb{R}^d$. So we can apply the covering lemma to obtain a finite set of disjoint cubes $(Q_j)_{j=1}^{m}$ such that every one of them satisfies (1.2), and

\[
m(A_n) \le c_d \sum_{j=1}^{m} m(Q_j).
\]

Therefore we have

\[
m(A_n) \le c_d \sum_{j=1}^{m} \frac{1}{\alpha}\int_{Q_j} |f(t)|\,dt.
\]

Since the cubes are disjoint

\[
m(A_n) \le c_d\,\frac{\|f\|_1}{\alpha}.
\]

Taking limits when $n \to \infty$, we obtain our desired bound.
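As a sanity check on Lemma 1.2, the following worked example (an illustration added here, not part of the original argument) computes the maximal function of the indicator of $[-1,1]$ in dimension $d = 1$, with the supremum taken over intervals $[x-h, x+h]$ centered at $x$ as in the definition of $Mf$. It shows both that the weak inequality holds and that $Mf$ fails to be integrable.

```latex
% Worked example (illustrative): f = \chi_{[-1,1]} on \mathbb{R}, so \|f\|_1 = 2.
\[
Mf(x) = \sup_{h>0} \frac{1}{2h}\, m\bigl([x-h,x+h]\cap[-1,1]\bigr)
      = \begin{cases} 1, & |x| < 1,\\[4pt] \dfrac{1}{1+|x|}, & |x| \ge 1. \end{cases}
\]
% For |x| >= 1 the supremum is attained at h = |x| + 1,
% the smallest centered interval that contains [-1,1].
\[
m\{Mf > \alpha\} = 2\max\Bigl(1,\ \frac{1}{\alpha} - 1\Bigr)
  \le \frac{2}{\alpha} = \frac{\|f\|_1}{\alpha}
  \qquad (0 < \alpha < 1),
\]
\[
\int_{\mathbb{R}} Mf(x)\,dx \ \ge\ \int_{|x|\ge 1} \frac{dx}{1+|x|} = +\infty .
\]
```

In this example the weak inequality holds even with constant 1, well inside the constant $c_1 = 6$ given by the covering lemma.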

1.3 Differentiability

As an application we desire to obtain (1.1). In fact we can prove something more. It is not only that at almost every point $x \in \mathbb{R}^d$ we have

\[
\lim_{Q\searrow x} \frac{1}{|Q|}\int_{Q} \bigl(f(t) - f(x)\bigr)\,dt = 0,
\]

but that we have

\[
\lim_{Q\searrow x} \frac{1}{|Q|}\int_{Q} \bigl|f(t) - f(x)\bigr|\,dt = 0.
\]

A point where this is true is called a Lebesgue point of $f$.

Theorem 1.3 (Differentiability Theorem) Let $f\colon \mathbb{R}^d \to \mathbb{C}$ be a locally integrable function. There exists a subset $Z \subset \mathbb{R}^d$ of null measure and such that every $x \notin Z$ is a Lebesgue point of $f$. That is

\[
\lim_{Q\searrow x} \frac{1}{|Q|}\int_{Q} \bigl|f(t) - f(x)\bigr|\,dt = 0.
\]

Proof. Whether $x$ is a Lebesgue point of $f$ or not depends only on the values of $f$ in a neighborhood of $x$. So we can reduce to the case of $f$ integrable. Also the results are true for a dense set in $L^1(\mathbb{R}^d)$. In fact if $f$ is continuous, given $x$ and $\varepsilon > 0$, there is a neighborhood of $x$ such that $|f(t) - f(x)| < \varepsilon$. Hence if $Q$ denotes a cube with a sufficiently small radius we have

\[
\frac{1}{|Q|}\int_{Q} \bigl|f(t) - f(x)\bigr|\,dt \le \varepsilon.
\]

Hence, for a continuous function $f$, every point is a Lebesgue point.

Now we can observe for the first time how the maximal function intervenes in pointwise convergence matters. We are going to define the operator $\Omega$. If $f \in L^1(\mathbb{R}^d)$,

\[
\Omega f(x) = \limsup_{Q\searrow x} \frac{1}{|Q|}\int_{Q} \bigl|f(t) - f(x)\bigr|\,dt.
\]

Note that $\Omega f(x) \le Mf(x) + |f(x)|$. Now our objective is to prove that $\Omega f(x) = 0$ almost everywhere. Fix $\varepsilon > 0$. Since the continuous functions are dense in $L^1(\mathbb{R}^d)$, we obtain a continuous $\varphi \in L^1(\mathbb{R}^d)$ such that $\|f - \varphi\|_1 < \varepsilon$. By the triangle inequality

\[
\Omega f(x) \le \Omega\varphi(x) + \Omega(f - \varphi)(x) = \Omega(f - \varphi)(x) \le M(f - \varphi)(x) + |f(x) - \varphi(x)|.
\]

Hence for every $\alpha > 0$ we have

\[
\{\Omega f(x) > \alpha\} \subset \{M(f - \varphi)(x) > \alpha/2\} \cup \{|f(x) - \varphi(x)| > \alpha/2\}.
\]

Now we use the weak inequality for the Hardy-Littlewood maximal function and the Chebyshev inequality for $|f - \varphi|$:

\[
m\{\Omega f(x) > \alpha\} \le 2c_d\,\frac{\|f - \varphi\|_1}{\alpha} + 2\,\frac{\|f - \varphi\|_1}{\alpha} \le C_d\,\frac{\varepsilon}{\alpha}.
\]

Since this inequality is true for every $\varepsilon > 0$, we deduce $m\{\Omega f(x) > \alpha\} = 0$. And this is true for every $\alpha > 0$, hence $\Omega f(x) = 0$ almost everywhere.

As an example we prove that

\[
F(x) = \int_{-\infty}^{x} f(t)\,dt
\]

is differentiable at every Lebesgue point of $f$. We assume that $f$ is integrable. For $h > 0$

\[
\frac{F(x+h) - F(x)}{h} - f(x) = \frac{1}{h}\int_{x}^{x+h} \bigl(f(t) - f(x)\bigr)\,dt.
\]

Hence

\[
\left|\frac{F(x+h) - F(x)}{h} - f(x)\right| \le \frac{1}{h}\int_{x}^{x+h} \bigl|f(t) - f(x)\bigr|\,dt \le \frac{2}{2h}\int_{x-h}^{x+h} \bigl|f(t) - f(x)\bigr|\,dt.
\]

If $x$ is a Lebesgue point of $f$ we know that the limit when $h \to 0$ is equal to zero. An analogous procedure proves the existence of the left-hand limit at $x$.
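A short worked example may help to see what the exceptional set $Z$ of Theorem 1.3 can look like. The computation below is an illustration added here (the function and the value $c$ are choices made for the example): for the indicator of $[0,1]$ on $\mathbb{R}$, the endpoint $0$ is not a Lebesgue point, whatever value $c$ we assign to $f(0)$.

```latex
% Illustrative example: f = \chi_{[0,1]} on \mathbb{R}, with f(0) = c arbitrary.
% For 0 < h < 1 we have f = 0 on (-h,0) and f = 1 on (0,h), hence
\[
\frac{1}{2h}\int_{-h}^{h} |f(t) - c|\,dt
  = \frac{1}{2h}\bigl(h\,|c| + h\,|1-c|\bigr)
  = \frac{|c| + |1-c|}{2} \ \ge\ \frac{1}{2},
\]
% so the limit as h -> 0^+ is not zero and x = 0 is not a Lebesgue point.
% The same computation applies at x = 1. At every other x the function is
% constant near x, so x is a Lebesgue point; thus Z = \{0, 1\} and m(Z) = 0.
```

This is consistent with the theorem: the points where the averaged oscillation fails to vanish form a set of measure zero.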

1.4 Interpolation

At one extreme, with $p = 1$, the maximal function $Mf$ satisfies a weak inequality. At the other extreme $p = +\infty$, it is obvious from the definition that if $f \in L^\infty(\mathbb{R}^d)$

\[
\|Mf\|_\infty \le \|f\|_\infty.
\]

An idea of Marcinkiewicz permits us to interpolate between these two extremes.

Theorem 1.4 For every $f \in L^p(\mathbb{R}^d)$, $1 < p < +\infty$, we have

\[
\|Mf\|_p \le C_d\,\frac{p}{p-1}\,\|f\|_p.
\]

Proof. For every $\alpha > 0$ we decompose $f$, $f = f\chi_A + f\chi_{\mathbb{R}^d\setminus A}$, where $A = \{|f| > \alpha\}$. Then $Mf \le \alpha + M(f\chi_A)$. Consequently

\[
m\{Mf > 2\alpha\} \le m\{M(f\chi_A) > \alpha\} \le \frac{c_d}{\alpha}\int_{\mathbb{R}^d} |f|\,\chi_{\{|f|>\alpha\}}\,dm.
\]

The proof depends on a judicious use of this inequality. In particular observe that we have used a different decomposition of $f$ for every $\alpha$. We have the following chain of inequalities

\[
\|Mf\|_p^p = p\int_{0}^{+\infty} t^{p-1}\, m\{Mf > t\}\,dt \le p\int_{0}^{+\infty} t^{p-1}\,\frac{2c_d}{t}\int_{\mathbb{R}^d} |f|\,\chi_{\{|f|>t/2\}}\,dm\,dt.
\]

Applying Fubini's theorem

\[
\|Mf\|_p^p \le 2c_d\,p\int_{\mathbb{R}^d} |f(x)|\int_{0}^{+\infty} t^{p-2}\,\chi_{\{|f(x)|>t/2\}}\,dt\,dx
= \frac{2c_d\,p}{p-1}\int_{\mathbb{R}^d} (2|f(x)|)^{p-1}\,|f(x)|\,dx
= \frac{2^p c_d\,p}{p-1}\,\|f\|_p^p.
\]

It is easy to see that $(p/(p-1))^{1/p}$ is equivalent to $p/(p-1)$. Hence we obtain our claim about the norm.

In the case of $p = 1$ the best we can say is the weak inequality. For example if $\|f\|_1 > 0$, then $Mf$ is not integrable. In spite of this we shall need in the proof of Carleson's theorem a bound of the integral of the maximal function on a set of finite measure; that is a consequence of the weak inequality.

Proposition 1.5 For every function $f \in L^1(\mathbb{R}^d)$ and $B \subset \mathbb{R}^d$ a measurable set

\[
\int_{B} Mf(x)\,dx \le m(B) + 2c_d\int_{\mathbb{R}^d} |f(x)|\log^+|f(x)|\,dx.
\]

Proof. Let $m_B$ be the measure $m_B(M) = m(B \cap M)$. We have

\[
\int_{B} Mf(x)\,dx = \int_{0}^{+\infty} m_B\{Mf(x) > t\}\,dt.
\]

Now we have two inequalities: $m_B\{Mf(x) > t\} \le m(B)$, and the weak inequality. The point of the proof is to use the weak inequality adequately. For every $\alpha$ we have $f = f\chi_A + f\chi_{\mathbb{R}^d\setminus A}$ where $A = \{|f(x)| > \alpha\}$. Therefore $Mf \le \alpha + M(f\chi_A)$, and

\[
\{Mf(x) > 2\alpha\} \subset \{M(f\chi_A)(x) > \alpha\}.
\]

It follows that

\[
m\{Mf(x) > 2\alpha\} \le \frac{c_d}{\alpha}\int_{\{|f(x)|>\alpha\}} |f(x)|\,dx.
\]

Hence

\[
\int_{B} Mf(x)\,dx \le m(B) + 2\int_{1}^{+\infty} \frac{c_d}{t}\int_{\{|f(x)|>t\}} |f(x)|\,dx\,dt.
\]

Therefore by Fubini's theorem

\[
\int_{B} Mf(x)\,dx \le m(B) + 2c_d\int_{\mathbb{R}^d} |f(x)|\log^+|f(x)|\,dx.
\]

1.5 A general inequality

The Hardy-Littlewood maximal function can be used to prove many theorems of pointwise convergence. This and many other applications of these functions derive from the following inequality.

Theorem 1.6 Let $\varphi\colon \mathbb{R}^d \to \mathbb{R}$ be a positive, radial, decreasing, and integrable function. Then for every $f \in L^p(\mathbb{R}^d)$ and $x \in \mathbb{R}^d$ we have
\[ |\varphi * f(x)| \le C_d\, \|\varphi\|_1\, Mf(x), \]
where $C_d$ is a constant depending only on the dimension, and equal to 1 for $d = 1$.


Proof. We say that $\varphi$ is radial if there is a function $u\colon [0, +\infty) \to \mathbb{R}$ such that $\varphi(x) = u(|x|)$ for every $x \in \mathbb{R}^d$. Also we say that a radial function $\varphi$ is decreasing if $u$ is decreasing. The function $u$ is measurable, hence there is an increasing sequence of simple functions $(u_n)$ such that $u_n(t)$ converges to $u(t)$ for every $t \ge 0$. In this case, since $u$ is decreasing, it is possible to choose each $u_n$ of the form
\[ u_n(t) = \sum_{j=1}^{N} h_j\, \chi_{[0, t_j]}(t), \]
where $0 < t_1 < t_2 < \cdots < t_N$, $h_j > 0$, and the natural number $N$ depends on $n$.

Now the proof is straightforward. Let $\varphi_n(x) = u_n(|x|)$. By the monotone convergence theorem
\[ |\varphi * f(x)| \le \varphi * |f|(x) = \lim_n \varphi_n * |f|(x). \]
Moreover
\[ \varphi_n * |f|(x) = \sum_{j=1}^{N} h_j \int_{B(x, t_j)} |f(y)|\, dy. \]
We can replace the ball $B(x, t_j)$ by the cube $Q(x, t_j)$ with center $x$ and side $2t_j$. The quotient between the volume of the ball and that of the cube is bounded by a constant. Thus
\[ \varphi_n * |f|(x) \le \sum_{j=1}^{N} h_j\, m\bigl(Q(x, t_j)\bigr) \cdot Mf(x) \le C_d\, \|\varphi\|_1\, Mf(x). \]

2. Fourier Series

2.1 Introduction

Let $f\colon \mathbb{R} \to \mathbb{C}$ be a $2\pi$-periodic function, integrable on $[-\pi, \pi]$. The Fourier series of $f$ is the series
\[ \sum_{j=-\infty}^{+\infty} a_j e^{ijt} \tag{2.1} \]
where the Fourier coefficients $a_j$ are defined by
\[ a_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ijt}\, dt. \tag{2.2} \]
These coefficients are denoted by $\hat f(j) = a_j$.

These series were considered in the eighteenth century by Daniel Bernoulli, Euler, Lagrange, etc. They knew that if a function is given by the series (2.1), the coefficients can be calculated by (2.2). They also knew many examples. Bernoulli, studying the movement of a string fixed at its ends, gave the expression
\[ y(x, t) = \sum_{j=1}^{\infty} a_j \sin\frac{j\pi x}{\ell} \cos j\rho t \]
for its position, where $\ell$ is the length of the string and the coefficient $\rho$ depends on its physical properties. In 1753 Euler noticed a paradoxical implication: the initial position of the string would be given by
\[ f(x) = \sum_{j=1}^{\infty} a_j \sin\frac{j\pi x}{\ell}. \]
At that time curves were classified as continuous, if they were defined by a formula, and geometrical, if they could be drawn by hand. It was thought that the first ones were locally determined, while the movement of the hand was not determined by the first stroke. Bernoulli believed that the representation of an arbitrary function was possible.

J.A. de Reyna: LNM 1785, pp. 11–29, 2002. © Springer-Verlag Berlin Heidelberg 2002


Fourier affirmed in his book Théorie Analytique de la Chaleur (1822) that the development was valid in the general case. This topic is connected with the definition of the concept of function.

2.2 Dirichlet Kernel

The convergence of the series (2.1) was considered by Dirichlet in 1829. He proved that the series converges to $\bigl(f(x+0) + f(x-0)\bigr)/2$ for every piecewise continuous and monotone function. This was later superseded by the results of Dini and Jordan. To prove these results we consider first the following result of Riemann.

Proposition 2.1 (Riemann-Lebesgue lemma) If $f\colon \mathbb{R} \to \mathbb{C}$ is $2\pi$-periodic and integrable on $[-\pi, \pi]$, then
\[ \lim_{|j| \to \infty} \hat f(j) = 0. \]

Proof. If we change variables $u = t + \pi/j$ in the integral (2.2), the exponential changes sign. Averaging the two expressions for $\hat f(j)$ we have
\[ \hat f(j) = \frac{1}{4\pi} \int_{-\pi}^{\pi} f(t) e^{-ijt}\, dt - \frac{1}{4\pi} \int_{-\pi}^{\pi} f\Bigl(t - \frac{\pi}{j}\Bigr) e^{-ijt}\, dt. \]
For a continuous function $f$ it follows that $\lim |\hat f(j)| = 0$. For a general $f$ we approximate it in $L^1$ norm by a continuous function.


To study pointwise convergence we consider the partial sums
\[ S_n(f, x) = \sum_{j=-n}^{n} \hat f(j) e^{ijx}. \]
Since every coefficient has an integral expression, we obtain an integral form for the partial sum of the Fourier series
\[ S_n(f, x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(x - t) f(t)\, dt, \]
where the function $D_n$, the Dirichlet kernel, is given by
\[ D_n(t) = \sum_{j=-n}^{n} e^{ijt} = \frac{\sin\bigl(n + \tfrac12\bigr)t}{\sin t/2}. \]
It follows that $f \mapsto S_n(f, x)$ is a continuous linear form defined on $L^1[-\pi, \pi]$. The function $D_n$ is $2\pi$-periodic, with integral equal to 1, but $\|D_n\|_1$ and $\|D_n\|_\infty$ are not uniformly bounded.

With the integral expression of the partial sums we can obtain the two basic conditions for pointwise convergence.
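The closed form of the Dirichlet kernel and its normalization can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`dirichlet_sum`, `dirichlet_closed`, `mean_over_period`) and the choice $n = 8$ are assumptions made for illustration. Here "integral equal to 1" is understood with the normalized measure $dt/2\pi$.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) as the sum over -n <= j <= n of e^{ijt}, i.e. 1 + 2 * sum_{j=1}^{n} cos(jt)."""
    return 1.0 + 2.0 * sum(math.cos(j * t) for j in range(1, n + 1))

def dirichlet_closed(n, t):
    """Closed form sin((n + 1/2) t) / sin(t/2); at t = 0 we return the limit 2n + 1."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * t) / s

def mean_over_period(g, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * integral of g over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    return sum(g(-math.pi + (k + 0.5) * h) for k in range(samples)) * h / (2.0 * math.pi)

n = 8  # illustrative choice
# Largest discrepancy between the two expressions on a grid in [-3.14, 3.14].
max_gap = max(abs(dirichlet_sum(n, t) - dirichlet_closed(n, t))
              for t in (0.01 * k for k in range(-314, 315)))
# Normalized integral of D_n over one period.
mean_D = mean_over_period(lambda t: dirichlet_closed(n, t))
print(max_gap, mean_D)
```

The two expressions should agree up to rounding error, and the normalized integral should be 1 up to rounding error for every $n$.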

[Figure: The Dirichlet kernel $D_8(t)$ on $[-\pi, \pi]$.]

Theorem 2.2 (Dini's test) If $f \in L^1[-\pi, \pi]$ and
\[ \int_0^{\pi} |f(x + t) + f(x - t) - 2f(x)|\, \frac{dt}{t} < +\infty, \]
then the Fourier series of $f$ at the point $x$ converges to $f(x)$.

Proof. The difference $S_n(f) - f$ can be written as
\[ S_n(f, x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(t) \bigl(f(x - t) - f(x)\bigr)\, dt = \frac{1}{2\pi} \int_0^{\pi} D_n(t) \bigl(f(x + t) + f(x - t) - 2f(x)\bigr)\, dt. \]
Since $2 \sin t/2 \sim t$, the Riemann-Lebesgue lemma proves that this difference tends to 0.


Theorem 2.3 (Jordan's test) If $f \in L^1[-\pi, \pi]$ is of bounded variation on an open interval that contains $x$, then the Fourier series at $x$ converges to $\bigl(f(x+0) + f(x-0)\bigr)/2$.

Proof. The proof is based on the fact that although the norms $\|D_n\|_1$ are not bounded, the integrals
\[ \int_0^{\delta} D_n(t)\, dt \]
are uniformly bounded in $n$ and $\delta$. (This can be proved by changing the Dirichlet kernel to the equivalent $\sin\bigl(n + \tfrac12\bigr)t / t$ and then applying a change of variables.)

Without loss of generality we can assume that $x = 0$. Also we can assume that $f$ is increasing on a neighborhood of 0. We must prove
\[ \lim_n \frac{1}{2\pi} \int_0^{\pi} D_n(t) \bigl(f(t) + f(-t)\bigr)\, dt = \bigl(f(0+) + f(0-)\bigr)/2. \]
By symmetry it suffices to prove
\[ \lim_n \frac{1}{2\pi} \int_0^{\pi} D_n(t) f(t)\, dt = f(0+)/2. \]
Finally we can assume that $f(0+) = 0$. Choose $\delta > 0$ such that $0 \le f(t) < \varepsilon$ for every $0 < t < \delta$. We decompose the integral into two parts, one on $[0, \delta]$ and the other on $[\delta, \pi]$. We apply to the first integral the second mean value theorem, which states that if $g$ is continuous and $f$ monotone on $[a, b]$, there exists $c \in [a, b]$ such that
\[ \int_a^b f(t) g(t)\, dt = f(b-) \int_c^b g(t)\, dt + f(a+) \int_a^c g(t)\, dt. \]
Therefore, for some $\eta \in [0, \delta]$,
\[ \int_0^{\pi} D_n(t) f(t)\, dt = f(\delta-) \int_{\eta}^{\delta} D_n(t)\, dt + \int_{\delta}^{\pi} D_n(t) f(t)\, dt. \]
The second integral converges to 0 by the Riemann-Lebesgue lemma, and the first is less than $C\varepsilon$ by the property of the Dirichlet kernel that we have noted.

We see that these conditions only depend on the values of $f$ in an arbitrarily small neighborhood of $x$. This is a general fact, known as the Riemann localization principle: the convergence of the Fourier series to $f(x)$ only depends on the values of $f$ in a neighborhood of $x$. This is clear from the expression of $S_n(f, x)$ as an integral and the Riemann-Lebesgue lemma. This is surprising, because each $\hat f(j)$ depends on all the values of $f$.


The two criteria given are independent. If $f(t) = 1/|\log(t/2\pi)|$ and $g(t) = t^{\alpha} \sin(1/t)$, for $0 < t < \pi$ and $0 < \alpha < 1$, then $f$ satisfies Jordan's condition but not Dini's test at the point $t = 0$. On the other hand $g$ satisfies only Dini's test.

2.3 Fourier series of continuous functions

The convergence conditions that we have proved show that the Fourier series of a differentiable function converges pointwise to the function. This is not true for continuous functions. Du Bois Reymond constructed a continuous function whose Fourier series is divergent at one point. This follows from the Banach-Steinhaus theorem. We consider $T_n(f) = S_n(f, 0)$ as a linear operator on the space of continuous functions on $[-\pi, \pi]$ that take the same values at the endpoints. By the Banach-Steinhaus theorem, $\sup_n \|T_n\| < +\infty$ if and only if for every $f \in C(\mathbb{T})$ we have $\sup_n |T_n(f)| < +\infty$. But an easy calculation shows that
\[ \|T_n\| = \|D_n\|_1 = \frac{4}{\pi^2} \log n + O(1). \]
The numbers $L_n = \|D_n\|_1$ are called Lebesgue constants. Their order is computed as follows:
\[ L_n = \frac{2}{2\pi} \int_0^{\pi} \Bigl| \frac{\sin(n + 1/2)t}{\sin(t/2)} \Bigr|\, dt = \frac{2}{\pi} \int_0^{\pi} \Bigl| \frac{\sin(n + 1/2)t}{t} \Bigr|\, dt + O(1) \]
\[ = \frac{2}{\pi} \int_0^{(2n+1)\pi/2} \Bigl| \frac{\sin u}{u} \Bigr|\, du + O(1) = \frac{2}{\pi} \sum_{k=0}^{n-1} \int_{k\pi}^{(k+1)\pi} \Bigl| \frac{\sin u}{u} \Bigr|\, du + O(1) \]
\[ = \frac{2}{\pi} \sum_{k=0}^{n-1} \int_0^{\pi} \frac{\sin u}{k\pi + u}\, du + O(1) = \frac{2}{\pi} \int_0^{\pi} \sum_{k=1}^{n-1} \frac{\sin u}{k\pi + u}\, du + O(1) \]
\[ = \frac{4}{\pi} \sum_{k=1}^{n-1} \frac{1}{k\pi + \xi_k} + O(1) = \frac{4}{\pi^2} \log n + O(1), \]
where each $\xi_k \in [0, \pi]$ is given by the mean value theorem.
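The logarithmic growth of the Lebesgue constants can be observed numerically. The sketch below, which is only an illustration under stated assumptions, approximates $L_n = \frac{1}{\pi}\int_0^{\pi} |D_n(t)|\,dt$ by the midpoint rule and prints the difference $L_n - \frac{4}{\pi^2}\log n$; the function name `lebesgue_constant`, the number of sample points, and the chosen values of $n$ are assumptions, not part of the text.

```python
import math

def lebesgue_constant(n, samples=200000):
    """Midpoint-rule approximation of L_n = (1/pi) * integral_0^pi |sin((n+1/2)t) / sin(t/2)| dt."""
    h = math.pi / samples
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) * h          # midpoints avoid the removable singularity at t = 0
        total += abs(math.sin((n + 0.5) * t) / math.sin(t / 2.0))
    return total * h / math.pi

values = {n: lebesgue_constant(n) for n in (1, 10, 100)}
for n, L in values.items():
    # Second column: L_n; third column: L_n - (4/pi^2) log n, which should stay bounded.
    print(n, round(L, 4), round(L - 4.0 / math.pi**2 * math.log(n), 4))
```

For $n = 1$ the integral can be computed by hand, since $D_1(t) = 1 + 2\cos t$, which gives $L_1 = \tfrac13 + 2\sqrt3/\pi$; this provides a check on the quadrature.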

Notice the following corollary.

Corollary 2.4 If $f \in L^\infty[-\pi, \pi]$, then $|S_n(f, x)| \le \bigl(\frac{4}{\pi^2} \log n + C\bigr) \|f\|_\infty$.

The following theorem is more difficult. In its proof we need an expression of the Dirichlet kernel that plays an important role in Carleson's theorem.


Theorem 2.5 (Hardy) If $f \in L^1[-\pi, \pi]$, then at every Lebesgue point $x$ of $f$
\[ \lim_{n \to \infty} S_n(f, x)/\log n = 0. \]
Furthermore, if $f$ is continuous on an open interval $I$, the convergence is uniform on every closed $J \subset I$.

Proof. The Dirichlet kernel can be written as
\[ D_n(t) = \frac{\sin(n + 1/2)t}{\sin t/2} = 2\, \frac{\sin nt}{t} + \cos nt + \Bigl( \frac{1}{\tan t/2} - \frac{2}{t} \Bigr) \sin nt. \]
The last two terms are bounded uniformly in $n$ and $t$. Therefore
\[ D_n(t) = 2\, \frac{\sin nt}{t} + \varphi_n(t), \qquad |t| < \pi, \tag{2.3} \]
and there is an absolute constant $0 < C < +\infty$ such that $\|\varphi_n\|_\infty \le C$. This expression will play a role in Carleson's theorem. Now we have
\[ \Bigl| S_n(f, x) - \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t)\, \frac{\sin nt}{t}\, dt \Bigr| \le c \int_{-\pi}^{\pi} |f(t)|\, dt. \]
It follows that for every $f$ in $L^1[-\pi, \pi]$
\[ |S_n(f, x)| \le c \|f\|_1 + \Bigl| \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t)\, \frac{\sin nt}{t}\, dt \Bigr|. \]
The function $\sin t / t$ has integrals uniformly bounded on intervals. It follows that
\[ |S_n(f, x)| \le C + \Bigl| \frac{1}{\pi} \int_0^{\pi} \{f(x + t) + f(x - t) - 2f(x)\}\, \frac{\sin nt}{t}\, dt \Bigr|. \]
Let $\varphi_x(t)$ denote the function $f(x + t) + f(x - t) - 2f(x)$. If $x$ is a Lebesgue point of $f$, the primitive $\Phi(t)$ of $|\varphi_x(t)|$ satisfies $\Phi(t) = o(t)$ when $t \to 0$. With these notations
\[ |S_n(f, x)| \le C + \frac{n}{\pi} \int_0^{1/n} |\varphi_x(t)|\, dt + \frac{1}{\pi} \int_{1/n}^{\pi} t^{-1} |\varphi_x(t)|\, dt \]
\[ = C + \frac{n}{\pi}\, \Phi\Bigl(\frac{1}{n}\Bigr) + \frac{1}{\pi} \Bigl[ \Phi(t) t^{-1} \Bigr]_{1/n}^{\pi} + \frac{1}{\pi} \int_{1/n}^{\pi} \Phi(t) t^{-2}\, dt. \]
It follows easily that $S_n(f, x)/\log n \to 0$ since, $x$ being a Lebesgue point of $f$, we have $\Phi(t) = o(t)$.
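The uniform boundedness of the remainder $\varphi_n$ in (2.3) can be illustrated numerically. The sketch below evaluates $\varphi_n(t) = D_n(t) - 2\sin(nt)/t$ on a grid in $(0, \pi)$ for several $n$. The grid, the values of $n$, and the explicit bound $1 + 2/\pi$ are assumptions made for this illustration; the bound comes from $|\cos nt| \le 1$ together with the elementary estimate $|\cot(t/2) - 2/t| \le 2/\pi$ on $(0, \pi]$, and the text only asserts that some absolute constant exists.

```python
import math

def dirichlet(n, t):
    """Dirichlet kernel D_n(t) = sin((n + 1/2) t) / sin(t/2), for 0 < |t| < pi."""
    return math.sin((n + 0.5) * t) / math.sin(t / 2.0)

def remainder(n, t):
    """phi_n(t) = D_n(t) - 2 sin(nt)/t, the bounded part in decomposition (2.3)."""
    return dirichlet(n, t) - 2.0 * math.sin(n * t) / t

# Midpoints of a uniform partition of (0, pi); phi_n is even, so t > 0 suffices.
grid = [math.pi * (k + 0.5) / 2000 for k in range(2000)]
sup_by_n = {n: max(abs(remainder(n, t)) for t in grid) for n in (1, 10, 100, 1000)}
bound = 1.0 + 2.0 / math.pi   # crude bound: |cos nt| + |cot(t/2) - 2/t| |sin nt|

for n, s in sup_by_n.items():
    print(n, round(s, 4), s <= bound)
```

The observed suprema should not grow with $n$, in contrast with $\|D_n\|_\infty = 2n + 1$.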


This result is the best possible in the following sense: for every sequence $(\lambda_n)$ such that $\lambda_n^{-1} \log n \to +\infty$, there exists a continuous function $f$ such that $|S_n(f, x)| > \lambda_n$ for infinitely many natural numbers $n$.

Some properties of the trigonometrical system follow from the fact that it is an orthonormal set of functions. For example D. E. Menshov and H. Rademacher proved that the series of orthonormal functions $\sum_j c_j \varphi_j$ converges almost everywhere if $\sum_j |c_j \log j|^2 < +\infty$. This is also the best result for general orthonormal systems; in particular D. E. Menshov in 1923 proved that there exist series $\sum_j c_j \varphi_j$ divergent almost everywhere and such that $\sum_j |c_j|^2 < +\infty$. Therefore Carleson's theorem is a property of the trigonometrical system that depends on the natural ordering of this system.

The result of D. E. Menshov and H. Rademacher is easily proved. In fact this is a general result that relates the order of $S_n(f, x)$ and the convergence of certain series.

Theorem 2.6 Assume that $(\lambda_n)$ is an increasing sequence of positive real numbers such that for every $g \in L^2[-\pi, \pi]$
\[ \lim_{n \to +\infty} \frac{S_n(g, x)}{\lambda_{n+1}} = 0. \]
Then if $f \in L^1[-\pi, \pi]$ satisfies $\sum_j |\hat f(j) \lambda_{|j|}|^2 < +\infty$, then
\[ f(x) = \lim_n S_n(f, x), \qquad \text{a. e.} \]

Proof. By the Riesz-Fischer theorem there exists a function $g \in L^2[-\pi, \pi]$ such that $\hat g(j) = \hat f(j) \lambda_{|j|}$. Comparing Fourier coefficients we derive the equality
\[ S_n(f, x) = \sum_{k=0}^{n} \Bigl( \frac{1}{\lambda_k} - \frac{1}{\lambda_{k+1}} \Bigr) S_k(g, x) + \frac{S_n(g, x)}{\lambda_{n+1}}. \]
By our hypothesis about the functions in $L^2[-\pi, \pi]$, we deduce that the character of the sequence $S_n(f, x)$ coincides with that of the series
\[ \sum_{k=1}^{\infty} \Bigl( \frac{1}{\lambda_k} - \frac{1}{\lambda_{k+1}} \Bigr) S_k(g, x) = \sum_{k=1}^{\infty} h_k(x). \]
But as a series in $L^2[-\pi, \pi]$, we have $\sum_k \|h_k\|_2 < +\infty$. Therefore the series converges a. e.


2.4 Banach continuity principle

The hypothesis that $\lim_n S_n(f, x)/\lambda_{n+1} = 0$ in Theorem 2.6 can be replaced by $\sup_n |S_n(f, x)/\lambda_{n+1}| < +\infty$ a. e. This is a general fact due to Banach. This reduces the problem of a. e. convergence of Fourier series to proving the pointwise boundedness of the maximal operator
\[ \sup_n |S_n(f, x)|. \]

To prove these assertions we need some knowledge about the space of measurable functions $L^0[-\pi, \pi]$. It is a metric space with distance
\[ d(f, g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{|f - g|}{1 + |f - g|}\, dm. \]
This is a complete metric vector space. A sequence $(f_n)$ converges to 0 if and only if it converges to 0 in measure. That is to say: for every $\varepsilon > 0$ we have $\lim_n m\{|f_n| > \varepsilon\} = 0$.

Consider now a sequence $(T_n)$ of linear operators $T_n\colon L^p[-\pi, \pi] \to L^0[-\pi, \pi]$. We assume that each $T_n$ is continuous in measure; therefore for every $(f_k)$ with $\|f_k\|_p \to 0$, and every $\varepsilon > 0$, we have $m\{|T(f_k)| > \varepsilon\} \to 0$ (and this is true for every $T = T_n$).

Observe that if $T_n f(x)$ converges a. e., then the maximal operator $T^* f(x) = \sup_n |T_n f(x)|$ is bounded a. e. The principle of Banach is a sort of uniform boundedness principle: the continuity in measure of a sequence of operators and the almost everywhere finiteness of the maximal operator imply the continuity at 0 in measure of the maximal operator.

Theorem 2.7 (Banach's continuity principle) Let us assume that for every $f \in L^p[-\pi, \pi]$ we have $T^* f(x) < +\infty$ a. e. on $[-\pi, \pi]$. Then there exists a decreasing function $C(\alpha)$, defined for every $\alpha > 0$, such that $\lim_{\alpha \to +\infty} C(\alpha) = 0$ and such that
\[ m\{T^* f(x) > \alpha \|f\|_p\} \le C(\alpha) \]
for every $f \in L^p[-\pi, \pi]$.

Proof. Fix a positive real number $\varepsilon > 0$. For every natural number $n$ let $F_n$ be the set of $f \in L^p[-\pi, \pi]$ such that
\[ m\{T^* f(x) > n\} \le \varepsilon. \]
The set $F_n$ is closed in $L^p[-\pi, \pi]$. To prove this consider $f \notin F_n$; then $m\{T^* f(x) > n\} > \varepsilon$. It follows that there exists $N$ such that
\[ m\Bigl\{ \sup_{1 \le k \le N} |T_k f(x)| > n \Bigr\} > \varepsilon. \]
Then there exists $\delta > 0$ such that
\[ m\Bigl\{ \sup_{1 \le k \le N} |T_k f(x)| > n + \delta \Bigr\} > \varepsilon + \delta. \]
By the continuity in measure of the operators $T_k$, there exists $\delta' > 0$ such that for every $g$ with $\|f - g\|_p < \delta'$ we have
\[ m\{|T_k(f - g)(x)| > \delta\} < \delta/2^k, \qquad 1 \le k \le N. \]
Let $Z$ be the union of the exceptional sets $\{|T_k(f - g)(x)| > \delta\}$. Then $m(Z) < \delta$. Also we have
\[ \{T^* g(x) > n\} \cup Z \supset \Bigl\{ \sup_{1 \le k \le N} |T_k f(x)| > n + \delta \Bigr\}. \]
Therefore it follows that $m\{T^* g(x) > n\} > \varepsilon$. That is, the set $L^p[-\pi, \pi] \setminus F_n$ is open.

Now our hypothesis about the boundedness of $T^* f$ implies that
\[ L^p[-\pi, \pi] = \bigcup_n F_n. \]
By Baire's category theorem there is some $n \in \mathbb{N}$ such that $F_n$ has a nonempty interior. That is, there exist $f_0 \in F_n$ and $\delta > 0$ such that $f = f_0 + \delta g \in F_n$ for every $g$ with $\|g\|_p = 1$. Thus
\[ m\{T^*(f_0 + \delta g) > n\} \le \varepsilon. \]
Then
\[ m\{T^* g > 2n/\delta\} \le m\{T^*(f_0 + \delta g) > n\} + m\{T^*(f_0 - \delta g) > n\} \le 2\varepsilon. \]
Therefore for every $g \in L^p[-\pi, \pi]$
\[ m\{T^* g > (2n/\delta) \|g\|_p\} \le 2\varepsilon. \]
Hence, if we define
\[ C(\alpha) = \sup_g\, m\{T^* g > \alpha \|g\|_p\}, \]
the function $C(\alpha)$ satisfies $\lim_{\alpha \to +\infty} C(\alpha) = 0$.

This principle is completed with the fact that under the same hypothesis about (Tn ), the set of f ∈ Lp [−π, π], where the limit limn Tn f (x) exists a. e., is closed in Lp [−π, π]. To prove this, deﬁne the operator


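\[ \Omega(f)(x) = \limsup_{n, m} |T_n f(x) - T_m f(x)|. \]
It is clear that $\Omega f \le 2T^* f$. Therefore
\[ m\{\Omega f(x) > \alpha \|f\|_p\} \le C(\alpha/2). \]
For every function $\varphi$ such that the limit $\lim_n T_n \varphi(x)$ exists a. e., we have $\Omega \varphi = 0$ and $\Omega(f - \varphi) = \Omega f$. It follows that
\[ m\{\Omega f(x) > \alpha \|f - \varphi\|_p\} \le C(\alpha/2). \]
Now let $f$ be in the closure of the set of such functions $\varphi$. Take $\alpha = 1/\varepsilon$ and $\varphi$ with $\|f - \varphi\|_p < \varepsilon^2$. We obtain $m\{\Omega f(x) > \varepsilon\} \le C(1/2\varepsilon)$. It follows easily that $m\{\Omega f(x) > 0\} = 0$.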

2.5 Summability

As we have said, Du Bois Reymond constructed a continuous function whose Fourier series diverges at some point. Lipót Fejér proved, when he was 19 years old, that in spite of this we can recover a continuous function from its Fourier series. Recall that if a sequence converges, then the sequence formed by the arithmetic means of its terms also converges, and to the same limit. Fejér considered the mean values of the partial sums
\[ \sigma_n(f, x) = \frac{1}{n + 1} \sum_{j=0}^{n} S_j(f, x). \]
We have an integral expression for these mean values
\[ \sigma_n(f, x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} F_n(x - t) f(t)\, dt, \]
where $F_n$ is the Fejér kernel:
\[ F_n(t) = \frac{1}{n + 1} \sum_{j=0}^{n} D_j(t) = \sum_{j=-n}^{n} \Bigl( 1 - \frac{|j|}{n + 1} \Bigr) e^{ijt}. \]
There is another expression for $F_n$. We substitute the value of the Dirichlet kernel; then we want to sum the sequence $\sin(j + 1/2)t$ for $0 \le j \le n$. This is the imaginary part of
\[ \sum_{j=0}^{n} e^{i(2j+1)t/2} = e^{it/2} \sum_{j=0}^{n} e^{ijt} = \frac{e^{i(n+1)t} - 1}{2i \sin(t/2)}, \]

thus
\[ \sum_{j=0}^{n} \sin\frac{2j + 1}{2}\, t = \frac{1 - \cos(n + 1)t}{2 \sin(t/2)} = \frac{\sin^2\bigl((n + 1)t/2\bigr)}{\sin(t/2)}; \]
it follows that
\[ F_n(t) = \frac{1}{n + 1} \Bigl( \frac{\sin\bigl((n + 1)t/2\bigr)}{\sin(t/2)} \Bigr)^2. \tag{2.4} \]
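Formula (2.4) can be compared with the coefficient form of the Fejér kernel numerically. The following sketch is an illustration only: the helper names, the value $n = 6$, the grid, and the cutoff $\delta = 0.5$ are assumptions. It also evaluates the supremum of $F_m$ on $\delta \le |t| \le \pi$, which by (2.4) is at most $1/\bigl((m+1)\sin^2(\delta/2)\bigr)$.

```python
import math

def fejer_sum(n, t):
    """F_n(t) = sum over -n <= j <= n of (1 - |j|/(n+1)) e^{ijt}, written with cosines."""
    return 1.0 + 2.0 * sum((1.0 - j / (n + 1.0)) * math.cos(j * t) for j in range(1, n + 1))

def fejer_closed(n, t):
    """Closed form (2.4); at t = 0 we return the limit n + 1."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return n + 1.0
    return (math.sin((n + 1) * t / 2.0) / s) ** 2 / (n + 1.0)

n = 6  # illustrative choice
points = [0.01 * k for k in range(-314, 315)]
max_gap = max(abs(fejer_sum(n, t) - fejer_closed(n, t)) for t in points)
min_value = min(fejer_closed(n, t) for t in points)   # positivity of the kernel

# Supremum away from the origin, for increasing m.
delta = 0.5
tail_sup = {m: max(fejer_closed(m, t) for t in points if abs(t) >= delta)
            for m in (10, 100, 1000)}
print(max_gap, min_value, tail_sup)
```

The positivity and the decay away from the origin are exactly the features in which the Fejér kernel differs from the Dirichlet kernel.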

We thus have that $F_n$ is a positive function, $\|F_n\|_1 = 1$, and for every $\delta > 0$ we have, uniformly on $\delta < |t| \le \pi$, that $\lim_n F_n(t) = 0$.

With more generality we define a summability kernel to be a sequence $(k_n)$ of periodic functions such that:

(i) $\displaystyle \frac{1}{2\pi} \int_{-\pi}^{\pi} k_n(t)\, dt = 1$.

(ii) $\|k_n\|_1 \le C$.

(iii) For every $\delta > 0$, $\displaystyle \lim_{n \to +\infty} \frac{1}{2\pi} \int_{\delta < |t| \le \pi} |k_n(t)|\, dt = 0$.

In the following theorem we identify $[-\pi, \pi]$ with the torus $\mathbb{T}$ so that we can speak of $k_n * f$. Here is the theorem of Fejér, extended to every summability kernel.

Theorem 2.8 Let $(k_n)$ be a summability kernel. If $f\colon \mathbb{R} \to \mathbb{C}$ is continuous and $2\pi$-periodic, then $k_n * f(x)$ converges uniformly to $f(x)$. Moreover, for every $1 \le p < +\infty$ and $f \in L^p[-\pi, \pi]$ we have
\[ \lim_{n \to +\infty} \|k_n * f - f\|_p = 0. \]

Proof. First assume $f$ to be continuous and $2\pi$-periodic. By property (i) of the summability kernel
\[ k_n * f(x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \bigl(f(x - t) - f(x)\bigr) k_n(t)\, dt. \]
Given $\varepsilon > 0$ we decompose the integral into two parts, one over $\{|t| < \delta\}$ and the other over $\{\delta < |t| < \pi\}$. The first is small by the continuity of $f$ and property (ii) of the kernel; the second is small by (iii).

Observe that the same proof shows the convergence at every point of continuity of $f$, for a measurable bounded $f$. Since $(F_n)$ is a summability kernel and $F_n * f$ is a trigonometrical polynomial for every $f$, it follows that these polynomials are dense in $C(\mathbb{T})$.


Now for $f \in L^p[-\pi, \pi]$ we have that $t \mapsto G(t) = \|f(\cdot + t) - f(\cdot)\|_p$ is a continuous $2\pi$-periodic function. We have
\[ \|k_n * f - f\|_p \le \frac{1}{2\pi} \int_{-\pi}^{\pi} \|f(\cdot - t) - f(\cdot)\|_p\, k_n(t)\, dt = k_n * G(0). \]
We can apply to this convolution the first part of the theorem to conclude that $\lim_n k_n * G(0) = G(0) = 0$.

Thus if $f$ is continuous and $2\pi$-periodic, $\sigma_n(f, x)$ converges uniformly to $f$, and it converges to $f$ in $L^p[-\pi, \pi]$ if $f \in L^p[-\pi, \pi]$.

Another important example is that of the Poisson kernel. This kernel appears when we consider the Fourier series of $f$ as the boundary values of a complex function defined on the open unit disc. If $f \in L^1[-\pi, \pi]$ the series
\[ \sum_{j=0}^{+\infty} \hat f(j) z^j + \sum_{j=1}^{+\infty} \hat f(-j) \bar z^j \]
converges on the open unit disc and defines a complex harmonic function $u(z)$. Then
\[ u(re^{i\theta}) = \sum_{j=-\infty}^{+\infty} \hat f(j) r^{|j|} e^{ij\theta} = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_r(\theta - t) f(t)\, dt, \]
where the Poisson kernel $P_r(\theta)$ is defined as
\[ P_r(\theta) = \sum_{j=-\infty}^{+\infty} r^{|j|} e^{ij\theta} = \frac{1 - r^2}{1 - 2r \cos\theta + r^2}. \tag{2.5} \]

It is easy to check that Pr (θ) is a summability kernel. (Here the variable r takes the role of n, but this is a minor diﬀerence).
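The claim that the Poisson kernel is a summability kernel can be illustrated numerically. The sketch below is a hedged illustration: the helper names, the truncation of the series at 400 terms, the value $r = 0.65$, and the cutoff $\delta = 0.5$ are assumptions. It compares the series in (2.5) with the closed form, checks the normalization (i), and evaluates the mass outside $|t| < \delta$ as $r \to 1^-$, which corresponds to property (iii); property (ii) follows from (i) because $P_r > 0$.

```python
import math

def poisson_series(r, theta, terms=400):
    """Truncation of sum_j r^{|j|} e^{ij theta} = 1 + 2 * sum_{j>=1} r^j cos(j theta)."""
    return 1.0 + 2.0 * sum(r**j * math.cos(j * theta) for j in range(1, terms + 1))

def poisson_closed(r, theta):
    """Closed form (2.5)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)

def mean_over(g, a, b, samples=20000):
    """Midpoint rule for (1/2pi) * integral of g over [a, b]."""
    h = (b - a) / samples
    return sum(g(a + (k + 0.5) * h) for k in range(samples)) * h / (2.0 * math.pi)

r = 0.65
gap = max(abs(poisson_series(r, t) - poisson_closed(r, t))
          for t in (0.1 * k for k in range(-31, 32)))
total_mass = mean_over(lambda t: poisson_closed(r, t), -math.pi, math.pi)

# Mass on delta < |t| <= pi; the kernel is even, hence the factor 2.
delta = 0.5
tail_mass = {rr: 2.0 * mean_over(lambda t: poisson_closed(rr, t), delta, math.pi)
             for rr in (0.5, 0.9, 0.99, 0.999)}
print(gap, total_mass, tail_mass)
```

As $r$ approaches 1 the tail mass should tend to 0 while the total mass stays equal to 1, so the kernel concentrates at the origin.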

[Figure: The Poisson kernel. Graphs of $P_{0.4}(t)$ and $P_{0.65}(t)$ on $[-\pi, \pi]$.]

Then we have


Proposition 2.9 If $f \in L^p[-\pi, \pi]$, $1 \le p < +\infty$, or $p = +\infty$ and $f$ is continuous with $f(\pi) = f(-\pi)$, we have $\lim_{r \to 1^-} \|P_r * f - f\|_p = 0$.

Now we consider $f \in L^1[-\pi, \pi]$ and ask about the a. e. convergence of $F_n * f(x)$ or $P_r * f(x)$ to $f(x)$. Obviously there exist some subsequences $F_{n_k} * f$ and $P_{r_k} * f$ that converge a. e. to $f$.

The function $u(re^{i\theta}) = P_r * f(\theta)$ is harmonic on the unit disc. Then what we want is a theorem of radial convergence of a harmonic function to $\lim_{r \to 1^-} u(re^{i\theta})$. The first theorem of this type is due to Fatou in 1905: a bounded and analytic function on the open unit disc has radial limits at almost every point of the boundary.

We prefer to give a proof like that of the differentiation theorem, which can also be extended to the case of $\sigma_n(f, x)$.

Theorem 2.10 (Fatou) Let $f \in L^1[-\pi, \pi]$. For almost every point $x \in [-\pi, \pi]$ we have
\[ \lim_{r \to 1^-} P_r * f(x) = f(x), \qquad \lim_n \sigma_n(f, x) = f(x). \]

Proof. The principal part is to prove that the maximal operators
\[ P^* f(x) = \sup_{0 < r < 1} |P_r * f(x)|, \qquad F^* f(x) = \sup_n |F_n * f(x)| \]
are bounded. This follows from the general inequality about the Hardy-Littlewood maximal function.

Define $f^\circ\colon \mathbb{R} \to \mathbb{C}$ as 0 for $|x| > 2\pi$ and equal to the periodic extension of $f$ when $|x| < 2\pi$. Also define $P_r^\circ\colon \mathbb{R} \to \mathbb{C}$ as $P_r^\circ(\theta) = 0$ when $|\theta| > \pi$, and $P_r(\theta)$ for $|\theta| < \pi$. Then we can write
\[ P_r * f(x) = P_r^\circ * f^\circ(x), \qquad |x| < \pi. \]
In the same way we can define the function $F_n^\circ$ so that
\[ \sigma_n(f, x) = F_n * f(x) = F_n^\circ * f^\circ(x), \qquad |x| < \pi. \]
Since $P_r^\circ$ is a radial function that is decreasing for $x > 0$ and its integral on $\mathbb{R}$ is equal to 1, we have $|P_r^\circ * f^\circ(x)| \le Mf^\circ(x)$. Thus $P^* f(x) \le Mf^\circ(x)$ for every $|x| < \pi$.

The Fejér kernel is not decreasing, but $\sin(t/2) > t/\pi$ for $0 < t < \pi$, therefore
\[ F_n^\circ(t) \le \begin{cases} \dfrac{1}{n + 1} \Bigl( \dfrac{(n + 1)t/2}{t/4} \Bigr)^2 \le 16(n + 1), & \text{if } |t| \le \dfrac{1}{n + 1}, \\[2ex] \dfrac{1}{n + 1} \Bigl( \dfrac{1}{t/4} \Bigr)^2 = \dfrac{16}{n + 1}\, \dfrac{1}{t^2}, & \text{if } \dfrac{1}{n + 1} < |t| < \pi. \end{cases} \]
Thus $F_n^\circ$ is bounded by a radial function that is decreasing for $t > 0$ and has integrals uniformly bounded. It follows that


\[ F^* f(x) \le C\, Mf^\circ(x). \]
Now the proof of the a. e. pointwise convergence follows as in the differentiability theorem.

2.6 The conjugate function

Since $(e^{int})$ is a complete orthonormal system on the space $L^2[-\pi, \pi]$, we have Parseval's equality
\[ \sum_{j \in \mathbb{Z}} |\hat f(j)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)|^2\, dt. \]
It follows that
\[ \lim_n \|f - S_n(f)\|_2 = 0. \]
In fact for every $1 < p < +\infty$ we have
\[ \lim_n \|f - S_n(f)\|_p = 0, \tag{2.6} \]
for every $f \in L^p[-\pi, \pi]$. In the case $p = 1$ this is no longer true. In fact if $F_N$ and $D_n$ are Fejér's and Dirichlet's kernels, then
\[ S_n(F_N) = D_n * F_N = \sigma_N(D_n). \]
Therefore by Fejér's theorem $\lim_N S_n(F_N) = D_n$ in $L^1[-\pi, \pi]$. Since $\|F_N\|_1 = 1$, it follows that $\|S_n\|_1 \ge L_n$. And we know that $L_n \sim \log n$.

On the other hand, to prove (2.6) it suffices to prove that the norms of the partial sum operators $S_n\colon L^p[-\pi, \pi] \to L^p[-\pi, \pi]$ are uniformly bounded. In fact, since the trigonometrical polynomials are dense in $L^p[-\pi, \pi]$, given $\varepsilon > 0$ we find a polynomial $P_\varepsilon$ such that $\|f - P_\varepsilon\|_p < \varepsilon$. Then if $n$ is greater than the degree of $P_\varepsilon$, so that $S_n(P_\varepsilon) = P_\varepsilon$,
\[ \|f - S_n(f)\|_p \le \|f - P_\varepsilon\|_p + \|S_n(P_\varepsilon) - S_n(f)\|_p \le \varepsilon + C\varepsilon. \]
The uniform boundedness of these norms was proved by M. Riesz in 1928. He considered the operator defined on the space of trigonometrical polynomials as
\[ R\Bigl( \sum_j a_j e^{ijt} \Bigr) = \sum_{j \ge 0} a_j e^{ijt}. \]
It is clear that $R$ is a continuous projection on $L^2[-\pi, \pi]$. What is remarkable about $R$ is that we have the relationship
\[ S_n(f) = e^{-int} R(e^{int} f) - e^{i(n+1)t} R(e^{-i(n+1)t} f). \]
Then the uniform boundedness of the norm of $S_n$ follows if $R$ can be extended to a continuous operator on $L^p[-\pi, \pi]$.
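The relationship between $S_n$ and $R$ is purely a statement about which frequencies survive, so it can be checked on trigonometrical polynomials represented by their coefficients. The sketch below is an illustration under stated assumptions: a polynomial is stored as a dictionary mapping the frequency $j$ to the coefficient $a_j$, and the helper names and the sample polynomial are hypothetical.

```python
def shift(coeffs, m):
    """Multiplication by e^{imt}: the coefficient of e^{ijt} moves to index j + m."""
    return {j + m: a for j, a in coeffs.items()}

def riesz_projection(coeffs):
    """R keeps the frequencies j >= 0."""
    return {j: a for j, a in coeffs.items() if j >= 0}

def partial_sum(coeffs, n):
    """S_n keeps the frequencies -n <= j <= n."""
    return {j: a for j, a in coeffs.items() if -n <= j <= n}

def subtract(p, q):
    """Coefficientwise difference p - q, dropping entries that cancel exactly."""
    out = dict(p)
    for j, a in q.items():
        out[j] = out.get(j, 0) - a
    return {j: a for j, a in out.items() if a != 0}

def partial_sum_via_R(coeffs, n):
    """S_n(f) = e^{-int} R(e^{int} f) - e^{i(n+1)t} R(e^{-i(n+1)t} f)."""
    first = shift(riesz_projection(shift(coeffs, n)), -n)            # frequencies j >= -n
    second = shift(riesz_projection(shift(coeffs, -(n + 1))), n + 1)  # frequencies j >= n + 1
    return subtract(first, second)

# Hypothetical sample polynomial with nonzero coefficients for -7 <= j <= 7.
f = {j: complex(j + 10, 2 * j - 3) for j in range(-7, 8)}
identity_holds = all(partial_sum_via_R(f, n) == partial_sum(f, n) for n in range(0, 10))
print(identity_holds)
```

The comments on `first` and `second` show why the identity holds: the first term keeps the frequencies $j \ge -n$, the second keeps $j \ge n + 1$, and their difference leaves exactly $-n \le j \le n$.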


The operator $R$ is related to the conjugate harmonic function. Consider a power series
\[ \sum_{n > 0} (a_n + i b_n) z^n. \]
Its real and imaginary parts for $z = e^{it}$ are
\[ u = \sum_{n > 0} (a_n \cos nt - b_n \sin nt); \qquad v = \sum_{n > 0} (a_n \sin nt + b_n \cos nt). \]
We say that $v = \tilde u$ is the conjugate series to $u$. The operator $H$ that sends $u$ to $v$ must satisfy
\[ H(\cos nt) = \sin nt; \qquad H(\sin nt) = -\cos nt. \]
It is the same to say $H(e^{int}) = -i \operatorname{sgn}(n) e^{int}$. The operators $R$ and $H$ are related by
\[ R\bigl(f\bigr)(\theta) + e^{i\theta} R\bigl(e^{-i\theta} f\bigr)(\theta) = f(\theta) + iH\bigl(f\bigr)(\theta). \]
It follows that the operator $H$ extends to a continuous operator from $L^2[-\pi, \pi]$ to $L^2[-\pi, \pi]$.

In the next chapter we shall study the operator $H$. For the time being we obtain some expressions for this operator. Let $f \in L^1[-\pi, \pi]$. Its Fourier series is
\[ \sum_j \hat f(j) e^{ijx}. \]
We shall call
\[ \sum_j (-i) \operatorname{sgn}(j) \hat f(j) e^{ijx} \]
the conjugate series. It is clear that when $f \in L^2[-\pi, \pi]$, this conjugate series is the Fourier series of $Hf$. We can express the partial sums of the conjugate series as a convolution
\[ \tilde S_n(f, x) = \sum_{j=-n}^{n} (-i) \operatorname{sgn}(j) \hat f(j) e^{ijx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \tilde D_n(x - t)\, dt. \]
Here $\tilde D_n$ is the conjugate of the Dirichlet kernel
\[ \tilde D_n(t) = 2 \sum_{j=1}^{n} \sin jt = \frac{\cos t/2 - \cos(n + 1/2)t}{\sin t/2}. \]

And we have a condition of convergence similar to that of Dini.


Theorem 2.11 (Pringsheim convergence test) Let $f \in L^1[-\pi, \pi]$ be a $2\pi$-periodic function and $x \in [-\pi, \pi]$ such that
\[ \int_0^{\pi} |f(x + t) - f(x - t)|\, \frac{dt}{t} < +\infty; \]
then the conjugate series converges at the point $x$.

Proof. Since $\tilde D_n(t)$ is an odd function,
\[ \tilde S_n(f, x) = \frac{1}{2\pi} \int_0^{\pi} \bigl(f(x - t) - f(x + t)\bigr) \tilde D_n(t)\, dt. \]
Then the Riemann-Lebesgue lemma and the expression for $\tilde D_n(t)$ end the proof.

We also get that
\[ \lim \tilde S_n(f, x) = \frac{1}{2\pi} \int_0^{\pi} \frac{f(x - t) - f(x + t)}{\tan t/2}\, dt. \]
It follows easily that under the hypothesis of the theorem the principal value exists and
\[ \lim \tilde S_n(f, x) = \frac{1}{2\pi}\, \mathrm{p.v.} \int_{-\pi}^{\pi} \frac{f(x - t)}{\tan t/2}\, dt. \]
We see that the Hilbert transform of a differentiable function is given by
\[ Hf(x) = \frac{1}{2\pi}\, \mathrm{p.v.} \int_{-\pi}^{\pi} \frac{f(x - t)}{\tan t/2}\, dt. \]
In 1913 Luzin proved that the principal value exists and equals $Hf(x)$ a. e. for every $f \in L^2[-\pi, \pi]$. Later, in 1919, Privalov proved that the principal value exists a. e. for every $f \in L^1[-\pi, \pi]$.

2.7 The Hilbert transform on R

In the following chapter we will study the Hilbert transform. It is convenient to perform this study on $\mathbb{R}$ instead of on the torus. Almost all of what we have said for Fourier series has an analogue for $\mathbb{R}$. For $f \in L^1(\mathbb{R})$ the Fourier transform is defined as
\[ \hat f(x) = \int_{-\infty}^{+\infty} f(t) e^{-2\pi i t x}\, dt. \]


This is analogous to the Fourier coefficients $\hat f(j)$. The partial sums of the Fourier series are similar to
\[ S_a(f, x) = \int_{-a}^{a} \hat f(\xi) e^{2\pi i \xi x}\, d\xi = \int f(t) D_a(x - t)\, dt, \]
where the Dirichlet kernel is replaced by
\[ D_a(t) = \frac{\sin 2\pi a t}{\pi t}. \]
And Fejér sums are replaced by
\[ \sigma_a(f, x) = \frac{1}{a} \int_0^a S_t(f, x)\, dt = \int f(x - \xi) F_a(\xi)\, d\xi, \]
where the analogue of the Fejér kernel is
\[ F_a(\xi) = \frac{1}{a} \Bigl( \frac{\sin \pi \xi a}{\pi \xi} \Bigr)^2. \]
The role of the unit disc is taken by the half-plane $y > 0$ in $\mathbb{R}^2$. If $f \in L^1(\mathbb{R})$, we define on this half-plane the analytic function
\[ F(z) = \frac{1}{\pi i} \int_{-\infty}^{+\infty} \frac{f(t)}{t - z}\, dt. \]
When $f$ is a real function the real and imaginary parts of $F$ are given by
\[ u(x, y) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{y}{(x - t)^2 + y^2}\, f(t)\, dt; \qquad v(x, y) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{x - t}{(x - t)^2 + y^2}\, f(t)\, dt. \]
For a general $f$ the functions $u$ and $v$ defined by these integrals are harmonic conjugate functions. The Hilbert transform is defined as
\[ Hf(x) = \mathrm{p.v.}\, \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f(t)}{x - t}\, dt. \]
The study of this transform is equivalent to that of the transform on the torus. In fact, given $f \in L^1[-\pi, \pi]$, if we define $f^\circ\colon \mathbb{R} \to \mathbb{C}$ as 0 for $|x| > 2\pi$ and equal to the periodic extension of $f$ when $|x| < 2\pi$, then for every $|x| < \pi$
\[ Hf(x) = \frac{1}{2\pi}\, \mathrm{p.v.} \int_{-\pi}^{\pi} \frac{f^\circ(x - t)}{\tan t/2}\, dt. \]
Therefore


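\[ Hf(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f^\circ(x - t) \Bigl( \frac{1}{\tan t/2} - \frac{2}{t} \Bigr)\, dt + \frac{1}{\pi}\, \mathrm{p.v.} \int_{-\infty}^{+\infty} \frac{f^\circ(x - t)}{t}\, dt - \frac{1}{\pi} \int_{|t| > \pi} \frac{f^\circ(x - t)}{t}\, dt. \]
If we designate by $H_{\mathbb{R}}$ and $H_{\mathbb{T}}$ the two transforms we have
\[ |H_{\mathbb{T}} f(x) - H_{\mathbb{R}} f^\circ(x)| \le C \|f\|_1. \]
And results for either transform can be transferred to the other.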

2.8 The conjecture of Luzin

When Luzin published his paper in 1913, he knew the result of Fatou: the Poisson integral
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 - r^2}{1 - 2r \cos(t - x) + r^2}\, f(t)\, dt \]
converges a. e. to $f(x)$ for every $f \in L^1[-\pi, \pi]$. He also knew the F. Riesz and E. Fischer result: given $\sum_{n=1}^{\infty} (a_n^2 + b_n^2) < +\infty$ there exists a function $f \in L^2[-\pi, \pi]$ with Fourier series $\sum_{n=1}^{\infty} a_n \cos nx + b_n \sin nx$. He deduced that with the same hypotheses, there exists also the conjugate function $g \in L^2[-\pi, \pi]$ with Fourier series $\sum_{n=1}^{\infty} -b_n \cos nx + a_n \sin nx$.

Then the coefficients $a_n$ and $b_n$ have integral representations in terms of $f$ and also in terms of $g$. The analytic function $\sum_{n=1}^{\infty} (a_n - i b_n)(re^{ix})^n$ has two representations;
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{(1 - r^2) f(t)}{1 - 2r \cos(t - x) + r^2}\, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2r g(t) \sin(t - x)}{1 - 2r \cos(t - x) + r^2}\, dt. \]
Fatou's result gives then that the second integral also converges a. e. to $f(x)$ when $r \to 1^-$. He proved then the following result:

Theorem 2.12 Let $f \in L^2[-\pi, \pi]$, then
\[ \lim_{r \to 1^-} \Bigl( \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2r g(t) \sin(t - x)}{1 - 2r \cos(t - x) + r^2}\, dt - \frac{1}{2\pi} \int_{\eta < |t| < \pi} \frac{g(x - t)}{\tan t/2}\, dt \Bigr) = 0 \]
a. e., where $\eta = \eta(r)$ is the unique solution of $\cos x = 2r/(1 + r^2)$ on the interval $(0, \pi/2)$.

Therefore he had proved that for $f \in L^2(\mathbb{R})$
\[ \frac{1}{2\pi}\, \mathrm{p.v.} \int_{-\pi}^{\pi} \frac{g(x - t)}{\tan t/2}\, dt = f(x), \qquad \text{a. e.} \]
Then he remarked that
\[ S_n(f, x) = \sum_{j=1}^{n} a_j \cos jx + b_j \sin jx = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x + t) \Bigl( \frac{1}{\tan t/2} - \frac{\cos(n + 1/2)t}{\sin t/2} \Bigr)\, dt. \]
Therefore $\lim S_n(f, x) = f(x)$ a. e., if and only if
\[ \lim_{n \to \infty} \mathrm{p.v.} \int_{-\pi}^{\pi} g(x + t)\, \frac{\cos nt}{t}\, dt = 0, \qquad \text{a. e.} \tag{2.7} \]
Then he knew that for every $g \in L^2(\mathbb{R})$ the principal value
\[ \mathrm{p.v.} \int \frac{g(x + t)}{t}\, dt \quad \text{exists a. e.} \tag{2.8} \]
He noticed that this is not due to the smallness of the integrand. In fact, he knew that there exist a continuous function $f$ and a set $A$ of positive measure so that
\[ \int \frac{|f(x + t) - f(x - t)|}{t}\, dt = +\infty \]
for $x \in A$.

But the proof he knew of (2.8) used, as we have seen, the theory of functions of a complex variable. In this proof it is not clear how the cancellation between the positive and negative values produces the existence of the principal value. He conjectured that in a more constructive proof this cancellation would be clear and would not be disturbed by the presence of the factor $\cos nt$ in (2.7).

From these considerations he conjectured that every $f \in L^2[-\pi, \pi]$ would have a Fourier series that converges a. e., that is, $S_n(f, x) \to f(x)$ a. e. on $[-\pi, \pi]$. Carleson proved this in 1966.

Later Hunt proved that this is true for every $f \in L^p[-\pi, \pi]$ with $1 < p < +\infty$. [...] $p > 1$, and finally by Besikovitch for $p = 1$. But these proofs did not satisfy Luzin, because they were very complicated. Nevertheless, it was the right procedure. The results and techniques developed by these authors are needed in the final proof of Carleson's result.

3. Hilbert Transform

3.1 Introduction

The properties of the Hilbert transform not only inspired Luzin's conjecture about Fourier series of functions in $L^2$; they are also needed in the proof of the Carleson-Hunt Theorem. We need the existence almost everywhere of the principal value integral
\[ Hf(x) = \mathrm{p.v.} \int \frac{f(t)}{x - t}\, dt = \lim_{\varepsilon \to 0^+} \int_{|x - t| > \varepsilon} \frac{f(t)}{x - t}\, dt, \]
for every $f \in L^1(\mathbb{R})$, and also the bound $\|H^* f\|_p \le C_p \|f\|_p$, where $H^* f$ denotes the maximal operator
\[ H^* f(x) = \sup_{\varepsilon > 0} \Bigl| \int_{|x - t| > \varepsilon} \frac{f(t)}{x - t}\, dt \Bigr|. \]
To obtain the fine result of Sjölin, that $S_n(f, x)$ converges a. e. when
\[ \int_{-\pi}^{\pi} |f(t)| \log^+ |f(t)| \log^+ \log^+ |f(t)|\, dt < +\infty, \]
it is necessary to estimate the constant $C_p$ in the previous inequality.

3.2 Truncated operators on L2(R)

We begin by studying the truncated operators
\[ H_\varepsilon f(x) = K_\varepsilon * f(x) = \int_{|x - t| > \varepsilon} \frac{f(t)}{x - t}\, dt. \]
Here $K_\varepsilon$ denotes the function equal to $1/t$ for $|t| > \varepsilon$ and 0 otherwise. As $K_\varepsilon \in L^p(\mathbb{R})$ for every $1 < p \le +\infty$, the convolution is defined for every $f \in L^p(\mathbb{R})$, $1 \le p < +\infty$, and $K_\varepsilon * f$ is a continuous and bounded function.

J.A. de Reyna: LNM 1785, pp. 31–44, 2002. © Springer-Verlag Berlin Heidelberg 2002


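We want to prove that the operator $H_\varepsilon\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is bounded by a constant that does not depend on $\varepsilon$. To achieve this we apply the interpolation of operators. First consider the case $p = 2$. The Fourier transform is an isometry of $L^2(\mathbb{R})$. Also, since $H_\varepsilon f$ is a convolution, we have $\widehat{H_\varepsilon f} = \widehat{K_\varepsilon}\cdot\widehat{f}$. Hence the norm of $H_\varepsilon$ is equal to the norm of the operator that sends $g \in L^2$ to $\widehat{K_\varepsilon}\,g$. In particular $H_\varepsilon$ is bounded if $\widehat{K_\varepsilon}$ is in $L^\infty$, and $\|H_\varepsilon\| = \|\widehat{K_\varepsilon}\|_\infty$. Hence the result for $p = 2$ is reduced to the calculation of $\widehat{K_\varepsilon}$. Since $K_\varepsilon$ is not integrable, we must calculate its transform as a limit of the functions
$$\widehat{K_{\varepsilon,R}}(x) = \int_{R>|t|>\varepsilon} \frac{e^{-2\pi i x t}}{t}\,dt = -2i\int_{\varepsilon}^{R} \frac{\sin 2\pi x t}{t}\,dt,$$
where the limit is taken in $L^2(\mathbb{R})$. As the pointwise limit of these functions when $R \to +\infty$ exists, it is equal to the limit in $L^2(\mathbb{R})$. Hence
$$\widehat{K_\varepsilon}(x) = -2i\int_{\varepsilon}^{+\infty} \frac{\sin 2\pi x t}{t}\,dt = -2i\int_{y}^{+\infty} \frac{\sin t}{t}\,dt,$$
for $y = 2\pi x\varepsilon$, and it is easy to see that these integrals are uniformly bounded by an absolute constant.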
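The uniform bound can be checked numerically. The following is only a sketch and not part of the original argument: it assumes the closed form $2\,|\int_y^{+\infty} \sin t/t\,dt|$ with $y = 2\pi|x|\varepsilon$, the classical value $\int_0^{+\infty} \sin t/t\,dt = \pi/2$, and a composite Simpson rule. The helper names `sine_integral` and `k_hat_modulus` are hypothetical.

```python
import math

def sine_integral(y):
    """Approximate Si(y), the integral of sin(t)/t over [0, y], by Simpson's rule."""
    if y == 0.0:
        return 0.0
    n = 2 * max(100, int(20 * y) + 1)   # even number of subintervals, refined as y grows
    h = y / n

    def sinc(t):
        return 1.0 if t == 0.0 else math.sin(t) / t

    total = sinc(0.0) + sinc(y)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * sinc(k * h)
    return total * h / 3.0

def k_hat_modulus(x, eps):
    """Modulus of the transform of K_eps at x: 2 * |integral of sin(t)/t over [y, +inf)|."""
    y = 2.0 * math.pi * abs(x) * eps
    return 2.0 * abs(math.pi / 2.0 - sine_integral(y))

# Largest modulus found over a grid of frequencies x and truncation levels eps.
worst = max(k_hat_modulus(x / 10.0, eps)
            for x in range(0, 101) for eps in (0.01, 1.0, 5.0))
print(worst <= math.pi + 1e-9)
```

On this grid the modulus never exceeds $\pi$, the value at $y = 0$, which suggests that $\pi$ is an admissible absolute constant for the case $p = 2$.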

3.3 Truncated operators on $L^1(\mathbb{R})$

It is not true that $H_\varepsilon$ maps $L^1(\mathbb{R})$ into $L^1(\mathbb{R})$. What can be proved is only a weak type inequality. If $f \in L^p(\mu)$, we have the relation $\mu\{x \in X : |f(x)| > t\} \le \|f\|_p^p\, t^{-p}$. A function that satisfies an inequality of the type
$$\mu\{x \in X : |f(x)| > t\} \le \frac{C^p}{t^p}$$
is not necessarily contained in $L^p(\mu)$. We say it is in weak $L^p$. The best constant $C = \|f\|_p^*$ that satisfies the above inequality is called the weak $L^p$ norm of $f$. Also an operator $T$, defined on every $f \in L^p(\mu)$, is said to be of type $(p,q)$ if $\|Tf\|_q \le C\|f\|_p$ and of weak type $(p,q)$ if $\nu\{y \in Y : |Tf(y)| > t\} \le C^q\|f\|_p^q/t^q$. It is the same to say that $\|Tf\|_q^* \le C\|f\|_p$. Then what we shall prove is that the operator $H_\varepsilon$ is of weak type $(1,1)$. We give a proof that can be extended to more general operators. It is based on the so-called decomposition of Calderón-Zygmund.
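A standard example separating weak $L^1$ from $L^1$ is $f(x) = 1/x$ on $(0,1]$: here $\mu\{|f| > t\} = \min(1, 1/t)$, so $t\,\mu\{|f| > t\} \le 1$, while the integral of $f$ diverges. The sketch below illustrates this on a sampled version of $f$. The sampling grid and the function names are assumptions made for the illustration.

```python
def weak_l1_quasinorm(values, cell):
    """sup over t of t * m{|f| > t}, for a step function with the given values on cells of length `cell`."""
    ordered = sorted((abs(v) for v in values), reverse=True)
    # As t increases to the k-th largest value, at least k cells lie in {|f| > t}.
    return max(v * (k * cell) for k, v in enumerate(ordered, start=1))

def l1_norm(values, cell):
    """Integral of |f| for the same step function."""
    return sum(abs(v) for v in values) * cell

n = 100000
samples = [n / k for k in range(1, n + 1)]   # f(x) = 1/x sampled at x = k/n
print(weak_l1_quasinorm(samples, 1.0 / n), l1_norm(samples, 1.0 / n))
```

The first quantity stays at $1$ however fine the grid, while the second grows like $\log n$.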


Theorem 3.1 (Decomposition of Calderón-Zygmund) Let $f \in L^1(\mathbb{R})$ and a positive real number $\alpha$ be given. There exists a decomposition $f = g + b$ (a good and a bad function) with the following properties. There exists an open set $\Omega = \bigcup_j Q_j$, where the $Q_j$ are nonoverlapping open intervals, such that $\int_{Q_j} b(t)\,dt = 0$ for every $j$. On $F = \mathbb{R}\setminus\Omega$ the function $g$ is bounded by $\alpha$. On every $Q_j$ the function $g$ is constant and equal to the mean value of $f$ on $Q_j$, and
$$\alpha \le \frac{1}{|Q_j|}\int_{Q_j} |f(t)|\,dt, \qquad \left|\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\right| \le 2\alpha.$$

Proof. We consider the line to be decomposed into disjoint intervals on which the mean value of $|f|$ is less than $\alpha$. This can be achieved taking large intervals. Now we subdivide each of these intervals into two equal intervals. For each one we calculate the mean value of $|f|$. If one of these mean values is greater than $\alpha$ we take the corresponding interval as one of the $Q_j$. We continue the process dividing those intervals on which the mean value is less than $\alpha$. Finally we set $\Omega$ equal to the union of those intervals on which the mean value is greater than $\alpha$. Then we set
$$b(x) = f(x)\,\chi_\Omega(x) - \sum_j \Bigl(\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\Bigr)\chi_{Q_j}(x),$$
$$g(x) = f(x)\,\chi_F(x) + \sum_j \Bigl(\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\Bigr)\chi_{Q_j}(x).$$
Now it is easy to prove that these functions satisfy all our conditions. The $Q_j$ are disjoint by construction. For almost every point $x \in F$, the complement of $\Omega$, there is a sequence of intervals $J_n$, with $x \in J_n$ and such that the mean value of $|f|$ on every one is less than $\alpha$. By the differentiation theorem $|f(x)|$ is also at most $\alpha$ if $x$ is a Lebesgue point of $f$. Every $Q_j$ is the half of an interval $J$ where the mean value of $|f|$ is less than $\alpha$. Therefore
$$\left|\frac{1}{|Q_j|}\int_{Q_j} f(t)\,dt\right| \le \frac{2}{|J|}\int_J |f(t)|\,dt \le 2\alpha.$$

With this decomposition we can prove:

Theorem 3.2 (Kolmogorov) For every $f \in L^1(\mathbb{R})$ and $\alpha > 0$
$$m\{x \in \mathbb{R} : |H_\varepsilon f(x)| > \alpha\} \le C\,\frac{\|f\|_1}{\alpha}. \tag{3.1}$$
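The stopping-time construction behind Theorem 3.1 can be illustrated on a discretized function. The sketch below is a simplified model and not the author's construction verbatim: it assumes $f$ is constant on $2^m$ equal cells of a single starting interval on which the mean of $|f|$ is below $\alpha$, and it selects a half as soon as its mean is at least $\alpha$. The function name `calderon_zygmund` is hypothetical.

```python
def calderon_zygmund(f, alpha):
    """Discrete Calderon-Zygmund decomposition of samples f on 2**m equal cells.

    Returns (cubes, g, bad): the selected index ranges [lo, hi), the good part
    and the bad part, with f[i] == g[i] + bad[i].
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("the number of cells must be a power of two")
    if sum(abs(v) for v in f) / n >= alpha:
        raise ValueError("the starting interval must have mean |f| below alpha")

    cubes, stack = [], [(0, n)]
    while stack:
        lo, hi = stack.pop()           # an interval whose mean of |f| is below alpha
        if hi - lo == 1:
            continue                   # a single cell that was never selected: part of F
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            mean_abs = sum(abs(v) for v in f[a:b]) / (b - a)
            if mean_abs >= alpha:
                cubes.append((a, b))   # stop: this half becomes one of the Q_j
            else:
                stack.append((a, b))   # keep subdividing

    g = list(f)
    for a, b in cubes:
        mean = sum(f[a:b]) / (b - a)
        g[a:b] = [mean] * (b - a)      # g equals the mean value of f on each Q_j
    bad = [fi - gi for fi, gi in zip(f, g)]
    return sorted(cubes), g, bad

samples = [0, 1, -1, 0, 0, 0, 9, -3, 0, 0, 0, 0, 2, 0, 0, 0]
cubes, g, bad = calderon_zygmund(samples, alpha=2.0)
print(cubes)
```

Each selected range is half of an interval whose mean was below $\alpha$, which is why the mean of $|f|$ on it cannot exceed $2\alpha$.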


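Proof. Let $f = g + b$ be the Calderón-Zygmund decomposition at the level $\alpha$, and also let $\Omega = \bigcup_j Q_j$ be the corresponding open set. We know that $g$ and $b$ are in $L^1$, since
$$\|g\|_1 = \int_F |f|\,dm + \sum_j \left|\frac{1}{|Q_j|}\int_{Q_j} f\,dm\right| m(Q_j) \le \|f\|_1.$$
Therefore
$$\{x \in \mathbb{R} : |H_\varepsilon f| > 2\alpha\} \subset \{x \in \mathbb{R} : |H_\varepsilon g| > \alpha\} \cup \{x \in \mathbb{R} : |H_\varepsilon b| > \alpha\}.$$
But $g$ is also in $L^2$, and since $|g| \le \alpha$ on $F$ and $|g| \le 2\alpha$ on $\Omega$,
$$\|g\|_2^2 \le \alpha\int_F |f|\,dm + 2\alpha\int_\Omega |f|\,dm \le 2\alpha\|f\|_1.$$
It follows that
$$m\{x \in \mathbb{R} : |H_\varepsilon g| > \alpha\} \le \frac{\|H_\varepsilon g\|_2^2}{\alpha^2} \le C\,\frac{2\alpha\|f\|_1}{\alpha^2} \le A\,\frac{\|f\|_1}{\alpha}.$$

Now we start with the bad function. First observe that if $G = \bigcup_j 2Q_j$, we have
$$m(G) \le 2\sum_j m(Q_j) \le \frac{2}{\alpha}\sum_j \int_{Q_j} |f|\,dm \le \frac{2}{\alpha}\|f\|_1.$$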

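Therefore we have
$$m\{x \in \mathbb{R} : |H_\varepsilon b(x)| > \alpha\} \le m(G) + m\{x \in \mathbb{R}\setminus G : |H_\varepsilon b(x)| > \alpha\}.$$
To obtain the corresponding inequality we calculate
$$\int_{\mathbb{R}\setminus G} |H_\varepsilon b(x)|\,dx \le \sum_j \int_{\mathbb{R}\setminus G} \left|\int_{Q_j} K_\varepsilon(x-t)\,b(t)\,dt\right| dx,$$
where we have used that $b$ is zero on $F$. Now applying that the integral of $b$ is zero on every $Q_j$, we get
$$\int_{\mathbb{R}\setminus G} |H_\varepsilon b(x)|\,dx \le \sum_j \int_{\mathbb{R}\setminus G} \left|\int_{Q_j} \bigl(K_\varepsilon(x-t) - K_\varepsilon(x-t_j)\bigr)\,b(t)\,dt\right| dx,$$
where $t_j$ denotes the center of $Q_j$. Thus, by Fubini’s theorem, the last integral is less than
$$\sum_j \int_{Q_j} \Bigl(\int_{|x-t_j|>2|t-t_j|} |K_\varepsilon(x-t) - K_\varepsilon(x-t_j)|\,dx\Bigr)\,|b(t)|\,dt.$$
Now we claim: the integral on $x$ is bounded by an absolute constant $B$. Therefore
$$\int_{\mathbb{R}\setminus G} |H_\varepsilon b(x)|\,dx \le B\sum_j \int_{Q_j} |b(t)|\,dt \le 2B\|f\|_1.$$
Consequently
$$m\{x \in \mathbb{R}\setminus G : |H_\varepsilon b(x)| > \alpha\} \le \frac{2B\|f\|_1}{\alpha}.$$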

We only have to collect the results obtained. We have to prove the claim. This will be done in the following proposition.

In the following it will be convenient to use Iverson-Knuth’s notation: If $P(x)$ is a condition that can be true or false for every $x$, then $[P(x)]$, by definition, is equal to $1$ if $P(x)$ is true and $0$ if $P(x)$ is false. In other words, $[P(x)]$ is the characteristic function of the set $\{x \in \mathbb{R} : P(x)\}$. The claim will be a consequence of the following fact:

Proposition 3.3 For every $a \in \mathbb{R}$ and $\varepsilon > 0$ there exists an even function $\psi\colon \mathbb{R} \to [0,+\infty)$ such that it is decreasing on $[0,+\infty)$,
$$|K_\varepsilon(t+a) - K_\varepsilon(t)|\,[|t| > 2|a|] \le \psi(x) \quad\text{for every } |t| \ge |x|,$$
and the integral is bounded by an absolute constant, $\int \psi(t)\,dt < C$.

Proof. Write $K(t) = 1/t$, so that $K_\varepsilon(t) = K(t)\,[|t| > \varepsilon]$. We have
$$|K_\varepsilon(t+a) - K_\varepsilon(t)|\cdot[|t| > 2|a|] \le |K(t+a) - K(t)|\cdot[|t+a| > \varepsilon]\cdot[|t| > 2|a|] + |K(t)|\cdot\bigl|[|t+a| > \varepsilon] - [|t| > \varepsilon]\bigr|\cdot[|t| > 2|a|].$$
That is, for $\varepsilon < 3|a|$,
$$\le \frac{|a|}{|t|(|t|-|a|)}\,[|t| > 2|a|] + \frac{1}{|t|}\,[2|a| \le |t| \le 4|a|].$$
And for $\varepsilon \ge 3|a|$,
$$\le \frac{|a|}{|t|(|t|-|a|)}\,[|t| > 2|a|] + \frac{1}{|t|}\,\Bigl[\tfrac{2}{3}\varepsilon \le |t| \le \tfrac{4}{3}\varepsilon\Bigr].$$
Now let
$$\psi_1(a,t) = \begin{cases} |a|/(|t|(|t|-|a|)) & \text{if } |t| > 2|a|, \\ 1/(2|a|) & \text{if } |t| \le 2|a|, \end{cases} \qquad \psi_2(\varepsilon,t) = \begin{cases} |t|^{-1} & \text{if } 2\varepsilon/3 \le |t| \le 4\varepsilon/3, \\ 3/(2\varepsilon) & \text{if } |t| \le 2\varepsilon/3, \\ 0 & \text{if } |t| > 4\varepsilon/3. \end{cases}$$
Then we can take $\psi(t) = \psi_1(a,t) + \psi_2(3|a|,t)$ in case $\varepsilon < 3|a|$ and $\psi(t) = \psi_1(a,t) + \psi_2(\varepsilon,t)$ when $\varepsilon \ge 3|a|$. It is clear that this function satisfies all our conditions.

This proposition will be needed in the proof of Cotlar’s Inequality (Theorem 3.7).
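That the integral of $\psi$ does not depend on $a$ or $\varepsilon$ can be seen by a direct computation: each of $\psi_1$ and $\psi_2$ integrates to $2 + 2\log 2$, so $\int\psi = 4 + 4\log 2$. The sketch below checks this numerically for $a \ne 0$. It is an illustration only; it uses a midpoint rule and the exact tail of $\psi_1$, and the function names are hypothetical.

```python
import math

def psi1(a, t):
    a, t = abs(a), abs(t)
    return a / (t * (t - a)) if t > 2 * a else 1.0 / (2 * a)

def psi2(eps, t):
    t = abs(t)
    if t <= 2 * eps / 3:
        return 3.0 / (2 * eps)
    return 1.0 / t if t <= 4 * eps / 3 else 0.0

def psi(a, eps, t):
    """The majorant of Proposition 3.3 (requires a != 0)."""
    scale = 3 * abs(a) if eps < 3 * abs(a) else eps
    return psi1(a, t) + psi2(scale, t)

def integral_of_psi(a, eps, cells=100000):
    """Midpoint rule on [0, T], doubled by evenness, plus the exact tail of psi1 beyond T."""
    T = 50.0 * max(abs(a), eps)
    h = T / cells
    body = sum(psi(a, eps, (k + 0.5) * h) for k in range(cells)) * h
    tail = math.log(T / (T - abs(a)))   # integral of |a|/(t(t-|a|)) over [T, +inf)
    return 2.0 * (body + tail)

print(integral_of_psi(1.0, 0.1), 4 + 4 * math.log(2))
```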

3.4 Interpolation

Assume that $T$ is a linear operator defined on some space of measurable functions that contains $L^{p_0}(\mu)$ and $L^{p_1}(\mu)$ for some $1 \le p_1 < p_0 \le +\infty$. Then $T$ is defined also on $L^p(\mu)$ for every $p \in [p_1, p_0]$. This is true because we can decompose every function $f \in L^p(\mu)$ as $f = f_0 + f_1$, where, with $A = \{t \in X : |f(t)| < 1\}$, we define
$$f_0 = f\chi_A \quad\text{and}\quad f_1 = f - f_0.$$
Then $|f_0| \le 1$ and $|f_0| \le |f|$, and so $f_0 \in L^\infty(\mu)$ and $f_0 \in L^p(\mu)$. This implies that $f_0 \in L^{p_0}(\mu)$. In an analogous way $|f_1| \le |f|$ and $|f_1| \le |f|^p$, therefore $f_1 \in L^p(\mu)$ and $f_1 \in L^1(\mu)$, and so it is in the intermediate $L^{p_1}(\mu)$. Then it is clear that $T(f) = T(f_0) + T(f_1)$ is well defined.

The interpolation theorem gives a quantitative version of this observation. We will prove the Marcinkiewicz interpolation theorem in chapter 11. We apply now this theorem. The reader can read this chapter now or even only the proof of theorem 11.10 and some previous definitions needed in it.

Proposition 3.4 For every $1 < p < +\infty$ the operator $H_\varepsilon\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is continuous and $\|H_\varepsilon\|_p \le Cp^2/(p-1)$.

Proof. We have proved that $H_\varepsilon$ is of weak type $(1,1)$ and strong type $(2,2)$. Therefore, applying Marcinkiewicz’s Theorem 11.10, we obtain a bound for $1 < p < 2$. The exponents $p > 2$ are conjugate to exponents $p' < 2$. And it is easy to see that $\|H_\varepsilon\|_p = \|H_\varepsilon\|_{p'}$. This follows from Fubini’s theorem: If $f \in L^p$ and $g \in L^{p'}$ we have
$$\int_{\mathbb{R}} \Bigl(\int_{|x-t|>\varepsilon} \frac{f(t)}{x-t}\,dt\Bigr) g(x)\,dx = \int_{\mathbb{R}} \Bigl(\int_{|x-t|>\varepsilon} \frac{g(x)}{x-t}\,dx\Bigr) f(t)\,dt.$$


This implies that if $|p - 2| > 1/2$, then $\|H_\varepsilon\|_p \le Cp^2/(p-1)$. (This is true for $1 < p < 3/2$ and the bound can be written in the symmetric form $Cpp'$.) We only have to prove that the norm $\|H_\varepsilon\|_p$ is uniformly bounded for $|p - 2| \le 1/2$, and this can be done by applying again the interpolation theorem, this time between $p_1 = 4/3$ and $p_0 = 4$.

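3.5 The Hilbert transform

Now we are in position to prove that, for every $1 < p < +\infty$, there is a bounded operator $H\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$. We shall prove that if $1 < p < +\infty$ and $f \in L^p(\mathbb{R})$, then there exists the limit $\lim_{\varepsilon\to 0^+} H_\varepsilon f$ (being taken in the space $L^p(\mathbb{R})$). In the case $p = 1$, $H_\varepsilon f$ is only in weak $L^1$, so that we have to modify slightly the reasoning. We shall need a dense subset where the limits exist.

Proposition 3.5 Let $\varphi$ be an infinitely differentiable function of compact support. For every $1 < p < +\infty$ the limit
$$H\varphi = \lim_{\varepsilon\to 0^+} H_\varepsilon\varphi$$
exists in $L^p(\mathbb{R})$. Moreover for every $x$ there exists the limit
$$H\varphi(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon\varphi(x).$$

Proof. Observe that for $0 < \delta < \varepsilon$
$$H_\delta\varphi(x) - H_\varepsilon\varphi(x) = \int_{\delta<|t|<\varepsilon} \frac{1}{t}\,\varphi(x-t)\,dt = \int_{\delta<|t|<\varepsilon} \frac{1}{t}\,\bigl(\varphi(x-t) - \varphi(x)\bigr)\,dt,$$
where the second equality holds because $1/t$ is odd. Therefore
$$\|H_\varepsilon\varphi - H_\delta\varphi\|_p \le \int_{\delta<|t|<\varepsilon} \frac{1}{|t|}\,\|\varphi(\cdot - t) - \varphi(\cdot)\|_p\,dt.$$
The hypothesis about $\varphi$ implies that $\|\varphi(\cdot - t) - \varphi(\cdot)\|_p \le C|t|$. It follows that $\|H_\varepsilon\varphi - H_\delta\varphi\|_p \le C(\varepsilon - \delta)$. We can think of $H_\varepsilon\varphi$ as a convolution of an $L^1$ with an $L^p$ function ($p > 1$); hence it is an $L^p$ function. We apply the completeness of $L^p(\mathbb{R})$ to get our first assertion. The conclusion about pointwise convergence can be derived in the same way, since $\|\varphi(\cdot - t) - \varphi(\cdot)\|_\infty \le C|t|$.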


Observe that under the same hypotheses about $\varphi$, we can prove in the case $p = 1$ the inequality $\|H_\varepsilon\varphi - H_\delta\varphi\|_1 \le C(\varepsilon - \delta)$, but in general $H_\varepsilon\varphi \notin L^1(\mathbb{R})$. Now we can prove that we can define for every $f \in L^p(\mathbb{R})$
$$Hf = \lim_{\varepsilon\to 0^+} H_\varepsilon f,$$
where the limits must be understood in the sense of the norm of $L^p(\mathbb{R})$, $p > 1$. In fact,
$$\|H_\varepsilon f - H_\delta f\|_p \le \|H_\varepsilon\varphi - H_\delta\varphi\|_p + \|H_\varepsilon(f-\varphi)\|_p + \|H_\delta(f-\varphi)\|_p.$$
Therefore we can prove for $p > 1$, given the density of the smooth functions, that $H_\varepsilon f$ satisfies the Cauchy condition of convergence.

The problem in the case $p = 1$ is that we only know the weak inequality given by Kolmogorov’s theorem above. Almost all the reasoning for $p > 1$ can be implemented for $p = 1$. We shall consider the space $L^{1,\infty}(\mathbb{R})$ of all the measurable functions $f\colon \mathbb{R} \to \mathbb{C}$ such that $\|f\|_{1,\infty} < +\infty$. This is defined as
$$\|f\|_{1,\infty} = \sup_{t>0}\, t\cdot m\{x \in \mathbb{R} : |f(x)| > t\}.$$
This is not a norm but satisfies
$$\|af\|_{1,\infty} = |a|\cdot\|f\|_{1,\infty}, \qquad \|f+g\|_{1,\infty} \le 2\|f\|_{1,\infty} + 2\|g\|_{1,\infty}.$$
Therefore $L^{1,\infty}(\mathbb{R})$ is a vector space. Also for $f \in L^1(\mathbb{R})$ we have $\|f\|_{1,\infty} \le \|f\|_1$. Sometimes we call $L^{1,\infty}(\mathbb{R})$ the weak $L^1$ space. This quasi-norm allows us to define a topology on $L^{1,\infty}(\mathbb{R})$ where a basis of neighborhoods of $f$ are the sets $f + B(0,\varepsilon)$, where $B(0,\varepsilon) = \{g \in L^{1,\infty}(\mathbb{R}) : \|g\|_{1,\infty} < \varepsilon\}$. Now for $f \in L^1(\mathbb{R})$ and a smooth $\varphi$
$$\|H_\varepsilon f - H_\delta f\|_{1,\infty} \le 3\|H_\varepsilon\varphi - H_\delta\varphi\|_{1,\infty} + 3\|H_\varepsilon(f-\varphi)\|_{1,\infty} + 3\|H_\delta(f-\varphi)\|_{1,\infty} \le C_\varphi|\varepsilon - \delta| + C\|f - \varphi\|_1.$$
We are able to prove the condition of Cauchy for $H_\varepsilon f$ in the space $L^{1,\infty}(\mathbb{R})$. All that is needed now is to prove that the space $L^{1,\infty}(\mathbb{R})$ is complete.

Proposition 3.6 Let $(f_n)$ be a Cauchy sequence in $L^{1,\infty}(\mathbb{R})$. Then there exists a subsequence $(f_{n_k})$ and a measurable function $f \in L^{1,\infty}(\mathbb{R})$, such that $f_{n_k}$ converges a. e. to $f$, and $\lim_n \|f_n - f\|_{1,\infty} = 0$.

Proof. The proof is the same as that of the Riesz-Fischer theorem. We select the subsequence $g_k = f_{n_k}$ in such a way that $\|g_{k+1} - g_k\|_{1,\infty} < 4^{-k}$. If $Z_k = \{|g_{k+1} - g_k| > 2^{-k}\}$, we have $m(Z_k) \le 2^k\cdot 4^{-k} = 2^{-k}$. Therefore $Z = \bigcap_N \bigcup_{k>N} Z_k$ is of measure zero and for every $x \notin Z$ there exists $N$ such that


for every $n > N$, $|g_{n+1}(x) - g_n(x)| \le 2^{-n}$. It follows that the sequence $(g_n)$ converges a. e. to some measurable function $f$. To prove that $\lim_n \|f_n - f\|_{1,\infty} = 0$ it is convenient to observe that for every finite sequence of functions $(h_j)$ in $L^{1,\infty}(\mathbb{R})$ we have
$$\Bigl\|\sum_{j=1}^{N} h_j\Bigr\|_{1,\infty} \le \sum_{j=1}^{N} 2^j\,\|h_j\|_{1,\infty}.$$
Therefore we have $\|g_{n+k} - g_n\|_{1,\infty} \le 4^{1-n}$. That is,
$$m(A_k(\alpha)) = m\{|g_{n+k} - g_n| > \alpha\} \le \frac{1}{4^{n-1}\alpha}.$$
Since
$$\{|f - g_n| > \alpha\} \subset Z \cup \bigcup_{N}\bigcap_{k=N}^{\infty} A_k(\alpha),$$
it follows easily that $m\{|f - g_n| > \alpha\} \le 4^{1-n}\alpha^{-1}$. Therefore $\|f - g_n\|_{1,\infty} \to 0$.

Now we can define the Hilbert transform $Hf$ for every $f \in L^p(\mathbb{R})$. For $1 < p < +\infty$, $Hf \in L^p(\mathbb{R})$. Also $H\colon L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is a bounded linear operator and
$$\|Hf\|_p \le C\,\frac{p^2}{p-1}\,\|f\|_p, \qquad 1 < p < +\infty.$$
In the case $p = 1$ the linear operator $H$ is defined but it takes values in $L^{1,\infty}$. In particular for every $f \in L^1(\mathbb{R})$, $\|Hf\|_{1,\infty} \le C\|f\|_1$. This follows from the corresponding Kolmogorov theorem about $H_\varepsilon$ and from the fact that for some sequence $(\varepsilon_n)$ we have that $H_{\varepsilon_n}f$ converges a. e. to $Hf$.

3.6 Maximal Hilbert transform

In the proof of Carleson’s Theorem we need the inequality $\|H^*f\|_p \le Bp\|f\|_p$, valid for every $p \ge 2$ and every $f \in L^p(\mathbb{R})$, and where
$$H^*f(x) = \sup_{\varepsilon>0} |H_\varepsilon f(x)|$$
is the maximal Hilbert transform. Historically this result is obtained to give a real proof of the pointwise convergence
$$Hf(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon f(x), \qquad\text{a. e.}$$
The proof which we shall give of this result is based on the following bound.


Theorem 3.7 (Cotlar’s Inequality) Let $f \in L^p(\mathbb{R})$, $1 < p < +\infty$. Then
$$H^*f(x) \le A\bigl(\mathcal{M}(Hf)(x) + \mathcal{M}f(x)\bigr).$$

Proof. Let $a \in \mathbb{R}$ and $\varepsilon > 0$. We want to prove
$$|H_\varepsilon f(a)| \le A\bigl(\mathcal{M}(Hf)(a) + \mathcal{M}f(a)\bigr).$$
We notice that
$$H_\varepsilon f(a) = \int_{|t-a|>\varepsilon} \frac{f(t)}{a-t}\,dt = \int_{\mathbb{R}} \frac{f_2(t)}{a-t}\,dt = Hf_2(a),$$
where $f = f_1 + f_2$ and $f_2$ is equal to $f$ on $|t - a| > \varepsilon$ and is equal to $0$ on the interval $2J = \{t : |a - t| \le \varepsilon\}$. This equality has a sense in spite of the fact that we have defined $H$ only as an operator from $L^p(\mathbb{R})$ to $L^p(\mathbb{R})$. In fact, since $f_2$ is null on the interval $2J$, at every point $x$ of the interval $J = \{t : |a - t| \le \varepsilon/2\}$, we have that $H_\eta f_2(x)$ does not depend on $\eta$ for $\eta < \varepsilon/2$. Therefore $Hf_2$ is equal to $H_\eta f_2$ on $J$ and is a continuous function on this interval. We can bound the oscillation of $Hf_2$ on $J$:
$$|Hf_2(x) - Hf_2(a)| \le \int |K_\eta(x-t) - K_\eta(a-t)|\cdot|f_2(t)|\,dt.$$
Now we recall that there exists an even function $\psi(x)$ that decreases with $|x|$, its integral is bounded by an absolute constant, and such that
$$|K_\eta(x-t) - K_\eta(a-t)|\,[|a-t| > 2|x-a|] \le \psi(a-t)$$
(cf. proposition 3.3). Therefore, it follows that
$$|Hf_2(x) - Hf_2(a)| \le \int_{\mathbb{R}} \psi(a-t)\,|f_2(t)|\,dt \le C\mathcal{M}f_2(a) \le C\mathcal{M}f(a),$$
by the general inequality satisfied by the Hardy-Littlewood maximal function. Collecting our results we have $|H_\varepsilon f(a)| \le C\mathcal{M}f(a) + |Hf_2(x)|$, for every $x \in J$. Therefore
$$|H_\varepsilon f(a)| \le C\mathcal{M}f(a) + |Hf(x)| + |Hf_1(x)|, \tag{3.2}$$
for almost every $x \in J$. What we have achieved is the liberty to choose $x \in J$. Now we use probabilistic reasoning to prove that there is some point where $|Hf(x)|$ and $|Hf_1(x)|$ are bounded. By the definition of the Hardy-Littlewood maximal function,

$$\frac{1}{|J|}\int_J |Hf(t)|\,dt \le \mathcal{M}(Hf)(a).$$
The probability that a function is greater than three times its mean is less than $1/3$, therefore
$$m\{t \in J : |Hf(t)| < 3\mathcal{M}(Hf)(a)\} \ge \frac{2}{3}|J|.$$
For the other term we use the weak inequality for $Hf_1$:
$$m\{t \in J : |Hf_1(t)| > \alpha\} \le C\,\frac{\|f_1\|_1}{\alpha} \le 2C\,\frac{\mathcal{M}f(a)}{\alpha}\,|J|.$$
Therefore, in the same way we obtain
$$m\{t \in J : |Hf_1(t)| \le 6C\mathcal{M}f(a)\} \ge \frac{2}{3}|J|.$$
Now there is some point $x \in J$ such that, simultaneously,
$$|Hf(x)| < 3\mathcal{M}(Hf)(a) \quad\text{and}\quad |Hf_1(x)| \le 6C\mathcal{M}f(a).$$
Finally, we arrive at
$$|H_\varepsilon f(a)| \le C\mathcal{M}f(a) + 3\mathcal{M}(Hf)(a) + 6C\mathcal{M}f(a) \le A\bigl(\mathcal{M}f(a) + \mathcal{M}(Hf)(a)\bigr).$$

Now we can prove the bound on the norms. Since $p > 1$ we have, applying the known results about the Hardy-Littlewood maximal function and the Hilbert transform,
$$\|H^*f\|_p \le A\|\mathcal{M}(Hf)\|_p + A\|\mathcal{M}f\|_p \le \frac{Cp}{p-1}\|Hf\|_p + \frac{Cp}{p-1}\|f\|_p \le C\,\frac{p^3}{(p-1)^2}\|f\|_p + \frac{Cp}{p-1}\|f\|_p.$$
For $p \ge 2$ it follows that $\|H^*f\|_p \le Cp\|f\|_p$.

The inequality that we obtain near $p = 1$ is not sharp. We shall need the sharp estimate to obtain the result of Sjölin. Also we shall give the corresponding result for $p = 1$ that is needed in the proof of the almost everywhere pointwise convergence of $H_\varepsilon f(x)$ to $Hf(x)$ for $f \in L^1(\mathbb{R})$. The problem at $p = 1$ is in Cotlar’s Inequality. $\mathcal{M}(Hf)$ may be infinity at every point if $Hf$ is not in $L^1(\mathbb{R})$. Hence we modify this inequality taking $\mathcal{M}(|Hf|^{1/2})$ instead of $\mathcal{M}(Hf)$.


Theorem 3.8 (Modified Cotlar’s Inequality) Let $f \in L^1(\mathbb{R})$, then
$$H^*f(x) \le A\Bigl(\bigl\{\mathcal{M}(|Hf|^{1/2})(x)\bigr\}^2 + \mathcal{M}f(x)\Bigr).$$

Proof. The proof is the same as that of the inequality of Cotlar, until we obtain inequality (3.2). By definition of the Hardy-Littlewood maximal function,
$$\frac{1}{|J|}\int_J |Hf|^{1/2}\,dm \le \mathcal{M}(|Hf|^{1/2})(a).$$
Therefore
$$m\bigl\{t \in J : |Hf(t)| < \{3\mathcal{M}(|Hf|^{1/2})(a)\}^2\bigr\} \ge \frac{2}{3}\,m(J).$$
The rest of the proof is the same as before. We obtain that there is some point $x \in J$ where simultaneously we have $|Hf(x)|^{1/2} < 3\mathcal{M}(|Hf|^{1/2})(a)$, and $|Hf_1(x)| \le 6C\mathcal{M}f(a)$.

Now we can prove that $H^*f$ is in weak $L^1$ when $f \in L^1(\mathbb{R})$. In fact if $T\colon L^1(\mathbb{R}) \to L^{1,\infty}(\mathbb{R})$, $\mathcal{M}(Tf)$ is not, in general, in $L^{1,\infty}(\mathbb{R})$, but as Kolmogorov observed (see next proposition), $\mathcal{M}(|Tf|^{1/2})$ is in $L^{2,\infty}(\mathbb{R})$, and $\|\mathcal{M}(|Tf|^{1/2})\|_{2,\infty} \le 2c_1\|T\|\,\|f\|_1$. Given the modified Cotlar’s inequality, we can prove that $H^*f$ is in $L^{1,\infty}(\mathbb{R})$ for every $f \in L^1(\mathbb{R})$. In fact, $\mathcal{M}(|Hf|^{1/2}) \in L^{2,\infty}(\mathbb{R})$ gives us
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Hf|^{1/2})\}^2 > \alpha\bigr\} \le C\,\frac{\|f\|_1}{\alpha}.$$

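Proposition 3.9 (Kolmogorov) Let $T$ be an operator such that for every $f \in L^1(\mathbb{R})$, we have $\|Tf\|_{1,\infty} \le \|T\|\,\|f\|_1$. Then for every $\alpha > 0$ and $f \in L^1(\mathbb{R})$
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > \alpha\bigr\} \le c\,\|T\|\,\frac{\|f\|_1}{\alpha}.$$

Proof. For every measurable function $g\colon \mathbb{R} \to \mathbb{C}$ we have $g = g\,[|g| \le \alpha] + g\,[|g| > \alpha]$. Therefore $\{\mathcal{M}g > 2\alpha\} \subset \{\mathcal{M}(g\,[|g| > \alpha]) > \alpha\}$ and by the lemma of Hardy and Littlewood
$$m\{\mathcal{M}g > 2\alpha\} \le \frac{c_1}{\alpha}\int_{|g|>\alpha} |g|\,dm.$$
We apply this inequality to our function, with $\sqrt{\alpha}$ in place of $\alpha$:
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > 4\alpha\bigr\} \le \frac{c_1}{\sqrt{\alpha}}\int_{|Tf|^{1/2}>\sqrt{\alpha}} |Tf|^{1/2}\,dm.$$
Now we follow the traditional path, writing the integral in terms of the distribution function:
$$\frac{c_1}{\sqrt{\alpha}}\int_{|Tf|^{1/2}>\sqrt{\alpha}} |Tf|^{1/2}\,dm = c_1\,m\{|Tf| > \alpha\} + \frac{c_1}{2\sqrt{\alpha}}\int_{\alpha}^{+\infty} t^{-1/2}\,m\{|Tf| > t\}\,dt,$$
and now by the hypothesis on $T$, we get
$$m\bigl\{t \in \mathbb{R} : \{\mathcal{M}(|Tf|^{1/2})\}^2 > 4\alpha\bigr\} \le \frac{c_1\|T\|\,\|f\|_1}{\alpha} + \frac{c_1\|T\|\,\|f\|_1}{2\sqrt{\alpha}}\int_{\alpha}^{+\infty} t^{-3/2}\,dt = \frac{2c_1\|T\|\,\|f\|_1}{\alpha}.$$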

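Theorem 3.10 For every $f \in L^1(\mathbb{R})$ and $\alpha > 0$
$$m\{x \in \mathbb{R} : H^*f(x) > \alpha\} \le C\,\frac{\|f\|_1}{\alpha},$$
where $C$ is an absolute constant. Therefore for every $f \in L^p(\mathbb{R})$
$$\|H^*f\|_p \le B\,\frac{p^2}{p-1}\,\|f\|_p, \qquad 1 < p < +\infty. \tag{3.3}$$

Proof. By the modified Cotlar’s inequality,
$$\{H^*f > 2\alpha\} \subset \bigl\{\{\mathcal{M}(|Hf|^{1/2})\}^2 > \alpha/A\bigr\} \cup \bigl\{\mathcal{M}f > \alpha/A\bigr\}.$$
Hence Proposition 3.9, applied to $H$, gives us the weak inequality. Now we know that there is a constant $C$ such that $\|Hf\|_4 \le C\|f\|_4$. We apply Marcinkiewicz’s Theorem to get
$$\|H^*f\|_p \le \frac{C}{p-1}\,\|f\|_p, \qquad 1 < p < 2.$$
Recall that we have proved
$$\|H^*f\|_p \le Bp\,\|f\|_p, \qquad 2 \le p < +\infty.$$
These two inequalities prove (3.3).

Finally we arrive at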


Theorem 3.11 (Pointwise convergence) For every $f \in L^p(\mathbb{R})$, where $1 \le p < +\infty$,
$$Hf(x) = \lim_{\varepsilon\to 0^+} H_\varepsilon f(x), \qquad\text{a. e.}$$

Proof. The proof is almost the same as that of the differentiation theorem. Put
$$\Omega f(x) = \limsup_{\varepsilon\to 0^+} H_\varepsilon f(x) - \liminf_{\varepsilon\to 0^+} H_\varepsilon f(x).$$
What we have to prove is that $\Omega f(x) = 0$ a. e. For every smooth function of compact support $\varphi$ it is easily shown that $\Omega\varphi(x) = 0$ for all $x \in \mathbb{R}$. We also know that $\Omega f(x) \le 2H^*f(x)$ for all $x \in \mathbb{R}$. These two facts combine in the following way:
$$m\{\Omega f > \alpha\} = m\{\Omega(f - \varphi) > \alpha\} \le m\{H^*(f - \varphi) > \alpha/2\}.$$
Therefore by the results we have proved about $H^*$, it follows that
$$m\{\Omega f > \alpha\} \le \Bigl(\frac{C\|f - \varphi\|_p}{\alpha}\Bigr)^{p}.$$
By the density of the smooth functions in $L^p(\mathbb{R})$, it follows that, for every $\alpha > 0$, $m\{\Omega f > \alpha\} = 0$. Therefore $\Omega f(x) = 0$ a. e., as we wanted to prove.

Part Two The Carleson–Hunt Theorem

J.A. de Reyna: LNM 1785, pp. 47–49, 2002. © Springer-Verlag Berlin Heidelberg 2002

In this part we study the proof of the Carleson–Hunt Theorem (Theorem 12.8, p. 172). To prove the convergence almost everywhere of a Fourier series we consider the maximal operator $S^*(f,x) = \sup_n |S_n(f,x)|$. Instead of this we prefer to consider the Carleson maximal operator. This is defined by replacing the Dirichlet kernel by $e^{int}/t$. In equation (2.3) we have seen that this is almost the same. The Carleson maximal operator is defined as
$$C^*f(x) = \sup_{n\in\mathbb{Z}} \left|\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt\right|.$$
This step follows the observation of Luzin.

We also want to prove that $C^*f(x)$ is in $L^p(-\pi/2, \pi/2)$ when $f \in L^p(-\pi,\pi)$. By the interpolation theorems we must bound the measure of the set where $C^*f(x) > y$. Therefore we are interested in bounding the Carleson integrals
$$\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt.$$
The main idea of the proof is to decompose the interval $I$ into subintervals, one of them, $I(x)$, containing $x$ ($x \in I(x)/2$). Then we have
$$\mathrm{p.v.}\int_I \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt = \mathrm{p.v.}\int_{I(x)} \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt + \sum_J \int_J \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt.$$

Clearly, the first term gives the main contribution. It is almost a new Carleson integral, the only difference being that, in general, the number of cycles will not be an integer. We shall see that we can replace it by a Carleson integral. The other terms can be conveniently bounded. To this end we write
$$\int_J \frac{e^{i\lambda(x-t)}}{x-t}\,f(t)\,dt = \int_J \frac{e^{i\lambda(x-t)}f(t) - M}{x-t}\,dt + \int_J \frac{M}{x-t}\,dt,$$
where $M$ is the mean value of $e^{i\lambda(x-t)}f(t)$ on $J$. Now we can benefit from the fact that the integral of $e^{i\lambda(x-t)}f(t) - M$ in $J$ is $0$, and put
$$\int_J \frac{e^{i\lambda(x-t)}f(t) - M}{x-t}\,dt = \int_J \bigl(e^{i\lambda(x-t)}f(t) - M\bigr)\Bigl(\frac{1}{x-t} - \frac{1}{x-t_J}\Bigr)\,dt,$$
where $t_J$ is the center of $J$. All these terms can be bounded by weak local norms
$$\|f\|_{(n,J)} = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\left|\frac{1}{|J|}\int_J f(t)\exp\Bigl(-2\pi i\Bigl(n + \frac{j}{3}\Bigr)\frac{t}{|J|}\Bigr)\,dt\right|.$$

Hence, given $\alpha = (n, J)$, $\|f\|_\alpha$ is a mean value of absolute values of (generalized) Fourier coefficients of $f$. Observe now that the first term is of the type with which we started, only that the new integral is simpler because it corresponds to fewer cycles, but a non-integer number of cycles. We can change to a new integral that has an integer number of cycles, and in such a way that the difference is again bounded by the local norm. We iterate this process until we obtain a Carleson integral with $0$ cycles, that is, we arrive at a Hilbert transform that is easily estimated. The remainders that appear in these steps are bounded except on an exceptional set. The larger we allow the bound, the smaller this exceptional set can be made. To bound the number of steps we start with $n \le 2^N$. In this way we arrive naturally at the $\log\log n$ result. That is because for every interval that appears in the process we must add a fraction to the exceptional set. In the process we consider every dyadic interval of length $\le 2\pi/2^N$. This gives a contribution of length proportional to $N$. We must compensate with $\log N$ in the bound allowed.

Chapter four contains the basic step. This is a precise formulation of how we choose the partition of the interval of a given Carleson integral, and the bounds we obtain. It also contains the theorem of change of frequency that gives a good bound of the difference between two Carleson integrals in the same interval but with different frequency. This theorem plays an important role in the second part of the proof. Chapter five gives the bounds of the terms that appear in the basic step. And finally chapter six defines the exceptional set and combines the previous bounds to obtain the result $S_n(f,x) = o(\log\log n)$ a. e. for every $f \in L^2[-\pi,\pi]$. Then we start the proof of the Carleson-Hunt theorem. This is a refined version of the previous theorem.
An analysis of the previous proof shows the reason for the log log n term: we have used in the proof all the pairs α = (n, I). But if we think about the process of selection of the central interval I(x) we notice that, in it, the local norm must be relatively great. Therefore we need to select a set S of allowed pairs and assure that we only use them in the inductive steps. If we start with a Carleson integral (maybe one for an allowed pair), and carry out the procedure of chapter four, we reach a Carleson integral that in general does not correspond to an allowed pair. Therefore we must change to another one, but with a controllable error.


On the way to the definition of the allowed pairs Carleson gives an analysis of the function $f$ that is very clever. This analysis can be regarded as that of writing the score from a piece of music. Given a level $b_j y^{p/2}$ (the intensity of the least intense note), we start from the four equal intervals into which the interval $I$ is divided. In every one we obtain the Fourier series of $f$, and retain only those terms that are greater than the chosen level. This forms the polynomial $P_j$. Then every one of these intervals is divided into two parts. In every one of these parts we obtain the Fourier series of $f - P_j$, and proceed in the same way. This procedure gives one a set of pairs $Q$ that we call notes of $f$. This will be the starting point for the definition of the allowed pairs.

In chapters 4, 5 and 6 we will apply the principal idea of the proof in a straightforward way. We will obtain that for every $f \in L^2$ the partial sums $S_n(f,x) = o(\log\log n)$. This reasoning is relatively easy to follow. The kernel of the proof is in chapters 7 to 10. In these chapters the previous proof is refined to remove the $\log\log n$ term.

4. The Basic Step

4.1 Introduction

The proof of the Carleson Theorem is based on a new method of estimating partial sums of Fourier series. We replace the Dirichlet integral by the singular integral
$$C_{(n,I)}f(x) = \mathrm{p.v.}\int_{-\pi}^{\pi} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt.$$
The new method consists of applying repeatedly a basic step that we are going to analyze in this chapter. Given a partition $\Pi$ of the interval $I = [-\pi,\pi]$, we decompose the integral as
$$C_{(n,I)}f(x) = \mathrm{p.v.}\int_{I(x)} \frac{e^{in(x-t)}}{x-t}\,f(t)\,dt + \sum_{J\in\Pi,\ J\not\subset I(x)} \int_J \frac{E_\Pi f(t)}{x-t}\,dt + \sum_{J\in\Pi,\ J\not\subset I(x)} \int_J \frac{e^{in(x-t)}f(t) - E_\Pi f(t)}{x-t}\,dt, \tag{4.1}$$
where $I(x)$ is an interval containing $x$, that is a union of some members of $\Pi$, and $E_\Pi f$ is the conditional expectation of $e^{in(x-t)}f(t)$ with respect to $\Pi$. The principal point of the basic step is the careful choice of $I(x)$ and $\Pi$ so that we have good control of the last two sums. The first term is an integral of the same type that we can treat in a similar way. First we define local norms so that we can give adequate bounds for the last two terms. Later we will show the precise bounds and we will give the form of the basic step in theorem 4.13.

4.2 Carleson maximal operator

Let $I \subset \mathbb{R}$ be a bounded interval and $n \in \mathbb{Z}$. For every $f \in L^1(I)$, we consider the singular integral
$$\mathrm{p.v.}\int_I \frac{e^{2\pi i n(x-t)/|I|}}{x-t}\,f(t)\,dt. \tag{4.2}$$

J.A. de Reyna: LNM 1785, pp. 51–72, 2002. c Springer-Verlag Berlin Heidelberg 2002


If $x$ is contained in the interval $I/2$ with the same center as $I$ and half length we call (4.2) a Carleson integral. Here we have an arbitrary selection: $x \in I/2$. Every condition $x \in \theta I$, with $\theta \in (0,1)$, would be adequate. If we allow $x$ to be near the extremes of $I$, the simplest Carleson integral would be unbounded. For example if we want to bound
$$\sup\left|\mathrm{p.v.}\int_{-a}^{b} \frac{dt}{t}\right| = \sup_{a,b>0}\left|\log\frac{b}{a}\right|$$
we need $M^{-1} < b/a < M$. This is the condition $0 \in \theta(-a,b)$ with $\theta = (M-1)/(M+1) \in (0,1)$. Although this selection is arbitrary, many details of the proof shall depend on the selection made here.

The set of pairs $\mathcal{P} = \{(n,I) : n \in \mathbb{Z},\ I \text{ a bounded interval of } \mathbb{R}\}$ will assume a very central role in the following. We use Greek letters to denote the elements of $\mathcal{P}$. Given a pair $\alpha = (n,I)$ we denote $I$ by $I(\alpha)$ and $n$ by $n(\alpha)$. Also we call $|I| = |I(\alpha)|$ the length of $\alpha$, and write it as $|\alpha|$. Then for every $\alpha \in \mathcal{P}$ and $x \in I(\alpha)/2$ we put
$$C_\alpha f(x) = \mathrm{p.v.}\int_{I(\alpha)} \frac{e^{2\pi i n(\alpha)(x-t)/|\alpha|}}{x-t}\,f(t)\,dt.$$
Given $f \in L^1(I)$ and $\alpha \in \mathcal{P}$ with $I(\alpha) = I$, $C_\alpha f(x)$ is defined for almost every $x \in I(\alpha)/2$. We shall study the maximal Carleson operator
$$C_I^*f(x) = \sup_{I(\alpha)=I,\ n(\alpha)\in\mathbb{Z}} |C_\alpha f(x)|.$$

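This is a measurable function with values in $[0,+\infty]$ (since it is a supremum of a countable number of measurable functions). We shall prove that it is a bounded operator from $L^p(I)$ to $L^p(I/2)$ for every $1 < p < +\infty$.

Lemma. There exists an absolute constant $A > 0$ such that if $x \in I(\gamma)/2$, and for every $\lambda \in \mathbb{R}$, $|C_\gamma(e^{i\lambda t})(x)| \le A$.

Proof. Write $I = I(\gamma)$. By definition, if $\lambda(\gamma) = 2\pi n(\gamma)/|\gamma|$,
$$C_\gamma(e^{i\lambda t})(x) = \mathrm{p.v.}\int_I \frac{e^{i\lambda(\gamma)(x-t)}}{x-t}\,e^{i\lambda t}\,dt.$$
Hence, if $\omega = \lambda(\gamma) - \lambda$,
$$|C_\gamma(e^{i\lambda t})(x)| = \left|\mathrm{p.v.}\int_I \frac{e^{i\omega(x-t)}}{x-t}\,dt\right|.$$
We change variables, letting $u = x - t$, and see that
$$|C_\gamma(e^{i\lambda t})(x)| = \left|\mathrm{p.v.}\int_{-a}^{b} \frac{e^{i\omega u}}{u}\,du\right|.$$
Since $x \in I(\gamma)/2$, we have $0 \in J/2$ if $J = [-a,b]$. This condition is equivalent to $1/3 < b/a < 3$. Therefore
$$|C_\gamma(e^{i\lambda t})(x)| = \left|\mathrm{p.v.}\int_{-a}^{a} \frac{e^{i\omega u}}{u}\,du + \int_a^b \frac{e^{i\omega u}}{u}\,du\right| \le \left|\int_{-a}^{a} \frac{\sin\omega u}{u}\,du\right| + \log 3.$$
Furthermore $\int_{-x}^{x} (\sin t/t)\,dt$ is bounded, thus proving our lemma.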

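4.3 Local norms

The basic procedure is the reduction of one Carleson integral to another, simpler, Carleson integral. In order to bound the difference between them we shall need norms associated to pairs $\alpha \in \mathcal{P}$. First we associate a function $e_\alpha$ to every pair $\alpha \in \mathcal{P}$. $e_\alpha$ is a function in $L^2(\mathbb{R})$ supported by the interval $I(\alpha)$:
$$e_\alpha(x) = \exp\Bigl(2\pi i\,\frac{n(\alpha)}{|\alpha|}\,x\Bigr)\chi_{I(\alpha)}(x) = e^{i\lambda(\alpha)x}\,\chi_{I(\alpha)}(x).$$
The function $e_\alpha$ is a localized wave train. It is localized in the time interval $I(\alpha)$, and has angular frequency $\lambda(\alpha) = 2\pi n(\alpha)/|I(\alpha)|$. The number $|\alpha| = |I(\alpha)|$ is the duration and $n(\alpha)$ the total number of cycles in the wave train. The functions $e_\alpha$ with $I(\alpha) = I$ fixed form an orthogonal system (each has squared norm $|\alpha|$) and every function $f$ supported by $I$ can be developed in a series of these functions, convergent in $L^2(\mathbb{R})$:
$$f = \sum_{I(\alpha)=I} \frac{\langle f, e_\alpha\rangle}{|\alpha|}\,e_\alpha.$$
Observe that
$$\frac{\langle f, e_\alpha\rangle}{|\alpha|} = \frac{1}{|\alpha|}\int_{I(\alpha)} f(t)\exp\Bigl(-2\pi i\,\frac{n(\alpha)}{|\alpha|}\,t\Bigr)\,dt.$$
We define the local norm $\|f\|_\alpha$ as
$$\|f\|_\alpha = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\left|\frac{1}{|I|}\int_I f(t)\exp\Bigl(-2\pi i\Bigl(n(\alpha) + \frac{j}{3}\Bigr)\frac{t}{|I|}\Bigr)\,dt\right|.$$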


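Here $c$ is chosen so that $\sum_{j\in\mathbb{Z}} c/(1+j^2) = 1$. Hence $\|f\|_\alpha$ is a mean value of absolute values of (generalized) Fourier coefficients of $f$. One motivation for this definition is that with this norm we can control the integrals $\int_I f(t)\varphi(t)\,dt$ when the function $\varphi$ is twice continuously differentiable on $I$. We shall show this in theorem 4.3 below.

Proposition 4.2 Let $\varphi \in C^2[a,b]$, $\delta = b - a$. For every $x \in [a,b]$ we have
$$\varphi(x) = \sum_{n\in\mathbb{Z}} c_n \exp\Bigl(2\pi i\,\frac{n}{3\delta}\,x\Bigr),$$
where the coefficient $c_n$ satisfies $(1+n^2)|c_n| \le A(\|\varphi\|_\infty + \delta^2\|\varphi''\|_\infty)$, for every $n \in \mathbb{Z}$.

Proof. First assume that $\delta = 1$. We can extend $\varphi$ to a twice continuously differentiable function $\tilde\varphi$, of period $3$, defined on $\mathbb{R}$. We can assume that $\|\tilde\varphi\|_\infty$, $\|\tilde\varphi'\|_\infty$ and $\|\tilde\varphi''\|_\infty$ are bounded by $\|\varphi\|_\infty + \|\varphi''\|_\infty$. The Fourier series for $\tilde\varphi$ is
$$\tilde\varphi(x) = \sum_n c_n \exp\Bigl(2\pi i\,\frac{n}{3}\,x\Bigr),$$
where
$$c_n = \frac{1}{3}\int_0^3 \tilde\varphi(t)\exp\Bigl(-2\pi i\,\frac{n}{3}\,t\Bigr)\,dt = \int_0^1 \tilde\varphi(3t)\,e^{-2\pi i n t}\,dt.$$
Now, integration by parts leads, for $n \ne 0$, to
$$c_n = \frac{9}{(2\pi i n)^2}\int_0^1 \tilde\varphi''(3t)\,e^{-2\pi i n t}\,dt,$$
so that for some absolute constant $A$ and every $n \in \mathbb{Z}$
$$(1+n^2)|c_n| \le A(\|\varphi\|_\infty + \|\varphi''\|_\infty).$$
In the general case a change of scale leads to the inequality $(1+n^2)|c_n| \le A(\|\varphi\|_\infty + \delta^2\|\varphi''\|_\infty)$.

Theorem 4.3 Let $f \in L^1(I)$, $\varphi \in C^2(I)$ and $\alpha = (n,I) \in \mathcal{P}$. For some absolute constant $B$ we have
$$\left|\frac{1}{|\alpha|}\int_I e^{2\pi i n(x-t)/|\alpha|} f(t)\varphi(t)\,dt\right| \le B\bigl(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty\bigr)\,\|f\|_\alpha.$$

Proof. We apply proposition 4.2 to $\varphi$ and obtain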

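$$\varphi(x) = \sum_{j\in\mathbb{Z}} c_j \exp\Bigl(2\pi i\,\frac{j}{3}\,\frac{x}{|\alpha|}\Bigr),$$
where $(1+j^2)|c_j| \le A(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty)$. Hence we have
$$\left|\frac{1}{|\alpha|}\int_I e^{2\pi i n(x-t)/|\alpha|} f(t)\varphi(t)\,dt\right| \le \sum_{j\in\mathbb{Z}} |c_j|\left|\frac{1}{|\alpha|}\int_I f(t)\exp\Bigl(-2\pi i\Bigl(n - \frac{j}{3}\Bigr)\frac{t}{|\alpha|}\Bigr)\,dt\right| \le B\bigl(\|\varphi\|_\infty + |\alpha|^2\|\varphi''\|_\infty\bigr)\,\|f\|_\alpha.$$

We shall need a bound of $\|f\|_\alpha$ when $f$ is an exponential, and will give it here:

Proposition 4.4 There exists an absolute constant $C$ such that for every $\omega \in \mathbb{R}$ and $\alpha \in \mathcal{P}$ we have
$$\|e^{2\pi i\omega x}\|_\alpha \le 1, \quad\text{and}\quad \|e^{2\pi i\omega x}\|_\alpha \le \frac{C}{\bigl|\omega|\alpha| - n(\alpha)\bigr|}.$$

Proof. Since $|e^{2\pi i\omega x}| = 1$ and $\|f\|_\alpha$ is a mean value of absolute values of the integrals
$$\frac{1}{|\alpha|}\int_{I(\alpha)} f(t)\exp\Bigl(-2\pi i\Bigl(n(\alpha) + \frac{j}{3}\Bigr)\frac{t}{|\alpha|}\Bigr)\,dt, \tag{4.3}$$
the first inequality is obvious. For the second inequality we calculate the modulus of (4.3), and obtain
$$\frac{|e^{2\pi i(A - j/3)} - 1|}{2\pi|A - j/3|},$$
where $A = \omega|\alpha| - n(\alpha)$. Hence we have
$$\|e^{2\pi i\omega x}\|_\alpha \le \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\,\frac{|e^{2\pi i(A-j/3)} - 1|}{2\pi|A - j/3|} \le \frac{1}{2\pi}\sum_{|j|/3\le|A|/2} \frac{c}{1+j^2}\,\frac{4}{|A|} + \sum_{|j|/3>|A|/2} \frac{c}{1+j^2} \le \frac{2}{\pi|A|} + \frac{M}{|A|}.$$
And, therefore
$$\|e^{2\pi i\omega x}\|_\alpha \le \frac{(2/\pi) + M}{|A|}.$$

For technical reasons it will be convenient to replace $A = \omega|\alpha| - n(\alpha)$ by $\lfloor\omega|\alpha|\rfloor - n(\alpha)$. Since $\|e^{2\pi i\omega x}\|_\alpha \le 1$ always, this presents no problem.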
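The two bounds of Proposition 4.4 can be explored numerically. The sketch below is an illustration under stated assumptions: it uses the classical identity $\sum_{j\in\mathbb{Z}} 1/(1+j^2) = \pi\coth\pi$ to fix $c = \tanh(\pi)/\pi$, truncates the sum over $j$, and evaluates the modulus of (4.3) in closed form. The function names are hypothetical.

```python
import cmath
import math

C_NORM = math.tanh(math.pi) / math.pi   # makes the sum over all j of c/(1+j^2) equal to 1

def coefficient_modulus(x):
    """Modulus of the mean of exp(2*pi*i*x*t/L) over an interval of length L."""
    if x == 0.0:
        return 1.0
    return abs(cmath.exp(2j * math.pi * x) - 1.0) / (2.0 * math.pi * abs(x))

def local_norm_of_exponential(A, terms=3000):
    """Truncated local norm of exp(2*pi*i*omega*x), where A = omega*|alpha| - n(alpha)."""
    return sum(C_NORM / (1 + j * j) * coefficient_modulus(A - j / 3.0)
               for j in range(-terms, terms + 1))

for A in (0.0, 0.5, 3.0, 20.0):
    print(A, local_norm_of_exponential(A))
```

The printed values stay below $1$ and decay roughly like $1/|A|$, in line with the proposition.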


Proposition 4.5 There exists an absolute constant $B > 0$ such that, for every interval $K$ and $\omega \in \mathbb{R}$, there exists $k \in \mathbb{N}$ such that for $\kappa = (k, K)$, we have $\|e^{2\pi i\omega x}\|_\kappa \ge B$. We can choose $k = \lfloor |\omega| \cdot |K| \rfloor$.

Proof. As in the proof of Proposition 4.4 we obtain
\[ \|e^{2\pi i\omega x}\|_\kappa = \sum_{j\in\mathbb{Z}} \frac{c}{1+j^2}\, \frac{|e^{2\pi i(A-j/3)} - 1|}{2\pi|A - j/3|} \ge \frac{c}{1+j_0^2}\, \frac{|e^{2\pi i(A-j_0/3)} - 1|}{2\pi|A - j_0/3|}. \]
One of every three consecutive $j$ satisfies $|e^{2\pi i(A-j/3)} - 1| > 1$. Hence we can choose $j_0$ such that $|A - j_0/3| < 1$ and $|e^{2\pi i(A-j_0/3)} - 1| > 1$. For such $j_0$ we have
\[ \|e^{2\pi i\omega x}\|_\kappa \ge \frac{C}{1 + 9(1+|A|)^2}. \]
This is greater than $B$ if we choose $|A| < 1$. Since $A = |\omega| \cdot |K| - n(\kappa)$, this goal is achieved taking $k = n(\kappa) = \lfloor |\omega| \cdot |K| \rfloor$.

Note that for $\alpha = (n, I)$ and $f \in L^p(I)$ we have
\[ \|f\|_\alpha \le \|f\|_{L^p(I)}. \tag{4.4} \]
Here and elsewhere if $f \in L^p(I)$ we put
\[ \|f\|_{L^p(I)} = \Bigl( \frac{1}{|I|} \int_I |f|^p\, dm \Bigr)^{1/p}, \quad\text{and}\quad \|f\|_p = \Bigl( \int_I |f|^p\, dm \Bigr)^{1/p}. \]
Hence $\|f\|_{L^p(I)}$ always denotes the $L^p(I)$ norm with respect to normalized Lebesgue measure.
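The series used in the proofs of Propositions 4.4 and 4.5 can be evaluated numerically. The following is only a sketch: it assumes that the weights $c/(1+j^2)$ are normalised to add up to 1 (here over a truncated range of $j$), and the function name and the truncation parameter `terms` are ours, not the book's.

```python
import cmath
import math


def exponential_local_norm(A, terms=2000):
    """Truncated series sum_j c/(1+j^2) |e^{2 pi i (A - j/3)} - 1| / (2 pi |A - j/3|).

    A stands for omega*|alpha| - n(alpha). The constant c is chosen so that
    the weights c/(1+j^2), |j| <= terms, add up to 1 (an assumption of this sketch).
    """
    weights = [(j, 1.0 / (1 + j * j)) for j in range(-terms, terms + 1)]
    c = 1.0 / sum(w for _, w in weights)
    total = 0.0
    for j, w in weights:
        theta = 2 * math.pi * (A - j / 3)
        # |e^{i theta} - 1| / |theta|, with the limiting value 1 at theta = 0
        factor = 1.0 if theta == 0 else abs(cmath.exp(1j * theta) - 1) / abs(theta)
        total += c * w * factor
    return total


for A in (0.0, 0.3, 5.0, 50.0):
    print(A, exponential_local_norm(A))
```

Under these assumptions the printed values stay below 1, decay roughly like $1/|A|$ for large $|A|$, and stay away from 0 when $|A| < 1$, which is the qualitative content of the two propositions.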

4.4 Dyadic partition

Given an interval $I = [a, b]$, the central point $c$ of $I$ divides this interval into two intervals of half length that we denote by $I_0 = [a, c]$ and $I_1 = [c, b]$.

[fig. 4.1: the interval $[a, b]$ with the marked points $a$, $d$, $c$, $e$, $b$, in this order]

Analogously given $I_1 = [c, b]$ we obtain the intervals $(I_1)_0 = I_{10} = [c, e]$ and $I_{11} = [e, b]$, where $e$ is the central point of the interval $I_1$.
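The halving rule can be iterated along a word of zeros and ones. A minimal sketch, assuming exact rational endpoints; the function name `dyadic_interval` is ours and is used only for illustration.

```python
from fractions import Fraction


def dyadic_interval(a, b, word):
    """Return the endpoints of I_u for I = [a, b] and a word u of 0s and 1s."""
    a, b = Fraction(a), Fraction(b)
    for digit in word:
        c = (a + b) / 2          # central point of the current interval
        if digit == "0":
            b = c                # keep the left half
        elif digit == "1":
            a = c                # keep the right half
        else:
            raise ValueError("the word must contain only 0 and 1")
    return a, b


print(dyadic_interval(0, 1, "10"))   # the interval [c, e] of fig. 4.1 when I = [0, 1]
print(dyadic_interval(0, 1, "11"))   # the interval [e, b]
```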


The same process defines the intervals $I_u$ for every word $u \in \{0, 1\}^*$. We call these intervals dyadic intervals generated from $I$. Every dyadic interval $I_u$ has two sons, $I_{u0}$ and $I_{u1}$. Every dyadic interval has a brother. For example, the brother of $I_{00101}$ is $I_{00100}$. But, in general, every dyadic interval has two contiguous intervals. We understand by contiguous an interval of the same length and with a unique point in common. For example it is easy to see that the contiguous intervals of $I_{00101}$ are its brother $I_{00100}$ and $I_{00110}$. We also speak of the grandsons of $I$. They are the four intervals $I_{00}$, $I_{01}$, $I_{10}$, and $I_{11}$.

We are dealing with the operator $C_I^*\colon L^p(I) \to L^p(I/2)$. $I$ always denotes this interval and we speak of dyadic intervals as those intervals that can be written as $I_u$ with $u \in \{0, 1\}^*$. Also it must be noticed that $I$ is an arbitrary interval, so we can apply every concept defined on $I$ to every other interval. For example, we can speak of dyadic intervals with respect to $J$.

Smoothing intervals. In general the union of two contiguous dyadic intervals is not a dyadic interval. We shall call every such union a smoothing interval. They play a prominent role in the proof. All dyadic intervals are smoothing intervals, since each can be written as the union of its two sons. But there are smoothing intervals that are not dyadic. For example the interval $[d, e] = I_{01} \cup I_{10}$ is a smoothing interval, but it is not dyadic. Given an interval $I$ we shall denote by $I/2$ its middle half, that is $I/2 = I_{01} \cup I_{10}$. Given the interval $I$, we denote by $\mathcal{P}_I$ the set of pairs $(n, J)$ where $n \in \mathbb{Z}$ and $J$ is a smoothing interval with respect to $I$.

Dyadic points. The extremes of all dyadic intervals (with respect to $I$) are called dyadic points and the set of all dyadic points will be denoted by $D$. $D$ is a countable set and so it is of measure 0.

Proposition 4.6 Given $I$ and $x \in I/2$ that is not a dyadic point, for every $n = 0, 1, 2, \dots$ there is only one smoothing interval $I_n$ of length $|I|/2^n$ such that $x \in I_n/2$. We also have $I = I_0 \supset I_1 \supset I_2 \supset \cdots$.

Proof. It is clear that for $n = 0$ there is only one smoothing interval, $I$. Now assume that there is only one smoothing interval $J = I_n = [a, b]$ with $x \in I_n/2$. Then $x \in (d, e)$ (refer to fig. 4.1). Since $x \notin D$, there is one and only one interval $J_{010}$, $J_{011}$, $J_{100}$, or $J_{101}$ that contains $x$. In every case we check that there is only one smoothing interval with the required conditions; $I_{n+1}$ will be respectively $[a, c]$, $[d, e]$, $[d, e]$, or $[c, b]$. We observe that in every case $I_{n+1} \subset I_n$.

Choosing $I(x)$. We shall consider partitions $\Pi$ of some smoothing interval $J$ where every member of $\Pi$ is a dyadic interval generated from $I$, of


length $\le |J|/4$. We always consider closed intervals, and when we speak of partitions we do not take into consideration the extremes of these intervals. We also speak of disjoint intervals to mean those whose interiors are disjoint.

We assume a Carleson integral $C_\alpha f(x)$ and a dyadic partition $\Pi$ of $I = I(\alpha)$, where every $J \in \Pi$ has length $|J| \le |I|/4$, to be given. $I(x)$ will be an interval, a union of some members of $\Pi$, such that $x \in I(x)/2$, so that the first term in decomposition (4.1) will be almost a Carleson integral. We must also choose $I(x)$ in order to obtain a good bound for the other members of decomposition (4.1). For example
\[ \int_J \frac{E_\Pi f(t)}{x - t}\, dt. \]
Here $x \notin J \in \Pi$. Therefore we shall choose $I(x)$ so that $|x - t|$ is of the same order as $|J|$, for every $t \in J$. The next proposition guarantees that all these conditions can be attained.

Proposition 4.7 Let $x \in I/2$ and let $\Pi$ be a dyadic partition of the smoothing interval $I$, with intervals of length $\le |I|/4$. Then there exists a smoothing interval $I(x)$ such that:
(a) $x \in I(x)/2$.
(b) $|I(x)| \le |I|/2$.
(c) $I(x)$ is the union of some intervals of $\Pi$.
(d) At least one of the two sons of $I(x)$ is a member of $\Pi$.
(e) For every $J \in \Pi$ such that $J \not\subset I(x)$ we have $d(x, J) \ge |J|/2$.
(f) Each smoothing interval $J$ with $I(x) \subset J \subset I$ and $x \in J/2$ is a union of intervals of the partition $\Pi$.

Proof. We consider the smoothing intervals $J \cup J'$, union of an interval $J \in \Pi$ and a contiguous interval $J'$, such that $x \in (J \cup J')/2$. For example if $x \in J_0 \in \Pi$, we can choose $J_0'$ contiguous to $J_0$ and such that $J_0 \cup J_0'$ satisfies these conditions. Now let $I(x)$ be such an interval $J \cup J'$ of maximum length. Then (a) and (d) are satisfied by construction. (b) follows from the hypothesis that every $J \in \Pi$ has length $\le |I|/4$.

(c) Let $I(x) = J \cup J'$ with $J \in \Pi$. $J$ and $J'$ are dyadic intervals.

[fig. 4.2: the contiguous intervals $J \subset H'$ and $J' \subset H$]

Two dyadic intervals have intersection of measure zero or one is contained in the other. Hence if $J'$ is not the union of intervals of $\Pi$, there must be an element $H \in \Pi$ such that $J' \subset H$. Since $H$ and $J$ are members of $\Pi$ they are essentially disjoint. Let $H'$ be the dyadic interval of the same length as $H$ that contains $J$. As $J$ and $J'$ are contiguous, $H$ and $H'$ are also contiguous. Then $K = H \cup H'$ satisfies $x \in K/2$, and $|K| > |I(x)|$. This contradicts the definition of $I(x)$.

(e) Let $J \in \Pi$ be such that $J \cap I(x)$ is of measure zero. If $d(x, J) < |J|/2$, there exists $J'$ contiguous to $J$ and containing $x$. Hence $x \in (J \cup J')/2$. By definition this implies $I(x) \supset J \cup J'$, which contradicts the hypothesis that $J \not\subset I(x)$.

(f) Let $J$ be such a smoothing interval and let $K$ be a son of $J$. Two dyadic intervals are disjoint or one is contained in the other. Since $K$ is dyadic it follows that $K$ is the union of the intervals $L \in \Pi$ such that $L \subset K$, or there is $L \in \Pi$ with $L \supsetneq K$. The case $L \supsetneq K$ is impossible, because if $L \supsetneq K$, since $x \in J/2$ we have
\[ d(x, L) \le d(x, K) \le |K|/2 \le \tfrac{1}{4}|L|. \]
Therefore there is $L'$ contiguous to $L$ and such that $x \in (L \cup L')/2$. Also $|L \cup L'| > 2|K| \ge |I(x)|$, which contradicts the definition of $I(x)$.

The condition (e) says that every interval $J \in \Pi$ such that $J \not\subset I(x)$ is contained in a security interval $J'$ of length $2|J|$ such that $x$ is not contained in $J'$. This will play a role in the bound (4.15).
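The induction in the proof of Proposition 4.6 is effectively an algorithm: at each step the eighth of $I_n$ that contains $x$ decides which of $[a, c]$, $[d, e]$, $[c, b]$ becomes $I_{n+1}$. The sketch below follows that case analysis with exact rational arithmetic; the function name and the error handling are our own choices.

```python
from fractions import Fraction


def smoothing_chain(a, b, x, depth):
    """Return [I_0, ..., I_depth] with x in I_n/2, starting from I_0 = [a, b]."""
    a, b, x = Fraction(a), Fraction(b), Fraction(x)
    chain = [(a, b)]
    for _ in range(depth):
        length = b - a
        if not (a + length / 4 < x < b - length / 4):
            raise ValueError("x must lie in the interior of the middle half")
        eighth = length / 8
        if (x - a) % eighth == 0:
            raise ValueError("x is a dyadic point")
        k = (x - a) // eighth            # 2, 3, 4 or 5: the eighth containing x
        if k == 2:                       # x in J_010, next interval [a, c]
            b = a + length / 2
        elif k == 5:                     # x in J_101, next interval [c, b]
            a = a + length / 2
        else:                            # x in J_011 or J_100, next interval [d, e]
            a, b = a + length / 4, b - length / 4
        chain.append((a, b))
    return chain


for interval in smoothing_chain(0, 1, Fraction(2, 5), 3):
    print(interval)
```

For $I = [0, 1]$ and $x = 2/5$ the chain is $[0,1] \supset [1/4, 3/4] \supset [1/4, 1/2] \supset [5/16, 7/16]$, and $x$ lies in the middle half of each interval.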

4.5 Some definitions

The remainder terms in (4.1) will be bounded using two functions. The first is a modification of the maximal Hilbert transform.

Definition 4.8 Given $f \in L^1(I)$, we define $H_I^* f(x)$, the maximal Hilbert dyadic transform on $I$, by
\[ H_I^* f(x) = \sup_K \Bigl| \mathrm{p.v.} \int_K \frac{f(t)}{x - t}\, dt \Bigr|, \]
where the supremum is over the intervals $K = J \cup J'$, which are the union of two contiguous dyadic intervals such that $x \in K/2$.

The second function will be needed to bound the third term in (4.1).

Definition 4.9 Given a finite partition $\Pi$ of $I$ by intervals $J_k$ of length $\delta_k$ and center $t_k$ we define the function
\[ \Delta(\Pi, x) = \sum_k \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}. \]

In the future reasoning a pair $\alpha = (n, I)$ (with some other elements that now are of no consequence) will determine a dyadic partition $\Pi_\alpha$ of $I(\alpha)$. Hence we shall denote by $\Delta_\alpha(x)$ the corresponding function $\Delta(\Pi_\alpha, x)$. We shall also use $H_\alpha^* f(x)$ to denote the maximal Hilbert dyadic transform $H_{I(\alpha)}^* E_\alpha f(x)$, where the function $E_\alpha f(x)$ denotes
\[ E\bigl(e^{i\lambda(\alpha)(x-t)} f(t), \Pi_\alpha\bigr), \]
the expectation of the function $e^{i\lambda(\alpha)(x-t)} f(t)$ with respect to the partition $\Pi_\alpha$. Hence if $z \in J \in \Pi$, $E_\alpha f(z) = |J|^{-1} \int_J e^{i\lambda(\alpha)(x-t)} f(t)\, dt$. (In order not to be pedantic, the notations $\Delta_\alpha$ and $H_\alpha^* f$ do not mention the partition $\Pi_\alpha$.) In the sequel we analyze every term of the decomposition (4.1). This will allow us to decide how to choose our partition $\Pi_\alpha$.
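Definition 4.9 translates directly into code. The following sketch evaluates $\Delta(\Pi, x)$ for a partition given as a list of `(left, right)` endpoint pairs; this representation and the function name are assumptions made for the example.

```python
def delta(partition, x):
    """Evaluate Delta(Pi, x) = sum_k delta_k^2 / ((x - t_k)^2 + delta_k^2)."""
    total = 0.0
    for left, right in partition:
        length = right - left            # delta_k
        center = (left + right) / 2      # t_k
        total += length ** 2 / ((x - center) ** 2 + length ** 2)
    return total


# Uniform partition of [0, 1] into four intervals of length 1/4.
uniform = [(k / 4, (k + 1) / 4) for k in range(4)]
print(delta(uniform, 0.5))
print(delta([(0.0, 1.0)], 1.5))
```

Every summand lies between 0 and 1, so $\Delta(\Pi, x)$ never exceeds the number of intervals of $\Pi$; the size of $\Delta$ is governed by how many short intervals lie close to $x$ relative to their length.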

4.6 Basic decomposition

The basic step in the proof of Carleson's Theorem is a decomposition of a Carleson integral $C_\alpha f(x)$ into three parts associated with a dyadic partition $\Pi$ of $I = I(\alpha)$. We assume that every $J \in \Pi$ has a measure $\le |I|/4$; then Proposition 4.7 gives us an interval $I(x)$ such that $x \in I(x)/2$. The decomposition is given by
\[ C_\alpha f(x) = \mathrm{p.v.} \int_I \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt = \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt + \int_{I \setminus I(x)} \frac{E_\alpha f(t)}{x - t}\, dt + \int_{I \setminus I(x)} \frac{e^{i\lambda(\alpha)(x-t)} f(t) - E_\alpha f(t)}{x - t}\, dt. \tag{4.5} \]
We shall transform the first term into a Carleson integral that can be considered simpler than $C_\alpha f(x)$. The second and third terms can be bounded in terms of the functions $H_\alpha^* f(x)$ and $\Delta_\alpha(x) = \Delta(\Pi_\alpha, x)$.

In the following sections we are going to obtain the relevant bounds. These are collected together in Theorem 4.11, which is a preliminary version of the basic step. Then we choose a partition in order to optimize the bound of Theorem 4.11. With the selected partition $\Pi$ we formulate a new form of the basic step in Theorem 4.13.

In what follows we will try to bound Carleson integrals $C_\alpha f(x)$. It must be understood that we put $C_\alpha f(x) = +\infty$ if the principal value is not defined. With this convention it can be noticed that our bounds are also correct in this case.

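4.7 The first term

In fact, the first integral in (4.5) is almost a Carleson integral. We need $\beta = (m, I(x)) \in \mathcal{P}$ such that the difference between $C_\beta f(x)$ and the first integral is small. Now these are
\[ \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt, \qquad \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\beta)(x-t)}}{x - t} f(t)\, dt. \]
They will be equal if $\lambda(\alpha) = \lambda(\beta)$, that is,
\[ 2\pi \frac{n}{|I|} = 2\pi \frac{m}{|I(x)|}. \tag{4.6} \]
In general it is not possible to choose $m \in \mathbb{Z}$ such that (4.6) holds. Hence we choose
\[ m = \Bigl\lfloor n\, \frac{|I(x)|}{|I|} \Bigr\rfloor. \tag{4.7} \]
Now that we have chosen a convenient $\beta = (m, I(x)) \in \mathcal{P}$ we must bound the difference
\[ \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt - C_\beta f(x). \]
Here, for the first time, we want to change the frequency of a Carleson integral. These changes are governed by

Theorem 4.10 (Change of frequency) Let $\alpha$, $\beta$ be two pairs with $I(\alpha) = I(\beta) = J$ and $x \in J/2$. If $|n(\alpha) - n(\beta)| \le M$ with $M > 1$, then
\[ |C_\beta f(x) - C_\alpha f(x)| \le B M^3 \|f\|_\alpha, \]
where $B$ is some absolute constant.

Remark. It is not necessary that $n(\beta) \in \mathbb{Z}$.

Proof. We apply Theorem 4.3. First observe that
\[ |C_\beta f(x) - C_\alpha f(x)| = \Bigl| \int_J \frac{e^{i(\lambda(\beta) - \lambda(\alpha))(x-t)} - 1}{x - t}\, e^{i\lambda(\alpha)(x-t)} f(t)\, dt \Bigr|. \tag{4.8} \]
Let $\lambda(\beta) - \lambda(\alpha) = L$; then
\[ |C_\beta f(x) - C_\alpha f(x)| = |L|\,|\alpha|\, \Bigl| \frac{1}{|J|} \int_J \frac{e^{iL(x-t)} - 1}{|L|(x - t)}\, e^{i\lambda(\alpha)(x-t)} f(t)\, dt \Bigr|. \]
The function $(e^{it} - 1)/t$ defined on $\mathbb{R}$ is in $C^\infty(\mathbb{R})$ and so it is easy to see that if $\varphi(t) = (e^{iL(x-t)} - 1)/L(x - t)$ then
\[ \|\varphi\|_\infty + |\alpha|^2 \|\varphi''\|_\infty \le C(1 + L^2|\alpha|^2). \]
Since $|L|\,|\alpha| = 2\pi|n(\beta) - n(\alpha)|$, Theorem 4.3 gives us
\[ |C_\beta f(x) - C_\alpha f(x)| \le C|L|\,|\alpha|(1 + |L|^2|\alpha|^2)\|f\|_\alpha \le B M^3 \|f\|_\alpha. \]
If we apply this to our case it follows that
\[ \Bigl| \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt - C_\beta f(x) \Bigr| \le C\|f\|_\beta. \tag{4.9} \]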

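4.8 Notation α/β

We have related the first term in (4.5) to a Carleson integral $C_\beta f(x)$. The process to obtain $\beta$ from $\alpha$ and $I(x)$ will be used very often. Hence given $\alpha \in \mathcal{P}$ and an interval $J \subset I(\alpha)$ we define $\alpha/J \in \mathcal{P}$ by
\[ \alpha/J = (m, J), \quad\text{where}\quad m = \Bigl\lfloor n(\alpha)\, \frac{|J|}{|\alpha|} \Bigr\rfloor. \]
We choose $\alpha/J$ so that $e_{\alpha/J}$ represents more or less the same musical note as $e_\alpha$, but of duration $J$, as far as this is possible. Observe that by definition
\[ 0 \le \bigl(\lambda(\alpha) - \lambda(\alpha/J)\bigr)|J| < 2\pi. \tag{4.10} \]
Also, given $\alpha$ and $\beta \in \mathcal{P}$ such that $I(\beta) \subset I(\alpha)$, we define $\alpha/\beta = \alpha/I(\beta)$. For future reference we notice the following relation:
\[ \frac{\lambda(\alpha)\,|\beta|}{2\pi} = n(\alpha/\beta) + h, \quad\text{where } 0 \le h < 1, \]
valid whenever $\alpha$ and $\beta \in \mathcal{P}$ are such that $I(\beta) \subset I(\alpha)$. This follows from the identity
\[ \frac{\lambda(\alpha)\,|\beta|}{2\pi} = n(\alpha)\, \frac{|\beta|}{|\alpha|}. \tag{4.11} \]
If we have $I(\alpha) \supset J \supset K$ and all are smoothing intervals, we have
\[ \alpha/K = (\alpha/J)/K. \tag{4.12} \]
In fact, we have
\[ n(\alpha/K) = \Bigl\lfloor n\, \frac{|K|}{|\alpha|} \Bigr\rfloor, \qquad n\bigl((\alpha/J)/K\bigr) = \Bigl\lfloor n(\alpha/J)\, \frac{|K|}{|J|} \Bigr\rfloor = \Bigl\lfloor \Bigl\lfloor n\, \frac{|J|}{|\alpha|} \Bigr\rfloor \frac{|K|}{|J|} \Bigr\rfloor. \]
All the lengths are of type $|I|/2^s$. Hence all we have to prove is
\[ \Bigl\lfloor \frac{\lfloor n/2^k \rfloor}{2^l} \Bigr\rfloor = \Bigl\lfloor \frac{n}{2^{k+l}} \Bigr\rfloor. \]
This is easy if we think in binary. With our new notations, Proposition 4.5 says that there exists an absolute constant $B > 0$ such that $\|e^{i\lambda(\delta)t}\|_{\delta/K} \ge B$, for every $K \subset I(\delta)$.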
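The arithmetic behind $\alpha/J$ can be tried on small numbers. In the sketch below a pair is reduced to its integer $n$ and the length of its interval; the helper name `restrict` is ours, and the lengths are taken of the form $|I|/2^s$ as in the text.

```python
from fractions import Fraction
from math import floor


def restrict(n, len_alpha, len_J):
    """Integer m of alpha/J = (m, J), for alpha = (n, I) with |I| = len_alpha."""
    return floor(Fraction(n) * Fraction(len_J) / Fraction(len_alpha))


n, len_I = 13, Fraction(1)
len_J, len_K = Fraction(1, 2), Fraction(1, 8)

direct = restrict(n, len_I, len_K)                              # n(alpha/K)
two_steps = restrict(restrict(n, len_I, len_J), len_J, len_K)   # n((alpha/J)/K)
print(direct, two_steps)

# The number h in lambda(alpha)|J|/(2 pi) = n(alpha/J) + h.
h = Fraction(n) * len_J / len_I - restrict(n, len_I, len_J)
print(h)
```

Here both routes to $K$ give the same integer, as (4.12) asserts, and $h = 1/2$ lies in $[0, 1)$, which is what (4.10) expresses after multiplying by $2\pi$.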

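4.9 The second term

This term is really easy to bound. We have
\[ \int_{I \setminus I(x)} \frac{E_\alpha f(t)}{x - t}\, dt = \mathrm{p.v.} \int_I \frac{E_\alpha f(t)}{x - t}\, dt - \mathrm{p.v.} \int_{I(x)} \frac{E_\alpha f(t)}{x - t}\, dt. \]
Now these two intervals $I$ and $I(x)$ are of the form that is used in Definition 4.8 of the maximal Hilbert dyadic transform. Hence we have
\[ \Bigl| \int_{I \setminus I(x)} \frac{E_\alpha f(t)}{x - t}\, dt \Bigr| \le 2 H_\alpha^* f(x). \tag{4.13} \]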

4.10 The third term

We use here the fact that the numerator has a vanishing integral on every $J \in \Pi$. In this way we can change the first order singularity into a second order singularity, which is easier to handle. Observe that by Proposition 4.7, $I \setminus I(x)$ is the union of some members of the partition $\Pi$. Hence we can write
\[ \int_{I \setminus I(x)} \frac{e^{i\lambda(\alpha)(x-t)} f(t) - E_\alpha f(t)}{x - t}\, dt = \sum_{J_k} \int_{J_k} \frac{e^{i\lambda(\alpha)(x-t)} f(t) - E_\alpha f(t)}{x - t}\, dt, \]
where we are summing over those $J_k \in \Pi$ 'disjoint' from $I(x)$. Let $t_k$ be the center of $J_k$. We have
\[ \frac{t - t_k}{(x - t)(x - t_k)} = \frac{1}{x - t} - \frac{1}{x - t_k}. \]
Now, the integral of $e^{i\lambda(\alpha)(x-t)} f(t) - E_\alpha f(t)$ on every $J_k$ is zero, so that
\[ \int_{I \setminus I(x)} \frac{e^{i\lambda(\alpha)(x-t)} f(t) - E_\alpha f(t)}{x - t}\, dt = \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, e^{i\lambda(\alpha)(x-t)} f(t)\, dt - \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, E_\alpha f(t)\, dt. \tag{4.14} \]
We want to reduce the third term to the function $\Delta_\alpha(x)$. First consider the second part of (4.14). We recall that from Proposition 4.7 we know that $d(x, J_k) \ge \delta_k/2$, where $\delta_k = |J_k|$, for every $J_k$ that appears in (4.14). Hence $\delta_k \le |x - t_k|$ and, if $t \in J_k$, it follows that
\[ |(x - t)(x - t_k)| \ge |x - t_k|^2 - \frac{\delta_k}{2}|x - t_k| \ge \frac{1}{2}|x - t_k|^2 \ge \frac{1}{4}|x - t_k|^2 + \frac{1}{4}\delta_k^2. \tag{4.15} \]
By definition $E_\alpha f$ is constant on every $J_k$. Hence we have
\[ \Bigl| \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, E_\alpha f(t)\, dt \Bigr| \le 2 \sum_{J_k} |E_\alpha f(t_k)| \int_{J_k} \frac{\delta_k\, dt}{(x - t_k)^2 + \delta_k^2} = 2 \sum_{J_k} \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}\, |E_\alpha f(t_k)|. \]
Now, as every term in Definition 4.9 of $\Delta(\Pi, x)$ is positive, we conclude that
\[ \Bigl| \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, E_\alpha f(t)\, dt \Bigr| \le 2 \sum_{J_k \in \Pi} \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}\, |E_\alpha f(t_k)|. \tag{4.16} \]
We are now summing over all $J_k \in \Pi_\alpha$.

Now we bound the first term in (4.14). Here we use Theorem 4.3 again. The procedure is similar to the one we have used to bound the first term. Let $\beta_k = \alpha/J_k$. We have
\[ \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, e^{i\lambda(\alpha)(x-t)} f(t)\, dt = \sum_{J_k} \frac{1}{|J_k|} \int_{J_k} \Bigl[ \frac{|J_k|(t - t_k)}{(x - t)(x - t_k)}\, e^{i(\lambda(\alpha) - \lambda(\beta_k))(x-t)} \Bigr] e^{i\lambda(\beta_k)(x-t)} f(t)\, dt. \]

Now we apply Theorem 4.3 to every integral. Applying (4.15),
\[ \Bigl\| \frac{|J_k|(t - t_k)}{(x - t)(x - t_k)}\, e^{i(\lambda(\alpha) - \lambda(\beta_k))(x-t)} \Bigr\|_\infty \le 2\, \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}. \]
The second derivative of
\[ \frac{|J|(t - t_k)\, e^{iM(x-t)}}{(x - t)(x - t_k)} \]
is
\[ -\frac{2|J|\, iM\, e^{iM(x-t)}}{(x - t)(x - t_k)} + \frac{2|J|\, e^{iM(x-t)}}{(x - t)^2(x - t_k)} - \frac{2|J|\, iM(t - t_k)\, e^{iM(x-t)}}{(x - t)^2(x - t_k)} - \frac{|J| M^2 (t - t_k)\, e^{iM(x-t)}}{(x - t)(x - t_k)} + \frac{2|J|(t - t_k)\, e^{iM(x-t)}}{(x - t)^3(x - t_k)}. \]
By (4.10), $|J|M \le 2\pi$, and by (4.15) $|J_k| = \delta_k$, $|x - t| > \delta_k/2$ and $|t - t_k| < \delta_k/2$. We obtain that $\delta_k^2 \|\varphi''\|_\infty$ is bounded by
\[ \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}\, \bigl[8\pi^2 + 32\pi + 32\bigr]. \]
All this gives us
\[ \Bigl| \sum_{J_k} \int_{J_k} \frac{t - t_k}{(x - t)(x - t_k)}\, e^{i\lambda(\alpha)(x-t)} f(t)\, dt \Bigr| \le C \sum_{J_k} \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}\, \|f\|_{\beta_k}, \tag{4.17} \]
where $C$ is an absolute constant. To compare (4.16) and (4.17) we observe that
\[ |E_\alpha f(t_k)| \le C\|f\|_{\beta_k}. \tag{4.18} \]
In fact, we have that
\[ |E_\alpha f(t_k)| = \Bigl| \frac{1}{|J_k|} \int_{J_k} e^{i\lambda(\alpha)(x-t)} f(t)\, dt \Bigr| = \Bigl| \frac{1}{|J_k|} \int_{J_k} e^{i(\lambda(\alpha) - \lambda(\beta_k))(x-t)}\, e^{i\lambda(\beta_k)(x-t)} f(t)\, dt \Bigr|. \]
We are now in position to apply Theorem 4.3. Here we use again (4.10) and finally obtain $|E_\alpha f(t_k)| \le C\|f\|_{\beta_k}$, for some absolute constant $C$. Therefore the third term in (4.5) is bounded by
\[ D \sum_{J_k} \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}\, \|f\|_{\beta_k}. \]
Now, as every summand in the definition of $\Delta_\alpha(x)$ is positive, we can write
\[ |\text{third term in (4.5)}| \le D \sup_{J_k} \|f\|_{\beta_k}\, \Delta_\alpha(x). \tag{4.19} \]
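Inequality (4.15) carries most of the weight in this section, since it turns the product $(x-t)(x-t_k)$ into the denominator of $\Delta$. The sketch below checks its outer members, $|(x-t)(x-t_k)| \ge \frac14\bigl((x-t_k)^2 + \delta_k^2\bigr)$, on an exact rational grid for the normalised case $t_k = 0$, $\delta_k = 1$; the grid and the function name are our choices, and a grid check is an illustration, not a proof.

```python
from fractions import Fraction


def check_4_15(x, t, t_k, delta):
    """True when |(x - t)(x - t_k)| >= ((x - t_k)^2 + delta^2) / 4."""
    lhs = abs((x - t) * (x - t_k))
    rhs = ((x - t_k) ** 2 + delta ** 2) / 4
    return lhs >= rhs


delta, t_k = Fraction(1), Fraction(0)
# t runs over J_k = [-1/2, 1/2]; x satisfies |x - t_k| >= delta.
ts = [Fraction(i, 20) - Fraction(1, 2) for i in range(21)]
xs = [sign * Fraction(d, 10) for d in range(10, 101) for sign in (1, -1)]

all_ok = all(check_4_15(x, t, t_k, delta) for x in xs for t in ts)
print(all_ok)
```

The condition $|x - t_k| \ge \delta_k$ cannot be dropped: for $x = 3/5$ and $t = 1/2$ the left side is $3/50$ while the right side is $17/50$.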

4.11 First form of the basic step

Theorem 4.11 Let $C_\alpha f(x)$ be a Carleson integral and $\Pi = \Pi_\alpha$ a dyadic partition of $I = I(\alpha)$. Assume that every $J \in \Pi$ has measure $\le |I|/4$. Let $I(x)$ be the interval defined in Proposition 4.7, and $\beta = \alpha/I(x)$. Then
\[ |C_\alpha f(x) - C_\beta f(x)| \le C\|f\|_\beta + 2 H_\alpha^* f(x) + D \sup_{J_k} \|f\|_{\alpha/J_k}\, \Delta_\alpha(x), \tag{4.20} \]
where $C$ and $D$ are absolute constants.

Observe that by (4.9) we have
\[ |C_\alpha f(x) - C_\beta f(x)| \le C\|f\|_\beta + \Bigl| \sum_k \int_{J_k} \frac{e^{i\lambda(\alpha)(x-t)}}{x - t} f(t)\, dt \Bigr|. \]
We have introduced here the conditional expectation $E_\alpha f(x)$ and then we have replaced $(x - t)^{-1}$ by $(t - t_k)(x - t)^{-1}(x - t_k)^{-1}$. This is very convenient. For example, if we apply Theorem 4.3 directly we only obtain the bound $\le C \sum_k \|f\|_{\alpha/J_k}$.

4.12 Some comments about the proof

1. In what sense can we say that $C_\beta f(x)$ is a 'simpler' Carleson integral? First, we can say that we pass from $C_\alpha f(x)$ to $C_\beta f(x)$ where if $n(\alpha) > 0$ then $n(\beta) \le n(\alpha)$. Thus we can expect to obtain $n(\beta) = 0$. We can restrict the study to real functions, because, with $f$ real, we have
\[ C_{-\alpha} f(x) = \overline{C_\alpha f(x)}, \tag{4.21} \]
if $\alpha = (n, I)$ and $-\alpha = (-n, I)$ with $n \in \mathbb{N}$. Hence we can assume that $n(\alpha)$ is positive. Another case in which we can consider only values $n(\alpha) > 0$ is when we study functions $f$ with $|f| = \chi_A$ (for some measurable set $A$). In this case $\bar f$ is of the same nature and
\[ C_{-\alpha} f(x) = \overline{C_\alpha \bar f(x)}. \]
This is not all we have to say about question 1. But it is what we can say now.

2. We want to prove that the Carleson maximal operator is bounded from $L^p(I)$ to $L^p(I/2)$ ($1 < p < \infty$). The inequality we aim at is
\[ m\{C_I^* f > y\} \le \frac{A^p \|f\|_p^p}{y^p}. \]
Hence we would like, given $f \in L^p(I)$ and given $y > 0$, to define $E$ with $m(E) < A^p\|f\|_p^p/y^p$ and such that for every $x \in I/2 \setminus E$ and $\alpha \in \mathcal{P}$ with $I(\alpha) = I$ we have $|C_\alpha f(x)| < y$. In fact, we will first construct, given $f \in L^p(I)$, $y > 0$ and $N \in \mathbb{N}$, a subset $E_N$ with $m(E_N) < A^p\|f\|_p^p/y^p$, such that for every $x \in I/2 \setminus E_N$ and $\alpha \in \mathcal{P}$ with $I(\alpha) = I$ and $0 \le n(\alpha) < 2^N$ we will have $|C_\alpha f(x)| < y$. Then
\[ \{C_I^* f > y\} \subset \bigcup_N \{x \in I/2 : |C_\alpha f(x)| > y,\ 0 \le n(\alpha) < 2^N,\ I(\alpha) = I\}, \]
and since $A_N = \{x \in I/2 : |C_\alpha f(x)| > y,\ 0 \le n(\alpha) < 2^N,\ I(\alpha) = I\}$ is an increasing sequence of sets, we will have
\[ m(\{C_I^* f > y\}) = \lim_N m(A_N) \le A^p\|f\|_p^p/y^p. \]
Technical reasons will force us to replace $2^N$ by $\theta 2^N$ ($\theta$ an absolute constant) in the above reasoning.

3. Another important point about the proof is that we shall use the basic step repeatedly. Therefore, given a Carleson integral $C_\alpha f(x)$ with $I(\alpha) = I$, $0 \le n(\alpha) < 2^N$ and $x \notin E_N$, we shall obtain a sequence $(\alpha_j)_{j=1}^s$ in $\mathcal{P}$ with $\alpha_1 = \alpha$. Then we will have
\[ |C_\alpha f(x)| \le \sum_{j=1}^{s-1} |C_{\alpha_j} f(x) - C_{\alpha_{j+1}} f(x)| + |C_{\alpha_s} f(x)|. \tag{4.22} \]
We shall apply the basic step to every difference and will arrange things so that $n(\alpha_s) = 0$.


4.13 Choosing the partition $\Pi_\alpha$. The norm $|f|_\alpha$

From now on we will consider a Carleson integral $C_\alpha f(x)$ where $0 \le n(\alpha) < 2^N$ and $I(\alpha) = J$, but we will not assume that $J = I$. If we want to apply the basic step to this integral, what selection of the partition $\Pi$ will be good? The intervals of the partition $\Pi$ will be dyadic with respect to $J = I(\alpha)$ and of length less than $|J|/4$. What we want, in view of (4.20), is to have control of $\|f\|_{\alpha/J_k}$ for every $J_k \in \Pi$. Hence we put $b_j = 2 \cdot 2^{-2^j}$, and we will assume that for the intervals $J_{00}$, $J_{01}$, $J_{10}$, and $J_{11}$ we have
\[ \|f\|_{\alpha/J_{00}},\ \|f\|_{\alpha/J_{01}},\ \|f\|_{\alpha/J_{10}},\ \|f\|_{\alpha/J_{11}} < y b_{j-1}. \tag{4.23} \]
We obtain the partition $\Pi_\alpha$ by a process of subdivision. We start with the four grandsons of $J = I(\alpha)$, of which we assume (4.23). Then at every stage of the process we take some interval $K$ and we subdivide it into its two sons $K_0$ and $K_1$, if they satisfy the condition $\|f\|_{\alpha/K_0}, \|f\|_{\alpha/K_1} < y b_{j-1}$. If they do not, we consider $K$ to be one of the intervals of the partition. Since this process can be infinite we also stop the division if $|K| \le |I|/2^N$ and consider $K$ to be of the partition. As we need to consider the condition (4.23), we define for every $\alpha = (n, J) \in \mathcal{P}$
\[ |f|_\alpha = \sup\bigl\{ \|f\|_{\alpha/J_{00}},\ \|f\|_{\alpha/J_{01}},\ \|f\|_{\alpha/J_{10}},\ \|f\|_{\alpha/J_{11}} \bigr\}. \]
It is important to see that this construction and the definition of $I(x)$ in Proposition 4.7 imply that either $|I(x)| = 2|I|/2^N$ or $|f|_\beta \ge y b_{j-1}$.

Now we have another answer to question 1. If we start with $y b_j \le |f|_\alpha < y b_{j-1}$ we arrive at $|f|_\beta \ge y b_{j-1}$. We go from level $j$ to a lesser level. It is true that a lesser level means here a greater norm $|f|_\beta$, but it also means a smaller number of cycles $n(\beta) \le n(\alpha)$, and we will arrange things so that we arrive at $C_{\alpha_s} f$ with $n(\alpha_s) = 0$. We also have a good bound of the differences $|C_{\alpha_j} f(x) - C_{\alpha_{j+1}} f(x)|$ in (4.22).

As we have motivated above, given $f \in L^1(I)$ and $\alpha = (n, J) \in \mathcal{P}$, we define
\[ |f|_\alpha = \sup\bigl\{ \|f\|_{\alpha/J_{00}},\ \|f\|_{\alpha/J_{01}},\ \|f\|_{\alpha/J_{10}},\ \|f\|_{\alpha/J_{11}} \bigr\}. \tag{4.24} \]
We will say a Carleson integral $C_\alpha f(x)$ is of level $j \in \mathbb{N}$ if $y b_j \le |f|_\alpha < y b_{j-1}$.

In the construction that follows we assume $f \in L^1(I)$, a Carleson integral $C_\alpha f(x)$, a natural number $N$, and a real number $y > 0$ to be given; where $\alpha = (n, J)$ with $0 \le n < 2^N$, and $J$, being the union of two dyadic intervals with respect to $I$, has length $|J| > 4|I|/2^N$. We also assume that $|f|_\alpha < b_{j-1} y$ for some natural number $j$ (not necessarily the level of $C_\alpha f(x)$). Our objective is to select a convenient dyadic partition $\Pi$ of $J$ so that we can apply Theorem 4.11.

We consider now the set of dyadic intervals $J_u$ with respect to $J$, such that $|J|/4 \ge |J_u| \ge |I|/2^N$. For example, in the first four rows of fig. 4.3 we have represented these intervals.

[fig. 4.3: rows of dyadic intervals $J_u$, the resulting partition $\Pi$, the interval $I(x)$ and the position of $x$]

For every one of these intervals we determine if they satisfy the condition
\[ \|f\|_{\alpha/J_u} < y b_{j-1}. \tag{4.25} \]
(In the figure we have painted in black the intervals that, hypothetically, do not satisfy this condition.) Now the interval $J_u$ is a member of the partition $\Pi$ if it is of length $|J_u| = |I|/2^N$ and it, all its ancestors and their brothers satisfy the condition (4.25) (as $J_{10011}$ in the example of the figure), or it, all its ancestors and their brothers satisfy the condition (4.25) but one of its sons does not satisfy this condition (as $J_{011}$ in the example). (The fifth row of the figure is a representation of the partition $\Pi$ in the case we are handling.)

Finally, observe that according to Proposition 4.7, the interval $I(x)$ is always the union of one of the intervals of $\Pi$ and a contiguous interval of the same length. Therefore $|I(x)| = 2|I|/2^N$ or some of the four grandsons of $I(x)$ will not satisfy the condition (4.25). (In the figure, assuming that $x$ is in the interval $J_{10001}$, the interval $I(x)$ is represented in the sixth row. In this case $I(x)$ is not a dyadic interval.)

In our example, $J_{10011}$ is a member of the partition $\Pi$, since its length is just $|I|/2^N$ and it, its ancestors $J_{1001}$, $J_{100}$, $J_{10}$, and their brothers $J_{10010}$, $J_{1000}$, $J_{101}$, $J_{11}$ satisfy the condition (4.25). Also $J_{011}$ is a member of the partition since it, its ancestor $J_{01}$ and their brothers $J_{010}$, $J_{00}$ satisfy the condition (4.25), but one of its sons does not, in this case $J_{0111}$.
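The subdivision process can be sketched as a short routine on words. In this sketch a dyadic interval $J_u$ is represented by its word $u$, the predicate `satisfies` is a hypothetical stand-in for condition (4.25), and `max_depth` stands for the word length at which $|J_u| = |I|/2^N$; none of these names come from the text.

```python
from fractions import Fraction


def choose_partition(satisfies, max_depth):
    """Subdivide the four grandsons while both sons satisfy the condition."""
    partition = []
    stack = ["11", "10", "01", "00"]     # grandsons, assumed to satisfy (4.23)
    while stack:
        word = stack.pop()
        sons = (word + "0", word + "1")
        if len(word) < max_depth and all(satisfies(s) for s in sons):
            stack.extend(reversed(sons))  # keep subdividing, left son first
        else:
            partition.append(word)        # word becomes a member of the partition
    return partition


# Hypothetical situation: only J_0111 fails the condition, and N allows depth 4.
partition = choose_partition(lambda word: word != "0111", max_depth=4)
print(partition)
print(sum(Fraction(1, 2 ** len(word)) for word in partition))
```

With this choice the interval $J_{011}$ is kept whole because one of its sons fails, every other member has the minimal length, and the relative lengths add up to 1, so the words do describe a partition of $J$.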


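4.14 Basic theorem, second form

Now that we have a good selection of the partition $\Pi_\alpha$ we can give a better version of the basic step. We shall need the following comparison between the two norms $\|\cdot\|_\alpha$ and $|\cdot|_\alpha$.

Proposition 4.12 There is a constant $C > 0$ such that for every $f \in L^2(J)$ and $\alpha = (n, J)$
\[ \|f\|_\alpha \le C|f|_\alpha. \]

Proof. Let $|J| = \delta$, and denote by $K$ the grandsons of $J$. Then
\[ \|f\|_\alpha = \sum_j \frac{c}{1+j^2}\, \Bigl| \frac{1}{\delta} \int_J f(t) \exp\Bigl(-2\pi i \Bigl(n + \frac{j}{3}\Bigr) \frac{t}{\delta}\Bigr) dt \Bigr|. \]
Let $\delta' = \delta/4$. We can write
\[ \|f\|_\alpha \le \sum_K \sum_j \frac{c}{1+j^2}\, \frac{1}{4\delta'}\, \Bigl| \int_K f(t) \exp\Bigl(-2\pi i \Bigl(\frac{n}{4} + \frac{j}{12}\Bigr) \frac{t}{\delta'}\Bigr) dt \Bigr|, \]
where $K$ denotes the grandsons of $J$. Let $n = 4m + r$ where $r = 0, 1, 2$ or $3$ and put $4j + s$ instead of $j$, where $s = 0, 1, 2$ or $3$. Then this bound is equal to
\[ \sum_K \sum_s \sum_j \frac{c}{1+(4j+s)^2}\, \frac{1}{4\delta'}\, \Bigl| \int_K f(t) \exp\Bigl(-2\pi i \Bigl(m + \frac{j}{3} + \frac{r}{4} + \frac{s}{12}\Bigr) \frac{t}{\delta'}\Bigr) dt \Bigr|. \]
By Proposition 4.2 we have for $t \in K$
\[ \exp\Bigl(2\pi i \Bigl(\frac{r}{4} + \frac{s}{12}\Bigr) \frac{t}{\delta'}\Bigr) = \sum_\ell c_\ell \exp\Bigl(2\pi i\, \frac{\ell}{3}\, \frac{t}{\delta'}\Bigr), \]
where $(1 + \ell^2)|c_\ell| \le B$ (and where $c_\ell$ depends on $K$, $r$, and $s$). Now we have $\|f\|_\alpha$ bounded by
\[ \sum_K \sum_s \sum_{j,\ell} \frac{c\,|c_\ell|}{1+(4j+s)^2}\, \frac{1}{4\delta'}\, \Bigl| \int_K f(t) \exp\Bigl(-2\pi i \Bigl(m + \frac{j}{3} + \frac{\ell}{3}\Bigr) \frac{t}{\delta'}\Bigr) dt \Bigr| \le \sum_K \sum_s \sum_{j,\ell} \frac{C}{1+j^2}\, |c_\ell|\, |F(m, j+\ell, K)| \le C \sum_K \sum_{j,k} \frac{1+k^2}{(1+j^2)(1+(k-j)^2)}\, \frac{|F(m, k, K)|}{1+k^2}, \]
writing $F(m, k, K)$ for the integral $\frac{1}{\delta'} \int_K f(t) \exp\bigl(-2\pi i (m + k/3)\, t/\delta'\bigr)\, dt$. Now observe that
\[ \frac{1+k^2}{(1+j^2)(1+(k-j)^2)} = \Bigl( \frac{1}{1+j^2} + \frac{1}{1+(k-j)^2} \Bigr) \frac{1+k^2}{2 + j^2 + (j-k)^2} \le 2 \Bigl( \frac{1}{1+j^2} + \frac{1}{1+(k-j)^2} \Bigr). \]
Hence we have
\[ \|f\|_\alpha \le 2C \sum_K \sum_k \sum_j \Bigl( \frac{1}{1+j^2} + \frac{1}{1+(k-j)^2} \Bigr) \frac{|F(m, k, K)|}{1+k^2} \le D \sum_K \sum_k \frac{|F(m, k, K)|}{1+k^2} \le D \sum_K \|f\|_{\alpha/K} \le 4D|f|_\alpha. \]
In the same way we can prove that
\[ \|f\|_\alpha \le C \sup\{\|f\|_{\alpha/J_0}, \|f\|_{\alpha/J_1}\}, \qquad J = I(\alpha). \tag{4.26} \]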

Now we can formulate the basic step with all its ingredients.

Theorem 4.13 (Basic Step) Let $\xi \in \mathcal{P}_I$ and $x \in I(\xi)/2$, and assume that $|f|_\xi < y b_{j-1}$. Given $N \in \mathbb{N}$, let $\Pi_\xi$ and $I(x)$ be the corresponding partition of $I(\xi)$ and interval, defined in Proposition 4.7. Let $J$ be a smoothing interval such that $I(x) \subset J \subset I(\xi)$, and $x \in J/2$. Assume that $|\xi| \ge 4|I|/2^N$. Then we have
\[ |C_{\xi/J} f(x) - C_{\xi/I(x)} f(x)| \le C y b_{j-1} + 2 H_\xi^* f(x) + D y b_{j-1}\, \Delta_\xi(x), \tag{4.27} \]
where $C$ and $D$ are absolute constants.

Proof. The condition $|\xi| \ge 4|I|/2^N$ assures us that we can apply the procedure to obtain $\Pi_\xi$. We have seen that the selection of $I(x)$ implies that $J$ is a union of some members of $\Pi_\xi$, and by (4.26), $\|f\|_{\xi/J} \le C y b_{j-1}$. The same reasoning gives $\|f\|_{\xi/I(x)} \le C y b_{j-1}$. Now observe that
\[ C_{\xi/J} f(x) = \mathrm{p.v.} \int_J \frac{e^{2\pi i\, n(\xi/J)(x-t)/|J|}}{x - t} f(t)\, dt, \]
where $n(\xi/J) = \lfloor n(\xi)|J|/|I(\xi)| \rfloor$. By a change of frequency, we have
\[ \Bigl| C_{\xi/J} f(x) - \mathrm{p.v.} \int_J \frac{e^{2\pi i\, n(\xi)(x-t)/|I(\xi)|}}{x - t} f(t)\, dt \Bigr| \le B\|f\|_{\xi/J} \le C y b_{j-1}. \]
In spite of the possibility that $J \ne I(\xi)$ we can use the partition $\Pi_\xi$, as in the basic decomposition (4.1), for the integral
\[ \mathrm{p.v.} \int_J \frac{e^{2\pi i\, n(\xi)(x-t)/|I(\xi)|}}{x - t} f(t)\, dt. \tag{4.28} \]
We obtain a representation of the integral in (4.28) as
\[ \mathrm{p.v.} \int_{I(x)} \frac{e^{i\lambda(\xi)(x-t)}}{x - t} f(t)\, dt + \int_{J \setminus I(x)} \frac{E_\xi f(t)}{x - t}\, dt + \int_{J \setminus I(x)} \frac{e^{i\lambda(\xi)(x-t)} f(t) - E_\xi f(t)}{x - t}\, dt. \]
For the first term we obtain, as in (4.9), and by a change of frequency,
\[ |\text{First term} - C_{\xi/I(x)} f(x)| \le C\|f\|_{\xi/I(x)} \le C y b_{j-1}. \]
The second can be bounded as in (4.13) by
\[ |\text{Second term}| \le 2 H_\xi^* f(x). \]
For the third term we must use the fact that $J \setminus I(x)$ can be written as a union of intervals from $\Pi_\xi$. Then we proceed as in Theorem 4.11. In this case we obtain a sum of some of the terms of $\Delta_\xi(x)$ instead of all the terms. Since these terms are all positive this sum is less than $\Delta_\xi(x)$. Finally we obtain, as in (4.19),
\[ |\text{Third term}| \le D \sup_{J_k} \|f\|_{\beta_k}\, \Delta_\xi(x) \le D y b_{j-1}\, \Delta_\xi(x). \]
This finishes the proof.

5. Maximal Inequalities

In this chapter we give two inequalities to bound the two terms $\Delta_\xi(x)$ and $H_\xi^* f(x)$ that arise in the basic step.

J.A. de Reyna: LNM 1785, pp. 73–76, 2002. © Springer-Verlag Berlin Heidelberg 2002

5.1 Maximal inequality for Δ(Π, x)

Theorem 5.1 There are some absolute constants $A$ and $B > 0$, such that for every finite partition $\Pi$ of the interval $J \subset \mathbb{R}$ by intervals, we have
\[ \frac{m\{x \in J : \Delta(\Pi, x) > y\}}{|J|} \le A e^{-By}. \tag{5.1} \]

Proof. First recall that if the intervals $J_k$ of $\Pi$ have center at $t_k$ and length $\delta_k$ we have defined the function $\Delta(\Pi, x)$ as
\[ \Delta(\Pi, x) = \sum_k \frac{\delta_k^2}{(x - t_k)^2 + \delta_k^2}. \]
Let $g\colon \mathbb{R} \to [0, +\infty)$ be a bounded and measurable function. We can define a harmonic function on the upper-half plane by convolution with the Poisson kernel
\[ u(x, y) = \frac{1}{\pi} \int_{\mathbb{R}} \frac{y}{(x - t)^2 + y^2}\, g(t)\, dt = P_y * g(x). \]
Hence we have
\[ \int_{\mathbb{R}} \Delta(\Pi, t) g(t)\, dt = \sum_k \pi\delta_k\, u(t_k, \delta_k) = \sum_k \pi\delta_k\, P_{\delta_k} * g(t_k). \]
If we assume $g$ to be positive, Lemma 5.2 gives
\[ P_{\delta_k} * g(t_k) \le \frac{2}{\delta_k} \int_{J_k} P_{\delta_k} * g(t)\, dt. \]
Therefore
\[ \int_{\mathbb{R}} \Delta(\Pi, t) g(t)\, dt \le 2\pi \sum_k \int_{J_k} P_{\delta_k} * g(t)\, dt. \]
By the general inequality of the Hardy-Littlewood maximal function (cf. Theorem 1.6), we get $P_{\delta_k} * g(t) \le \mathcal{M}g(t)$. Hence
\[ \int_{\mathbb{R}} \Delta(\Pi, t) g(t)\, dt \le 2\pi \int_J \mathcal{M}g(t)\, dt. \]
Now, we have seen (cf. Proposition 1.5) that
\[ \int_B \mathcal{M}f(x)\, dx \le m(B) + 2c_1 \int |f(x)| \log^+ |f(x)|\, dx; \]
hence
\[ \int_{\mathbb{R}} \Delta(\Pi, t) g(t)\, dt \le c|J| + c \int_{\mathbb{R}} |g(t)| \log^+ |g(t)|\, dt. \]
Now we put $g(t) = e^{y/2c} \chi_{\{\Delta(\Pi, t) > y\}}(t)$ and obtain
\[ y e^{y/2c}\, m\{t : \Delta(\Pi, t) > y\} \le \int_{\mathbb{R}} \Delta(\Pi, t) g(t)\, dt \le c|J| + c e^{y/2c}\, \frac{y}{2c}\, m\{t : \Delta(\Pi, t) > y\}. \]
Hence
\[ \frac{m\{t : \Delta(\Pi, t) > y\}}{|J|} \le \frac{2c}{y}\, e^{-y/2c}. \]

Since the left member is less than or equal to 1, we obtain the desired bound.

We prove now the following lemma:

Lemma 5.2 Let $g \in L^\infty(\mathbb{R})$ be a positive function and $u(x, y) = P_y * g(x)$. Then for every interval $J \subset \mathbb{R}$ with center at $a$ and length $y$ we have
\[ u(a, y) \le \frac{2}{|J|} \int_J u(x, y)\, dx. \]

Proof. Without loss of generality we can assume that $a = 0$. So we want to prove
\[ \frac{1}{\pi} \int_{\mathbb{R}} \frac{y}{t^2 + y^2}\, g(-t)\, dt \le \frac{2}{|J|} \int_J \frac{1}{\pi} \int_{\mathbb{R}} \frac{y}{t^2 + y^2}\, g(x - t)\, dt\, dx = \frac{2}{|J|} \int_J \frac{1}{\pi} \int_{\mathbb{R}} \frac{y}{(x + t)^2 + y^2}\, g(-t)\, dt\, dx. \]
But this follows from
\[ \frac{y}{t^2 + y^2} \le \frac{2}{|J|} \int_J \frac{y}{(x + t)^2 + y^2}\, dx = \frac{2}{y} \int_{-y/2}^{y/2} \frac{y}{(x + t)^2 + y^2}\, dx. \tag{5.2} \]
Changing the variable we see that (5.2) is equivalent to
\[ \frac{1}{u^2 + 1} \le 2 \int_{-1/2}^{1/2} \frac{dx}{(x + u)^2 + 1}. \]
And we see that for every $\xi$ with $|\xi| < 1/2$ we have
\[ \frac{1}{u^2 + 1} \le \frac{2}{(\xi + u)^2 + 1}. \]
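The two elementary inequalities that close the proof of Lemma 5.2 can be checked numerically. The sketch below samples $u$ and $\xi$ on a grid of our choosing and uses the closed form $\int_{-1/2}^{1/2} \frac{dx}{(x+u)^2+1} = \arctan(u + \tfrac12) - \arctan(u - \tfrac12)$; it illustrates the inequalities and does not replace the argument.

```python
import math


def pointwise_gap(u, xi):
    """2 / ((xi + u)^2 + 1) - 1 / (u^2 + 1); nonnegative when |xi| < 1/2."""
    return 2 / ((xi + u) ** 2 + 1) - 1 / (u * u + 1)


def averaged_gap(u):
    """2 * integral over [-1/2, 1/2] of dx / ((x + u)^2 + 1), minus 1 / (u^2 + 1)."""
    integral = math.atan(u + 0.5) - math.atan(u - 0.5)
    return 2 * integral - 1 / (u * u + 1)


us = [k / 10 for k in range(-500, 501)]
xis = [k / 100 for k in range(-49, 50)]
print(min(pointwise_gap(u, xi) for u in us for xi in xis) >= 0)
print(min(averaged_gap(u) for u in us) >= 0)
```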

5.2 Maximal inequality for $H_I^* f$

In order to prove an inequality of the same type as (5.1) for $H_I^* f$ we first relate this maximal function to the ordinary maximal Hilbert transform $H^* f$ and the maximal function of Hardy and Littlewood.

Proposition 5.3 Let $1 \le p < +\infty$, $f \in L^p(\mathbb{R})$, and let $I \subset \mathbb{R}$ be a bounded interval. Then for every $x \in I$ we have
\[ H_I^* f(x) \le 2 H^* f(x) + 6 \mathcal{M}f(x). \tag{5.3} \]

Proof. Observe that in the definition of $H_I^* f(x)$ we consider only the restriction of $f$ to the interval $I$. Let $K \subset I$ be such that $x \in K/2$. Let $J \subset K$ be an interval with center at $x$ and of maximal length, and $L \supset K$ be an interval with center at $x$ but of minimal length. We can write
\[ \Bigl| \int_K \frac{f(t)}{x - t}\, dt \Bigr| \le \Bigl| \int_J \frac{f(t)}{x - t}\, dt \Bigr| + \Bigl| \int_{K \setminus J} \frac{f(t)}{x - t}\, dt \Bigr|. \]
The first term is equal to $\bigl| Hf(x) - \int_{\mathbb{R} \setminus J} \cdots \bigr|$ and consequently is bounded by $2H^* f(x)$. To bound the second term observe that if $t \in K \setminus J$, then $|x - t| \ge d(x, \mathbb{R} \setminus K) \ge |K|/4$. Hence
\[ \Bigl| \int_{K \setminus J} \frac{f(t)}{x - t}\, dt \Bigr| \le \frac{4}{|K|} \int_{K \setminus J} |f(t)|\, dt \le \frac{4|L|}{|K|}\, \frac{1}{|L|} \int_L |f(t)|\, dt. \]
From $x \in K/2$ and the definition of $L$ it follows that $|L| \le 6|K|/4$. Hence we have obtained (5.3).

Theorem 5.4 There are absolute constants $A$ and $B > 0$ such that for every $f \in L^\infty(I)$, and every $y > 0$
\[ \frac{m\{H_I^* f(x) > y\}}{|I|} \le A e^{-By/\|f\|_\infty}. \tag{5.4} \]

Proof. By homogeneity we can assume that $\|f\|_\infty = 1$. We shall also consider $f$ as the restriction to $I$ of a function vanishing on $\mathbb{R} \setminus I$. By Proposition 5.3, as $\|\mathcal{M}f\|_\infty \le 1$, we have
\[ \{H_I^* f(x) > y\} \subset \{2H^* f(x) > y/2\}, \quad\text{if } y > 12. \]
Now we want to bound
\[ \int_I e^{A H^* f(t)}\, dt. \]

We have seen that the maximal Hilbert transform satisfies $\|H^* f\|_p \le Bp\|f\|_p$. Hence
\[ \int_I e^{A H^* f(t)}\, dt \ \ [\ldots] \ \ m\{H^* f > y/4\}\,\bigl(e^{Ay/4} - Ay/4\bigr) \le C|I|. \]
Therefore, if $y > y_0$ ($y_0$ depending only on $A$), we have
\[ \tfrac{1}{2} e^{Ay/4}\, m\{H_I^* f(x) > y\} \le C|I|. \]
Hence we have proved (5.4) for $y > y_0$. Now, changing $A$, we can forget the restriction on $y$.

6. Growth of Partial Sums

6.1 Introduction

As a first indication that the basic step is powerful we give a theorem about the partial sums of the Fourier series of a function $f \in L^2[0, 2\pi]$. This can be regarded as a toy example of the techniques involved in Carleson's theorem. This example will also justify our procedure in the proof of Carleson's Theorem.

The principal result in this chapter will be that if $f \in L^2[0, 2\pi]$, then $S_n(f, x) = o(\log\log n)$ almost everywhere. As we have seen, to bound $S_n(f, x)$ we must bound the corresponding Carleson integral. Hence our first objective will be to bound $\sup_\alpha |C_\alpha f(x)|$ where the supremum is taken over all pairs $\alpha$ with $I(\alpha) = I$ and $|n(\alpha)| < \theta 2^N$. Each time we apply the basic step we lessen the values of $n(\alpha)$ and $|\alpha|$; the role of $\theta$ is to assure that we arrive at $n(\alpha) = 0$ before we arrive at $|\alpha| < 4|I|/2^N$. In this chapter $\theta = 1/4$, but later, in the proof of the Carleson Theorem, we shall need another value of $\theta$. We will consider that $f$ is real, so that by (4.21) we can assume $0 \le n(\alpha) < \theta 2^N$.

Given $\varepsilon > 0$, we shall construct a set $E = \bigcup_{N=2}^\infty E_N \subset I$ with $m(E) < A\varepsilon$ and such that
\[ \sup_{\substack{0 \le n(\alpha) < \theta 2^N \\ I(\alpha) = I}} |C_\alpha f(x)| < B\, \frac{\|f\|_2}{\sqrt{\varepsilon}}\, \log N, \qquad\text{for } x \in I/2 \setminus E. \]
To achieve this we will use the procedure indicated in (4.22). Every piece $E_N$ of the exceptional set will be the union of four subsets $E_N = S \cup T_N \cup U_N \cup V$. $T_N$ and $U_N$ will allow us to bound the terms $H_\xi^* f(x)$ and $\Delta_\xi(x)$ when we apply the basic step. $V$ will allow us to bound the final term $|C_{\alpha_s} f(x)|$ in (4.22) when $n(\alpha_s) = 0$. The definition of $S$ will be given so that for every $x \in I/2 \setminus S$ the integrals $C_{\alpha_j} f(x)$ that appear in (4.22) will have a level $j \ge 1$.

Let $\mathcal{C}$ be the set of pairs $(n, J)$, where $0 \le n < \theta 2^N$, and $J$ is a smoothing interval with respect to $I$, such that $|J| \ge 4|I|/2^N$.

J.A. de Reyna: LNM 1785, pp. 77–84, 2002. c Springer-Verlag Berlin Heidelberg 2002


6.2 The seven trick

We are going to define the exceptional set $E_N$. Given $\alpha \in \mathcal{C}$, from $I(\alpha) \not\subset E_N$ we want to derive some properties of every grandchild $J$ of $I(\alpha)$. Therefore, for every such property $p$, we shall define the set
$$A = \bigcup\{J : J \text{ dyadic and } p(J)\},$$
where $p(J)$ denotes that $J$ satisfies the property we are considering. Now, given a measurable set $A$, we can define the set
$$A^* = \bigcup\{7J : J \subset A,\ J \text{ dyadic}\},$$
where $7J$ denotes the interval of length $7|J|$ with the same center as $J$. Hence for every dyadic interval $J \subset A$ we put in $A^*$ the interval $J$ and three contiguous intervals of the same length at each side.

Then the subset $A^*$ satisfies our conditions. In fact, if $I(\alpha) \not\subset A^*$ and $J$ is a grandchild of $I(\alpha)$, we have $J \not\subset A$. Suppose, on the contrary, that $J \subset A$. Since $J$ is dyadic we would have $I(\alpha) \subset 7J \subset A^*$, which is a contradiction.

Moreover, two dyadic intervals are disjoint or one is contained in the other, and all dyadic intervals are subsets of $I$. Hence we can obtain a set $\mathcal{V}$ of disjoint dyadic intervals $J \subset A$ such that
$$A^* = \bigcup_{J \in \mathcal{V}} 7J.$$
Therefore
$$m(A^*) \le \sum_{J \in \mathcal{V}} 7|J| \le 7\,m(A).$$
And in general we have that

if $J$ is a smoothing interval such that $J \not\subset A^*$, then every grandchild $K$ of $J$ satisfies $K \not\subset A$.
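The two facts behind the seven trick can be checked on a small example. The following is only a sketch under stated assumptions: intervals are pairs of exact fractions, `smoothing_with_grandchild` lists every interval of length $4|K|$ that has $K$ as one of its quarters (a superset of the genuine smoothing intervals with grandchild $K$), and all function names are illustrative, not taken from the text.

```python
from fractions import Fraction

def dyadic(k, j):
    """Dyadic interval [j/2^k, (j+1)/2^k) of [0, 1), as a pair of Fractions."""
    return (Fraction(j, 2 ** k), Fraction(j + 1, 2 ** k))

def seven(J):
    """Interval 7J: same center as J, seven times its length."""
    a, b = J
    return (a - 3 * (b - a), b + 3 * (b - a))

def union_measure(intervals):
    """Lebesgue measure of a finite union of intervals."""
    total, right = Fraction(0), None
    for a, b in sorted(intervals):
        if right is None or a > right:
            total += b - a
            right = b
        elif b > right:
            total += b - right
            right = b
    return total

def smoothing_with_grandchild(K):
    """The four intervals of length 4|K| that have K as one of their quarters."""
    a, b = K
    h = b - a
    return [(a - i * h, a + (4 - i) * h) for i in range(4)]

def contains(outer, inner):
    """True when the interval inner is a subset of the interval outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# A family of disjoint dyadic intervals whose union plays the role of A.
family = [dyadic(4, 3), dyadic(5, 20), dyadic(3, 6)]
a_star = [seven(J) for J in family]
```

With this family, `union_measure(a_star)` does not exceed `7 * union_measure(family)`, and every candidate interval returned by `smoothing_with_grandchild(K)` lies inside `seven(K)`. The second fact is the contrapositive of the boxed statement: a smoothing interval with a grandchild inside $A$ is itself inside $A^*$.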

6.3 The exceptional set

Let $y > 0$. Later we will choose $y$ big enough so that $m(E) < A\varepsilon$. The first component of the exceptional set is $S^* = \bigcup_J 7J$, where the union is taken over all the dyadic intervals $J$ such that
$$\frac{1}{|J|}\int_J |f|^2\,dm \ge y^2. \tag{6.1}$$


The definition of $S^*$ is given so that: for every $\alpha \in \mathcal{P}_I$ such that $x \in I(\alpha)/2$ but $x \notin S^*$, we have $|f|_\alpha \le y$. In fact, for every grandson $J$ of $I(\alpha)$ we have
$$\frac{1}{|J|}\int_J |f|^2\,dm < y^2.$$
(In the other case $x \in I(\alpha) \subset 7J \subset S^*$.) Therefore for every grandson $J$ of $I(\alpha)$ we have $\|f\|_{\alpha/J} < y$. Hence $|f|_\alpha < y$.

Two dyadic intervals are disjoint or one is contained in the other. It follows that there exists a sequence $(J_n)$ of disjoint dyadic intervals such that every $J_n$ satisfies (6.1), and every other $J$ that satisfies (6.1) is contained in one of the $J_n$. Hence
$$m(S^*) = m\Bigl(\bigcup_n 7J_n\Bigr) \le 7\sum_n m(J_n) \le \frac{7}{y^2}\int_I |f|^2\,dm \le \frac{7}{y^2}\,\|f\|_2^2.$$

Let $\alpha \in \mathcal{C}$ be such that $I(\alpha) \not\subset S^*$; then $|f|_\alpha < y$. Hence $|f|_\alpha = 0$ or there exists $j \in \mathbb{N}$ such that $y b_j \le |f|_\alpha < y b_{j-1}$. With $\alpha$, $j$, $y$ and $N$ we obtain a partition $\Pi_\alpha$ of $I(\alpha)$ as in section [4.13]. Now, for such $\alpha$, we define $T_N(\alpha) = \emptyset$ if $|f|_\alpha = 0$ or, otherwise,
$$T_N(\alpha) = \bigl\{x \in I(\alpha) : H_\alpha^* f(x) > M y b_{j-1} \log(\sqrt{N}/b_j)\bigr\}, \tag{6.2}$$
where $M > 0$ and $y > 0$ will be determined later. Observe also that $\sqrt{N} > b_j$ always.

The norm $\|E_\alpha f\|_\infty$ has been bounded in (4.18) in terms of the local norms of $f$ on the intervals of the partition, hence we have the bound $\|E_\alpha f\|_\infty \le C \sup_k \|f\|_{\beta_k} < C y b_{j-1}$. Therefore, by the maximal inequality obtained in (5.4) for the maximal Hilbert dyadic transform,
$$m(T_N(\alpha)) \le A|\alpha| \exp\Bigl(-\frac{BM}{C}\log(\sqrt{N}/b_j)\Bigr) \le A|\alpha|\,\frac{b_j^2}{N^3} \le A|\alpha|\,\frac{|f|_\alpha^2}{y^2 N^3},$$
if we choose $M$ in such a way that $BM/C = 6$. Let $T_N$ be the union of the sets $T_N(\alpha)$. To bound $m(T_N)$ we need the following:

Lemma 6.1 For every $f \in L^2(J)$,
$$\sum_{n \in \mathbb{Z}} |f|_{\alpha_n}^2 \le 16\,\frac{1}{|J|}\int_J |f|^2\,dm,$$
where $\alpha_n = (n, J)$ for every $n \in \mathbb{Z}$.


Now we have
$$m(T_N) \le \sum_{\alpha = (n, J)} m(T_N(\alpha)) \le \sum_J \sum_n A|J|\, y^{-2} N^{-3} |f|_\alpha^2 \le 16 A y^{-2} N^{-3} \sum_J \int_J |f|^2\,dm.$$
Here $J$ can be any smoothing interval of $I$ whose length is $\ge 4|I|/2^N$. Summing first over all $J$ with the same length $|I|/2^r$,
$$m(T_N) \le 16 A y^{-2} N^{-3} \sum_{r=0}^{N-2} 2\int_I |f|^2\,dm \le \frac{32A}{y^2 N^2}\,\|f\|_2^2. \tag{6.3}$$
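The two pieces of arithmetic used to reach (6.3) can be checked numerically. This is a minimal sketch, assuming $b_j = 2 \cdot 2^{-2^j}$ (the definition given later with (7.1)) and $BM/C = 6$; the helper names are illustrative only.

```python
import math

def b(j):
    """b_j = 2 * 2**(-2**j), the level thresholds used in the text."""
    return 2.0 * 2.0 ** (-(2 ** j))

def tail_factor(N, j):
    """exp(-6 * log(sqrt(N) / b_j)), the factor in the bound for m(T_N(alpha))."""
    return math.exp(-6.0 * math.log(math.sqrt(N) / b(j)))

def count_factor(N):
    """16 * N**-3 * sum_{r=0}^{N-2} 2, the numerical factor in front of (6.3)."""
    return 16.0 * sum(2 for r in range(N - 1)) / N ** 3

for N in range(2, 50):
    for j in range(1, 5):
        # the exponent 6 turns the logarithm into b_j**6 / N**3 ...
        assert math.isclose(tail_factor(N, j), b(j) ** 6 / N ** 3, rel_tol=1e-9)
        # ... which is at most b_j**2 / N**3 because b_j <= 1/2
        assert tail_factor(N, j) <= b(j) ** 2 / N ** 3
    # summing over the N - 1 admissible lengths costs one power of N
    assert count_factor(N) <= 32.0 / N ** 2
```

The exponent 6 is what leaves $N^{-3}$ after taking the logarithm of $\sqrt{N}$, so that the sum over the $N - 1$ admissible lengths still leaves the summable factor $N^{-2}$.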

The set $U_N$ will also be the union of the sets $U_N(\alpha)$, for the same $\alpha$'s. Put
$$U_N(\alpha) = \bigl\{x \in I(\alpha) : \Delta_\alpha(x) > (M/C)\log(\sqrt{N}/b_j)\bigr\},$$
where $M$, $C$ and $y$ are the same constants that appear in (6.2). By the corresponding maximal inequality of theorem 5.1 we obtain
$$m(U_N) \le \sum_J \sum_n m(U_N(\alpha)) \le \sum_J \sum_n A|J|\, y^{-2} N^{-3} |f|_\alpha^2.$$
The same reasoning we used before proves the inequality
$$m(U_N) \le \frac{32A}{y^2 N^2}\,\|f\|_2^2.$$

Finally, the last component $V$ will be the set
$$V = \{x \in I : H_I^* f(x) > y\}.$$
By the relation, proved in proposition 5.3, between the maximal Hilbert dyadic transform, the maximal Hilbert transform and the maximal Hardy-Littlewood function, the function $H^* f$ is of weak type $(2, 2)$. Therefore we have
$$m(V) \le C\,\frac{\|f\|_2^2}{y^2}.$$
Observe that given $f \in L^2(I)$, $N \in \mathbb{N}$, and $y > 0$, we have constructed a measurable set $E_N = S^* \cup T_N \cup U_N \cup V \subset I$ such that for $E = \bigcup_{N=2}^{\infty} E_N$ we have
$$m(E) \le m(S^*) + m(V) + \sum_{N=2}^{\infty} m(T_N \cup U_N) \le A\,\frac{\|f\|_2^2}{y^2},$$

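where $A$ denotes an absolute constant.

Proof of lemma 6.1. First, we prove the inequality
$$\sum_{n \in \mathbb{Z}} \|f\|_{\alpha_n}^2 \le \frac{1}{|J|}\int_J |f|^2\,dm.$$
There exists $(x_n)_{n \in \mathbb{Z}} \in \ell^2(\mathbb{Z})$ with $\sum_n |x_n|^2 = 1$ such that
$$\Bigl(\sum_{n \in \mathbb{Z}} \|f\|_{\alpha_n}^2\Bigr)^{1/2} = \sum_{j,n} \frac{c}{1 + j^2}\, x_n \Bigl|\frac{1}{|J|}\int_J f(t) \exp\Bigl(-2\pi i\Bigl(n + \frac{j}{3}\Bigr)\frac{t}{|J|}\Bigr)\,dt\Bigr|.$$
Hence if we denote by $y_{3n+j}$ the above integral, we have
$$\sum_{n \in \mathbb{Z}} |y_{3n}|^2 = \sum_{n \in \mathbb{Z}} |y_{3n+1}|^2 = \sum_{n \in \mathbb{Z}} |y_{3n+2}|^2 = |J|^{-1}\int_J |f(t)|^2\,dt.$$
Therefore
$$\Bigl(\sum_{n \in \mathbb{Z}} \|f\|_{\alpha_n}^2\Bigr)^{1/2} = c \sum_{j,n} \frac{x_n |y_{3n+j}|}{1 + j^2} \le \|f\|_{L^2(J)} \sum_j \frac{c}{1 + j^2} = \|f\|_{L^2(J)},$$
where the norm of $L^2(J)$ is taken with respect to the normalized measure $dm/|J|$.

Now we denote by $K$ the grandsons of $J$ and we have
$$\sum_{n \in \mathbb{Z}} |f|_{\alpha_n}^2 \le \sum_n \sum_K \|f\|_{(\lfloor n/4 \rfloor, K)}^2 \le 4 \sum_K \sum_m \|f\|_{(m, K)}^2 \le 4 \sum_K \frac{1}{|K|}\int_K |f|^2\,dm \le \frac{16}{|J|}\sum_K \int_K |f|^2\,dm \le \frac{16}{|J|}\int_J |f|^2\,dm.$$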

6.4 Bound for the partial sums

Proposition 6.2 Let $f \in L^2(I)$, and $\varepsilon > 0$ be given. There exists a set $E$ of measure $m(E) < A\varepsilon$, such that for every $N > 2$ and $x \in I/2 \setminus E$
$$\sup_{0 \le k < \theta 2^N} |C_{(k,I)} f(x)| \le B\,\frac{\|f\|_2}{\sqrt{\varepsilon}}\,(\log N). \tag{6.4}$$

Proof. Without loss of generality we can assume that $f$ is a real function with $\|f\|_2 > 0$. Choosing $y = \|f\|_2/\sqrt{\varepsilon}$, we have constructed in the previous section a set $E$ such that $m(E) < A\varepsilon$.

Consider the set $\mathcal{C}'$ of those pairs $\alpha$ such that $I(\alpha)$ is a smoothing interval of length $|\alpha| > 4|I|/2^N$ and $0 \le n(\alpha)|I|/|\alpha| < \theta 2^N$. (This is a subset of the set of pairs $\mathcal{C}$ that we have used in section 1.) The set $\mathcal{C}'$ is defined in order that for every pair $\alpha = (k, I)$ with $0 \le k < \theta 2^N$ and every smoothing interval $J \subset I$ of length $> 4|I|/2^N$ we have $\alpha/J \in \mathcal{C}'$.


Now choose $x \in I/2 \setminus E$. For every Carleson integral $C_\alpha f(x)$ appearing in (6.4), we have $\alpha \in \mathcal{C}'$.

Assume that we have a Carleson integral $C_\alpha f(x)$ with $\alpha \in \mathcal{C}'$. Since $x \notin S^*$ and $x \in I(\alpha)/2$, we have $|f|_\alpha < y$. Hence there is a well defined level $j \in \mathbb{N}$ such that $b_j y \le |f|_\alpha < y b_{j-1}$. (Or $f = 0$ a.e. on $I(\alpha)$ and there is nothing to prove.) We are in position to apply the procedure of section [4.13] to obtain a dyadic partition $\Pi_\alpha$ of $I(\alpha)$ with intervals $J$ of length $|I|/2^N \le |J| \le |I|/4$. Then Proposition 4.7 gives us a smoothing interval $I(x)$. This interval $I(x)$ is the union of an interval $J_0$ of $\Pi_\alpha$ and a contiguous interval. If $|J_0| = |I|/2^N$, then $|I(x)| = 2|I|/2^N$. Otherwise there is a son $K$ of $J_0$ (hence a grandson of $I(x)$) such that $\|f\|_{\alpha/K} \ge y b_{j-1}$. Therefore we have $|I(x)| = 2|I|/2^N$ or $|f|_{\alpha/I(x)} \ge y b_{j-1}$.

Put $\beta = \alpha/I(x)$. We are going to prove that we have a good bound for $|C_\alpha f(x) - C_\beta f(x)|$. Also either $\beta \in \mathcal{C}'$ or we have a good bound for $C_\beta f(x)$.

When $|I(x)| \le 4|I|/2^N$ we have $n(\beta) = 0$. In fact,
$$n(\beta) = \Bigl\lfloor n(\alpha)\frac{|I(x)|}{|\alpha|}\Bigr\rfloor = \Bigl\lfloor n(\alpha)\frac{|I|}{|\alpha|}\,\frac{|I(x)|}{|I|}\Bigr\rfloor < \theta 2^N\,\frac{4}{2^N} = 1.$$
Since $x \notin V$ we will have $|C_\beta f(x)| \le y$, a bound that is good enough for us.

In the second case $|I(x)| = |I(\beta)| > 4|I|/2^N$ and
$$n(\beta)\frac{|I|}{|\beta|} = \Bigl\lfloor n(\alpha)\frac{|I(x)|}{|\alpha|}\Bigr\rfloor \frac{|I|}{|\beta|} \le n(\alpha)\frac{|I|}{|\alpha|} < \theta 2^N.$$
Therefore $\beta \in \mathcal{C}'$. Also by Proposition 4.7, $x \in I(\beta)/2$. If $|f|_\beta = 0$, we obtain $C_\beta f(x) = 0$ and so we have a good bound of $C_\beta f(x)$. Otherwise there exists $k \in \mathbb{N}$ such that $y b_k \le |f|_\beta < y b_{k-1}$. The construction of $\Pi_\alpha$ and the fact that $|f|_\beta \ge y b_{j-1}$ imply that $j > k$.

The basic step gives us in all cases (taking $J = I(\alpha)$)
$$|C_\alpha f(x) - C_\beta f(x)| \le C y b_{j-1} + 2 H_\alpha^* f(x) + D y b_{j-1} \Delta_\alpha(x).$$
Now since $x \notin E_N$ this is
$$\le C y b_{j-1} + 2 M y b_{j-1} \log(\sqrt{N}/b_j) + D y b_{j-1} (M/C) \log(\sqrt{N}/b_j) \le C y b_{j-1} \log(\sqrt{N}/b_j).$$

Now either $C_\beta f(x) = 0$, or $n(\beta) = 0$, or we are in position to apply the same procedure with $\beta$ instead of $\alpha$. In the last case we also have a level $1 \le k < j$, so that in a finite number of steps we must arrive at $n(\beta) = 0$ or $C_\beta f(x) = 0$. Then we obtain
$$|C_\alpha f(x)| \le y + \sum_{j \in \mathbb{N}} C y b_{j-1} \log(\sqrt{N}/b_j) \le B (\log N)\, y.$$
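The last inequality rests on the fact that the sum over the levels grows only like $\log N$. A short numerical sketch of this, assuming $b_j = 2 \cdot 2^{-2^j}$ and truncating the sum at eight levels (the remaining terms are far below floating-point resolution); the function names are illustrative.

```python
import math

def b(j):
    """b_j = 2 * 2**(-2**j); note that b(0) = 1."""
    return 2.0 * 2.0 ** (-(2 ** j))

def level_sum(N, levels=8):
    """sum_{j=1}^{levels} b_{j-1} * log(sqrt(N) / b_j)."""
    return sum(b(j - 1) * math.log(math.sqrt(N) / b(j)) for j in range(1, levels + 1))

def ratio(N):
    """level_sum(N) / log(N): the quantity that must stay bounded in N."""
    return level_sum(N) / math.log(N)
```

Since $\log(\sqrt{N}/b_j) = \tfrac12 \log N + (2^j - 1)\log 2$ and the $b_{j-1}$ decay doubly exponentially, `ratio(N)` decreases with $N$ and is therefore bounded by its value at $N = 3$, the smallest case allowed in Proposition 6.2.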


Proposition 6.3 Let $f \in L^2(I)$, then
$$\sup_{0 \le |k| < n} |C_{(k,I)} f(x)| = o(\log\log n), \qquad \text{a.e. on } I/2.$$

Proof. As we did before, we assume that $f$ is a real function. The proposition is equivalent to
$$\sup_{0 \le k < \theta 2^N} |C_{(k,I)} f(x)| = o(\log N), \qquad \text{a.e. on } I/2. \tag{6.5}$$
What we have proved can be written as
$$\limsup_{N \to +\infty} \frac{\sup_{0 \le k < \theta 2^N} |C_{(k,I)} f(x)|}{\log N} \le B\,\frac{\|f\|_2}{\sqrt{\varepsilon}}, \qquad x \notin E.$$
Now we need the fact (cf. Proposition 4.1) that for every trigonometric polynomial $P$, there exists a constant $C(P) < +\infty$ such that
$$\sup_{0 \le k < \theta 2^N} |C_{(k,I)} P(x)| \le C(P).$$
Hence there exists a set $E'$, with $m(E') < A\varepsilon$, such that for $x \notin E'$
$$\limsup_{N \to +\infty} \frac{\sup_{0 \le k < \theta 2^N} |C_{(k,I)} f(x)|}{\log N} = \limsup_{N \to +\infty} \frac{\sup_{0 \le k < \theta 2^N} |C_{(k,I)} (f - P)(x)|}{\log N} \le B\,\frac{\|f - P\|_2}{\sqrt{\varepsilon}}.$$
The density of the trigonometric polynomials can be used now to prove that the set of points where
$$\limsup_{N \to +\infty} \frac{\sup_{0 \le k < \theta 2^N} |C_{(k,I)} f(x)|}{\log N} > 0$$
is of measure $\le A\varepsilon$.

Proposition 6.4 Let $f \in L^2[-\pi, \pi]$, and let $S_n(f, x)$ be the partial sums of the Fourier series of $f$. Then
$$S_n(f, x) = o(\log\log n), \qquad \text{a.e. on } [-\pi, \pi].$$

Proof. The partial sums are given by the convolution of $f$ with the Dirichlet kernel, $S_n(f, x) = D_n * f(x)$. Thus, by the expression (2.3) of the Dirichlet kernel we get
$$S_n(f, x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f^\circ(x - t)\,\frac{\sin nt}{t}\,dt + \frac{1}{2\pi}\int_{-\pi}^{\pi} f^\circ(x - t)\,\varphi_n(t)\,dt,$$
for every $x$ with $|x| < \pi$, where $\|\varphi_n\|_\infty \le C$ uniformly in $n \in \mathbb{N}$, and $f^\circ$ denotes the periodic extension of $f$ for $|t| < 2\pi$ and $0$ for $|t| \ge 2\pi$. Therefore, as $\sin nt/t \in L^2(\mathbb{R})$, we get
$$\Bigl|S_n(f, x) - \frac{1}{\pi}\int_{-2\pi}^{2\pi} f^\circ(t)\,\frac{\sin n(x - t)}{x - t}\,dt\Bigr| \le C\|f\|_2.$$
Thus
$$\sup_{0 \le n \le N} |S_n(f, x)| \le \sup_{0 \le |n| \le N} |C_\alpha f^\circ(x)| + C\|f\|_2,$$
where the supremum on the right is taken over those $\alpha$ with $I(\alpha) = [-2\pi, 2\pi]$. By proposition 6.3 it follows that
$$\sup_{0 \le n \le N} |S_n(f, x)| = o(\log\log N).$$

Remark. Obviously Proposition 6.4 is superseded by Carleson's Theorem. Carleson proved instead that $S_n(f, x) = o(\log\log n)$ a.e. if there exists $\delta > 0$ such that $\int_I |f| (\log^+ |f|)^{1+\delta}\,dm < +\infty$. The proof is almost the same, but Lemma 6.1 must be replaced by an inequality of Hausdorff-Young type,
$$\sum_{n \in \mathbb{Z}} \exp\bigl(-a \|f\|_{\alpha_n}^{-1/(1+\delta)}\bigr) < +\infty,$$
valid whenever $\int_I |f| (\log^+ |f|)^{1+\delta}\,dm < +\infty$.

At the time of the publication of the Carleson theorem the best result known about $S_n(f, x)$ for $f \in L^2(I)$ was the Kolmogorov-Seliverstov-Plessner theorem, giving $S_n(f, x) = o(\sqrt{\log n})$ a.e.

7. Carleson analysis of the function

7.1 Introduction

Carleson's Theorem is achieved by refining the proof in the previous chapter. The weak point of this proof is that we have allowed all the pairs $\alpha \in \mathcal{C}'$, and for each one a fraction of $I(\alpha)$ must be included in $E_N$ (namely $T_N(\alpha) \cup U_N(\alpha)$, where we do not have a good bound for the second and third terms of the basic decomposition). Since we are summing over all pairs $\alpha = (n, J)$ with $J$ any smoothing interval of $I$ with length $\ge 4|I|/2^N$, we obtain a factor $N$ in $m(T_N)$ and $m(U_N)$ (see equation (6.3)). We are forced to compensate it by a term $\log(\sqrt{N})$ in the exponent. Finally this logarithmic term appears in the final bound of the Carleson integral.

But it is clear from the proof that not all pairs are used in the inductive steps. So we must define a set of allowed pairs and ensure that we only use these at the inductive steps. There is also the possibility that we change the frequency of a Carleson integral, introducing a controllable error, to transfer it to an allowed pair.

It is clear that possible candidates for the allowed pairs will be those intervals that appear as $I(x)$ in the process of choosing the partition in Proposition 6.2. In this process we check for every dyadic interval below $I(\alpha)$ whether the condition $\|f\|_{\alpha/J_u} < y b_{j-1}$ is satisfied. The intervals $I(x)$ are selected among the grandfathers of those that do not satisfy this condition. Recalling that the $\|f\|_\beta$ are mean values of generalized Fourier coefficients of $f$ on the interval $I(\beta)$, we are persuaded to think that the allowed pairs must be related to large local Fourier coefficients.

This, maybe, justifies the following step in Carleson's construction: what we will call the Carleson analysis of the function $f$. This can be seen as the process of writing the score from a piece of music. The following section, which can be skipped, tries to explain this connection.

In this and the following chapter we assume a function $f \in L^2(I) \cap L^p(I)$ to be given, where $1 \le p < +\infty$. Later we shall assume that $|f| = \chi_A$ is the characteristic function of a measurable set of $I$, and we shall use interpolation techniques to recover the case of a general $f \in L^p(I)$.

J.A. de Reyna: LNM 1785, pp. 85–91, 2002. © Springer-Verlag Berlin Heidelberg 2002


7.2 A musical interlude

When we are hearing music our ears and mind are working very heavily. What do we hear? A possible answer to this question is that our mind analyzes the sound signal $f$ to obtain the notes that compose $f$. These notes are what we sense as music. Obviously this is a good answer: when a musician wants to preserve the music he writes the score, and what he writes must be a substantial portion of what we hear.

If we look at the text of a musical composition we see connected dots situated on five parallel lines. The vertical axis represents the pitch, the horizontal axis the time. Every dot represents a note. We can think of a note as a wave train $a e_\alpha(x)$. The human sense of pitch is determined by the frequency of vibration of the sound. The frequency of $a e_\alpha(x)$ is $\lambda(\alpha)$. Other elements of the note are its intensity, represented by the constant $a$, its duration $|\alpha|$, and the interval of time in which the note must be produced, $I(\alpha)$. Hence every dot is associated with a pair $\alpha = (n, I)$, and the music can be seen as a set of pairs $\alpha$ that are the notes that compose the sound.

If the music has a binary rhythm, such as one determined by a time signature of 2 over 2 or 4 over 4, the notes will be on dyadic intervals, such as the ones we are dealing with in the proof of Carleson's Theorem. The signal $f$ can be written in this case as
$$f(x) = \sum_{j=1}^{h} a_j e_{\alpha_j}(x).$$

In real music a note has a quality of tone that is not captured in the pair $\alpha$. In fact the ear senses a note as opposed to noise if the sound signal $f$ is periodic. During a fraction of a second (say a fraction greater than 1/64 of a second) the graph of $f$ is periodic with a periodicity between 27 (the note $A_{-2}$) and 4176 (the note $C_7$) cycles per second. But the function is not a sinusoidal function. It has a complicated structure of Fourier coefficients, and we are considering only its period.

In the text of a musical composition the articulations, accents, and nuances of the tone strength are designated at best in a very faulty way and often not at all, so that the musical interpretation has an important role. This implies more or less that it is not easy to deduce our coefficients $a_j$, and other harmonic notes, from the score. This makes our previous expression for $f$ very poor.

For our purpose what is important is: the allowed pairs that we must select to obtain a proof of Carleson's Theorem are connected with the notes that compose the function $f$. We must obtain a device that, given the sound $f$, gives us the notes that compose the music $f$. This is what we will call the Carleson analysis of the function $f$.


7.3 The notes of f

The allowed pairs must be determined so that they contain the pairs for which $|f|_\alpha$ is especially large. To this purpose we carry out the Carleson analysis of the function $f$. We fix a level $j$ and retain the pairs $\alpha = (n, I)$ for which the corresponding Fourier coefficient of $f$ is greater than $b_j y^{p/2}$. Next we develop the rest of the function on the two sons of $I$. We retain also those terms with coefficients greater than $b_j y^{p/2}$, and so on. We will call these pairs the notes of $f$ at level $j$. The set of allowed pairs will be a modification of these.

In what follows we need to consider only pairs $(n, J)$ where $J$ is a dyadic interval. We define $\mathcal{D}_I$ as this set:
$$\mathcal{D}_I = \{\alpha \in \mathcal{P} : I(\alpha) \text{ a dyadic interval with respect to } I\}.$$
If $\alpha = (n, J)$ and $J$ is a dyadic interval with respect to $I$, then there is a unique $u \in \{0, 1\}^*$ such that $J = I_u$. We put $u = u(\alpha)$. In general, if $u \in \{0, 1\}^*$ we call $u'$ its father. For instance $(0100110)' = 010011$. Hence if $\alpha = (n, J)$ and $J$ is a dyadic interval, $I_{u(\alpha)'}$ is the father of $J$.

The structure of $f$ will determine sets of pairs $Q^j_u$ that we call notes of $f$ at level $j$. These pairs contain the information about $f$ that we shall need. We are going to define, by induction, for every level $j \in \mathbb{N}$ and every $u \in \{0, 1\}^*$, a set $Q^j_u$ and a function $P^j_u(x)$. In the musical interpretation this function represents the sound of the notes of level $j$ that last for all of the interval $I_u$.

For technical reasons we must take $Q^j_\emptyset = Q^j_0 = Q^j_1 = \emptyset$ and the corresponding functions $P^j_\emptyset = P^j_0 = P^j_1 = 0$. In the first step of the induction we define, for every $u$ of length 2, that is $u = 00$, or $u = 01$, or $u = 10$, or $u = 11$:
$$Q^j_u = \{\alpha \in \mathcal{D}_I : I(\alpha) = I_u,\ |\langle f, e_\alpha \rangle| \ge b_j y^{p/2} |I_u|\}. \tag{7.1}$$
That is to say, we retain the pairs $\alpha$ that give a Fourier coefficient greater than or equal to $b_j y^{p/2}$, where $b_j = 2 \cdot 2^{-2^j}$, and define
$$P^j_u(x) = \sum_{\alpha \in Q^j_u} \frac{\langle f, e_\alpha \rangle}{|I_u|}\, e_\alpha(x); \tag{7.2}$$
and, assuming we have defined $Q^j_u$ and $P^j_u$, and that $v = u0$ or $v = u1$, we define
$$Q^j_v = \{\alpha \in \mathcal{D}_I : I(\alpha) = I_v,\ |\langle f - P^j_u, e_\alpha \rangle| \ge b_j y^{p/2} |I_v|\}, \tag{7.3}$$
$$P^j_v(x) = P^j_u(x) + \sum_{\alpha \in Q^j_v} \frac{\langle f - P^j_u, e_\alpha \rangle}{|I_v|}\, e_\alpha(x).$$
We define the set of notes of level $j$ as the union
$$Q^j = \bigcup_{u \in \{0, 1\}^*} Q^j_u. \tag{7.4}$$

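With the above definition it is easy to see that
$$P^j_u(x) = \sum_{\substack{\alpha \in Q^j \\ I(\alpha) \supset I_u}} \frac{\langle f - P^j_{u(\alpha)'}, e_\alpha \rangle}{|\alpha|}\, e_\alpha(x) = \sum_{\substack{\alpha \in Q^j \\ I(\alpha) \supset I_u}} a(\alpha) e_\alpha(x). \tag{7.5}$$
Observe that these coefficients $a(\alpha)$ also depend on $j$. By definition $P^j_v - P^j_u$ is orthogonal to $f - P^j_v$ on $I_v$, when $v = u0$ (or $v = u1$). Therefore
$$\int_{I_v} |f(x) - P^j_v(x)|^2\,dx + \sum_{\alpha \in Q^j_v} |a(\alpha)|^2 |\alpha| = \int_{I_v} |f(x) - P^j_u(x)|^2\,dx.$$
Summing over all the dyadic intervals of length $|v| \le n$,
$$\sum_{|u| = n} \int_{I_u} |f(x) - P^j_u(x)|^2\,dx + \sum_{\substack{\alpha \in Q^j \\ |u(\alpha)| \le n}} |a(\alpha)|^2 |\alpha| = \int_I |f(x)|^2\,dx.$$
Therefore
$$\sum_{\alpha \in Q^j} |a(\alpha)|^2 |\alpha| \le \int_I |f(x)|^2\,dx. \tag{7.6}$$
The definition of $a(\alpha)$ implies that for every $\alpha \in Q^j$, $|a(\alpha)| \ge b_j y^{p/2}$. Hence the length of all the pairs in $Q^j$ is bounded by
$$\sum_{\alpha \in Q^j} |\alpha| \le b_j^{-2}\,\frac{\|f\|_2^2}{y^p}. \tag{7.7}$$
This is the bound of the length of the notes of level $j$.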
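The inductive definition of the notes can be imitated on sampled data. The following is a discrete sketch, not the construction of the text itself: it assumes $I = [0, 1)$ sampled at a power-of-two number of points, replaces $\langle g, e_\alpha \rangle / |I_v|$ by a normalized discrete Fourier coefficient, lets the number of cycles $n$ range only over $0, \dots, L - 1$ on a block of $L$ samples, and uses a single parameter `threshold` in the role of $b_j y^{p/2}$. All names are illustrative.

```python
import cmath

def dft(block):
    """Normalized coefficients c[n] = (1/L) * sum_k block[k] * exp(-2*pi*i*n*k/L)."""
    L = len(block)
    return [sum(block[k] * cmath.exp(-2j * cmath.pi * n * k / L) for k in range(L)) / L
            for n in range(L)]

def notes_of(samples, threshold, min_len=4):
    """Discrete sketch of (7.1)-(7.4) for I = [0, 1) sampled at len(samples) points.

    Returns the retained notes as tuples (start, length, n, a): the sample block
    playing the role of I(alpha), the number of cycles n, and the coefficient a."""
    residual = list(samples)      # holds f - P_u on the block being visited
    notes = []

    def visit(start, length, depth):
        if length < min_len:
            return
        if depth >= 2:            # Q_u is empty for words u of length < 2
            coeffs = dft(residual[start:start + length])
            for n, a in enumerate(coeffs):
                if abs(a) >= threshold:
                    notes.append((start, length, n, a))
                    for k in range(length):   # subtract a * e_alpha from the residual
                        residual[start + k] -= a * cmath.exp(2j * cmath.pi * n * k / length)
        half = length // 2
        visit(start, half, depth + 1)
        visit(start + half, half, depth + 1)

    visit(0, len(samples), 0)
    return notes

# A signal that is silent except for one pure note with 5 cycles on [1/4, 1/2).
size = 64
signal = [cmath.exp(2j * cmath.pi * 5 * k / 16) if 16 <= k < 32 else 0j for k in range(size)]
notes = notes_of(signal, threshold=0.5)
energy = sum(abs(v) ** 2 for v in signal) / size                  # integral of |f|^2 over I
note_energy = sum(abs(a) ** 2 * length / size for _, length, _, a in notes)
```

In this discrete setting the retained terms on a block are orthogonal to the new residual, which is the mechanism behind (7.6): `note_energy` cannot exceed `energy`. For the test signal the analysis recovers the single note on the block of samples 16 to 31 with 5 cycles.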


7.4 The set X

Here we define a component of the exceptional set. The objective is that when $I_u \not\subset X$ we have a good bound for $P^j_u(x)$, and also a bound for the number of terms in the sum that defines this function. Equation (7.6) invites us to define the function
$$A_j(x) = \sum_{\alpha \in Q^j} |a(\alpha)|^2 \chi_{I(\alpha)}(x).$$
In some way this function represents the intensity of the sound at the point $x$. By (7.6), we have
$$\int_I A_j(x)\,dx \le \|f\|_2^2.$$
For every $j$ we define the set
$$X_j = \{x \in I : A_j(x) > y^p / b_j\}.$$
By definition, $X_j$ is a union of a set of dyadic intervals. Chebyshev's inequality gives us the bound
$$m(X_j) \le b_j\,\frac{\|f\|_2^2}{y^p}.$$
Now we put
$$X = \bigcup_j X_j.$$
Therefore
$$m(X) \le C\,\frac{\|f\|_2^2}{y^p}.$$

Proposition 7.1 Let $I_u$ be a dyadic interval such that $I_u \not\subset X$. Then $P^j_u$ has at most $b_j^{-3}$ terms and
$$|P^j_u(x)| \le \sum_{I(\alpha) \supset I_u,\ \alpha \in Q^j} |a(\alpha)| \le \frac{y^{p/2}}{b_j^2}. \tag{7.8}$$

Proof. Notice that the condition $I_u \not\subset X$ implies that $I_u \not\subset X_j$. Hence there exists a point $x_0 \in I_u$ such that $x_0 \notin X_j$. So
$$A_j(x_0) = \sum_{\alpha \in Q^j} |a(\alpha)|^2 \chi_{I(\alpha)}(x_0) \le y^p / b_j.$$
For every term $a(\alpha) e_\alpha(x)$ of $P^j_u$, by (7.5), $x_0 \in I(\alpha)$. Denote by $N$ the number of terms in $P^j_u$, and notice that $|a(\alpha)| \ge b_j y^{p/2}$. From the previous inequality we deduce
$$N b_j^2 y^p \le y^p / b_j.$$
This is the bound on the number of terms of $P^j_u$. Now
$$|P^j_u(x)| \le \sum_{\substack{\alpha \in Q^j \\ I(\alpha) \supset I_u}} |a(\alpha)|\,|e_\alpha(x)| \le \sqrt{N}\,\Bigl(\sum |a(\alpha)|^2\Bigr)^{1/2} \le b_j^{-3/2} A_j(x_0)^{1/2} \le b_j^{-3/2}\, y^{p/2}\, b_j^{-1/2} = b_j^{-2} y^{p/2}.$$

The set $X$ is a union of dyadic intervals. We define $X^*$ following the seven trick, so that
$$m(X^*) \le C\,\frac{\|f\|_2^2}{y^p}. \tag{7.9}$$
Therefore if $J \not\subset X^*$ then $K \not\subset X$ for every grandson $K$ of $J$.

7.5 The set S

Besides $X^*$, the set $S^*$ will be a component of the exceptional set $E_N$. The definition is almost the same as in the chapter on the growth of the partial sums of Fourier series, but here we have another element, the number $p$. The definition of $S$ is given so that

every pair $\alpha$ with $I(\alpha) \not\subset S^*$ satisfies $|f|_\alpha < y$.

Hence, at every point in which we are interested, $C_\alpha f(x)$ will have a well defined level, or $|f|_\alpha = 0$, in which case $C_\alpha f(x) = 0$ and we have achieved our objective: to bound $C_\alpha f(x)$.

We define $S$ as the union of all the intervals $J$, where $J$ is dyadic with respect to $I$, and
$$\frac{1}{|J|}\int_J |f|^p\,dm \ge y^p. \tag{7.10}$$
Hence for every dyadic interval that satisfies (7.10) we put in $S$ the interval $J$. Now if $\alpha = (n, J) \in \mathcal{P}_I$ and $J \not\subset S^*$, we will have for every grandson $J_u$ of $J$
$$\frac{1}{|J_u|}\int_{J_u} |f|\,dm \le \Bigl(\frac{1}{|J_u|}\int_{J_u} |f|^p\,dm\Bigr)^{1/p} < y.$$
Hence $\|f\|_{\alpha/J_u} < y$, and
$$|f|_\alpha < y. \tag{7.11}$$
Now if $x \notin S^*$, $\alpha = (n, J) \in \mathcal{P}_I$ and $x \in I(\alpha)/2$, we will also have $I(\alpha) \not\subset S^*$, and $|f|_\alpha < y$.

As a consequence of (7.11), if $I(\alpha) \not\subset S^*$, $\alpha \in \mathcal{P}_I$, the Carleson integral $C_\alpha f(x)$ has level $j \in \mathbb{N}$, that is,
$$y b_j \le |f|_\alpha < y b_{j-1}, \tag{7.12}$$
or $|f|_\alpha = 0$, which implies $f = 0$ on $I(\alpha)$.

The set $S$ can be written as the union $\bigcup_k J_k$ where every $J_k$ is a maximal dyadic interval satisfying (7.10). The intervals $J_k$ are 'disjoint', hence we have
$$m(S) \le \sum_k m(J_k),$$
where every $J_k$ satisfies (7.10). Therefore
$$m(S) \le \sum_k |J_k| \le \frac{1}{y^p}\sum_k \int_{J_k} |f|^p\,dm \le \frac{1}{y^p}\,\|f\|_p^p.$$
This is part of the bound of the exceptional set
$$m(S^*) \le \frac{7}{y^p}\,\|f\|_p^p. \tag{7.13}$$
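The selection of the maximal dyadic intervals $J_k$ is a stopping-time procedure that is easy to imitate on sampled data. A minimal sketch, assuming $I = [0, 1)$ sampled at a power-of-two number of points, with averages over sample blocks in place of $\frac{1}{|J|}\int_J |f|^p\,dm$; the function name and the sample data are illustrative only.

```python
def maximal_bad_intervals(samples, y, p):
    """Maximal dyadic blocks J of the samples with mean(|f|^p over J) >= y^p.

    Blocks are returned as (start, length) pairs in sample indices."""
    power = [abs(v) ** p for v in samples]
    found = []

    def visit(start, length):
        mean = sum(power[start:start + length]) / length
        if mean >= y ** p:
            found.append((start, length))   # maximal: do not descend further
        elif length > 1:
            half = length // 2
            visit(start, half)
            visit(start + half, half)

    visit(0, len(samples))
    return found

# Sixteen samples, zero except at three points.
samples = [0.0] * 16
samples[5] = samples[12] = samples[13] = 3.0
bad = maximal_bad_intervals(samples, y=2.0, p=2)
measure = sum(length for _, length in bad) / len(samples)          # m(S)
bound = sum(abs(v) ** 2 for v in samples) / len(samples) / 2.0 ** 2  # ||f||_p^p / y^p
```

Because the search stops at the first block that satisfies the condition, the blocks found are disjoint and maximal, and their total measure is controlled by $\|f\|_p^p / y^p$ exactly as in the estimate for $m(S)$.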

8. Allowed Pairs

8.1 The length of the notes

We need a set of pairs $\mathcal{S}$, the allowed pairs, so that in order to bound a given Carleson integral $C_\alpha f(x)$ we apply repeatedly the basic decomposition, but always to integrals $C_\beta f(x)$ with $\beta$ being an allowed pair. In each application we may change the integral in question $C_\alpha f(x)$ to another $C_\beta f(x)$ if the difference $|C_\alpha f(x) - C_\beta f(x)|$ can be conveniently bounded.

The advantage of using the allowed pairs is that for every pair in which we use the basic decomposition we must exclude a certain set, where we cannot bound the second and third terms of the decomposition. These exceptional sets are a proportion of the sets $I(\beta)$. Hence the set $\mathcal{S}$ is subject to the restriction
$$\sum_{\beta \in \mathcal{S}} |\beta| < +\infty. \tag{8.1}$$
We have seen that a possible candidate for $\mathcal{S}$ is the set of notes (on every level) of $f$. This crude selection must be refined, and this is the subject of this chapter. The general idea is that we can extend the set of notes, always restricted by (8.1), in order to achieve that every Carleson integral $C_\alpha f(x)$ can be approximated by a Carleson integral $C_\beta f(x)$ with $\beta \in \mathcal{S}$.

Hence, now a basic condition is to verify that the set of notes has its total length bounded. Assuming that $f \in L^2(I)$ we have obtained this bound in (7.7). The condition $f \in L^2(I)$ is used here to apply Parseval's Inequality. This is the principal obstacle to the $L^p$ result. Carleson had the feeling that this can be removed, but finally it was Hunt who obtained this extension.

We will assume now that $f \in L^2(I) \cap L^p(I)$; at a certain point we will also assume that $|f|$ is a characteristic function. Later an interpolation argument will give the general result.

J.A. de Reyna: LNM 1785, pp. 93–102, 2002. © Springer-Verlag Berlin Heidelberg 2002


8.2 Well situated notes

We are going to define the set of well situated pairs. These pairs are an enlargement of the set $Q = \bigcup_{j \ge 1} Q^j$ of notes of $f$. We add pairs that in a certain sense are near the notes of $f$. This will define the set $R$ of well situated notes or pairs. This set is defined as the set of pairs that satisfy one of two conditions.

Our objective is that given a Carleson integral $C_\alpha f(x)$ there exists an allowed $\beta$ near $\alpha$. If $P^j_{u(\alpha)}$ consists of a single note $\gamma$ we have a good candidate $\beta = \gamma/\alpha$ for this $\beta$. A good candidate must also be a note with a comparable pitch. So we enlarge the set of notes of $f$, adding all the notes for which there are two or more simultaneous notes of $f$ of comparable pitch, because if there are two or more simultaneous notes we will not have a unique candidate for $\beta$. This will be no problem if the lengths of all the added notes are controlled.

Therefore the aim of the first stage of the definition is that from $\alpha \notin R^j$ it follows that the function $P^j_{u(\alpha)}(x)$, on the interval $I(\alpha)$, essentially consists of a single note or a rest. When we speak of essentially we are referring to the notes with a pitch comparable to that of the note $\alpha$.

How can we achieve this? If there are two notes $\delta$ and $\gamma$ in $Q^j$ comparable to $\alpha$, we add to $R^j$ all the comparable notes on the intervals contained in $I(\delta) \cap I(\gamma)$. In this way we achieve that such an $\alpha$, for which there exist $\delta$ and $\gamma$, will be in $R^j$.

The pair $\alpha \in \mathcal{D}_I$ will be an element of $R^j$ if it satisfies one of two conditions $A(R^j)$ or $B(R^j)$. For the benefit of those who want to read the original paper of Carleson we retain in these notations the reference to the letters a and b.

First we give only condition $B(R^j)$ and prove that it achieves our purpose: to obtain precise information about the function $P^j_{u(\alpha)}$ when $\alpha \notin R^j$.

Condition $B(R^j)$ Let $\alpha \in \mathcal{D}_I$. If there is $\beta \in Q^j$ such that $I(\alpha) \subset I(\beta)$ then, by definition, $\alpha \in R^j$ if there are two different elements $\gamma$ and $\delta$ in $Q^j$ with $I(\gamma) \cap I(\delta) \supset I(\beta)$ and such that
$$b_j^{10} \le |\lambda(\gamma) - \lambda(\delta)| \cdot |\alpha| \le 32 \cdot b_j^{-10} \qquad \text{and} \qquad |n(\alpha) - n(\gamma/\alpha)| < b_j^{-10}.$$

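The role of $I(\beta)$ in this condition is to simplify the calculation of $\sum |\alpha|$ for those $\alpha$ that satisfy the condition. As we have said, this condition is designed so that the following lemma can be proved.

Lemma 8.1 (Structure of the functions $P^j_u$) Let $\alpha \in \mathcal{D}_I$ be such that $I(\alpha) \not\subset X$. Assume that $\alpha \notin R^j$ and put $u = u(\alpha)$. Then we can write $P = P^j_u$ as
$$P(t) = \rho e^{i\lambda(\delta)t} + P_0(t) + P_1(t), \qquad t \in I(\alpha), \tag{8.2}$$
where $P_1(t)$ consists of the terms of $P(t)$ for which $|n(\alpha) - n(\gamma/\alpha)| \ge b_j^{-10}$, and
$$|\rho| \le b_j^{-2} y^{p/2}, \quad \text{and} \quad |P_0(t)| \le b_j^{8} y^{p/2}, \quad \text{if } t \in I(\alpha). \tag{8.3}$$
Furthermore, if $\rho \ne 0$, then $\delta \in Q^j$ and satisfies $I(\delta) \supset I(\alpha)$.

Proof. By (7.5),
$$P(t) = \sum_{I(\gamma) \supset I(\alpha),\ \gamma \in Q^j} a(\gamma) e_\gamma(t) = {\sum_\gamma}' \cdots + {\sum_\gamma}'' \cdots,$$
where ${\sum_\gamma}'$ refers to those terms with $|n(\alpha) - n(\gamma/\alpha)| < b_j^{-10}$. Therefore by definition
$$P_1(t) = {\sum_\gamma}'' \cdots$$
satisfies the conditions of the theorem.

Now if the sum ${\sum_\gamma}'$ is empty, we put $\rho = 0$ and $P_0(t) = 0$ and the lemma is true.

If the sum is reduced to one term $a(\delta) e_\delta(t)$ we can put $\rho = a(\delta)$ and $P_0(t) = 0$. Since $I_u \not\subset X$, (7.8) gives us $|\rho| \le b_j^{-2} y^{p/2}$.

Finally, if there are two or more elements in the sum ${\sum_\gamma}'$, we must have for every two of them $\gamma$ and $\delta$ that
$$|\lambda(\gamma) - \lambda(\delta)| \cdot |\alpha| < b_j^{10}. \tag{8.4}$$
In the other case we can assume without loss of generality that $I(\gamma) \subset I(\delta)$ (they are dyadic intervals), and take $\beta = \gamma$. Then $\alpha$ satisfies the condition $B(R^j)$. In fact,
$$|\lambda(\gamma) - \lambda(\delta)| \cdot |\alpha| = 2\pi \Bigl| n(\gamma)\frac{|\alpha|}{|\gamma|} - n(\delta)\frac{|\alpha|}{|\delta|} \Bigr|.$$
By definition $n(\gamma/\alpha) = \lfloor n(\gamma)|\alpha|/|\gamma| \rfloor$, and by the construction of $\gamma$ and $\delta$ we have $|n(\alpha) - n(\gamma/\alpha)| < b_j^{-10}$ and $|n(\alpha) - n(\delta/\alpha)| < b_j^{-10}$. Hence
$$|\lambda(\gamma) - \lambda(\delta)| \cdot |\alpha| \le 2\pi\bigl(|n(\gamma/\alpha) - n(\delta/\alpha)| + 2\bigr) \le 2\pi\bigl(2 b_j^{-10} + 2\bigr) \le 32 \cdot b_j^{-10}.$$
As we know by hypothesis that $\alpha \notin R^j$, we have proved (8.4).

We choose $\delta$ to be one of the terms in the sum ${\sum_\gamma}'$. Let $t_0$ be the central point of $I(\alpha)$. We have, for every $t \in I(\alpha)$ (observe that for every $\gamma$ that appears in the sum below we have $t \in I(\gamma)$),
$${\sum_\gamma}' a(\gamma) e_\gamma(t) = \Bigl({\sum_\gamma}' a(\gamma) e^{i\lambda(\gamma)t_0}\Bigr) e^{i\lambda(\delta)(t - t_0)} + {\sum_\gamma}' a(\gamma) e^{i\lambda(\gamma)t_0}\bigl(e^{i\lambda(\gamma)(t - t_0)} - e^{i\lambda(\delta)(t - t_0)}\bigr) = \rho e^{i\lambda(\delta)t} + P_0(t).$$
Again (7.8) gives us that $|\rho| \le b_j^{-2} y^{p/2}$. On the other hand
$$|P_0(t)| \le {\sum_\gamma}' |a(\gamma)|\,\bigl|e^{i(\lambda(\gamma) - \lambda(\delta))(t - t_0)} - 1\bigr|.$$
Since $t \in I(\alpha)$ and $t_0$ is the central point of this interval, we have
$$\bigl|(\lambda(\gamma) - \lambda(\delta))(t - t_0)\bigr| \le |\lambda(\gamma) - \lambda(\delta)| \cdot |\alpha| \le b_j^{10}.$$
Hence, since $I(\alpha) \not\subset X$,
$$|P_0(t)| \le b_j^{10}\, b_j^{-2} y^{p/2} = b_j^{8} y^{p/2}.$$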

Remark. In the case that there exists $\delta \in Q^j$ with $I(\delta) \supset I(\alpha)$ and $|n(\alpha) - n(\delta/\alpha)| < b_j^{-10}$, we can choose $\lambda(\delta)$ as the exponent that appears in (8.2).

The next condition in the definition of $R^j$ is added in order to achieve, under certain hypotheses, that the function $P^j_u(x)$ coincides on every grandson $I_v$ of $I_u$ with $P^j_v(x)$.

Condition $A(R^j)$ If $\beta \in Q^j$ then $\alpha \in R^j$ for every $\alpha \in \mathcal{D}_I$ such that
$$I(\alpha) \subset I(\beta), \qquad |\alpha| > b_j^{10} |\beta|, \qquad \text{and} \qquad |n(\alpha) - n(\gamma/\alpha)| < b_j^{-10}, \tag{8.5}$$
where $\gamma \in Q^j$ is such that $I(\gamma) \supset I(\beta)$.

This condition needs also an exceptional set that allows the reasoning of Lemma (8.2). For every $\alpha \in Q^j$ let $Y_j(\alpha)$ be the union of the two intervals of length $8 b_j^3 |\alpha|$ with centers at the two extremes of $I(\alpha)$. Then we set
$$Y = \bigcup_{j=1}^{\infty} \bigcup_{\alpha \in Q^j} Y_j(\alpha). \tag{8.6}$$
Hence we have, by the bound (7.7) of the length of the notes of level $j$,
$$m(Y) \le \sum_{j=1}^{\infty} \sum_{\alpha \in Q^j} 16 b_j^3 |\alpha| \le 16 \sum_{j=1}^{\infty} b_j\,\frac{\|f\|_2^2}{y^p} \le \frac{C}{y^p}\,\|f\|_2^2. \tag{8.7}$$


Lemma 8.2 Let $\alpha \in \mathcal{P}_I$ be such that $\alpha/L \notin R^j$ for every grandchild $L$ of $I(\alpha)$, and $I(\alpha) \not\subset Y$. Let $P^j_u$ and $P^j_v$ be the functions associated to two grandchildren $J = I_u$ and $K = I_v$ of $I(\alpha)$. If there is a term $a(\gamma) e_\gamma(x)$ of $P^j_u$ such that $|n(\gamma/J) - n(\alpha/J)| < b_j^{-9}$, then $P^j_u = P^j_v$. Hence the four functions associated to the grandchildren of $\alpha$ are the same.

Proof. The hypotheses imply that $\gamma \in Q^j$ is such that $I(\gamma) \supset J$. Since $\alpha/J \notin R^j$, by condition $A(R^j)$ (taking $\beta = \gamma$) we conclude that $|\alpha/J| \le b_j^{10} |\gamma|$. It follows that the size of $I(\gamma)$ is very large compared with that of $I(\alpha)$.

[Fig. 1: the interval $I(\alpha)$ with its grandchildren $K$ and $J$, an end point $a$ of $I(\gamma)$, and the set $Y_j(\gamma)$.]

Now we shall show that $I(\gamma) \supset I(\alpha)$. To this purpose we must show that it contains every grandchild of $I(\alpha)$, say $K$. In fact, if $I(\gamma) \not\supset K$ there is one end point $a$ of $I(\gamma)$ contained in $I(\alpha)$. The set $Y_j(\gamma)$ contains an interval with center at $a$ and length $8 b_j^3 |\gamma| > 8 b_j^{10} |\gamma| \ge 2|\alpha|$. Hence it must be that $I(\alpha) \subset Y_j(\gamma) \subset Y$. This contradicts our hypothesis. Therefore $I(\gamma)$ contains every grandchild of $I(\alpha)$ and hence $I(\gamma) \supset I(\alpha)$.

From the above reasoning we can deduce not only that $I(\gamma) \supset I(\alpha)$ but also that the distance $d$ from $I(\alpha)$ to the end points of $I(\gamma)$ must satisfy $d + |\alpha| \ge 4 b_j^3 |\gamma|$.

[Fig. 2: the interval $I(\alpha)$ inside $I(\gamma)$, the end point $a$, the length $4 b_j^3 |\gamma|$, and the set $Y_j(\gamma)$.]

It follows that
$$d \ge 4 (b_j^3 - b_j^{10}) |\gamma| > b_j^7 |\gamma|.$$
In order to prove that the two functions
$$P^j_u(x) = \sum_{\delta \in Q^j,\ I(\delta) \supset J} a(\delta) e_\delta(x), \qquad \text{and} \qquad P^j_v(x) = \sum_{\delta \in Q^j,\ I(\delta) \supset K} a(\delta) e_\delta(x),$$
coincide, we must prove that every $\delta \in Q^j$ such that $I(\delta) \supset J$ satisfies also $I(\delta) \supset K$. (Observe that now we have proved $I(\gamma) \supset I(\alpha)$, and there is no special assumption about the interval $J$.)

Hence we assume that $\beta \in Q^j$, $I(\beta) \supset J$ and $I(\beta) \not\supset K$. It follows that one end point $b$ of $I(\beta)$ is contained in $I(\alpha)$. Since $Y_j(\beta) \not\supset I(\alpha)$ it follows that $8 b_j^3 |\beta| < 2|\alpha|$.

Now we can prove that $I(\gamma) \supset I(\beta)$. In fact, $I(\gamma)$ contains an interval of length $|\alpha| + 2d$ with the same center as $I(\alpha)$. Hence all we have to show is that $d > |\beta|$. But we have
$$d > b_j^7 |\gamma| > \tfrac14\, b_j^{-3} |\alpha| > |\beta|.$$
We can apply now condition $A(R^j)$, with our $\beta$ and $\gamma$, to prove that $\alpha/J \in R^j$. This contradicts our hypotheses. Hence it follows that $I(\beta) \supset K$, and the two functions $P^j_u$ and $P^j_v$ coincide.

Remark. There is nothing magical about the exponent 9. This will be applied in Proposition 9.3 and any number between 1 and 10 suffices.
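The numerical inequalities between powers of $b_j$ used in Lemmas 8.1 and 8.2 can be checked level by level. A small sketch with exact fractions, assuming $b_j = 2 \cdot 2^{-2^j}$; the function names are illustrative, and only the first five levels are tested to keep the floating-point conversion harmless.

```python
import math
from fractions import Fraction

def b(j):
    """b_j = 2 * 2**(-2**j) as an exact fraction."""
    return Fraction(2, 2 ** (2 ** j))

def check_level(j):
    """Check, for one level j, three numerical facts used in Lemmas 8.1 and 8.2."""
    bj = b(j)
    # Lemma 8.2: 4 (b_j^3 - b_j^10) > b_j^7
    gap = 4 * (bj ** 3 - bj ** 10) > bj ** 7
    # Lemma 8.1: 2 pi (2 b_j^-10 + 2) <= 32 b_j^-10, after multiplying through by b_j^10
    pitch = 2 * math.pi * (2 + 2 * float(bj ** 10)) <= 32
    # Lemma 8.1: b_j^10 * b_j^-2 = b_j^8
    power = bj ** 10 * bj ** -2 == bj ** 8
    return gap and pitch and power
```

All three facts only use $b_j \le 1/2$: the first reduces to $4(1 - b_j^7) > b_j^4$, and the second to $4\pi(1 + b_j^{10}) \le 32$.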

8.3 The length of well situated notes

We must bound the length of the well situated notes. In fact we bound only the well situated pairs $\alpha$ such that $I(\alpha) \not\subset X$. The importance of this condition is that, assuming it, we can apply Proposition 7.1. That is, we know that $P^j_{u(\alpha)}$ has at most $b_j^{-3}$ terms.

First observe that given $\beta \in Q^j$, the number of possible values of the number of cycles $n(\alpha)$ of a pair $\alpha$ that satisfies condition $A(R^j)$ with this $\beta$ is less than $2 b_j^{-10}$ times the number $k$ of $\gamma \in Q^j$ such that $I(\gamma) \supset I(\beta)$. Since $I(\alpha) \subset I(\beta)$ we have $I(\beta) \not\subset X$. Therefore $k \le b_j^{-3}$. Independently, the intervals $I(\alpha)$ must be dyadic subintervals of $I(\beta)$ of length between $b_j^{10} |\beta|$ and $|\beta|$. The length of all these intervals is
$$\sum_{I(\alpha) \subset I(\beta),\ |\alpha| \ge b_j^{10} |\beta|} |\alpha| \le (1 + \log_2 b_j^{-10})\,|\beta|.$$
Therefore, by the known estimates of the length of the notes of $f$ (7.7), with $j$ fixed,
$$\sum_{\alpha \in A(R^j),\ I(\alpha) \not\subset X} |\alpha| \le \sum_{\beta \in Q^j} C b_j^{-13} (\log b_j^{-1})\,|\beta| \le C b_j^{-15} (\log b_j^{-1})\,\frac{\|f\|_2^2}{y^p} \le C b_j^{-16}\,\frac{\|f\|_2^2}{y^p}.$$

In the same way, the number of pairs $\gamma$ and $\delta$ that, with a fixed $\beta$, satisfy condition $B(R^j)$ is $\le b_j^{-3} \cdot b_j^{-3}$. (As in the previous case we are interested only in the case $I(\alpha) \not\subset X$.) With these $\gamma$ and $\delta$ fixed, the number of cycles $n(\alpha)$ can be selected among $2 b_j^{-10}$ values, and the length $|\alpha|$ can attain only $\le C (\log b_j^{-1})$ values. With these observations we can conclude that
$$\sum_{\alpha \in B(R^j),\ I(\alpha) \not\subset X} |\alpha| \le C b_j^{-6}\, b_j^{-10} (\log b_j^{-1})\, b_j^{-2}\,\frac{\|f\|_2^2}{y^p} \le C b_j^{-19}\,\frac{\|f\|_2^2}{y^p}.$$
Therefore the length of the well situated notes of level $j$ with $I(\alpha) \not\subset X$ is bounded in the following way:
$$\sum_{\alpha \in R^j,\ I(\alpha) \not\subset X} |\alpha| \le C b_j^{-19}\,\frac{\|f\|_2^2}{y^p}. \tag{8.8}$$

8.4 Allowed pairs

Recall the basic step of the proof. We have a Carleson integral $C_\alpha f(x)$, $I(\alpha)\not\subset S^*$, and this gives us that the Carleson integral has a well defined level $j\in\mathbf N$ such that $yb_j\le |f|_\alpha < yb_{j-1}$. We apply the process by which we obtain a partition $\Pi_\alpha$, and to obtain a good bound for the terms of the decomposition we carefully select a smoothing interval $I(x)$. Now, in general, one of the halves of $I(x)$ is an interval of $\Pi_\alpha$. This implies that a grandson $J$ of $I(x)$ satisfies

$$\|f\|_{\alpha/J}\ge yb_{j-1}. \qquad (8.9)$$

Now we pass from $C_\alpha f(x)$ to $C_\beta f(x)$ where $\beta=\alpha/I(x)$. We can assume that $\alpha$ was an allowed pair, but the problem is that in general $\beta$ is not such a pair. The idea of the proof is to choose the set of allowed pairs $S^j$ so that we can choose $\gamma\in S^j$ with $|C_\beta f(x)-C_\gamma f(x)|$ conveniently bounded.

We have obtained a set $R^j$ such that $\delta\notin R^j$ implies that $P^j_{u(\delta)}$ sounds as a rest or a pure note $\xi$ if we consider only the sounds of comparable pitch to that of $\delta$. By (8.9), for some grandchild $J$ of $I(\beta)$, $\|f\|_{\beta/J}\ge yb_{j-1}$. This implies that certain Fourier coefficients of $f$ on $J$ with a pitch comparable to that of $\beta$ are of a certain size. The hope is that taking $\delta=\beta/J$ we can arrange things so that we can in fact prove that there exists (in $P^j_{u(\delta)}$) the note $\xi$ of this pitch. Then the candidate to $\gamma$ will be this note $\xi/I(x)$.

According to the previous considerations we must define $S^j$ so that $\beta\notin S^j$ implies $\beta/J\notin R^j$. This will not be sufficient because we cannot prove the implication

$$\|f\|_{\beta/J}\ge yb_{j-1} \implies \text{there exists one note in } P^{j}_{u(\beta/J)} \text{ of pitch near to that of } \beta.$$

Instead we shall prove that there exists a natural number $m$, depending only on $p$, such that

$$\|f\|_{\beta/J}\ge yb_{j-1} \implies \text{there exists one note in } P^{m+j}_{u(\beta/J)} \text{ of pitch near to that of } \beta.$$

This shift is of no consequence to the rest of the proof. In fact the natural number $m$ must be chosen large enough to attain also another objective. For the present we take $m$ depending only on $p$ and large enough. Then we can define $S^j$, the set of allowed pairs:

Definition 8.3 (Allowed pairs) A pair $\alpha$ is in $S^j$ if $I(\alpha)\not\subset X^*$ is a smoothing interval, and for some grandchild $J$ of $I(\alpha)$ we have $\alpha/J\in R^{m+j}$.

8.5 The exceptional set

Here we define the exceptional set. We assume that $1<p<+\infty$, a function $f$, $y>0$ and a natural number $N$ are given. Our purpose is to define a set $E_N\subset\mathbf R$, with $m(E_N) < A\|f\|_p^p/y^p$, such that for every $x\in I/2\setminus E_N$ and $\alpha=(n,I)\in P$ with $0\le n<2^N$, we have $|C_\alpha f(x)| < B_p\,y$. In fact we will assume that $|f|=\chi_A$ is the characteristic function of a measurable set. Therefore once we obtain a bound of $|C_{(n,I)}f(x)|$ for $0\le n<2^N$, we shall also have a bound for $0\le |n|<2^N$ by applying the same reasoning to $\bar f$.

The set $E_N$, and other auxiliary sets that we will define later, will depend heavily on $f$, $p$, $y$ and $N$. We will not mention the dependence on $f$, $p$, and $y$, but we shall mention the dependence on $p$ of every constant that appears. Recall that the shift $m$ depends on $p$ and only on $p$. The set $E_N$ will be a union $D\cup S^*\cup T_N\cup U_N\cup V\cup X^*\cup Y$ of various sets. We retain $N$, but the main point in the proof is to make the estimates independent of $N$. The role of $N$ is now to define the partition $\Pi_\alpha$ for every allowed pair $\alpha$.

We have defined the sets $X^*$, $S^*$, and $Y$, which do not depend on $N$, and have seen the inequalities (7.9), (7.13) and (8.7)

$$m(S^*)\le \frac{7\|f\|_p^p}{y^p},\qquad m(X^*)\le \frac{C\|f\|_2^2}{y^p},\qquad m(Y)\le \frac{C\|f\|_2^2}{y^p}.$$

Now to define the sets $T_N$ and $U_N$, we must consider the allowed pairs $\alpha$ and the associated partitions $\Pi_\alpha$. Recall from section [4.13] the needed ingredients to define $\Pi_\alpha$:

• The function $f\in L^1(I)$,
• The natural number $N$,
• A positive real number $y$,
• The pair $\alpha$, where $I(\alpha)$ is a smoothing interval of length $|\alpha|\ge 4|I|/2^N$,
• A natural number $j\ge 1$ such that $|f|_\alpha < yb_{j-1}$.

We have been given $f$, $N$ and $y$, so we consider, for every $j\in\mathbf N$, the set of allowed pairs $\alpha\in S^j$ such that $|\alpha|\ge 4|I|/2^N$ and $|f|_\alpha < yb_{j-1}$. For every such pair we obtain the associated partition $\Pi_\alpha$. With this partition we can define the functions $\Delta_\alpha(x)=\Delta(\Pi_\alpha,x)$ and $H^*_\alpha f(x)=H^*_{I(\alpha)}E_\alpha f(x)$ and the two sets

$$U_{N,j}(\alpha)=\{x\in I(\alpha) : \Delta_\alpha(x) > C_1 2^m b_{j-1}^{-1/2}\},\qquad T_{N,j}(\alpha)=\{x\in I(\alpha) : H^*_\alpha f(x) > C_2 2^m y\, b_{j-1}^{1/2}\}, \qquad (8.10)$$

where $m$ is the shift and $C_1$ and $C_2$ are constants that we are going to fix. By the maximal inequalities proved for $\Delta_\alpha(x)$ and $H^*_\alpha f(x)$ (see (5.1) and (5.4)) we obtain

$$m(U_{N,j}(\alpha))\le A|\alpha|\exp\bigl(-BC_1 2^m b_{j-1}^{-1/2}\bigr),\qquad m(T_{N,j}(\alpha))\le A|\alpha|\exp\bigl(-BC_2 2^m y\, b_{j-1}^{1/2}\,\|E_\alpha f\|_\infty^{-1}\bigr).$$

By (4.18) we have the bound $\|E_\alpha f\|_\infty\le C\sup_k\|f\|_{\beta_k}\le Cyb_{j-1}$. Hence with a proper choice of the constant $C_2$ in the definition of $T_{N,j}(\alpha)$ we have

$$m(T_{N,j}(\alpha))\le A|\alpha|\exp\bigl(-BC_1 2^m b_{j-1}^{-1/2}\bigr).$$

We put

$$U_N=\bigcup_{\alpha,j}U_{N,j}(\alpha),\qquad T_N=\bigcup_{\alpha,j}T_{N,j}(\alpha),$$

where the union is over all allowed pairs, with $\alpha\in S^j$ and $|f|_\alpha < yb_{j-1}$. Collecting our results, and selecting $C_1$ adequately, we get

$$m(U_N\cup T_N)\le 2\sum_{j=1}^{\infty} A\,2^{-2^{m+8}b_{j-1}^{-1/2}}\sum_{\alpha\in S^j}|\alpha|.$$


For every $\beta\in R^{m+j}$ there are eight pairs $\alpha$ such that $\alpha/\beta=\beta$. In fact there are two smoothing intervals $I(\alpha)$ such that $I(\beta)$ is a grandchild of $I(\alpha)$, and there are four possible values of $n(\alpha)$. Thus, taking into account the bound of the length of well situated notes (8.8), we have

$$m(U_N\cup T_N)\le 32\sum_{j=1}^{\infty} A\,2^{-2^{m+8}b_{j-1}^{-1/2}}\sum_{\beta\in R^{m+j},\ I(\beta)\not\subset X}|\beta| \le \frac{C\|f\|_2^2}{y^p}\sum_{j=1}^{\infty} b_{m+j}^{-19}\,2^{-2^{m+8}b_{j-1}^{-1/2}}.$$

It is easy to see that, no matter what the natural number $m$ is, there is an absolute constant $C$ such that

$$m(U_N\cup T_N)\le \frac{C\|f\|_2^2}{y^p}. \qquad (8.11)$$
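The claim that the last series is bounded independently of $m$ can be checked numerically. The following Python sketch is only an illustration: it assumes $b_j = 2^{1-2^j}$ (so $b_0 = 1$), the value used in Chapter 9, and the helper names `log2_term` and `series` are hypothetical. It works with base-2 logarithms of the terms $b_{m+j}^{-19}\,2^{-2^{m+8}b_{j-1}^{-1/2}}$ to avoid overflow.

```python
def log2_term(m: int, j: int) -> float:
    """log2 of b_{m+j}^{-19} * 2^(-2^(m+8) * b_{j-1}^(-1/2)), assuming b_j = 2^(1 - 2^j)."""
    log2_b_inv_19 = 19 * (2 ** (m + j) - 1)            # log2 of b_{m+j}^{-19}
    b_prev_inv_sqrt = 2.0 ** ((2 ** (j - 1) - 1) / 2)  # b_{j-1}^{-1/2}
    return log2_b_inv_19 - 2 ** (m + 8) * b_prev_inv_sqrt


def series(m: int, j_max: int = 8) -> float:
    """Partial sum of the series over 1 <= j <= j_max (tiny terms underflow to 0.0)."""
    return sum(2.0 ** log2_term(m, j) for j in range(1, j_max + 1))


if __name__ == "__main__":
    for m in range(1, 6):
        worst = max(log2_term(m, j) for j in range(1, 9))
        print(f"m={m}: largest log2(term)={worst}, partial sum={series(m)}")
```

Under this assumption every exponent is strongly negative, and it becomes more negative as $m$ grows, which is consistent with the series being bounded by an absolute constant.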

Another component of the exceptional set is the set $D$ of dyadic points with respect to the interval $I$. Finally we define the set $V$, the last component of $E_N$. $V$ is defined as the set

$$V=\{x : H_I^*f(x)\ge By2^m\}. \qquad (8.12)$$

Assume that $f\in L^p(\mathbf R)$. By (5.3), $H_I^*f(x)\le 2H^*f(x)+6Mf(x)$. The theorems about the maximal Hilbert transform and the Hardy-Littlewood maximal function imply that

$$\|H_I^*f\|_p\le C\,\frac{p^2}{p-1}\,\|f\|_p.$$

Therefore

$$m(V)\le\left(\frac{Cp^2}{B2^m(p-1)}\right)^{p}\frac{\|f\|_p^p}{y^p}.$$

We will see in the following chapter that the selection of $m$ is such that

$$A\,\frac{p^2}{p-1}\le 2^m\le A'\,\frac{p^2}{p-1}.$$

Therefore, choosing the constant $B$ in (8.12) conveniently, we get

$$m(V)\le\frac{\|f\|_p^p}{y^p}. \qquad (8.13)$$

9. Pair Interchange Theorems

9.1 Introduction

This chapter contains the most difficult part of the proof of Carleson's Theorem, that is, how to pass from a Carleson integral $C_\alpha f(x)$ to another where we can apply the basic step. This is accomplished mainly by changing the frequency to obtain $C_{\gamma/\alpha}f(x)$, where $\gamma$ is an allowed pair with a lesser level than that of the initial Carleson integral. First we will obtain a nearby allowed pair $\beta$ that controls the change of frequency. This nearby allowed pair will be obtained mainly by hearing the sound of $f$ near $\alpha$. We will have as our ears Proposition 9.2, which gives a bound of $\|f\|_\alpha$ if $f$ does not sound near $\alpha$. The structure theorem of the $P_u^j$ will assure us that we will hear a musical sound. We start the chapter by choosing the shift $m$. This will allow us to change $y^{p/2}$ for $y$, but the principal role of the shift $m$ is to classify the quantities that we encounter at two levels of magnitude.

9.2 Choosing the shift m

The measure of the exceptional set $E_N$ must be bounded by $A\|f\|_p^p/y^p$. Of this type are the bound (7.13) of $m(S^*)$ and the bound (8.13) of $m(V)$. But the bounds we have obtained for $m(X^*)$, $m(U_N\cup T_N)$ and $m(Y)$ in (7.9), (8.11), and (8.7) are of type $A\|f\|_2^2/y^p$. This problem is generated by the use of the Bessel inequality in (7.6). To overcome this difficulty we observe that if $f$ is a measurable function such that $|f|=\chi_A$ is a characteristic function, then $\|f\|_2^2=\|f\|_p^p$. We shall call such a function a special function. Later we will have to deal with more general functions.

There is also another reason for which it is convenient to consider the case of these special functions. In fact this will permit us to define the shift $m$, depending only on $p$, connecting the level of a Carleson integral and $y$. The shift $m$ is a natural number, depending only on $p$. As we have said in the introduction to the definition of the allowed pairs $S^j$, $m$ has a role in the process of selecting a note of $f$ near a not allowed pair. In the following proposition we give $m$ another role in the proof of the $L^p$ result.

J.A. de Reyna: LNM 1785, pp. 103–115, 2002. c Springer-Verlag Berlin Heidelberg 2002


Proposition 9.1 (Selection of the shift) Let $1<p<+\infty$. Then there exists a natural number $m=m(p)$ such that if $\alpha\in P_I$, $f$ is a special function, and $j\in\mathbf N$ are such that $b_jy\le|f|_\alpha$ and $I(\alpha)\not\subset S^*$, then

$$y^{p/2}\le b_{m+j}^{-1/4}\,y \quad\text{and}\quad b_{m+j}<y. \qquad (9.1)$$

Proof. There exists a grandchild $J$ of $I(\alpha)$ such that $\|f\|_{\alpha/J}=|f|_\alpha\ge b_jy$. Since $J\not\subset S$ we have

$$\frac{1}{|J|}\int_J|f|^p\,dm<y^p.$$

Also from the definition of $\|f\|_{\alpha/J}$ and the fact that $|f|$ is a characteristic function it follows that

$$\|f\|_{\alpha/J}\le\frac{1}{|J|}\int_J|f|\,dm=\frac{1}{|J|}\int_J|f|^p\,dm.$$

Therefore $b_jy<y^p$.

Hence we show that $y^{p/2}\le b_{m+j}^{-1/4}\,y$ for every $m\ge m(p)$, if $b_jy<y^p$. First, we consider the case $1<p<2$. We choose $m$ such that

$$\frac{2^m-1}{4}>\frac{1-p/2}{p-1}.$$

Then we have

$$\frac{2^m-2^{-j}}{4}>(1-2^{-j})\,\frac{1-p/2}{p-1}.$$

We can write that as

$$\frac{1-2^{m+j}}{4}<(1-2^{j})\,\frac{1-p/2}{p-1}.$$

Since $b_j=2^{1-2^j}$, we deduce that

$$b_{m+j}^{1/4}<b_j^{(1-p/2)/(p-1)}<y^{1-p/2}.$$

This is the first inequality in (9.1). For the second one we consider first the case $1<p<2$ … $b_{m+j}^{1/4}$.

Now, if $p\ge 2$: for $y\le 1$ it suffices to take $m\ge 1$, and we have

$$y^{p/2-1}\le 1\le b_{m+j}^{-1/4},\qquad b_{m+j}<b_j<y^{p-1}\le y.$$

If $y>1$, then $b_jy\le 1$. In fact, since $|f|$ is a characteristic function, $b_jy\le|f|_\alpha\le 1$. We choose $m$ such that

$$\frac p2-1<\frac{2^m-1}{4}.$$

From which we deduce that $(1-2^{-j})(p/2-1)<(2^m-2^{-j})/4$. Therefore $(2^j-1)(p/2-1)<(2^{m+j}-1)/4$. Hence

$$y^{p/2-1}\le b_j^{1-p/2}<b_{m+j}^{-1/4}.$$
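For the case $p\ge 2$, the arithmetic behind the choice of the shift can be checked mechanically. The sketch below is an illustration only and assumes $b_j = 2^{1-2^j}$; the function names are hypothetical. It finds the smallest $m\ge 1$ with $p/2-1<(2^m-1)/4$ and then verifies the consequence $(2^j-1)(p/2-1)<(2^{m+j}-1)/4$, which is the inequality $b_j^{1-p/2}<b_{m+j}^{-1/4}$ written in base-2 logarithms.

```python
from fractions import Fraction


def smallest_shift(p: Fraction) -> int:
    """Smallest natural number m >= 1 with p/2 - 1 < (2^m - 1)/4."""
    m = 1
    while not (p / 2 - 1 < Fraction(2 ** m - 1, 4)):
        m += 1
    return m


def exponent_inequality(p: Fraction, m: int, j: int) -> bool:
    """Check (2^j - 1)(p/2 - 1) < (2^(m+j) - 1)/4 using exact rational arithmetic."""
    return (2 ** j - 1) * (p / 2 - 1) < Fraction(2 ** (m + j) - 1, 4)


if __name__ == "__main__":
    for p in (Fraction(2), Fraction(3), Fraction(10), Fraction(100)):
        m = smallest_shift(p)
        ok = all(exponent_inequality(p, m, j) for j in range(1, 21))
        print(f"p={p}: m={m}, inequality holds for j=1..20: {ok}")
```

Exact fractions are used so that the strict inequalities are not affected by rounding.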

We shall impose another condition on $m$, namely that $m\ge m_0$ where $m_0$ is an absolute constant. This is needed in the proof of Proposition 9.3 below. Roughly speaking, this proposition says that near every Carleson integral $C_\alpha f(x)$ of level $j$ there is another $C_\beta f(x)$, with a pair $\beta\in S^j$. Taking all these restrictions on $m$ into account we can say that there exists a constant $0<A<+\infty$ such that the shift $m$ can be any natural number satisfying

$$2^m\ge A\,\frac{p^2}{p-1}.$$

So we also have

$$A'\,\frac{p^2}{p-1}\ge 2^m,$$

for some absolute constant $A'$.

9.3 A bound for $\|f\|_\alpha$

The following proposition is essential to obtain, from a lower bound for a local norm, $\|f\|_\alpha>r>0$, some note of $f$ at the scale of $\alpha$.

Proposition 9.2 Let $\alpha=(n,J)\in P$, $f\in L^2(J)$ and let

$$f(t)=\sum_{\beta=(k,J)}a_k e_\beta(t)$$

be its local Fourier expansion. Assume that $|a_k|\le N$ for every $k$ with $|k+n|<M$, where $1<M<+\infty$ and $0<N<+\infty$. Then

$$\|f\|_\alpha\le B\left(N\log M+\frac{\|f\|_{L^2(J)}}{\sqrt M}\right), \qquad (9.2)$$

where $B$ is an absolute constant.

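Proof. By definition

$$\|f\|_\alpha=\sum_{j\in\mathbf Z}\frac{c}{1+j^2}\left|\frac{1}{|J|}\int_J f(t)\exp\Bigl(-2\pi i\Bigl(n(\alpha)+\frac j3\Bigr)\frac{t}{|J|}\Bigr)\,dt\right|.$$

The integral can be written as

$$\frac{1}{|J|}\int_J f(t)e^{-i\lambda(\alpha)t}\cdot e^{-2\pi ijt/3|J|}\,dt=\sum_{k\in\mathbf Z}\tilde a_k b_k, \qquad (9.3)$$

where $\tilde a_k$ and $b_k$ are the coefficients of the expansions on $L^2(J)$

$$f(t)e^{-i\lambda(\alpha)t}=\sum_{\beta=(n,J)}\tilde a_{n(\beta)}e_\beta(t),\qquad e^{2\pi ijt/3|J|}=\sum_{\beta=(n,J)}b_{n(\beta)}e_\beta(t).$$

Hence

$$\tilde a_k=\frac{1}{|J|}\int_J f(t)\exp\Bigl(-2\pi i(n(\alpha)+k)\frac{t}{|J|}\Bigr)\,dt=a_{n(\alpha)+k}.$$

By hypothesis $|\tilde a_k|\le N$ if $|k|<M$. If $k+j/3\ne 0$, the coefficients $b_{-k}$ are bounded by

$$|b_{-k}|\le\left|\frac{\exp\bigl(2\pi i(j/3+k)\bigr)-1}{2\pi(j/3+k)}\right|\le\frac{1}{|j/3+k|}.$$

In the case $k=-j/3$, $|b_k|$ is bounded by 1.

Now we can bound the integral in (9.3). For $|j|\le M/2$

$$\sum_k|\tilde a_kb_k|\le\sum_{|k|\le M}|\tilde a_kb_k|+\sum_{|k|>M}|\tilde a_kb_k|\le N+N\sum_{|k|<M,\ k\ne -j/3}\frac{1}{|j/3+k|}+\Bigl(\sum_{|k|\ge M}\frac{1}{|j/3+k|^2}\Bigr)^{1/2}\Bigl(\sum_k|a_k|^2\Bigr)^{1/2}.$$

Hence there exists a constant $C$ such that

$$\sum_k|\tilde a_kb_k|\le C\left(N\log M+\frac{\|f\|_{L^2(J)}}{\sqrt M}\right).$$

For $|j|>M/2$ we use only the Schwarz inequality, and obtain

$$\sum_k|\tilde a_kb_k|\le\|f\|_{L^2(J)}.$$

Therefore $\|f\|_\alpha$ is bounded by

$$\sum_j\frac{c}{1+j^2}\,C\left(N\log M+\frac{\|f\|_{L^2(J)}}{\sqrt M}\right)+\sum_{|j|>M/2}\frac{c}{1+j^2}\,\|f\|_{L^2(J)}\le B\left(N\log M+\frac{\|f\|_{L^2(J)}}{\sqrt M}\right).$$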
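The $N\log M$ term in this bound comes from the sum of $1/|j/3+k|$ over $|k|<M$, $k\ne -j/3$. The Python sketch below is an illustration only (the function name `reciprocal_sum` and the explicit constant are choices made for this example, not taken from the text). It evaluates that sum and compares it with the elementary bound $8+2\log(2M)$, which follows because the nonzero values $j/3+k$ are at least $1/3$ in absolute value and are spaced one unit apart.

```python
import math


def reciprocal_sum(j: int, M: int) -> float:
    """Sum of 1/|j/3 + k| over integers k with |k| < M and k != -j/3."""
    total = 0.0
    for k in range(-M + 1, M):
        x3 = j + 3 * k  # equals 3 * (j/3 + k), kept as an exact integer
        if x3 != 0:
            total += 3.0 / abs(x3)
    return total


if __name__ == "__main__":
    for M in (8, 64, 512):
        worst = max(reciprocal_sum(j, M) for j in range(-(M // 2), M // 2 + 1))
        print(f"M={M}: max over |j|<=M/2 is {worst:.3f}, bound {8 + 2 * math.log(2 * M):.3f}")
```

The printed maxima grow roughly like $2\log M$, in agreement with the logarithmic factor in (9.2).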

9.4 Selecting an allowed pair

Every grandson $\alpha/K$ of a not allowed pair $\alpha\notin S^j$ is not well situated ($\alpha/K\notin R^{m+j}$). Then, we have arranged things so that the sound $f$ in the scale of $\alpha/K$ is a rest or a pure note. If we also assume that we are at level $j$ ($b_jy\le|f|_\alpha<b_{j-1}y$), and have selected the shift $m$ adequately, then $f$ definitely sounds, so by the structure theorem $P^{m+j}_{u(\delta/K)}$ must be a pure note. All pure notes of $f$ are well situated, so a grandfather $\beta$ of this note will be an allowed pair nearby to $\alpha$. If we are considering this $\beta$ as a step in our process, we must bound the difference $|C_{\beta/\alpha}f(x)-C_\alpha f(x)|$; for this it will be convenient to give a bound for $|n(\beta/\alpha)-n(\alpha)|$. In this section we undertake the process of selecting this nearby allowed pair $\beta$.

Proposition 9.3 Let $f$ be a special function, and $\alpha\in P_I$ such that $b_jy\le|f|_\alpha<b_{j-1}y$, $\alpha\notin S^j$, and let $x\in I/2$ be a point such that $x\in I(\alpha)/2$ and $x\notin D\cup S^*\cup X^*\cup Y$. Then there exists $\beta\in S^j$, with $I(\beta)\supset I(\alpha)$, $x\in I(\beta)/2$, and such that

$$|n(\alpha)-n(\beta/\alpha)|<\frac{A_0}{b_j}, \qquad (9.4)$$

and for every $\gamma$ with $I(\gamma)=I(\alpha)$ and $|n(\alpha)-n(\gamma)|\le 2A_0b_j^{-3/2}$ we have

$$|C_\gamma f(x)-C_\alpha f(x)|\le B\bigl(|f|_{\beta/\alpha}+b_jy\bigr). \qquad (9.5)$$

Remark. In such a situation we shall call $\beta$ a nearby allowed pair to $\alpha$. Its function is to control the changes of frequency.

Proof. There exists a grandchild $K$ of $I(\alpha)$ such that $\|f\|_{\alpha/K}=|f|_\alpha$. For this $K$, as for every other grandchild of $I(\alpha)$, we have $\alpha/K\notin R^{m+j}$. We hope to find a $\beta$ starting with some term of the function $P_u^{m+j}(t)$, where $u=u(K)$.

A general remark about the proof: thanks to the shift $m$, we have quantities at two levels $yb_j$ and $yb_{m+j}$, and always $yb_j\gg yb_{m+j}$.

We divide the proof into four steps.

First step: To obtain $\delta\in Q^{m+j}$, with $I(\delta)\supset K$.

The proof starts by applying Proposition 9.2 to obtain, for every $k\in\mathbf Z$, a bound for $\|f-P_u^{m+j}\|_{(k,K)}$. By construction of $P_u^{m+j}(t)$ the local Fourier coefficients of $f-P_u^{m+j}(t)$ are less than $b_{m+j}y^{p/2}$. Also

$$\|f-P_u^{m+j}\|_{L^2(K)}\le\|f\|_{L^2(K)}+\|P_u^{m+j}\|_{L^2(K)}. \qquad (9.6)$$

Since $I(\alpha)\not\subset S^*$, $K\not\subset S$ and $\|f\|^2_{L^2(K)}=\|f\|^p_{L^p(K)}<y^p$. Furthermore, since $I(\alpha)\not\subset X^*$, $K\not\subset X$ and by (7.8), $|P_u^{m+j}(x)|\le y^{p/2}b_{m+j}^{-2}$. Collecting all this we obtain

$$\|f-P_u^{m+j}\|_{L^2(K)}\le y^{p/2}+b_{m+j}^{-2}y^{p/2}\le 2b_{m+j}^{-2}y^{p/2}.$$

We are in position to apply Proposition 9.2 with $N=b_{m+j}y^{p/2}$, $M=b_{m+j}^{-8}$ and the bound obtained for $\|f-P_u^{m+j}\|_{L^2(K)}$. We obtain that for every $k\in\mathbf Z$

$$\|f-P_u^{m+j}\|_{(k,K)}\le B\left(y^{p/2}b_{m+j}\log(b_{m+j}^{-8})+\frac{2b_{m+j}^{-2}y^{p/2}}{b_{m+j}^{-4}}\right)\le By^{p/2}\bigl(b_{m+j}\log(b_{m+j}^{-8})+2b_{m+j}^{2}\bigr)\le 3By^{p/2}b_{m+j}^{3/4}. \qquad (9.7)$$

The last inequality requires that we take $m$ big enough. In fact $m\ge 5$ suffices.

Now we have a lower bound of $\|P_u^{m+j}\|_{\alpha/K}$. This will imply that there are terms in this function.

$$\|P_u^{m+j}\|_{\alpha/K}\ge\|f\|_{\alpha/K}-\|f-P_u^{m+j}\|_{\alpha/K}\ge yb_j-3By^{p/2}b_{m+j}^{3/4}.$$

It is here where we are forced to choose the shift so that $y^{p/2}\le b_{m+j}^{-1/4}y$ or something similar. Since $f$ is a special function, we are able to choose the shift so that $y^{p/2}\le yb_{m+j}^{-1/4}$, therefore

$$\|P_u^{m+j}\|_{\alpha/K}\ge yb_j-3Byb_{m+j}^{1/2}>\tfrac12\,yb_j. \qquad (9.8)$$

We assume here that $m$ has been chosen in order to satisfy

$$3Bb_{m+j}^{1/2}<b_j/2.$$

Given that $b_j=2^{1-2^j}$, it is easy to see that this condition is equivalent to $m\ge m_0$ for some absolute constant $m_0$.
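To see how a threshold $m_0$ arises from the condition $3Bb_{m+j}^{1/2}<b_j/2$, one can search for it numerically. The sketch below is only an illustration: it assumes $b_j=2^{1-2^j}$, the function names are hypothetical, and the value $B=1$ used in the example is a placeholder, not the constant of Proposition 9.2. The comparison is done in base-2 logarithms to avoid underflow.

```python
import math


def condition_holds(B: float, m: int, j: int) -> bool:
    """Check 3*B*b_{m+j}^(1/2) < b_j/2 in log2 form, assuming b_j = 2^(1 - 2^j)."""
    lhs = math.log2(3 * B) + (1 - 2 ** (m + j)) / 2  # log2 of 3*B*b_{m+j}^(1/2)
    rhs = (1 - 2 ** j) - 1                           # log2 of b_j / 2
    return lhs < rhs


def smallest_m0(B: float, j_max: int = 30) -> int:
    """Smallest m >= 1 for which the condition holds for all 1 <= j <= j_max."""
    m = 1
    while not all(condition_holds(B, m, j) for j in range(1, j_max + 1)):
        m += 1
    return m


if __name__ == "__main__":
    for B in (1.0, 10.0, 1000.0):  # placeholder values of the constant B
        print(f"B={B}: m0={smallest_m0(B)}")
```

For the placeholder $B=1$ the search returns $m_0=3$; the binding case is $j=1$, and once the condition holds for some $m$ it holds for every larger $m$.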

Note that the same calculation proves that for every grandchild $L=I_v$ of $I(\alpha)$ we have

$$\|f-P_v^{m+j}\|_{(k,L)}\le 3Byb_{m+j}^{1/2}. \qquad (9.9)$$

Since $\alpha/K\notin R^{m+j}$ and $K\not\subset X$, the lemma on the structure of $P_u^{m+j}$ (Lemma 8.1) says that

$$P_u^{m+j}(t)=\rho e^{i\lambda(\delta)t}+P_0(t)+P_1(t),\qquad t\in K=I_u, \qquad (9.10)$$

where

$$|\rho|\le b_{m+j}^{-2}y^{p/2},\qquad |P_0(t)|\le b_{m+j}^{8}y^{p/2}, \qquad (9.11)$$

and all the terms $a(\gamma)e^{i\lambda(\gamma)t}$ of $P_1(t)$ satisfy $|n(\alpha/K)-n(\gamma/K)|\ge b_{m+j}^{-10}$.

What we want to prove is that $\rho\ne 0$, so that $\delta\in Q^{m+j}$ is such that $I(\delta)\supset K$. Then we can choose $\beta$ to be a grandfather of $\delta$. But $\rho=0$ would imply that $\|P_u^{m+j}\|_{\alpha/K}$ is small, and this will be in contradiction with (9.8). So we try to obtain an upper bound of $\|P_u^{m+j}\|_{\alpha/K}$, related to the decomposition (9.10), and to compare it with (9.8). We have given adequate bounds for the local norm of exponentials in Proposition 4.4. Hence taking (9.10) into account we have

$$\|P_u^{m+j}\|_{\alpha/K}\le\frac{C|\rho|}{|n(\delta/K)-n(\alpha/K)|}+b_{m+j}^{8}y^{p/2}+Cb_{m+j}^{10}\sum_\gamma|a(\gamma)|.$$

If $n(\delta/K)=n(\alpha/K)$ the first term is reduced to $|\rho|$. Since $K\not\subset X$, we have a bound for the last sum (see (7.8))

$$\|P_u^{m+j}\|_{\alpha/K}\le\frac{C|\rho|}{|n(\delta/K)-n(\alpha/K)|}+b_{m+j}^{8}y^{p/2}+Cb_{m+j}^{8}y^{p/2}.$$

We conclude from (9.8) that

$$\tfrac12\,yb_j<\|P_u^{m+j}\|_{\alpha/K}\le\frac{C|\rho|}{|n(\delta/K)-n(\alpha/K)|}+2b_{m+j}^{7}y.$$

We assume $m$ is such that, for every $j$, $8b_{m+j}^{7}<b_j$, and we obtain

$$yb_j<\frac{4C|\rho|}{|n(\delta/K)-n(\alpha/K)|}. \qquad (9.12)$$

Therefore $|\rho|\ne 0$. This gives us a term $\delta\in Q^{m+j}$ such that $I(\delta)\supset K$.

Second step: Bound for $|n(\delta/K)-n(\alpha/K)|$.

Now we must prove that the pitch of $\delta$ is comparable to that of $\alpha$. That is, we must bound $|n(\delta/K)-n(\alpha/K)|$, so we assume it is not null. From the proof of the structure theorem (of $P_u^j$) we know that we can take as $\delta$ any pair in $Q^{m+j}$ such that $I(\delta)\supset K$ and such that the inequality $|n(\delta/K)-n(\alpha/K)|<b_{m+j}^{-10}$ holds. But our hypotheses $b_jy\le|f|_\alpha<b_{j-1}y$ and $\alpha\notin S^j$ are very strong. In fact, we can prove that every one of these $\delta$'s satisfies $|n(\delta/K)-n(\alpha/K)|<Cb_j^{-1}$. To this end we must prove $|\rho|\le Cy$ (cf. (9.12)). A first approximation is obtained from (9.12) and the known estimate $|\rho|\le b_{m+j}^{-2}y^{p/2}$:

$$|n(\delta/K)-n(\alpha/K)|\le\frac{4Cb_{m+j}^{-2}y^{p/2}}{yb_j}\le 4Cb_{m+j}^{-4}<b_{m+j}^{-5}. \qquad (9.13)$$

Now we reverse the inequalities from the previous step. The local norm $\|P_u^{m+j}\|_{(k,K)}$ has a maximum near $k=n(\delta/K)$. Hence for $\delta/K=(k,K)$, Proposition 4.5 gives us

$$\|P_u^{m+j}\|_{\delta/K}\ge B|\rho|-b_{m+j}^{8}y^{p/2}-\sum_\gamma\frac{C|a(\gamma)|}{|n(\delta/K)-n(\gamma/K)|}.$$

And we have

$$\sum_\gamma\frac{C|a(\gamma)|}{|n(\delta/K)-n(\gamma/K)|}\le\sum_\gamma\frac{C|a(\gamma)|}{|n(\gamma/K)-n(\alpha/K)|-|n(\delta/K)-n(\alpha/K)|}.$$

Hence, by the sense of $\sum_\gamma$ and (9.13), this sum is bounded by

$$\sum_\gamma\frac{C|a(\gamma)|}{b_{m+j}^{-10}-b_{m+j}^{-5}}\le\frac{Cy^{p/2}b_{m+j}^{-2}}{b_{m+j}^{-10}-b_{m+j}^{-5}}.$$

Since $m\ge 2$ we obtain $2b_{m+j}^{-5}<b_{m+j}^{-10}$, so that finally we arrive at

$$\|P_u^{m+j}\|_{\delta/K}\ge B|\rho|-b_{m+j}^{8}y^{p/2}-2Cy^{p/2}b_{m+j}^{8}>B|\rho|-yb_{m+j}^{6}.$$

But, on the other hand, by (9.7),

$$\|P_u^{m+j}\|_{\delta/K}\le\|f\|_{\delta/K}+\|f-P_u^{m+j}\|_{\delta/K}\le\|f\|_{\delta/K}+3Byb_{m+j}^{1/2}.$$

Therefore

$$|\rho|\le C\bigl(\|f\|_{\delta/K}+yb_{m+j}^{1/2}\bigr).$$

Now $\|f\|_{\delta/K}\le\|f\|_{L^1(K)}\le\|f\|_{L^p(K)}<y$ and we see at once that $|\rho|\le 2Cy$, which with (9.12) establishes that

$$|n(\delta/K)-n(\alpha/K)|<\frac{C}{b_j}. \qquad (9.14)$$

Third step: Definition of $\beta$.

We are in position to apply Lemma 8.2, to prove that the functions $P_v^{m+j}$ are the same for every grandson of $\alpha$. Namely, $\alpha$ is a pair such that $\alpha/L\notin R^{m+j}$ for every grandchild $L$ of $I(\alpha)$, $I(\alpha)\not\subset Y$ by hypothesis, and there is a term $\delta\in Q^{m+j}$ of $P_u^{m+j}$ such that

$$|n(\delta/K)-n(\alpha/K)|<\frac{C}{b_j}<b_{m+j}^{-9}.$$

(This is another restriction on $m$ of the type $m\ge m_0$.) We deduce that in fact $I(\delta)\supset I(\alpha)$ and that the four functions $P_v^{m+j}$, corresponding to the four grandsons $I_v$ of $I(\alpha)$, coincide.

Since $\delta\in Q^{m+j}$ we know that $I(\delta)$ is a dyadic interval with $|\delta|\le|I|/4$. Then, as $x\in I/2$ is not a dyadic point, there is only one smoothing interval $I(\beta)$ of length $4|\delta|$ such that $x\in I(\beta)/2$. We define $\beta$ as the pair with $n(\beta)=4n(\delta)$ and this $I(\beta)$. It is easy to see that with this definition $I(\delta)$ is a grandson of $I(\beta)$ and $\beta/I(\delta)=\delta$. Furthermore $I(\beta)\not\subset X^*$ since $I(\beta)\supset I(\delta)\supset I(\alpha)$ and $I(\alpha)\not\subset X^*$. Hence to prove $\beta\in S^j$ we only have to prove that $\delta\in R^{m+j}$. We know that $\delta\in Q^{m+j}$. But condition $A(R^j)$ implies that every $\alpha\in Q^j$ satisfies also $\alpha\in R^j$ (take $\beta$ and $\gamma$ equal to $\alpha$ in condition $A(R^j)$). Hence $\delta\in R^{m+j}$. We also have that

$$|n(\alpha)-n(\beta/\alpha)|=\left|n(\alpha)-\left\lfloor n(\beta)\frac{|\alpha|}{|\beta|}\right\rfloor\right|=\left|4n(\alpha/K)+r-\left\lfloor 4n(\delta)\frac{|K|}{|\delta|}\right\rfloor\right|\le 4\,\frac{C}{b_j}+8\le\frac{A_0}{b_j}.$$

This proves (9.4).

Fourth step: Bound for $|C_\gamma f(x)-C_\alpha f(x)|$.

Let $\gamma$ be a pair such that $I(\gamma)=I(\alpha)$ and $|n(\alpha)-n(\gamma)|\le 2A_0b_j^{-3/2}$. Let $P=P_u^{m+j}$ and recall that we have proved that it coincides with every $P_v^{m+j}$ if $v=v(L)$ for some grandchild $L$ of $I(\alpha)$. Also note that (9.10) and the inequalities in (9.11) are valid now for every $t\in I(\alpha)$. First we have

$$|C_\gamma f(x)-C_\alpha f(x)|\le|C_\gamma(f-P)(x)-C_\alpha(f-P)(x)|+|C_\gamma P(x)-C_\alpha P(x)|.$$

By a change of frequency, by the bounds of $\|f-P\|_{(k,L)}$ obtained at step 1 (cf. (9.9)), and by the comparison between the two norms $\|f-P\|_\alpha$ and $|f-P|_\alpha$ (Proposition 4.12) we get

$$|C_\gamma(f-P)(x)-C_\alpha(f-P)(x)|\le C\bigl(2A_0b_j^{-3/2}\bigr)^3\,3Byb_{m+j}^{1/2}\le b_jy$$

(by the restriction $m\ge m_0$).

Second, by the structure of $P$,

$$|C_\gamma P(x)-C_\alpha P(x)|\le|\rho|\cdot|C_\gamma(e^{i\lambda(\delta)t})(x)-C_\alpha(e^{i\lambda(\delta)t})(x)|+|C_\gamma(P_0)(x)-C_\alpha(P_0)(x)|+|C_\gamma(P_1)(x)-C_\alpha(P_1)(x)|.$$

The Carleson integral of an exponential is bounded (cf. Lemma 4.1). So for the first term we have

$$|\rho|\cdot|C_\gamma(e^{i\lambda(\delta)t})(x)-C_\alpha(e^{i\lambda(\delta)t})(x)|\le 2A|\rho|.$$

In the second term we have, one more time, a change of frequency. Hence

$$|C_\gamma(P_0)(x)-C_\alpha(P_0)(x)|\le B\bigl(2A_0b_j^{-3/2}\bigr)^3\|P_0\|_\alpha\le Ab_j^{-9/2}b_{m+j}^{7}y<b_jy.$$

(Again by the assumption $m\ge m_0$.) The third term is bounded in the following way:

$$|C_\gamma(P_1)(x)-C_\alpha(P_1)(x)|\le\sum_\eta|a(\eta)|\cdot|C_\gamma(e^{i\lambda(\eta)t})-C_\alpha(e^{i\lambda(\eta)t})|.$$

By another change of frequency, we get

$$|C_\gamma(P_1)(x)-C_\alpha(P_1)(x)|\le C\sum_\eta|a(\eta)|\,b_j^{-9/2}\|e^{i\lambda(\eta)t}\|_\alpha\le Cb_j^{-9/2}\sum_\eta\frac{|a(\eta)|}{\bigl|\lfloor n(\eta)|\alpha|/|\eta|\rfloor-n(\alpha)\bigr|}$$

$$\le Cb_j^{-9/2}\sum_\eta\frac{|a(\eta)|}{4|n(\eta/K)-n(\alpha/K)|-8}\le C'b_j^{-9/2}b_{m+j}^{10}y^{p/2}b_{m+j}^{-2}\le Cb_j^{-9/2}b_{m+j}^{7}y\le b_jy.$$

Collecting all these inequalities we have proved that

$$|C_\gamma f(x)-C_\alpha f(x)|\le 2A|\rho|+3b_jy\le C(|\rho|+b_jy),$$

and by (9.14)

$$|C_\gamma f(x)-C_\alpha f(x)|\le B\bigl(\|f\|_{\delta/K}+b_jy\bigr)\le B\bigl(|f|_{\beta/\alpha}+b_jy\bigr),$$

since $(\beta/\alpha)/K=\delta/K$.

Given a Carleson integral $C_\alpha f(x)\ne 0$ with $x\notin S^*$, it has a level $j\in\mathbf N$ (i.e. $b_jy\le|f|_\alpha<b_{j-1}y$). Moreover, if $\alpha\in S^j$ we can apply the basic step. If, on the other hand, $\alpha\notin S^j$, by the previous theorem there exists what we call a nearby allowed pair $\beta\in S^j$ with $x\in I(\beta)/2$, $I(\alpha)\subset I(\beta)$, that controls the changes of frequencies, so that if $|n(\gamma/\alpha)-n(\alpha)|<2A_0b_j^{-3/2}$, we have

$$|C_{\gamma/\alpha}f(x)-C_\alpha f(x)|\le B\bigl(|f|_{\beta/\alpha}+b_jy\bigr).$$

In particular, this is true for $\gamma=\beta$. Since $\beta$ is an allowed pair ($\beta\in S^j$) and

$$|C_{\beta/\alpha}f(x)-C_\alpha f(x)|\le B\bigl(|f|_{\beta/\alpha}+b_jy\bigr),$$

we can think of applying the basic step to the Carleson integral $C_{\beta/\alpha}f(x)$. This can be problematic for two reasons:

(a) We do not have a reasonable bound for $|f|_{\beta/\alpha}$.
(b) We shall arrive at a Carleson integral $C_{\beta/I(x)}f(x)$. There is no guarantee that $I(x)\subsetneq I(\alpha)$. This situation is unacceptable. What we need is a simpler Carleson integral, not one with more cycles.

Therefore we have not obtained our objective. What is the new level $k$ with $b_ky\le|f|_{\beta/\alpha}<b_{k-1}y$? Is $\beta/\alpha\in S^k$? These questions do not have a unique answer. Since the best bound of $|f|_{\beta/\alpha}$ is $b_{k-1}y$, the next time we must apply the basic step at some level $\le k$, and it is essential that $k\le j$, because we started at level $j$. Therefore what we need is a pair $\xi$ and a level $\ell$ such that:

(a) We are in a good position to apply the basic step: $I(\xi)\supset I(\alpha)$, $x\in I(\xi)/2$, $\xi\in S^\ell$, and $|f|_\xi<b_{\ell-1}y$.
(b) We are at an adequate level: $|f|_{\beta/\alpha}<b_{\ell-1}y$, with $\ell<j$.
(c) We have a controlled change of frequency: $|n(\xi/\alpha)-n(\alpha)|<2A_0b_j^{-3/2}$.
(d) Finally, to be able to apply the basic step, $I(\alpha)$ must be a union of sets of the partition $\Pi_\xi$ corresponding to this pair $\xi$.

This pair $\xi$ is obtained by picking, from the set of $(\mu,\ell)$ with some of these properties, one with the minimum level $\ell$. Next we prove that it satisfies all our conditions.

Proposition 9.4 Assume that $f$ is a special function, $x\in I/2$, and let $\alpha\in P_I$ be such that $b_jy\le|f|_\alpha<b_{j-1}y$, $\alpha\notin S^j$ and $x\in I(\alpha)/2$, with $x\notin D\cup S^*\cup X^*\cup Y$. We also assume that $0\le n(\alpha)<2^N$ and $|\alpha|>4|I|/2^N$. Then there exists $\xi\in S^k$, with $1\le k\le j$, such that $I(\xi)\supset I(\alpha)$, $x\in I(\xi)/2$, and also, with $\beta$ being a nearby allowed pair to $\alpha$,

$$|f|_{\beta/\alpha}\le b_{k-1}y,\qquad|f|_\xi<b_{k-1}y,\qquad|n(\xi/\alpha)-n(\alpha)|\le\frac{2A_0}{b_j}. \qquad (9.15)$$

Furthermore, if $\Pi_\xi$ is the partition determined on $I(\xi)$ and $I(x)$ is the central interval, then $I(x)\subsetneq I(\alpha)$, and $I(\alpha)\setminus I(x)$ is a union of intervals of $\Pi_\xi$.

Proof. Let $\Sigma$ be the set of pairs $(\mu,\ell)$, where $\mu\in P_I$ and $\ell\in\mathbf N$ satisfy the conditions:

(i) $I(\mu)\supset I(\alpha)$ and $x\in I(\mu)/2$.
(ii) $|f|_{\beta/\alpha}<b_{\ell-1}y$, and $\ell\le j$.
(iii) $|n(\mu/\alpha)-n(\alpha)|\le A_0\sum_{i=\ell}^{j}b_i^{-1}$.
(iv) $\mu\in S^\ell$.

In (iii) the constant $A_0$ is the same as that appearing in Proposition 9.3. The proof will be divided into 3 steps.

Step 1: $\Sigma$ is nonempty.

If $|f|_{\beta/\alpha}<b_{j-1}y$, then $(\beta,j)\in\Sigma$, which proves our claim. If $|f|_{\beta/\alpha}\ge b_{j-1}y$ there exists $k$, with $1\le k\le j-1$, such that $b_ky\le|f|_{\beta/\alpha}<b_{k-1}y$. (Observe that, since $I(\beta/\alpha)\not\subset S^*$, we know that $|f|_{\beta/\alpha}<y$, and $|f|_{\beta/\alpha}>0$ since $|f|_\alpha\ge b_jy>0$.) Now, if $\beta/\alpha\in S^k$, then $(\beta/\alpha,k)\in\Sigma$. If on the other hand $\beta/\alpha\notin S^k$, we obtain an allowed pair $\beta'$ nearby to $\beta/\alpha$ (Proposition 9.3); hence $\beta'\in S^k$, with $I(\beta')\supset I(\beta/\alpha)=I(\alpha)$, $x\in I(\beta')/2$, and

$$|n(\beta/\alpha)-n(\beta'/\alpha)|\le\frac{A_0}{b_k}.$$

Then $(\beta',k)\in\Sigma$. We see that condition (iii) in the definition of $\Sigma$ is satisfied since

$$|n(\beta'/\alpha)-n(\alpha)|\le|n(\beta'/\alpha)-n(\beta/\alpha)|+|n(\beta/\alpha)-n(\alpha)|\le\frac{A_0}{b_k}+\frac{A_0}{b_j}\le A_0\sum_{i=k}^{j}b_i^{-1}.$$

The other conditions are easily verified.

Step 2: Selection of $(\xi,k)\in\Sigma$ and the proof of $|f|_\xi<b_{k-1}y$.

We pick a $(\xi,k)\in\Sigma$ with a minimum $k$. We are going to prove that $\xi$ and $k$ satisfy the theorem. If it were true that $|f|_\xi\ge b_{k-1}y$ there would be an $\ell$, with $1\le\ell<k$ and $b_\ell y\le|f|_\xi<b_{\ell-1}y$. If $\xi\in S^\ell$, then $(\xi,\ell)\in\Sigma$, in contradiction to the selection of $(\xi,k)$. Therefore $\xi\notin S^\ell$. Then (by Proposition 9.3) there exists an allowed pair $\beta'\in S^\ell$ nearby to $\xi$. Then $I(\beta')\supset I(\xi)$, $x\in I(\beta')/2$, and $|n(\xi)-n(\beta'/\xi)|<A_0/b_\ell$.

Since $(\xi,k)\in\Sigma$, we have $|n(\xi/\alpha)-n(\alpha)|\le A_0\sum_{i=k}^{j}b_i^{-1}$, $I(\xi)\supset I(\alpha)$, and $|f|_{\beta/\alpha}<b_{k-1}y$. We can check now that $(\beta',\ell)\in\Sigma$. Condition (i) is clear; (ii) follows from $|f|_{\beta/\alpha}<b_{k-1}y<b_{\ell-1}y$.

To deduce (iii), observe that

$$|n(\beta'/\alpha)-n(\alpha)|\le|n(\beta'/\alpha)-n(\xi/\alpha)|+|n(\xi/\alpha)-n(\alpha)|\le|n(\beta'/\alpha)-n(\xi/\alpha)|+A_0\sum_{i=k}^{j}b_i^{-1}.$$

Now $|n(\beta'/\alpha)-n(\xi/\alpha)|\le|n(\beta'/\xi)-n(\xi)|$. This is a particular case of the inequality

$$\left|\left\lfloor\frac{N}{2^{r+s}}\right\rfloor-\left\lfloor\frac{M}{2^{r}}\right\rfloor\right|\le\left|\left\lfloor\frac{N}{2^{s}}\right\rfloor-M\right|,$$

valid when $M$, $N$, $r$ and $s$ are nonnegative integers. This can be easily proved using the binary expansion of natural numbers. Therefore we have

$$|n(\beta'/\alpha)-n(\alpha)|\le\frac{A_0}{b_\ell}+A_0\sum_{i=k}^{j}b_i^{-1}\le A_0\sum_{i=\ell}^{j}b_i^{-1}.$$

Since $\ell<k$, the fact that $(\beta',\ell)\in\Sigma$ is in contradiction to the definition of $(\xi,k)$. This contradiction proves the inequality $|f|_\xi<b_{k-1}y$, and finishes step 2.

Step 3: Claims on the partition $\Pi_\xi$.

Now let $\Pi_\xi$ be the partition of $I(\xi)$ obtained with the given $N$, $y$, and the inequality $|f|_\xi<b_{k-1}y$. Let also $I(x)$ denote the central interval determined by $x$ and $\Pi_\xi$. $I(x)$ and $I(\alpha)$ are two smoothing intervals which contain $x$ in their middle halves. By Proposition 4.6 it follows that $I(\alpha)\subset I(x)$ or $I(x)\subsetneq I(\alpha)$.

The first hypothesis leads to a contradiction. In fact, suppose $I(x)\supset I(\alpha)$; recall that $|\alpha|>4|I|/2^N$. By construction, one of the two halves of $I(x)$, say $L$, is a member of the partition $\Pi_\xi$. Since $|L|>2|I|/2^N$, one of the two sons of $L$, say $K$, is such that $\|f\|_{\xi/K}\ge b_{k-1}y$. Hence there exists $\ell$ with $1\le\ell<k$ such that $b_\ell y\le|f|_{\xi/I(x)}<b_{\ell-1}y$. If $\xi/I(x)\in S^\ell$, then $(\xi/I(x),\ell)\in\Sigma$. This is in contradiction to the selection of $(\xi,k)$ in $\Sigma$. If $\xi/I(x)\notin S^\ell$, Proposition 9.3 gives a nearby allowed pair $\gamma\in S^\ell$. As in the second step we can prove that $(\gamma,\ell)\in\Sigma$. This is also contradictory.

We have proved $I(x)\subsetneq I(\alpha)\subset I(\xi)$. By the last condition of Proposition 4.7, in this case $I(\alpha)$ and $I(x)$ are unions of intervals of the partition $\Pi_\xi$.
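Two elementary facts used in this proof lend themselves to a brute-force check. The Python sketch below is only an illustration with hypothetical function names. It tests the floor inequality $|\lfloor N/2^{r+s}\rfloor-\lfloor M/2^r\rfloor|\le|\lfloor N/2^s\rfloor-M|$ on a range of nonnegative integers, and, assuming $b_i=2^{1-2^i}$, it tests that $\sum_{i=k}^{j}b_i^{-1}\le 2b_j^{-1}$, which is how condition (iii) of $\Sigma$ yields the bound $2A_0/b_j$ in (9.15).

```python
def floor_inequality_holds(N: int, M: int, r: int, s: int) -> bool:
    """Check |floor(N / 2^(r+s)) - floor(M / 2^r)| <= |floor(N / 2^s) - M|."""
    return abs((N >> (r + s)) - (M >> r)) <= abs((N >> s) - M)


def inverse_b(i: int) -> int:
    """1 / b_i as an exact integer, assuming b_i = 2^(1 - 2^i)."""
    return 2 ** (2 ** i - 1)


def tail_sum_bounded(k: int, j: int) -> bool:
    """Check sum_{i=k}^{j} 1/b_i <= 2 / b_j with exact integer arithmetic."""
    return sum(inverse_b(i) for i in range(k, j + 1)) <= 2 * inverse_b(j)


if __name__ == "__main__":
    floors_ok = all(
        floor_inequality_holds(N, M, r, s)
        for N in range(64) for M in range(64) for r in range(4) for s in range(4)
    )
    sums_ok = all(tail_sum_bounded(k, j) for j in range(1, 8) for k in range(1, j + 1))
    print("floor inequality:", floors_ok, "| tail sums:", sums_ok)
```

Right shifts are used for the floors because, for nonnegative integers, `N >> s` equals $\lfloor N/2^s\rfloor$.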

10. All together

10.1 Introduction

In this chapter we prove the weak inequality from which all the results about the Carleson maximal operator can be obtained by standard methods. In Proposition 10.1, we apply the results of the previous chapter to prove that, given a Carleson integral, we can either bound it at its level or we can obtain another Carleson integral for which we can bound the difference and such that the second Carleson integral has a lesser level. Theorem 10.2 is the principal result of Carleson-Hunt. After proving this, it will only remain to show that all the pieces of the jigsaw fit together. In particular, we have to adjust the absolute constant $\theta$ so that before we arrive at a Carleson integral with $|\alpha|\le 4|I|/2^N$ (for which we cannot apply the basic step or Proposition 9.4) we arrive at one for which $n(\alpha)=0$.

10.2 End of the proof

The following proposition is a formulation of the basic step of the proof. We can summarize it by saying that a Carleson integral of level $j$ is small with respect to its level or can be approximated (in relation to its level) by another Carleson integral of level $j'<j$. In this chapter, if $\|f\|_\alpha=0$, we shall say that the level of the Carleson integral $C_\alpha f(x)$ is $\infty$. Thus the level of a Carleson integral is always defined and is a natural number or infinity.

Throughout this chapter $f$ is a special function, $|f|=\chi_A$. Also we denote by $E_N$ the exceptional set defined in section [8.5]. Therefore $m(E_N)\le C\|f\|_p^p/y^p$.

J.A. de Reyna: LNM 1785, pp. 117–123, 2002. © Springer-Verlag Berlin Heidelberg 2002

Proposition 10.1 Let $C_\alpha f(x)$ be a Carleson integral of level $j$, such that $x\notin E_N$ and $|\alpha|>4|I|/2^N$. Then there exists a natural number $1\le k\le j$ satisfying one of the two following conditions:

(A) $|C_\alpha f(x)|\le C2^m y\,b_{k-1}^{1/2}$.
(B) There exists $\delta\in P_I$ such that $I(\delta)\subsetneq I(\alpha)$, $x\in I(\delta)/2$, $\lambda(\delta)\le(1+b_j^{1/2})\lambda(\alpha)$, and either $|\delta|\le 4|I|/2^N$ or the level of $C_\delta f(x)$ is $j'<k$, and

$$|C_\delta f(x)-C_\alpha f(x)|\le C2^m y\,b_{k-1}^{1/2}. \qquad (10.1)$$

Proof. If the level of Cα f (x) is ∞, then Cα f (x) = 0 and condition (A) is satisﬁed with k = 1. Thus we assume that j ∈ N. Because j is the level of Cα f (x) we have ybj ≤ |f |α < ybj−1 . We divide the proof into two cases. First case. Assume that α is an allowed pair α ∈ S j . Given N and |f |α < ybj−1 , we can use the procedure to ﬁnd a partition Πα of I(α) and a smoothing interval I(x) as in Proposition 4.7. Set δ = α/I(x). Then, δ satisﬁes the condition of the theorem, in particular |δ| 1 n(δ) ≤ 2π n(α) = λ(α). λ(δ) = 2π |δ| |α| |δ| If |δ| > 4|I|/2N , we can apply the basic step, with ξ = α and J = I(α) and obtain the bound |Cα f (x) − Cδ f (x)| ≤ Cybj−1 + 2Hα∗ f (x) + Dybj−1 Δα (x). Since α ∈ S j and x ∈ / TN (α) ∪ UN (α) we have by the deﬁnition of the sets TN,j (α) and UN,j (α) 1/2

Hα∗ f (x) ≤ C2 2m ybj−1 , And we get

−1/2

Δα (x) ≤ C1 2m bj−1 . 1/2

|Cα f (x) − Cδ f (x)| ≤ C2m ybj−1 . In this case we take k = j and condition (B) is satisﬁed. Since |δ| > 4|I|/2N , the construction of I(x) also implies that the level j of Cδ f (x) is less than k = j, and I(x) I(α). Second case. Assume that α ∈ / S j . We apply the machinery of the last chapter: We select a nearby allowed pair β and apply Proposition 9.4, ﬁnding a level k, with 1 ≤ k ≤ j and a pair ξ ∈ S k such that I(ξ) ⊃ I(α), x ∈ I(ξ)/2, and |f |β/α ≤ ybk−1 . Let also Πξ and I(x) be the partition and central interval arising from ξ and the inequality |f |ξ < ybk−1 . 3/2

First subcase. If n(α) ≤ 2A0 /bj , since the nearby allowed pair β controls the changes of frequency, we have Cγ f (x) − Cα f (x) ≤ B(f β/α + bj y) ≤ Cybk−1 , where γ = (0, I(α)). Now as x ∈ /V

10.2 End of the proof

Cγ f (x) = p.v. I(α)

119

f (t) dt ≤ B2m y. x−t

Thus, in this subcase, we put k = 1 (changing its previous value) and obtain 1/2 |Cα f (x)| ≤ C2m ybk−1 = C2m y. 3/2

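Second subcase. Now we assume $n(\alpha) > 2A_0/b_j^{3/2}$. Take δ = ξ/I(x).

(A) Proof of $\lambda(\delta) \le (1 + b_j^{1/2})\lambda(\alpha)$. Observe that by definition and by the construction of ξ

$$n(\delta) = \Bigl\lfloor n(\xi)\frac{|I(x)|}{|\xi|}\Bigr\rfloor, \qquad |n(\xi/\alpha) - n(\alpha)| \le \frac{2A_0}{b_j}.$$

Put n = n(ξ/α). Since I(ξ) ⊃ I(α) ⊃ I(δ) = I(x), we get

$$n(\delta) = n\bigl(\xi/I(x)\bigr) = n\bigl((\xi/\alpha)/I(x)\bigr) = \Bigl\lfloor n\frac{|\delta|}{|\alpha|}\Bigr\rfloor.$$

Therefore

$$\frac{n(\delta)}{|\delta|} = \frac{1}{|\delta|}\Bigl\lfloor n\frac{|\delta|}{|\alpha|}\Bigr\rfloor \le \frac{n}{|\alpha|} \le \frac{n(\alpha)}{|\alpha|} + \frac{2A_0}{b_j}\frac{1}{|\alpha|} < \frac{n(\alpha)}{|\alpha|} + b_j^{1/2}\frac{n(\alpha)}{|\alpha|} = \frac{n(\alpha)}{|\alpha|}(1 + b_j^{1/2}).$$

And we have proved $\lambda(\delta) \le (1 + b_j^{1/2})\lambda(\alpha)$.

Now assume that $|\delta| > 4|I|/2^N$. Then by the construction of $\Pi_\xi$ and I(x), the level j' of $C_\delta f(x)$ satisfies j' < k. As I(α) ∖ I(x) is a union of intervals of $\Pi_\xi$, we can apply the basic step to obtain

$$|C_{\xi/I(\alpha)} f(x) - C_\delta f(x)| \le C y b_{k-1} + 2H^*_\xi f(x) + D y b_{k-1}\Delta_\xi(x).$$

Since $\xi \in S^k$ and $x \notin E_N$, we obtain the bounds

$$H^*_\xi f(x) \le C_2 2^m y b_{k-1}^{1/2}, \qquad \Delta_\xi(x) \le C_1 2^m b_{k-1}^{-1/2}.$$

Hence

$$|C_{\xi/\alpha} f(x) - C_\delta f(x)| \le C 2^m y b_{k-1}^{1/2}.$$

Furthermore, β controls the change of frequency, thus

$$|C_{\xi/\alpha} f(x) - C_\alpha f(x)| \le B(y b_{k-1} + y b_j) \le C y b_{k-1},$$

because $|n(\xi/\alpha) - n(\alpha)| \le 2A_0 b_j^{-1}$ and $|f|_{\beta/\alpha} < y b_{k-1}$. Combining the last two inequalities leads us to

$$|C_\delta f(x) - C_\alpha f(x)| \le C 2^m y b_{k-1}^{1/2}.$$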


Theorem 10.2 There exists an absolute constant C such that for every special function f, y > 0 and 1 < p < +∞,

$$m\{x \in I/2 : C_I^* f(x) > y\} \le B_p^p\,\frac{\|f\|_p^p}{y^p},$$

where $B_p \le Cp^2/(p-1)$.

Proof. Given f, p and y > 0 we consider a natural number N (we are interested only in big values of N, N ≥ N_0). With these elements we perform the Carleson analysis of the function f and define the shift m and the exceptional set $E_N = D \cup S^* \cup T_N \cup U_N \cup V \cup X^* \cup Y$. As we have seen in chapter eight, |f| being a characteristic function,

$$m(E_N) \le C\,\frac{\|f\|_p^p}{y^p}.$$

We are going to prove that for every $x \in I/2 \setminus E_N$

$$\sup_{0 \le n < \theta 2^N} |C_{(n,I)} f(x)| \le B 2^m y, \qquad (10.2)$$

where θ is an absolute constant, and m is the shift considered in the previous chapters. We know that $2^m \sim Cp^2/(p-1)$.

Let $E = \{x \in I/2 : \sup_{n \ge 0} |C_{(n,I)} f(x)| > B 2^m y\}$. It is easy to see that $E \subset \liminf_N E_N$, hence we get

$$m(E) \le C\,\frac{\|f\|_p^p}{y^p}.$$

If we apply the same reasoning to $\bar f$ we obtain an analogous inequality for the set $E' = \{x \in I/2 : \sup_{n \le 0} |C_{(n,I)} f(x)| > B 2^m y\}$. Thus we get

$$m\{x \in I/2 : C_I^* f(x) > B 2^m y\} \le 2C\,\frac{\|f\|_p^p}{y^p}.$$

From this the assertion of the theorem follows easily.

Next, we show (10.2). Take α = (n, I) with $0 \le n < \theta 2^N$ and $x \in I/2 \setminus E_N$, and proceed to show that $|C_\alpha f(x)| \le B 2^m y$. We start an iterative process that takes a Carleson integral $C_\alpha f(x)$ of level j and obtains either a number 1 ≤ k ≤ j and a bound of $C_\alpha f(x)$, or a second Carleson integral $C_\delta f(x)$ of level j' < k together with a good bound of the difference $|C_\delta f(x) - C_\alpha f(x)|$. Then the process continues with $C_\delta f(x)$ instead of $C_\alpha f(x)$. We start with α = α_0 = (n, I), but the same procedure is repeated with other pairs.

So, now we assume a Carleson integral $C_\alpha f(x)$ of level j to be given, with $|\alpha| > 4|I|/2^N$ and $x \in I/2 \setminus E_N$. Observe that since $x \notin V$, if n(α) = 0 then $|C_\alpha f(x)| \le B 2^m y$.


We apply Proposition 10.1, and obtain a natural number k, with 1 ≤ k ≤ j, and one of these two possibilities:

1 If $|C_\alpha f(x)| \le C 2^m y b_{k-1}^{1/2}$, we stop the process, for we have the bound we are seeking.

2 In this second possibility the level j is a natural number and Proposition 10.1 gives us a pair δ with I(δ) ⊂ I(α), x ∈ I(δ)/2, and $\lambda(\delta) \le (1 + b_j^{1/2})\lambda(\alpha)$. We divide this case into two other cases depending on the value of n(α).

2.1 If n(α) = 0, we change the value of k to k = 1 and we have $|C_\alpha f(x)| \le C 2^m y b_{k-1}^{1/2} = C 2^m y$. We stop the process at α.

2.2 If n(α) > 0 we divide it into two subcases.

2.2.1 $|\delta| \le 4|I|/2^N$. In this case we stop the process. But observe that in this case we have not attained our objective, that is, to bound $|C_\alpha f(x)|$ or the difference $|C_\delta f(x) - C_\alpha f(x)|$.

2.2.2 We have $|\delta| > 4|I|/2^N$. Since we are now in case (B) of Proposition 10.1, the level of $C_\delta f(x)$ is j' < k, and $|C_\delta f(x) - C_\alpha f(x)| \le C 2^m y b_{k-1}^{1/2}$.

The process starts with α = α_0, of level j_0 and such that $0 \le n(\alpha_0) < \theta 2^N$. We obtain k_0, with 1 ≤ k_0 ≤ j_0, and stop or obtain α_1 = δ, such that the level of $C_{\alpha_1} f(x)$ is j_1 = j' < k_0, and

$$|C_{\alpha_1} f(x) - C_{\alpha_0} f(x)| \le C 2^m y b_{k_0-1}^{1/2}.$$

If the process is not stopped, $|\alpha_1| > 4|I|/2^N$ and we can continue with $C_{\alpha_1} f(x)$ in the same way we have proceeded with $C_{\alpha_0} f(x)$. Then we stop the process or define a new α_2, j_2 and k_2. This process must terminate because the position of x implies that the level is always a natural number, but j_0 > j_1 > j_2 > ···.

We can repeat the reasoning to obtain a sequence of pairs α_0, α_1, α_2, ..., α_s and the corresponding numbers

$$j_0 \ge k_0 > j_1 \ge k_1 > j_2 \ge k_2 > j_3 \ge \dots > j_s \ge k_s.$$

Now when we stop the process we shall have

$$|C_{\alpha_s} f(x)| \le C 2^m y b_{k_s-1}^{1/2}.$$

This is because the situation depicted in 2.2.1 never happens. In fact, before that can happen we stop the process at the point 2.1. The reason for this is connected with the selection of θ. Assume that α_s exists. We have $\lambda(\alpha_{r+1}) \le (1 + b_{j_r}^{1/2})\lambda(\alpha_r)$. Therefore

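$$\lambda(\alpha_s) \le \prod_{\iota=0}^{s-1}\bigl(1 + b_{j_\iota}^{1/2}\bigr)\lambda(\alpha_0) \le \prod_{j=1}^{+\infty}\bigl(1 + b_j^{1/2}\bigr)\lambda(\alpha_0) = C\lambda(\alpha_0).$$

Take θ so that θC < 1/4. We have

$$2\pi\frac{n(\alpha_s)}{|\alpha_s|} \le 2\pi C\frac{n(\alpha_0)}{|\alpha_0|}.$$

Hence, if n(α_s) ≠ 0, then n(α_s) ≥ 1, and we get

$$|\alpha_s| \ge \frac{1}{C}\frac{|\alpha_0|}{n(\alpha_0)} \ge \frac{1}{C}\frac{|\alpha_0|}{\theta 2^N} > \frac{4|I|}{2^N}.$$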

Therefore we arrive at n(α_s) = 0 before we arrive at $|\alpha_s| \le 4|I|/2^N$. The Carleson integrals $C_{\alpha_\iota} f(x)$ satisfy the inequalities

$$|C_{\alpha_{\iota+1}} f(x) - C_{\alpha_\iota} f(x)| \le C 2^m y b_{k_\iota-1}^{1/2}, \qquad 0 \le \iota \le s-1.$$

Since we have excluded the only case where this is not true, the last integral $C_{\alpha_s} f(x)$ is bounded by $|C_{\alpha_s} f(x)| \le C 2^m y b_{k_s-1}^{1/2}$. Collecting these results we have

$$|C_{\alpha_0} f(x)| \le \sum_{\iota=0}^{s-1}|C_{\alpha_{\iota+1}} f(x) - C_{\alpha_\iota} f(x)| + |C_{\alpha_s} f(x)| \le C 2^m y\sum_{\iota=0}^{s} b_{k_\iota-1}^{1/2} \le C 2^m y\sum_{k=1}^{\infty} b_k^{1/2} = B 2^m y.$$
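The constant C in the bound λ(α_s) ≤ Cλ(α_0) is an infinite product whose convergence depends only on the fast decay of the sequence (b_j). The following minimal sketch illustrates this numerically. The concrete sequence b_j = 2·2^(−2^j) is an assumption made only for illustration (the sequence is fixed in an earlier chapter and is not restated in this section), and the function names are hypothetical.

```python
import math

def b(j):
    # Assumed sample sequence, for illustration only: b_j = 2 * 2**(-2**j).
    return 2.0 * 2.0 ** (-(2 ** j))

def partial_product(terms):
    # Partial product of (1 + b_j**(1/2)) for j = 1, ..., terms.
    result = 1.0
    for j in range(1, terms + 1):
        result *= 1.0 + math.sqrt(b(j))
    return result

# With such a fast decay a handful of factors already determines the product.
C_approx = partial_product(9)

# Any theta with theta * C < 1/4 is admissible in the argument above.
theta = 0.99 / (4.0 * C_approx)
```

Under this assumed sequence the product stabilizes after five or six factors, so the requirement θC < 1/4 is easy to satisfy with an explicit θ.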

[Figure: Flow diagram of the proof]

11. Spaces of functions

11.1 Introduction

Even if we are only interested in $S^* f$ when $f \in L^p$, we shall need some other spaces of functions. Since these spaces are not usually included in a first course of real analysis we give here a brief summary of their definitions and properties. The reader can find a more complete study in the references Hunt [22] and Bennett and Sharpley [3]. We will define the Lorentz spaces $L^{p,1}(\mu)$ and $L^{p,\infty}(\mu)$. In Theorem 11.6, we prove that $L^{p,1}(\mu)$ is the smallest rearrangement invariant space of measurable functions such that $\|\chi_M\| = \|\chi_M\|_p$. We express this property by saying that it is an atomic space. Then we see $L^{p,\infty}(\mu)$ as its dual space; this dictates our selection of the norm $\|\cdot\|_{p,\infty}$. The proof that we present of the duality Theorem 11.7 may appear unnecessarily complicated, but it has the merit of giving absolute constants. We present Marcinkiewicz's Theorem with special attention to the constants that appear, since they will play a role in our theorems on Fourier series. We end the chapter by studying a class of spaces near $L^1(\mu)$ that play a prominent role in the next chapter. We prove that they are atomic spaces, a fact that allows very neat proofs in the following chapter.

11.2 Decreasing rearrangement

The functions we are considering will be defined on an interval X of R and we will always consider the normalized Lebesgue measure μ on this interval. Therefore μ is a probability measure, μ(X) = 1. Given a measurable function f: X → R we consider its distribution function $\mu_f(y) = \mu\{|f| > y\}$. The function $\mu_f$: [0, +∞) → [0, 1] is decreasing and right-continuous. Observe that if $(f_n)$ is a sequence of measurable functions such that $|f_n|$ is an increasing sequence converging to |f|, then $\mu_{f_n}$ is increasing and converges to $\mu_f$.

J.A. de Reyna: LNM 1785, pp. 127–143, 2002. c Springer-Verlag Berlin Heidelberg 2002


Apart from μf we also consider the decreasing rearrangement of f that is deﬁned as f ∗ (t) = m{y > 0 : μf (y) > t}.

If f is a positive simple function, there is a decreasing finite sequence of measurable sets $(A_j)_{j=1}^n$ and positive real numbers $s_j$ such that $f = \sum_{j=1}^n s_j\chi_{A_j}$. (If the non null values that f attains are $a_1 > a_2 > \dots > a_n$, and $a_{n+1} = 0$, we can take $A_j = \{f > a_{j+1}\}$ and $s_j = a_j - a_{j+1}$.) Then it is easy to see that $f^* = \sum_{j=1}^n s_j\chi_{[0,\mu(A_j))}$.

Proposition 11.1 For every measurable function f: X → [0, +∞] and every measurable set A

$$\int_A f\,d\mu \le \int_0^{\mu(A)} f^*(t)\,dt.$$

Proof. Since |f| ≤ |g| implies $f^* \le g^*$ we only need to prove the case where f is a simple function. Then with the above representation we get

$$\int_A f\,d\mu = \sum_{j=1}^n s_j\,\mu(A \cap A_j); \qquad \int_0^{\mu(A)} f^*(t)\,dt = \sum_{j=1}^n s_j\inf\{\mu(A_j), \mu(A)\}.$$

The comparison between these quantities is trivial.
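Proposition 11.1 can be checked by brute force in a discrete model. The sketch below is only an illustration under stated assumptions: X = [0, 1) is split into n equal cells, f is constant on each cell, and the helper names are hypothetical. In this model the decreasing rearrangement is obtained simply by sorting the values of |f| in decreasing order.

```python
from itertools import combinations

def decreasing_rearrangement(values):
    # f* for a step function taking values[i] on the i-th of n equal cells.
    return sorted((abs(v) for v in values), reverse=True)

def integral_over_cells(values, cells):
    # Integral of |f| over the union of the chosen cells (each has measure 1/n).
    return sum(abs(values[i]) for i in cells) / len(values)

def integral_of_rearrangement(values, k):
    # Integral of f* over [0, k/n).
    return sum(decreasing_rearrangement(values)[:k]) / len(values)

values = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]

# Check the inequality of Proposition 11.1 for every union A of cells.
holds = all(
    integral_over_cells(values, A) <= integral_of_rearrangement(values, k) + 1e-12
    for k in range(len(values) + 1)
    for A in combinations(range(len(values)), k)
)
```

Equality is attained when A consists of the k cells carrying the largest values of |f|, which is the discrete counterpart of the sets used later to compute the supremum of mean values.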

Theorem 11.2 (Hardy and Littlewood) For all measurable functions f and g: X → C we have

$$\Bigl|\int_X fg\,d\mu\Bigr| \le \int_0^1 f^*(t)g^*(t)\,dt.$$

Proof. Since $(|f|)^* = f^*$ we can assume that f and g are positive. First assume that f is a simple function and consider its representation as above. We shall have

$$\int_X fg\,d\mu = \sum_{j=1}^n s_j\int_{A_j} g\,d\mu \le \sum_{j=1}^n s_j\int_0^{\mu(A_j)} g^*(t)\,dt = \int_0^1 f^*(t)g^*(t)\,dt.$$

For a general f let $(f_n)$ be an increasing sequence of positive simple functions converging to f. We shall have

$$\int_X f_n g\,d\mu \le \int_0^1 f_n^* g^*\,dm \le \int_0^1 f^* g^*\,dm.$$

And now we can apply the monotone convergence theorem.
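In the discrete model where X = [0, 1) is divided into n equal cells, the Hardy and Littlewood inequality reduces to the classical rearrangement inequality for finite sequences. The following sketch tests it on random data; the model and the names `rearr` and `hl_gap` are assumptions made for illustration.

```python
import random

def rearr(values):
    # Decreasing rearrangement of a step function on n equal cells.
    return sorted((abs(v) for v in values), reverse=True)

def hl_gap(f, g):
    # Right-hand side minus left-hand side of the Hardy-Littlewood inequality.
    n = len(f)
    lhs = sum(abs(a * b) for a, b in zip(f, g)) / n
    rhs = sum(a * b for a, b in zip(rearr(f), rearr(g))) / n
    return rhs - lhs

rng = random.Random(0)
gaps = [
    hl_gap(
        [rng.uniform(-1.0, 1.0) for _ in range(16)],
        [rng.uniform(-1.0, 1.0) for _ in range(16)],
    )
    for _ in range(200)
]
```

A nonnegative gap in every trial is consistent with the theorem; the gap vanishes when |f| and |g| are ordered in the same way.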


Proposition 11.3 A measurable function f: X → C is equimeasurable with $f^*$, that is, for every y > 0 we have $\mu\{|f| > y\} = m\{f^* > y\}$.

Proof. Since $f^*$ is decreasing, $m\{f^* > y\} = \sup\{t : f^*(t) > y\}$. By the same reasoning $f^*(t) = m\{\mu_f(s) > t\} = \sup\{s : \mu_f(s) > t\}$. Hence we get $m\{f^* > y\} = \sup\{t : \text{there is } s > y \text{ with } \mu_f(s) > t\}$. Since $\mu\{|f| > s_n\} \to \mu\{|f| > y\}$ for every decreasing sequence $(s_n)$ converging to y, we get $m\{f^* > y\} = \sup\{t : \mu_f(y) > t\} = \mu_f(y)$.

Proposition 11.4 If ϕ is a nonnegative, Borel measurable function on [0, +∞), we have

$$\int_X \varphi(|f|)\,d\mu = \int_0^1 \varphi(f^*(t))\,dt,$$

for every measurable function f.

Proof. Let $\nu_1$ be the Borel measure on [0, +∞) image of μ by the function |f|, and let $\nu_2$ be the Borel measure on [0, +∞) image of m by the function $f^*$. The two functions |f| and $f^*$ are equimeasurable: $m\{t \in (0, 1) : f^*(t) > s\} = \mu\{x \in X : |f(x)| > s\}$. Therefore, for s > 0, we have $\nu_1(s, +\infty) = \nu_2(s, +\infty)$. Since $\nu_1$ and $\nu_2$ are probabilities we have also $\nu_1\{0\} = \nu_2\{0\}$. Therefore the two image measures are the same. Then since ϕ is Borel measurable and positive we have

$$\int_X \varphi(|f|)\,d\mu = \int_{[0,+\infty)} \varphi\,d\nu_1 = \int_{[0,+\infty)} \varphi\,d\nu_2 = \int_0^1 \varphi(f^*)\,dm.$$

For every measurable function f: X → C we define $f^{**}(t)$ for t > 0 in the following way:

$$f^{**}(t) = \frac{1}{t}\int_0^t f^*(s)\,ds.$$

This function is also the supremum of the mean values of f.

Proposition 11.5 For every measurable function f,

$$f^{**}(t) = \sup\Bigl\{\frac{1}{\mu(A)}\int_A |f|\,d\mu : \mu(A) = t\Bigr\}.$$


Proof. By Proposition 11.1, for every measurable set A with μ(A) = t,

$$\int_A |f|\,d\mu \le \int_0^t f^*(s)\,ds.$$

Therefore the supremum of the mean values of |f| is less than or equal to $f^{**}(t)$. To prove the equality, first assume that there is a y > 0 such that $\mu\{|f| > y\} = m\{f^* > y\} = t$. Then we apply Proposition 11.4 with $\varphi(t) = t\,\chi_{(y,+\infty)}(t)$. If we denote by A the set {|f| > y} we get

$$\int_A |f|\,d\mu = \int_X \varphi(|f|)\,d\mu = \int_0^1 \varphi(f^*)\,dm = \int_0^t f^*\,dm.$$

In the other case there is some point $y_0$ such that $\mu\{|f| > y\} < t$ for $y > y_0$ and $\mu\{|f| > y\} > t$ for $y < y_0$. Then there is a set of positive measure on X where $|f| = y_0$. Hence $f^*$ takes this value on a set of positive measure. Then we can obtain a set $A = A_0 \cup A_1$ where $A_0 = \{|f| > y_0\}$ and $A_1 \subset \{|f| = y_0\}$, and such that μ(A) = t. It is easy to see that in this case we also obtain the equality.

Therefore we have $(f + g)^{**}(t) \le f^{**}(t) + g^{**}(t)$.
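The subadditivity of f ↦ f** is what makes the norms of this chapter satisfy the triangle inequality. In the discrete model with n equal cells, f**(k/n) is the mean of the k largest values of |f|, so the inequality can be tested directly. This is a minimal sketch; the model and the function name are assumptions for illustration.

```python
import random

def double_star(values, k):
    # f**(k/n): mean of the k largest values of |f| on n equal cells.
    top = sorted((abs(v) for v in values), reverse=True)[:k]
    return sum(top) / k

rng = random.Random(1)
n = 12
subadditive = True
for _ in range(200):
    f = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    g = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    h = [a + b for a, b in zip(f, g)]
    for k in range(1, n + 1):
        # (f + g)** <= f** + g** at every point t = k/n.
        if double_star(h, k) > double_star(f, k) + double_star(g, k) + 1e-12:
            subadditive = False
```

Note that the analogous inequality for f* itself fails in general, which is why the text works with f** when proving the triangle inequality.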

11.3 The Lorentz spaces $L^{p,1}(\mu)$ and $L^{p,\infty}(\mu)$

First we consider the Lorentz spaces $L^{p,1}(\mu)$. It is the set of those measurable functions f: X → C such that

$$\|f\|_{p,1} = \frac{1}{p}\int_0^1 t^{1/p} f^*(t)\,\frac{dt}{t} = \int_0^1 f^*(t^p)\,dt < +\infty.$$

Every function $f \in L^{p,1}(\mu)$ is in $L^p(\mu)$. In the case that p = 1 there is nothing to prove. For p > 1 let q be the conjugate exponent. Then

$$\|f\|_p = \sup_{\|g\|_q \le 1}\Bigl|\int fg\,d\mu\Bigr| \le \int_0^1 f^* g^*\,dm.$$

It is easy to see that for every such g we have $g^*(t) \le t^{-1/q}$. Therefore we get $\|f\|_p \le p\|f\|_{p,1}$. It can be proved that the above inequality can be improved, removing the coefficient p.

Now we show that $\|f\|_{p,1}$ is a norm and $L^{p,1}(\mu)$ is a Banach space. From the equality $(\lambda f)^* = |\lambda| f^*$, we get $\|\lambda f\|_{p,1} = |\lambda|\,\|f\|_{p,1}$. To prove the triangle inequality we bring in $f^{**}$,

$$\|f + g\|_{p,1} = p^{-1}\int_0^1 t^{-1/q}(f + g)^*\,dt = p^{-1}\Bigl[t^{-1/q}\,t\,(f + g)^{**}\Bigr]_0^1 + p^{-1}q^{-1}\int_0^1 t\,(f + g)^{**}\,t^{-1-1/q}\,dt.$$

Since $(f + g) \in L^p$ we know that $\lim_{t \to 0^+} t^{1/p}(f + g)^{**}(t) = 0$. Hence

$$\|f + g\|_{p,1} = p^{-1}(f + g)^{**}(1) + p^{-1}q^{-1}\int_0^1 (f + g)^{**}\,t^{-1/q}\,dt.$$

We have seen that $(f + g)^{**} \le f^{**} + g^{**}$. It follows that $\|\cdot\|_{p,1}$ is a norm.

To prove that the space is complete it suffices to prove that for every sequence of functions $(f_j)$ with $\sum_{j=1}^\infty \|f_j\|_{p,1} < +\infty$, the series $\sum_{j=1}^\infty f_j$ is absolutely convergent almost everywhere on X, and that the sum S of the series satisfies $\|S\|_{p,1} \le \sum_{j=1}^\infty \|f_j\|_{p,1}$. To prove this assertion we can assume that $f_j \ge 0$ for every j ∈ N. Then the partial sums $S_j$ are increasing, therefore $S_j^*$ is increasing and converges to $S^*$. Hence

$$p^{-1}\int_0^1 t^{-1/q} S^*(t)\,dt = \lim_{N \to +\infty} p^{-1}\int_0^1 t^{-1/q} S_N^*(t)\,dt \le \sum_{j=1}^\infty \|f_j\|_{p,1}.$$

For a characteristic function $\chi_A$ a simple computation shows that $\|\chi_A\|_{p,1}$ is equal to $\|\chi_A\|_p$. The space $L^{p,1}(\mu)$ (p > 1) can be defined as the smallest Banach space with this property. This is the content of the following theorem.

Theorem 11.6 There is an absolute constant C such that for every $f \in L^{p,1}(\mu)$ there exist a sequence of measurable sets $(A_j)_{j=1}^\infty$ and numbers $(a_j)_{j=1}^\infty$ such that

$$f = \sum_{j=1}^\infty a_j\chi_{A_j}, \qquad \|f\|_{p,1} \le \sum_{j=1}^\infty |a_j|\,\mu(A_j)^{1/p} \le C\|f\|_{p,1}.$$
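The identity for characteristic functions used to motivate Theorem 11.6 can be checked numerically from the second expression of the norm, the integral of f*(t^p) over (0, 1). For f = χ_A with μ(A) = m the rearrangement is the indicator of [0, m), so the norm should equal m^(1/p). The sketch below uses a plain midpoint rule; the function names and the sample values of m and p are illustrative assumptions.

```python
def lorentz_p1_norm(f_star, p, steps=100_000):
    # Midpoint rule for the integral of f*(t**p) over (0, 1).
    h = 1.0 / steps
    return h * sum(f_star(((i + 0.5) * h) ** p) for i in range(steps))

def char_rearrangement(m):
    # Decreasing rearrangement of the characteristic function of a set of measure m.
    return lambda t: 1.0 if t < m else 0.0

m, p = 0.3, 2.0
approx = lorentz_p1_norm(char_rearrangement(m), p)
exact = m ** (1.0 / p)  # the L^p norm of the characteristic function
```

The form with f*(t^p) is used here because it avoids the integrable singularity of t^(1/p − 1) at the origin.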

Proof. Given the measurable function f, by induction we can define a partition of X into measurable sets $(A_j)_{j=1}^\infty$ such that $\mu(A_j) = (e^p - 1)e^{-pj}$ and for every $x \in A_j$, $y \in A_k$ with j < k we have |f(x)| ≤ |f(y)|. Define $a_j = \sup_{x \in A_j} |f(x)|$.

We can easily check that $f^* \ge \sum_{j=1}^\infty a_j\chi_{I_j}$, where $I_j$ denotes the interval $I_j = [e^{-p(j+1)}, e^{-pj})$. Therefore

$$\|f\|_{p,1} \ge (1 - e^{-1})\sum_{j=1}^\infty a_j e^{-j}.$$

By the construction of $A_j$ we have $a_1 \le a_2 \le \cdots$. If we assume that f is not equivalent to 0, there exists some N such that $a_1 = \dots = a_{N-1} = 0$ and $a_N > 0$. Then we can write

$$f = \sum_{j=N}^\infty a_j f_j, \qquad \text{where } f_j = a_j^{-1} f\chi_{A_j}, \quad \|f_j\|_\infty \le 1.$$

Since every function with values in [0, 1) can be written as $\sum_{j=1}^\infty 2^{-j}\chi_{T_j}$, we get

$$f_j = \sum_{k=1}^\infty \beta_{j,k}\chi_{T_{j,k}}, \qquad T_{j,k} \subset A_j, \qquad \sum_{k=1}^\infty |\beta_{j,k}| \le 4.$$

Therefore we obtain the expression

$$f = \sum_{j,k=1}^\infty a_j\beta_{j,k}\chi_{T_{j,k}}.$$

This is the decomposition we are seeking. In fact

$$\sum_{j,k} a_j|\beta_{j,k}|\,\mu(T_{j,k})^{1/p} \le 4\sum_{j=1}^\infty a_j\,\mu(A_j)^{1/p} = 4(e^p - 1)^{1/p}\sum_{j=1}^\infty a_j e^{-j} \le \frac{4e}{1 - e^{-1}}\|f\|_{p,1}.$$

In particular we have proved the density of the simple functions in the space $L^{p,1}(\mu)$.

Remark. I stress the fact that C is an absolute constant in the previous theorem. It follows, for example, that the inequality $\|f\|_p \le p\|f\|_{p,1}$ obtained above can now be improved. In fact, under the conditions of the theorem, we have $\|f\|_p \le \sum |a_j|\,\|\chi_{A_j}\|_p \le C\|f\|_{p,1}$.

Now we consider the dual space. We define $L^{p,\infty}(\mu)$ as the set of measurable functions f such that $\sup_{0<t<1} t^{1/p} f^*(t) < +\infty$, and for p > 1 we take as norm $\|f\|_{p,\infty} = \sup_{0<t<1} t^{1/p} f^{**}(t)$. For p > 1, these two quantities are equivalent. Indeed if we assume

$$\sup_{0<t<1} t^{1/p} f^*(t) = C,$$

then

$$f^{**}(t) = \frac{1}{t}\int_0^t f^*(s)\,ds \le \frac{Cp}{p-1}\,t^{-1/p}. \qquad (11.1)$$

It is easy to see that, as we have defined it, $\|\cdot\|_{p,\infty}$ is a norm and $L^{p,\infty}(\mu)$ a vector space. For p = 1, we can see that $L^{1,\infty}(\mu)$ is a vector space. It is the weak $L^1$ space. It can be shown that it is not a normed space. Therefore for p = 1 we put

$$\|f\|_{1,\infty} = \sup_{0<t<1} t^{1/p} f^*(t).$$

This is not a norm but a quasi-norm. Now it is straightforward to see that for p > 1 the space $L^{p,\infty}(\mu)$ is a Banach space. Also it is clear that $L^p(\mu) \subset L^{p,\infty}(\mu)$. In fact, by Hölder's inequality,

$$t^{1/p} f^{**}(t) = t^{1/p-1}\int_0^t f^*(s)\,ds \le t^{1/p-1}\,t^{1/q}\,\|f^*\|_p = \|f\|_p.$$

Therefore $\|f\|_{p,\infty} \le \|f\|_p$.

Theorem 11.7 For p > 1, $L^{q,\infty}(\mu)$ is the dual space of $L^{p,1}(\mu)$, where q is the conjugate exponent to p.

Proof. Given $g \in L^{q,\infty}(\mu)$, for every $f \in L^{p,1}(\mu)$ we put $f = \sum_{j=1}^\infty a_j\chi_{A_j}$, with $\sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_p \le C\|f\|_{p,1}$. Then

$$\Bigl|\int_X fg\,d\mu\Bigr| \le \sum_{j=1}^\infty |a_j|\int_{A_j} |g|\,d\mu \le \sum_{j=1}^\infty |a_j|\,\mu(A_j)\,g^{**}(\mu(A_j)) \le \sum_{j=1}^\infty |a_j|\,\mu(A_j)^{1-1/q}\,\|g\|_{q,\infty} \le \sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_p\,\|g\|_{q,\infty} \le C\|f\|_{p,1}\|g\|_{q,\infty}.$$

This allows us to identify every function $g \in L^{q,\infty}(\mu)$ with a continuous linear functional defined on $L^{p,1}(\mu)$.

Let u be a continuous linear functional on $L^{p,1}(\mu)$. Put $\nu(A) = u(\chi_A)$ for every measurable set A. Then ν is an additive set function. Since $|\nu(A)| \le \|u\|\,\|\chi_A\|_{p,1} = \|u\|\mu(A)^{1/p}$, the function ν is a signed measure and ν ≪ μ. By the Radon-Nikodym Theorem there is a measurable function g such that $\nu(A) = \int_A g\,d\mu$ for every measurable set A. This function is in $L^{q,\infty}(\mu)$. In fact we have

$$\Bigl|\int_A g\,d\mu\Bigr| \le \|u\|\mu(A)^{1/p}.$$

From this we derive that $g^{**}(t) \le \|u\|t^{-1/q}$. Therefore $\|g\|_{q,\infty} \le \|u\|$.


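It follows that for every simple function ϕ we have $u(\varphi) = \int \varphi g\,d\mu$. By the density of the simple functions and the inequality $\bigl|\int fg\,d\mu\bigr| \le C\|f\|_{p,1}\|g\|_{q,\infty}$, it follows that $u(f) = \int fg\,d\mu$ for every $f \in L^{p,1}(\mu)$. Therefore $L^{q,\infty}(\mu)$ is the dual space. Also, if we denote by $\|\cdot\|^\bullet_{q,\infty}$ the dual norm in the space $L^{q,\infty}(\mu)$,

$$\|g\|^\bullet_{q,\infty} = \sup_{\|f\|_{p,1} \le 1}\Bigl|\int_X fg\,d\mu\Bigr|,$$

then we have proved that for some absolute constant C we have

$$\|g\|_{q,\infty} \le \|g\|^\bullet_{q,\infty} \le C\|g\|_{q,\infty}.$$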

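11.4 Marcinkiewicz Interpolation Theorem

To prove this interpolation theorem we shall need the following inequality.

Theorem 11.8 (Hardy's inequality) For every positive real function f defined on (0, +∞), and every $1 \le p_1 < p < p_0 < +\infty$ we have

$$\Bigl(\int_0^\infty t^{-p/p_1}\Bigl(\int_0^t f(s)s^{1/p_1-1}\,ds\Bigr)^p dt\Bigr)^{1/p} \le \Bigl(\frac{1}{p_1} - \frac{1}{p}\Bigr)^{-1}\Bigl(\int_0^\infty f(s)^p\,ds\Bigr)^{1/p},$$

$$\Bigl(\int_0^\infty t^{-p/p_0}\Bigl(\int_t^{+\infty} f(s)s^{1/p_0-1}\,ds\Bigr)^p dt\Bigr)^{1/p} \le \Bigl(\frac{1}{p} - \frac{1}{p_0}\Bigr)^{-1}\Bigl(\int_0^\infty f(s)^p\,ds\Bigr)^{1/p}.$$

Proof. Both inequalities are proved in the same way. For example, to prove the first one, we apply Hölder's inequality to the inner integral (p' denotes the conjugate exponent of p):

$$\int_0^t f(s)s^{-1/p'}s^{1/p_1-1/p}\,ds \le \Bigl(\int_0^t f(s)^p s^{1/p_1-1/p}\,ds\Bigr)^{1/p}\Bigl(\frac{t^{1/p_1-1/p}}{1/p_1 - 1/p}\Bigr)^{1-1/p}.$$

Then, if we denote by I the first term of the inequality, we get

$$I^p \le \Bigl(\frac{1}{p_1} - \frac{1}{p}\Bigr)^{1-p}\int_0^\infty\Bigl(\int_0^t f(s)^p s^{1/p_1-1/p}\,ds\Bigr)t^{1/p-1/p_1-1}\,dt.$$

After applying Fubini's Theorem we get

$$I^p \le \Bigl(\frac{1}{p_1} - \frac{1}{p}\Bigr)^{1-p}\int_0^{+\infty} f(s)^p s^{1/p_1-1/p}\Bigl(\int_s^{+\infty} t^{1/p-1/p_1-1}\,dt\Bigr)ds \le \Bigl(\frac{1}{p_1} - \frac{1}{p}\Bigr)^{-p}\int_0^\infty f(s)^p\,ds.$$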


Now we are ready to prove Marcinkiewicz's Theorem. We say that an operator S mapping a vector space of measurable functions to measurable functions is sublinear if |S(f + g)| ≤ |Sf| + |Sg|, and for every scalar a, we have |S(af)| = |a||Sf|.

Theorem 11.9 (Marcinkiewicz) Let S be a sublinear operator defined on $L^{p_0,1}(\mu) + L^{p_1,1}(\mu)$ where $+\infty > p_0 > p_1 > 1$. Assume that there exist constants $M_0$ and $M_1$ such that

$$\|Sf\|_{p_0,\infty} \le M_0\|f\|_{p_0,1}, \qquad \text{and} \qquad \|Sf\|_{p_1,\infty} \le M_1\|f\|_{p_1,1}.$$

Then, for every $p \in (p_1, p_0)$, S: $L^p(\mu) \to L^p(\mu)$ is continuous with norm

$$\|S\|_p \le \frac{p(p_0 - p_1)}{(p_0 - p)(p - p_1)}M_0^{1-\theta}M_1^\theta, \qquad \text{where} \quad \frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}.$$

Proof. Let f be a function in $L^p(\mu)$. First we bound $(Sf)^{**}(t)$ using the hypotheses that S maps $L^{q,1}(\mu) \to L^{q,\infty}(\mu)$ for $q = p_0$ and $q = p_1$. We decompose f into two functions, $f = f_0 + f_1$, in the following way:

$$f_0(s) = \begin{cases} f(s) & \text{if } |f(s)| \le f^*(at),\\ f^*(at)\,\mathrm{sgn}(f(s)) & \text{if } |f(s)| \ge f^*(at);\end{cases} \qquad f_1(s) = \begin{cases} 0 & \text{if } |f(s)| \le f^*(at),\\ f(s) - f^*(at)\,\mathrm{sgn}(f(s)) & \text{if } |f(s)| \ge f^*(at).\end{cases}$$

Here a is a parameter that we will choose later. It is easy to see that the decreasing rearrangements of these functions are given by

$$f_0^*(s) = \begin{cases} f^*(s) & \text{if } s \ge at,\\ f^*(at) & \text{if } s \le at;\end{cases} \qquad f_1^*(s) = \begin{cases} 0 & \text{if } s \ge at,\\ f^*(s) - f^*(at) & \text{if } s \le at.\end{cases}$$

Since $f_0 \in L^{p_0,1}$ and $f_1 \in L^{p_1,1}$ we see that S is defined on $L^p$. Since we have

$$(Sf)^{**}(t) \le (Sf_0)^{**}(t) + (Sf_1)^{**}(t) \le \frac{M_0}{p_0}t^{-1/p_0}\int_0^1 f_0^*(s)s^{1/p_0-1}\,ds + \frac{M_1}{p_1}t^{-1/p_1}\int_0^1 f_1^*(s)s^{1/p_1-1}\,ds,$$

$(Sf)^{**}(t)$ is bounded by

$$(M_0a^{1/p_0} - M_1a^{1/p_1})f^*(at) + \frac{M_0}{p_0}t^{-1/p_0}\int_{at}^1 f^*(s)s^{1/p_0-1}\,ds + \frac{M_1}{p_1}t^{-1/p_1}\int_0^{at} f^*(s)s^{1/p_1-1}\,ds.$$


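We consider $f^*(s)$, defined for s > 1, equal to 0. By Proposition 11.4, $\|Sf\|_p = \|(Sf)^*\|_p$, and since $(Sf)^* \le (Sf)^{**}$, we obtain

$$\|Sf\|_p \le \|(Sf)^{**}\|_p \le (M_0a^{1/p_0} - M_1a^{1/p_1})a^{-1/p}\|f\|_p + \frac{M_0}{p_0}\Bigl(\int_0^\infty t^{-p/p_0}\Bigl(\int_{at}^1 f^*(s)s^{1/p_0-1}\,ds\Bigr)^p dt\Bigr)^{1/p} + \frac{M_1}{p_1}\Bigl(\int_0^\infty t^{-p/p_1}\Bigl(\int_0^{at} f^*(s)s^{1/p_1-1}\,ds\Bigr)^p dt\Bigr)^{1/p}.$$

With a change of variables we get

$$\|Sf\|_p \le (M_0a^{1/p_0} - M_1a^{1/p_1})a^{-1/p}\|f\|_p + \frac{M_0}{p_0}a^{1/p_0-1/p}\Bigl(\int_0^\infty t^{-p/p_0}\Bigl(\int_t^{+\infty} f^*(s)s^{1/p_0-1}\,ds\Bigr)^p dt\Bigr)^{1/p} + \frac{M_1}{p_1}a^{1/p_1-1/p}\Bigl(\int_0^\infty t^{-p/p_1}\Bigl(\int_0^t f^*(s)s^{1/p_1-1}\,ds\Bigr)^p dt\Bigr)^{1/p}.$$

Now we apply Hardy's inequality. It follows that

$$\|Sf\|_p \le (M_0a^{1/p_0} - M_1a^{1/p_1})a^{-1/p}\|f\|_p + \frac{pM_0}{p_0 - p}a^{1/p_0-1/p}\Bigl(\int_0^\infty f^*(s)^p\,ds\Bigr)^{1/p} + \frac{pM_1}{p - p_1}a^{1/p_1-1/p}\Bigl(\int_0^\infty f^*(s)^p\,ds\Bigr)^{1/p}.$$

Therefore

$$\|Sf\|_p \le \Bigl(\frac{p_0M_0}{p_0 - p}a^{1/p_0-1/p} + \frac{p_1M_1}{p - p_1}a^{1/p_1-1/p}\Bigr)\|f\|_p.$$

Now we select the value of a. We choose a so that $M_0a^{1/p_0} = M_1a^{1/p_1}$, that is, $M_0 = M_1a^{1/p_1-1/p_0}$. With this choice we obtain

$$\|Sf\|_p \le \frac{p(p_0 - p_1)}{(p_0 - p)(p - p_1)}M_0^{1-\theta}M_1^\theta\,\|f\|_p, \qquad \text{where} \quad \frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}.$$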
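The last step of the proof is pure arithmetic and can be verified numerically: when a balances the two terms, that is, when M0·a^(1/p0) = M1·a^(1/p1), the a-dependent bound collapses to the closed form stated in the theorem. The sketch below checks this for sample values; the function names and the numerical values of p, p0, p1, M0, M1 are arbitrary illustrative assumptions.

```python
def marcinkiewicz_bound(p, p0, p1, M0, M1):
    # theta is defined by 1/p = (1 - theta)/p0 + theta/p1.
    theta = (1.0 / p0 - 1.0 / p) / (1.0 / p0 - 1.0 / p1)
    const = p * (p0 - p1) / ((p0 - p) * (p - p1))
    return theta, const * M0 ** (1.0 - theta) * M1 ** theta

def bound_at(a, p, p0, p1, M0, M1):
    # The bound obtained after Hardy's inequality, before choosing a.
    return (p0 * M0 / (p0 - p)) * a ** (1.0 / p0 - 1.0 / p) + (
        p1 * M1 / (p - p1)
    ) * a ** (1.0 / p1 - 1.0 / p)

p, p0, p1, M0, M1 = 2.0, 4.0, 1.5, 3.0, 5.0
theta, closed_form = marcinkiewicz_bound(p, p0, p1, M0, M1)

# Balancing choice: M0 * a**(1/p0) == M1 * a**(1/p1).
a = (M0 / M1) ** (1.0 / (1.0 / p1 - 1.0 / p0))
balanced = bound_at(a, p, p0, p1, M0, M1)
```

The factor p/((p0 − p)(p − p1)) blows up as p approaches either endpoint, which is the dependence on p that the text tracks for later use.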

Remark. In fact we have proved that

$$\|Sf\|_{p,p} = \Bigl(\int_0^\infty (Sf)^{**}(t)^p\,dt\Bigr)^{1/p}$$

is bounded. This is an equivalent norm, but Hardy's inequality gives

$$\|f\|_p \le \|f\|_{p,p} \le \frac{p}{p-1}\|f\|_p.$$


In the above theorem we assume that $\|Sf\|_{p,\infty} \le M\|f\|_{p,1}$. The norm $\|Sf\|_{p,\infty}$ refers to $(Sf)^{**}$, and it is usually easier to bound $(Sf)^*$. Another problem with the theorem as we have given it is that we have excluded the case $p_1 = 1$, because in this case $\|Sf\|_{1,\infty}$ is not a norm, and is defined in another way. Therefore we give another version.

Theorem 11.10 (Marcinkiewicz) Let S be a sublinear operator defined on $L^{p_0,1}(\mu) \cup L^{p_1,1}(\mu)$ where $+\infty > p_0 > p_1 \ge 1$. Assume that there exist constants $M_0$ and $M_1$ such that

$$\sup_{0<t<1} t^{1/p_0}(Sf)^*(t) \le M_0\|f\|_{p_0,1}, \qquad \text{and} \qquad \sup_{0<t<1} t^{1/p_1}(Sf)^*(t) \le M_1\|f\|_{p_1,1}.$$

Then, for every $p \in (p_1, p_0)$, S: $L^p(\mu) \to L^p(\mu)$ is continuous with norm

$$\|S\|_p \le 2^{1/p}\frac{p(p_0 - p_1)}{(p_0 - p)(p - p_1)}M_0^{1-\theta}M_1^\theta, \qquad \text{where} \quad \frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}.$$

Proof. We follow the same procedure as in the previous theorem. Instead of a bound for $(Sf)^{**}(t)$ we obtain a bound for $(Sf)^*(2t)$. It is easy to see that $(Sf)^*(2t) \le (Sf_0)^*(t) + (Sf_1)^*(t)$. Observe that $\|Sf\|_p = \|(Sf)^*\|_p$, and

$$2^{-1/p}\|(Sf)^*\|_p = \Bigl(\int_0^\infty (Sf)^*(2t)^p\,dt\Bigr)^{1/p} \le \|(Sf_0)^* + (Sf_1)^*\|_p.$$

Then we follow the same reasoning as before.

11.5 Spaces near $L^1(\mu)$

We can define many spaces between $L^1(\mu)$ and $\bigcup_{p>1} L^p(\mu)$. They will play a role in the problem of the almost everywhere convergence of Fourier series, since every f in the last space has an a.e. convergent Fourier series, and by Kolmogorov's example there exists a function in $L^1$ whose Fourier series is everywhere divergent. We shall define a space that we call Lϕ(L), where ϕ: [0, +∞) → [0, +∞) will be a function such that:

(1) There exists a constant C > 0 such that for every t > 0, $\varphi(t^2) \le C\varphi(t)$.
(2) ϕ(t) is absolutely continuous and $\varphi'(t) \ge 0$ a.e.
(3) ϕ(0) = 0.
(4) $\lim_{t \to +\infty} \varphi(t) = +\infty$.


The space Lϕ(L) will be the set of measurable functions such that

$$\int_X |f|\varphi(|f|)\,d\mu < +\infty.$$

Proposition 11.11 Assuming that ϕ satisfies the above conditions, Lϕ(L) is a Banach space whose norm is given by

$$\|f\|_{L\varphi(L)} = \int_0^1 f^*(t)\varphi(1/t)\,dt = \int_0^{+\infty}\frac{f^{**}(t)}{t}\,\varphi'\Bigl(\frac{1}{t}\Bigr)\,dt.$$

Proof. The equality of the two expressions given for the norm is a consequence of Fubini's Theorem applied to the function

$$\frac{1}{t^2}\,\varphi'\Bigl(\frac{1}{t}\Bigr) f^*(s)\,\chi_{\{0<s<t\}}$$

(with $f^*(s) = 0$ for s > 1). Then since $\varphi' \ge 0$, the second expression and the known properties of $f^{**}$ prove that $\|f\|_{L\varphi(L)}$ is a norm.

Now we can prove that the norm is finite precisely in the set Lϕ(L). In fact, for a given measurable f, we define $A = \{f^*(t)^2 > 1/t\}$ and we obtain, by property (1),

$$\int_0^1 f^*(t)\varphi(1/t)\,dt \le C\int_A f^*(t)\varphi(f^*(t))\,dt + \int_0^1 \frac{1}{\sqrt{t}}\varphi(1/t)\,dt \le C\int_X |f|\varphi(|f|)\,d\mu + 2\int_0^1 \varphi(x^{-2})\,dx.$$

The last integral is finite. In fact, it is comparable to

$$\sum_{n=1}^\infty \frac{1}{2^n}\varphi(2^{2n}) \le \sum_{n=1}^\infty \frac{e^{\alpha\log n}}{2^n} < +\infty.$$

(We have used repeatedly the condition (1).) Therefore the norm is finite for every function in Lϕ(L).

On the other hand, if $\|f\|_{L\varphi(L)} < +\infty$, then $\|f\|_1 < +\infty$. Hence $tf^*(t) \le \|f\|_1$. Therefore

$$\int_X |f|\varphi(|f|)\,d\mu = \int_0^1 f^*(t)\varphi(f^*(t))\,dt \le \int_0^1 f^*(t)\varphi(\|f\|_1/t)\,dt.$$

Now if $\|f\|_1 \le 1$ we will have

$$\int_X |f|\varphi(|f|)\,d\mu \le \int_0^1 f^*(t)\varphi(1/t)\,dt = \|f\|_{L\varphi(L)}.$$

And if $\|f\|_1 > 1$ we will have

$$\int_X |f|\varphi(|f|)\,d\mu \le \|f\|_1\int_0^1 f^*(\|f\|_1 t)\varphi(1/t)\,dt \le \|f\|_1\int_0^1 f^*(t)\varphi(1/t)\,dt \le \|f\|_1\|f\|_{L\varphi(L)} < +\infty.$$

The proof that Lϕ(L) is a Banach space can be given as in the $L^p(\mu)$ case. Given a sequence of functions $(f_n)$ such that $\sum_n \|f_n\|_{L\varphi(L)} < +\infty$, we prove that the series $\sum_n f_n$ is absolutely convergent a.e. and defines a measurable function F. Then it is easy to prove that $F = \sum_n f_n$ in the space Lϕ(L).

An important piece of information about the space Lϕ(L) is the value of the norm of a characteristic function. We have

$$\frac{1}{2}\mu(M)\varphi(2/\mu(M)) \le \|\chi_M\|_{L\varphi(L)} \le C_\varphi\,\mu(M)\varphi(2/\mu(M)).$$

This follows from the following

Lemma 11.12 There exists a constant $C_\varphi$ such that

$$\frac{x}{2}\varphi(2/x) \le \int_0^x \varphi(1/t)\,dt \le C_\varphi\,x\,\varphi(2/x), \qquad 0 < x < 1.$$

Proof. Since $\varphi(t^2) \le C\varphi(t)$, we have

$$\frac{x}{2}\varphi(2/x) \le \int_0^x \varphi(1/t)\,dt \le \int_{x^2}^x \varphi(1/t)\,dt + \int_0^{x^2}\varphi(1/t)\,dt \le C(x - x^2)\varphi(1/x) + \int_0^{x^2}\varphi(1/t)\,dt.$$

On the other hand we have

$$\int_0^{x^2}\varphi(1/t)\,dt = 2\int_0^x \varphi(1/u^2)u\,du \le 2C\int_0^x \varphi(1/u)u\,du \le 2Cx\int_0^x \varphi(1/u)\,du.$$

It follows that there exists some $x_0$ such that

$$\int_0^{x^2}\varphi(1/t)\,dt \le \frac{1}{2}\int_0^x \varphi(1/t)\,dt, \qquad 0 < x < x_0.$$

Therefore in $[0, x_0]$ we have

$$\frac{x}{2}\varphi(2/x) \le \int_0^x \varphi(1/t)\,dt \le 2C\,x\,\varphi(1/x).$$

In the interval $[x_0, 1]$ the functions $x\varphi(2/x)$ and $\int_0^x \varphi(1/t)\,dt$ are continuous and non null. Therefore our lemma is true also in this interval. Observe that the conditions $\varphi(t^2) \le C\varphi(t)$ and $\lim_{t \to +\infty}\varphi(t) = +\infty$ imply that ϕ(2) = 0 is impossible.

A property that is important for us is that these spaces are atomic.

Theorem 11.13 There exists a constant $C_\varphi$ such that for every function f ∈ Lϕ(L) there exist a sequence of measurable sets $(A_j)_{j=1}^\infty$ and a sequence of complex numbers $(a_j)_{j=1}^\infty$ such that $f = \sum_{j=1}^\infty a_j\chi_{A_j}$, and

$$\|f\|_{L\varphi(L)} \le \sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_{L\varphi(L)} \le C_\varphi\|f\|_{L\varphi(L)}.$$

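Proof. Given the measurable function f, as in Theorem 11.6, we can put $f = \sum_{j=N}^\infty a_j f_j$, where $f_j$ has support on a measurable set $A_j$, $a_j \ge 0$, $\|f_j\|_\infty \le 1$ and $f^* \ge \sum_{j=1}^\infty a_j\chi_{I_j}$, with $I_j = [e^{-j-1}, e^{-j})$ and $\mu(A_j) = (e - 1)e^{-j} = m_j$. As in Theorem 11.6 we obtain the decomposition

$$f = \sum_{j,k} a_j\beta_{j,k}\chi_{T_{j,k}}.$$

Then we have

$$\sum_{j,k} |a_j\beta_{j,k}|\,\|\chi_{T_{j,k}}\|_{L\varphi(L)} \le 4\sum_{j=1}^\infty a_j\int_0^{\mu(A_j)}\varphi(1/t)\,dt \le 4C\sum_{j=1}^\infty a_j m_j\varphi(2/m_j) \le 4C(e - 1)\sum_{j=1}^\infty a_j e^{-j}\varphi\bigl(2(e - 1)^{-1}e^j\bigr)$$

$$\le C'\sum_{j=1}^\infty a_j e^{-j}(1 - e^{-1})\varphi(e^{j+1}) \le C'\sum_{j=1}^\infty a_j\int_{I_j}\varphi(1/t)\,dt \le C'\int_0^1 f^*(t)\varphi(1/t)\,dt = C'\|f\|_{L\varphi(L)}.$$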


11.6 The spaces L log L(μ) and $L_{\exp}(\mu)$

The space L log L(μ) is the space of measurable functions f such that $\int_X |f|\log^+|f|\,d\mu < +\infty$. By the previous section we know that

$$\|f\|_{L\log L} = \int_0^1 f^{**}(t)\,dt$$

is a norm in this space. That the above expression is a norm in the vector space where it is finite is a trivial consequence of the inequality $(f + g)^{**} \le f^{**} + g^{**}$. By Fubini's Theorem we obtain that

$$\|f\|_{L\log L} = \int_0^1 f^*(t)\log(1/t)\,dt.$$

Since $f^* \le f^{**}$ we have $\|f\|_1 \le \|f\|_{L\log L}$. We shall need the expression of the norm of a characteristic function

$$\|\chi_A\|_{L\log L} = \mu(A)\log\bigl(e/\mu(A)\bigr).$$

By Theorem 11.13 we get the following.

Theorem 11.14 There is an absolute constant C such that for every function f ∈ L log L there exist a sequence of measurable sets $(A_j)_{j=1}^\infty$ and a sequence of complex numbers $(a_j)$ such that

$$f = \sum_{j=1}^\infty a_j\chi_{A_j}, \qquad \text{and} \qquad \|f\|_{L\log L} \le \sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_{L\log L} \le C\|f\|_{L\log L}.$$
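The two expressions of the L log L norm and the closed form μ(A) log(e/μ(A)) for a characteristic function can be compared numerically. For f = χ_A with μ(A) = m one has f* equal to the indicator of [0, m) and f**(t) = min(1, m/t). The sketch below uses a midpoint rule; the function names and the value m = 0.25 are illustrative assumptions.

```python
import math

def norm_from_f_star(m, steps=200_000):
    # Integral over (0, 1) of f*(t) * log(1/t), with f* the indicator of [0, m).
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        if t < m:
            total += math.log(1.0 / t)
    return total * h

def norm_from_f_double_star(m, steps=200_000):
    # Integral over (0, 1) of f**(t) = min(1, m / t).
    h = 1.0 / steps
    return h * sum(min(1.0, m / ((i + 0.5) * h)) for i in range(steps))

m = 0.25
closed_form = m * math.log(math.e / m)
```

Both quadratures agree with the closed form up to the discretization error, which is the Fubini identity of this section in a concrete case.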

The dual space of L log L can be characterized in several ways. We shall call it $L_{\exp}$. It is the space of measurable functions such that $\mu\{|f| > y\} \le Ae^{-By}$ for some constants A and B (that change with f). It is also the space of functions such that there is some positive real number α such that $\int e^{\alpha|f|}\,d\mu < +\infty$. The equivalence of these two conditions is easy to prove. But to obtain a norm that defines the space we define the space as the set of measurable functions such that

$$\|f\|_{L_{\exp}} = \sup_{0<t<1}\bigl(\log(e/t)\bigr)^{-1} f^{**}(t) < +\infty.$$

Theorem 11.15 The dual space of L log L is $L_{\exp}$. There is some constant C such that

$$\|g\|_{L_{\exp}} \le \|g\|^\bullet_{L_{\exp}} \le C\|g\|_{L_{\exp}},$$

where $\|\cdot\|^\bullet_{L_{\exp}}$ denotes the norm on $L_{\exp}$ as a dual of L log L.

Proof. First, assume that f ∈ L log L and $g \in L_{\exp}$; then fg is integrable and

$$\Bigl|\int_X fg\,d\mu\Bigr| \le C\|f\|_{L\log L}\|g\|_{L_{\exp}}. \qquad (11.2)$$

To see it, write $f = \sum_{j=1}^\infty a_j\chi_{A_j}$ as in Theorem 11.14. Then

$$\Bigl|\int_X fg\,d\mu\Bigr| \le \sum_{j=1}^\infty |a_j|\int_{A_j} |g|\,d\mu \le \sum_{j=1}^\infty |a_j|\,\mu(A_j)\,g^{**}(\mu(A_j)) \le \sum_{j=1}^\infty |a_j|\,\mu(A_j)\,\|g\|_{L_{\exp}}\log(e/\mu(A_j)) \le \sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_{L\log L}\|g\|_{L_{\exp}} \le C\|f\|_{L\log L}\|g\|_{L_{\exp}}.$$

Assume that u is a continuous linear functional on L log L. Then we define $\nu(A) = u(\chi_A)$. This is an additive function defined on measurable sets. Now $|\nu(A)| \le \|u\|\mu(A)\log(e/\mu(A))$. From this inequality it follows that ν is a signed countably additive measure and that ν ≪ μ. Then the Radon-Nikodym Theorem gives us a measurable function g such that for every measurable set A, $\nu(A) = \int_A g\,d\mu$. This function g is in $L_{\exp}(\mu)$. Indeed, we know

$$\Bigl|\int_A g\,d\mu\Bigr| \le \|u\|\mu(A)\log(e/\mu(A)).$$

Then, by Proposition 11.5, we obtain $g^{**}(t) \le \|u\|\log(e/t)$. That is, $g \in L_{\exp}$ and $\|g\|_{L_{\exp}} \le \|u\|$.

Now for every simple function ϕ we have $u(\varphi) = \int \varphi g\,d\mu$. By the density of simple functions in the space L log L (this follows from Theorem 11.14) and (11.2), we derive that

$$u(f) = \int_X fg\,d\mu, \qquad \forall f \in L\log L.$$

Then we have proved that $L_{\exp}$ is the dual space to L log L. Besides, we have proved the two inequalities between the norms.

Finally, note the following examples of spaces of functions. Take $\varphi(t) = (\log^+ t)^2$. We obtain the space $L(\log L)^2$, that is, the set of measurable functions f such that

$$\int_X |f|(\log^+|f|)^2\,d\mu < +\infty.$$

The norm in this case is given by

$$\|f\|_{L(\log L)^2} = \int_0^1 f^*(t)(\log 1/t)^2\,dt < +\infty.$$

The norm of a characteristic function $\chi_A$ with μ(A) = m is equal to $m(\log e/m)^2 + m$. By the general theory this is also an atomic space.

Theorem 11.16 There is an absolute constant C such that for every function $f \in L(\log L)^2$ there exist a sequence of measurable sets $(A_j)_{j=1}^\infty$ and a sequence of complex numbers $(a_j)$ such that $f = \sum_{j=1}^\infty a_j\chi_{A_j}$ and

$$\|f\|_{L(\log L)^2} \le \sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_{L(\log L)^2} \le C\|f\|_{L(\log L)^2}.$$

Two more spaces have a role in the theory of pointwise convergence of Fourier series. They are $L\log L\log\log L$ and $L\log L\log\log\log L$. They are defined as the set of measurable functions such that, respectively, the integrals
$$\int_X |f|(\log^+|f|)(\log^+\log^+|f|)\,d\mu$$
and
$$\int_X |f|(\log^+|f|)(\log^+\log^+\log^+|f|)\,d\mu$$
are finite. Define $L(t) = L_1(t) = \log(1+t)$ and for every integer $n \ge 2$ put $L_n(t) = L(L_{n-1}(t))$. These are well defined functions $L_n\colon [0,+\infty) \to [0,+\infty)$, and it is easily verified, by induction, that $L_n(t^2) \le 2L_n(t)$. Then if we put $\varphi(t) = L_1(t)L_3(t)$ we will have $\varphi(t^2) \le 4\varphi(t)$. All the other conditions (2), (3) and (4) are easily verified. It is clear that with this choice of $\varphi$ we obtain the space $L\log L\log\log\log L$. The other one is analogous.
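The two inequalities $L_n(t^2) \le 2L_n(t)$ and $\varphi(t^2) \le 4\varphi(t)$ can be spot-checked on a grid. This is a numerical sanity check under the definitions above, not a proof; the helper names are hypothetical.

```python
import math

def iterated_log(n, t):
    """L_n(t): apply t -> log(1 + t) exactly n times."""
    for _ in range(n):
        t = math.log1p(t)
    return t

def phi(t):
    """phi(t) = L_1(t) * L_3(t), the function giving L log L log log log L."""
    return iterated_log(1, t) * iterated_log(3, t)

def grid():
    """Geometric grid of test points from 1e-6 to 1e6."""
    return [10.0 ** (k / 4.0) for k in range(-24, 25)]

if __name__ == "__main__":
    worst = max(phi(t * t) / phi(t) for t in grid())
    print("largest observed phi(t^2)/phi(t):", worst)
```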

12. The maximal operator of Fourier series

12.1 Introduction

Let $f\colon I \to \mathbb{C}$ be a measurable function such that $|f| = \chi_A$. For every $y > 0$ and $1 < p < +\infty$ we have
$$m\{x \in I/2 : C_I^* f(x) > y\} \le \Bigl(\frac{Cp^2}{p-1}\Bigr)^p \frac{m(A)}{y^p}. \tag{12.1}$$
This is the basic result of the Carleson-Hunt construction. In this chapter we turn our attention to the maximal operator $S^*$ of Fourier series. First, in section 2 we obtain an analogous condition to (12.1) for $S^*$. In section 3 we prove that our knowledge about $S^*$ can be summarized by an inequality for the decreasing rearrangement $(S^*f)^*$ of $S^*f$. Then we derive Hunt's theorems about the continuity of the operator $S^*$ between the spaces $L^\infty \to L_{\exp}$; $L(\log L)^2 \to L^1$; and $L^p \to L^p$ ($1 < p < +\infty$). Finally we define two quasi-Banach spaces $Q$ and $QA$, and prove that $S^*$ maps them to $L^{1,\infty}$. These results contain those of Sjölin, Soria, and Antonov.

12.2 Maximal operator of Fourier series

All the considerations of this chapter can be applied to the operator $C_I^*$, but we prefer to talk about the maximal operator of Fourier series
$$S^* f(x) = \sup_{n\in\mathbb{N}} |S_n(f,x)|.$$
First, we prove that $S^*f$ satisfies the same restricted weak inequality. In this chapter the functions usually are defined on $[-\pi,\pi]$ and we denote by $\mu$ the normalized Lebesgue measure on this interval.

Proposition 12.1 There exists an absolute constant $C < +\infty$ such that for every measurable function $f\colon [-\pi,\pi] \to \mathbb{R}$ with $|f| = \chi_A$, every $1 < p < +\infty$ and $y > 0$ we have
$$\mu\{x \in [-\pi,\pi] : S^*f(x) > y\} \le C_p^p\,\frac{\mu(A)}{y^p}, \qquad C_p \le C\,\frac{p^2}{p-1}. \tag{12.2}$$

J.A. de Reyna: LNM 1785, pp. 145–162, 2002. © Springer-Verlag Berlin Heidelberg 2002

Proof. Let $f^\circ\colon \mathbb{R} \to \mathbb{R}$ be equal to 0 for $|t| > 2\pi$ and equal to the periodic extension of $f$ on $[-2\pi, 2\pi]$. Recalling the expression of the Dirichlet kernel (2.3), we obtain, for every $x \in [-\pi,\pi]$,
$$S_n(f,x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f^\circ(x-t)\frac{\sin nt}{t}\,dt + \frac{1}{2\pi}\int_{-\pi}^{\pi} f^\circ(x-t)\varphi_n(t)\,dt.$$
Since $\sin nt/t$ is uniformly bounded in $[\pi, 3\pi]$ and $[-3\pi, -\pi]$, and $\|\varphi_n\|_\infty$ is uniformly bounded in $n \in \mathbb{N}$, we get
$$|S_n(f,x)| \le \Bigl|\frac{1}{\pi}\int_{-2\pi}^{2\pi} f^\circ(t)\frac{\sin n(x-t)}{x-t}\,dt\Bigr| + Cm(A), \qquad |x| < \pi.$$
Therefore if $I$ denotes the interval $[-2\pi, 2\pi]$, we obtain $S^*f(x) \le C_I^* f^\circ(x) + Cm(A)$. Since $|f^\circ| = \chi_{A'}$ with $m(A') \le 2m(A)$, for $y > 2Cm(A)$ we have
$$m\{S^*f(x) > y\} \le m\{C_I^* f^\circ(x) > y/2\} \le (2C_p)^p\,\frac{2m(A)}{y^p},$$
where $C_p \le Cp^2/(p-1)$ is the constant in (12.1). Finally we arrive at
$$m\{S^*f(x) > y\} \le D_p^p\,\frac{m(A)}{y^p}, \qquad D_p \le D\,\frac{p^2}{p-1}.$$
Taking $D$ big enough we can eliminate the restriction $y > 2Cm(A)$. Finally we change to the measure $\mu$.

Now we can prove another extension of this result.

Theorem 12.2 There is an absolute constant $C$ such that given a measurable function $f$ on $[-\pi,\pi]$ with $\|f\|_\infty = 1$ and a measurable set $A$ such that $f(x) = 0$ for every $x \notin A$, then for every $1 < p < +\infty$ and $y > 0$ we have
$$\mu\{S^*f > y\} \le \Bigl(\frac{C_p}{y}\Bigr)^p \mu(A), \qquad C_p \le Cp^2/(p-1).$$

Proof. First assume $f$ to be real. Then we can write $f = \operatorname{Re}\bigl(e^{ig(x)}\bigr)\chi_A(x)$, where $g(x) = \arccos f(x)$ is conveniently defined so as to be measurable. Therefore we have $f = (f_1 + f_2)/2$ with $f_1$ and $f_2$ being special functions in our meaning, that is $|f_1| = |f_2| = \chi_A$. Then we have $S^*f \le (S^*f_1 + S^*f_2)/2$. Therefore for every $y > 0$,


$$\{S^*f > y\} \subset \{S^*f_1 > y\} \cup \{S^*f_2 > y\}.$$
Hence, by the basic result, we derive
$$\mu\{S^*f > y\} \le 2\Bigl(\frac{C_p}{y}\Bigr)^p \mu(A) \le \Bigl(\frac{2C_p}{y}\Bigr)^p \mu(A).$$
For a complex $f$ the theorem follows easily from the real case.

12.3 The distribution function of $S^*f$

Our first objective is to obtain the information contained in these inequalities about the distribution function of $S^*f$. We shall need a function $\varphi_a$ to describe this distribution function:
$$\varphi_a(t) = \begin{cases} (1/t)\log(e/t) & \text{if } 0 < t < 1,\\ e^{a(1-t)} & \text{if } 1 \le t < +\infty.\end{cases}$$

Lemma 12.3 (a) Given $A > 0$ there exist $N$ and $a > 0$ such that for every $y > 0$ there exists $p > 1$ with
$$\frac{1}{y^p}\Bigl(\frac{Ap^2}{p-1}\Bigr)^p \le N\varphi_a(y).$$
(b) Given $M$ and $a > 0$, there is a constant $A > 0$ such that for every $y > 0$ and every $p > 1$
$$M\varphi_a(y) \le \frac{1}{y^p}\Bigl(\frac{Ap^2}{p-1}\Bigr)^p.$$

Proof. (a) Given $A$, take $p = \sup(\alpha y, 2)$ when $y > 1$, with $\alpha > 0$ conveniently chosen; and $p = 1 + \log(e/y)^{-1}$ when $y < 1$.

(b) For $y > 1$ the inequality is equivalent to $(Me^a)^{1/p}\,y e^{-ay/p} \le Ap^2/(p-1)$. For a fixed $p$ the function $y e^{-ay/p}$ is bounded by $p/(ae)$. Therefore we must prove $(Me^a)^{1/p} \le Aaep/(p-1)$. Now it is easy to see that this is true for every value of $p > 1$ if we choose a convenient $A$.

For $y < 1$ we have to prove $My^{p-1}\log(e/y) \le \bigl(Ap^2/(p-1)\bigr)^p$. For a fixed $p$, $My^{p-1}\log(e/y)$ attains a maximum at $y = e\exp(-(p-1)^{-1})$ and for this value the inequality can be easily checked.

From Theorem 12.2 and Lemma 12.3 we obtain the following theorem. This is equivalent to Theorem 12.2 and contains all the information that the construction of Carleson-Hunt gives about the operator $S^*$.

Theorem 12.4 There exist absolute constants $a$ and $B$ such that if $f$ is a measurable function with $\|f\|_\infty \le 1$ and $A$ a measurable set such that $f(x) = 0$ for every $x \notin A$, then for every $y > 0$,
$$\mu\{S^*f > y\} \le B\varphi_a(y)\mu(A).$$

12.4 The operator $S^*$ on the space $L^\infty$

Theorem 12.5 (Hunt) There are absolute constants $A$ and $B > 0$ such that for every $f \in L^\infty[-\pi,\pi]$, and every $y > 0$
$$\mu\{S^*f(x) > y\} \le Ae^{-By/\|f\|_\infty}.$$

Proof. By homogeneity we can assume that $\|f\|_\infty = 1$. If we take $A$ such that $Ae^{-B} > 1$ the inequality says something only for $y > 1$. In the case $y > 1$ the inequality is the same as that of Theorem 12.4.

The set of measurable functions $g$ that satisfy an inequality of the type $\mu\{|g| > y\} \le Ae^{-By}$ forms a Banach space. This space coincides with the space of functions for which there is a constant $C$ such that $\int e^{C|g|}\,d\mu < +\infty$, and is usually called $L_{\exp}$. As we have seen in chapter eleven, a norm on this space is given by
$$\|g\|_{L_{\exp}} = \sup_{0<t<1} \frac{g^{**}(t)}{\log(e/t)}.$$

12.5 The operator $S^*$ on the space $L(\log L)^2$

We derive from Theorem 12.4 that the operator $S^*$ maps the space $L(\log L)^2$ into $L^1$. First, the space $L(\log L)^2$ is the set of measurable functions $g$ such that $|g|(\log^+|g|)^2$ is integrable. This is a Banach space. We have defined its norm as
$$\|g\|_{L(\log L)^2} = \int_0^1 g^*(t)(\log t)^2\,dt.$$

Theorem 12.6 (Hunt) The operator $S^*\colon L(\log L)^2 \to L^1$ is continuous. Hence there is a constant $C$ such that for every measurable function $f$
$$\|S^*f\|_1 \le C\|f\|_{L(\log L)^2}.$$

Proof. Given $f$ there is a sequence of pairwise disjoint measurable sets $(A_j)_{j=1}^\infty$ such that $\mu(A_j) = 2^{-j}$, and for every $j < k$, $x \in A_j$ and $y \in A_k$ we have $|f(x)| \le |f(y)|$. Put $a_j = \sup\{|f(x)| : x \in A_j\}$. Then if $f_j = f\chi_{A_j}$, we have that $f = \sum_{j=1}^\infty f_j$ and $\|f_j\|_\infty \le a_j$. We deduce easily the following inequality for the decreasing rearrangement:
$$f^*(t) \ge \sum_{j=1}^\infty a_j\chi_{[2^{-j-1},\,2^{-j})}(t).$$
Now by Theorem 12.4 for $f \in L^\infty$, supported on the measurable set $A$ of measure $\mu(A) = m$, and bounded by 1, we have
$$\|S^*f\|_1 = \int_0^{+\infty} \mu\{S^*f > y\}\,dy \le \int_0^m dy + \int_m^\infty Bm\varphi_a(y)\,dy \le Cm(\log e/m)^2.$$
Therefore by the subadditivity of $S^*$ we get, for a general $f$,
$$\|S^*f\|_1 \le C\sum_{j=1}^\infty a_j\,\frac{1}{2^j}(\log e2^j)^2.$$
Thus
$$\|S^*f\|_1 \le C\sum_{j=1}^\infty a_j\,\frac{1}{2^{j+1}}\bigl(\log 2^{(j+1)}\bigr)^2 \le C\int_0^1 f^*(t)(\log t)^2\,dt.$$
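The estimate $\int_0^m dy + \int_m^\infty Bm\varphi_a(y)\,dy \le Cm(\log e/m)^2$ used in the proof can be made explicit. The sketch below assumes illustrative values for the absolute constants $a$ and $B$ of Theorem 12.4 (their true values are not given in the text) and uses the explicit choice $C = 1 + B/2 + B/a$, which is one admissible constant for $0 < m \le 1$, not necessarily the one the author has in mind.

```python
import math

def phi_a(y, a):
    """phi_a from Section 12.3."""
    if 0.0 < y < 1.0:
        return math.log(math.e / y) / y
    return math.exp(a * (1.0 - y))

def l1_bound_closed_form(m, big_b, a):
    """m + B*m*int_m^inf phi_a(y) dy via exact antiderivatives (0 < m <= 1)."""
    big_l = math.log(1.0 / m)
    return m + big_b * m * (big_l + 0.5 * big_l ** 2 + 1.0 / a)

def l1_bound_numeric(m, big_b, a, steps=200_000):
    """Same quantity by midpoint rules.

    On (m, 1) we substitute y = exp(-s); on (1, inf) we truncate at 1 + 40/a.
    """
    big_l = math.log(1.0 / m)
    h = big_l / steps
    left = sum(phi_a(math.exp(-(i + 0.5) * h), a) * math.exp(-(i + 0.5) * h)
               for i in range(steps)) * h
    width = 40.0 / a
    k = width / steps
    right = sum(phi_a(1.0 + (i + 0.5) * k, a) for i in range(steps)) * k
    return m + big_b * m * (left + right)

def explicit_constant(big_b, a):
    """One admissible C in the bound C*m*(log(e/m))**2."""
    return 1.0 + big_b / 2.0 + big_b / a

if __name__ == "__main__":
    big_b, a = 2.0, 1.0  # hypothetical values of the absolute constants
    for m in (1e-3, 0.1, 1.0):
        print(m, l1_bound_closed_form(m, big_b, a),
              explicit_constant(big_b, a) * m * math.log(math.e / m) ** 2)
```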


12.6 The operator $S^*$ on the space $L^p$

Here we derive the principal result of Hunt. Observe that our notation for the norms of $L^{p,1}(\mu)$ and $L^{p,\infty}(\mu)$ is not standard. We have put, for $1 < p < +\infty$, […]
$$\sup_{y>0}\, y\,\mu\{S^*\chi_A > y\}^{1/p} \le \frac{Cp^2}{p-1}\,\|\chi_A\|_p.$$
For every function $f$, the decreasing rearrangement $f^*$ and the distribution function $\mu_f(y) = \mu\{|f| > y\}$ are essentially inverse functions, therefore
$$\sup_{0<t<1} t^{1/p}f^*(t) = \sup_{y>0}\, y\,\mu_f(y)^{1/p}.$$
Thus, for every characteristic function,
$$\sup_{0<t<1} t^{1/p}(S^*\chi_A)^*(t) \le C\,\frac{p^2}{p-1}\,\|\chi_A\|_p.$$
We have seen in (11.1) that
$$\sup_{0<t<1} t^{1/p}f^{**}(t) \le \frac{p}{p-1}\sup_{0<t<1} t^{1/p}f^*(t).$$
Thus we have
$$\|S^*\chi_A\|_{p,\infty} \le C\,\frac{p^3}{(p-1)^2}\,\|\chi_A\|_p.$$
Now, recall that $L^{p,1}(\mu)$ is an atomic space. That is, each function $f \in L^{p,1}(\mu)$ can be written as $f = \sum_{j=1}^\infty a_j\chi_{A_j}$ in such a way that $\sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_p \le C\|f\|_{p,1}$. Hence
$$S^*f \le \sum_{j=1}^\infty |a_j|\,S^*\chi_{A_j}.$$
Therefore
$$\|S^*f\|_{p,\infty} \le \sum_{j=1}^\infty |a_j|\,\|S^*\chi_{A_j}\|_{p,\infty} \le C\,\frac{p^3}{(p-1)^2}\sum_{j=1}^\infty |a_j|\,\|\chi_{A_j}\|_p \le C\,\frac{p^3}{(p-1)^2}\,\|f\|_{p,1}.$$

Theorem 12.8 (Carleson-Hunt) For every $1 < p < +\infty$, the operator $S^*\colon L^p \to L^p$ is continuous. More precisely, there is a constant $C$ such that for every measurable function $f$
$$\|S^*f\|_p \le C\,\frac{p^4}{(p-1)^3}\,\|f\|_p.$$
Therefore for $f \in L^p[-\pi,\pi]$
$$\lim_{n\to\infty} S_n(f,x) = f(x), \qquad \text{a. e. on } [-\pi,\pi].$$

Proof. By the previous theorem we know that
$$\|S^*f\|_{p,\infty} \le C\,\frac{p^3}{(p-1)^2}\,\|f\|_{p,1},$$
for every $1 < p < +\infty$. Thus we can apply Marcinkiewicz's Theorem 11.9. Therefore if $p_1 < p < p_0$, the norm of the operator $S^*\colon L^p(\mu) \to L^p(\mu)$ is bounded by
$$\|S^*\|_p \le C\,\frac{p(p_0-p_1)}{(p_0-p)(p-p_1)}\Bigl(\frac{p_0^3}{(p_0-1)^2}\Bigr)^{1-\theta}\Bigl(\frac{p_1^3}{(p_1-1)^2}\Bigr)^{\theta}.$$
Observe that if $p_0'$ and $p_1'$ denote the conjugate exponents, then $p_0^3/(p_0-1)^2 = p_0\,p_0'^2$. Thus if we take $p_0$ and $p_1$ conjugate exponents,
$$\|S^*\|_p \le C\,\frac{p(p_0-p_1)}{(p_0-p)(p-p_1)}\,p_0^{1+\theta}p_1^{2-\theta}.$$
In the case $1 < p < 2$ we choose $p_0 = (p+1)/(p-1)$ and $p_1 = (p+1)/2$. With these choices $p_0$ and $p_1$ are conjugates, $\theta = (1+2p-p^2)/p(3-p)$, and we get
$$\|S^*\|_p \le C\,\frac{2^\theta p(p+1)^4(3-p)}{4(1+2p-p^2)(p-1)^{2+\theta}} \le \frac{C}{(p-1)^3}.$$
In the case $2 < p < +\infty$ we choose $p_0 = 2p$, $p_1 = 2p/(2p-1)$, then $\theta = 1/2(p-1)$ and we get
$$\|S^*\|_p \le C\,\frac{2(p-1)(2p)^4}{p(2p-3)(2p-1)^{2-\theta}} \le C\,p.$$
Therefore for all values of $p$ we arrive at
$$\|S^*\|_p \le C\,\frac{p^4}{(p-1)^3}.$$
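The algebra behind the two closed-form bounds is easy to get wrong, so here is a small numerical cross-check. It assumes the usual interpolation convention $1/p = (1-\theta)/p_0 + \theta/p_1$ (which reproduces the values of $\theta$ quoted in the proof) and compares the general factor $\frac{p(p_0-p_1)}{(p_0-p)(p-p_1)}p_0^{1+\theta}p_1^{2-\theta}$ with the closed forms for the two choices of $p_0$, $p_1$. Function names are hypothetical.

```python
def theta_of(p, p0, p1):
    """Solve 1/p = (1 - theta)/p0 + theta/p1 for theta (assumed convention)."""
    return (1.0 / p - 1.0 / p0) / (1.0 / p1 - 1.0 / p0)

def general_factor(p, p0, p1):
    """p(p0-p1)/((p0-p)(p-p1)) * p0**(1+theta) * p1**(2-theta)."""
    th = theta_of(p, p0, p1)
    return p * (p0 - p1) / ((p0 - p) * (p - p1)) * p0 ** (1 + th) * p1 ** (2 - th)

def closed_form_small_p(p):
    """Closed form quoted for 1 < p < 2."""
    th = (1 + 2 * p - p * p) / (p * (3 - p))
    return (2 ** th * p * (p + 1) ** 4 * (3 - p)
            / (4 * (1 + 2 * p - p * p) * (p - 1) ** (2 + th)))

def closed_form_large_p(p):
    """Closed form quoted for p > 2."""
    th = 1.0 / (2 * (p - 1))
    return 2 * (p - 1) * (2 * p) ** 4 / (p * (2 * p - 3) * (2 * p - 1) ** (2 - th))

if __name__ == "__main__":
    for p in (1.1, 1.5, 1.9):
        print(p, general_factor(p, (p + 1) / (p - 1), (p + 1) / 2), closed_form_small_p(p))
    for p in (2.5, 3.0, 10.0):
        print(p, general_factor(p, 2 * p, 2 * p / (2 * p - 1)), closed_form_large_p(p))
```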

The conclusion about the pointwise convergence is now standard. We know that for a dense set of functions in $L^p[-\pi,\pi]$ we have pointwise convergence at all points. Now we introduce the operator
$$\Omega f(x) = \limsup_{n} |S_n(f,x) - f(x)|.$$
It is clear that $\Omega f(x) \le S^*f(x) + |f(x)|$. Also $\Omega(f) = \Omega(f - \varphi)$, for every $\varphi$ periodic and differentiable. Then for every positive real number $\alpha$ and $\varphi$ as before
$$\{\Omega f(x) > 2\alpha\} \subset \{S^*(f-\varphi) > \alpha\} \cup \{|f(x) - \varphi(x)| > \alpha\},$$
from which we deduce easily that $\{\Omega f(x) > 2\alpha\}$ has measure zero.

12.7 The maximal space $Q$

We are interested here in a space of functions with almost everywhere convergent Fourier series. First, observe the following consequence of Theorem 12.4.

Corollary 12.9 There exists an absolute constant $C$, such that if $f$ is a measurable function with $\|f\|_\infty \le 1$ and $A$ is a measurable set with $\mu(A) = m$ and such that $f(x) = 0$ for every $x \notin A$, then for every $y > 0$ we have
$$\mu\{S^*f > y\} \le C\,\frac{m\log(e/m)}{y}.$$

Proof. By Theorem 12.4 we know that $\mu\{S^*f > y\} \le Bm\varphi_a(y)$. Since $\mu$ is a probability, $\mu\{S^*f > y\} \le 1$. Therefore, we have to prove that there exists $C$ such that $\inf\bigl(1, Bm\varphi_a(y)\bigr) \le Cm\log(e/m)/y$. This is easily checked.

In 1961 Stein [44, p. 169] proved that if $X$ is a Banach space of functions defined on $[-\pi,\pi]$ and such that $\|f\|_1 \le \|f\|_X$, and for every $f \in X$, $S^*f(x) < +\infty$ for $x$ in a set of positive measure, then $\mu\{S^*f > y\} \le Ay^{-1}\|f\|_X$, and in case that $X = L^p$ ($1 \le p \le 2$), $\mu\{S^*f > y\} \le Ay^{-p}\|f\|_p^p$. This made Zygmund think that the conjecture of Luzin must be false. Since the construction of Carleson-Hunt gives $\mu\{S^*f > y\} \le Ay^{-1}\|f\|_{L\log L}$ for characteristic functions, the best we could hope to prove is a. e. convergence for functions in the space $L\log L$. But this appears very difficult to achieve.

We have here some functions $f$ with $S^*f \in L^{1,\infty}$, therefore we are interested in conditions of stability of the space $L^{1,\infty}$. The principal result in this direction is the following theorem of Stein and Weiss (but we give a proof due to Kalton). For typographical reasons we put $\|f\|_1^*$ instead of $\|f\|_{1,\infty}$.

Theorem 12.10 (Stein-Weiss) Let $(f_j)$ be a sequence of functions on $L^{1,\infty}$, with $\sum_{j=1}^\infty \|f_j\|_1^* = 1$. If $\sum_{j=1}^\infty \|f_j\|_1^*\log(e/\|f_j\|_1^*) < +\infty$, the sum $\sum_{j=1}^\infty f_j$ is almost everywhere absolutely convergent and
$$\Bigl\|\sum_{j=1}^\infty f_j\Bigr\|_1^* \le 4\sum_{j=1}^\infty \|f_j\|_1^*\log\frac{e}{\|f_j\|_1^*}.$$

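Proof (Kalton). Let $f = \sum_j f_j$ and let $A$ be a measurable set of measure $\mu(A) = m$. Define the sets $B_j = \{|f_j| > 2/m\}$, and $B = \bigcup_j B_j$. We have
$$\mu(B) \le \sum_{j=1}^\infty \mu(B_j) \le \sum_{j=1}^\infty \frac{1}{2}\|f_j\|_1^*\,m \le m/2.$$
Then
$$\begin{aligned}
\inf_{t\in A}|f(t)| &\le \inf_{t\in A\setminus B}|f(t)| \le \frac{2}{m}\int_{A\setminus B}|f|\,d\mu \le \frac{2}{m}\sum_{j=1}^\infty\int_{A\setminus B}|f_j|\,d\mu\\
&\le \frac{2}{m}\sum_{j=1}^\infty\int_{A\setminus B_j}|f_j|\,d\mu \le \frac{2}{m}\sum_{j=1}^\infty\int_A \inf(|f_j|, 2/m)\,d\mu\\
&\le \frac{2}{m}\sum_{j=1}^\infty\int_0^{2/m}\inf(\|f_j\|_1^*/t,\ m)\,dt = \frac{2}{m}\sum_{j=1}^\infty\Bigl(\|f_j\|_1^* + \|f_j\|_1^*\log\bigl(2/\|f_j\|_1^*\bigr)\Bigr)\\
&\le \frac{2}{m}\Bigl(1 + \sum_{j=1}^\infty\|f_j\|_1^*\log\bigl(e/\|f_j\|_1^*\bigr)\Bigr) \le \frac{4}{m}\sum_{j=1}^\infty\|f_j\|_1^*\log\bigl(e/\|f_j\|_1^*\bigr).
\end{aligned}$$
Therefore if this quantity is $4L/m$ we have $\mu\{|f| > 4L/m\} < m$, because on every set $A$ of measure $m$ we have some points where $|f|$ is less than $4L/m$. This is equivalent to saying that $\mu\{|f| > y\} \le 4L/y$.

The quantity that appears in the above theorem is not homogeneous, and we have proved the inequality only when $\sum_{j=1}^\infty \|f_j\|_1^* = 1$. These difficulties can be avoided.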
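Two steps of Kalton's chain are purely computational: the identity $\int_0^{2/m}\inf(c/t, m)\,dt = c + c\log(2/c)$ for $0 < c \le 1$, and the final comparison with $2\sum c_j\log(e/c_j)$ when $\sum c_j = 1$. The sketch below checks both numerically; the weights $c_j \propto j^{-2}$ are an arbitrary illustrative choice and the function names are hypothetical.

```python
import math

def truncated_integral_numeric(c, m, steps=400_000):
    """Midpoint rule for int_0^{2/m} min(c/t, m) dt."""
    upper = 2.0 / m
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += min(c / t, m)
    return total * h

def truncated_integral_closed_form(c):
    """c + c*log(2/c); note that the value does not depend on m."""
    return c + c * math.log(2.0 / c)

def example_weights(n=50):
    """Illustrative weights c_j proportional to 1/j**2, normalized to sum 1."""
    raw = [1.0 / (j * j) for j in range(1, n + 1)]
    s = sum(raw)
    return [r / s for r in raw]

if __name__ == "__main__":
    print(truncated_integral_numeric(0.05, 0.5), truncated_integral_closed_form(0.05))
    cs = example_weights()
    lhs = sum(truncated_integral_closed_form(c) for c in cs)
    rhs = 2.0 * sum(c * math.log(math.e / c) for c in cs)
    print(lhs, "<=", rhs)
```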


Theorem 12.11 (Kalton) For every sequence $(f_j)$ on $L^{1,\infty}$ we have
$$\Bigl\|\sum_{j=1}^\infty f_j\Bigr\|_1^* \le C\sum_{j=1}^\infty (1+\log j)\|f_j\|_1^*.$$
($C$ is an absolute constant.)

Proof. Obviously we can assume that $\sum_{j=1}^\infty \|f_j\|_1^* = 1$. So, by the above theorem we only need to prove that there is an absolute constant such that if $a_j > 0$ and $\sum_{j=1}^\infty a_j = 1$, then
$$\sum_{j=1}^\infty a_j\log(e/a_j) \le C\sum_{j=1}^\infty a_j^*(1+\log j),$$
where $(a_j^*)$ denotes the decreasing rearrangement of the sequence $(a_j)$.

Observe that $\sum_{j=1}^\infty a_j^*(1+\log j)$ is the infimum of all possible sums $\sum_{j=1}^\infty a_{\sigma(j)}(1+\log j)$ for every permutation $\sigma$ of the natural numbers. Hence we have to prove the following assertion: For every $\alpha > 1$, the supremum of the function
$$\sum_{j=1}^N x_j\log(e/x_j) - \alpha\sum_{j=1}^N x_j(1+\log j)$$
in the set of $(x_1, x_2, \dots, x_N)$ with $x_1, x_2, \dots, x_N \ge 0$ and $\sum_{j=1}^N x_j = 1$ is equal to
$$\log\Bigl(e\sum_{j=1}^N \frac{1}{j^\alpha}\Bigr) - \alpha. \tag{12.3}$$
We proceed by induction on $N$. For $N = 1, 2$ the result can be easily verified. Also observe that, for every dimension, there is a critical point, interior to the region we are considering. This point can be obtained by Lagrange multipliers. The computations give $a_k = k^{-\alpha}\bigl(\sum_{j=1}^N j^{-\alpha}\bigr)^{-1}$. From this we see that the function attains the value (12.3).

We assume the result to be true in the case $N-1$. In the case of $N$ variables, the maximum of the function exists because it is a continuous function on a compact set. It is clear that the maximum is attained at a point where $x_1 \ge x_2 \ge \cdots \ge x_N \ge 0$. So we assume that for $x_j = a_j$ the function takes the maximum value. Every $a_j > 0$, because if $a_j = 0$ for some $j$, then $a_N = 0$ and the maximum in this case would be the same as in that with $N-1$ variables. This contradicts the fact that we have obtained a greater value at an interior point. Therefore the maximum is a relative maximum and must coincide with the point calculated.
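The value (12.3) and the location of the maximum can be checked numerically. The following sketch evaluates the function at the Lagrange critical point $a_k = k^{-\alpha}(\sum_j j^{-\alpha})^{-1}$ and at random points of the simplex; the names and the sampling scheme are illustrative assumptions, and the random test is only evidence, not a proof.

```python
import math
import random

def objective(x, alpha):
    """sum x_j*log(e/x_j) - alpha*sum x_j*(1 + log j), with 0*log(e/0) read as 0."""
    total = 0.0
    for j, xj in enumerate(x, start=1):
        if xj > 0.0:
            total += xj * math.log(math.e / xj)
        total -= alpha * xj * (1.0 + math.log(j))
    return total

def critical_point(n, alpha):
    """a_k = k**(-alpha) / sum_j j**(-alpha)."""
    weights = [k ** (-alpha) for k in range(1, n + 1)]
    s = sum(weights)
    return [w / s for w in weights]

def claimed_supremum(n, alpha):
    """The value (12.3): log(e * sum_j j**(-alpha)) - alpha."""
    return math.log(math.e * sum(j ** (-alpha) for j in range(1, n + 1))) - alpha

def random_simplex_point(n, rng):
    """A random point with nonnegative coordinates summing to 1."""
    raw = [rng.random() for _ in range(n)]
    s = sum(raw)
    return [r / s for r in raw]

if __name__ == "__main__":
    n, alpha = 8, 1.5
    print(objective(critical_point(n, alpha), alpha), claimed_supremum(n, alpha))
```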


Therefore, taking some concrete value for $\alpha > 1$ we obtain for $x_j \ge 0$ and $\sum_{j=1}^\infty x_j = 1$
$$\sum_{j=1}^\infty x_j\log(e/x_j) \le \log(e\zeta(\alpha)) + \alpha\sum_{j=1}^\infty x_j(1+\log j) \le C\sum_{j=1}^\infty x_j(1+\log j).$$

Remark. Observe that from $\sum_{j=1}^\infty a_j = 1$ it follows that $a_j^*\,j \le 1$, thus
$$\sum_{j=1}^\infty a_j^*\log(e/a_j^*) \ge \sum_{j=1}^\infty a_j^*(1+\log j).$$
Therefore Theorem 12.11 is equivalent to the Stein-Weiss theorem.

A quasi-norm on a real or complex vector space $X$ is a map $x \mapsto \|x\|$ such that $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$, $\|ax\| = |a|\,\|x\|$ for every scalar $a$ and vector $x$, and finally there is some constant $C > 0$ such that $\|x+y\| \le C(\|x\| + \|y\|)$. A quasi-norm induces a compatible topology. If this space is complete it is called a quasi-Banach space. $L^{1,\infty}$ is a quasi-Banach space. A quasi-Banach space is called logconvex if it satisfies the conclusion of Theorem 12.11.

We define now the space $Q$ of all measurable functions $f$ such that there exist a sequence $(a_j)$ of positive real numbers and measurable sets $A_j$ of measure $m_j$, such that
$$|f| \le \sum_{j=1}^\infty a_j\chi_{A_j}, \qquad \sum_{j=1}^\infty a_j m_j\log(e/m_j)(1+\log j) < +\infty.$$
And for every such function we define
$$\|f\|_Q = \inf\Bigl\{\sum_{j=1}^\infty a_j m_j\log(e/m_j)(1+\log j)\Bigr\},$$

where we consider all possible sequences $(a_j)$ and $(A_j)$.

Proposition 12.12 $Q$ is a logconvex quasi-Banach space.

Proof. From the definition it is clear that for $f \in Q$, $\|f\|_Q \ge 0$. To see that the quasi-norm vanishes only for $f \sim 0$, we obtain the inequality $\|f\|_{L\log L} \le \|f\|_Q$. This is true because from $|f| \le \sum_{j=1}^\infty a_j\chi_{A_j}$ we derive
$$\|f\|_{L\log L} \le \sum_{j=1}^\infty a_j\|\chi_{A_j}\|_{L\log L} = \sum_{j=1}^\infty a_j m_j\log(e/m_j) \le \sum_{j=1}^\infty a_j m_j\log(e/m_j)(1+\log j).$$
The equality $\|af\|_Q = |a|\,\|f\|_Q$ is a simple consequence of the definition.

Assume now that we have $f = \sum_{j=1}^\infty f_j$. Given $\varepsilon > 0$, we determine positive real numbers $a_{jk}$ and measurable sets $A_{jk}$ such that
$$\|f_j\|_Q \le \sum_{k=1}^\infty a_{jk}m_{jk}\log(e/m_{jk})(1+\log k) \le \|f_j\|_Q + \frac{\varepsilon}{2^j}(1+\log j)^{-1}.$$
It follows that
$$\|f\|_Q \le \sum_{jk} a_{jk}m_{jk}\log(e/m_{jk})\bigl(1+\log\sigma(j,k)\bigr)$$
where $\sigma\colon \mathbb{N}\times\mathbb{N} \to \mathbb{N}$ is a bijective mapping. For example, take the usual bijection $\sigma(j,k) = j + (j+k-2)(j+k-1)/2$. Then we have $(1+\log\sigma(j,k)) \le 2(1+\log j)(1+\log k)$. Therefore
$$\|f\|_Q \le \varepsilon + 2\sum_{j=1}^\infty (1+\log j)\|f_j\|_Q.$$
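The two properties of $\sigma$ used here, bijectivity and the bound $1+\log\sigma(j,k) \le 2(1+\log j)(1+\log k)$, can be verified on a finite range. A minimal sketch (the helper names are hypothetical):

```python
import math

def sigma(j, k):
    """Diagonal enumeration of N x N: sigma(j, k) = j + (j+k-2)(j+k-1)/2."""
    return j + (j + k - 2) * (j + k - 1) // 2

def values_on_triangle(t):
    """All sigma(j, k) with j, k >= 1 and j + k <= t."""
    return sorted(sigma(j, s - j) for s in range(2, t + 1) for j in range(1, s))

def log_inequality_holds(j, k):
    """Check 1 + log sigma(j,k) <= 2(1 + log j)(1 + log k)."""
    return 1.0 + math.log(sigma(j, k)) <= 2.0 * (1.0 + math.log(j)) * (1.0 + math.log(k))

if __name__ == "__main__":
    print(values_on_triangle(5))           # the integers 1..10, each exactly once
    print(all(log_inequality_holds(j, k)
              for j in range(1, 201) for k in range(1, 201)))
```

The pairs with $j + k \le t$ are mapped exactly onto $\{1, \dots, (t-1)t/2\}$, which is what makes $\sigma$ a bijection.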

This implies that $\|\cdot\|_Q$ is a logconvex quasi-norm. To prove that the space is complete we only need to prove that if $\|f_j\|_Q \le 2^{-j}$ the series $\sum_{j=1}^\infty f_j$ is convergent in $Q$. This is obvious from the definition and the logconvexity of the norm.

Proposition 12.13 For every measurable $f$
$$\mu\{S^*f > y\} \le C\,\frac{\|f\|_Q}{y},$$
where $C < +\infty$ is an absolute constant. Therefore the Fourier series of every function in $Q$ is pointwise almost everywhere convergent.

Proof. If $|f| \le \sum_{j=1}^\infty a_j\chi_{A_j}$, then there exist measurable functions $f_j$ with $\|f_j\|_\infty \le 1$ and supported on $A_j$, and such that $f = \sum_{j=1}^\infty a_j f_j$. It follows from Corollary 12.9 that $\|S^*f_j\|_1^* \le Cm_j\log(e/m_j)$, where $m_j = \mu(A_j)$. Therefore from Kalton's theorem
$$\|S^*f\|_1^* \le C\sum_{j=1}^\infty a_j(1+\log j)\|S^*f_j\|_1^* \le C\sum_{j=1}^\infty (1+\log j)a_j m_j\log(e/m_j).$$
Taking the infimum for all possible decompositions we arrive at $\|S^*f\|_1^* \le C\|f\|_Q$.

Remark. It is easy to see that the space $B_{\varphi_1}^*$ defined by Soria [42] is contained in $Q$. But it is not clear whether $Q = B_{\varphi_1}^*$.

Remark. We can give an example of a sublinear operator $T\colon Q \to L^{1,\infty}$ such that for every $f \notin Q$, we have $\sup\{\|Tg\| : g \in Q,\ |g| \le |f|\} = +\infty$. But this operator satisfies the conclusion of Corollary 12.9. Therefore in this sense the space $Q$ is maximal. But obviously it may be possible that we can derive better properties for $S^*$ assuming only the properties given in Theorem 12.4. As we will see in the following section, it is also possible to obtain some extra information for the special operator $S^*$ in which we are interested.

12.8 The theorem of Antonov

Now we are going to apply the ideas of Antonov to obtain a quasi-Banach space $QA \supset Q$ in which we can prove a. e. convergence. In particular we shall see that $QA \supset L\log L\log\log\log L$. We shall need some information about the Dirichlet kernel that is due to Antonov.

Proposition 12.14 (Antonov) Let $f\colon [-\pi,\pi] \to \mathbb{R}$ be a measurable function and $A$ a measurable set such that $0 \le f(x) \le a$ for every $x \in A$ and $f(x) = 0$ for every $x \notin A$. Then for every $\varepsilon > 0$ and $n \in \mathbb{N}$, there is a measurable set $B \subset A$ such that $a\mu(B) = \|f\|_1$ and such that
$$|S_j(f,x) - S_j(a\chi_B, x)| \le \varepsilon, \qquad \text{for every } x \in [-\pi,\pi] \text{ and } 1 \le j \le n.$$

Proof. Without loss of generality we can assume that $\|f\|_1 > 0$. We are going to divide the interval $[-\pi,\pi]$ into $N$ equal subintervals, where $N$ is big enough. Let $(J_k)_{k=1}^N$ be these intervals. For every such interval we determine a measurable set $B_k \subset J_k \cap A$ such that
$$\int_{J_k\cap A} f\,d\mu = a\mu(B_k).$$
This can be done because $a^{-1}\int_{J_k\cap A} f\,d\mu \le \mu(J_k \cap A)$. Then we take $B = \bigcup_{k=1}^N B_k$. Now it is clear that $B \subset A$. Also,
$$a\mu(B) = a\sum_{k=1}^N \mu(B_k) = \sum_{k=1}^N \int_{J_k\cap A} f\,d\mu = \int_A f\,d\mu = \|f\|_1.$$
Now assume that $x \in [-\pi,\pi]$ and $1 \le j \le n$. We have
$$\begin{aligned}
|S_j(f,x) - S_j(a\chi_B,x)| &= \Bigl|\int_{-\pi}^{\pi} D_j(x-t)\bigl(f(t) - a\chi_B(t)\bigr)\,d\mu(t)\Bigr|\\
&\le \sum_{k=1}^N \Bigl|\int_{J_k} D_j(x-t)\bigl(f(t) - a\chi_B(t)\bigr)\,d\mu(t)\Bigr|\\
&= \sum_{k=1}^N \Bigl|\int_{J_k\cap A} D_j(x-t)\bigl(f(t) - a\chi_{B_k}(t)\bigr)\,d\mu(t)\Bigr|.
\end{aligned}$$
Now we notice that $f(t) - a\chi_{B_k}(t)$ has an integral equal to 0 on the set $J_k \cap A$. Therefore if $t_k$ is one extreme of the interval $J_k$ we have
$$|S_j(f,x) - S_j(a\chi_B,x)| \le \sum_{k=1}^N \Bigl|\int_{J_k\cap A} \bigl(D_j(x-t) - D_j(x-t_k)\bigr)\bigl(f(t) - a\chi_{B_k}(t)\bigr)\,d\mu(t)\Bigr|.$$
We apply the mean value theorem; for $t \in J_k$
$$|D_j(x-t) - D_j(x-t_k)| \le \frac{2\pi}{N}\,j(j+1).$$
Therefore
$$|S_j(f,x) - S_j(a\chi_B,x)| \le \sum_{k=1}^N \frac{2\pi}{N}\,j(j+1)\int_{J_k\cap A}\bigl(f + a\chi_{B_k}\bigr)\,d\mu \le \frac{2\pi}{N}\,j(j+1)\,2\|f\|_1 \le \varepsilon,$$
if we take
$$N \ge \frac{4\pi n(n+1)}{\varepsilon}\,\|f\|_1.$$
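The mean value step rests on the bound $|D_j'(t)| \le j(j+1)$. The sketch below assumes the kernel is normalized for the measure $\mu = dt/2\pi$, that is $D_j(t) = \sum_{|k|\le j} e^{ikt} = 1 + 2\sum_{k=1}^j \cos kt$ (an assumption about the convention of (2.3), which is not reproduced here), and checks both the derivative bound on a grid and the arithmetic of the choice of $N$. Function names are hypothetical.

```python
import math

def dirichlet(j, t):
    """D_j(t) = 1 + 2*sum_{k=1}^j cos(k t) (assumed normalization)."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, j + 1))

def dirichlet_derivative(j, t):
    """D_j'(t) = -2*sum_{k=1}^j k*sin(k t), so |D_j'| <= 2*(1+...+j) = j(j+1)."""
    return -2.0 * sum(k * math.sin(k * t) for k in range(1, j + 1))

def subintervals_needed(n, norm1, eps):
    """Smallest integer N with N >= 4*pi*n*(n+1)*||f||_1/eps."""
    return math.ceil(4.0 * math.pi * n * (n + 1) * norm1 / eps)

def final_error_bound(j, big_n, norm1):
    """(2*pi/N) * j(j+1) * 2*||f||_1, the last bound in the proof."""
    return 2.0 * math.pi / big_n * j * (j + 1) * 2.0 * norm1

if __name__ == "__main__":
    n, norm1, eps = 3, 0.5, 0.1  # illustrative values
    big_n = subintervals_needed(n, norm1, eps)
    print(big_n, final_error_bound(n, big_n, norm1))
```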

The procedure of Antonov invites us to define a quasi-norm in the following way.

Definition 12.15 We say that a measurable function $f$ is in $QA$ if the following quantity is finite:
$$\|f\|_{QA} = \inf\Bigl\{\sum_{j=1}^\infty (1+\log j)\|f_j\|_1\log\frac{e\|f_j\|_\infty}{\|f_j\|_1} : |f| \le \sum_{j=1}^\infty f_j,\ f_j \ge 0\Bigr\}.$$
It is clear that for every scalar $a$ we have $\|af\|_{QA} = |a|\,\|f\|_{QA}$. Furthermore, as in the case of $\|\cdot\|_Q$ we can prove easily that it is a logconvex quasi-norm:
$$\Bigl\|\sum_{j=1}^\infty f_j\Bigr\|_{QA} \le \sum_{j=1}^\infty (1+\log j)\|f_j\|_{QA}.$$

With this definition we can prove the following theorem.

Theorem 12.16 For every measurable $f$, and $y > 0$
$$\mu\{S^*f > y\} \le C\,\frac{\|f\|_{QA}}{y},$$
where $C < +\infty$ is an absolute constant. Therefore the Fourier series of every function in $QA$ is pointwise almost everywhere convergent.

Proof. To prove the inequality we can assume that $f \in QA$. Every $f \in QA$ is a linear combination of four positive measurable functions $f_j \in QA$, $f = (f_1 - f_2) + i(f_3 - f_4)$ and with $\|f_j\|_{QA} \le \|f\|_{QA}$. Therefore we can assume that $f \ge 0$. Then, given $\varepsilon > 0$, we obtain measurable functions $f_j \in L^\infty(\mu)$ such that
$$\|f\|_{QA} \le \sum_{j=1}^\infty (1+\log j)\|f_j\|_1\log\frac{e\|f_j\|_\infty}{\|f_j\|_1} \le \|f\|_{QA} + \varepsilon.$$
We can assume that $f = \sum_j f_j$ because given $0 \le h \le g$ we have
$$\|h\|_1\log\frac{e\|h\|_\infty}{\|h\|_1} \le \|h\|_1\log\frac{e\|g\|_\infty}{\|h\|_1} \le \|g\|_1\log\frac{e\|g\|_\infty}{\|g\|_1}.$$
(Observe that $x\log(ea/x)$ is increasing on $[0,a]$.) Fix a natural number $n$. We apply Proposition 12.14 to obtain for every $j \in \mathbb{N}$ a measurable set $B_j$ such that for every $j \in \mathbb{N}$ and $1 \le k \le n$ we have
$$\mu(B_j) = \frac{\|f_j\|_1}{\|f_j\|_\infty}, \qquad |S_k(f_j,x) - S_k(\|f_j\|_\infty\chi_{B_j},x)| \le \frac{\varepsilon}{2^j}.$$
Now consider the function $g = \sum_{j=1}^\infty \|f_j\|_\infty\chi_{B_j}$. Its $Q$-norm is bounded:
$$\|g\|_Q \le \sum_{j=1}^\infty (1+\log j)\|f_j\|_\infty\,\mu(B_j)\log\bigl(e/\mu(B_j)\bigr) = \sum_{j=1}^\infty (1+\log j)\|f_j\|_1\log\frac{e\|f_j\|_\infty}{\|f_j\|_1}.$$
Therefore $\|g\|_Q \le \|f\|_{QA} + \varepsilon$. On the other hand, for $1 \le k \le n$
$$|S_k(f,x) - S_k(g,x)| \le \sum_{j=1}^\infty |S_k(f_j,x) - S_k(\|f_j\|_\infty\chi_{B_j},x)| \le \varepsilon.$$
Therefore by Proposition 12.13
$$\mu\Bigl\{\sup_{1\le k\le n}|S_k(f,x)| > y\Bigr\} \le \mu\{S^*g > y-\varepsilon\} \le C\,\frac{\|g\|_Q}{y-\varepsilon} \le C\,\frac{\|f\|_{QA}+\varepsilon}{y-\varepsilon}.$$
Since this inequality is true for every $n \in \mathbb{N}$ we arrive at
$$\mu\{S^*f > y\} \le C\,\frac{\|f\|_{QA}+\varepsilon}{y-\varepsilon}.$$
Now this is true for every $\varepsilon < y$. Thus we have
$$\mu\{S^*f > y\} \le C\,\frac{\|f\|_{QA}}{y}.$$

Finally we are going to prove the theorem of Antonov.

Theorem 12.17 (Antonov) There exists a constant $C > 0$ such that for every measurable function $f\colon [-\pi,\pi] \to \mathbb{C}$ and every $y > 0$ we have
$$\mu\{S^*f > y\} \le C\,\frac{\|f\|_{L\log L\log\log\log L}}{y}.$$
Therefore the Fourier series of every function $f \in L\log L\log\log\log L$ is almost everywhere convergent.

Proof. Recall that we said that $f \in L\log L\log\log\log L$ when
$$\int_X |f|(\log^+|f|)(\log^+\log^+\log^+|f|)\,d\mu < +\infty.$$
We have seen in Section 11.6 that this is a Banach space with norm
$$\|f\|_{L\log L\log\log\log L} = \int_0^\infty f^*(t)\varphi(1/t)\,dt,$$
where $\varphi\colon [0,+\infty) \to [0,+\infty)$ is defined by $\varphi(t) = L_1(t)L_3(t)$. This function satisfies the conditions (1), (2), (3), and (4) of Section 11.5.

Let $f$ be a function in $L\log L\log\log\log L$. Let $f^*$ be the decreasing rearrangement of $f$. Consider the sequence $(x_n)_{n=0}^\infty$, where $x_n = \exp(1-e^n)$, define $a_n = f^*(x_n)$, and let $A_n = \{|f| \ge a_{n-1}\}$. We see that $A_1 = [-\pi,\pi]$ and that $(A_n)$ is a decreasing sequence of measurable sets. We can put $f = \sum_{n=1}^\infty f_n$ where the functions $f_n$ are defined as
$$f_n(t) = \begin{cases} f(t) - a_{n-1}\operatorname{sgn} f(t) & \text{if } t \in A_n\setminus A_{n+1},\\ (a_n - a_{n-1})\operatorname{sgn} f(t) & \text{if } t \in A_{n+1},\\ 0 & \text{if } t \notin A_n,\end{cases}$$
where $\operatorname{sgn}(a) = a/|a|$ if $a \ne 0$, and $\operatorname{sgn}(0) = 0$. Then we have
$$e\,\frac{\|f_n\|_\infty}{\|f_n\|_1} \le e\,\frac{\|f_n\|_\infty}{\|f_n\|_\infty\exp(1-e^n)} = \exp(e^n).$$
Since $|f_n(t)| = |f(t)| - a_{n-1}$ if $a_n > |f(t)| \ge a_{n-1}$, $|f_n(t)| = a_n - a_{n-1}$ if $|f(t)| \ge a_n$ and $|f_n(t)| = 0$ in any other case, we have
$$\{|f_n| > y\} = \begin{cases} \{|f| > y + a_{n-1}\} & \text{if } 0 < y < a_n - a_{n-1},\\ \emptyset & \text{if } y > a_n - a_{n-1}.\end{cases}$$
Hence
$$\|f_n\|_1 = \int_0^{+\infty}\mu\{|f_n| > y\}\,dy = \int_0^{a_n-a_{n-1}}\mu\{|f| > y + a_{n-1}\}\,dy = \int_{a_{n-1}}^{a_n}\mu\{|f| > y\}\,dy = \int_{a_{n-1}}^{a_n} x\,dy,$$
where $x$ denotes the function $\mu_f(y)$. Therefore
$$\|f\|_{QA} \le \sum_{n=1}^\infty (1+\log n)\,e^n\int_{a_{n-1}}^{a_n} x\,dy.$$
Observe that when $a_{n-1} < y < a_n$ we have $x \in (x_n, x_{n-1})$, therefore
$$\exp(e^{n-1}) \le \frac{e}{x} \le \exp(e^n).$$
Thus for these values of $y$
$$e^n \le e\log(e/x); \qquad n \le \log\bigl(e\log(e/x)\bigr); \qquad 1+\log n \le \log\bigl(e\log(e\log(e/x))\bigr).$$
Put $\psi(x) = (\log ex)\log\bigl(e\log(e\log ex)\bigr)$. Then
$$\|f\|_{QA} \le e\sum_{n=1}^\infty\int_{a_{n-1}}^{a_n} x\psi(1/x)\,dy = e\int_0^{+\infty} x\psi(1/x)\,dy.$$
It is not difficult to see that there exists a constant $C$ such that for $x \ge 1$ we have $\psi(x) \le e^{-1}CL_1(x)L_3(x) = e^{-1}C\varphi(x)$. Therefore we have
$$\|f\|_{QA} \le C\int_0^{+\infty} x\varphi(1/x)\,dy \le C\int_0^{+\infty}\int_0^x \varphi(1/s)\,ds\,dy.$$
Observe that if $f^*(s) = t$ we have $s < x$ if and only if $t > y$. Therefore by Fubini's Theorem we get
$$\|f\|_{QA} \le C\int_0^{+\infty} f^*(s)\varphi(1/s)\,ds = C\|f\|_{L\log L\log\log\log L}.$$

Remark. By modifying the proof of the Carleson-Hunt Theorem I get the following result.

Theorem 12.18 There exists an absolute constant $C > 0$, such that for every $f \in L^2[-\pi,\pi]$, $1 \le p \le 2$, and $y > 0$, we have
$$\mu\{S^*f > y\} \le C^p\,\frac{\|f\|_p^p}{y^p}\Bigl\{\log\Bigl(e\,\frac{\|f\|_2}{\|f\|_p}\Bigr)\Bigr\}^p.$$
In the case $p = 1$ this is a consequence of our fundamental Theorem 12.16.

Remark. We have not included all the results about the operator $S^*$. The most important that we have omitted are: the result of Sjölin [40] that $S^*$ maps $L(\log L)^{1+\theta}$ into $L(\log L)^{1-\theta}$, for $0 < \theta \le 1$; and the result of Soria [43] that $S^*$ maps $L\log L\log\log L$ to $L(\log L)^{-1}$.

Remark. We must complete this chapter with some facts about the negative results. In 1922 Kolmogorov, a 19 year old student of Luzin, obtained the first example of a function $f \in L^1(\mu)$ with almost everywhere divergent Fourier series. Four years later he gave the example of such a function with everywhere divergent Fourier series. These results were refined by Stein and Kahane. Recently Konyagin [33] has announced that given a nondecreasing function $\varphi\colon [0,+\infty) \to [0,+\infty)$ such that
$$\varphi(t) = o\Bigl(\frac{\sqrt{\log t}}{\sqrt{\log\log t}}\Bigr), \qquad t \to +\infty,$$
then there exists a function $f \in L\varphi(L)$ whose Fourier series is everywhere divergent.

13. Fourier Transform on the line

13.1 Introduction

We collect here the immediate consequences for the Fourier integral on the line. We also give an example, shown to me by Luis Rodriguez Piazza, which proves that these results are sharp.

13.2 Fourier transform Let 1 0

−a

sin 2πa(x − t) ∗ SR f (x) = sup f (t) dt. x−t a>0 R Theorem 13.1 Let 1 < p < +∞. There exist constants Cp ≤ Cp4 /(p − 1)2 such that for every f ∈ Lp (R) we have ∗ f p ≤ Cp f p . SR

Moreover if 1 ≤ p ≤ 2 and p denotes the conjugate exponent F ∗ f p ≤ Cp f p . Proof. Let I be any interval, and x ∈ I/2. For every real number s we have by a change of frequency |C(s,I) f (x) − C(s,I) f (x)| ≤ Cf (s,I) ≤ Cf Lp (I) . It follows that sup |C(s,I) f (x)| ≤ CI∗ f (x) + Cf Lp (I) .

s∈R

J.A. de Reyna: LNM 1785, pp. 163–166, 2002. c Springer-Verlag Berlin Heidelberg 2002


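Therefore
$$\Bigl\|\sup_{s\in\mathbb{R}}|C_{(s,I)}f(x)|\Bigr\|_{L^p(I/2)} \le \|C_I^*f(x)\|_{L^p(I/2)} + C\|f\|_{L^p(I)}.$$
From which we derive
$$\int_{I/2}\sup_{s\in\mathbb{R}}|C_{(s,I)}f(x)|^p\,dx \le C_p^p\int_I |f(x)|^p\,dx.$$
If we put $I = [-2a, 2a]$, we can write this as
$$\int_{-a}^{a}\sup_{s\in\mathbb{R}}\Bigl|\mathrm{p.v.}\int_{-2a}^{2a}\frac{e^{-\frac{2\pi is}{4a}(x-t)}}{x-t}\,f(t)\,dt\Bigr|^p dx \le C_p^p\int_{-2a}^{2a}|f(x)|^p\,dx.$$
This is the same as
$$\int_{-a}^{a}\bigl(C_{\mathbb{R},a}^*f(x)\bigr)^p\,dx \le C_p^p\int_{-2a}^{2a}|f(x)|^p\,dx,$$
where
$$C_{\mathbb{R},a}^*f(x) = \sup_{s\in\mathbb{R}}\Bigl|\mathrm{p.v.}\int_{-2a}^{2a}\frac{e^{-2\pi is(x-t)}}{x-t}\,f(t)\,dt\Bigr|.$$
We shall consider also the following Carleson maximal operator
$$C_{\mathbb{R}}^*f(x) = \sup_{s\in\mathbb{R}}\Bigl|\mathrm{p.v.}\int_{\mathbb{R}}\frac{e^{2\pi is(x-t)}}{x-t}\,f(t)\,dt\Bigr|.$$
Since
$$\lim_{a\to\infty}\mathrm{p.v.}\int_{-2a}^{2a} f(t)\,\frac{e^{-2\pi is(x-t)}}{x-t}\,dt = \mathrm{p.v.}\int_{\mathbb{R}} f(t)\,\frac{e^{-2\pi is(x-t)}}{x-t}\,dt,$$
we have, for every $y > 0$
$$\frac{1}{2}\inf\bigl(C_{\mathbb{R}}^*f(x),\,y\bigr) \le \liminf_{a\to+\infty} C_{\mathbb{R},a}^*f(x).$$
Therefore
$$\frac{1}{2^p}\int_{\mathbb{R}}\inf\bigl(C_{\mathbb{R}}^*f(x),\,y\bigr)^p\,dx \le C_p^p\int_{\mathbb{R}}|f(x)|^p\,dx.$$
Hence
$$\|C_{\mathbb{R}}^*f\|_{L^p(\mathbb{R})} \le 2C_p\|f\|_{L^p(\mathbb{R})}.$$
Now it is clear that $S_{\mathbb{R}}^*f(x) \le C_{\mathbb{R}}^*f(x)$. Therefore
$$\|S_{\mathbb{R}}^*f\|_{L^p(\mathbb{R})} \le 2C_p\|f\|_{L^p(\mathbb{R})}.$$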


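We know that the Fourier transform sends $L^1(\mathbb{R})$ to $L^\infty(\mathbb{R})$, and $L^2(\mathbb{R})$ to $L^2(\mathbb{R})$. By interpolation we can define it for $f \in L^p(\mathbb{R})$, $1 \le p \le 2$. Then $\hat f \in L^{p'}(\mathbb{R})$ and we have the Hausdorff-Young inequality $\|\hat f\|_{L^{p'}(\mathbb{R})} \le \|f\|_{L^p(\mathbb{R})}$. By the multiplication theorem, if $f \in L^2(\mathbb{R})$
$$\int_{-a}^{a} f(\xi)e^{-2\pi i\xi x}\,d\xi = \int_{\mathbb{R}} \hat f(t)D_a(x-t)\,dt,$$
where $D_a(t) = \sin 2\pi at/\pi t$ is the Fourier transform of $\chi_{[-a,a]}$. This is true also for every $f \in L^p(\mathbb{R})$, ($1 \le p \le 2$).

Hence it follows that $F^*f(x) \le C_{\mathbb{R}}^*\hat f(x)$ and then
$$\|F^*f\|_{L^{p'}} \le 2C_{p'}\|\hat f\|_{L^{p'}(\mathbb{R})} \le 2C_{p'}\|f\|_{L^p(\mathbb{R})}.$$
As a corollary we have that for every $f \in L^p(\mathbb{R})$, with $1 \le p \le 2$ we have
$$\hat f(x) = \lim_{a\to+\infty}\int_{-a}^{a} f(t)e^{-2\pi itx}\,dt, \qquad \text{a. e. on } \mathbb{R}.$$
The inequality $\|F^*f\|_{p'} \le C_p\|f\|_p$ of the preceding theorem cannot be extended to values of $p > 2$, as we see in the following example.

Example 13.2 There is a function $f\colon \mathbb{R} \to \mathbb{R}$ such that $f \in L^p(\mathbb{R})$ for every $p > 2$ and such that
$$\sup_{a>0}\Bigl|\int_{-a}^{a} f(t)e^{2\pi itx}\,dt\Bigr| = +\infty, \qquad \text{a.e. on } \mathbb{R}.$$

Proof. Let $f$ be the function
$$f(t) = \sum_{n=1}^\infty \frac{1}{\sqrt n}\,\chi_{[2^n,\,2^n+1]}(t).$$
It is obvious that $f \in L^p(\mathbb{R})$ for every $p > 2$. On the other hand the Fourier transform of $\chi_{[2^n,\,2^n+1]}(t)$ is equal to
$$\frac{\sin\pi x}{\pi x}\,e^{-\pi ix}\,e^{-2\pi i2^n x}.$$
It follows easily that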


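$$F^*f(x) = \sup_{a>0}\Bigl|\int_{-a}^{a} f(t)e^{2\pi itx}\,dt\Bigr| \ge \Bigl|\frac{\sin\pi x}{\pi x}\Bigr|\cdot\sup_{N\ge1}\Bigl|\sum_{n=1}^N \frac{1}{\sqrt n}\,e^{-2\pi i2^n x}\Bigr|.$$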
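The formula for the Fourier transform of $\chi_{[2^n,\,2^n+1]}$ can be confirmed by direct quadrature. The sketch assumes the convention $\hat f(x) = \int f(t)e^{-2\pi itx}\,dt$, as in the corollary of Section 13.2; the function names are hypothetical.

```python
import cmath
import math

def transform_numeric(n, x, steps=20_000):
    """Midpoint rule for int_{2^n}^{2^n+1} exp(-2*pi*i*t*x) dt."""
    left = 2.0 ** n
    h = 1.0 / steps
    total = 0j
    for i in range(steps):
        t = left + (i + 0.5) * h
        total += cmath.exp(-2j * math.pi * t * x)
    return total * h

def transform_closed_form(n, x):
    """(sin(pi x)/(pi x)) * exp(-pi i x) * exp(-2 pi i 2^n x), for x != 0."""
    sinc = math.sin(math.pi * x) / (math.pi * x)
    return sinc * cmath.exp(-1j * math.pi * x) * cmath.exp(-2j * math.pi * (2.0 ** n) * x)

if __name__ == "__main__":
    for n, x in ((1, 0.3), (3, 0.75), (4, -1.2)):
        print(n, x, transform_numeric(n, x), transform_closed_form(n, x))
```

Note that the modulus of the transform is $|\sin\pi x/\pi x|$ for every $n$; only the phase $e^{-2\pi i 2^n x}$ depends on $n$, which is why the lacunary sum appears in the lower bound for $F^*f$.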

Consequently, to prove our assertion it is enough to show that the last supremum is infinite a.e.

The set of powers of 2 is a $\Lambda_4$ set, that is, for every trigonometric polynomial of the form $P(x) = \sum_{n=1}^N a_n e^{-2\pi i2^n x}$ we have $\|P\|_4 \le 2^{1/4}\|P\|_2$. In fact
$$\|P\|_4^4 = \int_0^1 \Bigl(\sum_{n=1}^N a_n e^{-2\pi i2^n x}\Bigr)^2\Bigl(\sum_{n=1}^N \overline{a_n}\,e^{2\pi i2^n x}\Bigr)^2 dx$$
and since $2^k + 2^l = 2^n + 2^m$ only if $\{k,l\} = \{n,m\}$ we obtain
$$\|P\|_4^4 = 4\sum_{n<m}|a_n|^2|a_m|^2 + \sum_n |a_n|^4 \le 2\Bigl(\sum_n |a_n|^2\Bigr)^2 = 2\|P\|_2^4.$$
As a consequence of the equivalence of the two norms, there is some absolute constant $\delta > 0$ such that for every trigonometric polynomial in powers of 2, we have $m\{x \in [0,1] : |P(x)| \ge \|P\|_2/2\} > \delta$. To see this, fix $\theta \in (0,1)$, and let $A = \{x \in [0,1] : |P(x)| \ge \theta\|P\|_2\}$. We have
$$(1-\theta^2)\|P\|_2^2 \le \int_A |P(x)|^2\,dm \le m(A)^{1/2}\Bigl(\int_A |P|^4\,dm\Bigr)^{1/2} \le m(A)^{1/2}\|P\|_4^2 \le C\,m(A)^{1/2}\|P\|_2^2.$$
Therefore $m(A) \ge (1-\theta^2)^2/C^2$.

Let $B$ be the set of $x \in \mathbb{R}$ where
$$\sup_{N\ge1}\Bigl|\sum_{n=1}^N \frac{1}{\sqrt n}\,e^{-2\pi i2^n x}\Bigr| = +\infty.$$
This set $B$ is periodic with period $2^{-n}$ for every $n \in \mathbb{N}$, because it coincides with the set where
$$\sup_{N\ge n_0}\Bigl|\sum_{n=n_0}^N \frac{1}{\sqrt n}\,e^{-2\pi i2^n x}\Bigr| = +\infty.$$
It follows that $m(B \cap [0,1])$ must be 0 or 1. The set $B$ contains the limit superior of the sets
$$B_N = \Bigl\{x \in [0,1] : \Bigl|\sum_{n=1}^N \frac{1}{\sqrt n}\,e^{-2\pi i2^n x}\Bigr| \ge \frac{1}{2}\sqrt{\log N}\Bigr\}.$$
Hence $m(B \cap [0,1]) \ge \delta > 0$. It follows that $\mathbb{R}\setminus B$ is of Lebesgue measure zero.

References

[1] N. Yu. Antonov: Convergence of Fourier series, East J. Approx., 2, 187–196 (1996)
[2] N. A. Bary: Treatise on Trigonometric Series, vols. 1 and 2, Pergamon Press Inc, New York, 1964
[3] C. Bennett, R. Sharpley: Interpolation of Operators, Academic Press, Boston 1988
[4] J. Bergh, J. Löfström: Interpolation spaces, Springer, Berlin 1976
[5] A. S. Besikovitch: Sur la nature des fonctions à carré sommable et des ensembles measurables, Fund. Math., 4, 172–195 (1923)
[6] A. S. Besikovitch: On a general metric property of summable functions, J. London Math. Soc., 1, 120–128 (1926)
[7] P. Billard: Sur la convergence presque partout des séries de Fourier-Walsh des fonctions de l'espace L2(0, 1), Studia Math., 28, 363–388 (1967)
[8] A. P. Calderón, A. Zygmund: On the existence of certain singular integrals, Acta Math., 88, 85–139 (1952)
[9] L. Carleson: On convergence and growth of partial sums of Fourier series, Acta Math., 116, 135–157 (1966)
[10] Y. M. Chen: An almost everywhere divergent Fourier series of the class L(log+ log+ L)^{1−ε}, J. London Math. Soc., 44, 643–654 (1969)
[11] J. Duoandikoetxea: Fourier Analysis, American Math. Soc., Providence, RI, 2001
[12] R. E. Edwards: Fourier Series a Modern Introduction, vols. 1 and 2, Holt Rinehart and Winston Inc, New York 1967
[13] P. Fatou: Séries trigonométriques et séries de Taylor, Acta Math., 30, 335–400 (1906)
[14] C. Fefferman: Pointwise convergence of Fourier series, Ann. of Math., 98, 551–571 (1973)
[15] C. Fefferman: Erratum Pointwise convergence of Fourier series, Ann. of Math., 146, 239 (1997)
[16] G. Gallavotti, A. Porzio: The almost everywhere convergence of Fourier series, Quaderno del Consiglio Nazionale dell Ricerche, Grupo Nazionale di Fisica Matematica, Roma 1987
[17] A. Garsia: Topics in almost everywhere Convergence, Markham Publ., Chicago 1970
[18] M. de Guzmán: Differentiation of Integrals in R^n, Springer, Lect. Notes in Math. 481, Berlin 1975
[19] M. de Guzmán: Real Variable Methods in Fourier Analysis, North Holland, Amsterdam, 1981
[20] G. H. Hardy: On the summability of Fourier's series, Proc. London Math. Soc., 12, 365–372 (1913)


[21] G. H. Hardy, J. E. Littlewood: A maximal theorem with function-theoretic applications, Acta Math., 54, 81–116 (1930)
[22] R. A. Hunt: On L(p, q) spaces, L’Enseignement Mathématique, 12, 249–276 (1966)
[23] R. A. Hunt: On the convergence of Fourier series, in Orthogonal Expansions and their Continuous Analogues (Proc. Conf., Edwardsville, Ill., 1967), Southern Illinois Univ. Press, 235–255 (1968)
[24] R. A. Hunt: Comments on Luzin’s conjecture and Carleson’s proof for $L^2$ Fourier series, in Linear Operators and Approximation, II (Proc. Conf., Oberwolfach Math. Res. Inst., Oberwolfach 1974), Internat. Ser. Numer. Math., Birkhäuser, Basel, 235–245 (1974)
[25] R. A. Hunt, M. H. Taibleson: Almost everywhere convergence of Fourier series on the ring of integers of a local field, SIAM J. Math. Anal., 2, 607–625 (1971)
[26] O. G. Jørsboe, L. Mejlbro: The Carleson-Hunt theorem on Fourier Series, Springer, Lect. Notes in Math. 911, Berlin 1982
[27] J. P. Kahane: Sur la divergence presque sûre partout de certaines séries de Fourier aléatoires, Ann. Univ. Sci. Budapest, Eötvös Sect. Math., 3–4, 101–108 (1960/61)
[28] N. J. Kalton: Convexity, type and the three space problem, Studia Math., 69, 247–287 (1981)
[29] Y. Katznelson: An Introduction to Harmonic Analysis, Dover, New York 1976
[30] A. Kolmogorov: Une série de Fourier-Lebesgue divergente presque partout, Fund. Math., 4, 324–328 (1923)
[31] A. Kolmogorov: Une série de Fourier-Lebesgue divergente partout, C. R. Acad. Sci. Paris, 183, 1327–1329 (1926)
[32] S. V. Konyagin: On divergence of trigonometric Fourier series everywhere, C. R. Acad. Sci. Paris, 329, 693–697 (1999)
[33] S. V. Konyagin: On the almost everywhere divergence of Fourier series, Matematicheskii Sbornik, 191, 103–126 (2000). Translation in Sb. Math. 191, 361–370 (2000)
[34] T. W. Körner: Everywhere divergent Fourier series, Colloq. Math., 45, 103–118 (1981)
[35] M. Lacey, C. Thiele: A proof of boundedness of the Carleson operator, Math. Res. Lett., 7, 361–370 (2000)
[36] N. Luzin: Sur la convergence des séries trigonométriques de Fourier, C. R. Acad. Sci. Paris, 156, 1655–1658 (1913)
[37] A. Mátté: The convergence of Fourier series of square integrable functions, Matematikai Lapok, 18, 195–242 (1967)
[38] C. J. Mozzochi: On the pointwise convergence of Fourier Series, Springer, Lect. Notes in Math. 199, Berlin 1971
[39] P. Sjölin: An inequality of Paley and convergence a.e. of Walsh-Fourier series, Arkiv Math., 7, 551–570 (1969)
[40] P. Sjölin: Two theorems in Fourier integrals and Fourier series, in Approximations and Function spaces, Banach Center Publ., 22, 413–426, PWN, Warsaw 1989
[41] F. Soria: Note on differentiation of integrals and the halo conjecture, Studia Math., 81, 29–36 (1985)
[42] F. Soria: On an extrapolation theorem of Carleson-Sjölin with applications to a.e. convergence of Fourier series, Studia Math., 94, 235–244 (1989)
[43] F. Soria: Integrability properties of the maximal operator on partial sums of Fourier series, Rend. Circ. Mat. Palermo, 38, 371–376 (1989)
[44] E. M. Stein: On limits of sequences of operators, Ann. of Math., 74, 140–170 (1961)


[45] E. M. Stein: Singular integrals and differentiability properties of functions, Princeton University Press, Princeton 1970
[46] E. M. Stein, G. Weiss: Introduction to Fourier analysis on euclidean spaces, Princeton University Press, Princeton 1971
[47] E. C. Titchmarsh: Reciprocal Formulae involving series and integrals, Math. Z., 25, 321–347 (1926)
[48] A. Torchinski: Real-Variable Methods in Harmonic Analysis, Academic Press, New York 1986
[49] S. Yano: Notes on Fourier Analysis (XXIX): An extrapolation theorem, J. Math. Soc. Japan, 3, 296–305 (1951)
[50] A. Zygmund: Trigonometric Series, Cambridge University Press, New York 1959

Comments

The theorem about the pointwise almost everywhere convergence of Poisson integrals of functions in $L^1$ (theorem 2.10) is contained in [13]. Theorem 2.5 is proved in [20], where Hardy also asks whether the estimate $o(\log n)$ is best possible.

Luzin [36] asked for a real (rather than complex) proof of the existence of the Hilbert transform for $f \in L^2$, and conjectured Carleson’s Theorem. The first author to give a real proof of the existence of the Hilbert transform was Besikovitch [5]. Then Titchmarsh [47] gave one for $f \in L^p$, $p > 1$. Finally Besikovitch [6] obtained the proof in the case $p = 1$. A generalization of all these results is obtained in the seminal paper of Calderón and Zygmund [8].

An important tool for the Carleson-Hunt Theorem is the maximal function of Hardy and Littlewood [21]. Some important consequences about the almost everywhere convergence of Fourier series were obtained by Stein [44].

The first example of a function $f \in L^1$ with a.e. divergent Fourier series was given by Kolmogorov [30]. Four years later he obtained an everywhere divergent Fourier series [31]. The construction and some of its consequences can be found in Körner [34]. Related results are contained in the papers by Kahane [27] and Chen [10]. The latest results about divergent Fourier series have been announced by Konyagin [33].

The proof of Carleson’s Theorem appeared in Acta Math. [9]. In the same year Hunt published a paper [22] reviewing the interpolation properties of the $L^{p,q}$ spaces. This paper contains parts of Hunt’s Ph.D. thesis, written under the direction of G. Weiss. In 1968 Hunt [23] obtained our theorems 12.1, 12.5, 12.6 and 12.8, but the constant he gave in theorem 12.8 is $Cp^5/(p-1)^3$. Many interesting historical notes are given in another paper by Hunt [24].

The first modification of the Carleson Theorem was given by Billard [7], for Walsh series. The extension of the theorem of Hunt to the case of Walsh-Fourier series is given by Sjölin [39]. That paper also proves that the Fourier series of a function in $L \log L \log\log L$ is a.e. convergent.


In Hunt and Taibleson [25] the Carleson-Hunt theorem is extended to the $L^p$ space of the integers of a local field endowed with the Haar measure. The paper [25] also contains a simplified proof of the result of Sjölin, which uses special functions (those that can be written as $f = g\chi_A$, with $1/2 \le g \le 1$).

The first theorem of extrapolation was given by Yano [49]. He proves that if an operator is bounded on $L^p$ ($1 < p < p_0$) with norm $O((p-1)^{-r})$, then it maps $L(\log L)^r$ to $L^1$. This was refined and applied to the operator $S^*$, giving the results of Sjölin that we have mentioned. Soria [41] and [42] defined a quasi-Banach function space $B_1$, and proved that every function in this space has an a.e. convergent Fourier series. This space is similar to our quasi-Banach space $Q$. We have applied a result of Kalton [28] to define the quasi-Banach spaces $Q$ and $QA$.

In another direction Sjölin [39] proved that $S^*$ maps $L(\log L)^{1+\theta}$ into $L(\log L)^{1-\theta}$ for $0 < \theta \le 1$. Finally, Soria [43] gave the final form to the Sjölin result. He proved essentially $\|f\|_Q \le C\|f\|_{L \log L \log\log L}$. The result of Antonov is given in his paper [1].

Other proofs of Carleson’s Theorem have been given. The first was by Fefferman [14] (see Gallavotti and Porzio [16] for an exposition). Recently another short proof has appeared, by Lacey and Thiele [35].

We have consulted many books: Zygmund [50], Bary [2], Edwards [12], Katznelson [29], Stein [45], Stein-Weiss [46], Duoandikoetxea [11], Guzmán [18] and [19], Torchinski [48], Garsia [17], Bergh and Löfström [4], and Bennett and Sharpley [3]. We know of two previous treatments of the Carleson-Hunt theorem, one by Mozzochi [38] and the other by Jørsboe and Mejlbro [26]. We have also noticed that there is one in Hungarian by Mátté [37].
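For reference, the extrapolation theorem of Yano [49] cited above is usually stated along the following lines. This is a sketch of the standard formulation rather than a quotation from [49]: the finite measure space $(X, \mu)$, the sublinearity of $T$, and the constants $A$, $C_1$, $C_2$ are assumptions made here for definiteness.

```latex
% Yano-type extrapolation, standard formulation.
% Assumptions (not taken from the text): finite measure space, T sublinear,
% constants A, C_1, C_2 named here only for illustration.
Let $(X,\mu)$ be a finite measure space, $r > 0$, $1 < p_0 < \infty$,
and let $T$ be a sublinear operator such that
\[
  \|Tf\|_{L^p} \le \frac{A}{(p-1)^{r}}\,\|f\|_{L^p},
  \qquad 1 < p < p_0 .
\]
Then $T$ maps $L(\log L)^r$ into $L^1$, in the sense that
\[
  \int_X |Tf|\,d\mu \;\le\; C_1 \int_X |f|\,(\log^+ |f|)^r\,d\mu + C_2 ,
\]
where $C_1$ and $C_2$ depend only on $A$, $r$, $p_0$ and $\mu(X)$.
```

The point of the hypothesis is the rate at which the $L^p$ norm of the operator blows up as $p \to 1^+$: the exponent $r$ of that blow-up is exactly the power of the logarithm needed in the endpoint space.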
