IRMA Lectures in Mathematics and Theoretical Physics 14 Edited by Chr...

Author:
Weber M.

IRMA Lectures in Mathematics and Theoretical Physics 14 Edited by Christian Kassel and Vladimir G. Turaev

Institut de Recherche Mathématique Avancée CNRS et Université de Strasbourg 7 rue René-Descartes 67084 Strasbourg Cedex France

IRMA Lectures in Mathematics and Theoretical Physics
Edited by Christian Kassel and Vladimir G. Turaev

This series is devoted to the publication of research monographs, lecture notes, and other material arising from programs of the Institut de Recherche Mathématique Avancée (Strasbourg, France). The goal is to promote recent advances in mathematics and theoretical physics and to make them accessible to wide circles of mathematicians, physicists, and students of these disciplines.

Previously published in this series:

1 Deformation Quantization, Gilles Halbout (Ed.)
2 Locally Compact Quantum Groups and Groupoids, Leonid Vainerman (Ed.)
3 From Combinatorics to Dynamical Systems, Frédéric Fauvet and Claude Mitschi (Eds.)
4 Three courses on Partial Differential Equations, Eric Sonnendrücker (Ed.)
5 Infinite Dimensional Groups and Manifolds, Tilman Wurzbacher (Ed.)
6 Athanase Papadopoulos, Metric Spaces, Convexity and Nonpositive Curvature
7 Numerical Methods for Hyperbolic and Kinetic Problems, Stéphane Cordier, Thierry Goudon, Michaël Gutnic and Eric Sonnendrücker (Eds.)
8 AdS/CFT Correspondence: Einstein Metrics and Their Conformal Boundaries, Oliver Biquard (Ed.)
9 Differential Equations and Quantum Groups, D. Bertrand, B. Enriquez, C. Mitschi, C. Sabbah and R. Schäfke (Eds.)
10 Physics and Number Theory, Louise Nyssen (Ed.)
11 Handbook of Teichmüller Theory, Volume I, Athanase Papadopoulos (Ed.)
12 Quantum Groups, Benjamin Enriquez (Ed.)
13 Handbook on Teichmüller Theory, Volume II, Athanase Papadopoulos (Ed.)

Volumes 1–5 are available from Walter de Gruyter (www.degruyter.de)

Michel Weber

Dynamical Systems and Processes

Author: Michel Weber Institut de Recherche Mathématique Avancée CNRS et Université de Strasbourg 7, rue René Descartes 67084 Strasbourg Cedex France

2000 Mathematics Subject Classification: 37-02, 60-02. Key words: Dynamical systems, measure-preserving transformation, ergodic theorems, spectral theorems, convergence almost everywhere, central limit theorem, stochastic processes, gaussian processes, metric entropy method, majorizing measure method, randomization methods, Riemann sums

978-3-03719-046-3 The Swiss National Library lists this publication in The Swiss Book, the Swiss national bibliography, and the detailed bibliographic data are available on the Internet at http://www.helveticat.ch. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained.

© 2009 European Mathematical Society Contact address: European Mathematical Society Publishing House Seminar for Applied Mathematics ETH-Zentrum FLI C4 CH-8092 Zürich Switzerland Phone: +41 (0)44 632 34 36 Email: [email protected] Homepage: www.ems-ph.org Typeset using the author’s TEX files: I. Zimmermann, Freiburg Printed in Germany 987654321

Preface

The aim of this book is to present in a concise and accessible way, as well as in a common setting, various tools and methods arising from spectral theory, ergodic theory and probability theory, which contribute interactively to the current research on almost everywhere convergence problems. The recent developments in the study of these questions are often obtained by combining either methods of spectral theory with principles of ergodic theory or methods from probability theory with tools and principles from spectral theory and ergodic theory. The spectral criterion of Gaposhkin, and later, following a remarkable metric entropy inequality of Talagrand, the spectral regularization developed in the setting of the study of square functions and oscillation functions in ergodic theory, are typical examples of this fruitful interaction. Another example of thorough interaction is certainly the work of Bourgain and notably his famous entropy criterion, at the basis of which lies the continuity principle of Stein.

It was not our aim to write a complete treatise in ergodic theory, assuming such an enterprise to be conceivable. The development of this theory during the last twenty years was indeed considerable. A similar remark can be made for the part concerning the study of the regularity of stochastic processes. The work is also not a synthesis of the most significant results, complete with sketched proofs and references. We chose the intermediate route of writing a book in the spirit of lectures oriented towards research. The book provides easy access to many tools, methods and results used in current research, presenting each of them in as wide a setting as possible. The proofs of these results are often given in full detail.

This book is divided into four parts, which came more or less naturally while writing it. Part I is devoted to spectral results and is followed by Part II, in which tools and results from ergodic theory are presented. In the third part, in connection with the description of two main methods, namely the metric entropy method and the majorizing measure method, recent applications to ergodic theory are given via the study of some maximal inequalities of Gál–Koksma type and the $L^p$ norm, $1 \le p \le \infty$, of important classes of polynomials. Finally, in the last part of the book we recollect classical results, as well as recent advances concerning Riemann sums and Khintchin sums, and the value distribution of divisors of Bernoulli or Rademacher sums, used in the study of Riemann sums.

In Part I we begin elementarily with the spectral inequality. Chapter 1 concerns von Neumann’s theorem, which forms with Birkhoff’s ergodic theorem the basis of ergodic theory. It seems natural to include in this chapter Talagrand’s metric entropy estimate for the set $\{A_n^T f,\ n \ge 1\}$, where $A_n^T = \frac{I + T + \cdots + T^{n-1}}{n}$ is the average operator of a contraction $T$ in a Hilbert space, thus completing naturally the von Neumann theorem. Recently discovered, remarkably efficient, spectral regularization inequalities analysing other structural properties of the set $\{A_n^T f,\ n \ge 1\}$, followed by Weyl’s

criterion and the van der Corput principle, complete this chapter. Chapter 2 starts with presenting the arguments leading to the representation of a weakly stationary process as Fourier transform of a random measure with orthogonal increments. Next we study Gaposhkin’s spectral criterion. In Part II, we first review in Chapter 3 classical ergodic and mixing properties of measurable dynamical systems. We also study several standard examples. Chapter 4 is devoted to Birkhoff’s pointwise theorem, to dominated ergodic theorems in Lp and to BMO spaces of associated maximal operators. This is continued with a discussion around spectral characterizations of the speed of convergence in Birkhoff’s pointwise theorem. Next we examine oscillation functions of ergodic averages. The transference principle and Wiener–Wintner theorems are discussed. A study of weighted ergodic averages concludes this chapter. In Chapter 5, some basic tools from ergodic theory, the Banach principle, the continuity principle and the conjugacy lemma are studied in detail. Chapter 6 concerns entropy criteria of Bourgain. Several functional inequalities linking the studied sequence of L2 -operators with the canonical Gaussian process on L2 are established, from which the criteria are then easily deduced. Study of the statistic of the ergodic averages naturally leads to investigating the question of the existence of some f ∈ L2 such that the related ergodic averages satisfy a central limit theorem, the invariance principle or the almost sure central limit theorem. Chapter 7 is devoted to this study. A detailed proof of the theorem of Burton–Denker on the existence, in any aperiodic dynamical system, of the central limit theorem is given. The method of proof relies upon Kakutani–Rochlin’s lemma and imitates the analogous result for irrational rotations of the unit circle which is obtained by using Fourier series. 
A fundamental fact in the background of the entire construction is provided by using Rochlin’s result on a factor space of Lebesgue space. The case of irrational rotations involving various remarkably efficient methods is more closely investigated. The existence of L2 elements of the torus satisfying the central limit theorem (CLT) is established for various types of means: nonlinear ergodic means, weighted ergodic means, and ergodic means along the squares. For the latter case, the circle method is used. The chapter concludes with a recent study of a kind of achieved form of the CLT, the convergence in variation implying the convergence of related density distributions in the spaces Lp (R), 1 ≤ p ≤ ∞, in the symptomatic case of lacunary random Fourier series. Two rather general methods are investigated in Part III: the metric entropy method and the majorizing measure method. In Chapter 8, a useful criterion for almost everywhere convergence involving covering numbers is proved, and then used to prove in a unified setting several classical results, such as Stechkin’s theorem, Gál–Koksma theorems and quantitative Borel–Cantelli lemmas. The metric entropy method is next applied to establish quite useful estimates of the supremum of random polynomials, notably random Dirichlet polynomials, and to study almost sure convergence properties of weighted series of contractions and random perturbation of some intersective sets in ergodic theory. Chapter 9 concerns an important tool: the majorizing measure method. A general criterion for almost sure convergence of averages is proved by means of this

method. We continue with recent applications of the majorizing measure method to the study of the supremum of random polynomials, including a strictly stronger form of the well-known Salem–Zygmund estimate. Some remarkable classes of examples are studied. Chapter 10 is a succinct study of Gaussian processes presented in the form of a toolbox. Various fundamental results from the theory are discussed, sometimes with historical comments and proofs. Much importance is given to very handy correlation inequalities. Part IV is devoted to three studies: the study of Riemann sums, the study of convergence properties of the system {f (nk x), k ≥ 1} and a probabilistic approach concerning divisors with applications. Chapters 1 to 6 and partially Chapters 8 to 10 are based on lectures given at the Mathematical Institute of the University of Strasbourg. Chapters 11 to 13 are mainly based on research articles, as well as some parts of Chapters 1, 4, 7, 8, 9. In writing this book, we followed a general principle: where the proofs in our source readings were only sketched, we fill in the gaps in as much detail as possible. Further, we give quasisystematically complete references with page numbers and/or precise numeration of cited results. We always keep in mind the wish to help, as much as we can, the researcher but also the teacher and the graduate student in their work in these beautiful areas of mathematics, trying also to spare their time and to let them share our passion for research at the interfaces of related problems. I would like to thank Mikhail Lifshits for the many discussions and encouragements. I would also like to thank Istvan Berkes for his indefectible enthusiasm and the many exchanges and comments, as well as Ulrich Krengel for stimulating comments. I am much indebted and grateful to Irene Zimmermann for her technical assistance and for numerous observations and remarks. 
I thank Manfred Karbe and the European Mathematical Society Publishing House for accepting this work in their IRMA series, and for efficient help in publishing. I devote this book to my wife Marie-Christine. She always provided a favourable atmosphere for mathematical work.

Contents

Preface

Part I Spectral theorems and convergence in mean

1 The von Neumann theorem and spectral regularization
1.1 Bochner–Herglotz lemma
1.2 The spectral inequality
1.3 The von Neumann theorem
1.4 The spectral regularization inequality
1.5 Moving averages
1.6 Uniform distribution mod a – the Weyl criterion
1.7 The van der Corput principle

2 Spectral representation of weakly stationary processes
2.1 Weakly stationary processes
2.2 Spectral representation of unitary operators
2.3 Elements of stochastic integration
2.4 Spectral representation of weakly stationary processes
2.5 Weakly stationary sequences and orthogonal series
2.6 Gaposhkin’s spectral criterion

Part II Ergodic Theorems

3 Dynamical systems – ergodicity and mixing
3.1 Measurable dynamical systems – topological dynamical systems
3.2 Ergodicity of a dynamical system
3.3 Weak mixing, strong mixing, continuous spectrum
3.4 Spectral mixing theorem
3.5 Other equivalences and other forms of mixing
3.6 Examples

4 Pointwise ergodic theorems
4.1 Birkhoff’s pointwise theorem
4.2 Dominated ergodic theorems
4.3 Classes $L \log^m L$
4.4 A converse
4.5 Speed of convergence
4.6 Oscillation functions of ergodic averages
4.7 Wiener–Wintner theorem
4.8 Weighted ergodic averages
4.9 Subsequence averages

5 Banach principle and continuity principle
5.1 Banach principle
5.2 Continuity principle
5.3 Applications
5.4 A principle of domination – conjugacy lemma

6 Maximal operators and Gaussian processes
6.1 Some liaison theorems
6.2 Two preliminary lemmas
6.3 Proof of Theorem 6.1.1
6.4 Proof of Theorem 6.1.6
6.5 The case $L^p$, $1 < p < 2$
6.6 A remarkable GB set property

7 The central limit theorem for dynamical systems
7.1 Introduction and preliminaries
7.2 A theorem of Burton and Denker
7.3 The central limit theorem for orbits
7.4 A theorem of Volný
7.5 CLT for rotations
7.6 Lacunary series and convergence in variation

Part III Methods arising from the theory of stochastic processes

8 The metric entropy method
8.1 Introduction and general results
8.2 A theorem of Stechkin
8.3 An application to the quantitative Borel–Cantelli lemma
8.4 Application to Gál–Koksma’s theorems
8.5 An application to the supremum of random polynomials
8.6 Application to a.s. convergence of weighted series of contractions
8.7 An application to random perturbation of intersective sets
8.8 An application to the discrepancy of some random sequences
8.9 An application to random Dirichlet polynomials

9 The majorizing measure method
9.1 Introduction – the exponential case
9.2 A general approach
9.3 A useful criterion
9.4 Proof of Theorem 9.3.3
9.5 Proof of Theorems 9.3.10 and 9.3.11
9.6 Proof of Theorem 9.3.12 and some examples
9.7 A stronger form of Salem–Zygmund’s estimate
9.8 Some examples and discussion
9.9 Uniform convergence of random Fourier series

10 Gaussian processes
10.1 Gaussian variables and correlation estimates
10.2 0-1 laws, integrability and comparison lemmas
10.3 Regularity and irregularity of Gaussian processes
10.4 Gaussian suprema
10.5 Oscillations of Gaussian Stein’s elements
10.6 Tightness of Gaussian Stein’s elements

Part IV Three studies

11 Riemann sums
11.1 Introduction
11.2 The results of Jessen and Rudin
11.3 Individual theorems of spectral type
11.4 Breadth and dimension
11.5 Bourgain’s results
11.6 Connection with number theory
11.7 Riemann sums and the randomly sampled trigonometric system
11.8 Almost sure convergence and square functions of Riemann sums

12 A study of the system $(f(nx))$
12.1 Introduction and mean convergence
12.2 Almost sure convergence – sufficient conditions
12.3 Almost sure convergence – necessary conditions
12.4 Random sequences

13 Divisors and random walks
13.1 Introduction
13.2 Value distribution and small divisors of Bernoulli sums
13.3 An LIL for arithmetic functions
13.4 On the order of magnitude of the divisor functions
13.5 Value distribution of the divisors of $n^2 + 1$
13.6 Value distribution of the divisors of Rademacher sums
13.7 The functional equation and the Lindelöf Hypothesis
13.8 An extremal divisor case

Bibliography

Index

Part I Spectral theorems and convergence in mean

Chapter 1

The von Neumann theorem and spectral regularization

Von Neumann’s theorem is, together with Birkhoff’s theorem, one of the fundamental results in ergodic theory. A remarkable spectral regularization inequality is established, from which Talagrand’s entropy estimate is deduced, as well as sharp bounds for the Littlewood–Paley square functions. Other averages, like moving averages, are considered. Some useful lemmas, the Bochner–Herglotz lemma, the spectral lemma and the spectral inequality are first established and completed by some other, sometimes less known results. Two important tools are included at the end of the chapter: Weyl’s equidistribution theorem and the van der Corput principle.

1.1 Bochner–Herglotz lemma

The lemmas studied in this section, as well as in the next one, are classical tools of spectral analysis. The spectral inequality, which is easily derived from the Bochner–Herglotz lemma, allows us to reduce many problems of evaluating the norm of vectors to much more tractable questions of harmonic analysis. This tool is often used in ergodic theory. We thus begin by establishing the Bochner–Herglotz lemma.

A function $\gamma\colon \mathbb{R} \to \mathbb{R}$ is nonnegative definite if for any positive integer $n$, and any $u_1, \dots, u_n \in \mathbb{R}$, $a_1, \dots, a_n \in \mathbb{C}$, we have

$$\sum_{1\le i,j\le n} a_i \bar a_j\, \gamma(u_i - u_j) \ge 0.$$

For a continuous function $\gamma\colon \mathbb{R} \to \mathbb{R}$, an equivalent definition of nonnegative definiteness is that for any measurable bounded function $\xi(x)$ vanishing outside some finite interval,

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \gamma(t - s)\,\xi(t)\,\overline{\xi(s)}\, dt\, ds \ge 0.$$

A sequence of complex numbers $\{a_k, k \in \mathbb{Z}\}$ is nonnegative definite if $a_{-k} = \bar a_k$ and if the inequality

$$\sum_{1\le i,j\le n} \rho_i \bar\rho_j\, a_{i-j} \ge 0$$

holds for any finite system of complex numbers $\rho_1, \dots, \rho_n$. A function $\gamma\colon \mathbb{Z} \to \mathbb{R}$ is thus nonnegative definite if the sequence $\{\gamma(k), k \in \mathbb{Z}\}$ is nonnegative definite. These notions immediately extend to functions defined on $\mathbb{R}^d$ or $\mathbb{Z}^d$. Let $\mathbb{T} = \mathbb{R}/\mathbb{Z} = [0, 1[$ be the circle equipped with the normalized Lebesgue measure $\lambda$, and let $\mathbb{T}^d$ denote the $d$-dimensional torus equipped with the measure $\lambda^d$.

1.1.1 Lemma. a) Let $\gamma\colon \mathbb{R}^d \to \mathbb{R}$ be continuous, nonnegative definite. Then there exists a nonnegative bounded measure $\mu$ on $\mathbb{R}^d$, such that for any $x \in \mathbb{R}^d$,

$$\gamma(x) = \int_{\mathbb{R}^d} e^{i\langle t, x\rangle}\, \mu(dt).$$

b) Let $\gamma\colon \mathbb{Z}^d \to \mathbb{R}$ be nonnegative definite. Then there exists a nonnegative bounded measure $\mu$ on $\mathbb{T}^d$, such that for any $k \in \mathbb{Z}^d$,

$$\gamma(k) = \int_{\mathbb{T}^d} e^{2i\pi\langle k, t\rangle}\, \mu(dt).$$

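Proof. We give the proof for $d = 1$, the multidimensional case being obtained in a quite identical way. Let $Z$ denote some positive integer. Consider a) first. Put

$$I_k = \int_{[0,Z[^k} \sum_{i,j=1}^{k} e^{-i(u_i - u_j)x}\, \gamma(u_i - u_j)\, du_1 \dots du_k.$$

By assumption $I_k \ge 0$. Moreover,

$$\begin{aligned}
I_k &= k\gamma(0) \int_{[0,Z[^k} du_1 \dots du_k + \sum_{\substack{i,j=1\\ i\ne j}}^{k} \int_{[0,Z[^{k-2}} \frac{du_1 \dots du_k}{du_i\, du_j} \int_0^Z\!\!\int_0^Z e^{-i(u_i - u_j)x}\, \gamma(u_i - u_j)\, du_i\, du_j \\
&= k\gamma(0) Z^k + k(k-1) Z^{k-2} \int_0^Z\!\!\int_0^Z e^{-i(u - v)x}\, \gamma(u - v)\, du\, dv.
\end{aligned}$$

Dividing by $k(k-1)Z^{k-2}$ and then letting $k$ tend to infinity implies

$$\int_0^Z\!\!\int_0^Z e^{-i(u - v)x}\, \gamma(u - v)\, du\, dv \ge 0.$$

Making the change of variables $u - v = t$ gives

$$\begin{aligned}
\int_0^Z \int_{-v}^{Z-v} e^{-itx}\, \gamma(t)\, dt\, dv &= \int_{-Z}^{Z} e^{-itx}\, \gamma(t)\, \{\min(Z, Z - t) - \sup(0, -t)\}\, dt \\
&= \int_{-Z}^{Z} e^{-itx}\, \gamma(t)\, (Z - |t|)\, dt \ge 0.
\end{aligned}$$

Let

$$\gamma_Z(x) = \gamma(x)\, (1 - |x|/Z)\, \mathbf{1}_{[-Z,Z]}(x), \qquad \hat\gamma_Z(x) = \int_{\mathbb{R}} e^{-itx}\, \gamma_Z(t)\, dt.$$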

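Then $\hat\gamma_Z(x) \ge 0$, and evidently $\gamma_Z \in L^\infty(\mathbb{R})$. We show that $\hat\gamma_Z \in L^1(\mathbb{R})$. Integrating $\hat\gamma_Z(x)$ over $\mathbb{R}$ with respect to the density $\frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}$ yields

$$\int_{\mathbb{R}} \hat\gamma_Z(x)\, e^{-\frac{x^2}{2\sigma^2}}\, \frac{dx}{\sigma\sqrt{2\pi}} = \int_{\mathbb{R}} \gamma_Z(t) \left( \int_{\mathbb{R}} e^{-itx - \frac{x^2}{2\sigma^2}}\, \frac{dx}{\sigma\sqrt{2\pi}} \right) dt = \int_{\mathbb{R}} \gamma_Z(t)\, e^{-\sigma^2 t^2/2}\, dt.$$

Hence, since $\|\gamma_Z\|_\infty \le \gamma(0)$,

$$\begin{aligned}
\int_{\mathbb{R}} \hat\gamma_Z(x)\, e^{-\frac{x^2}{2\sigma^2}}\, dx &= \sigma\sqrt{2\pi} \int_{\mathbb{R}} \gamma_Z(t)\, e^{-\sigma^2 t^2/2}\, dt \\
&\le \sigma\sqrt{2\pi}\, \gamma(0) \int_{\mathbb{R}} e^{-\sigma^2 t^2/2}\, dt = 2\pi\, \gamma(0).
\end{aligned}$$

But $\hat\gamma_Z(x) \ge 0$. Letting $\sigma$ tend to infinity increasingly finally shows, in view of Fatou’s lemma, that $\hat\gamma_Z \in L^1(\mathbb{R})$. Now we need the Fourier inversion theorem: Let $h, \hat h \in L^1(\mathbb{R}^d)$. Then for almost all $x$,

$$h(x) = \int_{\mathbb{R}^d} e^{i\langle t, x\rangle}\, \hat h(t)\, dt.$$

Thus $\hat\gamma_Z \in L^1(\mathbb{R})$ and for almost all $x$, $\gamma_Z(x) = \int_{\mathbb{R}} e^{itx}\, \hat\gamma_Z(t)\, dt$. As $\gamma_Z$ and the mapping $x \mapsto \int_{\mathbb{R}} e^{itx}\, \hat\gamma_Z(t)\, dt$ are continuous, the above equality holds in turn everywhere. Hence

$$\gamma(0) = \gamma_Z(0) = \int_{\mathbb{R}} \hat\gamma_Z(t)\, dt.$$

Denote by $\mu_Z$ the measure on $\mathbb{R}$ having density $\hat\gamma_Z(t)$. Since $\gamma_Z(x) \to \gamma(x)$ everywhere as $Z$ tends to infinity, we get

$$\lim_{Z\to\infty} \hat\mu_Z(x) = \gamma(x).$$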

By assumption $\gamma$ is continuous. It follows from the corollary on p. 481 in [Feller: 1966, II] that there exists a nonnegative bounded measure $\mu$ on $\mathbb{R}$ such that $\gamma(x) = \hat\mu(x)$.

We pass to the proof of b). By assumption $\sum_{n,m=1}^{Z} e^{-2i\pi(n-m)x}\, \gamma(n - m) \ge 0$. This sum can also be written as

$$\begin{aligned}
\sum_{n,m=1}^{Z} e^{-2i\pi(n-m)x}\, \gamma(n - m) &= \sum_{n=1}^{Z} \sum_{p=n-Z}^{n-1} e^{-2i\pi xp}\, \gamma(p) = \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp}\, \gamma(p) \sum_{\substack{p+1\le n\le p+Z\\ 1\le n\le Z}} 1 \\
&= \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp}\, \gamma(p)\, \{\min(p + Z, Z) - \max(1, p + 1) + 1\} \\
&= \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp}\, \gamma(p)\, (Z - |p|).
\end{aligned}$$

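Put $\gamma_Z(p) = \gamma(p)\, \mathbf{1}_{\{-Z+1, \dots, Z-1\}}(p)\, (1 - |p|/Z)$ and $g_Z(x) = \sum_{p\in\mathbb{Z}} e^{-2i\pi xp}\, \gamma_Z(p)$. Then $\hat\gamma_Z(-x) = g_Z(x) \ge 0$, and since $\gamma_Z$ has compact support, $g_Z$ is bounded continuous. Further

$$\int_{\mathbb{T}} g_Z(x)\, e^{2i\pi xr}\, dx = \sum_{p\in\mathbb{Z}} \gamma_Z(p) \int_{\mathbb{T}} e^{2i\pi x(r-p)}\, dx = \gamma_Z(r).$$

In particular $\gamma_Z(0) = \gamma(0) = \int_{\mathbb{T}} g_Z(x)\, dx$, thereby implying that the nonnegative measures $\nu_Z$ on $(\mathbb{T}, \mathcal{B}(\mathbb{T}))$ with density $g_Z(x)$ are relatively compact for the weak convergence topology $\mathcal{D}$ on $\mathbb{T}$. Hence, there exists a subsequence $J$ and a bounded nonnegative measure $\nu$ on $\mathbb{T}$ such that

$$\lim_{J\ni Z\to\infty} \nu_Z = \nu \quad \text{in the topology } \mathcal{D},$$

and $\lim_{J\ni Z\to\infty} \gamma_Z(r) = \int_{\mathbb{T}} e^{2i\pi xr}\, \nu(dx)$. Since $\lim_{Z\to\infty} \gamma_Z(r) = \gamma(r)$, we get for any $r \in \mathbb{Z}$,

$$\gamma(r) = \int_{\mathbb{T}} e^{2i\pi xr}\, \nu(dx).$$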
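The rearrangement used in the proof of b), which turns the double sum over $1 \le n, m \le Z$ into a single sum weighted by $Z - |p|$, can be checked numerically. The following is only a minimal sketch: the function names and the test sequence $\gamma(k) = 2^{-|k|}$ are illustrative assumptions, not taken from the text.

```python
import cmath


def double_sum(gamma, Z, x):
    """Sum over 1 <= n, m <= Z of exp(-2*pi*i*(n-m)*x) * gamma(n-m)."""
    return sum(
        cmath.exp(-2j * cmath.pi * (n - m) * x) * gamma(n - m)
        for n in range(1, Z + 1)
        for m in range(1, Z + 1)
    )


def fejer_sum(gamma, Z, x):
    """Sum over |p| <= Z-1 of exp(-2*pi*i*p*x) * gamma(p) * (Z - |p|)."""
    return sum(
        cmath.exp(-2j * cmath.pi * p * x) * gamma(p) * (Z - abs(p))
        for p in range(-Z + 1, Z)
    )


def gamma_example(k):
    # Hypothetical test sequence (an assumption for illustration):
    # gamma(k) = 2^{-|k|}, the Fourier coefficients of a Poisson kernel,
    # hence a nonnegative definite sequence.
    return 0.5 ** abs(k)


if __name__ == "__main__":
    for Z in (1, 4, 7):
        for x in (0.0, 0.13, 0.5):
            lhs = double_sum(gamma_example, Z, x)
            rhs = fejer_sum(gamma_example, Z, x)
            print(Z, x, abs(lhs - rhs))
```

Both functions evaluate the same quantity, so the printed differences should be at the level of rounding error, and the common value should have a nonnegative real part for a nonnegative definite test sequence.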

Schoenberg’s theorem. Schoenberg [1938] found a beautiful complement to Bochner’s theorem, which is worth formulating here. Let $f\colon \mathbb{R}_+ \to \mathbb{R}_+$ be continuous, nonnegative definite. Assume that $f(0) = 1$. Schoenberg’s theorem translates, via Bochner’s theorem, into the equivalence of the following two assertions:

(a) For all $d \ge 1$, there is a probability measure $\mu_d$ on $\mathbb{R}^d$ such that for every $x \in \mathbb{R}^d$,

$$f(\|x\|_d) = \int_{\mathbb{R}^d} e^{i\langle x, y\rangle}\, \mu_d(dy).$$

Here $\|x\|_d$ is the Euclidean norm on $\mathbb{R}^d$.

(b) There exists a Borel probability $\nu$ on $\mathbb{R}_+$ such that for any positive real $t$,

$$f(t) = \int_0^\infty e^{-st^2/2}\, \nu(ds).$$

There is a proof of Schoenberg’s theorem via the law of large numbers in Khoshnevisan [2005], to which we may also refer as a source.

1.1.2 Remarks. 1. Nonnegative definite sequences are characterized by the previous lemma. According to it, a sequence $\{\gamma_k, k \in \mathbb{Z}\}$ is nonnegative definite if and only if there exists a weakly stationary sequence $\{X_n, n \ge 1\}$ in a Hilbert space $H$ such that for any positive integers $h$ and $k$, $\langle X_h, X_k\rangle = \gamma_{h-k}$. This point can also be established by means of a direct vector representation in $H$, see Ky Fan [1946: Paragraph 2 and Appendix]. Nonnegative definite sequences are closely related to nonnegative trigonometric polynomials.

2. A trigonometric polynomial $\sum_{k=-p}^{p} z_k e^{ik\theta}$ with $z_{-k} = \bar z_k$ and taking only nonnegative values is said to be nonnegative. In view of a classical result of Fejér and F. Riesz (Fejér [1915]), there exist $p + 1$ complex numbers $\rho_0, \rho_1, \dots, \rho_p$ such that

$$\sum_{k=-p}^{p} z_k e^{ik\theta} = \left| \rho_0 + \rho_1 e^{i\theta} + \cdots + \rho_p e^{ip\theta} \right|^2.$$

3. We also quote a theorem due to Szász [1918] (see Ky Fan [1946: Paragraph 3]). A sequence $\{a_n, n \in \mathbb{Z}\}$ is nonnegative definite if and only if

$$\sum_{k=-p}^{p} a_k z_k \ge 0$$

holds for any nonnegative trigonometric polynomial $\sum_{k=-p}^{p} z_k e^{ik\theta}$ of arbitrary order $p$. This characterization is to be compared with the one of Hausdorff [1923]: the sequence $\{a_n, n \in \mathbb{Z}\}$ is nonnegative definite if and only if

$$\sum_{h=1}^{p} \sum_{k=1}^{p} a_{h-k}\, e^{i(h-k)\theta} \ge 0$$

is satisfied for any positive integer $p$ and any real $\theta$.

Below we list some standard examples of nonnegative definite sequences and weakly stationary sequences.

1.1.3 Examples. (1) Given a weakly stationary sequence $\{X_n, n \ge 1\}$ in $H$, it is readily seen that, for any real value of $\vartheta$, the sequence $\{e^{-in\vartheta} X_n, n \ge 1\}$ is weakly stationary too. Anticipating a bit von Neumann’s theorem, for any value of $\vartheta$ the limit

$$\lim_{n\to\infty} \frac{e^{-i\vartheta} X_1 + e^{-i2\vartheta} X_2 + \cdots + e^{-in\vartheta} X_n}{n}$$

also exists. Further (see before Remarks 1.3.4), if $\vartheta_1 \not\equiv \vartheta_2 \pmod{2\pi}$, the limits associated with $\vartheta_1$ and $\vartheta_2$ are orthogonal elements in $H$. And there exists at most a countably infinite set of values of $\vartheta$ for which this limit differs from the null element of $H$ (see Ky Fan [1946: Paragraph 6]).

(2) Let a function from $\mathbb{R}$ to $\mathbb{R}_+$ be even, convex and nonincreasing. Then the sequence of its values at the integers $n \in \mathbb{Z}$ is nonnegative definite. This follows from a classical theorem due to Pólya.

(3) Let $S$ be the space of correlated sequences introduced by Wiener [1933: Chapter 4], namely the space of sequences $a = \{a(n), n \in \mathbb{Z}\}$ with $a(-n) = \overline{a(n)}$, such that for any $k \ge 0$ the limit

$$\gamma_a(k) = \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} \overline{a(j)}\, a(j + k)$$

exists. Observe that for any integers $r, s$ with $0 \le r \le s$,

$$\gamma_a(s - r) = \lim_{n\to\infty} \frac{1}{n} \sum_{h=0}^{n-1} \overline{a(h + r)}\, a(h + s).$$

From this it follows that the sequence $\{\gamma_a(k), k \ge 0\}$ is nonnegative definite. Indeed,

$$\begin{aligned}
\sum_{k,l=1}^{m} c_k \bar c_l\, \gamma_a(k - l) &= \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} \sum_{k,l=1}^{m} c_k \bar c_l\, \overline{a(j + l)}\, a(j + k) \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} \Bigl| \sum_{k=1}^{m} c_k\, a(j + k) \Bigr|^2 \ge 0.
\end{aligned}$$

In view of the Bochner–Herglotz theorem, there exists a uniquely determined nonnegative bounded measure on $[-\pi, \pi[$, called the spectral measure of the sequence $a$. Consider the family of measures, indexed by $J$, given by

$$\frac{1}{J} \Bigl| \sum_{0\le j<J} e^{-ij\alpha}\, a(j) \Bigr|^2 d\alpha.$$

A theorem due to Coquet, Kamae and Mendes-France [1977: Theorem 1] shows that this family of measures converges weakly to the spectral measure of $a$. To establish this property, it suffices to show that the sequence of Fourier transforms of these measures converges pointwise to the Fourier transform of the spectral measure, which is easily checked.
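As a small numerical illustration of the link between measures on the circle and nonnegative definite sequences discussed in this section, one can start from a discrete nonnegative measure, form the sequence $\gamma(k) = \int_{\mathbb{T}} e^{2i\pi kt}\, \mu(dt)$, and check that the quadratic form $\sum_{i,j} \rho_i \bar\rho_j \gamma(i-j)$ is nonnegative. This is only a sketch under stated assumptions: the atoms, masses, coefficients and function names below are hypothetical choices made for the example.

```python
import cmath


def gamma_from_measure(atoms, masses, k):
    """gamma(k) = sum_j w_j * exp(2*pi*i*k*t_j) for a discrete measure
    with atoms t_j in [0, 1[ and masses w_j >= 0."""
    return sum(w * cmath.exp(2j * cmath.pi * k * t) for t, w in zip(atoms, masses))


def quadratic_form(gamma, rho):
    """sum over i, j of rho_i * conj(rho_j) * gamma(i - j)."""
    n = len(rho)
    return sum(
        rho[i] * complex(rho[j]).conjugate() * gamma(i - j)
        for i in range(n)
        for j in range(n)
    )


def direct_form(atoms, masses, rho):
    """The same quantity written as sum_j w_j * |sum_i rho_i e^{2 pi i i t_j}|^2,
    which is visibly nonnegative."""
    return sum(
        w * abs(sum(r * cmath.exp(2j * cmath.pi * i * t) for i, r in enumerate(rho))) ** 2
        for t, w in zip(atoms, masses)
    )


if __name__ == "__main__":
    # Hypothetical data, chosen only for illustration.
    atoms = [0.1, 0.35, 0.8]
    masses = [0.5, 1.0, 0.25]
    rho = [1 + 2j, -0.5j, 3, 0.25 - 1j]
    q = quadratic_form(lambda k: gamma_from_measure(atoms, masses, k), rho)
    print(q, direct_form(atoms, masses, rho))
```

The two printed values should agree up to rounding, since expanding the squared modulus in `direct_form` reproduces the double sum in `quadratic_form`.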

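1.2 The spectral inequality

Bochner–Herglotz’s lemma has a very useful consequence, which we now state.

1.2.1 Lemma. Let $T$ be a contraction in a Hilbert space $H$. For any $n \in \mathbb{Z}$, let $T_n = T^n$ if $n \ge 0$ and $T_n = T^{*|n|}$ if $n < 0$. Let $x \in H$. The sequence $\{\langle T_n x, x\rangle, n \in \mathbb{Z}\}$ is nonnegative definite, and there exists a uniquely determined nonnegative bounded measure $\mu_x$ on $\mathbb{T}$, the spectral measure of $T$ at $x$, verifying

$$\langle T_n x, x\rangle = \int_{\mathbb{T}} \exp(2i\pi nt)\, \mu_x(dt) \qquad (\forall n \in \mathbb{Z}).$$

Proof. The second assertion follows from Lemma 1.1.1. The first assertion is simple when $T$ is an isometry:

$$\sum_{l,m=-n}^{n} z_l \bar z_m \langle T_{l-m} x, x\rangle = \sum_{l} \sum_{m} z_l \bar z_m \langle T_{l+n} x, T_{m+n} x\rangle = \Bigl\| \sum_{l} z_l T_{l+n} x \Bigr\|^2 \ge 0.$$

For the general case, we put for any $0 < r < 1$ and $t \in \mathbb{T}$,

$$U(r, t) = \sum_{k\ge 0} r^k e^{2i\pi kt}\, T^k, \qquad V(r, t) = \sum_{k\in\mathbb{Z}} r^{|k|} e^{2i\pi kt}\, T_k = -I + U(r, t) + U(r, t)^*.$$

If $y = U(r, t)x$, we have $y - x = r e^{2i\pi t}\, T y$. Thus $\|y - x\| \le \|y\|$, and this shows

$$\langle V(r, t)x, x\rangle = -\langle x, x\rangle + \langle y, x\rangle + \langle x, y\rangle = \langle y, y\rangle - \langle y - x, y - x\rangle \ge 0.$$

For any complex numbers $\{z_l, |l| \le n\}$, we have

$$\begin{aligned}
\sum_{l,m=-n}^{n} r^{|l-m|} z_l \bar z_m \langle T_{l-m} x, x\rangle &= \sum_{l,m=-n}^{n} \sum_{k} z_l \bar z_m \langle T_k x, x\rangle\, r^{|k|} \int_{\mathbb{T}} e^{2i\pi(k-(l-m))t}\, dt \\
&= \int_{\mathbb{T}} \sum_{l,m=-n}^{n} z_l \bar z_m\, e^{-2i\pi(l-m)t}\, \langle V(r, t)x, x\rangle\, dt \\
&= \int_{\mathbb{T}} \Bigl| \sum_{l} z_l e^{-2i\pi lt} \Bigr|^2 \langle V(r, t)x, x\rangle\, dt \ge 0.
\end{aligned}$$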

Letting $r$ tend to 1 gives the required inequality.

We shall now deduce from the spectral lemma an extremely useful tool.

1.2.2 Proposition. Let $T$ be a contraction in a Hilbert space $H$, and let $p$ be a polynomial. Then, for any $x \in H$,

\[ \|p(T)x\|^2 \le \int_{\mathbb{T}} \big|p(e^{2i\pi t})\big|^2\, \mu_x(dt), \]

where the measure $\mu_x$ is the same as in Lemma 1.2.1.

Proof. We follow an argument due to Wierdl. The inequality is obviously satisfied if the order of $p$ is equal to 0. Assume now that the inequality is true for any polynomial of order $k-1$. Let $p(y) = a_0 + \dots + a_k y^k$, and consider the auxiliary polynomials

\[ q(y) = a_1 y + \dots + a_k y^k, \qquad u(y) = \frac{q(y)}{y} = a_1 + \dots + a_k y^{k-1}. \]


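We have $|p(y)|^2 = |a_0 + q(y)|^2 = |a_0|^2 + a_0 \overline{q(y)} + q(y)\bar a_0 + |q(y)|^2$, and

\[ \|p(T)x\|^2 = \|(a_0 I + q(T))x\|^2 = |a_0|^2 \|x\|^2 + \langle a_0 x, q(T)x\rangle + \langle q(T)x, a_0 x\rangle + \|q(T)x\|^2. \]

By using the induction hypothesis, we have

\[ \|u(T)x\|^2 \le \int_{\mathbb{T}} \big|u(e^{2i\pi t})\big|^2\, \mu_x(dt). \]

Since $T$ is a contraction, $\|q(T)x\| = \|T u(T)x\| \le \|u(T)x\|$. And as $|u(e^{2i\pi t})| = |q(e^{2i\pi t})|$, we get

\[ \|q(T)x\|^2 \le \int_{\mathbb{T}} \big|q(e^{2i\pi t})\big|^2\, \mu_x(dt). \]

Besides,

\[ \langle a_0 x, q(T)x\rangle = \int_{\mathbb{T}} a_0\, \overline{q(e^{2i\pi t})}\, \mu_x(dt) \quad\text{and}\quad \langle q(T)x, a_0 x\rangle = \int_{\mathbb{T}} q(e^{2i\pi t})\, \bar a_0\, \mu_x(dt). \]

By putting together these various estimates, we obtain

\[ \|p(T)x\|^2 \le \int_{\mathbb{T}} \Big[ |a_0|^2 + a_0\, \overline{q(e^{2i\pi t})} + q(e^{2i\pi t})\, \bar a_0 + |q(e^{2i\pi t})|^2 \Big]\, \mu_x(dt), \]

and this establishes the spectral inequality for all polynomials of order $k$, and thereby for any polynomial.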
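As a numerical illustration of Proposition 1.2.2, the following Python sketch treats the special case of a diagonal (hence normal) contraction $T = \mathrm{diag}(\lambda_1, \dots, \lambda_d)$ with $|\lambda_j| < 1$. It assumes, and this is not taken from the text, that in this case $\mu_x$ has density $\sum_j |x_j|^2 P_{\lambda_j}(t)$ with respect to $dt$, where $P_\lambda$ is the Poisson kernel of the unit disc. The function names and the test data are hypothetical.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval  # evaluates a_0 + a_1 y + ... + a_k y^k

def poisson_kernel(lam, t):
    """Poisson kernel of the unit disc at lam (|lam| < 1), on the circle e^{2 i pi t}."""
    return (1.0 - abs(lam) ** 2) / np.abs(np.exp(2j * np.pi * t) - lam) ** 2

def spectral_density(eigs, x, t):
    """Assumed density of mu_x for T = diag(eigs): sum_j |x_j|^2 P_{lam_j}(t)."""
    return sum(abs(c) ** 2 * poisson_kernel(lam, t) for lam, c in zip(eigs, x))

def both_sides(eigs, x, coeffs, grid=4096):
    """Return (||p(T)x||^2, integral of |p(e^{2 i pi t})|^2 mu_x(dt))."""
    t = np.arange(grid) / grid
    z = np.exp(2j * np.pi * t)
    # Left side: T is diagonal, so p(T)x has coordinates p(lam_j) x_j.
    lhs = sum(abs(c) ** 2 * abs(polyval(lam, coeffs)) ** 2 for lam, c in zip(eigs, x))
    # Right side: uniform Riemann sum over one period of a smooth periodic integrand.
    rhs = np.mean(np.abs(polyval(z, coeffs)) ** 2 * spectral_density(eigs, x, t))
    return lhs, rhs

eigs = [0.5, 0.3j, -0.8]     # eigenvalues of the contraction (hypothetical data)
x = [1.0, 2.0, -1.0]         # coordinates of the vector x
coeffs = [1.0, -2.0, 0.5]    # p(y) = 1 - 2y + 0.5 y^2
lhs, rhs = both_sides(eigs, x, coeffs)
print(lhs, rhs)
```

In this example the left side is strictly smaller than the right side; equality would require all $|\lambda_j| = 1$, that is, a unitary $T$.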

1.3 The von Neumann theorem

Let $T$ be a contraction in a Hilbert space $H$ and introduce the operators

\[ A_n = A_n^T = \frac1n \sum_{k=0}^{n-1} T^k, \qquad n = 1, 2, \dots. \tag{1.3.1} \]

The fundamental result of von Neumann [1931] can be stated as follows.

1.3.1 Theorem. The limit $\lim_{n\to\infty} A_n f = \bar f$ exists for any $f \in H$, and the map $P_T : f \mapsto \bar f$ is the orthogonal projection of $H$ onto the subspace of invariant vectors $H_T = \{g \in H : Tg = g\}$. Further $H = \overline{H_0} \oplus H_T$, where $H_0 = \{g - Tg, g \in H\}$.

Proof. (1) The proof is based on the following lemma.


1.3.2 Lemma. Let $T$ be a contraction in a Hilbert space $H$. Then the adjoint operator (Section 2.2.6) $T^*$ has the same fixed points as $T$.

Proof. $T^*$ is also a contraction and if $Tf = f$, then $\langle f, Tf\rangle = \langle Tf, f\rangle = \|f\|^2$. Conversely $\langle f, Tf\rangle = \langle Tf, f\rangle = \|f\|^2$ implies $\langle f, T^*f\rangle = \|f\|^2$ and

\[ \|Tf - f\|^2 = \langle Tf - f, Tf - f\rangle = \|Tf\|^2 + \|f\|^2 - \langle f, Tf\rangle - \langle Tf, f\rangle = \|Tf\|^2 - \langle f, T^*f\rangle \le 0. \]

Thus $Tf = f$ and so $Tf = f \Leftrightarrow \langle Tf, f\rangle = \langle T^*f, f\rangle = \|f\|^2$. Therefore

\[ Tf = f \iff \langle Tf, f\rangle = \langle T^*f, f\rangle = \langle f, f\rangle \iff T^*f = f. \]

(2) We show that $H = \overline{H_0} \oplus H_T$. According to (1), for any $f \in H_T$,

\[ \langle f, g - Tg\rangle = \langle f, g\rangle - \langle f, Tg\rangle = \langle f, g\rangle - \langle T^*f, g\rangle = 0. \]

Hence $H_T \subset H_0^{\perp}$. Besides, if $f$ is orthogonal to $H_0$, then $0 = \langle f, g - Tg\rangle = \langle f - T^*f, g\rangle$ for any $g$ in $H$. Thus $T^*f = f$, and thereby $Tf = f$. This implies that $H_0^{\perp} = H_T$.

(3) It is plain that the theorem is satisfied for any vector of the type $f + g - Tg$, $f \in H_T$ and $g \in H$. Indeed

\[ \frac1n \sum_{k=0}^{n-1} T^k (f + g - Tg) = \frac1n \sum_{k=0}^{n-1} T^k f + \frac1n \sum_{k=0}^{n-1} (T^k g - T^{k+1} g) = f + \frac1n (g - T^n g) \to f, \]

as $n$ tends to infinity, and $f$ is the orthogonal projection on $H_T$ of $f + g - Tg$.

(4) According to (2), these vectors are dense in $H$. The operators $A_n$ are contractions as well. It follows that the set of vectors for which the theorem is true is closed in $H$. Let indeed

\[ \mathcal{A} = \Big\{ x \in H \ \text{such that if}\ y = \mathrm{proj}_{\overline{H_0}}(x)\ \text{then}\ \lim_{n\to\infty} A_n(y) = 0 \Big\}. \]

We show that $\mathcal{A}$ is closed. Let $\{x_n, n \ge 1\} \subset \mathcal{A}$, $x_n \to x$. Then $y_n \to y$, and

\[ \|A_N(y)\| \le \|A_N(y - y_p)\| + \|A_N(y_p)\| \le \|A_N(y_p)\| + \|y - y_p\|. \]

Let $\varepsilon > 0$ and let $p$ be a fixed integer such that $\|y - y_p\| < \varepsilon/2$. Let $N(\varepsilon)$ be such that for any $N \ge N(\varepsilon)$, $\|A_N(y_p)\| \le \varepsilon/2$. We obtain that $\|A_N(y)\| \le \varepsilon$. Thus $\mathcal{A}$ is closed in $H$ and the theorem is established.

Let $\{X_n, n \ge 0\}$ be a weakly stationary sequence in a Hilbert space $H$. According to Theorem 2.1.3, there exists a unitary operator $U$ on $H$ such that $X_n = U^n X_0$. By von Neumann’s theorem, we get that the limit

\[ M(X) := \lim_{n\to\infty} \frac{X_0 + \dots + X_{n-1}}{n} \]

exists in $H$. It can be directly observed that the inner product $\langle M(X), X_h\rangle$ is independent of $h$. Indeed, by using the weak stationarity

\[ \langle M(X), X_h\rangle = \lim_{n\to\infty} \Big\langle \frac{X_{h+1} + \dots + X_{h+n}}{n}, X_h \Big\rangle = \lim_{n\to\infty} \Big\langle \frac{X_{k+1} + \dots + X_{k+n}}{n}, X_k \Big\rangle = \langle M(X), X_k\rangle. \]

And consequently

\[ \langle M(X), X_h\rangle = \Big\langle M(X), \frac{X_1 + \dots + X_n}{n} \Big\rangle, \]

which gives as $n$ tends to infinity: $\langle M(X), X_h\rangle = \|M(X)\|^2$. As observed in Examples 1.1.3, for any real value of $\vartheta$, the sequence $\{e^{-in\vartheta} X_n, n \ge 1\}$ is weakly stationary too. The limit

\[ M(X, \vartheta) = \lim_{n\to\infty} \frac{e^{-i\vartheta} X_1 + e^{-i2\vartheta} X_2 + \dots + e^{-in\vartheta} X_n}{n} \]

thus also exists, for any value of $\vartheta$. Then $\langle e^{-ih\vartheta} X_h, M(X, \vartheta)\rangle = \|M(X, \vartheta)\|^2$, independently of $h$. Hence

\[ \Big\langle \frac{e^{-i\vartheta_1} X_1 + e^{-i2\vartheta_1} X_2 + \dots + e^{-in\vartheta_1} X_n}{n}, M(X, \vartheta_2) \Big\rangle = \frac{e^{i(\vartheta_2 - \vartheta_1)} + e^{i2(\vartheta_2 - \vartheta_1)} + \dots + e^{in(\vartheta_2 - \vartheta_1)}}{n}\, \|M(X, \vartheta_2)\|^2. \]

Therefore, if $\vartheta_1 \ne \vartheta_2 \pmod{2\pi}$, the last equation becomes, as $n$ tends to infinity, $\langle M(X, \vartheta_1), M(X, \vartheta_2)\rangle = 0$, as claimed in Examples 1.1.3.

Weakly stationary sequences, however, enjoy other remarkable properties; among them is certainly the following identity, which does not seem to be widely known.

An identity of Ky Fan. For any two positive integers $n, m$,

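\[ \frac{\|X_1 + \dots + X_n\|^2}{n} + \frac{\|X_1 + \dots + X_m\|^2}{m} - \frac{\|X_1 + \dots + X_{n+m}\|^2}{n+m} = \frac{n(n+m)}{m} \Big\| \frac{X_1 + \dots + X_n}{n} - \frac{X_1 + \dots + X_{n+m}}{n+m} \Big\|^2. \]

This nice identity was observed and applied in Ky Fan [1946: 598]. The proof goes as follows. Put for any positive integer $n$, $S_n = X_1 + \dots + X_n$, and if $m$ is another positive integer let $T_{n,m} = S_{n+m} - S_n$, so that $S_{n+m} = S_n + T_{n,m}$. Then

\[ \Big\| \frac{S_n}{n} - \frac{S_{n+m}}{n+m} \Big\|^2 = \frac{\|S_n\|^2}{n^2} + \frac{\|S_{n+m}\|^2}{(n+m)^2} - \Big\langle \frac{S_n}{n}, \frac{S_{n+m}}{n+m} \Big\rangle - \Big\langle \frac{S_{n+m}}{n+m}, \frac{S_n}{n} \Big\rangle, \]

and so

\[ \frac{n(n+m)}{m} \Big\| \frac{S_n}{n} - \frac{S_{n+m}}{n+m} \Big\|^2 = \frac{n+m}{nm} \|S_n\|^2 + \frac{n}{m(n+m)} \|S_{n+m}\|^2 - \frac1m \big[ \langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle \big] \]
\[ = \frac{\|S_n\|^2}{n} + \frac{\|S_n\|^2}{m} + \frac{\|S_m\|^2}{m} - \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m} + \frac{\|S_{n+m}\|^2}{m} - \frac1m \big[ \langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle \big]. \]

But $\langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle = 2\|S_n\|^2 + \langle S_n, T_{n,m}\rangle + \langle T_{n,m}, S_n\rangle$, so that in turn

\[ \frac{n(n+m)}{m} \Big\| \frac{S_n}{n} - \frac{S_{n+m}}{n+m} \Big\|^2 = \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m} + \frac1m \Big[ \|S_{n+m}\|^2 - \|S_n\|^2 - \|S_m\|^2 - \langle S_n, T_{n,m}\rangle - \langle T_{n,m}, S_n\rangle \Big] \]
\[ = \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m}, \]

since $S_{n+m} = S_n + T_{n,m}$. And we are done. Note that the weak stationarity assumption was only used in the last line of calculations, to say that $\|T_{n,m}\| = \|S_m\|$.

A simple although quite interesting consequence of Ky Fan’s identity is

\[ \frac{\|S_{n+m}\|^2}{n+m} \le \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m}, \tag{1.3.2} \]

which is valid for any two positive integers $n, m$. This is inequality (4.8) in Ky Fan [1946]. We say that a sequence $\{g_n, n \ge 1\}$ of real numbers is subadditive if it satisfies

\[ g_{n+m} \le g_n + g_m. \tag{1.3.3} \]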

Then we have the following well-known lemma.

1.3.3 Lemma. If $\{g_n, n \ge 1\}$ is a subadditive sequence of real numbers, then $g_n/n$ converges to $\inf_{n\ge 1} (g_n/n)$.

Proof. Fix an arbitrary positive integer $N$ and write $n = j_n N + r_n$ with $1 \le r_n \le N$. Clearly $j_n/n \to 1/N$ as $n$ tends to infinity. Further

\[ \inf_{n\ge 1} \frac{g_n}{n} \le \frac{g_n}{n} \le \frac{g_{j_n N} + g_{r_n}}{n} \le \frac{g_{j_n N}}{j_n N} + \frac{g_{r_n}}{n} \le \frac{j_n g_N}{j_n N} + \frac{g_{r_n}}{n} = \frac{g_N}{N} + \frac{g_{r_n}}{n}. \]

Letting now $n$ tend to infinity gives

\[ \inf_{n\ge 1} \frac{g_n}{n} \le \liminf_{n\to\infty} \frac{g_n}{n} \le \limsup_{n\to\infty} \frac{g_n}{n} \le \frac{g_N}{N}. \]

As $N$ was arbitrary, the lemma is proved.

We thus deduce from (1.3.2) and from the lemma applied to $g_n := \|S_n\|^2/n$ that

\[ \lim_{n\to\infty} \Big\| \frac{S_n}{n} \Big\| = \inf_{n\ge 1} \Big\| \frac{S_n}{n} \Big\|. \tag{1.3.4} \]

This is a remarkable consequence of Ky Fan’s identity, which remains true for averages of contractions by von Neumann’s theorem (proceed by approximation in view of the decomposition $H = \overline{H_0} \oplus H_T$). We continue with another interesting consequence concerning the ratios

\[ \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}}, \]

where $\mathcal{N} = \{n_k, k \ge 1\}$ is a given increasing sequence of positive integers. Notice that in the orthonormal case, namely if $X_1, X_2, \dots$ is an orthonormal sequence, then $\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2 = \frac{1}{n_k} - \frac{1}{n_{k+1}}$ precisely. We have the following properties:

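a)
\[ \limsup_{N\to\infty} \frac{1}{n_N} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} \le \limsup_{N\to\infty} \Big( \sup_{1\le k<N} \frac{\|S_{n_{k+1}-n_k}\|^2}{(n_{k+1}-n_k)^2} - \frac{\|S_{n_N}\|^2}{n_N^2} \Big). \]

b) Further if $\lim_{k\to\infty} (n_{k+1} - n_k) = \infty$, then
\[ \lim_{N\to\infty} \frac{1}{n_N} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = 0. \]

c) Moreover,
\[ \lim_{N,a\to\infty} \frac{1}{Na} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a} \big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} = 0. \]

d) Finally, let $\mathcal{D} = \{D_j, j \ge 1\}$ be a chain: $D_j \mid D_{j+1}$ for every $j$. Then
\[ \sum_{j=1}^{\infty} \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j} - 1} \frac{\big\| \frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j} \big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\| \frac{S_{D_1}}{D_1} \Big\|^2 - \lim_{J\to\infty} \Big\| \frac{S_{D_{J+1}}}{D_{J+1}} \Big\|^2 < \infty. \]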
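Ky Fan's identity, on which properties a) to d) rest, is easy to check numerically. The sketch below is only an illustration under stated assumptions: the weakly stationary sequence is taken to be $X_k = U^k x_0$ for an orthogonal matrix $U$ built from planar rotations, and the helper names, the angles and the vector $x_0$ are hypothetical choices.

```python
import numpy as np

def rotation_blocks(angles):
    """Orthogonal matrix: one fixed direction followed by planar rotations with the given angles."""
    d = 1 + 2 * len(angles)
    U = np.zeros((d, d))
    U[0, 0] = 1.0
    for i, a in enumerate(angles):
        j = 1 + 2 * i
        U[j:j + 2, j:j + 2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    return U

def partial_sums(U, x0, count):
    """Rows S_1, ..., S_count, where S_n = X_1 + ... + X_n and X_k = U^k x0."""
    X = np.empty((count, len(x0)))
    v = np.asarray(x0, dtype=float)
    for k in range(count):
        v = U @ v          # v is now X_{k+1}
        X[k] = v
    return np.cumsum(X, axis=0)

def ky_fan_sides(S, n, m):
    """Left and right sides of Ky Fan's identity for the partial sums stored in S."""
    sq = lambda v: float(v @ v)
    Sn, Sm, Snm = S[n - 1], S[m - 1], S[n + m - 1]
    lhs = sq(Sn) / n + sq(Sm) / m - sq(Snm) / (n + m)
    rhs = n * (n + m) / m * sq(Sn / n - Snm / (n + m))
    return lhs, rhs

U = rotation_blocks([0.7, 2.1])
S = partial_sums(U, [1.0, 2.0, -1.0, 0.5, 3.0], 40)
for n, m in [(1, 1), (3, 5), (7, 2), (10, 20)]:
    print(n, m, ky_fan_sides(S, n, m))
```

Since the right side is a squared norm, agreement of the two sides also confirms inequality (1.3.2) for the chosen pairs.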

It follows from c) that in at most linearly growing sequences, averages of weakly stationary sequences asymptotically exhibit, in density, increments comparable to averages of orthogonal sequences, which is a bit unexpected. From Ky Fan’s identity, we indeed get for each $k$,

\[ \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = \frac{\|S_{n_k}\|^2}{n_k} - \frac{\|S_{n_{k+1}}\|^2}{n_{k+1}} + \frac{\|S_{n_{k+1}-n_k}\|^2}{n_{k+1} - n_k}. \tag{1.3.5} \]

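Summing from $k = 1$ up to $N - 1$ leads to

\[ \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = \frac{\|S_{n_1}\|^2}{n_1} - \frac{\|S_{n_N}\|^2}{n_N} + \sum_{k=1}^{N-1} (n_{k+1} - n_k) \Big\| \frac{S_{n_{k+1}-n_k}}{n_{k+1} - n_k} \Big\|^2. \]

Dividing both sides by $n_N$ gives

\[ \frac{1}{n_N} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = \frac{\|S_{n_1}\|^2}{n_1 n_N} - \frac{\|S_{n_N}\|^2}{n_N^2} + \frac{1}{n_N} \sum_{k=1}^{N-1} (n_{k+1} - n_k) \Big\| \frac{S_{n_{k+1}-n_k}}{n_{k+1} - n_k} \Big\|^2 \tag{1.3.5a} \]
\[ \le \frac{\|S_{n_1}\|^2}{n_1 n_N} + \frac{1}{n_N} \sum_{k=1}^{N-1} (n_{k+1} - n_k) \Big\| \frac{S_{n_{k+1}-n_k}}{n_{k+1} - n_k} \Big\|^2 - \frac{\|S_{n_N}\|^2}{n_N^2}. \]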

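Letting next $N$ tend to infinity yields

\[ \limsup_{N\to\infty} \frac{1}{n_N} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} \le \limsup_{N\to\infty} \Big( \frac{1}{n_N} \sum_{k=1}^{N-1} (n_{k+1} - n_k) \Big\| \frac{S_{n_{k+1}-n_k}}{n_{k+1} - n_k} \Big\|^2 - \frac{\|S_{n_N}\|^2}{n_N^2} \Big) \]
\[ \le \limsup_{N\to\infty} \Big( \sup_{1\le k<N} \frac{\|S_{n_{k+1}-n_k}\|^2}{(n_{k+1}-n_k)^2} - \frac{\|S_{n_N}\|^2}{n_N^2} \Big). \]

So a) is proved. Now if $\lim_{k\to\infty} (n_{k+1} - n_k) = \infty$, suppose first that $\lim_{n\to\infty} \big\| \frac{S_n}{n} \big\| = 0$. Then $\lim_{k\to\infty} \big\| \frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k} \big\| = 0$, and so letting $N$ tend to infinity in the first equality in (1.3.5a) gives

\[ \lim_{N\to\infty} \frac{1}{n_N} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = 0. \]

Hence b) is proved in that case. If $\lim_{n\to\infty} \big\| \frac{S_n}{n} \big\| > 0$, there exists $\chi \in H$ such that $\lim_{n\to\infty} \big\| \frac{S_n}{n} - \chi \big\| = 0$. It suffices to apply the result obtained to the weakly stationary sequence $\{X_i - \chi, i \ge 1\}$ to reach the same conclusion in this case as well.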


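Now assume $n_k = ak$, $a$ being some fixed positive integer. Replace $n_k$ by its value in the first part of (1.3.5a):

\[ \frac{1}{Na} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a} \big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} = \frac{\|S_a\|^2}{a^2 N} - \frac{\|S_{aN}\|^2}{a^2 N^2} + \frac1N \sum_{k=1}^{N-1} \frac{\|S_a\|^2}{a^2} = \frac{\|S_a\|^2}{a^2} - \frac{\|S_{aN}\|^2}{a^2 N^2}. \tag{1.3.5b} \]

Hence

\[ \limsup_{N,a\to\infty} \frac{1}{Na} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a} \big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} \le \limsup_{N,a\to\infty} \Big( \frac{\|S_a\|^2}{a^2} - \frac{\|S_{Na}\|^2}{(Na)^2} \Big) = 0, \]

which is c). Let $\{D_j, j \ge 1\}$ be a chain, and apply equality (1.3.5b) with $N = D_{j+1}/D_j$, $a = D_j$.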

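\[ \frac{1}{Na} \sum_{k=1}^{N-1} \frac{\big\| \frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a} \big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} = \frac{\|S_a\|^2}{a^2} - \frac{\|S_{aN}\|^2}{a^2 N^2}. \]

We obtain

\[ \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j} - 1} \frac{\big\| \frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j} \big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\| \frac{S_{D_j}}{D_j} \Big\|^2 - \Big\| \frac{S_{D_{j+1}}}{D_{j+1}} \Big\|^2. \]

Summing up from $j = 1$ to $j = J$ gives

\[ \sum_{j=1}^{J} \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j} - 1} \frac{\big\| \frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j} \big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\| \frac{S_{D_1}}{D_1} \Big\|^2 - \Big\| \frac{S_{D_{J+1}}}{D_{J+1}} \Big\|^2. \]

Letting $J$ tend to infinity gives d).

Stronger forms of b) exist in some cases. If $T$ on $L^1$ of a $\sigma$-finite measure space is a Dunford–Schwartz contraction (see subsection “extensions” in Section 4.2), write $S_n^T = \sum_{l=1}^{n} T^l$ and $A_n^T = S_n^T/n$. The following is an easy consequence of (1.3.5) and Remark 9.3.9.2. Let $f \in (I - T)L^2$. Then for any increasing sequence $\mathcal{N} = \{n_k, k \ge 1\}$ of positive integers, we have

\[ \lim_{K\to\infty} \frac1K \sum_{k=1}^{K} \frac{\big\| \frac{S_{n_k}^T f}{n_k} - \frac{S_{n_{k+1}}^T f}{n_{k+1}} \big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = 0. \]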


1.3.4 Remark. Let $f \in H$. From the spectral inequality it follows that

\[ \|A_n f - A_m f\|^2 \le \int_{-\pi}^{\pi} \big| V_n(\theta) - V_m(\theta) \big|^2\, \mu_f(d\theta), \]

where $\mu_f$ is the spectral measure of $f$ relative to $T$, and $V_\ell(\theta) = \frac{1}{\ell} \sum_{k=0}^{\ell-1} e^{ik\theta}$, $\ell = 1, 2, \dots$. It is easily seen that $\lim_{n,m\to\infty} |V_n(\theta) - V_m(\theta)| = 0$, for any $\theta$. As moreover $|V_\ell(\theta)| \le 1$, we deduce from the dominated convergence theorem that $\lim_{n,m\to\infty} \int_{-\pi}^{\pi} |V_n(\theta) - V_m(\theta)|^2\, \mu_f(d\theta) = 0$; hence the sequence $\{A_n f, n \ge 1\}$ is a Cauchy sequence, thus converging in $H$. This is another convenient way to recover the convergence part in von Neumann’s theorem.

Weighted averages. The same argument allows us to prove the following more general result. Let $w = \{w_k, k \ge 0\}$ be a sequence of nonnegative reals with partial sums $W_n = \sum_{k=0}^{n-1} w_k \ne 0$, for each $n$. Assume that $\sum_{k=1}^{\infty} w_k = \infty$. Consider the weighted averages

\[ B_n = B_n^T := \frac{1}{W_n} \sum_{k=0}^{n-1} w_k T^k, \qquad n = 1, 2, \dots. \]

If for each real number $\theta$, $W_\ell(\theta) := \frac{1}{W_\ell} \sum_{k=0}^{\ell-1} w_k e^{2i\pi k\theta}$ converges, then the sequence $\{B_n f, n \ge 1\}$ converges in $H$, for any $f \in H$ and any contraction $T$ in $H$. This condition is in fact necessary and sufficient. Let indeed $\vartheta$ be some irrational number in $(\mathbb{T}, \lambda)$ and consider on $H = L^2(\lambda)$ the operator $T$ defined by $Tf(\,\cdot\,) = f(\,\cdot + \vartheta)$, the rotation of angle $\vartheta$. Choose $f(t) = e^{2i\pi t}$. As $B_n f(t) = e^{2i\pi t}\, \frac{1}{W_n} \sum_{k=0}^{n-1} w_k e^{2i\pi k\vartheta}$, the converse assertion follows immediately. With a little more effort, we can in fact prove the following.

1.3.5 Lemma. The following are equivalent:
(i) For every contraction $T$ on a Hilbert space $H$ and any $f \in H$ the sequence $\{B_n^T f, n \ge 1\}$ converges in norm.
(ii) For every contraction $T$ on a Hilbert space $H$ and any $f \in H$ the sequence $\{B_n^T f, n \ge 1\}$ converges weakly in $H$.
(iii) For every real $\theta$, the sequence $\{W_n(\theta), n \ge 1\}$ converges.

Assertion (i) is fulfilled if

\[ \lim_{n\to\infty} \frac{1}{W_n} \Big[ w_n + \sum_{k=1}^{n-1} |w_k - w_{k+1}| \Big] = 0. \tag{1.3.6} \]

If $w$ is nondecreasing, (1.3.6) becomes $\lim_n w_n/W_n = 0$. If the sequence is nonincreasing, then (1.3.6) is always satisfied. Recall that we denoted $P_T f = \lim_{n\to\infty} \frac1n \sum_{k=0}^{n-1} T^k f$ in Theorem 1.3.1.
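The convergence of weighted averages can be observed on a small example. The following Python sketch is illustrative only: it assumes that $T$ is a rotation of $\mathbb{R}^3$ about the first coordinate axis (so that $P_T f$ is the first component of $f$), and it compares the weights $w_k = 1$ with the nondecreasing weights $w_k = k + 1$. The function names are hypothetical, and `variation_ratio` mirrors the quantity in (1.3.6) for a finite weight vector, up to the index shift.

```python
import numpy as np

def rotation_about_first_axis(angle):
    """Orthogonal 3x3 matrix fixing the first coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def weighted_average(U, f, weights):
    """B_n f = (1 / W_n) * sum_k w_k U^k f for weights = (w_0, ..., w_{n-1})."""
    v = np.asarray(f, dtype=float)
    acc = np.zeros_like(v)
    for w in weights:
        acc += w * v
        v = U @ v
    return acc / np.sum(weights)

def variation_ratio(weights):
    """(last weight + sum of |consecutive differences|) / (sum of weights), cf. (1.3.6)."""
    w = np.asarray(weights, dtype=float)
    return (w[-1] + np.sum(np.abs(np.diff(w)))) / np.sum(w)

U = rotation_about_first_axis(1.0)
f = np.array([2.0, 1.0, -1.0])
projection = np.array([2.0, 0.0, 0.0])    # component of f along the invariant axis

n = 2000
cesaro = weighted_average(U, f, np.ones(n))              # w_k = 1
linear = weighted_average(U, f, np.arange(1, n + 1))     # w_k = k + 1
errors = (np.linalg.norm(cesaro - projection), np.linalg.norm(linear - projection))
print(errors, variation_ratio(np.arange(1, n + 1)))
```

Both error norms are small for this $n$, and the ratio in (1.3.6) is small as well for the linear weights, in line with the criterion $w_n / W_n \to 0$ for nondecreasing sequences.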


1.3.6 Corollary. Let $w$ be as before. Then $\{B_n^T f, n \ge 1\}$ converges to $P_T f$ in norm for every Hilbert space contraction $T$ and $f \in H$, if and only if

\[ \lim_{n\to\infty} \frac{1}{W_n} \sum_{k=1}^{n} w_k z^k = 0 \quad \text{for every } z \in \mathbb{T},\ z \ne 1. \tag{1.3.7} \]

Condition (1.3.7) does not imply condition (1.3.6) (see Lin–Weber [2007]).

1.3.7 Corollary. Let $w$ satisfy (1.3.7). Then

\[ \lim_{n\to\infty} \frac{\sum_{k=1}^{n} w_k^2}{\big( \sum_{k=1}^{n} w_k \big)^2} = 0. \tag{1.3.8} \]

Indeed let $\{e_k, k \in \mathbb{Z}\}$ be the standard orthonormal basis of $\ell^2$, with $T$ the isometric shift defined by $T e_j = e_{j+1}$. Then $f = e_1$ satisfies $P_T f = 0$. Since (1.3.7) is satisfied, the orthonormality and the previous corollary yield

\[ \frac{1}{W_n^2} \sum_{k=1}^{n} w_k^2 = \Big\| \frac{1}{W_n} \sum_{k=1}^{n} w_k e_{k+1} \Big\|^2 = \Big\| \frac{1}{W_n} \sum_{k=1}^{n} w_k T^k e_1 \Big\|^2 \to 0. \]

Let us also briefly discuss the case of subsequence ergodic averages. Let

\[ w_k = \begin{cases} 1 & \text{if } k = n_\ell \text{ for some } \ell \ge 1, \\ 0 & \text{otherwise}, \end{cases} \]

where $1 \le n_1 < n_2 < \cdots$ is an increasing sequence of integers, which we denote $\mathcal{N}$. The subsequence averages

\[ C_N f = C_N^{\mathcal{N}} f := \frac1N \sum_{k=1}^{N} T^{n_k} f, \qquad N = 1, 2, \dots, \]

converge in the mean for any $f \in H$, and any contraction $T$ in $H$, if and only if, for any real $\vartheta$, the sequence $\big\{ \frac1N \sum_{k=1}^{N} e^{2i\pi n_k \vartheta}, N \ge 1 \big\}$ converges.

Mean good sequences. We say that a sequence $\{n_j, j \ge 1\}$ of integers is mean good if, given any contraction $T$ in an arbitrary Hilbert space $H$, for any $x \in H$ the sequence of averages

\[ \frac1N \sum_{j=1}^{N} T^{n_j} x, \qquad N = 1, 2, \dots, \]

converges in $H$. More generally, for $1 \le p < \infty$ we say that a sequence $\{n_j, j \ge 1\}$ of integers is universally $p$-mean good when, given any measurable dynamical system $(X, \mathcal{A}, \mu, \tau)$ (see Section 3.1) and any $f \in L^p(\mu)$, the averages

\[ \frac1N \sum_{j=1}^{N} T^{n_j} f, \qquad N = 1, 2, \dots, \]

converge in $L^p(\mu)$. Here we have set $Tf = f \circ \tau$. By what precedes, a sequence $\{n_j, j \ge 1\}$ of integers is mean good if and only if, for any real $\vartheta$, the sequence $\big\{ \frac1N \sum_{k=1}^{N} e^{2i\pi n_k \vartheta}, N \ge 1 \big\}$ converges. See also Remark 3.4.2, where it is proved that if the sequence $\{n_j, j \ge 1\}$ is a good sequence (Section 1.6), then it is mean good relatively to the class of weakly mixing dynamical systems.

Finally the moving averages

\[ B_{n,k} f = \frac1n \sum_{j=k}^{k+n-1} T^j f \]

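converge in $H$ to $\bar f$, for any $f \in H$, as $(n, k)$ tends to $(\infty, \infty)$, where the limit $\bar f$ is the same as in Theorem 1.3.1.

Speed of convergence. It can be easily observed that no speed of convergence can be exhibited in general. Indeed, by the uniform boundedness principle (Theorem 2.2.8), the existence of a sequence $a_n \to \infty$ such that $\limsup_{n\to\infty} a_n \|A_n f - Pf\| < \infty$ for all $f$ is equivalent to $a_n \|A_n - P\| \to 0$. This cannot be realized in general. Let $\theta$ be irrational and choose $Tf(x) = f(x + \theta)$, $f \in L^1(\mathbb{T})$. Let $0 < \varepsilon < 1$ and fix some positive integer $N$. Since $\{j\theta, j \ge 1\}$ is dense in $\mathbb{T}$, by using Lemma 1.4.2 (iii) one can select $j = j(N)$ such that

\[ \max_{n\le N} \big| V_n(j\theta) - 1 \big| \le \varepsilon. \]

Put $f = e_j$. Then $Pf = \langle f, 1\rangle = 0$ and $\|A_n f\| = |V_n(j\theta)|$. Thus $\big| \|A_n f\| - 1 \big| \le \varepsilon$, for $n \le N$. This implies that

\[ 1 - \varepsilon \le \min_{n\le N} \|A_n - P\| \le 1. \]

Since $\varepsilon$ can be arbitrarily small, we therefore have that $\|A_n - P\| = 1$ for every $n$, thus providing a contradiction.

One can also use the shift-model to produce counterexamples. Let $\mu$ be a probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that $\int_{\mathbb{R}} x\, \mu(dx) = 0$ and $\int_{\mathbb{R}} x^2\, \mu(dx) = 1$, and consider $(\mathbb{R}^{\mathbb{Z}}, \mathcal{B}(\mathbb{R}^{\mathbb{Z}}), \mathbf{P})$ with $\mathbf{P} = \mu^{\mathbb{Z}}$. Consider the shift $T$ on $\mathbb{R}^{\mathbb{Z}}$ defined for $x \in \mathbb{R}^{\mathbb{Z}}$, $x = \{x_k, k \in \mathbb{Z}\}$, by $Tx = \{x_{k-1}, k \in \mathbb{Z}\}$. Let $\xi = \{\xi_k, k \in \mathbb{Z}\}$ be i.i.d. random variables with image law $\mu$. Let also $a = \{a_k, k \in \mathbb{Z}\} \in \ell^2(\mathbb{Z})$ and put

\[ f(\xi) = \sum_{k\in\mathbb{Z}} \xi_k a_k. \]

Then $f \in L^2(\mathbf{P})$ and $f(T\xi) = \sum_{k\in\mathbb{Z}} \xi_{k-1} a_k = \sum_{k\in\mathbb{Z}} \xi_k a_{k+1}$, so that

\[ A_n^T f(\xi) = \frac1n \sum_{\ell=0}^{n-1} f(T^\ell \xi) = \sum_{k\in\mathbb{Z}} \xi_k\, \frac{a_k + a_{k+1} + \dots + a_{k+n-1}}{n}. \]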


Suppose first, one has a speed of convergence, namely ATn f (ξ ) = O(εn ) where εn ↓ 0. Let β = {βk , k ∈ Z} be such that β 2 = 1 and β2 Cεn . :22 ≥2n

Choose a such that ak = βL 2−L if 22L ≤ |k| < 22(L+1) , L = 0, 1, . . . . Then

ATn f 2

ak + ak+1 + . . . + ak+n−1 2 = n k∈Z

L:22L ≥2n

k∈Z 22L ≤k

≥

≥

βL2

ak + ak+1 + . . . + ak+n−1 n

2(L+1) 2 − 22L − n − 1

22L

L:22L ≥2n

≥C

2

βL2

L:22L ≥2n

Cεn . This gives a contradiction. Suppose now that P-a.s.

An f (ξ ) = O(εn ). Choose μ to be Gaussian. The basic properties of Gaussian processes (Section 10.2) imply |An (f )| E sup < ∞. εn n≥1 But this is contradictory again since obviously |An (f )| |An (f )|

An (f ) 2 ≥ sup E ≥ C sup 1. ε ε εn n n n≥1 n≥1 n≥1

E sup

The last inequality follows from the very construction of $a$.

Extensions. There are naturally numerous extensions and variants of von Neumann’s theorem, and it is not possible to present them exhaustively. For instance:

• Let $\{x_n, n \ge 1\}$ be a sequence of elements of a real or complex Hilbert space $H$, and denote $s_n = x_1 + \dots + x_n$. Assume that the following two conditions are realized:
(a) there exists $c > 0$ such that $\|s_{n+m} - s_n\|^2 - \|s_m\|^2 < cm$,
(b) $\lim_{n\to\infty} \dfrac{\|s_{n+1}\|^2 - \|s_n\|^2}{n}$ exists.
Ky Fan [1945: Theorem II] showed that $s_n/n$ strongly converges in $H$.


• There exist several extensions in Banach spaces obtained by Bruck [1979], De la Torre [1976], Landers–Rogge [1978], Pham [1993], Yoshimoto [1976]. We shall quote the Mean Ergodic theorem due to the contributions of Eberlein, Riesz, Yosida and Kakutani.

Let $X$ be a Banach space and let $I$ be the identity operator on $X$. An operator $T : X \to X$ is Cesàro bounded if (with $A_n^T = \frac1n \sum_{k=0}^{n-1} T^k$)

\[ \sup_n \|A_n^T\| < \infty. \]

If $T$ is Cesàro bounded, the identity

\[ n^{-1} T^{n-1} = A_n^T - \frac{n-1}{n} A_{n-1}^T \]

shows that the condition $\lim_{n\to\infty} n^{-1} T^{n-1} x = 0$ is necessary for the sequence $\{A_n^T x, n \ge 1\}$ to converge.

1.3.8 Theorem (Mean ergodic theorem). Let $T$ be a Cesàro bounded linear operator in a Banach space $X$. For any $x$ satisfying $\lim_{n\to\infty} n^{-1} T^{n-1} x = 0$, and any $y \in X$, the following conditions are equivalent:
(i) $Ty = y$ and $y \in \overline{\mathrm{co}}\{x, Tx, T^2 x, \dots\}$,
(ii) $y = \lim_{n\to\infty} A_n^T x$,
(iii) $y = w\text{-}\lim_{n\to\infty} A_n^T x$,
(iv) $y$ is a weak cluster point of the sequence $\{A_n^T x, n \ge 1\}$.

We denote by $\overline{\mathrm{co}}(\bullet)$ the closed convex hull of $\bullet$, and write $w\text{-}\lim$ to denote the limit in the sense of the weak topology $\sigma(X, X^*)$. From the mean ergodic theorem it thus follows: if $T^n/n \to 0$ and $\sup_n \|A_n^T\| < \infty$, then

\[ \{x \in X : A_N^T x \text{ converges}\} = \{y \in X : Ty = y\} \oplus \overline{(I - T)X}. \]

An operator $T : X \to X$ is power-bounded if

\[ \sup_N \|T^N\| < \infty. \]

As a special case of the mean ergodic theorem we have

1.3.9 Theorem. Let $T : X \to X$ be a power-bounded linear operator in a reflexive Banach space $X$. For any $x \in X$, the sequence of averages $\{A_n^T x, n \ge 1\}$ converges in $X$ to a $T$-invariant limit.

We refer to Krengel [1985: 72–73]. This applies in particular to the spaces $L^p$ ($1 < p < \infty$).


A linear operator $T : X \to X$ in a Banach space $X$ is called mean ergodic if for any $x \in X$, the sequence of averages $\{A_n^T x, n \ge 1\}$ converges in $X$; by the above theorem, power-bounded linear operators in reflexive Banach spaces have this property. Let $E(T)x$ denote the limit of the sequence $\{A_n^T x, n \ge 1\}$. Let $w = \{w_k, k \ge 0\}$ be a sequence of nonnegative reals with partial sums $W_n = \sum_{k=0}^{n-1} w_k \ne 0$, for each $n$. Assume that $\sum_{k=1}^{\infty} w_k = \infty$. Condition (1.3.6) is equivalent to the fact that for every power-bounded mean ergodic $T$ on a Banach space $X$ and every $x \in X$, we have $\lim_{n\to\infty} \big\| \frac{1}{W_n} \sum_{k=1}^{n} w_k T^k x - E(T)x \big\| = 0$.

Orthogonality and weak orthogonality. Let $(H, \|\cdot\|)$ be a real or complex Hilbert space, and let $\{f_n, n \ge 1\}$ be orthonormal vectors (thereby weakly stationary) in the inner product space $H$. As $\big\| \sum_{u\le j\le v} c_j f_j \big\|^2 = \sum_{u\le j\le v} |c_j|^2$, the series $\sum_n c_n f_n$ converges in $H$, for any sequence $\{c_n, n \ge 1\}$ such that $\sum_n |c_n|^2 < \infty$. The fact that

\[ \frac1n \sum_{i=1}^{n} f_i \to 0 \quad \text{in } H \tag{1.3.9} \]

can be for instance deduced from Ky Fan’s result (see the paragraph Extensions above). This in fact remains true for weakly orthogonal systems. First recall Bessel’s inequality: if $\{e_i, 1 \le i \le n\}$ are orthonormal vectors in the inner product space $H$, then

\[ \sum_{i=1}^{n} |\langle x, e_i\rangle|^2 \le \|x\|^2 \quad \text{for any } x \in H. \]

Boas [1941] and independently Bellman [1944] proved the following generalization of Bessel’s inequality: if $x, y_1, \dots, y_n$ are elements of an inner product space $(H, \langle\cdot,\cdot\rangle)$, then

\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2 \Big[ \max_{1\le i\le n} \|y_i\|^2 + \Big( \sum_{1\le i\ne j\le n} |\langle y_i, y_j\rangle|^2 \Big)^{1/2} \Big]. \]

Fink, Mitrinović and Pečarić [1993] extended Boas–Bellman’s inequality as follows: if $x, y_1, \dots, y_n$ are elements of $(H, \langle\cdot,\cdot\rangle)$, and $c_1, \dots, c_n$ are complex numbers, then

\[ \sum_{i=1}^{n} |c_i \langle x, y_i\rangle|^2 \le \|x\|^2 \sum_{i=1}^{n} |c_i|^2 \cdot \Big[ \max_{1\le i\le n} \|y_i\|^2 + \Big( \sum_{1\le i\ne j\le n} |\langle y_i, y_j\rangle|^2 \Big)^{1/2} \Big]. \]

Other extensions are established in [Dragomir: 2003]. In relation with Bessel’s inequality is the notion of a quasi-orthogonal system introduced by Bellman (see [Kac–Salem–Zygmund: 1948]). Let $f = \{f_n, n \ge 1\}$ be a sequence in $H$. Then $f$ is called a quasi-orthogonal system if the quadratic form on $\ell^2$ defined by $\{x_h, h \ge 1\} \mapsto \big\| \sum_h x_h f_h \big\|^2$ is bounded.


A necessary and sufficient condition for $f$ to be quasi-orthogonal is that the series $\sum_n c_n f_n$ converges in $H$, for any sequence $\{c_n, n \ge 1\}$ such that $\sum_n |c_n|^2 < \infty$. And as noticed before, a suitable use of Kronecker’s lemma implies that (1.3.9) holds in turn for quasi-orthogonal systems. Kac, Salem and Zygmund observed that every theorem on orthogonal systems whose proof depends only on Bessel’s inequality holds for quasi-orthogonal systems. In particular for $H = L^2(X, \mathcal{A}, \mu)$, $(X, \mathcal{A}, \mu)$ a probability space, Rademacher–Menchov’s theorem, asserting the almost everywhere convergence of the series $\sum_n c_n f_n$ provided that $\sum_n c_n^2 \log^2 n < \infty$, applies. This is readily seen from the fact that $f$ is quasi-orthogonal if and only if there exists a constant $L$ depending on $f$ only, such that

\[ \Big\| \sum_{i\le n} x_i f_i \Big\| \le L \Big( \sum_{i\le n} |x_i|^2 \Big)^{1/2}. \]

And this indeed suffices for the proof of Rademacher–Menchov’s theorem (see Remark 8.3.5), since we have the increment property

\[ \Big\| \sum_{n\le i\le m} c_i f_i \Big\|^2 \le L^2 \sum_{n\le i\le m} |c_i|^2. \]

Consequently, if $f$ is quasi-orthogonal, then

\[ \frac1n \sum_{i=1}^{n} f_i \xrightarrow{\ \text{a.e.}\ } 0. \tag{1.3.10} \]

There is a useful sufficient condition for quasi-orthogonality (Lemma 7.4.3 in [Weber: 1998a]): in order for $f$ to be quasi-orthogonal, it is sufficient that

\[ \sup_{j\ge 1} \sum_{k\ne j} |\langle f_j, f_k\rangle| < \infty. \tag{1.3.11} \]

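Indeed, from the relation $\langle x_j f_j, x_k f_k\rangle + \langle x_k f_k, x_j f_j\rangle = x_j \bar x_k \langle f_j, f_k\rangle + x_k \bar x_j \langle f_k, f_j\rangle$, it is plain that

\[ |\langle x_j f_j, x_k f_k\rangle| + |\langle x_k f_k, x_j f_j\rangle| \le 2 |\langle f_j, f_k\rangle|\, |x_j|\, |x_k| \le |\langle f_j, f_k\rangle| \big( |x_j|^2 + |x_k|^2 \big), \]

and so

\[ \Big\| \sum_{i\le n} x_i f_i \Big\|^2 = \sum_{i\le n} |x_i|^2 \|f_i\|^2 + \sum_{1\le j<k\le n} \big[ \langle x_j f_j, x_k f_k\rangle + \langle x_k f_k, x_j f_j\rangle \big] \]
\[ \le \max_{i=1}^{n} \|f_i\|^2 \cdot \sum_{i\le n} |x_i|^2 + \sum_{1\le j<k\le n} \big( |x_j|^2 + |x_k|^2 \big) |\langle f_j, f_k\rangle| \]
\[ \le \max_{i=1}^{n} \|f_i\|^2 \cdot \sum_{i\le n} |x_i|^2 + \sum_{j\le n} |x_j|^2 \sum_{j<k\le n} |\langle f_j, f_k\rangle| + \sum_{k\le n} |x_k|^2 \sum_{1\le j<k} |\langle f_j, f_k\rangle| \]
\[ \le \max_{i=1}^{n} \|f_i\|^2 \cdot \sum_{i\le n} |x_i|^2 + \sup_{j\ge 1} \sum_{k\ne j} |\langle f_j, f_k\rangle| \sum_{j\le n} |x_j|^2. \]

Hence we may take

\[ L = \max_{i=1}^{n} \|f_i\|^2 + \sup_{j\ge 1} \sum_{k\ne j} |\langle f_j, f_k\rangle|. \]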
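The bound obtained in the calculation above can be tested on a finite family of vectors. The sketch below is a hedged illustration: the vectors $f_i$ are the rows of a randomly generated complex matrix, the inner product is taken linear in the first argument, and the function name `quasi_orthogonality_constant` is hypothetical.

```python
import numpy as np

def quasi_orthogonality_constant(F):
    """max_i ||f_i||^2 + max_j sum_{k != j} |<f_j, f_k>| for the rows f_i of F."""
    G = F @ F.conj().T                      # Gram matrix, G[j, k] = <f_j, f_k>
    diagonal = np.real(np.diag(G))          # squared norms ||f_i||^2
    off_diagonal = np.abs(G) - np.diag(np.abs(np.diag(G)))
    return diagonal.max() + off_diagonal.sum(axis=1).max()

rng = np.random.default_rng(0)
F = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))   # six vectors in C^4
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)             # coefficients x_i

lhs = np.linalg.norm(x @ F) ** 2            # || sum_i x_i f_i ||^2
rhs = quasi_orthogonality_constant(F) * np.sum(np.abs(x) ** 2)
print(lhs, rhs)
```

For a finite family the supremum over $j$ reduces to a maximum over the rows of the Gram matrix, which is what the function computes.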

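From the above calculation, we can formulate another variant of Bessel’s inequality: if $x, y_1, \dots, y_n$ are elements of an inner product space $(H, \langle\cdot,\cdot\rangle)$, then

\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2 \Big[ \max_{i=1}^{n} \|y_i\|^2 + \sup_{j=1}^{n} \sum_{k\ne j} |\langle y_j, y_k\rangle| \Big] \le 2 \|x\|^2 \sup_{j=1}^{n} \sum_{k=1}^{n} |\langle y_j, y_k\rangle|. \tag{1.3.12} \]

Indeed $\sum_{i=1}^{n} |\langle x, y_i\rangle|^2 = \big\langle x, \sum_{i=1}^{n} \langle x, y_i\rangle y_i \big\rangle \le \|x\|\, \big\| \sum_{i=1}^{n} \langle x, y_i\rangle y_i \big\|$, and

\[ \Big\| \sum_{i=1}^{n} \langle x, y_i\rangle y_i \Big\|^2 = \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \|y_i\|^2 + \sum_{1\le i<j\le n} \Big[ \langle x, y_i\rangle \overline{\langle x, y_j\rangle} \langle y_i, y_j\rangle + \overline{\langle x, y_i\rangle} \langle x, y_j\rangle \langle y_j, y_i\rangle \Big] \]
\[ \le \max_{j=1}^{n} \|y_j\|^2 \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 + \sum_{1\le i<j\le n} |\langle y_i, y_j\rangle| \big( |\langle x, y_j\rangle|^2 + |\langle x, y_i\rangle|^2 \big) \]
\[ \le \max_{j=1}^{n} \|y_j\|^2 \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 + \sup_{j\ge 1} \sum_{k\ne j} |\langle y_j, y_k\rangle| \sum_{j\le n} |\langle x, y_j\rangle|^2. \]

Thus

\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2 \Big[ \max_{i=1}^{n} \|y_i\|^2 + \sup_{j=1}^{n} \sum_{k\ne j} |\langle y_j, y_k\rangle| \Big] \le 2 \|x\|^2 \sup_{j=1}^{n} \sum_{k=1}^{n} |\langle y_j, y_k\rangle|. \]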

We conclude these remarks with a digression towards the large sieve inequality, which we recall: consider a function $f : \{1, \dots, N\} \to \mathbb{C}$ with Fourier transform

\[ S(t) = \sum_{n=1}^{N} f(n)\, e^{2i\pi nt}. \]

Let $t_1, \dots, t_m$ be $\delta$-separated, which means that $|t_i - t_j| \ge \delta$ when $i \ne j$. Then

\[ \sum_{j=1}^{m} |S(t_j)|^2 \le (8N + \delta^{-1}) \sum_{n=1}^{N} |f(n)|^2. \tag{1.3.13} \]
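Inequality (1.3.13) can be checked numerically on a sample. The following Python sketch is an illustration under assumptions that are not in the text: the function $f$ is random, and the points $t_j$ form an arithmetic progression of step $\delta$ inside $[0, 1)$, chosen so that they are also $\delta$-separated modulo 1. The function name `exponential_sum` is hypothetical.

```python
import numpy as np

def exponential_sum(f, t):
    """S(t) = sum_{n=1}^{N} f(n) e^{2 i pi n t}, evaluated at every point of the array t."""
    n = np.arange(1, len(f) + 1)
    return np.exp(2j * np.pi * np.outer(t, n)) @ f

rng = np.random.default_rng(1)
N = 50
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

delta = 0.013
t = delta * np.arange(70)        # points 0, 0.013, ..., 0.897; the gap across 1 is larger than delta

lhs = np.sum(np.abs(exponential_sum(f, t)) ** 2)
rhs = (8 * N + 1 / delta) * np.sum(np.abs(f) ** 2)
print(lhs, rhs)
```

On such data the left side is typically far below the right side, which is consistent with the constant in (1.3.13) not being claimed to be sharp.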

The proof of this inequality is not hard. As noticed by Bombieri, it can also be proved by using an approximate Bessel inequality, namely inequality (1.3.12). A short proof is given in [Green: 1999].

Coboundaries. Let $T$ be a contraction in a Hilbert space $H$. An element $f$ of $H$ is a coboundary for $T$, if the equation

\[ f = g - Tg \tag{1.3.14} \]

(called the cohomological equation) admits a solution $g$, which in this case is called the transfer function of $f$ or the cobounding function of $f$. Usually one requires $T$ to be generated by an automorphism (see Section 3.1) $\tau$ of some probability space $(X, \mathcal{A}, \mu)$, $Tg = g \circ \tau$, and the cohomological equation is taken in the sense of equivalence classes modulo $\mu$. Assume for simplicity that $T$ is ergodic, or equivalently that $P_T f = \int_X f\, d\mu$ (see Section 3.2). In this case, by the Riesz decomposition of $L^2(\mu)$, the coboundaries are dense in $L_0^2(\mu)$, and by a plain density argument the coboundaries $g - Tg$ with $g$ bounded form a dense subset in all $L_0^p(\mu)$, $1 \le p < \infty$. For $p = \infty$, things get more complicated. The following result is due to Kočergin [1976: Corollary 3].

1.3.10 Theorem. Let $f \in L_0^1(\mu)$ and $\varepsilon > 0$. Then there exists $g \in L^0(\mu)$ such that

\[ \|f - (g - Tg)\|_\infty < \varepsilon. \]

Let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be such that $\lim_{t\to\infty} \varphi(t) = \infty$. Then there exists a function $f \in L_0^\infty(\mu)$ with $\|f\|_\infty = 1$ such that for any $g \in L^0(\mu)$ with $\int_X \varphi(|g(t)|)\, \mu(dt) < \infty$,

\[ \|f - (g - Tg)\|_\infty \ge 1/2. \]

In particular, for any $p > 0$, there exists $f$ with $\|f\|_\infty = 1$ such that

\[ \inf_{g\in L^p(\mu)} \|f - (g - Tg)\|_\infty \ge 1/2. \]

In relation with this, a recent result (see Section 4.5) of Volný and Weiss [2004] establishes a link between the fact that a function $f$ is well approximated in $L^\infty(\mu)$ by coboundaries with cobounding functions living “almost” in $L^p(\mu)$, and the order of $\mu\{|A_n f| > 1\}$.

Another important feature of coboundaries is that the sums are bounded: if $f$ is a coboundary then

\[ \sup_{n\ge 1} \Big\| \sum_{k=0}^{n-1} T^k f \Big\| < \infty. \tag{1.3.15} \]


The converse is in fact also true, namely if (1.3.15) holds, then $f$ is a coboundary. This follows from a theorem due to Browder [1958]. One can consider a larger setting to state this property. Let $X$ be a Banach space and let $T : X \to X$ be an operator. Equation (1.3.14) is then reformulated as $f = (I - T)g$, where $I$ denotes the identity operator on $X$. The following extension of Browder’s result is due to Lin and Sine [1983].

1.3.11 Theorem. Let $T$ be mean ergodic. The following conditions are equivalent for $y \in X$:
(i) $y \in (I - T)X$,
(ii) $x_n = \frac1n \sum_{k=1}^{n} \sum_{j=0}^{k-1} T^j y$ has a weakly convergent subsequence,
(iii) $\{x_n, n \ge 1\}$ converges strongly (and $x = \lim_{n\to\infty} x_n$ satisfies $(I - T)x = y$),
(iv) $\sup_{n\ge 1} \big\| \sum_{k=0}^{n-1} T^k y \big\| < \infty$.
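The boundedness of the sums of a coboundary, and its failure for an invariant vector, can be seen on a small finite-dimensional example. The sketch below assumes that $T$ is a rotation of $\mathbb{R}^3$ about the first coordinate axis; the function names and the vectors are hypothetical choices.

```python
import numpy as np

def rotation_about_first_axis(angle):
    """Orthogonal 3x3 matrix fixing the first coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def partial_sum_norms(U, f, count):
    """Norms of sum_{k=0}^{n-1} U^k f for n = 1, ..., count."""
    v = np.asarray(f, dtype=float)
    acc = np.zeros_like(v)
    norms = []
    for _ in range(count):
        acc = acc + v
        norms.append(np.linalg.norm(acc))
        v = U @ v
    return np.array(norms)

U = rotation_about_first_axis(1.0)
g = np.array([1.0, 2.0, -1.0])
coboundary = g - U @ g                    # f = g - Tg, so the sums telescope to g - T^n g
invariant = np.array([1.0, 0.0, 0.0])     # Tf = f and f != 0, hence not a coboundary

bounded = partial_sum_norms(U, coboundary, 500)
growing = partial_sum_norms(U, invariant, 500)
print(bounded.max(), 2 * np.linalg.norm(g), growing[-1])
```

The sums of the coboundary stay below $2\|g\|$, as the telescoping identity predicts, while the sums of the invariant vector grow linearly in $n$.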

1.4 The spectral regularization inequality

A remarkable result of Talagrand [1996a: Theorem 1.3] allows us to make von Neumann’s theorem more precise. Denote $A^T(f) = \{A_n^T f, n \ge 1\}$, and let for any $\varepsilon > 0$, $N(A^T(f), \|\cdot\|, \varepsilon)$ be the entropy number of order $\varepsilon$ of $A^T(f)$, namely the minimal number (possibly infinite) of Hilbertian open balls of radius $\varepsilon$, centered in $A^T(f)$, needed to cover $A^T(f)$.

1.4.1 Theorem (Talagrand’s entropy estimate). Let $T$ be a contraction in a Hilbert space $H$. Then, $\forall f \in H$, $\forall\, 0 < \varepsilon \le \|f\|$,

\[ N(A^T(f), \|\cdot\|, \varepsilon) \le 1 + 30\, \frac{\|f\|^2}{\varepsilon^2}. \]

In Lifshits–Weber [2000], the better constant $6\pi \approx 18.85$ is obtained by using a finer spectral regularization kernel than the one used in this section. The theorem shows that the convergence of the sequence $A^T(f)$ is very regular, like in the plane $\mathbb{R}^2$. This phenomenon is all the more surprising since no speed of convergence exists in general in von Neumann’s mean ergodic theorem, as we saw in the previous section. Talagrand’s proof introduces a remarkable idea of spectral regularization which has been developed in Lifshits–Weber [2000], [2003] for fixed and moving ergodic averages. Put for any positive real $x$ and any $-\pi \le \theta \le \pi$,

\[ V_x(\theta) = \frac{e^{ix\theta} - 1}{x (e^{i\theta} - 1)}. \]

We first collect and prove some elementary but useful estimates of these kernels.


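1.4.2 Lemma. We have the following estimates valid for $x > 0$ and $-\pi \le \theta \le \pi$:
(i) $|V_x(\theta)| \le \dfrac{\pi}{x|\theta|}$, and $|V_x(\theta)| \le \inf\big( 1, \dfrac{\pi}{x|\theta|} \big)$ for $x$ integer.
(ii) $\Big| \dfrac{\partial}{\partial x} V_x(\theta) \Big| \le \dfrac{\pi}{4} |\theta|$.
(iii) For any integers $m \ge n \ge 1$, $|V_n(\theta) - V_m(\theta)| \le \dfrac{\pi}{4} |\theta| (m - n)$.
(iv) $|V_n(\theta) - V_m(\theta)| \le 2 (m - n)/m$.

Proof. First observe that $|V_x(\theta)| \le \dfrac{2}{x |e^{i\theta} - 1|}$. As for any $-\pi \le \theta \le \pi$, $|e^{i\theta} - 1| = 2 \big| \sin \tfrac{|\theta|}{2} \big| \ge \dfrac{2|\theta|}{\pi}$, we deduce that $|V_x(\theta)| \le \dfrac{\pi}{x|\theta|}$, if $-\pi \le \theta \le \pi$ and $x > 0$. And $|V_x(\theta)| \le 1$ if $x$ is an integer. Hence (i).

Now let $-\pi \le \theta \le \pi$ and put for any real $x > 0$,

\[ \varphi_\theta(x) = \frac{e^{ix\theta} - 1}{x}. \]

Then $\varphi_\theta'(x) = \dfrac{i\theta x e^{ix\theta} - e^{ix\theta} + 1}{x^2}$, and noting $\delta(u) := |iue^{iu} - e^{iu} + 1|^2$, we have

\[ \delta(u) = (1 - u \sin u - \cos u)^2 + (u \cos u - \sin u)^2 = 2[1 - u \sin u - \cos u] + u^2. \]

We claim that for all real $u$, $\delta(u) \le u^4/4$. As $\delta(u) = \delta(-u)$ it suffices to prove it for $u \ge 0$. But $\delta'(u) = 2u(1 - \cos u)$ and if we set $H(u) := u^4/4 - \delta(u)$, we get

\[ H'(u) = u^3 - 2u(1 - \cos u) = u \big( u^2 - 4 \sin^2(u/2) \big) \ge 0, \]

since $|\sin v| \le |v|$. As $H(0) = 0$, the claim follows. Then $|\varphi_\theta'(x)| = \sqrt{\delta(x\theta)}/x^2 \le |\theta|^2/2$. As $\dfrac{\partial}{\partial x} V_x(\theta) = \dfrac{1}{e^{i\theta} - 1}\, \varphi_\theta'(x)$, it follows that

\[ \Big| \frac{\partial}{\partial x} V_x(\theta) \Big| \le \frac{\pi}{2|\theta|} |\varphi_\theta'(x)| \le \frac{\pi}{4} |\theta|. \]

Hence (ii). Let $m \ge n$ be positive integers. Then

\[ |V_n(\theta) - V_m(\theta)| = \frac{1}{|e^{i\theta} - 1|} |\varphi_\theta(n) - \varphi_\theta(m)| \le \frac{\pi}{2|\theta|} (m - n) \sup_{n<x<m} |\varphi_\theta'(x)|, \]

and so $|V_n(\theta) - V_m(\theta)| \le \dfrac{\pi}{4} |\theta| (m - n)$. Now

\[ |V_n(\theta) - V_m(\theta)| = \Big| \Big( \frac1n - \frac1m \Big) \sum_{j=0}^{n-1} e^{ij\theta} - \frac1m \sum_{j=n}^{m-1} e^{ij\theta} \Big| \le \frac{2(m - n)}{m}. \]

Hence, (iii) and (iv).

Introduce for $\theta \in [-\pi, \pi)$ and $y \in (0, 1]$ the regularizing kernel

\[ Q(\theta, y) = \frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2}. \tag{1.4.1} \]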
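The four estimates of Lemma 1.4.2 lend themselves to a direct numerical check. The Python sketch below evaluates each estimate on a grid of values of $\theta$ and reports the largest value of (left side minus right side) that it finds. The grid, the sample values of $x$, the pairs $(n, m)$ and the function names are hypothetical choices, and the point $\theta = 0$ is excluded since $V_x$ is defined there only by continuity.

```python
import numpy as np

def V(x, theta):
    """Kernel V_x(theta) = (e^{i x theta} - 1) / (x (e^{i theta} - 1)), for theta != 0."""
    return (np.exp(1j * x * theta) - 1) / (x * (np.exp(1j * theta) - 1))

def dV_dx(x, theta):
    """Derivative of V_x(theta) in x, written with phi'_theta(x) as in the proof."""
    u = x * theta
    return (1j * u * np.exp(1j * u) - np.exp(1j * u) + 1) / (x ** 2 * (np.exp(1j * theta) - 1))

def worst_violations(thetas, xs, pairs):
    """Largest (left side - right side) per estimate; a value <= 0 means no violation was found."""
    a = np.abs(thetas)
    worst = {"i": -np.inf, "i_integer": -np.inf, "ii": -np.inf, "iii": -np.inf, "iv": -np.inf}
    for x in xs:
        worst["i"] = max(worst["i"], np.max(np.abs(V(x, thetas)) - np.pi / (x * a)))
        worst["ii"] = max(worst["ii"], np.max(np.abs(dV_dx(x, thetas)) - np.pi * a / 4))
    for n, m in pairs:                      # integer pairs with m >= n >= 1
        diff = np.abs(V(n, thetas) - V(m, thetas))
        worst["i_integer"] = max(worst["i_integer"], np.max(np.abs(V(m, thetas)) - 1))
        worst["iii"] = max(worst["iii"], np.max(diff - np.pi * a * (m - n) / 4))
        worst["iv"] = max(worst["iv"], np.max(diff - 2 * (m - n) / m))
    return worst

thetas = np.linspace(-np.pi, np.pi, 2001)
thetas = thetas[np.abs(thetas) > 1e-9]      # drop theta = 0
report = worst_violations(thetas, xs=[0.5, 1.0, 3.7, 10.0],
                          pairs=[(1, 1), (1, 2), (3, 7), (10, 11), (5, 40)])
print(report)
```

Some entries are attained with equality up to rounding, for instance $|V_1(\theta)| = 1$, so the comparison is meaningful only up to a small floating-point tolerance.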


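1.4.3 Lemma. Let $m \ge n$ be two positive integers. Then, for any $\theta \in [-\pi, \pi)$,

\[ 4\pi \int_{1/m}^{1/n} Q(\theta, y)\, dy + 4 \cdot \mathbf{1}_{[\frac1m, \frac1n)}(|\theta|) \ge |V_m(\theta) - V_n(\theta)|^2. \]

Proof. Consider three cases:

(1) $|\theta| \ge \frac1n$. By definition of $Q$ and by Lemma 1.4.2,

\[ \int_{1/m}^{1/n} Q(\theta, y)\, dy = \int_{1/m}^{1/n} \frac{1}{|\theta|}\, dy = \frac{m - n}{m} \cdot \frac{1}{n|\theta|} \ge \frac12 |V_m(\theta) - V_n(\theta)| \cdot \frac{1}{n|\theta|} \ge \frac12 |V_m(\theta) - V_n(\theta)| \cdot \frac{1}{2\pi} |V_m(\theta) - V_n(\theta)| = \frac{1}{4\pi} |V_m(\theta) - V_n(\theta)|^2. \]

(2) $|\theta| \le \frac1m$. Then, for the same reasons,

\[ \int_{1/m}^{1/n} Q(\theta, y)\, dy = \int_{1/m}^{1/n} \frac{|\theta|}{y^2}\, dy = (m - n)|\theta| \ge \frac{4}{\pi} |V_m(\theta) - V_n(\theta)| \ge \frac{2}{\pi} |V_m(\theta) - V_n(\theta)|^2. \]

(3) $\frac1n > |\theta| \ge \frac1m$. This case is obvious since we have $|V_m(\theta) - V_n(\theta)| \le 2$.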
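The inequality of Lemma 1.4.3 can also be verified numerically, using the closed form of the integral of $Q$ in each of the three cases of the proof. The sketch below is illustrative only; the function names, the grid of values of $\theta$ and the pairs $(m, n)$ are hypothetical choices.

```python
import numpy as np

def V(n, theta):
    """Kernel V_n(theta) = (e^{i n theta} - 1) / (n (e^{i theta} - 1)), for theta != 0."""
    return (np.exp(1j * n * theta) - 1) / (n * (np.exp(1j * theta) - 1))

def Q_integral(theta, m, n):
    """Integral of Q(theta, y) = min(1/|theta|, |theta|/y^2) over y in [1/m, 1/n], with m >= n."""
    a, b, r = 1.0 / m, 1.0 / n, abs(theta)
    if r >= b:                               # Q = 1/|theta| on the whole interval
        return (b - a) / r
    if r <= a:                               # Q = |theta|/y^2 on the whole interval
        return r * (m - n)
    return (r - a) / r + r * (1.0 / r - 1.0 / b)   # the two regimes meet at y = |theta|

def lemma_143_gap(theta, m, n):
    """Left side minus right side of Lemma 1.4.3; it should be nonnegative."""
    indicator = 1.0 if 1.0 / m <= abs(theta) < 1.0 / n else 0.0
    return 4 * np.pi * Q_integral(theta, m, n) + 4 * indicator - abs(V(m, theta) - V(n, theta)) ** 2

thetas = [t for t in np.linspace(-np.pi, np.pi, 4001)[:-1] if abs(t) > 1e-9]
pairs = [(1, 1), (2, 1), (7, 3), (40, 5), (100, 99)]
smallest_gap = min(lemma_143_gap(t, m, n) for t in thetas for m, n in pairs)
print(smallest_gap)
```

The smallest gap occurs for $m = n$, where both sides vanish.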

Let $f \in H$, with spectral measure $\mu_f$. Introduce a new measure, the spectral regularization of the measure $\mu_f$ with respect to the kernel $Q$, defined by

\[ \hat\mu_f(dy) = 4\pi \Big( \int_{-\pi}^{\pi} Q(\theta, y)\, \mu_f(d\theta) \Big)\, dy + 4 \mu_f(dy). \tag{1.4.2} \]

It is easy to verify that $\hat\mu_f([0, 1]) \le 4(2\pi + 1)\, \mu_f([-\pi, \pi]) \le 4(2\pi + 1) \|f\|^2$. Indeed, if $|\theta| \le 1$, then

\[ \int_0^1 Q(\theta, y)\, dy = \int_0^{|\theta|} |\theta|^{-1}\, dy + \int_{|\theta|}^{1} |\theta| y^{-2}\, dy = 1 + |\theta| (|\theta|^{-1} - 1) = 2 - |\theta| \le 2, \]

and if $1 \le |\theta| \le \pi$, then $y \le |\theta|$ and $\int_0^1 Q(\theta, y)\, dy = \int_0^1 |\theta|^{-1}\, dy \le 1$. We thus have $\int_0^1 Q(\theta, y)\, dy \le 2$; hence the inequality.

1.4.4 Theorem (Spectral regularization inequality). For any integers $m \ge n \ge 1$,

\[ \|A_n^T f - A_m^T f\|^2 \le \hat\mu_f \Big( \Big[ \frac1m, \frac1n \Big) \Big). \]


1.4 The spectral regularization inequality

Proof. By integrating the inequality of Lemma 1.4.3 with respect to the measure $\mu_f$, we get
\[ 4\pi \int_{1/m}^{1/n}\int_{-\pi}^{\pi} Q(\theta, y)\,\mu_f(d\theta)\,dy + 4\mu_f\big[\tfrac1m, \tfrac1n\big) \ge \int_{-\pi}^{\pi} |V_m(\theta) - V_n(\theta)|^2\,\mu_f(d\theta). \]
By means of the spectral inequality (Proposition 1.2.2), we thus obtain the claimed result.

The spectral regularization inequality allows us to easily evaluate the Littlewood–Paley square function associated to the averages $A_n^T(f)$. Put, for any nondecreasing sequence $\mathcal N = \{n_p, p \ge 1\}$ of positive integers and any $f \in H$,
\[ S_{\mathcal N}(f) = \Big(\sum_{p=1}^{\infty} \|A^T_{n_{p+1}}(f) - A^T_{n_p}(f)\|^2\Big)^{1/2}. \tag{1.4.3} \]

These functions, which are extrapolated from the Littlewood–Paley theory, gained much interest in ergodic circles during the last decade. We briefly recall their role in Fourier analysis on $\mathbb T$. Introduce the so-called dyadic intervals
\[ \Delta_j = \begin{cases} \{2^{j-1}, 2^{j-1} + 1, \dots, 2^j - 1\} & \text{if } j > 0, \\ \{0\} & \text{if } j = 0, \\ -\Delta_{|j|} & \text{if } j < 0. \end{cases} \]
If $f$ is any integrable function on $\mathbb T$ and $\hat f$ its Fourier transform, then we write $S_j f = \sum_{n \in \Delta_j} \hat f(n)\chi_n$. The square function of $f$ is defined by
\[ Sf = \Big(\sum_{j \in \mathbb Z} |S_j f|^2\Big)^{1/2}, \]
and the Littlewood–Paley theorem on $\mathbb T$ expresses that to each $p$ in $(1, \infty)$ correspond positive numbers $A_p$ and $B_p$ such that
\[ A_p \|Sf\|_p \le \|f\|_p \le B_p \|Sf\|_p \]
for (say) all trigonometric polynomials $f$ on $\mathbb T$. For more, see [Edwards–Gaudry: 1977].

The square function also appears in martingale theory ([Burkholder–Gundy: 1970], inequality (1.4)). Let $f_1, f_2, \dots$ be a martingale on some probability space and $d_1, d_2, \dots$ its difference sequence, so that
\[ f_n = \sum_{k=1}^{n} d_k, \qquad n \ge 1. \]
Let $f^*$ denote the maximal function of the martingale sequence: $f^* = \sup_{n \ge 1} |f_n|$.


The maximal function is related to the square function $Sf = \big(\sum_{k=1}^{\infty} d_k^2\big)^{1/2}$ by the inequalities
\[ A_p \|Sf\|_p \le \|f^*\|_p \le B_p \|Sf\|_p, \]
valid for $1 < p < \infty$.

1.4.5 Theorem (Square function inequality). For any nondecreasing sequence $\mathcal N$ of positive integers, and any $f \in H$,
\[ S_{\mathcal N}(f) \le 2(2\pi + 1)^{1/2}\|f\|. \]

Proof. From Theorem 1.4.4, it follows immediately that
\[ \sum_{p=1}^{\infty} \|A^T_{n_{p+1}}(f) - A^T_{n_p}(f)\|^2 \le \sum_{p=1}^{\infty} \hat\mu_f\big[\tfrac{1}{n_{p+1}}, \tfrac{1}{n_p}\big) \le \hat\mu_f\{[0, 1]\} \le 4(2\pi + 1)\|f\|^2. \]

Actually the better constant $6\pi$ is obtained in Lifshits and Weber [2000: 77] by using another kernel $Q$. The corresponding spectral regularization of $\mu$ is given by
\[ \frac{d\hat\mu}{dx}(x) = \int_{-\pi}^{\pi} Q(\theta, x)\,\mu(d\theta) = |x|^{-3}\int_{|\theta| < |x|} \theta^2\,\mu(d\theta) + \int_{|x| < |\theta| \le \pi} |\theta|^{-1}\,\mu(d\theta), \qquad 0 < |x| \le \pi. \]
For any two positive integers $m \ge n$, we have
\[ \|A_n^T f - A_m^T f\|^2 \le 4\pi\,\hat\mu\big[\tfrac1m, \tfrac1n\big]. \]
By applying the above inequality to the measure $\mu = \delta_\theta$, we also get for each $\theta$,
\[ \sum_{p=1}^{\infty} |V_{n_{p+1}}(\theta) - V_{n_p}(\theta)|^2 \le 4(2\pi + 1). \tag{1.4.4} \]

This inequality was proved by Jones, Ostrovskii, Rosenblatt [1996] by different arguments, with the constant $25^2$. Note that (1.4.4) immediately implies Theorem 1.4.5, which is Theorem 1.2 in the above mentioned paper. Square functions for other ergodic averages are considered in [Nair–Weber: 1999].

We now can give a simple proof of Talagrand's inequality.

Proof of Theorem 1.4.1. Let $0 < \varepsilon \le 1$, and let $f \in H$ be such that $\|f\| = 1$. Let also $t_0 < t_1 < \dots < t_r$ be an ordered sequence of positive integers such that
\[ \|A^T_{t_i} f - A^T_{t_j} f\| \ge \varepsilon, \qquad \forall\, 0 \le i < j \le r. \]
Apply the previous theorem to the subsequence $\mathcal N = \{t_0, t_1, \dots, t_r, t_{r+1}, t_{r+2}, \dots\}$, where $t_{r+j} = t_r + j$ for $j = 1, 2, \dots$. Then $\varepsilon^2 r \le 4(2\pi + 1)$, and consequently,
\[ N(A^T(f), \|\cdot\|, \varepsilon) \le 1 + \frac{4(2\pi + 1)}{\varepsilon^2} \le 1 + \frac{30}{\varepsilon^2}. \]
This establishes the claimed inequality.
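The scalar inequality (1.4.4) and the counting argument of the proof can be illustrated in the simplest situation $H = \mathbb C$, $Uz = e^{i\theta}z$, $f = 1$, where $A_n f = V_n(\theta)$. The sketch below is only illustrative: the names `V`, `square_sum` and `separated_subsequence` are hypothetical, the closed form of the geometric sum is assumed for $V_n$, and the greedy selection is one convenient way to produce an ordered $\varepsilon$-separated family.

```python
import cmath
import math


def V(n, theta):
    """V_n(theta) = (e^{i n theta} - 1) / (n (e^{i theta} - 1)), with V_n(0) = 1."""
    if theta == 0.0:
        return 1.0 + 0j
    return (cmath.exp(1j * n * theta) - 1) / (n * (cmath.exp(1j * theta) - 1))


def square_sum(points):
    """Sum of squared moduli of consecutive differences."""
    return sum(abs(b - a) ** 2 for a, b in zip(points, points[1:]))


def separated_subsequence(points, eps):
    """Greedy selection, in index order, of points pairwise at distance >= eps."""
    chosen = []
    for z in points:
        if all(abs(z - w) >= eps for w in chosen):
            chosen.append(z)
    return chosen


BOUND = 4 * (2 * math.pi + 1)
averages = [V(n, 1.0) for n in range(1, 3001)]
```

With these definitions, `square_sum(averages)` stays below `BOUND`, and `len(separated_subsequence(averages, eps))` stays below `1 + BOUND / eps**2`, as the proof predicts for any ordered $\varepsilon$-separated family.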


1.4.6 Remarks. (1) The above estimate is also optimal. This can be seen by considering rotations. Take $X = [-\pi, \pi)$ provided with the normalized Lebesgue measure $\lambda$. Let also $\theta \in X$ be irrational and consider the unitary operator $U$ on $L^2(X, \lambda)$ associated with the rotation $\tau x = x + \theta \bmod (2\pi)$, $x \in X$, and defined by $Uf = f \circ \tau$. Let $f \in L^2(X, \lambda)$, $f = \sum_{n \in \mathbb Z} a_n e_n$, where we denote $e_n(x) = e^{inx}$. Then
\[ \|A_N(f) - A_M(f)\|_2^2 = \sum_{n \in \mathbb Z} |a_n|^2\,|V_N(n\theta) - V_M(n\theta)|^2. \]
By virtue of Weyl's criterion (Section 1.6), we can build inductively two increasing sequences of positive integers $N_1 < N_2 < \cdots$ and $l_1 < l_2 < \cdots$ such that for any $j = 1, 2, \dots$ and any $i < j$,
\[ |V_{N_j}(l_j\theta)| > \frac12, \qquad |V_{N_j}(l_i\theta)| < \frac14. \]

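Now let $\{r_k, k \ge 1\}$ be some increasing sequence of integers, and put $R_k = \sum_{j<k} r_j$, $f_k = r_k^{-1/2}\sum_{s=R_k}^{R_{k+1}-1} e_{l_s}$ and $f = \sum_k c_k f_k$, where $(c_k)$ is a square-summable sequence of positive numbers. Then, for any $R_k \le i < j < R_{k+1}$,
\[ \|A_{N_j}(f_k) - A_{N_i}(f_k)\|_2^2 \ge \frac{1}{r_k}\sum_{s=R_k}^{R_{k+1}-1} |V_{N_j}(l_s\theta) - V_{N_i}(l_s\theta)|^2 \ge \frac{1}{r_k}\,|V_{N_j}(l_i\theta) - V_{N_i}(l_i\theta)|^2 \ge \frac{1}{16 r_k}. \]
Hence,
\[ \|A_{N_j}(f) - A_{N_i}(f)\|_2 \ge c_k\,\|A_{N_j}(f_k) - A_{N_i}(f_k)\|_2 \ge \frac{c_k}{4\sqrt{r_k}} \]
for any $R_k \le i < j < R_{k+1}$, which proves that
\[ N\Big(f_k, \frac{1}{4\sqrt{r_k}}\Big) \ge r_k \qquad\text{and}\qquad N\Big(f, \frac{c_k}{4\sqrt{r_k}}\Big) \ge r_k. \]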

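The first inequality shows the optimality of Talagrand's estimate, by taking into account its homogeneity properties. The second inequality shows this: whatever $\varphi : \mathbb R_+ \to \mathbb R_+$ with $\lim_{x \to 0} \varphi(x) = 0$, there exists $f \in L^2(\lambda)$ such that
\[ \limsup_{\varepsilon \to 0} \frac{N(f, \varepsilon)}{\varepsilon^{-2}\varphi(\varepsilon)} > 0. \]

(2) Under appropriate spectral conditions, Talagrand's estimate can be sharpened. Let $f \in H$, $\|f\| = 1$, with spectral measure $\mu$ satisfying, for some $\beta \ge 0$,
\[ \int_{-\pi}^{\pi} \Big(\log\frac{\pi}{|\theta|}\Big)^{\beta} \mu(d\theta) < \infty. \]
Then ([Gamet–Weber: 2000] Proposition 1.2) there exists a constant $K = K(f, \beta)$ such that
\[ N(A^T(f), \|\cdot\|, \varepsilon) \le \frac{K}{\varepsilon^2\,|\log\varepsilon|^{\beta}}, \qquad 0 < \varepsilon \le 1. \]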


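Gaposhkin's estimates. These questions were further investigated by Gaposhkin, who considered sequences $\xi = \{\xi_k, k \ge 1\}$ of square integrable random variables satisfying the following quasi-stationary condition:
\[ \Big\|\frac1n \sum_{k=m}^{m+n} \xi_k\Big\|_2^2 \le \frac{C_0^2}{n^{\alpha}}, \tag{1.4.5} \]
for all $m \ge 0$, $n \ge 1$, where $C_0 > 0$ and $0 < \alpha \le 2$. Condition (1.4.5) holds for instance if $\xi$ is a weakly stationary sequence such that $\mathbf E\,\xi_k = 0$ and $\kappa(n) := \mathbf E\,\xi_k\xi_{k+n}$ satisfies
\[ |\kappa(n)| \le \frac{C_1}{n^{\alpha}}, \]
for some $0 < \alpha < 1$ and $C_1 > 0$. Then ([Gaposhkin: 2005], Theorem 1) the entropy number $N(\varepsilon)$ of the associated set of means satisfies the inequality
\[ N(\varepsilon) \le \frac{C_0 D_\alpha}{\varepsilon}, \]
where
\[ D_\alpha \le \Big(\frac{2 - \alpha}{\alpha}\Big)^{(2-\alpha)/\alpha} + 2 \quad\text{for } 0 < \alpha < 1, \qquad\text{and}\qquad D_\alpha \le 3 \quad\text{for } 1 \le \alpha \le 2. \]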

(3) For unitary operators with discrete spectrum, the entropy estimate can be improved. Let $U$ be a unitary operator with discrete spectrum in $H = L^2(\lambda)$, or in an arbitrary separable Hilbert space $H$. Let $\{e_j, j \in \mathbb Z\}$ be a basis in $H$ and $\{\lambda_j, j \in \mathbb Z\}$ be a sequence on the unit circle such that
\[ U(x) = U\Big(\sum_{j \in \mathbb Z} x_j e_j\Big) = \sum_{j \in \mathbb Z} \lambda_j x_j e_j, \qquad x_j = \langle x, e_j\rangle. \]
Then for each complex polynomial $P$, we have $P(U)x = \sum_{j \in \mathbb Z} P(\lambda_j) x_j e_j$. Let $q \in (1, 2]$. Further, for each $x \in H$,
\[ \|P(U)x\|_2 = \Big(\sum_{j \in \mathbb Z} |P(\lambda_j)|^2 |x_j|^2\Big)^{1/2} \le \Big(\sum_{j \in \mathbb Z} |P(\lambda_j)|^q |x_j|^q\Big)^{1/q}. \tag{1.4.6} \]
Combining this estimate and the spectral regularization method, we can prove:

1.4.7 Proposition. Let $B_q = \{x \in H : \sum_{j \in \mathbb Z} |x_j|^q \le 1\}$. There exists a constant $C$ depending only on $q$ such that
\[ \sup_{x \in B_q} N(A^U(x), \|\cdot\|, \varepsilon) \le C\varepsilon^{-q}. \]


The proposition can be applied to the unitary operator related to rotations of the circle, with the exponential functions providing the relevant basis of eigenfunctions and $B_q = \{x \in H : \|\hat x\|_q \le 1\}$, where $\hat x$ denotes the Fourier transform (the sequence of the Fourier coefficients) of $x \in L^p([0, 1[)$.

Proof. Fix $x \in B_q$. Define the pseudo-spectral measure of $x$ as $\mu = \sum_{j \in \mathbb Z} |x_j|^q \delta_{\lambda_j}$. Let, as usual, $V_n(z) = n^{-1}\sum_{j=0}^{n-1} z^j$ be the complex polynomials corresponding to the operators $A_n$. For all positive integers $n < m$, we can apply (1.4.6) to $P = V_n - V_m$ and rewrite it as
\[ \|A_n - A_m\|_2^q \le \|V_n - V_m\|_{q,\mu}^q. \]
The next step is a regularization procedure similar to the one previously performed. Define for $0 < r \le 1$ the regularized measure $\hat\mu$ on $[0, 1]$ by its Lebesgue density
\[ \frac{d\hat\mu}{dr}(r) = \int Q(z, r)\,\mu(dz) = r^{-q-1}\int_{|1-z| < r} |1 - z|^q\,\mu(dz) + r^{q-2}\int_{r \le |1-z| \le 2} |1 - z|^{1-q}\,\mu(dz). \]
Using the standard estimates
\[ |V_n(z) - V_m(z)| \le \min\big\{4/(n|1 - z|),\ 2(m - n)/n,\ (m - n)|1 - z|/2\big\}, \]
we get the inequality
\[ |V_n(z) - V_m(z)|^q \le C_1\int_{1/m}^{1/n} Q(z, r)\,dr, \qquad \forall z : |z| = 1, \]
with $C_1$ depending only on $q$. Integration over $\mu$ yields
\[ \|V_n - V_m\|_{q,\mu}^q \le C_1\,\hat\mu[1/m, 1/n]. \tag{1.4.7} \]
Moreover, we have a total mass bound $\hat\mu[0, 1] \le C_2\,\mu(\mathbb T)$, with $C_2$ depending only on $q$. Let us take arbitrary $\varepsilon > 0$ and cut $[0, 1]$ into segments such that the measure $\hat\mu$ of each segment does not exceed $\varepsilon^q/C_1$. It thus follows from (1.4.7) that the related covering of the set $A^U(x)$ consists of sets of diameter not larger than $\varepsilon$, and finally $N(A^U(x), \|\cdot\|, \varepsilon) \le C_1 C_2\,\varepsilon^{-q} + 1$.

Entropy numbers attached to i.i.d. sequences. Relatively surprisingly, entropy numbers attached to i.i.d. sequences behave more smoothly. To see this, let $H$ be some $L^2(\mu)$, $\mu$ a probability measure, and choose $U$ and $f \in L^2(\mu)$ with $\langle f, 1\rangle = 0$,


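$\|f\| = 1$, such that $f, Uf, U^2 f, \dots$ is a sequence of i.i.d. random variables. If we write more simply $A_n = A_n^U(f)$, then
\[ \|A_n - A_m\|^2 = \frac1n - \frac1m \]
for any integers $n < m$. Let $0 < \varepsilon < 1$ be fixed. Thus $\|A_n\| \le \varepsilon$ if $n \ge \frac{1}{\varepsilon^2}$. For each $1 \le n \le \frac{2}{\varepsilon}$, we cover $A_n$ with one ball of radius $\varepsilon$. Finally, if $\frac{2}{\varepsilon} < n < \frac{1}{\varepsilon^2}$, let $m_k = \big\lfloor \frac{1}{k\varepsilon^2} \big\rfloor$, $1 \le k \le \frac1\varepsilon$, so that $m_k$ decreases as $k$ increases. Notice that $\lfloor x \rfloor \ge x/2$ if $x \ge 1$, and $\lfloor x \rfloor - \lfloor y \rfloor \le 3(x - y)$ if $x - y \ge 1/2$. Thus $\frac{1}{2\varepsilon} \le m_{\lfloor 1/\varepsilon \rfloor} \le \frac{2}{\varepsilon}$, and
\[ \frac{1}{k\varepsilon^2} - \frac{1}{(k+1)\varepsilon^2} = \frac{1}{\varepsilon^2}\,\frac{1}{k(k+1)} \ge \frac{1}{\varepsilon^2}\,\frac{1}{\frac1\varepsilon\big(\frac1\varepsilon + 1\big)} = \frac{1}{1 + \varepsilon} \ge \frac12. \]
So $\big\lfloor \frac{1}{k\varepsilon^2} \big\rfloor - \big\lfloor \frac{1}{(k+1)\varepsilon^2} \big\rfloor \le \frac{3}{\varepsilon^2}\,\frac{1}{k(k+1)}$, which implies
\[ \frac{1}{m_{k+1}} - \frac{1}{m_k} = \frac{\big\lfloor \frac{1}{k\varepsilon^2} \big\rfloor - \big\lfloor \frac{1}{(k+1)\varepsilon^2} \big\rfloor}{\big\lfloor \frac{1}{k\varepsilon^2} \big\rfloor\,\big\lfloor \frac{1}{(k+1)\varepsilon^2} \big\rfloor} \le 4k(k+1)\varepsilon^4\Big(\Big\lfloor \frac{1}{k\varepsilon^2} \Big\rfloor - \Big\lfloor \frac{1}{(k+1)\varepsilon^2} \Big\rfloor\Big) \le 4k(k+1)\varepsilon^4\,\frac{3}{\varepsilon^2}\,\frac{1}{k(k+1)} = 12\varepsilon^2. \]
Let $m_{k+1} \le n < m_k$. Then
\[ \|A_n - A_{m_{k+1}}\|^2 = \frac{1}{m_{k+1}} - \frac1n \le \frac{1}{m_{k+1}} - \frac{1}{m_k} \le 12\varepsilon^2. \]
Hence, for some absolute constant $C$,
\[ C^{-1}\varepsilon^{-1} \le N(A(f), \|\cdot\|, \varepsilon) \le C\varepsilon^{-1}. \tag{1.4.8} \]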
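The identity $\|A_n - A_m\|^2 = \frac1n - \frac1m$ on which this computation rests only uses the orthonormality of $f, Uf, U^2f, \dots$ (centered i.i.d. variables of unit variance). The following sketch checks it in exact arithmetic; the coordinate model and the names `average_vector` and `squared_distance` are assumptions made for the illustration.

```python
from fractions import Fraction


def average_vector(n, dim):
    """Coordinates of A_n = (f_0 + ... + f_{n-1}) / n in an orthonormal family f_0, ..., f_{dim-1}."""
    return [Fraction(1, n) if j < n else Fraction(0) for j in range(dim)]


def squared_distance(n, m):
    """Exact value of ||A_n - A_m||^2 in the coordinate model above."""
    dim = max(n, m)
    a = average_vector(n, dim)
    b = average_vector(m, dim)
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

For example, `squared_distance(2, 5)` equals `Fraction(3, 10)`, that is $\frac12 - \frac15$.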

We conclude these remarks by pointing out a Cauchy type uniform estimate of averages An , easy to draw from the above estimates and Theorem 8.1.1, |AN (f ) − AM (f )| 1/2 < ∞

N =M≥1 1 − 1 M N 1 du + + whenever : R → R is an increasing map such that 0 √u(u) < ∞. E sup

(1.4.9)

These plain computations also show, when combined with Rosenthal's inequalities, that this estimate continues to be valid in $L^p$, $2 < p < \infty$. More precisely, let $\{\xi_j, j = 0, 1, \dots\}$ be a sequence of mean zero independent variables with finite moments of order $p \ge 2$ and $\sigma \le \|\xi_j\|_2 \le \|\xi_j\|_p \le K$ for all $j$. Let $A_n = n^{-1}\sum_{j=0}^{n-1} \xi_j$ and $A = \{A_n, n \ge 1\}$. Then for each $p \ge 2$ there exist constants $c_p$ and $C_p$ depending only on $p$ such that the entropy numbers obey
\[ c_p\,\sigma\,\varepsilon^{-1} - 1 \le N(A, \|\cdot\|_p, \varepsilon) \le C_p K \varepsilon^{-1} + 2 \qquad\text{for all } \varepsilon > 0. \tag{1.4.10} \]


Lacunary subsequences. Let N = {nj , j ≥ 1} be a strictly growing sequence of positive integers satisfying the condition cN := sup # N ∩ [2k , 2k+1 [ < ∞. k≥1

Better estimates of entropy numbers than in Theorem 1.4.1 can be obtained in that case. Let f ∈ H with spectral measure μ. Put

π μ{0 < |θ| ≤ u} + u (ε) = inf + log , 0 < u ≤ π . 2 ε u Then there exists a universal constant C such that for any N , any f ∈ H with f = 1 and any 0 < ε ≤ 1,

N An f, n ∈ N , · , ε ≤ CcN (ε). For proofs, see Weber [1998a: Corollary 3.3] or Lifshits–Weber [2000: Corollary 4]. Extension to Lp with p > 1. Assume that H = L2 (μ), (X, A, μ) being a probability space, and define Tf = f τ where τ is a measure-preserving transformation of X (Section 3.1). By Theorem 1.4.5, the associated square function SN defined in (1.4.3) maps L2 (μ) to L2 (μ). This can be extended for 1 < p < ∞: There exists a constant Cp such that for any increasing sequence N = {nk , k ≥ 1} and any f ∈ Lp (μ), we have ∞ T A

p 1/p T (f ) − A (f ) ≤ Cp f p . nk+1 nk p

(1.4.11)

k=1

This nice result was shown by Jones, Kaufman, Rosenblatt and Wierdl [1998]. It is a direct consequence of a stronger result (see Theorem A), which we shall discuss in Section 4.6.6. With the notation from the beginning of the section, let $N(A^T(f), p, \varepsilon)$ be the minimal number (possibly infinite) of open $L^p(\mu)$ balls of radius $\varepsilon$, centered in $A^T(f)$, needed to cover $A^T(f)$. In a way similar to the one we used to derive entropy estimates from the square function, we deduce from (1.4.11): there exists a constant $C_p$ such that for $\varepsilon > 0$ and any $f \in L^p(\mu)$,
\[ N(A^T(f), p, \varepsilon) \le \frac{C_p^p\,\|f\|_p^p}{\varepsilon^p}. \tag{1.4.12} \]

For irrational rotations, this bound can be improved by using the Hausdorff–Young inequality (Lifshits [1997] and Weber [1997]). Let $\tau x = x + \vartheta$ be a rotation on $(\mathbb T, \lambda)$, and $T$ defined by $Tf = f \circ \tau$. Let $2 \le p < \infty$ and $1/p + 1/q = 1$. For $f \in L^p(\mathbb T)$, $f \sim \sum_{j \in \mathbb Z} \hat f_j e_j$, let $\hat f = \{\hat f_j, j \in \mathbb Z\}$ be its Fourier transform. Then
\[ \sup_{\|\hat f\|_q \le 1} N(A^T(f), p, \varepsilon) \le C\varepsilon^{-q}. \tag{1.4.13} \]

As $T e_j = e^{2i\pi j\vartheta} e_j := \lambda_j e_j$, for all polynomials $P$ we have
\[ P(T)f = \sum_{j \in \mathbb Z} P(\lambda_j)\hat f_j e_j. \]
By the Hausdorff–Young theorem, we get
\[ \|P(T)f\|_p \le C_p\Big(\sum_{j \in \mathbb Z} |P(\lambda_j)|^q |\hat f_j|^q\Big)^{1/q}. \]
But this is a complete analog to (1.4.6) and we can proceed as in the proof of Proposition 1.4.7, by introducing a pseudo-spectral measure $\mu = \sum_{j \in \mathbb Z} |\hat f_j|^q \delta_{\lambda_j}$, and its regularized version $\hat\mu$ with the same kernel $Q(z, r)$. We arrive at the estimate
\[ \|(A_n - A_m)f\|_p^q = \|(V_n - V_m)(T)f\|_p^q \le C_p^q\,\|V_n - V_m\|_{q,\mu}^q \le C_1 C_p^q\,\hat\mu[1/m, 1/n]. \]
The estimate for covering numbers follows straightforwardly. Note that the proof works not only for rotations but also for all operators whose duals (with respect to a Fourier transform) act in $\ell^q$ as contractive multiplications. Any convolution operator with respect to a unit mass measure satisfies this condition. For more general averages such as averages of Dunford–Schwartz operators, or of a contraction in $L^p$, we do not know whether an analogous formulation of (1.4.12) exists. This estimate cannot, however, be improved in general, as the following nice counterexample from Lifshits [1997] shows.

Lifshits' counterexample. Let $2 \le p < \infty$ and let $U : L^p(\mathbb T) \to L^p(\mathbb T)$ be the multiplication operator defined for any $f \in L^p(\mathbb T)$ and any $\theta \in \mathbb T$ by $Uf(\theta) = e^{i\theta} f(\theta)$. We write
\[ A_n = A_n^U = \frac{I + U + \cdots + U^{n-1}}{n}, \]
where $I$ is the identity operator. We shall prove for any $\varepsilon > 0$ small enough that
\[ \sup_{\|f\|_p = 1} N(A(f), p, \varepsilon/3) \ge \varepsilon^{-p}. \]
Note that $A_n f(\theta) = V_n(\theta) f(\theta)$, so that for any positive integers $m, n$,
\[ \|A_n f - A_m f\|_p^p = \int_{\mathbb T} |V_n(\theta) - V_m(\theta)|^p\,|f(\theta)|^p\,d\theta. \tag{1.4.14} \]


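Let $B$ be some fixed integer strictly greater than 12. From the standard estimates
\[ |V_m(\theta)| \le \pi(m\theta)^{-1}, \qquad |V_n(\theta) - 1| \le \pi(n - 1)\theta/4 \le n\theta, \]
valid for any $m, n, \theta$, we deduce that if $B/m \le \theta \le B^2/m$ and $n \le B^{-3} m$, then $|V_n(\theta) - V_m(\theta)| \ge 1/2$. It follows for any $f \in L^p(\mathbb T)$, any $m$ and any $n \le B^{-3} m$ that
\[ \|A_n f - A_m f\|_p^p \ge 2^{-p}\int_{B/m}^{B^2/m} |f(\theta)|^p\,d\theta. \]
In particular, for any $f \in L^p(\mathbb T)$ and any positive integers $l > t$,
\[ \|A_{B^{3t}} f - A_{B^{3l}} f\|_p^p \ge 2^{-p}\int_{B^{1-3l}}^{B^{2-3l}} |f(\theta)|^p\,d\theta. \]
Let $M$ be some positive integer and put $\varepsilon = M^{-1/p}$. Set
\[ f(\theta) = \sum_{l=1}^{M} \Big(\frac{1}{M(B^{2-3l} - B^{1-3l})}\Big)^{1/p}\,1_{[B^{1-3l},\, B^{2-3l}]}(\theta). \]
Then $\|f\|_p = 1$ and
\[ \int_{B^{1-3l}}^{B^{2-3l}} |f(\theta)|^p\,d\theta = \frac1M = \varepsilon^p, \qquad l = 1, \dots, M. \]
Thus
\[ \|A_{B^{3t}} f - A_{B^{3l}} f\|_p^p \ge 2^{-p}\varepsilon^p, \qquad 1 \le t < l \le M. \]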
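The key lower bound $|V_n(\theta) - V_m(\theta)| \ge 1/2$ on the window $B/m \le \theta \le B^2/m$ can be observed numerically for $n = B^{3t}$, $m = B^{3l}$. This is a hedged sketch: the choice $B = 13$, the sampling of the window, and the names `V` and `min_gap` are assumptions made for the illustration, and $V_n$ is evaluated through the closed form $e^{i(n-1)\theta/2}\sin(n\theta/2)/(n\sin(\theta/2))$.

```python
import cmath
import math


def V(n, theta):
    """V_n(theta) = e^{i(n-1)theta/2} sin(n theta/2) / (n sin(theta/2)), with V_n(0) = 1."""
    if theta == 0.0:
        return 1.0 + 0j
    phase = cmath.exp(1j * (n - 1) * theta / 2)
    return phase * math.sin(n * theta / 2) / (n * math.sin(theta / 2))


def min_gap(B, t, l, samples=1000):
    """Minimum of |V_n - V_m| over a sample of [B/m, B^2/m], with n = B^(3t), m = B^(3l), t < l."""
    n, m = B ** (3 * t), B ** (3 * l)
    lo, hi = B / m, B * B / m
    thetas = (lo + (hi - lo) * k / samples for k in range(samples + 1))
    return min(abs(V(n, th) - V(m, th)) for th in thetas)
```

For instance, `min_gap(13, 1, 2)` is expected to exceed `0.5`, since on that window $|V_m| \le \pi/B$ and $|V_n - 1| \le 1/B$.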

We deduce from these calculations that $N(A(f), p, \varepsilon/3) \ge M = \varepsilon^{-p}$, as claimed.

A variant in $L^1$. There is a general estimate of a weaker form of the square function in $L^1$, which is due to Jones, Rosenblatt and Wierdl [1999: Theorem 2.3], and can be stated as follows. Let $(X, \mathcal A, \mu)$ be a probability space. Consider mappings $T_n : L^1(\mu) \to L^1(\mu)$ and assume that each is strongly positive in the sense that $T_n f \ge 0$ for all $f \in L^1(\mu)$. We also assume that each $T_n$ is positively homogeneous, which means that $T_n(cf) = cT_n f$ for nonnegative $c$ and $f \in L^1(\mu)$. For instance, $T_n$ can be the absolute value of any linear operator from $L^1(\mu)$ to $L^1(\mu)$. Let $Sf(x) = \big(\sum_{n=1}^{\infty} T_n f(x)^2\big)^{1/2}$. Then
\[ \sup_{\lambda \ge 0}\,\sup_{\|f\|_1 \le 1} \lambda\sum_{n=1}^{\infty} \mu\{|T_n f| \ge \lambda\} \le C \implies \sup_{\lambda \ge 0}\,\sup_{\|f\|_1 \le 1} \lambda\,\mu\{Sf \ge \lambda\} \le 10C. \tag{1.4.15} \]


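The proof is rather elementary. As $Sf \le S_1 f + S_2 f$, where
\[ S_1 f(x) = \Big(\sum_{n=1}^{\infty} (T_n f(x))^2\,1_{\{T_n f \le 1\}}(x)\Big)^{1/2}, \qquad S_2 f(x) = \Big(\sum_{n=1}^{\infty} (T_n f(x))^2\,1_{\{T_n f > 1\}}(x)\Big)^{1/2}, \]
we get
\begin{align*}
\mu\{Sf \ge 2\} &\le \mu\{S_1 f \ge 1\} + \mu\{S_2 f \ge 1\} \\
&\le \mu\{S_1 f \ge 1\} + \mu\Big\{\sum_{n=1}^{\infty} (T_n f)^2\,1_{\{T_n f > 1\}} \ge 1\Big\} \\
&\le \mu\{S_1 f \ge 1\} + \sum_{n=1}^{\infty} \mu\{T_n f > 1\} \le \mu\{(S_1 f)^2 \ge 1\} + C\|f\|_1 \\
&= \mu\Big\{\sum_{n=1}^{\infty} (T_n f(x))^2 \sum_{k=0}^{\infty} 1_{\{2^{-k-1} \le T_n f \le 2^{-k}\}} \ge 1\Big\} + C\|f\|_1 \\
&\le \mu\Big\{\sum_{k=0}^{\infty} 2^{-2k}\sum_{n=1}^{\infty} 1_{\{2^{-k-1} \le T_n f \le 2^{-k}\}} \ge 1\Big\} + C\|f\|_1 \\
&\le \sum_{k=0}^{\infty} 2^{-2k}\sum_{n=1}^{\infty} \mu\{T_n f \ge 2^{-k-1}\} + C\|f\|_1 \le 5C\|f\|_1.
\end{align*}
Let $t > 0$. Replacing now $f$ by $f/t$ gives $t\,\mu\{Sf \ge 2t\} \le 5C\|f\|_1$; hence
\[ \sup_{\lambda \ge 0} \lambda\,\mu\{Sf \ge \lambda\} \le 10C\|f\|_1. \]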

Extensions to the Hilbert transform. Results of the previous section have extensions to the discrete bilateral Hilbert transform
\[ H_n(f) = \sum_{0 < |j| \le n} U^j(f)/j, \]
where $U : H \to H$ is still a contraction in a Hilbert space $H$. A link between the Hilbert transform and ergodic means can be deduced from the following elementary identity (here the $a_j$ are complex numbers):
\[ \frac1n\sum_{j=1}^{n} a_j = S_n - \frac1n\sum_{j=1}^{n-1} S_j, \qquad S_j = \sum_{k=1}^{j} \frac1k\,a_k, \qquad n \ge 1. \]
The properties of $H_n$ were notably considered in the work of Jajte [1987].
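The elementary identity above is an Abel summation and can be confirmed in exact arithmetic. The sketch below uses rational inputs instead of complex numbers purely so that equality can be tested exactly; the function name `check_identity` is hypothetical.

```python
from fractions import Fraction


def check_identity(a):
    """Verify (1/n) sum_{j<=n} a_j = S_n - (1/n) sum_{j<n} S_j for n = 1..len(a),
    where S_j = sum_{k<=j} a_k / k and a = [a_1, a_2, ...]."""
    S = []
    running = Fraction(0)
    for k, ak in enumerate(a, start=1):
        running += Fraction(ak) / k
        S.append(running)                      # S[k-1] holds S_k
    for n in range(1, len(a) + 1):
        lhs = sum((Fraction(x) for x in a[:n]), Fraction(0)) / n
        rhs = S[n - 1] - sum(S[: n - 1], Fraction(0)) / n
        if lhs != rhs:
            return False
    return True
```

A call such as `check_identity([3, -1, 4, 1, -5, 9, 2, -6])` returns `True`.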


The associated sequence of spectral kernels is defined as
\[ W_n(\theta) = \sum_{0 < |j| \le n} \frac1j\,e^{ij\theta} = 2i\sum_{0 < j \le n} \frac{\sin(j\theta)}{j}, \qquad W = \{W_n, n \ge 1\}. \]
We also introduce the auxiliary sequence of functions
\[ \Psi_n(\theta) = \sum_{j=n+1}^{\infty} \frac{\sin(j\theta)}{j}. \]
Then we observe that for all $m \ge n$, $|W_m(\theta) - W_n(\theta)| = 2\,|\Psi_n(\theta) - \Psi_m(\theta)|$. By applying the Abel transform, we get

1.4.8 Lemma. For all $\theta \in [-\pi, \pi)$ the following inequalities hold:
a) for all $n \ge 1$, $|\Psi_n(\theta)| \le 4/(n|\theta|)$;
b) for all $m \ge n$, $|\Psi_n(\theta) - \Psi_m(\theta)| \le (m - n)/m$;
c) for all $m \ge n$, $|\Psi_n(\theta) - \Psi_m(\theta)| \le (m - n)|\theta|$.

One can easily deduce from Lemma 1.4.8, in the same manner as for proving Lemma 1.4.3, that for all integers $m \ge n$ and each $\theta \in [-\pi, \pi)$,
\[ 32\int_{1/m}^{1/n} Q(\theta, y)\,dy + 4\cdot 1_{[\frac1m, \frac1n)}(|\theta|) \ge |W_m(\theta) - W_n(\theta)|^2, \tag{1.4.16} \]
where the kernel $Q(\theta, y)$ is defined in (1.4.1). In view of the definition of $\hat\mu$ (see (1.4.2)) we get

1.4.9 Theorem. Let $m \ge n$ be two positive integers. Then
\[ \|W_m - W_n\|_{2,\mu}^2 \le 3\hat\mu\big[\tfrac1m, \tfrac1n\big). \]
This result yields corollaries similar to those of Theorem 1.4.4. In particular, for any increasing sequence of positive integers $\{n_p, p \ge 1\}$,
\[ \sum_p \|H_{n_{p+1}}(f) - H_{n_p}(f)\|^2 \le 12(2\pi + 1)\|f\|^2, \tag{1.4.17} \]
and for every $\varepsilon > 0$ the entropy number of the set $H(f) = \{H_n f, n \ge 1\}$ satisfies
\[ N(H(f), \varepsilon) \le 1 + \frac{12(2\pi + 1)}{\varepsilon^2} \le 1 + \frac{88}{\varepsilon^2}. \tag{1.4.18} \]


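Problem 1. Let $\{T_t, t \in \mathbb R\}$ be a flow (Section 4.1) and consider the one-sided Hilbert transform
\[ A_n f = \int_1^n \frac{T^u + T^{-u}}{2}\,f\,\frac{du}{u}, \]
with corresponding spectral kernel
\[ V_n(t) = \int_1^n \frac{\cos ut}{u}\,du = \int_t^{nt} \frac{\cos v}{v}\,dv = I_t - I_{nt}, \qquad\text{where } I_t = \int_t^{\infty} \frac{\cos v}{v}\,dv. \]
Prove that for all $t \ge 0$ and all reals $n \ge m \ge 1$,
\[ |V_m(t) - V_n(t)| \le 64\int_{1/n}^{1/m} Q(t, x)\,dx, \]
where
\[ Q(t, x) = \begin{cases} 1/t & \text{if } 0 \le x \le t, \\ |\log x|/x & \text{if } t \le x \le 1, \\ 0 & \text{elsewhere.} \end{cases} \]
Let $\mu$ be any measure on $\mathbb R$ (for instance the spectral measure of $f$) and let $\hat\mu$ denote the regularized measure defined as
\[ \frac{d\hat\mu}{dx}(x) = \int_{\mathbb R} Q(|t|, x)\,\mu(dt), \qquad 0 < x < \infty. \]
Show for all reals $n \ge m \ge 1$ that
\[ \|V_m(t) - V_n(t)\|_{2,\mu}^2 \le 64\,\hat\mu\big[\tfrac1n, \tfrac1m\big]. \]
Besides,
\[ \hat\mu(\mathbb R) = \mu(\mathbb R) + \frac12\int_{-1}^{1} |\log|t||^2\,\mu(dt). \]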

See also Remark 2.6.4.

Extension to correlated sequences. We shall now indicate an extension to the Wiener space $S$ of correlated sequences, namely the space consisting of sequences $a = \{a(n), n \in \mathbb Z\}$ such that for any integer $k$, the limit
\[ \gamma_a(k) = \lim_{n \to \infty} \frac1n\sum_{j=0}^{n-1} a(j)a(j + k) \]

exists. We provide S with the semi-norm

1 a(j )2 (a) = lim sup n n→∞ n−1

j =0

1/2

.


Entropy numbers associated to any subset E of (S, ) are denoted by N (E, , · ). For any a = {aj , j ∈ Z} ∈ 2 (Z), let us write ϕa (α) = j ∈Z e−2iπj α a(j ). Let also T be the right shift on the space of sequences: T (bn , n ∈ Z) = (bn+1 , n ∈ Z), and denote AN =

I + T + · · · + T N −1 , N

N = 1, 2, . . . .

1.4.10 Corollary. For any a ∈ S, there exists a constant K(a) depending on a only, such that K(a) ∀0 < ε ≤ K(a), N({ATn (a), n ≥ 1}, , ε) ≤ 2 . ε Proof. By the Bessel–Parseval equality, ∀N, M ∈ N,

(AN − AM )(a)(n)2 = |ϕa (α)|2 |VN (α) − VM (α)|2 dα. T

n∈Z

Hence, for any J ≥ 1, and all N, M such that N ∨ M < J , 1 J

|(AN − AM )(a)(n)|2

0≤n<J −N ∨M

≤

T

2 2 1 −2iπj α e a(j ) VN (α) − VM (α) dα. J 0≤j <J

We can view the right integral as an integration with respect to the measure J,a (dα) =

2 1 −2iπj α e a(j ) dα. J

(1.4.19)

0≤j <J

Assume now that a ∈ S. By 1.1.3 there exists a unique nonnegative bounded measure a on T, the spectral measure of the sequence a, such that ∀m ∈ Z,

γa (m) =

π

−π

e2iπ mα a (dα).

Further, we know that the family of measures J,a weakly converges to a . We thus deduce 1 2 |(AN − AM ) (a)(n)| ≤ |VN (α) − VM (α)|2 a (dα) lim sup T J →∞ J 0≤n<J

for all N, M ≥ 1. The result now follows easily.


Continuous time and Fourier inversion formula. The results stated in Section 1.4 remain valid if we consider a semigroup of unitary operators $\{U_t, t \in \mathbb R_+\}$ in a Hilbert space $H$ and the corresponding averages
\[ A_T(f) = \frac1T\int_0^T U_t(f)\,dt. \]
In this case, one must replace the space $L^2([-\pi, \pi), \mu)$ with $L^2(\mathbb R, \mu)$, and consider the kernels
\[ V_T(\theta) = \frac{e^{iT\theta} - 1}{iT\theta}, \qquad \Re V_T(y) = \frac{\sin Ty}{Ty}. \tag{1.4.20} \]
We define by continuity $V_T(0) = \Re V_T(0) \equiv 1$. Since the basic elementary inequalities
\[ |V_{T_2}(\theta) - V_{T_1}(\theta)| \le \min\Big\{\frac{2(T_2 - T_1)}{T_2},\ \frac{T_2 - T_1}{2}|\theta|\Big\}, \quad T_2 \ge T_1, \qquad |V_{T_1}(\theta)| \le \frac{2}{T_1|\theta|}, \tag{1.4.21} \]
hold true, we still have the analogue of Theorem 1.4.4,
\[ \|V_{T_2} - V_{T_1}\|_{2,\mu}^2 \le 8\hat\mu\big[\tfrac{1}{T_2}, \tfrac{1}{T_1}\big]. \tag{1.4.22} \]

All corollaries about entropy numbers and square functions follow directly. There is an interesting application to the Fourier inversion formula, which is worth noting. If $\nu$ is a distribution function on $\mathbb R$ and $\hat\nu(t) = \int_{\mathbb R} e^{itx}\,\nu(dx)$ denotes its characteristic function, then (see for instance Theorem 6.2.4 in Chung [1970])
\[ \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} e^{-itx_0}\,\hat\nu(t)\,dt = \nu\{x_0\}. \tag{1.4.23} \]
From this result it also follows that
\[ \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} |\hat\nu(t)|^2\,dt = \sum_{x \in \mathbb R} \nu(\{x\})^2. \tag{1.4.24} \]
And more generally, for any positive integer $n$,
\[ \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} |\hat\nu(t)|^{2n}\,dt = \sum_{x \in \mathbb R} \nu^{*n}(\{x\})^2. \tag{1.4.25} \]
That (1.4.24), (1.4.25) follow from (1.4.23) is simple, and is better seen using a probabilistic language (by homogeneity, there is no loss in assuming $\nu(\mathbb R) = 1 = \hat\nu(0)$). If $X$ is a real-valued random variable defined on some probability space $(\Omega, \mathcal A, \mathbf P)$ such that $X(\mathbf P) = \nu$, and $X'$ denotes an independent copy of $X$, then $Z = X - X'$ has distribution function $\nu * \nu'$, where $\nu'(A) = \nu(-A)$ for any Borel set $A$. And $|\hat\nu(t)|^2 = \widehat{\nu * \nu'}(t) = \mathbf E\,e^{it(X - X')}$, so (1.4.23) means
\[ \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} \mathbf E\,e^{it(X - X')}\,dt = (\nu * \nu')(\{0\}) = \int_{\mathbb R} \nu'(\{-y\})\,\nu(dy) = \sum_{x \in \mathbb R} \nu(\{x\})^2 = \sum_{x \in \mathbb R} \mathbf P\{X = x\}^2, \]
which yields (1.4.24). Let $Z_1, Z_2, \dots, Z_n$ be independent copies of $Z$ and set $S_n = Z_1 + \cdots + Z_n$, so that $S_n$ has distribution $\nu^{*n} * (\nu^{*n})'$. Applying the above to $S_n$ gives (since $\mathbf E\,e^{itS_n} = |\hat\nu(t)|^{2n}$)
\[ \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} |\hat\nu(t)|^{2n}\,dt = \mathbf P\{S_n = 0\} = \sum_{x \in \mathbb R} \nu^{*n}(\{x\})^2, \]

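which is (1.4.25). Recall briefly for our purpose how to obtain (1.4.23). Write $\Re V_T(y) = \sin(Ty)/(Ty)$ for the real part of the kernel $V_T$ of (1.4.20). As $|\Re V_T(y)| \le 1$ everywhere and $\Re V_T(y) \to 0$ as $T$ tends to infinity for all $y \ne 0$, by the dominated convergence theorem it holds that for any real $x_0$,
\[ \int_{\mathbb R} \Re V_T(x - x_0)\,\nu(dx) \to \nu\{x_0\} \qquad\text{as } T \text{ tends to infinity.} \tag{1.4.26} \]
And so
\[ M_T(x_0) := \frac{1}{2T}\int_{-T}^{T} e^{-itx_0}\,\hat\nu(t)\,dt = \int_{\mathbb R \setminus \{x_0\}} \frac{\sin T(x - x_0)}{T(x - x_0)}\,\nu(dx) + \nu\{x_0\} = \int_{\mathbb R} \Re V_T(x - x_0)\,\nu(dx) \to \nu\{x_0\}. \tag{1.4.27} \]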
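For a purely atomic measure $\nu = \sum_k p_k\delta_{x_k}$ the quantities above have closed forms, which makes the limits (1.4.23) and (1.4.24) easy to observe. The following sketch is illustrative only: the dictionary representation of $\nu$ and the names `sinc`, `M` and `mean_square_cf` are assumptions, and the formulas used are $M_T(x_0) = \sum_k p_k\,\frac{\sin T(x_k - x_0)}{T(x_k - x_0)}$ and $\frac{1}{2T}\int_{-T}^{T} |\hat\nu|^2 = \sum_{j,k} p_j p_k\,\frac{\sin T(x_j - x_k)}{T(x_j - x_k)}$.

```python
import math


def sinc(x):
    """sin(x)/x, extended by continuity with the value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def M(T, x0, atoms):
    """M_T(x0) for an atomic measure given as a dict {point: mass}."""
    return sum(p * sinc(T * (x - x0)) for x, p in atoms.items())


def mean_square_cf(T, atoms):
    """(1/2T) * integral over [-T, T] of |nu_hat(t)|^2 dt for an atomic measure."""
    return sum(
        p * q * sinc(T * (x - y))
        for x, p in atoms.items()
        for y, q in atoms.items()
    )


atoms = {0.0: 0.5, 1.0: 0.3, 2.5: 0.2}
```

With `T = 1e6`, the value `M(T, 1.0, atoms)` is close to the mass `0.3`, `M(T, 0.7, atoms)` is close to `0`, and `mean_square_cf(T, atoms)` is close to `0.5**2 + 0.3**2 + 0.2**2`.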

The Fourier inversion formula can be made a little more precise. Not only $M_T(x_0) \to \nu\{x_0\}$, but in fact for any arbitrary nondecreasing sequence $\mathcal T = \{T_p, p \ge 1\}$ of positive reals,
\[ \sum_{k=1}^{\infty} \big|M_{T_{k+1}}(x_0) - M_{T_k}(x_0)\big|^2 \le 24\,\nu^2(\mathbb R). \tag{1.4.28} \]
This time the total mass of the measure appears, unlike in (1.4.23). However (1.4.28) implies the convergence of $M_T(x_0)$ as $T \to \infty$. By the Cauchy–Schwarz inequality, we first observe that
\[ \big|M_{T_2}(x_0) - M_{T_1}(x_0)\big|^2 = \Big|\int_{\mathbb R} \big(\Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0)\big)\,\nu(dx)\Big|^2 \le \nu(\mathbb R)\cdot\int_{\mathbb R} \big(\Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0)\big)^2\,\nu(dx). \tag{1.4.29} \]
By (1.4.22),
\[ \int_{\mathbb R} \big(\Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0)\big)^2\,\nu(dx) \le \int_{\mathbb R} \big|V_{T_2}(y) - V_{T_1}(y)\big|^2\,\nu_{x_0}(dy) \le 8\,\hat\nu_{x_0}\big[\tfrac{1}{T_2}, \tfrac{1}{T_1}\big], \tag{1.4.30} \]
where for any real $y$ we write $\nu_y(A) = \nu(A - y)$, for every $A \in \mathcal B(\mathbb R)$. Furthermore,
\[ \hat\nu_{x_0}(\mathbb R) = \iint Q(\theta, x)\,dx\,\nu_{x_0}(d\theta) = \int \Big(\int_{|\theta| < |x|} |x|^{-3}\,dx\ \theta^2 + \int_{|x| < |\theta|} dx\,|\theta|^{-1}\Big)\nu_{x_0}(d\theta) \le 3\nu_{x_0}(\mathbb R) = 3\nu(\mathbb R). \tag{1.4.31} \]
Let $\mathcal T = \{T_k, k \ge 1\}$ be any arbitrary nondecreasing sequence of positive reals. It follows from the above estimates that
\[ \sum_{k=1}^{\infty} \big|M_{T_{k+1}}(x_0) - M_{T_k}(x_0)\big|^2 \le 24\,\nu^2(\mathbb R). \]

1.5

Moving averages

In a similar way, one can develop for moving averages the idea of spectral regularization relative to some suitable class of regularizing kernels. Let $(H, \|\cdot\|)$ be a Hilbert space and consider an arbitrary contraction $U : H \to H$. Let also $\phi : \mathbb R_+ \to \mathbb R_+$ be a nondecreasing differentiable function such that $\phi(\mathbb N) \subset \mathbb N$. Consider, for any positive integer $n$, the sequence of moving averages
\[ B_n^{U,\phi} = B_n^{\phi} = \frac1n\sum_{j=\phi(n)}^{\phi(n)+n-1} U^j. \]
It is important to underline here the considerable differences of structure which appear when passing from the study of fixed averages to that of moving averages. For examples see for instance Section 4.1. We now introduce the associated spectral kernels
\[ W_n(\theta) = e^{i\phi(n)\theta}\,V_n(\theta). \tag{1.5.1} \]
By definition, for any positive integers $n, m$,
\[ |W_m(\theta) - W_n(\theta)| \le |V_m(\theta) - V_n(\theta)| + |e^{i(\phi(m) - \phi(n))\theta} - 1|\,|V_m(\theta)|. \]
The first difference was estimated in Lemma 1.4.2. Concerning the second difference, we have
\[ |e^{i(\phi(m) - \phi(n))\theta} - 1| \le |\theta|(\phi(m) - \phi(n)) \wedge 2 \qquad\text{and}\qquad |V_m(\theta)| \le \frac{\pi}{|\theta| m} \wedge 1. \]
It follows that
\[ |W_m(\theta) - W_n(\theta)|^2 \le 2|V_m(\theta) - V_n(\theta)|^2 + 2\big\{|\theta|^2(\phi(m) - \phi(n))^2 \wedge 2\big\}\Big\{\frac{\pi^2}{|\theta|^2 m^2} \wedge 1\Big\}. \]
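The first-order estimates entering this bound (the triangle-type inequality for $W_m - W_n$ and the two factor bounds) can be checked numerically. This is a minimal sketch, assuming $\phi(n) = n^2$ as a sample integer-valued nondecreasing function and using hypothetical helper names `V`, `W` and `check_moving_kernel`; it does not test the squared inequality itself.

```python
import cmath
import math


def V(n, theta):
    """V_n(theta) = e^{i(n-1)theta/2} sin(n theta/2) / (n sin(theta/2)), with V_n(0) = 1."""
    if theta == 0.0:
        return 1.0 + 0j
    phase = cmath.exp(1j * (n - 1) * theta / 2)
    return phase * math.sin(n * theta / 2) / (n * math.sin(theta / 2))


def W(n, theta, phi):
    """Moving-average kernel W_n(theta) = e^{i phi(n) theta} V_n(theta), as in (1.5.1)."""
    return cmath.exp(1j * phi(n) * theta) * V(n, theta)


def check_moving_kernel(phi, max_n=15, grid=100):
    """Check the three elementary inequalities on a grid of theta in (-pi, pi), theta != 0."""
    for k in range(grid):
        theta = -math.pi + 2 * math.pi * (k + 0.5) / grid
        for n in range(1, max_n + 1):
            for m in range(n, max_n + 1):
                shift = abs(cmath.exp(1j * (phi(m) - phi(n)) * theta) - 1)
                assert shift <= min(abs(theta) * (phi(m) - phi(n)), 2) + 1e-12
                assert abs(V(m, theta)) <= min(math.pi / (abs(theta) * m), 1) + 1e-12
                gap = abs(W(m, theta, phi) - W(n, theta, phi))
                assert gap <= abs(V(m, theta) - V(n, theta)) + shift * abs(V(m, theta)) + 1e-12
    return True
```

For example, `check_moving_kernel(lambda n: n * n)` runs through all assertions for the sample choice $\phi(n) = n^2$.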

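1.5.1 Lemma. Let $m \ge n$ be two positive integers. Then, for any $\theta \in [-\pi, \pi)$,
\[ \int_{1/m}^{1/n} \phi'(1/y)\,Q(\theta, y)\,dy \ge \big\{|\theta|^2(\phi(m) - \phi(n))^2 \wedge 1\big\}\Big\{\frac{1}{|\theta|^2 m^2} \wedge 1\Big\}, \tag{1.5.2} \]
where the regularizing kernel $Q$ is defined in (1.4.1).

Proof. Consider two cases.

1) $|\theta| \ge \frac1m$. Then
\[ \int_{1/m}^{1/n} \phi'(1/y)\Big(\frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2}\Big)dy = \int_n^m \phi'(u)\Big(\frac{1}{|\theta| u^2} \wedge |\theta|\Big)du \ge \int_n^m \phi'(u)\Big(\frac{1}{|\theta| m^2} \wedge |\theta|\Big)du = \frac{\phi(m) - \phi(n)}{|\theta| m^2} = \frac{(\phi(m) - \phi(n))|\theta|}{|\theta|^2 m^2}. \]
By using the elementary inequality $x \ge x^2 \wedge 1$ with $x = (\phi(m) - \phi(n))|\theta|$, we obtain the requested result.

2) $|\theta| \le \frac1m$. Then
\[ \int_{1/m}^{1/n} \phi'(1/y)\,Q(\theta, y)\,dy = \int_{1/m}^{1/n} \phi'(1/y)\,\frac{|\theta|}{y^2}\,dy = |\theta|\int_n^m \phi'(u)\,du = |\theta|(\phi(m) - \phi(n)). \]
By means of the same elementary inequality, we have $|\theta|(\phi(m) - \phi(n)) \ge |\theta|^2(\phi(m) - \phi(n))^2 \wedge 1$, and the lemma is thus completely proved.

We deduce from Lemmas 1.4.3 and 1.5.1 that
\[ |W_m(\theta) - W_n(\theta)|^2 \le \int_{1/m}^{1/n} \big(8\pi + 4\pi^2\phi'(1/y)\big)\,Q(\theta, y)\,dy + 8\cdot 1_{[\frac1m, \frac1n)}(|\theta|). \tag{1.5.3} \]
Let now $f \in H$ with spectral measure $\mu_f$. Introduce a regularization $\hat\mu_f$ of the measure $\mu_f$, relative to the kernel $Q$, by putting
\[ \hat\mu_f(dy) = \Big(\int_{-\pi}^{\pi} \big(8\pi + 4\pi^2\phi'(1/y)\big)\,Q(\theta, y)\,\mu_f(d\theta)\Big)dy + 8\mu_f(dy). \tag{1.5.4} \]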


1.5.2 Theorem (Spectral regularization inequality for moving averages). For any positive integers $m \ge n$,
\[ \|B_m^{\phi}(f) - B_n^{\phi}(f)\|^2 \le \hat\mu_f\big[\tfrac1m, \tfrac1n\big). \]
Proof. By integrating (1.5.3) with respect to the measure $\mu_f$, we get
\[ \int_{-\pi}^{\pi} |W_m - W_n|^2\,\mu_f(d\theta) \le \hat\mu_f\big[\tfrac1m, \tfrac1n\big). \]

The result thus follows by means of the spectral inequality. 1.5.3 Corollary (Square function of moving averages). Let ∞

1 φ (u) (θ ) = du + φ |θ |−1 |θ | 1{|θ |≤1} . 2 |θ| |θ|1 ∨1 u

(1.5.5)

Then, for any f ∈ H with spectral measure μ, and for any nondecreasing sequence of positive integers {np , p ≥ 1}, we have ∞

Bnφp+1 (f ) − Bnφp (f ) 2 ≤ 4π 2

p=1

π −π

(θ )μ(dθ ) + 8(π + 1)μ[−π, π ). (1.5.6)

Proof. By using (1.5.3) and (1.5.4) we get ∞

Bnφp+1 (f ) − Bnφp (f ) 2

p=1

≤

∞ 1 μˆ np+1 , n1p = μˆ 0, n11 p=1

≤

1

dy 0

π

−π

8π + 4π 2 φ (1/y) Q(θ, y)μ(dθ ) + 8μ(0, 1]

≤ (8π + 8)μ[−π, π ) + 4π

π

2

−π

1 0

1 |θ | φ (1/y) ∧ 2 dyμ(dθ ). |θ | y

Further, by making the change of variables y = u−1 , we get 1 |θ |∧1 1 1 φ (1/y) |θ | |θ | φ (1/y) φ (1/y) 2 dy ∧ 2 dy ≤ dy + |θ| y |θ | y 0 0 |θ |∧1 ∞ 1 φ (u) ≤ du + φ(|θ |−1 )|θ | 1{|θ |≤1} |θ| |θ|1 ∨1 u2 = (θ ).


1.5.4 Remarks. 1. For sublinear functions φ such that supu φ (u) ≤ C, as given in (1.5.5) is uniformly bounded; and thus ∞

Bnφp+1 (f ) − Bnφp (f ) 2 ≤ (4π 2 sup (θ ) + 8π + 8)μ[−π, π ) θ

p=1

≤ 8(π C + π + 1) f 2 . 2

Estimate (1.5.6) remains again reasonable for functions φ growing faster than linearly, α but only for f such that (θ )μ(dθ ) < ∞. If, for instance, 1−αφ(u) ≈ u , α ∈ [1, 2), the estimate is efficient under the spectral hypothesis: |θ | μ(dθ ) < ∞. If α ≥ 2, then (θ) is infinite, and estimate (1.5.6) is no longer efficient. 2. Assume that the sequence {np , p ≥ 1} grows very fast, and for instance that the following condition is realized: M = sup

% ∞ $ nl 2

l≥1 p=l

np

< ∞.

(G1)

An equivalent reformulation of (G1) is: ∃q ∈ N, c > 0 : ∀p (1 + c) np ≤ np+q . Consider also a stronger assumption: ∃q ∈ N, c > 0 : ∀p max φ(np ) ; (1 + c) np ≤ np+q . (G2) This assumption is verified if for instance {np , p ≥ 1} is a sequence with a superexponential growth, namely np ∼ exp{Ca p } and φ is polynomial, φ(u) ∼ uα . We introduce the inverse function of the sequence {np , p ≥ 1}, sup{p : np ≤ x} if x ≥ n1 , L(x) = 0 if x < n1 . 1.5.5 Proposition. Assume that the sequence {np , p ≥ 1} satisfies assumption (G1). Then, for any element f ∈ H , ∞

Bnφp+1 (f ) − Bnφp (f ) 2 ≤ 2(6π + 2 + 2π 2 M) f 2 + 4I (μf ),

(1.5.7)

p=1

where

I (μf ) =

"

# L(|θ|−1 ) − L φ −1 (|θ |−1 ) + μf (dθ ).

(1.5.8)

Further, if assumption (G2) is verified, then we also have ∞ p=1

Bnφp+1 (f ) − Bnφp (f ) 2 ≤ 2(6π + π 2 M + q + 1) f 2 .

(1.5.9)


Proof. By means of (1.5.2) we have ∞

Bnφp+1 (f ) − Bnφp (f ) 2 ≤ 2

p=1

+2

∞

|Vnp+1 (θ ) − Vnp (θ )|2 μf (dθ )

p=1

$ ∞

|θ| (φ(np+1 ) − φ(np )) ∧ 2 2

2

p=1

π2 ∧1 |θ |2 n2p+1

%

μf (dθ ).

The first term can be bounded by 12π f 2 . Consider the second term. Fix θ, and put p− = L φ −1 (|θ |−1 ) − 1, p ∗ = L(|θ|−1 ). Then −

p

−

(φ(np+1 ) − φ(np )) ≤ 2

p=1

p

(φ(np+1 ) − φ(np ))φ(np− +1 ) ≤ φ(np− +1 )2 ≤ |θ |−2 .

p=1

Similarly (G1) implies:

∞

1 p=p ∗ n2 p+1

≤

M n2p∗ +1

≤ M|θ |2 . It follows that the second

term inside the integral is bounded above by 1 + 2π 2 M + 2

∞

"

# 1{p−


Volumes 1–5 are available from Walter de Gruyter (www.degruyter.de)


Michel Weber

Dynamical Systems and Processes


Author: Michel Weber Institut de Recherche Mathématique Avancée CNRS et Université de Strasbourg 7, rue René Descartes 67084 Strasbourg Cedex France

2000 Mathematics Subject Classification: 37-02, 60-02. Key words: Dynamical systems, measure-preserving transformation, ergodic theorems, spectral theorems, convergence almost everywhere, central limit theorem, stochastic processes, gaussian processes, metric entropy method, majorizing measure method, randomization methods, Riemann sums

978-3-03719-046-3 The Swiss National Library lists this publication in The Swiss Book, the Swiss national bibliography, and the detailed bibliographic data are available on the Internet at http://www.helveticat.ch. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained.

© 2009 European Mathematical Society Contact address: European Mathematical Society Publishing House Seminar for Applied Mathematics ETH-Zentrum FLI C4 CH-8092 Zürich Switzerland Phone: +41 (0)44 632 34 36 Email: [email protected] Homepage: www.ems-ph.org Typeset using the author’s TEX files: I. Zimmermann, Freiburg Printed in Germany 987654321

Preface

The aim of this book is to present in a concise and accessible way, as well as in a common setting, various tools and methods arising from spectral theory, ergodic theory and probability theory, which contribute interactively to the current research on almost everywhere convergence problems. The recent developments in the study of these questions are often obtained by combining either methods of spectral theory with principles of ergodic theory or methods from probability theory with tools and principles from spectral theory and ergodic theory. The spectral criterion of Gaposhkin, and later, following a remarkable metric entropy inequality of Talagrand, the spectral regularization developed in the setting of the study of square functions and oscillation functions in ergodic theory, are typical examples of this fruitful interaction. Another example of thorough interaction is certainly the work of Bourgain and notably his famous entropy criterion, at the basis of which lies the continuity principle of Stein. It was not our aim to write a complete treatise in ergodic theory, assuming such enterprise to be conceivable. The development of this theory during the last twenty years was indeed considerable. A similar remark can be made for the part concerning the study of the regularity of stochastic processes. The work is also not a synthesis of most significant results, complete with sketched proofs and references. We chose the intermediate route to writing a book in the spirit of lectures oriented towards research. The book provides an easy access to many tools, methods and results used in current research, presenting each of them in as wide a setting as possible. The proofs of these results are often given with full details. This book is divided in four parts, which came more or less naturally while writing it. Part I is devoted to spectral results and is followed by Part II, in which tools and results from ergodic theory are presented. 
In the third part, in connection with the description of two main methods, namely the metric entropy method and the majorizing measure method, recent applications to ergodic theory are given via the study of some maximal inequalities of Gál–Koksma type and the $L^p$ norm, $1 \le p \le \infty$, of important classes of polynomials. Finally, in the last part of the book we recollect classical results, as well as recent advances concerning Riemann sums and Khintchin sums, and the value distribution of divisors of Bernoulli or Rademacher sums, used in the study of Riemann sums. In Part I we begin elementarily with the spectral inequality. Chapter 1 concerns von Neumann’s theorem, which forms with Birkhoff’s ergodic theorem the basis of ergodic theory. It seems natural to include in this chapter Talagrand’s metric entropy estimate for the set $\{A_n^T f, n \ge 1\}$, where $A_n^T = \frac{I + T + \cdots + T^{n-1}}{n}$ is the average operator of a contraction $T$ in a Hilbert space, thus completing naturally the von Neumann theorem. Recently discovered, remarkably efficient, spectral regularization inequalities analysing other structural properties of the set $\{A_n^T f, n \ge 1\}$, followed by Weyl’s


criterion and the van der Corput principle, complete this chapter. Chapter 2 starts with presenting the arguments leading to the representation of a weakly stationary process as Fourier transform of a random measure with orthogonal increments. Next we study Gaposhkin’s spectral criterion. In Part II, we first review in Chapter 3 classical ergodic and mixing properties of measurable dynamical systems. We also study several standard examples. Chapter 4 is devoted to Birkhoff’s pointwise theorem, to dominated ergodic theorems in Lp and to BMO spaces of associated maximal operators. This is continued with a discussion around spectral characterizations of the speed of convergence in Birkhoff’s pointwise theorem. Next we examine oscillation functions of ergodic averages. The transference principle and Wiener–Wintner theorems are discussed. A study of weighted ergodic averages concludes this chapter. In Chapter 5, some basic tools from ergodic theory, the Banach principle, the continuity principle and the conjugacy lemma are studied in detail. Chapter 6 concerns entropy criteria of Bourgain. Several functional inequalities linking the studied sequence of L2 -operators with the canonical Gaussian process on L2 are established, from which the criteria are then easily deduced. Study of the statistic of the ergodic averages naturally leads to investigating the question of the existence of some f ∈ L2 such that the related ergodic averages satisfy a central limit theorem, the invariance principle or the almost sure central limit theorem. Chapter 7 is devoted to this study. A detailed proof of the theorem of Burton–Denker on the existence, in any aperiodic dynamical system, of the central limit theorem is given. The method of proof relies upon Kakutani–Rochlin’s lemma and imitates the analogous result for irrational rotations of the unit circle which is obtained by using Fourier series. 
A fundamental fact in the background of the entire construction is provided by using Rochlin’s result on a factor space of Lebesgue space. The case of irrational rotations involving various remarkably efficient methods is more closely investigated. The existence of L2 elements of the torus satisfying the central limit theorem (CLT) is established for various types of means: nonlinear ergodic means, weighted ergodic means, and ergodic means along the squares. For the latter case, the circle method is used. The chapter concludes with a recent study of a kind of achieved form of the CLT, the convergence in variation implying the convergence of related density distributions in the spaces Lp (R), 1 ≤ p ≤ ∞, in the symptomatic case of lacunary random Fourier series. Two rather general methods are investigated in Part III: the metric entropy method and the majorizing measure method. In Chapter 8, a useful criterion for almost everywhere convergence involving covering numbers is proved, and then used to prove in a unified setting several classical results, such as Stechkin’s theorem, Gál–Koksma theorems and quantitative Borel–Cantelli lemmas. The metric entropy method is next applied to establish quite useful estimates of the supremum of random polynomials, notably random Dirichlet polynomials, and to study almost sure convergence properties of weighted series of contractions and random perturbation of some intersective sets in ergodic theory. Chapter 9 concerns an important tool: the majorizing measure method. A general criterion for almost sure convergence of averages is proved by means of this


method. We continue with recent applications of the majorizing measure method to the study of the supremum of random polynomials, including a strictly stronger form of the well-known Salem–Zygmund estimate. Some remarkable classes of examples are studied. Chapter 10 is a succinct study of Gaussian processes presented in the form of a toolbox. Various fundamental results from the theory are discussed, sometimes with historical comments and proofs. Much importance is given to very handy correlation inequalities. Part IV is devoted to three studies: the study of Riemann sums, the study of convergence properties of the system {f (nk x), k ≥ 1} and a probabilistic approach concerning divisors with applications. Chapters 1 to 6 and partially Chapters 8 to 10 are based on lectures given at the Mathematical Institute of the University of Strasbourg. Chapters 11 to 13 are mainly based on research articles, as well as some parts of Chapters 1, 4, 7, 8, 9. In writing this book, we followed a general principle: where the proofs in our source readings were only sketched, we fill in the gaps in as much detail as possible. Further, we give quasisystematically complete references with page numbers and/or precise numeration of cited results. We always keep in mind the wish to help, as much as we can, the researcher but also the teacher and the graduate student in their work in these beautiful areas of mathematics, trying also to spare their time and to let them share our passion for research at the interfaces of related problems. I would like to thank Mikhail Lifshits for the many discussions and encouragements. I would also like to thank Istvan Berkes for his indefectible enthusiasm and the many exchanges and comments, as well as Ulrich Krengel for stimulating comments. I am much indebted and grateful to Irene Zimmermann for her technical assistance and for numerous observations and remarks. 
I thank Manfred Karbe and the European Mathematical Society Publishing House for accepting this work in their IRMA series, and for efficient help in publishing. I devote this book to my wife Marie-Christine. She always provided a favourable atmosphere for mathematical work.

Contents

Preface

Part I Spectral theorems and convergence in mean

1 The von Neumann theorem and spectral regularization
1.1 Bochner–Herglotz lemma
1.2 The spectral inequality
1.3 The von Neumann theorem
1.4 The spectral regularization inequality
1.5 Moving averages
1.6 Uniform distribution mod a – the Weyl criterion
1.7 The van der Corput principle

2 Spectral representation of weakly stationary processes
2.1 Weakly stationary processes
2.2 Spectral representation of unitary operators
2.3 Elements of stochastic integration
2.4 Spectral representation of weakly stationary processes
2.5 Weakly stationary sequences and orthogonal series
2.6 Gaposhkin’s spectral criterion

Part II Ergodic Theorems

3 Dynamical systems – ergodicity and mixing
3.1 Measurable dynamical systems – topological dynamical systems
3.2 Ergodicity of a dynamical system
3.3 Weak mixing, strong mixing, continuous spectrum
3.4 Spectral mixing theorem
3.5 Other equivalences and other forms of mixing
3.6 Examples

4 Pointwise ergodic theorems
4.1 Birkhoff’s pointwise theorem
4.2 Dominated ergodic theorems
4.3 Classes $L \log^m L$
4.4 A converse
4.5 Speed of convergence
4.6 Oscillation functions of ergodic averages
4.7 Wiener–Wintner theorem
4.8 Weighted ergodic averages
4.9 Subsequence averages

5 Banach principle and continuity principle
5.1 Banach principle
5.2 Continuity principle
5.3 Applications
5.4 A principle of domination – conjugacy lemma

6 Maximal operators and Gaussian processes
6.1 Some liaison theorems
6.2 Two preliminary lemmas
6.3 Proof of Theorem 6.1.1
6.4 Proof of Theorem 6.1.6
6.5 The case $L^p$, $1 < p < 2$
6.6 A remarkable GB set property

7 The central limit theorem for dynamical systems
7.1 Introduction and preliminaries
7.2 A theorem of Burton and Denker
7.3 The central limit theorem for orbits
7.4 A theorem of Volný
7.5 CLT for rotations
7.6 Lacunary series and convergence in variation

Part III Methods arising from the theory of stochastic processes

8 The metric entropy method
8.1 Introduction and general results
8.2 A theorem of Stechkin
8.3 An application to the quantitative Borel–Cantelli lemma
8.4 Application to Gál–Koksma’s theorems
8.5 An application to the supremum of random polynomials
8.6 Application to a.s. convergence of weighted series of contractions
8.7 An application to random perturbation of intersective sets
8.8 An application to the discrepancy of some random sequences
8.9 An application to random Dirichlet polynomials

9 The majorizing measure method
9.1 Introduction – the exponential case
9.2 A general approach
9.3 A useful criterion
9.4 Proof of Theorem 9.3.3
9.5 Proof of Theorems 9.3.10 and 9.3.11
9.6 Proof of Theorem 9.3.12 and some examples
9.7 A stronger form of Salem–Zygmund’s estimate
9.8 Some examples and discussion
9.9 Uniform convergence of random Fourier series

10 Gaussian processes
10.1 Gaussian variables and correlation estimates
10.2 0-1 laws, integrability and comparison lemmas
10.3 Regularity and irregularity of Gaussian processes
10.4 Gaussian suprema
10.5 Oscillations of Gaussian Stein’s elements
10.6 Tightness of Gaussian Stein’s elements

Part IV Three studies

11 Riemann sums
11.1 Introduction
11.2 The results of Jessen and Rudin
11.3 Individual theorems of spectral type
11.4 Breadth and dimension
11.5 Bourgain’s results
11.6 Connection with number theory
11.7 Riemann sums and the randomly sampled trigonometric system
11.8 Almost sure convergence and square functions of Riemann sums

12 A study of the system $(f(nx))$
12.1 Introduction and mean convergence
12.2 Almost sure convergence – sufficient conditions
12.3 Almost sure convergence – necessary conditions
12.4 Random sequences

13 Divisors and random walks
13.1 Introduction
13.2 Value distribution and small divisors of Bernoulli sums
13.3 An LIL for arithmetic functions
13.4 On the order of magnitude of the divisor functions
13.5 Value distribution of the divisors of $n^2 + 1$
13.6 Value distribution of the divisors of Rademacher sums
13.7 The functional equation and the Lindelöf Hypothesis
13.8 An extremal divisor case

Bibliography

Index

Part I Spectral theorems and convergence in mean

Chapter 1

The von Neumann theorem and spectral regularization

Von Neumann’s theorem is, together with Birkhoff’s theorem, one of the fundamental results in ergodic theory. A remarkable spectral regularization inequality is established, from which Talagrand’s entropy estimate is deduced, as well as sharp bounds for the Littlewood–Paley square functions. Other averages, like moving averages, are considered. Some useful lemmas, the Bochner–Herglotz lemma, the spectral lemma and the spectral inequality are first established and completed by some other, sometimes less known results. Two important tools are included at the end of the chapter: Weyl’s equidistribution theorem and the van der Corput principle.

1.1 Bochner–Herglotz lemma

The lemmas studied in this section, as well as in the next one, are classical tools of spectral analysis. The spectral inequality, which is easily derived from the Bochner–Herglotz lemma, allows us to reduce many problems of evaluating the norm of vectors to much more tractable questions of harmonic analysis. This tool is often used in ergodic theory. We thus begin by establishing the Bochner–Herglotz lemma.

A function $\gamma : \mathbb{R} \to \mathbb{R}$ is nonnegative definite if for any positive integer $n$, and any $u_1, \ldots, u_n \in \mathbb{R}$, $a_1, \ldots, a_n \in \mathbb{C}$, we have
\[ \sum_{1 \le i,j \le n} a_i \bar{a}_j \, \gamma(u_i - u_j) \ge 0. \]

For a continuous function $\gamma : \mathbb{R} \to \mathbb{R}$, an equivalent definition of nonnegative definiteness is that for any measurable bounded function $\xi(x)$ vanishing outside some finite interval,
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \gamma(t - s)\,\xi(t)\,\xi(s)\, dt\, ds \ge 0. \]

A sequence of complex numbers $\{a_k, k \in \mathbb{Z}\}$ is nonnegative definite if $a_{-k} = \bar{a}_k$ and if the inequality
\[ \sum_{1 \le i,j \le n} \rho_i \bar{\rho}_j \, a_{i-j} \ge 0 \]
holds for any finite system of complex numbers $\rho_1, \ldots, \rho_n$. A function $\gamma : \mathbb{Z} \to \mathbb{R}$ is thus nonnegative definite if the sequence $\{\gamma(k), k \in \mathbb{Z}\}$ is nonnegative definite. These notions immediately extend to functions defined on $\mathbb{R}^d$ or $\mathbb{Z}^d$. Let $\mathbb{T} = \mathbb{R}/\mathbb{Z} = [0, 1[$ be the circle equipped with the normalized Lebesgue measure $\lambda$, and let $\mathbb{T}^d$ denote the $d$-dimensional torus equipped with the measure $\lambda_d$.

1.1.1 Lemma. a) Let $\gamma : \mathbb{R}^d \to \mathbb{R}$ be continuous, nonnegative definite. Then there exists a nonnegative bounded measure $\mu$ on $\mathbb{R}^d$, such that for any $x \in \mathbb{R}^d$,
\[ \gamma(x) = \int_{\mathbb{R}^d} e^{i\langle t, x\rangle} \, \mu(dt). \]

b) Let $\gamma : \mathbb{Z}^d \to \mathbb{R}$ be nonnegative definite. Then there exists a nonnegative bounded measure $\mu$ on $\mathbb{T}^d$, such that for any $k \in \mathbb{Z}^d$,
\[ \gamma(k) = \int_{\mathbb{T}^d} e^{2i\pi \langle k, t\rangle} \, \mu(dt). \]

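Proof. We give the proof for $d = 1$, the multidimensional case being obtained in a quite identical way. Let $Z$ denote some positive integer. Consider a) first. Put
\[ I_k = \int_{[0,Z[^k} \sum_{i,j=1}^{k} e^{-i(u_i - u_j)x} \gamma(u_i - u_j) \, du_1 \ldots du_k. \]
By assumption $I_k \ge 0$. Moreover,
\[ \begin{aligned} I_k &= k\gamma(0) \int_{[0,Z[^k} du_1 \ldots du_k + \sum_{\substack{i,j=1 \\ i \ne j}}^{k} \int_{[0,Z[^{k-2}} \frac{du_1 \ldots du_k}{du_i\, du_j} \int_0^Z \!\! \int_0^Z e^{-i(u_i - u_j)x} \gamma(u_i - u_j) \, du_i\, du_j \\ &= k\gamma(0) Z^k + k(k-1) Z^{k-2} \int_0^Z \!\! \int_0^Z e^{-i(u - v)x} \gamma(u - v) \, du\, dv. \end{aligned} \]
Dividing by $k(k-1)Z^{k-2}$ and then letting $k$ tend to infinity implies
\[ \int_0^Z \!\! \int_0^Z e^{-i(u - v)x} \gamma(u - v) \, du\, dv \ge 0. \]
Making the change of variables $u - v = t$ gives
\[ \begin{aligned} \int_0^Z \Big( \int_{-v}^{Z-v} e^{-itx} \gamma(t) \, dt \Big) dv &= \int_{-Z}^{Z} e^{-itx} \gamma(t) \, \{\min(Z, Z - t) - \sup(0, -t)\} \, dt \\ &= \int_{-Z}^{Z} e^{-itx} \gamma(t) \, (Z - |t|) \, dt \ge 0. \end{aligned} \]
Let
\[ \gamma_Z(x) = \gamma(x) \, (1 - |x|/Z) \, \mathbf{1}_{[-Z,Z]}(x), \qquad \hat{\gamma}_Z(x) = \int_{\mathbb{R}} e^{-itx} \gamma_Z(t) \, dt. \]
Then $\hat{\gamma}_Z(x) \ge 0$, and evidently $\gamma_Z \in L^\infty(\mathbb{R})$. We show that $\hat{\gamma}_Z \in L^1(\mathbb{R})$. Integrating $\hat{\gamma}_Z(x)$ over $\mathbb{R}$ with respect to the density $\frac{1}{\sigma\sqrt{2\pi}} e^{-x^2/(2\sigma^2)}$ yields
\[ \int_{\mathbb{R}} \hat{\gamma}_Z(x) \, e^{-\frac{x^2}{2\sigma^2}} \frac{dx}{\sigma\sqrt{2\pi}} = \int_{\mathbb{R}} \gamma_Z(t) \Big( \int_{\mathbb{R}} e^{-itx - \frac{x^2}{2\sigma^2}} \frac{dx}{\sigma\sqrt{2\pi}} \Big) dt = \int_{\mathbb{R}} \gamma_Z(t) \, e^{-\sigma^2 t^2/2} \, dt. \]
Hence, since $\|\gamma_Z\|_\infty \le \gamma(0)$,
\[ \begin{aligned} \int_{\mathbb{R}} \hat{\gamma}_Z(x) \, e^{-\frac{x^2}{2\sigma^2}} \, dx &= \sigma\sqrt{2\pi} \int_{\mathbb{R}} \gamma_Z(t) \, e^{-\sigma^2 t^2/2} \, dt \\ &\le \sigma\sqrt{2\pi} \, \gamma(0) \int_{\mathbb{R}} e^{-\sigma^2 t^2/2} \, dt = 2\pi \gamma(0). \end{aligned} \]
But $\hat{\gamma}_Z(x) \ge 0$. Letting $\sigma$ tend to infinity increasingly finally shows, in view of Fatou’s lemma, that $\hat{\gamma}_Z \in L^1(\mathbb{R})$.

Now we need the Fourier inversion theorem: let $h, \hat{h} \in L^1(\mathbb{R}^d)$. Then for almost all $x$,
\[ h(x) = \int_{\mathbb{R}^d} e^{i\langle t, x\rangle} \hat{h}(t) \, dt \]
(up to the normalizing constant $(2\pi)^{-d}$ attached to the convention $\hat{h}(x) = \int e^{-i\langle t, x\rangle} h(t)\, dt$, which can be absorbed into the measures introduced below). Thus $\hat{\gamma}_Z \in L^1(\mathbb{R})$ and for almost all $x$, $\gamma_Z(x) = \int_{\mathbb{R}} e^{itx} \hat{\gamma}_Z(t) \, dt$. As $\gamma_Z$ and the mapping $x \mapsto \int_{\mathbb{R}} e^{itx} \hat{\gamma}_Z(t) \, dt$ are continuous, the above equality holds in turn everywhere. Hence
\[ \gamma(0) = \gamma_Z(0) = \int_{\mathbb{R}} \hat{\gamma}_Z(t) \, dt. \]
Denote by $\mu_Z$ the measure on $\mathbb{R}$ having density $\hat{\gamma}_Z(t)$. Since $\gamma_Z(x) \to \gamma(x)$ everywhere as $Z$ tends to infinity, we get
\[ \lim_{Z \to \infty} \hat{\mu}_Z(x) = \gamma(x). \]
By assumption $\gamma$ is continuous. It follows from the corollary on p. 481 in [Feller: 1966, II] that there exists a nonnegative bounded measure $\mu$ on $\mathbb{R}$ such that $\gamma(x) = \hat{\mu}(x)$.

We pass to the proof of b). By assumption $\sum_{n,m=1}^{Z} e^{-2i\pi(n-m)x} \gamma(n - m) \ge 0$. This sum can also be written as
\[ \begin{aligned} \sum_{n,m=1}^{Z} e^{-2i\pi(n-m)x} \gamma(n - m) &= \sum_{n=1}^{Z} \sum_{p=n-Z}^{n-1} e^{-2i\pi xp} \gamma(p) = \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp} \gamma(p) \sum_{\substack{p+1 \le n \le p+Z \\ 1 \le n \le Z}} 1 \\ &= \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp} \gamma(p) \, \{\min(p + Z, Z) - \max(1, p + 1) + 1\} \\ &= \sum_{p=-Z+1}^{Z-1} e^{-2i\pi xp} \gamma(p) \, (Z - |p|). \end{aligned} \]
Put $\gamma_Z(p) = \gamma(p) \, \mathbf{1}_{\{-Z+1, \ldots, Z-1\}}(p) \, (1 - |p|/Z)$ and $g_Z(x) = \sum_{p \in \mathbb{Z}} e^{-2i\pi xp} \gamma_Z(p)$. Then $\hat{\gamma}_Z(-x) = g_Z(x) \ge 0$, and since $\gamma_Z$ has compact support, $g_Z$ is bounded continuous. Further
\[ \int_{\mathbb{T}} g_Z(x) \, e^{2i\pi xr} \, dx = \sum_{p \in \mathbb{Z}} \gamma_Z(p) \int_{\mathbb{T}} e^{2i\pi x(r - p)} \, dx = \gamma_Z(r). \]
In particular $\gamma_Z(0) = \gamma(0) = \int_{\mathbb{T}} g_Z(x) \, dx$, thereby implying that the nonnegative measures $\nu_Z$ on $(\mathbb{T}, \mathcal{B}(\mathbb{T}))$ with density $g_Z(x)$ are relatively compact for the weak convergence topology $\mathcal{D}$ on $\mathbb{T}$. Hence, there exists a subsequence $J$ and a bounded nonnegative measure $\nu$ on $\mathbb{T}$ such that
\[ \lim_{J \ni Z \to \infty} \nu_Z \overset{\mathcal{D}}{=} \nu, \]
and $\lim_{J \ni Z \to \infty} \gamma_Z(r) = \int_{\mathbb{T}} e^{2i\pi xr} \, \nu(dx)$. Since $\lim_{Z \to \infty} \gamma_Z(r) = \gamma(r)$, we get for any $r \in \mathbb{Z}$,
\[ \gamma(r) = \int_{\mathbb{T}} e^{2i\pi xr} \, \nu(dx). \]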
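The counting step in part b) of the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the test sequence $\gamma(k) = \rho^{|k|}$ with $\rho = 1/2$, the value $Z = 6$ and the sample points are illustrative assumptions. It compares the double sum over $1 \le n, m \le Z$ with the weighted single sum carrying the weights $Z - |p|$, and confirms that the common value is nonnegative at the sampled points.

```python
import cmath

def double_sum(gamma, Z, x):
    """Sum over 1 <= n, m <= Z of e^{-2 i pi (n - m) x} * gamma(n - m)."""
    return sum(cmath.exp(-2j * cmath.pi * (n - m) * x) * gamma(n - m)
               for n in range(1, Z + 1) for m in range(1, Z + 1))

def weighted_sum(gamma, Z, x):
    """Sum over |p| < Z of (Z - |p|) * e^{-2 i pi p x} * gamma(p)."""
    return sum((Z - abs(p)) * cmath.exp(-2j * cmath.pi * p * x) * gamma(p)
               for p in range(-Z + 1, Z))

def gamma(k):
    # Assumed test sequence (not from the text): gamma(k) = (1/2)^{|k|},
    # a standard example of a nonnegative definite sequence.
    return 0.5 ** abs(k)

Z = 6
points = [j / 17 for j in range(17)]  # arbitrary sample points in [0, 1[
max_gap = max(abs(double_sum(gamma, Z, x) - weighted_sum(gamma, Z, x)) for x in points)
min_value = min(weighted_sum(gamma, Z, x).real for x in points)
print(max_gap, min_value)
```

Any other nonnegative definite test sequence could be substituted for `gamma`; the identity between the two sums holds for arbitrary sequences, while the sign of the result relies on nonnegative definiteness.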

Schoenberg’s theorem. Schoenberg [1938] found a beautiful complement to Bochner’s theorem, which is worth being formulated here. Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous, nonnegative definite. Assume that $f(0) = 1$. Schoenberg’s theorem translates, via Bochner’s theorem, to the equivalence of the following two assertions:

(a) For all $d \ge 1$, there is a probability measure $\mu_d$ on $\mathbb{R}^d$ such that for every $x \in \mathbb{R}^d$,
\[ f(\|x\|_d) = \int_{\mathbb{R}^d} e^{i\langle x, y\rangle} \, \mu_d(dy). \]
Here $\|x\|_d$ is the Euclidean norm on $\mathbb{R}^d$.

(b) There exists a Borel probability $\nu$ on $\mathbb{R}_+$ such that for any positive real $t$,
\[ f(t) = \int_0^\infty e^{-st^2/2} \, \nu(ds). \]

There is a proof of Schoenberg’s theorem via the law of large numbers in Khoshnevisan [2005], to which we may also refer as a source.

1.1.2 Remarks. 1. Nonnegative definite sequences are characterized by the previous lemma. According to it, a sequence $\{\gamma_n, n \in \mathbb{Z}\}$ is nonnegative definite if and only if there exists a weakly stationary sequence $\{X_n, n \ge 1\}$ in a Hilbert space $H$ such that for any positive integers $h$ and $k$, $\langle X_h, X_k\rangle = \gamma_{h-k}$. This point can also be established by means of a direct vector representation in $H$, see Ky Fan [1946: Paragraph 2 and Appendix]. Nonnegative definite sequences are closely related to nonnegative trigonometric polynomials.

2. A trigonometric polynomial $\sum_{k=-p}^{p} z_k e^{ik\theta}$ with $z_{-k} = \bar{z}_k$ and taking only nonnegative values is said to be nonnegative. In view of a classical result of Fejér and F. Riesz (Fejér [1915]), there exist $p + 1$ complex numbers $\rho_0, \rho_1, \ldots, \rho_p$ such that
\[ \sum_{k=-p}^{p} z_k e^{ik\theta} = \big| \rho_0 + \rho_1 e^{i\theta} + \cdots + \rho_p e^{ip\theta} \big|^2. \]

3. We also quote a theorem due to Szász [1918] (see Ky Fan [1946: Paragraph 3]). A sequence $\{a_n, n \in \mathbb{Z}\}$ is nonnegative definite if and only if
\[ \sum_{k=-p}^{p} a_k z_k \ge 0 \]
holds for any nonnegative trigonometric polynomial $\sum_{k=-p}^{p} z_k e^{ik\theta}$ of arbitrary order $p$. This characterization is to be compared with the one of Hausdorff [1923]: the sequence $\{a_n, n \in \mathbb{Z}\}$ is nonnegative definite if and only if
\[ \sum_{h=1}^{p} \sum_{k=1}^{p} a_{h-k} e^{i(h-k)\theta} \ge 0 \]
is satisfied for any positive integer $p$ and any real $\theta$.

Below we list some standard examples of nonnegative definite sequences and weakly stationary sequences.

1.1.3 Examples. (1) Given a weakly stationary sequence $\{X_n, n \ge 1\}$ in $H$, it is readily seen that, for any real value of $\vartheta$, the sequence $\{e^{-in\vartheta} X_n, n \ge 1\}$ is weakly stationary too. Anticipating a bit von Neumann’s theorem, for any value of $\vartheta$ the limit
\[ M(\vartheta) = \lim_{n \to \infty} \frac{e^{-i\vartheta} X_1 + e^{-i2\vartheta} X_2 + \cdots + e^{-in\vartheta} X_n}{n} \]
also exists. Further (see before Remarks 1.3.4), if $\vartheta_1 \not\equiv \vartheta_2 \pmod{2\pi}$, then $M(\vartheta_1)$ and $M(\vartheta_2)$ are orthogonal elements in $H$. Moreover, the set of values of $\vartheta$ for which $M(\vartheta)$ differs from the null element of $H$ is at most countably infinite (see Ky Fan [1946: Paragraph 6]).

(2) Let $\Phi : \mathbb{R} \to \mathbb{R}_+$ be even, convex and nonincreasing. Then the sequence $\{\Phi(n), n \in \mathbb{Z}\}$ is nonnegative definite. This follows from a classical theorem due to Pólya.
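Remark 2 and Example (2) can be illustrated together on one concrete case. The sketch below is an illustration under stated assumptions rather than anything taken from the text: it uses the triangle function $t \mapsto \max(0, 1 - |t|/N)$, read as convex and nonincreasing on the half-line $t \ge 0$, with the arbitrary choice $N = 5$. Its values at the integers give the coefficients $z_k = 1 - |k|/N$, and the associated trigonometric polynomial coincides with the squared modulus of a polynomial whose coefficients $\rho_j$ are all equal to $1/\sqrt{N}$, which is a Fejér–Riesz representation in the sense of Remark 2.

```python
import cmath
import math

def triangle(k, N):
    """Assumed example: values at the integers of t -> max(0, 1 - |t|/N)."""
    return max(0.0, 1.0 - abs(k) / N)

def trig_poly(theta, N):
    """sum over |k| < N of z_k e^{i k theta}, with z_k = triangle(k, N)."""
    return sum(triangle(k, N) * cmath.exp(1j * k * theta) for k in range(-N + 1, N))

def fejer_riesz_square(theta, N):
    """|rho_0 + rho_1 e^{i theta} + ... + rho_{N-1} e^{i (N-1) theta}|^2, all rho_j = 1/sqrt(N)."""
    rho = 1.0 / math.sqrt(N)
    return abs(sum(rho * cmath.exp(1j * j * theta) for j in range(N))) ** 2

N = 5
thetas = [2 * math.pi * j / 29 for j in range(29)]  # arbitrary sample angles
gap = max(abs(trig_poly(t, N) - fejer_riesz_square(t, N)) for t in thetas)
smallest = min(trig_poly(t, N).real for t in thetas)
print(gap, smallest)
```

The equality of the two expressions shows directly that the polynomial is nonnegative, in line with the Hausdorff-type criterion of Remark 3 applied to the sequence of triangle values.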

(3) Let $S$ be the space of correlated sequences introduced by Wiener [1933: Chapter 4], namely the space of sequences $a = \{a(n), n \in \mathbb{Z}\}$ with $a_{-n} = \bar{a}_n$, such that for any $k \ge 0$ the limit
\[ \gamma_a(k) = \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} \overline{a(j)} \, a(j + k) \]
exists. Observe that for any integers $r, s$ with $0 \le r \le s$,
\[ \gamma_a(s - r) = \lim_{n \to \infty} \frac{1}{n} \sum_{h=0}^{n-1} \overline{a(h + r)} \, a(h + s). \]
From this follows that the sequence $\{\gamma_a(k), k \ge 0\}$ is nonnegative definite. Indeed,
\[ \begin{aligned} \sum_{k,l=1}^{m} c_k \bar{c}_l \, \gamma_a(k - l) &= \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} \sum_{k,l=1}^{m} c_k \bar{c}_l \, \overline{a(j + l)} \, a(j + k) \\ &= \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} \Big| \sum_{k=1}^{m} c_k \, a(j + k) \Big|^2 \ge 0. \end{aligned} \]
In view of the Bochner–Herglotz theorem, there exists a uniquely determined nonnegative bounded measure $\Lambda_a$ on $[-\pi, \pi[$, called the spectral measure of the sequence $a$. Consider the family of measures
\[ \Lambda_{J,a}(d\alpha) = \frac{1}{J} \Big| \sum_{0 \le j < J} e^{-ij\alpha} a(j) \Big|^2 d\alpha. \]
A theorem due to Coquet, Kamae and Mendès France [1977: Theorem 1] shows that the family of measures $\Lambda_{J,a}$ converges weakly to $\Lambda_a$. To establish this property, it suffices to show that the sequence of Fourier transforms $\hat{\Lambda}_{J,a}$ converges pointwise to $\hat{\Lambda}_a$, which is easily checked.
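The correlations of Example (3) are easy to approximate on a computer. The following minimal sketch rests on assumptions that are not in the text: the real test sequence $a(j) = \cos(j\beta)$ with $\beta = 1$, the truncation level $n = 20000$ and the coefficients $c_k$ are arbitrary illustrative choices. For this sequence the limit correlation is $\gamma_a(k) = \tfrac{1}{2}\cos(k\beta)$, and the quadratic form built on $\gamma_a$ is nonnegative, as the computation in the example predicts.

```python
import math

BETA = 1.0  # assumed frequency of the test sequence

def a(j):
    # Assumed real test sequence (not from the text): a(j) = cos(j * BETA).
    return math.cos(j * BETA)

def correlation(k, n):
    """Finite average (1/n) * sum_{j < n} a(j) * a(j + k); no conjugate is needed for a real sequence."""
    return sum(a(j) * a(j + k) for j in range(n)) / n

n = 20000
gammas = [correlation(k, n) for k in range(4)]
errors = [abs(gammas[k] - 0.5 * math.cos(k * BETA)) for k in range(4)]

# Quadratic form sum_{k,l} c_k c_l gamma_a(|k - l|) for arbitrary real coefficients;
# for a real sequence gamma_a(-k) = gamma_a(k).
c = [1.0, -2.0, 0.5, 3.0]
quad_form = sum(c[k] * c[l] * gammas[abs(k - l)] for k in range(4) for l in range(4))
print(max(errors), quad_form)
```

Since the test sequence is real, the outcome does not depend on where the complex conjugate is placed in the definition of $\gamma_a$.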

1.2 The spectral inequality

Bochner–Herglotz’s lemma has a very useful consequence, which we now state.

1.2.1 Lemma. Let $T$ be a contraction in a Hilbert space $H$. For any $n \in \mathbb{Z}$, let $T_n = T^n$ if $n \ge 0$ and $T_n = T^{*|n|}$ if $n < 0$. Let $x \in H$. The sequence $\{\langle T_n x, x\rangle, n \in \mathbb{Z}\}$ is nonnegative definite, and there exists a uniquely determined nonnegative bounded measure $\mu_x$ on $\mathbb{T}$, the spectral measure of $T$ at $x$, verifying
\[ \langle T_n x, x\rangle = \int_{\mathbb{T}} \exp(2i\pi nt) \, \mu_x(dt) \qquad (\forall n \in \mathbb{Z}). \]

Proof. The second assertion follows from Lemma 1.1.1. The first assertion is simple when $T$ is an isometry:
\[ \sum_{l,m=-n}^{n} z_l \bar{z}_m \langle T_{l-m} x, x\rangle = \sum_{l} \sum_{m} z_l \bar{z}_m \langle T_{l+n} x, T_{m+n} x\rangle = \Big\| \sum_{l} z_l T_{l+n} x \Big\|^2 \ge 0. \]
For the general case, we put for any $0 < r < 1$ and $t \in \mathbb{T}$,
\[ U(r, t) = \sum_{k \ge 0} r^k e^{2i\pi kt} T^k, \qquad V(r, t) = \sum_{k \in \mathbb{Z}} r^{|k|} e^{2i\pi kt} T_k = -I + U(r, t) + U(r, t)^*. \]
If $y = U(r, t)x$, we have $y - x = r e^{2i\pi t} T y$. Thus $\|y - x\| \le \|y\|$, and this shows
\[ \langle V(r, t)x, x\rangle = -\langle x, x\rangle + \langle y, x\rangle + \langle x, y\rangle = \langle y, y\rangle - \langle y - x, y - x\rangle \ge 0. \]
For any complex numbers $\{z_l, |l| \le n\}$, we have
\[ \begin{aligned} \sum_{l,m=-n}^{n} r^{|l-m|} z_l \bar{z}_m \langle T_{l-m} x, x\rangle &= \sum_{l,m=-n}^{n} \sum_{k} z_l \bar{z}_m \langle T_k x, x\rangle \, r^{|k|} \int_{\mathbb{T}} e^{2i\pi(k - (l-m))t} \, dt \\ &= \int_{\mathbb{T}} \sum_{l,m=-n}^{n} z_l \bar{z}_m e^{-2i\pi(l-m)t} \langle V(r, t)x, x\rangle \, dt \\ &= \int_{\mathbb{T}} \Big| \sum_{l} z_l e^{-2i\pi lt} \Big|^2 \langle V(r, t)x, x\rangle \, dt \ge 0. \end{aligned} \]
Letting $r$ tend to 1 gives the required inequality.

We shall now deduce from the spectral lemma an extremely useful tool.

1.2.2 Proposition. Let $T$ be a contraction in a Hilbert space $H$, and let $p(x)$ be a polynomial. Then, for any $x \in H$,
\[ \|p(T)x\|^2 \le \int_{\mathbb{T}} \big| p(e^{2i\pi t}) \big|^2 \, \mu_x(dt), \]
where the measure $\mu_x$ is the same as in Lemma 1.2.1.

Proof. We follow an argument due to Wierdl. The inequality is obviously satisfied if the order of $p$ is equal to 0. Assume now that the inequality is true for any polynomial of order $k - 1$. Let $p(y) = a_0 + \cdots + a_k y^k$, and consider the auxiliary polynomials
\[ q(y) = a_1 y + \cdots + a_k y^k, \qquad u(y) = \frac{q(y)}{y} = a_1 + \cdots + a_k y^{k-1}. \]
We have
\[ |p(y)|^2 = |a_0 + q(y)|^2 = |a_0|^2 + a_0 \bar{q}(y) + q(y) \bar{a}_0 + |q(y)|^2, \]
and
\[ \|p(T)x\|^2 = \|(a_0 I + q(T))x\|^2 = |a_0|^2 \|x\|^2 + \langle a_0 x, q(T)x\rangle + \langle q(T)x, a_0 x\rangle + \|q(T)x\|^2. \]
By using the induction hypothesis, we have
\[ \|u(T)x\|^2 \le \int_{\mathbb{T}} \big| u(e^{2i\pi t}) \big|^2 \, \mu_x(dt). \]
Since $T$ is a contraction, $\|q(T)x\| = \|T u(T)x\| \le \|u(T)x\|$. And as $|u(e^{2i\pi t})| = |q(e^{2i\pi t})|$, we get
\[ \|q(T)x\|^2 \le \int_{\mathbb{T}} \big| q(e^{2i\pi t}) \big|^2 \, \mu_x(dt). \]
Besides,
\[ \langle a_0 x, q(T)x\rangle = \int_{\mathbb{T}} a_0 \bar{q}(e^{2i\pi t}) \, \mu_x(dt) \quad \text{and} \quad \langle q(T)x, a_0 x\rangle = \int_{\mathbb{T}} q(e^{2i\pi t}) \bar{a}_0 \, \mu_x(dt). \]
By putting together these various estimates, and using $\|x\|^2 = \mu_x(\mathbb{T})$, we obtain
\[ \|p(T)x\|^2 \le \int_{\mathbb{T}} \Big( |a_0|^2 + a_0 \bar{q}(e^{2i\pi t}) + q(e^{2i\pi t}) \bar{a}_0 + |q(e^{2i\pi t})|^2 \Big) \mu_x(dt) = \int_{\mathbb{T}} \big| p(e^{2i\pi t}) \big|^2 \, \mu_x(dt), \]
and this establishes the spectral inequality for all polynomials of order $k$, and thereby for any polynomial.
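The spectral inequality can be tested on the simplest non-isometric contraction. The sketch below is an illustration under explicit assumptions, not the author's construction: $H = \mathbb{C}$, $T$ is multiplication by a real number $c \in (0, 1)$, and $x = 1$. Then $\langle T_n x, x\rangle = c^{|n|}$, so the spectral measure has the Poisson kernel $(1 - c^2)/(1 - 2c\cos 2\pi t + c^2)$ as density on $[0, 1[$. The code checks one Fourier coefficient of this density and then compares $|p(c)|^2$ with the integral of $|p(e^{2i\pi t})|^2$ against it, for an arbitrarily chosen polynomial.

```python
import cmath
import math

def poisson_density(c, t):
    """Density of the spectral measure of T = c * Id at x = 1 (assumed scalar example)."""
    return (1 - c * c) / (1 - 2 * c * math.cos(2 * math.pi * t) + c * c)

def integrate_on_circle(f, grid=4096):
    """Riemann sum over a uniform grid of [0, 1[; very accurate for smooth periodic integrands."""
    return sum(f(j / grid) for j in range(grid)) / grid

def poly(coeffs, z):
    """Evaluate a_0 + a_1 z + ... + a_k z^k."""
    return sum(a * z ** k for k, a in enumerate(coeffs))

c = 0.6                           # assumed contraction constant
coeffs = [1.0, -2.0, 0.5 + 1.0j]  # arbitrary polynomial coefficients a_0, a_1, a_2

# Fourier coefficient of order 3 of the density; it should reproduce <T^3 x, x> = c^3.
moment_3 = integrate_on_circle(
    lambda t: cmath.exp(2j * math.pi * 3 * t) * poisson_density(c, t))

lhs = abs(poly(coeffs, c)) ** 2   # ||p(T) x||^2 in this scalar model
rhs = integrate_on_circle(
    lambda t: abs(poly(coeffs, cmath.exp(2j * math.pi * t))) ** 2 * poisson_density(c, t))
print(abs(moment_3 - c ** 3), lhs, rhs)
```

In this scalar model the inequality is strict for non-constant polynomials, which shows that equality in Proposition 1.2.2 should not be expected for general contractions.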

1.3 The von Neumann theorem Let T be a contraction in a Hilbert space H and introduce the operators 1 k T , n n−1

An = ATn =

n = 1, 2, . . . .

(1.3.1)

k=0

The fundamental result of von Neumann [1931] can be stated as follows. 1.3.1 Theorem. The limit limn→∞ An f = f¯ exists for any f ∈ H , and the map PT : f → f¯ is the orthogonal projection of H onto the subspace of invariant vectors HT = {g ∈ H : T g = g}. Further H = H0 ⊕ HT , where H0 = {g − T g, g ∈ H }. Proof. (1) The proof is based on the following lemma.

1.3 The von Neumann theorem

11

1.3.2 Lemma. Let $T$ be a contraction in a Hilbert space $H$. Then the adjoint operator (Section 2.2.6) $T^*$ has the same fixed points as $T$.

Proof. $T^*$ is also a contraction and if $Tf = f$, then $\langle f, Tf\rangle = \langle Tf, f\rangle = \|f\|^2$. Conversely $\langle f, Tf\rangle = \langle Tf, f\rangle = \|f\|^2$ implies $\langle f, T^*f\rangle = \|f\|^2$ and
\[ \|Tf - f\|^2 = \langle Tf - f, Tf - f\rangle = \|Tf\|^2 + \|f\|^2 - \langle f, Tf\rangle - \langle Tf, f\rangle = \|Tf\|^2 - \langle f, T^*f\rangle \le 0. \]
Thus $Tf = f$ and so $Tf = f \Leftrightarrow \langle Tf, f\rangle = \langle T^*f, f\rangle = \|f\|^2$. Therefore
\[ Tf = f \iff \langle Tf, f\rangle = \langle T^*f, f\rangle = \langle f, f\rangle \iff T^*f = f. \]

(2) We show that $H = H_0 \oplus H_T$. According to (1), for any $f \in H_T$,
\[ \langle f, g - Tg\rangle = \langle f, g\rangle - \langle f, Tg\rangle = \langle f, g\rangle - \langle T^*f, g\rangle = 0. \]
Hence $H_T \subset H_0^\perp$. Besides, if $f$ is orthogonal to $H_0$, then $0 = \langle f, g - Tg\rangle = \langle f - T^*f, g\rangle$ for any $g$ in $H$. Thus $T^*f = f$, and thereby $Tf = f$. This implies that $H_0^\perp = H_T$.

(3) It is plain that the theorem is satisfied for any vector of the type $f + g - Tg$, $f \in H_T$ and $g \in H$. Indeed
\[ \frac1n\sum_{k=0}^{n-1} T^k(f + g - Tg) = \frac1n\sum_{k=0}^{n-1} T^k f + \frac1n\sum_{k=0}^{n-1}\big(T^k g - T^{k+1} g\big) = f + \frac1n\big(g - T^n g\big) \to f, \]
as $n$ tends to infinity, and $f$ is the orthogonal projection on $H_T$ of $f + g - Tg$.

(4) According to (2), these vectors are dense in $H$. The operators $A_n$ are contractions as well. It follows that the set of vectors for which the theorem is true is closed in $H$. Let indeed
\[ A = \Big\{x \in H \text{ such that if } y = \mathrm{proj}_{H_0}(x) \text{ then } \lim_{n\to\infty} A_n(y) = 0\Big\}. \]

We show that $A$ is closed. Let $\{x_n, n \ge 1\} \subset A$, $x_n \to x$. Then $y_n \to y$, and
\[ \|A_N(y)\| \le \|A_N(y - y_p)\| + \|A_N(y_p)\| \le \|A_N(y_p)\| + \|y - y_p\|. \]
Let $\varepsilon > 0$ and let $p$ be a fixed integer such that $\|y - y_p\| < \varepsilon/2$. Let $N(\varepsilon)$ be such that for any $N \ge N(\varepsilon)$, $\|A_N(y_p)\| \le \varepsilon/2$. We obtain that $\|A_N(y)\| \le \varepsilon$. Thus $A$ is closed in $H$ and the theorem is established.

Let $\{X_n, n \ge 0\}$ be a weakly stationary sequence in a Hilbert space $H$. According to Theorem 2.1.3, there exists a unitary operator $U$ on $H$ such that $X_n = U^n X_0$. By von Neumann's theorem, we get that the limit (X) := lim

n→∞

X0 + · · · + Xn−1 n


exists in H . It can be directly observed that the inner product (X), Xh is independent of h. Indeed, by using the weak stationarity

Xk+1 + · · · + Xk+n Xh+1 + · · · + Xh+n , Xh = lim , Xk (X), Xh = lim n→∞ n→∞ n n = (X), Xk . And consequently

(X), Xh = (X),

X1 + · · · + Xn , n

which gives as n tends to infinity: (X), Xh = (X) 2 . As observed in Examples 1.1.3, for any real value of ϑ, the sequence {e−inϑ Xn , n ≥ 1} is weakly stationary too. The limit e−iϑ X1 + e−i2ϑ X2 + · · · + e−inϑ Xn n→∞ n thus also exists, for any value of ϑ. Then (X, ϑ) = lim

e−ihϑ Xh , (X, ϑ) = (X, ϑ) 2 , independently of h. Hence −iϑ1 e X1 + e−i2ϑ1 X2 + · · · + e−inϑ1 Xn

, (X, ϑ2 ) n ei(ϑ2 −ϑ1 ) + ei2(ϑ2 −ϑ1 ) + · · · + ein(ϑ2 −ϑ1 ) = . (X, ϑ2 ) 2 . n Therefore, if ϑ1 = ϑ2 (mod2π ), the last equation becomes, as n tends to infinity, (X, ϑ1 ), (X, ϑ2 ) = 0, as claimed in Examples 1.1.3. Weakly stationary sequences, however, enjoy other remarkable properties; among them is certainly the following identity which does not seem to be so known. An identity of Ky Fan. For any two positive integers n, m,

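\[ \frac{\|X_1 + \dots + X_n\|^2}{n} + \frac{\|X_1 + \dots + X_m\|^2}{m} - \frac{\|X_1 + \dots + X_{n+m}\|^2}{n+m} = \frac{n(n+m)}{m}\,\Big\|\frac{X_1 + \dots + X_n}{n} - \frac{X_1 + \dots + X_{n+m}}{n+m}\Big\|^2. \]
This nice identity was observed and applied in Ky Fan [1946: 598]. The proof goes as follows. Put for any positive integer $n$, $S_n = X_1 + \dots + X_n$, and if $m$ is another positive integer let $T_{n,m} = S_{n+m} - S_n$, so that $S_{n+m} = S_n + T_{n,m}$. Then
\[ \Big\|\frac{S_n}{n} - \frac{S_{n+m}}{n+m}\Big\|^2 = \frac{\|S_n\|^2}{n^2} + \frac{\|S_{n+m}\|^2}{(n+m)^2} - \Big\langle \frac{S_n}{n}, \frac{S_{n+m}}{n+m}\Big\rangle - \Big\langle \frac{S_{n+m}}{n+m}, \frac{S_n}{n}\Big\rangle, \]
and so
\begin{align*}
\frac{n(n+m)}{m}\Big\|\frac{S_n}{n} - \frac{S_{n+m}}{n+m}\Big\|^2
&= \frac{(n+m)\|S_n\|^2}{nm} + \frac{n\|S_{n+m}\|^2}{m(n+m)} - \frac1m\Big[\langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle\Big] \\
&= \frac{\|S_n\|^2}{n} + \frac{\|S_n\|^2}{m} + \frac{\|S_m\|^2}{m} - \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m} + \frac{\|S_{n+m}\|^2}{m} - \frac1m\Big[\langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle\Big].
\end{align*}
But $\langle S_n, S_{n+m}\rangle + \langle S_{n+m}, S_n\rangle = 2\|S_n\|^2 + \langle S_n, T_{n,m}\rangle + \langle T_{n,m}, S_n\rangle$, so that in turn
\begin{align*}
\frac{n(n+m)}{m}\Big\|\frac{S_n}{n} - \frac{S_{n+m}}{n+m}\Big\|^2
&= \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m} + \frac1m\Big[\|S_{n+m}\|^2 - \|S_n\|^2 - \|S_m\|^2 - \langle S_n, T_{n,m}\rangle - \langle T_{n,m}, S_n\rangle\Big] \\
&= \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m} - \frac{\|S_{n+m}\|^2}{n+m},
\end{align*}
since $S_{n+m} = S_n + T_{n,m}$. And we are done. Note that the weak stationarity assumption was only used in the last line of calculations, to say that $\|T_{n,m}\| = \|S_m\|$.

A simple although quite interesting consequence of Ky Fan's identity is
\[ \frac{\|S_{n+m}\|^2}{n+m} \le \frac{\|S_n\|^2}{n} + \frac{\|S_m\|^2}{m}, \tag{1.3.2} \]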

which is valid for any two positive integers n, m. This is inequality (4.8) in Ky Fan [1946]. We say that a sequence {gn , n ≥ 1} of real numbers is subadditive if it satisfies gn+m ≤ gn + gm .

(1.3.3)
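Ky Fan's identity and inequality (1.3.2) can be checked numerically on a simple model. The sketch below assumes a weakly stationary sequence of the form $X_k = U^k x_0$ in $\mathbb C^3$ with $U$ a diagonal unitary matrix; the phases, the vector `x0` and the function names are hypothetical choices made only for this illustration.

```python
import cmath

def stationary_partial_sums(phases, x0, count):
    """Return [S_1, ..., S_count] for X_k = U^k x0, with U = diag(exp(i*phase))."""
    sums, running = [], [0j] * len(x0)
    for k in range(1, count + 1):
        xk = [cmath.exp(1j * k * p) * c for p, c in zip(phases, x0)]
        running = [r + v for r, v in zip(running, xk)]
        sums.append(list(running))
    return sums

def norm_sq(v):
    """Squared Hilbert norm of a complex vector."""
    return sum(abs(c) ** 2 for c in v)

def ky_fan_sides(sums, n, m):
    """Return both sides of Ky Fan's identity for the pair (n, m)."""
    s_n, s_m, s_nm = sums[n - 1], sums[m - 1], sums[n + m - 1]
    lhs = norm_sq(s_n) / n + norm_sq(s_m) / m - norm_sq(s_nm) / (n + m)
    diff = [a / n - b / (n + m) for a, b in zip(s_n, s_nm)]
    rhs = n * (n + m) / m * norm_sq(diff)
    return lhs, rhs

# Hypothetical data: three frequencies and an arbitrary starting vector.
phases = [0.3, 1.1, 2.5]
x0 = [1 + 0j, 0.5 - 0.2j, -0.7j]
sums = stationary_partial_sums(phases, x0, 30)
pairs = [(1, 1), (2, 5), (7, 3), (10, 20)]
checks = [ky_fan_sides(sums, n, m) for n, m in pairs]
```

Each pair in `checks` should agree up to rounding, and the common value is nonnegative, which is exactly (1.3.2).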

Then we have the following well-known lemma.

1.3.3 Lemma. If $\{g_n, n \ge 1\}$ is a subadditive sequence of real numbers, then $g_n/n$ converges to $\inf_{n\ge1}(g_n/n)$.

Proof. Fix an arbitrary positive integer $N$ and write $n = j_n N + r_n$ with $1 \le r_n \le N$. Clearly $j_n/n \to 1/N$ as $n$ tends to infinity. Further
\[ \inf_{n\ge1}\frac{g_n}{n} \le \frac{g_n}{n} \le \frac{g_{j_n N} + g_{r_n}}{n} \le \frac{g_{j_n N}}{j_n N} + \frac{g_{r_n}}{n} \le \frac{j_n g_N}{j_n N} + \frac{g_{r_n}}{n} = \frac{g_N}{N} + \frac{g_{r_n}}{n}. \]
Letting now $n$ tend to infinity gives
\[ \inf_{n\ge1}\frac{g_n}{n} \le \liminf_{n\to\infty}\frac{g_n}{n} \le \limsup_{n\to\infty}\frac{g_n}{n} \le \frac{g_N}{N}. \]
As $N$ was arbitrary, the lemma is proved.

We thus deduce from (1.3.2) and from the lemma applied to $g_n := \|S_n\|^2/n$ that
\[ \lim_{n\to\infty}\Big\|\frac{S_n}{n}\Big\| = \inf_{n\ge1}\Big\|\frac{S_n}{n}\Big\|. \tag{1.3.4} \]

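This is a remarkable consequence of Ky Fan's identity, which remains true for averages of contractions by von Neumann's theorem (proceed by approximation in view of the decomposition $H = H_0 \oplus H_T$).

We continue with another interesting consequence concerning the ratios
\[ \frac{\Big\|\dfrac{S_{n_k}}{n_k} - \dfrac{S_{n_{k+1}}}{n_{k+1}}\Big\|^2}{\dfrac{1}{n_k} - \dfrac{1}{n_{k+1}}}, \]
where $\mathcal N = \{n_k, k \ge 1\}$ is a given increasing sequence of positive integers. Notice that in the orthonormal case, namely if $X_1, X_2, \dots$ is an orthonormal sequence, then $\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2 = \frac{1}{n_k} - \frac{1}{n_{k+1}}$ precisely. We have the following properties:

a)
\[ \limsup_{N\to\infty} \frac{1}{n_N}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} \le \limsup_{N\to\infty}\Big[\sup_{1\le k < N} \frac{\|S_{n_{k+1}-n_k}\|^2}{(n_{k+1}-n_k)^2} - \frac{\|S_{n_N}\|^2}{n_N^2}\Big]. \]

b) Further if $\lim_{k\to\infty}(n_{k+1} - n_k) = \infty$, then
\[ \lim_{N\to\infty} \frac{1}{n_N}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = 0. \]

c) Moreover,
\[ \lim_{N,a\to\infty} \frac{1}{Na}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a}\big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} = 0. \]

d) Finally, let $\mathcal D = \{D_j, j \ge 1\}$ be a chain: $D_j \mid D_{j+1}$ for every $j$. Then
\[ \sum_{j=1}^{\infty} \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j}-1} \frac{\big\|\frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j}\big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\|\frac{S_{D_1}}{D_1}\Big\|^2 - \lim_{J\to\infty}\Big\|\frac{S_{D_{J+1}}}{D_{J+1}}\Big\|^2 < \infty. \]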

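It follows from c) that in at most linearly growing sequences, averages of weakly stationary sequences asymptotically exhibit, in density, increments comparable to averages of orthogonal sequences, which is a bit unexpected. From Ky Fan's identity, we indeed get for each $k$,
\[ \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = \frac{\|S_{n_k}\|^2}{n_k} - \frac{\|S_{n_{k+1}}\|^2}{n_{k+1}} + \frac{\|S_{n_{k+1}-n_k}\|^2}{n_{k+1}-n_k}. \tag{1.3.5} \]
Summing from $k = 1$ up to $N - 1$ leads to
\[ \sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = \frac{\|S_{n_1}\|^2}{n_1} - \frac{\|S_{n_N}\|^2}{n_N} + \sum_{k=1}^{N-1} (n_{k+1}-n_k)\Big\|\frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k}\Big\|^2. \]
Dividing both sides by $n_N$ gives
\begin{align*}
\frac{1}{n_N}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}}
&= \frac{\|S_{n_1}\|^2}{n_1 n_N} - \frac{\|S_{n_N}\|^2}{n_N^2} + \frac{1}{n_N}\sum_{k=1}^{N-1} (n_{k+1}-n_k)\Big\|\frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k}\Big\|^2 \tag{1.3.5a} \\
&\le \frac{\|S_{n_1}\|^2}{n_1 n_N} + \frac{1}{n_N}\sum_{k=1}^{N-1} (n_{k+1}-n_k)\Big\|\frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k}\Big\|^2 - \frac{\|S_{n_N}\|^2}{n_N^2}.
\end{align*}
Letting next $N$ tend to infinity yields
\begin{align*}
\limsup_{N\to\infty} \frac{1}{n_N}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}}
&\le \limsup_{N\to\infty}\Big[\frac{1}{n_N}\sum_{k=1}^{N-1} (n_{k+1}-n_k)\Big\|\frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k}\Big\|^2 - \frac{\|S_{n_N}\|^2}{n_N^2}\Big] \\
&\le \limsup_{N\to\infty}\Big[\sup_{1\le k < N} \frac{\|S_{n_{k+1}-n_k}\|^2}{(n_{k+1}-n_k)^2} - \frac{\|S_{n_N}\|^2}{n_N^2}\Big].
\end{align*}
So a) is proved.

Now if $\lim_{k\to\infty}(n_{k+1} - n_k) = \infty$, suppose first that $\lim_{n\to\infty}\big\|\frac{S_n}{n}\big\| = 0$. Then $\lim_{k\to\infty}\big\|\frac{S_{n_{k+1}-n_k}}{n_{k+1}-n_k}\big\| = 0$, and so letting $N$ tend to infinity in the first equality in (1.3.5a) gives
\[ \lim_{N\to\infty} \frac{1}{n_N}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{n_k}}{n_k} - \frac{S_{n_{k+1}}}{n_{k+1}}\big\|^2}{\frac{1}{n_k} - \frac{1}{n_{k+1}}} = 0. \]
Hence b) is proved in that case. If $\lim_{n\to\infty}\big\|\frac{S_n}{n}\big\| > 0$, there exists $\chi \in H$ such that $\lim_{n\to\infty}\big\|\frac{S_n}{n} - \chi\big\| = 0$. It suffices to apply the result obtained to the weakly stationary sequence $\{X_i - \chi, i \ge 1\}$ to reach the same conclusion in this case as well.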


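Now assume $n_k = ak$, $a$ being some fixed positive integer. Replace $n_k$ by its value in the first part of (1.3.5a):
\begin{align*}
\frac{1}{Na}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a}\big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}}
&= \frac{\|S_a\|^2}{a^2 N} - \frac{\|S_{aN}\|^2}{a^2N^2} + \frac1N\sum_{k=1}^{N-1}\frac{\|S_a\|^2}{a^2} \tag{1.3.5b} \\
&= \frac{\|S_a\|^2}{a^2} - \frac{\|S_{aN}\|^2}{a^2N^2}.
\end{align*}
Hence
\[ \limsup_{N,a\to\infty} \frac{1}{Na}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a}\big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} \le \limsup_{N,a\to\infty}\Big[\frac{\|S_a\|^2}{a^2} - \frac{\|S_{Na}\|^2}{(Na)^2}\Big] = 0, \]
which is c).

Let $\{D_j, j \ge 1\}$ be a chain, and apply equality (1.3.5b),
\[ \frac{1}{Na}\sum_{k=1}^{N-1} \frac{\big\|\frac{S_{ka}}{ka} - \frac{S_{(k+1)a}}{(k+1)a}\big\|^2}{\frac{1}{ka} - \frac{1}{(k+1)a}} = \frac{\|S_a\|^2}{a^2} - \frac{\|S_{aN}\|^2}{a^2N^2}, \]
with $N = D_{j+1}/D_j$, $a = D_j$. We obtain
\[ \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j}-1} \frac{\big\|\frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j}\big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\|\frac{S_{D_j}}{D_j}\Big\|^2 - \Big\|\frac{S_{D_{j+1}}}{D_{j+1}}\Big\|^2. \]
Summing up from $j = 1$ to $j = J$ gives
\[ \sum_{j=1}^{J} \frac{1}{D_{j+1}} \sum_{k=1}^{\frac{D_{j+1}}{D_j}-1} \frac{\big\|\frac{S_{kD_j}}{kD_j} - \frac{S_{(k+1)D_j}}{(k+1)D_j}\big\|^2}{\frac{1}{kD_j} - \frac{1}{(k+1)D_j}} = \Big\|\frac{S_{D_1}}{D_1}\Big\|^2 - \Big\|\frac{S_{D_{J+1}}}{D_{J+1}}\Big\|^2. \]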

Letting $J$ tend to infinity gives d).

Stronger forms of b) exist in some cases. If $T$ on $L_1$ of a $\sigma$-finite measure space is a Dunford–Schwartz contraction (see subsection “extensions” in Section 4.2), write $S_n^T = \sum_{l=1}^{n} T^l$ and $A_n^T = S_n^T/n$. The following is an easy consequence of (1.3.5) and Remark 9.3.9.2. Let $f \in \sqrt{I - T}\,L_2$. Then for any increasing sequence $\mathcal N = \{n_k, k \ge 1\}$ of positive integers, we have
\[ \lim_{K\to\infty} \frac1K \sum_{k=1}^{K} \frac{\Big\|\dfrac{S^T_{n_k} f}{n_k} - \dfrac{S^T_{n_{k+1}} f}{n_{k+1}}\Big\|^2}{\dfrac{1}{n_k} - \dfrac{1}{n_{k+1}}} = 0. \]


1.3.4 Remark. Let $f \in H$. From the spectral inequality follows that
\[ \|A_n f - A_m f\|^2 \le \int_{-\pi}^{\pi} |V_n(\theta) - V_m(\theta)|^2\,\mu_f(d\theta), \]
where $\mu_f$ is the spectral measure of $f$ relative to $T$, and $V_\ell(\theta) = \frac1\ell\sum_{k=0}^{\ell-1} e^{ik\theta}$, $\ell = 1, 2, \dots$. It is easily seen that $\lim_{n,m\to\infty} |V_n(\theta) - V_m(\theta)| = 0$, for any $\theta$. As moreover $|V_\ell(\theta)| \le 1$, we deduce from the dominated convergence theorem that $\lim_{n,m\to\infty}\int_{-\pi}^{\pi} |V_n(\theta) - V_m(\theta)|^2\,\mu_f(d\theta) = 0$; hence the sequence $\{A_n f, n \ge 1\}$ is a Cauchy sequence, thus converging in $H$. This is another convenient way to recover the convergence part in von Neumann's theorem.

Weighted averages. The same argument allows us to prove the following more general result. Let $w = \{w_k, k \ge 0\}$ be a sequence of nonnegative reals with partial sums $W_n = \sum_{k=0}^{n-1} w_k \ne 0$, for each $n$. Assume that $\sum_{k=1}^{\infty} w_k = \infty$. Consider the weighted averages
\[ B_n = B_n^T := \frac{1}{W_n}\sum_{k=0}^{n-1} w_k T^k, \qquad n = 1, 2, \dots. \]
If for each real number $\theta$, $W_\ell(\theta) := \frac{1}{W_\ell}\sum_{k=0}^{\ell-1} w_k e^{2i\pi k\theta}$ converges, then the sequence $\{B_n f, n \ge 1\}$ converges in $H$, for any $f \in H$ and any contraction $T$ in $H$. This condition is in fact necessary and sufficient. Let indeed $\vartheta$ be some irrational number in $(\mathbb T, \lambda)$ and consider on $H = L^2(\lambda)$ the operator $T$ defined by $Tf(\cdot) = f(\cdot + \vartheta)$, the rotation of angle $\vartheta$. Choose $f(t) = e^{2i\pi t}$. As $B_n f(t) = e^{2i\pi t}\,\frac{1}{W_n}\sum_{k=0}^{n-1} w_k e^{2i\pi k\vartheta}$, the converse assertion thus follows immediately. With a little more effort, we can in fact prove the following.

1.3.5 Lemma. The following are equivalent:
(i) For every contraction $T$ on a Hilbert space $H$ and any $f \in H$ the sequence $\{B_n^T f, n \ge 1\}$ converges in norm.
(ii) For every contraction $T$ on a Hilbert space $H$ and any $f \in H$ the sequence $\{B_n^T f, n \ge 1\}$ converges weakly in $H$.
(iii) For every real $\theta$, the sequence $\{W_n(\theta), n \ge 1\}$ converges.

Assertion (i) is fulfilled if
\[ \lim_{n\to\infty} \frac{1}{W_n}\Big(w_n + \sum_{k=1}^{n-1} |w_k - w_{k+1}|\Big) = 0. \tag{1.3.6} \]
If $w$ is nondecreasing, (1.3.6) becomes $\lim_n w_n/W_n = 0$. If the sequence is nonincreasing, then (1.3.6) is always satisfied.

Recall that we denoted $P_T f = \lim_{n\to\infty} \frac1n\sum_{k=0}^{n-1} T^k f$ in Theorem 1.3.1.


1.3.6 Corollary. Let $w$ be as before. Then $\{B_n^T f, n \ge 1\}$ converges to $P_T f$ in norm for every Hilbert space contraction $T$ and $f \in H$, if and only if
\[ \lim_{n\to\infty} \frac{1}{W_n}\sum_{k=1}^{n} w_k z^k = 0 \quad\text{for every } z \in \mathbb T,\ z \ne 1. \tag{1.3.7} \]
Condition (1.3.7) does not imply condition (1.3.6) (see Lin–Weber [2007]).

1.3.7 Corollary. Let $w$ satisfy (1.3.7). Then
\[ \lim_{n\to\infty} \frac{\sum_{k=1}^{n} w_k^2}{\big(\sum_{k=1}^{n} w_k\big)^2} = 0. \tag{1.3.8} \]
Indeed let $\{e_k, k \in \mathbb Z\}$ be the standard orthonormal basis of $\ell^2$, with $T$ the isometric shift defined by $Te_j = e_{j+1}$. Then $f = e_1$ satisfies $P_T f = 0$. Since (1.3.7) is satisfied, the orthonormality and the previous corollary yield
\[ \frac{1}{W_n^2}\sum_{k=1}^{n} w_k^2 = \Big\|\frac{1}{W_n}\sum_{k=1}^{n} w_k e_{k+1}\Big\|^2 = \Big\|\frac{1}{W_n}\sum_{k=1}^{n} w_k T^k e_1\Big\|^2 \to 0. \]

Let us also briefly discuss the case of subsequence ergodic averages. Let
\[ w_k = \begin{cases} 1 & \text{if } k = n_\ell \text{ for some } \ell \ge 1, \\ 0 & \text{otherwise,} \end{cases} \]
where $1 \le n_1 < n_2 < \cdots$ is an increasing sequence of integers, which we denote $\mathcal N$. The subsequence averages
\[ C_N f = C_N^{\mathcal N} f := \frac1N\sum_{k=1}^{N} T^{n_k} f, \qquad N = 1, 2, \dots, \]
converge in the mean for any $f \in H$, and any contraction $T$ in $H$, if and only if, for any real $\vartheta$, the sequence $\big\{\frac1N\sum_{k=1}^{N} e^{2i\pi n_k\vartheta}, N \ge 1\big\}$ converges.

Mean good sequences. We say that a sequence $\{n_j, j \ge 1\}$ of integers is mean good if, given any contraction $T$ in an arbitrary Hilbert space $H$, for any $x \in H$ the sequence of weighted averages
\[ \frac1N\sum_{j=1}^{N} T^{n_j} x, \qquad N = 1, 2, \dots, \]
converges in $H$. More generally, for $1 \le p < \infty$ we say that a sequence $\{n_j, j \ge 1\}$ of integers is universally $p$-mean good when, given any measurable dynamical system $(X, \mathcal A, \mu, \tau)$ (see Section 3.1) and any $f \in L^p(\mu)$, the weighted averages
\[ \frac1N\sum_{j=1}^{N} T^{n_j} f, \qquad N = 1, 2, \dots, \]
converge in $L^p(\mu)$. Here we have set $Tf = f \circ \tau$.

By what precedes, a sequence $\{n_j, j \ge 1\}$ of integers is mean good if and only if, for any real $\vartheta$, the sequence $\big\{\frac1N\sum_{k=1}^{N} e^{2i\pi n_k\vartheta}, N \ge 1\big\}$ converges. See also Remark 3.4.2, where it is proved that if the sequence $\{n_j, j \ge 1\}$ is a good sequence (Section 1.6), then it is mean good relatively to the class of weakly mixing dynamical systems.

Finally the moving averages
\[ B_{n,k} f = \frac1n\sum_{j=k}^{k+n-1} T^j f \]

converge in $H$ to $\bar f$, for any $f \in H$, as $(n, k)$ tends to $(\infty, \infty)$, where the limit $\bar f$ is the same as in Theorem 1.3.1.

Speed of convergence. It can be easily observed that no speed of convergence can be exhibited in general. Indeed, by the uniform boundedness principle (Theorem 2.2.8), the existence of a sequence $a_n \to \infty$ such that $\limsup_{n\to\infty} a_n\|A_n f - Pf\| < \infty$ for all $f$ is equivalent to $a_n\|A_n - P\| \to 0$. This cannot be realized in general. Let $\theta$ be irrational and choose $Tf(x) = f(x + \theta)$, $f \in L^1(\mathbb T)$. Let $0 < \varepsilon < 1$ and fix some positive integer $N$. Since $\{j\theta, j \ge 1\}$ is dense in $\mathbb T$, by using Lemma 1.4.2 (iii) one can select $j = j(N)$ such that
\[ \max_{n\le N} |V_n(j\theta) - 1| \le \varepsilon. \]
Put $f = e_j$. Then $Pf = \langle f, 1\rangle = 0$ and $\|A_n f\| = |V_n(j\theta)|$. Thus $\big|\|A_n f\| - 1\big| \le \varepsilon$, for $n \le N$. This implies that
\[ 1 - \varepsilon \le \min_{n\le N} \|A_n - P\| \le 1. \]
Since $\varepsilon$ can be arbitrarily small, we therefore have that $\|A_n - P\| = 1$ for every $n$, thus providing a contradiction.

One can also use the shift-model to produce counterexamples. Let $\mu$ be a probability measure on $(\mathbb R, \mathcal B(\mathbb R))$ such that $\int_{\mathbb R} x\,\mu(dx) = 0$ and $\int_{\mathbb R} x^2\,\mu(dx) = 1$, and consider $(\mathbb R^{\mathbb Z}, \mathcal B(\mathbb R^{\mathbb Z}), \mathbb P)$ with $\mathbb P = \mu^{\mathbb Z}$. Consider the shift $T$ on $\mathbb R^{\mathbb Z}$ defined for $x \in \mathbb R^{\mathbb Z}$, $x = \{x_k, k \in \mathbb Z\}$, by $Tx = \{x_{k-1}, k \in \mathbb Z\}$. Let $\xi = \{\xi_k, k \in \mathbb Z\}$ be i.i.d. random variables with image law $\mu$. Let also $a = \{a_k, k \in \mathbb Z\} \in \ell^2(\mathbb Z)$ and put
\[ f(\xi) = \sum_{k\in\mathbb Z} \xi_k a_k. \]
Then $f \in L^2(\mathbb P)$ and
\[ f(T\xi) = \sum_{k\in\mathbb Z} \xi_{k-1} a_k = \sum_{k\in\mathbb Z} \xi_k a_{k+1}, \]
so that
\[ A_n^T f(\xi) = \frac1n\sum_{\ell=0}^{n-1} f(T^\ell\xi) = \sum_{k\in\mathbb Z} \xi_k\,\frac{a_k + a_{k+1} + \dots + a_{k+n-1}}{n}. \]


Suppose first, one has a speed of convergence, namely ATn f (ξ ) = O(εn ) where εn ↓ 0. Let β = {βk , k ∈ Z} be such that β 2 = 1 and β2 Cεn . :22 ≥2n

Choose a such that ak = βL 2−L if 22L ≤ |k| < 22(L+1) , L = 0, 1, . . . . Then

ATn f 2

ak + ak+1 + . . . + ak+n−1 2 = n k∈Z

L:22L ≥2n

k∈Z 22L ≤k

≥

≥

βL2

ak + ak+1 + . . . + ak+n−1 n

2(L+1) 2 − 22L − n − 1

22L

L:22L ≥2n

≥C

2

βL2

L:22L ≥2n

Cεn . This gives a contradiction. Suppose now that P-a.s.

An f (ξ ) = O(εn ). Choose μ to be Gaussian. The basic properties of Gaussian processes (Section 10.2) imply |An (f )| E sup < ∞. εn n≥1 But this is contradictory again since obviously |An (f )| |An (f )|

An (f ) 2 ≥ sup E ≥ C sup 1. ε ε εn n n n≥1 n≥1 n≥1

E sup

The last inequality follows from the very construction of a. Extensions. There are naturally numerous extensions or variants of von Neumann’s theorem, which are not possible to present from an exhaustive point of view. For instance: • Let {xn , n ≥ 1} be a sequence of elements of a real or complex Hilbert space H , and denote sn = x1 + · · · + xn . Assume that the following two conditions are realized: (a) ∃c > 0 such that (b)

sn+m − sn 2 − sm 2 < cm,

sn+1 2 − sn 2 n→∞ n lim

exists.

Ky Fan [1945: Theorem II] showed that sn /n strongly converges in H .


• There exist several extensions in Banach spaces obtained by Bruck [1979], De la Torre [1976], Landers–Rogge [1978], Pham [1993], Yoshimoto [1976].

We shall quote the Mean Ergodic theorem due to the contributions of Eberlein, Riesz, Yoshida and Kakutani. Let $X$ be a Banach space and let $I$ be the identity operator on $X$. An operator $T : X \to X$ is Cesàro bounded if, with $A_n^T = \frac1n\sum_{k=0}^{n-1} T^k$,
\[ \sup_n \|A_n^T\| < \infty. \]
If $T$ is Cesàro bounded, the identity
\[ n^{-1}T^{n-1} = A_n^T - \frac{n-1}{n}A_{n-1}^T \]
shows that the condition $\lim_{n\to\infty} n^{-1}T^{n-1}x = 0$ is necessary for the sequence $\{A_n^T x, n \ge 1\}$ to converge.

1.3.8 Theorem (Mean ergodic theorem). Let $T$ be a Cesàro bounded linear operator in a Banach space $X$. For any $x$ satisfying $\lim_{n\to\infty} n^{-1}T^{n-1}x = 0$, and any $y \in X$, the following conditions are equivalent:
(i) $Ty = y$ and $y \in \overline{\mathrm{co}}\{x, Tx, T^2x, \dots\}$,
(ii) $y = \lim_{n\to\infty} A_n^T x$,
(iii) $y = \text{w-}\lim_{n\to\infty} A_n^T x$,
(iv) $y$ is a weak cluster point of the sequence $\{A_n^T x, n \ge 1\}$.

We denote by $\overline{\mathrm{co}}(\bullet)$ the closed convex hull of $\bullet$, and write w-lim to denote the limit in the sense of the weak topology $\sigma(X, X^*)$. From the mean ergodic theorem thus follows: If $T^n/n \to 0$ and $\sup_n \|A_n^T\| < \infty$, then
\[ \{x \in X : A_N^T x \text{ converges}\} = \{y \in X : Ty = y\} \oplus \overline{(I - T)X}. \]
An operator $T : X \to X$ is power-bounded if
\[ \sup_N \|T^N\| < \infty. \]
As a special case of the mean ergodic theorem we have

1.3.9 Theorem. Let $T : X \to X$ be a power-bounded linear operator in a reflexive Banach space $X$. For any $x \in X$, the sequence of averages $\{A_n^T x, n \ge 1\}$ converges in $X$ to a $T$-invariant limit.

We refer to Krengel [1985: 72–73]. This applies in particular to the spaces $L^p$ ($1 < p < \infty$).


A linear operator $T : X \to X$ in a Banach space $X$ is called mean ergodic if for any $x \in X$, the sequence of averages $\{A_n^T x, n \ge 1\}$ converges in $X$; by the above theorem, power-bounded linear operators in reflexive Banach spaces have this property. Let $E(T)x$ denote the limit of the sequence $\{A_n^T x, n \ge 1\}$. Let $w = \{w_k, k \ge 0\}$ be a sequence of nonnegative reals with partial sums $W_n = \sum_{k=0}^{n-1} w_k \ne 0$, for each $n$. Assume that $\sum_{k=1}^{\infty} w_k = \infty$. Condition (1.3.6) is equivalent to the fact that for every power-bounded mean ergodic $T$ on a Banach space $X$ and every $x \in X$, we have
\[ \lim_{n\to\infty} \Big\|\frac{1}{W_n}\sum_{k=1}^{n} w_k T^k x - E(T)x\Big\| = 0. \]

Orthogonality and weak orthogonality. Let $(H, \|\cdot\|)$ be a real or complex Hilbert space, and let $\{f_n, n \ge 1\}$ be orthonormal vectors (thereby weakly stationary) in the inner product space $H$. As $\big\|\sum_{u\le j\le v} c_j f_j\big\|^2 = \sum_{u\le j\le v} |c_j|^2$, the series $\sum c_n f_n$ converges in $H$, for any sequence $\{c_n, n \ge 1\}$ such that $\sum c_n^2 < \infty$. The fact that
\[ \frac1n\sum_{i=1}^{n} f_i \to 0 \quad\text{in } H \tag{1.3.9} \]
can be for instance deduced from Ky Fan's result (see the section Extensions, p. 20). This in fact remains true for weakly orthogonal systems. First recall Bessel's inequality: If $\{e_i, 1 \le i \le n\}$ are orthonormal vectors in the inner product space $H$, then
\[ \sum_{i=1}^{n} |\langle x, e_i\rangle|^2 \le \|x\|^2 \quad\text{for any } x \in H. \]
Boas [1941] and independently Bellman [1944] proved the following generalization of Bessel's inequality: If $x, y_1, \dots, y_n$ are elements of an inner product space $(H, \langle\cdot,\cdot\rangle)$, then
\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2\Big[\max_{1\le i\le n}\|y_i\|^2 + \Big(\sum_{1\le i\ne j\le n} |\langle y_i, y_j\rangle|^2\Big)^{1/2}\Big]. \]
Fink, Mitrinović and Pečarić [1993] extended Boas–Bellman's inequality as follows: If $x, y_1, \dots, y_n$ are elements of $(H, \langle\cdot,\cdot\rangle)$, and $c_1, \dots, c_n$ are complex numbers, then
\[ \Big|\sum_{i=1}^{n} c_i\langle x, y_i\rangle\Big|^2 \le \|x\|^2\sum_{i=1}^{n} |c_i|^2\cdot\Big[\max_{1\le i\le n}\|y_i\|^2 + \Big(\sum_{1\le i\ne j\le n} |\langle y_i, y_j\rangle|^2\Big)^{1/2}\Big]. \]
Other extensions are established in [Dragomir: 2003].

In relation with Bessel's inequality is the notion of a quasi-orthogonal system introduced by Bellman (see [Kac–Salem–Zygmund: 1948]). Let $f = \{f_n, n \ge 1\}$ be a sequence in $H$. Then $f$ is called a quasi-orthogonal system if the quadratic form on $\ell^2$ defined by $\{x_h, h \ge 1\} \mapsto \big\|\sum_h x_h f_h\big\|^2$ is bounded.


A necessary and sufficient condition for $f$ to be quasi-orthogonal is that the series $\sum c_n f_n$ converges in $H$, for any sequence $\{c_n, n \ge 1\}$ such that $\sum c_n^2 < \infty$. And as noticed before, a suitable use of Kronecker's lemma implies that (1.3.9) holds in turn for quasi-orthogonal systems. Kac, Salem and Zygmund observed that every theorem on orthogonal systems whose proof depends only on Bessel's inequality holds for quasi-orthogonal systems. In particular for $H = L^2(X, \mathcal A, \mu)$, $(X, \mathcal A, \mu)$ a probability space, Rademacher–Menchov's theorem, asserting the almost everywhere convergence of the series $\sum c_n f_n$ provided that $\sum c_n^2\log^2 n < \infty$, applies. This is readily seen from the fact that $f$ is quasi-orthogonal if and only if there exists a constant $L$ depending on $f$ only, such that
\[ \Big\|\sum_{i\le n} x_i f_i\Big\| \le L\Big(\sum_{i\le n} |x_i|^2\Big)^{1/2}. \]
And this indeed suffices for the proof of Rademacher–Menchov's theorem (see Remark 8.3.5), since we have the increment property
\[ \Big\|\sum_{n\le i\le m} c_i f_i\Big\|^2 \le L^2\sum_{n\le i\le m} |c_i|^2. \]
Consequently, if $f$ is quasi-orthogonal, then
\[ \frac1n\sum_{i=1}^{n} f_i \xrightarrow{\ \text{a.e.}\ } 0. \tag{1.3.10} \]
There is a useful sufficient condition for quasi-orthogonality (Lemma 7.4.3 in [Weber: 1998a]): In order for $f$ to be quasi-orthogonal, it is sufficient that
\[ \sup_{j\ge1}\sum_{k\ne j} |\langle f_j, f_k\rangle| < \infty. \tag{1.3.11} \]
Indeed, from the relation
\[ \langle x_j f_j, x_k f_k\rangle + \langle x_k f_k, x_j f_j\rangle = x_j\bar x_k\langle f_j, f_k\rangle + x_k\bar x_j\langle f_k, f_j\rangle, \]
it is plain that
\[ |\langle x_j f_j, x_k f_k\rangle| + |\langle x_k f_k, x_j f_j\rangle| \le 2|\langle f_j, f_k\rangle|\,|x_j|\,|x_k| \le |\langle f_j, f_k\rangle|\big(|x_j|^2 + |x_k|^2\big), \]
and so
\begin{align*}
\Big\|\sum_{i\le n} x_i f_i\Big\|^2
&= \sum_{i\le n} |x_i|^2\|f_i\|^2 + \sum_{1\le j < k\le n}\Big[\langle x_j f_j, x_k f_k\rangle + \langle x_k f_k, x_j f_j\rangle\Big] \\
&\le \max_{i=1}^{n}\|f_i\|^2\cdot\sum_{i\le n} |x_i|^2 + \sum_{1\le j < k\le n}\big(|x_j|^2 + |x_k|^2\big)|\langle f_j, f_k\rangle| \\
&\le \max_{i=1}^{n}\|f_i\|^2\cdot\sum_{i\le n} |x_i|^2 + \sum_{j\le n} |x_j|^2\sum_{j < k\le n} |\langle f_j, f_k\rangle| + \sum_{k\le n} |x_k|^2\sum_{1\le j < k} |\langle f_j, f_k\rangle| \\
&\le \Big[\max_{i=1}^{n}\|f_i\|^2 + \sup_{j\ge1}\sum_{k\ne j} |\langle f_j, f_k\rangle|\Big]\sum_{j\le n} |x_j|^2.
\end{align*}
Hence we may take
\[ L^2 = \max_{i=1}^{n}\|f_i\|^2 + \sup_{j\ge1}\sum_{k\ne j} |\langle f_j, f_k\rangle|. \]

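From the above calculation, we can formulate another variant of Bessel's inequality: If $x, y_1, \dots, y_n$ are elements of an inner product space $(H, \langle\cdot,\cdot\rangle)$, then
\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2\Big[\max_{i=1}^{n}\|y_i\|^2 + \sup_{j=1}^{n}\sum_{k\ne j} |\langle y_j, y_k\rangle|\Big] \le 2\|x\|^2\sup_{j=1}^{n}\sum_{k=1}^{n} |\langle y_j, y_k\rangle|. \tag{1.3.12} \]
Indeed
\[ \Big(\sum_{i=1}^{n} |\langle x, y_i\rangle|^2\Big)^2 = \Big|\Big\langle x, \sum_{i=1}^{n}\langle x, y_i\rangle y_i\Big\rangle\Big|^2 \le \|x\|^2\,\Big\|\sum_{i=1}^{n}\langle x, y_i\rangle y_i\Big\|^2, \]
and
\begin{align*}
\Big\|\sum_{i=1}^{n}\langle x, y_i\rangle y_i\Big\|^2
&= \sum_{i=1}^{n} |\langle x, y_i\rangle|^2\|y_i\|^2 + \sum_{1\le i<j\le n}\Big[\langle x, y_i\rangle\overline{\langle x, y_j\rangle}\langle y_i, y_j\rangle + \overline{\langle x, y_i\rangle}\langle x, y_j\rangle\langle y_j, y_i\rangle\Big] \\
&\le \max_{j=1}^{n}\|y_j\|^2\sum_{i=1}^{n} |\langle x, y_i\rangle|^2 + \sum_{1\le i<j\le n} |\langle y_i, y_j\rangle|\big(|\langle x, y_j\rangle|^2 + |\langle x, y_i\rangle|^2\big) \\
&\le \max_{j=1}^{n}\|y_j\|^2\sum_{i=1}^{n} |\langle x, y_i\rangle|^2 + \sup_{j=1}^{n}\sum_{k\ne j} |\langle y_j, y_k\rangle|\sum_{j\le n} |\langle x, y_j\rangle|^2.
\end{align*}
Thus
\[ \sum_{i=1}^{n} |\langle x, y_i\rangle|^2 \le \|x\|^2\Big[\max_{i=1}^{n}\|y_i\|^2 + \sup_{j=1}^{n}\sum_{k\ne j} |\langle y_j, y_k\rangle|\Big] \le 2\|x\|^2\sup_{j=1}^{n}\sum_{k=1}^{n} |\langle y_j, y_k\rangle|. \]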

We conclude these remarks with a digression towards the large sieve inequality, which we recall: Consider a function $f : \{1, \dots, N\} \to \mathbb C$ with Fourier transform
\[ S(t) = \sum_{n=1}^{N} f(n)e^{2i\pi nt}. \]
Let $t_1, \dots, t_m$ be $\delta$-separated, which means that $|t_i - t_j| \ge \delta$ when $i \ne j$. Then
\[ \sum_{j=1}^{m} |S(t_j)|^2 \le (8N + \delta^{-1})\sum_{n=1}^{N} |f(n)|^2. \tag{1.3.13} \]

The proof of this inequality is not hard. As noticed by Bombieri, it can also be proved by using an approximate Bessel inequality, namely inequality (1.3.12). A short proof is given in [Green: 1999]. Coboundaries. Let T be a contraction in a Hilbert space H . An element f of H is a coboundary for T , if the equation f = g − Tg

(1.3.14)

(called the cohomological equation) admits a solution $g$, which in this case is called the transfer function of $f$ or the cobounding function of $f$. Usually one requires $T$ to be generated by an automorphism (see Section 3.1) $\tau$ of some probability space $(X, \mathcal A, \mu)$, $Tg = g \circ \tau$, and the cohomological equation is taken in the sense of equivalence classes modulo $\mu$. Assume for simplicity that $T$ is ergodic, or equivalently that $P_T f = \int_X f\,d\mu$ (see Section 3.2). In this case, by the Riesz decomposition of $L^2(\mu)$, the coboundaries are dense in $L^2_0(\mu)$, and by a plain density argument the coboundaries $g - Tg$ with $g$ bounded form a dense subset in all $L^p_0(\mu)$, $1 \le p < \infty$. For $p = \infty$, things get more complicated. The following result is due to Kočergin [1976: Corollary 3].

1.3.10 Theorem. Let $f \in L^1_0(\mu)$ and $\varepsilon > 0$. Then there exists $g \in L^0(\mu)$ such that
\[ \|f - (g - Tg)\|_\infty < \varepsilon. \]
Let $\varphi : \mathbb R_+ \to \mathbb R_+$ be such that $\lim_{t\to\infty}\varphi(t) = \infty$. Then there exists a function $f \in L^\infty_0(\mu)$ with $\|f\|_\infty = 1$ such that for any $g \in L^0(\mu)$ with $\int_X \varphi(|g(t)|)\,\mu(dt) < \infty$,
\[ \|f - (g - Tg)\|_\infty \ge 1/2. \]
In particular, for any $p > 0$, there exists $f$ with $\|f\|_\infty = 1$ such that
\[ \inf_{g\in L^p(\mu)} \|f - (g - Tg)\|_\infty \ge 1/2. \]

In relation with this, a recent result (see Section 4.5) of Volný and Weiss [2004] establishes a link between the fact that a function $f$ is well approximated in $L^\infty(\mu)$ by coboundaries with cobounding functions living “almost” in $L^p(\mu)$, and the order of $\mu\{|A_n f| > 1\}$. Another important feature of coboundaries is that the sums are bounded: if $f$ is a coboundary then
\[ \sup_{n\ge1}\Big\|\sum_{k=0}^{n-1} T^k f\Big\| < \infty. \tag{1.3.15} \]


The converse is in fact also true, namely if (1.3.15) holds, then $f$ is a coboundary. This follows from a theorem due to Browder [1958]. One can consider a larger setting to state this property. Let $X$ be a Banach space and let $T : X \to X$ be an operator. Equation (1.3.14) is then reformulated as $f = (I - T)g$, where $I$ denotes the identity operator on $X$. The following extension of Browder's result is due to Lin and Sine [1983].

1.3.11 Theorem. Let $T$ be mean ergodic. The following conditions are equivalent for $y \in X$:
(i) $y \in (I - T)X$,
(ii) $x_n = \frac1n\sum_{k=1}^{n}\sum_{j=0}^{k-1} T^j y$ has a weakly convergent subsequence,
(iii) $\{x_n, n \ge 1\}$ converges strongly (and $x = \lim_{n\to\infty} x_n$ satisfies $(I - T)x = y$),
(iv) $\sup_{n\ge1}\big\|\sum_{k=0}^{n-1} T^k y\big\| < \infty$.

1.4 The spectral regularization inequality

A remarkable result of Talagrand [1996a: Theorem 1.3] allows us to make von Neumann's theorem more precise. Write $A^T(f) = \{A_n^T f, n \ge 1\}$, and let for any $\varepsilon > 0$, $N(A^T(f), \|\cdot\|, \varepsilon)$ be the entropy number of order $\varepsilon$ of $A^T(f)$, namely the minimal number (possibly infinite) of Hilbertian open balls of radius $\varepsilon$ centered in $A^T(f)$ that suffice to cover $A^T(f)$.

1.4.1 Theorem (Talagrand's entropy estimate). Let $T$ be a contraction in a Hilbert space $H$. Then, for all $f \in H$ and all $0 < \varepsilon \le \|f\|$,
\[ N(A^T(f), \|\cdot\|, \varepsilon) \le 1 + 30\,\frac{\|f\|^2}{\varepsilon^2}. \]
In Lifshits–Weber [2000], the better constant $6\pi \approx 18.85$ is obtained by using a finer spectral regularization kernel than the one used in this section. The theorem shows that the convergence of the sequence $A^T(f)$ is very regular, like in the plane $\mathbb R^2$. This phenomenon is all the more surprising since no speed of convergence exists in general in von Neumann's mean ergodic theorem, as we saw in the previous section. Talagrand's proof introduces a remarkable idea of spectral regularization, which has been developed in Lifshits–Weber [2000], [2003] for fixed and moving ergodic averages. Put for any positive real $x$ and any $-\pi \le \theta \le \pi$,
\[ V_x(\theta) = \frac{e^{ix\theta} - 1}{x(e^{i\theta} - 1)}. \]
We first collect and give the proof of some elementary but useful estimates of these kernels.


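1.4.2 Lemma. We have the following estimates valid for $x > 0$ and $-\pi \le \theta \le \pi$:
(i) $|V_x(\theta)| \le \dfrac{\pi}{x|\theta|}$, and $|V_x(\theta)| \le \inf\Big(1, \dfrac{\pi}{x|\theta|}\Big)$ for $x$ integer.
(ii) $\Big|\dfrac{\partial}{\partial x}V_x(\theta)\Big| \le \dfrac{\pi}{4}|\theta|$.
(iii) For any integers $m \ge n \ge 1$, $|V_n(\theta) - V_m(\theta)| \le \dfrac{\pi}{4}|\theta|(m - n)$.
(iv) $|V_n(\theta) - V_m(\theta)| \le 2(m - n)/m$.

Proof. First observe that $|V_x(\theta)| \le \dfrac{2}{x|e^{i\theta} - 1|}$. As for any $-\pi \le \theta \le \pi$, $|e^{i\theta} - 1| = 2\big|\sin\frac{|\theta|}{2}\big| \ge \frac{2|\theta|}{\pi}$, we deduce that $|V_x(\theta)| \le \frac{\pi}{x|\theta|}$, if $-\pi \le \theta \le \pi$ and $x > 0$. And $|V_x(\theta)| \le 1$ if $x$ is an integer. Hence (i).

Now let $-\pi \le \theta \le \pi$ and put for any real $x > 0$,
\[ \varphi_\theta(x) = \frac{e^{ix\theta} - 1}{x}. \]
Then $\varphi_\theta'(x) = \dfrac{i\theta x e^{ix\theta} - e^{ix\theta} + 1}{x^2}$, and noting $\delta(u) := |iue^{iu} - e^{iu} + 1|^2$, we have
\[ \delta(u) = (1 - u\sin u - \cos u)^2 + (u\cos u - \sin u)^2 = 2[1 - u\sin u - \cos u] + u^2. \]
We claim that for all real $u$, $\delta(u) \le u^4/4$. As $\delta(u) = \delta(-u)$ it suffices to prove it for $u \ge 0$. But $\delta'(u) = 2u(1 - \cos u)$ and if we set $H(u) := u^4/4 - \delta(u)$, we get
\[ H'(u) = u^3 - 2u(1 - \cos u) = u\big(u^2 - 4\sin^2(u/2)\big) \ge 0, \]
since $|\sin v| \le |v|$. Then $|\varphi_\theta'(x)| = \sqrt{\delta(x\theta)}/x^2 \le |\theta|^2/2$. As $\dfrac{\partial}{\partial x}V_x(\theta) = \dfrac{1}{e^{i\theta} - 1}\varphi_\theta'(x)$, it follows that
\[ \Big|\frac{\partial}{\partial x}V_x(\theta)\Big| \le \frac{\pi}{2|\theta|}|\varphi_\theta'(x)| \le \frac{\pi}{4}|\theta|. \]
Hence (ii). Let $m \ge n$ be positive integers. Then
\[ |V_n(\theta) - V_m(\theta)| = \frac{1}{|e^{i\theta} - 1|}|\varphi_\theta(n) - \varphi_\theta(m)| \le \frac{\pi}{2|\theta|}(m - n)\sup_{n<x<m} |\varphi_\theta'(x)|, \]
and so $|V_n(\theta) - V_m(\theta)| \le \frac{\pi}{4}|\theta|(m - n)$. Now
\[ |V_n(\theta) - V_m(\theta)| = \Big|\Big(\frac1n - \frac1m\Big)\sum_{j=0}^{n-1} e^{ij\theta} - \frac1m\sum_{j=n}^{m-1} e^{ij\theta}\Big| \le \frac{2(m - n)}{m}. \]
Hence (iii) and (iv).

Introduce for $\theta \in [-\pi, \pi)$ and $y \in (0, 1]$ the regularizing kernel
\[ Q(\theta, y) = \frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2}. \tag{1.4.1} \]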
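The estimates (i), (iii) and (iv) of Lemma 1.4.2 can be checked on a grid for integer indices, where $V_n(\theta) = \frac1n\sum_{j=0}^{n-1} e^{ij\theta}$. The following sketch does so; the grid size, the range of $n$, the tolerance and the function names are arbitrary illustrative choices, and a grid check is of course no substitute for the proof.

```python
import cmath
import math

def kernel_v(n, theta):
    """V_n(theta) = (1/n) * sum_{j=0}^{n-1} exp(i*j*theta) for an integer n >= 1."""
    return sum(cmath.exp(1j * j * theta) for j in range(n)) / n

def check_lemma_estimates(max_n=12, grid=200, tol=1e-12):
    """Return True if estimates (i), (iii), (iv) hold on a grid of nonzero theta."""
    # Midpoint grid on [-pi, pi]; it never hits theta = 0, where pi/(n|theta|) is undefined.
    thetas = [-math.pi + (k + 0.5) * 2 * math.pi / grid for k in range(grid)]
    for theta in thetas:
        values = {n: kernel_v(n, theta) for n in range(1, max_n + 1)}
        for n in range(1, max_n + 1):
            # Estimate (i) for integer index.
            if abs(values[n]) > min(1.0, math.pi / (n * abs(theta))) + tol:
                return False
            for m in range(n, max_n + 1):
                gap = abs(values[n] - values[m])
                # Estimate (iii).
                if gap > math.pi / 4 * abs(theta) * (m - n) + tol:
                    return False
                # Estimate (iv).
                if gap > 2 * (m - n) / m + tol:
                    return False
    return True

all_hold = check_lemma_estimates()
```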


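1.4.3 Lemma. Let $m \ge n$ be two positive integers. Then, for any $\theta \in [-\pi, \pi)$,
\[ 4\pi\int_{1/m}^{1/n} Q(\theta, y)\,dy + 4\cdot\mathbf 1_{[\frac1m, \frac1n)}(|\theta|) \ge |V_m(\theta) - V_n(\theta)|^2. \]

Proof. Consider three cases:

(1) $|\theta| \ge \frac1n$. By definition of $Q$ and by Lemma 1.4.2,
\[ \int_{1/m}^{1/n} Q(\theta, y)\,dy = \int_{1/m}^{1/n} \frac{1}{|\theta|}\,dy = \frac{m - n}{m}\,\frac{1}{n|\theta|} \ge \frac12|V_m(\theta) - V_n(\theta)|\,\frac{1}{n|\theta|} \ge \frac12|V_m(\theta) - V_n(\theta)|\,\frac{1}{2\pi}|V_m(\theta) - V_n(\theta)| = \frac{1}{4\pi}|V_m(\theta) - V_n(\theta)|^2. \]

(2) $|\theta| \le \frac1m$. Then, for the same reasons,
\[ \int_{1/m}^{1/n} Q(\theta, y)\,dy = \int_{1/m}^{1/n} \frac{|\theta|}{y^2}\,dy = (m - n)|\theta| \ge \frac{4}{\pi}|V_m(\theta) - V_n(\theta)| \ge \frac{2}{\pi}|V_m(\theta) - V_n(\theta)|^2. \]

(3) $\frac1n > |\theta| \ge \frac1m$. This case is obvious since we have $|V_m(\theta) - V_n(\theta)| \le 2$.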
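The inequality of Lemma 1.4.3 can also be explored numerically, since the integral of the kernel $Q(\theta, y) = \frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2}$ over an interval has a closed form. The sketch below is only an illustration under that observation; the helper names, the grid and the range of indices are assumptions, not part of the text.

```python
import cmath
import math

def kernel_v(n, theta):
    """V_n(theta) = (1/n) * sum_{j=0}^{n-1} exp(i*j*theta) for an integer n >= 1."""
    return sum(cmath.exp(1j * j * theta) for j in range(n)) / n

def integral_q(t, a, b):
    """Exact integral over [a, b] of Q = min(1/t, t/y**2), where t = |theta| > 0."""
    flat_part = (min(b, t) - min(a, t)) / t          # region y <= t, where Q = 1/t
    tail_part = t * (1 / max(a, t) - 1 / max(b, t))  # region y >= t, where Q = t/y^2
    return flat_part + tail_part

def lemma_gap(n, m, theta):
    """Left side minus right side of the inequality of Lemma 1.4.3 (m >= n >= 1)."""
    t = abs(theta)
    indicator = 1.0 if 1 / m <= t < 1 / n else 0.0
    left = 4 * math.pi * integral_q(t, 1 / m, 1 / n) + 4 * indicator
    return left - abs(kernel_v(m, theta) - kernel_v(n, theta)) ** 2

# Midpoint grid on [-pi, pi], which avoids theta = 0.
thetas = [-math.pi + (k + 0.5) * 2 * math.pi / 400 for k in range(400)]
smallest_gap = min(
    lemma_gap(n, m, th) for n in range(1, 11) for m in range(n, 11) for th in thetas
)
```

A nonnegative `smallest_gap` (up to rounding) is what the lemma predicts on this grid.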

Let $f \in H$, with spectral measure $\mu_f$. Introduce a new measure, the spectral regularization of the measure $\mu_f$ with respect to the kernel $Q$, defined by
\[ \hat\mu_f(dy) = 4\pi\Big(\int_{-\pi}^{\pi} Q(\theta, y)\,\mu_f(d\theta)\Big)dy + 4\mu_f(dy). \tag{1.4.2} \]
It is easy to verify that $\hat\mu_f([0, 1]) \le 4(2\pi + 1)\mu_f([-\pi, \pi]) \le 4(2\pi + 1)\|f\|^2$. Indeed, if $|\theta| \le 1$, then
\[ \int_0^1 Q(\theta, y)\,dy = \int_0^{|\theta|} |\theta|^{-1}\,dy + \int_{|\theta|}^{1} |\theta|y^{-2}\,dy = 1 + |\theta|(|\theta|^{-1} - 1) = 2 - |\theta| \le 2, \]
and if $1 \le |\theta| \le \pi$, then $y \le |\theta|$ and $\int_0^1 Q(\theta, y)\,dy = \int_0^1 |\theta|^{-1}\,dy \le 1$. We thus have $\int_0^1 Q(\theta, y)\,dy \le 2$; hence the inequality.

1.4.4 Theorem (Spectral regularization inequality). For any integers $m \ge n \ge 1$,
\[ \|A_n^T f - A_m^T f\|^2 \le \hat\mu_f\Big(\Big[\frac1m, \frac1n\Big)\Big). \]


Proof. By integrating the inequality of Lemma 1.4.3 with respect to the measure μf , we get π 1/n π 1 1 Q(θ, y)μf (dθ ) dy+4 μf m , n ≥ |Vm (θ )−Vn (θ )|2 μf (dθ ). 4π 1/m

−π

−π

By means of the spectral inequality (Proposition 1.2.2), we thus obtain the claimed result. The spectral regularization inequality allows us to easily evaluate the Littlewood– Paley square function associated to the averages ATn (f ). Put for any nondecreasing sequence N = {np , p ≥ 1} of positive integers, and any f ∈ H , SN (f ) =

∞

ATnp+1 (f ) − ATnp (f ) 2

1/2 .

(1.4.3)

p=1

These functions, which are extrapolated from the Littlewood–Paley theory, gained much interest in the ergodic circles during the last decade. We briefly recall their role in Fourier analysis on T. Introduce the so-called dyadic intervals ⎧ j −1 j −1 + 1, . . . , 2j − 1} if j > 0, ⎪ ⎨{2 , 2 j = {0} if j = 0, ⎪ ⎩ −|j | if j < 0, If f is any integrable function on T and fˆ its Fourier transform, then we write Sj f = ˆ n∈j f (n)χn . The square function of f is defined by Sf =

|Sj f |2

1/2 ,

j ∈Z

and the Littlewood–Paley theorem on T expresses that to each p in (1, ∞) correspond positive numbers Ap and Bp such that Ap Sf p ≤ f p ≤ Bp Sf p for (say) all trigonometric polynomials f on T. For more, see [Edwards–Gaudry: 1977]. The square function also appears in martingale theory ([Burkholder–Gundy: 1970], inequality (1.4)). Let f1 , f2 , . . . be a martingale on some probability space and d1 , d2 , . . . its difference sequence, so that fn =

n

dk ,

n ≥ 1.

k=1

Let

f

denote the maximal function of the martingale sequence: f = supn≥1 |fn |.

30

1 The von Neumann theorem and spectral regularization

The maximal function is related to the square function Sf = inequalities Ap Sf p ≤ f p ≤ Bp Sf p

∞

2 1/2 k=1 dk

by the

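valid for $1 < p < \infty$.

1.4.5 Theorem (Square function inequality). For any nondecreasing sequence $\mathcal N$ of positive integers, and any $f \in H$,
\[
S_{\mathcal N}(f) \le 2(2\pi + 1)^{1/2} \|f\|.
\]

Proof. From Theorem 1.4.4, it follows immediately that
\[
\sum_{p=1}^{\infty} \bigl\| A^T_{n_{p+1}}(f) - A^T_{n_p}(f) \bigr\|^2 \le \sum_{p=1}^{\infty} \hat\mu_f \bigl[ \tfrac{1}{n_{p+1}}, \tfrac{1}{n_p} \bigr) \le \hat\mu_f\{[0,1]\} \le 4(2\pi + 1) \|f\|^2.
\]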

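Actually the better constant $6\pi$ is obtained in Lifshits and Weber [2000: 77] by using another kernel $Q$. The corresponding spectral regularization of $\mu$ is given by
\[
\frac{d\hat\mu}{dx}(x) = \int_{-\pi}^{\pi} Q(\theta, x)\,\mu(d\theta) = |x|^{-3} \int_{|\theta| < |x|} \theta^2\,\mu(d\theta) + \int_{|x| < |\theta| \le \pi} |\theta|^{-1}\,\mu(d\theta), \qquad 0 < |x| \le \pi.
\]
For any two positive integers $m \ge n$, we have
\[
\| A^T_n f - A^T_m f \|^2 \le 4\pi\,\hat\mu \bigl[ \tfrac{1}{m}, \tfrac{1}{n} \bigr].
\]
By applying the above inequality to the measure $\mu = \delta_\theta$, we also get for each $\theta$,
\[
\sum_{p=1}^{\infty} |V_{n_{p+1}}(\theta) - V_{n_p}(\theta)|^2 \le 4(2\pi + 1). \tag{1.4.4}
\]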
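The pointwise bound (1.4.4) can be probed numerically. The following is only a sketch, assuming the kernel $V_n(\theta) = n^{-1}\sum_{j=0}^{n-1} e^{ij\theta}$ used throughout the section; the helper names `ergodic_kernel` and `square_sum` are hypothetical and do not come from the text. A finite index sequence gives a partial sum, which is therefore bounded by the same constant.

```python
import cmath
import math

def ergodic_kernel(n, theta):
    """V_n(theta) = (1/n) * sum_{j=0}^{n-1} e^{i j theta}, via the geometric sum."""
    z = cmath.exp(1j * theta)
    if abs(1.0 - z) < 1e-12:
        # theta is (numerically) a multiple of 2*pi, where V_n equals 1
        return 1.0 + 0.0j
    return (1.0 - cmath.exp(1j * n * theta)) / (n * (1.0 - z))

def square_sum(indices, theta):
    """sum_p |V_{n_{p+1}}(theta) - V_{n_p}(theta)|^2 over a finite index sequence."""
    values = [ergodic_kernel(n, theta) for n in indices]
    return sum(abs(b - a) ** 2 for a, b in zip(values, values[1:]))

# Constant appearing on the right-hand side of (1.4.4), roughly 29.13
BOUND = 4.0 * (2.0 * math.pi + 1.0)
```

Calling `square_sum(range(1, 2001), 0.1)` or `square_sum([2 ** k for k in range(15)], 1.0)` and comparing with `BOUND` illustrates the inequality on specific sequences; it is of course no substitute for the proof.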

This inequality was proved by Jones, Ostrovskii and Rosenblatt [1996] by different arguments, with the constant $25^2$. Note that (1.4.4) immediately implies Theorem 1.4.5, which is Theorem 1.2 in the above-mentioned paper. Square functions for other ergodic averages are considered in [Nair–Weber: 1999].

We can now give a simple proof of Talagrand's inequality.

Proof of Theorem 1.4.1. Let $0 < \varepsilon \le 1$, and let $f \in H$ be such that $\|f\| = 1$. Let also $t_0 < t_1 < \cdots < t_r$ be an ordered sequence of positive integers such that
\[
\| A^T_{t_i} f - A^T_{t_j} f \| \ge \varepsilon, \qquad \forall\, 0 \le i < j \le r.
\]
Apply the previous theorem to the sequence $\mathcal N = \{t_0, t_1, \dots, t_r, t_{r+1}, t_{r+2}, \dots\}$, where $t_{r+j} = t_r + j$ for $j = 1, 2, \dots$. Then $\varepsilon^2 r \le 4(2\pi + 1)$, and consequently,
\[
N(A^T(f), \|\cdot\|, \varepsilon) \le 1 + \frac{4(2\pi + 1)}{\varepsilon^2} \le 1 + \frac{30}{\varepsilon^2}.
\]
This establishes the claimed inequality.

1.4.6 Remarks. (1) The above estimate is also optimal. This can be seen by considering rotations. Take $X = [-\pi, \pi)$ provided with the normalized Lebesgue measure $\lambda$. Let also $\theta \in X$ be irrational and consider the unitary operator $U$ on $L^2(X, \lambda)$ associated with the rotation $\tau x = x + \theta \bmod 2\pi$, $x \in X$, and defined by $Uf = f \circ \tau$. Let $f \in L^2(X, \lambda)$, $f = \sum_{n \in \mathbb Z} a_n e_n$, where we denote $e_n(x) = e^{inx}$. Then
\[
\| A_N(f) - A_M(f) \|_2^2 = \sum_{n \in \mathbb Z} |a_n|^2\, |V_N(n\theta) - V_M(n\theta)|^2.
\]
By virtue of Weyl's criterion (Section 1.6), we can build inductively two increasing sequences of positive integers $N_1 < N_2 < \cdots$ and $l_1 < l_2 < \cdots$ such that for any $j = 1, 2, \dots$ and any $i < j$,
\[
|V_{N_j}(l_j \theta)| > \tfrac{1}{2}, \qquad |V_{N_j}(l_i \theta)| < \tfrac{1}{4}.
\]

Now let {rk , k ≥ 1} be some increasing sequence of integers, and put Rk = j

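\[
\| A_{N_j}(f_k) - A_{N_i}(f_k) \|_2^2 \ge \frac{1}{r_k} \sum_{s=R_k}^{R_{k+1}-1} |V_{N_j}(l_s\theta) - V_{N_i}(l_s\theta)|^2 \ge \frac{1}{r_k}\, |V_{N_j}(l_i\theta) - V_{N_i}(l_i\theta)|^2 \ge \frac{1}{16 r_k}.
\]
Hence,
\[
\| A_{N_j}(f) - A_{N_i}(f) \|_2 \ge c_k\, \| A_{N_j}(f_k) - A_{N_i}(f_k) \|_2 \ge \frac{c_k}{4\sqrt{r_k}}
\]
for any $R_k \le i < j < R_{k+1}$, which proves that
\[
N\Bigl( f_k, \frac{1}{4\sqrt{r_k}} \Bigr) \ge r_k \qquad \text{and} \qquad N\Bigl( f, \frac{c_k}{4\sqrt{r_k}} \Bigr) \ge r_k.
\]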

The first inequality shows the optimality of Talagrand's estimate, by taking into account its homogeneity properties. The second inequality shows this: whatever $\varphi : \mathbb R^+ \to \mathbb R^+$ with $\lim_{x \to 0} \varphi(x) = 0$, there exists $f \in L^2(\lambda)$ such that
\[
\limsup_{\varepsilon \to 0} \frac{N(f, \varepsilon)}{\varepsilon^{-2} \varphi(\varepsilon)} > 0.
\]

(2) Under appropriate spectral conditions, Talagrand's estimate can be sharpened. Let $f \in H$, $\|f\| = 1$, with spectral measure $\mu$ satisfying, for $\beta \ge 0$,
\[
\int_{-\pi}^{\pi} \Bigl( \log \frac{\pi}{|\theta|} \Bigr)^{\beta} \mu(d\theta) < \infty.
\]
Then ([Gamet–Weber: 2000], Proposition 1.2) there exists a constant $K = K(f, \beta)$ such that
\[
N(A^T(f), \|\cdot\|, \varepsilon) \le \frac{K}{\varepsilon^2 |\log \varepsilon|^{\beta}}, \qquad 0 < \varepsilon \le 1.
\]

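Gaposhkin's estimates. These questions were further investigated by Gaposhkin, who considered sequences $\xi = \{\xi_k, k \ge 1\}$ of square integrable random variables satisfying the following quasi-stationary condition:
\[
\Bigl\| \frac{1}{n} \sum_{k=m}^{m+n} \xi_k \Bigr\|_2^2 \le \frac{C_0^2}{n^{\alpha}}, \tag{1.4.5}
\]
for all $m \ge 0$, $n \ge 1$, where $C_0 > 0$ and $0 < \alpha \le 2$. Condition (1.4.5) holds for instance if $\xi$ is a weakly stationary sequence such that $\mathbf E\,\xi_k = 0$ and $\kappa(n) := \mathbf E\,\xi_k \xi_{k+n}$ satisfies
\[
|\kappa(n)| \le \frac{C_1}{n^{\alpha}},
\]
for some $0 < \alpha < 1$ and $C_1 > 0$. Then ([Gaposhkin: 2005], Theorem 1) the entropy number $N(\varepsilon)$ of the associated set of means satisfies the inequality
\[
N(\varepsilon) \le \frac{C_0 D_\alpha}{\varepsilon},
\]
where
\[
D_\alpha \le \Bigl( \frac{2-\alpha}{\alpha} \Bigr)^{(2-\alpha)/\alpha} + 2 \ \text{ for } 0 < \alpha < 1, \qquad D_\alpha \le 3 \ \text{ for } 1 \le \alpha \le 2.
\]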

(3) For unitary operators with discrete spectrum, the entropy estimate can be improved. Let $U$ be a unitary operator with discrete spectrum in $H = L^2(\lambda)$, or in an arbitrary separable Hilbert space $H$. Let $\{e_j, j \in \mathbb Z\}$ be a basis in $H$ and $\{\lambda_j, j \in \mathbb Z\}$ be a sequence on the unit circle such that
\[
U(x) = U\Bigl( \sum_{j \in \mathbb Z} x_j e_j \Bigr) = \sum_{j \in \mathbb Z} \lambda_j x_j e_j, \qquad x_j = \langle x, e_j \rangle.
\]
Then for each complex polynomial $P$, we have $P(U)x = \sum_{j \in \mathbb Z} P(\lambda_j) x_j e_j$. Let $q \in (1, 2]$. Further, for each $x \in H$,
\[
\| P(U)x \|_2 = \Bigl( \sum_{j \in \mathbb Z} |P(\lambda_j)|^2 |x_j|^2 \Bigr)^{1/2} \le \Bigl( \sum_{j \in \mathbb Z} |P(\lambda_j)|^q |x_j|^q \Bigr)^{1/q}. \tag{1.4.6}
\]
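The inequality in (1.4.6) is the elementary embedding $\ell^q \subset \ell^2$ for $q \le 2$, applied to the sequence $P(\lambda_j)x_j$. A minimal sketch of a numerical sanity check follows; the function names `lp_norm` and `check_embedding` are illustrative assumptions, not notation from the text.

```python
def lp_norm(values, p):
    """Return the l^p norm (sum |v|^p)^(1/p) of a finite real or complex sequence."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def check_embedding(values, q):
    """Check ||v||_2 <= ||v||_q for 1 < q <= 2, up to a small rounding tolerance."""
    if not 1.0 < q <= 2.0:
        raise ValueError("q must lie in (1, 2]")
    return lp_norm(values, 2) <= lp_norm(values, q) + 1e-12
```

For instance, `check_embedding([0.5, -0.25, 0.125j], 1.5)` compares the two norms of a short complex sequence.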

Combining this estimate and the spectral regularization method, we can prove:

1.4.7 Proposition. Let $B_q = \{ x \in H : \sum_{j \in \mathbb Z} |x_j|^q \le 1 \}$. There exists a constant $C$, depending only on $q$, such that
\[
\sup_{x \in B_q} N(A^U(x), \|\cdot\|, \varepsilon) \le C \varepsilon^{-q}.
\]

The proposition can be applied to the unitary operator related to rotations of the circle, with the exponential functions providing the relevant basis of eigenfunctions and $B_q = \{ x \in H : \|\hat x\|_q \le 1 \}$, where $\hat x$ denotes the Fourier transform (the sequence of the Fourier coefficients) of $x \in L^p([0, 1[)$.

Proof. Fix $x \in B_q$. Define the pseudo-spectral measure of $x$ as $\mu = \sum_{j \in \mathbb Z} |x_j|^q \delta_{\lambda_j}$. Let, as usual, $V_n(z) = n^{-1} \sum_{j=0}^{n-1} z^j$ be the complex polynomials corresponding to the operators $A_n$. For all positive integers $n < m$, we can apply (1.4.6) to $P = V_n - V_m$ and rewrite it as
\[
\| (A_n - A_m)x \|_2^q \le \| V_n - V_m \|_{q,\mu}^q.
\]
The next step is a regularization procedure similar to the one previously performed. Define for $0 < r \le 1$ the regularized measure $\hat\mu$ on $[0, 1]$ by its Lebesgue density
\[
\frac{d\hat\mu}{dr}(r) = \int Q(z, r)\,\mu(dz) = r^{-q-1} \int_{|1-z| < r} |1 - z|^q\,\mu(dz) + r^{q-2} \int_{r \le |1-z| \le 2} |1 - z|^{1-q}\,\mu(dz).
\]
Using the standard estimates
\[
|V_n(z) - V_m(z)| \le \min\Bigl\{ \frac{4}{n|1 - z|},\ \frac{2(m - n)}{n},\ \frac{(m - n)|1 - z|}{2} \Bigr\},
\]
we get for all $z$ with $|z| = 1$ (namely $z \in \mathbb T$) the inequality
\[
|V_n(z) - V_m(z)|^q \le C_1 \int_{1/m}^{1/n} Q(z, r)\,dr,
\]
with $C_1$ depending only on $q$. Integration over $\mu$ yields
\[
\| V_n - V_m \|_{q,\mu}^q \le C_1\, \hat\mu[1/m, 1/n]. \tag{1.4.7}
\]

Moreover, we have a total mass bound $\hat\mu[0, 1] \le C_2\, \mu(\mathbb T)$, with $C_2$ depending only on $q$. Let us take an arbitrary $\varepsilon > 0$ and cut $[0, 1]$ into segments such that the measure $\hat\mu$ of each segment does not exceed $\varepsilon^q / C_1$. It thus follows from (1.4.7) that the related covering of the set $A^U(x)$ consists of sets of diameter not larger than $\varepsilon$, and finally
\[
N(A^U(x), \|\cdot\|, \varepsilon) \le C_1 C_2\, \varepsilon^{-q} + 1.
\]

Entropy numbers attached to i.i.d. sequences. Somewhat surprisingly, entropy numbers attached to i.i.d. sequences behave more smoothly. To see this, let $H$ be some $L^2(\mu)$, $\mu$ a probability measure, and choose $U$ and $f \in L^2(\mu)$ with $\langle f, 1 \rangle = 0$, $\|f\| = 1$, such that $f, Uf, U^2 f, \dots$ is a sequence of i.i.d. random variables. If we write more simply $A_n = A^U_n(f)$, then
\[
\| A_n - A_m \|^2 = \frac{1}{n} - \frac{1}{m},
\]

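for any integers $n < m$. Let $0 < \varepsilon < 1$ be fixed. Then $\|A_n\| \le \varepsilon$ if $n \ge 1/\varepsilon^2$. For each $1 \le n \le 2/\varepsilon$, we cover $A_n$ with one ball of radius $\varepsilon$. Finally, if $2/\varepsilon < n < 1/\varepsilon^2$, let $m_k = \lfloor 1/(k\varepsilon^2) \rfloor$, $1 \le k \le \lfloor 1/\varepsilon \rfloor$. Notice that $\lfloor x \rfloor \ge x/2$ if $x \ge 1$, and $\lfloor x \rfloor - \lfloor y \rfloor \le 3(x - y)$ if $x - y \ge 1/2$. Thus $1/(2\varepsilon) \le m_{\lfloor 1/\varepsilon \rfloor} \le 2/\varepsilon$, and
\[
\frac{1}{k\varepsilon^2} - \frac{1}{(k+1)\varepsilon^2} = \frac{1}{\varepsilon^2}\,\frac{1}{k(k+1)} \ge \frac{1}{\varepsilon^2}\,\frac{1}{\frac{1}{\varepsilon}\bigl(\frac{1}{\varepsilon} + 1\bigr)} = \frac{1}{1 + \varepsilon} \ge \frac{1}{2}.
\]
So $\lfloor 1/(k\varepsilon^2) \rfloor - \lfloor 1/((k+1)\varepsilon^2) \rfloor \le \frac{3}{\varepsilon^2 k(k+1)}$, which implies
\[
\frac{1}{m_{k+1}} - \frac{1}{m_k} = \frac{\bigl\lfloor \frac{1}{k\varepsilon^2} \bigr\rfloor - \bigl\lfloor \frac{1}{(k+1)\varepsilon^2} \bigr\rfloor}{\bigl\lfloor \frac{1}{k\varepsilon^2} \bigr\rfloor \bigl\lfloor \frac{1}{(k+1)\varepsilon^2} \bigr\rfloor} \le 4k(k+1)\varepsilon^4 \Bigl( \Bigl\lfloor \frac{1}{k\varepsilon^2} \Bigr\rfloor - \Bigl\lfloor \frac{1}{(k+1)\varepsilon^2} \Bigr\rfloor \Bigr) \le 4k(k+1)\varepsilon^4\, \frac{3}{\varepsilon^2 k(k+1)} = 12\varepsilon^2.
\]
Let $m_{k+1} \le n < m_k$. Then
\[
\| A_n - A_{m_{k+1}} \|^2 = \frac{1}{m_{k+1}} - \frac{1}{n} \le \frac{1}{m_{k+1}} - \frac{1}{m_k} \le 12\varepsilon^2.
\]
Hence, for some absolute constant $C$,
\[
C^{-1} \varepsilon^{-1} \le N(A(f), \|\cdot\|, \varepsilon) \le C \varepsilon^{-1}. \tag{1.4.8}
\]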
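The order $\varepsilon^{-1}$ in (1.4.8) can be illustrated with a greedy cover. The sketch below assumes only the i.i.d. identity $\|A_n - A_m\|^2 = 1/n - 1/m$ for $n < m$; the function name `greedy_cover_size` is hypothetical, and the greedy count is merely an upper bound for the entropy number, not its exact value.

```python
import math

def greedy_cover_size(eps):
    """Size of a greedy cover of {A_n, n >= 1} by open balls of radius eps.

    Relies on ||A_n - A_m||^2 = 1/n - 1/m for n < m (i.i.d. case).
    """
    eps2 = eps * eps
    # A ball centred at A_{n_tail} covers every m >= n_tail,
    # because 1/n_tail - 1/m < 1/n_tail <= eps^2.
    n_tail = math.ceil(1.0 / eps2)
    count = 0
    n = 1
    while n < n_tail:
        count += 1  # open a new ball centred at A_n
        m = n + 1
        # skip all indices m > n that lie within distance eps of A_n
        while m < n_tail and 1.0 / n - 1.0 / m < eps2:
            m += 1
        n = m
    return count + 1  # one extra ball for the tail n >= n_tail
```

Evaluating `greedy_cover_size(eps) * eps` for a few decreasing values of `eps` shows a product that stays bounded, in line with the two-sided estimate above.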

We conclude these remarks by pointing out a Cauchy type uniform estimate of averages An , easy to draw from the above estimates and Theorem 8.1.1, |AN (f ) − AM (f )| 1/2 < ∞

N =M≥1 1 − 1 M N 1 du + + whenever : R → R is an increasing map such that 0 √u(u) < ∞. E sup

(1.4.9)

These plain computations also show, when combined with Rosenthal's inequalities, that this estimate continues to be valid in $L^p$, $2 < p < \infty$. More precisely, let $\{\xi_j, j = 0, 1, \dots\}$ be a sequence of mean zero independent variables with finite moments of order $p \ge 2$ and $\sigma \le \|\xi_j\|_2 \le \|\xi_j\|_p \le K$ for all $j$. Let $A_n = n^{-1} \sum_{j=0}^{n-1} \xi_j$ and $A = \{A_n, n \ge 1\}$. Then for each $p \ge 2$ there exist constants $c_p$ and $C_p$ depending only on $p$ such that the entropy numbers obey
\[
c_p\, \sigma\, \varepsilon^{-1} - 1 \le N(A, p, \varepsilon) \le C_p K \varepsilon^{-1} + 2 \qquad \text{for all } \varepsilon > 0. \tag{1.4.10}
\]

Lacunary subsequences. Let N = {nj , j ≥ 1} be a strictly growing sequence of positive integers satisfying the condition cN := sup # N ∩ [2k , 2k+1 [ < ∞. k≥1

Better estimates of entropy numbers than in Theorem 1.4.1 can be obtained in that case. Let f ∈ H with spectral measure μ. Put

π μ{0 < |θ| ≤ u} + u (ε) = inf + log , 0 < u ≤ π . 2 ε u Then there exists a universal constant C such that for any N , any f ∈ H with f = 1 and any 0 < ε ≤ 1,

N An f, n ∈ N , · , ε ≤ CcN (ε). For proofs, see Weber [1998a: Corollary 3.3] or Lifshits–Weber [2000: Corollary 4].

Extension to $L^p$ with $p > 1$. Assume that $H = L^2(\mu)$, $(X, \mathcal A, \mu)$ being a probability space, and define $Tf = f \circ \tau$ where $\tau$ is a measure-preserving transformation of $X$ (Section 3.1). By Theorem 1.4.5, the associated square function $S_{\mathcal N}$ defined in (1.4.3) maps $L^2(\mu)$ to $L^2(\mu)$. This can be extended to $1 < p < \infty$: there exists a constant $C_p$ such that for any increasing sequence $\mathcal N = \{n_k, k \ge 1\}$ and any $f \in L^p(\mu)$, we have
\[
\Bigl( \sum_{k=1}^{\infty} \bigl\| A^T_{n_{k+1}}(f) - A^T_{n_k}(f) \bigr\|_p^p \Bigr)^{1/p} \le C_p \|f\|_p. \tag{1.4.11}
\]

This nice result was shown by Jones, Kaufman, Rosenblatt and Wierdl [1998]. It is a direct consequence of a stronger result (see Theorem A), which we shall discuss in Section 4.6.6. With the notation from the beginning of the section, let $N(A^T(f), p, \varepsilon)$ be the minimal number (possibly infinite) of open $L^p(\mu)$ balls of radius $\varepsilon$, centered in $A^T(f)$, needed to cover $A^T(f)$. In a way similar to the one we used to derive entropy estimates from the square function, we deduce from (1.4.11): there exists a constant $C_p$ such that for $\varepsilon > 0$ and any $f \in L^p(\mu)$,
\[
N(A^T(f), p, \varepsilon) \le \frac{C_p^p\, \|f\|_p^p}{\varepsilon^p}. \tag{1.4.12}
\]

For irrational rotations, this bound can be improved by using the Hausdorff–Young inequality (Lifshits [1997] and Weber [1997]). Let τ x = x + ϑ be a rotation on (T, λ), and T defined by Tf = f τ .


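Let $2 \le p < \infty$ and $1/p + 1/q = 1$. For $f \in L^p(\mathbb T)$, $f \sim \sum_{j \in \mathbb Z} \hat f_j e_j$, let $\hat f = \{\hat f_j, j \in \mathbb Z\}$ be its Fourier transform. Then
\[
\sup_{\|\hat f\|_q \le 1} N(A^T(f), p, \varepsilon) \le C \varepsilon^{-q}. \tag{1.4.13}
\]
As $T e_j = e^{2i\pi j \vartheta} e_j := \lambda_j e_j$, for all polynomials $P$ we have
\[
P(T) f = \sum_{j \in \mathbb Z} P(\lambda_j) \hat f_j e_j.
\]
By the Hausdorff–Young theorem, we get
\[
\| P(T) f \|_p \le C_p \Bigl( \sum_{j \in \mathbb Z} |P(\lambda_j)|^q |\hat f_j|^q \Bigr)^{1/q}.
\]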

But this is a complete analog of (1.4.6) and we can proceed as in the proof of Proposition 1.4.7, by introducing a pseudo-spectral measure $\mu = \sum_{j \in \mathbb Z} |\hat f_j|^q \delta_{\lambda_j}$ and its regularized version $\hat\mu$ with the same kernel $Q(z, r)$. We arrive at the estimate
\[
\| (A_n - A_m) f \|_p^q = \| (V_n - V_m)(T) f \|_p^q \le C_p^q\, \| V_n - V_m \|_{q,\mu}^q \le C_1 C_p^q\, \hat\mu[1/m, 1/n].
\]
The estimate for covering numbers follows straightforwardly. Note that the proof works not only for rotations but also for all operators whose duals (with respect to a Fourier transform) act in $\ell^q$ as contractive multiplications. Any convolution operator with respect to a unit mass measure satisfies this condition.

For more general averages such as averages of Dunford–Schwartz operators, or of a contraction in $L^p$, we do not know whether an analogous formulation of (1.4.12) exists. This estimate cannot, however, be improved in general, as the following nice counterexample from Lifshits [1997] shows.

Lifshits' counterexample. Let $2 \le p < \infty$ and let $U : L^p(\mathbb T) \to L^p(\mathbb T)$ be the multiplication operator defined for any $f \in L^p(\mathbb T)$ and any $\theta \in \mathbb T$ by $Uf(\theta) = e^{i\theta} f(\theta)$. We write
\[
A_n = A^U_n = \frac{I + U + \cdots + U^{n-1}}{n},
\]
where $I$ is the identity operator. We shall prove for any $\varepsilon > 0$ small enough that
\[
\sup_{\|f\|_p = 1} N(A(f), p, \varepsilon/3) \ge \varepsilon^{-p}.
\]
Note that $A_n f(\theta) = V_n(\theta) f(\theta)$, so that for any positive integers $m, n$,
\[
\| A_n f - A_m f \|_p^p = \int_{\mathbb T} |V_n(\theta) - V_m(\theta)|^p\, |f(\theta)|^p\, d\theta. \tag{1.4.14}
\]

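Let $B$ be some fixed integer strictly greater than 12. From the standard estimates
\[
|V_m(\theta)| \le \pi (m\theta)^{-1}, \qquad |V_n(\theta) - 1| \le \pi (n-1)\theta/4 \le n\theta,
\]
valid for any $m, n, \theta$, we deduce that if $B/m \le \theta \le B^2/m$ and $n \le B^{-3} m$, then $|V_n(\theta) - V_m(\theta)| \ge 1/2$. It follows for any $f \in L^p(\mathbb T)$, any $m$ and any $n \le B^{-3} m$ that
\[
\| A_n f - A_m f \|_p^p \ge 2^{-p} \int_{B/m}^{B^2/m} |f(\theta)|^p\, d\theta.
\]
In particular, for any $f \in L^p(\mathbb T)$ and any positive integers $l > t$,
\[
\| A_{B^{3t}} f - A_{B^{3l}} f \|_p^p \ge 2^{-p} \int_{B^{1-3l}}^{B^{2-3l}} |f(\theta)|^p\, d\theta.
\]
Let $M$ be some positive integer and put $\varepsilon = M^{-1/p}$. Set
\[
f(\theta) = \sum_{l=1}^{M} \Bigl( \frac{1}{M (B^{2-3l} - B^{1-3l})} \Bigr)^{1/p} \mathbf 1_{[B^{1-3l},\, B^{2-3l}]}(\theta).
\]
Then $\|f\|_p = 1$ and
\[
\int_{B^{1-3l}}^{B^{2-3l}} |f(\theta)|^p\, d\theta = \frac{1}{M} = \varepsilon^p, \qquad l = 1, \dots, M.
\]
Thus
\[
\| A_{B^{3t}} f - A_{B^{3l}} f \|_p^p \ge 2^{-p} \varepsilon^p, \qquad 1 \le t < l \le M.
\]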

We deduce from these calculations that $N(A(f), p, \varepsilon/3) \ge M = \varepsilon^{-p}$, as claimed.

A variant in $L^1$. There is a general estimate of a weaker form of the square function in $L^1$, which is due to Jones, Rosenblatt and Wierdl [1999: Theorem 2.3], and can be stated as follows. Let $(X, \mathcal A, \mu)$ be a probability space. Consider mappings $T_n : L^1(\mu) \to L^1(\mu)$ and assume that each is strongly positive in the sense that $T_n f \ge 0$ for all $f \in L^1(\mu)$. We also assume that each $T_n$ is positively homogeneous, which means that $T_n(cf) = c\,T_n f$ for nonnegative $c$ and $f \in L^1(\mu)$. For instance, $T_n$ can be the absolute value of any linear operator from $L^1(\mu)$ to $L^1(\mu)$. Let $Sf(x) = \bigl( \sum_{n=1}^{\infty} T_n f(x)^2 \bigr)^{1/2}$. Then
\[
\sup_{\lambda \ge 0}\, \sup_{\|f\|_1 \le 1} \lambda \sum_{n=1}^{\infty} \mu\{ |T_n f| \ge \lambda \} \le C \ \Longrightarrow\ \sup_{\lambda \ge 0}\, \sup_{\|f\|_1 \le 1} \lambda\, \mu\{ Sf \ge \lambda \} \le 10C. \tag{1.4.15}
\]

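The proof is rather elementary. As $Sf \le S_1 f + S_2 f$, where
\[
S_1 f(x) = \Bigl( \sum_{n=1}^{\infty} (T_n f(x))^2\, \mathbf 1_{\{T_n f \le 1\}}(x) \Bigr)^{1/2}, \qquad S_2 f(x) = \Bigl( \sum_{n=1}^{\infty} (T_n f(x))^2\, \mathbf 1_{\{T_n f > 1\}}(x) \Bigr)^{1/2},
\]
we get
\begin{align*}
\mu\{ Sf \ge 2 \} &\le \mu\{ S_1 f \ge 1 \} + \mu\{ S_2 f \ge 1 \} \\
&\le \mu\{ S_1 f \ge 1 \} + \mu\Bigl\{ \sum_{n=1}^{\infty} (T_n f)^2\, \mathbf 1_{\{T_n f > 1\}} \ge 1 \Bigr\} \\
&\le \mu\{ S_1 f \ge 1 \} + \sum_{n=1}^{\infty} \mu\{ T_n f > 1 \} \le \mu\{ (S_1 f)^2 \ge 1 \} + C \|f\|_1 \\
&= \mu\Bigl\{ \sum_{n=1}^{\infty} (T_n f(x))^2 \sum_{k=0}^{\infty} \mathbf 1_{\{2^{-k-1} \le T_n f \le 2^{-k}\}} \ge 1 \Bigr\} + C \|f\|_1 \\
&\le \mu\Bigl\{ \sum_{k=0}^{\infty} 2^{-2k} \sum_{n=1}^{\infty} \mathbf 1_{\{2^{-k-1} \le T_n f \le 2^{-k}\}} \ge 1 \Bigr\} + C \|f\|_1 \\
&\le \sum_{k=0}^{\infty} 2^{-2k} \sum_{n=1}^{\infty} \mu\{ T_n f \ge 2^{-k-1} \} + C \|f\|_1 \le 5C \|f\|_1.
\end{align*}
Let $t > 0$. Replacing now $f$ by $f/t$ gives $t\,\mu\{ Sf \ge 2t \} \le 5C \|f\|_1$; hence
\[
\sup_{\lambda \ge 0} \lambda\, \mu\{ Sf \ge \lambda \} \le 10C \|f\|_1.
\]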
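The constant $5C$ in the last step of the chain can be unpacked as follows. This is only a sketch, assuming that the hypothesis of (1.4.15) is used in its homogeneous form $\sum_{n} \mu\{T_n f \ge \lambda\} \le C\|f\|_1/\lambda$ (which follows from positive homogeneity), and that the passage from $\mu\{\,\cdot \ge 1\}$ to the double sum is Markov's inequality.

```latex
\[
\sum_{k=0}^{\infty} 2^{-2k} \sum_{n=1}^{\infty} \mu\{ T_n f \ge 2^{-k-1} \}
\;\le\; \sum_{k=0}^{\infty} 2^{-2k} \cdot 2^{k+1}\, C \|f\|_1
\;=\; 2C \|f\|_1 \sum_{k=0}^{\infty} 2^{-k}
\;=\; 4C \|f\|_1 ,
\]
\[
\text{so that}\quad \mu\{ Sf \ge 2 \} \;\le\; 4C\|f\|_1 + C\|f\|_1 \;=\; 5C\|f\|_1 .
\]
```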

Extensions to the Hilbert transform. Results of the previous section have extensions to the discrete bilateral Hilbert transform
\[
H_n(f) = \sum_{0 < |j| \le n} \frac{U^j(f)}{j},
\]
where $U : H \to H$ is still a contraction in a Hilbert space $H$. A link between the Hilbert transform and ergodic means can be deduced from the following elementary identity (here the $a_j$ are complex numbers):
\[
\frac{1}{n} \sum_{j=1}^{n} a_j = S_n - \frac{1}{n} \sum_{j=1}^{n-1} S_j, \qquad S_j = \sum_{k=1}^{j} \frac{a_k}{k}, \qquad n \ge 1.
\]
The properties of $H_n$ were notably considered in the work of Jajte [1987].
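The elementary identity linking averages of the $a_j$ to the weighted partial sums $S_j = \sum_{k \le j} a_k/k$ is an instance of Abel summation and is easy to check numerically. The sketch below uses hypothetical helper names (`weighted_partial_sums`, `average_via_partial_sums`) that do not appear in the text.

```python
def weighted_partial_sums(a):
    """Return [S_1, ..., S_n] with S_j = sum_{k=1}^{j} a_k / k (the list a is 0-indexed)."""
    sums, running = [], 0.0
    for k, value in enumerate(a, start=1):
        running += value / k
        sums.append(running)
    return sums

def average_via_partial_sums(a):
    """Right-hand side of the identity: S_n - (1/n) * sum_{j=1}^{n-1} S_j."""
    n = len(a)
    s = weighted_partial_sums(a)
    return s[-1] - sum(s[:-1]) / n
```

For any finite list `a` of real or complex numbers, `average_via_partial_sums(a)` should agree with `sum(a) / len(a)` up to rounding.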


The associated sequence of spectral kernels is defined as
\[
W_n(\theta) = \sum_{0 < |j| \le n} \frac{e^{ij\theta}}{j} = 2i \sum_{0 < j \le n} \frac{\sin(j\theta)}{j}, \qquad W = \{W_n, n \ge 1\}.
\]
We also introduce the auxiliary sequence of functions
\[
\Lambda_n(\theta) = \sum_{j=n+1}^{\infty} \frac{\sin(j\theta)}{j}.
\]
Then we observe that for all $m \ge n$, $|W_m(\theta) - W_n(\theta)| = 2\,|\Lambda_n(\theta) - \Lambda_m(\theta)|$. By applying the Abel transform, we get

1.4.8 Lemma. For all $\theta \in [-\pi, \pi)$ the following inequalities hold:
a) for all $n \ge 1$, $|\Lambda_n(\theta)| \le 4/(n|\theta|)$;
b) for all $m \ge n$, $|\Lambda_n(\theta) - \Lambda_m(\theta)| \le (m - n)/m$;
c) for all $m \ge n$, $|\Lambda_n(\theta) - \Lambda_m(\theta)| \le (m - n)|\theta|$.

One can easily deduce from Lemma 1.4.8, in the same manner as for proving Lemma 1.4.3, that for all integers $m \ge n$ and each $\theta \in [-\pi, \pi)$,

\[
32 \int_{1/m}^{1/n} Q(\theta, y)\,dy + 4 \cdot \mathbf 1_{[\frac{1}{m}, \frac{1}{n})}(|\theta|) \ge |W_m(\theta) - W_n(\theta)|^2, \tag{1.4.16}
\]
where the kernel $Q(\theta, y)$ is defined in (1.4.1). In view of the definition of $\hat\mu$ (see (1.4.2)) we get

1.4.9 Theorem. Let $m \ge n$ be two positive integers. Then
\[
\| W_m - W_n \|_{2,\mu}^2 \le 3\,\hat\mu \bigl[ \tfrac{1}{m}, \tfrac{1}{n} \bigr).
\]

This result yields corollaries similar to those of Theorem 1.4.4. In particular, for any increasing sequence of positive integers $\{n_p, p \ge 1\}$,
\[
\sum_{p} \| H_{n_{p+1}}(f) - H_{n_p}(f) \|^2 \le 12(2\pi + 1) \|f\|^2, \tag{1.4.17}
\]
and for every $\varepsilon > 0$ the entropy number of the set $H(f) = \{H_n f, n \ge 1\}$ satisfies
\[
N(H(f), \varepsilon) \le 1 + \frac{12(2\pi + 1)}{\varepsilon^2} \le 1 + \frac{88}{\varepsilon^2}. \tag{1.4.18}
\]

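Problem 1. Let $\{T_t, t \in \mathbb R\}$ be a flow (Section 4.1) and consider the one-sided Hilbert transform
\[
A_n f = \int_1^n \frac{T^u + T^{-u}}{2}\, f\, \frac{du}{u},
\]
with corresponding spectral kernel
\[
V_n(t) = \int_1^n \frac{\cos ut}{u}\,du = \int_t^{nt} \frac{\cos v}{v}\,dv = I_t - I_{nt}, \qquad \text{where } I_t = \int_t^{\infty} \frac{\cos v}{v}\,dv.
\]
Prove that for all $t \ge 0$ and all reals $n \ge m \ge 1$,
\[
|V_m(t) - V_n(t)| \le 64 \int_{1/n}^{1/m} Q(t, x)\,dx,
\]
where
\[
Q(t, x) = \begin{cases} 1/t & \text{if } 0 \le x \le t, \\ |\log x|/x & \text{if } t \le x \le 1, \\ 0 & \text{elsewhere.} \end{cases}
\]
Let $\mu$ be any measure on $\mathbb R$ (for instance the spectral measure of $f$) and let $\hat\mu$ denote the regularized measure defined as
\[
\frac{d\hat\mu}{dx}(x) = \int_{\mathbb R} Q(|t|, x)\,\mu(dt), \qquad 0 < x < \infty.
\]
Show for all reals $n \ge m \ge 1$ that
\[
\| V_m - V_n \|_{2,\mu}^2 \le 64\,\hat\mu \bigl[ \tfrac{1}{n}, \tfrac{1}{m} \bigr].
\]
Besides,
\[
\hat\mu(\mathbb R) = \mu(\mathbb R) + \frac{1}{2} \int_{-1}^{1} \bigl| \log |t| \bigr|^2\,\mu(dt).
\]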

See also Remark 2.6.4. Extension to correlated sequences. We shall now indicate an extension to the Wiener space S of correlated sequences, namely the space consisting of sequences a = {a(n), n ∈ Z} such that for any integer k, the limit 1 γa (k) = lim a(j )a(j + k) n→∞ n n−1

j =0

exists. We provide S with the semi-norm

1 a(j )2 (a) = lim sup n n→∞ n−1

j =0

1/2

.


Entropy numbers associated to any subset E of (S, ) are denoted by N (E, , · ). For any a = {aj , j ∈ Z} ∈ 2 (Z), let us write ϕa (α) = j ∈Z e−2iπj α a(j ). Let also T be the right shift on the space of sequences: T (bn , n ∈ Z) = (bn+1 , n ∈ Z), and denote AN =

I + T + · · · + T N −1 , N

N = 1, 2, . . . .

1.4.10 Corollary. For any a ∈ S, there exists a constant K(a) depending on a only, such that K(a) ∀0 < ε ≤ K(a), N({ATn (a), n ≥ 1}, , ε) ≤ 2 . ε Proof. By the Bessel–Parseval equality, ∀N, M ∈ N,

(AN − AM )(a)(n)2 = |ϕa (α)|2 |VN (α) − VM (α)|2 dα. T

n∈Z

Hence, for any J ≥ 1, and all N, M such that N ∨ M < J , 1 J

|(AN − AM )(a)(n)|2

0≤n<J −N ∨M

≤

T

2 2 1 −2iπj α e a(j ) VN (α) − VM (α) dα. J 0≤j <J

We can view the right integral as an integration with respect to the measure J,a (dα) =

2 1 −2iπj α e a(j ) dα. J

(1.4.19)

0≤j <J

Assume now that a ∈ S. By 1.1.3 there exists a unique nonnegative bounded measure a on T, the spectral measure of the sequence a, such that ∀m ∈ Z,

γa (m) =

π

−π

e2iπ mα a (dα).

Further, we know that the family of measures J,a weakly converges to a . We thus deduce 1 2 |(AN − AM ) (a)(n)| ≤ |VN (α) − VM (α)|2 a (dα) lim sup T J →∞ J 0≤n<J

for all N, M ≥ 1. The result now follows easily.


Continuous time and Fourier inversion formula. The results stated in Section 1.4 remain valid if we consider a semigroup of unitary operators $\{U_t, t \in \mathbb R^+\}$ in a Hilbert space $H$ and the corresponding averages
\[
A_T(f) = \frac{1}{T} \int_0^T U_t(f)\,dt.
\]
In this case, one must replace the space $L^2([-\pi, \pi), \mu)$ with $L^2(\mathbb R, \mu)$, and consider the kernels
\[
V_T(\theta) = \frac{e^{iT\theta} - 1}{iT\theta}, \qquad \widetilde V_T(y) = \Re\, V_T(y) = \frac{\sin Ty}{Ty}. \tag{1.4.20}
\]
We define by continuity $V_T(0) = \widetilde V_T(0) \equiv 1$. Since the basic elementary inequalities
\[
|V_{T_2}(\theta) - V_{T_1}(\theta)| \le \min\Bigl\{ \frac{T_2 - T_1}{2}\,|\theta|,\ \frac{2(T_2 - T_1)}{T_2} \Bigr\}, \quad T_2 \ge T_1, \qquad |V_{T_1}(\theta)| \le \frac{2}{T_1 |\theta|}, \tag{1.4.21}
\]
hold true, we still have the analogue of Theorem 1.4.4,
\[
\| V_{T_2} - V_{T_1} \|_{2,\mu}^2 \le 8\,\hat\mu \bigl[ \tfrac{1}{T_2}, \tfrac{1}{T_1} \bigr]. \tag{1.4.22}
\]
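The elementary bounds behind the continuous-time estimate, namely $|V_{T_2}(\theta) - V_{T_1}(\theta)| \le \min\{(T_2 - T_1)|\theta|/2,\ 2(T_2 - T_1)/T_2\}$ for $T_2 \ge T_1$ and $|V_{T_1}(\theta)| \le 2/(T_1|\theta|)$, can be spot-checked numerically. The sketch below assumes the kernel $V_T(\theta) = (e^{iT\theta} - 1)/(iT\theta)$; the function names and the tolerance `tol` are illustrative choices, not part of the text.

```python
import cmath

def kernel(T, theta):
    """Continuous-time kernel V_T(theta) = (e^{iT theta} - 1) / (iT theta), with V_T(0) = 1."""
    x = T * theta
    if abs(x) < 1e-12:
        return 1.0 + 0.0j
    return (cmath.exp(1j * x) - 1.0) / (1j * x)

def satisfies_basic_inequalities(T1, T2, theta, tol=1e-9):
    """Check both elementary bounds for T2 >= T1 > 0 and theta != 0, up to rounding."""
    diff = abs(kernel(T2, theta) - kernel(T1, theta))
    bound = min((T2 - T1) * abs(theta) / 2.0, 2.0 * (T2 - T1) / T2)
    decay_ok = abs(kernel(T1, theta)) <= 2.0 / (T1 * abs(theta)) + tol
    return diff <= bound + tol and decay_ok
```

A call such as `satisfies_basic_inequalities(1.0, 3.0, 0.3)` tests a single triple; looping over a grid gives a rough consistency check of the two bounds.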

All corollaries about entropy numbers and square functions follow directly. There is an interesting application to the Fourier inversion formula, which is worth noting. If $\nu$ is a distribution function on $\mathbb R$ and $\hat\nu(t) = \int_{\mathbb R} e^{itx}\,\nu(dx)$ denotes its characteristic function, then (see for instance Theorem 6.2.4 in Chung [1970])
\[
\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} e^{-itx_0}\, \hat\nu(t)\,dt = \nu\{x_0\}. \tag{1.4.23}
\]
From this result it also follows that
\[
\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^2\,dt = \sum_{x \in \mathbb R} \nu(\{x\})^2. \tag{1.4.24}
\]
And more generally, for any positive integer $n$,
\[
\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^{2n}\,dt = \sum_{x \in \mathbb R} \nu^{*n}(\{x\})^2. \tag{1.4.25}
\]
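For a purely atomic measure, formula (1.4.24) can be illustrated in closed form. The sketch below assumes $\nu = \sum_k p_k \delta_{x_k}$ with finitely many atoms, so that $|\hat\nu(t)|^2 = \sum_{k,l} p_k p_l e^{it(x_k - x_l)}$ and the time average of each exponential is a sinc factor; the function names are hypothetical.

```python
import math

def sinc_kernel(T, y):
    """sin(Ty)/(Ty), extended by continuity to 1 at y = 0."""
    x = T * y
    return 1.0 if x == 0.0 else math.sin(x) / x

def mean_square_char_function(atoms, T):
    """(1/2T) * integral over [-T, T] of |nu_hat(t)|^2 dt for nu = sum_k p_k * delta_{x_k}.

    atoms is a list of pairs (x_k, p_k). The integral is evaluated exactly, using
    (1/2T) * int_{-T}^{T} e^{it(x - y)} dt = sin(T(x - y)) / (T(x - y)).
    """
    return sum(p * q * sinc_kernel(T, x - y) for x, p in atoms for y, q in atoms)
```

With `atoms = [(0.0, 0.5), (1.0, 0.3), (2.5, 0.2)]`, the value of `mean_square_char_function(atoms, T)` approaches the sum of squared masses as `T` grows, since only the diagonal terms survive.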

That (1.4.24), (1.4.25) follow from (1.4.23) is simple, and is better seen using a probabilistic language (by homogeneity, there is no loss in assuming $\nu(\mathbb R) = 1 = \hat\nu(0)$). If $X$ is a real-valued random variable defined on some probability space $(\Omega, \mathcal A, \mathbf P)$ such that $X(\mathbf P) = \nu$, and $X'$ denotes an independent copy of $X$, then $Z = X - X'$ has distribution function $\nu * \tilde\nu$, where $\tilde\nu(A) = \nu(-A)$ for any $A \in \mathcal A$. And $|\hat\nu(t)|^2 = \widehat{\nu * \tilde\nu}(t) = \mathbf E\, e^{it(X - X')}$, so (1.4.23) means
\[
\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \mathbf E\, e^{it(X - X')}\,dt = (\nu * \tilde\nu)(\{0\}) = \int_{\mathbb R} \tilde\nu(\{-y\})\,\nu(dy) = \sum_{x \in \mathbb R} \nu(\{x\})^2 = \sum_{x \in \mathbb R} \mathbf P\{X = x\}^2,
\]
which yields (1.4.24). Let $Z_1, Z_2, \dots, Z_n$ be independent copies of $Z$ and set $S_n = Z_1 + \cdots + Z_n$. Applying the above to $S_n$ gives (since $\mathbf E\, e^{itS_n} = |\hat\nu(t)|^{2n}$)
\[
\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^{2n}\,dt = \mathbf P\{S_n = 0\} = \sum_{x \in \mathbb R} \nu^{*n}(\{x\})^2,
\]

which is (1.4.25). Recall briefly for our purpose how to obtain (1.4.23). As $|\Re V_T(y)| \le 1$ everywhere and $\Re V_T(y) = \sin(Ty)/(Ty) \to 0$ as $T$ tends to infinity for all $y \ne 0$, by the dominated convergence theorem it holds that for any real $x_0$,
\[
\int_{\mathbb R} \Re V_T(x - x_0)\,\nu(dx) \to \nu\{x_0\} \quad \text{as $T$ tends to infinity.} \tag{1.4.26}
\]
And so
\[
M_T(x_0) := \frac{1}{2T} \int_{-T}^{T} e^{-itx_0}\, \hat\nu(t)\,dt = \int_{\mathbb R \setminus \{x_0\}} \frac{\sin T(x - x_0)}{T(x - x_0)}\,\nu(dx) + \nu\{x_0\} = \int_{\mathbb R} \Re V_T(x - x_0)\,\nu(dx) \to \nu\{x_0\}. \tag{1.4.27}
\]
The Fourier inversion formula can be made a little more precise. Not only does $M_T(x_0) \to \nu\{x_0\}$, but in fact for any nondecreasing sequence $\mathcal T = \{T_k, k \ge 1\}$ of positive reals,
\[
\sum_{k=1}^{\infty} |M_{T_{k+1}}(x_0) - M_{T_k}(x_0)|^2 \le 24\,\nu^2(\mathbb R). \tag{1.4.28}
\]
This time the total mass of the measure appears, unlike in (1.4.23). However, (1.4.28) implies the convergence of $M_T(x_0)$ as $T \to \infty$. By the Cauchy–Schwarz inequality, we first observe that
\[
|M_{T_2}(x_0) - M_{T_1}(x_0)|^2 = \Bigl| \int_{\mathbb R} \bigl[ \Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0) \bigr]\,\nu(dx) \Bigr|^2 \le \nu(\mathbb R) \cdot \int_{\mathbb R} \bigl[ \Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0) \bigr]^2\,\nu(dx). \tag{1.4.29}
\]

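By (1.4.22),
\[
\int_{\mathbb R} \bigl[ \Re V_{T_2}(x - x_0) - \Re V_{T_1}(x - x_0) \bigr]^2\,\nu(dx) \le \int_{\mathbb R} |V_{T_2}(y) - V_{T_1}(y)|^2\,\nu_{x_0}(dy) \le 8\,\hat\nu_{x_0} \bigl[ \tfrac{1}{T_2}, \tfrac{1}{T_1} \bigr], \tag{1.4.30}
\]
where for any real $y$ we write $\nu_y(A) = \nu(A - y)$, for every $A \in \mathcal B(\mathbb R)$. Furthermore,
\[
\hat\nu_{x_0}(\mathbb R) = \iint Q(\theta, x)\,dx\,\nu_{x_0}(d\theta) = \int \Bigl( \theta^2 \int_{|\theta| < |x|} |x|^{-3}\,dx + |\theta|^{-1} \int_{|x| < |\theta|} dx \Bigr)\,\nu_{x_0}(d\theta) \le 3\,\nu_{x_0}(\mathbb R) = 3\,\nu(\mathbb R). \tag{1.4.31}
\]
Let $\mathcal T = \{T_k, k \ge 1\}$ be any nondecreasing sequence of positive reals. It follows from the above estimates that
\[
\sum_{k=1}^{\infty} |M_{T_{k+1}}(x_0) - M_{T_k}(x_0)|^2 \le 24\,\nu^2(\mathbb R).
\]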

1.5 Moving averages

In a similar way, one can develop for moving averages the idea of spectral regularization relative to some suitable class of regularizing kernels. Let $(H, \|\cdot\|)$ be a Hilbert space and consider an arbitrary contraction $U : H \to H$. Let also $\phi : \mathbb R^+ \to \mathbb R^+$ be a nondecreasing differentiable function such that $\phi(\mathbb N) \subset \mathbb N$. Consider, for any positive integer $n$, the sequence of moving averages
\[
B_n^{U,\phi} = B_n^{\phi} = \frac{1}{n} \sum_{j=\phi(n)}^{\phi(n)+n-1} U^j.
\]
It is important to underline here the considerable structural differences which appear when passing from the study of fixed averages to that of moving averages. For examples, see for instance Section 4.1. We now introduce the associated spectral kernels
\[
W_n(\theta) = e^{i\phi(n)\theta}\, V_n(\theta). \tag{1.5.1}
\]
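The factorization (1.5.1) simply says that shifting the summation window by $\phi(n)$ multiplies the fixed-average kernel by a unimodular factor. A minimal numerical sketch follows, assuming $V_n(\theta) = n^{-1}\sum_{j=0}^{n-1} e^{ij\theta}$; the parameter `shift` stands in for $\phi(n)$, and all function names are hypothetical.

```python
import cmath

def fixed_kernel(n, theta):
    """V_n(theta) = (1/n) * sum_{j=0}^{n-1} e^{i j theta}."""
    return sum(cmath.exp(1j * j * theta) for j in range(n)) / n

def moving_kernel_direct(n, shift, theta):
    """(1/n) * sum_{j=shift}^{shift+n-1} e^{i j theta}; shift plays the role of phi(n)."""
    return sum(cmath.exp(1j * j * theta) for j in range(shift, shift + n)) / n

def moving_kernel_factored(n, shift, theta):
    """e^{i * shift * theta} * V_n(theta), the factored form of the moving kernel."""
    return cmath.exp(1j * shift * theta) * fixed_kernel(n, theta)
```

Comparing `moving_kernel_direct(n, shift, theta)` with `moving_kernel_factored(n, shift, theta)`, for example with `shift = n * n`, confirms the factorization up to rounding.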

By definition, for any positive integers $n, m$,
\[
|W_m(\theta) - W_n(\theta)| \le |V_m(\theta) - V_n(\theta)| + |e^{i(\phi(m) - \phi(n))\theta} - 1|\,|V_m(\theta)|.
\]
The first difference was estimated in Lemma 1.4.2. Concerning the second difference, we have
\[
|e^{i(\phi(m) - \phi(n))\theta} - 1| \le |\theta|(\phi(m) - \phi(n)) \wedge 2 \qquad \text{and} \qquad |V_m(\theta)| \le \frac{\pi}{|\theta| m} \wedge 1.
\]
It follows that
\[
|W_m(\theta) - W_n(\theta)|^2 \le 2\,|V_m(\theta) - V_n(\theta)|^2 + 2 \bigl\{ |\theta|^2 (\phi(m) - \phi(n))^2 \wedge 2 \bigr\} \Bigl\{ \frac{\pi^2}{|\theta|^2 m^2} \wedge 1 \Bigr\}.
\]

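1.5.1 Lemma. Let $m \ge n$ be two positive integers. Then, for any $\theta \in [-\pi, \pi)$,
\[
\int_{1/m}^{1/n} \phi'(1/y)\, Q(\theta, y)\,dy \ge \bigl\{ |\theta|^2 (\phi(m) - \phi(n))^2 \wedge 1 \bigr\} \Bigl\{ \frac{1}{|\theta|^2 m^2} \wedge 1 \Bigr\}, \tag{1.5.2}
\]
where the regularizing kernel $Q$ is defined in (1.4.1).

Proof. Consider two cases.
1) $|\theta| \ge \frac{1}{m}$. Then
\begin{align*}
\int_{1/m}^{1/n} \phi'(1/y) \Bigl( \frac{|\theta|}{y^2} \wedge \frac{1}{|\theta|} \Bigr) dy &= \int_n^m \phi'(u) \Bigl( |\theta| \wedge \frac{1}{|\theta| u^2} \Bigr) du \\
&\ge \int_n^m \phi'(u) \Bigl( |\theta| \wedge \frac{1}{|\theta| m^2} \Bigr) du = \frac{\phi(m) - \phi(n)}{|\theta| m^2} = \frac{(\phi(m) - \phi(n))|\theta|}{|\theta|^2 m^2}.
\end{align*}
By using the elementary inequality $x \ge x^2 \wedge 1$ with $x = (\phi(m) - \phi(n))|\theta|$, we obtain the requested result.
2) $|\theta| \le \frac{1}{m}$. Then
\[
\int_{1/m}^{1/n} \phi'(1/y)\, Q(\theta, y)\,dy = \int_{1/m}^{1/n} \phi'(1/y)\, \frac{|\theta|}{y^2}\,dy = |\theta| \int_n^m \phi'(u)\,du = |\theta| (\phi(m) - \phi(n)).
\]
By means of the same elementary inequality, we have $|\theta|(\phi(m) - \phi(n)) \ge |\theta|^2 (\phi(m) - \phi(n))^2 \wedge 1$, and the lemma is thus completely proved.

We deduce from Lemmas 1.4.3 and 1.5.1 that
\[
|W_m(\theta) - W_n(\theta)|^2 \le \int_{1/m}^{1/n} \bigl( 8\pi + 4\pi^2 \phi'(1/y) \bigr) Q(\theta, y)\,dy + 8 \cdot \mathbf 1_{[\frac{1}{m}, \frac{1}{n})}(|\theta|). \tag{1.5.3}
\]
Let now $f \in H$ with spectral measure $\mu_f$. Introduce a regularization $\hat\mu_f$ of the measure $\mu_f$, relative to the kernel $Q$, by putting
\[
\hat\mu_f(dy) = \int_{-\pi}^{\pi} \bigl( 8\pi + 4\pi^2 \phi'(1/y) \bigr) Q(\theta, y)\,\mu_f(d\theta)\,dy + 8\,\mu_f(dy). \tag{1.5.4}
\]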

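1.5.2 Theorem (Spectral regularization inequality for moving averages). For any positive integers $m \ge n$,
\[
\| B_m^{\phi}(f) - B_n^{\phi}(f) \|^2 \le \hat\mu_f \bigl[ \tfrac{1}{m}, \tfrac{1}{n} \bigr).
\]

Proof. By integrating (1.5.3) with respect to the measure $\mu_f$, we get
\[
\int_{-\pi}^{\pi} |W_m - W_n|^2\,\mu_f(d\theta) \le \hat\mu_f \bigl[ \tfrac{1}{m}, \tfrac{1}{n} \bigr).
\]
The result thus follows by means of the spectral inequality.

1.5.3 Corollary (Square function of moving averages). Let
\[
\Phi(\theta) = \frac{1}{|\theta|} \int_{\frac{1}{|\theta|} \vee 1}^{\infty} \frac{\phi'(u)}{u^2}\,du + \phi(|\theta|^{-1})\,|\theta|\, \mathbf 1_{\{|\theta| \le 1\}}. \tag{1.5.5}
\]
Then, for any $f \in H$ with spectral measure $\mu$, and for any nondecreasing sequence of positive integers $\{n_p, p \ge 1\}$, we have
\[
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 \le 4\pi^2 \int_{-\pi}^{\pi} \Phi(\theta)\,\mu(d\theta) + 8(\pi + 1)\,\mu[-\pi, \pi). \tag{1.5.6}
\]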

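Proof. By using (1.5.3) and (1.5.4) we get
\begin{align*}
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 &\le \sum_{p=1}^{\infty} \hat\mu \bigl[ \tfrac{1}{n_{p+1}}, \tfrac{1}{n_p} \bigr) \le \hat\mu \bigl( 0, \tfrac{1}{n_1} \bigr] \\
&\le \int_0^1 dy \int_{-\pi}^{\pi} \bigl( 8\pi + 4\pi^2 \phi'(1/y) \bigr) Q(\theta, y)\,\mu(d\theta) + 8\,\mu(0, 1] \\
&\le (8\pi + 8)\,\mu[-\pi, \pi) + 4\pi^2 \int_{-\pi}^{\pi} \int_0^1 \phi'(1/y) \Bigl( \frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2} \Bigr) dy\,\mu(d\theta).
\end{align*}
Further, by making the change of variables $y = u^{-1}$, we get
\begin{align*}
\int_0^1 \phi'(1/y) \Bigl( \frac{1}{|\theta|} \wedge \frac{|\theta|}{y^2} \Bigr) dy &\le \frac{1}{|\theta|} \int_0^{|\theta| \wedge 1} \phi'(1/y)\,dy + |\theta| \int_{|\theta| \wedge 1}^{1} \frac{\phi'(1/y)}{y^2}\,dy \\
&\le \frac{1}{|\theta|} \int_{\frac{1}{|\theta|} \vee 1}^{\infty} \frac{\phi'(u)}{u^2}\,du + \phi(|\theta|^{-1})\,|\theta|\, \mathbf 1_{\{|\theta| \le 1\}} \\
&= \Phi(\theta).
\end{align*}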

1.5.4 Remarks. 1. For sublinear functions $\phi$ such that $\sup_u \phi'(u) \le C$, the function $\Phi$ given in (1.5.5) is uniformly bounded; and thus
\[
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 \le \bigl( 4\pi^2 \sup_{\theta} \Phi(\theta) + 8\pi + 8 \bigr)\,\mu[-\pi, \pi) \le 8(\pi^2 C + \pi + 1) \|f\|^2.
\]
Estimate (1.5.6) remains reasonable for functions $\phi$ growing faster than linearly, but only for $f$ such that $\int \Phi(\theta)\,\mu(d\theta) < \infty$. If, for instance, $\phi(u) \approx u^{\alpha}$, $\alpha \in [1, 2)$, the estimate is efficient under the spectral hypothesis $\int |\theta|^{1-\alpha}\,\mu(d\theta) < \infty$. If $\alpha \ge 2$, then $\Phi(\theta)$ is infinite, and estimate (1.5.6) is no longer efficient.

2. Assume that the sequence $\{n_p, p \ge 1\}$ grows very fast, and for instance that the following condition is realized:
\[
M = \sup_{l \ge 1} \sum_{p=l}^{\infty} \Bigl( \frac{n_l}{n_p} \Bigr)^2 < \infty. \tag{G1}
\]
An equivalent reformulation of (G1) is: $\exists q \in \mathbb N$, $c > 0$ : $\forall p$, $(1 + c)\,n_p \le n_{p+q}$. Consider also a stronger assumption:
\[
\exists q \in \mathbb N,\ c > 0 :\ \forall p \quad \max\bigl\{ \phi(n_p)\,;\ (1 + c)\,n_p \bigr\} \le n_{p+q}. \tag{G2}
\]
This assumption is verified if for instance $\{n_p, p \ge 1\}$ is a sequence with superexponential growth, namely $n_p \sim \exp\{C a^p\}$, and $\phi$ is polynomial, $\phi(u) \sim u^{\alpha}$. We introduce the inverse function of the sequence $\{n_p, p \ge 1\}$,
\[
L(x) = \begin{cases} \sup\{p : n_p \le x\} & \text{if } x \ge n_1, \\ 0 & \text{if } x < n_1. \end{cases}
\]

1.5.5 Proposition. Assume that the sequence $\{n_p, p \ge 1\}$ satisfies assumption (G1). Then, for any element $f \in H$,
\[
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 \le 2(6\pi + 2 + 2\pi^2 M) \|f\|^2 + 4 I(\mu_f), \tag{1.5.7}
\]
where
\[
I(\mu_f) = \int \bigl[ L(|\theta|^{-1}) - L\bigl( \phi^{-1}(|\theta|^{-1}) \bigr) \bigr]^{+} \mu_f(d\theta). \tag{1.5.8}
\]
Further, if assumption (G2) is verified, then we also have
\[
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 \le 2(6\pi + \pi^2 M + q + 1) \|f\|^2. \tag{1.5.9}
\]

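Proof. By means of (1.5.2) we have
\begin{align*}
\sum_{p=1}^{\infty} \| B^{\phi}_{n_{p+1}}(f) - B^{\phi}_{n_p}(f) \|^2 &\le 2 \int \sum_{p=1}^{\infty} |V_{n_{p+1}}(\theta) - V_{n_p}(\theta)|^2\,\mu_f(d\theta) \\
&\quad + 2 \int \sum_{p=1}^{\infty} \bigl\{ |\theta|^2 (\phi(n_{p+1}) - \phi(n_p))^2 \wedge 2 \bigr\} \Bigl\{ \frac{\pi^2}{|\theta|^2 n_{p+1}^2} \wedge 1 \Bigr\}\,\mu_f(d\theta).
\end{align*}
The first term can be bounded by $12\pi \|f\|^2$. Consider the second term. Fix $\theta$, and put $p^- = L\bigl( \phi^{-1}(|\theta|^{-1}) \bigr) - 1$, $p^* = L(|\theta|^{-1})$. Then
\[
\sum_{p=1}^{p^-} (\phi(n_{p+1}) - \phi(n_p))^2 \le \sum_{p=1}^{p^-} (\phi(n_{p+1}) - \phi(n_p))\,\phi(n_{p^-+1}) \le \phi(n_{p^-+1})^2 \le |\theta|^{-2}.
\]
Similarly, (G1) implies
\[
\sum_{p=p^*}^{\infty} \frac{1}{n_{p+1}^2} \le \frac{M}{n_{p^*+1}^2} \le M |\theta|^2.
\]
It follows that the second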

term inside the integral is bounded above by 1 + 2π 2 M + 2

∞

"

# 1{p−