Dynamical systems and processes

IRMA Lectures in Mathematics and Theoretical Physics 14 Edited by Chr...

Author: Weber M.


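First, evaluate the supremum of $Q_2$. Introduce the following random process:

$$X_\varepsilon(\gamma) = \sum_{\nu<j\le\tau} \alpha_j \sum_{n\in E_j} \varepsilon_n n^{-\sigma}\beta_{n/p_j}, \qquad \gamma\in G,$$

where $\gamma = \big((\alpha_j)_{\nu<j\le\tau}, (\beta_m)_{1\le m\le N/2}\big)$ and $G = \{\gamma : |\alpha_j|\vee|\beta_m|\le 1,\ \nu<j\le\tau,\ 1\le m\le N/2\}$. Write

$$Q_2(z) = \sum_{\nu<j\le\tau} e^{2i\pi z_j}\sum_{n\in E_j}\varepsilon_n n^{-\sigma}\, e^{2i\pi\{\sum_{k\ne j} a_k(n)z_k + [a_j(n)-1]z_j\}}.$$

Considering separately the imaginary and real parts of $e^{2i\pi a_j(n)z_j}$ and $e^{2i\pi\sum_{k\ne j}a_k(n)z_k}$ easily shows that

$$\sup_{z\in\mathbb{T}^\tau}|Q_2(z)| \le 4\sup_{\gamma\in G}|X_\varepsilon(\gamma)|.$$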


8 The metric entropy method

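By the contraction principle,

$$\mathbf{E}\sup_{z\in\mathbb{T}^\tau}|Q_2(z)| \le 4\sqrt{\frac{\pi}{2}}\;\mathbf{E}\sup_{\gamma\in G}|X(\gamma)|,$$

where $\{X(\gamma), \gamma\in G\}$ is the same process as $X_\varepsilon(\gamma)$ except that the Rademacher random variables $\varepsilon_n$ are replaced by independent $N(0,1)$ random variables $\mu_n$:

$$X(\gamma) = \sum_{\nu<j\le\tau}\alpha_j\sum_{n\in E_j}\mu_n n^{-\sigma}\beta_{n/p_j}.$$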

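The problem now reduces to estimating the supremum of the real-valued Gaussian process $X$. Towards this aim, we examine the $L^2$-norm of its increments:

$$\|X_\gamma - X_{\gamma'}\|_2^2 = \sum_{\nu<j\le\tau}\sum_{n\in E_j} n^{-2\sigma}\big[\alpha_j\beta_{n/p_j} - \alpha'_j\beta'_{n/p_j}\big]^2 \le 2\sum_{\nu<j\le\tau}\sum_{n\in E_j} n^{-2\sigma}\big[(\alpha_j-\alpha'_j)^2 + (\beta_{n/p_j}-\beta'_{n/p_j})^2\big],$$

where we have used the identity

$$\alpha_j\beta_{n/p_j} - \alpha'_j\beta'_{n/p_j} = (\alpha_j-\alpha'_j)\,\beta_{n/p_j} + (\beta_{n/p_j}-\beta'_{n/p_j})\,\alpha'_j,$$

together with $|\beta_{n/p_j}|\le 1$ and $|\alpha'_j|\le 1$.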

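The “α” component part is easily controlled as follows:

$$\sum_{\nu<j\le\tau}\sum_{n\in E_j} n^{-2\sigma}(\alpha_j-\alpha'_j)^2 \le \sum_{\nu<j\le\tau}(\alpha_j-\alpha'_j)^2\, p_j^{-2\sigma}\sum_{m\le N/p_j} m^{-2\sigma} \le C_\sigma\sum_{\nu<j\le\tau}(\alpha_j-\alpha'_j)^2\,\frac{N^{1-2\sigma}}{p_j}. \tag{8.9.7}$$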

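For the “β” component, we have

$$\sum_{\nu<j\le\tau}\sum_{n\in E_j}\frac{(\beta_{n/p_j}-\beta'_{n/p_j})^2}{n^{2\sigma}} \le \sum_{m\le N/p_\nu}(\beta_m-\beta'_m)^2\sum_{\substack{\nu<j\le\tau\\ mp_j\le N}}\frac{1}{(mp_j)^{2\sigma}} := \sum_{m\le N/p_\nu} K_m^2\,(\beta_m-\beta'_m)^2. \tag{8.9.8}$$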

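Now we evaluate the coefficients $K_m$. Consider two cases.

1) $m\le N/p_\tau$. Then $mp_j\le mp_\tau\le N$ for all $j\le\tau$ and, by using the standard estimate ([Hardy–Wright: 1979], Theorem 8) $p_j\sim j\log j$, we have

$$K_m^2 = \sum_{\nu<j\le\tau}(mp_j)^{-2\sigma} \le m^{-2\sigma}\sum_{j\le\tau}p_j^{-2\sigma} \le C m^{-2\sigma}\sum_{j\le\tau}(j\log j)^{-2\sigma} = C_\sigma m^{-2\sigma}\tau^{1-2\sigma}(\log\tau)^{-2\sigma} \le C_\sigma m^{-2\sigma}\frac{\tau}{p_\tau^{2\sigma}}. \tag{8.9.9}$$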


8.9 An application to random Dirichlet polynomials

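Thus

$$\sum_{m\le N/p_\tau} K_m \le C_\sigma\frac{\tau^{1/2}}{p_\tau^{\sigma}}\sum_{m\le N/p_\tau} m^{-\sigma} \le C_\sigma\Big(\frac{N}{p_\tau}\Big)^{1-\sigma}\frac{\tau^{1/2}}{p_\tau^{\sigma}} = \frac{C_\sigma N^{1-\sigma}\tau^{1/2}}{p_\tau} \le \frac{C_\sigma N^{1-\sigma}}{\tau^{1/2}\log\tau}.$$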

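2) $N/p_\nu\ge m> N/p_\tau$. Then take the unique $k\in(\nu,\tau]$ such that $N/p_k< m\le N/p_{k-1}$. We have

$$K_m^2 = \sum_{\nu<j\le k-1}(mp_j)^{-2\sigma} \le m^{-2\sigma}\sum_{j\le k-1}p_j^{-2\sigma} \le C_\sigma m^{-2\sigma}\sum_{j\le k}(j\log j)^{-2\sigma} \le C_\sigma m^{-2\sigma}\frac{k^{1-2\sigma}}{(\log k)^{2\sigma}} \le C_\sigma m^{-2\sigma}\frac{k}{p_k^{2\sigma}} \le C_\sigma m^{-2\sigma}\frac{k}{(N/m)^{2\sigma}} = C_\sigma\frac{k}{N^{2\sigma}}.$$

Since $k\log k\le Cp_k\le C\,\frac{N}{m}$, we have

$$k\le C\,\frac{N}{m}\Big(\log\frac{N}{m}\Big)^{-1}.$$

We arrive at $K_m\le C_\sigma N^{-\sigma}\big(\frac{N}{m}\big)^{1/2}\big(\log\frac{N}{m}\big)^{-1/2}$. It follows that

$$\sum_{N/p_\tau<m\le N/p_\nu} K_m \le C_\sigma N^{-\sigma}\sum_{m\le N/p_\nu}\Big(\frac{N}{m}\Big)^{1/2}\Big(\log\frac{N}{m}\Big)^{-1/2} \le C_\sigma N^{1-\sigma}\int_0^{1/p_\nu}u^{-1/2}(\log(1/u))^{-1/2}\,du \le C_\sigma N^{1-\sigma}p_\nu^{-1/2}(\log p_\nu)^{-1/2} \le \frac{C_\sigma N^{1-\sigma}}{\nu^{1/2}\log\nu}.$$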

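Now define a second Gaussian process by putting, for all $\gamma\in G$,

$$Y(\gamma) = \sum_{\nu<j\le\tau}\Big(\frac{N^{1-2\sigma}}{p_j}\Big)^{1/2}\alpha_j\xi'_j + \sum_{m\le N/p_\nu}K_m\beta_m\xi''_m := Y'_\gamma + Y''_\gamma,$$

where $\xi'_i$, $\xi''_j$ are independent $N(0,1)$ random variables. It follows from (8.9.7) and (8.9.8) that for some suitable constant $C_\sigma$, one has the comparison relations: for all $\gamma,\gamma'\in G$,

$$\|X_\gamma - X_{\gamma'}\|_2 \le C_\sigma\|Y_\gamma - Y_{\gamma'}\|_2.$$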


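By virtue of the comparison Lemma 10.2.3, since $X_0 = Y_0 = 0$, we have

$$\mathbf{E}\sup_{\gamma\in G}|X_\gamma| \le 2\,\mathbf{E}\sup_{\gamma\in G}X_\gamma \le 2C_\sigma\,\mathbf{E}\sup_{\gamma\in G}Y_\gamma \le 2C_\sigma\,\mathbf{E}\sup_{\gamma\in G}|Y_\gamma|.$$

It remains to evaluate the supremum of $Y$. First of all,

$$\mathbf{E}\sup_{\gamma\in G}|Y'(\gamma)| \le N^{\frac12-\sigma}\sum_{\nu<j\le\tau}p_j^{-1/2}.$$

By (8.9.10), we have

$$\sum_{\nu<j\le\tau}p_j^{-1/2} \le \sum_{1<j\le\tau}p_j^{-1/2} \le \frac{C\tau^{1/2}}{(\log\tau)^{1/2}},$$

thus

$$\mathbf{E}\sup_{\gamma\in G}|Y'(\gamma)| \le C\,N^{\frac12-\sigma}\frac{\tau^{1/2}}{(\log\tau)^{1/2}}. \tag{8.9.10}$$

To control the supremum of $Y''$, we use our estimates for the sums of $K_m$ and write

$$\mathbf{E}\sup_{\gamma\in G}|Y''(\gamma)| \le \sum_{m\le N/p_\nu}K_m \le C_\sigma\Big[\frac{N^{1-\sigma}}{\nu^{1/2}\log\nu} + \frac{N^{1-\sigma}}{\tau^{1/2}\log\tau}\Big] \le \frac{C_\sigma N^{1-\sigma}}{\nu^{1/2}\log\nu}. \tag{8.9.11}$$

Now, we turn to the supremum of $Q_1$. Towards this aim, introduce the auxiliary Gaussian process

$$\Upsilon(z) = \sum_{P^+(n)\le p_\nu} n^{-\sigma}\big\{\theta_n\cos 2\pi\langle a(n),z\rangle + \theta'_n\sin 2\pi\langle a(n),z\rangle\big\}, \qquad z\in\mathbb{T}^\nu,$$

where $\theta_i$, $\theta'_j$ are independent $N(0,1)$ random variables. By symmetrization,

$$\mathbf{E}\sup_{z\in\mathbb{T}^\nu}|Q_1(z)| \le \sqrt{8\pi}\;\mathbf{E}\sup_{z\in\mathbb{T}^\nu}|\Upsilon(z)|,$$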

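so that we are again led to evaluating the supremum of a real-valued Gaussian process. For $z, z'\in\mathbb{T}^\nu$ put $\|\Upsilon(z)-\Upsilon(z')\|_2 := d(z,z')$, and observe that

$$d(z,z')^2 = 4\sum_{n:\,P^+(n)\le p_\nu}\frac{1}{n^{2\sigma}}\sin^2\big(\pi\langle a(n),z-z'\rangle\big) \tag{8.9.12}$$

$$\le 4\pi^2\sum_{n:\,P^+(n)\le p_\nu}\frac{1}{n^{2\sigma}}\,|\langle a(n),z-z'\rangle|^2$$

$$\le 4\pi^2\sum_{n:\,P^+(n)\le p_\nu} n^{-2\sigma}\Big[\sum_{j=1}^{\nu}a_j(n)\,|z_j-z'_j|\Big]^2$$

$$= 4\pi^2\sum_{n:\,P^+(n)\le p_\nu}\ \sum_{j_1,j_2=1}^{\nu}a_{j_1}(n)a_{j_2}(n)\,|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\,n^{-2\sigma}$$

$$= 4\pi^2\sum_{j_1,j_2=1}^{\nu}|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\sum_{n:\,P^+(n)\le p_\nu}a_{j_1}(n)a_{j_2}(n)\,n^{-2\sigma}$$

$$\le 4\pi^2\sum_{j_1,j_2=1}^{\nu}|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\sum_{b_1,b_2=1}^{\infty}b_1b_2\sum_{\substack{n\le N\\ a_{j_1}(n)=b_1,\ a_{j_2}(n)=b_2}} n^{-2\sigma}$$

$$\le 4\pi^2\sum_{j_1,j_2=1}^{\nu}|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\sum_{b_1,b_2=1}^{\infty}b_1b_2\,p_{j_1}^{-2b_1\sigma}p_{j_2}^{-2b_2\sigma}\sum_{\substack{k\le Np_{j_1}^{-b_1}p_{j_2}^{-b_2}\\ P^+(k)\le p_\nu}}k^{-2\sigma}$$

$$\le C_\sigma N^{1-2\sigma}\sum_{j_1,j_2=1}^{\nu}|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\sum_{b_1,b_2=1}^{\infty}b_1b_2\,p_{j_1}^{-2b_1\sigma}p_{j_2}^{-2b_2\sigma}\big[p_{j_1}^{-b_1}p_{j_2}^{-b_2}\big]^{1-2\sigma}$$

$$= C_\sigma N^{1-2\sigma}\sum_{j_1,j_2=1}^{\nu}|z_{j_1}-z'_{j_1}|\,|z_{j_2}-z'_{j_2}|\sum_{b_1,b_2=1}^{\infty}b_1b_2\,p_{j_1}^{-b_1}p_{j_2}^{-b_2}$$

$$= C_\sigma N^{1-2\sigma}\Big[\sum_{j=1}^{\nu}|z_j-z'_j|\sum_{b=1}^{\infty}b\,p_j^{-b}\Big]^2.$$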

Thus,

$$d(z,z') \le C_\sigma N^{1/2-\sigma}\sum_{j=1}^{\nu}|z_j-z'_j|\sum_{b=1}^{\infty}b\,p_j^{-b}. \tag{8.9.13}$$
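The first two steps of the chain leading to (8.9.13) rest on two elementary facts: the increment of each term $\theta_n\cos A+\theta'_n\sin A$ has squared $L^2$-norm $(\cos A-\cos B)^2+(\sin A-\sin B)^2 = 4\sin^2((A-B)/2)$, and $|\sin x|\le|x|$. The following is only a numerical sanity check of these two facts on random angles; the function names are illustrative and not taken from the text.

```python
import math
import random


def increment_sq(a, b):
    """Squared L2 increment of theta*cos(.) + theta'*sin(.) between angles a and b,
    for independent standard Gaussian theta, theta'."""
    return (math.cos(a) - math.cos(b)) ** 2 + (math.sin(a) - math.sin(b)) ** 2


def closed_form(a, b):
    """The closed form 4 sin^2((a - b)/2) used in (8.9.12)."""
    return 4.0 * math.sin((a - b) / 2.0) ** 2


rng = random.Random(0)
pairs = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(1000)]

# Largest discrepancy between the direct computation and the closed form.
max_gap = max(abs(increment_sq(a, b) - closed_form(a, b)) for a, b in pairs)

# With a = 2*pi*<a(n), z>, the bound 4 sin^2(pi <a(n), z - z'>) <= 4 pi^2 |<a(n), z - z'>|^2
# reads closed_form(a, b) <= (a - b)^2.
bound_ok = all(closed_form(a, b) <= (a - b) ** 2 + 1e-12 for a, b in pairs)
```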

Now we explore the entropy properties of the metric space $(\mathbb{T}^\nu, d)$. Towards this aim, take $\varepsilon\in(0,1)$ and cover $\mathbb{T}^\nu$ by rectangular cells so that, if $z$ and $z'$ belong to the same cell, we have

$$|z_j-z'_j| \le \begin{cases}\dfrac{\varepsilon}{\log\log\nu}, & 1\le j\le\nu^{1/2},\\[2mm] \varepsilon, & \nu^{1/2}<j\le\nu.\end{cases} \tag{8.9.14}$$

Thus, every cell is a product of two cubes of different size and dimension. The necessary number of cells $M(\varepsilon)$ is bounded as follows:

$$M(\varepsilon) \le \Big(\frac{\log\log\nu}{\varepsilon}\Big)^{[\nu^{1/2}]}\varepsilon^{-(\nu-[\nu^{1/2}])} = (1/\varepsilon)^{\nu}(\log\log\nu)^{[\nu^{1/2}]}.$$
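As a small illustration of the cell count, the sketch below evaluates $\log M(\varepsilon)$ in two ways: from the product of the two cube counts, and from the simplified form $\nu|\log\varepsilon| + [\nu^{1/2}]\log\log\log\nu$. The function names are hypothetical, and the computation assumes $\nu > e^e$ so that $\log\log\log\nu$ is defined and positive.

```python
import math


def log_cells_product(nu, eps):
    """log of (loglog(nu)/eps)^k * eps^-(nu - k), with k = [nu^(1/2)]."""
    k = math.isqrt(nu)
    loglog = math.log(math.log(nu))
    return k * math.log(loglog / eps) + (nu - k) * math.log(1.0 / eps)


def log_cells_simplified(nu, eps):
    """nu * |log eps| + [nu^(1/2)] * logloglog(nu), valid for 0 < eps < 1."""
    k = math.isqrt(nu)
    return nu * abs(math.log(eps)) + k * math.log(math.log(math.log(nu)))


# Example: nu = 100 coordinates, mesh eps = 0.1.
value_a = log_cells_product(100, 0.1)
value_b = log_cells_simplified(100, 0.1)
```

The dominant term is $\nu|\log\varepsilon|$; the correction coming from the finer mesh on the first $[\nu^{1/2}]$ coordinates is of much smaller order.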

Let us now evaluate the distance $d(z,z')$ for $z, z'$ satisfying (8.9.14). By (8.9.13) we have

$$d(z,z') \le C_\sigma N^{1/2-\sigma}\{d_1+d_2+d_3\},$$

where

$$d_1 = \sum_{j=1}^{\nu}|z_j-z'_j|\sum_{b=2}^{\infty}b\,p_j^{-b}, \qquad d_2 = \sum_{\nu^{1/2}<j\le\nu}|z_j-z'_j|\,p_j^{-1}, \qquad d_3 = \sum_{j\le\nu^{1/2}}|z_j-z'_j|\,p_j^{-1}.$$

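For any $j\ge 1$ we have

$$\sum_{b=2}^{\infty}b\,p_j^{-b} = \sum_{b=2}^{\infty}b\Big(\frac{2}{p_j}\Big)^{b}2^{-b} \le \Big(\frac{2}{p_j}\Big)^{2}\sum_{b=2}^{\infty}b\,2^{-b} = Cp_j^{-2}. \tag{8.9.15}$$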
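The estimate (8.9.15) can be checked numerically. The sketch below compares a truncated version of $\sum_{b\ge2}b\,p^{-b}$ with the closed form $x^2(2-x)/(1-x)^2$, $x=1/p$ (obtained from $\sum_{b\ge1}bx^b = x/(1-x)^2$), and with the bound $6p^{-2}$. The value $C=6$ is simply what this closed form gives at $p=2$; it is an illustrative choice, not a constant quoted from the text.

```python
def tail_sum(p, terms=200):
    """Truncated sum of b * p^(-b) over 2 <= b < terms."""
    return sum(b * float(p) ** (-b) for b in range(2, terms))


def tail_closed_form(p):
    """Closed form x^2 (2 - x) / (1 - x)^2 with x = 1/p."""
    x = 1.0 / p
    return x * x * (2.0 - x) / (1.0 - x) ** 2


primes = [2, 3, 5, 7, 11, 13, 101]

# Ratio of the sum to p^(-2); it is largest at p = 2, where it equals 6.
ratios = [tail_sum(p) * p * p for p in primes]
```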

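Hence,

$$d_1 \le \sum_{j=1}^{\nu}Cp_j^{-2}\,\max_{j\le\nu}|z_j-z'_j| \le C\varepsilon.$$

Similarly,

$$d_2 \le \sum_{\nu^{1/2}<j\le\nu}p_j^{-1}\max_{\nu^{1/2}<j\le\nu}|z_j-z'_j| \le C\sum_{\nu^{1/2}<j\le\nu}(j\log j)^{-1}\,\varepsilon \le C\int_{\nu^{1/2}}^{\nu}\frac{du}{u\log u}\;\varepsilon = C\Big[\log\log\nu-\log\frac{\log\nu}{2}\Big]\varepsilon = C(\log 2)\,\varepsilon.$$

Finally,

$$d_3 \le \sum_{j=1}^{\nu^{1/2}}p_j^{-1}\max_{j\le\nu^{1/2}}|z_j-z'_j| \le C\sum_{j=1}^{\nu^{1/2}}(j\log j)^{-1}\,\frac{\varepsilon}{\log\log\nu} \le C\varepsilon.$$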

By summing up the three estimates, we have $d(z,z')\le C_\sigma N^{1/2-\sigma}\varepsilon$, which enables the evaluation of the metric entropy. Let $N(\mathbb{T}^\nu,d,u)$ be the minimal number of balls of radius $u$ that cover the space $(\mathbb{T}^\nu,d)$. We have

$$\log N(\mathbb{T}^\nu,d,C_\sigma N^{1/2-\sigma}\varepsilon) \le \log M(\varepsilon) \le \nu|\log\varepsilon| + \nu^{1/2}\cdot\log\log\log\nu.$$

Observe also that

$$\|\Upsilon(z)\|_2 \le C_\sigma N^{1/2-\sigma}, \qquad z\in\mathbb{T}^\nu. \tag{8.9.16}$$

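Hence, $D := \operatorname{diam}(\mathbb{T}^\nu,d)\le C_\sigma N^{\frac12-\sigma}$, and by the classical Dudley entropy theorem (see (10.3.9) and (10.3.10)), for any fixed $z\in\mathbb{T}^\nu$,

$$\mathbf{E}\sup_{z'\in\mathbb{T}^\nu}|\Upsilon(z')-\Upsilon(z)| \le C_\sigma\int_0^{D}[\log N(\mathbb{T}^\nu,d,u)]^{1/2}\,du \le C_\sigma\int_0^{C_\sigma N^{1/2-\sigma}}[\log N(\mathbb{T}^\nu,d,u)]^{1/2}\,du$$

$$= C_\sigma N^{1/2-\sigma}\int_0^1[\log N(\mathbb{T}^\nu,d,C_\sigma N^{1/2-\sigma}\varepsilon)]^{1/2}\,d\varepsilon \le C_\sigma N^{1/2-\sigma}\int_0^1\big[\nu|\log\varepsilon| + \log\log\log\nu\cdot\nu^{1/2}\big]^{1/2}\,d\varepsilon \le C_\sigma N^{1/2-\sigma}\nu^{1/2}.$$

Using again (8.9.16), we have

$$\mathbf{E}\sup_{z'\in\mathbb{T}^\nu}|\Upsilon(z')| \le C_\sigma N^{1/2-\sigma}\nu^{1/2}. \tag{8.9.17}$$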

The final stage of the proof provides the optimal choice of the parameter $\nu$ balancing the quantities (8.9.10), (8.9.11), and (8.9.17). As the theorem’s claim suggests, we consider three cases.

Case 1. $N^{1/2}\le\tau\le N$. Obviously, this case contains the results of Halász and Queffélec. In this case we choose

$$\nu = \frac{\tau}{\log N},$$

thus balancing (8.9.10) and (8.9.17). We obtain from both terms the bound $C_\sigma\dfrac{N^{1/2-\sigma}\tau^{1/2}}{(\log N)^{1/2}}$, while the term (8.9.11) is negligible. The correctness condition $\nu\le\tau$ is obvious.

Case 2. $N^{1/2}(\log N)^{-1}\le\tau\le N^{1/2}$. In this case we choose

$$\nu = N^{1/2}(\log N)^{-1},$$

thus balancing (8.9.11) and (8.9.17). We obtain from both terms the bound $C_\sigma\dfrac{N^{3/4-\sigma}}{(\log N)^{1/2}}$, while the term (8.9.10) is negligible. The correctness condition $\nu\le\tau$ is obvious for the range under consideration.

Case 3. 1 ≤ τ ≤ N 1/2 (log N)−1 . Here we just set ν = τ . It means that we do not need the splitting of the polynomial in two parts. Formally, the quantities (8.9.10) and (8.9.11) are not necessary and we obtain the bound Cσ N 1/2−σ τ 1/2 directly from (8.9.17). The upper bound is now proved completely.
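The three cases amount to a rule for choosing $\nu$ from $N$ and $\tau$, after which the bound always has the shape $N^{1/2-\sigma}\nu^{1/2}$ coming from (8.9.17). The sketch below encodes this rule; the function names `choose_nu` and `upper_bound_shape` are hypothetical, and the constant $C_\sigma$ is dropped, so only the order of magnitude is meaningful.

```python
import math


def choose_nu(N, tau):
    """Return (case number, nu) following the three cases of the proof."""
    root = math.sqrt(N)
    log_n = math.log(N)
    if tau >= root:                 # Case 1: N^(1/2) <= tau <= N
        return 1, tau / log_n
    if tau >= root / log_n:         # Case 2: N^(1/2)/log N <= tau <= N^(1/2)
        return 2, root / log_n
    return 3, float(tau)            # Case 3: no splitting, nu = tau


def upper_bound_shape(N, tau, sigma):
    """Order of the upper bound, N^(1/2 - sigma) * nu^(1/2), without the constant."""
    _, nu = choose_nu(N, tau)
    return N ** (0.5 - sigma) * math.sqrt(nu)


case_large, _ = choose_nu(10**6, 5000)
case_middle, _ = choose_nu(10**6, 500)
case_small, nu_small = choose_nu(10**6, 50)
```

In Case 2 the value returned by `upper_bound_shape` coincides with $N^{3/4-\sigma}/(\log N)^{1/2}$, and in Case 1 with $N^{1/2-\sigma}\tau^{1/2}/(\log N)^{1/2}$.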


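Proof of the lower bound in Theorem 8.9.1. Let $d = \{d_n, n\ge1\}$ be a sequence of reals. Recall that by (8.9.5) we have

$$\sup_{t\in\mathbb{R}}\Big|\sum_{j=1}^{\tau}\sum_{n\in E_j}d_n\varepsilon_n n^{-\sigma-it}\Big| = \sup_{z\in\mathbb{T}^\tau}|Q(z)|,$$

where

$$Q(z) = \sum_{j=1}^{\tau}\sum_{n\in E_j}d_n\varepsilon_n n^{-\sigma}e^{2i\pi\langle a(n),z\rangle}.$$

Consider the subset $Z$ of $\mathbb{T}^\tau$ defined by

$$Z = \big\{z = \{z_j, 1\le j\le\tau\} : z_j = 0 \text{ if } j\le\tau/2, \text{ and } z_j\in\{0,1/2\} \text{ if } j\in(\tau/2,\tau]\big\}.$$

Observe that the imaginary part of $Q$ vanishes on $Z$, since for any $z\in Z$ and any $n$ it is true that

$$e^{2i\pi\langle a(n),z\rangle} = \cos(2\pi\langle a(n),z\rangle) = (-1)^{2\langle a(n),z\rangle}.$$

Hence, $Q$ takes the following simple form on $Z$,

$$Q(z) = \sum_{\tau/2<j\le\tau}\sum_{n\in E_j}d_n\varepsilon_n n^{-\sigma}(-1)^{2\langle a(n),z\rangle}.$$

This is no longer a trigonometric polynomial, but simply a finite rank Rademacher process. For $j\in(\tau/2,\tau]$ define

$$L_j = \Big\{n = p_j\tilde n : \tilde n\le\frac{N}{p_j} \text{ and } P^+(\tilde n)\le p_{\tau/2}\Big\}.$$

Since $E_j\supset L_j$, $j = 1,\dots,\tau$, the sets $L_j$ are pairwise disjoint. Put, for $z\in Z$,

$$Q'(z) = \sum_{\tau/2<j\le\tau}\sum_{n\in L_j}d_n\varepsilon_n n^{-\sigma}(-1)^{2\langle a(n),z\rangle}.$$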

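We now recall a useful fact.

8.9.2 Lemma. Let $X = \{X_z, z\in Z\}$ and $Y = \{Y_z, z\in Z\}$ be two finite sets of random variables defined on a common probability space. We assume that $X$ and $Y$ are independent and that the random variables $Y_z$ are all centered. Then

$$\mathbf{E}\sup_{z\in Z}|X_z+Y_z| \ge \mathbf{E}\sup_{z\in Z}|X_z|.$$

Proof. Let $\mathcal{F}$ be the $\sigma$-field generated by $X$. Since $Y$ is independent of $\mathcal{F}$ and centered, $\mathbf{E}(Y_z\mid\mathcal{F}) = \mathbf{E}Y_z = 0$. Then

$$\mathbf{E}\sup_{z\in Z}|X_z+Y_z| = \mathbf{E}\Big[\mathbf{E}\Big(\sup_{z\in Z}|X_z+Y_z|\ \Big|\ \mathcal{F}\Big)\Big] \ge \mathbf{E}\sup_{z\in Z}\big|\mathbf{E}(X_z+Y_z\mid\mathcal{F})\big| = \mathbf{E}\sup_{z\in Z}|X_z+\mathbf{E}Y_z| = \mathbf{E}\sup_{z\in Z}|X_z|.$$

Clearly, since $\{Q(z)-Q'(z), z\in Z\}$ and $\{Q'(z), z\in Z\}$ are independent,

$$\mathbf{E}\sup_{z\in Z}|Q(z)| \ge \mathbf{E}\sup_{z\in Z}|Q'(z)|.$$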

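We now proceed to a direct evaluation of $Q'(z)$ by proving

8.9.3 Proposition. There exists a universal constant $c$ such that for any system of coefficients $\{d_n, n\ge1\}$,

$$c\sum_{\tau/2<j\le\tau}\Big[\sum_{n\in L_j}d_n^2n^{-2\sigma}\Big]^{1/2} \le \mathbf{E}\sup_{z\in Z}|Q'(z)| \le \sum_{\tau/2<j\le\tau}\Big[\sum_{n\in L_j}d_n^2n^{-2\sigma}\Big]^{1/2}.$$

Proof. For any $n\in L_j$, we have $2\langle a(n),z\rangle = 2z_j$, so that

$$\sum_{n\in L_j}d_n\varepsilon_n n^{-\sigma}(-1)^{2\langle a(n),z\rangle} = (-1)^{2z_j}\sum_{n\in L_j}d_n\varepsilon_n(\omega)n^{-\sigma}.$$

Thus

$$Q'(z) = \sum_{\tau/2<j\le\tau}(-1)^{2z_j}\sum_{n\in L_j}d_n\varepsilon_n(\omega)n^{-\sigma}.$$

Let $\omega\in\Omega$. We can select $z_j = z_j(\omega) = 0$ or $1/2$, $\tau/2<j\le\tau$, according to the sign $+$ or $-$ of the sum $\sum_{n\in L_j}d_n\varepsilon_n(\omega)n^{-\sigma}$. This implies that

$$\sup_{z\in Z}|Q'(z)| = \sum_{\tau/2<j\le\tau}\Big|\sum_{n\in L_j}d_n\varepsilon_n n^{-\sigma}\Big|.$$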
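The sign-selection step can be illustrated by brute force. In the sketch below, `block_sums` stands for the values $\sum_{n\in L_j}d_n\varepsilon_n(\omega)n^{-\sigma}$ at a fixed $\omega$ (the numbers are made up for the example), and the supremum over $Z$ is computed by enumerating all choices $z_j\in\{0,1/2\}$.

```python
import itertools


def sup_over_Z(block_sums):
    """Brute-force sup over z in {0, 1/2}^k of |sum_j (-1)^(2 z_j) * s_j|."""
    best = 0.0
    for zs in itertools.product((0.0, 0.5), repeat=len(block_sums)):
        value = sum((-1) ** int(2 * z) * s for z, s in zip(zs, block_sums))
        best = max(best, abs(value))
    return best


# Hypothetical values of the block sums for one fixed omega.
block_sums = [1.5, -2.0, 0.25, -0.75]

brute_force = sup_over_Z(block_sums)
sum_of_abs = sum(abs(s) for s in block_sums)
```

Choosing $z_j = 0$ when the $j$-th block sum is non-negative and $z_j = 1/2$ otherwise makes every term non-negative, which is why the enumeration returns the sum of absolute values.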

Now we shall use the well-known Khintchine inequalities. Let $\{\varepsilon_i, 1\le i\le N\}$ be a Rademacher sequence. For any $0<p<\infty$, there exist positive finite constants $c_p$, $C_p$ depending on $p$ only, such that for any finite sequence $\{a_i, 1\le i\le N\}$ of real numbers

$$c_p\Big(\sum_{i=1}^{N}a_i^2\Big)^{1/2} \le \Big\|\sum_{i=1}^{N}a_i\varepsilon_i\Big\|_p \le C_p\Big(\sum_{i=1}^{N}a_i^2\Big)^{1/2}.$$

See Kashin and Saakyan [1989]. Further $C_p\le K\sqrt{p}$, $p\ge1$, where $K$ is numerical.
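For $p=1$ the two-sided inequality can be verified exactly on short sequences by enumerating all sign patterns. The sketch below does this; the lower constant $1/\sqrt2$ used in the check is the classical sharp value of $c_1$ (a fact not stated in the text above), and the helper names are illustrative.

```python
import itertools
import math


def rademacher_l1(a):
    """Exact value of E|sum_i a_i eps_i| by enumerating all 2^n sign patterns."""
    n = len(a)
    total = sum(
        abs(sum(e * x for e, x in zip(signs, a)))
        for signs in itertools.product((-1, 1), repeat=n)
    )
    return total / 2 ** n


def l2_norm(a):
    return math.sqrt(sum(x * x for x in a))


def khintchine_holds(a):
    """Check (1/sqrt 2) * ||a||_2 <= E|sum a_i eps_i| <= ||a||_2."""
    mean_abs = rademacher_l1(a)
    norm = l2_norm(a)
    return norm / math.sqrt(2) - 1e-12 <= mean_abs <= norm + 1e-12


examples = [(1.0, 1.0), (3.0, 1.0, 2.0), (1.0, -4.0, 0.5, 2.0, 2.0)]
all_hold = all(khintchine_holds(a) for a in examples)
```

The pair $(1,1)$ is the extremal case for the lower bound: there $E|\varepsilon_1+\varepsilon_2| = 1$ while the $\ell^2$ norm equals $\sqrt2$.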

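Consequently,

$$\mathbf{E}\sup_{z\in Z}|Q'(z)| = \sum_{\tau/2<j\le\tau}\mathbf{E}\Big|\sum_{n\in L_j}d_n\varepsilon_n n^{-\sigma}\Big| \ge c\sum_{\tau/2<j\le\tau}\Big[\mathbf{E}\Big|\sum_{n\in L_j}d_n\varepsilon_n n^{-\sigma}\Big|^2\Big]^{1/2} = c\sum_{\tau/2<j\le\tau}\Big[\sum_{n\in L_j}d_n^2n^{-2\sigma}\Big]^{1/2}.$$

The upper bound immediately follows from the Cauchy–Schwarz inequality.

8.9.4 Corollary. If $(d_n)$ is a multiplicative system, we have

$$\mathbf{E}\sup_{z\in Z}|Q'(z)| \ge c\,N^{-\sigma}\sum_{\tau/2<j\le\tau}d_{p_j}\Big[\sum_{\substack{\tilde n\le N/p_j\\ P^+(\tilde n)\le p_{\tau/2}}}d_{\tilde n}^2\Big]^{1/2}.$$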

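Now we can finish the proof of Theorem 8.9.1. If $d_n\equiv1$, we get from the above corollary that

$$\mathbf{E}\sup_{z\in\mathbb{T}^\tau}\Big|\sum_{j=1}^{\tau}\sum_{n\in E_j}\varepsilon_n n^{-\sigma}e^{2i\pi\langle a(n),z\rangle}\Big| \ge \mathbf{E}\sup_{z\in Z}|Q'(z)| \ge \frac{C}{N^\sigma}\sum_{\tau/2<j\le\tau}\#\big\{m\le N/p_j : P^+(m)\le p_{\tau/2}\big\}^{1/2} = \frac{C}{N^\sigma}\sum_{\tau/2<j\le\tau}\Psi\Big(\frac{N}{p_j},p_{\tau/2}\Big)^{1/2}.$$

Since

$$\Psi\Big(\frac{N}{p_j},p_{\tau/2}\Big) \ge \Psi\Big(\frac{N}{p_\tau},p_{\tau/2}\Big) = \frac{N}{p_\tau}\,\Psi^*\Big(\frac{N}{p_\tau},p_{\tau/2}\Big) \ge \frac{cN}{\tau\log\tau}\,\Psi^*\Big(\frac{N}{p_\tau},p_{\tau/2}\Big),$$

we obtain

$$\mathbf{E}\sup_{z\in\mathbb{T}^\tau}\Big|\sum_{j=1}^{\tau}\sum_{n\in E_j}d_n\varepsilon_n n^{-\sigma}e^{2i\pi\langle a(n),z\rangle}\Big| \ge \frac{c}{N^\sigma}\,\frac{\tau}{2}\Big[\frac{cN}{\tau\log\tau}\,\Psi^*\Big(\frac{N}{p_\tau},p_{\tau/2}\Big)\Big]^{1/2} = c\,N^{1/2-\sigma}\Big(\frac{\tau}{\log\tau}\Big)^{1/2}\Psi^*\Big(\frac{N}{p_\tau},p_{\tau/2}\Big)^{1/2},$$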

as asserted.

Remark. Theorem 8.9.1 was extended in Lifshits–Weber [2009a] to weighted random Dirichlet polynomials $D(s) = \sum_{n=2}^{N}\varepsilon_n d(n)n^{-s}$, under moderate conditions on the weights $d(n)$. In fact the approach can be used with slight modifications to treat the case when $d(n)$ is a non-negative sub-multiplicative function, namely

$$d(nm)\le d(n)d(m) \quad\text{provided } (n,m)=1, \tag{8.9.18}$$

and satisfies

$$p\mid n \implies d(n)\le C\,d\Big(\frac{n}{p}\Big), \quad\text{and}\quad d(p^j)\le C_1\lambda^j, \tag{8.9.19}$$

for some positive $C$, $C_1$, $\lambda$ with $\lambda<\sqrt2$, any prime number $p$, any integers $n$, $j$. Clearly, if $C<\sqrt2$, the second property is implied by the first. But this is not always so, as the following example shows. Fix some prime number $P_1$ as well as some reals $1<\lambda_1<\sqrt2$, $C_1\ge1$, and put

$$d(n) = \begin{cases}C_1\lambda_1^j & \text{if } P_1^j\,\|\,n,\\ 1 & \text{if } (n,P_1)=1.\end{cases} \tag{8.9.20}$$

Condition (8.9.19) is satisfied by the divisor function $d(n) = \sum_{\delta|n}1$, or if $d(n) = \lambda^{\Omega(n)}$ where $\Omega(n) = \sum_{p^\nu\|n}\nu$ is the prime divisor sum function; but also for multiplicative functions such that

$$\frac{d(p^a)}{d(p^{a-1})}\le\lambda, \qquad a = 1,2,\dots. \tag{8.9.21}$$

Other remarkable examples are

$$d_K(n) = \begin{cases}1 & \text{if } (n,K)=1,\\ 0 & \text{if } (n,K)>1,\end{cases}$$

where $K$ is some positive integer, and the truncated divisor function $d_N(n) = \#\{k\le N : k|n\}$, where $N\ge1$ is some fixed positive integer. These examples are studied in [Weber: 2009a], where significant simplifications of the approach are provided, yielding also strictly better bounds than in Theorem 8.9.1.

8.9.10. Other results. In this section we apply the technique used on some other sets of coefficients. Let $\{d_n, n\ge1\}$ be a sequence of multiplicative weights: $d_{nm} = d_nd_m$ whenever $n$, $m$ are coprime. Write

$$B_m = \sum_{2\le n\le m}d_n^2. \tag{8.9.22}$$

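By choosing $\tau = \mu := \pi(N)$ in the lower bound of Proposition 8.9.6, we get

$$\mathbf{E}\sup_{z\in\mathbb{T}^\mu}\Big|\sum_{n=2}^{N}d_n\varepsilon_n n^{-\sigma}e^{2i\pi\langle a(n),z\rangle}\Big| \ge \mathbf{E}\sup_{z\in Z}|Q'(z)| \ge CN^{-\sigma}\sum_{\mu/2<j\le\mu}d_{p_j}\Big[\sum_{\substack{\tilde n\le N/p_j\\ P^+(\tilde n)\le p_{\mu/2}}}d_{\tilde n}^2\Big]^{1/2}.$$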

Note that for large $N$ in the case $\tau = \mu$ the sets $L_j$ reduce to $\{n = p_j\tilde n : \tilde n\le N/p_j\}$. Indeed, if $\tilde n\le N/p_j$ and if there is an $s\ge\mu/2$ such that $p_s|\tilde n$, then this implies that $N\ge p_jp_s\ge p_{\mu/2}^2\sim(\mu\log\mu)^2/4\sim N^2/4$, which is impossible for large $N$. Thus necessarily $P^+(\tilde n)\le p_{\mu/2}$. Thereby,

$$\mathbf{E}\sup_{z\in\mathbb{T}^\mu}\Big|\sum_{n=2}^{N}d_n\varepsilon_n n^{-\sigma}e^{2i\pi\langle a(n),z\rangle}\Big| \ge CN^{-\sigma}\sum_{\mu/2<j\le\mu}d_{p_j}\Big[\sum_{\tilde n\le N/p_j}d_{\tilde n}^2\Big]^{1/2} = CN^{-\sigma}\sum_{\mu/2<j\le\mu}d_{p_j}B_{N/p_j}^{1/2}.$$

We have obtained

8.9.5 Proposition. There exist universal constants $C$, $N_0$ such that for any $0\le\sigma<1/2$, any integer $N\ge N_0$ and any multiplicative sequence of weights $\{d_n, n\ge1\}$,

$$\mathbf{E}\sup_{t\in\mathbb{R}}\Big|\sum_{n=2}^{N}\varepsilon_nd_nn^{-\sigma-it}\Big| \ge CN^{-\sigma}\sum_{\mu/2<j\le\mu}d_{p_j}B_{N/p_j}^{1/2},$$

where $B_m$ is defined in (8.9.22).

Apply this to the case $d_n = d(n)$, where $d(n) = \#\{d : d|n\}$ is the divisor function. Although these weights are very irregular, their sums behave regularly; in particular,

$$\sum_{n=1}^{N}d^2(n)\sim\frac{N\log^3N}{\pi^2}$$

as $N$ tends to infinity. The last estimate immediately provides $B_m\sim(m/\pi^2)\log^3m$, hence (noticing that $d_{p_j} = 2$ and $\mu\sim N/\log N$),

$$\sum_{\mu/2<j\le\mu}d_{p_j}B_{N/p_j}^{1/2} \sim \sum_{\mu/2<j\le\mu}2\Big(\frac{N}{p_j\pi^2}\Big)^{1/2}\log^{3/2}\frac{N}{p_j} = \frac{2N^{1/2}}{\pi}\sum_{\mu/2<j\le\mu}\frac{1}{p_j^{1/2}}\log^{3/2}\frac{N}{p_j}$$

$$\sim \frac{2N^{1/2}}{\pi}\sum_{\mu/2<j\le\mu}\frac{1}{(j\log j)^{1/2}}\log^{3/2}\frac{N}{j\log j} \approx N^{1/2}\sum_{\mu/2<j\le\mu}\frac{1}{(j\log j)^{1/2}} \approx N^{1/2}\frac{\mu^{1/2}}{(\log\mu)^{1/2}} \sim \frac{N}{\log N}.$$

Now, let $\{P_k, k\in K\}$ be a finite set of mutually coprime numbers. Consider the set of integers

$$E = \Big\{n : n = \prod_{k\in K}P_k^{\alpha_k},\ \alpha_k\in\{0,1\}\Big\}$$

and the associated Dirichlet polynomial

$$D_E(t) = \sum_{n\in E}\varepsilon_nn^{-\sigma-it} = \sum_{n=2}^{N}\varepsilon_n\chi_E(n)n^{-\sigma-it},$$

where $N = \prod_{k\in K}P_k$. We prove the following.

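8.9.6 Proposition. There exists a universal constant $C$ such that, for any $\sigma\ge0$ and any $\{P_k, k\in K\}$,

$$\mathbf{E}\sup_{t\in\mathbb{R}}|D_E(t)| \ge C\prod_{k\in K}\big(1+P_k^{-2\sigma}\big)^{1/2}\ \sup_{G\subseteq K}\frac{\sum_{j\in G}P_j^{-\sigma}}{\prod_{k\in G}\big(1+P_k^{-2\sigma}\big)^{1/2}}.$$

Proof. By (8.9.5) we have

$$\sup_{t\in\mathbb{R}}|D_E(t)| = \sup_{z\in\mathbb{T}^\mu}|Q(z)|,$$

where $\mu = |K|$ and

$$Q(z) = \sum_{n=2}^{N}\chi_E(n)\varepsilon_nn^{-\sigma}e^{2i\pi\langle a(n),z\rangle}.$$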

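Let $A\subset K$ and $B = K\setminus A$. We assume that both $A$ and $B$ are nonempty sets. Define for $j\in B$,

$$B_j = \{n\in E : \alpha_k = 0 \text{ if } k\in B,\ k\ne j,\ \alpha_j = 1\}$$

and $Z\subset\mathbb{T}^\mu$ by

$$Z = \big\{z = \{z_k, 1\le k\le 2r\} : z_k = 0 \text{ if } k\in A, \text{ and } z_k\in\{0,1/2\} \text{ if } k\in B\big\}.$$

For $j\in B$, $n\in B_j$ and $z\in Z$, we have $2\langle a(n),z\rangle = 2\sum_{k\in K}\alpha_kz_k = 2z_j$, hence $(-1)^{2\langle a(n),z\rangle} = \pm1$, so that similarly to our previous lower bound,

$$\sup_{z\in Z}|Q(z)| \ge \sum_{j\in B}\Big|\sum_{n\in B_j}\varepsilon_nn^{-\sigma}\Big|,$$

almost surely. Hence

$$\mathbf{E}\sup_{z\in Z}|Q(z)| \ge C\sum_{j\in B}\Big[\mathbf{E}\Big|\sum_{n\in B_j}\varepsilon_nn^{-\sigma}\Big|^2\Big]^{1/2} = C\sum_{j\in B}P_j^{-\sigma}\Big[\sum_{(\alpha_k)_{k\in A}\in\{0,1\}^A}\ \prod_{k\in A}P_k^{-2\sigma\alpha_k}\Big]^{1/2} = C\prod_{k\in A}\big(1+P_k^{-2\sigma}\big)^{1/2}\sum_{j\in B}P_j^{-\sigma}.$$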
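The last equality uses the expansion $\sum_{\alpha\in\{0,1\}^A}\prod_{k\in A}P_k^{-2\sigma\alpha_k} = \prod_{k\in A}(1+P_k^{-2\sigma})$. The sketch below checks it numerically on a small set; the values of $P_k$ and $\sigma$ are arbitrary examples and the function names are illustrative.

```python
import itertools
import math


def sum_over_subsets(P, sigma):
    """Sum over alpha in {0,1}^A of prod_k P_k^(-2*sigma*alpha_k)."""
    total = 0.0
    for alpha in itertools.product((0, 1), repeat=len(P)):
        total += math.prod(p ** (-2.0 * sigma * a) for p, a in zip(P, alpha))
    return total


def product_form(P, sigma):
    """prod_k (1 + P_k^(-2*sigma))."""
    return math.prod(1.0 + p ** (-2.0 * sigma) for p in P)


P_example = [2, 3, 5, 7]   # mutually coprime numbers, chosen for illustration
lhs = sum_over_subsets(P_example, 0.3)
rhs = product_form(P_example, 0.3)
```

For $\sigma = 0$ both sides reduce to $2^{|A|}$, the number of elements of $\{0,1\}^A$.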

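Therefore

$$\mathbf{E}\sup_{t\in\mathbb{R}}|D_E(t)| \ge C\sup_{A\subseteq K,\,A\ne K}\ \prod_{k\in A}\big(1+P_k^{-2\sigma}\big)^{1/2}\sum_{j\in A^c}P_j^{-\sigma} = C\prod_{k\in K}\big(1+P_k^{-2\sigma}\big)^{1/2}\sup_{A\subseteq K,\,A\ne K}\frac{\sum_{j\in A^c}P_j^{-\sigma}}{\prod_{k\in A^c}\big(1+P_k^{-2\sigma}\big)^{1/2}}.$$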

Chapter 9

The majorizing measure method

The majorizing measure method, which originates from a well-known paper of Garsia, Rodemich and Rumsey, is presented in the exponential case first, in an introductory way. Next a general approach initiated by Talagrand is described. For the proofs, however, we follow a recent and elegant simplification of these techniques introduced by Bednorz. An application and an illustration of the method appear in Section 9.3, where a criterion for the convergence of averages of random variables satisfying suitable increment conditions is established. Several applications in ergodic theory are given. The chapter concludes with another application giving rise to a strict sharpening of the Salem–Zygmund estimate for random polynomials.

9.1

Introduction – the exponential case

In a famous article of Garsia, Rodemich and Rumsey [1970], a real variable lemma was established and was then used to establish a new type of sufficient conditions for the convergence almost everywhere of stochastic processes. Unlike the metric entropy method, the kind of conditions obtained is expressed by means of a family of integrals analysing the local scattering of the parameter space, when endowed with a suitable metric (generally induced by the relevant stochastic process). Since this original work, more than thirty years have gone by, and during this period, considerable developments of this method, hereafter called “the majorizing measure method”, were obtained mainly under the impulse of Talagrand, after isolated but productive efforts of Fernique. In 1985, Talagrand solved the open question of characterizing the regularity (sample boundedness and sample continuity) of Gaussian processes, by means of the existence of a majorizing measure. This deep result was later published in Talagrand [1987]. The same year, Talagrand announced during a famous conference in Strasbourg a series of deep results of the same kind concerning non-Gaussian processes. These results are stated and proved in another famous paper Talagrand [1990], and are at the center of this chapter. In Section 9.2 we present some of them, as well as a recent simplified approach due to Bednorz [2006a]. In Section 9.3, we apply these results to obtain a very useful almost sure convergence criterion for averages of sequences of random variables satisfying increment conditions of Gál–Koksma type. This substantially completes the work done in Section 8.4. Now, we return to the seminal work of Garsia, Rodemich, Rumsey, and first to the above mentioned real variable lemma. Let (T , d) be a metric space and μ be a Borel probability on T . Let f : T → R be a Borel

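function. Let also $A$, $B$ be two Borel subsets of $T$ with positive measure. Put

$$\delta(A,B) = \sup_{s\in A,\,t\in B}d(s,t), \qquad \tilde f(s,t) = \frac{f(s)-f(t)}{d(s,t)}\,\chi_{\{d(s,t)\ne0\}}(s,t) \quad \forall\,s,t\in T, \tag{9.1.1}$$

where $\chi$ denotes the indicator function. Then for any convex function $\Psi:\mathbb{R}\to\mathbb{R}_+$, and any positive real $c$,

$$\Big|\frac{1}{\mu(A)}\int_Af(x)\mu(dx)-\frac{1}{\mu(B)}\int_Bf(x)\mu(dx)\Big| \le c\,\delta(A,B)\cdot\Psi^{-1}\Big(\int_A\int_B\Psi\Big(\frac{f(s)-f(t)}{c\,d(s,t)}\Big)\frac{\mu(ds)\mu(dt)}{\mu(A)\mu(B)}\Big).$$

By twice applying Jensen’s inequality, we indeed get

$$\Big|\frac{1}{\mu(A)}\int_Af(x)\mu(dx)-\frac{1}{\mu(B)}\int_Bf(x)\mu(dx)\Big| = \Big|\int_A\int_B(f(u)-f(v))\frac{\mu(du)\mu(dv)}{\mu(A)\mu(B)}\Big|$$

$$= c\,\Psi^{-1}\circ\Psi\Big(\Big|\int_A\int_Bd(u,v)\,\frac{\tilde f(u,v)}{c}\,\frac{\mu(du)\mu(dv)}{\mu(A)\mu(B)}\Big|\Big)$$

$$\le \delta(A,B)\,c\,\Psi^{-1}\circ\Psi\Big(\int_A\int_B\frac{|\tilde f(u,v)|}{c}\,\frac{\mu(du)\mu(dv)}{\mu(A)\mu(B)}\Big)$$

$$\le \delta(A,B)\,c\,\Psi^{-1}\Big(\frac{\int_T\int_T\Psi\big(\frac{\tilde f(u,v)}{c}\big)\mu(du)\mu(dv)}{\mu(A)\mu(B)}\Big).$$

Now if $\Psi$ is a Young function and if $\|\tilde f\|_{\Psi,\mu\times\mu}<\infty$, then choosing $c = \|\tilde f\|_{\Psi,\mu\times\mu}$ gives

$$\Big|\frac{1}{\mu(A)}\int_Af(x)\mu(dx)-\frac{1}{\mu(B)}\int_Bf(x)\mu(dx)\Big| \le \|\tilde f\|_{\Psi,\mu\times\mu}\,\delta(A,B)\,\Psi^{-1}\Big(\frac{1}{\mu(A)\mu(B)}\Big). \tag{9.1.2}$$

This is the basic inequality. If $A$, $B$ are $d$-balls centered at some point $t_0\in T$: $A = B(t_0,\varepsilon_1)$, $B = B(t_0,\varepsilon_2)$, where we set $B(t,\varepsilon) = B_d(t,\varepsilon) = \{s\in T : d(s,t)\le\varepsilon\}$, then

$$\frac{1}{\mu(B(t_0,\varepsilon))}\int_{B(t_0,\varepsilon)}f(x)\mu(dx)$$

represents an approximation of $f(t_0)$, and inequality (9.1.2) gives us a hint on how this approximation can be controlled.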

We shall now describe how this can be used by studying the regularity of a class of stochastic processes with exponential moments. Consider for $\alpha\ge1$, $t\in\mathbb{R}$ the exponential Young functions

$$\Psi_\alpha(t) = e^{|t|^\alpha}-1,$$

with Orlicz norms

$$\|f\|_{\Psi_\alpha} = \inf\Big\{c>0 : \int_T\Psi_\alpha\Big(\frac{f}{c}\Big)d\mu\le1\Big\}. \tag{9.1.3}$$

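9.1.1 Theorem. Let $(T,d)$ be a compact metric space. Let $D(T)$ denote the diameter of $(T,d)$. Let $X = \{X(\omega,t), \omega\in\Omega, t\in T\}$ be a stochastic process with basic probability space $(\Omega,\mathcal{A},\mathbf{P})$. Assume that the following increment condition is satisfied: for all $s,t\in T$,

$$\|X(s)-X(t)\|_{\Psi_\alpha}\le d(s,t). \tag{9.1.4}$$

Let $\mu$ be a Borel probability measure on $T$ such that

$$L = \sup_{t\in T}\int_0^{D(T)/2}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B_d(t,u))}\Big)du<\infty.$$

Put for $s,t\in T$,

$$\tilde X = \tilde X(s,t) = \frac{X(s)-X(t)}{d(s,t)}\,\chi_{\{d(s,t)\ne0\}}(s,t).$$

Then $X$ admits a $d$-separable version, which we denote again by $X$. Further $\tilde X\in L^{\Psi_\alpha}(T^2,\mu\times\mu)$, $\mathbf{P}$-almost surely, and

$$\mathbf{P}\Big\{\omega : \sup_{t\in T}\Big|X(\omega,t)-\int_TX(\omega,t)\,d\mu(t)\Big| \le 12L\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\Big\} = 1.$$

Furthermore, for any $\rho>0$,

$$\mathbf{P}\Big\{\omega : \sup_{\substack{s,t\in T\\ d(s,t)\le\rho}}|X(\omega,s)-X(\omega,t)| \le 40\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\ \sup_{t\in T}\int_0^{\rho/2}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(t,u))}\Big)du\Big\} = 1.$$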

Proof. The proof consists of a simple variation of the original proof in Garsia–Rodemich–Rumsey, and will also use some ideas from Preston [1971]. Since $(T,d)$ is compact, it is $d$-separable. Let $T_0$ be a countable $d$-dense subset of $T$. By assumption (9.1.4), $X$ is $d$-continuous in probability. It is thus easily seen that $X$ possesses a version which is $d$-separable and admits $T_0$ as separation set (Section 8.1). We denote this version again by $X$. To any Borel subset $A$ of $T$ with $\mu(A)>0$, we associate the random variable

$$X_A = \frac{1}{\mu(A)}\int_AX(t)\mu(dt).$$

We get from (9.1.2),

$$|X_A-X_B| \le \delta(A,B)\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\,\Psi_\alpha^{-1}\Big(\frac{1}{\mu(A)\mu(B)}\Big). \tag{9.1.5}$$

Now put for any $t\in T$ and $r>0$,

$$X_r = X_r(t) = \frac{1}{\mu(B(t,r))}\int_{B(t,r)}X(u)\,\mu(du).$$

Set $r_n = D(T)2^{-n}$, $n\ge0$. By (9.1.5),

$$|X_{r_n}-X_{r_{n-1}}| \le (r_n+r_{n-1})\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\,\Psi_\alpha^{-1}\Big(\frac{1}{\mu^2(B(t,r_n))}\Big) = 3D(T)2^{-n}\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\,\Psi_\alpha^{-1}\Big(\frac{1}{\mu^2(B(t,r_n))}\Big)$$

$$\le 6\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\int_{r_{n+1}}^{r_n}\Psi_\alpha^{-1}\Big(\frac{1}{\mu^2(B(t,u))}\Big)du \le 12\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\int_{r_{n+1}}^{r_n}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(t,u))}\Big)du, \tag{9.1.6}$$

since $\Psi_\alpha^{-1}(u^2) = (\log(1+u^2))^{1/\alpha} \le (\log(1+u)^2)^{1/\alpha} \le 2(\log(1+u))^{1/\alpha}$.
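The elementary inequality $\Psi_\alpha^{-1}(u^2)\le2\,\Psi_\alpha^{-1}(u)$, which converts $\mu^2$ into $\mu$ in (9.1.6), can be checked on a grid. The sketch below does so for a few values of $\alpha\ge1$; the function name and the grid are illustrative choices.

```python
import math


def psi_alpha_inv(u, alpha):
    """Inverse of Psi_alpha(t) = exp(t^alpha) - 1 on [0, infinity): (log(1 + u))^(1/alpha)."""
    return math.log1p(u) ** (1.0 / alpha)


alphas = [1.0, 1.5, 2.0, 5.0]
grid = [10.0 ** k for k in range(-2, 7)]   # u from 0.01 up to 1e6

# Check Psi_alpha^{-1}(u^2) <= 2 * Psi_alpha^{-1}(u) on the whole grid.
inequality_holds = all(
    psi_alpha_inv(u * u, a) <= 2.0 * psi_alpha_inv(u, a) + 1e-12
    for a in alphas
    for u in grid
)
```

The factor $2$ is not sharp for $\alpha>1$: the argument in the text actually gives $2^{1/\alpha}$, which is at most $2$.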

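From assumption (9.1.4) it follows that $\mathbf{E}\,\Psi_\alpha\big(\frac{X(s)-X(t)}{d(s,t)}\big)\le1$, for any $s,t\in T$ with $d(s,t)\ne0$. And if $d(s,t) = 0$, then $X(s) = X(t)$ almost surely. Integrating now the latter inequality with respect to $\mu\times\mu$ over $T\times T$, next using Fubini’s theorem, yields

$$\mathbf{E}\int_T\int_T\Psi_\alpha\big(\tilde X(s,t)\big)\,\mu(ds)\mu(dt)\le1.$$

Hence $\tilde X\in L^{\Psi_\alpha}(T^2,\mu\times\mu)$ $\mathbf{P}$-almost surely. Let now $\{u_n, n\ge1\}$ be a sequence of reals decreasing to 0 and such that $\sum_n2^{-n}/u_n<\infty$. By Tchebycheff’s inequality,

$$\mathbf{P}\big\{|X_{r_n}(t)-X_{r_{n-1}}(t)|>u_n\big\} \le \frac{1}{u_n}\int_{B(t,r_n)}\int_{B(t,r_{n-1})}\mathbf{E}|X(u)-X(v)|\,\frac{\mu(du)\mu(dv)}{\mu(B(t,r_n))\mu(B(t,r_{n-1}))}$$

$$\le \frac{C_\alpha}{u_n}\int_{B(t,r_n)}\int_{B(t,r_{n-1})}\|X(u)-X(v)\|_{\Psi_\alpha}\,\frac{\mu(du)\mu(dv)}{\mu(B(t,r_n))\mu(B(t,r_{n-1}))}$$

$$\le \frac{C_\alpha}{u_n}\int_{B(t,r_n)}\int_{B(t,r_{n-1})}\frac{d(u,v)\,\mu(du)\mu(dv)}{\mu(B(t,r_n))\mu(B(t,r_{n-1}))} \le \frac{3D(T)2^{-n}C_\alpha}{u_n}.$$

And so by the Borel–Cantelli lemma, we get

$$\mathbf{P}\Big\{\lim_{n\to\infty}X_{r_n}(t) = X(t)\Big\} = 1. \tag{9.1.7}$$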

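Owing to the fact that $X_{r_0}(t) = \int_TX(t)\,\mu(dt)$, we deduce

$$\Big|X(t)-\int_TX(t)\,\mu(dt)\Big| = \lim_{n\to\infty}\Big|X_{r_n}(t)-\int_TX(t)\,\mu(dt)\Big| \le \sum_{n=1}^{\infty}|X_{r_n}(t)-X_{r_{n-1}}(t)|$$

$$\le 12\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\sum_{n=1}^{\infty}\int_{r_{n+1}}^{r_n}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(t,u))}\Big)du = 12\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\int_0^{D(T)/2}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(t,u))}\Big)du. \tag{9.1.8}$$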

Passing to the supremum over all $t$ varying in $T$ gives the first inequality of the statement. Now let $s,t\in T$ be fixed with $d(s,t) = 2r>0$, and put successively

$$A = B(s,r)\cup B(t,r), \qquad B = B(s,r), \qquad C = B(t,r).$$

Then $\delta(A,B)\le4r$, and $\delta(A,C)\le4r$. But,

$$|X(s)-X(t)| \le |X(s)-X_r(s)| + |X_r(s)-X_A| + |X_A-X_r(t)| + |X_r(t)-X(t)|. \tag{9.1.9}$$

From (9.1.5) we deduce

$$|X_r(s)-X_A| \le 4r\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\,\Psi_\alpha^{-1}\Big(\frac{1}{\mu(A)\mu(B)}\Big) \le 4r\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\,\Psi_\alpha^{-1}\Big(\frac{1}{\mu^2(B(s,r))}\Big) \le 8\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\int_0^r\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(s,u))}\Big)du$$

almost surely. Operating similarly for the other terms in (9.1.9) gives, in view of (9.1.8),

$$|X(s)-X(t)| \le 40\,\|\tilde X\|_{\Psi_\alpha,\mu\times\mu}\ \sup_{\theta\in T}\int_0^{\frac{d(s,t)}{2}}\Psi_\alpha^{-1}\Big(\frac{1}{\mu(B(\theta,u))}\Big)du$$

almost surely.

Passing again to the supremum over all s and t such that d(s, t) < ρ and varying in T , gives the second inequality of the statement. We notice from this proof and from (9.1.9) particularly, that it was also possible to work directly with the original process X, and control its supremum over any countable subset of T with the help of (9.1.9), thereby avoiding separability considerations. This has interest for boundedness, when controlling lattice suprema defined in (8.1.6).

9.2

A general approach

In several remarkable papers [1987], [1990], [1992], [1994], [1996c], [2001] and also in a recent book [2005], Talagrand showed that the majorizing measure method is in turn a rather general approach to treat problems such as sample boundedness or sample continuity of stochastic processes. It applies not only to the exponential case but equally well to the power case, with some complications inherent to this important case. For simplicity of the exposition concerning sample boundedness, we will understand suprema as lattice suprema as in (8.1.6). Consider $\varphi:\mathbb{R}_+\to\mathbb{R}_+$ such that $\varphi(0) = 0$ and $\varphi$ is strictly increasing and continuous. Let $\psi = \varphi^{-1}$ and set for $x>0$,

$$\Phi(x) = \int_0^x\varphi(t)\,dt, \qquad \Psi(x) = \int_0^x\psi(t)\,dt.$$

Then $\Phi$ and $\Psi$ are called conjugate Young functions, and we have Young’s inequality

$$uv\le\Phi(u)+\Psi(v), \qquad (u\ge0,\ v\ge0). \tag{9.2.1}$$
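Young's inequality (9.2.1) can be illustrated with the power case. The sketch below takes $\varphi(t) = t^{p-1}$, so that $\psi(t) = t^{1/(p-1)}$, $\Phi(u) = u^p/p$ and $\Psi(v) = v^q/q$ with $q = p/(p-1)$; the value $p=3$ and the grid are arbitrary illustrative choices.

```python
P_EXPONENT = 3.0
Q_EXPONENT = P_EXPONENT / (P_EXPONENT - 1.0)   # conjugate exponent, here 1.5


def Phi(u):
    """Integral of phi(t) = t^(p-1) from 0 to u."""
    return u ** P_EXPONENT / P_EXPONENT


def Psi(v):
    """Integral of psi(t) = t^(1/(p-1)) from 0 to v."""
    return v ** Q_EXPONENT / Q_EXPONENT


grid = [0.25 * k for k in range(0, 21)]   # values 0, 0.25, ..., 5

# Young's inequality u*v <= Phi(u) + Psi(v) on the whole grid.
young_holds = all(u * v <= Phi(u) + Psi(v) + 1e-12 for u in grid for v in grid)

# Equality is attained when v = phi(u) = u^(p-1); for instance u = 2, v = 4.
gap_at_equality = Phi(2.0) + Psi(4.0) - 2.0 * 4.0
```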

We say that a function $f:\mathbb{R}\to\mathbb{R}_+$ satisfies the $\Delta_2$-condition with constant $C$ if for all $x\ge1$, we have $f(2x)\le Cf(x)$. Typical examples are power functions $f(x) = |x|^p$, $p\ge1$. The general result below is essentially due to Assouad; for a proof see Talagrand [1990: Theorem 2.3].

9.2.1 Theorem. Let $(T,d)$ be a metric space and let $\Phi$ be a Young function. Then the following are equivalent:

(a) For any stochastic process $\{X_t, t\in T\}$ that satisfies

$$\|X_s-X_t\|_\Phi\le d(s,t) \quad\text{for any } s,t\in T, \tag{9.2.2}$$

we have $\mathbf{P}\{\sup_{t,u\in T}|X_t-X_u|<\infty\}>0$.

(b) For each $\varepsilon>0$, there is $A>0$, such that for each stochastic process $\{X_t, t\in T\}$ that satisfies (9.2.2), we have $\mathbf{P}\{\sup_{t,u\in T}|X_t-X_u|\ge A\}\le\varepsilon$.

(c) There exists a constant $S$ such that for each stochastic process $\{X_t, t\in T\}$ that satisfies (9.2.2), we have $\mathbf{E}\sup_{t,u\in T}|X_t-X_u|\le S$.

(d) There exists a constant $M$ and a positive linear functional $\theta$ on the space $G$ of continuous bounded functions on $T\times T\setminus\Delta(T)$, $\Delta(T)$ being the diagonal of $T$, with $\theta(1) = 1$, such that for any Lipschitz function $f$ on $T$ we have the implication

$$\theta\Big(\Phi\Big(\frac{f(t)-f(u)}{d(t,u)}\Big)\Big)\le1 \implies \sup_{t,u\in T}|f(t)-f(u)|\le M.$$

Moreover, these conditions imply that $T$ is totally bounded, and if $S$, $M$ are chosen minimal, we have $M\le S\le2M$.

The following important result extends Theorem 9.1.1 to the power case. The first part of the statement is Theorem 4.6 in Talagrand [1990]; the second part follows from Theorem 2.9 in the same paper.

9.2.2 Theorem. Let $(T,d)$ be a compact metric space. Let $\Phi$ be a Young function and assume that $\Psi$ satisfies the $\Delta_2$-condition with constant $C$.

(a) Assume that there is a probability measure $m$ on $T$ such that

$$\sup_{t\in T}\int_0^{D(T)}\Phi^{-1}\Big(\frac{1}{m(B(t,\varepsilon))}\Big)d\varepsilon\le A.$$

Then there exists a universal constant $K$ such that for any stochastic process $X = \{X_t, t\in T\}$ that satisfies (9.2.2), we have $\mathbf{E}\sup_{t,u\in T}|X_t-X_u|\le S$ with $S = KA(1+\log C)$.

(b) Further assume that $\lim_{x\to\infty}\Phi(x)/x = \infty$. If $X$ is separable, then it is moreover sample continuous.

A probability measure $m$ such that

$$\sup_{t\in T}\int_0^{D(T)}\Phi^{-1}\Big(\frac{1}{m(B(t,\varepsilon))}\Big)d\varepsilon<\infty \tag{9.2.3}$$

is called a majorizing measure. The condition on $\Psi$ to satisfy the $\Delta_2$-condition is realized if $\Phi(x) = |x|^p$, $p>1$, but fails if $\Phi(x) = |x|(\log(1+|x|))^\beta$, $\beta>0$. The theorem is obtained as a combination of several results and, what is essential, the approach consists of approximating $(T,d)$ by ultrametric spaces. A metric space $(U,\delta)$ is ultrametric when the metric satisfies the stronger condition

$$\delta(u,v)\le\max\{\delta(u,w),\delta(w,v)\} \qquad (u,v,w\in U).$$

In an ultrametric space, two balls of equal radius are either disjoint or identical, which makes the structure of these spaces rather simple. Let $S(T,d,\Phi)$ be the smallest constant $S$ such that for any stochastic process that satisfies (9.2.2), we have $\mathbf{E}\sup_{t,u\in T}|X_t-X_u|\le S$. By Theorem 9.2.2, we know that $S(T,d,\Phi)\le KA(T,d,\Phi)(1+\log C)$, where

$$A(T,d,\Phi) := \inf_{m\in P(T)}\sup_{t\in T}\int_0^{D}\Phi^{-1}\Big(\frac{1}{m(B(t,\varepsilon))}\Big)d\varepsilon.$$

When $(T,d)$ is ultrametric, a two-sided inequality is fulfilled:

$$\frac18A(T,d,\Phi)\le S(T,d,\Phi)\le K(1+\log C)A(T,d,\Phi). \tag{9.2.4}$$

In the general case, Talagrand showed (Theorem 1.2 in the same paper) that

$$\frac14\inf_{m\in P(T)}\sup_{t\in T}\int_0^{D}\psi\Big(\frac{1}{m(B(t,\varepsilon))}\Big)d\varepsilon\le S(T,d,\Phi), \tag{9.2.5}$$

which is always weaker than $\frac14A(T,d,\Phi)\le S(T,d,\Phi)$, and strictly weaker for instance if $\Phi(x) = |x|^p$, $p>1$. However, when $\Phi$ increases fast enough (essentially faster than $x^{\alpha\log\log x}$ for some $\alpha>0$), both inequalities are equivalent, thus giving a complete understanding of the condition $S(T,d,\Phi)<\infty$. These lower counterparts are however only satisfactory from a theoretical point of view. Indeed in Weber [1999: Section 4] we showed, by means of Birkhoff’s theorem and a theorem of Tandori, that it is possible to find two stochastic processes $X^i = \{X^i_t, t\in\mathbb{N}\}\subset L^2(\mathbf{P})$, $i = 1,2$, with increments satisfying

$$\|X^1_s-X^1_t\|_2\le\|X^2_s-X^2_t\|_2 \qquad (\forall\,s,t\in\mathbb{N})$$

and such that $X^2$ is almost surely convergent, whereas $X^1$ is not. Recently Bednorz [2006a] (see Theorems 1.2 and 3.1) has proposed a simplified and slightly more general new approach, although much inspired by Talagrand’s paper. Ultrametric spaces are, however, not involved in Bednorz’s proofs. Their main feature lies in the role played by an adapted calibration of the balls of the parameter space. This is a nice and also pedagogical approach, which we shall present now.

Bednorz’s approach. Let $(T,d)$ be a fixed metric space and $m$ a fixed Borel probability on $(T,d)$ such that $\operatorname{supp}(m) = T$. For $a,b\ge0$ let $G_{a,b}$ be the family of functions $\Phi:\mathbb{R}_+\to\mathbb{R}$ which are increasing, continuous with $\Phi(0) = 0$ and such that

$$x\le a+b\,\frac{\Phi(xy)}{\Phi(y)} \qquad\text{for } x\ge0,\ y\ge\Phi^{-1}(1).$$

Note that each Young function is in $G_{1,1}$. Let $B(T)$ be the space of all Borel bounded functions on $T$ and $C(T)$ the space of continuous functions on $T$. Given a function $\Phi$ in $G_{a,b}$ define

$$s(x) = \int_0^{D(T)}\Phi^{-1}\Big(\frac{1}{m(B(x,\varepsilon))}\Big)d\varepsilon, \qquad S = \sup_{x\in T}s(x), \qquad \tilde S = \int_Ts(u)\,m(du).$$

9.2.3 Theorem. Suppose $\Phi\in G_{a,b}$ and let $R>2$. Then there exists a probability measure $\nu$ on $T\times T$ such that for each bounded continuous function $f$ on $T$ the inequality

$$\Big|f(t)-\int_Tf(u)\,m(du)\Big| \le aA\,s(t)+bB\,\tilde S\int_{T\times T}\Phi\Big(\frac{|f(u)-f(v)|}{d(u,v)}\Big)\nu(du,dv)$$

holds for all $t\in T$, where $A = \dfrac{R^2}{(R-1)(R-2)}$, $B = \dfrac{R^2}{R-1}$.

A consequence of this result is

9.2.4 Theorem. If $\Phi$ is a Young function and $m$ is a majorizing measure, then for any stochastic process $X = \{X_t, t\in T\}$ that satisfies (9.2.2), we have

$$\mathbf{E}\sup_{t,u\in T}|X_t-X_u|\le32S.$$

Theorems 9.2.3, 9.2.4 apply even if (x) = |x|. It should be noted, however, that the estimate given in Theorem 9.2.3 is not homogeneous in f . Proof of Theorem 9.2.3. Define the integer k0 by the condition R k0 ≤ −1 (1) < R k0 +1 . Next put for k > k0 and any x ∈ T ,

1 rk (x) := min ε ≥ 0 : −1 m(B(x,ε)) ≤ Rk . (9.2.6) If k = k0 we simply set rk0 (x) ≡ D(T ). The first important fact is that: For k ≥ k0 , the functions rk are 1-Lipschitz. (9.2.7)

Indeed, from the elementary inclusion relation $B(t,\rho)\subset B\bigl(s,\rho+d(s,t)\bigr)$, valid for $s,t,\rho$ arbitrary, we deduce
\[
\Phi^{-1}\Bigl(\frac{1}{m(B(s,r_k(t)+d(s,t)))}\Bigr)\le\Phi^{-1}\Bigl(\frac{1}{m(B(t,r_k(t)))}\Bigr)\le R^k,
\]
which implies $r_k(s)\le r_k(t)+d(s,t)$, and similarly $r_k(t)\le r_k(s)+d(s,t)$; hence $|r_k(s)-r_k(t)|\le d(s,t)$ as claimed. Now observe that
\[
\begin{aligned}
\sum_{k=k_0}^{K} r_k(x)\bigl(R^k-R^{k-1}\bigr)
&=-r_{k_0}(x)R^{k_0-1}+R^{k_0}\bigl(r_{k_0}(x)-r_{k_0+1}(x)\bigr)+\cdots\\
&\qquad+R^{K-1}\bigl(r_{K-1}(x)-r_K(x)\bigr)+R^{K}r_K(x)\\
&\le\sum_{k=k_0}^{K-1}R^{k}\bigl(r_k(x)-r_{k+1}(x)\bigr)+R^{K}r_K(x)\\
&\le\int_0^{r_{k_0}(x)}\Phi^{-1}\Bigl(\frac{1}{m(B(x,u))}\Bigr)\,du+R^{K}r_K(x).
\end{aligned}
\]

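Hence
\[
\sum_{k=k_0}^{\infty} r_k(x)\bigl(R^k-R^{k-1}\bigr)
\le\int_0^{D(T)}\Phi^{-1}\Bigl(\frac{1}{m(B(x,u))}\Bigr)\,du+\limsup_{K\to\infty}R^{K}r_K(x)
=\int_0^{D(T)}\Phi^{-1}\Bigl(\frac{1}{m(B(x,u))}\Bigr)\,du .
\]
The limit superior vanishes when $s(x)<\infty$: for $u<r_K(x)$ the integrand exceeds $R^K$, so $R^K r_K(x)$ is bounded by the integral over $[0,r_K(x)]$, and $r_K(x)\to 0$.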


9 The majorizing measure method

And consequently
\[
\sum_{k=k_0}^{\infty} r_k(x)R^{k}\le\frac{R}{R-1}\int_0^{D(T)}\Phi^{-1}\Bigl(\frac{1}{m(B(x,u))}\Bigr)\,du=\frac{R}{R-1}\,s(x). \tag{9.2.8}
\]
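As a numerical sanity check of (9.2.7) and (9.2.8), the following sketch computes $r_k$ and $s$ on a finite metric space. Everything concrete here is an assumption made for illustration only: the choice $\Phi(x)=x^2$ (so $\Phi^{-1}(1)=1$ and $k_0=0$), $R=4$, random points in the unit square with integer weights defining $m$, and closed balls. The helper names `profile`, `r` and `s` are hypothetical.

```python
import math
import random

R = 4.0                  # any fixed R > 2
phi_inv = math.sqrt      # inverse of the sample choice Phi(x) = x**2

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(60)]
wts = [random.randint(1, 5) for _ in pts]   # m({x_i}) = wts[i] / W
W = sum(wts)
D = max(math.dist(p, q) for p in pts for q in pts)   # diameter D(T)

def profile(i):
    """Distinct radii d with W times the m-mass of the closed ball B(x_i, d)."""
    out, cum = [], 0
    for d, w in sorted((math.dist(pts[i], q), w) for q, w in zip(pts, wts)):
        cum += w
        if out and out[-1][0] == d:
            out[-1] = (d, cum)
        else:
            out.append((d, cum))
    return out

def r(i, k):
    """r_k(x_i) as in (9.2.6); here k0 = 0 and r_0 = D(T)."""
    if k == 0:
        return D
    return next(d for d, cum in profile(i) if phi_inv(W / cum) <= R ** k)

def s(i):
    """s(x_i): integral over [0, D(T)] of Phi^{-1}(1 / m(B(x_i, eps)))."""
    prof = profile(i)
    radii = [d for d, _ in prof] + [D]
    return sum(phi_inv(W / cum) * (radii[j + 1] - radii[j])
               for j, (_, cum) in enumerate(prof))

n, K = len(pts), 8       # r_k vanishes once R**k >= phi_inv(W / min(wts))
rk = [[r(i, k) for i in range(n)] for k in range(K)]

# (9.2.7): each r_k is 1-Lipschitz.
lipschitz_ok = all(abs(rk[k][i] - rk[k][j]) <= math.dist(pts[i], pts[j]) + 1e-12
                   for k in range(K) for i in range(n) for j in range(n))
# (9.2.8): sum_k r_k(x) R^k <= R/(R-1) * s(x).
bound_ok = all(sum(rk[k][i] * R ** k for k in range(K))
               <= R / (R - 1) * s(i) + 1e-9 for i in range(n))
print(lipschitz_ok, bound_ok)
```

On a finite space $m(B(x,\varepsilon))$ is a step function of $\varepsilon$, so the integral defining $s$ is an exact finite sum and the series in (9.2.8) has only finitely many nonzero terms.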

Now introduce for $k\ge k_0$ the notation $B_k(x)=B(x,r_k(x))$ and
\[
S_kf(x)=\frac{1}{m(B_k(x))}\int_{B_k(x)}f(u)\,m(du):=\fint_{B_k(x)}f(u)\,m(du).
\]
The operators $S_k$ satisfy the following properties:
\[
S_k(1)=1,\qquad f\le g\Rightarrow S_kf\le S_kg\ \text{ and }\ |S_kf|\le S_k|f|,\qquad S_kS_{k_0}f=S_{k_0}f,\qquad f\in C(T)\Rightarrow f(x)=\lim_{k\to\infty}S_kf(x). \tag{9.2.9}
\]

Now observe this: let $i,j\ge k_0$ and take $v$ in $B_i(u)=B(u,r_i(u))$. By (9.2.7), $|r_j(v)-r_j(u)|\le d(u,v)\le r_i(u)$. Hence
\[
S_ir_j(u)=\fint_{B_i(u)}r_j(v)\,m(dv)\le\fint_{B_i(u)}r_j(u)\,m(dv)+\fint_{B_i(u)}r_i(u)\,m(dv)=r_j(u)+r_i(u),
\]
and so for $i,j\ge k_0$,
\[
S_ir_j\le r_j+r_i .
\]

This will permit us to establish a key ingredient of the proof, namely the inequality
\[
S_mS_{m-1}\cdots S_{k+1}r_k\le\sum_{i=k}^{m}2^{i-k}r_i . \tag{9.2.10}
\]
If $m=k+1$, this reduces to $S_{k+1}r_k\le r_k+2r_{k+1}$, which is clear by what precedes. Now, if for $m-1>k\ge k_0$,
\[
S_{m-1}\cdots S_{k+1}r_k\le\sum_{i=k}^{m-1}2^{i-k}r_i ,
\]
then
\[
\begin{aligned}
S_mS_{m-1}\cdots S_{k+1}r_k
&\le S_m\sum_{i=k}^{m-1}2^{i-k}r_i=\sum_{i=k}^{m-1}2^{i-k}S_mr_i\le\sum_{i=k}^{m-1}2^{i-k}(r_m+r_i)\\
&=\sum_{i=k}^{m-1}2^{i-k}r_i+r_m\sum_{i=k}^{m-1}2^{i-k}\le\sum_{i=k}^{m}2^{i-k}r_i ,
\end{aligned}
\]


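as claimed. Finally note that
\[
\sum_{k=k_0}^{m-1}\sum_{i=k}^{m}2^{i-k}r_iR^{k}
=\sum_{k=k_0}^{m-1}\sum_{i=k}^{m}\Bigl(\frac{2}{R}\Bigr)^{i-k}r_iR^{i}
\le\sum_{i=k_0}^{m}\Bigl(\sum_{j=0}^{\infty}\Bigl(\frac{2}{R}\Bigr)^{j}\Bigr)r_iR^{i}
\le\frac{R}{R-2}\sum_{i=k_0}^{\infty}r_iR^{i}. \tag{9.2.11}
\]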
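The interchange of sums behind (9.2.11) can be checked numerically. The sketch below is only an illustration under assumed values ($R=3$, $k_0=0$, $m=12$, random non-negative $r_i$); the function names are hypothetical. It compares the double sum with the geometric bound, truncated at $i=m$, which is the sharper form obtained before extending the sum to infinity.

```python
import random

def double_sum(r, R, k0, m):
    """Left side of (9.2.11): sum over k0 <= k < m and k <= i <= m."""
    return sum(2 ** (i - k) * r[i] * R ** k
               for k in range(k0, m) for i in range(k, m + 1))

def geometric_bound(r, R, k0, m):
    """R/(R-2) * sum_{i=k0}^{m} r_i R^i (right side truncated at i = m)."""
    return R / (R - 2) * sum(r[i] * R ** i for i in range(k0, m + 1))

random.seed(1)
R, k0, m = 3.0, 0, 12            # assumed sample values, with R > 2
checks = []
for _ in range(100):
    r = [random.random() for _ in range(m + 1)]   # arbitrary r_i >= 0
    checks.append(double_sum(r, R, k0, m)
                  <= geometric_bound(r, R, k0, m) * (1 + 1e-12))
all_ok = all(checks)
print(all_ok)
```

The factor $R/(R-2)=\sum_{j\ge 0}(2/R)^j$ is where the hypothesis $R>2$ enters.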

Now from the fact that
\[
\begin{aligned}
f(t)-\int_T f\,dm
&=\lim_{m\to\infty}\bigl(S_mf(t)-S_{k_0}f(t)\bigr)=\lim_{m\to\infty}\bigl(S_mf(t)-S_mS_{m-1}\cdots S_{k_0}f(t)\bigr)\\
&=\lim_{m\to\infty}\sum_{k=k_0}^{m-1}\bigl(S_mS_{m-1}\cdots S_{k+2}S_{k+1}f(t)-S_mS_{m-1}\cdots S_{k+1}S_kf(t)\bigr),
\end{aligned}
\]
we get the bound
\[
\Bigl|f(t)-\int_T f\,dm\Bigr|
\le\lim_{m\to\infty}\Bigl|\sum_{k=k_0}^{m-1}S_mS_{m-1}\cdots S_{k+2}S_{k+1}(I-S_k)f(t)\Bigr|
\le\lim_{m\to\infty}\sum_{k=k_0}^{m-1}S_mS_{m-1}\cdots S_{k+2}\bigl|S_{k+1}(I-S_k)f\bigr|(t). \tag{9.2.12}
\]
But
\[
S_{k+1}(I-S_k)f(w)=\fint_{B_{k+1}(w)}\fint_{B_k(u)}\bigl(f(u)-f(v)\bigr)\,m(dv)\,m(du).
\]
And so
\[
\bigl|S_{k+1}(I-S_k)f(w)\bigr|\le\fint_{B_{k+1}(w)}\fint_{B_k(u)}|f(u)-f(v)|\,m(dv)\,m(du). \tag{9.2.13}
\]
Using the fact that $\Phi\in\mathcal{G}_{a,b}$, with $x=\dfrac{|f(u)-f(v)|}{R^{k+1}d(u,v)}$, $y=R^{k+1}$, yields
\[
\frac{|f(u)-f(v)|}{R^{k+1}d(u,v)}\le a+\frac{b}{\Phi(R^{k+1})}\,\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr).
\]
Let $v\in B_k(u)$; then by definition $d(u,v)\le r_k(u)$. Note also that the inequality $m(B_{k+1}(w))\ge 1/\Phi(R^{k+1})$ holds for any $w\in T$, by construction. Incorporating these two ingredients into the above leads to the more suitable form
\[
|f(u)-f(v)|\le a\,r_k(u)R^{k+1}+b\,m(B_{k+1}(w))\,r_k(u)R^{k+1}\,\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr). \tag{9.2.14}
\]


Thus with (9.2.13) and (9.2.14),
\[
\bigl|S_{k+1}(I-S_k)f(w)\bigr|\le aR^{k+1}S_{k+1}r_k(w)+b\int_T r_k(u)R^{k+1}\fint_{B_k(u)}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,m(du)\,m(dv). \tag{9.2.15}
\]

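By (9.2.10), $S_mS_{m-1}\cdots S_{k+1}r_k\le\sum_{i=k}^{m}2^{i-k}r_i$. Therefore
\[
S_mS_{m-1}\cdots S_{k+2}\bigl|S_{k+1}(I-S_k)f\bigr|(t)
\le aR\sum_{i=k}^{m}2^{i-k}r_i(t)R^{k}+bR\int_T r_k(u)R^{k}\fint_{B_k(u)}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,m(du)\,m(dv).
\]
In view of (9.2.11), (9.2.12) and then (9.2.8), we get
\[
\begin{aligned}
\Bigl|f(t)-\int_T f\,dm\Bigr|
&\le a\,\frac{R^2}{R-2}\sum_{k=k_0}^{\infty}r_k(t)R^{k}
+bR\sum_{k=k_0}^{\infty}\int_T r_k(u)R^{k}\fint_{B_k(u)}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,m(du)\,m(dv)\\
&\le aA\,s(t)+bR\sum_{k=k_0}^{\infty}\int_T r_k(u)R^{k}\fint_{B_k(u)}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,m(du)\,m(dv),
\end{aligned}
\tag{9.2.16}
\]
where $A=\dfrac{R^3}{(R-1)(R-2)}$.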

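Let $\nu$ be a probability measure on $T\times T$ defined by
\[
\nu(g):=\frac{1}{M}\sum_{k=k_0}^{\infty}\int_T r_k(u)R^{k}\fint_{B_k(u)}g(u,v)\,m(du)\,m(dv)\qquad\text{for } g\in B(T\times T),
\]
where $M=\sum_{k=k_0}^{\infty}\int_T r_k(u)R^{k}\,m(du)$. By (9.2.8) we have
\[
M\le\frac{R}{R-1}\int_T s(u)\,m(du)=\frac{R}{R-1}\,\tilde S. \tag{9.2.17}
\]
Hence
\[
\Bigl|f(t)-\int_T f\,dm\Bigr|\le aA\,s(t)+bB\,\tilde S\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv),
\]
where $B=\dfrac{R^2}{R-1}$.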

The proof is now complete.
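The constants just obtained, $A=R^3/((R-1)(R-2))$ and $B=R^2/(R-1)$, can be explored with a few lines of exact arithmetic. The sketch below is an illustration, not part of the proof: it assumes $a=b=1$ (a Young function lies in $\mathcal{G}_{1,1}$) and uses $\tilde S\le S$, so that a bound of the form $2aAS+2bB\tilde S$ is at most $2(A+B)S$. A grid search over $R$ (the grid is an arbitrary choice) suggests that $R=4$ minimizes this factor and gives the value 32, presumably the constant appearing in Theorem 9.2.4.

```python
from fractions import Fraction

def A(R):
    return R ** 3 / ((R - 1) * (R - 2))

def B(R):
    return R ** 2 / (R - 1)

def constant(R, a=1, b=1):
    """2(aA + bB): the factor in front of S once S-tilde is bounded by S."""
    return 2 * (a * A(R) + b * B(R))

grid = [Fraction(n, 10) for n in range(21, 101)]   # R = 2.1, 2.2, ..., 10.0
best_R = min(grid, key=constant)
print(best_R, constant(best_R))   # 4 32
```

Indeed $A+B=2R^2/(R-2)$, whose derivative vanishes at $R=4$.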

Proof of Theorem 9.2.4. To prove the result, we may replace the process {X(t), t ∈ T } by {X(t) − X(t0 ), t ∈ T } where t0 is arbitrary in T . As X(t) − X(t0 ) is integrable by


(9.2.2), we may also assume for the proof that $X(t)$ is integrable. Let $(\Omega,\mathcal{B},\mathbb{P})$ be the underlying probability space on which $X$ is defined. First assume that $\mathcal{B}$ is finite. We identify points in each atom of $\mathcal{B}$ and so assume that $\Omega$ is finite. Observe from (9.2.2) that
\[
|X(\omega,s)-X(\omega,t)|\le d(s,t)\,\Phi^{-1}\bigl(1/\mathbb{P}(\{\omega\})\bigr). \tag{9.2.18}
\]
This means that the trajectories of $X$ are Lipschitz and bounded, and thereby bounded continuous. Now, from Theorem 9.2.3 and the triangle inequality it also follows that there exists a probability measure $\nu$ on $T\times T$ such that for each bounded continuous function $f$ on $T$,
\[
\sup_{s,t\in T}|f(s)-f(t)|\le 2aAS+2bB\tilde S\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv).
\]
Therefore
\[
\mathbb{E}\sup_{s,t\in T}|X(s)-X(t)|\le 2aAS+2bB\tilde S\int_{T\times T}\mathbb{E}\,\Phi\Bigl(\frac{|X(u)-X(v)|}{d(u,v)}\Bigr)\,\nu(du,dv)\le 2aAS+2bB\tilde S.
\]
In the general case, we have to show for any finite subset $T_0$ of $T$ that
\[
\mathbb{E}\sup_{s,t\in T_0}|X(s)-X(t)|\le 2aAS+2bB\tilde S.
\]

We may assume that $\mathcal{B}$ is countably generated. And so there exists an increasing sequence $\mathcal{B}_n$ of finite $\sigma$-fields whose union generates $\mathcal{B}$. As $\mathbb{E}|X(t)|<\infty$, the conditional expectations $X_n(t)=\mathbb{E}(X(t)\mid\mathcal{B}_n)$ are well defined. Observe by Jensen's inequality that
\[
\mathbb{E}\,\Phi\Bigl(\frac{|X_n(s)-X_n(t)|}{d(s,t)}\Bigr)\le\mathbb{E}\,\Phi\Bigl(\frac{|X(s)-X(t)|}{d(s,t)}\Bigr)\le 1.
\]
Hence
\[
\mathbb{E}\sup_{s,t\in T_0}|X_n(s)-X_n(t)|\le 2aAS+2bB\tilde S.
\]
Owing to the fact that $X_n(t)\to X(t)$ $\mathbb{P}$-almost surely and in $L^1(\mathbb{P})$ for each $t\in T_0$, we conclude that
\[
\mathbb{E}\sup_{s,t\in T_0}|X(s)-X(t)|\le 2aAS+2bB\tilde S,
\]
for any finite $T_0\subset T$, as requested.

9.2.5 Remark. It is natural to ask whether, under the existence of a majorizing measure, the following implication is true:
\[
\|X_t-X_u\|_{\Phi}\le d(t,u),\ \forall t,u\in T\ \Longrightarrow\ \Bigl\|\sup_{t,u\in T}|X_t-X_u|\Bigr\|_{\Phi}<\infty .
\]


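By Theorem 9.1.2 and Proposition 2.7 in Talagrand [1990], if there exists a Young function $\Phi$ and $\alpha>0$ such that
\[
a\ge\Phi^{-1}\bigl(\tfrac12\bigr)\ \text{ and }\ b\ge 1\ \Longrightarrow\ \Phi(ab)\ge\alpha\,\Phi(a)\Phi(b), \tag{9.2.19}
\]
then
\[
\Bigl\|\sup_{t,u\in T}|X_t-X_u|\Bigr\|_{\Phi}\le K(\Phi)\,S(T,d,\Phi)/\alpha, \tag{9.2.20}
\]
where $K(\Phi)$ depends on $\Phi$ only. This applies if $\Phi(x)=|x|^p$, $p>1$, in which case the answer is yes, but the answer is no if $\Phi(x)=|x|\log(1+|x|)$, by Proposition 2.9 in the same paper. The answer is also naturally yes for the exponential functions $\Phi_\alpha(x)=e^{|x|^{\alpha}}-1$, $\alpha\ge 1$, considered in Section 9.1.1. Bednorz [2006a: Theorem 2.1] has also considered this problem and proved the following.

9.2.6 Proposition. Let $\Phi\in\mathcal{G}_{a,b}$. Let $\alpha\ge 0$, $\beta\ge 0$ and $\vartheta:\mathbb{R}_+\to\mathbb{R}$ be increasing, continuous with $\vartheta(0)=0$, $\lim_{x\to\infty}\vartheta(x)=\infty$, such that
\[
\vartheta(x)\le\alpha+\beta\,\frac{\Phi(xy)}{\Phi(y)}\qquad\text{for } x\ge 0,\ y\ge 0. \tag{9.2.21}
\]
Then for each bounded continuous function $f$ on $T$ the following inequality holds:
\[
\sup_{t\in T}\vartheta\Bigl(\frac{\bigl|f(t)-\int_T f(u)\,m(du)\bigr|}{K}\Bigr)\le\alpha+\beta\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv), \tag{9.2.22}
\]
where $K=(aA+bB)S$ and $A$, $B$, $\nu$ are as in Theorem 9.2.3.

Proof. Given $f$, let $c$ be defined by
\[
\vartheta(c)=\alpha+\beta\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv).
\]
In view of (9.2.21), for all $u,v\in T$,
\[
\bigl(\vartheta(c)-\alpha\bigr)\,\Phi\Bigl(\frac{|f(u)-f(v)|}{c\,d(u,v)}\Bigr)\le\beta\,\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr).
\]
Thereby
\[
\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{c\,d(u,v)}\Bigr)\,\nu(du,dv)\le\frac{\beta}{\vartheta(c)-\alpha}\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv)=1.
\]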


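Using now Theorem 9.2.3 (applied to $f/c$), we obtain
\[
\frac1c\sup_{t\in T}\Bigl|f(t)-\int_T f(u)\,m(du)\Bigr|\le aAS+bB\tilde S\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{c\,d(u,v)}\Bigr)\,\nu(du,dv)\le(aA+bB)S=K.
\]
Since $\vartheta$ is increasing,
\[
\sup_{t\in T}\vartheta\Bigl(\frac{\bigl|f(t)-\int_T f(u)\,m(du)\bigr|}{K}\Bigr)
=\vartheta\Bigl(\sup_{t\in T}\frac{\bigl|f(t)-\int_T f(u)\,m(du)\bigr|}{K}\Bigr)
\le\vartheta(c)=\alpha+\beta\int_{T\times T}\Phi\Bigl(\frac{|f(u)-f(v)|}{d(u,v)}\Bigr)\,\nu(du,dv),
\]
as requested.

Problem 10. Let $(\Omega,\mathcal{A},\mathbb{P})$ be a probability space, $(T,d)$ a compact metric space and $\mu$ a Radon probability on $T$. Let $1<p<\infty$. Consider a stochastic process $X=\{X(\omega,t),\ \omega\in\Omega,\ t\in T\}$ with increments satisfying the assumption
\[
\mathbb{E}\int_T\int_T\Bigl|\frac{X(s)-X(t)}{d(s,t)}\Bigr|^{p}\mu(ds)\,\mu(dt)<\infty .
\]
Find conditions ensuring that $X$ is sample bounded.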

9.3 A useful criterion

Let $\xi=\{\xi_l,\ l\ge 1\}$ be a sequence of random variables defined on some probability space $(\Omega,\mathcal{A},\mathbb{P})$. Let $m=\{m_l,\ l\ge 1\}$ be a sequence of positive reals with partial sums $M_n=\sum_{l=1}^{n}m_l$. Assume that $(\xi,m)$ are linked by the increment condition
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{2}\le\sum_{l=i}^{j}m_l\qquad(i\le j).
\]
In Chapter 8, we used the metric entropy method to obtain various criteria for the convergence almost everywhere of the series $\sum_{l\ge 1}\xi_l$ under the above assumption or similar ones. Here, instead of studying the convergence almost everywhere of the series $\sum_{l\ge 1}\xi_l$, we are rather interested in finding fine convergence criteria for the averages $\frac{1}{v_n}\sum_{l\le n}\xi_l$, where $v_n$ are suitable normalizing factors. The convergence of these averages can often be efficiently established via Kronecker's lemma, once the series $\sum_{l\ge 1}\xi_l$ is shown to be convergent almost everywhere. However, the two convergence properties are basically different, and it seemed natural to develop a separate study for


the averages. Because these properties are close, it also appeared appropriate to use a finer approach, namely the majorizing measure method. We shall prove by means of this method the existence of a simple general criterion, built solely from the sequence $m$, and allowing one to get remarkably efficient uniform bounds for suitable averages of the random variables $\xi_l$. We assume from now on, and throughout the whole section, that the sequence $m=\{m_l,\ l\ge 1\}$ has partial sums $M_n$ verifying
\[
M_n\uparrow\infty \tag{9.3.1}
\]
as $n$ tends to infinity; and let $M=\{M_n,\ n\ge 1\}$. We will further assume that $m$ does not increase faster than exponentially. To be precise, we assume the following growth condition: for any $\rho$ large enough,
\[
C_m(\rho)=\sup_{n\ge 1}\frac{\sum_{k>n}m_k\rho^{-k}}{m_n\rho^{-n}}<\infty . \tag{9.3.2}
\]
We also consider sequences of random variables $\xi$ satisfying a more general type of increment condition. Let $1<p<\infty$ and $q=p/(p-1)$ be fixed. Let $\Phi:\mathbb{R}_+\to\mathbb{R}_+$ be increasing. We assume that
\[
\Phi(x)/x^{p}\ \text{is nonincreasing.} \tag{9.3.3}
\]
This implies that there exists a constant $1<C<\infty$ such that
\[
\Phi(2x)\le C\,\Phi(x)\qquad(\forall x\ge 0). \tag{9.3.4}
\]
As typical examples, we have the functions $\Phi(x)=x^{\alpha}(\log(1+x))^{\beta}$, $0<\alpha<p$, $\beta\in\mathbb{R}$, or $\alpha=p$ and $\beta\in\mathbb{R}_-$. Consider the more general assumption:
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{p}\le\Phi\Bigl(\sum_{l=i}^{j}m_l\Bigr)\qquad(i\le j). \tag{9.3.5}
\]
Let $\varphi:\mathbb{R}_+\to\mathbb{R}_+$ denote a continuous increasing concave function such that $\varphi^{p}$ is convex and $\varphi(0)=0$. The question studied can be described as follows.

Problem. Given $\varphi$, find conditions ensuring the existence of a constant $K$ (depending on $p$, $m$, $\Phi$ and $\varphi$ only) such that any sequence of random variables $\xi$ satisfying the increment condition (9.3.5) verifies
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\varphi(M_n)}\Bigr\|_{p}\le K
\qquad\text{and}\qquad
\frac{1}{\varphi(M_n)}\sum_{l=1}^{n}\xi_l\xrightarrow{\ \text{a.s.}\ }0. \tag{9.3.6}
\]


We introduce a definition.

9.3.1 Definition. A function $\varphi$ enjoying property (9.3.6) will be called $(p,\Phi,m)$-admissible, or more simply admissible.

The difficulty in the application of the majorizing measure method, when compared to other methods, lies in the fact that one has not only to imagine the measure, but also to invent an argument that goes with it, and to show that this measure in turn satisfies the majorizing measure condition. Once this step is performed, the method yields efficient bounds. Introduce the following conditions linking $\Phi$ and $\varphi$:
\[
\text{(a)}\quad\varphi(x)/\Phi(x)^{1/p}\ \text{is nondecreasing},\qquad
\text{(b)}\quad\int_{\lambda}^{\infty}\frac{\Phi(t)^{1/p}}{t\,\varphi(t)}\,dt<\infty\ \text{ for some }\lambda>0. \tag{9.3.7}
\]
Finally, we define a class of functions of particular relevance.

9.3.2 Definition. Let $\mathcal{L}$ be the class of functions defined as follows:
\[
\mathcal{L}=\Bigl\{L:\mathbb{R}_+\to\mathbb{R}_+:\ \frac{L(t)}{t^{p}}\ \text{is nonincreasing and}\ \int_{\lambda}^{\infty}\frac{dt}{L(t)}<\infty\ \text{for some }\lambda>0\Bigr\}.
\]
The following criterion is the main result of the section.

9.3.3 Theorem. Assume that $(\Phi,\varphi)$ satisfy condition (9.3.7). Assume further that $(m,\Phi,\varphi)$ are linked by the following condition: there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\Bigl[\Bigl(\frac{\Phi(m_n)}{m_n}\Bigr)^{1/p}+\int_{\Phi(m_n)}^{\Phi(M_n)}\frac{dt}{t^{1/q}\,\Phi^{-1}(t)^{1/p}}\Bigr]<\infty . \tag{9.3.8}
\]
Then $\varphi$ is admissible.

The criterion we obtain is directly expressed in terms of the sequences $m$ and $M$, which is not possible by means of the metric entropy method, since that method uses covering numbers by definition. This also makes the criterion very easy to use. In some important cases, condition (9.3.8) can be simplified. Assume that $m$ is a bounded sequence. Then condition (9.3.8) is equivalent to: there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\int_{\Phi(m_n)}^{\Phi(M_n)}\frac{dt}{t^{1/q}\,\Phi^{-1}(t)^{1/p}}<\infty . \tag{9.3.9}
\]
This is immediate since $\Phi(x)\le x^{p}$; so $\Phi(m_n)/m_n\le m_n^{p-1}$, and $m$ is bounded. If $\Phi(x)\le x$, then $x^{1/q}\Phi^{-1}(x)^{1/p}\ge x$; condition (9.3.8) reduces to: there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\log\frac{\Phi(M_n)}{\Phi(m_n)}<\infty . \tag{9.3.10}
\]
In the next statements, we apply Theorem 9.3.3 to the case $\Phi(x)=x^{\beta}$, $0<\beta\le p$.


9.3.4 Corollary ($0<\beta<1$). $\varphi$ is admissible if there exists $L\in\mathcal{L}$ such that
\[
\text{(a)}\quad\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\,m_n^{(\beta-1)/p}<\infty
\qquad\text{and}\qquad
\text{(b)}\quad\int_{m_1}^{\infty}\frac{dt}{t^{1-\beta/p}\,\varphi(t)}<\infty .
\]
If $m_n\ge c>0$, then for any $L\in\mathcal{L}$, $\varphi(t)=L(t)^{1/p}$ is admissible; for instance, $\varphi(t)=t^{1/p}\log^{\tau/p}(1+t)$ with $\tau>1$.

The first assertion is immediate. Concerning the second, if $\varphi(t)=L(t)^{1/p}$, then (a) is fulfilled and we observe by Hölder's inequality that
\[
\int_{m_1}^{\infty}\frac{dt}{t^{1-(\beta/p)}L(t)^{1/p}}\le\Bigl(\int_{m_1}^{\infty}\frac{dt}{L(t)}\Bigr)^{1/p}\Bigl(\int_{m_1}^{\infty}\frac{dt}{t^{q(1-(\beta/p))}}\Bigr)^{1/q}<\infty ,
\]
since $1-(\beta/p)>1/q$. Thus (b) is satisfied too.

9.3.5 Corollary ($\beta=1$). $\varphi$ is admissible if there exists $L\in\mathcal{L}$ such that
\[
\text{(a)}\quad\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\log\frac{M_n}{m_n}<\infty
\qquad\text{and}\qquad
\text{(b)}\quad\int_{m_1}^{\infty}\frac{dt}{t^{1/q}\,\varphi(t)}<\infty .
\]
If
\[
\log\frac{M_n}{m_n}=O(\log M_n),
\]
then for any $L\in\mathcal{L}$, $\varphi(t)=L(t)^{1/p}\log t$ is admissible; for instance $\varphi(t)=t^{1/p}\log^{1+\tau/p}(1+t)$ with $\tau>1$.

Here again the first assertion is immediate; as for the second, one uses Hölder's inequality to show (b). When $m_l\equiv 1$, one recovers Theorem 3 of Gál–Koksma [1950]. The last condition on the growth of the sequence $m$ is satisfied when $m_l\ge l^{-c}$ for some $0\le c<1$. The critical case occurs when $m_l=l^{-1}$. When the random variables $\xi_l$ are indicators, it is possible to overcome that difficulty. The key observation to treat this case is that when $\Phi(x)=x$, or more generally when $\Phi$ is subadditive, assumption (9.3.5) is preserved when replacing the sequence $\xi$ by a sequence of sums on consecutive blocks of the $\xi_l$'s. Let indeed $\{n_k,\ k\ge 1\}$ be some increasing sequence of integers, and put $\gamma_k=\sum_{n_{k-1}\le l<n_k}\xi_l$. Then
\[
\mathbb{E}\Bigl|\sum_{i\le k\le j}\gamma_k\Bigr|^{p}=\mathbb{E}\Bigl|\sum_{i\le k\le j}\ \sum_{n_{k-1}\le l<n_k}\xi_l\Bigr|^{p}\le\Phi\Bigl(\sum_{i\le k\le j}\ \sum_{n_{k-1}\le l<n_k}m_l\Bigr).
\]
Now, consider the indicator case: $\xi_l=\mathbf{1}_{A_l}-\mathbb{P}(A_l)$ satisfying
\[
\mathbb{E}\Bigl|\sum_{i\le l\le j}\xi_l\Bigr|^{p}\le\sum_{i\le l\le j}m_l ,
\]


with a sequence $m$ such that $0\le\mathbb{P}(A_l)\le m_l\le 1$. For $p=2$, this assumption is realized as soon as
\[
\mathbb{P}(A_k\cap A_l)\le\mathbb{P}(A_k)\mathbb{P}(A_l)+\varphi_{l-k}\mathbb{P}(A_k)\qquad(\forall l\ge k\ge 1),
\]
where $\varphi=\{\varphi_i,\ i\ge 0\}$ is a sequence of nonnegative reals such that the series $\sum_{i=0}^{\infty}\varphi_i$ converges. There are many examples in metrical number theory and probability theory for which the latter condition is fulfilled. Fix some real $a>1$, and let our increasing sequence of integers $\{n_k,\ k\ge 1\}$ be defined as follows:
\[
n_k=\inf\{n\ge 1:\ M_n\ge ka\}.
\]
Put also
\[
\rho_k=\sum_{n_{k-1}\le l<n_k}\mathbb{P}(A_l),\qquad
\mu_k=\sum_{n_{k-1}\le l<n_k}m_l,\qquad
\varrho_k=\sum_{n_{k-1}\le l<n_k}\mathbf{1}_{A_l}.
\]
Then $M_{n_k-1}<ka\le M_{n_k}\le M_{n_k-1}+1$. Hence $0<a-1\le\mu_k\le a+1$ and
\[
\mathbb{E}\Bigl|\sum_{u\le k\le v}(\varrho_k-\rho_k)\Bigr|^{p}\le\sum_{u\le k\le v}\mu_k ,\qquad 0\le u\le v<\infty .
\]
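The blocking construction can be illustrated numerically. The sketch below is an assumption-laden example, not the author's computation: the weights $m_l=l^{-1/2}$, the value $a=2$ and the helper name `block_sums` are all hypothetical choices. It computes $n_k=\inf\{n\ge 1: M_n\ge ka\}$ and the block sums $\mu_k$ for $k\ge 2$, which should lie between $a-1$ and $a+1$ whenever $0\le m_l\le 1$.

```python
from bisect import bisect_left
from itertools import accumulate

def block_sums(m, a):
    """Return [mu_2, mu_3, ...] for the blocks n_{k-1} <= l < n_k.

    m[l-1] plays the role of m_l (assumed to satisfy 0 <= m_l <= 1), and
    n_k = inf{n >= 1 : M_n >= k*a} with M_n = m_1 + ... + m_n.
    """
    M = [0.0] + list(accumulate(m))        # M[n] = M_n, with M[0] = 0
    n_k, k = [], 1
    while k * a <= M[-1]:
        n_k.append(bisect_left(M, k * a))  # first n with M_n >= k*a
        k += 1
    # mu_k = M_{n_k - 1} - M_{n_{k-1} - 1}
    return [M[hi - 1] - M[lo - 1] for lo, hi in zip(n_k, n_k[1:])]

a = 2.0
m = [l ** -0.5 for l in range(1, 10001)]   # sample weights with m_l <= 1
mu = block_sums(m, a)
print(len(mu), min(mu), max(mu))
```

Because each $m_l\le 1$, the partial sums cannot jump over an interval of length $a>1$, which is why every block is nonempty and its mass stays within one unit of $a$.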

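Corollary 9.3.5 applies, and we get for $\tau>1$,
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{k=1}^{n}(\varrho_k-\rho_k)\bigr|}{\bigl(\sum_{k=1}^{n}\mu_k\bigr)^{1/2}\log^{1+\tau/p}\bigl(\sum_{k=1}^{n}\mu_k\bigr)}\Bigr\|_{p}<\infty .
\]
Now, let $n$ be arbitrary, and choose $k$ such that $n_{k-1}\le n<n_k$. As
\[
\sum_{j=1}^{k-1}(\varrho_j-\rho_j)-(a+1)\le\sum_{l=1}^{n}\bigl(\mathbf{1}_{A_l}-\mathbb{P}(A_l)\bigr)\le\sum_{j=1}^{k}(\varrho_j-\rho_j)+(a+1),
\]
one easily gets (for any $\tau>1$)
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\bigl(\mathbf{1}_{A_l}-\mathbb{P}(A_l)\bigr)\bigr|}{\bigl(\sum_{l=1}^{n}m_l\bigr)^{1/2}\log^{1+\tau/p}\bigl(\sum_{l=1}^{n}m_l\bigr)}\Bigr\|_{p}<\infty . \tag{9.3.11}
\]
But the indicator case is also a limit case. Indeed, even when $\xi$ is a sequence of bounded random variables, the implication (9.3.5) $\Rightarrow$ (9.3.11) is no longer true. Assume the contrary. Let $\{\eta_n,\ n\ge 1\}$ be a uniformly bounded orthonormal sequence and $\{c_n,\ n\ge 1\}$ a real sequence with $\sum_{n=1}^{\infty}c_n^2<\infty$. Given any function $\omega(n)\to\infty$, we can find a sequence $(m_k)$ such that
\[
m_k\ge c_k^2,\qquad\sum_{k=1}^{\infty}m_k=\infty,\qquad\sum_{k=1}^{N}m_k=O(\omega(N)).
\]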


Then
\[
\mathbb{E}\Bigl|\sum_{i\le l\le j}c_l\eta_l\Bigr|^{2}=\sum_{i\le l\le j}c_l^2\le\sum_{i\le l\le j}m_l ,
\]
and thus condition (9.3.5) is satisfied for $\xi_k=c_k\eta_k$ and $p=2$. Hence (9.3.11) yields
\[
\sum_{k=1}^{N}c_k\eta_k\overset{\text{a.s.}}{=}o\bigl(\omega(N)^{1/2}\log^{b}\omega(N)\bigr)
\]
for any $b>3/2$. The last conclusion is however false for sufficiently slowly increasing $\omega(N)$ (e.g., for $\omega(N)=\log N$): indeed, by a result of Tandori [1957] (Satz III), for any function $\psi(n)=o(\log n)$ there exist a uniformly bounded orthonormal system $\{\eta_n,\ n\ge 1\}$ and a real sequence $\{c_n,\ n\ge 1\}$ with $\sum c_n^2<\infty$ such that, almost surely,
\[
\Bigl|\sum_{k=1}^{N}c_k\eta_k\Bigr|\ge\psi(N)\qquad\text{infinitely often.}
\]
In the next statements, we continue to examine the case $\Phi(x)=x^{\beta}$.

9.3.6 Corollary ($1<\beta<p$). For any $L\in\mathcal{L}$ such that
\[
\int_{m_1}^{\infty}\frac{dt}{t^{1/q}L(t)^{1/p}}<\infty ,
\]
$\varphi(t)=L(t)^{1/p}t^{(\beta-1)/p}$ is admissible; for instance $\varphi(t)=t^{\beta/p}\log^{\tau/p}(1+t)$ with $\tau>p$.

Indeed by Theorem 9.3.3, $\varphi$ is admissible if there exists $L\in\mathcal{L}$ such that
\[
\text{(a)}\quad\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\,M_n^{(\beta-1)/p}<\infty
\qquad\text{and}\qquad
\text{(b)}\quad\int_{m_1}^{\infty}\frac{dt}{t^{1-\beta/p}\,\varphi(t)}<\infty .
\]
When $m_l\equiv 1$, one recovers Theorem 5 in Gál–Koksma [1950]. The next application concerns some boundary cases.

9.3.7 Corollary. Case a) $\Phi(x)=x\log^{-p}(1+x)$. Then $\varphi$ is admissible if there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\Bigl[\frac{1}{\log(1+m_n)}+\log\log M_n\Bigr]<\infty
\qquad\text{and}\qquad
\int_{m_1}^{\infty}\frac{dt}{t^{1/q}\log(1+t)\,\varphi(t)}<\infty .
\]
Moreover, if
\[
\frac{1}{\log(1+m_n)}=O(\log\log M_n),
\]
then $\varphi(t)=L(t)^{1/p}\log\log t$ is, for any $L\in\mathcal{L}$, admissible.


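Case b) $\Phi(x)=x^{p}\log^{-b}(1+x)$ $(b>0)$. Then $\varphi$ is admissible if there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/p}}{\varphi(M_n)}\,M_n^{1/q}\log^{-b/p}M_n<\infty
\qquad\text{and}\qquad
\int_{m_1}^{\infty}\frac{dt}{\log^{b/p}(1+t)\,\varphi(t)}<\infty .
\]
In particular, $\varphi(t)=t\log^{(\tau-b)/p}(1+t)$ is, for any $\tau>1$, admissible.

Case c) $\Phi(x)=x^{p}$. Then, for any $L\in\mathcal{L}$ such that
\[
\int_{m_1}^{\infty}\frac{dt}{t^{1/q}L(t)^{1/p}}<\infty ,
\]
$\varphi(t)=t^{1/q}L(t)^{1/p}$ is admissible. In particular $\varphi(t)=t\log^{\tau}(1+t)$ with $\tau>1$ is admissible.

Concerning Case c), we note that the increment condition (9.3.5) is trivially satisfied when, for instance, $m_l=\|\xi_l\|_p$. The condition however forces $\varphi$ to satisfy $\lim_{t\to\infty}\varphi(t)/t=\infty$, which is not surprising here. One thus always has, with $\tau>1$,
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\sum_{l=1}^{n}\|\xi_l\|_p\,\log^{\tau}\bigl(1+\sum_{l=1}^{n}\|\xi_l\|_p\bigr)}\Bigr\|_{p}<\infty .
\]
There are some applications in ergodic theory.

9.3.8 Proposition. Let 1 1 and put
\[
T_n^{\tau}f=
\begin{cases}
\dfrac{1}{n^{1/p}(\log n)^{\tau/p}}\displaystyle\sum_{l=1}^{n}T^{l}f & \text{if } (1-\alpha)p<1,\\[2ex]
\dfrac{1}{n^{1/p}(\log n)^{1+\tau/p}}\displaystyle\sum_{l=1}^{n}T^{l}f & \text{if } (1-\alpha)p=1,\\[2ex]
\dfrac{1}{n^{1-\alpha}(\log n)^{\tau}}\displaystyle\sum_{l=1}^{n}T^{l}f & \text{if } (1-\alpha)p>1.
\end{cases}
\]
Then,
\[
T_n^{\tau}f\xrightarrow{\ \text{a.s.}\ }0\qquad\text{and}\qquad\Bigl\|\sup_{n\ge 1}|T_n^{\tau}f|\Bigr\|_{p}<\infty . \tag{9.3.13}
\]
According to a result of Derriennic and Lin [2001: Proposition 2.18], for $T$ a contraction, assumption (9.3.12) is equivalent to
\[
\sup_{n\ge 1}\frac{1}{n^{1-\alpha}}\Bigl\|\sum_{l=1}^{n}T^{l}f\Bigr\|_{p}<\infty . \tag{9.3.14}
\]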


Now, if $T$ is power bounded, $T$ is a contraction in an equivalent norm (Krengel [1985; p. 110]), and Proposition 2.18 of Derriennic and Lin still applies to give (9.3.14). The increment condition (9.3.5) is fulfilled with $\Phi(x)=x^{p(1-\alpha)}$. Proposition 9.3.8 thus follows at once from Corollaries 9.3.4, 9.3.5 and 9.3.6.

9.3.9 Remarks. Some comparisons with existing results are necessary.

1. In the particular case that $T$ is induced on $L^p$ by a Dunford–Schwartz operator, Corollary 3.7 of Derriennic and Lin [2001] gives rates of convergence under assumption (9.3.14). When $(1-\alpha)p<1$, the rate there is $\frac{1}{n^{1/p}}\sum_{l=1}^{n}T^{l}f\to 0$ a.e., which is better than what Proposition 9.3.8 yields. On the other hand, when $(1-\alpha)p\ge 1$, Proposition 9.3.8 provides a better rate than Derriennic and Lin [2001].

2. For the particular case that $T$ is induced by a Dunford–Schwartz operator and $f\in(I-T)^{\alpha}L^{p}$, which implies $\lim_{n\to\infty}\frac{1}{n^{1-\alpha}}\bigl\|\sum_{l=1}^{n}T^{l}f\bigr\|_{p}=0$ by Corollary 2.15 in Derriennic and Lin [2001], more precise information is given in Derriennic and Lin [2001], Theorem 3.2.

3. For $T$ power-bounded on $L^p$ and $f\in L^p$ satisfying (9.3.14), the rates obtained here are better than in Cohen and Lin [2003: Corollary 1].

4. For $T$ unitary on $L^2$ and $f\in L^2$ satisfying (9.3.14), the rates obtained in [Gaposhkin: 1979], Theorem 3, cases (vii), (iv), and (iii) are better.

Before passing to another application, we shall consider a variant of assumption (9.3.13) useful for $L^2$-applications. Let $\Psi:\mathbb{R}_+\to\mathbb{R}_+$ be some nondecreasing function, and consider the following type of increment assumption:
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{p}\le\Psi\Bigl(\sum_{l=1}^{j}m_l\Bigr)\,\Phi\Bigl(\sum_{l=i}^{j}m_l\Bigr)\qquad(i\le j). \tag{9.3.15}
\]
We further assume $\Psi$ and $\Phi$ to also satisfy the condition below:
\[
\frac{\Psi(M_n)-\Psi(M_m)}{\Psi(M_n)}\le B\,\frac{\Phi(M_n-M_m)}{\Phi(M_m)}\qquad(m\le n), \tag{9.3.16}
\]
where $B$ is an absolute constant.

9.3.10 Theorem. Assume that $\Phi$, $\Psi$ satisfy condition (9.3.16). Further, assume that $p$, $m$, $\Phi$ and $\varphi$ satisfy conditions (9.3.7) and (9.3.8). Then there exists a constant $K<\infty$ such that any sequence $\xi=\{\xi_l,\ l\ge 1\}$ of random variables satisfying the increment condition (9.3.15) verifies
\[
\frac{1}{\Psi(M_n)^{1/p}\varphi(M_n)}\sum_{l=1}^{n}\xi_l\xrightarrow{\ \text{a.s.}\ }0
\qquad\text{and}\qquad
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\Psi(M_n)^{1/p}\varphi(M_n)}\Bigr\|_{p}\le K.
\]


The proof is given in Section 9.5. The main argument is that, under conditions (9.3.15) and (9.3.16), the increments of the averages considered are controlled in the same manner as those of the preceding averages. In view of our next theorem, we shall specialize this result to $L^2$-spaces and $\Phi(x)=x$. Condition (9.3.15) becomes
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{2}\le\Psi\Bigl(\sum_{l=1}^{j}m_l\Bigr)\sum_{l=i}^{j}m_l\qquad(i\le j). \tag{9.3.17}
\]
9.3.11 Theorem. Assume that $\Psi$ is concave. Further assume $\varphi$ is such that there exists $L\in\mathcal{L}$ satisfying the condition
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/2}}{\varphi(M_n)}\log\frac{M_n}{m_n}<\infty
\qquad\text{and}\qquad
\int_{\lambda}^{\infty}\frac{dt}{\sqrt{t}\,\varphi(t)}<\infty\ \text{ for some }\lambda>0.
\]
Then there exists a constant $K$ depending on $m$, $\Psi$ and $\varphi$ only, such that any sequence $\xi=\{\xi_l,\ l\ge 1\}$ of random variables satisfying the increment condition (9.3.17) also verifies
\[
\frac{\sum_{l=1}^{n}\xi_l}{\Psi(M_n)^{1/2}\varphi(M_n)}\xrightarrow{\ \text{a.s.}\ }0
\qquad\text{and}\qquad
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\Psi(M_n)^{1/2}\varphi(M_n)}\Bigr\|_{2}\le K.
\]
If
\[
\log\frac{M_n}{m_n}\sim\log M_n ,
\]
one can take $\varphi(t)=L(t)^{1/2}\log t$ for any $L\in\mathcal{L}$; for instance $\varphi(t)=t^{1/2}\log^{\tau}(1+t)$ with $\tau>3/2$.

Indeed, when $p=2$ and $\Phi(x)=x$, condition (9.3.16) reduces to
\[
\frac{\Psi(M_n)-\Psi(M_m)}{\Psi(M_n)}\le B\,\frac{M_n-M_m}{M_m}\qquad(m\le n).
\]
Since $\Psi$ is concave, for $m\le n$,
\[
\frac{\Psi(M_n)-\Psi(M_m)}{M_n-M_m}\le\frac{\Psi(M_n)}{M_n}\le\frac{\Psi(M_n)}{M_m}.
\]
This implies (9.3.16) with $B=1$. Theorem 9.3.11 then follows from Theorem 9.3.10 and the fact that, in the case under consideration, conditions (9.3.7) and (9.3.8) reduce to the conditions stated in Corollary 9.3.5.

In the case $m_l\equiv 1$, Theorem 9.3.11 also complements Theorem 7 in Gál–Koksma [1950], where under the assumption
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{p}\le C\,j^{p-\sigma}(j-i)^{\sigma}\eta(j-i)\qquad(i\le j)
\]


with $p>\sigma>1$ and $\eta(n)>0$ nonincreasing such that the series $\sum_{n\ge 1}\eta(n)/n$ converges, it is proved that $\frac{1}{L}\sum_{l=1}^{L}\xi_l$ tends to 0 almost surely when $L$ tends to infinity. Here the case $p=2$ is considered and (9.3.17) with $\Psi(x)=x^{s}$, $s\in\,]0,1[$, reads as follows:
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{2}\le C\,j^{s}(j-i)\qquad(i\le j).
\]
This corresponds to $\eta(x)=x^{1-\sigma}$, $s=2-\sigma$ in Theorem 7 of Gál–Koksma [1950]. Applying Theorem 9.3.10 gives for any $\tau>3/2$,
\[
\frac{1}{L^{\frac{s+1}{2}}\log^{\tau}L}\sum_{l=1}^{L}\xi_l\to 0
\]
almost surely when $L$ tends to infinity, which is better than what is obtained by applying Theorem 7 in Gál–Koksma [1950].

Now, we pass to our next application to ergodic theory. Consider the following data. $\Theta=\{\theta_l,\ l\ge 1\}$ is a sequence of reals such that $\Theta_n=\sum_{1\le l\le n}\theta_l^2\uparrow\infty$. $P=\{p_l,\ l\ge 1\}$ is an increasing sequence of positive integers. $T$ is a contraction in $L^2(\mathbb{P})$. Introduce the sequence of complex numbers
\[
\zeta_l(x)=\theta_l e^{2i\pi p_l x}\qquad(x\in[0,1[\,=\mathbb{R}/\mathbb{Z}).
\]
Let $\|\cdot\|_{\infty}$ denote the supremum norm on $C([0,1[)$. We shall assume that the following condition is realized: there exist a sequence $m$ and a concave nondecreasing function $\Psi:\mathbb{R}_+\to\mathbb{R}_+$ such that
\[
\Bigl\|\sum_{i\le l\le j}\zeta_l\Bigr\|_{\infty}\le\Psi\Bigl(\sum_{1\le l\le j}m_l\Bigr)^{1/2}\Bigl(\sum_{i\le l\le j}m_l\Bigr)^{1/2}\qquad(i\le j). \tag{9.3.18}
\]
Condition (9.3.18) usually describes a situation where $m_l\sim|\theta_l|^2$, although the two are not equal. Some examples are given in Section 9.6. Our next application is related to the study of the ergodic sums
\[
\sum_{l=1}^{n}\theta_l T^{p_l}f\qquad(n\ge 1).
\]
9.3.12 Theorem. Assume that $\varphi$ is such that there exists $L\in\mathcal{L}$ with
\[
\sup_{n\ge 1}\frac{L(M_n)^{1/2}}{\varphi(M_n)}\log\frac{M_n}{m_n}<\infty
\qquad\text{and}\qquad
\int_{\lambda}^{\infty}\frac{dt}{\sqrt{t}\,\varphi(t)}<\infty\ \text{ for some }\lambda>0.
\]


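Then there exists a real $K$ such that for any $f\in L^2(\mu)$:
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\theta_l T^{p_l}f\bigr|}{\Psi(M_n)^{1/2}\varphi(M_n)}\Bigr\|_{2}\le K
\qquad\text{and}\qquad
\frac{\sum_{l=1}^{n}\theta_l T^{p_l}f}{\Psi(M_n)^{1/2}\varphi(M_n)}\xrightarrow{\ \text{a.s.}\ }0.
\]
Moreover, if
\[
\log\frac{M_n}{m_n}=O(\log M_n), \tag{9.3.19}
\]
one can choose $\varphi(t)=L(t)^{1/2}\log t$ for any $L\in\mathcal{L}$; for instance $\varphi(t)=\sqrt{t}\,\log^{\tau}(1+t)$ with $\tau>3/2$. Then, for any $f\in L^2(\mu)$,
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\theta_l T^{p_l}f\bigr|}{\bigl[\Psi(M_n)M_n\bigr]^{1/2}\log^{\tau}(1+M_n)}\Bigr\|_{2}\le K\|f\|_{2} \tag{9.3.20}
\]
and
\[
\frac{\sum_{l=1}^{n}\theta_l T^{p_l}f}{\bigl[\Psi(M_n)M_n\bigr]^{1/2}\log^{\tau}(1+M_n)}\xrightarrow{\ \text{a.s.}\ }0.
\]
This result is proved and applied in Section 9.5. In the applications, $M_n\sim\Theta_n$.

9.4 Proof of Theorem 9.3.3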

The proof is long. We pause to outline the steps. In Step 0, we specialize Theorem 9.2.2 to our setting. Step 1 is an intermediate step consisting of the regularization of the sequence $m$. Some specific functions built from this sequence, $\Phi$ and $\varphi$, are used later on, and they require such a regularization to be employed efficiently. In Step 2, a great deal of effort is devoted to an estimation of the increments $\|Y_n-Y_m\|_p$ for $m\le n$, according as $M_m\le M_n/2$ or $M_m\ge M_n/2$. This preliminary work is indispensable. Finally, in Step 3, we attack the proof proper. We construct a measure $\mu$ on $\mathbb{N}$ and show that a family of local integrals attached to it is uniformly bounded. This establishes that $\mu$ is a majorizing measure and consequently enables us to conclude the proof.

0) Let $(T,d)$ be a compact metric space and denote by $D$ the diameter of $T$. For $x\in T$ and $\varepsilon>0$, let $B(x,\varepsilon)$ denote the $d$-ball with center $x$ and radius $\varepsilon$. Consider a separable stochastic process $X=\{X_t,\ t\in T\}$ indexed by $T$, defined on some probability space $(\Omega,\mathcal{A},\mathbb{P})$ and satisfying the increment condition
\[
\|X_s-X_t\|_{p}\le d(s,t)\qquad(s,t\in T). \tag{9.4.1}
\]
Assume that there exists a probability measure $\mu$ on $T$ such that
\[
\sup_{x\in T}\int_0^{D}\frac{d\varepsilon}{\mu(B(x,\varepsilon))^{1/p}}=M. \tag{9.4.2}
\]


It follows from Theorem 9.2.2 that $X$ is sample continuous and moreover
\[
\Bigl\|\sup_{s,t\in T}(X_s-X_t)\Bigr\|_{p}\le K_pM, \tag{9.4.3}
\]
where $K_p$ depends on $p$ only. We recall that $X$ is separable (with respect to the metric $d$) if there exist a countable $d$-dense subset $T_0$ of $T$ and a null set $N$ of $\mathcal{A}$ such that for any $\omega\notin N$ and any $t\in T$, $X_t(\omega)=\lim_{T_0\ni s\to t}X_s(\omega)$. In our case, this is not important because we work with sequences of random variables; so $T=\mathbb{N}$ and the sample continuity property simply means here that the sequence studied converges almost surely. With this tool in hand, our task will consist in proving the existence of a majorizing measure on $\mathbb{N}$ provided with a specific metric: the one induced by the $L^p$-increments of the sequence $\sum_{l=1}^{n}\xi_l/\varphi(M_n)$, $n\ge 1$. The majorizing measure is built at Step 3. But some preliminary steps are necessary.

1) Let $\rho>1$ be some fixed real which we assume to be sufficiently large for condition (9.3.2) to be realized. Without loss of generality, we can assume
\[
m_1\le\frac{m_2}{2(1+C_m(\rho))}. \tag{9.4.4}
\]

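If this condition is not satisfied, we first replace $\Phi(x)$ by $\tilde\Phi(x)=2^{p}\Phi(x)$. Then we let $\tilde\xi_1$ be a random variable satisfying $\mathbb{E}|\tilde\xi_1|^{p}\le\Phi\bigl(m_1/2(1+C_m(\rho))\bigr)$. We also replace $m$ by $\tilde m$ defined by $\tilde m_i=m_{i-1}$ for $i\ge 2$ and $\tilde m_1=\Phi^{-1}\bigl(\mathbb{E}|\tilde\xi_1|^{p}\bigr)$. In place of $\xi$, we then consider enlarged families $\tilde\xi$ defined as follows: $\tilde\xi_i=\xi_{i-1}$ for $i\ge 2$. Then $\tilde m_1\le\tilde m_2/2(1+C_m(\rho))$ and
\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\tilde\xi_l\Bigr|^{p}\le\Phi\Bigl(\sum_{l=i}^{j}\tilde m_l\Bigr)\le\tilde\Phi\Bigl(\sum_{l=i}^{j}\tilde m_l\Bigr)\qquad(2\le i\le j),
\]
\[
\mathbb{E}\Bigl|\sum_{l=1}^{j}\tilde\xi_l\Bigr|^{p}\le 2^{p-1}\Bigl(\mathbb{E}\Bigl|\sum_{l=2}^{j}\tilde\xi_l\Bigr|^{p}+\mathbb{E}|\tilde\xi_1|^{p}\Bigr)
\le 2^{p-1}\Bigl(\Phi\Bigl(\sum_{l=2}^{j}\tilde m_l\Bigr)+\Phi(\tilde m_1)\Bigr)\le\tilde\Phi\Bigl(\sum_{l=1}^{j}\tilde m_l\Bigr).
\]
It follows that condition (9.3.5) is satisfied with the function $\tilde\Phi$ and the new sequence $\tilde m$, for any sequence $\tilde\xi$ obtained from $\xi$ by adding $\tilde\xi_1$, as well as condition (9.4.4). Moreover, the new sequence $\tilde m$ satisfies condition (9.3.2) with $C_{\tilde m}(\rho)=\bigl(\frac{m_1}{\tilde m_1}\vee 1\bigr)C_m(\rho)$.

We now regularize the sequence $m$. Consider the new sequence $\bar m=\{\bar m_l,\ l\ge 1\}$ defined by
\[
\bar m_l=\sum_{k=1}^{\infty}\rho^{-|k-l|}m_k\qquad(l\ge 1) \tag{9.4.5}
\]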


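and write $\bar M_n=\sum_{l=1}^{n}\bar m_l$, $n\ge 1$. Then,
\[
\text{i)}\quad m_l\le\bar m_l,\qquad
\text{ii)}\quad\rho^{-1}\le\frac{\bar m_{l+1}}{\bar m_l}\le\rho,\qquad
\text{iii)}\quad\bar M_n\ge\bar M_{n+1}/2\rho. \tag{9.4.6}
\]
Assertions i) and ii) are elementary; as for iii) we have by ii) that $\bar M_n\ge(\bar M_{n+1}-\bar m_1)/\rho$. But, in view of (9.3.2) and (9.4.4),
\[
\bar m_1=m_1+\sum_{k=2}^{\infty}\rho^{-(k-1)}m_k\le m_1\bigl(1+C_m(\rho)\bigr)\le m_2/2\le\bar m_2/2\le\bar M_{n+1}/2.
\]
Hence $\bar M_n\ge\bar M_{n+1}/2\rho$. Observe now that
\[
\bar M_n=\sum_{l=1}^{n}\sum_{k=1}^{\infty}\rho^{-|k-l|}m_k=\sum_{k=1}^{\infty}m_k\sum_{l=1}^{n}\rho^{-|k-l|},
\]
and
\[
\sum_{l=1}^{n}\rho^{-|k-l|}\le
\begin{cases}
\rho^{-(k-n)+1}/(\rho-1) & (k>n),\\
\rho/(\rho-1) & (k=n),\\
(\rho+1)/(\rho-1) & (k<n).
\end{cases}
\]
Thus,
\[
M_n\le\bar M_n=\sum_{k=1}^{n}m_k\sum_{l=1}^{n}\rho^{-|k-l|}+\sum_{k>n}m_k\sum_{l=1}^{n}\rho^{-|k-l|}
\le\frac{\rho+1}{\rho-1}\bigl[M_n+m_nC_m(\rho)\bigr]\le C_\rho M_n, \tag{9.4.7}
\]
where we put $C_\rho=\bigl(\frac{\rho+1}{\rho-1}\bigr)[1+C_m(\rho)]$, and $C_m(\rho)$ is defined by condition (9.3.2). Hence,
\[
M_n\le\bar M_n\le C_\rho M_n. \tag{9.4.8}
\]
Now, consider the following conditions: there exists $L\in\mathcal{L}$ such that
\[
\sup_{n\ge 1}\frac{L(\bar M_n)^{1/p}}{\varphi(\bar M_n)}\Bigl[\Bigl(\frac{\Phi(\bar m_n)}{\bar m_n}\Bigr)^{1/p}+\int_{\Phi(\bar m_n)}^{\Phi(\bar M_n)}\frac{dt}{t^{1/q}\,\Phi^{-1}(t)^{1/p}}\Bigr]<\infty, \tag{9.3.8'}
\]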

\[
\mathbb{E}\Bigl|\sum_{l=i}^{j}\xi_l\Bigr|^{p}\le\Phi\Bigl(\sum_{l=i}^{j}\bar m_l\Bigr)\qquad(i\le j). \tag{9.3.5'}
\]
Since $M_n$, $\bar M_n$ are commensurable and $m_n\le\bar m_n$, we have the implications (9.3.5) $\Rightarrow$ (9.3.5') and (9.3.8) $\Rightarrow$ (9.3.8').
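The regularization (9.4.5) and its properties (9.4.6) i), ii) and (9.4.8) can be checked numerically on a truncated sequence. The sketch below rests on assumptions chosen only for illustration: the sequence $m_l=1/l$ cut off at 300 terms (treated as zero beyond, which the text does not do), $\rho=3$, and the hypothetical helper names `regularize` and `growth_constant`.

```python
def regularize(m, rho):
    """m-bar_l = sum_k rho**(-|k-l|) * m_k for a finite list m (zero beyond its end)."""
    N = len(m)
    return [sum(rho ** -abs(k - l) * m[k] for k in range(N)) for l in range(N)]

def growth_constant(m, rho):
    """C_m(rho) of (9.3.2), computed for the truncated sequence."""
    tail, best = 0.0, 0.0
    for n in range(len(m) - 1, -1, -1):
        best = max(best, tail / m[n])      # tail = sum_{k>n} m_k rho**(-(k-n))
        tail = (m[n] + tail) / rho
    return best

rho = 3.0
m = [1.0 / l for l in range(1, 301)]       # sample sequence (the critical case)
mb = regularize(m, rho)
C_rho = (rho + 1) / (rho - 1) * (1 + growth_constant(m, rho))

M = Mb = 0.0
ok = True
for l in range(len(m)):
    M, Mb = M + m[l], Mb + mb[l]
    ok &= m[l] <= mb[l]                               # (9.4.6) i)
    ok &= M <= Mb <= C_rho * M * (1 + 1e-12)          # (9.4.8)
    if l + 1 < len(m):
        ok &= 1 / rho <= mb[l + 1] / mb[l] <= rho     # (9.4.6) ii)
print(ok, C_rho)
```

The two-sided weights $\rho^{-|k-l|}$ are what force consecutive terms of $\bar m$ to differ by at most a factor $\rho$, regardless of how irregular $m$ is.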

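Assume that we have proved the theorem with $\bar m$ in place of $m$. Let $\xi$ satisfy (9.3.5), and thus (9.3.5'). Then $\frac{1}{\varphi(\bar M_n)}\sum_{l=1}^{n}\xi_l$ converges almost surely to 0 and verifies
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\varphi(\bar M_n)}\Bigr\|_{p}\le K.
\]
Since $\varphi(M_n)\ge\varphi(C_\rho M_n)/C_\rho\ge\varphi(\bar M_n)/C_\rho$ by concavity of $\varphi$, we have that $\frac{1}{\varphi(M_n)}\sum_{l=1}^{n}\xi_l$ converges almost surely to 0, and
\[
\Bigl\|\sup_{n\ge 1}\frac{\bigl|\sum_{l=1}^{n}\xi_l\bigr|}{\varphi(M_n)}\Bigr\|_{p}\le C_\rho K.
\]
It is therefore enough to prove the theorem under the additional assumption on $m$:
\[
\text{a)}\quad\rho^{-1}\le m_{l+1}/m_l\le\rho,\qquad\text{b)}\quad M_n\ge M_{n+1}/2\rho. \tag{9.4.9}
\]
2) Put for any integer $n\ge 1$,
\[
Y_n=\frac{\sum_{l=1}^{n}\xi_l}{\varphi(M_n)}. \tag{9.4.10}
\]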

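Writing $Y_n-Y_m=\frac{1}{\varphi(M_n)}\sum_{l=m+1}^{n}\xi_l-\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\sum_{l=1}^{m}\xi_l$ and applying (9.3.5) to each sum, we see that for any $m\le n$,
\[
\|Y_n-Y_m\|_{p}\le\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}+\frac{\Phi(M_n-M_m)^{1/p}}{\varphi(M_n)}.
\]
We estimate the right-hand side according as $M_m\ge M_n/2$ or $M_m\le M_n/2$. If $M_m\ge M_n/2$, by concavity of $\varphi$,
\[
\begin{aligned}
\frac{\varphi(M_n)-\varphi(M_m)}{\Phi(M_n-M_m)^{1/p}}
&=\frac{\varphi(M_n)-\varphi(M_m)}{M_n-M_m}\cdot\frac{M_n-M_m}{\Phi(M_n-M_m)^{1/p}}
\le\frac{\varphi(M_m)}{M_m}\cdot\frac{M_n-M_m}{\Phi(M_n-M_m)^{1/p}}\\
&=\frac{\varphi(M_m)}{\Phi(M_m)^{1/p}}\cdot\frac{\Phi(M_m)^{1/p}}{\Phi(M_n-M_m)^{1/p}}\cdot\frac{M_n-M_m}{M_m}
\le\frac{\varphi(M_m)}{\Phi(M_m)^{1/p}},
\end{aligned}
\tag{9.4.11}
\]
since $\Phi(x)/x^{p}$ is nonincreasing and $M_n-M_m\le M_m$. Thus
\[
\frac{\varphi(M_n)-\varphi(M_m)}{\Phi(M_n-M_m)^{1/p}}\le\frac{\varphi(M_m)}{\Phi(M_m)^{1/p}},
\]
which implies
\[
\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\le\frac{\Phi(M_n-M_m)^{1/p}}{\varphi(M_n)}.
\]
Hence by (9.4.11),
\[
\|Y_n-Y_m\|_{p}\le 2\,\frac{\Phi(M_n-M_m)^{1/p}}{\varphi(M_n)}\qquad\text{if } m\le n \text{ and } M_m\ge M_n/2. \tag{9.4.12}
\]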

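Now, consider the case $m\le n$ with $M_m\le M_n/2$. Since $\varphi(M_n/2)\le\varphi(M_n)/2^{1/p}$ by convexity of $\varphi^{p}$, we have
\[
\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\ge\bigl(1-2^{-1/p}\bigr)\frac{\Phi(M_m)^{1/p}}{\varphi(M_m)}.
\]
But $\varphi(x)/\Phi(x)^{1/p}$ is nondecreasing; then $\dfrac{\varphi(M_n)}{\Phi(M_n)^{1/p}}\ge\dfrac{\varphi(M_m)}{\Phi(M_m)^{1/p}}$, and so we can continue our estimate with
\[
\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\ge\bigl(1-2^{-1/p}\bigr)\frac{\Phi(M_n)^{1/p}}{\varphi(M_n)}.
\]
Thus
\[
\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\ge\bigl(1-2^{-1/p}\bigr)\frac{\Phi(M_n-M_m)^{1/p}}{\varphi(M_n)}.
\]
Set
\[
\gamma_p=\frac{2-2^{-1/p}}{1-2^{-1/p}}. \tag{9.4.13}
\]
Then by (9.4.11),
\[
\|Y_n-Y_m\|_{p}\le\gamma_p\,\Phi(M_m)^{1/p}\,\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\qquad\text{if } m\le n \text{ and } M_m\le M_n/2. \tag{9.4.14}
\]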
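The constant $\gamma_p$ of (9.4.13) is simply $1+1/(1-2^{-1/p})$: the first term of the increment bound is kept, and the second is dominated by $1/(1-2^{-1/p})$ times the first. A small numerical sketch of this bookkeeping follows; the sample exponents and the placeholder value `first` are arbitrary illustrative choices.

```python
def gamma_p(p):
    """gamma_p = (2 - 2**(-1/p)) / (1 - 2**(-1/p)) as in (9.4.13)."""
    c = 2.0 ** (-1.0 / p)
    return (2.0 - c) / (1.0 - c)

rows = []
for p in (1.5, 2.0, 3.0, 10.0):            # sample exponents with p > 1
    c = 2.0 ** (-1.0 / p)
    first = 1.0                            # size of the first term (placeholder)
    second_max = first / (1.0 - c)         # largest second term allowed by the estimate
    rows.append((p, gamma_p(p), first + second_max))
    print(p, gamma_p(p))
```

Since $2^{-1/p}\in(1/2,1)$ for $p>1$, the values exceed 3, which in particular gives the inequality $\gamma_p>2$ used later in the proof.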

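Finally remark that if $n$ is sufficiently large, say $n\ge n_1$, then
\[
m\ge n\ \Longrightarrow\ \frac{\Phi(M_m-M_n)^{1/p}}{\varphi(M_m)}\le\frac{\Phi(M_m)^{1/p}}{\varphi(M_m)}\le\frac{\Phi(M_n)^{1/p}}{\varphi(M_n)}\le\Phi(m_1)^{1/p}\cdot\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}. \tag{9.4.15}
\]
Observe indeed that from (9.3.7) (b) we have
\[
\lim_{n\to\infty}\frac{\Phi(M_n)^{1/p}}{\varphi(M_n)}=0,
\]
and besides, $\lim_{n\to\infty}\Phi(m_1)^{1/p}\,\dfrac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}=\dfrac{\Phi(m_1)^{1/p}}{\varphi(m_1)}$. Define $n_1$ so that for $n\ge n_1$,
\[
\frac{\Phi(M_n)^{1/p}}{\varphi(M_n)}\le\frac12\,\frac{\Phi(m_1)^{1/p}}{\varphi(m_1)}\le\Phi(m_1)^{1/p}\,\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}.
\]
This and (9.3.7) (a) prove our claim. By combining now successively (9.4.12) with (9.4.15) and (9.4.14) with (9.4.15), next using (9.3.7) (a), we get
\[
\|Y_n-Y_m\|_{p}\le
\begin{cases}
2\,\Phi(m_1)^{1/p}\,\dfrac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)} & \text{if } m\ge n\ge n_1 \text{ and } M_n\ge M_m/2,\\[2ex]
\gamma_p\,\Phi(m_1)^{1/p}\,\dfrac{\varphi(M_m)-\varphi(m_1)}{\varphi(M_m)\varphi(m_1)} & \text{if } m\ge n\ge n_1 \text{ and } M_n\le M_m/2.
\end{cases}
\]
Concerning the last case, we have by using (9.4.15) again
\[
\Phi(m_1)^{1/p}\,\frac{\varphi(M_m)-\varphi(m_1)}{\varphi(M_m)\varphi(m_1)}\le\frac{\Phi(m_1)^{1/p}}{\varphi(m_1)}\le 2\,\Phi(m_1)^{1/p}\,\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)},
\]
since $n\ge n_1$. As $\gamma_p>2$, we have obtained
\[
\|Y_n-Y_m\|_{p}\le 2\gamma_p\,\Phi(m_1)^{1/p}\,\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}\qquad\text{if } m\ge n\ge n_1. \tag{9.4.16}
\]
Now let $n\ge n_1$ and $m\le n$. Then, by (9.4.12), (9.4.14), (9.3.7) (a) and (9.4.15), if $M_m\ge M_n/2$, then
\[
\|Y_n-Y_m\|_{p}\le 2\,\frac{\Phi(M_n-M_m)^{1/p}}{\varphi(M_n)}\le 2\,\frac{\Phi(M_n)^{1/p}}{\varphi(M_n)}\le\frac{\Phi(m_1)^{1/p}}{\varphi(m_1)}\le 2\,\Phi(m_1)^{1/p}\,\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}, \tag{9.4.17}
\]
and if $M_m\le M_n/2$, then
\[
\|Y_n-Y_m\|_{p}\le\gamma_p\,\Phi(M_m)^{1/p}\cdot\frac{\varphi(M_n)-\varphi(M_m)}{\varphi(M_n)\varphi(M_m)}\le\gamma_p\,\Phi(m_1)^{1/p}\,\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}. \tag{9.4.18}
\]
Therefore,
\[
\sup_{m\ge 1}\|Y_n-Y_m\|_{p}\le 2\gamma_p\,\Phi(m_1)^{1/p}\cdot\frac{\varphi(M_n)-\varphi(m_1)}{\varphi(M_n)\varphi(m_1)}\qquad(n\ge n_1). \tag{9.4.19}
\]
3) Fix now $n\ge n_1$, and put for $k=1,2,\dots,n-1$,
\[
\varepsilon_k=\varepsilon_k^{(n)}=2\,\Phi(M_k)^{1/p}\cdot\frac{\varphi(M_n)-\varphi(M_k)}{\varphi(M_n)\varphi(M_k)}. \tag{9.4.20}
\]
Then,
\[
\sup_{m\ge 1}\|Y_n-Y_m\|_{p}\le\gamma_p\,\varepsilon_1^{(n)}. \tag{9.4.21}
\]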


By concavity of ϕ and (9.4.9-b), we have that ϕ(Mk+1 ) ≤ Thus, for k + 1 < n, and p = [p] + 2, εk εk+1

Mk+1 Mk ϕ(Mk )

≤ 2ρϕ(Mk ).

"(Mk ) 1/p ϕ(Mk+1 ) ϕ(Mn ) − ϕ(Mk ) ϕ(Mn ) − ϕ(Mk ) ≤ρ "(Mk+1 ) ϕ(Mk ) ϕ(Mn ) − ϕ(Mk+1 ) ϕ(Mn ) − ϕ(Mk+1 ) ϕ p (M ) − ϕ p (M ) n k =ρ p ϕ (Mn ) − ϕ p (Mk+1 ) ϕ(M )p−1 + ϕ(M )p−2 ϕ(M ) + · · · + ϕ(M )p−1 n n k+1 k+1 · . ϕ(Mn )p−1 + ϕ(Mn )p−2 ϕ(Mk ) + · · · + ϕ(Mk )p−1

=

Since ϕ p is convex, then ϕ p is also convex and ϕ p (Mn ) − ϕ p (Mk ) = ϕ p (Mn ) − ϕ p (Mk+1 )

ϕ p (Mn )−ϕ p (Mk ) Mn −Mk ϕ p (Mn )−ϕ p (Mk+1 ) Mn −Mk+1

=1+

·

Mn − Mk M n − Mk ≤ Mn − Mk+1 Mn − Mk+1

Mk+1 − Mk mk+1 ≤1+ ≤ 1 + ρ. Mn − Mk+1 mk+2

Thus, εk εk+1

ϕ(Mn )p−1 + ϕ(Mn )p−2 ϕ(Mk+1 ) + · · · + ϕ(Mk+1 )p−1 ϕ(Mn )p−1 + ϕ(Mn )p−2 ϕ(Mk ) + · · · + ϕ(Mk )p−1 ϕ(Mn )p−1 + ρϕ(Mn )p−2 ϕ(Mk ) + · · · + ρ p−1 ϕ(Mk )p−1 ≤ ρ(1 + ρ) ϕ(Mn )p−1 + ϕ(Mn )p−2 ϕ(Mk ) + · · · + ϕ(Mk )p−1 p ≤ ρ (1 + ρ). ≤ ρ(1 + ρ)

Put η = ρ p (1 + ρ); we have shown εk ≤ η, εk+1

k = 1, 2, . . . , n − 2.

(9.4.22)

We denote B(n, ε) = {m ≥ 2 : Yn − Ym p < ε}. Let μ be the measure defined on the set of integers {2, 3, . . . } by Mn "(t)1/p 1 μ{n} = c dt, (9.4.23) + tϕ(t) L(t) Mn−1 with c = that

∞ "(t)1/p m1

tϕ(t)

+

1 L(t)

dt

sup

−1

. By Step 0 and (9.4.21), it suffices to establish

(n)

γp ε1

n≥n1 0

dε < ∞. μ(B(n, ε))1/p

We fix n ≥ n1 . Let kn be the unique integer such that εkn +1 <

4"(Mn )1/p ≤ εkn . ϕ(Mn )

(9.4.24)


We compute the integral

ε1

dε

1/p . μ B(n, ε)

0

a)

"(mn )1/p ϕ(Mn )

0

dε

1/p ≤ μ B(n, ε)

"(mn ) mn

1/p

L(Mn )1/p , ϕ(Mn )

(9.4.25)

which, in view of condition (9.3.8), is bounded in n uniformly. b) Since εkn ≤ ηεkn +1 , we have

εkn "(mn )1/p ϕ(Mn )

dε

1/p ≤ μ B(n, ε)

4η"(Mn )1/p ϕ(Mn ) "(mn )1/p ϕ(Mn )

dε

1/p . μ B(n, ε)

Let H = 4ηC where C arises from (9.3.4) and observe that H ≥ 2. Put for ε ≤ 1/p n) 4η "(M ϕ(Mn ) , mε = inf m ≤ n : "(Mn − Mm )1/p ≤ Hε ϕ(Mn ) . (9.4.26) Since

ε H ϕ(Mn )

≤

4η 1/p H "(Mn )

=

1 1/p , C "(Mn )

then

"(Mn − Mm )1/p ≤ "(Mn )1/p /C, if m ≥ mε . This implies by property (9.3.4) of ", that Mm ≥ Mn /2. And by Step 2,

Yn − Ym p ≤ 2

"(Mn − Mm )1/p 2ε ≤ ≤ ε. ϕ(Mn ) H

Hence, {mε , . . . , n} ⊂ B(n, ε),

(9.4.27)

and consequently,

μ B(n, ε) ≥ c

Mn

Mmε −1

Mn − Mmε −1 c dt ≥c ≥ " −1 L(t) L(Mn ) L(Mn )

since Mn − Mmε −1 ≥ " −1

4η"(Mn )1/p ϕ(Mn ) "(mn )1/p ϕ(Mn )

" ϕ(Mn )ε #p H

dε ≤ μ(B(n, ε))1/p (ε =

H x 1/p ϕ(Mn ) )

≤ c

ϕ(Mn )ε H

%p

,

by definition of mε . Then

L(Mn ) c

$

1/p

4η"(Mn )1/p ϕ(Mn ) "(mn )1/p ϕ(Mn )

)1/p

L(Mn ϕ(Mn )

"(Mn ) Cp "(Mn ) Hp

" −1

dε

" ϕ(Mn )ε #p 1/p , H

dx , x 1/q " −1 (x)1/p (9.4.28)


with c = (H p /c)1/p p−1 . It follows from condition (9.3.8), that the right-hand side of (9.4.28) is bounded uniformly in n. ε dε c) Consider now the integral εk1 μ(B(n,ε)) 1/p . Let 1 ≤ k ≤ kn −1 and εk+1 < ε < εk . n

1/p

n) Since k + 1 ≤ kn , we have εk+1 ≥ εkn ≥ 4 "(M ϕ(Mn ) . Thus

4

"(Mn )1/p ϕ(Mn ) − ϕ(Mk+1 ) , ≤ 2"(Mk+1 )1/p ϕ(Mn ) ϕ(Mn )ϕ(Mk+1 )

or ϕ(Mn ) − ϕ(Mk+1 ) ≥ 2"(Mn )1/p

ϕ(Mk+1 ) . "(Mk+1 )1/p

One immediately sees that Mk+1 cannot be too close to Mn . More precisely, suppose that Mk+1 > Mn /2. Then we deduce from the fact that ϕ(x)/"(x)1/p is nondecreasing and as previously, that ϕ(Mk+1 ) "(Mk+1 )1/p ϕ(Mn /2) ≥ 2"(Mn )1/p ≥ 2ϕ(Mn /2). "(Mn /2)1/p

ϕ(Mn ) − ϕ(Mk+1 ) ≥ 2"(Mn )1/p

As ϕ(Mn ) − ϕ(Mk+1 ) ≤ ϕ(Mn ) − ϕ(Mn /2), this implies 3ϕ(Mn /2) ≤ ϕ(Mn ). But ϕ is concave; thus ϕ(Mn /2) ≥ ϕ(Mn )/2. This implies that 3ϕ(Mn )/2 ≤ ϕ(Mn ), and we have a contradiction. Hence, Mk+1 ≤ Mn /2. Let n ≥ m ≥ k + 1. Using again the fact that ϕ(x)/"(x)1/p is nondecreasing and Step 2, gives by (9.4.11), (9.4.14), ϕ(Mn ) − ϕ(Mm ) "(Mn − Mm )1/p + ϕ(Mn )ϕ(Mm ) ϕ(Mn ) "(Mn − Mk+1 )1/p ϕ(Mn ) − ϕ(Mk+1 ) ≤ "(Mk+1 )1/p + ϕ(Mn )ϕ(Mk+1 ) ϕ(Mn ) ≤ γp εk+1 /2 ≤ γp ε/2.

Yn − Ym p ≤ "(Mm )1/p

Hence, by noting ε = γp ε/2, {k + 1, . . . , n} ⊂ B(n, ε ),

(9.4.29)


∞

1/p and μ B(n, ε ) ≥ c S(Mk ) − S(Mn ) , where we put S(u) = u "(t) tϕ(t) dt. But S( · ) is convex decreasing. Since Mk ≤ Mk+1 ≤ Mn /2, we therefore have S(Mk ) − S(Mn ) ≥ S(Mk ) − S(2Mk ) ≥ −Mk S (2Mk ) =

Mk "(2Mk )1/p "(2Mk )1/p "(Mk )1/p = ≥ . 2Mk ϕ(2Mk ) 2ϕ(2Mk ) 4ϕ(Mk )

Thus, we can continue our estimate with

c"(Mk )1/p μ B(n, ε ) ≥ . 4ϕ(Mk ) And, by letting c = (4/c)1/p , and recalling that ε = γp ε/2,

εk

εk+1

dε

1/p μ B(n, γp ε/2)

≤ c (εk − εk+1 )

ϕ(Mk ) "(Mk )1/p

1/p

.

Consequently,

ε1

εkn

dε

1/p =

μ B(n, γp ε/2)

≤

k n −1 εk k=1 εk+1 k n −1 c (εk k=1 k n −1

= 2c

k=1

+

dε

1/p μ B(n, γp ε/2) − εk+1 )

ϕ(Mk ) "(Mk )1/p

ϕ(Mk ) "(Mk )1/p 1/p

1/p

(9.4.30)

"(Mk+1 )1/p "(Mk )1/p − ϕ(Mk ) ϕ(Mk+1 )

"(Mk+1 )1/p − "(Mk )1/p ϕ(Mn )

.

On the one hand, since ϕ(x)/"(x)1/p is nondecreasing, k n −1 k=1

ϕ(Mk ) "(Mk )1/p

≤ ≤

1/p

ϕ(Mn ) "(Mn )1/p ϕ(Mn ) "(Mn )1/p

"(Mk+1 )1/p − "(Mk )1/p ϕ(Mn )

1/p k n −1 1/p

k=1

"(Mk+1 )1/p − "(Mk )1/p ϕ(Mn ) 2

"(Mn )1/p "(Mn )1/p−1/p = = ϕ(Mn ) ϕ(Mn )1/q

which is bounded in n uniformly.

(9.4.31)

"(Mn )1/p ϕ(Mn )

1/q

,


And on the other hand, concerning the sum "(Mk+1 )1/p ϕ(Mk+1 ) , we observe that k n −1 k=1

1/p

ϕ(Mk ) "(Mk )1/p

≤

k n −1

"(Mk )1/p ϕ(Mk ) "(Mk+1 )1/p ϕ(Mk+1 ) k=1 1/p "(m1 ) ϕ(m1 )

≤

dt

t 1/p

0

Therefore ε1 εkn

dε

1/p ≤ 2c μ B(n, ε )

kn −1

k=1

ϕ(Mk ) 1/p "(Mk )1/p ϕ(Mk ) "(Mk )1/p

"(Mk )1/p "(Mk+1 )1/p − ϕ(Mk ) ϕ(Mk+1 )

dt t 1/p =

(9.4.32)

1 "(m1 )1/p q ϕ(m1 )

"(Mn )1/p ϕ(Mn )

1/q

1/q

.

1 "(m1 )1/p + q ϕ(m1 )

1/q

.

(9.4.33)

From (9.4.25) and (9.4.28) follows that εk n dε 1/p

0 μ B(n, γp ε/2) εk n dε ≤

1/p 0 μ B(n, ε) ≤

"(mn ) mn

1/p

−

(9.4.34)

L(Mn )1/p L(Mn )1/p + c ϕ(Mn ) ϕ(Mn )

"(Mn ) Cp "(Mn ) Hp

dx x 1/q " −1 (x)1/p

.

Combining then these two estimates gives 0

ε1

dε

1/p ≤

μ B(n, γp ε/2)

1/p

L(Mn )1/p ϕ(Mn ) n) 1/p "(M Cp dx L(Mn ) +c 1/q "(M ) n ϕ(Mn ) x " −1 (x)1/p p

"(mn ) mn

+ 2c

H

"(Mn )1/p ϕ(Mn )

1/q

+

1 "(m1 )1/p q ϕ(m1 )

1/q

. (9.4.35)

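It remains to observe that, with the change of variable $\varepsilon = \lambda\theta$,

$$\int_0^{a} \frac{d\varepsilon}{\mu\big(B(n,\varepsilon)\big)^{1/p}} = \lambda \int_0^{a/\lambda} \frac{d\theta}{\mu\big(B(n,\lambda\theta)\big)^{1/p}}.$$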


Applying this with λ = γp and a = γp ε1 shows that γp ε1 ε1 dε dε

1/p = γp

1/p 0 0 μ B(n, γp ε) μ B(n, ε) ε1 dε ≤ γp

1/p 0 μ B(n, γp ε/2) ≤ γp

"(mn ) mn

+ 2c

1/p

L(Mn )1/p L(Mn )1/p + c ϕ(Mn ) ϕ(Mn ) 1/p 1/q

"(Mn ) ϕ(Mn )

+

1 "(m1 )1/p 1/q q ϕ(m1 )

"(Mn ) Cp "(Mn ) Hp

dx x 1/q " −1 (x)1/p

. (9.4.36)

(n)

Since ε1 = ε1 , this finally shows that sup

(n)

γp e 1

n≥n1 0

dε

1/p < ∞. μ B(n, ε)

(9.4.37)

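Let $D_0 = \sup_{n,m\ge 1}\|Y_n - Y_m\|_p$. For $n \ge n_1$, we know (see (9.4.21) at the beginning of Step 3) that $\sup_{m\ge 1}\|Y_n - Y_m\|_p \le \gamma_p\,\varepsilon_1^{(n)}$. Together with (9.4.37), this implies

$$\sup_{n\ge n_1}\int_0^{D_0} \frac{d\varepsilon}{\mu\big(B(n,\varepsilon)\big)^{1/p}} < \infty. \qquad (9.4.38)$$

Since

$$\sup_{n< n_1}\int_0^{D_0} \frac{d\varepsilon}{\mu\big(B(n,\varepsilon)\big)^{1/p}} \le D_0 \sup_{n<n_1}\mu\{n\}^{-1/p} < \infty, \qquad (9.4.39)$$

we deduce

$$\sup_{n\ge 1}\int_0^{D_0} \frac{d\varepsilon}{\mu\big(B(n,\varepsilon)\big)^{1/p}} = M, \qquad (9.4.40)$$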

and M < ∞ depends on p, m, " and ϕ only. According to Step 0, this implies sup (Yn − Ym ) ≤ Kp M, (9.4.41) p n,m≥1

where Kp depends on p only, and the sequence {Yn , n ≥ 1} converges almost surely. Now, from assumptions (9.3.5) and (9.3.7) (a) follows that n ξl l=1 1/p (9.4.42) sup φ(M ) ≤ "(Mm ) /ϕ(Mm ), n≥m n p since ϕ(x)/"(x)1/p is nondecreasing. And since (9.3.7) (b) implies that lim ϕ(x)/"(x)1/p = ∞,

x→∞


then,

m

l=1 ξl

Ym =

Lp

−→ 0.

φ(Mm ) Write supn≥1 |Yn | ≤ |Ym0 | + supn,m≥1 |Yn − Ym |. With (9.4.26), this gives sup |Yn | ≤ Kp M + Ym p . 0 p

(9.4.43)

n≥1

Since m0 is arbitrary, we can let m0 tend to infinity in the above inequality and use (9.4.41) to control Ym0 p . We obtain sup |Yn | ≤ Kp M. (9.4.44) p n≥1

Now, (9.4.43) and (9.4.44) imply in view of the dominated convergence theorem and Step 0, that a.s. (9.4.45) Ym −→ 0. The proof is complete. 9.4.1 Remark. The attentive reader will have observed that the proof shows a little more than Theorem 9.3.3. Let X = {Xn , n ≥ 1} be a sequence of random variables satisfying the increment condition: for any integers n ≥ m, ⎧ "(M −M )1/p n m ⎪ ifMm ≥ Mn /2, ⎨B ϕ(Mn )

Xn − Xm p ≤ (9.4.46) ⎪ ⎩ ϕ(Mn )−ϕ(Mm ) 1/p B"(Mm ) . ϕ(Mn )ϕ(Mm ) ifMm ≤ Mn /2. where B is an absolute constant. Then the conclusion of Theorem 9.3.3 remains true for X.

9.5

Proof of Theorems 9.3.10 and 9.3.11

Theorem 9.3.11 is just a particular case of Theorem 9.3.10. Thus, we only have to give the proof of Theorem 9.3.10. Put for any positive integer n, n ξl n = (Mn ), Xn = 1/pl=1 . (9.5.1) n ϕ(Mn ) In view of Remark 9.4.1, it is enough to show that X satisfies assumption (9.4.46). We proceed in two steps. Let n, m be two positive integers with n ≥ m. From assumption (9.3.15) follows that 1/p Xm − Xn ≤ 1/p "(M ) m m p

1 1/p

−

1 1/p

+

"(Mn − Mm )1/p ϕ(Mn )

m ϕ(Mm ) n ϕ(Mn ) m 1/p "(Mn − Mm )1/p "(Mm )1/p ϕ(Mm ) + ϕ(Mn ) − . ≤ ϕ(Mm )ϕ(Mn ) n ϕ(Mn ) (9.5.2)


1) Mm ≥ Mn /2. Then,

"(Mm )1/p m 1/p ϕ(Mn ) − ϕ(Mm ) ϕ(Mm )ϕ(Mn ) n

"(Mm )1/p ϕ(Mn ) − ϕ(Mm ) m 1/p "(Mm )1/p = 1− . + ϕ(Mm )ϕ(Mn ) ϕ(Mn ) n

(9.5.3)

But, by assumption (9.3.16), 0 ≤ 1−

m n

1/p

≤ 1−

m n

1/p

=

n − m n

1/p

≤ B 1/p

"(Mn − Mm ) 1/p . "(Mm ) (9.5.4)

Hence,

"(Mm )1/p m 1/p ϕ(Mn ) − ϕ(Mm ) ϕ(Mm )ϕ(Mn ) n

"(Mm )1/p ϕ(Mn ) − ϕ(Mm ) " 1/p (Mn − Mm ) + B 1/p ≤ . ϕ(Mm )ϕ(Mn ) ϕ(Mm )

(9.5.5)

Since n ≥ m and Mm ≥ Mn /2, we know from the preliminary computations leading 1/p 1/p (M −M ) n )−ϕ(Mm )) n m ≤ " ϕ(M . By inserting then to inequality (9.4.12), that "(Mm )ϕ(M(ϕ(M m )ϕ(Mn ) m) this estimate into (9.5.5) and using (9.5.2), we get 1/p Xm − Xn ≤ (2 + B 1/p ) " (Mn − Mm ) . p ϕ(Mm )

(9.5.6)

2) Mm ≤ Mn /2. Let A > 1 such that A/(A − 1) < 21/p . Since ϕ p is convex and ϕ(0) = 0, ϕ(Mn ) ≥ ϕ(2Mm ) ≥ 21/p ϕ(Mm ) ≥

A − (m /n )1/p ϕ(Mm ), A−1

(9.5.7)

which implies that

ϕ(Mn ) − (m /n )1/p ϕ(Mm ) ≤ A ϕ(Mn ) − ϕ(Mm ) .

Thus

(9.5.8)

"(Mm )1/p ϕ(Mn ) − ϕ(Mm ) ϕ(Mm ) ≤ A . ϕ(Mm )ϕ(Mn ) (9.5.9) From the computations leading to inequality (9.4.14), we know that if Mm ≤ Mn /2, 1/p 1/p (M −M ) n )−ϕ(Mm )) n m ≥ " ϕ(M . Therefore then (γp − 1) "(Mm )ϕ(M(ϕ(M m )ϕ(Mn ) m)

"(Mm )1/p ϕ(Mn ) − ϕ(Mm ) Xm − Xn ≤ (γp − 1 + A) . (9.5.10) p ϕ(Mm )ϕ(Mn )

"(Mm )1/p m ϕ(Mn ) − ϕ(Mm )ϕ(Mn ) n

Hence (9.4.46) is satisfied.

1/p


9.6

Proof of Theorem 9.3.12 and some examples

Let f ∈ L2 (P). From assumption (9.3.18) and Proposition 1.2.2, follows that j j 2 θl T pl f ≤ ml ml f 2 E i≤l≤j

l=1

(i ≤ j ).

l=i

We assume f 2 = 1 and put ξl = θl T pl f , l ≥ 1. Then the sequence ξ = {ξl , l ≥ 1} satisfies assumption (9.3.17). Since is concave increasing, Theorem 9.3.11 applies. Thus, if ϕ is such that there exists L ∈ L with ∞ Mn L(Mn )1/2 dt sup < ∞ and log < ∞ for some λ > 0, √ mn tϕ(t) n≥1 ϕ(Mn ) λ then there exists a constant K such that n θl T pl f k=1 ≤ K, sup 1/2 ϕ(M ) n n≥1 n (Mn ) 2

n

and

pl k=1 θl T f n (Mn )1/2 ϕ(Mn )

a.s.

−→ 0.

The first part of the theorem follows by replacing f by g/ g 2 for arbitrary g ∈ L2 (P). The second part of the theorem similarly follows from the second half of Theorem 9.3.10. Now we give some examples of application of Theorem 9.3.12. 1. Consider a sequence = {θk , k ≥ 1} of independent, symmetric real-valued random variables, as well as an increasing sequence of integers P = {pk , k ≥ 1}. Let (X, F , μ) be an arbitrary probability space, and T any contraction of L2 (μ). In this example, we study the growth of the weighted ergodic sums n

θl (ω)T pl f

k=1

when ω belongs to a measurable set of full measure; which is universal in the sense that the estimates of the magnitude of the considered sums are independent of the contraction T and f ∈ L2 (μ). We shall introduce conditions on the sequences and P ; some of them are very weak. All these conditions are also natural, in regard to the optimality of the result we obtain below. Condition (P ): there exists : R+ → R+ nondecreasing, concave such that

pl = O e(l) . a.s. n 2 Condition (): i) For any l, P |θl | = 0 = 0; and ii) n = O l=1 θl . Condition ii) is weak. If the θl ’s are identically distributed, condition ii) is always satisfied. This follows from the strong law of large numbers. Condition i) is natural in regard to the studied averages. Put for any positive integer n, n = nl=1 θl2 .


9.6.1 Theorem. Let τ > 3/2. There exists a measurable set with P( ) = 1, and for any ω ∈ ∗ , a real Kω < ∞, such that for any probability space (X, F , μ), any contraction T on L2 (μ), any f ∈ L2 (μ), we have n θl (ω)T pl f k=1 ≤ Kω f 2 sup " #1/2 τ n≥1 (n (ω))n (ω) log (1 + n (ω)) 2,μ and

n

pl a.s. k=1 θl (ω)T f −→ " #1/2 τ (n (ω))n (ω) log (1 + n (ω))

0.

• The stated result expresses a rather general form of an ergodic theorem with weights sampled by sequences of independent random variables. There is indeed no moment assumption at all. When some integrability property is moreover known, n can be replaced by a suitable deterministic sequence in the normalizing sequence. • Take P such that for some B < ∞, pn = O(nB ) and an i.i.d. sequence satisfying condition () i). Conditions (P ) and () are satisfied with (t) = B log t. Let b > 2, Theorem 9.6.1 applies with, as a normalizing factor, n (ω)1/2 logb (1 + n (ω)). Further, if θ1 is square integrable, for any b > 2, there exists a measurable set with P( ) = 1, and for any ω ∈ ∗ , a real Kω < ∞, such that: For any probability space (X, F , μ), any contraction T on L2 (μ), any f ∈ L2 (μ), we have n θl (ω)T pl f k=1 ≤ Kω f 2 , sup √ n logb n n≥1 2,μ and

n pl a.s. k=1 θl (ω)T f −→ 0. √ b n log n δ

• Take P such that for some 0 < δ < 1, pn = O(en ) and an i.i.d. sequence satisfying condition () i). Conditions P and are satisfied with (t) = t δ . Then, (1+δ)/2 logb (1 + n ) or for b > 3/2, Theorem 9.6.1 applies with normalizing factor n b (1+δ)/2 n log n, if θ1 is square integrable. Proof. Fix τ > 3/2 and ρ > 0. By Theorem 8.5.6, there exists a universal constant C such that M 2iπpk t k=N +1 θk e E sup sup

1 ≤ C. M 2 2 N <M 0≤t≤1 log p θ M k=N +1 k Put for positive integers n and t ∈ [0, 1[, ζn (t) = θn e2iπpn t .


It follows that ζl i≤l≤j

∞

≤C

log pj

j

1/2

1/2 2 1/2 j θl , i≤l≤j

with E C < ∞. Conditions

(P ) and () imply j = O j and log pj = O (j ) . Thus log pj = O (j ) . Replacing C by λC for some suitable λ if necessary gives ζl i≤l≤j

∞

1/2 2 1/2 ≤ C j θl ≤C

i≤l≤j

(|θl | ∨ l −ρ )

2 1/2

1≤l≤j

(|θl | ∨ l −ρ )

2 1/2

.

i≤l≤j

Thus condition (9.3.18) of Theorem 9.3.12 is fulfilled. Now, in view of condition () ii), the sequence {(|θl | ∨ l −ρ )2 , l ≥ 1} clearly satisfies condition (9.3.19). The conditions for the application of Theorem 9.3.12 are fulfilled, and the proof is achieved by applying the second half of this theorem, and by observing, by means of condition () ii), that n a.s. (|θl | ∨ l −ρ )2 = O(n ). l=1

2. Consider a sequence Q = {Qk , k ≥ 1} of independent random variables with values in N, as well as an increasing sequence of integers P = {pk , k ≥ 1} and a sequence of reals A = {ak , k ≥ 1} such that An = nl=1 al2 ↑ ∞. Condition (P ): there exists : R+ → R+ nondecreasing, concave such that

pn = O e(An ) .

Condition (Q): E supn≥1

log+ (pn +Qn ) (An )

Condition (A): log |aAnn| = O log An

< ∞.

9.6.2 Theorem. Let τ > 3/2. There exists a measurable set with P( ) = 1, and for any ω ∈ ∗ , a real Kω < ∞, such that for any probability space (X, F , μ), any contraction T on L2 (μ), any f ∈ L2 (μ), we have n

p +Q (ω) l l f − E T pl +Ql f k=1 al T ≤ Kω f 2 a) sup " #1/2 τ n≥1 log (1 + An ) (An )An 2,μ and

n b)

T pl +Ql (ω) f − E T pl +Ql f " #1/2 τ log (1 + An ) (An )An

k=1 al

a.s.

−→ 0.


• Theorem 9.6.2 provides optimal results for this type of ergodic averages. Here, are two examples. • Let 0 ≤ c < 1. Take n−c ≤ an ≤ 1. Then condition (A) is verified. Choose α pn = O(en ), for some 0 < α < 1, and Q an i.i.d. sequence such that E logB + Q1 < ∞ for some B > 1/α. Then conditions (P ) and (Q) are satisfied with (t) = t α . For τ > 3/2, Theorem 9.6.2 thus applies with normalizing factor n(1+α)/2 logτ n. In the case when an ≡ 1, by Corollary 8.6.11, a.s. 1 pl +Ql (ω) T f − E T pl +Ql f −→ 0. n n

k=1

Here we obtain that for any τ > 3/2, n

1 n(1+α)/2 logτ

n

a.s. T pl +Ql (ω) f − E T pl +Ql f −→ 0,

k=1

and a maximal inequality. • Take A as before, pn = O(nB ), for some B < ∞, and Q an i.i.d. sequence such that E Qδ1 < ∞ for some δ > 0. Choose (t) = B log t. Then conditions (P ), (Q), (A) are satisfied and for any b > 2, Theorem 9.6.2 applies with normalizing factor √ n logb n. The same kind of comments can be made for the case al ≡ 1. Proof. Fix τ > 3/2. In view of conditions (P ) and (Q), log 1 + pj + Qj < ∞. (Aj ) j ≥1

E sup

(9.6.1)

Put for any positive integer l,

ζl (x) = al e2iπ x(pl +Ql ) − E e2iπ x(pl +Ql ) }.

By Lemma 8.7.2, if for some increasing function G : N → N the following condition is satisfied $ % ∞ log(1 + pj + Qj ) 1/2 C(Q, G) = E sup < ∞, G(j ) j =1 then,

N l=M+1 ζl (t) ≤ CC(Q, G). E sup sup 1/2 M 2 N <M 0≤t≤1 G(M) k=N +1 ak

In view of (9.6.1), we can choose G by putting G(j ) = (Aj ). It follows that 1/2 ζl ≤ C (Aj )1/2 al2 . i≤l≤j

∞

i≤l≤j

Then, condition (9.3.18) is satisfied. The conditions for the application of Theorem 9.3.12 are fulfilled. And since in view of condition (A), condition (9.3.19) is verified, the proof is achieved by applying the second half of this theorem.


9.7 A stronger form of Salem–Zygmund’s estimate

The majorizing measure method allows us to obtain a new and strictly sharper estimate of the supremum of random trigonometric sums. The improvement is seen by considering the case when the characters are indexed on sub-exponentially growing sequences of integers. Several remarkable examples will be studied.

Let $p = \{p_k, k \ge 1\}$ and $\theta = (\theta_k)_{k\ge 1}$ be two sequences of reals, and denote $\tilde p_N = \max\{2 + |p_k|,\ 1 \le k \le N\}$. Let also $X = \{X_1, X_2, \dots\}$ and $Y = \{Y_1, Y_2, \dots\}$ be two sequences of real random variables defined on a common probability space $(\Omega, \mathcal{A}, \mathbf{P})$. We will be mainly interested in the cases when $X$ and $Y$ are sequences of centered, independent random variables. Consider for $N = 1, 2, \dots$ the sequence of random trigonometric sums

$$Z_N(\omega, t) = \sum_{k=1}^{N} \theta_k \{X_k(\omega)\cos 2\pi p_k t + Y_k(\omega)\sin 2\pi p_k t\}, \qquad Q_N := \sup_{0\le t\le 1} |Z_N(t)|. \qquad (9.7.1)$$

Put for $s, t \in [0, 1]$,

$$d_N(s,t) = 2\Big(\sum_{k=1}^{N} \theta_k^2 \sin^2 \pi p_k (s-t)\Big)^{1/2}. \qquad (9.7.2)$$

When $X$ and $Y$ are independent random variables with $\mathbf{E}\,X_k = \mathbf{E}\,Y_k = 0$ and $\mathbf{E}\,X_k^2 = \mathbf{E}\,Y_k^2 = 1$, then $d_N(s,t) = \|Z_N(s) - Z_N(t)\|_2$.

Recall briefly the setting considered in Section 8.5. We assumed that for some constant $B$,

$$\begin{cases} \|Z_N(s) - Z_N(t)\|_G \le B\, d_N(s,t), \\ \|Z_N(s)\|_G \le B \big(\sum_{k=1}^{N} \theta_k^2\big)^{1/2}, \end{cases} \qquad \forall N \ge 1,\ \forall\, 0 \le s, t \le 1, \qquad (8.5.4)$$

where $G(x) = e^{x^2} - 1$. These assumptions are satisfied when $X$ and $Y$ are independent Rademacher or Gaussian random variables and in other interesting cases (see Examples 1–3, Section 8.5). We have shown in Theorem 8.5.1 that under assumption (8.5.4), there exists a constant $C$ (which is a function of the constant $B$ from (8.5.4) only) such that for any integer $N \ge 1$,

$$\|Q_N\|_G \le C\, (\log \tilde p_N)^{1/2} \Big(\sum_{k=1}^{N} \theta_k^2\Big)^{1/2}.$$
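The identity $d_N(s,t) = \|Z_N(s) - Z_N(t)\|_2$ rests on the elementary relation $(\cos a - \cos b)^2 + (\sin a - \sin b)^2 = 4\sin^2\frac{a-b}{2}$. The following sketch checks it numerically. It assumes centered, unit-variance, mutually uncorrelated $X_k$, $Y_k$, so that the second moment of the increment is a plain sum over $k$. The function names and sample values are illustrative only.

```python
import math


def d_N(theta, p, s, t):
    """Pseudo-metric (9.7.2): 2 * sqrt(sum theta_k^2 sin^2(pi p_k (s - t)))."""
    return 2.0 * math.sqrt(
        sum(th ** 2 * math.sin(math.pi * pk * (s - t)) ** 2
            for th, pk in zip(theta, p))
    )


def l2_increment(theta, p, s, t):
    """L2-norm of Z_N(s) - Z_N(t) for centered, unit-variance,
    uncorrelated X_k, Y_k (assumption of this sketch)."""
    total = 0.0
    for th, pk in zip(theta, p):
        dc = math.cos(2 * math.pi * pk * s) - math.cos(2 * math.pi * pk * t)
        ds = math.sin(2 * math.pi * pk * s) - math.sin(2 * math.pi * pk * t)
        total += th ** 2 * (dc * dc + ds * ds)
    return math.sqrt(total)


if __name__ == "__main__":
    theta = [1.0, 0.5, 0.25]
    p = [1.0, 3.0, 7.0]
    print(d_N(theta, p, 0.3, 0.11), l2_increment(theta, p, 0.3, 0.11))
```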

This followed from estimate (8.5.18), which we recall for our purpose:

QN G ≤ C (log 4p˜ N )1/2

N

θk2

1/2

k=1

≤ C (log p˜ N )1/2

N k=1

θk2

1 2 2 1/2 θk pk p˜ N N

+

k=1

1/2 .


And we see that $\|Q_N\|_G$ is controlled by two different quantities:

$$a_N = \Big(\sum_{k=1}^{N} \theta_k^2\Big)^{1/2}, \qquad b_N = \frac{1}{\tilde p_N}\Big(\sum_{k=1}^{N} \theta_k^2 p_k^2\Big)^{1/2}. \qquad (9.7.3)$$

Obviously $b_N \le a_N$. But $b_N$ is not necessarily of the same order as $a_N$; we may have $b_N \ll a_N$. Indeed, if $p$ increases very fast, say exponentially, and $\theta$ no more than polynomially, then the appropriate order of $b_N$ can be $\sup_{1\le k\le N} |\theta_k|$, which is quite different from $a_N$. So the natural question is: which of $a_N$ and $b_N$ really reflects the appropriate size for the order of $\|Q_N\|_G$? As will be seen, the answer turns out to be a bit subtle.

From now on, we assume for simplicity that the sequence $p$ is an increasing sequence of positive reals greater than 1. Put for $r = 1, \dots, N$,

$$\varepsilon_r^2 = p_r^{-2}\sum_{k=1}^{r} \theta_k^2 p_k^2 + \sum_{k=r+1}^{N} \theta_k^2 \qquad (9.7.4)$$

and observe first that the sequence $\varepsilon_r$, $r = 1, \dots, N$, is decreasing. Indeed,

$$\varepsilon_r^2 = p_r^{-2}\sum_{k=1}^{r} \theta_k^2 p_k^2 + \sum_{k=r+1}^{N} \theta_k^2 > p_{r+1}^{-2}\sum_{k=1}^{r} \theta_k^2 p_k^2 + \frac{\theta_{r+1}^2 p_{r+1}^2}{p_{r+1}^2} + \sum_{k=r+2}^{N} \theta_k^2 = \varepsilon_{r+1}^2.$$

Moreover $\varepsilon_1^2 = \sum_{k=1}^{N} \theta_k^2 = a_N^2$, whereas $\varepsilon_N^2 = p_N^{-2}\sum_{k=1}^{N} \theta_k^2 p_k^2 = \dfrac{[2 + p_N]^2}{p_N^2}\, b_N^2$.
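The properties of the sequence $(\varepsilon_r)$ just stated can be checked on a small example. The sketch below evaluates (9.7.4) directly; the helper name `eps_squared` and the sample sequences are illustrative assumptions, not taken from the text.

```python
def eps_squared(theta, p):
    """Return [eps_1^2, ..., eps_N^2] as defined in (9.7.4).

    theta, p: lists of equal length N; p is assumed increasing with p_k > 1.
    """
    n = len(theta)
    out = []
    for r in range(1, n + 1):
        # Head: p_r^{-2} * sum_{k<=r} theta_k^2 p_k^2
        head = sum(theta[k] ** 2 * p[k] ** 2 for k in range(r)) / p[r - 1] ** 2
        # Tail: sum_{k>r} theta_k^2
        tail = sum(theta[k] ** 2 for k in range(r, n))
        out.append(head + tail)
    return out


if __name__ == "__main__":
    theta = [1.0 / k for k in range(1, 7)]
    p = [2.0 ** k for k in range(1, 7)]
    for r, e in enumerate(eps_squared(theta, p), start=1):
        print(r, e)
```

With an exponentially growing $p$, as in this example, $\varepsilon_N$ is much smaller than $\varepsilon_1 = a_N$. This is the situation in which $b_N \ll a_N$.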

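9.7.1 Theorem. Under assumption (8.5.4), there exist constants $C_i$, $i = 0, 1, 2$ (which are functions of the constant $B$ from (8.5.4) only) such that for any integer $N \ge 1$,

$$\Big\|\sup_{s,t\in T} |Z_N(s) - Z_N(t)|\Big\|_G \le C_0\Big\{\varepsilon_N\sqrt{\log p_N} + \sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\sqrt{\log p_r}\Big\},$$

and

$$\Big\|\sup_{t\in T} |Z_N(t)|\Big\|_G \le C_1\varepsilon_1 + C_2\Big\{\varepsilon_N\sqrt{\log p_N} + \sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\sqrt{\log p_r}\Big\}.$$

The last inequality follows from the first and assumption (8.5.4) by the triangle inequality. The right-hand side being clearly bounded above by $\max(C_1, C_2)\,\varepsilon_1\sqrt{\log p_N}$, it follows that Theorem 9.7.1 contains Theorem 8.5.1. Before giving the proof, we first establish a lemma. Let $\psi(x) = \sqrt{\log(x+1)}$, $x \ge 0$.

9.7.2 Lemma. For any positive integer $N$,

$$\sup_{\alpha\in\mathbf{R}} \int_0^{2\varepsilon_1} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon \le C\varepsilon_N\,\psi(\pi p_N) + 2\sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\,\psi(\pi p_r),$$

where $B_{d_N}(\alpha,\varepsilon)$ is the $d_N$-ball of radius $\varepsilon$ centered at the point $\alpha$, and $C$ is an absolute constant.

Proof. Let $1 \le r < N$ and let $\alpha, \beta \in \mathbf{R}$ be such that $\frac{1}{\pi p_{r+1}} \le |\alpha - \beta| < \frac{1}{\pi p_r}$. Then,

$$d_N^2(\alpha,\beta) \le 4\sum_{k=1}^{N} \theta_k^2\big[(\pi p_k|\alpha-\beta|)^2 \wedge 1\big] = 4\sum_{k=1}^{r} \pi^2\theta_k^2 p_k^2 |\alpha-\beta|^2 + 4\sum_{k=r+1}^{N} \theta_k^2 \le 4p_r^{-2}\sum_{k=1}^{r} \theta_k^2 p_k^2 + 4\sum_{k=r+1}^{N} \theta_k^2 = 4\varepsilon_r^2. \qquad (9.7.5)$$

For $r = N$, $\varepsilon_N^2 = p_N^{-2}\sum_{k=1}^{N} \theta_k^2 p_k^2$. Now, if $|\alpha - \beta| < \frac{1}{\pi p_N}$, then

$$d_N^2(\alpha,\beta) \le 4\sum_{k=1}^{N} \theta_k^2\big[(\pi p_k|\alpha-\beta|)^2 \wedge 1\big] = 4\sum_{k=1}^{N} \pi^2\theta_k^2 p_k^2|\alpha-\beta|^2 \le 4\varepsilon_N^2. \qquad (9.7.6)$$

Let $1 \le r_0 < N$; since the sequence $(\varepsilon_r)$ is decreasing, (9.7.5) and (9.7.6) show that the ball $B_{d_N}(\alpha, 2\varepsilon_{r_0})$ contains the interval $\big]\alpha - \frac{1}{\pi p_{r_0}}, \alpha + \frac{1}{\pi p_{r_0}}\big[$. Hence, $\lambda\big(B_{d_N}(\alpha, 2\varepsilon_{r_0})\big) \ge \frac{1}{\pi p_{r_0}}$. Therefore

$$\int_{2\varepsilon_N}^{2\varepsilon_1} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon = \sum_{r_0=2}^{N} \int_{2\varepsilon_{r_0}}^{2\varepsilon_{r_0-1}} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon \le 2\sum_{r_0=2}^{N} (\varepsilon_{r_0-1} - \varepsilon_{r_0})\,\psi(\pi p_{r_0}). \qquad (9.7.7)$$

Let now $0 < \varepsilon \le 2\varepsilon_N$ and $0 < \tau \le 1$. Let $|\alpha - \beta| < \tau/\pi p_N$. Then, $d_N(\alpha,\beta) < 2\tau\varepsilon_N$. The ball $B_{d_N}(\alpha, 2\tau\varepsilon_N)$ contains the interval $\big]\alpha - \frac{\tau}{\pi p_N}, \alpha + \frac{\tau}{\pi p_N}\big[$. And, with the change of variable $\varepsilon = 2\tau\varepsilon_N$,

$$\int_0^{2\varepsilon_N} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon = 2\varepsilon_N \int_0^1 \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha, 2\tau\varepsilon_N))}\Big)\, d\tau \le 2\varepsilon_N \int_0^1 \psi\Big(\frac{\pi p_N}{\tau}\Big)\, d\tau \le C\varepsilon_N\,\psi(\pi p_N), \qquad (9.7.8)$$

since

$$\sqrt{\log(1 + \pi p_N/\tau)} \le \sqrt{\log[(1 + \pi p_N)(1 + 1/\tau)]} \le \sqrt{\log(1 + \pi p_N)} + 1/\sqrt{\tau}.$$

Thus,

$$\int_0^{2\varepsilon_1} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon \le C\varepsilon_N\,\psi(\pi p_N) + 2\sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\,\psi(\pi p_r). \qquad (9.7.9)$$

The bound in (9.7.9) being independent of $\alpha \in \mathbf{R}$, the lemma is proved.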
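The key step of the proof is the band estimate (9.7.5)–(9.7.6): if $|\alpha - \beta|$ lies between $1/(\pi p_{r+1})$ and $1/(\pi p_r)$, then $d_N^2(\alpha,\beta) \le 4\varepsilon_r^2$. The following sketch tests this on sample points. The function names, the sampling grid and the tolerance are illustrative assumptions; `p` is assumed increasing with entries greater than 1.

```python
import math


def d_squared(theta, p, delta):
    """d_N^2 for |alpha - beta| = delta, from (9.7.2)."""
    return 4.0 * sum(t * t * math.sin(math.pi * q * delta) ** 2
                     for t, q in zip(theta, p))


def eps_squared(theta, p, r):
    """eps_r^2 from (9.7.4), with r counted from 1."""
    head = sum(theta[k] ** 2 * p[k] ** 2 for k in range(r)) / p[r - 1] ** 2
    tail = sum(theta[k] ** 2 for k in range(r, len(theta)))
    return head + tail


def check_bands(theta, p, samples=20, tol=1e-12):
    """Return True if d_N^2 <= 4 eps_r^2 at all sampled points of each band."""
    n = len(p)
    # Bands 1/(pi p_{r+1}) <= delta < 1/(pi p_r), r = 1, ..., N-1: see (9.7.5).
    for r in range(1, n):
        lo = 1.0 / (math.pi * p[r])
        hi = 1.0 / (math.pi * p[r - 1])
        for j in range(samples):
            delta = lo + (hi - lo) * j / samples
            if d_squared(theta, p, delta) > 4.0 * eps_squared(theta, p, r) + tol:
                return False
    # Innermost region delta < 1/(pi p_N): see (9.7.6).
    for j in range(1, samples + 1):
        delta = j / (samples + 1.0) / (math.pi * p[-1])
        if d_squared(theta, p, delta) > 4.0 * eps_squared(theta, p, n) + tol:
            return False
    return True


if __name__ == "__main__":
    print(check_bands([1.0, 0.7, 0.5, 0.3, 0.2], [1.5, 2.5, 4.0, 9.0, 20.0]))
```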


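Note that if $\psi$ is another nondecreasing function such that for $u, v \ge 1$, $\psi(uv) \le K\psi(u)\psi(v)$ and $\int_0^1 \psi(u^{-1})\,du < \infty$, we have also

$$\sup_{\alpha\in\mathbf{R}} \int_0^{2\varepsilon_1} \psi\Big(\frac{1}{\lambda(B_{d_N}(\alpha,\varepsilon))}\Big)\, d\varepsilon \le C_\psi\,\varepsilon_N\,\psi(\pi p_N) + 2\sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\,\psi(\pi p_r),$$

where $C_\psi$ depends on $\psi$ only.

Proof of Theorem 9.7.1. By Lemma 9.7.2, $\lambda$ is a majorizing measure for $(T, d)$ and $\varphi = G$. Now Theorem 9.7.1 follows directly from Theorem 9.2.2.

9.7.3 Remark. If $p'$ is an increasing sequence of positive reals such that $p_k \le p'_k$ for all $k$, then, for $\frac{1}{\pi p'_{r+1}} \le |\alpha - \beta| < \frac{1}{\pi p'_r}$,

$$d_N^2(\alpha,\beta) \le 4\sum_{k=1}^{N} \theta_k^2\big[(\pi p_k|\alpha-\beta|)^2 \wedge 1\big] \le 4\sum_{k=1}^{N} \theta_k^2\big[(\pi p'_k|\alpha-\beta|)^2 \wedge 1\big] = 4\sum_{k=1}^{r} \pi^2\theta_k^2 (p'_k)^2|\alpha-\beta|^2 + 4\sum_{k=r+1}^{N} \theta_k^2 \le 4(p'_r)^{-2}\sum_{k=1}^{r} \theta_k^2 (p'_k)^2 + 4\sum_{k=r+1}^{N} \theta_k^2 := 4(\varepsilon'_r)^2.$$

Consequently the bound in $p$, $\theta$ given in Theorem 9.7.1 is less than the same bound expressed with $p'$, $\theta$. We will use this trivial observation in the next section as follows: let $p_k = [p'_k]$, where $[x]$ stands for the integer part of $x$; then in order to apply Theorem 9.7.1, it is enough to compute quantities related to $\theta$ and $p'$.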

9.8

Some examples and discussion

We begin by studying two examples, the first of which will show that Theorem 9.7.1 is strictly stronger than Theorem 8.5.1. In the second example, both theorems provide the same estimate. However, this example will suggest another reading of the estimate in Theorem 8.5.1, leading to the discovery of large classes of sequences $p$, $\theta$ for which more useful uniform estimates of the sup-norm can be obtained.

9.8.1. The subexponential case. Consider two increasing differentiable functions $\psi, \varphi : \mathbf{R}_+ \to [1, \infty[$. We define $p$ and $\theta$ as follows: $p_k = [\exp\{k/2\psi(k)\}]$, $\theta_k^2 = 1/\varphi(k)$. We assume that

$$\frac{x\psi'(x)}{\psi(x)} \sim c \in [0, 1[, \qquad \frac{\psi(x)\varphi'(x)}{\varphi(x)} = o(1), \qquad \psi'(x) = o(1) \qquad (x \to \infty). \qquad (9.8.1)$$


Note that (pr /pr−1 ) ∼ 1 if ψ(x) ↑ ∞ as x tends to infinity, and that in any case (pr /pr−1 ) ≤ C < ∞, C independent of r if the values of ψ(x) are bounded below y duby some strictly positive constant. The lemma below, in which we put (y) = 1 ϕ(u) , is elementary. 9.8.2 Lemma. The following estimates in which C is an absolute constant are valid when N and r tend to infinity: 1) pr−2 rk=1 θk2 pk2 ≤ C ψ(r) ϕ(r) , −2 1 2 2 2 − εr2 = [pr−1 − pr−2 ] r−1 2) εr−1 k=1 θk pk ≤ C ϕ(r) , 3) εr2 ≥ (N ) − (r + 1), 1/2 N N

√ k 4) , k=2 (εk−1 − εk ) log pk ≤ C k=2 ϕ(k)2 ψ(k)[(N )−(k)]

1/2 N−2

1/2 N −1 k x 5) ≤C 2 dx, 2 k=2 ϕ(k)2 ψ(k)[(N )−(k)] ϕ(x) ψ(x)[(N )−(x)] #1/2 " 1 √ √ N 1/2 r 6) εN log pN ≤ C[ ϕ(N , ε1 log pN ≤ C ( ϕ(1) + (N )) ψ(r) . )] Proof. This follows from the asymptotics (exp{x/ψ(x)}) ∼ (1 − c) exp{x/ψ(x)}/ψ(x) and

ψ(x) x exp ϕ(x) ψ(x)

$

1 ψ(x)ϕ (x) x ∼ (1 − c) + ψ (x) − exp ϕ(x) ψ(x) ϕ(x) x (1 − c) ∼ , exp ϕ(x) ψ(x)

%

as x → ∞. 1 2 ∼ ψ(N ) whereas ε 2 ∼ Note that εN 1 ϕ(N ) ϕ(1) + (N ), and therefore all the balls BdN (t, εr ) make a contribution to estimates (9.7.7) and (9.7.8). For the discussion, we choose ψ(x) = x α , ϕ(x) = x β with β ≥ 1, 0 ≤ α < 1. The set of conditions (9.8.1) is fulfilled if 0 < α < 1 as well as in the limit case α = 0, corresponding to the exponential case. First consider the case β > 1. Then N −1 N−1 1/2 1/2 x x 1−2β−α dx = ϕ(x)2 ψ(x)[(N) − (x)] [x 1−β − N 1−β ] 2 2 1−1/N $ 1−2β−α %1/2 u 1−( β+α ) 2 (x = Nu) =N du. 1−β − 1| |u 2/N

But

⎧ " 1−2β−α # 1/2 1 u ⎪ du < ∞ if β + α < 2, ⎪ ⎪ 0 |u1−β −1| ⎨ 1−1/N " u1−2β−α #1/2 du = O(log N ) if β + α = 2, 2/N |u1−β −1| ⎪ ⎪ " # β+α 1−2β−α ⎪ 1/2 ⎩ 1−1/N u 1−β du = O(N −1+( 2 ) ) if β + α > 2. 2/N |u −1|


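The residual terms in Lemma 9.8.2, inequality (4), namely $(\varepsilon_{N-1} - \varepsilon_N)\sqrt{\log p_N}$ and $(\varepsilon_{N-2} - \varepsilon_{N-1})\sqrt{\log p_{N-1}}$, make a contribution which is at most $N^{(1-\beta)/2} \le N^{1-\frac{\beta+\alpha}{2}}$. It follows that

$$\sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\sqrt{\log p_r} = \begin{cases} O\big(N^{1-\frac{\beta+\alpha}{2}}\big) & \text{if } \beta + \alpha < 2, \\ O(\log N) & \text{if } \beta + \alpha = 2, \\ O(1) & \text{if } \beta + \alpha > 2. \end{cases}$$

From Lemma 9.8.2 we also have that

$$\varepsilon_N\sqrt{\log p_N} = O\big(N^{\frac{1-\beta}{2}}\big), \qquad \varepsilon_1\sqrt{\log p_N} = O\big(N^{\frac{1-\alpha}{2}}\big).$$

Consider the case $\beta + \alpha < 2$. By Theorem 9.7.1,

$$\Big\|\sup_{t\in T} |Z_N(t)|\Big\|_G \le C(\alpha,\beta)\, N^{1-\frac{\beta+\alpha}{2}}, \qquad (9.8.2a)$$

whereas by Theorem 8.5.1,

$$\Big\|\sup_{t\in T} |Z_N(t)|\Big\|_G \le C(\alpha,\beta)\, N^{\frac{1-\alpha}{2}}. \qquad (9.8.2b)$$

As we assumed $\beta > 1$, it follows that $1 - \frac{\beta+\alpha}{2} < \frac{1-\alpha}{2}$, therefore implying that Theorem 9.7.1 is strictly stronger than Theorem 8.5.1. In the case $\beta + \alpha \ge 2$, this fact is evident.

Now if $\beta = 1$, we find with Theorem 9.7.1 an estimate which is $O\big(N^{\frac{1-\alpha}{2}}\big)$, whereas with Theorem 8.5.1 we get $O\big((N^{1-\alpha}\log N)^{1/2}\big)$. In particular, in the exponential case $\alpha = 0$, we find an order of type $O(N^{1/2})$, again strictly better than $O\big((N\log N)^{1/2}\big)$.

Finally, consider for $M > N$ the increment

$$Q_{N,M} := \sup_{t\in T} |Z_M(t) - Z_N(t)|. \qquad (9.8.3)$$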
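The comparison between the two bounds for $Z_N$ in the subexponential example $\psi(x) = x^\alpha$, $\varphi(x) = x^\beta$ can also be explored numerically. The sketch below makes several simplifying assumptions that are not in the text. It works with the real sequence $p'_k = \exp\{k/2\psi(k)\}$ rather than its integer part, in the spirit of Remark 9.7.3. It drops all constants. It uses $\varepsilon_1\sqrt{\log p_N}$ as a stand-in for the order of the bound of Theorem 8.5.1. The function name `bounds` is illustrative.

```python
import math


def bounds(N, alpha, beta):
    """Return (new, old) for psi(x) = x**alpha, phi(x) = x**beta.

    new: eps_N sqrt(log p_N) + sum_{r=2}^N (eps_{r-1} - eps_r) sqrt(log p_r)
    old: eps_1 sqrt(log p_N)
    Constants are dropped; p_k = exp(k / (2 k**alpha)) without integer part.
    """
    # log p_k = k^(1 - alpha) / 2, theta_k^2 = k^(-beta)
    logp = [k ** (1.0 - alpha) / 2.0 for k in range(1, N + 1)]
    th2 = [k ** (-beta) for k in range(1, N + 1)]
    eps = []
    for r in range(N):
        # p_r^{-2} sum_{k<=r} theta_k^2 p_k^2, computed through logarithms
        head = sum(th2[k] * math.exp(2.0 * (logp[k] - logp[r]))
                   for k in range(r + 1))
        tail = sum(th2[r + 1:])
        eps.append(math.sqrt(head + tail))
    new = eps[-1] * math.sqrt(logp[-1]) + sum(
        (eps[r - 1] - eps[r]) * math.sqrt(logp[r]) for r in range(1, N))
    old = eps[0] * math.sqrt(logp[-1])
    return new, old


if __name__ == "__main__":
    for N in (50, 100, 200, 400):
        print(N, bounds(N, alpha=0.5, beta=1.2))
```

Since $\log p_r \le \log p_N$ and $(\varepsilon_r)$ is decreasing, summation by parts gives `new <= old` for every $N$. The printed values only illustrate how the gap behaves for a given choice of $\alpha$ and $\beta$.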

This case is a bit more delicate and the corresponding sequence (εr ) is given by εr2 = pr−2 2 and εN +1 = the use of the M−1 k=N +2

r

θk2 pk2 +

k=N +1

M

θk2 ,

r = N + 1, . . . , M

(9.8.4)

k=r+1

M 2 2 = p −2 θ 2 p 2 . The previous calculations k=N +1 θk , ε k=N +1 M M r k k 2 2 r 2 2 trivial bound k=N +1 θk pk ≤ k=1 θk pk show here that

M

2

(εk−1 − εk ) log pk ≤ C

M−1

N +2

x 2 ϕ(x) ψ(x)[(M) − (x)]

2 M 1/2 εN +1 log pM ≤ C [(M) − (N )] , ϕ(M) M 1/2 2 x . dx εM log pM ≤ C N ψ(x)ϕ(x)

and

1/2

dx, (9.8.5)


For the last estimate, we used the fact that 2

εM log pM

M M 1/2 1/2 −2 2 2 = pM log pM θk pk ≤ θk2 log pk k=N +1

=

M k=N +1

k ψ(k)ϕ(k)

k=N +1

1/2 .

Choose again for the discussion ψ(x) = x α , ϕ(x) = x β with β ≥ 1, 0 ≤ α < 1. Assume first that β > 1, α + β < 2 and for technical reasons M ≥ N + 6. We shall distinguish when η := M−N M is small or not as M, N tend to infinity. With the change of variables x = Mu, the integral in (9.8.5) is rewritten as 1−1/M $ 1−2β−α %1/2 α+β) u M 1−( 2 ) du. 1−β − 1| |u (N +2)/M α+β)

Since α + β < 2, the integral converges. The order is thus at most M 1−( 2 ) . But if η is small, since (N + 2)/M = 1 − η + 2/M, we see a contribution of the integration near 1. Operating the change of variables u = 1 − h, we get 1−1/M $ 1−2β−α %1/2 η−2/M u dh M − N 1/2 du ≤ C , ≤ C √ α,β α,β 1−β − 1| M h 1−η+2/M |u 1/M where we used the fact that η − 3/M ≤ η/2, since η > 6/M. Consequently, we get M−1 k=N+2

2 α+β) M − N 1/2 (εk−1 − εk ) log pk ≤ Cα,β M 1−( 2 ) . M

By (9.8.5) we have 2

εM log pM ≤

M−N 1/2 α+β−1 N

1 1/2 Cα,β N α+β−2

Cα,β

(9.8.6)

if M − N ≤ N, if M − N ≥ N.

Thus we get by Theorem 9.7.1, ⎧

M−N 1/2 1−( α+β ⎪ 2 ) ( M−N )1/2 C + M ⎨ α,β α+β−1 M N QN,M ≤

G α+β ⎪ 1/2 1 1−( 2 ) M−N 1/2 ⎩Cα,β + M ( ) M N α+β−2

if M − N ≤ N, if M − N ≥ N. (9.8.7a)

We deduce from Theorem 8.5.1 that QN,M G

1/2

1−α 1/2 Cα,β (M−NN)M , Cα,β [N 1−β − M 1−β ]M 1−α β ≤

1−β 1−α 1/2 Cα,β N M ,

if M − N ≤ N, if M − N ≥ N. (9.8.7b)


Thus here again Theorem 9.7.1 provides better bounds than Theorem 8.5.1. If α + β = 2, we find by Theorem 9.7.1 that

M C log e , M − N ≤ N, α,β QN,M ≤

N M−N 1/2 G , M − N ≥ N, Cα,β log e M N M whereas if α + β > 2,

M M − N ≤ N, QN,M ≤ Cα,β log e N , M M−N 1/2 G , M − N ≥ N, Cα,β log e N M again better than those obtained via Theorem 8.5.1. 9.8.3. The polynomial case. Consider another case: pk = k s/2 , θk2 = log1 k . This corresponds to the choice ψ(x) = x/(s log x) and ϕ(x) = 1/ log x. In that case, we will see that εr 3 ε1 . This means that there is only one big ball at the origin. Theorems 8.5.1 and 9.7.1 will produce similar estimates. As said before, this example is also very instructive for the sequel. At first, pr−2

r

θk2 pk2 ∼

k=1

r , (2s + 1) log r

r k=1

θk2 ∼

r log r

(r → ∞).

N r 1 2 And εr2 = pr−2 rk=1 θk2 pk2 + N k=r+1 θk ∼ (2s+1) log r + k=r+1 log r . By distinguishing the cases r ≤ N/2 and r ≥ N/2, we easily see that for N large, C1

N N ≤ εr2 ≤ C2 , log N log N

1 ≤ r ≤ N,

C1 , C2 , . . . being absolute constants, therefore showing that εr 3 ε1 [recall that these numbers are defined once the value of N has been fixed]. −2 2 Now as pr−1 − pr−2 ∼ 2s/r 2s+1 , we get εr−1 − εr2 ∼ 2s/ log r, and combining these estimates 3 log N 1 εr−1 − εr 3 sC3 (r → ∞). N log r Consequently N

2

3

(εr−1 − εr ) log pr ∼ s

3/2

r=2

C4

N √ log N 1 ∼ s 3/2 C5 N √ N log r r=2

√ √ √ √ and εN log pN ∼ N, ε1 log pN ∼ N. Then N √ 2 2 (εr−1 − εr ) log pr ∼ N εN log pN + r=2


when N tends to infinity. Hence by Theorems 8.5.1 or 9.7.1, √ sup |ZN (t)| ≤ C(s) N . G

(9.8.8)

t∈T

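It is interesting to observe in this example that

$$\sum_{r=2}^{N} (\varepsilon_{r-1} - \varepsilon_r)\sqrt{\log p_r} \asymp \frac{\sum_{r=1}^{N} \theta_r^2\sqrt{\log p_r}}{\big(\sum_{r=1}^{N} \theta_r^2\big)^{1/2}}, \qquad (9.8.9a)$$

and by the Cauchy–Schwarz inequality this is less than $\big(\sum_{r=1}^{N} \theta_r^2\log p_r\big)^{1/2}$, which has the same order in $N$. As one also always has

$$\varepsilon_N\sqrt{\log p_N} = p_N^{-1}\sqrt{\log p_N}\,\Big(\sum_{k=1}^{N} \theta_k^2 p_k^2\Big)^{1/2} \le \Big(\sum_{k=1}^{N} \theta_k^2\log p_k\Big)^{1/2}, \qquad (9.8.9b)$$

we have by Theorem 9.7.1 the bound

$$\Big\|\sup_{t\in T} |Z_N(t)|\Big\|_G \le C\Big(\sum_{k=1}^{N} \theta_k^2\log p_k\Big)^{1/2}. \qquad (9.8.9c)$$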

N √ 2 1/2 . It is That expression is of course much more useful than log pN r=1 θr therefore interesting to determine whether a set of conditions on p and θ guaranteeing the validity of (9.8.9c) is possible to define. This goes as follows. We assume that there exists a sequence c = {ck , k ≥ 1} of reals and a real number , 0 < ≤ 1, such that ⎧ 2r 2 2 1) lim supr→∞ 2r ⎪ k=1 θk / k=r θk < ∞, ⎪ ⎪ r ⎨2) lim sup 2 2 −2 − p −2 ]c−2 [p r→∞ r r k=1 θk pk < ∞, r+1 (C) r ⎪ 3) lim supr→∞ k=1 ck2 / rk=1 θk2 < ∞, ⎪ ⎪ ⎩ 4) p[r/2] ≥ pr . # " r −2 2 −1 if p = k s (s > 0) or if Observe at first that [pr−2 − pr+1 ] behaves like k k=1 pk pk = 2k , in which case it is also like pr−2 . Practically (C2) reads as follows: r 2 2 k=1 θk pk < ∞, lim sup 2 (C2 ) r 2 p r→∞ cr k=1 k which is satisfied in many cases. Condition (C1) is satisfied once we have that r 2 θ varying function near infinity. The rek=1 k 3 κ(r), where κ is some regularly 2 diverges. θ quirement also implies that the series ∞ k=1 k Condition (C3) complements (C2) on comparing the growth of θ and c. Finally, condition (C4) means that the sequence p grows at most polynomially.

9.8.4 Proposition. Under assumption (C), there exists a constant $C$ such that for all $N$ large enough,
\[
\Bigl\| \sup_{t\in T} |Z_N(t)| \Bigr\|_G \le C\Bigl(\sum_{r=1}^{N}\theta_r^2\log p_r\Bigr)^{1/2}.
\]

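Proof. By assumption, for some suitable real $0 < c < 1$ we have for all $r$ large enough:
\[
1)\ \sum_{k=r}^{2r}\theta_k^2 \ge c\sum_{k=1}^{2r}\theta_k^2, \qquad
2)\ c\,[p_r^{-2}-p_{r+1}^{-2}]\sum_{k=1}^{r}\theta_k^2p_k^2 \le c_r^2, \qquad
3)\ \sum_{k=1}^{r}\theta_k^2 \ge c\sum_{k=1}^{r}c_k^2.
\]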

Using 1) and (C3) we get −2 2 εr2 ≥ εN = pN

N

−2 2 θk2 pk2 ≥ pN p[N/2]

k=1

θk2 ≥ c2

N/2≤k≤N

N

θk2 = c2 ε12 .

k=1

Now by (C2) and estimate 3) above −2 2 2 [pr−1 − pr−2 ] r−1 cr2 k=1 pk θk ≤ ≤ εr−1 − εr = # " N " # . 1/2 N 2 1/2 εr−1 + εr c k=1 θk2 c2 k=1 ck 2 εr−1 − εr2

Therefore, by applying the Cauchy–Schwarz inequality, N

N 2 (εr−1 − εr ) log pr ≤

r=2

r=2

√ N 1/2 1 2 cr2 log pr c log p . r " N 2 #1/2 ≤ r c2 c2 r=2 k=1 ck

One concludes by applying Theorem 9.7.1. There is an interesting case where Proposition 9.8.4 applies. We assume that X and Y are either independent i.i.d. Rademacher sequences or independent i.i.d. N (0, 1) sequences. Let U = {Uk , k ≥ 1} be a sequence of independent random variables defined on a joint probability space (ϒ, F , ). Consider also a sequence c = {ck , k ≥ 1} of reals and choose in (9.7.1) θk = ck Uk ,

k = 1, 2, . . .

(9.8.10)

It is clear with the choice made for√ X and Y that condition (8.5.4) is satisfied, condi√ tionally to U (one can take B = 18 2, or B = 18 π in the Gaussian or Rademacher

case, see Section 8.5, Example 1). We now impose on U to satisfy the two following weighted strong laws of large numbers: N 2 2 N 2 2 2 k=1 ck Uk a.s. k=1 pk ck Uk a.s. lim N = a1 , lim = a2 , (9.8.11) N 2 2 2 N→∞ N →∞ k=1 ck k=1 pk ck where 0 < a1 , a2 < ∞. When the random variables Uk are moreover identically distributed and a = E U12 < ∞, according to Theorem 4.8.1 the strong laws in (9.8.11) are respectively verified as soon as r r 2 2 2 1 1 k=1 ck k=1 pk ck lim sup #{r : ≤ t} < ∞, lim sup ≤ t} < ∞, #{r : cr2 pr2 cr2 t→∞ t t→∞ t (9.8.12) in which case a1 = a2 = a. Condition (9.8.12) allows us to catch a wide range of examples,for instance pk = k s and ck = k β with s ≥ 1 and β real are suitable. Put H (r) = rk=1 ck2 , r ≥ 1. We do assume that the sequence p is polynomially growing and that the extra assumption linking both p and c holds as well: there exists C > 1 such that for any r large enough, a)

H (2r) ≥ CH (r),

b)

−2 [pr−2 − pr+1 ]

r

(9.8.13)

ck2 pk2 ≤ Ccr2 .

k=1

2 The requirement (9.8.13a), implying the divergence of the series ∞ k=1 ck , is satisfied for instance if H (r) 3 κ(r) where κ is a regularly varying function with positive Karamata index, but not if κ is slowly varying. Let us look at the effect of assumptions (9.8.11), (9.8.13) on the control of the quantities appearing in conditions (C1), (C2) and (C3). On the one hand, for any C > C > 1, by using (9.8.11) and (9.8.13a), 2r 2 2 2r 2 2 r 2 2 H (2r) k=1 ck Uk 8 k=1 ck Uk k=1 ck Uk = ≥ C, r 2U 2 H (2r) H (r) H (r) c k=1 k k r 2 2 2 2 almost surely, for r large. So that 2r k=r+1 ck Uk ≥ (C − 1) k=1 ck Uk , r large, thus implying that condition (C1) is checked. On the other hand, by (9.8.11) and (9.8.13b) r r r c2 U 2 p 2 k=1 k k k −2 −2 −2 2 2 2 −2 2 2 [pr − pr+1 ] ck Uk pk = [pr − pr+1 ] ck pk r 2 2 k=1 ck pk k=1 k=1 −2 ≤ 2[pr−2 − pr+1 ]

r

ck2 pk2 ≤ 2Ccr2 ,

k=1

almost surely, for r large. This implies that condition (C2) is satisfied. Finally, con cerning condition (C3), we observe by assumption (9.8.11) that limr→∞ (a1

)−1 ,

so that it is trivially satisfied. Consequently we can state:

r 2 k=1 ck 2U 2 c k=1 k k

r

=

9.8.5 Corollary. The sequences $X$ and $Y$ being fixed as before, let $p$ be polynomially growing. Let also $U$ be a sequence of independent random variables defined on a joint probability space (ϒ, F , ). Let $c$ be a sequence of reals. We assume that $U$, $p$ and $c$ satisfy conditions (9.8.11) and (9.8.13). If $\theta$ is defined by (9.8.10), for almost all $\upsilon$ in $\Upsilon$, there exists $C_\upsilon < \infty$ such that for all $N$,
\[
\Bigl\| \sup_{t\in T} |Z_N(t)| \Bigr\|_G \le C_\upsilon\Bigl(\sum_{r=1}^{N}c_r^2\log p_r\Bigr)^{1/2}.
\]

And specifying this for i.i.d. square integrable sequences, we get:

9.8.6 Corollary. The sequences $X$ and $Y$ being fixed as before, let $p$ be polynomially growing. Now let $U$ be a sequence of i.i.d. square integrable random variables defined on a joint probability space (ϒ, F , ). Let $p$ and $c$ satisfy (9.8.12), (9.8.13). With $\theta$ defined by (9.8.10), for almost all $\upsilon$ in $\Upsilon$, there exists $C_\upsilon < \infty$ such that for all $N$,
\[
\Bigl\| \sup_{t\in T} |Z_N(t)| \Bigr\|_G \le C_\upsilon\Bigl(\sum_{r=1}^{N}c_r^2\log p_r\Bigr)^{1/2}.
\]

9.8.7. Arithmetical weights. So far we have been concerned with regular (decreasing) weights, except for Corollaries 9.8.5 and 9.8.6, in which we considered random independent weights. In this example we study one symptomatic case of weights arising from arithmetic number theory. Let $d(n) = \#\{d : d \mid n\}$ be the divisor function and consider the case $p_k = [k^{s/2}]$, $\theta_k = d(k)$. In this case the weights are very irregular, but their sums behave regularly. According to equation 18.2.1, p. 263 of [Hardy–Wright: 1979] and equation (B), p. 81 of [Ramanujan: 1916] (see [Wilson: 1922] for a proof) we recall, in effect, that
\[
\sum_{n=1}^{N} d(n) \sim N\log N, \qquad \sum_{n=1}^{N} d^2(n) \sim \frac{N\log^3 N}{\pi^2} \tag{9.8.14}
\]
as $N$ tends to infinity. It follows from Theorem 8.5.1 or Theorem 9.7.1 that

QN G ≤ C(s)N 1/2 (log N )2 . This case is also an example where the sums of the weights grow to infinity. It is natural to also compare when the weights are growing. We shall perform this on the limit case: pk2 = M k , where M > 1 is fixed. We assume that there exists a r 2 nondecreasing differentiable function such that (r) = k=1 θk /r, and r r−1 x (x) ≤ c0 (x). RecallAbel summation: k=1 uk yk = j =1 Dj (yj −yj +1 )+Dr yr , j k where Dj = k=1 uk . Applying it with uk = 1, yk = M gives the relation r−1 M r+1 −1 2 r j j =1 j M (M − 1). Applying it now with uk = θk arbitrary and M−1 = M r −

using the latter relation gives r

θk2 pk2 = (r)rM r −

r−1

(j )j M j (M − 1)

j =1

k=1

r−1

M r+1 − 1 j M j (M − 1) = (r) . ≥ (r) rM r − M −1 j =1

Conversely as rM r = r

θk2 pk2

M r+1 −1 M−1

+

r−1

= (r)rM − r

j =1 (j )j M r−1

j (M

− 1),

(j )j M j (M − 1)

j =1

k=1

M r+1 − 1 + j M j (M − 1)[(r) − (j )]. M −1 r−1

= (r)

j =1

But, as (r) − (j ) ≤ (r − j ) (j ) and r−1

j M (M − 1)(r − j ) (j ) ≤ C j

j =1

r−1

M j (M − 1)(r − j )(j )

j =1

≤ C(r)M r

r−1

M −k (M − 1)k,

k=1

M + C k=1 M −k (M − 1)k . Consequently, for we get k=1 θk2 pk2 ≤ (r)M M−1 some constants C1 , C2 depending on M and only, one has C1 (r)M r ≤ rk=1 θk2 pk2 ≤ C2 (r)M r . And this now implies that r

C1

N r=2

√

r

∞

√ √ N N 2 (r) r (r) r (εr−1 − εr ) log pr ≤ C2 ≤ . √ D(N) − D(r) r=2 D(N ) − D(r) r=2

x Fix some α > 1 such that c0 log(1/α) < 1. Since (x) ≤ (xα) + xα (u)du ≤ x (xα) + c0 xα ((u)/u)du ≤ (xα) + [c0 log(1/α)](x), it follows that (x) ≤ cα (xα). Thus √ N √ 2 (r) r (N α) (εr−1 − εr ) log pr ≥ C1 r ≥ C1 √ √ N (N ) N (N ) N ≥r≥N α r=2 r=2

N

≥ Cα N (N )1/2 . But in view of Theorem 8.5.1, QN G ≤ CN (N )1/2 , so that in this case both theorems produce equivalent estimates.
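The Abel summation formula recalled in this example, $\sum_{k=1}^{r} u_k y_k = \sum_{j=1}^{r-1} D_j (y_j - y_{j+1}) + D_r y_r$ with $D_j = \sum_{k=1}^{j} u_k$, can be checked in exact arithmetic. The following is only a minimal sketch: the weights `theta_sq`, the value `M = 3` and the length `r = 12` are illustrative assumptions, not values taken from the text.

```python
from fractions import Fraction


def direct_sum(u, y):
    """Left-hand side: sum_k u_k * y_k."""
    return sum(a * b for a, b in zip(u, y))


def abel_rhs(u, y):
    """Right-hand side: sum_{j<r} D_j (y_j - y_{j+1}) + D_r y_r, with D_j = u_1 + ... + u_j."""
    r = len(u)
    partial = 0  # running partial sum D_j
    total = 0
    for j in range(r):
        partial += u[j]
        if j < r - 1:
            total += partial * (y[j] - y[j + 1])
        else:
            total += partial * y[j]
    return total


# Illustrative (hypothetical) data: p_k^2 = M^k and arbitrary weights theta_k^2 = 1/k.
M = Fraction(3)
r = 12
theta_sq = [Fraction(1, k) for k in range(1, r + 1)]
p_sq = [M ** k for k in range(1, r + 1)]

lhs = direct_sum(theta_sq, p_sq)
rhs = abel_rhs(theta_sq, p_sq)
print(lhs == rhs)
```

With $u_k = 1$ the same function gives $\sum_{k=1}^{r} M^k = rM^r - (M-1)\sum_{j=1}^{r-1} jM^j$, which is the relation used to bound $\sum_k \theta_k^2 p_k^2$.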

9.9

Uniform convergence of random Fourier series

Let $C$ be the space of complex-valued continuous functions on $T$ equipped with the sup-norm $\|f\| = \sup_{0\le t\le 1}|f(t)|$, $f \in C$. Let $U = \{U_k, k \ge 1\}$ be a sequence of independent symmetric real random variables, and let $p$ be a nondecreasing sequence of positive integers. In Theorem 8.5.8 we showed that the condition: there exist integers $0 := n_0 < n_1 < n_2 < \cdots$ such that the series
\[
\sum_{i=0}^{\infty}\Bigl(E\sum_{k=n_i+1}^{n_{i+1}}|U_k|^2\Bigr)^{1/2}\log^{1/2}p_{n_{i+1}} \tag{9.8.15}
\]
converges, is enough to ensure the uniform convergence of the random Fourier series $\sum_{k\ge1} W_k(\omega)e^{2i\pi p_k t}$ for almost all $\omega$.

However, it is clear from the previous section that this condition is only efficient for polynomially growing sequences $p$. In concrete cases, it is often enough to choose $n_k = 2^k$ to obtain a sharp sufficient condition on $U$ and $p$. But there are examples (for instance Rademacher Fourier series with $p$ and $\theta$ defined by (9.8.19)) for which the correct choice is $n_k = 2^{2^k}$, which show that the appearance of the sequence $(n_k)_k$ in the above condition is meaningful.

In what follows, we would like to use the results from the previous section to investigate this question more specifically. We will restrict the scope of the study to Rademacher random Fourier series. Let $\varepsilon = \{\varepsilon_k, k \ge 1\}$, $\varepsilon' = \{\varepsilon'_k, k \ge 1\}$ be two independent Rademacher sequences. We assume in (9.7.1) that $X = \varepsilon$, $Y = \varepsilon'$ and define for integers $M \ge N$:
\[
Z_{N,M}(\omega,t) = Z_M(\omega,t) - Z_N(\omega,t) = \sum_{k=N+1}^{M}\theta_k\bigl\{\varepsilon_k(\omega)\cos 2\pi p_k t + \varepsilon'_k(\omega)\sin 2\pi p_k t\bigr\}. \tag{9.8.16}
\]
We investigate the uniform convergence of the series
\[
\sum_{k=1}^{\infty}\theta_k\bigl\{\varepsilon_k(\omega)\cos 2\pi p_k t + \varepsilon'_k(\omega)\sin 2\pi p_k t\bigr\}.
\]

Consider first the polynomial case. We establish another type of sufficient condition for uniform convergence in which we get rid of the sequence (nk ). We consider sequences p and θ linked by the following conditions: 1 2 2 (i) ∀N ≥ 1, θ p = o θk2 , k k 2 pm k≤m k≤m (9.8.17) −2 −2 (ii) ∃ < ∞ : [pm−1 − pm ] θk2 pk2 ≤ θm2 . k≤m

The examples studied in the previous section justify introduction of the following set: (9.8.18) D = (p, θ) : condition (9.8.16) is fulfilled .

The pairs (p, θ) studied in 9.8.1 and 9.8.3 belong to D, as well as for instance the pair defined by 1 θ , (9.8.19) pk2 = elog k , θk2 = k logμ k where μ > 1 and θ > 0. 9.9.1 Theorem. Let (p, θ) ∈ D. Assume that 2 a) ∞ r=1 θr log pr < ∞, √ θ2 log pr b) limN→∞ lim supM→∞ N
2 r
∞

k=1 θk {εk (ω) cos 2πpk t

+ εk (ω) sin 2πpk t} con-

Proof. Let 0 < γ < 1 be fixed. Using (9.8.17i), we define recursively the following sequence of integers: N1 = 1,

Nj = sup m > Nj −1 :

1 2 pm

2 2 Nj −1 ≤k≤m θk pk

≥γ

2 Nj −1 ≤k≤m θk

.

(9.8.20)

For Nj −1 < r ≤ Nj , we write εr2 = p12 Nj −1
2 εr−1 − εr2

εr−1 + εr

≤

2 θr−1

√

γ

Nj −1
θk2

1/2 .

(9.8.22)

It follows that Nj

2

(εr−1 − εr ) log pr ≤

r=Nj −1 +2

≤

=

√ γ

√ γ

Nj

Nj r=Nj −1 +2

2 √log p θr−1 r

Nj −1
θk2

1/2 Nj 1/2 2 2 r=Nj −1 +2 θr−1 r=Nj −1 +2 θr−1 log pr

2 1/2 Nj −1
√ γ

Nj

2 θr−1 log pr

1/2

(9.8.23)

1/2 .

r=Nj −1 +2

Applying now Theorem 9.7.1, we get sup |ZN t∈T

j −1 ,Nj (t)| G ≤ C ,γ

Nj r=Nj −1 +2

2 θr−1 log pr

1/2 .

(9.8.24)

And by means of Levy’s inequality we get E

2 sup |ZNj −1 ,R (t)|2 ≤ 2C ,γ

sup

Nj −1
Nj

2 θr−1 log pr .

(9.8.25)

r=Nj −1 +2

In view of (9.8.25) and assumption a) of the theorem, we deduce that the sequence (Zn ) converges in C(T) almost surely, if and only if, the subsequence (ZNj ) converges in C(T) almost surely. Let L < J be fixed. By Levy’s inequality, E

sup

sup |ZNl ,Nj (t)|2 ≤ 2E sup |ZNL ,NJ (t)|2 .

L≤l≤j ≤J t∈T

t∈T

For NL < r ≤ NJ , we denote this time εr2 = Plainly εr2 ≥ r+1
2 εr−1 − εr2

εr−1 + εr

1 pr2

≤

2

(εr−1 − εr ) log pr ≤

NL
2 2 NL
2 θr−1

r
so that

NL
θk2

+

r+1
θk2 .

1/2 ,

2 √log p θr−1 r

1/2 . 2 r
(9.8.26)

We deduce from assumption b) of the theorem that lim lim sup E

L→∞ J →∞

sup

sup |ZNl ,Nj (t)|2 = 0,

L≤l≤j ≤J t∈T

(9.8.27)

which clearly implies that the subsequence (ZNj ) converges in C(T) almost surely. Now consider for the sub-exponential case again Example 9.8.1. Using estimates (9.8.7a) with α + β > 2, one can prove that the random Fourier series arising from (9.7.1) converges uniformly almost surely; which cannot be obtained from existing results nor Theorem 9.9.1. Let indeed Nk = k R where we have chosen R so that R(α + β − 2) > 1. Then, one has for j ≥ k, QN ,N ≤ Cα,β k −1/2−R(2−α−β)/2 , k k+1 G (9.8.28) QN ,N ≤ Cα,β k −R[(α+β)/2−1] . k

l

G

Therefore by Levy’s inequality, E

sup

sup |ZNk−1 ,R (t)|2 ≤ Cα,β k −1/2−R(2−α−β)/2 ,

Nk−1
E

sup

sup |ZNl ,Nj (t)|2 ≤ 2E sup |ZNL ,NJ (t)|2 ≤ Cα,β k −R[(α+β)/2−1] .

L≤l≤j ≤J t∈T

t∈T

(9.8.29)

Chapter 10

Gaussian processes

This chapter is a succinct study of Gaussian processes, but also a kind of toolbox. We begin with properties of independent Gaussian random variables and vectors, and state several helpful correlation inequalities which are generally not presented in classical books on Gaussian processes. This is continued with the rotational invariance property and three fundamental tools: 0-1 laws, strong integrability of Gaussian semi-norms and comparison lemmas. The regularity properties (sample boundedness and sample continuity), their irregularity, and a brief study of Gaussian suprema are next presented. A glimpse at local time of a Gaussian process and its connection with irregularity of sample paths is also included. The remaining part of the chapter is devoted to a closer investigation of the Gaussian Stein’s sequences: their oscillations and their tightness properties.

10.1

Gaussian variables and correlation estimates

Let $(\Omega, \mathcal{A}, P)$ be a given probability space. A real-valued centered random variable $X \in L^2(\Omega, \mathcal{A}, P)$ is Gaussian (or normally distributed, $X \overset{\mathcal{D}}{=} N(0,\sigma)$) if its Fourier transform satisfies, for any real $t$,
\[
E\exp(itX) = \exp(-\sigma^2t^2/2), \tag{10.1.1}
\]

where $\sigma = \|X\|_2 = (EX^2)^{1/2}$. It will be implicitly assumed that all Gaussian random variables considered in this chapter are centered. $X$ is standard Gaussian or $N(0,1)$ distributed if $\sigma = 1$. A Gaussian random variable $X$ can thus always be written in the form $X = \|X\|_2\, g$ where $g$ is standard Gaussian. The distribution function $\Phi(x)$ of a standard Gaussian random variable is defined for any real $x$ as
\[
\Phi(x) = P\{g < x\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt. \tag{10.1.2}
\]
It is often necessary to have at one's disposal good (even sharp) estimates of the tail function $1 - \Phi(x)$ for $x$ large. The precise estimates below are due to Komatu and Pollak.

10.1.1 Lemma. The Mills' ratio $R(x) = e^{x^2/2}\int_x^{\infty} e^{-t^2/2}\,dt$ verifies for all $x \ge 0$,
\[
\frac{2}{\sqrt{x^2+4}+x} \le R(x) \le \frac{2}{\sqrt{x^2+\frac{8}{\pi}}+x} \le \sqrt{\frac{\pi}{2}}. \tag{10.1.3}
\]

The lower bound in (10.1.3) is due to Komatu [1955], while the upper bound is due to Pollak [1956]. However as $x$ tends to 0, $\frac{2}{\sqrt{x^2+4}+x}$ tends to 1, whereas $R(x)$ and $\frac{2}{\sqrt{x^2+8/\pi}+x}$ tend to $\sqrt{\pi/2}$. Later Boyd [1959] obtained the following refinement of (10.1.3): for all $x \ge 0$,
\[
\frac{\pi}{\sqrt{x^2+2\pi}+(\pi-1)x} \le R(x) \le \frac{\pi}{\sqrt{(\pi-2)^2x^2+2\pi}+2x}. \tag{10.1.3$'$}
\]
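As a numerical illustration of the Komatu, Pollak and Boyd bounds on the Mills' ratio, the sketch below evaluates $R(x)$ through the complementary error function, using $\int_x^\infty e^{-t^2/2}dt = \sqrt{\pi/2}\,\operatorname{erfc}(x/\sqrt{2})$. The function names and the grid of test points are illustrative choices, not part of the text.

```python
import math


def mills_ratio(x):
    """R(x) = exp(x^2/2) * integral_x^inf exp(-t^2/2) dt, written with erfc."""
    return math.exp(x * x / 2.0) * math.sqrt(math.pi / 2.0) * math.erfc(x / math.sqrt(2.0))


def komatu_lower(x):
    return 2.0 / (math.sqrt(x * x + 4.0) + x)


def pollak_upper(x):
    return 2.0 / (math.sqrt(x * x + 8.0 / math.pi) + x)


def boyd_lower(x):
    return math.pi / (math.sqrt(x * x + 2.0 * math.pi) + (math.pi - 1.0) * x)


def boyd_upper(x):
    return math.pi / (math.sqrt((math.pi - 2.0) ** 2 * x * x + 2.0 * math.pi) + 2.0 * x)


def bounds_hold(x, rel=1e-12):
    """True if both pairs of bounds bracket R(x), up to a small relative tolerance."""
    r = mills_ratio(x)
    slack = rel * r
    return (komatu_lower(x) <= r + slack and r <= pollak_upper(x) + slack
            and boyd_lower(x) <= r + slack and r <= boyd_upper(x) + slack)


grid = [i / 10.0 for i in range(0, 81)]  # x = 0.0, 0.1, ..., 8.0
print(all(bounds_hold(x) for x in grid))
```

The relative tolerance is only there because several bounds coincide with $R(0) = \sqrt{\pi/2}$ at $x = 0$, where floating-point rounding could otherwise break a non-strict inequality.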

Notice that in (10.1.3$'$) both bounds tend to $\sqrt{\pi/2}$ as $x$ tends to 0. We refer to Mitrinović [1970], Section 2.26, for further details. Mills' ratio is directly related to the Laplace transform of the standard Gaussian law, as follows from the relation, valid for any real $T \ge 0$,
\[
E\,e^{-T|g|} = \sqrt{\frac{2}{\pi}}\,R(T). \tag{10.1.4}
\]
Indeed
\[
E\,e^{-T|g|} = \sqrt{\frac{2}{\pi}}\int_0^{\infty} e^{-Tx-x^2/2}\,dx = \sqrt{\frac{2}{\pi}}\,e^{T^2/2}\int_0^{\infty} e^{-(x+T)^2/2}\,dx = \sqrt{\frac{2}{\pi}}\,e^{T^2/2}\int_T^{\infty} e^{-y^2/2}\,dy = \sqrt{\frac{2}{\pi}}\,R(T).
\]
The notation
\[
\Psi(x) = 1 - \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt, \qquad x \in \mathbb{R}, \tag{10.1.5}
\]
is often used. The tail estimate below, valid for any two nonnegative reals $a$ and $b$, is sometimes useful:
\[
\Psi\bigl(\sqrt{a^2+b^2}\bigr) \le \Psi(a)\,e^{-b^2/2}. \tag{10.1.6}
\]
Corresponding estimates for small deviations of $g$ can be derived from an elementary inequality, for which we again refer to Mitrinović [1970: inequality 3.6.2, p. 266]. For $a$ and $t$ real numbers such that $a \ge 1$ and $|t| \le a$,
\[
0 \le e^{-t} - \Bigl(1 - \frac{t}{a}\Bigr)^{a} \le \frac{t^2e^{-t}}{a}. \tag{10.1.7}
\]
Applying (10.1.7) for $a = 1$ and $t = P\{|g| > u\}$ gives, for any $u \ge 0$,
\[
e^{-P\{|g|>u\}}\bigl(1 - P\{|g|>u\}^2\bigr) \le P\{|g| \le u\} \le e^{-P\{|g|>u\}}. \tag{10.1.8}
\]
In particular, if $u$ is such that we have $P\{|g| > u\} \le \sqrt{2} - 1$, say $u \ge u_0$, then $1 \le (1 - P\{|g|>u\}^2)\,e^{P\{|g|>u\}}$, and we get for all $u \ge u_0$,
\[
e^{-2P\{|g|>u\}} \le \bigl(1 - P\{|g|>u\}^2\bigr)e^{-P\{|g|>u\}} \le P\{|g| \le u\} \le e^{-P\{|g|>u\}}. \tag{10.1.9}
\]
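The tail estimate (10.1.6) and the small-deviation bounds (10.1.8) lend themselves to a quick numerical sanity check. The sketch below writes the Gaussian tail function $1 - \Phi$ as $\tfrac12\operatorname{erfc}(x/\sqrt{2})$; the helper names, grids and tolerances are illustrative assumptions.

```python
import math


def gaussian_tail(x):
    """Tail function 1 - Phi(x) of a standard Gaussian variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_product_bound_holds(a, b, rel=1e-12):
    """Check tail(sqrt(a^2 + b^2)) <= tail(a) * exp(-b^2/2), i.e. (10.1.6)."""
    lhs = gaussian_tail(math.hypot(a, b))
    rhs = gaussian_tail(a) * math.exp(-b * b / 2.0)
    return lhs <= rhs * (1.0 + rel)


def small_deviation_bounds(u):
    """Return (lower, P{|g| <= u}, upper) as in (10.1.8)."""
    q = 2.0 * gaussian_tail(u)          # q = P{|g| > u}
    lower = math.exp(-q) * (1.0 - q * q)
    middle = 1.0 - q                     # P{|g| <= u}
    upper = math.exp(-q)
    return lower, middle, upper


points = [i / 4.0 for i in range(0, 25)]  # 0.0, 0.25, ..., 6.0
print(all(tail_product_bound_holds(a, b) for a in points for b in points))
print(all(lo <= mid + 1e-12 <= up + 2e-12
          for lo, mid, up in (small_deviation_bounds(u) for u in points)))
```

Inequality (10.1.6) is equivalent to the monotonicity of the Mills' ratio, since $1 - \Phi(x) = e^{-x^2/2}R(x)/\sqrt{2\pi}$ and $R$ is decreasing.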

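If $0 \le u \le 1$, then
\[
\frac{2}{5}\,u \le \sqrt{\frac{2}{e\pi}}\,u \le P\{|g| \le u\} = \int_{-u}^{u} e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}} \le \sqrt{\frac{2}{\pi}}\,u \le \frac{4}{5}\,u. \tag{10.1.10}
\]
(In fact $\sqrt{2/(e\pi)} \approx 0.4839414$ and $\sqrt{2/\pi} \approx 0.7978846$.)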
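The two-sided linear bound (10.1.10) can be checked directly, since $P\{|g| \le u\} = \operatorname{erf}(u/\sqrt{2})$. This is a minimal sketch; the names `LOWER_SLOPE`, `UPPER_SLOPE` and the grid are illustrative.

```python
import math

LOWER_SLOPE = math.sqrt(2.0 / (math.e * math.pi))  # sqrt(2/(e*pi))
UPPER_SLOPE = math.sqrt(2.0 / math.pi)             # sqrt(2/pi)


def prob_abs_below(u):
    """P{|g| <= u} for a standard Gaussian g and u >= 0."""
    return math.erf(u / math.sqrt(2.0))


def linear_bounds_hold(u, tol=1e-15):
    p = prob_abs_below(u)
    return (0.4 * u <= LOWER_SLOPE * u <= p + tol
            and p <= UPPER_SLOPE * u + tol
            and UPPER_SLOPE * u <= 0.8 * u)


grid = [i / 100.0 for i in range(0, 101)]  # u = 0.00, 0.01, ..., 1.00
print(all(linear_bounds_hold(u) for u in grid))
```

The lower bound reflects the fact that the Gaussian density is at least $e^{-1/2}/\sqrt{2\pi}$ on $[-1, 1]$, and the upper bound that it is at most $1/\sqrt{2\pi}$.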

Gaussian pairs. Let $(X, Y)$ be a pair of centered random variables with $EX^2 = EY^2 = 1$ and correlation $EXY = \rho$. We exclude the trivial cases $\rho = \pm 1$ corresponding to $X = \pm Y$ and may assume $|\rho| < 1$. Then $(X, Y)$ is Gaussian distributed if its distribution function has density given by
\[
\varphi(x, y, \rho) = \frac{1}{2\pi(1-\rho^2)^{1/2}}\exp\Bigl\{-\frac{x^2+y^2-2\rho xy}{2(1-\rho^2)}\Bigr\}.
\]
As is well known (Cramér–Leadbetter [1967: p. 26]),
\[
\frac{\partial^2\varphi}{\partial x\,\partial y} = \frac{\partial\varphi}{\partial\rho}. \tag{10.1.11}
\]
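Identity (10.1.11) can be illustrated by comparing central finite differences of the bivariate density on both sides. The sketch below is only a numerical check at a few points; the step size `h` and the sample points are arbitrary choices.

```python
import math


def phi(x, y, rho):
    """Density of a standard Gaussian pair with correlation rho, |rho| < 1."""
    s = 1.0 - rho * rho
    return math.exp(-(x * x + y * y - 2.0 * rho * x * y) / (2.0 * s)) / (2.0 * math.pi * math.sqrt(s))


def mixed_xy(x, y, rho, h=1e-4):
    """Central finite-difference approximation of d^2 phi / (dx dy)."""
    return (phi(x + h, y + h, rho) - phi(x + h, y - h, rho)
            - phi(x - h, y + h, rho) + phi(x - h, y - h, rho)) / (4.0 * h * h)


def d_rho(x, y, rho, h=1e-4):
    """Central finite-difference approximation of d phi / d rho."""
    return (phi(x, y, rho + h) - phi(x, y, rho - h)) / (2.0 * h)


sample_points = [(0.3, -0.7, 0.4), (1.2, 0.5, -0.2), (0.0, 0.0, 0.5)]
for x, y, rho in sample_points:
    print(abs(mixed_xy(x, y, rho) - d_rho(x, y, rho)) < 1e-6)
```

At the origin both sides equal $\rho / (2\pi(1-\rho^2)^{3/2})$, which gives an independent closed-form check.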

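One can argue with this fact to prove the following useful correlation estimate. Let $0 < \varepsilon < 1$ and assume that $0 \le \rho \le \varepsilon$. Then for any $\varepsilon < \eta < 1$, there exists a constant $C^*_{\eta,\varepsilon}$ depending on $\eta, \varepsilon$ only, such that for any $\beta \ge \alpha \ge 0$,
\[
\bigl| P\{X > \alpha, Y > \beta\} - P\{X > \alpha\}P\{Y > \beta\} \bigr| \le C^*_{\eta,\varepsilon}\,\rho\,P\{X > \alpha\}^{2/(1+\eta)}.
\]
Indeed, with (10.1.11), and writing $\Psi = 1 - \Phi$ for the Gaussian tail function of (10.1.5),
\[
\int_{\alpha}^{\infty}\!\!\int_{\beta}^{\infty} \varphi(x, y, \rho)\,dx\,dy = \Psi(\alpha)\Psi(\beta) + \int_0^{\rho}\varphi(\alpha, \beta, z)\,dz. \tag{10.1.12}
\]
Since $\alpha^2 + \beta^2 - 2\rho\alpha\beta = (\alpha^2+\beta^2)(1-\rho) + \rho(\alpha^2+\beta^2-2\alpha\beta) \ge (\alpha^2+\beta^2)(1-\rho)$ if $\rho \ge 0$, we deduce
\[
\varphi(\alpha, \beta, \rho) \le \frac{1}{2\pi(1-\rho^2)^{1/2}}\exp\Bigl\{-\frac{\alpha^2+\beta^2}{2(1+\rho)}\Bigr\}.
\]
Assume that $0 \le \rho \le \varepsilon < 1$, then
\[
\int_0^{\rho}\varphi(\alpha, \beta, z)\,dz \le \frac{1}{2\pi(1-\varepsilon^2)^{1/2}}\,\rho\,e^{-\alpha^2/(1+\varepsilon)}.
\]
Using now estimate (10.1.3) we get for any $\eta \in\, ]\varepsilon, 1[$,
\[
e^{-\alpha^2/(1+\varepsilon)} \le C_{\eta,\varepsilon}\,P\{X > \alpha\}^{2/(1+\eta)} \qquad (\forall \alpha \ge 0),
\]
where $C_{\eta,\varepsilon}$ depends on $\eta, \varepsilon$ only. Consequently,
\[
\int_0^{\rho}\varphi(\alpha, \beta, z)\,dz \le C^*_{\eta,\varepsilon}\,\rho\,P\{X > \alpha\}^{2/(1+\eta)},
\]
where $C^*_{\eta,\varepsilon} = C_{\eta,\varepsilon}\,\frac{1}{2\pi(1-\varepsilon^2)^{1/2}}$. This shows the claimed estimate. The next correlation inequalities are very classical and can be proved similarly. We refer to Chung, Erdős and Sirao [1959: pp. 269–270] for instance.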

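10.1.2 Lemma. Let $(U, V)$ be jointly Gaussian centered random variables with $EU^2 = EV^2 = 1$, $EUV = \rho$ and let $\varepsilon > 0$.

(a) For any nonnegative reals $x$ and $y$ with $\rho xy \le \varepsilon$, $P\{U > x, V > y\} \le c(\varepsilon)P\{U > x\}P\{V > y\}$ where $\lim_{\varepsilon\to0} c(\varepsilon) = 1$.

(b) If $\rho \ge 0$, for any $a \ge 0$,
\[
P\{\inf(U, V) \ge a\} \le P\{U \ge a\}\,\Psi\Bigl(a\sqrt{\frac{1-\rho}{1+\rho}}\Bigr),
\]
where $\Psi = 1 - \Phi$ is the Gaussian tail function of (10.1.5).

(c) If $\rho \le 0$, then for any nonnegative reals $x$ and $y$, $P\{U > x, V > y\} \le P\{U > x\}P\{V > y\}$.

In the next lemma are other similar useful estimates.

10.1.3 Lemma. Let $(U, V)$ be jointly Gaussian centered random variables and let $x \ge 0$.

(a) If $\|U\|_2 \ge \|V\|_2$, then
\[
P\{U > x, V > x\} \le P\{U > x\}\exp\Bigl\{-\frac{1}{2}\Bigl(\frac{x\,\|U-V\|_2}{2\|U\|_2^2}\Bigr)^2\Bigr\}.
\]

(b) Assuming for some $0 < \alpha \le 1$ that $2\bigl|\|U\|_2^2 - \|V\|_2^2\bigr| \le (1-\alpha^2)\|U-V\|_2^2$, then
\[
P\{U > x, V > x\} \le \min\bigl(P\{U > x\}, P\{V > x\}\bigr)\exp\Bigl\{-\frac{1}{2}\Bigl(\frac{\alpha x\,\|U-V\|_2}{2\max(\|U\|_2^2, \|V\|_2^2)}\Bigr)^2\Bigr\}.
\]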

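Proof. Plainly $P\{U > x, V > x\} \le P\{U + V > 2x\} = \Psi\bigl(\frac{2x}{\|U+V\|_2}\bigr)$, where $\Psi = 1 - \Phi$ is the Gaussian tail function of (10.1.5). If we write $\bigl(\frac{2x}{\|U+V\|_2}\bigr)^2 = \frac{x^2}{\|U\|_2^2} + b^2$, then
\[
b^2 = x^2\Bigl(\frac{4}{\|U+V\|_2^2} - \frac{1}{\|U\|_2^2}\Bigr) = x^2\,\frac{4\|U\|_2^2 - \|U+V\|_2^2}{\|U\|_2^2\,\|U+V\|_2^2}.
\]
But $4\|U\|_2^2 - \|U+V\|_2^2 = 3\|U\|_2^2 - \|V\|_2^2 - 2\langle U, V\rangle = 2(\|U\|_2^2 - \|V\|_2^2) + \|U-V\|_2^2$. If $\|U\|_2^2 \ge \|V\|_2^2$, we get
\[
b^2 \ge \frac{x^2\,\|U-V\|_2^2}{4\|U\|_2^4},
\]
and consequently, by (10.1.6) and since $\Psi(x/\|U\|_2) = P\{U > x\}$,
\[
P\{U > x, V > x\} \le P\{U > x\}\exp\Bigl\{-\frac{1}{2}\,\frac{x^2\,\|U-V\|_2^2}{4\|U\|_2^4}\Bigr\}.
\]
Now if $\bigl|\|U\|_2^2 - \|V\|_2^2\bigr| \le \frac{1-\alpha^2}{2}\|U-V\|_2^2$, for some $0 < \alpha < 1$, then we have $4\|U\|_2^2 - \|U+V\|_2^2 \ge \alpha^2\|U-V\|_2^2$, and so
\[
b^2 \ge \frac{\alpha^2x^2\,\|U-V\|_2^2}{4\max(\|U\|_2^2, \|V\|_2^2)^2},
\]
which implies
\[
P\{U > x, V > x\} \le P\{U > x\}\exp\Bigl\{-\frac{1}{2}\Bigl(\frac{\alpha x\,\|U-V\|_2}{2\max(\|U\|_2^2, \|V\|_2^2)}\Bigr)^2\Bigr\}
\]
and also
\[
P\{U > x, V > x\} \le P\{V > x\}\exp\Bigl\{-\frac{1}{2}\Bigl(\frac{\alpha x\,\|U-V\|_2}{2\max(\|U\|_2^2, \|V\|_2^2)}\Bigr)^2\Bigr\}.
\]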

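Hence the lemma.

We conclude this part with an interesting lemma allowing us to express the correlation of Gaussian pairs in terms of a probability involving their signs.

10.1.4 Lemma. Let $(U, V)$ be jointly Gaussian centered random variables and let $\rho = E\bigl[\frac{U}{\|U\|_2}\cdot\frac{V}{\|V\|_2}\bigr]$. Then,
\[
P\{U \ge 0, V \ge 0\} - \frac{1}{4} = \frac{1}{2\pi}\arcsin\rho.
\]

Proof. Since the event $\{U \ge 0, V \ge 0\}$ is unchanged when $U$ and $V$ are multiplied by positive constants, we may assume $\|U\|_2 = \|V\|_2 = 1$. Let $Z$ be an $N(0,1)$ distributed random variable, which we assume to be independent of $U$. Then $(U, V)$ and $(U, \rho U + \sqrt{1-\rho^2}\,Z)$ have the same law. Put $H(\rho) = P\{U \ge 0, V \ge 0\}$. Assume $0 \le \rho \le 1$, then
\[
\begin{aligned}
H(\rho) &= \int_{-\infty}^{\infty} P\Bigl\{U > \sup\Bigl(0, -\frac{\sqrt{1-\rho^2}\,z}{\rho}\Bigr)\Bigr\}\,e^{-z^2/2}\,\frac{dz}{\sqrt{2\pi}} \\
&= \int_{-\infty}^{0} P\Bigl\{U > -\frac{\sqrt{1-\rho^2}\,z}{\rho}\Bigr\}\,e^{-z^2/2}\,\frac{dz}{\sqrt{2\pi}} + \frac{1}{2}\int_0^{\infty} e^{-z^2/2}\,\frac{dz}{\sqrt{2\pi}} \\
&= \int_0^{\infty}\Bigl[\int_{\theta\sqrt{1-\rho^2}/\rho}^{\infty} e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}}\Bigr]e^{-\theta^2/2}\,\frac{d\theta}{\sqrt{2\pi}} + \frac{1}{4}.
\end{aligned}
\]
Besides
\[
H'(\rho) = \int_0^{\infty}\frac{d}{d\rho}\Bigl[\int_{\theta\sqrt{1-\rho^2}/\rho}^{\infty} e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}}\Bigr]e^{-\theta^2/2}\,\frac{d\theta}{\sqrt{2\pi}}.
\]
As
\[
\frac{d}{d\rho}\Bigl[\int_{\theta\sqrt{1-\rho^2}/\rho}^{\infty} e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}}\Bigr] = \frac{\theta(1-\rho^2)^{-1/2}\rho^{-2}}{\sqrt{2\pi}}\,e^{-\theta^2(1-\rho^2)/2\rho^2},
\]
we thus have
\[
\begin{aligned}
H'(\rho) &= \frac{1}{2\pi\rho^2\sqrt{1-\rho^2}}\int_0^{\infty}\theta\,e^{-\theta^2(1-\rho^2)/2\rho^2}\,e^{-\theta^2/2}\,d\theta \\
&= \frac{1}{2\pi\rho^2\sqrt{1-\rho^2}}\int_0^{\infty}\theta\,e^{-\theta^2/(2\rho^2)}\,d\theta \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} = \frac{1}{2\pi}(\arcsin\rho)'.
\end{aligned} \tag{10.1.13}
\]
Since $H(0) = 1/4$, we get $H(\rho) - 1/4 = \int_0^{\rho}\frac{du}{2\pi\sqrt{1-u^2}} = \frac{1}{2\pi}\arcsin\rho$. Hence
\[
P\{U \ge 0, V \ge 0\} - \frac{1}{4} = \frac{1}{2\pi}\arcsin\rho.
\]
Now we observe that $H(\rho) = \int_0^{\infty} P\bigl\{0 < U < -\frac{\theta\sqrt{1-\rho^2}}{\rho}\bigr\}\,e^{-\theta^2/2}\,\frac{d\theta}{\sqrt{2\pi}}$ if $-1 \le \rho \le 0$. Further
\[
\begin{aligned}
H'(\rho) &= \int_0^{\infty}\frac{d}{d\rho}\Bigl[\int_0^{-\theta\sqrt{1-\rho^2}/\rho} e^{-x^2/2}\,\frac{dx}{\sqrt{2\pi}}\Bigr]e^{-\theta^2/2}\,\frac{d\theta}{\sqrt{2\pi}} \\
&= \frac{1}{2\pi\rho^2\sqrt{1-\rho^2}}\int_0^{\infty}\theta\,e^{-\theta^2(1-\rho^2)/2\rho^2}\,e^{-\theta^2/2}\,d\theta = \frac{1}{2\pi\sqrt{1-\rho^2}}.
\end{aligned} \tag{10.1.14}
\]
Hence $H(0) - H(\rho) = 1/4 - H(\rho) = \int_{\rho}^{0}\frac{du}{2\pi\sqrt{1-u^2}} = -\frac{1}{2\pi}\arcsin\rho$. Thereby
\[
P\{U \ge 0, V \ge 0\} - \frac{1}{4} = \frac{1}{2\pi}\arcsin\rho.
\]
From these facts, the lemma follows easily.

Gaussian vectors. A centered real random vector $X = (X_1, \ldots, X_N)$ is Gaussian if for any reals $a_1, \ldots, a_N$, the random variable $\sum_{i=1}^{N} a_iX_i$ is centered Gaussian. There exists an $N \times N$ nonnegative definite matrix $A$ such that for any $B \in \mathcal{B}(\mathbb{R}^N)$, with $x = (x_1, \ldots, x_N)$,
\[
P\{X \in B\} = \frac{1}{(2\pi)^{N/2}\sqrt{\det A}}\int_B e^{-\frac{1}{2}\,{}^t x A^{-1} x}\,dx_1 \ldots dx_N. \tag{10.1.15}
\]
It is always possible to diagonalize $X$ so that its distribution follows the canonical Gaussian law
\[
\gamma_N(x_1, \ldots, x_N) := (2\pi)^{-N/2}\,e^{-\frac{1}{2}(x_1^2+\cdots+x_N^2)}. \tag{10.1.16}
\]
Let indeed $\Gamma = (EX_iX_j)_{1\le i,j\le N} = A\,{}^t\!A$ be the covariance matrix of $X$. The law of $X$ is completely defined by $\Gamma$ and identical to the law of $A(Y_1, \ldots, Y_N)$.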
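The sign formula of Lemma 10.1.4 is easy to illustrate by simulation, using the representation $V = \rho U + \sqrt{1-\rho^2}\,Z$ from its proof. The following is a Monte Carlo sketch only; the sample size, seed and test correlations are arbitrary assumptions, and the agreement is statistical rather than exact.

```python
import math
import random


def quadrant_probability(rho, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P{U >= 0, V >= 0} for a standard Gaussian pair with correlation rho."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        u = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)      # independent of u
        v = rho * u + c * z          # (u, v) has correlation rho
        if u >= 0.0 and v >= 0.0:
            hits += 1
    return hits / n_samples


def arcsine_formula(rho):
    """Closed form 1/4 + arcsin(rho) / (2*pi) from Lemma 10.1.4."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


for rho in (-0.6, 0.0, 0.8):
    print(rho, round(quadrant_probability(rho), 3), round(arcsine_formula(rho), 3))
```

With 100,000 samples the standard error of each estimate is about 0.0015, so agreement to two decimal places is what one should expect.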

Rotational invariance. Gaussian laws possess a remarkable rotational invariance property, which is worth first presenting for pairs of random variables before switching to Gaussian vectors.

10.1.5 Lemma. If $X$ and $Y$ are independent and non-constant random variables and if $U = pX + qY$ and $V = aX - bY$ are independent, where $p$, $q$, $a$ and $b$ are all real and non-zero, then $X$ and $Y$ are normally distributed, and hence so are $U$ and $V$.

The case $p = q = a = b = 1$ is the well-known theorem of Bernstein [1941], who further assumed that $X$ and $Y$ have finite, equal variances and positive densities, and is also stated in Gelbaum [1985: Theorem 1], who was apparently unaware of Bernstein's result. But the quoted work of Gelbaum contains many other interesting aspects, which we shall mention later on. Bernstein's theorem was extended by Gnedenko [1948] who proved Lemma 10.1.5 without moment condition. The general form we stated is due to Quine and Seneta [1999: Theorem 2]. This remarkable property should, however, be rather attributed to Kac [1939]. In an early little known paper, Kac showed this: if $X$ and $Y$ are independent random variables and if for every $\vartheta$, the random variables $X\cos\vartheta + Y\sin\vartheta$ and $X\sin\vartheta - Y\cos\vartheta$ are independent, then $X$ and $Y$ are normally distributed. In fact, the assumption is used only for the values $\vartheta = \pi/4$ and $\vartheta = 3\pi/4$, which requires that $(X+Y)/\sqrt{2}$ and $(X-Y)/\sqrt{2}$ are independent, and also that $(-X+Y)/\sqrt{2}$ and $(X+Y)/\sqrt{2}$ are independent. This is verified once $X+Y$ and $X-Y$ are independent, since independence is not affected by scalar multiplication. Kac's paper precedes even Bernstein's, see in this regard the nice discussion in Quine and Seneta [1999: Section 3]. His proof, based on characteristic functions and the Cauchy method, is simple and elegant and extends to the finite-dimensional case as quoted at the end of the paper. We find it worth including here.

Kac's proof. We may assume $X$ and $Y$ symmetric; the general case indeed follows from a routine argument. Their characteristic functions are real. Let $A$, $B$ be the characteristic functions of $X$ and $Y$ respectively: $A(x) = E\,e^{ixX}$ and $B(y) = E\,e^{iyY}$. By assumption
\[
\begin{aligned}
E\,e^{ix(X+Y)+iy(X-Y)} &= E\,e^{ix(X+Y)}\,E\,e^{iy(X-Y)} = A(x)B(x)A(y)B(-y), \\
E\,e^{ix(-X+Y)+iy(X+Y)} &= E\,e^{ix(-X+Y)}\,E\,e^{iy(X+Y)} = A(-x)B(x)A(y)B(y).
\end{aligned}
\]
But
\[
\begin{aligned}
E\,e^{ix(X+Y)+iy(X-Y)} &= E\,e^{i(x+y)X}\,E\,e^{i(x-y)Y} = A(x+y)B(x-y), \\
E\,e^{ix(-X+Y)+iy(X+Y)} &= E\,e^{i(-x+y)X}\,E\,e^{i(x+y)Y} = A(-x+y)B(x+y).
\end{aligned}
\]
Comparing the two above equalities gives
\[
A(x+y)B(x-y) = A(x)A(y)B(x)B(y) = A(y-x)B(y+x),
\]
since $A(x) = A(-x)$, $B(x) = B(-x)$ by the symmetry assumption. By letting $x = y$, we get $A(2x) = B(2x)$. And so we arrive at the functional equation
\[
A(x+y)A(x-y) = A^2(x)A^2(y). \tag{10.1.17}
\]
In particular $A(2x) = A^4(x)$, so that, $A$ being real, $A(x) \ge 0$. Repeated application of $A(2x) = A^4(x)$ produces
\[
A(x/2^k) = [A(x)]^{1/4^k}.
\]
But $A$ is continuous and $A(x/2^k) \to 1$ as $k$ tends to infinity. This implies that $A(x) > 0$ for every $x$. The rest of the proof is based on the well-known method of Cauchy. Replacing successively $x$ by $2x, 3x, \ldots$ allows us to obtain for arbitrary integers $p$ and $q$,
\[
A(px/q) = [A(x)]^{p^2/q^2}.
\]
And since $A$ is continuous,
\[
A(x) = e^{kx^2}, \qquad (e^k = A(1)).
\]
As $0 < A(x) \le 1$ one has $k \le 0$.

A related result is the well-known Darmois–Skitovič theorem (see Darmois [1953] and Skitovič [1953], see also King and Lukacs [1954]) which reads as follows.

10.1.6 Lemma. Let $X_1, \ldots, X_n$ be mutually independent random variables. Then
\[
U = \sum_{j=1}^{n} a_jX_j \quad\text{and}\quad V = \sum_{j=1}^{n} b_jX_j
\]
are independent if and only if each $X_j$ with a non-zero coefficient in both sums is normally distributed and $\sum_{j=1}^{n} a_jb_j\,\mathrm{Var}(X_j) = 0$.

The proof depends on forming differences of the logarithmic characteristic functions and applying a theorem of Marcinkiewicz. There is a recent formulation of this result (Quine and Seneta [1999: Theorem 1]), close to Lemma 10.1.5.

10.1.7 Lemma. If $X_1, \ldots, X_n$ are independent and non-constant random variables and if
\[
U = \sum_{j=1}^{n} X_j \quad\text{and}\quad V = \sum_{j=1}^{n} b_jX_j
\]
are independent, where the numbers $b_1, \ldots, b_n$ are all distinct and nonzero, then $X_1, \ldots, X_n$ are normally distributed.
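The outcome of Kac's proof can be checked in the reverse direction: a function of the form $A(x) = e^{kx^2}$ with $k \le 0$ does satisfy the functional equation $A(x+y)A(x-y) = A^2(x)A^2(y)$ and the scaling rule $A(px/q) = [A(x)]^{p^2/q^2}$. The sketch below does this numerically; the default $k = -1/2$ (the $N(0,1)$ characteristic function) and the test points are illustrative assumptions.

```python
import math


def gaussian_cf(x, k=-0.5):
    """A(x) = exp(k x^2); k = -sigma^2/2 gives the characteristic function of N(0, sigma)."""
    return math.exp(k * x * x)


def functional_equation_gap(x, y, k=-0.5):
    """|A(x+y) A(x-y) - A(x)^2 A(y)^2|, which should vanish up to rounding."""
    lhs = gaussian_cf(x + y, k) * gaussian_cf(x - y, k)
    rhs = gaussian_cf(x, k) ** 2 * gaussian_cf(y, k) ** 2
    return abs(lhs - rhs)


def scaling_gap(x, p, q, k=-0.5):
    """|A(p x / q) - A(x)^(p^2/q^2)| for integers p, q with q != 0."""
    return abs(gaussian_cf(p * x / q, k) - gaussian_cf(x, k) ** (p * p / (q * q)))


print(functional_equation_gap(0.7, 1.3) < 1e-12)
print(scaling_gap(0.7, 3, 4) < 1e-12)
```

Both identities reduce to the algebraic fact $(x+y)^2 + (x-y)^2 = 2x^2 + 2y^2$ and to the homogeneity of $x \mapsto x^2$.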

The Darmois–Skitovič theorem has an extension for Banach-valued random variables, thus completing the previous description (see Krakowiak [1985]). It was observed long ago that under the kind of assumptions made in the above lemmas, direct computations imply that $X_1, \ldots, X_n$ have moments of any order. We may refer to Lancaster [1960] for instance.

Now let $(Y_1, \ldots, Y_N)$ be a Gaussian vector. The rotational invariance property can be described as follows. If $U$ is an orthogonal matrix on $\mathbb{R}^N$, then $U(Y_1, \ldots, Y_N)$ has law $\gamma_N$ (defined in (10.1.16)). Consequently, for any sequence of reals $a_1, \ldots, a_N$, the random variable $\sum_{i=1}^{N} a_iY_i$ follows the same law as $Y_1\bigl(\sum_{i=1}^{N} a_i^2\bigr)^{1/2}$. And thus for any $0 < p < \infty$,
\[
\Bigl\|\sum_{i=1}^{N} a_iY_i\Bigr\|_p = \|Y_1\|_p\Bigl(\sum_{i=1}^{N} a_i^2\Bigr)^{1/2}. \tag{10.1.18}
\]
Another way to describe this property is the following: let $X$ be a Gaussian vector in $\mathbb{R}^N$, and let $Y$ be an independent copy of $X$. Then for any $\eta$, the vector obtained from $(X, Y)$ by a rotation of angle $\eta$,
\[
(X\sin\eta + Y\cos\eta,\ X\cos\eta - Y\sin\eta), \tag{10.1.19}
\]
has the same law as $(X, Y)$. It suffices, indeed, to compare their covariance matrices.

Having defined and commented on this important property, we now continue with other classical Gaussian correlation estimates. The following lemma has self-evident practical interest: combined with Lemma 10.1.4, it allows us to characterize (Maruyama's result) mixing properties of Gaussian dynamical systems, see Section 3.3.6.

10.1.8 Lemma. Let $X = (X_1, \ldots, X_N)$ be a Gaussian centered vector such that $EX_n^2 = 1$ for $1 \le n \le N$ and let $r(n, m) = EX_nX_m$ be its covariance function. Let $A$ be a partition of $\{1, \ldots, N\}$ and denote by $\sigma$ a generic element of $A$. Let $x = (x_1, \ldots, x_N)$ and $y = (y_1, \ldots, y_N)$ with distinct coordinates, be such that $-\infty < x_n < y_n < +\infty$, for $1 \le n \le N$. Denote also by $I_n$ the interval $(x_n, y_n)$, and put for each $\sigma \in A$,
\[
V_\sigma = \prod_{n\in\sigma} I_n, \qquad V = \prod_{\sigma\in A} V_\sigma, \qquad X(\sigma) = (X_n, n \in \sigma).
\]
Then there exists a constant $C_V$ depending on $V$ only, such that
\[
\Bigl| P\{X \in V\} - \prod_{\sigma\in A} P\{X(\sigma) \in V_\sigma\} \Bigr| \le C_V\sum_{\sigma\ne\sigma'}\ \sum_{n\in\sigma}\ \sum_{m\in\sigma'} |EX_nX_m|.
\]

be a GaussProof. Let 1 denote the covariance matrix of X. For each σ ∈ A, let X(σ ) ian vector having the same law as X(σ ) and such that the X(σ ) are mutually independent. , σ ∈ A). Let = (X(σ )

(1) Assume that 1 is invertible and write (λ) = λ 1 + (1 − λ) 0 , for λ ∈ [0, 1]. Then (λ) is invertible. Put 1 − 21 t u (λ)−1 u , F (λ) = gλ (u)du. (10.1.20) gλ (u) = e √ (2π )n/2 det (λ) V Then F (λ) has a derivative which may be evaluated as $ % ∂ 1 ∂ (λ) ∂ 2

∂ . 2 gλ (u) . F (λ) = (gλ (u)) du where (gλ (u)) = tr ∂λ 2 ∂λ ∂u V ∂λ (10.1.21) But ∂ (λ) r(α, β) if α ∈ σ, β ∈ σ , σ = σ ; = ∂λ 0 otherwise. Consequently ∂ 1 ∂2 r(α, β) (gλ (u)) = (gλ (u)) . ∂λ 2 ∂uα ∂uβ α∈σ σ =σ

Thus

(10.1.22)

β∈σ

1 ∂2 r(α, β) F (λ) = (gλ (u)) du. 2 V ∂uα ∂uβ α∈σ

σ =σ

β∈σ

And so ( P{X(σ ) ∈ Vσ } P{X ∈ V } − =

0

σ ∈A

1

1 1 ∂2 F (λ)dλ ≤ |r(α, β)| · (gλ (u)) dudλ. 2 0 V ∂uα ∂uβ α∈σ

σ =σ

β∈σ

(10.1.23) Put α,β g(u) = g(u1 , . . . , yα , . . . , yβ , . . . ) − g(u1 , . . . , xα , . . . , yβ , . . . ) − g(u1 , . . . , yα , . . . , xβ , . . . ) + g(u1 , . . . , yα , . . . , xβ , . . . ). Then

V

∂2 (gλ (u)) du ∂uα ∂uβ y1 y2 yβ yα ∂2 du1 du2 . . . duα = (gλ (u)) duβ . x1 x2 xα xβ ∂uα ∂uβ du1 . . . dun = x ≤u ≤y α,β g(u) j j j duα duβ j =α,j =β ∗ du1 . . . dun ≤ |α,β g(u)| ≤ (x, y, λr(α, β)), duα duβ Rn−2 (x,y)

(10.1.24)

where the above sum runs over the set {(yα , yβ ), (xα , yβ ), (yα , xβ ), (yα , xβ )}. Thereby 1 ( P{X(σ ) ∈ Vσ } = F (λ)dλ ≤ CV |r(α, β)|, P{X ∈ V } − 0

σ ∈A

σ =σ α∈σ β∈σ

with CV = 4 max (x, y, ρ) : |ρ| ≤ 1, x, y ∈ {xα , yα , α ∈ A} . As for x = y, sup (x, y, ρ) < ∞,

−1≤ρ≤1

and it follows by assumption that CV is finite. (2) If 1 is not invertible, let be a Gaussian vector in RN with i.i.d. N (0, 1) distributed components; and put for u real, u = 0, Xu = X + uN,

u = + uN.

The covariance matrices are then invertible, and the first step of the proof shows that the conclusion of the lemma is verified by $X_u$. Further Xu (α, β) = r(α, β) + u2 . We then observe that it suffices to let $u$ tend to 0 for concluding identically for $X$.

Finally we quote a remarkable decoupling inequality due to Klein–Landau–Shucker [1982: Theorem 1]. For a proof we refer to the original paper.

10.1.9 Lemma. Let $T = \{T_k, k \ge 1\}$ be a stationary, centered Gaussian sequence with finite decoupling coefficient $p(T)$, that is:
\[
p(T) := \sum_{k=1}^{\infty}\frac{|E\,T_1T_k|}{E\,T_1^2} < \infty.
\]
Let $\{f_k, k \ge 1\}$ be a sequence of complex-valued Borel-measurable functions. Then, for each finite subset $J$ of $\mathbb{N}$,
\[
\Bigl| E\prod_{j\in J} f_j(T_j) \Bigr| \le \prod_{j\in J}\|f_j(T_1)\|_{p(T)}. \tag{10.1.25}
\]

Gaussian processes. A family $X = \{X_t, t \in T\}$ of random variables with common basic probability space $(\Omega, \mathcal{A}, P)$ is a centered Gaussian process if any finite linear combination
\[ \sum_{k=1}^{n} a_k X_{t_k} \]
with $a_k$ reals and $t_k \in T$ is a centered real Gaussian random variable. The law of the Gaussian process $X$ is completely determined by its covariance function $\Gamma(s,t) = E X_s X_t$, $s, t \in T$. A more abstract way to define Gaussian processes usually goes as follows. Let $H$ be a Hilbert space; a Gaussian process is a (linear) isometry $T : H \to L^2(P)$ such that:

(i) For any two orthogonal elements $x, y \in H$, $T(x)$ and $T(y)$ are independent.
(ii) For any $x \in H$, $T(x)$ is centered normally distributed with $E\, T(x)^2 = \|x\|^2$.

We see from Lemma 10.1.5 that the second requirement is redundant. Indeed, if $x$ and $y$ are orthogonal with $\|x\| = \|y\|$, so are $x + y$ and $x - y$; whence $T(x)$ and $T(y)$ are normally distributed. The other requirement $E\, T(x)^2 = \|x\|^2$ is implied by the fact that $T$ is an isometry. We therefore have another simpler definition:

10.1.10 Definition. A Gaussian process is a linear isometry $T : H \to L^2(P)$ such that if $x, y \in H$ are orthogonal, then $T(x)$ and $T(y)$ are centered independent.

The comparison between the two definitions is easy. Let $X = \{X_t, t \in T\}$ be a centered Gaussian process with basic probability space $(\Omega, \mathcal{A}, P)$. Let $H = \mathrm{span}\{X_t\}$ and $T$ be the identity operator. Clearly $X$ is the restriction of $T$ to some subset of $H$. As the law of these random variables is determined by their finite margins, the rotational invariance properties stated before extend to these variables. Thus if $X$ is a Gaussian process, or a Gaussian random variable with value in a Banach space (see Definition 10.1.11), and if $X_1, \dots, X_N$ are independent copies of $X$, then for any sequence $a_1, \dots, a_N$ of reals, $\sum_{i=1}^{N} a_i X_i$ has the same law as $X_1 \big(\sum_{i=1}^{N} a_i^2\big)^{1/2}$.

The finitely additive Gaussian cylinder measure on $H$ induced by $T$ is not extendable to a countably additive measure on the $\sigma$-algebra $\mathcal{B}_T(H)$ generated by the cylinders of $H$ if $H$ is infinite-dimensional. The following remarkable example is quoted in Gelbaum [1985]. If $S$ is a domain in $\mathbb{R}^2$ and if the two-dimensional Lebesgue measure of $S$ is finite, say equal to 1, let $H$ be the set of $\mathbb{R}$-valued square integrable harmonic functions on $S$ and finally let $T$ be any endomorphism of $H$. Then by a result of Hemasinha [1983], $T$ induces a countably additive measure on $\mathcal{B}_T(H)$. Therefore no such $T$ can satisfy either of the requirements (i) and (ii) above.

Let $H$ be a Hilbert space. The canonical Gaussian process $Z = \{Z_h, h \in H\}$ on $H$ is the Gaussian centered process with covariance function given by $\Gamma(h, h') = \langle h, h' \rangle$, for any $h, h' \in H$. By Zorn's lemma, any Hilbert space admits an orthonormal basis, although not necessarily a countable one. Assume that $H$ admits a countable orthonormal basis $\{h_n, n \ge 1\}$, which is realized if and only if $H$ is separable. Let also $\gamma = \{g_n, n \ge 1\}$ be a sequence of i.i.d. $N(0,1)$ distributed random variables with basic probability space $(\Omega, \mathcal{A}, P)$. Then $Z$ can be defined as follows: for any $h \in H$,
\[ Z_h = \sum_{n=1}^{\infty} g_n \langle h, h_n \rangle. \tag{10.1.26} \]
We easily verify that $E\, Z_h Z_{h'} = \langle h, h' \rangle$ for any $h, h' \in H$. Any centered Gaussian process $X = \{X_t, t \in T\}$ can be represented as the restriction of the canonical Gaussian process to some suitable subset $B$ of $H$. Let indeed $H = L^2(\Omega, \mathcal{A}, P)$, and consider


the restriction of $Z$ on $H$ to $B = \{X_t, t \in T\}$. Then $X$ and $Z_B = \{Z_b, b \in B\}$ have the same laws since their covariance functions are identical by construction.

We shall introduce the notions of Gaussian measure and of Gauss space and review some of their important properties. For the proofs and for more about these spaces, we refer to the original works of Borell [1975–77] (see also Gross [1967]), which are clearly written and accessible. There are other remarkable sources: for instance Ledoux and Talagrand [1991], Lifshits [1995], Talagrand [2005] and the work of Ehrhard [1983], [1984a], [1984b].

Gauss spaces. Let $E$ denote a locally convex Hausdorff space over the field of real numbers.

10.1.11 Definition. A Radon probability measure $\mu$ on $E$ is said to be a (centered) Gaussian Radon measure on $E$ if the image measure $\xi(\mu)$ is a (centered) Gaussian Radon measure on $\mathbb{R}$ for every $\xi$ belonging to the topological dual $E'$ of $E$. The pair $(E, \mu)$ is called a Gauss space. A random variable $X$ with value in $E$ is Gaussian if $f(X)$ is a real Gaussian for any $f \in E'$.

Equivalently, a Radon probability measure $\mu$ on $E$ is said to be centered Gaussian if for independent random variables $X$, $Y$ with common law $\mu$, $X + Y$ and $X - Y$ are independent and have the same distribution. The class of all (centered) Gaussian Radon measures on $E$ is denoted by $G(E)$ (resp. $G_0(E)$). Every $\mu \in G(E)$ has barycenter $b \in E$. Setting $\mu_0 = \mu(\cdot + b)$, we also denote by $E_2'(\mu)$ the closure of $E'$ in $L^2(\mu)$. If $(H, \|\cdot\|)$ is a Hilbert space, the canonical cylinder measure on $H$ is denoted by $\gamma_H$. The Fourier transform $\hat\gamma_H(x)$ of $\gamma_H$ equals $e^{-\|x\|^2/2}$. In the theorem below (Borell [1975: Theorem 2.1]), we list a few basic properties of Gauss spaces.

10.1.12 Theorem. Suppose $\mu \in G(E)$. Then
a) $\mu$ has barycenter $b \in E$,
b) every measure $\xi \mu_0$, $\xi \in E_2'(\mu)$, has barycenter $\Lambda\xi \in E$. The map $\Lambda : E_2'(\mu) \to E$ is linear and injective. We define
\[ H(\mu) = \mathrm{range}(\Lambda), \qquad \tilde h = \Lambda^{-1} h, \ h \in H(\mu), \quad \text{and} \quad \|h\|^2 = \mu(\tilde h^2). \]
Then,
c) $(H(\mu), \|\cdot\|)$ is a Hilbert space and the canonical injection $\theta$ of $(H(\mu), \|\cdot\|)$ into $E$ is weakly continuous. Furthermore $\theta(\gamma_{H(\mu)}) = \mu_0$.

Let $\mu \in G(E)$ and write $\mu_x(\cdot) = \mu_0(\cdot - x)$, $x \in E$. As a corollary we get
\[ \mu_h = e^{(\tilde h - \|h\|^2/2)} \cdot \mu_0, \qquad h \in H(\mu). \tag{10.1.27} \]
The Hilbert space $H(\mu)$ introduced in Theorem 10.1.12 is called the reproducing kernel Hilbert space (RKHS) of $\mu$. Borell proved (see Theorem 7.1 in the aforementioned

paper) that
\[ H(\mu) \ \text{is separable.} \tag{10.1.28} \]
We define
\[ O(\mu) = \{h \in H(\mu) : \|h\| \le 1\}, \qquad \mu \in G(E). \tag{10.1.29} \]
Then $O(\mu)$ is a compact subset of $E$ and we have the important relation
\[ \mu_0(\xi^2) = \max_{O(\mu)} \xi^2, \qquad \xi \in E'. \tag{10.1.30} \]

10.2 0-1 laws, integrability and comparison lemmas

0-1 laws. The rotational invariance of Gaussian laws has an important consequence: a general 0-1 law (Fernique [1975: Theorem 1.2.1]) which can be stated as follows.

10.2.1 Proposition. Let $(E, \mathcal{E})$ be a measurable vector space. Let $(\Omega, \mathcal{B}, P)$ be a probability space. Consider a Gaussian vector $X : (\Omega, \mathcal{B}, P) \to (E, \mathcal{E})$. Then for any subspace $V$ of $E$, we have $P\{X \in V\} = 0$ or $1$.

Proof. It is rather immediate. Let $Y$ be an independent copy of $X$ and put
\[ B_\vartheta = \{X \cos\vartheta + Y \sin\vartheta \in V,\ X \sin\vartheta - Y \cos\vartheta \notin V\}. \]
Let $\vartheta_1 \ne \vartheta_2$ and assume that $X\cos\vartheta_1 + Y\sin\vartheta_1 \in V$ and $X\cos\vartheta_2 + Y\sin\vartheta_2 \in V$. The determinant of the $(2,2)$ matrix
\[ \begin{pmatrix} \cos\vartheta_1 & \sin\vartheta_1 \\ \cos\vartheta_2 & \sin\vartheta_2 \end{pmatrix} \]
being non-zero, it follows that $X$ and $Y$ belong to $V$ as well, and so do $X\sin\vartheta_1 - Y\cos\vartheta_1$ and $X\sin\vartheta_2 - Y\cos\vartheta_2$. Thus the sets $B_\vartheta$ are disjoint. Since they have the same probability, this one must be 0. In other words it follows that $P(B_0) = P\{X \in V\}(1 - P\{X \in V\}) = 0$, as claimed.

Integrability. Let $N : (E, \mathcal{E}) \to (\mathbb{R}_+, \mathcal{B}(\mathbb{R}_+))$ be a measurable semi-norm on $E$. A plain but useful consequence of the 0-1 law is that
\[ P\{N(X) < \infty\} = 0 \ \text{or} \ 1. \tag{10.2.1} \]

When N is the usual sup-norm, say N(X) = supn≥1 |Xn |, if X = {Xn , n ≥ 1}, this fact has been known for a long time, according to the discussion and related references (starting in 1951) given in the introduction of Landau and Shepp [1970]. When P{supn≥1 |Xn | < ∞} = 1, the possible exponential integrability of the supremum of X was conjectured by Varadhan in 1967, and proved by Landau and Shepp in the above quoted paper, and independently by Fernique [1970] for general seminorms. We shall indeed establish, as a direct consequence of the rotational invariance of Gaussian laws, that if P{N(X) < ∞} > 0, then N (X) is exponentially integrable.

10.2.2 Theorem. Let $(E, \mathcal{E})$ be a measurable vector space. Let $(\Omega, \mathcal{B}, P)$ be a probability space. Consider a Gaussian vector $X : (\Omega, \mathcal{B}, P) \to (E, \mathcal{E})$. Let $N : (E, \mathcal{E}) \to \mathbb{R}_+$ be a measurable semi-norm on $E$ and assume that $P\{N(X) < \infty\} > 0$. Then $E\, N(X) < \infty$ and in fact there exists an absolute constant $K$ such that
\[ E \exp\Big( \frac{N(X)^2}{K\,(E\, N(X))^2} \Big) \le 2. \tag{10.2.2} \]

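The proof is elementary but has some degree of elegance.

Proof. Let $Y$ be an independent copy of $X$ and let $0 < u < v$. By rotational invariance, $(X-Y)/\sqrt2$ and $(X+Y)/\sqrt2$ are independent copies of $X$, so that
\[ P\{N(X) \le u\}\, P\{N(X) > v\} = P\Big\{ N\Big(\frac{X-Y}{\sqrt2}\Big) \le u,\ N\Big(\frac{X+Y}{\sqrt2}\Big) > v \Big\} \le P\Big\{ N(X) > \frac{v-u}{\sqrt2} \Big\}^2, \tag{10.2.3} \]
where we used the fact that $N\big(\frac{X+Y}{\sqrt2}\big) \le N\big(\frac{X-Y}{\sqrt2}\big) + \sqrt2\, \inf(N(X), N(Y))$. Let $\tau > 0$ be fixed. Choose $s$ such that
\[ \delta := P\{N(X) \le s\} > 1/2 \quad \text{and} \quad \log\frac{\delta}{1-\delta} \ge \tau. \]
Put
\[ t_n = (\sqrt2 + 1)\big(2^{(n+1)/2} - 1\big)\, s, \qquad n = 0, 1, \dots \]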

Then tn+1 − s = tn and from (10.2.3) applied with u = s and v = tn+1 we get P N(X) ≤ s}P N (X) > tn+1 ≤ 2P2 N (X) > tn . Letting xn = P N (X) > tn /P N(X) ≤ s}, the latter inequality means xn+1 ≤ xn2 . n Iterating this inequality leads to xn+1 ≤ x02 ; and so for n = 0, 1, . . . , n 1−δ 2 n P N(X) > 2 · 2n/2 s ≤ P N (X) > tn+1 ≤ δ = δe−2 τ . δ n n Let c = τ/2. Then P exp sc2 N(X)2 > e2 c = P N (X) > 2n/2 s ≤ e−2 τ and thus ∞ ∞ n c n n 2 2n c 2 c 2n−1 c e −e P exp 2 N(X) > e e−2 τ e2 c < ∞. ≤ s n=0

n=0

This establishes that
\[ E \exp\Big( \frac{\tau N(X)^2}{2 s^2} \Big) \le C, \]

where $C = C(\tau)$ depends on $\tau$ only. We fix $\tau$, say $\tau = \log 2$, so that $C$ is now an absolute constant. By Jensen's inequality, we may find a real $0 < \eta < 1$ small enough for the following inequality to be true:
\[ E \exp\Big( \eta\, \frac{\tau N(X)^2}{2s^2} \Big) \le \Big( E \exp\Big( \frac{\tau N(X)^2}{2s^2} \Big) \Big)^{\eta} \le C^{\eta} = e^{\eta \log C} \le 2. \tag{10.2.4} \]
This notably implies that $E\, N(X) < \infty$. Now observe that the reasoning we just made is valid for any Gaussian vector $X$ with value in $E$ and satisfying $P\{N(X) < \infty\} > 0$. This is in particular the case of $X' = X / E\, N(X)$. But $\delta = P\{N(X') \le 3\} = P\{N(X) \le 3\, E\, N(X)\} \ge 2/3$ and
\[ \log \frac{\delta}{1-\delta} \ge \log 2 = \tau, \]
so that $s = 3$ is suitable there. Application of (10.2.4) to $X'$ yields
\[ E \exp\Big( \Big( \frac{N(X)}{K\, E\, N(X)} \Big)^2 \Big) \le 2, \tag{10.2.5} \]

where $K = (18/\eta\tau)^{1/2}$, as claimed.

Comparison lemmas. This is, after the rotational invariance of Gaussian laws, the second fundamental property of Gaussian processes ([Fernique: 1975]).

10.2.3 Lemma. Let $T$ be a finite set and consider two Gaussian (centered) processes $X = \{X_t, t \in T\}$ and $Y = \{Y_t, t \in T\}$. Assume that for any $s, t \in T$,
\[ d_Y(s,t) \le d_X(s,t). \tag{10.2.6} \]
Then for any convex increasing function $f : \mathbb{R} \to \mathbb{R}_+$,
\[ E f\Big( \sup_{T \times T} (Y_s - Y_t) \Big) \le E f\Big( \sup_{T \times T} (X_s - X_t) \Big). \tag{10.2.7} \]
In particular
\[ E \sup_{t \in T} Y_t \le E \sup_{t \in T} X_t. \tag{10.2.8} \]

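Proof. Let $n = \#(T)$. It suffices to prove the lemma when $f$ is a smooth twice differentiable convex function, since any convex increasing function is the upper convex hull of such functions. Let $\Gamma_X$, $\Gamma_Y$ denote the covariance matrices of $X$, $Y$ respectively.

(1) Assume first that $\Gamma_X$, $\Gamma_Y$ are invertible and write $\Gamma(\lambda) = \lambda \Gamma_X + (1-\lambda)\Gamma_Y$, for $\lambda \in [0,1]$. Then $\Gamma(\lambda)$ is invertible. Put for $x = (x_t)_{t \in T} \in \mathbb{R}^T$,
\[ g_\lambda(x) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Gamma(\lambda)}}\, e^{-\frac12\, {}^t x\, \Gamma(\lambda)^{-1} x}, \qquad H(\lambda) = \int_{\mathbb{R}^T} f\Big( \sup_{s,t \in T} (x_s - x_t) \Big)\, g_\lambda(x)\, dx. \]
Arguing along the lines of (10.1.20) to (10.1.22), we find that $H(\lambda)$ has a derivative and
\[ H'(\lambda) = \int_{\mathbb{R}^T} f\Big( \sup_{s,t \in T} (x_s - x_t) \Big)\, \frac{\partial}{\partial\lambda} g_\lambda(x)\, dx, \qquad \frac{\partial}{\partial\lambda} g_\lambda(x) = \frac12\, \mathrm{tr}\Big( \frac{\partial \Gamma(\lambda)}{\partial \lambda}\, \frac{\partial^2}{\partial x^2}\, g_\lambda(x) \Big). \]
Developing more the expression of $H'(\lambda)$ leads to
\[ H'(\lambda) = \sum_{s \in T} \sum_{\substack{t \in T \\ t \ne s}} \frac{\partial}{\partial\lambda} \big[ \Gamma_\lambda(s,s) - 2\Gamma_\lambda(s,t) + \Gamma_\lambda(t,t) \big]\, J(s,t) = \sum_{s \in T} \sum_{\substack{t \in T \\ t \ne s}} \big[ d_X^2(s,t) - d_Y^2(s,t) \big]\, J(s,t), \]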

where $J(s,t)$ are positive integrals. It follows that $H'(\lambda) \ge 0$, and so $H(1) \ge H(0)$, which establishes (10.2.7).

(2) If $\Gamma_X$ or $\Gamma_Y$ is not invertible, we proceed as in the second part of the proof of Lemma 10.1.8.

An immediate consequence of this lemma is the well-known Sudakov's minoration: There exists a universal constant $K$ such that
\[ E \sup_{t \in T} X(t) \ge K \inf_{\substack{s,t \in T \\ s \ne t}} d_X(s,t)\, \sqrt{\log \#(T)}. \tag{10.2.9} \]

Proof. It suffices to prove (10.2.9) when $T$ is finite, say $T = \{1, \dots, N\}$. Let $\lambda_j$, $1 \le j \le N$, be independent $N(0,1)$ distributed random variables and write $\rho = \inf_{1 \le i \ne j \le N} \|X_i - X_j\|_2$. Put
\[ Y_j = \frac{\rho}{\sqrt2}\, \lambda_j, \qquad 1 \le j \le N. \]
By construction $\|X_i - X_j\|_2 \ge \|Y_i - Y_j\|_2$ for all $i$ and $j$, so that condition (10.2.6) is satisfied. And by Lemma 10.2.3,
\[ E \sup_{1 \le j \le N} X_j \ge \frac{\rho}{\sqrt2}\, E \sup_{1 \le j \le N} \lambda_j. \]
Using the symmetry of the Gaussian laws, we have that $E \sup_{1 \le i,j \le N} |\lambda_i - \lambda_j| = E \sup_{j} \lambda_j + E \sup_{j} (-\lambda_j) = 2 E \sup_{j} \lambda_j$. Now
\[ E \sup_{1 \le j \le N} \lambda_j = \tfrac12 E \sup_{1 \le i,j \le N} |\lambda_i - \lambda_j| \ge \tfrac12 \Big( E \sup_{1 \le j \le N} |\lambda_j| - E |\lambda_1| \Big) = \tfrac12 \Big( E \sup_{1 \le j \le N} |\lambda_j| - \Big(\frac{2}{\pi}\Big)^{1/2} \Big). \]
But for any $u > 0$,
\[ E \sup_{1 \le j \le N} |\lambda_j| \ge u\, P\Big\{ \sup_{1 \le j \le N} |\lambda_j| > u \Big\} = u \big( 1 - P\{|\lambda_1| \le u\}^N \big) = u \big( 1 - e^{N \log(1 - P\{|\lambda_1| > u\})} \big) \ge u \big( 1 - e^{-N P\{|\lambda_1| > u\}} \big). \]
Choosing $u = \sqrt{\log N}$ gives $N P\{|\lambda_1| > u\} \ge c$ for all $N \ge 2$, where $c > 0$ is an absolute constant; and so $E \sup_{1 \le j \le N} |\lambda_j| \ge C \sqrt{\log N}$. The result follows easily.

Error term in Slepian's comparison lemma. It is also possible to bound the difference between the terms in (10.2.8). Put
\[ \gamma^2 = \sup_{s,t \in T} |d_X^2(s,t) - d_Y^2(s,t)|. \]

Then there exists a universal constant $C$ such that
\[ \Big| E \sup_{t \in T} X_t - E \sup_{t \in T} Y_t \Big| \le C \gamma \sqrt{\log \#(T)}. \tag{10.2.10} \]
This follows from a simple application of the previous lemma. Let $N = \{N_t, t \in T\}$ where the components $N_t$ are independent and $N(0,1)$ distributed, and assume that $N$, $X$ and $Y$ are mutually independent. Put
\[ Z = \frac{\gamma}{\sqrt2}\, N + Y. \]
Then for $s \ne t$,
\[ d_Z^2(s,t) := \gamma^2 + d_Y^2(s,t) \ge \sup_{u,v \in T} |d_X^2(u,v) - d_Y^2(u,v)| + d_Y^2(s,t) \ge d_X^2(s,t). \]
By Lemma 10.2.3,
\[ E \sup_{t \in T} X_t \le E \sup_{t \in T} Z_t = E \sup_{t \in T} \Big( \frac{\gamma}{\sqrt2}\, N_t + Y_t \Big) \le \frac{\gamma}{\sqrt2}\, E \sup_{t \in T} N_t + E \sup_{t \in T} Y_t. \]
Considering now $Z' = \frac{\gamma}{\sqrt2} N + X$, we obtain similarly
\[ E \sup_{t \in T} Y_t \le \frac{\gamma}{\sqrt2}\, E \sup_{t \in T} N_t + E \sup_{t \in T} X_t. \]
Therefore
\[ \Big| E \sup_{t \in T} X_t - E \sup_{t \in T} Y_t \Big| \le \frac{\gamma}{\sqrt2}\, E \sup_{t \in T} N_t \le C \gamma (\log \#(T))^{1/2}, \]
as claimed. This inequality was observed by Chatterjee [2005] who proved it by different arguments and considered also the non-centered case.
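Both Sudakov's minoration and the error bound (10.2.10) rest on the fact that the expected maximum of $N$ independent $N(0,1)$ variables grows like $\sqrt{\log N}$. The following minimal Python sketch estimates this expectation by Monte Carlo. The function name `mean_max_gaussian`, the number of trials and the seed are illustrative choices, not part of the text.

```python
import math
import random


def mean_max_gaussian(n, trials=2000, seed=0):
    """Monte Carlo estimate of E max_{j <= n} lambda_j for i.i.d. N(0,1) lambda_j.

    The number of trials and the seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n))
    return total / trials


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        est = mean_max_gaussian(n)
        # The ratio est / sqrt(log n) should stay between fixed positive constants.
        print(n, round(est, 3), round(est / math.sqrt(math.log(n)), 3))
```

The printed ratio should remain bounded away from 0 and below $\sqrt2$, in line with the classical upper bound $E \sup_{j \le N} \lambda_j \le \sqrt{2 \log N}$.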

Talagrand's strengthening. A fundamental observation made by Talagrand is that the Sudakov minoration we just considered is only a piece of a stronger minoration inequality, which actually leads, when combined with a chaining argument, to the proof of the majorizing measure conjecture. Let $\{X_t, t \in T\}$ be a Gaussian process and denote $d(s,t) = \|X_s - X_t\|_2$. Consider points $\{t_\ell, 1 \le \ell \le m\}$ of $T$ such that $d(t_\ell, t_k) \ge a$ if $\ell \ne k$. Let $\sigma > 0$ and attach to each $\ell$, $1 \le \ell \le m$, a finite set $H_\ell \subset B_d(t_\ell, \sigma)$. Let
\[ H = \bigcup_{\ell=1}^{m} H_\ell. \]
Then we have
\[ E \sup_{t \in H} X_t \ge \frac{a}{C_1} \sqrt{\log m} - C_2\, \sigma \sqrt{\log m} + \min_{1 \le \ell \le m} E \sup_{t \in H_\ell} X_t. \tag{10.2.11} \]
In particular, if $\sigma \le a/(2 C_1 C_2)$,
\[ E \sup_{t \in H} X_t \ge \frac{a}{2 C_1} \sqrt{\log m} + \min_{1 \le \ell \le m} E \sup_{t \in H_\ell} X_t. \tag{10.2.12} \]
The proof we shall give is taken from [Talagrand: 2005] (see p. 34), to which we refer the reader for more about Gaussian processes, Rademacher processes and majorizing measures. There is no loss to assume $m \ge 2$. Consider the random variables
\[ Y_\ell = \sup_{t \in H_\ell} X_t - X_{t_\ell} = \sup_{t \in H_\ell} (X_t - X_{t_\ell}), \quad 1 \le \ell \le m, \qquad V = \max_{1 \le \ell \le m} |Y_\ell - E\, Y_\ell|. \]
By the concentration inequality (10.4.5),
\[ P\{ |Y_\ell - E\, Y_\ell| \ge u \} \le 2 e^{-u^2/2\sigma^2}. \]
Thus $P\{V \ge u\} \le 2m\, e^{-u^2/2\sigma^2}$, and so using inequality (10.1.3) and the above,
\[ \begin{aligned} E\, V = \int_0^\infty P\{V \ge u\}\, du &\le \int_0^\infty \min\big( 1, 2m\, e^{-u^2/2\sigma^2} \big)\, du \\ &\le \int_0^{\sigma\sqrt{2 \log 2m}} du + 2m \int_{\sigma\sqrt{2 \log 2m}}^\infty e^{-u^2/2\sigma^2}\, du \\ &= \sigma \sqrt{2 \log 2m} + 2m\sigma \int_{\sqrt{2 \log 2m}}^\infty e^{-v^2/2}\, dv \\ &\le \sigma \sqrt{2 \log 2m} + \sigma \Big( \frac{\pi}{2} \Big)^{1/2} \le C_2\, \sigma \sqrt{\log m}. \end{aligned} \]
But for each $\ell$, $V \ge E\, Y_\ell - Y_\ell$, and so $Y_\ell \ge \min_{1 \le \ell \le m} E\, Y_\ell - V$, which implies
\[ \sup_{t \in H_\ell} X_t = Y_\ell + X_{t_\ell} \ge X_{t_\ell} + \min_{1 \le \ell \le m} E\, Y_\ell - V. \]
Hence
\[ \sup_{t \in H} X_t \ge \max_{1 \le \ell \le m} X_{t_\ell} + \min_{1 \le \ell \le m} E\, Y_\ell - V. \]
Passing to expectation gives
\[ E \sup_{t \in H} X_t \ge E \max_{1 \le \ell \le m} X_{t_\ell} + \min_{1 \le \ell \le m} E\, Y_\ell - C_2\, \sigma \sqrt{\log m}. \]
To conclude, it remains to apply Sudakov's minoration to the first term of the right-hand side.
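The only computational step in this proof is the bound $\int_0^\infty \min(1, 2m\, e^{-u^2/2\sigma^2})\, du \le \sigma\sqrt{2 \log 2m} + \sigma (\pi/2)^{1/2}$. A short numerical check can be sketched as follows. The function names, the midpoint rule and the truncation point are illustrative assumptions, not taken from [Talagrand: 2005].

```python
import math


def tail_integral(m, sigma, steps=200_000):
    """Midpoint-rule value of int_0^inf min(1, 2m exp(-u^2 / (2 sigma^2))) du.

    The integral is truncated at sigma * (sqrt(2 log 2m) + 12); beyond that
    point the integrand is below exp(-72) and is ignored.
    """
    upper = sigma * (math.sqrt(2.0 * math.log(2.0 * m)) + 12.0)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += min(1.0, 2.0 * m * math.exp(-u * u / (2.0 * sigma * sigma)))
    return total * h


def closed_form_bound(m, sigma):
    """Right-hand side sigma*sqrt(2 log 2m) + sigma*sqrt(pi/2) used in the proof."""
    return sigma * math.sqrt(2.0 * math.log(2.0 * m)) + sigma * math.sqrt(math.pi / 2.0)


if __name__ == "__main__":
    for m in (2, 10, 1000):
        print(m, tail_integral(m, 1.0), closed_form_bound(m, 1.0))
```

For every $m \ge 2$ the numerical value should lie between $\sigma\sqrt{2\log 2m}$ and the closed-form bound.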

10.3 Regularity and irregularity of Gaussian processes

Let $X = \{X_t, t \in T\}$ be a Gaussian process indexed on $T$ and with basic probability space $(\Omega, \mathcal{B}, P)$. Let $d_X(s,t) = \|X_s - X_t\|_2$ be the natural pseudo-metric induced by $X$ on $T$. The following useful fact is easy to verify: in order that $X$ has a $d_X$-separable version or modification, it is necessary and sufficient that $(T, d_X)$ be separable.

Two fundamental properties are relevant in this section: the almost sure boundedness and almost sure continuity of sample paths. Let $X = \{X_t, t \in T\}$ be a Gaussian process indexed on an arbitrary parameter set $T$. We endow $T$ with the pseudo-metric $d_X(s,t)$ and assume that $(T, d_X)$ is separable, so that $X$ possesses a ($d_X$-separable) version which we shall denote again by $X$. In this case, there is no ambiguity in saying:
$X$ is sample bounded if $P\{\omega : \sup_{t \in T} |X_t(\omega)| < \infty\} = 1$;
$X$ is sample $d_X$-continuous if $P\{\omega : t \to X_t(\omega) \text{ is } d_X\text{-continuous}\} = 1$.
These properties lead to a fine notion of compactness in a Hilbert space. Let $(H, \|\cdot\|)$ be a Hilbert space and let $Z$ be the canonical Gaussian process on $H$.

10.3.1 Definition. We say that $A$ is a GB (for Gaussian bounded) subset of $H$ if the restriction of $Z$ on $A$ possesses a version which is sample bounded. We also say that $A$ is a GC (for Gaussian continuous) subset of $H$ if the restriction of $Z$ on $A$ possesses a version which is sample $\|\cdot\|$-continuous.

The 0-1 laws and integrability properties of Gaussian vectors (previous section) show that $X$ is sample bounded if and only if
\[ E \sup_{t \in T} |X(t)| < \infty. \tag{10.3.1} \]
Since, for every $t_0 \in T$, $\sup_{t \in T} |X(t)| \le |X(t_0)| + \sup_{(s,t) \in T \times T} (X(t) - X(s))$, and
\[ E \sup_{(s,t) \in T \times T} \big( X(t) - X(s) \big) = 2 E \sup_{t \in T} X(t), \]
we have
\[ E \sup_{t \in T} X(t) \le E \sup_{t \in T} |X(t)| \le 2 E \sup_{t \in T} X(t) + \inf_{t_0 \in T} E |X(t_0)|. \tag{10.3.2} \]
It follows that $X$ is also sample bounded if and only if $E \sup_{t \in T} X(t) < \infty$.

As for the sample path continuity, first examine the oscillation properties of Gaussian processes established by Ito and Nisio [1968] and Belyaev [1961]. Let (T , δ) be a separable metric space. Let X = {Xt , t ∈ T } be a Gaussian process on T . We assume that X is dX -separable. We also assume that the identity mapping i : (T , δ) → (T , dX ) is uniformly continuous. Then under these conditions the δ-oscillation of X, WX(ω) (t) = lim lim

u→0 ε→0

sup

δ(s,t)
|X(s) − X(s )|,

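is almost surely deterministic: there exists a null set $N$ and a map $\alpha : T \to \mathbb{R}_+$ such that
\[ \forall \omega \notin N,\ \forall t \in T, \quad W_{X(\omega)}(t) = \alpha(t). \tag{10.3.3} \]
Further, for any $t \in T$, there exists a null set $N_t$ such that for any $\omega \notin N_t$,
\[ \liminf_{\delta(s,t) \to 0} X(\omega, s) = X(\omega, t) - \tfrac12 \alpha(t), \qquad \limsup_{\delta(s,t) \to 0} X(\omega, s) = X(\omega, t) + \tfrac12 \alpha(t), \tag{10.3.4} \]
and
\[ \alpha(t) = 2 \lim_{\varepsilon \to 0} E \sup_{\delta(s,t) < \varepsilon} X(s). \tag{10.3.5} \]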

The search for theorems characterizing the regularity properties of Gaussian processes by means of the metrical properties of the parameter space $(T, d_X)$ gave rise to continuous efforts, leading in 1985 to a characterization involving majorizing measures ([Talagrand: 1987], Theorem 1). We have, in effect, the following statement.

10.3.2 Theorem. (a) $X$ has a sample bounded version on $T$ if and only if there exists a probability measure $\mu$ on $T$ such that
\[ I_X(\mu) = \sup_{t \in T} \int_0^{\mathrm{diam}(T, d_X)} \Big( \log \frac{1}{\mu\{s \in T : d_X(s,t) \le u\}} \Big)^{1/2} du < \infty. \tag{10.3.6} \]
And there exists a universal constant $K$ such that
\[ K^{-1} I_X(\mu) \le E \sup_{t \in T} X_t \le K\, I_X(\mu). \tag{10.3.7} \]
(b) $X$ has a sample continuous version on $T$ if and only if there exists a probability measure $\mu$ on $T$ such that
\[ \lim_{\varepsilon \to 0} \sup_{t \in T} \int_0^{\varepsilon} \Big( \log \frac{1}{\mu\{s \in T : d_X(s,t) \le u\}} \Big)^{1/2} du = 0. \tag{10.3.8} \]

The upper bound part in (10.3.7) holds true for any probability measure $\mu$ and is due to Fernique [1975]. One should notice, however, that it is easily deduced from a previous work of Preston [1971], following the seminal paper of Garsia–Rodemich–Rumsey [1970] (see e.g., [Talagrand: 1987; 103], see also Section 9.1). This beautiful characterization is relatively delicate to apply, the existence of majorizing measures being generally difficult to establish. In many cases the following entropy criterion due to Dudley [1967: Theorem 3.1] and [1973: Theorem 2.1] is more convenient.

Dudley's entropy criterion. Assume that
\[ \int_0^{\mathrm{diam}(T, d_X)} \big( \log N(T, d_X, \varepsilon) \big)^{1/2} d\varepsilon < \infty, \tag{10.3.9} \]
where as usual $N(T, d_X, \varepsilon)$ is the minimal number of $d_X$-open balls of radius $\varepsilon$ needed to cover $T$. Then $X$ has a version which is sample $d_X$-continuous. Further, denoting $\Psi(t) = \exp(t^2) - 1$,
\[ \Big\| \sup_{T \times T} [X_s - X_t] \Big\|_{\Psi} \le K \int_0^{\mathrm{diam}(T, d_X)} \big( \log N(T, d_X, \varepsilon) \big)^{1/2} d\varepsilon, \tag{10.3.10} \]
where $K$ is a universal constant. When $X$ is a stationary Gaussian process on $T = \mathbb{R}^N$, this condition is optimal ([Fernique: 1975], Theorem 8.1.1).

10.3.3 Theorem. Let $X = \{X_t, t \in \mathbb{R}^N\}$ be a stationary Gaussian process. In order that $X$ has a version which is sample bounded on any bounded subset of $\mathbb{R}^N$, it is necessary and sufficient that there exists a bounded neighborhood $V$ of $\mathbb{R}^N$ such that
\[ \int_0^{\mathrm{diam}(V, d_X)} \big( \log N(V, d_X, \varepsilon) \big)^{1/2} d\varepsilon < \infty. \tag{10.3.11} \]
Further, there exists a constant $C = C(N)$ depending on $N$ only such that
\[ C^{-1} \int_0^{\mathrm{diam}(V, d_X)} \big( \log N(V, d_X, \varepsilon) \big)^{1/2} d\varepsilon \le E \sup_{t \in V} X(t) \le C \int_0^{\mathrm{diam}(V, d_X)} \big( \log N(V, d_X, \varepsilon) \big)^{1/2} d\varepsilon. \]
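As an illustration of the entropy integral (10.3.9), consider Brownian motion on $T = [0,1]$, for which $d_X(s,t) = |s-t|^{1/2}$ and $\mathrm{diam}(T, d_X) = 1$. A $d_X$-ball of radius $\varepsilon$ is then an open interval of length $2\varepsilon^2$. The sketch below evaluates the integral numerically. The function names and the midpoint rule are illustrative choices, and the example process itself is an assumption added here, not one treated in the text.

```python
import math


def covering_number(eps):
    """N([0,1], d, eps) for d(s, t) = sqrt(|s - t|).

    A d-ball of radius eps is an open interval of length 2 * eps**2, so the
    smallest n with n * 2 * eps**2 > 1 is floor(1 / (2 eps^2)) + 1.
    """
    return math.floor(1.0 / (2.0 * eps * eps)) + 1


def dudley_integral(steps=100_000):
    """Midpoint-rule value of int_0^1 sqrt(log N([0,1], d, eps)) d eps."""
    h = 1.0 / steps
    return h * sum(
        math.sqrt(math.log(covering_number((k + 0.5) * h))) for k in range(steps)
    )


if __name__ == "__main__":
    print(dudley_integral())
```

The value is finite, as it must be since Brownian motion has continuous paths. Elementary bounds on the covering number place it between $\sqrt{\pi}/2$ and $\sqrt{\pi/2}$.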

It is also possible to reformulate the previous characterization using majorizing measures. The statement below shows that the Lebesgue measure is a universal majorizing measure for stationary Gaussian processes on $\mathbb{R}^N$ restricted to some open ball of $\mathbb{R}^N$.

10.3.4 Theorem. Let $X = \{X(\omega, t), \omega \in \Omega, t \in \mathbb{R}^N\}$ be a stationary Gaussian process on $\mathbb{R}^N$. We assume that $X$ is continuous in probability. Set $V = [-1/2, 1/2]^N$. Let $\lambda$ denote the normalized Lebesgue measure on $V$; further let $B(t,u) = \{s \in \mathbb{R}^N : d_X(s,t) < u\}$, $u > 0$. Then the three following properties are equivalent:
(i) $X$ has a version with continuous paths on $\mathbb{R}^N$,
(ii) the integral $J = \int_0^\infty \sqrt{\log \frac{1}{\lambda(V \cap B(0,u))}}\, du$ converges,
(iii) $\sup_{t \in V} \int_0^\infty \sqrt{\log \frac{1}{\lambda(V \cap B(t,u))}}\, du < \infty$.
Further, there exists a constant $C = C(N)$ depending on $N$ only such that
\[ C^{-1} \sup_{t \in V} \int_0^\infty \sqrt{\log \frac{1}{\lambda(V \cap B(t,u))}}\, du \le E \sup_{t \in V} X(t) \le C \sup_{t \in V} \int_0^\infty \sqrt{\log \frac{1}{\lambda(V \cap B(t,u))}}\, du. \]
Note that
\[ B^{-1} J \le \sup_{t \in V} \int_0^\infty \sqrt{\log \frac{1}{\lambda(V \cap B(t,u))}}\, du \le B J, \]

where the constant $B = B(N)$ depends on $N$ only.

Local time and irregularity of Gaussian processes. Let $(T, \mathcal{A}, \mu)$, $(\Omega, \mathcal{B}, P)$ be two probability spaces. Let $X = \{X(\omega, t), \omega \in \Omega, t \in T\}$ be a real-valued stochastic process, measurable with respect to $\mathcal{A} \otimes \mathcal{B}$. For each sample path $X(\omega, \cdot)$, the occupation time distribution is defined as
\[ \nu(A) = \nu(\omega, A) = \mu\{t \in T : X(\omega, t) \in A\} \quad \text{for every Borel set } A \in \mathcal{B}(\mathbb{R}). \tag{10.3.12} \]
For any real- or complex-valued Borel measurable function $G$, we have by the transfer formula
\[ \int_T G(X(\omega, t))\, \mu(dt) = \int_{\mathbb{R}} G(x)\, \nu(\omega, dx). \tag{10.3.13} \]
In particular, the Fourier–Stieltjes transform of the corresponding distribution function is given by the formula
\[ \hat\nu(u) = \hat\nu(\omega, u) = \int_T e^{iuX(\omega,t)}\, \mu(dt), \qquad u \in \mathbb{R}. \]
If the measure $\nu(\omega, \cdot)$ is absolutely continuous on $\mathbb{R}$ with respect to the Lebesgue measure, its Radon–Nikodym derivative, denoted by $\varphi(x) = \varphi(\omega, x)$, is called the local time of $X(\omega, \cdot)$, and we have the relation
\[ \nu(\omega, A) = \int_A \varphi(\omega, x)\, dx. \tag{10.3.14} \]
A more restricted notion would consist of replacing $T$ by $I \in \mathcal{A}$, and introducing $\nu_I(\omega, A) = \mu\{t \in I : X(\omega, t) \in A\}$, $f_I(\omega, \cdot)$ and $\varphi_I(\omega, \cdot)$. It is clear from the definition that if the local time exists with respect to a measurable set $I$, then it also exists with respect to any measurable subset $J \subset I$.
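The occupation measure (10.3.12) and the transfer formula (10.3.13) are easy to visualize on a discretized path. The sketch below is an illustration under stated assumptions: it uses a scaled Gaussian random walk as a stand-in for Brownian motion on $[0,1]$, takes $\mu$ to be the uniform measure on the sampling times, and bins the values of the path. All function names and the bin width are hypothetical choices.

```python
import math
import random


def random_walk_path(n, seed=0):
    """Scaled Gaussian random walk X(k/n), k = 0..n-1 (a stand-in for Brownian motion)."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        path.append(x)
        x += rng.gauss(0.0, 1.0) / math.sqrt(n)
    return path


def occupation_measure(path, width):
    """nu(bin k) = mu{t : X(t) in [k*width, (k+1)*width)}, mu uniform on sample times."""
    nu = {}
    for x in path:
        k = math.floor(x / width)
        nu[k] = nu.get(k, 0.0) + 1.0 / len(path)
    return nu


def time_average(path, G):
    """Left-hand side of the transfer formula: integral of G(X(t)) mu(dt)."""
    return sum(G(x) for x in path) / len(path)


def space_average(nu, width, G):
    """Right-hand side: integral of G(x) nu(dx), with G evaluated at bin centers."""
    return sum(G((k + 0.5) * width) * mass for k, mass in nu.items())


if __name__ == "__main__":
    path = random_walk_path(5000)
    nu = occupation_measure(path, 0.01)
    print(time_average(path, lambda x: x * x), space_average(nu, 0.01, lambda x: x * x))
```

The two averages agree up to the binning error, and the ratios `nu[k] / width` give a histogram approximation of the local time $\varphi(\omega, \cdot)$ when it exists.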

When $T = [0,1]$, writing $\varphi_t(\omega, x) = \varphi_{[0,t[}(\omega, x)$, the question naturally arises whether the local time of $X$ admits a version which is jointly continuous in $(t, x)$. A large part of the theory consists of finding sharp conditions ensuring the validity of this property. For instance, if $X(t)$, $0 \le t \le 1$, is Gaussian and $E |X(s) - X(t)|^2 = \sigma^2(|s-t|)$ with $\sigma^2$ concave, then according to Theorem 6.1 in Berman [1970a], the local time exists and is jointly continuous almost surely. We now assume in what follows that $X$ is Gaussian. The study of local times of Gaussian processes was mainly developed under the impulse of Berman (see the seminal works Berman [1969a], [1969b]), with important contributions of Geman and Horowitz [1980], Lifshits [1979], Marlow [1973] and Pitt [1978]. The two formulas below are very useful:
\[ E \int_{\mathbb{R}} |u|^p\, |\hat\nu(u)|^2\, du = 2^{(p+1)/2}\, \Gamma\Big( \frac{p+1}{2} \Big) \int_T \int_T \frac{\mu(ds)\mu(dt)}{\big[ E (X(s) - X(t))^2 \big]^{(p+1)/2}} \qquad (\forall p \ge 0) \tag{10.3.15} \]
and
\[ E \int_{\mathbb{R}} e^{ub}\, |\hat\nu(u)|^2\, du = \sqrt{2\pi} \int_T \int_T e^{\frac{b^2}{2 E (X(s) - X(t))^2}}\, \frac{\mu(ds)\mu(dt)}{\big[ E (X(s) - X(t))^2 \big]^{1/2}} \qquad (\forall b \in \mathbb{R}). \tag{10.3.16} \]
Applying (10.3.15) with $p = 0$ shows that a sufficient condition for the existence of the local time is
\[ \int_T \int_T \frac{\mu(ds)\mu(dt)}{\big[ E (X(s) - X(t))^2 \big]^{1/2}} < \infty. \tag{10.3.17} \]
This condition is of a somewhat special nature; a reader familiar with classical potential theory will recognize the form of an energy integral. Let us sketch rapidly the proof. In view of formula (10.3.15),
\[ E \int_{\mathbb{R}} |\hat\nu(u)|^2\, du < \infty. \tag{10.3.18} \]

Therefore with probability 1, $f := \hat\nu \in L^2(\mathbb{R})$ and so $\hat f \in L^2(\mathbb{R})$. Let $g = \chi_{[a,b]}$. By the Parseval relations, $\int_{\mathbb{R}} \hat g(x) f(x)\, dx = \int_{\mathbb{R}} g(x) \hat f(x)\, dx$. Hence
\[ \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-ibt} - e^{-iat}}{-it}\, \hat\nu(t)\, dt = \frac{1}{2\pi} \int_a^b \hat f(t)\, dt. \]
By the inversion formula for Fourier–Stieltjes transforms
\[ \nu([a,b]) = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-ibt} - e^{-iat}}{-it}\, \hat\nu(t)\, dt \]
at all points $a$, $b$ of continuity of $\nu$. But we have (see for instance Theorem 6.2.5 in Chung [1970])
\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^2\, dt = \sum_{x \in \mathbb{R}} \nu(\{x\})^2. \]
Since $\hat\nu \in L^2(\mathbb{R})$, this implies that $\nu$ has no discontinuity points. Defining $\varphi = \hat f / (2\pi)$, we get
\[ \nu([a,b]) = \int_a^b \varphi(x)\, dx \qquad \text{for all } a, b \text{ reals.} \]
And this proves the existence of the local time under the integral condition (10.3.17). If (10.3.18) is strengthened into
\[ E \int_{\mathbb{R}} |\hat\nu(u)|\, du < \infty, \tag{10.3.19} \]

then (see for instance Theorem 4.4.2 in Kawata [1972]) with probability 1, $\varphi \in L^1(\mathbb{R})$, and also belongs to the class $C_0$ of bounded functions on $\mathbb{R}$ which are uniformly continuous on $\mathbb{R}$ and such that $\varphi(x) \to 0$, $x \to \pm\infty$, and $\varphi \in L^r(\mathbb{R})$ for any $r \ge 1$.

Consider some examples. For instance, let $X$ be a Gaussian process defined on the interval $[0,1]$, and satisfying for some constants $C$ and $\beta < 1$, $d_X(s,t) := \|X(s) - X(t)\|_2 \ge C |s-t|^{\beta}$, for any $s, t \in [0,1]$. Then the integral in (10.3.17) is convergent. And so the local time (with respect to the Lebesgue measure) of $X$ exists almost surely. It is plain that the same conclusion can be reached if $d_X(s,t) \ge C\, |\log |s-t||^{-b}$, for some $b > 0$. Now if $d_X(s,t) \asymp |\log |s-t||^{-b}$ as $|s-t|$ tends to 0, the entropy numbers associated to $d_X$ satisfy $N(\varepsilon) \asymp e^{\varepsilon^{-1/b}}$. And so if further $b \le 1/2$ and $X$ is stationary, by Theorem 10.3.3, $X$ is unbounded almost surely.

The notion of local time for a process is a measure of its irregularity: the more regular the local time of $X$, the less regular is $X$. There is another way to observe this which is even more striking: if the local time is analytic, the process has very erratic trajectories, and is in particular unbounded almost surely. Berman [1969a], [1969b], [1970b], [1984] investigated much of this aspect of the theory.

Before going further, we collect some useful facts concerning smooth distribution functions and analytic functions (see e.g., Cartan [1961], Kawata [1972] or Lukacs [1970]). Let $f(t) = \int_{\mathbb{R}} e^{itx}\, F(dx)$ be the characteristic function of a distribution function $F$. If
\[ \int_{\mathbb{R}} |f(t)|\, dt < \infty, \tag{10.3.20} \]
then $F$ is absolutely continuous, $F(x) - F(-\infty) = \int_{-\infty}^{x} p(u)\, du$ where $p(u) \ge 0$, $p \in L^1(\mathbb{R})$ and $p$ is bounded and uniformly continuous on $\mathbb{R}$. We also have by the inversion formula, for all reals $x$,
\[ F(x) - F(0) = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-itx} - 1}{-it}\, f(t)\, dt. \]
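Two computations from the local time discussion above can be checked numerically. The first is the one-dimensional Gaussian integral $\int_{\mathbb{R}} |u|^p e^{-u^2\sigma^2/2}\, du = 2^{(p+1)/2}\, \Gamma\big(\frac{p+1}{2}\big)\, \sigma^{-(p+1)}$, which underlies formula (10.3.15) once $\sigma^2$ stands for $E (X(s) - X(t))^2$. The second is the energy integral (10.3.17) in the model case $d_X(s,t) = |s-t|^{\beta}$ on $[0,1]$ with Lebesgue measure, whose value is $2/((1-\beta)(2-\beta))$. The sketch below is illustrative only: the function names, the quadrature rule and the change of variable are assumptions made for the example.

```python
import math


def gaussian_moment_integral(p, sigma, steps=200_000):
    """Midpoint-rule value of int_R |u|^p exp(-u^2 sigma^2 / 2) du.

    The integrand is even, so only [0, 12 / sigma] is sampled; the tail beyond
    that point is below exp(-72) and is ignored.
    """
    h = (12.0 / sigma) / steps
    half = sum(
        ((k + 0.5) * h) ** p * math.exp(-(((k + 0.5) * h * sigma) ** 2) / 2.0)
        for k in range(steps)
    )
    return 2.0 * half * h


def gaussian_moment_closed_form(p, sigma):
    """2^{(p+1)/2} Gamma((p+1)/2) sigma^{-(p+1)}."""
    return 2.0 ** ((p + 1) / 2.0) * math.gamma((p + 1) / 2.0) / sigma ** (p + 1)


def energy_integral(beta, steps=200_000):
    """int_0^1 int_0^1 |s - t|^(-beta) ds dt for 0 <= beta < 1.

    The double integral equals 2 * int_0^1 (1 - h) h^(-beta) dh; substituting
    h = x^m with m = 1 / (1 - beta) removes the singularity at h = 0.
    """
    m = 1.0 / (1.0 - beta)
    dx = 1.0 / steps
    return 2.0 * m * dx * sum(1.0 - ((k + 0.5) * dx) ** m for k in range(steps))


if __name__ == "__main__":
    print(gaussian_moment_integral(2, 1.5), gaussian_moment_closed_form(2, 1.5))
    print(energy_integral(0.5), 2.0 / ((1 - 0.5) * (2 - 0.5)))
```

The energy integral stays finite for every $\beta < 1$ and blows up as $\beta \to 1$, which is the dividing line in the example above.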

Now assume that there exists a real $r > 0$ such that
\[ \int_{\mathbb{R}} e^{r|t|}\, |f(t)|\, dt < \infty. \tag{10.3.21} \]
Put for $z$ complex, $z = x + iy$,
\[ F_n(z) = \frac{1}{2\pi} \int_{-n}^{n} \frac{e^{-itz} - 1}{-it}\, f(t)\, dt. \]
These functions are analytic in the strip $-r < \Im z < r$ (expandable at each point of the strip into a power series converging in some open neighbourhood of this point). This is easily seen by developing $\frac{e^{-itz} - 1}{-it}$, and using the elementary bound valid for any complex number $z$ and for any natural number $n$,
\[ \Big| e^z - \Big( 1 + \frac{z}{1!} + \cdots + \frac{z^n}{n!} \Big) \Big| \le \frac{|z|^{n+1}}{(n+1)!}\, e^{|z|}. \]
Suppose $|y| \le c < r$. Observe that
\[ \Big| \int_A^B \frac{e^{-itz} - 1}{-it}\, f(t)\, dt \Big| \le \frac{1}{A} \int_A^B \big( 1 + e^{t|y|} \big) |f(t)|\, dt \le \frac{2}{A} \int_A^B e^{t|y|}\, |f(t)|\, dt. \]
By the assumption made, the last integral tends to 0 as $B \to \infty$, $A \to \infty$. This shows that
\[ \sup_{x \in \mathbb{R},\ |y| \le c}\ \sup_{A \le m \le n \le B} |F_n(z) - F_m(z)| \to 0, \qquad A, B \to \infty. \]
The functions $F_n(z)$ thus converge uniformly on any compact subset of the strip $-r < \Im z < r$. According to a classical result (Cartan [1961: p. 145]) the limit, namely $F$, is analytic in this strip. We obtained that under condition (10.3.21), the distribution function $F$ is analytic in the strip $-r < \Im z < r$. But this condition is satisfied once
\[ \int_{\mathbb{R}} e^{b|t|}\, |f(t)|^2\, dt < \infty \quad \text{for some } b > 2r. \tag{10.3.22} \]
Indeed, write $b = 2(r + \eta)$. By Cauchy–Schwarz's inequality,
\[ \begin{aligned} \int_{\mathbb{R}} e^{r|t|}\, |f(t)|\, dt &= \int_{\mathbb{R}} e^{(r+\eta)|t|}\, |f(t)|\, e^{-\eta|t|}\, dt \\ &\le \Big( \int_{\mathbb{R}} e^{2(r+\eta)|t|}\, |f(t)|^2\, dt \Big)^{1/2} \Big( \int_{\mathbb{R}} e^{-2\eta|t|}\, dt \Big)^{1/2} \\ &= C(\eta) \Big( \int_{\mathbb{R}} e^{b|t|}\, |f(t)|^2\, dt \Big)^{1/2} < \infty. \end{aligned} \]


Now (10.3.22) holds if $\int_{\mathbb{R}} e^{bt}\, |f(t)|^2\, dt < \infty$ and $\int_{\mathbb{R}} e^{-bt}\, |f(t)|^2\, dt < \infty$ (the converse is obvious). Apply these remarks to $F = \nu$, $f = \hat\nu$. In view of formula (10.3.16),
\[ E \int_{\mathbb{R}} e^{\pm ub}\, |\hat\nu(u)|^2\, du < \infty \]
if and only if
\[ \int_T \int_T e^{\frac{b^2}{2 E (X(s) - X(t))^2}}\, \frac{\mu(ds)\mu(dt)}{\big[ E (X(s) - X(t))^2 \big]^{1/2}} < \infty. \tag{10.3.23} \]
Under this condition, $\nu = \nu(\omega, \cdot)$ is almost surely analytic in the strip $-b/2 < \Im z < b/2$, and so is the local time $\varphi = \varphi(\omega, \cdot)$ as well. Now we claim that the process spends positive time in any interval, and therefore cannot be bounded. Indeed, let $I = [\alpha, \beta]$ with $\alpha < \beta$. If the amount of time spent in $I$ is zero, the local time must vanish almost everywhere in $I$, thus everywhere in $I$ by continuity. In view of the principle of analytic continuation (Cartan [1961: p. 39]), it must vanish everywhere in the whole strip, hence on the real line. The integral of local time having value 1, we obtain a contradiction. Hence the claim is proved. Consequently, a sufficient condition for $X$ to be unbounded almost surely is
\[ \int_T \int_T e^{\frac{b^2}{E (X(s) - X(t))^2}}\, \mu(ds)\mu(dt) < \infty \quad \text{for some } b > 0. \tag{10.3.24} \]
This condition means that the parameter space has finite energy integral with respect to the kernel $K(y) = e^{b^2/y^2}$. This implies that $T$ is sufficiently large so that the sample paths have "enough time" to visit every set of positive measure. This approach has also been extended to non-Gaussian processes in a little known paper by Berman [1984], and certainly deserves further investigations.

10.4 Gaussian suprema

The isoperimetric inequality. The fundamental result is a Brunn–Minkowski type isoperimetric inequality in Gauss spaces (Section 10.1) discovered independently by Borell [1975] and Sudakov–Tsyrelson [1974]. Let $E$ be a locally convex Hausdorff space. Let $\mu \in G(E)$. We set
\[ \mu_*(A) = \sup\{ \mu(K) : K \text{ compact},\ K \subseteq A \}, \qquad \text{whenever } A \subseteq E. \]
Recall that we have set $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$ in (10.1.2), and that $O(\mu)$ denotes the unit ball of the RKHS of $\mu$, see (10.1.29).

10.4.1 Theorem. Suppose that $A$ is a $\mu$-measurable subset of $E$. Choose $a \in \mathbb{R}$ so that $\mu(A) = \Phi(a)$. Then, for all $t > 0$,
\[ \mu_*\big( A + tO(\mu) \big) \ge \Phi(a + t). \]

518

10 Gaussian processes

Equality occurs if A is a half space. In particular, if A + H (μ) = A, then μ(A) = 0 or 1. The proof in Borell [1975] is based on the Brunn–Minkowski inequality for spherical space. It is worth mentioning that in an earlier paper, Landau and Shepp [1970] already used this inequality to prove that if X is a centered Gaussian vector in Rn , V a convex set and s a real such that P{V ∈ C} ≥ (s), then if s > 0, for any a > 1, P{V ∈ aC} ≥ (as). We point out another useful inequality valid for all μ-measurable subsets A and B of E, and every 0 < λ < 1:

\[ \mu_*\big(\lambda A + (1 - \lambda)B\big) \ge \mu^{\lambda}(A)\,\mu^{1-\lambda}(B). \tag{10.4.1} \]
And so if $\mu \in G_0(E)$ and $A$ is a convex Borel measurable subset of $E$, symmetric about the origin, then
\[ \mu(A) \ge \mu(A + x), \qquad x \in E. \tag{10.4.2} \]
This is a fundamental inequality in Gauss spaces. A remarkable property of $\mu$-measurable subsets $A$ with positive measure reads as follows: if $\mu(A) > 0$, then there exists a positive number $\delta$ such that
\[ \delta O(\mu) \subseteq A - A. \tag{10.4.3} \]
We refer for these results to Borell [1975] and also to Section 2.3 in [Ledoux–Talagrand: 1991].

Let us give some important consequences of the isoperimetric inequality. Let $(B, \|\cdot\|)$ be a Banach space such that for some countable subset $D$ of the unit ball of $B'$, $\|x\| = \sup_{f \in D} |f(x)|$. If $X$ is a random variable in $B$, the study of the distribution of $\|X\|$ thus amounts to estimating the supremum of countably many random variables $\{f(X), f \in D\}$. Consider now $X$ Gaussian in $B$; by this we mean that $\{f(X), f \in D\}$ is a Gaussian process, or equivalently that every finite linear combination $\sum_i \alpha_i f_i(X)$, $\alpha_i \in \mathbb{R}$, $f_i \in D$, is Gaussian. The behavior of $P\{\|X\| > t\}$ is determined by two parameters: the median $M = M(X)$, that is a number satisfying both
\[ P\{\|X\| \le M\} \ge \tfrac12 \quad\text{and}\quad P\{\|X\| \ge M\} \ge \tfrac12, \]
and
\[ \sigma = \sigma(X) = \sup_{f \in D} \big(E f^2(X)\big)^{1/2}. \]

Set $D = \{f_n, n \ge 1\}$. Let $\gamma$ be the canonical Gaussian distribution on $\mathbb{R}^{\mathbb{N}}$. By applying the Gram–Schmidt orthonormalization procedure to the sequence $\{f_n(X), n \ge 1\}$, we can write
\[ f_n(X) = \sum_{j=1}^{n} a_j^n g_j, \qquad n \ge 1. \]


10.4 Gaussian suprema

The meaning of these equalities is that if $x = \{x_j, j \ge 1\} \in \mathbb{R}^{\mathbb{N}}$, the sequence $\{f_n(X), n \ge 1\}$ has the same distribution as the sequence $\{\sum_{j=1}^{n} a_j^n x_j, n \ge 1\}$ under $\gamma$. Consequently, the study of the distribution of $\|X\|$ amounts to that of $\|x\| = \sup_{n \ge 1} |f_n(x)|$ under $\gamma$. Note also that
\[ \sigma = \sup_{h \in O(\gamma)} \|h\|, \qquad\text{where}\quad O(\gamma) = \Big\{h : \sum_{j \ge 1} |h_j|^2 \le 1\Big\}. \]

The next result is a very important consequence of inequality (10.4.1).

10.4.2 Theorem. If $X$ is a Gaussian random variable with values in a Banach space $(B, \|\cdot\|)$, with median $M$ and supremum of weak variances $\sigma$, then for every $t > 0$,
\[ P\big\{\big|\|X\| - M\big| > t\big\} \le 2\Psi(t/\sigma) \le e^{-t^2/(2\sigma^2)}. \]

Proof. Indeed, let $A = \{x \in \mathbb{R}^{\mathbb{N}} : \|x\| \le M\}$. Then $A_t = A + tO(\gamma)$ is the Hilbertian neighborhood of order $t$ of $A$ and by Theorem 10.4.1, $\gamma_*(A_t) \ge \Phi(t)$. Further, if $x \in A_t$, $x = a + th$, $a \in A$, $h \in O(\gamma)$, then
\[ \|x\| \le M + t\|h\| \le M + t\sigma. \]
Thus $A_t \subset \{x \in \mathbb{R}^{\mathbb{N}} : \|x\| \le M + t\sigma\}$ and so $\gamma\{x \in \mathbb{R}^{\mathbb{N}} : \|x\| \le M + t\sigma\} \ge \Phi(t)$. Operating similarly with $A = \{x \in \mathbb{R}^{\mathbb{N}} : \|x\| \ge M\}$ shows that $\gamma\{x \in \mathbb{R}^{\mathbb{N}} : \|x\| \ge M - t\sigma\} \ge \Phi(t)$.

Theorem 10.4.1 also allows us to estimate suprema of finitely many Gaussian vectors: there exists a universal constant $C$ such that if $G_1, \ldots, G_N$ are Gaussian random vectors with values in $(B, \|\cdot\|)$, then
\[ E \sup_{1 \le k \le N} \|G_k\| \le C\Big( \sup_{1 \le k \le N} E\|G_k\| + E \sup_{1 \le k \le N} \sigma_k |g_k| \Big), \tag{10.4.4} \]
where $\sigma_k = \sup_{f \in B', \|f\| \le 1} \big(E\langle f, G_k\rangle^2\big)^{1/2}$, $k = 1, \ldots, N$, and $\{g_k, 1 \le k \le N\}$ is a sequence of independent $N(0,1)$ distributed random variables.

Now we specify Theorems 10.4.1, 10.4.2 for suprema of Gaussian processes. If $X = \{X_t, t \in T\}$, $T$ finite, is a centered Gaussian process and $\sigma = \sup_{t \in T} (E X_t^2)^{1/2}$, it follows that for $u \ge 0$, we have
\[ P\Big\{\Big|\sup_{t \in T} X_t - E \sup_{t \in T} X_t\Big| \ge u\Big\} \le 2e^{-u^2/(2\sigma^2)}. \tag{10.4.5} \]
If $X = \{X_t, t \in T\}$ is a Gaussian process, the most general result on the tail distribution of $\sup_{t \in T} X(t)$ is derived from Theorem 10.4.1. Assume there exists $w$ such that
\[ P\Big\{\sup_{t \in T} X(t) > w\Big\} \le \frac12. \]


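Then, for all $u \ge w$,
\[ P\Big\{\sup_{t \in T} X(t) > u\Big\} \le \Psi\Big(\frac{u - w}{\sigma(X)}\Big), \tag{10.4.6} \]
where $\sigma(X) = \sup_{t \in T} (E X(t)^2)^{1/2}$, and for any real $u$,
\[ P\Big\{\sup_{t \in T} X(t) > u\Big\} \le 2\Psi\Big(\frac{u - w}{\sigma(X)}\Big). \]
Let us list some more or less classical estimates.

Some typical results. Assume that $\sigma(X) = 1$. Then
\[ P\Big\{\sup_{t \in T} X(t) > u\Big\} \le C(w)\,e^{wu}\,\Psi(u), \]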

where the constant $C(w)$ depends only on $w$. This bound cannot be improved. However, it is too crude for many important cases. Consider several examples. Let $Y = \{Y(t), t \in \mathbb{R}\}$ be a stationary Gaussian process satisfying $E Y_t^2 \equiv 1$ and having continuous sample paths. Then, for every $\varepsilon > 0$,
\[ E \exp\Big\{\frac12\Big(\sup_{0 \le t \le 1} |Y_t| - \varepsilon\Big)^2\Big\} < \infty, \]
which implies
\[ \forall u \ge \varepsilon, \qquad P\Big\{\sup_{0 \le t \le 1} Y_t > u\Big\} \le C(\varepsilon)\,e^{\varepsilon u}\,\Psi(u). \]

Better formulations of this result are established in Talagrand [1984]. Let further $\{B(t), 0 \le t < \infty\}$ be a Brownian motion. It is well known that
\[ \forall \lambda \ge 0, \qquad P\Big\{\sup_{0 \le t \le 1} B(t) > \lambda\Big\} = 2P\{B(1) > \lambda\} = 2\Psi(\lambda). \]

Let $\{Y(t), t \in \mathbb{R}\}$ be a Gaussian process satisfying for some $0 < \alpha < 1$,
\[ \big(E|Y_s - Y_t|^2\big)^{1/2} \asymp |s - t|^{\alpha}, \]
as $|t - s| \to 0$. For these processes, the following asymptotic estimate is established in Pickands [1969]:
\[ P\Big\{\sup_{0 \le t \le 1} Y_t > \lambda\Big\} \asymp \lambda^{1/\alpha}\,\Psi(\lambda), \qquad \lambda \to \infty. \]
Talagrand characterized the class of Gaussian processes $X = \{X(t), t \in T\}$ satisfying
\[ \lim_{u \to \infty} \frac{P\{\sup_{t \in T} X(t) > u\}}{\Psi(u)} = 1. \tag{10.4.7} \]


More precisely, let $T$ be a compact metric space on which a real separable centered Gaussian process $X$ with continuous covariance is indexed. Assume that $(T, d_X)$ is separable and that $\{X(t), t \in T\}$ has almost surely bounded sample paths. Then (10.4.7) is equivalent to the condition: there exists a unique $\tau \in T$ such that
\[ \sup_{t \in T} E X^2(t) = E X^2(\tau) = 1, \tag{10.4.8} \]
and
\[ E \sup_{a(t) \ge 1 - h^2} \big(X(t) - a(t)X(\tau)\big) = o(h) \quad\text{as } h \to 0, \tag{10.4.9} \]

where a(t) = E X(t)X(τ ). In [Dobriˇc–Marcus–Weber: 1988] the following application is given. Let 2 σ2 ≥ σ3 ≥ · · · of positive reals satisfying k=1 σk < ∞. Let {gk , k ≥ 1} be a sequence of independent normal D

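random variables, with $g_k \overset{D}{=} N(0, \sigma_k)$, so that
\[ \sum_{k=1}^{\infty} |g_k|^p < \infty \quad\text{a.s.} \]
Then
\[ \lim_{u \to \infty} \frac{P\big\{\big(\sum_{k=1}^{\infty} |g_k|^p\big)^{1/p} > u\big\}}{\Psi(u)} = 2. \tag{10.4.10a} \]
If further $1 = \sigma_1 = \cdots = \sigma_n > \sigma_{n+1} \ge \sigma_{n+2} \ge \cdots$, then
\[ \lim_{u \to \infty} \frac{P\big\{\big(\sum_{k=1}^{\infty} |g_k|^p\big)^{1/p} > u\big\}}{\Psi(u)} = 2n. \]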

The relationship between L(h) = E

sup

(X(t) − a(t)X(τ ))

a(t)≥1−h2

and the existence of a function (u) such that P supt∈T X(t) > u ≤1 lim u→∞ (u)"(u) has been further investigated.

(≥ 1)

(10.4.10b)


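Independent case. Let $\{\zeta_k, k \ge 1\}$ be a sequence of standard $N(0,1)$ random variables. Let $\sigma_k > 0$ and $\sigma = \sup_k \sigma_k$. Observe first that
\[
\begin{aligned}
E \sup_{k \ge 1} \sigma_k |\zeta_k| < \infty
&\iff \sum_{k \ge 1} e^{-\delta/\sigma_k^2} < \infty, \quad \forall \delta > 0 \\
&\iff \lim_{\varepsilon \to 0} \varepsilon^2 \log \#\{k : \sigma_k \ge \varepsilon\} = 0.
\end{aligned}
\tag{10.4.11}
\]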
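The counting condition in (10.4.11) can be explored numerically. The following Python sketch is only an illustration and not part of the original argument: the helper names `count_at_least` and `scaled_log_count` and the two test sequences are assumptions chosen for this example. It evaluates $\varepsilon^2 \log \#\{k : \sigma_k \ge \varepsilon\}$ for $\sigma_k = (\log(k+1))^{-1}$, where the quantity tends to 0, and for $\sigma_k = (\log(k+1))^{-1/2}$, where it stays near 1.

```python
import math


def count_at_least(sigma, eps):
    """#{k >= 1 : sigma(k) >= eps} for a nonincreasing sequence sigma(k) -> 0."""
    if sigma(1) < eps:
        return 0
    lo, hi = 1, 2
    while sigma(hi) >= eps:          # exponential search for an upper bracket
        lo, hi = hi, 2 * hi
    while hi - lo > 1:               # invariant: sigma(lo) >= eps > sigma(hi)
        mid = (lo + hi) // 2
        if sigma(mid) >= eps:
            lo = mid
        else:
            hi = mid
    return lo


def scaled_log_count(sigma, eps):
    """eps^2 * log #{k : sigma_k >= eps}, the quantity appearing in (10.4.11)."""
    n = count_at_least(sigma, eps)
    return eps * eps * math.log(n) if n > 0 else 0.0


# Two hypothetical test sequences (illustration only).
def fast(k):
    return 1.0 / math.log(k + 1)             # scaled log count tends to 0


def slow(k):
    return 1.0 / math.sqrt(math.log(k + 1))  # scaled log count stays near 1


if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        print(eps, scaled_log_count(fast, eps), scaled_log_count(slow, eps))
```

For the second sequence the series in (10.4.11) equals $\sum_k (k+1)^{-\delta}$, which diverges for $\delta \le 1$, in agreement with the failure of the counting condition. Python integers are unbounded, so the bisection remains exact even when the count is of order $e^{400}$.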

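The first equivalence follows from the Borel–Cantelli lemma and integrability properties of Gaussian semi-norms. We now indicate how the second one is obtained. We may assume $\sigma = 1$. Put $M(\delta) = \sum_{k \ge 1} e^{-\delta/\sigma_k^2}$. If $\lim_{\varepsilon \to 0} \varepsilon^2 \log \#\{k : \sigma_k \ge \varepsilon\} = 0$, then given any positive real $\delta$, there exists a positive integer $k_\delta$ such that
\[ 0 < \varepsilon \le 2^{-k_\delta} \implies \#\{k : \sigma_k \ge \varepsilon\} \le e^{\delta/(8\varepsilon^2)}. \]
Therefore
\[
\begin{aligned}
\sum_{j :\, \sigma_j \le 2^{-k_\delta}} e^{-\delta/\sigma_j^2}
&= \sum_{k = k_\delta}^{\infty}\ \sum_{j :\, 2^{-k-1} < \sigma_j \le 2^{-k}} e^{-\delta/\sigma_j^2}
\le \sum_{k = k_\delta}^{\infty} e^{-\delta 2^{2k}}\, \#\{j : \sigma_j \ge 2^{-k-1}\} \\
&\le \sum_{k=1}^{\infty} e^{-\delta 2^{2k}}\, e^{\delta 2^{2k+2}/8}
= \sum_{k=1}^{\infty} e^{-\delta 2^{2k}/2} < \infty.
\end{aligned}
\]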

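Hence $M(\delta) < \infty$ for every positive $\delta$. Conversely, let $\delta$ be such that $M(\delta) < \infty$. Then
\[ \log M(\delta) + \frac{\delta}{\varepsilon^2} \ge \log \#\{k : \sigma_k \ge \varepsilon\}. \]
Multiplying both sides by $\varepsilon^2$, and then letting $\varepsilon$ tend to 0, gives
\[ \delta \ge \limsup_{\varepsilon \to 0} \varepsilon^2 \log \#\{k : \sigma_k \ge \varepsilon\}. \]
Since $\delta$ is arbitrary, we get the converse implication.

Now, let us estimate $E \sup_{k \ge 1} \sigma_k |\zeta_k|$. For each $m > 0$, we have
\[ E \sup_{k \ge 1} \sigma_k |\zeta_k| \le \Big(2 \log^{+} \sum_{k=1}^{\infty} \exp\{-m\sigma_k^{-2}\}\Big)^{1/2} \sigma + 3m^{1/2} + 2\sigma. \tag{10.4.12} \]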
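A numerical sanity check of (10.4.12) on a finite family may help to see the size of the terms. The sketch below is an illustration under stated assumptions, not the author's computation: it takes the $\zeta_k$ independent, uses the hypothetical finite family $\sigma_k = k^{-1/2}$, $1 \le k \le 50$, and computes the left-hand side from $E Z = \int_0^\infty P\{Z > t\}\,dt$ with a trapezoid rule. The function names are invented for the example.

```python
import math


def expected_sup(sigmas, steps=4000):
    """E sup_k sigma_k |zeta_k| for independent N(0,1) variables zeta_k.

    Uses E Z = int_0^infty P{Z > t} dt and
    P{Z <= t} = prod_k erf(t / (sigma_k * sqrt(2))).
    """
    upper = 12.0 * max(sigmas)        # the tail beyond 12 sigma is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        cdf = 1.0
        for s in sigmas:
            cdf *= math.erf(t / (s * math.sqrt(2.0)))
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid rule
        total += weight * (1.0 - cdf)
    return total * h


def rhs_10_4_12(sigmas, m):
    """Right-hand side of (10.4.12) for a finite family and a parameter m > 0."""
    sigma = max(sigmas)
    series = sum(math.exp(-m / (s * s)) for s in sigmas)
    log_plus = max(math.log(series), 0.0)
    return math.sqrt(2.0 * log_plus) * sigma + 3.0 * math.sqrt(m) + 2.0 * sigma


if __name__ == "__main__":
    sigmas = [k ** -0.5 for k in range(1, 51)]
    lhs = expected_sup(sigmas)
    for m in (0.01, 1.0):
        print(m, lhs, rhs_10_4_12(sigmas, m))
```

The bound is not sharp for such a small family, since the additive terms $3m^{1/2} + 2\sigma$ dominate; the check only confirms that the inequality holds for the chosen values of $m$.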

Indeed, let $J_n = \{k : 2^{-n-1}\sigma < \sigma_k \le 2^{-n}\sigma\}$ and $N_n = \#(J_n)$. Set $L_n = 2^{-n}\sigma(2 \log N_n)^{1/2}$ and $S_n = \sup_{k \in J_n} \sigma_k |\zeta_k|$. Then,
\[
\begin{aligned}
E\, S_n 1_{\{S_n > L_n\}}
&\le \sum_{k \in J_n} E\, \sigma_k |\zeta_k| 1_{\{\sigma_k |\zeta_k| > L_n\}}
\le 2^{-n}\sigma N_n\, E\, |\zeta_1| 1_{\{|\zeta_1| > (2 \log N_n)^{1/2}\}} \\
&= 2^{-n}\sigma N_n (2/\pi)^{1/2} N_n^{-1} \le 2^{-n}\sigma.
\end{aligned}
\]
Further,
\[
\begin{aligned}
E \sup_{k \ge 1} \sigma_k |\zeta_k| = E \sup_{n : N_n > 0} S_n
&\le \sup_{n : N_n > 0} L_n + \sum_{n : N_n > 0} E\, S_n 1_{\{S_n > L_n\}} \\
&\le \sup_{n : N_n > 0} L_n + \sum_{n=0}^{\infty} 2^{-n}\sigma \le \sup_{n : N_n > 0} L_n + 2\sigma.
\end{aligned}
\]
Moreover, for each $n$ such that $N_n > 0$, we have
\[
\begin{aligned}
\Big(2 \log^{+} \sum_{k=1}^{\infty} \exp\{-m\sigma_k^{-2}\}\Big)^{1/2} \sigma
&\ge \Big(2 \log^{+}\big(N_n \exp\{-2^{2n+2} m \sigma^{-2}\}\big)\Big)^{1/2} 2^{-n}\sigma \\
&\ge \big(2 \log N_n - 2^{2n+3} m \sigma^{-2}\big)_{+}^{1/2}\, 2^{-n}\sigma \ge L_n - 2^{3/2} m^{1/2},
\end{aligned}
\]
which proves (10.4.12).

Let us also briefly discuss an elementary approach often called the double sum method, which goes back to earlier works of Sirao, Watanabe and Pickands, and was later developed by Kôno, Adler, Piterbarg, Weber and others. This simple method, which consists of a judicious use of the correlation inequalities for Gaussian pairs, is often efficient for treating concrete problems of suprema. Let $X = (X_1, \ldots, X_N)$ be a centered Gaussian vector. There is no loss of generality in assuming that

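\[ \|X_1\| \le \|X_2\| \le \cdots \le \|X_N\|. \]
By Lemma 10.1.2, for any $1 \le j \le N$,
\[ \sum_{i=1}^{j-1} P\{X_i > x, X_j > x\} \le P\{X_j > x\} \sum_{i=1}^{j-1} \exp\Big\{-\frac12 \Big(\frac{x \|X_i - X_j\|}{2\|X_j\|^2}\Big)^2\Big\}. \]
By using the elementary inequality
\[ P\Big\{\bigcup_{j=1}^{N} A_j\Big\} \ge \sum_{j=1}^{N} P\{A_j\} - \sum_{\substack{i,j=1 \\ i<j}}^{N} P\{A_i \cap A_j\}, \]
we deduce for all $x \ge 0$
\[ 1 - \max_{1 \le j \le N} \sum_{i=1}^{j-1} \exp\Big\{-\frac12 \Big(\frac{x \|X_i - X_j\|}{2\|X_j\|^2}\Big)^2\Big\} \le \frac{P\{\sup_{j=1}^{N} X_j > x\}}{\sum_{j=1}^{N} P\{X_j > x\}} \le 1. \]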

Now assume we are given a separable centered Gaussian process $X = \{X(t), t \in T\}$ with almost surely bounded sample paths. Put for all $\varepsilon > 0$,
\[ m_X(\varepsilon) = \sup_{t \in T} E \sup_{d_X(u,t) \le \varepsilon} X(u). \]
Recall that $\sigma(X) = \sup_{t \in T} \|X(t)\|$. Let $S = \{t_1, \ldots, t_N\}$ be a fixed finite subset of $T$. We order $S$ according to the increasing order of the variances of the $X(t_i)$'s:
\[ \|X(t_1)\| \le \|X(t_2)\| \le \cdots \le \|X(t_N)\|. \]
To avoid trivialities, assume that $\|X(t_1)\| > 0$ and that $\varepsilon = \inf\{\|X(t_i) - X(t_j)\|, 1 \le i \ne j \le N\}$ is also positive. Consider for $1 \le j \le N$ and $k \ge 1$ the sets
\[ I_k(j) = \{i : i < j,\ k\varepsilon < \|X(t_i) - X(t_j)\| \le (k+1)\varepsilon\}. \]
Plainly, for any $j \le N$,
\[ \sum_{i=1}^{j-1} \exp\Big\{-\frac12 \Big(\frac{x \|X(t_i) - X(t_j)\|}{2\|X(t_j)\|^2}\Big)^2\Big\} \le \sum_{k=1}^{\infty} \#(I_k(j)) \exp\Big\{-\frac12 \Big(\frac{k\varepsilon x}{2\sigma(X)^2}\Big)^2\Big\}. \]

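And by Sudakov's inequality, we have for any $j \le N$ and $k \ge 1$,
\[ \varepsilon \sqrt{\log \#(I_k(j))} \le k_0\, E \sup_{s \in I_k(j)} X(s) \le k_0\, E \sup_{\|X(s) - X(t_j)\| < (k+1)\varepsilon} X(s) \le k_0\, m_X((k+1)\varepsilon). \]
Hence for all $j \le N$,
\[ \sum_{i=1}^{j-1} \exp\Big\{-\frac12 \Big(\frac{x \|X(t_i) - X(t_j)\|}{2\|X(t_j)\|^2}\Big)^2\Big\} \le \sum_{k=1}^{\infty} \exp\Big\{\Big(\frac{k_0\, m_X((k+1)\varepsilon)}{\varepsilon}\Big)^2 - \frac12 \Big(\frac{k\varepsilon x}{2\sigma(X)^2}\Big)^2\Big\}. \]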

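Choose now $x$ such that
\[ x \ge (2\sigma(X))^2 \Big[\sup_{k \ge 1} \frac{m_X((k+1)\varepsilon)}{k\varepsilon^2} \vee \frac{H}{\varepsilon}\Big], \]
where $H > 1$ is some fixed parameter. Then for all $k \ge 1$,
\[ \frac{x^2}{8\sigma^4(X)} \ge \frac{m_X^2((k+1)\varepsilon)}{k^2\varepsilon^4} + \frac{H^2}{\varepsilon^2}, \]
and so
\[ \sum_{k=1}^{\infty} \exp\Big\{\Big(\frac{k_0\, m_X((k+1)\varepsilon)}{\varepsilon}\Big)^2 - \frac12 \Big(\frac{k\varepsilon x}{2\sigma(X)^2}\Big)^2\Big\} \le \sum_{k=1}^{\infty} e^{-H^2k^2} \le \int_0^{\infty} e^{-H^2u^2}\,du < \frac1H. \]
This provides
\[ \sum_{i=1}^{j-1} P\{X(t_i) > x, X(t_j) > x\} \le \frac1H\, P\{X(t_j) > x\}, \]
for any $2 \le j \le N$. Thereby
\[ P\Big\{\sup_{t \in T} X(t) > x\Big\} \ge P\Big\{\sup_{t \in S} X(t) > x\Big\} \ge \Big(1 - \frac1H\Big) \sum_{t \in S} P\{X(t) > x\}. \]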

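Now, let $M_X(\varepsilon)$ be the maximal cardinality of the subsets $S$ of $T$ such that $\|X(s) - X(t)\| > \varepsilon$ if $s \ne t$ and $s, t \in S$. Define
\[ \varepsilon(x) = \inf\Big\{\varepsilon > 0 : \max\Big(\sup_{k \ge 1} \frac{m_X((k+1)\varepsilon)}{k\varepsilon^2},\ \frac{H}{\varepsilon}\Big) \le \frac{x}{(2\sigma(X))^2}\Big\}, \]
where
\[ m_X(\varepsilon) = \sup_{t \in T} E \sup_{\|X(t) - X(s)\| \le \varepsilon} X(s). \]
We have $\varepsilon(x) \le D(X)$. By Theorem 10.4.1,
\[
\begin{aligned}
P\Big\{\sup_{t \in T} X(t) > x + 2m_X(\varepsilon(x))\Big\}
&\le \sum_{s \in S(x)} P\Big\{\sup_{t :\, \|X(s) - X(t)\| \le \varepsilon(x)} X(t) > x + 2m_X(\varepsilon(x))\Big\} \\
&\le \Psi\Big(\frac{x}{\sigma(X)}\Big)\, \#(S(\varepsilon(x))).
\end{aligned}
\]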

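10.4.3 Proposition. Let $X = \{X(t), t \in T\}$ be a separable centered Gaussian process having almost surely bounded sample paths. Let $H > 1$ be an arbitrary fixed parameter. Let $D(X) = \sup_{s,t \in T} \|X(t) - X(s)\|$ and $\gamma(X) = \min_{t \in T} \|X(t)\|$. For all $x$ satisfying
\[ x \ge (2\sigma(X))^2 \max\Big(\frac{E \sup_T X}{D(X)^2},\ \frac{H}{D(X)}\Big), \]
we have $\varepsilon(x) \le D(X)$,
\[ P\Big\{\sup_{t \in T} X(t) > x\Big\} \ge \Big(1 - \frac1H\Big) \Psi\Big(\frac{x}{\gamma(X)}\Big) M_X(\varepsilon(x)), \]
\[ P\Big\{\sup_{t \in T} X(t) > x + 2m_X(\varepsilon(x))\Big\} \le \Psi\Big(\frac{x}{\gamma(X)}\Big) M_X(\varepsilon(x)) \le \frac{H}{H-1}\, P\Big\{\sup_{t \in T} X(t) > x\Big\}. \]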

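Some examples. If $\|X(t) - X(s)\| \sim |s - t|^{\alpha}$ for some $0 < \alpha \le 1$, then
\[ c_\alpha^{-1} x^{1/\alpha}\, \Psi(x) \le P\Big\{\sup_{t \in [0,1]} X(t) > x\Big\} \le c_\alpha x^{1/\alpha}\, \Psi(x). \]
If $\|X(t) - X(s)\| \sim |\log|s - t||^{-\beta}$ for some $\beta > \frac12$, then
\[ c_\beta^{-1} x^{2/(2\beta+1)} \le \log\Big(\frac{P\{\sup_{t \in [0,1]} X(t) > x\}}{\Psi(x)}\Big) \le c_\beta x^{2/(2\beta+1)}. \]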

If $\|X(t) - X(s)\| \sim \exp\{-|\log|s - t||^{\gamma}\}$ for some $0 < \gamma \le 1$, then
\[ c_\gamma^{-1} (\log x)^{1/\gamma} \le \log\Big(\frac{P\{\sup_{t \in [0,1]} X(t) > x\}}{\Psi(x)}\Big) \le c_\gamma (\log x)^{1/\gamma}. \]


Before investigating in more detail the properties of the specific class of Gaussian processes defined by Stein's elements (Chapters 5 and 6), let us briefly comment on the best-known Gaussian process, the Brownian motion, and discuss one of its powerful applications through the famous Skorokhod embedding scheme.

The Brownian motion. This is likely the most investigated Gaussian process, since it plays a quasi-universal role in probability theory. The Brownian motion, also called the Wiener process, is a centered Gaussian process $W = \{W(t), t \ge 0\}$ defined (and thus characterized) by its covariance function $E\, W(s)W(t) = s \wedge t$. Consequently $E\, W(s)^2 = s$. In particular $W(0) = 0$, and if $0 \le u \le v \le s \le t$,
\[ E\,(W(v) - W(u))(W(t) - W(s)) = v - v - u + u = 0. \]
And for any $c \ge 0$, $0 \le s \le t$,
\[ E\,(W(t + c) - W(s + c))^2 = E\,(W(t) - W(s))^2 = t - s. \]
Thus $W$ is a Gaussian process with orthogonal, and thus independent, stationary increments. It also follows that $\{W(ct), t \ge 0\}$ has the same law as $\{\sqrt{c}\, W(t), t \ge 0\}$ for any positive real $c$. Notice also that $-W$ and $W$ have the same law. The sample paths of $W$ are almost surely continuous. Below are some of the distributional properties of $W$: for any $u \ge 0$,
\[ P\Big\{\sup_{0 \le t \le T} W(t) \ge u\Big\} = 2P\{W(T) \ge u\} = 2\Psi\Big(\frac{u}{\sqrt{T}}\Big), \tag{10.4.13} \]
\[
\begin{aligned}
P\Big\{\sup_{0 \le t \le 1} |W(t)| \le u\Big\}
&= \frac{1}{\sqrt{2\pi}} \int_{-u}^{u} \sum_{k \in \mathbb{Z}} (-1)^k e^{-(x - 2ku)^2/2}\,dx \\
&= \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}\, e^{-\pi^2(2k+1)^2/(8u^2)}.
\end{aligned}
\tag{10.4.14}
\]
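The two expressions in (10.4.14) can be compared numerically on the unit time interval. The following Python sketch is a hedged illustration; the function names and truncation levels are assumptions made for the example. The first function integrates each term of the sum over $k \in \mathbb{Z}$ in closed form, using $\frac{1}{\sqrt{2\pi}}\int_{-u}^{u} e^{-(x-2ku)^2/2}\,dx = \Phi(u - 2ku) - \Phi(-u - 2ku)$; the second evaluates the series over $k \ge 0$.

```python
import math


def std_normal_cdf(x):
    """Phi(x), the standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def abs_sup_cdf_images(u, terms=50):
    """First form of (10.4.14): sum over k in Z of
    (-1)^k [Phi(u - 2ku) - Phi(-u - 2ku)], truncated at |k| <= terms."""
    total = 0.0
    for k in range(-terms, terms + 1):
        sign = -1.0 if k % 2 else 1.0
        total += sign * (std_normal_cdf(u - 2 * k * u) - std_normal_cdf(-u - 2 * k * u))
    return total


def abs_sup_cdf_series(u, terms=200):
    """Second form of (10.4.14): (4/pi) * sum over k >= 0 of
    (-1)^k / (2k+1) * exp(-pi^2 (2k+1)^2 / (8 u^2)), truncated after `terms` terms."""
    total = 0.0
    for k in range(terms):
        sign = -1.0 if k % 2 else 1.0
        total += sign / (2 * k + 1) * math.exp(-math.pi ** 2 * (2 * k + 1) ** 2 / (8.0 * u * u))
    return 4.0 / math.pi * total


if __name__ == "__main__":
    for u in (0.5, 1.0, 2.0, 3.0):
        print(u, abs_sup_cdf_images(u), abs_sup_cdf_series(u))
```

The first form converges quickly for small $u$ and the second for large $u$, which is why both are useful in practice.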

A bit less known is the following estimate related to the local infimum of $|W|$. Let $0 < a < b$. Then for any $c > 0$ and any real $M$,
\[ P\Big\{\inf_{a \le t \le b} |W(t) - M| \ge c\Big\} = \int_{|v| > c} \Big[1 - 2\Psi\Big(\frac{|v| - c}{\sqrt{b - a}}\Big)\Big] \frac{e^{-(M+v)^2/(2a)}}{\sqrt{2\pi a}}\,dv. \tag{10.4.15} \]
This is easily obtained by using the so-called "reflection principle", which in turn amounts to applying the intermediate value theorem, to get
\[ P\Big\{\inf_{a \le t \le b} |W(t) - M| \ge c\Big\} = P\Big\{\inf_{a \le t \le b} W(t) \ge M + c\Big\} + P\Big\{\sup_{a \le t \le b} W(t) \le M - c\Big\}. \]


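Let $x \ge 0$. Then $P\{\inf_{a \le t \le b} |W(t) - M| \ge c \mid W(a) = M \pm x\} = 0$ if $0 \le x \le c$; and if $x > c$,
\[
\begin{aligned}
P\Big\{\inf_{a \le t \le b} |W(t) - M| \ge c \,\Big|\, W(a) = M + x\Big\} &= P\Big\{\sup_{a \le t \le b} (W(a) - W(t)) \le x - c\Big\}, \\
P\Big\{\inf_{a \le t \le b} |W(t) - M| \ge c \,\Big|\, W(a) = M - x\Big\} &= P\Big\{\sup_{a \le t \le b} (W(t) - W(a)) \le x - c\Big\}.
\end{aligned}
\]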

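Therefore
\[
\begin{aligned}
P\Big\{\inf_{a \le t \le b} |W(t) - M| = 0\Big\}
&= 2 \int_{\mathbb{R}} \Psi\Big(\frac{|v|}{\sqrt{b - a}}\Big) \frac{e^{-(M+v)^2/(2a)}}{\sqrt{2\pi a}}\,dv \\
&\le C \min\Big(1, \sqrt{\frac{b - a}{a}}\Big)\, e^{-\frac{M^2}{8 \max(a,\, b - a)}}.
\end{aligned}
\tag{10.4.16}
\]
In particular,
\[ P\Big\{\inf_{a \le t \le b} |W(t)| = 0\Big\} = 1 - \frac{2}{\pi} \arctan \sqrt{\frac{a}{b - a}}. \tag{10.4.17} \]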
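As a consistency check between (10.4.16) and (10.4.17), the integral in (10.4.16) with $M = 0$ can be evaluated numerically and compared with the arctangent formula. The sketch below is an illustration only: the function names, the trapezoid rule and the truncation of the integral at twelve standard deviations are assumptions of this example, with $\Psi(x) = P\{N(0,1) > x\}$.

```python
import math


def gaussian_tail(x):
    """Psi(x) = P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def zero_probability_integral(a, b, M=0.0, steps=20000):
    """Trapezoid-rule value of the integral in (10.4.16):
    2 * int_R Psi(|v| / sqrt(b - a)) * exp(-(M + v)^2 / (2a)) / sqrt(2 pi a) dv."""
    lo = -M - 12.0 * math.sqrt(a)     # the Gaussian weight is centred at v = -M
    hi = -M + 12.0 * math.sqrt(a)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        v = lo + i * h
        density = math.exp(-(M + v) ** 2 / (2.0 * a)) / math.sqrt(2.0 * math.pi * a)
        value = gaussian_tail(abs(v) / math.sqrt(b - a)) * density
        total += (0.5 if i in (0, steps) else 1.0) * value
    return 2.0 * total * h


def zero_probability_closed_form(a, b):
    """Right-hand side of (10.4.17)."""
    return 1.0 - 2.0 / math.pi * math.atan(math.sqrt(a / (b - a)))


if __name__ == "__main__":
    for a, b in ((1.0, 2.0), (0.5, 3.0), (2.0, 2.5)):
        print(a, b, zero_probability_integral(a, b), zero_probability_closed_form(a, b))
```

For $a = 1$, $b = 2$ both sides equal $1/2$, which can also be seen directly: the integral is $2P\{Z > |V|\}$ for independent standard normal $Z$, $V$, and the event is a planar wedge of angle $\pi/2$.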

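And for every positive real $c$,
\[
\begin{aligned}
P\Big\{0 < \inf_{a \le t \le b} |W(t)| < c\Big\}
&= 2 \int_0^{c/\sqrt{a}} \Big[1 - 2\Psi\Big(\frac{u\sqrt{a}}{\sqrt{b - a}}\Big)\Big] \frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du \\
&\quad + 4 \int_{c/\sqrt{a}}^{\infty} \Big[\Psi\Big(\frac{u\sqrt{a} - c}{\sqrt{b - a}}\Big) - \Psi\Big(\frac{u\sqrt{a}}{\sqrt{b - a}}\Big)\Big] \frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du.
\end{aligned}
\]
Concerning both local and uniform modulus of continuity, Lévy proved the following result:
\[
\begin{aligned}
\lim_{h \to 0}\ \sup_{0 \le s \le 1 - h}\ \sup_{0 \le t \le h} \frac{|W(s + t) - W(s)|}{\sqrt{2h \log(1/h)}} &\overset{a.s.}{=} 1, \\
\lim_{h \to 0}\ \sup_{0 \le s \le 1 - h} \frac{|W(s + h) - W(s)|}{\sqrt{2h \log(1/h)}} &\overset{a.s.}{=} 1.
\end{aligned}
\tag{10.4.18}
\]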

We refer to Csörgő and Révész [1981], Theorem 1.1.1, for a thorough treatment of the asymptotic properties of the increments of $W$. The central role of the Brownian motion can be illustrated by the powerful randomization procedure introduced by Skorokhod, which we shall describe because of its usefulness and its wide range of application.

The Skorokhod embedding. Let $W = \{W(t), t \ge 0\}$ denote a standard Brownian motion. Any centered measure $\mu$ on the real line embeds into $W$: there exists a stopping time $\tau$ such that $W(\tau) \overset{D}{=} \mu$, and further $\{W(\tau \wedge t) : t \ge 0\}$ is a uniformly integrable martingale. In fact $\tau$ is the first exit time of $W$ from a random interval containing 0. An explicit construction of $\tau$, which is the Skorokhod stopping time, is given in [Sawyer: 1974], Section 2, see also [Obloj: 2004], p. 332. This has proved to be an extremely fertile idea, which usually applies as follows. Let $0 < \eta < 1$ and set $A_\eta = \{|\tau - E\tau| \le \eta E\tau\}$. Assuming $E\tau < \infty$, one first controls the set $A_\eta^c$ by showing, via suitable use of Tchebycheff's inequality, that its probability is small. Additional knowledge on the moments $\|\tau - E\tau\|_p$ is then required. Next, on the set $A_\eta$, the problem studied is transferred to a "Brownian environment", by translating it into local properties (on the interval $](1 - \eta)E\tau, (1 + \eta)E\tau[$) of the sample paths of $W$, which are generally tractable. For the first step, the Burkholder, Davis, Gundy and Millar inequalities (see Proposition 2.1 in [Obloj: 2004] and [Davis: 1976], p. 697, or estimates (1.10) in [Sawyer: 1974]) are useful. For any $1 \le p < \infty$, there exist universal constants $c_p$, $C_p$ such that
\[ c_p\, E\,\tau^{p/2} \le E\,|W(\tau)|^p = \int_{\mathbb{R}} |x|^p\,\mu(dx) \le C_p\, E\,\tau^{p/2}. \]

A careful analysis of the integrability properties of τ is made in [Sawyer: 1974] (see Theorem 1). For instance, for any α ≥ 0, there is a constant Cα depending on α only such that 1/2 2 E e(ατ ) ≤ Cα eα|x| μ(dx). R

When $\mu$ is a symmetric measure, the latter estimate is even two-sided ([Sawyer: 1974], Theorems 2–3). See also Lemma A.2 on p. 272 in Hall and Heyde [1980]. Another important construction has been given in [Fisher: 1992], which turns out to be well adapted for treating questions involving weighted sums of i.i.d. random variables. Fisher's construction takes care of the "scale change" role played by the weights, and in turn uses the fact that if $\xi$ is a real random variable satisfying $E\xi^2 < \infty$ and $E\xi = 0$, and $\lambda$ is a fixed positive real, then on a possibly larger probability space, there exist a Brownian motion $W$ and stopping times $T$ and $T_\lambda$ such that
\[ W(T) \overset{D}{=} \xi, \qquad W(T_\lambda) \overset{D}{=} \lambda\xi, \qquad T_\lambda \overset{D}{=} \lambda^2 T. \]

This applies as follows. Let $w = \{w_\ell, \ell \ge 1\}$ be a sequence of positive real numbers, $\xi = \{\xi_\ell, \ell \ge 1\}$ be centered i.i.d. random variables with unit variance, and denote $\Upsilon_N = \sum_{\ell=1}^{N} w_\ell \xi_\ell$. Then there exists a probability space with a Brownian motion $\{W(t), t \ge 0\}$ and non-negative i.i.d. random variables $\{\tau_\ell, \ell \ge 1\}$ with $E\tau_\ell = 1$, such that
\[ (\Upsilon_1, \Upsilon_2, \ldots, \Upsilon_N, \ldots) \overset{D}{=} \Big(W(w_1^2\tau_1),\ W(w_1^2\tau_1 + w_2^2\tau_2),\ \ldots,\ W\Big(\sum_{\ell=1}^{N} w_\ell^2\tau_\ell\Big),\ \ldots\Big), \]
and, moreover, for each real number $r \ge 1$, $E(\tau_1)^{r/2} \le C_r^r\, E(|\xi_1|^r)$, where $C_r^r = 2(8/\pi^2)^{r-1}\,\Gamma(r + 1)$. See Fisher [1992; Theorem 2.2] and Lin–Weber [2009; Theorem 3.6], in which a more direct approach than Fisher's is proposed, on the basis of an idea due to Breiman.


Problem 11. Let $Z$ be a centered square integrable random variable. Let $x$ be some arbitrary real number. Show that for each $\eta > 0$,
\[ P\{Z = x\} \le P\Big\{\Big|\frac{T}{ET} - 1\Big| > \eta\Big\} + P\Big\{\inf_{|t - ET| \le \eta ET} |W(t) - x| = 0\Big\}, \]
where $T$ is a stopping time such that $W(T) \overset{D}{=} Z$ and $ET = EZ^2$. Deduce that there exists an absolute constant $C$ such that for every real $x$,
\[ P\{Z = x\} \le \inf_{\substack{0 < \eta < 1/2 \\ s \ge 1}} \Big(\frac{1}{\eta^s}\, E\Big|\frac{T - ET}{ET}\Big|^s + C\eta^{1/2}\, e^{-\frac{x^2}{8ET}}\Big). \]

10.5 Oscillations of Gaussian Stein's elements

We already met this remarkable class in the proof of the continuity principle and of the entropy criteria in Chapters 5 and 6, where they played a crucial role. Their relative compactness properties also allow us to obtain various refinements of Bourgain's entropy criteria (see [Weber: 1998b]). Consider a measurable ergodic dynamical system $(X, \mathcal{A}, \mu, T)$. Let also $\{g_n, n \ge 1\}$ be i.i.d. $N(0,1)$ distributed random variables defined on a joint probability space $(\Omega, \mathcal{B}, P)$. To each element $f \in L^p(\mu)$ we associate the $L^p(\mu)$-valued random sequence:
\[ \forall J \ge 1,\ \forall (\omega, x) \in \Omega \times X, \qquad F_J(\omega, x) = F_{J,f}(\omega, x) = \frac{1}{\sqrt{J}} \sum_{j \le J} g_j(\omega)\, f(T^j x). \]
Although the role of these Gaussian elements in ergodic theory is self-evident, the study of their behavior not only contributes to the general theory of Gaussian random functions, but also helps further developments of the above mentioned results. In this section, we adopt a "$J$-trajectory approach" (with fixed $x \in X$ and varying integer parameter $J$). From this point of view, the observed structure is similar to that of the averages related to the central limit theorem. We shall obtain a nearly complete picture of the properties of their oscillations and describe the weak and strong convergence properties of associated sojourn times.

First, some elementary considerations are necessary. Since $\|F_J(\cdot, x)\|_{2,P}^2 = \frac{1}{J} \sum_{j=1}^{J} (f(T^j x))^2$, by Birkhoff's ergodic theorem $\mu\{\lim_{J \to \infty} \|F_J(\cdot, x)\|_{2,P} = \|f\|_{2,\mu}\} = 1$. Thus for any real $M$,
\[ P\Big\{\limsup_{J \to \infty} |F_J(\cdot, x)| > M\Big\} \ge P\big\{|N(0, \|f\|_{2,\mu})| > M\big\} > 0, \]
$\mu$ a.s. $x$. And since the law of $\limsup_{J \to \infty} |F_J(\cdot, x)|$ does not depend on the first $g_n$'s, this implies by the 0-1 law that $P\{\limsup_{J \to \infty} |F_J(\cdot, x)| = \infty\} = 1$, $\mu$ a.s. $x$. The regularity of the $F_J$'s will thus be reflected by the magnitude of their oscillations. The study of these oscillations is the main purpose of the present work. We introduce some


convenient notation: Jˆ = Jˆf (x) =

J

(f T j (x))2 ,

AJ = AJ,f (x) = Jˆ/J ,

j =1

Af = Af (x) = sup AJ , J ≥1

f = f (x) =

∞

(AJ +1 − AJ )2

1/2 .

J =1
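The sums $\hat{J} = \sum_{j \le J} (f(T^j x))^2$ and the averages $A_J = \hat{J}/J$ can be computed explicitly in a simple case. The Python sketch below is only an illustration under assumptions that are not in the text: it takes the irrational rotation $Tx = x + \theta \pmod 1$ with $\theta = \sqrt{2} - 1$ and $f(t) = \cos(2\pi t)$, for which $\|f\|_{2}^2 = 1/2$, so that Birkhoff's theorem predicts $A_J \to 1/2$. The function name is hypothetical.

```python
import math


def ergodic_square_averages(f, theta, x, n_terms):
    """Return [A_1, ..., A_n] with A_J = (1/J) * sum_{j=1}^J f(T^j x)^2
    for the rotation T x = x + theta (mod 1)."""
    averages = []
    running = 0.0                      # running value of J-hat
    for j in range(1, n_terms + 1):
        point = (x + j * theta) % 1.0
        running += f(point) ** 2
        averages.append(running / j)   # A_J is also the variance of F_J(., x)
    return averages


def cosine(t):
    return math.cos(2.0 * math.pi * t)


if __name__ == "__main__":
    theta = math.sqrt(2.0) - 1.0
    averages = ergodic_square_averages(cosine, theta, x=0.1, n_terms=10000)
    print("A_J at J = 10, 100, 10000:", averages[9], averages[99], averages[-1])
    print("sup over J <= 10000 of A_J:", max(averages))
```

Since $A_J$ is the variance of the Gaussian variable $F_J(\cdot, x)$, the output shows the stabilisation of that variance along the orbit of $x$.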

10.5.1 Theorem (Boundedness of the oscillations). Let {Jk , k ≥ 1} be an increasing sequence of positive integers. If for some M > 0, the series Q1 =

∞

exp{−MJk /(Jk+1 − Jk )}

k=1

converges, then Q2 = supk Jk+1Jk−Jk < ∞ and for each f ∈ L2 (X, μ), we have E sup sup |Fθ1 ,f − Fθ2 ,f | dμ ≤ K, (10.5.1) X

k

θ1 ,θ2 ∈[Jk ,Jk+1 ]

3/4 3/4 where the finite constant K does depend on M, Q1 , Q2 , Af dμ, and f dμ only. In particular, we have μ × P sup sup |Fθ1 ,f − Fθ2 ,f | < ∞ = 1. (10.5.2) k

θ1 ,θ2 ∈[Jk ,Jk+1 ]
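The summability condition on $Q_1$ in Theorem 10.5.1 is easy to test on concrete block sequences. The following Python sketch is a hedged illustration: the two sequences $J_k = k^2$ and $J_k = 2^k$ and the function names are assumptions chosen for the example. It prints partial sums of $Q_1$ together with the partial supremum defining $Q_2$.

```python
import math


def q1_partial_sum(J, M, n_terms):
    """Partial sum over k = 1..n_terms of exp(-M * J_k / (J_{k+1} - J_k))."""
    return sum(math.exp(-M * J(k) / (J(k + 1) - J(k))) for k in range(1, n_terms + 1))


def q2_partial_sup(J, n_terms):
    """Supremum over k = 1..n_terms of (J_{k+1} - J_k) / J_k."""
    return max((J(k + 1) - J(k)) / J(k) for k in range(1, n_terms + 1))


# Hypothetical block sequences (illustration only).
def polynomial(k):
    return k * k        # relative block size (2k + 1) / k^2 tends to 0


def geometric(k):
    return 2 ** k       # relative block size is constant, equal to 1


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, q1_partial_sum(polynomial, 1.0, n), q1_partial_sum(geometric, 1.0, n))
    print("Q2 (polynomial):", q2_partial_sup(polynomial, 1000))
```

For $J_k = k^2$ the terms decay like $e^{-Mk/2}$ and the partial sums stabilise, while for $J_k = 2^k$ every term equals $e^{-M}$, so that $Q_1$ diverges for every $M > 0$ and the sufficient condition of the theorem is not met.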

The size of blocks in this statement is nearly the best possible. Indeed, we will also prove

10.5.2 Theorem. Let $\{J_k, k \ge 1\}$ be an increasing sequence of positive integers satisfying the two following assumptions:
(H1) the sequence $\{J_{k+1} - J_k, k \ge 1\}$ is nondecreasing,
(H2) the sequence $\big\{\frac{J_{k+1} - J_k}{J_k}, k \ge 1\big\}$ is nonincreasing.
Assume that there exists some ergodic dynamical system $(X, \mathcal{A}, \mu, T)$ and $f \in L^2(X, \mu)$, $f \ne 0$, such that (10.5.2) holds. Then, for some positive real $M$,
\[ Q_1 = \sum_{k=1}^{\infty} \exp\{-MJ_k/(J_{k+1} - J_k)\} < \infty. \tag{10.5.3} \]

These results on oscillations can be complemented by a study of the sojourn time of the sequence FJ in a given measurable subset ⊂ R1 . Consider for large the frequencies 1 d (, x, ω) = 1{FJ (x,ω)∈} . (10.5.4) J =1


10.5.3 Proposition (Invariance principle). Let f ∈ L2 (X, μ) with f 2 = 1. Let ⊂ R be such that λ(∂) = 0. Then, for μ-almost all x ∈ X,

W (t) D lim d (, x, · ) = I = λ 0 ≤ t ≤ 1 : √ ∈ , →∞ t

(10.5.5)

where {Wt , t ≥ 0} is the Wiener process. As a corollary we get 10.5.4 Corollary. For any interval , we have μ × P (x, ω) : lim inf d (, x, ω) = 0, lim sup d (, x, ω) = 1 = 1. →∞

→∞

Oscillations – sufficient conditions. In this part, we prove Theorem 10.5.1. By the maximal Lemma 4.1.2, the maximal operator A is weak-(2,1): for any nonnegative real B, Bμ{Af > B} ≤ f 22 .

(10.5.8)

According to Theorem 4.2.4, we also know that the second operator is strong-(2,2):

f 2 ≤ C f 2 ,

(10.5.9)

where C is an absolute constant. This clearly shows that the constant K occurring in (10.5.2) depends on M, Q1 , Q2 , and f 2 only. We can now pass to the Proof of Theorem 10.5.1. Fix some x ∈ X, and let W ( · ) = W x ( · ) be a Wiener process such that for any J , W (Jˆ) =

J

f (T j x)gj = J 1/2 FJ .

(10.5.10)

j =1

Then, for any integer k and θ1 , θ2 ∈ [Jk , Jk+1 ] we have Fθ1 − Fθ2 −1/2

= θ1

−1/2 −1/2 −1/2 W (θˆ1 ) − θ2 W (θˆ2 ) + (θ2 − θ2 )W (θˆ1 )

−1/2

= (θ1 ≤

−1/2

− θ2

−3/2 (Jk+1 2−1 Jk

−1/2

)W (θˆ1 ) + θ2

(W (θˆ1 ) − W (θˆ2 ))

− Jk )

|W (u)| + 2Jk

sup u∈[0,Jˆk+1 ]

−1/2

sup u∈[Jˆk ,Jˆk+1 ]

|W (u) − W (Jˆk )|.


Concerning the first half of the last expression, we have 1 −3/2 (Jk+1 − Jk ) J 2 k

sup |W |

[0,Jˆk+1 ]

1/2

1 Jk+1 − Jk Jk+1 Jˆk+1 ˆ− 21 = J k+1 1 1/2 2 Jk Jk J 2 1/2

k+1

1 Jk+1 − Jk ≤ 2 Jk

=K

Jk+1 − Jk Jk

1/2 1/2

sup u∈[0,Jˆk+1 ]

Q2 (Q2 + 1)

A

|W (u)|

1/2

Jˆk+1

sup u∈[0,Jˆk+1 ]

|W (u)|

A1/2 sup |W1,k (u)|, u∈[0,1]

where W1,k is a Wiener process. Concerning the second half, we observe −1/2

2Jk

sup u∈[Jˆk ,Jˆk+1 ]

|W (u) − W (Jˆk )| =

ˆ J k+1 − Jˆk 1/2

Jk

sup |W2,k (u)|, u∈[0,1]

where W2,k is another Wiener process. Moreover, Jˆk+1 Jˆk+1 Jk+1 − Jk Jˆk+1 Jˆk+1 − Jˆk + − = Ak+1 − Ak + Jk Jk+1 Jk+1 Jk Jk+1 Jk+1 − Jk ≤ |Ak+1 − Ak | + A. Jk Putting now all our estimations together, leads us to

E sup

sup

k θ1 ,θ2 ∈[Jk ,Jk+1 ]

|Fθ1 − Fθ2 | ≤ KA1/2 E sup k

Jk+1 − Jk Jk

1/2

sup |W1,k (u)| 0≤u≤1

+ 2E sup |Ak+1 − Ak |1/2 sup |W2,k (u)|, k

0≤u≤1

(10.5.11) where Wi,k are Wiener processes (there is no assumption concerning their mutual independence). Now, we are ready to apply the following lemma which goes back to more general results on Gaussian processes. Applying then (10.5.11) to the first part of (10.5.10), with the choices m = M, σk = ( Jk+1Jk−Jk )1/2 , produces a bound equal to KA1/2 . Applying next (10.5.11) to the second half of (10.5.10) with the choices m = 1, σk = |Ak+1 − Ak |1/2 , σ ≤ A1/2 , also leads to the bound ∞ 1/2 exp{−|Ak+1 − Ak |−1 } σ + KA1/2 + K. KA1/2 2 log+ k=1


with u = |Ak+1 − Ak |−1 and thus Now, we apply the obvious inequality e−u ≤ u−2 ∞ 2 replace the sum in the last expression by = k=1 |Ak+1 − Ak |2 . It remains then to study the integral X

A(x)1/2 [log+ (x) + 1]μ(dx).

We use the inequality log+ ≤ 21/4 , and next apply Hölder’s inequality, which provides 2/3 1/3 A(x)1/2 (x)1/4 μ(dx) ≤ A(x)3/4 μ(dx) (x)3/4 μ(dx) ≤ K. X

X

X

Theorem 10.5.1 is thus proved. Oscillations – necessary conditions. In this part, we prove Theorem 10.5.2. We split the proof in four steps. (1) Exponential consolidation of the sequence {Jk , k ≥ 1}. Under assumption (H2 ), b = supk Jk+1Jk−Jk < ∞. Put B = (b + 1)2 and for each integer l, J = k : Jk ∈ [B , B +1 ) ,

N = #(J ).

Let k, k ∈ J with k ≤ k . Then, by (H1 ), (H2 ), we have Jk +1 − Jk Jk · Jk · Jk B Jk+1 − Jk Jk ≤ · Jk · ≤ B(Jk+1 − Jk ). Jk B

Jk+1 − Jk ≤ Jk +1 − Jk ≤

Thus, max(Jk+1 − Jk ) ≤ B min (Jk+1 − Jk ). k∈J

(10.5.12)

k∈J

By the definition of B we also have for each k ∈ Jl , Jk+1 = Jk + (Jk+1 − Jk ) ≤ Jk (1 + b) ≤ Jk B 1/2 ≤ B +3/2 , Jk = Jk−1 + (Jk − Jk−1 ) ≤ Jk−1 (1 + b) ≤ Jk−1 B 1/2 . It follows that supk∈Jl Jk+1 ≤ B +3/2 and inf k∈Jl Jk ≤ B +1/2 . Hence the following chain of inequalities is true: N maxk∈J (Jk+1Jk−Jk )

≥

N · mink∈J (Jk+1 − Jk ) B +1

N · maxk∈J (Jk+1 − Jk ) ≥ ≥ B +2 √ 1 B +1 − B + 2 B− B ≥ = . B2 B +2

k∈J (Jk+1 B +2

− Jk )

(10.5.13)


Similarly, we also have N mink∈J (Jk+1Jk−Jk )

N · maxk∈J (Jk+1 − Jk ) B · N · mink∈J (Jk+1 − Jk ) ≤ l B Bl B · k∈J (Jk+1 − Jk ) B(B +3/2 − B l ) ≤ ≤ ≤ B(B 3/2 − 1). Bl Bl (10.5.14) ≤

Consequently, condition (10.5.3) can be rewritten in the following more convenient form: for some M ∈ (0, ∞), N e−MN < ∞. (10.5.3∗ ) ≥1

Indeed (10.5.3) ⇐⇒

∞

e−MJk /(Jk+1 −Jk ) < ∞ ⇐⇒

k=1

l

and thus (10.5.3) implies

N e

−

2 MB √ N B− B

e−MJk /(Jk+1 −Jk ) < ∞,

k∈J

< ∞.

l≥1

In the opposite direction, we also have 3/2 (10.5.3∗ ) ⇐⇒ N e−MB(B −1)mink∈J Jk /(Jk+1 −Jk ) < ∞ l≥1

"⇒

e

Jk k+1 −Jk

−MB(B 3/2 −1) J

<∞

l≥1 k∈J

⇐⇒

e

Jk k+1 −Jk

−MB(B 3/2 −1) J

< ∞.

k

(2) Reduction to bounded functions. Without loss of generality, we can indeed assume that our function f belongs to L∞ (X, μ). This follows from the comparison Lemma 10.2.3, by means of which (10.5.2) implies that the same property holds for the sequence (FJ,fA ) where fA = f 1{|f |≤A} . Consequently, we can and do assume in what follows that f ∞ < ∞. In this case, we also have f 2 ∈ (0, ∞). (3) Convergence of some random series. By our assumption (10.5.2), there is a measurable set X of x’s of unit μ-measure, such that for any x ∈ X , P{supk |FJk+1 − FJk | < ∞} = 1. We fix the variable x in X . Then, writing FJ (x) = FJ , J ≥ 1, in what follows, we have for each A large enough, 0 < P{sup |FJk+1 − FJk | ≤ A}. k


Let us consider the conditional variance σk2 = Var(FJk+1 − FJk |{FJk +1 − FJk , 1 ≤ k < k}). Then σk2 ≥ Var(FJk+1 − FJk |{gj , 1 ≤ j ≤ Jk }) =

Jk+1

1 Jk+1

f (T j x)2 =

j =Jk +1

Jˆk+1 − Jˆk . Jk+1

By Proposition 3.3 in [Lifshits: 1995], it follows that ∞ ( P sup |FJk+1 − FJk | ≤ A ≤ P σk |ζ | ≤ A , k

k=2

where ζ denotes a standard N (0, 1) random variable. And thus which implies 2 ˆ ˆ e−A Jk+1 /2(J k+1 −J k ) < ∞.

k

e−A

2 /2σ 2 k

< ∞,

(10.5.15)

k

To conclude, we only have to replace Jˆk by Jk in the above expression. (4) Conclusion. We shall deduce (10.5.3∗ ) from (10.5.15). We fix and put k = Jˆk+1 − Jˆk , k ∈ J . Let {∗k , 1 ≤ k ≤ N } be the nondecreasing rearrangement of the sequence {k , 1 ≤ k ≤ N }. Fix also some ε > 0. According to Birkhoff’s theorem, ˆ the following inequality JJkk − f 22 ≤ ε, holds for μ-almost all x in X provided that k is large enough, say k ≥ k0 (x). Moreover

Jk+1

k =

j =Jk +1

f (T j x)2 ≤ (Jk+1 − Jk ) f 2∞ .

Then N

∗k =

k=1

k =

k∈J

(Jˆk+1 − Jˆk ) = sup Jˆk+1 − inf Jˆk k∈J

k∈J

k∈J

≥ ( f 2 − ε) sup Jk+1 − ( f 2 + ε) inf Jk k∈J

k∈J

+ 21

≥ ( f 2 − ε)B +1 − ( f 2 + ε)B √ √ # " ≥ f 2 (B − B) − ε(B + B) B = CB , with C > 0, provided that ε is chosen small enough, which we do. Let now 0 < α < 1 be fixed. Then N

∗k =

(1−α)N

∗k +

N

k=(1−α)N +1 ∗ N (1−α)N + αN f 2∞

∗k ≤ N ∗(1−α)N + αN · sup k k∈J

k=1

k=1

≤

sup (Jk+1 − Jk ). k∈J


Recall, according to (10.5.12), that N · sup (Jk+1 − Jk ) ≤ BN inf (Jk+1 − Jk ) ≤ B k∈J

k∈J

(Jk+1 − Jk )

k∈J

≤ B(B +3/2 − B ) = (B 5/2 − B)B . We thus have N ∗(1−α)N ≥

N

∗k − α f 2∞ (B 5/2 − B)B

k=1

≥ [C − α f 2∞ (B 5/2 − B)]B = C1 B , where C1 > 0, provided that α is chosen sufficiently small, which we do assume. The implication (10.5.15) ⇐⇒ (10.5.3∗ ) finally results from the following estimates: k∈J

e

−

A2 Jk+1 2(Jˆk+1 −Jˆk )

≥

e

2 B +3/2 2k

−A

e

2 B +3/2 2∗ k

−A

2 +3/2

≥ αN e

A B − 2 ∗

(1−α)N

(1−α)N ≤k≤N

k∈J

≥ αN e

≥

A2 B 3/2 B N − 2C1 B

2 3/2

= αN e

B − A 2C

1

N

.

Densities. In this part, we give the proofs of Proposition 10.5.3 and its Corollary 10.5.4. We start with the Proof of Proposition 10.5.3. By virtue of Birkhoff’s ergodic theorem, Jˆ(x) = 1, J →∞ J lim

(10.5.16)

μ-almost surely. Fix an x satisfying the above property. We will use the natural embedding of FJ into the Wiener process. More precisely, if W˜ is a Wiener process, then we have the equalities of the laws J D FJ (x, · ), J ≥ 1 = J −1/2 W˜ f (T j x)2 , J ≥ 1 = J −1/2 W˜ (Jˆ(x)), J ≥ 1

j =1

D = (J /)−1/2 W (Jˆ(x)/), J ≥ 1 ,

where W (u) = W˜ (u)−1/2 also is a Wiener process. Thus, D

d (, x, · ) = dW =

1 1{(J /)−1/2 W (Jˆ(x)/)∈} . J =1


It will be more convenient to work with the object W dˆ = −1

J =1

1{(Jˆ/)−1/2 W (Jˆ(x)/)∈} .

W This can be viewed as dˆ = λ (V ), where λ = λ (x) = −1 J =1 δθj / is a deterministic nonnegative measure on R+ , in the definition of which δa stands for the Dirac measure at the point a and V = V (ω) = {t ∈ [0, 1] : t −1/2 W (t) ∈ }. Then as a direct consequence of (10.5.16), we have that λ converges weakly to the restricted Lebesgue measure λ(1) (dt) = 1[0,1] (t)dt, as tends to infinity. Since P{λ(1) (∂A) = 0} = 1, weak convergence implies W D dˆ = λ (V ) −→ λ(1) (V ) = I,

(10.5.17)

W

almost surely. We deduce that dˆ → I , almost surely. Moreover the property (10.5.16) together with the condition λ(∂) = 0 easily imply W lim E |dW − dˆ | = 0.

l→∞

D

D

It follows now from (10.5.17) that dW → I as tends to infinity. So d (, x, ·) → I as tends to infinity. Proof of Corollary 10.5.4. Let 0 < ε < 1 be fixed. It follows from Proposition 10.5.3 that P ω : lim sup d (, x, ω) ≥ 1 − ε ≥ lim sup P ω : d (, x, ω) ≥ 1 − ε →∞

→∞

= P{I ≥ 1 − ε} > 0. And by applying the 0-1 law we show that P ω : lim sup→∞ d (, x, ω) = 1 = 1. The proof is thus achieved.

10.6 Tightness of Gaussian Stein's elements

We continue the study of the Gaussian Stein sequences undertaken in the preceding section. We now examine their tightness properties in two essential cases: the spaces $L^p(\mathbb{T})$, $1 < p < \infty$, and the space $C(\mathbb{T})$ of continuous functions on the torus $\mathbb{T}$. We begin with a useful criterion, which is in fact a corollary of a general result of Skorokhod (see Fernique [1985: Lemma 1.3]).


10.6.1 Proposition. Let $\{g_n, n \in \mathbb{N}\}$ be a sequence of Gaussian measures defined on a separable Banach space $B$. Assume that $\{g_n, n \in \mathbb{N}\}$ converges to $g_0$ for the weak topology of measures on $B$. Then, there exists a Gaussian vector $\{x_n, n \in \mathbb{N}\}$ with values in $B^{\mathbb{N}}$, such that
(a) $\lim_{n \to \infty} x_n = x_0$ in $L^r(B)$, for all $r \ge 0$,
(b) $L(x_n) = g_n$, $n = 1, 2, \ldots$.

The next proposition is useful for studying the relative compactness of the Gaussian Stein sequence. Let $(G, d)$ be a compact metric space and let $\tau : G \to G$ be continuous and such that the following properties are satisfied:
(a) $(G, \tau)$ is a minimal system.
(b) $d(\tau u, \tau v) = d(u, v)$ for any $u, v \in G$.
Let $\mu$ be a Borel measure on $G$ which is left invariant under the action of $\tau$. For any $x$, let $V_\varepsilon(x) = \{u \in G : d(u, x) \le \varepsilon\}$. Then (a) and (b) imply $\mu(V_\varepsilon(x)) = \mu(V_\varepsilon(0))$. Let $1 \le p \le \infty$ be fixed and put for any $f \in L^p(\mu)$, any $x \in G$ and any $\varepsilon > 0$,
\[ f^{(\varepsilon)}(x) = \frac{1}{\mu(V_\varepsilon(0))} \int_{V_\varepsilon(x)} f(u)\,d\mu(u). \]
The following criterion of relative compactness in $L^p(\mu)$ is due to Kolmogorov [1985: 148].

10.6.2 Proposition. Let $F$ be a subset of $L^p(\mu)$. Then $F$ is relatively compact in $L^p(\mu)$ if and only if the two following conditions are fulfilled:
(a) $\sup_{f \in F} \|f\|_{p,\mu} < \infty$.
(b) For any $\delta > 0$, there exists $\varepsilon > 0$ such that $\sup_{f \in F} \|f - f^{(\varepsilon)}\|_{p,\mu} \le \delta$.

From this criterion, one can deduce that the associated Gaussian Stein's sequence
\[ \forall f \in L^p(\mu)\ \forall J \ge 1, \qquad F^{\tau}_{J,f} = \frac{1}{\sqrt{J}} \sum_{j \le J} g_j\, f \circ \tau^j \]
is, for any $2 \le p < \infty$ and $f \in L^p(\mu)$, relatively compact in $L^p(\mu)$. This allows us to establish a delicate extension of Bourgain's entropy criterion (Corollary 5.2.7 and Theorem 5.2.4 in [Weber: 1998b]). We specify in what follows $G = \mathbb{T}$ and $\mu = \lambda$, the normalized Lebesgue measure on $\mathbb{T}$.

10.6.3 Theorem. Let $\{S_n, n \ge 1\}$ be a sequence of $L^2(\mu)$–$L^\infty(\mu)$ contractions commuting with rotations. Assume that the property $(C_p)$ is realized. Then for any $f \in L^p(\mu)$, the set $C_f$ is a GC set of $L^2(\mu)$.

In the proof of this result, the tightness properties of the sequence $\{F^{\tau}_{J,f}, J \ge 1\}$, where $\tau$ is an irrational rotation ($\tau x = x + \vartheta \bmod 1$, $\vartheta$ irrational), are crucial. We shall exhibit more general classes inspired by this example, and establish their tightness in $L^p(\mathbb{T})$ or $C(\mathbb{T})$. We will also study examples of non-tightness.

Consider the family of all triangular arrays $\Lambda = \{\lambda_{J,j}, 1 \le j \le J, J \ge 1\}$ with $\lambda_{J,j} \in [0, 1]$ for all $j$ and $J$. Let $\{g_j, j \ge 1\}$ be a sequence of independent $N(0, 1)$ distributed random variables defined on a common probability space $(\Omega, \mathcal{A}, \mathbf{P})$. We study the tightness properties of the families of random elements

$$F_{J,f,\Lambda}(x) = \frac{1}{\sqrt{J}} \sum_{j=1}^{J} g_j f(x + \lambda_{J,j}) \tag{10.6.1}$$

in $L^p(\mathbb{T})$ or $C(\mathbb{T})$. The symbol “+” denotes addition in the additive group $\mathbb{T} = \mathbb{R}/\mathbb{Z} = [0, 1)$. Two types of arrays are of special interest: the sequences $\lambda_{J,j} = \lambda_j$ and the array corresponding to randomized Riemann sums, $\lambda_{J,j} = j/J$.

Tightness in $L^p$.

Let $p \in [1, \infty]$. Put, for $f \in L^p(\mathbb{T})$,

$$\omega_f(u) = \sup_{0 \le h \le u} \|f(\cdot + h) - f(\cdot)\|_p. \tag{10.6.2}$$

The modulus of continuity of a function $f \in C(\mathbb{T})$ coincides with that of the space $L^\infty(\mathbb{T})$.

10.6.4 Theorem. Let $p \in [1, \infty)$ and $F$ be a subset of $L^p(\mathbb{T})$. Then $F$ is relatively compact if and only if

$$\sup_{f \in F} \|f\|_p < \infty \quad\text{and}\quad \lim_{u \to 0} \sup_{f \in F} \omega_f(u) = 0.$$

This $L^p$ version of the Arzela–Ascoli theorem (for a proof see [Dunford–Schwartz: 1958], p. 298) is a very convenient criterion. It will not be applied directly, but it helps to better understand the following criterion of tightness of a family of measures.

10.6.5 Theorem. Let $p \in [1, \infty)$. A family $\mathcal{F}$ of random functions with sample paths in $L^p(\mathbb{T})$ is tight if and only if

$$\lim_{M \to \infty} \sup_{F \in \mathcal{F}} \mathbf{P}\{\|F\|_p > M\} = 0 \quad\text{and, for any } \varepsilon > 0,\quad \lim_{u \to 0} \sup_{F \in \mathcal{F}} \mathbf{P}\{\omega_F(u) > \varepsilon\} = 0.$$

The criterion yields the simplified Gaussian version.

10.6.6 Theorem. Let $p \in [1, \infty)$. A family $\mathcal{F}$ of centered Gaussian random functions with sample paths in $L^p(\mathbb{T})$ is tight if
(i) $\sup_{F \in \mathcal{F}} E\|F\|_p^p < \infty$ and
(ii) $\lim_{u \to 0} \sup_{F \in \mathcal{F}} E\,\omega_F(u)^p = 0$.

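In our case, concerning (i) we have the estimate

$$E\|F_{J,f,\Lambda}\|_p^p = E\int_{\mathbb{T}} |F_{J,f,\Lambda}|^p\, d\lambda = J^{-p/2} \int_{\mathbb{T}} E\Big|\sum_{j=1}^{J} f(x + \lambda_{J,j})\, g_j\Big|^p \lambda(dx) = c_p J^{-p/2} \int_{\mathbb{T}} \Big(\sum_{j=1}^{J} f^2(x + \lambda_{J,j})\Big)^{p/2} \lambda(dx),$$

with $c_p = E|g_1|^p$. If $p \ge 2$, the discrete Hölder inequality yields

$$\sum_{j=1}^{J} f^2(x + \lambda_{J,j}) \le \Big(\sum_{j=1}^{J} |f|^p(x + \lambda_{J,j})\Big)^{2/p} \Big(\sum_{j=1}^{J} 1\Big)^{1-2/p} = \Big(\sum_{j=1}^{J} |f|^p(x + \lambda_{J,j})\Big)^{2/p} J^{1-2/p}.$$

Hence

$$E\|F_{J,f,\Lambda}\|_p^p \le c_p J^{-1} \int_{\mathbb{T}} \sum_{j=1}^{J} |f(x + \lambda_{J,j})|^p\, \lambda(dx) = c_p \|f\|_p^p. \tag{10.6.3}$$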
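The closure inequality (10.6.3) can be checked numerically. The sketch below is only an illustration under stated assumptions: the torus is replaced by a uniform grid of `N` points, the shifts are grid offsets (so the discrete measure stays shift invariant and the Hölder step holds exactly), and the function `f` and the shift array are hypothetical choices. The names `gaussian_abs_moment`, `expected_norm_p` and `norm_p_pow` are not from the text.

```python
import math

def gaussian_abs_moment(p):
    """c_p = E|g|^p for g ~ N(0,1), i.e. 2^(p/2) * Gamma((p+1)/2) / sqrt(pi)."""
    return 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

def expected_norm_p(f_vals, offsets, p):
    """c_p J^(-p/2) * mean over the grid of (sum_j f(x + lambda_j)^2)^(p/2).

    f_vals[i] stands for f(i/N); offsets[j] = N * lambda_j is an integer grid shift.
    For Gaussian weights g_j this equals E||F||_p^p on the discretized torus.
    """
    n, J = len(f_vals), len(offsets)
    total = 0.0
    for i in range(n):
        square_sum = sum(f_vals[(i + s) % n] ** 2 for s in offsets)
        total += square_sum ** (p / 2)
    return gaussian_abs_moment(p) * J ** (-p / 2) * total / n

def norm_p_pow(f_vals, p):
    """Discrete version of ||f||_p^p."""
    return sum(abs(v) ** p for v in f_vals) / len(f_vals)

N, J = 240, 8
# Hypothetical test function: f(x) = x^(-1/4) on (0, 1), set to 0 at x = 0.
f_vals = [(i / N) ** -0.25 if i > 0 else 0.0 for i in range(N)]
# Randomized Riemann sum array lambda_{J,j} = j/J, expressed as grid offsets.
offsets = [j * N // J for j in range(1, J + 1)]

for p in (2, 3, 4):
    lhs = expected_norm_p(f_vals, offsets, p)
    rhs = gaussian_abs_moment(p) * norm_p_pow(f_vals, p)
    print(p, lhs, rhs, lhs <= rhs * (1 + 1e-12))
```

For p = 2 the two sides coincide, since c_2 = 1 and no Hölder step is needed; for p > 2 the left side is in general strictly smaller.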

The latter inequality serves as a powerful instrument of “closure”. In the case $1 \le p < 2$, we still have a Hölder estimate

$$E\|F_{J,f,\Lambda}\|_p^p \le E\|F_{J,f,\Lambda}\|_2^p \le \big(E\|F_{J,f,\Lambda}\|_2^2\big)^{p/2} \le \|f\|_2^p, \tag{10.6.4}$$

which is not always efficient, especially for $f \in L^p(\mathbb{T}) \setminus L^2(\mathbb{T})$, but will be useful in the counterexample given in Section 10.6.9.

We show now that indicator functions $f_a = \chi_{[0,a)}$ generate tight families in $L^p(\mathbb{T})$, $1 \le p < \infty$. A closing procedure will enable us to extend this result to the class of arbitrary functions $f \in L^p$, $2 \le p < \infty$, while for $1 \le p < 2$ the general result is false.

10.6.7 Theorem. The family of random functions $\mathcal{F} = \{F_{J,f_a,\Lambda} : a \in [0, 1),\ \Lambda \text{ a triangular array as above},\ J \in \mathbb{N}\}$ is tight in each $L^p(\mathbb{T})$, $1 \le p < \infty$.

Proof. In order to keep transparent notation for intervals, we may consider, without loss of generality, only the case $a \le 1/2$. During this proof we will use the simplified notation $F$ for $F_{J,f_a,\Lambda}$. Our estimates will be uniform over these parameters. We apply Theorem 10.6.6. For the moments, we already have

$$E\|F\|_p^p \le c_q \|f_a\|_q^p \le c_q, \qquad q = \max(2, p). \tag{10.6.5}$$

This bound is uniform over $\mathcal{F}$. Now we pass to the modulus of continuity. Let $M \ge 5$ be a fixed integer and let $u = M^{-1}$. For each integer $k = 0, \ldots, M - 1$, let $t_k = k/M$

and $I_k = [t_k, t_k + 2u) = [t_k, t_{k+2})$. Then for each $x \in I_k$ we have

$$F(x) - F(t_k) = J^{-1/2} \Big( \sum_{j \le J:\ t_k < -\lambda_{J,j} \le x} g_j - \sum_{j \le J:\ t_k < a - \lambda_{J,j} \le x} g_j \Big) := J^{-1/2} \big( W_k^+(x) - W_k^-(x) \big).$$

And so

$$\sup_{x,y \in I_k} |F(x) - F(y)| \le J^{-1/2} \Big( \sup_{x,y \in I_k} |W_k^+(x) - W_k^+(y)| + \sup_{x,y \in I_k} |W_k^-(x) - W_k^-(y)| \Big).$$

The oscillations of the processes $W_k^+$, $W_k^-$ are controlled by the number of terms in the corresponding sums:

$$N_k^- = \#\{ j \le J : a - \lambda_{J,j} \in (t_k, t_{k+2}] \}, \qquad N_k^+ = \#\{ j \le J : -\lambda_{J,j} \in (t_k, t_{k+2}] \}.$$

Clearly

$$\max\Big\{ \sum_{k=0}^{M-1} N_k^+,\ \sum_{k=0}^{M-1} N_k^- \Big\} \le 2J.$$

The processes $W_k^+$, $W_k^-$ being sums of i.i.d. Gaussian random variables, we bound their local oscillation with the oscillation of a Brownian motion $W$. This goes as follows.

$$\begin{aligned}
E \sup_{x,y \in I_k} |W_k^+(x) - W_k^+(y)|^p
&= E \Big( \sup_{x \in I_k} W_k^+(x) + \sup_{y \in I_k} \big(-W_k^+(y)\big) \Big)^p \\
&\le 2^{p-1} E \Big( \sup_{x \in I_k} W_k^+(x)^p + \sup_{y \in I_k} \big(-W_k^+(y)\big)^p \Big) \\
&= 2^p E \sup_{x \in I_k} \big(W_k^+(x)\big)^p \\
&\le 2^p E \sup_{0 \le u \le N_k^+} \big(W(u)\big)^p \\
&= 2^p E |W(N_k^+)|^p = 2^p c_p (N_k^+)^{p/2}.
\end{aligned}$$

Similarly

$$E \sup_{x,y \in I_k} |W_k^-(x) - W_k^-(y)|^p \le 2^p c_p (N_k^-)^{p/2}.$$

With these estimates in hand, we now calculate the mean oscillations of the random functions $F$. For each $h \in [0, u]$, we have

$$\|F(\cdot + h) - F(\cdot)\|_p^p = \int_{\mathbb{T}} |F(x + h) - F(x)|^p\, \lambda(dx) = \sum_{k=0}^{M-1} \int_{t_k}^{t_k + u} |F(x + h) - F(x)|^p\, \lambda(dx) \le u \sum_{k=0}^{M-1} \sup_{x,y \in I_k} |F(x) - F(y)|^p$$

and

$$E\,\omega_F(u)^p = E \sup_{h \le u} \|F(\cdot + h) - F(\cdot)\|_p^p \le 2^p c_p J^{-p/2} u \sum_{k=0}^{M-1} \Big[ (N_k^+)^{p/2} + (N_k^-)^{p/2} \Big].$$

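If $p \ge 2$, we have

$$\sum_{k=0}^{M-1} \Big[ (N_k^+)^{p/2} + (N_k^-)^{p/2} \Big] \le \Big( \sum_{k=0}^{M-1} N_k^+ \Big)^{p/2} + \Big( \sum_{k=0}^{M-1} N_k^- \Big)^{p/2} \le 2 (2J)^{p/2},$$

and

$$E\,\omega_F(u)^p \le 2^{3p/2+1} c_p\, u \to 0 \quad \text{as } u \to 0,$$

uniformly over $F \in \mathcal{F}$. If $1 \le p < 2$, the estimate is not so good but still sufficient:

$$\begin{aligned}
\sum_{k=0}^{M-1} \Big[ (N_k^+)^{p/2} + (N_k^-)^{p/2} \Big]
&\le M \Big[ \Big( \sum_{k=0}^{M-1} \frac{N_k^+}{M} \Big)^{p/2} + \Big( \sum_{k=0}^{M-1} \frac{N_k^-}{M} \Big)^{p/2} \Big] \\
&\le M^{1-p/2} \Big[ \Big( \sum_{k=0}^{M-1} N_k^+ \Big)^{p/2} + \Big( \sum_{k=0}^{M-1} N_k^- \Big)^{p/2} \Big] \\
&\le 2 u^{p/2-1} (2J)^{p/2},
\end{aligned}$$

and

$$E\,\omega_F(u)^p \le 2^{3p/2+1} c_p\, u^{p/2} \to 0 \quad \text{as } u \to 0,$$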

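uniformly over $F \in \mathcal{F}$. Condition (ii) of Theorem 10.6.6 is verified, and applying that theorem completes the proof.

10.6.8 Corollary. Let $f$ be a function in $L^p$, $p \ge 2$. Then the family of random functions $\mathcal{F}_f = \{F_{J,f,\Lambda} : \Lambda \text{ a triangular array},\ J \in \mathbb{N}\}$ is tight in $L^p(\mathbb{T})$.

Proof. First let $f(x) = \chi_{[t,t+a]}(x)$, $0 \le t \le t + a \le 1$. Since $\chi_{[t,t+a]}(x + \lambda) = \chi_{[0,a]}(x + \lambda - t)$, shifting the arrays by $t$ shows that the elements of $\mathcal{F}_f$ belong, up to null sets, to the family $\{F_{J,f_a,\Lambda} : a \in [0, 1),\ \Lambda \text{ a triangular array},\ J \in \mathbb{N}\}$, and the tightness follows from Theorem 10.6.7. Recall that tightness is a property stable with respect to linear operations on random vectors (a linear combination of compact sets is a compact set). Therefore we obtain the result for any function of the type

$$g(x) = \sum_{k=0}^{M-1} \gamma_k\, \chi_{[k/M,(k+1)/M]}(x). \tag{10.6.6}$$

Let now $f$ be an arbitrary function. Condition (i) follows directly from the closure inequality (10.6.3). Next, for arbitrary $\varepsilon > 0$, choose $g$ as before such that $\|f - g\|_p \le \varepsilon$. Then

$$\omega_{F_{J,f,\Lambda}}(u) \le \omega_{F_{J,g,\Lambda}}(u) + 2 \|F_{J,f-g,\Lambda}\|_p$$

and, using $(a + b)^p \le 2^{p-1}(a^p + b^p)$ together with (10.6.3),

$$E\,\omega_{F_{J,f,\Lambda}}(u)^p \le 2^{p-1} E\,\omega_{F_{J,g,\Lambda}}(u)^p + 2^{2p-1} E\|F_{J,f-g,\Lambda}\|_p^p \le 2^{p-1} E\,\omega_{F_{J,g,\Lambda}}(u)^p + 2^{2p-1} c_p\, \varepsilon^p.$$

Thereby

$$\lim_{u \to 0} \sup_{J,\Lambda} E\,\omega_{F_{J,f,\Lambda}}(u)^p \le 2^{p-1} \lim_{u \to 0} \sup_{J,\Lambda} E\,\omega_{F_{J,g,\Lambda}}(u)^p + 2^{2p-1} c_p\, \varepsilon^p = 2^{2p-1} c_p\, \varepsilon^p.$$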

Since $\varepsilon$ can be arbitrarily small, we obtain (ii) for $f$.

For $1 \le p < 2$, this approach only yields the tightness of $\mathcal{F}_f$ in $L^p(\mathbb{T})$ for $f \in L^2(\mathbb{T})$. This is trivial, since for $f \in L^2(\mathbb{T})$ we have obtained the tightness in the $L^2(\mathbb{T})$ topology, which is stronger than the $L^p(\mathbb{T})$ topology.

10.6.9. An example of non-tightness in $L^p$, $p < 2$. Consider the families of random elements

$$F_{J,f,\Lambda}(x) = \frac{1}{\sqrt{J}} \sum_{j=1}^{J} g_j f(x + \lambda_j) \tag{10.6.7}$$

where $\Lambda = \{\lambda_j, j \ge 1\} \in [0, 1]^\infty$, $J \ge 1$. We give a parametric series of examples of functions $f \in L^p(\mathbb{T})$, $1 \le p < 2$, and sequences $\Lambda$ such that the family $\mathcal{F}_f = \{F_{J,f,\Lambda}, J \ge 1\}$ is not tight in $L^p(\mathbb{T})$.

Construction. Fix $1 \le p < 2$, $p \le q < 2$ and let, for $m \in \mathbb{N}$,

$$M_m = 2^{m^2/q}, \qquad h_m = m^{-2}\, 2^{-m^2}, \qquad k_m = m\, 2^{m^2}.$$

Consider the function

$$f(x) = \sum_{m=1}^{\infty} M_m\, \chi_{[h_{m+1},h_m)}(x)$$

and the sequence $\Lambda = \{\lambda_j, j \ge 1\} = \{-h_1, -2h_1, \ldots, -k_1 h_1, \ldots, -h_m, \ldots, -k_m h_m, \ldots\}$. We immediately have

$$\|f\|_p^p \le \|f\|_q^q \le \sum_{m=1}^{\infty} M_m^q h_m = \sum_{m=1}^{\infty} m^{-2} < \infty,$$

and so $f \in L^q(\mathbb{T}) \subset L^p(\mathbb{T})$.
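The arithmetic behind this construction is easy to verify exactly. The following sketch uses rational arithmetic to check the identities $M_m^q h_m = m^{-2}$ and $k_m h_m = 1/m$, together with the measure bounds $J_n h_{n+1} \le 2^{-2n} \le (4n)^{-1}$ and $k_n(h_n - h_{n+1}) \ge (2n)^{-1}$ that the estimation step relies on. The helper names `h`, `k` and `M_pow_q` are illustrative only, and the range of `n` is an arbitrary choice.

```python
from fractions import Fraction

def h(m):
    """h_m = m^(-2) * 2^(-m^2), kept as an exact rational."""
    return Fraction(1, m * m * 2 ** (m * m))

def k(m):
    """k_m = m * 2^(m^2)."""
    return m * 2 ** (m * m)

def M_pow_q(m):
    """M_m^q = 2^(m^2); this does not depend on q since M_m = 2^(m^2/q)."""
    return 2 ** (m * m)

for n in range(1, 8):
    J_n = sum(k(m) for m in range(1, n + 1))      # J_n = k_1 + ... + k_n
    assert M_pow_q(n) * h(n) == Fraction(1, n * n)  # terms of the series for ||f||_q^q
    assert k(n) * h(n) == Fraction(1, n)
    assert J_n <= 2 * k(n)
    # Measure of the union of J_n intervals of length h_{n+1}
    assert J_n * h(n + 1) <= Fraction(1, 4 ** n) <= Fraction(1, 4 * n)
    # Measure of the union of k_n intervals of length h_n - h_{n+1}
    assert k(n) * (h(n) - h(n + 1)) >= Fraction(1, 2 * n)

print("all identities and bounds hold for n = 1..7")
```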

Estimation. Consider the subsequence $J_n = \sum_{m=1}^{n} k_m$. Fix $n$ and write more simply $F_g = F_{J_n,g,\Lambda}$. We have $f = f_1 + f_2 + f_3$, where

$$f_1 = \sum_{m=1}^{n-1} M_m\, \chi_{[h_{m+1},h_m)}, \qquad f_2 = M_n\, \chi_{[h_{n+1},h_n)}, \qquad f_3 = \sum_{m=n+1}^{\infty} M_m\, \chi_{[h_{m+1},h_m)}.$$

We wish to get rid of $f_1$, $f_3$. Towards this end write

$$\|f_1\|_2^2 = \sum_{m=1}^{n-1} M_m^2 (h_m - h_{m+1}) \le \sum_{m=1}^{n-1} M_m^2 h_m = \sum_{m=1}^{n-1} 2^{(2/q-1)m^2} m^{-2} \le 8 (2 - q)^{-1}\, 2^{(2/q-1)(n-1)^2}.$$

By the Hölder estimate (10.6.4) we have

$$E\|F_{f_1}\|_p^p \le \|f_1\|_2^p \le 8^{p/2} (2 - q)^{-p/2}\, 2^{(p/q-p/2)(n-1)^2}.$$

Let $U$ denote the support of $F_{f_3}$. Then $U$ is contained in the union of $J_n$ intervals $[-\lambda_j, -\lambda_j + h_{n+1}]$, $1 \le j \le J_n$. Thus

$$\lambda(U) \le J_n h_{n+1} \le 2 k_n h_{n+1} \le 2^{1+n^2-(n+1)^2} \le 2^{-2n} \le (4n)^{-1}.$$

Besides, for the main term, we have

$$\begin{aligned}
F_{f_2}(x) &= J_n^{-1/2} \sum_{j=1}^{J_n} f_2(x + \lambda_j)\, g_j \\
&= J_n^{-1/2} M_n \sum_{j=1}^{J_{n-1}} g_j\, \chi_{[h_{n+1}-\lambda_j,\, h_n-\lambda_j)}(x) + J_n^{-1/2} M_n \sum_{k=1}^{k_n} g_{J_{n-1}+k}\, \chi_{[h_{n+1}+k h_n,\, (k+1)h_n)}(x) \\
&:= F_A(x) + F_B(x),
\end{aligned}$$

where $F_A$ and $F_B$ are independent and symmetric. Let

$$I = \bigcup_{k=1}^{k_n} [h_{n+1} + k h_n, (k + 1) h_n).$$

Then $\lambda(I) = k_n (h_n - h_{n+1}) \ge k_n h_n / 2 = (2n)^{-1}$. We start the key estimate with

$$E\|F_f\|_p^p = \int_{\mathbb{T}} E|F_{f_1+f_2+f_3}|^p\, d\lambda \ge \int_{I \setminus U} E|F_{f_1+f_2}|^p\, d\lambda.$$

By linearity $|F_{f_2}| = |F_{-f_1} + F_{f_1+f_2}| \le |F_{f_1}| + |F_{f_1+f_2}|$. Thus $|F_{f_2}|^p \le 2^{p-1} \big( |F_{f_1}|^p + |F_{f_1+f_2}|^p \big)$, and thereby $|F_{f_1+f_2}|^p \ge 2^{1-p} |F_{f_2}|^p - |F_{f_1}|^p$. Passing to expectations, we get

$$\begin{aligned}
E\|F_f\|_p^p &\ge 2^{1-p} \int_{I \setminus U} E|F_{f_2}|^p\, d\lambda - E\|F_{f_1}\|_p^p \\
&\ge \big( \lambda(I) - \lambda(U) \big)\, 2^{1-p} \inf_{x \in I} E|F_A(x) + F_B(x)|^p - E\|F_{f_1}\|_p^p \\
&\ge (4n)^{-1} 2^{1-p} \inf_{x \in I} E|F_B(x)|^p - E\|F_{f_1}\|_p^p \\
&\ge (4n)^{-1} 2^{1-p} J_n^{-p/2} M_n^p c_p - 2^{3p/2} (2 - q)^{-p/2}\, 2^{(p/q-p/2)(n-1)^2} \\
&\ge (4n)^{-1} 2^{1-p} (2 k_n)^{-p/2} M_n^p c_p - 2^{3p/2} (2 - q)^{-p/2}\, 2^{(p/q-p/2)(n-1)^2} \\
&\ge n^{-1-p/2} c_p\, 2^{-1-3p/2+(p/q-p/2)n^2} - 2^{3p/2} (2 - q)^{-p/2}\, 2^{(p/q-p/2)(n-1)^2} \to \infty
\end{aligned}$$

as $n$ tends to infinity. We have therefore obtained

$$\limsup_{J \to \infty} E\|F_{J,f,\Lambda}\|_p^p \ge \limsup_{n \to \infty} E\|F_{J_n,f,\Lambda}\|_p^p = \infty,$$

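and the tightness does not take place.

Tightness in $C(\mathbb{T})$. We shall prove the following result:

10.6.10 Proposition. Let $f \in C(\mathbb{T})$. Assume that

$$\int_0 \frac{\omega_f(u)}{u \sqrt{|\log u|}}\, du < \infty.$$

Then the family of Gaussian processes

$$\mathcal{F}_f = \Big\{ F_J(x) = \frac{1}{\sqrt{J}} \sum_{j=1}^{J} g_j f(x + \lambda_{J,j}),\ J \ge 1,\ \Lambda \text{ a triangular array} \Big\}$$

is tight in the space $C(\mathbb{T})$. Further, if $\lambda_{J,j} \equiv \lambda_J \to 0$ as $J$ tends to infinity, this family converges to the law of the degenerate process $g_1 f(\cdot)$.

Proof. For all $J \ge 1$, $x, y \in \mathbb{T}$ we have

$$d_J^2(x, y) := \|F_J(x) - F_J(y)\|_{2,\mathbf{P}}^2 = \frac{1}{J} \sum_{j=1}^{J} \big( f(x + \lambda_{J,j}) - f(y + \lambda_{J,j}) \big)^2 \le \sup_j \big( f(x + \lambda_{J,j}) - f(y + \lambda_{J,j}) \big)^2 \le \omega_f(|x - y|)^2.$$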

Hence $d_J(x, y) \le \omega_f(|x - y|)$. Thus for each $r > 0$ and each $J$, the intervals of length $\omega_f^{-1}(r)$ form a covering of $\mathbb{T}$ by sets of diameter not exceeding $r$ with respect to the metric $d_J$ generated by the process $F_J$. It follows that

$$\int_0^R \sqrt{\log N(\mathbb{T}, d_J, r)}\, dr \le \int_0^R \sqrt{\log\big( 1/\omega_f^{-1}(r) \big)}\, dr = \int_0^R \sqrt{|\log \omega_f^{-1}(r)|}\, dr.$$

By the change of variables $r = \omega_f(u)$ and integration by parts, we obtain

$$\int_0^R \sqrt{\log N(\mathbb{T}, d_J, r)}\, dr \le \int_0^{\omega_f^{-1}(R)} \sqrt{|\log u|}\, d\omega_f(u) = \Big[ \omega_f(u) \sqrt{|\log u|} \Big]_0^{\omega_f^{-1}(R)} + \int_0^{\omega_f^{-1}(R)} \frac{\omega_f(u)}{2u \sqrt{|\log u|}}\, du.$$

Further, the main contribution comes from the integral term, since the function $\omega_f$ is monotone and we have for each $u$,

$$\omega_f(u) \sqrt{|\log u|} \le 2 \int_u^{\sqrt{u}} \frac{\omega_f(v)}{2v \sqrt{|\log v|}}\, dv.$$

Letting $u = \omega_f^{-1}(R)$, we obtain

$$\int_0^R \sqrt{\log N(\mathbb{T}, d_J, r)}\, dr \le 2 \int_0^{\omega_f^{-1}(R)} \frac{\omega_f(u)}{2u \sqrt{|\log u|}}\, du \to 0$$

as $R$ tends to 0. Since the latter bound is uniform over $J$, the tightness easily follows from the Ascoli–Arzela theorem.

Part IV Three studies

Chapter 11

Riemann sums

The study of almost sure convergence of Riemann sums of Lebesgue integrable functions has been shown, since the fundamental paper of Rudin, to contain deep arithmetical aspects. The arithmetical characterization of that property is an open and certainly hard question. Riemann sums also have important connections with various problems from number theory, among them the Riemann Hypothesis, through their link with Farey sequences. This chapter provides easy access to the main results of the theory, as well as to the various methods elaborated by their authors. The last two sections are devoted to some recent advances.

11.1

Introduction

In this chapter, we are mainly interested in the study of the almost sure convergence of Riemann sums of Lebesgue integrable functions. We will state and comment on essential results, discuss their links and also give indications of proofs. The final sections are devoted to some recent advances.

The chapter is organized as follows: in Section 11.2 we introduce Jessen’s theorem on convergence almost everywhere of Riemann sums along chains of integers. This is likely the first result of the theory. The proof is sketched and comments about its optimality are added. Rudin’s theorem is the second fundamental result and shows for instance the irregular behavior of Riemann sums along the sequence of primes. A striking example derived from this result and Dirichlet’s theorem on distribution of primes in arithmetic progressions shows that the convergence almost everywhere of Riemann sums along a given sequence definitely relies upon the arithmetical structure of that sequence. Section 11.3 is devoted to results of individual type. It is indeed possible to obtain sufficient conditions on the function f , sometimes quite sharp, ensuring the convergence almost everywhere of the Riemann sums of f . These conditions are often expressed in terms of the integral modulus of continuity of f . The results are mainly due to Marcinkiewicz and Salem. In the next section, the concepts of breadth and dimension are introduced and used to establish new convergence results for specific classes of functions. This is continued with Bourgain’s approach which we already discussed in Chapter 6. In Section 11.6 we study the connections of Riemann sums with number theory, and in particular their link (Mikolás’ works) with the Riemann Hypothesis through the study of Farey sequences, and with the prime number theorem. Finally in Sections 11.7 and 11.8, recent results related to the Marcinkiewicz–Zygmund conjecture and square functions of averages of Riemann sums are stated and proved.

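Let $f$ be any measurable function on $\mathbb{T}$. Define for $n = 1, 2, \ldots$ and $x \in \mathbb{T}$ the Riemann sums of $f$ as follows:

$$R_n(f)(x) = \frac{1}{n} \sum_{j=0}^{n-1} f\Big( x + \frac{j}{n} \Big). \tag{11.1.1}$$

When $x = 0$, we simply write

$$R_n(f) = \frac{1}{n} \sum_{j=0}^{n-1} f\Big( \frac{j}{n} \Big), \tag{11.1.2}$$

for the usual Riemann sums considered in Section 11.6. We begin with a first important property of Riemann sums. Write for $\ell \in \mathbb{Z}$, $e_\ell(x) = e^{2i\pi \ell x}$. Then for all $n \ge 1$,

$$R_n(e_\ell)(x) = e_\ell(x)\, \frac{1}{n} \sum_{j=0}^{n-1} e^{2i\pi j \ell / n} = e_\ell(x)\, \delta_{n \mid \ell}. \tag{11.1.3}$$

Hence for $f \in L^2(\mathbb{T})$, $f \sim \sum_{\ell \in \mathbb{Z}} a_\ell e_\ell$, the Riemann sums of $f$ can also be rewritten as

$$R_n(f) = \sum_{\ell \in \mathbb{Z},\ n \mid \ell} a_\ell e_\ell. \tag{11.1.4}$$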
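Property (11.1.3) is a finite geometric sum and can be confirmed numerically. The short sketch below evaluates the Riemann sum of a character directly from definition (11.1.1) and compares it with $e_\ell(x)\,\delta_{n \mid \ell}$. The function names `riemann_sum` and `character`, the evaluation point and the ranges of `n` and `l` are illustrative choices, not part of the text.

```python
import cmath

def riemann_sum(f, n, x=0.0):
    """R_n(f)(x) = (1/n) * sum_{j=0}^{n-1} f(x + j/n), as in (11.1.1)."""
    return sum(f(x + j / n) for j in range(n)) / n

def character(l):
    """e_l(x) = exp(2 i pi l x)."""
    return lambda x: cmath.exp(2j * cmath.pi * l * x)

x = 0.3
for n in range(1, 8):
    for l in range(-12, 13):
        value = riemann_sum(character(l), n, x)
        # Expected: e_l(x) when n divides l, and 0 otherwise.
        expected = character(l)(x) if l % n == 0 else 0.0
        assert abs(value - expected) < 1e-12

print("R_n(e_l) = e_l * [n divides l] verified for 1 <= n <= 7, |l| <= 12")
```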

We shall comment on this property by means of the infinite Möbius inversion due to Hartman and Wintner [1947: p. 853]. Consider the following two infinite systems of linear equations

$$\sum_{m=1}^{\infty} x_{nm} = y_n, \qquad n = 1, 2, \ldots, \tag{11.1.5}$$

$$\sum_{m=1}^{\infty} \mu(m)\, y_{nm} = x_n, \qquad n = 1, 2, \ldots \tag{11.1.6}$$

where $\mu(\cdot)$ is the Möbius function, see (11.6.1). If $x_n = O(n^{-1-\eta})$ for some $\eta > 0$, then (11.1.5) has a unique solution which is given by (11.1.6), namely $x_n = \sum_{m=1}^{\infty} \mu(m)\, y_{nm}$, $n = 1, 2, \ldots$. Conversely, if $y_n = O(n^{-1-\eta})$ for some $\eta > 0$, then (11.1.6) has a unique solution which is given by (11.1.5). In our case, this shows that if the Fourier coefficients of $f$ satisfy the condition

$$a_n = O(|n|^{-1-\eta}) \quad \text{for some } \eta > 0, \tag{11.1.7}$$

then $f$ can be reconstructed from its Riemann sums. More precisely

$$a_n e_n(x) = \sum_{m} \mu(m)\, R_{nm}(f)(x). \tag{11.1.8}$$
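As a small illustration of (11.1.8), the sketch below recovers each term $a_n e_n(x)$ of a trigonometric polynomial from its Riemann sums. It rests on simplifying assumptions that are not in the text: the polynomial has only positive frequencies and $a_0 = 0$, so that $R_{nm}(f) = 0$ as soon as $nm$ exceeds the degree and the Möbius sum is finite. The coefficients and the names `mobius`, `riemann_sum` and `recovered_term` are hypothetical.

```python
import cmath

def mobius(m):
    """Möbius function mu(m), computed by trial division."""
    result, d = 1, 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:      # a squared prime factor gives mu = 0
                return 0
            result = -result
        d += 1
    if m > 1:                   # one remaining prime factor
        result = -result
    return result

def riemann_sum(f, n, x):
    """R_n(f)(x) = (1/n) * sum_{j=0}^{n-1} f(x + j/n)."""
    return sum(f(x + j / n) for j in range(n)) / n

# Hypothetical Fourier coefficients a_l for l >= 1 (all others are zero).
coeffs = {1: 0.5, 2: -1.0, 3: 0.25, 4: 2.0, 6: -0.75, 12: 1.5}
DEGREE = max(coeffs)

def f(x):
    return sum(a * cmath.exp(2j * cmath.pi * l * x) for l, a in coeffs.items())

def recovered_term(n, x):
    """sum_m mu(m) R_{nm}(f)(x); the terms with n*m > DEGREE vanish."""
    return sum(mobius(m) * riemann_sum(f, n * m, x)
               for m in range(1, DEGREE // n + 1))

x = 0.217
for n in range(1, DEGREE + 1):
    expected = coeffs.get(n, 0.0) * cmath.exp(2j * cmath.pi * n * x)
    assert abs(recovered_term(n, x) - expected) < 1e-9

print("a_n e_n(x) recovered from Riemann sums for n = 1..12")
```

For a two-sided expansion the same computation would return $a_n e_n + a_{-n} e_{-n}$, since $n \mid \ell$ holds for $\ell = \pm n$.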

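11.2 The results of Jessen and Rudin

The problem under consideration can be presented as follows. When $f$ is Riemann integrable on $\mathbb{T}$, for any real $x$,

$$\lim_{n \to \infty} R_n(f)(x) = \int_{\mathbb{T}} f\, d\lambda. \tag{11.2.1}$$

When $f$ is only Lebesgue integrable, $\{R_n(f), n \ge 1\}$ converges to $\int_{\mathbb{T}} f\, d\lambda$ in the mean. Indeed, let us first consider $f \in L^2(\mathbb{T})$ with Fourier expansion $f \sim \sum_{\ell} a_\ell e_\ell$ and $a_0 = \int_{\mathbb{T}} f\, d\lambda = 0$. As $R_n f = \sum_{n \mid \ell} a_\ell e_\ell$ by (11.1.4), we have

$$\|R_n(f)\|_2^2 \le \sum_{|\ell| \ge n} |a_\ell|^2 \to 0 \tag{11.2.2}$$

as $n$ tends to infinity. And so $\lim_{n \to \infty} \| R_n(f) - \int_{\mathbb{T}} f\, d\lambda \|_2 = 0$. Now assume $f \in L^1(\mathbb{T})$ and let $\{f_k, k \ge 1\} \subset L^2(\mathbb{T})$ approximate $f$ in $L^1(\mathbb{T})$. Let $\varepsilon > 0$ be fixed, and choose $k$ large enough such that $\|f_k - f\|_1 \le \varepsilon$. Then

$$\Big\| R_n(f) - \int_{\mathbb{T}} f\, d\lambda \Big\|_1 \le \| R_n(f) - R_n(f_k) \|_1 + \Big\| R_n(f_k) - \int_{\mathbb{T}} f_k\, d\lambda \Big\|_1 + \Big| \int_{\mathbb{T}} f_k\, d\lambda - \int_{\mathbb{T}} f\, d\lambda \Big|,$$

and since $R_n$ is an $L^1(\mathbb{T})$ contraction,

$$\Big\| R_n(f) - \int_{\mathbb{T}} f\, d\lambda \Big\|_1 \le \| f - f_k \|_1 + \Big\| R_n(f_k) - \int_{\mathbb{T}} f_k\, d\lambda \Big\|_1 + \Big| \int_{\mathbb{T}} f_k\, d\lambda - \int_{\mathbb{T}} f\, d\lambda \Big| \le 2\varepsilon + \Big\| R_n(f_k) - \int_{\mathbb{T}} f_k\, d\lambda \Big\|_2.$$

Letting $n$ tend to infinity, we obtain

$$\limsup_{n \to \infty} \Big\| R_n(f) - \int_{\mathbb{T}} f\, d\lambda \Big\|_1 \le 2\varepsilon.$$

Since $\varepsilon$ is arbitrary, we get

$$\lim_{n \to \infty} \Big\| R_n(f) - \int_{\mathbb{T}} f\, d\lambda \Big\|_1 = 0.$$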

It is natural to inquire about the possible convergence almost everywhere of these sums. A first study was made by Hahn [1914], where approximation of the Lebesgue integral by Riemann sums was considered. In Jessen [1934: Theorem A], a first result is obtained. We introduce a preparatory definition.

11.2.1 Definition. A sequence $\{n_k, k \ge 1\}$ of positive integers is a chain if, for any $k \ge 1$, $n_k \mid n_{k+1}$.

11.2.2 Theorem. Let $\{n_k, k \ge 1\}$ be a chain. Assume that $f \in L^1(\mathbb{T})$. Then

$$\lim_{k \to \infty} R_{n_k} f(x) = \int_{\mathbb{T}} f\, d\lambda \quad \text{almost everywhere.}$$

As noted by Marcinkiewicz and Salem [1940: Theorem 1], this result is in a certain sense best possible. Indeed, when $S = \{2^n, n \ge 1\}$, to every positive and increasing function $\omega$ such that $\lim_{x \to \infty} \omega(x)/\log x = 0$, a function $f$ can be associated satisfying

$$\int_{\mathbb{T}} |f|\, \omega(|f|)\, d\lambda < \infty \quad \text{and} \quad \int_{\mathbb{T}} \sup_{s \ge 0} |R_{2^s}(f)|\, d\lambda = \infty. \tag{11.2.3}$$

Jessen’s result is based on the following observation: since $f$ is 1-periodic, $R_n(f)$ is $\frac{1}{n}$-periodic for any $n \ge 1$, and thus $\frac{1}{m}$-periodic if $m$ divides $n$. Consequently, since $R_{n_k} f(x)$ is $\frac{1}{n_k}$-periodic for any $k$,

$$\limsup_{n_k \to \infty} R_{n_k} f(x) = C$$

for almost every $x$, where $C$ is some constant. It suffices in fact that for infinitely many $p$, $n_p$ divides $n_m$ whenever $m$ is large enough. Let $B$ be some fixed real and put $E_k = \{R_{n_k}(f) > B\}$. Then $E_k$ as well as $E_k^c$ are $\frac{1}{n_k}$-periodic. Put $E = \{\sup_{1 \le k \le N} R_{n_k}(f) > B\}$. We have

$$E = E_N + E_N^c \cap E_{N-1} + E_N^c \cap E_{N-1}^c \cap E_{N-2} + \cdots + E_N^c \cap \cdots \cap E_2^c \cap E_1.$$

Set

$$A_k = E_N^c \cap \cdots \cap E_{k+1}^c \cap E_k.$$

Then $A_k$ is $\frac{1}{n_k}$-periodic. Thus,

$$\int_{A_k} f(x)\, dx = \int_{A_k} f\Big( x + \frac{j}{n_k} \Big)\, dx = \int_{A_k} R_{n_k}(f)(x)\, dx \ge B \lambda(A_k).$$

Consequently, by summing over $k$,

$$\int_E f\, d\lambda \ge B \lambda(E).$$

Letting $N$ tend to infinity leads to

$$\int_{\{\sup_{k \ge 1} R_{n_k}(f) > B\}} f\, d\lambda \ge B \lambda\Big\{ \sup_{k \ge 1} R_{n_k}(f) > B \Big\}.$$

If $B < C$, then $\lambda\{\sup_{k \ge 1} R_{n_k}(f) > B\} = 1$. The above relation thus shows

$$\int_{\mathbb{T}} f\, d\lambda \ge B \cdot 1 = B.$$

Hence $C \le \int_{\mathbb{T}} f\, d\lambda$. Replacing $f$ by $-f$ also gives

$$\int_{\mathbb{T}} f\, d\lambda \le \liminf_{n_k \to \infty} R_{n_k}(f)(x) \quad \text{almost everywhere,}$$

and the result follows.

Ursell [1937: p. 231] showed that Riemann sums converge almost everywhere along the whole sequence of integers for monotone square summable functions. He also gave a simple example ($f(x) = |x|^{-\delta}$, $1/2 < \delta < 1$) showing that the convergence almost everywhere of Riemann sums of $L^1(\mathbb{T})$ functions does not hold in general. The next result is due to Marcinkiewicz and Zygmund [1937: Theorems 3 and 3 ].

11.2.3 Theorem. There exists f ∈ L1 (T) such that lim supn→∞ R2n+1 (f )(x) = ∞ almost everywhere.

Much later Rudin [1964: p. 322] showed that, even for bounded functions, Riemann sums may not converge almost everywhere.

11.2.4 Theorem. Let $S$ be an increasing sequence of positive integers satisfying the following property: for any $N \ge 1$, there is a set $S_N$ of $N$ elements of $S$, none of which divides the least common multiple (l.c.m.) of the others. Then there is a measurable subset $A$ of $\mathbb{T}$ such that if $f = 1_A$, $\{R_n(f), n \in S\}$ does not converge almost everywhere.

For instance, $S$ can be a sequence of primes. The theorem implies that there is no maximal inequality for Riemann sums. Indeed, otherwise by means of the Banach principle, the set of elements of $L^2(\mathbb{T})$ for which $\{R_n(f), n \ge 1\}$ converges almost everywhere would be closed. And since $\{R_n(f), n \ge 1\}$ does converge almost everywhere for finite linear combinations of the characters $e_n$, this set would also be everywhere dense in $L^2(\mathbb{T})$, thus providing a contradiction.

By combining this theorem with Jessen’s result, and using Dirichlet’s theorem on primes in arithmetic progressions, Rudin also built a sequence $S = \{n_k, k \ge 1\}$ possessing a striking property. The construction goes as follows. Let $n_1 = 1$ and assume $n_k$ is defined. There exists an integer $r > 1$ such that $q = 1 + r n_k$ is a prime. Then we set $n_{k+1} = r n_k$. On the one hand, by means of Jessen’s theorem,
(a) for any $f \in L^1(\mathbb{T})$, $\lambda\{ x : \lim_{S \ni n \to \infty} R_n(f)(x) = \int_{\mathbb{T}} f\, d\lambda \} = 1$.
And on the other, by invoking this time Rudin’s theorem,
(b) there exists $f \in L^\infty(\mathbb{T})$ such that $\lambda\{ x : \lim_{S \ni n \to \infty} R_{n+1}(f)(x) = \int_{\mathbb{T}} f\, d\lambda \} = 0$.
This clearly shows that the problem relies upon the arithmetical structure of $S$.

We indicate, before closing this section, a slight generalization of Jessen’s result. The fact that for $f \sim \sum_{\ell \in \mathbb{Z}} a_\ell e_\ell$ the Riemann sums of $f$ can be expressed by $R_n(f) = \sum_{n \mid \ell} a_\ell e_\ell$ leads to a natural generalization of the problem in $L^2$-spaces. Assume we are given a fixed set of indices $N$ together with $\{a_\ell, \ell \in \mathbb{Z}\} \in \ell^2$. Let $\mu$ be a Borel probability measure on $[0, 1]$. Let $\{\psi_\ell, \ell \in \mathbb{Z}\}$ be an orthonormal sequence of $L^2(\mu)$ and define the generalized Riemann sums

$$R_n = R_n^{(a)} = \sum_{\ell \in \mathbb{Z},\ n \mid \ell} a_\ell \psi_\ell.$$
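Rudin's construction of the sequence $S$ recalled earlier in this section ($n_1 = 1$, $n_{k+1} = r n_k$ with $1 + r n_k$ prime) is simple to carry out. The sketch below makes one choice that the text does not prescribe: at each step it takes the smallest admissible $r > 1$. Dirichlet's theorem guarantees that the inner search terminates. The names `is_prime` and `rudin_chain` are illustrative.

```python
def is_prime(q):
    """Primality test by trial division (adequate for small q)."""
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True

def rudin_chain(length):
    """First `length` terms of a chain with n_1 = 1 and n_{k+1} = r * n_k,
    where r > 1 is the smallest integer making 1 + r * n_k prime
    (the minimality of r is an assumption of this sketch)."""
    seq = [1]
    while len(seq) < length:
        n, r = seq[-1], 2
        while not is_prime(1 + r * n):
            r += 1
        seq.append(r * n)
    return seq

S = rudin_chain(7)
print(S)
# S is a chain, so Jessen's theorem applies along S ...
assert all(b % a == 0 for a, b in zip(S, S[1:]))
# ... while every n + 1 with n in S (n > 1) is prime, so Rudin's theorem
# applies along the shifted sequence.
assert all(is_prime(n + 1) for n in S[1:])
```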

The investigation of the almost everywhere convergence problem of the sums $R_n$ along the index $N$, for all orthonormal systems, simultaneously generalizes the study of the convergence almost everywhere of Riemann sums, as well as that of orthogonal series. It is naturally quite a hard task since even for chains, the periodicity argument used for proving the convergence of Riemann sums is no longer available for arbitrary orthogonal systems. A slight extension of Jessen’s theorem can however be obtained.

11.2.5 Theorem. Let $N = \{n_k, k \ge 1\}$ be a chain and put

$$E_k = \{n : n_k \mid n\}, \qquad F_k = E_k \setminus E_{k+1} \qquad \text{and} \qquad \delta_k^2 = \sum_{n \in F_k} a_n^2.$$

If for some $\varepsilon > 0$ the series

$$\sum_{n \ge 1} \delta_n^2 \Big( \log \frac{1}{\delta_n} \Big)^2 \Big( \log\log \frac{1}{\delta_n} \Big)^2 \Big( \log\log\log \frac{1}{\delta_n} \Big)^{2+\varepsilon}$$

converges, then the sequence $(R_n, n \in N)$ converges almost everywhere.

Notice that the latter condition is of the same type as in Marcinkiewicz–Salem [1940] (see e.g., condition (11.3.10)). Extensions of Jessen’s theorem for locally compact groups were also obtained by Ross–Stromberg [1967] and more recently by Ross–Willis [1997]. A generalization of Jessen’s theorem to one-parameter groups of measure-preserving transformations was given in Civin [1955]. Let $T(\varepsilon)$ be such a group. If $f$ is an integrable function satisfying $f(s) = f(T(1)s)$, then the result asserts that the sequence of sums $f_n(s) = 2^{-n} \sum_{i=1}^{2^n} f(T(i 2^{-n}) s)$ converges almost everywhere as $n \to \infty$. To conclude this section, let us also mention that an approach to convergence of Riemann sums using ultrafilters was proposed by Witt (see [Mühlbach: 1962]). This was pointed out to us by Wefelscheidt.

11.3

Individual theorems of spectral type

The main contributions are due to Marcinkiewicz and Salem [1940]. Various types of results are presented here, leading to deep insight. Compared with the preceding section, the approach developed is different. The authors studied regularity assumptions on $f$ under which the associated sequence of Riemann sums converges almost everywhere. The conditions are often expressed in terms of the integral modulus of continuity of $f$. For instance:

11.3.1 Theorem. Under the condition

$$\int_{\mathbb{T}} \big( f(x + t) - f(x) \big)^2\, dx = O(t^\varepsilon) \qquad (\varepsilon > 0), \tag{11.3.1}$$

the sequence $\{R_n(f), n \ge 1\}$ converges a.e. to $\int_{\mathbb{T}} f\, d\lambda$.

Indeed let $f(x) \sim \sum_{\nu \in \mathbb{Z}} a_\nu e^{2i\pi \nu x}$ with $a_0 = \int_{\mathbb{T}} f\, d\lambda = 0$. Then

$$R_n f(x) = \sum_{\nu \in \mathbb{Z},\ n \mid \nu} a_\nu e^{2i\pi \nu x} = \sum_{\ell \in \mathbb{Z}} a_{n\ell}\, e^{2i\pi n \ell x}.$$

Thus

$$\int_{\mathbb{T}} |R_n f|^2\, d\lambda = \sum_{\ell \in \mathbb{Z}} |a_{n\ell}|^2,$$

and

$$\sum_{n \ge 1} \int_{\mathbb{T}} R_n^2 f\, d\lambda = \sum_{n \ge 1} \sum_{\ell \in \mathbb{Z}} |a_{n\ell}|^2 = \sum_{\nu \in \mathbb{Z}} a_\nu^2\, d(|\nu|)$$

where $d(k)$ is the number of divisors of $k$. But for all $\delta > 0$ (Hardy–Wright [1979: Theorem 315])

$$d(k) = O(k^\delta).$$

Therefore the series $\sum_{n \ge 1} \int_{\mathbb{T}} R_n^2 f\, d\lambda$ converges once we know that $\sum_{\nu \in \mathbb{Z}} a_\nu^2 |\nu|^\delta$ converges for some $\delta > 0$. Now by condition (11.3.1) the integral

$$\int_{\mathbb{T}} \int_{\mathbb{T}} \frac{|f(x + t) - f(x - t)|^2}{t^r}\, dt\, dx$$

converges if $r < 1 + \varepsilon$. Further, by the Parseval relation we have

$$\int_{\mathbb{T}} |f(x + t) - f(x - t)|^2\, dx = 4 \sum_{\nu \in \mathbb{Z}} a_\nu^2 \sin^2(2\pi \nu t),$$

so that

$$\int_{\mathbb{T}} \int_{\mathbb{T}} \frac{|f(x + t) - f(x - t)|^2}{t^r}\, dt\, dx = 4 \sum_{\nu \in \mathbb{Z}} a_\nu^2 \int_{\mathbb{T}} \frac{\sin^2 2\pi \nu t}{t^r}\, dt \ge C \sum_{\nu \in \mathbb{Z}} a_\nu^2 |\nu|^{r-1}.$$

Consequently

$$\sum_{n \ge 1} \int_{\mathbb{T}} R_n^2 f(x)\, dx < \infty,$$

which easily leads to $R_n f(x) \to 0$ for almost all $x$, and this is exactly the assertion of Theorem 11.3.1. When replacing Riemann sums by their averages

$$A_n(f) = \frac{1}{n} \sum_{k=1}^{n} R_k(f), \qquad n = 1, 2, \ldots, \tag{11.3.2}$$

assumption (11.3.1) can be essentially weakened.

11.3.2 Theorem (Marcinkiewicz–Salem [1940]). Under the condition

$$\int_{\mathbb{T}} \int_{\mathbb{T}} \frac{|f(x + t) - f(x)|^2}{t \log \frac{2}{t}}\, dt\, dx < \infty \tag{11.3.3}$$

the sequence $\{A_n(f), n \ge 1\}$ converges almost everywhere to $\int_{\mathbb{T}} f\, d\lambda$.

Note that condition (11.3.3) is satisfied if for instance

$$\int_{\mathbb{T}} |f(x + t) - f(x)|^2\, dx = O\Big( \frac{1}{\log^2 |\log t|} \Big), \tag{11.3.4}$$

which is essentially less restrictive than (11.3.1). The authors conjectured that $\{A_n(f), n \ge 1\}$ converges almost everywhere for every $f \in L^2(\mathbb{T})$. This famous conjecture remains unsolved. In this direction, Bourgain provided an affirmative answer for the logarithmic averages (see Theorem 11.5.1). Marcinkiewicz and Salem also observed the arithmetical nature of the problem. Let $f = \sum_{p \text{ prime}} c_p e_p$ with $c_p \to 0$ as $p$ tends to infinity. Then $R_n(f)(x) = 0$ almost everywhere if $n$ is not a prime and $R_n(f)(x) = c_n e_n + c_{-n} e_{-n}$ otherwise. Consequently $R_n(f)(x) \to 0$ uniformly, outside a measurable set of zero measure. But we may have $f$ essentially bounded in no interval, which is rather surprising. Note also that if $f(x) \sim \sum_{\ell \in \mathbb{Z}} c_\ell e^{2i\pi \ell x}$ with $c_0 = 0$,

$$\sum_{p \text{ prime}} \int_{\mathbb{T}} |R_p f(x)|^2\, dx \le \sum_{\nu \in \mathbb{Z}} |c_\nu|^2\, \omega(\nu)$$

where $\omega(\nu)$ is the number of primes dividing $\nu$. Since $\omega(\nu) = O\big( \frac{\log \nu}{\log\log \nu} \big)$, it follows that $R_p(f)(x) \to 0$ almost everywhere whenever

$$\sum_{|\nu| \ge 3} |c_\nu|^2\, \frac{\log |\nu|}{\log\log |\nu|} < \infty. \tag{11.3.5}$$

The latter condition is satisfied in particular if

$$\int_{\mathbb{T}} \big( f(x + t) - f(x) \big)^2\, dx = O\Big( \frac{1}{\log^2 \frac{1}{t}} \Big), \tag{11.3.6}$$

which is a much weaker condition than (11.3.1). We also mention the following criterion due to Salem [1948: p. 60], providing a sufficient condition for the convergence almost everywhere of Riemann sums $R_{n_k}(f)$ along a given sequence of integers $\{n_k, k \ge 1\}$, when the integral modulus of continuity of $f$ is sufficiently smooth.

11.3.3 Theorem. Assume that for some $\varepsilon > 0$,

$$\int_{\mathbb{T}} |f(x + t) - f(x)|\, dx = O\Big( \frac{1}{|\log t|^{1+\varepsilon}} \Big). \tag{11.3.7}$$

Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers such that, for some $\delta < \varepsilon$,

$$\sum_{k \ge 1} \Big( \frac{1}{\log n_k} \Big)^{1+\delta} < \infty. \tag{11.3.8}$$

Then $\lim_{k \to \infty} R_{n_k} f(x) = \int_{\mathbb{T}} f\, d\lambda$ almost everywhere.

11.4

Breadth and dimension

These results are essentially due to Baker [1976], Dubins–Pitman [1979], Révész–Ruzsa [1991] and Bugeaud–Weber [1998]. We begin with a preparatory definition.

11.4.1 Definition. Let $A \subset L^1(\mathbb{T})$. A sequence $S = \{n_k, k \ge 1\}$ of positive integers is called an $\hat{A}$-sequence if for every $f \in A$,

$$\lim_{k \to \infty} R_{n_k}(f) = \int_{\mathbb{T}} f\, d\lambda \quad \text{almost everywhere.}$$

In this section we write $L = L^1(\mathbb{T})$ and $M = L^\infty(\mathbb{T})$. Given two arbitrary sequences of positive integers $S_1$ and $S_2$, we also write $S_1 \vee S_2$ for the new sequence obtained by ordering (according to the natural order) the set $\{[s_1, s_2] : s_1 \in S_1, s_2 \in S_2\}$, where as usual $[s_1, s_2]$ is the least common multiple of $s_1$ and $s_2$.

11.4.2 Theorem (Baker [1976]). If $S_1 = \{m_k, k \ge 1\}$ and $S_2 = \{n_k, k \ge 1\}$ are two $\hat{M}$-sequences, then $S_1 \vee S_2$ is again an $\hat{M}$-sequence.

The proof relies upon the fact that

$$R_m(R_n(f)) = R_{[m,n]}(f). \tag{11.4.1}$$
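Identity (11.4.1) holds because the points $i/m + j/n$ (mod 1) run over the multiples of $1/[m,n]$, each the same number of times. A quick numerical check is sketched below; the test function, the evaluation point and the pairs $(m, n)$ are arbitrary illustrative choices, and `riemann_operator` is a hypothetical helper name.

```python
import cmath
from math import gcd

def riemann_operator(f, n):
    """Return the function x -> R_n(f)(x) = (1/n) * sum_j f(x + j/n)."""
    return lambda x: sum(f(x + j / n) for j in range(n)) / n

def lcm(m, n):
    return m * n // gcd(m, n)

# Hypothetical trigonometric polynomial with frequencies |l| <= 24.
coeffs = {l: 1.0 / (1 + abs(l)) for l in range(-24, 25)}

def f(x):
    return sum(a * cmath.exp(2j * cmath.pi * l * x) for l, a in coeffs.items())

x = 0.137
for m, n in [(2, 3), (4, 6), (3, 3), (6, 8)]:
    lhs = riemann_operator(riemann_operator(f, n), m)(x)   # R_m(R_n f)(x)
    rhs = riemann_operator(f, lcm(m, n))(x)                # R_[m,n] f(x)
    assert abs(lhs - rhs) < 1e-10

print("R_m(R_n f) = R_[m,n] f verified on the sample pairs")
```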

Recall the notion of $\Sigma$-sequences introduced by Cassels [1950].

11.4.3 Definition. Let $\mu_k$ be the number of fractions $\frac{j}{m_k}$ ($0 < j < m_k$) which are not equal to $\frac{l}{m_q}$ ($l$ integer, $q < k$). We say that $\{m_k, k \ge 1\}$ is a $\Sigma$-sequence if the following condition is satisfied:

$$\liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \frac{\mu_k}{m_k} > 0.$$

The interest of this notion lies in the fact that if $\{m_k, k \ge 1\}$ is a $\Sigma$-sequence, then the system of inequalities

$$\{m_k x\} < \psi(k) \qquad (k = 1, 2, \ldots),$$

where $\psi$ is a nonincreasing function, admits an infinity of solutions for almost all $x$ when the series $\sum_{k \ge 1} \psi(k)$ diverges. Conversely, there exists an example of a decreasing function $\psi$ such that the series $\sum_{k \ge 1} \psi(k)$ is convergent, and for which the previous system of inequalities has only finitely many solutions for almost all $x$. Baker’s proof is partially based on this property. It is interesting to also observe that almost all sequences are $\Sigma$-sequences, although it is easy to exhibit some which are not. We mention a second result due to Baker [1976: Theorem 3.1].

11.4.4 Theorem. Let $\{m_k, k \ge 1\}$ be a $\Sigma$-sequence with $\liminf_{k \to \infty} k^{-1} \log m_k = 0$. Then $\{m_k, k \ge 1\}$ is not an $\hat{L}$-sequence.

Baker, however, suggested that the assumption of $\{m_k, k \ge 1\}$ being a $\Sigma$-sequence is likely not well adapted to this problem, and also established the following remarkable result [1976: Theorem 3.2].

11.4.5 Theorem. Let $\varepsilon > 0$. Assume that $\{m_k, k \ge 1\}$ is a sequence such that

$$\forall k \ge 1, \qquad m_k = O\Big( \exp\big( k^{1/2} (\log k)^{-7/2-\varepsilon} \big) \Big).$$

Then $\{m_k, k \ge 1\}$ is not an $\hat{L}$-sequence.

Now we introduce a generalization of the notion of a chain used by Dubins–Pitman [1979]. For sets of positive integers $S_1, \ldots, S_d$, put

$$[S_1, \ldots, S_d] = \big\{ [n_1, \ldots, n_d] : n_i \in S_i,\ i = 1, \ldots, d \big\}. \tag{11.4.2}$$

Let $S$ be a set of positive integers. The dimension of $S$ is the least positive integer $d$ such that $S$ is a subset of $[S_1, \ldots, S_d]$ for some choice of chains $S_1, \ldots, S_d$. Jessen’s theorem was extended by Dubins and Pitman, who proved

11.4.6 Theorem. If $S$ has dimension $d$ and $f \in L(\log^+ L)^{d-1}$, then

$$\lambda\Big\{ x : \lim_{S \ni n \to \infty} R_n(f)(x) = \int_{\mathbb{T}} f\, d\lambda \Big\} = 1. \tag{11.4.3}$$

Here $L(\log^+ L)^{d-1}$ denotes the set of Lebesgue measurable functions on $\mathbb{T}$ such that

$$\int_{\mathbb{T}} |f| (\log^+ |f|)^{d-1}\, d\lambda < \infty,$$

where it is understood that log+ x = loge x if x ≥ 1 and equals 0 for 0 < x ≤ 1. A partial result (d = 2, f bounded) was proved in Baker [1976]. The proof of that result consists of associating to the sequence S a converse d-martingale bounded in L logd−1 L. The result then follows from a suitable extension to converse martingales of a maximal inequality for martingales with several parameters. By considering the sequence of dimension two S = {2i 3j , i ≥ 1, j ≥ 1}, the authors also showed that it is not possible to improve Theorem 11.4.6, replacing L log L by L.

Nair [1995] suggested a more elementary proof avoiding the use of martingale theory. His argument is based on dominated estimates, Baker’s observation on property (11.4.1) for Riemann sums, and an induction argument on the dimension of $S$. In [Bugeaud–Weber: 1998] it is shown that for no $d \ge 2$ can $L(\log^+ L)^{d-1}$ in Theorem 11.4.6 be replaced by $L(\log^+ L)^{d-2}$, which solves a conjecture by Dubins and Pitman [1979]. For $d = 2$, this assertion is due to Baker. The proof of the general case consists of modifications of Baker’s arguments, which are based on an elementary but rather technical lemma. Recall a notion introduced in Dubins–Pitman [1979].

11.4.7 Definition. We say that a set $K$ of integers has breadth at most $d$ if the least common multiple of every finite subset of $K$ is the least common multiple of at most $d$ elements of that subset. The least such $d$ is called the breadth of $K$ and, if no such $d$ exists, we say that $K$ has infinite breadth.

Rudin’s theorem can be reformulated as follows: if $\{n_k, k \ge 1\}$ is a strictly increasing sequence of integers with infinite breadth, there exist bounded measurable functions $f$ on $\mathbb{T}$ such that $\{R_{n_k} f, k \ge 1\}$ does not converge almost everywhere. Indeed, as $\{n_k, k \ge 1\}$ has infinite breadth, for every $r \ge 2$ there exist $k_1, \ldots, k_r$ such that $n_{k_i}$ does not divide the least common multiple of $n_{k_1}, \ldots, n_{k_{i-1}}, n_{k_{i+1}}, \ldots, n_{k_r}$, for $1 \le i \le r$.

There exist sets of integers which are neither of infinite breadth nor of finite dimension, and consequently the almost everywhere convergence properties of Riemann sums along these sets are unknown. Such a sequence has been given explicitly by Dubins–Pitman [1979: Section 3b]. Let $p_1 < p_2 < \cdots$ be the sequence of consecutive primes and consider the set $E_1$ of all numbers of the type $p_1 \cdots p_{j-1}\, \check{p}_j\, p_{j+1} \cdots p_k$, for $k \ge 2$ and $1 \le j \le k$, where the symbol $\check{\ }$ means that $p_j$ is excluded.
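For finite sets, the breadth of Definition 11.4.7 can be computed by brute force, which makes the examples above concrete. The sketch below is exponential in the size of the set and is meant only for small illustrations; the function names `lcm_of`, `breadth` and `E1_truncated`, and the truncation of $E_1$ to $k \le 4$, are choices made for this example and do not come from the text.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def lcm_of(nums):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)

def breadth(K):
    """Breadth of a finite set K: the least d such that the l.c.m. of every
    subset is already the l.c.m. of at most d elements of that subset."""
    K = sorted(set(K))
    worst = 0
    for size in range(1, len(K) + 1):
        for subset in combinations(K, size):
            target = lcm_of(subset)
            # Smallest number of elements of `subset` reaching the same l.c.m.
            needed = next(d for d in range(1, size + 1)
                          if any(lcm_of(c) == target
                                 for c in combinations(subset, d)))
            worst = max(worst, needed)
    return worst

def E1_truncated(kmax, primes=(2, 3, 5, 7, 11, 13)):
    """Elements p_1 ... p_k with one prime p_j removed, for 2 <= k <= kmax."""
    out = set()
    for k in range(2, kmax + 1):
        for j in range(k):
            out.add(lcm_of(p for i, p in enumerate(primes[:k]) if i != j))
    return sorted(out)

print(breadth([2, 4, 8, 16]))      # a chain
print(breadth([2, 3, 5, 7]))       # distinct primes
print(E1_truncated(4), breadth(E1_truncated(4)))
```

A chain has breadth 1 and a set of $N$ distinct primes has breadth $N$, which is the mechanism behind Rudin's theorem. The truncated set $E_1$ stays at breadth 2 however many levels are included in this range, in line with the bound $l + 1$ quoted for $E_l$.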
In [Bugeaud–Weber: 1998], for any fixed $d$, a sequence $\{n_k, k \ge 1\}$ with infinite dimension and finite breadth is built which is not an $L(\log^+ L)^d$-sequence. The construction goes as follows: let $l$ be a positive integer. With the above notation, consider the set $E_l$ of all integers $n$, arranged in increasing order, such that

$$n = p_1^{a_1} \cdots p_{j-1}^{a_{j-1}}\, \check{p}_j\, p_{j+1} \cdots p_k,$$

for $k \ge 2$, $1 \le j \le k$ and $l \ge a_1 \ge \cdots \ge a_{j-1} \ge 1$. Then $E_l$ has infinite dimension and breadth not exceeding $l + 1$. The proof uses the following extension of a theorem of Baker.

11.4.8 Lemma. If the sequence $\{n_k, k \ge 1\}$ satisfies the growth condition

$$n_k = O\big( \exp( k^{1/(2d+5)} ) \big),$$

then $\{n_k, k \ge 1\}$ is not an $L(\log^+ L)^d$-sequence.

The same paper also contains the following result concerning the sequence $E_1$ (of finite breadth and infinite dimension).


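11.4.9 Proposition. Let $f \sim \sum_{\nu=0}^{\infty} a_\nu e_\nu$, where $\{a_\nu, \nu \ge 0\}$ satisfies

$$\sum_{\nu=0}^{\infty} a_\nu^2\, \frac{\log \nu}{\log\log \nu} < \infty.$$

Then

$$\lambda\Big\{ \lim_{E_1 \ni n \to \infty} R_n(f) = \int_{\mathbb T} f\, d\lambda \Big\} = 1.$$

As concerning averaging along $E_1$, writing $E_1 = \{n_k, k \ge 1\}$,

$$\lambda\Big\{ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} R_{n_k}(f) = \int_{\mathbb T} f\, d\lambda \Big\} = 1$$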

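holds for all $f \in L^2(\mathbb T)$.

Proof. Let $t > 0$ and $k_0$ be fixed. Then

$$\lambda\Big\{ \sup_{\substack{1 \le j \le k+1 \\ k \ge k_0}} \big| R_{p_1 \cdots \check{p}_j \cdots p_{k+1}}(f) - R_{p_1 \cdots p_{k+1}}(f) \big| > t \Big\}
\le \frac{1}{t^2} \sum_{\substack{1 \le j \le k+1 \\ k \ge k_0}} \int \Big| \sum_{\substack{p_1 \cdots \check{p}_j \cdots p_{k+1} \mid \ell \\ (p_j, \ell) = 1}} a_\ell e_\ell \Big|^2 d\lambda
\le \frac{1}{t^2} \sum_{\substack{1 \le j \le k+1 \\ k \ge k_0}} \ \sum_{\substack{p_1 \cdots \check{p}_j \cdots p_{k+1} \mid \ell \\ (p_j, \ell) = 1}} a_\ell^2 .$$

Given an arbitrary number $\ell$, if $k_2 > k_1 \ge k_0$ are such that

$$p_1 \cdots \check{p}_{j_1} \cdots p_{k_1+1} \mid \ell, \quad p_{j_1} \nmid \ell, \qquad p_1 \cdots \check{p}_{j_2} \cdots p_{k_2+1} \mid \ell, \quad p_{j_2} \nmid \ell,$$

then $j_1 = j_2$. Defining thus $k(\ell)$ as being the index corresponding to the smallest $j$ such that $p_j$ does not divide $\ell$, we get

$$\lambda\Big\{ \sup_{\substack{1 \le j \le k+1 \\ k \ge k_0}} \big| R_{p_1 \cdots \check{p}_j \cdots p_{k+1}}(f) - R_{p_1 \cdots p_{k+1}}(f) \big| > t \Big\} \le \frac{1}{t^2} \sum_{\ell \ge p_1 \cdots p_{k_0}} (k(\ell) - k_0 - 1)\, a_\ell^2 .$$

But $\ell \ge p_1 \cdots p_{k(\ell)-2}$, which gives

$$k(\ell) = O\Big( \frac{\log \ell}{\log\log \ell} \Big),$$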

and allows us to conclude the first half of the proposition. Concerning the second half, observe that 2 1 [R (f ) − R (f )] dλ p ...p p1 ...pˇ j ...pk+1 1 k+1 N 2 j ≤k+1 k≤N

∞ 2 1 2 = 4 a # j ≤ k + 1, k ≤ N : pj , p1 . . . pˇ j . . . pk+1 | N =0

(N − k())2 1 ≤ ≤ 2. 4 N N Therefore 2 1 [R (f ) − R (f )] dλ < ∞, p1 ...pk+1 p1 ...pˇ j ...pk+1 N 2 j ≤k+1

N≥1

k≤N

which, combined with Jessen’s theorem implies 1 R (f ) = f dλ, p1 ...pˇ j ...pk+1 N →∞ N 2 j ≤k+1 lim

k≤N

and this easily allows us to get the second half of the proposition.

Révész and Ruzsa [1991] considered this problem in a wider arithmetical setting, independently of the works of Baker and Dubins–Pitman. The following notion is introduced.

11.4.10 Definition. A sequence $S$ of positive integers has Rudin-dimension $d$ when there exist sets $S_l = \{n_{k_1}, \ldots, n_{k_l}\} \subset S$ such that

$$\forall i \in [1, l], \qquad n_{k_i} \nmid [n_{k_1}, \ldots, n_{k_{i-1}}, n_{k_{i+1}}, \ldots, n_{k_l}],$$

if and only if $l \le d$. Here $[\,\cdot\,]$ denotes the least common multiple.

Then a sequence of Rudin-dimension 1 is a chain, whereas a sequence of infinite Rudin-dimension is simply a Rudin sequence, namely a sequence satisfying the requirement of Theorem 11.2.4. That notion is in fact equivalent to the notion of breadth, since a sequence $S$ is of finite Rudin-dimension $d$ if and only if it has a breadth equal to $d$.

11.4.11 Theorem. If $S_1$ and $S_2$ have Rudin-dimension $\alpha$ and $\beta$ respectively, then the Rudin-dimension $\gamma$ of the sequence $S_1 \vee S_2$ satisfies $\gamma \le \alpha + \beta$.


Since one can find sequences for which the latter inequality is in fact an equality, the result is also optimal. Observe that a sequence of integers which is built from a given set of $d$ primes is of Rudin-dimension $d$. One could believe, in view of this result, that any sequence with large dimension can be built by means of sequences of smaller dimension. This, however, is not true. Révész and Ruzsa indeed showed the existence of a sequence of dimension 3 which cannot be represented by means of a finite number of chains. The proof is based on Van der Waerden’s theorem. Révész and Ruzsa [1991] also established the following remarkable result.

11.4.12 Theorem. Let $S$ be a sequence of integers with Rudin-dimension equal to $d$. If $S(n) = \#([1, n] \cap S)$, then there exists a positive constant $C$ such that for all $n \ge 1$, $S(n) < C(\log n)^d$.
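As an illustration of Theorem 11.4.12 in the simplest case mentioned above (a sequence built from a given set of $d$ primes), one can count the terms up to $n$ and compare with $(\log n)^d$. This is only a numerical sketch; the function names are illustrative and the constant $C$ of the theorem is not computed.

```python
import math


def generated_by(primes, n):
    """Sorted list of the integers <= n whose prime factors all lie in `primes`."""
    found = {1}
    frontier = [1]
    while frontier:
        m = frontier.pop()
        for p in primes:
            q = m * p
            if q <= n and q not in found:
                found.add(q)
                frontier.append(q)
    return sorted(found)


def counting_ratio(primes, n):
    """S(n) / (log n)^d for the sequence generated by d = len(primes) primes."""
    return len(generated_by(primes, n)) / math.log(n) ** len(primes)


for n in (10**2, 10**4, 10**6):
    print(n, len(generated_by([2, 3], n)), round(counting_ratio([2, 3], n), 3))
```

For the primes 2 and 3 the ratio $S(n)/(\log n)^2$ stays bounded over the range printed, in line with the theorem.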

11.5 Bourgain’s results

The metric entropy criteria of Bourgain [1988a] were studied in detail in Chapter 6. Rudin’s theorem can be deduced from Corollary 6.1.8. Indeed, for every $r \ge 2$, there exist $k_1, \ldots, k_r$ such that for $1 \le i \le r$, $n_{k_i}$ does not divide the least common multiple of $n_{k_1}, \ldots, n_{k_{i-1}}, n_{k_{i+1}}, \ldots, n_{k_r}$. Hence, there are $p_1, \ldots, p_r$ distinct primes such that

$$v_{p_i}(n_{k_i}) > v_{p_i}(n_{k_j}) \quad \text{whenever } i \ne j,$$

where $v_p$ denotes the $p$-adic valuation. Put $N = \operatorname{lcm}(n_{k_1}, \ldots, n_{k_r})/(p_1 \cdots p_r)$ and notice that $n_{k_i}$ does not divide $N$ for $1 \le i \le r$. Consider the set of integers $E = \{ n = N p_1^{\alpha_1} \cdots p_r^{\alpha_r} : \alpha_i \in \{0, 1\} \}$ and the function

$$f = \frac{1}{\sqrt{2^r}} \sum_{n \in E} e^{2i\pi n x}.$$

Then

$$R_{n_{k_s}}(f) = \frac{1}{\sqrt{2^r}} \sum_{n \in E \cap N p_s \mathbb N} e^{2i\pi n x},$$

and for $1 \le s \ne t \le r$,

$$\big\| R_{n_{k_s}}(f) - R_{n_{k_t}}(f) \big\|_2 = \frac{1}{\sqrt 2}.$$

Thus $C(1/\sqrt 2) = \infty$ and this achieves the proof. Akcoglu–Bellow–Jones–Losert–Reinhold-Larsson–Wierdl [1996: Theorem A.2] showed that the strong sweeping out property also takes place there. Slight extensions of Rudin’s result are given in Ruch [1997], [1998a], Ruch and Weber [1997].


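For instance, if $p_1 < p_2 < \cdots$ is a sequence of primes and $\lambda_1, \lambda_2, \ldots$ is a sequence of positive reals such that $\sigma_N = \sum_{k=1}^{N} \lambda_k \uparrow \infty$, there exists $f \in L^\infty(\mathbb T)$ such that the averages

$$B_N(f) = \frac{1}{\sigma_N} \sum_{i=1}^{N} \lambda_i R_{p_i}(f), \qquad N = 1, 2, \ldots$$

do not converge almost everywhere. Related to Marcinkiewicz–Salem’s conjecture, Bourgain [1988d: Theorem 1.10] proved the following beautiful result we already mentioned right after Theorem 11.3.2.

11.5.1 Theorem. For any $f \in L^2(\mathbb T)$,

$$\lim_{N \to \infty} \frac{1}{\log N} \sum_{n=1}^{N} \frac{1}{n} R_n(f) = \int_{\mathbb T} f\, d\lambda \quad \text{almost everywhere.}$$

Sketch of proof. The proof consists of proving the maximal inequality

$$\Big\| \sup_{N = 2^{2^s}} \Big| \frac{1}{N} \sum_{n=1}^{N} R_n(f) \Big| \Big\|_2 \le C \|f\|_2 .$$

Let $f \in L^2(\mathbb T)$, $f(t) \sim \sum_{k \ge 1} \hat f(k) e^{2i\pi k t}$; then

$$\frac{1}{N} \sum_{n=1}^{N} R_n f(t) = \sum_{k \ge 1} \frac{d(k, N)}{N}\, \hat f(k) e^{2i\pi k t},$$

where $d(k, N) = \#\{1 \le n \le N : n \mid k\}$. Let

$$\chi_n(k) = \begin{cases} 1 & \text{if } n \mid k, \\ 0 & \text{otherwise.} \end{cases}$$

Define $P^* = \{p^j,\ p \text{ prime},\ j \ge 1\}$. Notice that

$$\chi_n(k) = \prod_{\substack{p^j,\ j \ge 1 \\ p^j \nmid k}} (1 - \chi_{p^j}(n)) = \prod_{\substack{v \in P^* \\ v \nmid k}} (1 - \chi_v(n)).$$

Thus

$$d(k, N) = \sum_{n=1}^{N} \chi_n(k) = \sum_{n=1}^{N} \prod_{\substack{v \in P^* \\ v \nmid k}} (1 - \chi_v(n)).$$

Consider the multipliers

$$\mu_k^{(N)} = \prod_{\substack{v \in P^*,\ v \le N \\ v \nmid k}} (1 - v^{-1}) = \prod_{\substack{v \in P^* \\ v \le N}} \big[ (1 - v^{-1}) + v^{-1} \chi_v(k) \big].$$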


One checks that

$$(1 - v^{-1}) + v^{-1} \chi_v(k) = |\hat\mu_v(k)|^2,$$

where $\mu_v$ is the probability measure on $\mathbb T$ defined by

$$\mu_v = \sqrt{1 - v^{-1}}\; \delta_0 + \frac{1 - \sqrt{1 - v^{-1}}}{v} \sum_{j=0}^{v-1} \delta_{j/v},$$

and $\delta_x$ denotes the Dirac measure at point $x$. The leading idea of the proof consists of replacing the multipliers $d(k, N)/N$ by $\mu_k^{(N)}$, and using Rota’s theorem [Rota: 1962].

11.5.2 Theorem. Let $(X, \mathcal A, \mu)$ be a probability space. Let $\{T_n, n \ge 1\}$ be positive operators, which are contractions on both $L^1(\mu)$ and $L^\infty(\mu)$ and map the constant-1 function to itself. Then the sequence of operators $T_1 \cdots T_n T_n^* \cdots T_1^*$ yields a bounded operator on $L^p(\mu)$, $p > 1$.

In particular, if the $T_n$ are given by convolution on $\mathbb T$ with a probability measure $\mu_n$, one gets the inequality

$$\Big\| \sup_n \Big| \sum_{k \in \mathbb Z} \hat f(k) \prod_{j=1}^{n} |\hat\mu_j(k)|^2\, e^{2i\pi k t} \Big| \Big\|_p \le C \|f\|_p . \tag{11.5.1}$$

In [Bourgain: 1990], a proof of this is given using the martingale maximal inequality: if $\{E_N, N \ge 1\}$ is a sequence of refining expectation operators on a probability space, then

$$\Big\| \sup_N |E_N f| \Big\|_p \le C \|f\|_p . \tag{11.5.2}$$

By (11.5.1)

$$\Big\| \sup_{N \ge 1} \Big| \sum_{k \ge 1} \mu_k^{(N)} \hat f(k) e^{2i\pi k t} \Big| \Big\|_2 \le C \|f\|_2 .$$

Let

$$M_1 f = \sup_{s \ge 1} \Big| \sum_{k \ge 1} \frac{d(k, 2^{2^s})}{2^{2^s}}\, \hat f(k) e^{2i\pi k t} \Big|, \qquad M_2 f = \sup_{s \ge 1} \Big| \sum_{k \ge 1} \mu_k^{(2^{2^s})} \hat f(k) e^{2i\pi k t} \Big| .$$

Then

$$M_1 f \le M_2 f + \Big( \sum_{s \ge 1} \Big| \sum_{k \ge 1} \Big[ \frac{d(k, 2^{2^s})}{2^{2^s}} - \mu_k^{(2^{2^s})} \Big] \hat f(k) e^{2i\pi k t} \Big|^2 \Big)^{1/2} .$$

By integrating and using Fubini’s theorem, we get for any $f \in L^2(\mathbb T)$,

$$\|M_1 f\|_2 \le \|M_2 f\|_2 + \sup_{k \ge 1} \Big( \sum_{s \ge 1} \Big| \frac{d(k, 2^{2^s})}{2^{2^s}} - \mu_k^{(2^{2^s})} \Big|^2 \Big)^{1/2} \|f\|_2 .$$


The proof will be finished once we know that

$$\sup_{k \ge 1} \sum_{s \ge 1} \Big| \frac{d(k, 2^{2^s})}{2^{2^s}} - \mu_k^{(2^{2^s})} \Big|^2 \le C < \infty .$$

This is the main step. In the course of the proof the following interesting fact is also established: for all $N$ and $k$,

$$\frac{d(k, N)}{N} \le C \mu_k^{(N)} .$$
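The two multipliers compared in this sketch of proof are easy to tabulate. The following is an illustrative computation only, with hypothetical helper names: it evaluates $d(k, N)$, the product $\mu_k^{(N)}$ over prime powers $v \le N$ not dividing $k$, and checks the product formula for $\chi_n(k)$ on small arguments.

```python
def prime_powers_up_to(N):
    """All prime powers p^j (j >= 1) not exceeding N, in increasing order."""
    sieve = [True] * (N + 1)
    powers = []
    for p in range(2, N + 1):
        if sieve[p]:
            for multiple in range(2 * p, N + 1, p):
                sieve[multiple] = False
            q = p
            while q <= N:
                powers.append(q)
                q *= p
    return sorted(powers)


def d(k, N):
    """d(k, N) = #{1 <= n <= N : n divides k}."""
    return sum(1 for n in range(1, N + 1) if k % n == 0)


def mu_multiplier(k, N):
    """Product of (1 - 1/v) over the prime powers v <= N that do not divide k."""
    product = 1.0
    for v in prime_powers_up_to(N):
        if k % v != 0:
            product *= 1.0 - 1.0 / v
    return product


def chi_via_prime_powers(n, k):
    """chi_n(k) as the product of (1 - chi_v(n)) over prime powers v not dividing k."""
    for v in prime_powers_up_to(n):  # a prime power dividing n is at most n
        if k % v != 0 and n % v == 0:
            return 0
    return 1


for k in (12, 30, 97):
    N = 16
    print(k, d(k, N) / N, round(mu_multiplier(k, N), 4))
```

The printed pairs only illustrate how $d(k, N)/N$ and $\mu_k^{(N)}$ compare for a few values of $k$; they say nothing about the size of the constant $C$.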

11.6 Connection with number theory

Riemann sums can be connected to Farey sequences, and through this link to the Riemann Hypothesis (RH). This remarkable fact was observed and developed by Mikolás [1949a], [1949b], [1951]. By comparing the convergence of the averages of a periodic function $f$ along Farey sequences with that of the Riemann sums of $f$, and then studying the error of approximation made in this convergence (for a class of functions with bounded derivative), Mikolás obtained a quite interesting equivalent reformulation of (RH) of functional analysis type. Although Mikolás’s work is still motivating number theorists, it seems to be little known. One can however quote the papers of Kanemitsu–Yoshimoto [1996] and Yoshimoto [1998]. We begin by discussing the link between Farey sequences and Riemann sums and recall some useful estimates concerning the Euler and Möbius functions. For the clarity of the exposition, we will display the arguments leading to the establishment of this link. At the end of this section some other results connecting Riemann sums with number theory, and especially with the prime number theorem, are presented.

Farey sequences. Let $x \ge 1$ be a given real; we denote by

$$F_x = \Big\{ \frac{k}{n},\ 0 < k \le n \le x,\ (k, n) = 1 \Big\}$$

the Farey sequence of order $x$. The $\nu$-th term is denoted by $\rho_\nu^x$, or $\rho_\nu$ when there is no confusion. The number of these fractions is

$$\Phi(x) = \sum_{n=1}^{[x]} \varphi(n) \qquad (n > 1),$$

where $\varphi(n)$ is the Euler function

$$\varphi(n) = \#\{m \le n,\ (m, n) = 1\}, \qquad n > 1.$$

Let $\mu$ be the Möbius function

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } p^2 \mid n, \\ (-1)^k & \text{if } n = p_1 \cdots p_k . \end{cases} \tag{11.6.1}$$


From the formula $\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}$, $s > 1$, we get the following estimate (Hardy–Wright [1979: 287]):

$$\Phi(x) = \frac{3}{\pi^2}\, x^2 + O(x \log x).$$

Recall also for later use (Hardy–Wright [1979: 270]) that

$$M(x) = \sum_{n \le x} \mu(n) = o(x) \quad \text{and} \quad \sum_{n \le x} |\mu(n)| = \frac{6}{\pi^2}\, x + O(\sqrt x). \tag{11.6.2}$$

For an arbitrary real-valued function $h$ defined on $[0, 1]$, we have already denoted the associated Riemann sums by

$$R_n(h) = \frac{1}{n} \sum_{k=1}^{n} h\Big( \frac{k}{n} \Big). \tag{11.6.3}$$

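The link between Farey sequences and Riemann sums is deduced via the Möbius inversion formula: if $g(n) = \sum_{d \mid n} f(d)$, then

$$f(n) = \sum_{d \mid n} \mu(d)\, g\Big( \frac{n}{d} \Big).$$

See Hardy–Wright [1979; p. 266]. Let

$$U(n) = \sum_{\substack{(k, n) = 1 \\ k \le n}} h\Big( \frac{k}{n} \Big), \qquad V_n = \sum_{d \mid n} U(d).$$

Then

$$V_n = \sum_{d \mid n} \sum_{\substack{(k, d) = 1 \\ k \le d}} h\Big( \frac{k}{d} \Big) = \sum_{k=1}^{n} h\Big( \frac{k}{n} \Big),$$

and so, letting $F(d) = d R_d(h)$,

$$U(n) = \sum_{d \mid n} \mu\Big( \frac{n}{d} \Big) \sum_{\ell=1}^{d} h\Big( \frac{\ell}{d} \Big) = \sum_{d \mid n} \mu\Big( \frac{n}{d} \Big) F(d).$$

We deduce

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu) = \sum_{n=1}^{[x]} U(n) = \sum_{n=1}^{[x]} \sum_{d \mid n} \mu\Big( \frac{n}{d} \Big) F(d) = \sum_{d\delta \le x} \mu(\delta) F(d) = \sum_{d=1}^{[x]} F(d) \sum_{\delta=1}^{[x/d]} \mu(\delta) = \sum_{d=1}^{[x]} d R_d(h)\, M\Big( \frac{x}{d} \Big). \tag{11.6.4}$$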
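Identity (11.6.4) can be checked in exact arithmetic for small integer orders $x$. The sketch below uses Python’s `fractions.Fraction`; the function names are illustrative, and $M(x/d)$ is evaluated as $M([x/d])$.

```python
from fractions import Fraction
from math import gcd


def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a squared prime divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def mertens(y):
    """M(y) = sum of mu(n) over n <= y."""
    return sum(mobius(n) for n in range(1, int(y) + 1))


def farey_sum(h, x):
    """Left-hand side of (11.6.4): sum of h over the Farey fractions of order x."""
    return sum(h(Fraction(k, n))
               for n in range(1, x + 1)
               for k in range(1, n + 1) if gcd(k, n) == 1)


def riemann_side(h, x):
    """Right-hand side of (11.6.4): sum over d of d*R_d(h)*M(x/d)."""
    total = Fraction(0)
    for d in range(1, x + 1):
        F_d = sum(h(Fraction(k, d)) for k in range(1, d + 1))  # F(d) = d R_d(h)
        total += F_d * mertens(x // d)
    return total


h = lambda t: t * t
print(farey_sum(h, 10), riemann_side(h, 10))
```

Both sides are returned as exact fractions, so equality can be tested without rounding issues.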

Thus for any real $A$,

$$\frac{1}{\Phi(x)} \sum_{1 \le \nu \le \Phi(x)} h(\rho_\nu) - A = \frac{1}{\Phi(x)} \sum_{1 \le n \le [x]} n\, (R_n(h) - A)\, M([x]/n). \tag{11.6.5}$$

If $R_u(h) \to A$ as $u \to \infty$, then

$$\frac{1}{\Phi(x)} \sum_{\nu=1}^{\Phi(x)} h(\rho_\nu) \to A, \tag{11.6.6}$$

as $x \to \infty$, by Toeplitz’s criterion which we recall now.

11.6.1 Lemma. Let $t_1, t_2, \ldots$ be a sequence of reals converging to 0, and let $\{a_{k,l},\ k, l \ge 1\}$ be an array of reals satisfying the following conditions:

(1) $\forall l$, $\lim_{k \to \infty} a_{k,l} = 0$,

(2) $S(k) = |a_{k,1}| + |a_{k,2}| + \cdots + |a_{k,k}| = O(1)$.

Then the new sequence $\{t'_k, k \ge 1\}$ defined by $t'_k = a_{k,1} t_1 + a_{k,2} t_2 + \cdots + a_{k,k} t_k$ converges to zero as well.

For a proof see Kuipers–Niederreiter [1971], p. 75. We show that conditions (1) and (2) are indeed satisfied:

(1) for all fixed $n$, $\dfrac{n\, |M(x/n)|}{\Phi(x)} \le \dfrac{x}{\Phi(x)} \sim \dfrac{\pi^2}{3x} \to 0$, $x \to \infty$,

(2) $\dfrac{1}{\Phi(x)} \sum_{n=1}^{[x]} n\, |M(x/n)| \le \dfrac{x^2}{\Phi(x)} \sim \dfrac{\pi^2}{3}$, $x \to \infty$.

We can thus state

11.6.2 Theorem. Let $h$ be such that the Riemann sums $R_n(h)$ converge to a (finite) real $A$ as $n$ tends to infinity. Then the associated Farey averages converge to $A$:

$$F_x h = \frac{1}{\Phi(x)} \sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) \to A.$$

Thus if $\lim_{n \to \infty} R_n(h) = \int_{\mathbb T} h(t)\,dt$, then $\lim_{x \to \infty} F_x h = \int_{\mathbb T} h(t)\,dt$ as well.

We shall now estimate the error of approximation

$$\frac{1}{\Phi(x)} \sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) - \int_{\mathbb T} h(t)\,dt, \tag{11.6.7}$$

and its connection with (RH).

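If $h$ has bounded derivative on $[0, 1]$, then $d \big[ R_d(h) - \int_{\mathbb T} h(t)\,dt \big] = O(1)$. Using this, we get

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) - \Phi(x) \int_{\mathbb T} h(t)\,dt = O(x \log x). \tag{11.6.8}$$

This may however be easily improved. By using the simple relation (Landau [1927: II, p. 176]),

$$\sum_{\nu=1}^{\Phi(x)} \Big( \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big)^2 = O(1) \tag{11.6.9}$$

and writing that

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) = \sum_{\nu=1}^{\Phi(x)} \Big[ h(\rho_\nu^x) - h\Big( \frac{\nu}{\Phi(x)} \Big) \Big] + \sum_{\nu=1}^{\Phi(x)} h\Big( \frac{\nu}{\Phi(x)} \Big),$$

we get

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) - \Phi(x) \int_{\mathbb T} h(t)\,dt = O(1) \sum_{\nu=1}^{\Phi(x)} \Big| \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big| + O(1).$$

And so, in view of the estimate $\Phi(x) \sim x^2$ and Cauchy–Schwarz’s inequality, we arrive at

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) - \Phi(x) \int_{\mathbb T} h(t)\,dt = O\Big( x \Big( \sum_{\nu=1}^{\Phi(x)} \Big( \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big)^2 \Big)^{1/2} \Big). \tag{11.6.10}$$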

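Thus by (11.6.9), $\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) - \Phi(x) \int_{\mathbb T} h(t)\,dt = O(x)$. Now, recall Franel’s identity (Franel [1924] or Landau [1927: II, 173])

$$\sum_{\nu=1}^{\Phi(x)} \Big( \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big)^2 = \frac{1}{12\, \Phi(x)} \Big( \sum_{a=1}^{[x]} \sum_{b=1}^{[x]} M\Big( \frac{x}{a} \Big) M\Big( \frac{x}{b} \Big) \frac{(a, b)^2}{ab} - 1 \Big), \tag{11.6.11}$$

and Tchudakov’s result [1936: p. 591–602] on the error of approximation in the prime number theorem

$$\pi(x) - \int_2^x \frac{du}{\log u} = O\big( x e^{-c_1 (\log x)^\gamma} \big). \tag{11.6.12}$$

By using its analogue for the Möbius function (Fogels [1940])

$$M(x) = O\big( x e^{-c_2 (\log x)^\gamma} \big), \tag{11.6.13}$$

where $\gamma \in \big] \tfrac12, \tfrac{11}{21} \big[$ and $c_1 = c_1(\gamma)$, $c_2 = c_2(\gamma)$ are constants, we get the much better estimate (Mikolás [1949a])

$$\sum_{\nu=1}^{\Phi(x)} \Big( \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big)^2 = O\big( x e^{-c_3 (\log x)^\gamma} \big). \tag{11.6.14}$$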
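Franel’s identity (11.6.11) can be verified exactly for small integer orders. The sketch below is illustrative only (hypothetical function names); it computes both sides with exact fractions, evaluating $M(x/a)$ as $M([x/a])$.

```python
from fractions import Fraction
from math import gcd


def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def mertens(y):
    """M(y) = sum of mu(n) over n <= y."""
    return sum(mobius(n) for n in range(1, int(y) + 1))


def farey_fractions(x):
    """Farey fractions k/n with 0 < k <= n <= x and gcd(k, n) = 1, in increasing order."""
    return sorted(Fraction(k, n)
                  for n in range(1, x + 1)
                  for k in range(1, n + 1) if gcd(k, n) == 1)


def franel_sum(x):
    """Left-hand side of (11.6.11)."""
    rho = farey_fractions(x)
    phi = len(rho)
    return sum((r - Fraction(nu, phi)) ** 2 for nu, r in enumerate(rho, start=1))


def franel_rhs(x):
    """Right-hand side of (11.6.11)."""
    M = [0] + [mertens(x // a) for a in range(1, x + 1)]
    double_sum = sum(Fraction(M[a] * M[b] * gcd(a, b) ** 2, a * b)
                     for a in range(1, x + 1) for b in range(1, x + 1))
    return (double_sum - 1) / (12 * len(farey_fractions(x)))


for x in (3, 5, 10):
    print(x, franel_sum(x), franel_rhs(x))
```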

Our next theorem relates to the Riemann Hypothesis, which we briefly recall.

The Riemann Hypothesis. The Riemann zeta function, defined on the half-plane $\{s : \Re s > 1\}$ by the series

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s},$$

admits a meromorphic continuation to the entire complex plane, with a unique simple pole, of residue 1, at $s = 1$. In the half-plane $\{s : \Re s \le 0\}$, the Riemann zeta function has simple zeros at $-2, -4, -6, \ldots$, and only at these points, which are called trivial zeros. There also exist non-trivial zeros in the band $\{s : 0 < \Re s < 1\}$. See for instance [Blanchard: 1969] (Propositions IV.10 and IV.11, p. 84) and [Titchmarsh: 1951]. The Riemann Hypothesis (RH) asserts that all non-trivial zeros of the function $\zeta$ have abscissa 1/2. If the RH is true we have the well-known relations, the first implying the second:

$$M(x) = O\Big( x^{\frac12 + c_4 \frac{\log\log\log x}{\log\log x}} \Big), \qquad \sum_{\nu=1}^{\Phi(x)} \Big( \rho_\nu^x - \frac{\nu}{\Phi(x)} \Big)^2 = O\Big( x^{1 + c_5 \frac{\log\log\log x}{\log\log x}} \Big). \tag{11.6.15}$$

These estimates allow us to establish the first part of the following result (Mikolás [1949: Theorems 3, 4]). The proof of the second part relies upon Dirichlet series machinery.

11.6.3 Theorem. Assume that $h$ has a bounded derivative. Then

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) = \Phi(x) \int_{\mathbb T} h(t)\,dt + O\big( x e^{-c (\log x)^\gamma} \big),$$

where $\gamma \in \big] \tfrac12, \tfrac{11}{21} \big[$ and $c = c(\gamma)$ is a constant. And if (RH) is true, then for every $\varepsilon > 0$,

$$\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) = \Phi(x) \int_{\mathbb T} h(t)\,dt + O\big( x^{\frac12 + \varepsilon} \big).$$

Conversely, if $h$ has a bounded derivative and

(i) $\sum_{\nu=1}^{\Phi(x)} h(\rho_\nu^x) = \Phi(x) \int_{\mathbb T} h(t)\,dt + O\big( x^{\frac12 + \varepsilon} \big)$,

(ii) $F(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \big( n R_n(h) - n \int_{\mathbb T} h(t)\,dt \big)$ is regular and has no zero in the strip $\Re(s) > \frac12$,

then (RH) is true.

A remarkable consequence of this result is the following

11.6.4 Theorem. Let $f \in C^3([0, 1])$ be such that $f'''(t)$ is not identically 0, and

$$\frac{|f'(1) - f'(0)|}{\int_{\mathbb T} |f'''(t)|\,dt} > \frac{3\zeta(3)}{2\pi} \approx 0.574\ldots,$$

then

$$\text{(RH)} \iff \sum_{\nu=1}^{\Phi(x)} f(\rho_\nu^x) = \Phi(x) \int_{\mathbb T} f(t)\,dt + O\big( x^{\frac12 + \varepsilon} \big), \quad \forall \varepsilon > 0.$$

Examples are $f(t) = e^{\lambda t}$, $\lambda \ne 0$, $|\lambda| < 2\pi/(3\zeta(3))$, or $f(t) = \cos \tau t$ $(0 < \tau \le \frac{\pi}{2})$. The proof consists of establishing condition (ii) of Theorem 11.6.3, under the assumptions made. To prove that $F$ has no zero in the strip $\Re(s) > \frac12$, Mikolás used the Euler–Maclaurin sum-formula at order 1: for $\varphi$ having a continuous derivative in the interval $(a, b)$,

$$\sum_{a \le n \le b} \varphi(n) = \int_a^b \varphi(x)\,dx + \int_a^b \big( x - [x] - \tfrac12 \big) \varphi'(x)\,dx + \big( a - [a] - \tfrac12 \big) \varphi(a) - \big( b - [b] - \tfrac12 \big) \varphi(b),$$

to estimate the sum $n R_n(h) - n \int_{\mathbb T} h(t)\,dt$. This allows us to write $F$ as a difference of two Dirichlet series, and reduces the study of the zeroes of $F$ to finding good bounds for these two Dirichlet series. Yoshimoto [1998] recently showed that the constant $\frac{3\zeta(3)}{2\pi}$ can be slightly sharpened by √ $ % 2 3 2 2 π + log 2 − · 6 3 3

Other equivalent reformulations of the RH. Among the many equivalent reformulations of the RH, the following one, due to Robin [1984], is likely one of the most striking and at the same time one of the simplest. Let an integer $n$ be termed “colossally abundant” if, for some $\varepsilon > 0$, $\sigma(n)/n^{1+\varepsilon} \ge \sigma(m)/m^{1+\varepsilon}$ for $m < n$ and $\sigma(n)/n^{1+\varepsilon} > \sigma(m)/m^{1+\varepsilon}$ for $m > n$. Using colossally abundant numbers, Robin showed that the RH is true if and only if

$$\frac{\sigma(n)}{n} < e^\gamma \log\log n, \quad \text{for } n > 5040,$$

where $\sigma(n)$ is the sum of divisors of $n$ and $\gamma$ is Euler’s constant. Let $\{x_n, n \ge 1\}$ be the sequence of colossally abundant numbers. In the same paper, he also showed that the sequence $\{\sigma(x_n)/(x_n \log\log x_n), n \ge 1\}$ contains an infinite number of local extrema. In relation with Robin’s result, Lagarias [2002] showed that the RH is true if and only if

$$\sigma(n) \le H_n + e^{H_n} \log H_n,$$

where $H_n = \sum_{j \le n} 1/j$ is the n-th harmonic number.
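Both criteria are easy to test numerically on a small range. The sketch below is an illustration, not part of the source: it uses floating-point arithmetic and a hard-coded value of Euler’s constant, so it only probes the inequalities for small $n$ and proves nothing about the RH.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded to double precision


def sigma(n):
    """Sum of the divisors of n, by trial division up to sqrt(n)."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def robin_holds(n):
    """Robin's inequality sigma(n)/n < e^gamma * log log n (meaningful for n >= 3)."""
    return sigma(n) / n < math.exp(EULER_GAMMA) * math.log(math.log(n))


def lagarias_holds(n):
    """Lagarias' inequality sigma(n) <= H_n + exp(H_n) * log(H_n)."""
    H = sum(1.0 / j for j in range(1, n + 1))
    return sigma(n) <= H + math.exp(H) * math.log(H)


print(sigma(5040), robin_holds(5040), robin_holds(5041))
print(all(lagarias_holds(n) for n in range(1, 200)))
```

The value $n = 5040$ fails Robin’s inequality, which is why the criterion is stated for $n > 5040$.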

Grytczuk [2007] investigated the upper bound for $\sigma(n)$ with some different $n$. Let $(2, m) = 1$ and $m = \prod_{j=1}^{k} p_j^{\alpha_j}$, where the $p_j$ are prime numbers and $\alpha_j \ge 1$. Then, for all odd positive integers $m > 3^9/2$,

$$\sigma(2m) < \frac{39}{40}\, e^\gamma\, 2m \log\log 2m, \qquad \text{and} \qquad \sigma(m) < e^\gamma m \log\log m.$$

Some other criteria equivalent to the RH can be found in Cislo and Wolf [2008].

Prime number theorem. We conclude with some other links between Riemann sums and the prime number theorem. We begin by quoting some results of Wintner [1957]. Let $f \in L^1(\mathbb T)$, $f \sim \sum_{m=1}^{\infty} g_m(x)$ where $g_m(x) = c_m e(mx) + c_{-m} e(-mx)$ and $c_0 = 0$. Then $R_n(f) \sim \sum_{m=1}^{\infty} g_{nm}(x)$. One has formally $g_n(x) = \sum_{m=1}^{\infty} \mu(m) f_{nm}(x)$, where $\mu$ is the Möbius function and $f_{nm}$ stands for $R_{nm}(f)$. Wintner investigated the convergence of the series $\sum_{m=1}^{\infty} \mu(m) f_{nm}(x)$, which represents the coefficients of the Fourier series of $f$ in terms of the equidistant Riemann sums. It is shown that the series converges for every $x$ if $f$ satisfies a Lipschitz condition of order greater than 1/2, and need not always converge, even though the Fourier series converges absolutely. Wintner also showed that a continuous 1-periodic function $f$ is analytic if and only if there exists a positive constant $q = q_f < 1$ having the property that, for every positive integer $n$ and for every real $x$,

$$\Big| R_n(f)(x) - \int_{\mathbb T} f\, d\lambda \Big| \le \mathrm{const} \cdot q^n .$$

Byrnes, Giroux and Shisha [1984: p. 181] considered step functions and proved the following result.

11.6.5 Theorem. Let $f$ be a real step function on $]0, 1]$,

$$f(x) = a_n \quad \text{throughout } \big] \tfrac{1}{n+1}, \tfrac{1}{n} \big], \quad n = 1, 2, \ldots .$$

Suppose that the sequence of Riemann sums $\frac{1}{n} \sum_{k=1}^{n} f\big( \frac{k}{n} \big)$, $n = 1, 2, \ldots$ converges. Then so does the improper Riemann integral $\int_{0+}^{1} f(t)\,dt$, and to the same limit.

From this theorem, the prime number theorem follows in a rather simple fashion. Put for any real $x \ge 1$, $\psi(x) = \sum \log p$, where the sum is taken over all ordered pairs $(p, m)$ for which $p$ is a prime and $m$ a natural number satisfying $p^m \le x$. Define $f(x) = \psi(x^{-1}) - [x^{-1}]$.

We have for $n \ge 1$,

$$\sum_{k=1}^{n} d(k) = \sum_{k=1}^{n} \sum_{j \mid k,\ j \ge 1} 1 = \sum_{j=1}^{n} \sum_{k=1}^{[n/j]} 1 = \sum_{k=1}^{n} \Big[ \frac{n}{k} \Big], \tag{11.6.16}$$

where $d(\cdot)$ is the divisor function. A classical result of Dirichlet yields for $n \ge 1$ ($\gamma$ being Euler’s constant),

$$\sum_{k=1}^{n} \Big[ \frac{n}{k} \Big] = n \log n + (2\gamma - 1) n + O(\sqrt n). \tag{11.6.17}$$
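Relations (11.6.16) and (11.6.17) can be observed numerically. The sketch below is illustrative: the function names are not from the source, Euler’s constant is hard-coded, and the printed remainder is only an observation of the $O(\sqrt n)$ term for a few values of $n$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded to double precision


def divisor_count(k):
    """d(k): number of positive divisors of k."""
    return sum(1 for j in range(1, k + 1) if k % j == 0)


def divisor_sum_direct(n):
    """Sum of d(k) for k = 1..n, computed from the definition."""
    return sum(divisor_count(k) for k in range(1, n + 1))


def divisor_sum_floor(n):
    """Sum of [n/k] for k = 1..n, the right-hand side of (11.6.16)."""
    return sum(n // k for k in range(1, n + 1))


def dirichlet_remainder(n):
    """Difference between the sum of [n/k] and n log n + (2*gamma - 1) n."""
    return divisor_sum_floor(n) - (n * math.log(n) + (2 * EULER_GAMMA - 1) * n)


for n in (10, 100, 10_000):
    print(n, divisor_sum_floor(n), round(dirichlet_remainder(n), 3))
```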

Further

$$\sum_{k=1}^{n} \psi\Big( \frac{n}{k} \Big) = n \log n - n + O(1 + \log n), \qquad n = 1, 2, \ldots . \tag{11.6.18}$$

In view of the two equalities (11.6.17) and (11.6.18), we obtain that $R_n(f) \to -2\gamma$. Applying Theorem 11.6.5 shows that $\int_{0+}^{1} f(t)\,dt$ converges. Then so does the integral $\int_{0+}^{1} \big( \psi(t^{-1}) - t^{-1} \big)\,dt$, and therefore

$$\lim_{x \to \infty} \frac{\psi(x)}{x} = 1,$$

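from which the prime number theorem follows in an elementary way.

Selvaraj [1991] has given a much easier proof of the preceding result by using a theorem of Landau. Let $g(x) = f(1/x)$, where $f(x) = a_{[1/x]}$ throughout $(0, 1]$ is the function $f$ defined in Theorem 11.6.5. Put also

$$G(x) = \sum_{k \le x} g\Big( \frac{x}{k} \Big) = \sum_{k \le x} f\Big( \frac{k}{x} \Big).$$

For $0 < \varepsilon < 1$,

$$\int_\varepsilon^1 f(x)\,dx = \int_\varepsilon^1 g\Big( \frac{1}{x} \Big)\,dx = \int_1^{1/\varepsilon} \frac{1}{t^2}\, g(t)\,dt = \int_1^{1/\varepsilon} \frac{1}{t^2} \sum_{k \le t} \mu(k)\, G\Big( \frac{t}{k} \Big)\,dt = \sum_{k \le 1/\varepsilon} \int_k^{1/\varepsilon} \frac{\mu(k)}{t^2}\, G\Big( \frac{t}{k} \Big)\,dt,$$

where $\mu$ denotes the Möbius function. Since $G(x) = G([x])$,

$$\frac{G(x)}{x} = \frac{1}{[x]} \sum_{k \le [x]} f\Big( \frac{k}{[x]} \Big) \cdot \frac{[x]}{x},$$

and thus

$$\lim_{x \to \infty} \frac{G(x)}{x} = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} f\Big( \frac{k}{n} \Big) = L.$$

Hence $G(x) = Lx + o(x)$ as $x \to \infty$. Owing to the fact that $\int_k^\infty \frac{1}{t}\,dt$ diverges, we have, by applying a result of Landau [1953: p. 568],

$$\int_\varepsilon^1 f(x)\,dx = \sum_{k \le 1/\varepsilon} \frac{\mu(k)}{k} \int_k^{1/\varepsilon} \Big( \frac{L}{t} + o\Big( \frac{1}{t} \Big) \Big)\,dt = L \cdot S\Big( \frac{1}{\varepsilon} \Big) + o\Big( S\Big( \frac{1}{\varepsilon} \Big) \Big),$$

where

$$S\Big( \frac{1}{\varepsilon} \Big) = \sum_{k \le 1/\varepsilon} \frac{\mu(k)}{k} \int_k^{1/\varepsilon} \frac{1}{t}\,dt = \log\frac{1}{\varepsilon} \sum_{k \le 1/\varepsilon} \frac{\mu(k)}{k} - \sum_{k \le 1/\varepsilon} \frac{\mu(k)}{k} \log k = \log\frac{1}{\varepsilon} \cdot o\Big( \frac{1}{\log\frac{1}{\varepsilon}} \Big) - \sum_{k \le 1/\varepsilon} \frac{\mu(k)}{k} \log k .$$

Therefore as $\varepsilon \to 0+$,

$$\int_\varepsilon^1 f(x)\,dx = L\, \big( o(1) - (-1) \big) + o(1),$$

and

$$\int_{0+}^{1} f(x)\,dx = L,$$

which is exactly the claimed result.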

11.7 Riemann sums and the randomly sampled trigonometric system

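Throughout this section, $\varepsilon = \{\varepsilon_i, i \in \mathbb N\}$ denotes a Bernoulli sequence ($P\{\varepsilon_i = 0\} = P\{\varepsilon_i = 1\} = 1/2$) with basic probability space $(\Omega, \mathcal A, P)$, and let $S_\ell = \varepsilon_1 + \cdots + \varepsilon_\ell$, $\ell \in \mathbb N$. Let us specify, if necessary, that $\mathbb N$ denotes the set of positive integers. Let also $e(x) = \exp(2i\pi x)$, $e_\ell(x) = e(\ell x)$, $x \in \mathbb T$, $\ell \ge 0$. For $f \in L^2(\mathbb T)$, the Riemann sums

$$R_n f(x) = \frac{1}{n} \sum_{k=0}^{n-1} f\Big( x + \frac{k}{n} \Big) \tag{11.7.1}$$

are more conveniently written as ($f = \sum_{\ell \in \mathbb N} a_\ell e_\ell$),

$$R_n f(x) = \sum_{\substack{\ell \in \mathbb N \\ n \mid \ell}} a_\ell e_\ell . \tag{11.7.2}$$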
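The passage from (11.7.1) to (11.7.2) rests on the fact that $R_n e_\ell = e_\ell$ when $n \mid \ell$ and $R_n e_\ell = 0$ otherwise. A short numerical sketch of this fact, with illustrative function names and a tolerance chosen for double-precision arithmetic:

```python
import cmath


def e(l, x):
    """The character e_l(x) = exp(2 i pi l x)."""
    return cmath.exp(2j * cmath.pi * l * x)


def riemann_sum(f, n, x):
    """R_n f(x) = (1/n) * sum_{k=0}^{n-1} f(x + k/n), as in (11.7.1)."""
    return sum(f(x + k / n) for k in range(n)) / n


def riemann_sum_of_character(l, n, x):
    """R_n applied to the character e_l, evaluated at x."""
    return riemann_sum(lambda t: e(l, t), n, x)


x = 0.3
for n, l in ((4, 8), (4, 6), (5, 15), (5, 7)):
    value = riemann_sum_of_character(l, n, x)
    print(n, l, abs(value - e(l, x)) < 1e-9, abs(value) < 1e-9)
```

Each printed line shows whether the Riemann sum reproduced $e_\ell(x)$ or vanished, up to rounding.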

Let $\mathcal N = \{n_k, k \ge 1\}$ be some increasing sequence of integers and consider the sequence of averages

$$A_N^{\mathcal N} f = \frac{1}{\#(\mathcal N \cap [1, N])} \sum_{n \in \mathcal N \cap [1, N]} R_n f . \tag{11.7.3}$$

We shall study the convergence almost everywhere of the sequence of averages $A_N^{\mathcal N} f$, $N = 1, 2, \ldots$, not with respect to the trigonometric system $\{e_\ell, \ell \in \mathbb N\}$, but with respect to the randomly sampled trigonometric system $\{e_{S_\ell}, \ell \in \mathbb N\}$. By this we mean to study the convergence for functions $\tilde f$ having a Fourier expansion with respect to the system $\{e_{S_\ell}, \ell \in \mathbb N\}$,

$$\tilde f = \sum_{\ell \in \mathbb N} a_\ell e_{S_\ell}, \tag{11.7.4}$$

where $a = (a_\ell)_{\ell \in \mathbb N} \in \ell^2(\mathbb N)$. A first question which comes to mind immediately concerns the convergence in $L^2(\mathbb T)$ and for almost all $x$, $P$-almost all $\omega$, of the trigonometric series in (11.7.4). We will show that if $a \in \ell^2(\mathbb N)$, then this property holds. Moreover, if the coefficients $a_\ell$ have constant signs, the convergence in $L^2(\mathbb T)$ of the series $\sum_{\ell \in \mathbb N} a_\ell e_{S_\ell(\omega)}(x)$ on a measurable set of $\omega$’s of positive probability implies that $a \in \ell^2(\mathbb N)$. So that, in that case, the series (11.7.4) converges in $L^2(\mathbb T)$ and for almost all $x$, for $P$-almost all $\omega$, if and only if the non-random trigonometric series $\sum_{\ell \in \mathbb N} a_\ell e_\ell$ converges almost everywhere. This fact becomes easier to understand once the role played in that context by the Green function of some random walk associated to $\{S_\ell, \ell \in \mathbb N\}$ is highlighted. Now, similarly to the usual trigonometrical system, one may consider $\tilde f = \tilde f_+ + \tilde f_-$, where $\tilde f_+ = \sum_{a_\ell \ge 0} a_\ell e_{S_\ell}$, $\tilde f_- = \sum_{a_\ell < 0} a_\ell e_{S_\ell}$, and treat the convergence almost everywhere of the series $\tilde f$ by considering those of $\tilde f_+$, $\tilde f_-$, which are characterized by the above discussion.

The series $\tilde f$ has a richer structure than the Fourier expansion of $f$. Working then on the product space $\mathbb T \times \Omega$, and using, conditionally to the first factor space, the properties of the related characteristic functions, enables us to show the validity of the Marcinkiewicz–Salem conjecture, relative to the randomly sampled trigonometric system.

11.7.1 Theorem. For any $a = \{a_\ell, \ell \in \mathbb N\} \in \ell^2(\mathbb N)$, and for $\tilde f$ defined in (11.7.4),

$$\lim_{N \to \infty} A_N \tilde f \overset{\text{a.s.}}{=} 0.$$

This theorem is a particular case of a more general result.

11.7.2 Theorem. Let $\mathcal N = \{n_k, k \ge 1\}$ be an increasing sequence of integers satisfying the following growth condition: for some $\theta > 7$, $n_j = O(j^2 \log^{-\theta} j)$.

Then for any $a = \{a_\ell, \ell \in \mathbb N\} \in \ell^2(\mathbb N)$, and for $\tilde f$ defined in (11.7.4),

$$\lim_{N \to \infty} A_N^{\mathcal N} \tilde f \overset{\text{a.s.}}{=} 0.$$

The “a.s.” symbol means that the convergence holds for almost all $x$ in $\mathbb T$, and almost all $\omega$ in $\Omega$. A speed of convergence, as well as corresponding maximal $L^2$-inequalities, can be specified in both statements. Before passing to the proof of these results, it will be necessary to first establish some relevant intermediate results. The leading idea will consist in showing that the conditions of application of some classical convergence criteria of Gál–Koksma’s type (or of some variants established in Section 8.4) are fulfilled by the sequence $\{R_n \tilde f, n \in \mathcal N\}$. The proof will thereby rely upon some metrical estimates. An essential step of the proof will be devoted to estimating the increments

$$E \int_{\mathbb T} \Big| \sum_{n \in [i, j] \cap \mathcal N} R_n \tilde f(x) \Big|^2 dx .$$

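We introduce the following kernel defined for any positive integers $n$ and $m$,

$$Q_{\mathcal N, a}(n, m) = \sum_{\ell = n}^{\infty} |a_\ell| \Big( \frac{1}{\ell^{1/2}} \vee \frac{1}{[m, n]} \Big) \sum_{\ell' = (\ell+1) \vee m}^{\infty} |a_{\ell'}|\, 2^{-(\ell' - \ell)} + \sum_{\ell \ge n \vee m} a_\ell^2 \Big( \frac{1}{\ell^{1/2}} \vee \frac{1}{[m, n]} \Big). \tag{11.7.5}$$

11.7.3 Proposition. There exist two absolute constants $C$ and $i_0$, such that for any $j \ge i \ge i_0$, any increasing sequence $\mathcal N$ of positive integers, any sequence $a \in \ell^2(\mathbb N)$,

$$E \int_{\mathbb T} \Big| \sum_{n \in [i, j] \cap \mathcal N} R_n \tilde f(x) \Big|^2 dx \le C \sum_{n, m \in [i, j] \cap \mathcal N} Q_{\mathcal N, a}(n, m).$$

Proof. Using the formula

$$n\, \delta_{n \mid S_\ell} = \sum_{j=0}^{n-1} e^{2i\pi j S_\ell / n}, \tag{11.7.6}$$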

and the obvious fact that n|S implies ≥ S ≥ n, allows us to write

Rn f˜(x) =

n∈[i,j ]∩N

a eS δn|S =

n∈[i,j ]∩N ≥n

=

≥1

a eS

n∈[i,j ]∩N ≥n

n∈[i,j ∧]∩N

1 n

n−1

1 2iπ kS /n a eS e n n−1 k=0

e2iπ kS /n .

k=0

(11.7.7)


And by integration, E

T n∈[i,j ]∩N

2 Rn f˜(x) dx

=E a eS =

n−1 1 2iπ kS /n 2 e dx n

T ≥1

n∈[i,j ∧]∩N

a2 E

n−1

n∈[i,j ∧]∩N

≥1

+

=

1 n

k=0

2 k e2iπ n S

k=0

a a E δS =S

n∈[i,j ∧]∩N m∈[i,j ∧ ]∩N

n−1 m−1 1 2iπ k − h S l n m . e nm k=0 h=0

By independence, we have (letting for instance > ) k h k h k h E δS =S e2iπ( n − m )Sl = E δ{S −S =0} e2iπ( n − m )(Sl −S ) e2iπ( n − m )S k h k h = E δ{S −S =0} e2iπ( n − m )(Sl −S ) E e2iπ( n − m )S k h = P S = S E e2iπ( n − m )S . And so E

T n∈[i,j ]∩N

=

a2 E

n∈[i,j ∧]∩N

≥1

+2

2 Rn f˜(x) dx

n−1 1 2iπ kS /n 2 e n k=0

a a P S = S E

>

n∈[i,j ∧]∩N m∈[i,j ∧ ]∩N

n−1 m−1 1 2iπ k − h S n m := S1 + S2 . e nm k=0 h=0

(11.7.8) D We treat the sum S2 first. Plainly for u > v, Su − Sv = Su−v , and so P Su = Sv } = 2−(u−v) . Then,

k h n−1 m−1 e2iπ n − m + 1 1 S2 = a a 2−( −) nm 2 n∈[i,j ∧]∩N <

+

m∈[i,j ∧ ]∩N

a a 2

>

:= S2 + S2 .

−(− )

n∈[i,j ∧]∩N m∈[i,j ∧ ]∩N

k=0 h=0

n−1 m−1

1 e2iπ nm k=0 h=0

k

h n−m

2

+1

(11.7.9)


By interchanging the notation of the indices ( with , n with m and k with h), we get

−2iπ nk − mh n−1 m−1 +1 1 e S2 = a a 2−( −) . nm 2 n∈[i,j ∧]∩N <

Since

e2iπx +1 2

k=0 h=0

m∈[i,j ∧ ]∩N

= eiπ x cos (π x),

S2 ≤ 2 |a ||a |2−( −) <

n−1 m−1 1 k h cos π . (11.7.10) − nm n m

n∈[i,j ∧]∩N m∈[i,j ∧ ]∩N

k=0 h=0

The equation $km - hn = a$ admits a solution if and only if $(m, n) \mid a$. Let $d = (m, n)$. Writing $m = d m'$, $n = d n'$, $(m', n') = 1$ and $a = d a'$ gives

$$\frac{km - hn}{mn} = \frac{k m' - h n'}{[m, n]} = \frac{a'}{[m, n]} .$$

And $|a'| \le \max(|k m'|, |h n'|) < [m, n]$. Further, the number of solutions of $k m' - h n' = a'$ is at most $2d$. Indeed, any other solution $(\tilde k, \tilde h)$ satisfies $(\tilde k - k) m' = (\tilde h - h) n'$, that is $\tilde k - k = j n'$, $\tilde h - h = j m'$. And $|j| n' \le n$, thereby $|j| \le d$. We have thus the majoration

$$\frac{1}{nm} \sum_{k=0}^{n-1} \sum_{h=0}^{m-1} \Big| \cos^{\ell} \pi \Big( \frac{k}{n} - \frac{h}{m} \Big) \Big| \le \frac{C}{[m, n]} \sum_{u=0}^{[m, n] - 1} \Big| \cos^{\ell} \frac{\pi u}{[m, n]} \Big| . \tag{11.7.11}$$

Let a > a > 1/2 and put 3 ϕ =

2a log ,

τ =

sin ϕ /2 . ϕ /2

We assume sufficiently large for τ to be greater than (a /a)1/2 . This is realized once i is large enough, say i ≥ i0 . Consider the sector

If

πu [m,n]

A = [0, ϕ [ ∪ ]π − ϕ , π [ , πu ∈ / A , then cos [m,n] ≤ cos ϕ . And

Ac = [0, π [\A .

π u 2 cos ≤ (cos ϕ ) ≤ e−2 sin (ϕ /2) . [m, n]

But 2 sin2 (ϕ /2) = 2(ϕ /2)2 τ2 ≥ a log . We deduce 2πu 0≤u<[m,n]: [m,n] ∈A /

Now examine the case

πu [m,n]

π u cos ≤ [m, n]−a . [m, n]

∈ A . Define

I1 =]0, ϕ [ ,

I2 =]π − ϕ , π [ .

(11.7.12)


Then the sums i =

2πu 0≤u<[m,n]: [m,n] ∈Ii

cos

π u [m,n] ,

i = 1, 2 are equal, and since

2 π u cos = i + 1, [m, n]

2πu 0≤u<[m,n]: [m,n] ∈A

i=1

we obtain π u cos =2 [m, n]

πu 0≤u<[m,n]: [m,n] ∈A

cos

πu 0≤u<[m,n]: [m,n] ∈I1

πu [m, n]

+ 1. (11.7.13)

πu Note that we only have to consider the case {0 ≤ u < [m, n] : [m,n] ∈ I1 } = ∅, otherwise the right-hand side in (11.7.13) equals 1. By using the elementary inequality |eu − ev | ≤ |u − v| for u, v ≤ 0, we have

0≤u<[m,n] πu [m,n] ∈I1

πu cos [m, n]

≤

0≤u<[m,n] πu [m,n] ∈I1

−

e

πu 2 − 2 ( [m,n] )

0≤u<[m,n] πu [m,n] ∈I1

πu 1 π u 2 log cos + . ( ) [m, n] 2 [m, n]

(11.7.14)

Since log(1 − 2 sin2 (x/2)) = −x 2 /2 + O(x 4 ) near 0, we deduce

cos

0≤u<[m,n] πu [m,n] ∈I1

πu [m, n]

≤C [m, n]4

−

0≤π u<[m,n]ϕ

e

≤

∞

e− 2 ( [m,n] ) ≤ C

πu

u=1

cos

0≤u<[m,n] πu [m,n] ∈I1

u [m, n]

4

(11.7.15)

log5/2 u4 ≤ C[m, n] 3/2 .

πu 2 ) − 2 ( [m,n]

0≤u<[m,n] πu [m,n] ∈I1

0≤u<[m,n] πu [m,n] ∈I1

It follows that

πu

0≤u<[m,n] πu [m,n] ∈I1

Now

2

e− 2 ( [m,n] ) ≤ C

πu [m, n]

2

[m, n] . 1/2

[m, n] ≤ C 1/2 .

(11.7.16)

The estimates (11.7.12) and (11.7.16) together with (11.7.13) thus provide

0≤u<[m,n]

πu cos [m, n]

[m, n] ≤C + 1 . 1/2

(11.7.17)


Using (11.7.11), we obtain with (11.7.17),

n−1 m−1 1 1 h k 1 ≤ C ∨ . − cos π n m 1/2 [m, n] nm

(11.7.17 )

k=0 h=0

And the estimates (11.7.10), (11.7.11) together with (11.7.17 ) give 1 1 |a ||a |2−( −) ∨ , |S2 | ≤ C 1/2 [m, n] n∈[i,j ∧]∩N 1≤< <∞

m∈[i,j ∧ ]∩N

or, ∞

|S2 | ≤ C

n,m∈[i,j ]∩N

|a |

=n

1 1/2

1 ∨ [m, n]

∞

|a |2

−( −)

.

=(+1)∨m

(11.7.18) We estimate the sum S1 :

a2 E

n∈[i,j ∧]∩N

≥1

=

a2 E

By

1 2iπ e nm

n−1 m−1

h n−m

S

(11.7.19)

k=0 h=0

n,m∈[i,j ∧]∩N

k=0 h=0

we can continue with a2 ≤C

n,m∈[i,j ∧]∩N

≥1

k

n−1 m−1 h 1 k cos π n − m . nm

a2

≥1

(11.7.17 ),

k=0

n,m∈[i,j ∧]∩N

≥1

≤2

n−1 1 2iπ kS /n 2 e n

1 1/2

1 ∨ [m, n]

.

(11.7.20)

By inverting the order of summation in (11.7.20), we get 1 1 2 S1 ≤ C a 1/2 ∨ . [m, n] n,m∈[i,j ]∩N

(11.7.21)

≥n∨m

Combining now estimates (11.7.18), (11.7.21) with equality (11.7.8), gives 2 E R f (x) dx n T n,m∈[i,j ]∩N

≤C

n,m∈[i,j ]∩N

∞

|a |

=n

+

1 1/2

≥n∨m

1 ∨ [m, n]

a2

1 1/2

∞ =(+1)∨m

1 ∨ [m, n]

.

|a |2−( −)

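Hence

$$E \int_{\mathbb T} \Big| \sum_{n \in [i, j] \cap \mathcal N} R_n \tilde f(x) \Big|^2 dx \le C \sum_{n, m \in [i, j] \cap \mathcal N} Q_{\mathcal N, a}(n, m). \tag{11.7.22}$$

In our next step, we estimate the kernel $Q_{\mathcal N, a}(n, m)$. We introduce the discrete Laplace transform of the sequence $(|a_\ell|)_\ell$. For any $\ell \ge 1$ we put

$$b_\ell = \beta(|a_\ell|) := \sum_{k=1}^{\infty} |a_k|\, 2^{-|k - \ell|}, \tag{11.7.23}$$

and write $b = \{b_\ell, \ell \in \mathbb N\}$. Plainly, $|a_\ell| \le b_\ell$, $1/2 \le \frac{b_{\ell+1}}{b_\ell} \le 2$, and $\sum_{\ell \in \mathbb N} b_\ell \le 3 \sum_{\ell \in \mathbb N} |a_\ell|$. Further, by convexity,

$$\beta(|a_\ell|)^2 \le 3\, \beta(|a_\ell|^2). \tag{11.7.24}$$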
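The elementary properties of the transform (11.7.23) listed above can be checked on a finitely supported sequence. A minimal sketch, assuming a finite list `a` standing for $(a_1, \ldots, a_K)$ with all later coefficients equal to zero; the function name `beta` and the sample values are illustrative.

```python
def beta(a, length):
    """b_l = sum_k |a_k| 2^{-|k-l|} for l = 1..length, where a[0] stands for a_1."""
    return [sum(abs(ak) * 2.0 ** (-abs(k - l)) for k, ak in enumerate(a, start=1))
            for l in range(1, length + 1)]


a = [1.0, -0.5, 0.0, 0.25, 0.0, 0.0, 2.0]   # sample coefficients a_1..a_7
b = beta(a, 40)
b_of_squares = beta([x * x for x in a], 40)

print(all(abs(a[l]) <= b[l] for l in range(len(a))))                  # |a_l| <= b_l
print(all(0.5 <= b[l + 1] / b[l] <= 2.0 for l in range(len(b) - 1)))  # ratio bounds
print(sum(b) <= 3 * sum(abs(x) for x in a))                           # sum of b_l
print(all(b[l] ** 2 <= 3 * b_of_squares[l] for l in range(len(b))))   # (11.7.24)
```

The sample coefficients are dyadic rationals, so the floating-point computation above is exact.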

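11.7.4 Lemma. For any two positive integers $n$ and $m$, we have

$$Q_{\mathcal N, a}(n, m) \le 3 \sum_{\ell = n \wedge m}^{\infty} b_\ell\, b_{\ell \vee n \vee m} \Big( \frac{1}{\ell^{1/2}} \vee \frac{1}{[m, n]} \Big).$$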

Proof. We distinguish two cases. Let first n ≤ m. Then QN ,a (n, m) =

m−1 =n

+

∞

1

|a |

1/2

=m

Since m−1

∞

|a |

=n

1 1/2

1 1/2

∨

∞ 1 |a |2−( −) [m, n] =m

∞ 1 1 1 −) −( 2 + ∨ |a |2 a 1/2 ∨ . [m, n] [m, n] =(+1)

=m |a |2

|a |

−( −)

=2

∞ −(m−)

≥n∨m

=m |a |2

−( −m)

≤ 2−(m−) bm , we get

∞ m−1 1 1 1 −) −( −m ≤ 2 bm ∨ |a |2 b 2 ∨ [m, n] 1/2 [m, n] =m

=n

≤ bm

m−1

b

=n

1 1/2

∨

1 . [m, n]

Besides ∞ =m

|a |

1 1/2

∞ ∞ 1 1 1 −) −( 2 ≤ ∨ |a |2 b 1/2 ∨ , [m, n] [m, n] =(+1)

=m


and

|a |2

1 1/2

≥n∨m

∨

1 [m, n]

=

|a |2

1/2

≥m

Therefore QN ,a (n, m) ≤ 3

∞

1

∨

b b∨m

=n

1 [m, n] 1

∨

1/2

≤

b2

≥m

1 1/2

∨

1 . [m, n]

1 . [m, n]

Let now n ≥ m. Then QN ,a (n, m) =

∞ =n

+ ≤

|a |

1 1/2

∞ 1 ∨ |a |2−( −) [m, n] =+1

1

a2

1 ∨ [m, n]

1/2 ≥n∨m ∞ ∞ 1 1 1 b2 1/2 ∨ + b2 1/2 [m, n] =n =n ∞

≤2

1

b b∨n

1/2

=m

∨

1 [m, n]

1 . [m, n]

∨

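The next proposition is now obtained as a byproduct of Proposition 11.7.3 and Lemma 11.7.4.

11.7.5 Proposition. There exist two absolute constants $C$ and $i_0$, such that for any $j \ge i \ge i_0$, any increasing sequence $\mathcal N$ of positive integers, any sequence $a \in \ell^2(\mathbb N)$,

$$E \int_{\mathbb T} \Big| \sum_{n \in [i, j] \cap \mathcal N} R_n \tilde f(x) \Big|^2 dx \le C \sum_{\substack{n, m \in [i, j] \cap \mathcal N \\ n \le m}} \ \sum_{\ell = n}^{\infty} b_\ell\, b_{\ell \vee m} \Big( \frac{1}{\ell^{1/2}} \vee \frac{1}{[m, n]} \Big),$$

where $(b_\ell)$ is defined in (11.7.23).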

1 Now examine the sum ∞ =n b b∨m 1/2 ∨ ∞ =n

b b∨m

1

1 ∨ 1/2 [m, n]

= bm

m

b

=n

1 [m,n]

:

∞ 1 1 1 2 ∨ + b ∨ . 1/2 [m, n] 1/2 [m, n]

1

=m+1

On the one hand, since n ≤ ≤ m implies [m, n] ≥ m ≥ , m =n

b

1 1/2

∨

1 [m, n]

=

m m m 1/2 1/2 b 2 −1 ≤ b 1/2 =n

≤C

=n

m =n

b2

1/2

log1/2 m .

=n


And on the other,

∞

b2

1

∨

1/2

=m+1

1 [m, n]

∞

≤

b2

=m+1

1 1 ∨ 1/2 m m

b2 m−1/2 .

∞

=

=m+1

Thus ∞

b b∨m

=n

1 1/2

∨

1 [m, n]

m ∞ 1/2 ≤ C bm b2 log1/2 m + m−1/2 b2

≤C

=n

b2

1/2

bm + m−1/2

∈N

=m+1

b2

1/2

log1/2 m.

∈N

(11.7.25) ∞ 2 2 By the remarks following (11.7.23) and (11.7.24), we have ∞ =0 b ≤ 9 =0 a . Consequently, letting for any positive integer h: N (h) = sup N ∩ [1, h] , ∞

b b∨m

1/2

=n

i∨n≤m≤j m∈N

≤C

∞

b2

≤C

≤

bm + m−1/2

b2

b2

1/2

log1/2 m

∈N

1/2

bm + N (j )

m≤N (j )

∞

∞

i∨n≤m≤j m∈N

∈N

≤C

1 [m, n]

∨

1/2

∈N ∞

1

1/2

∞

b2

1/2

log1/2 N (j )

∈N

b2 N (j )1/2 log1/2 N (j )

∈N C a 22 N (j )1/2 log1/2 N (j ).

And then

∞

n,m∈[i,j ]∩N n≤m

b b∨m

=n

1

1 ∨ 1/2 [m, n]

≤ C a 22 # [i, j ]∩N N (j )1/2 log1/2 N (j ). (11.7.26)

Particularizing (11.7.26) to the case N = N for which N (j ) = j gives i≤n≤m≤j

∞ =n

b b∨m

1 1/2

∨

1 [m, n]

≤ C a 22 (j − i)j 1/2 log1/2 j. (11.7.27)

Proof of Theorem 11.7.1. By Proposition 11.7.5 and estimate (11.7.26), we have
\[
\mathbf{E} \int_{\mathbb{T}} \Bigl| \sum_{i\le n\le j} R_n \tilde f(x) \Bigr|^2 dx \le C_a\, (j-i)\, j^{1/2} \log^{1/2} j .
\]


Theorem 11.7.1 is thus a consequence of Theorem 8.4.2 (or 4 in [Gál–Koksma: 1950]). For speed of convergence, Theorem 9.3.11 can also be used. Proof of Theorem 11.7.2. Again, by Proposition 11.7.5 and estimate (11.7.27), we have E

2 2 (1−θ )/2 Rns f˜(x) dx ≤ C a 22 (v−u) n1/2 v. v log nv ≤ C a 2 (v−u)v log T u≤s≤v

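(11.7.28)

Once more the result follows from Theorem 9.3.11.

We turn to the study of the convergence properties for the system $(e_{S_\ell})_\ell$. Put $\delta_0 = 0$, $\Delta_0 = 0$ and for any integer $k \ge 1$,
\[
\delta_k = \inf\{ n \ge 1 : \varepsilon_{n+\delta_1+\cdots+\delta_{k-1}} = 1 \}, \qquad \Delta_k = \delta_1 + \cdots + \delta_k .
\]
Then the random variables $\delta_k$ are i.i.d., $\mathbf{P}\{\delta_k = m\} = 2^{-m}$ for all $k$ and $m$, and $\Delta_k = \inf\{\ell \ge 1 : S_\ell = k\}$. Put for any integer $k \ge 0$,
\[
Y_k = \sum_{\Delta_k \le \ell < \Delta_{k+1}} a_\ell .
\]
One can then write
\[
\tilde f = \sum_{\ell=0}^{\infty} a_\ell\, e_{S_\ell} = \sum_{k=0}^{\infty} Y_k\, e_k .
\tag{11.7.29}
\]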
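The regrouping behind (11.7.29) can be checked on a simulated path. The following is only a minimal sketch: the coefficient sequence `a`, the horizon `L`, the seed and all function names are illustrative assumptions, and the last block is truncated at the horizon.

```python
import random

def block_sums(a, eps):
    """a[l] for l = 0..L and eps[l-1] = eps_l in {0, 1} for l = 1..L.
    Returns (S, Y) with S[l] = eps_1 + ... + eps_l and
    Y[k] = sum of a[l] over all l with S[l] == k."""
    S = [0]
    for e in eps:
        S.append(S[-1] + e)
    Y = [0.0] * (S[-1] + 1)
    for l, coeff in enumerate(a):
        Y[S[l]] += coeff
    return S, Y

def hitting_times(S):
    """Delta_0 = 0 and Delta_k = inf{l >= 1 : S_l = k} for k >= 1."""
    Delta = [0]
    for l in range(1, len(S)):
        if S[l] == len(Delta):   # the walk reaches the next level for the first time
            Delta.append(l)
    return Delta

random.seed(0)                   # illustrative choice
L = 2000                         # illustrative horizon
eps = [random.randint(0, 1) for _ in range(L)]
a = [1.0 / (l + 1) for l in range(L + 1)]   # hypothetical square-summable coefficients

S, Y = block_sums(a, eps)
Delta = hitting_times(S)

# Y_k recomputed from the definition: sum of a_l over Delta_k <= l < Delta_{k+1}
ends = Delta[1:] + [L + 1]
Y_def = [sum(a[Delta[k]:ends[k]]) for k in range(len(Delta))]
gaps = [Delta[k] - Delta[k - 1] for k in range(1, len(Delta))]   # the delta_k
```

Grouping the coefficients by the value of $S_\ell$ (the list `Y`) and summing over the blocks $[\Delta_k, \Delta_{k+1})$ (the list `Y_def`) give the same numbers, which is the content of the second equality in (11.7.29).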

If the series $\sum_{k=0}^{\infty} \mathbf{E}\, Y_k^2$ converges, then the series $\sum_{k=0}^{\infty} Y_k^2(\omega)$ converges for $\mathbf{P}$-almost all $\omega$; and by Carleson's theorem, the series $\sum_{k=0}^{\infty} Y_k(\omega) e_k(x)$ converges in $L^2(\mathbb{T})$ and for $\mathbf{P}$-almost all $\omega$ and almost all $x$. Besides, the convergence in $L^2(\mathbb{T})$ of the series $\sum_{k=0}^{\infty} Y_k(\omega) e_k(x)$, on a measurable set of $\omega$'s of positive probability, implies by orthogonality the convergence of the series $\sum_{k=0}^{\infty} Y_k^2(\omega)$. If the coefficients $a_\ell$ have constant signs, then by the very definition of the random variables $Y_k$, this implies that $a \in \ell^2$. We are going to show that the converse implication is also true.

11.7.6 Lemma. For any $a \in \ell^2$, the series $\sum_{k=0}^{\infty} \mathbf{E}\, Y_k^2$ converges.

Thus $a \in \ell^2$ implies convergence in $L^2(\mathbb{T})$ and for almost all $\omega$ and $x$ of the series $\sum_{k=0}^{\infty} Y_k(\omega) e_k(x)$, hence convergence of the series $\sum_{\ell=0}^{\infty} a_\ell\, e_{S_\ell(\omega)}(x)$.


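Proof. We begin with some elementary, but useful considerations. The independence property and the fact that $\Delta_k = m$ implies $m \ge k$ show, for any positive integer $k$, that
\[
\begin{aligned}
\mathbf{E}\, Y_k &= \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \sum_{m\le \ell < m+n} a_\ell \\
&= \sum_{\ell=k}^{\infty} a_\ell \sum_{k\le m\le \ell}\ \sum_{n>\ell-m} \mathbf{P}\{\Delta_k = m\}\, \mathbf{P}\{\delta_{k+1} = n\} \\
&= \sum_{\ell=k}^{\infty} a_\ell \sum_{k\le m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\ell-m)} ,
\end{aligned}
\]
and
\[
\mathbf{E}\, Y_0 = \sum_{\ell=0}^{\infty} a_\ell\, 2^{-\ell} .
\]
Write for now $\mathbf{E}\, Y_k = \mathbf{E}\, Y_k(a)$. An important consequence of this computation is: for any $y = \{y_\ell, \ell \ge 0\}$, there exists a unique $a = \{a_\ell, \ell \ge 0\}$ such that
\[
\mathbf{E}\, Y_k(a) = y_k, \qquad \forall k = 0, 1, \ldots
\]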

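Now we compute $\mathbf{E}\, Y_k^2$. Similarly, we have
\[
\begin{aligned}
\mathbf{E}\, Y_k^2 &= \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \Bigl( \sum_{m\le \ell < m+n} a_\ell \Bigr)^2 \\
&= \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \sum_{m\le \ell < m+n} a_\ell^2
 + 2 \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \sum_{m\le \ell < \lambda < m+n} a_\ell a_\lambda \\
&:= S_k^{(1)} + S_k^{(2)} .
\end{aligned}
\]
First, examine the contribution of the rectangle terms. We have
\[
\begin{aligned}
S_k^{(2)} &= 2 \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \sum_{m\le \ell < \lambda < m+n} a_\ell a_\lambda \\
&= 2 \sum_{k\le \ell < \lambda < \infty} a_\ell a_\lambda \sum_{m\le \ell} \mathbf{P}\{\Delta_k = m\} \sum_{n>\lambda-m} 2^{-n} \\
&= 2 \sum_{k\le \ell < \lambda < \infty} a_\ell a_\lambda \sum_{m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\lambda-m)} .
\end{aligned}
\]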


We use the fact (Spitzer [1976], see also [Breiman: 1992], Proposition 3.39 and Theorems 3.33, 3.34) that for a transient random walk $\tau = \{\tau_n, n \in \mathbb{N}\}$ (here the sequence $\Delta = \{\Delta_k, k \in \mathbb{N}\}$), the Green function
\[
G_\tau(0,x) = \sum_{k=0}^{\infty} \mathbf{P}\{\tau_k = x\}
\]
is finite for every $x \in \mathbb{Z}$. Moreover,
\[
L_\tau := \sup_{x\ge 0} G_\tau(0,x) < \infty .
\]
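For the particular walk $\Delta$ used here, the Green function can be computed exactly, which gives a concrete feel for the constant $L$. The sketch below assumes only what is stated above, namely that $\Delta_k$ is a sum of $k$ i.i.d. variables with $\mathbf{P}\{\delta = m\} = 2^{-m}$, so that $\Delta_k$ is the time of the $k$-th success in fair coin tossing; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def prob_Delta(k, m):
    """P{Delta_k = m}: the k-th success of a fair coin occurs exactly at trial m."""
    if k == 0:
        return Fraction(1 if m == 0 else 0)
    if m < k:
        return Fraction(0)
    return Fraction(comb(m - 1, k - 1), 2 ** m)

def green(m):
    """G(0, m) = sum over k >= 0 of P{Delta_k = m}; only k <= m can contribute."""
    return sum(prob_Delta(k, m) for k in range(m + 1))

values = [green(m) for m in range(12)]
```

Under this assumption the computation reduces to $\sum_{k=1}^{m} \binom{m-1}{k-1} 2^{-m} = 1/2$ for $m \ge 1$, while $G(0,0) = 1$, so the supremum of the Green function is at most 1 in this example.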

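Let $G$ be the Green function associated to the sequence $\Delta$ and let $L = L_\Delta$. Then,
\[
\begin{aligned}
\sum_{k=0}^{\infty} S_k^{(2)} &= 2 \sum_{k=0}^{\infty} \sum_{k\le \ell < \lambda < \infty} a_\ell a_\lambda \sum_{m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\lambda-m)} \\
&\le 2 \sum_{0\le \ell < \lambda < \infty} |a_\ell|\,|a_\lambda| \sum_{m\le \ell} \sum_{k=0}^{\infty} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\lambda-m)} \\
&\le 2 \sum_{0\le \ell < \lambda < \infty} |a_\ell|\,|a_\lambda| \sum_{m\le \ell} G(0,m)\, 2^{-(\lambda-m)}
 \le 2L \sum_{0\le \ell < \lambda < \infty} |a_\ell|\,|a_\lambda| \sum_{m\le \ell} 2^{-(\lambda-m)} \\
&\le 4L \sum_{0\le \ell < \lambda < \infty} |a_\ell|\,|a_\lambda|\, 2^{-(\lambda-\ell)}
 \le 4L \sum_{0\le \ell < \infty} |a_\ell| \sum_{\ell < \lambda < \infty} |a_\lambda|\, 2^{-(\lambda-\ell)} \\
&\le 4L \sum_{0\le \ell < \infty} b_\ell^2 .
\end{aligned}
\]
We have used (see (11.7.23)) the discrete Laplace transform $\{b_\ell, \ell \ge 1\}$ of the sequence $\{|a_\ell|, \ell \ge 1\}$. Therefore
\[
\sum_{k=0}^{\infty} S_k^{(2)} \le 4L \sum_{0\le \ell < \infty} b_\ell^2 .
\]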

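Now,
\[
\begin{aligned}
S_k^{(1)} &= \sum_{m=k}^{\infty} \sum_{n=1}^{\infty} \mathbf{P}\{\Delta_k = m,\ \delta_{k+1} = n\} \sum_{m\le \ell < m+n} a_\ell^2 \\
&= \sum_{\ell=k}^{\infty} a_\ell^2 \sum_{k\le m\le \ell} \mathbf{P}\{\Delta_k = m\} \sum_{n>\ell-m} 2^{-n}
 = \sum_{\ell=k}^{\infty} a_\ell^2 \sum_{k\le m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\ell-m)} .
\end{aligned}
\]
Thus,
\[
\begin{aligned}
\sum_{k=0}^{\infty} S_k^{(1)} &= \sum_{k=0}^{\infty} \sum_{\ell=k}^{\infty} a_\ell^2 \sum_{k\le m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\ell-m)}
 = \sum_{\ell=0}^{\infty} a_\ell^2 \sum_{k\le \ell}\ \sum_{k\le m\le \ell} \mathbf{P}\{\Delta_k = m\}\, 2^{-(\ell-m)} \\
&\le 2L \sum_{\ell=0}^{\infty} a_\ell^2 \sum_{m\le \ell} 2^{-(\ell-m)} \le 4L \sum_{\ell=0}^{\infty} a_\ell^2 .
\end{aligned}
\]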


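Putting together the two last estimates gives
\[
\sum_{k=0}^{\infty} \mathbf{E}\, Y_k^2 = \sum_{k=0}^{\infty} \bigl( S_k^{(2)} + S_k^{(1)} \bigr) \le 16L \sum_{0\le \ell < \infty} b_\ell^2 .
\]
Now, recall that $b_\ell = \beta(|a_\ell|)$. And by (11.7.24), $b_\ell^2 = \beta(|a_\ell|)^2 \le 3\beta(|a_\ell|^2)$, so that $a \in \ell^2$ implies $b \in \ell^2$; hence the convergence of the series $\sum_{k=0}^{\infty} \mathbf{E}\, Y_k^2$.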

The previously obtained estimates can be used to state individual type conditions ensuring the convergence almost everywhere of the sequence of averages {AN N , N ≥ 1}. By slightly modifying the calculations made after Proposition 11.7.3, and using the fact that ∞ m 1 1 1 1 b b∨m 1/2 ∨ = bm b 1/2 ∨ [m, n] [m, n] =n

=n

≤

∞

+

b2

=m+1 m b bm 1/2 =n

+

1 1/2

∨

1 [m, n]

∞ 1 b2 , m1/2 =m+1

we then get m i∨n≤m≤j m∈N

and thus

m

n,m∈[i,j ]∩N n≤m

=n

bm

bm

=n

b 1/2

b 1/2

≤C

N (i)≤≤N (j )

≤ C # [i, j ] ∩ N

b b m , 1/2 i∨n≤m≤j m∈N

N (i)≤≤N (j )

b 1/2

bm .

m∈[1,j ]∩N

(11.7.30) Moreover, ∞ ∞ 1 1 2 2 b ≤ # [i, j ] ∩ N b . (11.7.31) m1/2 m1/2 i≤n≤j i∨n≤m≤j i≤m≤j m∈N

m∈N

We deduce, ∞ n,m∈[i,j ]∩N n≤m

=n

=m+1

b b∨m

1 1/2

=m+1

m∈N

∨

1 [m, n]

≤ C # [i, j ] ∩ N ≤N (j )

b 1/2

(11.7.32)

m∈[1,j ]∩N

bm +

m∈[1,j ]∩N

∞

2 =m+1 b . m1/2


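We see several cases appearing from the bound in (11.7.32). We may indeed have some of the series
\[
\sum_{\ell} \frac{b_\ell}{\ell^{1/2}}, \qquad \sum_{m\in\mathcal{N}} b_m, \qquad \sum_{m\in\mathcal{N}} \frac{\sum_{\ell=m+1}^{\infty} b_\ell^2}{m^{1/2}},
\tag{11.7.33}
\]
convergent, in which case there are no contributions. If any of these series converge, then
\[
\sum_{\substack{n,m\in[i,j]\cap\mathcal{N} \\ n\le m}} \sum_{\ell=n}^{\infty} b_\ell b_{\ell\vee m}\Bigl(\frac{1}{\ell^{1/2}} \vee \frac{1}{[m,n]}\Bigr) \le C(\mathcal{N},a)\, \#\bigl([i,j]\cap\mathcal{N}\bigr),
\tag{11.7.34}
\]
where the constant $C(\mathcal{N},a)$ depends on $\mathcal{N}$ and $a$. In that case one has, by Proposition 11.7.3 and Lemma 11.7.4, for any $j \ge i$ large enough,
\[
\mathbf{E} \int_{\mathbb{T}} \Bigl| \sum_{n\in[i,j]\cap\mathcal{N}} R_n \tilde f(x) \Bigr|^2 dx \le C(\mathcal{N},a)\, \#\bigl([i,j]\cap\mathcal{N}\bigr),
\tag{11.7.35}
\]
which exhibits a nearly orthogonal behavior of the Riemann sums $R_n \tilde f$ along the subsequence $\mathcal{N}$. In such a case, the convergence almost everywhere does hold. This is classical and can for instance be deduced from Theorem 9.3.11.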

11.8 Almost sure convergence and square functions of Riemann sums

In this section, we shall study the properties of convergence almost everywhere of sequences of Riemann sums of periodic functions, as well as their metrical structure, through the analysis of the associated square function. We will indeed establish, under a very moderate assumption, a metrical inequality which is easily seen as a spectral regularization type inequality (Section 1.4).

Consider for $f \in L^2(\mathbb{T})$, $f \sim \sum_{\ell\in\mathbb{Z}} a_\ell e_\ell$, $\int_{\mathbb{T}} f\, d\lambda = 0$, the Riemann sums
\[
R_n f(x) = \frac{1}{n} \sum_{j=1}^{n} f\Bigl(x + \frac{j}{n}\Bigr) = \sum_{\substack{\ell\in\mathbb{Z} \\ n|\ell}} a_\ell\, e_\ell(x), \qquad n = 1, 2, \ldots
\tag{11.8.1}
\]
and their averages
\[
A_n f(x) = \frac{1}{n} \sum_{s=1}^{n} R_s f(x) = \sum_{\ell\in\mathbb{Z}} a_\ell\, V_n(\ell)\, e_\ell(x),
\tag{11.8.2}
\]
where the kernel $V_n(\ell)$ is defined by
\[
V_n(\ell) = \frac{d_n(\ell)}{n}, \qquad d_n(\ell) = \#\{k \le n : k|\ell\}.
\tag{11.8.3}
\]
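The identities (11.8.1) to (11.8.3) can be verified numerically on single characters $e_\ell(x) = e^{2i\pi \ell x}$. What follows is a small sketch only; the evaluation point `x`, the value of `n`, the range of frequencies and the function names are illustrative assumptions.

```python
import cmath

def e(l, x):
    """Character e_l(x) = exp(2*pi*i*l*x)."""
    return cmath.exp(2j * cmath.pi * l * x)

def riemann_sum(f, n, x):
    """R_n f(x) = (1/n) * sum_{j=1}^{n} f(x + j/n)."""
    return sum(f(x + j / n) for j in range(1, n + 1)) / n

def average(f, n, x):
    """A_n f(x) = (1/n) * sum_{s=1}^{n} R_s f(x)."""
    return sum(riemann_sum(f, s, x) for s in range(1, n + 1)) / n

def d_n(n, l):
    """d_n(l) = #{k <= n : k | l}."""
    return sum(1 for k in range(1, n + 1) if l % k == 0)

def V(n, l):
    """Kernel V_n(l) = d_n(l) / n."""
    return d_n(n, l) / n

x, n = 0.3137, 12            # illustrative evaluation point and index
errors = []
for l in range(1, 40):
    f = lambda t, l=l: e(l, t)
    # A_n e_l should equal V_n(l) * e_l
    errors.append(abs(average(f, n, x) - V(n, l) * e(l, x)))
max_error = max(errors)
```

Since $R_s e_\ell = e_\ell$ when $s | \ell$ and $R_s e_\ell = 0$ otherwise, averaging over $s \le n$ counts the divisors of $\ell$ not exceeding $n$, which is exactly the kernel $V_n(\ell)$.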


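Unlike in the previous section, we consider here the problem in its initial setting: the usual trigonometric system $\{e_\ell, \ell \in \mathbb{N}\}$. Let $\varphi : \mathbb{Z} \to [1,\infty)$ be an even function, nondecreasing on $\mathbb{Z}_+$, such that for some $0 < \varepsilon \le 1/4$ and $C_\varepsilon$ finite,
\[
d(\ell) \le \varphi(\ell) \le C_\varepsilon\, \ell^{\varepsilon} \qquad (\forall \ell \ge 1)
\tag{11.8.4}
\]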

where d() = #{k : k|}. Such a choice is always possible by Lemma 11.8.6 below. Define for 0 < y < 1 and 0 ≤ t < ∞, √ a2 ϕ() + ( f 2 /y) (a2 / ), "f (y) = "f,ϕ (y) = ϕ()<1/y

f (t) = f,ϕ (t) =

ϕ()≥1/y

(11.8.5)

a2 .

:ϕ()
Finally consider the following regularity condition on the Fourier coefficients of f : for any real c > 0, am (R) sup sup | | = γf (c) < ∞. n≥1 m≥cn an This condition is not a convergence condition: it excludes for instance from the scope of the study sequences converging too well like an = 2−n . The condition imposes on the Fourier coefficients the need to have small local oscillations. It also isolates the trivial component of f formed with null coefficients. We shall prove the following theorems. 11.8.1 Theorem (A metric inequality). There exists a constant Cε such that for any f ∈ L2 (T), ∞ "f (y)dy , f (t)dt ≤ Cε f 2 . sup 0 T 2 L (T) and assume that condition (R) is satisfied.

Let f ∈ Then, there exists a constant Cf depending on f only, such that for any m ≥ n ≥ 1, 1/n m "f (y)dy + f (dt) .

An (f ) − Am (f ) 2 ≤ Cf 1/m

n

This allows us to control very easily the square function associated to the averages An (f ). Put for any nondecreasing sequence N = {np , p ≥ 1} of positive integers, and any f ∈ L2 (T), SN (f ) =

∞

Anp+1 (f ) − Anp (f ) 2

1 2

.

(11.8.6)

p=1

11.8.2 Theorem (Square function). Assume that f ∈ L2 (T) is such that condition (R) is fulfilled. Then there exists a constant Cε,f depending on ε and f only, such that for any nondecreasing sequence N of positive integers, SN (f ) ≤ Cε,f .


Our last result is related to the convergence almost everywhere and the Marcinkiewicz–Salem conjecture.

11.8.3 Theorem (Convergence almost everywhere). Let $f \in L^2(\mathbb{T})$ and assume that condition (R) is fulfilled. Then, the sequence of averages $A_n f$ converges almost everywhere to $\int_{\mathbb{T}} f\, d\lambda$.

We begin with some preliminary results. In the following lemma, we collect some useful properties of the kernels defined in (11.8.3).

11.8.4 Lemma. We have the following estimates:
(1) $0 \le V_n(\ell) \le 1$,
(2) $|V_n(\ell) - V_m(\ell)| \le 2\,\dfrac{m-n}{m}$ $(m \ge n)$,
(3) $V_n(\ell) - V_m(\ell) = \bigl(\tfrac{1}{n} - \tfrac{1}{m}\bigr) d_n(\ell) - \tfrac{1}{m}\bigl(d_m(\ell) - d_n(\ell)\bigr)$.

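Proof. Immediate.

Let $\chi$ denote the indicator function on $\mathbb{Z}$. Our next lemma introduces a regularizing function which is of relevance in what follows.

11.8.5 Lemma. Let $\varphi : \mathbb{Z} \to [1,\infty)$ be an even function, nondecreasing on $\mathbb{Z}_+$; we associate to it the function $\varphi^{\#}$ defined as
\[
\forall\, 0 < y \le 1, \qquad \varphi^{\#}(y) = \varphi_a^{\#}(y) = \sum_{\ell\in\mathbb{Z}} a_\ell^2\, \varphi(\ell)\, \chi\bigl(\varphi(\ell) < 1/y\bigr) \qquad \bigl(a = (a_\ell)_{\ell\in\mathbb{Z}} \in \ell^2\bigr).
\]
Then $\varphi^{\#}(y)$ is nonincreasing and $\int_{\mathbb{T}} \varphi^{\#}(y)\, dy = \sum_{\ell\in\mathbb{Z}} a_\ell^2$.

Proof. Immediate.

Recall also a classical estimate concerning the divisor function $d(\ell) = \#\{k : k|\ell\}$.

11.8.6 Lemma. For some constant $c_0 > 2$,
\[
d(\ell) = O\bigl(c_0^{\log \ell / \log\log \ell}\bigr).
\]
In particular $d(\ell) = O(\ell^{\varepsilon})$, for any $\varepsilon > 0$.

Let again $S_\ell = \varepsilon_1 + \cdots + \varepsilon_\ell$, $\ell \ge 1$, where $\varepsilon = \{\varepsilon_i, i \in \mathbb{N}\}$ is a Bernoulli sequence with basic probability space $(\Omega, \mathcal{A}, \mathbf{P})$.

11.8.7 Lemma. There exist two absolute constants $d_0 < \infty$ and $C < \infty$, such that for any $n \ge d \ge d_0$,
\[
\mathbf{P}\{d | S_n\} \le \frac{1}{d} \sum_{j=0}^{d-1} \Bigl|\cos \frac{\pi j}{d}\Bigr|^{n} \le C \Bigl( \frac{1}{n^{1/2}} + \frac{1}{d} \Bigr).
\]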


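Proof. Using characteristic functions, we get
\[
\mathbf{P}\{d | S_n\} = \mathbf{E}\, \mathbf{1}_{\{d | S_n\}}
= \frac{1}{d} \sum_{j=0}^{d-1} \mathbf{E}\, e^{2i\pi S_n \frac{j}{d}}
= \frac{1}{d} \sum_{j=0}^{d-1} \Bigl( \frac{e^{2i\pi \frac{j}{d}} + 1}{2} \Bigr)^{n}
= \frac{1}{d} \sum_{j=0}^{d-1} e^{i\pi n \frac{j}{d}} \cos^{n}\Bigl(\frac{\pi j}{d}\Bigr)
\le \frac{1}{d} \sum_{j=0}^{d-1} \Bigl|\cos \frac{\pi j}{d}\Bigr|^{n} .
\]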
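The identity just obtained can be compared with an exact binomial computation. This is a hedged sketch, assuming as in the text that $S_n$ is a sum of $n$ independent symmetric Bernoulli variables with values in $\{0,1\}$, so that $S_n$ is Binomial$(n, 1/2)$; the function names and the test pairs $(d,n)$ are illustrative.

```python
from math import comb, cos, pi

def prob_divides(d, n):
    """Exact P{d | S_n} for S_n ~ Binomial(n, 1/2) (the value S_n = 0 counts)."""
    return sum(comb(n, k) for k in range(0, n + 1, d)) / 2 ** n

def fourier_formula(d, n):
    """(1/d) * sum_j Re[(e^{i pi j/d} cos(pi j/d))^n]; the imaginary parts cancel."""
    return sum(cos(pi * j * n / d) * cos(pi * j / d) ** n for j in range(d)) / d

def cosine_bound(d, n):
    """(1/d) * sum_{j=0}^{d-1} |cos(pi j/d)|^n, the upper bound in the proof."""
    return sum(abs(cos(pi * j / d)) ** n for j in range(d)) / d

rows = [(d, n, prob_divides(d, n), fourier_formula(d, n), cosine_bound(d, n))
        for d, n in [(3, 10), (5, 40), (7, 200)]]
```

In each row the second and third computed values agree up to rounding, and the last value dominates them; the term $j = 0$ alone contributes the main part $1/d$.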

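Let $a > a' > 1/2$ and put
\[
\varphi_n = \sqrt{\frac{2a \log n}{n}}, \qquad \tau_n = \frac{\sin(\varphi_n/2)}{\varphi_n/2} .
\]
We assume $n$ sufficiently large, say $n \ge n_0$, for $\tau_n$ to be greater than $(a'/a)^{1/2}$. Consider the sector
\[
A_n = [0, \varphi_n[\ \cup\ ]\pi - \varphi_n, \pi[\,, \qquad A_n^c = [0, \pi[\ \setminus A_n .
\]
If $\frac{\pi u}{d} \notin A_n$, then $\bigl|\cos \frac{\pi u}{d}\bigr| \le \cos \varphi_n$, and
\[
\Bigl|\cos \frac{\pi u}{d}\Bigr|^{n} \le (\cos \varphi_n)^{n} \le e^{-2n \sin^2(\varphi_n/2)} .
\]
But,
\[
2n \sin^2(\varphi_n/2) = 2n (\varphi_n/2)^2 \tau_n^2 \ge a' \log n .
\]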

We deduce

0≤u
Now, examine the case

πu d

∈ An . Define

I1n =]0, ϕn [ , Then

π u n cos ≤ dn−a . d

0
I2n =]π − ϕn , π [ . 2 π u n n cos = i . d i=1

We only estimate the sum 1n , the sum 2n being estimated similarly. First note that we only have to consider the case {0 ≤ u < d : πdu ∈ I1 } = ∅, otherwise 1n = 0. Now, by using the elementary inequality: |eu − ev | ≤ |u − v| for u, v ≤ 0, we have πu n πu 1 π u 2 − n2 ( πu )2 d cos − e + ≤n log cos d . d 2 d 0≤u
πu ∈I n 1 d

πu ∈I n 1 d


Since log(1 − 2 sin2 (x/2)) = −x 2 /2 + O(x 4 ) near 0, we deduce u πu n − n2 ( πu )2 d cos − e ≤ Cn ( )4 d d 0≤u
πu ∈I n 1 d

πu ∈I n 1 d

≤C

n d4

≤ Cd Now

e

2 − n2 ( πu d )

0≤u
It follows that

≤

∞

e− 2 (

n πu 2 d )

≤C

u=1

u4

0≤π u
log5/2 n . n3/2 d . n1/2

π u n d cos ≤ C 1/2 . d n 0≤u
The previous estimates thus provide 0
Therefore

π u n d cos ≤ C( 1/2 ). d n

d−1 π u n 1 1 C cos d ≤ d + n1/2 , d u=0

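and this achieves the proof.

Finally we shall also need Gál's estimate (8.4.19), which we recall for convenience. Let $(a,b)$ and $[a,b]$ respectively denote the greatest common divisor and the least common multiple of the positive integers $a$ and $b$, and denote $\langle a,b\rangle = (a,b)/[a,b]$. Then there exist two constants $c$ and $C$ such that for all $N$ large enough,
\[
cN(\log\log N)^2 \le \sup \sum_{i,j\le N} \langle n_i, n_j\rangle \le CN(\log\log N)^2 ,
\]
where the sup is taken over all $N$-tuples $(n_1,\ldots,n_N)$ of distinct positive integers.

Proofs of Theorems 11.8.1 and 11.8.2. Let $m \ge n > 1$ be fixed. Consider the sum $S_1 = \sum_{\ell:\varphi(\ell)<n} a_\ell e_\ell \bigl(V_n(\ell) - V_m(\ell)\bigr)$. By Lemma 11.8.4,
\[
\begin{aligned}
\|S_1\|^2 &\le \sum_{\ell:\varphi(\ell)<n} a_\ell^2\, |V_n(\ell) - V_m(\ell)|\, \bigl(V_n(\ell) + V_m(\ell)\bigr)
\le 2\, \frac{m-n}{m} \Bigl( \frac{1}{n} + \frac{1}{m} \Bigr) \sum_{\ell:\varphi(\ell)<n} a_\ell^2 \varphi(\ell) \\
&\le 4\, \frac{m-n}{mn} \sum_{\ell:\varphi(\ell)<n} a_\ell^2 \varphi(\ell)
= 4 \int_{1/m}^{1/n} \sum_{\ell:\varphi(\ell)<n} a_\ell^2 \varphi(\ell)\, dy .
\end{aligned}
\tag{11.8.7}
\]
Hence
\[
\|S_1\|^2 \le 4 \int_{1/m}^{1/n} \varphi^{\#}(y)\, dy .
\tag{11.8.8}
\]
Now, consider the sum $S_2 = \sum_{\ell: n\le\varphi(\ell)\le m} a_\ell e_\ell \bigl(V_n(\ell) - V_m(\ell)\bigr)$. Then,
\[
\|S_2\|^2 \le 4 \sum_{\ell: n\le\varphi(\ell)\le m} a_\ell^2 .
\tag{11.8.9}
\]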

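Finally, consider the sum $S_3 = \sum_{\ell:\varphi(\ell)>m} a_\ell e_\ell \bigl(V_n(\ell) - V_m(\ell)\bigr)$, which corresponds to the difficult case. As by relation (3) of Lemma 11.8.4,
\[
V_n(\ell) - V_m(\ell) = \Bigl( \frac{1}{n} - \frac{1}{m} \Bigr) d_n(\ell) - \frac{1}{m} \bigl( d_m(\ell) - d_n(\ell) \bigr),
\]
using the elementary inequality $(a+b)^2 \le 2a^2 + 2b^2$ we get
\[
\|S_3\|^2 \le 2 \Bigl( \frac{1}{n} - \frac{1}{m} \Bigr)^2 \sum_{\ell:\varphi(\ell)>m} a_\ell^2\, d_n^2(\ell) + \frac{2}{m^2} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(\ell) - d_n(\ell) \bigr)^2 .
\tag{11.8.10}
\]
First, examine the sum
\[
B = \frac{1}{m^2} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(\ell) - d_n(\ell) \bigr)^2 .
\tag{11.8.11}
\]
Then observe that
\[
B = \frac{1}{m^2} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(\ell) - d_n(\ell) \bigr)^2 \le \frac{1}{m^2} \sum_{k:\varphi(S_k)>m} a_{S_k}^2 \bigl( d_m(S_k) - d_n(S_k) \bigr)^2 ,
\tag{11.8.12}
\]
almost surely, simply because the range of values taken by the random walk $\{S_\ell, \ell \in \mathbb{N}\}$ is $\mathbb{N}$. Put
\[
A = \sup_{n\ge 1} \frac{|a_{S_n}|}{|a_n|} .
\tag{11.8.13}
\]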


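This expression is tractable by means of the strong law of large numbers. Indeed, $\lim_{n\to\infty} S_n/n = 1/2$ almost surely. Thus if $\Omega_\eta = \{S_n \ge \eta n,\ \forall n \ge 1\}$, then for some $\eta > 0$, $\mathbf{P}(\Omega_\eta) > 0$. Now, on $\Omega_\eta$, by using the regularity condition (R),
\[
|a_{S_n}| = \Bigl| \frac{a_{S_n}}{a_n} \Bigr|\, |a_n| \le |a_n| \sup_{\nu\ge 1} \sup_{m\ge \eta\nu} \frac{|a_m|}{|a_\nu|} = |a_n|\, \gamma_f(c) \quad \text{with } c = \eta,
\tag{11.8.14}
\]
for any $n \ge 1$. Therefore for some $M_0$ suitably chosen:
\[
\mathbf{P}\{A \le M_0\} > 0 .
\tag{11.8.15}
\]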

But {A < ∞} is a tail-event, since it does not depend on the first random variables Xi . So that by (11.8.15) P{A < ∞} = 1. (11.8.16) We can write A B≤ 2 m

:ϕ(S )>m

2

A a2 dm (S ) − dn (S ) ≤ 2 m

2

a2 dm (S ) − dn (S ) ,

:ϕ()>m

(11.8.17) because S ≤ . Since d|n, δ|n is equivalent to [d, δ]|n, we have: 2 2

1 = 1+2 1. dm (S ) − dn (S ) = d|S n
(11.8.18)

[d,δ]|S n
d|S n
On the one hand, by Lemma 11.8.7, 1 1 1 (m − n) E 1 ≤C + ≤C + . 1/2 d 1/2 d d|S n
n
n
(11.8.19)

And on the other, E

1 =

[d,δ]|S n
n
n

j =0

[d,δ]−1 e2iπ [d,δ] + 1 1 [d, δ] 2 j

=

[d,δ]−1 j 1 E e2iπ S [d,δ] [d, δ]

j =0

=$

n

≤

n
≤C

[d,δ]−1 e2iπ [d,δ] + 1 1 [d, δ] 2 j

(11.8.20)

j =0

[d,δ]−1 1 j cos(π [d, δ] [d, δ]

n
j =0

1 [d, δ] (m − n)2 + 1 ≤ C + [d, δ] 1/2 1/2

n

1 . [d, δ]


Putting together (11.8.18), (11.8.19) and (11.8.20) gives, 1

2 (m − n)2 E dm (S ) − dn (S ) ≤ C + + 1/2 d n
Therefore 1 E m2

n
1 . [d, δ]

(11.8.21)

2

a2 dm (S ) − dn (S )

:ϕ()>m

2 1 1 C 2 (m − n) a + + , m2 1/2 d [d, δ] nm 2 2 a 1 (m − n) 2 + a =C m2 1/2 m2 d nm :ϕ()>m 1 + . a2 m2 [d, δ]

≤

(11.8.22)

n
:ϕ()>m

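Put
\[
\forall\, 0 \le y \le 1, \qquad \varphi'(y) = \sum_{\ell:\varphi(\ell)>1/y} \frac{a_\ell^2}{\ell^{1/2}} .
\tag{11.8.23}
\]
By (11.8.4), $\varphi(\ell) > 1/y$ implies $C_\varepsilon \ell^{\varepsilon} \ge 1/y$ for any $\ell \ge 1$. Letting $U = 1/(2\varepsilon)$ gives $\ell \ge (C_\varepsilon y)^{-2U}$, that is $\ell^{-1/2} \le (C_\varepsilon y)^{U}$, and therefore $\varphi'(y) \le (C_\varepsilon y)^{U} \|f\|^2$ for $0 < y \le 1$; which implies
\[
\int_{\mathbb{T}} \varphi'(y)\, \frac{dy}{y} < \infty .
\tag{11.8.24}
\]
Then
\[
\begin{aligned}
\frac{(m-n)^2}{m^2} \sum_{\ell:\varphi(\ell)>m} \frac{a_\ell^2}{\ell^{1/2}}
&\le \frac{(m-n)}{m} \sum_{\ell:\varphi(\ell)>m} \frac{a_\ell^2}{\ell^{1/2}}
= \frac{(m-n)}{mn}\, n \sum_{\ell:\varphi(\ell)>m} \frac{a_\ell^2}{\ell^{1/2}}
= n \sum_{\ell:\varphi(\ell)>m} \frac{a_\ell^2}{\ell^{1/2}} \int_n^m \frac{dx}{x^2} \\
&\le \int_n^m \varphi'\Bigl(\frac{1}{x}\Bigr) \frac{dx}{x}
= \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} .
\end{aligned}
\tag{11.8.25}
\]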

Further, :ϕ()>m

a2

n
1 ≤ m2 d

n
m 1 n 1 2 dx 2 2 a ≤

f

=

f

ydy. 3 1 d3 n x m

(11.8.26)


And, n
1 ≤ 2 m [d, δ]

n
m 1 1 n 1 dx ≤ C ≤ C = C dy, 3 2 2 1 δ d n x m n
(11.8.27) which implies that

a2

:ϕ()>m

n

1 ≤ C f 2 2 m [d, δ]

1 n 1 m

dy.

(11.8.28)

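Inserting estimates (11.8.25), (11.8.26), (11.8.28) into (11.8.22) gives
\[
\frac{1}{m^2}\, \mathbf{E} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(S_\ell) - d_n(S_\ell) \bigr)^2
\le C \Bigl\{ \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} + \|f\|^2 \int_{1/m}^{1/n} dy \Bigr\} .
\tag{11.8.29}
\]
By the Tchebycheff inequality,
\[
\mathbf{P}\Bigl\{ \frac{1}{m^2} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(S_\ell) - d_n(S_\ell) \bigr)^2 > 3C \Bigl\{ \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} + \|f\|^2 \int_{1/m}^{1/n} dy \Bigr\} \Bigr\} \le \frac{1}{3},
\]
and using (11.8.16), for some $M$ suitably chosen,
\[
\mathbf{P}\{A > M\} \le \frac{1}{3} .
\tag{11.8.30}
\]
There is, moreover, no loss in assuming $M > 1$. These inequalities imply, in view of (11.8.17), that
\[
B = \frac{1}{m^2} \sum_{\ell:\varphi(\ell)>m} a_\ell^2 \bigl( d_m(\ell) - d_n(\ell) \bigr)^2
\le 3CM \Bigl\{ \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} + \|f\|^2 \int_{1/m}^{1/n} dy \Bigr\} .
\tag{11.8.31}
\]
Now consider the term
\[
D := \Bigl( \frac{1}{n} - \frac{1}{m} \Bigr)^2 \sum_{\ell:\varphi(\ell)>m} a_\ell^2\, d_n^2(\ell),
\tag{11.8.32}
\]
in the right-hand side of (11.8.10). Again, by means of the same argument used to obtain (11.8.12), we have
\[
D \le \Bigl( \frac{1}{n} - \frac{1}{m} \Bigr)^2 \sum_{j:\varphi(S_j)>m} a_{S_j}^2\, d_n^2(S_j)
\le A \Bigl( \frac{1}{n} - \frac{1}{m} \Bigr)^2 \sum_{j:\varphi(j)>m} a_j^2 \Bigl( \sum_{\substack{k | S_j \\ 1\le k\le n}} 1 \Bigr)^2 ,
\tag{11.8.33}
\]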


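and similarly to (11.8.18),
\[
\Bigl( \sum_{\substack{d | S_\ell \\ 1\le d\le n}} 1 \Bigr)^2 = \sum_{\substack{d | S_\ell \\ 1\le d\le n}} 1 + 2 \sum_{\substack{[d,\delta] | S_\ell \\ 1\le d<\delta\le n}} 1 .
\]
Now in view of Lemma 11.8.7,
\[
\mathbf{E} \sum_{\substack{d | S_\ell \\ 1\le d\le n}} 1 \le C \sum_{1\le d\le n} \Bigl( \frac{1}{\ell^{1/2}} + \frac{1}{d} \Bigr) \le C \Bigl( \frac{n}{\ell^{1/2}} + \sum_{1\le d\le n} \frac{1}{d} \Bigr) .
\tag{11.8.34}
\]
And
\[
\mathbf{E} \sum_{\substack{[d,\delta] | S_\ell \\ 1\le d<\delta\le n}} 1 \le C \sum_{1\le d<\delta\le n} \Bigl( \frac{1}{\ell^{1/2}} + \frac{1}{[d,\delta]} \Bigr) \le C \Bigl( \frac{n^2}{\ell^{1/2}} + \sum_{1\le d<\delta\le n} \frac{1}{[d,\delta]} \Bigr) .
\tag{11.8.35}
\]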

Therefore, by (11.8.34) and (11.8.35), E

a2

2

1

d|S 1≤d≤n

:ϕ()>m

≤C

a2

1 n2 + + 1/2 d 1
:ϕ()>m

1≤d<δ≤n

1 . [d, δ] (11.8.36)

By Tchebycheff’s inequality,

P

a2

2

1

d|S 1≤d≤n

:ϕ()>m

> 3C

a2

1 n2 + + 1/2 d 1
:ϕ()>m

1≤d<δ≤n

1 [d, δ]

≤

1 . 3

Using inequality (11.8.30), we deduce in view of (11.8.33), D ≤ 3CM

1 1 − n m

2

a2

1 n2 + + 1/2 d 1
:ϕ()>m

1≤d<δ≤n

1 . (11.8.37) [d, δ]

But

1 1 − n m

2

:ϕ()>m

n2 a2 1/2

≤

1 1 − n m

≤

:ϕ()>m

=

1 n 1 m

:ϕ()>m

a2 1/2

ϕ ' (y)

m n

n a2 1/2

dx ≤ x

n

=

:ϕ()>m m

n a2 1/2

m n

dx x2

ϕ'

1 dx x x

dy . y (11.8.38)


Besides

1 1 − n m

2

1 1 1 ≤ − d n m

a2

1
:ϕ()>m

≤

1 1 − n m

1
:ϕ()>m

1 n

2

1 m

a2

a2

1
:ϕ()>m

≤ C f

1 nd 1 d2

(11.8.39)

dy.

By Gál’s estimate, 1≤d<δ≤n

1/2 1 (d, δ)[d, δ] 1≤d<δ≤n ! 1/2 ! 1 1/2 ≤ d, δ ≤ Cn1/2 (log log n)(log n). dδ

1 = [d, δ]

d, δ1/2

1≤d<δ≤n

1≤d<δ≤n

Thus

1 1 − n m ≤

2

≤C =C

1≤d<δ≤n

:ϕ()>m

1 1 1 − n m n

a2

a2

1 [d, δ]

1≤d<δ≤n

:ϕ()>m

1 [d, δ]

1 n1/2 (log log n)(log n)

1 − n m

n

≤ C f 2

1 n

a2

(11.8.40)

:ϕ()>m

1 1 (log log n)(log n) − n m n1/2

a2

:ϕ()>m

dy.

1 m

Inserting estimates (11.8.38), (11.8.39), (11.8.40) into (11.8.37), gives D ≤ 3CM

1 n 1 m

ϕ ' (y)

dy + f 2 y

1 n 1 m

dy .

(11.8.41)

In view of (11.8.13), $\|S_3\|^2 \le 2(B + D)$. It follows from (11.8.31) and (11.8.41) that
\[
\|S_3\|^2 \le 3CM \Bigl\{ \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} + \|f\|^2 \int_{1/m}^{1/n} dy \Bigr\} .
\tag{11.8.42}
\]
Putting now estimates (11.8.11), (11.8.12) and (11.8.42) together gives, for all $m \ge n \ge 1$,
\[
\begin{aligned}
\|A_n(f) - A_m(f)\|^2 &= \|S_1\|^2 + \|S_2\|^2 + \|S_3\|^2 \\
&\le 3CM \Bigl\{ \int_{1/m}^{1/n} \varphi^{\#}(y)\, dy + \int_{1/m}^{1/n} \varphi'(y)\, \frac{dy}{y} + \|f\|^2 \int_{1/m}^{1/n} dy \Bigr\} + 4 \sum_{\ell: n\le\varphi(\ell)<m} a_\ell^2 .
\end{aligned}
\tag{11.8.43}
\]
In view of the definition (11.8.5) and estimate (11.8.43), the proof of Theorem 11.8.1 is achieved. Theorem 11.8.2 then follows straightforwardly from Theorem 11.8.1. We shall finally give the

Proof of Theorem 11.8.3. Let f ∈ L2 (T), f ∼ ∈Z a e satisfying T f dλ = 0. We evaluate for n < m the increment 2 2

Rd f = a2 dm () − dn () 2

n
√ :ϕ()> m

+

√ :ϕ()≤ m

2 a2 dm () − dn () := E + F.

For estimating E, we use the same trick as for the proof of Theorem 11.8.1. Consider the sequence of partial sums Sn = ε1 + · · · + εn , n = 1, 2, . . . (S0 = 0). Then

2 2 E= a2 dm () − dn () ≤ aS2k dm (Sk ) − dn (Sk ) √ :ϕ()> m

≤A

√ k:ϕ(Sk )> m

√ k:ϕ(k)> m

2 ak2 dm (Sk ) − dn (Sk )

(11.8.44)

almost surely. We used for the last inequality the fact that Sk ≤ k almost surely, and A is defined in (11.8.13). By (11.8.21),

2 E ak2 dm (S ) − dn (S ) √ :ϕ()> m

≤ C (m − n)2 +

√ :ϕ()> m

√ :ϕ()> m

a2

a2 + 1/2

n

1 = C (m − n)2 ϕ ' √ + m +

√

:ϕ()> m

a2

√ :ϕ()> m

1 [d, δ]

√ :ϕ()> m

n
1 d

n
(11.8.45)

1 [d, δ]

a2

a2

1 d

n

By (11.8.4) and letting $U = 1/(2\varepsilon)$, we know from the discussion preceding (11.8.24) that $\varphi'(y) \le (Cy)^{U} \|f\|^2$ for $0 < y \le 1$. As we assumed $\varepsilon \le 1/4$, then $U \ge 2$ and this implies that
\[
(m-n)^2\, \varphi'\Bigl( \frac{1}{\sqrt{m}} \Bigr) \le C^{U} \|f\|^2 (m-n)^2 m^{-U/2} \le C^{U} \|f\|^2 (m-n) .
\tag{11.8.46}
\]

By Gál’s estimate, n≤d<δ≤m

1 = [d, δ] ≤

d, δ

1/2

n≤d<δ≤m

!

1 (d, δ)[d, δ]

1/2 $

d, δ

n≤d<δ≤m

≤ C(m − n)

1/2

n≤d<δ≤m

1/2

1 dδ

%1/2

$

(log log(m − n))

n
(11.8.47) %

1 . d

By treating separately the cases $n \le m \le 2n$ and $m \ge 2n$, one sees that $\log(m/n) \le C(m-n)^{1/2}$, for $m \ge n \ge 2$. Therefore
\[
\sum_{n\le d<\delta\le m} \frac{1}{[d,\delta]} + \sum_{n<d\le m} \frac{1}{d} \le C(m-n) \log\log(m-n) .
\tag{11.8.48}
\]
Consequently, by inserting estimates (11.8.46) and (11.8.48) into (11.8.45) we get E

√ :ϕ()> m

2

ak2 dm (S ) − dn (S ) ≤ C f 22 (m − n) log log(m − n).

(11.8.49)

Arguing as along the lines (11.8.29) to (11.8.31), gives E ≤ 3CM f 22 (m − n) log log(m − n).

(11.8.50)

Now F :=

√ :ϕ()≤ m

2

a2 dm () − dn () ≤

√ :ϕ()≤ m

a2 ϕ() dm () − dn () .

And here again, F ≤

√ k:ϕ(Sk )≤ m

√ aS2k ϕ(Sk ) dm (Sk )−dn (Sk ) ≤ A ak2 (ϕ(k)∧ m)dm (Sk )−dn (Sk ), k≥1


almost surely. Now, by (11.8.21) and the Cauchy–Schwarz inequality, √ a2 (ϕ() ∧ m)dm (S ) − dn (S ) E ≥1

≤

a2 (ϕ() ∧

√

2 1/2

m) E dm (S ) − dn (S )

≥1

≤C

√ a2 (ϕ() ∧ m)

≥1

≤C

a2 (ϕ() ∧

√

m)

≥1

1 (m − n)2 + + 1/2 d n
(m − n) + 1/4

n
(11.8.51) n
1 + d

1 [d, δ]

n
1/2

1 [d, δ]

1/2

.

But using (11.8.4), √ a2 −1/4 (ϕ() ∧ m) ≤ a2 −1/4 ϕ() ≤ Cε a2 ε−1/4 ≤ Cε f 22 , ≥1

≥1

≥1

(11.8.52) since ε ≤ 1/4. Inserting now (11.8.48) and (11.8.52) into (11.8.51) gives √ E a2 (ϕ() ∧ m)dm (S ) − dn (S ) ≥1

1/2 (m − n) 1 1 + + 1/4 d [d, δ] n
≤ Cε f 22 (m − n) + C f 22 m (m − n) log log(m − n)

≤C

a2 (ϕ() ∧

√

m)

≤ Cε f 22 m1/2 (m − n)1/2 (log log(m − n))1/2 . (11.8.53) Arguing again as along the lines (11.8.29) to (11.8.31), gives √ F ≤ 3CM f 22 m(m − n) log log(m − n).

(11.8.54)

Therefore, in view of (11.8.50) and (11.8.54),
\[
\Bigl\| \sum_{n<d\le m} R_d f \Bigr\|_2^2 \le C \|f\|_2^2\, \sqrt{m}\, (m-n) \log\log(m-n) .
\tag{11.8.55}
\]
It follows from Theorem 8.4.2 that the sequence of averages $\{A_n f, n \ge 1\}$ defined in (11.8.2) converges to 0 almost everywhere.

Chapter 12

A study of the system (f (nx))

Let $f$ be a periodic measurable function and $\{n_k, k \ge 1\}$ an increasing sequence of integers. The study of conditions under which the series $\sum_{k=1}^{\infty} c_k f(n_k x)$ converges in mean and for almost every $x$ is an old problem going back to the 1920s. The convergence properties of $\sum_{k=1}^{\infty} c_k f(n_k x)$ are determined by a delicate interplay between the coefficient sequence $\{c_k, k \ge 1\}$, the analytic properties of $f$ and the growth speed and number-theoretic properties of $\{n_k, k \ge 1\}$. A general presentation of this convergence problem is made with proofs of several recent results. The case when the $n_k$ are random is also investigated.

12.1 Introduction and mean convergence

Throughout this chapter, $\mathcal{N} = \{n_k, k \ge 1\}$ denotes an increasing sequence of positive numbers, and $c = \{c_k, k \ge 1\}$ some element of $\ell^2$. Let $f \in L^0(\mathbb{T})$. With these quantities in hand, one can formally define the series
\[
\sum_{k\ge 1} c_k f(n_k x) .
\tag{12.1.1}
\]
We shall study under which conditions the series (12.1.1) defines an element of $L^2(\mathbb{T})$ or converges for almost all $x$ in $\mathbb{T}$. We are thus going to investigate the convergence problem of the sequence of partial sums
\[
S_N^{\mathcal{N}}(c,f) = \sum_{k=1}^{N} c_k f(n_k x), \qquad N = 1, 2, \ldots
\]
in mean (namely in the space $L^2(\mathbb{T})$) or for almost all $x$ in $\mathbb{T}$.

In the trigonometric case this problem has been one of the central problems of harmonic analysis, investigated intensively from the 1920s, culminating in the celebrated theorem of Carleson [1966], stating the almost everywhere convergence of the series $\sum_{k=1}^{\infty} c_k \sin 2\pi kx$, $\sum_{k=1}^{\infty} c_k \cos 2\pi kx$ for all $c \in \ell^2$. Starting from the 1930s, there has also been considerable interest in the convergence properties of the series (12.1.1) for general $f \in L^2(\mathbb{T})$, but the existing results are, even today, much less complete than in the trigonometric case. As it turned out, for general $f$ the behavior of the series (12.1.1) is radically different from the trigonometric case: the terms of the series are usually far from orthogonal and the convergence properties of the sum depend sensitively on the coefficient sequence $c$, the analytic properties of $f$, the growth speed and most importantly on the number-theoretic properties of the sequence $\mathcal{N}$. As a result of the 'interference' between the behavior of the Fourier coefficients of $f$ and the arithmetical properties of $n_k$, even the asymptotic computation of the integral
\[
\int_{\mathbb{T}} \Bigl| \sum_{k=1}^{N} c_k f(n_k x) \Bigr|^2 dx
\tag{12.1.2}
\]

is generally a hard problem. The difficulties encountered in this field are clearly indicated by the long history of the Khinchin conjecture (see Chapter 6 and Proposition 6.1.4), a closely related problem dealing with the a.e. convergence of the averages
\[
\frac{1}{N} \sum_{k=1}^{N} f(kx)
\]
for general integrable $f$. The integral (12.1.2) (with $c_k = 1$ and with indicator functions $f$) is also fundamental in the metric theory of Diophantine approximation.

The first insight into the nature of this integral and the closely related problem of mean convergence of (12.1.1) was given by the following result by Wintner [1944a], connecting the convergence problem with Dirichlet series.

12.1.1 Theorem. Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\, dt = 0$ and with Fourier series
\[
f \sim \sum_{k=1}^{\infty} (a_k \cos 2\pi kx + b_k \sin 2\pi kx) .
\tag{12.1.3}
\]
Then the following statements are equivalent:
(a) The series $\sum_{k=1}^{\infty} c_k f(kx)$ converges in $L^2(\mathbb{T})$ for any $c \in \ell^2$.
(b) There exists a constant $K > 0$ such that for any $n \ge 1$ and any real numbers $\{c_k, 1 \le k \le n\}$ we have
\[
\int_{\mathbb{T}} \Bigl| \sum_{k=1}^{n} c_k f(kx) \Bigr|^2 dx \le K \sum_{k=1}^{n} c_k^2 .
\]
(c) The infinite matrix
\[
\Bigl( \int_{\mathbb{T}} f(kt) f(\ell t)\, dt \Bigr) \qquad (k, \ell = 1, 2, \ldots)
\]
defines a bounded operator on $\ell^2$.
(d) The Dirichlet series
\[
\sum_{n=1}^{\infty} a_n n^{-s} \qquad \text{and} \qquad \sum_{n=1}^{\infty} b_n n^{-s}
\tag{12.1.4}
\]
are regular and bounded in the half-plane $\Re(s) > 0$.


The basic ingredient of Wintner's proof is Toeplitz's criterion [1938] for the $\ell^2$ boundedness of so-called "D-matrices" in terms of Dirichlet series. The connection with the convergence problem in Theorem 12.1.1 is established by the Möbius transformation; see [1944a] for the details.

Clearly, condition (a) of Theorem 12.1.1 implies that $\sum_{k=1}^{\infty} c_k f(kx)$ converges in measure for any $c \in \ell^2$. By a remarkable result of Nikishin [1971], the converse is also true, i.e., the convergence theory of $\sum_{k=1}^{\infty} c_k f(kx)$ is the same for $L^2$ convergence and convergence in measure.

Note that if condition (d) of Theorem 12.1.1 is not satisfied, the series $\sum_{k=1}^{\infty} c_k f(kx)$ can still converge for a large class of coefficient sequences $\{c_k, k \ge 1\}$. For example, Wintner noted that if $f \in L^2(\mathbb{T})$, $\int_{\mathbb{T}} f(t)\, dt = 0$ with Fourier series (12.1.3), then $\sum_{k=1}^{\infty} c_k f(kx)$ converges in $L^2(\mathbb{T})$ if $a_k = O(k^{-\gamma})$, $b_k = O(k^{-\gamma})$, $c_k = O(k^{-\gamma})$ for some $\gamma > 1/2$. The assumptions made here on the Fourier coefficients of $f$ do not in general imply the boundedness of the Dirichlet series in Theorem 12.1.1 and accordingly, the assumption made on the coefficient sequence $\{c_k, k \ge 1\}$ is stronger than $c \in \ell^2$. An application of the last remark is the series
\[
\sum_{k=1}^{\infty} \frac{\psi(kx + \frac{1}{2})}{k},
\tag{12.1.5}
\]
where
\[
\psi(x) = \begin{cases} x - [x] - \frac{1}{2} & \text{if } x \ne [x], \\ 0 & \text{if } x = [x]. \end{cases}
\]
This example has considerable historical interest, since it was used by Riemann [1892] to illustrate the limitations of his own integration theory. He showed that both (12.1.5) and the trigonometric sum
\[
\sum_{n=1}^{\infty} \frac{c(n)}{n} \sin 2\pi nx,
\tag{12.1.6}
\]
where
\[
c(n) = \sum_{d|n} (-1)^d
\tag{12.1.7}
\]

converge if $x$ is rational, and to the same limit. Moreover, he observed that the function defined by these series on the set of rational numbers is unbounded on any interval, and thus (12.1.6) cannot be the Fourier series of its sum in the Riemann sense. From the remark after Theorem 12.1.1 it follows that (12.1.5) converges in $L^2(\mathbb{T})$, and Wintner showed in [1937] that its sum belongs to $L^p(\mathbb{T})$ for any $p > 1$ and has (12.1.6)–(12.1.7) as its Fourier series in the Lebesgue sense.

A sequence of vectors $\{x_n, n \in A\}$ in a Hilbert space $H$ is called a Riesz sequence if there exist positive constants $C_1$, $C_2$ such that
\[
C_1 \sum_{n\in A} |a_n|^2 \le \Bigl\| \sum_{n\in A} a_n x_n \Bigr\|^2 \le C_2 \sum_{n\in A} |a_n|^2
\]
for all sequences of scalars $\{a_n, n \in A\}$.

Hedenmalm, Lindquist and Seip [1997], [1999] proved that if $f \in L^2(\mathbb{T})$, $f(t) \sim \sum_{k=1}^{\infty} \varphi_k \cos 2\pi kt$, then $\{f(nx), n \ge 1\}$ is a Riesz sequence in $L^2(\mathbb{T})$ if and only if the Dirichlet series $\sum_{n=1}^{\infty} \varphi_n n^{-s}$ is analytic and bounded away from 0 and $\infty$ in the whole right half-plane $\Re z > 0$, i.e.,
\[
\delta \le \Bigl| \sum_{n=1}^{\infty} \varphi_n n^{-\sigma - it} \Bigr| \le \Delta, \qquad \text{for } \sigma > 0,
\]
with some positive constants $\delta$ and $\Delta$. Previous results are in the works of Gosselin and Neuwirth [1968], and Ginsberg, Neuwirth and Newman [1970].

So far, we have considered the convergence in mean problem in the case $\mathcal{N} = \mathbb{N}$. In the general case the existing results in the literature are much less complete, due to number-theoretic difficulties. Given positive integers $a$, $b$, define
\[
\langle a, b \rangle = \frac{(a,b)}{[a,b]},
\]
where $(a,b)$ and $[a,b]$ denote the greatest common divisor, resp. least common multiple, of $a$ and $b$. The following theorem is an easy consequence of results of Wintner [1944a].

12.1.2 Theorem. Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\, dt = 0$ and Fourier series
\[
f \sim \sum_{k=1}^{\infty} (a_k \cos 2\pi kt + b_k \sin 2\pi kt),
\]
where $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $\alpha > 1/2$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and $\{c_k, k \ge 1\}$ a real coefficient sequence. Then $\sum_{k=1}^{\infty} c_k f(n_k x)$ converges in the mean provided
\[
\sum_{k,l=1}^{\infty} |c_k|\,|c_l|\, \langle n_k, n_l \rangle^{\alpha} < \infty .
\tag{12.1.8}
\]
To prove this, it suffices to consider the case when the Fourier series of $f$ is a pure sine or cosine series. Now if $f \sim \sum_{k=1}^{\infty} a_k \cos 2\pi kt$, then by the assumption on the $a_k$ and relation (52) of [Wintner: 1944a] we have for any positive integers $i$, $j$,
\[
\Bigl| \int_{\mathbb{T}} f(it) f(jt)\, dt \Bigr| = \Bigl| \frac{1}{2} \sum_{h=1}^{\infty} a_{hi/(i,j)}\, a_{hj/(i,j)} \Bigr| \le C_1 \langle i, j \rangle^{\alpha} \sum_{h=1}^{\infty} h^{-2\alpha} \le C_2 \langle i, j \rangle^{\alpha}
\]
for some constants $C_1$, $C_2$ and thus
\[
\int_{\mathbb{T}} \Bigl| \sum_{k=m}^{n} c_k f(n_k t) \Bigr|^2 dt \le C_2 \sum_{m\le k,l\le n} |c_k|\,|c_l|\, \langle n_k, n_l \rangle^{\alpha} .
\tag{12.1.9}
\]
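The relation between $\int_{\mathbb{T}} f(it) f(jt)\, dt$ and the coefficients $a_{hi/(i,j)}$, $a_{hj/(i,j)}$ can be tested on a cosine polynomial. The sketch below is an illustration only: the truncation to $K = 8$ terms, the choice $a_k = 1/k$ (that is, $\alpha = 1$), the number `M` of quadrature nodes and the function names are assumptions made here, not part of the text.

```python
from math import cos, gcd, pi

def make_f(a):
    """Cosine polynomial f(t) = sum_{k=1}^{K} a[k-1] * cos(2*pi*k*t)."""
    return lambda t: sum(ak * cos(2 * pi * k * t) for k, ak in enumerate(a, start=1))

def inner_product(f, i, j, M=512):
    """Integral over [0,1) of f(i t) f(j t) dt, computed by an M-point Riemann sum.
    The sum is exact up to rounding as long as M exceeds K * (i + j)."""
    return sum(f(i * s / M) * f(j * s / M) for s in range(M)) / M

def wintner_sum(a, i, j):
    """(1/2) * sum_h a_{h i/(i,j)} * a_{h j/(i,j)}, with a_k = 0 for k > K."""
    g, K = gcd(i, j), len(a)
    total, h = 0.0, 1
    while h * max(i, j) // g <= K:
        total += a[h * i // g - 1] * a[h * j // g - 1]
        h += 1
    return total / 2

def bracket(i, j):
    """<i, j> = (i, j) / [i, j] = gcd(i, j)**2 / (i * j)."""
    g = gcd(i, j)
    return g * g / (i * j)

a = [1.0 / k for k in range(1, 9)]      # hypothetical coefficients a_k = 1/k, K = 8
f = make_f(a)
pairs = [(4, 6), (3, 5), (2, 8), (6, 6)]
checks = [(inner_product(f, i, j), wintner_sum(a, i, j), bracket(i, j)) for i, j in pairs]
```

For these coefficients each integral is at most $(\pi^2/12)\,\langle i,j\rangle$, in line with the bound $C_2 \langle i,j\rangle^{\alpha}$ used to derive (12.1.9).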


Hence Theorem 12.1.2 follows from (12.1.8).

In particular, under the assumptions made on $f$ in Theorem 12.1.2, $\sum_{k=1}^{\infty} c_k f(n_k x)$ converges in $L^2(\mathbb{T})$ norm for any $c \in \ell^2$ provided the quadratic form
\[
\sum_{k,l=1}^{\infty} \langle n_k, n_l \rangle^{\alpha} x_k x_l
\tag{12.1.10}
\]
is bounded, i.e., there exists a constant $A > 0$ such that
\[
\sum_{k,l=1}^{N} \langle n_k, n_l \rangle^{\alpha} x_k x_l \le A \sum_{n=1}^{N} x_n^2
\tag{12.1.11}
\]
for any $N \ge 1$ and any real $x_1, \ldots, x_N$. This is equivalent, in turn, to the fact that the matrix $\langle n_k, n_l \rangle^{\alpha}$ $(k, l = 1, 2, \ldots)$ defines a bounded operator on $\ell^2$. In the case $n_k = k$ this holds if and only if $\alpha > 1$, as follows easily from Theorem 12.1.1. Also, if the $n_k$ are pairwise coprime, then $\langle n_k, n_l \rangle = (n_k n_l)^{-1}$ for $k \ne l$ and thus by Cauchy's inequality, (12.1.11) is satisfied if $\sum_{k=1}^{\infty} n_k^{-2\alpha} < \infty$. For general $\{n_k, k \ge 1\}$, a sufficient condition for (12.1.11) is
\[
\sup_{k\ge 1} \sum_{l\ge 1} \langle n_k, n_l \rangle^{\alpha} < \infty .
\tag{12.1.12}
\]
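To see how condition (12.1.12) separates sequences, one can compute the largest row sum over finite initial segments. The following sketch compares a geometric sequence with the sequence of all integers; the exponent `alpha = 0.75`, the segment lengths and the function names are illustrative assumptions.

```python
from math import gcd

def bracket(a, b):
    """<a, b> = (a, b) / [a, b] = gcd(a, b)**2 / (a * b)."""
    g = gcd(a, b)
    return g * g / (a * b)

def max_row_sum(seq, alpha):
    """max over k of sum over l of <n_k, n_l>^alpha, on a finite segment of the sequence."""
    return max(sum(bracket(nk, nl) ** alpha for nl in seq) for nk in seq)

alpha = 0.75                                   # illustrative exponent in (1/2, 1)
geometric = [2 ** k for k in range(1, 41)]     # n_k = 2^k
geometric_bound = (1 + 2 ** -alpha) / (1 - 2 ** -alpha)

row_geometric = max_row_sum(geometric, alpha)
row_integers_200 = max_row_sum(list(range(1, 201)), alpha)
row_integers_800 = max_row_sum(list(range(1, 801)), alpha)
```

For $n_k = 2^k$ one has $\langle n_k, n_l \rangle = 2^{-|k-l|}$, so the row sums stay below a geometric series bound regardless of the segment length, whereas for $n_k = k$ and $\alpha < 1$ the row with $k = 1$ already grows like $\sum_{l \le N} l^{-\alpha}$.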

Unfortunately, computing the order of magnitude of the sums in (12.1.11), (12.1.12) for general $\mathcal{N}$ is a difficult number-theoretic problem. By Gál's estimate (8.4.19), for any increasing $\{n_k, k \ge 1\}$ we have
\[
\sum_{k,l=1}^{N} \langle n_k, n_l \rangle \le CN(\log\log N)^2 .
\]
In fact, Gál [1949] constructed a sequence $\{n_k, k \ge 1\}$ for which the bound $CN(\log\log N)^2$ is actually attained. For this sequence $\{n_k, k \ge 1\}$, relation (12.1.11) clearly fails for $x_1 = \cdots = x_N = 1$. No sharp estimate for the left-hand side of (12.1.11) is known for general $x_1, \ldots, x_N$.

The following theorem gives mean convergence criteria for $\sum_{k=1}^{\infty} c_k f(n_k x)$ in the case when (12.1.12) is not satisfied.

12.1.3 Theorem. Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\, dt = 0$ and Fourier series
\[
f(t) \sim \sum_{k=1}^{\infty} (a_k \cos 2\pi kt + b_k \sin 2\pi kt),
\]
where $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $\alpha > 1/2$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and let $\{\lambda_k, k \ge 1\}$ be a positive nondecreasing sequence such that $\lambda_{2n}/\lambda_n = O(1)$ and
\[
\sup_{1\le k\le N} \sum_{l=1}^{N} \langle n_k, n_l \rangle^{\alpha} \le \lambda_N .
\tag{12.1.13}
\]
Then $\sum_{k=1}^{\infty} c_k f(n_k x)$ converges in $L^2(\mathbb{T})$ norm provided
\[
\sum_{k=1}^{\infty} c_k^2 (\log k)^{\gamma} \lambda_k < \infty \qquad \text{for some } \gamma > 1 .
\tag{12.1.14}
\]

k=1

Note that in the case when $\lambda_N = O(1)$, $\sum_{k=1}^\infty c_k f(n_k x)$ converges in the mean provided $\sum_{k=1}^\infty c_k^2 < \infty$, but condition (12.1.14) specialized to this case gives a more stringent condition. This is due to the fairly crude estimates we use for the quadratic form appearing in the argument. We formulate a few corollaries of Theorem 12.1.3.

12.1.4 Corollary. Assume $f$ satisfies the assumptions of Theorem 12.1.3 and let $n_k = k^r$ where $r \ge 2$ is an integer. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges in the mean provided $\sum_{k=1}^\infty c_k^2 < \infty$.

Note that the assumptions made on the Fourier coefficients of $f$ in Corollary 12.1.4 do not imply condition (d) of Theorem 12.1.1, but $\sum_{k=1}^\infty c_k f(n_k x)$ still converges for all $c \in \ell^2$. This is due to the growth speed and nice number-theoretic properties of $n_k$.

12.1.5 Corollary. Assume $f$ satisfies the assumptions of Theorem 12.1.3. Then the series $\sum_{k=1}^\infty c_k f(kx)$ converges in the mean provided

$$\sum_{k=1}^\infty c_k^2 (\log k)^{3+\varepsilon} < \infty \quad\text{if } \alpha = 1$$

and

$$\sum_{k=1}^\infty c_k^2 (\log k)^{1+\varepsilon} k^{1-\alpha} < \infty \quad\text{if } \alpha < 1.$$

Note that the case $\alpha > 1$ is uninteresting: in this case $\sum_{k=1}^\infty (|a_k| + |b_k|) < \infty$ and thus

$$\Big\|\sum_{k=1}^N c_k f(n_k x)\Big\| \le \sum_{j=1}^\infty |a_j| \Big\|\sum_{k=1}^N c_k \cos 2\pi j n_k x\Big\| + \sum_{j=1}^\infty |b_j| \Big\|\sum_{k=1}^N c_k \sin 2\pi j n_k x\Big\| \le C \Big(\sum_{k=1}^N c_k^2\Big)^{1/2}$$

for some constant $C$ and thus $\sum_{k=1}^\infty c_k f(n_k x)$ converges in the mean for any $c \in \ell^2$. Actually, the series converges almost everywhere also.

12.1.6 Corollary. Let $f$ satisfy the assumptions of Theorem 12.1.3 with $\alpha = 1$ and let $\{n_k, k \ge 1\}$ be a sequence of integers such that for any $d \ge 1$ we have $\sum_{d \mid n_k} n_k^{-1} \le A/d$ with an absolute constant $A$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges in the mean provided

$$\sum_{k=1}^\infty c_k^2 (\log k)^\gamma (\log n_k) < \infty, \qquad \gamma > 1. \tag{12.1.15}$$


The condition $\sum_{d \mid n_k} n_k^{-1} \le A/d$ is satisfied if the sequence $\{n_k, k \ge 1\}$ is roughly uniformly distributed among the residue classes mod $d$. If the $n_k$ are coprime, for any $d$ the sum $\sum_{d \mid n_k} n_k^{-1}$ contains at most one term and thus the conditions of Corollary 12.1.6 are satisfied. The conditions are also satisfied for $n_k = k^r$, $r \ge 2$, but in this case Corollary 12.1.4 gives a better result. We see again that the number-theoretic properties of $n_k$ play a crucial role in the convergence behavior of $\sum_{k=1}^\infty c_k f(n_k x)$, which can be anticipated from Theorem 12.1.2. If $\{n_k, k \ge 1\}$ grows with a polynomial speed, then $\log n_k = O(\log k)$ and thus the convergence condition (12.1.15) reduces to $\sum_{k=1}^\infty c_k^2 (\log k)^{2+\varepsilon} < \infty$.

Proof of Theorem 12.1.3. By assumption (12.1.13), relation (12.1.9) and the Cauchy–Schwarz inequality we have

$$\int_{\mathbb T} \Big(\sum_{k=1}^N c_k f(n_k t)\Big)^2 dt \le C_2 \sum_{k,l=1}^N |c_k||c_l| \langle n_k,n_l\rangle^\alpha \le C_2 \sum_{k,l=1}^N \frac12 (c_k^2 + c_l^2) \langle n_k,n_l\rangle^\alpha \le C_2 \lambda_N \sum_{k=1}^N c_k^2 \tag{12.1.16}$$

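for any real $c_1, \dots, c_N$. Assume now (12.1.14) and let $Z_\nu = \sum_{k=2^\nu+1}^{2^{\nu+1}} c_k f(n_k x)$. By the Cauchy–Schwarz inequality we have

$$\Big(\sum_{k=2^m+1}^{2^n} c_k f(n_k x)\Big)^2 = \Big(\sum_{\nu=m}^{n} Z_\nu\Big)^2 \le \Big(\sum_{\nu=m}^{n} \nu^\gamma Z_\nu^2\Big)\Big(\sum_{\nu=1}^\infty \nu^{-\gamma}\Big). \tag{12.1.17}$$

Thus with some constant $C$ we have, using (12.1.16),

$$\int_{\mathbb T} \Big(\sum_{k=2^m+1}^{2^n} c_k f(n_k x)\Big)^2 dx \le C \sum_{\nu=m}^{n} \nu^\gamma \int_{\mathbb T} Z_\nu^2\,dx \le C \sum_{\nu=m}^{n} \nu^\gamma \sum_{k=2^\nu+1}^{2^{\nu+1}} c_k^2 \lambda_{2^{\nu+1}} \le C \sum_{k=2^m+1}^{2^n} c_k^2 (\log k)^\gamma \lambda_k. \tag{12.1.18}$$

Here the last expression tends to 0 as $m, n \to \infty$ and this remains valid if an arbitrary subset of the $c_k$'s is replaced by 0's. Thus the $L^2$ norm of $\sum_{k=i}^{j} c_k f(n_k x)$ tends to 0 if $i, j \to \infty$, completing the proof of Theorem 12.1.3.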
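The row sums controlled by $\lambda_N$ in (12.1.13) are easy to explore numerically. The following is a minimal sketch, not part of the original argument: it assumes the reading $\langle a,b\rangle = (a,b)/[a,b] = (a,b)^2/(ab)$, which is how the surrounding text uses the symbol (for coprime $n_k$ it gives $(n_k n_l)^{-1}$). The function names `bracket` and `row_sup` are illustrative only.

```python
from math import gcd


def bracket(a: int, b: int) -> float:
    """<a, b> = gcd(a, b) / lcm(a, b) = gcd(a, b)**2 / (a * b) (assumed reading)."""
    g = gcd(a, b)
    return g * g / (a * b)


def row_sup(seq, alpha: float) -> float:
    """Smallest admissible lambda_N in (12.1.13) for the finite sequence seq:
    the maximum over k of the sum over l of <n_k, n_l>**alpha."""
    return max(sum(bracket(nk, nl) ** alpha for nl in seq) for nk in seq)


if __name__ == "__main__":
    for N in (50, 200, 800):
        linear = list(range(1, N + 1))      # n_k = k
        squares = [k * k for k in linear]   # n_k = k^2
        print(N, row_sup(linear, 1.0), row_sup(squares, 1.0))
```

With $\alpha = 1$ the values for $n_k = k$ should keep growing slowly with $N$, while those for $n_k = k^2$ should stay bounded, in line with the role of the exponent $r\alpha$ in the discussion that follows.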


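Note that in the case $n_k = k^r$, $r \in \mathbb N$, we have $\langle n_k,n_l\rangle^\alpha = \langle k,l\rangle^{r\alpha}$. Thus in view of Theorem 12.1.3, for the proof of Corollaries 12.1.4 and 12.1.5 it suffices to prove the following

12.1.7 Lemma. Let $\beta > 0$ and

$$\lambda_n^* = \sup_{1\le l\le n} \sum_{k=1}^n \langle k,l\rangle^\beta. \tag{12.1.19}$$

Then $\lambda_n^* = O(1)$, $\lambda_n^* = O(\log^2 n)$ and $\lambda_n^* = O(n^{1-\beta})$ according as $\beta > 1$, $\beta = 1$ or $\beta < 1$.

Proof. Fix $d \mid h$ and sum in (12.1.19) first for those $k$ for which $(h, k) = d$. Then we get

$$\sum_{k\le n,\,(h,k)=d} \Big(\frac{d}{[h,k]}\Big)^\beta = \sum_{k\le n,\,(h,k)=d} \Big(\frac{d^2}{hk}\Big)^\beta \le \Big(\frac{d}{h}\Big)^\beta \sum_{k\le n,\,d\mid k} \Big(\frac{d}{k}\Big)^\beta \le \Big(\frac{d}{h}\Big)^\beta \sum_{l=1}^{[n/d]} \frac{1}{l^\beta}. \tag{12.1.20}$$

For $\beta = 1$ the last sum in (12.1.20) is at most $C^* \log n$ and thus summing for all $d \mid h$ and noting that the sum of all divisors of $h$ is $\le C h \log h$, we get the statement of the lemma. For $\beta > 1$ the last sum in (12.1.20) is $O(1)$ and

$$\sum_{d\mid h} d^\beta \le \sum_{d\mid h,\ d\le\sqrt h} d^\beta + \sum_{d\mid h,\ d\le\sqrt h} (h/d)^\beta.$$

Let $\varepsilon > 0$. Since the number of divisors of $h$ is $O(h^\varepsilon)$, the first sum on the right-hand side has $O(h^\varepsilon)$ terms and thus this sum is $O(h^{\beta/2+\varepsilon})$; the second sum on the right-hand side is at most $h^\beta \sum_{j=1}^\infty j^{-\beta} = O(h^\beta)$. Choosing $\varepsilon$ sufficiently small, we get the statement of the lemma in the case $\beta > 1$. Let finally $0 < \beta < 1$. Then the last expression in (12.1.20) is at most

$$C \Big(\frac{d}{h}\Big)^\beta \Big(\frac{n}{d}\Big)^{1-\beta} = \frac{C}{h^\beta}\, n^{1-\beta} d^{2\beta-1}$$

and thus $\lambda_n^*$ is less than

$$C\, \frac{1}{h^\beta}\, n^{1-\beta} \sum_{d\mid h} d^{2\beta-1}. \tag{12.1.21}$$

Let $0 < \varepsilon < \min(\beta, 1-\beta)$. Since the number of divisors of $h$ is $O(h^\varepsilon)$, for $\beta \ge 1/2$ the sum in (12.1.21) is $O(h^{2\beta-1+\varepsilon})$ and thus the expression in (12.1.21) is less than

$$C\, \frac{1}{h^\beta}\, n^{1-\beta} h^{2\beta-1+\varepsilon} \le C n^{1-\beta}.$$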


If $\beta < 1/2$, then the sum in (12.1.21) is $O(h^\varepsilon)$ and thus the expression in (12.1.21) is bounded by

$$C\, \frac{1}{h^\beta}\, n^{1-\beta} h^\varepsilon \le C n^{1-\beta}.$$

Thus in both cases the expression in (12.1.21) is $O(n^{1-\beta})$, and thus the lemma is proved.

To prove Corollary 12.1.6, it suffices to show that

$$\sum_{k=1}^N \frac{(n_h, n_k)}{[n_h, n_k]} \le C \log n_h. \tag{12.1.22}$$

Fix $d \mid n_h$ and compute the sum in (12.1.22) for those $1 \le k \le N$ such that $(n_h, n_k) = d$. This restricted sum clearly cannot exceed, in view of the assumption of Corollary 12.1.6,

$$\sum_{1\le k\le N,\ d\mid n_k} \frac{d^2}{n_h n_k} \le \frac{d^2}{n_h}\cdot\frac{A}{d}.$$

Summing for all $d \mid n_h$, and using the fact that the sum of divisors of $n_h$ is $O(n_h \log n_h)$, we get (12.1.22).

Our next theorem gives a necessary and sufficient condition for the mean convergence of the series $\sum_{k=1}^\infty c_k f(n_k x)$ in terms of the coefficients $c_k$ and the Fourier coefficients of $f$. Despite its precise character, it is mainly of theoretical interest, since its number-theoretic character makes it difficult to apply in concrete cases.

12.1.8 Theorem. Let $f \in L^2(\mathbb T)$ have complex Fourier series $f \sim \sum_{k\in\mathbb Z} \varphi_k e_k$ with $\varphi_0 = \int_{\mathbb T} f(t)\,dt = 0$ and $e_k(x) = \exp(2\pi i k x)$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges in the mean if and only if the following conditions are fulfilled:

$$\text{a)}\quad \lim_{R\to\infty} \sup_{P\ge R} \sum_{|n|>n_R} \Big|\sum_{n_k\mid n,\ k\le P} \varphi_{n/n_k} c_k\Big|^2 = 0, \qquad \text{b)}\quad \sum_{n} \Big|\sum_{n_k\mid n} \varphi_{n/n_k} c_k\Big|^2 < \infty. \tag{12.1.23}$$

If both sequences $\{\varphi_n, n \in \mathbb Z\}$ and $c$ have constant signs then (12.1.23a) follows from (12.1.23b), so that the sequence $\{S_N^{\mathcal N}(c, f), N \ge 1\}$ converges in mean if and only if condition (12.1.23b) holds. Also, if $\sum_n \big(\sum_{n_k\mid n} |\varphi_{n/n_k} c_k|\big)^2 < \infty$, then the sequence $\{S_N^{\mathcal N}(c, f), N \ge 1\}$ converges in mean.

Proof. Observe that

$$S_N^{\mathcal N}(c, f) = \sum_{n} e_n \sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k = \sum_{|n|\le n_N} e_n \sum_{n_k\mid n} \varphi_{n/n_k} c_k + \sum_{|n|>n_N} e_n \sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k.$$


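Let $M \ge N \ge R$. Then,

$$\langle S_N^{\mathcal N}(c, f), S_M^{\mathcal N}(c, f)\rangle = \sum_{n} \Big(\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big)\overline{\Big(\sum_{n_k\mid n,\ k\le M} \varphi_{n/n_k} c_k\Big)} = \sum_{|n|\le n_R} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2 + \sum_{|n|>n_R} \Big(\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big)\overline{\Big(\sum_{n_k\mid n,\ k\le M} \varphi_{n/n_k} c_k\Big)}.$$

Thus

$$\Big|\langle S_N^{\mathcal N}(c, f), S_M^{\mathcal N}(c, f)\rangle - \sum_{|n|\le n_R} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2\Big| \le \Big(\sum_{|n|>n_R} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2\Big)^{1/2} \Big(\sum_{|n|>n_R} \Big|\sum_{n_k\mid n,\ k\le M} \varphi_{n/n_k} c_k\Big|^2\Big)^{1/2} \le \sup_{P\ge R} \sum_{|n|>n_R} \Big|\sum_{n_k\mid n,\ k\le P} \varphi_{n/n_k} c_k\Big|^2 \to 0,$$

as $R$ tends to infinity by assumption. Consequently,

$$\lim_{R\to\infty} \sup_{M,N\ge R} \Big|\langle S_N^{\mathcal N}(c, f), S_M^{\mathcal N}(c, f)\rangle - \sum_{|n|\le n_R} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2\Big| = 0.$$

In other words,

$$\lim_{M,N\to\infty} \langle S_N^{\mathcal N}(c, f), S_M^{\mathcal N}(c, f)\rangle = A := \sum_{n} \Big|\sum_{n_k\mid n} \varphi_{n/n_k} c_k\Big|^2 < \infty.$$

And also

$$\lim_{N\to\infty} \|S_N^{\mathcal N}(c, f)\|_2^2 = A.$$

These two facts then imply that

$$\lim_{N,M\to\infty} \|S_N^{\mathcal N}(c, f) - S_M^{\mathcal N}(c, f)\|_2 = 0,$$

as required.

Conversely, if the sequence $\{S_N^{\mathcal N}(c, f), N \ge 1\}$ converges in mean, it is then bounded in mean:

$$\sup_{N\ge 1} \|S_N^{\mathcal N}(c, f)\|_2 = B < \infty.$$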


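But as

$$\|S_N^{\mathcal N}(c, f)\|_2^2 = \sum_{|n|\le n_N} \Big|\sum_{n_k\mid n} \varphi_{n/n_k} c_k\Big|^2 + \sum_{|n|>n_N} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2,$$

this implies that $A \le B^2$. Now let $f^*$ denote the limit in mean of the sequence $\{S_N^{\mathcal N}(c, f), N \ge 1\}$. From

$$\big|\langle f^*, e_n\rangle - \langle S_N^{\mathcal N}(c, f), e_n\rangle\big| \le \|f^* - S_N^{\mathcal N}(c, f)\|_2$$

we deduce

$$\langle f^*, e_n\rangle = \lim_{N\to\infty} \langle S_N^{\mathcal N}(c, f), e_n\rangle = \lim_{N\to\infty} \sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k = \sum_{n_k\mid n} \varphi_{n/n_k} c_k.$$

Thus $f^* = \sum_{n\in\mathbb Z} e_n \sum_{n_k\mid n} \varphi_{n/n_k} c_k$. Let $R$ be some positive integer and define $H_R = \{e_n, |n| \le n_R\}$. Let $p_R$ be the projection onto the orthogonal complement $H_R^\perp$ of $H_R$. Then,

$$\big|\,\|p_R(f^*)\|_2 - \|p_R(S_N^{\mathcal N}(c, f))\|_2\,\big| \le \|p_R(f^*) - p_R(S_N^{\mathcal N}(c, f))\|_2 \le \|f^* - S_N^{\mathcal N}(c, f)\|_2 \to 0,$$

as $N$ tends to infinity. Thus,

$$\sup_{N\ge R} \big|\,\|p_R(f^*)\|_2 - \|p_R(S_N^{\mathcal N}(c, f))\|_2\,\big| \le \sup_{N\ge R} \|f^* - S_N^{\mathcal N}(c, f)\|_2 \to 0,$$

as $R$ tends to infinity. Now, by the triangle inequality,

$$\sup_{N\ge R} \|p_R(S_N^{\mathcal N}(c, f))\|_2 = \sup_{N\ge R} \Big(\sum_{|n|>n_R} \Big|\sum_{n_k\mid n,\ k\le N} \varphi_{n/n_k} c_k\Big|^2\Big)^{1/2} \le \sup_{N\ge R} \big|\,\|p_R(f^*)\|_2 - \|p_R(S_N^{\mathcal N}(c, f))\|_2\,\big| + \|p_R(f^*)\|_2 \to 0,$$

as $R$ tends to infinity. This completes the proof.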
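The coefficient formula used throughout this proof, namely that the coefficient of $e_n$ in the partial sum is $\sum_{n_k \mid n,\ k\le N} \varphi_{n/n_k} c_k$, can be checked on a small example. The sketch below is only an illustration under stated assumptions: $f$ is taken to be a trigonometric polynomial given by a finite dictionary `phi` of Fourier coefficients, and the helper names are hypothetical. It compares $\sum_n |\text{coefficient}|^2$ with a direct Riemann-sum evaluation of the squared $L^2$ norm, which is exact for trigonometric polynomials once the grid is finer than twice the largest frequency.

```python
import cmath
from collections import defaultdict


def partial_sum_coeffs(phi, c, n_seq):
    """Fourier coefficients of S_N = sum_k c_k f(n_k x).

    phi maps a frequency m to the coefficient phi_m of f; the coefficient of
    e_n in S_N is the sum over n_k | n of phi[n / n_k] * c_k.
    """
    coeffs = defaultdict(complex)
    for ck, nk in zip(c, n_seq):
        for m, phi_m in phi.items():
            coeffs[m * nk] += phi_m * ck
    return dict(coeffs)


def l2_norm_sq_direct(phi, c, n_seq, grid=4096):
    """Mean of |S_N(x)|^2 over a uniform grid on [0, 1)."""
    total = 0.0
    for i in range(grid):
        x = i / grid
        s = sum(
            ck * sum(p * cmath.exp(2j * cmath.pi * m * nk * x) for m, p in phi.items())
            for ck, nk in zip(c, n_seq)
        )
        total += abs(s) ** 2
    return total / grid


if __name__ == "__main__":
    # f(x) = cos(2 pi x) + 0.5 cos(4 pi x), an illustrative choice.
    phi = {1: 0.5, -1: 0.5, 2: 0.25, -2: 0.25}
    c, n_seq = [1.0, -0.5, 0.3], [1, 2, 4]
    coeffs = partial_sum_coeffs(phi, c, n_seq)
    print(sum(abs(v) ** 2 for v in coeffs.values()))
    print(l2_norm_sq_direct(phi, c, n_seq, grid=256))
```

In this example the contributions of $k = 1$ and $k = 2$ to the coefficient of $e_2$ cancel, which shows in miniature why the conditions in (12.1.23) involve the inner sums over $n_k \mid n$ rather than the individual terms.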

12.2 Almost sure convergence – sufficient conditions

Let $f \in L^2(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ and let $\mathcal N$ be an increasing sequence of positive integers. Using standard terminology, we call the pair $(f, \mathcal N)$ (or, equivalently, the sequence $f(n_k x)$) a convergence system if for any $c \in \ell^2$, $\sum_{k=1}^\infty c_k f(n_k x)$ converges for almost all $x \in \mathbb T$. This is the simplest and strongest type of behavior of $\sum_{k=1}^\infty c_k f(n_k x)$, but it holds only in a few special situations. By Carleson's theorem [1966], $\{\cos 2\pi n x\}$ and $\{\sin 2\pi n x\}$ are convergence systems. More generally, Gaposhkin [1968] proved (using Carleson's theorem) the following result:


12.2.1 Theorem. Let $f \in \mathrm{Lip}_\alpha(\mathbb T)$ for $\alpha > 1/2$ and $\int_{\mathbb T} f(t)\,dt = 0$. Then $\{f(nx), n = 1, 2, \dots\}$ is a convergence system.

Another classical result, proved by Kac [1943] for the Lipschitz class and extended substantially by Gaposhkin [1966b], is the following

12.2.2 Theorem. Let $f \in L^2(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ and assume that the square modulus of continuity

$$\omega_2(\delta, f) = \sup_{0<h\le\delta} \Big(\int_0^1 |f(x+h) - f(x)|^2\,dx\Big)^{1/2}$$

of $f$ satisfies

$$\omega_2(\delta, f) = O\Big(\Big(\log\frac{1}{\delta}\Big)^{-1-\varepsilon}\Big) \qquad (\varepsilon > 0). \tag{12.2.1}$$

Let $\{n_k, k \ge 1\}$ be a sequence of positive reals satisfying the Hadamard gap condition

$$n_{k+1}/n_k \ge q > 1, \qquad k = 1, 2, \dots. \tag{12.2.2}$$

Then $f(n_k x)$ is a convergence system.

These theorems describe the known situations when $f(n_k x)$ is a convergence system. All conditions of these results are sharp. Gaposhkin [1966a] showed that Theorem 12.2.2 becomes false if we replace the right-hand side of (12.2.1) by $O\big((\log\frac{1}{\delta})^{-1/2}\big)$, and Berkes [1997] proved that the condition $f \in \mathrm{Lip}_\alpha(\mathbb T)$, $\alpha > 1/2$ in Theorem 12.2.1 and the Hadamard gap condition (12.2.2) in Theorem 12.2.2 are also best possible: there exists a function $f \in \mathrm{Lip}_{1/2}(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ such that for any positive sequence $\{\varepsilon_k, k \ge 1\}$ tending to 0, there exists an increasing sequence $\mathcal N$ of integers satisfying

$$n_{k+1}/n_k \ge 1 + \varepsilon_k, \qquad k = 1, 2, \dots \tag{12.2.3}$$

and $c \in \ell^2$ such that the series $\sum_{k=1}^\infty c_k f(n_k x)$ diverges almost everywhere.

Going beyond the conditions of Theorems 12.2.1 and 12.2.2, the almost everywhere convergence behavior of $\sum_{k=1}^\infty c_k f(n_k x)$ becomes very complicated and examples show that the properties of $\sum_{k=1}^\infty c_k f(n_k x)$ are determined by a delicate interplay between the coefficient sequence $\{c_k, k \ge 1\}$, the smoothness properties of $f$ and the growth speed and number-theoretic properties of $\{n_k, k \ge 1\}$. In this section we give a detailed study of this behavior and prove several convergence results for such series. Our main interest will be to find convergence criteria of the type $\sum_{k=1}^\infty c_k^2\, \omega(k) < \infty$ where $\omega(k) \to \infty$ is some positive sequence (Weyl multiplier) depending on $f$ and $\{n_k, k \ge 1\}$. Before formulating our results, we first give an equivalent reformulation of the convergence system property of $\sum_{k=1}^\infty c_k f(n_k x)$ in terms of maximal operators. The following result is due to Nikishin [1970b].


12.2.3 Proposition. A pair $(f, \mathcal N)$ is a convergence system if and only if for any $\varepsilon > 0$, $0 < \delta < 1$, there exist a set $A_{\varepsilon,\delta} \subset (0, 1)$ with Lebesgue measure $\ge 1 - \varepsilon$ and a constant $C_{\varepsilon,\delta} > 0$ such that for arbitrary $c \in \ell^2$ we have

$$\int_{A_{\varepsilon,\delta}} \sup_{N\ge 1} \Big|\sum_{k=1}^N c_k f(n_k x)\Big|^{1-\delta} dx \le C_{\varepsilon,\delta} \Big(\sum_{k=1}^\infty c_k^2\Big)^{(1-\delta)/2}.$$

We now prove an analogous statement involving a weak (2, 2) type inequality.

12.2.4 Proposition. A pair $(f, \mathcal N)$ is a convergence system if and only if there exists a constant $C$ such that for any $c \in \ell^2$ the following maximal inequality holds:

$$\sup_{t\ge 0}\, t^2 \lambda\Big\{\sup_{N\ge 1} |S_N^{\mathcal N}(c, f)| > t \|c\|_2\Big\} \le C.$$

Proof. Given a pair $(f, \mathcal N)$, consider the $L^2(\mathbb T)$-operators $S_N^{\mathcal N,f}$, $N = 1, 2, \dots$, defined via the isomorphism $c \to g$ if $g \sim \sum_k c_{|k|} e_k$ by

$$S_N^{\mathcal N,f}(g) = \sum_{k=1}^N c_k f(n_k\,\cdot).$$

Consider also the family of pointwise measurable transformations on $\mathbb T$: $\tau_j x = jx \bmod 1$. For any integer $j \ge 2$, the transformation $\tau_j$ preserves the normalized Lebesgue measure $\lambda$. It is in turn an exact endomorphism (Section 3.3) and is in particular strongly mixing. Let $E$ denote the family of the associated operators $T_j g = g \circ \tau_j$. That the $T_j$'s are commuting positive $L^2$-isometries preserving 1 is better viewed on the Fourier expansion of $g$, since if $g \sim \sum_{m\in\mathbb Z} g_m e_m$, then $T_j g \sim \sum_{m\in\mathbb Z} g_m e_{mj}$, which readily implies

$$T_k(T_j g) = T_j(T_k g) \qquad (j, k = 1, 2, \dots). \tag{12.2.4}$$

Proceeding next by approximation, we deduce that (12.2.4) holds for any $g \in L^p(\mathbb T)$, $0 < p \le \infty$. This in particular implies that the sequence of operators $S_N^{\mathcal N,f}$ commutes with $E$: for any $g \in L^2(\mathbb T)$,

$$S_N^{\mathcal N,f}(T_j g) = T_j(S_N^{\mathcal N,f} g) \qquad (N, j = 1, 2, \dots).$$

Further, for any $g \in L^2(\mathbb T)$,

$$\lim_{J\to\infty} \Big\|\frac{1}{J} \sum_{j=1}^J T_j g - \int_{\mathbb T} g\,d\lambda\Big\|_2 = 0. \tag{12.2.5}$$


Since strong convergence implies weak convergence, it follows for any $u, v \in L^2(\mathbb T)$ that

$$\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J \langle T_j u, v\rangle = \langle u, 1\rangle \langle v, 1\rangle.$$

Choosing $u = \chi\{A\}$, $v = \chi\{B\}$ where $A, B$ are Borel sets of $\mathbb T$ and $\chi$ denotes the indicator function, we deduce

$$\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J \lambda(T_j^{-1} A \cap B) = \lambda(A)\lambda(B).$$

From this it follows easily that for any $a > 1$ and Borel sets $A, B$ of $\mathbb T$, there exists $T \in E$ such that

$$\lambda(T^{-1} A \cap B) \le a \lambda(A) \lambda(B). \tag{12.2.6}$$

Thus Proposition 12.2.4 is just a consequence of Theorem 5.2.1.

Proposition 12.2.4 implies that a pair $(f, \mathcal N)$ is a convergence system only if the maximal operator $\sup_{N\ge 1} |S_N^{\mathcal N}(c, f)|$ belongs to $L^p(\mathbb T)$ with $p < 2$. This has a consequence concerning convergence in mean. Say by analogy that a pair $(f, \mathcal N)$ is an $L^p$-convergence system if for any $g \in L^2(\mathbb T)$ the sequence $\{S_N^{\mathcal N,f} g, N \ge 1\}$ converges in $L^p(\mathbb T)$.

12.2.5 Corollary. Assume that the pair $(\mathcal N, f)$ is a convergence system. Then, it is also an $L^p$-convergence system for any $p < 2$.

Proof. Define $\omega_R = \sup_{N,M\ge R} |S_N^{\mathcal N}(c, f) - S_M^{\mathcal N}(c, f)|$. By assumption $\lim_{R\to\infty} \omega_R = 0$ a.e., and by the above remark $\omega_1 \in L^p(\mathbb T)$, $p < 2$. Thus by Fatou's lemma,

$$0 = \mathbf E \limsup_{N,M\to\infty} |S_N^{\mathcal N}(c, f) - S_M^{\mathcal N}(c, f)|^p \ge \limsup_{N,M\to\infty} \mathbf E\, |S_N^{\mathcal N}(c, f) - S_M^{\mathcal N}(c, f)|^p.$$

The previous results summarize the basic equivalence of a.e. convergence results and maximal inequalities for $f(n_k x)$. In Theorem 12.2.16 at the end of this section we will in fact prove a maximal inequality that leads to various a.e. convergence results for $\sum_{k=1}^\infty c_k f(n_k x)$. Except for this result, however, our approach to a.e. convergence will be different and we will use a combination of martingale and quasi-orthogonality arguments to achieve our goal. Theorems 12.2.1 and 12.2.2 above show that the convergence properties of $\sum_{k=1}^\infty c_k f(n_k x)$ depend sensitively on the smoothness properties of $f$, and we start with a few preliminary remarks concerning smoothness criteria. Let $f \in L^2(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ have Fourier series

$$f \sim \sum_{k=1}^\infty (a_k \cos 2\pi k x + b_k \sin 2\pi k x) \tag{12.2.7}$$


and let

$$r_f(N) = \sum_{k=N}^\infty (a_k^2 + b_k^2). \tag{12.2.8}$$

Given an integer $m \ge 1$, let $[f]_m$ denote the function in $[0, 1)$ which takes the constant value $m \int_{k/m}^{(k+1)/m} f(t)\,dt$ in the interval $[k/m, (k+1)/m)$ $(k = 0, 1, \dots, m-1)$. In probabilistic terms, $[f]_m$ is the conditional expectation of $f$ with respect to the $\sigma$-field generated by the intervals $[k/m, (k+1)/m)$. Let

$$r_f^*(N) = \|f - [f]_N\|. \tag{12.2.9}$$
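As a concrete illustration of (12.2.9), the quantity $r_f^*(N)$ can be approximated numerically. The sketch below is an assumption-laden example, not part of the text: it takes the Lipschitz function $f(x) = x - 1/2$ on $[0,1)$, approximates cell averages and the $L^2$ norm by a midpoint rule, and uses hypothetical helper names. For this $f$ a direct computation gives $r_f^*(N) = 1/(N\sqrt{12})$, so $N \cdot r_f^*(N)$ should stay close to $1/\sqrt{12}$.

```python
import math


def block_average(f, m, samples_per_cell=200):
    """Values of [f]_m: the mean of f over each cell [k/m, (k+1)/m), midpoint rule."""
    avgs = []
    for k in range(m):
        pts = ((k + (i + 0.5) / samples_per_cell) / m for i in range(samples_per_cell))
        avgs.append(sum(f(t) for t in pts) / samples_per_cell)
    return avgs


def r_star(f, m, samples_per_cell=200):
    """Approximate L^2([0, 1)) norm of f - [f]_m, using the same midpoint grid."""
    avgs = block_average(f, m, samples_per_cell)
    total = 0.0
    for k in range(m):
        for i in range(samples_per_cell):
            t = (k + (i + 0.5) / samples_per_cell) / m
            total += (f(t) - avgs[k]) ** 2
    return math.sqrt(total / (m * samples_per_cell))


if __name__ == "__main__":
    f = lambda t: t - 0.5  # illustrative Lipschitz function with mean zero
    for m in (5, 10, 20, 40):
        print(m, r_star(f, m), 1.0 / (m * math.sqrt(12.0)))
```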

The speed of convergence of $r_f^*(N)$ to zero clearly measures the smoothness of $f$; for example if $f$ is a $\mathrm{Lip}(\alpha)$ function, then $r_f^*(N) = O(N^{-\alpha})$. A simple connection between $r_f(N)$ and $r_f^*(N)$ is given by the following lemma, due essentially to Ibragimov [1962]. Its proof will be given after the proofs of Theorems 12.2.7 and 12.2.2.

12.2.6 Lemma. Let $\lambda > 1$ and $g(t) = f(\lambda t)$. Then we have, for any $m \ge \lambda$,

$$\|g - [g]_m\| \le C\big((m/\lambda)^{-1/2} + r_f((m/\lambda)^{1/3})\big) \tag{12.2.10}$$

where $C$ is a positive constant depending only on $f$. In particular, for any $N \ge 1$ we have $r_f^*(N) \le C(N^{-1/2} + r_f(N^{1/3}))$. Thus if $r_f(N) = O(N^{-\alpha})$ for some $0 < \alpha \le 1$, then $r_f^*(N) = O(N^{-\alpha/3})$.

Turning to the convergence behavior of $\sum c_k f(n_k x)$, we first study the lacunary case, i.e., we assume that $\{n_k, k \ge 1\}$ grows very rapidly. If $\{n_k, k \ge 1\}$ satisfies the Hadamard gap condition (12.2.2), then by Theorem 12.2.2 the system $f(n_k x)$ is a convergence system under mild smoothness conditions on $f$. We investigate now the case when $\{n_k, k \ge 1\}$ grows with a sub-exponential speed, i.e., it satisfies the gap condition $n_{k+1}/n_k \ge 1 + \varepsilon_k$, $k \ge k_0$, where $\varepsilon_k$ tends to 0. A remarkable result on trigonometric series with sub-Hadamard gaps was proved by Erdös [1962], who showed that if $\{n_k, k \ge 1\}$ is a sequence of positive integers satisfying

$$n_{k+1}/n_k \ge 1 + c k^{-\beta}, \qquad k = 1, 2, \dots \tag{12.2.11}$$

for some $c > 0$, $\beta < 1/2$, then $\sin 2\pi n_k x$ satisfies the central limit theorem, i.e.,

$$\lim_{N\to\infty} \lambda\Big\{x \in (0, 1) : (N/2)^{-1/2} \sum_{k=1}^N \sin 2\pi n_k x \le t\Big\} = (2\pi)^{-1/2} \int_{-\infty}^t e^{-u^2/2}\,du.$$


Moreover, this result becomes false for $\beta = 1/2$. Thus, under (12.2.11) with $\beta < 1/2$ the sequence $\sin 2\pi n_k x$ behaves like a sequence of independent random variables, and this is no longer valid if $\beta = 1/2$. Our next theorem gives a strong convergence property of series $\sum_{k=1}^\infty c_k f(n_k x)$ under the Erdös gap condition (12.2.11). Define, for any $* > 0$,

$$\tau_{k,*}(c) = \sup_{L\ge k^{*+1}} \sum_{\ell=L}^{L+[k^{*}]} |c_\ell|.$$

12.2.7 Theorem. Let $f \in L^\infty(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ and $r_f(N) = O(N^{-\alpha})$ for some $\alpha > 0$. Let $\{n_k, k \ge 1\}$ be a sequence of positive integers satisfying the gap condition (12.2.11) with some $\beta < 1/2$, and let $c \in \ell^2$ with $\tau_{k,*}(c) = o(1)$ $(k \to \infty)$ for all $0 < * < 1$. Assume that $\sum_{k=1}^\infty c_k f(n_k x)$ and all of its subseries converge in $L^2(\mathbb T)$ norm. Then $\sum_{k=1}^\infty c_k f(n_k x)$ also converges a.e.

It seems likely that Theorem 12.2.7 remains valid without the technical condition $\tau_{k,*}(c) = o(1)$, but this remains open. This condition is certainly satisfied if $c_k = O(k^{-1/2})$ which, in turn, holds if $c \in \ell^2$ and $\{c_k, k \ge 1\}$ is monotone. Note that if $X_k$ are independent random variables, then under suitable moment conditions, mean convergence of $\sum_{k=1}^\infty X_k$ implies a.e. convergence of the same series. Theorem 12.2.7 establishes a similar property for $\sum_{k=1}^\infty c_k f(n_k x)$. Note that the central limit theorem is in general not valid for $f(n_k x)$ under the gap condition (12.2.11) with $\beta < 1/2$, despite Erdös' theorem mentioned above (see Kac [1949: 645]).

12.2.8 Corollary. Let $f \in L^\infty(\mathbb T)$ with $\int_{\mathbb T} f(t)\,dt = 0$ and $r_f(N) = O(N^{-\alpha})$ for some $\alpha > 0$. Assume that the Dirichlet series

$$\sum_{n=1}^\infty a_n n^{-s} \quad\text{and}\quad \sum_{n=1}^\infty b_n n^{-s} \tag{12.2.12}$$

are regular and bounded in the half-plane $\Re(s) > 0$. Let $\{n_k, k \ge 1\}$ be a sequence of positive integers satisfying the gap condition (12.2.11) with some $\beta < 1/2$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges a.e. provided $c \in \ell^2$ and $c_k = O(k^{-1/2})$.

Corollary 12.2.8 connects the a.e. convergence of lacunary series $\sum_{k=1}^\infty c_k f(n_k x)$ to the classical Wintner theory, showing that the boundedness of the associated Dirichlet series (12.2.12) implies not only mean, but actually a.e. convergence in the lacunary case. We will show that this result is best possible: if the boundedness condition on the Dirichlet series (12.2.12) is not satisfied, there exists a sequence $\{n_k, k \ge 1\}$ satisfying (12.2.11) for all $\beta < 1/2$, and a positive nonincreasing sequence $c \in \ell^2$ such that $\sum_{k=1}^\infty c_k f(n_k x)$ diverges almost everywhere. On the other hand, if we are interested in the a.e. convergence of $\sum_{k=1}^\infty c_k f(n_k x)$ under more stringent coefficient conditions $\sum_{k=1}^\infty c_k^2\, \omega(k) < \infty$, $\omega(k) \to \infty$, then the condition on the Dirichlet series can be dropped, as the following result shows.


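12.2.9 Lemma. Let $f \in \mathrm{Lip}_\alpha(\mathbb T)$ for some $0 < \alpha \le 1$ and assume that $\int_{\mathbb T} f(t)\,dt = 0$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and put

$$\omega(j) := \max\Big(\sum_{1\le \ell\le j} \Big(\frac{n_\ell}{n_j}\Big)^\alpha,\ \sum_{k\ge j} \Big(\frac{n_j}{n_k}\Big)^\alpha\Big).$$

Then

$$\int_{\mathbb T} \Big(\sum_{k=1}^N c_k f(n_k x)\Big)^2 dx \le C \sum_{k=1}^N c_k^2\, \omega(k)$$

with some constant $C$. In particular, if $\sum_{k=1}^\infty c_k^2\, \omega(k) < \infty$, then $\sum_{k=1}^\infty c_k f(n_k x)$ converges in $L^2$ norm.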
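To get a feeling for the weight $\omega(j)$ of Lemma 12.2.9, it can be evaluated on finite stretches of a sequence. The sketch below is illustrative only: the function name `omega` is hypothetical, indices are 1-based as in the text, and the infinite tail sum over $k \ge j$ is truncated at the end of the supplied list, which is an approximation. For a Hadamard lacunary sequence such as $n_k = 2^k$ both sums are geometric, so $\omega(j)$ stays bounded.

```python
def omega(n_seq, j, alpha):
    """omega(j) for a finite increasing sequence n_seq (1-based index j).

    head: sum over l <= j of (n_l / n_j)**alpha
    tail: sum over k >= j of (n_j / n_k)**alpha, truncated at len(n_seq)
    """
    nj = n_seq[j - 1]
    head = sum((nl / nj) ** alpha for nl in n_seq[:j])
    tail = sum((nj / nk) ** alpha for nk in n_seq[j - 1:])
    return max(head, tail)


if __name__ == "__main__":
    hadamard = [2 ** k for k in range(1, 41)]   # n_k = 2^k
    linear = list(range(1, 2001))               # n_k = k, no gap condition
    for j in (5, 10, 20):
        print(j, omega(hadamard, j, 1.0), omega(linear, j, 1.0))
```

For $n_k = k$ the truncated tail sum keeps growing with the length of the list, reflecting that $\omega(j)$ is infinite for $\alpha \le 1$ in that case; the lemma is informative only for sequences with enough growth.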

In particular, if $n_k = [\exp(k/(\log k)^\tau)]$, then $\omega(j) = (\log j)^\rho$ and in the case $n_k = [\exp(k^\eta)]$, $0 < \eta < 1$, then $\omega(j) = j^{1-\eta}$. We supplement Theorem 12.2.7 with another result reducing the almost everywhere convergence of $\sum_{k=1}^N c_k f(n_k x)$ to mean convergence under an additional assumption on the size of the tail sums $\sum_{k>N} c_k^2$, or, alternatively, under assuming $\sum_{k=1}^\infty c_k^2\, \omega(k) < \infty$ for a suitable $\omega(k) \to \infty$.

12.2.10 Theorem. Let $f \in \mathrm{Lip}_\alpha(\mathbb T)$ for some $0 < \alpha \le 1$ and assume that $\int_{\mathbb T} f(t)\,dt = 0$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and $c \in \ell^2$. Assume that $\sum_{k=1}^\infty c_k f(n_k x)$ converges in $L^2$ norm and

$$\lim_{R\to\infty} \Big(\sum_{k>R} c_k^2\Big)^{1/2} \Big(\sum_{k>R} n_k^{-2}\Big)^{1/2} \Big(\sum_{k\le R} n_k^\alpha\Big)^{1/\alpha} = 0. \tag{12.2.13}$$

Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges almost everywhere.

If the sequence $\{n_k, k \ge 1\}$ satisfies the Hadamard gap condition (12.2.2), relation (12.2.13) trivially holds whenever $c \in \ell^2$. If, on the other hand, $\{n_k, k \ge 1\}$ grows slower than exponentially, condition (12.2.13) imposes a restriction on the tail sums $\sum_{k>R} c_k^2$, which is very mild if $\{n_k, k \ge 1\}$ grows near exponentially. For example, if $n_k = [e^{k/(\log k)^\tau}]$ for some $\tau > 0$, then (12.2.13) reduces to

$$\sum_{k>R} c_k^2 = O\big((\log R)^{-\tau(1+2/\alpha)}\big).$$

If $n_k = [e^{k/(\log\log k)^\tau}]$, then (12.2.13) becomes

$$\sum_{k>R} c_k^2 = O\big((\log\log R)^{-\tau(1+2/\alpha)}\big),$$

and if $n_k = [e^{k^\gamma}]$, $0 < \gamma < 1$, then we get

$$\sum_{k>R} c_k^2 = O\big(R^{-(1-\gamma)(1+2/\alpha)}\big).$$


The latter case corresponds to the Erdös gap condition (12.2.11), and thus we see that the conditions of Theorem 12.2.10 are more restrictive than those of Theorem 12.2.7. On the other hand, in Theorem 12.2.10 we do not assume regularity conditions like $c_k = O(k^{-1/2})$.

Proof of Theorem 12.2.10. We follow Kac [1943]. For almost all points $t_0$,

$$f^*(t_0) = \lim_{h\to 0} \frac{1}{h} \int_{t_0}^{t_0+h} f^*(u)\,du. \tag{12.2.14}$$

Now, since $\sum_{k\ge 1} c_k f(n_k\,\cdot)$ converges in mean to $f^*$, by Parseval's relation,

$$\int_{t_0}^{t_0+h} f^*(u)\,du = \sum_{k\ge 1} c_k \int_{t_0}^{t_0+h} f(n_k u)\,du. \tag{12.2.15}$$

We shall use the following estimate: there exists a constant $C$ such that for any $0 \le a < b < 1$ and any positive integer $k$,

$$\Big|\int_a^b f(n_k u)\,du\Big| \le C n_k^{-1}. \tag{12.2.16}$$

Let $\chi$ be the characteristic function of the interval $[a, b]$, extended with period 1 onto the whole real line. Suppose that

$$\chi(x) = \sum_{m\in\mathbb Z} a_m e_m(x).$$

By Parseval's relation,

$$\int_a^b f(n_k u)\,du = \sum_{m\in\mathbb Z} \varphi_m a_{n_k m}.$$

We have $a_m = O(1/|m|)$ (see (12.2.39)); and thus we get

$$\Big|\int_a^b f(n_k u)\,du\Big| \le \Big(\sum_{m\in\mathbb Z} |\varphi_m|^2\Big)^{1/2} \Big(\sum_{m\in\mathbb Z} |a_{n_k m}|^2\Big)^{1/2} \le C \|f\|_2 / n_k.$$

Combining (12.2.15) with (12.2.16) gives

$$\Big|\int_{t_0}^{t_0+h} f^*(u)\,du - \sum_{k=1}^R c_k \int_{t_0}^{t_0+h} f(n_k u)\,du\Big| \le C \Big(\sum_{k>R} c_k^2\Big)^{1/2} \Big(\sum_{k>R} \Big(\frac{1}{n_k}\Big)^2\Big)^{1/2}.$$

Since $f$ belongs to $\mathrm{Lip}_\alpha(\mathbb T)$,

$$\Big|\sum_{k=1}^R c_k \int_{t_0}^{t_0+h} \big[f(n_k u) - f(n_k t_0)\big]\,du\Big| \le C |h|^{1+\alpha} \sum_{k=1}^R |c_k| n_k^\alpha.$$


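Therefore

$$\Big|\frac{1}{h} \int_{t_0}^{t_0+h} f^*(u)\,du - \sum_{k=1}^R c_k f(n_k t_0)\Big| \le C \Big\{ |h|^{-1} \Big(\sum_{k>R} c_k^2\Big)^{1/2} \Big(\sum_{k>R} \Big(\frac{1}{n_k}\Big)^2\Big)^{1/2} + |h|^\alpha \sum_{k=1}^R |c_k| n_k^\alpha \Big\}.$$

Choosing $h = h_R = \big(\sum_{k=1}^R n_k^\alpha\big)^{-1/\alpha}$ and observing that

$$|h_R|^\alpha \sum_{k=1}^R |c_k| n_k^\alpha = \frac{\sum_{k=1}^R |c_k| n_k^\alpha}{\sum_{k=1}^R n_k^\alpha} \to 0,$$

as $R$ tends to infinity since $c_k$ tends to 0 as $k$ tends to infinity, finally shows in view of condition (12.2.13),

$$\lim_{R\to\infty} \Big|\frac{1}{h_R} \int_{t_0}^{t_0+h_R} f^*(u)\,du - \sum_{k=1}^R c_k f(n_k t_0)\Big| = 0.$$

The proof is completed by combining the above result with (12.2.14).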

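Proof of Lemma 12.2.9. From $f \in \mathrm{Lip}_\alpha(\mathbb T)$ it follows (see Zygmund [1959: 324]) that

$$\sum_{\ell=n+1}^\infty (a_\ell^2 + b_\ell^2) \le D n^{-2\alpha}.$$

Let $j \le k$ be fixed positive integers. Using Parseval's relation yields

$$\int_{\mathbb T} f(n_j x) f(n_k x)\,dx = \sum_{r n_j = s n_k} (a_r a_s + b_r b_s).$$

The relation $j \le k$ together with $r n_j = s n_k$ implies that $s \ge 1$ and $r \ge n_k/n_j$. Using the inequality $|a_r a_s + b_r b_s| \le (a_r^2 + b_r^2)^{1/2} (a_s^2 + b_s^2)^{1/2}$ and the Cauchy–Schwarz inequality we get

$$\Big|\int_{\mathbb T} f(n_j x) f(n_k x)\,dx\Big| \le \Big(\sum_{r\ge n_k/n_j} (a_r^2 + b_r^2)\Big)^{1/2} \Big(\sum_{s\ge 1} (a_s^2 + b_s^2)\Big)^{1/2} \le B \Big(\frac{n_j}{n_k}\Big)^\alpha.$$

Thus

$$\Big|\int_{\mathbb T} \sum_{1\le j<k\le N} c_j c_k f(n_j x) f(n_k x)\,dx\Big| \le B \sum_{1\le j<k\le N} |c_j||c_k| \Big(\frac{n_j}{n_k}\Big)^\alpha \le B \sum_{1\le j<k\le N} \frac{|c_j|^2 + |c_k|^2}{2} \Big(\frac{n_j}{n_k}\Big)^\alpha \le \frac{B}{2} \sum_{k=1}^N c_k^2 \sum_{j<k} \Big(\frac{n_j}{n_k}\Big)^\alpha + \frac{B}{2} \sum_{j=1}^N c_j^2 \sum_{k>j} \Big(\frac{n_j}{n_k}\Big)^\alpha \le B \sum_{k=1}^N c_k^2\, \omega(k),$$

proving Lemma 12.2.9.

Proof of Theorem 12.2.7. Without loss of generality, we can assume $\alpha \le 1$. As a first step, we approximate the functions $f(n_k x)$ by step functions $\varphi_k(x)$ as follows. Let $2^\ell \le n_k < 2^{\ell+1}$, put $m = \ell + [60\alpha^{-1} \log k]$ and let $\varphi_k(x) = [f(n_k x)]_{2^m}$. By Lemma 12.2.6 we have

$$\|f(n_k x) - \varphi_k(x)\| \le C \Big(\frac{2^m}{n_k}\Big)^{-\alpha/3} \le C 2^{-20\log k} \le C k^{-10}. \tag{12.2.17}$$

Choose $*$ so that $\frac12 \vee \frac{\beta}{1-\beta} < * < 1$ (such a $*$ exists since $\beta < 1/2$) and split the sequence of positive integers into consecutive blocks $\Delta_1, \Delta_1', \Delta_2, \Delta_2', \dots$ so that $|\Delta_k| = |\Delta_k'| = [k^{*}]$. Set

$$T_k = \sum_{\nu\in\Delta_k} c_\nu f(n_\nu x), \qquad D_k = \sum_{\nu\in\Delta_k} c_\nu \varphi_\nu(x).$$

Clearly, each integer in $\Delta_k$ exceeds $(k-1)^{*} \ge (k-1)^{1/2}$ and thus by (12.2.17)

$$\|T_k - D_k\| \le C \sum_{\nu=(k-1)^{1/2}}^\infty \nu^{-10} \le C k^{-4}. \tag{12.2.18}$$

Next we show

12.2.11 Lemma. We have

$$\mathbf P\big\{|\mathbf E(D_k \mid \mathcal F_{k-1})| \ge k^{-2}\big\} \le C k^{-2}, \tag{12.2.19}$$

where $\mathcal F_{k-1}$ denotes the $\sigma$-field generated by $D_1, \dots, D_{k-1}$.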


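Proof. We first show that

$$|\mathbf E(T_k \mid \mathcal F_{k-1})| \le C k^{-2}. \tag{12.2.20}$$

To see this, let $r$ and $t$ denote the largest integer of $\Delta_{k-1}$ and the smallest integer of $\Delta_k$, respectively. Let $2^\ell \le n_r < 2^{\ell+1}$, $w = \ell + [\frac{120}{\alpha} \log r]$. From the definition of $\varphi_\nu$ it is clear that every $\varphi_\nu$, $1 \le \nu \le r$, takes a constant value on each interval of the form

$$A = [i 2^{-w}, (i+1) 2^{-w}), \qquad 0 \le i \le 2^w - 1 \tag{12.2.21}$$

and thus each set of the $\sigma$-field $\mathcal F_{k-1}$ can be written as a union of intervals of the form (12.2.21). Thus to prove (12.2.20) it suffices to show that

$$\Big| |A|^{-1} \int_A T_k\,dx \Big| \le C k^{-2} \tag{12.2.22}$$

for any $A$ of the form (12.2.21). Now for the set $A$ in (12.2.21) we have

$$|A|^{-1} \int_A T_k\,dx = 2^w \int_{i 2^{-w}}^{(i+1) 2^{-w}} \sum_{\nu\in\Delta_k} c_\nu f(n_\nu x)\,dx = \sum_{\nu\in\Delta_k} c_\nu \int_i^{i+1} f(m_\nu t)\,dt \tag{12.2.23}$$

where $m_\nu = 2^{-w} n_\nu$. Using (12.2.11), $1 + x \ge \exp(x/2)$ for $0 \le x \le 1$ and the relations $r \sim t \sim C k^{*+1}$, $t - r \sim k^{*}$ we get

$$\frac{1}{m_t} = \frac{2^w}{n_t} \le \frac{2^\ell r^{60/\alpha}}{n_t} \le r^{60/\alpha}\, \frac{n_r}{n_t} \le r^{60/\alpha} \prod_{\nu=r}^{t-1} \Big(1 + \frac{1}{\nu^\beta}\Big)^{-1} \le r^{60/\alpha} \Big(1 + \frac{1}{t^\beta}\Big)^{-(t-r)} \le r^{60/\alpha} \exp\Big(-\frac{t-r}{2t^\beta}\Big) \le C k^{120/\alpha} \exp(-C k^\tau) \le C k^{-3}, \tag{12.2.24}$$

where $\tau = * - (* + 1)\beta > 0$ by the choice of $*$. By the periodicity of $f$ and $\int_0^1 f\,dx = 0$ we clearly have for any real $L$ and $\lambda \ge 1$,

$$\Big|\int_L^{L+1} f(\lambda x)\,dx\Big| \le \frac{2}{\lambda} \int_0^1 |f(x)|\,dx$$

and thus (12.2.24) shows that the last expression of (12.2.23) cannot exceed

$$\sum_{\nu\in\Delta_k} \frac{C |c_\nu|}{m_\nu} \le C |\Delta_k| \frac{1}{m_t} \le C k^{-2}.$$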


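Hence we proved (12.2.22) and thus (12.2.20). It is now easy to complete the proof of Lemma 12.2.11. By (12.2.18) and well-known properties of conditional expectations we have

$$\big\|\mathbf E(|D_k - T_k|^2 \mid \mathcal F_{k-1})\big\|_1 = \mathbf E |D_k - T_k|^2 \le C k^{-8}$$

and thus by the Tchebycheff inequality

$$\mathbf P\big\{\mathbf E(|D_k - T_k| \mid \mathcal F_{k-1}) \ge k^{-2}\big\} \le \mathbf P\big\{\mathbf E(|D_k - T_k|^2 \mid \mathcal F_{k-1}) \ge k^{-4}\big\} \le C k^{-4}.$$

Together with (12.2.20) this yields (12.2.19).

Set $\widetilde D_k = D_k - \mathbf E(D_k \mid \mathcal F_{k-1})$; clearly $(\widetilde D_k, \mathcal F_k)$ is a martingale difference sequence and hence orthogonal. Also,

$$\|\mathbf E(D_k \mid \mathcal F_{k-1})\| \le \|\mathbf E((D_k - T_k) \mid \mathcal F_{k-1})\| + \|\mathbf E(T_k \mid \mathcal F_{k-1})\| \le \|D_k - T_k\| + C k^{-2} \le C(k^{-4} + k^{-2}) \tag{12.2.25}$$

by (12.2.18) and (12.2.20). By the assumptions of Theorem 12.2.7, $\sum_{k=1}^\infty T_k$ converges in $L^2(\mathbb T)$ norm and thus

$$\Big\|\sum_{k=m}^n T_k\Big\| \to 0 \quad\text{as } m, n \to \infty.$$

Consequently, using the orthogonality of $\widetilde D_k$, (12.2.18) and (12.2.25) we get

$$\Big(\sum_{k=m}^n \mathbf E \widetilde D_k^2\Big)^{1/2} = \Big\|\sum_{k=m}^n \widetilde D_k\Big\| \le \Big\|\sum_{k=m}^n D_k\Big\| + \Big\|\sum_{k=m}^n \mathbf E(D_k \mid \mathcal F_{k-1})\Big\| \le \Big\|\sum_{k=m}^n D_k\Big\| + C \sum_{k=m}^n k^{-2} \le \Big\|\sum_{k=m}^n T_k\Big\| + C \sum_{k=m}^n k^{-2} \to 0 \tag{12.2.26}$$

as $m, n \to \infty$. Thus $\sum_{k=1}^\infty \mathbf E \widetilde D_k^2 < \infty$ and thus the martingale convergence theorem implies that $\sum_k \widetilde D_k$ is a.e. convergent. Now $\sum_k \mathbf E(D_k \mid \mathcal F_{k-1})$ is a.e. convergent by Lemma 12.2.11 and the Borel–Cantelli lemma; further, $\sum_k (T_k - D_k)$ is a.e. convergent by (12.2.18) and the Beppo Levi theorem. Thus $\sum_k T_k$ is a.e. convergent; for the same reason $\sum_k T_k'$ is also a.e. convergent, where

$$T_k' = \sum_{\nu\in\Delta_k'} c_\nu f(n_\nu x).$$

Hence setting

$$S_N = \sum_{\nu\le N} c_\nu f(n_\nu x), \qquad N_k = 2 \sum_{i\le k} [i^{*}]$$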


we proved that $S_{N_k}$ is a.e. convergent. To prove the theorem it remains to show that $M_k \to 0$ a.e. where

$$M_k = \max_{N_k \le N < N_{k+1}} |S_N - S_{N_k}|.$$

Let $D$ denote a constant such that $|f| \le D$. Then by using $N_k \sim C k^{*+1}$, $N_{k+1} - N_k \sim 2k^{*}$ and $\tau_{k,*}(c) = o(1)$ we get

$$M_k \le D \sum_{\nu=N_k+1}^{N_{k+1}} |c_\nu| \le C \tau_{k,*}(c) = o(1).$$

Hence Theorem 12.2.7 is proved.

Proof of Lemma 12.2.6. Let us write $f = f_1 + f_2$ where

$$f_1 = \sum_{k=1}^N (a_k \cos 2\pi k x + b_k \sin 2\pi k x), \qquad f_2 = f - f_1,$$

and $N$ is an integer to be specified later. If $g(x) = f(\lambda x)$, then we have $g = g_1 + g_2$, where $g_1(x) = f_1(\lambda x)$, $g_2(x) = f_2(\lambda x)$. Evidently

$$|\cos \beta x - [\cos \beta x]_m| \le \beta/m, \qquad |\sin \beta x - [\sin \beta x]_m| \le \beta/m$$

for any $\beta > 0$ and thus using

$$g_1(x) = \sum_{k=1}^N (a_k \cos 2\pi k \lambda x + b_k \sin 2\pi k \lambda x)$$

and the linearity of the operation $g \to [g]_m$ and the fact that $\|[g]_m\| \le \|g\|$ we get

$$|g_1 - [g_1]_m| \le \sum_{k=1}^N 2\pi k \lambda (|a_k| + |b_k|) m^{-1} \le 2\pi \lambda m^{-1} \Big(\sum_{k=1}^N k^2\Big)^{1/2} \Big[\Big(\sum_{k=1}^\infty a_k^2\Big)^{1/2} + \Big(\sum_{k=1}^\infty b_k^2\Big)^{1/2}\Big] \le C \lambda m^{-1} N^{3/2} \tag{12.2.27}$$

with some constant $C$ depending on $f$. Further, by the periodicity of $f$ and $f_1$ we have

$$\|g_2 - [g_2]_m\|^2 \le 4\|g_2\|^2 = 4 \int_0^1 f_2(\lambda x)^2\,dx = 4\lambda^{-1} \int_0^\lambda f_2(t)^2\,dt \le 4\lambda^{-1} \int_0^{[\lambda]+1} f_2(t)^2\,dt \le 4\lambda^{-1}([\lambda]+1) \int_0^1 f_2(t)^2\,dt \le 8 \|f - f_1\|^2 = 8 r(N). \tag{12.2.28}$$


Using relations (12.2.27)–(12.2.28) we get

$$\|g - [g]_m\| \le C(\lambda m^{-1} N^{3/2} + r(N)) \tag{12.2.29}$$

whence the statement of the lemma follows by choosing $N = [(m/\lambda)^{1/3}]$.

We turn now to the nonlacunary case, i.e., the case when no growth condition on $\{n_k, k \ge 1\}$ is assumed. As we already indicated, in this case the number-theoretic structure of the sequence $\{n_k, k \ge 1\}$ will play an important role in the convergence behavior of $\sum_{k=1}^\infty c_k f(n_k x)$. The notion of a quasi-orthogonal system is of particular relevance in the study of the convergence in mean and/or almost everywhere of series $\sum_k c_k f(n_k x)$. In this direction, we will establish the following general result. Here, and in the sequel, let $L(x) = \log(x \vee 1)$ for $x \in \mathbb R$.

12.2.12 Theorem. Let $f \in L^2(\mathbb T)$ with $\int_{\mathbb T} f(x)\,dx = 0$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and assume that there exists a sequence $\{C_k, k \ge 1\}$ of positive integers such that

$$\sum_{k=1}^\infty r_f^*(C_k)^2 < \infty \tag{12.2.30}$$

and

$$\sup_{h\ge 1}\, C_h \sum_{k>h} \frac{(n_h, n_k)}{n_k}\, L\Big(\frac{(n_h, n_k) C_k}{n_h}\Big) < \infty. \tag{12.2.31}$$

Then the series $\sum_{k=1}^\infty c_k f(n_k x)$ converges a.e. provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 < \infty$.

The following theorem describes what happens if condition (12.2.31) of Theorem 12.2.12 is not assumed.

12.2.13 Theorem. Let $f \in L^2(\mathbb T)$ with $\int_{\mathbb T} f(x)\,dx = 0$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of positive integers and assume that there exists a sequence $\{c_k, k \ge 1\}$ of positive integers and a positive nondecreasing sequence $(\lambda_k)$ such that $\lambda_{2k}/\lambda_k = O(1)$ and

$$\sum_{k=1}^\infty r_f^*(c_k)^2/\lambda_k < \infty, \tag{12.2.32}$$

$$\sup_{1\le h\le N}\, c_h \sum_{h<k\le N} \frac{(n_h, n_k)}{n_k}\, L\Big(\frac{(n_h, n_k) c_k}{n_h}\Big) \le \lambda_N. \tag{12.2.33}$$

Choosing the sequences $\{c_k, k \ge 1\}$ and $\{\lambda_k, k \ge 1\}$ optimally in Theorem 12.2.13 requires a "balancing" act, but giving up a little accuracy, such sequences are easy to find: first choose $\{c_k, k \ge 1\}$ so that (12.2.32) holds with $\lambda_k = 1$ and then choose $\lambda_k$ so that (12.2.33) holds. The following analogue of Theorem 12.2.12 is easier to formulate and prove, but still has useful applications.


12.2.14 Theorem. Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and with Fourier coefficients satisfying $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $\alpha > 1/2$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of integers and let $(\lambda_k)$ be a positive nondecreasing sequence such that $\lambda_{2k}/\lambda_k = O(1)$ and
$$\sup_{1 \le h \le N} \sum_{k=1}^N \langle n_h, n_k \rangle^{\alpha} \le \lambda_N.$$
Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges a.e. provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 \lambda_k < \infty$.

Before proving Theorems 12.2.12–12.2.14, we give some applications.

12.2.15 Corollary.
(i) Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and $r_f(n) = O(n^{-\alpha})$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of coprime integers such that $n_k \ge k^\beta$ with some $\beta > 1 + 1/(2\alpha)$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 < \infty$.
(ii) Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and with Fourier coefficients satisfying $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $\alpha > 1/2$. Let $\{n_k, k \ge 1\}$ be an increasing sequence of pairwise coprime integers such that $\sum_{k=1}^\infty n_k^{-\alpha} < \infty$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 < \infty$.
(iii) Let $f \in L^2(\mathbb{T})$ have Fourier coefficients $O(1/k)$ (for example, let $f \in BV(0,1)$) and let $\{n_k, k \ge 1\}$ be a sequence of integers such that for any $d \ge 1$ we have $\sum_{d \mid n_k} n_k^{-1} \le A/d$ with an absolute constant $A$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 \log n_k < \infty$.
(iv) Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and $r_f^*(n) = O(n^{-\alpha})$. Then the series $\sum_{k=1}^\infty c_k f(kx)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 k^\beta < \infty$ for some $\beta > 1/(1 + 2\alpha)$.
(v) Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and with Fourier coefficients satisfying $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $1/2 < \alpha < 1$. Then $\sum_{k=1}^\infty c_k f(kx)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 k^{1-\alpha} (\log k)^2 < \infty$.
(vi) Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(x)\,dx = 0$ and with Fourier coefficients satisfying $a_k = O(k^{-\alpha})$, $b_k = O(k^{-\alpha})$, $\alpha > 1/2$. Let $n_k = k^r$, where $r$ is an integer with $r > 1/\alpha$. Then $\sum_{k=1}^\infty c_k f(n_k x)$ converges almost everywhere provided $\sum_{k=1}^\infty c_k^2 (\log k)^2 < \infty$.

Proof of Theorem 12.2.14. We follow the proof of Theorem 12.1.3 with minor modifications, using the same notation. The assumption $\sum_{k=1}^\infty c_k^2 (\log k)^2 \lambda_k < \infty$ and the estimates in the second line of (12.1.18) with $\gamma = 2$ show that $\sum_{\nu=1}^\infty \nu^2 \int_{\mathbb{T}} Z_\nu^2\,dx < \infty$ and thus $\sum_{\nu=1}^\infty \nu^2 Z_\nu^2 < \infty$ almost everywhere. Hence (12.1.17) implies that $\sum_{k=2^m+1}^{2^n} c_k f(n_k t) \to 0$ almost everywhere as $m, n \to \infty$, and thus the partial sums $\sum_{k=1}^{2^N} c_k f(n_k x)$ converge a.e. Now (12.1.16) and the Rademacher–Menshov inequality (see Section 8.3) imply
$$\int_{\mathbb{T}} \max_{2^N+1 \le m \le 2^{N+1}} \Big| \sum_{k=2^N+1}^{m} c_k f(n_k t) \Big|^2 dt \le C_3\, \lambda_{2^{N+1}} \sum_{k=2^N+1}^{2^{N+1}} c_k^2 (\log 2^N)^2 \le C_4 \sum_{k=2^N+1}^{2^{N+1}} c_k^2 (\log k)^2 \lambda_k. \tag{12.2.34}$$
Summing these relations for $N = 1, 2, \ldots$ and using $\sum_{k=1}^\infty c_k^2 (\log k)^2 \lambda_k < \infty$, it follows that
$$\max_{2^N+1 \le m \le 2^{N+1}} \Big| \sum_{k=2^N+1}^{m} c_k f(n_k t) \Big|^2 \to 0 \quad \text{almost everywhere},$$

completing the proof of Theorem 12.2.14.

Proof of Theorem 12.2.13. Let $f_k = [f]_{C_k}(n_k\,\cdot)$. By the Cauchy–Schwarz inequality we get
$$\sum_{k=1}^\infty |c_k|\, \| f(n_k \cdot) - f_k(\cdot) \| = \sum_{k=1}^\infty |c_k|\, \| f(\cdot) - [f]_{C_k}(\cdot) \| = \sum_{k=1}^\infty |c_k|\, r_f^*(C_k) \le \Big( \sum_{k=1}^\infty c_k^2 \lambda_k \Big)^{1/2} \Big( \sum_{k=1}^\infty r_f^*(C_k)^2/\lambda_k \Big)^{1/2} < \infty$$
by the assumptions of Theorem 12.2.13. It follows that $\sum_{k=1}^\infty |c_k|\, |f(n_k t) - f_k(t)|$ converges a.e., and thus the series $\sum_{k=1}^\infty c_k f(n_k t)$ converges almost everywhere if and only if the series $\sum_{k=1}^\infty c_k f_k(t)$ does. The problem thus reduces to the study of the last series, and to do this, we will analyse the correlation properties of the functions $f_k$. Define, for any nonempty interval $\pi$ of $\mathbb{T}$,
$$f_\pi = \frac{1}{|\pi|} \int_\pi f(u)\,du. \tag{12.2.35}$$
Then
$$[f]_{C_n}(x) = \sum_{\pi \in \Pi_n} f_\pi\, \chi(\pi)(x), \tag{12.2.36}$$
where $\Pi_n$ denotes the partition of $[0,1)$ defined by the subdivision
$$[(j-1)/C_n,\ j/C_n), \qquad j = 1, \ldots, C_n.$$
Since $\int_{\mathbb{T}} f(t)\,dt = 0$, we have
$$[f]_{C_n}(x) = \sum_{\pi \in \Pi_n} f_\pi \big( \chi(\pi)(x) - |\pi| \big), \tag{12.2.37}$$
and consequently for $h \le k$ we get
$$\langle f_h, f_k \rangle = \sum_{\pi \in \Pi_h} \sum_{\pi' \in \Pi_k} f_\pi f_{\pi'} \big\langle \chi_\pi(\{n_h y\}) - |\pi|,\ \chi_{\pi'}(\{n_k y\}) - |\pi'| \big\rangle, \tag{12.2.38}$$

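where the indicators are extended with period 1. Thus the calculation reduces to estimating the correlation for indicators of intervals. Let $0 \le a < b < 1$. It is classical to expand the indicator function $\chi([a,b))(x)$ in a Fourier series, and one gets
$$\begin{aligned} \chi([a,b))(x) &= b - a + \sum_{n \in \mathbb{Z}^*} \frac{-1}{2i\pi n} \big( e^{-2i\pi nb} - e^{-2i\pi na} \big) e^{2i\pi nx} \\ &= b - a + \sum_{n=1}^\infty \frac{1}{\pi n} \Big[ \sin 2\pi nx\, (\cos 2\pi na - \cos 2\pi nb) + \cos 2\pi nx\, (\sin 2\pi nb - \sin 2\pi na) \Big] \end{aligned} \tag{12.2.39}$$
for almost all $x$. Now, let $0 \le a < b < c < d < 1$. Put $\varphi = \chi([a,b))$, $\psi = \chi([c,d))$, and $\bar\varphi = \varphi - (b-a)$, $\bar\psi = \psi - (d-c)$. We study, for given positive integers $h$ and $k$, the correlation of the functions $\bar\varphi_h = \bar\varphi(hx)$, $\bar\psi_k = \bar\psi(kx)$. Put, for $u, v \in \mathbb{T}$ and integer $n$, $\delta_n(u,v) = e^{-2i\pi nv} - e^{-2i\pi nu}$. Then
$$\bar\varphi(hx) = \sum_{n \in \mathbb{Z}^*} \frac{-1}{2i\pi n}\, e^{2i\pi nhx}\, \delta_n(a,b), \qquad \bar\psi(kx) = \sum_{m \in \mathbb{Z}^*} \frac{-1}{2i\pi m}\, e^{2i\pi mkx}\, \delta_m(c,d),$$
so that
$$\langle \bar\varphi_h, \bar\psi_k \rangle = \sum_{n \in \mathbb{Z}^*} \sum_{m \in \mathbb{Z}^*} \frac{1}{4\pi^2 mn}\, \delta_n(a,b)\, \delta_{-m}(c,d) \int_{\mathbb{T}} e^{2i\pi(nh - mk)x}\,dx = \sum_{\substack{m,n \in \mathbb{Z}^* \\ nh - mk = 0}} \frac{1}{4\pi^2 mn}\, \delta_n(a,b)\, \delta_{-m}(c,d).$$
The equation $nh - mk = 0$ has solutions given by $n = \mu k/(h,k)$ and $m = \mu h/(h,k)$, $\mu \in \mathbb{Z}^*$. For these, $mn = \mu^2 hk/(h,k)^2 = \mu^2/\langle h,k \rangle$ with $\langle h,k \rangle = (h,k)/[h,k]$. Pairing $\mu$ with $-\mu$, we obtain
$$\langle \bar\varphi_h, \bar\psi_k \rangle = \langle h,k \rangle \sum_{\mu=1}^\infty \frac{1}{4\pi^2\mu^2} \Big[ \delta_{\mu k/(h,k)}(a,b)\, \delta_{-\mu h/(h,k)}(c,d) + \delta_{-\mu k/(h,k)}(a,b)\, \delta_{\mu h/(h,k)}(c,d) \Big]. \tag{12.2.40}$$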
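The reduction to the diagonal $nh = mk$ can be checked on a small case. The following Python sketch is illustrative only: the function names and the sample values $h = 6$, $k = 10$ are not from the text. It enumerates the positive solutions by brute force, compares them with the parametrization $n = \mu k/(h,k)$, $m = \mu h/(h,k)$, and confirms that $1/(mn)$ equals $(h,k)^2/(hk)$ divided by $\mu^2$.

```python
from fractions import Fraction
from math import gcd

def brute_force_solutions(h, k, limit):
    """All pairs (n, m) with 1 <= n, m <= limit and n*h == m*k."""
    return [(n, m) for n in range(1, limit + 1)
            for m in range(1, limit + 1) if n * h == m * k]

def parametrized_solutions(h, k, limit):
    """Pairs (mu*k/g, mu*h/g), g = gcd(h, k), lying in the same box."""
    g = gcd(h, k)
    pairs, mu = [], 1
    while mu * k // g <= limit and mu * h // g <= limit:
        pairs.append((mu * k // g, mu * h // g))
        mu += 1
    return pairs

def prefactor_matches(h, k, mu_max):
    """Check 1/(m*n) == (gcd(h,k)^2 / (h*k)) / mu^2 along the solutions."""
    g = gcd(h, k)
    bracket = Fraction(g * g, h * k)
    return all(Fraction(1, (mu * k // g) * (mu * h // g)) == bracket / mu**2
               for mu in range(1, mu_max + 1))
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.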

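It remains to compute $\delta_n(a,b)\delta_{-m}(c,d) + \delta_{-n}(a,b)\delta_m(c,d)$. A plain calculation shows
$$\begin{aligned} \delta_n(a,b)\delta_{-m}(c,d) + \delta_{-n}(a,b)\delta_m(c,d) &= 2 \big[ \cos 2\pi(nb - md) - \cos 2\pi(nb - mc) - \cos 2\pi(na - md) + \cos 2\pi(na - mc) \big] \\ &= 4 \sin \pi m(d-c) \big[ \sin \pi(2nb - m(c+d)) - \sin \pi(2na - m(c+d)) \big] \\ &= 8 \sin \pi m(d-c)\, \sin \pi n(b-a)\, \cos \pi\big(n(a+b) - m(c+d)\big). \end{aligned}$$
Therefore, with $\langle h,k \rangle = (h,k)/[h,k]$,
$$\langle \bar\varphi_h, \bar\psi_k \rangle = \frac{2\langle h,k \rangle}{\pi^2} \sum_{\mu=1}^\infty \frac{1}{\mu^2} \sin\Big( \pi \frac{\mu h(d-c)}{(h,k)} \Big) \sin\Big( \pi \frac{\mu k(b-a)}{(h,k)} \Big) \cos\Big( \pi \frac{\mu (k(a+b) - h(c+d))}{(h,k)} \Big). \tag{12.2.41}$$
It follows that
$$\big| \langle \bar\varphi_h, \bar\psi_k \rangle \big| \le \frac{2\langle h,k \rangle}{\pi^2} \sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big| \sin \pi \frac{\mu h(d-c)}{(h,k)} \Big| \Big| \sin \pi \frac{\mu k(b-a)}{(h,k)} \Big| \le \frac{2}{\pi} \min\Big( \sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big),\ \sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big( \frac{\mu k(b-a)}{[h,k]} \wedge \langle h,k \rangle \Big) \Big). \tag{12.2.42}$$
Now, if $\frac{(h,k)}{h(d-c)} > 1$, then
$$\sum_{\mu \le \frac{(h,k)}{h(d-c)}} \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big) \le \frac{h(d-c)}{[h,k]} \sum_{\mu \le \frac{(h,k)}{h(d-c)}} \frac{1}{\mu} \le C\, \frac{h(d-c)}{[h,k]} \log \frac{(h,k)}{h(d-c)},$$
and
$$\sum_{\mu > \frac{(h,k)}{h(d-c)}} \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big) \le \langle h,k \rangle \sum_{\mu > \frac{(h,k)}{h(d-c)}} \frac{1}{\mu^2} \le C \langle h,k \rangle \frac{h(d-c)}{(h,k)} = C\, \frac{h(d-c)}{[h,k]}.$$
Thus
$$\sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big) \le C\, \frac{h(d-c)}{[h,k]} \log \frac{(h,k)}{h(d-c)}. \tag{12.2.43a}$$
If $\frac{(h,k)}{h(d-c)} \le 1$, then $1 \le \frac{h(d-c)}{(h,k)}$ and
$$\sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big) \le C \langle h,k \rangle \le C \langle h,k \rangle \frac{h(d-c)}{(h,k)} = C\, \frac{h(d-c)}{[h,k]}. \tag{12.2.43b}$$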
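The trigonometric step between (12.2.40) and (12.2.41) is easy to get wrong by a factor of 2 in the arguments, so a numerical check is worthwhile. The sketch below (function names and sample points are illustrative, not from the text) compares $\delta_n(a,b)\delta_{-m}(c,d) + \delta_{-n}(a,b)\delta_m(c,d)$ with the half-angle product $8 \sin \pi m(d-c)\, \sin \pi n(b-a)\, \cos \pi(n(a+b) - m(c+d))$, which is what a direct application of the sum-to-product formulas gives.

```python
import cmath
import math

def delta(n, u, v):
    """delta_n(u, v) = exp(-2*pi*i*n*v) - exp(-2*pi*i*n*u)."""
    return cmath.exp(-2j * math.pi * n * v) - cmath.exp(-2j * math.pi * n * u)

def lhs(n, m, a, b, c, d):
    """delta_n(a,b)*delta_{-m}(c,d) + delta_{-n}(a,b)*delta_m(c,d)."""
    return delta(n, a, b) * delta(-m, c, d) + delta(-n, a, b) * delta(m, c, d)

def rhs(n, m, a, b, c, d):
    """Half-angle product form of the same quantity."""
    return (8 * math.sin(math.pi * m * (d - c))
              * math.sin(math.pi * n * (b - a))
              * math.cos(math.pi * (n * (a + b) - m * (c + d))))

def max_discrepancy():
    """Largest |lhs - rhs| over a small grid of frequencies and intervals."""
    samples = [(0.1, 0.25, 0.4, 0.9), (0.0, 0.5, 0.0, 0.5), (0.05, 0.3, 0.31, 0.77)]
    return max(abs(lhs(n, m, *s) - rhs(n, m, *s))
               for n in range(1, 6) for m in range(1, 6) for s in samples)
```

For $n = m = 1$ and $[a,b) = [c,d) = [0, 1/2)$ both sides equal 8, a convenient hand check.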

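In both cases we get
$$\sum_{\mu=1}^\infty \frac{1}{\mu^2} \Big( \frac{\mu h(d-c)}{[h,k]} \wedge \langle h,k \rangle \Big) \le C\, \frac{h(d-c)}{[h,k]}\, L\Big( \frac{(h,k)}{h(d-c)} \Big). \tag{12.2.44}$$
Therefore,
$$\big| \langle \bar\varphi_h, \bar\psi_k \rangle \big| \le C \min\Big( \frac{h(d-c)}{[h,k]} L\Big( \frac{(h,k)}{h(d-c)} \Big),\ \frac{k(b-a)}{[h,k]} L\Big( \frac{(h,k)}{k(b-a)} \Big),\ \langle h,k \rangle \Big). \tag{12.2.45}$$
Return now to (12.2.38). We deduce from (12.2.45) that
$$\big| \big\langle \chi_\pi(\{n_h y\}) - |\pi|,\ \chi_{\pi'}(\{n_k y\}) - |\pi'| \big\rangle \big| \le C\, \frac{n_h |\pi'|}{[n_h, n_k]}\, L\Big( \frac{(n_h, n_k)}{n_h |\pi'|} \Big),$$
so that, since $|\pi'| = 1/C_k$ for $\pi' \in \Pi_k$,
$$\begin{aligned} \big| \langle f_h, f_k \rangle \big| &\le C \sum_{\pi \in \Pi_h} \sum_{\pi' \in \Pi_k} |f_\pi|\, |f_{\pi'}|\, |\pi'|\, \frac{n_h}{[n_h, n_k]}\, L\Big( \frac{(n_h, n_k)}{n_h |\pi'|} \Big) \\ &\le C\, \frac{n_h}{[n_h, n_k]}\, L\Big( \frac{(n_h, n_k) C_k}{n_h} \Big) \sum_{\pi \in \Pi_h} |f_\pi| \sum_{\pi' \in \Pi_k} \Big| \int_{\pi'} f(u)\,du \Big| \\ &\le C \|f\|_1\, \frac{n_h}{[n_h, n_k]}\, L\Big( \frac{(n_h, n_k) C_k}{n_h} \Big) \sum_{\pi \in \Pi_h} \frac{|\pi|}{|\pi|}\, |f_\pi| \\ &\le C \|f\|_1^2\, \frac{n_h C_h}{[n_h, n_k]}\, L\Big( \frac{(n_h, n_k) C_k}{n_h} \Big). \end{aligned} \tag{12.2.46}$$
Therefore, for $h \le k$,
$$\big| \langle f_h, f_k \rangle \big| \le C \|f\|_1^2\, \frac{(n_h, n_k) C_h}{n_k}\, L\Big( \frac{(n_h, n_k) C_k}{n_h} \Big). \tag{12.2.47}$$
Thus using (12.2.33) we get
$$\int_{\mathbb{T}} \Big| \sum_{k=1}^N c_k f_k \Big|^2 dx = \sum_{h,k=1}^N \langle f_h, f_k \rangle\, c_h c_k \le \frac{1}{2} \sum_{h,k=1}^N \big| \langle f_h, f_k \rangle \big|\, (c_h^2 + c_k^2) \le C \lambda_N \sum_{k=1}^N c_k^2,$$
which corresponds to relation (12.1.16) in the proof of Theorem 12.1.3. The argument is now completed by following the proof of Theorem 12.2.14.

Proof of Corollary 12.2.15. (i) and (ii). Since $\beta > 1 + 1/(2\alpha)$, we can choose $\gamma > 0$ such that $2\alpha\gamma > 1$ and $\beta > \gamma + 1$. Let $C_k = k^\gamma$; then (12.2.30) is trivially satisfied, and since the $n_k$ are coprime the expression in (12.2.31) is at most
$$\sup_{h \ge 1} \sum_{k > h} C_h\, \frac{1}{n_k}\, L(C_k) \le \sup_{h \ge 1} K h^\gamma \sum_{k > h} k^{-\beta} \log k = \sup_{h \ge 1} O\big( h^{\gamma - \beta + 1} \log h \big) = O(1)$$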

for some constant $K$. Hence (i) follows from Theorem 12.2.12. A similar calculation shows that (ii) follows from Theorem 12.2.14.

(iii). This is immediate from Theorem 12.2.14 and estimate (12.1.22).

(iv). Let $C_n = n^\gamma$ where $\gamma$ will be determined later. Observe that
$$\sum_{k=h}^n C_h\, \frac{(h,k)}{k}\, L\Big( \frac{(h,k) C_k}{h} \Big) \le \log C_n \sum_{k=h}^n \frac{(h,k) C_h}{k}. \tag{12.2.48}$$
Fix a $d \mid h$ and compute the last sum in (12.2.48) for those $h \le k \le n$ such that $(h,k) = d$. This restricted sum clearly cannot exceed
$$C_h \sum_{1 \le k \le n,\ d \mid k} \frac{d}{k} \le C_n \sum_{l=1}^{[n/d]} \frac{1}{l} \le C_n \log n. \tag{12.2.49}$$
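The divisor-splitting argument behind (12.2.48) and (12.2.49) can be tested numerically. The sketch below is an illustration under the following reading of the argument: grouping the terms of $\sum_{k=h}^n (h,k)/k$ by $d = (h,k)$ bounds the sum by $\tau(h) H_n$, where $\tau(h)$ is the number of divisors of $h$ and $H_n$ is the $n$-th harmonic number (which plays the role of $\log n$ in the text). All function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def gcd_sum(h, n):
    """Exact value of sum_{k=h}^{n} gcd(h, k) / k."""
    return sum(Fraction(gcd(h, k), k) for k in range(h, n + 1))

def divisor_count(h):
    """Number of positive divisors of h (naive count)."""
    return sum(1 for d in range(1, h + 1) if h % d == 0)

def harmonic(n):
    """Exact harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, l) for l in range(1, n + 1))

def bound_holds(h, n):
    """Check sum_{k=h}^{n} gcd(h,k)/k <= tau(h) * H_n."""
    return gcd_sum(h, n) <= divisor_count(h) * harmonic(n)
```

The check uses exact fractions, so a passing result is not an artifact of rounding.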

Now summing for all $d \mid h$, we have to multiply the result by the number of divisors of $h$, which is known to be at most $A(\varepsilon) h^\varepsilon \le A(\varepsilon) n^\varepsilon$, and thus the first sum in (12.2.48) is at most $A(\varepsilon)\, n^\varepsilon\, C_n \log C_n \log n = O(n^{\gamma + 2\varepsilon})$. Thus choosing $\lambda_n = n^{\gamma + 2\varepsilon}$, condition (12.2.33) of Theorem 12.2.13 is satisfied. Now $r_f^*(C_k) = O(C_k^{-\alpha}) = O(k^{-\gamma\alpha})$ and thus (12.2.32) will hold if $\gamma = 1/(1 + 2\alpha)$. As $\varepsilon$ can be chosen arbitrarily small, (iv) follows from Theorem 12.2.13.

(v). This is an immediate consequence of Theorem 12.2.14 and the last statement of Lemma 12.1.7.

(vi). Let $n_k = k^r$ for some integer $r \ge 2$. Clearly $\langle n_k, n_l \rangle = \langle k, l \rangle^r$ and thus (vi) follows from Theorem 12.2.14 and the first statement of Lemma 12.1.7.

Proof of Theorem 12.2.12. This is a special case of the proof of Theorem 12.2.13 for $\lambda_n = O(1)$.
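The power rule used in (vi) can be verified directly. The sketch below assumes that the bracket of two positive integers denotes $(a,b)^2/(ab) = (a,b)/[a,b]$, which is the normalization forced by the prefactor in (12.2.40); this reading, and the function names, are assumptions made for illustration.

```python
from fractions import Fraction
from math import gcd

def bracket(a, b):
    """gcd(a, b)^2 / (a*b), equivalently gcd(a, b) / lcm(a, b)."""
    g = gcd(a, b)
    return Fraction(g * g, a * b)

def power_rule_holds(limit, r):
    """Check bracket(k^r, l^r) == bracket(k, l)^r for 1 <= k, l <= limit."""
    return all(bracket(k**r, l**r) == bracket(k, l)**r
               for k in range(1, limit + 1)
               for l in range(1, limit + 1))
```

The identity reduces to $\gcd(k^r, l^r) = \gcd(k,l)^r$, which follows from unique factorization.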

To conclude this chapter, we prove a maximal inequality providing a further way to prove a.e. convergence results for $\sum_{k=1}^\infty c_k f(n_k x)$.

12.2.16 Theorem. Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\,dt = 0$ and put
$$S_N(x) = \sum_{k \le N} c_k f(kx).$$
Then for an arbitrary sequence $(m_k)$ of positive integers we have
$$\int_0^1 \max_{M \le N} |S_M(x)|\,dx \le \sum_{k \le N} |c_k|\, r_f(m_k) + A \sum_{l=1}^{m_N} (|a_l| + |b_l|) \Big( \sum_{k=d_l}^N c_k^2 \Big)^{1/2}, \tag{12.2.50}$$
where $d_l = \inf\{k : m_k \ge l\}$ is the inverse function of $m_k$ and $A$ is an absolute constant.

If the Fourier series of $f$ is absolutely convergent, i.e., $\sum_{l=1}^\infty (|a_l| + |b_l|) < \infty$, then choosing $m_k$ so large that $\sum_{k=1}^\infty r_f(m_k) < \infty$, the right-hand side of (12.2.50) is at most $C \big( \sum_{k=1}^N c_k^2 \big)^{1/2}$, and thus the statement reduces to
$$\int_0^1 \max_{M \le N} |S_M(x)|\,dx \le C \Big( \sum_{k=1}^N c_k^2 \Big)^{1/2}, \tag{12.2.51}$$
which is an extension of Hunt's inequality ([Hunt: 1968]). (Actually, the proof of Theorem 12.2.16 uses Hunt's inequality.) In particular, it follows that if the Fourier series of $f$ is absolutely convergent (for example, if $f$ belongs to the Lip 1/2 class), then $\sum_{k=1}^\infty c_k f(kx)$ converges a.e. provided $c \in \ell^2$. This result is due to Gaposhkin [1968]. In contrast to Theorem 12.2.12, Theorem 12.2.16 loses the number-theoretic connection, but in the case $n_k = k$ it leads, despite the simplicity of its proof, to sharper results than the quasi-orthogonality method of Theorem 12.2.12, as the applications below will show.

Proof of Theorem 12.2.16. For simplicity we assume that the Fourier expansion of $f$ is a pure cosine series (i.e., $b_l = 0$); the general case can be treated similarly. Write $f = f_k + g_k$ where
$$f_k(x) = \sum_{l=1}^{m_k} a_l \cos 2\pi lx, \qquad g_k(x) = \sum_{l=m_k+1}^{\infty} a_l \cos 2\pi lx;$$
then $S_N(x) = T_N^{(1)} + T_N^{(2)}$ where
$$T_N^{(1)} = \sum_{k \le N} c_k f_k(kx), \qquad T_N^{(2)} = \sum_{k \le N} c_k g_k(kx).$$
Clearly
$$|T_N^{(2)}| \le \sum_{k \le N} |c_k|\, |g_k(kx)|$$
and thus
$$\max_{M \le N} |T_M^{(2)}| \le \sum_{k \le N} |c_k|\, |g_k(kx)|.$$
Hence
$$\int_0^1 \max_{M \le N} |T_M^{(2)}|\,dx \le \sum_{k \le N} |c_k|\, \| g_k(kx) \|_1 \le \sum_{k \le N} |c_k|\, r_f(m_k). \tag{12.2.52}$$

On the other hand,
$$|T_N^{(1)}| = \Big| \sum_{k \le N} c_k \sum_{l=1}^{m_k} a_l \cos 2\pi klx \Big| = \Big| \sum_{l=1}^{m_N} a_l \sum_{k=d_l}^{N} c_k \cos 2\pi klx \Big| \le \sum_{l=1}^{m_N} |a_l| \Big| \sum_{k=d_l}^{N} c_k \cos 2\pi klx \Big|.$$
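The interchange of summations above relies on the inverse function $d_l = \inf\{k : m_k \ge l\}$, and it is valid when $(m_k)$ is nondecreasing, since then $l \le m_k$ exactly when $k \ge d_l$. The following sketch (illustrative names and data, with $(m_k)$ assumed nondecreasing) evaluates both orders of summation and compares them.

```python
import math
from bisect import bisect_left

def inverse_index(m, l):
    """d_l = min{k >= 1 : m_k >= l} for a nondecreasing list m = [m_1, ..., m_N]."""
    return bisect_left(m, l) + 1  # +1 because k is 1-based

def double_sum_by_k(c, a, m, x):
    """sum_{k<=N} c_k * sum_{l<=m_k} a_l * cos(2*pi*k*l*x)."""
    return sum(c[k - 1] * sum(a[l - 1] * math.cos(2 * math.pi * k * l * x)
                              for l in range(1, m[k - 1] + 1))
               for k in range(1, len(c) + 1))

def double_sum_by_l(c, a, m, x):
    """sum_{l<=m_N} a_l * sum_{k=d_l}^{N} c_k * cos(2*pi*k*l*x)."""
    N = len(c)
    return sum(a[l - 1] * sum(c[k - 1] * math.cos(2 * math.pi * k * l * x)
                              for k in range(inverse_index(m, l), N + 1))
               for l in range(1, m[-1] + 1))
```

For example, with `m = [1, 1, 3, 4, 6]` the index `inverse_index(m, 2)` is 3, because $m_3 = 3$ is the first term that is at least 2.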

Thus
$$\max_{M \le N} |T_M^{(1)}| \le \sum_{l=1}^{m_N} |a_l| \max_{M \le N} \Big| \sum_{k=d_l}^{M} c_k \cos 2\pi klx \Big|$$
and thus using Hunt's inequality we get
$$\int_0^1 \max_{M \le N} |T_M^{(1)}|\,dx \le A \sum_{l=1}^{m_N} |a_l| \Big( \sum_{k=d_l}^{N} c_k^2 \Big)^{1/2}, \tag{12.2.53}$$
where $A$ is an absolute constant. The theorem now follows from (12.2.52) and (12.2.53).

We now give some corollaries of Theorem 12.2.16.

12.2.17 Corollary. Let $f \in BV(0,1)$. Then $\sum_{k=1}^\infty c_k f(kx)$ converges a.e. provided
$$\sum_{k=1}^\infty c_k^2 (\log k)^\beta < \infty \quad \text{for some } \beta > 2. \tag{12.2.54}$$

12.2.18 Corollary. Let $f \in \mathrm{Lip}_\alpha(\mathbb{T})$ for some $0 < \alpha < 1/2$ and let $\int_{\mathbb{T}} f(t)\,dt = 0$. Then $\sum_{k=1}^\infty c_k f(kx)$ converges a.e. provided
$$\sum_{k=1}^\infty c_k^2\, k^{1-2\alpha} (\log k)^\beta < \infty \quad \text{for some } \beta > 1 + 2\alpha. \tag{12.2.55}$$

12.2.19 Corollary. Let $f \in \mathrm{Lip}_{1/2}(\mathbb{T})$ and let $\int_{\mathbb{T}} f(t)\,dt = 0$. Then $\sum_{k=1}^\infty c_k f(kx)$ converges a.e. provided
$$\sum_{k=1}^\infty c_k^2 (\log k)^\beta < \infty \quad \text{for some } \beta > 2. \tag{12.2.56}$$

Corollary 12.2.18 was proved earlier by Gaposhkin [1966a], while Corollary 12.2.19 improves Theorem 3 of Gaposhkin [1966a]. Note that in the case $f \in \mathrm{Lip}_\alpha(\mathbb{T})$ the convergence condition is much stronger for $0 < \alpha < 1/2$. It is possible that in the case $0 < \alpha < 1/2$ a condition
$$\sum_{k=1}^\infty c_k (\log k)^\gamma < \infty \tag{12.2.57}$$
suffices for the a.e. convergence of $\sum_{k=1}^\infty c_k f(kx)$, but this remains open. On the other hand, Theorem 3 of Berkes [1997] shows that for any $0 < \alpha < 1/2$ there exist $f \in \mathrm{Lip}_\alpha(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\,dt = 0$ and a real sequence $\{c_k, k \ge 1\}$ such that (12.2.72) holds for any $\gamma < 1 - 2\alpha$, but $\sum_{k=1}^\infty c_k f(kx)$ is a.e. divergent.

To prove the corollaries, assume first that $f \in \mathrm{Lip}_\alpha(\mathbb{T})$ with some $0 < \alpha \le 1/2$. (As we noted above, in the case $\alpha > 1/2$ the series $\sum_{k=1}^\infty c_k f(kx)$ converges a.e. for any $\{c_k, k \ge 1\} \in \ell^2$ by Gaposhkin's theorem, so there is no convergence problem.) The Fourier coefficients of $f$ satisfy (see Zygmund [1959: 241])
$$\sum_{k=2^n+1}^{2^{n+1}} (a_k^2 + b_k^2) \le C\, 2^{-2n\alpha},$$
whence it follows immediately that
$$\sum_{k=n}^\infty (a_k^2 + b_k^2) \le C n^{-2\alpha} \tag{12.2.58}$$
and
$$\sum_{k=1}^\infty (|a_k| + |b_k|)\, k^{\alpha - 1/2} (\log k)^{-\gamma} < \infty \quad \text{for any } \gamma > 1. \tag{12.2.59}$$

The cases $0 < \alpha < 1/2$ and $\alpha = 1/2$ are treated differently, so we separate them.

(A) In the case $\alpha = 1/2$ we note that $r_f(n) = O(n^{-1/2})$ by (12.2.58), and thus by (12.2.56) and the Cauchy–Schwarz inequality the first term on the right-hand side of (12.2.50) is bounded by
$$C \sum_{k \le N} |c_k| \frac{1}{\sqrt{m_k}} = C \sum_{k \le N} |c_k| (\log k)^{\beta/2} \frac{1}{\sqrt{m_k} (\log k)^{\beta/2}} \le C \Big( \sum_{k \le N} \frac{1}{m_k (\log k)^\beta} \Big)^{1/2},$$
which remains bounded if $m_k = k (\log k)^{1 + \varepsilon - \beta}$, $\varepsilon > 0$. Then $d_l \sim l (\log l)^{-(1 + \varepsilon - \beta)}$, and since by (12.2.56) we have $\sum_{k \ge N} c_k^2 \le C (\log N)^{-\beta}$, the second term on the right-hand side of (12.2.50) is bounded by
$$C \sum_{l=1}^{m_N} (|a_l| + |b_l|) (\log d_l)^{-\beta/2},$$
which remains bounded by (12.2.59), since $\log d_l \sim \log l$ and $\beta > 2$.

Observe that if $f$ is of bounded variation, then its Fourier coefficients satisfy $|a_k| = O(k^{-1})$, $|b_k| = O(k^{-1})$, and thus relations (12.2.58), (12.2.59) are satisfied with $\alpha = 1/2$. Hence the above proof also shows the validity of Corollary 12.2.17.

(B) In the case $0 < \alpha < 1/2$ we now choose $m_k = k (\log k)^\tau$ with $\tau$ to be determined later; then $d_l \sim l (\log l)^{-\tau}$. By (12.2.58) we have $r_f(n) = O(n^{-\alpha})$, and thus, setting $\psi(k) = k^{1-2\alpha} (\log k)^\beta$, (12.2.55) and the Cauchy–Schwarz inequality show that the first term on the right-hand side of (12.2.50) is bounded by
$$C \sum_{k \le N} |c_k| \frac{1}{m_k^\alpha} = C \sum_{k \le N} |c_k| \psi(k)^{1/2} \frac{1}{m_k^\alpha \psi(k)^{1/2}} \le C \Big( \sum_{k \le N} \frac{1}{m_k^{2\alpha} \psi(k)} \Big)^{1/2},$$
which remains bounded, in view of the definitions of $m_k$ and $\psi(k)$, if $\beta + 2\alpha\tau > 1$. On the other hand, $\sum_{k=1}^\infty c_k^2 \psi(k) < \infty$ implies $\sum_{k \ge N} c_k^2 \le C \psi(N)^{-1}$, and thus the second term on the right-hand side of (12.2.50) is bounded by
$$C \sum_{l=1}^{m_N} (|a_l| + |b_l|)\, \psi(d_l)^{-1/2}. \tag{12.2.60}$$
Substituting the values of $\psi(k)$ and $d_l$ and using (12.2.59), we see that the sum in (12.2.60) remains bounded if $\beta - (1 - 2\alpha)\tau > 2$. We have thus proved that if the sum in (12.2.55) converges and $m_k = k (\log k)^\tau$, then the left-hand side of (12.2.50) remains bounded if
$$\beta > \max\big( 2 + (1 - 2\alpha)\tau,\ 1 - 2\alpha\tau \big). \tag{12.2.61}$$
The right-hand side of (12.2.61) reaches its minimum for $\tau = -1$ with minimal value $1 + 2\alpha$, completing the proof.
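The optimization of $\tau$ in (12.2.61) can be confirmed numerically. For $0 < \alpha < 1/2$ the first term in the maximum increases with $\tau$ and the second decreases, so the minimum sits at their crossing point. The sketch below (a simple grid search with hypothetical function names; the grid range and resolution are arbitrary choices) locates the minimizer.

```python
def beta_threshold(alpha, tau):
    """max(2 + (1 - 2*alpha)*tau, 1 - 2*alpha*tau), the bound in (12.2.61)."""
    return max(2 + (1 - 2 * alpha) * tau, 1 - 2 * alpha * tau)

def minimize_over_tau(alpha, lo=-5.0, hi=5.0, steps=10001):
    """Grid search for the tau minimizing beta_threshold; returns (tau, value)."""
    best_tau, best_val = lo, beta_threshold(alpha, lo)
    for i in range(steps):
        tau = lo + (hi - lo) * i / (steps - 1)
        val = beta_threshold(alpha, tau)
        if val < best_val:
            best_tau, best_val = tau, val
    return best_tau, best_val
```

Setting the two terms equal gives $2 + (1 - 2\alpha)\tau = 1 - 2\alpha\tau$, i.e. $\tau = -1$, and both terms then equal $1 + 2\alpha$, in agreement with the grid search.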

12.3 Almost sure convergence – necessary conditions

Let $f \in L^2(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\,dt = 0$ and Fourier expansion
$$f \sim \sum_{k=1}^\infty (a_k \cos 2\pi kx + b_k \sin 2\pi kx).$$

Recall that by Wintner's theorem (Theorem 12.1.1), the series $\sum_n c_n f(nx)$ converges in the mean for all $(c_n) \in \ell^2$ iff
$$\sum_n a_n/n^s \ \text{ and } \ \sum_n b_n/n^s \ \text{ are regular and bounded for } \Re s > 0. \tag{12.3.1}$$
We showed that (12.3.1) also implies the a.e. convergence of $\sum_{k=1}^\infty c_k f(n_k x)$ provided $\{n_k, k \ge 1\}$ satisfies the Erdős gap condition (12.2.11) with $\beta < 1/2$. The following result describes the situation when (12.3.1) fails.

12.3.1 Theorem. Let $f \in \mathrm{Lip}_\alpha(\mathbb{T})$, $\int_{\mathbb{T}} f(t)\,dt = 0$ and assume that (12.3.1) is not valid. Then for any $\varepsilon_k \downarrow 0$ there exist $c \in \ell^2$ and a sequence $\mathcal{N} = \{n_k, k \ge 1\}$ of positive integers satisfying $n_{k+1}/n_k \ge 1 + \varepsilon_k$ $(k \ge k_0)$ such that the series $\sum_k c_k f(n_k x)$ is a.e. divergent.

The result is sharp: if $\{n_k, k \ge 1\}$ grows exponentially (i.e., $n_{k+1}/n_k \ge q > 1$) then $\sum_k c_k f(n_k x)$ converges a.e. for any $c \in \ell^2$ by Kac's theorem (see 12.2.2). We note that the theorem remains valid, with minor modifications in the proof, if instead of $f \in \mathrm{Lip}_\alpha(\mathbb{T})$ we assume only $f \in L^2(\mathbb{T})$. However, as the positive result concerns the Lipschitz case, we will prove the converse also for that case. For the proof we need two simple lemmas.

12.3.2 Lemma. If (12.3.1) fails, then for any $N \ge 1$ there exist real numbers $a_j^{(N)}$, $j = 1, \ldots, N$, such that
$$\int_0^1 \Big| \sum_{j=1}^N a_j^{(N)} f(jx) \Big|^2 dx \ge L(N) \sum_{j=1}^N \big( a_j^{(N)} \big)^2,$$
where $L(N) \to \infty$.

Proof. This is obvious, since by Wintner's theorem relation (12.3.1) is equivalent to the existence of a constant $C > 0$ such that for any $N \ge 1$ and any real sequence $(a_j)$ we have
$$\int_0^1 \Big| \sum_{j=1}^N a_j f(jx) \Big|^2 dx \le C \sum_{j=1}^N a_j^2.$$

Now, given $f \in \mathrm{Lip}_\alpha(\mathbb{T})$, choose the integer $B$ so large that $(B - 1)\alpha \ge 10$. Then we have

12.3.3 Lemma. Let $1 \le p_1 < q_1 < p_2 < q_2 < \cdots$ be integers such that $p_{k+1} \ge B q_k$. Let $I_1, I_2, \ldots$ be sets of integers such that $I_k \subset [2^{p_k}, 2^{q_k}]$ and each element of $I_k$ is divisible by $2^{p_k}$. Let $b_j^{(k)}$, $j \in I_k$, be arbitrary coefficients with $|b_j^{(k)}| \le 1$ and set
$$X_k = X_k(\omega) = \sum_{j \in I_k} b_j^{(k)} f(j\omega) \qquad (k = 1, 2, \ldots,\ \omega \in \mathbb{T}).$$
Then there exist independent random variables $Y_1, Y_2, \ldots$ on the probability space $(\mathbb{T}, \mathcal{B}, \lambda)$ such that $\mathbf{E}\, Y_k = 0$ and $|X_k - Y_k| \le 2^{-k}$ $(k \ge k_0)$.

Proof. Let $\mathcal{F}_k$ denote the $\sigma$-field generated by the dyadic intervals
$$U_\nu = \big[ \nu 2^{-Bq_k},\ (\nu + 1) 2^{-Bq_k} \big), \qquad 0 \le \nu < 2^{Bq_k}, \tag{12.3.2}$$
and set
$$\xi_j = \xi_j(\cdot) = \mathbf{E}\big( f(j\,\cdot) \mid \mathcal{F}_k \big), \quad j \in I_k, \qquad Y_k = Y_k(\omega) = \sum_{j \in I_k} b_j^{(k)} \xi_j(\omega).$$
By $|f(x) - f(y)| \le C |x - y|^\alpha$ we have
$$|\xi_j(\omega) - f(j\omega)| \le C_1 2^{-(B-1) q_k \alpha} \le C_1 2^{-10 q_k}, \qquad j \in I_k,$$
and since $I_k$ has at most $2^{q_k}$ elements, we get
$$|X_k - Y_k| \le C_1 \cdot 2^{q_k}\, 2^{-10 q_k} \le 2^{-k} \qquad \text{for } k \ge k_0.$$
Since $p_{k+1} \ge B q_k$ and since each $j \in I_{k+1}$ is a multiple of $2^{p_{k+1}}$, each interval $U_\nu$ in (12.3.2) is a period interval for all $f(jx)$, $j \in I_{k+1}$, and thus also for $\xi_j$, $j \in I_{k+1}$. Hence $Y_{k+1}$ is independent of the $\sigma$-field $\mathcal{F}_k$, and since $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots$ and $Y_k$ is $\mathcal{F}_k$-measurable, the random variables $Y_1, Y_2, \ldots$ are independent. Finally $\mathbf{E}\, \xi_j = 0$ by $\int_{\mathbb{T}} f\,dx = 0$ and thus $\mathbf{E}\, Y_k = 0$.

Turning to the proof of Theorem 12.3.1, let $\psi(k)$ grow so rapidly that $L(\psi(k)) \ge 2^k$ and let $(r_k)$ be a nondecreasing sequence of integers to be chosen later. We define sets
$$I_1^{(1)}, I_2^{(1)}, \ldots, I_{r_1}^{(1)},\ I_1^{(2)}, \ldots, I_{r_2}^{(2)},\ \ldots,\ I_1^{(k)}, \ldots, I_{r_k}^{(k)},\ \ldots \tag{12.3.3}$$
of positive integers by
$$I_j^{(k)} = 2^{c_j^{(k)}} \{1, 2, \ldots, \psi(k)\}, \qquad 1 \le j \le r_k,\ k \ge 1,$$
where $c_j^{(k)}$ are suitable positive integers. (Here for any set $\{a, b, \ldots\} \subset \mathbb{R}$ and $\lambda \in \mathbb{R}$, $\lambda\{a, b, \ldots\}$ denotes the set $\{\lambda a, \lambda b, \ldots\}$.) Clearly we can choose the integers $c_j^{(k)}$ inductively so that the sets in (12.3.3) satisfy the conditions of Lemma 12.3.3. By Lemma 12.3.2 there exist, for any $k \ge 1$, coefficients $\{a_\nu^{(k)}, 1 \le \nu \le \psi(k)\}$ with $\sum_{\nu=1}^{\psi(k)} \big( a_\nu^{(k)} \big)^2 = 1$ such that, setting
$$X^{(k)} = X^{(k)}(\omega) = \sum_{\nu=1}^{\psi(k)} a_\nu^{(k)} f(\nu\omega),$$
we have
$$\mathbf{E} \big( X^{(k)} \big)^2 \ge L(\psi(k)).$$
Let
$$X_j^{(k)}(\omega) = X^{(k)}\big( 2^{c_j^{(k)}} \omega \big), \qquad 1 \le j \le r_k.$$

Clearly the $X_j^{(k)}$ have the same distribution, and consequently
$$\mathbf{E} \big( X_j^{(k)} \big)^2 \ge L(\psi(k)).$$
By Lemma 12.3.3 there exist independent random variables $Y_j^{(k)}$ $(1 \le j \le r_k,\ k = 1, 2, \ldots)$ such that $\mathbf{E}\, Y_j^{(k)} = 0$ and
$$\sum_{k,j} \big| X_j^{(k)} - Y_j^{(k)} \big| \le K \tag{12.3.4}$$
for some constant $K > 0$. Hence by the Minkowski inequality,
$$\mathbf{E} \big( Y_j^{(k)} \big)^2 \ge \frac{1}{2} L(\psi(k)) \tag{12.3.5}$$
for $k \ge k_0$. Also $|Y_j^{(k)}| \le |X_j^{(k)}| + K \le \text{constant} \cdot \psi(k)$, and thus, setting
$$Z_k = \frac{1}{(r_k L(\psi(k)))^{1/2}} \sum_{j=1}^{r_k} Y_j^{(k)}, \qquad \sigma_k^2 = \mathbf{E} \Big( \sum_{j=1}^{r_k} Y_j^{(k)} \Big)^2 \ge \frac{1}{2} r_k L(\psi(k)),$$
we get from the central limit theorem with Berry–Esseen remainder term,
$$\mathbf{P}\{ Z_k \ge 1 \} \ge \mathbf{P}\Big\{ \sum_{j=1}^{r_k} Y_j^{(k)} \ge 2\sigma_k \Big\} \ge (1 - \Phi(2)) - C\, \frac{r_k\, \psi(k)^3}{(r_k L(\psi(k)))^{3/2}} \ge 1 - \Phi(2) - o(1) \ge 0.02 \qquad (k \ge k_0),$$
provided $r_k$ grows so rapidly that $r_k^{1/2} L(\psi(k))^{3/2} \ge \psi(k)^4$. Since the $Z_k$ are independent, the Borel–Cantelli lemma implies $\mathbf{P}\{ Z_k \ge 1 \text{ infinitely often} \} = 1$, i.e., $\sum_{k \ge 1} Z_k$ is a.e. divergent, which, in view of (12.3.4), yields that
$$\sum_{k=1}^\infty \frac{1}{(r_k L(\psi(k)))^{1/2}} \sum_{j=1}^{r_k} X_j^{(k)} \tag{12.3.6}$$
is a.e. divergent. Let now
$$\mathcal{N} := \bigcup_{k=1}^\infty \bigcup_{j=1}^{r_k} I_j^{(k)}. \tag{12.3.7}$$
Then the sum in (12.3.6) is of the form $\sum_{i=1}^\infty c_i f(n_i x)$ where
$$\sum_{i=1}^\infty c_i^2 = \sum_{k=1}^\infty \frac{r_k}{r_k L(\psi(k))} = \sum_{k=1}^\infty \frac{1}{L(\psi(k))} < +\infty.$$

Finally, denote by $1 + \rho_k$ the smallest of the ratios $(j+1)/j$, $1 \le j \le \psi(k) - 1$; clearly $\rho_k > 0$. Given $\varepsilon_k \downarrow 0$ one can choose $r_k$ growing so rapidly that
$$\rho_k \ge \varepsilon_{r_{k-1}}, \qquad k = 1, 2, \ldots. \tag{12.3.8}$$
Now if $n_s$ and $n_{s+1}$ belong to the same set $I_j^{(k)}$, then clearly $s \ge r_{k-1}$, and thus by (12.3.8) we get $n_{s+1}/n_s \ge 1 + \rho_k \ge 1 + \varepsilon_{r_{k-1}} \ge 1 + \varepsilon_s$. Since $n_{s+1}/n_s \ge 2$ if $n_s$ and $n_{s+1}$ belong to different $I_j^{(k)}$'s, we proved that $\{n_k, k \ge 1\}$ satisfies
$$n_{k+1}/n_k \ge 1 + \varepsilon_k \qquad (k \ge k_0). \tag{12.3.9}$$
This completes the proof of Theorem 12.3.1.

There are few results concerning the bounded case, namely the case when in the series $\sum_k c_k f(n_k x)$, $f$ is not smooth but only bounded. We first consider the case of primes and prove the following result.

12.3.4 Theorem. Let $\mathcal{P} := (P_k)$ be an increasing sequence of prime numbers. Let $c = \{c_k, k \ge 1\}$ be a sequence of positive reals such that
$$\sum_k c_k^2 < \infty, \qquad \sum_k c_k = \infty. \tag{12.3.10}$$
Then there exists a function $f \in L^\infty(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\,dt = 0$ such that the series $\sum_{k=1}^\infty c_k f(P_k x)$ diverges on a set with positive measure.

Theorem 12.3.4 will be deduced from the following

12.3.5 Theorem. Let $\mathcal{P} := (P_k)$ be an increasing sequence of prime numbers. Let $c = \{c_k, k \ge 1\}$ be a sequence of positive reals such that
$$\sum_k c_k^2 < \infty, \qquad \sum_k c_k = \infty. \tag{12.3.11}$$
Put $C_n = \sum_{k \le n} c_k$ and consider the weighted sums
$$S_n f = \frac{1}{C_n} \sum_{k \le n} c_k f(P_k x). \tag{12.3.12}$$
Then there exists a function $f \in L^\infty(\mathbb{T})$ with $\int_{\mathbb{T}} f(t)\,dt = 0$ such that the sequence $\{S_n f, n \ge 1\}$ diverges on a set with positive measure.

Proof of Theorem 12.3.4. Assuming that Theorem 12.3.5 is valid, there exists a bounded measurable function $f$ such that $(S_n f)_n$ does not converge almost everywhere. Then the partial sums $\sum_{k \le n} c_k f(P_k x)$ do not converge almost everywhere either. Otherwise, in view of the assumption that the series $\sum_k c_k$ diverges, $(S_n f(x))_n$ would tend to 0 almost everywhere, a contradiction. Hence the result.
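The last step of this deduction uses an elementary fact: if $\sum_k c_k y_k$ converges and $C_n = \sum_{k \le n} c_k \to \infty$, then the weighted averages $C_n^{-1} \sum_{k \le n} c_k y_k$ tend to 0. The sketch below illustrates this with the hypothetical choices $c_k = 1/k$ and $y_k = (-1)^k$, which are not from the text.

```python
def weighted_average(c, y, n):
    """(1 / C_n) * sum_{k<=n} c(k) * y(k), with C_n = sum_{k<=n} c(k)."""
    numerator = sum(c(k) * y(k) for k in range(1, n + 1))
    C_n = sum(c(k) for k in range(1, n + 1))
    return numerator / C_n

def example(n):
    """Weighted average for c_k = 1/k (divergent sum) and y_k = (-1)^k."""
    return weighted_average(lambda k: 1.0 / k, lambda k: (-1.0) ** k, n)
```

Here the numerator converges to $-\log 2$ while $C_n$ grows like $\log n$, so the averages decay slowly but surely.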

To prove Theorem 12.3.5, we use Bourgain's entropy criterion in $L^\infty$ (Corollary 6.1.8) and Lemma 6.1.5.

Proof of Theorem 12.3.5. Let $\{T_N, N \ge 1\}$ be integers such that $T_N - T_{N-1}$ increases to infinity with $N$. Define
$$\Pi_N = \Big\{ u = P_{T_{N-1}+1}^{\alpha_{T_{N-1}+1}} \cdots P_{T_N}^{\alpha_{T_N}} : \alpha_i \in \{0, 1\} \text{ and } (\alpha_{T_{N-1}+1}, \ldots, \alpha_{T_N}) \ne (0, \ldots, 0) \Big\}, \qquad f_N = \frac{1}{\big( 2^{T_N - T_{N-1}} - 1 \big)^{1/2}} \sum_{u \in \Pi_N} e_u. \tag{12.3.13}$$
Let $T_{N-1} < R \le T_N$. Then,
$$\langle S_R(f_N), f_N \rangle = \frac{1}{C_R}\, \frac{1}{2^{T_N - T_{N-1}} - 1} \sum_{u \in \Pi_N} \sum_{v \in \Pi_N} \sum_{k \le R} c_k \langle e_{uP_k}, e_v \rangle.$$
Let $u, v \in \Pi_N$ and $k \le R$. Then $\langle e_{uP_k}, e_v \rangle = 1$ if and only if $uP_k = v$. Writing $u = P_{T_{N-1}+1}^{\alpha_{T_{N-1}+1}} \cdots P_{T_N}^{\alpha_{T_N}}$, $v = P_{T_{N-1}+1}^{\beta_{T_{N-1}+1}} \cdots P_{T_N}^{\beta_{T_N}}$, this means that
$$P_k\, P_{T_{N-1}+1}^{\alpha_{T_{N-1}+1}} \cdots P_{T_N}^{\alpha_{T_N}} = P_{T_{N-1}+1}^{\beta_{T_{N-1}+1}} \cdots P_{T_N}^{\beta_{T_N}}.$$
This equation has solutions if and only if $k$ belongs to the interval $]T_{N-1}, T_N]$, and then the solutions are given by
$$\alpha_k = 0, \qquad \beta_k = 1, \qquad \alpha_j = \beta_j \text{ otherwise}.$$
Hence,
$$\langle f_N(P_k\,\cdot), f_N \rangle = \frac{2^{T_N - T_{N-1} - 1} - 1}{2^{T_N - T_{N-1}} - 1} \ge \frac{1}{4}. \tag{12.3.14}$$
Consequently, for any integer $N \ge 1$ and any $T_{N-1} < R \le T_N$,
$$\langle S_R(f_N), f_N \rangle = \frac{1}{C_R} \sum_{k \le R} c_k \langle f_N(P_k\,\cdot), f_N \rangle = \frac{1}{C_R} \sum_{\substack{k \le R \\ k \in ]T_{N-1}, T_N]}} c_k \langle f_N(P_k\,\cdot), f_N \rangle \ge \frac{1}{4}. \tag{12.3.15}$$
The proof is achieved by applying Lemma 6.1.5 and the entropy criterion in $L^\infty$.

The next two theorems will concern subsequences $\mathcal{N}$ generated by infinitely many primes.

12.3.6 Theorem. Let $P = \{P_1, P_2, \ldots\}$ be an increasing sequence of positive pairwise coprime integers, and denote by $C(P)$ the infinite-dimensional chain generated by $P$. Let $c = \{c_k, k \ge 1\}$ be a sequence of positive reals such that the series $\sum_{k=1}^\infty c_k$ diverges. Define for any measurable function $f : \mathbb{T} \to \mathbb{R}$ the weighted sums
$$S_n f(x) = \frac{1}{\sum_{j \in C(P) \cap [1,n]} c_j} \sum_{j \in C(P) \cap [1,n]} c_j f(jx).$$

Assume that
$$\limsup_{i \to \infty} \frac{\sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i},\, P_1^{2i}]} c_j}{\sum_{j \in C(P) \cap [1,\, P_1^{2i}]} c_j} > 0. \tag{12.3.16}$$
Then there exists a bounded measurable function $f$ such that $(S_n f)_n$ does not converge almost everywhere.

From Theorem 12.3.6 one can obtain

12.3.7 Theorem. Let $P = \{P_1, P_2, \ldots\}$ be an increasing sequence of positive pairwise coprime integers, and denote by $C(P)$ the infinite-dimensional chain generated by $P$. Let $c = \{c_k, k \ge 1\}$ be a sequence of positive reals such that
$$\sum_k c_k^2 < \infty, \qquad \sum_k c_k = \infty.$$
Assume that condition (12.3.16) is satisfied. Then there exists a bounded measurable function $f$ such that $\big( \sum_{k \le n} c_k f(P_k\,\cdot) \big)_n$ does not converge almost everywhere.

The proof of Theorem 12.3.7 is similar to the proof of Theorem 12.3.5, so it is omitted.

Proof of Theorem 12.3.6. Let $s$ be some fixed positive integer. Put for any integer $T \ge 0$,
$$A_T = \big\{ n = P_1^{\alpha_1} \cdots P_s^{\alpha_s} : P_1^T \le n < P_1^{T+1},\ \alpha_i \ge 0,\ i = 1, \ldots, s \big\}. \tag{12.3.17}$$
By replacing $\alpha_1$ by $\alpha_1 + 1$, one can easily verify that
$$\#(A_T) \le \#(A_{T+1}). \tag{12.3.18}$$
As for $n = P_1^{\alpha_1} \cdots P_s^{\alpha_s} \in A_T$, necessarily $0 \le \alpha_1 + \cdots + \alpha_s \le T$, so we also deduce
$$\#(A_T) \le T^s. \tag{12.3.19}$$
Then, for any $d > 0$, there exists an integer $T > 0$ such that
$$\#(A_{T+d}) \le 2\, \#(A_T). \tag{12.3.20}$$

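Indeed, otherwise, $\#(A_{T+d}) > 2\, \#(A_T)$ for any $T$ would imply, for any integer $n$, $\#(A_{nd}) > B 2^n$, where $B$ is some positive constant, which contradicts (12.3.19). Choose $d$ such that $P_1^d \le P_s$. Any element $j \in C(P)$ such that $j \le P_1^d$ can thus be expressed as $j = P_1^{\alpha_1} \cdots P_r^{\alpha_r}$ with $r \le s$. Put for any $i = 0, \ldots, d$,
$$f^{(i)}(x) = \frac{1}{\#(A_{T+i})^{1/2}} \sum_{n \in A_{T+i}} e^{2i\pi nx}, \tag{12.3.21}$$
and let $f = f^{(0)}$. Next, put for any $i = 0, \ldots, \big[ \frac{d}{2} \big]$,
$$\varphi_i = \frac{f^{(2i-1)} + f^{(2i)}}{\sqrt{2}}, \tag{12.3.22}$$
and let for any integer $j$, $f_j(x) = f(jx)$. The set of functions $f^{(i)}$ is a sub-orthonormal system of $L^2$ and the same property holds true for the system of functions $\varphi_i$. Moreover $\|f_j\| = 1$ for any $j$.

Let $1 \le i \le \big[ \frac{d}{2} \big]$, $j \in \big[ P_1^{2i-1}, P_1^{2i} \big] \cap C(P)$, and examine $f_j$. Let $n \in A_T$. Then $nj$ may be written as $nj = P_1^{\beta_1} \cdots P_s^{\beta_s}$. Moreover, $P_1^{T+2i-1} \le nj < P_1^{T+2i+1}$. It follows that we have the implication
$$n \in A_T \text{ and } j \in \big[ P_1^{2i-1}, P_1^{2i} \big] \cap C(P) \implies nj \in A_{T+2i-1} \cup A_{T+2i}.$$
We may thus write
$$f_j(x) = \frac{1}{\#(D)^{1/2}} \sum_{m \in D} e^{2i\pi mx},$$
where $D \subset A_{T+2i-1} \cup A_{T+2i}$ and $\#(D) = \#(A_T)$. Hence,
$$\sqrt{2}\, \langle f_j, \varphi_i \rangle = \frac{1}{[\#(A_T)\, \#(A_{T+2i-1})]^{1/2}} \sum_{m \in D \cap A_{T+2i-1}} 1 + \frac{1}{[\#(A_T)\, \#(A_{T+2i})]^{1/2}} \sum_{m \in D \cap A_{T+2i}} 1 \ge \frac{1}{\sqrt{2}\, \#(A_T)} \cdot \#(A_T) = \frac{1}{\sqrt{2}},$$
and so for any $1 \le i \le \big[ \frac{d}{2} \big]$, $P_1^{2i-1} \le j \le P_1^{2i}$,
$$\langle f_j, \varphi_i \rangle \ge \frac{1}{2}. \tag{12.3.23}$$
Further, $\langle f_j, \varphi_k \rangle \ge 0$ for any $j$ and $k$. Thus,
$$\begin{aligned} \big\langle S_{P_1^{2i}}(f), \varphi_i \big\rangle &= \frac{1}{\sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j} \sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j \langle f_j, \varphi_i \rangle \\ &\ge \frac{1}{\sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j} \sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i}, P_1^{2i}]} c_j \langle f_j, \varphi_i \rangle \\ &\ge \frac{1}{2}\, \frac{\sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i}, P_1^{2i}]} c_j}{\sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j}. \end{aligned}$$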

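We have obtained for any $i = 1, \ldots, \big[ \frac{d}{2} \big]$,
$$\big\langle S_{P_1^{2i}}(f), \varphi_i \big\rangle \ge \frac{1}{2}\, \frac{\sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i}, P_1^{2i}]} c_j}{\sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j}. \tag{12.3.24}$$
Now, by assumption
$$\limsup_{i \to \infty} \frac{\sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i}, P_1^{2i}]} c_j}{\sum_{j \in C(P) \cap [1, P_1^{2i}]} c_j} > 0.$$
We may find an increasing sequence $(i_\lambda)_\lambda$ of integers as well as a positive real $c$, such that
$$\frac{\sum_{j \in C(P) \cap [\frac{1}{2} P_1^{2i_\lambda}, P_1^{2i_\lambda}]} c_j}{\sum_{j \in C(P) \cap [1, P_1^{2i_\lambda}]} c_j} \ge 2c \qquad (\lambda = 1, 2, \ldots).$$
Consequently, for any $\lambda$ such that $i_\lambda \le d$,
$$\big\langle S_{P_1^{2i_\lambda}}(f), \varphi_{i_\lambda} \big\rangle \ge c. \tag{12.3.25}$$
Let $p$ be a positive integer such that $pc \ge 1$. Lemma 6.1.5 applied with the choices $R = \big[ \frac{D}{2} \big]$, $T = \big[ \big[ \frac{D}{2} \big]/13 \big]$ with $D = \#(\lambda \mid i_\lambda \le d)$ and $p$ shows that
$$N\Big( \Big\{ S_{P_1^{2i}}(f),\ i \le \frac{D}{2} \Big\},\ \frac{c}{2} \Big) \ge T. \tag{12.3.26}$$
But $d$ is arbitrary, thus
$$\sup_{\substack{f \in L^\infty \\ \|f\|_2 \le 1}} N\Big( \big\{ S_{P_1^{2i}}(f),\ i \ge 1 \big\},\ \frac{c}{2} \Big) = \infty.$$
Applying now Bourgain's entropy criterion in $L^\infty$ (Corollary 6.1.8) achieves the proof.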

12.4 Random sequences

In this section we investigate the convergence of the series $\sum_{k=1}^\infty c_k f(n_k x)$ where $\{n_k, k \ge 1\}$ is a random sequence of real numbers. Specifically, we will investigate the model when $n_k = X_1 + \cdots + X_k$, where the $X_k$ are independent, identically distributed random variables defined on some probability space $(\Omega, \mathcal{A}, \mathbf{P})$. We will not assume that $X_1$ is integer valued or $X_1 > 0$; we assume only that the distribution of $X_1$ is nondegenerate. If the random walk $\big\{ \sum_{k=1}^n X_k, n \ge 1 \big\}$ is transient, we have $|n_k| \to \infty$ a.s. On the other hand, if the random walk is recurrent and $X_1$ is nonlattice, $\{n_k, k \ge 1\}$ is dense in $\mathbb{R}$ with probability 1. We begin our investigations with the study of random trigonometric sums of the form
$$\sum_{n=1}^\infty c_n e^{itS_n(\omega)} \tag{12.4.1}$$
where $\{c_k, k \ge 1\} \in \ell^2$; the terms of this sum are functions defined on the product space $\Omega \times \mathbb{T}$, endowed with the product probability $\mathbf{P} \times \lambda$.

12.4.1 Theorem. Let $X_1$ be nondegenerate with characteristic function $\varphi$ and let $S_n = \sum_{k=1}^n X_k$ be the corresponding random walk. Then for any $c \in \ell^2$ and any real $t$ for which
$$\rho = \max\big( |\varphi(t)|, |\varphi(2t)|, |\varphi(-t)|, |\varphi(-2t)| \big) < 1 \tag{12.4.2}$$
the series (12.4.1) converges with probability 1. Consequently, the series (12.4.1) converges for almost all $(t, \omega) \in \mathbb{T} \times \Omega$, provided $c \in \ell^2$.

Since $X_1$ is nondegenerate, (12.4.2) holds for all but countably many $t$'s. If $X_1$ is nonlattice, then $|\varphi(t)| < 1$ for all $t \ne 0$; otherwise there exists a $t_0 > 0$ such that $|\varphi(t)| = 1$ if and only if $t = kt_0$, $k \in \mathbb{Z}$. If $X_1$ is degenerate, then $S_n = cn$ with some constant $c$, and the statement of Theorem 12.4.1 reduces to Carleson's theorem, which is of course not contained in our result. But it is interesting to note that for all other random walks, the above formulated "random" version of Carleson's theorem is valid. This seems paradoxical at first sight, since the random walk $S_n$ can be recurrent, e.g., it is possible that $S_n = 0$ for infinitely many $n$. However, by the theory of random walks the set $H = \{n : S_n = 0\}$ is thin (e.g., it has $O(n^{1/2})$ elements in the interval $[0, n]$) and Theorem 12.4.1 shows that $\sum_{k \in H} |c_k| < \infty$ even if $\sum_{k=1}^\infty |c_k| = \infty$.

Applying Theorem 8.2.1 with $\gamma = 4$, $\alpha = 2$, $u_k = c_k^2$, for the proof of Theorem 12.4.1 it suffices to prove the following

12.4.2 Lemma. For any real $c_1, \ldots, c_N$ we have
$$\mathbf{E} \Big| \sum_{k=1}^N c_k e^{itS_k} \Big|^4 \le \frac{1}{(1 - \rho)^2} \Big( \sum_{k=1}^N c_k^2 \Big)^2, \tag{12.4.3}$$
where $\rho$ is defined by (12.4.2).

Proof. In the case $\rho = 1$ the lemma is obvious, so we can assume $\rho < 1$. Clearly for any real $c_1, \ldots, c_N$ we have
$$\mathbf{E} \Big| \sum_{k=1}^N c_k e^{itS_k} \Big|^4 = \sum_{1 \le j,k,l,m \le N} c_j c_k c_l c_m\, \mathbf{E}\, e^{it(S_j - S_k + S_l - S_m)}. \tag{12.4.4}$$

We now claim that
$$\big| \mathbf{E}\, e^{it(\pm S_j \pm S_k \pm S_l \pm S_m)} \big| \le \rho^{|j-k| + |l-m|} \qquad (j \ge k \ge l \ge m), \tag{12.4.5}$$
provided in the last exponent there are two positive and two negative signs. Clearly we can assume that the sign of $S_j$ in (12.4.5) is positive; otherwise we replace $t$ by $-t$. There are three cases:

(a) $\big| \mathbf{E}\, e^{it(S_j - S_k + S_l - S_m)} \big| = \big| \mathbf{E}\, e^{it(S_j - S_k)} \big| \big| \mathbf{E}\, e^{it(S_l - S_m)} \big| = |\varphi(t)|^{j-k} |\varphi(t)|^{l-m} \le \rho^{|j-k| + |l-m|}$,

(b) $\big| \mathbf{E}\, e^{it(S_j - S_k - S_l + S_m)} \big| = \big| \mathbf{E}\, e^{it(S_j - S_k)} \big| \big| \mathbf{E}\, e^{-it(S_l - S_m)} \big| = |\varphi(t)|^{j-k} |\varphi(-t)|^{l-m} \le \rho^{|j-k| + |l-m|}$,

(c) $\big| \mathbf{E}\, e^{it(S_j + S_k - S_l - S_m)} \big| = \big| \mathbf{E}\, e^{it(S_j - S_k) + 2it(S_k - S_l) + it(S_l - S_m)} \big| = |\varphi(t)|^{j-k} |\varphi(2t)|^{k-l} |\varphi(t)|^{l-m} \le \rho^{|j-k| + |l-m|}$,

proving (12.4.5). Thus, splitting the sum on the right-hand side of (12.4.4) into 24 subsums corresponding to a fixed relative order of $j, k, l, m$, and in each such sum renaming the indices $j, k, l, m$ so that they will be nonincreasing in the renamed order, we get
$$\mathbf{E} \Big| \sum_{k=1}^N c_k e^{itS_k} \Big|^4 \le 24 \sum_{N \ge j \ge k \ge l \ge m \ge 1} |c_j| |c_k| |c_l| |c_m|\, \rho^{|j-k| + |l-m|}. \tag{12.4.6}$$
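The factorizations in cases (a) and (c) can be checked exactly for a concrete walk. The sketch below assumes the simple symmetric walk with steps $\pm 1$, for which $\varphi(t) = \cos t$; this choice, the sample indices and the function name are illustrative and not from the text. The expectation is computed by enumerating all $2^N$ equally likely paths.

```python
import cmath
import math
from itertools import product

def walk_expectation(t, coeffs, N):
    """E exp(i*t*sum_k coeffs[k]*S_k) for the symmetric +-1 walk.

    coeffs maps an index k (1 <= k <= N) to its integer weight;
    the expectation is exact, averaging over all 2**N step sequences.
    """
    total = 0j
    for steps in product((-1, 1), repeat=N):
        S = [0]
        for x in steps:
            S.append(S[-1] + x)  # S[k] = X_1 + ... + X_k
        phase = sum(a * S[k] for k, a in coeffs.items())
        total += cmath.exp(1j * t * phase)
    return total / 2 ** N
```

With $j, k, l, m = 6, 4, 3, 1$, case (a) predicts $\cos^2 t \cdot \cos^2 t$ and case (c) predicts $\cos^2 t \cdot \cos 2t \cdot \cos^2 t$, since the increments over disjoint index blocks are independent.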

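Summing the right-hand side of (12.4.6) first for those indices $(j, k, l, m)$ for which $j - k = r$ and $l - m = s$ are fixed, we get by Cauchy's inequality,
$$\begin{aligned} \sum_{1 \le k,\, k+r,\, m,\, m+s \le N} |c_k| |c_{k+r}| |c_m| |c_{m+s}|\, \rho^{r+s} &\le \rho^{r+s} \Big( \sum_{1 \le k,\, k+r \le N} |c_k| |c_{k+r}| \Big) \Big( \sum_{1 \le m,\, m+s \le N} |c_m| |c_{m+s}| \Big) \\ &\le \rho^{r+s} \Big( \sum_{1 \le k \le N} c_k^2 \Big)^{1/2} \Big( \sum_{1 \le k+r \le N} c_{k+r}^2 \Big)^{1/2} \Big( \sum_{1 \le m \le N} c_m^2 \Big)^{1/2} \Big( \sum_{1 \le m+s \le N} c_{m+s}^2 \Big)^{1/2} \\ &\le \rho^{r+s} \Big( \sum_{1 \le j \le N} c_j^2 \Big)^2. \end{aligned}$$
Now summing for $r$ and $s$ we get Lemma 12.4.2.

We turn now to the convergence of the series (12.4.1) in $L^p(\mathbb{T} \times \Omega)$ for $p > 2$. For simplicity, we consider the case $p = 4$.

12.4.3 Proposition. Let $X = \{X, X_i, i \ge 1\}$ be a sequence of independent, identically distributed, lattice random variables defined on some probability space $(\Omega, \mathcal{A}, \mathbf{P})$. We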

645

12.4 Random sequences

assume that the random walk Sn = X1 + · · · + Xn , n ≥ 1 is transient. Then, E

n 4 ck e2ıπ αSk dα

T k=1

n ≤ 4G(0, 0) |ck |2 k=1

+6

|ci ||cj ||ck ||cl | P Sk − Si = ±(Sj − Sl )

1≤i≤k
+ P Sk − Si = (Sj − Sl ) − 2(Sl − Sk ) .

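Here $G(0,x)=\sum_{k=0}^{\infty}\mathbf{P}\{S_k=x\}$ is the Green function, which is finite for every x ∈ Z since the random walk is transient.

Proof. Let $a_1,\dots,a_n$ be complex numbers. Then
$$\begin{aligned}
\Big|\sum_{i=1}^{n}a_i\Big|^4
&= \Big(\sum_{i=1}^{n}\sum_{j=1}^{n}a_i\bar a_j\Big)\Big(\sum_{k=1}^{n}\sum_{l=1}^{n}a_k\bar a_l\Big)\\
&= \Big(\sum_{i=1}^{n}|a_i|^2+\sum_{i=1}^{n}\sum_{j=i+1}^{n}(a_i\bar a_j+\bar a_i a_j)\Big)\cdot\Big(\sum_{k=1}^{n}|a_k|^2+\sum_{k=1}^{n}\sum_{l=k+1}^{n}(a_k\bar a_l+\bar a_k a_l)\Big)\\
&= \Big(\sum_{k=1}^{n}|a_k|^2\Big)^2+2\sum_{k=1}^{n}|a_k|^2\sum_{k=1}^{n}\sum_{l=k+1}^{n}(a_k\bar a_l+\bar a_k a_l)\\
&\qquad+\sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{k=1}^{n}\sum_{l=k+1}^{n}(a_i\bar a_j+\bar a_i a_j)(a_k\bar a_l+\bar a_k a_l)\\
&=: \Big(\sum_{k=1}^{n}|a_k|^2\Big)^2+A+B.
\end{aligned}$$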

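Apply this in our case: $a_i = c_i e^{2i\pi\alpha S_i(\omega)}$, i = 1, …, n. The sum A is written
$$A = 2\sum_{k=1}^{n}|c_k|^2\ \sum_{k=1}^{n}\sum_{l=k+1}^{n}\Big(c_k\bar c_l\,e^{2i\pi\alpha(S_k(\omega)-S_l(\omega))}+\bar c_k c_l\,e^{2i\pi\alpha(S_l(\omega)-S_k(\omega))}\Big).$$
Since $\int_{\mathbf T}e^{2i\pi\alpha m}\,d\alpha$ equals 1 if m = 0 and 0 otherwise, integrating over Ω × T with respect to P × m gives an expression equal to
$$\tilde A = 2\sum_{k=1}^{n}|c_k|^2\ \sum_{k=1}^{n}\sum_{l=k+1}^{n}\Big(c_k\bar c_l\,\mathbf{P}\{S_l=S_k\}+\bar c_k c_l\,\mathbf{P}\{S_l=S_k\}\Big).$$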

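Let $f_n(\alpha,\omega)=e^{2i\pi\alpha S_n(\omega)}$. We claim that
$$\sup_{n\ge1}\sum_{m\ge1}\big|\langle f_n,f_m\rangle_{\mathbf P\times\lambda}\big| \le 2\,G(0,0).$$
Clearly,
$$0\le\langle f_n,f_m\rangle_{\mathbf P\times\lambda} = \mathbf{E}\int_{-1/2}^{1/2}e^{2i\pi\alpha(S_n-S_m)}\,d\alpha = \mathbf{P}\{S_n=S_m\} = \mathbf{P}\{S_{|m-n|}=0\}.$$
On the one hand,
$$\sum_{m\ge n}\langle f_n,f_m\rangle_{\mathbf P\times\lambda} = \sum_{d\ge0}\mathbf{P}\{S_d=0\} = G(0,0),$$
and on the other,
$$\sum_{m<n}\langle f_n,f_m\rangle_{\mathbf P\times\lambda} = \sum_{d=1}^{n-1}\mathbf{P}\{S_d=0\} \le G(0,0).$$
Therefore
$$\sum_{m\ge1}\langle f_n,f_m\rangle_{\mathbf P\times\lambda} \le 2\,G(0,0).$$
Thus using (1.3.11), it follows that {f_n, n ≥ 1} is a quasi-orthogonal system. As $\langle f_n,f_m\rangle_{\mathbf P\times\lambda}=\mathbf{P}\{S_{|m-n|}=0\}$, we get
$$|\tilde A| \le 4G(0,0)\sum_{k=1}^{n}|c_k|^2. \tag{12.4.7}$$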

Now, the sum B equals

αi α¯ j e2ıπ α(Si (ω)−Sj (ω)) + α¯ i αj e2ıπ α(Sj (ω)−Si (ω))

1≤i<j ≤n 1≤k

× αk α¯ l e2ıπ α(Sk (ω)−Sl (ω)) + α¯ k αl e2ıπ α(Sl (ω)−Sk (ω)) .

Integrating this expression over × T, with respect to P × m, we find a sum of the type γi γj γk γl P Sl − Sk = ±(Sj − Si ) , B˜ = 1≤i<j ≤n 1≤k
where γi = αi or α¯ i . Consider six cases. i) 1 ≤ k < l ≤ i: The sum differences Sj − Si and Sl − Sk are independent, and we find in this case a contribution given by γi γj γk γl P Sj − Si = ±(Sl − Sk ) . 1≤i<j ≤n 1≤k

ii) 1 ≤ k ≤ i < l < j : There are two subcases: Sl − Sk = Sj − Si and Sl − Sk = −(Sj − Si ). Write a = i − k, b = l − i, c = j − l. This corresponds to a + b = ±(b + c). If a + b = b + c, then Si − Sk = Sj − Sl , which are independent sum differences. Hence a contribution equal to γi γj γk γl P Si − Sk = Sj − Sl . 1≤l<j ≤n 1≤k≤i
If a + b = −b − c, then a = −c − 2b and Si − Sk = −(Sj − Sl ) − 2(Sl − Si ), which are independent sum differences. Hence a contribution equal to γi γj γk γl P Si − Sk = −(Sj − Sl ) − 2(Sl − Si ) . 1≤l<j ≤n 1≤k≤i≤l

iii) 1 ≤ k ≤ i < j ≤ l ≤ n: Write a = i − k, b = j − i, c = l − j. The equation $S_j-S_i=\pm(S_l-S_k)$ corresponds to a + b + c = ±b. If a + b + c = b, then a + c = 0 and $(S_i-S_k)+(S_l-S_j)=0$, where $S_i-S_k$ and $S_l-S_j$ are independent sum differences. Therefore, this produces a contribution equal to
$$\sum_{1\le j\le l\le n}\ \sum_{1\le k\le i<j}\gamma_i\gamma_j\gamma_k\gamma_l\,\mathbf{P}\big\{(S_i-S_k)+(S_l-S_j)=0\big\}.$$

If a + b + c = −b, then a = −c − 2b, that is, $S_i-S_k=-(S_l-S_j)-2(S_j-S_i)$, where $S_i-S_k$, $S_l-S_j$ and $S_j-S_i$ are independent sum differences. This produces a contribution equal to
$$\sum_{1\le j\le l\le n}\ \sum_{1\le k\le i<j}\gamma_i\gamma_j\gamma_k\gamma_l\,\mathbf{P}\big\{S_i-S_k=-(S_l-S_j)-2(S_j-S_i)\big\}.$$

iv) 1 ≤ i < k < l ≤ j ≤ n: Write a = k − i, b = l − k, c = j − l. The equation Sj − Si = ±(Sl − Sk ) again corresponds to a + b + c = ±b. If a + b + c = b, then a + c = 0 and (Sk − Si ) + (Sj − Sl ) = 0 where Sk − Si , Sj − Sl are independent sum differences. This produces a contribution equal to γi γj γk γl P (Sk − Si ) + (Sj − Sl ) = 0 . 1≤l≤j ≤n 1≤i
If a + b + c = −b, then a = −c − 2b and Sk − Si = −(Sj − Sl ) − 2(Sl − Sk ) where Sk − Si , Sj − Sl and Sl − Sk are independent sum differences. This produces a contribution equal to γi γj γk γl P Sk − Si = −(Sj − Sl ) − 2(Sl − Sk ) . 1≤l≤j ≤n 1≤i

v) 1 ≤ i < k < j < l ≤ n: Write a = k − i, b = j − k, c = l − j . The equation Sj − Si = ±(Sl − Sk ) corresponds here to a + b = ±(b + c). If a + b = b + c, then a = c and Sk − Si = Sl − Sj where Sk − Si , Sl − Sj are independent sum differences. This produces a contribution equal to

γi γj γk γl P Sk − Si = Sl − Sj .

1≤j
If a + b = −b − c, then a = −2b − c and Sk − Si = −(Sl − Sj ) − 2(Sj − Sk ) where Sk − Si , Sl − Sj and Sj − Sk are independent sum differences. This produces a contribution equal to

γi γj γk γl P Sk − Si = −(Sl − Sj ) − 2(Sj − Sk ) .

1≤j
vi) 1 ≤ i < j ≤ k < l ≤ n: The sum differences Sj − Si and Sl − Sk are independent, therefore in this case we find a contribution

γi γj γk γl P Sj − Si = ±(Sl − Sk ) .

1≤i<j ≤n 1≤k
Summarizing the above estimates, only two types of sums appear:

γi γj γk γl P Sk − Si = ±(Sj − Sl )

(S1)

γi γj γk γl P Sk − Si = (Sj − Sl ) − 2(Sl − Sk )

(S2)

1≤i≤k
and

1≤i≤k
The proof is now completed by counting the number of occurrences of these sums, and using (12.4.7).

We shall deduce from Proposition 12.4.3 a more explicit estimate of the fourth moment. We will use the following transform. Let γ = {γ_n, n ≥ 1} be a bounded sequence of nonnegative reals. Put for any z ∈ Z,
$$\gamma_h^{[z]} = \sum_{u\ge h}\gamma_u\,\mathbf{P}\{S_{u-h}=z\}.$$
By the transience assumption, these quantities are well defined since for any z ∈ Z, $\sum_{u\ge0}\mathbf{P}\{S_u=z\}\le G(0,0)$. In particular, if γ is nonincreasing, we get from the above equality: $\gamma_h^{[z]}\le G(0,0)\,\gamma_{z+h}$.
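As an illustration, the transform can be evaluated numerically for the walk with fair {0, 1} steps, where $\mathbf P\{S_n=z\}=\binom{n}{z}2^{-n}$. The sketch below rests on several assumptions that are not in the text: the infinite sums are truncated at a hypothetical cutoff `terms`, and the test sequence is $\gamma_u = 1/u$. For this walk the truncated Green function comes out close to 2 for each z ≥ 0, and the inequality $\gamma_h^{[z]}\le G(0,0)\gamma_{z+h}$ can be observed directly.

```python
from math import comb


def bernoulli_pmf(n, z):
    """P{S_n = z} when S_n is a sum of n independent fair {0,1} steps."""
    if n < 0 or z < 0 or z > n:
        return 0.0
    return comb(n, z) / 2 ** n


def green(z, terms=400):
    """Truncated Green function: sum over 0 <= n < terms of P{S_n = z}."""
    return sum(bernoulli_pmf(n, z) for n in range(terms))


def transform(gamma, h, z, terms=400):
    """Truncated gamma_h^[z] = sum over u >= h of gamma(u) * P{S_{u-h} = z}."""
    return sum(gamma(u) * bernoulli_pmf(u - h, z) for u in range(h, h + terms))


def gamma(u):
    """Hypothetical nonincreasing test sequence gamma_u = 1/u, u >= 1."""
    return 1.0 / u


for h, z in [(1, 0), (1, 2), (5, 7)]:
    print(h, z, transform(gamma, h, z), green(0) * gamma(z + h))
```

Truncation only drops nonnegative terms, so the truncated transform never exceeds the full one.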


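In the case of the Bernoulli random walk, this can however be read off directly. We have $\mathbf{P}\{S_{u-h}=z\}=0$ if z < 0 or z > u − h, and so
$$\gamma_h^{[z]} = \sum_{u\ge z+h}\gamma_u\,\mathbf{P}\{S_{u-h}=z\} \overset{(u=v+z+h)}{=} \sum_{v=0}^{\infty}\gamma_{v+z+h}\,\mathbf{P}\{S_{v+z}=z\} = \sum_{v=0}^{\infty}\gamma_{v+z+h}\,2^{-(v+z)}C_{v+z}^{z}.$$
Using now the formula
$$\sum_{v=0}^{\infty}C_{v+z}^{z}\,x^{v} = \frac{1}{(1-x)^{z+1}},$$
valid for |x| < 1, gives the relation
$$\sum_{v=0}^{\infty}2^{-(v+z)}C_{v+z}^{z} = 2$$
for any z ≥ 0.

12.4.4 Proposition. Assume that P{X ≥ 0} = 1. Further, let a = {a_k, k ≥ 1} and c = {c_k, k ≥ 1} be two sequences of reals such that |a_k| ≤ c_k for any k and c is nonincreasing. Then
$$\mathbf{E}\int_{\mathbf T}\Big|\sum_{k=m}^{n+m}a_k e^{2i\pi\alpha S_k}\Big|^4 d\alpha \le 4G(0,0)\sum_{k=m}^{n+m}c_k^2 + 48\bigg[\sum_{l=m}^{m+n}c_l^2\Big(\sum_{m\le i\le l}c_i\Big)^2+\Big(\sum_{i=m}^{m+n}c_i\Big)^2\sum_{l\ge m+n}c_l^2\bigg].$$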

12.4.5 Corollary. Assume P(X ≥ 0) = 1. Let a = {a_k, k ≥ 1} and c = {c_k, k ≥ 1} be two sequences of reals such that |a_k| ≤ c_k for any k and c is nonincreasing. Then the series $\sum_{k=1}^{\infty}a_k e^{2i\pi\alpha S_k}$ converges in $L^4(\mathbf P\times\lambda)$, provided that the series
$$\sum_{l\ge1}c_l^2\Big(\sum_{1\le i\le l}c_i\Big)^2$$
converges. In particular the series $\sum_{k=1}^{\infty}k^{-a}e^{2i\pi\alpha S_k}$ converges in $L^4(\mathbf P\times\lambda)$ for any a > 3/4.

Proof of Proposition 12.4.4. By Proposition 12.4.3, n+m n+m 2ıπ αSk 4 ak e dα ≤ 4G(0, 0) ck2 + 6((S1) + (S2)), E T k=m

where (S1) =

k=m

m≤i≤k
(S2) =

m≤i≤k
ci cj ck cl P Sk − Si = ±(Sj − Sl ) , ci cj ck cl P Sk − Si = (Sj − Sl ) − 2(Sl − Sk ) .


Consider first the sums of type (S1), the others will be in turn treated similarly. Write ci cj ck cl P Sk − Si = Sj − Sl m≤i≤k
=

z∈Z m≤i≤k≤m+n

=

ci ck P Sk−i = z

cj cl P Sj − Sl = z

k

cj cl P Sj −l = z .

k
z∈Z m≤i≤k≤m+n

As

ci ck P Sk − Si = z

cj cl P Sj −l = z ≤

k
cl

cj P Sj −l = z =

j ≥l

m≤l≤m+n

cl cl[z] ,

m≤l≤m+n

we get by putting this into the previous relation ci cj ck cl P Sk − Si = Sj − Sl m≤i≤k
≤

m+n z∈Z

≤

l=m

m+n z∈Z

cl cl[z]

l=m

ci

m≤i≤m+n

cl cl[z]

m+n

k≥i

ci ci[z] ≤

i=m

ck P Sk−i = z

m≤i,l≤m+n

cl ci

cl[z] ci[z] .

z∈Z

the latter The sums related to the factor P Sk − Si = −(Sj − Sl ) are treated similarly, − S = 0 = P S − S = 0 , and its value is probability being not 0 only if P S k i j l then P Sk − Si = 0 P Sj − Sl = 0 . Thus ci cj ck cl P Sk − Si = −(Sj − Sl )

m≤i≤k

=

m≤i≤k≤m+n

≤

≤ ≤

ci

i=m

k≥i

m+n

2

i=m

ci2

.

cl

k
ci ck P Sk − Si = 0

m≤i≤k≤m+n m+n

cl cj P Sj − Sl = 0

k

ci ck P Sk − Si = 0

m≤i≤k≤m+n

≤

ci ck P Sk − Si = 0

cj P Sj − Sl = 0

j ≥l

cl cl[0]

m≤l≤m+n

m+n

ck P Sk − Si = 0

l=m

m+n [0] m+n [0] cl cl[0] ≤ ci ci cl cl i=m

l=m


Now consider the sums of type (S2): ci cj ck cl P Sk − Si = (Sj − Sl ) − 2(Sl − Sk ) m≤i≤k
=

≤

z1 ∈Z z2 ∈Z

m≤i≤k
n+m z1 ∈Z z2 ∈Z

ci cj ck cl P Sk−i = z1 P Sl−k = z2 P Sj −l = z1 + 2z2

m+n

|ci |

i=m

ck P Sk−i = z1

k=i m+n

cl

j ≥l

l=k

≤

n+m ci z1 ∈Z z2 ∈Z

i=m

|cj |P Sj −l = z1 + 2z2 P Sl−k = z2

i≤k≤n+m

cl cl[z1 +2z2 ] P Sl−k = z2 .

ck P Sk−i = z1

k≤l≤n+m

Consider on L2 (T) the operator U defined for h ∼ Let g = ∞ k=1 ak ek . It follows that n+m

cl ci

i,l=m

n+m

cl[z] ci[z] ≤ 4

z∈N hz ez

by U h ∼

z∈N hz+1 ez .

cl ci cl+z ci+z

i,l=m z∈N

z∈N

=4

n+m

n+m 2 cl ci U l g, U i g = 4 ci U i g

i,l=m

i=m

and n+m ci z1 ∈Z z2 ∈Z

i=m

≤2

i≤k≤m+n

i=m

i≤k≤m+n

n+m ci

z1 ∈Z z2 ∈Z

i=m

≤4

n+m i=m

i=m

ci

n+m k=m

cl cl[z1 +2z2 ] P Sl−k = z2

ck P Sk−i = z1

cl cl+z1 +2z2 P Sl−k = z2

k
ck P Sk−i = z1 cl cl+z1 +2z2 P Sl−k = z2

i≤k≤m+n

n+m ≤4 ci z1 ∈N z2 ∈N

k
n+m ci z1 ∈Z z2 ∈Z

≤2

ck P Sk−i = z1

l≥k+z2

ck P Sk−i = z1 ck+z2 ck+z1 +3z2

i+z1 ≤k≤m+n

ck

z1 ≤k−i

P Sk−i = z1 ck+z2 ci+z2 z2 ∈N


=4

ci ck U i g, U k g = 4

n+m i,k=m

2 ci U i g .

m≤i≤m+n

Consequently, E

n+m 4 n+m 2ıπ αSk a e dα ≤ 4G(0, 0) ck2 k T k=m

k=m

n+m

+ 48

ck2

2

+

k=m

2 ci U i g .

m≤i≤m+n

As m+n

ci U g = i

m+n

ci cl el =

i=m l≥i

i=m

=

m+n

el cl

(m+n)∧l el cl ci

l≥m

l=m

ci +

i=m

m≤i≤l

m≤i≤m+n

ci

el cl ,

l≥m+n

one has m+n 2 m+n 2 m+n 2 ci U i g = cl2 ci + ci cl2 . i=m

l=m

m≤i≤l

i=m

l≥m+n

Hence n+m 4 n+m E ak e2ıπ αSk dα ≤ 4G(0, 0) ck2 T k=m

k=m

m+n 2 m+n 2 + 48 cl2 ci + ci cl2 . l=m

m≤i≤l

i=m

l≥m+n

We now turn to the study of convergence of $\sum_{k=1}^{\infty}c_k f(S_k x)$ for general $f\in L^2(\mathbf T)$, $\int_{\mathbf T}f(t)\,dt=0$. In the case when the distribution of X_1 is absolutely continuous, the exact analogue of Theorem 12.4.1 holds, namely we have

12.4.6 Theorem. Let X_1 have a bounded density concentrated on a finite interval. Let $f\in\mathrm{Lip}_\alpha(\mathbf T)$ for some α > 0 with $\int_{\mathbf T}f(t)\,dt=0$ and let $c\in\ell^2$. Then for any fixed x ≠ 0, $\sum_{k=1}^{\infty}c_k f(S_k x)$ converges with probability 1. Consequently, for almost every ω ∈ Ω, $\sum_{k=1}^{\infty}c_k f(S_k(\omega)x)$ converges for almost every x.

In the case when X_1 has a lattice distribution, the situation is more complicated. The following theorem describes the Bernoulli case.

12.4.7 Theorem. Let X_1 take the values 0 and 1 with probability 1/2 each. Let $f\in L^2(\mathbf T)$ with $\int_{\mathbf T}f(t)\,dt=0$ have Fourier series
$$f\sim\sum_{k=1}^{\infty}(a_k\cos2\pi kx+b_k\sin2\pi kx)$$
and assume that the Dirichlet series $\sum_n a_n n^{-s}$, $\sum_n b_n n^{-s}$ are regular and bounded in the half-plane ℜ(s) > 0. Let
$$\tau_k(c) := \sup_{L\ge k,\ u\le\gamma\log k}\Big|\sum_{\ell=L}^{L+u}c_\ell\Big|.$$
Then the series $\sum_\ell c_\ell f(S_\ell(\omega)x)$ converges in mean P-almost surely provided $c\in\ell^2$ and $\tau_k(c)=o(1)$.

For the proof of Theorem 12.4.6, let
$$\psi(n) = \sup_{0\le x\le1}\big|\mathbf{P}(S_n\le x)-x\big|.$$
By Theorem 1 of [Schatte: 1984] we have
$$\psi(n)\le Ce^{-\lambda n}\qquad(n\ge1) \tag{12.4.8}$$

for some constants C > 0, λ > 0. 12.4.8 Lemma. Let k < < m < n be positive integers. Then there exists a random variable with || ≤ ψ( − k) such that the vector (S − , Sm − , Sn − ) has uniform coordinates and is independent of Sk − . Similarly, there exists a random variable with | | ≤ ψ(n − m) such that Sn − has uniform coordinates and is independent of the vector (Sk − , S − , Sm − ). This lemma is implicit in [Schatte: 1988] and can be obtained along the following lines. By enlarging the underlying probability space if necessary, we can assume that there exists a random variable U , uniformly distributed over (0, 1), and independent of the sequence X1 , X2 , . . . . Let Y = S − Sk , then |P (Y ≤ t) − t| ≤ ψ( − k) for all t and thus by Lemma 3 of [Schatte: 1988] there exists a uniform random variable Y ∗ , which is a function of U and Y such that |Y − Y ∗ | ≤ ψ( − k). Let = Y − Y ∗ , then (S − , Sm − , Sn − ) = (S − Y, Sm − Y, Sn − Y ) + Y ∗ =: Z + Y ∗ . Here the vector Z is obviously independent of Y = Xk+1 + · · · + X and thus also of the uniform random variable Y ∗ , which is a function of Y and U . Thus adding Y ∗ to the components of Z, we get a vector whose components are uniform, the independence of Z + Y ∗ and Sk follows also from Lemma 1 of [Schatte: 1988]. The second statement of the lemma can be proved similarly.


Proof of Theorem 12.4.6. We prove the statement for x = 1. Let $f\in\mathrm{Lip}_\alpha(\mathbf T)$, α > 0, with $\int_{\mathbf T}f(t)\,dt=0$. By (12.4.8) we have
$$\Big|\mathbf{E}f(S_n)-\int_0^1 f(x)\,dx\Big| \le C_1e^{-\lambda_1 n}\qquad(n\ge1). \tag{12.4.9}$$
Set $\xi_k=f(S_k)-\mathbf{E}f(S_k)$. By (12.4.9), for any bounded sequence {c_k, k ≥ 1} the series $\sum_k c_kf(S_k)$ and $\sum_k c_k\xi_k$ are equiconvergent, and thus it suffices to prove that $\sum_k c_k\xi_k$ is a.s. convergent provided $\{c_k,k\ge1\}\in\ell^2$. In view of Lemma 12.4.2, this will follow if we show that
$$\mathbf{E}\Big(\sum_{k=1}^{N}c_k\xi_k\Big)^4 \le K\Big(\sum_{k=1}^{N}c_k^2\Big)^2 \tag{12.4.10}$$
for any real {c_k, 1 ≤ k ≤ N} with a suitable constant K. We claim that
$$\big|\mathbf{E}(\xi_k\xi_l\xi_m\xi_n)\big| \le Ae^{-C(|l-k|+|n-m|)}\qquad(k\le l\le m\le n). \tag{12.4.11}$$

By Lemma 12.4.8, there exists a random variable with || ≤ ψ(l − k) such that the vector (Sl − , Sm − , Sn − ) =: (Sl , Sm , Sn ) is independent of Sk and thus the random variables X = f (Sk ) − E f (Sk ) and Y = (f (Sl ) − E f (Sl ))(f (Sm ) − E f (Sm ))(f (Sn ) − E f (Sn ))

are independent. Since E (X) = 0, it follows that

E (f (Sk ) − E f (Sk ))(f (Sl ) − E f (Sl ))(f (Sm ) − E f (Sm ))(f (Sn ) − E f (Sn ) = E (XY ) = E (X)E (Y ) = 0. (12.4.12) In view of || ≤ ψ(l − k) and the boundedness and Lipschitz property of f it follows , S by S , S , S in the first expectation in (12.4.12) results in a that replacing Sl , Sm l m n n change of at most Cψ(l − k) of the expectation and thus we see that the expectation in (12.4.11) is at most Cψ(l − k). A similar argument shows that the left-hand side of (12.4.11) is at most Cψ(n − m), and thus the left-hand side of (12.4.11) is also bounded by C(ψ(l − k)ψ(n − m))1/2 , which proves (12.4.11) in view of (12.4.8). It is now easy to verify (12.4.10). By (12.4.11), the left-hand side of (12.4.10) is bounded by (12.4.13) ck cl cm cn e−C(|l−k|+|n−m|) . 1≤k≤l≤m≤n≤N

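Summing (12.4.13) first for those indices (k, l, m, n) for which l − k = r and n − m = s are fixed, we get by Cauchy's inequality,
$$\begin{aligned}
\sum_{1\le k,k+r,m,m+s\le N}c_kc_{k+r}c_mc_{m+s}\,e^{-C(r+s)}
&\le e^{-C(r+s)}\Big(\sum_{1\le k\le k+r\le N}c_kc_{k+r}\Big)\Big(\sum_{1\le m\le m+s\le N}c_mc_{m+s}\Big)\\
&\le e^{-C(r+s)}\Big(\sum_{1\le k\le N}c_k^2\Big)^{1/2}\Big(\sum_{1\le k+r\le N}c_{k+r}^2\Big)^{1/2}\Big(\sum_{1\le m\le N}c_m^2\Big)^{1/2}\Big(\sum_{1\le m+s\le N}c_{m+s}^2\Big)^{1/2}\\
&\le e^{-C(r+s)}\Big(\sum_{1\le j\le N}c_j^2\Big)^2.
\end{aligned}$$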

Now summing for r and s we get (12.4.10). Proof of Theorem 12.4.7. Set δ0 = 0, 0 = 0 and for any integer k ≥ 1, δk = inf n ≥ 1 : Xn+δ1 +···+δk−1 = 1 , k = δ1 + · · · + δk . Then the random variables δk are i.i.d. and P{δk = m} = 2−m for all k and m. Further k = inf{ ≥ 1 : Sl = k}. Let ω ∈ , then

c f (S (ω)x) =

k

Yh f (hx)

(k = 0, 1, . . . )

h=0

<k+1 (ω)

where we put

Yk =

c

(k = 0, 1, . . . )

(12.4.14)

k ≤<k+1

and for k ≤ L < k+1 ,

c f (S (ω)x) =

k−1

Yh f (hx) +

c f (kx)

(k = 0, 1, . . . ).

(12.4.15)

k ≤≤L

h=0

We first work the sums k (c, x) :=

k

Yh f (hx)

(k = 0, 1, . . . ).

(12.4.16)

h=0

Let y = {Yk , k ≥ 0}. It follows from theorem 12.1.1 in that if f is such that the Dirichlet series (12.1.4) are regular and bounded in the half-plane $(s) > 0, the sequence {k (c, x), k ≥ 0} converges in mean, P-almost surely provided that P{y ∈ 2 } = 1. The following intermediate result is a straightforward consequence of Lemma 11.7.6.


12.4.9 Lemma. Assume that f satisfies the assumptions of Theorem 12.4.7. Then for any c ∈ 2 , the sequence {k (c, x), k ≥ 0} converges in mean, P-almost surely. Now we pass to the study for k ≤ L < k+1 , k = 0, 1, . . . , of the ratio c f (kx). k ≤≤L

By the strong law of large numbers k ∼ kE δ1 almost surely and E δ1 = Further, by the Borel–Cantelli lemma,

∞

m=1 m2

−m .

P{δk ≤ γ log k ultimately} = 1, for some constant γ . It follows that with probability 1, 2 max c f (S (ω)x) − k (c, x) dx ≤ τk2 (c) f 22 k ≤L<k+1 T
ultimately,

and Theorem 12.4.7 is proved.

We conclude by mentioning a few open questions related to the problems investigated in this chapter, whose solution would be particularly helpful in improving the understanding of this set of issues.

Problem 12. In terms of the Fourier coefficients of $f\in L^2(\mathbf T)$, find a necessary and sufficient (or at least sharp) condition for f(kx) to be a convergence system. Answer the same question for the system f(n_k x) for a fixed increasing sequence {n_k, k ≥ 1} of positive integers.

There are several variants and special cases of this question which are also of considerable interest.

Problem 13. Find sharp conditions for the coefficient sequence {c_k, k ≥ 1} implying that $\sum_{k=1}^{\infty}c_kf(kx)$ converges a.e. for all $f\in\mathrm{Lip}_\alpha(\mathbf T)$, $\int_{\mathbf T}f(t)\,dt=0$.

For 0 < α < 1/2 a sufficient condition is
$$\sum_{k=1}^{\infty}c_k^2k^{\gamma}<\infty\qquad\text{for some }\gamma>1-2\alpha$$
(cf. Gaposhkin [1966b] or Corollary 12.2.18), while the condition
$$\sum_{k=1}^{\infty}c_k^2(\log k)^{\gamma}<\infty,\qquad\gamma<1-2\alpha$$
is not sufficient (see Theorem 3 of Berkes [1997]). Similarly, in the case α = 1/2 a sufficient condition is
$$\sum_{k=1}^{\infty}c_k^2(\log k)^{\gamma}<\infty\qquad\text{for some }\gamma>2 \tag{12.4.17}$$

(see Corollary 12.2.19), while the condition ∞

ck2 < ∞

for some γ > 0

(12.4.18)

k=1

is not sufficient (see Theorem 1 in Berkes [1993]). Answer the same question for the class BV(T), where the unknown sharp condition lies again between (12.4.17) and (12.4.18).

Problem 14. Let $f\in L^2(\mathbf T)$, $\int_{\mathbf T}f(t)\,dt=0$. A simple sufficient condition for f(kx) to be a convergence system is that the Fourier series $f\sim\sum_{k=1}^{\infty}(a_k\cos2\pi kx+b_k\sin2\pi kx)$ satisfies $\sum_{k=1}^{\infty}(|a_k|+|b_k|)<\infty$ (Gaposhkin [1968]). Characterize the functions f for which this condition is also necessary. By Theorem 4 of Berkes [1997], this is the case if
$$f\sim\sum_{k\in H}(a_k\cos2\pi kx+b_k\sin2\pi kx)$$
and the elements of H are coprime.

Problem 15. Find a sharp condition for the square modulus of continuity $\omega_2(\delta)$ of $f\in L^2(\mathbf T)$ to imply that f(n_k x) is a convergence system for all sequences (n_k) of positive integers satisfying the Hadamard gap condition. By Gaposhkin [1967], $\int_{\mathbf T}f(t)\,dt=0$ and
$$\omega_2(\delta,f)=O\Big(\Big(\log\frac1\delta\Big)^{-\gamma}\Big)$$
are sufficient for γ > 1, but not for γ = 1/2.

Problem 16. Let W(T) (Wintner class) denote the class of functions $f\in L^2(\mathbf T)$ such that $\int_{\mathbf T}f(t)\,dt=0$ and the Dirichlet series $\sum_{n=1}^{\infty}a_nn^{-s}$, $\sum_{n=1}^{\infty}b_nn^{-s}$ are bounded and regular in the half-plane ℜ(s) > 0, where
$$f\sim\sum_{k=1}^{\infty}(a_k\cos2\pi kx+b_k\sin2\pi kx).$$
Prove that if $f\in\mathrm{Lip}_\alpha(\mathbf T)$, α > 0, $\int_{\mathbf T}f(t)\,dt=0$, then f(n_k x) is a convergence system for all sequences (n_k) of positive integers satisfying the sub-lacunarity condition
$$n_{k+1}/n_k\ge1+ck^{-\gamma},\qquad k=1,2,\dots,\ \text{for some }\gamma<1/2$$
if and only if f ∈ W(T).

Some problems on series $\sum c_kf(n_kx)$ with random n_k:

Problem 17. Let X_1 ≥ 0 be an integer-valued random variable over the probability space (Ω, A, P) such that E X_1 < ∞ and P(X_1 = 0) < 1. Let S_n = X_1 + ··· + X_n and let $f\in L^2(\mathbf T)$ belong to the Wintner class W(T). Is it true that the series $\sum_{k=1}^{\infty}c_kf(S_k(\omega)x)$ converges for almost every (ω, x) ∈ Ω × T provided $c\in\ell^2$?

Problem 18. Find precise criteria for $\sum_{k=1}^{\infty}c_ke^{itS_k}$ to converge in $L^4(\mathbf T\times\Omega)$ norm, where S_k = S_k(ω) is a nondegenerate random walk over a probability space (Ω, A, P). A rather restrictive sufficient condition is given in Corollary 12.4.5.

Chapter 13

Divisors and random walks

We discuss the idea arising from Section 11.8 of a probabilistic model of the integers built from a given random walk. Two simple cases are investigated: the Rademacher random walk and the Bernoulli random walk. The chapter is devoted to a study of the divisor functions: the value distribution of divisors of Bernoulli or Rademacher sums, the case of squarefree divisors or small divisors, series involving divisor functions and the functional equation of the Riemann zeta function.

13.1

Introduction

To introduce the subject of this chapter, we recall an essential approach used in the course of Chapter 11. Let ℓ ≥ n be two positive integers and consider the divisor functions
$$d_n(\ell)=\#\{k\le n: k\mid\ell\},\qquad d(\ell)=\#\{k: k\mid\ell\}.$$
In the proofs of Theorems 11.8.1 and 11.8.2, we had to estimate the series
$$\sum_{\ell\ge1}a_\ell\,d_n(\ell)$$

with positive coefficients a . Because the range of values of the Bernoulli random walk {Sn , n ≥ 1} is N, we have a dn () ≤ aSk dn (Sk ), ≥1

≥1

almost surely. Further, for some u > 0, the set u = {inf n≥1 Sn /n ≥ u} has positive probability. And so, at least when {an , n ≥ 1} is sufficiently regular aSk dn (Sk ) ≤ A ak dn (Sk ), ≥1

k≥1

where A < ∞ almost surely. This reduces the problem to estimating the series $\sum_{k\ge1}a_k\,\mathbf{E}\,d_n(S_k)$, and thereby the probability P{d|S_k}, which can be investigated either via the characteristic function or using the local limit theorem. This naturally suggests investigating more closely the value distribution

of divisors of Z-valued random walks, and using this probabilistic model to study other questions. This is the object of this chapter. We will concentrate on two cases, the Bernoulli and the Rademacher random walks. But first, it seems natural to discuss the relevance of this model and to compare it with some others.

There are several probabilistic models of the integers. The most commonly used is that of finite intervals $(I_N,\mu_N)$, where $I_N=\{1,\dots,N\}$ and $\mu_N=\frac1N\sum_{n=1}^{N}\delta_n$ is the normalized counting measure, $\delta_n$ being the Dirac mass at the point n. Their behavior as N tends to infinity was much investigated, notably in Kubilius' work [1964]; see also [Elliott: 1980], [Tenenbaum: 1990]. From a probabilistic point of view, this model is however very unsatisfactory. Indeed, in order to represent all integers, infinitely many probability spaces are necessary. Further, it is important to make the following observation. One of the most fundamental notions in Probability Theory is that of the correlation of a system or of a probabilistic model, together with the possibility of using its correlation properties in order to apply the useful and important convergence criteria investigated in Chapters 8 and 9. With the model $(I_N,\mu_N)$, N → ∞, this possibility is excluded. Alternatively, randomizing the integers with a random walk, for instance the Bernoulli random walk, allows these criteria to be applied in full force. Further, this model behaves locally nearly like the finite interval model.

In the case considered before, the study of the correlation properties of divisors of Bernoulli sums is certainly an important question which, to say the least, involves rather thorough and complicated work. And it is already not so immediate to see why the divisor functions d(B_n) should enjoy some asymptotic independence property. More precisely, why the correlation function

(d, n), (δ, m) = P d|Bn , δ|Bm − P{d|Bn }P{δ|Bm }, should tend to 0 as m−n becomes large. Suppose d < nu or δ < mu for some u < 1/2. Then, d−1 δ−1

h h j 1 iπ( j n+ h m) n e d δ cos π cosm−n π , + (d, n), (δ, m) ≈ dδ d δ δ j =1 h=1

as n, m tend to infinity. If we now fix d, δ, as cosm−n π hδ → 0, m−n → ∞, 1 ≤ h < δ, we get

lim (d, n), (δ, m) = 0. m−n→∞

And so, basically, there is a correlation phenomenon in the "Bernoulli model". Related studies are in [Weber: 2007b]. The same observations can be made for questions involving a continuous time parameter. In Section 13.7 (Theorem 13.7.3), for instance, the Cauchy random walk, modelling the reals, is used to study the asymptotic behavior of the zeta function along the critical line σ = 1/2. Let X_1, X_2, … denote an infinite sequence of independent Cauchy distributed random variables (with characteristic function $\varphi(t)=e^{-|t|}$). The time t is modelled by the sequence of partial sums S_n = X_1 + ··· + X_n.

In order to understand the behavior of $\zeta\big(\tfrac12+it\big)$ when t tends to infinity, the almost sure asymptotic behavior of the system
$$\zeta_n := \zeta\Big(\frac12+iS_n\Big),\qquad n=1,2,\dots$$
is investigated. Put for any positive integer n
$$Z_n = \zeta(1/2+iS_n)-\mathbf{E}\,\zeta(1/2+iS_n) = \zeta_n-\mathbf{E}\,\zeta_n.$$
The (crucial) preliminary study of second order properties of the system {Z_n, n ≥ 1} made in [Lifshits–Weber: 2009a] yields the striking fact that this system nearly behaves like a system of non-correlated variables, i.e. the variables Z_n are weakly orthogonal. More precisely, there exist constants C, C_0 such that
$$\mathbf{E}\,|Z_n|^2 = \log n+C+o(1),\qquad n\to\infty,$$
and
$$\big|\mathbf{E}\,Z_n\overline{Z_m}\big| \le C_0\max\Big(\frac1n,\frac1{2^{m-n}}\Big)\qquad\text{for } m>n+1.$$
Theorem 13.7.2, asserting that
$$\lim_{n\to\infty}\frac{\sum_{k=1}^{n}\zeta(\frac12+iS_k)-n}{n^{1/2}(\log n)^{b}} \overset{a.s.}{=} 0$$
for any real b > 2, then follows from the convergence criterion stated in Theorem 9.3.11.

13.2

Value distribution and small divisors of Bernoulli sums

Let β = {β_i, i ≥ 1} be a Bernoulli sequence, and let B_n = β_1 + ··· + β_n, n = 1, 2, …, be the sequence of associated partial sums. Let (Ω, A, P) denote the underlying basic probability space. Introduce the elliptic Theta function
$$\Theta(d,m) = \sum_{\ell\in\mathbf Z}e^{\,im\pi\frac{\ell}{d}-\frac{m\pi^2\ell^2}{2d^2}}.$$

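13.2.1 Theorem. (i) We have the following uniform estimate:
$$\sup_{2\le d\le n}\Big|\mathbf{P}\{d|B_n\}-\frac{\Theta(d,n)}{d}\Big| = O\big((\log n)^{5/2}n^{-3/2}\big).$$
(ii) Moreover
$$\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| \le \begin{cases} C\Big((\log n)^{5/2}n^{-3/2}+\dfrac1d\,e^{-\frac{n\pi^2}{2d^2}}\Big) & \text{if } d\le\sqrt n,\\[2mm] \dfrac{C}{\sqrt n} & \text{if } \sqrt n\le d\le n.\end{cases}$$
(iii) Further, for any α > 0,
$$\sup_{d<\pi\sqrt{\frac{n}{2\alpha\log n}}}\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| = O_\varepsilon\big(n^{-\alpha+\varepsilon}\big)\qquad(\forall\varepsilon>0).$$
(iv) Finally, for any 0 < ρ < 1,
$$\sup_{d<(\pi/\sqrt2)\,n^{(1-\rho)/2}}\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| = O_\varepsilon\big(e^{-(1-\varepsilon)n^{\rho}}\big)\qquad(\forall\,0<\varepsilon<1).$$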
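Estimate (i) can be explored numerically. The following sketch is illustrative only: the function names, the truncation parameter `cutoff` and the sample values of d and n are assumptions, not taken from the text. It computes P{d|B_n} exactly from binomial coefficients and compares it with Θ(d, n)/d, where $\Theta(d,n)=\sum_{\ell\in\mathbf Z}e^{i\pi n\ell/d-n\pi^2\ell^2/(2d^2)}$ is truncated to |ℓ| ≤ `cutoff`.

```python
import cmath
import math


def prob_divides(d, n):
    """Exact P{d | B_n} for B_n a sum of n fair {0,1} variables (0 counts as divisible)."""
    return sum(math.comb(n, k) for k in range(0, n + 1, d)) / 2 ** n


def theta(d, m, cutoff=200):
    """Truncated theta sum over |l| <= cutoff of exp(i*pi*m*l/d - m*pi^2*l^2/(2 d^2))."""
    total = 0j
    for l in range(-cutoff, cutoff + 1):
        total += cmath.exp(1j * math.pi * m * l / d - m * math.pi ** 2 * l * l / (2 * d * d))
    return total


n = 400
for d in (7, 40):
    exact = prob_divides(d, n)
    approx = theta(d, n).real / d  # the sum is real by the symmetry l -> -l
    print(d, exact, approx, abs(exact - approx))
```

The theorem does not specify the constant in the O-term, so the printed differences only show the order of magnitude for these particular parameters.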

(An improvement √ of this result yielding the existence of a corrective exponential factor when d n was recently obtained; see [Weber: 2009d].) It follows from (ii) that limn→∞ P d|Bn = 1/d, and 1 d P d|Bn − ≤ C d n

In particular

if 2 ≤ d ≤

√ n.

sup√ d P d|Bn ≤ C.

2≤d≤ n

Estimate (iv) yields a strong variation of the speed of convergence (as n → ∞) of P{d|B_n} to its limit 1/d when switching from the case d ≤ √n to the case d ≤ n^θ, θ < 1/2.

Proof. By writing that
$$d\,\delta_{d|B_n} = \sum_{j=0}^{d-1}e^{2i\pi\frac{j}{d}B_n}, \tag{13.2.1}$$
we obtain after integration
$$\mathbf{P}\{d|B_n\} = \frac1d\sum_{j=0}^{d-1}e^{i\pi n\frac{j}{d}}\cos^n\frac{\pi j}{d}. \tag{13.2.2}$$
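Identity (13.2.2) follows from $\mathbf{E}\,e^{2i\pi(j/d)\beta}=\tfrac12\big(1+e^{2i\pi j/d}\big)=e^{i\pi j/d}\cos(\pi j/d)$ and independence. A quick numerical check is sketched below; the function names and the tested ranges of d and n are illustrative assumptions.

```python
import cmath
import math


def prob_divides_exact(d, n):
    """P{d | B_n} computed directly from the binomial distribution of B_n."""
    return sum(math.comb(n, k) for k in range(0, n + 1, d)) / 2 ** n


def prob_divides_formula(d, n):
    """Right-hand side of (13.2.2): (1/d) * sum_j exp(i*pi*n*j/d) * cos(pi*j/d)**n."""
    total = sum(
        cmath.exp(1j * math.pi * n * j / d) * math.cos(math.pi * j / d) ** n
        for j in range(d)
    )
    return (total / d).real


# Largest discrepancy over a small grid of (d, n).
worst = max(
    abs(prob_divides_exact(d, n) - prob_divides_formula(d, n))
    for d in range(2, 9)
    for n in range(1, 40)
)
print(worst)
```

The discrepancy should be at the level of floating-point rounding, since both sides are the same finite quantity.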

The summands are of the form $e^{inx}\cos^nx$. As $e^{in(\pi-x)}\cos^n(\pi-x)=(-1)^ne^{-inx}(-1)^n\cos^nx=e^{-inx}\cos^nx$, we have in fact, by distinguishing the case d even from the case d odd,
$$\mathbf{P}\{d|B_n\} = \frac1d+\frac2d\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)\cos^n\frac{\pi j}{d}. \tag{13.2.3}$$
Now, the principle of the proof will consist in comparing the sum
$$\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)\cos^n\frac{\pi j}{d}$$
with
$$\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)e^{-n\frac{\pi^2j^2}{2d^2}}.$$
According to the reduction applied in (13.2.3), we only have to work in the first quadrant. Let α > α′ > 0. Let
$$\varphi_n=\Big(\frac{2\alpha\log n}{n}\Big)^{1/2},\qquad \tau_n=\frac{\sin(\varphi_n/2)}{\varphi_n/2}. \tag{13.2.4}$$
We assume n sufficiently large, say n ≥ n_0, for τ_n to be greater than $(\alpha'/\alpha)^{1/2}$. Consider two sectors
$$A_n=\,]0,\varphi_n[,\qquad A_n'=\Big[\varphi_n,\frac\pi2\Big[.$$
If $\frac{\pi j}{d}\in A_n'$, then $|\cos\frac{\pi j}{d}|\le\cos\varphi_n$, and $|\cos\frac{\pi j}{d}|^n\le\cos^n\varphi_n\le e^{-2n\sin^2(\varphi_n/2)}$. As $2n\sin^2(\varphi_n/2)=2n(\varphi_n/2)^2\tau_n^2\ge\alpha'\log n$, we deduce
$$\sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n'}}\Big|\cos\frac{\pi j}{d}\Big|^n \le dn^{-\alpha'}. \tag{13.2.5}$$
Now, if $d<\pi\sqrt{\frac{n}{2\alpha\log n}}$, we deduce that
$$\frac{\pi j}{d}\ge\frac\pi d>\sqrt{\frac{2\alpha\log n}{n}}=\varphi_n,$$
and so $\{1\le j<d/2:\frac{\pi j}{d}\in A_n\}=\emptyset$. Therefore, in view of (13.2.3) and (13.2.5), for each α > 0,
$$\sup_{d<\pi\sqrt{\frac{n}{2\alpha\log n}}}\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| = O_\varepsilon\big(n^{-\alpha+\varepsilon}\big)\qquad(\forall\varepsilon>0). \tag{13.2.6}$$

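Now, let 0 < ρ < 1. In a similar fashion
$$\sup_{d<(\pi/\sqrt2)\,n^{(1-\rho)/2}}\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| = O_\varepsilon\big(e^{-(1-\varepsilon)n^{\rho}}\big)\qquad(\forall\,0<\varepsilon<1), \tag{13.2.7}$$
which are respectively estimates (iii) and (iv) of Theorem 13.2.1. Indeed, consider the modified sectors
$$\tilde A_n=\,]0,\psi_n[,\qquad \tilde A_n'=\Big[\psi_n,\frac\pi2\Big[,$$
where
$$\psi_n=\Big(\frac{2n^{\rho}}{n}\Big)^{1/2},\qquad \tilde\tau_n=\frac{\sin(\psi_n/2)}{\psi_n/2}. \tag{13.2.8}$$
Let also 0 < ε < 1, and suppose n sufficiently large for $\tilde\tau_n$ to be greater than $\sqrt{1-\varepsilon}$. Exactly as before, if $\frac{\pi j}{d}\in\tilde A_n'$, then $|\cos\frac{\pi j}{d}|\le\cos\psi_n$, so that $|\cos\frac{\pi j}{d}|^n\le(\cos\psi_n)^n\le e^{-2n\sin^2(\psi_n/2)}$. And $2n\sin^2(\psi_n/2)=2n(\psi_n/2)^2\tilde\tau_n^2=n^{\rho}\tilde\tau_n^2\ge(1-\varepsilon)n^{\rho}$. We deduce
$$\sum_{\substack{1\le j<d/2\\ \pi j/d\in\tilde A_n'}}\Big|\cos\frac{\pi j}{d}\Big|^n \le de^{-(1-\varepsilon)n^{\rho}}. \tag{13.2.9}$$
For the same reasons as before, when $d<\pi\sqrt{\frac{n}{2n^{\rho}}}$ we have $\{1\le j<d/2:\frac{\pi j}{d}\in\tilde A_n\}=\emptyset$, so this sum is the only term appearing in (13.2.3).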

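Now assume α > α′ > 3/2. Apart from (13.2.5), the inequality $\varphi_n=\big(\frac{2\alpha\log n}{n}\big)^{1/2}\le\frac{\pi j}{d}<\frac\pi2$ implies that
$$\sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n'}}e^{-n\frac{\pi^2j^2}{2d^2}} \le dn^{-\alpha}. \tag{13.2.10}$$
Now consider the contribution of the terms for which $\frac{\pi j}{d}\in A_n$. We proceed as follows: if
$$D := \sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n}}\cos\Big(\pi n\frac jd\Big)\Big(\cos^n\frac{\pi j}{d}-e^{-n\frac{\pi^2j^2}{2d^2}}\Big),$$
then by using the elementary inequality $|e^u-e^v|\le|u-v|$ for u, v ≤ 0,
$$|D| \le n\sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n}}\Big|\log\cos\frac{\pi j}{d}+\frac{\pi^2j^2}{2d^2}\Big|.$$
Since $\log(1-2\sin^2(x/2))=-x^2/2+O(x^4)$ near 0, we deduce
$$|D| \le Cn\sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n}}\Big(\frac jd\Big)^4 \le \frac{Cn}{d^4}\sum_{j\le\frac d\pi\left(\frac{2\alpha\log n}{n}\right)^{1/2}}j^4 \le C_\alpha\,d(\log n)^{5/2}n^{-3/2}. \tag{13.2.11}$$
Combining (13.2.5), (13.2.10) and (13.2.11) shows that
$$\begin{aligned}
\Big|\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)\Big(\cos^n\frac{\pi j}{d}-e^{-n\frac{\pi^2j^2}{2d^2}}\Big)\Big|
&\le |D|+\sum_{\substack{1\le j<d/2\\ \pi j/d\in A_n'}}\Big(\Big|\cos\frac{\pi j}{d}\Big|^n+e^{-n\frac{\pi^2j^2}{2d^2}}\Big)\\
&\le C_\alpha\,d(\log n)^{5/2}n^{-3/2}+dn^{-\alpha'}+dn^{-\alpha}\\
&\le C_\alpha\,d(\log n)^{5/2}n^{-3/2}.
\end{aligned} \tag{13.2.12}$$
Dividing both sides by d and inserting the obtained estimate into (13.2.3) gives
$$\Big|\mathbf{P}\{d|B_n\}-\frac1d-\frac2d\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)e^{-n\frac{\pi^2j^2}{2d^2}}\Big| \le C_\alpha(\log n)^{5/2}n^{-3/2}. \tag{13.2.13}$$
As
$$\frac1d+\frac2d\sum_{1\le j<d/2}\cos\Big(\pi n\frac jd\Big)e^{-n\frac{\pi^2j^2}{2d^2}} = \frac1d\sum_{|j|<d/2}e^{i\pi n\frac jd}e^{-n\frac{\pi^2j^2}{2d^2}},$$
we have obtained
$$\sup_{2\le d\le n}\Big|\mathbf{P}\{d|B_n\}-\frac1d\sum_{|j|<d/2}e^{i\pi n\frac jd-n\frac{\pi^2j^2}{2d^2}}\Big| = O\big((\log n)^{5/2}n^{-3/2}\big). \tag{13.2.14}$$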
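Now consider the remainder $r:=\sum_{j\ge d/2}e^{-n\frac{\pi^2j^2}{2d^2}}$. By using the triangle inequality,
$$\Big|\mathbf{P}\{d|B_n\}-\frac{\Theta(d,n)}{d}\Big| \le C(\log n)^{5/2}n^{-3/2}+\frac{2r}{d}. \tag{13.2.15}$$
If d = 2,
$$\begin{aligned}
r=\sum_{j=1}^{\infty}e^{-n\frac{\pi^2j^2}{8}}
&\le e^{-\frac{\pi^2n}{8}}+\sum_{j=2}^{\infty}\int_{\frac{\pi(j-1)}{2\sqrt2}}^{\frac{\pi j}{2\sqrt2}}e^{-nx^2}\,dx
= e^{-\frac{\pi^2n}{8}}+\int_{\frac{\pi}{2\sqrt2}}^{\infty}e^{-nx^2}\,dx\\
&\overset{(x=y/\sqrt{2n})}{=} e^{-\frac{\pi^2n}{8}}+\int_{\frac{\pi\sqrt n}{2}}^{\infty}e^{-y^2/2}\,\frac{dy}{\sqrt{2n}} \le Ce^{-\frac{\pi^2n}{8}}.
\end{aligned}$$
Now if d ≥ 3, then
$$\frac{\frac d2-1}{d}\ge\frac{\frac d2-\frac d3}{d}=\frac16.$$
Therefore
$$\begin{aligned}
r &\le \sum_{j\ge d/2}\int_{\frac{\pi(j-1)}{\sqrt2d}}^{\frac{\pi j}{\sqrt2d}}e^{-nx^2}\,dx \le \int_{\frac{\pi(\frac d2-1)}{\sqrt2d}}^{\infty}e^{-nx^2}\,dx \le \int_{\frac{\pi}{6\sqrt2}}^{\infty}e^{-nx^2}\,dx\\
&\overset{(x=y/\sqrt{2n})}{=} \int_{\frac{\pi\sqrt n}{6}}^{\infty}e^{-y^2/2}\,\frac{dy}{\sqrt{2n}} \le Ce^{-\frac{\pi^2n}{72}},
\end{aligned} \tag{13.2.16}$$
a bound which is in turn valid for all integers d ≥ 2 and n ≥ 1. To get (i), it suffices to incorporate these estimates into (13.2.14).

We shall now need the following useful estimate. Let a be any positive real. Then,
$$\sum_{H=1}^{\infty}e^{-aH^2} \le \begin{cases}3e^{-a} & \text{if } a\ge1,\\ 3/\sqrt a & \text{if } a<1.\end{cases} \tag{13.2.17}$$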
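Estimate (13.2.17) is easy to probe numerically. In the sketch below the function names, the stopping tolerance and the sample values of a are illustrative assumptions; the series is summed until its terms fall below the tolerance and then compared with the stated bound.

```python
import math


def gaussian_sum(a, tol=1e-18):
    """Sum of exp(-a*H^2) over H = 1, 2, ..., stopped once a term drops below tol."""
    total = 0.0
    h = 1
    while True:
        term = math.exp(-a * h * h)
        total += term
        if term < tol:
            break
        h += 1
    return total


def stated_bound(a):
    """Right-hand side of (13.2.17): 3*exp(-a) if a >= 1, else 3/sqrt(a)."""
    return 3.0 * math.exp(-a) if a >= 1 else 3.0 / math.sqrt(a)


for a in (0.001, 0.1, 0.5, 1.0, 2.0, 10.0):
    print(a, gaussian_sum(a), stated_bound(a))
```

For small a the sum behaves like $\tfrac12\sqrt{\pi/a}$, so the bound $3/\sqrt a$ is off by a constant factor of roughly 3.4, which is harmless for the use made of it here.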

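Indeed
$$\begin{aligned}
\sum_{H=1}^{k}e^{-aH^2} &\le e^{-a}+\sum_{H=2}^{k}e^{-aH^2} \le e^{-a}+\int_1^{\infty}e^{-au^2}\,du\\
&\overset{(u=v/\sqrt{2a})}{=} e^{-a}+\frac{1}{\sqrt{2a}}\int_{\sqrt{2a}}^{\infty}e^{-v^2/2}\,dv
\le e^{-a}+e^{-a}\frac{1}{\sqrt{2a}}\sqrt{\frac\pi2} = e^{-a}\Big(1+\frac{\sqrt\pi}{2\sqrt a}\Big),
\end{aligned}$$
where (see (10.1.3)) we used the elementary bound $\int_x^{\infty}e^{-t^2/2}\,dt\le\sqrt{\frac\pi2}\,e^{-x^2/2}$. Therefore $\sum_{H=1}^{\infty}e^{-aH^2}\le e^{-a}\big(1+\frac{\sqrt\pi}{2\sqrt a}\big)\le3e^{-a}$ if a ≥ 1, and $\sum_{H=1}^{\infty}e^{-aH^2}\le\frac{1}{\sqrt a}\big(1+\frac{\sqrt\pi}{2}\big)\le3/\sqrt a$ if a ≤ 1. Hence (13.2.17). Therefore
$$\Big|\frac{\Theta(d,n)}{d}-\frac1d\Big| \le \frac2d\sum_{j=1}^{\infty}e^{-n\frac{\pi^2j^2}{2d^2}} \le \begin{cases}\dfrac6d\,e^{-\frac{n\pi^2}{2d^2}} & \text{if } d\le\sqrt n,\\[2mm] \dfrac{6\sqrt2}{\pi\sqrt n} & \text{if } \sqrt n\le d\le n.\end{cases}$$
And thus
$$\Big|\mathbf{P}\{d|B_n\}-\frac1d\Big| \le \begin{cases}C\Big((\log n)^{5/2}n^{-3/2}+\dfrac1d\,e^{-\frac{n\pi^2}{2d^2}}\Big) & \text{if } d\le\sqrt n,\\[2mm] \dfrac{C}{\sqrt n} & \text{if } \sqrt n\le d\le n,\end{cases} \tag{13.2.18}$$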

which is (ii). The proof is now complete. √ Remarks. For the critical case n < d ≤ n, by estimate (ii) we have P d|Bn − d1 ≤ √ C √ . But this can be improved in some cases. For instance, if d is such that d > n, n d|n and dn is even, then ∞ 2 1 −y 2 /2 P{d|Bn } = √ e dy + O . √ d π n πd n

√ When d n, this implies that P{d|Bn } = 2π2 n + O d1 , which yields a strictly better bound. We indicate briefly how this can be established. By assumption, 1≤j
e

2j2 2d 2

iπ n dj −n π

=

e

2j2 2d 2

−n π

.

1≤j
Apply the Euler–MacLaurin sum formula of order 1 to f (t) = e−At with A = π2 · dn2 . Then d d d−1 1 −Ad 2 −A 1 1

2 −Aj 2 −At 2 + −2A(t+k)e−A(t+k) dt. e = e dt+ e +e t− 2 2 1 0 2

j =1

j =1


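By using estimate (10.1.3) for the Mills ratio we have, uniformly for $d \le n$,
\[
\int_1^d e^{-At^2}\,dt \overset{t=y/\sqrt{2A}}{=} \frac{d}{\pi\sqrt n}\int_{\pi\sqrt n/d}^{\pi\sqrt n} e^{-y^2/2}\,dy
= \frac{d}{\pi\sqrt n}\int_{\pi\sqrt n/d}^{\infty} e^{-y^2/2}\,dy + O\big(e^{-\frac{\pi^2 n}{2}}\big).
\]
And if $d \gg \sqrt n$, then
\[
\frac{1}{\pi\sqrt n}\int_{\pi\sqrt n/d}^{\infty} e^{-y^2/2}\,dy \sim \frac{1}{\sqrt{2\pi n}}.
\]
Next, $e^{-Ad^2} + e^{-A} = e^{-\frac{\pi^2 n}{2d^2}} + O\big(e^{-\frac{\pi^2 n}{2}}\big)$, and
\[
\Big|\sum_{j=1}^{d-1}\int_0^1 \Big(t - \frac12\Big)\,2A(t+j)\,e^{-A(t+j)^2}\,dt\Big| \le \int_1^d 2Au\,e^{-Au^2}\,du = \Big[-e^{-Au^2}\Big]_1^d
= e^{-\frac{\pi^2 n}{2d^2}} + O\big(e^{-\frac{\pi^2 n}{2}}\big).
\]
Therefore
\[
\sum_{j=1}^{d} e^{-\frac{\pi^2 n j^2}{2d^2}} = \frac{d}{\pi\sqrt n}\int_{\pi\sqrt n/d}^{\infty} e^{-y^2/2}\,dy + O(1),
\]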

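which allows us to conclude our argument. Similarly one can show, if $\sqrt n < d \le n$, that
\[
P\{d|B_n\} = \frac{2}{\pi\sqrt n}\int_{\pi\sqrt n/d}^{\infty}\cos(y\sqrt n)\,e^{-y^2/2}\,dy + O\Big(\frac{\sqrt n}{d}\Big).
\]
By using the Poisson summation formula for Theta functions (see [Huxley: 1972], equality (10.10), p. 42):
\[
\sum_{m\in\mathbb Z} e^{-(m+\delta)^2\pi x^{-1}} = x^{1/2}\sum_{m\in\mathbb Z} e^{2i\pi m\delta - m^2\pi x},
\]
valid for $x$ real and $0 \le \delta \le 1$, we get another convenient estimate. Apply the above formula with $x = \pi n/(2d^2)$, $\delta = n/(2d)$; we get
\[
\sum_{\ell\in\mathbb Z} e^{i\pi\frac{n\ell}{d} - \frac{n\pi^2\ell^2}{2d^2}}
= \frac{d\sqrt2}{\sqrt{\pi n}}\sum_{m\in\mathbb Z} e^{-2(\frac{n}{2d}+m)^2\frac{d^2}{n}}
= \frac{d\sqrt2}{\sqrt{\pi n}}\sum_{\ell\in\mathbb Z} e^{-2(\frac{n}{2d}+\ell)^2\frac{d^2}{n}}.
\]
Thereby
\[
\sup_{2\le d\le n}\Big|P\{d|B_n\} - \sqrt{\frac{2}{\pi n}}\sum_{\ell\in\mathbb Z} e^{-2(\frac{n}{2d}+\ell)^2\frac{d^2}{n}}\Big| = O\big((\log n)^{5/2} n^{-3/2}\big).
\tag{13.2.19}
\]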
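Estimate (13.2.19) can be explored numerically on small cases. The sketch below is an illustration under our own assumptions: B_n is taken to be a Binomial(n, 1/2) sum, the function names are hypothetical, and the series over ℓ is truncated to a window that contains all non-negligible terms.

```python
import math

def prob_divides(d, n):
    """Exact P{d | B_n} for B_n ~ Binomial(n, 1/2)."""
    total = sum(math.comb(n, k) for k in range(0, n + 1, d))
    return total / 2 ** n

def theta_approx(d, n):
    """sqrt(2/(pi n)) * sum_l exp(-2 (n/(2d) + l)^2 d^2 / n), truncated in l."""
    terms = n // d + 2  # window large enough to reach l close to -n/(2d)
    c = math.sqrt(2.0 / (math.pi * n))
    return c * sum(
        math.exp(-2.0 * (n / (2.0 * d) + l) ** 2 * d * d / n)
        for l in range(-terms, terms + 1)
    )

if __name__ == "__main__":
    n = 400
    for d in (2, 3, 7, 20, 50):
        print(d, prob_divides(d, n), theta_approx(d, n))
```

Each term of the truncated series is the Gaussian local-limit approximation of P{B_n = k} at a multiple k of d, which is one way to read why the two columns printed above agree so closely.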

Theorem 13.2.1 allows us to study the small divisors of Bn . According to the usual notation, let P − (Bn ) be the smallest divisor of Bn .


13.2.2 Theorem. There exist a positive real $c$ and constants $C_0$, $\zeta_0$ such that for $n$ large enough and $\zeta_0 \le \zeta \le n^{c/\log\log n}$, we have the estimate
\[
\Big|P\{P^-(B_n) > \zeta\} - \frac{e^{-\gamma}}{\log\zeta}\Big| \le \frac{C_0}{\log^2\zeta},
\]
where $\gamma$ is Euler's constant.

The proof is based on a uniform estimate for the value distribution of Bernoulli sums.

Proof of Theorem 13.2.2. Fix some $0 < \theta < 1/2$ and some integer $D \ge 2$ such that $2\log 2D - \big(2 + \frac1D\big) \ge 1$. Let $3 \le Y \le n$. Choose
\[
K = \frac{2D\log\log Y}{\theta} + 1.
\tag{13.2.20}
\]
Later it will be necessary to have $K \ge e^7$, which is fulfilled for, say, $Y \ge Y_0$, with $Y_0$ depending on $\theta$ and $D$ only; so we in turn consider $Y_0 \le Y \le n$. Set also
\[
m = D\log\log Y.
\tag{13.2.21}
\]
Then
\[
\frac{2m}{K} \le \frac{2D\log\log Y}{\frac{2D}{\theta}\log\log Y} = \theta.
\]
And so
\[
Y^{\frac{2m}{K}} \le n^\theta.
\tag{13.2.22}
\]
Let
\[
p_1 < p_2 < \dots < p_H
\tag{13.2.23}
\]
be the sequence of primes less than or equal to $Y^{1/K}$.

Now recall for our purpose Poincaré's identity and inequalities (Rényi [1970: p. 66]): Let $(\Omega, \mathcal A, P)$ be some probability space and $A_1, \dots, A_n$ be arbitrary events of $\mathcal A$. Define for $k = 1, \dots, n$ the sums
\[
S_k = \sum_{1\le i_1<\dots<i_k\le n} P\{A_{i_1}\cap\dots\cap A_{i_k}\},
\]
where the summation is extended over all $k$-tuples of different integers that can be selected from the integers $1, \dots, n$. Then we have the Poincaré identity (or inclusion–exclusion formula)
\[
P\Big(\bigcup_{k=1}^n A_k\Big) = \sum_{k=1}^n (-1)^{k-1} S_k.
\]
The Poincaré inequalities are as follows:
\[
\sum_{k=1}^{2m} (-1)^{k-1} S_k \le P\Big(\bigcup_{k=1}^n A_k\Big) \le \sum_{k=1}^{2m-1} (-1)^{k-1} S_k, \qquad m = 1, 2, \dots.
\]
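The Poincaré identity and inequalities can be checked exactly on a small finite probability space. The following sketch is only an illustration: the uniform sample space, the random events and the helper name `bonferroni_sums` are assumptions of ours, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
import itertools
import random
from fractions import Fraction

def bonferroni_sums(events, omega_size):
    """Return [S_1, ..., S_n] for events given as subsets of range(omega_size)."""
    sums = []
    for k in range(1, len(events) + 1):
        s = Fraction(0)
        for combo in itertools.combinations(events, k):
            s += Fraction(len(set.intersection(*combo)), omega_size)
        sums.append(s)
    return sums

random.seed(0)
omega_size = 40
events = [set(random.sample(range(omega_size), 12)) for _ in range(6)]

S = bonferroni_sums(events, omega_size)
p_union = Fraction(len(set.union(*events)), omega_size)
# partial[j] = sum_{k=1}^{j+1} (-1)^(k-1) S_k
partial = list(itertools.accumulate((-1) ** k * s for k, s in enumerate(S)))

if __name__ == "__main__":
    print("P(union) =", p_union)
    for j, value in enumerate(partial, start=1):
        side = "lower bound" if j % 2 == 0 else "upper bound"
        print(f"{j} terms: {value} ({side})")
```

Partial sums with an even number of terms bound the probability of the union from below, those with an odd number of terms bound it from above, and the full sum recovers it exactly.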


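Thus by using Poincaré's inequality, we get
\[
P\{\exists\, 1\le i\le H : p_i|B_n\} \le \sum_{k=1}^{2m-1} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H} P\{p_{i_1}\cdots p_{i_k}|B_n\}.
\tag{13.2.24}
\]
To estimate the probability $P\{p_{i_1}\cdots p_{i_k}|B_n\}$ we use Theorem 13.2.1. Let $0 < \varepsilon < 1$ be fixed. In view of (iv), and since (13.2.22) implies $p_{i_1}\cdots p_{i_k} < \frac{\pi}{\sqrt2}\,n^\theta$ for $1\le i_1<\dots<i_k\le H$, $k \le 2m-1$, there exists some positive real $\vartheta$ depending on $\theta$ only, such that for $n$ large enough,
\[
\Big|P\{p_{i_1}\cdots p_{i_k}|B_n\} - \frac{1}{p_{i_1}\cdots p_{i_k}}\Big| \le e^{-n^\vartheta}.
\tag{13.2.25}
\]
From (13.2.24) and (13.2.25) follows
\[
P\{\exists\, 1\le i\le H : p_i|B_n\} \le \sum_{k=1}^{2m-1} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}} + T,
\tag{13.2.26}
\]
where
\[
|T| \le e^{-n^\vartheta}\sum_{k=1}^{2m-1}\ \sum_{1\le i_1<\dots<i_k\le H} 1.
\tag{13.2.27}
\]
Letting $\xi_1, \dots, \xi_H$ be a sequence of independent random variables taking only the values 0 or 1 and such that $P\{\xi_i = 1\} = \frac{1}{p_i}$, we see from Poincaré's equality that
\[
P\{\exists\, i : 1\le i\le H,\ \xi_i = 1\} = \sum_{k=1}^{H} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}} = 1 - \prod_{1\le i\le H}\Big(1 - \frac{1}{p_i}\Big),
\tag{13.2.28}
\]
and so
\[
\sum_{k=1}^{2m-1} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}} = 1 - \prod_{1\le i\le H}\Big(1 - \frac{1}{p_i}\Big) + S,
\tag{13.2.29}
\]
with
\[
|S| \le \sum_{k\ge 2m}\ \sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}}.
\tag{13.2.30}
\]
Thereby,
\[
P\{P^-(B_n) \le Y^{1/K}\} \le 1 - \prod_{1\le i\le H}\Big(1 - \frac{1}{p_i}\Big) + S + T.
\tag{13.2.31}
\]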


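As for any positive reals $b_1, \dots, b_H$,
\[
\Big(\sum_{i=1}^H b_i\Big)^k = \sum_{1\le i_1,\dots,i_k\le H} b_{i_1}\cdots b_{i_k} \ge k!\sum_{1\le i_1<\dots<i_k\le H} b_{i_1}\cdots b_{i_k},
\tag{13.2.32}
\]
we have
\[
\sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}} \le \frac{1}{k!}\Big(\sum_{i\le H}\frac{1}{p_i}\Big)^k.
\tag{13.2.33}
\]
And since (for any natural $r$, any complex number $y$) we have the elementary bound
$\big|e^y - \big(1 + \frac{y}{1!} + \dots + \frac{y^r}{r!}\big)\big| \le \frac{|y|^{r+1}}{(r+1)!}\,e^{|y|}$, we deduce
\[
|S| \le \sum_{k\ge 2m}\frac{1}{k!}\Big(\sum_{i\le H}\frac{1}{p_i}\Big)^k \le \frac{\big(\sum_{i\le H}\frac{1}{p_i}\big)^{2m}}{(2m)!}\,e^{\sum_{i\le H}\frac{1}{p_i}}.
\tag{13.2.34}
\]
Recall now the useful estimate $\sum_{p\le x}\frac1p \le \log\log x + 6$, valid for any $x \ge 3$ (see [Tenenbaum: 1990], p. 17). Thus
\[
|S| \le \frac{(\log\log Y^{1/K} + 6)^{2m}}{(2m)!}\,e^{(\log\log Y^{1/K} + 6)}.
\]
But $K \ge e^7$ implies
\[
\log\log Y^{1/K} + 6 \le \log\frac{\log Y}{e^7} + 6 = \log\log Y - 1 = \frac{D\log\log Y - D}{D} \le \frac{D\log\log Y - 1}{D} \le \frac{m}{D},
\]
so that
\[
|S| \le \Big(\frac{m}{D}\Big)^{2m}\frac{1}{(2m)!}\,e^{(\log\log Y^{1/K} + 6)}.
\tag{13.2.35}
\]
Using then the Cesàro–Buchner estimate (Mitrinović [1970], inequality (6), p. 183), we deduce from (13.2.35) and the definition of $m$ that
\[
\begin{aligned}
|S| &\le \Big(\frac{m}{D}\Big)^{2m}\frac{1}{(2m)^{2m}}\,\frac{1}{2\sqrt{\pi m}}\,e^{2m + \log\log Y^{1/K} + 6}
\le \Big(\frac{1}{2D}\Big)^{2m}\frac{1}{2\sqrt{\pi m}}\,e^{2m + \log\log Y + 6}\\
&\le \Big(\frac{1}{2D}\Big)^{2m}\frac{1}{2\sqrt{\pi m}}\,e^{m(2 + \frac1D) + 7}
= \frac{e^7}{2\sqrt{\pi m}}\,e^{-m[2\log 2D - (2 + \frac1D)]}\\
&\le Ce^{-m} \le Ce^{-D\log\log Y} = C\Big(\frac{1}{\log Y}\Big)^D.
\end{aligned}
\tag{13.2.36}
\]
According to Mertens' formula (see [Tenenbaum: 1990], p. 18 for instance), for any $x \ge 2$,
\[
\prod_{p\le x}\Big(1 - \frac1p\Big) = \frac{e^{-\gamma}}{\log x}\Big(1 + O\Big(\frac{1}{\log x}\Big)\Big).
\tag{13.2.37}
\]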
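Both number-theoretic inputs used here, the bound on the sum of reciprocals of primes and Mertens' formula (13.2.37), can be observed numerically. The sketch below is only illustrative: the sieve helper, the function names and the hard-coded value of Euler's constant are our own choices, not part of the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the sketch

def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes p <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def mertens_ratio(x):
    """prod_{p<=x}(1 - 1/p) * e^gamma * log x, expected to be close to 1."""
    prod = 1.0
    for p in primes_up_to(x):
        prod *= 1.0 - 1.0 / p
    return prod * math.exp(EULER_GAMMA) * math.log(x)

def reciprocal_prime_sum(x):
    """sum_{p<=x} 1/p."""
    return sum(1.0 / p for p in primes_up_to(x))

if __name__ == "__main__":
    for x in (10, 1000, 10**5):
        print(x, mertens_ratio(x), reciprocal_prime_sum(x), math.log(math.log(x)) + 6)
```

The printed ratio approaches 1 as x grows, and the reciprocal sum stays well below log log x + 6, which shows how much slack the constant 6 leaves.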


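Thus for some universal constant $C$,
\[
\prod_{1\le i\le H}\Big(1 - \frac{1}{p_i}\Big) = \prod_{p\le Y^{1/K}}\Big(1 - \frac1p\Big) \ge \frac{Ke^{-\gamma}}{\log Y}\Big(1 - \frac{CK}{\log Y}\Big).
\tag{13.2.38}
\]
We now estimate $T$. This term has a small order. Using (13.2.32) we get
\[
\sum_{1\le i_1<\dots<i_k\le H} 1 \le \frac{H^k}{k!},
\]
and so
\[
|T| \le e^{-n^\vartheta}\sum_{k=1}^{2m-1}\frac{H^k}{k!} \le e^{-n^\vartheta}H^{2m} \le e^{-n^\vartheta}\,Y^{2m/K} \le e^{-n^\vartheta}\,n^\theta \le e^{-n^\vartheta/2},
\tag{13.2.39}
\]
for $n$ large enough. By inserting this estimate as well as the preceding estimates (13.2.36) and (13.2.38) into (13.2.31), we arrive at
\[
P\{P^-(B_n) \le Y^{1/K}\} \le 1 - \frac{Ke^{-\gamma}}{\log Y} + C\Big[\Big(\frac{K}{\log Y}\Big)^2 + \Big(\frac{1}{\log Y}\Big)^D\Big] + e^{-n^\vartheta/2}.
\tag{13.2.40}
\]
Finally, using the definition of $K$ and, since $D \ge 2$, for any $n$ large enough,
\[
P\{P^-(B_n) \le Y^{1/K}\} \le 1 - \frac{2De^{-\gamma}\log\log Y}{\theta\log Y} + C_0\Big(\frac{2De^{-\gamma}\log\log Y}{\theta\log Y}\Big)^2,
\]
where $C_0$ is an absolute constant (one can take $C_0 = 2C$ where $C$ is the constant in (13.2.8)). Or else,
\[
P\Big\{P^-(B_n) > e^{\frac{\theta\log Y}{2D\log\log Y}}\Big\} \ge \frac{2De^{-\gamma}\log\log Y}{\theta\log Y} - C_0\Big(\frac{2De^{-\gamma}\log\log Y}{\theta\log Y}\Big)^2.
\tag{13.2.41}
\]
Put
\[
\zeta = e^{\frac{\theta\log Y}{2D\log\log Y}}.
\]
Then $\zeta \le e^{\frac{\theta\log n}{2D\log\log n}} := n^{c/\log\log n}$ and (13.2.41) means
\[
P\{P^-(B_n) > \zeta\} \ge \frac{e^{-\gamma}}{\log\zeta} - \frac{C_0}{\log^2\zeta}.
\tag{13.2.42}
\]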

But we have also prepared arguments to prove the upper bound part. By using again Poincaré’s inequalities, next (13.2.5), (13.2.13), (13.2.17), we get for any integer


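$n$ large enough,
\[
\begin{aligned}
P\{\exists\, p\le Y^{1/K} : p|B_n\} &\ge \sum_{k=1}^{2m} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H} P\{p_{i_1}\cdots p_{i_k}|B_n\}\\
&\ge \sum_{k=1}^{2m} (-1)^{k-1}\sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}} - T'\\
&\ge 1 - \prod_{1\le i\le H}\Big(1 - \frac{1}{p_i}\Big) - S' - T'\\
&\ge 1 - \frac{Ke^{-\gamma}}{\log Y}\Big(1 + \frac{CK}{\log Y}\Big) - S' - T',
\end{aligned}
\tag{13.2.43}
\]
where
\[
|S'| \le \sum_{k\ge 2m+1}\ \sum_{1\le i_1<\dots<i_k\le H}\frac{1}{p_{i_1}\cdots p_{i_k}}, \qquad
|T'| \le e^{-n^\vartheta}\sum_{k=1}^{2m}\ \sum_{1\le i_1<\dots<i_k\le H} 1.
\tag{13.2.44}
\]
As in (13.2.36) we get $|S'| \le C\big(\frac{1}{\log Y}\big)^D$, whereas in the same manner as we got (13.2.39) we get $|T'| \le e^{-n^\vartheta/2}$. Inserting this into (13.2.43) also provides the following estimate:
\[
P\{\exists\, p\le Y^{1/K} : p|B_n\} \ge 1 - \frac{Ke^{-\gamma}}{\log Y}\Big(1 + \frac{CK}{\log Y}\Big) - \Big(\frac{K}{\log Y}\Big)^D - e^{-n^\vartheta/2}.
\]
In view of the definition of $K$, for any $n$ large enough
\[
P\Big\{P^-(B_n) \le e^{\frac{\theta\log Y}{2D\log\log Y}}\Big\} \ge P\{\exists\, p\le Y^{1/K} : p|B_n\}
\ge 1 - \frac{2De^{-\gamma}\log\log Y}{\theta\log Y} - C_0\Big(\frac{2De^{-\gamma}\log\log Y}{\theta\log Y}\Big)^2.
\tag{13.2.45}
\]
Thereby
\[
P\Big\{P^-(B_n) > e^{\frac{\theta\log Y}{2D\log\log Y}}\Big\} \le \frac{2De^{-\gamma}\log\log Y}{\theta\log Y} + C_0\Big(\frac{2De^{-\gamma}\log\log Y}{\theta\log Y}\Big)^2
\]
or
\[
P\{P^-(B_n) > \zeta\} \le \frac{e^{-\gamma}}{\log\zeta} + \frac{C_0}{\log^2\zeta}.
\tag{13.2.46}
\]
The proof is achieved by combining (13.2.42) with (13.2.46).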


13.2.3 Remarks. 1. Jordan has generalized the Poincaré identity as follows: let $A_1, \dots, A_n$ be arbitrary events from a probability space $(\Omega, \mathcal A, P)$ and set
\[
W_{n,r} = P\{\text{there exist exactly } r \text{ events occurring among } A_1, \dots, A_n\}.
\]
Then we have (Jordan's identity): $W_{n,r} = \sum_{k=0}^{n-r} (-1)^k C_{r+k}^{k} S_{r+k}$, where the $S_k$ ($1\le k\le n$) are as before and $S_0 = 1$.

2. The above proof, and more precisely estimate (13.2.40), also shows that
\[
P\{B_n \text{ is a prime} \ge n^{c/\log\log n}\} \le \frac{e^{-\gamma}\,c\log\log n}{\log n} + C_0\Big(\frac{\log\log n}{\log n}\Big)^2.
\tag{13.2.47}
\]
This is certainly a good estimate except for the factor $c\log\log n$, which seems superfluous.

3. Return to Poincaré's identity. An arithmetic meaning can be given to the factor $(-1)^k$; we can re-label our events as follows: $A_{p_1}, \dots, A_{p_n}$ where $p_1 < \dots < p_n$ is a given sequence of primes. Set for $d = p_1^{a_1}\cdots p_n^{a_n}$, $a_i \ge 0$, $1\le i\le n$,
\[
A_d = A_{p_1}^{a_1}\cap\dots\cap A_{p_n}^{a_n}, \qquad\text{where } A^\varepsilon =
\begin{cases}
A & \text{if } \varepsilon \ge 1,\\
\Omega & \text{if } \varepsilon = 0.
\end{cases}
\]
Then
\[
P\Big(\bigcup_{k=1}^n A_k\Big) = -\sum_{\substack{d = p_1^{a_1}\cdots p_n^{a_n}\\ a_i = 0 \text{ or } 1}} \mu(d)P(A_d).
\tag{13.2.49}
\]
Here $\mu$ denotes the Möbius function. Let $p_1 < \dots < p_{\pi(n)}$ be the sequence of consecutive primes less than or equal to $n$, and consider the sets $A_i = \{p_i|B_n\}$, $1\le i\le\pi(n)$. Then
\[
1 - P\{B_n \text{ is prime}\} = -\sum_{d\le n}\mu(d)P\{d|B_n\}.
\tag{13.2.50}
\]

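Problem 19. Find an estimate of $P\{B_n \text{ is prime}\}$. Estimating this probability from below is a deep question, having a direct connection with the problem of estimating gaps between primes. A way to see this goes as follows: first observe that there is some absolute constant $A > 0$ such that $E\,e^{A\frac{(B_n - n/2)^2}{n}} \le 2$. Next
\[
\begin{aligned}
P\{B_n \text{ is prime}\} &\le P\Big\{B_n \text{ is prime and } \Big|B_n - \frac n2\Big| \le \sqrt{n\log\log n}\Big\} + P\Big\{\frac{|B_n - \frac n2|}{\sqrt n} > \sqrt{\log\log n}\Big\}\\
&\le P\Big\{B_n \text{ is prime and } \Big|B_n - \frac n2\Big| \le \sqrt{n\log\log n}\Big\} + e^{-A\log\log n}\,E\,e^{A\frac{(B_n - n/2)^2}{n}}\\
&\le P\Big\{B_n \text{ is prime and } \Big|B_n - \frac n2\Big| \le \sqrt{n\log\log n}\Big\} + \frac{2}{(\log n)^A}.
\end{aligned}
\]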


Now assume that for some $B < A$, $P\{B_n \text{ is prime}\} \ge (\log n)^{-B}$ for all $n$ large enough. Then $P\big\{B_n \text{ is prime and } |B_n - \frac n2| \le \sqrt{n\log\log n}\big\} > 0$ for all $n$ large. This means that for all $n$ sufficiently large, there is a prime $p$ such that
\[
\frac n2 - \sqrt{n\log\log n} \le p \le \frac n2 + \sqrt{n\log\log n}.
\]
Let $p_k$ be the $k$-th consecutive prime. Recall that $p_k \sim k\log k$, as $k\to\infty$. Choose $n = n_k$ such that $\frac{n_k}{2} - \sqrt{n_k\log\log n_k} \sim p_k$. Then there is a prime between $p_k$ and $p_k + 2\sqrt{n_k\log\log n_k} \sim p_k + 2\sqrt{k\log k\,\log\log k}$. This means that the gap function $g(k) = p_{k+1} - p_k$ satisfies
\[
g(k) = O\big(\sqrt{k\log k\,\log\log k}\big).
\]
Recall, according to a result of Cramér [1936], that the validity of the Riemann Hypothesis would imply
\[
g(k) = O\big(k^{1/2}(\log k)^{3/2}\big),
\]
and that the well-known Cramér's conjecture states that $\limsup_{k\to\infty}\frac{p_{k+1} - p_k}{(\log p_k)^2} = 1$.

Problem 20. Find an estimate of $P\{R_n \text{ is prime}\}$ where $R_n$ is a sum of independent Rademacher random variables. This task is easier: there are absolute constants $C_1$ and $C_2$ such that
\[
\frac{C_1}{\log n} \le P\{R_n \text{ is prime}\} \le \frac{C_2}{\log n}
\]
for any $n \ge 3$.

Hint. Let $I_k(n) = [-2^k\sqrt n, 2^k\sqrt n]$, $k = 0, 1, \dots$. Use the partitioning of the set $\mathcal P$ of all prime numbers
\[
J_k(n) =
\begin{cases}
\mathcal P\cap I_0(n), & k = 0,\\
\mathcal P\cap\big(I_k(n)\setminus I_{k-1}(n)\big), & k = 1, \dots,
\end{cases}
\]
to write
\[
P\{R_n \text{ is prime}\} = \sum_{k=0}^\infty P\{R_n \in J_k(n)\}.
\]
Next apply the local limit theorem and the prime number theorem to get that
\[
\frac{4}{\sqrt{2e\pi}\,\log n} - \frac{C_\varepsilon}{n^{1/2-\varepsilon}} \le P\{R_n \text{ is prime}\} \le \sqrt{\frac{2}{\pi}}\;\frac{1}{\log n}\sum_{k=1}^\infty 2^k e^{-2^{2k-1}} + \frac{C_\varepsilon}{n^{1/2-\varepsilon}}.
\]
The lower part above follows from $P\{R_n \in J_0(n)\}$ which is easily seen, as observed by Jean-Marc Deshouillers, to be of order $\frac{C}{\log n}$.
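For small n the probability in Problem 20 can be computed exactly, which gives a feel for the order 1/log n. The sketch below rests on assumptions of ours: R_n is realized as 2B_n − n with B_n a Binomial(n, 1/2) variable, only positive primes are counted, and the function names are hypothetical. Since R_n has the same parity as n, only the value 2 can be prime when n is even, so the experiment is run on odd n.

```python
import math

def is_prime(k):
    """Trial-division primality test; negative numbers, 0 and 1 are not prime."""
    return k >= 2 and all(k % q for q in range(2, math.isqrt(k) + 1))

def prob_rademacher_sum_prime(n):
    """Exact P{R_n is prime} with R_n = 2*B_n - n, B_n ~ Binomial(n, 1/2)."""
    total = sum(math.comb(n, k) for k in range(n + 1) if is_prime(2 * k - n))
    return total / 2 ** n

if __name__ == "__main__":
    for n in (101, 1001, 5001):
        p = prob_rademacher_sum_prime(n)
        print(n, p, p * math.log(n))
```

The last printed column, (log n)·P{R_n is prime}, stays of order one along odd n, which is at least compatible with the two-sided bound stated in the problem; it does not settle whether the limit exists.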


Does the limit
\[
\lim_{n\to\infty}(\log n)\,P\{R_n \text{ is prime}\}
\]
exist? Let $m > n$. What can be said about the probability $P\{R_n \text{ and } R_m \text{ primes}\}$? One can use Siebert's estimate, in which $\varphi$ is the Euler totient function (see [Siebert: 1976]),
\[
\#\{p\le x : p,\ p+v \text{ are primes}\} \le \frac{8x}{(\log x)^2}\,\frac{|v|}{\varphi(|v|)},
\]
to show that
\[
P\{R_n \text{ and } R_m \text{ primes}\} \le \frac{C}{(\log n)^2}\,E\Big[\frac{|R_{m-n}|}{\varphi(|R_{m-n}|)}\,\chi\{|R_{m-n}| \ge 1\}\Big].
\]
This bound is satisfactory when $m$ is close to $n$.

13.3 An LIL for arithmetic functions

Let $f(n)$, $n = 1, 2, \dots$ be a strongly additive arithmetic function, namely a function $f$ satisfying
\[
f(mn) = f(m) + f(n), \qquad (m, n) = 1,
\tag{13.3.1a}
\]
\[
f(p^\alpha) = f(p), \qquad p \text{ a prime},\ \alpha = 2, 3, \dots.
\tag{13.3.1b}
\]
It follows that $f(n) = \sum_p f(p)\chi(p|n)$, so that $f$ is completely determined by its values taken over the primes. A typical example is the prime divisor function $\omega(n)$, namely the number of different prime factors of $n$. These assumptions have some consequences on the growth of $f$. Erdös [1946] indeed established that a real-valued function $f$ on the integers, satisfying (13.3.1a) and such that
\[
f(n+1) \ge f(n)\ (n\ge 1) \quad\text{or}\quad \lim_{n\to\infty}\big(f(n+1) - f(n)\big) = 0,
\]
must be of the form
\[
f(n) = C\log n \qquad (C \text{ constant}).
\]
This gives, by the way, a remarkable characterization of the logarithmic function (up to a multiplicative constant corresponding to the arbitrary choice of the base). A second necessary comment concerns assumption (13.3.1b). Without it, $f$ is no longer characterized by its values taken over the primes; a typical example is the function below, studied by Rényi [1955],
\[
f(n) = (\alpha_1 + \dots + \alpha_s) - s, \qquad\text{if } n = p_1^{\alpha_1}\cdots p_s^{\alpha_s}.
\]
Then (13.3.1a) is fulfilled and we have $f(p) = 0$, however $f \not\equiv 0$.
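The difference between a strongly additive function and a merely additive one is easy to see on examples. The sketch below is illustrative only, with helper names of our choosing: it implements ω(n) and Rényi's function through a trial-division factorization and evaluates a strongly additive function from its values on the primes.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 1 (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def omega(n):
    """Number of distinct prime factors: strongly additive."""
    return len(prime_factorization(n))

def renyi(n):
    """(alpha_1 + ... + alpha_s) - s: additive, vanishes on primes, not identically 0."""
    fac = prime_factorization(n)
    return sum(fac.values()) - len(fac)

def strongly_additive(f_on_primes, n):
    """f(n) = sum over primes p | n of f(p), for a given function on the primes."""
    return sum(f_on_primes(p) for p in prime_factorization(n))

if __name__ == "__main__":
    for n in (2, 4, 12, 30, 72):
        print(n, omega(n), renyi(n), strongly_additive(lambda p: 1, n))
```

For instance `renyi(4) = 1` while `renyi(2) = 0`, so (13.3.1b) fails for Rényi's function even though it is additive on coprime arguments.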


Now put
\[
A_n = \sum_{p<n}\frac{f(p)}{p}, \qquad B_n = \sum_{p<n}\frac{f^2(p)}{p}.
\tag{13.3.2}
\]

In this section, our purpose will be to show that the randomization described in Section 13.1, notably estimate (13.1.6), can be successfully applied to establish a weighted strong law of large numbers and the corresponding weighted law of the iterated logarithm, with weights given by an additive function $f$, under natural conditions on $f$. Before going further, recall some classical results. By the central limit theorem of Erdös and Kac [1940], if $|f(p)| = O(1)$ and $B_n \to \infty$, we have
\[
\lim_{N\to\infty}\frac1N\,\#\{n\le N : f(n) \le A_N + xB_N^{1/2}\} = (2\pi)^{-1/2}\int_{-\infty}^x e^{-u^2/2}\,du.
\tag{13.3.3}
\]
The same conclusion (Kubilius [1964], Shapiro [1956]) holds for unbounded $f(p)$, provided
\[
f(p) = o(B_p^{1/2}), \qquad B_p \to \infty.
\tag{13.3.4}
\]
Condition (13.3.4) is the natural asymptotic negligibility condition assumed in most central limit theorems and is best possible: Halberstam [1955] proved that if the $o$ is replaced by $O$ in (13.3.4), relation (13.3.3) becomes generally false. For additional limit theorems related to (13.3.3), we may refer to Kubilius [1964], and for an alternative approach via the theory of mixing random variables to Philipp [1971]. Finally, concerning results without assumption (13.3.1b), let us quote Schoenberg's theorem [1936], asserting that if $f$ satisfies (13.3.1a) and is such that the series
\[
\sum_p \frac{\min(1, |f(p)|)}{p}
\]
converges, then $f$ has an asymptotic distribution function
\[
\sigma(x) = \lim_{N\to\infty}\frac{\#\{n\le N : f(n)\le x\}}{N}, \qquad x\in\mathbb R,
\]
having the characteristic function
\[
\int_{\mathbb R} e^{itx}\,\sigma(dx) = \prod_p\Big(1 - \frac1p\Big)\Big(1 + \frac1p e^{itf(p)} + \frac{1}{p^2}e^{itf(p^2)} + \frac{1}{p^3}e^{itf(p^3)} + \dots\Big).
\]
Note that the series above obviously converges in the case of the function considered by Rényi since $f(p) \equiv 0$, and thereby contains the earlier result in [Rényi: 1955].

We shall show that under condition (13.3.4) the sums
\[
\sum_{n\le N} f(n)X_n
\]
satisfy the LIL for any centered i.i.d. sequence $X_n$ with finite variances and the strong law of large numbers for any i.i.d. sequence $X_n$ with finite mean. This result establishes a connection between two different types of probabilistic behavior of arithmetic functions, namely the "density" type distribution result (13.3.3) and the almost sure asymptotic behavior of $\sum_{n=1}^N f(n)X_n$ with random $X_n$. As will be clear from the proofs, the key arithmetic property behind these results is a bound for the frequency of large values of $f$, and in fact a byproduct of our argument will be a large deviation result corresponding to (13.3.3).

13.3.1 Theorem. Assume that $f \ge 0$ and condition (13.3.4) is satisfied. Then for any sequence $X, X_1, X_2, \dots$ of centered, independent, identically distributed, integrable random variables we have
\[
\lim_{N\to\infty}\frac{\sum_{n=1}^N f(n)X_n}{\sum_{n=1}^N f(n)} \overset{a.s.}{=} E\,X.
\]

13.3.2 Theorem. Under the conditions of Theorem 13.3.1, for any sequence $X, X_1, X_2, \dots$ of centered, independent, identically distributed, square integrable random variables,
\[
\limsup_{N\to\infty}\frac{\sum_{n=1}^N f(n)X_n}{A_N\sqrt{2N\log\log N}} \overset{a.s.}{=} \|X\|_2.
\]

By the second relation of (13.3.4), $f$ cannot be identically 0 and thus the denominators in the above theorems are positive for sufficiently large $N$. In view of the law of large numbers of Jamison–Orey and Pruitt (Theorem 4.8.1), for the proof of Theorem 13.3.1 it would suffice to verify the arithmetical condition
\[
\limsup_{t\to\infty}\frac{\#\{N : \sum_{n=1}^N f(n) \le tf(N)\}}{t} < \infty.
\tag{13.3.5}
\]
Conversely, the validity of Theorem 13.3.1 implies (13.3.5). However, it does not seem that a direct argument for (13.3.5) should be easy to find; instead we will use a suitable randomization of the function $f(n)$ in Theorems 13.3.1 and 13.3.2 and will obtain the theorems through studying the randomized function, on the basis of the approach described in Section 13.1.

As we will see, (13.3.4) implies $B_n = o(A_n^2)$, and thus (13.3.3) describes the distribution of $f(n)$ in a short interval $(A_N(1 - o(1)), A_N(1 + o(1)))$ around $A_N$. On the other hand, relation (13.3.5) can be equivalently written as $\#\{N : f(N) \ge cNA_N\} \ll 1/c$ (see Lemma 13.3.3 below) and thus (13.3.5) is a large deviation result corresponding to (13.3.3). The theorems show the interesting fact that such a result is valid under the same condition (13.3.4) required for the validity of the weak limit theorem (13.3.3). Before passing to the proofs, some preparatory lemmas are necessary.


13.3.3 Lemma. Under (13.3.4) we have
\[
\sum_{m=1}^n f(m) \sim nA_n, \qquad \sum_{m=1}^n f^2(m) \sim nA_n^2.
\tag{13.3.6}
\]
To see this, we note first that $B_n$ is nondecreasing and thus (13.3.4) implies
\[
\max_{p<n} f(p) = o(B_n^{1/2}),
\]
whence
\[
B_n = \sum_{p<n}\frac{f^2(p)}{p} \le \max_{p<n} f(p)\sum_{p<n}\frac{f(p)}{p} = o(B_n^{1/2}A_n)
\]
and consequently
\[
B_n = o(A_n^2).
\tag{13.3.7}
\]
Next we observe that
\[
\sum_{p<n}\frac1p = \log\log n + c_0 + o(1)
\]
for some constant $c_0$ and thus
\[
\sum_{n^\alpha\le p<n}\frac1p = O(1)
\]
for any $0 < \alpha < 1$. Hence using (13.3.4) we get
\[
B_n - B_{[n^\alpha]} \le \max_{n^\alpha\le p<n} f^2(p)\sum_{n^\alpha\le p<n}\frac1p = o(B_n).
\]
Thus by a well-known result of Kubilius (see Theorem 12.1 of Elliott [1980]) we get
\[
\sum_{m=1}^n (f(m) - A_n) = o(nB_n^{1/2}), \qquad \sum_{m=1}^n (f(m) - A_n)^2 \sim nB_n,
\]
whence (13.3.6) follows in view of (13.3.7).

The next lemma is Jamison–Orey–Pruitt's characterization of the weighted strong law of large numbers in $L^1$, which we recall for the reader's convenience and for some crucial remarks.

13.3.4 Lemma. Let $\{w_k, k\ge 1\}$ be a sequence of positive reals and put $W_n = \sum_{k=1}^n w_k$. Then the relation
\[
\lim_{N\to\infty}\frac{\sum_{n=1}^N w_nX_n}{\sum_{n=1}^N w_n} \overset{a.s.}{=} E\,X
\tag{13.3.8}
\]
holds for all sequences $X = \{X, X_1, X_2, \dots\}$ of centered, independent, identically distributed, integrable random variables if and only if
\[
\limsup_{t\to\infty}\frac{N(t)}{t} < \infty,
\tag{13.3.9}
\]
where $N(t) = \#\{n : W_n \le tw_n\}$.

Although (13.3.9) characterizes the weighted strong law of large numbers, we shall need a less elegant, but more adapted form of it. According to Jamison, Orey and Pruitt [1965: Theorem 2] and the remark at the bottom of p. 41, the condition
\[
E\,X^2\int_{y\ge|X|}\frac{N(y)}{y^3}\,dy < \infty
\tag{13.3.10}
\]
and $E|X| < \infty$ imply (13.3.8), hence (13.3.9). And by the first half of the proof of their Theorem 3, p. 42, relation (13.3.9) and $E|X| < \infty$ imply (13.3.10); hence:

13.3.5 Lemma. Under the assumption $E|X| < \infty$, (13.3.9) and (13.3.10) are equivalent.

This observation will be crucial in the sequel. A direct application of the original characterization (13.3.9) only allows us to establish the results in the spaces $L\log^\varepsilon L$, $\varepsilon$ arbitrarily small but strictly positive. In the course of the proof, we will also need in a crucial way Lemma 11.8.7. Finally, for proving Theorem 13.3.2, we shall need a weighted version of the usual LIL. The following result is implicit in Fisher [1992] (Corollary 3.4 and lines 8–11, p. 178).

13.3.6 Lemma. Let $\{w_k, k\ge 1\}$ be a sequence of positive reals and put $T_n = \sum_{k=1}^n w_k^2$. Assume that
\[
\limsup_{t\to\infty}\frac1t\,\#\{n : T_n \le tw_n^2\} < \infty.
\tag{13.3.11}
\]
Then, for any sequence $X = \{X, X_1, X_2, \dots\}$ of centered, independent, identically distributed, square integrable random variables we have
\[
\limsup_{N\to\infty}\frac{\sum_{n=1}^N w_nX_n}{\sqrt{2T_N\log\log T_N}} \overset{a.s.}{=} \|X\|_2.
\tag{13.3.12}
\]
One recognizes in (13.3.11) condition (13.3.9) for the weights $w_n^2$, and in view of Lemma 13.3.5 it follows that under $E\,X^2 < \infty$, (13.3.11) is equivalent to
\[
E\,X^4\int_{y\ge X^2}\frac{L(y)}{y^3}\,dy < \infty,
\tag{13.3.13}
\]
where $L(y) = \#\{n : T_n \le yw_n^2\}$.


Proof of Theorem 13.3.1. We note that, by (13.3.4) and (13.3.7),
\[
A_n - A_{[n/2]} \le \max_{n/2\le p<n} f(p)\sum_{n/2\le p<n}\frac1p = o(B_n^{1/2})\,O(1) = o(A_n),
\tag{13.3.14}
\]
which shows that $A_n$ is slowly varying. Assume now $E|X| < \infty$ and put
\[
L(t) = \#\Big\{n : \sum_{k=1}^n f(k) \le tf(n)\Big\}.
\]
According to Lemmas 13.3.4 and 13.3.5, in order to prove Theorem 13.3.1, it suffices to prove
\[
E\,X^2\int_{y\ge|X|}\frac{L(y)}{y^3}\,dy < \infty.
\tag{13.3.15}
\]
To establish (13.3.15), we use the probabilistic argument mentioned in Section 13.1. Put
\[
f_1(n) = \sum_{d_0\le p\le n^{1/4}} f(p)\chi(p|n),
\]
where $d_0$ is the same constant as in Lemma 11.8.7. Using (13.3.4) we get
\[
|f(n) - f_1(n)| \le \sum_{p\le d_0} f(p) + 4\max_{n^{1/4}\le p\le n} f(p) = o(B_n^{1/2}) \ll A_n.
\tag{13.3.16}
\]
Let $(\Omega, \mathcal A, P)$ be the probability space on which the sequence $X_1, X_2, \dots$ is defined and consider the product space $(\Omega, \mathcal A, P)\times(\tilde\Omega, \tilde{\mathcal A}, \tilde P)$, where the second space supports a Bernoulli sequence $\{\varepsilon_i, i\ge 1\}$. Let $\tilde E$ denote expectation in $(\tilde\Omega, \tilde{\mathcal A}, \tilde P)$ and set $S_n = \varepsilon_1 + \dots + \varepsilon_n$, $n = 1, 2, \dots$. Then by (13.3.6),
\[
L(t) \le \#\{n : nA_n \le Ctf(n)\} \le \#\{n : S_nA_{S_n} \le Ctf(S_n)\},
\tag{13.3.17}
\]
and this is true for any $t > 0$, simply because the graph of the random walk $\{S_n, n\ge 1\}$ replicates all positive integers with possible multiplicities. By the strong law of large numbers $\lim_{n\to\infty} S_n/n \overset{a.s.}{=} 1/2$, and thus
\[
S_nA_{S_n} \sim (n/2)A_{n/2} \sim (n/2)A_n \qquad\text{a.s.}
\]
Here we used the fact that $A_n$ is slowly varying and thus by the uniform convergence theorem for slowly varying functions, see for instance Bingham, Goldie and Teugels [1987: Theorem 1.2.1], if $\lambda_n \sim \mu_n$ are two integer sequences, then $A_{\lambda_n} \sim A_{\mu_n}$. Thus if $\Omega_\eta = \{S_nA_{S_n} \ge \eta nA_n,\ \forall n\ge 1\}$, then for some $\eta > 0$, $\tilde P(\Omega_\eta) > 0$. And reading (13.3.17) on $\Omega_\eta$ gives
\[
L(t) \le \#\{n : nA_n \le (Ct/\eta)f(S_n)\} \quad\text{on } \Omega_\eta, \text{ for all } t > 0.
\tag{13.3.18}
\]
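The two facts about the Bernoulli walk used in this step, that its path passes through every positive integer up to its current height and that S_n/n is close to 1/2, are easy to observe in simulation. The sketch below is a hedged illustration with a fixed seed and a hypothetical helper `bernoulli_walk`; the test weight g(m) = 1/m² is an arbitrary non-negative choice of ours.

```python
import random

def bernoulli_walk(steps, seed=0):
    """Path S_1, ..., S_steps of a walk with i.i.d. increments in {0, 1}."""
    rng = random.Random(seed)
    s, path = 0, []
    for _ in range(steps):
        s += rng.randint(0, 1)
        path.append(s)
    return path

path = bernoulli_walk(20_000)
top = path[-1]
visited = set(path)

# Non-negative weights summed along the integers versus along the walk:
# every integer 1..top is hit at least once, so the second sum dominates.
g = lambda m: 1.0 / (m * m)
sum_over_integers = sum(g(m) for m in range(1, top + 1))
sum_along_walk = sum(g(s) for s in path if s >= 1)

if __name__ == "__main__":
    print("S_n / n =", top / len(path))
    print("all integers up to S_n visited:", all(m in visited for m in range(1, top + 1)))
    print(sum_over_integers, "<=", sum_along_walk)
```

The comparison of the two sums is the elementary mechanism behind inequalities such as (13.3.17): replacing n by S_n can only add repetitions of non-negative terms.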


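But for all $t > 0$,
\[
\begin{aligned}
\frac1t\,\#\{n : nA_n \le tf(S_n)\} &\le 1 + \frac1t\,\#\{n\ge t : nA_n \le tf(S_n)\}\\
&= 1 + \frac1t\sum_{n\ge t}\chi\{(nA_n)^2 \le t^2f^2(S_n)\} \le 1 + t\sum_{n\ge t}\frac{f^2(S_n)}{(nA_n)^2}.
\end{aligned}
\tag{13.3.19}
\]
Further, by Lemma 11.8.7 there exists a constant $C^*$ such that
\[
\begin{aligned}
\tilde E\,f_1^2(S_n) &= \tilde E\Big(\sum_{d_0\le p\le (S_n)^{1/4}} f(p)\chi(p|S_n)\Big)^2 \le \tilde E\Big(\sum_{d_0\le p\le n^{1/4}} f(p)\chi(p|S_n)\Big)^2\\
&= \sum_{d_0\le p_{i_1}, p_{i_2}\le n^{1/4}} f(p_{i_1})f(p_{i_2})\,\tilde E\,\chi(p_{i_1}p_{i_2}|S_n)\\
&\le C^*\sum_{p_{i_1}, p_{i_2}\le n^{1/4}}\frac{f(p_{i_1})f(p_{i_2})}{p_{i_1}p_{i_2}} \le C^*\Big(\sum_{p\le n^{1/4}}\frac{f(p)}{p}\Big)^2 \le C^*A_n^2,
\end{aligned}
\tag{13.3.20}
\]
provided $n$ is sufficiently large, which from now on we assume. From (13.3.16), (13.3.20) and Minkowski's inequality it follows that
\[
\tilde E\,f^2(S_n) \le C A_n^2.
\tag{13.3.21}
\]
Therefore
\[
\tilde E\,\frac1t\,\#\{n : nA_n \le tf(S_n)\} \le 1 + t\sum_{n\ge t}\frac{\tilde E\,f^2(S_n)}{(nA_n)^2} \le 1 + Ct\sum_{n\ge t}\frac{1}{n^2} \le C < \infty.
\tag{13.3.22}
\]
It follows that
\[
\tilde E\,E\,X^2\int_{y\ge|X|}\frac{\#\{n : nA_n \le yf(S_n)\}}{y^3}\,dy \le C\,E\,X^2\int_{y\ge|X|}\frac{dy}{y^2} \le C\,E|X| < \infty.
\]
And, in view of (13.3.18) and Fubini's theorem,
\[
\tilde E\Big[\chi(\Omega_\eta)\cdot E\,X^2\int_{y\ge|X|}\frac{L(y)}{y^3}\,dy\Big] \le \tilde E\,E\,X^2\int_{y\ge|X|}\frac{\#\{n : nA_n \le (Cy/\eta)f(S_n)\}}{y^3}\,dy \le C\,E|X| < \infty.
\]
As
\[
\tilde E\Big[\chi(\Omega_\eta)\cdot E\,X^2\int_{y\ge|X|}\frac{L(y)}{y^3}\,dy\Big] = \tilde P(\Omega_\eta)\,E\,X^2\int_{y\ge|X|}\frac{L(y)}{y^3}\,dy,
\]
condition (13.3.15) obtains and Theorem 13.3.1 is thus proved.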


Remark. It is quite interesting to observe, through the randomization argument used in the above proof, that the term "$\frac1p$" in the summand of $A_n$ appears in (13.3.20) as the expectation of a random factor.

Proof of Theorem 13.3.2. In view of Lemma 13.3.6 and the equivalence of (13.3.11) and (13.3.13), it suffices to verify that
\[
E\,X^4\int_{y\ge X^2}\frac{L(y)}{y^3}\,dy < \infty,
\]
where
\[
L(y) = \#\Big\{n : \sum_{k=1}^n f^2(k) \le yf^2(n)\Big\}.
\]
The proof being very similar, we only mention the modifications. We replace $f_1(n)$ by
\[
f_2(n) = \sum_{d_0\le p\le n^{1/8}} f(p)\chi(p|n),
\]
where $d_0$ is the same constant as in Lemma 11.8.7. Similarly to (13.3.16), we get
\[
|f(n) - f_2(n)| \ll A_n.
\tag{13.3.23}
\]
By using the same randomization argument as above and applying (13.3.6), we see that the set $\Omega^*_\eta$ defined by $\Omega^*_\eta = \{S_nA_{S_n}^2 \ge \eta nA_n^2,\ \forall n\ge 1\}$ has positive probability for some $\eta > 0$, and on $\Omega^*_\eta$ we have for all $y > 0$,
\[
L(y) \le \#\{n : nA_n^2 \le (C_0y/\eta)f^2(S_n)\}
\]
for some positive constant $C_0$. Instead of (13.3.19) we have
\[
\begin{aligned}
\frac1t\,\#\{n : nA_n^2 \le tf^2(S_n)\} &\le 1 + \frac1t\,\#\{n\ge t : nA_n^2 \le tf^2(S_n)\}\\
&= 1 + \frac1t\sum_{n\ge t}\chi\{n^2A_n^4 \le t^2f^4(S_n)\} \le 1 + t\sum_{n\ge t}\frac{f^4(S_n)}{n^2A_n^4}.
\end{aligned}
\]
Now,
\[
\begin{aligned}
\tilde E\,f_2^4(S_n) &= \tilde E\Big(\sum_{d_0\le p\le (S_n)^{1/8}} f(p)\chi(p|S_n)\Big)^4 \le \tilde E\Big(\sum_{d_0\le p\le n^{1/8}} f(p)\chi(p|S_n)\Big)^4\\
&= \sum_{d_0\le p_{i_1}, p_{i_2}, p_{i_3}, p_{i_4}\le n^{1/8}} f(p_{i_1})f(p_{i_2})f(p_{i_3})f(p_{i_4})\,\tilde E\,\chi(p_{i_1}p_{i_2}p_{i_3}p_{i_4}|S_n)\\
&\le C^*\sum_{p_{i_1}, p_{i_2}, p_{i_3}, p_{i_4}\le n^{1/8}}\frac{f(p_{i_1})f(p_{i_2})f(p_{i_3})f(p_{i_4})}{p_{i_1}p_{i_2}p_{i_3}p_{i_4}} \le C^*\Big(\sum_{p\le n^{1/8}}\frac{f(p)}{p}\Big)^4 \le C^*A_n^4,
\end{aligned}
\tag{13.3.24}
\]
provided $n$ is sufficiently large, which from now on we assume. From (13.3.23) and (13.3.24) it follows that
\[
\tilde E\,f^4(S_n) \le C A_n^4.
\]
Thus instead of (13.3.22) we get
\[
\tilde E\,\frac1t\,\#\{n : nA_n^2 \le tf^2(S_n)\} \le 1 + t\sum_{n\ge t}\frac{\tilde E\,f^4(S_n)}{n^2A_n^4} \le 1 + Ct\sum_{n\ge t}\frac{1}{n^2} \le C < \infty.
\]
To conclude, we now operate exactly as at the end of the previous proof.

Remark. The reduction argument based on (13.3.16) that we used to treat the case of additive arithmetic functions is no longer valid when passing to multiplicative functions, e.g., the usual divisor function $d$. However, the argument applies to the truncated divisor function $d_1(n) = \sum_{d\le n^{1/4}}\chi(d|n)$, and gives, for any sequence $X$ of centered, independent, identically distributed, integrable random variables,
\[
\lim_{N\to\infty}\frac{\sum_{n=1}^N d_1(n)X_n}{\sum_{n=1}^N d_1(n)} \overset{a.s.}{=} E\,X.
\]
A similar result can be obtained for the LIL with the truncated divisor function $d_2(n) = \sum_{d\le n^{1/8}}\chi(d|n)$. We omit the details of proofs, which are quite similar to the above.

Problem 21. Extend Theorems 13.3.1 and 13.3.2 to functions satisfying (13.3.1a) only, together with the assumption of Schoenberg's theorem.

Problem 22. Let $f(n) \ge 0$, $n = 1, 2, \dots$ be a strongly additive arithmetic function such that $f(p) = o(B_p^{1/2})$, $B_p \to \infty$. Let $(X, \mathcal A, \mu, T)$ be a measurable dynamical system. Is it true, possibly under stronger requirements, that for any $g \in L^1(\mu)$, the limit
\[
\lim_{N\to\infty}\frac{1}{\sum_{n=1}^N f(n)}\sum_{n=1}^N f(n)\,g(T^nx)
\]
exists for almost all $x$? In particular, can we say that
\[
\lim_{N\to\infty}\frac{1}{N\log\log N}\sum_{n=1}^N \omega(n)\,g(T^nx)
\]
exists $\mu$-almost everywhere, whenever $g \in L^1(\mu)$? We know that the answer is positive if $\{g\circ T^n, n\ge 1\}$ are i.i.d., which gives sense to this question. And in Weber [2005e], a consequence of a stronger result (Theorem 3) yields: There is a universal set of random signs of full measure, such that for any probability space $(X, \mathcal F, \mu)$, any contraction $T$ on $L^2(\mu)$, any $g \in L^2(\mu)$,
\[
\lim_{N\to\infty}\frac{\sum_{n=1}^N \pm d(n)\,T^{p_n}g(x)}{\sum_{n=1}^N d(n)} = 0 \qquad \mu\text{-almost surely}.
\tag{13.3.25}
\]


It follows from the proof of this result that $\{p_n, n\ge 1\}$ can be any polynomially growing sequence of integers. In fact even some related series are proved to converge almost everywhere.

Problem 23. For multiplicative functions the picture is however less complete. For instance, for the divisor function $d(n) = \#\{d : d|n\}$, by using Skorohod's embedding scheme it is possible to show that, if $X, X_1, X_2, \dots$ is a sequence of centered i.i.d. random variables such that $X \in L^2\log^m L$ with $m > 6$, then
\[
\limsup_{N\to\infty}\frac{\sum_{n=1}^N d(n)X_n}{\sqrt{2N(\log N)^3\log\log N}} \overset{a.s.}{=} \frac{\|X\|_2}{\pi}.
\tag{13.3.26}
\]
Hint. Put $\xi_n = X_n\chi\{|X_n| \le (n/\log^m n)^{1/2}\}$. Since $X \in L^2\log^m L$, it follows that $P\{\xi_n = X_n \text{ ultimately}\} = 1$. By the Skorokhod embedding scheme, if $\xi$ is a random variable such that $E\,\xi = 0$, $E\,\xi^2 < \infty$, then on a possibly larger probability space, there exist a Brownian motion $B$ and an exit time $T$ such that $B(T) \overset{D}{=} \xi$. Applying this successively for $n = 1, 2, \dots$ to $\xi = d(n)\xi_n$ shows that there exist, after suitably enlarging the probability space, a linear Brownian motion $B = \{B(t), 0\le t<\infty\}$ starting at 0 and a sequence $\tau_1, \tau_2, \dots$ ($\tau_0 = 0$) of independent nonnegative random variables such that $E\,\tau_n = d(n)^2E\,\xi_n^2$ for $n\ge 1$, and
\[
\Big\{B\Big(\sum_{k=1}^n \tau_k\Big) - B\Big(\sum_{k=1}^{n-1}\tau_k\Big),\ n\ge 1\Big\} \overset{D}{=} \{d(n)\xi_n,\ n\ge 1\}.
\]
Further the exit times $\tau_k$ in the Skorokhod construction satisfy (see [Billingsley: 1999], p. 456)
\[
E\,\tau_k = d(k)^2E\,\xi_k^2, \qquad E\,\tau_k^2 \le Cd(k)^4E|\xi_k|^4.
\]
Letting $x_n = \tau_n - E\,\tau_n$ gives, in view of the integrability assumption made,
\[
E\,x_n^2 \le E\,\tau_n^2 \le C\,\frac{nd^4(n)}{\log^{2m}n}\,E|X_n|^2\log^m|X_n| \le C_X\,\frac{nd^4(n)}{\log^{2m}n}.
\]
Hence
\[
E\Big(\sum_{i\le n\le j}(\tau_n - E\,\tau_n)\Big)^2 \le C_X\sum_{i\le n\le j}\frac{nd(n)^4}{\log^{2m}n}.
\]
By Corollary 9.3.5, letting $L_N = \sum_{n=1}^N nd^4(n)/\log^{2m}n$, for any $b > 3/2$,
\[
\lim_{N\to\infty}\frac{\sum_{1\le n\le N}(\tau_n - E\,\tau_n)}{L_N^{1/2}\log^b L_N} \overset{a.s.}{=} 0.
\]
But in view of equation (B) and equation (8) of [Ramanujan: 1916], we have that $\sum_{n=1}^N d^2(n) \sim (N/\pi^2)\log^3N$ and $\sum_{n=1}^N d^4(n) \sim AN\log^{15}N$, $A$ a numerical constant; so that $L_N^{1/2} \ll N\log^{15/2-m}N$. Thus
\[
\lim_{N\to\infty}\frac{\sum_{1\le n\le N}(\tau_n - E\,\tau_n)}{N\log^{15/2-m+b}N} \overset{a.s.}{=} 0.
\]
As $\sum_{1\le n\le N}E\,\tau_n = \sum_{1\le n\le N}d^2(n)E\,\xi_n^2 \sim (N\,E\,X^2/\pi^2)\log^3N$ and $m > 6$, we have
\[
\begin{aligned}
\limsup_{N\to\infty}\frac{\big|\sum_{1\le n\le N}(\tau_n - E\,\tau_n)\big|}{\sum_{1\le n\le N}E\,\tau_n}
&\le \limsup_{N\to\infty}\frac{N\log^{15/2-m+b}N}{\sum_{1\le n\le N}E\,\tau_n}\ \limsup_{N\to\infty}\frac{\big|\sum_{1\le n\le N}(\tau_n - E\,\tau_n)\big|}{N\log^{15/2-m+b}N}\\
&\le C_X\limsup_{N\to\infty}\frac{N\log^{15/2-m+b}N}{N\log^3N}\ \limsup_{N\to\infty}\frac{\big|\sum_{1\le n\le N}(\tau_n - E\,\tau_n)\big|}{N\log^{15/2-m+b}N} = 0
\end{aligned}
\]
almost surely. Therefore
\[
\lim_{N\to\infty}\frac{\sum_{1\le n\le N}\tau_n}{N\log^3N} \overset{a.s.}{=} \frac{E\,X^2}{\pi^2}.
\tag{13.3.27}
\]

The rest of the proof is routine and follows by either applying an adequate invariance principle, or simply by mimicking the proof of the LIL for subsequences given in [Weber: 1990b]. Very likely this result remains valid under the optimal moment conditions $E\,X = 0$ and $E\,X^2 < \infty$. There is also a possible alternate way to succeed by proving directly (13.3.11) (and also (13.3.9)) using Perron's formulae. For instance, if $a_1, a_2, \dots$ are numbers such that $|a_n| = O(n^{\beta-1})$ $(\beta > 0)$, and if
\[
Z(s) = \sum_{n=1}^\infty\frac{a_n}{n^s}, \qquad f(x) = \sum_{n\le x}{}^{'}\,a_n,
\]
where the dash over the sign of summation indicates that, if $x$ is an integer,
\[
f(x) = a_1 + a_2 + \dots + a_{x-1} + \frac12 a_x,
\]
then
\[
\Big|f(x) - \frac{1}{2\pi i}\int_{\alpha-i\omega}^{\alpha+i\omega}\frac{x^s}{s}\,Z(s)\,ds\Big| \le K\,\frac{x^\alpha}{\omega} \qquad (\alpha > \beta,\ \omega > 0),
\tag{13.3.28}
\]
where $K$ is independent both of $x$ and of $\omega$. See, for instance, [Blanchard: 1969], p. 49.

13.4 On the order of magnitude of the divisor functions

Let $d_1, d_2, \dots$ be an increasing sequence of positive integers, which we denote by $D$. Let $a_1, a_2, \dots$ be a sequence of positive reals, which we denote by $a$. The order of magnitude of the divisor functions
\[
d(n, D) = \sum_{\substack{d|n\\ d\in D,\ d\le\sqrt n}} 1, \qquad d_2(n, D) = \sum_{\substack{[d,\delta]|n\\ d,\delta\in D,\ [d,\delta]\le\sqrt n}} 1
\tag{13.4.1}
\]
can be evaluated by considering the series
\[
\sum_{n=1}^\infty a_n\,d(n, D), \qquad \sum_{n=1}^\infty a_n\,d_2(n, D).
\tag{13.4.2}
\]
To study the convergence of these series, we shall show that the method described in Section 13.1 allows us to reduce the problem to the study of a series having a much richer structure. The proof also does not appeal to the classical machinery of arithmetic functions, nor to the theory of Dirichlet series, and includes the subsequence case: $D \subseteq \mathbb N$, $D \ne \mathbb N$. We only consider sufficiently regular sequences $a$. We assume that the following condition is satisfied: for any real $c > 0$,
\[
\sup_{n\ge 1}\ \sup_{m\ge cn}\frac{a_m}{a_n} = \gamma_a(c) < \infty.
\tag{13.4.3}
\]
Typical examples are sequences $a$ built from a regularly varying function $\varphi$: $a_n = \varphi(n)$, like $\varphi(t) = t^{-a}\log^b t$, $a > 0$. In the usual cases $D = \mathbb N$ and $D = P$, where $P$ is the set of primes, the divisor function $d(n, D)$ is respectively equivalent to the divisor function $d(n) := \sum_{d|n} 1$ and to $\omega(n) := \sum_{p\in P,\ p|n} 1$, since
\[
d(n, \mathbb N) \le d(n) \le 2d(n, \mathbb N), \qquad d(n, P) \le \omega(n) \le d(n, P) + 1.
\tag{13.4.4}
\]
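The comparisons in (13.4.4) can be verified by brute force for small n. The sketch below is only an illustration: the helper names are ours, and the restricted divisor function is implemented directly from its definition, counting the divisors d ≤ √n of n that belong to the given set.

```python
import math

def divisors(n):
    """All positive divisors of n (naive enumeration)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def is_prime(k):
    return k >= 2 and all(k % q for q in range(2, math.isqrt(k) + 1))

def d_restricted(n, in_D):
    """d(n, D): number of d in D with d | n and d <= sqrt(n)."""
    return sum(1 for d in divisors(n) if d * d <= n and in_D(d))

def check_13_4_4(limit):
    """Check both double inequalities of (13.4.4) for 1 <= n <= limit."""
    for n in range(1, limit + 1):
        divs = divisors(n)
        d_n = len(divs)
        omega_n = sum(1 for d in divs if is_prime(d))
        d_all = d_restricted(n, lambda d: True)
        d_primes = d_restricted(n, is_prime)
        if not (d_all <= d_n <= 2 * d_all and d_primes <= omega_n <= d_primes + 1):
            return False
    return True

if __name__ == "__main__":
    print(check_13_4_4(1000))
```

The factor 2 reflects the pairing of each divisor d ≤ √n with n/d, and the additive 1 reflects that at most one prime divisor of n can exceed √n.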

A similar remark can be made for the divisor function d2 (n, D) in the case when D = P only. First observe that if D ∗ = {σ : ∃d, δ ∈ D such that σ = [d, δ]}, then d2 (n, D) d(n, D ∗ ) in general. Next we have d2 (n, P ) ≤ ω2 (n) ≤ 5ω(n) + d2 (n, P ).

(13.4.5)

Consider among the pairs $(p, q)$ of primes such that $pq|n$ and $pq > \sqrt n$, the number $R$ of pairs $(p_i, q_i)$, $i = 1, \dots, R$, composed with different prime divisors of $n$: $p_i \ne q_j$, $i, j = 1, \dots, R$, and $p_i \ne p_j$, $q_i \ne q_j$, $i, j = 1, \dots, R$, $i \ne j$. Necessarily $R = 1$, otherwise we would get a contradiction ($n > (\sqrt n)^2$). Consequently, all the remaining pairs $(p, q)$ of primes such that $pq|n$ and $pq > \sqrt n$ share a component with $(p_1, q_1)$. Their number is thus bounded above by $2\omega(n)$, hence
\[
d_2(n, P) \le \omega^2(n) = \Big(\sum_{p|n} 1\Big)^2 \le \omega(n) + d_2(n, P) + \sum_{\substack{p|n,\ q|n\\ pq>\sqrt n}} 1 \le 5\omega(n) + d_2(n, P).
\]


13.4.1 Theorem. Assume that the series 1 d∈D

is convergent. Then the series

d

an ,

n≥d 2

∞

n=1 an d(n, D)

converges as well.

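We indicate some consequences of Theorem 13.4.1. Let $a_n = n^{-1}\log^{-\beta} n$, $\beta > 1$. Then $\sum_{n\ge d^2} a_n \le C_\beta \log^{-\beta+1} d$. Thereby,

$$\sum_{d\in D} \frac{1}{d\,\log^{\beta-1} d} < \infty \ \Longrightarrow\ \sum_{n=1}^{\infty} \frac{d(n, D)}{n\log^{\beta} n} < \infty. \tag{13.4.6}$$

In particular, in view of (13.4.4): for any $\beta > 2$,

$$\sum_{n=1}^{\infty} \frac{d(n)}{n\log^{\beta} n} < \infty. \tag{13.4.7}$$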

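Let now $a_n = n^{-1}\log^{-1} n\,\log_2^{-\beta} n$, $\beta > 1$. Similarly one shows

$$\sum_{d\in D} \frac{1}{d\,\log_2^{\beta-1} d} < \infty \ \Longrightarrow\ \sum_{n=1}^{\infty} \frac{d(n, D)}{n\log n\,\log_2^{\beta} n} < \infty. \tag{13.4.8}$$

Applying it to the sequence of primes gives, in view of (13.4.4): for any $\beta > 2$,

$$\sum_{n=1}^{\infty} \frac{\omega(n)}{n\log n\,\log_2^{\beta} n} < \infty. \tag{13.4.9}$$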

Since (see [Hardy–Wright: 1979], equation 18.2.1, p. 263 and Theorem 430, p. 365) $\sum_{n=1}^{N} d(n) \sim N\log N$ and $\sum_{n=1}^{N} \omega(n) \sim N\log\log N$, it easily follows from Abel summation that both (13.4.7) and (13.4.9) are optimal. But the theorem also permits us to treat very different cases. Let for instance $P_1$ be such that for some $\beta > 1$,

$$\sum_{p_1\in P_1} \log^{-\beta+1} p_1 < \infty.$$

Set

$$(P_1, P) = \{d = p_1 + p :\ p > p_1,\ p_1 \in P_1,\ p \in P\}.$$

It is easily seen that the series $\sum_{d\in(P_1,P)} d^{-1}\log^{-\beta+1} d$ converges. Therefore, by (13.4.6), the series

$$\sum_{n=1}^{\infty} \frac{d\big(n, (P_1, P)\big)}{n\log^{\beta} n} \tag{13.4.10}$$

converges as well. This case does not seem, however, tractable by means of Dirichlet series techniques. The next theorem concerns the second series in (13.4.2).


13 Divisors and random walks

13.4.2 Theorem. Assume that the following condition is satisfied:

$$\sum_{d,\delta\in D} \frac{1}{[d, \delta]} \sum_{n\ge [d,\delta]^2} a_n < \infty.$$

Then the series

$$\sum_{n=1}^{\infty} a_n\, d_2(n, D)$$

converges.

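Let $\alpha > 2$ and $a_n = n^{-1}\log^{-1} n\,\log_2^{-\alpha} n$. Then

$$\sum_{n\ge [d,\delta]^2} a_n \le C_\alpha \log_2^{-\alpha+1} [d, \delta].$$

Hence,

$$\sum_{d,\delta\in D} \frac{1}{[d, \delta]\,\log_2^{\alpha-1} [d, \delta]} < \infty \ \Longrightarrow\ \sum_{n=1}^{\infty} \frac{d_2(n, D)}{n\log n\,\log_2^{\alpha} n} < \infty. \tag{13.4.11}$$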

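Letting $D = P$ we get

$$\sum_{d,\delta\in P} \frac{1}{[d, \delta]\,\log_2^{\alpha-1} [d, \delta]} \le \sum_{d\in P}\ \sum_{\substack{\delta > d\\ \delta\in P}} \frac{1}{d\,\delta\,\log_2^{\alpha-1} \delta} \le C_\alpha \sum_{k=1}^{\infty} \sum_{h>k} \frac{1}{k\,h\,(\log k)(\log h)\log_2^{\alpha} h} < \infty.$$

Therefore, by (13.4.5),

$$\sum_{n=1}^{\infty} \frac{\omega^2(n)}{n\log n\,\log_2^{\alpha} n} < \infty. \tag{13.4.12}$$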

Proof of Theorem 13.4.1. Let $(\Omega, \mathcal{A}, \mathbf{P})$ be some probability space on which a Bernoulli sequence $\varepsilon = \{\varepsilon_i, i \ge 1\}$ is defined ($\mathbf{P}\{\varepsilon_i = 0\} = \mathbf{P}\{\varepsilon_i = 1\} = 1/2$ and the $\varepsilon_i$ are independent). Consider the sequence of partial sums $S_n = \varepsilon_1 + \dots + \varepsilon_n$, $n = 1, 2, \dots$. As already observed,

$$\sum_{m=1}^{\infty} a_m\, d(m, D) \le \sum_{n=1}^{\infty} a_{S_n}\, d(S_n, D), \tag{13.4.13}$$

simply because the sum on the right just replicates the terms of the sum on the left, with possible multiplicity. The term $a_{S_n}$ is not satisfactory. Put

$$A = \sup_{n\ge 1} \frac{a_{S_n}}{a_n}. \tag{13.4.14}$$

By the strong law of large numbers, $\lim_{n\to\infty} S_n/n = 1/2$ almost surely. Therefore, if $\Omega_\eta = \{S_n \ge \eta n,\ \forall n \ge 1\}$, then $\mathbf{P}(\Omega_\eta) > 0$ for some $\eta > 0$. Now, on $\Omega_\eta$,

$$a_{S_n} = a_n\, \frac{a_{S_n}}{a_n} \le a_n \sup_{\nu\ge 1}\ \sup_{m\ge \eta\nu} \frac{a_m}{a_\nu} = a_n\, \gamma_a(\eta),$$

by (13.4.3). Therefore, for some $M$ suitably chosen,

$$\mathbf{P}\{A \le M\} > 0. \tag{13.4.15}$$
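The replication argument behind (13.4.13) can be illustrated by simulation. The following sketch (function names and the choice of test term are illustrative assumptions, not part of the text) generates a Bernoulli walk with steps in $\{0, 1\}$ and compares a sum taken along the walk with the plain sum over the integers it has reached.

```python
import random

def bernoulli_walk(n_steps, seed=0):
    """Partial sums S_1, ..., S_n of i.i.d. steps eps_i in {0, 1}, each with probability 1/2."""
    rng = random.Random(seed)
    s, path = 0, []
    for _ in range(n_steps):
        s += rng.randint(0, 1)
        path.append(s)
    return path

def replicated_sum(path, term):
    """Sum of term(S_n) along the walk, skipping the visits to 0."""
    return sum(term(s) for s in path if s >= 1)

def plain_sum(upper, term):
    """Sum of term(m) for 1 <= m <= upper."""
    return sum(term(m) for m in range(1, upper + 1))
```

Because the walk moves by 0 or 1 at each step, it visits every integer between 1 and its current position at least once, so for a nonnegative term the sum along the walk dominates the plain sum up to the last position reached.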

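If we further prove that

$$\sum_{n=1}^{\infty} a_n\, \mathbf{E}\, d(S_n, D) \le C \sum_{d\in D} \frac{1}{d} \sum_{n\ge d^2} a_n, \tag{13.4.16}$$

then, by Beppo Levi's theorem, the series $\sum_{n=1}^{\infty} a_n\, d(S_n, D)$ converges almost everywhere. By (13.4.15), the series $\sum_{n=1}^{\infty} a_{S_n}\, d(S_n, D)$ thus converges on a set of positive measure. By inequality (13.4.13), we deduce that the series

$$\sum_{n=1}^{\infty} a_n\, d(n, D)$$

converges as well, and the theorem will be proved. Both by Lemma 11.8.7 and by assumption,

$$\sum_{n=1}^{\infty} a_n\, \mathbf{E}\, d(S_n, D) \le C \sum_{n=1}^{\infty} a_n \sum_{\substack{1\le d\le \sqrt{n}\\ d\in D}} \Big(\frac{1}{d} + \frac{1}{n^{1/2}}\Big) \le 2C \sum_{d\in D} \frac{1}{d} \sum_{n\ge d^2} a_n < \infty. \tag{13.4.17}$$

Proof of Theorem 13.4.2. To treat the series associated to the divisor function

$$d_2(n, D) = \sum_{\substack{[d,\delta]\mid n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{n}}} 1,$$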

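we use the same trick as before. Let $(\Omega, \mathcal{A}, \mathbf{P})$ be some probability space on which a Bernoulli sequence $\varepsilon = \{\varepsilon_i, i \ge 1\}$ is defined, and consider the sequence of partial sums $S_n = \varepsilon_1 + \dots + \varepsilon_n$, $n = 1, 2, \dots$. Again, with probability 1,

$$\sum_{m=1}^{\infty} a_m \sum_{\substack{[d,\delta]\mid m\\ d,\delta\in D,\ [d,\delta]\le \sqrt{m}}} 1 \ \le\ \sum_{n=1}^{\infty} a_{S_n} \sum_{\substack{[d,\delta]\mid S_n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{S_n}}} 1 \ \le\ \sum_{n=1}^{\infty} a_{S_n} \sum_{\substack{[d,\delta]\mid S_n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{n}}} 1. \tag{13.4.18}$$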

As in the proof of Theorem 13.4.1, by virtue of the regularity condition (13.4.3) and the strong law of large numbers, we can replace the term $a_{S_n}$ by some asymptotic equivalent. More precisely, we can and do assume (13.4.15). If we further show that

$$\sum_{n=1}^{\infty} a_n\, \mathbf{E} \sum_{\substack{[d,\delta]\mid S_n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{n}}} 1 \ \le\ C \sum_{d,\delta\in D} \frac{1}{[d, \delta]} \sum_{n\ge [d,\delta]^2} a_n, \tag{13.4.19}$$

then, by (13.4.18), (13.4.19) and (13.4.15),

$$\sum_{n=1}^{\infty} a_n\, d_2(n, D) < \infty. \tag{13.4.20}$$

Theorem 13.4.2 can now be proved. By Lemma 11.8.7, we have

$$\mathbf{E} \sum_{\substack{[d,\delta]\mid S_n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{n}}} 1 \ \le\ C \sum_{\substack{d,\delta\in D\\ [d,\delta]\le \sqrt{n}}} \frac{1}{[d, \delta]}. \tag{13.4.21}$$

Hence,

$$\sum_{n=1}^{\infty} a_n\, \mathbf{E} \sum_{\substack{[d,\delta]\mid S_n\\ d,\delta\in D,\ [d,\delta]\le \sqrt{n}}} 1 \ \le\ C \sum_{n=1}^{\infty} a_n \sum_{\substack{d,\delta\in D\\ [d,\delta]\le \sqrt{n}}} \frac{1}{[d, \delta]} = C \sum_{d,\delta\in D} \frac{1}{[d, \delta]} \sum_{n\ge [d,\delta]^2} a_n < \infty, \tag{13.4.22}$$

by assumption. This completes the proof.

Remarks. Condition (13.4.3) excludes cases where either the sequence $a$ decreases too fast or contains subsequences with a rapid order of decrease. It is clear that the results are trivial for fast decreasing sequences $a$. It is less clear, however, whether condition (13.4.3) is really necessary or not. Nevertheless, it is easy to extrapolate from the proof that condition (13.4.3) can be replaced by the weaker but slightly more complicated condition: for any real $c > 0$,

$$\sup_{\nu}\ \sup_{|\mu - \nu/2| \le c\sqrt{\nu\log\log\nu}}\ \frac{a_\mu}{a_{[\nu/2]}} = \rho_a(c) < \infty, \tag{13.4.3$'$}$$

since, by the law of the iterated logarithm, $\sup_n \frac{|S_n - n/2|}{\sqrt{n\log\log n}} < \infty$ almost surely. Then, for a convenient choice of $c$, on a set of positive probability, $a_{S_n} \le a_{[n/2]}\, \rho_a(c)$ for all $n$.

Problem 24. Only sufficient conditions have been presented here. However, the probabilistic argument that we used can also be employed to provide necessary conditions. The idea goes as follows: if $\mathcal{M}$ is a subsequence of positive integers growing a little bit faster than linearly, then the sequence of partial sums $\{S_m, m \in \mathcal{M}\}$ is ultimately strictly increasing, almost surely. For instance, one can take $\mathcal{M} = \{[\delta n\log n], n \ge 1\}$ with some suitable constant $\delta$. And so (13.4.13) can be completed by the following double inequality:

$$\sum_{\substack{m\in\mathcal{M}\\ m\ge M}} a_{S_m}\, d(S_m, D) \ \le\ \sum_{\mu\ge M} a_\mu\, d(\mu, D) \ \le\ \sum_{n\ge M} a_{S_n}\, d(S_n, D), \tag{13.4.13$'$}$$

on a set of probability tending to 1 as $M$ tends to infinity. Now, if for any real $c > 0$,

$$\liminf_{\nu\to\infty}\ \inf_{|\mu - \nu/2| \le c\sqrt{\nu\log\log\nu}}\ \frac{a_\mu}{a_{[\nu/2]}} = \sigma_a(c) > 0, \tag{13.4.3$''$}$$

then for some $\eta > 0$, with positive probability, $a_{S_m} \ge \eta\, a_{[m/2]}$ $m$-ultimately. And thus the sum on the left in (13.4.13$'$) can be replaced by the sum $\sum_{m\in\mathcal{M},\ m\ge M} a_{[m/2]}\, d(S_m, D)$. If we moreover show that the series

$$\sum_{m\in\mathcal{M}} a_{[m/2]}\, d(S_m, D) \tag{13.4.23}$$

diverges almost surely, then by (13.4.13$'$) the series $\sum_{\mu} a_\mu\, d(\mu, D)$ diverges as well. Checking (13.4.23) requires, however, substantial extra work, the divergence of the series $\sum_{\mu} a_\mu\, \mathbf{E}\, d(\mu, D)$ being not enough; but this problem should be tractable.

13.5 Value distribution of the divisors of $n^2 + 1$

The study of the value distribution of the divisors of Bernoulli sums made in the previous sections is now extended to that of the divisors of $B_n^2 + a$. In the works discussed above, sums of exponentials of the first order appeared, whereas here we consider sums of exponentials of the second order. The work will thus rely heavily upon properties of Gauss sums.

13.5.1 Theorem. There exists an absolute constant $C$ such that for any positive integers $n, d, a$ with $n \ge d$, we have

$$\Big| \mathbf{P}\{d \mid S_n^2 + a\} - \frac{1}{d^2} \sum_{j=1}^{d} e^{2i\pi a \frac{j}{d}} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| \le C \Big(\frac{d\log d}{n}\Big)^{1/2}.$$

Further, the sum appearing in the above statement,

$$\frac{1}{d^2} \sum_{j=1}^{d} e^{2i\pi a \frac{j}{d}} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2},$$

equals 0 if $d$ has a prime factor $p$ such that $\big(\frac{-a}{p}\big) = -1$. And if $d$ has no such prime factor, then we have

$$\frac{1}{d^2} \sum_{j=1}^{d} e^{2i\pi a \frac{j}{d}} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} = \begin{cases} \dfrac{2^{\alpha_d}}{d} & \text{if } d \text{ is odd},\\[2mm] \dfrac{2^{k+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } (a, 2) = 1,\\[2mm] \dfrac{2^{k-1+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } 2 \mid a, \end{cases} \tag{13.5.1}$$

where $\alpha_d := \#\{p \mid d : (p, 2a) = 1\}$. If $a = 1$ this gives

$$\lim_{n\to\infty} \mathbf{P}\{d \mid S_n^2 + 1\} = \begin{cases} \dfrac{2^{\omega(d)}}{d} & \text{if } d \text{ is odd},\\[2mm] \dfrac{2^{k+\omega(d)-1}}{d} & \text{if } 2^k \,\|\, d. \end{cases} \tag{13.5.2}$$
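The double exponential sum of Theorem 13.5.1 is easy to evaluate numerically. The sketch below (function names are illustrative) computes it directly and compares it with a root count: by orthogonality, $\sum_{j=1}^{d} e^{2i\pi j(a + r^2)/d}$ equals $d$ when $d \mid r^2 + a$ and $0$ otherwise, so the double sum reduces to $\#\{1 \le r \le d : d \mid r^2 + a\}/d$.

```python
import cmath

def limit_sum(a, d):
    """(1/d^2) * sum_j e^{2 i pi a j/d} * sum_r e^{2 i pi j r^2/d}, evaluated numerically."""
    total = 0j
    for j in range(1, d + 1):
        # Reduce the phase modulo d before dividing, to keep the arguments small.
        inner = sum(cmath.exp(2j * cmath.pi * ((j * r * r) % d) / d)
                    for r in range(1, d + 1))
        total += cmath.exp(2j * cmath.pi * ((a * j) % d) / d) * inner
    return total / (d * d)

def root_count(a, d):
    """Number of residues r in {1, ..., d} with d | r^2 + a."""
    return sum(1 for r in range(1, d + 1) if (r * r + a) % d == 0)
```

For $a = 1$ and the odd moduli $d = 5, 13, 65$ this gives $2/5$, $2/13$ and $4/65$, in agreement with the first case of (13.5.2), and it gives $0$ for $d = 3$.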

The proof will follow from a series of lemmas. Elementarily,

$$\mathbf{P}\{d \mid S_n^2 + a\} = \frac{1}{d} \sum_{j=1}^{d} e^{2i\pi a \frac{j}{d}}\ 2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \frac{j}{d} k^2}. \tag{13.5.3}$$

The formula (13.5.3) indeed readily follows from the relation

$$d\, \delta_{d \mid S_n^2 + a} = \sum_{j=0}^{d-1} e^{2i\pi j \frac{S_n^2 + a}{d}}, \tag{13.5.4}$$

which is next integrated with respect to $\mathbf{P}$. Before going further, it is worth noticing that in formula (13.5.3), one could reduce the problem to the study of sums of binomial coefficients along arithmetic progressions by writing

$$\mathbf{P}\{d \mid S_n^2 + a\} = \frac{1}{d} \sum_{m=1}^{d} \sum_{j=1}^{d} e^{2i\pi \frac{j}{d}(a + m^2)}\ 2^{-n} \sum_{\substack{k=0\\ k\equiv m\,(d)}}^{n} C_n^k. \tag{13.5.5}$$
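Formula (13.5.3) can be compared with a direct computation of the probability. The sketch below assumes, as in this section, that $S_n$ is a sum of $n$ independent Bernoulli variables with values in $\{0, 1\}$, so that $S_n$ is Binomial$(n, 1/2)$; the function names are illustrative.

```python
import cmath
from fractions import Fraction
from math import comb

def prob_direct(n, d, a):
    """Exact P{d | S_n^2 + a} for S_n ~ Binomial(n, 1/2), as a Fraction."""
    hits = sum(comb(n, k) for k in range(n + 1) if (k * k + a) % d == 0)
    return Fraction(hits, 2 ** n)

def prob_via_13_5_3(n, d, a):
    """Right-hand side of (13.5.3), evaluated in floating point (moderate n only)."""
    total = 0j
    for j in range(1, d + 1):
        inner = sum(comb(n, k) * cmath.exp(2j * cmath.pi * ((j * k * k) % d) / d)
                    for k in range(n + 1)) / 2 ** n
        total += cmath.exp(2j * cmath.pi * ((a * j) % d) / d) * inner
    return total / d
```

For instance, with $d = 5$ and $a = 1$ the exact probability at $n = 100$ is already very close to the limit $2/5$ given by (13.5.2).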

Recall indeed the formula established by Ramus in 1834 (see also Problem 2 (d), p. 39 in the book of Riordan [1958]),

$$\sum_{\substack{k=0\\ k\equiv m\,(d)}}^{n} C_n^k = \frac{1}{d} \sum_{k=1}^{d} \omega^{-km} (1 + \omega^k)^n = \frac{1}{d} \sum_{k=1}^{d} \Big(2\cos\frac{k\pi}{d}\Big)^n \cos\frac{(n - 2m)k\pi}{d}, \tag{13.5.6}$$

where $\omega$ is a primitive $d$-th root of unity. Inserting (13.5.6) into (13.5.5) gives

$$\mathbf{P}\{d \mid S_n^2 + a\} = \frac{1}{d^2} \sum_{k=1}^{d} \sum_{j=1}^{d} \sum_{m=1}^{d} \Big(\cos\frac{k\pi}{d}\Big)^n \cos\frac{(n - 2m)k\pi}{d}\ e^{2i\pi \frac{j}{d}(a + m^2)}. \tag{13.5.7}$$

The last sum $\sum_{j=1}^{d} e^{2i\pi \frac{j}{d}(a + m^2)}$ being equal to $d\,\delta_{d \mid a + m^2}$, this only shifts the initial problem. We shall therefore proceed differently. Recall the classical result

$$\frac{1}{n} \sum_{k=1}^{n} e^{2i\pi \alpha k^2} \to \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2}, \qquad \alpha = \frac{j}{d}, \tag{13.5.8}$$

valid for $j = 1, \dots, d$, $d \ge 1$. The usual Gauss sum appears in the limit. One may interpret (13.5.8) as the $(C, 1)$ convergence of the sequence $\{e^{2i\pi \alpha k^2}, k \ge 1\}$, whereas the sum appearing inside the right-hand side of (13.5.3) is the $(E, 1)$ sum, by the Euler method (Hardy [1963: Chapter 8]), of this sequence. It is well known that $(E, 1)$ does not include $(C, 1)$ (see again Hardy [1963: Chapter 8]). By making the convergence (13.5.8) more precise, however, it is possible to establish that in turn,

$$2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \alpha k^2} \to \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2}, \qquad \alpha = \frac{j}{d}.$$
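The two modes of convergence discussed here, the Cesàro means of (13.5.8) and the binomially weighted Euler means, can be compared numerically for a fixed $\alpha = j/d$. This is a minimal sketch with illustrative function names; the Euler mean is computed in floating point and is therefore only suitable for moderate $n$.

```python
import cmath
from math import comb

def phase(j, d, k):
    """e^{2 i pi (j/d) k^2}, with the exponent reduced modulo d for accuracy."""
    return cmath.exp(2j * cmath.pi * ((j * k * k) % d) / d)

def gauss_mean(j, d):
    """Normalised Gauss sum (1/d) * sum_{r=1}^{d} e^{2 i pi (j/d) r^2}."""
    return sum(phase(j, d, r) for r in range(1, d + 1)) / d

def cesaro_mean(j, d, n):
    """(C, 1) mean: (1/n) * sum_{k=1}^{n} e^{2 i pi (j/d) k^2}."""
    return sum(phase(j, d, k) for k in range(1, n + 1)) / n

def euler_mean(j, d, n):
    """(E, 1) mean: 2^{-n} * sum_{k=0}^{n} C(n, k) e^{2 i pi (j/d) k^2}."""
    return sum(comb(n, k) * phase(j, d, k) for k in range(n + 1)) / 2 ** n
```

With $j = 1$ and $d = 7$, both `cesaro_mean(1, 7, n)` and `euler_mean(1, 7, n)` approach `gauss_mean(1, 7)`, whose modulus is $1/\sqrt{7}$.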

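13.5.2 Lemma. Let $n, d$ be two positive integers and write $n = Nd + m$ with $1 \le m \le d$. Then, for $j = 1, \dots, d$,

$$\Delta_n\Big(\frac{j}{d}\Big) := \Big| \frac{1}{n} \sum_{k=1}^{n} e^{2i\pi \frac{j}{d} k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| \le \frac{1}{n} \Big( \Big| \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| + \Big| \sum_{r=1}^{m} e^{2i\pi \frac{j}{d} r^2} \Big| \Big).$$

Proof. If $N = 0$, then $n = m$ and the proposed bound is trivially verified. Assume $N \ge 1$. Remark also that if $j = d$, then $\Delta_n(j/d) = 0$ and there is nothing to prove either. As

$$\frac{1}{n} \sum_{k=1}^{n} e^{2i\pi \frac{j}{d} k^2} = \frac{1}{n} \sum_{\nu=0}^{N-1}\ \sum_{k=\nu d+1}^{(\nu+1)d} e^{2i\pi \frac{j}{d} k^2} + \frac{1}{n} \sum_{k=Nd+1}^{Nd+m} e^{2i\pi \frac{j}{d} k^2} = \frac{N}{n} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} + \frac{1}{n} \sum_{r=1}^{m} e^{2i\pi \frac{j}{d} r^2}$$

$$= N \Big(\frac{1}{n} - \frac{1}{Nd}\Big) \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} + \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} + \frac{1}{n} \sum_{r=1}^{m} e^{2i\pi \frac{j}{d} r^2},$$

we thus have

$$\Delta_n\Big(\frac{j}{d}\Big) \le \Big| \frac{Nd - n}{nd} \Big|\, \Big| \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| + \frac{1}{n} \Big| \sum_{r=1}^{m} e^{2i\pi \frac{j}{d} r^2} \Big| \le \frac{1}{n} \Big( \Big| \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| + \Big| \sum_{r=1}^{m} e^{2i\pi \frac{j}{d} r^2} \Big| \Big).$$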
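The bound of Lemma 13.5.2 can be checked numerically over a small grid of parameters. In this sketch (illustrative names), the decomposition $n = Nd + m$ is normalised so that $1 \le m \le d$, as required by the lemma.

```python
import cmath

def quad_sum(j, d, upto):
    """sum_{r=1}^{upto} e^{2 i pi (j/d) r^2}, with the exponent reduced modulo d."""
    return sum(cmath.exp(2j * cmath.pi * ((j * r * r) % d) / d)
               for r in range(1, upto + 1))

def lemma_sides(n, d, j):
    """Return (left-hand side, right-hand side) of the inequality of Lemma 13.5.2."""
    big_n, m = divmod(n, d)
    if m == 0:
        # Enforce 1 <= m <= d: a full last block is counted as m = d.
        big_n, m = big_n - 1, d
    lhs = abs(quad_sum(j, d, n) / n - quad_sum(j, d, d) / d)
    rhs = (abs(quad_sum(j, d, d)) + abs(quad_sum(j, d, m))) / n
    return lhs, rhs
```

For example, `lemma_sides(23, 7, 3)` returns a pair whose first entry does not exceed the second, and the same holds when $n$ is a multiple of $d$, where the left-hand side vanishes up to rounding.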

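Remark. If $\delta = (j, d) > 1$, write $j = \delta h$, $d = \delta d'$. Then $\frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} = \frac{1}{d'} \sum_{\rho=1}^{d'} e^{2i\pi \frac{h}{d'} \rho^2}$, and besides, writing $n$ in the form $n = N'd' + m'$ with $1 \le m' \le d'$, we get from Lemma 13.5.2 that

$$\Big| \frac{1}{n} \sum_{k=1}^{n} e^{2i\pi \frac{j}{d} k^2} - \frac{1}{d'} \sum_{\rho=1}^{d'} e^{2i\pi \frac{h}{d'} \rho^2} \Big| \le \frac{1}{n} \Big( \Big| \sum_{\rho=1}^{d'} e^{2i\pi \frac{h}{d'} \rho^2} \Big| + \Big| \sum_{\rho=1}^{m'} e^{2i\pi \frac{h}{d'} \rho^2} \Big| \Big) \le \frac{2}{n} \max_{m=1}^{d'} \Big| \sum_{\rho=1}^{m} e^{2i\pi \frac{h}{d'} \rho^2} \Big|. \tag{13.5.9}$$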

The next ingredient is the lemma below ([Hardy: 1963], p. 213).

13.5.3 Lemma. Let $A = \{A_n, n \ge 1\}$ be a sequence of reals such that

$$C_n^1(A) = \frac{A_1 + \dots + A_n}{n} = a + o(n^{-1/2}).$$

Then $A$ is summable $(E, q)$ for any positive $q$.

The symbol $o$ is essential. The conclusion of the lemma is wrong when replacing $o$ by $O$. We are now in a position to state:

13.5.4 Lemma. For any positive integer $d$, we have

$$\lim_{n\to\infty} \mathbf{P}\{d \mid S_n^2 + a\} := L(a, d) = \frac{1}{d^2} \sum_{j=1}^{d} e^{2i\pi a \frac{j}{d}} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2}.$$

Proof. By Lemma 13.5.2, for any $j = 1, \dots, d$,

$$\frac{1}{n} \sum_{k=1}^{n} e^{2i\pi \frac{j}{d} k^2} = \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} + O\Big(\frac{1}{n}\Big).$$

And thus, by considering separately imaginary and real parts, by Lemma 13.5.3 applied with $q = 1$, this implies that

$$2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \frac{j}{d} k^2} \to \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2}, \qquad j = 1, \dots, d.$$

Inserting this estimate into (13.5.3) gives the result.

The proof of the next lemma follows from an argument due to Dartyge and allows us to evaluate the limit $L(a, d)$. Recall that the Jacobi–Legendre symbol $\big(\frac{j}{d}\big)$ is defined by

$$\Big(\frac{j}{d}\Big) = \begin{cases} 0 & \text{if } d \mid j,\\ 1 & \text{if there exists } N \text{ such that } d \mid N^2 - j,\\ -1 & \text{otherwise.} \end{cases}$$

(See for instance Shanks [1978], Definition 12, p. 33, $j$ being an arbitrary positive integer.)

13.5.5 Lemma. Let $d$ be a positive integer. If $d$ has a prime factor $p$ such that $\big(\frac{-a}{p}\big) = -1$, then $L(a, d) = 0$. If $d$ does not have such prime factors, let $\alpha_d := \#\{p \mid d : (p, 2a) = 1\}$. In this case we have

$$L(a, d) = \begin{cases} \dfrac{2^{\alpha_d}}{d} & \text{if } d \text{ is odd},\\[2mm] \dfrac{2^{k+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } (a, 2) = 1,\\[2mm] \dfrac{2^{k-1+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } 2 \mid a. \end{cases}$$


Proof. If $F$ is a polynomial with integer coefficients, it is well known ([Halberstam–Richert: 1974], p. 18) that the function $\rho(d) = \#\{1 \le r \le d : d \mid F(r)\}$ is multiplicative. In particular $f(d) = \#\{1 \le r \le d : d \mid r^2 + a\}$ is multiplicative, and we have $L(a, d) = \frac{f(d)}{d}$. It suffices to evaluate $f(p^k)$. First

$$f(p) = \begin{cases} 2 & \text{if } \big(\frac{-a}{p}\big) = 1,\\ 0 & \text{if } \big(\frac{-a}{p}\big) = -1,\\ 1 & \text{if } p \mid a. \end{cases} \tag{13.5.10}$$

Next we reduce the computation of $f(p^k)$ with $k \ge 2$ to that of $f(p^{k-1})$ by writing $r = u + vp^{k-1}$:

$$f(p^k) = \#\big\{0 \le u < p^{k-1},\ 0 \le v < p : (u + vp^{k-1})^2 + a \equiv 0 \bmod (p^k)\big\}.$$

We notice that

$$(u + vp^{k-1})^2 + a = u^2 + 2vup^{k-1} + v^2p^{2k-2} + a \equiv u^2 + a + 2vup^{k-1} \bmod (p^k) \equiv u^2 + a \bmod (p^{k-1}).$$

If $(u, v)$ is a solution for $f(p^k)$, then $u$ verifies $u^2 + a \equiv 0 \bmod (p^{k-1})$. Conversely, let $0 \le u < p^{k-1}$ be such that $p^{k-1} \mid u^2 + a$. Then $u^2 + a + 2vup^{k-1} \equiv 0 \bmod (p^k)$ if and only if $2uv \equiv -\frac{u^2 + a}{p^{k-1}} \bmod (p)$, that is $2uv \equiv 0 \bmod (p)$. If $p = 2$, there are two possible integers $v$; if $p$ is odd, there is only one possible $v$. We have thus shown for any integer $k \ge 2$ that $f(p^k) = f(p^{k-1})$ if $p \ge 3$ and $f(2^k) = 2f(2^{k-1})$. Summarizing: $f(d) = 0$ if $d$ has a prime factor $p$ such that $\big(\frac{-a}{p}\big) = -1$. If $d$ has no such prime factor, we have:

$$f(d) = \begin{cases} 2^{\alpha_d} & \text{if } d \text{ is odd},\\ 2^{k+\alpha_d} & \text{if } 2^k \,\|\, d \text{ and } (a, 2) = 1,\\ 2^{k-1+\alpha_d} & \text{if } 2^k \,\|\, d \text{ and } 2 \mid a. \end{cases} \tag{13.5.11}$$

This allows us to conclude our proof.

By inserting this into Lemma 13.5.4, we get

13.5.6 Proposition. For any positive integer $d$, and any positive number $a$ such that $(a, d) = 1$, we have the following:

(i) If $d$ has a prime factor $p$ such that $\big(\frac{-a}{p}\big) = -1$, then $\lim_{n\to\infty} \mathbf{P}\{d \mid S_n^2 + a\} = 0$.

(ii) If $d$ has no such prime factor, then

$$\lim_{n\to\infty} \mathbf{P}\{d \mid S_n^2 + a\} = \begin{cases} \dfrac{2^{\alpha_d}}{d} & \text{if } d \text{ is odd},\\[2mm] \dfrac{2^{k+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } (a, 2) = 1,\\[2mm] \dfrac{2^{k-1+\alpha_d}}{d} & \text{if } 2^k \,\|\, d \text{ and } 2 \mid a. \end{cases}$$
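The counting function $f(d) = \#\{1 \le r \le d : d \mid r^2 + a\}$ from the proof of Lemma 13.5.5 is simple to tabulate. The sketch below (illustrative names) checks its multiplicativity by brute force and the stability $f(p^k) = f(p^{k-1})$; the latter is only tested here for odd primes $p$ not dividing $a$, which is an assumption of this example.

```python
from math import gcd

def f(d, a):
    """Number of r in {1, ..., d} with d | r^2 + a."""
    return sum(1 for r in range(1, d + 1) if (r * r + a) % d == 0)

def is_multiplicative(a, limit):
    """Check f(m*n) == f(m) * f(n) for all coprime m, n <= limit."""
    return all(f(m * n, a) == f(m, a) * f(n, a)
               for m in range(1, limit + 1)
               for n in range(1, limit + 1)
               if gcd(m, n) == 1)

def prime_power_values(p, a, max_k):
    """The list [f(p), f(p^2), ..., f(p^max_k)]."""
    return [f(p ** k, a) for k in range(1, max_k + 1)]
```

With $a = 1$, `prime_power_values(5, 1, 3)` is constant, and `f(65, 1)` equals `f(5, 1) * f(13, 1)`.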

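In particular if $a = 1$, as $\big(\frac{-1}{p}\big) = 1$ by Euler's criterion, we have

$$\lim_{n\to\infty} \mathbf{P}\{d \mid S_n^2 + 1\} = \begin{cases} \dfrac{2^{\omega(d)}}{d} & \text{if } d \text{ is odd},\\[2mm] \dfrac{2^{k+\omega(d)-1}}{d} & \text{if } 2^k \,\|\, d, \end{cases} \tag{13.5.12}$$

where $\omega(d) = \#\{p : p \mid d\}$ is the prime divisor function of $d$. We shall now make the previous proposition more precise by estimating the speed of convergence. In what follows $d$ is fixed, as well as $j \in \{1, \dots, d - 1\}$. We put $\alpha = \frac{j}{d}$, and then

$$a_k = a_k(\alpha, d) = e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2}, \qquad k = 0, 1, \dots.$$

We note that

$$E_n(\alpha) := 2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} = 2^{-n} \sum_{k=0}^{n} C_n^k\, a_k = \sum_{k=0}^{n} v_k a_k. \tag{13.5.13}$$

In (13.5.13) we have denoted $v_k = v_k(n) = 2^{-n} C_n^k$, $k = 0, \dots, n$. The supremum of the $v_k$ is reached at the value ([Hardy: 1963], p. 201 and p. 214)

$$\nu = \Big[\frac{n+1}{2}\Big].$$

If $\frac{n+1}{2}$ is an integer, then $v_{\nu-1}$ and $v_\nu$ are equal. Besides, $v_k$ decreases on either side of $k = \nu$. Finally $v_\nu \le Cn^{-1/2}$, where $C$ is an absolute constant. We write

$$A_\ell = A_\ell(\alpha, d) := \sum_{k=0}^{\ell} \Big[ e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} \Big] = \sum_{k=0}^{\ell} a_k. \tag{13.5.14}$$

Then

$$a_\ell = A_\ell - A_{\ell-1}, \quad \ell \ge 1, \qquad a_0 = 1 - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} = A_0.$$

From the remark following Lemma 13.5.2 it also follows that, if $\delta = (j, d) > 1$, $j = h\delta$, $d = \delta d'$, writing then $\ell$ in the form $\ell = Ld' + \lambda$ with $1 \le \lambda \le d'$,

$$\max_{\ell=1}^{n} \Big| \sum_{k=1}^{\ell} a_k \Big| \le 2 \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big|.$$

And consequently

$$\max_{\ell=0}^{n} |A_\ell| \le 2 \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big| + |a_0| \le 3 \max\Big( \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big|,\ 1 \Big). \tag{13.5.15}$$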

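We split the sum $E_n(\alpha)$ into two subsums, $E_n(\alpha) = E_n^1(\alpha) + E_n^2(\alpha)$, with

$$E_n^1(\alpha) = \sum_{k=0}^{\nu} v_k \Big[ e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} \Big] = \sum_{k=0}^{\nu} v_k a_k,$$

$$E_n^2(\alpha) = \sum_{k=\nu+1}^{n} v_k \Big[ e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} \Big] = \sum_{k=\nu+1}^{n} v_k a_k.$$

On the one hand,

$$E_n^1(\alpha) = \sum_{k=0}^{\nu} v_k a_k = v_0 a_0 + \sum_{k=1}^{\nu} v_k (A_k - A_{k-1}) = v_\nu A_\nu - \big[ A_0 (v_1 - v_0) + A_1 (v_2 - v_1) + \dots + A_{\nu-1} (v_\nu - v_{\nu-1}) \big].$$

Thus

$$|E_n^1(\alpha)| \le v_\nu |A_\nu| + \max_{m=0}^{\nu-1} |A_m| \sum_{m=1}^{\nu} (v_m - v_{m-1}) \le 2 v_\nu \max_{m=0}^{\nu} |A_m| \le \frac{C}{\sqrt{n}} \max\Big( \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big|,\ 1 \Big), \tag{13.5.16}$$

where $C$ is an absolute constant. On the other hand,

$$E_n^2(\alpha) = \sum_{k=\nu+1}^{n} v_k a_k = \sum_{k=\nu+1}^{n} v_k (A_k - A_{k-1}) = -v_{\nu+1} A_\nu + A_{\nu+1} (v_{\nu+1} - v_{\nu+2}) + \dots + A_{n-1} (v_{n-1} - v_n) + v_n A_n. \tag{13.5.17}$$

Thus

$$|E_n^2(\alpha)| \le \max_{m=\nu}^{n} |A_m| \Big[ v_{\nu+1} + \sum_{m=\nu+1}^{n-1} (v_m - v_{m+1}) + v_n \Big] \le 2 v_{\nu+1} \max_{m=\nu}^{n} |A_m| \le \frac{C}{\sqrt{n}} \max\Big( \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big|,\ 1 \Big). \tag{13.5.18}$$

Consequently $|E_n(\alpha)| \le Cn^{-1/2} \max\big( \max_{\lambda=1}^{d'} \big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \big|,\ 1 \big)$. We have thus established the following lemma.

13.5.7 Lemma. There exists an absolute constant $C_0$ such that for any positive integers $n$ and $d$, $\alpha = \frac{j}{d}$, $j = 1, \dots, d$,

$$\Big| 2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \alpha k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \alpha r^2} \Big| \le C_0\, n^{-1/2} \max\Big( \max_{\lambda=1}^{d'} \Big| \sum_{r=1}^{\lambda} e^{2i\pi \frac{h}{d'} r^2} \Big|,\ 1 \Big),$$

where we denoted $j = h\delta$, $d = \delta d'$ if $\delta = (j, d)$, and $n = N'd' + m'$ with $1 \le m' \le d'$.

Now recall an inequality due to Sárközy [1978] (see Lemma 4, p. 128).

13.5.8 Lemma. Let $\alpha$ be a real number and $a, q$ be positive integers such that $(a, q) = 1$ and $|\alpha - a/q| < 1/q^2$. Then, for any positive integer $\lambda$,

$$\Big| \sum_{r=1}^{\lambda} e^{2i\pi r^2 \alpha} \Big| \le 7 \Big( \frac{\lambda}{\sqrt{q}} + (\lambda\log q)^{1/2} + (q\log q)^{1/2} \Big).$$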

Now let $d = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$, where $p_1, \dots, p_k$ are distinct pairwise coprime numbers. Let $j$ be an integer between 1 and $d$. Write $\frac{j}{d} = \frac{h}{d'}$, where $(h, d') = 1$ and $d' = \frac{d}{(j, d)}$. Apply Lemma 13.5.8 with the choice of values

$$\alpha = \frac{h}{d'}, \qquad a = h, \qquad q = d'.$$

Since $(h, d') = 1$, we have

$$\Big| \sum_{r=1}^{\lambda} e^{2i\pi r^2 \frac{j}{d}} \Big| = \Big| \sum_{r=1}^{\lambda} e^{2i\pi r^2 \frac{h}{d'}} \Big| \le 7 \Big( \frac{\lambda}{\sqrt{d'}} + (\lambda\log d')^{1/2} + (d'\log d')^{1/2} \Big).$$

By Lemma 13.5.7 and the remark following Lemma 13.5.2,

$$\Big| 2^{-n} \sum_{k=0}^{n} C_n^k\, e^{2i\pi \frac{j}{d} k^2} - \frac{1}{d} \sum_{r=1}^{d} e^{2i\pi \frac{j}{d} r^2} \Big| \le 7 C_0\, n^{-1/2} \max_{\lambda=1}^{d'} \Big( \frac{\lambda}{\sqrt{d'}} + (\lambda\log d')^{1/2} + (d'\log d')^{1/2} \Big) \le 21 C_0\, n^{-1/2} (d'\log d')^{1/2} \le 21 C_0\, n^{-1/2} (d\log d)^{1/2}. \tag{13.5.19}$$

By repeating the same reasoning for each of the values of $j$ between 1 and $d$, we finally obtain

13.5.9 Proposition. There exists an absolute constant $C$ such that for any positive integer $d$, any positive integers $a$ and $n \ge d$, we have

$$\big| \mathbf{P}\{d \mid S_n^2 + a\} - L(a, d) \big| \le C \Big(\frac{d\log d}{n}\Big)^{1/2}.$$

The above estimate applies only if $d$ is not too large: $d \ll n/\log n$.

Problem 25. Apply the method used in this section to obtain sharp estimates of the probabilities $\mathbf{P}\{D \mid S_n S_m\}$, $\mathbf{P}\{D \mid S_n S_m S_p S_q\}$.


13.6 Value distribution of the divisors of Rademacher sums Let ε = {εi , i ≥ 1} be independent Rademacher random variables, and let Sn = ε1 + · · · + εn . 13.6.1 Theorem. There exist two numerical constants C, n0 such that for all n ≥ n0 the following holds. (i) If d, n are even, then 2 P d|Sn − d

e

2 2 − 2nπ 2 j

log5/2 n

≤ C n3/2 .

d

0≤j
(ii) If d, n are odd, then 1 2 P d|n − − d d

Let Kd (y) = follows:

1 d

1≤j
1 P d|Sn − 2 d R

iπ dj y

e

2 2 − 2nπ 2 h

log5/2 n

≤ C n3/2 .

d

1≤h
. The inequality in (i) can also be reformulated as

j

eiπ d y e−y

2 /2n

0≤j

dy log5/2 n √ ≤ Cα d n3/2 . 2π n

Proof of Theorem 13.6.1. The proof is very similar to the one of Theorem 13.2.1. Notice that P{Sn = j } = 0 iff n + j is odd. Further, if d ≤ n then 1 2 n 2πj if d, n are odd, 1≤j
n cos πj ≤ dn−α .

d

1≤j
Similarly, notice also that if 1≤j
2πj d

∈ An , then 2 2

e

−n 2π 2j d

≤

2πj d

1≤j
≥ ϕn = n 2α log n n

e− 2 .

2α log n 1/2 n

≤ dn−α .

, and so


Now

cosn

1≤j
2 2 2πj −n 2π 2j ≤n d −e d

1≤j
π 2 j 2 2πj log cos + d 2d 2 4 j

≤ Cn

≤ Cα d

(13.6.1)

d

1≤j
log5/2 n . n3/2

j Suppose d, n are even: d = 2δ. If 2πj d ∈ [π −ϕn , π [, then cos 2π d = cos 2π

h cos 2π d , and 2π h/d ∈]0, ϕn ]. So that

2πj cosn −2 d 1≤j
We deduce

2 P d|Sn − d

e

2 2 −n 2π 2j

≤ Cα d

d

1≤j

e

2 2 − 2nπ 2 j

≤ Cα

d

0≤j
δ−j 2δ

:=

log5/2 n . n3/2

log5/2 n . n3/2

Suppose now d, n are odd: d = 2δ + 1. The sum corresponding to j such that 0 < 2πj/d ≤ ϕn is already estimated. For the other sum, if 2πj/d ∈ [π − ϕn , π [,

cos 2π and

2π(h+1/2) d

j δ + 1/2 j = cos 2π − d 2δ + 1 2δ + 1

:= cos 2π

h + 1/2 , d

∈]0, ϕn ]. So we get

cosn 2π

1≤h

h + 1/2 . d

We compare this sum with

e

− 2nπ

2 (h+1/2)2 d2

.

1≤h
By arguing similarly we get

1≤h
cosn 2π(

h + 1/2 )− d

1≤h
e

− 2nπ

2 (h+1/2)2 d2

log5/2 n ≤ Cα d . n3/2


Consequently, 1 2 P d|Sn − − d d

e

2 2 − 2nπ 2 h

≤ Cα

d

1≤h
log5/2 n . n3/2

13.7 The functional equation and the Lindelöf Hypothesis

The Riemann zeta function, defined on the half-plane $\{s : \Re s > 1\}$ by the series

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s},$$

admits a meromorphic continuation to the entire complex plane, with a unique, simple pole of residue 1 at $s = 1$. The celebrated elliptic Theta function

$$\Theta(u) = \sum_{n\in\mathbb{Z}} e^{-\pi n^2 u}$$

is linked to the Gamma and Riemann zeta functions via the functional equation of the Riemann zeta function, valid for any complex $s$,

$$\pi^{-\frac{1}{2}s}\, \Gamma\Big(\frac{1}{2}s\Big) \zeta(s) = \frac{1}{2} \int_1^{\infty} \big(\Theta(x) - 1\big) \big( x^{\frac{1}{2}s - 1} + x^{-\frac{1}{2}s - \frac{1}{2}} \big)\, dx - \frac{1}{s(1 - s)},$$

which follows from the equation

$$\pi^{-\frac{1}{2}s}\, \Gamma\Big(\frac{1}{2}s\Big) \zeta(s) = \int_0^{\infty} \sum_{m=1}^{\infty} e^{-m^2 \pi x}\, x^{\frac{1}{2}s - 1}\, dx.$$

See for instance [Huxley: 1972], Chapter 11, equations (11.3) and (11.7), or [Blanchard: 1969], Part 5, Chapter 3, p. 136. The purpose of this section is to prove the existence of another functional equation linking the left-hand side to the distribution of the divisors of Rademacher sums, where the Theta function plays a crucial role. Let $\varepsilon = \{\varepsilon_i, i \ge 1\}$ be a sequence of independent spin random variables ($\mathbf{P}\{\varepsilon_i = \pm 1\} = 1/2$) with basic probability space $(\Omega, \mathcal{A}, \mathbf{P})$. Consider the sequence of partial sums $S_n = \varepsilon_1 + \dots + \varepsilon_n$, $n = 1, 2, \dots$. Put for $p$, $M$ even,

$$\upsilon(p, M) = \frac{p}{2}\, \mathbf{P}\{p \mid S_M\} - 1.$$

When $p$ is fixed, $\frac{p}{2}\, \mathbf{P}\{p \mid S_M\} \to 1$ as $M \to \infty$. Thus $\upsilon(p, M)$ measures the speed of convergence. When $p$ and $M$ simultaneously tend to infinity, this quantity in turn also appears in the functional equation of the Riemann zeta function.
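The quantity $\upsilon(p, M)$ can be computed exactly for small parameters. The sketch below (the function name is illustrative) uses the representation $S_M = 2B - M$, where $B$ counts the steps equal to $+1$ and is therefore Binomial$(M, 1/2)$.

```python
from fractions import Fraction
from math import comb

def upsilon(p, M):
    """Exact value of (p/2) * P{p | S_M} - 1 for a Rademacher sum S_M (p and M even)."""
    # S_M = 2*B - M, with B the number of +1 steps among the M steps.
    hits = sum(comb(M, b) for b in range(M + 1) if (2 * b - M) % p == 0)
    return Fraction(p, 2) * Fraction(hits, 2 ** M) - 1
```

For fixed even $p$ the returned value tends to 0 as $M$ grows, as stated in the text; for instance `upsilon(6, 100)` is positive but extremely small.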

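13.7.1 Theorem. There exists a sequence of pairs of even positive integers $(p_\tau, M_\tau)$, $p_\tau \le M_\tau$, such that for any complex $s$,

$$\frac{1}{s(1 - s)} + \pi^{-\frac{1}{2}s}\, \Gamma\Big(\frac{1}{2}s\Big) \zeta(s) = \lim_{T\to\infty} \frac{2}{T^2} \sum_{\tau=T^2}^{\infty} \upsilon(p_\tau T, M_\tau \tau) \Big[ \Big(\frac{\tau}{T^2}\Big)^{\frac{1}{2}s - 1} + \Big(\frac{\tau}{T^2}\Big)^{-\frac{1}{2}s - \frac{1}{2}} \Big].$$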

It will result from the proof that the sequence of pairs (pτ , Mτ ) is intimately related to the diophantine approximation of the irrational number 2π . Proof. Reformulating Theorem 13.6.1, even case, in terms of υ(p, M), gives: for any α > α > 3/2, there exist constants C, p0 and M0 , depending on α, α only, such that for any M ≥ M0 and M ≥ p ≥ p0 , 9 ∞ 5/2 M log M −22 π 2 M2 υ(p, M) − p ≤ Cp (13.7.1) e if p ≥ 2π . M 3/2 2α log M =1

Let Pk /Qk be a sequence of irreducible fractions such that Pk = 2π. k→∞ Qk lim

(13.7.2)

By Dirichlet’s theorem, we may choose them so that 1 1 Qk 2π − P < 2 . Pk k

(13.7.3)

Define

Gk = sup L integer : L2 ≤ Pk . √ Since 1 ≤ Pk /G2k ≤ 1 + 3/ P k for k large, we have G2k = 2π. k→∞ Qk lim

(13.7.4)

(13.7.5)

Let a, d be some positive integers with d ≥ a. We assume that a is a square. Later on, we shall select √ p = 2Gk(d) a, (13.7.6) M = 4Qk(d) d, for a value k(d) depending on d only. Since there is no loss in assuming Pk ≤ 4Q2k , we also will have p ≤ M. At this stage, we first estimate the error term −2 2π 2 Qk d 2 d G2 ka = e − e− π a .


For k large, we find


1 2 d d 2π Qk ≤ π − 1e− 2 π a . 2 a Gk 2

Now, by (13.7.3), 2πQk Q 1 Pk − G2k 1 1 2π ≤ 2π k − − 1 + 2π Q − ≤ + 2π Q k 2 k P 2π Pk Pk2 G2k Gk G2k Pk k 2π Qk 2Gk + 1 2π 3 4 ≤ 2 + 2π ≤ 2+ ≤ 1/2 , 2 P G

Pk

k

Gk

Pk

k

Pk

(13.7.7) for k large. Thereby, there exist two absolute constants C and k0 , such that for any k ≥ k0 , any positive integers and d ≥ a, we have −2 2π 2 Qk d G2 −2 π da e ka − e ≤

C 1/2 Pk

d 1 2 d 2 e− 2 π a . a

(13.7.8)

Thence, ∞ −2 2π Qk d ∞ C d 2 − 1 2 π d C d 1 d G2 −2 π da e ka −e e 2 a ≤ 1/2 e− 2 π a , (13.7.9) ≤ 1/2 a Pk Pk a =1 =1 2 −Aj 2 ≤ Ce−A , A ≥ 1. where we used the elementary inequality ∞ j =1 j e By (13.7.1), ∞ −2 2π 2 Qk d √ log5/2 Qk d G2 υ(2Gk a, 4Qk d) − 2 k a ≤ CGk a 1/2 e , (13.7.10) (Qk d)3/2 2

=1

provided that G2k a 4π 2 d ≥ . Qk 2α log Qk d

(13.7.11)

G2

Note that Qkk ∼ 2π and Qk → ∞ as k tends to infinity. Thus (13.7.11) will be certainly satisfied if k = k(d) is chosen according to 1≥π

d , log Qk(d) d

(13.7.12)

which we do assume from now on. By combining (13.7.9) with (13.7.10) we obtain ∞ √ −2 π da υ(2G a, 4Q d) − 2 e k(d) k(d)

=1

1/2

1 d a −1/2 d ≤ C Pk(d) e− 2 π a + a d

log5/2 Qk(d) d . Qk(d) d

(13.7.13)


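Now, return to the functional equation of the Riemann zeta function. Let $s = \sigma + it$ be any complex number. Define

$$\gamma(x) = x^{\frac{s}{2} - 1} + x^{-\frac{s}{2} - \frac{1}{2}}, \qquad \varphi_\ell(x) = e^{-\ell^2 \pi x}, \qquad \psi_\ell(x) = \varphi_\ell(x)\, \gamma(x),$$

$$U = \sum_{\ell=1}^{\infty} \int_1^{\infty} \psi_\ell(x)\, dx, \qquad U_d(a) = \sum_{\ell=1}^{\infty} \int_{\frac{d}{a}}^{\frac{d+1}{a}} \psi_\ell(x)\, dx.$$

Then $U = \sum_{d=a}^{\infty} U_d(a)$, and we compare $U_d(a)$ to

$$\frac{1}{a} \sum_{\ell=1}^{\infty} e^{-\ell^2 \pi \frac{d}{a}}\, \gamma\Big(\frac{d}{a}\Big).$$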

Let Hd =

∞ =1

d+1 a d a

"

# ψ (x) − ψ ( da ) dx. Plainly,

s −1 x 2 + x − 2s − 21 + x 2s −2 + x − 2s − 23 ,

where C(s) depends on s only. Let σ ∗ = max σ2 − 1|, σ2 + 21 , σ2 − 2, σ2 + 23 . s −1

∗ 2 + x − 2s − 21 + x 2s −2 + x − 2s − 23 ≤ 4 d σ , and thus Now if da ≤ x ≤ d+1 a , then x a |ψ (x)| ≤ C(s)2 e−

2π x

|ψ (x)| ≤ 4C(s)2 e−

2π d a

σ ∗

d a

.

It follows that d+1 $ % σ ∗ 2 a d ≤ 4C(s) e−2 π da d ψ (x) − ψ dx . d a a2 a

(13.7.14)

a

Thereby, |Hd | ≤ 4C(s) The sum H =

∞

σ ∗ ∞

1 d a2 a

d=a

2 e−

2π d a

≤ 4C(s)

=1

σ ∗

1 d a2 a

e−π a . d

Hd is equal to ∞ ∞ 1 d U− ψ . a a d=a

=1

And by the functional equation quoted at the beginning of the section,

$$U = \frac{1}{s(1 - s)} + \pi^{-\frac{1}{2}s}\, \Gamma\Big(\frac{1}{2}s\Big) \zeta(s). \tag{13.7.15}$$
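Identity (13.7.15) can be checked numerically at real points where $\zeta$ is known in closed form. The sketch below is restricted to real $s$ and makes several assumptions of its own: the integral over $[1, \infty)$ is truncated at a finite upper limit, the sum over $\ell$ is truncated after a few terms, and a composite Simpson rule is used; all names are illustrative.

```python
from math import exp, pi

def gamma_weight(x, s):
    """gamma(x) = x^{s/2 - 1} + x^{-s/2 - 1/2}, for real s."""
    return x ** (s / 2 - 1) + x ** (-s / 2 - 0.5)

def psi_sum(x, s, terms=8):
    """Truncated sum over l >= 1 of psi_l(x) = e^{-l^2 pi x} * gamma(x)."""
    return sum(exp(-l * l * pi * x) for l in range(1, terms + 1)) * gamma_weight(x, s)

def U(s, upper=12.0, steps=6000):
    """Composite Simpson approximation of the integral of psi_sum over [1, upper]."""
    h = (upper - 1.0) / steps            # steps must be even
    total = psi_sum(1.0, s) + psi_sum(upper, s)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * psi_sum(1.0 + i * h, s)
    return total * h / 3
```

At $s = 2$, where $\Gamma(1) = 1$ and $\zeta(2) = \pi^2/6$, the right-hand side of (13.7.15) is $-\frac{1}{2} + \frac{\pi}{6}$; at $s = 4$, where $\Gamma(2) = 1$ and $\zeta(4) = \pi^4/90$, it is $-\frac{1}{12} + \frac{\pi^2}{90}$. Both values can be compared with `U(2)` and `U(4)`.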


Consequently, ∞ ∞ 1 1 d 1 − 21 s ψ s)ζ (s) ( − + π s(1 − s) 2 a a d=a

=1

(13.7.16)

∞ ∗ 1 d σ −π d 1 ≤ 4C(s) 2 e a ≤ C (s) , a a a d=a

where C (s) depends on s only. Combining now (13.7.13) with (13.7.16) produces ∞ 2 √ 1 1 d − 21 s s ζ (s) − υ(2Gk(d) a, 4Qk(d) d)γ s(1 − s) + π 2 a a d=a

≤ ≤

C (s) a

d=a

C (s) a

1/2 ∞ C −1/2 d − 1 π d a log5/2 Qk(d) d d 2 a + γ Pk(d) e + (13.7.17) a a d Qk(d) d a −1/2 + C1 (s) max Pk(d) d≥a

∞ d C a 1/2 log5/2 Qk(d) d + . γ a d Qk(d) d a d=a

By (13.7.12), we have Qk(d) d ≥ eπ d . Moreover γ ( da ) ≤ 2

d σ ∗ a

. It follows that

∞ C a 1/2 log5/2 Qk(d) d d γ a d Qk(d) d a d=a

σ ∗ ∞ ∞ d 2C a 1/2 C1 σ ∗ +2 −π d ≤ (π d)5/2 e−π d = 1/2+σ ∗ d e . a d a a d=a

(13.7.18)

d=a

π ∞

∞

Let m ≥ 1. Then d=a d m e−π d ≤ e a t m e−π t dt. Further [t m e−π t ] ∼ −π t m e−π t as tends to infinity. There is thus some a0 > 0 such that for t ≥ a ≥ a0 , ∞t m t e−π t dt ≤ (2/π )a m e−π a . Applying these remarks with m = σ ∗ + 2 to the a right-hand side of (13.7.18) gives for a large, ∞

C1 ∗ a 1/2+σ

d=a

dσ

∗ +2

e−π d ≤

C2 σ ∗ +2 −π a e ∗a 1/2+σ a

= C2 a 3/2 e−π a .

(13.7.19)

Combining (13.7.17) with (13.7.18) and (13.7.19) gives for a large, ∞ √ 2 1 1 d − 21 s ζ (s) − υ(2G a, 4Q d)γ + π s k(d) k(d) s(1 − s) 2 a a d=a

C (s) −1/2 + C1 (s) max Pk(d) + C2 a 3/2 e−π a . ≤ d≥a a (13.7.20)


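We deduce

$$\lim_{a\to\infty} \frac{2}{a} \sum_{d=a}^{\infty} \upsilon\big(2G_{k(d)}\sqrt{a},\ 4Q_{k(d)}\, d\big)\, \gamma\Big(\frac{d}{a}\Big) = \frac{1}{s(1 - s)} + \pi^{-\frac{1}{2}s}\, \Gamma\Big(\frac{1}{2}s\Big) \zeta(s). \tag{13.7.21}$$

This achieves the proof.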

We conclude this section with some results related to the famous Lindelöf Hypothesis, and involving another random walk, the Cauchy random walk. The proofs being long and very technical, we refer to the original paper [Lifshits–Weber: 2006]. We begin with basic results.

The Lindelöf Hypothesis. The Lindelöf Hypothesis (LH) asserts that

$$\zeta\Big(\frac{1}{2} + it\Big) = O(t^{\varepsilon}) \tag{13.7.22}$$

for every positive $\varepsilon$. Up to now, the best known result towards (13.7.22) is due to Huxley [2005]:

$$\zeta\Big(\frac{1}{2} + it\Big) = O(t^{32/205 + \varepsilon}) \qquad (\forall \varepsilon > 0),$$

and $32/205 = 0.156097561\ldots$. The validity of the Riemann Hypothesis implies ([Titchmarsh: 1951], Theorem 14.14) that

$$\zeta\Big(\frac{1}{2} + it\Big) = O\Big( \exp\Big\{ A\, \frac{\log t}{\log\log t} \Big\} \Big), \tag{13.7.23}$$

$A$ being a constant, which is even a stronger form of LH, the latter being strictly weaker than the Riemann Hypothesis. The validity of (13.7.22) is equivalent to any of the three following assertions (see [Titchmarsh: 1951], Chapter XIII):

$$\frac{1}{T} \int_1^T \Big| \zeta\Big(\frac{1}{2} + it\Big) \Big|^{2k} dt = O(T^{\varepsilon}), \qquad k = 1, 2, \dots, \tag{13.7.24}$$

$$\frac{1}{T} \int_1^T \big| \zeta(\sigma + it) \big|^{2k} dt = O(T^{\varepsilon}), \qquad \sigma > \frac{1}{2},\ k = 1, 2, \dots, \tag{13.7.25}$$

$$\lim_{T\to\infty} \frac{1}{T} \int_1^T \big| \zeta(\sigma + it) \big|^{2k} dt = \sum_{n=1}^{\infty} \frac{d_k^2(n)}{n^{2\sigma}}, \qquad \sigma > \frac{1}{2},\ k = 1, 2, \dots, \tag{13.7.26}$$

where $d_k(n)$ denotes the number of representations of the integer $n$ as a product of $k$ factors. There is also a reformulation of the LH, due to Backlund, in terms of the location of the zeros of $\zeta$: (13.7.22) is equivalent to

$$N(\sigma, T + 1) - N(\sigma, T) = o(\log T) \quad \text{for every } \sigma > 1/2. \tag{13.7.27}$$

It is thus natural to study the asymptotic behavior of the zeta function along the critical line $\sigma = 1/2$ by modelling the time $t$ with a random walk. The Cauchy random walk turns out to be most appropriate because of the smoothness of the Cauchy characteristic function, which also "preserves" the structure of the Riemann zeta function. Let $X_1, X_2, \dots$ denote an infinite sequence of independent Cauchy distributed random variables (with characteristic function $\varphi(t) = e^{-|t|}$); then the time $t$ is modelled by the sequence of partial sums $S_n = X_1 + \dots + X_n$. In order to understand the behavior of $\zeta(\frac{1}{2} + it)$ when $t$ tends to infinity, one may investigate the almost sure asymptotic behavior of the system

$$\zeta_n := \zeta\Big(\frac{1}{2} + iS_n\Big), \qquad n = 1, 2, \dots. \tag{13.7.28}$$

Put for any positive integer $n$,

$$Z_n = \zeta\Big(\frac{1}{2} + iS_n\Big) - \mathbf{E}\, \zeta\Big(\frac{1}{2} + iS_n\Big) = \zeta_n - \mathbf{E}\, \zeta_n. \tag{13.7.29}$$

A complete second-order theory of the system $\{Z_n, n \ge 1\}$ is developed in [Lifshits–Weber: 2006]. The most striking fact is that this system nearly behaves like a system of non-correlated variables, i.e., the variables $Z_n$ are weakly orthogonal. More precisely:

13.7.2 Theorem. There exist constants $C$, $C_0$ such that

$$\mathbf{E}\, |Z_n|^2 = \log n + C + o(1), \qquad n \to \infty,$$

$$\big| \mathbf{E}\, Z_n \overline{Z_m} \big| \le C_0 \max\Big( \frac{1}{n},\ \frac{1}{2^{m-n}} \Big), \qquad \text{for } m > n + 1.$$

The explicit value of $C$ is $C = \gamma - 2 + 2\int_0^1 \phi(\alpha)\, d\alpha + 2\int_1^{\infty} \big( \phi(\alpha) - \frac{1}{2\alpha} \big)\, d\alpha$, where $\gamma$ is the Euler constant and $\phi(\alpha) = \frac{\alpha e^{\alpha} - 2e^{\alpha} + \alpha + 2}{2\alpha^2 (e^{\alpha} - 1)}$.

Combining then Theorem 13.7.2 with Theorem 9.3.11 also allows us to prove the following theorem [Lifshits–Weber: 2006], which displays a rather slow growth of the Riemann zeta function on the critical line, when sampled by the Cauchy random walk.

13.7.3 Theorem. For any real $b > 2$,

$$\lim_{n\to\infty} \frac{\sum_{k=1}^{n} \zeta(\frac{1}{2} + iS_k) - n}{n^{1/2} (\log n)^b} \overset{\text{a.s.}}{=} 0,$$

and

$$\Big\| \sup_{n\ge 1} \frac{\big| \sum_{k=1}^{n} \zeta(\frac{1}{2} + iS_k) - n \big|}{n^{1/2} (\log n)^b} \Big\|_2 < \infty.$$

The notation a.s. means that the corresponding property holds with probability one. Very likely, results similar to the above theorems are valid when sampling with a large class of random walks with discrete or continuous steps. However, the necessary moment expressions we obtain for the Cauchy distribution are by far more explicit than in other cases, e.g., for Gaussian or Bernoulli distributions. The approach is based on the following classical approximation result (Theorem 4.11 in [Titchmarsh: 1951]): letting, as usual, $s = \sigma + it$, we have

$$\zeta(s) = \sum_{n\le x} \frac{1}{n^s} - \frac{x^{1-s}}{1 - s} + O(x^{-\sigma}), \tag{13.7.30}$$

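uniformly for $\sigma \ge \sigma_0 > 0$, $|t| \le T_x := 2\pi x/K$, where $K$ is any constant $> 1$.

13.7.4 Remark. Clearly (13.7.24) is equivalent to

$$\int_{T/2}^{T} \big| \zeta(1/2 + it) \big|^{2k} dt = O_\varepsilon\big(T^{1+\varepsilon}\big), \qquad k = 1, 2, \dots.$$

This is also equivalent to

$$\int_{n/2}^{n} \Big| \sum_{m=1}^{n} m^{-s} \Big|^{2k} dt = O\big(n^{1+\varepsilon}\big), \qquad k = 1, 2, \dots.$$

Indeed, apply (13.7.30) with $\sigma_0 = 1/2$. Minkowski's inequality yields

$$\Big| \Big( \frac{2}{T_n} \int_{T_n/2}^{T_n} \big| \zeta(1/2 + it) \big|^{2k} dt \Big)^{1/2k} - \Big( \frac{2}{T_n} \int_{T_n/2}^{T_n} \Big| \sum_{m=1}^{n} m^{-(1/2+it)} - \frac{n^{1/2-it}}{1/2 - it} \Big|^{2k} dt \Big)^{1/2k} \Big| \le Cn^{-1/2}.$$

As

$$\int_{T_n/2}^{T_n} \Big| \frac{n^{1/2-it}}{1/2 - it} \Big|^{2k} dt \le Cn^k \int_{T_n/2}^{\infty} \frac{dt}{(1/4 + t^2)^k} \le C_k\, n^{1-k},$$

we get

$$\Big| \Big( \frac{2}{T_n} \int_{T_n/2}^{T_n} \Big| \zeta\Big(\frac{1}{2} + it\Big) \Big|^{2k} dt \Big)^{1/2k} - \Big( \frac{2}{T_n} \int_{T_n/2}^{T_n} \Big| \sum_{m=1}^{n} m^{-(1/2+it)} \Big|^{2k} dt \Big)^{1/2k} \Big| \le C_k\, n^{-1/2},$$

which implies the claimed equivalence.

The Lindelöf Hypothesis and Fourier inversion formula. It is striking to observe that (13.7.26) is "almost" a Fourier inversion formula. If $\nu$ is a distribution function on $\mathbb{R}$ and $\hat\nu(t) = \int_{\mathbb{R}} e^{itx}\, \nu(dx)$ denotes its characteristic function, then (see remarks on "Continuous time and Fourier inversion formula" in Section 1.4)

$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} e^{-itx_0}\, \hat\nu(t)\, dt = \nu\{x_0\}. \tag{13.7.31}$$

From this result it also follows that

$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^2\, dt = \sum_{x\in\mathbb{R}} \nu(\{x\})^2.$$

And actually, for any positive integer $N$,

$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} |\hat\nu(t)|^{2N}\, dt = \sum_{x\in\mathbb{R}} \nu^{*N}(\{x\})^2. \tag{13.7.32}$$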

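Apply (13.7.32) to the measure $\mu = \sum_{k=1}^{n} k^{-\sigma}\,\delta_{\{-\log k\}}$, where $\delta_{\{x\}}$ is the Dirac measure at point $x$, and $\sigma > 1/2$. Then
$$\hat\mu(t) = \sum_{k=1}^{n} k^{-(\sigma+it)},$$
and so
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\Big|\sum_{k=1}^{n}\frac{1}{k^{\sigma+it}}\Big|^{2N}dt = \sum_{x\in\mathbb{R}}\Big(\sum_{k_1\cdots k_N = e^{x}}\frac{1}{k_1^{\sigma}\cdots k_N^{\sigma}}\Big)^{2} = \sum_{\substack{Y = e^{x}\\ Y\in\mathbb{N}}}\frac{\#\{Y=\prod_{i=1}^{N}k_i : k_i\le n\}^{2}}{Y^{2\sigma}} = \sum_{Y\in\mathbb{N}}\frac{d_{N,n}^{2}(Y)}{Y^{2\sigma}}, \tag{13.7.33}$$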

where $d_{N,n}(Y)$ denotes the number of representations of $Y$ as a product of $N$ factors less than or equal to $n$. And clearly
$$\lim_{n\to\infty}\sum_{Y\in\mathbb{N}}\frac{d_{N,n}^{2}(Y)}{Y^{2\sigma}} = \sum_{m=1}^{\infty}\frac{d_k^{2}(m)}{m^{2\sigma}}. \tag{13.7.34}$$
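The identity (13.7.33) can be checked on a small case. The following is a minimal sketch, not part of the original argument: it counts $d_{N,n}(Y)$ by brute force (reading "representations" as ordered $N$-tuples, which is an assumption consistent with expanding the $N$-th power of the Dirichlet polynomial) and evaluates the time average exactly through the elementary formula $\frac{1}{2T}\int_{-T}^{T}e^{it\theta}dt = \sin(T\theta)/(T\theta)$. The function names are illustrative only.

```python
import math
from collections import Counter
from itertools import product


def restricted_divisor_counts(N, n):
    """d_{N,n}(Y): number of ordered N-tuples (k_1..k_N), k_i <= n, with product Y."""
    counts = Counter()
    for factors in product(range(1, n + 1), repeat=N):
        counts[math.prod(factors)] += 1
    return counts


def limit_value(N, n, sigma):
    """Right-hand side of (13.7.33): sum over Y of d_{N,n}(Y)^2 / Y^(2 sigma)."""
    d = restricted_divisor_counts(N, n)
    return sum(c * c / Y ** (2 * sigma) for Y, c in d.items())


def time_average(N, n, sigma, T):
    """Exact (1/2T) * integral over [-T, T] of |sum_{k<=n} k^(-sigma-it)|^(2N).

    The N-th power of the Dirichlet polynomial equals sum_Y d_{N,n}(Y) Y^(-sigma-it),
    so the squared modulus is a double sum of terms e^{it(log Z - log Y)}.
    """
    d = restricted_divisor_counts(N, n)
    coeff = {Y: c / Y ** sigma for Y, c in d.items()}
    total = 0.0
    for Y, cY in coeff.items():
        for Z, cZ in coeff.items():
            theta = math.log(Z) - math.log(Y)
            kernel = 1.0 if Y == Z else math.sin(T * theta) / (T * theta)
            total += cY * cZ * kernel
    return total
```

For instance, `time_average(2, 3, 0.75, T)` approaches `limit_value(2, 3, 0.75)` as `T` grows, the off-diagonal terms decaying like $1/T$.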

Introduce according to (13.7.30) the measures
$$\mu_n = \sum_{k=1}^{n}\frac{1}{k^{\sigma}}\,\delta_{\{-\log k\}},\qquad \nu_n = n^{1-\sigma}\,\delta_{\{-\log n\}},\qquad m(dx) = \chi_{[0,\infty)}(x)\,e^{-(1-\sigma)x}\,dx, \tag{13.7.35}$$
where $\delta_{\{x\}}$ is the Dirac measure at point $x$, and $\sigma > 1/2$. Then
$$\hat\mu_n(t) = \sum_{k=1}^{n} k^{-s},\qquad \hat\nu_n(t) = n^{1-s},\qquad \hat m(t) = \frac{1}{1-s}. \tag{13.7.36}$$
Therefore
$$\sum_{k\le n}\frac{1}{k^{s}} - \frac{n^{1-s}}{1-s} = \hat\mu_n(t) - \hat\nu_n(t)\cdot\hat m(t) := \hat m_n(t)\qquad\text{with } m_n = \mu_n - \nu_n * m. \tag{13.7.37}$$

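We introduce the semi-norms
$$\|f\|_{T,M} = \Big(\frac{1}{2T}\int_{-T}^{T}|f(t)|^{M}\,dt\Big)^{1/M},\qquad \|f\|_{M} = \limsup_{T\to\infty}\|f\|_{T,M}.$$
Choose in what follows $M = 2N$, $N$ some fixed integer, and write $T_n = 2\pi n/K$, $n \ge 1$. We have
$$\sup_{T\le T_n}\Big|\,\|\zeta(\sigma+i\,\cdot)\|_{T,2N} - \|\hat m_n\|_{T,2N}\Big| \le \sup_{T\le T_n}\|\zeta(\sigma+i\,\cdot) - \hat m_n\|_{T,2N} \le Cn^{-\sigma}. \tag{13.7.38}$$
From (13.7.38) it follows that
$$\limsup_{n\to\infty}\ \sup_{T\le T_n}\Big|\,\|\zeta(\sigma+i\,\cdot)\|_{T,2N} - \|\hat m_n\|_{T,2N}\Big| = 0. \tag{13.7.39}$$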

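Let $\varepsilon > 0$, and choose $\ell_0$ so that
$$\sum_{k=\ell_0+1}^{\infty}\frac{1}{k^{2\sigma}} \le \varepsilon.$$
Write for $n_2 \ge n_1$,
$$\mu_{n_1,n_2} = \sum_{k=n_1+1}^{n_2}\frac{1}{k^{\sigma}}\,\delta_{\{-\log k\}}, \tag{13.7.40}$$
so that $\mu_n = \mu_{0,\ell_0} + \mu_{\ell_0,n}$, and $\hat\mu_n(t) = \hat\mu_{0,\ell_0} + \hat\mu_{\ell_0,n}$. Write also
$$m_{\ell_0,n} = \mu_{\ell_0,n} - \nu_n * m.$$
By (13.7.37),
$$m_n = \mu_{0,\ell_0} + m_{\ell_0,n}\qquad\text{and}\qquad \hat m_n(t) = \hat\mu_{0,\ell_0}(t) + \hat m_{\ell_0,n}(t).$$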

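In view of (13.7.33),
$$\|\hat\mu_{0,\ell_0}\|_{T,2N}^{2N} = \frac{1}{2T}\int_{-T}^{T}\Big|\sum_{k=1}^{\ell_0}\frac{1}{k^{\sigma+it}}\Big|^{2N}dt \ \longrightarrow\ \sum_{Y\in\mathbb{N}}\frac{d_{N,\ell_0}^{2}(Y)}{Y^{2\sigma}}. \tag{13.7.41}$$
Choosing then $\ell_0$ large enough (depending on $\varepsilon$ and $\sigma$ only), so that
$$\Big|\sum_{Y\in\mathbb{N}}\frac{d_{N,\ell_0}^{2}(Y)}{Y^{2\sigma}} - \sum_{m=1}^{\infty}\frac{d_N^{2}(m)}{m^{2\sigma}}\Big| \le \varepsilon/2,$$
we get for all $T$ large enough, say $T \ge T_\varepsilon$,
$$\begin{aligned}
\Big|\,\|\hat\mu_{0,\ell_0}\|_{T,2N}^{2N} - \sum_{m=1}^{\infty}\frac{d_N^{2}(m)}{m^{2\sigma}}\Big|
&\le \Big|\frac{1}{2T}\int_{-T}^{T}\Big|\sum_{k=1}^{\ell_0}\frac{1}{k^{\sigma+it}}\Big|^{2N}dt - \sum_{Y\in\mathbb{N}}\frac{d_{N,\ell_0}^{2}(Y)}{Y^{2\sigma}}\Big| \\
&\quad + \Big|\sum_{Y\in\mathbb{N}}\frac{d_{N,\ell_0}^{2}(Y)}{Y^{2\sigma}} - \sum_{m=1}^{\infty}\frac{d_N^{2}(m)}{m^{2\sigma}}\Big| \\
&\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned} \tag{13.7.42}$$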

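This is in particular true for $T \ge T_{n-1}$, assuming $n$ large enough. By the triangle inequality,
$$\Big|\,\|\hat m_n\|_{T,2N} - \|\hat\mu_{0,\ell_0}\|_{T,2N}\Big| \le \|\hat m_{\ell_0,n}\|_{T,2N}.$$
And so for all $n$ large enough and $T_{n-1} \le T \le T_n$,
$$\Big|\,\|\hat m_n\|_{T,2N} - \Big(\sum_{m=1}^{\infty}\frac{d_k^{2}(m)}{m^{2\sigma}}\Big)^{1/2N}\Big| \le \|\hat m_{\ell_0,n}\|_{T,2N} + \varepsilon. \tag{13.7.43}$$
Therefore by (13.7.39), for all $n$ sufficiently large, say $n \ge n_\varepsilon$, and for all $T$ such that $T_{n-1} \le T \le T_n$,
$$\Big|\,\|\zeta(\sigma+i\,\cdot)\|_{T,2N} - \Big(\sum_{m=1}^{\infty}\frac{d_k^{2}(m)}{m^{2\sigma}}\Big)^{1/2N}\Big| \le 2\varepsilon + \|\hat m_{\ell_0,n}\|_{T,2N}. \tag{13.7.44}$$
We may further (and do) assume $n_\varepsilon > \ell_0$. In order to prove (13.7.26), it thus suffices to evaluate for $n$ large (at least $n \ge n_\varepsilon$) and $T_{n-1} \le T \le T_n$,
$$\|\hat m_{\ell_0,n}\|_{T,2N}^{2N} = \frac{1}{T}\int_{-T}^{T}|\hat m_{\ell_0,n}(t)|^{2N}\,dt \tag{13.7.45}$$
and in turn to establish that
$$\sup_{T_{n-1}\le T\le T_n}\frac{1}{T}\int_{-T}^{T}|\hat m_{\ell_0,n}(t)|^{2N}\,dt$$
is small enough for large $n$.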

13.8 An extremal divisor case

Let $(\Omega, \mathcal{A}, \mathbf{P})$ be some probability space on which a Rademacher sequence $\varepsilon = \{\varepsilon_i, i \ge 1\}$ is defined. Consider the sequence of partial sums $S_N = \varepsilon_1 + \dots + \varepsilon_N$, $N = 1, 2, \dots$. Put
$$N_1 = 1,\qquad N_k = \inf\{\,N > N_{k-1} : N \text{ even and } N \mid S_{N^2}\,\}\quad (k > 1). \tag{13.8.1}$$
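To make the definition (13.8.1) concrete, here is a minimal simulation sketch. It is only an illustration under stated assumptions: the function name, the seed handling and the cut-off `max_n` are hypothetical choices, and a single simulated path proves nothing about the almost sure behaviour.

```python
import random


def divisor_times(max_n, seed=0):
    """Return the times N_k <= max_n of (13.8.1) for one simulated Rademacher walk.

    Also returns the list of partial sums, with partial[m] = eps_1 + ... + eps_m.
    """
    rng = random.Random(seed)
    steps = max_n * max_n
    partial = [0] * (steps + 1)
    for m in range(1, steps + 1):
        partial[m] = partial[m - 1] + rng.choice((-1, 1))
    times = [1]  # N_1 = 1 by convention
    for n in range(2, max_n + 1, 2):  # only even N are admissible
        if partial[n * n] % n == 0:
            times.append(n)
    return times, partial
```

Since $S_4$ is always even, the value $N = 2$ is always selected, so every simulated sequence starts with $1, 2$.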

That this random sequence is well defined can be deduced from our result below. Before stating it, we observe that, for no sequence of positive reals $\{a_k, k \ge 1\}$ tending to infinity, can the ratios $N_k/a_k$ converge in probability to a random variable that is positive almost surely. Indeed ([Billingsley: 1999], p. 147) this would imply the validity of a randomly selected central limit theorem, namely
$$\frac{S_{N_k}}{\sqrt{N_k}} \overset{\mathcal{D}}{\Longrightarrow} \mathcal{N}(0,1),$$
which is naturally impossible. The object of this section is to prove the following result, exhibiting an exponential growth of the sequence $(N_k)_k$.

13.8.1 Theorem. Put $s = 2\sum_{j\in\mathbb{Z}} e^{-2\pi^{2}j^{2}}$. For any $\tau > 7/8$,
$$\log N_k = \frac{k}{s} + O(k^{\tau})\qquad\text{almost surely.}$$
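The constant $s$ of Theorem 13.8.1 is very close to 2, because the series converges extremely fast. A minimal numerical sketch (the function name and the number of terms are arbitrary choices made here):

```python
import math


def s_constant(terms=10):
    """s = 2 * sum over j in Z of exp(-2 pi^2 j^2) = 2 + 4 * sum over j >= 1."""
    tail = sum(math.exp(-2.0 * math.pi ** 2 * j * j) for j in range(1, terms + 1))
    return 2.0 + 4.0 * tail
```

Already the term $j = 1$ is of order $10^{-9}$ and the term $j = 2$ is far below double precision, so $1/s$ differs from $1/2$ only in the ninth decimal place.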

The result extends to Bernoulli sequences $\beta = \{\beta_i, i \ge 1\}$. Write $B_N = \beta_1 + \dots + \beta_N$, $N = 1, 2, \dots$. Since $\varepsilon_i \overset{\mathcal{D}}{=} 2\beta_i - 1$, Theorem 13.8.1 gives the same estimate for the sequence
$$M_1 = 1,\qquad M_k = \inf\{\,M > M_{k-1} : M \text{ even and } M \mid 2B_{M^2}\,\}\quad (k > 1).$$

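The proof will rely upon several intermediate results which are of independent interest. We start with a first lemma.

13.8.2 Lemma. For $N$ even,
$$N\,\mathbf{P}\{N \mid S_{N^2}\} = s + O\Big(\frac{\log^{5/2}N}{N^{2}}\Big).$$

Proof. We use the formula $N\delta_{N\mid S_{N^2}} = \sum_{j=0}^{N-1} e^{2i\pi j S_{N^2}/N}$, which by direct integration produces
$$N\,\mathbf{P}\{N \mid S_{N^2}\} = \sum_{j=0}^{N-1}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big). \tag{13.8.2}$$
We evaluate the trigonometric sum $\sum_{j=0}^{N-1}\cos^{N^{2}}\big(\frac{2\pi j}{N}\big)$. Let $a > a' > 3$ be fixed and put for positive integers $N$,
$$\varphi_N = \frac{\sqrt{2a\log N}}{N},\qquad \tau_N = \frac{\sin(\varphi_N/2)}{\varphi_N/2}.$$
We assume $N$ sufficiently large for $\tau_N$ to be greater than $(a'/a)^{1/2}$. Consider the sector $A_N^{0} = \,]-\varphi_N, \varphi_N[\ \cup\ ]\pi-\varphi_N, \pi+\varphi_N[$. Put
$$A_N = \Big\{\frac{2\pi j}{N} :\ 0 \le |j| \le \Big[\frac{\sqrt{2a\log N}}{2\pi}\Big] + 1\ \text{ or }\ 0 \le \Big|\frac{N}{2} - j\Big| \le \Big[\frac{\sqrt{2a\log N}}{2\pi}\Big] + 1\Big\}.$$
Since $\big|\cos\frac{2\pi j}{N}\big| \le \cos\varphi_N$ for $\frac{2\pi j}{N} \notin A_N$, we have
$$\sum_{\substack{0\le j<N\\ 2\pi j/N\notin A_N}}\Big|\cos\frac{2\pi j}{N}\Big|^{N^{2}} \le N e^{-2N^{2}\sin^{2}(\varphi_N/2)},$$
and
$$2N^{2}\sin^{2}(\varphi_N/2) = 2N^{2}(\varphi_N/2)^{2}\tau_N^{2} \ge a'\log N.$$
We deduce that
$$\sum_{\substack{0\le j<N\\ 2\pi j/N\notin A_N}}\Big|\cos\frac{2\pi j}{N}\Big|^{N^{2}} \le N^{-(a'-1)}. \tag{13.8.3}$$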

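If $I$ is an arc, we denote by $I^{\pi}$ the new arc obtained by a rotation of angle $\pi$ and $I^{s} = \{-x : x \in I\}$. Now define
$$I_1^{N} = \Big\{\frac{2\pi j}{N},\ 0 < j \le \Big[\frac{\sqrt{2a\log N}}{2\pi}\Big] + 1\Big\},\qquad I_2^{N} = (I_1^{N})^{s},\quad I_3^{N} = (I_2^{N})^{\pi},\quad I_4^{N} = (I_1^{N})^{\pi}. \tag{13.8.4}$$
Then the sums
$$\Sigma_i^{N} = \sum_{j:\ 2\pi j/N\in I_i^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big),\qquad i = 1, 2, 3, 4,$$
are equal, and since
$$\sum_{\substack{0\le j<N\\ 2\pi j/N\in A_N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) = \sum_{i=1}^{4}\Sigma_i^{N} + 2,$$
we obtain
$$\sum_{\substack{0\le j<N\\ 2\pi j/N\in A_N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) = 4\sum_{2\pi j/N\in I_1^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) + 2.$$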

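Here we have used the fact that $N$ is even to obtain the equality. Now, by using the elementary inequality $|e^{u} - e^{v}| \le |u - v|$ for $u, v \le 0$, we have
$$\begin{aligned}
\Big|\sum_{2\pi j/N\in I_1^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) - \sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}}\Big|
&\le \sum_{2\pi j/N\in I_1^{N}}\Big|N^{2}\log\cos\frac{2\pi j}{N} + 2\pi^{2}j^{2}\Big| \\
&= N^{2}\sum_{2\pi j/N\in I_1^{N}}\Big|\log\cos\frac{2\pi j}{N} + \frac12\Big(\frac{2\pi j}{N}\Big)^{2}\Big|.
\end{aligned}$$
Since $\log(1 - 2\sin^{2}(x/2)) = -x^{2}/2 + O(x^{4})$ near 0, we deduce
$$\Big|\sum_{2\pi j/N\in I_1^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) - \sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}}\Big| \le CN^{2}\sum_{2\pi j/N\in I_1^{N}}\Big(\frac{2\pi j}{N}\Big)^{4} \le C\,\frac{\log^{5/2}N}{N^{2}}. \tag{13.8.5}$$
Now
$$\sum_{j > \left[\frac{\sqrt{2a\log N}}{2\pi}\right]+1} e^{-2\pi^{2}j^{2}} \le \int_{2\pi\left(\left[\frac{\sqrt{2a\log N}}{2\pi}\right]+1\right)}^{\infty} e^{-t^{2}/2}\,dt \le \int_{\sqrt{2a\log N}}^{\infty} e^{-t^{2}/2}\,dt.$$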

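By using the trivial bound $\int_{x}^{\infty} e^{-t^{2}/2}\,dt \le e^{-x^{2}/2}$ for $x \ge 1$, we also get
$$\Big|\sum_{j=1}^{\infty} e^{-2\pi^{2}j^{2}} - \sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}}\Big| \le \sum_{j > \left[\frac{\sqrt{2a\log N}}{2\pi}\right]+1} e^{-2\pi^{2}j^{2}} \le CN^{-a}. \tag{13.8.6}$$
By combining the estimates (13.8.3), (13.8.5), and (13.8.6) we obtain
$$\Big|\sum_{j=0}^{N-1}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) - \Big(2 + 4\sum_{j=1}^{\infty} e^{-2\pi^{2}j^{2}}\Big)\Big| \le C\,\frac{\log^{5/2}N}{N^{2}}. \tag{13.8.7}$$
The result follows from (13.8.7) and the choice of $a$ and $a'$.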
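The exact formula (13.8.2) lends itself to a direct numerical check. The sketch below is an illustration only (function names are chosen here): it computes $\mathbf{P}\{N \mid S_{N^2}\}$ exactly from the binomial distribution, using $S_m = 2B - m$ with $B$ binomial with parameters $m$ and $1/2$, and compares $N$ times this probability with the trigonometric sum.

```python
import math


def divisibility_probability(n):
    """Exact P{n | S_{n^2}} for a Rademacher walk, via S_m = 2B - m, B ~ Binomial(m, 1/2)."""
    m = n * n
    favourable = sum(math.comb(m, b) for b in range(m + 1) if (2 * b - m) % n == 0)
    return favourable / 2 ** m


def cosine_sum(n):
    """Right-hand side of (13.8.2): sum over j < n of cos(2 pi j / n)^(n^2)."""
    return sum(math.cos(2 * math.pi * j / n) ** (n * n) for j in range(n))
```

For even $n$ the sum is dominated by the two terms $j = 0$ and $j = n/2$, each equal to 1, which is why `cosine_sum(n)` stays within about $10^{-8}$ of 2, in line with the value of $s$ in Lemma 13.8.2.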

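To prove Theorem 13.8.1, it will also be necessary to estimate $\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\}$, and more precisely the correlation
$$\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - \mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\}.$$
This is a considerably more delicate task. The basic tool will however be again the formula used for proving Lemma 13.8.2. The next statement is the crucial step towards the proof of Theorem 13.8.1.

13.8.3 Lemma. There exists an absolute constant $C$, and for every $\varepsilon > 0$, a constant $C_\varepsilon$ depending on $\varepsilon$ only, such that for any even positive integers $N \ge M$ large enough:

a) if $M \le N \le M + N/(\log N)^{1/2}$, then
$$\Big|\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - \mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\}\Big| \le C\,\frac{(\log N)^{1/2}(\log M)^{1/2}}{NM};$$

b) if $N > M + N/(\log N)^{1/2}$, then
$$\Big|\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - \mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\}\Big| \le C_\varepsilon\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}N} + \frac{1}{N^{2}}(\log N)^{3/4+\varepsilon}\Big].$$

Proof. The formula
$$MN\,\delta_{N\mid S_{N^2}}\,\delta_{M\mid S_{M^2}} = \sum_{j=0}^{N-1}\sum_{k=0}^{M-1} e^{2i\pi(jS_{N^2}/N + kS_{M^2}/M)}$$
provides by direct integration for $M \le N$:
$$MN\,\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} = \sum_{j=0}^{N-1}\sum_{k=0}^{M-1}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big). \tag{13.8.8}$$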
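Formula (13.8.8) rests on splitting $S_{N^2}$ into $S_{M^2}$ and the independent increment $S_{N^2} - S_{M^2}$. The following sketch, given as an illustration with names chosen here, computes the joint probability exactly by that same decomposition and compares it with the double trigonometric sum for small even $M \le N$.

```python
import math


def walk_pmf(m):
    """Law of S_m for a Rademacher walk: P{S_m = 2b - m} = C(m, b) / 2^m."""
    return {2 * b - m: math.comb(m, b) / 2 ** m for b in range(m + 1)}


def joint_probability(n, m):
    """Exact P{n | S_{n^2}, m | S_{m^2}} for m <= n, using independent increments."""
    first = walk_pmf(m * m)                # law of S_{m^2}
    increment = walk_pmf(n * n - m * m)    # law of S_{n^2} - S_{m^2}
    total = 0.0
    for a, pa in first.items():
        if a % m:
            continue
        total += pa * sum(pc for c, pc in increment.items() if (a + c) % n == 0)
    return total


def cosine_double_sum(n, m):
    """Right-hand side of (13.8.8)."""
    return sum(
        math.cos(2 * math.pi * (j / n + k / m)) ** (m * m)
        * math.cos(2 * math.pi * j / n) ** (n * n - m * m)
        for j in range(n)
        for k in range(m)
    )
```

As a hand-checkable case, for $N = 4$, $M = 2$ the event $\{2 \mid S_4\}$ is certain and $\{4 \mid S_{16}\}$ has probability $1/2$, so both sides of (13.8.8) equal 4.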

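We use the notation from the proof of Lemma 13.8.2.

(A) Consider the sum
$$\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big).$$
We observe that
$$\begin{aligned}
&\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) \\
&\qquad = \sum_{a=1}^{4}\sum_{b=1}^{4}\ \sum_{\substack{2\pi j/N\in I_a^{N}\\ 2\pi k/M\in I_b^{M}}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) + B,
\end{aligned} \tag{13.8.9}$$
where $B$ is the sum of the terms corresponding to $j = 0, N/2$ and to $k = 0, M/2$; that is to say
$$B = 2\sum_{2\pi k/M\in\bigcup_{i=1}^{4} I_i^{M}}\cos^{M^{2}}\Big(\frac{2\pi k}{M}\Big) + 2\sum_{2\pi j/N\in\bigcup_{i=1}^{4} I_i^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) + 4. \tag{13.8.10}$$
By considering the sixteen possible cases, one sees that the sums
$$\sum_{\substack{2\pi j/N\in I_a^{N}\\ 2\pi k/M\in I_b^{M}}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big)$$
are either equal to
$$\Sigma_1 = \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) \tag{13.8.11a}$$
or
$$\Sigma_2 = \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}-\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big), \tag{13.8.11b}$$
in an equal number of times. We have used here the fact that $N$ and $M$ are even. We thus find
$$\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) \tag{13.8.12}$$
$$= 8\Big\{\Sigma_1 + \Sigma_2 + \sum_{2\pi k/M\in I_1^{M}}\cos^{M^{2}}\Big(\frac{2\pi k}{M}\Big) + \sum_{2\pi j/N\in I_1^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big)\Big\} + 4. \tag{13.8.13}$$
Examine the sums $\Sigma_i$, $i = 1, 2$. We have
$$\Big|\Sigma_1 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}+2jk\frac{M}{N}\}}\Big| \le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big|e^{M^{2}\log\cos 2\pi(\frac{j}{N}+\frac{k}{M}) + (N^{2}-M^{2})\log\cos 2\pi(\frac{j}{N})} - e^{-2\pi^{2}\{k^{2}+j^{2}+2jk\frac{M}{N}\}}\Big|.$$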

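The right-hand side is a sum of terms of the type $|e^{u} - e^{v}|$, which we bound by $|u - v|$ since $u$ and $v$ are negative. Hence, we continue the inequality above with
$$\begin{aligned}
&\le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big|M^{2}\log\cos 2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big) + (N^{2}-M^{2})\log\cos 2\pi\Big(\frac{j}{N}\Big) + 2\pi^{2}\Big\{k^{2}+j^{2}+2jk\frac{M}{N}\Big\}\Big| \\
&= \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big|M^{2}\log\cos 2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big) + (N^{2}-M^{2})\log\cos 2\pi\Big(\frac{j}{N}\Big) + 2\pi^{2}\Big\{M^{2}\Big(\frac{j}{N}+\frac{k}{M}\Big)^{2} + (N^{2}-M^{2})\Big(\frac{j}{N}\Big)^{2}\Big\}\Big|,
\end{aligned} \tag{13.8.14}$$
and by using again the estimate $\log(1 - 2\sin^{2}(x/2)) = -x^{2}/2 + O(x^{4})$ near 0, we continue the inequality above with
$$\begin{aligned}
&\le C\sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big\{M^{2}\Big(\frac{j}{N}+\frac{k}{M}\Big)^{4} + (N^{2}-M^{2})\Big(\frac{j}{N}\Big)^{4}\Big\} \\
&\le C\sum_{\substack{2\pi j\le\sqrt{a\log N}\\ 2\pi k\le\sqrt{a\log M}}}\Big\{M^{2}\frac{j^{4}}{N^{4}} + \frac{k^{4}}{M^{2}} + \frac{j^{4}}{N^{2}}\Big\} \\
&\le C(\log N)^{1/2}(\log M)^{1/2}\Big\{\frac{(\log N)^{2}}{N^{2}} + \frac{(\log M)^{2}}{M^{2}} + \frac{(\log N)^{2}}{N^{2}}\Big\} \\
&\le C\,\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}}.
\end{aligned}$$
Hence,
$$\Big|\Sigma_1 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}+2jk\frac{M}{N}\}}\Big| \le C\,\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}}. \tag{13.8.15a}$$
Similarly
$$\Big|\Sigma_2 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}-2jk\frac{M}{N}\}}\Big| \le C\,\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}}. \tag{13.8.15b}$$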

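Consider the sums
$$\Sigma_{+} = \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big(e^{-2\pi^{2}\{k^{2}+j^{2}-2jk\frac{M}{N}\}} - e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big),\qquad \Sigma_{-} = \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}}\Big(e^{-2\pi^{2}\{k^{2}+j^{2}+2jk\frac{M}{N}\}} - e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big).$$
We shall consider in what follows two cases. We will indeed distinguish, for estimating the sums $\Sigma_{\pm}$, the case $M \le N \le M + N/(\log N)^{1/2}$ from the case $N > M + N/(\log N)^{1/2}$.

Case 1: $M \le N \le M + N/(\log N)^{1/2}$. Since $2\pi^{2}\{k^{2}+j^{2}\} - 4\pi^{2}jk\frac{M}{N} = 2\pi^{2}\big\{k^{2}+j^{2}-2jk\frac{M}{N}\big\} \ge 0$, we can bound the exponential term by 1, and then obtain
$$|\Sigma_{+}| \le 2\sum_{\substack{2\pi j\le\sqrt{a\log N}\\ 2\pi k\le\sqrt{a\log M}}} 1 \le C(\log N)^{1/2}(\log M)^{1/2}. \tag{13.8.16}$$
Therefore
$$\Big|\Sigma_2 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big| \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + (\log N)^{1/2}(\log M)^{1/2}\Big] \le C(\log N)^{1/2}(\log M)^{1/2}, \tag{13.8.17}$$
and
$$|\Sigma_{-}| = \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big|e^{-4\pi^{2}jk\frac{M}{N}} - 1\Big| \le C.$$
So,
$$\Big|\Sigma_1 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big| \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + 1\Big] \le C(\log N)^{1/2}(\log M)^{1/2}.$$

Case 2: $N > M + N/(\log N)^{1/2}$. Put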

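$$\Sigma = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} jk\,e^{-2\pi^{2}\{k^{2}+j^{2}\}+4\pi^{2}jk\frac{M}{N}}. \tag{13.8.18}$$
Then, by using the inequality $|e^{u} - e^{v}| \le |u - v|(e^{u}\vee e^{v})$,
$$\begin{aligned}
|\Sigma_{+}| &\le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big|e^{4\pi^{2}jk\frac{M}{N}} - 1\Big| \\
&\le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} 4\pi^{2}jk\,\frac{M}{N}\,e^{-2\pi^{2}\{k^{2}+j^{2}\}+4\pi^{2}jk\frac{M}{N}} \le 4\pi^{2}\frac{M}{N}\,\Sigma.
\end{aligned} \tag{13.8.19a}$$
What is crucial for the sequel is the factor $M/N$ in (13.8.19). Before continuing, we observe that $|\Sigma_{-}|$ admits a similar bound. This follows from the elementary inequality $1 - e^{-u} \le e^{u} - 1$ for $u \ge 0$. Then
$$\begin{aligned}
|\Sigma_{-}| &\le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big|e^{-4\pi^{2}jk\frac{M}{N}} - 1\Big| \le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big|e^{4\pi^{2}jk\frac{M}{N}} - 1\Big| \\
&\le \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} 4\pi^{2}jk\,\frac{M}{N}\,e^{-2\pi^{2}\{k^{2}+j^{2}\}+4\pi^{2}jk\frac{M}{N}} \le 4\pi^{2}\frac{M}{N}\,\Sigma.
\end{aligned} \tag{13.8.19b}$$
We write
$$\Sigma = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} jk\,e^{-2\pi^{2}j^{2} - 2\pi^{2}[k - j\frac{M}{N}]^{2} + 2\pi^{2}j^{2}\frac{M^{2}}{N^{2}}} = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} jk\,e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}] - 2\pi^{2}[k - j\frac{M}{N}]^{2}}. \tag{13.8.20}$$
We shall first estimate the sum
$$\sum_{k=1}^{\infty} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}}.$$
Denoting by $[x]$ the integer part of $x$, we have
$$\begin{aligned}
\sum_{k=1}^{\infty} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} &= \sum_{1\le k\le j\frac{M}{N}} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} + \sum_{k > j\frac{M}{N}} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} \\
&= \sum_{1\le k\le [j\frac{M}{N}]} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} + \sum_{k > [j\frac{M}{N}]} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} \le 2\sum_{l=0}^{\infty} e^{-2\pi^{2}l^{2}}.
\end{aligned}$$
Further
$$\sum_{j=2}^{\infty} e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}]} \le \int_{1}^{\infty} e^{-2\pi^{2}x^{2}[1-(\frac{M}{N})^{2}]}\,dx \le C\,\frac{1}{\sqrt{1-(\frac{M}{N})^{2}}} \le C(\log N)^{1/4},$$
since $1 - (\frac{M}{N})^{2} \ge 1/(\log N)^{1/2}$. Hence
$$\sum_{j=1}^{\infty} e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}]} \le C(\log N)^{1/4}, \tag{13.8.21}$$
and so
$$\sum_{j=1}^{\infty}\sum_{k=1}^{\infty} e^{-2\pi^{2}\{k^{2}+j^{2}\}+4\pi^{2}jk\frac{M}{N}} \le C(\log N)^{1/4}. \tag{13.8.22}$$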
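The two steps leading to (13.8.22) can be illustrated numerically. In the sketch below, which is only an illustration under stated assumptions, the parameter `rho` plays the role of the ratio $M/N < 1$, the infinite sums are truncated at a hypothetical cut-off `terms`, and the constant in (13.8.21) is made explicit through the Gaussian integral $\int_0^\infty e^{-2\pi^{2}cx^{2}}dx = 1/(2\sqrt{2\pi c})$ with $c = 1 - \rho^{2}$.

```python
import math

TWO_PI_SQ = 2.0 * math.pi ** 2


def outer_sum(rho, terms=200):
    """Truncation of sum over j >= 1 of exp(-2 pi^2 j^2 (1 - rho^2)), as in (13.8.21)."""
    c = 1.0 - rho * rho
    return sum(math.exp(-TWO_PI_SQ * c * j * j) for j in range(1, terms + 1))


def double_sum(rho, terms=200):
    """Truncation of the double sum in (13.8.22), written in the form (13.8.20) without jk."""
    c = 1.0 - rho * rho
    total = 0.0
    for j in range(1, terms + 1):
        weight = math.exp(-TWO_PI_SQ * c * j * j)
        inner = sum(math.exp(-TWO_PI_SQ * (k - j * rho) ** 2) for k in range(1, terms + 1))
        total += weight * inner
    return total


def integral_bound(rho):
    """Integral over [0, infinity) of exp(-2 pi^2 (1 - rho^2) x^2) = 1 / (2 sqrt(2 pi (1 - rho^2)))."""
    return 1.0 / (2.0 * math.sqrt(2.0 * math.pi * (1.0 - rho * rho)))
```

As `rho` approaches 1 the outer sum grows like $(1-\rho^{2})^{-1/2}$, which is the source of the factor $(\log N)^{1/4}$ in Case 2, while the inner sum over $k$ stays bounded by $2\sum_{l\ge 0}e^{-2\pi^{2}l^{2}}$ uniformly in $j$.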

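Consider the symmetric measure $p$ on $\mathbb{Z}_{+}^{2}$ defined by
$$p_{j,k} = e^{-2\pi^{2}\{k^{2}+j^{2}\}+4\pi^{2}jk\frac{M}{N}}.$$
Then, its total mass is bounded and satisfies $p(\mathbb{Z}_{+}^{2}) \le C(\log N)^{1/4}$. Put $\tilde p = p/p(\mathbb{Z}_{+}^{2})$, and let $(X, Y)$ be a $\mathbb{Z}_{+}^{2}$-valued random vector such that $(X, Y) \overset{\mathcal{L}}{=} \tilde p$, namely $\mathbf{P}\{X = j, Y = k\} = p_{j,k}/p(\mathbb{Z}_{+}^{2})$. Then we have
$$\Sigma \le p(\mathbb{Z}_{+}^{2})\,\mathbf{E}\,XY \le p(\mathbb{Z}_{+}^{2})\,\|X\|_{2}\|Y\|_{2} = p(\mathbb{Z}_{+}^{2})\,\mathbf{E}\,X^{2} \tag{13.8.23}$$
because $\tilde p$ is symmetric. But
$$\begin{aligned}
p(\mathbb{Z}_{+}^{2})\,\mathbf{E}\,X^{2} &= p(\mathbb{Z}_{+}^{2})\sum_{j=1}^{\infty} j^{2}\,\mathbf{P}\{X = j\} = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} j^{2}p_{j,k} \\
&= \sum_{j=1}^{\infty} j^{2}e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}]}\sum_{k=1}^{\infty} e^{-2\pi^{2}[k - j\frac{M}{N}]^{2}} \le C\sum_{j=1}^{\infty} j^{2}e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}]}.
\end{aligned} \tag{13.8.24}$$
Moreover,
$$\sum_{j=2}^{\infty} j^{2}e^{-2\pi^{2}j^{2}[1-(\frac{M}{N})^{2}]} \le 2\int_{1}^{\infty} x^{2}e^{-2\pi^{2}x^{2}[1-(\frac{M}{N})^{2}]}\,dx \le 2\int_{1}^{\infty} y^{2}e^{-y^{2}/2}\,dy\,\Big\{2\pi\Big[1-\Big(\frac{M}{N}\Big)^{2}\Big]\Big\}^{-3/2} \le C(\log N)^{3/4},$$
with the change of variables $x = \big\{2\pi\big[1-(\frac{M}{N})^{2}\big]\big\}^{-1/2}y$. Therefore
$$\Sigma \le C(\log N)^{3/4}, \tag{13.8.25}$$
and
$$\max\big(|\Sigma_{+}|, |\Sigma_{-}|\big) \le C\,\frac{M}{N}(\log N)^{3/4}. \tag{13.8.26}$$
By combining (13.8.15a), (13.8.15b), (13.8.19a), (13.8.19b) with (13.8.26), we consequently get for $i = 1, 2$,
$$\Big|\Sigma_i - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big| \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + \frac{M}{N}(\log N)^{3/4}\Big]. \tag{13.8.27}$$
We now complete the estimate of the sum (13.8.9) in Case 2. Recall also (13.8.5): for $K$ even,
$$\Big|\sum_{2\pi j/K\in I_1^{K}}\cos^{K^{2}}\Big(\frac{2\pi j}{K}\Big) - \sum_{2\pi j/K\in I_1^{K}} e^{-2\pi^{2}j^{2}}\Big| \le C\,\frac{\log^{5/2}K}{K^{2}}.$$
Applying it for $K = M$ and $K = N$ allows us to estimate the sums in (13.8.10). Now, it follows from the decomposition (13.8.12) that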

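$$\begin{aligned}
&\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) - \Big(4\sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}} + 2\Big)\Big(4\sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}} + 2\Big) \\
&\quad = 8\Big(\Sigma_2 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big) + 8\Big(\Sigma_1 - \sum_{\substack{2\pi j/N\in I_1^{N}\\ 2\pi k/M\in I_1^{M}}} e^{-2\pi^{2}\{k^{2}+j^{2}\}}\Big) \\
&\qquad + 8\Big(\sum_{2\pi k/M\in I_1^{M}}\cos^{M^{2}}\Big(\frac{2\pi k}{M}\Big) - \sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}}\Big) + 8\Big(\sum_{2\pi j/N\in I_1^{N}}\cos^{N^{2}}\Big(\frac{2\pi j}{N}\Big) - \sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}}\Big) + 4 - 4.
\end{aligned} \tag{13.8.28}$$
And so, by using estimates (13.8.27), (13.8.28) and (13.8.5) we get
$$\begin{aligned}
&\Big|\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) - \Big(4\sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}} + 2\Big)\Big(4\sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}} + 2\Big)\Big| \\
&\quad \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + \frac{M}{N}(\log N)^{3/4} + \frac{\log^{5/2}M}{M^{2}} + \frac{\log^{5/2}N}{N^{2}}\Big] \\
&\quad \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + \frac{M}{N}(\log N)^{3/4}\Big].
\end{aligned} \tag{13.8.29}$$
This achieves the estimate of the sum (13.8.9) in Case 2. We now return to Case 1. This case is easier. By using (13.8.28), and then estimates (13.8.17) and (13.8.18), we obtain
$$\begin{aligned}
&\Big|\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) - \Big(4\sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}} + 2\Big)\Big(4\sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}} + 2\Big)\Big| \\
&\quad \le C\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + (\log N)^{1/2}(\log M)^{1/2}\Big] \le C(\log N)^{1/2}(\log M)^{1/2}.
\end{aligned} \tag{13.8.30}$$
Now
$$\begin{aligned}
&\Big(4\sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}} + 2\Big)\Big(4\sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}} + 2\Big) - s^{2} \\
&\quad = \Big(s - 4\sum_{2\pi j > \sqrt{2a\log N}} e^{-2\pi^{2}j^{2}}\Big)\Big(s - 4\sum_{2\pi k > \sqrt{2a\log M}} e^{-2\pi^{2}k^{2}}\Big) - s^{2} \\
&\quad = -4s\sum_{2\pi j > \sqrt{2a\log N}} e^{-2\pi^{2}j^{2}} - 4s\sum_{2\pi k > \sqrt{2a\log M}} e^{-2\pi^{2}k^{2}} + 16\sum_{\substack{2\pi j > \sqrt{2a\log N}\\ 2\pi k > \sqrt{2a\log M}}} e^{-2\pi^{2}j^{2}-2\pi^{2}k^{2}}.
\end{aligned}$$
And, by using estimate (13.8.6),
$$\Big|\Big(4\sum_{2\pi j/N\in I_1^{N}} e^{-2\pi^{2}j^{2}} + 2\Big)\Big(4\sum_{2\pi k/M\in I_1^{M}} e^{-2\pi^{2}k^{2}} + 2\Big) - s^{2}\Big| \le C\big[N^{-a} + M^{-a} + N^{-a}M^{-a}\big].$$
Write
$$U = \Big|\sum_{\substack{2\pi j/N\in A_N\\ 2\pi k/M\in A_M}}\cos^{M^{2}}\Big(2\pi\Big(\frac{j}{N}+\frac{k}{M}\Big)\Big)\cos^{N^{2}-M^{2}}\Big(\frac{2\pi j}{N}\Big) - s^{2}\Big|. \tag{13.8.31}$$
Since $a > 3$, we have obtained
$$U \le \begin{cases} C(\log N)^{1/2}(\log M)^{1/2} & \text{(Case 1)},\\[4pt] C\Big[\dfrac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + \dfrac{M}{N}(\log N)^{3/4}\Big] & \text{(Case 2)}.\end{cases} \tag{13.8.32}$$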

(B) Now, by the first step of the proof of Lemma 13.8.2,

M 2 N 2 −M 2 cos 2π( j + k ) cos 2πj ≤ N −(a −2) .

2πj N

N

∈A / N

M

N

(13.8.33)

(C) Finally, we estimate the sum N 2 −M 2 M 2 cos 2π j + k cos 2πj .

2πj N

N

∈AN , 2πk / M M ∈A

By considering, successively, the cases

2π

2πj N

M

N

∈ IiN , i = 1, 2, 3, 4, one sees that the sets

2π k 2πj k j , + ∈ / AM , ∈ IiN , N M M N

i = 2, 3, 4

are obtained from the set

j 2π k 2πj k 2π , + ∈ / AM , ∈ I1N , N M M N

by the transformations I → I π and I → I s defined in the proof of Lemma 13.8.2. Using now the fact that AcM is invariant under the transformation I → I s finally allows us to write

cos 2π

2πj N ∈AN 2π k ∈A M / M

=8

k M j + N M

2πj N N ∈I1 ϕM < 2πk M <π−ϕM

2

N 2 −M 2 cos 2πj

N

N 2 −M 2 M 2 cos 2π j + k cos 2πj .

N

M

N

(13.8.34)


k < π − ϕM , and Now, two cases are to be distinguished: i) ϕM < 2π Nj + M

j k ii) π − ϕM ≤ 2π N + M ≤ π . k k ) ≤ π − ϕM . Then, cos 2π( Nj + M ) ≤ cos ϕM ≤ Case i): ϕM ≤ 2π( Nj + M 2 2 k M ) ≤ M −a ; e−2 sin (ϕM /2) . Thus, as in the proof of Lemma 13.8.2, cos 2π( Nj + M and we obtain

cos 2π

2πj N N ∈I1 k ϕM ≤ 2π M ≤π −ϕM

2

2πj N k M j + cos N M N

2 −M 2

2 ≤ CM −(a −1) log N . (13.8.35)

k k Case ii): π − ϕM < 2π( Nj + M ) ≤ π . This is possible only if π − ϕM − ϕN ≤ 2π M < 2π k π − ϕM ; and this means that the points M cover an arc of length at most ϕN . Thus, by the very construction of the set AN , the number ν of such points satisfies # " N ! M √ = 2π N 2a log N if ϕN ≥ 2π ν ≤ Mϕ N , 2π

ν=0

else.

It follows that we always have ν ≤

N

2πj N N ∈I1 j k )≤π π −ϕM ≤2π( N + M

M√ N log N

We deduce

N 2 −M 2 M 2 M cos 2π j + k cos 2πj ≤ C log N .

But

M √ 2π N 2a log N.

≤

M

N

N

(13.8.36)

√ log M. And by using this, we find

cos 2π

2πj N ∈AN 2πk ∈A M / M

2

2πj N k M j + cos N M N

−(a −1)

2

≤C M log N + 2 2 ≤ C log M log N.

2

2 −M 2

2 log M log N

(13.8.37)

The latter estimate will concern the case M ≤ N ≤ M + N/(log N )1/2 . In the case when N > M + N/(log N)1/2 , we will operate differently. We shall treat the sum

N 2 −M 2 M 2 cos 2π j + k cos 2πj

2πj N N ∈I1 j k )≤π π−ϕM ≤2π( N + M

N

M

N


in a more subtle way. Fix some ε > 0. First N 2 −M 2 M 2 cos 2π j + k cos 2πj

N

1 +ε j ≤(log N) 4 j k )≤π π −ϕM ≤2π( N + M

≤

1 +ε j ≤(log N) 4 j k )≤π π −ϕM ≤2π( N + M

M

N

(13.8.38)

3 M 1 ≤ C (log N ) 4 +ε . N

Now, N 2 −M 2 M 2 cos 2π j + k cos 2πj

N

1 2πj N 4 +ε N ∈I1 , j >(log N) j k π−ϕM ≤2π( N + M )≤π

M ≤ C (log N)1/2 N

2πj N 2πj N =

Since

∈ I1N , we have

cos

1 − 2 sin2

πj N

πj N πj N

sin

N 2 −M 2 cos 2πj .

2πj N N ∈I1 1 +ε j >(log N) 4

N

2 ≤ 1 − 2(a /a)( πj N ) . Therefore

N 2 −M 2 cos 2πj ≤

N

∈I1N

N

≥ (a /a)1/2 , for N large enough. And, so 0 ≤

2πj N

M

2πj N

e−2(N

2 −M 2 )(a /a)( πj )2 N

.

(13.8.39)

∈I1N

√ 1 But, in the considered case, we have N 2 − M 2 ≥ N 2 / log N and j > (log N ) 4 +ε , so that πj 2 ≥ 2π 2 (a /a)(log N )2ε . 2(N 2 − M 2 )(a /a) N From the elementary inequality e−x ≤ p!x p , x ≥ 1, we deduce e−2(N

2 −M 2 )(a /a)( πj )2 N

≤ e−2π

2 (a /a)(log N )2ε

≤ Cε (log N )−1/2 ,

(13.8.40)

where Cε depends on ε only. It follows that

N 2 −M 2 cos 2πj ≤

2πj N

∈I1N

N

2πj N

e−2(N

2 −M 2 )(a /a)( πj )2 N

∈I1N

≤ Cε (log N )1/2 (log N )−1/2 Cε .

(13.8.41)


We thus arrive at N 2 −M 2 M 2 M cos 2π j + k cos 2πj ≤ Cε (log N )1/2 ,

N

1 2πj N 4 +ε N ∈I1 , j >(log N) j k π −ϕM ≤2π( N + M )≤π

M

N

N

(13.8.42) and finally N 2 −M 2 M 2 3 M cos 2π j + k cos 2πj ≤ Cε (log N ) 4 +ε .

N

2πj N N ∈I1 j k )≤π π−ϕM ≤2π( N + M

M

N

N

(13.8.43) Consequently,

M 2 N 2 −M 2 3 M cos 2π k cos 2πj ≤ Cε (log N ) 4 +ε .

2πj N ∈AN 2π k ∈A M / M

M

N

N

(13.8.44)

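(D) Combining (13.8.31), (13.8.32), (13.8.33), (13.8.37) and (13.8.44) gives
$$\Big|NM\,\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - s^{2}\Big| \le \begin{cases} C(\log N)^{1/2}(\log M)^{1/2} & \text{(Case 1)},\\[4pt] C_\varepsilon\Big[\dfrac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}} + \dfrac{M}{N}(\log N)^{\frac34+\varepsilon}\Big] & \text{(Case 2)}.\end{cases} \tag{13.8.45}$$
But, by Lemma 13.8.2,
$$\begin{aligned}
\Big|\mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\} - \frac{s^{2}}{MN}\Big| &\le C\,\frac{s}{M}\Big|\mathbf{P}\{N \mid S_{N^2}\} - \frac{s}{N}\Big| + C\,\frac{s}{N}\Big|\mathbf{P}\{M \mid S_{M^2}\} - \frac{s}{M}\Big| \\
&\le C\Big[\frac{(\log N)^{5/2}}{N^{2}M} + \frac{(\log M)^{5/2}}{M^{2}N}\Big] \le C\,\frac{(\log M)^{5/2}}{M^{2}N}.
\end{aligned}$$
Hence
$$\Big|\mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\} - \frac{s^{2}}{MN}\Big| \le C\,\frac{1}{N}\,\frac{(\log M)^{5/2}}{M^{2}}. \tag{13.8.46}$$
Finally, by combining the two last estimates, we obtain, if $M \le N \le M + N/(\log N)^{1/2}$, that
$$\Big|\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - \mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\}\Big| \le C\,\frac{(\log N)^{1/2}(\log M)^{1/2}}{NM},$$
and if $N > M + N/(\log N)^{1/2}$,
$$\Big|\mathbf{P}\{N \mid S_{N^2}, M \mid S_{M^2}\} - \mathbf{P}\{N \mid S_{N^2}\}\,\mathbf{P}\{M \mid S_{M^2}\}\Big| \le C_\varepsilon\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}N} + \frac{1}{N^{2}}(\log N)^{\frac34+\varepsilon}\Big].$$
The proof of Lemma 13.8.3 is now complete.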

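In the next lemma, we put for any positive integer $l$, $\xi_l = \mathbf{1}_{l\mid S_{l^2}} - \mathbf{P}\{l \mid S_{l^2}\}$.

13.8.4 Lemma. For any $\varepsilon > 0$, there exists a constant $C_\varepsilon$, depending on $\varepsilon$ only, such that for any positive integers $j \ge i$ large enough,
$$\mathbf{E}\Big(\sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\xi_l\Big)^{2} \le C_\varepsilon\sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\frac{\log^{3/4+\varepsilon}l}{l}. \tag{13.8.47}$$

Proof. We have
$$\mathbf{E}\Big(\sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\xi_l\Big)^{2} = \sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\mathbf{P}\{l \mid S_{l^2}\}\big(1 - \mathbf{P}\{l \mid S_{l^2}\}\big) + 2\sum_{\substack{i\le M<N\le j\\ M, N\ \mathrm{even}}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big). \tag{13.8.48}$$
By using Lemma 13.8.2,
$$\sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\mathbf{P}\{l \mid S_{l^2}\}\big(1 - \mathbf{P}\{l \mid S_{l^2}\}\big) \le C\sum_{\substack{i\le l\le j\\ l\ \mathrm{even}}}\frac{1}{l}.$$
Write now the second sum as
$$\begin{aligned}
&\sum_{i\le M<N\le j}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big) \\
&\quad = \sum_{\substack{i\le M<N\le j\\ N\le M+N/\log^{1/2}N}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big) \\
&\qquad + \sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big).
\end{aligned}$$
By using Lemma 13.8.3 (a),
$$\Big|\sum_{\substack{i\le M<N\le j\\ N\le M+N/\log^{1/2}N}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big)\Big| \le C\sum_{\substack{i\le M<N\le j\\ N\le M+N/\log^{1/2}N}}\frac{(\log M)^{1/2}(\log N)^{1/2}}{MN} \le C\sum_{i\le M\le j}\frac{(\log M)^{1/2}}{M}.$$
And by using Lemma 13.8.3 (b),
$$\Big|\sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big)\Big| \le C_\varepsilon\sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\Big[\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}N} + \frac{1}{N^{2}}(\log N)^{\frac34+\varepsilon}\Big].$$
Now
$$\sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\frac{(\log M)^{5/2}(\log N)^{1/2}}{M^{2}N} \le \sum_{i\le M\le j}\frac{(\log M)^{5/2}}{M^{2}}\ \sum_{i\le N\le j}\frac{(\log N)^{1/2}}{N} \le C\sum_{i\le N\le j}\frac{(\log N)^{1/2}}{N}.$$
And
$$\sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\frac{1}{N^{2}}(\log N)^{\frac34+\varepsilon} \le C_\varepsilon\sum_{i\le M\le j}\frac{(\log M)^{\frac34+\varepsilon}}{M}.$$
Thus
$$\Big|\sum_{\substack{i\le M<N\le j\\ N> M+N/\log^{1/2}N}}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big)\Big| \le C_\varepsilon\sum_{i\le M\le j}\frac{(\log M)^{\frac34+\varepsilon}}{M}.$$
Therefore,
$$\Big|\sum_{i\le M<N\le j}\Big(\mathbf{P}\{M \mid S_{M^2}, N \mid S_{N^2}\} - \mathbf{P}\{M \mid S_{M^2}\}\,\mathbf{P}\{N \mid S_{N^2}\}\Big)\Big| \le C_\varepsilon\sum_{i\le M\le j}\frac{(\log M)^{\frac34+\varepsilon}}{M}.$$
This establishes Lemma 13.8.4.

The next tool is a direct consequence of Lemma 8.3.3 (see also Lemma 10, p. 45 in Sprindžuk [1979]), which we recall for convenience.

13.8.5 Lemma. Let $\{f_l, l \ge 1\}$ be a sequence of nonnegative random variables, and let $\varphi = \{\varphi_l, l \ge 1\}$, $m = \{m_l, l \ge 1\}$ be two sequences of nonnegative reals such that $0 \le \varphi_l \le m_l \le 1$ $(l \ge 1)$. Write
$$M_n = \sum_{1\le l\le n} m_l. \tag{13.8.49}$$
Assume that $M_n \uparrow \infty$ and that the following condition is satisfied:
$$\mathbf{E}\Big(\sum_{i\le l\le j}(f_l - \varphi_l)\Big)^{2} \le C\sum_{i\le l\le j} m_l,\qquad 0 \le i \le j < \infty. \tag{13.8.50}$$
Then, for every $a > 3/2$,
$$\sum_{1\le l\le n} f_l \overset{a.s.}{=} \sum_{1\le l\le n}\varphi_l + O\big(M_n^{1/2}\log^{a}M_n\big). \tag{13.8.51}$$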

We are now in a position to give the proof of Theorem 13.8.1. For any positive even integer $l$, put $f_l = \mathbf{1}_{l\mid S_{l^2}}$, $\varphi_l = \mathbf{P}\{l \mid S_{l^2}\}$ and $m_l = (\log l)^{\frac34+\varepsilon}/l$, and let these quantities equal 0 for odd values of $l$. It is clear from Lemma 13.8.4 that condition (13.8.50) is fulfilled. Since $M_n$ is of order $\log^{\frac74+\varepsilon}n$, it follows from Lemma 13.8.5 that for any real $a > 3/2$,
$$\sum_{l=1}^{n}\mathbf{1}_{l\mid S_{l^2}} = \sum_{l=1}^{n}\mathbf{P}\{l \mid S_{l^2}\} + O\big((\log^{\frac78+\varepsilon}n)\log^{a}\log n\big)\qquad\text{almost surely.} \tag{13.8.52}$$
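The order of magnitude of $M_n$ used above comes from comparing the sum with the integral of $(\log x)^{3/4+\varepsilon}/x$. A minimal numerical sketch follows; the function names and the explicit leading term $(\log n)^{7/4+\varepsilon}/\big(2(7/4+\varepsilon)\big)$, where the factor $1/2$ accounts for the restriction to even $l$, are choices made here for illustration.

```python
import math


def m_sum(n, eps):
    """M_n = sum over even l <= n of (log l)^(3/4 + eps) / l."""
    a = 0.75 + eps
    return sum(math.log(l) ** a / l for l in range(2, n + 1, 2))


def leading_term(n, eps):
    """(log n)^(7/4 + eps) / (2 (7/4 + eps)), from integrating (log x)^(3/4+eps) / x."""
    a = 0.75 + eps
    return math.log(n) ** (a + 1.0) / (2.0 * (a + 1.0))
```

The ratio `m_sum(n, eps) / leading_term(n, eps)` stays close to 1 for moderately large `n`, so that $M_n^{1/2}$ is of order $\log^{7/8+\varepsilon/2}n$.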

By Lemma 13.8.2, for $l$ even
$$\mathbf{P}\{l \mid S_{l^2}\} = s\,l^{-1} + O\Big(\frac{\log^{5/2}l}{l^{2}}\Big)$$
with $s = 2 + 4\sum_{j=1}^{\infty} e^{-2\pi^{2}j^{2}}$. And so
$$\sum_{l=1}^{n}\mathbf{1}_{l\mid S_{l^2}} = s\log n + O\big((\log^{\frac78+\varepsilon}n)\log^{a}\log n\big)\qquad\text{almost surely.} \tag{13.8.53}$$
This easily implies the result.

Remarks. 1. The same method also allows us to study the sequence of integers $N \in \mathcal{N}$ such that $N \mid S_{N^2}$, where $\mathcal{N}$ is an increasing subsequence such that the series $\sum_{N\in\mathcal{N}} 1/N$ diverges. For instance, if $\mathcal{N} = \mathcal{P}$, where $\mathcal{P}$ is the sequence of primes, the method applies and in this case it is worth noting that, if $N \mid S_{N^2}$, necessarily $N$ is the largest prime divisor of $S_{N^2}$.

2. A complementary result to Theorem 13.8.1 is also relatively easy to deduce from Lemmas 13.8.3 and 13.8.4. Let $\alpha$ be some positive number. Put for any positive integer $k$,
$$A_k = \big\{\exists\,\ell \in [2^{k^{\alpha}}, 2^{(k+1)^{\alpha}}[\ :\ \ell \mid S_{\ell^{2}}\big\}.$$
If $\alpha$ is sufficiently large (for instance $\alpha > 12$), then for any $a > 3/2$,
$$\sum_{k=1}^{n}\mathbf{1}_{A_k} = n + O\big(n^{1/2}\log^{a}n\big)\qquad\text{almost surely.}$$

Bibliography Akcoglu, M., Bellow, A., Jones, R. L., Losert, V., Reinhold-Larsson, K., and Wierdl, M. [1996]: Strong sweeping out property for lacunary sequences, Riemann sums, convolution powers and related matters, Ergodic Theory Dynam. Systems 16, 207–253. Akcoglu, M., Ha, D., and Jones, R. [1994]: Divergence of ergodic averages, in Topological vector spaces, algebras and related areas (Hamilton, ON, 1994), Pitman Res. Notes Math. Ser. 316, Longman Sci. Tech., Harlow, 175–192. Akcoglu, M., Ha, D., and Jones, R. [1997]: Sweeping out properties of operator sequences, Canad. J. Math. 49, 3–23. Adler, A., Ordónez, C., Rosalsky, A., and Volodin, A. [1999]: Degenerate weak convergence of row sums fors arrays of random elements in stable type p Banach spaces, Bull. Inst. Math. Acad. Sinica 27, 187–212. Ahmed, S., Giuliano Antonini, R., and Volodin, A. [2002]: On the rate of complete convergence for weighted sums of arrays of Banach space valued random elements with application to moving average processes, Statist. Probab. Lett. 58, 185–194. Alexits, G. [1961]: Convergence problems of orthogonal series. Internat. Ser. Monogr. Pure Appl. Math. 20, Pergamon Press, New York, Oxford, Paris. Assani, I. [1997a]: Strong laws for weighted sums of independent identically distributed random variables, Duke Math. J. 88, 217–246. Assani, I. [1997b]: Convergence of the p-series for stationary sequences, New York J. Math. 3A, 15–30. Assani, I. [1998]: A weighted pointwise ergodic theorem, Ann. Inst. H. Poincaré Probab. Statist. 34, 139–150. Assani, I. [2003]: Wiener–Wintner dynamical systems, Ergodic Theory Dynam. Systems 23, 1637–1654. Assani, I. [2004]: Spectral characterization of Wiener–Wintner dynamical systems, Ergodic Theory Dynam. Systems 24, 347–365. Assani, I., Lesigne, E., and Rudolph, D. [1995]: Wiener–Wintner return-times ergodic theorem, Israel J. Math. 92, 375–395. Atlagh, M. 
[1993]: Théorème central limite presque sûr et loi du logarithme itéré, Thesis, Publication IRMA, Strasbourg. Atlagh, M., and Weber, M. [1992]: Le théorème central limite presque sûr relatif à des sous-suites, C. R. Acad. Sci. Paris Sér. I Math. 315, 202–206. Atlagh, M., and Weber, M. [2000]: Le théorème central limite presque sûr, Expo. Math. 18, 97–126. Azuma, K. [1967]: Weighted sums of certain dependent random variables, Tôhoku Math. J. (2) 19, 357–367. Baker, A. [1984]: A concise introduction to the theory of numbers, Cambridge University Press, Cambridge. Baker, R. C. [1976]: Riemann sums and Lebesgue integrals, Quart. J. Math. Oxford Ser. 27, 191–198.


Baker, R. C. [1981]: Metric number theory and the large sieve, J. London Math. Soc. 24, 34–40. Balasubramanian, R., and Ramachandra, K. [1988]: On the number of integers n such that nd(n) ≤ x, Acta Arith. 49, 313–322. Banach, S. [1926]: Sur la convergence presque partout des fonctionnelles linéaires, Bull. Sci. Math. 50, 27–32, 36–43. Baum, L. E., and Katz, M. [1965]: Convergence rates in the law of large numbers, Trans. Amer. Math. Soc. 120, 108–123. Baushev, A. N. [1987]: On the weak convergence of Gaussian measures, Theory Probab. Appl. 32, 670–677. Baxter, J., Jones R., Lin, M., and Olsen, J. [2004]: SLLN for weighted independent identically distributed random variables, J. Theoret. Probab. 17, 165–181. Bayart, F., Konyagin, S. V., and Queffélec, H. [2003/2004]: Convergence almost everywhere and divergence everywhere of Taylor and Dirichlet series, Real Anal. Exchange 29, 557–586. Beck, J., and Chen, W. W. L. [1987]: Irregularities of distribution, Cambridge Tracts in Math. 89, Cambridge University Press, Cambridge. Bednorz, W. [2006a]: A theorem on majorizing measures, Ann. Probab. 34, 1771–1781. Bednorz, W. [2006b]: A note on Menshov-Rademacher inequality, Bull. Polish Acad. Sci. Math. 54, 89–93. Bellman, R. [1944]: Almost orthogonal series, Bull. Amer. Math. Soc. 50, 517–519. Bellow, A. [1983]: On “Bad Universal” sequences in ergodic theory II, Lectures Notes in Math. 1033, Springer-Verlag. Bellow, A. [1989]: Perturbation of a sequence, Adv. Math. 78, 131–139. Bellow, A., and Losert, V. [1985]: The weighted pointwise ergodic theorem and the individual ergodic theorem along subsequences, Trans. Amer. Math. Soc. 288, 307–345. Bellow, A., and Jones, R. [1996]: A Banach principle for L∞ , Adv. Math. 120, 155–172. Bellow, A., Jones, R., and Rosenblatt, J. [1990]: Convergence for moving averages, Ergodic Theory Dynam. Systems 10, 43–62. Bellow, A., Jones, R. L., and Rosenblatt, J. [1992]: Almost everywhere convergence of weighted averages, Math. Ann. 293, 399–426. 
Bellow, A., Rosenblatt, J., and Tempelman, A. [1994]: Almost everywhere convergence of convolution power, Ergodic Theory Dynam. Systems 14, 415–432. Belyaev, Y. K. [1961]: Local properties of the sample functions of stationary Gaussian processes, Teor. Veroyatnost. i Primenen 5, 128–131. Berend, D., and Bergelson, V. [1984]: Ergodic and mixing sequences of transformations, Ergodic Theory Dynam. Systems 4, 353–366. Bergelson, V. [1985]: Sets of recurrence of Zm -actions and sets of differences in Zm , J. London Math. Soc. 2, 295–304. Bergelson, V., Boshernitzan, M., and Bourgain, J. [1994]: Some results on non-linear recurrence, J. Anal. Math. 62, 29–46. Bergelson V., Host, B., and Kra B. [2005]: Multiple recurrence and nilsequences, with an appendix by Imre Ruzsa, Invent. Math. 160, 261–303.


Berkes, I. [1973]: On Strassen’s version of the loglog law for multiplicative systems, Studia Sci. Math. Hungar. 8, 425–431. Berkes, I. [1975]: An almost sure invariance principle for lacunary trigonometric series, Acta Math. Acad. Sci. Hungar. 26, 209–220. Berkes, I. [1976a]: On the asymptotic behavior of f (nk x). Main theorems, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 34, 319–345. Berkes, I. [1976b]: On the asymptotic behavior of f (nk x). Applications, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 34, 347–365. Berkes, I. [1993]: Critical behavior of the trigonometric system, Trans. Amer. Math. Soc. 338, 553–585. Berkes, I. [1997]: On the convergence of n cn f (nx) and the Lip 1/2 class, Trans. Amer. Math. Soc. 349, 4143–4158. Berkes, I., and Philipp, W. [1993]: The size of trigonometric and Walsh series and uniform distribution modulo 1, J. London Math. Soc. 50, 454–464. Berkes, I., and Philipp, W. [1998]: A limit theorem for lacunary series f (nk x), Studia Sci. Math. Hungar. 34, 1–13. Berkes, I., and Weber, M. [2005]: Upper–lower class tests and frequency results along subsequences, Stochastic Process. Appl. 115, 679–700. Berkes, I., and Weber, M. [2005]: A law of the iterated logarithm for arithmetic functions, Proc. Amer. Math. Soc. 135, 1223–1232. Berkes, I., and Weber, M. [2006]: Moment convergence and the law of the iterated logarithm for arithmetic functions, Acta Arith. 1, 43–55. Berkes, I., and Weber, M. [2009]: On the convergence of ck f (nk x), to appear in Mem. Amer. Math. Soc. Berkson, E., and Gillespie, T. A. [1987]: Steckin’s theorem, transference, and spectral decomposition, J. Funct. Anal. 70, 140–170. Berman, S. M. [1969a]: Local times and sample function properties of stationary Gaussian processes, Trans. Amer. Math. Soc. 137, 277–299. Berman, S. M. [1969b]: Harmonic analysis of local times and sample functions of Gaussian processes, Trans. Amer. Math. Soc. 143, 269–281. Berman, S. M. 
[1970a]: Occupation times of stationary Gaussian processes, J. Appl. Probability 7, 721–733. Berman, S. M. [1970b]: Gaussian processes with stationary increments: Local times and sample function properties, Ann. Math. Statist. 41, 1260–1272. Berman, S. M. [1984]: Unboundedness of sample functions of stochastic processes with arbitrary parameter sets, with applications to linear and lp -valued parameters, Osaka J. Math. 21, 133–147. Bernstein, S. [1941]: Sur les sommes de grandeurs aléatoires liées de classes (A, N ) et (B, N ), C. R. (Doklady) Acad. Sci. URSS (N.S.) 32, 303–307. Bertrand-Mathis, A. [1986]: Ensemble intersectif et récurrence de Poincaré, Israel J. Math. 55, 184–198. Bertrand-Mathis, A. [1969]: Harmonic analysis of local times and sample functions of Gaussian processes, Trans. Amer. Math. Soc. 143, 269–281.


Bibliography

Beurling, A. [1989]: The collected works of Arne Beurling, Vol. 2: Harmonic analysis, Contemporary Mathematicians, Birkhäuser, Boston, 378–380.
Billingsley, P. [1965]: Ergodic theory and information, Wiley Ser. Probab. Math. Sci., John Wiley & Sons, New York.
Billingsley, P. [1999]: Convergence of probability measures, second edition, Wiley Ser. Probab. Statist. Probab. Statist., John Wiley & Sons, Inc., New York.
Bingham, N., Goldie, C., and Teugels, J. [1987]: Regular variation, Cambridge University Press, Cambridge.
Birkhoff, G. D. [1931]: Proof of the ergodic theorem, Proc. Nat. Acad. Sci. U.S.A. 17, 656–660.
Blanchard, A. [1969]: Initiation à la théorie analytique des nombres premiers, Coll. Travaux et Recherches Math. 19, Dunod, Paris.
Blum, J. R., and Cogburn, R. [1975]: On ergodic sequences of measures, Proc. Amer. Math. Soc. 51, 359–365.
Blum, J. R., and Hanson, D. L. [1960]: On the mean ergodic theorem for subsequences, Bull. Amer. Math. Soc. 66, 308–311.
Boas, R. P. [1941]: A general moment problem, Amer. J. Math. 63, 361–370.
Bochner, S. [1955]: Harmonic analysis and the theory of probability, University of California Press, Berkeley and Los Angeles.
Bohnenblust, H. F., and Hille, E. [1931]: On the absolute convergence of Dirichlet series, Ann. of Math. 32, 600–622.
Bohr, H. [1952]: Collected mathematical works, Dansk Matematisk Forening, København.
Bombieri, E. [1971]: Density theorems for the zeta function, in 1969 Number theory institute, Proc. Sympos. Pure Math. 20, Amer. Math. Soc., Providence, R.I., 352–358.
Borell, C. [1974]: Convex measures on locally convex spaces, Ark. Mat. 12, 239–252.
Borell, C. [1975]: The Brunn-Minkowski inequality in Gauss space, Invent. Math. 30, 207–216.
Borell, C. [1976]: Gaussian Radon measures on locally convex spaces, Math. Scand. 38, 265–284.
Borell, C. [1977]: A note on Gauss measures which agree on small balls, Ann. Inst. H. Poincaré Sect. B (N.S.) 13, 231–238.
Borell, C. [1978]: Tail probabilities in Gauss space, in Vector space measures and applications I, Lecture Notes in Math. 644, Springer-Verlag, Berlin, 73–82.
Borwein, P., and Lockhart, R. [2001]: The expected Lp-norm of random polynomials, Proc. Amer. Math. Soc. 129, 1463–1472.
Boukhari, F. [2002]: Convergence et propriétés métriques des moyennes ergodiques pondérées, Thesis, Publication IRMA, Strasbourg.
Boukhari, F., and Weber, M. [2002]: Almost sure convergence of weighted series of contractions, Illinois J. Math. 46, 1–21.
Bourgain, J. [1987]: Ruzsa’s problem on sets of recurrence, Israel J. Math. 59, 150–166.
Bourgain, J. [1988a]: Almost sure convergence and bounded entropy, Israel J. Math. 63, 79–95.
Bourgain, J. [1988b]: On the maximal ergodic theorem for certain subsets of the integers, Israel J. Math. 61, 39–72.
Bourgain, J. [1988c]: On the pointwise ergodic theorem on Lp for arithmetic sets, Israel J. Math. 61, 73–84.


Bourgain, J. [1988d]: Pointwise ergodic theorems for arithmetic sets, with an appendix on return time sequences, jointly with H. Furstenberg, Y. Katznelson, D. Ornstein, Inst. Hautes Études Sci. Publ. Math. 69, 5–45.
Bourgain, J. [1988e]: An approach to pointwise ergodic theorems, in Geometric aspects of functional analysis (1986/87), Lecture Notes in Math. 1317, Springer-Verlag, Berlin, 204–223.
Bourgain, J. [1990]: Problems of almost everywhere convergence related to harmonic analysis and number theory, Israel J. Math. 71, 97–127.
Boyd, A. V. [1959]: Inequalities for Mills’ ratio, Rep. Statist. Appl. Res. Un. Jap. Sci. Engrs. 6, 44–46.
Breiman, L. [1992]: Probability, Classics in Appl. Math. (first edition 1968), SIAM, Philadelphia, PA.
de la Bretèche, R., and Tenenbaum, G. [2005]: Entiers friables: inégalité de Turán-Kubilius et applications, Invent. Math. 159, 531–588.
Browder, F. [1958]: On the iteration of transformations in noncompact minimal dynamical systems, Proc. Amer. Math. Soc. 5, 773–780.
Bruck, R. E. [1979]: A simple proof of the mean ergodic theorem for non linear contractions in Banach spaces, Israel J. Math. 32, 107–116.
Buczolich, Z. [2007]: Universally L1-good sequences with gaps tending to infinity, Acta Math. Hungar. 117, 91–140.
Buczolich, Z., and Mauldin, R. D. [2005]: Divergent square averages, preprint available at http://www.cs.elte.hu/~buczo/pubbb.htm
Bugeaud, Y., and Weber, M. [1998]: Examples and counterexamples for Riemann sums, Indag. Math. 9, 1–13.
Burkholder, D. L., and Gundy, R. F. [1970]: Extrapolation and interpolation of quasi-linear operators on martingales, Acta Math. 124, 249–304.
Burr, S. A. [1970]: In Combinatorial theory and its applications, III, ed. by P. Erdös, A. Rényi, V. T. Sós, Coll. Math. Soc. J. Bolyai 4, North-Holland Publishing Co., Amsterdam, 1155.
Burton, R., and Denker, M. [1987]: On the central limit theorem for dynamical systems, Trans. Amer. Math. Soc. 302, 715–726.
Byrnes, J. S., Giroux, A., and Shisha, O. [1984]: Riemann sums and improper integrals of step functions related to the prime number theorem, J. Approx. Theory 40, 180–192.
Calderón, A. P. [1968]: Ergodic theory and translation-invariant operators, Proc. Nat. Acad. Sci. U.S.A. 59, 349–353.
Campbell, J. [1986]: Spectral analysis of the ergodic Hilbert transform, Indiana Univ. Math. J. 35, 379–390.
Campbell, J., and Petersen, K. [1989]: The spectral measure and Hilbert transform of a measure-preserving transformation, Trans. Amer. Math. Soc. 313, 121–129.
Carleson, L. [1966]: On convergence and growth of partial sums of Fourier series, Acta Math. 116, 135–157.
Carlson, L. [1975]: Good sequences of integers, J. Number Theory 7, 91–104.
Cartan, H. [1961]: Théorie élémentaire des fonctions analytiques d’une ou plusieurs variables complexes, Avec le concours de Reiji Takahashi, Enseignement des Sciences, Hermann, Paris.


Cassels, J. W. S. [1950]: Some metrical theorems in diophantine approximation, Proc. Cambridge Philos. Soc. 46, 209–218.
Chandrasekharan, K., and Minakshisundaram, S. [1952]: Typical means, Tata Institute of Fund. Res. Bombay, Monogr. Math. Phys. 1, Oxford University Press, Oxford.
Chatterjee, S. [2005]: An error bound in the Sudakov-Fernique inequality, arXiv:math/0510424v1.
Chen, Y. G. [2000]: The best quantitative Kronecker’s Theorem, J. London Math. Soc. 61, 691–701.
Chobanyan, S. A., Nguyen, Z. T., and Tarieladze, V. I. [1978]: On compactness of families of second order measures in a Banach space, Theor. Probability Appl. 22, 805–810.
Chow, Y. S., and Lai, T. L. [1973]: Limiting behavior of weighted sums of independent random variables, Ann. Probability 1, 810–824.
Chowla, S. D., and Vijayaraghavan, T. [1947]: On the largest prime divisors of numbers, J. Indian Math. Soc. (N.S.) 11, 31–37.
Chui, C. K. [1969]: A convergence theorem for certain Riemann sums, Canad. Math. Bull. 12, 523–525.
Chui, C. K. [1971]: Concerning rates of convergence of Riemann sums, J. Approximation Theory 4, 279–287.
Chung, K. L. [1974]: A course in probability theory, second edition, Probability and Mathematical Statistics 21, Academic Press, New York, London.
Chung, K. L., Erdös, P., and Sirao, T. [1959]: On the Lipschitz’s condition for Brownian motion, J. Math. Soc. Japan 11, 263–274.
Cislo, J., and Wolf, M. [2008]: Criteria equivalent to the Riemann Hypothesis, arXiv:0808.0640v2.
Civin, P. [1955]: Abstract Riemann sums, Pacific J. Math. 5, 861–868.
Clarke, L. E. [1969]: Dirichlet series with independent and identically distributed coefficients, Proc. Cambridge Philos. Soc. 66, 393–397.
Cohen, G., and Lin, M. [2003]: Laws of large numbers with rates and the one-sided ergodic Hilbert transform, Illinois J. Math. 47, 997–1031.
Cohen, G., Jones, R. L., and Lin, M. [2004]: On strong laws of large numbers with rates, in Chapel Hill ergodic theory workshops, Contemp. Math. 356, Amer. Math. Soc., Providence, R.I., 101–126.
Cohen, G., and Lin, M. [2005]: Extensions of the Menchoff-Rademacher theorem with applications to ergodic theory, Israel J. Math. 148, 41–86.
Cohen, G., and Cuny, C. [2005]: On Billard’s theorem for random Fourier series, Bull. Pol. Acad. Sci. Math. 53, 39–53.
Cohen, G., and Cuny, C. [2006]: On random almost periodic series and random ergodic theory, Ergodic Theory Dynam. Systems 26, 683–709.
Cohen, G., and Cuny, C. [2006]: On random almost periodic trigonometric polynomials and applications to ergodic theory, Ann. Probab. 34, 39–79.
Coifman, R. R., and Weiss, G. [1977]: Transference methods in analysis, CBMS Regional Conf. Series Math. 31, Amer. Math. Soc., Providence, RI.
Conze, J. P. [1973]: Convergence des moyennes ergodiques pour des sous-suites, Bull. Soc. Math. France 35, 7–15.


Coquet, J., Kamae, K., and Mendes-France, M. [1977]: Sur la mesure spectrale de certaines suites arithmétiques, Bull. Soc. Math. France 105, 369–384.
Cornfeld, I. P., Fomin, S. V., and Sinai, Y. G. [1982]: Ergodic theory, Grundlehren Math. Wiss. 245, Springer-Verlag, Berlin.
Cramér, H. [1936]: On the order of magnitude of the difference between consecutive primes, Acta Arith. 2, 23–46.
Cramér, H., and Leadbetter, M. R. [1967]: Stationary and related stochastic processes. Sample function properties and their applications, John Wiley & Sons, Inc., New York, London, Sydney.
Csörgö, M., and Révész, P. [1981]: Strong approximations in probability and statistics, Akadémiai Kiadó, Budapest.
Cuny, C. [2005]: On randomly weighted one-sided ergodic Hilbert transforms, Ergodic Theory Dynam. Systems 25, 89–99; Addendum, ibid. 25, 101–106.
Cuny, C., and Weber, M. [2006]: On the convergence of moments in the CLT for triangular arrays with an application to random polynomials, Coll. Math. 106, 147–160.
Dajani, K., and Kraaikamp, C. [1998]: A note on the approximation by continued fractions under an extra condition, New York J. Math. 3A, 69–80.
Darmois, G. [1951]: Sur diverses propriétés caractéristiques de la loi de probabilité de Laplace–Gauss, Bull. Inst. Internat. Statist. 23 (2), 79–82.
Darmois, G. [1953]: Analyse générale des liaisons stochastiques, Rev. Inst. Internat. Statistique 21, 2–8.
Dartyge, C. [1996]: Entiers de la forme n^2 + 1 sans grand facteur premier, Acta Math. Hungar. 72, 1–34.
Dartyge, C. [2005]: Méthodes de crible, Cours de DEA de théorie des nombres 2004–2005, Université H. Poincaré-Nancy I.
Davis, B. [1976]: On the Lp norms of stochastic integrals and other martingales, Duke Math. J. 43, 697–704.
De la Rue, T. [1993]: Espaces de Lebesgue, in Séminaire de probabilités. XXVII, Lecture Notes in Math. 1557, Springer-Verlag, Berlin, 15–21.
De la Rue, T., Ladouceur, S., Peškir, G., and Weber, M. [1997]: On the central limit theorem for aperiodic dynamical systems and applications, Teor. Ĭmovīr. Mat. Stat. 57, 140–159; English transl. in Theory Probab. Math. Statist. 57 (1998), 149–169.
De la Torre, A. [1976]: A simple proof of the maximal ergodic theorem, Canad. J. Math. 28, 1073–1075.
Del Junco, A., and Rosenblatt, J. [1984]: Counterexamples in ergodic theory and number theory, Math. Ann. 247, 185–197.
Del Junco, A., Reinhold, K., and Weiss, B. [1999]: Partitions with independent iterates along IP-Sets, Ergodic Theory Dynam. Systems 19, 447–473.
Deniel, Y. [1989]: On the a.s. Cesàro-α convergence for stationary or orthogonal sequences, J. Theoret. Probab. 2, 475–485.
Deniel, Y., and Derriennic, Y. [1988]: Sur la convergence presque sûre au sens de Cesàro d’ordre α, de variables indépendantes et identiquement distribuées, Probab. Theory Related Fields 79, 629–636.


Denker, M. [1989]: The central limit theorem for dynamical systems, in Dynamical systems and ergodic theory, Banach Center Publications 23, PWN-Polish Scientific Publishers Warszawa, 33–62.
Derriennic, Y., and Lin, M. [2001]: Fractional Poisson equations and ergodic theorems for fractional coboundaries, Israel J. Math. 123, 93–130.
Dobrić, V., Marcus, M., and Weber, M. [1988]: The distribution of large values of the supremum of a Gaussian process, Astérisque 157-158, 95–127.
Doeblin, W. [1940]: Remarques sur la théorie métrique des fractions continues, Compositio Math. 7, 353–371.
Dragomir, S. S. [2004]: On the Boas–Bellman inequality in inner product spaces, Bull. Austral. Math. Soc. 69, 217–225.
Drmota, M., and Tichy, R. [1997]: Sequences, discrepancies and applications, Lecture Notes in Math. 1651, Springer-Verlag, Berlin.
Dubins, L. E., and Pitman, J. [1979]: A pointwise ergodic theorem for the group of rational rotations, Trans. Amer. Math. Soc. 251, 299–308.
Dudley, R. M. [1967]: The size of compact subsets of Hilbert space and continuity of Gaussian processes, J. Funct. Anal. 1, 290–330.
Dudley, R. M. [1973]: Sample functions of the Gaussian processes, Ann. Probab. 1, 66–103.
Dunford, N., and Schwartz, J. T. [1956]: Convergence almost everywhere of operator averages, J. Rational Mech. Anal. 5, 129–178.
Dunford, N., and Schwartz, J. T. [1958]: Linear operators, I. General theory, Pure Appl. Math. 7, Interscience Publishers, New York.
Dvoretzky, A., and Chojnacki, H. [1947]: Sur les changements de signe d’une série à termes complexes, C. R. Acad. Sci. Paris 222, 515–518.
Dvoretzky, A., and Erdös, P. [1955]: On power series diverging everywhere on the circle of convergence, Michigan Math. J. 3, 31–35.
Dvoretzky, A., and Erdös, P. [1959]: Divergence of random power series, Michigan Math. J. 6, 343–347.
Edwards, R. E., and Gaudry, G. I. [1977]: Littlewood–Paley and multiplier theory, Ergeb. Math. Grenzgeb. 90, Springer-Verlag, Berlin, New York.
Ehrhard, A. [1983]: Symétrisation dans l’espace de Gauss, Math. Scand. 53, 281–301.
Ehrhard, A. [1984a]: Sur l’inégalité de Sobolev logarithmique de Gross, in Seminar on probability, XVIII, Lecture Notes in Math. 1059, Springer-Verlag, Berlin, 194–196.
Ehrhard, A. [1984b]: Inégalités isopérimétriques et intégrales de Dirichlet gaussiennes, Ann. Sci. École Norm. Sup. (4) 17, 317–332.
Elliott, P. D. [1980]: Probabilistic number theory II, Grundlehren Math. Wiss. 240, Springer-Verlag, New York.
England, J. W., and Martin, N. F. G. [1968]: On weak mixing metric automorphisms, Bull. Amer. Math. Soc. 74, 505–507.
Erdös, P. [1946]: On the distribution function of additive functions, Ann. of Math. 47, 1–20.
Erdös, P. [1949]: On a theorem of Hsu and Robbins, Ann. Math. Statist. 20, 286–291.
Erdös, P. [1962]: On trigonometric sums with gaps, Magyar Tud. Akad. Mat. Kut. Int. Közl. 7, 37–42.


Erdös, P., and Gál, I. S. [1955]: On the law of the iterated logarithm I, II, Nederl. Akad. Wetensch. Proc. Ser. A. 58, 65–76, 77–84 (Indag. Math. 17, 65–76, 77–84).
Erdös, P., and Kac, M. [1940]: The Gaussian law of errors in the theory of additive number-theoretic functions, Amer. J. Math. 62, 738–742.
Erdös, P., and Koksma, J. F. [1949]: On the uniform distribution modulo 1 of sequences (f(n, θ)), Nederl. Akad. Wetensch. Proc. 52, 851–854 (Indag. Math. 11, 299–302).
Erdös, P., and Rényi, A. [1957]: A probabilistic approach to problems of diophantine approximation, Illinois J. Math. 1, 303–315.
Euler, L. [1988]: Introduction to analysis of the infinite, translated by John D. Blanton, Springer-Verlag.
Fan, K. [1945]: Two mean theorems in Hilbert spaces, Proc. Amer. Math. Soc. 31, 417–421.
Fan, K. [1946]: On positive definite functions, Ann. of Math. 47, 593–607.
Fazekas, I. [1985]: Convergence rates in the Marcinkiewicz strong law of large numbers for Banach space valued random variables with multidimensional indices, Publ. Math. Debrecen 32, 203–209.
Fazekas, I. [1992]: Convergence rates in the law of large numbers for arrays, Publ. Math. Debrecen 41, 53–71.
Fefferman, C. [1973]: Pointwise convergence of Fourier series, Ann. of Math. 98, 551–571; Erratum [1997]: Ann. of Math. (2) 146, 239.
Fejér, L. [1915]: Über trigonometrische Polynomen, J. für Math. 146, 53–82.
Feller, W. [1971]: An introduction to probability theory and its applications, Vol. I, II, second edition, John Wiley & Sons, Inc. New York.
Fernique, X. [1970]: Intégrabilité des vecteurs gaussiens, C. R. Acad. Sci. Paris Sér. A-B 270, 1698–1699.
Fernique, X. [1975]: Régularité des trajectoires de fonctions aléatoires gaussiennes, in École d’été de probabilités de Saint-Flour. IV-1974, Lecture Notes in Math. 480, Springer-Verlag, 1–96.
Fernique, X. [1985]: Sur la convergence étroite des mesures gaussiennes, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 68, 331–336.
Fink, A. M., Mitrinović, D. S., and Pećarić, J. E. [1993]: Classical and new inequalities in analysis, Kluwer Academic Publishers, Dordrecht.
Fisher, E. [1992]: A Skorohod representation and an invariance principle for sums of weighted i.i.d. random variables, Rocky Mountain J. Math. 22, 169–179.
Fischler, R. M. [1967]: Borel–Cantelli type theorems for mixing sets, Acta Math. Acad. Sci. Hungar. 18, 67–69.
Fogels, E. [1940]: On the average values of arithmetical functions, Acta Univ. Latviensis 3, 285–313 = Publ. Sem. Math. Univ. Lettonie, no. 16.
Fominykh, M. Y. [1985]: Properties of Riemann sums, Izv. Vyssh. Uchebn. Zaved. Mat. 88 (4), 65–73; English transl. Soviet Math. (Iz. VUZ) 29, 83–93.
Fominykh, M. Y. [1987]: Convergence almost everywhere of Riemann sums, Vestnik Moskov. Univ. Ser. I Mat. Mekh. (6), 67–70; English transl. Moscow Univ. Math. Bull. 42 (6), 69–72.
Franel, J. [1924]: Les suites de Farey et le problème des nombres premiers, Göttinger Nachrichten, 198–201.


Fukuyama, K. [1990]: Functional central limit theorem and Strassen’s law of the iterated logarithms for weakly multiplicative systems, J. Math. Kyoto Univ. 30, 625–635.
Furstenberg, H. [1981]: Recurrence in ergodic theory and combinatorial number theory, Princeton University Press, Princeton, N.J.
Gabisoniya, O. D. [1973]: Points of strong summability of Fourier series, Mat. Zametki 14, 615–626; English transl. Math. Notes 14, 913–918.
Gabisoniya, O. D. [1989]: Points of convergence of multiple Fourier series, Soobshch. Akad. Nauk Gruzin. SSR 133, 21–24 (in Russian).
Gál, I. S. [1949]: A theorem concerning diophantine approximation, Nieuw. Arch. Wiskunde 23, 13–38.
Gál, I. S., and Koksma, J. F. [1950]: Sur l’ordre de grandeur des fonctions sommables, Indag. Math. 12, 192–207.
Gamet, C. [1996]: Théorèmes de convergence en moyenne et entropie métrique de moyennes ergodiques, Thesis, Publication IRMA, Strasbourg.
Gamet, C., and Weber, M. [2000]: Entropy numbers of some ergodic averages, Teor. Veroyatnost. i Primenen. 44 (1999), 776–795; English transl. Theory Probab. Appl. 44 (2000), 650–668.
Gaposhkin, V. F. [1966a]: Lacunary series and independent functions, Uspekhi Mat. Nauk 21 (6), 1–82; English transl. Russian Math. Surveys 21, 3–82.
Gaposhkin, V. F. [1966b]: On series by the system φ(nx), Mat. Sb. 69, 328–353; English transl. Amer. Math. Soc. Transl. Ser. (2) 86 (1970), 167–197.
Gaposhkin, V. F. [1967]: A system of convergence, Mat. Sb. 74, 93–99; English transl. Math. USSR Sb. 3, 83–90.
Gaposhkin, V. F. [1968]: On convergence and divergence systems, Mat. Zametki 4, 253–260 (in Russian).
Gaposhkin, V. F. [1969]: The central limit theorem for strongly multiplicative systems of functions, Mat. Zametki 6, 443–450; English transl. Math. Notes 6, 720–724.
Gaposhkin, V. F. [1977]: Criteria for the strong law of large numbers for some classes of second-order stationary processes and homogeneous random fields, Theor. Probab. Appl. 22 (1977), 286–310.
Gaposhkin, V. F. [1979]: Strong consistency of estimates of the trend of a time series, Math. Notes 26, 812–818.
Gaposhkin, V. F. [1981]: The local ergodic theorem for groups of unitary operators and second order stationary processes, Math. USSR Sb. 39, 227–242.
Gaposhkin, V. F. [1995]: Summability of stationary sequences by Riesz methods, Math. Notes 57, 450–456.
Gaposhkin, V. F. [1996]: Spectral tests for existence of generalized ergodic transforms, Theor. Probab. Appl. 41, 251–271.
Gaposhkin, V. F. [2000]: On the integrability of a maximal square function in ergodic theory, Theor. Probab. Appl. 44, 386–394.
Gaposhkin, V. F. [2005]: Estimates of the entropy of the set of means for some classes of stationary and quasistationary sequences, Mat. Zametki 78 (1), 2005, 52–58; English transl. Math. Notes 78, 47–52.


Garsia, A. [1970]: Topics in almost everywhere convergence, Lectures in Adv. Math. 4, Markham Publishing Company, Chicago, IL.
Garsia, A. [1973]: Martingale inequalities, Math. Lecture Note Series, W. A. Benjamin, Reading, MA.
Garsia, A., Rodemich, E., and Rumsey Jr., H. [1970]: A real variable lemma and the continuity of paths of some Gaussian processes, Indiana Univ. Math. J. 20, 565–578.
Gelbaum, B. R. [1985]: Some problems in probability theory, Pacific J. Math. 118, 398–391.
Gelfond, A., and Linnik, Y. [1965]: Méthodes élémentaires dans la théorie analytique des nombres, Monographies Internationales de Math. Modernes 6, Gauthier-Villars, Paris.
Geman, D., and Horowitz, J. [1980]: Occupation densities, Ann. Probab. 8, 1–67.
Ghykman, I., and Skorhokhod, A. [1973]: Introduction à la théorie des processus aléatoires, Ed. Mir, Moscow.
Gillis, J. [1936]: Note on a property of measurable sets, J. London Math. Soc. 11, 139–141.
Ginsberg, J., Newman, D., and Neuwirth, J. [1970]: Approximation by {f(kx)}, J. Funct. Anal. 5, 194–203.
Giuliano-Antonini, R., and Weber, M. [2004]: The intersective ASCLT, Stochastic Anal. Appl. 22, 1009–1025.
Giuliano-Antonini, R., and Weber, M. [2005]: Counting occurrences in almost sure limit theorems, Colloq. Math. 102, 271–290.
Gnedenko, B. V. [1948]: On a theorem of S. N. Bernstein, Izv. Akad. Nauk SSSR. Ser. Mat. 12, 97–100 (in Russian).
Gordin, M., and Lifšic, B. A. [1978]: The central limit theorem for stationary Markov processes, Soviet Math. Dokl. 19, 392–394.
Gordin, M., and Weber, M. [2002]: On the almost sure central limit theorem for a class of Z^d-actions, J. Theoret. Probab. 15, 477–501.
Gordin, M., and Weber, M. [2006]: A borderline random Fourier series for the sampled convergence in variation, J. Math. Anal. Appl. 318, 526–551.
Gosselin, R., and Neuwirth, J. [1968]: On Paley–Wiener bases, J. Math. Mech. 18, 871–879.
Graversen, S. E., Peškir, G., and Weber, M. [1995]: The continuity principle in exponential type Orlicz spaces, Nagoya Math. J. 137, 55–75.
Green, B. [1999]: The large sieve for absolute beginners, http://www.dp.mms.cam.ac.uk/~bjg23/expos.html.
Gross, L. [1967]: Abstract Wiener spaces, in Contributions to Probability Theory, Part 1, Proc. 5th Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. II, University of California Press, Berkeley, CA, 31–42.
Grosswald, E. [1985]: Representations of integers as sums of squares, Springer-Verlag, New York.
Grytczuk, A. [2007]: Upper bound for sum of divisors function and the Riemann hypothesis, Tsukuba J. Math. 31, 67–75.
Guillotin-Plantard, N. [2001]: Sur la convergence faible des systèmes dynamiques échantillonnés, Ann. Inst. Fourier 54 (2004), 211–233.
Gut, A. [1992]: Complete convergence for arrays, Period. Math. Hungar. 25, 51–75.


Haas, A. [2005]: An ergodic sum related to the approximation by continued fractions, New York J. Math. 11, 345–349.
Haber, S., and Osgood, C. F. [1969]: On the sum ∑ ⟨nα⟩^{−t} and numerical integration, Pacific J. Math. 31, 383–394.
Hahn, H. [1914]: Über Annäherung an Lebesgue Integrale durch Riemannsche Summen, Wiener Sitzungsberichte 123, 713–743.
Halász, G. [1973]: On a result of Salem and Zygmund concerning random polynomials, Studia Sci. Math. Hungar. 8, 369–377.
Halász, G. [1983]: On random multiplicative functions, in Colloque Hubert Delange, Pub. Math. Orsay 83-4, Université Paris XI, Orsay, 74–96.
Halberstam, H. [1955]: Über additive zahlentheoretische Funktionen, J. Reine Angew. Math. 195, 210–214.
Halberstam, H., and Richert, H. E. [1974]: Sieve methods, London Math. Soc. Monogr. 4, Academic Press, New York, London.
Hall, P., and Heyde, C. C. [1980]: Martingale limit theory and its application, Probability and Mathematical Statistics, Academic Press, New York, London.
Halmos, P. R. [1956]: Lectures on ergodic theory, Kenkysha Printing Co., Tokyo.
Hardy, G. H. [1963]: Divergent series, Oxford at the Clarendon Press, Oxford.
Hardy, G. H., and Littlewood, J. E. [1930]: A maximal theorem with function-theoretic applications, Acta Math. 54, 81–116.
Hardy, G. H., Littlewood, J. E., and Pólya, G. [1934]: Inequalities, Cambridge University Press, Cambridge.
Hardy, G. H., and Riesz, M. [1915]: The general theory of Dirichlet’s series, Cambridge Tracts in Math. and Math. Phys. 18, Cambridge University Press, Cambridge.
Hardy, G. H., and Wright, E. M. [1979]: An introduction to the theory of numbers, fifth edition, Oxford at the Clarendon Press, Oxford.
Harman, G. [1998]: Metric number theory, London Math. Soc. Monogr. (N.S.) 18, The Clarendon Press, Oxford University Press, New York.
Hartman, P. [1939]: On Dirichlet series involving random coefficients, Amer. J. Math. 61, 955–964.
Hartman, P. [1947]: On the ergodic theorems, Amer. J. Math. 69, 193–199.
Hartman, P., and Wintner, A. [1947]: On Möbius’ inversion, Amer. J. Math. 69, 853–858.
Hausdorff, F. [1923]: Moment Probleme für ein endliches Intervall, Math. Z. 16, 220–248.
Hayman, W. K. [1967]: Research problems in function theory, The Athlone Press, University of London, London.
Hedenmalm, H., Lindqvist, P., and Seip, K. [1997]: A Hilbert space of Dirichlet series and systems of dilated functions in L2([0, 1]), Duke Math. J. 86, 1–37.
Hedenmalm, H., Lindqvist, P., and Seip, K. [1999]: Addendum to “A Hilbert space of Dirichlet series and systems of dilated functions in L2([0, 1])”, Duke Math. J. 99, 175–178.


Hedenmalm, H., and Saksman, E. [2003]: Carleson’s convergence theorem for Dirichlet series, Pacific J. Math. 208, 85–109.
Hegyvári, N. [1996]: On representation problems in the additive number theory, Acta Math. Hungar. 72, 35–41.
Hellinger, E., and Toeplitz, O. [1910]: Grundlagen für eine Theorie der unendlichen Matrizen, Math. Ann. 69, 289–330.
Helson, H. [1967]: Foundations of the theory of Dirichlet series, Acta Math. 118, 61–77.
Hemashina, R. [1983]: Ph.D. dissertation, SUNY/Buffalo, N.Y.
Holewijn, P. J. [1973]: On the uniform distribution of random variables, Z. Wahrscheinlichkeitstheor. Verw. Geb. 26, 33–41.
Hopf, E. [1937]: Ergodentheorie, Ergeb. Math. Grenzgeb. 5, Springer-Verlag, Berlin.
Hopf, E. [1954]: The general temporally discrete Markoff process, J. Rational Mech. Anal. 3, 13–45.
Hopf, E. [1960]: On the ergodic theorem for positive linear operators, J. Reine Angew. Math. 205, 101–106.
Hsu, P. L., and Robbins, H. [1947]: Complete convergence and the law of large numbers, Proc. Natl. Acad. Sci. USA 33, 25–31.
Hu, T. C., Móricz, F., and Taylor, R. [1989]: Strong laws of large numbers for arrays of rowwise independent random variables, Acta Math. Acad. Sci. Hungar. 54, 153–162.
Hu, T. C., Rosalsky, A., Szynal, D., and Volodin, A. [2001]: On complete convergence for arrays of rowwise independent random elements in Banach spaces, Stochastic Anal. Appl. 17, 963–992.
Hua, L. K. [1982]: Introduction to number theory, Springer-Verlag, Berlin, New York.
Hunt, R. [1968]: On the convergence of Fourier series, in Orthogonal expansions and their continuous analogues (Proc. Conf., Edwardsville, Ill., 1967), Southern Illinois University Press, Carbondale, IL., 235–255.
Hurewicz, W. [1944]: Ergodic theorems without invariant measures, Ann. of Math. 45, 192–206.
Huxley, M. N. [1972]: The distribution of prime numbers. Large sieves and zero-density theorems, Oxford Math. Monogr., Clarendon Press, Oxford.
Huxley, M. N. [2005]: Exponential sums and the Riemann zeta function. V, Proc. London Math. Soc. (3) 90, 1–41.
Ibragimov, I. A. [1960]: On the asymptotic distribution of values of certain sums, Vestnik Leningrad Univ. 15, 55–69 (in Russian).
Ibragimov, I. A. [1962]: Some limit theorems for stationary processes, Theory Probab. Appl. 7, 349–382.
Ibragimov, I. A., and Linnik, Y. V. [1971]: Independent and stationary sequences of random variables, Wolters-Noordhoff Publishing, Groningen.
Irmisch, R. [1980]: Punktweise Ergodensätze für (c, α)-Verfahren, 0 < α < 1, Doctoral Dissertation, Darmstadt.
Ito, K., and Nisio, M. [1968]: On the oscillation functions of Gaussian processes, Math. Scand. 22, 209–223.
Jajte, R. [1987]: On the existence of the ergodic Hilbert transform, Ann. Probab. 15, 831–835.


Jakubowski, A. [1989]: The functional law of the iterated logarithm for weakly multiplicative systems, Demonstratio Math. 22, 861–867.
Jamison, B., Orey, S., and Pruitt, W. [1965]: Convergence of weighted averages of independent random variables, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 40–44.
Jessen, B. [1934]: On the approximation of Lebesgue integrals by Riemann sums, Ann. of Math. 35, 248–251.
Jones, F., and Nirenberg, L. [1961]: On functions of bounded mean oscillation, Comm. Pure Appl. Math. 14, 785–799.
Jones, K. L. [1971]: A mean ergodic theorem for weakly mixing operators, Adv. Math. 7, 211–216.
Jones, R., and Wierdl, M. [1994]: Convergence and divergence of ergodic averages, Ergodic Theory Dynam. Systems 14, 515–535.
Jones, R., Ostrovskii, I., and Rosenblatt, J. [1996]: Square functions in ergodic theory, Ergodic Theory Dynam. Systems 16, 267–305.
Jones, R., Kaufman, R., Rosenblatt, J., and Wierdl, M. [1998]: Oscillation in ergodic theory, Ergodic Theory Dynam. Systems 18, 889–935.
Jones, R., Lacey, M., and Wierdl, M. [1999]: Integer sequences with big gaps and the pointwise ergodic theorem, Ergodic Theory Dynam. Systems 19, 1295–1308.
Jones, R., Rosenblatt, J., and Wierdl, M. [2003]: Oscillation in ergodic theory: higher dimensional results, Israel J. Math. 135, 1–27.
Kac, M. [1939]: On a characterization of the normal distribution, Amer. J. Math. 61, 726–728.
Kac, M. [1943]: Convergence of certain gap series, Ann. of Math. 44, 411–415.
Kac, M. [1946]: On the distribution of values of sums of the type ∑ f(2^k t), Ann. of Math. 47, 33–49.
Kac, M. [1949]: Probability methods in some problems of analysis and number theory, Bull. Amer. Math. Soc. 55, 641–665.
Kac, M. [1955]: A remark on the preceding paper by A. Rényi, Acad. Serbe Sci. Publ. Inst. Math. 8, 163–165.
Kac, M., Salem, R., and Zygmund, A. [1948]: A gap theorem, Trans. Amer. Math. Soc. 63, 235–243.
Kaczmarz, S. [1933]: On some classes of Fourier series, J. London Math. Soc. 8, 39–46.
Kahane, J. P. [1968]: Some random series of functions, first edition, D. C. Heath and Company, Lexington, MA; second edition (1985), Cambridge University Press, Cambridge.
Kahane, C. S. [1993]: Evaluating Lebesgue integrals as limits of Riemann sums, Math. Japon. 38, 1073–1076.
Kakutani, S. [1943]: Induced measure preserving transformations, Proc. Imp. Akad. Tokyo 19, 635–641.
Kakutani, S., and Yoshida, K. [1939]: Birkhoff’s ergodic theorem and the maximal ergodic theorem, Proc. Imp. Akad. Tokyo 15, 165–168.
Kamae, T., and Mendes France, M. [1978]: Van der Corput’s difference theorem, Israel J. Math. 31, 335–342.


Kanemitsu, S., and Yoshimoto, M. [1996]: Farey series and the Riemann hypothesis, Acta Arith. 75, 351–374.
Kashin, B. S., and Saakyan, A. A. [1989]: Orthogonal series, Transl. Math. Monogr. 75, Amer. Math. Soc., Providence, R.I.
Kato, Y. [1987]: Central limit theorem for Weyl automorphism, Rep. Statist. Appl. Res. Un. Japan. Sci. Engrs. 34 (3), 1–10.
Kawata, T. [1972]: Fourier analysis in probability theory, Academic Press, New York, London.
Kesten, H. [1964]: The discrepancy of random sequences {kx}, Acta Arith. 10, 183–213.
Kesten, H. [1965]: Sums of stationary sequences cannot grow slower than linearly, Proc. Amer. Math. Soc. 49, 205–211.
Khintchin, A. [1923]: Ein Satz über Kettenbrüche mit arithmetischen Anwendungen, Math. Z. 18, 289–306.
Khoshnevisan, D. [2005]: Schoenberg’s theorem via the law of large numbers, http://www.math.utah.edu/davar.
King, E. P., and Lukacs, E. [1954]: A property of the normal distribution, Bull. Inst. Internat. Statist. 25, 389–394.
Klein, A., Landau, L. J., and Shucker, D. S. [1982]: Decoupling inequalities for stationary Gaussian processes, Ann. Probab. 10, 702–708.
Kočergin, A. V. [1976]: On the homology of functions over dynamical systems, Dokl. Akad. Nauk. SSSR 231, 795–798; English transl. Soviet Math. Dokl. 17, 1637–1641.
Koksma, J. F. [1951]: A diophantine approximation of summable functions, J. Indian Math. Soc. 15, 87–96.
Koksma, J. F., and Salem, R. [1950]: Uniform distribution and Lebesgue integration, Acta Sci. Math. Szeged. 12, 87–96.
Kolmogorov, A. N. [1985]: Selected works of A. N. Kolmogorov, I, edited by V. M. Tikhomirov, Math. Appl. (Soviet Ser.) 25, Kluwer Academic Publishers, Dordrecht.
Komatu, Y. [1955]: Elementary inequalities for Mills’ ratio, Rep. Statist. Appl. Res. Un. Jap. Sci. Engrs. 4, 69–70.
Konyagin, S. V., and Queffélec, H. [2001/2002]: The translation $\frac{1}{2}$ in the theory of Dirichlet series, Real Anal. Exchange 27, 155–176.

Kôno, N. [1980]: Sample path properties of stochastic processes, J. Math. Kyoto Univ. 20, 295–313.
Knopp, K. [1931]: Theorie und Anwendung der unendlichen Reihen, 8th edition, Springer-Verlag, Berlin.
Krakowiak, W. [1985]: The theorem of Darmois–Skitovič for Banach space valued random variables, Bull. Polish Acad. Sci. Math. 33, 77–83.
Krengel, U. [1971]: On the individual ergodic theorem for subsequences, Ann. Math. Statist. 42, 1091–1095.
Krengel, U. [1972]: Weakly wandering vectors and weakly independent partitions, Trans. Amer. Math. Soc. 164, 199–226.
Krengel, U. [1985]: Ergodic theorems, de Gruyter Stud. Math. 6, Walter de Gruyter, Berlin.
Kristensen, S. [2002]: On two forms of approximation, Thesis, Publication IRMA, Strasbourg.

Kubilius, J. [1964]: Probabilistic methods in the theory of numbers, Transl. Math. Monogr. 11, Amer. Math. Soc., Providence, R.I.
Kuczmaszewska, A., and Szynal, D. [1988]: On the Hsu and Robbins law of large numbers for subsequences, Bull. Polish Acad. Sci. Math. 36, 69–79.
Kuczmaszewska, A., and Szynal, D. [1991]: On complete convergence for partial sums of independent identically distributed random variables, Probab. Math. Statist. 11, 223–235.
Kuczmaszewska, A., and Szynal, D. [1994]: On complete convergence in a Banach space, Internat. J. Math. Sci. 17, 1–14.
Kuipers, L., and Niederreiter, H. [1974]: Uniform distribution of sequences, Pure Appl. Math., Wiley-Interscience, New York.
Lacey, M. [1993]: On central limit theorems, modulus of continuity and Diophantine type of irrational rotations, J. Anal. Math. 61, 47–59.
Lacey, M., and Philipp, W. [1990]: A note on the almost sure central limit theorem, Statist. Probab. Lett. 9, 201–205.
Lacey, M., Petersen, K., Wierdl, M., and Rudolph, D. [1994]: Random ergodic theorems with universally representative sequences, Ann. Inst. H. Poincaré Probab. Statist. 30, 353–395.
Lagarias, J. C. [2002]: An elementary problem equivalent to the Riemann hypothesis, Amer. Math. Monthly 109, 534–543.
Lancaster, H. O. [1960]: The characterisation of the normal distribution, J. Austral. Math. Soc. 1, 368–383.
Landau, E. [1924]: Bemerkungen zu der vorstehenden Abhandlung von Herrn Franel, Göttinger Nachrichten, 202–206.
Landau, E. [1927]: Vorlesungen über Zahlentheorie, I–II, S. Hirzel, Leipzig.
Landau, E. [1953]: Handbuch der Lehre von der Verteilung der Primzahlen, Chelsea, New York.
Landau, H. J., and Shepp, L. A. [1970]: On the supremum of a Gaussian process, Sankhya Ser. A 32, 369–378.
Landers, D., and Rogge, L. [1978]: An ergodic theorem for Fréchet valued random variables, Proc. Amer. Math. Soc. 72, 49–53.
Lebed, G. K. [1967]: Trigonometric series with coefficients satisfying certain conditions, Math. USSR Sb. 3, 91–108.
Lesigne, E. [1995]: On the sequence of integer parts of a good sequence for the ergodic theorem, Comment. Math. Univ. Carolin. 36, 737–743.
Ledoux, M., and Talagrand, M. [1991]: Probability in Banach spaces, Ergeb. Math. Grenzgeb. 23, Springer-Verlag, Berlin.
Lemańczyk, M., Lesigne, E., Parreau, F., Volný, D., and Wierdl, M. [2002]: Random ergodic theorems and real cocycles, Israel J. Math. 130, 285–321.
Lesigne, E. [1992]: Ergodic theorem along a return time sequence, in Ergodic theory and related topics, III (Güstrow, 1990), Lecture Notes in Math. 1514, Springer-Verlag, Berlin, 146–152.
Lesigne, E. [1993]: Spectre quasi-discret et théorème ergodique de Wiener-Wintner pour les polynômes, Ergodic Theory Dynam. Systems 13, 767–784.
Lesigne, E., and Mauduit, C. [1996]: Propriétés ergodiques des suites q-multiplicatives, Compositio Math. 100, 131–169.

Li, D., Rao, M. B., and Wang, X. [1992]: Complete convergence of moving average processes, Statist. Probab. Lett. 14, 111–114.
Lifshits, M. [1979]: Local times for functions and Gaussian processes, Investigations in the theory of probability distributions, IV, Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 85, 104–112 (in Russian).
Lifshits, M. [1995]: Gaussian random functions, Kluwer Academic Publishers, Dordrecht.
Lifshits, M. [1997]: Private communication.
Lifshits, M., and Weber, M. [1996]: Régularisation spectrale en théorie ergodique et en probabilités, C. R. Acad. Sci. Paris Sér. I Math. 324, 99–103.
Lifshits, M., and Weber, M. [1998]: Oscillations of the Gaussian Stein’s elements, in High dimensional probability (Oberwolfach, 1996), Progr. Probab. 43, Birkhäuser, Basel, 249–261.
Lifshits, M., and Weber, M. [2000]: Spectral regularization inequalities, Math. Scand. 86, 75–99.
Lifshits, M., and Weber, M. [2001]: Tightness of stochastic families arising from randomization procedure, in Asymptotic methods in probability and statistics with applications (St. Petersburg, 1998), Stat. Ind. Technol., Birkhäuser, Boston, 143–158.
Lifshits, M., and Weber, M. [2003]: Régularisation spectrale et propriétés métriques des moyennes mobiles, J. Anal. Math. 89, 1–14.
Lifshits, M., and Weber, M. [2007]: On the supremum of random Dirichlet polynomials, Studia Math. 182, 41–65.
Lifshits, M., and Weber, M. [2009a]: Sampling the Lindelöf hypothesis with the Cauchy random walk, Proc. Lond. Math. Soc. (3) 98 (2009), 241–270.
Lifshits, M., and Weber, M. [2009b]: On the supremum of some random Dirichlet polynomials, Acta Math. Hungarica 123, 41–64.
Lin, M., and Sine, R. [1983]: Ergodic theory and the functional equation (I − T )x = y, J. Operator Theory 10, 153–166.
Lin, M., Olsen, J., and Tempelman, A. [1999]: On modulated ergodic theorems for Dunford–Schwartz operators, Illinois J. Math. 43, 542–567.
Lin, M., and Weber, M. [2007]: Weighted ergodic theorems and strong laws of large numbers, Ergodic Theory Dynam. Systems 27, 511–543.
Lin, M., and Weber, M. [2009]: Weighted laws of the iterated logarithm for sums of iid random variables, Contemp. Math. 485, 115–129.
Loève, M. [1963]: Probability theory, Van Nostrand, Princeton, N.J.
Lorentz, G. G. [1960]: Remark on a paper by Visser, J. London Math. Soc. 35, 205–208.
Lukacs, E. [1970]: Characteristic Functions, second edition, Hafner Publishing Co., New York.
Lukacs, E., and Laha, R. G. [1964]: Applications of characteristic functions, Griffin’s Statistical Monographs & Courses 14, Hafner Publishing Co., New York.
Maruyama, G. [1949]: The harmonic analysis of stationary stochastic processes, Mem. Fac. Sci. Kyushu Univ. A 4, 45–46.
Maruyama, G. [1970]: Infinitely divisible processes, Teor. Verojatnost. i Primenen. 15, 3–23.
Marcinkiewicz, J., and Salem, R. [1940]: Sur les sommes riemanniennes, Compositio Math. 7, 376–389.

Marcinkiewicz, J., and Zygmund, A. [1937]: Mean values of trigonometrical polynomials, Fund. Math. 28, 131–166.
Marcus, B., and Petersen, K. [1979]: Balancing ergodic averages, in Ergodic Theory, Lecture Notes in Math. 729, Springer-Verlag, Berlin, 126–143.
Marcus, M., and Pisier, G. [1984]: Characterizations of almost surely p-stable random Fourier series and strongly continuous stationary processes, Acta Math. 152, 245–301.
Marlow, N. A. [1973]: High level occupation times for continuous Gaussian processes, Ann. Probab. 1, 388–397.
Marstrand, J. M. [1970]: On Khinchin’s conjecture about strong uniform distribution, Proc. London Math. Soc. 21, 540–556.
McLeish, D. L. [1974]: Dependent central limit theorems and invariance principles, Ann. Probab. 2, 744–771.
Menshov, D. E. [1923]: Sur les séries de fonctions orthogonales, Fundamenta Math. 4, 82–105.
Mijnheer, J. L. [1975]: Sample path properties of stable processes, Doctoral dissertation, University of Leiden, Mathematical Centre Tracts 59, Mathematisch Centrum Amsterdam.
Mikolás, M. [1949a]: Farey series and their connection with the prime problem (I), Acta Sci. Math. Szeged 13, 93–117.
Mikolás, M. [1949b]: Sur l’hypothèse de Riemann, C. R. Acad. Sci. Paris 228, 633–636.
Mikolás, M. [1951]: Farey series and their connection with the prime problem (II), Acta Sci. Math. Szeged 14, 5–21.
Mitrinović, D. S. [1970]: Analytic inequalities, Grundlehren Math. Wiss. 165, Springer-Verlag, Berlin.
Montgomery, H. [1993]: Ten lectures on the interface between analytic number theory and harmonic analysis, CBMS Regional Conf. Ser. in Math. 84, Amer. Math. Soc., Providence, R.I.
Móricz, F. [1976a]: Moment inequalities and the strong law of large numbers, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 35, 299–314.
Móricz, F. [1976b]: The law of the iterated logarithm and related results for weakly multiplicative systems, Anal. Math. 2, 211–229.
Móricz, F., and Révész, P. [1980]: Multiplicative systems, Mat. Lapok 28, 43–63 (in Hungarian).
Móricz, F., Serfling, R. J., and Stout, W. F. [1982]: Moment and probability bounds with quasi-superadditive structure for the maximum partial sum, Ann. Probab. 10, 1032–1040.
Móricz, F., and Tandori, K. [1994]: Almost everywhere convergence of orthogonal series revisited, J. Math. Anal. Appl. 182, 637–653.
Mühlbach, R. [1962]: Filter und andere moderne Begriffsbildungen in der Analysis, Mathematische Gesellschaft in Hamburg, Schriftenreihe zur Schulmathematik 8, Ernst Klett Verlag, Stuttgart.
Nadkarni, M. G. [1999]: Spectral theory of dynamical systems, Birkhäuser Adv. Texts, Birkhäuser, Basel.
Nagel, A., and Stein, E. M. [1984]: On certain maximal functions and approach regions, Adv. Math. 54, 83–106.
Nair, R. [1993]: On polynomials in primes and J. Bourgain’s circle method approach to ergodic theory, Ergodic Theory Dynam. Systems 11, 485–499.

Nair, R. [1993]: On polynomials in primes and J. Bourgain’s circle method approach to ergodic theory. II, Studia Math. 105, 207–233.
Nair, R. [1995]: On Riemann sums and Lebesgue integrals, Monatsh. Math. 120, 49–54.
Nair, R. [1998]: On uniform distributed sequences of integers and Poincaré recurrence, Indag. Math. (N.S.) 9, 55–63.
Nair, R., and Weber, M. [1999]: On variation functions for subsequence ergodic averages, Monatsh. Math. 128, 131–150.
Nair, R., and Weber, M. [2004]: Intersectivity properties of randomly perturbed sequences, Indag. Math. 15, 373–381.
Nair, R., and Zaris, P. [2001]: On certain sets of integers and intersectivity, Math. Proc. Camb. Philos. Soc. 131, 157–164.
Neveu, J. [1965]: Une démonstration élémentaire du théorème de récurrence, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 64–68.
Newman, D. J., and Shisha, O. [1991]: Magnitude of Fourier coefficients and degree of approximation by Riemann sums, Numer. Funct. Anal. Optim. 12, 545–550.
Nikišin, E. M. [1970a]: Resonance theorems and superlinear operators, Uspehi Math. Nauk. 25 (6), 129–191 (in Russian).
Nikišin, E. M. [1970b]: On convergence systems, Math. Sb. 81, 23–38 (in Russian).
Nikišin, E. M. [1971]: Resonance theorems and functions series, Math. Zametki 10 (5), 583–595 (in Russian).
Obloj, J. [2004]: The Skorokhod embedding problem and its offspring, Probab. Surv. 1, 321–382.
Olevskii, A. M. [1975]: Fourier series with respect to general orthogonal systems, Ergeb. Math. Grenzgeb. 86, Springer-Verlag, Berlin, Heidelberg, New York.
Pannikov, B. V. [1970]: The convergence of Riemann sums of functions that can be represented by trigonometric series with monotone coefficients, Mat. Zametki 8, 607–618 (in Russian).
Pannikov, B. V. [1971]: The convergence of Riemann sums of divergent functions, Dokl. Akad. Nauk SSSR 201, 542–544; English transl. Soviet Math. Dokl. 12, 1687–1690.
Pannikov, B. V. [1988]: An example of the divergence of Riemann sums, Sibirsk. Math. Zh. 29, 208–210; English transl. Siberian Math. J. 29, 503–506.
Pannikov, B. V. [1993]: On a Weyl multiplier for the convergence of Riemann sums, Anal. Math. 19, 243–253 (in Russian).
Paszkiewicz, A. [2005a]: On a unitary operator with pathological ergodic means, Acta Sci. Math. Szeged 71, 741–749.
Paszkiewicz, A. [2005b]: A new proof of the Rademacher–Menshov Theorem, Acta Sci. Math. Szeged 71, 631–642.
Paszkiewicz, A. [2005c]: On complete characterization of coefficients of a a.e. converging orthogonal series, preprint available at arXiv:math.AP/0507568 v1.
Paul, E. M. [1962]: Density in the light of probability theory, Sankhyá 24 Series A, Part 2, 103–114.
Perez, Y. [1988]: A combinatorial application of the maximal ergodic theorem, Bull. Lond. Math. Soc. 20, 248–252.

Peškir, G., and Weber, M. [1993]: Necessary and sufficient conditions for the uniform law of large numbers in the stationary case, in Functional analysis IV (Dubrovnik, 1993), Various Publ. Ser. (Aarhus) 43, Aarhus University, Aarhus, 165–190.
Peškir, G., and Weber, M. [1996]: The uniform ergodic theorem, in Convergence in ergodic theory and probability (Columbus, Ohio, 1993), Ohio State Univ. Math. Res. Inst. Publ. 5, Walter de Gruyter, Berlin, 305–332.
Peškir, G., Weber, M., and Schneider, D. [1996]: Randomly weighted series of contractions in Hilbert spaces, Math. Scand. 79, 263–282.
Petersen, K. [1983]: Ergodic theory, Cambridge Stud. Adv. Math. 2, Cambridge University Press, Cambridge.
Petersen, K. [1983]: Another proof of the existence of the ergodic Hilbert transform, Proc. Amer. Math. Soc. 88, 39–43.
Petersen, K., and Salama, I. [1995]: Ergodic theory and its connections with harmonic analysis, in Proceedings of the 1993 Alexandria Conference, London Math. Soc. Lecture Note Ser. 205, Cambridge University Press, Cambridge.
Petit, B. [1992]: Le théorème limite central pour des sommes de Riesz–Raikov, Probab. Theory Related Fields 93, 407–438.
Petrov, V. [1975a]: Sums of independent random variables, Ergeb. Math. Grenzgeb. 82, Springer-Verlag, Berlin, Heidelberg, New York.
Petrov, V. V. [1975b]: An inequality for the moments of a random variable, Teor. Verojatnost. i Primenen. 20, 402–403.
Petrovič, A. J. [1974]: The convergence of subsequences of Riemann sums, Mat. Zametki 16, 645–656; English transl. Math. Notes 16, 975–982.
Petrovič, A. J. [1974]: Riemann sums for functions that are expanded in trigonometric series, Izv. Vysš. Učebn. Zaved Matematika 12 (151), 70–72; English transl. Soviet Math. (Iz. VUZ) 18 (12), 61–64.
Petrovič, A. J. [1975]: Properties of Riemann sums for functions that are representable by a trigonometric series with monotone coefficients, Mat. Sb. 97, 360–378 (in Russian).
Philipp, W. [1967]: Some metrical theorems in number theory, Pacific J. Math. 20, 109–127.
Philipp, W. [1970]: Some metrical theorems in number theory II, Duke Math. J. 37, 447–458.
Philipp, W. [1971]: Mixing sequences of random variables and probabilistic number theory, Mem. Amer. Math. Soc. 117, Amer. Math. Soc., Providence, R.I.
Philipp, W. [1975]: Limit theorems for lacunary series and uniform distribution modulo 1, Acta Arith. 26, 242–251.
Philipp, W., and Stout, W. F. [1975]: Almost sure invariance principles for partial sums of weakly dependent random variables, Mem. Amer. Math. Soc. (2) (1975), issue 2, no. 161.
Pickands, J. [1969]: Asymptotic properties of the maximum in a stationary Gaussian process, Trans. Amer. Math. Soc. 145, 75–86.
Pisier, G. [1980]: Conditions d’entropie assurant la continuité de certains processus et applications à l’analyse harmonique, Séminaire d’analyse fonctionnelle, École polytechnique, exposés 13–14.
Pisier, G. [1983]: Some applications of the metric entropy condition to harmonic analysis in Banach spaces, in Harmonic analysis and probability (University of Connecticut 1980–81), Lecture Notes in Math. 995, Springer-Verlag, Berlin, 123–154.

Pitt, L. D. [1978]: Local times for Gaussian vector fields, Indiana Univ. Math. J. 27, 309–330.
Pollak, H. O. [1956]: A remark on “Elementary estimates for Mills’ ratio” by Y. Komatu, Rep. Statist. Appl. Res. Un. Jap. Sci. Engrs. 4, 110.
Pólya, G., and Szegö, G. [1954]: Aufgaben und Lehrsätze aus der Analysis, Band I: Reihen, Integralrechnung, Funktionentheorie, Grundlehren Math. Wiss. 19, Springer-Verlag, Berlin.
Preston, C. [1971]: Banach spaces arising from some integral inequalities, Indiana Univ. Math. J. 20, 997–1015.
Prohorov, Y. V. [1956]: Convergence of random processes and limit theorems in probability, Teor. Veroyatnost. i Primenen. 1 (1956), 177–238.
Pruitt, W. [1966]: Summability of independent random variables, J. Math. Mech. 15, 769–776.
Quine, M. P., and Seneta, E. [1999]: The generalization of the Kac–Bernstein theorem, Probab. Math. Statist. 19, 441–452.
Queffélec, H. [1980]: Propriétés presque sûres et quasi-sûres des séries de Dirichlet et des produits d’Euler, Canad. J. Math. 32, 531–558.
Queffélec, H. [1984]: Sur une estimation probabiliste liée à l’inégalité de Bohr, in Harmonic analysis: study group on translation-invariant Banach spaces, Exp. No. 6, Publ. Math. Orsay, 84-1, Université Paris XI, Orsay.
Queffélec, H. [1995]: H. Bohr’s vision of ordinary Dirichlet series; old and new results, J. Anal. 3, 43–60.
Rao, M. B., Wang, X., and Yang, X. [1993]: Convergence rates on strong laws of large numbers for arrays of rowwise independent elements, Stochastic Anal. Appl. 11, 115–132.
Rademacher, H. [1922]: Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen, Math. Ann. 87, 112–138.
Ramanujan, S. [1915]: Some formulae in the analytic theory of numbers, Messenger of Math. 45, 81–84; see also Collected papers of Srinivasa Ramanujan, AMS Chelsea Publishing, Providence, R.I., 2000, 133–135.
Rényi, A. [1955]: On the density of certain sequences of integers, Acad. Serbe Sci. Publ. Inst. Math. 8, 157–162.
Rényi, A. [1957]: Representations for real numbers and their ergodic properties, Acta Math. Acad. Hungar. 8, 477–493.
Rényi, A. [1958]: On mixing sequences of sets, Acta Math. Sci. Hungar. 9, 215–228.
Rényi, A. [1970]: Foundations of probability, Holden-Day Series in Probability and Statistics, Holden-Day, Inc., San Francisco, CA.
Révész, P. [1974]: A new law of the iterated logarithm for multiplicative systems and its applications, Acta Math. Acad. Sci. Hungar. 25, 425–433.
Révész, S. G., and Ruzsa, I. Z. [1991]: On approximating Lebesgue integrals by Riemann sums, Glasgow Math. J. 33, 129–134.
Rieders, E. [1993]: The size of the averages of strongly mixing random variables, Statist. Probab. Lett. 18, 57–64.
Riesz, F. [1932]: Sur un théorème de maximum de MM. Hardy et Littlewood, J. London Math. Soc. 7, 10–13.
Riesz, F. [1938]: Some mean ergodic theorems, J. London Math. Soc. 13, 274–278.

Riesz, F. [1942]: Sur quelques problèmes de la théorie ergodique, Mat. Fiz. Lapok 49, 34–62.
Riesz, F. [1945]: Sur la théorie ergodique, Comment. Math. Helv. 17, 221–239.
Riemann, B. [1892]: Gesammelte mathematische Werke und wissenschaftlicher Nachlass, B. G. Teubner, Leipzig.
Riordan, J. [1958]: An introduction to combinatorial analysis, Wiley Publications in Mathematical Statistics, John Wiley & Sons, Inc., New York.
Rohatgi, V. K. [1971]: Convergence of weighted sums of independent random variables, Proc. Cambridge Philos. Soc. 69, 305–307.
Robbins, H. [1973]: On the equidistribution of sums of independent random variables, Proc. Amer. Math. Soc. 4, 786–799.
Robin, G. [1983]: Sur l’ordre maximum de la fonction somme des diviseurs, in Seminar on number theory (Paris, 1981/1982), Progr. Math. 38, Birkhäuser, Boston, 233–244.
Rochlin, V. A. [1949]: On the fundamental ideas of measure theory, Mat. Sbornik (N.S.) 25, 107–150; English transl. Amer. Math. Soc. Transl. Ser. 1952 (1952), no. 71.
Rochlin, V. A. [1961]: Exact endomorphisms of a Lebesgue space, Izv. Akad. Nauk SSSR Ser. Mat. 25, 499–530.
Rodin, V. A. [1992]: The BMO-property of the partial sums of a Fourier series, Soviet Math. Dokl. 44, 294–296.
Rosenblatt, J. [1988]: Almost everywhere convergence of series, Math. Ann. 280, 565–577.
Rosenblatt, J. [1991]: Universally bad sequences in ergodic theory, in Almost everywhere convergence II (Evanston, IL, 1989), Academic Press, Boston, 227–246.
Rosenblatt, J. [1997]: When the integral is the limit, J. Math. Anal. Appl. 205, 560–567.
Rosenblatt, J., and Wierdl, M. [1992]: A new maximal inequality and its applications, Ergodic Theory Dynam. Systems 12, 509–558.
Rosenblatt, J., and Wierdl, M. [1995]: Pointwise ergodic theorems via harmonic analysis, in Ergodic theory and its connections with harmonic analysis, London Math. Soc. Lecture Note Ser. 205, Cambridge University Press, Cambridge, 3–151.
Rosenblatt, M. [1956]: A central limit theorem and a strong mixing condition, Proc. Nat. Acad. Sci. U.S.A. 42, 43–47.
Rosenthal, H. P. [1970]: On the subspaces of $L^p$ ($p > 2$) spanned by sequences of independent random variables, Israel J. Math. 8, 273–303.
Ross, K., and Stromberg, K. [1967]: Jessen’s theorem on Riemann sums for locally compact groups, Pacific J. Math. 20, 135–147.
Ross, K., and Willis, G. [1997]: Riemann sums and modular functions on locally compact groups, Pacific J. Math. 180, 325–331.
Rota, G. C. [1962]: On “Alternierende Verfahren” for general positive operators, Bull. Amer. Math. Soc. 58, 85–102.
Ruch, J. J. [1997]: Contribution à l’étude des sommes de Riemann, Thesis, Publication IRMA, Strasbourg.
Ruch, J. J. [1998a]: Étude de moyennes pondérées de sommes de Riemann, Expo. Math. 16, 277–285.
Ruch, J. J. [1998b]: Convergence presque sûre de moyennes de sommes de Riemann, Acta Arith. 87, 1–12.

Ruch, J. J., and Weber, M. [1997]: Quelques résultats dans la théorie des sommes de Riemann, Expo. Math. 15, 279–288.
Rudin, W. [1964]: An arithmetic property of Riemann sums, Proc. Amer. Math. Soc. 15, 321–324.
Ruzsa, I. Z. [1978]: On difference sets, Studia Sci. Math. Hungar. 13, 319–326.
Saks, S. [1937]: Theory of the integral, G. E. Stechert & Co, New York.
Salem, R. [1941]: A new proof of a theorem of Menchoff, Duke Math. J. 8, 269–272.
Salem, R. [1948]: Sur les sommes riemanniennes des fonctions sommables, Mat. Tidsskr. B 6, 60–62.
Salem, R., and Zygmund, A. [1947]: On lacunary trigonometric series, Proc. Nat. Acad. Sci. U.S.A. 33, 333–338.
Salem, R., and Zygmund, A. [1948]: On lacunary trigonometric series II, Proc. Nat. Acad. Sci. U.S.A. 34, 54–62.
Salem, R., and Zygmund, A. [1954]: Some properties of trigonometric series whose terms have random signs, Acta Math. 91, 245–301.
Sárközy, A. [1978]: On difference sets of sequences of integers I, Acta Math. Acad. Sci. Hungar. 31, 125–149.
Sawyer, S. [1966]: Maximal inequalities of weak type, Ann. of Math. 84, 157–174.
Sawyer, S. [1974]: The Skorokhod representation, Rocky Mountain J. Math. 4, 579–596.
Schatte, P. [1984]: On the asymptotic distribution of sums reduced modulo one, Math. Nachr. 115, 275–281.
Schatte, P. [1988]: On a law of the iterated logarithm for sums mod 1 with application to Benford’s law, Probab. Theory Related Fields 4, 167–178.
Schmidt, W. M. [1964]: Metrical theorems on fractional parts of sequences, Trans. Amer. Math. Soc. 110, 493–518.
Schneider, D. [1994]: Convergence presque sûre de moyennes ergodiques perturbées par des variables aléatoires, Thesis, Publication IRMA, Strasbourg.
Schneider, D. [1997]: Théorèmes ergodiques perturbés, Israel J. Math. 101, 157–178.
Schneider, D., and Weber, M. [1993]: Une remarque sur un théorème de Bourgain, in Séminaire de probabilités XXVII, Lecture Notes in Math. 1557, Springer-Verlag, 202–206.
Schneider, D., and Weber, M. [1996]: Weighted averages of contractions along subsequences, in Convergence in ergodic theory and probability (Columbus, Ohio, 1993), Ohio State Univ. Math. Res. Inst. Publ. 5, Walter de Gruyter, Berlin, 399–404.
Schoenberg, I. J. [1936]: On asymptotic distributions of arithmetical functions, Trans. Amer. Math. Soc. 39, 315–330.
Schoenberg, I. J. [1938]: Metric spaces and completely monotone functions, Ann. of Math. 39, 811–841.
Schoenberg, I. J. [1962]: On two theorems of P. Erdös and A. Rényi, Illinois J. Math. 6, 53–58.
Selvaraj, S. [1991]: A note on Riemann sums and improper integrals related to the prime number theorem, J. Approx. Theory 66, 106–108.
Shanks, D. [1978]: Solved and unsolved problems in number theory, second edition, Chelsea Publishing Co., New York.

Shapiro, H. N. [1956]: Distribution functions of additive arithmetic functions, Proc. Nat. Acad. Sci. U.S.A. 42, 426–430.
Shloma, L. I. [1991]: Harmonic components of a function and Riemann sums, Dokl. Akad. Nauk. BSSR 31, 13–16 (in Russian).
Siebert, H. [1976]: Montgomery’s weighted sieve for dimension two, Monatsh. Math. 82, 327–336.
Skitovič, M. [1953]: On a property of the normal distribution, Doklady Akad. Nauk SSSR (N.S.) 89, 217–219 (in Russian).
Sonis, M. G. [1966]: Certain measurable subspaces of the space of all sequences with a Gaussian measure, Uspehi Mat. Nauk. 21, 277–279 (in Russian).
Spitzer, F. L. [1976]: Principles of random walks, second edition, Grad. Texts Math. 34, Springer-Verlag, New York.
Sprindžuk, V. G. [1979]: Metric theory of diophantine approximations, Scripta Series in Mathematics, V. H. Winston & Sons, Washington, D.C.; A Halsted Press Book, John Wiley & Sons, New York, Toronto, Ont., London.
Stein, E. M. [1961]: On limits of sequences of operators, Ann. of Math. 74, 140–170.
Stein, E. M. [1970]: Topics in harmonic analysis related to the Littlewood–Paley theory, Princeton University Press, Princeton, N.J.
Stone, C. J. [1967]: On local and ratio limit theorems, in Contributions to probability theory, Part 2, Proc. 5th Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. II, University of California Press, Berkeley, CA, 217–224.
Strassen, V. [1964]: An invariance principle for the law of the iterated logarithm, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 3, 211–226.
Stroock, D. [1993]: Probability theory, an analytic view, Cambridge University Press, Cambridge.
Sucheston, L. [1960]: Note on mixing sequences of events, Acta Math. Acad. Sci. Hungar. 11, 417–422.
Sucheston, L. [1963]: On mixing and the zero-one law, J. Math. Anal. Appl. 3, 447–456.
Sudakov, V. N., and Tsyrelson, B. S. [1974]: Extremal properties of half-spaces for spherically invariant measures, Zap. Nauch. Sem. Leningrad Otdel. Mat. Inst. Steklov (LOMI) 41, 14–41 (in Russian).
Sueiro, J. [1987]: A note on maximal operators of Hardy–Littlewood type, Math. Proc. Cambridge Philos. Soc. 102, 131–134.
Sun, D., Tian, F., and Yu, J. R. [1998]: Sur les séries aléatoires de Dirichlet, C. R. Acad. Sci. Paris Sér. I Math. 326, 427–431.
Sung, S. H. [1997]: Complete convergence for weighted sums of arrays of rowwise independent B-valued random variables, Stochastic Anal. Appl. 15, 255–267.
Suquet, Ch. [1998]: Tightness in Schauder decomposable Banach spaces, Proc. St. Petersburg Math. Soc. 5, 289–327; Amer. Math. Soc. Transl. Ser. 2, 193, Amer. Math. Soc., Providence, RI, 1999.
Szász, O. [1918]: Über harmonische Funktionen und L-Formen, Math. Z. 1, 149–162.
Takahashi, S. [1959]: A remark on the Riemann sum, Sci. Rep. Kanazawa Univ. 6, 57–59.

Talagrand, M. [1984]: Sur l’intégrabilité des vecteurs gaussiens, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 68, 1–8.
Talagrand, M. [1987]: Regularity of Gaussian processes, Acta Math. 159, 99–149.
Talagrand, M. [1990]: Sample boundedness of stochastic processes under increment conditions, Ann. Probab. 18, 1–49.
Talagrand, M. [1992]: A simple proof of the majorizing measure theorem, Geom. Funct. Anal. 2, 118–125.
Talagrand, M. [1993]: Regularity of infinitely divisible processes, Ann. Probab. 21, 362–432.
Talagrand, M. [1994]: Constructions of majorizing measures, Bernoulli processes and cotype, Geom. Funct. Anal. 4, 660–717.
Talagrand, M. [1996a]: Applying a theorem of Fernique, Ann. Inst. H. Poincaré 32, 779–799.
Talagrand, M. [1996b]: Convergence of orthogonal series using stochastic processes, http://www.math.ohio-state.edu/talagran/preprints.
Talagrand, M. [1996c]: Majorizing measures: the generic chaining, Ann. Probab. 24, 1049–1103.
Talagrand, M. [2001]: Majorizing measures without measures, Ann. Probab. 29, 411–417.
Talagrand, M. [2005]: The generic chaining, Springer Monogr. Math., Springer-Verlag, Berlin.
Talalyan, A. A. [1956]: On convergence of orthogonal series, Dokl. Akad. Nauk SSSR 110, 515–516.
Tandori, K. [1957]: Über die orthogonalen Funktionen. I., Acta Sci. Math. Szeged 18, 57–130.
Tandori, K. [1965]: Bemerkung zur Konvergenz der Orthogonal Reihen, Acta Sci. Math. Szeged 26, 249–251.
Tchudakov, N. [1936]: On zeros of Dirichlet’s L-functions, Rec. Math. Moscou (2) 1, 591–602.
Tenenbaum, G. [1990]: Introduction à la théorie analytique et probabiliste des nombres, Revue de l’Institut Elie Cartan 13, Département de Mathématiques de l’Université de Nancy I.
Tenenbaum, G. [1997]: Les nombres premiers, Que Sais-Je? 571, Presses Universitaires de France, Paris.
Thouvenot, J. P. [1989]: La convergence presque sûre des moyennes ergodiques suivant certaines sous-suites d’entiers (d’après Jean Bourgain), Astérisque 189–190 (1990), Exp. No. 719, 133–153.
Tijdeman, R. [1989]: Diophantine equations and diophantine approximations, in Number theory and applications (Banff, AB, 1988), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci. 265, Kluwer Academic Publishers, Dordrecht, 215–243.
Titchmarsh, E. C. [1951]: The theory of the Riemann zeta-function, Clarendon Press, Oxford.
Toeplitz, O. [1938]: Zur Theorie der Dirichlet Reihen, Amer. J. Math. 60, 880–888.
Tsuchikara, T. [1949]: Notes on Fourier analysis 13: A theorem on Riemann sums, J. Math. Soc. Japan 1, 232–234.
Tsuchikara, T. [1951]: Some remarks on the Riemann sums, Tôhoku Math. J. 3, 197–202.
Turán, P. [1960]: A theorem on diophantine approximation with application to Riemann zeta-function, Acta Math. Sci. Szeged 21, 311–318.
Ursell, H. D. [1937]: On the behavior of a certain sequence of functions derived from a given one, J. London Math. Soc. 12, 229–232.

Vinogradov, I. M. [1948]: On an estimate of trigonometric sums with prime numbers, Izv. Akad. Nauk SSSR Ser. Math. 12, 225–248 (in Russian).
Vinogradov, I. M. [1954]: The method of trigonometrical sums in the theory of numbers, Interscience Publishers, New York.
Visser, J. [1937]: On certain infinite sequences, Proc. Akad. Wetensch. Amsterdam 40, 358–367.
Volný, D. [1993]: Approximating martingales and the central limit theorem for strictly stationary processes, Stochastic Process. Appl. 44, 41–74.
Volný, D. [1999]: Invariance principle and Gaussian approximation for strictly stationary processes, Trans. Amer. Math. Soc. 8, 3351–3371.
Volný, D., and Weiss, B. [2004]: Coboundaries in $L_0^\infty$, Ann. Inst. H. Poincaré 10, 771–778.
Wakeling, J. [2004]: An exploration of continued fractions, http://www.cecm.sfu.ca/publications/organic/confrac/confrac.html.
Walters, P. [1982]: An introduction to ergodic theory, Grad. Texts in Math. 79, Springer-Verlag, New York.
Wall, D. D. [1949]: Normal numbers, PhD Thesis, University of California, Berkeley.
Wall, D. D. [1996]: Topological Wiener–Wintner ergodic theorems and a random $L^2$ ergodic theorem, Ergodic Theory Dynam. Systems 16, 179–206.
Weber, M. [1980]: Sur un théorème de Maruyama, in Séminaire de probabilités XIV, Lecture Notes in Math. 784, Springer-Verlag, 475–488.
Weber, M. [1981]: Analyse infinitésimale de fonctions aléatoires non gaussiennes, in Ecole d’été de probabilités de St. Flour 1981, Lecture Notes in Math. 976, Springer-Verlag, 384–464.
Weber, M. [1989]: The supremum of Gaussian processes with a constant variance, Probab. Theory Related Fields 81, 585–591.
Weber, M. [1990a]: The law of the iterated logarithm on subsequences – characterizations, Nagoya Math. J. 118, 65–97.
Weber, M. [1990b]: Une version fonctionnelle du théorème ergodique ponctuel, C. R. Acad. Sci. Paris Sér. I Math. 311, 131–133.
Weber, M. [1992]: Méthodes de sommation matricielles, C. R. Acad. Sci. Paris Sér. I Math. 315, 759–764.
Weber, M. [1993a]: GC sets, Stein’s elements and matrix summation methods, Prépublication IRMA no. 27, Strasbourg.
Weber, M. [1993b]: Opérateurs réguliers sur les espaces $L^p$, in Séminaire de probabilités XXVII, Lecture Notes in Math. 1557, Springer-Verlag, 207–215.
Weber, M. [1994]: GB and GC sets in ergodic theory, in Probability in Banach spaces, 9 (Sandbjerg, 1993), Progr. Probab. 35, Birkhäuser, Boston, 129–151.
Weber, M. [1995]: Borel matrix, Comment. Math. Univ. Carolin. 36, 401–415.
Weber, M. [1996]: Coupling of the GB set property for ergodic averages, J. Theoret. Probab. 9, 207–215.
Weber, M. [1997]: Entropy numbers in $L^p$-spaces for averages of rotations, J. Math. Kyoto Univ. 37, 689–700.
Weber, M. [1998a]: Entropie métrique et accroissements de moyennes ergodiques, Ricerche Mat. 47, 13–27.


Weber, M. [1998b]: Entropie métrique et convergence presque partout, Travaux en Cours 58, Hermann, Paris.
Weber, M. [1998c]: A propos d’une démonstration de K. Tandori, in Fascicule de probabilités (Rennes, 1998), Publ. Inst. Rech. Math. Rennes, Univ. Rennes I.
Weber, M. [1999]: Controlling orthogonal series with irrational rotations, Period. Math. Hungar. 38, 153–163.
Weber, M. [2000a]: Some theorems related to almost sure convergence of orthogonal series, Indag. Math. (N.S.) 11, 293–311.
Weber, M. [2000b]: Estimating random polynomials by means of metric entropy methods, Math. Inequal. Appl. 3, 443–457.
Weber, M. [2001a]: Sur le caractère gaussien de la convergence presque partout, Fund. Math. 167, 23–54.
Weber, M. [2001b]: Un théorème central limite presque sûr à moments généralisés pour les rotations irrationnelles, Manuscripta Math. 101, 175–190.
Weber, M. [2003]: Régularité ergodique de certaines classes de Donsker, Teor. Veroyatnost. i Primenen. 48, 766–784.
Weber, M. [2004a]: Some examples of application of the metric entropy method, Acta Math. Hungar. 105, 39–83.
Weber, M. [2004b]: A theorem related to the Marcinkiewicz–Salem conjecture, Results Math. 45, 169–184.
Weber, M. [2004c]: An arithmetical property of Rademacher sums, Indag. Math. (N.S.) 15, 133–150.
Weber, M. [2004d]: Discrepancy of randomly sampled sequences of reals, Math. Nachr. 271, 105–110.
Weber, M. [2005a]: Almost sure convergence and square functions of averages of Riemann sums, Results Math. 47, 340–354.
Weber, M. [2005b]: Divisors, spin sums and the functional equation of the Zeta-Riemann function, Period. Math. Hungar. 51, 1–13.
Weber, M. [2005c]: When the cone condition fails, Anal. Math. 31, 291–300.
Weber, M. [2005d]: Value distribution of squarefree divisors of Bernoulli sums, to appear in J. Math. Kyoto Univ.
Weber, M. [2005e]: An ergodic theorem of arithmetic type, Tatra Mt. Math. Publ. 31, 123–129.
Weber, M. [2006a]: On a stronger form of Salem–Zygmund’s inequality for random trigonometric sums with examples, Period. Math. Hungar. 52 (2), 73–104.
Weber, M. [2006b]: Uniform bounds under increment conditions, Trans. Amer. Math. Soc. 358, 911–936.
Weber, M. [2006c]: On the order of magnitude of the divisor function, Acta Math. Sinica 22, 377–382.
Weber, M. [2006d]: On the CLT for means under the rotation action. I, Theory Probab. Appl. 50, 631–649.
Weber, M. [2007a]: On the CLT for means under the rotation action. II, Theory Probab. Appl. 51, 377–387.


Weber, M. [2007b]: Divisors functions sampled with integral-valued random walks, preprint.
Weber, M. [2007c]: Small divisors of Bernoulli sums, Indag. Math. (N.S.) 18, 281–293.
Weber, M. [2008a]: On an identity of Ky Fan, Anal. Math. 34, 225–236.
Weber, M. [2008b]: On localization in Kronecker diophantine theorem, preprint available at www.arXiv:0806.3990v1, to appear in Unif. Distrib. Theory.
Weber, M. [2009a]: Supremum of random Dirichlet polynomials with sub-multiplicative coefficients, preprint available at www.arXiv:0904.2316v1.
Weber, M. [2009b]: A remark on zeros of Brownian motion, www.arXiv:0907.1572v1.
Weber, M. [2009c]: Dirichlet polynomials: some old and recent results, and their interplay in number theory, preprint available at www.arxiv:0907.4931v1.
Weber, M. [2009d]: A sharper estimate for divisors of Bernoulli sums, preprint available at www.arXiv:0908.2047v1.
Weyl, H. [1916]: Über die Gleichverteilung von Zahlen mod Eins, Math. Ann. 77, 313–352.
Wiener, N. [1939]: The ergodic theorem, Duke Math. J. 5, 1–18.
Wiener, N., and Wintner, A. [1941]: Harmonic analysis and ergodic theory, Amer. J. Math. 63, 415–426.
Wierdl, M. [1988]: Pointwise ergodic theorem along the prime numbers, Israel J. Math. 64, 315–336.
Wilson, B. M. [1922]: Proofs of some formulae enunciated by Ramanujan, Proc. London Math. Soc. (2) 21, 235–255.
Wintner, A. [1937]: On a trigonometrical series of Riemann, Amer. J. Math. 59, 629–634.
Wintner, A. [1944a]: Diophantine approximation and Hilbert space, Amer. J. Math. 66, 564–578.
Wintner, A. [1944b]: Random factorizations and Riemann’s hypothesis, Duke Math. J. 11, 267–275.
Wintner, A. [1944c]: The theory of measure in arithmetical semi-groups, Baltimore, MD.
Wintner, A. [1957]: Fourier constants and equidistant Riemann sums, J. Math. Pures Appl. 36, 251–261.
Wittmann, R. [1995a]: Almost everywhere convergence of ergodic averages of nonlinear operators, J. Funct. Anal. 127, 326–362.
Wittmann, R. [1995b]: On a maximal inequality of Rosenblatt and Wierdl, Bull. London Math. Soc. 27, 483–491.
Wright, A. L. [1980]: On a theorem of Maruyama, Ann. Probab. 8, 851–852.
Yano, S. [1950]: Notes on Fourier analysis XIX: A remark on Riemann sums, Tôhoku Math. J. 2, 1–3.
Yoshimoto, M. [1998]: Farey series and the Riemann hypothesis, Acta Math. Hungar. 78, 287–304.
Yoshimoto, T. [1976]: Induced contraction semigroups and random ergodic theorems, Dissertationes Math. (Rozprawy Mat.) 139, Polish Acad. Sci. Inst. Math., Warszawa.
Yu, J. R. [1978]: Some properties of random Dirichlet series, Acta Math. Sinica 21, 97–118.
Yu, J. R. [1985]: Sur quelques séries gaussiennes de Dirichlet, C. R. Acad. Sci. Paris Sér. I Math. 300, 521–522.


Yu, J. R. [1995]: Dirichlet spaces and random Dirichlet series, J. Anal. 3, 61–71.
Ziegler, K. [1998]: Some uniform ergodic inequalities in the non-measurable case, J. Funct. Anal. 154, 531–541.
Zygmund, A. [1959]: Trigonometric series I, II, Cambridge University Press, New York.

Index

adjoint operator, 73 admissible, 449 α-good, 55 aperiodic, 270 automorphism, 93 Azuma’s inequality, 373 Banach principle, 200 Bessel’s inequality, 22 bilateral Hilbert transform, 38 BMO space, 143 Boas–Bellman inequality, 22 Bochner–Herglotz lemma, 4 Borel–Cantelli lemma, 353 bounded fibres, 116 Bourgain’s return time theorem, 163, 168 breadth, 559 Calderon transference principle, 164 canonical Gaussian process, 231, 502 central limit theorem, vi, 267, 269 Cesàro bounded, 21 Cesàro means, 169 Chacon–Ornstein, 137 chain, 551 CLT, vi, 267 coboundary, 25 cobounding, 25 commutation assumption, 206 complete convergence, 161 completely mixing, 104 cone condition, 139 constant type, 409 continued fraction algorithm, 94 continuity principle, 206, 208 continuous in measure, 200 continuous spectrum, 104, 114

convergence in density, 110 convergence in measure, 200 convergence in variation, 268, 317 convergence system, 611 convergents, 94 Conze’s principle, 228 correlated sequence, 8 cycles, 96 Dickman function, 418 Diophantine type, 314 Dirichlet polynomials, 415 discrepancy, 410 discrete spectrum, 104 dominated ergodic theorem, 140 double sum method, 523 d-separable, 345 d-separable modification, 346 d-separable version, 346 Dunford–Schwartz contraction, 135 dyadic chaining, 81 elliptic Theta function, 701 endomorphism, 93 entropy criterion, 233, 241, 254 (ε, n)-Kakutani–Rochlin set, 270 Erdős–Turán inequality, 410 ergodic, 102 Euler–Maclaurin sum-formula, 570 extension, 98 factor, 98 factor map, 98 Farey sequences, 565 Fefferman inequality, 143 first entropy criterion, 233 flow, 132 Fourier inversion formula, 42

Franel’s identity, 568 functional equation of the Riemann zeta function, 701 Gaposhkin’s criterion, 86 Gauss space, 503 Gauss sums, 691 Gaussian measure, 503 Gaussian Stein’s elements, 209 GB set, 231, 510 GC set, 510 good sequences of integers, 55 Hadamard gap condition, 315 Hardy space, 143 Hilbert transform, 40 homomorphism, 93 inclusion–exclusion formula, 668 induced dynamical system, 99 ∞-sweeping out, 194 intersective, 404 IP-set, 115 isometric, 61 isoperimetric inequality, 517 Jacobi–Legendre symbol, 694 Jordan’s identity, 673 Kakutani–Rochlin’s lemma, 270 Khintchin’s inequality, 427 k-mixing, 104 Ky Fan identity, 12 lacunary sequences, 194 large sieve inequality, 24 Lebesgue space, 270 Levy’s inequality, 389 Lindelöf Hypothesis, 706 local ergodic theorem, 132 local time, 317, 513 logarithmic means, 169 Möbius function, 565 majorizing measure, 439


majorizing measure method, 438 Martingale inequality, 132 maximal lemma, 130 mean ergodic, 22 mean good, 18 measurable dynamical system, 93 mixing for T , 116 modification of a stochastic process, 345 moving averages, 44 moving sums, 63 nice, 193 nonnegative definite, 3 normalization conditions, 66 normally distributed, 491 occupation time distribution, 513 oscillation functions, 152 Paley–Zygmund inequality, 354 partial quotients, 94 Poincaré identity, 668 Poincaré inequalities, 668 Poincaré Recurrence Theorem, 99 polarization formula, 62 power-bounded, 21 premeasure, 77 prime number theorem, 571 projector, 64 purely discrete spectrum, 114 quasi-orthogonal system, 22 Rademacher–Menchov’s theorem, 363 Rademacher–Menshov’s maximal inequality, 360 recurrent, 99 regular, 344 relation of weak maximal type, 144 relatively dense, 99 remotely trivial, 114 reproducing kernel Hilbert space, 503 resolution of the identity, 64 Riemann Hypothesis, 569


Riesz harmonic averages, 170 Riesz sequence, 603 Rosenthal’s inequality, 352 rotated Hilbert transform, 138 rotational invariance property, 497 second entropy criterion, 241 self-adjoint, 73 semi-algebra, 76 separable, 345 separation set, 345 sequential dynamical system, 116 Sidon set, 196 skew product, 97 Skorokhod embedding, 527 spectral inequality, 8 spectral measure, 8 spectral measure of the sequence, 8 spectral mixing theorem, 111 spectral regularization, 28 speed of convergence, 19, 148 square function, 29 Stechkin’s theorem, 349 stochastic integral, 78 stochastic matrix, 97 strong law of large numbers, 130 strong type, 140 strongly mixing, 103 strongly sweeping out, 194 subadditive, 13 Sudakov’s minoration, 507 symbolic flow, 96 symmetrization procedure, 377

third entropy criterion, 254 tightness, 537 Toeplitz’s criterion, 567 topological dynamical system, 93 transfer-function, 25 two-sided mixing, 115 type < ψ, 409 type η, 409 uniformly distributed modulo a, 52 unitary, 62 unitary operator, 61, 135 universally p-mean good, 18, 193 universally bad, 194 universally good, 193 van der Corput inequality, 55 version of a stochastic process, 345 von Neumann theorem, 10 wandering set, 136 weak type, 140 weakly independent, 115 weakly mixing, 103 weakly multiplicative system, 357 weakly stationary process, 61 weakly wandering, 114 weighted law of large numbers, 172 weighted modulation, 189 Wiener–Wintner function, 168 Young function, 342
