Ergodic theory and topological dynamics


Ergodic Theory and Topological Dynamics

Pure and Applied Mathematics: A Series of Monographs and Textbooks

Editors: Samuel Eilenberg and Hyman Bass, Columbia University, New York

RECENT TITLES

GERALD J. JANUSZ. Algebraic Number Fields
A. S. B. HOLLAND. Introduction to the Theory of Entire Functions
WAYNE ROBERTS AND DALE VARBERG. Convex Functions
A. M. OSTROWSKI. Solution of Equations in Euclidean and Banach Spaces, Third Edition of Solution of Equations and Systems of Equations
H. M. EDWARDS. Riemann's Zeta Function
SAMUEL EILENBERG. Automata, Languages, and Machines: Volumes A and B
MORRIS HIRSCH AND STEPHEN SMALE. Differential Equations, Dynamical Systems, and Linear Algebra
WILHELM MAGNUS. Noneuclidean Tesselations and Their Groups
FRANÇOIS TREVES. Basic Linear Partial Differential Equations
WILLIAM M. BOOTHBY. An Introduction to Differentiable Manifolds and Riemannian Geometry
BRAYTON GRAY. Homotopy Theory: An Introduction to Algebraic Topology
ROBERT A. ADAMS. Sobolev Spaces
JOHN J. BENEDETTO. Spectral Synthesis
D. V. WIDDER. The Heat Equation
IRVING EZRA SEGAL. Mathematical Cosmology and Extragalactic Astronomy
J. DIEUDONNÉ. Treatise on Analysis: Volume II, enlarged and corrected printing; Volume IV. In preparation: Volume V
WERNER GREUB, STEPHEN HALPERIN, AND RAY VANSTONE. Connections, Curvature, and Cohomology: Volume III, Cohomology of Principal Bundles and Homogeneous Spaces
I. MARTIN ISAACS. Character Theory of Finite Groups
JAMES R. BROWN. Ergodic Theory and Topological Dynamics

In preparation

CLIFFORD A. TRUESDELL. A First Course in Rational Continuum Mechanics: Volume 1, General Concepts
K. D. STROYAN AND W. A. J. LUXEMBURG. Introduction to the Theory of Infinitesimals
B. M. PUTTASWAMAIAH AND JOHN D. DIXON. Modular Representations of Finite Groups
MELVYN BERGER. Nonlinearity and Functional Analysis: Lectures on Nonlinear Problems in Mathematical Analysis
GEORGE GRÄTZER. Lattice Theory

Ergodic Theory and Topological Dynamics

JAMES R. BROWN Department of Mathematics Oregon State University Corvallis, Oregon

ACADEMIC PRESS

New York San Francisco London

A Subsidiary of Harcourt Brace Jovanovich, Publishers

1976

COPYRIGHT © 1976, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.

111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1

Library of Congress Cataloging in Publication Data

Brown, James Russell, Date
Ergodic theory and topological dynamics.
(Pure and applied mathematics, a series of monographs and textbooks; v. 70)
1. Topological dynamics. 2. Ergodic theory. I. Title. II. Series.
QA3.P8 [QA611.5] 510'.8s [515'.42]
ISBN 0-12-137150-6
AMS (MOS) 1970 Subject Classifications: 28A65, 54H20, 22D40, 47A35

PRINTED IN THE UNITED STATES OF AMERICA

75-40607

Jolan


Contents

Preface

Chapter I Ergodic Theory
1. Abstract Dynamical Systems
2. Ergodic Theorems
3. Ergodicity and Mixing
4. Products and Factors
5. Inverse Limits
6. Induced Systems
Exercises

Chapter II Topological Dynamics
1. Classical Dynamical Systems
2. Minimal and Strictly Ergodic Systems
3. Equicontinuous and Distal Systems
4. Sums and Products of Dynamical Systems
5. Inverse Limits
6. The Ellis Semigroup of Z
7. Expansive Systems
Exercises

Chapter III Group Automorphisms and Affine Transformations
1. Dynamical Systems on Groups
2. Ergodicity
3. Discrete and Quasidiscrete Spectrum
4. Quasiperiodic Spectrum and the Ergodic Part of $\tau$
5. Ergodic Automorphisms
6. An Affine Transformation Associated with the Dynamical System $\Phi$
Exercises

Chapter IV Entropy
1. Conditional Expectation and Kolmogorov Entropy
2. The Information Function and Finiteness of $h(\phi, \mathcal{A})$
3. Sinai's Theorem and Generators
4. Topological Entropy
5. Entropy of Affine Transformations
6. McMillan's Theorem and Entropy of Induced Systems
Exercises

Chapter V Bernoulli Systems and Ornstein's Theorem
1. Definitions
2. Approximation Lemmas
3. The Isomorphism Theorem
4. Extensions and Consequences of the Isomorphism Theorem
Exercises

Bibliography

Index

Preface

This book has been sixteen years aborning. In 1959-1960 the author sat in the lectures of S. Kakutani at Yale University and learned his first lessons in ergodic theory. Notes taken in those lectures have evolved and expanded over the years through the author's own lectures at Oregon State University, have been garnished by his contact with distinguished mentors and colleagues, and after numerous rewritings have assumed the form of the present text.

Why do we offer such a book now? Sixteen years ago in New Haven we were hearing of great new breakthroughs coming out of Moscow as the second great era of ergodic theory (the Kolmogorov era) was being launched. Now we are already six years into the third epoch, the Ornstein age. Perhaps the time has come to take a leisurely look at some of the accomplishments of the first two periods and a glimpse at what is developing in the third.

Chapters I and IV of this book are devoted, respectively, to pre- and post-Kolmogorov ergodic theory. Chapter V is an introduction to current developments. These three chapters may be read separately as an introduction to modern ergodic theory. At the same time, the most casual observer must note the parallels of this theory to the topological dynamics introduced in Chapter II and Section 4 of Chapter IV. These parallels as well as the validity of two viewpoints, measure-theoretic and topological, in studying classical systems led the author to include the two half-brothers of ergodic theory and topological dynamics in one volume. The two theories merge most satisfactorily when one studies the affine transformations of compact abelian groups, which are introduced in Chapter III.

No attempt at completeness has been made, and selections will invariably reflect the tastes of the author. Specific apologies are due, however, for the omission of the extensive body of results regarding invariant measures, the virtual groups of Mackey, and some important structure theorems of topological dynamics. The author reserves the privilege of returning to some of these as well as more specialized topics in a later volume.

It was intended that this book be accessible to the beginning mathematician with some background in abstract measure theory and general topology. In addition, a little familiarity with infinite-dimensional vector spaces and, for Chapter III, topological groups, would be helpful. It was further intended that the material would not be entirely without interest to the mature mathematician. We hope only that we have not quietly slipped into the interstice between these aims.

Exercises are included for all the chapters. While the number accompanying Chapter V is quite small, most readers will find the reading of that chapter sufficient exercise in itself.

CHAPTER I

Ergodic Theory

1. ABSTRACT DYNAMICAL SYSTEMS

Ergodic theory may be defined to be the study of transformations or groups of transformations, which are defined on some measure space, which are measurable with respect to the measure structure of that space, and which leave invariant the measure of all measurable subsets of the space. In this chapter we shall concern ourselves with the theory of a single measure-preserving transformation and its iterates. This will make it possible to display the essential features of ergodic theory without becoming involved in unnecessary complications of notation and the intricacies of group theory. It should, however, be pointed out that most of the classical applications of ergodic theory require the consideration of a continuous group of transformations.

It is customary in ergodic theory to assume that the underlying space is either a finite or $\sigma$-finite measure space. We shall assume, except in some of the exercises, that the measure is finite and normalized to have total measure one. It is commonly further assumed that the measure space is separable (equivalently, that the space of square-integrable complex-valued functions on this measure space is a separable Hilbert space). We shall not make this assumption, principally because it would rule out some of our most interesting examples and our principal structure theorems in Chapter III. There seems to be no compelling reason to impose the condition of separability, provided that we do our measure theory, in Chapter IV, for example, in terms of $\sigma$-algebras rather than partitions.

Let $X$ be a nonempty set. Let $\mathcal{B}$ be a $\sigma$-algebra of subsets of $X$. In other words, $\mathcal{B}$ contains the empty set $\emptyset$ and the set $X$ and is closed under the formation of countable unions, countable intersections, and complements. We make no further assumptions about $\mathcal{B}$. Let $\mu$ be a normalized measure on $(X, \mathcal{B})$. That is, $\mu$ is a nonnegative, real-valued, countably additive function defined on $\mathcal{B}$, with $\mu(X) = 1$. A function $\phi: X \to X$ is measurable if $\phi^{-1}(A) \in \mathcal{B}$ whenever $A \in \mathcal{B}$. Measurability for a function from one measure space to another is similarly defined.

The measurable function $\phi: X \to X$ is said to be a measure-preserving transformation if $\mu(\phi^{-1}(A)) = \mu(A)$ for all $A \in \mathcal{B}$. It is an invertible measure-preserving transformation if it is one-to-one (monic) and if $\phi^{-1}$ is also measurable. In this case $\phi^{-1}$ is also a measure-preserving transformation. Measure-preserving transformations arise, for example, in the study of classical dynamical systems. In this case $\phi$ is first obtained as a continuous transformation of some (compact) topological space, and the existence of an invariant measure $\mu$, that is, a measure preserved by $\phi$, is proved. The system $(X, \mathcal{B}, \mu, \phi)$ is then abstracted from the topological setting. For this reason, we shall refer in this chapter to abstract dynamical systems.

Definition 1.1 An abstract dynamical system is a quadruple $\Phi = (X, \mathcal{B}, \mu, \phi)$, where $X$ is a nonempty set, $\mathcal{B}$ is a $\sigma$-algebra of subsets of $X$, $\mu$ is a normalized measure defined on $\mathcal{B}$, and $\phi$ is a measure-preserving transformation of $X$. We shall say that $\Phi$ is invertible if $\phi$ is invertible.

While our principal object of study is, of course, the transformation $\phi$, we adopt the above notation and terminology for a variety of reasons. For example, we shall have occasion to consider as different dynamical systems two quadruples $\Phi = (X, \mathcal{B}, \mu, \phi)$ and $\Phi' = (X, \mathcal{B}', \mu, \phi)$, which differ only in the class of measurable sets. In order to avoid some trivialities as well as some embarrassing technical difficulties of measure theory, we shall adopt the following notion of equivalence of abstract dynamical systems.

Definition 1.2 The dynamical systems $\Phi = (X, \mathcal{B}, \mu, \phi)$ and $\Phi' = (X', \mathcal{B}', \mu', \phi')$ are equivalent if there exists a mapping $\psi^*: \mathcal{B}' \to \mathcal{B}$ which is monic and epic, and which satisfies

$$\mu(\psi^*(B')) = \mu'(B') \qquad (B' \in \mathcal{B}')$$

and

$$\mu\left[\phi^{-1}(\psi^*(B')) \,\Delta\, \psi^*(\phi'^{-1}(B'))\right] = 0 \qquad (B' \in \mathcal{B}').$$

Here we have used the symbol $\Delta$ to denote the symmetric difference $C \,\Delta\, D = (C - D) \cup (D - C)$ of the sets $C$ and $D$. Of course, if $\psi: X \to X'$ is an invertible measure-preserving transformation such that $\psi\phi = \phi'\psi$ modulo sets of measure zero, then its adjoint $\psi^*$ defined by $\psi^*(B) = \psi^{-1}(B)$ satisfies the above requirements, so that $\Phi$ and $\Phi'$ are equivalent. However, it is not always possible to find such a $\psi$ for equivalent systems $\Phi$ and $\Phi'$. Let us consider some examples of abstract dynamical systems.

Example 1 Let $X = [0, 1]$ be the set of real numbers between 0 and 1 inclusive. Let $\mathcal{B}$ denote the Borel subsets of $X$ and $\mu$ the restriction of Lebesgue measure to $X$. Define $\phi(x)$ for each $x \in X$ to be the fractional part of $x + \alpha$, where $0 < \alpha < 1$. It is easily verified that $\phi$ preserves the length of any interval and hence (see Exercise 1) the measure of any set in $\mathcal{B}$. Let $X' = K = \{z : |z| = 1\}$ be the set of complex numbers of absolute value one. Let $\mathcal{B}'$ denote the Borel subsets of $X'$ and $\mu'$ the normalized linear measure on $X'$. Define $\phi': X' \to X'$ by $\phi'(z) = e^{2\pi i \alpha} z$. Then $\Phi' = (X', \mathcal{B}', \mu', \phi')$ is a dynamical system, which is easily seen to be equivalent to $\Phi = (X, \mathcal{B}, \mu, \phi)$. Let $X'' = X \times X$ be the unit square, and define $\phi''(x, y) = (\phi(x), y)$. If $\mu''$ is two-dimensional Lebesgue measure, then $\Phi'' = (X'', \mathcal{B}'', \mu'', \phi'')$ will be equivalent to $\Phi$ provided that $\mathcal{B}'' = \{A \times X : A \in \mathcal{B}\}$, but not equivalent if $\mathcal{B}''$ is taken to be all the Borel subsets of $X''$.
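The claim in Example 1 that the rotation preserves the length of intervals can be checked by hand, and the following Python sketch does the bookkeeping in exact rational arithmetic. It is only an illustration: the helper names `preimage_pieces` and `total_length` and the sample values of $\alpha$, $a$, $b$ are our own assumptions, not part of the text.

```python
from fractions import Fraction

def preimage_pieces(a, b, alpha):
    """Intervals making up phi^{-1}([a, b)) for phi(x) = frac(x + alpha).

    Assumes 0 <= a <= b <= 1 and 0 < alpha < 1, all exact rationals.
    """
    # x is in the preimage iff frac(x + alpha) is in [a, b),
    # i.e. x is in [a - alpha, b - alpha) reduced modulo 1.
    lo, hi = a - alpha, b - alpha
    if lo >= 0:                 # no wrap-around
        return [(lo, hi)]
    if hi <= 0:                 # the whole interval wraps around
        return [(lo + 1, hi + 1)]
    # the interval straddles 0 and splits into two pieces
    return [(Fraction(0), hi), (lo + 1, Fraction(1))]

def total_length(pieces):
    """Lebesgue measure of a finite union of disjoint intervals."""
    return sum(hi - lo for lo, hi in pieces)

alpha = Fraction(3, 7)                  # illustrative rotation number
a, b = Fraction(1, 5), Fraction(2, 3)   # illustrative interval [a, b)
pieces = preimage_pieces(a, b, alpha)
print(pieces)
print(total_length(pieces) == b - a)
```

The preimage may split into two pieces, but their total length always equals $b - a$, which is the measure-preserving property on intervals.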

Example 2 Let $X$, $\mathcal{B}$, and $\mu$ be as in the previous example, and define $\phi(x)$ to be the fractional part of $kx$, where $k$ is a positive integer. Equivalently, define $\phi'$ on $X'$ by $\phi'(z) = z^k$. This is an example of a noninvertible dynamical system (for $k > 1$). Since $\phi([0, 1/k]) = X$, it is not generally true for noninvertible systems that $\mu(\phi(A)) = \mu(A)$, even when $\phi(A)$ is measurable.

Example 3 (Shift transformations) Let $X_n = \{0, 1, \ldots, k - 1\}$ be a finite set of $k$ points. Let $\mathcal{B}_n$ denote the class of all subsets of $X_n$, and let $\mu_n$ be the measure obtained by assigning to the point $j$ the mass $p_{nj}$:

$$\mu_n = \{p_{n0}, p_{n1}, \ldots, p_{n,k-1}\}.$$

We can form a measure space $(X, \mathcal{B}, \mu)$ by taking the infinite product

$$(X, \mathcal{B}, \mu) = \prod_{n=0}^{\infty} (X_n, \mathcal{B}_n, \mu_n)$$

or the two-sided product

$$(X', \mathcal{B}', \mu') = \prod_{n=-\infty}^{\infty} (X_n, \mathcal{B}_n, \mu_n).$$

That is, $X$ or $X'$ consists of (one- or two-sided) sequences of elements of $X_n$, $\mathcal{B}$ or $\mathcal{B}'$ is the smallest $\sigma$-algebra containing all "cylinder sets"

$$C = \{x \in X : (x_{n_1}, \ldots, x_{n_r}) \in A\} = \bigcup_{(s_1, \ldots, s_r) \in A} \bigcap_{j=1}^{r} \{x : x_{n_j} = s_j\},$$

and $\mu$ or $\mu'$ assigns to $C$ the measure

$$\mu(C) = \sum_{(s_1, \ldots, s_r) \in A} \prod_{j=1}^{r} p_{n_j s_j}.$$

Now define $\phi$ and $\phi'$ on $X$ and $X'$, respectively, by $\phi(x) = y$ or $\phi'(x) = y$, where $y_n = x_{n+1}$ (all $n$). Noting that

$$\phi^{-1}(C) = \bigcup_{(s_1, \ldots, s_r) \in A} \bigcap_{j=1}^{r} \phi^{-1}\{y : y_{n_j} = s_j\} = \bigcup_{(s_1, \ldots, s_r) \in A} \bigcap_{j=1}^{r} \{x : x_{n_j + 1} = s_j\},$$

we see that $\phi^{-1}$ (or $\phi'^{-1}$) carries cylinder sets into cylinder sets, hence $\phi$ (or $\phi'$) is measurable. Clearly, it will be measure-preserving iff

$$p_{nj} = p_j \qquad (j = 0, 1, \ldots, k - 1),$$

independently of $n$. Note that $\phi'$ is invertible, but that $\phi$ is a $k$-to-one transformation. In fact, with equal weights $p_j = 1/k$, the one-sided system $\Phi$ is equivalent to the system $\Phi$ of Example 2. To see this, we need only express each $x \in [0, 1]$ in its $k$-adic expansion, thus obtaining an almost one-to-one correspondence between $[0, 1]$ and the one-sided sequence space $X$.
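The correspondence between the one-sided shift and the map $x \mapsto kx \pmod 1$ of Example 2 can be seen concretely: multiplying by $k$ and discarding the integer part deletes the first $k$-adic digit. The sketch below checks this with exact rationals. The function names and the sample point $5/17$ are illustrative assumptions, and a finite digit string of course only approximates a point of the sequence space.

```python
from fractions import Fraction

def k_adic_digits(x, k, n):
    """First n digits of the base-k expansion of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= k
        d = int(x)          # the integer part is the next digit
        digits.append(d)
        x -= d              # keep only the fractional part
    return digits

def times_k_mod_1(x, k):
    """The map of Example 2: the fractional part of k*x."""
    y = k * x
    return y - int(y)

k = 3
x = Fraction(5, 17)         # illustrative sample point
digits = k_adic_digits(x, k, 9)
shifted = k_adic_digits(times_k_mod_1(x, k), k, 8)
print(digits)
print(shifted)
print(shifted == digits[1:])
```

The digits of $\phi(x)$ are the digits of $x$ with the first one removed, which is exactly the action $y_n = x_{n+1}$ of the shift.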


If $\Phi = (X, \mathcal{B}, \mu, \phi)$ is an abstract dynamical system, then $\phi$ determines a transformation $T_\phi$ of (real- or complex-valued) functions on $X$, defined by the formula $T_\phi f(x) = f(\phi(x))$. If $f$ is measurable with respect to $\mathcal{B}$, then so is $T_\phi f$. If $f$ is integrable, then $T_\phi f$ is integrable, and $\int T_\phi f \, d\mu = \int f \, d\mu$. This follows for simple functions from the fact that $\phi$ is measure preserving and in the general case by a limiting argument.

Recall that $L_p = L_p(\mu) = L_p(X, \mathcal{B}, \mu)$ $(1 \le p < \infty)$ denotes the set of all measurable real- or complex-valued functions $f$ defined on $X$ for which $\int |f(x)|^p \, \mu(dx) < \infty$, and that $L_p$ with the norm

$$\|f\|_p = \left( \int_X |f(x)|^p \, \mu(dx) \right)^{1/p}$$

is a complete, normed linear space. We denote, as usual, by $L_\infty$ the space of $\mu$-essentially bounded functions with the $\mu$-essential supremum norm. In this chapter we shall be chiefly concerned with real $L_p$; that is, the functions in $L_p$ will be assumed to be real valued. Note, however, that the ergodic theorems of Section 2 are valid in complex $L_p$, and that the spectral theory introduced in Section 3 requires consideration of complex $L_p$.

Since $T_\phi |f| = |T_\phi f|$, it is clear that the linear transformation $T_\phi$ maps $L_p$ into $L_p$ for each $p$, $1 \le p \le \infty$, and that $\|T_\phi f\|_p = \|f\|_p$. That is, $T_\phi$ is an isometry on $L_p$. If $\phi$ is an invertible measure-preserving transformation, then $T_\phi$ is an invertible isometry, with $T_\phi^{-1} = T_{\phi^{-1}}$. In the case $p = 2$, this means that $T_\phi$ is unitary. We define a doubly stochastic operator on $L_p$ $(1 \le p \le \infty)$, so called because of its origins in probability theory and its analogy to doubly stochastic matrices, as follows.

Definition 1.3 A linear operator $T$ defined for functions on $X$ is doubly stochastic if it maps $L_1$ into $L_1$ and satisfies for all $f \in L_1$ the following conditions:
1. $f \ge 0 \Rightarrow Tf \ge 0$;
2. $\int_X Tf \, d\mu = \int_X f \, d\mu$;
3. $T1 = 1$.

Here we use the symbols $\ge$ and $=$ in the $\mu$-almost-everywhere sense, and we denote by 1 the function whose constant value is 1. For each continuous linear operator $T$ on $L_p$, there is a well-defined continuous linear operator $T^*$, called the adjoint of $T$, defined on $L_q$, where $1 \le p < \infty$ and $1/p + 1/q = 1$ ($q = \infty$ for $p = 1$). They are related by $(Tf, g) = (f, T^*g)$ for all $f \in L_p$, $g \in L_q$. The symbol $(f, g)$ denotes the "inner product"

$$(f, g) = \int_X f(x) \, \overline{g(x)} \, \mu(dx),$$

where $\bar{c}$ is the complex conjugate of $c$ and the bar may be ignored for real $L_p$.
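The analogy with doubly stochastic matrices can be made concrete on a finite space. In the sketch below we assume a hypothetical three-point space with uniform measure and a sample matrix `P` of our own choosing; the operator is $(Tf)(i) = \sum_j P_{ij} f(j)$, and with respect to the uniform measure its adjoint is given by the transposed matrix. Under these assumptions the three conditions of Definition 1.3 reduce to nonnegative entries, unit column sums, and unit row sums.

```python
from fractions import Fraction as F

# Hypothetical 3-point space with uniform measure mu({i}) = 1/3.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)]]
m = len(P)

def apply_op(M, f):
    """(Tf)(i) = sum_j M[i][j] * f(j)."""
    return [sum(M[i][j] * f[j] for j in range(m)) for i in range(m)]

def integral(f):
    """Integral of f against the uniform probability measure."""
    return sum(f) / m

def inner(f, g):
    """Real inner product (f, g) = integral of f*g."""
    return integral([a * b for a, b in zip(f, g)])

transpose = [[P[j][i] for j in range(m)] for i in range(m)]
one = [F(1)] * m
f = [F(3), F(-1), F(2)]     # arbitrary test functions
g = [F(0), F(5), F(1)]

positive = all(entry >= 0 for row in P for entry in row)      # condition 1
preserves_integral = integral(apply_op(P, f)) == integral(f)  # condition 2
fixes_one = apply_op(P, one) == one                           # condition 3
adjoint_ok = inner(apply_op(P, f), g) == inner(f, apply_op(transpose, g))
print(positive, preserves_integral, fixes_one, adjoint_ok)
```

Since the transpose of a doubly stochastic matrix is again doubly stochastic, this finite case also illustrates why the adjoint of a doubly stochastic operator has the same three properties.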

Proposition 1.1 If $T$ is a doubly stochastic operator, then $T$ maps $L_p$ into $L_p$ for each $p$, $1 \le p \le \infty$, with $\|T\|_p \le 1$ and $\|T\|_1 = \|T\|_\infty = 1$. Moreover, $T^*$ is also doubly stochastic.

Proof We show first that $\|T\|_\infty = 1$. This makes sense because $L_\infty \subseteq L_1$, and $T$ is defined on $L_1$. For $f \in L_1$, let $f^+$ and $f^-$ denote, respectively, the positive and negative parts of $f$. Thus $f = f^+ - f^-$, $|f| = f^+ + f^-$, and $f^+, f^- \ge 0$. By property 1 of Definition 1.3, $Tf^+ \ge 0$ and $Tf^- \ge 0$. Thus

$$|Tf| = |T(f^+ - f^-)| = |Tf^+ - Tf^-| \le Tf^+ + Tf^- = T|f|. \tag{1}$$

By properties 1 and 3, for $f \in L_\infty$,

$$T|f| \le T(\|f\|_\infty 1) = \|f\|_\infty T1 = \|f\|_\infty. \tag{2}$$

Combining inequalities (1) and (2) yields $\|Tf\|_\infty \le \|f\|_\infty$, or $\|T\|_\infty \le 1$. Since $T1 = 1$, it follows that $\|T\|_\infty = 1$. Note that we used only properties 1 and 3 and the linearity of $T$ to show that $\|T\|_\infty = 1$.

Now property 2 is equivalent to saying that $(Tf, 1) = (f, T^*1) = (f, 1)$ for all $f \in L_1$. That is, property 2 is equivalent to $T^*1 = 1$. It follows as above that $\|T^*\|_\infty = \|T\|_1 = 1$. The inequality $\|T\|_p \le 1$ for $1 < p < \infty$ now follows as an application of the Riesz convexity theorem [16, p. 526]. (See also Exercise 7.)

To complete the proof we need to show that $T^*$ maps $L_1$ into $L_1$ and satisfies condition 2. If $f \in L_\infty$, then $\int T^*f \, d\mu = (1, T^*f) = (T1, f) = (1, f) = \int f \, d\mu$. Since $L_\infty$ is dense in $L_1$, it follows that $T^*$ has a unique continuous extension to all of $L_1$ satisfying $\int T^*f \, d\mu = \int f \, d\mu$ for all $f \in L_1$. Integrating inequality (1) above with $T$ replaced by $T^*$ shows that $T^*$ maps $L_1$ into $L_1$. ∎

It is interesting to note that the properties mentioned earlier for doubly stochastic operators arising from dynamical systems characterize these operators. Specifically, we have the following.


Proposition 1.2 Suppose that $(X, \mathcal{B}, \mu)$ is the unit interval with Lebesgue measure on the Borel sets. Then the operators of the form $T_\phi$ for some dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ are just exactly the doubly stochastic operators which are isometries on $L_2(X, \mathcal{B}, \mu)$. Moreover, $\Phi$ is invertible iff $T_\phi$ is unitary.

We shall not have occasion to use this result, and so will not give the proof. See, however, Exercise 6 at the end of this chapter.

2. ERGODIC THEOREMS

In the previous section we said that ergodic theory might be defined as the study of measure-preserving transformations. A more restrictive definition would be the study of the asymptotic behavior of the iterates $\phi^n$ of such a transformation. Indeed, the historical beginning of this discipline might be placed at the proof by G. D. Birkhoff in 1931 of the so-called individual ergodic theorem (Theorem 1.2) or the earlier proof by H. Poincaré in 1912 of the recurrence theorem (Theorem 1.5). In this section we shall look at these theorems as well as several others of a similar nature. We shall refer to these theorems collectively as ergodic theorems. Some of them involve the iterates $\phi^n$ of a measure-preserving transformation, while others involve the iterates $T^n$ of an operator having some or all of the properties of Definition 1.3. We make no pretense at completeness or ultimate generality in our selection of ergodic theorems, but give only a representative sample of those we believe have had the most impact on ergodic theory and its applications.

One further historical note seems to be in order at this point. In retrospect it is clear that the mean and individual ergodic theorems for a measure-preserving transformation were anticipated considerably earlier by the (Weak) Law of Large Numbers of J. Bernoulli (1713) and the Strong Law of Large Numbers of E. Borel (1909) for Bernoulli sequences of random variables. The identification of these latter theorems as ergodic theorems only awaited the invention of measure theory by Borel and Lebesgue and its application by A. Kolmogorov in 1933 to the foundations of probability.

We begin with one of the so-called maximal ergodic theorems, this one due to E. Hopf. As before, let $(X, \mathcal{B}, \mu)$ be a normalized measure space. Let $T$ be an operator on $L_1 = L_1(X, \mathcal{B}, \mu)$. We need only assume that $T$ has property 1 of Definition 1.3 and a weakened form of 2, namely $\|T\|_1 \le 1$. Such an operator is called a contraction. We introduce the following notation:

$$T_n f(x) = \sum_{k=0}^{n-1} T^k f(x),$$

$$B^*(f) = \{x : \sup_n T_n f(x) > 0\},$$

$$B_n^*(f) = \{x : \max_{1 \le k \le n} T_k f(x) > 0\},$$

where $f \in L_1$.

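Theorem 1.1 (Hopf maximal ergodic theorem) For each $f \in L_1$,

$$\int_{B^*(f)} f \, d\mu \ge 0. \tag{3}$$

Proof (A. Garsia [24]) Let

$$f_n(x) = \max_{1 \le k \le n} T_k f(x) \qquad (n = 1, 2, \ldots).$$

Then $f = f_1 \le f_2 \le \cdots$ and $B_n^* = B_n^*(f) = \{x : f_n(x) > 0\}$ is an increasing sequence of sets with union $B^*(f) = B^*$. Also, since $T \ge 0$,

$$T_1 f = f \le f + Tf_n^+, \qquad T_{k+1} f = f + T(T_k f) \le f + Tf_n^+ \qquad (1 \le k \le n),$$

so that

$$f_n \le f_{n+1} \le f + Tf_n^+.$$

Since $f_n = f_n^+$ on $B_n^*$ and $f_n^+ = 0$ elsewhere, it follows that

$$\int_{B_n^*} f \, d\mu \ge \int_{B_n^*} (f_n^+ - Tf_n^+) \, d\mu \ge \int_X f_n^+ \, d\mu - \int_X Tf_n^+ \, d\mu = \|f_n^+\|_1 - \|Tf_n^+\|_1 \ge 0.$$

The last inequality comes from the assumption that $\|T\|_1 \le 1$. Letting $n \to \infty$, we obtain the desired result

$$\int_{B^*} f \, d\mu \ge 0. \qquad ∎$$

Now let us introduce the further notation:

$$A^*(f; \alpha) = \left\{x : \sup_n \frac{1}{n} T_n f(x) > \alpha\right\}, \qquad A_*(f; \beta) = \left\{x : \inf_n \frac{1}{n} T_n f(x) < \beta\right\}.$$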


Corollary 1.1.1 For each $f \in L_1$ and each real $\alpha$ and $\beta$ we have

$$\int_{A^*(f; \alpha)} f \, d\mu \ge \alpha \, \mu(A^*(f; \alpha)) \tag{4}$$

and

$$\int_{A_*(f; \beta)} f \, d\mu \le \beta \, \mu(A_*(f; \beta)). \tag{5}$$

Proof To prove (4) apply (3) to the function $h = f - \alpha \in L_1$ and observe that $T_n h(x) > 0$ iff $(1/n) T_n f(x) > \alpha$. Inequality (5) then follows by applying (4) to $g = -f$ and taking $\alpha = -\beta$. ∎

After particularizing this corollary to the operator $T_\phi$ induced by the dynamical system $\Phi$, it is a relatively easy task to prove the most celebrated of the ergodic theorems, that of Birkhoff.

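Theorem 1.2 (Birkhoff individual ergodic theorem) Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an abstract dynamical system, and suppose that $f \in L_1 = L_1(X, \mathcal{B}, \mu)$. Then there exists a function $\bar{f} \in L_1$ such that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\phi^k(x)) = \bar{f}(x) \quad \text{a.e.} \tag{6}$$

Proof (F. Riesz) Let us write $T = T_\phi$ and denote

$$\bar{f}(x) = \limsup_{n \to \infty} \frac{1}{n} T_n f(x), \qquad \underline{f}(x) = \liminf_{n \to \infty} \frac{1}{n} T_n f(x),$$

$$f^*(x) = \sup_n \frac{1}{n} T_n f(x), \qquad f_*(x) = \inf_n \frac{1}{n} T_n f(x),$$

so that in Corollary 1.1.1

$$A^*(f; \alpha) = \{x : f^*(x) > \alpha\}, \qquad A_*(f; \beta) = \{x : f_*(x) < \beta\}.$$

For fixed $\alpha$ and $\beta$ with $\beta < \alpha$, let

$$A(\alpha, \beta) = \{x : \underline{f}(x) < \beta < \alpha < \bar{f}(x)\}. \tag{7}$$

Since $\bar{f}(\phi(x)) = \bar{f}(x)$ and $\underline{f}(\phi(x)) = \underline{f}(x)$ for all $x \in X$, it is clear that $\phi: A(\alpha, \beta) \to A(\alpha, \beta)$. Assuming that $\mu(A(\alpha, \beta)) = \gamma > 0$, we can apply Corollary 1.1.1 to the dynamical system $\Phi_{\alpha\beta} = (A(\alpha, \beta), \mathcal{B} \cap A(\alpha, \beta), (1/\gamma)\mu, \phi)$. Since $f_* \le \underline{f} \le \bar{f} \le f^*$, we have for $\Phi_{\alpha\beta}$ that $A^*(f; \alpha) = A_*(f; \beta) = A(\alpha, \beta)$. It follows that

$$\alpha \le \frac{1}{\gamma} \int_{A(\alpha, \beta)} f \, d\mu \le \beta,$$

which contradicts $\beta < \alpha$. Thus we have $\mu(A(\alpha, \beta)) = 0$. Since

$$A = \{x : \underline{f}(x) < \bar{f}(x)\} = \bigcup_{\beta < \alpha;\ \alpha, \beta \text{ rational}} A(\alpha, \beta),$$

it follows that $\mu(A) = 0$. Thus $\underline{f}(x) = \bar{f}(x)$ $\mu$-a.e., and the proof of convergence is complete. To see that $\bar{f} \in L_1$, note that

$$\int_X \left| \frac{1}{n} T_n f \right| d\mu \le \frac{1}{n} \sum_{k=0}^{n-1} \int_X |T^k f| \, d\mu = \|f\|_1.$$

By Fatou's lemma,

$$\int_X |\bar{f}| \, d\mu \le \liminf_{n \to \infty} \int_X \left| \frac{1}{n} T_n f \right| d\mu \le \|f\|_1. \qquad ∎$$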

Remarks 1 Much has been done in the way of proving individual (that is, pointwise convergence) ergodic theorems for operators. See, for example, the excellent account by Garsia in [24]. A direct generalization of Theorem 1.2 to doubly stochastic operators yields the Hopf ergodic theorem. The proof again is based on Theorem 1.1. The same result with weaker hypotheses was proved by Dunford and Schwartz (see [16] or [24]). Recently, using the notion of "dilation of an operator," Akcoglu [3] has proved pointwise convergence for (positive) contractions on $L_p$, $1 < p < \infty$.

2 In the case of a discrete (completely atomic) measure space, a classical theorem of Kolmogorov yields convergence as in Eq. (6) for operators $T$ only assumed to satisfy properties 1 and 3 of Definition 1.3. This theorem, usually stated in terms of convergence of a sequence of matrices, is basic in the analysis of finite or denumerable Markov chains.

3 Many of these theorems, including Theorem 1.2, are also valid when $\mu$ is a $\sigma$-finite measure. However, the limit function $\bar{f}$ may be uninteresting in this case (Exercise 9).

A more sophisticated result, which also includes almost all of the theorems mentioned so far, is the following.


Theorem 1.3 (R. Chacon-D. Ornstein) Suppose $(X, \mathcal{B}, \mu)$ is a finite or $\sigma$-finite measure space and $T$ is a linear operator on $L_1 = L_1(X, \mathcal{B}, \mu)$ satisfying (i) $T \ge 0$ and (ii) $\|T\|_1 \le 1$. Then for each $f, g \in L_1$ with $g \ge 0$, the limit

$$\lim_{n \to \infty} \frac{T_n f(x)}{T_n g(x)}$$

exists and is finite almost everywhere that $\sup_n T_n g(x) > 0$.

The proof of this theorem is complicated and will not be given here (see [24, p. 30 ff.]). Instead, we proceed now to a fairly general "mean ergodic theorem," that is, one asserting convergence in $L_p$. If $\Phi$ is a dynamical system, we shall see that for each $f \in L_p$ the sequence $(1/n) T_n f$ converges in the norm topology of $L_p$ $(1 \le p < \infty)$. It follows (Exercise 11) that the limit must coincide with $\bar{f}$ almost everywhere. Thus

$$\lim_{n \to \infty} \left\| \frac{1}{n} T_n f - \bar{f} \right\|_p = 0$$

and, in particular, $\bar{f} \in L_p$. In the following, we assume as before that $(X, \mathcal{B}, \mu)$ is a normalized, finite measure space.

Theorem 1.4 (Yosida mean ergodic theorem) Let $T$ be a doubly stochastic operator and $f \in L_p$ $(1 \le p < \infty)$. Then there exists $f^* \in L_p$ such that

$$\lim_{n \to \infty} \left\| \frac{1}{n} \sum_{k=0}^{n-1} T^k f - f^* \right\|_p = 0. \tag{8}$$

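Proof Suppose $h$ is a function on $X$ with $Th \le h$. Then $g = h - Th \ge 0$ and $\int_X g \, d\mu = 0$. It follows that $g = 0$; that is, $Th = h$. According to Proposition 1.1, the same is true with $T$ replaced by its adjoint $T^*$. In particular, if $T^*h_1 = h_1$ and $T^*h_2 = h_2$, then by the positivity of $T^*$ and the previous remark, $T^*(h_1 \wedge h_2) = h_1 \wedge h_2$, where $h_1 \wedge h_2$ denotes the infimum of $h_1$ and $h_2$, defined by

$$(h_1 \wedge h_2)(x) = \min(h_1(x), h_2(x)).$$

Suppose $f = g - Tg$ with $g \in L_p$. Then $T_n f = g - T^n g$. It follows that

$$\left\| \frac{1}{n} T_n f \right\|_p = \frac{1}{n} \|g - T^n g\|_p \le \frac{2}{n} \|g\|_p \to 0$$

as $n \to \infty$. Thus (8) holds with $f^* = 0$ for all $f$ in the subspace $\mathcal{H}_1 = \{g - Tg : g \in L_p\}$. Likewise, (8) holds for all elements of $\mathcal{H}_2 = \{f \in L_p : Tf = f\}$, with $f^* = f$. We shall show that $\mathcal{H}_1 + \mathcal{H}_2$ is dense in $L_p$. It will follow that (8) is valid for all $f \in L_p$. For if $f_k \to f$ in $L_p$ and (8) is valid for each $f_k$, then

$$\left\| \frac{1}{n} T_n f - \frac{1}{m} T_m f \right\|_p \le 2 \|f - f_k\|_p + \left\| \frac{1}{n} T_n f_k - \frac{1}{m} T_m f_k \right\|_p,$$

which implies that (8) holds also for $f$.

To prove that $\mathcal{H}_1 + \mathcal{H}_2$ is dense in $L_p$, we shall show that the only $F \in L_q = L_p^*$ ($1/p + 1/q = 1$ for $p > 1$, and $q = \infty$ for $p = 1$), which is orthogonal to both $\mathcal{H}_1$ and $\mathcal{H}_2$, is the zero function. Suppose then that $F$ is such a function. It follows that $(F, g - Tg) = (F, g) - (T^*F, g) = 0$ for all $g \in L_p$, and hence that $T^*F = F$. Let $c$ be a fixed real number, and set $A = \{x : F(x) > c\}$. For each $\varepsilon > 0$ we define

$$g_\varepsilon = (1/\varepsilon)\left[(c + \varepsilon) \wedge F - (c \wedge F)\right].$$

Then $0 \le g_\varepsilon \le 1$ and $g_\varepsilon \uparrow \chi_A$, the characteristic function of the set $A$, as $\varepsilon \downarrow 0$. It follows by the monotone convergence theorem that $T^*g_\varepsilon \uparrow T^*\chi_A$ (Exercise 8). On the other hand,

$$T^*g_\varepsilon = (1/\varepsilon)\left[T^*((c + \varepsilon) \wedge F) - T^*(c \wedge F)\right] = (1/\varepsilon)\left[(c + \varepsilon) \wedge F - (c \wedge F)\right] = g_\varepsilon,$$

since $F$, $c$, and $c + \varepsilon$ are all invariant functions for $T^*$. Thus

$$T^*\chi_A = \lim_{\varepsilon \to 0} T^*g_\varepsilon = \lim_{\varepsilon \to 0} g_\varepsilon = \chi_A. \tag{9}$$

We shall show, in fact, that $T\chi_A = \chi_A$. If $B \in \mathcal{B}$ is arbitrary, then by (9) and the positivity of $T^*$,

$$T^*\chi_{A \cap B} \le T^*\chi_A = \chi_A \quad \text{and} \quad T^*\chi_{A \cap B} \le T^*\chi_B,$$

so that $T^*\chi_{A \cap B} \le \chi_A \, T^*\chi_B$ and

$$\mu(A \cap B) = \int_X T^*\chi_{A \cap B} \, d\mu \le \int_A T^*\chi_B \, d\mu = (T\chi_A, \chi_B) = \int_B T\chi_A \, d\mu. \tag{10}$$

Likewise, since $T^*\chi_{X - A} = \chi_{X - A}$,

$$\mu((X - A) \cap B) \le \int_B T\chi_{X - A} \, d\mu = \mu(B) - \int_B T\chi_A \, d\mu,$$

or

$$\int_B T\chi_A \, d\mu \le \mu(A \cap B). \tag{11}$$

Since $B$ was arbitrary, it follows that $T\chi_A = \chi_A$, as asserted. Now $\chi_A \in L_\infty \subseteq L_p$, and hence $\chi_A \in \mathcal{H}_2$. Since $F$ is assumed orthogonal to $\mathcal{H}_2$, this means that

$$(\chi_A, F) = \int_{[F > c]} F \, d\mu = 0.$$

Since this is true for all real $c$, we must have $F(x) = 0$ a.e. ∎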
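A finite example shows why it is the averages, and not the iterates themselves, that converge in Theorem 1.4. The sketch below assumes a hypothetical three-point space with uniform measure and the operator induced by a cyclic permutation; the function names and the test function are illustrative. The iterates $T^k f$ cycle with period 3 and never settle, while the averages approach the constant function equal to the mean of $f$.

```python
from fractions import Fraction as F

def cyclic_shift(f):
    """T f(i) = f(i + 1 mod m), induced by a cyclic permutation of m points."""
    return f[1:] + f[:1]

def cesaro_average(f, n):
    """(1/n) * sum_{k=0}^{n-1} T^k f, computed exactly."""
    total = [F(0)] * len(f)
    g = list(f)
    for _ in range(n):
        total = [t + v for t, v in zip(total, g)]
        g = cyclic_shift(g)
    return [t / n for t in total]

def l1_distance(f, g):
    """L_1 distance for the uniform probability measure on the m points."""
    return sum(abs(a - b) for a, b in zip(f, g)) / len(f)

f = [F(3), F(0), F(0)]
f_star = [F(1)] * 3          # the T-invariant limit: the constant mean of f
for n in (1, 2, 3, 10, 100):
    print(n, l1_distance(cesaro_average(f, n), f_star))
```

Here the distance to $f^*$ is bounded by a constant multiple of $1/n$, in line with the estimate $\|(1/n) T_n f\|_p \le (2/n)\|g\|_p$ used in the proof for functions of the form $g - Tg$.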

We conclude this section with a third type of ergodic theorem, the recurrence theorem of Poincaré.

Theorem 1.5 (Poincaré) Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an abstract dynamical system, and let $A \in \mathcal{B}$. Then for almost every $x \in A$ there is a positive integer $n = n_A(x)$ such that $\phi^n(x) \in A$.

Proof Let

$$B = A - \bigcup_{n=1}^{\infty} \{x \in A : \phi^n(x) \in A\} = \bigcap_{n=1}^{\infty} (A - \phi^{-n}(A)).$$

Since

$$\phi^{-m}(B) = \bigcap_{n=1}^{\infty} \left(\phi^{-m}(A) - \phi^{-(m+n)}(A)\right),$$

it is clear that the sets $B, \phi^{-1}(B), \phi^{-2}(B), \ldots$ are pairwise disjoint, measurable sets. Since $\mu(\phi^{-n}(B)) = \mu(B)$ for each $n$, and since $\mu(X) = 1$, it follows that $\mu(B) = 0$. ∎
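Recurrence is easy to observe numerically for the rotation of Example 1. The sketch below is an illustration in floating-point arithmetic, not a proof: the rotation number (the golden mean), the set $A = [0, 0.05)$, the sample points, and the function names are all our own assumptions. It records the first return time $n_A(x)$ for a grid of points of $A$.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2    # an irrational rotation number (assumed)
A_LEN = 0.05                      # A = [0, 0.05)

def rotate(x):
    """phi(x) = fractional part of x + ALPHA."""
    return (x + ALPHA) % 1.0

def first_return_time(x, max_steps=10_000):
    """Smallest n >= 1 with phi^n(x) in A, or None if none is found."""
    y = x
    for n in range(1, max_steps + 1):
        y = rotate(y)
        if y < A_LEN:
            return n
    return None

# 200 sample points spread through A.
samples = [A_LEN * (i + 0.5) / 200 for i in range(200)]
times = [first_return_time(x) for x in samples]
print(sorted(set(times)))
```

For this particular choice every sampled point returns, and only a few distinct return times occur. The theorem itself guarantees return only for almost every point and says nothing about how long the return takes.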


3. ERGODICITY AND MIXING

So far we know very little about the limit function $\bar{f}$ in Theorem 1.2. We know (see Exercise 11) that it coincides a.e. with the function $f^*$ of Theorem 1.4, and that $\int \bar{f} \, d\mu = \int f \, d\mu$. Of course, in some special cases we can completely identify $\bar{f}$. For example, if $f$ is an invariant function, $T_\phi f = f$, then $\bar{f} = f$. We know that when $f = g - T_\phi g$ for some $g \in L_p$, we have $\bar{f} = 0$. There is one more situation in which we can completely identify $\bar{f}$. This is when the dynamical system $\Phi$ is ergodic.

Definition 1.4 The abstract dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ is ergodic if $\phi^{-1}(A) = A$, $A \in \mathcal{B}$, implies either $\mu(A) = 0$ or $\mu(X - A) = 0$. A doubly stochastic operator $T$ is ergodic if $Tf = f$, $f \in L_1$, implies that $f$ is essentially a constant function, that is, $f(x) = c$ a.e.

Proposition 1.3 If $\Phi$ is ergodic, then the induced operator $T_\phi$ is ergodic.

Proof Suppose $T_\phi f = f$. For each positive integer $n$ and each integer $k$ let

$$X(k, n) = \left\{x : \frac{k}{2^n} \le f(x) < \frac{k + 1}{2^n}\right\}.$$

Since $\Phi$ is ergodic and $\phi^{-1}(X(k, n)) = X(k, n)$, it follows that $\mu(X(k, n)) = 0$ or $\mu(X(k, n)) = 1$. For each $n$ there must be exactly one $k$ with $\mu(X(k, n)) = 1$. Denote it by $k(n)$. It follows that $X_0 = \bigcap_{n=1}^{\infty} X(k(n), n)$ has measure 1. Clearly then, there exists a constant $c$ such that $X_0 = \{x : f(x) = c\}$. ∎

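Proposition 1.4 If $\Phi$ is ergodic, then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\phi^k(x)) = \int_X f \, d\mu \quad \text{a.e.} \tag{12}$$

for each $f \in L_1$, and

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(A \cap \phi^{-k}(B)) = \mu(A) \mu(B) \tag{13}$$

for each $A, B \in \mathcal{B}$. Conversely, if (12) holds for all $f \in L_1$, or if (13) holds for all $A, B \in \mathcal{B}$, then $\Phi$ is ergodic.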


Proof The validity of (12) follows from $T_\phi \bar{f} = \bar{f}$ and $\int \bar{f} \, d\mu = \int f \, d\mu$. If we set $f = \chi_B$, then $f(\phi^k(x)) = \chi_{\phi^{-k}(B)}(x)$. Integrating (12) over $A$ and applying the bounded convergence theorem yields (13). Since (12) implies (13), it only remains to show that the validity of (13) for all $A, B \in \mathcal{B}$ implies that $\Phi$ is ergodic. Suppose $\phi^{-1}(B) = B$ and set $A = X - B$. The left side of (13) is zero then, and so either $\mu(B) = 0$ or $\mu(X - B) = 0$. ∎
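Equality (12) can be watched at work on the rotation of Example 1. It is a standard fact, not proved in this section, that the rotation is ergodic when $\alpha$ is irrational. The sketch below assumes the golden mean for $\alpha$, the test function $f(x) = x^2$, and the starting point $0.2$, all chosen only for illustration, and compares time averages with the space average $\int_0^1 x^2 \, dx = 1/3$ in floating-point arithmetic.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # irrational rotation number (assumed)

def time_average(f, x, n):
    """(1/n) * sum_{k=0}^{n-1} f(phi^k(x)) for phi(x) = frac(x + ALPHA)."""
    total = 0.0
    for k in range(n):
        total += f((x + k * ALPHA) % 1.0)   # phi^k(x) = frac(x + k*ALPHA)
    return total / n

def f(x):
    return x * x                 # space average over [0, 1] is 1/3

for n in (10, 100, 1000, 10000):
    print(n, time_average(f, 0.2, n))
```

The printed averages should approach $1/3$ as $n$ grows. A numerical experiment of this kind can only suggest (12) for a particular point and function; it does not replace the theorem.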

The equality (12) is very closely related to the origins of ergodic theory in statistical mechanics. If we think of the sequence $\phi^n(x)$ as unfolding in time, then (12) is a statement of the ergodic hypothesis, namely, that time averages (of integrable functions) coincide with space (or phase) averages. In probability theory, (12) provides the foundation for a method of estimating parameters for (ergodic) stationary processes.

The significance of equality (13) is related to the recurrence theorem of Poincaré (Theorem 1.5). The latter theorem implies that, for a set $A$ of positive measure, almost every point of $A$ returns to $A$ infinitely often. It gives us no information, however, as to how many points of $A$ return to $A$ at the $n$th step of the process, or, more generally, how many points of $A$ are in the measurable set $B$ after $n$ steps. The proper measure of this number is $\mu(A \cap \phi^{-n}(B))$. Equality (13) tells us that asymptotically this number is, on the average for different values of $n$, proportional to the sizes of $A$ and $B$. It may in fact (Exercise 18) never, for a given value of $n$, be close to $\mu(A)\mu(B)$. Intuition tells us that for certain processes we should, in fact, have $\mu(A \cap \phi^{-n}(B))$ converging to $\mu(A)\mu(B)$ as $n \to \infty$. When this is true, the process is said to be mixing (or strongly mixing).

Definition 1.5 The dynamical system $\Phi$ is (strongly) mixing if

\[ \lim_{n\to\infty} \mu(A \cap \phi^{-n}(B)) = \mu(A)\,\mu(B) \]

for all $A, B \in \mathcal{B}$.
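To make the defining limit concrete, here is a small exact computation under assumptions that are not part of this passage: the doubling map $x \mapsto 2x \bmod 1$ on $[0, 1)$ with Lebesgue measure, and the sets $A = B = [0, 1/2)$. On each dyadic cell of width $2^{-(n+1)}$ the $n$th iterate is affine and maps the cell onto either $[0, 1/2)$ or $[1/2, 1)$, so membership can be decided from the left endpoint. The function name `correlation_doubling` is hypothetical.

```python
from fractions import Fraction

def correlation_doubling(n: int) -> Fraction:
    """Lebesgue measure of A ∩ phi^{-n}(B) for phi(x) = 2x mod 1, A = B = [0, 1/2)."""
    cells = 2 ** (n + 1)                 # dyadic cells of width 2^-(n+1)
    half = Fraction(1, 2)
    count = 0
    for i in range(cells):
        left = Fraction(i, cells)        # left endpoint of cell i
        in_A = left < half
        image = (left * 2 ** n) % 1      # phi^n applied to the left endpoint
        if in_A and image < half:        # the whole cell lies in A ∩ phi^{-n}(B)
            count += 1
    return Fraction(count, cells)

values = [correlation_doubling(n) for n in range(6)]
print(values)
```

For this particular choice the value equals $\mu(A)\mu(B) = 1/4$ for every $n \ge 1$, so the limit in the definition is attained immediately; in general one only expects convergence.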

To borrow an illustrative example from Halmos [32], suppose that a mixture is made containing 90% gin and 10% vermouth. If the process of stirring the mixture is ergodic, then after sufficient stirring any portion of the container will contain on the average (with respect to the number of stirrings) about 10% vermouth. If the process is a mixing one, the amount of vermouth in the given portion will become and remain close to 10%. Since molecular theory allows for occasional "accidents," such as the kitchen table that rises into the air because all of its molecules are moving in the same direction, we may want to consider a slight weakening of the notion of mixing, namely that after a large number of stirrings the amount of vermouth in the distinguished portion of the container will be close to 10% except for rare occasions.

We shall say that a set $J$ of positive integers has density zero if the number of elements in $J \cap \{1, 2, \ldots, n\}$ divided by $n$ tends to 0 as $n \to \infty$.
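A minimal sketch of the density-zero condition, using the perfect squares as an example set $J$ (an illustration chosen here, not taken from the text). The helper names `density_up_to` and `is_square` are hypothetical.

```python
import math

def density_up_to(in_J, n):
    """Return |J ∩ {1, ..., n}| / n for the set J described by the predicate in_J."""
    return sum(1 for k in range(1, n + 1) if in_J(k)) / n

def is_square(k):
    r = math.isqrt(k)
    return r * r == k

for n in (100, 10_000, 1_000_000):
    # There are isqrt(n) squares up to n, so the ratio behaves like 1/sqrt(n).
    print(n, density_up_to(is_square, n))
```

The ratios shrink toward 0, which is what having density zero means for the squares.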

Definition 1.6 The dynamical system $\Phi$ is weakly mixing if for each $A, B \in \mathcal{B}$

\[ \lim_{n\to\infty,\; n \notin J} \mu(A \cap \phi^{-n}(B)) = \mu(A)\,\mu(B), \tag{14} \]

where $J$ is a set of density zero, which may vary for different choices of $A$ and $B$.

The following proposition shows that weak mixing lies logically between mixing and ergodicity.

Proposition 1.5 Let $\Phi$ be an abstract dynamical system. Then the following are equivalent:

(i) $\Phi$ is weakly mixing;

(ii) $\displaystyle \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \left| \mu(A \cap \phi^{-k}(B)) - \mu(A)\,\mu(B) \right| = 0 \quad (A, B \in \mathcal{B})$;

(iii) $\displaystyle \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \left[ \mu(A \cap \phi^{-k}(B)) \right]^2 = \mu(A)^2 \mu(B)^2 \quad (A, B \in \mathcal{B})$;

(iv) the dynamical system $\Phi^2 = (X \times X, \mathcal{B} \times \mathcal{B}, \mu \times \mu, \phi \times \phi)$ is ergodic, where $(\phi \times \phi)(x, y) = (\phi(x), \phi(y))$;

(v) $\Phi^2$ is weakly mixing.

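Proof For a bounded sequence $\{a_n\}$ let us write $a = *\text{-}\lim_{n\to\infty} a_n$ provided that $a = \lim_{n\to\infty,\, n \notin J} a_n$, where $J$ has density zero. Then, in general,

\[ a = *\text{-}\lim_{n\to\infty} a_n \quad \text{iff} \quad \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} |a_k - a| = 0. \tag{15} \]

For suppose that $a = *\text{-}\lim_{n\to\infty} a_n$, with the exceptional set being $J$. If $|a_n| \le b$ for all $n$, then

\[ \frac{1}{n} \sum_{k=0}^{n-1} |a_k - a| \le \frac{1}{n} \sum_{k < n,\; k \notin J} |a_k - a| + \frac{2b\,|J_n|}{n}, \]

where $|J_n|$ is the number of elements in $J \cap \{0, 1, \ldots, n-1\}$. Thus $\lim_{n\to\infty} (1/n) \sum_{k=0}^{n-1} |a_k - a| = 0$. To prove the converse, note that $a = *\text{-}\lim_{n\to\infty} a_n$ iff $J(\varepsilon) = \{n : |a - a_n| \ge \varepsilon\}$ has density zero for each $\varepsilon > 0$. For if the latter holds, then there exists an increasing sequence of integers $n_m$ ($m = 1, 2, \ldots$) such that

\[ n \ge n_m \implies |J_n(1/m)| < n/m. \]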

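Setting

\[ J = \bigcup_{m=1}^{\infty} \bigl( J(1/m) \cap [n_m, n_{m+1}) \bigr), \]

we have for each $m$

\[ n_m \le n < n_{m+1} \implies |J_n| \le |J_n(1/m)| < n/m \]

and

\[ n \ge n_m,\; n \notin J \implies |a - a_n| < 1/m. \]

Now suppose that $a \ne *\text{-}\lim_{n\to\infty} a_n$. Then there exist $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ such that $|J_n(\varepsilon_1)| \ge n \varepsilon_2$ for infinitely many $n$. It follows that

\[ \frac{1}{n} \sum_{k=0}^{n-1} |a_k - a| \ge \varepsilon_1 \varepsilon_2 \]

for infinitely many $n$, and hence

\[ \limsup_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} |a_k - a| > 0. \]

This completes the proof of equivalence (15). Clearly, then, (i) and (ii) are equivalent. Also

\[ *\text{-}\lim_{n\to\infty} \mu(A \cap \phi^{-n}(B)) = \mu(A)\,\mu(B) \]

iff

\[ *\text{-}\lim_{n\to\infty} \left| \mu(A \cap \phi^{-n}(B)) - \mu(A)\,\mu(B) \right| = 0 \]

iff

\[ *\text{-}\lim_{n\to\infty} \left( \mu(A \cap \phi^{-n}(B)) - \mu(A)\,\mu(B) \right)^2 = 0 \]

iff

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \left[ \mu(A \cap \phi^{-k}(B)) - \mu(A)\,\mu(B) \right]^2 = 0. \]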

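Now if $\Phi$ is ergodic, then

\[
\begin{aligned}
\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \left[ \mu(A \cap \phi^{-k}(B)) - \mu(A)\mu(B) \right]^2
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \left[ \mu(A \cap \phi^{-k}(B))^2 - 2\mu(A)\mu(B)\,\mu(A \cap \phi^{-k}(B)) + \mu(A)^2\mu(B)^2 \right] \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(A \cap \phi^{-k}(B))^2 - \mu(A)^2 \mu(B)^2,
\end{aligned}
\]

so that in this case (ii) and (iii) are equivalent. However, either (ii) or (iii) implies that $\Phi$ is ergodic. Thus (ii) and (iii) are equivalent. To show that (14) holds for all $A, B \in \mathcal{B} \times \mathcal{B}$, with $\phi$ replaced by $\phi \times \phi$, it is sufficient to show that it holds for measurable rectangles. Condition (14) then becomes

\[ *\text{-}\lim_{n\to\infty} \mu(A \cap \phi^{-n}(B))\, \mu(C \cap \phi^{-n}(D)) = \mu(A)\,\mu(B)\,\mu(C)\,\mu(D) \quad (A, B, C, D \in \mathcal{B}). \tag{17} \]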

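Since the union of two sets of density zero has density zero, (17) follows from (14). That is, (i) implies (v). Since (iv) obviously follows from (v), it only remains to show that (iv) implies (iii). If $\Phi^2$ is ergodic and $A, B \in \mathcal{B}$, then by (13) applied to $\Phi^2$

\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(A \cap \phi^{-k}(B))^2
= \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} (\mu \times \mu)\bigl( (A \times A) \cap (\phi \times \phi)^{-k}(B \times B) \bigr)
= \mu(A)^2 \mu(B)^2,
\]

as was to be shown. ∎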

It is time now to discuss some of the spectral properties of the operator $T_\phi$ on $L_2$. For this purpose, we consider $T_\phi$ to be operating on complex $L_2$. If $f \in L_2$ is a nonzero, complex-valued function such that $Tf = \lambda f$ for some complex number $\lambda$, we say that $\lambda$ is an eigenvalue and $f$ an eigenfunction of $T$. The collection of all eigenvalues of $T$ is called the point spectrum of $T$. If $S$ is a $T$-invariant subspace ($TS = S$) of $L_2$ containing no eigenfunctions of $T$, we say that $T$ has continuous spectrum on $S$. An eigenvalue $\lambda$ of $T$ is simple if $Tf = \lambda f$, $Tg = \lambda g$ implies that $g$ is a constant multiple of $f$. If the invariant subspace $S$ has a basis consisting of functions $f_{i,j}$ ($i = 1, 2, 3, \ldots$; $j = 0, \pm 1, \pm 2, \ldots$) with $T f_{i,j} = f_{i,j+1}$ for each $i$ and $j$, we say that $T$ has countable Lebesgue spectrum on $S$.

Theorem 1.6 Let $\Phi$ be an invertible abstract dynamical system, and let $T_\phi$ be the induced operator on (complex) $L_2$. Then 1 is an eigenvalue of $T_\phi$, and all eigenvalues have absolute value 1.

(i) If $\Phi$ is ergodic, then all eigenvalues are simple, and they form a subgroup of the multiplicative group $K = \{z : |z| = 1\}$.

(ii) If $\Phi$ is weakly mixing, then $T_\phi$ has continuous spectrum on the complement of the space of constant functions.

(iii) If $T_\phi$ has countable Lebesgue spectrum on the complement of the space of constant functions, then $\Phi$ is strongly mixing.

Remarks 1 Constants are eigenfunctions corresponding to the eigenvalue 1. Thus $T_\phi$ cannot have continuous spectrum or Lebesgue spectrum on any space containing constants. By the complement of the space of constant functions, we mean the uniquely defined space $S \subset L_2$ such that $L_2 = S + \{\text{constants}\}$ and every function in $S$ is orthogonal to 1, that is,

\[ S = \left\{ f - \int_X f \, d\mu : f \in L_2 \right\}. \]

2 The condition in (ii) is necessary and sufficient for $\Phi$ to be weakly mixing. For a proof see, for example, [32, pp. 39ff.].

Proof We have already remarked that the constant functions are invariant, and hence that 1 belongs to the point spectrum. Since $T_\phi$ is unitary, all of its eigenvalues have absolute value one. Alternatively, if $T_\phi f = \lambda f$, then $T_\phi |f| = |\lambda|\,|f|$, and since $\int T_\phi |f| \, d\mu = |\lambda| \int |f| \, d\mu = \int |f| \, d\mu \ne 0$, it follows that $|\lambda| = 1$.

(i) According to Definition 1.4 and Proposition 1.3, $\Phi$ is ergodic iff 1 is a simple eigenvalue for $T_\phi$. If $\Phi$ is ergodic, and if $T_\phi f = \lambda f$, then $T_\phi |f| = |f|$, so that $|f|$ must be a constant. If, in addition, $T_\phi g = \lambda g$, then $T_\phi (f/g) = f/g$, so that $f/g$ is a constant. Finally, if $T_\phi f = \lambda_1 f$ and $T_\phi g = \lambda_2 g$, then $T_\phi (f/g) = (\lambda_1/\lambda_2)(f/g)$, so that $\lambda_1/\lambda_2$ is an eigenvalue.

(ii) Suppose $\Phi$ is weakly mixing. Then $\Phi^2$ is ergodic. Suppose $T_\phi f = \lambda f$, and let $g(x, y) = f(x)\,\overline{f(y)}$. Then

\[ g(\phi(x), \phi(y)) = f(\phi(x))\,\overline{f(\phi(y))} = \lambda f(x)\, \bar\lambda\, \overline{f(y)} = g(x, y). \]

Thus $g$ must be a constant. Hence $f$ is a constant and $\lambda = 1$.

(iii) If $f$ is a constant, then $(T_\phi^n f, g) = (f, g) = (f, 1)(1, g)$ for all $n$ and each $g \in L_2$. If $f = f_{i,j}$, $g = f_{p,q}$, then $(T_\phi^n f, g) = (f_{i,j+n}, f_{p,q}) = 0$ for all sufficiently large $n$. Since the functions $f_{i,j}$ plus the constant function 1 form a basis for $L_2$, it follows that

\[ \lim_{n\to\infty} (T_\phi^n f, g) = (f, 1)(1, g) \quad (f, g \in L_2). \]

In particular, this is true when $f$ and $g$ are characteristic functions and so $\Phi$ is strongly mixing. ∎

Let us look again at the examples of Section 1.

Example 1 Suppose first that $\alpha$ is irrational. In this case, $\Phi$ is ergodic but not weakly mixing, hence not strongly mixing. Using the alternate description of $\Phi$ on $K = \{z : |z| = 1\}$, we see that

\[ T_\phi f(z) = f(e^{2\pi i \alpha} z). \]

If $f_n(z) = z^n$, then

\[ T_\phi f_n(z) = e^{2\pi i n \alpha} z^n = c^n f_n(z), \quad \text{where } c = e^{2\pi i \alpha}. \]

Thus, for $n \ne 0$, $f_n$ is an eigenfunction with eigenvalue $\lambda = c^n \ne 1$. According to Theorem 1.6, $\Phi$ is not weakly mixing. On the other hand, any function $f \in L_2$ can be expanded in a Fourier series:

\[ f = \sum_{n=-\infty}^{\infty} a_n f_n, \]

which converges to $f$ in $L_2$. Thus if $T_\phi f = f$ it follows that $f$ also has the expansion

\[ f = T_\phi f = \sum_{n=-\infty}^{\infty} a_n c^n f_n. \]

By the uniqueness of the Fourier coefficients, it follows that $a_n = a_n c^n$ for each $n$. This means that $a_n = 0$ for $n \ne 0$, and so $f$ is a constant. If $\alpha$ is rational, then $c^n = 1$ for some positive integer $n$. Thus $T_\phi$ has nonconstant eigenfunctions with eigenvalue 1, and so $\Phi$ is not ergodic.

Example 2 This system is equivalent to the system $\Phi$ of Example 3. This equivalence clearly preserves all properties of ergodicity and mixing. Note also that there is an induced unitary equivalence of the corresponding $L_2$ spaces. The system $\Phi$ is not invertible, so Theorem 1.6 does not apply.
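Returning to the rotation of Example 1, the eigenfunction relation $T_\phi f_n = c^n f_n$ can be checked numerically. The sketch below is an illustration only: the angle `alpha`, the sample point `z`, and the helper names `T` and `f_n` are choices made here, not notation from the text.

```python
import cmath
import math

# Illustrative assumption: an irrational rotation angle.
alpha = math.sqrt(2) - 1
c = cmath.exp(2j * math.pi * alpha)      # c = e^{2 pi i alpha}

def T(f):
    """Induced operator of the rotation z -> c z: (T f)(z) = f(c z)."""
    return lambda z: f(c * z)

def f_n(n):
    """The character f_n(z) = z^n on the unit circle."""
    return lambda z: z ** n

z = cmath.exp(0.7j)                      # an arbitrary point with |z| = 1
for n in range(-3, 4):
    lhs = T(f_n(n))(z)
    rhs = c ** n * f_n(n)(z)
    print(n, abs(lhs - rhs))             # differences at rounding-error level
```

Each $f_n$ with $n \ne 0$ is thus a nonconstant eigenfunction, which is what rules out weak mixing in the argument above.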


Example 3 $\Phi'$ is strongly mixing. This follows from Theorem 1.6 by taking

\[ f_{p,q}(x) = e^{2\pi i p x_q / k}. \tag{18} \]

This example is a special case of a theorem about automorphisms of groups to be proved in Chapter III. It is fairly clear, and it will follow from a theorem on inverse limits, that the strong mixing property for $\Phi$ is equivalent to the same property for $\Phi'$. We have not shown yet that there exist systems $\Phi$ that are weakly mixing but not strongly mixing. This is a surprisingly difficult task, especially since it is now known that "most" systems are of this type. An example is given at the end of Section 6 and in the exercises.

4. PRODUCTS AND FACTORS

We begin now the study of methods for constructing new dynamical systems from given ones. This will lead in later chapters to representation theorems, whereby we express more complicated systems in terms of simpler, more familiar ones. The first such construction is the direct product $\Phi \otimes \Psi$ of dynamical systems $\Phi$ and $\Psi$. We have already used a special case of this construction, namely $\Phi \otimes \Phi = \Phi^2$, in Proposition 1.5.

Definition 1.7 We define the direct product $\Phi_1 \otimes \Phi_2$ of abstract dynamical systems $\Phi_i = (X_i, \mathcal{B}_i, \mu_i, \phi_i)$ ($i = 1, 2$) by

\[ \Phi_1 \otimes \Phi_2 = (X_1 \times X_2,\; \mathcal{B}_1 \times \mathcal{B}_2,\; \mu_1 \times \mu_2,\; \phi_1 \times \phi_2), \]

where $(\phi_1 \times \phi_2)(x_1, x_2) = (\phi_1(x_1), \phi_2(x_2))$. More generally, if $\Phi_\alpha = (X_\alpha, \mathcal{B}_\alpha, \mu_\alpha, \phi_\alpha)$ is an abstract dynamical system for each $\alpha \in J$, we define the direct product $\Phi = \bigotimes_{\alpha \in J} \Phi_\alpha$ by taking the product measure structure on the product space $X = \prod_{\alpha \in J} X_\alpha$ and defining

\[ \phi(x) = y, \quad \text{where } y_\alpha = \phi_\alpha(x_\alpha). \tag{19} \]

We shall make use of customary modifications of this notation, such as $\Phi_1 \otimes \Phi_2 \otimes \cdots \otimes \Phi_n$ and $\bigotimes_{n=1}^{\infty} \Phi_n$.

Proposition 1.6 The product of a weakly mixing system and an ergodic system is ergodic. The product of two weakly (strongly) mixing systems is weakly (strongly) mixing.


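Proof Let $\Phi_1$ and $\Phi_2$ be the two systems. It suffices to prove that (13) or (14) or the defining relation for strong mixing holds for pairs of measurable rectangles. That is, we need to show for all $A, B \in \mathcal{B}_1$ and $C, D \in \mathcal{B}_2$ that

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu_1(A \cap \phi_1^{-k}(B))\, \mu_2(C \cap \phi_2^{-k}(D)) = \mu_1(A)\,\mu_1(B)\,\mu_2(C)\,\mu_2(D), \tag{20} \]

or

\[ *\text{-}\lim_{n\to\infty} \mu_1(A \cap \phi_1^{-n}(B))\, \mu_2(C \cap \phi_2^{-n}(D)) = \mu_1(A)\,\mu_1(B)\,\mu_2(C)\,\mu_2(D), \tag{21} \]

or

\[ \lim_{n\to\infty} \mu_1(A \cap \phi_1^{-n}(B))\, \mu_2(C \cap \phi_2^{-n}(D)) = \mu_1(A)\,\mu_1(B)\,\mu_2(C)\,\mu_2(D), \tag{22} \]

where $\Phi_1$ is weakly mixing and $\Phi_2$ is ergodic for (20), both are weakly mixing for (21), and both are strongly mixing for (22). The last one is completely obvious, while (21) depends only on knowing that the union of two sets of density zero has density zero. To prove (20) we note that, for a given $\varepsilon > 0$ and for all $k$ larger than some $n_0 = n_0(\varepsilon)$, we may replace $\mu_1(A \cap \phi_1^{-k}(B))$ by $\mu_1(A)\mu_1(B) + e_k$ with $|e_k| < \varepsilon$, except when $k$ belongs to some set $J$ of density zero. Thus

\[ \left| \frac{1}{n} \sum_{k=0}^{n-1} \mu_1(A \cap \phi_1^{-k}(B))\, \mu_2(C \cap \phi_2^{-k}(D)) - \mu_1(A)\mu_1(B)\, \frac{1}{n} \sum_{k=0}^{n-1} \mu_2(C \cap \phi_2^{-k}(D)) \right| \le \frac{n_0}{n} + \frac{n - n_0}{n}\,\varepsilon + \frac{1}{n}\,|J_n|. \]

The first and last terms on the right tend to zero as $n \to \infty$. Thus the two terms on the left have the same limit, namely the right side of (20). ∎

We shall see in the next section that the product of any number of weakly (strongly) mixing systems is weakly (strongly) mixing. Note, however, that the union of countably many sets of density zero need not have density zero.

Suppose that $\Phi = \Phi_1 \otimes \Phi_2$. Define $\psi : X \to X_1$ by $\psi(x, y) = x$. It follows easily that $\psi\phi = \phi_1\psi$. That is, the diagram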

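\[
\begin{array}{ccc}
X & \xrightarrow{\;\phi\;} & X \\
\downarrow{\scriptstyle \psi} & & \downarrow{\scriptstyle \psi} \\
X_1 & \xrightarrow{\;\phi_1\;} & X_1
\end{array}
\tag{23}
\]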

commutes. Moreover, $\psi$ is measure-preserving. It is possible to have systems $\Phi$ and $\Phi_1$ related by a map $\psi : X \to X_1$ for which diagram (23) commutes without $\Phi$ being of the form $\Phi_1 \otimes \Phi_2$.

Definition 1.8 We shall say that the dynamical system $\Phi_1 = (X_1, \mathcal{B}_1, \mu_1, \phi_1)$ is a factor of the system $\Phi = (X, \mathcal{B}, \mu, \phi)$ if there exists a measure-preserving map $\psi : X \to X_1$ such that diagram (23) commutes. In this case, we write $\Phi_1 \mid \Phi$ and $\psi : \Phi \to \Phi_1$, or $\Phi \xrightarrow{\psi} \Phi_1$. The map $\psi$ is called a homomorphism of $\Phi$ onto $\Phi_1$. If $\Phi = \Phi_1 \otimes \Phi_2$, $\Phi_1$ is called a direct factor of $\Phi$.

Note As usual in this chapter, when we write $\psi\phi = \phi_1\psi$, or indicate it by a diagram like (23), we mean that equality holds pointwise almost everywhere.

Suppose that $\Phi_1 \mid \Phi$. Let $\mathcal{B}_1' = \{\psi^{-1}(A) : A \in \mathcal{B}_1\}$. Then $\mathcal{B}_1' \subseteq \mathcal{B}$, and according to Definition 1.2, the systems $\Phi_1$ and $\Phi_1' = (X, \mathcal{B}_1', \mu, \phi)$ are equivalent. Thus we may always assume that the factors of $\Phi$ are of this latter form. That is, the factors $\Phi_1$ of $\Phi$ may be identified with the sub-σ-algebras $\mathcal{B}_1$ of $\mathcal{B}$ which are invariant, in the sense that $\phi^{-1}(\mathcal{B}_1) = \{\phi^{-1}(B) : B \in \mathcal{B}_1\} \subseteq \mathcal{B}_1$. Note that the factor $\Phi_1$ is an invertible system iff $\mathcal{B}_1$ is totally invariant, that is, $\phi^{-1}(\mathcal{B}_1) = \mathcal{B}_1$ (Exercise 22).

It might be imagined that two dynamical systems $\Phi_1$ and $\Phi_2$ for which $\Phi_1 \mid \Phi_2$ and $\Phi_2 \mid \Phi_1$ are isomorphic, hence equivalent in the sense of Definition 1.2. However, it is not known if this is true even in the case where $\phi_1$ and $\phi_2$ are the identity; that is, the problem is unsolved even for measure spaces. Since the condition $\phi_1\psi = \psi\phi_2$ gives us a further restriction on the map $\psi$, the conjecture might conceivably be false for measure spaces, but true for ergodic dynamical systems, for example. This also is unsettled. It has become customary when $\Phi_1 \mid \Phi_2$ and $\Phi_2 \mid \Phi_1$ to say that $\Phi_1$ and $\Phi_2$ are weakly isomorphic.

Example Let us continue with Example 3 of Section 1. Recall that $\Phi$ was the one-sided shift on $k$ points, with $X = \prod_{n=1}^{\infty} X_n$, and $\Phi'$ was the two-sided shift, with $X' = \prod_{n=-\infty}^{\infty} X_n$. Thus $\Phi'$ is invertible, but $\Phi$ is not. Define $\psi : X' \to X$ by $\psi(\ldots, x_{-1}, x_0, x_1, \ldots) = (x_1, x_2, \ldots)$. If

\[ C = \{x \in X : (x_{n_1}, \ldots, x_{n_r}) \in A\} \tag{24} \]

is an arbitrary cylinder set in $X$, then $\psi^{-1}(C)$ has the same description with $X$ replaced by $X'$. Thus $\psi$ is measurable and measure-preserving. Clearly $\phi\psi = \psi\phi'$, so that $\Phi \mid \Phi'$. In particular, it follows from the following proposition and the discussion at the end of the preceding section that $\Phi$ is strongly mixing.

Proposition 1.7 Suppose that $\Phi_1 \mid \Phi$ and that $\Phi$ is (1) ergodic, (2) weakly mixing, or (3) mixing. Then $\Phi_1$ has the same property.

Proof If we represent $\Phi_1$ in the form $\Phi_1 = (X, \mathcal{B}_1, \mu, \phi)$, where $\mathcal{B}_1$ is an invariant sub-σ-algebra of $\mathcal{B}$, then the relation given in (13), (14), or Definition 1.5 is true for all $A, B \in \mathcal{B}$, hence, in particular, for all $A, B \in \mathcal{B}_1$. ∎

5. INVERSE LIMITS

The direct product of infinitely many dynamical systems may be thought of as a limit of finite products in a way which will become clear in the following. On the other hand, the slightly more general notion of inverse limit is also useful in the calculation of entropy (Chapter IV) and the analysis of complex dynamical systems. Rather than a constructive definition, which is possible for the inverse limit of a sequence, we shall give a categorical definition of inverse limit, thus avoiding temporarily some of the sticky problems of existence. That is, our definition will involve only homomorphisms between dynamical systems and the completion of certain commutative diagrams. We note in passing that the direct product could also have been defined categorically.

Recall that a set $J$ is said to be directed if there is defined on $J$ a relation $<$ such that (i) $<$ is a partial order on $J$, and (ii) for each pair $\alpha, \beta \in J$ there is a $\gamma \in J$ such that $\alpha < \gamma$ and $\beta < \gamma$.

Definition 1.9 By an inverse system of abstract dynamical systems we shall mean a triple $(J, \Phi_\alpha, \psi_{\alpha\beta})$ such that $J$ is a directed set; for each $\alpha \in J$, $\Phi_\alpha$ is an abstract dynamical system; and for each pair $\alpha, \beta \in J$ with $\alpha < \beta$, we have $\psi_{\alpha\beta} : \Phi_\beta \to \Phi_\alpha$. An upper bound for such a system is a dynamical system $\Phi$ with a set of homomorphisms $\rho_\alpha : \Phi \to \Phi_\alpha$ ($\alpha \in J$) such that for each $\alpha, \beta \in J$ with $\alpha < \beta$ the diagram

\[
\begin{array}{ccc}
\Phi & & \\
\downarrow{\scriptstyle \rho_\beta} & \searrow{\scriptstyle \rho_\alpha} & \\
\Phi_\beta & \xrightarrow{\;\psi_{\alpha\beta}\;} & \Phi_\alpha
\end{array}
\tag{25}
\]

commutes. Finally, an inverse limit $\hat\Phi$ of the inverse system $(J, \Phi_\alpha, \psi_{\alpha\beta})$ is an upper bound with maps $\psi_\alpha : \hat\Phi \to \Phi_\alpha$ which is a factor of every other upper bound. That is, whenever $\Phi$ is an upper bound with maps $\rho_\alpha : \Phi \to \Phi_\alpha$, there exists a homomorphism $\sigma : \Phi \to \hat\Phi$ such that the diagram

\[
\begin{array}{ccc}
\Phi & & \\
\downarrow{\scriptstyle \sigma} & \searrow{\scriptstyle \rho_\alpha} & \\
\hat\Phi & \xrightarrow{\;\psi_\alpha\;} & \Phi_\alpha
\end{array}
\tag{26}
\]

commutes for each $\alpha \in J$. In this case, we write

\[ \hat\Phi = \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha \quad \text{or} \quad (\hat\Phi, \psi_\alpha) = \operatorname*{inv\,lim}_{\alpha \in J} (\Phi_\alpha, \psi_{\alpha\beta}). \]

Clearly, if $\Phi = (X, \mathcal{B}, \mu, \phi)$ is an upper bound for the system $(J, \Phi_\alpha, \psi_{\alpha\beta})$, then we can represent the $\Phi_\alpha$ as $(X, \mathcal{B}_\alpha, \mu, \phi)$, where the $\mathcal{B}_\alpha$ ($\alpha \in J$) form an increasing net of invariant sub-σ-algebras of $\mathcal{B}$. The mappings $\psi_{\alpha\beta}$ and $\rho_\alpha$ then become the identity mapping on $X$. Moreover, the inverse limit $\hat\Phi$ can be identified with $(X, \hat{\mathcal{B}}, \mu, \phi)$, where $\hat{\mathcal{B}}$ is the smallest σ-algebra containing $\bigcup_\alpha \mathcal{B}_\alpha$. This is true because $\hat{\mathcal{B}}$ reappears as an invariant sub-σ-algebra of $\mathcal{B}$ for any upper bound $\Phi$, so that $\hat\Phi = (X, \hat{\mathcal{B}}, \mu, \phi)$ is a factor of $\Phi$ and the commutativity of (26) is trivial. In fact, this argument shows that any bounded system of abstract dynamical systems has an inverse limit, and that all such inverse limits are equivalent in the sense of Definition 1.2. Thus we have proved the following theorem.

Theorem 1.7 If $(J, \Phi_\alpha, \psi_{\alpha\beta})$ is an inverse system of abstract dynamical systems, and if $\Phi$ is an upper bound with maps $\rho_\alpha$, then

\[ \hat\Phi = (X, \hat{\mathcal{B}}, \mu, \phi) = \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha, \]

where $\hat{\mathcal{B}}$ is the smallest σ-algebra containing all of the $\rho_\alpha^{-1}(\mathcal{B}_\alpha)$. In particular, the inverse limit, when it exists, is uniquely determined up to equivalence.

The question of existence of the inverse limit is somewhat more difficult. The usual approach is to define the inverse limit set

\[ X_\infty = \Bigl\{ x \in \prod_{\alpha \in J} X_\alpha : \psi_{\alpha\beta} x_\beta = x_\alpha \text{ for all } \alpha, \beta \in J,\ \alpha < \beta \Bigr\}, \tag{27} \]

define the projections $\rho_\alpha : X_\infty \to X_\alpha$ in the obvious way, and attempt to extend the measures $\mu_\alpha \circ \rho_\alpha$ from $\mathcal{B}_\alpha' = \rho_\alpha^{-1}(\mathcal{B}_\alpha)$ to the σ-algebra $\mathcal{B}_\infty$ generated by $\mathcal{B}_0 = \bigcup_\alpha \mathcal{B}_\alpha'$. However, it is known (see, e.g., [31, p. 214]) that this is not always possible. For the most part, we shall be interested in inverse limits only when we have an explicit representation. However, the following theorem is not without interest. The proof [12, 14] is omitted.

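Theorem 1.8 Let $(J, \Phi_\alpha, \psi_{\alpha\beta})$ be an inverse system of abstract dynamical systems. Then there is a system $(J, \Phi_\alpha', \psi_{\alpha\beta}')$ such that $\Phi_\alpha'$ is equivalent to $\Phi_\alpha$ ($\alpha \in J$) under a set of equivalences which carry the $\psi_{\alpha\beta}$ into $\psi_{\alpha\beta}'$ and such that

\[ \Phi_\infty = \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha' \]

exists. Moreover, $\Phi_\infty$ is defined on the inverse limit set (27) for $(J, \Phi_\alpha', \psi_{\alpha\beta}')$.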

We shall see several examples of inverse limits in Chapter III. (See also Exercise 23.) For now we consider only two simple, but important examples. The first and most obvious is the direct product defined in the preceding section. For this we let $I$ be the set of finite subsets of $J$, directed by set inclusion. Then

\[ \bigotimes_{\alpha \in J} \Phi_\alpha = \operatorname*{inv\,lim}_{(\alpha_1, \ldots, \alpha_n) \in I} \Phi_{\alpha_1} \otimes \Phi_{\alpha_2} \otimes \cdots \otimes \Phi_{\alpha_n}. \]

The maps $\psi_\alpha$ and $\psi_{\alpha\beta}$ are the obvious "finite-dimensional" projections, and a routine verification shows that the appropriate diagrams commute. If $\Phi$ is any other upper bound, then the map $\sigma : \Phi \to \bigotimes_{\alpha \in J} \Phi_\alpha$ given by $\sigma(x) = (\rho_\alpha(x))_{\alpha \in J}$ completes diagram (26).

As an application of Theorem 1.8, we give the following construction due to Rohlin [51]. If $\Phi$ is a noninvertible dynamical system, it is possible to define an invertible system $\hat\Phi$, called the natural extension of $\Phi$, such that $\Phi$ is a factor of $\hat\Phi$, and $\hat\Phi$ is a factor of any other invertible system of which $\Phi$ is a factor (see Proposition 1.9 below). For each positive integer $n$ let $\Phi_n = \Phi$ and $\psi_{nm} = \phi^{m-n}$ for $m > n$. This defines an inverse system indexed by the set $J$ of positive integers. Let $\hat\Phi = \operatorname{inv\,lim}_{n \in J} \Phi_n = \operatorname{inv\,lim}_{n\to\infty} \Phi_n$. Taking $\hat\Phi = \Phi_\infty$ as in Theorem 1.8, and noting that we can write

\[ X_\infty = \Bigl\{ x \in \prod_{n=1}^{\infty} X_n : x_n = \phi(x_{n+1}) \text{ for each } n \Bigr\}, \]

we see that

\[ \hat\phi(x_1, x_2, x_3, \ldots) = (\phi(x_1), \phi(x_2), \phi(x_3), \ldots) = (\phi(x_1), x_1, x_2, \ldots). \]

Thus $\hat\phi$ is one-to-one, and its inverse

\[ \hat\phi^{-1}(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots) \]

is also measurable. That is, $\hat\Phi$ is invertible. Of course, if $\Phi$ is invertible, then $\hat\Phi$ is isomorphic to $\Phi$. In fact, $\rho_1$ is an isomorphism, since $x_n = \phi^{-(n-1)} x_1$ for $x \in X_\infty$.

Proposition 1.8 Let $(J, \Phi_\alpha, \psi_{\alpha\beta})$ be an inverse system of dynamical systems, and let $J_0 \subseteq J$ have the property that for each $\alpha \in J$ there is a $\beta \in J_0$ such that $\alpha < \beta$. Then $(J_0, \Phi_\alpha, \psi_{\alpha\beta})$ is an inverse system, and

\[ \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha = \operatorname*{inv\,lim}_{\alpha \in J_0} \Phi_\alpha. \]

Proof This follows from the corresponding property for σ-algebras. Thus, if $\Phi_\alpha = (X, \mathcal{B}_\alpha, \mu, \phi)$, it is clear that $\bigcup_{\alpha \in J} \mathcal{B}_\alpha = \bigcup_{\alpha \in J_0} \mathcal{B}_\alpha$, and the result follows from Theorem 1.7. ∎

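The proofs of the following two propositions are routine verifications and will be omitted.

Proposition 1.9 If $\tau_\alpha : \Sigma_\alpha \to \Phi_\alpha$ is a homomorphism for each $\alpha \in J$, then $\operatorname{inv\,lim}_{\alpha \in J} (\Phi_\alpha, \psi_{\alpha\beta})$ is a factor of $\operatorname{inv\,lim}_{\alpha \in J} (\Sigma_\alpha, \sigma_{\alpha\beta})$, provided that the diagrams

\[
\begin{array}{ccc}
\Sigma_\beta & \xrightarrow{\;\sigma_{\alpha\beta}\;} & \Sigma_\alpha \\
\downarrow{\scriptstyle \tau_\beta} & & \downarrow{\scriptstyle \tau_\alpha} \\
\Phi_\beta & \xrightarrow{\;\psi_{\alpha\beta}\;} & \Phi_\alpha
\end{array}
\]

commute for each $\alpha, \beta \in J$ with $\alpha < \beta$. In particular, if $\Phi$ is a factor of $\Sigma$, then $\hat\Phi$ is a factor of $\hat\Sigma$, where the overcaret denotes the natural extension.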

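Suppose that $\Phi_1$ and $\Phi_2$ are factors of $\Phi$. Then we may write $\Phi_k = (X, \mathcal{B}_k, \mu, \phi)$ ($k = 1, 2$), where $\Phi = (X, \mathcal{B}, \mu, \phi)$. Let us denote by $\mathcal{B}_1 \vee \mathcal{B}_2$ the smallest σ-algebra containing both $\mathcal{B}_1$ and $\mathcal{B}_2$. We define the join of $\Phi_1$ and $\Phi_2$ to be $\Phi_1 \vee \Phi_2 = (X, \mathcal{B}_1 \vee \mathcal{B}_2, \mu, \phi)$. Of course, the notation and terminology extend to joins of arbitrary families of sub-σ-algebras of $\mathcal{B}$ and of factors of $\Phi$.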


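Proposition 1.10 If $\Phi_\alpha^1$ and $\Phi_\alpha^2$ are factors of $\Phi_\alpha$ for each $\alpha \in J$, then

\[ \operatorname*{inv\,lim}_{\alpha \in J} (\Phi_\alpha^1 \vee \Phi_\alpha^2) = \Bigl( \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha^1 \Bigr) \vee \Bigl( \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha^2 \Bigr), \]

where the latter join is as factors of $\operatorname{inv\,lim}_{\alpha \in J} \Phi_\alpha$. In particular,

\[ \operatorname*{inv\,lim}_{\alpha \in J} (\Phi_\alpha^1 \otimes \Phi_\alpha^2) = \Bigl( \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha^1 \Bigr) \otimes \Bigl( \operatorname*{inv\,lim}_{\alpha \in J} \Phi_\alpha^2 \Bigr). \]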

Proposition 1.11 The inverse limit $\operatorname{inv\,lim}_{\alpha \in J} \Phi_\alpha = \Phi$ is (1) ergodic, (2) weakly mixing, or (3) mixing iff each $\Phi_\alpha$ has the same property.

Proof Since each $\Phi_\alpha$ is a factor of $\Phi$, the result follows in one direction from Proposition 1.7. To prove the converse, let us denote $\Phi_\alpha = (X, \mathcal{B}_\alpha, \mu, \phi)$, where $\Phi = (X, \mathcal{B}, \mu, \phi)$. According to Theorem 1.7, the algebra $\mathcal{B}_0 = \bigcup_{\alpha \in J} \mathcal{B}_\alpha$ is dense in $\mathcal{B}$. Thus (see Exercise 19) condition (13) or (14) or the defining property of mixing holds on $\mathcal{B}$ iff it holds on $\mathcal{B}_0$. But the latter is true iff it holds on each $\mathcal{B}_\alpha$. ∎

Corollary The natural extension $\hat\Phi$ of $\Phi$ is (1) ergodic, (2) weakly mixing, or (3) mixing iff $\Phi$ is.

6. INDUCED SYSTEMS

In 1943, Kakutani [36] introduced the idea of a transformation induced by a measure-preserving transformation $\phi$ on a subset $A$ of positive measure. The idea is to localize the system and only observe $\phi^n(x)$ when it is in $A$. This has been a very fruitful idea for constructing examples and has recently begun to play a role in the theory of abstract dynamical systems somewhat analogous to that of factor systems. The basis of the construction is the recurrence theorem of Poincaré (Theorem 1.5). Thus if $\phi$ is a measure-preserving transformation on a finite measure space $(X, \mathcal{B}, \mu)$, and if $A \in \mathcal{B}$ is a measurable set of positive measure, then for almost every $x \in A$ there is a positive integer $n = n_A(x)$ such that $\phi^n(x) \in A$, but $\phi(x), \phi^2(x), \ldots, \phi^{n-1}(x) \notin A$.

Definition 1.10 The induced transformation on a set $A \in \mathcal{B}$ with $\mu(A) > 0$ is the transformation $\phi_A : A \to A$ defined by $\phi_A(x) = \phi^{n_A(x)}(x)$, where $n_A(x)$ is the smallest positive integer $n$ such that $\phi^n(x) \in A$. The induced dynamical system is $\Phi_A = (A, \mathcal{B}_A, \mu_A, \phi_A)$, where $\mathcal{B}_A = \{A \cap B : B \in \mathcal{B}\}$ and $\mu_A$ is the normalized (total measure one) restriction of $\mu$ to $\mathcal{B}_A$.
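The definition of $\phi_A$ can be sketched directly as a first-return computation. The example map (a rotation by $3/10$ in exact rational arithmetic), the set $A = [0, 1/4)$, the cutoff `max_steps`, and the function name `induced_map` are assumptions made for illustration; the rational rotation is chosen only so that the arithmetic is exact, not because it is ergodic.

```python
from fractions import Fraction

def induced_map(phi, in_A, x, max_steps=10_000):
    """Return (phi_A(x), n_A(x)) for a point x of A, following the orbit until it re-enters A."""
    if not in_A(x):
        raise ValueError("x must lie in A")
    y = x
    for n in range(1, max_steps + 1):
        y = phi(y)
        if in_A(y):
            return y, n          # first return: phi_A(x) = phi^n(x), n = n_A(x)
    raise RuntimeError("no return to A within max_steps")

# Illustrative assumptions: rotation by 3/10 on [0, 1) and A = [0, 1/4).
def phi(x):
    return (x + Fraction(3, 10)) % 1

def in_A(x):
    return 0 <= x < Fraction(1, 4)

point, n_return = induced_map(phi, in_A, Fraction(1, 10))
print(point, n_return)
```

Here the orbit of $1/10$ visits $4/10$ and $7/10$ before landing at $0 \in A$, so the return time is 3. The cutoff reflects the fact that the return time is only guaranteed to be finite for almost every point.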


Of course, $\phi_A$ is in general only defined for almost all $x \in A$. Its definition may be extended arbitrarily to all of $A$.

Theorem 1.9 The induced transformation $\phi_A$ is measure preserving. Thus $\Phi_A$ is an abstract dynamical system. If $\Phi$ is invertible, so is $\Phi_A$.

Proof Define for $n \ge 1$

\[ A_n = \{x \in A : n_A(x) = n\} = \{x : x, \phi^n(x) \in A;\ \phi(x), \ldots, \phi^{n-1}(x) \notin A\}, \]
\[ B_n = \{x : x, \phi(x), \ldots, \phi^{n-1}(x) \notin A;\ \phi^n(x) \in A\}. \]

Since $\phi$ is measurable, we have $A_n, B_n \in \mathcal{B}$. Moreover, $A_n \subseteq A$ and for each $C \in \mathcal{B}_A$

\[ \phi_A^{-1}(C) = \bigcup_{n=1}^{\infty} \left[ A_n \cap \phi^{-n}(C) \right]. \tag{28} \]

It follows that $\phi_A$ is measurable on $(A, \mathcal{B}_A)$. Now the sets $A_n$ ($n = 1, 2, \ldots$) form a disjoint partition of (almost all of) $A$, and the sets $B_n$ ($n = 1, 2, \ldots$) form a disjoint partition of the set of all points whose "orbits" intersect $A$, minus $A$ itself (almost all of $X - A$ in the case of an ergodic $\Phi$). Also

\[ \phi^{-1}(A) = A_1 \cup B_1, \qquad \phi^{-1}(B_n) = A_{n+1} \cup B_{n+1} \quad (n \ge 1). \tag{29} \]

For any $C \in \mathcal{B}$ with $C \subseteq A$, it follows by repeated application of (29) that

\[ \phi^{-1}(C) = \left[ A_1 \cap \phi^{-1}(C) \right] \cup \left[ B_1 \cap \phi^{-1}(C) \right], \]
\[ \phi^{-1}\left[ B_n \cap \phi^{-n}(C) \right] = \left[ A_{n+1} \cap \phi^{-(n+1)}(C) \right] \cup \left[ B_{n+1} \cap \phi^{-(n+1)}(C) \right], \]

or, since $\phi$ is measure preserving,

\[ \mu(C) = \sum_{k=1}^{n} \mu(A_k \cap \phi^{-k}(C)) + \mu(B_n \cap \phi^{-n}(C)). \]

Since the $B_n$ are pairwise disjoint, the last term tends to zero. Thus from (28)

\[ \mu(C) = \sum_{n=1}^{\infty} \mu(A_n \cap \phi^{-n}(C)) = \mu(\phi_A^{-1}(C)). \]

Now suppose that $\Phi$ is invertible. Then, of course, $\phi^{-1}$ is measure preserving, and we can define $(\phi^{-1})_A$, the induced transformation on $A$. We shall show that $(\phi^{-1})_A = (\phi_A)^{-1}$. By symmetry it is sufficient to show that $(\phi^{-1})_A(\phi_A(x)) = x$ for almost all $x \in A$. Suppose that $x \in A_n$. Then $\phi_A(x) = \phi^n(x) = y \in A$. Clearly, $\phi^{-n}(y) = x \in A$. Suppose that $z = \phi^{-m}(y) \in A$ for some $m$, $1 \le m < n$. Then $\phi^m(z) = y = \phi^m(\phi^{n-m}(x))$. But $\phi^{n-m}(x) \notin A$, and this contradicts the fact that $\phi^m$ is one-to-one. It follows that $(\phi^{-1})_A(y) = \phi^{-n}(y) = x$, as was to be shown. ∎

Figure 1. (a) Induced transformation; (b) inverse construction.

Proposition 1.12 If $\Phi$ is ergodic, so is $\Phi_A$.

Proof Suppose that $C \in \mathcal{B}$, $C \subseteq A$, and $\phi_A^{-1}(C) \subseteq C$. Define

\[ D = \bigcup_{n=1}^{\infty} \left\{ \left[ A_n \cap \phi^{-n}(C) \right] \cup \left[ B_n \cap \phi^{-n}(C) \right] \right\}. \]

According to (28), $A_n \cap \phi^{-n}(C) \subseteq C$ for each $n$, and so $A - C \subseteq \tilde D$. On the other hand, by (29)

\[ \phi^{-1}\left[ A_n \cap \phi^{-n}(C) \right] \subseteq \phi^{-1}(C) = \left[ A_1 \cap \phi^{-1}(C) \right] \cup \left[ B_1 \cap \phi^{-1}(C) \right] \]

and

\[ \phi^{-1}\left[ B_n \cap \phi^{-n}(C) \right] = \left[ A_{n+1} \cap \phi^{-(n+1)}(C) \right] \cup \left[ B_{n+1} \cap \phi^{-(n+1)}(C) \right]. \]

Thus $\phi^{-1}(D) \subseteq D$. Since $\Phi$ is ergodic, either $\mu(D) = 0$ or $\mu(\tilde D) = 0$. Suppose that $\mu(D) = 0$. It follows from (28) that $\mu(\phi_A^{-1}(C)) = 0$, and hence that $\mu(C) = 0$. Likewise, if $\mu(\tilde D) = 0$, then from the preceding $\mu(A - C) = 0$, and hence $\mu_A(\tilde C) = 0$. It follows that $\Phi_A$ is ergodic. ∎

In case $\Phi$ is invertible, there is an interesting way of describing the transformation $\phi$ in terms of $\phi_A$ and the sets $B_n$. This will lead to another new construction, which is, in a definite sense described below, the inverse of the induced transformation construction.


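Let us write $B_0 = A$, so that $X = \bigcup_{n=0}^{\infty} B_n$. Note that $\phi$ maps $B_{n+1}$ onto a subset of $B_n$ for each $n$, and that $\phi^{-1}$ maps each $x \in B_n$ not in $\phi(B_{n+1})$ onto the point $\phi_A^{-1}\phi^n(x)$. Now suppose we are given a disjoint sequence of sets $B_n \in \mathcal{B}$ ($n = 0, 1, 2, \ldots$), where $(X, \mathcal{B}, \mu)$ is a finite or σ-finite measure space. Suppose further that $\mu(B_{n+1}) \le \mu(B_n) < \infty$ and $\mu(B_n) \to 0$ as $n \to \infty$. For each $n$ let $\alpha_n : B_{n+1} \to B_n$ be an invertible measure-preserving transformation of $B_{n+1}$ onto $\alpha_n(B_{n+1})$. Let $\phi_0 : B_0 \to B_0$ be an invertible measure-preserving transformation of $B_0$ onto itself. We define a mapping $\phi : Y \to Y$, where $Y = \bigcup_{n=0}^{\infty} B_n$, by

\[
\phi(x) =
\begin{cases}
\alpha_n(x) & \text{if } x \in B_{n+1}, \\
\alpha_{n-1}^{-1} \cdots \alpha_0^{-1} \phi_0(x) & \text{if } x \in B_0,
\end{cases}
\]

where in the second case $n \ge 0$ is the largest integer for which $\phi_0(x) \in \alpha_0 \alpha_1 \cdots \alpha_{n-1}(B_n)$ (for $n = 0$ this reads $\phi(x) = \phi_0(x)$).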

Theorem 1.10 The mapping $\phi$ is an invertible measure-preserving transformation. If $\mu(Y) = 1$, $\Phi = (Y, \mathcal{B}, \mu, \phi)$ is an abstract dynamical system, and $\phi_0$ is the transformation induced by $\phi$ on $B_0$. If $\phi_0$ is ergodic, so is $\phi$.

We leave the proof as an exercise.

Suppose $\Phi_1$ and $\Phi_2$ are invertible, ergodic dynamical systems. Let us write $\Phi_1 < \Phi_2$ if $\Phi_1$ is isomorphic to a system induced by $\Phi_2$ on some set $A$ of positive measure. In Kakutani's terminology, $\Phi_1$ is a derivative of $\Phi_2$, and $\Phi_2$ is a primitive of $\Phi_1$. There is a clear analogy to the theory of factors discussed in Section 4, and again the question arises as to whether $\Phi_1 < \Phi_2$ and $\Phi_2 < \Phi_1$ imply $\Phi_1 \cong \Phi_2$. The construction preceding Theorem 1.10 may be described by saying that $\Phi = (Y, \mathcal{B}, \mu, \phi)$ is constructed on the system $\Phi_0 = (B_0, \mathcal{B}_0, \mu_0, \phi_0)$. From the discussion it is clear that this is equivalent to $\Phi_0 < \Phi$, at least when $\Phi$ is ergodic. A discussion of this in terms of "flows under a function" is given in the exercises.

Example (Kakutani [37]) Let $B_0$ be the unit interval with Lebesgue measure for $\mu$. (We can take, for example, $X = R \times Z$ to be the product of the reals with the integers.) Define $\phi_0$ on $B_0$ by mapping the left half of the dyadic interval $[1 - 1/2^n, 1)$ linearly onto the right half of $[0, 1/2^n)$:

\[ \phi_0(x) = x - 1 + \frac{1}{2^n} + \frac{1}{2^{n+1}}, \qquad 1 - \frac{1}{2^n} \le x < 1 - \frac{1}{2^{n+1}}, \quad n = 0, 1, 2, \ldots. \]

Let B1 be a linear set of length $ “sitting above”


and let $B_n = \emptyset$ ($n > 1$). It is easily seen that $\phi_0$, and therefore also $\phi$, are ergodic. A little more effort (Exercise 35) reveals that $\Phi_0$ has discrete spectrum, that is, $T_{\phi_0}$ has enough eigenfunctions to span $L_2(B_0)$. According to Theorem 1.6, $\Phi_0$ is not weakly mixing. On the other hand, $\Phi$ is weakly mixing, but not strongly mixing (Exercise 36). Thus Proposition 1.12 fails if "ergodic" is replaced by "weakly mixing."
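A minimal sketch of the base map $\phi_0$ of this example, assuming the piecewise formula $\phi_0(x) = x - 1 + 2^{-n} + 2^{-(n+1)}$ on $[1 - 2^{-n}, 1 - 2^{-(n+1)})$ for $n = 0, 1, 2, \ldots$. Exact rational arithmetic is used so that dyadic points are handled without rounding; the function name `phi0` is hypothetical, and the set $B_1$ of the example is not modelled here.

```python
from fractions import Fraction

def phi0(x):
    """Base map on [0, 1) under the assumed piecewise-linear formula."""
    if not (0 <= x < 1):
        raise ValueError("x must lie in [0, 1)")
    n = 0
    # Locate n with 1 - 2^-n <= x < 1 - 2^-(n+1).
    while x >= 1 - Fraction(1, 2 ** (n + 1)):
        n += 1
    return x - 1 + Fraction(1, 2 ** n) + Fraction(1, 2 ** (n + 1))

# Follow the orbit of 0 for a few steps.
x = Fraction(0)
orbit = [x]
for _ in range(7):
    x = phi0(x)
    orbit.append(x)
print(orbit)
```

The first few steps send $0 \to 1/2 \to 1/4 \to 3/4 \to 1/8$, which matches the description of the left half of $[1 - 2^{-n}, 1)$ being moved onto the right half of $[0, 2^{-n})$.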

Figure 2. Kakutani’s example.

EXERCISES

Measure-Preserving Transformations

1. (a) If $\mathcal{C}$ is a class of subsets of some set, let $\mathcal{A}(\mathcal{C})$ denote the smallest algebra of sets containing $\mathcal{C}$, and let $\mathcal{B}(\mathcal{C})$ denote the smallest σ-algebra containing $\mathcal{C}$. Suppose that $(X_i, \mathcal{B}_i, \mu_i)$ ($i = 1, 2$) are finite measure spaces, and that $\phi : X_1 \to X_2$. If $\mathcal{C} \subseteq \mathcal{B}_2$ with $\mathcal{B}(\mathcal{C}) = \mathcal{B}_2$, and if $\phi^{-1}(B) \in \mathcal{B}_1$ for all $B \in \mathcal{C}$, show that $\phi$ is measurable.

(b) If, in addition, $\mathcal{C}$ satisfies

\[ A, B \in \mathcal{C} \implies A - B \text{ is a finite union of pairwise disjoint sets in } \mathcal{C}, \tag{30} \]

and if $\mu_1(\phi^{-1}(C)) = \mu_2(C)$ for all $C \in \mathcal{C}$, then $\phi$ is measure preserving.

(c) The class $\mathcal{C}$ of measurable rectangles in a product space satisfies (30).

2. If $(X, \mathcal{B}, \mu)$ is a σ-finite measure space, we define measure-preserving transformations of $X$ in exactly the same way as for a finite measure space. Does $\phi(x) = x + 2$ define a measure-preserving transformation of (i) the reals with Lebesgue measure, of (ii) the positive reals, of (iii) the integers with counting measure? How about $\phi(x) = 2x$? Show that $\phi(x, y) = (2x, y/2)$ is a measure-preserving transformation of the Euclidean plane.


3. (Baker's transformation) Define $\phi$ on the unit square $[0, 1] \times [0, 1]$ by $\phi(x, y) = (2x, y/2)$ for $0 \le x < \tfrac12$ and $\phi(x, y) = (2x - 1, (y + 1)/2)$ for $\tfrac12 \le x \le 1$.

(a) Show that $\phi$ is measure preserving.

(b) By mapping the sequence $\{x_n\}$ of 0's and 1's onto the point $(x, y)$ such that $x$ has the binary expansion $.x_0 x_1 x_2 \ldots$ and $y$ the expansion $.x_{-1} x_{-2} x_{-3} \ldots$, show that $\phi$ is equivalent to the two-sided shift on two points.

4. Verify Example 2 and show that it is equivalent to the one-sided shift.

5. (Adding machine transformation) Define

\[ (X, \mathcal{B}, \mu) = \prod_{n=1}^{\infty} (X_n, \mathcal{B}_n, \mu_n), \]

where $X_n = \{0, 1, \ldots, k_n\}$, $\mathcal{B}_n$ is the class of all subsets of $X_n$, and $\mu_n = \{p_{n0}, p_{n1}, \ldots, p_{nk_n}\}$. Define $\phi : X \to X$ by

\[
\phi(x_1, x_2, \ldots) =
\begin{cases}
(x_1 + 1, x_2, x_3, \ldots) & \text{if } x_1 < k_1, \\
(0, \ldots, 0, x_p + 1, x_{p+1}, x_{p+2}, \ldots) & \text{if } x_1 = k_1, \ldots, x_{p-1} = k_{p-1},\ x_p < k_p,
\end{cases}
\]
\[ \phi(k_1, k_2, \ldots) = (0, 0, 0, \ldots). \]

Show, as in Example 3, that the inverse image of a cylinder set is a cylinder set. Conclude that $\phi$ is measurable, and that it is measure preserving iff $p_{nj}$ is independent of $j$; namely, $p_{nj} = 1/(k_n + 1)$.
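The carry rule in the definition of the adding machine may be easier to see on finite prefixes. The sketch below applies "add one with carry" to the first few coordinates only; the function name `adding_machine` and the restriction to a finite prefix are assumptions made here for illustration, and the sketch does not address the measure-theoretic questions of the exercise.

```python
def adding_machine(x, k):
    """Add one with carry to the finite prefix x, where 0 <= x[i] <= k[i] for each i.

    If every digit of the prefix is maximal, the prefix wraps to all zeros
    (in the full sequence space the carry would continue into later coordinates).
    """
    y = list(x)
    for i in range(len(y)):
        if y[i] < k[i]:
            y[i] += 1          # first non-maximal digit: increment and stop
            return tuple(y)
        y[i] = 0               # maximal digit: reset and carry to the next one
    return tuple(y)

k = (1, 1, 1)                  # binary digits in each coordinate
x = (0, 0, 0)
for _ in range(8):
    print(x)
    x = adding_machine(x, k)
```

With `k = (1, 1, 1)` the loop runs through all eight binary prefixes, least significant digit first, before returning to `(0, 0, 0)`.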

Doubly Stochastic Operators

6. If $(X, \mathcal{B}, \mu)$ is the unit interval or one of a certain class of "decent" measure spaces, then, for each set function $\psi : \mathcal{B} \to \mathcal{B}$ which preserves finite and countable unions and intersections and also preserves complements, there exists a measurable point transformation $\phi : X \to X$ such that $\psi(B) = \phi^{-1}(B)$ for all $B \in \mathcal{B}$. Thus Proposition 1.2 may be proved by exhibiting such a $\psi$.

(a) If $\chi_A$ is the characteristic function of the set $A$, show that for $T$ a doubly stochastic isometry on $L_2$ and for any $A, B \in \mathcal{B}$

$$\int_X (T\chi_A)(T\chi_B)\, d\mu = \int_X T\chi_{A \cap B}\, d\mu.$$

(b) Show that $0 \le T\chi_A \le 1$ and hence that $0 \le (T\chi_A)^2 \le T\chi_A$.

(c) Use (a) and (b) to show that $(T\chi_A)^2 = T\chi_A$, and hence that $T\chi_A$ is the characteristic function of some set $\psi(A)$.

(d) Show that $T\chi_{A \cap B} \le \min\{T\chi_A, T\chi_B\}$, and hence that

$$0 \le T\chi_{A \cap B} = (T\chi_{A \cap B})^2 \le (T\chi_A)(T\chi_B).$$

(e) Use (a) and (d) and the relations $\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$, $\chi_{X - A} = 1 - \chi_A$, to conclude that $\psi$ preserves finite intersections, finite unions, and complements.

(f) From $\mu(A) = (\chi_A, 1)$ deduce that $\psi$ preserves measure and hence also countable unions and intersections.

7. Suppose that $T$ is an operator on $L_1(X, \mathcal{B}, \mu)$ where $\mu$ is a finite or $\sigma$-finite measure, and suppose $T$ satisfies

(i) $f \ge 0 \Rightarrow Tf \ge 0$,
(ii) $\|Tf\|_1 \le \|f\|_1$,
(iii) $\|Tf\|_\infty \le \|f\|_\infty$,

where $f \in L_1$ for (i) and (ii), and $f \in L_1 \cap L_\infty$ for (iii). Suppose further that $g \in L_1 \cap L_\infty$.

(a) Show that $(Tg - c)^+ \le T(g - c)^+$ for any constant $c$.

(b) Show

$$\int_X (Tg - c)\, h(Tg, c)\, d\mu \le \int_X (g - c)\, h(g, c)\, d\mu,$$

where

$$h(u, v) = \begin{cases} 1 & \text{if } u > v, \\ 0 & \text{if } u \le v. \end{cases}$$

(c) Suppose that $g \ge 0$. Multiply both sides of the above inequality by $c^{p-2}$ and integrate with respect to $c$ from 0 to $\infty$. Apply the Fubini-Tonelli theorem to obtain

and hence $\|Tg\|_p \le \|g\|_p$.

(d) From (c) and $|Tg| \le T|g|$ deduce that $\|T\|_p \le 1$.

8. (a) Let $T$ be a doubly stochastic operator. Suppose that $0 \le f_n \uparrow f$ a.e. with $f \in L_1$. Show that $Tf_n \uparrow Tf$ by showing that $\int_B Tf\, d\mu = \int_B \lim_{n \to \infty} Tf_n\, d\mu$ for each $B \in \mathcal{B}$.

(b) The preceding is a "monotone convergence theorem" for $T$. Formulate and prove a "dominated convergence theorem" for $T$.

Ergodic Theorems

9. Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space. The statement and proof of Theorem 1.1 remain valid in this context.

(a) Show that Corollary 1.1.1 also remains valid as follows. Let $f_n^*(x) = \max_{1 \le k \le n} (1/k) T_k f(x)$ and $A_n^*(f; \alpha) = \{x : f_n^*(x) > \alpha\}$. For fixed $n$ and any measurable set $C$ with finite measure, let $h = f - \alpha\chi_C$. Deduce as in the proof of Theorem 1.1 that $\int_{B_n^*(h)} h\, d\mu \ge 0$ so that

(b) If $\{C_j\}$ is an increasing sequence of measurable sets with finite measure and union $X$, show that as $j \to \infty$ the sequence $B_n^*(f - \alpha\chi_{C_j})$ decreases to $A_n^*(f; \alpha)$ if $\alpha > 0$ or increases to the same limit if $\alpha < 0$. Conclude that

and complete the proof of Corollary 1.1.1 by letting $n \to \infty$.

(c) The proof of Theorem 1.2 now goes through as before. In particular, $\int |\bar f|\, d\mu \le \int |f|\, d\mu$. Show that equality does not always hold by considering the transformation $\phi(x) = x + 1$ on the reals.

10. (a) By an appropriate choice of the function $g$, show that Birkhoff's theorem follows from the Chacon-Ornstein theorem (Theorem 1.3).

(b) Let $P = (p_{ij})$ be an infinite matrix with $\sum_j p_{ij} = 1$ for each $i$ and $p_{ij} \ge 0$ for all $i, j$. Let $Z$ be the integers, and provide it with a measure structure by letting $\mu(A)$ be the number of elements in $A$. Define $T$ by $T\{f_i\} = \{g_j\}$, where $g_j = \sum_i p_{ij} f_i$. Show that $T$ satisfies the hypotheses of Theorem 1.3. Conclude (Kolmogorov's theorem) that

exists, where $p_{ij}^{(k)}$ is the $(i, j)$-entry in $P^k$. Also (ratio limit theorem)

exists.

11. (Mean ergodic theorem) Let $\Phi$ be an abstract dynamical system and $T = T_\phi$.

(a) Suppose $T^*F = F$. By evaluating $\|TF - F\|_2^2 = (TF - F, TF - F)$, show directly that $TF = F$. This gives a simplified proof of Theorem 1.4 for $p = 2$.

(b) If $f \in L_p$, then Theorem 1.4 implies that $(1/n)T_n f \to f^*$ in measure. On the other hand, by Theorem 1.2, $(1/n)T_n f \to \bar f$ in measure. Hence $\bar f = f^*$ a.e. In particular, $\bar f \in L_p$.

(c) Show from (b) that $\int_X \bar f\, d\mu = \int_X f\, d\mu$. This can also be proved directly by considering the restriction of $\phi$ to the invariant set $B(\alpha, \beta) = \{x : \alpha < \bar f(x) \le \beta\}$ and applying Corollary 1.1.1 to obtain

$$\alpha\mu(B(\alpha, \beta)) \le \int_{B(\alpha, \beta)} \bar f\, d\mu \le \beta\mu(B(\alpha, \beta)), \qquad \alpha\mu(B(\alpha, \beta)) \le \int_{B(\alpha, \beta)} f\, d\mu \le \beta\mu(B(\alpha, \beta)).$$

In particular,

Adding on $k = 0, \pm 1, \pm 2, \dots$ and then letting $n \to \infty$ gives the desired result.

12. Show that $f^*$ is an invariant function (in Theorem 1.4); that is, show that $Tf^* = f^*$.

Recurrence

13. (a) In Theorem 1.5 show that almost every point of $A$ returns to $A$ infinitely often.

(b) Show that the conclusion of Theorem 1.5 fails for the transformations defined in Exercise 2.

14. (a) An operator $T$ on $L_1$ is said to be conservative if $f \in L_1$, $f \ge 0$, $\sum_{n=0}^{\infty} T^n f(x) < \infty$ a.e. implies that $f = 0$. Show that any doubly stochastic operator on $L_1$ of a finite measure space is conservative.

(b) If $\phi : X \to X$ is any measurable transformation on a $\sigma$-finite measure space, and if $T_\phi$ is conservative, then the sequence $A, \phi^{-1}(A), \phi^{-2}(A), \dots$ can be pairwise disjoint only if $\mu(A) = 0$. Hence $\phi$ fulfills the conclusion of Theorem 1.5.

Ergodicity and Mixing

15. If $(X, \mathcal{B}, \mu)$ is a finite measure space, and if $\phi$ is a measure-preserving transformation, show that $\phi^{-1}(B) \subseteq B$ implies $\phi^{-1}(B) = B$. Thus $\phi$ is ergodic iff $\phi^{-1}(B) \subseteq B \Rightarrow \mu(B) = 0$ or $\mu(X - B) = 0$. Show that the two definitions are not equivalent in the case of a $\sigma$-finite measure space. We adopt the latter as our definition of ergodicity in that case.

16. Let $\Phi$ be an abstract dynamical system. Show that the following are equivalent:

(a) $\Phi$ is ergodic.

(b) $T_\phi f = f$, $f \in L_p \Rightarrow f$ is a constant.

(c) For all $f \in L_p$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} T_\phi^k f(x) = \int_X f\, d\mu \quad \text{a.e.}$$

(d) For all $f \in L_p$,

(e) For all $f \in L_p$, $g \in L_q$, where $1/p + 1/q = 1$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} (f, T_\phi^k g) = (f, 1)(1, g).$$

(f) For all $A, B \in \mathcal{B}$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(A \cap \phi^{-k}(B)) = \mu(A)\mu(B).$$

(g) For all $A, B \in \mathcal{B}$ with $\mu(A)\mu(B) > 0$,

$$\sum_{n=1}^{\infty} \mu(A \cap \phi^{-n}(B)) > 0.$$

(h) For all $A, B \in \mathcal{B}$ with $\mu(A)\mu(B) > 0$,

$$\sum_{n=1}^{\infty} \mu(A \cap \phi^{-n}(B)) = +\infty.$$

17. Let $\Phi$ be an abstract dynamical system. Show that the following are equivalent:

(a) $\Phi$ is weakly mixing.

(b) For all $f, g \in L_2$ there exists a set $J$ of density zero such that

$$\lim_{n \to \infty,\ n \notin J} (f, T_\phi^n g) = (f, 1)(1, g).$$

(c) For all $f, g \in L_2$


18. Let $\phi_0$ be an ergodic measure-preserving transformation on $[0, 1]$. Let $X$ consist of the two disjoint line segments $X_1 = \{(x, 0) : 0 \le x \le 1\}$ and $X_2 = \{(x, 1) : 0 \le x \le 1\}$ with linear measure normalized to one. Define $\phi$ on $X = X_1 \cup X_2$ by $\phi(x, 0) = (x, 1)$ and $\phi(x, 1) = (\phi_0(x), 0)$.

(a) Show that $\phi$ is an ergodic measure-preserving transformation.

(b) Show that $\mu(X_1 \cap \phi^{-n}(X_2))$ takes on only the values 0 and $\frac{1}{2}$ and hence does not converge.

19. (a) If $\mathcal{A}$ is an algebra of subsets of $X$, and if $\mathcal{B} = \mathcal{B}(\mathcal{A})$ (see Exercise 1), then for each $A, B \in \mathcal{B}$ and each $\varepsilon > 0$ there exist sets $A_0, B_0 \in \mathcal{A}$ such that

$$|\mu(A \cap \phi^{-k}(B)) - \mu(A_0 \cap \phi^{-k}(B_0))| \le \mu[(A \cap \phi^{-k}(B)) \mathbin{\Delta} (A_0 \cap \phi^{-k}(B_0))] \le \mu[(A \mathbin{\Delta} A_0) \cup \phi^{-k}(B \mathbin{\Delta} B_0)] < 2\varepsilon$$

for all $k$.

(b) If $\mathcal{C}$ is a class of subsets of $X$ satisfying condition (30) of Exercise 1, and if $\mathcal{B} = \mathcal{B}(\mathcal{C})$, then $\Phi$ is (1) ergodic, (2) weakly mixing, or (3) mixing iff the defining relation is satisfied for all $A, B \in \mathcal{C}$.

Products and Factors

20. Give an example of ergodic systems $\Phi_1$ and $\Phi_2$ such that $\Phi_1 \otimes \Phi_2$ is not ergodic.

21. Show that the union of a finite number of sets of density zero has density zero. Show that this is false for a countable number.

22. If $\mathcal{B}_1 \subseteq \mathcal{B}$, show that $\Phi_1 = (X, \mathcal{B}_1, \mu, \phi)$ is an invertible dynamical system iff $\phi^{-1}(\mathcal{B}_1) = \mathcal{B}_1$.

Inverse Limits

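23. Show that the system $\Phi$ of Exercise 5 is an inverse limit of the sequence $\Phi_n = (Y_n, \mathcal{A}_n, \nu_n, \phi_n)$, where

$$Y_n = \mathop{\times}_{k=1}^{n} X_k, \qquad \mathcal{A}_n = \mathop{\times}_{k=1}^{n} \mathcal{B}_k, \qquad \nu_n = \mathop{\times}_{k=1}^{n} \mu_k,$$

and

$$\phi_n(x_1, \dots, x_n) = \begin{cases} (x_1 + 1, x_2, \dots, x_n) & \text{if } x_1 < k_1, \\ (0, \dots, 0, x_p + 1, x_{p+1}, \dots, x_n) & \text{if } x_1 = k_1, \dots, x_{p-1} = k_{p-1},\ x_p < k_p, \end{cases}$$

$$\phi_n(k_1, k_2, \dots, k_n) = (0, 0, \dots, 0).$$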

24. A Lebesgue system is an abstract dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ such that there exists a countable class $\mathcal{C} \subseteq \mathcal{B}$ with $\mathcal{B}(\mathcal{C}) = \mathcal{B}$. Show that the inverse limit of a countable number of Lebesgue systems, and hence also the direct product of a countable number of Lebesgue systems, is a Lebesgue system.

25. A Kolmogorov system is an invertible dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ for which there is a $\sigma$-algebra $\mathcal{B}_0 \subseteq \mathcal{B}$ satisfying

(i) $\phi^{-1}(\mathcal{B}_0) \subseteq \mathcal{B}_0$,
(ii) $\bigcap_{n=0}^{\infty} \phi^{-n}(\mathcal{B}_0) = \{\emptyset, X\}$, and
(iii) $\mathcal{B}\left(\bigcup_{n=0}^{\infty} \phi^{n}(\mathcal{B}_0)\right) = \mathcal{B}$.

An exact dynamical system is a dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ satisfying

(iv) $\bigcap_{n=0}^{\infty} \phi^{-n}(\mathcal{B}) = \{\emptyset, X\}$.

Show that $\Phi$ is a Kolmogorov system iff it is the natural extension of an exact system.

26. Show that the two-sided shift on $k$ points is isomorphic to the natural extension of the one-sided shift on $k$ points. Show also that the two-sided shift is Kolmogorov.

27. Prove Propositions 1.9 and 1.10.

Induced Systems

28. Show that Theorem 1.9 and Proposition 1.12 remain valid if $\phi$ is a recurrent (i.e., one for which the recurrence theorem is valid) measure-preserving transformation of a $\sigma$-finite measure space.

29. Prove Theorem 1.10.

30. Construct an example of an ergodic measure-preserving transformation on the reals. Show that any such transformation is conservative, hence recurrent.

31. If $\Phi_A$ is induced by $\Phi = (X, \mathcal{B}, \mu, \phi)$ on $A$, then $\Phi_A$ is also induced by $\Phi' = (Y, \mathcal{B}_Y, \mu_Y, \phi_Y)$, where $Y = \bigcup_{n=0}^{\infty} \phi^{-n}(A)$ is a $\phi$-invariant subset of $X$. Moreover, $Y$ is the minimal subset of $X$ for which this is true.

(a) Show that $\Phi_A$ is ergodic iff $\Phi'$ is ergodic.

(b) Show by example that (a) is false if "ergodic" is replaced by "mixing" or "weak mixing."

Special Flows

32. Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be a dynamical system, and let $f \ge 0$ be a nonnegative measurable function defined on $X$. Let $Y$ be the space under the graph of $f$, that is, $Y = \{(x, y) : x \in X,\ y \in R,\ 0 \le y < f(x)\}$. $Y$ inherits a measure structure as a measurable subset of the product space $X \times R$. Define a family $\phi_t$, $0 \le t < \infty$, of transformations of $Y$ as follows:

$$\phi_t(x, y) = \begin{cases} (x, y + t), & 0 \le y + t < f(x), \\ \left(\phi^n(x),\ y + t - \sum_{k=0}^{n-1} f(\phi^k(x))\right), & \sum_{k=0}^{n-1} f(\phi^k(x)) \le y + t < \sum_{k=0}^{n} f(\phi^k(x)). \end{cases}$$

If $t$ is thought of as time, the point $(x, y)$ moves upward with velocity one until it reaches the "roof" of the space $Y$, then moves back to the "floor" $X$ and is transformed by $\phi$.

(a) Show that $\phi_t$ is a measure-preserving transformation of $Y$ for each $t \ge 0$.

(b) Show that the transformations $\phi_t$ form a flow in the sense that $\phi_t \phi_s = \phi_{t+s}$ for each $t, s \ge 0$.

(c) Show that the flow is measurable, in the sense that $\phi_\cdot(\cdot) : Y \times R^+ \to Y$ is a measurable function.

The flow defined in Exercise 31 is called the special flow constructed under the function $f$ on the system $\Phi$. Special flows were introduced by Ambrose [5], who showed that all ergodic measurable flows are isomorphic to special flows.

33. In the construction preceding Theorem 1.10, let $f$ be the step function defined on $B_0$ by setting $f(x) = n + 1$ if $x$ is in the range of $\sigma_0 \sigma_1 \cdots \sigma_{n-1}$, but is not in the range of $\sigma_0 \sigma_1 \cdots \sigma_n$ ($x$ lies "under" $B_n$ but not $B_{n+1}$), and $f(x) = 1$ if $x$ is not in the range of $\sigma_0$. If $\psi_t$ is the special flow constructed under $f$ on $\Phi_0$, show that the $\phi$ of Theorem 1.10 is isomorphic to a factor of $\psi_1$. (In the construction of the flow $\psi_t$, look at the subalgebra of sets generated by vertical "columns" between floors.) Does this imply that every ergodic transformation can be embedded in a flow?

34. If a1
35. (a) Let $\Phi_0$ be as in Kakutani's example at the end of Section 6. Show that $\Phi_0$ is isomorphic to the adding machine of Exercise 5 with $k_n = 1$ for each $n$. Deduce from Exercise 23 and Proposition 1.11 that $\Phi_0$ is ergodic but not weakly mixing.

(b) Set $f_0 \equiv 1$, and, for each $k = 1, 2, 3, \dots, 2^{n-1}$ and $n = 1, 2, 3, \dots$, define $f_{k,n}$ on the space of Exercise 5 by

$$f_{k,n}(x_1, x_2, \dots) = \exp\left(2\pi i (2k - 1) 2^{-(n+1)} \sum_{j=1}^{n} 2^j x_j\right).$$

Show that $f_{k,n}$ is an eigenfunction of $T_{\phi_0}$ with eigenvalue

$$\lambda_{k,n} = \exp\left(2\pi i (2k - 1) 2^{-n}\right).$$

Note that the range of $f_{k,n}$ is just exactly the $2^n$th roots of unity in some order, and that $\lambda_{k,n}$ is also a $2^n$th root of unity.

(c) Show that $f_0$ and the $f_{k,n}$'s constitute a complete orthonormal system in complex $L_2(B_0)$. [Hint: Show first that this is true for each $L_2(Y_n)$ as defined in Exercise 23.]

(d) For an arbitrary $f \in L_2(B_0)$, expand $f$ in a Fourier series with respect to the basis of (c). Use this expansion, Parseval's relation, and the identity $\lambda_{k,n}^{2^n} = 1$ to deduce

$$\lim_{n \to \infty} \int_{B_0} |f(\phi_0^{2^n}(x)) - f(x)|^2\, \mu(dx) = 0.$$
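The eigenfunction relation in part (b) of Exercise 35 can be checked numerically on a finite block of coordinates. This is a sketch under stated assumptions: `phi0` is a hypothetical finite model of the dyadic adding machine ($k_n = 1$, carries out of the block dropped), and the candidate eigenvalue $\exp(2\pi i (2k-1) 2^{-n})$ is the value obtained by noting that adding one raises $\sum_j 2^j x_j$ by 2 modulo $2^{n+1}$.

```python
import cmath
from itertools import product

def phi0(x):
    """Dyadic adding machine on a finite 0/1 block (carry out of the block is dropped)."""
    y = list(x)
    for i, xi in enumerate(x):
        if xi == 0:
            y[i] = 1
            return tuple(y)
        y[i] = 0
    return tuple(y)

def f(k, n, x):
    """f_{k,n}(x) = exp(2*pi*i*(2k-1)*2^-(n+1) * sum_{j=1}^n 2^j x_j)."""
    s = sum(2 ** j * x[j - 1] for j in range(1, n + 1))
    return cmath.exp(2j * cmath.pi * (2 * k - 1) * s / 2 ** (n + 1))

n, k = 3, 2
lam = cmath.exp(2j * cmath.pi * (2 * k - 1) / 2 ** n)   # candidate eigenvalue
# Largest violation of f(phi0(x)) = lam * f(x) over all blocks of length n + 2.
max_err = max(abs(f(k, n, phi0(x)) - lam * f(k, n, x))
              for x in product((0, 1), repeat=n + 2))
```

Up to rounding error `max_err` vanishes, and `lam ** (2 ** n)` equals 1, which is the identity used in part (d).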

36. Let $\Phi$ be as in Kakutani's example. Note that in the terms of Exercise 35 the set $A$ over which $B_1$ is constructed is the set of $x = \{x_n\}$ for which the smallest $n$ with $x_n = 0$ is odd. (See Fig. 2.) Define $u_n$ on $B_0$ by $u_1(x) = \chi_A(x) + 1$, and

$$u_n(x) = \sum_{k=0}^{n-1} u_1(\phi_0^k(x)), \qquad x \in B_0.$$

Note that $u_n(x)$ is $n$ plus the number of "visits" to $A$ in the orbit $x, \phi_0(x), \dots, \phi_0^{n-1}(x)$. For $n = 4^p$, regardless of what $x$ is, the first $2p$ coordinates $y_1, \dots, y_{2p}$ take each of the $4^p$ possible combinations of values exactly once as $y = \phi_0^k(x)$ traverses this orbit, $y_{2p+1}$ increases by 1, and no other change occurs in $y_j$, $j > 2p$. For all but one of these combinations, namely $1, 1, \dots, 1$, $y \in A$ or $y \notin A$ regardless of the values of $y_j$, $j > 2p$. In the exceptional case, $y \in A$ for exactly $\frac{2}{3}$ of the $x$ values.

(a) Show that $u_n(x)$ for $n = 4^p$ takes only two values, $a_p$ and $a_p + 1$, where

and that

$$\mu_0\{x : u_{4^p}(x) = a_p\} = \tfrac{1}{3}, \qquad \mu_0\{x : u_{4^p}(x) = a_p + 1\} = \tfrac{2}{3}.$$

(b) Show that

$$\phi_0^n(x) = \phi^{u_n(x)}(x)$$

for each $n = 1, 2, \dots$ and each $x \in B_0$.

(c) If $f \in L_2(X)$ is such that $f(\phi(x)) = e^{2\pi i \lambda} f(x)$ for all $x \in X$, show that

Conclude from Exercise 35(d) that $\lambda = 0$. According to the converse of Theorem 1.6(ii), $\Phi$ is weakly mixing.

(d) Use (a) and (b) to show that

$$B_0 \cap \phi^{-a_p}(C) \subseteq \phi_0^{-4^p}(C) \cup \phi_0^{-4^p}(\phi(C))$$

for any measurable set $C$. In particular, if $C = \{x \in B_0 : x_1 = 0\}$, then the right-hand side reduces to $\phi_0^{-4^p}(C) = C$. Therefore,

$$\mu((B_0 - C) \cap \phi^{-a_p}(C)) = 0$$

for all $p = 1, 2, \dots$, and $\phi$ is not strongly mixing.

CHAPTER II

Topological Dynamics

1. CLASSICAL DYNAMICAL SYSTEMS

Topological dynamics may be defined as the study of continuous transformations, or groups of such transformations, defined on a topological space (usually compact), with particular regard to properties of interest in the qualitative theory of differential equations. We shall be concerned in this chapter with the theory of a single continuous transformation of a compact Hausdorff space. Many of the properties of transformation groups (as discussed in [23], for example) may just as well be isolated and studied for a single transformation and its iterates, and we find this study notationally and conceptually much easier to introduce at this level. On the other hand, it should be acknowledged that the classical applications to differential equations and to physics generally involve a continuous group of transformations. Our considerations on the proper level of generality for this chapter are also guided by our concern for tying in the results obtained here with those of Chapters I and III. For this reason, we shall not make the simplifying assumption that our topological space is metrizable. On the other hand, we shall try to point out those situations where a definition or result has a significantly simpler statement in the case of a transformation defined on a metric space.


Definition 2.1 A (classical) dynamical system is a pair $\Sigma = (X, \sigma)$, where $X$ is a nonempty compact Hausdorff space, and $\sigma$ is a continuous map of $X$ into itself. $\Sigma$ is an invertible system if $\sigma$ is invertible.

Note that for an invertible system $\sigma^{-1}$ is necessarily continuous, and so $\Sigma^{-1} = (X, \sigma^{-1})$ is also a dynamical system. Although the development in this chapter will be largely independent of the preceding chapter and parallel to it, let us mention a classical result which binds the two theories together. In order to do this, we need first to talk about the operator $T_\sigma$ induced by $\sigma$ on the space $C(X)$ of continuous real-valued functions on $X$. For $\sigma : X \to X$ a continuous map, we define $T_\sigma : C(X) \to C(X)$ by $T_\sigma f(x) = f(\sigma(x))$. Recall that $C(X)$ is an algebra of functions, and that with the norm

$$\|f\| = \max_{x \in X} |f(x)|,$$

$C(X)$ becomes a complete normed linear space. (See, for example, [16].) It follows easily that $T_\sigma$ has the following properties:

1. $f \ge 0 \Rightarrow T_\sigma f \ge 0$;
2. $\|T_\sigma f\| \le \|f\|$;
3. $T_\sigma 1 = 1$.

(Inequalities and equalities of functions are assumed to hold pointwise.) If $\sigma$ is epic ($\sigma(X) = X$), then property 2 may be replaced by the stronger

2'. $\|T_\sigma f\| = \|f\|$.

If $\sigma$ is invertible, then $T_\sigma$ is invertible with $T_\sigma^{-1} = T_{\sigma^{-1}}$. The dual of the space $C(X)$ as a normed linear space is the space $M(X)$ of all finite (signed) Borel measures on $X$. This duality is expressed by

$$(f, \mu) = \mu(f) = \int_X f(x)\, \mu(dx).$$

The adjoint $T^* : M(X) \to M(X)$ is defined by $(f, T^*\mu) = (Tf, \mu)$, where $T$ is any continuous linear operator on $C(X)$. The topology on $M(X)$ is given by the total variation norm, $\|\mu\| = |\mu|(X)$, and $T^*$ is continuous for this topology. Indeed, $\|T^*\| = \|T\|$. Properties 1-3 for $T_\sigma$ are equivalent to the following properties for $T_\sigma^*$:

1*. $\mu \ge 0 \Rightarrow T_\sigma^*\mu \ge 0$;
2*. $\|T_\sigma^*\mu\| \le \|\mu\|$;
3*. $(T_\sigma^*\mu)(X) = \mu(X)$.

If $\sigma$ is monic, then for each function $g$ of norm one there is an $f$ of norm one with $T_\sigma f(x) = f(\sigma(x)) = g(x)$ ($x \in X$). It follows easily that, in this case, we can replace 2* with the stronger

2'*. $\|T_\sigma^*\mu\| = \|\mu\|$.

In particular, if $\sigma$ is invertible, then both $T_\sigma$ and $T_\sigma^*$ are invertible isometries, and $(T_\sigma^*)^{-1} = T_{\sigma^{-1}}^* = (T_\sigma^{-1})^*$.

Now let $K = \{\mu \in M(X) : \mu \ge 0,\ \|\mu\| = \mu(X) = 1\}$. The set $K$ is nonempty, convex, and weak*-compact. (See [16], Corollary V.4.3.) Moreover, $T_\sigma^*(K) \subseteq K$. According to the Markov-Kakutani fixed point theorem ([16], p. 456), $T_\sigma^*$ has a fixed point $\mu \in K$, that is, $T_\sigma^*\mu = \mu$. But a fixed point for $T_\sigma^*$ is nothing more nor less than an invariant Borel measure for $\sigma$, since

$$\int_X f(\sigma(x))\, \mu(dx) = \int_X f(x)\, \mu(dx) \tag{1}$$

for all $f \in C(X)$, and validity of (1) on $C(X)$ is equivalent to

$$\mu(\sigma^{-1}(B)) = \mu(B)$$

for all Borel sets $B$. Thus we have proved the following theorem:

Theorem 2.1 Let $\Sigma = (X, \sigma)$ be a classical dynamical system. Then there exists a normalized (total measure one), positive measure $\mu$ on the class $\mathcal{B}$ of Borel sets of $X$ such that $\sigma$ preserves the measure $\mu$. That is, $\Sigma = (X, \mathcal{B}, \mu, \sigma)$ is an abstract dynamical system.

Remark We have not required that $\sigma$ be epic. On the other hand, it is immediate that a measure-preserving transformation must be essentially onto. This shows that the measure $\mu$ in the theorem may be degenerate. For example, if $\sigma(x) = x_0$ is a constant map, then $\mu$ is concentrated at the single point $x_0$. We shall see later (Exercise 7) that certain systems must have invariant measures $\mu$ whose support, that is, the smallest closed subset of $X$ whose complement has $\mu$-measure zero, is all of $X$.
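On a finite space (compact and Hausdorff in the discrete topology) the existence of an invariant measure, and the degeneracy noted in the Remark, can be seen directly. The sketch below does not use the Markov-Kakutani argument of the text; it swaps in the averaging of point masses along an orbit, and the map `sigma` is a hypothetical example chosen so that it is not onto.

```python
from fractions import Fraction

def cesaro_measure(sigma, x0, n):
    """Average of the point masses at x0, sigma(x0), ..., sigma^(n-1)(x0)."""
    mass = {}
    x = x0
    for _ in range(n):
        mass[x] = mass.get(x, Fraction(0)) + Fraction(1, n)
        x = sigma[x]
    return mass

def pushforward(sigma, mu):
    """Image measure B -> mu(sigma^{-1}(B)) on a finite set."""
    out = {}
    for x, m in mu.items():
        out[sigma[x]] = out.get(sigma[x], Fraction(0)) + m
    return out

# Hypothetical map on X = {0, 1, 2, 3}: 0 -> 1 -> 2 -> 3 -> 2, so 0 is never revisited.
sigma = {0: 1, 1: 2, 2: 3, 3: 2}
mu_n = cesaro_measure(sigma, 0, 1000)
invariant = {2: Fraction(1, 2), 3: Fraction(1, 2)}
```

The averages `mu_n` put mass $1/1000$ on each of the transient points 0 and 1 and approach `invariant`, which satisfies `pushforward(sigma, invariant) == invariant`. Its support $\{2, 3\}$ is a proper subset of $X$, as the Remark anticipates for maps that are not epic.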

Example 1 (Symbolic dynamical systems) Let

$$X_n = \{0, 1, \dots, k - 1\}$$

be a finite set of $k$ points with the discrete topology. Form the product

$$X = \mathop{\times}_{n=-\infty}^{\infty} X_n$$

with the product topology. Thus $X$ is a compact, totally disconnected, Hausdorff space. In case $k = 2$, it is homeomorphic to the Cantor "middle-thirds" set. Define $\sigma : X \to X$ by $\sigma(x) = y$, where $y_n = x_{n+1}$ for all $n$. Since the "cylinder sets"

$$C = \{x : (x_{n_1}, \dots, x_{n_t}) \in A\} = \bigcup_{(s_1, \dots, s_t) \in A} \bigcap_{j=1}^{t} \{x : x_{n_j} = s_j\}$$

constitute a base for the topology of $X$, and since

$$\sigma^{-1}(C) = \bigcup_{(s_1, \dots, s_t) \in A} \bigcap_{j=1}^{t} \{x : x_{n_j + 1} = s_j\}$$

is also a cylinder set, $\sigma$ is continuous. Clearly $\sigma$ is invertible, and $\sigma^{-1}(x) = z$, where $z_n = x_{n-1}$. Thus $\Sigma = (X, \sigma)$ is an invertible dynamical system. It is called a shift dynamical system or symbolic dynamical system (on the symbols or "alphabet" $\{0, 1, \dots, k - 1\}$). The measure $\mu$ which assigns to the set $C$ above the mass

$$\mu(C) = \sum_{(s_1, \dots, s_t) \in A} \prod_{j=1}^{t} p_{s_j},$$

where $p_j = 1/k$ for $j = 0, 1, \dots, k - 1$, is invariant for $\sigma$. However, as seen in the previous chapter, there are many other invariant measures for $\sigma$.

2. MINIMAL AND STRICTLY ERGODIC SYSTEMS

An important and central notion in the study of dynamical systems is that of minimality. The essential idea of minimality is that everything worth knowing about the system can be determined from the present and future, or from the past, present, and future, situation of a single point under the action of $\sigma$. Let $\Sigma = (X, \sigma)$ be a dynamical system. By the positive orbit of a point $x \in X$ is meant the set

$$O_\sigma^+(x) = O^+(x) = \{\sigma^n(x) : n = 0, 1, 2, \dots\} = \bigcup_{n=0}^{\infty} \sigma^n(\{x\}).$$

By the orbit of $x$ is meant the set $O_\sigma(x) = O(x) = \bigcup_{n=-\infty}^{\infty} \sigma^n(\{x\})$. We denote the closure of $O^+(x)$ by $\overline{O}^+(x)$ and the closure of $O(x)$ by $\overline{O}(x)$, and refer to these sets as the positive orbit closure and the orbit closure of $x$, respectively.

Definition 2.2 The dynamical system $\Sigma = (X, \sigma)$ is minimal if $\sigma(A) \subseteq A$, $A$ closed, implies either $A = X$ or $A = \emptyset$.

Definition 2.3 The dynamical system $\Sigma = (X, \sigma)$ is (positively) recurrent if $x \in \overline{O}^+(\sigma(x))$ for each $x \in X$, that is, if for each open set $U \subseteq X$ and each $x \in U$ there exists a positive integer $n$ with $\sigma^n(x) \in U$.

Remark If $x \in \overline{O}^+(\sigma(x))$, then either $x = \sigma(x)$ or $x \in \overline{O}^+(\sigma^2(x))$. But in the former case, $x = \sigma^n(x)$ for all $n$, and so $x \in \overline{O}^+(\sigma^2(x))$. By induction, $x \in \overline{O}^+(\sigma^n(x))$ for all $n \ge 1$. It follows that $\sigma^n(x)$ returns to each neighborhood of $x$ infinitely often.

Proposition 2.1 The system $\Sigma = (X, \sigma)$ is minimal iff $\overline{O}^+(x) = X$ for each $x \in X$. In particular, if $\Sigma$ is minimal, then it is recurrent.

Proof For a given $x \in X$ the set $O^+(x)$ is invariant, that is, $\sigma(O^+(x)) \subseteq O^+(x)$. It follows that its closure is also invariant:

$$\sigma(\overline{O}^+(x)) \subseteq \overline{\sigma(O^+(x))} \subseteq \overline{O}^+(x).$$

If $\Sigma$ is minimal, since $x \in \overline{O}^+(x) \ne \emptyset$, it follows that $\overline{O}^+(x) = X$. Conversely, suppose that $\Sigma$ is not minimal. Then there is a nonempty, closed subset $A \subseteq X$ with $A \ne X$ and $\sigma(A) \subseteq A$. If $x \in A$, then $O^+(x)$ and hence $\overline{O}^+(x)$ are contained in $A$. Thus $\overline{O}^+(x) \ne X$. If $\Sigma$ is minimal and $x \in X$, then $\overline{O}^+(\sigma(x)) = X$, and so $x \in \overline{O}^+(\sigma(x))$. That is, $\Sigma$ is recurrent. ∎

Proposition 2.2 If $\Sigma$ is recurrent, then $\sigma$ is epic. If, moreover, $\sigma(A) \subseteq A$ for some closed subset $A$ of $X$, then $\sigma(A) = A$. In particular, $\overline{O}(x) = \overline{O}^+(x)$ for each $x \in X$.

Proof Suppose that $\sigma$ is not epic, and let $x \in U = X - \sigma(X)$. The set $U$ is open and $\sigma^n(x) \notin U$ for any positive $n$. Thus $\Sigma$ is not recurrent. Now suppose $\Sigma$ is recurrent and $\sigma(A) \subseteq A$ for some closed subset $A$ of $X$. Let $x \in A$ and choose $y \in \sigma^{-1}(\{x\})$. Then $y \in \overline{O}^+(\sigma(y)) = \overline{O}^+(x) \subseteq A$. Hence $x = \sigma(y) \in \sigma(A)$. Thus $\sigma(A) = A$. Likewise, for each $x \in X$, $O(x)$ and hence $\overline{O}(x) \subseteq \overline{O}^+(x)$. ∎

We are now in a position to summarize our information about minimal dynamical systems.

Theorem 2.2 Let $\Sigma = (X, \sigma)$ be a dynamical system. Then there exists a nonempty closed subset $X_0 \subseteq X$ with $\sigma(X_0) \subseteq X_0$ and $\Sigma_0 = (X_0, \sigma)$ minimal. Moreover, any such $X_0$ must satisfy $\sigma(X_0) = X_0$. If $\Sigma = (X, \sigma)$ and $X_0$ is a closed subset of $X$, then the following are equivalent:

1. $\sigma(X_0) \subseteq X_0$ and $\Sigma_0 = (X_0, \sigma)$ is minimal;
2. $\overline{O}^+(x) = X_0$ for each $x \in X_0$;
3. $\overline{O}(x) = X_0$ for each $x \in X_0$;
4. $\sigma(X_0) = X_0$ and $X_0$ has no closed, nonempty, proper subset $A$ satisfying $\sigma(A) = A$.

Note If $(X_0, \sigma)$ is minimal, we shall say that $X_0$ is a minimal set for $\sigma$.

Proof Let $\mathcal{C}$ be the class of nonempty closed subsets $A$ of $X$ with $\sigma(A) \subseteq A$. $\mathcal{C}$ is partially ordered by set inclusion. If $\mathcal{C}_0$ is any totally ordered subset of $\mathcal{C}$, then $Y_0 = \bigcap \mathcal{C}_0$ is a lower bound for $\mathcal{C}_0$. Moreover, $Y_0$ is closed, and $\sigma(Y_0) \subseteq Y_0$. By compactness of $X$ and the fact that $\mathcal{C}_0$ is totally ordered, $Y_0 \ne \emptyset$. Thus $Y_0 \in \mathcal{C}$. It follows by Zorn's lemma that $\mathcal{C}$ has a minimal element $X_0$. Thus $\Sigma_0 = (X_0, \sigma)$ is minimal. Moreover, since $\sigma(X_0) \in \mathcal{C}$, it follows that $X_0 = \sigma(X_0)$.

Now suppose that $X_0$ is a closed subset of $X$ with $\sigma(X_0) \subseteq X_0$. It follows from Proposition 2.1 that statement 1 is equivalent to statement 2, and by a similar argument that statement 3 is equivalent to statement 4. If statement 1 holds, then $\sigma(X_0) = X_0$, and so statement 1 implies statement 4. Let us show, conversely, that statement 4 implies statement 1. Suppose that $\sigma(X_0) \subseteq X_0$ and $\Sigma_0$ is not minimal. By the first part of the theorem, there exists a closed subset $X_1 \subseteq X_0$ with $\Sigma_1 = (X_1, \sigma)$ minimal. Then $X_1 \ne \emptyset$, and since $\Sigma_0$ is not minimal, $X_1 \ne X_0$. Again from the first part of the theorem $\sigma(X_1) = X_1$. This shows that statement 4 is not satisfied. ∎

Corollary 2.2.1 If $A \subseteq X$, $\sigma(A) = A$, $A$ closed, implies that $A = \emptyset$ or $X$, or equivalently, if $\overline{O}(x) = X$ for each $x \in X$, then $\Sigma$ is minimal.

Proof The last part of the proof did not use the fact that $\sigma(X_0) = X_0$. ∎

Corollary 2.2.2 If $\Sigma$ is invertible and minimal, then $\Sigma^{-1}$ is minimal.

Example 2 Let $X = [0, 1]$ and $\sigma(x) = x^2$. Then $\sigma^n(x) = x^{2^n}$. $\Sigma$ is invertible, but not minimal or recurrent. In fact, the only recurrent points are the fixed points 0 and 1, and the only minimal sets are $\{0\}$ and $\{1\}$. Sets of the form $A = [0, a]$, $0 < a < 1$, satisfy $\sigma(A) \subseteq A$, but not $\sigma(A) = A$. Sets of the form $\overline{O}^+(a) = \{0, a, a^2, a^4, \dots\}$ are closed and invariant, but not minimal or recurrent. Sets of the form $A = \overline{O}(a) = \{0, 1\} \cup \{a^{2^n} : n \in Z\}$ satisfy $\sigma(A) = A$ and $A = \overline{O}(x)$ for all $x \in A$ except 0 and 1.

Example 3 Let $X = [0, 1]$ and let $\sigma(x)$ be the fractional part of $x + \alpha$. If $\alpha$ is rational, then each point has finite orbit, and each orbit is a minimal set. If $\alpha$ is irrational, then $O^+(x)$ is dense in $X$ for each $x \in X$, and so $\Sigma$ is minimal.
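The dichotomy in Example 3 can be observed numerically. In this sketch the helper names `orbit` and `largest_gap`, the rational value $2/5$, and the irrational value $\sqrt{2} - 1$ are illustrative assumptions; exact fractions are used in the rational case so that the finite orbit is reproduced without rounding.

```python
from fractions import Fraction
import math

def orbit(alpha, x0, n):
    """First n points of the positive orbit of x0 under x -> frac(x + alpha)."""
    pts, x = [], x0
    for _ in range(n):
        pts.append(x)
        x = (x + alpha) % 1
    return pts

def largest_gap(pts):
    """Largest gap left in [0, 1) by the points, measured around the circle."""
    s = sorted(set(pts))
    gaps = [b - a for a, b in zip(s, s[1:])] + [1 - s[-1] + s[0]]
    return max(gaps)

rational = orbit(Fraction(2, 5), Fraction(0), 50)     # finite orbit of 5 points
irrational = orbit(math.sqrt(2) - 1, 0.0, 2000)       # gaps shrink as n grows
```

For $\alpha = 2/5$ the orbit consists of five points and the largest gap stays at $1/5$ however long one iterates, while for the irrational rotation the largest gap left by the first 2000 points is already below $0.01$, in line with the density of $O^+(x)$.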

Example 4 Let $\Sigma$ be the symbolic dynamical system (two-sided shift) on two points. Choose $x \in X$ by setting $x_n$ equal mod 2 to the sum of the binary digits in the binary expansion of $n$ for $n \ge 0$, that is,

$$x_n = \sum_{j=0}^{k} a_j \pmod 2 \qquad \text{where } n = \sum_{j=0}^{k} a_j 2^j,$$

and setting $x_{-n} = x_{n-1}$. Then (Exercise 5) the closed invariant set $X_0 = \overline{O}(x)$ is minimal.

The definition of a minimal dynamical system bears a strong resemblance to that of an ergodic abstract dynamical system (Definition 1.4). In cases where both apply it is in fact a stronger condition (see Exercise 9). For ergodic systems $\Phi$ and integrable functions $f$, we have seen that the "time averages" $f_n(x) = (1/n) \sum_{k=0}^{n-1} f(\phi^k(x))$ converge $\mu$-a.e. to a constant, namely, the "phase average" $\mu(f) = \int_X f(x)\, \mu(dx)$ with respect to the invariant measure $\mu$. Since topological dynamics is concerned with relations holding at each point of $X$ rather than almost everywhere, we might expect that this convergence would hold for each $x \in X$ in case $\Sigma = (X, \sigma)$ is a minimal classical dynamical system and $f \in C(X)$. However, this is known to be false ([49], p. 134).

The ergodic theorem and its corollary, Proposition 1.4, give us a clue as to the proper conditions to ensure that $f_n(x)$ converges everywhere to a constant. For if $\mu_1$ and $\mu_2$ are each ergodic invariant measures, then there will be values of $x$ for which $f_n(x) \to \mu_1(f)$ and other values for which $f_n(x) \to \mu_2(f)$. Moreover, if $f_n(x) \to L_x(f)$ for a given $x \in X$ and for all $f \in C(X)$, then $L_x$ determines a unique $\sigma$-invariant, but not necessarily ergodic, Borel measure $\mu_x$ with $\mu_x(f) = L_x(f)$. If $f_n$ is to converge to a constant, we must have $\mu_x$ independent of $x$. In the following, we follow Furstenberg [21], except that our definitions of unique ergodicity and strict ergodicity follow [49], to which the reader is referred for an extremely lucid presentation of the underlying ideas.
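Returning to the point $x$ of Example 4: its coordinates are easy to generate, and a finite window already hints at the minimality claimed there, since every block that occurs keeps recurring after a short wait. The sketch below is only numerical evidence on a window of 1024 coordinates, not a proof; the function names and the block length 4 are illustrative choices.

```python
def morse(n):
    """x_n of Example 4: parity of the binary digit sum of n, with x_{-n} = x_{n-1}."""
    if n < 0:
        n = -n - 1
    return bin(n).count("1") % 2

def max_return_gap(word, block_len):
    """Largest distance between consecutive occurrences of any block of the given length."""
    last, worst = {}, 0
    for i in range(len(word) - block_len + 1):
        b = word[i:i + block_len]
        if b in last:
            worst = max(worst, i - last[b])
        last[b] = i
    return worst

word = "".join(str(morse(n)) for n in range(-512, 512))
gap = max_return_gap(word, 4)
```

A bounded `gap` for every block length is what one expects of a point whose orbit closure is minimal; the window here checks only length 4.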

Definition 2.4 A dynamical system $\Sigma = (X, \sigma)$ is uniquely ergodic if there is exactly one $\sigma$-invariant, normalized Borel measure $\mu$ on $X$. $\Sigma$ is strictly ergodic if it is uniquely ergodic and minimal. A point $x \in X$ is said to be generic for $\mu$ if $f_n(x) \to \mu(f)$ for each $f \in C(X)$.

Theorem 2.3 Let $\Sigma = (X, \sigma)$ be a dynamical system. Then the following are equivalent:

1. $\Sigma$ is uniquely ergodic with invariant measure $\mu$;
2. $f_n(x)$ converges to $\mu(f)$ uniformly on $X$, for each $f \in C(X)$;
3. every point of $X$ is generic for $\mu$.

Notice that the equivalence of statements 1 and 3 answers the question raised above about pointwise convergence of fn(x), and that the uniform convergence in statement 2 is a bonus.
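The uniform convergence in statement 2 can be watched for the irrational rotation of the circle (Example 3 with the endpoints 0 and 1 identified). The sketch assumes the rotation number $\sqrt{2} - 1$, the test function $f(x) = \cos 2\pi x$ with phase average 0 for Lebesgue measure, and a grid of 50 starting points standing in for the supremum over $X$; all of these are illustrative choices.

```python
import math

def time_average(f, alpha, x, n):
    """f_n(x) = (1/n) * sum_{k<n} f(sigma^k(x)) for the rotation sigma(x) = frac(x + alpha)."""
    return sum(f((x + k * alpha) % 1.0) for k in range(n)) / n

def sup_deviation(f, alpha, n, mean, grid=50):
    """Largest |f_n(x) - mean| over a grid of starting points x."""
    return max(abs(time_average(f, alpha, j / grid, n) - mean) for j in range(grid))

f = lambda x: math.cos(2 * math.pi * x)      # continuous on the circle, integral 0
alpha = math.sqrt(2) - 1
devs = [sup_deviation(f, alpha, n, 0.0) for n in (10, 100, 1000)]
```

For this $f$ the deviation is bounded by $1/(n\,|\sin \pi\alpha|)$ at every starting point, so the entries of `devs` decrease roughly like $1/n$ independently of $x$.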

Proof It is clear that statement 2 implies statement 3. We shall show that statement 1 implies statement 2 and statement 3 implies statement 1. Suppose that statement 1 is true. The uniqueness of the invariant measure $\mu$ means that the set of measures $\nu \in M(X)$ that vanish on the subspace $\mathcal{N}_\sigma = \{g - T_\sigma g : g \in C(X)\}$ of $C(X)$ is one-dimensional. It follows that the closure in $C(X)$ of $\mathcal{N}_\sigma$ coincides with the null space $\mathcal{N}_\mu = \{f - \mu(f) : f \in C(X)\}$ of $\mu$. (The annihilator of the annihilator of $\mathcal{N}_\sigma$ is the closure of $\mathcal{N}_\sigma$.) Thus, given $f \in C(X)$ and $\varepsilon > 0$, there exists $g \in C(X)$ with $\|f - \mu(f) - (g - T_\sigma g)\| < \varepsilon$. It follows that

$$\left\| f_n - \mu(f) - \frac{1}{n}(g - T_\sigma^n g) \right\| < \varepsilon$$

for each $n$. Since $(1/n)(g - T_\sigma^n g)$ converges uniformly to zero, it follows that $\|f_n - \mu(f)\| \to 0$ as $n \to \infty$.

Now suppose that statement 3 holds, and that $\nu$ is any normalized, invariant Borel measure on $X$. Since the functions $f_n(x)$ are uniformly bounded by $\|f\|$ and $f_n(x) \to \mu(f)$ for each $x \in X$, it follows by the bounded convergence theorem that

$$\nu(f) = \nu(f_n) = \int_X f_n\, d\nu \to \mu(f)$$

for each $f \in C(X)$. It follows that $\nu = \mu$. ∎

Example The system of Example 2 is not uniquely ergodic since it has two fixed points. However, a close relative of it is. Let $X = K = \{z : |z| = 1\}$ be the unit circle in the complex plane, and define $\sigma$ by $\sigma(e^{2\pi i x}) = \exp(2\pi i x^2)$ ($0 \le x \le 1$). Then $\Sigma$ is an invertible dynamical system. If $\mu$ is any invariant measure, then the $\mu$-measure of the arc from 1 counterclockwise to $e^{2\pi i a}$ is the same as that of the arc from 1 to $\exp(2\pi i a^2)$ for each $a$, $0 < a < 1$. Hence the arc from $\exp(2\pi i a^2)$ to $e^{2\pi i a}$ has measure zero. It follows that $\mu$ is concentrated at the single point $z = 1$, and that $\Sigma$ is uniquely ergodic. On the other hand, $\Sigma$ is not minimal, hence not strictly ergodic. An example of a strictly ergodic system is given in Exercise 1.

Remark A uniquely ergodic system is strictly ergodic iff the support of the unique invariant measure $\mu$ is all of $X$, that is, iff $\mu(U) > 0$ for each nonempty open set $U \subseteq X$. (See Exercise 7.)

3. EQUICONTINUOUS AND DISTAL SYSTEMS

In this section we shall assume that $\Sigma = (X, \sigma)$ is an invertible dynamical system. It turns out that this is only an apparent restriction for distal systems, as we shall see in Section 6. Recall that a compact Hausdorff space $X$ can always have its topology described in terms of a uniformity on $X$. (See, for example, [ S S ] . ) In particular, if $X$ is metrizable with metric $d$, then the pair $(x, y)$ belongs to the index $\alpha_\varepsilon \subseteq X \times X$ provided that $d(x, y) < \varepsilon$. A collection $\{f_j : j \in J\}$ of functions from one uniform space $X$ to another uniform space $Y$ is equicontinuous if for each index $\beta$ on $Y$ there exists an index $\alpha$ on $X$ such that $(x, y) \in \alpha$ implies $(f_j(x), f_j(y)) \in \beta$ for all $j \in J$.

Definition 2.5 A dynamical system $\Sigma = (X, \sigma)$ is equicontinuous if the collection $\{\sigma^n : n \in Z\}$ of transformations of $X$ is equicontinuous.

Examples The shift dynamical systems of Example 1 are not equicontinuous. A suitable metric on $X$ is given by

$$d(x, y) = (1 + \min\{|n| : x_n \ne y_n\})^{-1}.$$

With this metric, there exist points $x$ and $y$ (differing only in the $n$th component) with $d(x, y) = 1/(1 + n)$ arbitrarily small but with $d(\sigma^n(x), \sigma^n(y)) = 1$. Example 3 is not equicontinuous as it stands, but it becomes so if we modify the topological space $X$ by identifying 0 and 1, or, equivalently, by defining $\sigma$ on $K = \{z : |z| = 1\}$ by $\sigma(z) = e^{2\pi i \alpha} z$, for then $d(\sigma^n(z), \sigma^n(w)) = |e^{2\pi i n \alpha}(z - w)| = d(z, w)$.

Proposition 2.3 A minimal equicontinuous dynamical system is strictly ergodic.

Proof Let $f \in C(X)$. Equicontinuity of the transformations $\sigma^n$ implies equicontinuity of the family of real-valued functions $f(\sigma^n(\cdot))$ ($n \in Z$) and hence of the functions $f_n$ ($n \in Z$). It follows from the Arzela-Ascoli theorem [16, p. 266] that some subsequence $f_{n_k}$ converges uniformly on $X$. Let $g$ be the uniform limit of this sequence. Then $g$ is continuous, and $T_\sigma g = g$. Since $\Sigma$ is minimal, $g$ must be a constant (see Exercise 9). Let us indicate this constant, which depends only on the choice of $f$, by $\lambda(f)$. Now if $\mu$ is any invariant normalized Borel measure on $X$, then $\mu(f_{n_k}) = \int_X f_{n_k}\, d\mu$ converges to $\mu(g) = \lambda(f)$. But $\mu(f_{n_k}) = \mu(f)$. Thus $\mu(f) = \lambda(f)$ for each $f \in C(X)$. It follows that $\mu = \lambda$. ∎
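The failure of equicontinuity for the shift noted in the Examples above can be checked with the stated metric. In this sketch a two-sided 0/1 sequence is stored as the finite set of indices where it equals 1 (so only sequences with finitely many ones are modeled), and the names `shift` and `d` are illustrative.

```python
def shift(x, n=1):
    """sigma^n on a two-sided 0/1 sequence stored as the set of indices where it equals 1.

    sigma(x) = y with y_m = x_{m+1}, so a 1 at index i moves to index i - n.
    """
    return frozenset(i - n for i in x)

def d(x, y):
    """d(x, y) = 1 / (1 + min{|n| : x_n != y_n}), and 0 if x == y."""
    diff = x ^ y
    if not diff:
        return 0.0
    return 1.0 / (1 + min(abs(i) for i in diff))

n = 99
x = frozenset()                       # the zero sequence
y = frozenset({n})                    # differs from x only in coordinate n
close = d(x, y)                       # equals 1 / (1 + n)
far = d(shift(x, n), shift(y, n))     # the difference now sits at coordinate 0
```

Here `close` is $1/100$ while `far` is 1, so no single index $\alpha$ can control all the iterates $\sigma^n$ at once.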


In general, of course an equicontinuous system need not be minimal and so need not be uniquely ergodic. However, we shall see in Section 6 that if C = (X,a) is equicontinuous, then X is a union of disjoint invariant sets on which a is minimal. A closely related notion to that of an equicontinuous system is that of a distal system. There are several equivalent ways of defining distal systems (Exercise lo), and we choose one that is most simply stated in terms of the product topology on the product space X x X.

Definition 2.6 A dynamical system $\Sigma = (X, \sigma)$ is distal if, for each pair $x, y \in X$ with $x \ne y$, the closure of the set $\{(\sigma^n(x), \sigma^n(y)) : n \in Z\}$ is disjoint from the diagonal $\Delta = \{(x, x) : x \in X\}$ in $X \times X$.

Proposition 2.4 If $\Sigma$ is equicontinuous, then it is distal.

Proof Suppose $x \ne y$. Then there exists an index $\beta$ on $X$ with $(x, y) \notin \beta$. By equicontinuity there exists an index $\alpha$ such that $(u, v) \in \alpha$ implies $(\sigma^k u, \sigma^k v) \in \beta$ for all $k \in Z$. It follows that $(\sigma^n x, \sigma^n y) \notin \alpha$ for any $n \in Z$. Otherwise, we could let $u = \sigma^n x$, $v = \sigma^n y$, $k = -n$, and reach a contradiction. Thus $\{(\sigma^n(x), \sigma^n(y)) : n \in Z\}$ is disjoint from $\alpha$. Since $\Delta \subseteq \alpha$ and $\alpha$ is open in the product topology, it follows that $\Sigma$ is distal. ∎

Remark For a metrizable space $X$ with metric $d$, the system $\Sigma = (X, \sigma)$ is distal iff for each pair $x, y \in X$ with $x \ne y$, there exists an $\varepsilon > 0$ with $d(\sigma^n(x), \sigma^n(y)) \ge \varepsilon$ for all $n$. In general (Exercise 10), $\Sigma$ is distal iff $\sigma^{n_k} x \to z$, $\sigma^{n_k} y \to z$ for some generalized sequence (net) of integers $n_k$ implies that $x = y$.

Examples The shift dynamical (symbolic) systems of Example 1 are not distal. In fact, one can easily find points $x, y \in X$ with $x \ne y$, $\sigma(x) = x$, $\sigma^n(y) \to x$ as $n \to \infty$.

The modification of Example 3 discussed earlier is equicontinuous, hence distal. Let us describe a similar example which is distal but not equicontinuous.

Example 5 Let $X = K \times K$, where $K = \{z : |z| = 1\}$ is the unit circle in the complex plane. Define $\sigma$ on $X$ by $\sigma(z, w) = (e^{2\pi i\alpha} z, zw)$, where $\alpha$ is an irrational number between 0 and 1. This is the simplest of the so-called skew product transformations introduced by H. Anzai [32, p. 60] and extensively studied both as classical and as abstract dynamical systems. It is not hard to show that the orbit of each point in $X$ is dense and hence that $\Sigma = (X, \sigma)$ is minimal. Let us show that it is distal. We take as metric the product metric on $K \times K$, regarded as a subset of Euclidean space.


The transformation $\sigma$ carries all points $(z, w)$ having the same $z$-coordinate onto points again having the same $z$-coordinate and changes the $w$-coordinate by the same multiplicative factor. Thus if $z_1 \ne z_2$, then

$d(\sigma^n(z_1, w_1), \sigma^n(z_2, w_2)) \ge d(z_1, z_2)$ for all $n$.

On the other hand, if $z_1 = z_2$, then

$d(\sigma^n(z_1, w_1), \sigma^n(z_2, w_2)) = d(w_1, w_2)$.

In either case, for $x \ne y$ there exists an $\varepsilon > 0$ with $d(\sigma^n(x), \sigma^n(y)) > \varepsilon$ for all $n$. Next we show that $\Sigma$ is not equicontinuous. Notice that for positive $n$ we have

$\sigma^n(z, w) = (e^{2\pi i n\alpha} z, \; e^{\pi i n(n-1)\alpha} z^n w)$.

Taking $z_n = e^{\pi i/n} z$, we see that $(z_n, w) \to (z, w)$ as $n \to \infty$. Thus $d((z_n, w), (z, w))$ can be made arbitrarily small. On the other hand, since $z_n^n = -z^n$,

$d(\sigma^n(z_n, w), \sigma^n(z, w)) \ge d(-w, w) = 2$.

Taking $\varepsilon = 1$, we see that $\Sigma$ is not equicontinuous.
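The computation in Example 5 can be reproduced numerically. The sketch below rests on assumptions that are not in the text: a particular value of `ALPHA`, a particular base point, and hypothetical helper names. It iterates the skew product directly, compares the result with the closed formula for $\sigma^n$, and measures how far the nearby points $(z_n, w)$ and $(z, w)$ are driven apart after $n$ steps.

```python
import cmath
import math

ALPHA = math.sqrt(2) - 1  # assumed irrational number in (0, 1)


def sigma(z, w):
    """One step of the skew product (z, w) -> (e^{2 pi i alpha} z, z w)."""
    return cmath.exp(2j * math.pi * ALPHA) * z, z * w


def iterate(z, w, n):
    """Apply sigma n times."""
    for _ in range(n):
        z, w = sigma(z, w)
    return z, w


def closed_form(z, w, n):
    """sigma^n(z, w) = (e^{2 pi i n alpha} z, e^{pi i n (n-1) alpha} z^n w)."""
    return (cmath.exp(2j * math.pi * n * ALPHA) * z,
            cmath.exp(1j * math.pi * n * (n - 1) * ALPHA) * z ** n * w)


def dist(p, q):
    """Euclidean metric on K x K viewed inside C^2."""
    return math.hypot(abs(p[0] - q[0]), abs(p[1] - q[1]))


z, w = cmath.exp(0.3j), cmath.exp(1.1j)
n = 50
z_n = cmath.exp(1j * math.pi / n) * z                  # perturbed point z_n = e^{pi i / n} z
start_gap = dist((z_n, w), (z, w))                     # small, roughly pi / n
end_gap = dist(iterate(z_n, w, n), iterate(z, w, n))   # second coordinates are antipodal
formula_error = dist(iterate(z, w, n), closed_form(z, w, n))
```

Increasing `n` makes `start_gap` as small as desired while `end_gap` stays at least 2, which is the failure of equicontinuity described above.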

4.

SUMS AND PRODUCTS OF DYNAMICAL SYSTEMS

We begin now the study of methods of constructing new dynamical systems from given ones. In addition to the products and factors, defined as they were in the previous chapter for abstract systems, we shall consider the categorically dual constructions of sums and subsystems.

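Definition 2.7 Let $\Sigma_j = (X_j, \sigma_j)$ be a classical dynamical system for each $j \in J$. We define the direct product $\Sigma = \bigotimes_{j \in J} \Sigma_j$ by taking for the space $X$ the product $X = \prod_{j \in J} X_j$ with the product topology, and defining $\sigma = \bigotimes_{j \in J} \sigma_j$ by $\sigma(x) = y$, where $y_j = \sigma_j(x_j)$. We also make use of the customary modifications of this notation, such as $\Sigma_1 \otimes \Sigma_2$, $\Sigma_1 \otimes \cdots \otimes \Sigma_n$, and $\bigotimes_{n=1}^{\infty} \Sigma_n$.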

Definition 2.8 Let $\Sigma_j = (X_j, \sigma_j)$ be a classical dynamical system for each $j \in J$, where either (i) $J$ is finite, or (ii) $X = \bigcup_{j \in J} X_j$ is compact Hausdorff, the $X_j$ are pairwise disjoint and have the relative topology as subsets of $X$. In case (i), define $X = \bigcup_{j \in J} X_j$ to be the disjoint union of copies of the $X_j$ with the direct sum topology (a set is open if its intersection with each $X_j$ is open), and define $\sigma(x) = \sigma_j(x)$ for $x \in X_j$. In case (ii), assume that $\sigma: X \to X$ is continuous and satisfies $\sigma(x) = \sigma_j(x)$ for $x \in X_j$. In either case, we define the direct sum $\Sigma = \bigoplus_{j \in J} \Sigma_j$ to be $\Sigma = (X, \sigma)$. Again, we shall make use of the customary modifications. If $\Sigma$ is a direct sum of minimal systems, we say $\Sigma$ is semisimple.

Remark It is possible to give categorical definitions as follows: If $\Sigma = \bigotimes_{j \in J} \Sigma_j$, then for each $j \in J$ there is a commutative diagram

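[Commutative diagram: $\sigma: X \to X$ on top, $\sigma_j: X_j \to X_j$ on the bottom, vertical maps $\psi_j: X \to X_j$]

where the $\psi_j$ are epimorphisms, and if $\Omega = (Y, \omega)$ completes a similar diagram:

[Commutative diagram: $\omega: Y \to Y$ on top, $\sigma_j: X_j \to X_j$ on the bottom, vertical maps $\rho_j: Y \to X_j$]

with the $\rho_j$ epimorphisms for each $j \in J$, then there is an epimorphism $\phi: Y \to X$ such that the diagrams

[Commutative diagrams: $\phi: Y \to X$ intertwining $\omega$ and $\sigma$, and $\psi_j \circ \phi = \rho_j$]

commute. The $\psi_j$, of course, are the projections $\psi_j(x) = x_j$, and $\phi$ is defined by $\phi(y) = x$, where $x_j = \rho_j(y)$.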

Finite direct sums are defined in exactly the same way, with all of the arrows reversed and epimorphism replaced by monomorphism throughout. In this case, the $\psi_j$ are the injections $\psi_j(x) = x$, and $\phi$ is the monomorphism defined by

$\phi(x) = \rho_j(x)$ $(x \in X_j)$.


Infinite direct sums are, in general, not categorical. Uniqueness fails, since the map $\phi$ defined above may not be continuous. We may now restate Definition 2.6 as follows.

Definition 2.6 The dynamical system $\Sigma = (X, \sigma)$ is distal provided that the orbit closure of each point for the system $\Sigma \otimes \Sigma$ is either contained in or disjoint from the diagonal $\Delta$.

Example 6 As an example of a direct sum consisting of an infinite number of summands, consider the transformation $\sigma$ defined on the torus $K \times K$ by $\sigma(z, w) = (e^{2\pi i\alpha} z, e^{4\pi i\alpha} w)$, where $\alpha$ is an irrational number between 0 and 1. This transformation is equicontinuous, but not minimal. In fact, each of the pairwise disjoint curves of the form $w = \lambda z^2$, where $\lambda$ is a given point in $K$, is a minimal set for $\sigma$. It follows that the systems $\Sigma_\lambda = (X_\lambda, \sigma_\lambda)$ $(\lambda \in K)$, where $X_\lambda$ is the curve described above and $\sigma_\lambda$ is the restriction of $\sigma$ to $X_\lambda$, satisfy the conditions of Definition 2.8. Each of the $X_\lambda$ in this example is homeomorphic to the compact group $K$, and $\sigma_\lambda$ is carried by this homeomorphism into a rotation through the angle $2\pi\alpha$. This situation is, as we shall see later, typical of equicontinuous systems.

Diagrams (3)-(5) and their duals for sums suggest two more definitions.

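Definition 2.9 We shall say that the dynamical system $\Sigma_1 = (X_1, \sigma_1)$ is a factor of the system $\Sigma = (X, \sigma)$ if there exists a continuous epimorphism $\psi: X \to X_1$ such that the diagram

[Commutative diagram: $\sigma: X \to X$ on top, $\sigma_1: X_1 \to X_1$ on the bottom, vertical maps $\psi: X \to X_1$]

commutes. In this case, we write $\Sigma_1 \mid \Sigma$ and $\psi: \Sigma \to \Sigma_1$ or $\Sigma \xrightarrow{\psi} \Sigma_1$. The map $\psi$ is called a homomorphism (or epimorphism) of $\Sigma$ onto $\Sigma_1$. If $\psi$ is invertible, it is called an isomorphism, and we say $\Sigma_1$ and $\Sigma$ are isomorphic dynamical systems. If $\Sigma = \Sigma_1 \otimes \Sigma_2$, we say $\Sigma_1$ is a direct factor of $\Sigma$.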

Definition 2.10 We shall say that the dynamical system $\Sigma_1 = (X_1, \sigma_1)$ is a subsystem of the system $\Sigma = (X, \sigma)$ if there exists a continuous monomorphism $\psi: X_1 \to X$ such that the diagram

[Commutative diagram: $\sigma_1: X_1 \to X_1$ on top, $\sigma: X \to X$ on the bottom, vertical maps $\psi: X_1 \to X$]

commutes. The map $\psi$ is called an injection (embedding, monomorphism) of $\Sigma_1$ into $\Sigma$.

To complete the analogy between Definitions 2.9 and 2.10, we should call the system $\Sigma_1$ in the latter case a summand of $\Sigma$. However, the given terminology is well established and agrees with terminology in other categories, for example, subgroup, subspace, etc. We shall, however, refer to $\Sigma_1$ as a direct summand in case $\Sigma = \Sigma_1 \oplus \Sigma_2$.

Proposition 2.5 A dynamical system is minimal iff it has no proper subsystems. A factor of a minimal (equicontinuous) system is minimal (equicontinuous).

The proof is left as an exercise. Note that nontrivial sums are never minimal. Nor is $\Sigma \otimes \Sigma$ ever minimal, since the diagonal is a closed invariant set. Contrast this with the situation for abstract dynamical systems $\Phi$, where $\Phi \otimes \Phi$ is ergodic when $\Phi$ is weakly mixing.

Proposition 2.6 Products, finite sums, and subsystems of equicontinuous (distal) dynamical systems are equicontinuous (distal). Arbitrary sums of distal systems are distal.

Proof Consider first the product $\Sigma = \bigotimes_{j \in J} \Sigma_j$. Suppose that each $\Sigma_j$ is equicontinuous. A basis for the uniformity on $X$ is obtained by taking products $\beta = \prod_{j \in J} \beta_j$, where $\beta_j$ is an index on $X_j$ for all $j \in J$, and all but finitely many $\beta_j = X_j \times X_j$. For the finite number of exceptions, choose $\alpha_j$ so that $(\xi, \eta) \in \alpha_j$ implies $(\sigma_j^n(\xi), \sigma_j^n(\eta)) \in \beta_j$ for all $n$. Let $\alpha_j = X_j \times X_j$ otherwise, and set $\alpha = \prod_{j \in J} \alpha_j$. Then $(x, y) \in \alpha$ implies $(\sigma^n(x), \sigma^n(y)) \in \beta$ for all $n$, and $\Sigma$ is equicontinuous.

Suppose each $\Sigma_j$ is distal and that there exists a generalized sequence (net) of integers $n_k$ and points $x, y, z \in X$ such that $\sigma^{n_k}(x) \to z$ and $\sigma^{n_k}(y) \to z$. Then for each $j \in J$ it follows that $\sigma_j^{n_k}(x_j) \to z_j$, $\sigma_j^{n_k}(y_j) \to z_j$, so that $x_j = y_j$. Therefore, $x = y$, and $\Sigma$ is distal.

Next consider the direct sum $\Sigma = \Sigma_1 \oplus \Sigma_2$. A uniformity for $X$ is obtained by taking as indices sets of the form $\alpha = \alpha_1 \cup \alpha_2$, where $\alpha_j \subseteq X_j \times X_j$ are indices. Equicontinuity of $\Sigma$ follows immediately from equicontinuity of $\Sigma_1$ and $\Sigma_2$ and this observation.

Suppose that $\Sigma_j$ is distal for each $j \in J$ and that $\Sigma = \bigoplus_{j \in J} \Sigma_j$. If $x \in X_j$, then $\sigma(x) = \sigma_j(x) \in X_j$, and so $\bar{O}(x) \subseteq X_j$. Suppose that there is a generalized sequence $n_k$ of integers with $\sigma^{n_k}(x) \to z$ and a $y \in X$ with $\sigma^{n_k}(y) \to z$. Since the $X_j$ are pairwise disjoint, we must have $y$ belonging to the same $X_j$ as $x$. Hence $\sigma^{n_k}(y) = \sigma_j^{n_k}(y) \to z$, $\sigma_j^{n_k}(x) \to z$, and $x, y, z \in X_j$. Since $\Sigma_j$ is distal, this implies that $x = y$. Hence $\Sigma$ is distal.

Now suppose that $\psi: \Sigma_1 \to \Sigma$ is an injection. If the mappings $\sigma^n$ $(n \in Z)$ are equicontinuous on $X$, they are certainly equicontinuous on $\psi(X_1)$. Since $\psi$ is a homeomorphism of $X_1$ onto $\psi(X_1)$, it follows that $\Sigma_1$ is equicontinuous when $\Sigma$ is. Assume that $\Sigma$ is distal. If $\psi: \Sigma_1 \to \Sigma$ is an injection, then $\psi \otimes \psi: \Sigma_1 \otimes \Sigma_1 \to \Sigma \otimes \Sigma$ is also an injection. Moreover, for any $(x, y) \in X_1 \times X_1$, we have $\bar{O}(\psi(x), \psi(y)) = (\psi \times \psi)\,\bar{O}(x, y)$. Suppose $x \ne y$. Then $\psi(x) \ne \psi(y)$. Since $\bar{O}(\psi(x), \psi(y))$ is disjoint from the diagonal, so also is $\bar{O}(x, y)$. That is, $\Sigma_1$ is distal. ∎

Remarks It is also true that factors of distal systems are distal. However, the proof is surprisingly nontrivial and will be given in Section 6 using the notion of the Ellis semigroup of $\Sigma$. Infinite direct sums of equicontinuous systems need not be equicontinuous. (See Exercise 14.)

5.

INVERSE LIMITS

We turn now to the construction of inverse limits of dynamical systems. This construction generalizes that of products of infinitely many systems. We shall give a categorical definition for limits of collections of systems $\Sigma_j$ indexed on a directed set $J$. Recall that the set $J$ is directed by the relation $<$ provided that (i) $<$ is a partial order on $J$, and (ii) for each pair $i, j \in J$ there is a $k \in J$ with $i < k$ and $j < k$.

Definition 2.11 By an inverse system of dynamical systems we shall mean a triple $(J, \Sigma_j, \psi_{ij})$ such that $J$ is a directed set; for each $j \in J$, $\Sigma_j$ is a dynamical system; for each pair $i, j \in J$ with $i < j$, the map $\psi_{ij}$ is a homomorphism of $\Sigma_j$ onto $\Sigma_i$; and the diagram

[Commutative triangle: $\psi_{jk}: \Sigma_k \to \Sigma_j$, $\psi_{ij}: \Sigma_j \to \Sigma_i$, $\psi_{ik}: \Sigma_k \to \Sigma_i$]

commutes for all $i < j < k$.

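Definition 2.12 The dynamical system $\Sigma$ is an inverse limit of the inverse system $(J, \Sigma_j, \psi_{ij})$ if (i) there exist homomorphisms $\rho_j$ $(j \in J)$ of $\Sigma$ onto $\Sigma_j$, such that all the diagrams

[Commutative triangle (9): $\rho_j: \Sigma \to \Sigma_j$, $\rho_i: \Sigma \to \Sigma_i$, $\psi_{ij}: \Sigma_j \to \Sigma_i$]

commute, and (ii) if $\Sigma'$ is another system with epimorphisms $\rho_j'$ satisfying (9), then there exists a homomorphism $\phi$ of $\Sigma'$ onto $\Sigma$, such that the diagrams

[Commutative triangle: $\phi: \Sigma' \to \Sigma$, $\rho_j: \Sigma \to \Sigma_j$, $\rho_j': \Sigma' \to \Sigma_j$]

all commute. In this case, we write $\Sigma = \operatorname{inv\,lim}_{j \in J} \Sigma_j$.

Remark It follows easily from the corresponding fact in topological spaces that the inverse limit of an inverse system of dynamical systems always exists. In fact, we can take $X$ to be the inverse limit set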

$X_\infty = \{x \in \prod_{j \in J} X_j : \psi_{ij} x_j = x_i \text{ for all } i, j \in J,\ i < j\}$ (11)

with the relative topology induced on $X_\infty$ by the product topology on $\prod_{j \in J} X_j$. Moreover, $\sigma = \sigma_\infty$ is defined by $\sigma_\infty(x) = y$, where $y_j = \sigma_j(x_j)$, and the $\rho_j$ are the obvious projections. Note that $\Sigma_\infty$ is a subsystem of $\bigotimes_{j \in J} \Sigma_j$. Any other inverse limit is obviously isomorphic to $\Sigma_\infty = (X_\infty, \sigma_\infty)$.

Examples The classic example and prototype of the inverse limit is the direct product. Indeed, if $\Sigma = \bigotimes_{i \in I} \Sigma_i$, if we let $J$ be the collection of finite subsets of $I$ ordered by set inclusion, and if we denote

$\Lambda_j = \Lambda(i_1, \ldots, i_n) = \Sigma_{i_1} \otimes \cdots \otimes \Sigma_{i_n}$ for $j = (i_1, \ldots, i_n) \in J$,

then $\Sigma = \operatorname{inv\,lim}_{j \in J} \Lambda_j$.

Example 7 (Natural extension) Suppose that $\Sigma = (X, \sigma)$ is a dynamical system, where $\sigma$ is epic, but not necessarily invertible. For each positive integer $n$ let $X_n = X$ and define $\psi_{nm} = \sigma^{m-n}$ for $m > n$. The system $\Sigma_\infty = \operatorname{inv\,lim}_{n \to \infty} \Sigma_n$ is the smallest invertible dynamical system containing $\Sigma$ as a factor. To see that $\Sigma_\infty$ is invertible, note that

$X_\infty = \{x \in \prod_{n=1}^{\infty} X_n : x_n = \sigma(x_{n+1}) \text{ for each } n\}$,

$\sigma_\infty(x_1, x_2, x_3, \ldots) = (\sigma(x_1), x_1, x_2, \ldots)$,

so that

$\sigma_\infty^{-1}(y_1, y_2, y_3, \ldots) = (y_2, y_3, y_4, \ldots)$.
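The two formulas for $\sigma_\infty$ and its inverse can be tried out on finite truncations of points of $X_\infty$. The following is a minimal sketch under assumptions not made in the text: the base map is taken to be the doubling map on the circle $R/Z$ (onto but two-to-one), points are stored as exact fractions, and only the first few coordinates of each history are kept, so the inverse is recovered only up to the last stored coordinate.

```python
from fractions import Fraction


def sigma(x: Fraction) -> Fraction:
    """Doubling map on the circle R/Z: onto, but two-to-one (assumed example)."""
    return (2 * x) % 1


def is_history(xs) -> bool:
    """Check the compatibility condition x_n = sigma(x_{n+1})."""
    return all(xs[n] == sigma(xs[n + 1]) for n in range(len(xs) - 1))


def sigma_inf(xs):
    """Truncated sigma_infinity: (x_1, x_2, ...) -> (sigma(x_1), x_1, x_2, ...)."""
    return (sigma(xs[0]),) + tuple(xs[:-1])


def sigma_inf_inverse(ys):
    """Truncated inverse: (y_1, y_2, y_3, ...) -> (y_2, y_3, ...)."""
    return tuple(ys[1:])


# Build a history by choosing one of the two preimages at each step.
x = [Fraction(1, 3)]
for step in range(5):
    # both x/2 and x/2 + 1/2 are preimages of x; alternate between them
    x.append(x[-1] / 2 + (Fraction(1, 2) if step % 2 else 0))
x = tuple(x)

image = sigma_inf(x)
recovered = sigma_inf_inverse(image)
```

The point of the construction is visible here: a single point of $X$ has two preimages under $\sigma$, but a history records which preimage was chosen, so the shift to the left undoes $\sigma_\infty$ unambiguously.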

Proposition 2.7 Inverse limits of (1) minimal, (2) equicontinuous, or (3) distal dynamical systems are, respectively, (1) minimal, (2) equicontinuous, or (3) distal.

Proof The statements regarding equicontinuous and distal systems follow immediately from Proposition 2.6 and the observation that $\Sigma_\infty$ is a subsystem of $\bigotimes_{j \in J} \Sigma_j$. Suppose that $\Sigma_\infty = \operatorname{inv\,lim}_{j \in J} \Sigma_j$, and each $\Sigma_j$ is minimal. Let $A$ be a closed subset of $X_\infty$ with $\sigma_\infty(A) \subseteq A$. Then for each $j \in J$, $\rho_j(A)$ is closed and $\sigma_j(\rho_j(A)) = \rho_j(\sigma_\infty(A)) \subseteq \rho_j(A)$, so that $\rho_j(A) = \emptyset$ or $X_j$. If $\rho_j(A) = X_j$ for all $j \in J$, then $A = X_\infty$. If $\rho_j(A) = \emptyset$ for any $j \in J$, then $A = \emptyset$. ∎

6. THE ELLIS SEMIGROUP OF $\Sigma$

Let $\Sigma = (X, \sigma)$ be a dynamical system. The topological space $X^X$ of all functions from $X$ to $X$ (not necessarily continuous) is a compact Hausdorff space with the product topology. It is also a semigroup, with composition of functions as the operation. Clearly, $\sigma^n \in X^X$ for each $n = 0, 1, 2, \ldots$.

Definition 2.13 The Ellis semigroup $E(\Sigma)$ of the dynamical system $\Sigma = (X, \sigma)$ is the closure in $X^X$ of the semigroup $T(\Sigma) = \{\sigma^n : n = 0, 1, 2, \ldots\}$.

We note first that $E(\Sigma)$ is compact Hausdorff and is a semigroup. The latter follows from the following observations. Let $E_0(\Sigma)$ be the set of all continuous functions in $E(\Sigma)$. Thus $T(\Sigma) \subseteq E_0(\Sigma) \subseteq E(\Sigma)$. It is immediately verified that

(i) $g \to gh$ is a continuous map of $E(\Sigma)$ for each $h \in E(\Sigma)$, and
(ii) $g \to hg$ is a continuous map of $E(\Sigma)$ for each $h \in E_0(\Sigma)$.

(Recall that the topology in $X^X$ is the topology of pointwise convergence of nets.) Now if $g, h \in E(\Sigma)$, then there exist nets $g_\alpha, h_\beta \in T(\Sigma)$ with $g_\alpha \to g$ and $h_\beta \to h$. From (ii) we have $g_\alpha h_\beta \to g_\alpha h$ (convergence in $\beta$) for fixed $\alpha$, and from (i) $g_\alpha h \to gh$. Thus $gh \in E(\Sigma)$, which is therefore a semigroup. Clearly, $E_0(\Sigma)$ is also a semigroup.

Theorem 2.4 $E(\Sigma)$ is a group iff $\Sigma$ is distal.

Proof Suppose first that $E(\Sigma)$ is a group, and that there exist $x, y, z \in X$ and a net $n_k$ of integers such that $\sigma^{n_k} x \to z$, $\sigma^{n_k} y \to z$. By compactness of $E(\Sigma)$ we may assume, by passing to a subnet if necessary, that $\sigma^{n_k} \to g \in E(\Sigma)$. But then $g(x) = g(y) = z$, and since $g$ is invertible, it follows that $x = y$.

Conversely, suppose that $\Sigma$ is distal. Then each $g \in E(\Sigma)$ is monic. For if $\sigma^{n_k} \to g$ and $g(x) = g(y)$, then by distality $x = y$. It follows immediately that $E(\Sigma)$ has a left cancellation law:

$g g_1 = g g_2 \Rightarrow g_1 = g_2$.

In particular, the only idempotent $g^2 = g$ in $E(\Sigma)$ is the identity $e$. Let $h \in E(\Sigma)$. We shall show that $h$ has a left inverse. Let $E_1 = \{gh : g \in E(\Sigma)\}$. Then $E_1^2 \subseteq E_1$, that is, $E_1$ is a subsemigroup of $E(\Sigma)$. According to (i) above, $E_1$ is closed [since $E(\Sigma)$ is compact]. Let $\mathcal{S}$ be the collection of all closed, nonempty subsets $S$ of $E_1$ such that $S^2 \subseteq S$. Since $E_1 \in \mathcal{S}$, it is a nonempty collection. Let $\mathcal{S}_1$ be a decreasing chain in $\mathcal{S}$. By compactness of $E_1$, $S_1 = \bigcap \mathcal{S}_1$ is nonempty. Moreover, $S_1$ is closed, and $S_1^2 \subseteq S_1$. By Zorn's lemma $\mathcal{S}$ must contain a minimal element $S_0$. Let $g \in S_0$. We shall show that $g^2 = g$. Since $S_0 g = \{fg : f \in S_0\} \in \mathcal{S}$ and $S_0$ is minimal, we have $S_0 g = S_0$. Thus $g = fg$ for some $f \in S_0$. Now $W = \{p \in S_0 : g = pg\}$ is nonempty, since $f \in W$. According to (i), $W$ is closed since it is the inverse image of $\{g\}$ under the continuous mapping $p \to pg$. Since also $W^2 \subseteq W$, we have $W \in \mathcal{S}$, and again by the minimality of $S_0$, $W = S_0$. This means that $g \in W$, and $g = g^2$ as asserted. From above we must have $g = e \in E_1$; that is, $h$ has a left inverse. Since $h \in E(\Sigma)$ was arbitrary, $E(\Sigma)$ is a group. ∎

Remark We have assumed up to now that distal systems were invertible. The above proof that $E(\Sigma)$ is a group does not make use of this assumption but yields invertibility of $\sigma$ as a byproduct. (See Exercise 18.)
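Theorem 2.4 can be seen at work in the simplest possible setting, a finite set with the discrete topology, where the closure in Definition 2.13 adds nothing and $E(\Sigma)$ is just the finite set of powers of $\sigma$. The sketch below is an illustration under that assumption; the maps `rotation` and `collapse` and all function names are hypothetical examples, not objects from the text. Distality is taken in the one-sided sense, as in the Remark above.

```python
def compose(g, h):
    """Composition gh: x -> g(h(x)); maps are tuples indexed by the points 0..m-1."""
    return tuple(g[h[x]] for x in range(len(h)))


def ellis_semigroup(sigma):
    """All powers sigma^n, n >= 0 (already closed, since X is finite and discrete)."""
    identity = tuple(range(len(sigma)))
    seen, g = [], identity
    while g not in seen:
        seen.append(g)
        g = compose(sigma, g)
    return seen


def is_group(elements):
    """A finite semigroup of maps containing the identity is a group iff
    every element has a left inverse in it."""
    identity = tuple(range(len(elements[0])))
    return all(any(compose(g, h) == identity for g in elements) for h in elements)


def is_distal(sigma):
    """One-sided distality on a finite discrete space:
    x != y implies sigma^n x != sigma^n y for all n >= 0."""
    return all(len(set(g)) == len(g) for g in ellis_semigroup(sigma))


rotation = (1, 2, 3, 0)   # a cyclic permutation of four points
collapse = (1, 2, 2, 0)   # not injective: points 1 and 2 both map to 2
```

For `rotation` the semigroup is the cyclic group of order four and the system is distal; for `collapse` the powers end in a constant map, which has no inverse, and distinct points are eventually identified.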

Corollary 2.4.1 Every distal system is semisimple.

Proof Suppose $\Sigma = (X, \sigma)$ is distal, and let $x \in X$. Let $y \in \bar{O}(x)$, say $\sigma^{n_k} x \to y$. By passing to a subnet, we may assume that $\sigma^{n_k} \to g \in E(\Sigma)$. Thus $g(x) = y$. Since $E(\Sigma)$ is a group, we have $x = g^{-1}(y)$ with $g^{-1} \in E(\Sigma)$. Thus $x \in \bar{O}(y)$. It follows that $\bar{O}(x) = \bar{O}(y)$ is a minimal set. Thus $X$ is a disjoint union of minimal sets for $\sigma$, and $\Sigma$ is semisimple. ∎

Now suppose that $\Sigma_1 = (X_1, \sigma_1)$ is a factor of the dynamical system $\Sigma = (X, \sigma)$, say $\psi: \Sigma \to \Sigma_1$. Then $\psi$ induces a semigroup homomorphism $\psi^*$ of $E(\Sigma)$ onto $E(\Sigma_1)$ as follows.

Suppose $g \in E(\Sigma)$, and let $g_\alpha$, $h_\beta$ be nets in $T(\Sigma)$ with $g_\alpha \to g$ and $h_\beta \to g$. Let $y \in X_1$. Then $y = \psi(x)$ for some $x \in X$. Corresponding to each $g_\alpha = \sigma^{n_\alpha} \in T(\Sigma)$ there is a $g_\alpha^* = \sigma_1^{n_\alpha} \in T(\Sigma_1)$. Moreover,

$g_\alpha^*(y) = \sigma_1^{n_\alpha} y = \sigma_1^{n_\alpha} \psi(x) = \psi(\sigma^{n_\alpha} x) = \psi(g_\alpha(x)) \to \psi(g(x))$.

Similarly, $h_\beta^*(y) = \psi(h_\beta(x)) \to \psi(g(x))$. That is, both nets $g_\alpha^*$ and $h_\beta^*$ converge to the same element $g^* \in E(\Sigma_1)$, and $g^*(\psi(x)) = \psi(g(x))$. Let us write $\psi^* g = g^*$. Thus $\psi^* g$ is the unique element of $E(\Sigma_1)$ that completes the diagram

[Commutative diagram (12): $g: X \to X$ on top, $\psi^* g: X_1 \to X_1$ on the bottom, vertical maps $\psi: X \to X_1$]

The following observations are easily verified by examining diagram (12) and recalling that $\psi^* \sigma = \sigma_1$.

Proposition 2.8 If $\psi^*: E(\Sigma) \to E(\Sigma_1)$ is defined by (12), then

(i) $\psi^*$ is onto,
(ii) $\psi^*$ is a semigroup homomorphism,
(iii) $\psi^* g$ is continuous when $g$ is continuous, and
(iv) $\psi^*$ is continuous.

Combined with Theorem 2.4 this yields the following result promised in Section 4.

Corollary 2.4.2 Factors of distal systems are distal.

Proof The homomorphic image of a group is a group. ∎

Proposition 2.9 If $\Sigma$ is equicontinuous, then $E(\Sigma)$ is a group of homeomorphisms.

Proof If $\Sigma$ is equicontinuous, it is distal. Hence $E(\Sigma)$ is a group. Moreover, equicontinuity of the mappings $\sigma^{n_\alpha}$ is easily seen to imply continuity of the limit function $g \in E(\Sigma)$, where $\sigma^{n_\alpha} \to g$. Since $g$ is invertible and $X$ is compact, $g$ is a homeomorphism. ∎

Remarks 1 The time has come to reveal our fraud. Much more than Proposition 2.9 can be proved. First of all (see [18]), the converse of that proposition is true. Moreover, using a result from [17] and observations (i) and (ii) following Definition 2.13, it can be shown that $E(\Sigma)$ is a topological group when $\Sigma$ is equicontinuous; that is, multiplication is jointly continuous. If, in addition, $\Sigma$ is minimal, then $E(\Sigma)$ is homeomorphic to $X$ [map $g$ to $g(x_0)$ for some fixed $x_0 \in X$] and the inverse homeomorphism carries the action of $\sigma$ on $X$ to multiplication by $\sigma$ in $E(\Sigma)$. That is,

$\psi(\sigma g) = g(\sigma x_0) = \sigma(\psi(g))$. (13)

Thus every equicontinuous dynamical system is a disjoint union (sum) of systems each of which is given by a rotation of a compact group. Such rotations are special cases of the affine transformations to be discussed in Chapter III.

2 The situation for distal systems is much more complex. However, Furstenberg [22] has given a complete structure theorem in this case as well. We shall content ourselves here by stating (without proof) a related characterization of minimal distal systems from [23]. First we must define group extensions.

Definition 2.14 Let $\Sigma = (X, \sigma)$ be a dynamical system, and let $G$ be a group of homeomorphisms of $X$. We assume that

(i) $(g, x) \to g(x)$ is jointly continuous for $g \in G$, $x \in X$,
(ii) $(gh)(x) = g(h(x))$ for all $g, h \in G$, $x \in X$,
(iii) $g(x) = x$ for some $x \in X$ $\Rightarrow$ $g$ is the identity of $G$, and
(iv) $g(\sigma x) = \sigma g(x)$ for all $g \in G$, $x \in X$.

The orbit space $X/G$, whose typical element is a set of the form $Gx = \{g(x) : g \in G\}$ for some $x \in X$, is compact in the quotient topology and supports a dynamical system $\Sigma/G = (X/G, \tilde{\sigma})$, where $\tilde{\sigma}(Gx) = G(\sigma x)$. $\Sigma/G$ is a factor of $\Sigma$, and $\Sigma$ is said to be a group extension of $\Sigma/G$. We remark in passing that, if $\Sigma$ is equicontinuous, then $G = E(\Sigma)$ is such a group. In this case, the orbit space consists of isolated points, the minimal sets for $\sigma$, and $\tilde{\sigma}$ is the identity. Thus every minimal equicontinuous system is a group extension of the trivial (one-point) system.
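Conditions (ii)-(iv) of Definition 2.14 can be checked by hand or by machine for the skew product of Example 5. The sketch below rests on an assumption that is not stated in the text: the group is taken to be $G = K$ acting on $K \times K$ by rotating the second coordinate, $c \cdot (z, w) = (z, cw)$. With that choice each orbit $Gx$ is a circle $\{z\} \times K$, so the quotient system is the rotation $z \to e^{2\pi i\alpha} z$. The value of `ALPHA`, the sample points, and the helper names are illustrative.

```python
import cmath
import math

ALPHA = math.sqrt(2) - 1  # assumed irrational number in (0, 1)


def sigma(point):
    """The skew product of Example 5: (z, w) -> (e^{2 pi i alpha} z, z w)."""
    z, w = point
    return cmath.exp(2j * math.pi * ALPHA) * z, z * w


def act(c, point):
    """Action of c in K on K x K by rotating the second coordinate (assumed choice of G)."""
    z, w = point
    return z, c * w


def close(p, q, tol=1e-12):
    """Coordinatewise comparison up to rounding error."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol


point = (cmath.exp(0.7j), cmath.exp(2.1j))
c, d = cmath.exp(1.3j), cmath.exp(-0.4j)

commutes = close(act(c, sigma(point)), sigma(act(c, point)))   # condition (iv)
is_action = close(act(c * d, point), act(c, act(d, point)))    # condition (ii)
moved = not close(act(c, point), point)                        # c != 1 fixes no point, cf. (iii)
```

A finite check of this kind is of course not a proof, but the identities involved are one-line computations: both sides of (iv) equal $(e^{2\pi i\alpha} z, czw)$.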

Theorem (Furstenberg) The class of minimal distal systems is the smallest class $\mathcal{C}$ of dynamical systems satisfying:

(a) the trivial system is in $\mathcal{C}$;
(b) factors of systems in $\mathcal{C}$ are in $\mathcal{C}$;
(c) inverse limits of systems in $\mathcal{C}$ are in $\mathcal{C}$;
(d) group extensions of systems in $\mathcal{C}$ are in $\mathcal{C}$ provided they are minimal.

3 For a detailed account of the algebraic approach to dynamical systems begun in this section the reader is referred to the monograph [18] by Ellis.

7. EXPANSIVE SYSTEMS

The motivation for the concept introduced in this section is the symbolic or shift dynamical system of Example 1. In many regards, expansive systems lie at the opposite end of the spectrum from distal systems. We shall assume in this section that $\Sigma$ is invertible.

Definition 2.15 The (invertible) dynamical system $\Sigma = (X, \sigma)$ is said to be expansive if there exists an index $\alpha$ on $X$ (the expansive index) such that for all $x, y \in X$ with $x \ne y$ there is an integer $n$ with $(\sigma^n x, \sigma^n y) \notin \alpha$.

Example If the $X$ of Example 1 is endowed with the metric

$d(x, y) = (1 + \min\{|n| : x_n \ne y_n\})^{-1}$,

then an expansive index for $\Sigma$ is given by $\alpha = \{(x, y) : d(x, y) < 1\}$.
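The expansive index of this example can be explored with a small computation. The following sketch makes several assumptions that go beyond the text: two-sided sequences are modeled as Python functions from the integers to symbols, the shift is taken with the convention $(\sigma x)_n = x_{n+1}$, and differences are searched for only inside a finite window. The names `shift`, `d`, and `separating_time` are hypothetical.

```python
def shift(x, steps=1):
    """(sigma^steps x)_n = x_{n + steps}; sequences are functions from the integers to symbols."""
    return lambda n: x(n + steps)


def d(x, y, window=1000):
    """d(x, y) = 1 / (1 + min{|n| : x_n != y_n}), searching only |n| <= window."""
    for m in range(window + 1):
        if x(m) != y(m) or x(-m) != y(-m):
            return 1.0 / (1 + m)
    return 0.0  # no difference found inside the search window


def separating_time(x, y, window=1000):
    """Find n with d(sigma^n x, sigma^n y) = 1, i.e. (sigma^n x, sigma^n y) outside alpha."""
    for m in range(window + 1):
        for n in (m, -m):
            if d(shift(x, n), shift(y, n), window=0) == 1.0:
                return n
    return None


def x(n):
    """The constant sequence of zeros."""
    return 0


def y(n):
    """Differs from x only in coordinate 37."""
    return 1 if n == 37 else 0
```

Here `d(x, y)` equals 1/38, so the two points are close, yet shifting the differing coordinate to position 0 pushes the pair out of $\alpha$; any two distinct sequences can be separated in the same way.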

Remarks 1 If $\alpha$ is an expansive index, then clearly any $\beta \subseteq \alpha$ is also. However, the above example exhibits a maximal such index.

2 If $\Sigma$ is expansive, it cannot be equicontinuous. In fact (Exercise 23), it cannot even be distal.

3 It is clear that subsystems and finite sums of expansive systems are expansive. It is not obvious that factors are. On the other hand, the principal structure theorem below says that all expansive systems can be obtained as factors of subsystems of the symbolic systems of Example 1. This result was obtained independently by Keynes and Robertson [40] and by Reddy [50].

We need to define the notion of a "generator" for the dynamical system $\Sigma$ (so called because of its role in the calculation of topological entropy).

Definition 2.16 Let $\Sigma = (X, \sigma)$ be an invertible dynamical system. Let $\mathcal{U}$ be a finite open cover of $X$. Then $\mathcal{U}$ is a generator for $\Sigma$ if

$\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{A}_n)$

is either empty or a single point for each choice of $A_n \in \mathcal{U}$ $(n \in Z)$.

Example If $\Sigma$ is the symbolic system on $k$ points of Example 1, we can take $\mathcal{U} = \{U_0, U_1, \ldots, U_{k-1}\}$, where $U_j = \{x \in X : x_0 = j\}$. Then $\bar{A}_n = A_n$ for each $A_n \in \mathcal{U}$, and $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(A_n)$ contains exactly one point, namely $x = \{x_n\}$, where $x_n = j$ if $A_n = U_j$.

Clearly, $\mathcal{U} \cap Y = \{U \cap Y : U \in \mathcal{U}\}$ is a generator for the subsystem $\Sigma_1 = (Y, \sigma)$ of $\Sigma$ when $\mathcal{U}$ is a generator for $\Sigma$.

Proposition 2.10 $\Sigma$ has a generator iff it is expansive.

Proof Suppose $\Sigma$ is expansive with expansive index $\alpha$. Choose a symmetric open index $\beta$ on $X$ with $\beta^2 \subseteq \alpha$, that is, a $\beta$ such that

$(x, y) \in \beta \Rightarrow (y, x) \in \beta$,
$(x, y) \in \beta,\ (y, z) \in \beta \Rightarrow (x, z) \in \alpha$.

Since $X$ is compact, there is a finite cover $\mathcal{U}$ of $X$ consisting of sets of the form $x\beta = \{y : (x, y) \in \beta\}$ ($\beta$-neighborhoods). Suppose that there exists a choice of sets $A_n \in \mathcal{U}$ $(n \in Z)$ such that $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(A_n)$ contains two distinct points $x$ and $y$. If $A_n = x_n\beta$, then

$(\sigma^n x, \sigma^n y) \in (x_n\beta) \times (x_n\beta) \subseteq \beta^2 \subseteq \alpha$

for each $n \in Z$. But this contradicts the assumption that $\alpha$ is an expansive index. Thus $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(A_n)$ contains at most one point for each choice of $A_n \in \mathcal{U}$ $(n \in Z)$. In general, $\mathcal{U}$ will not be a generator, but we shall make use of it to construct one. For each $x \in X$, let $x \in A_x \in \mathcal{U}$. Choose an open neighborhood $V_x$ of $x$ with $\bar{V}_x \subseteq A_x$. The sets $V_x$ cover $X$ and we can choose a finite subcover $\mathcal{V}$. For each choice of $B_n \in \mathcal{V}$ $(n \in Z)$, $\bar{B}_n \subseteq A_n \in \mathcal{U}$ and so $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{B}_n)$ contains at most one point. Thus $\mathcal{V}$ is a generator.

Conversely, suppose that $\mathcal{U}$ is a generator for $\Sigma$. Then there exists an index $\alpha$ on $X$ (called the Lebesgue index of $\mathcal{U}$) such that for each $x \in X$ there is some $U \in \mathcal{U}$ that contains $x\alpha$. Suppose that $(\sigma^n x, \sigma^n y) \in \alpha$ for each $n \in Z$. Choose $A_n \in \mathcal{U}$ such that $(\sigma^n x)\alpha \subseteq A_n$. Then $\sigma^n y \in A_n$ and, clearly, $\sigma^n x \in A_n$. It follows that both $x$ and $y$ belong to $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(A_n)$. Since $\mathcal{U}$ is a generator, $x = y$. ∎

The existence of a generator for $\Sigma$ turns out to be exactly what is needed for an efficient "coding" of the sequence $\{\sigma^n x\}$ for $x \in X$. Indeed, if $\mathcal{U}$ is a generator for $\Sigma$, say $\mathcal{U} = \{U_0, U_1, \ldots, U_{k-1}\}$, then we can define a mapping $\psi$ of a subset $Z$ of $Y = \prod_{n=-\infty}^{\infty} \{0, 1, \ldots, k - 1\}$ into $X$ as follows. Let $Z$ be the set of sequences $y = \{y_n\} \in Y$ such that $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n})$ is nonempty, and define $\psi(y)$ for $y \in Z$ to be the unique element $x$ belonging to that intersection.

Proposition 2.11 $Z$ is a closed shift-invariant subset of $Y$, and the mapping $\psi: Z \to X$ is continuous and epic.

Proof Since $\mathcal{U}$ is a cover, for each $x \in X$ there exists (possibly more than) one sequence $\{y_n\}$ such that $\sigma^n x \in U_{y_n}$ $(n \in Z)$. Thus $\psi$ is epic. Moreover, if $\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n}) \ne \emptyset$, then

$\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_{n+1}}) = \sigma\Big(\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n})\Big) \ne \emptyset$,

so that $Z$ is shift invariant. That is, $\tau(Z) = Z$, where $\tau$ is the shift operator on $Y$ defined by $\tau(y) = w$, with $w_n = y_{n+1}$. Now let $y^j = \{y_n^j\}$ be a sequence in $Z$. Thus there exists a sequence $x_j \in X$ such that

$x_j \in \bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n^j})$. (14)

Suppose that $y^j \to y \in Y$. By passing to a subnet if necessary, we may assume that $x_j \to x \in X$. For each $n \in Z$ the net of integers $y_n^j$ converges to the integer $y_n$. It follows that $y_n^j = y_n$ for $j \ge j(n)$. Thus $\sigma^n x_j \in \bar{U}_{y_n}$ for $j \ge j(n)$, and, by continuity of $\sigma^n$, $\sigma^n x \in \bar{U}_{y_n}$. That is,

$x \in \bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n})$,

so that $Z$ is closed and

$\psi(y) = x$. (15)

Since (15) holds for any cluster point of the sequence $\{x_j\}$, it follows from (14) and (15) that $x_j = \psi(y^j) \to x = \psi(y)$. Thus $\psi$ is continuous. ∎

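Theorem 2.5 (Keynes-Robertson-Reddy) Let $\Sigma$ be an expansive dynamical system. Then $\Sigma$ is a factor of a subsystem of a symbolic system.

Proof It only remains to show that $\psi: (Z, \tau) \to \Sigma$ is a homomorphism, that is, that $\psi\tau = \sigma\psi$. But this is a routine verification:

$\{\psi(\tau y)\} = \bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_{n+1}}) = \sigma\Big(\bigcap_{n=-\infty}^{\infty} \sigma^{-n}(\bar{U}_{y_n})\Big) = \{\sigma\psi(y)\}$. ∎

Corollary 2.5.1 If $\Sigma = (X, \sigma)$ is expansive, then $X$ is metrizable.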

The homomorphism $\psi$ of Theorem 2.5 is not in general an isomorphism, so the coding referred to above is not well defined. However, it can be shown to be so in case $X$ is zero-dimensional [40, 50]. In this case, of course, the converse of the theorem also holds (with "factor of a" deleted).

We conclude this section by noting that W. Krieger has shown [42] that every ergodic invertible abstract dynamical system with finite entropy (see Chapter IV) is measure-theoretically isomorphic to a subsystem of a symbolic system, hence to an expansive system. Moreover, the subsystem may be chosen to be uniquely ergodic.

EXERCISES

Minimality and Ergodicity

1. (a) Discuss the "adding machine" transformation $\phi$ (Exercise 5 of Chapter I) as a transformation of a compact (totally disconnected) Hausdorff space. (Use the product topology.)
(b) Let $\mu$ be any invariant Borel measure for this $\phi$. Show that

$\phi^{-a}\{x : x_1 = a\} = \{x : x_1 = 0\}$ $(a = 0, 1, \ldots, k_1)$,

and so

$\mu\{x : x_1 = a\} = \dfrac{1}{k_1 + 1}$ $(a = 0, 1, \ldots, k_1)$.

(c) Show that there exists an integer $p$ depending on $a_1, \ldots, a_n$ such that

$\phi^{-p}\{x : x_1 = a_1, \ldots, x_n = a_n\} = \{x : x_1 = 0, \ldots, x_n = 0\}$

and conclude that this set must have measure

$\dfrac{1}{k_1 + 1} \cdot \dfrac{1}{k_2 + 1} \cdots \dfrac{1}{k_n + 1}$.

This proves that $\mu$ is the measure constructed in Chapter I, and that $\phi$ is uniquely ergodic.

2. Show that the symbolic dynamical system of Example 1 is not uniquely ergodic.

3. If $(X, \sigma)$ is invertible and uniquely ergodic, then $(X, \mathcal{B}, \mu, \sigma)$ is ergodic, where $\mu$ is the unique $\sigma$-invariant measure.

4. Let $\Sigma = (X, \sigma)$ be an invertible dynamical system. A point $x \in X$ is almost periodic if for each open neighborhood $U$ of $x$ the set $A_x(U) = \{n \in Z : \sigma^n(x) \in U\}$ has "bounded gaps"; that is, there exists an integer $k$ such that every set of $k$ successive integers contains at least one element of $A_x(U)$.
(a) If $x \in X_1 \subseteq X$ is such that $\Sigma_1 = (X_1, \sigma)$ is minimal, and if $U$ is a neighborhood of $x$, show as in the proof of Theorem 2.2 that $\bar{O}(x) \subseteq \bigcup_{j=1}^{r} \sigma^{n_j}(U)$. Deduce that the gaps in $A_x(U)$ are bounded by $k = \max_j n_j - \min_j n_j$. Thus every invertible dynamical system has an almost periodic point, and each point is almost periodic for a minimal invertible system.
(b) Show conversely that if $x$ is almost periodic, then $\bar{O}(x)$ is a minimal invariant set. [Show $y \in \bar{O}(x) \subseteq \bigcup_{j=1}^{k} \sigma^{-j}(\bar{U})$ implies $O(y) \cap \bar{U} \ne \emptyset$, and hence that $x \in \bar{O}(y)$.]

5. (a) Referring to Example 1, show that $x \in X$ is an almost periodic point iff for each integer $n \ge 1$ there exists a set $A_n$ with bounded gaps such that

$x_{i+j} = x_i$ $(-n \le i \le n,\ j \in A_n)$. (16)

(b) For the $x$ of Example 4 and for $n = 0$, show that (16) holds, with the gaps in $A_0 = \{j : x_j = x_0 = 0\}$ bounded by 3.
(c) It can be shown [28, p. 107] that, in general, (16) holds with the gaps in $A_n$ bounded by a power of 2 depending on $n$. Deduce that $\Sigma_x = (\bar{O}(x), \sigma)$ is minimal.
(d) Find the smallest positive $j \in A_n$ for $n = 0, 1, 2, 3, 4$.

6. If $x \in X$ is a generic point for $\Sigma = (X, \sigma)$, that is, if the limit

$L_x(f) = \lim_{n \to \infty} \dfrac{1}{n} \sum_{k=0}^{n-1} f(\sigma^k x)$

exists for each $f \in C(X)$, show that $L_x$ is a positive linear functional on $C(X)$ with $L_x(1) = \|L_x\| = 1$ and $T_\sigma^* L_x = L_x$. Hence there exists a unique normalized Borel measure $\mu_x$ on $X$ such that

$L_x(f) = \int_X f \, d\mu_x$ $(f \in C(X))$.

7. (a) Suppose that $\Sigma = (X, \sigma)$ is a minimal dynamical system, and that $\mu$ is a $\sigma$-invariant Borel measure on $X$. If $U$ is a nonempty open set, show that there exist positive integers $n_1, n_2, \ldots, n_r$ such that $X \subseteq \bigcup_{j=1}^{r} \sigma^{-n_j}(U)$, and hence that $\mu(U) > 0$. This means that the support of $\mu$ is $X$.

(b) If $\Sigma$ is not minimal, there exists a point $x \in X$ with $\bar{O}^+(x) \ne X$. Show that the measure $\mu_x$ defined as in Exercise 6 by

$\int_X f \, d\mu_x = \lim_{n \to \infty} \dfrac{1}{n} \sum_{k=0}^{n-1} f(\sigma^k x)$ $(f \in C(X))$

has its support contained in $\bar{O}^+(x)$. (Find a continuous $f \ge 0$ which vanishes on $\bar{O}^+(x)$ and equals 1 on the compact set $A \subseteq X \setminus \bar{O}^+(x)$. Use regularity of $\mu_x$ to conclude that $\mu_x(X \setminus \bar{O}^+(x)) = 0$.)

8. (a) Show that the set $M(\Sigma) \subseteq M(X)$ of normalized $\sigma$-invariant Borel measures on $X$ is weak*-compact and convex.
(b) A point $\mu$ of a convex set $M$ is an extreme point if it cannot be expressed as a nontrivial convex combination of two distinct points of $M$. Show that the extreme points of $M(\Sigma)$ are the ergodic invariant measures.
(c) Show (by the Krein-Milman theorem [16, p. 440]) that a dynamical system $\Sigma$ always has an ergodic invariant measure, and that $\Sigma$ is uniquely ergodic iff there is only one ergodic invariant measure.

9. Consider the following assertions about the invertible dynamical system $\Sigma = (X, \sigma)$:
(1) $\Sigma$ is minimal;
(2) $\Sigma$ is ergodic, that is, the set of points $x \in X$ for which $\bar{O}(x) \ne X$ is of first category;
(3) $\Sigma$ is uniquely ergodic;
(4) $\bar{O}(x) = X$ for some $x \in X$;
(5) there exists an ergodic $\sigma$-invariant Borel measure $\mu$ with support $X$;
(6) $f \in C(X)$, $T_\sigma f = f$ $\Rightarrow$ $f$ = constant;
(7) $f \in B(X)$, $T_\sigma f = f$ $\Rightarrow$ $f$ = constant, where $B(X)$ is the space of all bounded functions on $X$.
(a) Show that the following implications hold:

(1) $\Rightarrow$ (5) $\Rightarrow$ (2) $\Rightarrow$ (4) $\Rightarrow$ (6)

[Diagram: further arrows attach (7) and (3) to this chain]

(b) Show that the shift dynamical system (Example 1) on two points satisfies (4) but not (3).
(c) Show that the transformation $\sigma$ defined on the unit circle $K = \{z : |z| = 1\}$ by $\sigma(e^{2\pi i t}) = \exp(2\pi i t^2)$ satisfies (3) but not (4).

Equicontinuity and Distality

10. Show that the following assertions are pairwise equivalent.
(1) $\Sigma = (X, \sigma)$ is distal.

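(2) For each $x \ne y$ there exists an index $\alpha$ on $X$ with $(\sigma^n(x), \sigma^n(y)) \notin \alpha$ for all $n$.
(3) The orbit closure $\bar{O}((x, y))$ of $(x, y)$ in $\Sigma \otimes \Sigma$ satisfies $\bar{O}((x, y)) \cap \Delta = \emptyset$ for $x \ne y$.
(4) $\{(x, y) : \forall$ index $\alpha$ $\exists$ integer $n$ with $(\sigma^n(x), \sigma^n(y)) \in \alpha\} = \Delta$.
(5) $\sigma^{n_k}(x) \to z$, $\sigma^{n_k}(y) \to z$ for some generalized sequence $n_k$ of integers $\Rightarrow$ $x = y$.

11. Show that the system in Example 5 is minimal.

12. Verify the details of Example 6.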

Sums, Products, Subsystems, and Factors

Prove Proposition 2.5. (b) Show that nontrivial direct sums are neuer minimal. Show that C @ C is not minimal. 14. Give an example of an infinite direct sum C = ej,,Cj such that each Cj is equicontinuous, but I: is not. 15. (a) Suppose that $: C, + C,.Let M ( C , ) and M ( X , ) be defined as in Exercise 8. For each v E M ( C , ) show that 13. (a)

A

= {p E

M ( X , ) : T,*p = v, p 2 0, p ( X , ) = 1)

is a weak*-closed, convex, nonempty set, and that T,*,(A)z A . Hence deduce that there exists a p E M ( C , ) with T , * p = v. (b) Show that a factor of a uniquely ergodic dynamical system is uniquely ergodic. How about products, sums, subsystems of uniquely ergodic systems? Inverse Limits 16. Show that the system Coo described in the Remark following and that any other Definition 2.13 is an inverse limit of ( J , C j , inverse limit is isomorphic to C, . 17. Show that an infinite direct product is the inverse limit of finite direct products.

The Ellis Semigroup 18. Give a "one-sided" definition of distality for noninvertible systems. Show that any such system must in fact be invertible and distal in the former sense.


19. Prove Proposition 2.8.

20. Show that $\Sigma$ is distal iff $\Sigma \otimes \Sigma$ is semisimple. Use this to obtain another proof of Corollary 2.4.2.

21. If $\Sigma$ is minimal and equicontinuous, show that $g(x_0) = x_0$ for some $x_0 \in X$ implies $g(\sigma^n x_0) = \sigma^n x_0$ for each $n$ and hence that $g(x) = x$ for all $x \in X$. Conclude that the mapping $\psi$ defined by $\psi(g) = g(x_0)$ is an isomorphism of $\hat\Sigma = (E(\Sigma), \hat\sigma)$ onto $\Sigma = (X, \sigma)$, where $\hat\sigma(g) = \sigma g$.

22. (a) Let $\sigma$ be rotation through an irrational multiple of $\pi$ on the circle $K = \{z : |z| = 1\}$. Show by direct calculation that $E(K, \sigma)$ is homeomorphic to $K$. (b) Let $\sigma$ be defined on $K$ by

$$\sigma(e^{2\pi i x}) = \exp(2\pi i x^2).$$

Show that $E(K, \sigma) = T(K, \sigma) \cup \{g\}$, where $g$ is the constant function $g(z) = 1$.

(c) Let $\Sigma$ be as in Example 5. Show that $E(\Sigma)$ consists of all functions $g$ of the form

$$g(z, w) = (\lambda z, h(z)w),$$

where $\lambda$ is an arbitrary complex number of absolute value one, and $h$ is any (not necessarily continuous) function from $K$ to $K$ satisfying $h(e^{2\pi i \alpha} z) = \lambda h(z)$.

Expansive Systems

23. Suppose $\Sigma = (X, \sigma)$ is expansive. Show that there exist points $x, y \in X$ that are positively asymptotic, that is, such that $\lim_{n \to +\infty} \rho(\sigma^n x, \sigma^n y) = 0$, where $\rho$ is a metric yielding the topology of $X$. Conclude that $\Sigma$ is not distal.

24. Show by example that neither $\psi$ nor $\psi^{-1}$ necessarily carries generators onto generators, where $\psi: \Sigma \to \Sigma_1$ is a homomorphism.

25. Prove Corollary 2.5.1 by showing that a continuous map from a second-countable compact space onto a Hausdorff space has a second-countable image.


CHAPTER III

Group Automorphisms and Affine Transformations

1. DYNAMICAL SYSTEMS ON GROUPS

In this chapter we particularize the general results of the previous chapters to the study of a continuous affine transformation $\phi$ of a compact abelian group $G$. The system $(G, \phi)$ is a classical dynamical system in the sense of Chapter II and, with the Haar measure structure on $G$, becomes an abstract dynamical system such as was studied in Chapter I. Thus the results of the first two chapters are combined in a natural way. Moreover, as we shall see, much more complete structure and representation theorems are possible here than in the general case. This is partially due to the fact that the duality theory for locally compact groups permits us to study analytic properties of such systems by algebraic methods on discrete (nontopological) groups, as in the beautiful result of Seethoff [56] concerning automorphisms with zero entropy (Theorem 4.11 of Chapter IV). On the other hand, the now-classical results of Abramov [2] and of Hahn and Parry [29] [Theorem 3.9(ii)] concerning systems with quasidiscrete spectrum indicate the degree of generality attained in studying dynamical systems on groups. (See also Theorem 2.5 and Remark 1 following Proposition 2.9.)


Throughout this chapter, we shall let $G$ denote a compact abelian (Hausdorff) group. We shall denote the group operation by $+$ and the identity element of $G$ by 0, except in certain examples where the multiplicative notation is more natural. The dual group (or character group) of $G$ will be denoted by $\hat G$ and the value of the character $\gamma \in \hat G$ at $x \in G$ by $\langle x, \gamma\rangle$. Since $G$ is compact, $\hat G$ is discrete, that is, an abstract group. The normalized (total measure one) Haar measure on $G$ will be denoted by $m$ and the Fourier transform of a function $f \in L_1(G, m)$ by $\hat f$. Thus

$$\hat f(\gamma) = \int_G f(x)\,\overline{\langle x, \gamma\rangle}\, m(dx) \qquad (\gamma \in \hat G). \tag{1}$$

For further explanation of these terms as well as the basic theorems we will need about topological groups, the reader is referred, for example, to the first two chapters of the monograph [53].

Let $\tau: G \to G$ be a continuous automorphism (or epimorphism). It is easily demonstrated that $\tau$ is a Haar measure-preserving transformation (Exercise 1). Likewise, for each $a \in G$ the translation $x \to x + a$ ($x \in G$) is a measure-preserving transformation. It follows that the formula

$$\phi(x) = \tau(x) + a \tag{2}$$

defines a continuous function $\phi: G \to G$ which, as the composition of two measure-preserving transformations, is measure preserving. A function of the form (2) is said to be an affine transformation of $G$.

Example 1 The simplest nonfinite compact group is the unit interval $G = \{x : 0 \le x < 1\}$ with addition modulo one or, equivalently, the circle group $K = \{z \in \mathbb{C} : |z| = 1\}$ in the complex plane, with complex multiplication as the group operation. The correspondence between $G$ and $K$ is given by $x \leftrightarrow e^{2\pi i x}$. The dual $\hat G$ of $G$ is the additive group $\mathbb{Z}$ of integers, with $\langle x, n\rangle = e^{2\pi i n x}$. The only continuous automorphisms of $G$ are the identity $x \to x$ and the map $x \to 1 - x$. Thus the affine transformations of $G$ built from automorphisms are of the form $x \to x + a$ or $x \to a - x$. On the other hand, for each nonzero integer $n \in \mathbb{Z}$ the map $x \to nx$ modulo one is an epimorphism of $G$.

We define the adjoint $\tau^*: \hat G \to \hat G$ of a continuous endomorphism $\tau: G \to G$ by

$$\langle x, \tau^*(\gamma)\rangle = \langle \tau(x), \gamma\rangle \qquad (x \in G,\ \gamma \in \hat G).$$

Then $\tau^*$ is an endomorphism, which is epic (monic) iff $\tau$ is monic (epic).

Example 2 The $n$-torus $K^n$ is the product of $n$ copies of the circle group $K$ or $G$ of the previous example. Let us adopt the additive notation as in $G$ of that example. Thus the dual of $K^n$ is the lattice $\mathbb{Z}^n$ of points in $n$-dimensional Euclidean space having integer coordinates. The duality is given by

$$\langle x, \gamma\rangle = \exp[2\pi i(\gamma_1 x_1 + \gamma_2 x_2 + \cdots + \gamma_n x_n)]. \tag{3}$$

Let $T = (t_{ij})$ be any $n \times n$ matrix with integer entries $t_{ij}$ ($i, j = 1, \ldots, n$) and nonzero determinant. Define $\tau$ on $K^n$ by considering $x = (x_1, \ldots, x_n) \in K^n$ as a column vector and setting $\tau(x) = Tx$ modulo one. It follows that $\tau(x) \in K^n$ and that $\tau$ is an endomorphism. Indeed, $\tau$ is an epimorphism (Exercise 2). If the determinant of the integral matrix $T$ is $\pm 1$, it is called unimodular.

Proposition 3.1 Each unimodular matrix $T = (t_{ij})$ determines an automorphism $\tau$ of $K^n$ by $\tau(x) = Tx$ modulo one. Conversely, every continuous automorphism of $K^n$ has such a representation.

Proof Suppose that $T$ is unimodular. In particular, $T$ is nonsingular and the inverse matrix $T^{-1}$ is also unimodular. Thus $\sigma(x) = T^{-1}x$ modulo one defines an endomorphism of $K^n$. Thus for $x \in K^n$ we have $T^{-1}x = \sigma(x) + y$, where $y = (y_1, \ldots, y_n)$ has only integer components. It follows that $x = T(T^{-1}x) = T(\sigma(x)) + Ty = T(\sigma(x))$ modulo one, that is, $\tau(\sigma(x)) = x$. Similarly, $\sigma(\tau(x)) = x$ for all $x \in K^n$. Hence $\sigma = \tau^{-1}$, and $\tau$ is an automorphism.

Conversely, suppose that $\tau$ is an automorphism of $K^n$. Let $\tau^*$ denote the adjoint of $\tau$ on $\mathbb{Z}^n$. For each $i = 1, 2, \ldots, n$ let $\delta^i \in \mathbb{Z}^n$ be the vector whose $j$th component is $\delta_{ij}$. Define $t_{ij}$ to be the $j$th component of $\tau^*(\delta^i)$. For each $\gamma \in \mathbb{Z}^n$ we have

$$\begin{aligned}
\tau^*(\gamma) &= \tau^*(\gamma_1\delta^1 + \cdots + \gamma_n\delta^n) = \gamma_1\tau^*(\delta^1) + \cdots + \gamma_n\tau^*(\delta^n) \\
&= \gamma_1(t_{11}, \ldots, t_{1n}) + \cdots + \gamma_n(t_{n1}, \ldots, t_{nn}) \\
&= (\gamma_1 t_{11} + \cdots + \gamma_n t_{n1},\ \ldots,\ \gamma_1 t_{1n} + \cdots + \gamma_n t_{nn}) = \gamma T.
\end{aligned} \tag{4}$$

Now define $\sigma: K^n \to K^n$ by $\sigma(x) = Tx$ modulo one. For $x, y \in \mathbb{R}^n$ let us set $y \cdot x = y_1x_1 + \cdots + y_nx_n$. For fixed $x \in K^n$ let $y = \sigma(x)$. Then for each $\gamma \in \mathbb{Z}^n$ we have

$$\langle \sigma(x), \gamma\rangle = e^{2\pi i\,\gamma \cdot y} = e^{2\pi i\,\gamma \cdot (Tx)} = e^{2\pi i\,(\gamma T) \cdot x} = \langle x, \tau^*(\gamma)\rangle = \langle \tau(x), \gamma\rangle.$$

It follows that $\sigma(x) = \tau(x)$ for all $x \in K^n$, or $\sigma = \tau$. Clearly, $T$ has integer entries. Moreover, there is an integral matrix $S$ such that $\tau^{-1}(x) = Sx$ modulo one. It follows that $(TS - I)x$ and $(ST - I)x$ have integer components for all $x \in K^n$. But this can only be true if $TS - I = ST - I = 0$ is the zero matrix.


Thus $T^{-1} = S$ also has integer entries. Hence $\det(T)$ and $\det(T^{-1})$ are integers. Finally, $\det(T)\det(T^{-1}) = \det(I) = 1$ implies that $\det(T) = \det(T^{-1}) = \pm 1$. ∎

Example 3 Let $G_0$ be an arbitrary compact abelian group and define $G = \bigotimes_{n=-\infty}^{\infty} G_n$ to be the countable direct product (complete direct sum) of copies $G_n$ of $G_0$. Define $\sigma: G \to G$ by $\sigma(x) = y$, where $y_n = x_{n-1}$. The transformation $\sigma$ is a generalized shift or symbolic flow on the group $G_0$ and is easily seen to be a bicontinuous group automorphism. In case $G_0$ is the finite group $G_0 = \{0, 1, \ldots, k - 1\}$ of integers modulo $k$, $\sigma$ is the inverse of the transformation of Example 3, Chapter I, and Example 1, Chapter II. The dual $\hat G$ of the compact group $G$ is the discrete group $\hat G = \bigoplus_{n=-\infty}^{\infty} \hat G_n$, where $\hat G_n = \hat G_0$ is the discrete dual of $G_0$ and the sum is the algebraic direct sum. Thus a typical element $\gamma \in \hat G$ is a sequence $\gamma = (\gamma_n)$, where $\gamma_n = 0$ for all but a finite number of values of $n$. For $x \in G$, $\gamma \in \hat G$, we set

$$\langle x, \gamma\rangle = \prod_{n=-\infty}^{\infty} \langle x_n, \gamma_n\rangle.$$

Direct computation shows that the adjoint $\sigma^*$ of $\sigma$ is given by $\sigma^*(\gamma) = \delta$, where $\delta_n = \gamma_{n+1}$.

Example 4 Let $G$ and $\sigma$ be as in the previous example. Define $\tau: G \to G$ by $\tau(x) = \sigma(x) + x$. Then $\tau(x + y) = \tau(x) + \tau(y)$, so that $\tau$ is a (continuous) endomorphism. It is not invertible since $\tau(x) = 0$, for example, if $x_{2n} = a \ne 0$ and $x_{2n-1} = -a$ for each $n$. It is, however, an epimorphism and so (Exercise 1) preserves the Haar measure on $G$. To see that $\tau$ is epic, let $y \in G$ be arbitrary and note that $y = \tau(x)$ is equivalent to the infinite system of equations $y_n = x_{n-1} + x_n$. We can, for example, set $x_0 = 0$ and define inductively

$$x_n = y_n - x_{n-1} \quad (n = 1, 2, \ldots), \qquad x_n = y_{n+1} - x_{n+1} \quad (n = -1, -2, \ldots).$$

The adjoint $\tau^*$ of $\tau$ is given by $\tau^*(\gamma) = \sigma^*(\gamma) + \gamma$.
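The inductive solution of $y = \tau(x)$ in Example 4 can be sketched on a finite window of indices. This is only an illustration under stated assumptions: $G_0$ is taken to be the integers modulo $k$, the bi-infinite sequence is cut down to the indices $-N, \ldots, N$, and the function names `solve_tau` and `tau_window` are hypothetical.

```python
def solve_tau(y, k):
    """Solve x_{n-1} + x_n = y_n (mod k) with x_0 = 0.

    y is a dict {n: y_n} on the symmetric window -N..N (assumption);
    the result x is defined on -N-1..N.
    """
    N = max(y)
    x = {0: 0}
    for n in range(1, N + 1):            # x_n = y_n - x_{n-1}
        x[n] = (y[n] - x[n - 1]) % k
    for n in range(-1, -N - 2, -1):      # x_n = y_{n+1} - x_{n+1}
        x[n] = (y[n + 1] - x[n + 1]) % k
    return x


def tau_window(x, k, indices):
    """(tau x)_n = x_{n-1} + x_n (mod k) on the given indices."""
    return {n: (x[n - 1] + x[n]) % k for n in indices}
```

Applying `tau_window` to the output of `solve_tau` should return the original window of $y$, which is the finite shadow of the statement that $\tau$ is epic.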

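Example 5 We construct a slightly different example by taking $G = \bigotimes_{n=1}^{\infty} G_n$ to be a one-sided direct product of copies $G_n$ of $G_0$. We define $\sigma^*$ and $\tau^*$ on $\hat G = \bigoplus_{n=1}^{\infty} \hat G_n$ as above:

$$\sigma^*(\gamma) = \delta, \quad \text{where } \delta_n = \gamma_{n+1}, \qquad \tau^*(\gamma) = \sigma^*(\gamma) + \gamma.$$

It follows that $\sigma$ and $\tau$ are endomorphisms of $G$ with

$$\sigma(x) = y, \quad \text{where } y_1 = 0,\ y_n = x_{n-1}\ (n \ge 2), \qquad \tau(x) = \sigma(x) + x.$$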


In this case, $\sigma$ is not an epimorphism, since the first component of $\sigma(x)$ is always 0. On the other hand, $\tau$ is an automorphism, since the system of equations

$$x_1 = y_1, \qquad x_n + x_{n-1} = y_n \quad (n \ge 2)$$

has a unique solution $x \in G$ for each $y \in G$. Now let $a_0 \in G_0$ and define $a \in G$ to be $a = (a_0, 0, 0, \ldots)$. The affine transformation $\phi$ defined by

$$\phi(x) = \tau(x) + a = \sigma(x) + x + a$$

will prove to be central in the later development of transformations with quasidiscrete spectrum. Note that an explicit formula for $\phi$ is given by

$$\phi(x_1, x_2, x_3, \ldots) = (x_1 + a_0,\ x_2 + x_1,\ x_3 + x_2,\ \ldots).$$
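Since the $n$th coordinate of $\phi(x)$ depends only on $x_{n-1}$ and $x_n$, the map of Example 5 can be tried out on the first few coordinates. The following is a minimal sketch, assuming $G_0$ is the integers modulo $k$ and working with finite lists; the names `phi` and `tau_inverse` are hypothetical.

```python
def phi(x, a0, k):
    """phi(x1, x2, ...) = (x1 + a0, x2 + x1, x3 + x2, ...) on the first
    len(x) coordinates, with G_0 = Z_k (assumption)."""
    return [(x[0] + a0) % k] + [(x[i] + x[i - 1]) % k for i in range(1, len(x))]


def tau_inverse(y, k):
    """Solve x1 = y1, x_n + x_{n-1} = y_n (n >= 2) for x, mod k."""
    x = [y[0] % k]
    for i in range(1, len(y)):
        x.append((y[i] - x[i - 1]) % k)
    return x
```

Taking `a0 = 0` gives $\tau$ itself, so `tau_inverse(phi(x, 0, k), k)` recovers `x`; this mirrors the unique solvability of the system displayed above.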

Example 6 Let $G_n = \{0, 1, \ldots, k_n - 1\}$ be a finite cyclic group of order $k_n$ ($n = 1, 2, \ldots$). Define $G$ as a set to be $\times_{n=1}^{\infty} G_n$. Let $G$ have the product topology and make it into a group by setting $x + y = z$, where $z_n$ is defined inductively by

$$\begin{aligned}
x_1 + y_1 &= z_1 + w_1 k_1; & 0 \le z_1 < k_1, \quad & w_1 = 0, 1 \\
x_{n+1} + y_{n+1} + w_n &= z_{n+1} + w_{n+1} k_{n+1}; & 0 \le z_{n+1} < k_{n+1}, \quad & w_{n+1} = 0, 1.
\end{aligned} \tag{5}$$

The idea is that we perform addition component by component with “carry,” and the action resembles that of an adding machine with variably sized counters for different digits. Let $a = (1, 0, 0, \ldots) \in G$. The element $a$ is a topological generator of $G$ in the sense that $\{na : n = 0, 1, 2, \ldots\}$ is dense in $G$. A compact group with a topological generator is said to be monothetic. Let $\phi: G \to G$ be translation by $a$, that is, $\phi(x) = x + a$. The transformation $\phi$ is the adding machine transformation of Exercise 5, Chapter I. Now let $H_n = \times_{i=1}^{n} G_i$ and define addition as in (5) on $H_n$. (The “carry” $w_n$ is simply dropped.) Then (i) $H_n$ is a cyclic group of order $k_1 k_2 \cdots k_n$, (ii) the natural projection of $G$ onto $H_n$ carries $\phi$ onto an ergodic translation $\phi_n$ of the finite group $H_n$, and (iii) relative to these projections, the system $(G, \phi)$ is an inverse limit (in the category of affine transformations on compact groups) of the systems $(H_n, \phi_n)$. (Cf. Exercise 23, Chapter I.)
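The carry rule (5) on the finite group $H_n$ is easy to experiment with. The sketch below is illustrative only: elements of $H_n$ are represented as Python lists, and the names `add_with_carry` and `orbit_length` are hypothetical.

```python
def add_with_carry(x, y, ks):
    """Add x and y in H_n = G_1 x ... x G_n following rule (5).

    ks lists the orders k_1, ..., k_n; the last carry is dropped.
    """
    z, w = [], 0
    for xn, yn, kn in zip(x, y, ks):
        s = xn + yn + w
        z.append(s % kn)
        w = s // kn          # 0 or 1, since xn, yn < kn
    return z


def orbit_length(ks):
    """Number of steps for translation by a = (1, 0, ..., 0) to return to 0."""
    a = [1] + [0] * (len(ks) - 1)
    start = [0] * len(ks)
    x, steps = add_with_carry(start, a, ks), 1
    while x != start:
        x, steps = add_with_carry(x, a, ks), steps + 1
    return steps
```

For `ks = [2, 3, 2]` the orbit of 0 under translation by $a$ has length 12, the order of $H_3$, in line with assertion (i) and with the ergodicity of $\phi_n$ in (ii).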


2. ERGODICITY

Our next project is to find conditions under which the epimorphism $\tau$ or the affine map $\phi$ is ergodic. Throughout this section $\tau$ will be a continuous epimorphism of the compact abelian group $G$ and $\phi$ will be the affine map $\phi(x) = \tau(x) + a$ for some fixed $a \in G$. We will denote by $\sigma$ the endomorphism $\sigma(x) = \tau(x) - x$. Note that the adjoint $\tau^*$ of the epimorphism $\tau$ is a monomorphism of $\hat G$. This is true because

$$\tau^*(\gamma) = 0 \Rightarrow \langle x, \tau^*(\gamma)\rangle = \langle \tau(x), \gamma\rangle = 1 \text{ for all } x \in G \Rightarrow \gamma = 0$$

since $\tau$ is epic. Our first result is that $\tau$ is ergodic iff $\tau^*$ has no finite orbits in $\hat G$ except at 0.

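Theorem 3.1 The epimorphism $\tau$ is ergodic iff the orbit $O^+_{\tau^*}(\gamma)$ is infinite for each $\gamma \ne 0$.

Proof Suppose $\tau$ is ergodic and $O^+_{\tau^*}(\gamma)$ is finite; that is, there exist nonnegative integers $n_1 < n_2$ such that

$$O^+_{\tau^*}(\gamma) = \{\gamma, \tau^*(\gamma), \ldots, \tau^{*(n_2-1)}(\gamma)\}, \qquad \tau^{*n_2}(\gamma) = \tau^{*n_1}(\gamma).$$

Let $f \in L_2(G)$ be defined by

$$f(x) = \langle \tau^{n_1}(x), \gamma\rangle + \cdots + \langle \tau^{n_2-1}(x), \gamma\rangle = \langle x, \tau^{*n_1}(\gamma)\rangle + \cdots + \langle x, \tau^{*(n_2-1)}(\gamma)\rangle.$$

Then

$$f(\tau(x)) = \langle x, \tau^{*(n_1+1)}(\gamma)\rangle + \cdots + \langle x, \tau^{*n_2}(\gamma)\rangle = f(x)$$

for all $x \in G$. It follows that $f$ must be constantly equal to $f(0) = n_2 - n_1 \ne 0$. Since $\tau$ is measure preserving, we have

$$n_2 - n_1 = f(0) = \int f\, dm = (n_2 - n_1)\int \gamma\, dm, \qquad \int \gamma\, dm = 1.$$

But this means that $\gamma = 0$, as required.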


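Conversely, suppose that $O^+_{\tau^*}(\gamma)$ is infinite for $\gamma \ne 0$, and let $f \in L_2(G)$ be $\tau$-invariant. Then for each $\gamma \in \hat G$ we have

$$\hat f(\tau^*(\gamma)) = \int f(x)\,\overline{\langle \tau(x), \gamma\rangle}\, m(dx) = \int f(\tau(x))\,\overline{\langle \tau(x), \gamma\rangle}\, m(dx) = \int f(x)\,\overline{\langle x, \gamma\rangle}\, m(dx) = \hat f(\gamma).$$

That is, $\hat f$ is $\tau^*$-invariant. For $\gamma_0 \ne 0$ the points in $O^+_{\tau^*}(\gamma_0)$ are distinct, and so

$$\sum_{n=0}^{\infty} |\hat f(\tau^{*n}(\gamma_0))|^2 \le \sum_{\gamma \in \hat G} |\hat f(\gamma)|^2 < \infty,$$

since $f \in L_2(G)$. But each term in the sum on the left is equal to $|\hat f(\gamma_0)|^2$, hence must be zero. Since $\hat f(\gamma) = 0$ except for $\gamma = 0$, $f$ is a constant. Therefore, $\tau$ is ergodic. ∎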

Remark If $\tau$ is an automorphism, then $O^+_{\tau^*}(\gamma)$ is finite iff $\tau^{*k}(\gamma) = \gamma$ for some $k \ge 1$.

Corollary 3.1.1 If $\tau$ is ergodic, then it is strongly mixing.

Proof Suppose $\tau$ is ergodic and $\gamma_1, \gamma_2 \in \hat G$ with not both of $\gamma_1, \gamma_2$ equal to 0. Then $\tau^{*n}(\gamma_1) = \gamma_2$ for at most one value of $n$. Therefore,

$$(T_\tau^n \gamma_1, \gamma_2) = \int (\tau^{*n}\gamma_1)\,\overline{\gamma_2}\, dm = 0$$

for all sufficiently large $n$. Hence, if $f$ and $g$ are finite linear combinations of functions in $\hat G$, then

$$(T_\tau^n f, g) = (f, 1)(1, g)$$

for all sufficiently large $n$. Hence

$$\lim_{n \to \infty} (T_\tau^n f, g) = (f, 1)(1, g).$$

In particular, setting $f = \chi_A$, $g = \chi_E$ yields

$$\lim_{n \to \infty} m(A \cap \tau^{-n}(E)) = m(A)\,m(E),$$

so that $\tau$ is strongly mixing. ∎

Corollary 3.1.2 A continuous automorphism $\tau$ of the $l$-torus $K^l$ (Example 2) is ergodic iff the associated unimodular matrix $T = (t_{ij})$ has no eigenvalue which is a root of unity.

Proof For each $\gamma \in \mathbb{Z}^l$ we have (Exercise 2) that $\tau^*(\gamma) = \gamma T$. Suppose that $\tau$ is not ergodic, and hence that $O^+_{\tau^*}(\gamma) = \{\gamma, \tau^*(\gamma), \ldots, \tau^{*(n-1)}(\gamma)\}$ for some $\gamma \ne 0$ and some positive integer $n$. Thus $\gamma T^n = \gamma$. Let $\lambda$ be any primitive $n$th root of unity, and define the $l$-dimensional vector $f = (f_1, \ldots, f_l)$ by

$$f = \lambda^{n-1}\gamma + \lambda^{n-2}(\gamma T) + \cdots + (\gamma T^{n-1}). \tag{6}$$

Then

$$fT = \lambda^{n-1}(\gamma T) + \cdots + \lambda(\gamma T^{n-1}) + \gamma = \lambda f.$$

It only remains to show that $f$ is not the zero vector. Suppose $f = 0$. Since $\gamma \ne 0$, (6) represents $l$ polynomial equations of degree no greater than $n - 1$ in $\lambda$ and having integer coefficients. But the only such equation is

$$\lambda^{n-1} + \lambda^{n-2} + \cdots + 1 = 0.$$

This contradicts the assumption that $\gamma, \gamma T, \ldots, \gamma T^{n-1}$ are distinct.

Conversely, suppose that $\lambda^n = 1$ and $\lambda$ is an eigenvalue of the integral matrix $T$. It follows that $T^n - I$ is singular. This means that there exists a nonzero $f \in \mathbb{R}^l$ with rational components such that $fT^n = f$. It follows that there exists such an $f$ with integer components, that is, $f \in \mathbb{Z}^l$. But then $\tau^{*n}(f) = fT^n = f$, $O^+_{\tau^*}(f)$ is finite, and $\tau$ is not ergodic. ∎

Examples There are no ergodic automorphisms of the one-torus or circle group $K$. On the two-torus $K^2$ there are many ergodic automorphisms. Consider the unimodular $2 \times 2$ matrix $T = (t_{ij})$. Let $t$ denote the trace $t = t_{11} + t_{22}$ of $T$. Thus the characteristic equation of $T$ is

$$\lambda^2 - t\lambda \pm 1 = 0.$$


If $t = 0$, then the eigenvalues are either $\pm 1$ or $\pm i$. In either case, $\tau$ is nonergodic. In general, the roots $\lambda_1$ and $\lambda_2$ must be either real or complex conjugates satisfying $\lambda_1\lambda_2 = \pm 1$.

Case 1 $\det(T) = -1$. The characteristic equation is

$$\lambda^2 - t\lambda - 1 = 0,$$

and there are no complex roots. For $t \ne 0$, the roots are real and distinct with $|\lambda_2| < 1$ and $|\lambda_1| > 1$. Thus $\tau$ is ergodic.

Case 2 $\det(T) = 1$, $t^2 > 4$. In this case also, the roots are real and distinct, and $\tau$ is ergodic.

Case 3 $\det(T) = 1$, $0 < t^2 \le 4$. Since $t$ is an integer, the only possibilities are $t = \pm 1, \pm 2$. The corresponding pairs of roots in the four cases are

$$\tfrac{1}{2}(1 \pm i\sqrt{3}), \qquad \tfrac{1}{2}(-1 \pm i\sqrt{3}), \qquad 1, 1, \qquad -1, -1.$$

Each is a root of unity, and so $\tau$ is nonergodic. We have proved the following proposition.

Proposition 3.2 The automorphism $\tau$ of the two-torus is ergodic iff the corresponding unimodular matrix $T$ has (i) determinant $-1$ and nonzero trace or (ii) determinant $1$ and trace greater than 2 in absolute value.

The epimorphisms $\sigma$ and $\tau$ of Examples 3 and 4 are ergodic (Exercise 9). The automorphism $\tau$ of Example 5 is not ergodic. To see this, let $\gamma = (\zeta, 0, 0, 0, \ldots)$, where $\zeta \in \hat G_0$, $\zeta \ne 0$. Then $\tau^*(\gamma) = \gamma$. Ergodicity of the epimorphism $\tau$ has interesting and immediate implications regarding the affine transformation $\phi(x) = \tau(x) + a$, which we state in the following theorem.
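The case analysis for the two-torus can be cross-checked against Corollary 3.1.2 by exact integer arithmetic. The sketch below is an illustration, not part of the text: it relies on the assumption (a standard fact about cyclotomic polynomials, not stated above) that a root of unity which is an eigenvalue of a $2 \times 2$ integer matrix has order at most 6, so it suffices to test whether $T^n - I$ is singular for $n = 1, \ldots, 6$. The function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two 2 x 2 integer matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def has_root_of_unity_eigenvalue(T):
    """True if T^n - I is singular for some n in 1..6 (assumed sufficient
    for 2 x 2 integer matrices)."""
    P = [[1, 0], [0, 1]]
    for _ in range(6):
        P = mat_mul(P, T)
        if (P[0][0] - 1) * (P[1][1] - 1) - P[0][1] * P[1][0] == 0:
            return True
    return False


def ergodic_by_cases(T):
    """Criterion read off from Cases 1-3: det = -1 with nonzero trace,
    or det = 1 with |trace| > 2."""
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    t = T[0][0] + T[1][1]
    if det == -1:
        return t != 0
    if det == 1:
        return abs(t) > 2
    raise ValueError("matrix is not unimodular")
```

For a unimodular `T`, `ergodic_by_cases(T)` should agree with `not has_root_of_unity_eigenvalue(T)`; for instance the matrix with rows (2, 1), (1, 1) is ergodic, while the shear with rows (1, 1), (0, 1) is not.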

<

Theorem 3.2 Let $\tau$ be a continuous automorphism of $G$, and let $\phi(x) = \tau(x) + a$, where $a \in G$. Then the following are equivalent:

(i) $\tau^*$ has no finite orbits in $\hat G - \{0\}$.
(ii) $\tau$ is ergodic.
(iii) $\tau$ is strongly mixing.
(iv) $\phi$ is strongly mixing.
(v) $\phi$ is weakly mixing.


Proof We have already shown that (i), (ii), and (iii) are equivalent. To show that (i) implies (iv), note that

$$T_\phi^n \gamma = \lambda_n(\gamma)\,\tau^{*n}(\gamma), \qquad \lambda_n(\gamma) = \prod_{k=0}^{n-1} \gamma(\tau^k(a)), \tag{7}$$

so that $T_\phi^n \gamma$ is a constant multiple (varying with $n$) of $\tau^{*n}(\gamma)$ for each $\gamma \in \hat G$. It follows as in the proof of Corollary 3.1.1 that $(T_\phi^n \gamma_1, \gamma_2) = 0$ for sufficiently large $n$ whenever $\gamma_1$ and $\gamma_2$ are not both the zero character. The rest of the proof is unchanged. Since (iv) always implies (v), it only remains to show that (v) implies (i). Suppose (i) is false, say $\tau^{*n}(\gamma) = \gamma$, $\gamma \ne 0$, with $n \ge 1$ minimal. We can then write (7) as

$$T_\phi^n \gamma(x) = \lambda_n(\gamma)\,\gamma(x),$$

where $|\lambda_n(\gamma)| = 1$. If $M$ is the linear span in $L_2(G)$ of the set $\{\gamma, T_\phi\gamma, \ldots, T_\phi^{n-1}\gamma\}$, it follows that $T_\phi(M) \subseteq M$. Moreover, $1 \notin M$ since the characters $0, \gamma, \tau^*(\gamma), \ldots, \tau^{*(n-1)}(\gamma)$ are assumed distinct and hence linearly independent in $L_2(G)$. [Note that the character 0 is the constant function $1 \in L_2(G)$.] Since $M$ is finite-dimensional, $T_\phi$ must have an eigenfunction $g \in M$. According to Theorem 1.6, this means that $\phi$ is not weakly mixing. ∎

Note that the above theorem does not include ergodicity of $\phi$ as one of the equivalent conditions. Indeed, $\phi$ may be ergodic (but not mixing) even when $\tau$ is nonergodic. For example, translation by an irrational number, $\phi(x) = x + a$, on the circle group is ergodic, but $\tau(x) = x$ is not. On the other hand, notice that $\phi(x) = -x + a$ is never ergodic since $\phi^2$ is the identity. A measure-preserving transformation $\phi$ is said to be totally ergodic if $\phi^n$ is ergodic for each $n \ge 1$.
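The identity behind Eq. (7), namely that $\gamma(\phi^n(x))$ equals $\prod_{k=0}^{n-1}\gamma(\tau^k(a))$ times $(\tau^{*n}\gamma)(x)$, can be checked exactly on the two-torus. The sketch below is illustrative only and makes several assumptions: points have rational coordinates (Python `Fraction`), characters are integer row vectors with $\tau^*(\gamma) = \gamma T$, and a character value $e^{2\pi i\theta}$ is represented by its phase $\theta$ modulo one, so products of character values become sums of phases. All function names are hypothetical.

```python
from fractions import Fraction


def tau(T, x):
    """tau(x) = T x modulo one on the 2-torus."""
    return [(T[0][0] * x[0] + T[0][1] * x[1]) % 1,
            (T[1][0] * x[0] + T[1][1] * x[1]) % 1]


def phi(T, a, x):
    """Affine map phi(x) = tau(x) + a modulo one."""
    y = tau(T, x)
    return [(y[0] + a[0]) % 1, (y[1] + a[1]) % 1]


def tau_star(T, g):
    """Adjoint on characters: gamma -> gamma T (row vector times matrix)."""
    return [g[0] * T[0][0] + g[1] * T[1][0], g[0] * T[0][1] + g[1] * T[1][1]]


def phase(x, g):
    """Phase theta (mod 1) of the character value <x, gamma> = exp(2 pi i theta)."""
    return (g[0] * x[0] + g[1] * x[1]) % 1


def check_eq7(T, a, x, g, n):
    """Compare the phase of gamma(phi^n x) with that of lambda_n(gamma) * (tau*^n gamma)(x)."""
    p = x
    for _ in range(n):
        p = phi(T, a, p)
    lam, q, gn = Fraction(0), a, g
    for _ in range(n):
        lam += phase(q, g)      # accumulates the phase of gamma(tau^k a)
        q = tau(T, q)
        gn = tau_star(T, gn)
    return phase(p, g) == (lam + phase(x, gn)) % 1
```

Because all arithmetic is rational, the comparison is exact rather than approximate; it only tests the identity at sample points and is not a proof.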

Theorem 3.3 Let $\tau$ be a continuous automorphism of the compact abelian group $G$, and assume that $\hat G$ has no nonzero elements of finite order. Let $\phi$ and $\sigma$ be as before. Then the following are equivalent:

(i) $\phi$ is ergodic.
(ii) $\phi$ is totally ergodic.
(iii) $\tau^{*n}(\gamma) = \gamma$, $\lambda_n(\gamma) = \prod_{k=0}^{n-1} \gamma(\tau^k(a)) = 1 \Rightarrow \gamma = 0$.
(iv) $\tau^{*n}(\gamma) = \gamma \Rightarrow \tau^*(\gamma) = \gamma$, and $\tau^*(\gamma) = \gamma$, $\gamma(a) = 1 \Rightarrow \gamma = 0$.
(v) $\tau^{*n}(\gamma) = \gamma \Rightarrow \tau^*(\gamma) = \gamma$, and the group $G(a, \sigma) = \{na + \sigma(x) : n \in \mathbb{Z},\ x \in G\}$ is dense in $G$.

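In any case, (iii) implies (ii) and (iv) implies (i).

Proof We have shown [Eq. (7)] that

$$T_\phi^n \gamma = \lambda_n(\gamma)\,\tau^{*n}(\gamma)$$

for each $n \ge 1$ and each $\gamma \in \hat G$. Suppose now that $\phi$ is ergodic, that $\tau^{*n}(\gamma) = \gamma$, and that $\lambda_n(\gamma) = 1$. Let $l$ be the smallest positive integer such that $\tau^{*l}(\gamma) = \gamma$. Then $\gamma, \tau^*(\gamma), \ldots, \tau^{*(l-1)}(\gamma)$ are distinct characters, and $l$ divides $n$, say $n = jl$. Moreover, $\tau^{*(il+k)}(\gamma) = \tau^{*k}(\gamma)$ for all positive integers $i$ and $k$. Let us show that $[\lambda_{il+k}(\gamma)]^n = [\lambda_k(\gamma)]^n$. Since $\tau^{*l}(\gamma) = \gamma$,

$$\lambda_{il+k}(\gamma) = \prod_{r=0}^{il+k-1} \gamma(\tau^r a) = \Bigl[\prod_{r=0}^{l-1} \gamma(\tau^r a)\Bigr]^i \prod_{r=0}^{k-1} \gamma(\tau^r a) = [\lambda_l(\gamma)]^i \lambda_k(\gamma).$$

In particular,

$$\lambda_n(\gamma) = [\lambda_l(\gamma)]^j = 1.$$

Thus

$$[\lambda_{il+k}(\gamma)]^n = [\lambda_l(\gamma)]^{in}[\lambda_k(\gamma)]^n = [\lambda_k(\gamma)]^n.$$

Now define $f \in L_2(G)$ by

$$f(x) = \sum_{i=0}^{j-1}\sum_{k=0}^{l-1} [T_\phi^{il+k}\gamma(x)]^n = \sum_{i=0}^{j-1}\sum_{k=0}^{l-1} [\lambda_{il+k}(\gamma)]^n [\tau^{*(il+k)}\gamma(x)]^n = j\sum_{k=0}^{l-1} [\lambda_k(\gamma)]^n [\tau^{*k}\gamma(x)]^n.$$

Since $\hat G$ has no elements of finite order other than 0, and since $\gamma, \tau^*\gamma, \ldots, \tau^{*(l-1)}\gamma$ are distinct characters, it follows that $n\gamma, \tau^*(n\gamma), \ldots, \tau^{*(l-1)}(n\gamma)$ are distinct characters. On the other hand, since $\lambda_n(\gamma) = 1$, we have $T_\phi^n \gamma = \gamma$, and so $T_\phi f = f$. By the ergodicity of $\phi$, it follows that $f$ is a constant function. Because of the linear independence of distinct characters, this can only happen if $l = 1$ and $\gamma = 0$. Thus (i) implies (iii).

(iii) $\Rightarrow$ (iv). Suppose $\tau^{*n}(\gamma) = \gamma$, and let $\zeta = \sigma^*(\gamma)$. Since $\sigma$ commutes with $\tau$, we have

$$\tau^{*n}(\zeta) = \tau^{*n}\sigma^*(\gamma) = \sigma^*\tau^{*n}(\gamma) = \sigma^*(\gamma) = \zeta.$$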


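Moreover,

$$\lambda_n(\zeta) = \prod_{k=0}^{n-1} \tau^{*k}\zeta(a) = \prod_{k=0}^{n-1} \sigma^*\tau^{*k}\gamma(a) = \prod_{k=0}^{n-1} \frac{\tau^{*(k+1)}\gamma(a)}{\tau^{*k}\gamma(a)} = \frac{\tau^{*n}\gamma(a)}{\gamma(a)} = 1.$$

According to (iii), we have $\zeta = 0$, that is, $\tau^*(\gamma) = \gamma$. Since $\lambda_1(\gamma) = \gamma(a)$, the second statement in (iv) is a special case of (iii), and we have completed the proof of (iii) $\Rightarrow$ (iv).

(iv) $\Rightarrow$ (ii). Suppose $f \in L_2(G)$, $T_\phi^n f = f$. Then $f$ can be represented by a Fourier series

$$f = \sum_{\gamma \in \hat G} (f, \gamma)\gamma, \qquad \text{and} \qquad T_\phi^n f = \sum_{\gamma \in \hat G} (f, \gamma)\,\lambda_n(\gamma)\,\tau^{*n}\gamma.$$

Equating coefficients gives

$$(f, \tau^{*n}\gamma) = \lambda_n(\gamma)(f, \gamma)$$

for each $\gamma \in \hat G$. In particular,

$$|(f, \tau^{*n}\gamma)| = |(f, \gamma)| \quad (\gamma \in \hat G) \qquad \text{and} \qquad |(f, \tau^{*jn}\gamma)| = |(f, \gamma)| \quad (\gamma \in \hat G,\ j \ge 1).$$

Since $\sum_{\gamma \in \hat G} |(f, \gamma)|^2 < \infty$, we must have $(f, \gamma) = 0$ for each $\gamma$ with infinite orbit. But by (iv), each $\gamma$ with finite orbit satisfies $\tau^*\gamma = \gamma$, and hence $T_\phi\gamma = \gamma(a)\gamma$. Thus

$$f = \sum_{\tau^*\gamma = \gamma} (f, \gamma)\gamma \tag{9}$$

and

$$T_\phi^n f = \sum_{\tau^*\gamma = \gamma} (f, \gamma)[\gamma(a)]^n\gamma. \tag{10}$$

Since by (iv)

$$\tau^*\gamma = \gamma \Rightarrow \tau^*(n\gamma) = n\gamma \Rightarrow (n\gamma)(a) = [\gamma(a)]^n \ne 1$$

except when $\gamma = 0$, it follows by equating coefficients in (9) and (10) that $(f, \gamma) = 0$ for all $\gamma \ne 0$. That is, $f$ is a constant. Thus $\phi$ is totally ergodic.

Since (ii) clearly implies (i), the proof of the theorem will be complete if we can show that

$$\tau^*\gamma = \gamma,\ \gamma(a) = 1 \Rightarrow \gamma = 0$$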


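iff the group generated by $a$ and $\sigma(G)$ is dense in $G$. But this latter condition is equivalent to

$$\langle a\rangle^\perp \cap \sigma(G)^\perp = \{0\},$$

where $\langle a\rangle$ denotes the cyclic group generated by $a$ and $\perp$ indicates the annihilator. Now $\gamma \in \langle a\rangle^\perp$ iff $\gamma(a) = 1$, and

$$\gamma \in \sigma(G)^\perp \quad \text{iff} \quad \langle \tau x - x, \gamma\rangle = 1\ (x \in G) \quad \text{iff} \quad \langle \tau x, \gamma\rangle = \langle x, \gamma\rangle\ (x \in G) \quad \text{iff} \quad \tau^*\gamma = \gamma.$$

This completes the proof when $\hat G$ is torsion-free. The last statement of the theorem is proved similarly (Exercise 13). ∎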

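Definition 3.1 The affine transformation $\phi$ is said to be semiergodic if one of the following three equivalent conditions is satisfied:

$$\tau^*\gamma = \gamma,\ \gamma(a) = 1 \Rightarrow \gamma = 0 \tag{11}$$

$$[\langle a\rangle + \sigma(G)]^- = G \tag{12}$$

$$\langle a\rangle^\perp \cap \sigma(G)^\perp = \{0\}. \tag{13}$$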

Example 5a Let $\phi$ be defined as in Example 5. Suppose that $\tau^*\gamma = \gamma$, that is, $\sigma^*\gamma = 0$. Then $\gamma = (\gamma_1, 0, 0, \ldots)$. If $a_0 \in G_0$ separates the points of $\hat G_0$, that is, $\langle a_0, \gamma_0\rangle = 1$ only for $\gamma_0 = 0$, then $\phi$ is semiergodic. In particular, this is true if $a_0$ is a topological generator of $G_0$. If $\hat G_0$ has no nonzero elements of finite order, then $\hat G$ has no nonzero elements of finite order. In this case,

$$\tau^{*k}\gamma = \gamma \Rightarrow \tau^*\gamma = \gamma.$$

For if $\gamma \ne 0$, let $p$ be the largest integer with $\gamma_p \ne 0$, and assume that $p > 1$. Setting $n = p - 1$, we see that

$$\gamma_n = (\tau^{*k}\gamma)_n = \gamma_n + k\gamma_p.$$

Since $\hat G_0$ is torsion-free, $\gamma_p = 0$, a contradiction. Summarizing, if $\hat G_0$ has no nonzero elements of finite order and if $a_0$ separates the points of $\hat G_0$, then (iv) is satisfied and $\phi$ is ergodic.


Example 5b Let $\phi$ be defined as in Example 5, and let $G_0 = \hat G_0 = \{0, 1, \ldots, 15\}$ with addition modulo 16 as the group operation and

$$\langle x, \gamma\rangle = e^{\pi i x\gamma/8}.$$

Then $a_0 = 1$ is a generator for $G_0$, and so $\phi$ is semiergodic. Let us show that $\phi$ is not ergodic. Consider

$$\gamma = (5, 6, 4, 8, 0, 0, \ldots).$$

Then

$$\tau^*\gamma = (11, 10, 12, 8, 0, 0, \ldots), \qquad \tau^{*2}\gamma = (5, 6, 4, 8, 0, 0, \ldots) = \gamma,$$

so that (iv) is not satisfied. Moreover, $f(x) = [\gamma(x)]^2 + [T_\phi\gamma(x)]^2$ is a nontrivial invariant function, and so $\phi$ is not ergodic.
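The arithmetic in Example 5b can be replayed modulo 16. The sketch below is an illustration under stated assumptions: sequences are truncated to their first four coordinates (all later entries of $\gamma$ and of $a$ are zero, so nothing is lost), a character value $e^{\pi i s/8}$ is represented by its exponent $s$ modulo 16, and the helper names are hypothetical.

```python
K = 16  # G_0 = Z_16, as in Example 5b


def tau_star(g):
    """tau*(gamma)_n = gamma_n + gamma_{n+1} (mod 16), zero beyond the end."""
    return [(g[i] + (g[i + 1] if i + 1 < len(g) else 0)) % K
            for i in range(len(g))]


def tau(x):
    """tau(x)_1 = x_1, tau(x)_n = x_n + x_{n-1} (mod 16) for n >= 2."""
    return [x[0] % K] + [(x[i] + x[i - 1]) % K for i in range(1, len(x))]


def pairing_exponent(x, g):
    """Exponent s (mod 16) with <x, gamma> = exp(pi i s / 8)."""
    return sum(xi * gi for xi, gi in zip(x, g)) % K


gamma = [5, 6, 4, 8]
a = [1, 0, 0, 0]

# lambda_2(gamma) = gamma(a) * gamma(tau a); its exponent is the sum below.
lambda2_exponent = (pairing_exponent(a, gamma) + pairing_exponent(tau(a), gamma)) % K
```

Here `tau_star(gamma)` reproduces $(11, 10, 12, 8)$, applying it twice returns $\gamma$, and `lambda2_exponent` is 0, so that $\lambda_2(\gamma) = 1$ and $T_\phi^2\gamma = \gamma$. The characters $2\gamma$ and $2\tau^*\gamma$ are distinct and nonzero, which is why the invariant function $f$ is nonconstant.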

3. DISCRETE AND QUASIDISCRETE SPECTRUM

A measure-preserving transformation $\phi$ has discrete spectrum if the eigenfunctions of the associated operator on $L_2$ form a basis for $L_2$. A classical result of Halmos and von Neumann [34] asserts that the ergodic measure-preserving transformation $\phi$ has discrete spectrum iff it is measure-theoretically isomorphic to translation by a topological generator on a monothetic group. In 1962, Abramov [2], using a notion of quasieigenfunction introduced by Halmos [32], gave a definition of quasidiscrete spectrum and generalized the theorem of Halmos and von Neumann. The appropriate model is an ergodic affine transformation satisfying an additional condition to be discussed presently. In 1965, Hahn and Parry [29] defined quasidiscrete spectrum for a minimal homeomorphism of a compact Hausdorff space and proved the corresponding isomorphism theorem. In 1969, Brown [11] showed that transformations with quasidiscrete spectrum in both the topological and measure-theoretic cases are factors of a single affine transformation defined in terms of the shift on a certain product group.

Let $\phi(x) = \tau(x) + a$ be an affine transformation of the compact abelian group $G$, where $\tau$ is assumed to be an automorphism. In order to describe the condition referred to above, we first define a sequence of subgroups of the dual group $\hat G$.


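Let $\Gamma_0$ be the trivial group consisting only of the identity of $\hat G$,

$$\Gamma_0 = \{0\},$$

and define $\Gamma_n$ inductively by

$$\Gamma_{n+1} = \{\gamma \in \hat G : \tau^*\gamma - \gamma \in \Gamma_n\}.$$

Finally, let

$$\Gamma = \bigcup_{n=0}^{\infty} \Gamma_n.$$

Note that $\Gamma_n$ is the kernel of the group endomorphism $\sigma^{*n}$ and that $\Gamma_n \subseteq \Gamma_{n+1}$ for each $n$. Therefore each $\Gamma_n$ and also $\Gamma$ are groups. Moreover, it follows easily that $\tau^*(\Gamma_n) = \Gamma_n$ and $\tau^*(\Gamma) = \Gamma$.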

Definition 3.2 The affine transformation $\phi$ has quasidiscrete spectrum if $\Gamma = \hat G$.

Note that $\Gamma = \Gamma_0$ when $\tau$ is ergodic. Thus the transformations discussed in this section lie at the opposite end of the spectrum among affine transformations from those for which $\tau$ is ergodic. Note also that $\phi$ has quasidiscrete spectrum iff $\bigcap_{n=0}^{\infty} \sigma^n(G) = \{0\}$ (Exercise 21).

Examples Of those in Section 1 only Examples 5 and 6 have quasidiscrete spectrum. Let us show this for the $\phi$ of Example 5. We have

$$\tau^*(\gamma_1, \gamma_2, \ldots) = (\gamma_1 + \gamma_2,\ \gamma_2 + \gamma_3,\ \ldots),$$

so that $\tau^*\gamma = \gamma$ iff $\gamma_2 = \gamma_3 = \cdots = 0$. That is,

$$\Gamma_1 = \{\gamma : \gamma_k = 0\ (k \ne 1)\}.$$

Similarly, it is easily proved inductively that

$$\Gamma_n = \{\gamma : \gamma_k = 0\ (k > n)\}.$$

Thus

$$\Gamma = \bigcup_{n=0}^{\infty} \Gamma_n = \bigoplus_{n=1}^{\infty} \hat G_0 = \hat G.$$

Note that $\hat G = \Gamma$ is the inductive limit of the sequence of groups $\Gamma_n$, and hence $G$ is an inverse limit of the sequence $G_n = \hat\Gamma_n$,

$$G_n \cong G/\Gamma_n^\perp \cong \bigotimes_{k=1}^{n} G_0.$$


It follows easily that $\Phi = (G, \phi)$ is affinely isomorphic to the inverse limit of the sequence $\Phi_n = (G_n, \phi_n)$, where

$$\phi_n(x_1, x_2, \ldots, x_n) = (x_1 + a_0,\ x_1 + x_2,\ \ldots,\ x_{n-1} + x_n).$$

The fact that the $\phi$ of Example 6 has quasidiscrete spectrum will follow from the following proposition.

Proposition 3.3 Let $a$ be a topological generator of the monothetic group $G$, and let $\phi(x) = x + a$. Then $\Gamma_1 = \hat G$, so that $\phi$ has quasidiscrete spectrum.

Indeed, $\phi$ has discrete spectrum, since $\hat G$ is a basis for $L_2(G)$. The proof of this proposition is left to the exercises.

Let us turn now to a discussion of quasidiscrete spectrum for abstract and classical dynamical systems. Suppose that $\Phi = (X, \mathscr{B}, \mu, \phi)$ is an abstract dynamical system. We shall assume that $\Phi$ is ergodic. Recall that $f \in L_2(X)$ is an eigenfunction for $T_\phi$ with eigenvalue $\lambda \in K$ if $T_\phi f = \lambda f$. Let us denote by $A_1$ the set of eigenvalues of $T_\phi$ and by $B_1$ the set of eigenfunctions having constant absolute value one. According to Theorem 1.6, $A_1$ is a group, $1 \in A_1 \subseteq K \subseteq B_1$, and any two elements $f, g \in B_1$ associated with the same eigenvalue $\lambda$ satisfy $f/g \in K$. This last assertion must, of course, be interpreted to mean that $f/g$ is almost everywhere equal to a constant, and the absolute value of that constant is one. For completeness let us set $A_0 = \{1\}$ and $B_0 = K$. We shall then define $A_n, B_n \subseteq L_2(X)$ inductively by

$$g \in A_{n+1},\ f \in B_{n+1} \quad \text{iff} \quad g \in B_n,\ |f| = 1, \text{ and } T_\phi f = gf,$$

and set $A = \bigcup_{n=0}^{\infty} A_n$, $B = \bigcup_{n=0}^{\infty} B_n$. The elements of $A$ are called quasieigenvalues of $\Phi$, the elements of $B$ quasieigenfunctions.

Definition 3.3 $\Phi$ has quasidiscrete spectrum if the linear span of $B$ is dense in $L_2(X)$.

Proposition 3.4 For each $n = 0, 1, 2, \ldots$, $A_n$ and $B_n$ are groups (pointwise almost everywhere multiplication), $A_n \subseteq A_{n+1} \subseteq B_n \subseteq B_{n+1}$, and $A_{n+1}/A_n$ is isomorphic to a subgroup of $K$.

Proof Let Y = Y ( X , 37) be the group of complex-valued, W-measurable functionsfon X with I f ( x ) l = 1. We define homomorphisms i and 15 on 9 by


By ergodicity of $\phi$, the kernel of $\partial$ is $K = B_0$. It follows easily from (15) that $B_{n+1} = \partial^{-1}(B_n)$, and hence that $B_n$ ($n = 0, 1, 2, \ldots$) is the kernel of $\partial^{n+1}$. Moreover,

$$A_n = \partial(B_n) \subseteq B_{n-1} \qquad (n = 1, 2, \ldots).$$

Hence each of the $A_n$ and $B_n$ are subgroups of $\mathscr{G}$. Clearly, the kernel of $\partial^{n+1}$ is contained in the kernel of $\partial^{n+2}$, so that $B_n \subseteq B_{n+1}$, and $A_n = \partial(B_n) \subseteq \partial(B_{n+1}) = A_{n+1}$. To prove the last assertion, note that $\partial^{n+1}(B_{n+1}) \subseteq K$ and recall that the kernel of $\partial^{n+1}$ is $B_n$. It follows that $\partial^n$ maps $A_{n+1} = \partial(B_{n+1})$ into $K$, and the kernel of $\partial^n$ restricted to $A_{n+1}$ is $A_n$. For if $g = \partial(f) \in A_{n+1}$, with $f \in B_{n+1}$, and if $\partial^n(g) = \partial^{n+1}(f) = 1$, then $f \in B_n$, and so $g \in A_n$. On the other hand, $A_n \subseteq B_{n-1}$, which is the kernel of $\partial^n$. Thus $A_{n+1}/A_n \cong \partial^n(A_{n+1}) \subseteq K$. ∎

Previously, we have been using the symbol $K$ to denote the compact topological group of complex numbers with modulus one. On the other hand, the preceding proposition is purely group-theoretic in nature since we have as yet imposed no topology on the groups $A$ and $B$. In the sequel we shall want to consider $A$ to be a discrete topological group. In this context, the last statement of Proposition 3.4 is only true if we replace $K$ by the discrete topological group of complex numbers with modulus one. In order to avoid confusion, we shall henceforth denote this topological group by $K_d$. The next result shows how we can embed the group $A$ of quasieigenvalues of $\Phi$ into the countably infinite direct sum $K_d^\infty = \bigoplus_{n=1}^{\infty} K_d$ of copies of $K_d$. This result is suggested by the last assertion of Proposition 3.4. The embedding will be accomplished in such a manner that $A_n$ appears as a subgroup of $K_d^n = \{t \in K_d^\infty : t_k = 1\ (k > n)\}$, the discretized $n$-torus.

Before proceeding to this result, let us quickly survey the analogous situation for classical dynamical systems. Let $\Phi = (X, \phi)$ be a minimal classical dynamical system. Then the only continuous invariant functions for $T_\phi$ are constants. We define subgroups $A_n$, $B_n$, $A$, $B$ of $C(X)$ just as before. Because of minimality, the homomorphism $\partial$ again has kernel $K$. Everything else goes through exactly as before. In particular, Proposition 3.4 is valid for $\Phi$.

Definition 3.4 The classical dynamical system $\Phi = (X, \phi)$ has quasidiscrete spectrum if the algebra generated by $B$ is dense in $C(X)$.


The group $K_d^\infty$ has a natural endomorphism $\hat\sigma^*: K_d^\infty \to K_d^\infty$, called the shift and defined by $\hat\sigma^*(t) = u$, where $u_n = t_{n+1}$. This, of course, is a special case of Example 5. The subgroup $\Gamma \subseteq K_d^\infty$ is said to be shift-invariant if $\hat\sigma^*(\Gamma) \subseteq \Gamma$.

Theorem 3.4 If $\Phi$ is either a minimal classical dynamical system or an ergodic abstract dynamical system, then the group $A$ of quasieigenvalues of $\Phi$ is isomorphic to a shift-invariant subgroup of $K_d^\infty$.

Proof We have $K = B_0 \subseteq \mathscr{G}$. Since $K$ is a divisible group, it follows [38, p. 11] that the identity map of $K$ into $K$ has an extension to $\mathscr{G}$. That is, there exists a homomorphism $\alpha: \mathscr{G} \to K$ with $\alpha(f) = f$ for $f \in K$. Define $p^*: A \to K_d^\infty$ by

$$p^*(f) = (\alpha(f),\ \alpha(\partial f),\ \alpha(\partial^2 f),\ \ldots). \tag{16}$$

Clearly, $p^*$ is a homomorphism and $p^*\partial = \hat\sigma^* p^*$. Since $\partial(A) \subseteq A$, it follows that $p^*(A)$ is a shift-invariant subgroup of $K_d^\infty$. It only remains to show that $p^*$ is monic. Suppose then that $f \in A$, $f \ne 1$. It follows that for some $n \ge 0$ we have $f \notin A_n$ and $f \in A_{n+1}$. Hence $\partial^n f$ is a constant different from 1. Since $\alpha(\partial^n f) = \partial^n f$, $p^*(f)$ is not the identity. That is, the kernel of $p^*$ is trivial. ∎

Theorem 3.5 If $\Phi = (G, \phi)$ is a semiergodic affine system with quasidiscrete spectrum, then $\Phi$ is an algebraic factor of the system $\tilde\Phi = (\widehat{K_d^\infty}, \tilde\phi)$ defined as in Example 5, with $\tilde\phi(x) = \tilde\sigma(x) + x + \tilde a$, $\tilde a = (a_0, 0, 0, \ldots)$, and $a_0(t) = t$.

Proof First of all, we show that the analog of Theorem 3.4 holds with $A$ replaced by $\Gamma = \hat G$. We define $\rho^*: \Gamma \to K_d^\infty$ by

$$\rho^*(\gamma) = (\gamma(a), \sigma^*\gamma(a), \sigma^{*2}\gamma(a), \ldots),$$

so that $\rho^*\sigma^* = \tilde\sigma^*\rho^*$. To show that $\rho^*$ is monic, suppose that $\rho^*(\gamma) = 1$, $\gamma \in \Gamma_{n+1}$, and let $\xi = \sigma^{*n}\gamma$. Then

$$\tau^*\xi = \sigma^*\xi + \xi = \sigma^{*(n+1)}\gamma + \sigma^{*n}\gamma = \sigma^{*n}\gamma = \xi,$$

and $\xi(a) = \sigma^{*n}\gamma(a) = 1$. By semiergodicity, $\xi = 0$. In other words, $\gamma \in \Gamma_n$. By a repetition of this argument, we deduce that $\gamma = 0$.

Now consider the adjoint mapping $\rho: \widehat{K_d^\infty} \to G = \hat\Gamma$. Since $\rho^*$ is monic, $\rho$ is epic. Moreover, $\sigma\rho = \rho\tilde\sigma$ and hence $\tau\rho = \rho\tilde\tau$. Noting that

$$(\rho(\tilde a), \gamma) = (\tilde a, \rho^*(\gamma)) = \gamma(a)$$

for each $\gamma \in \Gamma$ implies $\rho(\tilde a) = a$, we have

$$\rho\tilde\phi(x) = \rho\tilde\tau(x) + \rho(\tilde a) = \tau\rho(x) + a = \phi\rho(x).$$

Thus $\rho$ is a homomorphism of $\tilde\Phi$ onto $\Phi$ as asserted.

In Section 6 we shall show that Theorem 3.5 applies to totally ergodic abstract dynamical systems (Abramov) and to totally minimal classical systems (Hahn-Parry).

4. QUASIPERIODIC SPECTRUM AND THE ERGODIC PART OF $\tau$

In this section we introduce a construction due to Seethoff [56], which permits us to identify the maximal subgroup $H$ of $G$ on which an arbitrary continuous epimorphism $\tau$ of $G$ is ergodic. The beginning point is the observation that $\tau$ is ergodic on all of $G$ iff $\tau^*$ has no nontrivial periodic points. In the extreme opposite case, where $H$ is the zero group, there is a close relationship to the affine transformations with quasidiscrete spectrum, and we say that $\tau$ has quasiperiodic spectrum. We show also that such a $\tau$ is distal.

Let $\tau$ be a continuous epimorphism of $G$. As in the previous section, we define an increasing sequence of subgroups of $\hat G$ as follows:

$$\Lambda_0 = \{0\}, \qquad \Lambda_{n+1} = \{\gamma \in \hat G : \tau^{*k}\gamma - \gamma \in \Lambda_n \text{ for some positive integer } k\}. \qquad (17)$$

Let $\Lambda = \Lambda(\tau^*) = \bigcup_{n=1}^{\infty} \Lambda_n$. Note that $\Gamma_n \subseteq \Lambda_n$ for each $n$, and so $\Gamma \subseteq \Lambda$.

Definition 3.5 The affine transformation $\phi$, defined by $\phi(x) = \tau(x) + a$, has quasiperiodic spectrum if $\Lambda(\tau^*) = \hat G$.

Proposition 3.5 The epimorphism $\tau$ is ergodic iff $\Lambda = \Lambda_1 = \{0\}$.

Proof If $\tau$ is ergodic, then according to Theorem 3.1, $\Lambda_1 = \{0\}$. If $\tau$ is not ergodic, there exist a nonzero $\gamma \in \hat G$ and positive integers $k_1$, $k_2$ such that $\xi = \tau^{*k_1}\gamma = \tau^{*(k_1+k_2)}\gamma$. But then $\tau^{*k_2}\xi = \xi$, and $\xi \in \Lambda_1$. Since $\tau^*$ is monic, $\xi \neq 0$. If $\Lambda_1 = \{0\}$, it follows immediately from (17) that $\Lambda_n = \{0\}$ for each $n$, and hence $\Lambda = \{0\}$.
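The criterion of Proposition 3.5 can be tested mechanically in one concrete setting. The sketch below assumes, as an illustration only, that $G$ is the $n$-torus and that $\tau$ is given by an integer matrix $T$ with nonzero determinant, so that $\tau^*(\gamma) = \gamma T$ on $\hat G = Z^n$. A nonzero $\gamma$ with $\tau^{*k}\gamma = \gamma$ exists exactly when $T^k - I$ is singular. If some eigenvalue of $T$ is a primitive $k$th root of unity, then Euler's function satisfies $\varphi(k) \le n$, and since $\varphi(k) \ge \sqrt{k/2}$ it is enough to test $k \le 2n^2$. The function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    n = len(m)
    a = [[Fraction(x) for x in row] for row in m]
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return d


def has_periodic_character(T):
    """True if gamma * T^k = gamma for some nonzero integer row vector gamma, k > 0."""
    n = len(T)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    power = identity
    for _ in range(2 * n * n):          # k = 1, ..., 2 n^2 suffices (see lead-in)
        power = mat_mul(power, T)
        diff = [[power[i][j] - identity[i][j] for j in range(n)]
                for i in range(n)]
        if det(diff) == 0:
            return True
    return False


def is_ergodic(T):
    """Lambda_1 = {0}: no nontrivial periodic points of tau*."""
    return not has_periodic_character(T)
```

For instance, the matrix with rows (2, 1) and (1, 1) passes the test, while a shear with rows (1, 1) and (0, 1) fails already at $k = 1$, and the quarter turn with rows (0, -1) and (1, 0) fails at $k = 4$.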


Proposition 3.6 If $\phi$ is ergodic, and $\hat G$ is torsion-free, then $\Lambda = \Gamma$. In particular, $\phi$ has quasiperiodic spectrum iff it has quasidiscrete spectrum.

Proof According to Theorem 3.3(iv), we have $\Lambda_1 = \Gamma_1$. Suppose that $\Lambda_n = \Gamma_n$, and let $\gamma \in \Lambda_{n+1}$. Thus $\tau^{*k}\gamma - \gamma \in \Gamma_n$ for some $k > 0$. Let $\xi = \tau^*\gamma - \gamma$. Then

$$\tau^{*k}\xi - \xi = \tau^*(\tau^{*k}\gamma - \gamma) - (\tau^{*k}\gamma - \gamma) \in \Gamma_{n-1} \subseteq \Lambda_{n-1}.$$

It follows that $\xi \in \Lambda_n = \Gamma_n$, and hence that $\gamma \in \Gamma_{n+1}$. By induction $\Lambda_n = \Gamma_n$ for all $n$, and $\Lambda = \Gamma$.

The following theorem shows that there was no loss in generality by assuming that $\tau$ was an automorphism in the preceding section.

Theorem 3.6 (Seethoff) If $\phi$ has quasiperiodic spectrum, then it is invertible and distal.

Proof By assumption $\Lambda = \hat G$. Suppose that $\gamma \in \Lambda_n$ for some $n > 0$. Then $\tau^{*k}\gamma - \gamma \in \Lambda_{n-1}$ for some positive integer $k$. By iteration it follows that there is a polynomial $p$ with constant term $\pm 1$ such that $p(\tau^*)\gamma = 0$. Hence there is another polynomial $q$ such that $\gamma = \tau^*[q(\tau^*)\gamma] = \tau^*\xi$. Thus $\tau^*$ is epic and hence invertible. Following Seethoff [56], we divide the remainder of the proof into three parts.

I. If $\Lambda_1 = \hat G$, then $\tau$ is distal. Let

$$\Pi_n = \{\gamma : \tau^{*n!}\gamma = \gamma\}.$$

Then each $\Pi_n$ is a subgroup of $\hat G$, $\tau^*(\Pi_n) \subseteq \Pi_n$, $\Pi_n \subseteq \Pi_{n+1}$, and $\hat G = \Lambda_1 = \bigcup_{n=1}^{\infty} \Pi_n$. Moreover, $\tau^{*n!}$ reduces to the identity on $\Pi_n$. It follows that the factor automorphism $\tau_n = \tau/\Pi_n^\perp$ on $G/\Pi_n^\perp$ (the adjoint of $\tau^*$ restricted to $\Pi_n$) also satisfies $\tau_n^{n!} = $ identity. But then it follows that $\tau_n$ is distal [the orbit of $(x, y)$ under $\tau_n \times \tau_n$ is a finite set]. Since $(G, \tau)$ is the inverse limit of the sequence $(G/\Pi_n^\perp, \tau_n)$, it is distal.

II. If $\Lambda_n = \hat G$, then $\tau$ is distal. The proof is by induction. Suppose the statement is true for fixed $n$ and all $(G, \tau)$. Suppose further that $\tau$ is an automorphism of $G$ for which $\Lambda_{n+1} = \hat G$. Let $H = \Lambda_n^\perp$. By the induction hypothesis the automorphism $\tau_1$ induced by $\tau$ on $G/H$ is distal, since $\widehat{G/H} \cong \Lambda_n(\tau)$ and $\tau_1^*$ is the restriction of $\tau^*$ to $\Lambda_n$. Moreover, the restriction $\tau_2$ of $\tau$ to $H$ is distal by I, since $\hat H \cong \hat G/\Lambda_n = \Lambda_{n+1}/\Lambda_n = \Lambda_1(\tau_2)$. Now suppose that there is a net $n_k$ of integers and points $x, y, z \in G$ with $\tau^{n_k}x \to z$, $\tau^{n_k}y \to z$. Let $w = x - y$. Then $\tau^{n_k}w \to 0$. It follows that $\tau_1^{n_k}(w + H) \to 0$. By distality of $\tau_1$, $w + H$ is the zero coset, that is, $w \in H$. Since $\tau_2^{n_k}w = \tau^{n_k}w \to 0$, and $\tau_2$ is distal, $w = 0$. Thus $\tau$ is distal.


III. If $\Lambda = \hat G$, then $\phi$ is distal. Since $\Lambda$ is the union of the increasing sequence $\Lambda_n$, it follows, as in the proof of I, that $\tau$ is the inverse limit of the distal transformations $\tau_n$ induced by $\tau$ on $G/\Lambda_n^\perp$, hence distal. Finally, suppose that $\phi^{n_k}(x) \to z$ and $\phi^{n_k}(y) \to z$. For any $n$ we have

$$\phi^n(x) = \tau^n(x) + \tau^{n-1}(a) + \cdots + \tau(a) + a.$$

Thus $\phi^{n_k}(x) - \phi^{n_k}(y) = \tau^{n_k}(x) - \tau^{n_k}(y) = \tau^{n_k}(x - y) \to 0$. Hence $x - y = 0$. Thus $\phi$ is distal.

We shall show in Chapter IV that the converse of Theorem 3.6 is also true. That is, $\phi$ is distal iff $\Lambda = \hat G$. Thus it makes sense to talk about the distal part of $\tau$, or at least the distal part of $\tau^*$. The following theorem gives a precise meaning to the ergodic part of $\tau$.

Theorem 3.7 (Seethoff) Let $\tau$ be a continuous epimorphism of the compact abelian group $G$. Then there exists a uniquely determined closed subgroup $H$ of $G$ such that

(i) $\tau(H) = H$,
(ii) $\tau$ is ergodic on $H$, and
(iii) $H$ contains every subgroup on which $\tau$ is ergodic.

Moreover, $\tau/H$ has quasiperiodic spectrum.

Proof Set $H = \Lambda^\perp$. For any closed $\tau$-invariant subgroup $F$ of $G$, let $\tau_F$ be the restriction of $\tau$ to $F$. The dual $\hat F$ of $F$ is isomorphic to $\hat G/F^\perp$, and it is easily seen that $\Lambda_n(\tau_F) = \Lambda_n/F^\perp$. In particular, $\Lambda_1(\tau_H) = \Lambda_1/\Lambda$ is trivial. According to Proposition 3.5, $\tau_H$ is ergodic. Conversely, if $\tau_F$ is ergodic, then $\Lambda(\tau_F) = \Lambda/F^\perp$ is trivial, so that $\Lambda \subseteq F^\perp$. But this implies that $F \subseteq \Lambda^\perp = H$. Finally, $\widehat{G/H} \cong H^\perp = \Lambda$, so that $\tau/H$ has quasiperiodic spectrum.

Combining Theorem 3.7 with Proposition 3.6 yields the following corollary.

Corollary 3.7.1 Let $\phi$ be an ergodic affine transformation, $\phi(x) = \tau(x) + a$, on $G$. Suppose $\hat G$ is torsion-free. Then there exists a uniquely determined, closed, $\tau$-invariant subgroup $H$ of $G$ such that (i) $\tau$ restricted to $H$ is ergodic, and (ii) the transformation $\phi/H$ induced by $\phi$ on $G/H$ has quasidiscrete spectrum.
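The iterate formula $\phi^n(x) = \tau^n(x) + \tau^{n-1}(a) + \cdots + \tau(a) + a$ used in part III of the proof of Theorem 3.6 can be checked in exact arithmetic. The following is only a sketch under stated assumptions: $G$ is taken to be a torus with coordinates stored as fractions modulo 1, $\tau$ is given by an integer matrix, and all function names are hypothetical.

```python
from fractions import Fraction


def apply(T, x):
    """Apply the integer matrix T to a torus point (coordinates mod 1)."""
    return tuple(sum(T[i][j] * x[j] for j in range(len(x))) % 1
                 for i in range(len(T)))


def add(x, y):
    return tuple((p + q) % 1 for p, q in zip(x, y))


def sub(x, y):
    return tuple((p - q) % 1 for p, q in zip(x, y))


def phi_iterate(T, a, x, n):
    """Apply phi(x) = tau(x) + a to x, n times."""
    for _ in range(n):
        x = add(apply(T, x), a)
    return x


def closed_form(T, a, x, n):
    """tau^n(x) + tau^(n-1)(a) + ... + tau(a) + a."""
    term = a
    total = tuple(Fraction(0) for _ in a)
    for _ in range(n):
        total = add(total, term)   # accumulates a, tau(a), ..., tau^(n-1)(a)
        term = apply(T, term)
    for _ in range(n):
        x = apply(T, x)            # tau^n(x)
    return add(x, total)


T = [[1, 0], [1, 1]]
a = (Fraction(1, 7), Fraction(0))
x = (Fraction(2, 5), Fraction(3, 11))
y = (Fraction(1, 3), Fraction(5, 8))
```

With these sample values, `phi_iterate` and `closed_form` agree for every $n$, and the difference of two iterated points depends only on $\tau^n(x - y)$, which is the step that reduces distality of $\phi$ to distality of $\tau$.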


5. ERGODIC AUTOMORPHISMS

In the previous section we pointed out that an automorphism $\tau$ of a compact abelian group $G$ has a distal part (quasiperiodic spectrum) and an ergodic part. Corollary 3.7.1 says the same for an ergodic affine transformation on a connected group, with the distal part having quasidiscrete spectrum. Theorem 3.5 gives a concrete description of ergodic affine transformations with quasidiscrete spectrum. In this section we shall prove an analogous theorem for ergodic automorphisms and look at some examples. In order to introduce the class of ergodic automorphisms $\tau$ for which our representation theorem is valid, we first prove the following proposition.

Proposition 3.7 Let $G$ be a compact abelian metric group, and let $\tau$ be a continuous, ergodic automorphism of $G$. Then there exists an $a \in G$ such that the orbit $O_\tau(a)$ of $a$ under $\tau$ separates the points of $\hat G$, that is, $\gamma(\tau^n a) = 1$ $(n = 0, \pm 1, \pm 2, \ldots)$ implies that $\gamma = 0$.

Proof We shall in fact show that for "most" of the points $a \in G$ the orbit of $a$ is dense in $G$. Since $G$ is compact, it is totally bounded. Thus there exists a double sequence $\{B_{nk}\}$ of open balls in $G$ such that $G \subseteq \bigcup_{k=1}^{k_n} B_{nk}$, and the radius of $B_{nk}$ is $1/n$ for $k = 1, \ldots, k_n$ and $n = 1, 2, \ldots$. Let $G_{nk}$ denote the set of $x \in G$ for which $O_\tau(x) \cap B_{nk} = \emptyset$. The set $G_0 = \bigcup_{n,k} G_{nk}$ is exactly the set of points whose orbits are not dense in $G$. Since we can write $G_{nk}$ as

$$G_{nk} = \bigcap_{j=-\infty}^{\infty} \tau^{-j}(G \setminus B_{nk}),$$

and since $\tau$ is continuous, each $G_{nk}$ is closed. Moreover, if we let $A_{nk}$ denote the orbit of $G_{nk}$,

$$A_{nk} = \{\tau^j(x) : x \in G_{nk},\ j \in Z\},$$

then $A_{nk}$ is $\tau$-invariant and disjoint from $B_{nk}$. Since $B_{nk}$ is open and hence has positive Haar measure, it follows from the ergodicity of $\tau$ that $A_{nk}$ and hence $G_{nk} \subseteq A_{nk}$ have measure zero. Therefore $G_{nk}$ has empty interior. We have shown that $G_0$ is a countable union of closed, nowhere dense sets. By the Baire category theorem, $G_0$ cannot be all of $G$. In fact, the set of points with dense orbits is a dense $G_\delta$.


For an arbitrary continuous automorphism $\tau$, let us introduce the following notation. By $\mathcal{S}(\tau)$ we shall mean the semigroup of endomorphisms $\phi$ of $G$ of the form

$$\phi(x) = \sum_{j=1}^{l} n_j \tau^{k_j}(x),$$

where $n_1, \ldots, n_l, k_1, \ldots, k_l \in Z$ and $l \in Z^+$. By Gp Orb$(a)$ or Gp Orb$_\tau(a)$ we shall mean the group generated by the orbit $O_\tau(a)$. Clearly, Gp Orb$_\tau(a) = \mathcal{S}(\tau)a$.

Definition 3.6 The affine system $(G, \tau)$ is monothetic if there exists an $a \in G$ such that Gp Orb$_\tau(a) = \mathcal{S}(\tau)a$ is dense in $G$.

Remark If we denote the identity automorphism on $G$ by $I$, then $(G, I)$ is monothetic iff $G$ is a monothetic group in the usual sense, that is, iff $a$ is a topological generator of $G$.
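For a finite group "dense" simply means "all of $G$", so Definition 3.6 can be checked by direct enumeration. The sketch below assumes, for illustration, the group $G = \{0, 1\}^3$ with addition modulo 2 and the automorphism $\tau(x_1, x_2, x_3) = (x_1, x_1 + x_2, x_2 + x_3)$ that appears in Exercise 14; the function names are hypothetical. Because $\tau$ is a bijection of a finite set, the forward orbit of a point already equals its two-sided orbit.

```python
def tau(x):
    """tau(x1, x2, x3) = (x1, x1 + x2, x2 + x3), addition modulo 2."""
    x1, x2, x3 = x
    return (x1, (x1 + x2) % 2, (x2 + x3) % 2)


def orbit(a):
    """Orbit of a under tau (finite, so the forward orbit is the full orbit)."""
    seen, x = [], a
    while x not in seen:
        seen.append(x)
        x = tau(x)
    return seen


def generated_subgroup(generators):
    """Subgroup of {0,1}^3 generated by the given elements."""
    group = {(0, 0, 0)}
    frontier = [(0, 0, 0)]
    while frontier:
        g = frontier.pop()
        for h in generators:
            s = tuple((p + q) % 2 for p, q in zip(g, h))
            if s not in group:
                group.add(s)
                frontier.append(s)
    return group


def is_monothetic_witness(a):
    """True if Gp Orb(a) is all of G, so that a witnesses Definition 3.6."""
    return len(generated_subgroup(orbit(a))) == 8
```

Here $a = (1, 0, 0)$ is a witness, since its orbit has four points that generate the whole group, whereas $a = (0, 0, 1)$ is fixed by $\tau$ and generates only a subgroup of order 2.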

Proposition 3.8 The system $(G, \tau)$ is monothetic iff there exists an $a \in G$ such that $O_\tau(a)$ separates the points of $\hat G$.

The proof is left to the exercises. We are now in a position to offer the analog of Theorem 3.5 for ergodic automorphisms. Let $(K^\infty)_d$ be the discretized version of the infinite-dimensional torus $K^\infty = \bigotimes_{n=-\infty}^{\infty} K$. That is, $(K^\infty)_d$ is $K^\infty$ with the discrete topology. Note that $K_d^\infty$ is a proper subgroup of $(K^\infty)_d$, since $K_d^\infty$ contains only the finitely nonzero sequences from $K_d$, whereas $(K^\infty)_d$ contains all bisequences. Let $\tilde\tau^*$ be the shift transformation on $(K^\infty)_d$, that is, $\tilde\tau^*(z) = w$, where $w_n = z_{n+1}$. Then $\tilde\tau^*$ is an automorphism of $(K^\infty)_d$, and its adjoint $\tilde\tau$ is a continuous automorphism of the dual group $\bar Z^\infty = \widehat{(K^\infty)_d}$. According to Theorem 3.7, there is a maximal $\tilde\tau$-invariant subgroup $H$ of $\bar Z^\infty$ such that the restriction $\tilde\tau_H$ of $\tilde\tau$ to $H$ is ergodic. Let $\Phi_e$ denote the ergodic system $(H, \tilde\tau_H)$.

Remark $\bar Z^\infty$ is the Bohr compactification of the discrete group $Z^\infty$ (see [31]).

Theorem 3.8 If $\tau$ is an ergodic automorphism of the compact abelian group $G$, and if $(G, \tau)$ is monothetic, then $(G, \tau)$ is an algebraic factor of $\Phi_e$.

Proof As in the proof of Theorem 3.5, we define a mapping $\rho^*$ of $\hat G$ into $\hat H = (K^\infty)_d/\Lambda$, where $\Lambda = \Lambda(\tilde\tau) = H^\perp$ is defined as in Section 4, and show that (i) $\rho^*$ is monic and (ii) $\rho^*\tau^* = \tilde\tau_H^*\rho^*$. The adjoint $\rho: H \to G$ is then an algebraic homomorphism of $(H, \tilde\tau_H)$ onto $(G, \tau)$.


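By assumption there exists $a \in G$ such that $O_\tau(a)$ separates the points of $\hat G$. For $\gamma \in \hat G$ define $\rho_0^*(\gamma)$ to be the bisequence

$$\rho_0^*(\gamma) = \{\gamma(\tau^n a)\}. \qquad (18)$$

Then $\rho_0^*: \hat G \to (K^\infty)_d$ is clearly monic and satisfies $\rho_0^*\tau^* = \tilde\tau^*\rho_0^*$. Moreover,

$$\rho_0^*(\hat G) \cap \Lambda(\tilde\tau) = \{0\}. \qquad (19)$$

For suppose that $z = \rho_0^*(\gamma) \in \Lambda_p$ for some $p > 0$. Then there exists a positive integer $k$ with $\tilde\tau^{*k}z - z \in \Lambda_{p-1}$. This means that

$$\tilde\tau^{*k}z - z = \{\gamma(\tau^{n+k}a)\} - \{\gamma(\tau^n a)\} = \{\gamma(\tau^{n+k}a)/\gamma(\tau^n a)\} = \{\tau^{*k}\gamma(\tau^n a)/\gamma(\tau^n a)\} = \rho_0^*(\tau^{*k}\gamma - \gamma) \in \Lambda_{p-1}. \qquad (20)$$

Suppose that $\rho_0^*(\hat G) \cap \Lambda_{p-1} = \{0\}$. Then (20) implies that $\tau^{*k}\gamma - \gamma = 0$. Since $\tau$ is ergodic, this implies by Theorem 3.1 that $\gamma = 0$. We conclude that

$$\rho_0^*(\hat G) \cap \Lambda_{p-1} = \{0\} \implies \rho_0^*(\hat G) \cap \Lambda_p = \{0\}.$$

Since $\rho_0^*$ is monic, $\rho_0^*(\hat G) \cap \Lambda_0 = \{0\}$, and (19) follows by induction on $p$ and the fact that $\Lambda = \bigcup_{p=0}^{\infty} \Lambda_p$. Finally, let $\rho^*: \hat G \to (K^\infty)_d/\Lambda$ be the composition of $\rho_0^*$ and the natural projection $\pi$ of $(K^\infty)_d$ onto $(K^\infty)_d/\Lambda$. It follows then from (19) that $\rho^*$ is monic. Moreover, $\pi\tilde\tau^* = \tilde\tau_H^*\pi$ by the definition of factor automorphism. Thus

$$\rho^*\tau^* = \pi\rho_0^*\tau^* = \pi\tilde\tau^*\rho_0^* = \tilde\tau_H^*\pi\rho_0^* = \tilde\tau_H^*\rho^*,$$

and we are finished.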

Corollary 3.8.1 Let $B = \rho_0^*(\hat G)$. Then $B$ is a subgroup of $(K^\infty)_d$ with the following properties:

1. $\tilde\tau^*(B) = B$, and
2. $B \cap \Lambda(\tilde\tau) = \{0\}$.

Conversely, corresponding to any subgroup $B$ of $(K^\infty)_d$ satisfying properties 1 and 2 there is a compact abelian group $G$ and a continuous, ergodic automorphism $\tau$ of $G$ such that $(G, \tau)$ is monothetic and $B = \rho_0^*(\hat G)$, where $\rho_0^*$ is defined by (18).

Proof The first statement has already been proved. To prove the converse, let $G = \hat B = \bar Z^\infty/B^\perp$. From property 1 it follows that $\tilde\tau(B^\perp) = B^\perp$ and that the adjoint of the restriction of $\tilde\tau^*$ to $B$ is the factor automorphism $\tilde\tau/B^\perp$ on $\bar Z^\infty/B^\perp$. Let us denote $\tilde\tau/B^\perp$ by $\tau$. Let $a_0 \in \bar Z^\infty = \widehat{(K^\infty)_d}$ be defined by $(a_0, z) = z_0$ for $z \in (K^\infty)_d$, and let $a$ be the projection of $a_0$ on $G = \bar Z^\infty/B^\perp$. Since $(\tilde\tau^n a_0, z) = (a_0, \tilde\tau^{*n}z) = z_n$, it follows that $O_{\tilde\tau}(a_0)$ separates the points of $(K^\infty)_d$, and hence $O_\tau(a)$ separates the points of $B = \hat G$. Thus $(G, \tau)$ is monothetic. If $\rho_0^*$ is defined by (18), then $\rho_0^*(z) = z$ for each $z \in B = \hat G$. It only remains to show that $\tau$ is ergodic. But this follows immediately from property 2 and Theorem 3.1.

An interesting side issue is the question of which compact abelian groups admit monothetic automorphisms. Let us agree to call such a group algebraically monothetic. Thus $G$ is algebraically monothetic iff there exist $a \in G$ and a continuous automorphism $\tau$ of $G$ such that $\mathcal{S}(\tau)a$ is dense in $G$. Corollary 3.8.1 provides us with one answer to the above question.

Corollary 3.8.2 The group $G$ is algebraically monothetic iff $\hat G$ is isomorphic to a (not necessarily closed) subgroup $B$ of the infinite-dimensional torus $K^\infty$ satisfying $\tilde\tau^*(B) = B$, where $\tilde\tau^*$ is the shift on $K^\infty$.

Proof The condition $B \cap \Lambda(\tilde\tau) = \{0\}$ was needed only to prove ergodicity. Otherwise, the proof proceeds as above.

Example 7 For any $a \in K_d$ denote by $(a)$ the cyclic group generated by $a$, and let $B = \bigotimes_{n=-\infty}^{\infty} (a)$ be the complete direct product. $B$ is identified with a subgroup of $(K^\infty)_d$ in the obvious way and satisfies property 1. However, property 2 is not satisfied, since, for example, $z = \{a^{k_n}\} \in \Lambda$ if the integers $k_n$ satisfy the recurrence relation $k_{n+1} = k_n$ $(n = 0, \pm 1, \pm 2, \ldots)$. On the other hand, the subgroup $B' = \bigoplus_{n=-\infty}^{\infty} (a)$ of $B$, consisting of sequences $z = \{a^{k_n}\}$ with $k_n = 0$ for $|n| > N(z)$, satisfies both of properties 1 and 2. (Recall that $\Lambda_1(\tilde\tau) = \{0\}$ implies $\Lambda(\tilde\tau) = \{0\}$.)

Example 8 More generally, let $B = \bigoplus_{n=-\infty}^{\infty} T$, where $T$ is any subgroup of $K_d$. Then conditions 1 and 2 are satisfied. In this case, $\hat B = \bigotimes_{n=-\infty}^{\infty} \hat T$ and $\tau$ is the symbolic flow on $\hat T$ as defined in Example 3 of Section 3.1. If we take $T$ to be the $k$th roots of unity, $\tau$ is the Bernoulli shift on $k$ points.

Corollary 3.8.3 The symbolic flow on any monothetic group is monothetic. In particular, Bernoulli shifts are monothetic.

Example 9 The divisibility of the group $K_d$ and hence of $(K^\infty)_d$ implies by an easy induction argument that $\Lambda$ is divisible. It follows from a well-known theorem of group theory [38, p. 8] that $\Lambda$ is a direct summand of $(K^\infty)_d$. That is, there exists a group $B_e$ such that

$$(K^\infty)_d = \Lambda \oplus B_e.$$


Clearly, $B_e$ satisfies condition 2. We would like to say that $B_e$ can be chosen to also satisfy condition 1, for this would imply that the "model" of Theorem 3.8 is monothetic since $B_e \cong \hat H$. It would also imply that any system $(G, \tau)$, where $\tau$ is monothetic, has a direct sum decomposition $G = G_1 \oplus G_2$ (since the model $\bar Z^\infty$ would decompose as $H \oplus B_e^\perp$) with $G_1$ and $G_2$ $\tau$-invariant, $(G_1, \tau)$ ergodic, and $(G_2, \tau)$ distal. However, this is not true, as shown by a counterexample constructed by Kerrick [39] on the 3-torus $K^3$.

Remark In Examples 7 and 8, the group $B = \rho_0^*(\hat G)$ is a subgroup of the direct sum $K_d^\infty \subseteq (K^\infty)_d$. This, however, does not exhaust the possibilities. It can be shown (Exercises 29 and 30) that any automorphism of $K^n$ is monothetic. Kerrick has shown [39, p. 55] that (in his Type I case) $a$ can be chosen so that $B = \rho_0^*(Z^n)$ is given by

$$B = \Bigl\{ \Bigl\{ \exp\Bigl[2\pi i \sum_j a_j \operatorname{Re}\bigl(\lambda_j^k (m \cdot u_j)\bigr)\Bigr] \Bigr\}_k : m \in Z^n \Bigr\},$$

where the $\lambda_j$ are eigenvalues and the $u_j$ eigenvectors of the associated unimodular matrix, and where the $a_j$ are real numbers. It can be shown [13] that not all such $B$ are contained in $K_d^\infty$. Another example is constructed in [13] of a nontrivial $B$ satisfying conditions 1 and 2 and having trivial intersection with $\Lambda \oplus K_d^\infty$.

6. AN AFFINE TRANSFORMATION ASSOCIATED WITH THE DYNAMICAL SYSTEM $\Phi$

The time has arrived to pull together ideas from the last three chapters and discuss the affine system associated with an abstract dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ or with a classical system $\Phi = (X, \phi)$. Let us denote by $\mathcal{G}$ or $\mathcal{G}(\Phi)$ the group of all complex-valued functions on $X$ which have constant absolute value one and which, in the abstract case, are $\mathcal{B}$-measurable or, in the classical case, are continuous. In the abstract case, we shall actually deal with equivalence classes of such functions, two functions being equivalent if they are equal $\mu$-almost everywhere. In this case, $\mathcal{G} \subseteq L_2(X)$. In the classical case, $\mathcal{G} \subseteq C(X)$. It is clear that $\mathcal{G}$ is a group, and that the set $K_1$ of constant functions of modulus one is a subgroup.

As explained in Chapters I and II, the dynamical system $\Phi$ has associated with it a bounded linear operator $T_\phi$ on $L_2(X)$ or on $C(X)$, defined by $T_\phi f(x) = f(\phi(x))$. Moreover, the restriction of this operator to $\mathcal{G}$ is a monomorphism of the group $\mathcal{G}$. If $\Phi$ is invertible, it is an automorphism. We shall denote this transformation of $\mathcal{G}$ by $\tau^*$ or $\tau_\Phi^*$. Thus

$$\tau_\Phi^* f(x) = T_\phi f(x) = f(\phi(x)). \qquad (22)$$

The mapping $\alpha_0: K_1 \to K$, which assigns to each constant function its constant value, is clearly a homomorphism. Since $K$ is divisible, $\alpha_0$ has a homomorphic extension $\alpha: \mathcal{G} \to K$. Let us define an endomorphism $\sigma^*: \mathcal{G} \to \mathcal{G}$ by $\sigma^* f = \tau^* f / f$. This is in keeping with our earlier notation, bearing in mind that the group $\mathcal{G}$ is perforce written multiplicatively. Let us denote $\hat G = \sigma^*(\mathcal{G})$.

Note The mappings $\tau^*$, $\sigma^*$, $\alpha$ here are the same as $T_\phi$, $\partial$, $\alpha$ introduced in the proof of Proposition 3.4 and Theorem 3.4 of Section 3.
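The operators $T_\phi$ and $\sigma^*$ can be tried out numerically. The sketch below is an illustration under assumptions that are not taken from the text: $X$ is the 2-torus, $\phi(x, y) = (x + \theta, y + x)$ is a skew product with an assumed irrational $\theta$, and the names `phi`, `e`, `f`, `g`, `T`, and `sigma_star` are hypothetical. It checks that $f(x, y) = e^{2\pi i x}$ satisfies $\sigma^* f = e^{2\pi i \theta}$, a constant, and that $g(x, y) = e^{2\pi i y}$ satisfies $\sigma^* g = f$, so that $g$ is a quasieigenfunction of the next order in the sense of the groups $B_n$ of Section 3.

```python
import cmath
import math

THETA = math.sqrt(2) - 1  # assumed irrational rotation number


def phi(p):
    """Skew product on the 2-torus: (x, y) -> (x + theta, y + x) mod 1."""
    x, y = p
    return ((x + THETA) % 1.0, (y + x) % 1.0)


def e(t):
    """The character t -> exp(2 pi i t) of the circle."""
    return cmath.exp(2j * math.pi * t)


def f(p):
    return e(p[0])


def g(p):
    return e(p[1])


def T(h):
    """T_phi h = h composed with phi."""
    return lambda p: h(phi(p))


def sigma_star(h):
    """(sigma* h)(p) = (T_phi h)(p) / h(p)."""
    return lambda p: T(h)(p) / h(p)


sample = [(0.1, 0.2), (0.7, 0.35), (0.55, 0.9)]
errors_f = [abs(sigma_star(f)(p) - e(THETA)) for p in sample]
errors_g = [abs(sigma_star(g)(p) - f(p)) for p in sample]
```

Both error lists are zero up to floating-point rounding. The values are compared with a tolerance rather than exactly because the points are stored as floats.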

Proposition 3.9 If $\Phi$ is ergodic (or minimal), then $\mathcal{G} \cong K \oplus \hat G$.

Proof The mapping $\psi: f \to (\alpha(f), \sigma^*(f))$ is a homomorphism of $\mathcal{G}$ into $K \oplus \hat G$. It is epic because the range of $\sigma^*$ is $\hat G$, and, for a fixed $f \in \mathcal{G}$ and $\lambda \in K$, it maps $g = [\lambda/\alpha(f)]f$ into $(\lambda, \sigma^*(f))$. It is monic because

$$\sigma^* f = 1 \implies \tau^* f = f \implies T_\phi f = f \implies f \in K_1,$$

and, for $f \in K_1$ with $\alpha(f) = 1$, we have $f = 1$.

Remark The inverse map $\psi^{-1}$ carries $K$ onto $K_1$ and $\hat G$ onto the kernel of $\alpha$.

We shall assume from now on that $\Phi$ is ergodic or minimal, so that Proposition 3.9 applies. The notation $\hat G = \sigma^*(\mathcal{G})$ was chosen with a purpose in mind. The compact dual $G$ of the group $\hat G$ is precisely the group on which the affine transformation associated with $\Phi$ will be defined. By construction $\alpha \in \hat{\mathcal{G}}$ is a character on $\mathcal{G}$. Thus it determines a character on the subgroup $\hat G = \sigma^*(\mathcal{G})$. We shall denote this again by $\alpha$. Thus $\alpha \in G$. Since $\tau^*$ commutes with $\sigma^*$, it is clear that $\tau^*\hat G \subseteq \hat G$, with equality holding if $\Phi$ is invertible. Thus $\tau^*$ has an adjoint $\tau$ on $G$, which is an epimorphism in general and an automorphism if $\Phi$ is invertible. Finally, we define the affine transformation $\hat\phi$ of $G$ by

$$\hat\phi(\xi) = \tau(\xi) + \alpha. \qquad (23)$$

The following result due to Abramov is basic to the representation theory to follow. The subgroups $A$ and $B$ of $\mathcal{G}$ are defined as in Section 3. Thus $A_0 = \{1\}$, $B_0 = K_1$, $B_{n+1} = \{f \in \mathcal{G} : \sigma^* f \in B_n\}$, $A_{n+1} = \sigma^*(B_{n+1})$ for $n = 0, 1, 2, \ldots$, and $A = \bigcup_{n=0}^{\infty} A_n$, $B = \bigcup_{n=0}^{\infty} B_n$.


Lemma 3.1 Suppose $f, g \in B$ with $\sigma^* f \neq \sigma^* g$. Further suppose that $T_\phi$ has no eigenvalues that are roots of unity. Then $(f, g) = 0$; that is, $f$ and $g$ are orthogonal in $L_2(X)$. In the classical case, the integral may be calculated with respect to any $\phi$-invariant, ergodic Borel measure on $X$.

Proof The isomorphism $\psi: f \to (\alpha(f), \sigma^*(f))$ of Proposition 3.9 carries $B_n$ onto $K \oplus A_n$ for each $n$ and $B$ onto $K \oplus A$. It follows from the remark following that proposition that

$$B_n = K_1 \oplus C_n,$$

where $C_n = B_n \cap \alpha^{-1}(1)$. Let us denote by $\mathcal{X}_n$ the closed linear subspace of $L_2(X, \mathcal{B}, \mu)$ generated by $C_n$. Following Abramov [2], we shall show first that $C_{n+1} \setminus \mathcal{X}_n \subseteq \mathcal{Y}_n$, where $\mathcal{Y}_n = \mathcal{X}_n^\perp$ is the orthogonal complement of $\mathcal{X}_n$ in the Hilbert space $L_2(X, \mathcal{B}, \mu)$. Let $f \in C_{n+1} \setminus \mathcal{X}_n$ have the unique representation $f = f_1 + f_2$, $f_1 \in \mathcal{X}_n$, $f_2 \in \mathcal{Y}_n$. Since $T_\phi(B_n) \subseteq B_n \subseteq \mathcal{X}_n$, it follows that $T_\phi(\mathcal{X}_n) \subseteq \mathcal{X}_n$. Thus $T_\phi f_1 \in \mathcal{X}_n$. Moreover, since $T_\phi$ is invertible on $B_n$ (cf. the proof of Theorem 3.6), it follows that $T_\phi$ is unitary on $\mathcal{X}_n$. Thus for each $g \in \mathcal{X}_n$

$$(T_\phi f_2, g) = (f_2, T_\phi^{-1} g) = 0.$$

That is, $T_\phi f_2 \in \mathcal{Y}_n$. Since $f \in C_{n+1}$, we have

$$T_\phi f = \tau^* f = \sigma^* f \cdot f = \sigma^* f \cdot f_1 + \sigma^* f \cdot f_2, \qquad (24)$$

as well as

$$T_\phi f = T_\phi f_1 + T_\phi f_2. \qquad (25)$$

Since $\sigma^* f \in A_{n+1} \subseteq B_n$, we have $\sigma^* f \cdot f_1 \in \mathcal{X}_n$, and as before

$$(\sigma^* f \cdot f_2, g) = (f_2, \overline{\sigma^* f} \cdot g) = 0$$

for each $g \in \mathcal{X}_n$, so that $\sigma^* f \cdot f_2 \in \mathcal{Y}_n$. From (24) and (25) and the uniqueness of the orthogonal decomposition of the function $T_\phi f$, we conclude that

$$T_\phi f_1 = \sigma^* f \cdot f_1, \qquad T_\phi f_2 = \sigma^* f \cdot f_2. \qquad (26)$$

From (26) and the definition of $B_{n+1}$, we conclude that either $f_j = 0$ a.e. or $g_j = f_j/|f_j| \in B_{n+1}$ with $\sigma^*(g_j) = \sigma^*(f)$ $(j = 1, 2)$. In the latter case $\sigma^*(f/g_j) = 1$, so that $f = \lambda_j g_j$, $\lambda_j \in K_1$. Since $f \notin \mathcal{X}_n$, it follows that $f_1 = 0$, and $f = f_2 \in \mathcal{Y}_n$, as asserted.


Next let us show that $(f, g) = 0$ for all $f, g \in C_1$, $f \neq g$. This follows from

$$(f, g) = (T_\phi f, T_\phi g) = (\sigma^* f)(\sigma^* g)^{-1}(f, g), \qquad (27)$$

where $\sigma^* f, \sigma^* g \in K_1$ are constants. Since $f \neq g$ and $\sigma^*$ is monic on $C_1$, it follows from (27) that $(f, g) = 0$.

Suppose now that $(f, g) = 0$ for all $f, g \in C_n$, $f \neq g$. We refine our earlier result by showing that $C_{n+1} \setminus C_n \subseteq \mathcal{Y}_n$. Equivalently, we show that $C_{n+1} \cap \mathcal{X}_n \subseteq C_n$. Suppose then that $f \in C_{n+1} \cap \mathcal{X}_n$. By the induction hypothesis, the elements of $C_n$ form an orthonormal basis for $\mathcal{X}_n$. Thus

$$\sum_{g \in C_n} |(f, g)|^2 = \|f\|^2 = 1. \qquad (28)$$

For each $g \in C_n$ we have $T_\phi g = \sigma^*(g)g$, and since $\sigma^*(g) \in B_{n-1} = K_1 \oplus C_{n-1}$, $T_\phi g = cg_1$, where $|c| = 1$ and $g_1 g^{-1} \in C_{n-1}$. Likewise, $\sigma^*(f) = ah$, where $|a| = 1$ and $h \in C_n$. Thus

$$(f, g) = (T_\phi f, T_\phi g) = (ahf, cg_1), \qquad |(f, g)| = |(f, h^{-1}g_1)|.$$

Repeating this argument, we get an infinite sequence $g_m \in C_n$ with $g_m g^{-1} \in C_{n-1}$ and

$$|(f, g)| = |(f, h^{-m}g_m)|.$$

From (28) it follows that only a finite number of the functions $h^{-m}g_m \in C_n$ are distinct. Thus for some $m$, $h^{-m}g_m = g$, and so $h^m = g_m g^{-1} \in C_{n-1}$. Now since $C_n/C_{n-1} \cong A_n/A_{n-1} \cong \sigma^{*(n-1)}(A_n) \subseteq \sigma^*(B_1) = A_1$ (see the proof of Proposition 3.4) and since $A_1$ consists of the eigenvalues of $T_\phi$ and thus has no elements of finite order, it follows that $h \in C_{n-1}$. But this means that $\sigma^*(f) = ah \in B_{n-1}$. Therefore $f \in B_n \cap C_{n+1} = C_n$, as asserted.

Now let $f, g \in C_{n+1}$ with $f \neq g$. If both $f, g \in C_n$, then $(f, g) = 0$ by the induction hypothesis. If $f \notin C_n$ and $g \in C_n$, then $(f, g) = 0$ by what was just proved. Thus the induction step is reduced to considering the case $f, g \in C_{n+1} \setminus C_n$. In this case,

$$(f, g) = (T_\phi f, T_\phi g) = (\sigma^*(f)f, \sigma^*(g)g) = (f\bar g, \overline{\sigma^*(f)}\,\sigma^*(g)).$$

Since $\overline{\sigma^*(f)}\,\sigma^*(g) \in B_n$, there exist a constant $c$ with $|c| = 1$ and $h \in C_n$ with $\overline{\sigma^*(f)}\,\sigma^*(g) = ch$. Since $f\bar g \in C_{n+1}$, either

$$(f, g) = (f\bar g, ch) = 0,$$

or $f\bar g \in C_n$ and $f\bar g = h$. But the latter implies that $cf\bar g = \overline{\sigma^*(f)}\,\sigma^*(g)$, so that

$$cT_\phi f = c\sigma^*(f)f = \sigma^*(g)g = T_\phi g,$$

a contradiction. Thus $(f, g) = 0$.

Finally, suppose that $f, g \in B$ with $\sigma^*(f) \neq \sigma^*(g)$. Then $f = cf_1$, $g = dg_1$ with $|c| = |d| = 1$ and $f_1, g_1 \in C = \bigcup_{n=1}^{\infty} C_n$. Since $\sigma^*(f) = \sigma^*(f_1)$, $\sigma^*(g) = \sigma^*(g_1)$, it follows that $f_1 \neq g_1$. Thus $(f, g) = cd^{-1}(f_1, g_1) = 0$.

Remark If $\Phi$ is a totally ergodic (totally minimal) abstract (classical) dynamical system, then $T_\phi$ can have no eigenvalues that are roots of unity. (Exercise 35.)

Theorem 3.9 Suppose that $\Phi$ is totally ergodic or totally minimal. Then $\hat\phi_\Phi = \hat\phi$, defined by Eq. (23), is ergodic. Moreover,

(i) $\tau_\Phi$ is ergodic iff $\phi$ has no nontrivial quasieigenfunctions, and
(ii) (Abramov-Hahn-Parry) if $\phi$ has quasidiscrete spectrum, then $\hat\phi$ also has quasidiscrete spectrum, and they are isomorphic in the appropriate category.

Proof (Ergodicity) According to Theorem 3.3(iv), it will suffice to show that

$$\tau^{*n} f = f \implies \tau^* f = T_\phi f = f$$

and

$$T_\phi f = f,\ \alpha(f) = 1 \implies f = 1$$

for each $f \in \hat G = \sigma^*(\mathcal{G})$. These implications are in fact valid for all $f \in \mathcal{G}$. The first implication follows from the ergodicity or minimality of $\phi^n$, since $T_\phi^n f = f$ for those (and only those) $f \in K_1$. The second implication follows from the definition of $\alpha$, since $\alpha(f) = f$ for $f \in K_1$.

(i) Using the results from Section 4, we see that the ergodic part of $\hat\phi$ is defined on $H = \Lambda^\perp = \Gamma^\perp$, where $\Gamma$ is the set of functions $g \in \hat G$ such that $\sigma^{*n} g = 1$ for some $n$. Thus $H$ is all of $G$ iff $\Gamma$ is trivial, that is, iff $\sigma^{*(n+1)} f = \sigma^{*n}(\sigma^* f) = \sigma^{*n} g = 1$ for $f \in \mathcal{G}$ implies $g = 1$. By ergodicity this is equivalent to $f \in K_1$.

(ii) Suppose that $\Phi$ has quasidiscrete spectrum. In the abstract case, this means that the subgroup $B$ of $\mathcal{G}$ consisting of all quasieigenfunctions of $\phi$ spans $L_2(X)$ (Definition 3.3). This implies in turn that the only continuous linear functional on $L_2(X)$ that annihilates $B$ is the zero functional. Let $\xi \in H = \Lambda^\perp(\tau) = A^\perp \subseteq G$. Define $l: \mathcal{G} \to C$ by

$$l(f) = \alpha(f)[\xi(\sigma^* f) - 1]. \qquad (29)$$

We shall show that $l$ vanishes on $B$ and that $l$ can be extended to a linear functional on $L_2(X)$. Let $L$ be the linear span in $L_2(X)$ of $B$. For each $f \in B$ there is exactly one $f_1 \in B$ such that $\sigma^* f = \sigma^* f_1$ and $\alpha(f_1) = 1$, namely, $f_1 = \alpha(f)^{-1}f$. Moreover, $l(f_1) = \alpha(f)^{-1}l(f)$. Thus each $f \in L$ has a representation

$$f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n, \qquad (30)$$

where the $c_j$ are complex numbers and $\sigma^* f_i \neq \sigma^* f_j$ for $i \neq j$. According to Lemma 3.1, this representation is unique, and the formula

$$l(f) = c_1 l(f_1) + c_2 l(f_2) + \cdots + c_n l(f_n) \qquad (31)$$

determines a bounded linear functional on $L$. By the Hahn-Banach theorem, $l$ has an extension to a continuous linear functional on $L_2(X)$. For $f \in B$, $\sigma^*(f) \in A$, so that $\xi(\sigma^* f) = 1$, and

$$l(f) = \alpha(f)[1 - 1] = 0.$$

It follows that $l = 0$, and from (29) that $\xi(g) = 1$ for all $g \in \hat G = \sigma^*(\mathcal{G})$. Thus $H$ is trivial, $\Lambda(\tau) = \hat G$, and $\hat\phi$ has quasidiscrete spectrum.

To complete the proof, we need to show that $\Phi$ is measure-theoretically isomorphic to $\hat\Phi = (G, \hat\phi)$, where $\hat G = A$. First, we use the lemma again to define a linear mapping $S$ from the linear span $L$ of $B$ in $L_2(X)$ to $L_2(G)$. Namely,

$$S\Bigl(\sum_{j=1}^{n} c_j f_j\Bigr)(\xi) = \sum_{j=1}^{n} c_j\, \xi(\sigma^* f_j), \qquad \xi \in G, \qquad (32)$$

where the $f_j$ are chosen to belong to the kernel of $\alpha$. That is, $f \in \ker(\alpha)$ is mapped into $\sigma^*(f)$, and $S$ is extended by linearity. Since distinct elements of $\hat G$ are clearly orthogonal in $L_2(G)$, it follows that $S$ is an invertible isometry of $L$ onto a dense subset of $L_2(G)$. Let $\bar S$ be the unique extension of $S$ to $L_2(X)$. Finally, we need to show that $\bar S = T_\rho$ for some measure-preserving map $\rho$ of $G$ onto $X$. As in Proposition 1.2, this will be proved by showing that $\bar S$ is doubly stochastic. Since $\bar S^* = \bar S^{-1}$, it is clear from (32) that $\bar S 1 = 1$ and $\bar S^* 1 = 1$. Let us show that $\bar S f \geq 0$ for all $f \geq 0$. Since $S$ restricted to $B$ is a group homomorphism into $\hat G$, it follows that

$$S(fg) = (Sf)(Sg) \qquad (33)$$


holds for all $f, g \in B$. By linearity (33) holds for all $f, g \in L$. Let $F \in \mathcal{B}$ and choose $g_n \in L$ with $g_n \to \chi_F$ in $L_2(X)$. For each $f \in L$, both $f$ and $Sf$ are bounded functions on $X$ and $G$, respectively. Therefore, $fg_n \to f\chi_F$ in $L_2(X)$ and $(Sf)(Sg_n) \to (Sf)(\bar S\chi_F)$ in $L_2(G)$. Thus $\bar S(f\chi_F) = (Sf)(\bar S\chi_F)$. Similarly, letting $f_n \to \chi_F$ gives $\bar S(\chi_F) = \bar S(\chi_F^2) = (\bar S\chi_F)^2$. Therefore, $\bar S\chi_F \geq 0$. It follows that $\bar S f \geq 0$ for all $f \geq 0$. The proof of (ii) in the abstract case is now completed as in Exercise 1.6.

Remark In the case of a totally minimal, classical dynamical system, the proof of (ii) is completed by noting that $S \geq 0$, $S1 = 1$ implies that $S$ carries bounded functions into bounded functions and has norm one as an operator from $L$ into $C(G)$. Moreover, a point transformation that carries all continuous functions on $X$ into continuous functions on $G$ must be continuous.

EXERCISES

Affine Transformations

1. Let $\tau: G \to G$ be a continuous epimorphism. Define a measure $\mu$ on $G$ by $\mu(A) = m(\tau^{-1}(A))$. Show that $\mu$ is translation invariant and hence, by uniqueness of Haar measure, must be $m$. Show by example that $\tau$ is not necessarily measure preserving if it is only assumed to be an endomorphism (not epic).

2. If $T = (t_{ij})$ is an $n \times n$ matrix with integer entries, then $\tau^*(\gamma) = \gamma T$ defines an endomorphism of $Z^n$. Show that $\tau^*$ is the adjoint of the mapping $\tau$ of Example 2. If $T$ has nonzero determinant, it determines a nonsingular transformation of $R^n$ whose restriction to $Z^n$ is $\tau^*$. Deduce that $\tau^*$ is monic and hence $\tau$ is epic. Show that the $\tau$ corresponding to $T = $ (::) is not epic.

3. (a) Show that $\sigma$ as defined in Example 3 is a bicontinuous group automorphism. (b) Show that the formulas for $\sigma^*$ and $\tau^*$ given in Examples 3 and 4 are correct.


4. Let $G_0$, $G$, $\sigma$, and $\tau$ be as in Examples 3 and 4. Let $a_0 \in G_0$ and define $a \in G$ to have for its $n$th component $\delta_{n0}a_0$. Let $H_0 \subseteq G$ be the subgroup

$$H_0 = \{x \in G : x_n = 0 \text{ for } n \leq 0\},$$

and $K_0 \subseteq G$ the affine subspace $K_0 = a + H_0$. (a) Show that $K_0$ is invariant under $\sigma$ [i.e., $\sigma(K_0) \subseteq K_0$] and hence under $\tau$. (b) Show that the restriction of $\tau$ to $K_0$ is isomorphic to the transformation $\phi$ of Example 5; that is, find an invertible affine mapping $\psi$ between the appropriate spaces such that $\psi\phi = \tau\psi$.

5. If $G$ is divisible as well as compact, then $tx$ is defined in $G$ for each $x \in G$ and each real number $t$. Suppose that $G$ is divisible and $\phi$ is affine, that is, $\phi(x) = \tau(x) + a$ for some endomorphism $\tau$ of $G$. Show that $\phi$ is affine in the sense of convex sets, that is,

$$\phi(sx + ty) = s\phi(x) + t\phi(y) \quad \text{for } x, y \in G,\ s, t \geq 0,\ s + t = 1. \qquad (34)$$

Conversely, show that if (34) holds for all $x, y \in G$ and $s = t = \tfrac{1}{2}$, then $\phi$ is affine in the former sense with $a = \phi(0)$.

6. Verify statements (i), (ii), and (iii) in Example 6.

Ergodicity

7. According to Theorem 3.2, a finite group admits no ergodic automorphisms. Can you give a simpler proof of this fact? Show that a group automorphism (of any compact group) is never minimal.

8. If $T$ is a $3 \times 3$ unimodular matrix, then the characteristic equation of $T$ is

$$P(\lambda) = \lambda^3 - s\lambda^2 + t\lambda \pm 1 = 0.$$

The roots are either all real or one real and a complex conjugate pair. Consider all cases and give a complete description of the ergodic case in terms of $s$, $t$ and the determinant. [Hint: If $\tau$ is ergodic, then $P(\lambda)$ is not factorable over $Z$. Otherwise, there would be a linear factor with integer coefficients, making $\pm 1$ an eigenvalue. If $\lambda_0$ is a root of unity satisfying $P(\lambda_0) = 0$, then the minimal polynomial of $\lambda_0$ divides (hence equals) $P(\lambda)$. What are the minimal polynomials of degree 3 of roots of unity?]

9. (a) If $\sigma$ is defined as in Example 3, then $\sigma^{*k}(\gamma) = \gamma$ means $\gamma_{n+k} = \gamma_n$ for all $n$. Use the fact that only finitely many of the $\gamma_n$ are nonzero to conclude that $\gamma = 0$ and $\sigma$ is ergodic.


(b) Let $\tau$ be as in Example 4. Then $\tau^{*k}(\gamma) = \xi$, where $\xi_n = \sum_{j=0}^{k} \binom{k}{j}\gamma_{n+j}$. Assume that $\tau^{*k}(\gamma) = \tau^{*l}(\gamma)$ for some $0 \leq k < l$ and $\gamma \neq 0$. Let $p$ be the smallest integer with $\gamma_p \neq 0$ and set $n + l = p$ to reach a contradiction. Thus $\tau$ is ergodic.

10. Let $\phi_0$ be a totally ergodic measure-preserving transformation of $[0, 1]$. Define $\phi$ on $[0, 1] \cup [2, 3]$ by

$$\phi(x) = \phi_0(x) + 2 \quad (0 \leq x \leq 1), \qquad \phi(x) = \phi_0(x - 2) \quad (2 \leq x \leq 3).$$

Show that $\phi$ is ergodic, but $\phi^2$ is not. Construct a similar example of an affine transformation on $G = G_0 \oplus \{0, 1\}$. Why does this not violate Theorem 3.3?

11. In Example 5 we know that $\phi$ is semiergodic if $a_0$ separates the points of $\hat G_0$. Let $G_0 = \{0, 1\} = \hat G_0$, $a_0 = 1$, and $\gamma = (1, 1, 0, 0, 0, \ldots)$. Addition, of course, is modulo 2. Since $\tau^{*2}\gamma = \gamma$, but $\tau^*\gamma \neq \gamma$, condition (iv) of Theorem 3.3 is not satisfied. Show that $T_\phi^4\gamma = \gamma$, so that $\phi$ is not totally ergodic.

12. In Example 5b, show that (iii) of Theorem 3.3 is violated by calculating $A_n(\gamma)$. Show further that

f(x)

=

[Y(X)12 +

= exp[ni( lox,

Y(X)12

+ 12x2 + 8x3)/8] + exp[d( 10 + 6x, + 4x2 + 8x3)/8]

is a nontrivial invariant function for Tb. 13. Let 4, z, g, a and G be as in Theorem 3.3, except we do not assume that G is torsion-free. (a) The following implications among the conditions in Theorem 3.3 are valid: (iii) * (ii)

U (iv) 3 (i)

0 (v) [Hint. Show that (iv) * (i) and then that (iii) for 4 implies (iv) for (b) If 4 is ergodic, then (iii') z*"(y) = y, A,(y) = 1 * ny = 0, and (iv') t * " ( y ) = y* z*(ny) = ny, and 4 is semiergodic.

4'.]


[Hint. To prove (iii'), define the invariant function n- 1

n- 1

and deduce that n = 1 and y = 0, or there exists a smallest integer I,, 0 < I , < n, such that t*"(y) = y. Moreover, 1, divides n, say n = llml. Inductively, define integers l,, m, with n = l,m,, T*'g(m,_ ,y) = mP- ,y, t*'(m,- l y ) # mp- ,y for 0 i < I, and functionsf,,

-=

1.-

n- 1

1

and such that $0 < l_{p+1} < l_p$ or $l_p = 1$. Conclude that, for some $p$, we have $l_p = 1$ and $m_p\gamma = n\gamma = 0$.]

14. Let $G_0 = \{0, 1\}$ with addition modulo 2, and $G = G_0 \oplus G_0 \oplus G_0$. Define $\tau$ on $G$ by

$$\tau(x_1, x_2, x_3) = (x_1, x_1 + x_2, x_2 + x_3),$$

and let $a = (1, 0, 0)$. Define $\phi$ as usual. (a) The orbit of $a$ under $\phi$ is a proper subset of $G$. Hence $\phi$ is not ergodic. (b) If $\gamma = (0, 1, 0) \in \hat{G}$, then $\tau^{*2}(\gamma) = \gamma$, but $\tau^{*}(\gamma) \ne \gamma$ and $\lambda_2(\gamma) = -1$. Moreover, $\tau^{*4}(\gamma) = \gamma$ and $\lambda_4(\gamma) = 1$. (c) Conditions (iii') and (iv') above are satisfied. (d) The mapping $\psi: G \to G_0 \oplus G_0$ defined by $\psi(x_1, x_2, x_3) = (x_1, x_2)$ induces a factor $\hat{\phi}$ of $\phi$ on $G_0 \oplus G_0$. $\hat{\phi}$ is ergodic, but does not satisfy (iii) or (iv) of Theorem 3.3.

15. In Example 5 assume that $a_0$ separates the points of $\hat{G}_0$, for example, if $a_0$ is a topological generator of $G_0$. Prove that $O_\phi(0) = O_\phi(a)$ separates the points of $\hat{G}$. Is it dense in $G$?

Minimality

16. Suppose that $\tau$ is an ergodic automorphism and $\phi(x) = \tau(x) + a = \sigma(x) + x + a$. Show that (a) $\sigma^{*}$ is monic and $\sigma$ is epic; (b) $\phi$ is affinely isomorphic to $\tau$ under the map $\psi(x) = x + b$, where $\sigma(b) = -a$; (c) $\phi$ is not minimal.

17. (a) Let $\phi_0$ be a minimal homeomorphism of $[0, 1]$. Define $\phi$ on $[0, 1] \cup [2, 3]$ as in Exercise 10. Show that $\phi$ is minimal, but $\phi^2$ is not.


(b) If $X$ is a connected, compact metric space, and if $\phi: X \to X$ is a minimal homeomorphism, then $\phi$ is totally minimal.

18. Deduce from Exercise 7, Chapter II, that a uniquely ergodic affine transformation is minimal.

Quasidiscrete and Quasiperiodic Spectrum 19. Prove Proposition 3.3. 20. For any affine 4, show that rl consists of all those elements of which are eigenfunctions of T, . 21. (a) Show that :T = o"(G),and hence that l

1'

m

=

=

G

0o"(G). n=O

n:=o

(a) Conclude that #J has quasidiscrete spectrum iff an(G)= (0). 22. Show that the affine transformation 4 maps the group generated by a and o(G) into itself, and hence that a minimal #J is semiergodic. Show that the converse is false even for transformations with quasidiscrete spectrum. 23. (a) Let 0 = (X,4 ) be a classical dynamical system, and denote by 3 = 9 ( X ) the group of complex-valued, continuous functions of constant absolute value one. Show that a ( f ) = f ( x o ) is a homomorphism such as is required in the proof of Theorem 3.4. (b) If 0 is either a classical or abstract dynamical system and if 9 and a are defined appropriately, we shall say that the pair (0,a ) is semiergodic if

f E 3, 6(f)= f, a ( f ) = 1* f = 1. Show that Theorems 3.4 and 3.5 remain valid for semiergodic systems. 24. Let #J be an ergodic affine transformation. (a) Let f E B , be an arbitrary eigenfunction of T,, say T, f = Af. Expand f i n a (generalized) Fourier series

f=

c (flY)Y

;'€

G

and show as in the proof of Theorem 3.3 that

f = r *1 (f,Y)Y y=y y(a) = Y


Deduce that there is only one such y for each 1, and hence that A , E rl,B , = f C Y : C E K , Y E r1jz K O rl. (b) Show that A, z rnand B,, z K O r,. 25. Verify the details of the proof of Theorem 3.7. 26. (a) Let t be an automorphism of G, and let 1 be an eigenvalue of T, . Show that 1 is a kth root of unity for some k, and the eigenfunctions associated with 1 are of the form

f=

c (f?Y)Y.

r*ky=y

(b) The linear space in L 2 ( X ) spanned by A 1 includes all the eigenfunctions of ?; . (c) If 4(x) = T ( X ) + a, the eigenfunctions of T@ corresponding to the eigenvalue I are all of the form

Ergodic Automorphisms

27. Show that Proposition 3.7 remains true if $\tau$ is only assumed to be a continuous affine transformation of the compact metric group $G$.

28. Use the facts that the annihilator of $O_\tau(a)$ and the annihilator of the group $\operatorname{Gp} O_\tau(a)$ generated by $O_\tau(a)$ are the same and that $H^{\perp\perp} = \bar{H}$, the closure of $H$, for any group $H \subseteq G$ to prove Proposition 3.8.

29. (a) If $O_\tau(a)$ separates the points of $\hat{G}$, show that $\phi(x) = \tau(x) + a$ is semiergodic. (b) Show that the $\phi$ of Exercise 14 is semiergodic.

30. (a) Let us say that the system $(G, \tau)$, where $\tau$ is a continuous automorphism, is weakly topologically ergodic if $A \subseteq G$, $\tau(A) \subseteq A$ implies $A$ is dense or $A$ is nowhere dense in $G$. If $G$ is metrizable and $(G, \tau)$ is weakly topologically ergodic, show that $(G, \tau)$ is monothetic. (b) If $(G, I)$ is weakly topologically ergodic, then $(G, \tau)$ is weakly topologically ergodic for any $\tau$. (c) $(K^n, I)$ and $(K^\infty, I)$ are weakly topologically ergodic. Hence $(K^n, \tau)$ and $(K^\infty, \tau)$ are monothetic for any $\tau$.

31. If $G$ is a monothetic group, show that $(G, \tau)$ is monothetic for any automorphism $\tau$. Which of the groups $K^n$, $K^\infty$, $K_d$, $K_d^n$, $K_d^\infty$ are monothetic? Which are algebraically monothetic?


32. Algebraically monothetic groups. Let T ( H )denote the torsion subgroup of the group H . (a) Show by transfinite induction on the elements of infinite order that any monomorphism of T ( H ) onto a shift-invariant subgroup of K," can be extended to a monomorphism of H onto a shift-invariant subgroup of KdW,provided that the cardinality of H does not exceed the cardinality c of the continuum.

(b) Theorem A compact abelian group G is algebraically monothetic iff the cardinality of 6 is no greater than c and T ( 6 ) is isomorphic to a shiftinvariant subgroup of K,". (c) Every separable (and hence every metrizable) connected compact abelian group is algebraically monothetic (in fact monothetic).

Note The next two exercises require some knowledge of abstract group theory. See, for example, Kaplansky [38]. In addition, the theorem in Exercise 32(b) is used.

Z(pkl),where p is prime and 1 Ik , 33. (a) If 6 = @,S, k, IGO, and if z* is defined on 6 by 7*(t1, ..., t,)= ( t i , ti

+ t i r t2 + t 3 , ...,

ts-l

Ik , I... I

+ t,),

then (G, 7 ) is monothetic. (b) If 6 is finitely generated, then G is algebraically monothetic. 34. If G is separable and 6 is divisible, then G is algebraically monothetic. Quasidiscrete Spectrum II

35. Show that a totally ergodic abstract or totally minimal classical dynamical system can have no eigenvalues $\lambda \ne 1$ with $\lambda^n = 1$.

36. Complete the proof of Theorem 3.9(ii) for totally minimal classical systems.

CHAPTER IV

Entropy

1. CONDITIONAL EXPECTATION AND KOLMOGOROV ENTROPY

In Chapter I we introduced the notion of equivalence of two abstract dynamical systems $\Phi_1 = (X_1, \mathcal{B}_1, \mu_1, \phi_1)$ and $\Phi_2 = (X_2, \mathcal{B}_2, \mu_2, \phi_2)$. Thus $\Phi_1$ and $\Phi_2$ are equivalent if there exists an isomorphism $\psi^{*}: \mathcal{B}_2 \to \mathcal{B}_1$ of the measure algebras $\mathcal{B}_2$ and $\mathcal{B}_1$ satisfying $\phi_1^{-1}\psi^{*} = \psi^{*}\phi_2^{-1}$ modulo sets of measure zero (Definition 1.2). In most cases, we may assume that $\psi^{*} = \psi^{-1}$ (mod 0), where $\psi: X_1 \to X_2$ is an invertible measure-preserving transformation. In order to distinguish this notion from other types of equivalence or isomorphism for dynamical systems, let us agree to write $\Phi_1 \overset{m}{\cong} \Phi_2$ when $\Phi_1$ and $\Phi_2$ are equivalent in the above sense. We shall say that $\Phi_1$ and $\Phi_2$ are m-isomorphic.

We turn now to a study of the “isomorphism problem” for abstract dynamical systems. When are two dynamical systems m-isomorphic? A complete solution of this problem should include two things: a useful set of criteria for determining when two dynamical systems are isomorphic and the concrete realization of a representative system from each isomorphism class. For the subclass of dynamical systems with quasidiscrete spectrum, Theorems 3.9(ii) and 3.5 provide such a solution. Two systems


are isomorphic iff they have “equivalent” sets of quasieigenvalues, and 6 can be realized as a factor of 6. On the other hand, the general problem is far from solution. A big step in the solution of this problem was made in 1958 when Kolmogorov [41] introduced the definition (later modified slightly by Sinai [55]) of the entropy of an abstract dynamical system and showed that it was an isomorphism invariant. The next big step occurred in 1970 when D. S. Ornstein showed that entropy is a “complete” invariant for invertible Bernoulli shifts (Example 3 of Chapter I). That is, two Bernoulli shifts are m-isomorphic iff they have the same entropy. We shall discuss this result and related matters in Chapter V.

Other isomorphism invariants need to be mentioned at this time. For example, ergodicity and mixing, as well as invertibility, are easily seen to be preserved by m-isomorphism. Associated with each abstract dynamical system $\Phi$ we have a linear operator $T_\phi$ on the Hilbert space $L_2(X)$ and an affine transformation $\hat{\phi}$ on the compact group $G = G(\phi)$ defined in Section III.6. It is easily established (Exercise 1) that $\Phi_1 \overset{m}{\cong} \Phi_2$ implies that $T_{\phi_1}$ and $T_{\phi_2}$ are isomorphic as operators on Hilbert space, and that $\hat{\phi}_1$ and $\hat{\phi}_2$ are isomorphic as affine transformations. Thus the adjunct transformations $T_\phi$ and $\hat{\phi}$ provide an abundance of isomorphism invariants of the dynamical system $\Phi$. As shown in Section I.3, all Bernoulli shifts have countable Lebesgue spectrum. It follows that they are strongly mixing and induce isomorphic operators $T_\phi$. The first important result using Kolmogorov entropy was that, for example, the shift on two points is not m-isomorphic to the shift on three points.

In order to define the Kolmogorov entropy $h(\Phi)$ of $\Phi$, we must digress briefly to discuss conditional expectation operators on $L_2(X)$.

Definition 4.1 Let $(X, \mathcal{B}, \mu)$ be a normalized measure space ($\mu(X) = 1$), and let $\mathcal{C} \subseteq \mathcal{B}$ be a sub-$\sigma$-algebra.
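Let $E$ denote the embedding operator $Ef = f$ of $L_2(X, \mathcal{C}, \mu)$ into $L_2(X, \mathcal{B}, \mu)$, and let $E^{*}$ be the adjoint of $E$. The conditional expectation operator $\mathbb{E}_{\mathcal{C}}$ is defined by

$$\mathbb{E}_{\mathcal{C}} = EE^{*}. \tag{1}$$

The operator $\mathbb{E}_{\mathcal{C}}$ acts on $L_2(X, \mathcal{B}, \mu)$ and leaves invariant the subspace $L_2(X, \mathcal{C}, \mu)$. For each $f \in L_2(X, \mathcal{B}, \mu)$, we say that $\mathbb{E}_{\mathcal{C}} f$ is the conditional expectation of $f$ given $\mathcal{C}$.

Proposition 4.1 The operator $\mathbb{E}_{\mathcal{C}}$ is doubly stochastic and self-adjoint.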

Proof Obvious.
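On a finite measure space the operator of Definition 4.1 reduces to averaging over the atoms of the sub-algebra, which makes Proposition 4.1 easy to check by hand. The sketch below is only an illustration under that assumption: the four-point space, the weights `mu`, the partition `labels`, and the helper names `cond_exp` and `inner` are all hypothetical choices, not notation from the text.

```python
from fractions import Fraction

def cond_exp(f, labels, mu):
    """Average f over each atom of the partition encoded by labels (weights mu)."""
    mass, weighted = {}, {}
    for x, lab in enumerate(labels):
        mass[lab] = mass.get(lab, 0) + mu[x]
        weighted[lab] = weighted.get(lab, 0) + mu[x] * f[x]
    return [weighted[lab] / mass[lab] for lab in labels]

def inner(f, g, mu):
    """Inner product (f, g) in L2 of the finite space (real-valued case)."""
    return sum(m * a * b for m, a, b in zip(mu, f, g))

# Hypothetical four-point probability space with two atoms {0, 1} and {2, 3}.
mu = [Fraction(k, 10) for k in (1, 2, 3, 4)]
labels = [0, 0, 1, 1]
f = [Fraction(5), Fraction(-1), Fraction(2), Fraction(7)]
g = [Fraction(1), Fraction(3), Fraction(0), Fraction(-2)]
one = [Fraction(1)] * 4

Ef, Eg = cond_exp(f, labels, mu), cond_exp(g, labels, mu)

idempotent = cond_exp(Ef, labels, mu) == Ef            # E is a projection
self_adjoint = inner(Ef, g, mu) == inner(f, Eg, mu)    # (Ef, g) = (f, Eg)
fixes_constants = cond_exp(one, labels, mu) == one     # E1 = 1
preserves_integral = inner(Ef, one, mu) == inner(f, one, mu)
```

Exact rational arithmetic is used so that the identities can be compared with `==` rather than with a floating-point tolerance.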

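Some other useful properties of $\mathbb{E}_{\mathcal{C}}$ are summarized in the following proposition. Note that, according to Proposition 4.1, $\mathbb{E}_{\mathcal{C}}$ may be assumed to act also on $L_1(X, \mathcal{B}, \mu)$.

Proposition 4.2 For all functions $f, g \in L_1(X, \mathcal{B}, \mu)$ and all $\sigma$-algebras $\mathcal{C}, \mathcal{C}_1, \mathcal{C}_2 \subseteq \mathcal{B}$ the following are valid:

(i) $\mathbb{E}_{\mathcal{C}}^2 = \mathbb{E}_{\mathcal{C}}$.
(ii) If $f$ is $\mathcal{C}$-measurable, and $f \cdot g \in L_1$, then
$$\mathbb{E}_{\mathcal{C}}(f \cdot g) = f \cdot \mathbb{E}_{\mathcal{C}} g. \tag{2}$$
(iii) $\mathcal{C}_1 \subseteq \mathcal{C}_2 \Rightarrow \mathbb{E}_{\mathcal{C}_1}\mathbb{E}_{\mathcal{C}_2} = \mathbb{E}_{\mathcal{C}_2}\mathbb{E}_{\mathcal{C}_1} = \mathbb{E}_{\mathcal{C}_1}$.
(iv) $\mathbb{E}_{\mathcal{B}} f = f$.
(v) $\mathbb{E}_{\mathcal{N}} f = \int_X f\, d\mu$, where $\mathcal{N} = \{\emptyset, X\}$.
(vi) $\int_C \mathbb{E}_{\mathcal{C}} f\, d\mu = \int_C f\, d\mu$ for all $C \in \mathcal{C}$.
(vii) $\mathbb{E}_{\mathcal{C}} f$ is $\mathcal{C}$-measurable.
(viii) If $g$ has properties (vi) and (vii) of $\mathbb{E}_{\mathcal{C}} f$, then $g = \mathbb{E}_{\mathcal{C}} f$ $\mu$-almost everywhere.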
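Properties (ii) and (iii) of Proposition 4.2 can be checked numerically in the finite setting, where conditional expectation is averaging over atoms. The following is a minimal sketch under that assumption; the eight-point space, the nested partitions `coarse` and `fine`, and the helper `cond_exp` are illustrative names introduced here.

```python
from fractions import Fraction

def cond_exp(f, labels, mu):
    """Average f over each atom of the partition encoded by labels (weights mu)."""
    mass, weighted = {}, {}
    for x, lab in enumerate(labels):
        mass[lab] = mass.get(lab, 0) + mu[x]
        weighted[lab] = weighted.get(lab, 0) + mu[x] * f[x]
    return [weighted[lab] / mass[lab] for lab in labels]

# Hypothetical eight-point space; `coarse` generates a sub-algebra of `fine`.
mu = [Fraction(k, 36) for k in range(1, 9)]
fine = [0, 0, 1, 1, 2, 2, 3, 3]
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
g = [Fraction(v) for v in (3, -1, 4, 1, -5, 9, 2, 6)]

E_coarse_g = cond_exp(g, coarse, mu)

# Property (iii): with C1 = coarse contained in C2 = fine, both orders give E_{C1}.
tower_down = cond_exp(cond_exp(g, fine, mu), coarse, mu) == E_coarse_g
tower_up = cond_exp(E_coarse_g, fine, mu) == E_coarse_g

# Property (ii): a coarse-measurable factor f can be pulled out of E_{C1}.
f = [Fraction(2) if c == 0 else Fraction(-3) for c in coarse]
fg = [a * b for a, b in zip(f, g)]
pull_out = cond_exp(fg, coarse, mu) == [a * b for a, b in zip(f, E_coarse_g)]
```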

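Proof Property (i) follows from the equality $E^{*}E = I$, the identity on $L_2(X, \mathcal{C}, \mu)$, which is equivalent to $(Ef, Ef) = \int_X |f|^2\, d\mu = (f, f)$. In (ii) we assume $f$ is $\mathcal{C}$-measurable, so that as functions $f = Ef$. Thus (2) is equivalent to

$$\mathbb{E}_{\mathcal{C}}(Ef \cdot g) = Ef \cdot \mathbb{E}_{\mathcal{C}} g \tag{2'}$$

for $f \in L_2(X, \mathcal{C}, \mu)$. Let $h \in L_2(X, \mathcal{B}, \mu)$. Then, since $E$ is multiplicative and $\mathbb{E}_{\mathcal{C}}$ is self-adjoint,

$$(\mathbb{E}_{\mathcal{C}}(Ef \cdot g), h) = (Ef \cdot g, \mathbb{E}_{\mathcal{C}} h) = (g, \overline{Ef} \cdot EE^{*}h) = (g, E(\bar{f} \cdot E^{*}h)) = (E^{*}g, \bar{f} \cdot E^{*}h) = (f \cdot E^{*}g, E^{*}h) = (Ef \cdot EE^{*}g, h) = (Ef \cdot \mathbb{E}_{\mathcal{C}} g, h),$$

from which follows (2).

(iii) Let $E_i$ $(i = 1, 2)$ be the embedding operator of $L_2(X, \mathcal{C}_i, \mu)$ into $L_2(X, \mathcal{B}, \mu)$, and let $E_0$ be the embedding of $L_2(X, \mathcal{C}_1, \mu)$ into $L_2(X, \mathcal{C}_2, \mu)$. Then $E_1 = E_2E_0$, and so

$$\mathbb{E}_{\mathcal{C}_1} = E_1E_1^{*} = E_2E_0E_0^{*}E_2^{*}.$$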


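Thus

$$\mathbb{E}_{\mathcal{C}_1}\mathbb{E}_{\mathcal{C}_2} = E_2E_0E_0^{*}(E_2^{*}E_2)E_2^{*} = \mathbb{E}_{\mathcal{C}_1}, \qquad \mathbb{E}_{\mathcal{C}_2}\mathbb{E}_{\mathcal{C}_1} = E_2(E_2^{*}E_2)E_0E_0^{*}E_2^{*} = \mathbb{E}_{\mathcal{C}_1},$$

since $E_2^{*}E_2 = I$. (iv) This is trivial, since $E = I$. (v) For $\mathcal{C} = \mathcal{N}$, $E^{*}f$ is a constant. Since $E1 = 1$, that constant must be $\int f\, d\mu$. The last three statements are proved in the exercises at the end of this chapter.

Certain properties of the conditional expectation needed in the sequel depend only on the fact that $\mathbb{E}_{\mathcal{C}}$ is doubly stochastic. Two such properties are given in the following propositions.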

Proposition 4.3 Suppose $f_n, g \in L_p(X, \mathcal{B}, \mu)$, $|f_n| \le g$ and $f_n \to f$ a.e. Then, for each doubly stochastic operator $T$, $Tf_n \to Tf$ a.e. and in the mean of order $p$ $(1 \le p < \infty)$.

Proof The $L_p$ convergence follows immediately from the dominated convergence theorem and the inequality $\|Tf_n - Tf\|_p \le \|f_n - f\|_p$. Almost everywhere convergence for a monotone sequence of functions follows from positivity of the operator $T$, the equality $\int Tf\, d\mu = \int f\, d\mu$, and the monotone convergence theorem. Dominated convergence then follows from monotone convergence in the usual way.

A real-valued function $F$ of a real variable is said to be convex if

$$F\Big(\sum_k t_k x_k\Big) \le \sum_k t_k F(x_k) \tag{3}$$

for all choices of $x_k$ in the domain of $F$ and all $t_k \ge 0$ such that $\sum_k t_k = 1$.

Proposition 4.4 (Jensen’s inequality) Let $F: [a, b] \to R$ be a continuous, convex function, where $0 \le a < b < \infty$ and $R$ denotes the real numbers. Let $T$ be a doubly stochastic operator. Then

$$F \circ Tf \le T(F \circ f) \tag{4}$$

for each $f \in L_1(X, \mathcal{B}, \mu)$ with $f(X) \subseteq [a, b]$.


Proof Suppose first that $f$ is a simple function:

$$f = \sum_{k=1}^{n} c_k \chi_{B_k},$$

where the $B_k \in \mathcal{B}$ are pairwise disjoint with union $X$. Then $\sum_{k=1}^{n} T\chi_{B_k} = T1 = 1$, and

$$F \circ f = \sum_{k=1}^{n} F(c_k)\chi_{B_k},$$

so that by (3)

$$T(F \circ f) = \sum_{k=1}^{n} F(c_k)\, T\chi_{B_k} \ge F\Big(\sum_{k=1}^{n} c_k\, T\chi_{B_k}\Big) = F \circ Tf \quad \text{a.e.}$$

In the general case, let $f_n$ $(n = 1, 2, \ldots)$ be simple functions with $0 \le f_n \uparrow f$. Then $Tf_n \uparrow Tf$ a.e. (by Proposition 4.3), so that $F \circ Tf_n \to F \circ Tf$ a.e. (by continuity of $F$). Moreover, $F \circ f_n \to F \circ f$ and $|F \circ f_n| \le \max_{a \le x \le b} |F(x)| = c$. Again using Proposition 4.3, $T(F \circ f_n) \to T(F \circ f)$ a.e. Since (4) holds for each $f_n$, it also holds in the limit.

Remark We shall only need Proposition 4.4 in the case $T = \mathbb{E}_{\mathcal{C}}$ and

$$F(t) = \begin{cases} t \log t & (0 < t \le 1), \\ 0 & (t = 0). \end{cases} \tag{5}$$
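As a sanity check of Jensen's inequality in the only case used later, one can take $T$ to be averaging over the atoms of a finite partition and $F(t) = t \log t$. The sketch below assumes that finite setting; the space, the weights, the test function `f`, and the helper names are hypothetical.

```python
import math

def F(t):
    """The convex function t log t on [0, 1], with F(0) = 0."""
    return 0.0 if t == 0 else t * math.log(t)

def cond_exp(f, labels, mu):
    """Average f over each atom of the partition encoded by labels (weights mu)."""
    mass, weighted = {}, {}
    for x, lab in enumerate(labels):
        mass[lab] = mass.get(lab, 0.0) + mu[x]
        weighted[lab] = weighted.get(lab, 0.0) + mu[x] * f[x]
    return [weighted[lab] / mass[lab] for lab in labels]

# Hypothetical four-point space and a function with values in [0, 1].
mu = [0.1, 0.2, 0.3, 0.4]
labels = [0, 0, 1, 1]
f = [0.0, 0.9, 0.25, 0.5]

lhs = [F(v) for v in cond_exp(f, labels, mu)]       # F o Tf
rhs = cond_exp([F(v) for v in f], labels, mu)       # T(F o f)
jensen_holds = all(a <= b + 1e-12 for a, b in zip(lhs, rhs))
```

The small tolerance only guards against floating-point rounding; the inequality itself is strict at every point of this example.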

The function F is used in the Kolmogorov definition of entropy as follows.

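Definition 4.2 Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be a dynamical system. For each finite algebra $\mathcal{A} \subseteq \mathcal{B}$, let $\mathcal{A}^{*}$ denote the collection of atoms of $\mathcal{A}$ (sets in $\mathcal{A}$ having positive measure but no $\mathcal{A}$-measurable subsets of positive measure). The entropy of $\mathcal{A}$ is

$$H(\mathcal{A}) = -\sum_{A \in \mathcal{A}^{*}} F(\mu(A)) = -\sum_{A \in \mathcal{A}^{*}} \mu(A) \log \mu(A). \tag{6}$$

The entropy of $\phi$ on $\mathcal{A}$ is

$$h(\phi, \mathcal{A}) = \limsup_{n \to \infty} \frac{1}{n} H\Big(\bigvee_{i=0}^{n-1} \phi^{-i}\mathcal{A}\Big), \tag{7}$$

and the entropy of $\Phi$ is

$$h(\Phi) = \sup_{\mathcal{A}} h(\phi, \mathcal{A}). \tag{8}$$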
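The entropy of a finite algebra depends only on the measures of its atoms. A minimal sketch of that computation, assuming the atoms are given as a list of masses (the function name `entropy` and the sample values are illustrative, not from the text):

```python
import math

def entropy(atom_masses):
    """H = -sum mu(A) log mu(A) over atoms, with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in atom_masses if p > 0)

H_trivial = entropy([1.0])          # trivial algebra {empty set, X}
H_fair = entropy([0.5, 0.5])        # two atoms of equal measure
H_skewed = entropy([0.9, 0.1])      # two atoms of unequal measure
```

Among algebras with two atoms, the one with equal masses has the largest entropy, namely $\log 2$.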


The sup in (8) is over all finite algebras $\mathcal{A} \subseteq \mathcal{B}$. The algebra $\bigvee_{i=0}^{n-1} \phi^{-i}\mathcal{A}$ in (7) is the smallest algebra containing each of the algebras

$$\phi^{-i}\mathcal{A} = \{\phi^{-i}(A) : A \in \mathcal{A}\}. \tag{9}$$

In the following sections we shall show that $h(\Phi)$ is an m-isomorphism invariant, that $h(\phi, \mathcal{A})$ is finite for each finite algebra $\mathcal{A}$, that the $\limsup$ in (7) is actually a lim, and that under certain conditions the sup in (8) is actually attained, so that $h(\Phi) < \infty$.

2. THE INFORMATION FUNCTION AND FINITENESS OF $h(\phi, \mathcal{A})$

In order to study the function $H(\mathcal{A})$ defined by Eq. (6), we introduce a related function, called the information function. Thus for each finite algebra $\mathcal{A}$ and each $x \in X$ we define $I(\mathcal{A}, x)$ by

$$I(\mathcal{A}, x) = -\sum_{A \in \mathcal{A}^{*}} \chi_A(x) \log \mu(A), \tag{10}$$

so that

$$H(\mathcal{A}) = \int_X I(\mathcal{A}, x)\, \mu(dx). \tag{11}$$

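Similarly, we define the conditional information function and conditional entropy of $\mathcal{A}$ given $\mathcal{C}$ by

$$I(\mathcal{A}|\mathcal{C}, x) = -\sum_{A \in \mathcal{A}^{*}} \chi_A(x) \log \mathbb{E}_{\mathcal{C}}\chi_A(x), \tag{12}$$

and

$$H(\mathcal{A}|\mathcal{C}) = \int_X I(\mathcal{A}|\mathcal{C}, x)\, \mu(dx). \tag{13}$$

Note that $\mathbb{E}_{\mathcal{C}}\chi_A(x) = \mu(A)$ if $\mathcal{C} = \mathcal{N}$, so that $I(\mathcal{A}|\mathcal{N}, x) = I(\mathcal{A}, x)$ and $H(\mathcal{A}|\mathcal{N}) = H(\mathcal{A})$. We may think of $\mathbb{E}_{\mathcal{C}}\chi_A$ as the conditional measure of $A$ given $\mathcal{C}$. Let $f \in L_1(X, \mathcal{B}, \mu)$. Then $T_\phi\mathbb{E}_{\mathcal{C}} f \in L_1(X, \phi^{-1}\mathcal{C}, \mu)$. Moreover, for any $B \in \mathcal{C}$ we have

$$\int_{\phi^{-1}B} T_\phi\mathbb{E}_{\mathcal{C}} f\, d\mu = \int_B \mathbb{E}_{\mathcal{C}} f\, d\mu = \int_B f\, d\mu = \int_{\phi^{-1}B} T_\phi f\, d\mu.$$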


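According to Proposition 4.2(viii),

$$T_\phi\mathbb{E}_{\mathcal{C}} f = \mathbb{E}_{\phi^{-1}\mathcal{C}}\, T_\phi f. \tag{14}$$

Letting $f = \chi_A$ in (14) yields

$$\mathbb{E}_{\phi^{-1}\mathcal{C}}\chi_{\phi^{-1}A}(x) = \mathbb{E}_{\mathcal{C}}\chi_A(\phi x).$$

Hence

$$I(\phi^{-1}\mathcal{A}|\phi^{-1}\mathcal{C}, x) = I(\mathcal{A}|\mathcal{C}, \phi x). \tag{15}$$

Now let $\mathcal{A}_1$, $\mathcal{A}_2$ be finite subalgebras of $\mathcal{B}$, and let $\mathcal{C} \subseteq \mathcal{B}$ be an arbitrary $\sigma$-algebra. The join of $\mathcal{A}_1$ and $\mathcal{A}_2$, denoted by $\mathcal{A}_1 \vee \mathcal{A}_2$, is the smallest $\sigma$-algebra containing $\mathcal{A}_1$ and $\mathcal{A}_2$. Its atoms are exactly the intersections $A \cap B$ of positive measure with $A \in \mathcal{A}_1^{*}$ and $B \in \mathcal{A}_2^{*}$. We shall show that

$$I(\mathcal{A}_1 \vee \mathcal{A}_2|\mathcal{C}, x) = I(\mathcal{A}_1|\mathcal{C}, x) + I(\mathcal{A}_2|\mathcal{A}_1 \vee \mathcal{C}, x). \tag{16}$$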

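Suppose $A \in \mathcal{A}_1$, $B \in \mathcal{A}_2$, $C \in \mathcal{C}$. Then $A$ is the disjoint union of those atoms $F$ of $\mathcal{A}_1$ such that $A \cap F \ne \emptyset$. From this and parts (ii) and (vi) of Proposition 4.2, we see that the integral

$$\int_{A \cap C} \sum_{F \in \mathcal{A}_1^{*}} \chi_F\, \frac{\mathbb{E}_{\mathcal{C}}\chi_{B \cap F}}{\mathbb{E}_{\mathcal{C}}\chi_F}\, d\mu,$$

where the fraction is taken to be zero when $\mathbb{E}_{\mathcal{C}}\chi_F(x) = 0$, is equal to

$$\sum_{F \in \mathcal{A}_1^{*},\ A \cap F \ne \emptyset} \int_C \mathbb{E}_{\mathcal{C}}\chi_{B \cap F}\, d\mu = \sum_{F \in \mathcal{A}_1^{*},\ A \cap F \ne \emptyset} \mu(B \cap F \cap C) = \mu(B \cap A \cap C).$$

Since finite unions of intersections of the form $A \cap C$ are dense in the $\sigma$-algebra $\mathcal{A}_1 \vee \mathcal{C}$, it follows that for any $D \in \mathcal{A}_1 \vee \mathcal{C}$ we have

$$\int_D \sum_{F \in \mathcal{A}_1^{*}} \chi_F\, \frac{\mathbb{E}_{\mathcal{C}}\chi_{B \cap F}}{\mathbb{E}_{\mathcal{C}}\chi_F}\, d\mu = \mu(B \cap D).$$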


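From Proposition 4.2(viii), it follows that

$$\mathbb{E}_{\mathcal{A}_1 \vee \mathcal{C}}\chi_B = \sum_{F \in \mathcal{A}_1^{*}} \chi_F\, \frac{\mathbb{E}_{\mathcal{C}}\chi_{B \cap F}}{\mathbb{E}_{\mathcal{C}}\chi_F} \tag{17}$$

for each $B \in \mathcal{A}_2$. Now let us evaluate the left side of Eq. (16), making use of (17). From (12) we have

$$\begin{aligned} I(\mathcal{A}_1 \vee \mathcal{A}_2|\mathcal{C}, x) &= -\sum_{A \in \mathcal{A}_1^{*}} \sum_{B \in \mathcal{A}_2^{*}} \chi_{A \cap B}(x) \log \mathbb{E}_{\mathcal{C}}\chi_{A \cap B}(x) \\ &= I(\mathcal{A}_1|\mathcal{C}, x) - \sum_{B \in \mathcal{A}_2^{*}} \chi_B(x) \log \mathbb{E}_{\mathcal{A}_1 \vee \mathcal{C}}\chi_B(x) \\ &= I(\mathcal{A}_1|\mathcal{C}, x) + I(\mathcal{A}_2|\mathcal{A}_1 \vee \mathcal{C}, x), \end{aligned}$$

as asserted.

Equation (16) expresses a type of additivity for the information function. Let us see how this extends to finite joins of finite algebras and integrate to get a formula for the entropy $H(\bigvee_{i=0}^{n-1} \phi^{-i}\mathcal{A})$ occurring in (7). Let $\mathcal{N} = \mathcal{A}_0, \mathcal{A}_1, \ldots, \mathcal{A}_{n+1} \subseteq \mathcal{B}$ be finite algebras. Setting $\mathcal{C} = \mathcal{N}$ in (16) gives

$$I\Big(\bigvee_{i \le k} \mathcal{A}_i\Big) = I\Big(\bigvee_{i < k} \mathcal{A}_i\Big) + I\Big(\mathcal{A}_k \Big| \bigvee_{i < k} \mathcal{A}_i\Big)$$

for each $k = 1, 2, \ldots, n + 1$. Noting that $I(\mathcal{A}_0) = I(\mathcal{N}) = 0$, we obtain by adding on $k$

$$I\Big(\bigvee_{k=0}^{n+1} \mathcal{A}_k\Big) = \sum_{k=1}^{n+1} I\Big(\mathcal{A}_k \Big| \bigvee_{i < k} \mathcal{A}_i\Big). \tag{18}$$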
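The additivity in (16) can be checked pointwise when everything is finite and $\mathcal{C}$ is the trivial algebra, using the elementary fact that conditioning on a finite algebra gives $\mu(A \cap B)/\mu(A)$ on each atom $A$. The sketch below assumes that setting; the four-point space, the two partitions `a1` and `a2`, and the helper names are hypothetical.

```python
import math

def atom_masses(labels, mu):
    """Total mass of each atom of the partition encoded by labels."""
    masses = {}
    for lab, m in zip(labels, mu):
        masses[lab] = masses.get(lab, 0.0) + m
    return masses

def info(labels, mu):
    """Information function: -log of the mass of the atom containing x."""
    masses = atom_masses(labels, mu)
    return [-math.log(masses[lab]) for lab in labels]

def cond_info(labels, given, mu):
    """Conditional information given a finite partition `given`."""
    joint = atom_masses(list(zip(labels, given)), mu)
    base = atom_masses(given, mu)
    return [-math.log(joint[(a, c)] / base[c]) for a, c in zip(labels, given)]

mu = [0.1, 0.2, 0.3, 0.4]
a1 = [0, 0, 1, 1]
a2 = [0, 1, 0, 1]
join = list(zip(a1, a2))            # atoms of the join are the intersections

lhs = info(join, mu)
rhs = [u + v for u, v in zip(info(a1, mu), cond_info(a2, a1, mu))]
additive = all(abs(u - v) < 1e-12 for u, v in zip(lhs, rhs))

# Integrating gives the corresponding identity for entropies.
H_join = sum(m * v for m, v in zip(mu, lhs))
H_a1 = sum(m * v for m, v in zip(mu, info(a1, mu)))
H_a2_given_a1 = sum(m * v for m, v in zip(mu, cond_info(a2, a1, mu)))
```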

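Proposition 4.5 For any finite algebra $\mathcal{A} \subseteq \mathcal{B}$ and each $n = 1, 2, \ldots$, we have

and hence

which is clearly the same as (19). Equation (20) now follows by (11) and the fact that $\phi$ is measure preserving.

Next let us introduce the function $J(\mathcal{A}|\mathcal{C}, x) = \mathbb{E}_{\mathcal{C}} I(\mathcal{A}|\mathcal{C}, x)$. According to Proposition 4.2(ii) and linearity of $\mathbb{E}_{\mathcal{C}}$, we have

$$J(\mathcal{A}|\mathcal{C}, x) = -\sum_{A \in \mathcal{A}^{*}} \mathbb{E}_{\mathcal{C}}\chi_A(x) \log \mathbb{E}_{\mathcal{C}}\chi_A(x).$$

From Proposition 4.2(vi) and (13), we also have

$$H(\mathcal{A}|\mathcal{C}) = \int_X J(\mathcal{A}|\mathcal{C}, x)\, \mu(dx). \tag{21}$$

Proposition 4.6 Suppose $\mathcal{C}_1 \subseteq \mathcal{C}_2 \subseteq \mathcal{B}$ are $\sigma$-algebras, and $\mathcal{A} \subseteq \mathcal{B}$ is a finite algebra. Then

$$H(\mathcal{A}|\mathcal{C}_2) \le H(\mathcal{A}|\mathcal{C}_1). \tag{22}$$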


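Proof We shall apply Jensen's inequality, Proposition 4.4, to the function $F$ defined by Eq. (5) with $T = \mathbb{E}_{\mathcal{C}_1}$ and $f = \mathbb{E}_{\mathcal{C}_2}\chi_A$ for each $A \in \mathcal{A}^{*}$. According to Proposition 4.2(iii), we have

$$Tf = \mathbb{E}_{\mathcal{C}_1}\mathbb{E}_{\mathcal{C}_2}\chi_A = \mathbb{E}_{\mathcal{C}_1}\chi_A.$$

Thus

$$F \circ Tf = \mathbb{E}_{\mathcal{C}_1}\chi_A \log \mathbb{E}_{\mathcal{C}_1}\chi_A \le T(F \circ f) = \mathbb{E}_{\mathcal{C}_1}[\mathbb{E}_{\mathcal{C}_2}\chi_A \log \mathbb{E}_{\mathcal{C}_2}\chi_A].$$

Multiplying by $-1$ and adding on $A \in \mathcal{A}^{*}$ gives, by the expression for $J$ above,

$$J(\mathcal{A}|\mathcal{C}_1) \ge \mathbb{E}_{\mathcal{C}_1} J(\mathcal{A}|\mathcal{C}_2).$$

Integrating and recalling (21) gives (22).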

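Theorem 4.1 For any finite algebra $\mathcal{A} \subseteq \mathcal{B}$ the limit

$$h(\phi, \mathcal{A}) = \lim_{n \to \infty} \frac{1}{n} H\Big(\bigvee_{i=0}^{n-1} \phi^{-i}\mathcal{A}\Big) = \lim_{n \to \infty} H\Big(\mathcal{A} \Big| \bigvee_{i=1}^{n} \phi^{-i}\mathcal{A}\Big) \tag{23}$$

exists and is finite.

Proof By Proposition 4.6 the sequence $H(\mathcal{A}|\bigvee_{i=1}^{n} \phi^{-i}\mathcal{A})$ is nonincreasing, hence has a finite nonnegative limit as $n \to \infty$. The equality of the two limits in (23) follows from Eq. (20) and the fact that a convergent sequence is Cesàro summable to the same limit.

Corollary 4.1.1 $h(\phi, \mathcal{A}) \le H(\mathcal{A})$.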

We conclude this section by deducing monotonicity properties of the functions $H(\mathcal{A}|\mathcal{C})$ and $h(\phi, \mathcal{A})$ and showing that the entropy $h(\Phi)$ is an isomorphism invariant.

Proof (24) follows from (16). The remainder of the proof is direct and is left to the exercises.

Remark Inequality (26) is the motivation for definition (8) of $h(\Phi)$.

Theorem 4.2 Suppose $\Phi_1 | \Phi_2$. Then $h(\Phi_1) \le h(\Phi_2)$. In particular, $h(\Phi)$ is an m-isomorphism invariant.

Proof Suppose $\psi: \Phi_2 \to \Phi_1$. Clearly, we have for each finite subalgebra $\mathcal{A} \subseteq \mathcal{B}_1$

H(*-

bq = H ( d )

for $\mu_2$-almost all $x \in X_2$. Thus

As $\mathcal{A}$ ranges over the finite subalgebras of $\mathcal{B}_1$, $\psi^{-1}\mathcal{A}$ ranges over a subset of the finite subalgebras of $\mathcal{B}_2$. Hence $h(\Phi_1) \le h(\Phi_2)$, as asserted.

3. SINAI'S THEOREM AND GENERATORS

In this section we prove a very powerful theorem due to Sinai [%I. Sinai's theorem (Theorem 4.3) enables us, in many interesting cases, to evaluate $h(\Phi) = h(\phi, \mathcal{A})$ for certain finite algebras $\mathcal{A}$. That is, Sinai's theorem tells us when the sup in Eq. (8) is attained. First of all, though, we need to prove a weak form of the martingale theorem (Lemma 4.1) and deduce an alternate formula for $h(\phi, \mathcal{A})$.

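Lemma 4.1 Let $\mathcal{C}_n \subseteq \mathcal{B}$ $(n = 1, 2, \ldots)$ be an increasing sequence of $\sigma$-algebras, and let $\mathcal{C}$ be the $\sigma$-algebra generated by $\bigcup_{n=1}^{\infty} \mathcal{C}_n$. (In this case, we write $\mathcal{C}_n \uparrow \mathcal{C}$.) Then

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n} g = \mathbb{E}_{\mathcal{C}} g \quad \text{a.e.} \tag{27}$$

for each $g \in L_\infty(X, \mathcal{B}, \mu)$. If $\mathcal{A} \subseteq \mathcal{B}$ is a finite algebra, then

$$\lim_{n \to \infty} I(\mathcal{A}|\mathcal{C}_n, x) = I(\mathcal{A}|\mathcal{C}, x) \quad \text{a.e.} \tag{28}$$

and

$$\lim_{n \to \infty} H(\mathcal{A}|\mathcal{C}_n) = H(\mathcal{A}|\mathcal{C}). \tag{29}$$

Proof (28) follows from (27) with $g = \chi_A$ for each $A \in \mathcal{A}^{*}$. (29) follows from (28) and the dominated convergence theorem, since

$$H(\mathcal{A}|\mathcal{C}_n) \le H(\mathcal{A}) \quad (n = 1, 2, \ldots).$$

In order to establish (27), it is sufficient to prove

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n}\chi_A = \chi_A \quad \text{a.e.} \tag{30}$$

for each $A \in \mathcal{C}$. For suppose that this has been established, and let $f$ be a bounded $\mathcal{C}$-measurable function. Choose $\mathcal{C}$-measurable simple functions $f_k$ such that

$$|f_k(x) - f(x)| \le \frac{1}{k} \quad \text{a.e.}$$

Then, since $\mathbb{E}_{\mathcal{C}_n}$ is doubly stochastic,

$$\begin{aligned} |\mathbb{E}_{\mathcal{C}_n} f - f| &\le |\mathbb{E}_{\mathcal{C}_n} f - \mathbb{E}_{\mathcal{C}_n} f_k| + |\mathbb{E}_{\mathcal{C}_n} f_k - f_k| + |f_k - f| \\ &\le \mathbb{E}_{\mathcal{C}_n}|f - f_k| + |\mathbb{E}_{\mathcal{C}_n} f_k - f_k| + |f_k - f| \\ &\le 2/k + |\mathbb{E}_{\mathcal{C}_n} f_k - f_k|. \end{aligned}$$

According to (30), the last term tends to 0 as $n \to \infty$. Thus

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n} f = f \quad \text{a.e.} \tag{31}$$

Finally, if $g \in L_\infty(X, \mathcal{B}, \mu)$, and if $f = \mathbb{E}_{\mathcal{C}} g$, then $f$ is bounded and $\mathcal{C}$-measurable. From Proposition 4.2(iii),

$$\mathbb{E}_{\mathcal{C}_n} f = \mathbb{E}_{\mathcal{C}_n}\mathbb{E}_{\mathcal{C}} g = \mathbb{E}_{\mathcal{C}_n} g.$$

Thus (31) implies (27).

Now let us prove (30). Let $\varepsilon$ and $\delta$ be arbitrary real numbers subject to $\delta > 0$, $0 < \varepsilon < 1$, and let $A \in \mathcal{C}$. Since $\mathcal{C}_n \uparrow \mathcal{C}$, there exists an integer $k$ and a set $B \in \mathcal{C}_k$ such that

$$\mu(A \,\Delta\, B) = \mu(A - B) + \mu(B - A) < \varepsilon\delta/2. \tag{32}$$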


We define successively (n = 1, 2, ...)

D, = { x : 1 -E%,,xA(x) 2 E } F,

= D, n

B

It follows that $D_n \in \mathcal{C}_n$ for each $n$, and hence (by induction) that $F_n \in \mathcal{C}_n \vee \mathcal{C}_k = \mathcal{C}_{n \vee k}$ $(n = 1, 2, \ldots)$. Here, $n \vee k$ denotes the larger of the integers $n$ and $k$. Finally, letting $F = \bigcup_{n=k}^{\infty} F_n$, we have

$$F - A \subseteq B - A,$$

so that

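From (32) and (33) follows

$$\mu(F) < \delta/2. \tag{34}$$

Since

$$F = \bigcup_{n=k}^{\infty} F_n = \bigcup_{n=k}^{\infty} (B \cap D_n),$$

we now have

$$1 - \mathbb{E}_{\mathcal{C}_n}\chi_A(x) < \varepsilon \quad (n \ge k,\ x \in B - F)$$

and

$$\mu(A - (B - F)) \le \mu(A - B) + \mu(A \cap F) < \varepsilon\delta/2 + \delta/2 < \delta$$

from (32) and (34). It follows that

$$\limsup_{n \to \infty} |1 - \mathbb{E}_{\mathcal{C}_n}\chi_A(x)| \le \varepsilon$$

for all $x \in B - F$ and hence, since $\delta > 0$ was arbitrary, for almost all $x \in A$. Therefore,

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n}\chi_A(x) = 1 \tag{35}$$

for almost all $x \in A$.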

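Since (35) holds for all $A \in \mathcal{C}$, we have

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n}\chi_{X - A}(x) = 1$$

or

$$\lim_{n \to \infty} \mathbb{E}_{\mathcal{C}_n}\chi_A(x) = 0 \tag{36}$$

for almost all $x \notin A$. Combining (35) and (36) yields (30).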
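A finite analogue of (30) may help fix ideas: on a grid of $2^M$ cells with uniform measure, let the $n$th algebra be generated by the dyadic intervals of length $2^{-n}$. The conditional expectations of an indicator then approach the indicator as $n$ grows, and agree with it once the algebra is fine enough. The sketch below assumes this discrete setting; `M`, the set `chi_A`, and the helper names are hypothetical.

```python
from fractions import Fraction

M = 6                                  # finest level: 2**M grid cells
N_POINTS = 2 ** M
mu = [Fraction(1, N_POINTS)] * N_POINTS
# Indicator of a hypothetical set A: every third grid cell.
chi_A = [Fraction(1) if x % 3 == 0 else Fraction(0) for x in range(N_POINTS)]

def cond_exp(f, labels, mu):
    """Average f over each atom of the partition encoded by labels (weights mu)."""
    mass, weighted = {}, {}
    for x, lab in enumerate(labels):
        mass[lab] = mass.get(lab, 0) + mu[x]
        weighted[lab] = weighted.get(lab, 0) + mu[x] * f[x]
    return [weighted[lab] / mass[lab] for lab in labels]

def dyadic_labels(n):
    """Label each grid cell by the dyadic interval of length 2**-n containing it."""
    return [x >> (M - n) for x in range(N_POINTS)]

# Mean-square distance between E_n chi_A and chi_A for n = 0, 1, ..., M.
errors = []
for n in range(M + 1):
    En = cond_exp(chi_A, dyadic_labels(n), mu)
    errors.append(sum(m * (a - b) ** 2 for m, a, b in zip(mu, En, chi_A)))
```

The errors are nonincreasing because the algebras are nested, so each conditional expectation is an orthogonal projection onto a larger subspace than the one before.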

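In the future we shall denote by $\bigvee_{i \in I} \mathcal{C}_i$ the smallest $\sigma$-algebra containing each of the $\sigma$-algebras $\mathcal{C}_i$ $(i \in I)$. Thus $\bigvee_{i=1}^{n} \mathcal{C}_i \uparrow \bigvee_{i=1}^{\infty} \mathcal{C}_i$, and we deduce from Theorem 4.1 and Lemma 4.1 the following corollary.

Proposition 4.8 For each finite algebra $\mathcal{A} \subseteq \mathcal{B}$,

$$h(\phi, \mathcal{A}) = H\Big(\mathcal{A} \Big| \bigvee_{i=1}^{\infty} \phi^{-i}\mathcal{A}\Big). \tag{37}$$

We could, of course, have taken (37) as the definition of $h(\phi, \mathcal{A})$. However, we shall have occasion to use both (37) and (7) in the sequel. For example, the latter is used in obtaining the following generalization of inequality (26).

Proposition 4.9 Let $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{B}$ be finite algebras. Then

$$h(\phi, \mathcal{A}_1) \le h(\phi, \mathcal{A}_2) + H(\mathcal{A}_1|\mathcal{A}_2). \tag{38}$$

Proof Exercise 8.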

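Theorem 4.3 (Sinai) If $\mathcal{A}$ is a finite subalgebra of $\mathcal{B}$ such that (i) $\bigvee_{n=0}^{\infty} \phi^{-n}\mathcal{A} = \mathcal{B}$, or (ii) $\Phi$ is invertible and $\bigvee_{n=-\infty}^{\infty} \phi^{-n}\mathcal{A} = \mathcal{B}$, then $h(\Phi) = h(\phi, \mathcal{A})$.

Proof We shall prove only (i). The second part can be proved in the same way or deduced from (i) and Theorem 4.4 below. Replacing $\mathcal{A}_2$ by $\bigvee_{j=0}^{n} \phi^{-j}\mathcal{A}$ in (38) gives

$$h(\phi, \mathcal{A}_1) \le h\Big(\phi, \bigvee_{j=0}^{n} \phi^{-j}\mathcal{A}\Big) + H\Big(\mathcal{A}_1 \Big| \bigvee_{j=0}^{n} \phi^{-j}\mathcal{A}\Big). \tag{39}$$

Since $\bigvee_{j=0}^{n} \phi^{-j}\mathcal{A} \uparrow \bigvee_{j=0}^{\infty} \phi^{-j}\mathcal{A} = \mathcal{B}$, it follows from Lemma 4.1 that the last term in (39) tends to $H(\mathcal{A}_1|\mathcal{B}) = 0$ as $n \to \infty$. On the other hand, it is easily shown (Exercise 9) that

$$h\Big(\phi, \bigvee_{j=0}^{n} \phi^{-j}\mathcal{A}\Big) = h(\phi, \mathcal{A})$$

for each $n$. Thus $h(\phi, \mathcal{A}_1) \le h(\phi, \mathcal{A})$ for each finite algebra $\mathcal{A}_1 \subseteq \mathcal{B}$. That is

$$h(\phi, \mathcal{A}) = \sup_{\mathcal{A}_1} h(\phi, \mathcal{A}_1),$$

as required.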

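Now suppose that $\Phi = (X, \mathcal{B}, \mu, \phi)$ is any abstract dynamical system and $\mathcal{A} \subseteq \mathcal{B}$ is any finite algebra. Then $\mathcal{B}_1 = \bigvee_{n=0}^{\infty} \phi^{-n}\mathcal{A}$ is a $\phi$-invariant $\sigma$-algebra (see Section I.4). Indeed, $\phi^{-1}(\mathcal{B}_1) = \bigvee_{n=1}^{\infty} \phi^{-n}\mathcal{A} \subseteq \mathcal{B}_1$. Thus the system $\Phi_{\mathcal{A}} = (X, \mathcal{B}_1, \mu, \phi)$ is a factor of $\Phi$. Sinai's theorem says that

$$h(\phi, \mathcal{A}) = h(\Phi_{\mathcal{A}}) \tag{40}$$

for all finite algebras $\mathcal{A} \subseteq \mathcal{B}$. We say that $\mathcal{A}$ is a strong generator for $\Phi$ if $\bigvee_{n=0}^{\infty} \phi^{-n}\mathcal{A} = \mathcal{B}$, and that $\mathcal{A}$ is a generator for $\Phi$ if $\bigvee_{n=-\infty}^{\infty} \phi^{-n}\mathcal{A} = \mathcal{B}$. We shall see that the notion of generator is very useful for evaluating $h(\Phi)$. On the other hand, we have the following consequence of Theorem 4.3.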

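Corollary 4.3.1 If $\Phi$ is invertible and has a strong generator, then $h(\Phi) = 0$.

Proof Let $\mathcal{A}$ be such a strong generator. Then

$$h(\Phi) = h(\phi, \mathcal{A}) = H\Big(\mathcal{A} \Big| \bigvee_{n=1}^{\infty} \phi^{-n}\mathcal{A}\Big) = H(\mathcal{A}|\phi^{-1}\mathcal{B}) = H(\phi\mathcal{A}|\mathcal{B}) = 0$$

from the integrated form of (15).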

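The next theorem, which asserts a type of continuity for the function $h$, may be thought of as an extension of Sinai's theorem. If $\mathcal{A}$ is a generator for the invertible dynamical system $\Phi$, and if $\mathcal{B}_m = \bigvee_{n=0}^{\infty} \phi^{-n}(\phi^m\mathcal{A})$, then $\mathcal{B}_m \uparrow \mathcal{B}$ as $m \to \infty$. It follows that

$$\Phi = \operatorname*{inv\,lim}_{m \to \infty} \Phi_{\phi^m\mathcal{A}}.$$

On the other hand,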


for each $m$. Thus Theorem 4.4 includes the second part of Theorem 4.3. It also includes various extensions of Sinai's theorem given in Billingsley [7]. This was first noted by the present author in [12].

Theorem 4.4 If $\Phi = \operatorname{inv\,lim}_\alpha \Phi_\alpha$, then

$$h(\Phi) = \lim_\alpha h(\Phi_\alpha).$$

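For the proof we shall need the following lemma, which has some independent interest. The subalgebra $\mathcal{B}_0 \subseteq \mathcal{B}$ is said to be dense if $\mathcal{B}$ is the smallest $\sigma$-algebra containing $\mathcal{B}_0$. In this case, each $B \in \mathcal{B}$ can be approximated in $\mathcal{B}_0$, in the sense that to each $\varepsilon > 0$ corresponds a $B_\varepsilon \in \mathcal{B}_0$ with $\mu(B \,\Delta\, B_\varepsilon) < \varepsilon$.

Lemma 4.2 Let $\mathcal{B}_0$ be a dense subalgebra of $\mathcal{B}$. Then

$$h(\Phi) = \sup_{\mathcal{A} \subseteq \mathcal{B}_0} h(\phi, \mathcal{A}), \tag{42}$$

where the sup is taken over finite subalgebras of $\mathcal{B}_0$.

Proof Let $\mathcal{A}' \subseteq \mathcal{B}$ be any finite subalgebra. According to Proposition 4.9,

$$h(\phi, \mathcal{A}') \le h(\phi, \mathcal{A}) + H(\mathcal{A}'|\mathcal{A}),$$

where $\mathcal{A} \subseteq \mathcal{B}_0$ is any other finite algebra. We need only show that for a given $\varepsilon > 0$ there is such an $\mathcal{A}$ so that

$$H(\mathcal{A}'|\mathcal{A}) < \varepsilon. \tag{43}$$

For then it will follow that

$$h(\phi, \mathcal{A}') \le \sup_{\mathcal{A} \subseteq \mathcal{B}_0} h(\phi, \mathcal{A})$$

for each $\mathcal{A}'$, and hence

$$h(\Phi) \le \sup_{\mathcal{A} \subseteq \mathcal{B}_0} h(\phi, \mathcal{A}).$$

Let us look at the quantity

$$H(\mathcal{A}'|\mathcal{A}) = \int_X I(\mathcal{A}'|\mathcal{A})\, d\mu.$$

For the finite algebra $\mathcal{A}$ we have (Exercise 3) that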


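$$\mathbb{E}_{\mathcal{A}}\chi_B = \sum_{A \in \mathcal{A}^{*}} \frac{\mu(A \cap B)}{\mu(A)}\, \chi_A$$

for any $B \in \mathcal{B}$. Hence

$$I(\mathcal{A}'|\mathcal{A}, x) = -\sum_{A' \in \mathcal{A}'^{*}} \sum_{A \in \mathcal{A}^{*}} \chi_{A \cap A'}(x) \log \frac{\mu(A \cap A')}{\mu(A)},$$

so that

$$\begin{aligned} H(\mathcal{A}'|\mathcal{A}) &= -\sum_{A' \in \mathcal{A}'^{*}} \sum_{A \in \mathcal{A}^{*}} \mu(A \cap A') \log \mu(A \cap A') + \sum_{A \in \mathcal{A}^{*}} \mu(A) \log \mu(A) \\ &= -\sum_{A' \in \mathcal{A}'^{*}} \sum_{A \in \mathcal{A}^{*}} F(\mu(A \cap A')) + \sum_{A \in \mathcal{A}^{*}} F(\mu(A)), \end{aligned} \tag{44}$$

where $F$ is the continuous function defined by (5). Since $\mathcal{B}_0$ is dense in $\mathcal{B}$, we can choose a finite number of sets $A \in \mathcal{B}_0$, one for each atom $A' \in \mathcal{A}'$, such that (i) they are pairwise disjoint, and (ii) for each $A' \in \mathcal{A}'$ there is exactly one $A$ with $\mu(A \,\Delta\, A') < \delta = \delta(\varepsilon/n^2)$, where $\delta(\varepsilon)$ is a modulus of continuity for $F$ and $n$ is the number of atoms in $\mathcal{A}'$. Let $\mathcal{A}$ be the algebra whose atoms are these sets $A$. Then expression (44) contains $n^2$ terms each smaller in absolute value than $\varepsilon/n^2$, and (43) is established.

Proof of Theorem 4.4 We may assume without loss of generality that $\Phi = (X, \mathcal{B}, \mu, \phi)$ and $\Phi_\alpha = (X, \mathcal{B}_\alpha, \mu, \phi)$ for each $\alpha \in J$. According to Theorem 4.2, $h(\Phi_\alpha)$ $(\alpha \in J)$ is a monotone net, and so the limit $\lim_\alpha h(\Phi_\alpha) \le \infty$ exists and is equal to $\sup_\alpha h(\Phi_\alpha)$. Moreover, $h(\Phi_\alpha) \le h(\Phi)$ for each $\alpha \in J$, and so

$$h(\Phi) \ge \lim_\alpha h(\Phi_\alpha).$$

It only remains to prove the reverse inequality. Let $\mathcal{B}_0 = \bigcup_\alpha \mathcal{B}_\alpha$. Then $\mathcal{B}_0$ satisfies the hypotheses of the lemma. Let $\mathcal{A}_0 \subseteq \mathcal{B}_0$ be any finite subalgebra. Then there exists an $\alpha \in J$ ($J$ is a directed set) such that $\mathcal{A}_0 \subseteq \mathcal{B}_\alpha$. But then

$$h(\phi, \mathcal{A}_0) \le \sup_{\mathcal{A} \subseteq \mathcal{B}_\alpha} h(\phi, \mathcal{A}) = h(\Phi_\alpha).$$

It follows from Lemma 4.2 that

$$h(\Phi) = \sup_{\mathcal{A} \subseteq \mathcal{B}_0} h(\phi, \mathcal{A}) \le \sup_{\alpha \in J} h(\Phi_\alpha),$$

and the proof is complete.

Corollary 4.4.1 If $\hat{\Phi}$ is the natural extension of $\Phi$, then $h(\hat{\Phi}) = h(\Phi)$.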


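Proof The sum, of course, is $+\infty$ if more than countably many of the numbers are positive. It will suffice to show that

$$h(\Phi_1 \otimes \Phi_2) = h(\Phi_1) + h(\Phi_2). \tag{46}$$

If we write $\Phi = (X, \mathcal{B}, \mu, \phi)$ and $\Phi_i = (X, \mathcal{B}_i, \mu, \phi)$, then $\Phi = \Phi_1 \otimes \Phi_2$ iff $\mathcal{B} = \mathcal{B}_1 \vee \mathcal{B}_2$ and the $\sigma$-algebras $\mathcal{B}_1$ and $\mathcal{B}_2$ are independent, in the sense that $\mu(B_1 \cap B_2) = \mu(B_1)\mu(B_2)$ for all $B_1 \in \mathcal{B}_1$, $B_2 \in \mathcal{B}_2$. According to Lemma 4.2,

$$h(\Phi) = \sup_{\mathcal{A}_1, \mathcal{A}_2} h(\phi, \mathcal{A}_1 \vee \mathcal{A}_2), \tag{47}$$

where the sup is taken over all finite algebras $\mathcal{A}_1 \subseteq \mathcal{B}_1$ and $\mathcal{A}_2 \subseteq \mathcal{B}_2$. If $\mathcal{A}_1$ and $\mathcal{A}_2$ are such algebras, they are clearly independent. Moreover, $\bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_1$ and $\bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_2$ are independent for each $n$. It follows easily (Exercise 11) from this independence that

$$H\Big(\bigvee_{j=0}^{n-1} \phi^{-j}(\mathcal{A}_1 \vee \mathcal{A}_2)\Big) = H\Big(\bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_1 \vee \bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_2\Big) = H\Big(\bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_1\Big) + H\Big(\bigvee_{j=0}^{n-1} \phi^{-j}\mathcal{A}_2\Big).$$

Dividing by $n$ and passing to the limit gives

$$h(\phi, \mathcal{A}_1 \vee \mathcal{A}_2) = h(\phi, \mathcal{A}_1) + h(\phi, \mathcal{A}_2),$$

so that (46) follows from (47).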

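We conclude this section by evaluating the entropy of the Bernoulli shift on $k$ points (Example 3, Chapter I). According to Corollary 4.4.1, the two-sided and one-sided shifts have the same entropy. Let us calculate it for the two-sided shift. Thus

$$(X, \mathcal{B}, \mu) = \Big(\prod_{n=-\infty}^{\infty} X_n,\ \prod_{n=-\infty}^{\infty} \mathcal{B}_n,\ \prod_{n=-\infty}^{\infty} \mu_n\Big),$$

where

$$X_n = \{0, 1, \ldots, k - 1\}, \qquad \mathcal{B}_n = \{\text{all subsets of } X_n\}, \qquad \mu_n = \{p_0, p_1, \ldots, p_{k-1}\},$$

and $\phi(x) = y$, where $y_n = x_{n+1}$. Let $\mathcal{A}$ be the class of all cylinder sets of the form

$$A = \{x : x_0 \in A_0\},$$

where $A_0 \subseteq X_0$. Then $\mathcal{A}$ is a generator for $\Phi$, and $\phi^{-n}\mathcal{A}$ is independent of $\phi^{-m}\mathcal{A}$ for all $n \ne m$ (since $\mu$ is the product measure). Hence

$$H\Big(\bigvee_{i=0}^{n-1} \phi^{-i}\mathcal{A}\Big) = \sum_{i=0}^{n-1} H(\phi^{-i}\mathcal{A}) = nH(\mathcal{A})$$

and

$$h(\phi, \mathcal{A}) = \lim_{n \to \infty} \frac{1}{n} \cdot nH(\mathcal{A}) = H(\mathcal{A}).$$

Since the atoms of $\mathcal{A}$ are the sets

$$A_j = \{x : x_0 = j\} \quad (j = 0, 1, \ldots, k - 1),$$

it follows that

$$h(\Phi) = H(\mathcal{A}) = -\sum_{j=0}^{k-1} p_j \log p_j.$$

In particular, if $p_0 = p_1 = \cdots = p_{k-1} = 1/k$,

$$h(\Phi) = \sum_{j=0}^{k-1} \frac{1}{k} \log k = \log k.$$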
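The computation above is easy to reproduce numerically. The sketch below evaluates $-\sum p_j \log p_j$ for the symmetric shifts on two and three points, and also checks on a small example that the entropy of the join of $n$ shifted copies of the time-zero partition is $n$ times the entropy of one copy. The function names and the sample distribution `p` are illustrative assumptions.

```python
import itertools
import math

def entropy(masses):
    """H = -sum p log p, with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in masses if p > 0)

def block_entropy(p, n):
    """Entropy of the join of n shifted copies of the time-zero partition.

    Under the product measure, the cylinder set fixed by a word w has
    measure equal to the product of the p[j] over the letters j of w.
    """
    words = itertools.product(range(len(p)), repeat=n)
    return entropy([math.prod(p[j] for j in w) for w in words])

h2 = entropy([1 / 2, 1 / 2])            # symmetric shift on two points
h3 = entropy([1 / 3, 1 / 3, 1 / 3])     # symmetric shift on three points

p = [0.5, 0.3, 0.2]                     # hypothetical non-symmetric weights
h_p = entropy(p)
H4 = block_entropy(p, 4)
```

Since `h2` and `h3` differ, the invariance of entropy distinguishes these two shifts.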

This proves that, in particular, “symmetric” shifts on $k$ points for different $k$ are nonisomorphic.

4. TOPOLOGICAL ENTROPY

In Section II.4 we have introduced a notion of isomorphism for two classical dynamical systems $\Sigma_1 = (X_1, \sigma_1)$ and $\Sigma_2 = (X_2, \sigma_2)$. Let us say that the systems are t-isomorphic and write $\Sigma_1 \overset{t}{\cong} \Sigma_2$ if there exists a continuous invertible map $\psi$ of $X_1$ onto $X_2$ such that $\psi\sigma_1 = \sigma_2\psi$. In 1965, Adler et al. [4] introduced an analog $h_t(\Sigma)$ of the entropy $h(\Phi)$ for classical dynamical systems. They conjectured, and it was later proved by Goodwyn [26], Dinaburg [15], and Goodman [25], that the topological entropy $h_t(\Sigma)$ was equal to the sup of the numbers $h(\Sigma^\mu)$ where $\mu$ is a $\sigma$-invariant Borel measure on $X$ and $\Sigma^\mu = (X, \mathcal{B}, \mu, \sigma)$. We shall take this as our definition.

Definition 4.3 Let $\Sigma = (X, \sigma)$ be a classical dynamical system and let $M(\Sigma)$ denote the class of $\sigma$-invariant normalized measures on $X$. The topological entropy $h_t(\Sigma)$ is

$$h_t(\Sigma) = \sup_{\mu \in M(\Sigma)} h(\Sigma^\mu).$$

A number of useful properties of topological entropy follow directly from the definition. Recall that $\Sigma_1$ is a factor of $\Sigma_2$ ($\Sigma_1 | \Sigma_2$) if there is a continuous epimorphism $\psi: X_2 \to X_1$ with $\psi\sigma_2 = \sigma_1\psi$.

Theorem 4.5 Suppose $\Sigma_1 | \Sigma_2$. Then $h_t(\Sigma_1) \le h_t(\Sigma_2)$. In particular, $h_t(\Sigma)$ is a t-isomorphism invariant.

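Proof Suppose $\psi: X_2 \to X_1$ is a continuous epimorphism. Let $T_\psi: C(X_1) \to C(X_2)$ be the induced linear operator, and let $T_\psi^{*}: M(X_2) \to M(X_1)$ be the adjoint of $T_\psi$ (see Section II.1). Since $\psi$ is epic, $T_\psi$ is monic and $T_\psi^{*}$ is epic. If $\psi\sigma_2 = \sigma_1\psi$, then $T_\psi^{*}T_{\sigma_2}^{*} = T_{\sigma_1}^{*}T_\psi^{*}$, so that $T_\psi^{*}$ maps the subset $M(\Sigma_2) \subseteq M(X_2)$ onto $M(\Sigma_1)$. (See Exercise II.15.) Moreover, for each $\mu \in M(\Sigma_2)$, $\psi$ is a homomorphism of $\Sigma_2^{\mu} = (X_2, \mathcal{B}_2, \mu, \sigma_2)$ onto $\Sigma_1^{T_\psi^{*}\mu} = (X_1, \mathcal{B}_1, T_\psi^{*}\mu, \sigma_1)$, so that, by Theorem 4.2, $h(\Sigma_1^{T_\psi^{*}\mu}) \le h(\Sigma_2^{\mu})$. It follows that

$$h_t(\Sigma_1) = \sup_{\nu \in M(\Sigma_1)} h(\Sigma_1^{\nu}) = \sup_{\mu \in M(\Sigma_2)} h(\Sigma_1^{T_\psi^{*}\mu}) \le \sup_{\mu \in M(\Sigma_2)} h(\Sigma_2^{\mu}) = h_t(\Sigma_2).$$

Theorem 4.6 If $\Sigma = \Sigma_1 \otimes \Sigma_2$ is the direct product, then $h_t(\Sigma) = h_t(\Sigma_1) + h_t(\Sigma_2)$.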

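Proof Suppose that $\mu \in M(\Sigma)$ is arbitrary. Let us denote by $h(\sigma, \mathcal{A}, \mu)$ the quantity $h(\sigma, \mathcal{A})$ calculated according to (7) for the abstract system $\Sigma^\mu$. It follows from Eq. (16) of Section 2 and Proposition 4.6 that

$$h(\sigma, \mathcal{A}_1 \vee \mathcal{A}_2, \mu) \le h(\sigma, \mathcal{A}_1, \mu) + h(\sigma, \mathcal{A}_2, \mu) \tag{50}$$

for any finite algebras $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{B}$. Since $\mathcal{B} = \mathcal{B}_1 \times \mathcal{B}_2$ is the product $\sigma$-algebra, there are canonically determined copies $\tilde{\mathcal{B}}_1$ of $\mathcal{B}_1$ and $\tilde{\mathcal{B}}_2$ of $\mathcal{B}_2$ in $\mathcal{B}$. (For example, $A \times X_2 \in \tilde{\mathcal{B}}_1$ for each $A \in \mathcal{B}_1$.) In fact, $\mathcal{B} = \tilde{\mathcal{B}}_1 \vee \tilde{\mathcal{B}}_2$. Selecting $\mathcal{A}_1 \subseteq \tilde{\mathcal{B}}_1$ and $\mathcal{A}_2 \subseteq \tilde{\mathcal{B}}_2$ in (50) and taking the sup gives by Lemma 4.2

$$h(\Sigma^\mu) \le h(\Sigma_1^{\mu_1}) + h(\Sigma_2^{\mu_2}), \tag{51}$$

where $\mu_1$ and $\mu_2$ are the "marginal measures,"

$$\mu_1(A) = \mu(A \times X_2), \qquad \mu_2(B) = \mu(X_1 \times B).$$

Now if $\mu$ is the direct product $\mu_1 \times \mu_2$, then the $\sigma$-algebras $\tilde{\mathcal{B}}_1$ and $\tilde{\mathcal{B}}_2$ are independent, in the sense that

$$\mu((A \times X_2) \cap (X_1 \times B)) = \mu(A \times B) = \mu_1(A)\,\mu_2(B) = \mu(A \times X_2) \cdot \mu(X_1 \times B)$$

for each $A \in \mathcal{B}_1$, $B \in \mathcal{B}_2$. Moreover, since $\sigma = \sigma_1 \times \sigma_2$, we have $\mu_1 \times \mu_2 \in M(\Sigma)$ for each $\mu_1 \in M(\Sigma_1)$, $\mu_2 \in M(\Sigma_2)$. It is easily seen (Exercise 11) that we then have (50) replaced by

$$h(\sigma, \mathcal{A}_1 \vee \mathcal{A}_2, \mu) = h(\sigma, \mathcal{A}_1, \mu) + h(\sigma, \mathcal{A}_2, \mu),$$

and so

$$h(\Sigma^{\mu_1 \times \mu_2}) = h(\Sigma_1^{\mu_1}) + h(\Sigma_2^{\mu_2}).$$

Summing up we have for each $\mu \in M(\Sigma)$

$$h(\Sigma^\mu) \le h(\Sigma_1^{\mu_1}) + h(\Sigma_2^{\mu_2}) = h(\Sigma^{\mu_1 \times \mu_2}). \tag{52}$$

Taking the sup first on $\mu_1 \in M(\Sigma_1)$, $\mu_2 \in M(\Sigma_2)$ and then $\mu \in M(\Sigma)$ gives

$$h_t(\Sigma) \le h_t(\Sigma_1) + h_t(\Sigma_2) = \sup_{\mu_1, \mu_2} h(\Sigma^{\mu_1 \times \mu_2}) \le h_t(\Sigma).$$

This completes the proof.

Theorem 4.7 If $\Sigma = \operatorname{inv\,lim}_\alpha \Sigma_\alpha$, then $h_t(\Sigma) = \lim_\alpha h_t(\Sigma_\alpha)$.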

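Proof For each $\alpha \in J$, $\Sigma_\alpha$ is a factor of $\Sigma$. Moreover, $\Sigma_\alpha \mid \Sigma_\beta$ for $\alpha < \beta$. Thus the net $h_t(\Sigma_\alpha)$ of real numbers is monotone nondecreasing. Hence the limit exists and

$$\lim_\alpha h_t(\Sigma_\alpha) = \sup_\alpha h_t(\Sigma_\alpha) \le h_t(\Sigma).$$

To prove the reverse inequality, let $\mu \in M(\Sigma)$ be arbitrary. Then, as in the proof of Theorem 4.5, $\mu$ determines a measure $\mu_\alpha \in M(\Sigma_\alpha)$ for each $\alpha \in J$ such that

$$\Sigma^\mu = \operatorname{inv\,lim}_{\alpha \in J} \Sigma_\alpha^{\mu_\alpha}.$$

According to Theorem 4.4,

$$h(\Sigma^\mu) = \lim_\alpha h(\Sigma_\alpha^{\mu_\alpha}),$$

and so

$$h(\Sigma^\mu) \le \lim_\alpha h_t(\Sigma_\alpha).$$

Since $\mu \in M(\Sigma)$ was arbitrary, $h_t(\Sigma) \le \lim_\alpha h_t(\Sigma_\alpha)$.

Corollary 4.7.1 If $\Sigma = \bigotimes_\alpha \Sigma_\alpha$, then $h_t(\Sigma) = \sum_\alpha h_t(\Sigma_\alpha)$.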

Remark It follows easily now from Furstenberg's theorem (Section II.6) that the entropy of a minimal distal system is zero (Exercise 21). This can also be proved directly in the case of a metric space (Exercise 22).

For classical dynamical systems we have the additional concepts of subsystems and sums. Let us look now at the corresponding entropy relations. The first result again follows directly from Definition 4.3.

Theorem 4.8 If $\Sigma_1$ is a subsystem of $\Sigma_2$, then $h_t(\Sigma_1) \le h_t(\Sigma_2)$.

Proof Recall that $\Sigma_1$ is a subsystem of $\Sigma_2$ if there exists a continuous monomorphism $\psi: X_1 \to X_2$ such that $\psi\sigma_1 = \sigma_2\psi$. Thus we may as well assume that $X_1 \subseteq X_2$ is a closed subset, $\mathcal{B}_1 = \{B \cap X_1 : B \in \mathcal{B}_2\}$, and $\sigma_1$ is the restriction of $\sigma_2$ to $X_1$. If $\mu \in M(\Sigma_1)$, we can extend it to an invariant measure $\bar\mu$ on $X_2$ by setting

$$\bar\mu(B) = \mu(B \cap X_1).$$

It is easily verified (Exercise 17) that

$$h(\Sigma_1^\mu) = h(\Sigma_2^{\bar\mu}). \tag{53}$$

Hence

$$h_t(\Sigma_1) = \sup_{\mu \in M(\Sigma_1)} h(\Sigma_2^{\bar\mu}) \le h_t(\Sigma_2).$$

Now if $\Sigma = \Sigma_1 \oplus \Sigma_2$, then each of $\Sigma_1$ and $\Sigma_2$ is a subsystem of $\Sigma$. Thus

$$h_t(\Sigma) \ge \max\{h_t(\Sigma_1), h_t(\Sigma_2)\}. \tag{54}$$

It can be shown that the inequality in (54) can be replaced by equality. However, the proof is surprisingly involved, and we shall content ourselves with proving a related result.

Definition 4.4 Let $M_e(\Sigma)$ denote the set of ergodic measures $\mu \in M(\Sigma)$. The ergodic entropy $h_e(\Sigma)$ is defined by

$$h_e(\Sigma) = \sup_{\mu \in M_e(\Sigma)} h(\Sigma^\mu). \tag{55}$$

Proposition 4.10 If $\Sigma = \Sigma_1 \oplus \Sigma_2$ is the direct sum, then

$$h_e(\Sigma) = \max\{h_e(\Sigma_1), h_e(\Sigma_2)\}. \tag{56}$$

Proof Since $X_1$ and $X_2$ are $\sigma$-invariant subsets of $X = X_1 \cup X_2$, any ergodic measure $\mu$ on $X$ must be concentrated on either $X_1$ or $X_2$. It follows as in the proof of Theorem 4.8 that $h(\Sigma^\mu)$ is either $h(\Sigma_1^{\mu_1})$ or $h(\Sigma_2^{\mu_2})$. On the other hand, an ergodic measure on $X_i$ can be extended (in just one way) to an ergodic measure on $X$ as in the preceding proof. Equation (56) follows from these observations and (55).

Remarks

1 The set $M(\Sigma)$ is a weak*-compact, convex subset of the linear space $M(X) = C(X)^*$, and $M_e(\Sigma)$ is exactly the set of extreme points of $M(\Sigma)$ (Exercise II.8). Moreover, as we shall show next, $h(\Sigma^\mu)$ is an affine function of $\mu$ on $M(\Sigma)$. Thus it might reasonably be conjectured that $h_e(\Sigma) = h_t(\Sigma)$. This is true, for example, if the supremum in (49) is attained for some $\mu \in M(\Sigma)$ (Exercise 20).

2 The definition of ergodic entropy is motivated by the corresponding notion of ergodic capacity in information theory, where our definition of topological entropy corresponds to stationary capacity. (See, for example, Breiman [10], in which equality of the two types of capacity is shown for symbolic dynamical systems. Since such systems are group automorphisms, this case is covered by the above remark and a theorem of K. R. Berg in the next section.)

3 In [35], Jacobs gives an integral representation for $h(\Sigma^\mu)$ which would seem to yield the equality $h_e = h_t$. However, he is concerned with abstract systems $(X, \mathcal{B}, \mu, \varphi)$ with $\mu$ varying rather than with a topological setting. Thus there is in general no guarantee that his ergodic measures belong to $M_e(\Sigma)$.

4 With the historical definition of $h_t$ (see [4]) the proof of (56) for $h_t$ is fairly straightforward. It follows, of course, from the Goodwyn-Dinaburg-Goodman theorem that it also holds for our definition. We leave this for the interested reader to pursue, but remark that Goodwyn [27] has shown (56) does not extend to infinite sums.

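Proposition 4.11 Let $\Sigma$ be a classical dynamical system. Then $h(\Sigma^\mu)$ is an affine function of $\mu \in M(\Sigma)$.

Proof We follow [10]. For each finite subalgebra $\mathcal{A}$ of $\mathcal{B}$, let us denote by $\mathcal{A}^n$ the algebra $\bigvee_{k=0}^{n-1} \sigma^{-k}\mathcal{A}$. For $\mu_1, \mu_2 \in M(\Sigma)$, $s, t > 0$, $s + t = 1$, we have

$$\begin{aligned}
h(\sigma, \mathcal{A}, s\mu_1 + t\mu_2)
&= -\lim_{n\to\infty} \frac{1}{n} \sum_{A \in \mathcal{A}^n} [s\mu_1(A) + t\mu_2(A)] \log[s\mu_1(A) + t\mu_2(A)] \\
&= -\lim_{n\to\infty} \frac{1}{n} \sum_{A \in \mathcal{A}^n} s\mu_1(A) \log \mu_1(A)
 - \lim_{n\to\infty} \frac{1}{n} \sum_{A \in \mathcal{A}^n} t\mu_2(A) \log \mu_2(A) \\
&\quad - \lim_{n\to\infty} \frac{1}{n} \sum_{A \in \mathcal{A}^n} s\mu_1(A) \log[s + t\mu_2(A)/\mu_1(A)]
 - \lim_{n\to\infty} \frac{1}{n} \sum_{A \in \mathcal{A}^n} t\mu_2(A) \log[t + s\mu_1(A)/\mu_2(A)].
\end{aligned} \tag{57}$$

From the elementary inequalities

$$0 \le \log(1 + x) \le x \qquad (x > 0),$$

we have

$$\log s \le \log(s + tu) \le \log s + tu/s \qquad (u > 0).$$

Setting $u = \mu_2(A)/\mu_1(A)$, multiplying by $(s/n)\,\mu_1(A)$, and adding on $A$ gives

$$\frac{s}{n} \log s \le \frac{1}{n} \sum_{A \in \mathcal{A}^n} s\mu_1(A) \log[s + t\mu_2(A)/\mu_1(A)] \le \frac{s}{n} \log s + \frac{t}{n}.$$

It follows that the third term on the right in (57), and likewise the fourth term, are equal to zero. Thus (57) becomes

$$h(\sigma, \mathcal{A}, s\mu_1 + t\mu_2) = s\,h(\sigma, \mathcal{A}, \mu_1) + t\,h(\sigma, \mathcal{A}, \mu_2).$$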

Taking the supremum over $\mathcal{A}$ yields the desired result.

We conclude this section by noting that yet another entropy invariant for $\Sigma$ can be defined. In Section III.6 we defined an affine system $\hat\Sigma$ associated with the classical dynamical system $\Sigma$. Clearly, $\Sigma_1 \overset{t}{\cong} \Sigma_2$ implies $\hat\Sigma_1 \cong \hat\Sigma_2$. Thus the affine entropy of $\Sigma$,

$$h_a(\Sigma) = h_t(\hat\Sigma), \tag{58}$$

is an isomorphism invariant. In the next section we shall see that $h_t(\hat\Sigma) = h_e(\hat\Sigma) = h(\hat\Sigma^m)$, where $m$ is Haar measure. It is not clear, in general, how $h_a(\Sigma)$ compares with $h_t(\Sigma)$ and $h_e(\Sigma)$. However, according to Theorem 3.9 and Berg's theorem, they must all coincide for systems with quasidiscrete spectrum.
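The estimate behind Proposition 4.11 can be illustrated numerically on a single finite partition. The sketch below is only an illustration under stated assumptions: the distributions are random, the cell count 64 and weight s = 0.3 are arbitrary, and the helper names are hypothetical. It checks the standard fact that the entropy of a mixture exceeds the mixture of the entropies by at most $-s\log s - t\log t$, a bound that does not depend on the number of cells, so that after division by $n$ the excess vanishes in the limit.

```python
import math
import random


def entropy(p):
    """Shannon entropy -sum p log p (natural logarithm), skipping empty cells."""
    return -sum(x * math.log(x) for x in p if x > 0)


def random_distribution(size, rng):
    """A random probability vector with strictly positive entries."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def affinity_gap(mu1, mu2, s):
    """H(s*mu1 + t*mu2) - (s*H(mu1) + t*H(mu2)), where t = 1 - s."""
    t = 1.0 - s
    mixture = [s * a + t * b for a, b in zip(mu1, mu2)]
    return entropy(mixture) - (s * entropy(mu1) + t * entropy(mu2))


s = 0.3
rng = random.Random(0)
gaps = [
    affinity_gap(random_distribution(64, rng), random_distribution(64, rng), s)
    for _ in range(100)
]
# The gap is bounded by the entropy of the two-point distribution (s, t).
bound = entropy([s, 1.0 - s])
```

Every entry of `gaps` lies between 0 and `bound`, whatever the number of cells, which is the finite-partition counterpart of the vanishing of the third and fourth terms in (57).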


5. ENTROPY OF AFFINE TRANSFORMATIONS

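Let $\varphi(x) = \tau(x) + a$ be an affine transformation on the compact abelian group $G$ as in Chapter III. We denote as usual the Haar measure on $G$ by $m$. We show first that the various definitions of entropy give the same value $h(\Sigma)$ for $\Sigma = (G, \tau)$, and for $\Phi = (G, \varphi)$ when $h(\Sigma) < \infty$.

Theorem 4.9 (K. R. Berg) Let $\tau$ be a continuous epimorphism of $G$, and let $\varphi(x) = \tau(x) + a$. Denote $\Sigma = (G, \tau)$ and $\Phi = (G, \varphi)$.

(A) For each $\mu \in M(\Sigma)$, $h(\Sigma^\mu) \le h(\Sigma^m)$. Thus $h_t(\Sigma) = h_e(\Sigma) = h(\Sigma^m)$. Denote their common value by $h(\Sigma)$.
(B) If $h(\Sigma) < \infty$ and $\mu \in M(\Phi)$, then $h(\Phi^\mu) \le h(\Phi^m)$, and so $h_t(\Phi) = h_e(\Phi) = h(\Phi^m)$.

Proof In [6], Berg proved $h(\Sigma^\mu) \le h(\Sigma^m)$ by essentially the argument given here. We prove (B) first. Let $\mu \in M(\Phi)$. Define the subalgebras $\mathcal{B}_1$, $\mathcal{B}_2$ of $\mathcal{B} \times \mathcal{B}$ by $\mathcal{B}_j = \pi_j^{-1}(\mathcal{B})$, where $\pi_1(x, y) = x$ and $\pi_2(x, y) = x + y$. The mapping $\rho: G \times G \to G \times G$ defined by $\rho(x, y) = (x, x + y)$ is a homeomorphism. Thus $\mathcal{B}_1 \vee \mathcal{B}_2 = \rho^{-1}(\mathcal{B} \times \mathcal{B}) = \mathcal{B} \times \mathcal{B}$. Moreover,

$$\pi_1 : (G \times G, \mathcal{B}_1, m \times \mu, \tau \times \varphi) \to (G, \mathcal{B}, m, \tau)$$

and

$$\pi_2 : (G \times G, \mathcal{B}_2, m \times \mu, \tau \times \varphi) \to (G, \mathcal{B}, m, \varphi)$$

are isomorphisms. This is obvious for $\pi_1$ and follows for $\pi_2$ from

$$(m \times \mu)(\pi_2^{-1}B) = \int_G m(B - y)\,\mu(dy) = m(B) \qquad (B \in \mathcal{B})$$

and

$$\pi_2(\tau \times \varphi)(x, y) = \tau(x) + \varphi(y) = \varphi(x + y) = \varphi\pi_2(x, y).$$

Now let $\mathcal{A}_1 \subseteq \mathcal{B}_1$ and $\mathcal{A}_2 \subseteq \mathcal{B}_2$ be finite subalgebras. Then $\mathcal{A}_1 \vee \mathcal{A}_2$ is a finite subalgebra of $\mathcal{B}_1 \vee \mathcal{B}_2 = \mathcal{B} \times \mathcal{B}$, and the union of all such algebras $\mathcal{A}_1 \vee \mathcal{A}_2$ is dense in $\mathcal{B} \times \mathcal{B}$. For each choice of $\mathcal{A}_1$ and $\mathcal{A}_2$ we have by subadditivity of entropy (cf. Exercise 11) that

$$h(\tau \times \varphi, \mathcal{A}_1 \vee \mathcal{A}_2, m \times \mu) \le h(\tau \times \varphi, \mathcal{A}_1, m \times \mu) + h(\tau \times \varphi, \mathcal{A}_2, m \times \mu).$$

It follows from Lemma 4.2 and the above observations that

$$h(\Sigma^m \otimes \Phi^\mu) \le h(\Sigma^m) + h(\Phi^m).$$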

According to Corollary 4.4.2, the left side is equal to $h(\Sigma^m) + h(\Phi^\mu)$. If $h(\Sigma^m) < \infty$, we can cancel it to obtain $h(\Phi^\mu) \le h(\Phi^m)$, as asserted. Finally, setting $a = 0$ makes $\Phi = \Sigma$, so that the above inequality becomes

$$h(\Sigma^m) + h(\Sigma^\mu) \le 2h(\Sigma^m),$$

which gives $h(\Sigma^\mu) \le h(\Sigma^m)$ regardless of whether or not $h(\Sigma^m) < \infty$. The equality of $h_e(\Phi)$ and $h_t(\Phi)$ follows from $h_t(\Phi) = h(\Phi^m)$, as in Exercise 20.

We turn now to the task of calculating the entropy of $\Phi$. According to the results of Section II.4, it is sufficient in a sense to look at ergodic automorphisms and transformations with quasiperiodic spectrum. In view of Proposition 3.7, the following theorem is a slight generalization of a result of Rohlin [52] on ergodic automorphisms of compact metrizable abelian groups.

Theorem 4.10 (Rohlin) If $\tau$ is an ergodic automorphism of $G$ and $\Sigma = (G, \tau)$ is monothetic, then $\Sigma$ is a Kolmogorov system. In particular, $h(\Sigma_1) > 0$ for every nontrivial factor $\Sigma_1$ of $\Sigma$.

Proof According to Exercise 16(b) and Theorem 3.8, it is sufficient to show that Oe has a a-algebra gosatisfying (i) (b- YBO) E g o (ii) 4-"(B0)= N , and (iii) @(U,"=,( ~ " G J = ~ )g.

n,"=l

3

Recall that = (H, ?), where H is a closed subgroup of the dual of K," = - oo K , (the direct product or complete direct sum) and f = a* is the adjoint of the shift transformation on K,". In fact, H is the annihilator of the group A E Kdm, defined as in Section 111.4. Thus fi = &"/A. Now let To = {y E K d W: yn = 1 (n > 0)). It is easily seen that

x,"=

(i') o(T,) G T o , ,"=, o"(To) = { e } (e being the identity on KdW), (iii') (ii') o-"(ro)+ A spans &(kdU), (iv') To n A = {e}.

0

The proof is completed by letting $\mathcal{B}_0$ be the smallest $\sigma$-algebra on which all the functions on $H$ determined by elements of $T_0$ are measurable.

Theorem 4.11 (Seethoff) Let $\Sigma = (G, \tau)$, where $\tau$ is a continuous epimorphism of the compact abelian metric group $G$. Then $h(\Sigma) = 0$ iff $\Sigma$ has quasiperiodic spectrum.

Proof Suppose that $\Sigma$ has quasiperiodic spectrum. Then $\Sigma$ is distal according to Theorem 3.6. From the Remark following Corollary 4.7.1 it follows that $h(\Sigma) = 0$. Alternatively, $h(\Sigma) = 0$ may be proved following the same steps as in the proof of Theorem 3.6. For this purpose, we need in Step II the fact that $\tau$ on $G$ has entropy zero when the restriction $\tau_H$ of $\tau$ to $H \subseteq G$ and the factor transformation $\tau/H$ on $G/H$ have zero entropy. In fact (Exercise 24),

$$h(\Sigma) = h(\Sigma_H) + h(\Sigma/H). \tag{59}$$

Conversely, suppose $\Sigma$ does not have quasidiscrete spectrum. Then the group $H = A'$ of Theorem 3.7 is nontrivial. It follows from Theorem 4.10 that $h(\Sigma_H) > 0$. Since $\Sigma_H$ is a subsystem of $\Sigma$, we have by Theorems 4.8 and 4.9 that

$$h(\Sigma) = h_t(\Sigma) \ge h_t(\Sigma_H) > 0.$$

Theorem 4.12 Let $\Phi$ be an ergodic affine system with quasidiscrete spectrum. Then $h(\Phi) = 0$.

Proof Either of the alternative proofs sketched above for $\Sigma$ will suffice. A third possibility is presented by Theorem 3.5, where a direct calculation may be made for $h(\Phi)$. Note that metrizability of $G$ is not needed.

Corollary 4.12.1 Let $\Phi$ be a totally ergodic (totally minimal) abstract (classical) dynamical system with quasidiscrete spectrum. Then $h(\Phi) = 0$ ($h_t(\Phi) = 0$).

6. McMILLAN'S THEOREM AND ENTROPY OF INDUCED SYSTEMS

According to Theorem 4.1 and Proposition 4.5, we have for any dynamical system $\Phi = (X, \mathcal{B}, \mu, \varphi)$ and any finite $\mathcal{A} \subseteq \mathcal{B}$

and


(where the empty join in the first term is taken to be the trivial algebra $\mathcal{N} = \{\emptyset, X\}$). According to Lemma 4.1,

a.e. From the ergodic theorem,

converges a.e. to a constant, if $\Phi$ is ergodic. In fact, this constant is $\int_X g\,d\mu = h(\varphi, \mathcal{A})$. We shall show that

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{k=0}^{n-1} \varphi^{-k}\mathcal{A}\Big) = h(\varphi, \mathcal{A}) \tag{60}$$

in $L_1(X, \mathcal{B}, \mu)$. This result is due to McMillan [43] and proves to be essential in the calculation of the entropy of induced systems in the sense of Section I.6. It should be noted that it has been shown by Breiman [9] that (60) holds a.e. as well. We will not need this improvement of McMillan's theorem.

Following Halmos [33], we begin by showing that the convergence $I(\mathcal{A} \mid \mathcal{C}_n) \to I(\mathcal{A} \mid \mathcal{C})$ in Lemma 4.1 holds in the norm of $L_1$. This will follow by a standard argument on uniform integrability from the following lemma. Recall that a sequence $\{f_n\}$ of measurable functions is uniformly integrable if

$$\lim_{t\to\infty} \sup_n \int_{\{|f_n| \ge t\}} |f_n|\,d\mu = 0.$$

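Lemma 4.3 The sequence $I(\mathcal{A} \mid \mathcal{C}_n)$ is uniformly integrable for each choice of $\mathcal{A}$ and $\mathcal{C}_n$ ($n = 1, 2, \ldots$).

Proof Let $\mathcal{A}$ have atoms $A_1, \ldots, A_N$, and let $r$, $s$ be real numbers with $0 < r \le s$. Set

$$D_n = \{x \in X : r \le I(\mathcal{A} \mid \mathcal{C}_n, x) \le s\}$$

and

$$C_{n,j} = \{x \in X : e^{-s} \le E_{\mathcal{C}_n}\chi_{A_j} \le e^{-r}\}$$

for each $j = 1, \ldots, N$; $n = 1, 2, \ldots$. Thus $A_j \cap D_n = A_j \cap C_{n,j}$, and $C_{n,j} \in \mathcal{C}_n$, for each $n$, $j$. It follows that

$$\mu(A_j \cap D_n) = \int_{C_{n,j}} \chi_{A_j}\,d\mu = \int_{C_{n,j}} E_{\mathcal{C}_n}\chi_{A_j}\,d\mu \le e^{-r},$$

and so

$$\int_{A_j \cap D_n} I(\mathcal{A} \mid \mathcal{C}_n)\,d\mu \le s\,\mu(A_j \cap D_n) \le s e^{-r}.$$

Summing on $j$ gives

$$\int_{D_n} I(\mathcal{A} \mid \mathcal{C}_n)\,d\mu \le N s e^{-r}. \tag{61}$$

Now set $r = t + k$, $s = t + k + 1$ for each $k = 0, 1, 2, \ldots$ and any real $t > 0$ in (61). Adding on $k$ gives

$$\int_{\{x : I(\mathcal{A} \mid \mathcal{C}_n, x) \ge t\}} I(\mathcal{A} \mid \mathcal{C}_n)\,d\mu \le N \sum_{k=0}^{\infty} (t + k + 1)\,e^{-(t+k)}. \tag{62}$$

Since the right side of (62) tends to zero as $t$ goes to $\infty$, the proof is complete.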
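The last step of the proof of Lemma 4.3 rests on the behaviour of the series on the right side of (62). The following sketch, with hypothetical helper names and an arbitrary choice N = 4, compares a truncated sum with the closed form obtained from the geometric series (with $q = e^{-1}$, $\sum_k (t+1+k)q^k = (t+1)/(1-q) + q/(1-q)^2$) and shows the bound decaying as $t$ grows.

```python
import math


def tail_bound(t, n_atoms, terms=2000):
    """Truncated value of n_atoms * sum_{k>=0} (t + k + 1) * exp(-(t + k))."""
    return n_atoms * sum((t + k + 1) * math.exp(-(t + k)) for k in range(terms))


def tail_bound_closed(t, n_atoms):
    """Closed form of the same series, using the geometric series with q = 1/e."""
    q = math.exp(-1)
    return n_atoms * math.exp(-t) * ((t + 1) / (1 - q) + q / (1 - q) ** 2)


# The bound is independent of n and decreases to zero as t grows.
values = [tail_bound_closed(t, 4) for t in (1.0, 5.0, 10.0, 20.0)]
```

Because the bound does not involve $n$, the same $t$ works for every member of the sequence, which is exactly what uniform integrability requires.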

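Theorem 4.13 (McMillan) Let $\Phi = (X, \mathcal{B}, \mu, \varphi)$ be an ergodic abstract dynamical system, and let $\mathcal{A} \subseteq \mathcal{B}$ be a finite subalgebra. Then

$$\lim_{n\to\infty} \frac{1}{n}\, I\Big(\bigvee_{k=0}^{n-1} \varphi^{-k}\mathcal{A}\Big) = h(\varphi, \mathcal{A})$$

in the norm of $L_1(X, \mathcal{B}, \mu)$.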

Proof By uniform integrability and almost everywhere convergence, we have

(63) n-

m

in the norm of L , . Again letting

and setting h = h(4, d),


we have

The first term tends to zero by (63) and the regularity of Cesàro sums, and the second term tends to zero by the mean ergodic theorem.

Let $\mathcal{A} \subseteq \mathcal{B}$ be as above. For each $Z \in \mathcal{B}$ with $\mu(Z) > 0$ we let $Z \cap \mathcal{A}$ denote the algebra of sets $Z \cap A$ with $A \in \mathcal{A}$, and define

$$H(Z \cap \mathcal{A}) = -\sum_{A \in \mathcal{A}} \mu(Z \cap A) \log \mu(Z \cap A).$$

Let d,denote the algebra

VyZ;

n2no,ZEdn*

<E.

(64)

Proof Choose no such that

For such an nand any Z

E d,,, Z n

A = A or @ for each A

E

d,,, so that

Remark The function $f(t) = -t \log t$ is monotone increasing on $[0, e^{-1}]$, so that whenever $Z$ and $\mathcal{A}$ are such that $\mu(Z \cap A) \le e^{-1}$ for each $A \in \mathcal{A}$, we have for each measurable set $W \subseteq Z$ that $H(W \cap \mathcal{A}) \le H(Z \cap \mathcal{A})$.
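The monotonicity used in this Remark is easy to check numerically. The sketch below evaluates $f(t) = -t\log t$ on a grid in $[0, e^{-1}]$; the grid size of 1000 is an arbitrary illustrative choice, and the convention $f(0) = 0$ is assumed.

```python
import math


def f(t):
    """f(t) = -t log t, with the usual convention f(0) = 0."""
    return 0.0 if t == 0 else -t * math.log(t)


# Evenly spaced grid on [0, 1/e].
grid = [k / 1000 * math.exp(-1) for k in range(1001)]
values = [f(t) for t in grid]
is_increasing = all(a < b for a, b in zip(values, values[1:]))

# Beyond 1/e the function decreases again, so the restriction matters.
past_the_peak = f(0.5) < f(math.exp(-1))
```

The derivative $f'(t) = -\log t - 1$ is positive exactly when $t < e^{-1}$, which is why the hypothesis $\mu(Z \cap A) \le e^{-1}$ appears in the Remark.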

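Corollary 4.13.2 For each $\varepsilon > 0$ and $Z \in \mathcal{B}$ with $\mu(Z) < \min\{\varepsilon, e^{-1}\}$ there exists an integer $n_1 = n_1(\varepsilon, \mathcal{A}, Z)$ such that

$$n \ge n_1,\ W \subseteq Z \ \Rightarrow\ H(W \cap \mathcal{A}_n) < n\varepsilon. \tag{65}$$

Proof Let $\mathcal{C} = \{Z, X - Z\}$. From the previous corollary and subsequent remark, and the subadditivity of entropy, we have

$$\begin{aligned}
n \ge n_0(\varepsilon, \mathcal{A} \vee \mathcal{C}) \ \Rightarrow\ \frac{1}{n} H(W \cap \mathcal{A}_n)
&\le \frac{1}{n} H(Z \cap \mathcal{A}_n) \le \frac{1}{n} H(Z \cap (\mathcal{A} \vee \mathcal{C})_n) \\
&\le \mu(Z)\, h(\varphi, \mathcal{A} \vee \mathcal{C}) + \varepsilon \\
&\le \varepsilon[h(\varphi, \mathcal{A}) + h(\varphi, \mathcal{C})] + \varepsilon \\
&\le \varepsilon[h(\varphi, \mathcal{A}) + \log 2 + 1].
\end{aligned}$$

Thus we can take

$$n_1(\varepsilon, \mathcal{A}, Z) = n_0(\varepsilon/[h(\varphi, \mathcal{A}) + \log 2 + 1],\ \mathcal{A} \vee \mathcal{C}).$$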

Now we are prepared to use Corollaries 4.13.1 and 4.13.2 to evaluate the entropy $h(\Phi_Y)$ of the induced system $\Phi_Y$ as defined in Section I.6. First, however, it is convenient to make a slight extension of these results. Up until now we have considered only finite algebras $\mathcal{A}$ in the construction of entropy. However, it is not hard to show (Exercise 26) that the same results, including McMillan's theorem and its corollaries, are valid for $\sigma$-algebras $\mathcal{A}$ generated by countable partitions of $X$ and satisfying

$$H(\mathcal{A}) = -\sum_{A \in \mathcal{A}} \mu(A) \log \mu(A) < \infty,$$

where the sum now is an infinite series. In the proof of Theorem 1.9 we introduced the sets

$$A_n = \{x \in Y : F(x) = n\},$$

where $F(x) = n_Y(x)$ is the smallest positive integer $n$ such that $\varphi^n(x) \in Y$. Let $\mathcal{A}_0$ be the $\sigma$-algebra whose atoms are the $A_n$ so defined and the single set $X - Y$.

Lemma 4.4

H ( 2 ’ ) < 00 and


it follows from elementary considerations of positive-term series that the finiteness of H ( Z o ) follows from the convergence of the infinite series in (66). The proof of (66) is left to the exercises. Remark The equality in (66) is usually known as Kac’s theorem and is a refinement of the recurrence theorem (Theorem 1.7). It says that the “mean recurrence time” of the set Y is inversely proportional to the measure of Y,
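Kac's theorem can be observed numerically. The sketch below is a hypothetical illustration, not part of the text's argument: it takes the rotation $x \mapsto x + \alpha \pmod 1$ on the circle with $\alpha = \sqrt{2} - 1$ (ergodic for Lebesgue measure because $\alpha$ is irrational), the set $Y = [0, 0.1)$, and 200,000 iterations, and estimates the mean time between successive visits to $Y$, which should be close to $1/\mu(Y) = 10$.

```python
import math


def mean_return_time(alpha, length, steps):
    """Average gap between successive visits of x -> x + alpha (mod 1) to [0, length)."""
    x = 0.0
    visit_times = []
    for n in range(steps):
        if x < length:
            visit_times.append(n)
        x = (x + alpha) % 1.0
    gaps = [b - a for a, b in zip(visit_times, visit_times[1:])]
    return sum(gaps) / len(gaps)


alpha = math.sqrt(2) - 1  # an irrational rotation number
estimate = mean_return_time(alpha, 0.1, 200_000)
```

Here the orbit average stands in for the integral of $F$ over $Y$ divided by $\mu(Y)$; the agreement with $1/\mu(Y)$ relies on ergodicity, as in the statement of the theorem.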

Let us define a sequence p n of integer-valued functions on Y by n- 1

=

C F(4Ykx)*

(671

k=O

Thus p , ( x ) is the number of steps in the sequence 4 k until ~ the nth return to the set Y. For each a-algebra d 5 Y n 98 of measurable subsets of Y, let us denote by Z the a-algebra generated by d and the single set X Y. Let d be such a a-algebra generated by a countable partition of Y, and assume that d o= Y n go E d . Thus each atom of d ois a union of atoms of d, and each atom of d is entirely contained in some atom of d o .Moreover, each atom of d,= v&;4;’d is contained in an atom of dn0 = V;!h4;kdo. Clearly, p n is constant on these atoms. Thus

-

(a) pn is constant on the atoms of d,.Let A be such an atom. It is only slightly less obvious that (fl) p , ( A ) = m, A E d,=sA is an atom of g,,,= VrZi 4-kg.

Theorem 4.14 (Abramou) Let O = ( X , a, p, 4 ) be an ergodic abstract dynamicalsystem, and suppose Y E 33 with p( Y ) > 0. Then h(O)= h(OY)p( Y). Proof [l] According to Proposition 1.12, O y is ergodic. From the ergodic theorem and Lemma 4.4, 1 1 lim - p , ( x ) = Fdp=a.e. on Y. n-co n P(Y) Y P( Y) It follows that (l/n)p, + 1/p( Y) almost uniformly. That is, for each E > 0 there is a measurable set Y, with p ( Y Y,) < E and (l/n)p,-+l/p(Y) uniformly on Y,. Let d be a a-algebra of measurable subsets of Y, generated by a countable partition of Y and containing d o .Choose N such that n 2 N implies (i) I (l/n)pn(x) - l / c l ( Y )I < E for x E Y, - h(4, a ) p ( B ) I < EforallBE = 4-‘s (ii) ((l/n)H(Bn d,) (Corollary 4.13.1), ~

9

snv;;


V;:

(iii)


1 (l/n)H,(B n d,,) - h(&,

4Fkd (Corollary 4.13.1),

d )p ( B ) / p (Y )1 < E for all B E d n=

(iv) (l/n)H(B n d,,) < E for all B (v) l l n < E,

E

Y

- r,

(Corollary 4.13.2), and

where H , is calculated by replacing p ( A ) by p ( A ) / p (Y ) in H . For each n 2 N let Y, be the union of those atoms of d nthat intersect Y, in a set of positive measure. According to (a), inequality (i) is valid for all x E Y,. Let m, = min,, y. p,,(x), m, = max,. y, p,,(x). (Note that m 2 co because of (i) and the previous sentence.) According to (/I), Y, n am1 E Y, n d,,E Y, n zm,. Thus

-=

H,( Y, n Sml) I H,( Y, n d,,) I Hx( Y, n a m 2 ) * Now note that H x ( Y, n d n ) = -

C

pu(A

(68)

Y,) log P ( An Yn)

A E d’”

= P ( Y ) H d Y n n dn)+

AY,) log ~

( y )

becomes H ~ ( Yn, a m l ) 5 P ( Y )H ~ ( Yn, an)+ ~

( 5log) P ( Y ) 5 H A Y , n a m z ) * (69)

From (69) and (i) we have

Likewise, and by (ii),


Let $9 = { Y , X

-


Y}. Then

H,( Y

n Jm2 I)H,( Y n (2v

+

= H,( Y, n Zm2)H,((Y

-

%)rn2)

Y,) n Srn2).

Thus by (iv)

H,( Y, n iZmJ I H,( Y Noting that m,,m, + 03 as n-,

03,

n Sm2) I H,( Y, n

am2) + E.

(72)

it is seen from (70)-(72) that

Since we have by (73), (ii), (iii), and (iv) similar statements regarding the differences of consecutive terms of

it follows that

Ih(4, a- P( Y ) h(4Y 4 I 7

for some constant M independent of

E.

< ME

Since E was arbitrary,

h(49 2)= P ( Y ) h(4Y

9

(74)

Clearly, the union of all a-algebras d of the form described is dense in Y n W. Since 0 is ergodic, it follows easily that the union of the associated a-algebras 2 is dense in W. Thus, by Lemma 4.2, the statement of the theorem follows from (74). I

EXERCISES Isomorphism 1. (a) Linear operators T, and T2 on Hilbert spaces H, and H, are isomorphic if there exists a linear operator S: H,+ H 2 such that (i) S is invertible, (ii) llSfII = l l f l l for allf E H,, and (iii) the diagram


H 2 A H, commutes. If m 1 2 m,, show that T+,and T+2are isomorphic. (b) Affine transformations 4, and b2 on compact abelian groups G , and G , are g-isomorphic (bl a &) if there exists an affine transformation I,$: G , --* G, such that (i) I,$ is invertible, (ii) I,$ is continuous, and (5) the diagram 41 Gl Gl

-

I* -i. 42

G,

G2

commutes. Show that O1 2 @, implies J 1 A 6,. In particular, any two choices of a E G in Eq. (23) of Section 111.6 yield g-isomorphic affine transformations $. Conditional Expectation 2.

Prove Proposition 4.1. 3. (a) The conditional expectation off given V is usually defined in the following way. Show that the two definitions are equivalent.

Definition 4.1* For each f E Ll(X, g,p), (i) g is %?-measurable, and (ii) Scg dp = jcf dp for all C E V.

(b) For a finite algebra sd with atoms A , ,

f

= g iff

. . ., A , , show that

4. Let I,$ be a measure-preserving transformation of the normalized measure space (Xl, B1, pl)onto the normalized measure space ( X , , g,, p,). The two spaces may coincide. Show that the induced linear operator $ satisfies T$ T,* = q-'(.&*).


Thus the theory of conditional expectations is coextensive with the theory of noninvertible measure-preserving transformations. 5. Complete the proof of Proposition 4.3. Entropy

6. Show that F as defined by Eq. ( 5 ) satisfies the hypotheses of Proposition 4.4. 7. Complete the proof of Proposition 4.7.

8. (a) Show that H ( d I V )= 0 iff d 5 V. (b) For finite algebras d,d'c D and any 0-algebra V E D,

H ( d v d ' l W ) = H ( d l % ) + H ( d ' l d v V) IH ( d l V ) + H(d'1V). For any finite algebras d l ,d 2s D

(c)

a,)+ H ( d 1 Id,). Deduce from Exercise 8(b) that

(4 h(+?dl)5 9. (a)

h(49

d 2

C V * H ( d l v d 21%) = H ( d l 1%).

For any finite d l ,d , E W and any W E 9,

(b)

H ( d l v d2/% v d2) I H ( d l p).

For any finite d

(c)

c W and any k, m

k

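10. Show that $h(\varphi, \mathcal{A}) = h(\varphi, \varphi^{-1}\mathcal{A})$ and, if $\varphi$ is invertible, $h(\varphi, \mathcal{A}) = h(\varphi^{-1}, \mathcal{A})$.

11. Suppose that $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{B}$ are independent, that is, $\mu(A_1 \cap A_2) = \mu(A_1)\,\mu(A_2)$ for $A_1 \in \mathcal{A}_1$, $A_2 \in \mathcal{A}_2$. Show that

$$H(\mathcal{A}_1 \vee \mathcal{A}_2) = H(\mathcal{A}_1) + H(\mathcal{A}_2).$$

Conversely, if $\mathcal{A}_1$ and $\mathcal{A}_2$ are not independent, then

$$H(\mathcal{A}_1 \vee \mathcal{A}_2) < H(\mathcal{A}_1) + H(\mathcal{A}_2).$$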
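The two halves of Exercise 11 can be seen on a small example. In this sketch the marginal distributions and the dependent joint table are made-up illustrative numbers; the joint table is chosen to have the same marginals as the independent one.

```python
import math
from itertools import product


def entropy(p):
    """Shannon entropy -sum p log p (natural logarithm), skipping empty cells."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Measures of the atoms of two finite algebras.
p = [0.5, 0.3, 0.2]
q = [0.6, 0.4]

# Independent case: the atoms of the join have measure p_i * q_j.
independent_join = [a * b for a, b in product(p, q)]

# Dependent case: same marginals (rows sum to p, columns to q), not a product.
dependent_join = [0.4, 0.1,
                  0.1, 0.2,
                  0.1, 0.1]

h_sum = entropy(p) + entropy(q)
h_independent = entropy(independent_join)
h_dependent = entropy(dependent_join)
```

With these numbers `h_independent` agrees with `h_sum` up to rounding, while `h_dependent` is strictly smaller, in line with the strict inequality asserted for the dependent case.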

Calculation of Entropy

12. (Markov shift) Let $X = \bigotimes_{n=-\infty}^{\infty} X_n$, where $X_n = \{0, 1, \ldots, k - 1\}$, and define $\mathcal{B}$ and $\varphi$ as for the Bernoulli shift. A $\varphi$-invariant measure $\mu$ is defined on $(X, \mathcal{B})$ as follows. We suppose given nonnegative numbers $p_j$ ($j = 0, 1, \ldots, k - 1$) and $p_{ij}$ ($i, j = 0, 1, \ldots, k - 1$) such that

$$\sum_j p_j = 1, \qquad \sum_j p_{ij} = 1 \ \text{for each } i, \qquad \sum_i p_i p_{ij} = p_j \ \text{for each } j.$$

The measure $\mu$ is defined on cylinder sets by

$$\mu\{x : x_m = j_0,\ x_{m-1} = j_1,\ \ldots,\ x_{m-n} = j_n\} = p_{j_0}\, p_{j_0 j_1} \cdots p_{j_{n-1} j_n}.$$

This measure is then extended (as an inverse limit of measures on finite products) to a measure on $(X, \mathcal{B})$. Show, as for the Bernoulli shift, that the algebra $\mathcal{A}$ of cylinder sets based on $X_0$ is a generator, and that

$$h(\Phi) = h(\varphi, \mathcal{A}) = -\sum_i p_i \sum_j p_{ij} \log p_{ij}.$$
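The entropy formula of Exercise 12 is straightforward to evaluate. The sketch below uses a made-up two-state transition matrix with its stationary vector; the function names are hypothetical. It also checks the Bernoulli special case, in which every row of the matrix equals the stationary vector and the formula reduces to $-\sum_j p_j \log p_j$.

```python
import math


def is_stationary(p, P, tol=1e-12):
    """Check sum p = 1, each row of P sums to 1, and p P = p."""
    k = len(p)
    rows_ok = all(abs(sum(row) - 1.0) < tol for row in P)
    fixed = all(abs(sum(p[i] * P[i][j] for i in range(k)) - p[j]) < tol for j in range(k))
    return abs(sum(p) - 1.0) < tol and rows_ok and fixed


def markov_entropy(p, P):
    """-sum_i p_i sum_j p_ij log p_ij, skipping zero transition probabilities."""
    k = len(p)
    return -sum(
        p[i] * P[i][j] * math.log(P[i][j])
        for i in range(k)
        for j in range(k)
        if P[i][j] > 0
    )


# Illustrative two-state chain; p solves p P = p.
P = [[0.9, 0.1],
     [0.5, 0.5]]
p = [5 / 6, 1 / 6]

h_markov = markov_entropy(p, P)
# Bernoulli special case: rows of the matrix all equal the distribution.
h_bernoulli = markov_entropy([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```

For this chain `h_markov` is smaller than the entropy of the stationary vector itself, reflecting the dependence between successive coordinates.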

13. (Translations have zero enrropy) (a) For any finite d E 9l and any positive integer k,

(b) For any $k \in Z$, let $\Phi^k = (X, \mathcal{B}, \mu, \varphi^k)$. Then

$$h(\Phi^k) = |k|\,h(\Phi).$$

(c) If $X = K$ is the circle group, and if $\varphi$ is multiplication by a root of unity, then $\varphi^k = I$ for some $k$, and hence $h(\Phi) = 0$.
(d) If $X = K$, if $\varphi$ is rotation through an irrational multiple of $\pi$, and if $\mathcal{A}$ is the algebra whose atoms are the upper and lower half-circles, then $\mathcal{A}$ is a generator for $\Phi$, and $\bigvee_{n=1}^{\infty} \varphi^{-n}\mathcal{A} = \mathcal{B}$. Hence

$$h(\Phi) = H(\mathcal{A} \mid \mathcal{B}) = 0.$$

(e) If $X = K^n$ is the $n$-torus, and if $\varphi(x) = x + a$, then $h(\Phi) = 0$.


14. (a) If $X$ is a finite set, then $h(\Phi) = 0$.
(b) If $\Phi$ is ergodic and $h(\Phi) > 0$, then $X$ is nonatomic.

Deterministic and Nondeterministic Systems

15. (a) Suppose $h(\Phi) = 0$ and $A \in \mathcal{B}$. Let $\mathcal{A} = \{\emptyset, A, X - A, X\}$. Then $h(\varphi, \mathcal{A}) = 0$. Deduce from this and Exercise 8(a) that $A \in \varphi^{-1}(\mathcal{B})$, and hence that $\Phi$ is invertible.
(b) If $h(\Phi) \ne 0$, there exists a finite algebra $\mathcal{A}$ such that the factor $\Phi_{\mathcal{A}} = (X, \bigvee_{n=0}^{\infty} \varphi^{-n}\mathcal{A}, \mu, \varphi)$ of $\Phi$ is noninvertible.
(c) The system $\Phi$ is said to be deterministic if $\mathcal{A} \subseteq \bigvee_{n=1}^{\infty} \varphi^{-n}\mathcal{A}$ for each finite algebra $\mathcal{A} \subseteq \mathcal{B}$. Show that the following are equivalent:
(i) $h(\Phi) = 0$.
(ii) $\Phi$ is deterministic.
(iii) Every factor of $\Phi$ is invertible.

16. Let $\Phi = (X, \mathcal{B}, \mu, \varphi)$ be a dynamical system, and let $\mathcal{B}_d$ be the join of all $\varphi$-invariant $\sigma$-algebras $\mathcal{B}_0 \subseteq \mathcal{B}$ such that $\Phi_0 = (X, \mathcal{B}_0, \mu, \varphi)$ is deterministic (has entropy zero). Equivalently, let $\Phi_d$ be the sup of the deterministic factors of $\Phi$. If $\mathcal{B}_d = \mathcal{N}$ is trivial, $\Phi$ is completely nondeterministic.
(a) Show that $\Phi_d = (X, \mathcal{B}_d, \mu, \varphi)$ is deterministic.
(b) The following are equivalent.
(i) $\Phi$ is completely nondeterministic.
(ii) Every factor of $\Phi$ has positive entropy. (We say $\Phi$ has completely positive entropy.)
(iii) Every factor of $\Phi$ has a noninvertible factor.
(iv) The natural extension of $\Phi$ is a Kolmogorov system (see Exercise I.25).

Topological Entropy

17. Verify Eq. (53).

18. If $\Sigma = (X, \sigma)$ is a classical dynamical system, and if $\Sigma^k = (X, \sigma^k)$, show that $h_t(\Sigma^k) = |k|\,h_t(\Sigma)$.

19. Show directly from Definition 4.3 and Eq. (48) that the topological entropy of the $k$-shift is $\log k$.

20. Suppose that $h_t(\Sigma) = h(\Sigma^\mu)$ for some $\mu \in M(\Sigma)$.
(a) Show that the set

$$A = \{\nu \in M(\Sigma) : h(\Sigma^\nu) = h_t(\Sigma)\}$$

is weak*-compact, convex, and nonempty.
(b) Show that $A$ is an extremal set, in the sense that

$$s\nu_1 + t\nu_2 \in A,\quad 0 < s, t,\quad s + t = 1 \ \Rightarrow\ \nu_1, \nu_2 \in A.$$

(c) Conclude that any extreme point of A is an extreme point of M ( C ) , and hence that there exists an ergodic p with h,(C) = h(Cp). If a1is an algebraic extension of a, and if h,(@) = 0, show that ht(O1)= 0. (b) Show that the class %' of dynamical systems with zero entropy satisfies the hypotheses of Furstenberg's theorem (Section 11.6). Conclude that every minimal distal system has zero entropy. 22. ( W . Parry) Let ( X , d ) be a compact metric space, and let a be a distal homeomorphism of X . Let p E M ( C ) be arbitrary. For some z E X choose a sequence S, of spheres centered at z with diam(S,) + 0. (a) If 9, = S , {z}, show that 9,1 4. Hence p(&) + 0, and we may r" for some (arbitrary) r assume by passing to a subsequence that ~(9,) I satisfying 0 < r < l/e. (b) Let A , = (So S , ) u { z ) and A , = S, S n + l (n 2 1). Let d,= { A , , . . ., A,- S,}, so that d,t d,where d is the a-field generated by the countable number of atoms A , , A , , . . ., A , , . . . . Let x, y E X . If there exists a sequence nk of integers such that a"kx, a"xy E dk E Sk,show by distality that x = y . On the other hand, by minimality there exist for each k arbitrarily large integers n with O"XE s k . Conclude that V,m,, a - n d = 93. (c) Show from (b) that h ( P ) = limn+mh(a, d,, p). (d) Show that 21. (a)

-

,,

-

-

CS

h(a, d,, P) I H(dal,) I - P ( A , ) log ~ ( ' 0 ) -

C nrn log r

n= 1

m

+ C nrn = ( r - 1) log(1 - r ) + r/(1 - r)'. <

( r - 1) log( 1 - r )

n= 1

(e) Conclude from (c) and (d), the definition of h , , and the arbitrariness of r and p, that h,(C) = 0. ABue Systems

23. Complete the details of the proof of Theorem 4.9. 24. (a) Verify Eq. (59). (b) Carry out the alternative proof that h(C) = 0 when Z has quasiperiodic spectrum.


25. (Algebraic entropy) Let 7 be an epimorphism of the compact abelian group G. For any finite subgroup F of 6 define H ( F ) = log ord(F), where ord(F) denotes the order of F. (a) Show that H is subadditive on the lattice 9- of finite subgroups of 6, that is, H(F, v F , ) IH(Fl) H ( F , ) .

+

(b) Show that H(t*F) = H ( F ) for each F E Y. (c) Show that

and hence that

exists. We define the algebraic entropy of C = (G,T) by

h,(Z) =

SUP

h(r*, F).

F E F

(d) Show that h,(Z) = ha(X/Zo), where C/Xo = (G/Go, z/Go) is an algebraic factor of C, and Go is the connected component of 0 E G. (Hint: Use the fact that u 9- is the torsion subgroup of (e) (See [60].) If G is totally disconnected, then h,(C) = h,(Z) = h(C*), where m is the Haar measure on G. (f) Let C, be the subsystem (Go, T). Show that

c.)

h ( X ) = h,(C) + h(C0). Entropy of Induced Systems

Let @ = (X, 9, p, 4) be an abstract dynamical system. For each (necessarily countable) partition of X into sets An of positive measure, define

26.

H(d)= - C p(An) log p(An), n

where d the smallest o-algebra containing all the sets A , . Let 9- be the class of all such a-algebras d with H ( d ) < 00. (a) Show (e.g., as in Exercise 25) that

exists and is finite for each

E

9


(b) Define h*(@) = supd E F h(4, d )and show that h*(@) = h(@) as previously defined. (c) If d oE Y and Yo = {dE F : d oc d}, show that

h(@) = sup

h(4, d).

&€YO

27. (Kac’s theorem) Let @ = (X, B, p, 4) be an invertible, ergodic system, and let A and n A be as in Theorem 1.5. Define A , = {x E A : n A ( x )= n) A,, = 4k(A,)

(k = 0,1, ..., n - 1; n = 1, 2, ...).

Show that {Ank)is a partition of X, and hence that rI”(X)

p(dx) = p ( X ) = 1.

Show that 1 is an upper bound on the integral even if @ is not ergodic. Can the assumption of invertibility also be dropped? 28. Establish the validity of ( p ) in the proof of Theorem 4.13.

CHAPTER

V Bernoulli Systems and Ornstein’s Theorem

The brief history of ergodic theory has had three high points, each of which evoked a flurry of activity in the field. The first was the proof by G. D. Birkhoff in 1931 of the individual ergodic theorem, which could be said to be the beginning of the mathematical theory. The second was the introduction in 1957 by A. N. Kolmogorov of entropy into ergodic theory. We have already discussed this in Chapter IV. The third was the proof by D. S. Ornstein in 1970 [45, 46] that Bernoulli systems with the same entropy are isomorphic. A number of other results flow from this basic result and the tools developed to prove it. For example, it is now known that there are Kolmogorov systems (see Exercises I.25 and IV.16) with the same entropy which are not isomorphic to each other or to any Bernoulli systems. Moreover, effective methods have been developed by Ornstein and others to establish whether or not certain classical physical systems are Bernoulli. Many of these results are quite recent and not clearly understood yet.


1. DEFINITIONS

In Chapter I, Example 3, we defined the Bernoulli shift on $k$ points $\{1, 2, \ldots, k\}$ with distribution $\pi = (p_1, p_2, \ldots, p_k)$. Let us denote this system by $\Phi(\pi)$. It was shown in Chapter III that

$$h(\Phi(\pi)) = -\sum_{i=1}^{k} p_i \log p_i.$$

Thus, clearly, there are distinct distributions $\pi$ and $\pi'$ with $h(\Phi(\pi)) = h(\Phi(\pi'))$. The main result of this chapter asserts that such systems are m-isomorphic. Meshalkin [44] had already shown this to be true for certain nonidentical $\pi$ and $\pi'$ in 1959. A slightly more general result was given by Blum and Hanson in 1963 [8]. However, the complete solution had to wait for the powerful methods developed by Ornstein in 1970.

Throughout this chapter we shall be concerned with ergodic systems $\Phi = (X, \mathcal{B}, \mu, \varphi)$ with positive entropy. Thus it is without loss of generality (Exercise IV.14) that we assume $(X, \mathcal{B}, \mu)$ is nonatomic. We shall also assume throughout that $\Phi$ is invertible. The concepts and definitions to be introduced now are due to Ornstein and were developed specially for the proof of the isomorphism theorem. First, however, we recall and extend the definition of a Bernoulli system.

Let $\pi = (p_1, p_2, p_3, \ldots)$ be a probability distribution on the positive integers $Z^+$, that is, $p_i \ge 0$ ($i = 1, 2, \ldots$) and $\sum_i p_i = 1$. In order to include the finite case, we do not insist that all $p_i$ be strictly positive. In the usual way (cf. Example I.3), we define $(X, \mathcal{B}, \mu)$ to be the two-sided direct product of countably many copies of the finite measure space $(Z^+, 2^{Z^+}, \pi)$,

$$(X, \mathcal{B}, \mu) = \bigotimes_{n=-\infty}^{\infty} (Z^+, 2^{Z^+}, \pi).$$

The shift $\sigma$ is defined on $X$ by

$$\sigma(x) = y \quad \text{with} \quad y_n = x_{n+1},$$

and is an invertible measure-preserving transformation on $(X, \mathcal{B}, \mu)$.

Definition 5.1 The dynamical system $\Phi = (X, \mathcal{B}, \mu, \sigma)$ is called the Bernoulli system (or Bernoulli shift) with distribution $\pi$, and will be denoted $\Phi(\pi)$.
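That distinct distributions can share the same entropy is easy to confirm by direct computation. The pair below, a uniform distribution on four points and a five-point distribution, is an illustrative choice; entropies are computed in bits (base 2) only so that the arithmetic is exact.

```python
import math


def bernoulli_entropy_bits(pi):
    """Entropy -sum p_i log2 p_i of the Bernoulli shift with distribution pi, in bits."""
    return -sum(p * math.log2(p) for p in pi if p > 0)


pi_a = [1 / 4, 1 / 4, 1 / 4, 1 / 4]
pi_b = [1 / 2, 1 / 8, 1 / 8, 1 / 8, 1 / 8]

h_a = bernoulli_entropy_bits(pi_a)
h_b = bernoulli_entropy_bits(pi_b)
```

Both values equal 2 bits: the first is $4 \cdot \tfrac14 \cdot 2$ and the second is $\tfrac12 \cdot 1 + 4 \cdot \tfrac18 \cdot 3$, although the two shifts act on alphabets of different sizes.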

Remark With our understanding that equivalent systems are to be identified, the preceding definition is clearly equivalent to the following one.

Definition 5.1' $\Phi = (X, \mathcal{B}, \mu, \varphi)$ is a Bernoulli system with distribution $\pi$ provided that there exists a countable partition $P = (P_1, P_2, P_3, \ldots)$ of $X$ such that

(i) each $P_k \in \mathcal{B}$,
(ii) $\mu(P_k) = p_k$ for each $k$,
(iii) $\bigcup_{n=-\infty}^{\infty} \varphi^n(P)$ generates $\mathcal{B}$, and
(iv) $\mu\big(\bigcap_{i=1}^{l} \varphi^{n_i}(P_{k_i})\big) = \prod_{i=1}^{l} p_{k_i}$ for each choice of $l$, $k_1, \ldots, k_l$, and $n_1 < n_2 < \cdots < n_l$.

Throughout this chapter a partition will mean an ordered disjoint sequence of measurable sets with union $X$. Thus the (finite) partitions $P = (P_1, X - P_1)$ and $P' = (X - P_1, P_1)$ are to be considered as different partitions. As usual we shall refer to those elements of $P$ having positive measure as the atoms of $P$. If $P$ has only a finite number of atoms, it is said to be a finite partition. Associated with each partition $P$ of $X$ is the $\sigma$-algebra $\mathcal{B}(P) \subseteq \mathcal{B}$ generated by the atoms of $P$. Note that different partitions may generate the same $\sigma$-algebra, and that each finite $\sigma$-algebra is generated by the partition formed by its atoms.

Definition 5.2 Let $P^j = (P^j_1, P^j_2, \ldots, P^j_{l_j})$ be finite partitions of $X$ for $j = 1, 2, \ldots, n$. Their join is the partition $P = \bigvee_{j=1}^{n} P^j$ whose elements are the sets

$$P_{k_1 k_2 \cdots k_n} = \bigcap_{j=1}^{n} P^j_{k_j}.$$

The order on $P$ is lexicographic; that is,

$$P = (P_{11\cdots11}, P_{11\cdots12}, \ldots, P_{11\cdots1l_n}, P_{11\cdots21}, \ldots, P_{l_1 l_2 \cdots l_n}).$$

Definition 5.3 If $P^1$ and $P^2$ are partitions of $X$ such that each atom of $P^1$ is a union of atoms of $P^2$, then we say $P^2$ refines $P^1$ and write $P^1 \le P^2$. Clearly, if $P^1 \le P^2$, then $\mathcal{B}(P^1) \subseteq \mathcal{B}(P^2)$. In any case, $P^k \le \bigvee_{j=1}^{n} P^j$ for each $k = 1, \ldots, n$.
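The join and the refinement relation can be made concrete on a finite set. The sketch below is illustrative only: partitions are modelled as lists of `frozenset`s, the helper names `join` and `refines` are hypothetical, and empty intersections are kept so that the lexicographic indexing $P_{k_1 \cdots k_n}$ stays aligned.

```python
from itertools import product

def join(partitions):
    """Join of ordered partitions (lists of frozensets), in lexicographic order.

    itertools.product varies the last index fastest, which matches the
    ordering P_{1...11}, P_{1...12}, ... used in the text.
    """
    return [frozenset.intersection(*combo) for combo in product(*partitions)]

def refines(fine, coarse):
    """True if every nonempty cell of `fine` lies inside some cell of `coarse`."""
    return all(any(cell <= big for big in coarse) for cell in fine if cell)

# Toy example on X = {0, ..., 5}.
P1 = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
P2 = [frozenset({0, 3}), frozenset({1, 4}), frozenset({2, 5})]
J = join([P1, P2])
print(J)
```

Here each factor is refined by the join, while the join is not refined by either factor.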

Definition 5.4 The distribution of the partition $P = (P_1, P_2, \ldots)$ is the sequence $d(P) = (\mu(P_1), \mu(P_2), \ldots)$.

1.

153

DEFINITIONS

Definition 5.5 Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be a dynamical system and $P$ a partition of $X$. Then $P$ is a generator for $\Phi$ if $\mathcal{B}$ is generated by $\bigcup_{n=-\infty}^{\infty} \phi^n(P)$.

Note that this is equivalent to saying that

$$\mathcal{B} = \bigvee_{n=-\infty}^{\infty} \phi^n \mathcal{B}(P) = \bigvee_{n=1}^{\infty} \bigvee_{k=-n}^{n} \phi^k \mathcal{B}(P).$$

A sequence of partitions of $(X, \mathcal{B}, \mu)$ is said to be independent if the associated $\sigma$-algebras are independent. This is true for the sequence $\phi^n(P)$ iff condition (iv) of Definition 5.1' is satisfied. In this case, the Borel zero-one law asserts that every set in

$$\bigcap_{n=1}^{\infty} \bigvee_{k=-\infty}^{-n} \phi^k \mathcal{B}(P)$$

has $\mu$-measure 0 or 1. Thus the following restatement of Definition 5.1' implies that every Bernoulli system is equivalent to a Kolmogorov system. (Set $\mathcal{B}_0 = \bigvee_{n=-\infty}^{0} \phi^n \mathcal{B}(P)$ in Exercise 1.25.)

Definition 5.1'' $\Phi = (X, \mathcal{B}, \mu, \phi)$ is a Bernoulli system with distribution $\pi$ iff $\Phi$ has a generator $P$ with distribution $d(P) = \pi$ such that the sequence $\{\phi^n(P)\}$ is independent. We shall say then that $P$ is an independent generator for $\Phi$.

Remark It is sufficient to show that the $\phi^n(P)$ are independent for $n \ge 1$ (or $n \le 1$), for then independence of the entire bisequence $\phi^n(P)$ ($-\infty < n < +\infty$) follows from the fact that $\phi$ is measure preserving.

Definition 5.6 By a stack of height $n$ we shall mean a disjoint sequence $F_0, \ldots, F_{n-1}$ of measurable subsets of the measure space $(X, \mathcal{B}, \mu)$ and a sequence of m-isomorphisms $\phi_j$ of $F_j$ onto $F_{j+1}$ ($j = 0, 1, \ldots, n-2$). If we define $\phi$ to be $\phi_j$ on $F_j$, then $\phi$ is a measure-preserving transformation of $\bigcup_{j=0}^{n-2} F_j$ onto $\bigcup_{j=1}^{n-1} F_j$, and $F_j = \phi^j F_0$. We can think of the sets $F_j$ lying above one another in a "stack" and $\phi$ mapping each point in $F_j$ into the one directly above it in $F_{j+1}$ (Fig. 5.1).

Figure 5.1. Stack: n = 5.

Now suppose that $P = (P_1, P_2, \ldots, P_l)$ is a partition of $X$. Then $P$ induces a partition on $\bigcup_{j=0}^{n-1} F_j$ (Fig. 5.2). Moreover, $\phi^{-j}$ maps $F_j$ onto $F_0$ and provides another partition $P_j$ of $F_0$ ($j = 0, 1, \ldots, n-1$). The join $P' = P_0 \vee P_1 \vee \cdots \vee P_{n-1}$ is a partition of $F_0$ into points with identical "futures" with regard to the transformation $\phi$ and the partition $P$.

Figure 5.2. Stack and partition: n = 5, l = 2.

Definition 5.7 The P-n-name of a point $x \in F_0$ is the sequence $i_0, i_1, \ldots, i_{n-1}$ such that $\phi^j(x) \in P_{i_j}$ ($j = 0, 1, \ldots, n-1$).

Definition 5.8 If $\{\phi^j F_0 : j = 0, \ldots, n-1\}$ is a stack and $P$ a partition, a column associated with $P$ is a substack $\{\phi^j A : j = 0, \ldots, n-1\}$, where $A$ is an atom of the partition $P'$ of $F_0$ into points with identical P-n-names. Each of the sets $\phi^j A$ is called a column level (Fig. 5.3).

Figure 5.3. Heavy lines indicate the column of points with P-n-name (1, 1, 2, 2, 1).

Clearly, the collection of column levels associated with $P$ provides a partition $P''$ of $\bigcup_{j=0}^{n-1} \phi^j F_0$ which is finer than the one induced by $P$. It is the coarsest partition of the stack which is finer than $P$ and is "consistent" with $\phi$ in the sense that $\phi$ maps atoms of $P''$ onto atoms of $P''$.
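A finite toy model may help fix the notions of P-n-name and column. In the sketch below the space is $\{0, \ldots, 23\}$ with the cyclic map $x \mapsto x + 1 \pmod{24}$, the base is every third point, and the two-set partition is given by a labelling function; all of these choices, and the helper names, are assumptions made purely for illustration.

```python
from collections import defaultdict

def pn_name(x, phi, label, n):
    """P-n-name of x: the labels of x, phi(x), ..., phi^(n-1)(x)."""
    name = []
    for _ in range(n):
        name.append(label(x))
        x = phi(x)
    return tuple(name)

def columns(base, phi, label, n):
    """Group base points by P-n-name; each group is the base of one column."""
    groups = defaultdict(list)
    for x in sorted(base):
        groups[pn_name(x, phi, label, n)].append(x)
    return dict(groups)

N, n = 24, 3
phi = lambda x: (x + 1) % N                 # toy transformation
label = lambda x: 1 if x % 4 < 2 else 2     # toy two-set partition (P_1, P_2)
base = set(range(0, N, n))                  # F_0; its levels F_0, F_0+1, F_0+2 are disjoint

cols = columns(base, phi, label, n)
for name, pts in cols.items():
    print(name, pts)
```

Each key is a column name and each value lists the base points of that column; the column itself consists of those points together with their images under the first $n - 1$ iterates.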


Definition 5.9 A gadget is a quadruple $(F, \phi, n, P)$ such that $F, \phi F, \ldots, \phi^{n-1}F$ forms a stack and $P$ is a partition of $\bigcup_{j=0}^{n-1} \phi^j F$. The columns associated with $P$ are called columns of the gadget. The P-n-name of the column $A, \phi A, \ldots, \phi^{n-1}A$ is the P-n-name of the points $x \in A$.

Remark An equivalent definition of a gadget could be given as a union of disjoint stacks of height $n$ (the columns of the gadget) with distinct assignments of sequences $(j_0, j_1, \ldots, j_{n-1})$ (the P-n-name) from the set $\{1, 2, \ldots, l\}$ to the columns. For then the partition $P$ may be recovered as follows. The atom $P_i$ of $P$ is the union of the column levels $\phi^k A$ such that $j_k = i$ in the sequence assigned to that column.

Clearly, distinct gadgets may give rise to similar appearing column structures. For example, the partitions $P_1$ and $P_2$ may contain a different number of sets, or may have the same sets arranged in a different order, and lead to column-level partitions $P_1''$ and $P_2''$ which are indistinguishable except for labeling (Fig. 5.4). We shall want to preserve these distinctions, for example, through the recitation of column names. However, we do want to identify gadgets which differ only by a scaling factor. We are thus led to the following definition of gadget isomorphism.

Figure 5.4. Pairwise nonisomorphic gadgets

Definition 5.10 Two gadgets $\Gamma_1 = (F_1, \phi_1, n, P_1)$ and $\Gamma_2 = (F_2, \phi_2, n, P_2)$ are isomorphic if they have the same collection of column names and the distributions of the partitions $P_1'$ and $P_2'$ of $F_1$ and $F_2$, respectively, into column levels agree up to a constant multiple.

Remark To say that $\Gamma_1$ and $\Gamma_2$ are isomorphic is to say (assuming that $F_1$ and $F_2$ are measure-theoretically isomorphic) that there exists an invertible measurable mapping of $F_1$ onto $F_2$ which preserves relative "lengths" and the structure of the gadget built on $F_1$. We shall call such a mapping a gadget isomorphism.

In the approximation arguments of subsequent sections we shall want to speak of closeness of partitions or of gadgets and of approximate independence. We are thus led to the following definitions.


Definition 5.11 Let $P = (P_1, P_2, \ldots)$ and $Q = (Q_1, Q_2, \ldots)$ be partitions of the same or different measure spaces $(X_1, \mathcal{B}_1, \mu_1)$ and $(X_2, \mathcal{B}_2, \mu_2)$. The distribution distance of $P$ and $Q$ is

$$\delta(P, Q) = \sum_i |\mu_1(P_i) - \mu_2(Q_i)|.$$

Definition 5.12 Let $P = (P_1, P_2, \ldots)$ and $Q = (Q_1, Q_2, \ldots)$ be partitions of the same measure space $(X, \mathcal{B}, \mu)$. The partition distance of $P$ and $Q$ is

$$\rho(P, Q) = \sum_i \mu(P_i \,\triangle\, Q_i).$$
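The two distances just defined behave quite differently, and a finite example shows this. The following sketch models a partition as an ordered list of `frozenset`s over a weighted finite set; the function names and the four-point space are hypothetical choices for illustration.

```python
from fractions import Fraction

def distribution(cells, weight):
    """d(P): the sequence of measures of the cells."""
    return [sum((weight[x] for x in c), Fraction(0)) for c in cells]

def distribution_distance(dist_p, dist_q):
    """delta: l1 distance between two distributions (shorter one padded with zeros)."""
    size = max(len(dist_p), len(dist_q))
    p = list(dist_p) + [0] * (size - len(dist_p))
    q = list(dist_q) + [0] * (size - len(dist_q))
    return sum(abs(a - b) for a, b in zip(p, q))

def partition_distance(cells_p, cells_q, weight):
    """rho: sum of measures of symmetric differences of corresponding cells."""
    size = max(len(cells_p), len(cells_q))
    pad = lambda cells: list(cells) + [frozenset()] * (size - len(cells))
    mu = lambda s: sum((weight[x] for x in s), Fraction(0))
    return sum(mu(a ^ b) for a, b in zip(pad(cells_p), pad(cells_q)))

weight = {x: Fraction(1, 4) for x in range(4)}       # uniform measure on 4 points
P = [frozenset({0, 1}), frozenset({2, 3})]
Q = [frozenset({0, 2}), frozenset({1, 3})]
P_swapped = P[::-1]                                   # same sets, different order

delta_PQ = distribution_distance(distribution(P, weight), distribution(Q, weight))
rho_PQ = partition_distance(P, Q, weight)
rho_swap = partition_distance(P, P_swapped, weight)
print(delta_PQ, rho_PQ, rho_swap)
```

In this example $P$ and $Q$ have the same distribution, so their distribution distance vanishes, while their partition distance does not. Reordering the cells of $P$ also changes the partition distance, which reflects the convention that partitions are ordered.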

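Definition 5.13 Let $\Gamma_1 = (F_1, \phi_1, n, P_1)$ and $\Gamma_2 = (F_2, \phi_2, n, P_2)$ be gadgets. Let $\Psi$ be the collection of invertible mappings $\psi$ of $F_2$ onto $F_1$ which are measure preserving up to a constant factor. The gadget distance of $\Gamma_1$ and $\Gamma_2$ is

$$\gamma(\Gamma_1, \Gamma_2) = \inf_{\psi \in \Psi} \frac{1}{n} \sum_{i=0}^{n-1} \rho\left(\phi_1^{-i} P_1 \cap F_1, \; \psi(\phi_2^{-i} P_2 \cap F_2)\right),$$

where $\rho$ is the partition distance on $F_1$. If $\Psi = \emptyset$, set $\gamma(\Gamma_1, \Gamma_2) = +\infty$.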

Closely related to the gadget distance is the process distance associated with pairs $(\Phi, P)$ where $\Phi = (X, \mathcal{B}, \mu, \phi)$ is a dynamical system and $P$ a finite measurable partition of $X$. The name arises from the fact that $(\Phi, P)$ determines a (finite-valued) stochastic process $\{f_n\}$ by $f_n(x) = i$ if $\phi^n x \in P_i \in P$. To define the process distance, we consider measure isomorphisms $\psi$ of $X_2$ onto $X_1$, as we did isomorphisms of $F_2$ onto $F_1$ for the gadget distance. Let $\Psi(X_2, X_1)$ denote the class of such isomorphisms.

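Definition 5.14 Let $\Phi_i = (X_i, \mathcal{B}_i, \mu_i, \phi_i)$ be dynamical systems and $P_i$ finite, measurable partitions of $X_i$ with the same number of atoms ($i = 1, 2$). The process distance is defined by

$$\pi((\Phi_1, P_1), (\Phi_2, P_2)) = \sup_n \inf_{\psi \in \Psi(X_2, X_1)} \frac{1}{n} \sum_{i=0}^{n-1} \rho\left(\phi_1^{-i} P_1, \; \psi(\phi_2^{-i} P_2)\right).$$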

Our final definition of this section gives a quantitative interpretation to the statement that two partitions $P$ and $Q$ are almost independent. It says that $P$ partitions almost all of the sets in $Q$ in almost the same proportions as it partitions $X$. If $Q$ is a measurable subset of $X$ with $0 < \mu(Q) < \infty$, we denote by $P \cap Q$ the partition $\{P_i \cap Q : P_i \in P\}$ of the measure space $(Q, Q \cap \mathcal{B}, (1/\mu(Q))\mu)$.


Definition 5.15 Let $P$ and $Q$ be partitions of $(X, \mathcal{B}, \mu)$, and let $\varepsilon > 0$. Then $P$ is $\varepsilon$-independent of $Q$ if $Q$ is the disjoint union of collections $Q_1$ and $Q_2$ with (i) $\mu(\bigcup Q_2) < \varepsilon$, and (ii) $\delta(P, P \cap Q) < \varepsilon$ for all $Q \in Q_1$.

Remarks 1 The function $\delta$ is a pseudometric on partitions of finite measure spaces. In fact, $\delta(P, Q) = 0$ iff $P$ and $Q$ have the same distribution, and $\delta$ is the $l_1$-metric on the sequences $d(P)$. In particular, the collection of partitions of a given finite measure space is complete in the $\delta$-topology.

2 The function $\rho$ is a metric on the space of partitions of the finite measure space $(X, \mathcal{B}, \mu)$. Moreover, $\delta(P, Q) \le \rho(P, Q)$.

3 The function $\gamma$ is a pseudometric. Assuming that $F_1$ and $F_2$ are measure-theoretically isomorphic modulo a constant multiplicative factor, $\gamma(\Gamma_1, \Gamma_2)$ measures how well an isomorphic copy of the gadget $\Gamma_2$ on the space of the gadget $\Gamma_1$ can be made to fit $\Gamma_1$ at every level of $\Gamma_1$. In particular, $\gamma(\Gamma_1, \Gamma_2) = 0$ iff there is a gadget isomorphism of $\Gamma_1$ and $\Gamma_2$.

4 Each partition $P$ of $X$ determines a factor of $\Phi$ as explained in the previous chapter. Thus the pseudometric $\pi$ may be thought of as a measure of closeness of the factors of $\Phi_1$ and $\Phi_2$ determined by $P_1$ and $P_2$. In fact, it is not hard to see that $\pi((\Phi_1, P_1), (\Phi_2, P_2)) = 0$ iff these two factors are isomorphic. In case $P_1$ and $P_2$ are generators for $\Phi_1$ and $\Phi_2$, respectively, $\pi$ gives a useful measure of "almost isomorphism" of $\Phi_1$ and $\Phi_2$.

5 The usual analytic trickery of comparing the $nr$th term to the $n$th term of the sequence shows that, in fact, $\sup_n$ can be replaced by $\lim_{n \to \infty}$ in the definition of $\pi$.

6 It is reasonable, and not hard to show for ergodic systems, that closeness in the process metric is equivalent to closeness in the gadget metric for arbitrarily long gadgets. This requires the use of Lemma 5.3 below.

7 Partitions $P$ and $Q$ of $X$ are independent iff $P$ is $\varepsilon$-independent of $Q$ for each $\varepsilon > 0$.

8 If $P$ is $\varepsilon/3$-independent of $Q$, then

(a) $\sum_i \sum_j |\mu(P_i \cap Q_j) - \mu(P_i)\mu(Q_j)| < \varepsilon$.

Conversely, if (a) holds, then $P$ is $\sqrt{\varepsilon}$-independent of $Q$. Since (a) is symmetric in $P$ and $Q$, it follows that $P$ $\varepsilon$-independent of $Q$ implies that $Q$ is $\sqrt{3\varepsilon}$-independent of $P$.

9 If $P$ is $\varepsilon$-independent of $Q$, then there exists a measurable set $A$, with $\mu(X \setminus A) < 2\varepsilon$, such that $P \cap A$ and $Q \cap A$ are independent.
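For finite partitions, $\varepsilon$-independence can be checked directly from the table of joint measures $\mu(P_i \cap Q_j)$. The sketch below is an illustration under that assumption; the function names are hypothetical. It uses the observation that the best choice of the exceptional collection $Q_2$ is exactly the set of atoms $Q_j$ with $\delta(P, P \cap Q_j) \ge \varepsilon$, and it also evaluates the double sum $\sum_{i,j} |\mu(P_i \cap Q_j) - \mu(P_i)\mu(Q_j)|$, which equals $\sum_j \mu(Q_j)\,\delta(P, P \cap Q_j)$.

```python
from fractions import Fraction

def marginals(joint):
    """Row sums mu(P_i) and column sums mu(Q_j) of the joint table."""
    p = [sum(row) for row in joint]
    q = [sum(col) for col in zip(*joint)]
    return p, q

def conditional_deltas(joint):
    """Pairs (mu(Q_j), delta(P, P ∩ Q_j)) for every atom Q_j of positive measure."""
    p, q = marginals(joint)
    out = []
    for j, qj in enumerate(q):
        if qj > 0:
            d = sum(abs(p[i] - joint[i][j] / qj) for i in range(len(p)))
            out.append((qj, d))
    return out

def is_eps_independent(joint, eps):
    """P is eps-independent of Q iff the atoms with delta >= eps have total mass < eps."""
    bad_mass = sum(qj for qj, d in conditional_deltas(joint) if d >= eps)
    return bad_mass < eps

def dependence_sum(joint):
    """sum_{i,j} |mu(P_i ∩ Q_j) - mu(P_i) mu(Q_j)|."""
    p, q = marginals(joint)
    return sum(abs(joint[i][j] - p[i] * q[j])
               for i in range(len(p)) for j in range(len(q)))

quarter, half, zero = Fraction(1, 4), Fraction(1, 2), Fraction(0)
independent = [[quarter, quarter], [quarter, quarter]]   # P and Q independent
dependent = [[half, zero], [zero, half]]                 # P = Q: fully dependent

print(is_eps_independent(independent, Fraction(1, 100)), dependence_sum(independent))
print(is_eps_independent(dependent, Fraction(1, 2)), dependence_sum(dependent))
```

For the independent table the double sum vanishes and the test passes for every positive tolerance; for the fully dependent table the test fails until the tolerance exceeds the conditional distance.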


2. APPROXIMATION LEMMAS

The general outline of the proof of the isomorphism theorem as well as the proofs of several of the lemmas follow Shields [57]. We shall assume throughout this and the following sections that all measure spaces are normalized, that is, have total measure one, and all partitions are finite. For ease of later reference, we shall begin by stating all of the lemmas to be proved in this section. The first six lemmas involve only the concepts developed in the previous section.

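Lemma 5.1 (Law of large numbers) Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an ergodic system, let $P$ be a partition of $X$, and let $\varepsilon > 0$. For all sufficiently large $n$ there is a set $C_n \in \bigvee_{i=0}^{n-1} \mathcal{B}(\phi^{-i}P)$ such that

(i) $\mu(X \setminus C_n) < \varepsilon$, and
(ii) $\left| \frac{1}{n} \sum_{i=0}^{n-1} \chi_{P_j}(\phi^i x) - \mu(P_j) \right| < \varepsilon$ for all $P_j \in P$, $x \in C_n$.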
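In the special case of a Bernoulli shift on two symbols with its independent generator, the measure of the set $C_n$ can be computed exactly from the binomial distribution. The sketch below does this; the function name and the parameter values are assumptions made for illustration, and exact rational arithmetic is used to avoid rounding at the boundary of the tolerance.

```python
from fractions import Fraction
from math import comb

def typical_mass(p, n, eps):
    """Exact mass of the n-names whose frequency of symbol 1 is within eps of p.

    For a two-set partition the frequency of symbol 2 is then within eps of
    1 - p as well, so this is the measure of the set C_n in the i.i.d. case.
    """
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) < eps:
            total += comb(n, k) * p**k * (1 - p)**(n - k)
    return total

p, eps = Fraction(1, 2), Fraction(1, 10)
small = typical_mass(p, 10, eps)
large = typical_mass(p, 400, eps)
print(float(small), float(large))
```

With these values the typical set carries less than a third of the mass at $n = 10$ but more than $1 - \varepsilon$ of it at $n = 400$.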

Lemma 5.2 (Rohlin) Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an ergodic dynamical system, and let $\varepsilon > 0$. Then for each $n$ there exists a set $F \in \mathcal{B}$ such that

(i) $\phi^i F \cap \phi^j F = \emptyset$ ($i \ne j$; $i, j = 0, 1, \ldots, n-1$), and
(ii) $\mu\left(\bigcup_{i=0}^{n-1} \phi^i F\right) > 1 - \varepsilon$.
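The shape of the conclusion of Lemma 5.2 can be seen in a finite toy model. The sketch below uses the cyclic shift $x \mapsto x + 1 \pmod N$ with uniform measure, which is ergodic but periodic and atomic, so it lies outside the standing assumptions of this chapter; it is offered only as a picture of a base $F$ whose first $n$ images are disjoint and nearly fill the space when $n$ is small compared with $N$. The helper names are hypothetical.

```python
def rohlin_base(N, n):
    """Every n-th point of Z_N, omitting the final incomplete block."""
    return {n * q for q in range(N // n)}

def levels(F, N, n):
    """The sets F, phi F, ..., phi^(n-1) F for phi(x) = x + 1 mod N."""
    return [{(x + i) % N for x in F} for i in range(n)]

N, n = 103, 5
F = rohlin_base(N, n)
tower = levels(F, N, n)
covered = set().union(*tower)
print(len(F), len(covered), len(covered) / N)
```

Here the tower misses only the $N \bmod n$ leftover points, so its measure exceeds $1 - n/N$.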

Lemma 5.3 Let $\Phi$ and $\varepsilon$ be as in Lemma 5.2, and let $P$ be a finite partition of $X$. Then for each $n$ there exists a set $F \in \mathcal{B}$ satisfying (i) and (ii) of that lemma and (iii) $\delta(P, P \cap F) < 2\varepsilon$.

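Lemma 5.4 Let $\Gamma = (F, \phi, n, P)$ be a gadget in $(X, \mathcal{B}, \mu)$, let $\Phi' = (X', \mathcal{B}', \mu', \phi')$ be an ergodic dynamical system, and let $\varepsilon > 0$. Then there exists a set $F' \in \mathcal{B}'$ and a finite partition $P'$ of $X'$ such that $\Gamma$ and $\Gamma' = (F', \phi', n, P')$ are isomorphic gadgets, and

$$\mu'\left(\bigcup_{i=0}^{n-1} \phi'^{\,i} F'\right) > 1 - \varepsilon.$$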

Lemma 5.5 Suppose $(F, \phi, n, P)$ and $(F', \phi', n, P')$ are isomorphic gadgets. Let $Q$ be a partition of $\bigcup_{i=0}^{n-1} \phi^i F$. Then there is a partition $Q'$ of $\bigcup_{i=0}^{n-1} \phi'^{\,i} F'$ such that $(F, \phi, n, P \vee Q)$ is isomorphic to $(F', \phi', n, P' \vee Q')$.


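Lemma 5.6 Let $\Gamma = (F, \phi, n, P)$ and $\Gamma' = (F', \phi', n, P')$ be gadgets, and let $\varepsilon > 0$. Suppose that

(a) $d\left(\bigvee_{i=0}^{n-1} \phi^{-i}P \cap F\right) = d\left(\bigvee_{i=0}^{n-1} \phi^{-i}P\right)$,
(b) $d\left(\bigvee_{i=0}^{n-1} \phi'^{-i}P' \cap F'\right) = d\left(\bigvee_{i=0}^{n-1} \phi'^{-i}P'\right)$,
(c) $\{\phi^{-i}P \cap F\}$ is an $\varepsilon$-independent sequence,
(d) $\{\phi'^{-i}P' \cap F'\}$ is an independent sequence, and
(e) $\delta(P, P') < \varepsilon$.

Then $\gamma(\Gamma, \Gamma') < 3\varepsilon$.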

The remaining lemmas involve also the notion of entropy (Chapter IV). For each dynamical system $\Phi = (X, \mathcal{B}, \mu, \phi)$ and each (finite) partition $P$ of $X$, we shall denote by $h(\phi, P)$ the entropy $h(\phi, \mathcal{B}(P))$ of $\phi$ on the $\sigma$-algebra $\mathcal{B}(P)$ generated by $P$ (Definition 4.2). Likewise, we write $H(P) = H(\mathcal{B}(P))$.

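Lemma 5.7 (McMillan) Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an ergodic system, let $P$ be a partition of $X$, and let $\varepsilon > 0$. For all sufficiently large $n$ there is a collection $\mathcal{C}_n$ of atoms of $\bigvee_{i=0}^{n-1} \phi^{-i}P$ such that

(i) $\mu(\bigcup \mathcal{C}_n) \ge 1 - \varepsilon$, and
(ii) $|h(\phi, P) + (1/n) \log \mu(A)| < \varepsilon$ for $A \in \mathcal{C}_n$.

Consequently, the number $\nu(\mathcal{C}_n)$ of atoms in $\mathcal{C}_n$ satisfies

(iii) $(1 - \varepsilon)\, e^{n(h(\phi, P) - \varepsilon)} \le \nu(\mathcal{C}_n) \le e^{n(h(\phi, P) + \varepsilon)}$.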
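For a two-symbol Bernoulli shift with its independent generator, the atoms of the $n$-fold join are the $n$-names, an atom with $k$ occurrences of the first symbol has measure $p^k (1-p)^{n-k}$, and $h(\phi, P) = -p \log p - (1-p)\log(1-p)$. Under those assumptions the collection $\mathcal{C}_n$ and the counting bounds in (iii) can be checked numerically. The function name and parameter values below are illustrative only.

```python
from math import comb, exp, log

def typical_atoms(p, n, eps):
    """Return (h, count, mass) for atoms A with |h + (1/n) log mu(A)| < eps."""
    q = 1.0 - p
    h = -(p * log(p) + q * log(q))          # entropy in nats
    count, mass = 0, 0.0
    for k in range(n + 1):
        log_mu = k * log(p) + (n - k) * log(q)   # log-measure of one such atom
        if abs(h + log_mu / n) < eps:
            count += comb(n, k)                   # number of atoms with k ones
            mass += comb(n, k) * exp(log_mu)      # their total measure
    return h, count, mass

p, n, eps = 0.3, 200, 0.1
h, count, mass = typical_atoms(p, n, eps)
lower = (1 - eps) * exp(n * (h - eps))
upper = exp(n * (h + eps))
print(mass, lower <= count <= upper)
```

The lower bound uses only that each typical atom has measure below $e^{-n(h - \varepsilon)}$ while the collection has mass at least $1 - \varepsilon$; the upper bound uses that each has measure above $e^{-n(h + \varepsilon)}$ while the total mass is at most one.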

Lemma 5.8 Let $\Phi$ be a dynamical system, let $k$ be a positive integer greater than 2, and let $\varepsilon > 0$. Then there exists a $\delta = \delta(\varepsilon, k) > 0$ such that, whenever $P$ is a partition containing $k$ atoms and satisfying $h(\phi, P) \ge H(P) - \delta$, it follows that $\{\phi^n P\}$ is an $\varepsilon$-independent sequence. In particular, $h(\phi, P) = H(P)$ iff $\{\phi^n P\}$ is independent.

Lemma 5.9 Let $\Phi$ be a Bernoulli system with independent generator $P$, and let $\varepsilon > 0$. Then there exists a $\delta > 0$ such that, whenever $\Phi'$ is ergodic and $P'$ is a partition of $X'$ with the same number of atoms as $P$ such that

(i) $\delta(P, P') < \delta$, and
(ii) $|H(P) - h(\phi', P')| < \delta$,

we have

(iii) $\pi((\Phi, P), (\Phi', P')) < \varepsilon$.

Lemma 5.9 turns out to be a "good" theorem and like all good theorems has become a definition.

Definition 5.16 Let $\Phi$ be an ergodic system, and let $P$ be a finite partition of $X$. Then $P$ is said to be finitely determined (relative to $\Phi$) if given $\varepsilon > 0$ there is a positive integer $n$ and a $\delta > 0$ such that, whenever $\Phi'$ is ergodic and $P'$ is a partition of $X'$ such that

(i) $h(\Phi') \ge h(\phi, P)$,
(ii) $\nu(P') = \nu(P)$,
(iii) $\delta\left(\bigvee_{i=0}^{n-1} \phi'^{-i}P', \bigvee_{i=0}^{n-1} \phi^{-i}P\right) < \delta$, and
(iv) $0 \le h(\phi, P) - h(\phi', P') < \delta$,

then

(v) $\pi((\Phi, P), (\Phi', P')) < \varepsilon$.

Remark It is easily seen from Lemma 5.9 that independent generators are finitely determined. The importance of finitely determined partitions lies in the (not obvious) facts that they are precisely the ones that generate Bernoulli systems and are often easier to find than independent generators.

Lemma 5.10 Let $\Phi$ be a Bernoulli system with independent generator $P$. Let $\Phi'$ be any ergodic dynamical system with $h(\Phi') \ge h(\Phi)$, and let $\varepsilon > 0$. Then there exists a partition $Q$ such that (i) $\nu(Q) = \nu(P)$, (ii) $\delta(P, Q) < \varepsilon$, and (iii) $0 \le h(\phi, P) - h(\phi', Q) < \varepsilon$.

Lemma 5.11 Let $\Phi$ be an ergodic system, let $P$ be a finitely determined partition of $X$, and let $\varepsilon > 0$. Then there exists a positive integer $n_1$ and a $\delta_1 > 0$ such that, whenever $\Phi'$ is an ergodic system and $P'$ is a partition of $X'$ such that

(i) $h(\Phi') \ge h(\phi, P)$,
(ii) $\nu(P') = \nu(P)$,
(iii) $\delta\left(\bigvee_{i=0}^{n_1-1} \phi'^{-i}P', \bigvee_{i=0}^{n_1-1} \phi^{-i}P\right) < \delta_1$, and
(iv) $0 \le h(\phi, P) - h(\phi', P') < \delta_1$,

then for any $\delta_2 > 0$ and $n_2$ there is a partition $Q$ such that

(v) $\delta\left(\bigvee_{i=0}^{n_2-1} \phi'^{-i}Q, \bigvee_{i=0}^{n_2-1} \phi^{-i}P\right) < \delta_2$,
(vi) $0 \le h(\phi, P) - h(\phi', Q) < \delta_2$, and
(vii) $\rho(P', Q) < \varepsilon$.

Lemma 5.12 (Ornstein's principal lemma) Let $\Phi$ be a Bernoulli system with independent generator $P$, and let $\varepsilon > 0$. Then there exists a $\delta > 0$ such that, whenever $\Phi'$ is ergodic and $P'$ is a partition of $X'$ satisfying

(i) $h(\Phi') \ge h(\Phi)$,
(ii) $\nu(P') = \nu(P)$,
(iii) $\delta(P, P') < \delta$, and
(iv) $0 \le h(\phi, P) - h(\phi', P') < \delta$,

there is a partition $Q$ such that

(v) $\{\phi'^{\,n} Q\}$ is an independent sequence,
(vi) $d(Q) = d(P)$, and
(vii) $\rho(P', Q) < \varepsilon$.

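Proof (Lemma 5.1) According to Proposition 1.4, the quantity on the left in (ii) converges to zero as $n \to \infty$ for almost all $x \in X$ and each $P_j \in P$. It follows that

$$\mu\left\{x : \left|\frac{1}{n}\sum_{i=0}^{n-1} \chi_{P_j}(\phi^i x) - \mu(P_j)\right| \ge \varepsilon \text{ for some } P_j \in P\right\} \to 0.$$

Let $D_n$ be the indicated set, so that $\mu(D_n) \to 0$. Clearly, $D_n \in \bigvee_{i=0}^{n-1} \mathcal{B}(\phi^{-i}P)$. Choose $N$ so that $n \ge N \Rightarrow \mu(D_n) < \varepsilon$, and set $C_n = X \setminus D_n$. ∎

Remark If $P = \{P_1, \ldots, P_l\}$, then the quantity

$$\frac{1}{n} \sum_{i=0}^{n-1} \chi_{P_j}(\phi^i x)$$

is the relative frequency of occurrence of $j$ in the P-n-name of $x$. This is constant on the atoms of $\bigvee_{i=0}^{n-1} \phi^{-i}P = P_n$. Thus Lemma 5.1 says that for large $n$ the relative frequency of occurrence of $j$ in the P-n-name of the atom $A \in P_n$ is near $\mu(P_j)$ for most of the atoms of $P_n$.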

Proof (Lemma 5.2) Our proof, the idea of which is due to Ornstein, follows [32, pp. 70-72], where Lemma 5.2 is proved for antiperiodic $\Phi$, that is, under the assumption that $\phi^n x \ne x$ for all $n \ge 1$ and almost all $x \in X$. It is easily seen (Exercise 1) that $\Phi$ ergodic with positive entropy implies $\Phi$ is antiperiodic.

Let $p$ be a positive integer such that $p^{-1} < \varepsilon$. We begin by constructing a measurable set $F$ with $\mu(F) > 0$ such that (i) $F, \phi F, \phi^2 F, \ldots, \phi^{pn-1}F$ are pairwise disjoint, and (ii) $G, \phi G, \phi^2 G, \ldots, \phi^{pn-1}G$ pairwise disjoint with $G \supseteq F$ implies $\mu(G \setminus F) = 0$.

First let $E_1 \in \mathcal{B}$ satisfy $\mu(E_1) > 0$ and $\mu(E_1 \,\triangle\, \phi E_1) > 0$ (by ergodicity). Let $F_1 = E_1 \setminus \phi E_1$. Since $\mu(E_1 \setminus \phi E_1) = \mu(\phi E_1 \setminus E_1) = \tfrac{1}{2}\mu(E_1 \,\triangle\, \phi E_1)$, it follows that $\mu(F_1) > 0$. Moreover, $F_1 \cap \phi F_1 = \emptyset$. Having chosen $F_1 \supseteq F_2 \supseteq \cdots \supseteq F_k$ with $\mu(F_k) > 0$ and $F_k, \phi F_k, \ldots, \phi^k F_k$ pairwise disjoint, there must exist $E_{k+1} \subseteq F_k$ with $\mu(E_{k+1}) > 0$ and $\mu(E_{k+1} \,\triangle\, \phi^{k+1}E_{k+1}) > 0$. In fact, any subset $E$ of $F_k$ with $\mu(E) > 0$ and $\mu(F_k \setminus E) > 0$ will do. For otherwise, $A = E \cup \phi E \cup \cdots \cup \phi^k E$ would be an invariant set with $\mu(A) > 0$ and $\mu(X \setminus A) > 0$. As before, let $F_{k+1} = E_{k+1} \setminus \phi^{k+1}E_{k+1}$, so that $F_{k+1} \cap \phi^{k+1}F_{k+1} = \emptyset$. Since clearly $\phi^i F_{k+1} \cap \phi^j F_{k+1} = \emptyset$ for $0 \le i < j \le k+1$, we have by induction defined for each $k$ a set $F_k$ with $\mu(F_k) > 0$ and $F_k, \phi F_k, \ldots, \phi^k F_k$ pairwise disjoint.

Setting $k = pn - 1$, it follows that the class $\mathcal{F}$ of sets $F \in \mathcal{B}$ with $\mu(F) > 0$ and satisfying (i) above is nonempty. Identifying sets in $\mathcal{F}$ which differ by a set of measure zero and Zornifying, we obtain (ii). Now if $F_0$ is a measurable subset of $\phi^{pn-1}F$ of positive measure, then $\mu(\phi^j F_0 \cap F) > 0$ for some $j = 1, \ldots, pn$. For otherwise, $F \cup \phi F_0 \in \mathcal{F}$, contradicting (ii). Set

( 1 ~j 5 p n ) . i= 1

Then the A, ( j = 1,

. . ., p n ) are pairwise disjoint and

Consider the sets 4'A, for 1 I i <j I pn. For fixed i they are transforms by # of pairwise disjoint sets, hence pairwise disjoint. For i, < i,, &iz(4izAjz) = Ajz 4 p n - l F and 4-iz(4i1Ajl) & ' n - ' + i l - i z F are disjoint since i, - i, < p n - 1. Thus the total collection of 4iAj is pairwise disjoint. Moreover, each of them is disjoint from each 4 k F (0 Ik I p n - I), as is easily seen by considering separately the cases k > i (since (PP"- F n 4k-iF= and k 5 i (since i - k < j and so c$i-kAj n F = 0). In particular, (PA,, + , A , , . . ., (bPnApn are pairwise disjoint subsets of F.

a)


For if x = #iy = @ + j z with Y E Ai and @ A i + j G F n 4 j A i + j .Moreover,

Z E


A j + j , then y = @ Z E Ai n

Dfl

Dfl

Setting

we see that $\mu(F^* \,\triangle\, \phi F^*) = 0$. By ergodicity, it follows that $\mu(F^*) = 1$. Finally, setting

we see that $E, \phi E, \ldots, \phi^{n-1}E$ are pairwise disjoint. Their union differs from $F^*$ by the union of a collection of sets $\phi^i A_j$, consisting of less than $n$ of these sets for each $j$. Thus

j= 1

i=O

On the other hand, since $F, \phi F, \ldots, \phi^{pn-1}F$ are pairwise disjoint, $n\mu(F) \le n \cdot 1/pn = 1/p < \varepsilon$, and the proof is complete. ∎

Proof (Lemma 5.3) Recall that $P \cap F$ is the partition of $F$ given by $P \cap F = \{A \cap F : A \in P\}$ and its distribution is given by the normalized measure induced on $F$ by $\mu$. Thus condition (iii) says that

$$\sum_{A \in P} \left| \mu(A) - \frac{\mu(A \cap F)}{\mu(F)} \right| < 2\varepsilon.$$

Now use Lemma 5.2 to obtain a stack $F_0, \phi F_0, \ldots, \phi^{m-1}F_0$ such that (a) $m = np - 1$, (b) $\mu(F_0) < \varepsilon/2n$, and (c) $\mu\left(\bigcup_{i=0}^{m-1} \phi^i F_0\right) > 1 - \varepsilon/2$. The partition $P$ induces a partition $P' = \bigvee_{i=0}^{m-1} (\phi^{-i}P) \cap F_0$ on $F_0$ corresponding to the columns of the gadget $(F_0, \phi, m, P)$. Since $X$ is nonatomic, we can partition each atom $A \in P'$ into $n$ pieces of equal measure, $A = \bigcup_{j=0}^{n-1} A_j$. Let

$$F = \bigcup_{A \in P'} \bigcup_{j=0}^{n-1} \bigcup_{i=0}^{p-2} \phi^{ni+j} A_j.$$

(See Fig. 5.5.) Clearly, $F, \phi F, \ldots, \phi^{n-1}F$ are pairwise disjoint, and their union includes all of $\bigcup_{i=0}^{m-1} \phi^i F_0$ except a set of measure $(n-1)\mu(F_0) < \varepsilon/2$, which proves (ii).


In addition, we have $d(P \cap F) = d\left(P \cap \bigcup_{i=0}^{m-n} \phi^i F_0\right)$, since the atoms of the latter partition consist of unions of column levels of the gadget $(F_0, \phi, m, P)$ and we have assigned exactly $1/n$ of the measure of each column level to $F$. On the other hand, $\bigcup_{i=0}^{m-n} \phi^i F_0 = H$ has measure no less than $1 - \varepsilon/2 - n\mu(F_0) > 1 - \varepsilon$. Thus

$$\delta(P, P \cap F) = \delta(P, P \cap H) \le 2(1 - \mu(H)) < 2\varepsilon.$$ ∎

Remark The last part of the proof shows that whenever $H$ is a measurable set with $\mu(H) > 1 - \varepsilon$ and $P$ is a partition we have $\delta(P, P \cap H) < 2\varepsilon$.
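The bound in this remark is easy to test on a finite weighted set. The sketch below computes $\delta(P, P \cap H)$ with the normalized measure on $H$ and compares it with $2(1 - \mu(H))$; the ten-point space and the helper name are assumptions made for the example.

```python
from fractions import Fraction

def delta_to_restriction(cells, H, weight):
    """delta(P, P ∩ H), where P ∩ H carries the normalized measure on H."""
    mu = lambda s: sum((weight[x] for x in s), Fraction(0))
    mu_H = mu(H)
    return sum(abs(mu(c) - mu(c & H) / mu_H) for c in cells)

weight = {x: Fraction(1, 10) for x in range(10)}     # uniform measure on 10 points
P = [frozenset(range(0, 5)), frozenset(range(5, 10))]
H = frozenset(range(1, 10))                          # mu(H) = 9/10

d = delta_to_restriction(P, H, weight)
bound = 2 * (1 - Fraction(9, 10))
print(d, bound)
```

In this instance the distance is $1/9$ against a bound of $1/5$.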

Proof (Lemma 5.4) From Lemma 5.2, let $F' \in \mathcal{B}'$ be chosen so that $F', \phi'F', \ldots, \phi'^{\,n-1}F'$ is a disjoint sequence, and $\mu'\left(\bigcup_{i=0}^{n-1} \phi'^{\,i}F'\right) > 1 - \varepsilon$. Let $Q'$ be any partition of $F'$ such that

$$d(Q') = d\left(\bigvee_{i=0}^{n-1} \phi^{-i}P \cap F\right). \qquad (2)$$

Then $Q'$ determines a partition of $\bigcup_{i=0}^{n-1} \phi'^{\,i}F'$ into column levels $\phi'^{\,i}A$, $A \in Q'$. Clearly, (2) determines an assignment of column names to the stack over $F'$, and thus defines a partition $P'$ of that stack (see the Remark following Definition 5.9) such that the gadgets $\Gamma$ and $\Gamma' = (F', \phi', n, P')$ are isomorphic. ∎

Proof (Lemma 5 . 5 ) Again we can choose Qo‘ to be any partition of F’ satisfying

and

v 4’-i(P)n F’ < Qo’.

n- 1

Po’ =

i= 0

This is clearly possible to do by partitioning each of the atoms of Po’ separately. Now let Q1’ be the lumping of atoms of Qo‘ corresponding to the recovery of Q from P v Q under the correspondence (3), and define Q’ from Q1’ as we did P’ from Q’ in the previous proof. I Proof (Lemma 5.6) Note that the partitions P and P‘ are only given on H = UYZd @F and H’ = &iF‘, respectively. Thus the right sides of (a) and (b) refer to distributions on those sets. Further note that the conclusion of the lemma is equivalent to asserting the existence of a partition Q such that r’is isomorphic to the gadget (F, 4, n, Q) and

u;:;

I

n-1

The proof is by induction on n. For n = 1 the existence of a Q satisfying (4), which becomes 1

p(P n F , Q n F ) =

1p ( P i A Q i ) < 3 ~ , i= 1

is immediate from (e):

c Ip(P,)- p’(P/)I < I

6(P, P’) =

E.

i= 1

Indeed, a copy Q of P’ can be constructed on F such that for each i = 1, ..., 1 either P i c Qi or Qi c P i . Then d(Q) = d(P’) so that (F, 4, 1, Q) is isomorphic to r‘ = (F’, 4’, 1, P’), and P(Pi A

Qi)

=

I d p i ) - 4 Q i ) I = IP(Pi) - ~(f‘i’)I*

+

Assume the theorem is true for n = k, and let r = (F, 4, k 1, P), P‘) be gadgets satisfying (a)-(e) for n = k + 1. Since each of the conditions (a)-(e) implies the same condition for

r’= (F’, 4’, k + 1,


u:

it follows that there exists a partition Qo on (F, 4, k, Q,) is isomorphic to r,’ and 1k - 1 p(P n +iF, &Po n 4 i F ) c 3 ~ . k i=o

4iF such that (5)

We need to show that Qo can be extended to a partition Q by defining it on the top level $&F in an appropriate manner so that (4) holds for n = k + 1 and r” = (F, Cp, k + 1, Q) is isomorphic to r’. We begin by extending the columns of the gadgets Toand (F, 4, k, Q,) to (PkF. That is, let

be the corresponding partitions of 4kF. In order that r” be isomorphic to r’ we need to have Q n c$kF independent of QPk[so that $ - iQ n F ( i = 0, . .., k) will be an independent sequence]. This is accomplished by defining Q on A E Q,, such that d(Q n A) = d(P’ n 4°F‘).This will also give the desired distribution for Q n 4kF.However, it will not guarantee that (4) holds. In order to obtain the closeness of fit desired between Q n 4kF and P n #F, we first replace r by an isomorphic gadget that differs from r only on the top level and satisfies d(P”‘ n ( A n B))= d(P”’ n B)

(6) for each A E Q,, B E P,. This we do by defining P”’ on @F so that it partitions each A n B ( AE Q, B E P,) the same as P partitions B, that is, d(P”’ n (A n B))= d(P n B).Let us drop the triple prime and assume P satisfies (6). According to (c) and the definition of &-independence,there exists a set C E V“r,’ 4 - P n F such that p(C) 2 (1 - &)p(F)and k- 1

6(4-kP n A, $ - k P ) < E,

AE

V 4 - 9 n C.

i=o

It follows that 6(P n B, P) < E,

B E Pk n @C.

Combining this with (e) gives 6(P n (A n B),P‘) c 2.5,

AE

Qk,

As before this means we can now construct Q on

B E P,

4’F

n CpkCC.

so that

(i) d(Q n (A n B ) ) = d(P’ n 4’lrFt),A E Q,, B E P, and (ii) p(Q n (A n B),P n (A n B ) ) < 2.5, A E Q, B E P, n 4’C.


We have thus constructed Q n 4kF to have the same distribution as P' n 4°F' and to be independent of Q,. Thus (F, 4, k 1, Q) is isomorphic to (F', 4', k 1, P'). Moreover, by (ii) and p(4'C) = p(C) 2 (1 - & ) p ( F )we have that

+

+

p(Q n 4kF, P n 4 k F ) < 3&,

and the induction step is complete.

1

Proof (Lemma 5.7) This follows from Theorem 4.13 in much the same way as the proof of Lemma 5.1 does from the ergodic theorem. Indeed, since L , convergence implies convergence in measure, given E > 0 we have

If D, is the indicated set with p(D,) -,0, then C, = X

-

v +-%(P)

n- 1

DnE

i=O

is the union of a collection %?, of atoms of VYZ; 4-9'. Finally, for x E A E VYZi t$-iP we have +-i.49(P),x) = -log A. 1

Z(vYd:

Notice the difference between Lemma 5.7 and Corollary 4.13.1, which implies that for any atom A of 4-iP and sufficiently large n we have IW, + (1/n)log CL(A)I < &/CL(A).

v;:,'

Proof (Lemma 5.8) We need to show that @P is e-independent of or equivalently that P is &-independent of VYL,' q5-iP, for VYZ; each positive integer n. Since n- 1

H(P) - h(4, P) 2 H(P) - H it is sufficient to show that there exists a 6 > 0 such that, if P and Q are partitions with P having k atoms and

H(P) - H(P I a)I6, then P is &-independentof Q. Moreover, we may assume that Q is a two-set partition. For if P has k sets and is not e-independent of Q, then the collection V of atoms A of Q for which 6(P n A, P) 2 E has total measure greater than E. It follows that

cc

PEP A€%

I P P n A ) - P(P) P(A)I 2 2.


Then there must be a Po E P such that

If %' and %?- denote the subcollections of A € % ? for which p ( P n A) - p ( P ) p ( A ) is, respectively, nonnegative and negative, it follows that for one of them, call it W ,

c

[PPO n A ) - P P O ) P W 1

Let S = n W' and Qo = {S, 9). Then p(S) 2 ~ ~ / 2 k and

6(P n S, P) 2 c2/2k

(7)

and

0 < H(P) - H(P I Q,) 5 H(P) - H(P I Q).

(8)

Thus P is not (t2/2k)-independent of Qo, and if H(P) - H(P I Qo) > 6, so is H(P) - H(P I Q). Now let K E R3k+' be the set of all vectors (d(P), d(P n S), d(P n p(S)) determined by k-set partitions P of X and sets S E 49 for which (7) holds. Clearly, P cannot be refined by Qs = {S, and so is never zero on K. Since K is compact and is continuous on K, it is bounded away from zero, and the proof is complete. I

s),

s},

Proof (Lemma 5.9) For a fixed number of atoms the function $H(P')$ is clearly continuous for the distribution metric $\delta(P, P')$ (since $-t \log t$ is a continuous function of $t$). Thus we may choose $\delta$ small enough that (i) implies $|H(P) - H(P')| < \delta_1$ for any predetermined $\delta_1 > 0$. This with (ii) implies that

$$|H(P') - h(\phi', P')| < \delta + \delta_1.$$

According to Lemma 5.8, if $\delta$ and $\delta_1$ are sufficiently small, this implies that $\{\phi'^{\,n}P'\}$ is an $\varepsilon$-independent sequence. Now apply Lemma 5.3 to build gadgets $\Gamma$ and $\Gamma'$ satisfying the hypotheses of Lemma 5.6. If $\delta < \varepsilon/6$, it follows from that lemma that $\gamma(\Gamma, \Gamma') < \varepsilon/2$. If $\Gamma$ and $\Gamma'$ are made to come sufficiently close to filling $X$ and $X'$, respectively, we can conclude that (iii) holds. ∎

Remark For a nonatomic measure space $X$ it is not hard to see that $h(\phi, P)$ assumes every value between 0 and $h(\Phi)$ as $P$ ranges over the


finite partitions of $X$. Indeed, for any $P = \{P_1, \ldots, P_l\}$ let $P_t = \{P_1^t, \ldots, P_l^t\}$ ($0 \le t \le 1$) be such that

$$P_j^t \subseteq P_j, \qquad \mu(P_j^t) = t\,\mu(P_j)$$

for $j = 2, \ldots, l$. Then $h(\phi, P_t)$ is a continuous function of $t$ and so assumes all values between $h(\phi, P_0) = 0$ and $h(\phi, P_1)$.

Proof (Lemma 5.10) Since

$$h(\Phi) = h(\phi, P) \le h(\Phi'),$$

it follows from the preceding remark that given $\varepsilon > 0$ we can choose a partition $Q_0$ of $X'$ such that

$$0 < h(\phi, P) - h(\phi', Q_0) < \varepsilon.$$

Now let $\beta > 0$ and choose $n$ sufficiently large (Lemma 5.7) that

(a) there is a collection ‘G‘

E

VyZ; 4’-’Q0with

(i) p(u W) 2 1 - P, (ii) ,-“(A(@‘, 00)+8) 5 p ( A ) 2 e - n ( h ( # * 0 0 ) - b ) for A (b) there is a collection ‘G c

E

W’; and

VYZd +-iP with

(i) p ( u W) 2 1 - P, (ii) e-n(h(4. P ) + P ) 5 p ( ~I) e-n(h(4, $1-8) for A E v. By choosing fl small enough and n large enough, since h(4’, Q,) < A(+, P), we can assume that Qo)+P) < ,fl(W. - P). It follows, from (iii) of Lemma 5.7, that V(U’)5 v(%).

(9)

According to Lemma 5.1, we can assume that n and V? satisfy

wheref,(i, n) is the relative frequency of occurrence of i in the P-n-name of the atom A . It then follows from Lemma 5.3 that there is a gadget (F’,+’, n, Q,) in (X’, 9?’, p’) such that


and In-

n- 1

1

Now let F = (u W ) n F’, and set WF’= W‘ n F. From (i) of part (a), (9), and (ll), it follows that n- 1

and v(W) 2 v(W,’).

be any monic map of qF’into 59. We define a new partition on :!Let u(F, 4’,$ 4’iF that (F, 4’, has the same column structure as 42,) as follows. For each WF’ assign the column level to Qi, where Pil 4 - @ - I ) P . . Define 0 arbitrarily on X - u ; : 4’iF without increasing the number of atoms Clearly, satisfies conclusion (i) of the lemma. Let us show beyond that it also satisfies (ii) and (iii) if fi and n are chosen appropriately. IS and if 2 u;;; is the column over we have by (10) that Q

n, Q)

so

n,

@’A

@ ( A )=

v(P).

AE n +-lPi2 n

4 . .

n

1,-

1

Q

tl,

A E gF’

=

4’A

A,

which implies

Together with (13) this gives (ii) for the proper choice of p. Now let Q l = (F, F). Then, recalling that (F, 4’, n, Q) and (F, 4’, n, Q,) have the same column structure, we see easily that Q, n F <

and Q nF <

L:v.

~ V ( Qv Q ~ ) n ) F

( i/ 6 i ( ~QJ)~

n P.

v

i=-n

If fl is small, F is almost all of X’.If also n is large, Q, is almost trivial. Hence h(@, Qo) is near h(#, Q). Thus we can conclude from 0 < h(4, P) - W’, Qo) < E that proper choices of n and /? give 0 < h(4, P) - h(#, Q) < E .

I


Proof (Lemma 5.11) Since P is finitely determined, there exist n, and 6, > 0 such that (i)-(iv) imply ~((0, P), (O’, P’)) < E , (to be determined later). We may assume that 6, < E. Furthermore, we can assume that

0 < h(4, P) - h@’, P’) < 6,.

(15) Otherwise, it would be true that h(4, Q) = h(@, P’) for all P’ satisfying (ii) and (iii). In this case, we could choose Q satisfying (ii), (vii), and d(V;:,’d’-’Q) = d(V;:,’4-’P), and be done. Now choose as in the proof of Lemma 5.10 a partition Qo such that P’ < &Do and 0 < h(4, P) - h(&, Q,) < a, where a is yet to be specified. Next choose /3 > 0 and n such that (a), (b), (9),and (10) all hold, and choose F’ so that (11) and (12) hold. It follows then that

6( \‘#-w n F’, i=o

v 4r-iP’ ni =- O l

1 -=

28.

Define F and WF‘ as before, so that (13) and (14) hold. Now define $: WF‘-,% as before but subject to the following condition: (A) for some collection d E W F ’with $(u d )2 (1 - 4 3 ) p ’ ( F ) it is true that the P’-n-name of each A E d and the P-n-name of $(A) E %‘ agree in more than m/3 places.

It will then follow that

1 P[(Qi n 2)A (pi’ n A)]I

~(2)~

i

where A is the column over A E %,’ so that p(Q n F, P‘ n F ) I 2~13.

Thus to complete the proof of the lemma we only need to show that $\psi$ can be defined in such a way that (A) holds. To establish (A) we go back to the inequality

$$\pi((\Phi, P), (\Phi', P')) < \varepsilon_1,$$

with its still unspecified $\varepsilon_1$. Choose a gadget $(F_0, \phi, n, P)$ in $X$ such that

By Lemma 5.3 choose a new partition $P^*$ so that $(F', \phi', n, P^*)$ is isomorphic to $(F_0, \phi, n, P)$ and

$$\frac{1}{n} \sum_{i=0}^{n-1} \rho(P' \cap \phi'^{i}F', P^* \cap \phi'^{i}F') < \varepsilon_1.$$


Replace F’ by F. For sufficiently small fi 6

r1

v 4-iP n F o , v 4’-iP* n F ) < p

i=O

and

n- 1 i=0

1 n-1

c

p(P’ n 41iF, P* n &iF)< 2.5,. n i=o Put ‘XF* = %* n F. For small p we can assume v ( % ~ *2) v ( % ~ ’ ) In . fact, we shall choose /?so small that -

$ ( A ) 2 4p(A*), A E %F‘, A* E gF*, (16) so that at least 4 sets in WF* are needed to cover a set in ‘XF‘. Let d consist of all those such that more than half of A is covered by sets A* E WF* such that the P’-n-name of A and the P*-n-name of A* differ in no more than n& places. Let E be the set of points x E F such that the P’-n-name and P*-n-name of x disagree in more than places. It follows easily that ,@)I f i p ‘ ( F ) , Moreover, if B = v WF*, then

fi

c

P ’ ( 4 5 2[P’(F

A€WFr-S4

-

B)+ PWI.

If and fi are small enough, it follows that p’(u d)2 (1 - ~ / 3 ) p ’ ( Fas ) asserted in (A). It is clear from (16) that any k elements A,, . .., A, E at intersect at least k elements in ‘XF* whose P*-n-name differs from the P’-n-name of at places. The marriage lemma least one of the A, in no more than of combinatorics (see [30])then implies the existence of a monic map t,h from .dto VF*such that the P’-n-name of A E d and the P*-n-name of $ ( A ) differ in no more than n& places. Extend the definition of J/ to VP* d in any way, and we have proved (A). I

6

-
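The appeal to the marriage lemma above can be illustrated with a small computation. The sketch below is not taken from the text. It assumes that each element of the collection comes with a finite set of admissible partners (the labels "A1", "B1", and so on are hypothetical), and it searches for an injective assignment by augmenting paths (Kuhn's algorithm, a standard matching technique used here only for illustration). When every $k$ elements have at least $k$ admissible partners between them, an injective map exists.

```python
from typing import Dict, Hashable, Optional, Set


def injective_choice(
    candidates: Dict[Hashable, Set[Hashable]]
) -> Optional[Dict[Hashable, Hashable]]:
    """Return an injective map a -> b with b in candidates[a], or None.

    `candidates` is a hypothetical input: for each element a it lists the
    partners b that are admissible for a.
    """
    owner: Dict[Hashable, Hashable] = {}  # partner b -> element a using it

    def try_assign(a: Hashable, visited: Set[Hashable]) -> bool:
        # Try to give `a` a partner, re-routing earlier choices if needed.
        for b in candidates[a]:
            if b in visited:
                continue
            visited.add(b)
            if b not in owner or try_assign(owner[b], visited):
                owner[b] = a
                return True
        return False

    for a in candidates:
        if not try_assign(a, set()):
            return None  # some subfamily has too few admissible partners
    return {a: b for b, a in owner.items()}


# Every k elements have at least k admissible partners, so a map exists.
good = {"A1": {"B1", "B2"}, "A2": {"B1"}, "A3": {"B2", "B3"}}
print(injective_choice(good))

# Two elements compete for a single partner, so no injective map exists.
bad = {"A1": {"B1"}, "A2": {"B1"}}
print(injective_choice(bad))
```

In the proof, the elements would be the sets in the collection built from (16) and the admissible partners would be the sets whose names are close enough. The code only illustrates the combinatorial step, not the measure-theoretic bookkeeping.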

Proof (Lemma 5.12) The $\delta$ we choose is the $\delta$ of Lemma 5.11 corresponding to $\varepsilon/2$. Let $\Phi'$ and $P'$ be as specified in Lemma 5.12, and for each $n$ let $\delta_n$ be the $\delta$ of Lemma 5.11 corresponding to $\varepsilon/2^n$. By induction choose partitions $Q_n$ such that $Q_0 = P'$ and (by Lemma 5.11)

(a) $\delta(Q_n, P) < \delta_{n+1}$,
(b) $0 \le h(\phi, P) - h(\phi', Q_n) \le \delta_{n+1}$,
(c) $\rho(Q_n, Q_{n-1}) < \varepsilon/2^n$.

Thus $\{Q_n\}$ is a Cauchy sequence of partitions in the $\rho$-metric. It follows readily from the Riesz-Fischer theorem that this implies the existence of a limiting partition $Q$, that is, $\rho(Q_n, Q) \to 0$. But then from (a)-(c) we deduce

(d) $d(Q) = d(P)$,
(e) $h(\phi', Q) = h(\phi, P)$,
(f) $\rho(Q, P') \le \rho(P', Q_1) + \sum_{n=2}^{\infty} \rho(Q_n, Q_{n-1}) < \varepsilon$.

Since (d) and (e) imply that

$$h(\phi', Q) = H(P) = H(Q),$$

it follows (Exercise 2) that $\{\phi'^{n}Q\}$ is an independent sequence, and we are done. ∎

3. THE ISOMORPHISM THEOREM

Theorem 5.1 (Ornstein) Let $\pi = (p_1, p_2, \ldots, p_k)$ and $\pi' = (p_1', p_2', \ldots, p_l')$ be probability distributions. Then the Bernoulli systems $\Phi(\pi)$ and $\Phi(\pi')$ are isomorphic iff $h(\Phi(\pi)) = h(\Phi(\pi'))$.

According to Definition 5.1'' and the Remark following, it is sufficient to show that for two systems $\Phi$ and $\Phi'$ with independent (finite) generators $P$ and $P'$, respectively, such that $h(\Phi) = h(\Phi')$, we can find a partition $Q$ of $X$ such that

(a) $\nu(Q) = \nu(P')$,
(b) $d(Q) = d(P')$,
(c) $\phi^n Q$ ($n = 0, 1, 2, \ldots$) is an independent sequence, and
(d) $P \subseteq \bigvee_{n=-\infty}^{\infty} \mathcal{B}(\phi^n Q)$.

Throughout this section $\Phi$, $\Phi'$, $P$, and $P'$ will be assumed to be as above. Lemma 5.12 provides the tool for satisfying (a)-(c) and an approximate form of (d). In order to facilitate the use of this lemma in a succession of approximations, we introduce the following notation.
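As a numerical illustration of the hypothesis of Theorem 5.1, the following sketch computes $-\sum_i p_i \log p_i$ for two finite distributions with different numbers of symbols. The function name `distribution_entropy` and the choice of base-2 logarithms are assumptions made here for convenience. Changing the base rescales both entropies by the same factor, so it does not affect whether they agree.

```python
import math
from typing import Sequence


def distribution_entropy(dist: Sequence[float], base: float = 2.0) -> float:
    """Entropy -sum p_i log p_i of a finite probability distribution."""
    if any(p < 0 for p in dist) or not math.isclose(sum(dist), 1.0, abs_tol=1e-12):
        raise ValueError("dist must be a probability distribution")
    # Terms with p = 0 contribute nothing (0 log 0 is taken to be 0).
    return -sum(p * math.log(p, base) for p in dist if p > 0)


pi_a = (0.25, 0.25, 0.25, 0.25)            # four equally likely symbols
pi_b = (0.5, 0.125, 0.125, 0.125, 0.125)   # five symbols, unequal weights

print(distribution_entropy(pi_a))
print(distribution_entropy(pi_b))
```

Both values equal 2 (in bits), so the theorem applies to this pair: the two Bernoulli systems are isomorphic although their state spaces have different sizes.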

Definition 5.17 If $P$ and $Q$ are partitions of the same space $(X, \mathcal{B}, \mu)$ and $\varepsilon > 0$, we say that $P$ $\varepsilon$-refines $Q$ and write $Q <_\varepsilon P$ if there is a partition $Q' < P$ such that (a) $\nu(Q') = \nu(Q)$, and (b) $\rho(Q, Q') < \varepsilon$.

Now it is easily seen that condition (d) above is equivalent to the following.

(e) For each $\varepsilon > 0$ there exists $n = n(\varepsilon)$ such that $P <_\varepsilon \bigvee_{i=-n}^{n} \phi^i Q$.

For a measurable partition $Q$ of $X$, let $Q^\infty = \bigvee_{n=-\infty}^{\infty} \mathcal{B}(\phi^n Q)$. Then $\Phi_Q = (X, Q^\infty, \mu, \phi)$ is a factor of $\Phi$ ($\Phi_Q$ is the natural extension of OJ(Q) as defined in Chapter IV). Conditions (a)-(c) are sufficient to show that $\Phi_Q \cong \Phi'$. In general, if $Q$ and $Q'$ are partitions of $X$ and $X'$, respectively, then $\Phi_Q \cong \Phi'_{Q'}$ under the canonical correspondence determined by identifying $Q$ and $Q'$ iff

(f) $d(\bigvee_{i=0}^{n} \phi^i Q) = d(\bigvee_{i=0}^{n} \phi'^{i} Q')$, $n = 0, 1, 2, \ldots$.

In this case, we shall write $\Phi_Q \sim \Phi'_{Q'}$. Our next result shows that if $\Phi_Q \sim \Phi'_{P'}$ we can find a partition $P_1$ of $X$ such that $\Phi_{P_1} \sim \Phi_P$, $\Phi_{P_1}$ is a (canonical) factor of $\Phi_Q$, and $\Phi_Q$ is almost a factor of $\Phi_{P_1}$. (Note that $\Phi_{P_1} \sim \Phi_P$ does not imply $P_1$ is a generator for $\Phi$, but only that $\Phi$ contains a "copy" of itself.)

Lemma 5.13 Suppose $\Phi_Q \sim \Phi'_{P'} = \Phi'$ and let $\varepsilon > 0$. Choose $N$ so that $Q <_\varepsilon \bigvee_{i=-N}^{N} \phi^i P$. Then there is a partition $P_1$ of $X$ such that

(i) $\Phi_{P_1} \sim \Phi_P$,
(ii) $P_1 \subseteq Q^\infty$, and
(iii) $Q <_{2\varepsilon} \bigvee_{i=-N}^{N} \phi^i P_1$.

More precisely, if $Q' < \bigvee_{i=-N}^{N} \phi^i P$ satisfies $\rho(Q, Q') < \varepsilon$, then the canonical copy $Q_1' < \bigvee_{i=-N}^{N} \phi^i P_1$ satisfies $\rho(Q, Q_1') < 2\varepsilon$.

Proof Suppose $Q'$ is as indicated. That is, $Q' = f(P) = (f_1(P), f_2(P), \ldots, f_k(P))$ is a partition of $X$ into $k$ sets, each of which is a union of atoms from $\bigvee_{i=-N}^{N} \phi^i P$, and $\rho(Q, Q') < \varepsilon$. Then, for any other partition $\bar P$ of $X$ with $\nu(\bar P) = \nu(P)$, $f(\bar P)$ makes sense and is a partition of $X$ into $k$ sets with $f(\bar P) < \bigvee_{i=-N}^{N} \phi^i \bar P$.

Now if $\alpha > 0$ is any positive number, there is an $N_1 \ge N$ such that $Q <_\alpha \bigvee_{i=-N_1}^{N_1} \phi^i P$. As before, we have $f_1(P) < \bigvee_{i=-N_1}^{N_1} \phi^i P$ and $\rho(Q, f_1(P)) < \alpha$. Applying Lemma 5.2 to $\Phi_Q$, we have for each positive integer $n$ a set $F \in Q^\infty$, depending on $\alpha$ and $n$, such that $F, \phi F, \ldots, \phi^{n-1}F$ is a disjoint sequence and $\mu(\bigcup_{i=0}^{n-1} \phi^i F) > 1 - \alpha$. According to Lemma 5.5 applied to $\Phi = \Phi_P$ and $\alpha$, there is a $P^* \subseteq Q^\infty$ such that the gadgets $(F, \phi, n, Q \vee P)$ and $(F, \phi, n, Q \vee P^*)$ are isomorphic. This means, in particular, that $Q \vee P$ and $Q \vee P^*$ have the same distribution on $\tilde F = \bigcup_{i=0}^{n-1} \phi^i F$. Summing out the atoms of $Q$, we see that $d(P \cap \tilde F) = d(P^* \cap \tilde F)$. Hence $\delta(P, P^*) < \alpha$.

Let us look at $f_1(P^*)$. Since $f_1(P) < \bigvee_{i=-N_1}^{N_1} \phi^i P$ and $f_1(P^*) < \bigvee_{i=-N_1}^{N_1} \phi^i P^*$ are formed by the same rule, it follows that $f_1(P)$ and $f_1(P^*)$ partition the atoms of $Q$ in the same way on all of $\tilde F$ except possibly the top $N_1$ levels and the bottom $N_1$ levels. Since $\nu(Q) = \nu(f_1(P)) = \nu(f_1(P^*))$, and $\rho(Q, f_1(P)) < \alpha$, it follows that

$$\rho(Q, f_1(P^*)) < \alpha + 2N_1\mu(F) + \alpha < 3\alpha \tag{17}$$

if $n > 2N_1/\alpha$.

Now suppose that $\beta > 0$ is given. Then for $\alpha$ sufficiently small, (17) implies

$$h(\phi, P^*) \ge h(\phi, f_1(P^*)) \ge h(\phi, Q) - \beta = H(Q) - \beta = H(P') - \beta = h(\Phi') - \beta = h(\Phi) - \beta = h(\phi, P) - \beta.$$

If, in addition, $\alpha < \beta$, then we have

$$0 \le h(\phi, P) - h(\phi, P^*) < \beta. \tag{18}$$

According to Lemma 5.12 (with $\Phi' = \Phi_Q$), if $\delta > 0$ is given, we can choose $\beta$ so that (18) implies the existence of a partition $P_1 \subseteq Q^\infty$ such that $\Phi_{P_1} \sim \Phi_P$ and $\rho(P_1, P^*) < \delta$. Finally, given $\varepsilon > 0$ we can choose $\delta$ (and then $\beta$, $\alpha$, $N$, $N_1$, $n$ in that order) so that $\rho(P_1, P^*) < \delta$ implies $\delta(P, P^*) < \beta$,

$$\rho(Q, f(P_1)) \le \rho(Q, f(P^*)) + \rho(f(P^*), f(P_1)) < 2\varepsilon,$$

and hence $Q <_{2\varepsilon} \bigvee_{i=-N}^{N} \phi^i P_1$.

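Lemma 5.14 Suppose $\Phi_Q \sim \Phi'_{P'}$, and let $\varepsilon > 0$. Then there is a partition $Q_1$ of $X$ and a positive integer $k$ such that

(i) $\Phi_{Q_1} \sim \Phi'_{P'}$,
(ii) $P <_\varepsilon \bigvee_{i=-k}^{k} \phi^i Q_1$, and
(iii) $\rho(Q, Q_1) < \varepsilon$.

Proof Given $\alpha > 0$ choose $N_1$ so that

$$Q <_\alpha \bigvee_{i=-N_1}^{N_1} \phi^i P.$$

Apply Lemma 5.13 to find a partition $P_1$ of $X$ such that $\rho(Q, f_1(P_1)) < 2\alpha$, where $f_1(P_1) < \bigvee_{i=-N_1}^{N_1} \phi^i P_1$ and $\rho(Q, f_1(P)) < \alpha$, and such that (i) and (ii) of that lemma hold. Since $P_1 \subseteq Q^\infty$, we can choose $N_2 > N_1$ so that

$$P_1 <_\alpha \bigvee_{i=-N_2}^{N_2} \phi^i Q.$$

Let $f_2$ be such that $f_2(Q) < \bigvee_{i=-N_2}^{N_2} \phi^i Q$ and $\rho(P_1, f_2(Q)) < \alpha$. Next choose $N_3 > N_2$ so that

$$P_1 <_\beta \bigvee_{i=-N_3}^{N_3} \phi^i Q,$$

where $\beta = \beta(\alpha)$ is yet to be specified. Again let $f_3$ be such that $f_3(Q) < \bigvee_{i=-N_3}^{N_3} \phi^i Q$ and $\rho(P_1, f_3(Q)) < \beta$.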


Given $n$ and $r > 0$ we can by Lemma 5.3 choose $F \in \mathcal{B}$ such that $F, \phi F, \ldots, \phi^{n-1}F$ is a disjoint sequence, $\mu(X - \bigcup_{i=0}^{n-1} \phi^i F) < r$, and $\delta((\bigvee_{i=0}^{n-1} \phi^{-i}P) \cap F, \bigvee_{i=0}^{n-1} \phi^{-i}P) < 2r$. Similarly, choose $E \in Q^\infty$ such that $E, \phi E, \ldots, \phi^{n-1}E$ is a disjoint sequence, $\mu(X - \bigcup_{i=0}^{n-1} \phi^i E) < r$, and $\delta((\bigvee_{i=0}^{n-1} \phi^{-i}P_1) \cap E, \bigvee_{i=0}^{n-1} \phi^{-i}P_1) < 2r$. It follows (since $\Phi_{P_1} \sim \Phi_P$) that the gadgets $(F, \phi, n, P)$ and $(E, \phi, n, P_1)$ are isomorphic. Again applying Lemma 5.5, we can find a partition $Q^* \subseteq \mathcal{B}$ such that

$$(F, \phi, n, P \vee Q^*) \cong (E, \phi, n, P_1 \vee Q).$$

In the applications of Lemma 5.3 above, let us assume that $n$ is so large and $r$ so small that $\rho(f_3(Q^*), P) < 2\beta$ and $\delta(Q^*, Q) < 2\beta$. Likewise, we can assume that $\rho(f_2(Q^*), P) < 2\alpha$ and $\rho(f_1(P), Q^*) < 3\alpha$. The first two conditions imply for $\beta$ sufficiently small that $h(\phi, Q^*)$ and $h(\phi, Q)$ are close, and so by Lemma 5.12 that there exists a partition $Q_1$ satisfying $\Phi_{Q_1} \sim \Phi_Q$ and $\rho(Q_1, Q^*) < \delta$, where $\delta > 0$ is yet to be specified. It will then follow that $\rho(\bigvee_{i=-N_2}^{N_2} \phi^i Q_1, \bigvee_{i=-N_2}^{N_2} \phi^i Q^*)$ is small enough (for appropriate choice of $\delta$) that $\rho(f_2(Q_1), P) < 3\alpha$. Summing up, we can choose $\beta$ such that

$$\rho(Q, Q_1) \le \rho(Q, f_1(P)) + \rho(f_1(P), Q^*) + \rho(Q^*, Q_1) < \alpha + 3\alpha + \delta < 5\alpha.$$

Setting $\alpha = \varepsilon/5$ and $k = N_2$ completes the proof. ∎

Proof of Theorem 5.1 As in the proof of the principal lemma, we make a sequence of approximations and let $Q$ be the limiting partition. Thus from Lemmas 5.10 and 5.12 we can choose $Q_0$ so that $\Phi_{Q_0} \sim \Phi'_{P'} = \Phi'$. By Lemma 5.14 we can define inductively partitions $Q_n$ and integers $k_n \uparrow \infty$ such that

($\alpha$) $\Phi_{Q_n} \sim \Phi'_{P'}$,
($\beta$) $P <_{2^{-n}} \bigvee_{i=-k_n}^{k_n} \phi^i Q_n$, and
($\gamma$) $\rho(Q_{n-1}, Q_n) < 2^{-n}$.

Let $Q$ be the limiting partition, $\rho(Q, Q_n) \to 0$. It follows from ($\alpha$) that $\Phi_Q \sim \Phi'_{P'}$, so that (a), (b), and (c) following the statement of the theorem hold. It only remains to show that (d) holds, that is,

$$P \subseteq \bigvee_{n=-\infty}^{\infty} \mathcal{B}(\phi^n Q). \tag{19}$$

From ($\beta$) and ($\gamma$) we have

$$\rho(Q_n, Q) \le \sum_{k=n+1}^{\infty} 2^{-k} = 2^{-n}$$

and so

$$P <_{2^{-n}(1+\nu(P'))} \bigvee_{i=-k_n}^{k_n} \phi^i Q. \tag{20}$$

Since (20) implies (e), which is equivalent to (d), we are finished. ∎

4. EXTENSIONS AND CONSEQUENCES OF THE ISOMORPHISM THEOREM

In Section 1 we defined the Bernoulli system $\Phi(\pi)$ where $\pi = (p_1, p_2, p_3, \ldots)$ is a countably infinite probability distribution on $Z^+$. In this case also,

$$h(\Phi(\pi)) = -\sum_{i=1}^{\infty} p_i \log p_i. \tag{21}$$

Alternatively, we can consider a dynamical system $\Phi$ with a countable partition $P$ such that $\bigvee_{n=-\infty}^{\infty} \mathcal{B}(\phi^n P) = \mathcal{B}$ and the sequence $\phi^n P$ is independent. It can be shown [59] that when the series (21) converges to a finite value, $\Phi$ has a finite independent generator, so that this case is reduced to that of Theorem 5.1. On the other hand, Ornstein has shown [46] that all Bernoulli systems with infinite entropy are isomorphic. The following results are due to Ornstein and will not be proved here.
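To make the convergence condition on the series (21) concrete, here is a small sketch that evaluates partial sums for one countable distribution. The geometric distribution $p_i = 2^{-i}$ and the use of base-2 logarithms are assumptions chosen for this illustration. For this choice each term equals $i \cdot 2^{-i}$, so the series converges to 2.

```python
import math


def partial_entropy(num_terms: int) -> float:
    """Partial sum of -sum p_i log2 p_i for the assumed p_i = 2**(-i)."""
    total = 0.0
    for i in range(1, num_terms + 1):
        p = 2.0 ** (-i)
        total -= p * math.log2(p)  # each term equals i * 2**(-i)
    return total


for n in (5, 10, 20, 60):
    print(n, partial_entropy(n))
```

The partial sums increase toward 2, so this distribution falls under the finite-entropy case described above. A distribution whose tail decays slowly enough would instead make the partial sums grow without bound.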

Theorem 5.2 Every factor of a Bernoulli system is a Bernoulli system.

Theorem 5.3 If $\Phi = \operatorname{inv\,lim}_{n \to \infty} \Phi_n$, and if each $\Phi_n$ is Bernoulli, then $\Phi$ is Bernoulli.

It is, of course, completely obvious that a product of a finite number of Bernoulli systems and hence, by Theorem 5.3, of a countable number of Bernoulli systems is Bernoulli. Thus the class of Bernoulli systems is closed under most of the constructions of Chapter I. It might be thought that the system $\Phi_A$ induced by an ergodic dynamical system $\Phi$ is Bernoulli iff $\Phi$ is Bernoulli. However, L. Swanson has shown in her Ph.D. dissertation (U. Calif., Berkeley 1975) that certain non-Bernoulli Kolmogorov systems induce Bernoulli systems $\Phi_A$ on sets $A$ of measure arbitrarily near one. Moreover, it is known [20] that $\Phi$ may be Bernoulli and $\Phi_A$ not. On the positive side, Saleski has shown [54] that a Bernoulli system $\Phi$ induces Bernoulli systems $\Phi_A$ on uncountably many measure-theoretically distinct sets $A$, and that, for weakly mixing systems $\Phi$ and certain sets $A$, $\Phi$ is Bernoulli if $\Phi_A$ is.

For years it was not known if there existed Kolmogorov systems that were not isomorphic to Bernoulli systems, or if entropy was a complete invariant for Kolmogorov systems, which if true would imply that all Kolmogorov systems are Bernoulli. (Why?) However, Ornstein, using techniques developed to prove Theorem 5.1, has shown [48] that there are such Kolmogorov systems. By refining Ornstein's construction, J. Clark showed (unpublished) that there is a Kolmogorov system $\Phi$ that has no $n$th root for any positive integer $n$. On the other hand, it follows from Theorem 5.1 (see Exercise 3) that Bernoulli systems $\Phi$ have roots of all orders. In fact [47], $\Phi$ can be embedded in a flow $\{\Phi_t\}$ such that each $\Phi_t$ is Bernoulli and $\Phi_1 = \Phi$.

The remainder of this section is devoted to a brief description of some additional concepts that have proved useful in the identification of Bernoulli systems. In Section 2, Definition 5.16, we introduced the notion of finitely determined partitions. It was noted that independent generators are finitely determined. We have defined Bernoulli systems to be systems with independent generators. Thus the class of systems with finitely determined generators might be thought to be a larger class. However, it can be shown that they coincide. In fact (see [57]), (i) if $\Phi$ has a finitely determined generator, then $\Phi$ is Bernoulli, and (ii) if $\Phi$ is Bernoulli, every generator for $\Phi$ is finitely determined.

Statement (i) above is established by carrying out the proof of Theorem 5.1 for systems with finitely determined generators and noting that for every positive number $t$ there is a Bernoulli system $\Phi$ with $h(\Phi) = t$. In [57] it is asserted that any partition (not only generators) for a Bernoulli system is finitely determined. Coupled with a theorem of Krieger that systems with finite entropy have finite generators, this and statement (i) above provide a quick proof of Theorem 5.2.

The principal usefulness of the concept of finitely determined partitions comes from the fact that certain conditions on generators, weaker than independence, can be shown to imply the property of being finitely determined. A couple of these, originally contrived by Friedman and Ornstein in the study of Markov systems (Exercise IV.12) and Bernoulli flows, are introduced below. The knowledgeable reader will note also the relation to channels with finite memory in information theory (see, e.g., Feinstein [19]).
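The remark that every positive number $t$ is the entropy of some Bernoulli system can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's argument. It measures entropy in bits, picks $k$ symbols with $\log_2 k \ge t$, and interpolates between a point mass and the uniform distribution on $k$ symbols. Entropy varies continuously from 0 to $\log_2 k$ along this family, so bisection locates a distribution with entropy $t$. The names `entropy_bits` and `distribution_with_entropy` are hypothetical.

```python
import math
from typing import List


def entropy_bits(dist: List[float]) -> float:
    """Entropy -sum p_i log2 p_i, with 0 log 0 taken to be 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def distribution_with_entropy(t: float, tol: float = 1e-12) -> List[float]:
    """Return a finite distribution whose entropy (in bits) is close to t."""
    if t <= 0:
        raise ValueError("t must be positive")
    k = max(2, math.ceil(2 ** t))  # enough symbols that log2(k) >= t

    def family(s: float) -> List[float]:
        # s = 0 gives a point mass, s = 1 gives the uniform distribution.
        return [(1 - s) + s / k] + [s / k] * (k - 1)

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if entropy_bits(family(mid)) < t:
            lo = mid
        else:
            hi = mid
    return family(hi)


d = distribution_with_entropy(1.5)
print(d, entropy_bits(d))
```

For $t = 1.5$ the search uses three symbols, since $\log_2 3 > 1.5$, and returns a distribution whose entropy matches $t$ up to the bisection tolerance.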

Definition 5.18 Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an ergodic dynamical system. A measurable partition $P$ of $X$ is said to be weak Bernoulli if for each $\varepsilon > 0$ there is a positive integer $N$ such that $\bigvee_{i=-m}^{0} \phi^i P$ is $\varepsilon$-independent of $\bigvee_{i=N}^{N+m} \phi^i P$ for each $m = 0, 1, 2, \ldots$.

Remark $P$ is Bernoulli (or independent) if we can take $N = 1$ for each $\varepsilon$. Thus Bernoulli partitions are weak Bernoulli. Moreover, the Borel zero-one law shows that weak Bernoulli lies somewhere between Bernoulli and Kolmogorov.

Example 1 If $\Phi$ is the Markov system defined in Exercise IV.12, if $\Phi$ is mixing so that $\lim_{n \to \infty} p_{ij}^{(n)}$ exists for each $i$ and $j$ [$P^n = (p_{ij}^{(n)})$ is the $n$th power of $P$], and if $P = \{P_1, \ldots, P_k\}$, where $P_j = \{x \in X : x_0 = j\}$, then $P$ is weak Bernoulli.
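The mixing hypothesis in the example asks that the entries of the $n$th power of the transition matrix converge. The following sketch illustrates this for one small stochastic matrix. The matrix itself and the helper names `mat_mul` and `mat_power` are assumptions made for illustration, since the definition in Exercise IV.12 is not reproduced in this section.

```python
from typing import List

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mat_power(p: Matrix, n: int) -> Matrix:
    """n-th power of p by repeated multiplication (n >= 0)."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


# Hypothetical two-state transition matrix with all entries positive.
transition = [[0.9, 0.1],
              [0.5, 0.5]]

for n in (1, 5, 50):
    print(n, mat_power(transition, n))
```

Both rows of the 50th power are close to $(5/6, 1/6)$, the stationary distribution of this matrix, so each entry $p_{ij}^{(n)}$ has a limit that does not depend on the starting state $i$.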

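In order to give the next definition, let us first expand the domain of the gadget metric $\gamma$ to arbitrary pairs of finite sequences of partitions. Thus we set

$$\gamma((P_0, \ldots, P_{n-1}), (Q_0, \ldots, Q_{n-1})) = \inf \frac{1}{n} \sum_{i=0}^{n-1} \rho(P_i', Q_i'), \tag{22}$$

where the infimum is over measurable partitions $P_0', \ldots, P_{n-1}', Q_0', \ldots, Q_{n-1}'$ of $[0, 1]$ satisfying

$$d\Bigl(\bigvee_{i=0}^{n-1} P_i'\Bigr) = d\Bigl(\bigvee_{i=0}^{n-1} P_i\Bigr), \qquad d\Bigl(\bigvee_{i=0}^{n-1} Q_i'\Bigr) = d\Bigl(\bigvee_{i=0}^{n-1} Q_i\Bigr). \tag{23}$$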

Definition 5.19 Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an ergodic dynamical system. A measurable partition $P$ of $X$ is very weak Bernoulli if for each $\varepsilon > 0$ there is a positive integer $N = N(\varepsilon)$ such that for each positive integer $m$ there is a collection $\mathcal{C}_m$ of atoms in $\bigvee_{i=-m}^{-1} \phi^i P$ with

$$\gamma((P \cap A, \phi P \cap A, \ldots, \phi^{N-1}P \cap A), (P, \phi P, \ldots, \phi^{N-1}P)) < \varepsilon \quad \text{for each } A \in \mathcal{C}_m \tag{24}$$

and $\mu(\bigcup \mathcal{C}_m) > 1 - \varepsilon$.

It can be shown that weak Bernoulli implies very weak Bernoulli (Exercise 5) and the latter implies finitely determined (see [57]). This is done by generalizing and extending the proof of Lemma 5.9.

EXERCISES

1. Let $\Phi = (X, \mathcal{B}, \mu, \phi)$ be an invertible dynamical system. For each $n = 1, 2, \ldots$ the set $A_n = \{x \in X : \phi^n x = x\}$ is $\phi$-invariant. If $\Phi$ is ergodic, conclude that $\mu(A_n) = 0$ for all $n$ and $\Phi$ is antiperiodic or $\mu(A_n) = 1$ for some $n$. In the latter case, show that $X$ is atomic with $n$ atoms, and hence that $h(\Phi) = 0$.

2. If $\{\phi^n P\}$ is an independent sequence, it follows easily that $h(\phi, P) = H(P)$. Show that the converse is true.

3. (a) If $\Phi$ is a Bernoulli system, show that $\Phi^n = (X, \mathcal{B}, \mu, \phi^n)$ is Bernoulli for each $n$. (b) Deduce from Exercise IV.13(b) and Theorem 5.1 that Bernoulli systems have roots of all orders, and that the roots are also Bernoulli; that is, given a positive integer $n$ and a Bernoulli system $\Psi$, there exists a Bernoulli system $\Phi$ such that $\Phi^n = \Psi$.

4. Verify the statement in Example 5.1.

5. Let us say that the sequence $\{P_n\}$ of partitions of $X$ is weakly $\varepsilon$-independent if for each $n \in Z^+$ there is a collection $\mathcal{C}_n \subseteq P_n$ such that (i) $\mu(\bigcup \mathcal{C}_n) > 1 - \varepsilon$, and (ii) $\gamma((P_0 \cap A, P_1 \cap A, \ldots, P_{n-1} \cap A), (P_0, P_1, \ldots, P_{n-1})) < \varepsilon$ for all $A \in \mathcal{C}_n$. (a) Show that an $\varepsilon$-independent sequence is $2\varepsilon$-weakly independent. (b) Show that a weak Bernoulli partition is very weak Bernoulli.

Bibliography

CITED REFERENCES

1. L. M. Abramov, Entropy of a derived automorphism, Dokl. Akad. Nauk SSSR 128 (1959), 647-650 [Amer. Math. Soc. Transl. Ser. II, 49 (1960), 162-176].
2. L. M. Abramov, Metric automorphisms with quasidiscrete spectrum, Izv. Akad. Nauk SSSR Ser. Mat. 26 (1962), 513-530 [Amer. Math. Soc. Transl. (2) 39 (1964), 37-56].
3. M. A. Akcoglu, A pointwise ergodic theorem in $L_p$-spaces, Canad. J. Math. (to appear).
4. R. L. Adler, A. G. Konheim, and M. H. McAndrew, Topological entropy, Trans. Amer. Math. Soc. 114 (1965), 309-319.
5. W. Ambrose, Representation of ergodic flows, Ann. of Math. (2) 42 (1941), 723-739.
6. K. R. Berg, Convolution of invariant measures, maximal entropy, Math. Systems Theory 3 (1969), 146-150.
7. P. Billingsley, “Ergodic Theory and Information.” Wiley, New York, 1965.
8. J. Blum and D. Hanson, On the isomorphism problem for Bernoulli schemes, Bull. Amer. Math. Soc. 69 (1963), 221-223.
9. L. Breiman, The individual ergodic theorem of information theory, Ann. Math. Statist. 28 (1957), 809-811; Correction in 31 (1960), 809-810.
10. L. Breiman, On achieving channel capacity in finite-memory channels, Illinois J. Math. 4 (1960), 246-252.
11. J. R. Brown, A universal model for dynamical systems with quasi-discrete spectrum, Bull. Amer. Math. Soc. 75 (1969), 1028-1030.
12. J. R. Brown, Inverse limits, entropy and weak isomorphism for discrete dynamical systems, Trans. Amer. Math. Soc. 164 (1972), 55-66.
13. J. R. Brown, A model for ergodic automorphisms on groups, Math. Systems Theory 6 (1972), 235-240.
14. J. R. Choksi, Inverse limits of measure spaces, Proc. London Math. Soc. (3) 8 (1958), 321-342.
15. E. I. Dinaburg, The relation between topological entropy and metric entropy, Dokl. Akad. Nauk SSSR 190 (1970), 19-22 [Soviet Math. Dokl. 11 (1970), 13-16].
16. N. Dunford and J. T. Schwartz, “Linear Operators,” Part I. Wiley (Interscience), New York, 1958.
17. R. Ellis, Locally compact transformation groups, Duke Math. J. 24 (1957), 119-125.
18. R. Ellis, “Lectures on Topological Dynamics.” Benjamin, New York, 1969.
19. A. Feinstein, “Foundations of Information Theory.” McGraw-Hill, New York, 1958.
20. N. A. Friedman and D. S. Ornstein, Ergodic transformations induce mixing transformations, Advances in Math. 10 (1973), 147-163.
21. H. Furstenberg, Strict ergodicity and transformation of the torus, Amer. J. Math. 83 (1961), 573-601.
22. H. Furstenberg, The structure of distal flows, Amer. J. Math. 85 (1963), 477-515.
23. H. Furstenberg, Disjointness in ergodic theory, minimal sets, and a problem in Diophantine approximation, Math. Systems Theory 1 (1967), 1-49.
24. A. Garsia, “Topics in Almost Everywhere Convergence.” Markham, Chicago, 1970.
25. T. N. T. Goodman, Relating topological entropy and measure entropy, Bull. London Math. Soc. 3 (1971), 176-180.
26. L. W. Goodwyn, Topological entropy bounds measure-theoretic entropy, Proc. Amer. Math. Soc. 23 (1969), 679-688.
27. L. W. Goodwyn, Some counter-examples in topological entropy, Topology 11 (1972), 377-385.
28. W. H. Gottschalk and G. A. Hedlund, Topological dynamics, Amer. Math. Soc. Colloq. Publ. 36, Providence, 1955.
29. F. J. Hahn and W. Parry, Minimal dynamical systems with quasi-discrete spectrum, J. London Math. Soc. 40 (1965), 309-323.
30. M. Hall, “Combinatorial Theory.” Blaisdell, Waltham, Massachusetts, 1967.
31. P. R. Halmos, “Measure Theory.” Van Nostrand, Princeton, New Jersey, 1950.
32. P. R. Halmos, “Lectures on Ergodic Theory.” Publ. Math. Soc. Japan, No. 3, Tokyo, 1956.
33. P. R. Halmos, Entropy in Ergodic Theory. Univ. of Chicago Lecture Notes, 1959.
34. P. R. Halmos and J. von Neumann, Operator methods in classical mechanics, II, Ann. of Math. Ser. II, 43 (1942), 332-350.
35. K. Jacobs, Ergodic decomposition of the Kolmogorov-Sinai invariant, in “Ergodic Theory,” F. B. Wright (ed.), pp. 173-190. Academic Press, New York, 1963.
36. S. Kakutani, Induced measure preserving transformations, Proc. Imp. Acad. Tokyo (Japan Acad.) 19 (1943), 635-641.
37. S. Kakutani, Examples of ergodic measure preserving transformations which are weakly mixing but not strongly mixing, in “Recent Advances in Topological Dynamics,” A. Beck (ed.). Springer, New York, 1973.
38. I. Kaplansky, “Infinite Abelian Groups.” Univ. of Michigan Press, Ann Arbor, 1954.
39. J. D. Kerrick, Group automorphisms of the N-torus: a representation theorem and some applications. Ph.D. dissertation, Oregon State Univ., Corvallis, 1972.
40. H. B. Keynes and J. B. Robertson, Generators for topological entropy and expansiveness, Math. Systems Theory 3 (1969), 51-59.
41. A. N. Kolmogorov, A new metric invariant of transitive dynamical systems and automorphisms of Lebesgue spaces, Dokl. Akad. Nauk SSSR 119 (1958), 861-864. (In Russian.)
42. W. Krieger, On unique ergodicity, Proc. Sixth Berkeley Symp. Math. Statist. and Probability, Vol. II, pp. 327-346. Univ. of California Press, Berkeley, 1972.
43. B. McMillan, The basic theorems of information theory, Ann. Math. Statist. 24 (1953), 196-219.
44. L. D. Meshalkin, A case of isomorphism of Bernoulli schemes, Dokl. Akad. Nauk SSSR 128 (1959), 41-44. (In Russian.)
45. D. S. Ornstein, Bernoulli shifts with the same entropy are isomorphic, Advances in Math. 4 (1970), 337-352.
46. D. S. Ornstein, Two Bernoulli shifts with infinite entropy are isomorphic, Advances in Math. 5 (1970), 339-348.
47. D. S. Ornstein, Imbedding Bernoulli shifts in flows, Springer Lecture Notes 160 (1970), 178-218.
48. D. S. Ornstein, An example of a Kolmogorov automorphism that is not a Bernoulli shift, Advances in Math. 10 (1973), 49-62.
49. J. C. Oxtoby, Ergodic sets, Bull. Amer. Math. Soc. 58 (1952), 116-136.
50. W. L. Reddy, Lifting expansive homeomorphisms to symbolic flows, Math. Systems Theory 2 (1968), 91-92.
51. V. A. Rohlin, Exact endomorphisms of a Lebesgue space, Izv. Akad. Nauk SSSR Ser. Mat. 25 (1961), 499-530 [Amer. Math. Soc. Transl. Ser. II, 39 (1963), 1-36].
52. V. A. Rohlin, On the entropy of automorphisms of a compact commutative group, Theor. Probability Appl. 6 (1961), 322-323.
53. W. Rudin, “Fourier Analysis on Groups.” Wiley (Interscience), New York, 1962.
54. A. Saleski, On induced transformations of Bernoulli shifts, Math. Systems Theory 7 (1973), 83-96.
55. H. Schubert, “Topology.” Allyn and Bacon, Boston, 1968.
56. T. L. Seethoff, Zero-entropy automorphisms of a compact abelian group, Tech. Report No. 40, Oregon State University Department of Mathematics, Corvallis, 1968.
57. P. Shields, “The Theory of Bernoulli Shifts.” Univ. of Chicago Press, Chicago, 1973.
58. Ya. G. Sinai, On the concept of entropy for dynamical systems, Dokl. Akad. Nauk SSSR 124 (1959), 768-771. (In Russian.)
59. M. Smorodinsky, Ergodic theory, entropy, Springer Lecture Notes 214 (1970).
60. M. D. Weiss, Algebraic and other entropies of group endomorphisms, Math. Systems Theory 8 (1975), 243-248.

ADDITIONAL REFERENCES

61. L. M. Abramov and V. A. Rohlin, Entropy of a skew product of transformation with invariant measure, Vestnik Leningrad. Univ. 7 (1962), 5-13. (In Russian.)
62. R. L. Adler and B. Weiss, Entropy, a complete metric invariant for automorphisms of the torus, Proc. Nat. Acad. Sci. US 57 (1967), 1573-1576.
63. H. Anzai and S. Kakutani, Bohr compactifications of a locally compact abelian group I & II, Proc. Imp. Acad. Tokyo (Japan Acad.) 19 (1943), 476-480, 533-539.
64. A. Beck and J. T. Schwartz, A vector-valued random ergodic theorem, Proc. Amer. Math. Soc. 8 (1957), 1049-1059.
65. A. Brunel and M. Keane, Ergodic theorems for operator sequences, Z. Wahrschein. verw. Geb. 12 (1969), 231-240.
66. R. V. Chacon, Identification of the limit of operator averages, J. Math. Mech. 11 (1962), 961-968.
67. R. Ellis, Distal transformation groups, Pacific J. Math. 8 (1958), 401-405.
68. R. Ellis, A semigroup associated with a transformation group, Trans. Amer. Math. Soc. 94 (1960), 272-281.
69. R. Ellis and W. H. Gottschalk, Homomorphisms of transformation groups, Trans. Amer. Math. Soc. 94 (1960), 258-271.
70. N. A. Friedman, “Introduction to Ergodic Theory.” Van Nostrand-Reinhold, Princeton, New Jersey, 1970.
71. N. A. Friedman, Bernoulli shifts induce Bernoulli shifts, Advances in Math. 10 (1973), 39-48.
72. W. H. Gottschalk, Minimal sets: an introduction to topological dynamics, Bull. Amer. Math. Soc. 64 (1958), 336-351.
73. F. J. Hahn, On affine transformations of compact abelian groups, Amer. J. Math. (3) 85 (1963), 428-446.
74. F. Hahn and Y. Katznelson, On the entropy of uniquely ergodic transformations, Trans. Amer. Math. Soc. 126 (1967), 335-360.
75. F. Hahn and W. Parry, Some characteristic properties of dynamical systems with quasi-discrete spectra, Math. Systems Theory 2 (1968), 179-190.
76. P. R. Halmos and H. Samelson, On monothetic groups, Proc. Nat. Acad. Sci. US 28 (1942), 254-258.
77. D. L. Hanson and G. Pledger, On the mean ergodic theorem for weighted averages, Z. Wahrschein. verw. Geb. 13 (1969), 141-149.
78. G. A. Hedlund, Endomorphisms and automorphisms of the shift dynamical system, Math. Systems Theory 3 (1969), 320-375.
79. A. H. M. Hoare and W. Parry, Semi-groups of affine transformations, Quart. J. Math. Oxford (2) 17 (1966), 106-111.
80. A. H. M. Hoare and W. Parry, Affine transformations with quasi-discrete spectrum (I), J. London Math. Soc. 41 (1966), 88-96.
81. E. Hopf, “Ergodentheorie.” Springer, Berlin, 1937.
82. S. A. Juzvinskii, Metric properties of endomorphisms of compact groups, Izv. Akad. Nauk SSSR Ser. Mat. 29 (1965), 1295-1328 [Amer. Math. Soc. Transl. Ser. 2, 66 (1966), 63-98].
83. S. Kakutani, Random ergodic theorems and Markoff processes with a stable distribution, Proc. Second Berkeley Symp. Probability and Statist., pp. 247-261. Univ. of California Press, Berkeley, 1951.
84. S. Kakutani, Determination of the spectrum of the flow of Brownian motion, Proc. Nat. Acad. Sci. US 36 (1950), 319-323.
85. S. Kakutani, Ergodic theory, Proc. Intern. Congress of Mathematicians, pp. 319-323. Cambridge, 1952.
86. S. Kakutani and W. Parry, Infinite measure preserving transformation with “mixing,” Bull. Amer. Math. Soc. 69 (1963), 752-756.
87. Y. Katznelson, Ergodic automorphisms of $T^n$ are Bernoulli shifts, Israel J. Math. 10 (1971), 186-195.
88. U. Krengel, Entropy of conservative transformations, Z. Wahrschein. verw. Geb. 7 (1967), 161-181.
89. P.-F. Lam, On expansive transformation groups, Trans. Amer. Math. Soc. 150 (1970), 131-138.
90. D. Maharam, On orbits under ergodic measure-preserving transformations, Trans. Amer. Math. Soc. 119 (1965), 51-66.
91. J. Neveu, Une démonstration simplifiée et une extension de la formule d'Abramov sur l'entropie des transformations induites, Z. Wahrschein. verw. Geb. 13 (1969), 135-140.
92. D. S. Ornstein, On invariant measures, Bull. Amer. Math. Soc. 66 (1960), 297-300.
93. D. S. Ornstein, A K-automorphism with no square root and Pinsker's conjecture, Advances in Math. 10 (1973), 89-102.
94. D. S. Ornstein, A mixing transformation for which Pinsker's conjecture fails, Advances in Math. 10 (1973), 103-123.
95. D. S. Ornstein, The isomorphism theorem for Bernoulli flows, Advances in Math. 10 (1973), 124-142.
96. D. S. Ornstein and P. C. Shields, An uncountable family of K-automorphisms, Advances in Math. 10 (1973), 63-88.
97. D. S. Ornstein and P. C. Shields, Mixing Markov shifts of kernel type are Bernoulli, Advances in Math. 10 (1973), 143-146.
98. W. Parry, Intrinsic Markov chain, Trans. Amer. Math. Soc. 112 (1964), 55-56.
99. W. Parry, On the coincidence of three invariant $\sigma$-algebras associated with an affine transformation, Proc. Amer. Math. Soc. 17 (1966), 1297-1302.
100. W. Parry, Entropy and Generators in Ergodic Theory. Lecture Notes, Yale University Department of Mathematics, New Haven, 1966.
101. W. Parry and P. Walters, Minimal skew product homeomorphisms and coalescence, Compositio Math. 22 (1970), 283-288.
102. V. A. Rohlin, Selected topics from the metric theory of dynamical systems, Uspehi Mat. Nauk 4 (1949), 57-128 [Amer. Math. Soc. Transl. Ser. I, 49 (1960), 171-240].
103. V. A. Rohlin, Metric properties of endomorphisms of compact commutative groups, Izv. Akad. Nauk SSSR Ser. Mat. 28 (1964), 867-874. (In Russian.)
104. G.-C. Rota, On the maximal ergodic theorem for Abel limits, Proc. Amer. Math. Soc. 14 (1963), 722-723.
105. M. Sears, The automorphisms of the shift dynamical system are relatively sparse, Math. Systems Theory 5 (1971), 228-231.
106. P. C. Shields, Cutting and independent stacking of intervals, Math. Systems Theory 10 (1973), 1-4.
107. Ya. G. Sinai, Probabilistic ideas in ergodic theory, Amer. Math. Soc. Transl. (2) 31 (1963), 62-84.
108. Ya. G. Sinai, Weak isomorphism of transformations with invariant measure, Mat. Sb. (N.S.) 63 (105) (1964), 23-42 [Amer. Math. Soc. Transl. Ser. 2, 57 (1966), 123-143].
109. M. Smorodinsky, A partition on a Bernoulli shift which is not weakly Bernoulli, Math. Systems Theory 5 (1971), 201-203.
110. M. Smorodinsky, On Ornstein's isomorphism theorem for Bernoulli shifts, Advances in Math. 9 (1972), 1-9.
111. P. Walters, On the relationship between zero entropy and quasi-discrete spectrum for affine transformations, Proc. Amer. Math. Soc. 18 (1967), 661-667.
112. P. Walters, Topological conjugacy of affine transformations of compact abelian groups, Trans. Amer. Math. Soc. 140 (1969), 95-107.
113. P. Walters, Conjugacy properties of affine transformations of nilmanifolds, Math. Systems Theory 4 (1970), 327-333.
114. P. Walters, Some invariant $\sigma$-algebras for measure-preserving transformations, Trans. Amer. Math. Soc. 163 (1972), 357-368.
115. B. Weiss, The isomorphism problem in ergodic theory, Bull. Amer. Math. Soc. 78 (1972), 668-684.


Index

A

Borel, E., 7

Abramov, L. M., 98,140 Abramov-Hahn-Parry theorem, 84, 100 Abstract group, 72 Adding machine transformation, 33,40,66,

C Chacon, R., 11 Character (group), 72 Circle group, 72, 78 Column, 154 Column level, 154 Completely nondeterministic system, 146 Completely positive entropy, 146 Conditional entropy, 114 Conditional expectation (operator),

75

Adjoint, 5,44, 72 Adjunct transformations, 110 Affine entropy, 132 Affine transformation (system), 72,93 Algebra, invariant (totally), 23 Algebraic entropy, 148 Algebraically monothetic group, 95,108 Almost periodic point, 67 Atom, 152 Automorphism, 72

110-112, 143

Conditional information function, 114 Conservative operator, 36,39 Contraction, 8 Convergence theorem, monotone (dominated), 34 Convex function, 112 cyclic group, 75

B Baker’s transformation, 33 Berg, K. R., 133 Bernoulli, J., 7 Bernoulli shift (system), 95, 110, 126,

D

150-153,173,177 Birkhoff. G. D.. 7,9,150 Bohr compactification, 93

Density zero, 16,22, 37,38 Deterministic dynamical system, 146 187

188

INDEX

Direct product, 21, 26, 38, 53, 58, 69, 126, 128
Direct sum, 54, 69, 131
Direct summand, 56
Directed set, 24
Discrete spectrum, 32, 84
Distal dynamical system, 52, 60, 61, 68, 90
Distribution distance, 156
Distribution of a partition, 152
Divisible group, 103
Doubly stochastic operator, 5, 112-113
Dual group, 72
Dynamical system
  abstract, 2
  classical, 44

E
Eigenfunction, 18, 20, 41, 84
Eigenvalue, 18, 20, 41, 78
Ellis semigroup, 59, 69
  homomorphism of, 61
Entropy, 110, 113
Epimorphism, 72
Equicontinuous dynamical system, 51, 61, 68
Equivalence of dynamical systems, 2, 109
Ergodic automorphism, 92-96
Ergodic entropy, 130
Ergodic hypothesis, 15
Ergodic operator, 14
Ergodic part, 91
Ergodic system (transformation), 14, 19, 20, 21, 24, 28, 30, 31, 36, 37, 38, 76-84
Ergodic theorem
  individual, 9
  maximal, 8
  mean, 11, 35
Exact dynamical system, 39
Expansive index, 63
Expansive dynamical system, 63, 65, 70
Extreme point, 68, 131

F
Factor system, 23, 24, 38, 55, 61, 69, 128, 177
  direct, 23, 55
Finitely determined partition, 160, 178
Flow (special), 40
Fourier series (coefficients), 20
Fourier transform, 72
Furstenberg’s theorem, 62-63, 147

G
Gadget, 155
Gadget distance, 156
Generalized shift, 74
Generator, 64, 123, 153
Generic point, 49, 67
Goodwyn-Dinaburg-Goodman theorem, 131
Group extension, 62

H
Haar measure, 72, 102
Hahn-Banach theorem, 101
Halmos-von Neumann representation theorem, 84
Homomorphism, 23, 55
Hopf, E., 7, 8

I
Independent σ-algebras, 129
ε-independent partitions, 157
Independent sequence of partitions, 153, 180
Index, 51
Induced dynamical system (transformation), 28-32, 40, 148
  ergodicity of, 29, 39
Information function, 114
Injection, 56
Inverse limit, 24-28, 38, 39, 57-59, 69, 75, 129, 177
Inverse system, 24-28, 57-59
Isomorphism, 55, 142-143
  gadget, 155
  invariants, 110, 119, 128
  weak, 23

J
Jensen’s inequality, 112
Join, 27, 152


K
Kac’s theorem, 140, 149
Kakutani, S., 28, 31
Kerrick, J. D., 96
Keynes-Robertson-Reddy theorem, 65
Kolmogorov, A. N., 110, 150
Kolmogorov system, 39, 134, 146, 150
Krein-Milman theorem, 68

L
Law of large numbers, 158
Lebesgue system, 39
Limit, *-lim, 16

M
Markov-Kakutani fixed point theorem, 45
Markov shift, 145
Martingale theorem, 119
McMillan’s theorem, 137, 139, 159
Measure-preserving transformation, 2, 32-33
Minimal dynamical system, 46, 47, 66
M-isomorphic systems, 109
Mixing
  strong, 15, 19, 20-21, 24, 28, 32, 38, 40-42, 77, 79
  weak, 16, 19, 20-21, 24, 28, 32, 37, 38, 40-42, 79
Monothetic dynamical system, 93-96, 134
Monothetic group, 75, 84

N
Natural extension, 26, 27, 28, 39, 58, 125

O
Orbit, 46, 76, 79, 92
Orbit closure, 46
Ornstein, D. S., 11, 110, 150, 177
  principal lemma, 161
  theorem, 173

P
Parry, W., 147
Partition, 152
Partition distance, 156
P-n-name, 154, 155
Poincaré, H., 7, 13
Process distance, 156

Q
Quasidiscrete spectrum, 84-89, 106, 108, 135
Quasieigenfunction, 84, 86
Quasieigenvalue, 86
Quasiperiodic spectrum, 89-91, 106, 134

R
Recurrence theorem, 13, 15, 28
Recurrent dynamical system, 39, 47
Refines, 152
ε-refines, 173
Rohlin’s theorem, 158

S
Seethoff, T. L., 90, 91, 134
Semiergodic transformation (system), 83, 88, 106
Semisimple dynamical system, 54, 60
Shift
  dynamical system, 46
  -invariant group, 88, 108
  transformation, 3
Sinai’s theorem, 122
Spectrum
  continuous, 18
  Lebesgue, 19
  point, 18
Stack, 153
Strictly ergodic dynamical system, 49, 50
Strong generator, 123
Substack, 154
Subsystem, 55, 69, 130
Symbolic dynamical system, 46
Symbolic flow, 74, 95


T
T-isomorphic, 127
Topological entropy, 127, 146-147
Topological generator, 75, 83, 84
Torsion-free group, 83
Torus, 72, 78-79
Totally ergodic transformation, 80
Translation, 72, 145

U
Uniform space, 51
Uniformly integrable, 136
Unimodular matrix, 73, 79
Uniquely ergodic dynamical system, 49, 50

V
Very weak Bernoulli partition, 179

W
Weak Bernoulli partition, 178
Weakly f-independent sequence, 180
Weakly topologically ergodic dynamical system, 107

Y
Yosida, K., 11